Adaptive Dictionary Embeddings Scale Multi-Anchor Representations to LLMs
Researchers introduce Adaptive Dictionary Embeddings (ADE), a framework that scales multi-anchor word representations to large language models. Traditional word embeddings assign a single vector per word, which becomes a bottleneck for polysemous words; multi-anchor methods instead represent each word as a combination of several anchor vectors, but have so far been too costly at LLM scale. ADE addresses this with three contributions: Vocabulary Projection (VP) collapses the two-stage anchor lookup into a single matrix operation; Grouped Positional Encoding (GPE) shares positional information among anchors of the same word; and a third contribution not named in this summary. The method, detailed in arXiv paper 2604.24940, is shown to integrate with modern transformer architectures.
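The digest gives no implementation details, but the VP idea can be made concrete. The following minimal sketch (PyTorch; all names, shapes, and the weighted-sum combination rule are assumptions, not taken from the paper) shows how a two-stage anchor lookup could be folded into a single word-to-anchor projection matrix, so that embedding a token becomes one matrix operation against a shared anchor dictionary.

```python
# Hypothetical sketch of Vocabulary Projection (VP), assuming each word
# maps to K anchors from a shared anchor dictionary and combines them
# with learned mixing weights. Illustrative only, not the paper's API.
import torch

vocab_size, num_anchors, dim, K = 1000, 256, 64, 4

# Shared anchor dictionary: every word's embedding is a weighted
# combination of rows from this matrix.
anchor_dict = torch.randn(num_anchors, dim)

# Two-stage lookup tables: word -> its K anchor ids and mixing weights.
anchor_ids = torch.randint(0, num_anchors, (vocab_size, K))
anchor_weights = torch.softmax(torch.randn(vocab_size, K), dim=-1)

def two_stage_embed(token_ids: torch.Tensor) -> torch.Tensor:
    """Stage 1: gather anchor ids; stage 2: gather and mix anchor vectors."""
    ids = anchor_ids[token_ids]                   # (batch, K)
    vecs = anchor_dict[ids]                       # (batch, K, dim)
    w = anchor_weights[token_ids].unsqueeze(-1)   # (batch, K, 1)
    return (w * vecs).sum(dim=-2)                 # (batch, dim)

# VP: fold both stages into one sparse (vocab_size x num_anchors)
# projection matrix, so embedding is a single matmul with the dictionary.
proj = torch.zeros(vocab_size, num_anchors)
proj.scatter_add_(1, anchor_ids, anchor_weights)

def vp_embed(token_ids: torch.Tensor) -> torch.Tensor:
    return proj[token_ids] @ anchor_dict          # one matrix operation

tokens = torch.tensor([3, 17, 42])
assert torch.allclose(two_stage_embed(tokens), vp_embed(tokens), atol=1e-5)
```

In practice the projection matrix would be stored sparsely (each row has only K nonzeros), which is what makes the single-matmul form cheaper than materializing a dense vocab-by-anchor table.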
Key facts
- ADE scales multi-anchor word representations to large language models.
- Traditional embeddings use a single vector per word, limiting semantic expressiveness.
- Vocabulary Projection (VP) reduces anchor lookup to a single matrix operation.
- Grouped Positional Encoding (GPE) shares positional information among anchors of the same word (see the sketch after this list).
- The paper is published on arXiv with ID 2604.24940.
- ADE addresses computational inefficiency of prior multi-anchor approaches.
- The framework integrates with modern transformer architectures.
- Multi-anchor approaches encode each word as a combination of multiple anchor vectors.
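As a companion to the VP sketch above, here is a minimal illustration of the GPE idea, assuming each word expands into K anchor vectors and that standard sinusoidal encodings are used (the paper's actual encoding scheme is not given in this digest): all K anchors of the word at position t receive the same positional vector, so positions track words rather than anchor slots.

```python
# Hypothetical sketch of Grouped Positional Encoding (GPE). Shapes,
# the use of sinusoidal encodings, and the additive combination are
# assumptions for illustration.
import torch

def sinusoidal_pe(seq_len: int, dim: int) -> torch.Tensor:
    """Standard sine/cosine positional encodings, one row per position."""
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    i = torch.arange(0, dim, 2, dtype=torch.float32)
    angles = pos / torch.pow(10000.0, i / dim)
    pe = torch.zeros(seq_len, dim)
    pe[:, 0::2] = torch.sin(angles)
    pe[:, 1::2] = torch.cos(angles)
    return pe

seq_len, K, dim = 8, 4, 64
anchors = torch.randn(seq_len, K, dim)   # K anchor vectors per word

# GPE: one encoding per *word* position, broadcast to all K anchors of
# that word, instead of seq_len * K distinct per-slot encodings.
pe = sinusoidal_pe(seq_len, dim)
grouped = anchors + pe.unsqueeze(1)      # (seq_len, K, dim)
```

The payoff of grouping is that expanding each word into K anchors does not inflate the positional index range by a factor of K, which keeps the expanded sequence compatible with a transformer trained on word-level positions.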
Entities
Institutions
- arXiv