Spectral Retrieval: Multi-Scale Sinc Convolution for LLM Multi-Agent Systems
Spectral Retrieval is a plug-in re-ranking method that bridges per-token MaxSim and mean-pool retrieval by applying multi-scale sinc convolution to token embeddings. In conventional dense retrieval, each document is represented by a single mean-pooled vector, which dilutes localized relevance signals. Spectral Retrieval instead reuses the per-token embeddings from a late-interaction index and convolves them with a normalized sinc kernel at several scales. At scale L=1 the kernel is the identity, which recovers per-token MaxSim; as L increases it approaches a uniform filter, which recovers mean pooling. On a controlled synthetic benchmark of 1,000 documents with single-position spikes, mean-pool retrieval performs near chance (Recall@10 ~ 0.02), while Spectral Retrieval significantly improves recall. The method is intended for localized retrieval in LLM multi-agent systems.
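The scoring rule described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the default scale set, the single query vector, the Toeplitz form of the convolution, and the row-wise L1 normalization are all hypothetical choices made here to reproduce the two stated endpoints (identity at L=1, near-uniform averaging at large L).

```python
import numpy as np

def sinc_smoothing_matrix(num_tokens, scale):
    """Toeplitz matrix whose row t holds sinc kernel weights centred on token t.

    np.sinc(x) is sin(pi*x)/(pi*x), so at scale=1 every off-diagonal entry
    is ~0 (identity), and for scale >> num_tokens every entry is ~1 (uniform).
    Row-wise L1 normalization is an assumption of this sketch.
    """
    idx = np.arange(num_tokens)
    weights = np.sinc((idx[:, None] - idx[None, :]) / scale)
    return weights / np.abs(weights).sum(axis=1, keepdims=True)

def spectral_score(query, token_embs, scales=(1, 2, 4, 8)):
    """Maximum cosine similarity over all positions and all scales.

    query:      (dim,) single query vector (assumption: one vector per query)
    token_embs: (num_tokens, dim) per-token document embeddings
    scales:     hypothetical scale set; the source does not list the values
    """
    q = query / np.linalg.norm(query)
    best = -np.inf
    for scale in scales:
        # Convolve along the token axis by applying the smoothing matrix.
        smoothed = sinc_smoothing_matrix(len(token_embs), scale) @ token_embs
        norms = np.maximum(np.linalg.norm(smoothed, axis=1), 1e-12)
        cosines = (smoothed @ q) / norms
        best = max(best, float(cosines.max()))
    return best
```

Calling `spectral_score(q, X, scales=(1,))` reduces to the best per-token cosine, and a very large single scale reduces to the cosine against the mean-pooled vector. Because the default scale set includes L=1, the combined score is never lower than the L=1 score.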
Key facts
- Spectral Retrieval is a plug-in re-ranking stage for dense retrieval.
- It interpolates between per-token MaxSim and mean-pool retrieval.
- Uses multi-scale sinc convolution over token embeddings.
- Reuses per-token embeddings from a late-interaction index.
- At L=1, the kernel is the identity, recovering MaxSim; at large L, it approaches mean pooling.
- Taking the maximum cosine similarity over positions and scales yields a score no less informative than either endpoint (MaxSim or mean pooling).
- On a synthetic benchmark with 1,000 documents and single-position spikes, mean-pool Recall@10 ~ 0.02.
- Spectral Retrieval significantly improves recall over mean pooling.
- Designed for LLM multi-agent systems.
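The synthetic benchmark in the key facts can be approximated with a small harness like the one below. The source gives only the corpus size and the single-position spike design, so everything else here is an assumption: the token count, embedding dimension, Gaussian background tokens, one query per document, and all function names are hypothetical. The sketch is not expected to reproduce the reported Recall@10 figures; it only shows how such a comparison could be set up.

```python
import numpy as np

def make_spike_corpus(num_docs=1000, num_tokens=32, dim=32, seed=0):
    """Random unit-norm token embeddings with one planted query token per document."""
    rng = np.random.default_rng(seed)
    docs = rng.normal(size=(num_docs, num_tokens, dim))
    docs /= np.linalg.norm(docs, axis=-1, keepdims=True)
    queries = rng.normal(size=(num_docs, dim))
    queries /= np.linalg.norm(queries, axis=-1, keepdims=True)
    # Plant query i at a single random position of document i (the "spike").
    positions = rng.integers(0, num_tokens, size=num_docs)
    docs[np.arange(num_docs), positions] = queries
    return docs, queries

def recall_at_k(scores, k=10):
    """scores[i, j] scores document j for query i; document i is the relevant one."""
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hits = (top_k == np.arange(len(scores))[:, None]).any(axis=1)
    return float(hits.mean())

def mean_pool_scores(docs, queries):
    """Cosine between each query and each document's mean-pooled vector."""
    pooled = docs.mean(axis=1)
    pooled /= np.linalg.norm(pooled, axis=-1, keepdims=True)
    return queries @ pooled.T

def maxsim_scores(docs, queries):
    """Best per-token cosine for every (query, document) pair.

    Tokens and queries are unit-norm, so the dot product is the cosine.
    """
    sims = np.einsum("qd,ntd->qnt", queries, docs)
    return sims.max(axis=2)
```

A run would build the corpus once, then compare `recall_at_k(mean_pool_scores(docs, queries))` against `recall_at_k(maxsim_scores(docs, queries))`. A multi-scale scorer can be dropped in as a third scoring function with the same `(num_queries, num_docs)` output shape. Note that `maxsim_scores` materializes a `(queries, docs, tokens)` array, so smaller corpora are advisable for quick experiments.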
Entities
Institutions
- arXiv