LiteSemRAG: New LLM-Free Framework for Efficient Graph-Based Retrieval-Augmented Generation
A new research paper introduces LiteSemRAG, a lightweight, fully LLM-free framework for graph-based Retrieval-Augmented Generation. Graph-based RAG has demonstrated significant potential for multi-level reasoning and structured evidence aggregation, but existing frameworks rely heavily on large language models during both indexing and querying, driving up token consumption and latency. LiteSemRAG eliminates that reliance entirely. It constructs a heterogeneous semantic graph from contextual token-level embeddings, explicitly distinguishing surface lexical representations from context-dependent semantic meanings. To handle polysemy robustly, the framework builds semantic nodes dynamically, combining chunk-level context aggregation with adaptive anomaly handling. At query time, LiteSemRAG performs a two-step semantic-aware retrieval process that integrates co-occurrence graph weighting. The paper is identified as arXiv:2604.16350v1 and was announced as a cross-listing.
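The paper's implementation details are not reproduced in this summary, but the core idea of dynamic semantic node construction can be illustrated with a minimal sketch. Here, the contextual embeddings of one surface token are greedily grouped: an embedding joins an existing cluster if it is close to that cluster's centroid (chunk-level aggregation), and otherwise opens a new cluster, so a polysemous token yields several semantic nodes while a lone outlier simply becomes its own node. The function name, the greedy clustering scheme, and the similarity threshold are all illustrative assumptions, not the authors' code.

```python
from math import sqrt

def cosine(a, b):
    # Plain cosine similarity over Python lists.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a)) or 1.0
    nb = sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

def build_semantic_nodes(token_contexts, threshold=0.8):
    """Hypothetical sketch: group the contextual embeddings of one
    surface token into semantic nodes. Similar embeddings merge into
    an existing cluster; dissimilar ones (including anomalies) start
    a new cluster, giving one semantic node per distinct sense."""
    clusters = []  # each cluster is a list of member vectors
    for vec in token_contexts:
        best, best_sim = None, threshold
        for c in clusters:
            centroid = [sum(col) / len(c) for col in zip(*c)]
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([vec])  # new sense or anomaly: its own node
        else:
            best.append(vec)        # chunk-level aggregation into a sense
    # one semantic node (the centroid) per cluster
    return [[sum(col) / len(c) for col in zip(*c)] for c in clusters]

# Toy "bank" contexts: two financial-like vectors and one river-like
# vector should collapse into exactly two semantic nodes.
contexts = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
nodes = build_semantic_nodes(contexts)  # → two semantic nodes
```

Keeping lexical surface forms and these per-sense centroids as separate node types is what makes the resulting graph heterogeneous.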
Key facts
- LiteSemRAG is a lightweight, fully LLM-free semantic-aware graph retrieval framework
- It constructs a heterogeneous semantic graph using contextual token-level embeddings
- The framework explicitly separates surface lexical representations from context-dependent semantic meanings
- A dynamic semantic node construction mechanism handles polysemy with chunk-level context aggregation and adaptive anomaly handling
- At the query stage, LiteSemRAG performs a two-step semantic-aware retrieval process integrating co-occurrence graph weighting
- The paper is identified as arXiv:2604.16350v1 with Announce Type: cross
- Graph-based Retrieval-Augmented Generation has shown potential for improving multi-level reasoning and structured evidence aggregation
- Existing graph-based RAG frameworks rely heavily on large language models during indexing and querying, leading to high token consumption and computational costs
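The two-step retrieval mentioned above can also be sketched without any LLM in the loop. In this hypothetical version, step one ranks semantic nodes by embedding similarity to the query, and step two scores candidate chunks by their matched nodes, boosting chunks in which pairs of matched nodes co-occur. The data layout (`semantic_nodes`, `chunks_of`, `cooccur_w`) and scoring formula are assumptions for illustration, not the paper's actual algorithm.

```python
from itertools import combinations
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a)) or 1.0
    nb = sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

def retrieve(query_vec, semantic_nodes, chunks_of, cooccur_w, top_nodes=2):
    """Hypothetical two-step semantic-aware retrieval sketch.
    Step 1: match the query embedding against semantic nodes.
    Step 2: score chunks by matched nodes, adding co-occurrence
    edge weights for node pairs that appear in the same chunk."""
    # Step 1: pick the semantic nodes closest to the query embedding.
    ranked = sorted(semantic_nodes,
                    key=lambda n: cosine(query_vec, semantic_nodes[n]),
                    reverse=True)
    matched = ranked[:top_nodes]
    # Step 2: base score from node similarity, then co-occurrence boost.
    scores = {}
    for node in matched:
        for chunk in chunks_of[node]:
            scores[chunk] = scores.get(chunk, 0.0) + cosine(query_vec, semantic_nodes[node])
    for a, b in combinations(matched, 2):
        w = cooccur_w.get(frozenset((a, b)), 0.0)
        for chunk in set(chunks_of[a]) & set(chunks_of[b]):
            scores[chunk] += w
    return sorted(scores, key=scores.get, reverse=True)

# Toy index: two sense nodes near the query and one unrelated node.
semantic_nodes = {"bank#finance": [1.0, 0.0], "loan": [0.8, 0.2], "river": [0.0, 1.0]}
chunks_of = {"bank#finance": ["c1", "c2"], "loan": ["c1"], "river": ["c3"]}
cooccur_w = {frozenset(("bank#finance", "loan")): 0.5}
result = retrieve([1.0, 0.1], semantic_nodes, chunks_of, cooccur_w)
# "c1" contains both matched nodes, so the co-occurrence boost ranks it first
```

Because both steps use only embedding arithmetic and precomputed graph weights, no LLM call is needed at query time, which is the source of the framework's token and latency savings.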