LiteSemRAG: New LLM-Free Framework for Efficient Graph-Based Retrieval-Augmented Generation
A new research paper introduces LiteSemRAG, a lightweight, fully LLM-free framework for graph-based Retrieval-Augmented Generation. Graph-based RAG has demonstrated significant potential for multi-level reasoning and structured evidence aggregation, but existing frameworks rely heavily on large language models during both indexing and querying, driving up token consumption and latency. LiteSemRAG eliminates that reliance entirely. It constructs a heterogeneous semantic graph from contextual token-level embeddings, explicitly distinguishing surface lexical representations from context-dependent semantic meanings. To handle polysemy robustly, the framework builds semantic nodes dynamically, combining chunk-level context aggregation with adaptive anomaly handling. At query time, LiteSemRAG performs a two-step semantic-aware retrieval process that integrates co-occurrence graph weighting. The paper is identified as arXiv:2604.16350v1 and was announced as a cross-listing.
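The paper's implementation details are not reproduced in this summary, but the core idea of dynamic semantic node construction can be illustrated with a minimal sketch. Here, the contextual embeddings of one surface token are greedily grouped: an embedding joins an existing cluster if it is close to that cluster's centroid (chunk-level aggregation), and otherwise opens a new cluster, so a polysemous token yields several semantic nodes while a lone outlier simply becomes its own node. The function name, the greedy clustering scheme, and the similarity threshold are all illustrative assumptions, not the authors' code.

```python
from math import sqrt

def cosine(a, b):
    # Plain cosine similarity over Python lists.
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a)) or 1.0
    nb = sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

def build_semantic_nodes(token_contexts, threshold=0.8):
    """Hypothetical sketch: group the contextual embeddings of one
    surface token into semantic nodes. Similar embeddings merge into
    an existing cluster; dissimilar ones (including anomalies) start
    a new cluster, giving one semantic node per distinct sense."""
    clusters = []  # each cluster is a list of member vectors
    for vec in token_contexts:
        best, best_sim = None, threshold
        for c in clusters:
            centroid = [sum(col) / len(c) for col in zip(*c)]
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([vec])  # new sense or anomaly: its own node
        else:
            best.append(vec)        # chunk-level aggregation into a sense
    # one semantic node (the centroid) per cluster
    return [[sum(col) / len(c) for col in zip(*c)] for c in clusters]

# Toy "bank" contexts: two financial-like vectors and one river-like
# vector should collapse into exactly two semantic nodes.
contexts = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
nodes = build_semantic_nodes(contexts)  # → two semantic nodes
```

Keeping lexical surface forms and these per-sense centroids as separate node types is what makes the resulting graph heterogeneous.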
Key facts
- LiteSemRAG is a lightweight, fully LLM-free semantic-aware graph retrieval framework
- It constructs a heterogeneous semantic graph using contextual token-level embeddings
- The framework explicitly separates surface lexical representations from context-dependent semantic meanings
- A dynamic semantic node construction mechanism handles polysemy with chunk-level context aggregation and adaptive anomaly handling
- At the query stage, LiteSemRAG performs a two-step semantic-aware retrieval process integrating co-occurrence graph weighting
- The paper is identified as arXiv:2604.16350v1 with Announce Type: cross
- Graph-based Retrieval-Augmented Generation has shown potential for improving multi-level reasoning and structured evidence aggregation
- Existing graph-based RAG frameworks rely heavily on large language models during indexing and querying, leading to high token consumption and computational costs
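The two-step retrieval mentioned above can also be sketched without any LLM in the loop. In this hypothetical version, step one ranks semantic nodes by embedding similarity to the query, and step two scores candidate chunks by their matched nodes, boosting chunks in which pairs of matched nodes co-occur. The data layout (`semantic_nodes`, `chunks_of`, `cooccur_w`) and scoring formula are assumptions for illustration, not the paper's actual algorithm.

```python
from itertools import combinations
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a)) or 1.0
    nb = sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)

def retrieve(query_vec, semantic_nodes, chunks_of, cooccur_w, top_nodes=2):
    """Hypothetical two-step semantic-aware retrieval sketch.
    Step 1: match the query embedding against semantic nodes.
    Step 2: score chunks by matched nodes, adding co-occurrence
    edge weights for node pairs that appear in the same chunk."""
    # Step 1: pick the semantic nodes closest to the query embedding.
    ranked = sorted(semantic_nodes,
                    key=lambda n: cosine(query_vec, semantic_nodes[n]),
                    reverse=True)
    matched = ranked[:top_nodes]
    # Step 2: base score from node similarity, then co-occurrence boost.
    scores = {}
    for node in matched:
        for chunk in chunks_of[node]:
            scores[chunk] = scores.get(chunk, 0.0) + cosine(query_vec, semantic_nodes[node])
    for a, b in combinations(matched, 2):
        w = cooccur_w.get(frozenset((a, b)), 0.0)
        for chunk in set(chunks_of[a]) & set(chunks_of[b]):
            scores[chunk] += w
    return sorted(scores, key=scores.get, reverse=True)

# Toy index: two sense nodes near the query and one unrelated node.
semantic_nodes = {"bank#finance": [1.0, 0.0], "loan": [0.8, 0.2], "river": [0.0, 1.0]}
chunks_of = {"bank#finance": ["c1", "c2"], "loan": ["c1"], "river": ["c3"]}
cooccur_w = {frozenset(("bank#finance", "loan")): 0.5}
result = retrieve([1.0, 0.1], semantic_nodes, chunks_of, cooccur_w)
# "c1" contains both matched nodes, so the co-occurrence boost ranks it first
```

Because both steps use only embedding arithmetic and precomputed graph weights, no LLM call is needed at query time, which is the source of the framework's token and latency savings.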