LARAG: Link-Aware Retrieval for RAG Systems in Technical Docs
A new retrieval strategy called LARAG (Link-Aware RAG) improves answer quality in Retrieval-Augmented Generation systems by leveraging hyperlink structures in HTML documentation. Unlike standard embedding-based retrievers that treat corpora as flat passages, LARAG encodes hyperlink relations as metadata in chunk representations, enabling graph-like retrieval of locally relevant content. Tested on twenty expert-designed queries over Rulex Platform technical documentation with four prompting strategies, LARAG achieved the highest BERTScore F1 while retrieving fewer chunks. The approach is lightweight and exploits author-defined hyperlinks already present in technical manuals.
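The idea of keeping hyperlinks as chunk metadata and following them at retrieval time can be illustrated with a minimal sketch. This is not the authors' implementation. The page names, the one-chunk-per-page split, the word-overlap score (a stand-in for embedding similarity), and the helper names `build_chunks` and `retrieve` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


class _LinkTextParser(HTMLParser):
    """Collects visible text and href targets from one HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


@dataclass
class Chunk:
    page: str
    text: str
    links: list = field(default_factory=list)  # hyperlink targets kept as metadata


def build_chunks(pages):
    """Build one chunk per page (a simplification); `pages` maps name -> HTML."""
    chunks = {}
    for name, html in pages.items():
        parser = _LinkTextParser()
        parser.feed(html)
        parser.close()  # flush any buffered text
        chunks[name] = Chunk(name, " ".join(parser.text_parts), parser.links)
    return chunks


def overlap_score(query, text):
    """Toy relevance: number of shared lowercase words (stand-in for embeddings)."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query, chunks, top_k=1):
    """Pick the top_k seed chunks, then add the chunks they link to."""
    ranked = sorted(
        chunks.values(), key=lambda c: overlap_score(query, c.text), reverse=True
    )
    seeds = ranked[:top_k]
    selected = [c.page for c in seeds]
    for seed in seeds:
        for target in seed.links:
            if target in chunks and target not in selected:
                selected.append(target)
    return selected


# Hypothetical pages, not taken from any real manual.
pages = {
    "import.html": '<p>Import task loads a dataset. See <a href="formats.html">supported formats</a>.</p>',
    "formats.html": "<p>Supported formats include CSV and Excel.</p>",
    "export.html": "<p>Export task writes results to disk.</p>",
}
chunks = build_chunks(pages)
result = retrieve("how does the import task load a dataset", chunks)
print(result)
```

In this toy corpus the linked page `formats.html` shares no words with the query, so a flat similarity ranking would place it last. It is still retrieved because the seed chunk links to it, which is the kind of locally relevant context the link metadata is meant to surface.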
Key facts
- LARAG stands for Link-Aware Retrieval-Augmented Generation
- It uses hyperlink structure from HTML documentation
- Encodes hyperlink relations as metadata in chunk representations
- Achieved the highest BERTScore F1 on queries over Rulex Platform technical documentation
- Retrieves fewer chunks than standard methods
- Tested on twenty expert-designed queries
- Four prompting strategies were evaluated
- Lightweight: exploits author-defined hyperlinks already present in technical manuals
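For reference, BERTScore F1 in its commonly used form is the harmonic mean of a greedy-matching precision and recall over token embeddings. The notation below is an assumption of this note: $x$ is the reference answer, $\hat{x}$ the generated answer, and $\mathbf{x}_i$, $\hat{\mathbf{x}}_j$ their normalized contextual token embeddings. The summary does not state which embedding model or variant (such as idf weighting or baseline rescaling) was used.

```latex
R = \frac{1}{|x|} \sum_{x_i \in x} \max_{\hat{x}_j \in \hat{x}} \mathbf{x}_i^{\top} \hat{\mathbf{x}}_j,
\qquad
P = \frac{1}{|\hat{x}|} \sum_{\hat{x}_j \in \hat{x}} \max_{x_i \in x} \mathbf{x}_i^{\top} \hat{\mathbf{x}}_j,
\qquad
F_1 = \frac{2 P R}{P + R}
```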
Entities
Institutions
- Rulex Platform