MetaKGEnrich: Automated Knowledge-Graph Repair for LLMs

ai-technology · 2026-05-20

MetaKGEnrich is an automated system intended to advance metacognitive AI by letting large language models (LLMs) identify and repair their own knowledge gaps. Starting from a seed query, it constructs a knowledge graph and detects sparse regions using seven graph metrics. It then generates targeted questions with GPT-4o, gathers web evidence via Tavily, stores that evidence in Neo4j, and re-answers the query with GraphRAG so that GPT-4 can evaluate the improvement. In trials on 30 queries from three datasets (Google Research Natural Questions, MS MARCO, and HotpotQA), MetaKGEnrich improved answer quality in 87% of Google Research Natural Questions, 83% of MS MARCO, and 80% of HotpotQA questions, while preserving areas that were already well supported. This proof of concept illustrates how topological self-diagnosis can support autonomous knowledge enhancement in AI systems.
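
To make the idea of topological self-diagnosis concrete, here is a minimal sketch of flagging sparse regions in a small graph. The summary does not name the seven metrics, so the two used here (node degree and local clustering coefficient) are stand-ins chosen for illustration. The function names, thresholds, and example edges are likewise hypothetical, not taken from MetaKGEnrich.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are directly connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def sparse_nodes(edges, max_degree=1, max_clustering=0.0):
    """Return nodes whose degree AND clustering fall at or below the thresholds.

    Thresholds are illustrative assumptions; a real system would tune them
    and would likely combine more than two metrics.
    """
    adj = build_adjacency(edges)
    return [
        node
        for node in sorted(adj)
        if len(adj[node]) <= max_degree
        and local_clustering(adj, node) <= max_clustering
    ]


# Hypothetical toy graph: a dense triangle plus two weakly attached entities.
EXAMPLE_EDGES = [
    ("Paris", "France"),
    ("France", "Europe"),
    ("Paris", "Europe"),
    ("Paris", "Seine"),
    ("France", "Euro"),
]

if __name__ == "__main__":
    print(sparse_nodes(EXAMPLE_EDGES))
```

In this toy graph the entities attached by a single edge are flagged, while the densely connected triangle is left alone, which mirrors the goal of enriching thin regions without disturbing well-supported ones.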

Key facts

MetaKGEnrich is a fully automated pipeline for knowledge-graph population and LLM enrichment.
The system uses seven graph metrics to detect sparse regions in knowledge graphs.
GPT-4o generates targeted questions to fill knowledge gaps.
Web evidence is retrieved via Tavily and stored in Neo4j.
GraphRAG re-answers each query, and GPT-4 evaluates whether the answer improved.
Tested on 30 queries from Google Research Natural Questions, MS MARCO, and HotpotQA.
Answer quality improved in 80% of HotpotQA questions.
Answer quality improved in 87% of Google Research Natural Questions and 83% of MS MARCO questions.
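
The enrichment loop described in the facts above can be sketched roughly as follows. This is an outline under stated assumptions, not the authors' implementation. The question generator and the evidence retriever are passed in as plain callables, standing in for GPT-4o and Tavily. An in-memory set of triples stands in for Neo4j. The re-answering and judging steps are omitted. All names (`enrich_once`, `fake_ask`, `fake_search`) are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

# (subject, relation, object); a stand-in for edges stored in a graph database.
Triple = Tuple[str, str, str]


def enrich_once(
    graph: Set[Triple],
    sparse_entities: List[str],
    ask: Callable[[str], str],
    search: Callable[[str], List[Triple]],
) -> Dict[str, int]:
    """Run one enrichment pass over the sparse entities.

    For each entity: generate a targeted question, retrieve evidence,
    and add only triples the graph does not already contain.
    Returns the number of new triples added per entity.
    """
    added: Dict[str, int] = {}
    for entity in sparse_entities:
        question = ask(entity)            # stand-in for an LLM call
        evidence = search(question)       # stand-in for a web-search call
        new_triples = set(evidence) - graph
        graph.update(new_triples)         # stand-in for a database write
        added[entity] = len(new_triples)
    return added


def fake_ask(entity: str) -> str:
    """Hypothetical question generator used only for this demo."""
    return f"What is {entity} known for?"


def fake_search(question: str) -> List[Triple]:
    """Hypothetical retriever returning canned evidence for one entity."""
    if "Seine" in question:
        return [("Seine", "flows_through", "Paris")]
    return []


if __name__ == "__main__":
    demo_graph: Set[Triple] = {("Paris", "capital_of", "France")}
    print(enrich_once(demo_graph, ["Seine", "Euro"], fake_ask, fake_search))
```

Passing the model and search components as callables keeps the loop testable without network access. A second pass over the same entity adds nothing once its evidence is already in the graph.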

Entities

Institutions, datasets, and tools

Google Research
MS MARCO
HotpotQA
Neo4j
Tavily
GPT-4o
GPT-4
GraphRAG

Sources

arXiv cs.AI — 2026-05-19