MetaKGEnrich: Automated Knowledge-Graph Repair for LLMs
MetaKGEnrich is an automated pipeline for metacognitive AI: it lets large language models (LLMs) identify and repair their own knowledge gaps. Starting from a seed query, it builds a knowledge graph, detects sparse regions using seven graph metrics, generates targeted questions with GPT-4o, gathers web evidence via Tavily, integrates that evidence into Neo4j, and re-answers the query with GraphRAG so that GPT-4 can assess the result. In trials on 30 queries from three datasets (Google Research Natural Questions, MS MARCO, and HotpotQA), MetaKGEnrich improved answer quality on 80% of HotpotQA questions, 87% of Google Research Natural Questions questions, and 83% of MS MARCO questions, while leaving well-supported regions of the graph intact. This proof of concept illustrates how topological self-diagnosis can enable autonomous knowledge enhancement in AI systems.
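To make the idea of topological self-diagnosis concrete, here is a minimal sketch of flagging sparse regions in a graph. The summary does not name the seven metrics, so this example assumes a single stand-in metric (node degree) and a hypothetical `min_degree` threshold; the function names and the sample edges are illustrative only and are not taken from MetaKGEnrich.

```python
from collections import defaultdict


def degree_map(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def sparse_nodes(edges, min_degree=2):
    """Return nodes with fewer than min_degree neighbours.

    Degree is an assumed stand-in for the seven (unnamed) graph metrics.
    Results are sorted by degree, then by name, for stable output.
    """
    adj = degree_map(edges)
    return sorted(
        (node for node, nbrs in adj.items() if len(nbrs) < min_degree),
        key=lambda node: (len(adj[node]), node),
    )


# Illustrative toy graph (not data from the system).
edges = [
    ("Neo4j", "graph database"),
    ("Neo4j", "Cypher"),
    ("Cypher", "graph database"),
    ("Cypher", "query language"),
]
weak = sparse_nodes(edges)
```

In this toy graph only "query language" has a single neighbour, so it is the one node a gap-filling step would target; a fuller version would presumably combine several metrics before deciding which regions are under-supported.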
Key facts
- MetaKGEnrich is a fully automated pipeline for knowledge-graph population and LLM enrichment.
- The system uses seven graph metrics to detect sparse regions in knowledge graphs.
- GPT-4o generates targeted questions to fill knowledge gaps.
- Web evidence is retrieved via Tavily and stored in Neo4j.
- GraphRAG re-answers each query, and GPT-4 evaluates whether the answer improved.
- Tested on 30 queries from Google Research Natural Questions, MS MARCO, and HotpotQA.
- Answer quality improved in 80% of HotpotQA questions.
- Answer quality improved in 87% of Google Research Natural Questions and 83% of MS MARCO questions.
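The stages listed above can be sketched as a loop with pluggable steps. This is a sketch under stated assumptions, not the system's implementation: the three callables are hypothetical stand-ins for sparse-region detection, GPT-4o question generation, and Tavily retrieval, and an in-memory set of triples replaces the Neo4j store. No real API calls are made.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


@dataclass
class EnrichmentStages:
    """Hypothetical hooks; real versions would call external services."""
    find_gaps: Callable[[Set[Triple]], List[str]]  # sparse-region detection
    ask: Callable[[str], str]                      # stand-in for GPT-4o
    search: Callable[[str], List[Triple]]          # stand-in for Tavily


def enrich(graph: Set[Triple], stages: EnrichmentStages) -> Set[Triple]:
    """Return a copy of graph extended with evidence for each detected gap."""
    enriched = set(graph)  # copy, so existing triples are never dropped
    for entity in stages.find_gaps(graph):
        question = stages.ask(entity)
        enriched.update(stages.search(question))
    return enriched


def find_gaps(graph):
    """Flag entities mentioned in fewer than two triples (assumed rule)."""
    counts = {}
    for subj, _, obj in graph:
        counts[subj] = counts.get(subj, 0) + 1
        counts[obj] = counts.get(obj, 0) + 1
    return sorted(e for e, c in counts.items() if c < 2)


def ask(entity):
    return f"What is {entity} related to?"


def search(question):
    # Canned evidence in place of a web search.
    canned = {"What is Cypher related to?": [("Cypher", "is_a", "query language")]}
    return canned.get(question, [])


seed = {
    ("Neo4j", "uses", "Cypher"),
    ("Neo4j", "is_a", "graph database"),
    ("Neo4j", "stores", "triples"),
}
result = enrich(seed, EnrichmentStages(find_gaps, ask, search))
```

Passing the stages in as callables keeps the loop testable without network access, and copying the seed set mirrors the reported behaviour of leaving well-supported regions untouched while new evidence is added.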
Entities
Organizations, datasets, and tools
- Google Research
- MS MARCO
- HotpotQA
- Neo4j
- Tavily
- GPT-4o
- GPT-4
- GraphRAG