Compressive Knowledge Graph Hypothesis: Selective Graph Facts for Scientific Hypothesis Generation
A new arXiv study (2605.27176) investigates which knowledge graph (KG) facts are most useful when language models generate scientific hypotheses. The researchers perturbed local knowledge graphs by varying density, ontology richness, topology, and control structure, then tested Mistral-7B, Llama-3.1-70B, and Gemini 2.5 Flash on hypothesis generation for battery materials. They found that graph utility is selective and model-dependent: graph context does change outputs, but models given no knowledge graph still recover substantial graph content from their own priors. Compact top-k subgraphs often approximate full-KG behavior, even when claimed-outcome triples are held out. This compression is not unique to semantic ranking: random and topology-based subsets also recover much of the signal. The results support a redundancy-aware Compressive KG hypothesis.
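To make the three compression strategies concrete, here is a minimal Python sketch of how a local graph of triples might be cut down to k facts by relevance ranking, by random sampling, and by a topology heuristic. Everything in it is an assumption for illustration: the toy triples, the function names, the token-overlap score (a crude stand-in for whatever semantic ranking the study used), and node degree as the topology criterion. It is not the authors' pipeline.

```python
import random
from collections import Counter

# Hypothetical toy triples (subject, relation, object); not data from the study.
TRIPLES = [
    ("LiFePO4", "is_a", "cathode_material"),
    ("LiFePO4", "has_property", "thermal_stability"),
    ("graphite", "is_a", "anode_material"),
    ("LiFePO4", "paired_with", "graphite"),
    ("silicon", "is_a", "anode_material"),
    ("silicon", "has_property", "volume_expansion"),
]


def top_k_by_overlap(triples, query, k):
    """Keep the k triples sharing the most tokens with the query.

    Token overlap is an assumed stand-in for semantic ranking.
    """
    query_tokens = set(query.lower().split())

    def score(triple):
        tokens = set()
        for part in triple:
            tokens.update(part.lower().split("_"))
        return len(tokens & query_tokens)

    # sorted() is stable, so tied triples keep their original order.
    return sorted(triples, key=score, reverse=True)[:k]


def random_k(triples, k, seed=0):
    """Keep k triples chosen uniformly at random (seeded for repeatability)."""
    return random.Random(seed).sample(triples, k)


def top_k_by_degree(triples, k):
    """Keep the k triples whose endpoints are the best-connected nodes."""
    degree = Counter()
    for subj, _, obj in triples:
        degree[subj] += 1
        degree[obj] += 1
    return sorted(
        triples, key=lambda t: degree[t[0]] + degree[t[2]], reverse=True
    )[:k]


def entities(triples):
    """Collect every subject and object appearing in the triples."""
    return {node for subj, _, obj in triples for node in (subj, obj)}


def entity_coverage(subset, full):
    """Fraction of the full graph's entities that survive in the subset."""
    return len(entities(subset)) / len(entities(full))


if __name__ == "__main__":
    query = "silicon anode volume expansion"
    for name, subset in [
        ("overlap", top_k_by_overlap(TRIPLES, query, 3)),
        ("random", random_k(TRIPLES, 3)),
        ("degree", top_k_by_degree(TRIPLES, 3)),
    ]:
        print(name, round(entity_coverage(subset, TRIPLES), 2), subset)
```

Entity coverage is only a rough proxy for how much of the graph a subset retains. The study's comparison is about how closely model outputs under a compressed graph track outputs under the full graph, which this sketch does not measure.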
Key facts
- Study from arXiv:2605.27176
- Tests Mistral-7B, Llama-3.1-70B, and Gemini 2.5 Flash
- Focuses on battery materials hypothesis generation
- Perturbs local KGs by density, ontology richness, topology, control structure
- Compact top-k subgraphs often approximate full-KG behavior, even with claimed-outcome triples held out
- No-KG outputs recover substantial graph content from model priors
- Random and topology-based subsets also recover much of the signal
- Supports redundancy-aware Compressive KG hypothesis
Entities
Institutions
- arXiv