Compressive Knowledge Graph Hypothesis: Selective Graph Facts for Scientific Hypothesis Generation

ai-technology · 2026-05-27

A new arXiv preprint (2605.27176) investigates which knowledge graph (KG) facts are most useful to language models during scientific hypothesis generation. The researchers perturbed local knowledge graphs by varying density, ontology richness, topology, and control structure, then tested Mistral-7B, Llama-3.1-70B, and Gemini 2.5 Flash on battery materials hypothesis generation. They found that graph utility is selective and model-dependent: graph context does change outputs, but models given no knowledge graph still recover substantial graph content from their own priors. Compact top-k subgraphs often approximate full-KG behavior, even when claimed-outcome triples are held out. This compression effect is not unique to semantic ranking: random and topology-based subsets also recover much of the signal. The results support a redundancy-aware Compressive KG hypothesis.

Key facts

Study from arXiv:2605.27176
Tests Mistral-7B, Llama-3.1-70B, and Gemini 2.5 Flash
Focuses on battery materials hypothesis generation
Perturbs local KGs by density, ontology richness, topology, control structure
Compact top-k subgraphs often approximate full-KG behavior, even with claimed-outcome triples held out
No-KG outputs recover substantial graph content from model priors
Random and topology-based subsets also recover much of the signal
Supports redundancy-aware Compressive KG hypothesis
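
The three kinds of subset named above (semantically ranked top-k, topology-based, and random) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's pipeline: the toy triples, the token-overlap score standing in for semantic ranking, the degree-based score standing in for topology-based selection, and all function names are hypothetical.

```python
import random
from collections import Counter

# A KG fact as (head, relation, tail). Toy data for illustration only,
# not taken from the study.
Triple = tuple[str, str, str]

TOY_KG: list[Triple] = [
    ("LiFePO4", "is_a", "cathode"),
    ("LiFePO4", "has_structure", "olivine"),
    ("LiFePO4", "contains", "iron"),
    ("graphite", "is_a", "anode"),
    ("silicon", "is_a", "anode"),
    ("LiPF6", "is_a", "electrolyte_salt"),
]


def tokens(text: str) -> set[str]:
    """Lowercase word set, treating underscores as spaces."""
    return set(text.lower().replace("_", " ").split())


def top_k_semantic(triples: list[Triple], query: str, k: int) -> list[Triple]:
    """Assumed stand-in for semantic ranking: token overlap with the query."""
    q = tokens(query)
    return sorted(triples, key=lambda t: len(q & tokens(" ".join(t))), reverse=True)[:k]


def top_k_degree(triples: list[Triple], k: int) -> list[Triple]:
    """Assumed stand-in for topology-based selection: prefer high-degree endpoints."""
    degree = Counter()
    for head, _, tail in triples:
        degree[head] += 1
        degree[tail] += 1
    return sorted(triples, key=lambda t: degree[t[0]] + degree[t[2]], reverse=True)[:k]


def random_k(triples: list[Triple], k: int, seed: int = 0) -> list[Triple]:
    """Random baseline: k triples sampled without replacement."""
    return random.Random(seed).sample(triples, min(k, len(triples)))


def entity_coverage(subset: list[Triple], full: list[Triple]) -> float:
    """Fraction of the full graph's entities that still appear in the subset."""
    def entities(ts: list[Triple]) -> set[str]:
        return {e for head, _, tail in ts for e in (head, tail)}
    full_entities = entities(full)
    return len(entities(subset) & full_entities) / len(full_entities) if full_entities else 0.0


if __name__ == "__main__":
    query = "LiFePO4 cathode structure"
    for name, subset in [
        ("semantic", top_k_semantic(TOY_KG, query, 3)),
        ("degree", top_k_degree(TOY_KG, 3)),
        ("random", random_k(TOY_KG, 3)),
    ]:
        print(name, subset, round(entity_coverage(subset, TOY_KG), 2))
```

Each subset would then be serialized into the model's prompt in place of the full graph. Entity coverage is just one cheap proxy for how much of the graph a subset retains; the study's own comparison of subset and full-KG behavior may rely on different measures.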
