Comet-H: Orchestrating LLMs for Research Software with Evolving Specs

other · 2026-05-01

A new arXiv preprint (2604.27209) identifies two failure modes in using large language models for research-software projects: hallucination accumulation, where claims outpace code or theory, and desynchronization, where code, theory, and the model's world model fall out of alignment. The authors propose Comet-H, an iterative prompt automaton that coordinates ideation, implementation, evaluation, grounding, and paper-writing as coupled components of a single workspace state. A controller selects prompts based on workspace gaps and carries forward unfinished work with a half-life mechanism.
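The controller described above can be illustrated with a small sketch. This is not the authors' implementation: the `Workspace` class, the numeric gap scores, the scoring rule (gap plus carried-over weight), and the default half-life of 2 steps are all assumptions made for illustration. It shows only the general idea of choosing the next prompt from workspace gaps while the priority of unfinished work decays exponentially.

```python
from dataclasses import dataclass, field

# Hypothetical prompt kinds, named after the five activities in the summary.
PROMPTS = ("ideation", "implementation", "evaluation", "grounding", "paper-writing")


@dataclass
class Workspace:
    # Assumed scale: 0.0 means nothing is missing, larger means a bigger gap.
    gaps: dict[str, float]
    # Extra priority carried over from work that was started but not finished.
    carry: dict[str, float] = field(default_factory=dict)


def decay(weight: float, steps: float, half_life: float) -> float:
    """Exponential decay: the weight halves every `half_life` steps."""
    return weight * 0.5 ** (steps / half_life)


def select_prompt(ws: Workspace) -> str:
    """Pick the activity with the largest gap plus carried-over weight."""
    return max(PROMPTS, key=lambda p: ws.gaps.get(p, 0.0) + ws.carry.get(p, 0.0))


def step(ws: Workspace, finished: bool, half_life: float = 2.0) -> str:
    """Run one controller step and update the workspace state in place."""
    chosen = select_prompt(ws)
    # Age all previously carried work by one step.
    ws.carry = {p: decay(w, 1, half_life) for p, w in ws.carry.items()}
    if finished:
        # The gap is closed, so nothing remains to carry forward.
        ws.gaps[chosen] = 0.0
        ws.carry.pop(chosen, None)
    else:
        # Unfinished work gains carried-over priority for later steps.
        ws.carry[chosen] = ws.carry.get(chosen, 0.0) + 1.0
    return chosen
```

In this sketch, an activity left unfinished keeps a boost that fades over time, so stale work gradually stops crowding out newly discovered gaps. How Comet-H actually scores gaps and applies its half-life is not stated in this summary.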

Key facts

arXiv:2604.27209
Identifies hallucination accumulation and desynchronization as LLM-specific failure modes
Proposes Comet-H iterative prompt automaton
Coordinates ideation, implementation, evaluation, grounding, and paper-writing
Controller selects prompts based on workspace gaps
Unfinished work carried forward with half-life mechanism
Announcement type: cross (cross-listed on arXiv)
Focus on research-software projects with evolving specifications
