Comet-H: Orchestrating LLMs for Research Software with Evolving Specs
A new arXiv preprint (2604.27209) identifies two failure modes in using large language models for research-software projects: hallucination accumulation, where claims outpace code or theory, and desynchronization, where code, theory, and the model's world model fall out of alignment. The authors propose Comet-H, an iterative prompt automaton that coordinates ideation, implementation, evaluation, grounding, and paper-writing as coupled components of a single workspace state. A controller selects prompts based on workspace gaps and carries forward unfinished work with a half-life mechanism.
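The summary does not say how the controller scores gaps or how the half-life is applied, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each component has a numeric gap score, that unfinished work adds a carry-over weight, and that this weight halves every `half_life` iterations. The names `Workspace`, `decay`, and `step` are hypothetical.

```python
from dataclasses import dataclass, field

# Component names taken from the summary; everything else is an assumption.
COMPONENTS = ("ideation", "implementation", "evaluation", "grounding", "paper-writing")


@dataclass
class Workspace:
    # Assumed scale: 0.0 means no gap, larger values mean more outstanding work.
    gaps: dict[str, float]
    # Assumed carry-over weight left behind by unfinished work, per component.
    carry: dict[str, float] = field(default_factory=dict)


def decay(weight: float, steps: float, half_life: float) -> float:
    """Exponential decay: the weight halves every `half_life` steps."""
    return weight * 0.5 ** (steps / half_life)


def step(ws: Workspace, half_life: float = 2.0) -> str:
    """Age carried work by one iteration, then pick the neediest component."""
    ws.carry = {name: decay(w, 1, half_life) for name, w in ws.carry.items()}
    scores = {
        name: ws.gaps.get(name, 0.0) + ws.carry.get(name, 0.0)
        for name in COMPONENTS
    }
    # Ties resolve to the component listed first in COMPONENTS.
    return max(COMPONENTS, key=lambda name: scores[name])
```

Under these assumptions, unfinished evaluation work can outrank a slightly larger implementation gap for an iteration or two, then fade as its carry-over weight decays, so stale tasks do not dominate prompt selection indefinitely.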
Key facts
- arXiv:2604.27209
- Identifies hallucination accumulation and desynchronization as LLM-specific failure modes
- Proposes Comet-H iterative prompt automaton
- Coordinates ideation, implementation, evaluation, grounding, and paper-writing
- Controller selects prompts based on workspace gaps
- Unfinished work carried forward with half-life mechanism
- Announcement type: cross (cross-listed on arXiv)
- Focus on research-software projects with evolving specifications
Entities
Institutions
- arXiv