Language Models Tested on Historical Cosmology Show Domain Adaptation Limits
A new study on arXiv (2605.30415) uses historical cosmology as a controlled testbed for how domain adaptation changes the explanatory behavior of language models. In Phase 1, a small model was trained from scratch on a pre-Copernican corpus from which heliocentric references had been removed, to test whether Earth-motion or heliocentric continuations would emerge anyway. In Phase 2, a larger pretrained model was fine-tuned with QLoRA on the same corpus to study shifts in explanatory framing and cosmological stance. Outputs were scored by an LLM-as-judge framework that labels each one for stance (geocentric, heliocentric, or ambiguous) and frame (premodern or modern). Results show that smaller models occasionally generate local Earth-motion continuations but remain globally unstable and cannot sustain coherent cosmological reasoning. The study highlights how difficult it is to adapt language models to historical knowledge domains.
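The Phase 1 corpus preparation can be pictured as a passage-level filter. The following is a minimal sketch, not the study's actual procedure: the term list, the function names (split_passages, filter_corpus), and the choice to drop whole passages rather than edit them are all assumptions made for illustration.

```python
import re

# Hypothetical term list. The study's real filtering criteria are not
# described in this summary, so these entries are illustrative only.
HELIOCENTRIC_TERMS = [
    "heliocentric",
    "aristarchus",
    "earth moves",
    "earth revolves",
    "motion of the earth",
]

# One case-insensitive pattern that matches any of the terms.
_PATTERN = re.compile(
    "|".join(re.escape(term) for term in HELIOCENTRIC_TERMS),
    re.IGNORECASE,
)


def split_passages(text):
    """Split raw text into passages on blank lines, dropping empty ones."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def filter_corpus(passages):
    """Return (kept, dropped): passages without and with a flagged term."""
    kept, dropped = [], []
    for passage in passages:
        if _PATTERN.search(passage):
            dropped.append(passage)
        else:
            kept.append(passage)
    return kept, dropped
```

Keyword matching of this kind is blunt: it can miss paraphrased references and can discard passages that mention Earth motion only to reject it, so a real pipeline would likely need manual review of the dropped set.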
Key facts
- arXiv paper 2605.30415 investigates domain adaptation in language models using historical cosmology.
- Phase 1 trains a small language model from scratch on a pre-Copernican corpus with heliocentric references removed.
- Phase 2 fine-tunes a larger pretrained model using QLoRA on the same corpus.
- Model outputs are evaluated by an LLM-as-judge framework for cosmological stance and explanatory frame.
- Smaller models occasionally produce local Earth-motion continuations but lack global stability.
- The study reveals limitations in coherent cosmological reasoning after domain adaptation.
- The research uses a controlled experimental setting with historical cosmology.
- Phase 2 examines how domain adaptation changes explanatory framing and cosmological stance in language models.
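The evaluation step in the list above assigns each output a stance label and a frame label. A rough sketch of how such labels could be validated and tallied is shown here. The label vocabulary comes from the summary; the Judgment class and the stance_shares and cross_tab helpers are hypothetical names, and the call to the judge model itself is deliberately left out because its prompt and interface are not described.

```python
from collections import Counter
from dataclasses import dataclass

# Label sets taken from the summary of the study.
STANCES = ("geocentric", "heliocentric", "ambiguous")
FRAMES = ("premodern", "modern")


@dataclass(frozen=True)
class Judgment:
    """One judged output (hypothetical container, not from the paper)."""
    stance: str
    frame: str

    def __post_init__(self):
        # Reject labels outside the fixed vocabulary so that malformed
        # judge responses fail loudly instead of skewing the counts.
        if self.stance not in STANCES:
            raise ValueError(f"unknown stance: {self.stance!r}")
        if self.frame not in FRAMES:
            raise ValueError(f"unknown frame: {self.frame!r}")


def stance_shares(judgments):
    """Fraction of outputs per stance; all zeros for an empty input."""
    counts = Counter(j.stance for j in judgments)
    total = len(judgments)
    return {s: (counts[s] / total if total else 0.0) for s in STANCES}


def cross_tab(judgments):
    """Count outputs for each (stance, frame) combination."""
    return Counter((j.stance, j.frame) for j in judgments)
```

Comparing stance_shares for a base model against the same prompts after fine-tuning would be one simple way to express a shift in stance, and cross_tab shows whether a change in stance travels together with a change in frame.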
Entities
Institutions
- arXiv