SEAT Method Preserves Epistemic Abstention in LLM Knowledge Adaptation
A new fine-tuning technique called SEAT tackles a persistent problem in injecting new knowledge into large language models: standard fine-tuning tends to erode the model's capacity for epistemic abstention, its ability to recognize the limits of its own knowledge. This capacity is especially important in high-stakes settings, where abstention acts as a safeguard against hallucination. SEAT combines sparse tuning, which constrains global activation drift, with entity-perturbed KL regularization, which preserves local epistemic boundaries and curbs knowledge spillover. The method requires no alignment data, explicit boundary probing, or post-hoc re-alignment, making it suitable for lightweight, privacy-conscious use. Across multiple models and datasets, SEAT improved human-evaluated abstention on unknown queries by 18% to 101% over the best baseline while maintaining knowledge acquisition. The work was announced on arXiv as a replacement under the identifier arXiv:2506.14387v3. Overall, SEAT balances robust knowledge acquisition with the essential ability to abstain when uncertain.
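The summary does not give SEAT's exact training objective, so the sketch below only illustrates the two ingredients it names: a sparse update mask that limits global activation drift, and a KL penalty that pins the tuned model to the frozen base model on entity-perturbed variants of the training queries. The PyTorch/Hugging Face-style interface, the helper names seat_style_loss and apply_sparse_gradient_mask, the kl_weight hyperparameter, and the assumption that batches carry labels for a causal-LM loss are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def seat_style_loss(model, base_model, batch, perturbed_batch, kl_weight=1.0):
    """Illustrative combined objective: a standard fine-tuning loss on the
    new-knowledge batch, plus a KL term that keeps the tuned model close to
    the frozen base model on entity-perturbed variants of the same queries
    (facts the model should still treat as unknown)."""
    # Task loss on the new knowledge (assumes `batch` includes `labels`).
    task_loss = model(**batch).loss

    # Entity-perturbed KL regularization against the frozen base model.
    with torch.no_grad():
        base_logits = base_model(**perturbed_batch).logits
    tuned_logits = model(**perturbed_batch).logits
    kl = F.kl_div(
        F.log_softmax(tuned_logits, dim=-1),
        F.softmax(base_logits, dim=-1),
        reduction="batchmean",
    )
    return task_loss + kl_weight * kl

def apply_sparse_gradient_mask(model, masks):
    """Illustrative sparse tuning step: zero out gradients outside a chosen
    sparse subset of weights so only that subset is updated, limiting global
    activation drift. `masks` maps parameter names to 0/1 tensors of the
    same shape; unmasked parameters are frozen."""
    for name, param in model.named_parameters():
        if param.grad is None:
            continue
        mask = masks.get(name)
        param.grad.mul_(mask if mask is not None else 0.0)
```

In this sketch, a training step would compute seat_style_loss, backpropagate, call apply_sparse_gradient_mask, and then step the optimizer, so that only the selected sparse weights move while the entity-perturbed KL term discourages the model from answering queries it should still abstain on.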
Key facts
- SEAT is a preventive fine-tuning method for LLMs
- It preserves epistemic abstention while maintaining knowledge acquisition
- Standard fine-tuning often erodes aligned epistemic abstention
- Epistemic abstention is critical in high-stakes settings as a safeguard against hallucination
- SEAT combines sparse tuning with entity-perturbed KL regularization
- The method requires no alignment data, explicit boundary probing, or post-hoc re-alignment
- SEAT improved human-evaluated abstention on unknown queries by 18%-101% over the best baseline
- Research was announced on arXiv under identifier arXiv:2506.14387v3
Entities
Institutions
- arXiv