ARTFEED — Contemporary Art Intelligence

ProFit: Probability-Guided Token Selection Improves LLM Fine-Tuning

ai-technology · 2026-05-07

ProFit is a technique for reducing overfitting during supervised fine-tuning (SFT) of large language models (LLMs) by masking low-probability tokens out of the training loss. Conventional SFT aligns the model with a single reference answer, ignoring the many valid ways the same content can be expressed, which leads to overfitting on non-essential wording. Training against multiple reference answers could mitigate this, but is often infeasible due to the data and computation it requires. ProFit instead exploits an observed relationship between token probability and semantic role: high-probability tokens tend to carry the core logical structure of an answer, while low-probability tokens are largely interchangeable surface expressions. By masking the low-probability tokens, ProFit avoids overfitting to superficial wording without needing multiple references. The method is described in a paper on arXiv (2601.09195v3), listed under the replace-cross category (a revised version of a cross-listed paper).

Key facts

  • ProFit selectively masks low-probability tokens during SFT.
  • Traditional SFT overfits to non-core expressions due to single-reference alignment.
  • Multiple reference answers are costly in data and computation.
  • High-probability tokens carry core logical framework.
  • Low-probability tokens are mostly replaceable expressions.
  • Paper available on arXiv: 2601.09195v3.
  • Announcement type: replace-cross.
  • Method aims to mitigate single-reference overfitting.
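The masking idea described above can be sketched in a few lines: score each reference token by the probability the model assigns to it, keep the high-probability tokens, and compute the SFT cross-entropy loss only over those. This is a minimal illustrative sketch, not the paper's implementation; the function name, the `keep_ratio` top-k selection rule, and all parameter choices are assumptions.

```python
import numpy as np

def profit_masked_loss(logits, targets, keep_ratio=0.7):
    """Probability-guided token masking for SFT (illustrative sketch).

    logits:  (T, V) array of model logits at each target position.
    targets: (T,)   reference token ids.

    Hypothetical rule: keep the top `keep_ratio` fraction of tokens
    by reference-token probability; mask the rest out of the loss.
    """
    # Softmax over the vocabulary (numerically stabilized).
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)

    # Probability the model assigns to each reference token.
    p_ref = probs[np.arange(len(targets)), targets]

    # High-probability tokens carry the core structure: keep those.
    k = max(1, int(round(keep_ratio * len(targets))))
    keep = np.argsort(-p_ref)[:k]
    mask = np.zeros(len(targets), dtype=bool)
    mask[keep] = True

    # Cross-entropy computed only over the kept (unmasked) tokens.
    nll = -np.log(p_ref[mask] + 1e-12)
    return nll.mean(), mask
```

In a real training loop the boolean mask would zero out the per-token loss before averaging, so gradients flow only through the retained high-probability tokens.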

Entities

Institutions

  • arXiv

Sources