ARTFEED — Contemporary Art Intelligence

Group Fine-Tuning (GFT): A Unified Post-Training Framework for LLMs

other · 2026-04-30

A recent study posted on arXiv introduces Group Fine-Tuning (GFT), a unified post-training framework for large language models that addresses the limitations of both supervised fine-tuning (SFT) and reinforcement learning (RL). Analyzing training dynamics, the researchers show that SFT is a special case of policy gradient optimization with sparse implicit rewards and unstable inverse-probability weighting, which manifests as single-path dependency, entropy collapse, and gradient explosion. GFT introduces Group Advantage Learning, which constructs diverse response groups and applies normalized contrastive supervision to alleviate reward sparsity, and Dynamic Coefficient Rectification, which adaptively bounds the inverse-probability weights to stabilize training. The study is available at arXiv:2604.14258.
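
To ground that interpretation, here is the standard one-line derivation (textbook policy-gradient algebra, not reproduced from the paper) that exhibits the SFT gradient as a policy gradient with a sparse, inverse-probability-weighted implicit reward:

    \nabla_\theta \mathcal{L}_{\mathrm{SFT}}
      = -\nabla_\theta \log \pi_\theta(y^\star \mid x)
      = -\mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}
          \big[\, r(y)\, \nabla_\theta \log \pi_\theta(y \mid x) \,\big],
    \qquad
    r(y) = \frac{\mathbb{1}[y = y^\star]}{\pi_\theta(y \mid x)}

The implicit reward r(y) is nonzero only on the single demonstrated response y* (reward sparsity and single-path dependency) and carries a 1/\pi_\theta factor that blows up whenever the model assigns the demonstration low probability (unstable weighting and gradient explosion).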

Key facts

  • arXiv:2604.14258
  • Group Fine-Tuning (GFT) proposed
  • SFT interpreted as special case of policy gradient optimization
  • SFT issues: single-path dependency, entropy collapse, gradient explosion
  • GFT includes Group Advantage Learning and Dynamic Coefficient Rectification (see the code sketch after this list)
  • Group Advantage Learning uses diverse response groups and normalized contrastive supervision
  • Dynamic Coefficient Rectification adaptively bounds inverse-probability weights
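
The two components can be sketched in code. What follows is a minimal, hypothetical PyTorch rendering, not the authors' implementation: group-relative reward normalization stands in for Group Advantage Learning's normalized contrastive supervision, and a hard clamp with an illustrative bound c_max stands in for Dynamic Coefficient Rectification's adaptive bounding. The names gft_loss and group_advantages and the value of c_max are invented for illustration.

    import torch

    def group_advantages(rewards: torch.Tensor) -> torch.Tensor:
        # Normalized contrastive supervision over a group of G responses:
        # each response is scored relative to its group, so the learning
        # signal is dense across the group rather than tied to one path.
        return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

    def gft_loss(logp: torch.Tensor, logp_old: torch.Tensor,
                 rewards: torch.Tensor, c_max: float = 10.0) -> torch.Tensor:
        # logp: (G,) sequence log-probs under the current policy (with grad).
        # logp_old: (G,) log-probs under the sampling policy (no grad).
        adv = group_advantages(rewards)      # Group Advantage Learning
        coeff = torch.exp(logp - logp_old)   # inverse-probability weight
        coeff = coeff.clamp(max=c_max)       # Dynamic Coefficient Rectification,
                                             # sketched as a hard upper bound
        return -(coeff * adv).mean()

    # Toy usage: one prompt, a group of G = 4 sampled responses.
    logp = torch.tensor([-1.2, -2.3, -0.8, -6.9], requires_grad=True)
    rewards = torch.tensor([1.0, 0.0, 0.5, 0.0])
    loss = gft_loss(logp, logp_old=logp.detach(), rewards=rewards)
    loss.backward()  # bounded coefficients keep logp.grad finite

Clamping also zeroes the gradient through any coefficient that exceeds the bound, which is one simple way to realize "adaptively bounds" in practice; the paper's actual rectification rule may differ.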

Entities

Institutions

  • arXiv

Sources

  • arXiv:2604.14258