ARTFEED — Contemporary Art Intelligence

Dr. Post-Training: Data Regularization for LLM Post-Training

ai-technology · 2026-05-11

A new framework, Dr. Post-Training (Data-Regularized Post-Training), reconceptualizes general training data as a data-induced regularizer for LLM post-training. Rather than selecting which data to train on, it constructs, at each optimization step, a feasible set of model update directions from general data and projects the target data's update direction onto that set, mitigating overfitting to scarce target data. Standard training and data-selection methods emerge as special cases of the framework. The work is published on arXiv under identifier 2605.07063.
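The projection step described above can be illustrated with a minimal sketch. The paper's exact construction of the feasible set is not detailed here; this toy version assumes the set is simply the subspace spanned by gradient directions computed on general-data batches, and projects the target-data gradient onto it via least squares. All function and variable names are hypothetical, not from the paper.

```python
import numpy as np

def project_update(target_grad, general_grads):
    """Project the target-data update onto the span of general-data updates.

    target_grad   : (d,) gradient computed on the scarce target data.
    general_grads : list of (d,) gradient directions from general data.
    Returns the projected (d,) update direction.
    """
    # Stack general-data gradient directions as columns of G, shape (d, k).
    G = np.stack(general_grads, axis=1)
    # Least-squares coefficients: argmin_c || G c - target_grad ||_2.
    coeffs, *_ = np.linalg.lstsq(G, target_grad, rcond=None)
    # The projection lies in span(G), so it cannot move the model in
    # directions unsupported by general data.
    return G @ coeffs
```

In this sketch, standard training corresponds to skipping the projection entirely, while a hard data-selection scheme corresponds to restricting the update to a single general-data direction, consistent with the claim that both are special cases.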

Key facts

  • Dr. Post-Training is a data regularization framework for LLM post-training.
  • It treats general training data as a regularizer, not a selection pool.
  • At each step, it constructs a feasible set of update directions from general data.
  • It projects the target data's update direction onto that feasible set.
  • Standard training and data selection methods are special cases of this framework.
  • The paper is on arXiv with ID 2605.07063.
  • The approach addresses overfitting to scarce target data.
  • It moves beyond the data-selection framing.

Entities

Institutions

  • arXiv

Sources