FeatCal: Calibrating Merged AI Models to Reduce Feature Drift
A new method called FeatCal addresses performance degradation in merged AI models by calibrating their weights layer by layer. Model merging combines task-specific expert models into a single model without retraining, but the merged model often underperforms the individual experts. The researchers attribute this gap to feature drift: differences between the features the merged model and an expert produce on the same input. Their theory decomposes the drift into upstream propagation and local mismatch, tracking how it accumulates across layers. FeatCal uses a small calibration set to adjust the merged model's weights in forward order, reducing feature drift while preserving the benefits of merging. The adjustment has an efficient closed-form solution, so no gradient descent or iterative optimization is needed. Tests on CLIP and GLUE benchmarks show improved performance. The paper is published on arXiv (2605.13030).
Key facts
- FeatCal calibrates merged model weights layer by layer in forward order.
- Model merging combines task experts into one model without joint training.
- Feature drift is decomposed into upstream propagation and local mismatch.
- FeatCal uses a small calibration set and a closed-form solution.
- No gradient descent, iterative optimization, or extra modules are needed.
- Tests were conducted on CLIP and GLUE benchmarks.
- The paper is on arXiv with ID 2605.13030.
- The method reduces feature drift while staying close to merged weights.
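The facts above suggest one plausible instantiation of the idea: for each linear layer, in forward order, fit calibrated weights to the expert's features on the calibration set via a ridge-regularized least squares, whose regularizer keeps the solution near the merged weights. This is a minimal sketch under those assumptions, not the paper's actual algorithm; the function names, the `lam` knob, and the single-expert setup are all hypothetical.

```python
import numpy as np

def calibrate_layer(X, Y_target, W_merged, lam=1.0):
    """Closed-form ridge update pulling a merged layer toward expert features.

    X:        (n, d_in) calibration inputs to this layer
    Y_target: (n, d_out) expert features on the same inputs
    W_merged: (d_in, d_out) merged weights for this layer
    lam:      hypothetical knob keeping the solution near W_merged
    Minimizes ||X W - Y_target||^2 + lam * ||W - W_merged||^2.
    """
    d_in = X.shape[1]
    A = X.T @ X + lam * np.eye(d_in)
    B = X.T @ Y_target + lam * W_merged
    return np.linalg.solve(A, B)

def calibrate_model(X0, expert_layers, merged_layers, lam=1.0):
    """Forward-order calibration: each layer is fit on features produced by
    the already-calibrated layers before it, so upstream drift is accounted
    for, not just the local mismatch at this layer."""
    X_cal, X_exp = X0, X0
    calibrated = []
    for W_exp, W_mrg in zip(expert_layers, merged_layers):
        Y_exp = X_exp @ W_exp                        # expert features (target)
        W_new = calibrate_layer(X_cal, Y_exp, W_mrg, lam)
        calibrated.append(W_new)
        X_cal = X_cal @ W_new                        # propagate calibrated features
        X_exp = Y_exp                                # propagate expert features
    return calibrated
```

With a small `lam`, the calibrated layers track the expert's features closely; a larger `lam` stays closer to the merged weights, trading drift reduction against preserving the merge.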