DiDi-Merging: Slim Dynamic Model Merging Framework
Researchers have introduced DiDi-Merging, a slim dynamic model merging framework that uses differentiable rank allocation to balance shared and expert parameters. It addresses a weakness of existing dynamic merging methods, which either rely on a fully shared model with minimal experts or assign too much capacity to the experts. DiDi-Merging matches prior dynamic baselines with only 1.24x the parameters of a single fine-tuned model and surpasses them at 1.4x, whereas existing methods require more than 2x. The framework also adds a data-free refinement step to improve task fidelity. Model merging in general allows fine-tuned experts for different tasks to be combined without joint training or access to the original data.
Key facts
- DiDi-Merging is a slim dynamic merging framework.
- It uses differentiable rank allocation to balance shared and expert parameters.
- It matches prior dynamic baselines at 1.24x the parameters of a single fine-tuned model.
- It surpasses those baselines at 1.4x the parameters.
- Existing dynamic merging methods require more than 2x the parameters.
- It introduces a data-free refinement step.
- Model merging enables reuse of fine-tuned models without joint training or original data.
- Dynamic merging selectively activates task-relevant parameters.
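To make the shared-versus-expert parameter trade-off concrete, here is a minimal sketch. It is not DiDi-Merging's algorithm: the summary does not describe how ranks are allocated or how the refinement step works. The sketch assumes a simple stand-in in which the shared weights are the average of the task weights and each expert is a truncated-SVD (low-rank) approximation of its task's residual. The function names and the fixed, hand-picked ranks are hypothetical.

```python
import numpy as np

def low_rank_expert(delta, rank):
    """Factor a task residual into a rank-`rank` pair (a, b) via truncated SVD."""
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (d_out, rank), singular values folded in
    b = vt[:rank, :]             # shape (rank, d_in)
    return a, b

def merge_with_experts(task_weights, ranks):
    """Assumed scheme: shared = mean of task weights, expert = low-rank residual."""
    shared = np.mean(task_weights, axis=0)
    experts = [low_rank_expert(w - shared, r) for w, r in zip(task_weights, ranks)]
    return shared, experts

def activate(shared, experts, task_id):
    """Dynamic step: add only the selected task's expert to the shared weights."""
    a, b = experts[task_id]
    return shared + a @ b

def param_ratio(shared, experts):
    """Total stored parameters relative to one fine-tuned weight matrix."""
    extra = sum(a.size + b.size for a, b in experts)
    return (shared.size + extra) / shared.size

# Toy example: 4 tasks, one 64x64 weight matrix each, rank-2 experts.
rng = np.random.default_rng(0)
task_weights = [rng.standard_normal((64, 64)) for _ in range(4)]
shared, experts = merge_with_experts(task_weights, ranks=[2, 2, 2, 2])
print(param_ratio(shared, experts))  # 1.25
```

In this toy setting each rank-2 expert stores 2 x 64 x 2 = 256 values, so four experts add 1024 parameters to the 4096 shared ones, giving a 1.25x budget. Raising the ranks improves how faithfully each task's weights are reconstructed but pushes the ratio up; choosing those ranks is the problem that a rank allocation scheme has to solve.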