Orthogonal Subspaces Method Improves LoRA Model Merging
Researchers have identified a cause of the performance degradation seen when merging large language models fine-tuned with low-rank adaptation (LoRA). They propose Orthogonal Subspaces for Robust Model Merging (OSRM), which constrains the LoRA subspace prior to fine-tuning to reduce interference among tasks. OSRM integrates with most existing merging algorithms and was evaluated on eight datasets.
Key facts
- Fine-tuning a separate LLM for each task makes deployment and storage expensive.
- Model merging combines multiple task-specific models into one multi-task model without additional training.
- Existing merging methods often degrade performance when applied to models fine-tuned with LoRA.
- The degradation arises from the interplay between model parameters and data distributions.
- OSRM constrains the LoRA subspace prior to fine-tuning.
- OSRM reduces unintended interference among tasks.
- OSRM can integrate with most existing merging algorithms.
- Experiments were conducted on eight datasets.
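The idea behind the key facts above can be illustrated with a small NumPy sketch. It is not the authors' algorithm: the summary does not say how OSRM constructs the constraint, so the projection step, the helper names (`project_out`, `merge`), the simple additive merge, and the assumption that task-1 inputs lie exactly in a low-dimensional subspace are all hypothetical choices made for illustration. No fine-tuning is performed; the sketch only shows why a LoRA down-projection that is orthogonal to another task's inputs cannot change the merged model's outputs on that task.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 6, 2

# ASSUMPTION: task-1 inputs lie in a 3-dimensional subspace of R^8.
# Real data is only approximately low-rank; this is a toy setup.
basis_1 = np.linalg.qr(rng.normal(size=(d_in, 3)))[0]  # orthonormal columns, (8, 3)
X1 = rng.normal(size=(20, 3)) @ basis_1.T              # task-1 inputs, (20, 8)

def project_out(A, basis):
    """Remove from each row of A its component inside span(basis)."""
    return A - (A @ basis) @ basis.T

def merge(W0, updates):
    """Hypothetical merge: add every low-rank update to the base weight."""
    return W0 + sum(updates)

W0 = rng.normal(size=(d_out, d_in))  # shared pretrained weight

# LoRA factors: the weight update for a task is B @ A.
A1, B1 = rng.normal(size=(rank, d_in)), rng.normal(size=(d_out, rank))
A2, B2 = rng.normal(size=(rank, d_in)), rng.normal(size=(d_out, rank))

# Constrain the task-2 down-projection to be orthogonal to task-1 inputs.
A2_orth = project_out(A2, basis_1)

# Reference: outputs of the task-1 model alone on task-1 inputs.
y_task1 = X1 @ (W0 + B1 @ A1).T

# Merged models, without and with the orthogonality constraint on task 2.
y_plain = X1 @ merge(W0, [B1 @ A1, B2 @ A2]).T
y_orth = X1 @ merge(W0, [B1 @ A1, B2 @ A2_orth]).T

# Interference: how far merging moves task-1 outputs away from the reference.
interference_plain = np.abs(y_plain - y_task1).max()
interference_orth = np.abs(y_orth - y_task1).max()

print(f"unconstrained: {interference_plain:.3e}")
print(f"orthogonal:    {interference_orth:.3e}")
```

In this toy setup the unconstrained merge shifts task-1 outputs noticeably, while the constrained merge leaves them unchanged up to floating-point error, because `A2_orth @ x` is zero for every `x` in the task-1 subspace. In practice the adapter would still be trained after the constraint is applied, which this sketch omits.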