BoostLoRA: Gradient Boosting for Efficient Fine-Tuning
BoostLoRA addresses the trade-off between adapter size and expressiveness in parameter-efficient fine-tuning (PEFT). It iteratively trains small adapters on misclassified examples and merges them into the base weights, escaping the constraint of a fixed low-rank subspace. A ROTATE SVD basis strategy assigns each iteration an orthogonal subspace, so the cumulative effective rank grows linearly with the number of rounds while each individual adapter stays ultra-low-rank. Because adapters are merged and then discarded, there is no inference overhead. On Qwen2.5-3B, BoostLoRA scores 89.1% on GSM8K and 68.8% on MATH-500, outperforming both TinyLoRA and full fine-tuning. On code generation, it reaches 57.2% on MBPP and 80.4% on HumanEval, where full fine-tuning falls below the zero-shot baseline.
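The loop below is a minimal, self-contained sketch of this boosting scheme, not the authors' implementation: a toy linear classifier stands in for the language model, a rank-1 LoRA-style adapter is trained only on the currently misclassified examples, and the adapter is merged into the frozen weights and discarded at the end of each round. The data, model, hyperparameters, and merging rule are all hypothetical placeholders.

```python
# Illustrative sketch of the boosting loop, assuming a toy linear model.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy classification data (hypothetical stand-in for the fine-tuning set).
X = torch.randn(256, 16)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).long()

base = nn.Linear(16, 2)              # frozen "pretrained" layer
for p in base.parameters():
    p.requires_grad_(False)

def misclassified(W, b, X, y):
    """Indices of examples the current merged model gets wrong."""
    preds = (X @ W.T + b).argmax(dim=-1)
    return (preds != y).nonzero(as_tuple=True)[0]

W, b = base.weight.clone(), base.bias.clone()
rank = 1                             # each adapter stays ultra-low-rank

for round_idx in range(4):           # boosting rounds
    hard = misclassified(W, b, X, y)
    if len(hard) == 0:
        break
    # Fresh low-rank adapter delta_W = B @ A, trained only on the hard examples.
    A = nn.Parameter(torch.randn(rank, 16) * 0.01)
    B = nn.Parameter(torch.zeros(2, rank))
    opt = torch.optim.Adam([A, B], lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        logits = X[hard] @ (W + B @ A).T + b
        loss = nn.functional.cross_entropy(logits, y[hard])
        loss.backward()
        opt.step()
    # Merge the adapter into the weights and discard it: no inference overhead.
    W = (W + B @ A).detach()
    print(f"round {round_idx}: {len(hard)} hard examples, loss {loss.item():.3f}")
```

In the actual method, each round's adapter would additionally be confined to a fresh orthogonal subspace via the ROTATE SVD basis, which is what lets the merged update's rank grow across rounds; a separate sketch of that constraint appears after the list below.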
Key facts
- BoostLoRA is a gradient-boosting framework for PEFT.
- It iteratively trains and merges minimal adapters on misclassified examples.
- ROTATE SVD basis strategy assigns each round to an orthogonal subspace (see the sketch after this list).
- Cumulative effective rank grows linearly with rounds.
- Each adapter remains ultra-low-rank.
- Adapters are discarded after merging, leaving zero inference overhead.
- On Qwen2.5-3B, BoostLoRA achieves 89.1% on GSM8K and 68.8% on MATH-500.
- On code generation, it achieves 57.2% on MBPP and 80.4% on HumanEval.
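The sketch below illustrates one way a ROTATE SVD basis strategy could allocate orthogonal subspaces: each round's adapter is pinned to a disjoint block of the base weight's right singular vectors, so updates from different rounds occupy mutually orthogonal row spaces and the merged update's effective rank grows linearly with the number of rounds. This is an assumption-laden illustration, not the paper's exact construction, and all names and dimensions are placeholders.

```python
# Speculative sketch of orthogonal subspace allocation via an SVD basis.
import torch

torch.manual_seed(0)
d_out, d_in, rank, rounds = 32, 64, 2, 4

W0 = torch.randn(d_out, d_in)             # stand-in for a pretrained weight
U, S, Vh = torch.linalg.svd(W0, full_matrices=False)

deltas = []
for t in range(rounds):
    # Round t is pinned to the t-th block of right singular vectors, so the
    # subspaces used by different rounds are mutually orthogonal.
    basis = Vh[t * rank:(t + 1) * rank]    # fixed (rank, d_in) projection
    B_t = torch.randn(d_out, rank) * 0.1   # trainable factor (basis stays frozen)
    deltas.append(B_t @ basis)

merged_update = sum(deltas)
# Effective rank grows linearly with rounds: rank * rounds = 8 here (generically).
print(torch.linalg.matrix_rank(merged_update).item())
```

Freezing the basis and training only the per-round factors keeps each adapter at the same ultra-low rank while guaranteeing that successive rounds cannot collapse into the same subspace.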