PARA: Data-Free LoRA Compression via Adaptive Rank Allocation
arXiv:2604.27796 introduces Post-Optimization Adaptive Rank Allocation (PARA), a data-free method for compressing Low-Rank Adaptation (LoRA) models by pruning ranks via Singular Value Decomposition (SVD) with a single global threshold. Unlike standard LoRA, which assigns a uniform rank to every layer, PARA allocates non-uniform ranks according to layer-wise spectral importance, reducing parameter count by 75-90% while preserving predictive performance. As a post-hoc technique, it avoids the training modifications and instabilities common to dynamic-rank architectures, and it integrates into existing fine-tuning pipelines without additional data or retraining. Empirical results show substantial compression with minimal accuracy loss, addressing parameter redundancy in large foundation models.
Key facts
- PARA is a data-free compression method for LoRA
- Uses Singular Value Decomposition with global threshold
- Allocates non-uniform ranks based on layer-wise spectral importance
- Reduces parameter count by 75-90%
- Preserves predictive performance
- Post-hoc method avoids training modifications
- Integrates into existing fine-tuning pipelines
- Addresses parameter redundancy in foundation models
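The scheme above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes each LoRA update is ΔW = B·A, and the `energy_keep` parameter and the pooled-energy rule for deriving the global threshold are my own illustrative choices, since the exact thresholding criterion is not specified here.

```python
import numpy as np

def prune_lora_ranks(lora_layers, energy_keep=0.9):
    """Data-free rank pruning sketch: SVD each LoRA update B @ A, then pick a
    per-layer rank by comparing singular values to one global threshold
    derived from the pooled spectrum of all layers (assumed criterion)."""
    # Collect the singular values of every layer's update matrix.
    spectra = [np.linalg.svd(B @ A, compute_uv=False) for B, A in lora_layers]
    pooled = np.sort(np.concatenate(spectra))[::-1]
    # Global threshold: the smallest singular value still inside the top
    # `energy_keep` fraction of total pooled spectral energy.
    cum = np.cumsum(pooled ** 2) / np.sum(pooled ** 2)
    tau = pooled[np.searchsorted(cum, energy_keep)]
    pruned = []
    for B, A in lora_layers:
        U, s, Vt = np.linalg.svd(B @ A, full_matrices=False)
        r = max(1, int(np.sum(s >= tau)))  # non-uniform, layer-wise rank
        # Refactor into a smaller low-rank pair: B' = U_r * s_r, A' = Vt_r.
        pruned.append((U[:, :r] * s[:r], Vt[:r]))
    return pruned
```

Because the threshold is shared globally, layers whose updates concentrate energy in few directions end up with small ranks, while spectrally richer layers keep more, which is the non-uniform allocation the summary describes.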