PARA: Data-Free LoRA Compression via Adaptive Rank Allocation
arXiv:2604.27796 introduces Post-Optimization Adaptive Rank Allocation (PARA), a data-free method for compressing Low-Rank Adaptation (LoRA) models by pruning ranks via Singular Value Decomposition (SVD) with a single global threshold. Unlike standard LoRA, which assigns a uniform rank to every layer, PARA allocates non-uniform ranks according to layer-wise spectral importance, reducing parameter count by 75-90% while preserving predictive performance. As a post-hoc technique, it avoids the training modifications and instabilities common to dynamic-rank architectures, and it integrates into existing fine-tuning pipelines without additional data or retraining. Empirical results show substantial compression with minimal accuracy loss, addressing parameter redundancy in large foundation models.
Key facts
- PARA is a data-free compression method for LoRA
- Uses Singular Value Decomposition with global threshold
- Allocates non-uniform ranks based on layer-wise spectral importance
- Reduces parameter count by 75-90%
- Preserves predictive performance
- Post-hoc method avoids training modifications
- Integrates into existing fine-tuning pipelines
- Addresses parameter redundancy in foundation models
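The scheme above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes each LoRA update is ΔW = B·A, and the `energy_keep` parameter and the pooled-energy rule for deriving the global threshold are my own illustrative choices, since the exact thresholding criterion is not specified here.

```python
import numpy as np

def prune_lora_ranks(lora_layers, energy_keep=0.9):
    """Data-free rank pruning sketch: SVD each LoRA update B @ A, then pick a
    per-layer rank by comparing singular values to one global threshold
    derived from the pooled spectrum of all layers (assumed criterion)."""
    # Collect the singular values of every layer's update matrix.
    spectra = [np.linalg.svd(B @ A, compute_uv=False) for B, A in lora_layers]
    pooled = np.sort(np.concatenate(spectra))[::-1]
    # Global threshold: the smallest singular value still inside the top
    # `energy_keep` fraction of total pooled spectral energy.
    cum = np.cumsum(pooled ** 2) / np.sum(pooled ** 2)
    tau = pooled[np.searchsorted(cum, energy_keep)]
    pruned = []
    for B, A in lora_layers:
        U, s, Vt = np.linalg.svd(B @ A, full_matrices=False)
        r = max(1, int(np.sum(s >= tau)))  # non-uniform, layer-wise rank
        # Refactor into a smaller low-rank pair: B' = U_r * s_r, A' = Vt_r.
        pruned.append((U[:, :r] * s[:r], Vt[:r]))
    return pruned
```

Because the threshold is shared globally, layers whose updates concentrate energy in few directions end up with small ranks, while spectrally richer layers keep more, which is the non-uniform allocation the summary describes.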