Echo-LoRA: Cross-Layer Injection for Efficient LLM Fine-Tuning
Researchers propose Echo-LoRA, a parameter-efficient fine-tuning method that injects cross-layer representations into shallow LoRA modules. During training, boundary hidden states from deeper source layers are aggregated into a sample-level echo representation, then projected and gated through lightweight networks into the shallow adapters. Answer-only masking, masked distillation, and stochastic routing stabilize the auxiliary path and reduce the train-inference gap. Evaluated on eight commonsense reasoning benchmarks, Echo-LoRA outperforms standard LoRA and DoRA baselines while adding minimal parameters. The method is published on arXiv under ID 2605.08177.
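Since only this summary is available, the PyTorch sketch below is one plausible reading of the injection mechanism, not the authors' implementation: the class and function names (EchoLoRALinear, make_echo), the mean-pooling aggregation, the rank and echo_dim sizes, and the sigmoid gate are all assumptions.

```python
import torch
import torch.nn as nn

class EchoLoRALinear(nn.Module):
    """Sketch of a LoRA linear layer that also receives a cross-layer
    'echo' signal aggregated from deeper source layers. All names and
    design choices here are illustrative, not from the paper."""

    def __init__(self, base: nn.Linear, rank: int = 8, echo_dim: int = 768):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # freeze pretrained weights
            p.requires_grad_(False)
        d_in, d_out = base.in_features, base.out_features
        # Standard LoRA low-rank adapters.
        self.lora_a = nn.Linear(d_in, rank, bias=False)
        self.lora_b = nn.Linear(rank, d_out, bias=False)
        nn.init.zeros_(self.lora_b.weight)    # adapter starts as a no-op
        # Lightweight projection of the echo into the adapter's low-rank
        # space, plus a scalar gate (assumed sigmoid) controlling injection.
        self.echo_proj = nn.Linear(echo_dim, rank, bias=False)
        self.echo_gate = nn.Sequential(nn.Linear(echo_dim, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor, echo: torch.Tensor) -> torch.Tensor:
        # x:    (batch, seq, d_in) shallow-layer activations
        # echo: (batch, echo_dim) sample-level echo aggregated from
        #       deeper layers' boundary hidden states
        h = self.lora_a(x)
        gate = self.echo_gate(echo).unsqueeze(1)          # (batch, 1, 1)
        h = h + gate * self.echo_proj(echo).unsqueeze(1)  # inject echo
        return self.base(x) + self.lora_b(h)


def make_echo(deep_hidden_states: list[torch.Tensor]) -> torch.Tensor:
    """Aggregate boundary hidden states from deeper source layers into a
    sample-level echo; mean over layers and tokens is an assumption."""
    stacked = torch.stack(deep_hidden_states)  # (layers, batch, seq, dim)
    return stacked.mean(dim=(0, 2))            # (batch, dim)
```

Note that shallow layers run before deep ones in a single forward pass, so the echo must come from a separate or earlier pass over the source layers; the summary does not say how Echo-LoRA schedules this, which is presumably why the train-inference gap arises.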
Key facts
- Echo-LoRA is a cross-layer representation injection method for parameter-efficient fine-tuning.
- It collects boundary hidden states from deeper layers and aggregates them into an echo representation.
- Lightweight projection and gating networks inject the signal into shallow LoRA or DoRA modules.
- Answer-only masking, masked distillation, and stochastic routing stabilize training (sketched below this list).
- Evaluated on eight commonsense reasoning benchmarks.
- Outperforms standard LoRA and DoRA baselines.
- Published on arXiv with ID 2605.08177.
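The summary does not define these stabilization techniques precisely, so the sketch below encodes one common reading under explicit assumptions: answer-only masking as restricting the LM loss to answer tokens, and stochastic routing as randomly dropping the echo path per sample during training so the model also learns to operate without it, which would shrink the train-inference gap if the echo is unavailable at inference. Masked distillation is not sketched. All names and the drop probability are hypothetical.

```python
import torch
import torch.nn.functional as F

def answer_only_loss(logits: torch.Tensor, labels: torch.Tensor,
                     answer_mask: torch.Tensor) -> torch.Tensor:
    """Answer-only masking: average the LM loss over answer tokens only.
    Shapes: logits (batch, seq, vocab); labels (batch, seq) long;
    answer_mask (batch, seq) float, 1.0 on answer tokens, 0.0 elsewhere."""
    per_token = F.cross_entropy(
        logits.transpose(1, 2), labels, reduction="none")   # (batch, seq)
    return (per_token * answer_mask).sum() / answer_mask.sum().clamp(min=1.0)

def maybe_route_echo(echo: torch.Tensor, p_drop: float = 0.5,
                     training: bool = True) -> torch.Tensor:
    """Stochastic routing (as read from the summary): drop the auxiliary
    echo path per sample with probability p_drop during training."""
    if not training:
        # Assumption: the echo path is disabled at inference entirely.
        return torch.zeros_like(echo)
    keep = (torch.rand(echo.size(0), 1, device=echo.device) > p_drop).float()
    return echo * keep
```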
Entities
Institutions
- arXiv