Pruning the Silent Phase Triggers LLM Performance Collapse
A new arXiv preprint (2605.07271) investigates why layer pruning in large language models (LLMs) causes sudden performance collapse. The authors propose studying pruning through the lens of decision representations, introducing Decision Margin and Option Frequency metrics along with an Iterative Pruning method. Analyzing multiple-choice tasks, they find a sharp decision transition that divides the network into a Silent Phase, whose layers cannot yet predict the correct answer, and a Decisive Phase, where the correct prediction emerges. Pruning layers in the Decisive Phase has minimal effect, but pruning layers in the Silent Phase triggers immediate collapse, indicating that the Silent Phase is extremely sensitive to structural changes. The paper concludes that pruning-induced collapse stems from disrupting the Silent Phase.
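The preprint's abstract does not spell out how the two metrics are defined. A common reading for multiple-choice tasks is that the Decision Margin is the gap between the model's score for the correct option and its best-scoring competitor, and Option Frequency is how often each option wins the argmax across a dataset. The sketch below is a minimal illustration under those assumptions; the function names, the per-layer logit readout they would be applied to, and the exact formulas are illustrative, not the authors' formulation.

```python
import numpy as np

def decision_margin(option_logits: np.ndarray, correct_idx: int) -> float:
    """Gap between the correct option's score and the best competing option.
    Positive margin => this layer's readout already favors the right answer.
    (Assumed definition; the preprint's exact formula may differ.)"""
    competitors = np.delete(option_logits, correct_idx)
    return float(option_logits[correct_idx] - competitors.max())

def option_frequency(per_example_logits: np.ndarray) -> np.ndarray:
    """How often each answer option is the argmax across a batch of examples.
    per_example_logits: shape (num_examples, num_options)."""
    picks = per_example_logits.argmax(axis=1)
    counts = np.bincount(picks, minlength=per_example_logits.shape[1])
    return counts / len(picks)

# Example: a layer whose readout barely separates the options (Silent Phase-like)
# versus one with a clear winner (Decisive Phase-like).
silent_like = np.array([0.10, 0.12, 0.11, 0.09])
decisive_like = np.array([0.10, 2.30, 0.15, 0.05])
print(decision_margin(silent_like, correct_idx=1))    # ~0.01, near zero
print(decision_margin(decisive_like, correct_idx=1))  # ~2.15, clearly decisive
```

Tracked layer by layer, a margin that stays near zero for many layers and then jumps is exactly the kind of sharp decision transition the paper describes.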
Key facts
- arXiv preprint 2605.07271
- Layer pruning in LLMs causes sudden performance collapse
- Study uses Decision Margin and Option Frequency metrics
- An Iterative Pruning method is used to probe layer-wise decision dynamics (a rough sketch follows this list)
- Network splits into Silent Phase and Decisive Phase
- Silent Phase is extremely sensitive to pruning
- Decisive Phase pruning has minimal impact
- Collapse caused by disrupting the Silent Phase
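As a rough illustration of what an iterative layer-pruning loop can look like (this greedy variant is an assumption, not necessarily the authors' procedure), the sketch below drops one transformer layer per round, keeping whichever removal hurts accuracy least, and records the resulting trajectory; the `evaluate` callable is a hypothetical placeholder for task accuracy with a given set of layers kept.

```python
from typing import Callable, List, Tuple

def iterative_layer_pruning(
    layer_ids: List[int],
    evaluate: Callable[[List[int]], float],  # hypothetical: accuracy with the given layers kept
) -> List[Tuple[int, float]]:
    """Greedily drop one layer per round, choosing the removal that hurts accuracy least.
    Returns the (dropped_layer, resulting_accuracy) trajectory, which makes a sudden
    collapse easy to spot once 'Silent Phase' layers finally have to be removed."""
    kept = list(layer_ids)
    trajectory: List[Tuple[int, float]] = []
    while len(kept) > 1:
        best = max(
            ((layer, evaluate([l for l in kept if l != layer])) for layer in kept),
            key=lambda pair: pair[1],
        )
        kept.remove(best[0])
        trajectory.append(best)
    return trajectory

# Toy usage: pretend accuracy stays high only while the first two ("silent") layers survive.
def toy_eval(kept: List[int]) -> float:
    return 0.9 if {0, 1} <= set(kept) else 0.25

print(iterative_layer_pruning(list(range(6)), toy_eval))
# Accuracy holds at 0.9 while later layers are removed, then drops the moment a
# "silent" layer goes, mirroring the collapse pattern described in the preprint.
```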
Entities
Institutions
- arXiv