Pruning the Silent Phase Triggers LLM Performance Collapse
A new arXiv preprint (2605.07271) investigates why layer pruning in large language models (LLMs) causes sudden performance collapse. The authors propose studying pruning through the lens of decision representations, introducing Decision Margin and Option Frequency metrics along with an Iterative Pruning method. Analyzing multiple-choice tasks, they find a sharp decision transition that divides the network into a Silent Phase, whose layers cannot yet predict the correct answer, and a Decisive Phase, where the correct prediction emerges. Pruning layers in the Decisive Phase has minimal effect, but pruning layers in the Silent Phase triggers immediate collapse, indicating that the Silent Phase is extremely sensitive to structural changes. The paper concludes that pruning-induced collapse stems from disrupting the Silent Phase.
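The preprint's abstract does not spell out how the two metrics are defined. A common reading for multiple-choice tasks is that the Decision Margin is the gap between the model's score for the correct option and its best-scoring competitor, and Option Frequency is how often each option wins the argmax across a dataset. The sketch below is a minimal illustration under those assumptions; the function names, the per-layer logit readout they would be applied to, and the exact formulas are illustrative, not the authors' formulation.

```python
import numpy as np

def decision_margin(option_logits: np.ndarray, correct_idx: int) -> float:
    """Gap between the correct option's score and the best competing option.
    Positive margin => this layer's readout already favors the right answer.
    (Assumed definition; the preprint's exact formula may differ.)"""
    competitors = np.delete(option_logits, correct_idx)
    return float(option_logits[correct_idx] - competitors.max())

def option_frequency(per_example_logits: np.ndarray) -> np.ndarray:
    """How often each answer option is the argmax across a batch of examples.
    per_example_logits: shape (num_examples, num_options)."""
    picks = per_example_logits.argmax(axis=1)
    counts = np.bincount(picks, minlength=per_example_logits.shape[1])
    return counts / len(picks)

# Example: a layer whose readout barely separates the options (Silent Phase-like)
# versus one with a clear winner (Decisive Phase-like).
silent_like = np.array([0.10, 0.12, 0.11, 0.09])
decisive_like = np.array([0.10, 2.30, 0.15, 0.05])
print(decision_margin(silent_like, correct_idx=1))    # ~0.01, near zero
print(decision_margin(decisive_like, correct_idx=1))  # ~2.15, clearly decisive
```

Tracked layer by layer, a margin that stays near zero for many layers and then jumps is exactly the kind of sharp decision transition the paper describes.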
Key facts
- arXiv preprint 2605.07271
- Layer pruning in LLMs causes sudden performance collapse
- Study uses Decision Margin and Option Frequency metrics
- An Iterative Pruning method is used to probe layer-wise decision dynamics (a rough sketch follows this list)
- Network splits into Silent Phase and Decisive Phase
- Silent Phase is extremely sensitive to pruning
- Decisive Phase pruning has minimal impact
- Collapse caused by disrupting the Silent Phase
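As a rough illustration of what an iterative layer-pruning loop can look like (this greedy variant is an assumption, not necessarily the authors' procedure), the sketch below drops one transformer layer per round, keeping whichever removal hurts accuracy least, and records the resulting trajectory; the `evaluate` callable is a hypothetical placeholder for task accuracy with a given set of layers kept.

```python
from typing import Callable, List, Tuple

def iterative_layer_pruning(
    layer_ids: List[int],
    evaluate: Callable[[List[int]], float],  # hypothetical: accuracy with the given layers kept
) -> List[Tuple[int, float]]:
    """Greedily drop one layer per round, choosing the removal that hurts accuracy least.
    Returns the (dropped_layer, resulting_accuracy) trajectory, which makes a sudden
    collapse easy to spot once 'Silent Phase' layers finally have to be removed."""
    kept = list(layer_ids)
    trajectory: List[Tuple[int, float]] = []
    while len(kept) > 1:
        best = max(
            ((layer, evaluate([l for l in kept if l != layer])) for layer in kept),
            key=lambda pair: pair[1],
        )
        kept.remove(best[0])
        trajectory.append(best)
    return trajectory

# Toy usage: pretend accuracy stays high only while the first two ("silent") layers survive.
def toy_eval(kept: List[int]) -> float:
    return 0.9 if {0, 1} <= set(kept) else 0.25

print(iterative_layer_pruning(list(range(6)), toy_eval))
# Accuracy holds at 0.9 while later layers are removed, then drops the moment a
# "silent" layer goes, mirroring the collapse pattern described in the preprint.
```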
Entities
Institutions
- arXiv