Two Failure Modes of LLM Quantization Identified: Signal Degradation vs Computation Collapse
A new arXiv paper (2604.19884) presents a systematic mechanistic analysis of Post-Training Quantization (PTQ) in Large Language Models (LLMs). While 4-bit quantization is widely considered the optimal trade-off, pushing to 2-bit typically triggers a catastrophic performance cliff, and the paper traces this cliff to two qualitatively distinct failure modes. In the first, Signal Degradation, the model's computational patterns are preserved but information precision is impaired by cumulative quantization error. In the second, Computation Collapse, quantization destroys key components in early layers, preventing correct information processing downstream. The study demonstrates that targeted, training-free repair can mitigate Signal Degradation but remains ineffective against Computation Collapse.
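The scale of the 2-bit cliff is easy to reproduce with the crudest form of PTQ. The sketch below is illustrative code written for this summary, not the paper's setup: it applies per-tensor symmetric round-to-nearest quantization to a Gaussian toy weight matrix (real PTQ methods use finer-grained per-channel or per-group scales) and prints the relative reconstruction error at 8, 4, and 2 bits.

```python
import numpy as np

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform round-to-nearest: quantize, then dequantize."""
    qmax = 2 ** (bits - 1) - 1        # 127 at 8-bit, 7 at 4-bit, 1 at 2-bit
    scale = np.abs(w).max() / qmax    # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(1024, 1024))  # toy "weight matrix"

for bits in (8, 4, 2):
    rel_err = np.linalg.norm(w - fake_quantize(w, bits)) / np.linalg.norm(w)
    print(f"{bits}-bit relative reconstruction error: {rel_err:.3f}")
```

At 2 bits the grid has only four levels, so the step size equals the largest weight magnitude and any weight smaller than half of it rounds to zero; that discontinuous loss of representable detail, rather than a smooth continuation of the 8-bit-to-4-bit trend, is the arithmetic behind the cliff.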
Key facts
- Paper arXiv:2604.19884 analyzes LLM quantization failure modes.
- Two distinct failure modes identified: Signal Degradation and Computation Collapse.
- Signal Degradation preserves computational patterns but impairs precision via cumulative error.
- Computation Collapse destroys key components in early layers (both modes are caricatured in the sketch after this list).
- 4-bit quantization is widely regarded as the optimal trade-off.
- 2-bit quantization typically triggers a catastrophic performance cliff.
- Training-free repair can mitigate Signal Degradation.
- Training-free repair is ineffective for Computation Collapse.
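As a companion to the list above, here is a toy caricature of how the two modes differ mechanically. This is illustrative code written for this summary, with assumptions labeled in the comments; it does not reproduce the paper's analysis, and since the paper's repair method is not detailed here, none is implemented. Part one quantizes every layer of a small tanh stack to 2 bits with per-group scales and prints how activation error compounds with depth (Signal Degradation). Part two shows a collapse mechanism in the weights themselves: planting an outlier in every quantization group blows up the scales, and the ordinary weights of that component all round to zero (Computation Collapse).

```python
import numpy as np

rng = np.random.default_rng(0)
DEPTH, DIM, GROUP = 8, 256, 16    # toy sizes, not taken from the paper

def quantize_2bit(w: np.ndarray, group: int) -> np.ndarray:
    """2-bit symmetric round-to-nearest with one scale per `group` weights."""
    g = w.reshape(-1, group)
    scale = np.abs(g).max(axis=1, keepdims=True)   # qmax = 1 at 2 bits
    return (np.clip(np.round(g / scale), -2, 1) * scale).reshape(w.shape)

# --- Signal Degradation: each layer's error is survivable, but it compounds
layers = [rng.normal(0.0, DIM ** -0.5, (DIM, DIM)) for _ in range(DEPTH)]
x_ref = x_q = rng.normal(0.0, 1.0, DIM)            # shared input
for depth, w in enumerate(layers, start=1):
    x_ref = np.tanh(w @ x_ref)                     # full-precision run
    x_q = np.tanh(quantize_2bit(w, GROUP) @ x_q)   # 2-bit run
    rel = np.linalg.norm(x_q - x_ref) / np.linalg.norm(x_ref)
    print(f"layer {depth}: relative activation error {rel:.2f}")

# --- Computation Collapse: outliers blow up the scales and erase the bulk
clean = rng.normal(0.0, DIM ** -0.5, (DIM, DIM))
broken = clean.copy()
broken[:, ::GROUP] = 1.0           # one extreme outlier in every group
for name, w in (("clean", clean), ("outliers", broken)):
    wq = quantize_2bit(w, GROUP)
    bulk = np.abs(w) < 0.5         # the ordinary (non-outlier) weights
    err = np.linalg.norm((w - wq)[bulk]) / np.linalg.norm(w[bulk])
    print(f"{name}: bulk-weight relative error {err:.2f}")
```

In the first part the error grows layer by layer but the quantized run still tracks the reference, which is the kind of structured residue a training-free correction can act on. In the second part the outlier-laden matrix shows a bulk-weight error of 1.00: every ordinary weight in that component was rounded away, leaving nothing for a post-hoc fix to recover, which is consistent with the paper's finding that repair fails under Computation Collapse.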