ARTFEED — Contemporary Art Intelligence

Two Failure Modes of LLM Quantization Identified: Signal Degradation vs Computation Collapse

ai-technology · 2026-04-24

A new arXiv paper (2604.19884) presents a systematic mechanistic analysis of Post-Training Quantization (PTQ) in Large Language Models (LLMs), identifying two qualitatively distinct failure modes that emerge when precision is reduced to 2-bit. The first, Signal Degradation, preserves the model's computational patterns but erodes information precision through cumulative quantization error. The second, Computation Collapse, destroys key components in early layers, preventing correct information processing downstream. While 4-bit quantization is widely regarded as the best trade-off between compression and accuracy, 2-bit typically triggers a catastrophic performance cliff. The study demonstrates that targeted, training-free repair can mitigate Signal Degradation but remains ineffective against Computation Collapse.
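
For intuition on where that cliff comes from, the following minimal sketch (illustrative only; this is not the paper's quantizer, and the per-tensor max-scaling scheme is an assumption) fake-quantizes a Gaussian weight matrix with symmetric uniform rounding at 8, 4, and 2 bits and reports the relative reconstruction error:

```python
import numpy as np

def fake_quantize(w, bits):
    """Symmetric uniform round-to-nearest quantization, then dequantize."""
    qmax = 2 ** (bits - 1) - 1            # 127 at 8-bit, 7 at 4-bit, 1 at 2-bit
    scale = np.abs(w).max() / qmax        # single per-tensor scale (an assumption)
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                      # back to float for comparison

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(1024, 1024))  # toy stand-in for an LLM weight matrix

for bits in (8, 4, 2):
    err = np.linalg.norm(w - fake_quantize(w, bits)) / np.linalg.norm(w)
    print(f"{bits}-bit relative weight error: {err:.3f}")
```

At 2 bits only four levels remain, so under a max-calibrated scale most weights round to zero; error of that magnitude no longer behaves like small added noise, which loosely mirrors the paper's distinction between perturbing a computation and destroying it.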

Key facts

  • Paper arXiv:2604.19884 analyzes LLM quantization failure modes.
  • Two distinct failure modes identified: Signal Degradation and Computation Collapse.
  • Signal Degradation preserves computational patterns but impairs precision via cumulative error.
  • Computation Collapse destroys key components in early layers.
  • 4-bit quantization is widely regarded as the optimal trade-off between compression and accuracy.
  • 2-bit quantization triggers a catastrophic performance cliff.
  • Training-free repair can mitigate Signal Degradation (see the sketch after this list).
  • Training-free repair is ineffective for Computation Collapse.
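
As a concrete example of the kind of intervention the last two points describe, the sketch below applies bias correction, a standard training-free repair from the quantization literature; the paper's actual repair procedure is not detailed in this digest, so the function and the toy 2-bit quantizer here are assumptions for illustration:

```python
import numpy as np

def bias_correct(w_fp, w_q, bias, calib_x):
    """Fold the mean quantization-induced output shift into the layer bias."""
    shift = calib_x @ (w_fp - w_q)        # (N, d_out) output error on calibration inputs
    return bias + shift.mean(axis=0)      # re-centered bias; no weight updates needed

rng = np.random.default_rng(1)
w_fp = rng.normal(0.0, 0.02, size=(512, 512))          # full-precision weights
scale = np.abs(w_fp).max()                             # 2-bit-style per-tensor scale
w_q = np.clip(np.round(w_fp / scale), -2, 1) * scale   # crude 2-bit stand-in
bias = np.zeros(512)
calib_x = rng.normal(1.0, 1.0, size=(128, 512))        # non-zero-mean calibration batch

new_bias = bias_correct(w_fp, w_q, bias, calib_x)
ref = calib_x @ w_fp + bias                            # full-precision reference output
before = np.abs(calib_x @ w_q + bias - ref).mean()
after = np.abs(calib_x @ w_q + new_bias - ref).mean()
print(f"mean |output error| before: {before:.4f}  after: {after:.4f}")
```

Because the correction only re-centers a layer's outputs, it can offset the cumulative drift characteristic of Signal Degradation; it has nothing to work with once the components carrying a computation are destroyed, consistent with the paper's finding that Computation Collapse resists training-free repair.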

Entities

Institutions

  • arXiv
