InfoQuant: Optimizing Activation Distributions for Low-Bit LLM Quantization

publication · 2026-05-27

A recent study published on arXiv (2605.26175), entitled 'InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization,' tackles low-bit activation quantization in large language models (LLMs). The authors argue that current post-training quantization (PTQ) techniques do not specify which activation distributions are easy to discretize, so quantization error can remain large even when the activations look numerically smoother. They recast activation transformation as quantizer-facing distribution design and analyze quantization error through an information-theoretic lens. Their analysis indicates that activations suited to quantization should have both a smaller numerical range and sufficient dispersion.
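The two properties named above can be illustrated with a toy experiment. The sketch below is not the InfoQuant method. It assumes a plain symmetric 4-bit round-to-nearest quantizer with a per-tensor absmax scale, and it uses the empirical entropy of the integer codes as a rough stand-in for dispersion. The helper names (`quantize`, `mse`, `code_entropy`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def quantize(xs, bits=4):
    """Symmetric uniform quantizer with a per-tensor absmax scale (an assumption)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(x) for x in xs) / qmax
    codes = [max(-qmax, min(qmax, round(x / scale))) for x in xs]
    return codes, scale


def mse(xs, codes, scale):
    """Mean squared error between the values and their dequantized codes."""
    return sum((x - c * scale) ** 2 for x, c in zip(xs, codes)) / len(xs)


def code_entropy(codes):
    """Empirical entropy in bits of the integer codes (proxy for dispersion)."""
    n = len(codes)
    return -sum((k / n) * math.log2(k / n) for k in Counter(codes).values())


rng = random.Random(0)
compact = [rng.gauss(0.0, 1.0) for _ in range(4096)]
# Same values, but one large outlier stretches the numerical range.
with_outlier = compact[:-1] + [60.0]

results = {}
for name, xs in [("compact", compact), ("outlier", with_outlier)]:
    codes, scale = quantize(xs, bits=4)
    results[name] = {
        "scale": scale,
        "mse": mse(xs, codes, scale),
        "entropy_bits": code_entropy(codes),
    }
    print(name, results[name])
```

In this toy setting the single outlier inflates the scale, so most values fall onto very few quantization levels: the error rises and the code entropy drops. That matches the summary's point that a small range and sufficient spread across levels both matter, although the paper's own analysis and transformation may differ from this sketch.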

Key facts

Paper is on arXiv with ID 2605.26175
Title: InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
Focuses on low-bit activation quantization for LLMs
Critiques existing PTQ methods for not specifying easy-to-discretize distributions
Proposes quantizer-facing distribution design
Uses information-theoretic analysis of quantization error
Key insight: activations need smaller numerical range and sufficient dispersion
Published on arXiv (May 2026, based on ID prefix 2605)
