ARTFEED — Contemporary Art Intelligence

BDQ: A New Post-Training Quantization Method for LLMs

ai-technology · 2026-05-20

A new paper on arXiv (2605.18800) introduces Bidirectional Diagonal Quantization (BDQ), a post-training quantization method for Large Language Models (LLMs). The authors first model the mathematical relationship between quantization error and activation outliers, then propose a metric called Flatness to quantify how outliers are distributed. From this metric they derive a theoretically optimal solution. BDQ targets persistent outlier patterns in transformed weights and activations, which degrade model performance at low bit precision, and is aimed at LLM compression and inference acceleration.
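To illustrate why outliers matter for quantization error, here is a minimal sketch in plain Python. It assumes symmetric per-tensor uniform quantization at 4 bits, which is a common baseline and not necessarily the scheme used in the paper. The function `flatness_proxy` is a hypothetical stand-in (mean absolute value divided by maximum absolute value); this summary does not give the paper's actual definition of Flatness.

```python
import random

def quantize_dequantize(values, bits=4):
    """Symmetric per-tensor uniform quantization, then dequantization."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for signed 4-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    out = []
    for v in values:
        q = round(v / scale)                   # map to the integer grid
        q = max(-qmax, min(qmax, q))           # clamp to the representable range
        out.append(q * scale)                  # map back to real values
    return out

def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def flatness_proxy(values):
    """Hypothetical proxy, NOT the paper's metric: mean|x| / max|x|.

    Values near 1 mean magnitudes are evenly spread; values near 0 mean
    a few large entries dominate the range.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return 1.0
    return sum(abs(v) for v in values) / (len(values) * max_abs)

rng = random.Random(0)
flat = [rng.uniform(-1.0, 1.0) for _ in range(1024)]
spiky = list(flat)
spiky[0] = 50.0                                # inject a single outlier

mse_flat = mean_squared_error(flat, quantize_dequantize(flat))
mse_spiky = mean_squared_error(spiky, quantize_dequantize(spiky))

print("flat :", flatness_proxy(flat), mse_flat)
print("spiky:", flatness_proxy(spiky), mse_spiky)
```

In this toy setup the single outlier stretches the quantization step so far that the remaining values all round to zero, so the error on the outlier-laden vector is far larger than on the flat one.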

Key facts

  • Paper ID: arXiv:2605.18800
  • Published on arXiv
  • Introduces Flatness metric for outlier distribution
  • Proposes Bidirectional Diagonal Quantization (BDQ)
  • Addresses activation outliers in LLM quantization
  • Derives a theoretically optimal solution based on Flatness
  • Focuses on post-training quantization
  • Aims to improve LLM inference at lower bit precision
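This summary does not describe BDQ's actual transform. As general background on what a diagonal transform of weights and activations can look like, the sketch below shows the well-known per-channel rescaling used by methods such as SmoothQuant: activations are divided by a per-channel scale and the matching weight rows are multiplied by it, so the matrix product is unchanged while the activation range shrinks. The function names and the square-root balancing rule are illustrative assumptions, not taken from the paper.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of a (n x d) and b (d x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def diagonal_rescale(x, w):
    """Balance per-channel ranges of x and w with a diagonal scale.

    Assumed rule (SmoothQuant-style, alpha = 0.5), NOT the BDQ method:
    s_j = sqrt(max|x[:, j]| / max|w[j, :]|).
    """
    act_max = [max(abs(row[j]) for row in x) for j in range(len(x[0]))]
    w_max = [max(abs(v) for v in row) for row in w]
    scales = [math.sqrt(a / b) if a > 0 and b > 0 else 1.0
              for a, b in zip(act_max, w_max)]
    x_scaled = [[v / s for v, s in zip(row, scales)] for row in x]
    w_scaled = [[v * s for v in row] for row, s in zip(w, scales)]
    return x_scaled, w_scaled, scales

# Toy activations with one outlier channel (index 1) and small weights.
x = [[0.5, -40.0, 0.2],
     [-0.3, 35.0, 0.1]]
w = [[0.2, -0.1],
     [0.05, 0.02],
     [-0.3, 0.4]]

x_s, w_s, scales = diagonal_rescale(x, w)

original = matmul(x, w)
rescaled = matmul(x_s, w_s)

peak_before = max(abs(v) for row in x for v in row)
peak_after = max(abs(v) for row in x_s for v in row)
print("activation peak before/after:", peak_before, peak_after)
```

The product is preserved because the scale and its inverse cancel inside the matrix multiplication; only the distribution of magnitude between activations and weights changes, which is what makes the rescaled tensors easier to quantize.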

Entities

Institutions

  • arXiv