ARTFEED — Contemporary Art Intelligence

HiF8 W8A8 QAT Failure Modes in OpenPangu-Embedded-1B

other · 2026-05-27

A study of quantization-aware training (QAT) with HiF8 W8A8 (8-bit weights and activations) for OpenPangu-Embedded-1B identifies two orthogonal failure modes: amax saturation, where delayed scale estimates under Delayed Tensor Scaling (DTS) cause forward-pass clipping, and catastrophic forgetting, caused by an overly aggressive learning rate. Neither is detectable from the training loss. For amax saturation, the authors propose a conservative max-algorithm DTS over a 64-step window; for forgetting, a 500-step BF16 warmup with lr=10^{-5}. They report that both fixes are required and that together they are sufficient.
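To make the amax-saturation mechanism concrete, here is a minimal sketch of delayed scaling with a max-over-window amax history. It is an illustration under stated assumptions, not the authors' implementation: the class name `DelayedMaxScaler`, its methods, and the `fmt_max` parameter (the largest magnitude the 8-bit format can represent, which the summary does not state) are all hypothetical.

```python
from collections import deque


class DelayedMaxScaler:
    """Illustrative delayed tensor scaling with a max-over-window amax history.

    All names here are hypothetical. `fmt_max` must be supplied by the caller
    because the summary does not give the format's maximum magnitude.
    """

    def __init__(self, fmt_max, window=64, eps=1e-12):
        self.fmt_max = float(fmt_max)
        self.history = deque(maxlen=window)  # amax of the last `window` steps
        self.eps = eps

    def scale(self):
        # The scale for the current step uses only amax values from past steps.
        if not self.history:
            return 1.0
        # Conservative choice: the largest amax in the window, not the latest.
        amax = max(max(self.history), self.eps)
        return self.fmt_max / amax

    def update(self, values):
        # Record the current tensor's amax after the step has run.
        self.history.append(max(abs(v) for v in values))

    def saturates(self, values):
        # True if any scaled value would exceed the representable range,
        # which is the forward-pass clipping described above.
        s = self.scale()
        return any(abs(v) * s > self.fmt_max for v in values)


if __name__ == "__main__":
    scaler = DelayedMaxScaler(fmt_max=100.0, window=64)  # fmt_max is a toy value
    scaler.update([1.0, -2.0])
    scaler.update([0.5])
    print(scaler.scale(), scaler.saturates([3.0]))
```

Taking the maximum over the window keeps the scale tied to the largest recent activation, so a brief dip in amax does not inflate the scale and clip the next large tensor. A tensor that exceeds every amax in the window still saturates, which is the residual risk of any delayed estimate.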

Key facts

  • arXiv:2605.26189v1
  • HiF8 W8A8 QAT for OpenPangu-Embedded-1B
  • Delayed Tensor Scaling (DTS) used
  • Two failure modes: amax saturation and catastrophic forgetting
  • amax saturation caused by delayed scale estimates
  • Catastrophic forgetting from aggressive learning rate
  • Conservative max-algorithm DTS over 64-step window proposed
  • 500-step BF16 warmup with lr=10^{-5} proposed
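
The warmup item can be read in more than one way. The sketch below assumes one reading: the first 500 steps run in plain BF16 with no 8-bit fake quantization, HiF8 quantization is enabled afterwards, and the learning rate stays constant at 10^{-5}. The function name `qat_phase` and this interpretation are assumptions, not details confirmed by the summary.

```python
def qat_phase(step, warmup_steps=500, lr=1e-5):
    """Return (use_fake_quant, learning_rate) for a given training step.

    Assumed reading of the recipe: steps [0, warmup_steps) run in BF16
    without fake quantization; later steps enable it. The learning rate
    is held constant throughout.
    """
    use_fake_quant = step >= warmup_steps
    return use_fake_quant, lr


if __name__ == "__main__":
    for step in (0, 499, 500):
        print(step, qat_phase(step))
```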

Entities

Institutions

  • arXiv