HiF8 W8A8 QAT Failure Modes in OpenPangu-Embedded-1B

other · 2026-05-27

A study of quantization-aware training (QAT) with HiF8 W8A8 on OpenPangu-Embedded-1B, using Delayed Tensor Scaling (DTS), identifies two independent failure modes: amax saturation, where delayed scale estimates cause clipping in the forward pass, and catastrophic forgetting, caused by an overly aggressive learning rate. Neither failure mode is detectable from the training loss. For amax saturation the authors propose a conservative max-algorithm DTS over a 64-step window; for forgetting they propose a 500-step BF16 warmup with lr=10^{-5}. The authors report that both fixes are necessary and, together, sufficient.
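To make the amax-saturation mechanism concrete, here is a minimal sketch of delayed scaling with a max-over-window amax history. It is not the authors' implementation: the class name `DelayedScaler`, the helper `clipped_fraction`, and the `fmt_max` value are assumptions made for illustration, and `fmt_max` is a placeholder rather than HiF8's actual maximum magnitude.

```python
from collections import deque


class DelayedScaler:
    """Delayed tensor scaling with a max-over-window amax history (sketch)."""

    def __init__(self, fmt_max, window=64):
        # fmt_max: largest representable magnitude of the target format.
        # The caller supplies it; the value used in examples is a placeholder.
        self.fmt_max = float(fmt_max)
        # amax values observed in the most recent `window` steps.
        self.history = deque(maxlen=window)

    def scale(self):
        # The scale for the current step is derived only from earlier steps,
        # which is what makes the estimate "delayed".
        if not self.history:
            return 1.0
        amax = max(self.history)  # conservative choice: largest amax in the window
        return self.fmt_max / amax if amax > 0.0 else 1.0

    def update(self, values):
        # Record the current tensor's amax for use in later steps.
        self.history.append(max(abs(v) for v in values))


def clipped_fraction(values, scale, fmt_max):
    # Fraction of elements whose scaled magnitude exceeds the format maximum.
    return sum(abs(v) * scale > fmt_max for v in values) / len(values)
```

With a history containing amax values 8.0 and then 2.0, the max-over-window scale is `fmt_max / 8.0`, whereas a scale based only on the most recent step would be `fmt_max / 2.0` and would clip any later activation larger than 2.0. The window trades some resolution for headroom against this kind of spike.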

Key facts

arXiv:2605.26189v1
HiF8 W8A8 QAT for OpenPangu-Embedded-1B
Delayed Tensor Scaling (DTS) used
Two failure modes: amax saturation and catastrophic forgetting
amax saturation caused by delayed scale estimates
Catastrophic forgetting from aggressive learning rate
Conservative max-algorithm DTS over 64-step window proposed
500-step BF16 warmup with lr=10^{-5} proposed
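The summary does not spell out how the warmup is scheduled. The sketch below assumes one plausible reading: the first 500 steps run in BF16 without quantization, HiF8 W8A8 is enabled afterwards, and the learning rate stays at 10^{-5} throughout. The function name `training_phase` and the phase labels are hypothetical.

```python
WARMUP_STEPS = 500     # BF16 warmup length stated in the summary
LEARNING_RATE = 1e-5   # learning rate stated in the summary


def training_phase(step):
    """Return (precision, lr) for a zero-based training step.

    Assumption: quantization is switched on only after the warmup,
    and the learning rate is constant across both phases.
    """
    precision = "bf16" if step < WARMUP_STEPS else "hif8_w8a8"
    return precision, LEARNING_RATE
```

A training loop could call `training_phase(step)` once per step and toggle its fake-quantization modules according to the returned label.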
