Motion-Compensated Weight Compression for Neural Networks

other · 2026-05-26

Motion-Compensated Weight Compression (MCWC) is a weight compression technique introduced in a paper on arXiv. It aligns permutation-symmetric blocks, such as hidden units and attention heads, so that redundancy across layers can be exploited, treating depth as a predictable sequence. The method uses a lightweight layer-sequential predictor with periodic keyframes and encodes the quantized prediction residuals with a learned entropy model. The decoder reconstructs the weights in four steps: entropy decoding, dequantization, predictor-guided reconstruction, and inverse alignment. The method is reported to improve compression performance on Transformer language modeling and vision classification tasks.
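The keyframe-and-residual loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the predictor simply reuses the previous reconstructed layer, keyframes are predicted from zeros, the quantizer is uniform with a fixed step, and the learned entropy model is omitted entirely (the integer symbols are returned as-is). The names `encode_layers`, `decode_layers`, `keyframe_every`, and `step` are hypothetical.

```python
from typing import List, Optional, Tuple

Layer = List[float]
Stream = List[Tuple[bool, List[int]]]  # (is_keyframe, quantized residual symbols)


def quantize(values: List[float], step: float) -> List[int]:
    """Uniform scalar quantization to integer symbols."""
    return [round(v / step) for v in values]


def dequantize(symbols: List[int], step: float) -> List[float]:
    """Map integer symbols back to approximate real values."""
    return [s * step for s in symbols]


def encode_layers(layers: List[Layer], keyframe_every: int, step: float) -> Stream:
    """Encode each layer as a quantized residual against a prediction.

    Assumption: the predictor is the previous *reconstructed* layer, so the
    encoder tracks exactly what the decoder will see (closed-loop prediction).
    Keyframes are predicted from zeros and therefore carry the full weights.
    """
    stream: Stream = []
    prev: Optional[Layer] = None
    for i, layer in enumerate(layers):
        is_key = i % keyframe_every == 0
        pred = [0.0] * len(layer) if is_key or prev is None else prev
        symbols = quantize([w - p for w, p in zip(layer, pred)], step)
        stream.append((is_key, symbols))
        # Mirror the decoder so quantization error does not accumulate.
        prev = [p + r for p, r in zip(pred, dequantize(symbols, step))]
    return stream


def decode_layers(stream: Stream, step: float) -> List[Layer]:
    """Dequantize residuals and add them back onto the prediction."""
    layers: List[Layer] = []
    prev: Optional[Layer] = None
    for is_key, symbols in stream:
        pred = [0.0] * len(symbols) if is_key or prev is None else prev
        prev = [p + r for p, r in zip(pred, dequantize(symbols, step))]
        layers.append(prev)
    return layers
```

Because the encoder predicts from reconstructed rather than original weights, the per-weight reconstruction error in this sketch stays within half a quantization step at any depth. When neighbouring layers are similar, the non-keyframe symbols are small integers, which is the kind of distribution an entropy coder can compress well.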

Key facts

MCWC stands for Motion-Compensated Weight Compression.
It aligns permutation-symmetric blocks such as hidden units and attention heads.
The method turns depth into a predictable sequence.
It uses a lightweight layer-sequential predictor with periodic keyframes.
It encodes quantized prediction residuals using a learned entropy model.
The decoder reconstructs deployable weights for fast inference.
It was tested on Transformer language modeling and vision classification.
It is reported to improve compression performance over existing methods.
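
The alignment step in the list above relies on the fact that hidden units can be reordered without changing what a network computes. The sketch below illustrates this for a two-layer ReLU MLP. It is an illustration under stated assumptions, not the paper's procedure: the greedy nearest-row matching is a stand-in (the source does not say how blocks are matched), and the names `greedy_alignment`, `permute_hidden_units`, `inverse_permutation`, and `mlp` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def sq_dist(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_alignment(reference: Matrix, target: Matrix) -> List[int]:
    """Return perm where perm[i] is the target row matched to reference row i.

    Assumption: greedy matching on squared distance between incoming weight
    rows; both matrices must have the same number of rows.
    """
    pairs = sorted(
        (sq_dist(r, t), i, j)
        for i, r in enumerate(reference)
        for j, t in enumerate(target)
    )
    perm = [-1] * len(reference)
    used = set()
    for _, i, j in pairs:
        if perm[i] == -1 and j not in used:
            perm[i] = j
            used.add(j)
    return perm


def permute_hidden_units(w_in: Matrix, w_out: Matrix, perm: List[int]):
    """Reorder hidden units: rows of w_in and the matching columns of w_out."""
    new_in = [w_in[j] for j in perm]
    new_out = [[row[j] for j in perm] for row in w_out]
    return new_in, new_out


def inverse_permutation(perm: List[int]) -> List[int]:
    """Permutation that undoes perm (the decoder's inverse alignment)."""
    inv = [0] * len(perm)
    for i, j in enumerate(perm):
        inv[j] = i
    return inv


def mlp(w_in: Matrix, w_out: Matrix, x: List[float]) -> List[float]:
    """Two-layer MLP with ReLU, used here only to check function preservation."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
```

After alignment, corresponding rows of the reference and target are closer, so a layer-to-layer residual is smaller, while `mlp` returns the same output for the permuted weights. Applying `inverse_permutation(perm)` with `permute_hidden_units` restores the original unit order.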
