SplitQ: Channel Splitting for Low-Bit VLM Quantization

ai-technology · 2026-05-20

Researchers propose SplitQ, a post-training quantization (PTQ) framework for large vision-language models (VLMs) that addresses heterogeneous activation distributions across the text and vision modalities. The method introduces a Modality-specific Outlier Channel Decoupling (MOCD) module to isolate salient outlier channels, which are unevenly distributed across modalities. An Adaptive Cross-Modal Calibration (ACC) module then reduces the remaining distribution discrepancies. The work targets efficient deployment of VLMs on resource-constrained devices.

Key facts

arXiv paper 2605.19929 proposes SplitQ for low-bit PTQ of VLMs
Heterogeneous activation distributions between text and vision modalities cause accuracy degradation
Outlier channels are modality-specific and unevenly distributed
MOCD module isolates salient modality-specific outlier channels
ACC module addresses cross-modal distribution discrepancies
Goal is efficient VLM deployment on resource-constrained devices
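
To make the outlier-channel idea concrete, the following is a minimal NumPy sketch of why isolating a few large-magnitude channels helps low-bit quantization. It is not the paper's MOCD module: the selection criterion (peak magnitude per channel), the choice to keep outlier channels in full precision, the synthetic data, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def fake_quantize(x, bits=4):
    """Symmetric per-tensor uniform quantization followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy()
    scale = max_abs / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def top_outlier_channels(acts, k):
    """Indices of the k channels with the largest peak magnitude.
    Hypothetical criterion, chosen only for this sketch."""
    peak = np.max(np.abs(acts), axis=0)
    return np.sort(np.argsort(peak)[-k:])

def split_quantize(acts, outlier_idx, bits=4):
    """Quantize non-outlier channels at low bit width and leave the
    outlier channels untouched (an assumption, not the paper's scheme)."""
    out = acts.copy()
    mask = np.ones(acts.shape[1], dtype=bool)
    mask[outlier_idx] = False
    out[:, mask] = fake_quantize(acts[:, mask], bits)
    return out

# Synthetic activations shaped (tokens, channels); each modality gets
# its outlier in a different channel to mimic modality-specific outliers.
rng = np.random.default_rng(0)
text = rng.normal(size=(64, 32))
vision = rng.normal(size=(64, 32))
text[:, 3] *= 50.0
vision[:, 17] *= 50.0

text_idx = top_outlier_channels(text, k=1)
vision_idx = top_outlier_channels(vision, k=1)

# Mean squared error with and without isolating the outlier channel.
naive_err = np.mean((text - fake_quantize(text)) ** 2)
split_err = np.mean((text - split_quantize(text, text_idx)) ** 2)

print("text outlier channels:", text_idx.tolist())
print("vision outlier channels:", vision_idx.tolist())
print("naive MSE:", naive_err, "split MSE:", split_err)
```

In this toy setup a single outlier channel stretches the per-tensor scale so far that the ordinary channels collapse toward zero at 4 bits. Removing that channel from the low-bit path lets the scale fit the remaining values, and because the outlier sits in a different channel for each modality, one shared outlier set would miss one of them.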
