CAMPA: Efficient Decoupled Multimodal Graph Learning Framework

publication · 2026-05-13

Researchers propose CAMPA, a decoupled multimodal graph learning framework that addresses modal conflict in the propagation and aggregation stages. By aligning semantic information across modalities, the framework improves efficiency and scalability on large-scale multimodal attributed graphs. The paper presents a systematic empirical analysis showing that decoupled multimodal graph neural networks (MGNNs) outperform tightly coupled architectures in computational efficiency, while identifying modal conflict as a key bottleneck. CAMPA introduces cross-modal aligned propagation and aggregation to mitigate semantic divergence and misaligned multi-hop feature trajectories. The work is available on arXiv with identifier 2605.11468.

Key facts

CAMPA stands for Cross-modal Aligned Multimodal Propagation & Aggregation.
The paper is published on arXiv with identifier 2605.11468.
The research focuses on multimodal graph neural networks (MGNNs).
According to the paper's empirical analysis, decoupled MGNNs are more computationally efficient and scalable than tightly coupled architectures.
Modal conflict arises in both propagation and aggregation stages.
Independent multi-hop diffusion causes cross-modal semantic divergence during propagation.
Naive fusion fails to align multi-hop feature trajectories during aggregation.
CAMPA addresses modal conflict through cross-modal alignment.
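
To make the two stages concrete, here is a minimal sketch of the generic decoupled baseline that the facts above describe: each modality is diffused independently over the graph, and the resulting multi-hop features are fused naively. It is not CAMPA's alignment method, which the summary does not specify. The function names, the toy graph, the random features, and the per-hop cosine diagnostic are all assumptions made for illustration.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalize A + I (an assumed, commonly used choice)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def propagate(a_norm, x, num_hops):
    """Return [X, AX, A^2 X, ...], the multi-hop trajectory of one modality."""
    hops = [x]
    for _ in range(num_hops):
        hops.append(a_norm @ hops[-1])
    return hops

def naive_fusion(text_hops, image_hops):
    """Average each modality over hops, then concatenate (no alignment)."""
    text_avg = np.mean(text_hops, axis=0)
    image_avg = np.mean(image_hops, axis=0)
    return np.concatenate([text_avg, image_avg], axis=1)

def hop_alignment(text_hops, image_hops):
    """Hypothetical diagnostic: mean per-node cosine similarity at each hop."""
    scores = []
    for t, v in zip(text_hops, image_hops):
        cos = np.sum(t * v, axis=1) / (
            np.linalg.norm(t, axis=1) * np.linalg.norm(v, axis=1)
        )
        scores.append(float(cos.mean()))
    return scores

# Toy 4-node path graph with random features standing in for two modalities.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
rng = np.random.default_rng(0)
text_x = rng.normal(size=(4, 8))   # assumed text embeddings
image_x = rng.normal(size=(4, 8))  # assumed image embeddings, same dimension

a_norm = normalized_adjacency(adj)
text_hops = propagate(a_norm, text_x, num_hops=2)    # diffused independently
image_hops = propagate(a_norm, image_x, num_hops=2)  # diffused independently
fused = naive_fusion(text_hops, image_hops)
alignment = hop_alignment(text_hops, image_hops)
print(fused.shape, alignment)
```

Because propagation involves no learned parameters, the hop features can be precomputed once before training, which is the usual source of the efficiency gain in decoupled designs. Nothing in this baseline ties the two modalities' trajectories together, which is the gap that cross-modal alignment is meant to close.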
