Systematic Review Quantifies Gains of Multimodal Fusion in Document Classification

other · 2026-05-26

A systematic review of 139 primary studies introduces a formal framework for information fusion in document classification, covering both multimodal and multiview approaches. Its random-effects meta-analysis, the first focused on document classification, quantifies the performance gains: multimodal fusion improves accuracy by a mean of +5.28 percentage points (p=0.0016), while its effect on F1-score is directionally positive but not statistically significant. Multiview fusion yields consistent but modest accuracy gains of +4.67%. The review also identifies key trends and offers guidance for practitioners, addressing the field's lack of a unified framework and quantitative synthesis.
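For readers unfamiliar with random-effects pooling, the following is a minimal sketch of how a mean gain and its p-value can be computed from per-study results. It uses the DerSimonian-Laird estimator, which is an assumption here: the summary does not say which estimator the review used. The function name `random_effects_pool` and the sample numbers are hypothetical and are not taken from the review.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-study effects with the DerSimonian-Laird random-effects estimator.

    effects:   per-study gains (e.g. fused minus unimodal accuracy, in points)
    variances: per-study sampling variances of those gains
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) estimate, needed for the Q statistic.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the between-study variance tau^2 (truncated at zero).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # Two-sided p-value from a normal approximation: 2 * (1 - Phi(|z|)).
    z = pooled / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    return {
        "pooled": pooled,
        "se": se,
        "tau2": tau2,
        "p_value": p_value,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
    }


# Hypothetical accuracy gains (percentage points) and variances; NOT review data.
example_gains = [6.1, 3.2, 8.4, 4.0, 5.5]
example_variances = [1.5, 0.8, 2.9, 1.1, 2.0]
result = random_effects_pool(example_gains, example_variances)
print(result)
```

Unlike a fixed-effect model, the random-effects weights include the between-study variance, so a few very precise studies dominate the pooled estimate less when results differ across datasets and architectures.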

Key facts

Systematic review of 139 primary studies
Introduces a formal framework for information fusion
First random-effects meta-analysis focused on document classification
Multimodal fusion improves accuracy by +5.28 percentage points (p=0.0016)
F1-score effect for multimodal fusion is directionally positive but non-significant
Multiview fusion provides consistent accuracy gains of +4.67%
Addresses lack of unified framework and quantitative synthesis
Provides guidance for practitioners

Sources

arXiv cs.AI — 2026-05-26