Hybrid CNN-ViT Model Achieves 97.6% Accuracy in Brain Tumor MRI Classification

ai-technology · 2026-04-29

Researchers have introduced a hybrid deep learning framework for brain tumor MRI classification that couples a SqueezeNet-style CNN branch with a MobileViT-style global transformer branch through an Adaptive Attention Gate. The gate learns weights on a per-sample, per-feature basis, so the model can blend local and global representations dynamically rather than with a fixed ratio. On the Brain Tumor MRI Dataset from Kaggle, the model reached a test accuracy of 97.60%, precision of 97.30%, recall of 97.50%, an F1-score of 97.40%, and a macro-average AUC of 0.9946. The study targets the difficulty of capturing both local textures and long-range dependencies in medical imaging.
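To make the gating idea concrete, here is a minimal sketch of how a per-sample, per-feature gate could blend two feature vectors. The summary does not give the paper's actual formulation, so everything here is an assumption for illustration: the sigmoid gate over the concatenated features, the convex blend, and the names `adaptive_gate`, `weights`, and `bias` are all hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_gate(local_feat, global_feat, weights, bias):
    """Blend one sample's CNN and transformer features (hypothetical gate).

    local_feat, global_feat: lists of d floats for a single sample.
    weights: d rows of 2*d floats (assumed learned parameters).
    bias: list of d floats (assumed learned parameters).
    Returns (fused, gates), each a list of d floats.
    """
    concat = local_feat + global_feat  # gate input sees both branches
    fused, gates = [], []
    for j in range(len(local_feat)):
        # One gate value per feature, computed from this sample's features,
        # which is what makes the weighting per-sample and per-feature.
        z = sum(w * x for w, x in zip(weights[j], concat)) + bias[j]
        g = sigmoid(z)
        gates.append(g)
        # g near 1 favors the local branch, g near 0 the global branch.
        fused.append(g * local_feat[j] + (1.0 - g) * global_feat[j])
    return fused, gates


# Toy usage with made-up numbers: zero parameters give an even blend.
local = [1.0, 2.0]
global_ = [3.0, 6.0]
zero_w = [[0.0] * 4 for _ in range(2)]
fused, gates = adaptive_gate(local, global_, zero_w, [0.0, 0.0])
```

With all-zero parameters every gate is 0.5, so the output is the plain average of the two branches; training would push individual gates toward whichever branch is more informative for a given sample and feature.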

Key facts

Hybrid architecture combines SqueezeNet-style CNN and MobileViT-style transformer branches
Adaptive Attention Gate dynamically weights per-sample, per-feature contributions
Test accuracy of 97.60% on Brain Tumor MRI Dataset (Kaggle)
Precision: 97.30%, Recall: 97.50%, F1-score: 97.40%
Macro-average AUC of 0.9946
Model trained and evaluated on the Kaggle Brain Tumor MRI Dataset
Addresses local texture and global dependency extraction in MRI
Published on arXiv with ID 2604.23137
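
For readers who want to see how figures like these are typically derived, the sketch below computes accuracy and macro-averaged precision, recall, and F1 from a confusion matrix. The summary does not say which averaging scheme the authors used for precision, recall, and F1, so macro averaging is an assumption here, and the function names and the toy matrix are hypothetical.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,  # mean of per-class F1 scores
    }


def harmonic_mean(p, r):
    """F1 as the harmonic mean of a single precision/recall pair."""
    return 2 * p * r / (p + r)


# Toy two-class matrix with made-up counts, not data from the study.
toy = [[2, 0],
       [1, 1]]
scores = macro_metrics(toy)

# Consistency check on the reported figures (in percent).
reported_f1 = harmonic_mean(97.30, 97.50)
```

As a quick sanity check, the harmonic mean of the reported precision (97.30%) and recall (97.50%) comes to about 97.40%, which agrees with the reported F1-score.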
