ARTFEED — Contemporary Art Intelligence

MAE Pretraining Boosts nnFormer for Medical Image Segmentation

ai-technology · 2026-04-29

A new self-supervised pretraining framework based on Masked Autoencoders (MAE) enhances the nnFormer transformer architecture for volumetric medical image segmentation. The model is pretrained on unlabeled volumetric medical images to reconstruct randomly masked patches, then fine-tuned on the labeled segmentation task, reducing dependence on large annotated datasets. This addresses overfitting and training instability, two common failure modes of fully supervised pipelines. By leveraging the abundant unlabeled data already produced in clinics, the approach makes segmentation more data-efficient and practical.
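The masking-and-reconstruction objective is simple to sketch. Below is a minimal, hypothetical PyTorch illustration of MAE-style pretraining on 3D patches: the small transformer stands in for nnFormer's encoder, and every name and hyperparameter (MAE3DSketch, the patch size, mask_ratio, layer counts) is an assumption for illustration, not the paper's implementation. Positional embeddings are omitted for brevity.

    import torch
    import torch.nn as nn

    class MAE3DSketch(nn.Module):
        """Mask random 3D patches of a volume and reconstruct them."""

        def __init__(self, patch=16, dim=256, mask_ratio=0.75):
            super().__init__()
            self.patch, self.mask_ratio = patch, mask_ratio
            voxels = patch ** 3  # single-channel volumes assumed
            self.embed = nn.Linear(voxels, dim)
            # Small stand-in transformers; the paper would use nnFormer here.
            self.encoder = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), 4)
            self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
            self.decoder = nn.TransformerEncoder(
                nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), 2)
            self.head = nn.Linear(dim, voxels)

        def patchify(self, vol):
            # (B, 1, D, H, W) -> (B, N, patch^3) non-overlapping cubes
            p = self.patch
            B, _, D, H, W = vol.shape
            x = vol.reshape(B, D // p, p, H // p, p, W // p, p)
            return x.permute(0, 1, 3, 5, 2, 4, 6).reshape(B, -1, p ** 3)

        def forward(self, vol):
            patches = self.patchify(vol)                # (B, N, p^3)
            B, N, V = patches.shape
            n_keep = int(N * (1 - self.mask_ratio))
            # A per-sample random permutation picks the visible patches.
            keep = torch.rand(B, N, device=vol.device).argsort(1)[:, :n_keep]
            visible = patches.gather(1, keep[..., None].expand(-1, -1, V))
            latent = self.encoder(self.embed(visible))  # encode visible only
            # Rebuild the full sequence; masked slots get the mask token.
            tokens = self.mask_token.expand(B, N, -1).clone()
            tokens.scatter_(1, keep[..., None].expand(-1, -1, latent.size(-1)),
                            latent)
            recon = self.head(self.decoder(tokens))     # (B, N, p^3)
            # As in the original MAE, loss is computed on masked patches only.
            masked = torch.ones(B, N, dtype=torch.bool, device=vol.device)
            masked.scatter_(1, keep, False)
            return ((recon - patches) ** 2)[masked].mean()

    # Toy run: a batch of two unlabeled 64^3 scans, no segmentation masks.
    loss = MAE3DSketch()(torch.randn(2, 1, 64, 64, 64))
    loss.backward()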

Key facts

  • nnFormer is a transformer architecture for volumetric medical image segmentation.
  • It captures long-range spatial interactions but requires large labeled datasets.
  • Overfitting and training instability are common when such models are trained fully supervised on limited labeled data.
  • Labeling medical images is time-consuming and expensive.
  • Unlabeled medical images are readily available in clinical practice.
  • The proposed method uses MAE-based self-supervised pretraining.
  • Pretraining involves reconstructing randomly masked patches.
  • The approach improves data efficiency and reduces overfitting; a two-stage training sketch follows this list.
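
In practice the recipe is two-stage: self-supervised pretraining on unlabeled scans, then supervised fine-tuning of the segmentation network initialized from the pretrained encoder. A hypothetical stage-one loop, continuing the MAE3DSketch code above (the data, optimizer settings, and checkpoint path are all illustrative):

    from torch.utils.data import DataLoader, TensorDataset

    # Stand-in for a pool of unlabeled clinical scans.
    unlabeled = TensorDataset(torch.randn(8, 1, 64, 64, 64))
    model = MAE3DSketch()
    opt = torch.optim.AdamW(model.parameters(), lr=1.5e-4)

    for (vol,) in DataLoader(unlabeled, batch_size=2):
        opt.zero_grad()
        model(vol).backward()  # masked-patch reconstruction loss
        opt.step()

    # Stage two (not shown): load these encoder weights into the nnFormer
    # segmentation network and fine-tune on the smaller labeled dataset.
    torch.save(model.encoder.state_dict(), "mae_encoder_pretrained.pt")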
