MetaSR: Adaptive Metadata for Generative Super-Resolution
MetaSR is a Diffusion Transformer (DiT)-based framework for generative super-resolution (SR) in real-world settings, where content and degradation types vary across domains, genres, and segments. Existing metadata-guided SR approaches use a fixed conditioning design, which falls short when the useful cues depend on the content and the transmission budget is limited. MetaSR instead selects and injects task-relevant metadata to guide SR under resource constraints. It uses the DiT's own VAE and transformer backbone to fuse heterogeneous metadata, and an efficient distillation strategy enables one-step diffusion inference. Experiments across diverse content types and degradation regimes show that MetaSR outperforms reference solutions.
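The idea of selecting metadata under a limited transmission budget can be illustrated with a small sketch. The summary does not describe MetaSR's actual selection procedure, so everything here is an assumption made for illustration: the greedy utility-per-bit rule, the `MetadataItem` type, the `select_metadata` function, and the candidate names, costs, and utility scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataItem:
    """Hypothetical metadata candidate; not a type from the MetaSR paper."""
    name: str
    cost_bits: int   # assumed transmission cost, must be positive
    utility: float   # assumed estimate of how much the cue helps SR


def select_metadata(items, budget_bits):
    """Greedy budgeted selection (an illustrative stand-in, not MetaSR's method).

    Candidates are visited in order of decreasing utility per bit, and each
    one is kept if it still fits in the remaining budget.
    """
    chosen = []
    remaining = budget_bits
    ranked = sorted(items, key=lambda m: m.utility / m.cost_bits, reverse=True)
    for item in ranked:
        if item.cost_bits <= remaining:
            chosen.append(item)
            remaining -= item.cost_bits
    return chosen


# Made-up candidates and scores, purely for demonstration.
candidates = [
    MetadataItem("text_caption", cost_bits=400, utility=0.8),
    MetadataItem("edge_map", cost_bits=1200, utility=1.2),
    MetadataItem("degradation_label", cost_bits=16, utility=0.3),
    MetadataItem("color_palette", cost_bits=96, utility=0.24),
]

selected = select_metadata(candidates, budget_bits=600)
selected_names = [m.name for m in selected]
total_bits = sum(m.cost_bits for m in selected)
print(selected_names, total_bits)
```

With a 600-bit budget, this sketch keeps the three inexpensive cues and skips the edge map, which would not fit on its own. A content-adaptive system would presumably recompute the utility scores per domain, genre, or segment, which is what distinguishes it from a fixed conditioning design.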
Key facts
- MetaSR is a Diffusion Transformer (DiT)-based framework for generative super-resolution.
- It addresses content and degradation variations across domains, genres, and segments.
- Existing metadata-guided SR methods use a fixed conditioning design, which is suboptimal.
- MetaSR selects and injects task-relevant metadata to guide SR under resource constraints.
- It uses the DiT's own VAE and transformer backbone to fuse heterogeneous metadata.
- An efficient distillation strategy enables one-step diffusion inference.
- Experiments show MetaSR outperforms reference solutions across diverse content and degradation regimes.
- The paper is published on arXiv with ID 2604.26244.
Entities
Institutions
- arXiv