ViCrop-Det: Training-Free Small-Object Detection via Attention Entropy
ViCrop-Det is a training-free inference framework that uses spatial attention entropy to guide cropping for small-object detection. It targets a limitation of transformer-based detectors: they apply a uniform global receptive field, which degrades features in regions densely packed with microscopic targets. To counter this, ViCrop-Det adaptively shrinks spatial trust regions. It treats the detection decoder's cross-attention distribution as an endogenous probe and evaluates local spatial ambiguity with Spatial Attention Entropy (SAE). This enables dynamic spatial routing, in which a fixed computational budget is allocated exclusively to high-ambiguity zones. The method is inspired by the use of attention entropy in anomaly segmentation. The paper is available on arXiv under ID 2604.26806.
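The entropy-guided routing described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes SAE is the Shannon entropy of the attention weights inside a window (normalized to sum to 1), that windows are non-overlapping squares, and that the "fixed budget" is simply the number of windows selected for cropping. The function names `attention_entropy`, `window_entropies`, and `select_crops` are hypothetical.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (in nats) of non-negative weights, normalized to sum to 1.

    Assumption: this stands in for Spatial Attention Entropy; the paper's
    exact definition may differ.
    """
    total = sum(weights)
    if total <= 0:
        return 0.0  # no attention mass: treat the window as unambiguous
    entropy = 0.0
    for w in weights:
        if w > 0:
            p = w / total
            entropy -= p * math.log(p)
    return entropy


def window_entropies(attn_map, win):
    """Score each non-overlapping win x win window of a 2D attention map."""
    rows, cols = len(attn_map), len(attn_map[0])
    scores = {}
    for r0 in range(0, rows - win + 1, win):
        for c0 in range(0, cols - win + 1, win):
            cells = [
                attn_map[r][c]
                for r in range(r0, r0 + win)
                for c in range(c0, c0 + win)
            ]
            scores[(r0, c0)] = attention_entropy(cells)
    return scores


def select_crops(attn_map, win, budget):
    """Return the top-left corners of the `budget` highest-entropy windows."""
    scores = window_entropies(attn_map, win)
    ranked = sorted(scores, key=lambda corner: scores[corner], reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    # Toy 4x4 attention map: the top-right window is diffuse (ambiguous),
    # the top-left and bottom-right windows are sharply peaked.
    attn = [
        [0.9, 0.0, 0.25, 0.25],
        [0.0, 0.0, 0.25, 0.25],
        [0.5, 0.5, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
    ]
    print(select_crops(attn, win=2, budget=2))  # [(0, 2), (2, 0)]
```

In this toy map, the window with uniformly spread attention scores highest and the sharply peaked windows score zero, so only the diffuse windows would be re-examined at higher resolution. A real pipeline would read the map from the decoder's cross-attention weights and re-run detection on the selected crops, steps this sketch leaves out.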
Key facts
- ViCrop-Det is a training-free inference framework for small-object detection.
- It uses Spatial Attention Entropy (SAE) to evaluate local spatial ambiguity.
- The method adaptively shrinks spatial trust regions to focus on high-ambiguity zones.
- It addresses limitations of transformer-based architectures with uniform receptive fields.
- The approach is inspired by attention entropy used in anomaly segmentation.
- The paper is available on arXiv with ID 2604.26806.
- ViCrop-Det allocates a fixed computational budget exclusively to dense conflict zones.
- The method leverages the detection decoder's cross-attention distribution as an endogenous probe.
Entities
Institutions
- arXiv