ViCrop-Det: Training-Free Small-Object Detection via Attention Entropy

ai-technology · 2026-04-30

ViCrop-Det is a new training-free inference framework for small-object detection that uses spatial attention entropy to decide where to crop. It targets a limitation of transformer-based detectors: they apply a uniform global receptive field across the whole image, which degrades features in regions densely packed with very small targets. Rather than retraining the detector, ViCrop-Det adaptively shrinks spatial trust regions at inference time. It treats the detection decoder's cross-attention distribution as an endogenous probe and scores local spatial ambiguity with Spatial Attention Entropy (SAE). These scores drive dynamic spatial routing, so a fixed computational budget is spent exclusively on high-ambiguity zones. The method is inspired by the use of attention entropy in anomaly segmentation. The paper is published on arXiv with ID 2604.26806.
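The entropy-guided routing idea can be illustrated with a small sketch. This is not the paper's implementation: the summary does not give the SAE formula, so the code below assumes plain Shannon entropy over the attention mass inside each cell of a regular grid, and assumes the budget is a fixed number of cells to crop. The function names local_attention_entropy and select_crops, the grid layout, and the input shape are all hypothetical.

```python
import numpy as np

def local_attention_entropy(attn, grid=(2, 2), eps=1e-12):
    """Shannon entropy (in nats) of the attention mass inside each grid cell.

    attn: (H, W) non-negative map, assumed here to be decoder cross-attention
          already averaged over queries and heads (an assumption, not from the paper).
    Returns an array of shape `grid`; higher values mean more diffuse attention.
    """
    H, W = attn.shape
    gh, gw = grid
    ent = np.zeros(grid)
    for i in range(gh):
        for j in range(gw):
            cell = attn[i * H // gh:(i + 1) * H // gh,
                        j * W // gw:(j + 1) * W // gw].ravel()
            p = cell / (cell.sum() + eps)           # renormalise within the cell
            ent[i, j] = -(p * np.log(p + eps)).sum()
    return ent

def select_crops(ent, budget):
    """Return (row, col) indices of the `budget` highest-entropy cells."""
    order = np.argsort(ent.ravel())[::-1][:budget]
    return [tuple(int(v) for v in np.unravel_index(k, ent.shape)) for k in order]

# Toy map: diffuse attention in the top-left cell, sharply peaked elsewhere.
attn = np.zeros((4, 4))
attn[0:2, 0:2] = 0.25
attn[0, 2] = attn[2, 0] = attn[2, 2] = 1.0

ent = local_attention_entropy(attn, grid=(2, 2))
crops = select_crops(ent, budget=1)   # only the most ambiguous cell is re-examined
```

In this toy case the top-left cell, where attention is spread evenly, receives the highest score and is the only one chosen under a budget of one crop. A real pipeline would then re-run the detector on the selected crops at higher resolution and merge the detections, a step the summary does not describe in detail.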

Key facts

ViCrop-Det is a training-free inference framework for small-object detection.
It uses Spatial Attention Entropy (SAE) to evaluate local spatial ambiguity.
The method adaptively shrinks spatial trust regions to focus on high-ambiguity zones.
It addresses limitations of transformer-based architectures with uniform receptive fields.
The approach is inspired by attention entropy used in anomaly segmentation.
The paper is available on arXiv with ID 2604.26806.
ViCrop-Det allocates a fixed computational budget exclusively to dense conflict zones.
The method leverages the detection decoder's cross-attention distribution as an endogenous probe.
