Few-Shot Precise Event Spotting via Multimodal Distillation in Sports Video
A new arXiv paper (2604.22839) introduces two complementary distillation strategies for few-shot Precise Event Spotting (PES) in fast-paced sports such as tennis. The two methods, Adaptive Weight Distillation (AWD) and Annealed Multimodal Distillation for Few-Shot Event Detection (AMD-FED), use multimodal distillation to improve frame-level localization under limited supervision. AWD adaptively weights teacher predictions on unlabeled data, while AMD-FED transfers skeleton knowledge into visual modalities via annealed pseudo-labeling. On the F3Set-Tennis(sub) dataset under few-shot k-clip settings, both approaches are reported to consistently outperform single-modality baselines and prior PES methods. The work targets three difficulties: motion blur, subtle differences between actions, and scarce annotated data.
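To make the idea of weighting teacher predictions on unlabeled frames concrete, here is a minimal sketch in plain Python. It is not the paper's formulation: the summary does not say how AWD computes its weights, so the rule used here (the teacher's top-class probability) is an assumption, and the names `confidence_weight` and `weighted_distillation_loss` are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def confidence_weight(teacher_probs):
    """Assumed weighting rule: the teacher's top-class probability.

    The actual AWD rule is not described in the summary above.
    """
    return max(teacher_probs)


def weighted_distillation_loss(teacher_frames, student_frames):
    """Weighted mean of per-frame KL(teacher || student).

    Frames where the teacher is more confident contribute more to the loss.
    """
    weights = [confidence_weight(t) for t in teacher_frames]
    losses = [kl_divergence(t, s) for t, s in zip(teacher_frames, student_frames)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Two unlabeled frames, three event classes (illustrative numbers only).
teacher = [[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]]
student = [[0.6, 0.2, 0.2], [0.4, 0.3, 0.3]]
loss = weighted_distillation_loss(teacher, student)
```

In this toy input the student already matches the teacher on the second frame, so only the first, high-confidence frame contributes to the loss.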
Key facts
- Paper arXiv:2604.22839 introduces AWD and AMD-FED for few-shot PES.
- AWD is a prediction-level distillation method using adaptive weighting.
- AMD-FED is a representation-level framework with annealed pseudo-labeling.
- Both methods use multimodal distillation to transfer skeleton knowledge to visual modalities.
- Evaluation uses the F3Set-Tennis(sub) dataset under few-shot k-clip settings.
- Methods outperform single-modality baselines and prior PES approaches.
- Target application is fast-paced sports like tennis with fine-grained events.
- Challenges addressed include motion blur, subtle action differences, and limited data.
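The annealed pseudo-labeling mentioned in the key facts can be illustrated with a small sketch. The summary does not say what quantity AMD-FED anneals, so this example assumes one plausible reading: a confidence threshold on the skeleton teacher's predictions that relaxes over training, so that more frames become pseudo-labels as the student matures. The function names, the cosine schedule, and the 0.95 and 0.70 endpoints are all hypothetical.

```python
import math


def annealed_threshold(step, total_steps, start=0.95, end=0.70):
    """Cosine-anneal a confidence threshold from `start` down to `end`.

    Assumed schedule; the paper's actual annealing rule is not given here.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * progress))


def select_pseudo_labels(teacher_frames, threshold):
    """Return (frame_index, class_index) pairs for frames whose top
    teacher probability meets the threshold."""
    selected = []
    for idx, probs in enumerate(teacher_frames):
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected.append((idx, best))
    return selected


# Skeleton-teacher probabilities for three frames (illustrative numbers only).
skeleton_probs = [[0.97, 0.02, 0.01], [0.8, 0.1, 0.1], [0.5, 0.3, 0.2]]
early = select_pseudo_labels(skeleton_probs, annealed_threshold(0, 100))
late = select_pseudo_labels(skeleton_probs, annealed_threshold(100, 100))
```

With these numbers, only the most confident frame passes the strict early threshold, while the relaxed late threshold also admits the second frame.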
Entities
Institutions
- arXiv