VLA Model for Adaptive Ultrasound-Guided Needle Insertion and Tracking
A Vision-Language-Action (VLA) model has been introduced for automated, adaptive ultrasound-guided needle insertion and tracking within a robotic ultrasound (RUS) system. The framework unifies needle tracking with insertion control, enabling real-time adjustments that adapt dynamically to the needle's position and the surrounding tissue. A Cross-Depth Fusion (CDF) tracking head merges shallow positional features with deep semantic features from a large vision backbone to achieve end-to-end tracking. This design addresses the performance degradation of traditional modular controllers in challenging scenarios. The research is available on arXiv under ID 2604.20347.
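The paper does not spell out the CDF head's internals, but the description (fusing shallow positional features with deep semantic features from a vision backbone) suggests a structure like the minimal PyTorch sketch below. The class name, layer sizes, and the choice of concatenation followed by convolution are all illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CrossDepthFusionHead(nn.Module):
    """Hypothetical sketch of a Cross-Depth Fusion (CDF) tracking head.

    Fuses shallow, spatially precise features with deep semantic features
    from a vision backbone, then regresses a 2D needle-tip location in the
    ultrasound image. All dimensions and layer choices are illustrative.
    """

    def __init__(self, shallow_dim=64, deep_dim=768, fused_dim=256):
        super().__init__()
        # Project both feature streams to a common channel width.
        self.shallow_proj = nn.Conv2d(shallow_dim, fused_dim, kernel_size=1)
        self.deep_proj = nn.Conv2d(deep_dim, fused_dim, kernel_size=1)
        # Fuse the concatenated streams, then pool to a single vector.
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * fused_dim, fused_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.tip_regressor = nn.Linear(fused_dim, 2)

    def forward(self, shallow_feats, deep_feats):
        # Upsample deep features to the shallow resolution so the two
        # streams can be concatenated channel-wise.
        deep = nn.functional.interpolate(
            self.deep_proj(deep_feats),
            size=shallow_feats.shape[-2:],
            mode="bilinear",
            align_corners=False,
        )
        fused = self.fuse(torch.cat([self.shallow_proj(shallow_feats), deep], dim=1))
        return self.tip_regressor(fused.flatten(1))  # (B, 2) needle-tip estimate
```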
Key facts
- VLA model proposed for adaptive US-guided needle insertion and tracking
- Framework unifies needle tracking and insertion control
- Enables real-time adjustments that adapt to needle position and tissue environment (see the control-loop sketch after this list)
- Cross-Depth Fusion (CDF) tracking head integrates shallow and deep features
- Addresses performance degradation of modular controllers
- Published on arXiv with ID 2604.20347
- Robotic ultrasound (RUS) system used
- End-to-end tracking achieved
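Unifying tracking and insertion control amounts to closing a loop between the tracker's per-frame tip estimate and the insertion command. The sketch below shows one minimal form of that loop; `track_tip`, `robot`, `capture_ultrasound_frame`, and `command_insertion` are hypothetical placeholders, and the simple proportional correction stands in for whatever policy the VLA model actually learns.

```python
import numpy as np

def insertion_control_loop(track_tip, robot, target_xy, steps=200, gain=0.5):
    """Hypothetical closed-loop sketch of unified tracking + insertion control.

    `track_tip` maps an ultrasound frame to an (x, y) needle-tip estimate
    (e.g. vision backbone + CDF head); `robot` is a stand-in for the RUS
    hardware interface. Neither interface comes from the paper.
    """
    target = np.asarray(target_xy, dtype=float)
    for _ in range(steps):
        frame = robot.capture_ultrasound_frame()   # current B-mode image
        tip = np.asarray(track_tip(frame), dtype=float)
        error = target - tip
        if np.linalg.norm(error) < 1.0:            # within 1 px of the target
            break
        # Proportional correction: steer the insertion toward the target.
        # A learned VLA policy would replace this simple P-controller.
        robot.command_insertion(velocity=gain * error)
```

Because the tip estimate is recomputed every frame, the command adapts as the needle deflects or the tissue shifts, which is the adaptivity the framework claims over open-loop modular pipelines.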