Phone2Act: Smartphone-Based Teleoperation for Robot Data Collection
A team of researchers has created Phone2Act, an affordable teleoperation system that turns a standard smartphone into a 6-DoF robot controller using Google ARCore. The system is hardware-agnostic: it is built on a modular ROS 2 framework with interchangeable bridge nodes, so it supports platforms ranging from industrial cobots to low-cost bimanual arms without code changes. A Universal Recorder synchronizes multi-camera RGB streams with robot state feedback and exports demonstrations in the LeRobot dataset format, which allows quick fine-tuning of Vision-Language-Action (VLA) models. The researchers validated the framework by fine-tuning GR00T-N1.5 on 130 demonstrations. The goal is to reduce the cost and complexity of collecting diverse manipulation data for VLA training, making such data collection more accessible to research teams.
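To make the idea of a phone-driven 6-DoF controller with swappable robot back ends concrete, here is a minimal sketch in plain Python. It is not the Phone2Act implementation: the names `Pose`, `relative_target`, `RobotBridge`, and `LoggingBridge` are hypothetical, the phone and robot frames are assumed to be axis-aligned, and a real system would exchange these poses over ROS 2 topics rather than direct function calls. The sketch maps the phone's motion since a reference pose onto the robot's end-effector pose, which is one common way to do relative teleoperation.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # unit quaternion as (w, x, y, z)


@dataclass(frozen=True)
class Pose:
    position: Vec3
    orientation: Quat


def quat_mul(a: Quat, b: Quat) -> Quat:
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_conj(q: Quat) -> Quat:
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def relative_target(phone_start: Pose, phone_now: Pose,
                    robot_start: Pose, scale: float = 1.0) -> Pose:
    """Apply the phone's motion since phone_start to robot_start.

    Assumption: phone and robot world frames share the same axes.
    """
    # Translation delta of the phone, optionally scaled.
    delta = tuple(scale * (n - s)
                  for n, s in zip(phone_now.position, phone_start.position))
    position = tuple(r + d for r, d in zip(robot_start.position, delta))
    # Rotation delta expressed in the world frame, then applied to the robot.
    dq = quat_mul(phone_now.orientation, quat_conj(phone_start.orientation))
    orientation = quat_mul(dq, robot_start.orientation)
    return Pose(position, orientation)


class RobotBridge(Protocol):
    """Hypothetical interface that each robot-specific bridge would satisfy."""

    def send_target(self, pose: Pose) -> None: ...


class LoggingBridge:
    """Stand-in bridge that records targets instead of moving hardware."""

    def __init__(self) -> None:
        self.sent: List[Pose] = []

    def send_target(self, pose: Pose) -> None:
        self.sent.append(pose)
```

Because the controller code only depends on the `RobotBridge` interface, swapping `LoggingBridge` for a bridge that talks to a specific arm would not require changes to `relative_target`, which mirrors the interchangeable-bridge design described above.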
Key facts
- Phone2Act transforms a commodity smartphone into a 6-DoF robot controller via Google ARCore.
- The framework is hardware-agnostic and built on a modular ROS 2 architecture.
- It supports platforms from industrial cobots to low-cost bimanual arms without code modification.
- A Universal Recorder synchronizes multi-camera RGB streams with robot state feedback.
- The recorder exports demonstrations natively in the LeRobot dataset format.
- The framework was validated by fine-tuning GR00T-N1.5 on 130 demonstrations.
- It aims to reduce the cost and complexity of collecting manipulation data for VLA model training.
- Published on arXiv under ID 2605.01948.
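The synchronization step performed by the Universal Recorder can be illustrated with a small timestamp-matching sketch. This is an assumption about one reasonable approach (nearest-timestamp matching with a skew limit), not the recorder's actual algorithm; the function names `nearest_index` and `synchronize` and the `max_skew` parameter are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List, Sequence


def nearest_index(timestamps: Sequence[float], t: float) -> int:
    """Index of the entry in a sorted, non-empty list that is closest to t."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer; ties go to the earlier sample.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def synchronize(state_ts: Sequence[float],
                camera_ts: Dict[str, List[float]],
                max_skew: float) -> List[Dict[str, int]]:
    """Pair each robot-state sample with the nearest frame of every camera.

    Returns one dict per kept sample, holding the state index under "state"
    and the matched frame index under each camera name. A sample is dropped
    if any camera has no frame within max_skew seconds of it.
    """
    frames: List[Dict[str, int]] = []
    for state_index, t in enumerate(state_ts):
        match: Dict[str, int] = {}
        for name, stamps in camera_ts.items():
            if not stamps:
                break
            ci = nearest_index(stamps, t)
            if abs(stamps[ci] - t) > max_skew:
                break
            match[name] = ci
        else:
            # Reached only when no camera triggered a break.
            frames.append({"state": state_index, **match})
    return frames
```

Dropping samples that lack a close frame from every camera keeps each recorded step internally consistent, at the cost of discarding some data when a camera stalls.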
Entities
Technologies
- Google ARCore
- LeRobot
- GR00T-N1.5