BiCICLe: Bimanual Robot Manipulation via Multi-Agent In-Context Learning
A team of researchers has introduced BiCICLe (Bimanual Coordinated In-Context Learning), presented as the first framework that lets standard LLMs perform few-shot bimanual manipulation without fine-tuning. The method frames bimanual control as a multi-agent leader-follower problem and decouples the joint action space into sequential, conditioned single-arm predictions. It adds Arms' Debate, an iterative refinement process, and a third LLM that acts as a judge (LLM-as-Judge) to evaluate the actions. Together, these components address the high-dimensional joint action space and the tight inter-arm coordination constraints that otherwise overwhelm standard context windows.
Key facts
- BiCICLe is the first framework for few-shot bimanual manipulation using standard LLMs without fine-tuning.
- It frames bimanual control as a multi-agent leader-follower problem.
- The action space is decoupled into sequential, conditioned single-arm predictions.
- Arms' Debate is an iterative refinement process.
- A third LLM-as-Judge evaluates actions.
- Standard context windows are overwhelmed by the high-dimensional joint action space and the tight inter-arm coordination constraints of bimanual control.
- The approach uses off-the-shelf, text-only LLMs.
- In-Context Learning preserves generalization capabilities without task-specific training.
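The control flow listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `predict_bimanual_action`, the prompt wording, the number of debate rounds, and the rule that the judge returns the index of the best candidate are all hypothetical. Each LLM is modeled as a plain text-in, text-out callable, so any off-the-shelf, text-only model could be plugged in.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any text-only LLM call (prompt in, completion out).
LLM = Callable[[str], str]


def predict_bimanual_action(
    leader: LLM,
    follower: LLM,
    judge: LLM,
    demos: str,
    observation: str,
    debate_rounds: int = 2,
) -> Tuple[str, str]:
    """Return a (leader_action, follower_action) pair as text.

    Assumed flow: the leader arm predicts first, the follower arm predicts
    conditioned on the leader, both revise over a few debate rounds, and a
    judge picks one of the candidate pairs.
    """
    context = f"{demos}\nObservation: {observation}\n"

    # Sequential, conditioned single-arm predictions: leader first ...
    leader_action = leader(context + "Leader action:")
    # ... then the follower, which sees the leader's choice.
    follower_action = follower(
        context + f"Leader action: {leader_action}\nFollower action:"
    )
    candidates: List[Tuple[str, str]] = [(leader_action, follower_action)]

    # Iterative refinement (assumed form of Arms' Debate): each arm revises
    # its action after seeing the other arm's latest proposal.
    for _ in range(debate_rounds):
        leader_action = leader(
            context
            + f"Follower proposed: {follower_action}\nRevised leader action:"
        )
        follower_action = follower(
            context
            + f"Leader action: {leader_action}\nRevised follower action:"
        )
        candidates.append((leader_action, follower_action))

    # LLM-as-Judge (assumed protocol): reply with the index of the best pair.
    listing = "\n".join(
        f"{i}: leader={lead} | follower={foll}"
        for i, (lead, foll) in enumerate(candidates)
    )
    verdict = judge(context + f"Candidates:\n{listing}\nBest index:")

    # Fall back to the most refined pair if the verdict is unusable.
    try:
        index = int(verdict.strip())
    except ValueError:
        index = len(candidates) - 1
    if not 0 <= index < len(candidates):
        index = len(candidates) - 1
    return candidates[index]
```

Because each arm only ever predicts its own action, every prompt stays in the single-arm action space, which is the point of the leader-follower decoupling. The fallback to the last candidate is a design choice of this sketch, made so that a malformed judge reply never blocks execution.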