Hera: Step-Level Device-Cloud Coordination for LLM Agents
Researchers propose Hera, a coordinator for device-cloud LLM agents that works at the level of individual steps within a long-horizon task rather than routing a whole task at once. It is trained in two stages: imitation learning for a cold start, then reinforcement learning that optimizes both task success and cloud usage. The authors report a strong performance-cost Pareto frontier, in contrast to existing systems, which are limited by coarse task-level routing.
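To make the contrast between task-level and step-level routing concrete, here is a minimal Python sketch. It is not Hera's implementation: the function names, the per-step difficulty scores, the threshold, and the linear reward with `cost_weight` are all assumptions for illustration, since this summary does not state how the coordinator or its reward is defined.

```python
from typing import List


def route_task_level(num_steps: int, send_to_cloud: bool) -> List[bool]:
    """Coarse routing: one decision is applied to every step of the task."""
    return [send_to_cloud] * num_steps


def route_step_level(step_difficulty: List[float], threshold: float = 0.5) -> List[bool]:
    """Step-level routing: a step goes to the cloud only if it looks hard.

    The difficulty scores and threshold are hypothetical stand-ins for a
    learned coordinator policy.
    """
    return [d > threshold for d in step_difficulty]


def episode_reward(use_cloud: List[bool], success: bool, cost_weight: float = 0.5) -> float:
    """Assumed reward: 1 for task success, minus a penalty proportional to
    the fraction of steps that used the cloud."""
    if not use_cloud:
        return 0.0
    cloud_fraction = sum(use_cloud) / len(use_cloud)
    return (1.0 if success else 0.0) - cost_weight * cloud_fraction


# Illustrative difficulty scores for a four-step task (made-up numbers).
difficulty = [0.1, 0.9, 0.2, 0.8]
task_level = route_task_level(len(difficulty), send_to_cloud=True)
step_level = route_step_level(difficulty)

# Assuming both strategies complete the task, step-level routing earns a
# higher reward because it uses the cloud for only half of the steps.
print(episode_reward(task_level, success=True))
print(episode_reward(step_level, success=True))
```

Under this assumed reward, a policy that reaches the same success with fewer cloud steps scores higher, which is the kind of trade-off a reinforcement learning stage could be set up to optimize.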
Key facts
- Hera is a step-level device-cloud LLM agent coordinator.
- It targets long-horizon tasks.
- It uses two-stage training: imitation learning, then reinforcement learning.
- The reinforcement learning stage optimizes both task success and cloud usage.
- It achieves a strong performance-cost Pareto frontier.
- It addresses the limitations of coarse task-level routing in existing systems.
- The paper is published on arXiv with ID 2605.24598.
- It focuses on the device-cloud dilemma for LLM agents.
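For readers unfamiliar with the term, a performance-cost Pareto frontier is the set of operating points for which no other point is both cheaper and more successful. The Python sketch below computes one from (cloud cost, success rate) pairs. The function name and the sample numbers are illustrative assumptions, not results from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (cloud_cost, success_rate)


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the points not dominated by a cheaper, at-least-as-good point."""
    frontier: List[Point] = []
    best_success = float("-inf")
    # Sort by cost ascending; break ties by success descending so the best
    # point at a given cost is seen first.
    for cost, success in sorted(points, key=lambda p: (p[0], -p[1])):
        # Keep a point only if it beats every cheaper point on success.
        if success > best_success:
            frontier.append((cost, success))
            best_success = success
    return frontier


# Made-up operating points: (fraction of steps sent to cloud, success rate).
candidates = [(0.0, 0.4), (0.3, 0.7), (0.5, 0.6), (1.0, 0.8)]
print(pareto_frontier(candidates))
```

Here the point (0.5, 0.6) is dropped because (0.3, 0.7) costs less and succeeds more often.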
Entities
Institutions
- arXiv