DDA-Thinker: Dual-Atomic Reinforcement Learning for Reasoning-Driven Image Editing
DDA-Thinker is a proposed framework that decouples planning from generation in reasoning-driven image editing. A Thinker module, optimized with dual-atomic reinforcement learning, produces an executable plan, and a separate Editor (the generative model) produces the image. Two rewards guide training: a cognitive-atomic reward that assesses plan quality and a visual-atomic reward that assesses the fidelity of the final image. The aim is to improve reasoning-grounded planning in complex editing tasks.
Key facts
- DDA-Thinker is a Thinker-centric framework for reasoning-driven image editing.
- It decouples the planning module (Thinker) from the generative model (Editor).
- Dual-atomic reinforcement learning uses cognitive-atomic and visual-atomic rewards.
- Cognitive-atomic reward assesses the quality of the executable plan.
- Visual-atomic reward assesses the final image quality.
- The framework is designed for controlled analysis of the planning module.
- The approach targets tasks requiring complex reasoning.
- The paper is available on arXiv with ID 2604.25477.
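The dual-reward idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: that each reward is the fraction of binary "atomic" checks that pass, that the two rewards are combined by a weighted sum, and the names atomic_score, DualAtomicReward, and cognitive_weight.

```python
from dataclasses import dataclass
from typing import Sequence


def atomic_score(checks: Sequence[bool]) -> float:
    """Fraction of atomic checks that passed (0.0 if there are none).

    Assumption: a reward is an average over binary checks. The source only
    says the rewards assess plan quality and final image quality.
    """
    return sum(checks) / len(checks) if checks else 0.0


@dataclass
class DualAtomicReward:
    # Hypothetical weight: the source does not say how the rewards are combined.
    cognitive_weight: float = 0.5

    def __call__(self, plan_checks: Sequence[bool],
                 image_checks: Sequence[bool]) -> float:
        cognitive = atomic_score(plan_checks)   # quality of the Thinker's plan
        visual = atomic_score(image_checks)     # quality of the Editor's image
        w = self.cognitive_weight
        return w * cognitive + (1.0 - w) * visual


reward = DualAtomicReward(cognitive_weight=0.5)
# Three of four plan checks pass; one of two image checks passes.
total = reward(plan_checks=[True, True, False, True],
               image_checks=[True, False])
```

In this sketch the scalar total would serve as the reinforcement learning signal for the Thinker alone, which matches the summary's description of the Thinker as the optimized module; how the actual system scores plans and images is not specified in this summary.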
Entities
Institutions
- arXiv