CRiSP: Reinforcement Learning for Quantum State Preparation
Researchers propose CRiSP (Clifford Reinforcement Learning agent for State Preparation), a framework that uses reinforcement learning to improve initialization in Variational Quantum Algorithms (VQAs). VQA optimization is hampered by barren plateaus and local minima, so the quality of the starting point matters. CRiSP treats the choice of a discrete Clifford gate prefix as a sequential decision-making problem and solves it with Neural-Guided Monte Carlo Tree Search, using a Transformer-based policy trained via self-play. The learned Clifford gates are inserted before the circuit's fixed parameterized rotations. Because Clifford circuits admit polynomial-time classical stabilizer simulation, the search for a high-quality initial state runs classically and leaves the circuit architecture unchanged. The method aims to scale better than heuristic-based approaches in vast combinatorial search spaces.
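To illustrate why a Clifford prefix can be scored classically, the following is a minimal sketch of stabilizer-style evaluation, not CRiSP's actual implementation. It assumes a gate set of H, S and CNOT, a Hamiltonian given as weighted Pauli strings, and an all-zeros input state; the names `pauli_expectation` and `energy` are hypothetical. It scores the prefix alone and ignores the parameterized rotations that follow it, since the summary does not say how CRiSP handles them.

```python
from typing import List, Sequence, Tuple

# A gate is ("H", q), ("S", q) or ("CNOT", control, target).
Gate = Tuple


def _conjugate(gate: Gate, x: List[int], z: List[int], r: int) -> int:
    """Map the Pauli (x, z, sign bit r) to G P G^dagger for one gate G.

    Bits (x, z) = (1, 0), (0, 1), (1, 1) encode X, Z, Y on a qubit.
    The lists are updated in place; the new sign bit is returned.
    """
    name = gate[0]
    if name == "H":
        q = gate[1]
        r ^= x[q] & z[q]          # Y picks up a minus sign
        x[q], z[q] = z[q], x[q]   # X <-> Z
    elif name == "S":
        q = gate[1]
        r ^= x[q] & z[q]          # Y -> -X
        z[q] ^= x[q]              # X -> Y
    elif name == "CNOT":
        a, b = gate[1], gate[2]
        r ^= x[a] & z[b] & (x[b] ^ z[a] ^ 1)
        x[b] ^= x[a]              # X on control spreads to target
        z[a] ^= z[b]              # Z on target spreads to control
    else:
        raise ValueError(f"unsupported gate: {name}")
    return r


def pauli_expectation(prefix: Sequence[Gate], pauli: str) -> int:
    """Return <0..0| U^dagger P U |0..0> for Clifford circuit U (+1, -1 or 0)."""
    x = [1 if p in "XY" else 0 for p in pauli]
    z = [1 if p in "ZY" else 0 for p in pauli]
    r = 0
    # Pull P back through the circuit: conjugate by each inverse gate,
    # last gate first. H and CNOT are self-inverse; S^dagger = S^3.
    for gate in reversed(prefix):
        repeats = 3 if gate[0] == "S" else 1
        for _ in range(repeats):
            r = _conjugate(gate, x, z, r)
    if any(x):                    # any X or Y left: expectation on |0..0> is 0
        return 0
    return -1 if r else 1         # only I/Z left: expectation is the sign


def energy(prefix: Sequence[Gate], hamiltonian: Sequence[Tuple[float, str]]) -> float:
    """Energy of a Hamiltonian given as (coefficient, Pauli string) terms."""
    return sum(c * pauli_expectation(prefix, p) for c, p in hamiltonian)


if __name__ == "__main__":
    toy_hamiltonian = [(-1.0, "XX"), (-1.0, "ZZ")]
    candidates = {
        "empty": [],
        "bell": [("H", 0), ("CNOT", 0, 1)],
    }
    for label, prefix in candidates.items():
        print(label, energy(prefix, toy_hamiltonian))
```

Each Pauli term costs time linear in the number of gates, which is what makes it feasible to score many candidate prefixes during a search. A search procedure would then prefer prefixes with lower energy under the chosen objective.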
Key facts
- CRiSP stands for Clifford Reinforcement Learning agent for State Preparation
- Uses Neural-Guided Monte Carlo Tree Search
- Transformer-based policy trained via self-play
- Inserts learned Clifford gates before fixed parameterized rotations
- Relies on polynomial-time classical stabilizer simulation to find high-quality initial states
- Addresses barren plateaus and local minima in VQAs
- Does not alter underlying circuit architecture
- Published on arXiv with ID 2605.23138
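The neural-guided tree search listed above can be pictured with a small selection-step sketch. The summary does not state which selection rule CRiSP uses, so this assumes an AlphaZero-style PUCT score (one common variant), where a policy network's prior steers exploration toward promising gates. The `Node` class, the gate labels, and the constant `c_puct` are all hypothetical.

```python
import math
from typing import Dict


class Node:
    """One search-tree node; `prior` is the policy's probability for this action."""

    def __init__(self, prior: float) -> None:
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0
        self.children: Dict[str, "Node"] = {}

    def q(self) -> float:
        # Mean value of simulations through this node (0 if unvisited).
        return self.value_sum / self.visits if self.visits else 0.0


def select_action(node: Node, c_puct: float = 1.5) -> str:
    """Pick the child maximizing Q + U, the assumed PUCT score."""
    total = sum(child.visits for child in node.children.values())

    def score(item) -> float:
        _, child = item
        # Exploration bonus: large for high-prior, rarely visited children.
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.q() + u

    return max(node.children.items(), key=score)[0]


if __name__ == "__main__":
    root = Node(prior=1.0)
    # Hypothetical priors over candidate Clifford gates for the next slot.
    root.children = {"H0": Node(0.7), "S0": Node(0.2), "CNOT01": Node(0.1)}
    print(select_action(root))
```

With no visits recorded, the choice follows the prior alone; as simulations accumulate, a heavily visited child with poor value loses out to less explored alternatives.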
Entities
Institutions
- arXiv