Abstract
We propose a reinforcement learning (RL) scheme for feedback quantum control within the quantum approximate optimization algorithm (QAOA). We reformulate the QAOA variational minimization as a learning task in which an RL agent chooses the control parameters for the unitaries, given partial information about the system. This RL scheme finds a policy that converges to the optimal adiabatic solution for the quantum Ising chain and that can be successfully transferred between systems of different sizes, even in the presence of disorder. This allows for immediate experimental verification of our proposal on more complicated models: the RL agent is trained on a small control system, simulated on classical hardware, and then tested on a larger physical sample.
- Received 29 April 2020
- Accepted 28 August 2020
DOI: https://doi.org/10.1103/PhysRevResearch.2.033446
Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.