Using a Recurrent Neural Network to Reconstruct Quantum Dynamics of a Superconducting Qubit from Physical Observations


Quantum mechanics breaks dramatically with classical intuition, contradicting determinism and introducing many highly counterintuitive concepts, such as contextuality, non-classical correlations and the uncertainty principle. Despite its abstract mathematical framework, quantum mechanics can be formulated operationally as an extended information theory [1], where the physical system is treated as a black box in which preparation and measurement combine to give the probabilities of experimental outcomes. The physical parameters are then estimated by averaging measurement outcomes on a large ensemble.
The time evolution of the state of an isolated quantum mechanical system is governed by the Schrödinger equation. However, realistic systems cannot be isolated perfectly, and the coupling to an environment brings about qualitatively different behavior that cannot be accounted for by the Schrödinger equation alone. If the system is monitored continuously, its dynamics are perturbed by the inevitable back-action induced by measurement. Although the system's evolution under measurement is stochastic, the measurement record faithfully reports the perturbation of the system with respect to the unperturbed coherent evolution. Consequently, the observer's knowledge of the wave function can be updated using quantum filtering, the extraction of quantum information from a noisy signal. The stochastic time evolution of the wave function is the so-called quantum trajectory. Under certain approximations, this task can be performed by integrating the stochastic quantum master equation, provided that the Hamiltonian, dissipation and measurement operators are precisely calibrated [2][3][4][5].
On the other hand, recurrent neural networks (RNNs) are a powerful class of machine learning tools able to extract hidden correlations from large datasets [6]. They are most commonly applied to time-binned data, and as such achieve excellent performance on difficult problems such as language translation [7] and speech recognition [8]. RNN training is driven by examples and performed without specifying dictionaries or linguistic rules. Interestingly, quantum filtering [9] can be seen as a similar task in which noisy experimental signals must be translated into meaningful quantum information. In the past year, various neural network architectures have been used in the realm of quantum physics to predict the theoretical quantum behavior of strongly correlated phases of matter [10][11][12][13][14], to design efficient quantum error correction codes [15], to decode large topological error correcting codes [16][17][18] and to optimize dynamical decoupling schemes for quantum memories [19].
In this Letter, we show that neural networks can be trained to predict stochastic quantum evolution from raw observations without specifying quantum mechanics a priori. We demonstrate that the RNN reproduces the stochastic quantum evolution of a continuously monitored superconducting qubit under a Rabi Hamiltonian. Rather than providing a black-box model, we use the neural network to robustly extract all physical parameters required for quantum filtering. Moreover, while RNNs are temporally oriented, they are routinely trained in both the forward and backward time ordering, so that the network may exploit both past and future information. In the present application, the use of past and future continuous measurement outcomes improves the estimation accuracy of quantum trajectories at a given time through a process called quantum smoothing [20,21]. We train a bidirectional RNN to perform forward-backward analysis of trajectories, enabling quantum smoothing of predictions and faithful tomography of an unknown initial state. By treating preparation and measurement on the same footing, the RNN structure highlights the time symmetry underlying the stochastic quantum evolution.

Figure 1. a. The qubit is simultaneously driven on resonance at a Rabi rate ΩR and dispersively monitored with a strength γ near the cavity resonance frequency. b. Data collected from the experimental system, comprising preparation, measurement outcomes and the continuous measurement record of the qubit, are streamed directly to an RNN, which provides a prediction of the measurement outcome. The weights of the RNN are updated at each iteration through stochastic gradient descent. c. The stochastic gradient descent aims at minimizing the cross-entropy loss function LW, which evaluates the distance between the prediction and the measurement outcome.

EXPERIMENTAL SYSTEM
Our experiment consists of a superconducting transmon qubit [22] dispersively coupled to a superconducting waveguide cavity [23]. In the interaction picture and rotating wave approximation, our system is described by the Hamiltonian H = Hint + HR, where Hint is the dispersive interaction between the qubit and the cavity mode and HR is the resonant Rabi drive. We use a near-quantum-limited parametric amplifier [24] to amplify the quadrature of the reflected signal which is proportional to the qubit state-dependent phase shift.
After further amplification, we digitize the signal in 40 ns time steps, yielding a measurement record Vt. We begin each run of the experiment by heralding the ground state of the qubit using the above readout technique. We then prepare the qubit along one of the 6 cardinal points of the Bloch sphere by applying a preparation pulse. Next, a measurement tone at the cavity frequency of 6.666 GHz continuously probes the cavity for a variable time T between 0 and 4 µs, which weakly measures the qubit in the σZ basis. Concurrently, we apply the Rabi Hamiltonian HR. Finally, we apply pulses to perform qubit rotations and a projective measurement, yielding a single shot measurement of a desired qubit operator σX, σY or σZ.

QUANTUM TRAJECTORIES
To allow the neural network to operate as generally as possible, we formulate system inputs and outputs symmetrically, and avoid passing it objects such as a wave function that encode information about the structure of quantum theory. The role of the wave function in quantum mechanics is to provide the probability of a measurement outcome yt given the preparation and evolution of the system at earlier times, P(yt|y0). In the case of a continuously monitored quantum bit, the preparation and measurement outcome are each a binary variable y0, yt ∈ {0, 1} extracted through a projective readout performed at the initial and final times respectively; the preparation and measurement configurations, labeled a and b, encode microwave pulses performing qubit rotations for state preparation and tomography respectively in the X, Y and Z basis. The stochastic measurement record {Vt} is collected with a high quantum efficiency parametric amplifier during the qubit evolution. Quantum trajectory theory describes how an observer's state of knowledge evolves given a measurement record [25]. Therefore, quantum trajectories are specified by P(yt|y0, a, b, V0...Vt), the probability of measuring the outcome yt with the measurement parameter b given the initial measurement y0 in the preparation parameter a and the stochastic measurement outcomes up to a time t. Tracking this quantum evolution can be understood as a translation of the measurement records into a quantum state evolution. Fig. 2a shows the distribution of measurement records obtained for the preparation setting (y0 = 0, a = Z).
Quantum trajectories are typically extracted from continuous measurement by integrating the stochastic master equation (SME) governing the evolution of the density matrix ρt,

dρt = −(i/ℏ)[H, ρt] dt + γ L[σZ] ρt dt + √(ηγ) H[σZ] ρt dwt,  (3)

where L is the Lindblad superoperator describing the qubit dephasing induced by the measurement of strength γ, H is a measurement superoperator describing the back-action of the measurement on the quantum state for a quantum efficiency η, and dwt is a Gaussian distributed variable with variance dt, extracted from the measurement record after appropriate normalization. The probability distribution for the projective outcome is then given by the Born rule PX,Y,Z(t) = P(yt|y0, a, b = X, Y, Z, V0...Vt) = (Tr[ρt σX,Y,Z] + 1)/2. The integrated stochastic master equation provides faithful predictions when the experimental parameters are precisely known from independent calibration, under the assumption that the cavity decay rate is much larger than the qubit measurement rate, κ ≫ γ. Fig. 2a shows two representative trajectories extracted from the measurement records based on the stochastic master equation.

RECURRENT NEURAL NETWORK
Based solely on a large set of labeled examples (yt, y0, a, b, {Vτ}) directly extracted from the experimental system, we now demonstrate that the network can be trained to predict the probability P(yt|y0, a, b, V0...Vt) of observing the measurement outcome yt ∈ {0, 1} given the history of the quantum evolution accessible to the observer, in other words the best knowledge of the qubit wave function.
We use a Long Short-Term Memory recurrent neural network (LSTM-RNN) [26], schematically depicted in Fig. 1b. These typically consist of a layer of n virtual neuron-like nodes recurrently updated in time. The state of the neuron layer at a time t is encoded in an n-dimensional vector ht. It is computed from a weighted linear combination of the neuron layer state at the previous time t − 1 combined with the measurement record at time t, passed through a non-linear activation function φ, such that ht = φ(Wh ht−1 + WV Vt + B), where Wh,V and B are respectively the weights of the connections between the neurons and the biases, which are determined during the training stage. The probability Pb(yt) of getting the outcome yt given the measurement setting b is computed at each time step as a linear combination of the neuron layer state passed through the activation function, P(yt|y0, a, b, V0...Vt) = φ(Wb ht + Bb). The preparation setting a and the initial qubit state (input bit y0) are specified in the initial state of the neuron layer. The neural network is trained to minimize a loss function L by strengthening or weakening connections between neuron layers encoded in the weight matrices Wh,V,b, as shown in Fig. 1b. The cross-entropy loss function Lb = −[yT log P(yT|y0, a, b, V0...VT) + (1 − yT) log(1 − P(yT|y0, a, b, V0...VT))], averaged over the training examples, is minimized when the prediction P(yt|y0, a, b, V0...Vt) and the distribution of experimental outcomes yT for a given measurement setting b match. Crucially, the function implemented by the neural network is differentiable, and therefore the weight matrices can be updated at each iteration of the training by differentiating the loss function and applying a gradient-descent minimization step: W ← W − ξ ∂Lb/∂W, where ξ is the learning rate. The training process ends once the weight matrices W have converged toward a minimum of the loss function. The effectiveness of neural networks lies in their ability to converge toward a minimum of a very high dimensional non-linear loss landscape through gradient back-propagation, as illustrated in Fig. 1c.
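The recurrent update, readout and loss described above can be sketched in a few lines. The toy below uses a plain recurrent cell (tanh state update, sigmoid readout) with made-up dimensions and random weights, rather than the trained 64-unit LSTM of the experiment:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8                                   # toy layer size (the paper uses 64 LSTM units)
W_h = rng.normal(0, 0.1, (n, n))        # recurrent weights between neurons
W_V = rng.normal(0, 0.1, n)             # input weights for the record V_t
B = np.zeros(n)                         # biases
W_out = rng.normal(0, 0.1, n)           # readout weights for a measurement setting b
B_out = 0.0

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def step(h_prev, v_t):
    """h_t = phi(W_h h_{t-1} + W_V V_t + B): the recurrent update from the text."""
    return np.tanh(W_h @ h_prev + W_V * v_t + B)

def predict(record, h0):
    """Return the prediction P(y_t) at every time step for one measurement record."""
    h, probs = h0, []
    for v in record:
        h = step(h, v)
        probs.append(sigmoid(W_out @ h + B_out))   # readout through the activation
    return np.array(probs)

def cross_entropy(p, y):
    """L = -[y log p + (1 - y) log(1 - p)], minimized when p matches the outcomes."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

h0 = np.zeros(n)                         # would encode (y0, a) in the real network
record = rng.normal(0, 1, 25)            # fake 1 us record at 40 ns sampling
p = predict(record, h0)
loss = cross_entropy(p[-1], 1.0)         # compare final prediction with outcome y_T = 1
```

Training then consists of differentiating `loss` with respect to the weights and applying the gradient-descent update W ← W − ξ ∂L/∂W.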

TRAINING
The Long Short-Term Memory recurrent neural network comprises 64 neurons with a rectified linear unit activation function. This specific RNN architecture evades the exploding/vanishing gradient problem of standard RNN architectures, improving the learning of long-term dependencies [27]. The neural network is implemented with the TensorFlow library [28] developed by Google and optimized for a Graphics Processing Unit (Nvidia Tesla K80 GPU), which speeds up the training. The data are fed to the network in batches, each containing 1024 measurement records, on which a step of the gradient descent is performed using the ADAM optimizer [29]. The measurement records are split into two data sets: 1.5 × 10^6 traces are used for the training and 5 × 10^5 randomly chosen traces are used for the evaluation and displayed in the manuscript. The training data can be re-injected several times into the network in order to improve the model accuracy; each of these training cycles corresponds to a training epoch, and in practice up to 10 training epochs have been performed. At each training epoch, the learning rate is lowered, from 1 × 10^−3 down to 1 × 10^−6. In order to improve the training robustness, 30% of the neurons are dropped out randomly during the first epoch. The fraction of dropped-out neurons is gradually lowered to 0 with each subsequent training epoch. This method prevents the network from over-fitting and helps the generalization abilities of the model [30]. Note that the training quality does not strongly depend on the details of these parameters. A key feature of the training is that it can be performed in real time directly from raw data collected from the experimental system: the training cycle is 0.8 ms per trace, which is on par with the experimental repetition time. Therefore, the 2 × 10^6 traces are produced and fed to the RNN in 20 min. 6 preparation settings (y0 ∈ {0, 1}, a ∈ {X, Y, Z}) and 6 measurement settings (yT ∈ {0, 1}, b ∈ {X, Y, Z}) are used. In practice, we perform the
preparation and measurement with the following rotations of the qubit, RY^±π/2, RX^±π/2, RX^0 and RX^π, which correspond to the cardinal points of the Bloch sphere. The associated preparation labels (y0, a) and measurement labels (yT, b) are then given respectively by (y, X), (ȳ, X), (y, Y), (ȳ, Y), (y, Z) and (ȳ, Z) with ȳ = 1 − y. The total evolution time is varied over 20 values within 4 µs (T ∈ [0, 4] µs), and the measurement record {Vt} is acquired during the qubit evolution with a sampling time of 40 ns. Once the training is achieved, the RNN returns the prediction P(yt|y0, a, b, V0...Vt), corresponding to the probability of measuring the qubit at a time t along the measurement axis b = X, Y or Z.
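The training schedule described above (batches of 1024 records, up to 10 epochs, learning rate annealed from 10^−3 to 10^−6, dropout annealed from 30% to 0) can be sketched as follows. The model and its loss here are dummy stand-ins chosen only to make the schedule runnable; the actual implementation is an LSTM trained with the ADAM optimizer in TensorFlow:

```python
import numpy as np

rng = np.random.default_rng(2)
n_epochs, batch_size = 10, 1024
learning_rates = np.geomspace(1e-3, 1e-6, n_epochs)   # annealed per epoch
dropout = np.linspace(0.3, 0.0, n_epochs)             # fraction of dropped neurons

W = rng.normal(0, 1, 64)                    # dummy weight vector (stand-in model)
data = rng.normal(0, 1, (batch_size * 4, 64))

losses = []
for epoch in range(n_epochs):
    xi = learning_rates[epoch]
    keep = rng.random(64) > dropout[epoch]            # random neuron dropout mask
    for start in range(0, len(data), batch_size):     # one gradient step per batch
        batch = data[start:start + batch_size]
        grad = 2 * (W - batch.mean(axis=0))           # gradient of a dummy quadratic loss
        W = W - xi * keep * grad                      # W <- W - xi dL/dW on kept neurons
    losses.append(float(((W - data.mean(axis=0)) ** 2).sum()))
```

In the real training, the loss is the cross-entropy between predictions and projective outcomes, and each batch contains 1024 measurement records drawn from the 1.5 × 10^6 training traces.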

VALIDATION
Once the RNN is trained, the predictions of the measurement outcomes form an ensemble of trajectories for each of the measurement settings, as shown in Fig. 2b. The predictions of the neural network are in good agreement with the representative trajectories integrated from the stochastic master equation. In this section, we demonstrate that the remaining discrepancies between the predictions are in favor of the neural network. The accuracy of the training can be evaluated self-consistently on the evaluation dataset not used during the training. This method has been previously used to benchmark the prediction of the stochastic master equation [2][3][4][5]. We select the subset of the trajectories leading to the same prediction p within a small δ, Sp = {yT such that P(yT|y0, a, b, V0...VT) ∈ [p − δ, p + δ]}; a faithful prediction requires that the average projective outcome over this subset matches the prediction, ⟨yT⟩Sp = p, and the relative error is obtained by averaging the deviation |⟨yT⟩Sp − p| over all bins p, weighted by their occupancy, where N is the total number of trajectories. As shown in Fig. 2d, the RNN prediction gives a relative error lower than 10^−2 for all measurement axes. As a comparison, using the same evaluation data set, the prediction of the stochastic master equation based on the independently calibrated experimental parameters gives a higher relative error along the Y and Z axes. Such a discrepancy can be attributed to small calibration errors and experimental drifts. This self-consistent evaluation demonstrates the predictive power of the trained RNN and its robustness against calibration errors of physical parameters.
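The self-consistency check above has a compact numerical form. The sketch below uses synthetic prediction/outcome pairs (with outcomes drawn from the predicted probabilities) in place of the RNN evaluation set, and verifies that each bin Sp reproduces its prediction p:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 500_000                                  # size of the evaluation set in the text
predictions = rng.random(N)                  # stand-ins for P(y_T | y_0, a, b, V_0..V_T)
outcomes = (rng.random(N) < predictions).astype(int)   # projective outcomes y_T

delta = 0.01

def subset_error(p):
    """|<y_T>_{S_p} - p| for the bin S_p of trajectories predicting p within delta."""
    mask = np.abs(predictions - p) < delta
    return abs(outcomes[mask].mean() - p)

# Deviation between predicted probability and empirical outcome frequency per bin
errors = [subset_error(p) for p in np.arange(0.05, 1.0, 0.05)]
```

For a well-calibrated predictor, every bin error approaches zero as the evaluation set grows; a systematic offset in any bin would reveal a miscalibrated model, which is how the SME predictions are benchmarked against the RNN.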

BIDIRECTIONAL RNN
RNNs are inherently time-oriented; the prediction at a time t, P(yt|y0, a, b, V0...Vt), only depends on the measurement record at earlier times. A common way to improve the predictive power of an RNN, for translation applications in particular, is to combine the predictions of two RNNs trained respectively forward and backward in time, exploiting the same data in both directions [6]. The forward prediction provides the trajectory given the past measurement record (V0 → Vt) and the preparation settings (y0, a): P⇒(yt) = P(yt|y0, a, b, V0...Vt), while the backward prediction provides the trajectory given the "future" measurement record (VT → Vt) played backward and the measurement settings (yT, b): P⇐(yt) = P(yt|yT, b, VT...Vt). As shown in Fig. 3a, the RNN provides an ensemble of backward trajectories. The accuracy of the backward predictions is evaluated using the same validation method as for the forward predictions: the subset of backward trajectories Sp giving the same prediction p must agree on average with the preparation measurement such that ⟨y0⟩Sp = p. The accuracy of the backward prediction is shown in Fig. 3b, where the relative errors for the preparation settings X, Y and Z are ε⇐X = 1.1 × 10^−2, ε⇐Y = 0.9 × 10^−2 and ε⇐Z = 0.7 × 10^−2; the overall accuracy is comparable to the forward prediction. Remarkably, the backward and forward predictions do not necessarily agree at a given t; indeed, these predictions are based on distinct parts of the measurement records. They provide complementary information from the past and future evolution of the system. These predictions can therefore be combined to enhance the knowledge of the quantum state based on the full measurement record. Backward-forward analysis is a well-established post-processing method for recurrent neural networks [6] as well as hidden Markov models [31]. Time-reversal symmetry underlies quantum evolution and exchanges the roles of state preparation and state measurement [32]. In a sense, backward-forward analysis naturally translates into the quantum regime as the prediction and retrodiction of quantum trajectories [33][34][35]. Quantum prediction and retrodiction can be combined using quantum smoothing techniques [20,21], enabling an enhancement of physical parameter estimation [36,37]. The forward and backward predictions can be combined into a smoothed prediction by

P(yt) = P⇒(yt)P⇐(yt) / [P⇒(yt)P⇐(yt) + (1 − P⇒(yt))(1 − P⇐(yt))].  (5)

As depicted in Fig. 3c, the smoothed trajectories combine the backward and forward information such that the least informative predictions (P⇐(yt), P⇒(yt) ∼ 0.5) are dismissed and the most informative ones (P⇐(yt), P⇒(yt) ∼ 0/1) are strengthened. By removing ambiguities in the qubit evolution, we access information which is blurred by statistical uncertainties in the standard approach, and we observe an improved temporal resolution of quantum jumps undergone by the qubit. The forward-backward analysis demonstrates how bidirectional RNNs naturally combine causal and anticausal correlations hidden in the measurement records.
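The forward-backward combination of Eq. (5) has a simple numerical form. The sketch below (a standalone illustration, not the paper's code) shows how an uninformative prediction near 0.5 leaves its partner unchanged, while two moderately confident predictions reinforce each other:

```python
def smooth(p_fwd, p_bwd):
    """Combine forward and backward predictions as in Eq. (5):
    values near 0.5 barely affect the result; values near 0 or 1 dominate."""
    num = p_fwd * p_bwd
    return num / (num + (1.0 - p_fwd) * (1.0 - p_bwd))

# An uninformative backward prediction leaves the forward one untouched...
print(smooth(0.8, 0.5))   # equals 0.8
# ...while two moderately confident predictions reinforce each other.
print(smooth(0.8, 0.8))   # larger than 0.8
```

Applied at every time step of a trajectory, this combination sharpens the temporal resolution of quantum jumps, as seen in Fig. 3.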

INITIAL STATE ESTIMATION
The roles of the preparation (y0, a) and measurement (yT, b) are treated symmetrically in the forward and backward predictions. Hence, while the forward RNN predicts the outcome of the final projective measurement, the backward RNN provides an estimation of the initial state of the system given the measurement record. These predictions can therefore be exploited to perform initial state tomography, a task reminiscent of the enhanced readout discrimination by machine learning demonstrated in Ref. [38]. For the state estimation, we do not specify the final projective measurement and we initialize the backward network with a maximally unknown state (P⇐(yT) = 0.5 for X, Y and Z). Each backward trajectory provides up to 1 bit of information about the initial state [39]. Combining this information using maximum-likelihood methods allows for reconstructing the initial state P0. Here, the optimization consists in minimizing the negative log-likelihood of the backward predictions over the probability of the initial state, following Ref. [40]: P0(y0|a) = argmin over P0 of −Σk log[P0 P⇐,k(y0) + (1 − P0)(1 − P⇐,k(y0))], where the sum runs over the backward trajectories k. As shown in Fig. 4a, we find agreement between the initial state estimation and the preparation within the 95% confidence interval estimated with a bootstrapping method. This demonstrates that, despite the complicated dynamics, the combination of RNN backward predictions performs as a faithful qubit state tomography.
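A minimal sketch of this maximum-likelihood reconstruction is given below. Synthetic backward "retrodictions" computed under a uniform prior (a Gaussian-record toy model standing in for the RNN outputs) are combined by a grid search over the initial-state probability P0; all parameter values are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(4)
true_p0, n_traj, mu = 0.85, 20_000, 1.0      # illustrative values (~20,000 trajectories)

# Toy records: Gaussian centered at +mu if the initial bit is 1, at -mu otherwise
y0 = rng.random(n_traj) < true_p0
r = rng.normal(np.where(y0, mu, -mu), 1.0)
# Stand-in backward retrodictions P_bwd(y0 = 1 | record), computed under a
# maximally unknown (uniform) initial state, as in the text
p_bwd = 1.0 / (1.0 + np.exp(-2.0 * mu * r))

def neg_log_likelihood(p0):
    """-sum_k log[p0 P_bwd,k + (1 - p0)(1 - P_bwd,k)] over backward trajectories."""
    return -np.log(p0 * p_bwd + (1.0 - p0) * (1.0 - p_bwd)).sum()

# A simple grid search stands in for the argmin of the likelihood function
grid = np.linspace(0.001, 0.999, 999)
p0_hat = grid[np.argmin([neg_log_likelihood(p) for p in grid])]
```

Because each retrodiction is computed under a uniform prior, the mixture likelihood is proportional to the probability of the record given P0, so the argmin consistently recovers the preparation probability as the number of trajectories grows.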

PARAMETER ESTIMATION
The trajectories predicted by the trained RNN can be exploited to estimate physical parameters of the experimental system. In Fig. 4b, we plot the distribution of the forward RNN predictions in the Y, Z plane for all times. This distribution exhibits a tilted ellipse shape within the Bloch sphere (white circle); the great axis of the ellipse is along the Z axis, showing that the quantum trajectories tend to collapse toward the poles of the Bloch sphere, corresponding to the pointer states of the measurement operator. In the equatorial plane, the distribution is squeezed, indicating that the quantum state experiences a larger dephasing and loses purity. By performing a statistical analysis of the forward RNN predictions, we are able to reconstruct the physical parameters associated with the stochastic master equation describing the quantum evolution under continuous measurement. The stochastic master equation has two main contributions [25]: on one hand, the dissipative evolution encodes the Hamiltonian evolution along with the decoherence; on the other hand, the measurement back-action describes the update of the quantum state given the stochastic measurement record. The dissipative evolution can be extracted from the forward prediction of the RNN by evaluating the average drift of individual trajectories. We compute the ensemble-averaged prediction change over intervals of 40 ns, dP = ⟨Pt+1 − Pt⟩ with Pt = (PX(yt), PY(yt), PZ(yt)), versus position on the Bloch sphere, depicted in Fig. 4c. We observe a drift vector map in the Bloch sphere describing a rotation of the qubit state about the X-axis of the Bloch sphere, corresponding to a Rabi frequency of ΩR/2π = 0.82 ± 0.02 MHz. An additional collapse of the state toward the Z-axis corresponds to a measurement-induced dephasing rate of γφ = 1.1 ± 0.05 µs^−1. The measurement-induced disturbance can also be extracted from the prediction of the RNN by evaluating the average diffusion of the individual trajectories [4]. We compute the covariance matrix associated with the prediction change over intervals of 40 ns, dP² = covar(Pt+1 − Pt). The diffusion vector map is given by the eigenvectors of the covariance matrix weighted by its eigenvalues versus position in the Bloch sphere, as depicted in Fig. 4d. This vector map describes the magnitude and the direction of the disturbance induced by the measurement in the Bloch sphere. We observe that the disturbance is maximal along the equatorial plane of the Bloch sphere and vanishes at the poles. From this map, we extract a measurement rate of γm = 0.40 ± 0.01 µs^−1 along the Z-axis of the Bloch sphere. The quantum efficiency of our measurement, defined as the ratio of the measurement rate to the measurement-induced dephasing rate, gives η = γm/γφ = 36%. Note that the quantum efficiency is usually challenging to estimate and requires several steps of calibration. The estimated experimental parameters differ slightly from the calibrations, which is attributed to a residual detuning of the Rabi drive with respect to the qubit frequency.
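The drift and diffusion extraction can be sketched numerically. In the toy below, synthetic trajectory increments (a rotation about X at the quoted Rabi rate, plus isotropic noise standing in for the measurement back-action) replace the RNN predictions; the rotation rate is recovered from the mean increments and the diffusion from their covariance:

```python
import numpy as np

rng = np.random.default_rng(5)
omega = 2 * np.pi * 0.82     # Rabi angular frequency (rad/us), value from the text
dt = 0.04                    # 40 ns step between predictions
n = 20_000

# Synthetic trajectory points on the Y-Z circle of the Bloch sphere
theta = rng.uniform(0, 2 * np.pi, n)
yz = np.stack([np.cos(theta), np.sin(theta)], axis=1)
# One step of rotation about X (acting in the Y-Z plane), plus noise
rot = np.array([[np.cos(omega * dt), -np.sin(omega * dt)],
                [np.sin(omega * dt),  np.cos(omega * dt)]])
noise = rng.normal(0, 0.05, (n, 2))          # stand-in for measurement back-action
yz_next = yz @ rot.T + noise

dP = yz_next - yz                            # prediction change over one interval
# Drift: mean rotation angle per step from the average cross product y dz - z dy
angle = np.mean(yz[:, 0] * dP[:, 1] - yz[:, 1] * dP[:, 0])
omega_hat = angle / dt                       # recovered rotation rate
# Diffusion: covariance of the increments, whose eigenvectors/eigenvalues give
# the direction and magnitude of the measurement-induced disturbance
diffusion = np.cov(dP.T)
```

In the actual analysis, the increments are binned by position on the Bloch sphere before averaging, yielding the vector maps of Fig. 4c and 4d rather than a single global rate.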

CONCLUSION
We demonstrate that a recurrent neural network can be trained to provide a model-independent prediction of the outcome of fully general quantum evolution based only on raw observations. The ensemble of predictions can be compared to quantum models such as the stochastic master equation to extract physical parameters without additional calibration. By considering causal and retrocausal evolution, we show that initial state tomography can be carried out even for non-trivial quantum evolution. The black-box approach of this work is an illustration of the fact that quantum mechanics is an operational theory, in which states and measurement outcomes can be predicted from raw observation without the mathematical abstraction of a Hilbert space. The model-agnostic nature of the RNN is therefore readily generalized to larger quantum systems. Such networks could excel at finding efficient state representations for larger systems, which could prove useful for real-time modelling, filtering and parameter estimation. The robust, model-independent nature of the prediction is a promising tool for the calibration of future quantum processors and will enable characterization of imperfections outside of the scope of the usual approximations, such as correlated errors or non-Markovian noise, and may even be suited for identifying and quantifying effects initially unknown to the experimenter.

Figure 2. RNN prediction of the quantum evolution. a. Blue-scale histogram of the normalized measurement records extracted from the experiment; traces plotted in color show representative instances. b. Red-scale histograms of RNN predictions for the measurement bases b = X, Y and Z in the driven case, beginning from y0 = 1 in the preparation basis a = X; traces plotted in color show representative instances. c. Training validation: ensemble of RNN predictions Sp leading to p = 0.85 at T = 2.5 µs, indicated by the red marker. d. Comparison of the RNN prediction with the tomography-averaged measurement outcome ⟨yt⟩. Inset: ensemble of projective measurements for the predicted ensemble Sp.
Fig. 2c displays the agreement between the ensemble of trajectories ending in p ± δ = 0.85 ± 0.01 and the histogram of the final measurement value.

Figure 3. RNN prediction and retrodiction of the quantum evolution. a. Red-scale histograms of RNN predictions for the measurement bases b = Y and Z in the driven case, beginning from y0 = 1 in the preparation basis a = Z; traces plotted in color show representative instances. b. Blue-scale histogram of the normalized measurement records extracted from the experiment; traces plotted in color show representative instances. c. Red-scale histograms of RNN retrodictions for the same measurement record. d. Comparison of the backward RNN prediction with the tomography-averaged measurement outcome ⟨y0⟩. e. Red-scale histograms of smoothed RNN predictions based on the forward-backward analysis given by Eq. (5) for the same measurement records.

Figure 4. Parameter estimation of the quantum master equation and initial state tomography. a. State estimation: estimation of 6 initial state preparations (red circles) using maximum likelihood estimation on backward RNN predictions (∼ 20,000 trajectories each) initialized from an undetermined projective measurement outcome; the circle radius gives the 95% confidence interval extracted from bootstrapping methods. b. Distribution of the RNN predictions in the Y and Z measurement bases for all times. c. Average drift of individual trajectories in the Bloch sphere: the vector map of the averaged evolution of RNN predictions in the Y and Z measurement bases between two consecutive time steps. This map captures the Hamiltonian evolution and the Lindbladian dissipation. d. Average diffusion of individual trajectories in the Bloch sphere: computed vector map associated with the covariance of the predictions between two consecutive time steps in the Y and Z measurement bases. This map captures the measurement-induced back-action.
ℏ is the reduced Planck constant, a† (a) is the creation (annihilation) operator for the cavity mode, and σX,Y are qubit Pauli operators. HR describes a microwave drive at the qubit transition frequency which induces unitary evolution of the qubit state characterized by the Rabi frequency ΩR. Hint is the interaction Hamiltonian between the qubit and the cavity mode.