Systematic Magnus-based approach for suppressing leakage and non-adiabatic errors in quantum dynamics

We present a systematic, perturbative method for correcting quantum gates to suppress errors that take the target system out of a chosen subspace. It addresses the generic problem of non-adiabatic errors in adiabatic evolution and state preparation, as well as general leakage errors due to spurious couplings to undesirable states. The method is based on the Magnus expansion: by correcting control pulses, we modify the Magnus expansion of an initially-given, imperfect unitary in such a way that the desired evolution is obtained. Applications to adiabatic quantum state transfer, superconducting qubits and generalized Landau-Zener problems are discussed.


I. INTRODUCTION
The problem of leakage errors, where a quantum gate is corrupted by populating spurious states, is generic to a variety of situations in quantum information processing. The most prominent example is the problem of highfidelity qubit gates. Most physical implementations of qubits are multi-level systems, with two levels chosen to encode the logical states of the qubit. As a result, control sequences designed to implement a given unitary evolution can give rise to transitions out of the logical subspace (see, e.g. [1]). Such "leakage errors" generically become more prominent as gates are made faster, due to the increased bandwidth of control pulses (and consequent enhanced spectral weight at the frequencies of unwanted transitions). Yet another generic example of leakage errors comes from protocols utilizing adiabatic evolution, where e.g. one attempts to have a system remain in the instantaneous ground state of some time dependent Hamiltonian. The leakage errors here occur for any non-zero protocol speed, and take the form of nonadiabatic transitions between instantaneous eigenstates.
Given their ubiquity, there is great interest in devising methods for suppressing leakage errors. While one could use the machinery of optimal quantum control [2], analytic approaches which yield simple, smooth and robust control pulses are also desirable. Many approaches have been put forth in the context of specific problems. For superconducting circuits, perhaps the best known is the DRAG technique (Derivative Removal by Adiabatic Gate) [3,4], which was designed to minimize leakage errors for single qubit gates in a weakly anharmonic qubit. DRAG has also been employed to improve the fidelity of a Rydberg-blockade two-qubit entangling gate [5]. A technique for minimizing leakage errors in two-qubit gates in circuit QED has also been formulated [6]. Turning to leakage errors in adiabatic-evolution protocols, a general strategy here are the so-called "shortcuts to adiabaticity" (STA) techniques [7][8][9][10][11][12][13], which provide methods for improving pulse sequences. A recent technique based on a "quantum geometric" interpretation of adiabatic evolution has also been put forward [14].
In this work we present an extremely general strategy for mitigating leakage errors. In certain limits, it cap- tures aspects of both DRAG and STA approaches, but is also able to deal with situations where these methods fail or become impossible to implement. The underlying idea is sketched schematically in Fig. 1. We start with a control sequence that would perfectly implement some desired quantum gate if there were no leakage levels. This evolution is however corrupted by coupling to spurious leakage levels [ Fig. 1 (a)]. Our goal is to gently modify the control sequence to mitigate the errors. We do this in two steps. First, we design a control correction [described by a Hamiltonian W 1 (t)] that cancels the effects of pure leakage to leading order. With this correction, the population of leakage levels at the final time is highly suppressed. The dominant remaining error is then a "phase" error: despite residing in the computational subspace, the final state may still devi-ate from the target state [ Fig. 1 (b)]. Our second order control correction [described by W 2 (t)] corrects for these errors [ Fig. 1 (c)]. We stress that the goal is not to suppress leakage errors at all times, but rather to have them effectively cancel out on average.
The general philosophy sketched above is in many ways similar to that of both DRAG and STA (see, e.g., Refs. [3,13]). However, our means of implementing it is quite different. We use the Magnus expansion [15,16] to systematically improve control pulses and implement our strategy of cancelling leakage on average. As we show in detail, this allows us to treat a more general class of problems. In particular, we are able to correct leakage errors in cases where the DRAG approach fails due to the closing of a spectral gap. Also, in comparison to STA techniques, we can apply our methodology in a perturbative fashion, allowing us to treat complex systems where the exact diagonalization usually employed in STA (e.g. to find counterdiabatic driving fields [7][8][9][10][11]13]) is impractical. Recently a numerical variational approach has been put forward to circumvent the need of exact diagonalization [17].
The general idea of using the Magnus expansion to find optimized control sequences is of course not new, with the earliest work addressing the specific problem of achieving population inversion in nuclear magnetic resonance [18]. Most Magnus-based approaches for improved controls rely on simply using the first term of the expansion, and viewing this as a Fourier transform integral; one then corrects pulses to suppress spectral weight associated with unwanted leakage transitions [19,20]. Similar approaches based on Fourier analysis have been explored in the context of specific problems in Refs. [21] and [22]. As we discuss, our Magnus approach is significantly different from these strategies: it does not rely on an analogy to a Fourier transform (allowing us to treat more general problems), and is not restricted to simply looking at the first order term in the Magnus expansion.
The remainder of this paper is organized as follows: in Sec. II we define the general "leakage problem" in the context of a quantum gate generated by a Hamiltonian with time-dependent control fields. In Sec. III, we show how to systematically correct the control fields using the Magnus expansion so as to mitigate leakage errors. Finally, in Sec. IV, we consider a few illustrative and relevant applications of the method: suppression of non-adiabatic errors in stimulated Raman adiabatic passage (STIRAP) in a lambda-system, suppression of leakage in gates for superconducting qubits, and suppression of leakage in the multiple-crossings Landau-Zener model.
Our choice of examples in Sec. IV has been made to clearly contrast the Magnus-based approach against DRAG and STA techniques. The relatively simple problem of STIRAP demonstrates how the Magnus approach lets one perturbatively capture the leading effect of the STA technique; it also illustrates how DRAG fails when confronted with the closing of a spectral gap. Re-visiting the problem of leakage errors in superconducting qubits (for which DRAG was developed) helps one interpret the recent experimental result of Chen et al. [1]. This work demonstrated that optimizing the overall amplitudes of the DRAG controls (away from the theoreticallypredicted optimal values) helps further reduce gate errors; our Magnus-based approach provides a natural way for understanding why this is the case. Finally, the multiple-crossings Landau-Zener model demonstrates how the Magnus approach can correct a complex adiabatic evolution problem for which STA techniques cannot be implemented.

II. STATEMENT OF THE GENERAL LEAKAGE ERROR PROBLEM
We consider a quantum system with a Hilbert space H, having a subspace H comp that contains the useful computational states of the system. The goal is to implement some specified unitary gate U G taking H comp → H comp . The basic problem is that the time-dependent controls used to generate this evolution will invariably couple states inside H comp to those outside of it, leading to leakage and an imperfect implementation of the quantum gate.
For concreteness, we consider a basis {|n , 1 ≤ n ≤ N }, where the first Q levels form the computational subspace (e.g. the coupled states of a qubit encoded in |0 and |1 ), and the remaining N − Q levels are "leakage" levels. The generic time-dependent Hamiltonian will have the form (1) H Q (t) represents the desired control of the computational levels, and only has non-zero matrix elements between these levels. V (t) describes both the spurious couplings that are generated between computational levels and leakage levels (e.g. for a qubit, the coupling between |0 or |1 with any other excited state |n , n ≥ 2) and couplings between leakage states [23]. We include the dimensionless parameter here for convenience; it will let us track powers of V (t) in what follows. Solving the Schrödinger equation defined by Eq. (1) leads to the unitary time evolution operator Here, U 0 (t) describes the desired evolution generated by H 0 (t), with T the time-ordering operator. For simplicity and without loss of generality, we have chosen to set t i = 0.
In contrast, imperfections due to leakage are described by U I (t), which is defined as Here and throughout, we denote the interaction picture representation of a Schrödinger operator O(t) by O I (t): The last inequality defines the superoperator 0 (t). Note that this interaction picture is in general easy to find: one only needs to fully understand the desired dynamics in the computational subspace, as the evolution of the leakage subspace is trivial under H 0 (t) [24]. We also emphasize that the interaction picture defined in Eq. (5) is different than the one used to derive DRAG [3,4]. Our interaction picture is defined by the full unitary evolution generated by the ideal Hamiltonian. As a consequence, our approach treats only the "spurious" couplings as a perturbation (and not the desired couplings between computational levels). We assume that the various time-dependent fields have been chosen such that if there were no leakage, the evolution described by U 0 (t) at t = t f corresponds to some desired unitary gate U G in the computational subspace. Formally, in the case where there was no coupling to the leakage subspace, we would have: and achieving the desired gate would require We have introduced here the projection operator onto the computational subspace, P Q , and unit operator in this subspace 1 Q .
Including the effects of non-zero leakage, the ideal gate will be corrupted, and states initially in the computational subspace will evolve to states outside of it [see Fig. 1 (a)]. To mitigate the effects of leakage, one could try to somehow suppress the relevant off-diagonal elements of V (t) at all times during the protocol. This is however an overkill: to achieve Eq. (7) (and hence a perfect evolution), we only need the net effect of leakage transitions to average away at the final time. This implies that the final time "error propagator" U err = U I (t f ) should essentially act as the unit operator on computational states, i.e.
The question then becomes how this cancellation can be achieved by modifying the available control fields, while still having U 0 (t f ) generate the desired unitary in the computational subspace. The modification of the control fields changes H(t): this change is described by a "correction Hamiltonian" W (t), defined via We want the combined effect of the correction Hamiltonian W (t) and the leakage Hamiltonian V (t) to average away at the final time. The introduction of W (t) modifies the error propagator to be We thus need to chose W (t) so that The basic strategy for dealing with leakage is to find (to some desired degree of accuracy) a method for fulfilling Eq. (11) and thus cancelling on average the effects of leakage. If successful, the modified protocol will cause the system to depart from the computational subspace at intermediate times, but return to it at the final time to perform the desired gate [see Fig. 1 (c)].

A. Basic approach
Finding a suitable W (t) to fulfil Eq. (11) remains a daunting problem. There is however a situation where the solution is straightforward. Suppose that the Hamiltonian V I (t) commuted with itself all times, [ V I (t 1 ), V I (t 2 )] = 0, ∀t 1 , t 2 . In this case the time-ordering in Eq. (10) plays no role, and leakage errors are completely suppressed whenever W (t) satisfies The superoperator Q simply nulls the part of an operator acting only in the leakage subspace; it is defined via with P N −Q the projector onto the leakage subspace. Eq. (12) tells us that the time average of W (t) needs to cancel the relevant parts of the time-averaged leakage operator V I (t). While Eq. (12) does not provide a solution to the more general (and standard) case where V I (t) does not commute with itself at different times, it does provide a means of attack: we can perturbatively correct the solution for W (t) given by Eq. (12) to account for the noncommutativity of V I (t), thus yielding a W (t) that satisfies Hamiltonian describing jth-order correction to control the fields Eq. (13) V (j) (t) modified error Hamiltonian (includes first j control Hamiltonians Wj(t)) Eq. (14) Ω (j) (t) modified "Magnus operator" associated to V kth term of the modified Magnus expansion Eq. (16) Eq. (11) to a high degree of accuracy. We will construct W (t) as a series, where the first term W 1 (t) satisfies Eq. (12) and scales like , while the kth term scales like k .
As we show below, the well-known Magnus expansion [15,16] applied to the error propagator U I (t) can be used to construct each term of the series defined in Eq. (13). We will show that keeping the first k terms in this series allows one to cancel all leakage errors to order k . This is obviously beneficial in cases where the relevant matrix elements of the leakage operator V I (t) are small, as a term ∝ k is necessarily proportional to k powers of V (t). This approach also affords advantages in cases where the matrix elements of V I (t) are not necessarily small, but are rapidly oscillating.
Consider U (j) err , the modified error propagator in the presence of the first j terms in the expansion of the correction Hamiltonian: The Magnus expansion expresses these time-ordered exponentials as the simple exponential of an operator, which is itself defined as an infinite series. Applying it to U (j) err yields: , The first terms of each series are given by (see, e.g. [15,16]) Further, the initial condition U (j) In what follows, we refer to Ω (j) (t) as the "Magnus operator" corresponding to the jth order corrected error propagator U (j) err . Leaving the discussion of convergence to Appendix B 1, we first discuss how to use the above expansions to construct the correction Hamiltonian W (t). Our general strategy is to pick the first k terms in the series for W (t) such that we suppress terms up to order k in the corresponding error propagator U (k) err . This is equivalent to requiring that at each order, we chose W k (t) so that the Magnus operators satisfy B. Constructing the first order correction to the control Hamiltonian We start with the first order term in the correction Hamiltonian W 1 (t). To cancel error terms that are order , we require Q Ω (1) i.e. the first order term in the Magnus expansion of the modified error propagator should vanish. From Eq. (16), this implies that W 1 (t) should satisfy Eq. (12), i.e.  O[ 2 ] or higher. These additional terms also vanish completely in the simple case where V I (t) commutes with itself at all times. There are of course many possible choices of W 1 (t) that will satisfy Eq. (19); while in principle all such choices are equally as good, in practice not all will be attainable given the typically limited set of control fields that can be experimentally implemented. Depending on the particular problem, one now has two general choices: • One can attempt to find a W 1 (t) that exactly satisfies Eq. (19). As we show in Appendix B 2, this implies that the state-averaged fidelity error scales as 4 (as opposed to 2 with no correction).
• Alternatively, for more complex problems (or where the form of possible control Hamiltonians is tightly constrained) one can attempt to satisfy Eq. (19) approximately. This does not change the power with which the fidelity error scales with , but can significantly modify the prefactor of the leading 2 term. As we show in the Sec. IV with specific examples, this can still give significant benefits. This more approximate approach is discussed in more detail in Sec. III E.
In Fig. 2, we depict a flowchart illustrating the first few steps of the Magnus-based algorithm. We outline below two general methods for obtaining first order control corrections W 1 (t) that exactly satisfy Eq. (19).

Derivative-based control
One general method for obtaining a W 1 (t) satisfying Eq. (19) is to use the fact that in most physically relevant situations, the time dependence in V (t) arises from an external control that is switched on at t = 0 and off at t = t f , and hence one often has V (0) = V (t f ) = 0. If this is the case we can always pick W 1 (t) so that the integrand in the first order Magnus term Q Ω To see this explicitly, we first define the superoperator where the superoperator 0 (t) is defined in Eq. (5). We then choose This yields Q Ω (1) (22) Although the form for the correction in Eq. (21) may seem opaque, we will show in later sections that it can be explicitly found and implemented in a number of relevant problems. This solution also provides a good starting point when looking for controls that are constrained by the physical system into consideration. We refer to Eq. (21) as the derivative-based control.
The control defined in Eq. (21) has a striking similarity with DRAG since it involves the time-derivative of V (t). A crucial difference is that ∂ t V (t) is multiplied by the superoperator † 0 (t)L 0 (t), which ensures that W 1 (t) is a well-behaved function of time. This is in contrast to DRAG where the control Hamiltonians may become unphysical if the unperturbed Hamiltonian is timedependent. We give a simple example of this in Sec. IV A, for a problem of adiabatic state transfer where the adiabatic gap is time-dependent.

Generating function approach
Another extremely general method is to first pick an operator-valued function R(t) that satisfies R(t f ) = 0. We then can select W 1 (t) to make the leading-order Magnus term Ω (1) 1 (t) = R(t). By definition, Eq. (18) is then satisfied as required. Using Eq. (16), the required W 1 (t) is then found to be: If one can find a class of functions R(t) (depending say on a finite number of parameters), this then yields a family of possible first order control corrections W 1 (t). Note that since Ω (1)

C. Higher-order corrections to the control Hamiltonian
We now show how to find higher-order control corrections W k (t) and in particular the second order control W 2 (t). As for the first order correction, one has considerable freedom in picking the form of the higher-order controls. This directly follows from Eq. (17) which only requires the Magnus operators to only vanish at t f .
Consider first W 2 (t), which suppresses errors at order 2 . The procedure is completely analogous to how we chose W 1 (t): we pick W 2 (t) so that all non-zero terms in Q Ω (2) (t f ) are at least order O[ 3 ]. As we show explicitly in Appendix A, the needed correction has to fulfill where Ω (1) 2 (t) is obtained from Eq. (16). We stress that There is a particular choice of W 2 (t) that automatically fulfills Eq. (24). By taking the time-derivate (with respect to t f ) on both sides of Eq. (24), we find This choice for W 2 (t) is very convenient as it is fully determined by the form of the leakage operator and W 1 (t).
In particular, Eq.
The recipe for constructing W k (t) ∀k ≥ 2 is the same as the strategy for k = 2: we can always choose W k (t) such that Q Ω (k) (t) = Q Ω D. Heuristic interpretation of the controls As stated above, keeping only the first two terms in the series for W (t) will be sufficient to enhance the fidelity; it is thus useful to have a physical picture for these corrections. The first order Magnus expansion of the uncorrected leakage operator, Ω (0) 1 [see Eq. (16)], represents a time-integrated effective matrix element for transitions out of the computational subspace H comp ; the correction W 1 (t) thus describes an added control that cancels direct leakage errors [up to corrections due to the noncommutativity of V (t)].
In contrast, we can understand ∂ t Ω (1) 2 (t) as the operator that describes the state of the system when the corrected leakage operator V we can identify two sources of error: 2 is an operator that acts only within H comp , but where excursions to the leakage subspace at intermediate times nonetheless yield errors. We refer to the latter as "phase errors".
Thus, the second order correction W 2 (t) describes a control Hamiltonian that corrects both the phase evolution of states belonging to H comp and cancels leakage errors that scale like O( 2 ).
In Figs. 1 (b) and (c), we have depicted the dynamics of the system in the presence of W (t) = W 1 (t) and W (t) = W 1 (t) + W 2 (t), respectively.

E. Imperfect realizations of controls
While the Magnus approach allows one to easily find control corrections for a given problem, in many cases these corrections will require control fields that are not realizable given experimental limitations. In the following, we show that by properly implementing an optimized version of the truncated control corrections (only using experimentally accessible fields), one can still obtain significant error suppression. This statement can be quantified. If one could perfectly implement a first order Magnus correction W 1 (t), the error in the state-averaged fi-delityF scales like 4 (see Appendix B 2). In the case where W 1 (t) cannot be implemented exactly, but instead an optimized, truncated form is used, the fidelity behaves likeF = 1 − a 2 + O( 4 ). While the optimization cannot eliminate the 2 term completely, it can make the prefactor a 1. We consider the general situation where one has derived an ideal first order Magnus correction W 1 (t) [i.e. satisfying Eq. (19)], but where not all of the terms in this control Hamiltonian can be experimentally achieved. We thus write the ideal control as where W ctrl 1 (t) and W err 1 (t) are, respectively, the experimentally-implementable and unimplementable parts of W 1 (t). We discuss various truncation-plusoptimization schemes in what follows.

Simple truncation
The simplest strategy would be to simply truncate W 1 (t) and only implement its experimentally-attainable part W ctrl 1 (t). In this case, one no longer completely cancels the first order term in the Magnus expansion of the error propagator. The error at t f is fully determined by the Magnus expansion of Q V I (t) + W ctrl 1,I (t), which we denote Ξ (1) (t). We have which follows from Eqs. (18) and (26). The naive truncation can be sufficient if one is almost able to implement all of the ideal correction W 1 (t). This is equivalent to requiring i.e. the error associated to the modified leakage operator V I (t) + W ctrl 1,I (t) is smaller than the error associated to V I (t).

Variational approach
The general goal of our first order correction is to make the time-average of V (t)+ W 1 (t) vanish. While the ideal correction W 1 (t) accomplishes this task by construction, simply truncating it causes the integral to no longer be zero, c.f. Eq. (27). A simple way to try and reduce the magnitude of this integral is to treat the overall amplitude α of the implementable part of W 1 (t) as a variational parameter, i.e. W ctrl 1 (t) → α W ctrl 1 (t). One can then optimize the value of α to minimize the error integral: To do this optimization, one needs a suitable scalar measure of the size of this error integral. The relevant measure here is the state-averaged fidelityF [25], which quantifies how much on average a non-zero Q Ξ (1) 1 (α, t f ) causes the evolution under our gate to deviate from the ideal version. One finds where N is the dimension of the total Hilbert space. The variational approach then involves minimizingF (α) with respect to α. While this variational strategy respects the ideology of our approach, one could also optimize the state-averaged fidelity for the full error propagator U (1) err . We give an explicit example of this variational approach in Sec. IV A 4, in the context of a quantum state transfer protocol. Also, as discussed more in Sec. IV B, our variational approach sheds insight into the recent work of Chen et al.
[1]. This work discusses how an adhoc optimization of prefactors in standard DRAG-style pulse corrections could help suppress leakage-related errors in superconducting qubit gates; it can also be viewed as a version of our variational approach to dealing with missing controls.
One could of course do a more sophisticated procedure by optimizing the implementable control fields at each time. However, the resulting complexity would defeat the purpose of the Magnus-based approach, and would be akin to that of optimal control techniques. In contrast to optimal control, maximizingF (α) is not demanding numerically; nonetheless, it still leads to significant improvements as compared to the simple truncation strategy.

Iterative approach
If V (0) = V (t f ) = 0, an alternate approach to optimizing the truncated control Hamiltonian, is to first focus on the form of the unattainable controls W err 1 (t). As we care about the time-integral of this operator, it is possible in many cases (via suitable integration by parts) to turn parts of this operator into experimentally attainable controls. This is especially easy to implement if we have for some functions g e (t) and g c (t). If the above condition is fulfilled, then the first order Magnus expansion of the error propagator (including the ideal first order control correction) can be written Q Ω (1) where the superoperator L 0 (t) is defined in Eq. (20). In the last equality, we have performed an integration by parts, and used the fact that W err Equation (32) suggests a natural modification of the control Hamiltonian: W ctrl The error of the new control sequence is determined by Generically, for sufficiently slow control protocols, we expect the size of the modified error integral in Eq. (34) (as measured e.g. by the p = 2 operator norm) to be smaller than what we would have with just a naive truncation, Eq. (27). This is directly due to the presence of L 0 (t), which involves integrating the interaction-picture evolution operator. It thus roughly scales like the inverse of the energy gap separating computational and leakage levels. If this gap is large compared to the typical magnitude of g e (t) (which scales like the inverse timescale of the derivative of the control pulse), our procedure will significantly reduce the error. Note that the above procedure represents just the simplest version of a whole hierarchy of control optimizations. Due to the presence of W err 1 (t) in the last line of Eq. (32), one can repeat the procedure iteratively (i.e. via additional integration by parts) to reduce the error even further. Each iterative steps generates a new control Hamiltonian X ctrl k (t) (k ≥ 2). In Sec. IV B, we give an explicit example of implementing this procedure, in the context of fighting leakage errors in a superconducting qubit gate. We show that the approach can yield advantages over standard DRAG corrections.

Second-and higher-order control Hamiltonians
The previous discussion provided several approaches for dealing with the common case where the ideal first order control-correction Hamiltonian cannot be realized using the available experimental controls. Our general strategy relies in "truncating" W 1 (t) and subsequently in optimizing the truncated control Hamiltonian W ctrl 1 (t). As mentioned earlier, this procedure will lead to a stateaveraged fidelity whose leading order correction is proportional to 2 .
Even though one has not completely cancelled the 2 term in the error, there is still in general utility to finding higher-order control correction Hamiltonians as one would do in the ideal case (c.f. Sec. III C). This is perhaps best understood physically, using the heuristic interpretation of the control corrections introduced in Sec. III D. Recall that in that section, we discussed how the principal role of the first order control correction W 1 (t) is to cancel pure leakage errors (where the final state lies outside the computational subspace H comp ), while the second order correction W 2 (t) corrects phase errors (i.e. helps make sure that the portion of the final state in H comp has the correct form). In general, cancelling phase errors is useful even if one has not completely cancelled pure leakage errors by implementing the ideal first order correction W 1 (t).
A further reason for attempting to implement the higher-order controls is that they are very easily obtained from the first order control (exactly like the ideal case). In particular, the second order control is given by Eq. (25) with Q Ω

IV. APPLICATIONS
In this section, we apply our general method to a few concrete problems. These highlight the flexibility and utility of the method, and also allow one to understand better similarities and differences from existing methods such as STA and DRAG.

A. Suppressing non-adiabatic errors in STIRAP quantum state transfer
We start with the problem of adiabatic quantum state transfer. Consider a Λ-system with two degenerate ground states |1 and |3 , and an excited state |2 . We assume (as is standard) that states |1 and |3 can be controllably coupled to |2 , but cannot be coupled directly to one another. The generic control Hamiltonian thus has the form: where G p (t) and G s (t) denote the pump and Stokes pulse amplitudes, respectively. Stimulated Raman adiabatic passage (STIRAP) [26][27][28] is a technique that allows one to perform adiabatic state transfer from |1 to |3 without ever occupying |2 . Since the technique is based on adiabatic passage, it is intrinsically slow. There is thus interest in finding methods to speed up this approach.
Here, we use our Magnus-based algorithm to find control corrections that speed up STIRAP while keeping a high-fidelity (and while respecting the constraint of no direct coupling between |1 and |3 ). Apart from its physical relevance to a variety of systems, the STIRAP problem is interesting because so-called "shortcuts to adiabaticity" (STA) methods [8,10,12] can be used here to find control corrections allowing a perfect transfer fidelity [13]. We will use these to benchmark the performance of the Magnus method to gain further insight into the underlying physics. We stress that for more complex problems (like that in Sec. IV C), STA methods are essentially impossible to implement, whereas the Magnus approach remains workable and useful. The STIRAP problem is also interesting as it highlights how the Magnus method is able to surpass limitations of the standard DRAG technique for fighting leakage [4].

Correcting constant-gap pulses
We start by considering optimal pulses which keep the instantaneous gap between adiabatic eigenstates constant. Using the pulse shape studied extensively in [29], the control pulses are written Here, G 0 is the maximal strength of the coupling and 1/ν determines the characteristic timescale of the protocol.
We next move to the adiabatic frame. The Hamiltonian takes the form Here S † (t) is the unitary change of basis operator that diagonalizes H(t), and maps states in the lab frame onto states in the adiabatic frame. We have denoted the adiabatic eigenstates of the system by |b + , |b − (known as bright-states), and |d (dark-state). The corresponding instantaneous eigenenergies are time-independent and given by E ± = ±G 0 and E d = 0. An ideal STIRAP-style state transfer would involve having the system stay in the |d state as it adiabatically evolves from being |1 to |3 (accomplished by having θ(t) evolve continuously from 0 to π/2). Leakage errors here Error ε Adiabatic parameter ν G 0 no corr. 1st order corr. 2nd order corr. 2nd order opt. corr. are due to non-adiabatic transitions between the adiabatic eigenstates. They are caused by the term V (t) ∝θ in H ad . Generically, the size of V (t) (and hence leakage errors) increases as the protocol is made faster.
The goal as always is to find modifications of the original pulse sequence so as to suppress leakage (i.e. nonadiabatic) errors. To do this using our Magnus approach, we first write V (t) in the interaction picture defined by Eq. (5) [i.e. generated by the time-independent H 0 in Eq. (37)]: Note that for this problem we have Q V I (t) = V I (t) (i.e. it does not have matrix elements directly coupling leakage levels to one another). The first step of the Magnus approach for deriving control corrections is to find a first order control correction W 1 (t) satisfying the cancellation condition of Eq. (19). Since we haveθ(0) =θ(t f ) = 0, we can use the derivativebased control choice for W 1 (t) in Eq. (21). Writing the correction in the adiabatic frame [i.e. the same frame as Eq. (37)], we find: As always, we stress that the choice of W 1 (t) is not unique. The above choice has the merit of both being simple to find, and (as one can easily verify) of not requiring any direct coupling between levels |1 and |3 to implement. The second order control is fully determined by V I (t) and W 1 (t); it can be found using Eq. (25). We find where we have written the control in the adiabatic frame. Again, this control Hamiltonian does not require any direct |1 and |3 coupling. Hence, implementing both the first and second order Magnus-derived corrections corresponds to a simple modification of the control pulses G s (t) and G p (t). The form of the modified G p (t) control pulse is shown in Fig. 3 (b) for various choices of the speed parameter ν/G 0 ; G s (t) is just given by G p (−t). As ν/G 0 → 0, one is in the adiabatic limit, and the correction to the control pulses vanishes as expected.
To assess the efficacy of the different control sequences, we consider the state infidelity for a STIRAP style population transfer from |1 to |3 . This is defined as The behaviour of this error for the uncorrected STIRAP protocol versus the Magnus-based corrected protocols are shown in Fig. 3 (a). One sees that if the goal is an error less than 10 −3 [30], the second order Magnus-based correction allows for a ∼ 2.6-fold increase in the maximum protocol speed (i.e. speed parameter ν). Note that to simulate a realistic experiment where pulses all have a finite duration, we have integrated the Schrödinger equation from t i = − ln(−1 + π/2 arcsin[δ])/ν to t f = − ln(−1 + π/2 arccos[δ])/ν. This choice ensures that the control fields are small at both the initial and final protocol times: sin[θ(t i )] = cos[θ(t f )] = δ. We take δ = 10 −6 .

Link to shortcuts to adiabaticity
As previously discussed, for the STIRAP problem, one can use the STA approach to derive an exact control correction that always yields perfect fidelity, the so-called "super-adiabatic transitionless driving" (SATD) [13]. These ideas were recently tested experimentally [31,32]. The added control (written in the adiabatic basis) takes the form: To compare this correction against those found from our Magnus approach, we can formally expand it in powers ofθ(t)/G 0 . The zeroth-order term in this expansion is identical to the first order correction W 1 (t) found using our Magnus approach, c.f. Eq. (39). This suggests that the Magnus approach provides a perturbative route for obtaining the benefits of a STA protocol (at least to leading order).
If one continues to expand the SATD correction in powers ofθ, one easily sees that the correspondence to the Magnus approach does not extend past leading order (see Appendix C for more details). Recall that the basic physical picture underlying both approaches is the same, and corresponds to Fig. 1: to fight leakage, one constructs a "dressed" dark state, and implements control corrections to ensure the system evolution follows this dressed state. We see that the dressed states for the two approaches are in general not the same (though they are similar to leading order).
For the STIRAP problem, the existence of the perfect SATD correction makes the Magnus approach unnecessary. The same is not true for more complex systems, where STA approaches are impossible to implement, whereas the Magnus approach remains tractable. An explicit example of such a problem is given in Sec. IV C.

Optimal second order control correction
Given the freedom one has to pick W 2 (t) [see Eq. (24)], we can look for an "optimal" control, W 2,opt (t), that simultaneously cancels all leakage errors at order 2 and 3 . To find such a control, one needs to (numerically) solve a set of coupled first order differential equations, which defeats the purpose of our Magnus-based method (see Appendix D). However, since for STIRAP pure leakage errors are more detrimental than pure phase errors, one can try to find an approximate optimal control that suppresses pure leakage errors at order ε 3 (i.e third order processes that take the system out of the dark state subspace) at the expense of not fully cancelling phase errors at order ε 2 (hence giving an approximate optimal control Hamiltonian). We find (see Appendix D) Remarkably, implementing Eq. (43) only requires changing the numerical prefactor of W 2 (t) [see Eq. (40)].
In Fig. 3 (a), we have plotted the error obtained with the optimal second order Magnus-based control (red trace). As it can be seen, the error is always smaller than 10 −3 . If we define the range of validity of a "corrected" control sequence as the parameter regime for which the maximal amplitude of the corrected pulse does not exceed the maximal amplitude of the original pulse, then the validity of the optimal second order correction is very close to that of SATD. We numerically find that the optimal control obeys this criterion for ν/Ω 0 2.21, while SATD is valid for ν/Ω 0 2.63. Within our definition of high-fidelity adiabatic passage (ε ≤ 10 −3 ), the approximate optimal control found at second order is nearly as good as the exact control.

Corrections for a time-dependent gap and connections to DRAG
As previously discussed, the DRAG technique [3,4] is also a perturbative technique for constructing corrections to control pulses to suppress the deleterious effects of leakage transitions. The STIRAP problem considered in this section is an excellent testbed for seeing how it compares to our Magnus approach. For the uncorrected pulse sequence of the previous subsection [c.f. Eq. (36)], one finds that the DRAG approach yields identical control corrections as the Magnus approach, i.e. W 1 (t) and W 2 (t) in Eqs. (39) and (40). This is however a special case where the adiabatic gap in the original pulses is completely time-dependent. As we now show, for more general pulses, the DRAG technique can fail completely, while the Magnus approach remains effective.
Consider the most standard pulse shapes used for STI-RAP, where both G p (t) and G s (t) are Gaussians [26]. We have explicitly: with τ the delay between the sequences and t 0 = − ln(δ)/ν is defined such that G s (0) = δ. In the adiabatic frame, the Hamiltonian reads where One can straightforwardly apply standard DRAG to this problem to help cancel non-adiabatic errors. It predicts a first order correction similar to Eq. (39) with G 0 replaced by G(t). The amplitude of the correction is thus ∝θ(t)/G(t). Unfortunately, the gap G(t) → 0 at both late and early times, and does so faster thanθ(t). As a result and as shown in Fig. 4 (b), the DRAG correction diverges, and is thus not physical. In contrast, using the Magnus approach yields physical (non-divergent) corrections. The leakage Hamiltonian in the interaction picture is: where ∆(t) = t 0 dt 1 G(t 1 ). As usual, the first step of the Magnus approach is to construct a first order control correction W 1 (t) satisfying Eq. (19). While one could again use the derivative-based control technique to find a well-behaved W 1 (t), for illustrative purposes, we will use some of the other techniques discussed in Secs. III B 2 and III E. In particular, we will use the generating function approach combined with the variational approach.
The generating function approach to finding W 1 (t) requires one to first construct a well-chosen time-dependent generating function that vanishes at t = t f [i.e. satisfies Eq. (18)]. Motivated by the form of Ω (1) 1 (t) that was found in the last subsection for the case where G(t) was time-independent, we consider the function Here, α is a parameter that will be set in what follows.
Given this generating function, the corresponding control-correction W 1 (t) is found by using Eq. (23). Unfortunately, when written in the lab-frame, one sees that it has terms requiring a direct coupling between levels |1 and |3 . As described in Sec. III E, we can still proceed even though W 1 (t) cannot be fully implemented given our limited experimental controls. We first decompose it into its attainable [ W ctrl 1,ad (t)] and unattainable [ W err 1,ad (t)] parts, W 1,ad (t) = W ctrl 1,ad (t) + W err 1,ad (t), see Eq. (26). We find and Here, γ(t) = (αG(t)/G 0 − 1)θ(t).
To proceed, we will use the variational approach discussed in Sec. III E 2. We basically assume that only the correction W ctrl 1,ad (t) is implemented, and pick α to minimize the error that results from dropping W err 1,ad (t). This involves numerically minimizing the average fidelityF [α]   (50) Having found the first order control, the second order control Hamiltonian is obtained via Eq. (25). We find W ad,2 (t) = αθ 2 (t) with β(α, t) = Im e −i∆(t)θ (t) 2 t 0 dt 1 e i∆(t1) γ(t 1 ) In Fig. 4 (a), we have plotted the infidelity ε [see Eq. (41)] as a function of the adiabatic parameter ν/Ω 0 . We consider various controls: the original, uncorrected Gaussian pulses , the first order Magnus correction , and second order Magnus correction. For each value of ν/Ω 0 , we have chosen the α that minimizes Eq. (50). The exact time evolution operator U (j) (t) is found by integrating the Schrödinger equation from t = 0 to t f with G p (0) = G s (t f ) = δ. We have chosen δ = 10 −6 . If we consider, as previously, that a high-fidelity adiabatic passage is achieved for log 10 [ε (j) ] ≤ −3, then the modified control sequence defined by W (t) = W 1 (t)+ W 2 (t) allows for a five-fold speedup compared to the original Gaussianbased STIRAP sequence. In Figs. 4 (b) and (c), we plot the modified pump and Stokes pulses, respectively. As ν/G 0 → 0, the modified control sequence converges to the original sequence.
The curves in Fig. 4 (a) illustrate a general important point: in the extreme adiabatic limit, ν/G 0 1, the fidelity error does not tend to zero (as one would naturally expect), but instead shows an oscillating behaviour. The errors in this limit are not due to non-adiabatic transitions, but instead due to the finite protocol duration (see Appendix E). By averaging the fidelity error of the uncorrected protocol between ν/G 0 = 0.025 and ν/G 0 = 0.1, we see that these finite-time errors set an approximate lower limit on the error of ∼ 10 −4 . As our correction protocol is only designed to suppressed non-adiabatic errors, it will not be able to suppress the total error appreciably below this level. This also explains why the second-order corrected protocol appears to have a higher error than the first-order corrected protocol in the range 0.17 ν/G 0 0.28. In this regime, the errors remaining after the application of the first order control are dominated by finite-time effects, and hence applying a higher-order correction (designed to fight non-adiabatic errors) will not improve matters.

B. Leakage in a superconducting qubit
The DRAG technique for correcting pulses was originally formulated to suppress leakage during manipulations of superconducting qubits [3]. The qubit here is formed by the lowest two levels of an anharmonic oscillator (eigenstates |n ). Applying our Magnus-based method to this problem yields a different approach for constructing control corrections. One immediately encounters the situation discussed in Sec. III E, where the theoretically-ideal control corrections must be truncated and optimized to reflect the limited experimental control fields available. This provides a direct and very physical way to understand the recent experimental work of Chen et al. in Ref. [1]. This work demonstrated that an ad-hoc optimization of DRAG results in significant fidelity improvements. Our approach shows how the need for such an optimization arises naturally.
We consider a situation where a drive is applied to the qubit with a centre frequency ω dr and complex envelope 2κ(t). In the frame rotating at the drive frequency (and within the rotating wave approximation), the Hamiltonian describing the system is δ is the detuning between the drive frequency and qubit transition frequency, ∆ is the qubit anharmonicity, and λ is a parameter that characterizes the strength of the transition |1 ↔ |2 relative to the transition |0 ↔ |1 . We take the (uncorrected) drive amplitude to have a simple real-valued Gaussian envelope: with κ 0 denoting the maximum control amplitude and t 0 the effective width of the pulse. We will consider pulses that turn on at t = t i and turn off at t = t f . In this problem, the qubit levels are |0 and |1 , hence transitions to |2 represent leakage. We thus write and Note that at a formal level, the parameter λ plays the role of in Eq. (1). When the driving field is on resonance with the qubit transition frequency, i.e. δ = 0, the unitary evolution operator describing evolution under H 0 (t) (starting at a time t = t i ) is Here, erf(x) denotes the error function.
To implement the Magnus approach, we first write V (t) in the interaction picture defined by U 0 (t): We stress again that our choice of interaction picture is very different than that used in DRAG: our interaction picture is determined by the ideal time-dependent qubit Hamiltonian (including the driving field on the |0 1| transition). As such, the desirable qubit drive is not treated perturbatively. In contrast, DRAG effectively treats all time-dependent terms in Eq. (55) as a perturbation.
The leading order Magnus control correction W 1 (t) can now be conveniently found by using the derivative-based control approach [Eq. (21)]; the second order correction W 2 (t) follows from Eq. (25). We find and As discussed already, the W 1 (t) correction generically helps suppress pure leakage transitions that cause the final qubit state to occupy the |2 level, whereas the W 2 (t) correction suppresses phase errors occurring due to virtual transitions to |2 . We have assumed t i −1/κ 0 and t f 1/κ 0 such that κ(t i ) = κ(t f ) 0. While the above control corrections would ideally suppress the leading errors arising from leakage in our gate, they are not fully implementable in a typical experimental setup. In that case, the only available controls are the (complex) pulse envelope κ(t) and drive detuning δ(t) in Eq. (55). The control corrections we find differ from those predicted in standard first order DRAG theory, though DRAG encounters a similar issue: to fully implement the corrections required (c.f. Eq.(8) of Ref. [3]), one would e.g. need to be able to directly control the |2 0| transition, something not possible in standard setups.
In DRAG, one usually just truncates the unattainable terms in the control Hamiltonian. However, as already discussed in Sec. III E, simply truncating the ideal controls is not the best strategy for dealing with unachievable control corrections. Instead, one should tweak the form of the realizable parts of W 1 (t) and W 2 (t) to reduce the error; various strategies for doing this were presented in Sec. III E. This provides an alternate and rigorous explanation for the recent experimental result of Chen et al. in Ref. [1]. In that work, the authors effectively treated the amplitudes of the two kinds of DRAG corrections (quadrature control and detuning control) as free parameters. They demonstrated that by optimizing over their values, gate fidelities could be improved above what would be obtained by directly using the leadingorder DRAG correction [33]. In our approach, the need for such an optimization emerges naturally from the fact that one cannot implement all the terms in the control corrections W 1 (t) and W 2 (t).
In what follows, we given an explicit demonstration of how to use the iterative approach of Sec. III E to minimize errors arising from an imperfect implementation of the Magnus-based control Hamiltonians. We start by splitting the W 1 (t) correction given in Eq. (59) into its realizable part W ctrl 1 (t) and its remaining, unrealizable part W err 1 (t). Writing these terms in the interaction picture, we have (61) and (62) Iterating the integration-by-parts procedure described in Sec. III E 3 two times shows that we can capture some of the effects of W err 1I (t) by modifying the form of the attainable controls: W ctrl 1 (t) → W ctrl,B 1 (t). We find: +H.c.
(63) Note that this control is not written in the interaction picture, but in the same frame as the starting Hamiltonian of Eq. (55). We see that this control Hamiltonian only requires one to address the |1 ↔ |2 transition, and not the |0 ↔ |2 . However, in an experiment, one cannot drive the |1 ↔ |2 transition independently of the |0 ↔ |1 transition. Hence, the form of the control Hamiltonian in Eq. (63) that is actually implementable is: (64) While this control will help suppress leakage errors, the extra unwanted drive addressing the |0 ↔ |1 qubit transition will cause errors in our gate (phase errors). One thus needs to address this problem before proceeding further; this can be done using our general framework. The goal is to ensure that to first order in the Magnus expansion, the error progator (including our control correction) does not cause any extra evolution in the qubit subspace. To ensure this, we will add purely diagonal terms to our control correction: W ctrl,C  The relevant condition becomes: The needed diagonal control Hamiltonian W diag 1 (t) can be found using the derivative-based control protocol, yielding: (66) Note that this correction corresponds to a timedependent variation of the detuning δ(t) of the drive pulse, c.f. Eq. (55). The term in this control correction ∝ |2 2| is not strictly needed to satisfy Eq. (65), but was added so that the full correction is experimentally implementable, i.e. has the form of a simple detuning modification.
We can now summarize and give the final forms of the optimized, experimentally-implementable first order control Hamiltonian. It takes the form where the two terms on the RHS are given by Eqs. (64) and (66). Given the form of the first order control, we can immediately use Eq. (25) to find the second order control Hamiltonian: (68) Here, we have only kept the implementable part of W 2 (t). As a result, it also takes the form of a time-dependent detuning δ(t) of the pulse. Using a similar notation to that adopted in Ref. [3], and denoting the corrected version of the pulse envelope byκ(t), the final control sequence is given bỹ Without any corrections applied, the gate is achieved using a Gaussian envelope κ(t) of the form in Eq. (54), with t 0 = √ π/(4κ 0 ); the total gate time t f − t i is also chosen so that κ(t i ) = κ(t f ) 0. We see that, as expected, the ideal unconstrained Magnus corrections W ideal 1 (t) + W ideal 2 (t) yield the best fidelity improvements over the uncorrected pulse for the majority of parameters. For a target infidelity ofε = 10 −3 , these corrections allow one to reduce the gate time by a factor of ∼ 4. As shown, this ideal Magnus approach also yields significant improvements over the standard first order DRAG correction (including the first order detuning control).
In contrast, if we consider the experimentallyconstrained Magnus correction defined by Eqs. (67) and (68), we only find significant improvements over first order DRAG in the quasi-adiabatic regime where κ 0 /∆ is small. This is also not surprising given the underpinnings of the iterative approach used to derive these controls: it is only in the limit of a slow gate (compared to ∆) that this approach allows one to effectively mimic the full Magnus correction using only the experimentallyattainable control fields. Fig. 5 also reveals that in the extreme limit κ 0 /∆ → 0, the experimentally-constrained Magnus correction (blue curve) gives a slight advantage over the unconstrained Magnus correction (green curve). This might first seem surprising, as the ideal unconstrained correction eliminates the leading order fidelity error (order 2 ∝ λ 2 ), whereas the constrained correction only partially cancels this. While this is true, the integration-by-parts approach used to derive the constrained correction changes the scaling of the error in 1/∆. This results in a lower overall error in the case where κ 0 /∆ → 0 while λ stays fixed. Here, we sketch the situation where ω2 = ω3 = ω, the spurious couplings are comparable to η, and ω η.

C. Multiple-crossings Landau-Zener problem
We now apply our Magnus-based algorithm to a ubiquitous adiabatic passage protocol where there is no existing good method for fighting leakage: the so-called multiple-crossings Landau-Zener (MLZ) model [34,35]. This is a generalization of the well-known Landau-Zener-Stückelberg-Majorana (LZ) model [36][37][38][39] where in addition to the two desirable "qubit" states, several unwanted energy levels are present. Similar to the LZ model, the MLZ model describes a variety of important physical systems and control problems. This includes state transfer between the electronic spin of a nitrogen vacancy center (NV center) in diamond and the nuclear spin of nitrogen [40], and nuclear spin state preparation in selfassembled quantum dots [41].
The MLZ problem is especially difficult to address, as leakage errors persist even in the limit where the protocol speed becomes vanishingly small; this is in stark contrast to the STIRAP state-transfer problem discussed earlier. In the MLZ problem, couplings to spurious levels cause the very form of the adiabatic eigenstates of the full system to be corrupted by leakage. As a result, a perfect protocol is not realized even if one suppresses all non-adiabatic transitions. The MLZ problem cannot be corrected by only using STA techniques, thus illustrating two general limitations of such approaches: 1) they require exact diagonalization of the instantaneous Hamiltonian, something that is cumbersome for problems with large Hilbert spaces, and 2) STA techniques are only effective if the uncorrected Hamiltonian possesses a set of "good" instantaneous (adiabatic) eigenstates whose time-dependent form corresponds to the desired quantum evolution.

Landau-Zener state transfer
We start by reviewing adiabatic passage of a qubit within the LZ model. The Hamiltonian is [36][37][38][39] where α is the energy sweep rate, λ the coupling strength between diabatic states, andσ i (i = x, z) are Pauli operators. Theσ z eigenstates are denoted |0 , |1 , i.e.σ z = |1 1|−|0 0|. The adiabatic dynamics defined by Eq. (70) is best understood when introducing the dimensionless time τ = α/ t and dimensionless coupling η = λ/ √ α [42]. Within this framework, Eq. (70) becomes Consider a protocol where at τ = τ i 0 the system is initially prepared in its adiabatic ground, and then as time progresses, sweeps through the avoided level crossing (occurring at τ = 0). The protocol ends at a time τ = τ f 0. In the infinite-time limit characterized by η/ |τ i | , η/τ f 1, the probability that the system remains in the ground state is given by P ad = 1 − P LZSM , where the non-adiabatic transition probability P LZSM = exp[−πη 2 ]. In the limit η → ∞ (or equivalently α → 0, i.e. infinitesimally slow sweep rate), the evolution is perfectly adiabatic, and the system evolves adiabatically between |0 at τ i to |1 at τ f , Eq. (71) [see Fig. 6 (a)]. Depending on the physical situation, this can be viewed as having performed a NOT-gate on our qubit, or as having performed a state transfer operation (as in the NVsystem experiment of Ref. [40]).
If there are no other dynamically-relevant energy levels, one can use STA techniques to find various control correction Hamiltonians that suppress speed-constraints due to non-adiabatic transitions. The simplest control Hamiltonian is the "transitionless driving" (TD) Hamiltonian [7,9]: which was realized experimentally [43]. Another possibility would be to use the "superadiabatic transitionless driving" (SATD) Hamiltonian [8,10]: whose main advantage compared to W TD (τ ) is that it does not require aσ y coupling [something which is absent in the original, uncorrected Hamiltonian of Eq. (71)]. We stress that both of these corrections allow a perfect state transfer from |0 to |1 .

Addition of spurious coupled levels
Now consider the more general situation described by the multiple-crossings Landau-Zener [34,35] Here, H LZ (τ ) is given again by Eq. (71) and describes the desirable dynamics of the qubit levels |0 , |1 . In contrast, H LZ,aux (τ ) describes the dynamics within the leakage subspace, and V describes the undesirable coupling between qubit and leakage levels. We have We have assumed for simplicity that the spurious levels have an energy detuning rate that is the same as the qubit levels. ω 2 denotes the constant-in-time separation between the diabatic energies of the upward-moving qubit level |0 and upward-moving leakage level |2 . Similarly, ω 3 denotes the separation between the diabatic energies of the |1 and |3 states. Finally, the η ij denote couplings involving the leakage levels (including couplings to the qubit states). We only retain couplings between levels whose diabatic energies can cross.
We can immediately see from Eq. (75) that a simple adiabatic evolution will no longer allow us to perform the desired gate on the qubit levels (i.e. an effective NOT-gate on the qubit levels |0 , |1 ). In the ideal case, the adiabatic ground state of the full Hamiltonian should reduce to |0 at the start of the protocol and to |1 at the end of the protocol. The presence of the spurious levels will prevent this from being the case. For η/ |τ i | 1 but finite, the qubit state |0 and leakage state |2 will be weakly coupled to one another (via a virtual process involving the |3 state), leading the adiabatic ground state to have an appreciable and unwanted |2 component. Similarly, for η/τ f 1 but finite, the ground state will be an admixture of the qubit state |1 and leakage state |3 (see Fig. 6 (b) and Fig. 7).
Before proceeding, we stress that several experimental systems are described exactly by a Hamiltonian of the form of Eq. (74) (where both desirable and undesirable levels have diabatic energies changing linearly in time). For example, consider the experiment of Ref. [40], where the authors demonstrate the realization of a quantum memory using a NV center in diamond. Using adiabatic passage under the linear ramp of a magnetic field, they are able to transfer the state of the electronic spin of the NV center into the nuclear spin of nitrogen. In this system, it is the hyperfine interaction that allows one to perform state transfer between the two states of interest. However, imperfect magnetic field alignment implies that there are always unwanted coupling to other, unwanted electronic and nuclear spin states. This problem can be mapped exactly onto the multiple-crossings Landau-Zener problem described here.

Magnus-based control corrections
To derive control corrections for our MLZ system using the Magnus-based approach, we will take a two-step approch: • First, ignore the presence and coupling to the leakage levels, and use the SATD approach to cancel errors arising from non-adiabatic transitions between the adiabatic eigenstates of the qubit Hamiltonian H LZ (τ ). This involves using the control-correction Hamiltonian W SATD (τ ) given in Eq. (73).
• Next, include the coupling to the leakage levels. Use a Magnus-based correction to cancel the effects of V in Eq. (74).
We thus need to find corrections starting with the Hamiltonian Note that here, the Magnus-algorithm is only fighting errors arising from the coupling to the leakage levels; non-adiabatic errors within the qubit subspace are fully addressed using the known ideal control W SATD (τ ).
To apply the Magnus-approach, it is convenient to work in the so-called superadiabatic frame [44][45][46]   qubit subspace. One first moves to the adiabatic frame of the qubit Hamiltonian H LZ (τ ). One then finds the unitary that diagonalizes this adiabatic Hamiltonian at each time. Using this time-dependent unitary defines the final superadiabatic frame. In this frame, the Hamiltonian reads where the subscript "SAD" indicates an operator written in the superadiabatic frame. We have with |j = S † (τ )|j (j = 0, 1). The unitary operator S † (τ ) is the change of basis operator that takes one to the superadiabatic frame; it only acts non-trivially on the computational subspace. The operator S(τ ) is readily found since it corresponds diagonalizing a 2 × 2 matrix. The leakage Hamiltonian V becomes time-dependent in the superadiabatic frame. The matrix elements coupling leakage and qubit levels have the general form Explicit expressions for the matrix elements of Q V SAD (τ ) are readily found and are given in Appendix F. The V SAD (τ ) operator also couples leakage levels to one another, the form here is the same as in the original (lab) frame: Equation (77) has of course the general form of the leakage problem addressed in this paper. We can then proceed in applying our Magnus-based algorithm to minimize the effects of the leakage Hamiltonian V SAD (τ ). As always, the first step consists in finding a control correction W 1 (τ ) that averages away the pure leakage interaction described by Q V SAD (τ ). We will focus on a protocol that runs between τ = τ i 0 and τ = τ f 0. Since Q V SAD (τ i ) = 0 and Q V SAD (τ f ) = 0, we cannot rely on the derivative-based control to find W SAD,1 (τ ). We thus instead use the generating function approach (c.f. Sec. III B 2). This involves first picking a timedependent operator R(τ ) that is similar in form to the leakage operator V SAD (τ ), but which vanishes at τ = τ f . We take where the superoperator 0 takes us into the interaction picture generated by H 0,SAD [c.f. Eq. (5)], and We have essentially kept the parts ofV SAD (τ ) which vanish for large enough τ f . Next, we pick a control correction so that the first order expansion of the error propagator Ω (1) 1 (τ ) is equal to R(τ ) (and hence vanishes at large τ ). Using Eq. (23), we find where Q V SAD,I (τ ) = 0 Q V SAD (τ ) is the full leakage operator in the interaction picture. Given the form of the first order control correction Hamiltonian W 1 (τ ), the second order correction W 2 (τ ) follows immediately from Eq. (25).
In Fig. 8, we plot the fidelity error ε versus coupling η for a state transfer protocol where we start in state |0 at τ = τ i and evolve under H mc (τ ) (plus possible control corrections) until a time τ = τ f , with the goal of reaching |1 . We consider the extreme situation where all inter-level couplings (both desirable and undesirable) are the same: η = η 23 = η 03 = η 12 . We also have fixed ω 2 = ω 3 = 0.1 and taken τ i = −20 and τ f = 20. We see that without any corrections, the evolution generated by H mc (τ ) (black curve) never permits an error lower than 10 −1 . For fast protocols (small η) non-adiabatic errors limit the error, whereas for slow protocols (large η) the error is limited by the corruption of the adiabatic eigenstates by the leakage levels. While applying the qubitonly SATD correction gives improvements for a narrow range of η (orange curve), we see that the Magnus-based corrections yield better improvements over the range of η shown in Fig. 8. In particular, errors much lower than the 10 −3 level required for quantum error correction are possible.
In Fig. 9, we plot the controls predicted by the Magnusbased approach. As one can observe, the modified protocol requires both control of the detunings [diagonal elements of H (2) tot,ij (τ )] and couplings. Note also that the control fields are smooth functions of time.
Although our choice for Q Ω (1) 1 (t) [Eq. (81)] was not motivated by any experimental realization, the controls found here could in principle be implemented in a NVcenter in diamond in the manifold spanned by {|m s = −1, m n = 0 , |m s = −1, m n = +1 , |m s = 0, m n = 0 , |m s = 0, m n = −1 } ≡ {|0 , |2 , |3 , |1 }. Here m s labels the spin projection in the S = 1 ground state of the NV and m n labels the nuclear spin projection of the nitrogen. These controls are possible since the detuning is achieved by sweeping an external magnetic field [40], microwaves fields can be used to manipulate the transition |m s = −1, m n = 0 ↔ |m s = 0, m n = 0 [47], and the nuclear spin transitions could be addressed with the method developed in Ref. [48].

V. CONCLUSION
We have presented a systematic perturbative method based on the Magnus expansion that allows one to construct corrections to time-dependent control Hamiltonians in such a way that leakage errors during a quantum gate are minimized. The method is very general, and can be applied to problems where one does not have full control over the system. We have discussed how our method connects to both the well known techniques of DRAG and shortcuts to adiabaticity (STA), emphasize how in certain cases, it is able to overcome their limitations.

VI. ACKNOWLEDGEMENTS
We acknowledge funding from the University of Chicago Quantum Engineering program and the AFOSR MURI program. H. R. gratefully acknowledges funding from the Swiss SNF. Let us verify the claim in the main text, that the choice for W 2 (t) given in Eq. (24) explicitly cancels errors to order O[ 2 ]. To do this, we explicitly compute Ω (2) (t f ) to second order in V I (t). The first term in the Magnus expansion of Ω (2) (t) is given by where we used Eq. (19), Q 2 O(t) = Q O(t), and that W 2 (t) does not act on the leakage subspace, Q W 2 (t) = W 2 (t).           We have shown how to get control Hamiltonians based on the Magnus expansion. There is, however, an issue we have not addressed; namely the convergence problem. One would naturally expect the method to yield results whenever the Magnus expansion converges. In particular, the Magnus expansion for the modified error propagator U (j) err (which includes the first j control corrections W j (t)) is guaranteed to converge if the following condition holds [49,50] t 0 dt 1 V (j) with · 2 denoting the p = 2-norm. This condition allows one to make two strong statements about the Magnusbased algorithm: 1) it is possible to choose W k (t) such that Eq. (B1) is always fulfilled, 2) since the method is based on perturbation theory within a suitable interaction picture, there is always a parameter regime where Eq. (B1) holds.
Opposite to optimal control methods [2,51], we expect the Magnus-based algorithm to yield smooth control sequences. If V (t) is a smooth function of time (at least C 0 ), then it is always possible to find controls that are also smooth.

Impact on the fidelity
In this appendix, we show that implementing W 1 (t) [see Eq. (19)] predicted by the Magnus-based algorithm leads to an expression for the fidelity whose leading order correction scale as 4 .
First, we consider the state-dependent fidelity for the unitary evolution generated by H(t) + W 1 (t). For an initial state |ψ 0 is in H comp , we have where we have used the fact that U (t f ) = U 0 (t f ) U The result for the averaged-state fidelity follows from Eq. (B4), since the state-averaged fidelity is formally defined as [25] A similar calculation for H(t)+ W 1 (t)+ W 2 (t) shows that the leading order correction to the fidelity, both statedependent and state-averaged, scales like O( 6 ).
Note that the above equation is not specific to STIRAP and holds whenever Q V (t) = V (t). As explained in the main text, we could numerically solve Eq. (D1) to find the exact W 2,opt,ad (t). However, since leaking out of the dark state subspace is more harmful than a phase error, one can try to find an approximate solution to Eq. (D1) that only cancels pure leakage at the expense of a phase error.
(D4) The functions g 1 (t) and g 2 (t) are given by and = g 0 (t) + g 1 (t). (D6) Since g 0 (t f ) = 0, we can choose γ = −2/3 to cancel terms proportional to g 1 (t) and consequently suppress leakage at third order. With this choice, we have Ω (2,opt) (t f ) = 1 3 Ω (1) which shows that pure leakage errors are suppress to third order at the expense of a phase error. In this appendix, we derive an approximate expression for the infidelity of STIRAP using Gaussian pulse shapes. Such an expression can be obtained using the Magnus expansion to find an approximate time-evolution operator valid in the adiabatic regime (ν/G 0 < 1).
We look for solutions of the time-dependent Schrödinger equation defined by Eq. (45).
Using a first order Magnus expansion to approximate U I (t), we find Noticing that the transition probability from |1 to |3 is equivalent to the probability to remain in the dark state |d at t f , we have P |1 →|3 = P |d →|d = d| U I (t f )|d The state-dependent infidelity can then be expressed as Since t f is finite and t f ∝ 1/ν, ε exhibits an oscillatory behavior as a function of ν/G 0 .
In Fig. 10, we compare the infidelity as given by Eq. (E5) with the infidelity obtained through numerical integration of the Schrödinger equation defined by Eq. (35) with G p (t) and G s (t) given in Eq. (44).