Hidden Inverses: Coherent Error Cancellation at the Circuit Level

Coherent gate errors are a concern in many proposed quantum computing architectures. These errors can be effectively handled through composite pulse sequences for single-qubit gates; however, such techniques are less feasible for entangling operations. In this work, we benchmark our coherent errors by comparing the actual performance of composite single-qubit gates to the predicted performance based on characterization of individual single-qubit rotations. We then propose a compilation technique, which we refer to as hidden inverses, that creates circuits robust to these coherent errors. We present experimental data showing that these circuits suppress both overrotation and phase misalignment errors in our trapped ion system.

Here we present a method for canceling coherent errors without requiring any additional gates. We call this method hidden inverses because it uses the fact that quantum circuits often contain inverted gates, which are hidden due to the common use of self-adjoint unitary operations, for example, the Hadamard (H), controlled-NOT (CNOT), and Toffoli gates. Since these gates are constructed from physical operations, it can be beneficial to choose between different configurations. As an example, we could choose from two CNOT sequences, $\mathrm{CNOT}_a = ABC$ or $\mathrm{CNOT}_b = \mathrm{CNOT}_a^\dagger = C^\dagger B^\dagger A^\dagger$, where $A$, $B$, $C$, $A^\dagger$, $B^\dagger$, and $C^\dagger$ are native physical gates. The goal is to determine when to implement $\mathrm{CNOT}_a$ and when to implement $\mathrm{CNOT}_b = \mathrm{CNOT}_a^\dagger$. This decision will depend on both the neighboring gates and the underlying error process.
The paper is organized as follows: In Sec. II we introduce the idea of hidden inverses and discuss their application in a common quantum circuit structure, the parity-controlled Z rotation widely used to create high-weight Pauli operations [25,26]. In Sec. III, we give an overview of the trapped ion quantum computing experimental platform used for the demonstration of the hidden inverse technique. In Sec. IV, we present the experimental results and analyze the performance of hidden inverses. In Sec. V, we numerically compare hidden inverses to other methods for reducing coherent errors. Finally, in Sec. VI, we summarize the results and conclude.
* bichen.zhang@duke.edu, ken.brown@duke.edu
† Present Address: IonQ, Inc., College Park, MD 20740, USA
‡ Present Address: Google Research, Venice, CA 90291, USA

II. DESIGNING GATES TO CONTROL SYSTEMATIC ERROR
For solid-state and atomic quantum systems, quantum gates are performed by applying electromagnetic signals that generate a time-dependent Hamiltonian. For small systems, the advantage of a gate description is questionable and direct optimization of the pulse sequences to optimize the overall evolution is preferable [27-32]. Gates become a useful abstraction when we consider both larger systems and the challenge of calibration for multiple applications.
Let $U_0$ be the ideal gate and $U_a$ be an instance of the gate. $U_a$ could be a completely positive trace-preserving map, but we invoke the Stinespring dilation theorem to say that all instances of the gate differ from the ideal gate only by a unitary operation that is potentially on a larger Hilbert space. We consider both the left and right error operators $V$: $U_a = U_0 V_{R,a} = V_{L,a} U_0$. We are interested in the case where $U_a$ is a good approximation of $U_0$, and therefore $V_R$ and $V_L$ must be close to the identity.
The system controller does not control the noise but chooses only which signals to imperfectly apply. We further break the unitary label into a control label $a$ and an error label $\epsilon$, $U_{a,\epsilon}$. The question that we are exploring is whether it makes sense for the controller to have multiple versions of $U$ or only a single version and, given multiple versions, what the methods for compilation are.
A very common motif in quantum mechanics and quantum algorithms is a unitary $W$ that is transformed by another unitary $U$, $\tilde{W} = U W U^\dagger$. We should find methods for implementing $U_a$ and $U_b^\dagger$ such that, when averaged over the noise instances $\epsilon$ and $\delta$, $U_{a,\epsilon} W U_{b,\delta}^\dagger$ is as close to $\tilde{W}$ as possible. This will work perfectly if $V_{R,a,\epsilon} W V_{L,b,\delta}^\dagger = W$. The ideal cases are either when there is no noise or when $W$ commutes with the $V$'s and $V_{R,a,\epsilon}^\dagger = V_{L,b,\delta}^\dagger$. In many cases $W$ itself is a small rotation, for example, in a Trotter series, and $V_{R,a,\epsilon}^\dagger = V_{L,b,\delta}^\dagger$ will still yield a reduction in error.
Given a quantum circuit with the structure $\tilde{W} = U W U^\dagger$, one can choose the appropriate $U_a$ and $U_b^\dagger$ for every $W$. We limit ourselves in this study to the case of "hidden inverses" where $U = U^\dagger$, which occurs regularly in quantum circuits, for example, when operations are conjugated by CNOTs. Our key observation is that in many physical systems, a CNOT is constructed from a series of system-specific gates, whose inverses are readily available by changing the sign of the control field [33,34]. In these cases, $\mathrm{CNOT}_a$ corresponds to the regular order of gates with the regular control fields, and $\mathrm{CNOT}_b$ inverts the sequence and inverts the control field. It is the driven Hermitian conjugate of $\mathrm{CNOT}_a$. This extends to any unitary that is its own inverse.
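The cancellation argument can be illustrated with a single-qubit toy model. This is a minimal sketch under our own assumptions, not a circuit from the experiment: the self-inverse gate is an X gate driven as a π pulse with a fractional overrotation `eps`, and its driven inverse is the same pulse with the sign of the control field flipped. All function names are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rot(pauli, angle):
    """Return exp(-i * angle / 2 * pauli) for a single-qubit Pauli matrix."""
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * pauli


def entanglement_fidelity(u, v):
    """F_e = |Tr(U^dagger V)|^2 / d^2 (insensitive to a global phase)."""
    d = u.shape[0]
    return abs(np.trace(u.conj().T @ v)) ** 2 / d ** 2


def ideal(theta):
    """Ideal conjugation X . Z(theta) . X."""
    return X @ rot(Z, theta) @ X


def conjugated_rotation(theta, eps, hidden):
    """Noisy conjugation built from overrotated pi pulses (assumed error model).

    hidden=False: both pulses use the same control field.
    hidden=True:  the second pulse uses the inverted field (driven inverse).
    """
    first = rot(X, np.pi * (1 + eps))
    sign = -1.0 if hidden else 1.0
    second = rot(X, sign * np.pi * (1 + eps))
    return second @ rot(Z, theta) @ first


eps = 0.02
f_hidden = entanglement_fidelity(ideal(0.0), conjugated_rotation(0.0, eps, True))
f_standard = entanglement_fidelity(ideal(0.0), conjugated_rotation(0.0, eps, False))
print(f_hidden, f_standard)
```

In this toy model the hidden inverse cancels the overrotation exactly at θ = 0, while the standard choice doubles it, giving an entanglement fidelity of cos²(π·eps).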

A. Parity-controlled rotations and theoretical error models
A common place where self-inverses arise is when multiple CNOT(c, t)s are used to conjugate a single-qubit rotation to generate a multiqubit unitary. This structure is common in quantum simulation and quantum optimization algorithms [26,35]. To generate an n-qubit weight Pauli operation, we can build a circuit that performs a single-qubit Z rotation, $Z(\theta) = \exp\left(-i\frac{\theta}{2}Z\right)$, whose direction is conditioned on the parity of n − 1 other qubits (Eq. 1).
We expect the CNOTs that come before the rotation and the CNOTs that come after the rotation should differ depending on the error model. At this point, we need to introduce a physical decomposition of the CNOTs and an error model to proceed. Many error models can be considered. In the main text, we limit ourselves to a model where the entangling operation is generated by a single two-qubit Pauli operator that is decorated with single-qubit gates, but we also consider the gate Hamiltonian model introduced in Ref. [36] in Appendix A. For concreteness we further specialize to an XX-type interaction common in Mølmer-Sørensen (MS) gates in trapped ions [37]. For this case, the CNOT gate can be performed in either its standard or hidden inverse configurations as displayed in Fig. 1 with ion trap gates $XX(\theta) = \exp(-i\theta XX)$.
Fig. 1. Standard and Hermitian conjugated decompositions of the CNOT gate with native trapped-ion quantum operations consisting of single-qubit gates and Mølmer-Sørensen interactions.
The choice of configuration is nontrivial since CNOT gates are synthesized from a two-qubit gate and multiple single-qubit gates [33], which are subject to systematic overrotations and phase errors in addition to stochastic noise.
We first consider a simplified error model where only the two-qubit Mølmer-Sørensen gates have errors, each with the same overrotation by a fraction $\epsilon$. We calculate the average gate fidelity, $F$, from the entanglement fidelity of two unitaries $U$ and $V$, $F_e = |\mathrm{Tr}[U^\dagger V]|^2/4^n$, for the (n−1)-qubit parity-controlled rotation (Eq. 1) as a function of θ, $\epsilon$, and n, where $F_+$ is the hidden inverse configuration fidelity and $F_-$ is the standard configuration fidelity. We note that $F_+(\theta, \epsilon, n) = F_-(\theta + \pi, \epsilon, n)$, and we can obtain the best fidelity by choosing hidden inverses for |θ| ≤ π/2 and the standard circuit for |θ| > π/2 for θ ∈ [−π, π]. At θ = 0 and θ = π, there is an exponential difference in the fidelity between the two choices as a function of n. For small $\epsilon$ and small angle deviation, $\vartheta$, around these ideal points of 0 and π, the fidelity for the correct sequence choice drops as $(n-1)(\pi/4)^2\epsilon^2\vartheta^2$, while for the incorrect sequence the fidelity drops as $(n-1)(\pi/4)^2\epsilon^2(4-\vartheta^2)$. We only achieve perfect cancellation at the ideal points, but we benefit from our choice as long as θ is close to the ideal point.
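The n = 2 case of this simplified model can be checked numerically. The sketch below is illustrative only: the CNOT decomposition in `cnot_native` is one standard XX-based construction that we assume for concreteness (it is not necessarily the sequence of Fig. 1), the overrotation is applied only to the XX gate, and the conversion from entanglement fidelity to average gate fidelity uses the standard relation F = (d F_e + 1)/(d + 1) with d = 2^n. All names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
# Ideal CNOT with the control as the first (left) tensor factor.
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)


def rot(op, angle):
    """exp(-i * angle / 2 * op) for any operator with op @ op = identity."""
    return np.cos(angle / 2) * np.eye(op.shape[0]) - 1j * np.sin(angle / 2) * op


def xx(chi):
    """XX(chi) = exp(-i * chi * X (x) X)."""
    return rot(np.kron(X, X), 2 * chi)


def cnot_native(eps, sign=1):
    """CNOT from native pulses (assumed decomposition), XX overrotated by eps.

    sign=+1 is the standard sequence; sign=-1 negates every pulse angle and
    reverses the pulse order, i.e. the driven Hermitian conjugate.
    """
    s = sign
    pulses = [  # listed in time order for sign=+1
        np.kron(rot(Y, -s * np.pi / 2), I2),
        xx(s * np.pi / 4 * (1 + eps)),
        np.kron(rot(Y, s * np.pi / 2), I2),
        np.kron(I2, rot(X, s * np.pi / 2)),
        np.kron(rot(Z, s * np.pi / 2), I2),
    ]
    if s < 0:
        pulses.reverse()
    u = np.eye(4, dtype=complex)
    for p in pulses:
        u = p @ u
    return u


def ideal(theta):
    return CNOT @ np.kron(I2, rot(Z, theta)) @ CNOT


def parity_rotation(theta, eps, hidden):
    """Noisy CNOT . Z(theta) . CNOT; hidden=True uses the driven inverse second."""
    first = cnot_native(eps, +1)
    second = cnot_native(eps, -1 if hidden else +1)
    return second @ np.kron(I2, rot(Z, theta)) @ first


def entanglement_fidelity(u, v):
    d = u.shape[0]
    return abs(np.trace(u.conj().T @ v)) ** 2 / d ** 2


def average_gate_fidelity(u, v):
    d = u.shape[0]
    return (d * entanglement_fidelity(u, v) + 1) / (d + 1)


eps = 0.02
for theta in (0.0, np.pi / 4, np.pi / 2, np.pi):
    f_plus = average_gate_fidelity(ideal(theta), parity_rotation(theta, eps, True))
    f_minus = average_gate_fidelity(ideal(theta), parity_rotation(theta, eps, False))
    print(f"theta={theta:.3f}  hidden={f_plus:.6f}  standard={f_minus:.6f}")
```

Within this assumed model the two configurations swap roles between θ = 0 and θ = π, which is the F₊(θ) = F₋(θ + π) symmetry described above.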
With this simplified error model in mind, we consider additional errors starting with overrotation errors on all gates, phase-misalignment errors, and then a consideration of these systematic errors with additional stochastic errors. Phase misalignment occurs because two-qubit and single-qubit gates are driven by different fields and mechanisms. The Z basis is well defined by the energy eigenbasis of the undriven system Hamiltonian. X and Y in the rotating frame simply differ by a phase and a common experimental challenge is to align the X in a two-qubit XX interaction with the single qubit X interaction.
In Fig. 2, we examine how these errors affect the circuit fidelity of implementing Eq. 1 using the standard circuit with only CNOT and the hidden inverse circuit using CNOT and CNOT† from Fig. 1. These choices impact circuit performance when each gate is subject to either only overrotation error [Fig. 2(a)] or only a phase misalignment between single- and two-qubit gates [Fig. 2(b)].
In both Fig. 2(a) and Fig. 2(b), the hidden inverse configuration outperforms the standard one considerably when the Z(θ) rotation angle is small. This can be understood by noting that a small-angle rotation is near the identity, so the systematic errors are approximately canceled. We further note that the oscillatory behavior suggests the need for compilation tools to determine which CNOTs should be inverted in more complex circuits. The amplitude of the oscillation is positively correlated with the number of control qubits, showing that the effect of these choices increases with circuit size. Due to the ubiquity of CNOT conjugations about single-qubit rotations in quantum algorithms, we expect that a peephole-style optimization [38] would work well when slowly drifting overrotation or phase offset errors dominate.
Fig. 2. (a) Average gate fidelities of parity-controlled Z rotation circuits according to Eq. 1 with standard (dashed lines) and hidden inverse (solid lines) configurations for circuit width n ∈ {2, 4, ..., 10}, with each two-qubit gate subject to $\epsilon_{2Q} = 2\%$ overrotation and each single-qubit gate subject to $\epsilon_{1Q} = 0.2\%$ overrotation. The hidden inverse configuration outperforms the standard one when the absolute value of the Z rotation angle is less than approximately π/2. (b) Average gate fidelities of parity-controlled Z rotation circuits with $\phi_\mathrm{diff} = 3.5^\circ$ phase misalignment. The performance of the hidden inverse configuration is higher than or equal to that of the standard one regardless of θ. This shows the scalability of the hidden inverse technique. (c), (d) The average gate-fidelity difference between the hidden inverse and standard configurations when the circuit width is n = 2. The warm (cool) color shows the area where the hidden inverse configuration outperforms (underperforms) the standard one. Curves in black represent the boundary where the fidelity difference is zero. When the coherent error is small, fidelities of the standard configuration surpass the hidden inverse configuration by approximately $10^{-5}$ due to the imperfect motional state coherence.
In any real system, there will be multiple error types and the potential advantage can differ. Here we examine numerically the n = 2 case using a detailed ion-trap error model. For the simulations shown in Fig. 2(c) and (d), we consider the systematic overrotation error, phase misalignment, and all dominant stochastic error sources in our experimental system [39], including laser dephasing error, motional dephasing error, and motional heating. We use a master equation in Lindblad form to simulate the open quantum system. The dominant stochastic error sources are described by corresponding collapse operators, while the coherent error sources are represented by parameter offsets in the Hamiltonian. Further details can be found in Appendix B.
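The split between collapse operators (stochastic errors) and Hamiltonian parameter offsets (coherent errors) can be sketched with a minimal Lindblad integrator. This is a toy single-qubit illustration, not the ion-trap model of Appendix B: the drive, the dephasing rate `gamma`, the overrotation `eps`, and the fixed-step Runge-Kutta integrator are all assumptions chosen for brevity.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def lindblad_rhs(rho, hamiltonian, collapse_ops):
    """d(rho)/dt = -i[H, rho] + sum_k (C rho C^dag - {C^dag C, rho}/2), hbar = 1."""
    drho = -1j * (hamiltonian @ rho - rho @ hamiltonian)
    for c in collapse_ops:
        cdc = c.conj().T @ c
        drho = drho + c @ rho @ c.conj().T - 0.5 * (cdc @ rho + rho @ cdc)
    return drho


def evolve(rho0, hamiltonian, collapse_ops, t_final, steps=2000):
    """Integrate the master equation with fixed-step fourth-order Runge-Kutta."""
    rho = np.array(rho0, dtype=complex)
    dt = t_final / steps
    for _ in range(steps):
        k1 = lindblad_rhs(rho, hamiltonian, collapse_ops)
        k2 = lindblad_rhs(rho + 0.5 * dt * k1, hamiltonian, collapse_ops)
        k3 = lindblad_rhs(rho + 0.5 * dt * k2, hamiltonian, collapse_ops)
        k4 = lindblad_rhs(rho + dt * k3, hamiltonian, collapse_ops)
        rho = rho + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return rho


omega, eps, gamma = 1.0, 0.02, 0.1       # illustrative values only
h_drive = 0.5 * omega * (1 + eps) * X    # coherent error: Rabi-frequency offset
dephasing = [np.sqrt(gamma) * Z]         # stochastic error: collapse operator

# Free evolution of |+><+| under dephasing alone.
rho_plus = 0.5 * np.ones((2, 2), dtype=complex)
rho_free = evolve(rho_plus, np.zeros((2, 2)), dephasing, t_final=1.0)

# A nominal pi pulse on |0><0| with both error types present.
rho_driven = evolve(np.diag([1.0, 0.0]), h_drive, dephasing, t_final=np.pi / omega)
print(abs(rho_free[0, 1]), rho_driven[1, 1].real)
```

For the dephasing-only run, the coherence decays as exp(−2·gamma·t), which gives a simple analytic check on the integrator.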
For zero phase-alignment error, we compare the relative fidelity difference between standard circuits and hidden inverse circuits as we change the overrotation angle in Fig. 2(c). We see that the broad feature of the overrotation data in Fig. 2(a) is preserved and hidden inverses perform well for |θ| < π/2. For small systematic errors, the difference in fidelity between the two circuits and the ideal circuit is less than $10^{-5}$, while the circuit fidelities are close to $1 - 10^{-2}$. Fig. 2(d) shows that the hidden inverse configuration suppresses phase misalignment in almost all the area of interest, even with additional stochastic noise. In the small region where the hidden inverse configuration exaggerates the phase misalignment, the fidelity difference is at the level of $10^{-5}$. Unlike for overrotation errors, where the advantage of the hidden inverse sequence requires the Z(θ) rotation angle to be small, the hidden inverse configuration provides a fidelity improvement for most phase-misalignment errors given our gate model.

III. EXPERIMENT IMPLEMENTATION OF AN ARBITRARY QUANTUM CIRCUIT
In the experiment, a linear chain of $^{171}\mathrm{Yb}^+$ ions is trapped 70 µm above the surface of a microfabricated surface trap made by Sandia National Laboratories. The |0⟩ and |1⟩ states of the qubit are encoded in the hyperfine ground states, $^2S_{1/2}\,|F = 0, m_F = 0\rangle$ and $^2S_{1/2}\,|F = 1, m_F = 0\rangle$, respectively. A 369.5-nm laser is used to Doppler cool, electromagnetically-induced-transparency (EIT) cool, and prepare the ions in the |0⟩ state. State detection is performed through state-dependent fluorescence by resonantly exciting the $^2S_{1/2}\,|F = 1\rangle$ to $^2P_{1/2}\,|F = 0\rangle$ transition and collecting the emitted photons [40,41]. The scattered photons are imaged with a 0.6 numerical-aperture lens and coupled into a linear array of multimode fibers with 100-µm-diameter cores [42]. Each fiber in the array is connected to an individual photomultiplier tube, allowing for individual qubit readout. For the following experiments the Doppler cooling, EIT cooling, state initialization, and state detection take 1 ms, 500 µs, 15 µs, and 300 µs, respectively. Stimulated Raman transitions using a 355-nm picosecond pulsed laser drive single-qubit and two-qubit gates [43-45]. An elliptical beam addresses all qubits in the chain simultaneously while two tightly focused beams perpendicular to the elliptical beam individually address the two qubits [46]. Steering of each individual beam over the ion chain is accomplished by a pair of micro-electromechanical systems (MEMS) mirrors, each tilting in orthogonal directions. The number of atomic qubits in our trapped-ion quantum processor is limited to 13 by the steering range of the MEMS mirrors. In this paper, a two-ion chain (two-qubit circuits) and a five-ion chain (four-qubit circuits) are used to prove the principle. The beams pass through acousto-optic modulators (AOMs) driven by a radio frequency system on chip (RFSoC), which provides the ability to change the amplitude, frequency, and phase of each beam.
The RFSoC firmware is provided by the Sandia National Laboratories QSCOUT project [47]. By controlling the duration of the pulse and the phase of one of the two Raman beams we can perform arbitrary single-qubit rotations, R(θ, φ). Two-qubit gates are implemented using the Mølmer-Sørensen scheme [48]. Frequency modulation (FM) of the Raman beams is performed in order to robustly disentangle the qubit states from all of the motional modes [39,49-51]. Further details of the setup can be found in Ref. [39].
One universal gate set of our system contains the Mølmer-Sørensen [XX(π/4)] gate, X(θ) gates, and arbitrary Z rotations. Arbitrary Z(θ) rotations are implemented in a virtual way by accumulating a −θ phase in the subsequent gate operations. Indeed, Y(θ) gates are simply phase-shifted X(θ) gates.
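The virtual Z rotation can be illustrated with a short numerical identity check. This is a sketch under an assumed convention, R(θ, φ) = exp[−i(θ/2)(cos φ X + sin φ Y)], which is a common definition but is not spelled out in the text above; the function names are hypothetical.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)


def r(theta, phi):
    """Assumed single-qubit pulse: rotation by theta about cos(phi) X + sin(phi) Y."""
    axis = np.cos(phi) * X + np.sin(phi) * Y
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * axis


def rz(alpha):
    """Z(alpha) = exp(-i * alpha / 2 * Z)."""
    return np.diag([np.exp(-1j * alpha / 2), np.exp(1j * alpha / 2)])


theta, phi, alpha = 0.7, 0.3, 1.1

# Physical ordering: Z(alpha) is applied first, then the pulse R(theta, phi).
physical = r(theta, phi) @ rz(alpha)

# Virtual ordering: shift the pulse phase by -alpha and push Z(alpha) to the end,
# where it has no effect on a Z-basis measurement.
virtual = rz(alpha) @ r(theta, phi - alpha)

print(np.allclose(physical, virtual))

# A Y rotation is an X rotation with its phase shifted by pi/2.
y_rotation = np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * Y
print(np.allclose(r(theta, np.pi / 2), y_rotation))
```

Because the deferred Z(α) commutes with a Z-basis measurement, only the phase bookkeeping of later pulses is needed in practice.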
We implement the Mølmer-Sørensen gate in a spin-phase-sensitive configuration [45]. In this configuration, the rotation axis is not exactly aligned with the XX axis due to mechanical fluctuation in the optical path of the Raman beams. Because of this intrinsic phase instability and the slight difference in the ac Stark shift between single- and two-qubit gates, it is necessary to calibrate the phase between these gates. The calibration is done using a parity measurement: We initialize the qubits in the |00⟩ state and implement an XX(π/4) gate on them. Then we apply a single-qubit π/2 rotation on both qubits. The phase φ of the single-qubit gates is varied from 0 to 2π. Finally, both qubits are measured in the Z basis. The measured parity P is fitted to a sinusoid, $P = A\cos(\phi_0 + 2\phi)$, to extract the phase offset $\phi_0$ from the parity-measurement results. In experiments, we observe that the phase offset drifts by as much as 4° in a two-qubit system within several hours. Right after calibration, we can reduce the misalignment to as low as approximately 0.2°.
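The sinusoidal fit used in this calibration can be sketched as a linear least-squares problem, since $A\cos(\phi_0 + 2\phi) = a\cos 2\phi + b\sin 2\phi$ with $a = A\cos\phi_0$ and $b = -A\sin\phi_0$. The data below are synthetic and noise-free, generated only to exercise the fit; the amplitude, offset, and function name are illustrative assumptions rather than experimental values or the authors' fitting code.

```python
import numpy as np


def fit_parity(phi, parity):
    """Fit parity = A * cos(phi0 + 2 * phi) and return (A, phi0).

    Uses the linear form a*cos(2 phi) + b*sin(2 phi) with
    a = A cos(phi0) and b = -A sin(phi0).
    """
    design = np.column_stack([np.cos(2 * phi), np.sin(2 * phi)])
    (a, b), *_ = np.linalg.lstsq(design, parity, rcond=None)
    amplitude = np.hypot(a, b)
    phi0 = np.arctan2(-b, a)
    return amplitude, phi0


# Synthetic scan of the single-qubit gate phase from 0 to 2*pi.
phi = np.linspace(0.0, 2 * np.pi, 41)
true_amplitude, true_phi0 = 0.97, np.deg2rad(4.0)
parity = true_amplitude * np.cos(true_phi0 + 2 * phi)

amplitude, phi0 = fit_parity(phi, parity)
print(amplitude, np.rad2deg(phi0))
```

With real data the same fit would be applied to measured parities, and the recovered offset would then be compensated in the single-qubit gate phase.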
Because of the limits of our ability to stabilize laser intensity and phase at the ion, we know that there will be systematic errors between gates. For single-qubit gates, we use gate set tomography (GST) on both direct quantum pulses and composite quantum pulses to characterize systematic errors. The GST results imply that time-varying overrotations exist in our system and that they are stable for times > 1 ms. Details of the single-qubit GST experiment can be found in Appendix C.

IV. HIDDEN INVERSE EXPERIMENTAL PERFORMANCE
The base circuit for the hidden inverse experiment is the portion highlighted by the dashed box in Fig. 3(c). A CNOT gate is performed followed by a Z(θ) rotation on the target qubit [33,39]. The second CNOT gate is applied either with the same phase as the first (CNOT) or with a phase shift of π relative to the first (CNOT†). The latter configuration is the hidden inverse case. We reverse the gate sequence order and each element gate's sign in the CNOT decomposition to conform to the Hermitian adjoint's antidistributive property. The base circuit is repeated five times to amplify the two-qubit gate overrotation error and phase-misalignment error between single-qubit and two-qubit gates, which are the dominant coherent error sources in the circuit. We note that, for the repeated circuits, we cannot experimentally distinguish cancellation of CNOT and CNOT† errors across Z(θ) from cancellation with the next CNOT. However, the circuit in Fig. 3(c) is needed to amplify the error.
Two separate sets of experiments are conducted to characterize the two-qubit gate overrotations, the phase misalignment between single-qubit and two-qubit gates, and the effectiveness of the hidden inverse scheme. In both sets of experiments, the two-qubit gate fidelity is approximately 99.4% before injecting the coherent errors. The Z(θ) rotation angle is varied. For the first set of experiments, we introduce an $\epsilon_{2Q} = 2.25 \pm 0.04\%$ two-qubit gate overrotation error into the circuits and keep the phase misalignment as small as possible ($\phi_\mathrm{diff} = -0.23 \pm 0.17^\circ$). The circuit in Fig. 3(c) is implemented with and without the hidden inverses to quantify the suppression of overrotation errors. The system can be seen to suffer significantly from two-qubit overrotation error. Fig. 4(a1) shows the probability of detecting the |00⟩, |10⟩ and |01⟩, and |11⟩ states at the end of the circuit with the hidden inverses, and Fig. 4(a2) shows the results without the hidden inverses. The solid lines indicate fitted simulation results with two free variables, the overrotation error of the XX gates and the phase misalignment between single-qubit and two-qubit gates. When hidden inverses are used, the contrast of the |00⟩ population is improved, and the residual population in the odd-parity states is significantly reduced. This indicates suppression of overrotation errors from the XX gates. Using the theoretical model and fitting results, final state fidelities of the circuits in both configurations are estimated. As shown in Fig. 4(a3), the final state fidelities are improved from approximately 85% to 95% by the use of hidden inverses. As the Z(θ) rotation angle increases, the improvement from hidden inverses decreases.
In the second set of experiments, we introduce a phase-misalignment error of $\phi_\mathrm{diff} = 3.89 \pm 0.09^\circ$ and minimize the two-qubit gate overrotation ($\epsilon_{2Q} = -0.05 \pm 0.22\%$). We implement the circuit in both configurations to examine the suppression of phase-misalignment errors for the hidden inverse circuit. Fig. 4(b1) and (b2) show the results of the circuits with and without hidden inverses, respectively, for the set of experiments when phase misalignment is dominant. With the hidden inverse configuration, along with the improved contrast of the |00⟩ population and the reduced odd-parity population, the curves regain symmetry about the 0° Z rotation. This shows a correction of the phase misalignment between single-qubit gates and two-qubit gates. Fig. 4(b3) shows the estimated final state fidelities of the circuits in both configurations. The fidelities are improved from approximately 84% to 95%. In the case of phase misalignment, we note that the improvement from hidden inverses fades away much more slowly than in the case of overrotation error as the Z(θ) rotation angle increases. This agrees with the analysis in Sec. II.
Limited by the systematic error drifts in the experimental system and finite calibration time, we are not able to minimize all coherent errors at the same time. A trade-off between amplitude error and phase error exists. After the most "ideal" calibration, we observe overrotation $\epsilon_{2Q} = 1.45(6)\%$ and phase misalignment $\phi_\mathrm{diff} = -0.9(1)^\circ$. Data presented in Fig. 4(c) show that a clear fidelity improvement from the hidden inverse configuration is observed in the most "ideal" condition of our system. We note that the fitting for all two-qubit circuit results is done utilizing the error model described in Appendix B.
Lastly, we extend the multiqubit parity-controlled Z circuit to width n = 4, which is illustrated in Fig. 5(a). We note that the n = 4 experiments are done in a five-ion chain, with one edge ion qubit idling during the experiment. With two individual addressing beams, we access the two additional ion qubits by steering one addressing beam with MEMS mirrors [39]. XX gates for all three ion pairs are calibrated separately. The average CNOT gate fidelity is approximately 90%. We attribute this fidelity reduction to increased optical crosstalk (> 3%), optical power loss from MEMS mirrors at large steering angles, and other error sources to be investigated. Similarly, varying the rotation angle θ, we measure final state probabilities for all 16 computational basis states. Fig. 5(b) and (c) present the final state results utilizing hidden inverses and the standard configuration, respectively. By suppressing overrotation error, the hidden inverse configuration improves the contrast of $|0\rangle^{\otimes 4}$ from approximately 0.40 to approximately 0.47 and suppresses the average residual population of states other than $|0\rangle^{\otimes 4}$ and $|1\rangle^{\otimes 4}$ from approximately 0.55 to approximately 0.50. Moreover, hidden inverses help the data points regain symmetry about θ = 0, which indicates a correction against phase misalignment. We fit the four-qubit circuit results with a model consisting of coherent error (parameter offsets) and stochastic error (depolarizing channels). The qubits go through a depolarizing channel after every two-qubit gate: with probability p = 0.87, the state remains the same, while with probability 1 − p = 0.13, the state collapses to a totally mixed state. From the fitting, where we assume all two-qubit gates experience the same noise channel, we estimate the overrotation $\epsilon_{2Q} \approx 5\%$ and the phase misalignment $\phi_\mathrm{diff} \approx -8^\circ$. Due to the strong stochastic noise from interbeam crosstalk [52], hidden inverses provide only a limited improvement but still at no experimental cost.
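The depolarizing channel described above can be written in a few lines. This is a minimal sketch in which the channel acts on the whole four-qubit register (an assumption on our part, since the text does not state which qubits are depolarized), and the count of six applications is purely illustrative.

```python
import numpy as np


def depolarize(rho, p):
    """With probability p keep rho; with probability 1 - p replace it by I/d."""
    d = rho.shape[0]
    return p * rho + (1 - p) * np.trace(rho) * np.eye(d) / d


n_qubits = 4
d = 2 ** n_qubits
rho0 = np.zeros((d, d), dtype=complex)
rho0[0, 0] = 1.0          # the state |0000><0000|

p = 0.87                  # survival probability quoted in the text
n_gates = 6               # illustrative number of two-qubit gates

rho = rho0
for _ in range(n_gates):
    rho = depolarize(rho, p)

# Population left in |0000> after n_gates applications of the channel.
population_0000 = rho[0, 0].real
print(population_0000)
```

After k applications the state is $p^k \rho_0 + (1 - p^k) I/d$, so the stochastic part sets a ceiling on the contrast that coherent-error cancellation alone cannot exceed.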

V. ALTERNATIVE METHODS FOR REDUCING SYSTEMATIC ERRORS
Systematic errors can be reduced in a number of ways, and we briefly compare our method with other techniques in the context of the experiment. Hidden inverses work well in experiments with multiple CNOTs and with compatible systematic errors, even with some stochastic noise. The technique cannot be as powerful as total circuit optimization, but it provides a local control solution that can be applied to any quantum computer without additional time overhead.

A. Two-qubit Solovay-Kitaev-1 (SK1) composite pulses
Composite pulses developed for single-qubit gates to fix overrotations can be used to reduce overrotations in two-qubit gates using an isomorphism between one-qubit Pauli operators and a subgroup of two-qubit Pauli operators [53,54]. Previous calculations of hidden inverses built from composite two-qubit pulses were shown to greatly reduce circuit error in theory when the only error is gate overrotation [55]. In practice, we have not seen an experimental advantage for these pulses. SK1 adds two additional π MS gates resulting in a gate that is 3 times longer.
We numerically consider the implementation of SK1 sequences for the MS gate [55] using a simplified error model to understand why these methods do not provide an advantage. The average gate-fidelity difference is presented in Fig. 6(a). When we consider only overrotation error (coherent) and motional heating (stochastic), we find that, for an overrotation error of around 1% to 2%, the motional heating rate would need to be as low as 20 quanta per second for the SK1 sequence to improve gate performance. This is one order of magnitude lower than the heating rate in this system. When we consider all stochastic and coherent error sources in our system, SK1 sequences are predicted to severely limit the fidelity as shown in Fig. 6(b). All dominant stochastic error sources in our experimental system are considered, including laser dephasing error, motional dephasing error, and motional heating.

B. Randomized compiling
Randomized compiling (RC) [14] is a protocol for converting coherent errors into stochastic errors. RC introduces independent random single-qubit gates into a circuit such that in the absence of noise, the overall ideal unitary remains the same. In the presence of noise, RC twirls the error channel into a stochastic Pauli channel. RC improves circuit results by preventing the worst-case cumulative errors and simplifies the prediction of algorithmic performance by reducing the complexity of the error model.
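The twirling step can be illustrated on a single qubit. The sketch below is not the RC protocol itself: it takes the exact average over the four Paulis (the limit that RC approaches when many random circuits are sampled) for an assumed coherent X overrotation by an angle `delta`, and compares the result with a stochastic bit-flip channel. All names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]


def coherent_error(delta):
    """Assumed coherent error: a small unwanted X rotation exp(-i * delta / 2 * X)."""
    return np.cos(delta / 2) * I2 - 1j * np.sin(delta / 2) * X


def twirled_channel(rho, u):
    """Exact Pauli twirl of rho -> u rho u^dagger (average over all four Paulis)."""
    out = np.zeros_like(rho)
    for p in PAULIS:
        v = p.conj().T @ u @ p
        out = out + v @ rho @ v.conj().T
    return out / len(PAULIS)


def bit_flip_channel(rho, q):
    """Stochastic Pauli channel: apply X with probability q."""
    return (1 - q) * rho + q * (X @ rho @ X)


delta = 0.1
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]], dtype=complex)

twirled = twirled_channel(rho, coherent_error(delta))
stochastic = bit_flip_channel(rho, np.sin(delta / 2) ** 2)
print(np.allclose(twirled, stochastic))
```

The twirled error is a bit flip with probability sin²(delta/2), so repeated errors add in probability rather than in amplitude.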
In order to compare the performance of our hidden inverse protocol with randomized compiling, we numerically simulate each protocol on unitaries from Eq. 1 under three different noise models: detuning error, overrotation, and phase misalignment. For the randomized compiling part, we sample 100 equivalent circuits for each value of θ (in Z(θ) from Eq. 1) and take the average gate fidelity of this ensemble. The average gate fidelity comparison is presented in Fig. 7. We find that hidden inverse configurations provide a benefit over randomized compiling when the noise orientation of the error model is inverted with the inverse gate [Fig. 7(b), (c)]. For errors that do not invert with the inverted gate controls, such as a detuning error, randomized compiling limits the coherent error accumulation, providing a clear benefit over hidden inverses [Fig. 7(a)].

C. Hardware-specific compilation
Hidden inverses are developed in the context of a gate model of quantum computation. These gates need to be mapped onto a physical system, and there are multiple software and hardware layers between the user and the device. As a result, the operator of the quantum computer often prefers to compile any algorithm to the most hardware-efficient form to yield the highest overall fidelity.
Our running example circuit in this paper is the multiqubit parity-controlled Z rotation described in Eq. 1. We show one way to map it to ion trap hardware, but there are many hardware-specific ways to generate the same functionality. A clear example is the base n = 2 circuit of a Z rotation by angle θ surrounded by two CNOTs, which together contain two XX(π/4) gates. This can be replaced by a single XX(θ/2) gate surrounded by single-qubit gates. Given that two-qubit gates are typically noisier than single-qubit gates, this transformation is experimentally useful for quantum systems with Ising-type two-qubit couplings from nuclear magnetic resonance [56] to trapped ions [57]. The cost here is that one needs to calibrate the two-qubit gate for multiple angles, which is inherently more error prone than the control of the Z rotation, which in the experiment is only advancing a digital phase. We recognize that calibration may be less of a concern for near-term variational algorithms given the mismatch between algorithm performance at the ideal angle versus the programmed angle [58].
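The replacement of the n = 2 circuit by a single XX(θ/2) gate can be verified numerically. In the sketch below, the dressing by Y rotations is our own choice of single-qubit gates that maps XX onto ZZ; other equivalent choices exist, and the helper names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)


def rot(op, angle):
    """exp(-i * angle / 2 * op) for any operator with op @ op = identity."""
    return np.cos(angle / 2) * np.eye(op.shape[0]) - 1j * np.sin(angle / 2) * op


def xx(chi):
    """XX(chi) = exp(-i * chi * X (x) X), the convention used in the text."""
    return rot(np.kron(X, X), 2 * chi)


def cnot_conjugated_rotation(theta):
    """CNOT . Z(theta) on the target . CNOT."""
    return CNOT @ np.kron(I2, rot(Z, theta)) @ CNOT


def single_xx_version(theta):
    """One XX(theta / 2) gate dressed with Y rotations on both qubits."""
    pre = np.kron(rot(Y, np.pi / 2), rot(Y, np.pi / 2))      # applied first
    post = np.kron(rot(Y, -np.pi / 2), rot(Y, -np.pi / 2))   # applied last
    return post @ xx(theta / 2) @ pre


theta = 0.37
print(np.allclose(cnot_conjugated_rotation(theta), single_xx_version(theta)))
```

Both circuits implement exp(−i θ Z⊗Z/2), but the second uses one variable-angle entangling gate instead of two fixed XX(π/4) gates, which is the calibration trade-off discussed above.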
For n > 2, the additional CNOTs could still benefit from hidden inverses, even if we change the internal primitive. In some ion trap systems [59], the natural multiqubit interaction is a global Mølmer-Sørensen. In this case, there could be a further reduction of the time complexity of the overall procedure. We have not considered this case in detail since our micro-mirror system is not compatible with a global Mølmer-Sørensen gate.

D. Total optimization
Various noise-adaptive compilers have been proposed recently in the literature. They include aggregation of multiple logical operations into larger units [28], mapping and optimization of high-level quantum programs based on hardware specifications [29], and using machine learning and variational algorithms to develop noise-resilient circuits [60,61]. While these methods outperform standard compilers for near-term devices with a few qubits and short depths, they are not expected to scale efficiently to be useful in large-scale fault-tolerant machines without truncation. Hidden inverses, on the other hand, take advantage of local optimization and can be efficiently included in compilers for larger quantum systems.

VI. CONCLUSIONS AND OUTLOOK
Slowly varying experimental noise sources can either be corrected by frequent calibrations or by introducing circuit-level protections such as composite pulses and hidden inverses. By recognizing sets of gates that are self-adjoint, we can compile a circuit to cancel out coherent errors as long as the drift occurs at a timescale slower than the time between the two gates. We demonstrate a reduction of overrotations and phase misalignment for CNOT gates in an ion-trap system without changing the circuit length. Overall, these low-cost circuit compilation schemes provide a robust platform for reducing systematic error and have already been shown theoretically to provide an advantage for quantum chemistry circuits [62].
Hidden inverses can be applied to any system where the gates are derived from flexible pulse control. Hidden inverses can be further expanded to include gates that are only inverses on subspaces. From this viewpoint, we can reconsider the cancellation of coherent errors in stabilizer measurements by stabilizer slicing [18] as a hidden inverse on the logical subspace. Hidden inverses also show the utility of having multiple versions of the same basic gate for improving circuit performance in the presence of systematic errors and suggest alternative user interfaces for quantum computers between a static set of gates and full pulse control.

ACKNOWLEDGMENTS
The authors thank Erik Nielsen for helping with all pyGSTi-related queries. This work is supported by the Office of the Director of National Intelligence - Intelligence Advanced Research Projects Activity through ARO Contract No. W911NF-16-1-0082 (experimental implementation), National Science Foundation Expeditions in Computing Award 1730104 (n-qubit simulation), National Science Foundation STAQ Project No. Phy-181891 (trapped-ion control sequences), the U.S. Department of Energy (DOE), Office of Advanced Scientific Computing Research award DE-SC0019294 (hidden inverse protocol), and DOE Basic Energy Sciences Award DE-0019449 (experimental analysis). S.M. is funded in part by an NSF QISE-NET fellowship (DMR-1747426).

Appendix A: UNITARY CNOT ERROR MODEL
We consider the direct implementation of CNOT by a Hamiltonian [36] in the context of hidden inverses.
Here we consider the parity-controlled Z rotation and calculate the average gate fidelity by calculating the entanglement fidelity.
The average gate fidelity between two unitary operations $U$ and $V$ on $n$ qubits is
$$F(U, V) = \frac{2^n F_e(U, V) + 1}{2^n + 1},$$
where $F_e(U, V)$ is the entanglement fidelity. In this case, the ideal unitary for an $(n-1)$-qubit parity-controlled rotation of the target qubit $n$ is [...], where $\Pi_{\psi,a}$ projects qubit $a$ to the state $\psi$. This allows us to write [...]. We can write [...], where
$$\tilde{X}_n = \cos(\theta) X_n + \sin(\theta) \bigotimes_{j=1}^{n} Z_j Y_n$$
and we define $X(\theta) = \cos(\theta) X + \sin(\theta) Y$.
Due to the projectors, each string of bits on the first $n$ qubits generates a residual unitary operation on the target qubit that depends only on the Hamming weight $w$, the number of 1's in the bit string. The parity of the bit string determines the sign of $\theta$ in $X(\theta)$. Putting it all together we have [...], where $\mathrm{Tr}_2$ is the trace over a two-dimensional space and
$$B_\pm(w, \theta) = 2\left[\cos^2\left(\frac{w\epsilon}{2}\right) \pm \cos(\theta) \sin^2\left(\frac{w\epsilon}{2}\right)\right]. \quad \text{(A13)}$$
We note that $B_+(w, \theta) = B_-(w, \theta + \pi)$. This leads to the maximum fidelities occurring at different $\theta$ and to the fidelities oscillating $\pi$ out of phase. We find for hidden inverses
$$F_e(U, V_+)|_{\theta=0} = 1 \quad \text{and} \quad F_e(U, V_+)|_{\theta=\pi} = \cos((n-1)\epsilon/2) \cos(\epsilon/2)^{2(n-1)},$$
and for the standard configuration
$$F_e(U, V_-)|_{\theta=0} = \frac{1}{4}\left[1 + 2\cos((n-1)\epsilon) \cos(\epsilon)^{n-1} + \cos(\epsilon)^{2(n-1)}\right] \quad \text{and} \quad F_e(U, V_-)|_{\theta=\pi} = \cos(\epsilon/2)^{2(n-1)},$$
using the mathematical identity [...].
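As a minimal numerical sketch of the fidelity measures used in this appendix, the snippet below evaluates the entanglement fidelity $F_e(U,V) = |\mathrm{Tr}(U^\dagger V)|^2/d^2$ and the average gate fidelity $F = (d F_e + 1)/(d + 1)$ for a pair of CNOT gates that should multiply to the identity. It assumes, for illustration only, that the CNOT is generated as $\exp[\mp i\pi(1+\epsilon)P]$ with $P = |1\rangle\langle 1| \otimes |-\rangle\langle -|$ and a fractional overrotation $\epsilon$; this Hamiltonian form, the value of `eps`, and all function names are assumptions rather than the model of Ref. [36].

```python
import numpy as np

def entanglement_fidelity(U, V):
    """F_e(U, V) = |Tr(U^dagger V)|^2 / d^2 for two unitaries."""
    d = U.shape[0]
    return abs(np.trace(U.conj().T @ V)) ** 2 / d**2

def average_gate_fidelity(U, V):
    """F = (d * F_e + 1) / (d + 1)."""
    d = U.shape[0]
    return (d * entanglement_fidelity(U, V) + 1) / (d + 1)

# Projector P = |1><1| (x) |-><-|, chosen so that exp(-i*pi*P) = I - 2P = CNOT.
P1 = np.array([[0, 0], [0, 1]], dtype=complex)
P_minus = 0.5 * np.array([[1, -1], [-1, 1]], dtype=complex)
P = np.kron(P1, P_minus)

def cnot_pulse(eps, sign=+1):
    """exp(-i * sign * pi * (1 + eps) * P), evaluated in closed form since P @ P == P."""
    phase = np.exp(-1j * sign * np.pi * (1 + eps))
    return np.eye(4, dtype=complex) + (phase - 1) * P

eps = 0.05  # hypothetical 5% overrotation
identity = np.eye(4, dtype=complex)

# Standard configuration: the same physical sequence is applied twice.
standard = cnot_pulse(eps, +1) @ cnot_pulse(eps, +1)
# Hidden inverse: the second CNOT is implemented with the inverted sequence.
hidden = cnot_pulse(eps, -1) @ cnot_pulse(eps, +1)

f_standard = average_gate_fidelity(identity, standard)
f_hidden = average_gate_fidelity(identity, hidden)
print(f"standard pair: {f_standard:.5f}, hidden-inverse pair: {f_hidden:.5f}")
```

In this toy model the hidden-inverse pair cancels the overrotation exactly, while the standard pair accumulates twice the error; gates placed between the two CNOTs, as in the parity-controlled rotation, make the cancellation depend on $\theta$.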

Appendix B: MS GATE ERROR MODEL
The MS gate error model can be found in the Supplementary Material of Ref [39]. We present it here for convenience. We make some updates for the error model to simulate quantum circuits efficiently.
The Hamiltonian of the MS evolution of the $j$th motional mode with no modulation is written as [37,48,63] [...], where $\Omega^{(i)}_{r}$ and $\Omega^{(i)}_{b}$ ($i = 1, 2$) are the Rabi frequencies of the red and blue sideband transitions for the two target ions, $\delta^{(i)}_{j,r}$ and $\delta^{(i)}_{j,b}$ are the detunings for the $j$th motional mode, and $\phi_r$ and $\phi_b$ are the laser phases of the red and blue tones, respectively. With the expansion in Eq. (B1), we can simulate a number of error mechanisms: power imbalance on the two target ions, power imbalance on the red and blue tones, and detuning imbalance due to Stark shift. For the full MS evolution, the modes are simulated sequentially to minimize computational resources. We save only the spin-state result for the next round of simulation. The Hamiltonians of different modes commute when $\Omega^{(1)}_{r} = \Omega^{(1)}_{b}$ and $\Omega^{(2)}_{r} = \Omega^{(2)}_{b}$, which is a reasonable assumption in the MS gate. For the evolution of discrete segments in FM gates, we sequentially simulate every segment to obtain the final state.
We use a master equation [64] to simulate an open quantum system considering multiple dissipative error mechanisms: motional heating, motional dephasing, and laser dephasing. The master equation is written in Lindblad form [65] (with $\hbar = 1$)
$$\frac{d\rho}{dt} = -i[H, \rho] + \sum_j \left(\hat{L}_j \rho \hat{L}_j^\dagger - \frac{1}{2}\left\{\hat{L}_j^\dagger \hat{L}_j, \rho\right\}\right),$$
where $\rho$ is the density matrix of the system, $H$ is the Hamiltonian of the MS gate, and $\hat{L}_j$ is the Lindblad operator for the $j$th decoherence process. The motional dephasing can be described by a Lindblad operator of the form $\hat{L}_m = \sqrt{2/\tau_m}\,\hat{a}^\dagger \hat{a}$, where $\tau_m$ is the motional coherence time. The anomalous heating can be described by $\hat{L}_+ = \sqrt{\Gamma}\,\hat{a}^\dagger$ and $\hat{L}_- = \sqrt{\Gamma}\,\hat{a}$, where $\Gamma$ is the heating rate. For these two operators, we sequentially simulate the evolution of each mode, then combine them to obtain the final state. The master-equation simulations use the full density matrix for a truncated state space of two qubits and one motional mode, restricted to the first 13 Fock states ($n \le 12$). The laser dephasing can be described by the Lindblad operator $\hat{L}_l = \sqrt{1/\tau_l}\,(\sigma^{(1)}_z + \sigma^{(2)}_z)$, where $\tau_l$ is the laser coherence time. For this Lindblad operator, we perform a full master-equation simulation with all motional modes and spin states included. We truncate the far off-resonance motional modes, which have a smaller motional excitation, to fewer Fock states to save on computational resources. For the stochastic noise, we also combine the simulation with the Monte Carlo method. The simulations are performed using QuTiP [66].
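To make the structure of the dissipative terms concrete, the following sketch evaluates the Lindblad right-hand side for a single motional mode truncated to 13 Fock states, with the motional-dephasing and heating operators described above. It is a plain NumPy illustration rather than the QuTiP simulation used in this work; the values of `tau_m` and `gamma`, the placeholder Hamiltonian `H`, and the function name `lindblad_rhs` are assumptions, and the spin degrees of freedom are omitted.

```python
import numpy as np

N = 13  # Fock states n = 0..12, matching the truncation quoted in the text
a = np.diag(np.sqrt(np.arange(1, N)), k=1).astype(complex)  # annihilation operator
adag = a.conj().T
num = adag @ a  # number operator

tau_m = 5e-3   # hypothetical motional coherence time (s)
gamma = 100.0  # hypothetical heating rate (quanta/s)

collapse_ops = [
    np.sqrt(2 / tau_m) * num,  # motional dephasing
    np.sqrt(gamma) * adag,     # heating, phonon gain
    np.sqrt(gamma) * a,        # heating, phonon loss
]

def lindblad_rhs(rho, H, ops):
    """d(rho)/dt = -i[H, rho] + sum_j (L rho L^dag - {L^dag L, rho} / 2), hbar = 1."""
    out = -1j * (H @ rho - rho @ H)
    for L in ops:
        LdL = L.conj().T @ L
        out += L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)
    return out

H = 2 * np.pi * 1e3 * num  # placeholder mode Hamiltonian, not the MS Hamiltonian
rho0 = np.zeros((N, N), dtype=complex)
rho0[0, 0] = 1.0  # motional ground state

drho = lindblad_rhs(rho0, H, collapse_ops)
heating_slope = np.trace(num @ drho).real  # d<n>/dt at t = 0
print(f"d<n>/dt = {heating_slope:.1f} quanta/s")
```

Starting from the motional ground state, the initial slope of the mean phonon number equals the heating rate, and the trace of the derivative vanishes, which are two quick consistency checks on the operator definitions.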
FIG. 8. We compare the average gate fidelity of three instances of X(π/2) (red) and Y(π/2) (blue) gates. The three cases considered are direct characterization of raw gates, direct characterization of SK1 gates, and predicted characterization of SK1 gates based on the results from GST on the raw gates. The boxplot displays the minimum, the maximum, the sample median, and the first and third quartiles of the dataset.
To avoid solving master equations whenever we encounter an MS gate in the quantum circuit, we calculate the Pauli transfer matrices (PTMs) before simulating the circuit. The PTM is represented as
$$(R_\Lambda)_{ij} = \frac{1}{d} \mathrm{Tr}\left[P_i \Lambda(P_j)\right],$$
where $P_i$ is the Pauli basis, $d = 2^n$, $n$ is the number of qubits, and $\Lambda$ is the linear map [67]. $\Lambda(P_j)$ is equivalent to applying the master-equation simulation on Pauli basis $P_j$. Single-qubit gates suffer from negligible stochastic noise. Therefore, we represent them with corresponding quantum operation matrices subject to minor coherent errors. In superoperator formalism, a quantum circuit comprised of quantum maps (the Mølmer-Sørensen gates and the single-qubit rotations) is equivalent to matrix multiplication of the corresponding PTMs and can be calculated efficiently.
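The PTM construction and the composition rule can be sketched as follows. The example builds the PTM of an arbitrary linear map by applying it to each Pauli operator, using a unitary single-qubit rotation as a stand-in for the master-equation propagation, and checks that two X(π/2) PTMs multiply to the X(π) PTM. The helper names (`pauli_basis`, `ptm`, `unitary_channel`, `rx`) are hypothetical and the code is an illustration, not the simulation code used here.

```python
import itertools
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_basis(n):
    """All 4**n n-qubit Pauli strings as dense matrices, ordered I, X, Y, Z."""
    basis = []
    for combo in itertools.product([I2, X, Y, Z], repeat=n):
        op = np.array([[1.0 + 0j]])
        for p in combo:
            op = np.kron(op, p)
        basis.append(op)
    return basis

def ptm(channel, n):
    """R_ij = Tr[P_i channel(P_j)] / d for a linear map acting on d x d matrices."""
    paulis = pauli_basis(n)
    d = 2**n
    R = np.empty((4**n, 4**n))
    for j, Pj in enumerate(paulis):
        out = channel(Pj)  # in the text, this step is a master-equation simulation
        for i, Pi in enumerate(paulis):
            R[i, j] = np.trace(Pi @ out).real / d
    return R

def unitary_channel(U):
    """Noise-free stand-in for a simulated gate: rho -> U rho U^dagger."""
    return lambda rho: U @ rho @ U.conj().T

def rx(theta):
    """Single-qubit rotation exp(-i * theta * X / 2)."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * X

R_half = ptm(unitary_channel(rx(np.pi / 2)), 1)
R_full = ptm(unitary_channel(rx(np.pi)), 1)
composed = R_half @ R_half  # circuit = product of PTMs, later gates on the left
print(np.round(composed, 6))
```

Because each PTM is computed once, a circuit with many repeated gates costs only matrix products of size $4^n \times 4^n$ at simulation time.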

Appendix C: GATE SET TOMOGRAPHY FOR SINGLE-QUBIT GATES
We design an experiment to measure the performance of SK1 gates and test how well GST predicts their performance. The experiment serves as a preliminary systematic error characterization. First, we run GST on a gate set composed of the SK1 compiled gates $\{X_{\mathrm{SK1}}(\pi/2), Y_{\mathrm{SK1}}(\pi/2)\}$, followed by an experiment where we run GST on a gate set comprised of the raw gates that generate SK1 sequences, $\{X(\pi/2), Y(\pi/2), \mathrm{SK1}^+_X(2\pi), \mathrm{SK1}^-_X(2\pi), \mathrm{SK1}^+_Y(2\pi), \mathrm{SK1}^-_Y(2\pi)\}$. GST produces a completely positive trace-preserving map for each gate, represented as a PTM.
We calculate the fidelity of the SK1 gates and the raw gates from these PTMs. The PTMs allow us to calculate any fidelity and we choose the average gate fidelity, $F(U, \mathcal{E}) = \int d\psi \, \langle\psi| U^\dagger \mathcal{E}(|\psi\rangle\langle\psi|) U |\psi\rangle$, where $U$ is the ideal gate and $\mathcal{E}$ is the actual gate [68]. From the GST PTMs, we calculate a fidelity for SK1 X(π/2) and SK1 Y(π/2) of 0.99936(5) and 0.99927(3), respectively, while the fidelity for raw-compiled X(π/2) is 0.9982(1) and for Y(π/2) is 0.9985(2). We see a clear improvement in fidelity due to the SK1 composite pulses. Also, smaller error bars in the calculated fidelity of SK1 gates indicate that the gates are more uniform. The estimated error generator for each gate, which is a Lindbladian-type operator $L$ that acts after the ideal gate ($G = e^{L} G_0$), describes how the gate is failing to match the target. Specifically, the Hamiltonian projection of this error generator produces the coherent part of the error. We find that SK1 turns approximately 1% overrotation into approximately 0.01% overrotation as expected.
We then combine the PTMs obtained from the raw-pulse GST to construct SK1 X(π/2) and Y (π/2) gate PTMs. Notice that the constructed PTMs are significantly different from the direct SK1 gate PTMs obtained from composite pulse GST. The fidelity for the predicted SK1 X(π/2) gate is 0.9917(4), and for the Y (π/2) gate it is 0.9931(2). Fig. 8 contains box plots of the calculated fidelities. It indicates that raw-pulse GST predicts SK1 composite pulses degrade gate fidelities. This result contradicts the experiment, where SK1 does improve the gate performance. This discrepancy can be explained by an overrotation error that is slowly varying. The raw pulse GST averages over the time-varying overrotations, yielding a PTM that describes average raw pulses for which SK1 would not be useful. Simulations readily reproduce this behavior.
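A toy simulation of this effect, under stated assumptions, is sketched below. Each raw pulse is modeled as an ideal Bloch-sphere rotation whose angle is scaled by $(1 + \epsilon)$, with $\epsilon$ drifting slowly over a hypothetical range of ±2%; the SK1 sequence is taken as $R(2\pi, \phi - \phi_1) R(2\pi, \phi + \phi_1) R(\theta, \phi)$ with $\phi_1 = \arccos(-\theta/4\pi)$. The drift range, the uniform sampling, and all function names are illustrative assumptions and are not fitted to the experimental data.

```python
import numpy as np

def rot(theta, phi):
    """Bloch-sphere rotation by theta about the axis (cos phi, sin phi, 0)."""
    nx, ny = np.cos(phi), np.sin(phi)
    K = np.array([[0.0, 0.0, ny], [0.0, 0.0, -nx], [-ny, nx, 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def avg_fidelity(R_ideal, R_actual):
    """Single-qubit average gate fidelity from the 3x3 Bloch blocks of two PTMs."""
    f_e = (1 + np.trace(R_ideal.T @ R_actual)) / 4
    return (2 * f_e + 1) / 3

def sk1(theta, phi, pulse):
    """SK1 sequence; pulse(angle, phase) returns the Bloch matrix of one raw pulse."""
    phi1 = np.arccos(-theta / (4 * np.pi))
    return pulse(2 * np.pi, phi - phi1) @ pulse(2 * np.pi, phi + phi1) @ pulse(theta, phi)

theta, phi = np.pi / 2, 0.0
ideal = rot(theta, phi)
drift = np.linspace(-0.02, 0.02, 41)  # hypothetical slow drift of the overrotation

# Raw gate and actual SK1 gate: all pulses within one sequence see the same drift value.
f_raw = np.mean([avg_fidelity(ideal, rot(theta * (1 + e), phi)) for e in drift])
f_sk1 = np.mean([
    avg_fidelity(ideal, sk1(theta, phi, lambda t, p, e=e: rot(t * (1 + e), p)))
    for e in drift
])

# Prediction from drift-averaged raw pulses, mimicking one time-averaged PTM per raw gate.
def averaged_pulse(t, p):
    return np.mean([rot(t * (1 + e), p) for e in drift], axis=0)

f_pred = avg_fidelity(ideal, sk1(theta, phi, averaged_pulse))
print(f"raw: {f_raw:.6f}  SK1: {f_sk1:.8f}  predicted SK1: {f_pred:.6f}")
```

In this model the averaged raw PTMs look partly incoherent, so concatenating them predicts a worse gate than the raw pulse, whereas the actual SK1 sequence, in which every pulse shares the same instantaneous error, outperforms the raw gate.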
We use pyGSTi (version 0.9.9.1) [25] for all GST-related work. This section explains the experimental design and data analysis for characterizing SK1 gates and their elementary rotations.
Experiment design.-The experimental circuits are generated by pyGSTi's fiducial and germ selection algorithms. Fiducial sequences are used to prepare and measure an informationally complete set of operations. Germs are designed to amplify all possible gate errors. Given a set of operations (also called the gate set), we use the algorithms to generate the appropriate fiducials and germs. Our gate sets are $\{X_{\mathrm{SK1}}(\pi/2), Y_{\mathrm{SK1}}(\pi/2)\}$ and $\{X(\pi/2), Y(\pi/2), \mathrm{SK1}^+_X(2\pi), \mathrm{SK1}^-_X(2\pi), \mathrm{SK1}^+_Y(2\pi), \mathrm{SK1}^-_Y(2\pi)\}$. With fiducials and germs in hand, we choose the length of the experiments (the number of times to repeat each germ between fiducial pairs) as L = 256 and L = 32, respectively. The experiment lengths differ for the two gate sets because raw gates are noisier than composite pulse gates, so raw gates reach a noise level similar to that of composite pulse gates with less noise amplification.
Data analysis.-We run standard GST as implemented in pyGSTi. Results for the gate set $\{X_{\mathrm{SK1}}(\pi/2), Y_{\mathrm{SK1}}(\pi/2)\}$ follow directly from the output provided by GST (other than error bars, which we discuss next). For the gate set $\{X(\pi/2), Y(\pi/2), \mathrm{SK1}^+_X(2\pi), \mathrm{SK1}^-_X(2\pi), \mathrm{SK1}^+_Y(2\pi), \mathrm{SK1}^-_Y(2\pi)\}$, we get the PTMs for the elementary rotations directly from GST. We calculate the predicted SK1 gate PTMs through matrix multiplication of the elementary rotation PTMs, $R_{\mathrm{SK1}}(\theta, \phi) = \mathrm{SK1}^-_R(2\pi)\,\mathrm{SK1}^+_R(2\pi)\,R(\pi/2)$, where $R \in \{X, Y\}$. To generate the error bars on the calculated fidelity metrics, we use a nonparametric bootstrapping technique from pyGSTi. We take the final estimate from running standard GST as the target model for generating nonparametric bootstrapping samples and then run gauge optimization on these raw bootstrapped models to generate our final set of models. Error bars are calculated from the standard deviation of the average gate-fidelity metrics on the set. GST provides information on "goodness of fit," i.e., how well the GST estimate fits the data, to provide confidence in the data analysis. A rating scale from 1 to 5 summarizes various statistical measures. For both gate sets, the experiments receive a score higher than 4, indicating a good fit. The noise in the PTMs can be better understood using projections of the gate error generators. These are Lindbladian-like operators generated by projecting the error generator into some subspace. We are primarily concerned with the Hamiltonian projection, which produces the coherent error. We use built-in pyGSTi functions to calculate these projections and deduce the amount of overrotation error.
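For intuition about how an overrotation can be read off a PTM, the sketch below handles the simplest case of a purely coherent error. It forms the error map $E = G G_0^{-1}$, consistent with an error that acts after the ideal gate, and extracts its rotation angle from the trace of the 3×3 Bloch block. This is a simplified stand-in for the Hamiltonian projection, not the pyGSTi routine used in the analysis; the function names and the 1% example value are assumptions.

```python
import numpy as np

def ptm_rx(theta):
    """Pauli transfer matrix of an X rotation by theta, basis order (I, X, Y, Z)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, c, -s],
        [0, 0, s, c],
    ], dtype=float)

def coherent_error_angle(G, G0):
    """Unsigned rotation angle of E = G G0^{-1}, assuming E is a pure rotation."""
    E = G @ np.linalg.inv(G0)
    cos_angle = (np.trace(E[1:, 1:]) - 1) / 2
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))

target = np.pi / 2
G0 = ptm_rx(target)
G = ptm_rx(target * 1.01)  # hypothetical gate with a 1% overrotation

fractional = coherent_error_angle(G, G0) / target
print(f"fractional overrotation: {fractional:.4f}")
```

For a PTM estimated from data, the error map also contains stochastic components, which is why a projection of the full error generator is needed in practice instead of this trace formula.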