Realization of high-fidelity CZ and ZZ-free iSWAP gates with a tunable coupler

High-fidelity two-qubit gates at scale are a key requirement to realize the full promise of quantum computation and simulation. The advent and use of coupler elements to tunably control two-qubit interactions has improved operational fidelity in many-qubit systems by reducing parasitic coupling and frequency crowding issues. Nonetheless, two-qubit gate errors still limit the capability of near-term quantum applications. The reason, in part, is the existing framework for tunable couplers based on the dispersive approximation does not fully incorporate three-body multi-level dynamics, which is essential for addressing coherent leakage to the coupler and parasitic longitudinal ($ZZ$) interactions during two-qubit gates. Here, we present a systematic approach that goes beyond the dispersive approximation to exploit the engineered level structure of the coupler and optimize its control. Using this approach, we experimentally demonstrate CZ and $ZZ$-free iSWAP gates with two-qubit interaction fidelities of $99.76 \pm 0.07$% and $99.87 \pm 0.23$%, respectively, which are close to their $T_1$ limits.

High-fidelity two-qubit gates at scale are a key requirement to realize the full promise of quantum computation and simulation. The advent and use of coupler elements to tunably control two-qubit interactions has improved operational fidelity in many-qubit systems by reducing parasitic coupling and frequency crowding issues. Nonetheless, two-qubit gate errors still limit the capability of nearterm quantum applications. The reason, in part, is the existing framework for tunable couplers based on the dispersive approximation does not fully incorporate three-body multi-level dynamics, which is essential for addressing coherent leakage to the coupler and parasitic longitudinal (ZZ) interactions during two-qubit gates. Here, we present a systematic approach that goes beyond the dispersive approximation to exploit the engineered level structure of the coupler and optimize its control. Using this approach, we experimentally demonstrate CZ and ZZ-free iSWAP gates with two-qubit interaction fidelities of 99.76 ± 0.07% and 99.87 ± 0.23%, respectively, which are close to their T1 limits.
To further improve the fidelity of coupler-mediated entangling gates, a systematic approach for optimizing control and level-structure of the coupler is required. However, the existing theoretical framework based on the perturbative approach, which assumes a dispersive qubit-coupler interaction [13], has several limitations. First, when performing fast two-qubit gates, the qubitcoupler coupling generally enters into the non-or weaklydispersive regime. Therefore, the perturbative approach breaks down and coherent energy exchange between the qubit and coupler arises, which is not captured within * youngkyu@mit.edu † william.oliver@mit.edu the existing framework. In other words, theoretical treatments are simplified at the cost of overlooking coherent leakage to the coupler -non-adiabatic error -when performing fast two-qubit gates. Furthermore, the perturbative treatment of tunable couplers disregards the presence of higher levels of the coupler [13]. This is a significant omission; the higher level of the coupler participates in the multi-level dynamics of two-qubit gates, and thereby, adds a considerable amount of residual two-qubit interactions.
In this paper, we engineer the control and levelstructure of the coupler by going beyond the dispersive approximation in order to realize high-fidelity twoqubit gates. We implement both longitudinal (CZ) and transversal (iSWAP) two-qubit gates; the availability of both type of gates generally reduces gate overheads of NISQ algorithms [3,24]. We propose an intuitive, yet systematic approach for optimizing control to suppress coherent leakage to the coupler. Via optimized control, we significantly reduce the non-adiabatic error of a 60 ns-long CZ gate, thereby demonstrating a two-qubit interaction fidelity of 99.76 ± 0.07%. in interleaved randomized benchmarking. We also address a fundamental issue of the iSWAP gate when coupling two transmon qubits: parasitic ZZ interaction due to their higher levels [10,17,[25][26][27]. We successfully suppress the residual ZZ interaction of the iSWAP gate in a passive manner, by exploiting the engineered coupler level structure and demonstrate a two-qubit interaction fidelity of 99.87 ± 0.23% with a 30 ns gate duration.
We shall consider a pairwise interacting three-body quantum system, in which each constituent body is a multi-level anharmonic oscillator (Fig. 1a)  bits are encoded in the first two levels of the leftmost and rightmost anharmonic oscillators with resonant frequencies ω 1 and ω 2 , respectively. The middle anharmonic oscillator serves as the coupler. These two distant qubits and the coupler are coupled through exchange-type interactions with coupling strengths g 1c , g 2c , and g 12 . We assume the qubit-coupler interactions to be much stronger than the direct qubit-qubit interaction g 1c = g 2c g 12 . This is the case for our device, and is a practical parameter regime for tunable couplers, in general [13]. We approximate the qubits and the coupler as Duffing oscillators, a common model for anharmonic multi-level qubit systems such as the transmon [28] and the C-shunt flux qubit [29]. Thus, the system Hamiltonian can be written as follows ( ≡ 1), where b † i and b i (i, j ∈ {1, 2, c}) are, respectively, the raising and lowering operators defined in the eigenbasis of the corresponding oscillators. The level anharmonicity of each oscillator is denoted by η i . As shown in Ref. [13], the destructive interference between the coupler-mediated and direct qubit-qubit couplings enables the resulting net qubit-qubit coupling to be turned on and off by adjusting the coupler frequency ω c .
We realize this pairwise interacting three-body system in a circuit quantum electrodynamics setup [30,31] using three capacitively coupled transmons (Figs. 1bd) [28,32]. The transmon coupler at the center mediates interaction between the two distant transmon qubits. While the resonant frequency ω 1 /2π of qubit 1 (QB1) is fixed at 4.16 GHz, the frequencies of qubit 2 (QB2) and the coupler (CPLR) are tunable (ω 2 /2π = 3.7-4.7 GHz and ω c /2π = 3.7-6.7 GHz) by modulating the external magnetic flux threading through their asymmetric SQUID loops [33]. More details about the device are provided in Appendix B. Coupler-mediated two-qubit gates are implemented by dynamically tuning ω 2 and ω c . Both qubits have microwave control lines to drive single qubit X-and Y-rotation gates. Both the qubits and the coupler are dispersively coupled to coplanar waveguide resonators for their state readout. We discriminate between the ground, first-and second-excited states, such that we can distinguish 27 different states of the system (see Appendix D for details).
We use the notation |QB1, CPLR, QB2 to represent the eigenstates of the system (Eq. (1)) in the idling configuration where CPLR is placed at the frequency such that the effective QB1-QB2 coupling is nearly zero (dashed lines in Fig. 2a). Note that these states approximate the diabatic (bare) states, i.e., the eigenstates of the uncoupled system, because QB1 and QB2 are effectively decoupled and both are far-detuned from CPLR (g ic /(ω c − ω i ) < 1/20, i ∈ {1, 2}). To implement CZ and iSWAP gates, we use non-adiabatic transitions between |101 and |200 , and |100 and |001 , respectively [26,[34][35][36]. The non-adiabatic transitions are regulated by adjusting ω c , which effectively tunes the coupling strengths between |101 and |200 (2g CZ ), or between |100 and |001 (2g iSWAP ). For example, biasing ω c closer to ω 1 and ω 2 leads to opening of the avoided crossings (|g CZ | > 0, |g iSWAP | > 0) and downward level shifts induced by qubit-coupler interactions (solid curves in Fig. 2a). The CZ gate is performed by suddenly bringing the states |101 and |200 into resonance at their "bare" energy degeneracy point, which projects these "bare" states onto the dressed states formed by the couplingg CZ and results in Larmor precession within the dressed-state basis. We let them complete a single period of an oscillation, such The tunable coupling for the CZ and the iSWAP gates. (a) Illustrations of level crossings relevant to the CZ and the iSWAP gates. The energy splittings (2gCZ and 2giSWAP) are tunable by adjusting ωc. See main text for details. (b-c) Experimental data for the energy exchange between |200 and |101 , and |100 and |001 as function of the coupler frequency ωc, respectively. The pulse sequences are illustrated at the top. (d) By fitting the oscillations with sinusoidal curves, we extract the swap rates |2gCZ|/2π and |2giSWAP|/2π (circles). The top x-axis shows the corresponding perturbation parameter g1c/(ωc − ω1) at each ωc.
that |101 picks up an overall phase e iπ . To implement the iSWAP gate, we put |100 and |001 on resonance and let them complete half an oscillation, so that the two states are fully swapped.
We first demonstrate the tunability of the effective QB1-QB2 coupling strengthsg CZ andg iSWAP by measuring the energy exchange between |101 and |200 , and |100 and |001 , respectively, as a function of CPLR frequency ω c . To measure the energy exchange between |101 and |200 , we first prepare |101 by applying π pulses to both QB1 and QB2 at the idling configuration. Next, we rapidly adjust QB2 frequency ω 2 so that |101 and |200 are on resonance and then turn ong CZ by shifting ω c . We wait a variable delay time τ and measure the state population of |200 . We repeat these measurements with varying ω c (Fig. 2b). In the similar manner, to measure g iSWAP , we prepare |100 and measure the state population transferred to |001 as a function of τ and ω c (Fig. 2c).
In Fig. 2d, we plot the effective coupling strengths 2g CZ /2π and 2g iSWAP /2π as a function of CPLR frequency ω c by fitting the excitation exchange oscillations. To implement fast two-qubit gates (we use a 60 ns-long CZ gate and a 30 ns-long iSWAP gate), a strong coupling strength is required, which strongly hybridizes the CPLR with both QB1 and QB2 (g ic /(ω c − ω i ) ≈ 1/3). However, dynamically entering and exiting such a nondispersive regime easily leads to coherent leakage into the CPLR (non-adiabatic error). Hence, well-engineered control is required to avoid the coherent leakage when implementing fast two-qubit gates.
To implement an optimized control scheme, we propose a tractable model for analyzing the leakage dynamics. We first note that the energy levels of the system interact via excitation-preserving exchange within the rotating wave approximation, such that the dynamics can be analyzed in the two independent manifolds, one involving single excitation and one involving double excitations (Figs. 3a and 3b, respectively). In each manifold, we identify the subspaces spanned by the states which strongly interact with computational qubit states and cause leakage during the CZ gate (dashed boxes in Fig. 3, see Appendix G for details). For the sake of simplifying the leakage dynamics, we intentionally chose a small anharmonicity for the coupler to avoid strong hybridization of |020 with other states during CZ gates (see Appendix I for details). Of these states, |100 and |101 are computational qubit states and all others are leakage states. In the double-excitation manifold, the transition between |200 and |011 is dipoleforbidden (requires a second-order process), and is therefore suppressed. This allows the description of the corresponding three-level dynamics to be further simplified by introduction of a partially hybridized basis: a bright state |B ≡ cos Θ|011 + sin Θ|200 and a dark state |D ≡ cos Θ|200 − sin Θ|011 , where Θ ≡ tan −1 ( √ 2g 12 /g 1c ) [37]. If g 1c g 12 , within this truncated three-level subspace, the computational state |101 only interacts with the bright state |B and we can neglect the dark state |D . Consequently, the leakage dynamics within the singleand double-excitation manifolds are described by the following effective two-level Hamiltonians H 1 (t) and H 2 (t), respectively.
where the coupling strength between |101 and |B is given by g B = g 1c cos Θ + √ 2g 12 sin Θ and the energy of the |B is ω B = cos 2 Θ(ω c (t) + ω 2 ) + sin 2 Θ(ω 1 + ω 2 ). Such a mapping of the multi-level dynamics onto twolevel systems is useful, because optimal control techniques are well-studied for two-level cases [38]. This technique of simplifying multi-level leakage dynamics using bright and dark states is also used to optimize the control pulse for our iSWAP gate (see Appendix G). Since g 1c g 12 (Θ ≈ 0, |B ≈ |011 , and |D ≈ |200 ), the effective Hamiltonians H 1 (t) and H 2 (t) are equivalent up to offset energies. This equivalence enables us to suppress leakage in both single-and double-excitation manifolds by optimizing a single control parameter ω c (t). Note that although |200 behaves as a dark state in this truncated subspace, it still interacts with |101 via a second-order process through the intermediate state |110 (outside the truncated subspace), which enables the CZ gate. Our goal here is to suppress fast, non-adiabatic transitions between |101 and |B ≈ |011 (a first-order interaction) that occur much faster than a slow swapping between |101 and |D ≈ |200 (a second-order interaction through the intermediate state |110 ). Therefore our two-level system model addresses only the predominant, leading-order leakage dynamics. Developing a theoretical framework for addressing additional leakage dynamics, such as leakage into |110 , will be the subject of future work.
Following Ref. [38], we take the Slepian-based approach to devise an optimal control waveform ω c (t) that targets adiabatic evolution within the effective two-level systems. In Appendix H, we present numerical simulation results that validate the suppression of leakage to CPLR when using the optimized pulse shape for both CZ and iSWAP gates.
We experimentally assess the performance of an optimized control pulse for the CZ gate by comparing its performance to a simple square pulse (Fig 4). First, to characterize the leakage into CPLR, we vary the con-trol pulse amplitude and measure the leakage of the CZ gates into |011 (Figs. 4b-c and f-g). The amplitude is parametrized by the minimum point of CPLR frequency f min c (see Figs. 4a and e). The chevron patterns of the |101 population p |101 represent coherent energy exchange between |101 and |200 . We predistort the pulses to eliminate non-idealities, such as transients in the control lines, to ensure the desired pulse shape is faithfully delivered to the device (see Appendix F for details) and thereby achieve symmetric chevron patterns [39][40][41]. On top of the chevrons, we observe distinctive periodic resonances for the square pulse, which are due to the leakage to |011 . We suppress this leakage via an optimized control pulse shape (Fig. 4g). Although we only present measurements of the leakage population to |011 in Fig. 4, we have experimentally confirmed that the leakage to other states in the two-photon manifolds -|020 , |110 and |002 -are negligible (see Appendix T), thereby validating our two-level system model in Eq. (2)).
Next, we confirm the improvement due to optimal pulse shaping by comparing the gate errors of the CZ gates. The tune-up procedures for the CZ gate are illustrated in Appendix N. The single-qubit XY gates are performed in the idling configuration where the static ZZ interaction between QB1 and QB2 is eliminated (see Appendices E and L). In Figs. 4d and 4h, we measure the fidelities of the CZ gates via interleaved randomized benchmarking (RB) [42][43][44]. Due to a dynamic change of the qubit frequencies during a two-qubit gate, additional Z-rotations accrue during the gate. Such Zrotations need to be undone by applying compensating Z-rotations. Therefore, the error rate of an interleaved two-qubit gate consists of two factors: the error rate of the native two-qubit interaction (which contains unwanted Z-rotations due to the change of the qubit frequencies) and the error rate of the compensation Z-rotations. Throughout this paper, we focus on the quality of the native two-qubit interaction. In the case of CZ gates, we correct the additional Z-rotations by applying virtual Z gates that are essentially errorfree [45]. Therefore, the error rate r int of the interleaved CZ gate is equivalent to the two-qubit interaction error rate r CZ ≡ 1 − F CZ of the native CZ gate. The CZ gate with optimal pulse-shaping shows a higher twoqubit interaction fidelity F CZ = 1 − r int = 99.76 ± 0.07%, which amounts to a 70% error reduction compared to the square-shaped control pulse.
Based on the average gate composition for the twoqubit Cliffords [44], we have estimated the two-qubit Clifford error rates r Clifford,est using the following formula: r Clifford,est = 8.25 × r 1qb + 1.5 × r CZ . The estimated Clifford error rates r Clifford,est for the square and optimal pulses are 1.79 ± 0.29% and 1.02 ± 0.11%, respectively. Differences between r Clifford,est and r Clifford are possibly due to residual distortion of the two-qubit gate pulses, which may additionally degrade the quality of subsequent single-qubit gates. By comparing Figs. 4(d) and (h), we find that change in r Clifford (0.79%) is very  − f1) and the number of CZ gates NCZ. (c,g) Leakage population to |011 after applying repeated CZ gates. The square-shaped pulse shows periodic leakage to |011 , which is suppressed down to the background noise limit by optimizing the pulse shape. (d,h) Interleaved randomized benchmarking (RB) results of the CZ gates. The pulse sequence is illustrated at the top; we apply N Clifford random two-qubit Clifford gates (C2) and the recovery Clifford gate C −1 2 , which makes the total sequence equal to the identity operation. Errors bars represent ±1 standard deviations. We measure 30 random sequences for each sequence length N Cliffords . To ensure accurate uncertainties of the error rates (r Clifford and rint), we perform a weighted least-squares fit using the inverse of variance as the weights. close to 1.5×the change in r int (0.77%), conforming to the theory.
By solving a Lindblad master equation, we find that the T 1 limit for a 60 ns-long CZ gate is approximately 99.85% (see Appendix R). We also simulate the contribution of 1/f α flux noise (which predominantly limits the pure dephasing times T * 2 of QB2 and CPLR at the idling configuration) to the gate error rate and find it to be an order of magnitude smaller than the T 1 -contribution (see Appendices C and S for details). The gap between the measured fidelity F CZ and its coherence limit implies additional coherent, leakage errors due to imperfect control. We find the leakage rate of the CZ gate with an optimal pulse shape is 0.06 ± 0.07% possibly due to residual pulse distortion of Z pulses (see Appendix O for details). Now, we move on to engineering the level structure of the coupler to suppress residual ZZ interactions during the iSWAP gate. The transmon qubit has a weak negative anharmonicity [28]. Therefore, the second excited levels of the transmons |200 and |002 are located near the computational qubit state |101 when the two qubits are in resonance. Interaction between these three energy levels leads to level repulsion (red arrows in Fig. 5a). Due to the repulsion, the frequency of |101 is shifted upward (note that |200 and |002 are located below |101 ), which results in a positive ZZ interaction of strength , where E |m denotes the eigenenergy of |m . Such residual ZZ interactions have generally either been accomodated [26] or actively corrected by applying a partial CZ gate [17] within the transmon qubit architecture. Recently, an approach to suppress the ZZ interactions by using qubits with opposing signs for their anharmonicity (e.g., a transmon qubit and a C-shunt flux qubit) has been proposed and demonstrated [46,47]. In this work, we utilize the higher level of the coupler |020 to counteract the level repulsion while only using transmon qubits. Note that |020 is located above |101 , thereby providing a means to cancel the unwanted ZZ term (blue arrow in Fig. 5a).
In Fig. 5b, we measure the residual ZZ strength ζ as a function of ω c , when QB1 and QB2 are in resonance. To measure ζ, we perform a cross-Ramsey type experiment, which measures the conditional phase accumulation φ ZZ of QB1 while initializing QB2 in either its Pulse sequence for (c,d): Cancelling out residual ZZ interaction of the iSWAP by exploiting the engineered coupler level structure. (a) The residual ZZ interaction during iSWAP is originated from the level repulsion between |101 and the second excited level of the qubits (red arrows). This level repulsion is counteracted by utilizing the level repulsion from the 2nd-excited state of the coupler (blue arrow). (b) Residual ZZ strength ζ as a function of the coupler frequency ωc, when the two qubits are on resonance. The top x-axis shows the corresponding perturbation parameter g1c/(ωc −ω1). The solid curves correspond to numerical simulation assuming either ηc = ∞ (yellow) or ηc/2π = −90 MHz (green). (c) ZZ angle of the iSWAP gate φZZ /NiSWAP as a function of the gate length tG. We cancel out the ZZ angle by exploiting the tunability of ζ from positive to negative values. Inset shows the dynamic change of ζ during the excursion of ωc for a 30-ns long iSWAP gate. Each data point is obtained by fitting the accumulated ZZ angle φZZ of NiSWAP-times repeated iSWAP gates with a linear function as shown in (d). (e) The results of interleaved randomized benchmarking (RB) for the ZZ-free 30 ns-long iSWAP gate (see main text for details). We measure 30 random sequences for each sequence length (N Cliffords ). Errors bars represent ±1 standard deviations. The error rates (r Clifford and rint) and their uncertainties are extracted by performing a weighted least-squares fit using the inverse of variance as the weights. ground or excited state. We measure φ ZZ at full periods of the swap oscillation, where the net amount of excitation exchange is zero. Dividing φ ZZ by the swap period (2π/g iSWAP ), we extract ζ/2π. The experimental data show good agreement with numerical simulation (green curve in Fig. 5b, see Appendix Q for details about the simulation). We also compare the experimental data with simulated ζ for a 2-level CPLR (yellow curve in Fig. 5b).
Owing to the presence of the higher level of CPLR, ζ is significantly suppressed. We also note that levels beyond the 2nd-excited level of CPLR have little impact on the dynamics, since they are outside the relevant manifolds. This result clearly indicates that using a well-engineered multi-level coupler can significantly reduce a residual ZZ error of the iSWAP gate, thereby further enhancing the fidelity.
When performing the iSWAP gate, its residual ZZ angle φ ZZ is accumulated by a dynamic change of ζ during the excursion of CPLR frequency ω c . If the negative and positive portions of ζ during the gate are equal, the overall ZZ phase is completely cancelled out. We measure the residual ZZ angle φ ZZ of the iSWAP gate by adjusting the pulse length in sync with the pulse amplitude such that the excitation is always fully swapped (Fig. 5c). We optimize the iSWAP pulse shape in the same manner to suppress coherent leakage to CPLR (see Appendices G and H for details). Therefore, we simultaneously address both coherent leakage to CPLR and residual ZZ interaction by optimizing the pulse shape and duration. Owing to the cancellation induced by the higher level of CPLR, the iSWAP gate with a 30 ns duration features negligible residual ZZ (φ ZZ /N iSWAP = 0.02±0.03°), which we refer to as the ZZ-free iSWAP gate (Fig. 5d). Note that the duration of a ZZ-free iSWAP gate depends on the coupler's anharmonicity η c . Here, we engineer the coupler's level structure such that its anharmonicity is relatively small (η c /2π = −90 MHz), in order to implement a faster ZZ-free iSWAP gate than what would be possible with larger η c (see Appendix I for details).
We measure the two-qubit interaction fidelity of the ZZ-free iSWAP gate by performing interleaved randomized benchmarking (RB) in Fig. 5e. The tune-up procedures for the iSWAP gate are described in Appendix N. Unlike the CZ gate, when performing single qubit gates, we bias QB1 and QB2 in resonance to synchronize their XY axes in the Bloch sphere (see Appendix J for details). This is facilitated by the tunable coupler, which switches off the effective transverse coupling between QB1 and QB2. Since they are put in resonance, the microwave crosstalk between the XY drive tones becomes critical. We cancel out this microwave crosstalk by applying active cancellation tone for each of the drive lines (see Appendix K for details). We find that using a long microwave pulse is desirable for better active cancellation. Hence, we apply 70 ns-long microwave pulses when implementing X and Y single-qubit gates, even though they show lower average gate fidelities (QB1 = 99.92%, QB2 = 99.81%) than the 30 ns-long pulses used in the CZ gate benchmarking experiments (QB1 = 99.94%, QB2 = 99.90%). See Appendix L for single-qubit Clifford randomized benchmarking data.
Unlike the CZ gate, we implement actual (not virtual) Z gates using XY gates (see Appendix P) to cancel out Zrotations that are accompanied by the iSWAP gate. As a consequence, when performing the interleaved RB, the iSWAP-interleaved sequence acquires 0.1125 additional XY gates per Clifford on average (see Appendix M for details). We extract the two-qubit interaction fidelity F iSWAP by subtracting the contribution of single-qubit gate error 0.1125×r 1QB ≈ 0.015% from the error rate r int of the interleaved gate: Owing to the ZZ-cancellation and a short gate length, the measured iSWAP gate exhibits high twoqubit interaction fidelity F iSWAP = 99.87 ± 0.23%. Based on the average gate composition for the two-qubit Cliffords (see Appendix M for details), we have estimated the two-qubit Clifford error rates r Clifford,est using the following formula: r Clifford,est ) = 10.9375 × r 1qb + 1.5 × r iSWAP . The estimated Clifford error rate is 1.73 ± 0.37%. The difference between r Clifford,est and r Clifford could be due to residual distortion of the two-qubit gate pulses. The measured two-qubit interaction fidelity is close to the T 1 limit of 99.91% obtained by solving the Lindblad master equation (see Appendix R). We confirm that error contribution of 1/f α flux noise is relatively insignificant (see Appendices C and S for details). Again, note that this two-qubit interaction fidelity only quantifies the quality of the native iSWAP gate, which contains additional unwanted single-qubit Z rotations. In practice, such additional Z-rotations can be compensated by compiling the correctional Z-rotations into adjacent single-qubit gates (see Appendix M for example).
We note a large uncertainty in an estimate for the twoqubit interaction fidelity F iSWAP . The large uncertainty is mainly due to relatively low single-qubit gate fidelities (≈ 99.86%), which degrades the fidelity of reference Clifford sequences [48]. These low single-qubit gate fidelities arise from biasing the qubits on resonance to avoid phase swapping, which necessitates microwave crosstalk cancellation. Further research should be undertaken to improve the single-qubit gate fidelities in this architecture. One alternative is to bias the qubits off-resonantly and correct for the accumulated single qubit phases in software [49]. Exploring the viability of this approach will be the subjects of our future work. In addition, applying iterative randomized benchmarking [50,51] would be useful to better characterize the contributions of systematic coherent errors versus incoherent noise to the two-qubit gates.
Looking forward, our work provides a path towards building quantum information processors that are capable of running near-term quantum applications and ultimately achieving fault tolerant quantum computation. Our optimal control approaches to suppressing coherent leakage of multi-qubit gates is of particular importance, because leakage error is especially detrimental to the implementation of quantum error correcting codes [44,[52][53][54][55][56]. Additionally, the high-fidelity ZZfree iSWAP gate (more generally, XY entangling gates without residual ZZ) is beneficial for NISQ applications, since it enables efficient circuit compilation and improves the accuracy of NISQ algorithms such as quantum simulation of molecular eigenstates [57][58][59], quantum approximate optimization algorithms (QAOA) for solving highdimensional graph problems [60,61], and quantum simulation of many-body condensed matter systems (e.g., the 2D XY model) [49,[62][63][64] While the residual ZZ of an iSWAP gate can be cancelled by applying an additional CPHASE gate, this inevitably increases the circuit depth, which degrades the performance of (NISQ) algorithms. Alternatively, one can implement error mitigation techniques to alleviate the detrimental effect of residual ZZ on algorithms [59], but this may also introduce overhead, such as additional measurements and classical post-processing, depending on the mitigation protocols. Notably, all these efforts to reduce the impact of residual ZZ of XY entangling gates can be simply avoided by using our ZZ cancellation approach.
Taken together, the principles and demonstrations shown in this work will help resolve major challenges in the implementation of quantum computing hardware for NISQ-era applications.

DATA AVAILABILITY
The data that support the findings of this study may be made available from the corresponding authors upon request and with the permission of the US Government sponsors who funded the work. The experiments were performed in a BlueFors XLD-600 dilution refrigerator with a base temperature of 10 mK. We magnetically shielded the device with a superconducting can surrounded by a Cryoperm-10 cylinder. All attenuators in the cryogenic samples are made by XMA and installed to remove excess thermal photons from higher-temperature stages. We apply microwave readout tones to measure the transmission of the device. We pump the Josephson travelling wave parametric amplifier (JTWPA) [65] using an RF source (Berkeley Nucleonics Corp Model 845), in order to pre-amplify the readout signal at base temperature. A microwave isolator placed after the sample allows for the signal to pass through to the JTWPA without being attenuated, while removing all the reflected noise of the JTWPA and dumping it in a 50 Ω termination. Two microwave isolators are placed after the JTWPA to prevent noise from highertemperature stages to the JTWPA and the sample. We amplify the signal by using a high-electron mobility transistor (HEMT) amplifier, which is thermally connected to the 4 K stage. The output line is further amplified outside of the cryostat with an amplifier (MITEQ AMF-5D-00101200-23-10P) with a quoted noise figure of 2.3 dB, and a preamplifier (Stanford Research SR445A).
Outside of the cryostat, we have all of the control electronics which allow us to apply signals used for the XY and Z controls of the qubits and the coupler. Pulse envelopes of XY control signals and readout signals are programmed in Labber software and then uploaded to arbitrary waveform generators (AWG Keysight M3202A). Subsequently, the pulses generated by AWGs are mixed with coherent tones from RF sources (Rohde and Schwarz SGS100A). Z control signals are generated by AWGs (AWG Keysight M3202A). We also apply magnetic flux globally through a coil installed in the device package as an additional DC flux bias source (Yokogawa GS200). All components for generating signals are frequency-locked by a common SRS rubidium frequency standard (10 MHz). A detailed schematic is given in Fig. 6 The device parameters are summarized in Table I. The |0 → |1 transition frequencies of the qubits and the coupler as a function of flux bias are shown in Fig. 7. We note that the maximum frequencies have decreased 1-2 MHz in each cool-down, due to device aging.
The coupling strengths between the qubit and coupler g 1c /2π, g 2c /2π are approximately 72 MHz. Note that further increasing g 1c , g 2c would enable faster gates with fewer non-adiabatic effects. However, solely increasing g 1c , g 2c would result in an increase of the idling coupler frequency ω c,idle at which the net qubit-qubit coupling is nearly zero. There are practical considerations that place an upper limit on ω c,idle . For example, other modes in the system such as readout resonators or/and spurious package modes should not be within the operating frequency range of the coupler. Alternatively, one could also increase the direct qubit-qubit coupling g 12 to compensate for the increased g 1c , g 2c such that ω c,idle remains at a lower frequency and is within the ideal operating range (no other modes fall within this range). However, increasing g 12 can be potentially problematic when scaling up due to strong parasitic ZZ coupling between nextnearest neighboring qubits. Given these constraints, we have chosen the coupling strengths that enable fast twoqubit gates (30-60 ns) via our optimized control techniques and avoid unwanted resonances between a coupler and other modes on the chip.
We also measure T 1 of QB2 and CPLR as functions of their frequencies ω 2 and ω c (Fig. 9). We find TLSs in both QB2 and coupler, but they are located out of the operating frequency ranges, so that they negligibly affect the performance of two-qubit gates. However we note that the TLS landscape varied between cool-downs, occasionally causing TLSs to jump into the operating range for CPLR. We observed degradation of the two qubit gate fidelities (below 99%), when TLSs are strongly coupled to the coupler in its operating frequency range (Fig. 10a).
In these experiments, we have used relatively large area Josephson junctions (1-3 µm × 200 µm). This is to: (1) achieve a much higher coupler frequency than the qubit frequencies while using only a single e-beam layer, and (2) use asymmetric SQUIDs, which are advantageous for their lower flux sensitivity. When only using a single e-beam layer and, therefore, a single critical current density, the need for both large and small E J results in certain junctions being necessarily large in area. These large junctions ultimately lead to an increase in the number of TLSs [68,69]. To mitigate this issue, one can em-ploy multiple e-beam layers with different critical current densities, such that both large and small E J values can be fabricated using only small-area Josepshon junctions (≈ 200 nm × 200 nm). These smaller junctions reduce the probability of strongly coupled TLSs appearing in the operating regime. This approach will be implemented in future work.    We characterize 1/f α flux noise which predominantly limits the dephasing times T * 2 , T echo 2 of QB2 and CPLR at the idling configuration. Following Ref [70], we estimate the power spectral densities S echo is a Gaussian pure dephasing rate and 1 2π ∂ω ∂Φ is a flux sensitivity of the |0 → |1 transition frequency. To compute Γ echo φ , we perform a fit to the decay curve is the energy relaxation time measured in a preceding T 1 experiment. Table II presents the power spectral densities at 1Hz for QB2 and CPLR. We find that these values are comparable with the numbers in the literature [70,71].   We compute the noise power spectral density at long time-scales (5 × 10 −4 -10 −1 Hz) from repeated Ramsey measurements [72,73] (see Figs. 11 and 12). The estimated power spectral density shows a 1/f α dependence and is fitted to the following equation (an orange dashed curve in Fig. 12 , where the exponent α ≈ 1.33 and A Ramsey Φ ≈ (6 µΦ 0 ) 2 × (Hz) (α−1) . We find that the corresponding power spectral density at 1Hz S Ramsey Φ (ω/2π = 1 Hz) ≈ (1.77 µΦ 0 ) 2 /Hz.

Appendix D: State readout
We probe quantum states of QB1, QB2, and CPLR via the dispersive readout scheme [30]. We drive the readout resonators by applying a square-shaped 3 µs-long microwave pulse. We discriminate the three states -ground state |0 , the first excited state |1 , and the second excited state |2 -for QB1, QB2, and CPLR. The first excited state |1 is prepared by applying a 250 ns-long π pulse that drives the |0 → |1 transition. The second excited state |2 is prepared by applying two consecutive 250 nslong π pulses, which drive the |0 → |1 and |1 → |2 transitions respectively.
For state discrimination, we use a linear support vector machine, which finds hyper-planes separating the I-Q data into three parts which corresponding to |0 , |1 and |2 , respectively (Fig. 13) [74]. We characterize the readout performance by computing assignment probability matrices. See Table III for the assignment probability matrices. Note that the assignment probabilities include state preparation errors.  Appendix E: Static ZZ interaction in the dispersive limit In this section, we present experimental data and perturbative calculations of the static ZZ interaction ζ as a function of CPLR's frequency ω c . For the perturbative analysis, we assume that QB1, QB2, and CPLR are dispersively coupled to each other g ij /|ω i − ω j | 1 (i, j ∈ {1, 2, c}, i < j). Fig. 14 shows experimental data of ZZ interaction strength ζ as a function of ω c . In this measurement, we bias the frequencies of QB1 and QB2 at 4.16 GHz and 4.00 GHz, respectively. We measure ζ via a cross-Ramsey type experiment which measures the QB1 frequency while initializing QB2 in either its ground or excited state. The static ZZ interaction is nearly eliminated when ω c /2π = 5.45 GHz. At this bias point (the idling configuration), we perform single qubit gates in the CZ gate benchmarking experiments (Fig. 4 in the main text). Static ZZ interaction strength ζ as a function of the coupler frequency ωc. QB1 and QB2 are biased at ω1/2π = 4.16 GHz and ω2/2π = 4.00 GHz. The solid black curve corresponds to ζ obtained by the perturbation theory up to the fourth order without rotating wave approximation (Eq. (E3)). The blue dashed curve corresponds to ζ obtained by numerically diagonalizing the system Hamiltonian (Eq. (1), see Appendix Q for the parameters used).
Following Ref. [75], we use perturbation theory to calculate theoretical values of ζ up to the fourth order according to the following formula: where E |m denotes the eigenenergy of the eigenstate |m ∈ {|000 , |100 , |001 , |101 }. Specifically, we calculate the ZZ contributions of the n-th order perturbations ζ (n) by computing the n-th order corrections E (n) |m to the eigenenergies of |m (n ∈ {2, 3, 4}, |m ∈ {|101 , |001 , |100 , |000 ) as follows: The ZZ contributions from the n-th order perturbation terms can be split into the rapid counter-rotating-wave terms ζ (n) CRW and the slow rotating-wave terms ζ (n) RW . In general, the rapid oscillating terms are neglected by applying the rotating wave approximation [16]. However, in our case, we note that the fast-oscillating terms considerably contribute to the static ZZ interaction.
The total ZZ contribution up to the fourth order perturbation is given as ζ =ζ For brevity's sake, we introduce the following notations.
where i, j ∈ {1, 2, c}. The ZZ contributions from the n-th order perturbation terms are calculated as follows.
RW =g 1c g 2c g 12 For the fourth-order slow rotating-wave terms, we omit smaller contributing terms containing g 12 .
Since ζ CRW and ζ CRW are expressed by a large number of terms, for the sake of clarity, we instead write the corresponding eigenenergy corrections E (n) |m as follows. Namely ζ CRW is given by where the eigenenergy corrections E (n) |m are given as For the fourth-order fast oscillating terms, we omit smaller contributing terms. Specifically, terms of order O(g 4 ic /∆ 2 ic Σ ic ) and O(g 4 ic /∆ ic Σ 2 ic ) are calculated, whereas terms of order O(g 4 ic /Σ 3 ic ) (i ∈ {1, 2}) and O(g 12 ) are neglected. Therefore ZZ contribution ζ (4) CRW from the fourth-order fast oscillating terms is given as where the eigenenergy corrections E (n) |m for the predominantly contributing terms are given as follows.
a. Order O(g 2 1c g 2 2c /∆ 2 ic Σ ic ), i ∈ {1, 2}: In Fig. 14, we plot the theoretical values of ζ obtained by the calculation of Eq. (E3). The theoretical calculation (solid black curve) shows good agreement with both experimental data (orange circles) and a numerical simulation (blue dashed curve).

Appendix F: Z-pulse transient calibration
The shape of the Z control (flux control) pulses are distorted as they pass through various electric components. This pulse distortion can be analyzed in the frequency domain by measuring the step response. In general, the qubit is employed as a sensor to characterize the step response of the flux control line [39,40,44]. Specifically, we measure a Ramsey-type experiment, which measures the dynamic frequency change of the qubit as a response to the flux change.
The step response can be fitted by multiple exponential time constants τ k and settling amplitudes a k (k = 1, 2, · · · ) as follows.
where V in,step (t) corresponds to a step function generated by AWG and V out,step (t) corresponds to the response of the qubit to the step function. Note that we express the qubit response V out,step (t) in the unit of AWG voltage and calculate the relative amplitude change V out,step (t)/V in,step (t).
To reliably characterize long-time scale transients of the Z control pulses, we use a new protocol, which utilizes the Han spin echo technique [76] (detailed procedures will be described in a forthcoming manuscript [41]). We measure the turn-off transients of a square-shaped pulse with fixed duration τ pulse and fit the response with the following equation: where V in,pulse (t) corresponds to a τ pulse -long squareshaped Z-pulse generated by AWG and V out,pulse (t) corresponds to the response of the qubit to the pulse. Figs. 15a and b show the turn-off transients of the QB2 and CPLR Z-pulses, respectively. The pulse sequences are illustrated in the insets. We plot the relative amplitude change V out,step (t)/V in,step (t) as a function of the time delay between the Z-pulse and the tomography pulse (t − τ pulse ). We fit the transients with a sum of multiple exponential curves and extract the exponential time constants τ k and the corresponding settling amplitudes (Table. IV) Notably, we observed long-time transients (≈ 30 µs) in our experimental setup, which are critical to correct in order to achieve high-fidelity two-qubit gates. We also measure and correct transients in the flux crosstalk (Fig. 15c), possibly due to an additional pulse distortion that occurs during the transmission from the end of CPLR's flux line to the QB2's SQUID.   In this section, we derive the effective two-level Hamiltonians that describe the coherent leakage of CZ (Eq. (2)) in the main text) and iSWAP gates. We first identify the states that strongly interact with the computational qubit states (|000 , |100 , |001 , and |101 ) during the two-qubit gates and cause the coherent leakage. Subsequently, we truncate the system Hamiltonian (Eq. (1)) into the relevant subspaces spanned by these leakage states and the associated computational qubit states.
We identify the leakage states for the CZ gate in the single-and double-excitation manifolds (Fig. 16). Recall that, when performing the CZ gate, we bring |101 in resonance with |200 (ω 1 + η 1 = ω 2 ) and bias the coupler closer to the qubits to switch on the effective qubitqubit couplingg CZ . Therefore, in the single excitation manifold, |010 strongly interacts with |100 , since |010 (CPLR) is brought closer to |100 (QB1) in terms of energy. On the other hand, |001 (QB2) is detuned from |100 (QB1) by QB1's anharmonicity η 1 , and thus |001 is located farther from |010 and is less hybridized with QB1 and CPLR. Thus, we focus on the two-level dynamics between |100 and |010 and define the relevant subspace accordingly (a dashed purple box in Fig. 16a).
Along the same line, in the double-excitation manifold, we identify the leakage states which strongly interact with the computational qubit state |101 and cause the coherent leakage during the CZ gate. We first rule |020 out as a leakage state, since it couples to |101 via a second-order process that is generally weaker than firstorder interactions. Next, we rule out |110 and |002 , since they are relatively far-detuned from |101 compared to |011 and |200 . Specifically, |002 is detuned from |101 by QB2's anharmonicity η 2 , of which magnitude is much greater than the direct QB1-QB2 coupling strength √ 2g 12 (|η 2 | √ 2g 12 ). In addition, |110 is located farther from |101 than |011 by QB1's anharmonicity |η 1 |. After ruling these out as leakage states, we determine the relevant subspace as shown in Fig. 16b (spanned by the states within the dashed green box).
Next, we truncate the system Hamiltonian to the relevant subspaces in both the single-and double-excitation manifolds and obtain the following effective Hamiltonians H CZ 1 and H CZ 2 : where we have replaced ω 1 + η 1 by ω 2 , since ω 1 + η 1 = ω 2 is assumed here. Note that this Hamiltonian truncation is only valid in the regime where |η 1 |, |η 2 | g 1c , g 2c (in our device, |η 1 | ≈ |η 2 | ≈ 3g 1c ≈ 3g 2c ). To analyze the leakage dynamics under general conditions, the leakage contribution from additional levels need to be considered and will be of interest in future research.
To further simplify the three-level dynamics of H cz 2 , we introduce a partially hybridized basis: a bright state |B ≡ cos Θ|011 + sin Θ|200 and a dark state |D ≡ cos Θ|200 − sin Θ|011 , where Θ ≡ tan −1 ( √ 2g 12 /g 1c ) [37]. To this end, we rewrite H CZ 2 in the hybridized basis as follows: where the eigenenergies of |D and |B are given as The coupling strength g B between |B and |101 is given as and the coupling strength g r between |B and |D is given as In the parameter regime, where g 1c g 12 (Θ ≈ 0), g r becomes zero, and therefore |101 only interacts with the bright state |B ; the dark state |D is decoupled from both of the states. This allows us to further reduce the three-level dynamics onto an effective two-level system, as described by Eq. (2) in the main text. As a result, the two effective Hamiltonians H CZ 1 (|100 and |001 subspace) and H CZ 2 (|101 and |B subspace) are equivalent to the following effective Hamiltonian H CZ eff up to offset energies: Optimal control techniques are well-studied for this class of effective Hamiltonians, which we will further discuss in Appendix H. Next, we identify the leakage states for the iSWAP gate. When performing the iSWAP gate, we bring |100 (QB1) in resonance with |001 (QB2), and bias the coupler closer to the qubits to switch on the effective qubitqubit couplingg iSWAP . Unlike the CZ gate, the computational qubit states |100 and |001 are equally detuned from a leakage state |010 in terms of energy. Therefore, we consider leakage from both |001 and |100 to |010 . Accordingly, we determine the relevant subspace in the single-excitation manifold as shown in Fig. 17 (spanned by states in the purple dashed box). In the double-excitation manifold, we rule |020 out as a leakage state, because it couples to the computational qubit state |101 via a second-order process. We also rule out |200 and |002 , since they are detuned from |101 by QB1 and QB2 anharmonicities, respectively, of which both are much greater than the QB1-QB2 direct coupling strength √ 2g 12 (|η 1 |, |η 2 | √ 2g 12 ). Given that, we determine the relevant subspace in the double-excitation manifold as shown in Fig. 17b  in the single-and double-excitation manifolds, respectively, are given as follows.
Appendix H: Suppression of leakage using a Slepian-based optimal control As detailed in Appendix G, the effective Hamiltonians that describe the leakage dynamics of CZ and iSWAP gates are given as follows: We optimize the control waveform ω c (t) for adiabatic behavior under these two-level systems. Note that these effective two-level systems address only predominant leakage channels, not all possible leakage channels during the two qubit gates. Specifically, H CZ eff (t) addresses leakage from |100 to |010 (in the single-excitation manifold) and leakage from |101 to |011 (in the double-excitation manifold) during the CZ gate. In the case of the iSWAP gate, H iSWAP eff (t) addresses leakage from |100 and |001 to |010 (in the single-excitation manifold) and leakage from |101 to |110 and |011 (in the double-excitation manifold).
Following Ref. [38], we take a Slepian-based approach to implement an optimal control pulse that minimizes leakage errors for any pulse longer than the chosen pulse length. For example, a Slepian control pulse for a 60 nslong CZ gate minimizes the leakage error of CZ pulses which have the same pulse amplitude, but longer pulse lengths than 60 ns.
In Fig. 18, we numerically simulate coherent leakage of CZ gates (see Appendix Q for details about the simulation). We assess the performance of an optimized control pulse by comparing to a simple square pulse (Fig. 18a). Considering the bandwidth limitation of our AWGs, the square pulse is smoothed by applying a Hanning filter with a window width of 5 points (1 ns intervals). The control pulse amplitudes, which are parameterized by the minimum point of CPLR frequency f min c , are chosen such that 60 ns-long control pulses perform the CZ gate. In Fig. 18b, to characterize the leakage in the doubleexcitation manifold, we prepare |101 and apply a control pulse, and then measure the population of leakage states |110 , |011 , |200 , |020 , and |002 with varying the pulse length. We note that the square pulse shaping causes significant leakage, especially to |011 (an orange curve in Fig. 18b). By using the optimal pulse, we suppress leakage populations p |110 , p |011 , p |020 , and p |002 below 10 −7 for pulses longer than the chosen gate length: 60 ns. Fig. 18c shows leakage in the single-excitation manifold. Here we characterize leakage from the computational qubit state |100 after applying a CZ pulse. As in the case of double-excitation manifold, a square-shaped control pulse causes significant leakage, to both |010 and |001 . By using the optimal control, we suppress the leakage population p |010 to |010 below 10 −7 . However, we note that the leakage to |001 is not suppressed as much, We prepare |101 and apply a control pulse, and then measure the state populations. By using the optimal pulse shaping, we suppress population of the leakage state |011 (orange curve) below 10 −7 for pulses longer than 60 ns. (c) Coherent leakage in the single-excitation manifold. We prepare |100 and apply a control pulse, and then measure the state populations. By using optimal pulse shaping, we suppress population of the leakage state |010 (blue curve) below 10 −7 for pulses longer than 60 ns. The leakage to |001 is not suppressed as much, since the optimal control relies on the effective Hamiltonian H CZ eff (t) that only addresses leakage from |100 to |010 in the single-excitation manifold. The data points in (b-c) are obtained every 1 ns. compared to |010 . This is because our theoretical model H CZ eff (t) only addresses leakage from |100 to |010 without taking |001 into account.
In Fig. 19, we simulate coherent leakage of iSWAP gates. We compare the performance of an optimal control pulse to a square pulse (Fig. 19a). The control pulse amplitudes (f min c ) are chosen such that 30 ns-long con- and apply a control pulse, and then measure the state populations. By using optimal pulse shaping, we suppress population of the leakage states |110 and |011 (blue and orange curves) below 10 −7 for pulses longer than 30 ns (black dashed line). The leakage to |200 and |002 are not suppressed as much, since the optimal control relies on the effective Hamiltonian H iSWAP eff (t) that only addresses the leakage from |101 to |110 and |011 in the double-excitation manifold. (c) Coherent leakage in the single-excitation manifold. We prepare |101 and apply a control pulse, and then measure the state populations. By using the optimal pulse shaping, we suppress population of the leakage state |010 (blue curve) below 10 −7 for pulses longer than 30 ns (black dashed line). The data points in (b-c) are obtained every 1 ns. trol pulses perform the iSWAP gate. In Fig. 18c and d, we characterize leakage in the double-excitation manifold as in the case of the CZ gates. The square control pulse causes significant leakage. By using the optimized pulse, we suppress the leakage population to |110 (a blue curve) and |110 (a orange curve) below 10 −7 for pulses longer than the chosen gate length: 30 ns. Fig. 19c shows the leakage in the single-excitation manifold. Here we characterize leakage from the computational qubit state |100 . The square-shaped pulse causes significant leakage errors. By using the optimal control, we suppress the state population of a leakage state |010 below 10 −7 .
In this section, we demonstrated our Slepian-based optimal control by presenting numerical simulation results. We suppress population of the predominant leakage states below 10 −7 , by using the optimized control. However, not every leakage channel is suppressed to the same level, since our theoretical model addresses only the predominant leakage channels. Developing a theoretical framework for addressing the full leakage channels will be the subject of future work.
Appendix I: Advantages of small anharmonicity ηc for the coupler Smaller η c enables the ZZ-free iSWAP interaction at a lower coupler frequency ω c , which allows for a strongerg iSWAP such that we can implement faster ZZfree iSWAP gates. Figs. 20(a) and (b) show numerical simulations of coupling strength (2g iSWAP ) of the iSWAP interaction and its residual ZZ strength (ζ iSWAP ) as a function of the coupler anharmonicity (η c ). QB1 and QB2 frequencies are on resonance, which is the standard configuration for activating iSWAP gates. In each figure, the blue dashed curve represents our parameter choice for the coupler anharmonicity (η c /2π = −90 MHz) and red dashed curve represents a parameter regime, where residual ZZ interaction becomes zero owing to the cancellation induced by the second excited state of the coupler. We note that as η c decreases, the coupler frequency at which residual ZZ interaction is cancelled also decreases (red dashed curve), whileg iSWAP remains constant approximately. Therefore, the resultantg iSWAP for the ZZ-free iSWAP increases, thereby reducing the gate duration. Hence, owing to our relatively small coupler anharmonicity, we are able to make a short (30 ns) ZZfree iSWAP gate.
In addition, smaller η c prevents |020 from being strongly hybridized with |101 and |200 during a CZ gate; this enables us to simplify multi-level leakage dynamics and use the Slepian-based optimal control. In the following paragraph, we explain how we numerically estimate the state overlap of |020 with |101 and |200 .
The eigenstates (i.e., dressed states) | ψ of our system (see Eq. (1) for the system Hamiltonian) can be expressed by a linear combination of basis states (i.e., bare states) |φ j as follows.
where N denotes the dimension of our Hilbert space. We estimate the complex state overlap coefficients 020| 101 Numerical simulations of (a) coupling strength giSWAP and (b) residual ZZ strength ζiSWAP for an iSWAP interaction vs. the coupler anharmonicity ηc (y-axis) and the coupler frequency ωc (x-axis), where ω1/2π = ω2/2π = 4.16 GHz. Note that smaller ηc enables ZZ-free iSWAP interaction (ζiSWAP = 0, red dashed curves) with strongergiSWAP, thereby enabling faster ZZ-free iSWAP gates. and 020| 200 by numerically diagonalizing the Hamiltonian based on the device parameters (see Appendix Q for the device parameters used for simulation). We define a sum of squares of these coefficients (| 020| 101 | 2 + 020| 200 | 2 ) as a metric that quantifies how strongly |020 hybridizes with |101 and |200 . Figs. 21(a) and (b) show numerical simulation results of coupling strength (2g CZ ) of the CZ interaction and the state overlap (| 020| 101 | 2 + 020| 200 | 2 ) as a function of CPLR anharmonicity (η c ). QB1 and QB2 frequencies are set such that ω 1 + η 1 = ω 2 , which is the standard configuration for activating CZ gates. We note that the as η c decreases, the state overlap | 020| 101 | 2 + 020| 200 | 2 also decreases, while the coupling strengthg CZ remains constant approximately. Hence, by using a relatively small coupler anharmonicity, we are able to reduce the hybridization of |020 with |101 and |200 when performing CZ gates.

Appendix J: Synchronization of XY axes for the iSWAP gate
The computational qubit state is generally defined in a reference frame, rotating at the frequency of qubit driving tone (this frame is often called the logical frame). Accordingly, in a multi-qubit system, we use multiple independently rotating frames to refer the computational state of each qubit. Notably, performing iSWAP-like gates by tuning qubit frequencies into resonance [26] causes a non-trivial local phase shift in the logical frame due to the unmatched rotating frequencies. In this section, we explain how this phase shift occurs by presenting a simple example and discuss how it can be avoided.
We consider an uncoupled two-qubit system with Hamiltonian defined as follows in the laboratory frame ( ≡ 1) where ω 1 and ω 2 denote the transition frequencies of each qubit. Consider an arbitrary state ψ(t) evolving under the Hamiltonian H lab as follows.
where c m (t) denotes the probability amplitude of a basis state |m ∈ {|00 , |01 , |10 , |11 } at time t. In the doubly rotating frame (i.e. the logical frame), where each frame rotates at the corresponding qubit frequency, the logical state vectorψ(t) is given bỹ Now, suppose that we apply an iSWAP gate at t = τ 1 (τ 1 > 0). In the lab frame, the iSWAP gate swaps the probability amplitudes c 01 (τ 1 ) and c 10 (τ 1 ) and adds a relative phase of i as follows.
Subsequently, in the logical frame (the doubly rotating frame), the state vectorψ(t)| t=τ1 can be written as Note that the logical state vector has acquired additional local phase shifts e i(ω1−ω2)τ1 and e i(ω2−ω1)τ1 on the basis |01 and |10 , after the iSWAP gate. These phase shifts are artifacts of the frequency difference between the two rotating frames |ω 2 − ω 1 |. Notably, longitudinal entangling gates (e.g., the CZ gate) do not cause this phase shift, since they do not involve any energy exchange. Also, parametrically driven two-qubit gates [10,77], which activate resonant exchange interactions in the logical frame (not the lab frame), do not result in this phase shift.
In this paper, we avoid this phase shift by putting the qubits in the same rotating frame; we drive the qubits using tones with the same frequency to synchronize their XY axes. However, driving one qubit, which is in resonance with other qubits, requires careful attention when implementing single qubit gates. Due to the microwave crosstalk, one microwave pulse can considerably drive multiple qubits at the same time. To resolve this issue, we actively cancel out the microwave crosstalk by applying cancellation tones simultaneously (see Appendix K for details).
We apply cancellation drives to orthonormalize the XY control and find a remaining crosstalk of below 3 × 10 −5 (Fig. 22e). Single-qubit RB measurement data for QB2. "S" denotes the simultaneous application of single-qubit Cliffords, "I" denotes the isolated application of single-qubit Cliffords. The pulse duration of X-and Y-rotation gates is 70 ns. Both QB1 and QB2 are biased at 4.16 GHz. We apply cancellation pulses to offset microwave crosstalk (orange and green circles) and reduce the gate errors by more than a factor of 10. The data are averaged over 20 random sequences for each sequence length.
Appendix M: Two-qubit Clifford Randomized benchmarking for the iSWAP gate Following Refs. [42,44], we construct the two-qubit Clifford group, which has four distinct classes as shown in Fig. 25. The single-qubit Clifford group C 1 is the group of 24 single-qubit rotations, which can be written in terms of the X-and Y-rotations [44]. One of three-element single-qubit rotation groups S 1 is shown in Table. V.
By rewriting the CNOT and the SWAP in terms of the iSWAP (Fig. 26), we generate the two-qubit Cliffords in terms of the iSWAP and single-qubit XY gates as shown in Fig. 27. Our native iSWAP gate accompanies single-qubit Z rotations since the qubit frequencies are dynamically tuned during the gate. We undo these unwanted Z-rotations by incorporating compensating Zrotations into the existing single qubit gates that are either preceded or followed by the iSWAP gate. For example, in the case of "the iSWAP-like Cliffords" (Fig. 27), we update the single-qubit gates that are preceded by an iSWAP gate such that they undo the Z-rotations of the iSWAP gate. Specifically, we replace the corresponding single-qubit Clifford gate (C 1 ) by three rotation gates along x − y − x axes, which can implement an arbitrary single-qubit rotation according to Euler's rotation theorem (see also Appendix P).
We now calculate the average gate composition for the two-qubit Cliffords. The single-qubit class has 576 elements and contains 90/24 single qubit gates per element, on average. The CNOT-like class has 5184 elements and contains 2 iSWAP gates and 13 single qubit gates per element, on average. The iSWAP-like class has 5184 elements and contains 1 iSWAP gate and 70/3 single qubit gates per element, on average. The SWAP-like class has 576 elements and contains 3 iSWAP gates and 14 single qubit gates per element, on average. Given these, we find that the two-qubit Cliffords are composed of 1.5 iSWAP gates and 10.9375 single qubit gates, on average. Therefore, the average error rate of twoqubit Cliffords can be calculated as follows: r Clifford = 10.9375 × r 1qb + 1.5 × r iSWAP , where r 1qb and r iSWAP are the average error rates of single-qubit gates and an iSWAP gate, respectively.
To characterize the two-qubit interaction fidelity of the iSWAP gate, we perform interleaved randomized benchmarking [42][43][44]. The benchmarking sequences are illustrated in Fig. 28. Note that when interleaving the iSWAP gate, we incorporate compensating Z-rotations into the subsequent reference Clifford. For example, in the case of "1QB-gates-like" reference Clifford, the constituent single-qubit Clifford C 1 on each qubit is replaced by three single-qubit gates along x − y − x axes. Other two-qubit Clifford classes already have three single-qubit gates along x−y −x axes at their front ends (see Fig. 27), so they do not require additional XY gates when interleaving the iSWAP gate. As a consequence, the iSWAPinterleaved sequence acquires additional 0.1125 XY gates on average. The error contribution of additional XY gates is accounted for when estimating the two-qubit interaction fidelity of the iSWAP gate.
(a) Decomposition of the CNOT gate into the iSWAP gates. (b) Decomposition of the SWAP gate into the iSWAP gates.
CNOT-like class 1QB-Gates-like class Two-qubit Clifford classes written in terms of the iSWAP gate and single-qubit gates. Since our native iSWAP gate accompanies additional unwanted single-qubit Zrotations, we incorporate compensating Z-rotations into the existing single qubit gates that are either preceded or followed by the iSWAP gate to undo the unwanted Z-rotations. The orange colored arrows denote which single-qubit gates are subject to be updated to undo the Z-rotations of the iSWAP. . 28. (a) A diagram of the standard (or reference) twoqubit RB sequence. (b) A diagram of the two-qubit RB sequence interleaved by the iSWAP gate. The additional unwanted Z-rotations of the interleaved iSWAP gate are canceled out by the subsequent two-qubit Clifford (orange arrows).
Appendix N: Two-qubit gate tune-up procedures We calibrate the CZ gate by adjusting the Z control amplitudes for a fixed gate length (60 ns) and measuring the leakage from |101 and the conditional phase (CPHASE) angle φ CZ . A control pulse for the CZ gate and pulse sequences for these measurements are illustrated at the top of Fig. 29. To measure the leakage from |101 , we prepare |101 by applying π pulses to both QB1 and QB2 and measure the state population of |101 after a CZ gate (Fig. 29a). To measure the CPHASE angle, we perform a cross-Ramsey type experiment, which measures the conditional phase accumulation of QB1, while initializing QB2 in its ground state or excited state (Fig. 29b).
We find the optimal spot (red circles in Figs. 29a and b) in the parameter space for the CZ gate, which minimizes both the leakage and the CPHASE angle error (≡ φ CZ − 180 • ). Notably, the measured data has a slight tilt (the leakage and the CPHASE angle data are not symmetric about their x-axes: the qubit-qubit detuning) due to the level repulsion induced by qubit-coupler interactions. These tune-up measurements are qualitatively reproduced by time-dependent Hamiltonian simulations for three-interacting qutrits (Figs. 29c and d). See Appendix Q for details about the simulations.
Near the optimal spot, we note that the leakage is predominantly controlled by the CPLR Z-pulse amplitude (the y axes of the plots), while the CPHASE angle is controlled by the QB2-Z pulse amplitude (the x axes of the plots). Keeping this in mind, we individually adjust the CPLR-Z pulse amplitude and the QB2-Z pulse amplitude by measuring the leakage and the CPHASE angle error, respectively. For fine-adjustment of the amplitudes, we measure multiple CZ pulses to amplify the effects of small amplitude errors (Fig. 30). The measurement data exhibits symmetric chevron patterns that allow us to easily find optimal values for the pulse amplitudes to minimize the leakage and the CPHASE angle error (≡ φ CZ − N CZ × 180 • ). We repeat this class of fine-tuning measurements 2-3 times within narrower amplitude ranges so that we can make the most precise adjustments possible (ultimately limited by the amplitude resolution limit of our AWGs).
Finally, to offset single-qubit phase accumulation that accompanies the CZ gate, we subsequently apply virtual Z gates [45]. To calibrate these virtual Z gates, we perform Ramsey experiments on QB1 and QB2 and measure the single-qubit phase accumulation of each qubit due to the CZ gate. Fine-tuning the angles of the virtual Z gates is done by a numerical optimization method (the Nelder-Mead algorithm) with the fidelity of twoqubit randomized benchmarking sequences as an objective function [78].
Along the same line, we calibrate the iSWAP gate by adjusting the Z control pulse amplitudes for a fixed gate length (30 ns) and measure the swap angle (Fig. 31). The swap angle θ iSWAP quantifies how much the popula- Tune-up measurements for the CZ gate. (a,b) Experimental data of tune-up measurements for a 60 nslong CZ gate. We measure leakage from |101 and conditional phase (CPHASE) angle φCZ as functions of QB2 Z-pulse amplitude (x axis) and CPLR Z-pulse amplitude (y axis). The control pulse and sequences to measure leakage and CPHASE angle are illustrated at the top. We find an optimal parameter set that minimizes both the leakage and CPHASE angle error (red circles). (c,d) Numeric simulation reproducing the experimental data of tune-up measurements.
tion of QB1 has been transferred to QB2 and vice versa. Accordingly, to measure θ iSWAP , we prepare |100 and measure how much the population of |100 has transferred to the population of |001 by an iSWAP gate (θ iSWAP ≡ tan −1 (p |001 /p |100 ), where p |001 and p |100 (a) Experimental data of tune-up measurements for a 30 nslong iSWAP gate. We measure the iSWAP angle θiSWAP as functions of QB2 Z-pulse amplitude (x axis) and CPLR Zpulse amplitude (y axis). The control sequence to measure iSWAP angle is illustrated at the top. A red circle denotes an optimal parameter set that maximizes θiSWAP. (b) Numeric simulation reproducing the experimental data of tune-up measurements.
are the measured populations of |001 and |100 , respectively, at the end). We find an optimal spot for the iSWAP gate (red circle in Fig. 31a) which maximizes θ iSWAP (0 • ≤ θ iSWAP ≤ 90 • ). In Fig. 31b, we numerically simulate this tune-up measurement and show good qualitative agreement with the experimental result.
To finely adjust the CPLR-Z pulse amplitude and the QB2-Z pulse amplitude, we measure multiple iSWAP pulses (Fig. 32). Since the swap angle is controlled by both the CPLR-Z and QB2-Z amplitudes, we adjust the both amplitudes in an alternating manner-that is, adjusting the amplitudes of QB2-Z, CPLR-Z, QB2-Z, CPLR-Z, · · · -with varying the number of iSWAP pulses (N iSWAP ∈ {21, 51, 101}).
Finally, to offset single-qubit phase accumulation that is accompanied when performing the iSWAP gate, we apply actual Z gates using XY control (see Appendix P for details). To calibrate these Z gates, we perform Ramsey experiments on QB1 and QB2 and measure the single-qubit phase accumulation of each qubit due to the iSWAP gate. As in the case of the CZ gate, we numerically search the optimal angles of the Z gates that maximize the sequence fidelity of two-qubit RB sequences. Appendix O: Residual leakage of the CZ gate Following Refs. [79,80], we estimate the average leakage rate of our 60 ns-long CZ gate with an optimized pulse shape from our interleaved randomized benchmarking measurement (Fig. 4(h)). To estimate the leakage rate, we fit the population in the computational subspace p X1 ≡ p |000 + p |001 + p |100 + p |101 to an exponential model (for both reference and interleaved RB curves, see To ensure accurate uncertainties of the leakage rates, we perform a weighted least-squares fit using the inverse of variance as the weights. Then we estimate the average leakage rates L 1,ref , L 1,int per Clifford as follows.
The average leakage rate L CZ 1 per CZ gate is subsequently obtained by the following equation, The leakage rate L CZ 1 per CZ gate is estimated of 0.06 ± 0.07%. We find that the most of the residual leakage is introduced into the second excited state of QB1 (1 − p X1 ≈ p |200 +p |201 ), which indicates the residual leakage may be due to residual pulse distortion in Z-control pulses of QB2 and CPLR. Characterization of the leakage rate of the CZ gate with an optimized pulse shape. Population in the computational subspace pX 1 = p |000 + p |001 + p |100 + p |101 for the reference and interleaved two-qubit randomized benchmarking sequences.
Appendix P: Z-corrections for the two-qubit gates Two-qubit gates are accompanied by local phase shifts (single-qubit Z-rotations), since the qubit frequencies are dynamically tuned during the gates. To undo these phase shifts, we apply additional single Z-rotations either before or after the two-qubit gates. In the case of the CZ gate, we utilize virtual Z gates [45] which are simply implemented by shifting phase offsets of microwave pulses. In contrast, in the case of the iSWAP gate, we implement actual Z gates since the iSWAP gate that we consider in this work is not compatible with virtual Z gates in general [45].
We implement actual Z rotations by combining X and Y rotations. According to Euler's rotation theorem, any rotation matrix can be described by the multiplication of three rotation matrices along x − y − x axes. Subsequently, arbitrary Z gates (we call the Euler-Z gate) with rotation angle θ z can be implemented by a series of Xand Y-rotations: R x (−π/2) − R y (θ z ) − R x (π/2), where R x , R y are single-qubit rotations along the x and y axes, respectively (Fig. 34).
Euler-Z (θ z )   Each T1 contribution is computed by taking the difference between the gate errors in the presence and the absence of corresponding energy relaxation. We find that the sum of individual T1 contributions is approximately equal to the total T1-induced error computed from a separate simulation that takes all possible relaxations into account (purple circles in Fig. 35). Fig. 35b shows the average gate errors of the iSWAP gate in the absence and the presence of energy relaxations. The lowest gate error is achieved when t G ≈ 25 ns; this is the point where the residual ZZ interaction of the iSWAP is minimized, owing to the cancellation induced by the higher level of CPLR. The T 1 contributions of QB1, QB2, and CPLR to the 30 ns-long iSWAP gate error are summarized in Table. VII.
Appendix S: 1/f α flux noise contribution to gate errors We simulate the error contribution of 1/f α flux noise, which predominantly limits the dephasing times T * 2 , T echo 2 of QB2 and CPLR. While there are other noise sources affecting the qubits and coupler such as charge noise and photon-shot noise, we assume that their impacts are negligible. Specifically, our transmon qubits and coupler have large E J /E c (> 60), which makes their charge dispersions smaller than ≈ 1 kHz such that they are insensitive to charge noise. Notably, owing to large E J /E c and negligible photon shot noise, QB1 (which has a fixed-frequency) exhibits nearly T 1 -limited dephasing times T * 2 ≈ T 1 , T echo 2 ≈ 2T 1 .
We extract the power spectral density S Φ (f ) ≈ of flux noise from the repeated Ramsey and echo experiments of QB2 (see Appendix C for details). We simulate contribution of 1/f α flux noise to gate error by performing Monte Carlo simulations with 1,000 random flux noise realizations; flux noise samples are generated based on the power spectral density S Φ (f ). We assume that QB2 and CPLR experience the same flux noise power (but we independently sample noise waveforms for QB2 and CPLR). We set the low-cutoff frequency of the noise PSD at 10 −4 Hz, which is close to 1/(2× the total length of RB experiment). We set the high-cutoff frequency at 5 × 10 9 Hz, which is close to the |0 → |1 transition frequencies of the qubits and coupler; following Ref. [84], we assume that only the noise below the qubit frequency results in the phase noise. Contribution of 1/f α flux noise to error rates of CZ gates. Notably, the error contribution from CPLR flux noise increases as the gate duration decreases (orange circles), possibly due to stronger qubit-coupler hybridization.  Figs. 36 and 37 show the estimated contribution of the flux noise to gate error rates for CZ and iSWAP gates, respectively. We extract the error contribution of 1/f α flux noise by subtracting the error rate in the presence of flux noise by the error rate in the absence of flux noise. We compare the flux noise induced error rates and T1-induced error rates for 60 ns-long CZ and 30 ns-long iSWAP gates in Table VIII. We find that flux noise contribution is quite small. This is because our gate lengths are short (30-60 ns) such that the impact of long-time correlated noise, i.e., 1/f noise, is significantly suppressed. This result is consistent with Ref. [25]. It also explains how flux-tunable qubits can achieve highfidelity gates (both single-and two-qubit gates) in general, even though they operate at flux sensitive points at the idling configuration.