Fast high-fidelity gates for galvanically-coupled fluxonium qubits using strong flux modulation

Long coherence times, large anharmonicity and robust charge-noise insensitivity render fluxonium qubits an interesting alternative to transmons. Recent experiments have demonstrated record coherence times for low-frequency fluxonia. Here, we propose a galvanic-coupling scheme with flux-tunable $\textit{XX}$ coupling. To implement a high-fidelity entangling $\sqrt{i\mathrm{SWAP}}$ gate, we modulate the strength of this coupling and devise variable-time identity gates to synchronize required single-qubit operations. Both types of gates are implemented using strong ac flux drives, lasting for only a few drive periods. We employ a theoretical framework capable of capturing qubit dynamics beyond the rotating-wave approximation (RWA) as required for such strong drives. We predict an open-system fidelity of $F>0.999$ for the $\sqrt{i\mathrm{SWAP}}$ gate under realistic conditions.


I. INTRODUCTION
A major challenge in the field of quantum computing is to break free from the imperfections characteristic of the noisy intermediate-scale quantum (NISQ) [1] era. For superconducting qubits, this will specifically require further improvements of two-qubit gate performance beyond the current stateof-the-art, with errors on the order of 10 −3 − 10 −4 [2][3][4]. To reach even lower two-qubit gate infidelities, it is worth reexamining the framework routinely used for developing the pulse trains which generate the gates of interest. Most commonly, this framework is intimately linked to the use of the rotating-wave approximation (RWA). This approximation is highly convenient as it can help remove fast time dependence from the Hamiltonian, yields an intuitive picture of the dynamics, and makes calculations particularly simple [5]. However, the range of validity of the RWA is limited and reliance on it constrains the parameter space explorable for maximizing gate fidelities. For low-frequency qubits such as heavy fluxonium [6][7][8], this is particularly unfortunate as higher gate fidelities can indeed be achieved for parameter choices outside the reach of the RWA.
We consider fluxonium qubits linked galvanically via a flux-tunable coupler. To avoid directly exciting the coupler degrees of freedom, the interaction between the qubits and the coupler is chosen to be dispersive. This allows for an effective description in which the coupler is eliminated, but mediates a tunable XX interaction. We show that the strength of this effective XX coupling changes sign as a function of coupler flux and thus passes through zero. In addition, we find that the parasitic ZZ interaction strength is suppressed, which is a general feature of coupled systems of low-frequency fluxonium qubits [18].
We describe how to execute two-qubit gates via sinusoidal modulation of the coupler flux for a duration as short as a few drive periods. Based on our analysis outside the RWA regime, we find that the implemented entangling operations generally differ from named gates by relative phases. We compensate for these phase factors using single-qubit Z rotations to obtain a high-fidelity √ iSWAP gate. Ordinarily, single-qubit gates are designed in the convenient regime where RWA applies [5]. In this scenario, the switch into a frame co-rotating with the drive renders operations simple rotations about fixed axes in the Bloch-sphere picture. Identity gates are obtained by idling and single-qubit Z rotations are obtained either e.g. "virtually" by modifying the phase of the drive field in software [34,35] or "physically" by detuning the qubit frequency from that of the drive [36]. The situation is reversed for systems of heavy fluxonium qubits [6,7] where drive strengths exceeding the RWA-range are employed -motivating the use of the laboratory frame for qubit operations [8,13]. In this frame, qubits acquire dynamical phases in the absence of control pulses; in other words, idling yields Z rotations of each qubit [8,13]. To synchro-nize gates in multi-qubit systems, identity gates of variable time duration must be devised. We show that identity gates can again be implemented using sinusoidal modulation of the qubit fluxes for only a single drive period, resulting in ultrafast I gates (when compared to the single-qubit Larmor period). Combining the identity gates with single-qubit Z rotations assists in achieving high-fidelity entangling gates. This paper is organized as follows. We introduce the coupling scheme in Sec. II, and derive the full-circuit Hamiltonian and its effective counterpart governing the low-energy physics. We describe the implementation of single-qubit gates in Sec. III focusing specifically on the needed identity gates. In Sec. IV we detail our scheme for performing high-fidelity two-qubit gates, including the necessary single-qubit Z rotations and identity gates. In Sec. V we summarize our results and provide an outlook on future work.

II. GALVANICALLY-COUPLED HEAVY-FLUXONIUM QUBITS
We consider galvanically linking two fluxonium qubits via a flux-tunable coupler. We bias the heavy-fluxonium qubits at their half-flux sweet spots, as such fluxonia are linearly insensitive to flux noise and thus have achieved record coherence times [8,16,17]. The qubit states are delocalized over two neighboring wells of the fluxonium potential, with the qubit frequency set by the tunnel splitting [37,38]. Single-qubit gates can be achieved by tuning the external flux away from the sweet spot, activating a transverse X interaction [8]. A galvanic-coupling scheme helps yield strong coupling strengths, as those are quantified by phase matrix elements rather than charge matrix elements (as would be the case for capacitive coupling). Phase matrix elements are not suppressed by the qubit frequency [27], making a galvanic-coupling architecture attractive for achieving entangling gates on low-frequency qubits. Moreover, for fluxonium qubits whose inductance is kinetic [38] rather than geometric [39], a galvanic connection can generally yield stronger coupling strengths than those achieved through mutual inductance alone [39]. To make the coupling strength tunable, we generalize the so-called "fluxonium molecule" circuit introduced in Ref. [40] by inserting a coupler Josephson junction as shown in Fig. 1

. The circuit Hamiltonian is
where µ obeys the correspondence a → 0, b → 1 when appearing in an exponent. See Appendix A for details on the full derivation of the Hamiltonian H. We have defined the inductive energy of the coupler E Lc = 1 2 (E L + E L ), and the charging energies E Cµ = e 2 /2C µ , µ = a, b, E C+ = e 2 /(2[C/2]), E C− = e 2 /(2[C c + C/2]), where the remaining circuit parameters can be read off from Fig. 1. The node variables ϕ a , ϕ b are the qubit variables, while the coupler variables are defined as θ ± = ϕ 1 ± ϕ 2 . We have isolated the qubit-flux shifts away from the sweet spot δφ µ = φ µ − π, where φ µ = 2πΦ µ /Φ 0 is the reduced external flux, Φ µ is the external flux in the corresponding loop and Φ 0 = h/2e is the superconducting flux quantum.
The Hamiltonian H is composed of two fluxonium qubits and two coupler degrees of freedom, where the qubits interact with the coupler via terms in V . The coupler θ + degree of freedom is harmonic, while the coupler θ − degree of freedom is fluxonium like and thus tunable by external flux. It is important to note that there is no term in the Hamiltonian directly coupling the qubits, thus the qubit-qubit interaction is entirely mediated by the coupler. Here we assume symmetric qubit inductors, coupler inductors, and stray coupler capacitances, respectively, see Fig. 1. In this case, there are no terms in the Hamiltonian that directly couple the coupler degrees of freedom. This assumption is relaxed in Appendix A where we derive the Hamiltonian in the presence of disorder in circuit parameters.
The coupler should have the following two desired properties. The first is it must allow for the execution of high-fidelity two-qubit gates. A straightforward way to achieve this goal is to ensure that the interaction of the qubits with the cou- pler degrees of freedom is dispersive, allowing for an effective description in terms of two coupled qubits. The second requirement is that the two-qubit coupling strength should be sufficiently flux dependent, allowing for tuning from zero to values that allow for fast gates compared with the coherence times T 1 , T 2 of each qubit. In the next section, we derive the effective Hamiltonian of the system assuming that the qubitcoupler interaction is dispersive. Following this, we discuss specific coupler parameter choices that satisfy the above requirements.

A. Low-energy effective Hamiltonian
Near the half-flux sweet spots for each qubit, the qubit excitation energies are small compared with the energy needed to excite the coupler or higher-lying fluxonium states. These energy scales naturally define two subspaces: the low-energy subspace defined by the projector onto the computational states P = ,m=0,1 | a , m b , 0 − , 0 + a , m b , 0 − , 0 + | and the high-energy subspace spanned by all other states. We have defined the bare states | a , m b , n − , p + that are eigenstates of H 0 with eigenenergies E a l + E b m + E − n + pω + . The variables , m, n, p correspond to the number of excitations in the degrees of freedom corresponding to the variables ϕ a , ϕ b , θ − , θ + , respectively. The coupler θ − mode is fluxonium like, however operated in a different parameter regime than the heavy-fluxonium qubits. The bare wave functions and spectra of qubit a and the coupler θ − mode are shown in Fig. 2. We discuss below in detail the ideal parameter regime in which to operate the coupler modes.
States in different subspaces are coupled by the perturbation V , which is small compared with the relevant energy separations. Thus, the interaction is dispersive and we can obtain an effective description of the low-energy physics via a Schrieffer-Wolff transformation [5,41]. This is done by introducing a unitary e −S with anti-hermitian generator S that decouples the high-and low-energy subspaces order-by-order. We find the effective Hamiltonian up to second order upon projecting onto the low-energy subspace This transformation is carried out in detail in Appendix B.
The Hamiltonian H eff describes two qubits with frequencies ω µ = ω µ + χ µ , where ω µ = E µ 1 − E µ 0 are the bare qubit frequencies (here and in the following we set = 1) and χ µ are the Lamb shifts. The qubits are coupled via a transverse XX interaction with strength J that is tunable with coupler flux φ c . There are additional single-qubit X terms with strength  Table I for device parameters.
Ω µ that depend on the coupler flux φ c as well as the qubit fluxes δφ µ . We show below that both J and Ω µ can be tuned through zero, yielding two decoupled qubits. The coefficient Ω µ is defined as and arises at first-order in perturbation theory from two contributions. The first term on the right-hand side is due to a qubit-flux offset from the sweet spot δφ µ , while the second is from the coupling between the qubits and the coupler θ − mode. The matrix element 0 − |θ − |0 − is φ c dependent and generally nonzero due to the absence of selection rules for a fluxonium biased away from a sweet spot [42,43]. Interpreting this second term as an effective flux shift away from the sweet spot for each qubit, we cancel this shift by setting Thus, we obtain the coupler-flux dependent "sweet-spot contour" shown in Fig. 3. It is important to note that this phenomenon is independent of effects due to geometric flux crosstalk and arises instead directly from coupling terms in V . The two-qubit interaction arises at second-order in perturbation theory, with strength J = J − − J + . The coefficient J − (J + ) is due to interaction of the qubit with the coupler θ − (θ + ) mode. The strength of the interaction J − is tunable due to the dependence of the matrix elements 0 − |θ − |n − and energies E − n on coupler flux, see Appendix B for details. The coefficient J + is static, thus the two-qubit interaction is eliminated by tuning J − to equal J + in magnitude [44]. We can understand the XX nature of the two-qubit coupling by considering the terms appearing in V that mediate the interaction between different subsystems. At the qubit sweet spots and considering only the computational states, the operators ϕ µ are off-diagonal and therefore proportional to σ µ x with proper choice of phases. Thus, the effective two-qubit interaction consists of a virtual second-order process whereby an excitation is exchanged between the two qubits, or both qubits are co-excited or co-de-excited.
In what follows, we always operate from the dc flux bias point on the sweet-spot contour Ω µ = 0 where the two-qubit coupling is turned off J = 0, see Fig. 3. We do this to keep both qubits at their respective sweet spots and to prevent any unwanted parasitic entanglement between the qubits. We refer to this configuration of dc fluxes as the "off position." Both single-and two-qubit gates are performed by ac flux excursions about this point. Note that the value of the coupler flux φ c at the off position is generally parameter dependent.

B. Parameter regime of the coupler
To obtain an effective description in terms of two-coupled qubits, parameter choices for the coupler should support a ELc. The star marks the chosen parameters, Table I. dispersive interaction. Additionally, we require that the coupling strength be sufficiently flux dependent, allowing both for the execution of fast gates and for the interaction to be efficiently turned off. We quantify the dispersiveness of the interaction by calculating the Lamb shift χ a (we obtain similar results in the following utilizing instead χ b ). Considering the requirements on flux dependence, we calculate the slope of the coupling strength J − with respect to Φ c at the off position. We target parameters such that |∂ Φc J − |/h ≈ 100 MHz/Φ 0 to achieve MHz level coupling strengths (implying fast gates compared with T 1 , T 2 ) for small ac flux excursions Φ c 0.03 Φ 0 where a linear relationship between the coupling strength J and flux Φ c is expected to be valid. This value of the slope also ensures that the device remains insensitive to typical 1/f flux noise amplitudes A Φ ≈ 1µΦ 0 [45].
We sweep over E Jc and E C− and calculate |∂ Φc J − | as well as χ a at the off position, see Fig. 4. We fix E L /h = 2 GHz (E Lc /h = 1.1 GHz), however we obtain similar results when considering instead larger or smaller values of E L . It is worth emphasizing that the off position is parameter dependent, thus we reposition the dc fluxes appropriately for each parameter set. For relatively large E Jc and small E C− , the lowest-lying states at intermediate flux values localize in minima of the cosine potential. The off position is then generally near the sweet spot, where the vanishing energy difference between the states |0 − , |1 − , as well as the rapid increase in the value of the matrix element 0 − |θ − |1 − enable flux tunability of J − . These factors in turn imply extreme sensitivity to flux |∂ Φc J − |/h 100 MHz/Φ 0 as well as a breakdown of the dispersive regime. For relatively small E Jc and large E C− , flux tunability is lost as the spectrum is nearly harmonic. For decreasing E Jc and E C− , excitation energies are suppressed leading to a breakdown of the dispersive interaction. The pa-rameter regime that supports both a dispersive interaction and "Goldilocks" flux dependence is thus E C− > E Jc E Lc . The parameters E C+ , E Lc that define the coupler θ + mode are implied by the parameter choices for the coupler θ − mode, with the restriction E C+ > E C− due to the finite junction capacitance. The parameters used in the remainder of this work are given in Table I.

C. Numerical results
We compute the low-energy spectra of the full-model Hamiltonian H as well as the effective Hamiltonian H eff and plot the results in Fig. 5(a). We vary the coupler flux along the contour C shown in Fig. 3 to ensure that the qubits remain at their sweet spots. Relative deviations between the two spectra are at the level of a percent or less, indicating that the exact results can be accurately described by an effective model of two qubits coupled by a tunable XX interaction. The value of the tunable-coupling coefficient J is shown in Fig. 5(b) and crosses through zero at φ c ≈ 0.27 · 2π. At this position in flux space, the coupler is in the "off" state.
To quantify the on-off ratio of the tunable coupler, we numerically calculate the strength of the parasitic ZZ interaction using the formula ζ ZZ = E 1100 −E 1000 −E 0100 +E 0000 [18].
The eigenenergy E ijnp of the dressed state |i a , j b , n − , p + is found by numerically diagonalizing the full model Hamiltonian H. The ZZ interaction strength ζ ZZ is less than 0.3 h·kHz at the off position for the parameters considered here, implying an on-off ratio on the order of 10 5 . It is a general feature of coupled systems of low-frequency fluxonia that ZZ interaction strengths are suppressed, due to the small repulsions between computational and non-computational states [18].

III. SINGLE-QUBIT GATES
There are important differences between how single-qubit gates are performed on high frequency qubits like transmons and how they are executed on low-frequency qubits like those studied here. For transmon qubits, drive strengths are typically small compared with the qubit frequency. It is then appropriate to move into a frame co-rotating with the drive frequency (typically on or near resonance with the qubit frequency) and perform the RWA [5,46]. The rotating-frame Hamiltonian is now time independent, allowing for the relatively straightforward calculation of time-evolution operators (propagators). Observe that in this rotating frame, idling corresponds to an identity operation (assuming a resonant drive). In contrast, to obtain fast gates for low-frequency qubits like heavy fluxonium [8] or superconducting composite qubits [13], drive strengths typically equal or exceed the qubit frequencies. Thus, gates are typically performed in the laboratory frame as it is not appropriate to move into a rotating frame like that described above [8,[13][14][15]. In the lab frame, qubit states acquire dynamical phase factors while idling. Indeed we utilize these Z rotations in Sec. IV for achieving a highfidelity we obtain an identity operation (up to an overall sign) only by idling for exact multiples of the Larmor period τ q = 2π/ω q , where ω q is the qubit frequency. If we now consider multiple qubits with non-commensurate frequencies, it is not obvious how to perform an operation on one qubit without a second qubit acquiring dynamical phase during the gate time. Therefore, we seek an active means of obtaining variable-time identity operations for low-frequency qubits. Single-qubit X/2 and Y/2 gates can be obtained using the techniques described in e.g. Refs. [8,13], allowing for universal control when combined with arbitrary Z rotations achieved by idling. We utilize flux pulses that begin and end at zero and whose shapes are described by sinusoidal functions, but that only last for a single period [13]. This pulse shape is chosen because the external flux averages to zero [47], helping eliminate longtimescale distortions [48]. The Hamiltonian of a single fluxonium biased at the half-flux sweet spot and subject to a sinu- Projecting onto the computational subspace yields [8] defining the effective drive amplitude A = E L 0|ϕ|1 δφ and making use of selection rules at the half-flux sweet spot. For typical heavy-fluxonium parameters such as those chosen for qubits a and b, the amplitude of the drive A exceeds the qubit frequency ω q for deviations from the sweet spot as small as δφ = 0.02 · 2π. Indeed, such strong drives have been used to implement fast single-qubit gates with high fidelities [8,13].
Here, we utilize similarly strong drives for the implementation of identity pulses. We seek conditions on the drive strength A and frequency ω d such that the propagator U q (t) is equal to the identity operation after a single drive period, U q (t = 2π/ω d ) = 1 1. The propagator satisfies the time-dependent Schrödinger equation with the initial condition U q (0) = 1 1. In the regime where the qubit frequency ω q is small compared to the drive amplitude A, it is appropriate to move into the interaction picture defined with respect to the drive. This transformation is achieved via the unitary The interaction-frame Hamiltonian is while the propagator in this frame U q (t) satisfies dU q (t)/dt = −iH q U q (t). To obtain an approximation to the propagator U q (t) we carry out a Magnus expansion [9][10][11] in which the propagator is assumed to take an exponential form ). The first term in this series is ∆ 1 (t) = −i t 0 H q (t )dt , and the formulas for terms up to fourth order are given in e.g. Refs. [10,11]. We truncate the Magnus series after the first term, as we find that higher-order terms are generally small and can be neglected. The propagator at the conclusion of the pulse is where τ d = 2π/ω d and J 0 is the zeroth-order Bessel function of the first kind. The propagators in the lab and interaction frames are related by U q (t) = U 0 (t)U q (t)U † 0 (0). Because the lab and interaction frames coincide at t = 0 and t = τ d , the propagators in the lab and interaction frames are the same at the conclusion of the pulse. To obtain an identity gate, the general solution is which is an equation for the variables A, ω d . Solutions which avoid fixing ω d based on the value of ω q are those for r = 0 which satisfy where j k is the k th zero of J 0 . Thus, by choosing a combination of drive amplitude A and frequency ω d (and thus gate time) obeying Eq. (14), we obtain a variable-time identity gate. We note that it is also possible to arrive at Eq. (13) using a perturbative analysis in the context of Floquet theory [49]. We present numerical results illustrating that the proposed identity gates can be achieved with high fidelity. To calculate the closed-system fidelity of a quantum operation we utilize the formula [50] where d is the dimension of the relevant subspace of the Hilbert space, U T is the target unitary and U is the projection onto the d dimensional subspace of the propagator realized by time evolution. This formula is especially useful when considering systems where leakage may be an issue; in such cases, deviations of the operator U from unitarity are penalized by the term Tr(U † U ). To obtain the propagator associated with time evolution under the Hamiltonian H fl (t), it is most appropriate to express H fl (t) in the eigenbasis of the static Hamiltonian H π . The qubit states are the two lowestenergy states, and we retain up to eight eigenstates to monitor leakage. Diagonalization of H π is done using scqubits [51], while time-dependent simulations are performed using QuTiP [52]. Sweeping over the drive frequency and amplitude of the flux pulse, we monitor the fidelity of an identity operation, taking d = 2 and U T = 1 1 in Eq. (15), see Fig. 6.
Regions of high fidelity appear as "fingers" in the space of inverse gate time (drive frequency ω d ) and effective drive amplitude A. The colored lines are given by A = j k ω d /2 for k = 1, 2, 3, 4, 5, corresponding to the drive parameters that analytically predict identity gates. These lines overlap with the regions of high fidelity computed numerically for large amplitude A compared with the qubit frequency ω q . For decreasing ω d and A, the lines begin to deviate from the highfidelity fingers due to the breakdown of the Magnus expansion [10]. Nevertheless, we find numerically that high-fidelity F > 0.9999 identity gates can be achieved across a wide range of inverse gate times 0.5 ω d /ω q 7. Leakage outside the computational subspace is negligible for the parameters considered here.
The time evolution of the qubit states in the lab frame in the form of trajectories on the Bloch sphere during identity pulses is shown in Fig. 7. The drive frequencies ω d used in Figs. 7(a)-(c) are ω d /ω q = 5.3, 3.3, 2.1, with drive amplitudes A obtained from Eq. (14) using the Bessel zeros j 1 , j 1 , j 2 , respectively. In all three cases we obtain fidelities of F ≥ 0.9998. Numerical optimization of the drive amplitude keeping the drive frequency fixed yields F > 0.99999 in each case. The optimized amplitude generally differs from the amplitude derived analytically by less than or of the order of a percent. In this section, we analyzed an isolated heavy fluxonium subject to an ac flux drive. The generalization to fluxonia embedded in the architecture considered in this work is straightforward. At the off position, the operator activated by a flux drive on qubit a and b is to a good approximation of the form of XI and IX, respectively, see Appendix C for details. Thus, the identity-gate protocol is immediately applicable to each individual qubit.

IV. TWO-QUBIT ENTANGLING GATE
When performing two-qubit gates on low-frequency fluxonium qubits, we encounter similar issues to those present for single-qubit gates. To achieve relatively fast gate times compared with T 1 and T 2 , we utilize drive strengths where the RWA is invalid. We perform a Magnus expansion in a frame co-rotating with the qubit frequencies to account for the counter-rotating terms order-by-order. We note that similar results can be obtained using a Floquet analysis [53,54]. Because single-qubit gates are performed in the lab frame, we then transform the rotating-frame propagator back into the lab frame.
To activate a two-qubit interaction, we consider an ac si- where we emphasize that gates are performed in the basis of the dressed states, see Appendix C for details. The Pauli matrices are defined as e.g.σ a x = j=0,1 |0j 1j| + H.c., etc., utilizing the shorthand |i a j b ≡ |i a , j b , 0 − , 0 + . The effective drive amplitude A ≡ J ac δφ c is defined in terms of the ac two-qubit coupling strength J ac given in Eq. (C15). For our parameters, J ac = 18.3 MHz. Just as for single-qubit gates, we drive with sin ω d t rather than cos ω d t because we intend to activate the interaction for only one or a few drive periods n [55]. In general, the propagator at the final time τ n = 2πn/ω d is where T is the time-ordering operator and |a| 2 + |d| 2 = |b| 2 + |c| 2 = 1. To obtain an entangling gate, we target drive parameters A, ω d that yield |b| = |c| = 1/ √ 2 and d = 0 [56]. We parametrize this gate as To see that this gate is entangling, note either that it can produce Bell states or that it can be transformed into the entangling gate using only single-qubit operations [57]. One such transformation using single-qubit gates is where Expressions for the Z rotation angles in Eq. (20) in terms of α, β and γ are specified below. The relationship (20) provides an explicit recipe for constructing a √ iSWAP gate, given a √ φSWAP gate and arbitrary single-qubit Z rotations. Quantum algorithms are typically written in terms of named gates like √ iSWAP [58], as opposed to the native gate √ φSWAP achieved here. Thus, it may be useful to immediately transform the obtained √ φSWAP gate into the more familiar √ iSWAP. This is the strategy we pursue here. Generally, only three of the Z rotations in Eq. (20) are necessary. We make use of the freedom of the extra Z rotation by choosing the angle θ b2 ∈ [0, 2π) that optimizes the overall gate time, including the Z rotations. The remaining angles are set to to satisfy Eq. (20). In the following, we find explicit expressions for α, β and γ in terms of the drive parameters and qubit frequencies. Because we operate in the lab frame, these Z rotations are obtained by idling. Idle times for coincident Z rotations may differ in general, therefore to synchronize the time spent performing single-qubit gates we make use of the variable-time single-qubit identity gates discussed in Sec. III.

A. Constructing √ φSWAP
The propagator √ φSWAP can be obtained from time evolution under the Hamiltonian H 2q (t) as follows. The qubit frequencies ω a , ω b are fixed by operating the qubits at their sweet spots, while the drive parameters A, ω d may be varied. The Hamiltonian H 2q (t) only couples the pairs of states defining ω ± = ω a ± ω b . The Hamiltonians H + (t), H − (t) describe dynamics in the |00 , |11 , and the |01 , |10 subspaces, respectively. The corresponding Pauli matrices are denoted by Σ ± j , for example Σ + z = |00 00| − |11 11|. For realistic parameters, the two-level-system frequencies ω + and ω − are large compared with the drive amplitude A. In this case it is appropriate to move into the interaction picture defined by the unitaries The interaction-frame Hamiltonians are To calculate the associated propagators, we carry out a Magnus expansion including the first-and second-order terms. It is straightforward to calculate higher-order corrections, however we find for our parameters that they are small and can be neglected. The expression for the propagator is then U ± (t) = exp(∆ ± At the conclusion of the gate t = τ n , we obtain where and we have defined . We have neglected corrections of order O([A/ω d ] 3 ) to ξ ± . The first-order terms involving Σ ± x , Σ ± y determine the amount of population transfer between the two states in each subspace. The second-order terms encode the leading-order beyond-the-RWA corrections and are proportional to Σ ± z . Indeed, in the resonant limit, the first-order terms ∆ ± 1 (τ n ) = −iτ n A 2 Σ ± y reproduce the RWA results [5] while the secondorder terms ∆ ± 2 (τ n ) = iτ n 3A 2 8ω d Σ ± z correspond to the wellknown Bloch-Siegert shift [12,59]. Transforming the propagator back to the lab frame via the identity U ± (τ n ) = defining ϑ ± = πnω ± /ω d . The approximate equality is valid for tanc(ξ ± )ε ± 1, and tanc(x) = tan(x)/x.

B. Determining optimal drive parameters
To obtain the √ φSWAP gate, we require The solution for Eq. (32) is For nonzero p, solutions (A, ω d ) can only be found by numerically solving the full transcendental equation (34). We find in the following that to satisfy Eq. (33), it is necessary to have the freedom of varying the drive amplitude A. Thus, we only consider the case p = 0. Setting m = n is excluded in Eq. (35) as in this case the left-hand side of Eq. (34) does not vanish. However, this restriction is no issue, as motivated by the drive frequency ω d = ω − used to obtain the √ iSWAP gate when the RWA is valid [5] we do not consider on resonance driving of the |00 ↔ |11 transition ω d = ω + . With ω d given by Eq. (35), the expression for ε + simplifies to ε + = A 2 ω 2 d πnm 1−m 2 and we satisfy Eq. (32) with the phase α = ϑ + − ε + .
Considering now the requirement (33) for U − , the solution is where the ± indicates that the sign may be absorbed into the phases β, γ. We interpret Eq. (36) as an equation for the unknown A, as we have fixed ω d previously. Solving for A yields where we have set q = 0 to minimize the magnitude of A. In general, the fraction on the right-hand side of Eq. (37) may be positive or negative, depending on n and the magnitude of ω − relative to ω d . Thus, we choose the sign of ξ − = ±π/4 based on which yields a positive drive amplitude A. With the drive frequency and amplitude given by Eq. (35) and Eq. (37) respectively, we satisfy Eq. (33) with phases β = ϑ − −4ε − /π and γ = 0 or γ = π depending on the sign of ξ − .
The previous analysis only leaves us to choose the integers m, n, see Eq. (35). We make use of this freedom to limit the drive amplitude A in magnitude. Careful inspection of the removable singularity in Eq. (37) suggests the usage of a drive frequency ω d near ω − . This can be achieved by a combination of n and m such that their ratio n/m closely approximates ω − /ω + . The optimal choice of n must balance between mitigating the effects of T 1 and T 2 by keeping gate durations 2πn/ω d as short as possible, and holding at bay unwanted population transfer incurred by strong drive amplitudes A ∼ 1/n, see Appendix C for details. With n and m specified as such, we have constructed the √ φSWAP gate allowing for the execution of a √ iSWAP gate when combined with single-qubit Z rotations. The full gate duration is t tot = 2πn/ω d + max(t a1 , t b1 ) + max(t a2 , t b2 ), where t µi = −θ µi /ω µ . The equations for the times t µi are understood modulo 2π and the Z rotation angles are known in terms of the phases α, β, γ, see Eq. (21). The angle θ b2 is a free parameter and is chosen to minimize the overall gate time by forcing the idle times after the √ φSWAP gate to coincide t a1 = t b1 . For our parameters we obtain t tot = 113 ns, where 2πn/ωd = 85 ns and the singlequbit gates require 28 ns, see Fig. 8. For the initial state |00 , population appreciably varies only during the singlequbit identity-gate segments, see Fig. 8(c). Meanwhile, for the state |01 , population transfer to the |10 state occurs during the √ φSWAP portion of the gate, see Fig. 8(d). Closedsystem simulations of this pulse sequence yield a gate fidelity of F = 0.9996 for achieving a √ iSWAP gate, calculated using Eq. (15), taking d = 4 and U T = √ iSWAP. Infidelities at the 10 −4 level are likely due to residual effects from the higher-lying states that cause unwanted transitions in the computational subspace, see Appendix C for details.
To include the detrimental effects of decoherence on gate fidelities, we numerically solve the Lindblad master equation where ρ(t) is the system density matrix and is the standard form of the dissipator. The relevant jump operators L are here: We neglect decoherence processes due to higher-lying states, noting that their occupation remains minimal throughout the duration of the pulse. We consider two sets of estimates for decoherence rates, one conservative, Γ φ = 1/80 µs, Γ 1 = 1/300 µs, and one optimistic, Γ φ = 1/4000 µs, Γ 1 = 1/1000 µs, both consistent with recent experiments [8,17]. At the conclusion of the gate, we project onto the computational subspace and perform numerical quantum process tomography [23,52,61] to obtain the process matrix χ. The open-system gate fidelity is calculated using the formula [50,62,63] where d is the dimension of the relevant subspace and χ T is the target process matrix. We obtain open-system gate fidelities of F = 0.997, F = 0.9994 for the two cases of conservative and optimistic decoherence rate estimates, respectively.

V. DISCUSSION AND CONCLUSION
In this work, we have proposed a galvanic-coupling scheme for fluxonia in which XX coupling can be switched on and off while maintaining the qubits at their respective sweet spots. Motivated by record coherence times achieved with heavyfluxonium qubits, we have concentrated on operating at frequencies below ∼200 MHz [8,17]. The magnitude of drive strengths required in this case invalidates RWA and makes it more natural to perform gates with reference to the lab frame. We have presented a protocol involving flux biasing and strong flux modulation that achieves a fast and highfidelity √ φSWAP gate. To transform this into the more familiar √ iSWAP gate, we introduce variable-time identity gates. These gates when combined with Z rotations help us realize a √ iSWAP gate with fidelity F > 0.999. Fidelities are limited by incoherent errors as well as unwanted transitions in the computational subspace mediated by higher-lying states.
A crucial open question that warrants future research is how to achieve scalability in this fluxonium-based architecture. One may envision extending the device into a 1D array of qubits and couplers. Generalization to 2D arrays with increased qubit connectivity will require additional modifications, and will be useful for steps towards e.g. error-correcting surface codes [64].

Appendix A: Full-circuit analysis
In this appendix, we construct the Lagrangian and Hamiltonian of the full circuit shown in Fig. 9, allowing for disorder in circuit parameters. We follow the method of Vool and Devoret [68] to construct the circuit Lagrangian, yielding with node variables and circuit parameters as shown in Fig. 9.
We consider the case of small deviations from otherwise pairwise equivalent qubit inductors E La , E Lb , coupler inductors E L1 , E L2 and stray capacitances C 1 , C 2 . In the absence of parameter disorder, the coupler modes θ ± = ϕ 1 ± ϕ 2 decouple, simplifying the analysis (though we show below that small parameter disorder is not expected to significantly impact device performance). Using these variables the Lagrangian becomes where E Lc = 1 2 (E L + E L ) and we have introduced notation for the average and relative deviation of the qubit inductors , and the stray ca- . We perform the Legendre transform to obtain the Hamiltonian and promote our variables to non-commuting operators obeying the commutation relations [ϕ µ , n µ ] = i for µ = a, b and [θ j , n j ] = i for j = ±. The Hamiltonian is H = H 0 + V + H dis , where Charging energy definitions introduced above are given in the main text and H 0 , V and H dis refer to the bare, coupling and disorder Hamiltonians, respectively. We have neglected higher-order disorder contributions proportional to dC 2 on the assumption that disorder is small. Additionally, we isolated the qubit flux shift away from the sweet spot δφ µ = φ µ − π and performed the variable transformation ϕ µ → ϕ µ − δφ µ .

Appendix B: Schrieffer-Wolff transformation
In this appendix, we derive the second-order effective Hamiltonian describing the qubit-qubit interaction that is dispersively mediated by the tunable coupler. First, we consider the symmetric case H dis = 0. Later, we relax this assumption and allow for parameter disorder.

Effective Hamiltonian without parameter disorder
We separate the Hilbert space into a low-and high-energy subspace defined by the respective projectors We have defined the states | a , m b , n − , p + that are eigenstates of the bare Hamiltonian H 0 with eigenenergies E a + E b m +E − n +pω + . The perturbation V couples states within the same subspace, as well as states in separate subspaces. We utilize a Schrieffer-Wolff transformation [5,41] e −S with antihermitian generator S to decouple the low-and high-energy subspaces order-by-order. To carry out the transformation, the effective Hamiltonian H eff = P e S He −S P and generator S are expanded as We then collect terms of the same order and enforce both that the effective Hamiltonian is block diagonal and that the generator is block off diagonal. The zeroth-and first-order contributions to the effective Hamiltonian in the computational subspace (neglecting constant terms) are found by applying the projector P onto H 0 and V respectively [5,41] The Pauli matrices are defined as e.g.
Calculation of the second-order contribution H (2) to the effective Hamiltonian is facilitated by the first-order generator S (1) . The expression for the matrix elements of S (1) is well known [5] and we obtain defining the small parameters µ,(1) where E µ jk = E µ j − E µ k , µ = a, b, −, and We have introduced annihilation and creation operators a + , a † + for the coupler θ + mode via θ + = osc √ 2 (a + + a † + ) and osc = (8E C+ /E Lc ) 1/4 is the oscillator length. The primed sum in Eq. (B8) indicates that n is allowed to be zero if j ≥ 2, acknowledging that the perturbation V can couple computational states to higher-lying states of the qubit fluxonia without exciting the coupler θ − mode. We have neglected contributions proportional to δφ µ in Eq. (B8) as they are comparatively small and can be neglected.

Effective Hamiltonian in the presence of disorder
We now consider how disorder in circuit parameters modifies the form of the effective Hamiltonian Eq. (B15). This disorder could arise for example from fabrication imperfections. We show below that up to second order, inductive disorder merely results in a modification to Eq. (B7), while capacitive disorder does not contribute.
From Eq. (A4) we see that inductive asymmetry adds a disorder term to the Hamiltonian If we assume that the relative deviations are small compared with unity, it is justified to add this term to V and treat it perturbatively. Observe that on the one hand, for virtual transitions mediated by this term, the excitation number of either qubit cannot change. On the other hand, the excitation number of the coupler θ + mode must change. Thus, the firstorder contributions vanish, and the only second-order terms that contribute beyond a global energy shift are where we have defined Thus up to second order, disorder in the inductors serves only to modify the expressions for the coefficients Ω µ . As discussed in the main text, this amounts to a shift in the sweet spot location of each qubit and can be canceled by a corresponding shift of the static qubit fluxes. Thus, small disorder in either the qubit inductors or the coupler inductors does not adversely affect device performance. We now turn our attention to capacitive disorder C 1 = C 2 (disorder in the qubit capacitances poses no issue, as the qubits remain decoupled from all other degrees of freedom in the kinetic part of the Hamiltonian). In this case, we proceed as before and treat perturbatively the capacitive disorder term Consider the relation between phase and charge matrix elements in fluxonium [27] and observe that the charge matrix element vanishes if j = k. Thus, any virtual transition mediated by the perturbation (B19) must excite both the coupler θ − mode and the coupler θ + mode and thus does not contribute at second order beyond a global energy shift.

Appendix C: Drive operators
In this appendix, we calculate the matrix elements and consider the effects of the relevant drive operators activated by time-dependent flux drives. Allocating the time-dependent flux in the same way as for static flux generally introduces terms proportional to the time derivative of the external flux [69]. Imposing the constraint that these terms should not appear implies a specific grouping of the flux in the full Hamiltonian H. For our parameters, we find to a good approximation that the ac qubit fluxes are allocated to their respective inductors, and the ac coupler flux is spread across all four inductors. We first decompose the external fluxes into staticφ µ and timedependent δφ µ (t) components, whereφ µ are the dc flux values at the off position. The ac qubit fluxes are already properly allocated, while the appropriate grouping of the coupler flux is obtained via θ − → θ − − δφ c (t). The full time-dependent Hamiltonian is thus Matrix elements of the operators h µ with respect to eigenstates of the static Hamiltonian H st = H 0 + V determine the time evolution, once the time-dependent drives δφ µ (t) are specified. The Schrieffer-Wolff transformation allows us to define new basis states that are approximate eigenstates of H st and thus perturbatively calculate these matrix elements.
The leading-order contributions to select matrix elements of the drive operators h µ occur at second order. Thus, to include all relevant corrections to the wave functions that contribute to these matrix elements, we calculate the second-order generator S (2) associated with the Schrieffer-Wolff transformation discussed in Appendix B. To simplify the calculation we ignore all contributions from the coupler θ + mode due to the inequality |η| < | | in parameter regimes of interest, yielding [5,41] S (2) = µ=a,b j=0,1 j ,n µ, (2) jj ,n |j µ , 0 − j µ , n − | (C4) + j,k=0,1 j ,k ,n ab, (2) jj ,kk ,n |j a , k b , 0 − j a , k b , n − | − H.c.
At the off position, the effective Hamiltonian H eff = −

Qubit-flux drive operators
Experimentally, the amplitude of an ac flux drive will typically be no larger than δΦ µ ≤ 0.1Φ 0 [8]. In this case, we have checked that transitions to higher-lying states mediated by the drive operators h a , h b are suppressed. Thus, we need only consider matrix elements of these operators in the computational subspace. Using the expression (C5) for the dressed states given in terms of the bare states, we find m|h a | m /E L = 2 where we have introduced the shorthand | , m ≡ | a , m b , 0 − , 0 + for states in the computational subspace, and the labels are understood modulo 2. In Eqs. (C6)-(C8), second-order contributions are small and can be neglected, while in Eq. (C9) the leading-order contributions are at second-order. These analytical approximations indicate that in the computational subspace and at the off position, the operator h a simplifies dramatically to leading order to Ω a acσ a x . For example, the matrix element (C8) vanishes at the off position by definition, see Eq. (B14). Further explicit verification of the form of h a is tedious and will be omitted. We find excellent agreement between the semi-analytic formulas and exact results: for the parameters considered here, we obtain |Ω a ac |/h = 558, 561 MHz using Eqs. (C6)-(C9) and numerics, respectively. The coefficients associated with all other operators (aside from the irrelevant identity) in the decomposition of h a are of the order of 2 h·MHz or smaller in absolute value, as calculated both from the semi-analytic formulas and exact results.

Coupler-flux drive operator
The operator h c activated by coupler-flux modulation induces both wanted and unwanted transitions in the computational subspace. The latter proceed through virtual excitations of higher-lying states. We analyze both types of transitions in the following.

a. Computational-subspace matrix elements
The matrix elements of h c governing the wanted transitions can be obtained within second-order perturbation theory using Eq. (C5)  mode dominate the sum on v, see Fig. 10. (Note that this virtual process is heavily suppressed in the context of qubit-flux drives, due to the comparatively small coefficient E L /2 E Lc multiplying the operator θ − in h a ).
These transitions mediated by the higher-lying states can significantly degrade gate fidelities. Indeed, attempting to implement the √ φSWAP gate with only a single drive period leads to poor fidelities F < 0.9 due to the unwanted transitions. Slowing down the gate by utilizing two drive periods as in Sec. IV mitigates this issue in large part (infidelities are on the order of 10 −4 ) by reducing the required drive amplitude. It is an interesting avenue for further research to investigate means for overcoming this limitation on δφ c to achieve faster gate times without sacrificing fidelity.