Non-perturbative analytical diagonalization of Hamiltonians with application to coupling suppression and enhancement in cQED

Deriving effective Hamiltonian models plays an essential role in quantum theory, with particular emphasis in recent years on control and engineering problems. In this work, we present two symbolic methods for computing effective Hamiltonian models: the Non-perturbative Analytical Diagonalization (NPAD) and the Recursive Schrieffer-Wolff Transformation (RSWT). NPAD makes use of the Jacobi iteration and works without the assumptions of perturbation theory while retaining convergence, allowing to treat a very wide range of models. In the perturbation regime, it reduces to RSWT, which takes advantage of an in-built recursive structure where remarkably the number of terms increases only linearly with perturbation order, exponentially decreasing the number of terms compared to the ubiquitous Schrieffer-Wolff method. In this regime, NPAD further gives an exponential reduction in terms, i.e. superexponential compared to Schrieffer-Wolff, relevant to high precision expansions. Both methods consist of algebraic expressions and can be easily automated for symbolic computation. To demonstrate the application of the methods, we study the ZZ and cross-resonance interactions of superconducting qubits systems. We investigate both suppressing and engineering the coupling in near-resonant and quasi-dispersive regimes. With the proposed methods, the coupling strength in the effective Hamiltonians can be estimated with high precision comparable to numerical results.

Deriving effective models is of fundamental importance in the study of complex quantum systems. Often, in an effective model, one decouples the system of interest from the ancillary space, as shown in Figure 1. The dynamics are then studied within the effective subspace, which is usually much easier than in the original Hilbert space, and provides fundamental information such as conserved symmetries, entanglement formation, orbital hybridization, computational eigenstates, spectroscopic transitions, effective lattice models, etc. In terms of the Hamiltonian operator, an effective compression of the Hilbert space can be achieved by diagonalization or block diagonalization.
When the coupling between the system and ancillary space is small compared to the dynamics within the subspace, the effective model is often derived by a perturbative expansion. In the field of quantum mechanics, a ubiquitous expansion method that enables reduced state space dimension is the Schrieffer-Wolff Transformation (SWT) [1,2], also known in various sub-fields as adiabatic elimination [3], Thomas-Fermi or Born-Oppenheimer approximation [4,5], and quasi-degenerate perturbation theory [6]. Finding uses throughout quantum physics, SWT can be found in atomic physics [3], superconducting qubits [7,8], condensed matter [2], semi- conductor physics [9], to name a few.
The SWT method is however limited to regimes where a clear energy hierarchy can be found and therefore fails to converge for a wide variety of physical examples. In particular, for infinite-dimensional systems such as coupled harmonic and anharmonic systems (e.g., in superconducting quantum processors), the abundance of both engineered and spurious resonances motivates the use of other techniques. Moreover, even when perturbation theory is applicable, the number of terms in the expansions grows exponentially as the perturbation level and therefore are not practically usable in many instances.
In this article, we introduce a new symbolic algorithm, Non-Perturbative Analytical Diagonalization (NPAD), that allows the computation of closed-from, parametric effective Hamiltonians in a finite-dimensional Hilbert space with a guarantee for convergence. The method makes use of the Jacobi iteration and recursively applies Givens rotations to remove all unwanted couplings. In the perturbative limit, it reduces via BCH expansion to a variant of SWT, which we refer to as the Recursive Schrieffer-Wolff Transformation (RSWT). For this method, the number of commutators grows only linearly with respect to the perturbation order, in contrast to the exponential growth in the traditional approach. Both methods can be used in low-order expansions to provide compact analytical expressions of effective Hamiltonians; or, alternatively, higher-order expansions that allow for fast parametric design [10] and tuning [11] of effective Hamiltonian models (and, e.g., subsequent automatic differentiation). As illustrated in Figure 1, with the two methods, one can tune the system for engineered decoupling or enhanced controlled coupling.
The key insight of our work is that the iteration step in forming the effective model can be applied recursively, i.e. after each step the transformed Hamiltonian is viewed as a new starting point and determines the next step. Moreover, each step can act on a chosen single state-to-state coupling at a time, thereby providing an exact elimination of the term. In this regard, this can be understood as a generalization of the well-known numerical Jacobi iteration used for diagonalization of real symmetric matrices [12], which has also found use for Hermitian operators [13,14]. Similar ideas have also been widely used in the orbital localization problem [15].
As demonstrations of the practical utility of the methods, we study superconducting qubits, which are especially relevant for robust parametric design methods, not only because they are prone to spurious resonances [16][17][18], but because they can be readily fabricated across a very wide range of energy scales [19,20].
We investigate both the near-resonant regime and in the quasi-dispersive regime, focusing on the ZZ and crossresonance interaction. In the near-resonant regime, we consider the two-excitation manifold and compute accurate approximations of the ZZ interaction strength applicable to the full parameter regime for gate implementation [21][22][23][24]. In the second scenario, we study the suppression of ZZ interactions [10,[25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40][41] in the traditional setup of resonator mediated coupling without direct qubit-qubit interaction. The result shows that the ZZ interaction can be suppressed without resorting to ad-ditional coupling in a regime where the qubit-resonator detuning is comparable to the qubit anharmonicity, described by an equation of a circle. Extending the applications to block diagonalization, we then compute the coupling strength of a microwave-activated cross-resonant interaction. We show that, with only 4 Givens rotations, we can diagonalize the drive and achieve accurate estimation in the regime where the perturbation method fails. This paper is organized as follows: In Section II, we present the mathematical methods, NPAD and RSWT, for diagonalization and obtaining effective Hamiltonian models. We also briefly discuss generalizing the two methods to block diagonalization in Section II C. Next, in Section III, we demonstrate the applications to superconducting systems. We study the ZZ interaction for generating entanglement in the near-resonant regime (Section III A), and in the (quasi-) dispersive regime for suppressing cross-talk noise (Section III B). The computation of the cross-resonance coupling strength is presented in Section III C. We conclude and give an outlook of other possible applications in Section IV.

A. Non-perturbative Analytical Diagonalization
In this subsection, we introduce the NPAD for symbolic diagonalization of Hermitian matrices and discuss how it can be applied to obtain effective models.
In this algorithm, a Givens rotation is defined in each iteration to remove one specifically targeted off-diagonal term. By iteratively applying the rotations, the transformed matrix converges to the diagonal form. The rotation keeps the energy structure when the off-diagonal coupling is small while always exactly removing the coupling even when it is comparable to or larger than the energy gap. Compared to the Jacobi method used in numerical diagonalization [12][13][14], we truncate the iteration at a much earlier stage. As each iteration consists only of a few algebraic expressions, the algorithm produces a closed-form, parametric expression of the transformed matrix.
We start from a two-by-two Hermitian matrix and define a complex Givens rotation that diagonalizes it. Then, we generalize the rotation to higher-dimensional matrices, discuss the convergence of the iteration, and how to use it as a symbolic algorithm. In Section III A, we show a concrete application where we apply NPAD with only two rotations to approximate the energy spectrum of a near-resonant quantum system which can not be studied perturbatively.
The Givens rotation illustrated on a Bloch sphere. A Hermitian matrix defined in Eq.
(1) is denoted as a point on the surface of a Bloch sphere with the radius δ 2 + g 2 . This is different from the Bloch sphere representation of a quantum state, where the radius is always smaller than or equal to 1. The coordinates correspond to the coefficients in the representation in the Pauli basis. The Givens rotation U that diagonalizes the matrix can be viewed as a rotation denoted by the blue arrow (for δ ≥ 0). (b): The computational graph of the Givens rotation U , defining the main mathematical steps in the symbolic algorithm 1. The inputs g, δ and φ can be directly extracted from the Hamiltonian.

Givens rotations
We consider a two-by-two Hermitian matrix where g, φ, ε and δ are real numbers. The matrix can be decomposed in the Pauli basis as which can be illustrated in a Bloch sphere with the radius δ 2 + g 2 (omitting the identity) as shown in Figure 2a. Without loss of generality, we assume that g ≥ 0 and absorb the sign into the complex phase.
The diagonalization can be understood as a rotation on the Bloch sphere to the North or South pole. In particular, if δ ≥ 0, it is rotated to the North pole, and otherwise to the South pole, avoiding unnecessarily flipping the energy level during the diagonalization. This rotation is performed around the axisn = cos(φ)σ y − sin(φ)σ x with the angle θ = arctan ( g δ ). As an illustration, for δ ≥ 0, the rotation is denoted by a blue arrow in Figure 2a.
The unitary transformation that diagonalizes the matrix is given by where S = i 2 θn is referred to as the generator of the rotation. The transformation satisfies Λ = U HU † with Λ the diagonalized matrix. We refer to U as a Givens rotation [42]. Notice that in most literature, the Givens rotation is defined with φ = 0. Here we use this more general (Hermitian) definition as it shares many common properties.
The computation of the unitary consists only of elementary mathematical functions, as illustrated in Figure 2b. This is critical for it to be used as a building block for a symbolic algorithm. As we will see later, by concatenating this building block, a parameterized expression can be generated for an arbitrary Hermitian matrix.

Simplified formulation
In practice, the inverse trigonometric function in the expression of θ is often avoided by using the trigonometric identities with t = tan θ 2 . We then rewrite Eq. (4) as with κ = g/δ. We choose the root with smaller norm for the convenience that the rotation will not flip the two energy levels [43]: In this way, the parameters cos θ 2 and sin θ 2 in the Givens rotation can be calculated directly from g and δ using algebraic expressions. It is also evident in Eq. (6) that the rotation angle is bounded by |θ| ≤ π/2.

The iterative method
We now apply the Givens rotation to remove the (j, k)th entry of a general Hermitian matrix H. The parameters are chosen to be consistent with Eq. (1), i.e., δ jk = (H j,j − H k,k )/2 and g jk e −iφ jk = H j,k . For simplicity, we use the notation c jk = cos θ jk 2 , s jk = sin θ jk 2 , and t jk = s jk /c jk . We write the Givens rotation U jk as where the diagonal elements are all 1 except for two entries (j, j) and (k, k). All other entries not explicitly defined are 0.
Applying this unitary transformation with H = U jk HU † jk eliminates the off-diagonal entry H j,k , i.e., |H j,k | = |H k,j | = g jk = 0. It renormalizes the energies such that δ jk = δ jk + t jk g jk (8) However, this will also mix other entries on the j, k-th rows and columns, given by with h = j, k.
One can diagonalize the matrix by applying the rotation U jk with the corresponding parameters iteratively on the largest remaining non-zero off-diagonal entry, which is referred to as the Jacobi iteration [12]. That is, we can iteratively solve for the eigenenergies by picking the next largest off-diagonal element, e.g., H j ,k = g j k e −iφ j k , and applying another Givens rotation, as summarized in algorithm 1. In practice, the above definition of the Jacobi iteration can be relaxed. For instance, the next target does not always have to be the largest element. In fact, the order of the rotations does not affect the convergence, as long as all terms are covered in the iteration rules (e.g., cyclic iterations on all off-diagonal entries) [13]. However, performing the rotation first on large elements usually increases the convergence rate. This can be seen by studying the norm of all off-diagonal terms H F = m =n |H m,n | 2 . Since we have |H h,j | 2 + |H h,k | 2 = |H h,j | 2 + |H h,k | 2 for h = j, k and H j,k = 0, each Givens rotation reduces the norm of all off-diagonal terms: If (j, k) is chosen so that |H j,k | 2 is larger than the average norm among the off-diagonal terms, one obtains [13] ) H F (12) where N (N −1) is the total number of off-diagonal terms. Therefore, the algorithm converges exponentially. Moreover, if the off-diagonal terms are much smaller than the energy gap, the convergence becomes even faster, i.e., exponentially fast with a quadratic convergence rate [14]. This leads to an efficient variant of the Schrieffer-Wolfflike methods, as described in Section II B. From the above analysis, we also see that the Givens rotation does not have to exactly zero the target coupling. Instead, it only needs to reduce the total norm. Therefore, if the structure of the Hamiltonian is known, rotations can be grouped such that all rotations within one group are constructed from the same Hamiltonian and then applied recursively. We will also explore this possibility in concrete examples later in the article.
As a machine-precision, numerical diagonalization algorithm, the Jacobi iteration is slower than the QR method for dense matrices. However, in many problems in quantum engineering, the Hamiltonian is often sparse and it is known in advance which interaction needs to be removed. It is not always necessary to compute the fully diagonalized matrix but only to transform it into a frame where the target subspace is sufficiently decoupled from the leakage levels. Therefore, an iterative method where each step is targeted at one off-diagonal entry is of particular interest.
As a symbolic method, we can truncate the Jacobi iteration at a very early stage to obtain closed-formed parametric expressions. It will also correctly calculate the renormalized energy and other couplings while keeping the energy structure in the perturbative limit, as will be discussed in Section II B.

B. Recursive Schrieffer-Wolff perturbation method
In the previous subsection, we introduced NPAD that produces a closed-form, parametric expression of an approximately diagonalized matrix. Here, we show that in the perturbative limit, where the coupling is much smaller than the bare energy difference, the Jacobi iteration reduces to a Schrieffer-Wolff-like transformation. Interestingly, the recursive nature of the Jacobi iteration is preserved in this limit. Instead of looking for one generator that diagonalizes the full matrix as in the traditional Schrieffer-Wolff transformation (SWT), an iteration is constructed such that every time only the leading-order coupling is removed. We refer to it as recursive Schrieffer-Wolff transformation (RSWT) because of the recursive expression it produces. We also show that RSWT demonstrates an exponential improvement in complexity compared to SWT for perturbation beyond the leading order. In Section III B, we demonstrate an application of RSWT in estimating the ZZ interaction between two Transmon qubits in a dispersive regime.

Givens rotation in the perturbative limit
In the perturbative limit, compared to U jk in Eq. (7), it is more convenient to specify the generator defined in Eq. (3). For the Givens rotation U jk the corresponding generator S has two non-zero entries all other entries being 0. In addition, assuming that we only aim at removing the leading-order off-diagonal terms, we define a generator where the sum over P denotes all pairs of non-zero offdiagonal entries in H. The assumption of perturbation indicates that S F 1. In this case, the unitary generated by S still eliminates all the leading-order coupling because The difference between RSWT and SWT appears when one considers higher-order perturbation. In SWT, one expands the transformed Hamiltonian H and the generator S perturbatively as a function of a small parameter and collects terms of the same order on both sides of Eq. (16). However, here, the generator is predefined and it only eliminates the leading-order coupling. Similar to the Jacobi iteration, we treat the transformed Hamiltonian H as a new Hermitian matrix and perform another round of leading-order transformation as the next iteration. This results in a recursive expression for H , which is still a closed-form expression. The remaining off-diagonal terms can always be removed by the next iteration if the truncation level of BCH formula is high enough to guarantee sufficient accuracy. We present the iteration of RSWT in detail in the next subsection and show that it simplifies the calculation for perturbation beyond the leading order.

The RSWT iterations
In the following, we outline the iterative procedure of the RSWT. We denote the initial matrix H as step zero, with the notation D 0 = D, V 0 = λV and H 0 = H = D 0 + V 0 . The parameter λ is the dimensionless small parameter used to track the perturbation order. Assume that we want to compute the perturbation to the eigenenergy up to the order λ K . We refer to this as the λ Kperturbation. Given the Hamiltonian of iteration n, H n , we can compute the next iteration H n+1 as follows.
We first define a generator S n+1 according to Eq. (14) such that [S n+1 , D n ] = −V n , where D n and V n are the diagonal and the off-diagonal part of H n . As the energy gap D n always stays at O(λ 0 ) under the assumption of small perturbation, S n+1 is of the same order as V n . Notice that S n+1 is generated from the perturbed matrix in the previous iteration, H n , in contrast to the unperturbed matrix as in SWT.
Then, the next level of perturbation is computed with where C is the nested commutator defined by and C 0 (A, B) = B. The truncation level m of the BCH expansion will be defined explicitly later. Because [S n+1 , D n ] = −V n by construction, we have for all n and t Therefore, plugging in Eq. (19) into Eq. (17) simplifies it to Notice that t starts from 1 in the sum, which means that all coupling terms at the same order of V n are removed and the order of the remaining coupling, V n+1 , is squared. This iteration is applied until the desired order is reached, as summarized in algorithm 2.
To ensure that the truncation of the BCH is accurate up to the order O(λ K ), for the nth iteration, we need to choose the truncation m = K 2 n , which ensures that . This maximal level m is halved every time the iteration step increases because the remaining coupling is quadratically smaller. This means that, in contrast to SWT, the first iteration has the largest number of terms in RSWT. In appendix A, we show that, if S n+1 < 1 2 , the error of the truncation in Eq. (20) is bounded by where the H ∞ n+1 is Eq. (20) in the limit m → ∞. input : a Hermitian matrix H0 output: H including correction to the eigenenergy up to λ K nmax ← log 2 (K) ; for n ← 0; n < nmax; n ← n + 1 do 1. Dn ← diag(Hn); Vn ← Hn − Dn; 2. initialize a zero matrix Sn+1; Both the NPAD and the RSWT methods introduced in the previous sections can be designed to only target a selected set of off-diagonal terms and, hence, used for block-diagonalization. This is especially useful to engineer transversal coupling in a subsystem and leave the remaining levels as intact as possible. Here, we briefly discuss these generalizations. Notice that it is always possible to first diagonalize the matrix and then reconstruct the block diagonalized form that satisfies certain conditions, for instance as in Ref. [44]. In the following, we discuss only methods that do not diagonalize the matrix first.
In NPAD, by construction, each rotation removes one off-diagonal element. With Givens rotations only applied to the inter-block elements, an iteration for block diagonalization can be defined. The norm of all off-diagonal entries, H F , is still monotonously decreasing according to Eq. (11). Hence, a limit exists and its convergence is also the convergence of the block diagonalization. However, the convergence is not always monotonous with respect to the norm of all inter-block terms. This is because a Givens rotation may rotate a large intra-block term into an inter-block entry. Therefore, the algorithm may not always converge faster than the full diagonalization would. Nevertheless, if the dominant coupling terms in the Hamiltonian are known, the Jacobi iteration can be designed to target at those to realize an efficient blockdiagonalization. In Section III C, we show an example of this in computing the cross-resonance coupling strength through NPAD.
For perturbation, RSWT can be applied as a block diagonalization method under the constraint that both the inter-block and the intra-block coupling are much smaller than the inter-block energy gap. This can be achieved by slightly modifying the RSWT iterations: We first separate the diagonal, the intra-block and the inter-block terms: H n = D n + V intra n + V inter n . Next, in algorithm 2 we only define S for those non-zero entries in V inter n , i.e. the couplings we wish to remove. And in the last step we replace Eq. (20) with In this definition, the leading inter block coupling is of the ). As we do not remove the intrablock coupling, we still get V intra n = O(λ). Therefore, the remaining coupling is O(λV intra n ), i.e. the perturbation order is increased by one, instead of being squared as in the case of full diagonalization. Therefore, the exponential reduction of the number of commutators does not always apply in the case of block diagonalization. However, notice that the small parameter λ here is defined as the (largest) ratio between the inter-block couplings and gaps, which is usually much smaller than those within the block. Hence, if carefully designed, the convergence can still surpass the full diagonalization in the first few perturbative orders.

D. Comparison between different methods
To help understand the proposed methods, we here discuss the difference between them and the traditional methods. We first compare RSWT with traditional SWT and then NPAD with the perturbation methods.
For RSWT, with the same target accuracy, e.g., O(λ K ), it should provide the same expression as from SWT, up to the error O(λ K+1 ). However, compared to the SWT, RSWT requires a much smaller number of iterations and commutators. To reach O(λ K ), SWT needs K−1 iterations, while RSWT only needs log 2 (K) because of the quadratic convergence rate. More importantly, the total number of commutators grows only linearly for RSWT, compared to the exponentially fast growth for SWT [7].
Intuitively, this is because RSWT uses the recursive structure and avoids unnecessary expansions of the intermediate results. Mathematically, this can be seen from the following two aspects: First, in RSWT, each iteration improves the perturbation level from λ k to λ 2k , instead of λ k+1 . Hence, the number of iterations increases only logarithmically with respect to the perturbation order, as seen in the definition of n max in algorithm 2. This is because we always treat the transformed matrix as a new one and remove the leading-order coupling. It is consistent with the quadratic convergence rate of the Jacobi iteration with small off-diagonal terms. Second, in RSWT, the generator S n is only used at the current iteration. Hence, there are no mixed terms such as The total number of commutators required to reach level λ K is shown in Table I, where we have taken into consideration that if C t (A, B) is known, computing C t+1 (A, B) only requires one additional commutator. The detailed calculation is presented in appendix B.
The NPAD method, on the other hand, uses non-linear rotations to replace the linear perturbative expansion. More concretely, in the Jacobi iteration, by targeting only one coupling in each recursive iteration, the unitary transformation can be analytically expressed as a Givens rotation, thus avoiding the BCH expansion in Eq. (16). Therefore, it efficiently and accurately captures the nonperturbative interactions in the system.
To compare it with the perturbation methods, we estimate the number of operations required for NPAD in the perturbative regime. Assume we construct the Jacobi iteration from the G coupling terms used in generating an S in RSWT. Applying those unitaries is, to the leading order, the same as applying one RSWT iteration. A single Givens rotation on a Hamiltonian takes O(N ) operations, where N is the matrix size. Thus, the cost for computing the effective Hamiltonian after G rotations is the same as computing one commutator [S, ·], up to a constant factor. Because the Givens rotation avoids the BCH expansion, there is no nested commutators and the total number operations is O(n max N G) with n max the number of iterations in algorithm 2. Hence the number of terms scales logarithmically with respect to K instead of linearly as for RSWT, i.e., a super-exponential reduction compared to SWT (Table I). However, the non-linear expressions provided by NPAD are usually harder to simplify and evaluate by hand compared to the rational expressions obtained from perturbation.
From the above discussion, one can see that it is also straightforward to combine NPAD with perturbation. Instead of fully diagonalizing the matrix, the Jacobi iteration can be designed to remove only the dominant couplings and combined with perturbation methods to obtain simplified analytical expressions. In fact, this is often used implicitly in the analysis when, e.g., a strongly coupled two-level system is perturbatively interacting with another quantum system. The Jacobi iteration suggests that this can be generalized systematically to more complicated scenarios.

III. PHYSICAL APPLICATIONS
In this section, we use the methods introduced in Section II to study the ZZ interaction in two different parameter regimes. In a two-qubit system, the ZZ interaction strength is defined by where E jk denotes the eigenenergy of the two-qubit states |jk . The Hamiltonian interaction term is written as ζσ z1 σ z2 , acting on the two qubits. Typically, in superconducting systems, it arises from the interaction of the |11 K  2  3  4  5  6  7  8   SWT  1  4 11 26 57 120 247  RSWT 1  2  4  5  7  8 11  NPAD 1  1  2  2  2  2  3   TABLE I. The number of terms in the evaluation for different methods to reach the λ K -perturbation. The number denotes the total number of commutators in SWT and RSWT, or the total number of sweeps over all couplings for NPAD. This describes both the "algebraic complexity" (i.e complexity of the output algebraic expressions) and the computational (timecost) complexity. The complexity is reduced from exponential to linear and eventually to logarithmic. However, notice that although the computational complexity for one commutator and for one Jacobi sweep scales the same in terms of the number of couplings to be removed (see the main text), the Givens rotation in NPAD consists of non-linear algebraic expressions which are individually more expensive to compute.

A. Effective ZZ entanglement from multi-level non-dispersive interactions
In this first application, we apply the NPAD method described in Section II A to study a model consisting of two directly coupled qubits in the near-resonant regime, where the ZZ interaction can be used to construct a control-Z (CZ) gate (see Figure 1 block B) [21][22][23][24]. We show that, with two rotations, NPAD provides an improvement on the estimation of the interaction strength for at least one order of magnitude, compared to approximating the system as only a single avoided crossing between the strongly interacting levels, as is standard in the literature. In addition, if one of the non-computational bases is comparably further detuned than the other, the correction takes the form of a Kerr nonlinearity, with a renormalized coupling strength accounting for the nearresonant dynamics.
We consider the Hamiltonian of two superconducting qubits that are directly coupled under the rotating-wave [45,46] and Duffing [47] approximations: where b q , ω q , α q are the annihilation operator, the qubit bare frequency and the anharmonicity, respectively. The parameter g denotes the coupling strength. In this Hamiltonian, the sum of the eigenenergies is always a constant E 10 + E 01 = ω 1 + ω 2 because of the symmetry. Hence, the ZZ interaction comes solely from the interaction between the state |11 and the non-computational basis |20 and |02 . If the frequency is tuned so that the state |11 is close to one of the non-computational states, the coupling will shift the eigenenergy, leading to a large ZZ interaction (Figure 3a). For simplicity, we consider the Hilbert subspace consisting of |20 , |11 , |02 and write the following Hamiltonian The parameters in the diagonal elements are given by δ = (ω 1 − ω 2 + α 1 )/2 and ∆ = 3(ω 1 − ω 2 )/2 − α 2 + α 1 /2.
To keep the result general, we use two different coupling strengths g 1 and g 2 , although according to Eq. (24) they both equal √ 2g. Without loss of generality, we assume the state |02 is comparably further detuned from the other two, i.e. ∆ > g j , δ. If in contrast |20 is further detuned, one can exchange the |02 and |20 in the matrix and redefine δ and ∆ accordingly. Notice that this Hamiltonian is different from a Λ system [3], where coupling exists only between far detuned levels.
To implement the CZ gate, one tunes the qubit frequency ω 1 so that the states |11 and |20 are swept from a far-detuned to a near-resonant regime. Hence, the perturbative expansion diverges and cannot be used. A naive approach is to neglect the far-detuned state |02 and approximate the interaction as a single avoided crossing. In this case, ζ is approximated by However, the interaction g 2 results in an error that, in the experimentally studied parameter regimes, can be as large as 10%, as shown in Figure 3b.
In the following, we show that with only two Givens rotations, one can obtain an analytical approximation, with the error reduced by one order of magnitude. The correction can be understood as a Kerr non-linearity with a renormalized coupling strength.
To get an accurate estimation of the ZZ interaction ζ, we need to calculate the eigenenergy of |11 by eliminating its coupling with the other two states. Therefore, we will make two rotations sequentially on the entry (0, 1) and (1, 2), given by where U 1 and U 2 are Givens rotations (Eq. (3)) constructed for eliminating the entries (0, 1) and (1, 2). Because the matrix is real symmetric, the phase φ in Eq. (3) is 0. The first transformed Hamiltonian, H (1) = U 1 HU † 1 , takes the form where E 2 = δ 1 + g 2 1 δ 2 is the eigenenergy for diagonalizing the two-level system of |20 and |11 , consistent with Eq. (26). The notations used is the same as in Section II A 3. In this frame, the coupling between |11 and |02 is reduced to c 01 g 2 , where c 01 is given by the nonlinear expression This non-linearity is crucial for the accurate estimation of the eigenenergy. The second rotation further removes this renormalized coupling c 01 g 2 , giving (30) Including the new correction, g 2 c 01 t 12 , the eigenenergy of state |11 reads In Figure 3b, we plot the error of the estimated interaction strength ζ = H 1,1 + δ using typical parameters of superconducting hardware, compared to the numerical diagonalizationζ. An improvement of at least one order of magnitude is observed compared to traditional methods. Following the assumptions that ∆ δ, g j , Eq. (31) simplifies to We see that the correction takes the form of a Kerr nonlinearity [48], but with a renormalized coupling strength c 01 g 2 . This non-linear factor c 01 accounts for the dynamics between |20 and |11 in the near-resonant regime. The same effect can be observed in higher levels where similar three-level subspaces exist. This approximation is plotted as a dashed curve in Figure 3b. The error of this estimation comes both from the expansion of the square root in Eq. (31) as well as from the remaining coupling in H (2) . The former can be approximated by the next order expansion For the latter, we consider the remaining coupling in H (2) between |20 and |11 , which reads g 2 s 01 s 12 . In the limit ∆ δ, g j , we have s 12 ≤ θ12 2 ≤ g2  31)), hybrid method with the additional assumption ∆ δ, gj (dashed, Eq. (32)), by assuming only a 2-level system (dash-dot, Eq. (26)), and with a leading-order perturbation (dotted). The shaded area covers the region below the error estimation given by Eq. (33). The grey arrow denotes a typical path to generate a CZ gate through ZZ interaction by changing the qubit-qubit detuning. The two jumps are located at ω1 = ω2 + α2 and ω1 + α1 = ω2, i.e., the points where the bare energy level swaps. This changes the direction of the Givens rotation. The parameters used are g1 = g2 = √ 2 · 0.1 GHz and α1 = α2 = −0.3 GHz.
that this coupling is much smaller than the energy difference. Hence, further correction can be estimated by The contribution of the other remaining coupling between |20 and |02 is much smaller due to the large energy gap. Since 2 is one order smaller than the 1 , 1 will be the dominant error. We plot the region below this error in Figure 3b as a shaded background.
For the more general cases without assuming ∆ δ, g j , it is hard to provide an error estimation due to the non-linearity. However, the result in Figure 3b indicates that Eq. (31) still shows a good performance in other parameter regimes commonly used in superconducting hardware, with an error smaller than 3%. We also observe that an improvement for another order of magnitude can be achieved by introducing a third rotation again on the entry (0, 1).

B. ZZ coupling suppression in the quasi-dispersive regime
In this second example, we use the two methods to investigate the suppression of ZZ cross-talk with the qubitresonator-qubit setup in the dispersive cQED regime, which corresponds to Figure 1 block A. We demonstrate that in the traditional setup without direct inter-qubit coupling, the ZZ interaction defined in Eq. (23) can still be zeroed in a quasi-dispersive regime by engineering the two parameters of qubit-resonator detuning. The zero points are described by an equation of a circle in the λ 4 -perturbation. To accurately capture the interaction strength in the quasi-dispersive regime, we also compute with RSWT the λ 6 -perturbation and show that the NPAD method with only 8 Givens rotations provides an expression with similar accuracy.
We consider a Hamiltonian of two superconducting qubits connected by a resonator: Due to the finite detuning between the resonator and the qubits, a static ZZ interaction exists even if there is no additional control operation on the system. In order to implement high-quality quantum operations, this interaction needs to be sufficiently suppressed. Several approaches have been developed to suppress the ZZ interaction. One way is to add a direct capacitive coupling channel in parallel with the resonator [27][28][29][30][31][32][33][34][35][36]49]. By engineering the parameters, the two interaction channels cancel each other. The interaction can either be turned on through a tunable coupler or through the cross resonant control scheme. The second approach is to choose a hybrid qubit system with opposite anharmonicity, which allows parameter engineering to suppress the ZZ interaction. One implementation is using a transmon and a capacitively shunt flux qubit (CSFQ) [25,26]. Other methods include using additional off-resonant drive [39][40][41] and different types of qubits have also been proposed [38].
Most of the above works are based on the strong dispersive regime, where the resonator is only weakly coupled with the qubits. In this regime, the ZZ interaction strength ζ is only determined by the effective interaction with the two non-computational qubit state, |20 and |02 [7] ζ disp = − 2J 2 20,11 where ∆ q = ω q − ω c is the qubit-resonator frequency detuning, α q the anharmonicity and J jk,j k the effective coupling strength between the physical qubit state |jk and |j k . They are obtained by performing a leading order SWT and effectively decouple the resonator from the two qubits. In this regime, it is impossible to achieve zero ZZ interaction unless the two anharmonicity α q adopt different signs. However, Eq. (36) is only valid when ignoring the higher level of the resonator. If we reduce the qubit detuning ∆ q so that it becomes comparable with the anharmonicity α q , the second excited state of the resonator comes into the picture and can be used to suppress the ZZ interaction, also known as the quasidisq regime [10,37]. We identify this regime as the quasi-dispersive regime because g/∆ q is manufactured larger than 0.1, e.g. in superconducting qubits with weak anharmonicity such as Transmons, though we show the same analysis can also hold for stronger anharmonicities. As a result, the calculation of ζ disp cannot be treated by only the leading-order SWT. In particular, we will see that, in the straddling regime, where |∆ 1 − ∆ 2 | < α, the interaction with the second excited resonator state leads to a λ 4 -perturbative correction that can be used to suppress the ZZ interaction.
In the following, we first use the λ 4 -perturbation to qualitatively understand the energy landscape and then investigate the higher-order corrections. For the λ 4perturbation, using RWST, we only need 2 iterations and evaluate 4 commutators instead of 3 iterations and 11 commutators, as for traditional perturbation (Table I).
In fact, the traditional approach that first approximates the system as an effective qubit-qubit direct interaction and then applies another perturbation to obtain the ZZ strength is also a two-step recursion [7]. However, for simplicity, it neglects the resonator states in the second perturbation. As detailed in appendix C, adding the resonator states, we obtain a better estimation for the quasi-dispersive regime. The result is consistent with the diagrammatic techniques used in [25,50].
To illustrate the energy landscape, we write the interaction strength as The first two terms coincide with Eq. (36) in the strong dispersive regime, up to O( g 4 ∆ 3 ). Assuming α = α 1 = α 2 and set ζ (4) = 0 in Eq. (37), we obtain an equation of a circle that describes the location of the zero points where ∆ + = ∆ 1 + ∆ 2 and ∆ − = ∆ 1 − ∆ 2 . In this λ 4perturbation, the zero-points depend only on the anharmonicity α but not on the coupling strength g q . Equation (38) indicates that the ZZ interaction can be suppressed by varying the sum and difference of the two qubit-resonator detunings, as illustrated in Figure 4a. Because the perturbative approximation is only valid away from the resonant lines, the useful part of the pa- rameter regime is the half-circle with ∆ + < α, in particular, the region marked by the grey box in Figure 4a.
In addition, we also studied different contributions to the ZZ interaction. In Figure 5a, we plot the strong dispersive approximation, the λ 4 -perturbation as well as the contribution of second excited qubit and resonator state to ζ (4) (see appendix C for analytical expressions). One observes from the plot that, in the quasi dispersive regime, the increasing virtual interaction with the second excited resonator state acts against the interaction with the second excited qubit states. Notice that all contributions to ζ (4) are virtual interactions of the second excited state, i.e., ζ (4) = ζ t + ζ r , as illustrated in Figure 5b.
Although the λ 4 -perturbation gives insight into the different contributions to the energy shift, perturbation beyond the order λ 4 also has a non-negligible contribution in the quasi-dispersive regime. Since RSWT requires considerably fewer commutators, we are able to compute the λ 6 -perturbation, with only two iterations and 7 commutators (see Table I). The λ 6 -perturbation captures the location of the minimum more accurately, but still shows a false minimum close to the resonant regime, as shown in Figure 4b.
Apart from perturbation, we also apply NPAD to compute the interaction strength. We first define 4 Givens rotations with respect to the direct qubit-resonator coupling terms from the original Hamiltonian. The rotations are then applied sequentially to obtain the first effective Hamiltonian. Next, we apply another 4 rotations targeted at the two-photon couplings, such as the effective qubit-qubit coupling. The indices of those 8 rotations are listed in the first two columns of Table II. These two steps are equivalent to the two iterations in RSWT. However, the recursive Givens rotations replace the BCH expansion, resulting in a much simpler calculation. Illustrated in Figure 4b, the approximation with those 8 rotations is as good as the λ 6 -perturbation, but without the false minimum. Both capture the zero points very well compared to the numerical diagonalization, where the 4 lowest levels are included for each qubit and the resonator.
With those calculations, we can then investigate the effect of the high order corrections. We find that, for instance, g q shifts the zero point to the regime of smaller frequency detuning, corresponding to shrinking the halfcircle in the numerical calculation in Figure 4a. In addition, for stronger coupling strength, the dip becomes narrower, which indicates a trade-off between the interaction strength and feasibility of qubit fabrication [51]. A detailed description of the effect of higher-order perturbation in the quasi-dispersive regime is presented in appendix D.
Overall, our investigation reveals different contributions to the ZZ interaction and provides tools to study the energy landscape in this quasi-dispersive regime. Because of the comparably smaller detuning, operations on this regime provide stronger interactions for entangling gates, and hence may achieve a better quantum speed limit for universal gate sets, i.e. without sacrificing local gates [10].

C. The cross-resonance coupling strength
Following the previous examples, we here study superconducting qubits under an external cross-resonance drive. The cross-resonance interaction is activated by driving the control qubit with the frequency of the target qubit, which has been studied intensively and demonstrated in various experiments [33,[52][53][54][55][56]. In the twoqubit subspace, the dominant Hamiltonian term is writ- The former is the typical cause of ZZ cross-talk in the strong dispersive regime, while the latter is used to counteract the energy shift. The notation ζ (4) refers to the λ 4 -perturbation [Eq. (37)] that goes pass zero in the quasi-dispersive regime.
In addition, ζ disp denotes the strong dispersive approximation [Eq. (36)], which also underestimates the ZZ interaction induced by the non-qubit states. Parameters used are the same as in Figure 4. ten as a Pauli matrix ZX, which generates a CNOT gate up to single-qubit corrections. Therefore, ideally, only the population of the target qubit will change after the gate operation. The effective model is usually derived by block diagonalizing the non-qubit leakage levels as well as the population flip of the control qubit [7,8,57]. The coupling strength is then characterized by the coefficient of the ZX Hamiltonian term. The analytical block diagonalization of the Hamiltonian is only possible when neglecting all the non-qubit levels. Hence, perturbative expansion is often used, where the small parameter is defined as Ω/∆ − , i.e., the ratio between the drive amplitude and the qubit-qubit detuning. However, to achieve fast gates, the qubitqubit detuning is often designed to be small, ranging from 50 MHz to 200 MHz. Therefore, the perturbative diagonalization only works well for a weak drive.
In the rest of this subsection, we show that with only 4 two-by-two Givens rotations on the single-photon couplings, we can block-diagonalize the drive term and ob- tain an estimation of the coupling strength as good as the numerical result and far above the perturbative regime. We start from the static Hamiltonian H in Eq. (24) and define a driving Hamiltonian in the rotating frame The full Hamiltonian is then written as with ω d the driving frequency [7]. To compute the interaction strength, both the qubit-qubit effective interaction g and the drive on the control qubit Ω need to be diagonalized. In particular, the second one can be as large as the energy gap and dominant in the unwanted couplings [57]. For simplicity, we assume g is small and diagonalize it with a leading-order perturbation, discarding all terms smaller than O(g 2 ). In this frame, one obtains a ZX interaction that increases linearly with the drive strength [7]. This is equivalent to moving to the eigenbases of the idling qubits and allows us to focus on applying NPAD to the drive H d . The same method used in Section III B can be applied here to improve this approximation.
Targeting the dominant drive terms listed in the right column of Table II, we construct 4 Givens rotations. The rotations are constructed with respect to the same Hamiltonians and then applied iteratively as separate unitaries. The obtained ZX interaction strength reads with c j = cos(θ j /2), s j = sin(θ j /2) and ∆ − = ω 1 − ω 2 . The rotation angles are defined by the drive strength θ 1 = arctan Ω ∆− and θ 2 = arctan √ 2Ω 2∆−+a1 . This analytical coupling strength is plotted in Figure 6, compared with the perturbative expansions in Ref. [7] and numerical block-diagonalization. The result matches well with the numerical calculation, even when the ratio Cross-resonance coupling strength as a function of the drive strength. The analytical coupling strength is computed with 4 two-by-two Givens rotations on the singlephoton coupling terms [Eq. (40)] and compared to perturbative expansion and the numerical calculation. The parameters used are inspired by the device in Ref. [33], with the qubit-qubit detuning approximately 60 MHz and an effective qubit-qubit coupling −3 MHz. Ω ∆− is approaching one. On the contrary, the perturbative expansion shows a large deviation as the driving power increases. The numerical block-diagonalization is implemented using the least action method [7,29,44]. To our surprise, although no least action condition is imposed on the Jacobi iteration, the method automatically follows this track and avoids unnecessary rotations. This suggests that the Jacobi iteration chooses an efficient path of block-diagonalization.
Notice that in the above example, no rotations are performed for levels beyond the second excited state because they are not directly coupled to the qubit subspace. In other parameter regimes, more couplings terms may become significant and need to be added to the diagonalization. For instance, the two-photon interaction between |0 and |2 of the control qubit will be dominant in the regime where ∆ − ≈ −α 2 /2 [8]. The fact that high precision can be achieved with only rotations on the singlephoton couplings in this example also indicates that the dominant error of perturbation lies in the BCH expansion used in diagonalizing the strong single-photon couplings, rather than in higher levels or high-order interactions.

IV. CONCLUSION AND OUTLOOK
We introduced the symbolic algorithm NPAD, based on the Jacobi iteration, for computing closed-form, parametric expressions of effective Hamiltonians. The method applies rotation unitaries iteratively on to a Hamiltonian, with each rotation recursively defined upon the previous result and removing a chosen coupling between two states. Compared to perturbation, it uses twoby-two rotations to avoid the exponentially increasing commutators in the BCH expansion. In the perturbative limit, the method reduces to a modified form of the Schrieffer-Wolff transformation, RSWT, that inherits the recursive structure of the Jacobi iteration. The recursive structure avoids unnecessary expansion and results in an exponential reduction in the number of commutators compared to the traditional perturbative expansion. The two methods can also be combined as a hybrid method, where NPAD is used to remove strong couplings while RSWT is applied afterwards to effectively eliminate the remaining weak coupling.
Applying these methods to superconducting qubit systems, we showed that high precision estimation can be achieved beyond the perturbation regime, either as explicit short analytical expressions, or closed-form parametric expressions for computer-aided calculation. Although in the study we used the Kerr model, more detailed models such as in Ref. [8] can also be incorporated with little additional effort.
Despite the fact that using the Jacobi iteration for machine-precision diagonalization is less efficient than other methods such as QR diagonalization, the iteration can be truncated for symbolic approximation. For many questions in quantum engineering, the most part of the energy structure and dominant couplings is known in advance. Therefore, the iterative method can be designed for removing dominant couplings and decoupling a subspace from non-relevant Hilbert spaces, which is often used in modeling dynamics in large quantum systems [18,58]. The result is, however, always a closed-form, parametric expression, which, though usually harder for the human to read, shows its own advantage in computer-aided calculations.
We expect our method to have significant application in quantum technologies, where elimination of auxiliary or unwanted spaces (e.g. for block-diagonalization) needs to be done to significant precision to enable practically useful models. In particular, relevant applications include experiment and architecture design, reservoir engineering, cross-talk suppression, few-and many-body interaction engineering, effective qubit models, and more generally improved approximations where Schrieffer-Wolff methods are typically used. We also expect that the methods presented here will find extensions for simplifying other equations of motion, such as in open-quantum systems [59,60], non-linear systems [61], or for uncertainty propagation [62]. Last but not least, accurate, parametric diagonalization should be especially useful for time-dependent diagonalization where adiabatic following can be enforced by DRAG [63,64] or other counter-diabatic [65,66]  In the main text, we presented Eq. (20) as the expression to compute the transformed matrix H , which is a function of the off-diagonal part of the original matrix V and the generator S. The expression is derived from a truncated BCH formula. In the following, we derive the error bound of the truncation.
Without truncation, Eq. (20) is written as where we neglected the index n for the iteration step. If the expansion is truncated at t = m − 1, one obtains where we assume in the last inequality that S < 1/2.

Appendix B: Efficiency comparison between RSWT and SWT
We show here that, given a finite-dimensional Hamiltonian H, RSWT is more efficient than SWT for perturbation beyond level λ 2 with an exponential decrease in the number of commutators. We measure the complexity by the number of commutators that need to be evaluated to compute all eigenenergy corrections up to λ K , denoted by N . The general formula is presented below while the numbers for K ≤ 8 is given in Table I in the main text. For SWT, one can find the general expression as well as explicit formulas up to λ 5 in Ref. [7]. The number of iterations required to reach order λ K is K − 1. In addition, at each iteration n, one needs to include also mixed terms composed of generator S l with l ≤ n. The number N is given by where 2 l−1 is the number of distinct tuples (S i1 , S i2 , S i3 , · · · ) with j i j = l. We have taken into consideration that [S 1 , diag(H)] = −V and [S n+1 , diag(H)] is known by the construction of S n+1 . For RSWT, the calculation of commutators in each iteration is given in Eq. (20). Because C t+1 (A, B) can be calculated from C t+1 (A, B) with only one additional commutators, the number commutators to be evaluated in Eq. (20) is exactly m−1 = K 2 n −1. The total number of iteration n max is given by log 2 (K) . Therefore, we obtain The reduction compared to SWT comes from the fact that the energy difference in H n is used in the definition of S n+1 , rather than the bare energy difference in H. The recursive expressions avoid unnecessary expansions. One obtains the same final expressions as from SWT up to λ K , if one expands the energy difference into a polynomial series 1 ∆E bare + ∆E correction = 1 ∆E bare poly ∆E correction ∆E bare (B3) and substitutes in expressions so that it depends only on the bare energy and couplings.
Appendix C: RSWT results for the ZZ interaction strength Using RSWT described in Section II B, we compute the effective Hamiltonian up to λ 6 , where λ is defined as the ratio between the largest coupling and energy gap. To compute the λ 4 -and λ 6 -perturbation, RWST only takes 2 iterations with 4 and 7 commutators respectively, which is significantly smaller than those required for SWT as shown in Table I. A third iteration only adds an improvement of O(λ 8 ) to the eigenenergy because the off-diagonal terms of H 2 are at most O(λ 4 ).
Because of the recursive structure of RSWT, each matrix element in H n+1 is given as a function of matrix elements in H n . Hence, the final result is a closed-form expression parameterized by the matrix elements of the original Hamiltonian H, i.e. the hardware parameters. The parametric expression consists only of algebraic expressions and the dependence can be illustrated as a network. For instance, we show the network representation of the λ 4 -perturbation ζ (4) in Figure 7. Each symbol in layer n + 1 is analytically expressed as a function of symbols in layer n, represented by arrows. The arrows between the first and the second layer represent the definition ζ (4) = E (4) 011 −E (4) 001 −E (4) 010 . Given all the six hardware parameters (layer 0), one can evaluate ζ (4) by recursively evaluating all the nodes it depends on.
In the following, we present the analysis of λ 4 -and λ 6 -perturbation.  (4) parameterized by ∆1, ∆2, α1, α2, g1, and g2, obtained from the two-step RSWT. Each node in the 1st and 2nd layers is a matrix entry in the Hamiltonian H1 and H2. A node in layer n + 1 is expressed as a function of the nodes in layer n, represented by an edge. In particular, symbols E (n,k) lpq represent the λ k diagonal entries of lpq| Hn |lpq and V (n,k) lpq,l p q the effective coupling. The upper index k denotes the level of perturbation, e.g., k = 4 means that it is a λ 4perturbative correction.

λ 4 -perturbation
The λ 4 -perturbative correction for ζ is given as The notation E (n,k) lpq represents the λ k -perturbation obtained from H n . The sub-indices lpq denotes the resonator state |l and two qubit states |p , |q . where V (n,k) lpq,l p q denotes the interaction between state |lpq and |l p q .
The physical meaning of each term in Eq. (C2) can be interpreted as follows: The first two terms are identical to the dispersive approximation given in Eq. (36), which is the consequence of the effective qubit-qubit interaction. The third term, depending on the effective interaction between |200 and |011 , is 0 at this order. This is because the destructive interference between the path |011 → |110 → |200 and |011 → |101 → |200 results in V is what the approximation of a strong dispersive regime fails to characterize. It was generated by the commutator [S 1 , [S 1 , [S 1 , V 0 ]]] and the energy gaps in the denominator of entries in S 1 are always the qubit-resonator detuning (plus the anharmonicity), which, in the strong dispersive regime, is much larger than the qubit-qubit detuning in Eq. (36). Hence the last term is much smaller in the strong dispersive regime. However, in the quasidispersive regime, it plays a key role in suppressing the ZZ interaction as shown in Figure 5b.
After including the single-excitation terms E (2,4) 010 and E (2,4) 001 using the same two-step RSWT, we separate the contributions of virtual interaction into 2 categories: those including the second excited qubit state (denoted by t) and those including the second excited resonator state (denoted by r): where ζ disp is given by Eq. (36). Summing all the contributions gives the λ 4 -perturbation ζ (4) in Eq. (37). Notice that virtual interactions that only involve the first excited state have no contribution to the ZZ interaction at this perturbation level, i.e., ζ (4) = ζ t + ζ t . This is because the energy shift of |011 induced by |101 and |110 cancels that of |010 and |001 induced by |100 .

λ 6 -perturbation
Using the two-step RSWT, we also computed the λ 6perturbative correction to the ZZ interaction strength: The first contribution corresponds to the effective qubit-qubit interaction and dominants in the strong dispersive regime. It turns out that it only includes the next order of effective interaction and energy difference. Hence, for simplicity, we present it together with ζ
The rest of the contribution can be summed as The second contribution, ζ (6) rest,g 4 1 g 2 2 , is obtained again by interchanging the sub-index 1 and 2.
Appendix D: Effect of higher-order perturbation on the zero points of ZZ interaction The λ 4 -perturbation described by Eq. (37) predicts the zero points as a circle with a radius of 2|α|, independent of g. However, as they are located in the quasi-dispersive regime for systems with weak anharmonicity, the higherorder perturbation is not always negligible. Here, we qualitatively describe how the higher-order (> 4) affect the zero points of ζ.
We observe that, in contrast to the λ 4 -perturbation, when including the higher orders, the zero points depends on the coupling strength g. As shown in Figure 8, the higher-order perturbation shifts the zero points to the regime of smaller detuning. The larger the coupling, the stronger the shift is.
One can estimate the accuracy of perturbation around the zero points by the ratio g/|α|. At the zero points of ζ, the larger the anharmonicity and the smaller the coupling, the better is the perturbative approximation. This is because the perturbation is characterized by the small parameter λ = g/∆ and near the zero-points ∆ depends linearly on α [see Eq. (37)], hence the ratio g/|α|. This is also illustrated in Figure 8, where we compare the deviation between the numerical result and the perturbation. The minimum even vanishes in the analytical result when it is close to the resonant lines. This behaviour also indicates that for superconducting qubits with a relatively large anharmonicity, the ZZ interaction can also be completely suppressed in the strong dispersive regime in this qubit-resonator-qubit model.