Mitigating crosstalk errors by randomized compiling: Simulation of the BCS model on a superconducting quantum computer

We develop and apply an extension of the randomized compiling (RC) protocol that includes a special treatment of neighboring qubits and dramatically reduces crosstalk effects caused by the application of faulty gates on superconducting qubits in IBMQ quantum computers (\texttt{ibm\_lagos} and \texttt{ibmq\_ehningen}). Crosstalk errors, stemming from CNOT two-qubit gates, are a crucial source of errors on numerous quantum computing platforms. For the IBMQ machines, their magnitude is often overlooked-9. Our RC protocol turns coherent noise due to crosstalk into a depolarising noise channel that can then be treated using established error mitigation schemes, such as noise estimation circuits. We apply our approach to the quantum simulation of the non-equilibrium dynamics of the Bardeen-Cooper-Schrieffer (BCS) Hamiltonian for superconductivity, a particularly challenging model to simulate on quantum hardware because of the long-range interaction of Cooper pairs. With 135 CNOT gates, we work in a regime where crosstalk, as opposed to either trotterization or qubit decoherence, dominates the error. Our twirling of neighboring qubits is shown to dramatically improve the noise estimation protocol without the need to add new qubits or circuits and allows for a quantitative simulation of the BCS model.


I. INTRODUCTION
Despite the long lasting quest for a noiseless and scalable quantum computer (QC) that could achieve quantum advantage [1], currently available platforms require the development and improvement of the so-called Noisy Intermediate-Scale Quantum Computer (NISQ) [2].In these devices, qubit operations are subject to noise, which produces errors at a level that remains beyond the scope of fault-tolerant quantum error correction codes [3][4][5].Nevertheless, several implementations of digital quantum computers are now accessible, like the IBMQ digital quantum computers [6], which can implement a universal set of quantum operations on arrays of superconducting transmon qubits (available devices containing up to 128 qubits).In order to make the most of these already existing machines, numerous error mitigation solutions have been proposed over the past decade [7][8][9][10][11][12], with the ultimate goal of pushing the efficiency and precision limit of NISQ until they surpass the performances of classical computers on a set of well-known quantum algorithms: quantum simulation, quantum optimization and quantum Fourier transform-based algorithms (for detailed reviews on quantum algorithms see [13,14]).While the existing quantum hardware is quickly approaching the number of qubits theoretically required to do so, noiseinduced errors remain very significant on NISQ [15,16], and practical solutions on how to reduce these errors to a minimum are lacking.
On IBMQ devices, quantum gate implementations produce errors that can mainly be attributed to the 2-qubit (CNOT) gate [17], with an announced error rate of 1%, obtained by Clifford randomized benchmarking [18,19].However, little is known about the structure of the quantum noise channels acting on the qubits [20][21][22].The noise is known to be a mixture of random, incoherent noise channels, experimentally linked to imperfect gate application and qubit decoherence, as well as coherent noise channels [23], induced by the weak residual couplings between a qubit and its environment (neighboring qubits, readout resonator and residual electromagnetic fields for instance).Coherent noise may be correlated, both in time and in space, and lead to systematic deviations from the perfect output.One of the identified origins of coherent noise on a given qubit is linked to unwanted resonances while applying radio frequency (RF) pulses to implement a quantum gate on neighboring qubits.This effect, dubbed crosstalk [25], is known to be important in superconducting qubit architectures [26,27].Crosstalk severely limits the performance of these NISQ implementations, in spite of considerable experimental efforts [17,28,29] and software mitigation approaches [30][31][32].Finally, we emphasize that previous experiments have shown that the effective noise acting on the transmon qubits of IBMQ QCs is very unstable in time, and may change drastically over the scale of a few hours/days [33], which limits the possibilities for benchmarking the noise channels and reusing this information.
Error mitigation schemes are, therefore, essential in order to make NISQ a reliable platform for quantum computing.Over the past decade, several of those techniques have been proposed with the aim of turning noisy outputs into practical and reliable results.They can be Crosstalk may extend the effect of this noise to the neighboring qubits.Gates applied before and after the CNOT gates depict the RC protocol (white gates are used for the standard RC procedure, adding the red gates constitutes the crosstalk RC).A more detailed description is given in Sec.III.(b): Figure of merit for the simulation accuracy when using different error mitigation techniques.Clearly, the effect of mitigating crosstalk is not negligible.The figure of merit is defined as ϵ = S −1 j |ij||(nj − ij)|, where j enumerates all simulated evolution durations and observables measured, ij is the ideal noiseless expectation value for the respective observable, nj is the value measured on the noisy QC; S = j |ij| represents the normalization factor, as |ij| weights prioritize the contributions of observables with expectation value away from 0. (c)-(d): Evolution of the expectation value Y of the second qubit out of three under the 3-body BCS hamiltonian (see Eq. ( 1)) with energy levels {−1, 0, +1} and coupling constant g = 0.5 respectively on both IBMQ devices.The red curve is the noiseless trotterized dynamics, the dashed orange curve represents the simulation on a real, noisy QC.The dashed blue (resp.green) curve shows the simulated evolution when using a number of error mitigation protocols: readout-error correction, standard randomized compiling (resp.crosstalk randomized compiling), and NEC.While not allowing for perfect agreement with the ideal red curve, crosstalk randomized compiling dramatically improves the accuracy of the quantum simulation.
classified into two categories: noise parameter-dependent and noise parameter-free error mitigation protocols.The former, which include e.g.Probabilistic Error Cancellation (PEC) [11,12], rely on a precise characterization of the noise parameters, which requires a substantial initial overhead [34][35][36][37].Noise parameter-free error mitigation schemes, on the other hand, usually assume a model of noise but do not require any characterization of their parameters.Therefore, they are free from this initial overhead but fail when noise model assumptions are not met.The most prominent example is known as Zero Noise Ex-trapolation (ZNE), which consists of performing several experiments with increasing noise strength (which can be achieved both with experimental or algorithmic approaches), and extrapolating the results to the zero noise limit [11,12,[38][39][40].Another example is the so-called Noise Estimation Circuits (NEC) technique, introduced in [41].A common feature of both parameter-dependent and parameter-free methods is an exponential scaling of the required number of circuit runs with the number of noisy gates [42].
The NEC method was originally designed for noisy qubits affected by a very simple one-parameter noise channel called depolarising noise.This assumption is unrealistic for NISQ, and particularly for IBMQ quantum computers, as discussed above.Fortunately, the noise structure can be simplified using a procedure called Randomized Compiling (RC) [43][44][45].RC consists in averaging a given circuit output over different randomized versions of the original circuit in a way which effectively maps any noise channel to a simpler Pauli noise channel [46], bringing it closer to the depolarising channel required to apply the NEC method.
RC is typically performed only on qubits directly affected by CNOT gates [47][48][49][50], therefore ignoring crosstalk errors.These are typically assumed to be insignificant, yet turn out to be harmful to our simulation results in practice.Methods for mitigating crosstalk via RC have been previously proposed [44] and demonstrated to be effective [45].In this article, we propose an extension of the twirling gate set (see Fig. 1-a or Fig. 5) which further maps crosstalk errors onto a depolarizing noise channel.We then demonstrate that using this technique drastically improves the performances of the NEC mitigation scheme implementation on IBMQ QC.We expect that similar improvements can be achieved for other error mitigation schemes that assume a depolarising noise model.This is relevant for all QC architectures affected by crosstalk.In addition, our results confirm that crosstalk errors on IBMQ devices cannot be neglected (we find that they lead to an effective error rate per CNOT gate of up to 5%) and need to be mitigated.
To benchmark the RC and NEC methods used in this article, we perform a quantum simulation of the BCS (Bardeen-Schrieffer-Cooper) hamiltonian, the historical model describing superconductivity through the emergence of electronic Cooper pairs [51].Starting from an initial quenched state, we simulate the out-of-equilibrium dynamics of the system using a simple Trotter algorithm.The evolution is then monitored by measuring time-dependent expectation values of Pauli strings observables.Specific features make this model very suitable to benchmark the performance of QCs.First, due to the hardcore boson statistics of the Cooper pairs, the model has a natural description in terms of Pauli matrices.Second, the BCS model is integrable, so that the few-particle dynamics can be completely solved efficiently on a classical computer [52][53][54][55][56]. Finally, the BCS interaction term contains an all-to-all coupling between spins, which makes it challenging to implement on a low connectivity platform like IBMQ QC (which has a quasilinear layout of qubits).Previously, the BCS model has been implemented on IBMQ with the purpose of finding its ground state energies using variatonal quantum algorithms [57][58][59].Also, an implementation of the BCS evolution operator for a NV-center-based QC with a starshaped connectivity has been proposed recently [60].
Our key results are summarized in Fig. 1.Specifically, Fig. 1-b shows that crosstalk randomized compiling dramatically improves the accuracy of the quantum simulation without the need of adding new qubits or circuits (compared to the standard randomized compiling).While Fig. 1-b presents the results averaged over multiple observables and simulation times, Fig. 1-c,d demonstrates how the results of applying different techniques compare for one specific observable as a function of the simulated evolution time (i.e.number of Trotter steps applied).It is clear that the contribution of crosstalk is not negligible.
This article is organized as follows: In Section II, we Trotterize the BCS time evolution, and show how to map the corresponding unitary operator to a quantum circuit.We run naive simulations of the model on IBMQ QCs, illustrating the dramatic effect of the noise on the results.In Section III, we present our improved RC procedure, and perform a comparative analysis between standard RC and crosstalk RC, by applying them to our BCS simulations.Section IV is devoted to our implementation of the NEC mitigation scheme, and readout error correction.Finally, we provide in Section V an in-depth comparison of various levels of error mitigation -from no mitigation to NEC assisted with crosstalk RC.We discuss the observed discrepancies between classical simulations assuming a simple depolarising noise channel and the results from real QC.

II. QUANTUM SIMULATION OF THE BCS MODEL ON A SUPERCONDUCTING QC
Throughout this work, we model a quench in the BCS model with 3 Cooper pairs in order to gauge the efficiency of both RC techniques and their combination with NEC.The dynamical evolution of various local observables is the quantitative outcome of these simulations.In this section, we show how the BCS quench can be simulated on a quantum computer and present the results of a basic simulation -without any error mitigation -setting a reference for the performance of more advanced methods.
We execute the quantum code on 3 linearly connected qubits of ibmq lagos and of ibmq ehningen, cf.Fig. 2. Throughout the paper, unless stated otherwise, results for ibmq lagos are presented.Results for ibmq ehningen are gathered in Appendix.C.

A. Problem definition
The BCS hamiltonian can be represented in the spin language via (see Appendix A) where L = 3 is the number of energy levels and σ a j , a ∈ {x, y, z}, refers to the operator such Coupling map of (a): ibm lagos and (b): ibmq ehningen machines.Links represent the physical junctions on which 2-qubit gates can be realised.The green qubits are used for the simulation and black qubits are the idle qubits.
that the Pauli matrix σ a is applied on the energy level j and the identity operator on the other energy levels.The eigenstates |0⟩ , |1⟩ of σ z j denote respectively the absence or presence of a Cooper pair on the energy level j.
The system is initialized in state |ψ 0 ⟩, corresponding to the mean field solution of the above hamiltonian, cf.Appendix A. The aim is to simulate the evolution of the system up to time t, described by the evolution operator U (t) = e −iHBCSt , and measure expectation values of observables ⟨ψ 0 | O |ψ 0 ⟩, where |ψ(t)⟩ = U (t) |ψ 0 ⟩, and the observables O are products of Pauli operators on the qubits.

B. Trotter algorithm for the unitary evolution
First, we need to implement the unitary evolution operator U (t) = e −iHBCSt on the quantum computer, i.e. to break down U (t) into a quantum circuit comprised of 1and 2-qubit unitary gates that can be directly applied on the QC (native gates).This operation, dubbed transpilation, is a difficult problem that usually cannot be tackled exactly, even for simple integrable systems.To circumvent this issue, we approximate the exact unitary evolution operator using the Trotter algorithm.We slice the full time evolution into r small time steps of length ∆t = t/r: e −iHt = e −iH∆t r . ( For each time step, we use the Suzuki-Trotter formula [65] to expand the complex exponential at first order in ∆t.For the BCS hamiltonian this yields: For each Trotter step, the error accumulated by this approximation has an upper bound of order O(∆t 2 ).This error scales as O(t∆t) for the whole dynamics (see e.g.[66,67]).In our particular case however, this upper bound is not reached, and the first order Trotter algorithm remains a good approximation of the exact dynamics over the time range we consider.Thus, in what follows we compare the results obtained on a quantum computer to the result of perfect Trotter evolution, and not to the exact result of continuous evolution.
The exponential terms in Eq. ( 3) can now be straightforwardly transpiled into a quantum circuit comprised of the native 1-qubit and 2-qubit gates of the IBMQ machines.More details about this procedure can be found in Appendix B, and the resulting quantum circuit is displayed in Fig. 3, where the two first exponential factors in Eq. ( 3) are highlighted with red boxes as an example.The rightmost one, in particular, implements the interactions between qubits q 0 and q 1 .It is then simply repeated for the interaction q 1 − q 2 .For the last interaction however, a SWAP gate (composed of a series 3 alternating CNOT gates) is required, to account for the linear layout of the quantum computers.This gate exchanges the logical information contained in physical qubits q 1 and q 2 , thus allowing to implement the interaction q 0 − q 2 using physical qubit q 1 as an intermediary.Finally, we obtain the equivalent quantum circuit representation of an elementary Trotter step from Fig. 3, which requires 9 CNOT gates in total.To simulate the dynamics up to an arbitrary time t, we simply concatenate r = t/∆t copies of this circuit [70].

C. Parameters
We choose the following model parameters: on-site energies ϵ j ∈ {−1, 0, +1} and a coupling constant g = 0.5.We then perform the Trotter evolution up to time T = 3.0 with a time step ∆t = 0.2.Unless stated otherwise, quantum simulations shown throughout the main article are performed on qubits q 0 , q 1 and q 2 of the ibm lagos machine.We choose q 0 as the first qubit with energy ϵ 0 = −1, q 1 as the second qubit with energy ϵ 1 = 0 and q 2 as the third qubit with energy ϵ 2 = +1.

D. Observables
After performing the evolution, we monitor the dynamics of the system by evaluating the expectation values of different observables, which are tensor products of Pauli matrices.These observables may bear little physical interest, but their expectation values are straightforwardly estimated from simple measurements, and their exact evolution is easily obtained classically.We made two different experiments: in the first one we select observables from set letters correspond to the direction of the Pauli observable, and subscripts refer to the qubit they are measured on).
We therefore have access to a total of 7+7 different observables.All the measurement outputs are sampled over 32000 repetitions.

E. Raw Results
Despite the small number of qubits, the 3-qubit BCS circuits are challenging to simulate on a noisy QC due to their depth.Their execution is subject to significant experimental noise, mainly originating from imperfect (CNOT) gate applications (IBM estimates the CNOT gate's error rate at ∼ 10 −2 while the single-qubit gate's error is of the order of 10 −4 ).
We have simulated the dynamics with circuits containing up to 15 Trotter steps with 9 CNOT gates per step, i.e with a maximum of 135 CNOT gates.We plot in Fig. 4 the time evolution for three of the fourteen available observables (the other observables are plotted in Appendix C): the first qubit's x-component ⟨X 0 ⟩ (see Fig. 4-a), the third qubit's z-component ⟨Z 2 ⟩ (see Fig. 4b) and their joint evolution ⟨X 0 Z 2 ⟩ (see Fig. 4-c).We compare their evolution with the exact BCS dynamics (black solid line), the trotterized dynamics obtained on a noiseless classical computer (red solid line) and on a real, noisy, IBM QC (ibm lagos) (orange dotted line).We first emphasize that the first-order trotterized dynamics is indeed a good approximation of the exact evolution here, so we can expect the algorithmic Trotter error to be negligible compared to errors stemming from the noise acting on the QC.
When the Trotter circuits are run on a real, noisy QC, we observe that the measured expectation values of all observables decay with a priori unpredictable fluctuations.Despite the fact that the CNOT error rate is around 1%, the effect of noise is already dramatic after a few Trotter steps, and no interesting features of the dynamics can be extracted from these results.This illustrates the need for error mitigation strategies to fight the effects of the noise, which we can expect to be even more dramatic as the circuit depth increases.For the sake of comparison, simulation of this system as well as the following error mitigation protocols have also been performed on qubits q 18 , q 21 and q 23 of the ibmq ehningen device measured respectively in the direction x, y and z.The results are qualitatively similar and are shown in Fig. 13 in Appendix C.
Small systems like the one we consider, whose full dynamics is easy to simulate on a classical computer, constitute a good platform to benchmark different noisetailoring and quantum error mitigation protocols.In what follows, we implement a few of these protocols and assess their efficiency to improve the results of our quantum simulation.

III. SIMPLIFYING THE NOISE STRUCTURE : CROSSTALK RANDOMIZED COMPILING
Errors in QC have multiple experimental origins.They can originate for instance from imperfect measurement during a qubit state readout.They can also stem from the experimentally imperfect application of certain quantum gates, in particular CNOT gates.In addition, qubit decoherence constitutes another source of errors, which sets an upper bound on the size of the circuits than can be run.On superconducting chips of IBM, the decoherence times T 1 and T 2 are typically of the order of ∼ 100µs while the application of the CNOT gates is ∼ 300ns [17,71] (data taken from IBMQ backend properties).Because we apply no more than 135 CNOT gates, we consider decoherence of qubits negligible compared to the noise introduced by CNOT gates.
To address errors arising from noisy CNOT gates, understanding and simplifying their noise structure is crucial.Randomized compiling (RC) was introduced in [44] as a method for converting any coherent noise channel into a Pauli (incoherent) noise channel.The original proposal recommended applying the twirling protocol to all qubits, but in practice, this approach is often performed only on the active qubits of the CNOT gates [47][48][49][50], therefore overlooking spillover crosstalk effects.
In the present Section, we discuss the importance of applying RC to neighboring qubits as well, and improve the procedure to simplify crosstalk noise even further.We start, in Sec.III A, by presenting in details the standard RC protocol on active qubits.Then in Sec.III B, we propose an extension of the quantum gate twirling set on idle qubits, which turns any coherent noise channel into a depolarising (incoherent and unbiased) noise channel.By implementing the crosstalk RC protocol in our BCS simulation, we demonstrate its efficiency on IBMQ devices.Unlike for the standard RC protocol, we show that final results can be faithfully reproduced with classical simulations of noisy quantum circuits assuming a simple depolarising noise channel.

A. Standard randomized compiling protocol
Before mitigating errors introduced by the application of CNOT gates on IBM QCs, we first need a way to model them.The effect of external markovian noise on the physical qubits can be described by a superoperator called a quantum error channel [5], which is usually applied after the noisy gate, and acts on the full density matrix of the system.A general error channel for two qubits can be written as follows: with p * ijkl = p klij , and {σ 0 , σ 1 , σ 2 , σ 3 } ≡ {I 2 , σ x , σ y , σ z }.The trace-preserving condition yields 16 equations: reducing the naive number of 4 4 = 256 independent parameters to 240.We do not know a priori the values for the p ijkl , and determining them would require quantum state tomography protocols with unrealistic overhead.Moreover, this noise originates from many different experimental sources, and is not only different for different qubits, but can also change in time, making it impossible to fully characterize.Randomized compiling is a computational method which allows for the reduction in the complexity of the noise channel (Eq.( 4)) to only 15 free parameters, at the expense of running more circuit experiments.The resulting, equivalent noise channel reads: where 0 ≤ p ij ≤ 1 and i,j p ij = 1.Such an error channel is called a Pauli noise channel.
RC consists in inserting a random layer of singlequbit gates (considered as noiseless) before and after each CNOT gates such that the overall action is still equivalent to a CNOT gate: where U i (i ∈ {1, 2, 3, 4}) are random single-qubit gates, with combinations U 1 , ..., U 4 deliberately chosen to satisfy Eq. ( 8).The set of allowed combinations comprises 16 options that can be constructed by choosing U 1 and U 2 arbitrarily from the set T p = {I 2 , σ x , σ y , σ z }, and then choosing U 3 and U 4 from the same set such that Eq. ( 8) is satisfied.Table I of [41] lists the different possibilities for U i 's.
In the presence of noise, one can show [46] that averaging over all twirling configurations {U i } convert the general error channel, Eq. ( 4), to an effective Pauli noise, Eq. ( 7)).In other words, the average over the expectation values measured from all possible RC circuits coincides with the value that would be obtained from the original circuit with a Pauli noise channel on each CNOT gate.
The number of circuits with different twirl configurations scales exponentially, though: 16 nCNOT where n CNOT is the number of CNOT gates in the circuit.Performing an average over this number of circuit experiments is unrealistic in practice.However, in the assumption of noise Markovianity (i.e., the noise on different CNOT gates is not correlated), one can sample very few twirl configurations.For an observable ∈ [−1, +1], the deviation from the exact average scales O(1/ √ N RC ) where N RC is the number of RC circuits with different twirl configurations -in the spirit of Monte Carlo sampling.For N RC = 300, error bars in Fig. 6 are below the plot marker size, underscoring the method's accuracy.
We note in passing that the above discussion concerns the cost of "converting" the error channel by means of RC.Further use of error mitigation techniques (such as NEC in Sec.V) on the measured results would require exponential accuracy of the measured results (cf.Fig. 8) and thus impose exponential growth on the required N RC with the number of CNOT gates.Yet, in practice, this is not the limiting factor.For example, in Fig. 12, the number of RC circuits has been increased to 600 for t > 1.8, yet this adjustment does not bring about significant changes in comparison with Fig. 8.

B. Randomized compiling for crosstalk
So far, by restricting ourselves to 2-qubit error channel, we have supposed that the noise acting on CNOT gates only affects the qubits on which the gate is applied.In practice, the application of a CNOT is known to also produce errors on neighboring qubit, an effect known as crosstalk [24,26,31,32].Since the precise noise structure induced by crosstalk is also unknown, we aim to simplify it as well by extending RC to these neighboring qubits.
The crosstalk RC (cRC) procedure for 3 qubits is represented in Fig. 5. Unlike for the two active qubits on which the CNOT is applied, no logical operation is performed on the neighboring qubits.As a consequence, any single-qubit gates can be part of the twirling set (imagine replacing CNOT with identity in Eq. ( 8); then once could choose for any U 1 and U 2 m not only those belonging to T p ).We can thus simplify the structure of the noise channel on the neighboring qubits even further than for the active qubits by using two distinct twirling sets.The first one is the same twirling set T p as in the previous section.The second is composed of the gates T R = {R x (π/2), R y (π/2), R z (π/2)}.We perform two RC procedures in a row.First, we apply the gate drawn randomly from the set T p , then a second gate drawn from T R and undo their action by choosing their corresponding inverse gate after application of the CNOT on the neighboring qubits.
FIG. 5. Representation of the crosstalk RC protocol used for our 3-qubit simulations.The white gates are used in the standard RC protocol.In order to twirl crosstalk errors, we extend this scheme by adding the red gates.Gates P , Q, R, S, and T are either the identity matrix or one of the three Pauli matrices.P , Q, T are drawn randomly from Tp, and R and S are chosen such that the circuit is stricly equivalent to an isolated CNOT.Gates Ri(π/2) are π/2 rotations around the axis i ∈ {x, y, z}, drawn at random.Barriers indicate the order of application of gates and prevent annihilation between single-qubit gates.
In Appendix D, we derive the effective general nqubit error channel (n − 2 neighbors that can experience crosstalk), as obtained after performing cRC (see Eq. (D23)).The restriction of this channel to our 3qubits BCS simulations (with only one neighbor) has 16 × 2 n−2 − 1 = 31 free parameters: p i,j,s a=1,2,3 P i,j,s,a .ρ.P i,j,s,a , ( with where indices 1 and 2 stand for the control and target qubits of the CNOT, s indicates if an error has occured on the neighboring qubit (s = 1) or not (s = 0), and a encodes whether it was an X, Y or Z error (a = 1, 2 or 3).Note, in particular, that taking the partial trace of the density matrix over the neighboring qubit subspace yields the same Pauli channel as in the standard RC protocol presented in the last section as the effective error channel for the active qubits (see Eqs. ( 7), (D21)).Likewise, tracing over the active qubits yields a 1-qubit depolarizing noise channel for the neighboring qubit because p i,j,s does not depend on a: is the error rate and λ the depolarizing parameter.The key point about cRC is that it simplifies the error channel of neighboring qubits because it symmetrizes the effect of noise on the Bloch sphere with respect to the different directions.Application of R i (π/2) permutes the eigenstates of σ j and σ k : Indices are such that ϵ ijk = +1, where ϵ is the the Levi-Civita tensor.As a result, the depolarizing noise channel is obtained by equally mixing the different coefficients of the Pauli noise channel.

C. Comparing standard and crosstalk RC protocols through BCS simulation
We apply both the standard and crosstalk RC methods from the last sections to the BCS simulation algorithm described in Sec.II.For the crosstalk RC protocol (resp.standard RC protocol), results are shown in Fig. 6a,c,e (resp.Fig. 6-b,d,f).As expected, both RC protocols do not improve the accuracy of the results on their own, as they simplify the CNOT noise structure, but don't eliminate the noise.However, we observe that they tend to flatten the curve and attenuate the unpredictable fluctuations of the expectation values due to the complex structure of the original noise.This could be explained by the fact that the coherent part of the noise, typically responsible for such fluctuations, has indeed been turned incoherent.
To further interpret these results, we try to replicate the QC's output with classical simulations of our Trotter quantum circuits using a simple noise channel.In the following, we assume that the noise acting on the quantum device is close to isotropic (in the sense of the qubit basis), so that the Pauli noise channel restricted to the active qubits can be approximated by a 2-qubit depolarising noise channel.This assumption is rather standard [41] for quantum error mitigation protocols because it allows to relate noisy and noiseless expectation values analytically.Under this assumption, the (n = 3)-qubit error channel can be described by only 2 n−1 − 1 = 3 free parameters {λ CNOT , λ neigh., λ glob.}): where 0 and 1 denote the indices of the active qubits and 2 the index of the neighboring qubit, Tr {k} is the partial trace over the set of qubits {k}.We call this noise channel a quasi-local depolarising noise.As displayed in Fig. 6-a,c,e (magenta curves), the QC results can be well reproduced by classical simulations (using the density matrix method of the noise simulator Qiskit Aer) performed with a quasi-local depolarising noise (see Eq. ( 11)) applying on every CNOT of each junction (between qubits 0 and 1 and between qubits 1 and 2).This leaves us with 5 free parameters, λ 01 CNOT , λ 01 neigh.,λ 12 CNOT , λ 12 neigh.and λ glob.that we assume are the same for each Trotter step.We find a set of parameters that minimizes the square distance between the classical simulation results and all the data points for the 7 different observables (7 × 15 = 105 data points).The resulting fit parameters are: λ 01 CNOT ≃ 0, λ 12 CNOT ≃ 0.014, λ 01 neigh.≃ 0.05, λ 12 neigh.≃ 0.01 and λ glob.≃ 0.002, with the regression standard error χ 2 = 2.10 −4 .From the fit parameters, it is clear that the noise associated to neighboring qubits cannot be neglected.Rather, they seem to confirm that crosstalk plays an important role in IBM's quantum devices and that the scheme we are proposing is essential for the implementation of efficient error mitigation protocols.The noise parameters derived differ noticeably from the CNOT gate error rate (around 0.005 − 0.01) announced by IBM.We believe that this discrepancy stems from the method used to estimate these parameters.IBM uses the randomized benchmarking protocol [18,19], which does not take crosstalk into account because each junction is considered separately.The resulting error rate does not seem to relate easily to the Pauli noise channel parameters that we obtain through the cRC method.
For comparison, we also plot in Fig. 6-b,d,f, fit by classical simulations of the results we obtain using the standard RC protocol (no twirling of neighboring qubits).The resulting fit parameters are: λ 01 CNOT ≃ 0.007, λ 12 CNOT ≃ 0, λ 01 neigh.≃ 0.07, λ 12 neigh.≃ 0.009 and λ glob.≃ 0.0002 and the regression standard error is χ 2 ≃ 0.02.The quality of the fit is not as good as in the former case, indicating that the noise model of Eq. ( 11) is a rough approximation of the real noise occurring on the QC.
The above shows that cRC is essential to accurately describe the noise on the system with Eq. (11), ensuring that the effect of crosstalk is reduced to a depolarising noise channel on the neighboring qubits.In turn, the simpler noise structure facilitates error mitigation, as we illustrate in the next Section.

IV. ERROR MITIGATION TECHNIQUES
In order to illustrate the practical implications of our cRC protocol, we implement two auxiliary error mitigation schemes.First, to tackle errors stemming from measurements, we apply a simple readout error correction (REC) procedure detailed in [72].Second, to mitigate the noise affecting CNOT gates, we use the noise estimation circuits (NEC) method proposed in [41].

A. Readout error correction (REC)
Measurement is expected to introduce an error rate of 2% (data taken from IBM backend properties).In what follows, we systematically apply the iterative Bayesian unfolding method presented in [72] to correct readout errors (REC).Note that we invert the whole 2 3 × 2 3 measurement matrix, taking into account measurement error correlations.This method scales poorly with the number of measured qubits.However, we have found that perfoming the inversion qubit-per-qubit, discarding the correlations, yields similar results.

B. Noise estimation circuits (NEC)
The NEC technique aims to restore the noiseless value of an observable by dividing the noisy expectation value of an observable by a certain factor which estimates the circuit fidelity.This factor is obtained by running additional circuits derived from the original one, while preserving the same CNOT structure, and in practice allows one to partly get rid of the errors stemming from depolarizing noise channels.
To understand the above statement, first, one has to relate the noisy and the noiseless expectation values of an observable.In the same way as done in [41] for local and global depolarising noise (see also Appendix F), we expand the noisy outcome up to the first order in the number of errors for the quasi-local depolarising noise channel (see Eq. ( 11)), which yields: where n CNOT is the number of CNOT gates in the circuit, and ⟨O⟩ 0 is the noiseless outcome.⟨O⟩ 1 is the sum of expectation values obtained from all circuits where an error occurred on a single CNOT gate, putting the active qubits of the CNOT in the maximally mixed state I/4.⟨O ′ ⟩ 1 is the sum of expectation values obtained from all circuits where an error occurred on a single CNOT gate, putting the neighboring qubit of the CNOT in the maximally mixed state I/2.For exposition simplicity, we consider here that noise parameters for different junctions are equal, but the result can be generalized to CNOT gates with different noise parameters for different junctions.
To mitigate these errors, it has been proposed in [41] to run extra circuits (called noise estimation circuits) whose outcome yields a part of the error terms appearing in Eq. (12).To construct these NEC, one simply removes all single-qubit gates from the original Trotter circuits (see Fig. 7).
In our Trotter circuits, we can distinguish two types of CNOT gates.The first type is the CNOT gates encoding the interaction terms.After removing all the single-qubit gates, they are now applied in a row with the same control and target qubits.As they always come by pairs, they are now logically equivalent to the identity operator.The second type is the CNOT gates implementing the SWAP gate.Their action does not change after removing singlequbit gates.The overall unitary operation performed by the estimation circuits, for any number of Trotter steps, is thus the identity operator up to some SWAP operations.
Applying Eq. ( 12) to the NEC and dividing the noisy Trotter circuits outcome by the fidelity estimated via NEC leads to the mitigated expectation value: To build it, we remove the initial state initialization and single-qubit gates from the example circuit in Fig. 3.A random initial product state is added before the circuit and undone at the end (white boxes).We also perform RC on each CNOT gate (not shown here).
where ⟨E⟩ 1 and ⟨E ′ ⟩ 1 are the NEC measured expectation values correspond respectively to ⟨O⟩ 1 and ⟨O ′ ⟩ 1 in the original circuits and ⟨E⟩ 0 = 1.Eqs. (13) shows that the NEC mitigates exactly the noise associated with the global part of our quasi-local depolarising channel.For the active qubits and neighboring depolarising channels, it is expected that the coefficients in front of λ CNOT and λ neigh.are strongly suppressed.
To show this, we propose in Appendix G an expansion in terms of the Trotter time step ∆t (the computation is done in the case of a local depolarising noise).This computation takes into account any number of errors in the circuit but only relates the noisy result to the noiseless terms at zeroth an first order ∆t.Applying the equation found (see Eq. (G11)) to the case of NEC gives an exact result as the only non vanishing term is of order 0 in ∆t.Therefore, one finds that the main term of the noisy Trotter circuit is reproduced by the fidelity estimated using NEC.The result holds for any number of errors.In particular, at small time (i.e. for a few Trotter steps), we have: As a consequence, NEC is an efficient method to remove noise at small evolution times, or equivalently for small-depth circuits.The results can be generalized also for quasi-local depolarising noise.At longer evolution times, we expect the noise estimated by NEC to differ from the noise on the Trotter circuits.To remove this remaining noise, it has been proposed in [41] to use the zero noise extrapolation method [11,12], which consists in artificially increasing the noise of sets of CNOT gates [39] in order to extrapolate the result to the case of zero noise.
In this article, we do not apply this additional mitigation scheme because as we will see, at longer evolution times the QC result deviates from the classical simulations using depolarising noise, indicating that the cRC noise channel can no longer be described accurately as an effective quasi-local depolarising noise channel.
We remind the reader making the noise describable in terms of (11) requires the application of RC techniques.
Therefore, any circuits we run involve twirling applied in agreement with either RC or cRC protocols described in Sec.III.
Finally, in order to stabilize the fidelity-estimation output over different initial states, we generate random initial product states and undo them at the end after the application of the noise estimation circuit by inserting the corresponding inverse gates and taking into account the action of the SWAP gates.We produce N RC = 300 such noise estimation circuits, each of which has a different random initial state and a different twirling configuration for the RC, cf.Fig. 7.

V. COMBINING CROSSTALK RC AND ERROR MITIGATION
In order to obtain the error-mitigated results and thus assess the ultimate benefit of replacing RC with cRC, we combine all the techniques described above: REC, cRC/RC, and NEC.Namely, we divide the Trotter result obtained in Sec.III -after averaging over 300 RC circuits and performing REC -by the result of the noise estimation circuits as described in Sec.IV, also averaged over 300 RC and with REC.
The results for the observables ⟨X 0 ⟩, ⟨Z 2 ⟩ and ⟨X 0 Z 2 ⟩ using the crosstalk RC technique are plotted in Fig. 8a,c,e.The results for the standard RC protocol are shown in Fig. 8-b,d,f.While not perfect, the results using cRC are clearly better than the results using the standard RC.
Note the error bars that are sizeable in Fig. 8, as opposed to the negligible ones in Fig. 6.This is due to dividing by the fidelity estimate in the NEC protocol, which amplifies the expectation value, but also the errors.For the same reason, the expectation values of the observables may lie outside their range [−1, 1].In practice, such behavior indicates that we have reached the regime where the efficiency of NEC falls off, as unaccounted sources of noise kick in.
Results for all the remaining observables ⟨X 0 ⟩, ⟨Y 1 ⟩, ⟨Z 2 ⟩, ⟨X 0 Y 1 ⟩, ⟨X 0 Z 2 ⟩, ⟨Y 1 Z 2 ⟩ and ⟨X 0 Y 1 Z 2 ⟩ are gathered in Fig. 12 of Appendix C. The conclusion of cRC producing better results than RC holds for these as well.This is quantitatively demonstrated in Fig. 1-b by comparing the accuracy averaged over all the observables.Below we provide an in-detail analysis of the results.

A. Global accuracy improvement
A comparison between the two columns of Fig. 8 clearly shows that cRC, i.e. the twirling of neighboring qubits using rotation gates, qualitatively improves the accuracy of noise estimation protocol, while not requiring to run more circuits, compared to the standard RC protocol.These experiments demonstrate that, at least in the context of superconducting qubit architectures like IBMQ QC, this method should systematically be preferred.
The error mitigation, however, remains imperfect, as simulations still deviate from the noiseless result.These deviations can be separated into two categories.The first comprises the cases where the noiseless expectation values that we are measuring become close to 0. In this case, we observed that the RC result of the Trotter circuit sometimes gets the wrong sign (see e.g.Fig. 8-a, 1 ≤ t ≤ 1.8 and Fig. 8-e, 0.8 ≤ t ≤ 1.4).As a consequence, dividing by the fidelity estimate of NEC (which is positive by definition), only enhances this sign problem and makes the result deviate strongly from the noiseless curve, even at intermediate times.In spite of this, the oscillations of the exact evolution are still well reproduced by the mitigated curve.The second category relates to the strong deviations occurring at longer times (see e.g.Fig. 8-c,e t ≥ 2).We identified the two problems as having different origins.The first one stems from theexpected -effect of a local, 2-qubit depolarizing noise (cf.Eq. ( 11)), while the second issue is caused by the fact that the noise channel is not exactly depolarizing after RC, which negatively affects the output of the estimation circuits.This deviation becomes more important as the time increases since the estimated fidelity becomes very small.The next two subsections are devoted to our analysis of these deviations and their origin.

B. Error mitigation enhances the wrong sign
In this subsection, we discuss why the mitigated result may display a wrong sign as the expectation value approaches 0, inducing unwanted deviations from the perfect result, which are in turn enhanced by the NEC technique.In the case of a pure global depolarising noise, Eq. ( 12) (or Eq. (F2)) guarantees that the noisy result stays of the same sign as the noiseless one.However, when the noise is local (but still depolarising) this may not hold anymore.The equation becomes more involved, and the different terms in the Trotter expansion do not experience the same multiplicative error (see Eq. (G11) of Appendix G).In Fig. 8, we simulate the Trotter circuit using the local noise channel of Eq. ( 11) for each CNOT and the depolarising parameter from the fit of Fig. 6 (magenta and cyan curves on Fig. 6).We then apply the same noise channel to the estimation circuits and divide both results to obtain the mitigated one (green and blue solid curves in Fig. 8).In Fig. 8, we show classical simulations using the noise channel of Eq. ( 11), which reproduce quite well the observed deviation of the QC results.Therefore, in this case, the deviation from the noiseless dynamics is explained by the local depolarising noise occurring on qubits.In this regime, the NEC protocol only works approximately, but other error mitigation protocols such as Zero-Noise Extrapolation (ZNE), are more efficient to tackle local depolarising noise, and can be combined with NEC to improve the results.In Fig. 8-c,e, at longer time, we observe that mitigated fits (green solid curves) deviate slightly from the noiseless Trotter dynamics, but this deviation does not correspond to the one observed on the QC (green dotted curves) which indicates that it is not caused by the locality of the noise.

C. Analysis of long-time deviations
In Fig. 8, the error bars quantify the statistical fluctuations due to the finite sampling of the RC method and the finite number of shots (see Appendix E for a detailed explanation of the method used to estimate statistical fluctuations).The deviation observed from the classical simulations performed using a quasi-local depolarising noise are not within these error bars.In Fig. 12, we have increased N RC of NEC from 300 to 600 for t ≥ 1.8 in order to reduce this uncertainty.If one compares Fig. 8-a,c,e together with Fig. 12-a,c,e, data points have moved consistently within the range given by the error bars, there is no clear improvement of the results.Another source of error must be the cause of this deviation.
Since the deviations are not related to an insufficient number of RC samples, or to shot noise, it must originate from a poor estimation of the effective noise parameters on the Trotter cicuits after cRC by the NEC.To understand this, we compare the noise parameters of the Eq. ( 11) estimated using the Trotter circuits with cRC with the ones obtained from the fidelity estimations from the estimation circuits (see Fig. 9).We fit the 7 different estimation curves and find that resulting fit parameters obtained are: λ 01 CNOT ≃ 0.007, λ 12 CNOT ≃ 0.016, λ 01 neigh.≃ 0.05, λ 12 neigh.≃ 0.009 and λ glob.≃ 0.002 and χ 2 ≃ 4.10 −4 .These results only slightly differ from what we obtained from the randomized compiling of Trotter circuits (see Sec. III C).These small deviations on the noise parameters estimated by the two types of circuits are not relevant at small times but become crucial at longer times.Indeed, the uncertainty on the mitigated result becomes large as the denominator (estimation circuit) approaches 0. It can be estimated as (see Supple- mental Material of [41]): where ⟨E⟩ noisy is the mean value estimated from the NEC, ⟨O⟩ noisy is the mean value estimated from the Trotter circuits, ⟨O⟩ mitigated = ⟨O⟩noisy ⟨E⟩noisy , σ e , σ o and σ m their respective standard deviation.Fig. 9 shows that the fidelity is exponentially decreasing with the number of CNOT gates.Note, in particular, that the noise estimation method should be less stable for multiqubit observables because one expects their fidelity to be smaller compared to single-qubit observables (as they are more likely to be directly affected by errors).For an example, see Fig. 12-g in Appendix C where the error bars (uncertainty due to shot noise and the finite sampling of the RC method) become large for the observable X 0 Y 1 Z 2 at long time.NEC q 0 NEC q 2 NEC q 0 q 2 FIG. 9. Effective fidelity estimation for qubits q0 (blue points), q2 (green points) and q0q2 (red points), obtained by averaging over 300 randomly compiled noise estimation circuits.Solid lines are the results of the fit using the noise model of Eq. (11).Error bars due to shot noise and finite sampling of RC configurations are smaller than the plot marker size.
The noise parameter uncertainty stemming from the Trotter and the noise estimation circuits reflects the deviation of the effective noise from a pure quasi-local depolarising noise even after the application of cRC.This deviation can have two origins.On one hand, the assumption of a 2-qubit depolarising noise on the CNOT's active qubits is not exact, as the theorem only guarantees a Pauli noise channel.To avoid making this extra assumption, it will be either necessary to find a way to further simplify the Pauli noise channel on active qubits into a depolarising noise or to use more sophisticated error mitigation protocols (as compared to NEC) which take into account the full effect of the Pauli noise.On the other hand, it may be possible that single-qubit gate's error cannot be neglected anymore.For the longest circuit using the crosstalk twirling protocol, we have on average (depending on the twirl configuration) n 1 = 1187 singlequbit gates and 135 CNOT gates.Single-qubit gate's error rate being estimated at ϵ 1 ∼ 10 −4 , their total effect on the output of a circuit should be around √ n 1 ×ϵ 1 ∼ 0.3%.
However, as we have seen with Eq. ( 15) and Eq. ( 16), small deviations can become important when the denominator approaches 0.

VI. CONCLUSION
In this work, we have presented and analysed simulations of the BCS hamiltonian for three energy levels on an IBM superconducting quantum computer.To obtain reliable results from the noisy QC, we employ a simple error mitigation technique called NEC, whose efficiency is conditioned by the simplicity of the noise structure.In order to achieve the best possible results, we use the RC technique, which effectively turns the unknown and complex noise channels acting on the QC into a simpler channel.
We use two versions of RC: the standard RC and the crosstalk RC.We find that the latter enables achieving much better accuracy, by further simplifying the noise caused by crosstalk effects, while not requiring extra circuit runs or qubits compared to the former.This demonstrates the significance of crosstalk on IBM quantum computers.
Our work closes a gap in the field of error mitigation on NISQs: while the specific structure of the noise on real devices is rather complex and usually not well understood, theoretical proposals for error mitigation techniques usually assume a simple depolarising error channel.We explicitly demonstrate that the use of crosstalk RC does indeed bring unknown noise closer to depolarising noise, allowing us to take advantage of existing error mitigation proposals.This is the main reason for the significant overall performance improvements stemming from the use of our crosstalk RC protocol, as shown in Fig. 1.
There is another appeal in our approach: crosstalk errors are notoriously hard to characterize.Being a systematic, coherent noise channel, it is difficult to predict their net effect on the output of a given quantum computation.Using crosstalk RC and our theoretical prediction for the structure of the effective error channel, we show that existing protocols significantly underestimate the real error in superconducting quantum computers.We find in particular that crosstalk error channels map to a depolarising error channel with error rate per CNOT gate that can be as high as 5%, i.e. almost 5 times the values obtained by randomized benchmarking for the error rate on the active qubits of the CNOT gates.This result is in good qualitative agreement with the recent papers [73,74], in which crosstalk is also found to be the main source of errors for quantum computation on IBM quantum computers.
The accuracy gain from using crosstalk RC, in comparison to the standard RC, in the explicit simulations of the dynamics of the BCS model on NISQs is significant.It allowed us to perform intermediate-time (up to 15 Trotter steps) quantum simulations of an all-to-all interacting model with an accuracy beyond what can, to our knowledge, be found in the literature.This brings us one step closer to quantum simulations on NISQs matching the capabilities of classical computers. of 1-qubit gates.It is indeed a product of R z (θ) = e −i θ 2 σ z gates, which perform a rotation of angle θ around the Zaxis on the Bloch sphere.In our case θ j = −∆t(2ϵ j − g) where j is the index of the different qubits.The R z gate is a native gate for IBM quantum computers.

Transpilation of 2-body operators
The rightmost product of Eq. ( 3) describes the interaction between Cooper pairs.To implement them, one needs to use 2-qubit gates for entangling both qubits.The CNOT gate is the native 2-qubit gate of most of IBM QC.
Using σ x i σ x j , σ y i σ y j = 0, we obtain: where α = −g∆t in our case.To implement 2-body interaction terms of the form where a ∈ {x, y, z}, one needs 2 CNOT gates, a single R z (α) gate, and a 1-qubit gate O a : We represent in Fig. 10 the 2-qubit quantum circuit equivalent to the exponential (B2).Therefore, imple-10.Quantum circuit decomposition of the unitary operator e −i α 2 σ a i σ a j with a ∈ {x, y, z}.Here i and j denote qubit indices, the vertical lines represent the 2-qubit CNOT gates, for which the control qubit is indicated by the black dot.The other rectangles represent 1-qubit gates explicited in Eq. (B3).
menting Eq. (B1) as a quantum circuit requires 4 CNOT gates.Note however that after implementing such an interaction term using Qiskit [68], we use the transpile function with the option optimization level=3 (i.e. the most optimized method), which outputs an equivalent quantum circuit using only 2 CNOT gates (with some single qubit native gates)

Transpilation for the whole dynamics
In this subsection, we use the 1-qubit and 2-qubits quantum circuits previously derived to build the circuit implementing the fully trotterized dynamics.Naively, we could simply concatenate the corresponding elementary circuits on the physical qubits i,j ∈ {0, L − 1} to reproduce the product of Eq. (3).We see then that the number of 2-qubits interaction circuits required grows as L 2 = L(L−1) where L is the number of energy levels, which translates into L(L − 1) noisy CNOT gates for each Trotter step.However, an additional difficulty stems from the architecture of the IBM quantum computers (and superconducting quantum computers in general).As shown in Fig. 2, QCs used (ibm lagos and ibmq ehningen) have a certain qubit coupling map, and CNOT gates can only be performed on neighboring physical qubits.As a consequence, it is not straightforward to implement an interaction between two distant qubits (in the sense of the coupling map of Fig. 2).To overcome this problem, one has to swap around the logical information contained in physical qubits, so that two interacting logical qubits are brought together on neighboring physical qubits.This swap of information is achieved using the so-called SWAP gate.A simple example of the use of SWAP gates is given in Fig. 11.Note that the SWAP gate is not a native gate of the IBM machines.It requires three CNOT gates as shown in Fig. 11.
FIG. 11.Top: Logical decomposition of the SWAP gate (left) into 3 native CNOT gates (right).Bottom: Example of usage of the SWAP gate (represented with crosses at each end).The numbering refers to the position of logical qubits, while the solid horizontal lines are the physical qubits of the QC.The leftmost circuit cannot be realised in a layout where qubits 1, 2 and 3 are on a line, so we have to use the rightmost circuit instead.Because of this, the number of required CNOT increases by 3, and the positions of the logical qubits are modified.
To implement an interaction between two qubits initially separated by a distance d (in the sense of the coupling map of Fig. 2), at least d − 1 SWAP gates are required, i.e 3(d − 1) CNOT gates.Consequently, the number of CNOT gates -and thus the number of errorsgrows very quickly with the size of the system we want to simulate (i.e. the number of qubits).

Initial state preparation
In an effort to minimize the number of noisy CNOT gates in the circuit, we made the choice to start with an initial product (non entangled) state, which can be prepared using only single-qubit gates.A natural product state in the BCS problem is the mean-field ground state which is the ground state of the BCS hamiltonian in the thermodynamic limit but has a non trivial dynamic for finite system.To construct it, we use the gap equation derived from the mean field approximation [69]: where ∆ is the energy gap.The BCS mean-field ground state wave function can be written as: where and we also have the normalization condition We rewrite u i and v i as: which yields: Thus, to prepare this initial state we simply apply on each qubit the gate R y (θ i ) followed by the gate R z (φ).The first box on Fig. 3 shows an example of this preparation with the set of parameters specified in Sec.II C (these correspond to ∆ ≈ 0.46).

Appendix C: Evolution of all observables mitigated
In this appendix, we present the entirety of the results of our runs on the quantum computer, without RC, with standard RC and with crosstalk RC.This data was used to obtain the average absolute error plotted in Fig. 1, and only a portion of it is presented in the main text.In Figs. 12 and 13 we show the full time evolution for a first set of seven observables, executed respectively on ibm lagos and ibmq ehningen.In Fig. 14, we show the same evolution for a different set of seven observables, also obtained from ibm lagos.RC averages for both the Trotter circuits and the NEC have been performed over 300 samples, with the exception of the crosstalk RC curves at times t ≥ 1.8 where we averaged over 600 NEC to check that the deviations do not come from a lack of RC circuits.One can see that over different runs, the noise may change and the efficiency of the mitigated protocol varies.12. Evolution of the 7 different observables derived from the measurement of of ibm lagos' qubits q0, q1 and q2 respectively in the direction x, y, z after application of mitigation protocols (NEC + RC + REC).Blue (resp.green) curves correspond to standard (resp.crosstalk) RC.Dashed curves are the result obtained on a QC and the solid line are the corresponding fit made with the noise model of Eq. (11).Exact evolution (black lines), noiseless (resp.noisy) Trotter dynamics (red lines) (resp.dashed orange curves) are also plotted for each quantity.Error bars are due to the limited number of shots and the finite sampling over twirl configurations.13.Evolution of the 7 different observables derived from the measurement of ibm ehningen's qubits q18, q21 and q23 respectively in the direction x, y, z after application of mitigation protocols (NEC + RC + REC).Blue (resp.green) curves correspond to standard (resp.crosstalk) RC.Dashed curves are the result obtained on a QC.The green solid line is the fit of crosstalk RC made with the noise model of Eq. ( 11) (fit for the standard RC did not converge).Exact evolution (black lines), noiseless (resp.noisy) Trotter dynamics (red lines) (resp.dashed orange curves) are also plotted for each quantity.Error bars are due to the limited number of shots and the finite sampling over twirl configurations.

Appendix D: Noise channel after randomized compiling
First, we review the detailed computation of a general n-qubit noise channel twirled over the set T ⊗n p = {I 2 , σ x , σ y , σ z } ⊗n .Then, we argue that method can be applied to a 2-qubit noise channel induced by the application of a noisy CNOT gate.The main difference being that the first twirl can only be applied before the CNOT.Finally, we study the crosstalk RC protocol de- Evolution of the 7 different observables derived from the measurement of ibm lagos' qubits q0, q1 and q2 in the direction z after application of mitigation protocols (NEC + RC + REC).Blue (resp.green) curves correspond to standard (resp.crosstalk) RC.Dashed curves are the result obtained on a QC and the solid line are the corresponding fit made with the noise model of Eq. (11).Exact evolution (black lines), noiseless (resp.noisy) Trotter dynamics (red lines) (resp.dashed orange curves) are also plotted for each quantity.Error bars correspond to the uncertainty due to shot noise and the finite sampling over twirl configurations.fined in Sec.III and derived the n-qubit channel.We also show that when traced over the neighboring qubits this channel reduced to a 2-qubit Pauli noise channel and when traced over all the qubits except one neigbouring qubit of the CNOT it gives a 1-qubit depolarizing noise channel.
1. Pauli twirling of a n-qubit noise channel Here, we follow the step of computation detailed in [46].A general noise channel applied on a density matrix can be written as: M is the so-called Kraus operator and we should stress that for a given noise channel this decomposition in terms of Kraus operators is not unique.Each Kraus operator can be decomposed over the Pauli basis: where P = {I 2 , σ x , σ y , σ z } ⊗n .The general noise channel reads: Tr(M.

Randomized compiling for CNOT gates
In practice, a noise channel emerges from the application of a noisy gate.Consequently, RC cannot be directly applied to the noise channel; rather, it is performed on the entire noisy gate, modeled by its noiseless counterpart Eq. (D4) rewrites: where w ′ ∈ P and such that w.U = U.w ′ .In general, for any U it is not guaranteed to find such w ′ .Following the main text, here the noisy gate U noisy will be the 2qubit CNOT gate and for all w it exists such w ′ .Let us denote CNOT ij the CNOT gate acting on qubit i and j and w = w i ⊗ w j a̸ =i,j w a .For any (w i , w j ) ∈ T ⊗2 p Table I of [41] gives the corresponding (w ′i , w ′j ).For a ̸ = i, j, the action of the perfect CNOT is trivial and then w ′a = w a .The rest of the computation follows the above derivation.This twirling of noisy gates is called randomized compiling because, as it is explained in the main text, the space of twirling gates grows exponentially with the number of CNOT gates in the circuit and one usually samples with a finite number of different twirl configuration this space by choosing randomly gates in the twirling set.

Rotation twirling
In this section, we derive the noise channel obtained after Pauli twirling and apply the rotation twirling on neighboring qubits such as described in Sec.III B (see Fig. 5).First, we start by a simpler case, we consider a 1qubit noise channel and apply this double RC procedure.The noise channel of Eq. (D4) is modified as follows: where T R = {R x (π/2), R y (π/2), R z (π/2)} as defined in Sec.III B. For each rotation R, the Pauli matrices p 1 and p 2 are mapped to the Pauli matrices p ′ 1 = R † .p 1 .R and p ′ 2 = R † .p 2 .R and we rewrite Eq. (D10): The same derivation as in the above subsection can be done for p ′ 1 and p ′ 2 and therefore only the diagonal term p ′ 1 = p ′ 2 does not vanish.We get: The effect of the rotation matrices when the identity matrix is applied (p = I 2 ) does not change the matrix: where k, l = x, y, z and ϵ klm is the Levi-Civita tensor.Eq. (D12) reduces to: Eq. ( D15) is a 1-qubit depolarising channel.Adding a rotation twirling set equally mixes the different coefficients of the Pauli noise channel to obtain a depolarising channel.
Let us now consider the real case where a CNOT is applied and the rotation twirling is performed over the neighboring qubits.The twirled channel reads: where ρ ′ = CNOT.ρ.CNOT.Here n denotes the number of qubits on which the noise channel applies.When considering crosstalk, n is equal to 2 (the qubits on which the CNOT is applied) + the number of direct neighbors.The same trick can be applied as in Eq. (D11) and after a similar derivation we end up with: The probability to apply to the density matrix the Pauli gate p is 1 Let us consider for the rest of the derivation that 1 and 2 denotes the index of the control and target qubits of the CNOT gate, hence, there is no rotation matrix applied on these qubits.We decompose the whole Pauli matrix and rotation matrix on each individual qubits: P = One can also trace over all the qubits except one neighboring qubit b.The probability to apply the gate p b is: When p b = σ x , σ y or σ z , the probabilities are the same: where q b relates to the depolarizing parameter λ b = 4q b 3 .Therefore, it mimics a depolarizing channel when looking at only one neighboring qubit.When tracing over all the neighboring qubits and looking at the noise channel on qubits 1 and 2, one has still a Pauli noise channel where the probability to apply the gate p 1 ⊗ p 2 is: The general form of the noise channel obtained after

Relation between noisy and noiseless outcome
In the simple case of a global depolarising channel, this can be done exactly.For local depolarising channel, the relation can be expanded in terms of number of errors occuring in the circuit and one keeps only the first order term.After considering these two noisy channels, the result generalizes to the case of Eq. ( 11) and yields Eq. ( 12) which includes these two cases plus a noise on the neighboring qubit.
A global depolarising channel on each CNOT reads: where n is the number of qubits in the circuit and 0 ≤ λ ≤ 4 n 4 n −1 .If an error occurs in the circuit, it outputs the maximally mixed state.The noisy result of a traceless observable is therefore expressed as [41]: where n CNOT is the number of CNOT gates in the circuit.
In the case of a depolarising channel acting only on the qubits i, j where the CNOT is applied (called local depolarising channel), the error channel is written as: where Tr ij is the partial trace over the active qubits i, j.
An exact expression between the noisy result and the noiseless cannot be derived but one can expand it in terms of the number of errors: where ⟨O⟩ i is the outcome such that i errors occurred in the circuit, it is a sum of expectation values of all circuit where i CNOT gates output the maximally mixed state on the active qubits.In the last line we expand the expression up to one error occuring in the circuit.

Noisy expectation value of the noise estimation circuit
In the case of a global depolarising noise, applying Eq. (F2) to the NEC leads to: For the local noise, we apply Eq.(F4) in the case of NEC and divide both results, which yields: We will now show in the next appendix that the term in front of λ CNOT is small.
Appendix G: First order Trotter expansion of an observable under a local depolarising noise In this Appendix, we show that assuming a local depolarising noise on each CNOT gate, the noise of the NEC reproduces part of the noise of the Trotter circuit up to the first Trotter order.
Every linear quantum circuits can be written as a sequence of a 2-body unitary gates (see Fig. 15-a for an example).
We start by considering only one qubit measured (qubit m) along the direction d = x, y, z (in Fig. 15-a m = 2).The density matrix just before the measurement is denoted as ρ 0 .Thus, one computes the quantity Tr(ρ 0 σ d m ).Each U k ij unitary gates contains an arbitrary number of CNOT gates r k ij .Here i and j are the qubit indices on which the unitary gate is applied and k corresponds to the 2k − 1 layer of gates if i is odd and 2k layer if i is even.We suppose a local depolarizing noise after each CNOT gates with parameter ε: where Tr ij denotes the partial trace over the qubits i and j.No matter on which CNOT gates the noise occurs inside a U k ij unitary gate the noisy outcome will be Iij 4 ⊗ Tr ij (ρ).Then the noisy channel after a U k ij unitary gate is: . Suppose we have noise on the last layer of unitary gates (in the above example on U 3  34 and U 3  12 ).If the noise is on a unitary gate which is not on the measured qubit, this gate is not in the lightcone of the measured qubit and therefore its noise cannot affect the result.If the noise is on a unitary gate which directly acts on the measured Fig. 15-c   To implement interaction between two non neighboring qubits, one need to use SWAP gates which are succession of three CNOT gates.These SWAP gates need to be taken into account at order 0 because they do not depend on any small parameter.This modifies the above formula: with SWAP gates the measured qubit does not remain in the same physical qubit we need to follow it during the course of the circuit.In the formula of Eq. (G11), the qubit measured m becomes a function of the layers: m(j) which indicate the location of the measured qubit at layer j.
Eq. (G11) can be applied to estimate the noise measured by the noise estimation circuit.In this case, the general U a bc gates reduce to the identity operator up to some SWAP operations and ∆t = 0. Therefore, these circuits estimate only the noise term in factor of the whole expression L j=1 (1 − ϵ) r j m−1m (1 − ϵ) r j mm+1 i.e the noise coming from the CNOT acting on the measured qubit.Then, it shows that if one divides the noise of the Trotter circuit by the noise of the estimation circuit, one can remove all the noise of the main term and a part of the first order Trotter term.
The number of higher order terms in ∆t scale exponentially with the total time evolution and therefore nothing guarantee that these terms will remain small.Their noise will deviate strongly as time goes on from the noise estimation circuit (see Fig. 8 for a simulation of the noise estimation method using the error channel of Eq. ( 11) (quasi-local depolarizing noise)).Therefore, the noise estimation circuit is a method which delays the effect of the noise and allows simulation to longer evolution time.
Eq. (G11) can be generalized to multiple measured qubits.For each term containing the index m for the measured qubit, one should add a product running over the different measured qubits.The formula for the noise estimation circuit also gets this modification and is still valid in that case.This result generalizes to an error channel of the form of Eq. (11).As long as a depolarising channel is applied on qubits whenever an error occurs, the main term of the Trotter expansion will be reproduced by the estimation circuit.

FIG. 1 .
FIG.1.Illustration of the importance of crosstalk mitigation.(a): Pictorial representation of noise occuring during the application of a CNOT gate (blue cloud).Crosstalk may extend the effect of this noise to the neighboring qubits.Gates applied before and after the CNOT gates depict the RC protocol (white gates are used for the standard RC procedure, adding the red gates constitutes the crosstalk RC).A more detailed description is given in Sec.III.(b): Figure of merit for the simulation accuracy when using different error mitigation techniques.Clearly, the effect of mitigating crosstalk is not negligible.The figure of merit is defined as ϵ = S −1 j |ij||(nj − ij)|, where j enumerates all simulated evolution durations and observables measured, ij is the ideal noiseless expectation value for the respective observable, nj is the value measured on the noisy QC; S = j |ij| represents the normalization factor, as |ij| weights prioritize the contributions of observables with expectation value away from 0. (c)-(d): Evolution of the expectation value Y of the second qubit out of three under the 3-body BCS hamiltonian (see Eq. (1)) with energy levels {−1, 0, +1} and coupling constant g = 0.5 respectively on both IBMQ devices.The red curve is the noiseless trotterized dynamics, the dashed orange curve represents the simulation on a real, noisy QC.The dashed blue (resp.green) curve shows the simulated evolution when using a number of error mitigation protocols: readout-error correction, standard randomized compiling (resp.crosstalk randomized compiling), and NEC.While not allowing for perfect agreement with the ideal red curve, crosstalk randomized compiling dramatically improves the accuracy of the quantum simulation.

FIG. 3 .
FIG.3.Transpiled quantum circuit for a single Trotter step (3), preceded by the initial state preparation.The on-site energy and the interaction between the first two qubits are highlighted in red.The highlighted SWAP gate is required to ensure the interaction between the first and last qubit on a linear layout (see main text).Quantum circuits for the evolution up to arbitrary time t = r∆t are built by concatenation of this elementary, circuit r times (without the initialization), and thus contain 9r CNOT gates.

X 0 Z 2 FIG. 4 .
FIG. 4. Time evolution of the first qubit's x-component ⟨X0⟩ (top panel), the third qubit's z-component ⟨Z2⟩ (middle panel) and their joint evolution ⟨X0Z2⟩ (bottom panel) under the exact BCS hamiltonian evolution (black), the Trotter noiseless circuits (red) and the Trotter circuit on real noisy QC (orange).Error bars due to shot noise are smaller than the plot marker size σs ∼ 1/ √ Ns = 5.10 −3 where Ns = 32000 is the number of shots.

Measurement X 0 Z 2 FIG. 6 .
FIG.6.Time evolution of ⟨X0⟩ (top panels), ⟨Z2⟩ (middle panels) and ⟨X0Z2⟩ (bottom panels).The green (resp.blue) dashed curves on the left (resp.right)panels correspond to the crosstalk (resp.standard) RC protocol.The RC protocol has been averaged over 300 different circuits.The magenta (resp.cyan) solid curves on the left (resp.right) panels are the result of a fit of the green (resp.blue) curves made using classical simulations of the circuit with the noise model of Eq. (11) (see main text).Error bars due to shot noise and finite sampling of random twirl configurations are computed in Appendix E. They are smaller than the plot marker size.The red and orange curves correspond respectively to the noiseless and noisy trotterized dynamics and are plotted here for comparison.

FIG. 8 .
FIG.8.Evolution of ⟨X0⟩ (top panels), ⟨Z2⟩ (middle panels) and ⟨X0Z2⟩ (bottom panels) after quantum error mitigation protocols (NEC + cRC/RC + REC).Green (resp.blue) dotted curves on the left (resp.right) panels are the results of the real IBMQ device using crosstalk (resp.standard) RC protocols.Green (resp.blue) solid curves are the fits obtained in Fig.6-a,c,e(magenta curves) (resp.Fig.6-b,d,f(cyan curves)) mitigated using classical simulation of NEC.Error bars correspond to the uncertainty due to the limited number of shots and the finite sampling of twirl configurations.
FIG.12.Evolution of the 7 different observables derived from the measurement of of ibm lagos' qubits q0, q1 and q2 respectively in the direction x, y, z after application of mitigation protocols (NEC + RC + REC).Blue (resp.green) curves correspond to standard (resp.crosstalk) RC.Dashed curves are the result obtained on a QC and the solid line are the corresponding fit made with the noise model of Eq.(11).Exact evolution (black lines), noiseless (resp.noisy) Trotter dynamics (red lines) (resp.dashed orange curves) are also plotted for each quantity.Error bars are due to the limited number of shots and the finite sampling over twirl configurations.
FIG.13.Evolution of the 7 different observables derived from the measurement of ibm ehningen's qubits q18, q21 and q23 respectively in the direction x, y, z after application of mitigation protocols (NEC + RC + REC).Blue (resp.green) curves correspond to standard (resp.crosstalk) RC.Dashed curves are the result obtained on a QC.The green solid line is the fit of crosstalk RC made with the noise model of Eq. (11) (fit for the standard RC did not converge).Exact evolution (black lines), noiseless (resp.noisy) Trotter dynamics (red lines) (resp.dashed orange curves) are also plotted for each quantity.Error bars are due to the limited number of shots and the finite sampling over twirl configurations.
FIG.14.Evolution of the 7 different observables derived from the measurement of ibm lagos' qubits q0, q1 and q2 in the direction z after application of mitigation protocols (NEC + RC + REC).Blue (resp.green) curves correspond to standard (resp.crosstalk) RC.Dashed curves are the result obtained on a QC and the solid line are the corresponding fit made with the noise model of Eq.(11).Exact evolution (black lines), noiseless (resp.noisy) Trotter dynamics (red lines) (resp.dashed orange curves) are also plotted for each quantity.Error bars correspond to the uncertainty due to shot noise and the finite sampling over twirl configurations.

p 1 ) 1 ) 1 .
Tr(M.p 2 ) * p 1 .ρ.p 2 (D3) The twirling of the noise channel over T ⊗n p = P consists of applying Pauli matrices (w in Eq. (D4)) before and after the noise channel and average the result over the twirling set: Tr(M.p 2 ) * × p 1 .ρ.p 2 ζ(w, p 1 )ζ(w, p 2 ) (D5) where ζ(w, p) is the sign of the commutation between w and p: +1 if both commute, −1 if they anticommute.In particular, one has the relation: ζ(w, p 1 .p 2 ) = ζ(w, p 1 )ζ(w, p 2 ) (D6) One inserts Eq. (D6) in Eq. (D5), and splits the sums over p 1 and p 2 in two parts: the first one when p 1 = p 2 and the second one when p 1 ̸ = p 2 .The first sum evaluates ζ(w, p 2 1 ) = ζ(w, I n ) = +1.For the second sum when p 1 ̸ = p 2 , we decompose the Pauli matrices over the different qubits w = n−1 a=0 w a and p = p 1 p 2 = n−1 a=0 p a .The sum over w rewrites: w∈P ζ(w, p 1 .p 2 ) = n−1 a=0 w a ∈Tp ζ(w a , p a ) (D7) When p 1 ̸ = p 2 , it exists at least one qubit a such that p a ̸ = I 2 .For that qubit, terms w a = I 2 and w a = p a evaluate to ζ(w a , p a ) = +1 while the two other terms give −As a consequence, Eq. (D7) vanishes.Only the diagonal term (p 1 = p 2 ) is non zero in Eq. (D5) and gives: p)| 2 p.ρ.p (D8) Any noise channel is mapped, after twirling, to a Pauli noise channel with coefficients 1 4 n M |Tr(M p)| 2 for each Pauli matrix p where the M 's are the Kraus operators of the original noise channel.
and R = a̸ =1,2 r a .Using Eqs.(D13) and (D14), the coefficients in Eq. (D17) only differs depending when p b = I 2 or p b ̸ = I 2 where b is the index of a neighboring qubit.Therefore, the number of free parameter decreases thanks to this additional twirling.It goes from 4 n − 1 for a Pauli noise channel on n qubits to 16 × 2 n−2 − 1 = 2 n+2 − 1.
⟨E⟩ glob.noisy = (1 − λ glob. ) nCNOT (F5)where ⟨E⟩ glob.noisy is the outcome of the NEC (the noiseless outcome being 1).In this simple case, the NEC exactly reproduces the noise of the Trotter circuit.Dividing the Trotter circuits by the NEC yields the noiseless result:

( 1 
(G12), (G13).One can then generalize the above formula for a circuit of 2L layers and measured qubit m:Tr(ρ 0 σ d m ) = L j=1 − ϵ) r j m−1m (1 − ϵ) r j mm+1 Tr(ρ ini .σd m )ρ ini is the initial density matrix before the first layer of unitary gates and FIG. 7. Example of an estimation circuit for the first Trotter step.
represents the different term of this expansion.After application of the noise of the unitary U 2 12 and U234 we obtain:Tr(ρ 0 .σd 2 ) ≃ (1 − ε) r 3