Mitigating Realistic Noise in Practical Noisy Intermediate-Scale Quantum Devices

Quantum error mitigation (QEM) is vital for noisy intermediate-scale quantum (NISQ) devices. While most conventional QEM schemes assume discrete gate-based circuits with noise appearing either before or after each gate, the assumptions are inappropriate for describing realistic noise that may have strong gate dependence and complicated nonlocal eﬀects, and general computing models such as analog quantum simulators. To address these challenges, we ﬁrst extend the scenario, where each computation process, being either digital or analog, is described by a continuous time evolution. For noise from imperfections of the engineered Hamiltonian or additional noise operators, we show it can be eﬀectively suppressed by a stochastic QEM method. Since our method assumes only accurate single qubit controls, it is applicable to all digital quantum computers and various analog simulators. Meanwhile, errors in the mitigation procedure can be suppressed by leveraging the Richardson extrapolation method. As we numerically test our method with various Hamiltonians under energy relaxation and dephasing noise and digital quantum circuits with additional two-qubit crosstalk, we show an improvement of simulation accuracy by 2 orders. We assess the resource cost of our scheme and conclude the feasibility of accurate quantum computing with NISQ devices.


I. INTRODUCTION
With the experimental demonstration of quantum supremacy [1], whether current or near-future noisy intermediate-scale quantum (NISQ) devices are sufficient for realizing quantum advantages in practical problems becomes one of the most exciting challenges in quantum computing [2]. Since NISQ devices have insufficient qubits to implement fault tolerance, effective quantumerror-mitigation (QEM) schemes are crucial for suppressing errors to guarantee the calculation accuracy to surpass the classical limit. Among different QEM schemes via different postprocessing mechanisms  QEM method is one of the most effective techniques [4,5], which fully inverts noise effect by requiring a full tomography of the noise process and assuming noise independently appears either before or after each gate in a digital gatebased quantum computer. While these assumptions are adopted for many QEM schemes, realistic noise is more complicated. Specifically, since every gate is experimentally realized via the time evolution of quantum controls [1,[30][31][32][33][34][35][36][37][38], noise happens along with the evolution, whose effect inevitably mixes with the gate or process and even scrambles nonlocally (see appendix). For example, as one of the major noises in superconducting qubits, crosstalk of multiqubit gates originates from the imperfect time evolution with unwanted interactions [31][32][33][38][39][40]. Therefore, such inherent dynamics-based and nonlocal noise effects make conventional QEM schemes less effective for practical NISQ devices. Meanwhile, a more natural and noise-robust computation model is via analog quantum simulators [41][42][43][44][45][46][47][48][49][50][51][52][53][54][55][56][57][58], which directly emulate the target system without even implementing gates. It also remains an open challenge to suppress errors for reliable medium-or large-scale analog quantum simulators [59][60][61][62].
In this work, we present QEM schemes without assumptions of gate-based circuits or simplified local noise models of each gate. Specifically, we introduce stochastic error mitigation for a continuous evolution with noise described by imperfections of the engineered Hamiltonian or superoperators induced from the interaction with the environment [48,60,62,63]. Compared to existing methods, such as dynamical decoupling, which are generally limited to low-frequency noise and small simulations [64][65][66][67], our work introduces a universal way to mitigate realistic noise under experiment-friendly assumptions. Our work considers continuous evolution of the system and assumes accurate single-qubit operations, which is applicable to all digital quantum simulators and various analog simulators. Our method is compatible with existing QEMs, and its combination with Richardson extrapolation can be further leveraged to suppress errors in inaccurate model estimations and recovery operations. We numerically test our scheme for various Hamiltonians with energy relaxation and dephasing noise and a quantum circuit with two-qubit crosstalk noise. We conduct a resource estimation for nearterm devices involving up to 100 qubits and show the feasibility of our QEM scheme in the NISQ regime.

II. BACKGROUND AND FRAMEWORK
We first introduce the background of analog quantum simulation (AQS) and digital quantum simulation (DQS) with noisy operations. In a digital gate-based quantum computer, the effect of noise is usually simplified as a quantum channel appearing either before or after each gate, whereas realistic noise occurring in the experimental apparatus is more complicated. Specifically, every gate in digital circuits or every process in analog simulation is physically realized via a continuous real-time evolution of a Hamiltonian and therefore errors can either inherently mix with the evolution-making it strongly gate or process dependent, or act on a multiple number of qubits-leading to highly nonlocal correlated effects (crosstalks). For instance, dominant errors in superconducting qubits are inherent system dephasing or relaxation, and coherent errors (or crosstalk) when applying entangling gates. While AQSs are believed to be less prone to noise, this holds true mostly in comparison to DQS, and when considering an intermediate simulation scale [59], outcomes of AQS could be sensitive to noise (for example, see theoretical studies on the sensitivity to errors [61,62] and noisy simulation result [52] of AQS).
Since conventional quantum-error-mitigation methods are restricted to gate-based digital quantum computers and oversimplified noise models, they fail to work for realistic errors and general continuous quantum processes. For instance, owing to the restricted set of allowed operations in analog quantum simulators, it is challenging to suppress or correct errors of a continuous process in this context [61] and it remains an open challenge to suppress errors for reliable medium-or large-scale quantum simulators [59]. Our work addresses this problem by considering a more general scenario of a continuous process with realistic noise models occurring in quantum simulators.
In particular, we introduce the model that describes either gate syntheses or continuous processes in digital or analog simulation. We consider the ideal evolution of state ρ I (t) with a target Hamiltonian H sys as In practice, we map H sys to a noisy controllable quantum hardware H sim , whose time evolution is described by the Lindblad master equation of the noisy state ρ N (t) as Here, H sim = H sys + δH corresponds to coherent errors (such as crosstalk or imperfections of Hamiltonian) and is the noise superoperator with error strength λ that describes inherent coupling with the environment (such as dephasing and damping) [48,60]. Instead of assuming a local singlequbit noise channel of each gate in conventional QEM, we consider a local noise model by assuming local Lindblad terms. We note that local noise operators at instant time t can easily propagate to become global noise after integrating time, which may cause nonlocal noise effects in reality.
Suppose we are interested in measuring the state at time T with an observable O. The task of QEM is to recover the noiseless measurement outcome O I = tr[Oρ I (t)] via noisy process. In general, it would be difficult to efficiently mitigate arbitrary noise with any noise strength. Here, we assume that the noise operators act weakly, locally and time independently on small subsystems. We note that even though the coherent error δH and the Lindblad operators L k act locally on the quantum system, the effect of errors propagates to the entire system after the evolution. Therefore, such global effects of noise cannot be effectively mitigated using the conventional quasiprobability method, which assumes a simple gate-independent error model described by single-or two-qubit error channels before or after each gate. We also assume that accurate individual single-qubit controls are allowed, which holds for digital NISQ devices where single-qubit operations can achieve averaged fidelity of 99.9999% [68], whereas the record for two-qubit fidelity is 3 orders lower [69,70]. While not all analog quantum simulators support individual single-qubit controls, they can indeed be achieved in various platforms with superconducting qubits [31,[71][72][73][74], ion trap systems [52,75,76], and Rydberg atoms [77]. Therefore, our framework is compatible with various practical NISQ devices.
In the following, we focus on qubit systems and assume time-independent noise. We note that the discussion can be naturally generalized to multilevel systems, as well as general time-dependent noise (see Appendices B 2 and E).

III. CONTINUOUS QEM
Quantum gate in digital circuits or joint evolution process in analog simulation is physically realized via a continuous real-time evolution of a Hamiltonian. In this section, we extend the QEM method to a more practical scenario and show how to mitigate errors for these inherent dynamics-based and nonlocal noise in practical noisy quantum devices. We first introduce "continuous" QEM as a preliminary scheme as shown in Fig. 1(a). Consider a small time step δt, the discretized evolution of Eqs. (1) and (2) can be represented as Here α = I , N and E α (t) denotes the ideal (α = I ) or noisy (α = N ) channel that evolves the state from t to t + δt within small δt. We can find a recovery operation E Q that approximately maps the noisy evolution back to the noiseless one as The operation E Q is in general not completely positive, hence cannot be physically realized by a quantum channel. Nevertheless, similar to probabilistic QEM for discrete gates [4,5], we can efficiently decompose E Q as a linear sum of a polynomial number of physical operators {B j } that are tensor products of qubit operators, with coefficients c = 1 + O(δt), α j = ±1, and a normalized probability distribution p j . We refer to Sec. VI and Appendix B 4 for details of the decomposition and its optimization via linear programming. Under this decomposition, the whole ideal evolution from 0 to T can be mathematically decomposed as where n = T/δt, C = c n , α j = n−1 k=0 α j k , p j = n−1 k=0 p j k , and j = (j 1 , . . . j n−1 ). Denote the ideally evolved state as ρ I (T) = n−1 k=0 E I (kδt)ρ(0) and the noisily evolved and corrected state as ρ Q, j (T) = n−1 k=0 B j k E N (kδt)ρ(0), we can approximate the ideal state ρ I (T) as a linear sum of noisy states as In practice, we can randomly prepare ρ Q, j (T) with probability p j , measure the observable O, and multiply the outcome with the coefficient Cα j . Then the average measurement outcome O Q, j of the noisy and corrected states ρ Q, j approximates the noiseless measurement outcome.
To measure the average outcome to an additive error ε with failure probability δ, we need N s ∝ C 2 log(δ −1 )/ε 2 samples according to the Hoeffding inequality. Since the number of samples needed given access to ρ I (T) is N 0 ∝ log(δ −1 )/ε 2 , the error-mitigation scheme introduces a sampling overhead C 2 , which can be regarded as a resource cost for the stochastic QEM scheme. The overhead scales as C 2 (T) = exp[O(λT)] given noisy strength λ and evolution time T. Here we choose a normalization λ so that the 034026-3 contribution from L exp is bounded by a constant. Therefore, the condition that the scheme works efficiently with a constant resource cost is λT = O(1). By regarding λ as the error rate, the condition can be intuitively interpreted as that the total noise rate is a constant, aligning with the result for conventional QEM.
We remark that this error-mitigation scheme works for general errors and we can mitigate correlated stochastic noise and unwanted interactions between (a small number of) multiple qubits. In Appendix B 2, we provide more details of the continuous error mitigation, including the decomposition of recovery operations and the resource cost for this method. In addition, this scheme can be naturally applied to multilevel systems when we can prepare the basis operations {B j } for them.

IV. STOCHASTIC QEM
In practice, it could be challenging to "continuously" interchange the noisy evolution and the recovery operation within a sufficiently small time step δt. Since E I (t) ≈ E N (t) and the recovery operation at each time is almost an identity operation with B 0 being the identity channel I. The probability to generate the identity operation I and B i (i ≥ 1) is In addition, the parity α 0 for B 0 = I is always unity, and the parity α i corresponding to B j (i ≥ 1) is equal to sign(q j ). We can further apply the Monte Carlo method to stochastically realize the continuous recovery operations as shown in Fig. 1(b). Specifically, we initialize α = 1 and randomly generate q ∈ [0, 1] at time t = 0. Then evolve the state according to the noisy evolution E N until time t jp by solving p(t jp ) = q with p(t) = exp [− (t)] and (t) = t j ≥1p j . At time t jp , we generate another uniformly distributed random number q ∈ [0, 1], apply the recovery operation B j if q ∈ [s j −1 , s j ], and update the coefficient is the number of basis operations, and the sum omits the identity channel. Then, we randomly initialize q, and repeat this procedure until time reaches T. On average, we prove in Sec. V that the stochastic QEM scheme is equivalent to the "continuous" one. While differently, the stochastic QEM does not assume time discretization and it requires only to randomly apply a few recovery operations, scaling linearly to the total noise strength as O(λT) (see Appendix B 2). We can insert the recovery operations by "pausing" the original noisy evolution. Alternatively, since we can determine the time t jp and the recovery operations before the experiment, they can be preengineered into the original evolution. Therefore, we can effectively implement stochastic QEM via the noisy time evolution of Eq. (2) with an adjusted Hamiltonian. We discuss its implementation for both analog quantum simulation and digital gate-based quantum simulation in Sec. VII.
The stochastic error-mitigation scheme is summarized as follows.

V. EQUIVALENCE BETWEEN CONTINUOUS ERROR MITIGATION AND STOCHASTIC ERROR MITIGATION
We now prove the equivalence of stochastic error mitigation to continuous error mitigation. Provided the recovery operation in Eq. (9), we can interpret it as with probability 1 − ip i δt we do nothing, and with probabilityp i δt we apply a corresponding correction operation. We also multiply c × α i to the output measurement. We can regard the event that applies the correction operations as a jump similar to the stochastic Schrödinger equation approach.
Starting at time t = 0, the probability that there is no jump until time t is where (t) = i≥1p i (t). The probability to have jump in the time interval [t, t + dt] is Now suppose we generate a uniformly distributed random variable q ∈ [0, 1] and solve to determine the jump time t jp . Then the probability that jump happens at time t jp or in particular between [t jp , which agrees with Eq. (11). We can thus use the uniformly distributed random variable q to determine the jump time to equivalently simulate the original continuous error mitigation process. Now, at the jump time t jp , we apply the basis operation other than the identity operation. We can determine the basis operation by generating another uni- and N op is the number of the basis operations.
We can predetermine the jump time t 1 jp , t 2 jp , . . . , t k jp from Eq. (12). For time-independent noise, the jump time can be simply determined as t jp = − log(q)/ i≥1p i with q randomly generated from [0, 1]. Given evolution time T, the average number of recovery operations is proportional to O(λT). In the numerics, the average number of recovery operations is about 0.3 times per evolution on average given a realistic noise model and simulation task.

VI. DECOMPOSITION OF THE RECOVERY OPERATION AND OPTIMIZATION
We now discuss the decomposition of the recovery operation into local basis operations. We denote the complete basis operations as {B i }. For multiple qubit systems, tensor products of single-qubit operations, e.g., B i ⊗ B j , also form a complete basis set for composite systems. Therefore, if we can implement the complete basis operations for a single qubit, we can also emulate arbitrary operations for multiple qubits systems. In Ref. [5], it is shown that every single-qubit operation can be emulated by using 16 basis operations. This is because every single-qubit operation (including projective measurements) can be expressed with square matrices with 4 × 4 = 16 elements by using the Pauli transfer representation [78]. Therefore, 16 linearly independent operations are sufficient to emulate arbitrary single-qubit operations. In Table I, we show one efficient set of basis operations for a single qubit.
We show in Appendix B 2 that the recovery operations without Hamiltonian error can be analytically expressed as where L represents the noise superoperator and λ is the noise strength. From Eq. (14), we can analytically decompose the general noise into local basis operations. In Appendix B 4, we provide the recovery operations for several typical Markovian processes, including depolarizing, dephasing and amplitude damping, during the quantum simulation. It is worth mentioning that by using only observables within spatial domain, we can recover the Lindbladian acting on this domain and reconstruct the local Markovian dynamics [79,80].
Overcomplete basis can be used to further reduce the resource cost for the stochastic error-mitigation scheme. In general, the target quasiprobability operation E Q can be decomposed as a linear combination of unitary channels and projective measurements by using Pauli transfer matrix representation. The quasiprobability operation E Q can be decomposed into a complete basis {B i } as where we set B 0 = I. Given the target quasiprobability operation E Q , the overall resource cost for the quasiprobability scheme is given by In order to minimize the resource cost, we aim to reduce C 1 . Consider an overcomplete basis {B i }, which includes the complete basis {B i } and also other randomly generated unitary operators and projective measurements. Then the quasiprobability operation E Q is decomposed into this Minimizing C 1 = q 0 + i≥1 |q i | can be further rewritten as a linear programming as follows: The overall resource cost C(T) for the stochastic errormitigation scheme can therefore be reduced by this linear programming optimization method.

VII. IMPLEMENTATION OF THE SCHEME WITH ANALOG AND DIGITAL QUANTUM SIMULATORS
In this section, we discuss the implementation of our scheme with analog and digital quantum simulators. To implement stochastic error mitigation with an analog quantum simulator, we insert the single-qubit recovery operations at each jump time, which is predetermined by Algorithm 1. As the evolution of most quantum simulators is based on external pulses, such as trapped ions and superconducting qubits, it would be practically feasible to interrupt the continuous evolution by simply turning off the external pulses and then turning on the single-qubit recovery pulses. The joint evolution and the single-qubit dynamics can be preengineered as a modified evolution of AQS, as shown in Fig. 2. In practice, when turning on and off the joint evolution cannot be realized in a short time, we can alternatively apply single-qubit recovery pulses with a short duration and a sufficiently strong intensity compared to the parameters of the AQS Hamiltonian, as shown in Fig. 2(b)II. This is similar to the banged analog-digital quantum-computing protocol introduced in Refs. [81,82], which implements single-qubit gates without turning off the background Hamiltonian. In this case, when single-qubit rotations are performed in a time δt that is much smaller than the timescale of the joint evolution, the additional error per single-qubit rotation introduced by the background evolution of the Hamiltonian is on the order of O(δt 3 ). Therefore, errors induced from the error-mitigation procedure could be very small, and they could be further mitigated via the hybrid approach in Sec. VIII.
On the other hand, the stochastic error-mitigation scheme could be naturally implemented on a digital gatebased quantum computer. We note that digital gates are  1)] of preengineered single-qubit dynamics to mitigate the errors accumulated in the evolution, which can also be regarded as a modified evolution H . (b) Two schemes of the errormitigated process or gate with modified pulse sequences. Dashed lines represent the original pulse that constructs the target process or multiqubit gates. Provided controllable drive that could be freely turned on and off, we can synthesize the error-mitigated process and gate by modifying the original pulse sequence as shown in scheme I, which corresponds to (a). In the case of restricted driving operations, we can alternatively apply a strong and fast single-qubit pulse to the original pulse to mitigate either process errors or gate errors as shown in scheme II. We note that scheme II could similarly be applied in (a), which introduces a negligible error of O(δt 3 ) when each single-qubit gate is implemented in δt T.
experimentally realized via continuous pulse sequences [1,[30][31][32][33][34][35][36][37][38], thus we can construct the error-mitigated gates by modifying the original pulse sequence with the predetermined pulses (recovery operations). A similar process has been experimentally demonstrated in Ref. [36], where the effect of the new pulse sequence is to effectively mitigate the unwanted terms in the driving Hamiltonian. Our QEM method can be used to eliminate the general coherent and incoherent errors of the gates to achieve high gate fidelity. Therefore, provided control of pulse sequence, we can engineer the pulse sequence as shown in Fig. 2(b), and prepare the error-mitigated gates to perform quantumcomputing or quantum-simulation tasks. We also note that if the driving operations are restricted, we can similarly apply the fast single-qubit pulse to the original pulse to mitigate the errors, where the additional error induced in this process is on the order of O(δt 3 ), as shown in the bottom of Fig. 2 As shown in Fig. 2, the exact quantum hardware that we need to implement our error-mitigation scheme actually lies in between fully analog simulation (we can control only all qubits with a predetermined Hamiltonian) and a fully digital quantum computer (we can apply any operation on a small number of qubits). The scenario can be regarded as a modified analog-digital quantum simulator, where we need only to apply strong local gates along the background dynamics (no matter whether it is digital or analog). Compared to a fully AQS, we need only to insert single-qubit operations and joint evolution and the single-qubit dynamics can be preengineered by using Algorithm 1. The readers could regard it as analogdigital quantum simulation or time-dependent Hamiltonian dynamics (although we require only controllable singlequbit operations instead of arbitrary Hamiltonian simulation). Compared to a fully DQS, our scheme does not need to apply any two-qubit gate and hence it significantly avoids crosstalks. It is worth noting that given evolution time T in AQS or pulse sequence of the target gate in DQS, the average number of recovery operations is linear to λT. In practice, the average number of recovery operations could be a small number (for instance, less than 1 in our numerical simulation), and therefore easy to implement in reality. To summarize, our scheme can be implemented on all digital and most analog quantum simulators, and analog-digital quantum simulators we describe above, as long as accurate and fast single-qubit operations are allowed.

VIII. REDUCTION OF MODEL ESTIMATION ERROR
For systems with finite-range interactions, a local Markovian dynamical process can be reconstructed by using only local measurements and we refer to Refs. [79,80] for details. Given a prior knowledge of the noise model, the above stochastic QEM schemes can eliminate the physical errors by applying basis operations at jump time. Nonetheless, the realistic noise L exp and the estimated noise L est may differ due to imprecise estimation of the noise model. Here we combine the extrapolation QEM method [3,4] to mitigate model estimation error and the errors associated with imperfect recovery operations.
We first show how to boost model estimation error, which will be used for its mitigation as a preliminary. Assume that the evolution of the quantum system is described by the open-system master equation We show in Appendix C1 that by evolving the state ρ λ under the rescaled Hamiltonian (1/r)H t/r for time rt, we can effectively boost physical errors of quantum systems, which can be expressed as ρ λ (rt) = ρ rλ (t). Here, we assume that the noise superoperator L is invariant under rescaling, and the initial conditions holds ρ λ (0) = ρ rλ (0). The effective evolution after applying the errormitigation method with L est is which can be implemented by rerunning the errormitigated experiment for a rescaled time rt under the rescaled Hamiltonian.
As the value of r ≥ 1 can be tuned, we choose several different values of r and suppress the model estimation error via Richardson extrapolation. Specifically, with more than two values of r denoted as {r j } and constants β j = l =j r l (r l − r j ) −1 , we have where O rλ is the measurement outcome after stochastic error mitigation, corresponding to ρ (Q) rλ (T), γ n = j |β j |, r max = max j r j , and L 1 = max ρ tr|L(ρ)|. Therefore, in addition to λT = O(1), the scheme is efficient provided We refer to Appendix C 2 for the derivation of Eq. (21). From Eq. (21), the deviation between the ideal measurement outcome and the error-mitigated one is bounded independently with the Hamiltonian, i.e., the to-be-simulated problem. The bound relies only on the noise model, the evolution time, the number of samples, and the parameters used in extrapolation. Moreover, since imperfections of the basis operations B i lead to deviation of L est , which can be regarded as another type of model estimation error, they can be corrected via the extrapolation procedure. We note that noise could have fluctuations or drift in the experimental apparatus, which in practice could be challenging to obtain the precise noise model. Our hybrid QEM incorporating extrapolation is therefore practically useful as this method alleviates the requirement of precise estimation of the noise model and can be robust to the error of recovery operations. We refer to Appendix C for detailed analysis of error mitigation for the model estimation.

IX. NUMERICAL SIMULATION
Now, we test our QEM schemes for analog quantum simulators and gate-based digital quantum circuits. We first consider a 2D anisotropic Heisenberg y on a 2 × 2 square lattice, where ij represents nearest-neighbor pairs. This model has been extensively used to investigate the quantum magnetism and criticality [83][84][85][86][87]. We consider analog simulation via a noisy superconducting quantum simulator with energy relaxation L 1 and dephasing L 2 noise [48,[88][89][90]. Here the Lindblad operator takes the form of for β = 1, 2, L (j ) . Such a noise model is also relevant for other quantum simulators such as trapped ions [42,51,58,76], NMR [46,47,50], ultracold atoms [53,57], optical lattices apparatus [55], etc. The noise can be characterized by measuring energy relaxation time T 1 and dephasing time T 2 without full process tomography [63,[89][90][91] and more generally via local measurements [79,80]. We also consider physical errors for the single-qubit recovery operations as singlequbit inhomogeneous Pauli error, , Y, Z being the Pauli channel and p α being the error probability. In our simulation, we set J = h = 2π × 4 MHz, γ = 0.25, and the noise strength λ 1 = λ 2 = 0.04 MHz [32,63,90]. For model estimation error, we set p x = p y = 0.25% and p z = 0.5%, which can be achieved with current superconducting simulators [31,92], and consider the real noise strength to be 10% greater than the estimated one, i.e., λ exp = 1.1λ est . We set the initial state to |+ ⊗4 with |+ = (|0 + |1 )/ √ 2, evolve it to time T = 16π/J , and measure the expectation value of the normalized nearest-neighbor correlation x /4 with 10 6 samples. The numerical result without model estimation error is shown in Figs. 3(a)-3(c). Specifically, we compare the time evolution of the expectation value of the correlation operator in Figs. 3(a) and 3(b) and the fidelity F(ρ I , ρ eff ) = tr ρ 1/2 eff ρ I ρ 1/2 eff of the effective density matrix ρ eff and the ideal one ρ I in Fig. 3(c). We can see that Richardson extrapolation and stochastic QEM improve the accuracy by 1 and 2 orders, respectively. The result with model estimation error is shown in Figs. 3(d)-3(f). Here, we also consider the hybrid method with both stochastic QEM and linear extrapolation, with optimized r 0 = 1 and r 1 = 1.8. We can see that stochastic QEM still outperforms Richardson extrapolation with large evolution time and the hybrid method can be further used to improve the simulation accuracy. The simulation result thus indicates that the hybrid QEM scheme can be robust to the drift of noise [93][94][95]. The performance of the QEM schemes can be made clearer without considering sampling errors. As shown in Fig. 3(g  Here, the controlled-NOT (CNOT) gate in the quantum circuits is generated by cross-resonance (CR) gates, which are experimentally realized by using microwaves to drive the control qubit (c) at the frequency of the target qubit (t), resulting in a driving Hamiltonian x ] [30,32,[36][37][38]. Here, is the effective qubitqubit coupling and γ represents the effect of crosstalk between qubits [36]. We consider inherent environmental noise and recovery operation error as in the above analog simulator, and additional coherent crosstalk errors γ = 1%. We set = 2π MHz, the evolution time T = π/4 , and run 10 5 samples. We mitigate the noisy two-qubit pulse sequence by inserting basis operations, and shows the fidelity dependence of circuit depth d with and without QEM in Fig. 4(c). The result clearly shows that stochastic QEM improves the computing accuracy by 2 orders. In Appendix D, we show numerical simulations for both Ising and frustrated quantum spin Hamiltonian, and we further demonstrate how the QEM methods can be applied to temporally correlated noise.

X. RESOURCE COST FOR NISQ DEVICES
In this section, we estimate the resource cost for stochastic error mitigation with NISQ devices. Given a precise noise model, the stochastic error-mitigation method in principle enables exact recovery of the ideal evolution. However, to achieve the same accuracy of the measurement on the ideal evolution, we need C 2 times more samples or experiment runs with the error-mitigated noisy evolution. The overhead C 2 is likely to be prohibitively large with a significant amount of noise on a NISQ device. Nevertheless, we show that the overhead can be reasonably small (less than 100) when the total error (defined below) is less than 1. In particular, we consider a noisy superconducting simulator with up to N = 100 qubits, which suffers from single-qubit relaxation and dephasing noise with equal noise strength λ 1 = λ 2 . While the noise strength is defined as the noise rate at instant time, we define the total noise strength as the noise strength of the whole N -qubit system within time T. The dependence of the overhead C 2 on the total noise strength and number of qubits is shown in Fig. 5(a). For a practical case with T = 1 μs, λ 1 + λ 2 = 0.01 MHz, N = 100, the cost C 2 ( = 1) = 30 and we further show the number of measurements needed to achieve a given simulation accuracy in Fig. 5(b). Note that the overhead C 2 is independent of the Hamiltonian H sim , so the results apply for general NISQ devices (see Appendix B 2).

XI. DISCUSSION
To summarize, we propose stochastic and hybrid quantum-error-mitigation schemes to mitigate noise in a continuous process. While previous error-mitigation methods for DQS regard each gate as one entity and noise as an error channel before or after the gate, such a description becomes inadequate when the quantum gate is on multiqubits and the noise are inherently mixed in the realization of the quantum gate. By regarding the implementation of each quantum gate as a continuous process, our errormitigation methods can thus be applied to mitigate errors for realization of multiqubit gates (which generally have large errors). Since dominant noise in NISQ devices is from implementing multiqubit operations or inherent noise with finite coherence time, our scheme can effectively suppress them and thus extend the computation capability of analog quantum simulators and digital gate-based quantum computers in solving practical problems [31]. We numerically test it with analog simulators for several Hamiltonian simulations under incoherent errors including energy relaxing and dephasing noise and a parameterized quantum circuit under additional coherent crosstalk noise. We show its feasibility with general NISQ devices with up to 100 qubits. The proposed QEM schemes work for all digital and many analog quantum simulators with accurate single-qubit controls.
Furthermore, resolving the drift or temporal fluctuations of noise is challenging for conventional QEM methods. Our hybrid scheme incorporating extrapolation can mitigate model estimation error and the error of recovery operations, which alleviates the requirement of precise tomography of error model and precise control of the quantum simulators. Our method is tested to be robust to the drift of noise. Although the discussion focused on local time-independent noise, our scheme can be potentially generalized to general nonlocal time-dependent noise by employing the time-dependent recovery operation E Q (t), and we numerically test its viability for the time-dependent noise in Appendix E. We leave the detailed discussion to a future work.

Error model
Here we review the concept of quantum error mitigation for digital quantum computing. In a digital gate-based quantum computer, the effect of noise is simplified as a quantum channel appearing either before or after each gate. The output state is different from the ideal output, which can be described as where ρ noise out is a noisy output and ρ ideal out is a noise-free output from the quantum circuit, U k and N k are k th quantum operation and accompanying noise to it, and N g is the number of gates. Here, we assume the noise processes are Markovian for simplicity. Fault-tolerant error correction based on encoding of qubits can be used to compensate for the effect of noise and obtain correct computation results in principle. However, in near-term quantum computing, the number of qubits and gate operations is restricted due to imperfections of quantum devices including physical noise and limited interactions among qubits. Therefore, faulttolerant error correction necessitating encoding of qubits is not ideal for near-term quantum computing. Instead, QEM is introduced for mitigating errors in quantum circuits without using additional qubits. By using QEM, one cannot restore the quantum state itself, but can instead obtain an approximation of expected values of observables corresponding to the ideal density matrix, i.e., for any observable O. Here we use QEM(ρ) to denote the process of error mitigation, which may not satisfy the requirements of a quantum channel. Therefore, we generally need a classical postprocessing to realize QEM(ρ), which may introduce a sampling overhead (cost) when measuring observables. The cost in general increases exponentially with respect to the error strength as we shortly see below. Therefore, a constant error strength is generally required in order to make QEM work.

Quasiprobability method
Among different QEM schemes via different postprocessing mechanisms, the quasiprobability error-mitigation method is one of the most effective techniques. It recovers the ideal unitary processes by randomly generating noisy operations, with postprocessing of measurement results. Suppose the ideal quantum operation is denoted as U, then the key idea of the quasiprobability method is to express the ideal evolution U as a linear combination of noisy operations K i as where U and K i are superoperators, and i q i = 1, C = i |q i |, p i = |q i |/C. As q i can be negative, we refer to q i as the quasiprobability, and therefore the overhead coefficient C ≥ 1 in general. To obtain the error-free expectation value of an observable O, we randomly generate noisy operation K i with probability p i , multiply the measured result by the parity factor sgn(q i ), and obtain the expectation value O eff as follows: Finally, the error-free expectation value of O is approximated by C O eff . Note that the variance is amplified C 2 times greater, and thus C 2 can be interpreted as a resource cost to achieve the same accuracy as that without QEM.
As an example, we illustrate the case that the singlequbit operation is affected by depolarizing errors as DU. The removal of the error D can be formally done by applying its inverse channel D −1 . Now, the depolarizing channel can be expressed as with the inverse channel derived as Consequently, the ideal channel U can be expressed as where I, X , Y, and Z correspond to an identity operation, and superoperators for Pauli operators. Note that Eq. (A7) is written in the same form as Eq. (A3), and we can hence perform the quasiprobability method accordingly. For the error-mitigation method to be useful in digital quantum computing, this quasiprobability operation is applied after each noisy gate. The parity is updated depending on the generated operations, and the final outcome of the parity is applied to measurement results in the same way as a single quantum operation shown in Eq. (A3). Suppose there are N gates, the total overhead C N can be expressed as where C i is the overhead coefficient for the ith gate, and N is the number of gates in the quantum circuit. Suppose the error ε i for each gate is small, the cost C i is close to 1. A first-order expansion gives C i ≈ 1 + λ i ε i and thus the total overhead C N is approximated as For simplicity, we assume λ i = λ and ε i = ε are independent of i. Then we have Here we denote ε N = εN to be the total error rate of all the N gates. Then it is not hard to see that the total cost C N increases exponentially to the total error rate ε N . However, with a constant total error rate ε N , we still have a constant overhead. Thus a constant total error rate is generally the assumption for error mitigation for digital quantum computing.

APPENDIX B: STOCHASTIC ERROR MITIGATION
As discussed in the above section, the QEM method assumes the noise appears either before or after each gate in a digital gate-based quantum computer, but realistic noise occurring in the experimental apparatus is more complicated. Specifically, every gate in digital circuits or every process in analog simulation is physically realized via a continuous real-time evolution of a Hamiltonian and thus errors can either inherently mix with the evolution making it strongly gate or process dependent, or act on a multiple number of qubits leading to highly nonlocal correlated effects (crosstalks). Since conventional quantum error-mitigation methods are restricted to gate-based digital quantum computers and oversimplified noise models, they fail to work for realistic errors and general continuous quantum processes. In this section, we extend the QEM method to a more practical scenario and show how to mitigate errors for these inherent dynamics-based and nonlocal noise in practical noisy quantum devices.

Pauli transfer matrix representation
We first introduce the Pauli transfer matrix representation of states, observables, and channels as a preliminary.

034026-11
By using Pauli transfer representation, a state and an observable are mapped to a real column and row vectors, respectively, as follows: and where P k ∈ {I , σ x , σ y , σ z } ⊗n , n is the number of qubits, and d = 2 n . Furthermore, for a process, i.e., E(ρ) = k K k ρK † k , the Pauli transfer matrix representation can be expressed as By using the Pauli transfer representation, we have

Continuous error-mitigation scheme
We first illustrate the detailed procedure of continuous error mitigation. We can rewrite the evolution of noisy and ideal quantum states by using infinitesimal δt as where H (t) denotes the ideal Hamiltonian with L corresponding to the noisy evolution. In the presence of Markovian stochastic noise involved with environment, while the dynamics induced with the undesired Hamiltonian H C (t), which causes coherent errors, can be described as The latter case occurs due to the imperfection of the analog quantum simulators and implementation of quantum logic gates from physical Hamiltonians [32,62]. For systems with finite-range interactions, Bairey et al. and Silva et al. proposed methods that use only local measurements to reconstruct local Markovian dynamical process [79,80]. We show how to eliminate these errors by using a continuous error-mitigation method. By using the Pauli transfer matrix representation, Eq. (B5) is mapped to |ρ α (t + δt) = [I + E α (t)δt] |ρ α (t) where |ρ α (t) (α = N , I ) is the vectorized density matrix of ρ α (t) and E α (t) corresponds to the second term of Eq. (B5). Equivalently, the superoperartor representation of the evolution gives ρ α (t + δt) = E α (t)ρ α (t). In the following, we use these two equivalent representations interchangeably. Note that the evolution induced by E α in the main text becomes I + E α δt in the Pauli transfer representation. We introduce the recovery operation I + E Q δt to obtain the ideal dynamics, which can be expressed as such that E Q = E I − E N . Note that (I + E Q δt) corresponds to E Q in the main text. Due to the linearity of the representation, we can see that E Q corresponds to the Pauli transfer matrix representation of −λL ρ N (t) . In this framework, E I (t) ≈ E N (t) holds within a sufficiently small time step δt.
The experimental errors including the interactions with the open environment, undesired couplings and imperfections in the quantum simulators are generally local and we therefore assume E Q can be decomposed into local operators as Q operates on polynomial subsystems of the N -qubit quantum system. We now decompose the operation E Q into the set of basis operations as where q (S) j is the quasiprobability and B (S) j is the basis operation for compensating the errors. Note that B (S) j acts only on the same small subsystem as E (S) Q . By performing basis operations for E (S) Q with corresponding quasiprobability distributions in Eq. (B9), we can implement the overall quasiprobability operations corresponding to E Q as shown below. Therefore, we can extend the quasiprobability operations into a large-scale system. We remark that this quasiprobability approach works for any errors and we can mitigate correlated stochastic noise and unwanted interactions between (a small number of) multiple qubits. In addition, this argument can be naturally applied to multilevel systems when we can prepare basis operations for them. In particular, the quasiprobability operation at time t takes the form of where B 0 is set to be an identity operation and we also omit the superscript (S) for simplicity. The probability to generate the identity operation I and In addition, the parity α 0 for B 0 = I is always unity, and the parity α i corresponding to The overhead coefficient c corresponding to E (S) Q is given by . As we discuss above, this coefficient introduces a sampling overhead. The overhead coefficient from t = 0 to t = T within infinitely small discretization δt is Note that |q i | ∝ λ, therefore we have C 1 ∝ λ, and the overall overhead is Here we choose a proper normalization λ so that the contribution of L is bounded by a constant l: L exp 1 ≤ l. Here, we define the superoperator norm by 1 = sup A { (A) 1 / A 1 : A = 0} with A 1 = tr|A|. Therefore, given a finite number of samples in the experiments the condition that the scheme works efficiently with a constant resource cost is λT = O(1). By interpreting λT as the total noise strength, the requirement is thus consistent with the case of DQS.
It is also possible to consider time-dependent recovery operation for suppressing time-dependent noise. In this case, the quasiprobability becomes time dependent and can be obtained by Eq. (B9). Therefore, the overall overhead for time-dependent noise is

Comparison with conventional error mitigation
Errors, occurring in the continuous time evolution, can inherently mix and propagate with the evolution leading to highly nonlocal correlated effects. For instance, dominant errors in superconducting qubits are inherent system dephasing or relaxation, and coherent errors (or crosstalk) when applying entangling gates. Analog quantum simulators may not even implement discretized quantum gates. Therefore, conventional quantum error-mitigation methods fail to work for realistic errors and general continuous quantum processes. Our work addresses the problems by first considering a more general scenario of a continuous process with realistic noise models. More concretely, we consider the time-independent Lindblad master equation with dynamics of Hamiltonian (including coherent errors) and incoherent Markovian process which describes either gate synthesis in digital quantum computing or the continuous evolution of a analog quantum simulator. Here δH and L describe coherent errors (such as crosstalk or imperfections of Hamiltonian) and inherent coupling with the environment (such as dephasing and damping), respectively. We note that even though the coherent error δH and the Lindblad operators L k act locally on the quantum system, the effect of errors propagates to the entire system after the evolution. Therefore, such global effects of noise cannot be effectively mitigated using the conventional quasiprobability method, which assumes simple gate-independent error model described by singleor two-qubit error channels before or after each gate. Our work proposes two key techniques to overcome this problem.
1. First, we discretize the continuous time into small time steps so that we sequentially apply error mitigation for the noisy evolution at each time step. We emphasize that discretized evolution is yet not equivalent to discretized digital computing with local single-and two-qubit gates. This is because the continuous evolution even with a small time step could be a joint evolution (effectively a joint quantum gate) on all the qubits. Therefore, we directly mitigate errors of all the evolved qubits in each small time evolution, whereas conventional error-mitigation methods operate on each local gate. This also explains why we can mitigate crosstalk of multiple qubits, whereas conventional methods mitigate only the effective noise channel for each 034026-13 gate. In practice, one can choose a sufficiently small time step so that the error mitigation works in a "continuous" way.
2. However, continuous error mitigation with small discretized time requires to constantly pause the original evolution to apply recovery operations and the time discretization also introduces additional errors. To resolve these problems, we further introduce stochastic error mitigation, which equivalently simulates the continuous errormitigation procedure with infinitely small time step. The stochastic error-mitigation method thus simultaneously solves all the issues and provides our final solution for error mitigation of a continuous process. We note that stochastic error mitigation requires only application of a small number of single-qubit recovery operations at certain times. We can thus preengineer the recovery operations into the original evolution Hamiltonian without interrupting the simulation.
To summarize, the first contribution of our work is to solve a major open problem of mitigating realistic (inherently gate-or process-mixed and nonlocal) noise for both digital and analog quantum simulators, which has strong applications in achieving quantum advantage with near-term noisy quantum devices. The techniques we introduce for the stochastic error mitigation method are highly nontrivial and do represent significant advances in our understanding of mitigating multiqubit errors for processes beyond discretized gates and oversimplified noise models.

Complete basis operation set
In Ref. [5], it is shown that every single-qubit operation can be emulated by using 16 basis operations. This is because every single-qubit operation (including projective measurements) can be expressed with square matrices with 4 × 4 = 16 elements by using the Pauli transfer representation [78]. Therefore, 16 linearly independent operations are sufficient to emulate arbitrary single-qubit operations. In Table I, we show one efficient set of basis operations for a single qubit in Ref. [5].
For multiple-qubit systems, tensor products of singlequbit operations, e.g., B i ⊗ B j also forms a complete basis set for composite systems. Therefore, if we can implement the complete basis operations for a single qubit, we can also emulate arbitrary operations for multiple-qubit systems. Moreover, we can also apply the error mitigation to multilevel systems if we can prepare the corresponding basis operations.
By using only observables within spatial domain, we can recover the Lindbladian acting on this domain and reconstruct the local Markovian dynamics [79]. Here, we provide the recovery operations for several typical Markovian processes during the quantum simulation and coherent errors in implementing CNOT gates.
The recovery operations can be analytically expressed as E Q = I − λLδt, where L represents the noise superoperator and λ is the noise strength. For depolarizing, dephasing, and amplitude damping, the recovery operations E Q can be, respectively, decomposed as In the parameterized quantum circuits, the CNOT gates or more general entangling gates are prepared by crossresonance drive, with the drive Hamiltonian where is the effective qubit-qubit coupling, γ represents the effect of crosstalk between qubits and H corresponds to additional errors whose strengths can be revealed by Hamiltonian tomography. On IBM's quantum devices, for example, H includes μσ (c) z I (t) with μ corresponding to the drive-induced Stark shift. In the cross-resonance drive, one dominant error is from crosstalk, and the corresponding recovery operation is with λ = γ . The additional error, for example, the driveinduced Stark-shift can be mitigated by the recovery operation

APPENDIX C: HYBRID ERROR MITIGATION
In this section, we show how to apply the extrapolation method to mitigate model estimation error and the errors associated with imperfect recovery operations. Combined with stochastic error mitigation, we thus propose a hybrid error-mitigation method for errors in practical NISQ devices.

Boosting model estimation error
We first show how to boost model estimation error, which is used for its mitigation. Assume that the evolution of the quantum system is described by the open-system 034026-14 master equation The evolution of the system under a scaled Hamiltonian drive (1/r)H t/r takes the form of Assuming the noise superoperator L is invariant under rescaling, we have On the other hand, the density matrix ρ rλ (t) with enhanced noise strength rλ is given by Comparing Eqs. (C3) and (C4), one finds that ρ λ (rt) and ρ rλ (t) follow the same differential equation, and thus with the initial conditions ρ λ (0) = ρ rλ (0) we prove ρ λ (rt) = ρ rλ (t). This indicates by evolving the rescaled Hamiltonian for time rt, we can effectively boost physical errors of quantum systems. Now, we discuss how to boost the model estimation error. By applying stochastic error mitigation, we obtain is the error-mitigated effective density matrix after stochastic error mitigation. Assuming L = L exp − L est is invariant under rescaling of the Hamiltonian, we can similarly obtain This can be experimentally achieved by applying stochastic error mitigation for a rescaled time rt under the rescaled Hamiltonian.
It is worth noting that even if the noise model is time dependent, our method can still work as long as the evolution can be described by a Lindblad equation and its dependence on time is known.
For example, when we consider a time-dependent noisy process with stochastic error mitigation described by where L 0 is time independent. Then, the rescaled dynamical equation becomes In this case, we can interpret that the noise rate is boosted by a factor of r 2 . We later show how such a time-dependent noise process can be mitigated in Appendix E.

Richardson's extrapolation for physical errors and model estimation errors
In this section, we briefly review the extrapolation method proposed in Ref. [3,4]. We assume the opensystem evolution is described by In Ref. [4], it is shown that the expectation value of an observable O can be expressed as where O = max ψ ψ|O|ψ is the spectra norm of O.
Here, in the case that L is a Lindblad-type operator, one can have the bound for a n+1 as In order to employ the extrapolation method, we need to obtain the expectation value of observable O(r j λ)

034026-15
(j = 0, 1, . . . , n, r 0 = 1) at time t = T corresponding to the equation which can be obtained by using the rescaling of the Hamiltonian as described in Appendix C 1. Then we can obtain the approximation of the noise-free expectation value of the observable O as where O(0) * is the estimated noise-free expectation value up to an error of order O(λ n+1 ), and O rλ are the measurement outcome corresponding to the state ρ rλ (T).
Here the coefficients β j = l =j r l (r l − r j ) −1 are defined by the solution of the following equations: In Ref. [4], it has been shown that the difference between the estimator and the error-free expectation value is bounded by where γ n = n j =0 |β j |, r max = max j r j , and max / N sample is the largest experimental errors due to shot noises with N sample being the number of samples. From Eq. (C17), we can see that extrapolation methods require Here, we use the fact that the variance of the errormitigated expectation value of the observable is amplified with the overhead coefficient C. From Eq. (C21), the deviation between the ideal measurement outcome and the error-mitigated one is bounded independently with the Hamiltonian, i.e., the tobe-simulated problem. The bound relies only on the noise model, the evolution time, the number of samples, and the parameters used in extrapolation.

APPENDIX D: NUMERICAL SIMULATION
As we show in the Appendix C 2, the variation of the performance of our error-mitigation methods in terms of different Hamiltonians and noise models is theoretically well bounded, which indicates that the theory does apply for general Hamiltonian simulation with NISQ devices. In this section, we report additional numerical simulation for the transverse-field Ising model and frustrated spin-half model as the J 1 − J 2 model to verify the viability of our theory.
We first consider a four-qubit one-dimensional transverse-field Ising model We consider the quantum critical point at J = h = 2π × 4 MHz, where correlations exhibit power-law decay instead of exponential decay. The noise strength λ 1 = λ 2 = 0.04 MHz and errors of single-qubit operation p x = p y = 0.25% and p z = 0.5%, which are the same as in the main text for comparison. We set the initial state to (|+ ) ⊗4 with |+ = (|0 + |1 )/ √ 2, evolve it to time T = 16π/J , and measure the expectation value of the normalized nextnearest-neighbor correlation function ij σ (i) x σ (j ) x . The total number of samples of the measurement is fixed to be 10 6 . To demonstrate the performance of stochastic and hybrid error mitigation much clearer, we consider eight- where H 0 = n ij J ijσ z . We refer a detailed derivation of time-dependent noise model to Appendix E1.
This result indicates that the averaged trajectory is equivalent to the time-dependent noisy evolution. By applying the hybrid error-mitigation method, a combination of stochastic error mitigation and linear extrapolation, we show in Fig. 8 that the time-dependent noise can be mitigated without detailed knowledge of the noise strength and noise type.