Error Mitigation via Verified Phase Estimation

The accumulation of noise in quantum computers is the dominant issue stymieing the push of quantum algorithms beyond their classical counterparts. We do not expect to be able to afford the overhead required for quantum error correction in the next decade, so in the meantime we must rely on low-cost, unscalable error mitigation techniques to bring quantum computing to its full potential. In this paper we present a new error mitigation technique based on quantum phase estimation that can also reduce errors in expectation value estimation (e.g., for variational algorithms). The general idea is to apply phase estimation while effectively postselecting for the system register to be in the starting state, which allows us to catch and discard errors that knock us away from there. We refer to this technique as "verified phase estimation" (VPE) and show that it can be adapted to function without the use of control qubits in order to simplify the control circuitry for near-term implementations. Using VPE, we demonstrate the estimation of expectation values on numerical simulations of intermediate-scale quantum circuits with multiple orders of magnitude improvement over unmitigated estimation at near-term error rates (even after accounting for the additional complexity of phase estimation). Our numerical results suggest that VPE can mitigate against any single errors that might occur; i.e., the error in the estimated expectation values often scales as $O(p^2)$, where $p$ is the probability of an error occurring at any point in the circuit. This property reveals VPE as a practical technique for mitigating errors in near-term quantum experiments.


I. INTRODUCTION
Error mitigation is likely essential for near-term quantum computations to realize valuable applications. State-of-the-art technology in superconducting qubits has recently pushed quantum computers beyond the capability of their classical counterparts [1] and enabled intermediate-scale demonstrations of quantum algorithms for optimization [2,3], quantum chemistry [4-6], and machine learning [7], with tens of qubits and hundreds of quantum gates. However, these experiments clearly reveal a noise barrier that needs to be overcome if such applications are ever to scale to the classically intractable regime. In the long term, a path towards this goal is known through quantum error correction [8-10]. Yet, the requirements to successfully error correct large-scale quantum applications [11-15] are still a few orders of magnitude above the current state of the art, and will likely require many years to achieve. In the meantime, quantum applications research has focused on finding the elusive beyond-classical noisy, intermediate-scale quantum (NISQ) application [16], with the hope to accelerate the path to practical quantum computing. However, without the resources to correct errors, one must develop strategies to mitigate the aforementioned noise barrier. Otherwise, the output of NISQ devices will be corrupted beyond usefulness for algorithms significantly more complex than those already attempted.
Much of the attention in the NISQ era has been directed towards variational algorithms, with applications in optimization [17], chemistry and materials science [18], and machine learning [19,20]. These shift much of the complexity of the algorithm to a classical outer loop involving many circuit repetitions, leaving the quantum computer with only the task of preparing quantum states and estimating expectation values of operators on said states. However, preparation circuits need to have significant depth to avoid being classically simulated [21]. Errors accumulated over this circuit quickly distort the prepared state to one different from that targeted. This has meant that most quantum experiments to date have had difficulty achieving standard accuracy benchmarks prior to applying error mitigation techniques [2,4-6,22]. However, accuracy improvements of orders of magnitude have been achieved with error mitigation in these experiments, suggesting there may yet be hope for the NISQ era.
The zoo of error mitigation techniques is large and varied. One may first attempt to design algorithms that are naturally noise robust. For example, the optimization procedure in a variational algorithm makes the algorithm robust against control errors (e.g., over- or under-rotations when gates are applied) [18]. Also, subspace expansions of the variational quantum eigensolver in materials science and chemistry can correct errors by projection or approximate projection into a desired subspace [23,24]. Given the ability to artificially introduce additional noise into a device, one can extrapolate from multiple experiments at different noise levels to a hypothetical noiseless experiment [25], which has shown promising results on real devices [26]. One may alternatively probabilistically compile circuits by inserting additional gates to average out or cancel out noise, given sufficient knowledge of the error model of the device [25,27]. When classically postprocessing partial state tomography data from an experiment, one may attempt to regularize the obtained results using reduced density matrix constraints [28]. Finally, one may mitigate errors that take a state outside of a symmetry-conserving subspace of a quantum problem, either by direct postselection or artificial projection of the estimated density matrix in postprocessing, producing a "symmetry-verified" state [24,29-31]. Recent efforts have extended this protocol by introducing symmetries into problems to increase the range of errors that may be detected [32], which is analogous to the way quantum error-correcting codes introduce engineered symmetries.
Ideally, we would prefer to go beyond verifying that a system's state remains within a target subspace and instead directly verify that the system's state is the one we desire. This would result in reaching the information-theoretic optimal limit of postselected error mitigation, in which one could completely mitigate the effect of all errors by repeating the experiment a number of times scaling inversely with the circuit fidelity (equivalent to the ability to perfectly detect errors). The fact that the circuit fidelity is expected to decrease exponentially in the gate complexity indicates that eventually we will still need error correction; however, moving closer to this limit is certain to enable more powerful NISQ experiments.
In this work we develop a method for error mitigation of quantum phase estimation experiments, by verifying that the system returns to its initial state after the phase estimation step. We show that the set of experiments that pass this condition contains all the necessary information to perform quantum phase estimation. This yields a powerful error mitigation technique, as in most cases errors will not return the system to this initial state. Our techniques apply to variants of phase estimation that might involve postprocessing on a single control qubit [33,34], or when performing recently developed control-free variants [35,36]. We further develop it into a simple scheme for verified expectation value estimation by dividing a target Hamiltonian into a sum of fast-forwardable terms. This yields a simple, low-cost scheme for the measurement of expectation values, which may be immediately incorporated into the quantum step of a variational quantum algorithm. We study the mitigation power of this protocol in numerical simulations of small-scale experiments of free-fermion, transverse Ising, and electronic structure Hamiltonians. Verification is observed to mitigate all single (and even all double) errors throughout many of these simulations, as evidenced by a clear second (or third)-order sensitivity in our results to the underlying gate error rate. We observe in the best-case scenario an up to 10 000-fold suppression of error at physical error rates; this is not achieved for all systems studied, but verification is found to improve experimental error in all simulations performed. We find the error mitigation power to be highly system, circuit, and noise model dependent. Finally, we study the measurement cost of this protocol in the presence of sampling noise, finding that it is comparable to standard partial state tomography techniques for energy estimation.
The outline of this paper is as follows. In Sec. II, we give a pedagogical example of how one might verify the estimation of expectation values of an arbitrary Hamiltonian, by writing it as a sum of Pauli operators and performing (fast-forwarded) verified phase estimation on each individual term. In Sec. III we then derive the theory behind verified phase estimation itself, outline how it can mitigate errors, give algorithms for performing verified phase estimation with a single control qubit, or with access to a reference state, and study the increased sampling noise cost. In Sec. IV, we extend these ideas to give algorithms for verified expectation value estimation, and derive the conditions under which one may perform verified estimation of multiple expectation values in parallel (i.e., using the same system register). In Sec. V, we then implement these ideas, studying the mitigation power of verified expectation value estimation in a variety of systems and implementations developed earlier in the text under various noise models, and testing the convergence of the protocol under sampling noise.

II. PEDAGOGICAL EXAMPLE OF VERIFICATION PROTOCOL FOR EXPECTATION VALUE ESTIMATION
PRX QUANTUM 2, 020317 (2021)

In this section we outline a simple implementation of verified expectation value estimation of a target operator $H$ on a state $|\psi\rangle$, as a practical example of the more complicated methods to be found later in the text. The idea behind all verification protocols is to prepare $|\psi\rangle = U_p|0\rangle$, indirectly estimate $\langle H\rangle$ via phase estimation, and then verify that we remain in $|\psi\rangle$ by uncomputing $|0\rangle = U_p^\dagger|\psi\rangle$ and measuring in the computational basis. If $|\psi\rangle$ is not an eigenstate of $H$, the system may be shifted away from this state by the quantum phase estimation unitary; i.e., even in the absence of error we do not expect the system to always pass verification. However, as we show later in this work, the data required for phase estimation are contained entirely within the set of experiments that pass verification; we may effectively ignore any experiments that fail. This in turn allows us to ignore any errors that knock the system away from $|\psi\rangle$, making this a potent error mitigation scheme. We have constructed various implementations of this idea, which we expand on in Secs. III and IV, and compare in Sec. V. However, the most general protocols require relatively complicated circuits and classical postprocessing. For clarity of exposition, in this section we focus on stepping through a simple protocol for the verification of expectation values, which avoids complex signal processing and circuitry requirements. The protocol we describe will work for arbitrary $H$ and $|\psi\rangle$, and may often be a desirable choice for a real experiment. However, depending on the choices of $H$ and $|\psi\rangle$ and the noise model, other protocols described later in the text may perform better in terms of their mitigation power.
A process diagram for a simplified verified phase estimation protocol is given in Fig. 1. To begin, we write $H$ as a sum of fast-forwardable terms $H_s$ (multiplied by coefficients $h_s$)

$$H = \sum_s h_s H_s. \tag{1}$$

Here, by fast forwardable we mean that each $H_s$ is chosen such that time evolution $e^{iH_s t}$ may be implemented on a quantum register with the same number of gates for each value of $t$. Although fast forwarding is forbidden for arbitrary $H$ [37], decomposition of any sparse, row-computable $H$ into a linear combination of polynomially many fast-forwardable Hamiltonians is always possible [38]. For example, the $N$-qubit Pauli operators $P_i \in \mathbb{P}^N = \{1, X, Y, Z\}^{\otimes N}$ form a basis for the set of all $N$-qubit operators and are themselves fast forwardable; we take this decomposition for our simple example.
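To make the Pauli decomposition of Eq. (1) concrete, the following is a minimal numerical sketch, assuming a small dense Hamiltonian matrix is available classically. The helper names `pauli_string` and `pauli_decompose` are hypothetical and are not taken from the paper; the coefficients are obtained from the standard trace inner product $h_s = \mathrm{Trace}[P_s H]/2^N$.

```python
import itertools
import numpy as np

# Single-qubit Pauli matrices, labelled as in the text.
PAULIS = {
    "1": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def pauli_string(label):
    """Tensor product of single-qubit Paulis, e.g. 'XZ' -> X (x) Z."""
    op = np.array([[1.0 + 0j]])
    for ch in label:
        op = np.kron(op, PAULIS[ch])
    return op


def pauli_decompose(H, tol=1e-12):
    """Return {label: h_s} such that H = sum_s h_s P_s (hypothetical helper).

    Uses h_s = Trace[P_s H] / 2^N; the cost is exponential in N, so this is
    only meant for small illustrative examples.
    """
    n = H.shape[0].bit_length() - 1  # number of qubits for a 2^n x 2^n matrix
    coeffs = {}
    for chars in itertools.product("1XYZ", repeat=n):
        label = "".join(chars)
        h = np.trace(pauli_string(label) @ H) / 2**n
        if abs(h) > tol:
            coeffs[label] = h.real  # coefficients are real for Hermitian H
    return coeffs
```

Each term returned this way has eigenvalues ±1, which is the property the simple protocol of this section relies on.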
We then implement verified phase estimation (with a single control qubit) to estimate the expectation values $\langle\psi|H_s|\psi\rangle$. This involves evolving the system by $H_s$ conditional on a control qubit. (Circuits to implement this are well known; see, e.g., Ref. [39].) The conditional evolution encodes a phase function on the control qubit. That is, if we write $X_c$ and $Y_c$ for the $X$ and $Y$ Pauli operators on this control qubit, following the conditional evolution we have

$$\langle X_c\rangle + i\langle Y_c\rangle = g(t) = A_0 e^{it} + A_1 e^{-it}. \tag{2}$$

Here, $A_0$ and $A_1$ are the squared amplitudes of $|\psi\rangle$ in the eigenbasis of $H_s$ (which has known eigenvalues ±1). The expectation value $\langle X_c\rangle$ may be estimated by measuring the control qubit $M$ times in the $x$ basis, counting the number of times $m_{x,0}$ or $m_{x,1}$ that a 0 or 1 is seen, and approximating

$$\langle X_c\rangle \approx \frac{m_{x,0} - m_{x,1}}{M}. \tag{3}$$

(A similar procedure may be performed for $\langle Y_c\rangle$.) To verify this estimate, we uncompute the preparation of the system, and count the number $m^{(v)}_{x,0}$ ($m^{(v)}_{x,1}$) of measurements of 0 (1) on the control qubit when the uncomputed state on the system is returned to the initial $|0\rangle$ state. We then replace our estimation by

$$\langle X_c\rangle \approx \frac{m^{(v)}_{x,0} - m^{(v)}_{x,1}}{M}. \tag{4}$$

[Note that we only replace the numerator, and not the denominator, of Eq. (3), which makes this not strictly postselection; see Sec. III B for more details.] The expectation value $\langle H_s\rangle$ is encoded within the phase function $g(t)$, and must be inferred from the estimates above. In our example protocol, this requires inferring the amplitudes $A_0$ and $A_1$ (as the eigenvalues ±1 are already known). These may be simply estimated by a two-parameter fit of Eq. (2) to the extracted values of $g(t)$.

FIG. 1. Process diagram for a simplified verified phase estimation protocol. Blue denotes circuits to be executed or data to be extracted from a quantum computer; red denotes signal details to be estimated via classical postprocessing. The protocol proceeds as follows. Top left: a complex Hamiltonian $H$ is split into a number of fast-forwardable summands $H_s$. The spectral function $g(t)$ of $|\psi\rangle$ under time evolution of each piece is obtained (bottom left) via verified, fast-forwarded phase estimation. In this example, a control qubit is used to extract the phase function via phase kickback. The resulting data form a weighted sum of oscillations with frequencies equal to the eigenvalues $E^{(s)}_j$ of the corresponding factor (bottom middle). This may be decomposed by a variety of classical postprocessing techniques to obtain estimates of the expectation values $\langle H_s\rangle$, depending on the type of $H_s$ chosen (bottom right). Regardless of the method used, the expectation values must be normalized to obey Eq. (24), the last step in the verification process. As the expectation value is linear, the verified estimates of $\langle H_s\rangle$ obtained may be immediately summed together to give a verified estimate for $\langle H\rangle$ (top right).
As we show later in the text, in the absence of error, Eqs. (3) and (4) yield the same result (in the large-$M$ limit). Errors tend to scatter the system into a state that fails verification. The primary effect this has on the estimator in Eq. (4) is to rescale $g(t) \to p_{NE}\,g(t)$ (where $p_{NE}$ is the probability of no error occurring). However, the converse is not true; states may fail verification due to the relative dephasing between the $|0\rangle$ and $|1\rangle$ eigenstates of $H_s$, and we cannot infer the value of $p_{NE}$ from a single point $g(t)$. Instead, we can infer the value of $p_{NE}$ from the normalization of the starting state $|\psi\rangle$. As our circuit is fast forwarded, under reasonable noise assumptions, $p_{NE}$ is independent of $t$, and this propagates immediately through the fit of Eq. (2),

$$A_0 \to p_{NE} A_0, \qquad A_1 \to p_{NE} A_1,$$

and we may correct for this by estimating

$$\langle H_s\rangle = \frac{p_{NE}A_0 - p_{NE}A_1}{p_{NE}A_0 + p_{NE}A_1} = \frac{A_0 - A_1}{A_0 + A_1}.$$

Finally, as expectation values are linear, after repeating this procedure for all $H_s$ in Eq. (1), we may sum the result;

$$\langle H\rangle = \sum_s h_s \langle H_s\rangle.$$

Note that each $H_s$ will have different values of $A_0$, $A_1$, and $g(t)$ (we have avoided explicitly labeling the above for simplicity). In practice, the number of samples for estimation of each $H_s$ should be varied to minimize the error in the final estimation of $\langle H\rangle$ (i.e., importance sampling on the $h_s$ coefficients).
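The classical postprocessing described in this section can be sketched as follows. This is a minimal illustration, assuming that verified estimates of $g(t)$ are already available at several times and that $H_s$ has eigenvalues ±1 with amplitude $A_0$ attached to $+1$; the function names `fit_pauli_expectation` and `verified_energy` are hypothetical. A linear least-squares fit stands in for the two-parameter fit, and the ratio $(A_0 - A_1)/(A_0 + A_1)$ cancels a uniform damping by $p_{NE}$.

```python
import numpy as np


def fit_pauli_expectation(ts, g_vals):
    """Fit g(t) = A0 e^{it} + A1 e^{-it} and return (A0, A1, <H_s>).

    The fitted amplitudes may be uniformly damped (A_a -> p_NE * A_a);
    the normalized ratio returned as <H_s> is insensitive to that damping.
    """
    basis = np.stack([np.exp(1j * ts), np.exp(-1j * ts)], axis=1)
    (a0, a1), *_ = np.linalg.lstsq(basis, g_vals, rcond=None)
    a0, a1 = a0.real, a1.real  # amplitudes are real, non-negative weights
    return a0, a1, (a0 - a1) / (a0 + a1)


def verified_energy(h, expvals):
    """Sum h_s <H_s> over the terms of the decomposition (linearity)."""
    return sum(h[s] * expvals[s] for s in h)
```

For example, a signal with $A_0 = 0.8$, $A_1 = 0.2$ damped by $p_{NE} = 0.4$ yields fitted amplitudes near 0.32 and 0.08, while the normalized estimate stays at $\langle H_s\rangle = 0.6$.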

III. SCHEMES FOR VERIFIED PHASE ESTIMATION
A. Review of single-control quantum phase estimation

Quantum phase estimation (QPE) refers to a family of protocols to learn eigenphases $e^{i\phi_j}$ of a unitary operator $U$. Equivalently, quantum phase estimation may be used to learn eigenvalues $E_j$ of a Hermitian operator $H$, as each such operator generates a unitary via exponentiation: $U = e^{iHt}$ [40]. (Such estimation requires limiting the size of $t$ to prevent aliasing: $e^{iE_j t} = e^{iE_j' t}$ if $E_j t = E_j' t + 2n\pi$, which makes estimation ambiguous.) The eigenvalues of $H$ and the eigenphases of $U$ are related by the same exponentiation and correspond to the same eigenstates $|E_j\rangle$; if $H|E_j\rangle = E_j|E_j\rangle$, then $U|E_j\rangle = e^{i\phi_j}|E_j\rangle$ and $\phi_j = E_j t$.
In the single-control variant of QPE, the phases $\phi_j$ are learnt by imprinting them on a control qubit, a process known as phase kickback. Any unitary $U$ may be implemented as a (perhaps approximate) quantum circuit on a quantum "system" register, but quantum mechanics tells us that $e^{i\phi}|\psi\rangle \equiv |\psi\rangle$ for all pure states $|\psi\rangle$ and numbers $\phi \in \mathbb{R}$. This implies that if the system register were prepared in the pure state $|E_j\rangle$ and $U$ applied, we would not be able to infer the phase $\phi_j$ from the resulting state $e^{i\phi_j}|E_j\rangle \equiv |E_j\rangle$. However, a relative phase $\phi$ between two states, $(1/\sqrt{2})(|\psi_1\rangle + e^{i\phi}|\psi_2\rangle)$, is a physical observable that may be detected. Such detection may be achieved by performing the unitary $U$ conditional on the control qubit being in the state $|1\rangle$ (and doing nothing when the control qubit is in the state $|0\rangle$). This is commonly written as the "controlled" unitary $C\text{-}U$. When $C\text{-}U$ acts on a system register prepared in an eigenstate $|E_j\rangle$ and a control qubit prepared in the state $(|0\rangle + |1\rangle)/\sqrt{2}$, the global state evolves to

$$\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)|E_j\rangle \to \frac{1}{\sqrt{2}}(|0\rangle + e^{i\phi_j}|1\rangle)|E_j\rangle.$$

We see that the eigenphase $e^{i\phi_j}$ from the system register is kicked back onto the control qubit, while the system register itself remains unchanged. We may estimate this eigenphase $e^{i\phi_j}$ by repeatedly performing the QPE protocol, measuring the control qubit in the $X$ or the $Y$ basis, and recording the number of single-shot readouts of 1 and 0.
In the Hamiltonian case, from this estimate one may immediately infer that $(1/t)\,\mathrm{Arg}(e^{i\phi_j}) = E_j \bmod 2\pi/t$. The error in the estimation of $E_j$ decreases with $t$; asymptotically optimal protocols need to balance this against the ambiguity modulo $2\pi/t$ by repeating the estimation at multiple values of $t$ [41-43]. In terms of estimating the eigenphases $e^{i\phi_j}$ of a unitary $U$, this optimization requires repeating the above procedure for $C\text{-}U^k$ at varying points $k$. Often, one does not prepare an eigenstate $|E_j\rangle$, but instead prepares a starting state

$$|\psi_s\rangle = \sum_j a_j |E_j\rangle.$$

Applying $C\text{-}U^k$ to such a state no longer leaves it unchanged, but instead entangles it with the control qubit. This produces the combined state (on the system+control register)

$$|\Psi(k)\rangle = \frac{1}{\sqrt{2}}\sum_j a_j\,(|0\rangle + e^{ik\phi_j}|1\rangle)\,|E_j\rangle. \tag{9}$$

When one has instead performed controlled time evolution (via the unitary $C\text{-}e^{iHt}$), one may instead write

$$|\Psi(t)\rangle = \frac{1}{\sqrt{2}}\sum_j a_j\,(|0\rangle + e^{iE_j t}|1\rangle)\,|E_j\rangle.$$

The sum over $j$ in the above equation looks problematic, but it turns out that the eigenphases $\phi_j$ (or eigenvalues $E_j$) remain encoded on the control qubit, in a sum weighted by the norm square $A_j := |a_j|^2$ of the initial amplitudes $a_j$. To be precise, one may trace over the system register to obtain the reduced density matrix of the control qubit

$$\rho_c(t) = \mathrm{Trace}_{\mathrm{sys}}\big[|\Psi(t)\rangle\langle\Psi(t)|\big] = \frac{1}{2}\begin{pmatrix} 1 & g^*(t) \\ g(t) & 1 \end{pmatrix}, \tag{11}$$

with $g(t)$ the phase function of $|\psi_s\rangle$ under $H$,

$$g(t) = \sum_j A_j e^{iE_j t}.$$

Estimates of $g(t)$ may be obtained as an expectation value of the Pauli operators $X$ and $Y$, since $g(t) = \langle X_c\rangle + i\langle Y_c\rangle$. Measuring these expectation values requires rotating the control qubit into the $x$ or $y$ basis, reading it out, and averaging the output over many repetitions (or shots) of the experiment. For a unitary operator $U$, one may obtain an equivalent phase function by estimating

$$g(k) = \langle\Psi(k)|(X_c + iY_c)|\Psi(k)\rangle = \sum_j A_j e^{ik\phi_j},$$

with $|\Psi(k)\rangle$ defined in Eq. (9). The tomography to extract these expectation values is the same as described in the previous paragraph. Information about the eigenvalues $E_j$ and amplitudes $A_j = |a_j|^2$ may be inferred classically from estimates of $g(t)$ at multiple values of $t$. When these are estimated sufficiently well, the expectation value of the Hamiltonian may be calculated as

$$\langle H\rangle = \sum_j A_j E_j. \tag{17}$$

Inference of the amplitudes $A_j$ from $g(t)$ to error $\epsilon$ takes time scaling asymptotically as $\epsilon^{-2}$ on a quantum device, even when the eigenvalues $E_j$ are already known [44]. By propagating variances, this implies equivalent convergence in the estimation of expectation values via Eq. (17). One need not resolve all $2^N$ eigenvalues of an $N$-qubit operator in order to evaluate Eq. (17). Time-series analysis methods [34] or integral methods [45] produce a coarse-grained approximation to the spectrum that may be averaged over to obtain expectation values with similar convergence rates. Alternatively, for simple operators with a highly degenerate spectrum (e.g., Pauli operators), curve fitting will be sufficient to extract the required data (as described in Sec. II) [46].
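The relation between the control-qubit expectation values and the phase function can be checked numerically on a small example. The sketch below assumes a dense Hermitian matrix `H` and a normalized state vector `psi`, simulates the controlled evolution exactly (no noise, no sampling), and compares $\langle X_c\rangle + i\langle Y_c\rangle$ with $\sum_j A_j e^{iE_j t}$. All function names are hypothetical.

```python
import numpy as np


def phase_function(H, psi, t):
    """g(t) = sum_j A_j exp(i E_j t) for the state psi under Hermitian H."""
    E, V = np.linalg.eigh(H)
    A = np.abs(V.conj().T @ psi) ** 2  # squared amplitudes A_j = |a_j|^2
    return np.sum(A * np.exp(1j * E * t))


def expectation_from_spectrum(H, psi):
    """<H> = sum_j A_j E_j, evaluated from the same spectral data."""
    E, V = np.linalg.eigh(H)
    A = np.abs(V.conj().T @ psi) ** 2
    return float(np.sum(A * E))


def control_qubit_signal(H, psi, t):
    """<X_c> + i<Y_c> on the control qubit after controlled-e^{iHt}."""
    E, V = np.linalg.eigh(H)
    U = (V * np.exp(1j * E * t)) @ V.conj().T  # e^{iHt}
    # Joint state (|0>|psi> + |1> U|psi>)/sqrt(2); control is the first index.
    joint = (np.concatenate([psi, U @ psi]) / np.sqrt(2)).reshape(2, -1)
    rho_c = joint @ joint.conj().T  # trace out the system register
    X = np.array([[0, 1], [1, 0]])
    Y = np.array([[0, -1j], [1j, 0]])
    return np.trace(rho_c @ X) + 1j * np.trace(rho_c @ Y)
```

The two routes agree for any $t$, and $g(0) = 1$ reflects the normalization of the starting state.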

B. Verifying a phase estimation experiment
As the data from single-control quantum phase estimation are accumulated entirely on the control qubit, one would be tempted to throw the system register away (or rather, reset the register and begin anew). In the absence of error correction this temptation grows larger; noise levels in near-term devices are high enough that coherent states of more than a few qubits degrade over the course of any reasonably sized algorithm to within a few percent fidelity to the target state, if not less [4]. However, even when corrupted, the information contained within the system register is valuable, as one can use this information to diagnose potential errors in the data to be read from the control qubit. For instance, in the presence of global symmetries of the Hamiltonian, one could imagine mitigating errors that do not commute with this symmetry via symmetry verification [29,30,32]. In verifying these symmetries, we are in effect projecting the system into a subspace of the global Hilbert space that contains the information we desire. One could imagine constructing ever-smaller Hilbert spaces, which trades circuit complexity for error-detection power. It turns out that the limit of this construction is achievable: instead of measuring one or more symmetries on the system register, we can instead verify that it has returned to its initial state $|\psi_s\rangle$. (This is similar to the echo-type measurements made in randomized benchmarking [47] or quantum Hamiltonian learning [48].) Assuming that $|\psi_s\rangle$ is prepared from the computational basis state $|0\rangle$ by a preparation unitary $U_p$, this measurement may be achieved by applying $U_p^\dagger$, and reading out each qubit in the computational basis. One would expect such a measurement to distort the phase function $g(t)$, but this is not so, as we may expand the trace in Eq. (11) to show that

$$\mathrm{Trace}\big[(X_c + iY_c)\,\rho_c(t)\big] = \mathrm{Trace}\big[(X_c + iY_c)\otimes|\psi_s\rangle\langle\psi_s|\;|\Psi(t)\rangle\langle\Psi(t)|\big]. \tag{18}$$

Here, the left-hand side of the equation is the expectation value on $\rho_c(t)$ regardless of the state of the system register, and the right-hand side is the (non-normalized) expectation value on verified experiments only. The lack of normalization means that this is not a postselection technique; instead one assumes that the contribution of states that fail verification to the final estimation of $g(t)$ is zero.
[By contrast, states that pass verification contribute either +1 or −1 to the estimation of $g(t)$.]
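The claim that verification leaves the phase function undistorted can be checked directly on a small noiseless example. The sketch below assumes a dense Hermitian `H` and a normalized `psi`; it projects the system register onto $|\psi_s\rangle$ after the controlled evolution and evaluates the unnormalized expectation value of $X_c + iY_c = 2|0\rangle\langle 1|$ on the control qubit. The function name `verified_signal` is hypothetical.

```python
import numpy as np


def verified_signal(H, psi, t):
    """Return (g, g_verified, p_pass) for a noiseless verified experiment.

    g          : <psi| e^{iHt} |psi>, the phase function
    g_verified : unnormalized <X_c + iY_c> restricted to runs that pass
    p_pass     : probability that the system returns to |psi>
    """
    E, V = np.linalg.eigh(H)
    U = (V * np.exp(1j * E * t)) @ V.conj().T
    g = np.vdot(psi, U @ psi)
    # Joint state, control index first, then project the system onto |psi>.
    joint = (np.concatenate([psi, U @ psi]) / np.sqrt(2)).reshape(2, -1)
    ctrl = joint @ psi.conj()           # unnormalized control state
    rho_v = np.outer(ctrl, ctrl.conj())
    x_plus_iy = np.array([[0, 2], [0, 0]])  # X + iY = 2|0><1|
    g_verified = np.trace(x_plus_iy @ rho_v)
    p_pass = np.trace(rho_v).real
    return g, g_verified, p_pass
```

In this sketch `g_verified` equals `g` even though `p_pass` is generally below one, which is why the estimator divides by the total number of shots rather than by the number of verified shots.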
We can make a physical argument why Eq. (18) holds and verification should not affect the estimation of $g(t)$ in the absence of noise. Let us decompose the reduced density matrix on the control qubit as

$$\rho_c = \rho_c^{(v)} + \rho_c^{(f)},$$

i.e., into the ensemble of states that have passed verification, $\rho_c^{(v)}$, and those that have failed, $\rho_c^{(f)}$. When the control qubit is in the $|0\rangle$ state, the system register is not evolved, so in the absence of noise the state will pass verification every time. This implies that a verification failure in the absence of noise projects the control qubit into the $|1\rangle$ state; this fraction of states on average contributes nothing to the estimate of $g(t)$. In other words,

$$\mathrm{Trace}\big[(X_c + iY_c)\,\rho_c^{(f)}\big] = 0.$$

Note that postselecting (i.e., keeping only the experimental data where verification was passed) would instead prepare the state $\rho_c^{(v)}/\mathrm{Trace}[\rho_c^{(v)}]$. This will not yield the desired result, as

$$\frac{\mathrm{Trace}\big[(X_c + iY_c)\,\rho_c^{(v)}\big]}{\mathrm{Trace}\big[\rho_c^{(v)}\big]} = \frac{2g(t)}{1 + |g(t)|^2},$$

which is not equal to $g(t)$ unless $|\psi_s\rangle$ is an eigenstate of $e^{iHt}$ (in which case $\rho_c^{(v)} = \rho_c$). (Moreover, this rescaling can be up to a factor 2 in the absence of noise, and the spectrum of this new function is significantly different to the original.) To give some intuition, one can imagine phase estimation on a mixed state in two steps: performing phase estimation on individual states to generate a set of signal functions $e^{iE_j t}$, and then summing and returning the weighted result $g(t)$. The set of states that fail verification, $\rho^{(f)}$, captures the relative dephasing between these states, which cannot be ignored when attempting to recover this result. Instead, an explicit protocol for the measurement of a single $g(t)$ within verified single-control phase estimation takes the following form.
Algorithm 1.

Inputs: circuits to implement $U_p$, $U_p^\dagger$, and controlled time evolution $e^{iHt}$; number of repetitions $M$ of measurements in the $x$ and $y$ bases.

Output: an estimate of $g(t)$ with variance $O(1/M)$ in both the real and imaginary parts.

1. Prepare classical initial variables $g_x = 0$, $g_y = 0$.
2. Prepare the system register in a starting state $|\psi_s\rangle = U_p|0\rangle$ and the control qubit in the state $(|0\rangle + |1\rangle)/\sqrt{2}$.
3. Simulate time evolution $e^{iHt}$ conditional on the control qubit.
4. Apply the inverse circuit $U_p^\dagger$ to the system register.
5. Rotate the control qubit into the $X$ or $Y$ basis and measure it to obtain a number $m \in \{0, 1\}$.
6. If all qubits in the system register read 0, increment the relevant variable $g_x$ or $g_y$ by $(-1)^m$.
7. Repeat steps 2-6 $M$ times in the $X$ basis and $M$ times in the $Y$ basis, and estimate $g(t)$ by

$$g(t) \approx \frac{g_x + i g_y}{M}.$$

We consider the increased sampling cost in the presence of error in Sec. III C 1.
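The steps above can be sketched as a sampling simulation. This is a minimal illustration, assuming a noiseless device and assuming the exact phase function `g` is supplied so that outcome probabilities can be written in closed form: after projecting the system onto $|\psi_s\rangle$ the control is left in the unnormalized state $(|0\rangle + g|1\rangle)/\sqrt{2}$, so the probabilities of passing verification and reading $m = 0$ or $m = 1$ are $|1 \pm g|^2/4$ in the $X$ basis and $|1 \mp ig|^2/4$ in the $Y$ basis. The function name `run_algorithm_1` is hypothetical.

```python
import numpy as np


def run_algorithm_1(g, M, rng):
    """Sample the verified estimator of g(t) with M shots per basis.

    g   : exact phase function value (assumed known for this simulation)
    rng : a numpy.random.Generator
    """
    parts = []
    for phase in (1.0, -1j):  # X basis first, then Y basis
        p0 = abs(1 + phase * g) ** 2 / 4  # pass verification and read m = 0
        p1 = abs(1 - phase * g) ** 2 / 4  # pass verification and read m = 1
        p_fail = max(0.0, 1.0 - p0 - p1)  # verification failed
        counts = rng.multinomial(M, [p0, p1, p_fail])
        # Failed runs contribute zero; the denominator stays M (no postselection).
        parts.append((counts[0] - counts[1]) / M)
    return parts[0] + 1j * parts[1]
```

Calling `run_algorithm_1(0.3 + 0.4j, 200_000, np.random.default_rng(0))` should return a value close to `0.3 + 0.4j`, with statistical fluctuations that shrink as $M$ grows.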

C. Why verification mitigates errors
The mitigation power from verification is based on the relative size of the Hilbert spaces in which the states that have passed verification and states that have failed verification, $\rho = \rho^{(v)} + \rho^{(f)}$, live. If we define the Hilbert spaces in which the two ensembles live as $\mathcal{H}^{(v)}$ and $\mathcal{H}^{(f)}$, respectively, we have

$$\dim[\mathcal{H}^{(v)}] = 2 \ll 2^{N+1} - 2 = \dim[\mathcal{H}^{(f)}].$$

An error that occurs during the circuit is then likely to scatter the system into the set of rejected states. As an extreme example, the probability that a completely random error (i.e., an error that scatters all states to a random state) at any point in the circuit will yield a state in $\mathcal{H}^{(v)}$ can be immediately calculated to be $2/(2^{N+1} - 2) \sim 2^{-N}$. This includes errors during preparation of $|\psi_s\rangle$ by the unitary $U_p$ and during the inverse $U_p^\dagger$ used to perform the verification itself. As we are not postselecting on the verification output, $g(t)$ is still affected by this shift, but the distortion may be accounted for in classical postprocessing. In this simple noise model the effect of noise is then to replace the estimate of $g(t)$ by

$$g_{\mathrm{err}}(t) = p_{NE}(t)\,g(t) + p_{\mathrm{err}}(t)\,O(2^{-N}), \tag{22}$$

where $p_{NE}(t)$ and $p_{\mathrm{err}}(t)$ are the probabilities of no error or some error occurring, respectively. (In Appendix A we derive the specific requirements for this to be the case.) Assuming that errors occur at a constant rate as a function of the circuit depth, and all scatter the system outside $\mathcal{H}^{(v)}$, for fast-forwardable Hamiltonians, $p_{NE}(t) = p_{NE}$ and

$$g_{\mathrm{err}}(t) = p_{NE}\sum_j A_j e^{iE_j t}.$$

This can be seen as a uniform damping of each squared amplitude $A_j$ to $A'_j = p_{NE}A_j$. Such damping may be corrected for classically as we know $|\psi_s\rangle$ is normalized, and so we may estimate

$$A_j = \frac{A'_j}{\sum_k A'_k}. \tag{24}$$

Depending on the classical signal processing method used, one may not obtain estimates of all $A'_j$ and $E_j$, but may instead directly calculate $\sum_j A'_j E_j$ and $\sum_j A'_j$. For example, one could use $g_{\mathrm{err}}(0) = \sum_j A'_j$ as such a reference point. For non-fast-forwardable Hamiltonians, assuming again that errors occur at a constant rate throughout the circuit and that all scatter the system outside $\mathcal{H}^{(v)}$, we have

$$g_{\mathrm{err}}(t) = e^{-\tau_{\mathrm{err}} t}\sum_j A_j e^{iE_j t} = \sum_j A_j e^{i(E_j + i\tau_{\mathrm{err}})t}.$$

This can be seen to be an imaginary shift to the eigenvalues $E_j \to E_j + i\tau_{\mathrm{err}}$. It can be corrected for in signal processing of the phase function by taking only the real parts of the $E_j$ eigenvalues.
The above analysis is not necessarily true for simulation of an arbitrary Hamiltonian under a realistic noise model. In particular, if the instantaneous state during simulation is a near eigenstate of the error model, then the correction in Eq. (22) may be as large as $O(1)$ instead of $O(2^{-N})$.
In Appendix A we study this in more detail, and specify the conditions under which errors will distort the results of verified phase estimation.
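The exponential suppression quoted above for a completely random error can be illustrated with a short Monte Carlo estimate. The sketch below is a simplified model that considers the system register only and assumes the error replaces the state by a Haar-random pure state; it then averages the probability that all $N$ qubits read 0 after uncomputation. Normalized complex Gaussian vectors are used to sample Haar-random states, and the function name is hypothetical.

```python
import numpy as np


def random_pass_probability(n_qubits, samples, rng):
    """Average probability that a Haar-random state passes verification."""
    d = 2**n_qubits
    # After applying U_p^dagger, passing verification means reading |0...0>.
    z = rng.normal(size=(samples, d)) + 1j * rng.normal(size=(samples, d))
    z /= np.linalg.norm(z, axis=1, keepdims=True)  # Haar-random unit vectors
    return float(np.mean(np.abs(z[:, 0]) ** 2))
```

For $N = 4$ the average is close to $2^{-4} = 0.0625$, in line with the $\sim 2^{-N}$ scaling discussed in the text.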

1. Sampling costs
The error mitigation from verification comes at the cost of increasing the number of samples required to estimate $g(t)$. Assuming that all errors fall outside the verified subspace, estimating $g(t)$ to precision $\epsilon$ requires estimating $g_{\mathrm{err}}(t)$ to precision $p_{NE}\,\epsilon$. To obtain $g_x$ in Algorithm 1 (and equivalently for $g_y$), we average over a set of $M$ experimental outputs that may take the values $\{-1, 0, 1\}$. Let us define the $i$th experimental output $g_x^i$; then we have

$$g_x = \sum_{i=1}^{M} g_x^i.$$

Our estimate of the noisy $g_{\mathrm{err}}(t)$ is then given by

$$\mathrm{Re}[g_{\mathrm{err}}(t)] = P(g_x^i = 1) - P(g_x^i = -1).$$

As each experiment is independent and identically distributed, the variance on our estimates of these probabilities is

$$\mathrm{Cov}\big[P(g_x^i = a), P(g_x^i = b)\big] = \frac{1}{M}\Big[\delta_{a,b}\,P(g_x^i = a) - P(g_x^i = a)\,P(g_x^i = b)\Big].$$

Propagating variances gives

$$\mathrm{Var}\big[\mathrm{Re}\,g_{\mathrm{err}}(t)\big] = \frac{1}{M}\Big\{P(g_x^i = 1) + P(g_x^i = -1) - \big[P(g_x^i = 1) - P(g_x^i = -1)\big]^2\Big\}.$$

We may then bound the requirements to estimate $g_{\mathrm{err}}(t)$ to variance $p_{NE}^2\epsilon^2$,

$$\mathrm{Var}\big[\mathrm{Re}\,g_{\mathrm{err}}(t)\big] \le \frac{P(g_x^i = 1) + P(g_x^i = -1)}{M} \le \frac{p_{NE}}{M},$$

and thus the number of shots required to estimate (the real or imaginary part) of $g(t)$ would scale as

$$M \sim p_{NE}^{-1}\,\epsilon^{-2}.$$

This is not to be ignored; verification requires at least doubling the size of the circuit, which, if $p_{NE} = 0.01$ (as has been reported [1] and mitigated successfully [4] in previous experiments), will increase the measurement count by a factor of 100. Some of the methods presented in this work involve increasing the circuit depth by factors of up to 14, which will be impractical for large experiments without further circuit optimization.
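The sampling overhead can be turned into a quick back-of-the-envelope calculator. The sketch below is an illustration under stated assumptions: each shot contributes $+1$, $-1$, or $0$ with fixed probabilities, the variance of the averaged estimator follows from the multinomial distribution, and the shot count uses the bound that the variance is at most $p_{NE}/M$ when all errors fail verification. The function names and the exact form of the bound are assumptions made for this example.

```python
import math


def estimator_variance(p_plus, p_minus, M):
    """Variance of (number of +1 minus number of -1)/M over M i.i.d. shots,
    where each shot is +1, -1, or 0 with probabilities p_plus, p_minus,
    and 1 - p_plus - p_minus."""
    return (p_plus + p_minus - (p_plus - p_minus) ** 2) / M


def shots_required(eps, p_ne):
    """Shots M so that the rescaled estimate g_err / p_NE has standard
    error at most eps, assuming Var[Re g_err] <= p_NE / M."""
    return math.ceil(1.0 / (p_ne * eps**2))
```

With these assumptions, lowering $p_{NE}$ from 1 to 0.01 at fixed target precision raises the shot count by a factor of about 100, matching the figure quoted in the text.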

2. Control noise
An important realistic error to consider in QPE is error on the control qubit. This keeps the system within the verified subspace, and so is not captured by the above analysis. However, the effect of many common error channels may still be mitigated by verification. For example, let us assume that the circuit decomposition of $C\text{-}U$ involves the control qubit performing only single-qubit gates and controlled operations on the rest of the circuit (which is typically the case). In this case, one may show that the effect of a depolarizing channel of strength $\lambda$, acting on the control qubit at any point in the circuit, sends the final state of the system to

$$\rho = (1 - \lambda)\,\rho_{NE} + \lambda\,\rho_{\mathrm{dep}},$$

where $\rho_{NE}$ is the state in the absence of error, and

$$\mathrm{Trace}\big[(X_c + iY_c)\,\rho_{\mathrm{dep}}\big] = 0.$$

In this case, the (noisy) estimate of $g(t)$ is sent to $(1 - \lambda)g(t)$, and expectation values and eigenvalues may be recovered via the same analysis as in Sec. III C. However, the above analysis will not hold for a more general noise model, and schemes such as randomized compiling [49] may be required to unbias the estimate of $g(t)$. An example of this biasing effect is if an amplitude-damping channel is present on the control qubit between the final measurement prerotation and readout in the computational basis. Left unchecked, this will shift the estimate of $g(t)$ to a damped signal $(1 - \lambda)g(t)$ plus a $t$-independent additive offset. In addition to damping the true signal $g(t)$, this additive signal presents as a 0-energy eigenvalue in the spectrum of $g(t)$. This will not be accounted for by naive renormalization of $\langle H\rangle$ as outlined in Algorithm 3 below; the estimation protocol will instead estimate $(1 - \lambda)\langle H\rangle$.
Though this could be corrected in postprocessing, we suggest that a more stable mitigation is to flip the $|0\rangle$ and $|1\rangle$ states on the control qubit for half of the experiments. This may be compiled into the final prerotation, and does not increase the total sampling cost of the experiment (only half as many samples need to be taken at each prerotation setting for the same accuracy). We observe similar biases on bit-flip noise channels that tend to decay the real and imaginary parts of $g(t)$ asymmetrically. This may be compensated for in turn by compiling a $\pi/4$ $Z$ rotation on the initial control qubit state, and uncompiling it in the final prerotation. (One can see that this commutes with all gates in the circuit.) For the noise models studied numerically in this text, we have found either one or both of the above compilation schemes sufficient to mitigate control error. More complicated noise models may require more complicated compilation schemes; extending the above will be an interesting task for future work. In particular, the above analysis does not apply to correlated two-qubit noise during operations between the control qubit and the rest of the system.
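The effect of flipping the control qubit for half of the experiments can be illustrated with a simple classical readout model. The sketch below assumes, purely for illustration, that amplitude damping just before readout acts as a classical decay of outcome 1 into outcome 0 with probability `gamma`; `p0` and `p1` denote the ideal probabilities of passing verification and reading 0 or 1. Both the model and the function names are assumptions rather than the paper's noise simulation.

```python
def readout_with_damping(p0, p1, gamma, flip=False):
    """Expected per-shot contribution (+1 for m = 0, -1 for m = 1) when a
    1 decays to 0 with probability gamma immediately before readout.

    flip=True models an X gate compiled into the final prerotation, with
    the sign of the recorded outcome inverted classically afterwards.
    """
    if flip:
        p0, p1 = p1, p0
    q0 = p0 + gamma * p1      # decayed 1s are recorded as 0
    q1 = (1.0 - gamma) * p1
    signal = q0 - q1
    return -signal if flip else signal


def symmetrized_readout(p0, p1, gamma):
    """Average of the unflipped and flipped settings (half the shots each)."""
    return 0.5 * (readout_with_damping(p0, p1, gamma)
                  + readout_with_damping(p0, p1, gamma, flip=True))
```

In this model the unflipped estimate equals $(1-\gamma)(p_0 - p_1) + \gamma(p_0 + p_1)$, while the symmetrized estimate is $(1-\gamma)(p_0 - p_1)$: the additive offset cancels and only a uniform damping remains, which the normalization step can absorb.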

D. Verified control-free phase estimation
As was recently demonstrated in Ref. [35], the control qubit may be removed from a QPE experiment if we have the ability to prepare an alternative reference eigenstate |ψ_r⟩ of the Hamiltonian H (with ⟨ψ_s|ψ_r⟩ = 0). For example, in the electronic structure problem in quantum chemistry the number-conserving Hamiltonian has the vacuum as a potential reference state. (A similar situation was considered in Ref. [50] for the purposes of random gap estimation, but estimating single eigenvalues E_j from this class of experiments is somewhat awkward.) This was also recently considered as an extension to the well-known robust QPE scheme [51], requiring both |ψ_r⟩ and |ψ_s⟩ to be eigenstates of the system [36]. Note that |ψ_r⟩ need not necessarily be a zero-energy eigenstate of H, though the corresponding eigenenergy E_r should be known to high accuracy. In this case, one needs to prepare the correlated state (1/√2)(|ψ_s⟩ + |ψ_r⟩) and perform uncontrolled time evolution, and finally measure the off-diagonal element |ψ_s⟩⟨ψ_r|. This is shown in the circuit (Fig. 2). Evaluating the circuit provides an estimate of e^{−iE_r t} g(t), and the additional phase may be subtracted in postprocessing.
FIG. 2. Quantum circuit for control-free verified phase estimation. The preparation unitary U_p is defined in Eq. (42). The first gate in the circuit is a Hadamard gate (roman H) on the top-most qubit (labeled the target qubit in the text), which should not be confused with the Hamiltonian H.
The protocol for verified control-free phase estimation does not differ significantly from the single-control case. Besides the loss of the control qubit and removal of control from the time evolution circuit, we also now require our preparation circuit to prepare the starting state (1/√2)(|ψ_s⟩ + |ψ_r⟩). We assume that this is achieved by first applying a Hadamard gate to a single target qubit in the system register, placing the system in the state (1/√2)(|0⟩ + |1_T⟩). (Here we use the notation |1_T⟩ for the basis state where the target qubit is in the |1⟩ state and all other qubits are in |0⟩.) Then, the desired preparation may be achieved by a preparation unitary U_p that performs the mapping U_p|0⟩ = |ψ_r⟩, U_p|1_T⟩ = |ψ_s⟩. (We use the same notation as for the single-control unitary on purpose, as, under the associations |0⟩|ψ_s⟩ ↔ |ψ_r⟩ and |1⟩ ↔ |ψ_s⟩, one may see that the two are equivalent.) With this definition, estimation of |ψ_r⟩⟨ψ_s| may be achieved by inverting U_p, as U_p†|ψ_r⟩⟨ψ_s|U_p = |0⟩⟨1_T|. In particular, after inversion, the reduced density matrix of the target qubit contains the desired phase function g(t), and the verification consists of checking whether all other qubits are measured into 0. The full control-free protocol is then the following.
Inputs: circuits to prepare a superposition of |ψ_s⟩ and |ψ_r⟩, invert the preparation, and implement time evolution e^{iHt}; number of repetitions M of measurements in the X and Y bases; the reference eigenstate energy E_r.
Output: an estimate of g(t) [Eq. (41)] with variance O(1/M) in both the real and imaginary parts.
1. Prepare classical initial variables g_x = 0, g_y = 0.
2. Prepare the system register in the starting state (1/√2)(|ψ_s⟩ + |ψ_r⟩).
3. Apply the unitary U^k (or, equivalently, simulate time evolution e^{iHt}).
4. Apply the inverse circuit U_p† to the system register.
5. Rotate the target qubit into the X or Y basis and measure it to obtain a number m ∈ {0, 1}.
6. Measure all other qubits, and if they all read out 0, increment the relevant variable g_x or g_y by (−1)^m.
7. Repeat steps 2-6 M times in the X basis and M times in the Y basis, and estimate g(t) by e^{iE_r t}(g_x/M + i g_y/M).
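The steps above can be sketched as a classical Monte Carlo simulation of the noiseless protocol. This is an illustration under stated assumptions, not an implementation from the text: the quantum circuit is replaced by its ideal outcome probabilities, the verified part of the state after inversion is taken to be a|0⟩ + b|1_T⟩ with a = e^{iE_r t}/√2 and b = g(t)/√2, and the Y-basis outcomes are labeled so that their mean gives the imaginary part. The function name and arguments are hypothetical.

```python
import cmath
import random


def sample_control_free_vpe(g_true, e_ref, t, shots, rng):
    """Sample the control-free estimator of g(t) from ideal probabilities."""
    a = cmath.exp(1j * e_ref * t) / 2 ** 0.5   # amplitude on |0> (reference)
    b = g_true / 2 ** 0.5                      # amplitude on |1_T> (starting state)
    overlap = a.conjugate() * b                # e^{-i E_r t} g(t) / 2
    p_pass = abs(a) ** 2 + abs(b) ** 2         # probability that verification passes
    totals = {}
    for basis, signal in (("x", 2 * overlap.real), ("y", 2 * overlap.imag)):
        # Joint probability of passing verification with outcome m in {0, 1}:
        # P(pass, m) = (p_pass + (-1)^m * signal) / 2
        p0 = (p_pass + signal) / 2
        p1 = (p_pass - signal) / 2
        total = 0
        for _ in range(shots):
            r = rng.random()
            if r < p0:
                total += 1        # verified, m = 0
            elif r < p0 + p1:
                total -= 1        # verified, m = 1
            # otherwise verification failed and the shot adds nothing
        totals[basis] = total
    # Step 7: restore the known reference phase.
    return cmath.exp(1j * e_ref * t) * (totals["x"] + 1j * totals["y"]) / shots


if __name__ == "__main__":
    rng = random.Random(1)
    g_true = 0.6 * cmath.exp(0.4j)
    print(sample_control_free_vpe(g_true, e_ref=0.25, t=2.0, shots=20000, rng=rng))
```

Shots that fail verification contribute zero rather than being renormalized away, which matches step 6: the counters are only incremented on a verified shot, but the sum is still divided by the total number of repetitions M.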
The analysis of Sec. III C is identical for the control-free case, except that the issue of control noise is absent, as is the analysis of Sec. III C 1. However, we note that at the beginning and the end of any experiment, single-qubit noise on the target qubit behaves similarly to control qubit noise. This necessitates averaging over multiple initial and final rotations of the target qubit to prevent bias in the estimation of g(t).
The above analysis implies that the algorithms studied in Refs. [35,50] should be amenable to verification immediately as well. It also provides some additional explanation for the error robustness observed in the robust phase estimation of Ref. [36].

IV. VERIFIED EXPECTATION VALUE ESTIMATION
In many circumstances, one wishes to know not the eigenvalues of a Hermitian operator H, but instead its expectation value ⟨H⟩ under a specified state |Ψ⟩. For instance, in a variational quantum eigensolver [18], one prepares a state |Ψ(θ)⟩ = U(θ)|0⟩ dependent on a set of classical input parameters θ, then measures the expectation value E(θ) = ⟨Ψ(θ)|H|Ψ(θ)⟩. This is then optimized over θ in a classical outer loop, with the optimized state |Ψ(θ_opt)⟩ hopefully a good approximation of the true ground state |E_0⟩. In quantum variational algorithms it is typical that ⟨Ψ(θ)|H|Ψ(θ)⟩ is estimated by means of partial state tomography [31,52,53]. However, noise in the preparation unitary U(θ) causes an errant state ρ_err(θ) ≠ |Ψ(θ)⟩⟨Ψ(θ)| to be prepared and tomographed, propagating the preparation error directly to a final estimation error. The noise analysis in Sec. III C extends to both the preparation and mitigation unitaries, so if verified phase estimation is used to provide estimates of eigenvalues and amplitudes, one may reconstruct ⟨H⟩ and inherit the mitigation power of the verification protocol. This has the added advantage that control errors in the preparation circuit (which, being a repeated error, are not mitigated against) are able to be compensated for during the outer optimization loop of the VQE, as is well known [4,18]. Quantum phase estimation has previously been suggested as an alternative to partial state tomography for expectation value estimation, both to improve the rate of estimation [54] and to provide a witness for the presence of eigenstates of the Hamiltonian [55]. The verification protocols described in this work should be applicable to these methods as well. A general algorithm for verified expectation value estimation takes the following form.
Inputs: (noisy) circuits to implement U_p, U_p† and controlled time evolution e^{iHt}; a set of t values; number of repetitions M of measurements in the X and Y bases (which can be t dependent); a method for classical signal processing (e.g., a curve fitting algorithm).
Output: an estimate of ⟨H⟩.
1. Estimate g_err(t) for all given points t using Algorithm 1 to the chosen precision.
2. Obtain estimates for individual E_j and A_j values via classical signal processing.
3. Estimate ⟨H⟩ from the resulting E_j and A_j values [Eq. (44)].
Algorithm 3. Verified expectation value estimation.
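Steps 2 and 3 of the algorithm can be illustrated with a deliberately small sketch. Two assumptions are made that the text does not confirm in this form: the phase function contains only two eigenvalues, g(t) = A_1 e^{iE_1 t} + A_2 e^{iE_2 t}, so that Prony's method reduces to a 2×2 linear solve and a quadratic equation, and the final estimate is the renormalized sum Σ_j A_j E_j / Σ_j A_j. The function names are hypothetical.

```python
import cmath


def prony_two_levels(g, dt):
    """Fit g[k] = A1*z1**k + A2*z2**k from four equally spaced samples."""
    g0, g1, g2, g3 = g
    # The samples obey the recurrence g[k+2] = s*g[k+1] - p*g[k],
    # with s = z1 + z2 and p = z1*z2. Solve for s and p by Cramer's rule.
    det = g0 * g2 - g1 * g1
    s = (g0 * g3 - g1 * g2) / det
    p = (g1 * g3 - g2 * g2) / det
    root = cmath.sqrt(s * s - 4 * p)
    z1, z2 = (s + root) / 2, (s - root) / 2
    # Amplitudes follow from g[0] = A1 + A2 and g[1] = A1*z1 + A2*z2.
    a1 = (g1 - g0 * z2) / (z1 - z2)
    a2 = g0 - a1
    energies = [cmath.phase(z) / dt for z in (z1, z2)]
    return energies, [a1, a2]


def expectation_value(energies, amplitudes):
    """Renormalized estimate sum_j A_j E_j / sum_j A_j (assumed form)."""
    num = sum((a * e).real for a, e in zip(amplitudes, energies))
    den = sum(a.real for a in amplitudes)
    return num / den


if __name__ == "__main__":
    dt, damping = 0.5, 0.8   # damping mimics a uniformly decayed signal
    true_e, true_a = [0.5, -1.2], [0.7, 0.3]
    g_err = [
        damping * sum(a * cmath.exp(1j * e * k * dt) for a, e in zip(true_a, true_e))
        for k in range(4)
    ]
    energies, amplitudes = prony_two_levels(g_err, dt)
    print(expectation_value(energies, amplitudes))
```

Because the damping factor multiplies every amplitude equally, it cancels in the ratio, so the damped signal in the usage example returns the same expectation value as the undamped one. A production fit would use more samples and a least-squares or matrix-pencil variant to cope with sampling noise.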
One might worry that the sum in Eq. (44) is over an exponentially large number of eigenstates |E_j⟩. However, one need not resolve all eigenvalues E_j in order to accurately estimate the expectation value ⟨Ψ(θ)|H|Ψ(θ)⟩; if eigenvalues within δ of each other are binned, the resulting expectation value will be accurate to within δ. We may formalize this by considering the spectral function g_S of |ψ_s⟩ under H. This can be seen to be the Fourier transform of the phase function g(t) [strictly, g(t) is the inverse Fourier transform of g_S(E/2π)], and a coarse-grained approximation may be obtained via time-series methods [34] or integral methods [45] with rigorous bounds on each. Numerically, we find that signal processing methods such as Prony's method [33] also perform acceptably (see Sec. V D). For fast-forwardable Hamiltonians (such as Pauli operators), one often already knows the target eigenvalues of the problem. Furthermore, the eigenspectrum of these Hamiltonians is often highly degenerate, making simple curve fitting a practical (and attractive) alternative. Instead of analyzing the phase function at many points as described above, one may expand g(t) for short times t and simply estimate Im[g(t)]. This is similar to the manner in which eigenphases are estimated in the WAVES protocol [55] (sans verification). In this case, the normalization of the resulting amplitudes [Eq. (25)] must be achieved by the condition that g(0) = Σ_j A_j.
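For an operator whose eigenvalues are known to be ±1 (e.g., a Pauli operator), the curve fit collapses to a closed form. The sketch below assumes g(t) = A_+ e^{it} + A_− e^{−it}, so that g(0) = A_+ + A_− and Im[g(t)] = (A_+ − A_−) sin t. The short-time variant replaces sin t by t. The function names and the uniform damping factor in the usage example are illustrative assumptions.

```python
import cmath
import math


def pauli_expectation(g_t, g_0, t):
    """Exact two-level fit: (A+ - A-)/(A+ + A-) = Im g(t) / (sin(t) g(0))."""
    return g_t.imag / (math.sin(t) * g_0.real)


def pauli_expectation_short_time(g_t, g_0, t):
    """First-order version: Im g(t) ~ t * sum_j A_j E_j for small t."""
    return g_t.imag / (t * g_0.real)


def phase_function(a_plus, a_minus, t, damping=1.0):
    """Model phase function for eigenvalues +1 and -1, uniformly damped."""
    return damping * (a_plus * cmath.exp(1j * t) + a_minus * cmath.exp(-1j * t))


if __name__ == "__main__":
    t = 0.1
    g_t = phase_function(0.8, 0.2, t, damping=0.9)
    g_0 = phase_function(0.8, 0.2, 0.0, damping=0.9)
    print(pauli_expectation(g_t, g_0, t))             # close to 0.8 - 0.2
    print(pauli_expectation_short_time(g_t, g_0, t))  # slightly smaller
```

Dividing by g(0) is what removes the uniform damping. The short-time estimate is biased by a factor sin(t)/t, which is why it is only appropriate when t is small compared with the inverse of the largest eigenvalue.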

A. Fast-forwarded and parallelized Hamiltonian decompositions
As expectation values are linear, we may estimate ⟨H⟩ by splitting H into multiple terms, H = Σ_s H_s, estimating the expectation values of each term individually, and resumming: ⟨H⟩ = Σ_s ⟨H_s⟩. If individual H_s may be simulated at a lower circuit depth, this can reduce the accumulation of unmitigated errors, at the cost of requiring more simulation. This ability becomes especially useful if one chooses the H_s to be fast forwardable. Here, we define a fast-forwardable Hamiltonian H_s as one for which a circuit implementation of e^{iH_s t} has constant depth in t. The circuit depth required to simulate e^{iHt} for arbitrary H is bounded below as Ω(t) [37], but, for certain operators, this may be improved on [56]. For example, as the Pauli operators {1, X, Y, Z}^{⊗N} are both fast forwardable and form a basis for the set of N-qubit Hermitian operators, a set of H_s terms may be taken from these to decompose an arbitrary Hamiltonian. As another example, given an instance of the electronic structure problem, one may attempt a low-rank factorization of the interaction operator into a sum of O(N) diagonalizable (and thus fast-forwardable) terms [57].
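The Pauli decomposition and the resummation of expectation values can be illustrated with a small dense-matrix sketch. It uses only the standard relation c_s = Tr(P_s H)/2^N for the coefficient of Pauli string P_s. The helper names, the two-qubit example Hamiltonian, and the test state are illustrative choices, and the brute-force loop over all 4^N strings is only sensible for very small N.

```python
from itertools import product

PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def pauli_string(label):
    """Dense matrix of a Pauli string such as 'XZ' (first letter acts on qubit 0)."""
    mat = [[1]]
    for ch in label:
        mat = kron(mat, PAULIS[ch])
    return mat


def pauli_decompose(h, n_qubits):
    """Return {label: c} such that h = sum over labels of c * P_label."""
    dim = 2 ** n_qubits
    coeffs = {}
    for label in map("".join, product("IXYZ", repeat=n_qubits)):
        p = pauli_string(label)
        c = sum(p[i][j] * h[j][i] for i in range(dim) for j in range(dim)) / dim
        if abs(c) > 1e-12:
            coeffs[label] = c.real   # real for Hermitian h
    return coeffs


def expectation(op, state):
    """<state|op|state> for a normalized state vector."""
    dim = len(state)
    total = sum(
        state[i].conjugate() * op[i][j] * state[j]
        for i in range(dim) for j in range(dim)
    )
    return total.real


if __name__ == "__main__":
    zz, xi = pauli_string("ZZ"), pauli_string("XI")
    h = [[zz[i][j] + 0.5 * xi[i][j] for j in range(4)] for i in range(4)]
    coeffs = pauli_decompose(h, 2)
    state = [0.6, 0.0, 0.0, 0.8]
    direct = expectation(h, state)
    resummed = sum(c * expectation(pauli_string(s), state) for s, c in coeffs.items())
    print(coeffs, direct, resummed)
```

In a VPE experiment each ⟨P_s⟩ in the resummation would come from its own verified phase estimation run rather than from a state vector, but the bookkeeping is the same.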
In order to speed up estimation of expectation values of multiple terms H_s in a decomposed Hamiltonian H = Σ_s H_s, it may be possible to perform the verified phase estimation step of each H_s in parallel. For example, we can perform time evolution of L multiple summands, each controlled by a different control qubit, in between the preparation and verification steps of a single instance. In the absence of verification, such parallelization will not affect the outcome of quantum phase estimation of any individual H_s, so long as all terms estimated in parallel commute. This follows immediately from the fact that the time evolution for one such term does not evolve the system between eigenspaces of another. This is complicated by the addition of verification, as the additional circuitry means that the system may evolve away from |ψ_s⟩ despite a specific control qubit being in |0⟩. In Appendix B, we show that this gives rise to a set of spurious signals in the estimated phase function g^{(s)}(t): Here, the ghost eigenvalues are where the E_j^{(s′)} are the true eigenvalues of the Hamiltonians H_{s′} and v is an L-bit vector written in binary (i.e., v_s ∈ {0, 1}). The corresponding, v-independent amplitudes are Although this is a far more complicated signal than the standard phase function g(t), we calculate in Appendix B that it yields the same expectation value. This implies that verified parallel phase estimation may proceed in much the same way as the series protocol.

B. Comparison to other methods of error mitigation
Error mitigation techniques differ vastly, both in their cost to implement and their effectiveness against different forms of noise. This implies that care needs to be taken in a real experiment to choose the best mitigation technique (or combination of mitigation techniques) for the job. Though a comparison between multiple techniques in a realistic setting lies outside the scope of this work, we give some predictions here on how VPE might compare in performance to other mitigation techniques, and whether it might be possible to combine it with different techniques. We can classify all error mitigation techniques that the authors know of into the following broad categories.
(a) Circuit design: many forms of noise may be mitigated by careful design of a circuit to, e.g., minimize crosstalk between simultaneous gates [58], cancel out Z over- or under-rotation (e.g., via echo pulses [59]), or optimize a circuit variationally to cancel out control parameter drift on a long timescale [18,60]. (Whether or not this counts as error mitigation or calibration of the underlying quantum device is left to the reader to decide.) Depending on the source of noise these techniques may significantly reduce or even nullify its effect, which may be far more effective than VPE. On the other hand, noise sources such as T1 error cannot be easily calibrated away (due to the associated photon loss); in these situations (where VPE performs quite well) these methods will have little effect. VPE is clearly compatible with any such techniques, as these consist of adjustments to the implementation of a given circuit rather than an algorithmic overhead.
(b) Postselection or verification techniques: this class of techniques uses knowledge of the problem to restrict the state of the quantum device to within a small region of the N-qubit Hilbert space, often by leveraging symmetries of the Hamiltonian of the problem to be solved. VPE itself falls into this category, alongside symmetry verification [29,30], and quantum subspace expansion techniques [23,24]. The performance of these techniques is dependent on their ability to catch errors outside the allowed Hilbert space, so, as the dimension of the Hilbert space for VPE is only 2, we expect it to have greater mitigation power in general than these other techniques. (This can be observed in Appendix G, where VPE shows an asymptotic improvement over symmetry verification in a small numerical simulation.) However, as the circuit depth of VPE is typically far longer than that of other postselection or verification techniques (which can be achieved in some cases without any additional circuitry), the requirements on the number of measurements to overcome sampling noise will be significantly worse. As these techniques overlap in their effect on the quantum state, it is not generally possible to combine them; instead one should choose the best trade-off between mitigation power and the number of measurements.
(c) Error extrapolation techniques: assuming that one can artificially introduce noise into a system, these techniques rely on parameterizing the output f of a quantum circuit as a function of a "noise parameter" f = f(λ), fitting a functional form, and extrapolating to λ = 0. The noise parameter can either be adjusted experimentally (e.g., by adjusting the wait time or detuning of an underlying gate) [25,27] or algorithmically (e.g., by inverting noisy gates [61]). The mitigation power of such a technique depends on how well the noise can be tuned as a function of this single parameter, and how well one can pin down a functional form for f(λ). This is not easily comparable to VPE, as the physical source of the mitigation is qualitatively different. We expect that the relative performance will depend on the experiment and the hardware itself. In theory these methods could be combined with VPE (either by extrapolating the phase function or the VPE result). However, it is unclear whether the output of VPE will be more challenging to fit, reducing the effectiveness of the extrapolation.
(d) Result extrapolation techniques: instead of fitting the output f of a quantum circuit to an artificial noise term, one can consider comparing the output of similar quantum circuits tailored to efficient classical simulation. This technique has been demonstrated experimentally in Refs. [58,62], and proposed within a VQE setting (by tuning the parameters to points where the solution is known) [63]. In some sense VPE can be considered to be similar to these methods, with the |ψ_s⟩|0⟩ or |ψ_r⟩ states providing an entangled reference state for the target evolution. However, this relationship is not completely clear, as VPE strictly relies on the coherence between the two states. Understanding this similarity is a clear avenue for future research. Regardless, VPE should be able to be combined with at least some of these techniques to provide yet more mitigation power.
(e) Probabilistic cancelation techniques: given knowledge of the true process maps of the gates being performed on a quantum device, one can in principle construct families of quantum circuits that, when combined, yield a target noiseless result [25,27]. However, these methods require much additional characterization of the device, which is a problem in systems with large amounts of drift. In principle, given sufficient knowledge of the noise, this method works perfectly, but at a greatly increased measurement cost, making it difficult to make a fair comparison in a theoretical setting. Testing this method against VPE in a real experiment would be an interesting target for future research.
(f) Purification techniques: as the output of a quantum algorithm is often ideally pure, these techniques attempt to reduce errors by mapping a noisy impure state to a purer one. This may be achieved, e.g., for free-fermion states via McWeeny purification [4], or for more general states via virtual distillation [64]. For more complex states, the McWeeny process cannot be used, but it has proven remarkably effective when available. Virtual distillation and VPE appear to be remarkably similar in their increased measurement cost and their mitigation performance, as well as their circuit structure. Understanding this similarity and comparing the two in more detail is a clear avenue for future research.

V. NUMERICAL EXPERIMENTS
To investigate the mitigation capability of verified phase estimation, we first use it for expectation value estimation. To prepare states, we take different variational ansätze with randomly drawn parameters. We compare the performance of verified and unverified circuits across multiple target Hamiltonians, noise strengths, and noise models to attempt to identify trends in the method. All simulations are executed using the Cirq quantum software development framework [65] and simulators therein. Hamiltonians and complex circuits are further generated using code from the OpenFermion [66] libraries. Except where mentioned, the Cirq noise models are chosen to be a constant error rate per qubit per moment, where a moment is a period of the circuit where gates occur. Equivalently, this can be thought of as an error rate per qubit per gate, but including error on idling gates as well. The noise models considered are not as complex as those typically observed in experiment (which are typically highly nonuniform, and can include crosstalk and non-Markovianity alongside other effects), but we expect our results should provide a suggestion of the mitigation power of this method in a real quantum device.

A. Givens rotation circuits for free-fermion Hamiltonians
We first test the mitigation ability of the verification protocol on an instance of a "Givens rotation circuit" of the form developed for implementing rotations of single-particle fermionic basis functions in Ref. [67]. This circuit takes the form where c†_j and c_j are the creation and annihilation operators for a fermion on site j, and θ_{j,l} = θ_{l,j}. Such a circuit is classically simulatable, but it is a critical piece of infrastructure in quantum computing applications for quantum chemistry [4,11,13,31,57]. It is also low depth: it may be decomposed exactly by a sequence of matchgates [68], with optimal compilation in a circuit depth of exactly N. When acting on an N-qubit register prepared in the state ∏_{n=0}^{N_f−1} X_n|0⟩, this may prepare an arbitrary ground state of a free-fermion Hamiltonian with N_f particles by an appropriate choice of θ. In this work, we take a simple free-fermion Hamiltonian as an example, namely a one-dimensional chain Such a Hamiltonian may be diagonalized, where V here takes the same form as in Eq. (55). This decomposition allows immediately for the fast forwarding of time evolution, as As the Givens rotation circuits conserve particle number, the vacuum |0⟩ may be used as a reference state for control-free verified estimation. A superposition of this reference state and starting state U(θ)∏_{n=1}^{N_f} X_n|0⟩ may be prepared by acting the Givens rotation circuit on the Greenberger-Horne-Zeilinger (GHZ) state which may itself be prepared by, e.g., a chain of controlled-NOT (CNOT) gates: Note here the backwards product that runs left to right (i.e., the CNOT gate between qubit 1 and qubit 0 is executed first). Following the definitions in Sec. III D for verified control-free phase estimation, we can write the complete preparation unitary as Then, as the product of two Givens rotation circuits is itself a Givens rotation circuit [67], we may compile VU(θ) = U(θ′) and implement this in a single Givens rotation circuit.
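The GHZ preparation by a Hadamard gate followed by a chain of CNOT gates can be checked with a tiny state-vector sketch. This is a generic illustration rather than the simulation code used in the text: it entangles all n qubits of the register, takes qubit 0 as the most significant bit, and runs the chain from qubit 0 outwards. The function names are hypothetical.

```python
def apply_hadamard(state, qubit, n):
    """Apply a Hadamard gate to `qubit` of an n-qubit state vector."""
    mask = 1 << (n - 1 - qubit)      # qubit 0 is the most significant bit
    s = 2 ** -0.5
    new = [0.0] * len(state)
    for idx, amp in enumerate(state):
        if amp == 0:
            continue
        if idx & mask:               # |1> -> (|0> - |1>)/sqrt(2)
            new[idx ^ mask] += s * amp
            new[idx] -= s * amp
        else:                        # |0> -> (|0> + |1>)/sqrt(2)
            new[idx] += s * amp
            new[idx | mask] += s * amp
    return new


def apply_cnot(state, control, target, n):
    """Apply a CNOT gate: flip `target` on basis states where `control` is 1."""
    cmask = 1 << (n - 1 - control)
    tmask = 1 << (n - 1 - target)
    new = [0.0] * len(state)
    for idx, amp in enumerate(state):
        new[idx ^ tmask if idx & cmask else idx] += amp
    return new


def ghz_state(n):
    """Prepare (|00...0> + |11...1>)/sqrt(2) from |00...0>."""
    state = [0.0] * (2 ** n)
    state[0] = 1.0
    state = apply_hadamard(state, 0, n)       # Hadamard on the target qubit
    for q in range(n - 1):                    # CNOT chain, qubit 0 first
        state = apply_cnot(state, q, q + 1, n)
    return state


if __name__ == "__main__":
    print(ghz_state(4))
```

Running the same gates in reverse order maps the all-ones component back to the basis state with only qubit 0 excited, which is the structure the verification step relies on: every qubit other than the target should then read out 0.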
The complete VPE circuit for this system consists of the GHZ preparation, a single Givens rotation, a set of single-qubit Z rotations, uncomputing the Givens rotation, uncomputing the GHZ preparation, and measurement in the X or Y basis. The resulting circuit for verified phase estimation is more than twice the length of the circuit required for the unmitigated VQE. We assume here that the VQE tomography does not require any additional overhead, and directly estimate the expectation value from the simulated density matrix. For verified phase estimation, we extract the phase function from the simulated density matrix, and then process it to estimate expectation values using Prony's method. In order to not bias the final readout (which can lead to significant error in estimation), we average the rotation into the X and Y bases over both +π/2 and −π/2 rotations (see Sec. III C 2). To simplify the analysis here, we do not include additional sampling noise. In Fig. 3, we plot the rms error for two noise models over a range of noise strengths. For each noise model and at each strength, we sample 50 random choices for the initial parameters θ [and set t = 1 in Eq. (56)]. In the presence of a uniform single-qubit depolarizing channel (Fig. 3, left), we see that the verified error displays a clear ε ∼ p^2 trend (where ε is the error in the final estimation and p is the error per qubit per moment). This implies that the effects of all single errors in this noise model are suppressed by the error mitigation (or fortuitously cancel), but that pairs of errors near to each other in time may affect results. Under the effect of an amplitude and phase damping channel (Fig. 3, right), the suppression is even starker; we see a clear ε ∼ p^3 trend until the error drops to below 10^−5, providing up to 4 orders of magnitude gain in precision. Below 10^−5 the error plateaus. This is due to numerical stability issues with Prony's method, and not a fundamental limit of the procedure [69]. This level of estimation error only becomes relevant after > 10^10 p_err^−2 individual shots have been taken (with p_err the probability of an error over the entire circuit). As such, we expect this to not be relevant for most experiments. The lower error rate makes some sense: amplitude damping errors can only ever reduce the number of excitations in the circuit, and so by themselves can never return to a state with nonzero overlap with |ψ_s⟩. However, the precise mode for the leading contribution to the error rate is still somewhat unclear.
FIG. 3. Mitigation of a four-qubit Givens rotation circuit via verified phase estimation. Left: error in estimation of random states in a free-fermion system [Eq. (56)] under a uniform depolarizing channel. Right: error in the same estimation, but this time under an amplitude and phase damping model. In both plots, the rms error (crosses) is calculated over 50 different estimations for each error rate using either standard partial state tomography (red) or using verified control-free phase estimation. Individual data points (dashes) are additionally shown. For reference, dashed lines showing linear (red), quadratic (black), and cubic (blue) dependence on the gate error rate are plotted.
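Scaling trends such as ε ∼ p^2 or ε ∼ p^3 can be read off as the slope of a straight-line fit on a log-log plot. The sketch below shows that fit using ordinary least squares. The data in the usage example are synthetic and exist only to exercise the function. They are not results from the simulations described here.

```python
import math


def loglog_slope(error_rates, errors):
    """Least-squares slope of log(error) against log(error rate)."""
    xs = [math.log(p) for p in error_rates]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Synthetic example: errors generated as 3 * p**2, so the slope is 2.
    ps = [1e-4, 3e-4, 1e-3, 3e-3, 1e-2]
    eps = [3 * p ** 2 for p in ps]
    print(loglog_slope(ps, eps))
```

When fitting real data, points in a plateau region (such as the one attributed above to the numerical stability of Prony's method) should be excluded, since they would pull the fitted exponent toward zero.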

B. The variational Hamiltonian ansatz for the transverse-field Ising model
We next attempt the verification of a completely different model and ansatz. The transverse-field Ising model (TFIM) is a well-known spin system, with Hamiltonian where we take the sum j + 1 modulo N (i.e., periodic boundary conditions). In one dimension, this model has a critical phase when J_z = J_x, making this a simple model to study interesting quantum phenomena. Exact ground states of this model may be found by the variational Hamiltonian ansatz (VHA) [70] for any values of J_x and J_z [71]. The VHA consists of alternating the Ising model and transverse field terms p times, with the evolution time in each layer treated as a free variable: (Note that, for this given model, the VHA is equivalent to the quantum alternating operator ansatz of Ref. [17].) The TFIM does not have any simple eigenstates, and neither does the VHA, so simple methods of control-free verified phase estimation are not available. Instead, we attempt single-control verified phase estimation. To lower the error incurred during the circuit, we perform VPE in series for every term in Eq. (62). Unfortunately, verification works significantly less well in this setting, as shown in Fig. 4.
For both noise models considered, we see a clear ε ∼ p trend, with ε the energy error in the final result and p the error per qubit per moment. This suggests that errors that map the noiseless state into one with nontrivial overlap with the verified density matrix are dominant in this circuit. Regardless, we note that verification does provide an approximate 8-fold improvement in error rate over the unmitigated circuit, despite the verification circuit requiring one additional qubit and being 3 times as long. This result is lessened in the presence of amplitude and phase damping noise, to the point where the mitigation only improves estimation by a factor of 2.
FIG. 4. Mitigation of a four-qubit VHA circuit via verified phase estimation. Left: error in estimation of the energy of random states generated by the quantum approximate optimization ansatz in the critical phase of the transverse-field Ising model [Eq. (62)] under a uniform depolarizing channel. Right: error in the same estimation, but this time under an amplitude and phase damping model. In both plots, the rms error (crosses) is calculated over 50 different estimations for each error rate (with randomly chosen ansatz parameters) using either standard partial state tomography (red) or using verified single-control phase estimation. Individual data points (dashes) are additionally shown. For reference, dashed lines showing linear (red) dependence on the gate error rate are plotted.
Variational optimization is well known to mitigate certain types of coherent noise (e.g., coherent parameter drift) [18,60]; it also appears to provide some mitigation of incoherent noise when in combination with verified phase estimation. In Fig. 5, we perform a variational outer loop over the circuit studied in Fig. 4. Although the ε ∼ p behavior appears to roughly remain in the latter half of the optimization, the gain from error mitigation improves from 2-8 times to around 50 times, a significant improvement. We note that the optimization is no longer variationally bound: below about 10^−2 error per qubit per moment, the results are scattered relatively evenly on either side of the true value. By contrast, in the absence of sampling noise partial state tomography results will always be variationally bound. We suspect this result may be due to the fact that slightly different circuits need to be run to measure different terms, yielding an "effective state" that lies slightly outside the positive cone of allowed physical quantum states. Though this effect does not appear to be particularly severe in this case, further study may be needed to see that it does not become an issue in larger experiments.
FIG. 5. [...] [Eq. (62)] by variational optimization of a VHA ansatz. The resulting expectation values are measured either by verified single-control phase estimation (black) or taken directly from the simulated state (red). We plot the median (crosses) of the absolute energy error over ten optimization attempts, each starting from a different initial point. Individual errors are plotted behind (faint dashes). Guide lines showing a linear dependence are additionally plotted (red dashed lines).

C. Fermionic swap networks for electronic structure Hamiltonians
As a final system for simulation, we move to studying the ability to verify molecular hydrogen on four qubits using a fermionic swap network. This ansatz was first studied in Ref. [67]; it consists of a network of two-qubit fermionic simulation gates, which take the form The parameters θ and φ are then left free to be optimized during the circuit. Molecular hydrogen is a simple example of the full electronic structure Hamiltonian, which takes the form Solving this Hamiltonian for mid-to-large system sizes (approximately 60+ qubits) with strong interactions is a key target application for quantum computers [11,13,72].
We study three different methods for verified expectation value estimation of the electronic structure Hamiltonian. Following a transformation from fermionic to qubit operators, Eq. (65), we first consider a decomposition over single Pauli operators for single-control VPE, as was performed for the transverse-field Ising model in Sec. V B. However, in order to perform control-free VPE on these terms, we require a reference state. Individual fermionic terms in Eq. (65) are number conserving, so the fermionic vacuum is a good reference state for these, but this is not the case for individual Pauli terms. To circumvent this problem, we split Eq. (65) into fermionic terms (summed with their Hermitian conjugate), and decompose these into Pauli operators. (One can check that the resulting Pauli operators commute, and so their time evolution may be easily fast forwarded.) The VPE circuits in both of the above methods are 3-4 times the depth of the original VQE.
Alternatively, by performing a low-rank factorization of the Coulomb operator, we may write H in the form [57] where the U_l are single-particle basis changes that may be implemented via Givens rotation circuits. Each such term in this factorization is fast forwardable. Here H^{(0)} is a free-fermion Hamiltonian and may be simulated via the methods discussed earlier in this section. The interacting factors H^{(l)} may also be diagonalized by diagonalizing the single-particle t^{(l)}_{i,j} matrices. One finds that which may be easily implemented on superconducting hardware, as e β is realized by a controlled-phase gate. All of the above Hamiltonians, as well as the fermionic swap network itself, conserve particle number, and so we may again use the vacuum as a reference state for verified control-free quantum phase estimation. We do not consider the single-control version for comparison in this case. The resulting circuit is over 10 times as long as the VQE itself, as we are unable to compile the final basis rotation into the ansatz.
The mitigation power of VPE differs vastly between the different choices of decomposition used, and the different noise models chosen. In Fig. 6, we plot the effect of mitigating depolarizing and amplitude and phase damping channels, using the three decompositions described above. We see that control-free [Figs. 6(a1)  demonstrate a second-order sensitivity to the physical qubit error rate, consistent with the previous results in Fig. 3. In this case, the Pauli decomposition clearly outperforms the low-rank factorization, which we attribute to the large reduction (approximately 2-3 times) in total circuit depth. However, although the low-rank factorization repeats the third-order sensitivity to amplitude and phase damping seen in Fig. 3 [Fig. 6(a2)], this is not observed in the Pauli decomposition case [Fig. 6(b2)]. We investigate this further in Appendix F, and find that this first-order error can be traced back to the verified estimation of a single term: the two-body interaction term. We attribute this to the fact that the time evolution circuit for this term breaks number conservation (which is not the case for any other term in the sum), which makes it more susceptible to amplitude damping noise. Understanding this feature in detail, and determining whether better circuit optimizations exist, are clear targets for future research. In any case, all three implementations of VPE studied show at least an order of magnitude improvement compared to partial state tomography, and in some cases up to 3 orders of magnitude improvement, demonstrating the power of this technique.

D. Sampling costs
In a realistic experiment, direct estimation of any expectation value requires repeatedly repreparing the target state and measuring in an appropriate basis to accumulate statistics on the probability of seeing a given 0 or 1 measurement. In verified phase estimation, this repetition must be performed instead on the control qubit (for single control) or target qubit (for control free) to accumulate the phase function. Repreparation is necessary between subsequent measurements, as such a measurement collapses the global wavefunction, erasing the information about the probability to be estimated. This implies that each repetition carries substantial cost, and the rate at which the estimation error converges is a critical bottleneck in any variational algorithm. Although one might expect quantum phase estimation to speed up this estimation (which has been proposed previously [54]), this is only the case when one is estimating eigenvalues of the target Hamiltonian in a specific QPE instance. We wish to divide up our Hamiltonian for fast-forwarding purposes, and in most cases the resulting terms will not be simultaneously diagonalizable, so no set of mutual eigenstates will exist; instead, the results of Sec. III C 1 will hold. Furthermore, as our expectation value estimation requires summing over multiple different amplitudes, we should not expect this to improve over the cost of partial state tomography (which requires noncommuting terms to be measured on separate preparations of the state). The error in expectation value estimation will further depend on the type of classical postprocessing used. In Fig. 7, we compare the convergence of two types of classical postprocessing to that of standard partial state tomography. We perform this simulation on the four-spin VHA-TFIM system studied in Figs. 4 and 5, on a representative point in the spectrum (the error-free variational minimum). We do not perform any measurement grouping or parallelization strategies for either method, and instead report our results as a function of the number of measurements per Pauli operator. The first method (green) assumes knowledge about the eigenvalues of the fast-forwarded Hamiltonians, in which case one need only fit the amplitudes, while the second (blue) first estimates the eigenvalues using Prony's method before fitting the amplitudes to the resulting signal. (We compensate for the presence of spurious phases in Prony's method by a slight adjustment described in Appendix C.) All methods of estimation are seen to converge at a rate $\epsilon \sim M^{-1/2}$, where $\epsilon$ is the estimation error and M is the number of samples taken.
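The known-phase postprocessing described above can be illustrated with a minimal sketch. It assumes a phase function of the form $g(t) = \sum_j A_j e^{i E_j t}$ with known eigenvalues $E_j$, a uniform damping factor standing in for symmetric noise, and independent binomial sampling of the real and imaginary parts of $g(t)$. The function names, the sampling model, and the choice of evolution times are assumptions made for illustration and are not taken from the simulations reported here.

```python
import numpy as np


def simulate_phase_function(amplitudes, energies, times, shots, damping, rng):
    """Return a sampled estimate of damping * sum_j A_j exp(i E_j t).

    Assumed model: the real and imaginary parts at each time are each
    estimated from `shots` single-qubit measurements with outcomes +1/-1.
    """
    g_exact = damping * (np.exp(1j * np.outer(times, energies)) @ amplitudes)

    def sample(mean):
        # Probability of the +1 outcome for a +/-1 variable with this mean.
        p_plus = np.clip((1.0 + mean) / 2.0, 0.0, 1.0)
        return 2.0 * rng.binomial(shots, p_plus) / shots - 1.0

    return sample(g_exact.real) + 1j * sample(g_exact.imag)


def fit_known_phases(g, energies, times):
    """Least-squares fit of the amplitudes A_j given known eigenvalues E_j.

    The fitted amplitudes are renormalized to sum to one before the
    expectation value sum_j A_j E_j is formed, so a uniform damping of
    the signal cancels out.
    """
    design = np.exp(1j * np.outer(times, energies))
    coeffs, *_ = np.linalg.lstsq(design, g, rcond=None)
    amps = coeffs.real
    amps = amps / amps.sum()
    return float(amps @ energies)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    energies = np.array([-1.0, 1.0])   # eigenvalues of a single Pauli term
    amplitudes = np.array([0.8, 0.2])  # exact expectation value is -0.6
    times = 0.5 * np.arange(8)
    for shots in (100, 10_000, 1_000_000):
        g = simulate_phase_function(amplitudes, energies, times, shots, 0.5, rng)
        print(shots, fit_known_phases(g, energies, times))
```

In this toy model the estimate approaches the exact value as the number of shots grows, and the damping factor only affects how many shots are needed, not the value converged to.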
We see that using the prior knowledge of the phases gives a significant advantage in convergence, with the resulting error rate being almost an order of magnitude worse when using Prony's method. This advantage persists in the presence of a depolarizing channel (1% error rate), although the convergence of all methods flattens as they approach the sampling-noise-free estimation value. We note that both classical postprocessing methods converge to the same result here, as expected. It is unclear whether the good overlap between the unverified circuit and the phase fitting method is due to them both achieving a lower bound for convergence or just coincidence. Further investigation here would be a good target for future work. The addition of noise makes convergence more costly. This increase can be bounded below by removing the fraction of experiments where at least one error has occurred (as we are at best effectively removing these results). Confirming this trend would also be a good target for future work.
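As a rough illustration of this lower bound, the sketch below assumes a simple model in which each qubit suffers an error with probability p in each moment of the circuit, so that the probability of no error over an N-qubit circuit of depth d is $(1-p)^{Nd}$. If only error-free runs contribute useful signal, the number of repetitions must grow by at least the inverse of this probability. The formula and the function names are assumptions made for illustration, not results of the simulations above.

```python
def no_error_probability(p, n_qubits, depth):
    """Probability that no error occurs anywhere in the circuit.

    Assumed model: independent errors with probability p per qubit per moment.
    """
    return (1.0 - p) ** (n_qubits * depth)


def min_sampling_overhead(p, n_qubits, depth):
    """Lower bound on the factor by which the number of repetitions grows
    if every run containing at least one error is effectively discarded."""
    return 1.0 / no_error_probability(p, n_qubits, depth)


if __name__ == "__main__":
    # Hypothetical example: four qubits, 25 moments, 1% error per qubit per moment.
    print(no_error_probability(0.01, 4, 25))
    print(min_sampling_overhead(0.01, 4, 25))
```

Under this model the overhead is exponential in the circuit volume Nd, which is why the bound is only mild for the small circuits considered here.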

VI. CONCLUSION
In this work, we present a new method for error mitigation, based on verification of the system register in a single-control quantum phase estimation routine. We further extend this method to a scheme for verification of control-free quantum phase estimation. By writing a complex Hamiltonian as a sum of fast-forwardable parts and using this technique to estimate the expectation value of each part, this becomes a powerful error mitigation tool for near-term experiments such as variational algorithms. Errors that take the system away from the small verified subspace do not affect the mitigated QPE results (at the cost of requiring additional repetitions of the circuit). We perform numerical studies of this error mitigation capability of the verification protocol on three different systems, finding the suppression of all single depolarizing errors when a Givens rotation circuit or a fermionic swap network prepares random states of a small fermionic system. The suppression is further magnified in the presence of amplitude and phase damping, resulting in a gain of up to 4 orders of magnitude in accuracy. For a simulation of the transverse-field Ising model, the error suppression is less pronounced. However, we find that variational optimization improves the error mitigation to a gain in accuracy of about 50-fold. We further demonstrate that the combination of variational optimization and verification mitigates against constant control error (which is not naturally mitigated by the verification itself). However, we find that the choice of classical postprocessing technique may affect the estimation error by a factor of 10 in the presence of sampling noise.
Though verified phase estimation as presented already appears to be one of the most powerful error mitigation techniques available to NISQ-era quantum computing, further avenues for optimization exist. The wide range of possible options for verification, how to divide the Hamiltonian, and the classical postprocessing method all provide metaparameters that we have not yet determined how to optimize for any specific problem. Furthermore, circuits that quickly scramble errors would appear to make verification more reliable. Whether this observation can be used for meaningful optimization is a clear target for future work. Similarly, as errors need to have the instantaneous state as a near eigenstate to not fail verification, the errors that verified phase estimation is most susceptible to must commute, and could potentially be corrected with a classical error-correcting code. As these codes require much less overhead than full-blown QEC, this may be a practical method to ensure universal suppression of single-qubit errors. Future work could also investigate whether verified phase estimation may be combined efficiently with other error mitigation techniques. More generally, it would be timely to benchmark the zoo of error mitigation techniques against one another, and determine which combination of techniques works best in a range of situations.

removed by classical postprocessing techniques [33,42,51]. However, the shrinking of the signal increases the sampling requirements to estimate g(t) exponentially in t.
Although random error channels are exponentially suppressed by verification [following Eq. (A22)], realistic error models are biased, and may apply undesired phases to $g^{(\mathrm{err})}_\tau(t)$ instead of setting it to 0. The density matrix in Eq. (A16) is not normalized, but it must be positive, which implies that When this deformation is asymmetric around the z axis, or a rotation, g(t) may be quickly corrupted [73]. However, symmetric noise (such as a depolarizing channel, or $T_1$ or $T_2$ channels during the bulk of the circuit) can be seen to simply dampen g(t) in an identical manner to N. That is, the dampening will depend only on the rate at which these errors occur. Such dampening will be canceled by renormalization, as observed in Fig. 9. Errors that do not rotate between $|0_v\rangle$ and $|1_v\rangle$, but still contribute nontrivially to $g^{(\mathrm{err})}_\tau(t)$ to first order, must have both $|0_v\rangle$ and $|1_v\rangle$ as approximate eigenstates of the error channel. This suggests a reason why control-free VPE is more robust to noise than single-control VPE: the starting and reference states are very different when looked at locally, which makes it less likely that a single local error will have both states as near eigenstates. It also suggests a reason why we might expect the suppression of errors to be only second order: if the same error occurs in subsequent moments (in a local frame), and the basis states $|0_v\rangle = |0_v(\tau)\rangle$ have not evolved significantly between these moments, the second error will almost (but not completely) cancel out the first, driving the system back into the verified subspace in an uncorrectable manner. This implies that a circuit that more quickly scrambles the basis states $|0_v\rangle$ and $|1_v\rangle$ between moments should be less susceptible to error than one where the states evolve slowly. Understanding the dynamics of these noisy circuits in more detail is a clear target for future work.

ISWAP$^{1/2}$ gate. We additionally add amplitude and phase damping noise at a rate p/2. In Fig. 10, we plot the result following optimization via the COBYLA algorithm implemented in scipy [74], in the absence of sampling noise. We see that the verification circuit is insensitive to the incoherent noise as expected, and behaves similarly to the effect of amplitude and phase damping alone (right-hand side plot of Fig. 3 of the main text).

APPENDIX F: TERMWISE COMPARISON OF VPE PERFORMANCE
To attempt to further understand the ability of VPE to mitigate errors, in this appendix we consider the effect of estimating different types of terms on the same preparation circuit. We consider the fermionic swap network used in Sec. V C of the main text to prepare states for a first-order sensitivity to this error model, whilst the low-rank factorization demonstrated a third-order sensitivity to the same model.) We see that the $H_s = Z_0 Z_1$ term (left plot) shows the cubic dependence on error rate observed in previous amplitude-damping experiments, whilst the two-body scattering term (right plot) does not. This two-body scattering term is the only term contributing to the first-order decay of the VPE estimation observed in Fig. 6(b2) of the main text; all other terms in the decomposition display similar decay to the left-hand side plot of Fig. 11. This indicates that the errors to which we are first-order sensitive occur during the circuit implementation of $e^{iH_s t}$, and not the state preparation.
The circuit implementing $e^{iH_s t}$ for the two-body scattering term is the only such circuit that does not conserve number throughout. (Instead, this evolution is achieved in two steps: a basis transform of XY, YX → IZ, ZI on pairs of qubits, ZZ rotations between the pairs and uncomputing, and then a basis transform of XX, YY → IZ, ZI on pairs of qubits, ZZ rotations between the pairs, and uncomputing again.) Finding decompositions of these circuits more amenable to VPE is a clear target for future work.

APPENDIX G: COMPARISON TO SYMMETRY VERIFICATION
In this section we present a comparison of verified phase estimation and symmetry verification on a depolarizing noise model, using the experiment in Fig. 6(b1) of the main text. In order to improve performance, we choose to verify on the number operator $\sum_i Z_i$, instead of the parity $\prod_i Z_i$. To perform symmetry verification, we take the quantum state prepared by the circuit, and directly project this into the number-conserving space. (In a real experiment simultaneous readout of the number operator and all terms is possible [4], but requires a slight addition of circuitry, which would increase the final error slightly.) In Fig. 12, we observe that while symmetry verification reduces error by around an order of magnitude, it does not provide the same asymptotic improvement as VPE. We also note that VPE improves over symmetry verification at all error rates, despite having a circuit over 3 times as deep. This is to be expected; as phase ($Z_i$) errors commute with the number operator, these cannot be detected by symmetry verification and so contribute at first order to the final error rate.
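The projection step and its blind spot for phase errors can be illustrated on a toy example. The sketch below is not the H 2 simulation of Fig. 12; it assumes a two-qubit, one-particle state $(|01\rangle + |10\rangle)/\sqrt{2}$, a single error applied with probability q, and the observable $(XX + YY)/2$. All names and parameter values are hypothetical.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def number_projector(n_qubits, n_particles):
    """Projector onto computational basis states with a fixed number of 1s."""
    weights = np.array([bin(i).count("1") for i in range(2 ** n_qubits)])
    return np.diag((weights == n_particles).astype(float))


def symmetry_verify(rho, projector):
    """Project rho into the symmetry sector and renormalize.

    Returns the verified state and the acceptance probability.
    """
    projected = projector @ rho @ projector
    accept = np.trace(projected).real
    return projected / accept, accept


def noisy_state(psi, error, q):
    """Mixture: no error with probability 1 - q, `error` with probability q."""
    rho = np.outer(psi, psi.conj())
    return (1 - q) * rho + q * error @ rho @ error.conj().T


def expectation(rho, observable):
    return np.trace(rho @ observable).real


if __name__ == "__main__":
    psi = np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2)
    hop = (np.kron(X, X) + np.kron(Y, Y)) / 2  # exact expectation value is 1
    proj = number_projector(2, 1)
    for name, err in (("bit flip", np.kron(X, I2)), ("phase flip", np.kron(Z, I2))):
        rho = noisy_state(psi, err, 0.1)
        verified, accept = symmetry_verify(rho, proj)
        print(name, expectation(rho, hop), expectation(verified, hop), accept)
```

In this toy model the bit-flip error leaves the one-particle sector and is removed by the projection, while the phase-flip error commutes with the number operator and passes verification unchanged, which is the mechanism behind the first-order scaling noted above.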

FIG. 1. Process diagram of the protocol for verified estimation of the expectation value of a Hamiltonian on a state $|\psi\rangle = U_p|\vec{0}\rangle$. Blue denotes circuits to be executed or data to be extracted from a quantum computer; red denotes signal details to be estimated via classical postprocessing. The protocol proceeds as follows. Top left: a complex Hamiltonian H is split into a number of fast-forwardable summands $H_s$. The spectral function g(t) of $|\psi\rangle$ under time evolution of each piece is obtained (bottom left) via verified, fast-forwarded phase estimation. In this example, a control qubit is used to extract the phase function via phase kickback. The resulting data form a weighted sum of oscillations with frequencies equal to the eigenvalues $E^{(s)}_j$ of the corresponding factor (bottom middle). This may be decomposed in a variety of classical postprocessing techniques to obtain estimations of the expectation values $\langle H_s\rangle$ depending on the type of $H_s$ chosen (bottom right). Regardless of the method used, the expectation values must be normalized to obey Eq. (24), the last step in the verification process. As the expectation value is linear, the verified estimates of $\langle H_s\rangle$ obtained may be immediately summed together to give a verified estimate for $\langle H\rangle$ (top right).

FIG. 5. Error in estimating the ground state energy of a four-site transverse-field Ising model [Eq. (62)] by variational optimization of a VHA ansatz. The resulting expectation values are measured either by verified single-control phase estimation (black) or taken directly from the simulated state (red). We plot the median (crosses) of the absolute energy error over ten optimization attempts, each starting from a different initial point. Individual errors are plotted behind (faint dashes). Guide lines showing a linear dependence are additionally plotted (red dashed lines).

FIG. 7. Convergence of the estimation of a single point in a four-site transverse-field Ising model with the number of samples taken, using verified phase estimation processed either with Prony's method (blue) or by fitting known phases to the phase function (green), or standard partial state tomography (red) on individual Pauli terms. Left: convergence in the absence of error. Right: convergence in the presence of 1% depolarizing error per qubit per moment. In each subfigure we plot the median energy error (crosses and lines) over 200 simulations, which are plotted themselves behind (faint dashes).

FIG. 10. Error in estimating the ground state energy of a free-fermion system [Eq. (58) of the main text] of four fermions (on four qubits), using control-free verified phase estimation and a VQE. The noise model is a mixture of amplitude and phase damping and constant two-qubit control error (details in text). Median absolute errors for both verified estimation (black crosses) and standard partial state tomography (red crosses) are calculated over ten different optimization attempts. Individual simulations are plotted behind (faint dashes). Each optimization starts from a different parameter set. Linear (red dashed) and cubic (blue dashed) lines are shown as guides.
H 2 Hamiltonian. When this was split into number-conserving Pauli operator sums [Figs. 6(b1)-(b2) of the main text], different circuits had to be used to estimate individual terms. In Fig. 11, we show the result of estimating the expectation values of two of the individual terms used in the control-free Pauli operator decomposition under an amplitude-damping noise model [Fig. 6(b2) of the main text]. (Recall that this figure demonstrated

FIG. 11. Expectation value estimation of two individual $H_s$ terms from the control-free number-conserving Pauli operator decomposition of the H 2 Hamiltonian studied in Fig. 6 of the main text on states prepared by a fermionic swap network. The two terms here comprise part of the sum [Eq. (6) of the main text] for the expectation value of Fig. 6(b2) of the main text, but are studied here without prefactors (i.e., $\|H_s\| = 1$). Each figure is labeled with the studied term, and guide lines (dashed red and blue) are given to show observed scaling laws. Data presented are the median (crosses) over 50 individual data points (faint dashes) of the absolute error in estimation using VPE (black) and standard partial state tomography (red).
FIG. 12. Comparison between verified phase estimation and symmetry verification. Both techniques are compared to the estimation of the expectation value of the electronic structure Hamiltonian for H 2 under a depolarizing noise model. Verified and unverified results are from the same simulated experiment as Fig. 6(b1) of the main text. Although symmetry verification improves the energy error by around a factor of 10, it still exhibits first-order scaling, as it cannot correct for phase (Z) errors during the experiment.
grows exponentially with the size of the circuit required to implement $e^{iHt}$ or $U_p$. In a simple model, if the error per qubit per moment is p (i.e., assuming qubit decay is more dominant than gate noise in the model), an N-qubit circuit of depth d would have This is exactly what one would expect from an actual postselection technique [i.e., where $M p_{\mathrm{NE}}$ samples are used to estimate g(t)]. We remind the reader that $p_{\mathrm{NE}}$ here is the probability of no error occurring over the entire circuit. As one should expect for an error mitigation technique, this in turn and 6(b2)]

This means that errors must either fail to scatter both $|0_v\rangle$ and $|1_v\rangle$, or rotate between these states and the failed state $|\rho^{(f)}\rangle$. When control-free methods are used, $|0_v\rangle$ is separated from $|1_v\rangle$ and $|\rho^{(f)}\rangle$ by highly nonlocal excitations, which are nonphysical error channels. However, when single-control methods are used, $|0_v\rangle$ is coupled to $|1_v\rangle$ and $|\rho^{(f)}\rangle$ by control qubit errors. These control qubit errors deform the Bloch sphere defined by |0