Statistical phase estimation and error mitigation on a superconducting quantum processor

Quantum phase estimation (QPE) is a key quantum algorithm, which has been widely studied as a method to perform chemistry and solid-state calculations on future fault-tolerant quantum computers. Recently, several authors have proposed statistical alternatives to QPE that have benefits on early fault-tolerant devices, including shorter circuits and better suitability for error mitigation techniques. However, practical implementations of the algorithm on real quantum processors are lacking. In this paper we practically implement statistical phase estimation on Rigetti's superconducting processors. We specifically use the method of Lin and Tong [PRX Quantum 3, 010318 (2022)] using the improved Fourier approximation of Wan et al. [PRL 129, 030503 (2022)], and applying a variational compilation technique to reduce circuit depth. We then incorporate error mitigation strategies including zero-noise extrapolation and readout error mitigation with bit-flip averaging. We propose a simple method to estimate energies from the statistical phase estimation data, which is found to improve the accuracy in final energy estimates by one to two orders of magnitude with respect to prior theoretical bounds, reducing the cost to perform accurate phase estimation calculations. We apply these methods to chemistry problems for active spaces up to 4 electrons in 4 orbitals, including the application of a quantum embedding method, and use them to correctly estimate energies within chemical precision. Our work demonstrates that statistical phase estimation has a natural resilience to noise, particularly after mitigating coherent errors, and can achieve far higher accuracy than suggested by previous analysis, demonstrating its potential as a valuable quantum algorithm for early fault-tolerant devices.


I. INTRODUCTION
Quantum phase estimation (QPE) [1,2] is one of the most widely studied quantum algorithms, due to its potential for exponential speedups in some classes of problems.While this potential is promising, the quantum circuits involved in useful applications of QPE have a very high depth.Because of this, QPE is often described as a fault-tolerant quantum algorithm, which will require a large-scale quantum error correction (QEC) solution for non-trivial applications.Many studies have been performed in recent years to assess the resources required to apply QPE to active spaces at the limit of current classical algorithms, often estimating the need for tens or hundreds of millions of qubits using current QPE and QEC schemes [3][4][5][6].
In the last few years, statistical modifications to the QPE algorithm have been proposed [7][8][9][10][11][12][13][14] that are better suited for early fault-tolerant quantum computers.In particular, the circuits involved in such methods typically have lower depth than other QPE approaches, and use far fewer auxiliary qubits than techniques based on qubitization.
Separately, error mitigation has been a major theme in early practical applications of quantum computing, including applications to chemistry and condensed matter physics problems.Current quantum processors are often referred to as noisy intermediate scale quantum (NISQ) devices due to their high error rates and low qubit counts.In the absence of QEC, which requires higher qubit counts and lower error rates than currently available, error mitigation has been widely investigated.This includes techniques for readout error mitigation [15,16], as well as numerous methods for mitigating gate errors, such as zero-noise extrapolation (ZNE) [17][18][19], probabilistic error cancellation (PEC) [17,20,21], Clifford data regression (CDR) [22,23], noise tailoring techniques such as randomized compiling [24,25], and symmetry constraints or postselection [26][27][28].See also Ref. [29] for a recent review of error mitigation techniques, and Ref. [30] for a recent study testing ZNE and PEC on multiple quantum computing platforms.
Many error mitigation techniques are built with expectation values in mind."Textbook" and many other QPE approaches measure a discrete output, namely the bits of the energy estimate, and are therefore not well-suited for such techniques.In contrast, the circuits involved in statistical phase estimation methods are typically Hadamard tests, whose output is an expectation value, as shown in Figure 1.Multiple such circuits are performed, with the resulting expectation values used to construct an appropriate function, from which the desired eigenvalues may be estimated.
In this paper, we perform a detailed application of statistical phase estimation methods on Rigetti's quantum processors [31,32] and demonstrate a number of practical improvements, which both increase the final accuracy of energy estimates and also increase the resilience of the method to noise.We focus on an approach based on the cumulative distribution function (CDF) of the Hamiltonian's spectral measure, which was introduced by Lin and Tong [8], and use the improved Fourier approximation derived by Wan et al. [9] (but do not investigate the randomized compilation approach for Hamiltonian simulation introduced in the same paper).We also implement and test the quantum eigenvalue estimation algorithm (QEEA) by Somma [7].In both approaches, we apply importance sampling as described in Ref. [8].We apply these methods to several molecules, including two examples motivated by pharmaceutical applications, using a state-of-the-art chemical embedding approach to construct relevant active spaces [33].These results are achieved by using a variational circuit compilation strategy to allow the required operations to be performed with a low circuit depth, suitable for current quantum processors.
We investigate several techniques to mitigate errors in statistical phase estimation on Rigetti's quantum devices; these include symmetrized readout error mitigation [16] and zero-noise extrapolation [19].We show that mitigating coherent errors [24,25] is important in statistical phase estimation, and gives the method some considerable tolerance to noise, and that this methodology can be effectively combined with importance sampling, allowing energies to be extracted with confidence, even in the presence of significant QPU errors.
We introduce a simple approach to estimate energies from the QPU data with significantly improved accuracy, compared to the theoretical bounds derived in Ref. [9].For each system studied, we are able to obtain energy estimates to within chemical accuracy of the exact result.The largest example we study is a model of a pharmaceutically relevant molecule, where the ground-state energy is obtained with an error of less than 0.1 mHa.We further study a Trotterized example with two-qubit gate depths of up to 100, and demonstrate the importance of mitigating coherent errors.
The structure of the paper is as follows.In Section II A we cover the theory of statistical phase estimation, particularly the methods of Refs.[8] and [9], including importance sampling.In Section II B we discuss a variational compilation method to reduce circuit depth.Section II C introduces the error mitigation techniques to be applied.Results are then presented from Rigetti's Aspen devices, first applying the statistical phase estimation method to an example with 2 electrons in 2 orbitals, followed by study of larger active spaces in combination with error mitigation strategies, and a final example using a Trotterized expansion of the time evolution operator.

II. THEORY A. Statistical phase estimation
We consider n-qubit Hamiltonians of the form where P l are n-qubit Pauli operators, and denote the eigenvalues and eigenvectors of H by {λ i ; |Ψ i }.We will be concerned with estimating {λ i } using statistical phase estimation methods.For these techniques it will be necessary to bound the Hamiltonian in a known range, and we therefore work with a scaled Hamiltonian τ H, where τ > 0.
In this paper we focus on circuits of the form shown in Fig. 1.This is a Hadamard test circuit, where setting V = ½ or V = S † = |0 0| − i|1 1| allows measurement in the X or Y bases, respectively.For V = ½, defining a random variable X equal to +1 for |0 measurements and −1 for |1 measurements, it can be shown that Similarly, for V = S † measurements, defining a random variable Y equal to +1 for |0 measurements and −1 for |1 measurements gives Performing the circuits of Fig. 1 therefore allows estimation of up to statistical errors, which are controlled by averaging over multiple repetitions of the circuit, or "shots".The vector g k is the main quantity of interest that we seek to estimate by quantum computation.
In general |ψ will not be an exact eigenstate of H.We denote the expansion of |ψ in the eigenbasis of H by which gives where we define p i = |ν i | 2 .Therefore g k will in general consist of a sum of oscillating signals, whose frequencies are determined by the energies of H, and whose amplitudes are determined by the components of the corresponding eigenstates in |ψ .The goal of statistical phase estimation methods is to extract (some of) the phases λ i from the noisy estimates of g k , and hence estimate the energies of H. Multiple such methods have been introduced in recent years, each suggesting different techniques to construct eigenvalue estimates from the g k estimates.Roughly speaking, these methods involve identifying some function f (H) which allows eigenvalues to be identified.The function f (H) is then expanded in a Fourier series, which can be constructed, after truncation, using the g k estimates.Truncation is needed to put a finite limit on the unitary time evolution τ k.Note that once g k has been constructed, the task of estimating the desired λ i is a purely classical task, and is related to similar problems in signal processing.
We begin by reviewing the approach of Wan et al. [9].This approach is based on a similar method by Lin and Tong in [8], but uses an alternative Fourier approximation that allows them to prove a better asymptotic complexity.Note that Ref. [9] also introduces a randomized compiling approach to implement e −iτ Hk with reduced circuit depth, but this approach is not tested in this paper.

CDF-based statistical phase estimation
In the approach of [8,9] we wish to calculate a cumulative distribution function (CDF) associated with the Hamiltonian and state |ψ , If C(x) could be constructed then it would allow us to identify the eigenvalues of H through its discontinuities.In practice, the CDF will be constructed as a sum of terms e ikx for integer values of k, and so it is necessary to define C(x) to be 2π-periodic.We therefore instead define where Θ(x) is a 2π-periodic Heaviside step function, and p(y) is the probability distribution of τ H associated with |ψ , It is straightforward to check that this definition gives the desired C(x) = i:λi≤x p i for |x| ≤ π/2.For |x| > π/2 this is not true, and we should therefore choose τ such that τ H ≤ π/2.The discontinuities of C(x) in the range |x| ≤ π/2 can then be used to identify eigenvalues of τ H.One can proceed by considering a Fourier series expansion of the step function.Ref. [9] defines where F (x) is an approximation to Θ(x).The authors carefully construct this approximation to satisfy various bounds on its error, and on the scaling of |k|<N |F k | with N .In this paper we focus on performing small problems on NISQ devices, and therefore do not consider these scaling properties in detail.Instead we simply state and use the derived Fourier coefficients, where 0 ≤ j ≤ d − 1 and d is related to N above by N = 2d + 1.In addition we have is the n'th modified Bessel function of the first kind.The Fourier coefficient F k is non-zero for odd k only; even k do not contribute.The Fourier coefficients, and hence the approximate Heaviside function F (x), depend on a parameter β > 0. For untruncated k the approximation becomes more accurate as β increases.However, {F k } decay more slowly with increasing β, and so for a truncated summation there is a trade-off in the choice of N and β.
An approximate periodic CDF can then be expressed Noting that g −k = g * k and that F k = −i|F k | for k > 0 and F k = i|F k | for k < 0, this can be simplified to the following final expression In practice one only needs to estimate g k for k ≥ 1, and only for odd values of k.

Importance sampling
The summation to be estimated in the CDF-based QPE approach takes the form in Eq. (15), where each term is weighted by a Fourier coefficient, |F k |.These Fourier coefficients decay rapidly, such that contributions at large k may be several orders of magnitude smaller than those at low k.
For this reason, Ref. [8] suggests performing importance sampling of this summation.Here, terms are randomly sampled with probabilities proportional to |F k |, where We obtain a set of N S values {k 1 , . . ., k NS }, where each k i is sampled with probability P ki .The importance-sampled CDF (which we denote H(x)) can then be constructed as CDF is obtained using Eq. ( 15) using exact values of g k .The CDF "with importance sampling errors" is obtained from Eq. ( 17) with NS = 5000, but again using exact g k values.Dashed vertical lines are exact energies.(b) The derivative of the CDF, zoomed in around the ground-state energy.The maximum of the CDF derivative provides an accurate energy estimate.
which is an unbiased estimator for C(x).The estimates of Re[g ki ] and Im[g ki ] for each k i are then each obtained by performing Hadamard tests as in Fig. 1, with the resulting real and imaginary components denoted r i and s i , respectively.In our experiments, each r i and s i estimate will be averaged over multiple shots in practice; on current cloud-based platforms, it is inefficient to perform a circuit for a single shot due to the overhead in submitting a circuit.
The CDF can then be constructed by which again is an unbiased estimator for C(x).Note that there are separate estimates r i and s i for each sample k i (rather than only obtaining single estimates for each unique k).Also note that we use the same set of samples {k 1 , . . ., k NS }, {r 1 , . . ., r NS } and {s 1 , . . ., s NS } for every value of x when constructing G(x) (rather than performing a fresh sample for each x).
We emphasize that there are two separate levels of sampling here.We refer to | C(x) − H(x)| as "importance sampling error", whereas | C(x) − G(x)| contains importance sampling error and also "shot noise".
To aid with discussion later, it will be helpful to write Eq. ( 18) in an alternative form.Let n k denote the number of times that k is sampled during importance sampling.We can then write where are estimates of the real and imaginary parts of g k , averaged over all repeated samples of k.In order to mitigate coherent errors when running on a QPU, we will perform a separate Pauli twirl for each sample of k.Therefore, rk (s k ) will denote the estimate of Re[g k ] (Im[g k ]) averaged over n k Pauli twirls of the appropriate Hadamard test (with each single-Pauli twirl estimate r i or s i also averaged over multiple shots in practice).
As an example, Fig. 2a presents simulated results for H 4 in a STO-3G basis, with a square geometry of side length 1.28 Å.The state |ψ is taken to be the Hartree-Fock state.The CDF estimates used here are C(x) (red) and H(x) (blue) with N S = 5000 (thus shot noise is not present in these examples).Because this problem is multi-reference, the CDF has multiple "jumps", each corresponding to an energy eigenvalue of τ H.While this introduces noise into the CDF estimate, the first few eigenvalues can still be clearly identified.
Fig. 3 plots the values |F k | against k for the CDF-QPE method, and compares to those from Somma's QEEA, which is discussed in Appendix B. In the QEEA we take the half-bin width as ǫ = 3 × 10 −3 .In the CDF-QPE Comparison of the Fourier coefficients from the two statistical phase estimation methods considered (rescaled so that |F1| = 1), for similar target accuracies.In the QEEA we choose a bin size of 3 × 10 −3 .In the CDF-QPE method we set β = 10 5 which gives at least a similar level of accuracy with high probability.The Fourier coefficients decay super-polynomially in both methods, which allows efficient importance sampling.
method we set β = 10 5 , which has been chosen to target an equivalent accuracy of around 3 × 10 −3 in estimates of λ i .The Fourier coefficients F k decay super-polynomially in both methods, though the decay is much more rapid in the CDF-QPE method.This rapid decay is the reason for the large efficiency gain in using importance sampling.

Estimating energies in the CDF-QPE method
Eq. ( 15) provides a formula to construct the approximate CDF, which in the limit of large β and N can be used to obtain the exact energies τ λ i of τ H through its jump discontinuities.In practice, this function is only constructed to a finite precision, and a method is needed to estimate λ i from the approximate C(x).
Ref. [9] proves that F (x) can be constructed with a guaranteed level of accuracy, provided sufficient β and d are chosen.In particular, it is proven that for any ǫ > 0 and δ ∈ (0, π/2), the condition is satisfied, provided and with a sufficient d, which can be chosen as d = O(δ −1 log(ǫ −1 )), and W (•) denotes the principal branch of the Lambert-W function.Throughout this paper we use Eq.(22) to choose β for a given target accuracy of δ, after loosely setting ǫ = 0.1.Note that it is always possible to construct F (x) via Eq.(10) to check the accuracy of F (x) for a given β and d; we use this approach to choose d after first choosing β from Eq. (22).Given the accuracy guarantees above, the authors of [9] estimate τ λ i with a procedure similar to a binary search, suggested in Ref. [8].This approach allows a careful proof of the algorithm's scaling for a given target accuracy δ.
In this paper we take a different practical approach to estimating τ λ i from C(x), which we find can give more accurate estimates than the target accuracy δ by one to two orders of magnitude or more.This is performed by maximizing the derivative of C(x) in the region of each jump.From Eq. ( 15), the derivative can be calculated (up to an unimportant constant) as This can be viewed as an objective function, and the locations of its local maxima (in the regions of jumps in C(x)) can be used to estimate each τ λ i .When performing importance sampling and averaging over shots, we instead take as an objective function, which provides an unbiased estimator of C′ (x), up to an unimportant overall constant.Here, n k , rk and sk are as defined in Section II A 2.
To motivate why the above provides accurate estimates of τ λ i , recall that C(x) is defined as a convolution between the approximate Heaviside function, F (x), and the probability density function, p(y), as in Eq. (8).F (x) is constructed to meet the accuracy condition in Eq. ( 21), restricting the jump to a region of width ∼ 2δ.However, from the Fourier definition of F (x) it can be seen that the maximum of the derivative of F (x) lies at exactly x = 0, even for small β and d.Consider the simple case when p 0 = 1, so that the initial state |ψ is an exact eigenstate of τ H with energy τ λ 0 .In this case p(y) = δ(y − τ λ 0 ).Since C(x) is a convolution between F (x) and p(y), the maximum of its derivative will then lie at exactly τ λ 0 , even for small β.In the more general case where multiple p i 's are non-zero, the CDF derivative will be a sum of such contributions that will overlap, and this argument no longer holds exactly.However, if the gap between eigenvalues τ λ i is much greater than δ, then it is expected to remain a significantly better approximation than the bound provided by Eq. ( 21).
These arguments will be affected by the presence of noise, including shot noise and importance sampling errors.Also note that the additional factors of k in G′ (x) will increase noise from high-k contributions, potentially making the derivative more susceptible to errors.We will show in our results, however, that this approach is often robust in practice.
Fig. 2b plots the CDF derivative for the H 4 example described above, zoomed in on the ground-state energy.It can be seen that the maximum is an accurate estimate of τ λ 0 , even after applying importance sampling.By numerically maximizing this function, the estimate of τ λ 0 is in error by only 2.5 × 10 −7 Ha without importance sampling (i.e. using Eq. ( 15)), which increases to 9.2 × 10 −6 Ha with importance sampling (using Eq. ( 17)).This can be compared to the width of jump region, which is ∼ 10 −3 Ha.
We note that a comparable approach has recently been suggested by Wang et al. [13].In this, the method of Lin and Tong is used to find an approximate region where the ground-state energy is located.A more accurate estimate is then obtained by finding the maximum of (f σ * p)(x) in this region, where f σ (x) is a Gaussian filter kernel.We expect that our approach is comparable from a practical point of view, although avoids working with a separate Gaussian kernel.Instead we work with the derivative of F (x) in place of f σ (x), which we find convenient in practice.

B. Variational circuit compilation
Applying statistical phase estimation requires estimating g k = ψ|e −iτ Hk |ψ using the Hadamard test of Figure 1.This requires implementing the controlled-e −iτ Hk unitary for a range of k values.On early fault-tolerant quantum computers it is often anticipated that this will be achieved using a Trotter expansion of e −iτ Hk .On current NISQ devices, the circuit depth required to achieve this is far too high for non-trivial problems, particularly for ab initio Hamiltonians where there are many terms in H and for the high values of k required for good precision in QPE.
Instead, in this work we primarily use a variational compilation technique that allows each controlled-e −iτ Hk operation to be compiled to a constant circuit depth, up to a negligible error.This technique has been recently used in other studies applying closely-related circuits on superconducting processors [34,35].
Specifically, we consider a circuit ansatz as shown in Figure 4, consisting of alternating one-qubit and two-qubit layers.One-qubit layers consist of U 3 gates, which each allow an arbitrary one-qubit rotation.Two-qubit layers are constructed using CZ gates, entangling alternating pairs of qubits in each layer with a "brickwork" pattern.The U 3 gates are parameterized by Euler angles (θ, φ, λ), and implemented in native gates for Rigetti's processors as where rotation gates are defined R Z (θ) = e −iθZ/2 and R X (θ) = e −iθX/2 .The parameters in the U 3 gates can be variationally optimized such that the circuit ansatz closely matches the action of the desired unitary on a given input state.Specifically, following Ref.[35], we define the loss function where the L 2 norm is used.Here, U is the target unitary, Ũ (p) is that of the ansatz circuit with parameters p, and |Ψ = |+ ⊗ |ψ is the input state to U in the circuit.We apply this variational compilation procedure to the controlled-e −iτ Hk operation.Other components in the circuit are constructed directly as native gates.
The minimization of L(p) is performed classically.Constructing and optimizing this loss function requires constructing the action of U on |Ψ .As such, the current methodology is not scalable to systems beyond classical computation, but is nonetheless valuable for near-term NISQ studies.Alternative loss functions based on the reduced density matrices can be used instead [34]; this compilation strategy could then be applied to sub-circuits on fewer qubits than the total circuit, allowing the approach to scale to large numbers of total qubits.We do not consider Circuit ansatz used to compile controlled-e −iτ Hk operations.One-qubit layers consist of U3 gates.Each U3 gate is specified by three parameters that are optimized to approximately match the action of the desired unitary.Each U3 gate is applied as five native gates on the quantum processor, see Eq. ( 25).Two-qubit layers are formed from CZ gates.
these alternative loss functions in the current study, and instead work with Eq. ( 26).While the use of constant-depth circuits simplifies some aspects of the error mitigation, there are many aspects of the error mitigation task that remain challenging and important to treat carefully, as we shall see.To investigate the additional challenges introduced by using Trotterization, we also study a Trotterized example on a QPU at the end of Section IV.Additionally, a Trotterized example is studied in Appendix C in the presence of both unitary and depolarizing errors.
The circuit optimization was implemented using the JAX library [36], which enables automatic differentiation of Python functions.We use the BFGS algorithm implemented in the JAX library to perform the minimization of L(p).We find BFGS to be far more robust than optimizations using stochastic gradient descent methods, for this task.

Zero-noise extrapolation
In this paper we apply zero-noise extrapolation (ZNE) [17][18][19]37] to mitigate errors in the expectation values of the Hadamard tests.ZNE is one of the most commonly studied error mitigation methods in the literature.The core idea of ZNE is to execute the target circuit at varying error rates, denoted λ, and extrapolate the results to obtain an estimate at a reduced error rate.Expectation values are estimated for the original circuit, defined λ = 1, in addition to circuits at increased error rates λ > 1.A function is fit to these expectation values and used to extrapolate to error rate λ = 0, which gives the error mitigated estimate.
There are various possible methods to increase the error rate λ.Examples in the literature include parameter noise scaling and pulse stretching [19,37].In our implementation of ZNE we increase λ using identity insertion (sometimes referred to as "unitary folding") [38].Identity insertion n > 0 times replaces a unitary operation U according to The error rate λ is then defined According to this definition, λ = 1 corresponds to n = 0, meaning that no identity insertion is performed.Circuits in this study are performed in layers of one or two-qubit gates that are executed in parallel (see Section II B and Figure 4).In our ZNE implementation we fold full two-qubit gate layers, which typically have the higher error rates than one-qubit gates on superconducting devices.Folding layers of gates rather than individual gates helps to ensure a consistent error profile with folding, for example ensuring that the crosstalk will be consistent in each layer.
In our implementation of ZNE we execute circuits at error rates λ = 1, 3 and 5. To obtain circuits at error rates λ = 3 and λ = 5 we apply identity insertion as in Eq. ( 27), with n = 1 and n = 2, respectively.A possible drawback of using such large error rates as 3 or 5 is that for low gate fidelities the final error can be too large to perform a reliable extrapolation.However, in the examples studied in this paper, this is not found to be a significant problem.Furthermore, using only odd integer λ values means that every two-qubit gate layer is folded, avoiding complications around having to pick a subset of layers to fold.

Mitigating coherent errors
As discussed in Section II C 1, we fold the two-qubit layers, which consist of CZ gates in this work.Such CZ gates typically suffer from significant coherent errors on current superconducting devices.To mitigate these coherent errors, we apply a form of randomized compiling (RC) [24,25].This is achieved by applying random Pauli gates, uniformly sampled from {I, Z, X, Y }, to each qubit before a CZ layer.Because CZ gates are Clifford operators, it is always possible to then apply corresponding cancelling Paulis after the CZ layer [39,40].This essentially has the effect of Pauli twirling the CZ gate layer [41,42].This twirling process is usually averaged over multiple instances of the same circuit with a new set of random Paulis applied in each.After inserting the Pauli layers, each circuit contains subsequent layers of Pauli and U 3 gates, which can be merged into a single U 3 gate layer.This procedure is known to convert an arbitrary error channel into a Pauli error channel, thus eliminating coherent errors.
Our implementation of randomized compiling differs from previous descriptions due to the use of importance sampling.Remember that our goal is to estimate the CDF, C(x), defined in Eq. 15.As discussed in Section II A 2, we importance sample this summation, as contributions at high k will typically be orders of magnitude smaller than at small k.We incorporate the twirling procedure into the importance sampling of C(x).If n k denotes the number of times that k is sampled during importance sampling, then the estimates for Re[g k ] and Im[g k ] are each averaged over n k independent twirls of the corresponding circuits.These estimates are denoted rk and sk , as defined in Eq. 20.Therefore, estimates for k = 1 will typically be averaged over a large number of twirls, while many circuits for large values of k will be performed for just a single twirl (i.e., without averaging).This will lead to poorer results at large k (both larger coherent errors and larger statistical errors), which will also impact on the performance of ZNE.However, because these terms are weighted by n k their contribution will be small, and thus it is to be expected that the corresponding errors will not significantly impact C(x).The converse is also true; using this approach, errors at low k will be much smaller, which is beneficial for the final estimate of C(x) due to their high weight in the summation.
Mitigating coherent errors is known to improve the performance of ZNE, allowing a more reliable fitting of the expectation values with λ.This was demonstrated for example in Ref. [43], which provided theoretical arguments to justify this finding, and will be further demonstrated in our results.Specifically, we find an exponential fit to be accurate in most cases.It should be pointed out, however, that the theoretical arguments for well-behaved exponential ZNE extrapolations in Ref. [43] only hold for depolarizing channels, and indeed the authors show that this result does not hold in general for Pauli channels with non-equal Pauli weights.Despite this, we will see that ZNE performs well after randomized compiling.We mention that the noiseless output extrapolation (NOX) method [44] has been proposed to overcome the above potential shortcomings, though we do not consider this method here.
Given the above, our strategy for ZNE is to attempt an exponential fit for Re[g k ] and Im[g k ] at every value of k sampled.However, in some cases (particularly at large k where n k is small), this fit may be unstable or of poor quality.Since we know that all Re[g k ] and Im[g k ] values must lie in the range [−1, +1], we loosely check that the extrapolated estimate is less 1.2 in magnitude.If this is not the case, then we instead switch to a quadratic fit for that data point.This strategy avoids extremely large g k estimates due to unstable exponential fits.

Readout error mitigation
We apply readout error mitigation to our results using the symmetrized approach described in Ref. [16].Other readout mitigation strategies include methods based on assuming readout noise to be local or based on continuous time Markov processes [15,45].
Suppose we are mitigating the readout of n qubits.Define a calibration matrix A as where |i and |j are computational basis states.Also define a vector C such that the i'th element is equal to the number of times the n-qubit state is measured to state |i .Then let C ideal be such a vector C under perfect readout.
Applying matrix A to C ideal gives an estimate of the results under noisy readout where E[•] denotes an expectation value.Therefore, an estimate of C ideal can be obtained by inverting A, To estimate the matrix A we repeatedly prepare and measure each of the 2 n computational basis states.We use the measurement outcomes to estimate the probabilities as in Eq. ( 29) and thus the matrix A.
In practice the readout error when measuring state |1 is often higher than the readout error when measuring state |0 , so that the calibration matrix A will not be symmetric.Moreover, readout errors often drift quite rapidly, so that a given estimate of A may not be accurate throughout an experiment.We can nonetheless symmetrize the calibration matrix by bit-flip averaging [16], in which an X gate is applied directly before measurement for half of shots performed (for every circuit involved in estimating each g k ).This symmetrizes any errors that remain after applying readout mitigation, resulting in a better-behaved final CDF estimate.Non-symmetric readout errors will generally result in additive errors in the g k estimates, which can show up as a spurious signal at λ i = 0 in the Fourier transform.Symmetrizing readout errors in this way can therefore significantly improve the quality of results.This simple approach of applying X before measurement in 50% of shots is found to be very effective in practice.

A. Software
To generate phase estimation circuits, including variational circuit compilation, we use software developed by Riverlane.These circuits were then converted to pyQuil format, and compiled to executables that were run on Rigetti's Aspen quantum processors, using their Quantum Cloud Services (QCS) platform [46] and associated software [47].We also used pyQuil extensively to perform prior testing on quantum virtual machines (QVMs).
We use the ORCA program package [48] to perform embedding calculations, and PySCF [49,50] to perform Hartree-Fock to generate the fermionic Hamiltonian for other systems.The Gaussian 16 program package [51] was used to perform geometry optimization for one structure detailed in Appendix A. The OpenFermion library [52] was used to perform mapping from fermionic to qubit operators using the Bravyi-Kitaev mapping [53], discussed further below.Variational circuit compilation was performed using the JAX library [36].

B. Implementation details
Here we discuss some specifics related to implementation of circuits for Rigetti's software stack and QPUs.As discussed in Section II C 2, we perform a separate Pauli twirl of the circuit for each value k i obtained during importance sampling.For example, if k = 1 is sampled n 1 = 50 times during the importance sampling step, then the estimates for r1 and s1 would each be averaged over 50 independent twirls of the corresponding k = 1 circuits.
Ideally we would like to perform as many Pauli twirls as possible, and so would seek to perform an independent twirl for each shot.In practice, each separate twirl must be submitted to the QPU as a separate circuit.Because circuit loading takes significantly longer than circuit running, it is inefficient to perform only one shot per twirl.Instead we perform 100 shots of each, with 50 each for the original and bit-flipped versions of the circuit in order to symmetrize readout errors.Therefore, when constructing the estimates rk and sk as in Eq. ( 20), it should be understood that the estimates r i and s i are each averaged over 100 shots in this manner, before averaging over n k independent twirls to construct rk and sk .These are the final estimates plotted for Re[g k ] and Im[g k ] in subsequent sections, and also used in constructing the final CDF estimates via Eq.(19).
To improve the effectiveness of ZNE, it is important for each gate layer to be as consistent as possible, thus ensuring a similar noise profile for each.To help achieve this, we add FENCE statements around all two-qubit layers in the Quil circuit, which ensures identical pulse timings.Additionally, we decompose all one-qubit gates with the structure . This includes trivial Pauli gates such as I and Z.Our aim here is to again ensure that gate layers are as consistent as possible.It should be noted that R Z gates are performed virtually on Rigetti's quantum processors [54].

C. Chemical systems
We study five separate systems in active spaces from 2 to 4 spatial orbitals.Here, the active space refers to a particular set of orbitals and a number of electrons used to occupy those orbitals.An active space with n electrons in m spatial orbitals (2m spin orbitals) is denoted (ne, mo).Two of our example systems are motivated by pharmaceutical applications, and use a recently-developed embedding method to target a chemically relevant region of the molecule with a small active space [33].The first of these systems is methanethiol, using a (2e,2o) active space for a minimal model of hydrogen abstraction.The second is a structure that we refer to as "clusterTS", taking a (4e,4o) active space.Both of these systems are described in Appendix A.
We also study H + 3 and H − 3 in the STO-3G basis as example 3-orbital systems.The geometry is an equilateral triangle in both cases, with a bond distance of 0.9 Å in H + 3 and 1.75 Å in H − 3 .Lastly we study H 2 to investigate a minimal example of Trotterization.Here a STO-3G basis is used once again, with a stretched bond length of 2.0 Å.
The qubit Hamiltonian is generated using the Bravyi-Kitaev qubit mapping [53] for all systems.Specifically, we use the approach of Ref. [55], which allows two qubits to be tapered due to spin and particle number symmetries, and is implemented in OpenFermion [52].Thus, for an active space of M spatial orbitals, the corresponding qubit Hamiltonian requires 2M − 2 qubits.For the Trotterized H 2 example we taper a further qubit, which is possible due to reflection symmetry.This allows the Hamiltonian for H 2 STO-3G to be represented by a single qubit.
The Hartree-Fock wave function is taken as the initial wave function, |ψ , for all systems.Molecular geometries are given in Supplementary Material.For H 2 , H + 3 and H − 3 , the PySCF inputs used to generate fermionic integrals are included in additional data.

A. Methanethiol (2e, 2o)
As a first example we consider application to methanethiol in a (2e,2o) active space, using orbitals centered on the SH bond, as described in Appendix A. To model the dissociation limit we take the SH distance to be 4 Å.This is a minimal model, but results in a multi-reference problem.We focus first on demonstrating how the CDF is constructed from the QPU calculations, and in particular on the effect of incorporating randomized compiling (RC) into the importance sampling procedure.In Section IV B we then apply this approach to more challenging Hamiltonians with 4 and 6 qubits.
The qubit Hamiltonian for this system can be constructed using 2 qubits, so that 3 qubits are required for each Hadamard test.We variationally compile each controlled-e −iτ Hk operation to a brickwork circuit ansatz.Here, the variational compilation can be performed with negligible errors using only 3 layers of CZ gates; the value of L(p) is typically smaller than 10 −6 after optimization.The total number of CZ gates is 3, 9 and 15 for error rate λ = 1, 3 and 5 circuits, respectively.
Calculations were performed on Aspen-11 using qubits 11, 26 and 27, with qubit 11 taken as the ancilla.CDF Fourier parameters were taken as β = 10 5 and d = 2 × 10 3 .From Eq. ( 22) together with ǫ = 0.1, this corresponds to an accuracy of δ ∼ 0.003.N S = 2 × 10 3 samples were taken for the importance sampling, and 100 shots were performed per sample, thus the total number of shots was 2 × 10 5 .Due to the very rapid decay of F k , most samples were performed at small k.For example, 530 samples were performed at k = 1 and 153 samples were performed at k = 3, whereas only 28 samples were taken in total for all values k > 1000.
Figure 5 presents results for the real components of g k (up to k = 79) and the corresponding CDF estimates, for error rates λ = 1, 3 and 5, performed both with and without randomized compiling.Without RC, the CDF at λ = 1 can be used to correctly estimate the energies of τ H through its jumps, but this becomes challenging for the excited states at higher error rates, and the general quality of the CDF is poor.This is improved significantly by applying RC.The energies are clearly identifiable at all error rates, and the shape of the CDF is largely correct.By inspecting the values of Re[g k ], it can be seen that the expectation values decay with increasing error rate in a more systematic manner when RC is applied than without.Fig. 6 (a) emphasizes this behaviour by showing an example ZNE extrapolation for the real component of g k at k = 3.An exponential fit is seen to be accurate with RC applied, which leads to an improved estimate.This exponential decay is not observed when RC is not incorporated, making ZNE less effective.
Fig. 6 (b) shows the CDFs obtained from ZNE-extrapolated estimates of g k , and compared to the exact CDF.Here, the "exact" CDF is that obtained with importance sampling using the same samples {k i }, but in the absence of QPU errors or shot noise; this corresponds to H(x) in Eq. 17. ZNE aims to correct gate errors by improving estimates of g k , but cannot correct importance sampling noise, and so this is the fair comparison to make.The same general behaviour is observed; with RC applied, the ZNE-corrected CDF has roughly the correct amplitude compared to the exact result.In contrast, the ZNE-corrected CDF obtained without RC is of relatively poor quality, and has large fluctuations (even beyond the expected importance sampling noise) away from the jump regions.These results demonstrate the importance of mitigating coherent errors in statistical phase estimation experiments.

B. Four-and six-qubit Hamiltonians
We next apply these methods to larger systems, first to H − 3 which requires 5-qubit circuits, and then to the ClusterTS system defined in Appendix A, which requires 7-qubit circuits.Given the improvements observed by applying RC in FIG. 5. Results from the CDF-QPE method performed on the Aspen-11 QPU at three different error rates (ZNE-extrapolated results are not presented here), for methanethiol (2e,2o) with a stretched SH bond.Results on the left subplots are performed without mitigation of coherent errors.Results on the right subplots mitigate coherent errors by RC.There are three significant eigenstates present, whose exact eigenvalues are marked by dashed lines in the CDF.Mitigation of coherent errors leads to a significantly better behaved estimates of g k with increasing error rate, which leads to improved estimates of C(x).The location of jumps in the CDF match the exact eigenvalues closely, even at high error rates.the previous section, it is applied for all results in this section.Fig. 7 shows results for H − 3 .Here, controlled-e −iτ Hk unitaries were compiled with 7 CZ layers for the λ = 1 circuits, with 2 CZ gates per layer.CDF Fourier parameters were β = 10 6 and d = 5 × 10 3 .N S = 4 × 10 3 samples were taken for importance sampling.Circuits were performed on Aspen-M-3 using qubits 30 and 34-37, with 34 taken as the ancilla.Corresponding CZ fidelities, as estimated by randomized benchmarking, were between 97.5% and 99.3%.
It is seen that ZNE does a good job at correcting g k estimates, particularly at low k where more samples are taken.This system is multi-reference, with 3 eigenstates having a significant overlap with the initial HF state, leading to 3 jump regions visible in the CDF.The ground-state energy is clearly identifiable at all error rates, though there is no clear signal from excited states at error rate λ = 5.Also shown is the CDF derivative, G′ (x), zoomed in on the ground-state energy; the maximum of this objective function provides an extremely accurate estimate of the true energy at all error rates.Fig. 8 presents equivalent results for the clusterTS system, which has 4 orbitals in the active space.Here we take CDF Fourier parameters β = 10 6 and d = 5 × 10 3 , and N S = 4.8 × 10 3 for importance sampling.We are able to obtain a good representation of each circuit using 9 CZ layers for each controlled-e −iτ Hk operation.Each layer contains 3 CZ gates.Thus for λ = 1 each circuit contains 27 CZ gates.For the highest error rate, λ = 5, each circuit contains 135 CZ gates.The final CDF is constructed by averaging over a large number of circuits; when considering both X and Y measurement bases, each ZNE error rate, the large number samples N S (each corresponding to a separate Pauli twirl) and bit-flip averaging to symmetrize readout errors, 57,600 separate circuits were performed to construct the CDFs, with 50 shots of each.Circuits were again performed on Aspen-M-3, using qubits 30-32 and 34-37, with qubit 34 again taken as the ancilla.The corresponding CZ fidelities, as estimated by randomized benchmarking, were between 96.7% to 99.2%.
Figures 8 (a) and (b) show the real and imaginary components of g k at each error rate, and the ZNE-extrapolated results.It is again observed that g k estimates decay sensibly with λ within statistical errors, and extrapolated estimates are a significant improvement for most k values.The final CDF estimates are shown in Fig. 8 (c).Here the ground-state wave function is single reference, leading to a single jump in the CDF, which is distinguishable at each error rate.The maximum of the CDF derivative in Fig. 8 (d) again allows an accurate estimate of λ 0 to be obtained.
As one method of quantifying the improvement made to the CDF estimates by applying ZNE, we consider the distance metric where [−α, α] is the range in which the CDF is calculated, with α = 1.5 here, and we calculate the integral numerically.An exact estimate of the CDF corresponds to W = 0. Table I gives the percentage reduction in W after performing ZNE (λ = 0) compared to the unmitigated estimates (λ = 1).The CDF is improved by around 77-80% for the larger systems studied.This improvement demonstrates the potential of ZNE to mitigate errors in expectation values, and agrees with previous ZNE studies.However, we find that the benefit of performing ZNE is somewhat limited in statistical phase estimation.Ultimately, the energies of the Hamiltonian are estimated through the jumps in the CDF; thus, two important metrics are the final energy estimates, and the ability to distinguish these jumps from the sampling noise.
Here we find no improvement upon performing ZNE.Table I gives errors in ground-state energy estimates for each system, calculated as ∆λ 0 = λ estimate 0 − λ exact 0 .No systematic improvement is observed by performing ZNE, and in many cases the error is slightly increased (although statistical errors may account for this).Similarly, while ZNE increases the amplitude of the CDF, the importance sampling noise is also inevitably amplified.This latter result may be expected; remember that ZNE only aims to address QPU errors, and not statistical sampling errors.Moreover, ZNE comes with a significant sampling overhead in order to estimate each g k with sufficient precision at high λ, as required for a reliable extrapolation.
The lack of improvement by performing ZNE seems to be associated with the natural resilience of statistical phase estimation to noise, particularly after mitigating coherent errors and symmetrizing readout errors.Indeed, even in the presence of significant QPU errors, the ground-state energy errors are found to be extremely small, typically smaller than 0.1 mHa.Again note that for all of the examples studied here, β was chosen according to Eq. ( 22) for a target accuracy of δ ∼ 1 mHa.Thus the final accuracy is often found to significantly exceed this target, even in the presence of noise.Therefore, while applying ZNE is found to give little practical benefit, we find that mitigating coherent errors by RC is very beneficial, and can lead to an algorithm with natural noise tolerance.Since statistical phase estimation requires averaging over a large number of circuits, the required twirls can be incorporated at minimal cost compared to the bare method, unlike ZNE, which requires an exponential additional sampling cost.Furthermore, incorporating the twirls directly into the importance sampling procedure is found to be practically effective.

C. Trotterization
The previous sections have investigated constructing the CDF for circuits whose depth is independent of k.This requires mitigation of various errors that reduce the performance of the algorithm.However, for a scalable approach we require circuits whose length grows at least linearly with k.As a final example we investigate a minimal model of H 2 using Trotterization, and investigate the performance of the same error mitigation techniques.
We consider H 2 in a STO-3G basis set, with a stretched internuclear distance of 2.0 Å.Using the Bravyi-Kitaev transformation with qubit tapering, this Hamiltonian can be represented by a single data qubit [55].In particular, qubits are tapered due to particle number, spin and reflection symmetries.This allows the qubit Hamiltonian to be written as where c 1 = 0.121256 and c 2 = 0.259138, and we have discarded a constant shift of −0.662537.The two eigenstates of H correspond to the lowest bonding and anti-bonding states of H 2 .We choose τ = 1.5/(c 1 + c 2 ) = 3.943 and work with the scaled Hamiltonian τ H.We then perform first-order Trotterization with a single Trotter step, i.e. taking e −iτ Hk ∼ (e −iτ c2X e −iτ c1Z ) k .The exact energies of τ H before and after Trotterization are ±1.128189Ha and ±1.089119 Ha, respectively.Thus there is a Trotter error of 39 mHa, or 10 mHa after rescaling by τ −1 .We are not concerned with Trotter error here, but rather with the performance of the statistical phase estimation method and error mitigation, and thus only compare to the Trotterized energies from now on.Note that a similar H 2 Hamiltonian has been used in a study of textbook QPE on a neutral-atom quantum computer, performed to three bits of precision [56].
The circuit for a single Trotter step is given in Appendix C and has a CZ depth of 4. Therefore, circuits to estimate g k have a CZ depth of 4k.Results are performed on Aspen-M-2 using qubits 121 and 122.The estimated CZ-gate fidelity from randomized benchmarking was 99.22 ± 0.1662%.The ancilla was taken as qubit 121, with an estimated readout fidelity of 98.4%.
Simulation parameters were chosen as β = 50, d = 30, and N S = 500.These are considerably lower than those used in previous sections due to the need to limit circuit depth.Taking ǫ = 0.1, this corresponds to δ ≈ 0.13 in Eq. ( 22), thus we expect a low resolution in the CDF.The highest value of k sampled was k = 25.The highest CZ depth was therefore 100 with λ = 1, or 500 with λ = 5.
Results are presented in Figure 11, performed with (left) and without (right) RC applied.Results display similar behaviour as observed in Section IV A. In particular, the decay of g k with increasing λ is better behaved for the twirled results, leading to more accurate extrapolations for ZNE.A better energy estimate is also obtained (as determined by maximizing the CDF derivative) with RC applied than without.For the ground-state energy the error in the ZNE-extrapolated estimate is reduced from −22 mHa to −8 mHa after applying RC.These errors remain large in both cases, but this is expected due to the low precision used.As observed in Section IV B, we again find that ZNE improves the amplitude of the CDF towards the exact value, but does not lead to improvements in the energy estimates.
We note that a related result has been observed in simulations performed in Ref. [57], where the authors consider a control-free variant of phase estimation.They show that coherent noise causes errors in the phases of the unitary, and prove that these errors are removed to first order with RC, verified with simulations.Such a result can be similarly motivated in statistical phase estimation.For g k = i p i e −iτ λik , the Fourier transform gives a sum of delta functions, shifted by the energies, λ i .In the ideal case of a depolarizing noise model, we expect g k to decay exponentially with k relative to the exact result, that is g k ∼ e −γ|k| i p i e −iτ λik , for some decay rate γ.The energies λ i can still be extracted exactly by taking a Fourier transform of this g k ; the corresponding poles will be broadened, but the maxima of the poles, and therefore the energy estimates, are unchanged.This also motivates why ZNE should not be expected to improve energy estimates further.In Appendix C, simulated results are performed for the same system but with higher precision, applying depolarizing noise in one example and unitary errors in another.The results are found to confirm these ideas; increasing the depolarizing error rate broadens each peak in the CDF derivative, but does not affect energy estimates, beyond statistical noise.Under unitary errors there is a significant error in the ground-state energy estimate, which is largely removed by incorporating RC into the importance sampling procedure, at the cost of broadened signals in the CDF.Lastly, we note that Ref. [58] has also analyzed a statistical version of phase estimation, and given theoretical arguments to justify the approach having tolerance to noise under certain models.Results are performed at error rates 1, 3 and 5, and extrapolated.RC leads to better final energy estimates and more accurate relative signal between the two states.ZNE improves the signal but does not improve energy estimates.

V. CONCLUSION
In this study we have implemented statistical phase estimation techniques on Rigetti's quantum processors, in combination with error mitigation and chemical embedding methods, allowing accurate energy estimation for several small chemical problems.In addition, a variational compilation technique was used to reduce circuit depth.We found this combination of techniques to be robust in practice, allowing accurate estimation of the ground-state energy with high confidence, even in the presence of significant QPU errors.The variational compilation technique was also found to be robust, and should be seen as a valuable tool for near-term NISQ studies.In the longer term, there is an interesting possibility of using this technique to optimize repeated sub-blocks within larger circuits.
We demonstrated that the CDF-based approach of Lin and Tong [8], using the optimized Fourier approximation of Wan et al. [9], can be used to give significantly better energy estimates in practice than suggested by previous bounds.In particular, the derivative of the estimated CDF can be viewed as an objective function, the maximum of which gives accurate energy estimates.This estimate can be orders of magnitude more accurate than the rigorous bound derived in Ref. [9], even in the presence of both QPU and importance sampling errors.This improved accuracy will allow use of shorter circuits, allowing useful applications of phase estimation to be performed sooner.
We showed that error mitigation and noise tailoring are important for improving the quality of the estimated CDF.In particular, there is a significant benefit in mitigating coherent errors.The twirling procedure in randomized compiling was incorporated into the importance sampling of the CDF, thus performing a much larger number of twirls at small k than at large k.Indeed, many circuits at large k are performed for only a single twirl in this approach.Despite this, we found the approach to be robust, leading to noise resilience in the statistical phase estimation results.We did not find further improvements in energy estimates after applying ZNE; although ZNE does improve the signal of the CDF in each case, the energy estimates themselves and the signal-to-noise ratio were not systematically improved.Given the very high additional sampling cost associated with performing ZNE, we overall find it preferable to not use ZNE for this application.Lastly, we mention that an alternative error mitigation approach has been suggested recently, which involves post selecting the data qubits of the Hadamard test circuits to be measured in the starting state [28]; we have not tested this approach, but believe it could be combined with the methodology developed here to improve results further.
The results presented suggest the possibility to perform large-scale phase estimation experiments in a manner that allows noise resilience, and also that the required circuit depth for a given accuracy can often be made much lower than in traditional QPE approaches.Use of shorter circuits and accurate results in the presence of noise will be crucial for successful applications of phase estimation on early fault-tolerant devices.Taken together with other recent results [13,14], we believe that this shows significant promise for statistical phase estimation techniques, emphasizing their potential as valuable techniques for chemistry and materials problems, and motivating further development work in this direction.
approaches to estimate the eigenvalues {λ i }.
In addition to the CDF-QPE approach discussed in the main paper, we have also implemented the quantum eigenvalue estimation algorithm (QEEA) of Somma [7], and tested it with estimates of {g k } from Rigetti's QPUs.Although the underlying approach to estimate {λ i } is quite different, it will be seen that the final objective function takes an identical form, primarily differing in the Fourier coefficients of the target function.We also extend the QEEA method by implementing importance sampling, and briefly demonstrate its performance here.

Theory
In the QEEA, a range [−α, α] is considered in which the Hamiltonian is known to be bounded, and where α ≤ π. α = 1/2 is chosen in the original presentation.The range [−α, α] is divided into M bins, each with width 2ǫ for a small ǫ > 0. These bins are constructed to overlap with each other, such that the center of the j'th bin is given by −α + jǫ.Using bins that overlap helps to ensure appropriate normalization of results; we will return to this point shortly.
Consider a state |ψ , which will be chosen to have a large overlap with a state (or set of states) whose energy we seek to estimate.For a chemical system where we wish to estimate the ground-state energy, for example, |ψ might be taken as the Hartree-Fock state.For each bin, the QEEA aims to assess whether corresponding eigenstates (whose eigenvalues lie within the bin) have a significant overlap with |ψ .By making ǫ sufficiently small, we can then obtain accurate estimates of the desired λ i .
More precisely, for each bin we define a function f j which acts as a window function for that bin.This means that f j (λ i ) will be non-zero only if λ i is within the j'th bin.For an eigenstate |Ψ i of τ H with eigenvalue τ λ i , The goal of the QEEA is then to construct the vector This is referred to as the probability vector.Only bins containing eigenstates supported by |ψ will have non-zero values p j , and the magnitudes p j will be related to the corresponding overlaps, Ψ i |ψ .Thus, by assessing the bins with a large p j , one can estimate the desired eigenvalues of τ H.We can estimate p j from a set of g k estimates by expanding each f j as a Fourier series, Inserting this expansion into Eq.(B2) and truncating such that |k| < N , for some N ∈ N, gives pj Here, pj denotes the approximate probability vector, with an error introduced due to truncation.It only remains to choose the precise form of f j (x).There are many choices that could be made, but because the Fourier series is truncated, it is important that the Fourier coefficients F j (k) decay rapidly.Somma chooses where 1 j is the indicator function for the j'th bin, and h ǫ is a rescaled bump function.Specifically, with a such that 1 −1 h(x)dx = 1, and then The key properties for this choice are that the Fourier coefficients F j (k) decay super-polynomially, and that This second point is important to ensure that contributions are appropriately normalized; this is why bins are chosen to overlap, preventing potential issues if an eigenvalue lies near the edge of a bin.
As stated in Appendix A of [7], the Fourier coefficients can be derived as where λ j is the centre of the j'th bin, and H(k) is the Fourier transform of h(x), which we calculate numerically through a fast Fourier transform after discretization.The coefficients F j (k) only depend on the bin index j through a phase factor, which allow us to write the second line and thus define F k independent of j.We can therefore rewrite Eq. (B4) as Note that this expression has an almost identical form to that in Eq. ( 14) in the CDF-QPE method.The main difference is in the Fourier coefficients used.These were plotted for ǫ = 3 × 10 −3 in Fig. 3, and compared to those in the method of Wan et al., aiming for a similar final accuracy by using Eq.22 to select β.The Fourier coefficients decay less rapidly in the QEEA.However, as pointed out by Somma [7], there are advantages to this binning approach; in particular, if the gaps between eigenvalues are very small (for example in a solid-state system with bands of energy values) then solving the QEEA is much less ambitious than distinguishing individual eigenvalues to very high precision, and so there may be advantages in such cases.
Lastly, noting that g

Importance sampling
We have additionally implemented importance sampling in the QEEA, which was not considered in the original presentation of the method.As pointed out in [8], this can be used to reduce the complexity of the QEEA, bringing it closer to that of other statistical phase estimation approaches, some of which have proven Heisenberg-limited scaling [8,11].We do not perform a study of scaling here.However, importance sampling allows use of a much smaller bin width (and therefore higher precision) for a given number of circuits to perform.
The probability vectors in the QEEA can be constructed using importance sampling in an identical manner to that described in Section II A 2, starting from Eq. (B11).The main difference is that the sign of F k can vary.These signs therefore must also be absorbed into the importance sampled summation, but the approach is otherwise unchanged.As described in the main text, we also performed RC in the following results, incorporating it into the importance sampling procedure by performing one twirl for each sample.

Results
We performed the QEEA on Rigetti's Aspen-11 for the same methanethiol system studied in Section IV A, which requires 3 qubits in total.Each controlled-e −iτ Hk operation was compiled to a circuit ansatz with 3 CZ layers, as in Section IV A. The half-bin width was set to ǫ = 3 × 10 −3 , and the corresponding Fourier coefficients were importance sampled with N S = 2000.A separate Pauli twirl was performed for each sample.The Fourier summation was truncated at N = 4001.
Results are presented in Fig. 10, and can be compared to equivalent results from the CDF-QPE method in Fig. 5.As for the CDF-QPE method, we find the QEEA to be robust, and that each eigenvalue can be clearly and correctly identified from the probability vector, within the accuracy determined by roughly half the bin width.
As found in Section IV B, there is no improvement to the energy estimates after performing ZNE.In this case the estimates are already correct within the resolution determined by ǫ, and so there is no improvement to be made.ZNE boosts the signal from the probability vector, but overshoots considerably for the ground-state bin.One reason for this is that, because the coefficients |F k | decay much more slowly in the QEEA, very few samples are performed for any particular k, even at small k.This makes mitigation of coherent errors less successful, and also increases statistical error bars on each g k estimate, thus lowering the quality of each extrapolation and therefore also the ZNE estimate of the probability vector.
which leads to the circuit on the left.The circuit on the right is then expressed with CZ as the only two-qubit gate, which can be obtained through standard circuit identities.
In addition to the results in the main text, we have performed simulated results using pyQuil's QVM.This allows us to investigate higher precision and the effect of varying error rates.The same H 2 example is considered with an identical Trotter step.However, CDF-QPE parameters of β = 5 × 10 4 , d = 511 and N S = 1000 are taken.This choice of β corresponds to δ ≈ 0.004 in Eq. ( 22), after choosing ǫ = 0.1.As for results in the main text, we perform 100 shots for each k i value obtained during importance sampling.
First we consider applying depolarizing noise to each CZ gate, before next considering the effect of coherent errors.The depolarizing channel is defined where n is the number of qubits, equal to 2 when applied to a CZ gate, and p is the depolarizing error parameter.In the following results we vary p from 5 × 10 −4 to 4 × 10 −3 .All other gates and measurements are applied without error.Figure 12 presents the CDF, G(x), and its derivative, G′ (x).Subplots are zoomed in on the ground state (GS) and excited state (ES).As might be expected, the jumps in the CDF become more broad as the error rate is increased.Despite this, the energy estimate obtained by maximizing the CDF derivative remains accurate in each case.The circuit on the right is reduced so that the only two-qubit gates are CZ gates, using standard identities.The one-qubit gates are each implemented in native Rigetti gates via the structure RZ(φ) RX (−π/2) RZ (θ) RX (π/2) RZ (λ), which ensures consistent gate layers.Simulations are performed on a pyQuil quantum virtual machine (QVM).Depolarizing noise is applied to CZ gates with four different error rates, p. Subplots on the left and right are zoomed in on the ground state (GS) and excited state (ES), respectively.Increasing the error rate broadens the "jump" region of the CDF and the corresponding peak of the derivative.The peak of the CDF remains roughly correct regardless, although statistical noise at higher p can lead to errors in the final energy estimate.Note that different x-axis scales are used between subplots.
It is straightforward to see that depolarizing noise does not prevent us from obtaining accurate energy estimates.Under depolarizing noise, expectation values will decay as e −γ|k| , for some decay rate γ.We then expect g k ∼ e −γ|k| i p i e −iτ λik . (C4) This decay factor does not affect the frequencies present in g k , but it should be expected that it becomes more challenging to reliably estimate each λ with increasing γ.To be precise, the Fourier transform of e −γ|k| is Lorentzian centered about 0 whose width grows with γ.The results in Figure 12 are as expected, given these arguments.
A separate type of errors are coherent (or unitary) errors, which preserve the purity of the input state.Ref. [57] considers a type of control-free phase estimation, and demonstrates that unitary errors cause errors in the final phase estimates, which can be largely removed with RC.Here, we demonstrate a similar result with the methodology developed in this paper, with the CDF-based method of Ref. [9] and integrating RC with importance sampling.Following Ref. [57], we apply each CZ gate with a unitary error, so that U ′ CZ = ΛU CZ , with Λ = e −i(θ/2)Z⊗Z . (C5) We choose θ = 0.1, which is a very large error in practice.All other gates and measurements are applied without error.Simulations are performed on a pyQuil quantum virtual machine (QVM).A unitary error of e −i(θ/2)Z⊗Z with θ = 0.1 is applied after every CZ gate.We then consider how this affects the CDF, both before and after applying randomized compiling.Without RC there is a large error in the ground-state (GS) energy.This error is effectively removed by RC, at the cost of reduced signal.Interestingly, the error in the excited-state energy is much smaller, but still slightly improved by RC.The excited state (ES) is harder to distinguish in the CDF derivative due to statistical noise.Note that different x-axis scales are used between subplots.
Results are presented in Fig. 13.All simulation parameters are the same as for the results in Fig. 12, including β, d and N S .Applying the CZ unitary error Λ leads to a large error in the ground-state energy of −7.6 mHa (after rescaling by τ −1 ).This is reduced to +0.3 mHa after applying RC.Interestingly, the error in the excited-state energy is less than 1 mHa in both cases, although there is a slight improvement with RC applied.These results demonstrate that coherent errors can cause significant errors in energy estimates from statistical phase estimation, but that RC is a promising approach to help mitigate these errors in practice.

FIG. 2 .
FIG. 2. (a)The CDF for H4 STO-3G, taking the Hartree-Fock wave function as the initial state, calculated with and without sampling errors.Results are simulated in this example.Parameters are β = 10 6 and d = 5000.The "without sampling errors" CDF is obtained using Eq.(15) using exact values of g k .The CDF "with importance sampling errors" is obtained from Eq. (17) with NS = 5000, but again using exact g k values.Dashed vertical lines are exact energies.(b) The derivative of the CDF, zoomed in around the ground-state energy.The maximum of the CDF derivative provides an accurate energy estimate.

FIG. 6 .
FIG.6.Results for methanethiol (2e,2o) with a stretched SH bond, using g k estimates obtained from Aspen-11.(a) Estimates of Re[g k ] for k = 3 as an example, performed at errors rates λ = 1, 3 and 5.An exponential fit is accurate after performing randomized compiling, leading to an improved ZNE estimate at λ = 0.Such a fit cannot be reliably performed for data obtained without mitigation of coherent errors.(b) CDFs constructed using ZNE-extrapolated g k estimates.

FIG. 7 .
FIG.7.Results from the CDF-QPE method for H − 3 in a STO-3G basis, performed on Aspen-M-3.Here the ground-state wave function is multi-reference, leading to three jump regions in the CDF, each indicating an energy eigenvalue.Subplots on the left, (a) and (b), show estimates of the real and imaginary parts of g k for error rates 1, 3 and 5, and the subsequent ZNE-extrapolated estimates.Note that we set d = 5000, but only present results up to k = 79 for clarity.(c) shows the CDF itself, and (d) shows its derivative, zoomed in on the ground-state energy.The derivative of the CDF can be used to obtain an extremely accurate estimate of each energy.Extrapolation improves the amplitude of the CDF, though the location of the jump is not affected.

FIG. 8 .
FIG.8.Results from the CDF-QPE method for the 6-qubit "clusterTS" system, performed on Aspen-M-3.Here the groundstate wave function is single reference, leading to a single jump region in the CDF.Subplots on the left, (a) and (b), show estimates of the real and imaginary parts of g k for error rates 1, 3 and 5, and the subsequent ZNE-extrapolated estimates.(c) shows the CDF itself, and (d) shows the CDF derivative, zoomed in on the ground-state energy.

FIG. 9 .
FIG.9.Results for a stretched H2 molecule, using first-order Trotterization to construct e −iτ Hk , performed on the Aspen-M-2 QPU.Top: estimates of the real part of g k , estimated without (a) and with (b) randomized compiling.Dashed lines are added between estimates for clarity.Bottom: the derivative of the CDF constructed from g k , without (c) and with (d) randomized compiling.Subplots labelled GS and ES are zoomed in on the ground state and excited state, respectively.Results are performed at error rates 1, 3 and 5, and extrapolated.RC leads to better final energy estimates and more accurate relative signal between the two states.ZNE improves the signal but does not improve energy estimates.

FIG. 10 .
FIG. 10. Results from Aspen-11, performing the QEEA on methanethiol (2e,2o) with a stretched SH bond.(a): The full probability vector from −1.5 to +1.5.(b) and (c) are zoomed in on the ground-state and first-excited state, respectively.Results presented are for error rates λ = 1, 3 and 5, and the ZNE-extrapolated (λ = 0) result.The half-bin width is set to ǫ = 3 × 10 −3 and we set N = 4001.For importance sampling, NS = 2000 samples were taken.The correct energy is estimated for every error rate, to the precision considered.
Fig. 11 presents the circuit used for each Trotter step in Section IV C. Since the Hamiltonian has the form H = c 1 Z + c 2 X, a single first-order Trotter step is taken as

RZ ( 2 FIG. 11 .
FIG.11.The circuit for a single Trotter step for H2 in a STO-3G basis, where the Hamiltonian has the form H = c1Z + c2X.The circuit on the right is reduced so that the only two-qubit gates are CZ gates, using standard identities.The one-qubit gates are each implemented in native Rigetti gates via the structure RZ(φ) RX (−π/2) RZ (θ) RX (π/2) RZ (λ), which ensures consistent gate layers.

FIG. 12 .
FIG.12.CDFs and their derivatives for stretched H2 STO-3G, performed with Trotterization using the Trotter step in Fig.11.Simulations are performed on a pyQuil quantum virtual machine (QVM).Depolarizing noise is applied to CZ gates with four different error rates, p. Subplots on the left and right are zoomed in on the ground state (GS) and excited state (ES), respectively.Increasing the error rate broadens the "jump" region of the CDF and the corresponding peak of the derivative.The peak of the CDF remains roughly correct regardless, although statistical noise at higher p can lead to errors in the final energy estimate.Note that different x-axis scales are used between subplots.

− 1 .FIG. 13 .
FIG.13.CDFs and their derivatives for stretched H2 STO-3G, performed with Trotterization using the Trotter step in Fig.11.Simulations are performed on a pyQuil quantum virtual machine (QVM).A unitary error of e −i(θ/2)Z⊗Z with θ = 0.1 is applied after every CZ gate.We then consider how this affects the CDF, both before and after applying randomized compiling.Without RC there is a large error in the ground-state (GS) energy.This error is effectively removed by RC, at the cost of reduced signal.Interestingly, the error in the excited-state energy is much smaller, but still slightly improved by RC.The excited state (ES) is harder to distinguish in the CDF derivative due to statistical noise.Note that different x-axis scales are used between subplots.

TABLE I .
Metrics assessing the effect of ZNE.All results used RC and readout error mitigation.Applying ZNE leads to a significant improvement in the distance metric W for all systems studied.The error in the ground-state energy estimate, ∆λ0, is extremely small both with and without ZNE applied, however there is no improvement made within statistical errors by applying ZNE, and the estimate is even worsened in some cases.