Generalized quantum subspace expansion

One of the major challenges for erroneous quantum computers is undoubtedly the control over the effect of noise. Considering the rapid growth of available quantum resources that are not fully fault-tolerant, it is crucial to develop practical hardware-friendly quantum error mitigation (QEM) techniques to suppress unwanted errors. Here, we propose a novel generalized quantum subspace expansion method which can handle stochastic, coherent, and algorithmic errors in quantum computers. By fully exploiting the substantially extended subspace, we can efficiently mitigate the noise present in the spectra of a given Hamiltonian, without relying on any information of noise. The performance of our method is discussed under two highly practical setups: the quantum subspaces are mainly spanned by powers of the noisy state $\rho^m$ and a set of error-boosted states, respectively. We numerically demonstrate in both situations that we can suppress errors by orders of magnitude, and show that out protocol inherits the advantages of previous error-agnostic QEM techniques as well as overcoming their drawbacks.

Introduction.-Control over computational errors is one of the central problems for the implementation of practical quantum computing algorithms using quantum devices subject to imperfections [1,2].Towards the goal of achieving fully fault-tolerant computation based on logical operations, the number of required qubits was reduced, and their error rates were improved drastically in the recent years, although the realization of ultimate digital quantum computing is years ahead [3].Therefore, it is important to ask whether we can establish information processing techniques which exploit the increasing quantum resource without performing fully-functional error correction.
The quantum error mitigation (QEM) techniques perform post-processing on measurement data (usually expectation values) to eliminate unwanted bias from computation results, in exchange for additional measurement costs [4][5][6][7][8][9][10][11][12][13][14][15].One of the most prominent examples is the quasi-probability method [5,7].Once the error profile of gate operations is given, stochastic operations are inserted to construct the inverse operations of each error map so that we can retrieve the computation result for the intended quantum operation.However, the characterization of the noise model, e.g., via the gate set tomography, is quite costly and easily deteriorated by noise drift.
Meanwhile, error-agnostic QEM methods which do not rely on prior knowledge on the error have been proposed: the quantum subspace expansion (QSE) method [16][17][18][19] and the virtual distillation (VD) method, which is also called exponential error suppression (EES) method [20][21][22][23].In the QSE method, we classically realize a variational subspace spanned by a set of quantum states {|ψ i } i as |ψ = i c i |ψ i , which can be effectively generated via additional measurements and post-processing.While the QSE method was initially proposed to compute excited states from a ground state realized on a quantum device, it also contributes to the mitigation of errors.By construction, the QSE method is well-suited for mitigating coherent errors which may come from insufficient variational optimization, lack of quantum circuit representability, and etc.However, it cannot suppress stochastic errors efficiently, since in general we need a linear combination of exponentially many Pauli operators to construct a projector to the error-free subspace [4,16].The VD/EES method, on the other hand, is complementary in this sense.By applying entangling operations between M identical copies of noisy quantum states ρ, we can obtain the error-mitigated expectation value of an observable O as O whose fidelity with a dominant eigenvector of ρ exponentially approaches unity.Although this method can significantly compensate for stochastic errors, it is entirely vulnerable to coherent errors which distorts the dominant eigenvector.
In this work, we propose a unified framework of erroragnostic QEM techniques which we refer to as the generalized quantum subspace expansion (GSE) method.The central idea is to extend the notion of quantum sub- 1. Suppressing errors in 6 lowest eigenstate calculation of one-dimensional transverse-field Ising model by interfering M copies of identical noisy quantum states.Eigenenergies computed by (a) VD/EES method, (b) GSE method based on the power subspace, and (c) GSE method with additional bases.For the power subspace, we take the bases as σi = ρ i (i = 0, 1, ..., M  2 ) and A = I for even number of copies M , while we take σi = ρ i (i = 0, 1, ..., M −1 2 ) and A = ρ for odd M 's.In (c), we additionally include non-Hermite operators ρ m H (m = 0, 1, ..., M/2 ).(d) The log scale plot of the deviation ∆E from the exact eigenenergies.For each eigenstate level n, we generate the noisy state ρ by adding depolarizing error after each gate of a variational quantum circuit, whose parameters are optimized by the subspace-search VQE algorithm [24] to solve an 8-qubit system under h = 1.
The depolarizing error rate p dep is taken so that expected number of total error in ρ is given as Ntot = Ngatep dep where Ngate is the number of gates.For all data presented in this figure we set Ntot = 1.5.
spaces to include general operators that are related to the target noisy quantum state, which allows us to distill the state into an error-mitigated eigenstate of the target Hamiltonian.We show that the GSE method, which provides a substantial generalization of the QSE method, inherits the advantages of previous error-agnostic QEM techniques as well as overcoming their drawbacks.This is demonstrated under two practical choices of the subspace.In the first example, the subspace consisting of powers of a noisy quantum state ρ m achieves not only the exponential suppression of stochastic errors which is even more efficient than the VD/EES method, but also efficiently mitigates coherent errors.In the second example, we span the subspace by non-equivalent quantum states corresponding to different noise levels.Unlike the commonly used error-extrapolation method, the GSE method with the subspace of error-controlled states is quite robust even when the control over noise level is imprecise, and hence highly beneficial to practical applications.
Framework of generalized quantum subspace expansion.-Supposewe obtain a noisy approximation ρ of some desired state, e.g. an eigenstate of a given Hamiltonian H using the variatioanal quantum eigensolver (VQE) or its variants [24][25][26][27][28][29][30][31][32][33].The GSE method uses the following ansatz in the extended subspace to repserent an eigenstate: where P = i α i σ i (α i ∈ C) is a general operator, σ i is generally a non-Hermite operator, and A is a positivesemidefinite Hermite operator.In this paper, we refer to σ i as a base of subspace.It is easy to check that ρ EM is a positive-semidefinite Hermite operator whose trace is unity, which ensures that ρ EM corresponds to a physical quantum state.Note that σ i and A can be related to the noisy state ρ.For example, we can choose σ i = ρ and A = ρ; this highlights the crucial difference of the novel GSE method from the conventional QSE (see Supplementary materials (SM) for more details [34]) that it also includes general operators related to quantum states in the expanded subspace.To span the most general subspace, we can take a base as follows, where lk is a quantum state, U lk and V (i) lk are operators that allow for an efficient measurements on quantum computers (e.g.local Pauli operators or unitary operators), and L k denotes the number of quantum state.See SM for more details [34].
To obtain the error-mitigated spectra of the Hamiltonian, we determine the coefficients α = (α 0 , α 1 , ...) by solving the following generalized eigenvalue problem [34]: where H ij = Tr[σ † i Aσ j H] and S ij = Tr[σ † i Aσ j ] with E being the error-mitigated eigenenergy.The coefficients are normalized as α † S α = 1 to satisfy Tr[ρ EM ] = 1.Note that H ij and S ij need to be efficiently computed on quantum computers.Once we find α which suffices Eq. (3), we can compute the error-mitigated expectation value of any observable . By implementing the generalized quantum subspaces spanned by Eq. ( 2), we can efficiently perform erroragnostic QEM.To illustrate the significance of our scheme, we will describe slightly more specific but highly practical two subclasses.Due to their features explained counts E FIG. 2. Histograms of ground-state energy estimation by VD/EES (blue), GSE method based on the power subspace (orange), and GSE+ method that includes additional term ρH included in power-subspace bases (red) using M = 2 copies.Here, we take the number of total measurement shots to be 10 9 .The gray dotted line indicates the exact ground state energy of 1d TFI model with N = 8 qubits.
thereafter, we refer to the employed subspaces as the power subspace and fault subspace, respectively.
Power subspace.-Letus first restrict the bases of subspace to powers of noisy quantum states as σ i = ρ i (i = 0, 1, ..., m) and set A = I: This shows that the error-mitigated state ρ EM is represented as the series expansion of the state ρ as ρ EM = 2m n=0 f n ρ n where f n = i+j=n α * i α j .Setting m = 1, for instance, leads to ρ EM = f 0 I + f 1 ρ + f 2 ρ 2 , which clarifies that ρ EM is represented as a polynomial of ρ [35].
It has been pointed out that higher order states themselves are extremely useful [20,21,36].By effectively computing the expectation value of an observable corresponding to the state ρ we can exponentially suppress the contribution from the non-dominant eigenstates of ρ (See SM for details [34]).Our key insight is that the non-dominant states will be suppressed even more efficiently by interfering them with each other.In fact, it is straightforward to see that the power subspace for A = I completely includes ρ (2m) VD , and therefore in the case of ground-state simulation we can always surpass the performance of the VD/EES method when the dominant vector gives good approximation of the ground state [34].
To illustrate the expected gain by our approach, we numerically demonstrate our algorithm.Figure 1 shows the results for 6 lowest eigenstates of the one-dimensional transverse-field Ising (1d TFI) model, whose Hamiltonian is given as H = − r Z r Z r+1 + h r X r where X r and Z r denote the x-and z-components of the Pauli matrix acting on the r-th site and h is the amplitude of the transverse magnetic field.We set h = 1 in the following.It is clear from Fig. 1 that both the VD/EES method and our GSE method yields exponential suppression of error # of errors ΔE FIG. 3. Relationship of the expected number of errors Ntot and the ground-state energy deviation ∆E.Blue filled circles and red filled circles denote the data from the VD/EES and GSE+ methods using M = 2 copies of identical noisy quantum states, respectively.Note that GSE+ denotes the GSE method with additional term ρH included in the bases of subspace {σi}.The purple crosses indicate the ordinary QSE method which corresponds to choosing A = ρ and σi ∈ {I, H}.
The black and green lines indicate results from the raw noisy quantum state and error-free optimized circuit, respectively.While the accuracy by the VD/EES result is bounded by the insufficient expressibility of the variational quantum circuit, the GSE method can reach beyond this limit by further exploring the subspace.
with respect to the number of copies M .Moreover, the interference with non-dominant states in ρ yields quicker convergence of the expectation value Tr[ρ EM H] towards the exact values; this is further boosted by including additional operators such as ρ m H to the bases, which is discriminated as GSE+ method in the figures.While we observe a trade-off between the accuracy and estimation variance as shown in Fig. 2, the greater suppression in GSE/GSE+ method gives us an advantage when the measurement resources are not too scarce.Such a gain in the performance is found not only in the energy, but also measures such as the fidelity and trace distance (See SM for details [34]).
Now, let us further analyze the effect of the crucial obstacle for the previous exponential error suppression techniques-the coherent errors.It has been pointed out in Refs.[20,21,37] that the stochastic gate errors themselves may cause a deviation of the dominant vector, which is called the coherent mismatch.In addition, there are numerous other sources that give rise to the coherent errors, e.g., restrictions on the variational ansatz structure of quantum states due to experimental limitations.In this regard, we interestingly find that our method provides a significant improvement over previous methods, since the expressibility of quantum states can be enhanced effectively by the subspace.
Figure 3 shows the result for numerical simulations focused on the ground state to support our findings.While the accuracy of the raw noisy state and the conventional QSE method scales only linearly with respect to N tot , both the VD/EES and GSE methods using two copies < l a t e x i t s h a 1 _ b a s e 6 4 = " p l v q z l J Z 9 Z D 4. Influence of fluctuation in the stretch factor λi.The blue and orange points denote the results from the GSE method using fault subspace and the extrapolation method for the VD/EES calculation using M = 2 copies, respectively.It can be clearly observed that the extrapolation method under uncertain noise control yields both systematic deviation and increased variance.For each error unit , we generate 500 sets of noisy quantum states ρ( λi ) where λi = λi + N (0, λi σ 2 ) for λi ∈ {1, 2, 3} and σ = 0.1.We and assume that each Pauli term is estimated without any shot-noise.
of ρ provide quadratic suppression in the noisy regime.However, the difference of two methods is highlighted in the low-error regime, in which the accuracy of the VD/EES method is bounded by the the performance of the original VQE simulation.Namely, when the ideal quantum circuit is not powerful enough and involves algorithmic error, we cannot remedy the shortage by merely restoring the dominant vector.In sharp contrast, our method is capable of eliminating such unwanted errors.
It is important to remark that the required number of measurements for the GSE method scales quadratically with respect to the desired accuracy, just as in the usual quantum measurements (See SM for details [34]).When the dominant vector of ρ gives a good approximation of the ground state, this is mainly accounted for by the sampling cost rooting from higher powers ρ M .
Fault subspace.-Nowwe proceed to another practical subclass of the GSE framework that employs nonidentical quantum states to span the quantum subspace.Here, the error-agnostic QEM is realized by utilizing quantum states from different noise levels, and hence refer to the subspace as the fault subspace; we take σ i = ρ(λ i ) where is the unit of the controlled error (e.g., infidelity per gate) and λ i ≥ 1 determines the actual error level.For instance, we consider an error-mitigated state as follows: where we have set A = I and σ i = ρ(λ i ).We may also extend the fault subspace to include high orders The concept of the fault subspace is closely related to the celebrated error-extrapolation method [5,6].In the error-extrapolation method, one estimates the zeronoise limit of the expectation value of a given observable O based on results at n + 1 noise levels O(λ .., n (See SM for details [34]).This implies that the errorextrapolation method constructs an effective density matrix as ρ ex = n i=0 β i ρ(λ i ).Due to its simplicity and practicality, the extrapolation method has been investigated widely both theoretically and experimentally.However, the extrapolation is based on a highly nontrivial assumption that the noise level can be accurately controlled (e.g. by extending the gate execution duration).Moreover, since the extrapolation is a purely mathematical operation that does not take any physical constraint into account, it may produce unphysical results even if the measurement is done perfectly, e.g., ρ ex can be a unphysical state whose eigenvalues can be negative.
The GSE method using the fault subspace can solve the above problems.First, the results obtained from the GSE method corresponds to a physical density matrix.Second, the GSE method using the fault subspace does not rely on the accurate knowledge of noise levels.This is because the GSE method simply aims to construct a truncated Hilbert space so that the lowest eigenstate is included.It suffices to employ bases that are not identical to each other, while the choice of error levels may affect the practical efficiency.
As a demonstration, we numerically investigate the ground state of 1d TFI model assuming that the control over the noise level is imperfect (See SM for simulation of excited states [34]).Here, we consider three noise levels ρ i = ρ( λi ) where λi = λ i + N (0, λ i σ 2 ) for λ i ∈ {1, 2, 3} and variance σ 2 .The energy at the zero-noise limit is estimated by the Richardson extrapolation for each set of data D = {(λ i , Tr[Hρ 2 ( λi )]/Tr[ρ 2 ( λi )]}.(See SM for details [34]).The extrapolated value fluctuates due to the random realization of λi , which does not affect the GSE method almost at all.We highlight this contrast in Fig. 4. Due to the stability, the GSE method is suitable for experiments on quantum devices.
Summary and Outlook.-We have proposed a generalized quantum subspace expansion which unifies the advantages of previously reported error-agnostic methods and furthermore overcomes their drawbacks.As a practical demonstration, we have first discussed to include powers of the noisy quantum state ρ m in the base of the subspace.This does not only provide the exponential suppression of stochastic error which is even more efficient than the VD/EES method, but it also eliminates the coherent errors of the dominant vector.In the second strategy, we have presented a method that spans the subspace using quantum states with various noise levels.Unlike the commonly used error-extrapolation technique, the GSE method exhibits robust performance even when the control over noise level is imprecise.
There are several future directions.First, an efficient combination of the proposed scheme and other QEM methods is worth exploring.For example, we can combine quasi-probability method with the proposed method to suppress the bias of error-mitigated expectation values due to finite characterization errors.We also expect that exploiting symmetry of the system in the subspace [17,36] will also improve the computational accuracy.Second, our method is not restricted to nearterm quantum computing, but may help improve computational accuracy even in the fault-tolerant quantum computing regimes, when problems of interest involves calculation of eigenspectra.Namely, we may apply the proposed method to mitigate the effect of errors due to decoding of logical qubits or insufficient number of Tgates without any characterization.This is in contrast with the previous works based on the quasi-probability method [38][39][40][41].The study of suitable subspace in our GSE framework is also important in future works.In this section, we provide a concise review on the quantum subspace expansion (QSE) method [16,17,19].In short, the QSE method can be understood as a post-processing technique that allows one to further explore the Hilbert space in variational simulation.Namely, given a set of (non-orthogonal) quantum states {|ψ i } i , one considers an effective ansatz given by a linear combination whose coefficients c = [...c i ...] T are determined so that a desired property of | ψ is optimized.

A. QSE method with energy-based variational principle
A common strategy to simulate the ground state of a given Hamiltonian H is to employ an energy-based variational principle, which is referred to as the Ritz variational principle in literature [45].We aim to minimize the following cost function: where λ is the Lagrange multiplier introduced to restrict the norm of the ansatz (S1) to be unity.From the stationary condition ∂ c * i L = 0, we straightforwardly obtain the following generalized eigenvalue problem, where H ij = ψ i |H|ψ j and S ij = ψ i |ψ j with λ yielding the minimal energy achievable within the subspace spanned by {|ψ } i .Here, we normalize the coefficients to satisfy c † S c = 1, which directly follows from ∂ λ L = 0. To stabilize the computation, we further cut off eigenvalues of the metric S that are below some threshold [16,46].It is worth mentioning that, while the Ritz variational principle itself is designed for the ground state, other eigenstates also satisfy the stationary condition ∂ c * i L = ∂ λ L = 0 and therefore can be obtained from Eq. (S4), if the subspace is properly included.If needed, one may further compute the energy variance as α † V α/ α † S α where V ij = ψ i |(H − E) 2 |ψ j to obtain the estimation error of the energy [47,48].
We remark that the idea of extending the variational ansatz after stochastic/numerical optimization was already developed in the field of classical simulation.For example, one of the most efficient choices of subspace to suppress algorithmic errors in eigenvalue problem is the Krylov subspace {H i |ψ 0 } i , which motivated us to choose the bases of subspace as {I, H} for the calculation in Fig. 3.However, as was initially pointed out by Ref. [16], one of the most important feature of the subspace method in the context of quantum simulation is the suppression of hardware errors.To the best of our knowledge, there is no unified understanding on how to construct bases that efficiently suppress both algorithmic and hardware errors.

B. QSE method with variance-based variational principle
One may employ an alternative variational principle to compute the eigenspectra of a given Hamiltonian by utilizing the property of an eigenstate that the energy variance H 2 − H 2 shall be zero.Following the knowledge of classical variational Monte Carlo simulation [49,50], here we define the cost function as where ω is an initial guess of the target eigenenergy that may be either fixed or updated iteratively until convergence.
In parallel to the case for energy-based variational principle, we obtain the following: where After solving Eq. (S6) and choosing the optimal c that yields the smallest variance, we can compute the eigenenergy as E = c † H c/ c † S c.As we have mentioned in §S1 A, the energy variance provides an upper-bound of an estimation error of the energy.
As we further mention in Sec.S5, in the current paper we initially set ω to the energy obtained from the VD/EES method, which is necessarily computed in GSE methods based on the power or fault subspaces.After solving Eq. (S6) and computing the energy, we replace ω with E and solve the generalized eigenvalue problem again.Note that this step does not require additional measurement on quantum computers; the cost is entirely that of classical computation.

S2. VIRTUAL DISTILLATION OR EXPONENTIAL ERROR SUPPRESSION METHOD
In this section, we review the virtual distillation (VD) method or exponential error suppression (EES) method proposed in Refs.[20,21].We consider a noisy state ρ, which can be written in terms of the spectral decomposition: where we define ψ i |ψ j = δ i,j and p 0 > p 1 ≥ p 2 ≥ • • • ≥ 0, and we refer |ψ 0 as a dominant vector.In that method, we can effectively compute the expectation value of an observable from that dominant vector |ψ 0 with exponentially small error by measuring the numerator Tr[ρ M H] and the denominator Tr[ρ M ], respectively.These quantities can be measured by unitary diagonalization [20] or indirect (or non-destructive) measurement [20][21][22].Fig. S1 shows a controlled derangement quantum circuit for the case of M = 3 to calculate the numerator Tr[ρ M H].We emphasize that when increasing the number of copies M , the virtual state ρ ] exponentially gets closer to the dominant vector |ψ 0 .
The VD/EES methods have the best performance when there are only stochastic (or orthogonal) errors that change an ideal state |ψ id into its orthogonal states.In this case, we have |ψ 0 = |ψ id with sufficiently small error rates.However, this is not always the case.Some type of error causes a change in a state from |ψ id to non-orthogonal states, and this leads to |ψ 0 = |ψ id .The infidelity of these states is called the coherent mismatch 1 − | ψ 0 |ψ id | 2 [37], and the VD/EES method cannot eliminate such coherent errors in general, although the effect of the coherent errors was investigated through numerical results [20,21] and analytic results [37].
where we set M = 3 and the Hamiltonian is decomposed into a linear combination of products of Pauli operators Pa as H = a faPa.
We also remark that even if the state is unphysical, i.e., the state has negative eigenvalues, as long as exponentially vanishes as M increases, which implies virtual distillation still works.This ensures that VD/EES method can be applied to error-mitigated unphysical states to further improve the computation accuracy.

S3. IMPLEMENTATION OF GENERAL SUBSPACE
The most general subspace can be implemented using the bases given as follows, where lk and V (i) lk are general operators, L k denotes the number of quantum state, and ρ lk is a quantum state.Suppose that the Hamiltonian H can be decomposed into a linear combination of products of Pauli operators as H = a f a P a .The elements H ij := Tr[σ † i Aσ j H] and S ij := Tr[σ † i Aσ j ] (which we call a Hamiltonian in the expanded subspace and an overlap matrix, respectively) in the Eq. ( 3) can be described as: and these can be calculated on quantum computers by using a modified controlled derangement operator.Let us show an example when A = ρ and L 1 = 1, assuming a single subspace given as are local Pauli operators, then the calculation of H ij can be performed by the quantum circuit shown in Fig. S2(a).Alternatively, if U 1 and V 1 are unitary operators, then we can obtain the expectation value from the circuit shown in Fig. S2(b).

S4. POWER SUBSPACE INCLUDES THE VIRTUAL DISTILLATION AMONG THE SUBSPACE
We compare the performance of the VD/EES method, Tr[ρ M ] , with that of the GSE method using power subspace, where we define a vector α k : Note that the (k + 1)-th component of α k is a non-zero value and all the others are zero.Eq. (S12) means that the power subspace expansion includes the expectation values obtained by the VD/EES methods.From the above result, we obtain This implies that the convergence of the power subspace expansion is equal or greater than that of the VD/EES methods.

S5. GENERALIZED QUANTUM SUBSPACE EXPANSION METHOD FOR EXCITED STATES
In this section, we further discuss the excited-state simulation by the GSE method.For the consistency with the main text, here again we consider one-dimensional transverse-field Ising (1d TFI) model under open boundary condition as H = r (Z r Z r+1 + hX r ) with h = 1.In Sec.S5 A and Sec.S5 B, we show that the GSE method with the power subspace and fault subspace can mitigate the errors even for excited states, respectively.
Before proceeding to the results by two choice of subspaces, let us summarize the concrete overall procedure.We focus on simulation of 1d TFI model using the hardware-efficient ansatz whose structure is provided in Sec.S9.
1. First, we simulate the K lowest eigenstates of the Hamiltonian.Here we use the subspace-search variational quantum eigensolver algorithm (SSVQE) [24].The SSVQE algorithm optimizes variational parameters θ so that the weighted sum of the energy expectation value from multiple orthogonal states is minimized: where initial states satisfy ψ = δ kk , weights are positive w k > 0, and K is the desired number of excited states.Throughout our paper, we choose the weights to be w k = K−k K .Note that in our numerical demonstration this optimization is performed assuming that the number of measurement shots is infinite.
(2.For our numerical demonstration, we introduce single-qubit depolarizing noise after every quantum gate and obtain the noisy state ρ (n) for every U (θ)|ψ . This step is skipped when one runs the algorithm on real quantum device.) 3. Next, for every eigenstate level n, we define an effective ansatz as where Then, we solve either of the generalized eigenvalue problems where For the latter, we initially set the reference energy as ω = Tr[(ρ (n) ) M H]/Tr[(ρ (n) ) M ], namely the energy obtained from the VD/EES method.Note that this quantity is necessarily calculated, if one wishes to apply the GSE method.4. Finally, among the multiple solutions of the generalized eigenvalue equation, we choose the one with the smallest variance, which is computed as α n with some initial guess, e.g., the energy from VD/EES method, and choose the closest value.It is practically important to remark that, this step is only optional, when one is simulating the lowest (highest) eigenstate using the energy-based variational principle.In such a case, one can typically choose the smallest (largest) solution of Eq. (S17).
Two remarks are in order.First, the reference energy ω introduced in step 3 is a hyperparameter.To avoid potential error that may arise from choosing an inappropriate ω, one may iterate steps 3 and 4. That is, once one solves the generalized eigenvalue problem (S18), one may replace the reference energy with the obtained energy as ω = α † n H n α n / α † n S n α n and define a new generalized eigenvalue problem.This procedure can be repeated until the energy/variance converges.We remark that this iteration does not require additional measurement on quantum computers; the cost is entirely that of classical computation to solve Eq. (S17) or (S18).In the current work, we find the deviation of variance estimation to be sufficiently small, and therefore fix the number of repetition to 2. Second, to stabilize the computation, we regularize the metric S n by cutting off its eigenvalues that are below the threshold = 10 −8 [16,46].Note that this value can be adjusted upon request; one may simply solve the generalized eigenvalue problem under various values and check its stability.Since this procedure involves only classical computation, it does not require any additional quantum measurement.Here, GSE+ indicates that we additionally include non-Hermite term ρ m H (m = 0, 1, ..., M/2 ) to the bases of subspace.The squares, triangles, and circles denote the data from VD/EES, GSE, and GSE+ methods, respectively.In (b) we display the result for the computation using Eq.(S18), which follows from the variance-based variational principle.We can see that the GSE method can efficiently mitigate the error over the entire eigenspectra, which is even more significant in the GSE+ method.The variational ansatz consists of depth d = 20 and the total error number is Ntot = 3.0.

A. Power subspace for excited states
Here we discuss the performance of the GSE method with the power subspace.While in the main text we have presented the numerical demonstration for K = 6 eigenstates in N = 8 qubit system, here we further attempt to simulate the entire eigenspectra using the GSE method.The result for simulation of all 16 eigenstates for N = 4 qubit system is shown in Fig. S3.We can see from Fig. S3(a) that the GSE methods exhibit greater suppression of errors than the VD/EES method even for excited states.Another strategy, which we find to be effective for the 1d TFI model as shown in Fig. S3(b), is to employ the variance-based variational principle, or Eq.(S18).While the performance of the GSE method is similar to the case with the energy-based variational principle, we find improvement in the results of the GSE+ method (especially for M = 3 and 4).We remark that one may perform more stable calculation by utilizing the symmetry of the system.Namely, one can check whether the solution of Eq. (S17) is in desired symmetry sector, e.g., by combining with the symmetry verification methods, which we leave as a future work.S4.Simulation of 6 minimal eigenstates using the GSE method with the fault subspace (blue) and the error-extrapolation method for the VD/EES calculation using M = 2 copies (orange) for N = 8 qubit system.Here, we adopt the energy-based variational principle.As in the calculation merely for the ground state, we clearly observe that the error-extrapolation method suffers from uncertainty in the stretch factor of errors.Here, we fix the error unit to be = 1.0, and generate 500 sets of noisy quantum states ρ( λi ) where λi = λi + N (0, λi σ 2 ) for λi ∈ {1, 2, 3} and σ = 0.1.We assume that each Pauli term is estimated without any shot-noise.The variational ansatz consists of depth d = 18.

B. Fault subspace for excited states
Next, we proceed to the simulation of excited states using the fault subspace.As in the main text, we assume that the stretch factor of the error fluctuates due to noise-control imperfection, which leads the well-known errorextrapolation method [5,7] to yield unphysical estimation of physical observables (also see Sec.S6). Figure .S4(a) compares the eigenenergy estimation by GSE method using the fault subspace and the error-extrapolation method applied to mitigate errors in the VD method using M = 2 copies.Similar to the ground-state simulation as provided in main text (e.g.Fig. 4), the error-extrapolation method suffers from the additional uncertainty where the GSE method with the fault subspace is barely affected.We can further see from Fig. S4(b) that there is the unwanted bias introduced in the error-extrapolation method, which results in separation of accuracy in one or two orders.
In Fig. S4 we show the simulation results over the entire eigenspectra for N = 4 qubit system.As was done in the simulation using the power subspace, among the solutions obtained by energy-based variational principle (Eq.(S17)), we choose the one with the smallest variance.For the current simulation, we did not observe a large difference in the performance of calculations based on two variational principles.

S6. ADDITIONAL NUMERICAL RESULTS ON ERROR MITIGATION PERFORMANCE
In this section, we analyze the performance of the GSE method by comparing the error in physical observables such as two-point correlation and computing distance measure (such as fidelity or trace distance) from the exact state.In particular, here we focus on the ground-state simulation of 1d transverse-field Ising model.Note that a physical observable O can be computed as where while α is computed by solving the generalized eigenvalue problem.
A. Errors of physical observables under power subspace In the main text, we have demonstrated that the GSE method shows stronger suppression of errors in eigenstate energy over the QSE or VD/EES methods.We further find that this is not only the case for the energy.As

B. Errors of physical observables under fault subspace
We can also perform similar analysis for the fault subspace by comparing with the error-extrapolation method.Figure S7(a) and (b) show the results for computing two-point correlators Z 0 Z r and X 0 X r , respectively.In both plots, we observe a huge separation in the variance, which reflects the vulnerability of the error-extrapolation method Analysis on the performance of error-mitigated states that are effectively realized by the GSE method using the fault subspace and error-extrapolation method.(a)(b) Two-point correlators Z0Zr and X0Xr where r is the site index.(c) The quantum state fidelity F between the error-mitigated states ρ and the exact ground state ρGS.All data from the GSE method satisfies F 1, which is violated in the error-extrapolation method.We fix the error unit to be = 1.5, and generate 500 sets of noisy quantum states ρ( λi ) where λi = λi + N (0, λi σ 2 ) for λi ∈ {1, 2, 3} and σ = 0.1.We assume that each Pauli term is estimated without any shot-noise.The variational ansatz consists of depth d = 6.
against the fluctuation in noise level control.Even more prominent data is shown in Fig. S7(c).We find that the effective quantum state ρ extrap = i β i ρ(λ i ) is highly nonphysical, which results in fidelity F (ρ extrap , ρ GS ) > 1.In sharp contrast, we find that the effective state constructed by the fault subspace always satisfies F 1.

S7. EFFECT OF SHOT NOISE ON THE GENERALIZED QUANTUM SUBSPACE EXPANSION
Here, we provide analytical results for the effect of shot noise to the solution of the generalized eigenvalue problem [51].First we introduce noise-free matrices that describe the quantum subspace; let us denote the Hamiltonian in the expanded subspace and the overlap matrix denote as H 0 and S 0 , respectively.Note that S 0 ≥ 0. The generalized eigenvalue problem without shot noise can be described as H 0 α 0 = E 00 S 0 α 0 , where α 0 is the error-free ideal solution.In the following discussion, we assume the ideal eigenvalues are not degenerated.Denoting the effects of shot noise as δH and δS, we represent H = H 0 + δH and S = S 0 + δS.Now, we have Here, α n = α on + δ α n and E n = E 0n + δE n with E 0n and α 0n representing the ideal n-th eigenvalue and solution of the generalized eigenvalue problem, respectively.Focusing on the first order of the deviation, we get Now, by expanding δ α n = m nm α 0m , we have By multiplying α † 0n from the left in Eq. (S22), we obtain where we used Multiplying α † 0l (l = n) from the left in Eq. (S22) leads to We also note that from the normalization condition α † n S α n = 1, we can derive nn = − 1 2 α 0n δS α 0n .To summarize, we obtain Let us denote the total number of measurements and the dimension of the subspace as N s and D. Since each element of δH and δS is in the order of O(DN ).We can see {S 1/2 0 α n } n are mutually orthogonal and normalized due to Eq. (S24).Thus, for an arbitrary D × D matrix B, we have where • op denotes an operator norm.Substituting B = δH − E 0n δS to Eq. (S27) results in where • F is the Frobenius norm and we used C op ≤ C F for an arbitrary matrix C, and δH F H F / N s D −2 /2 and δS F S F / N s D −2 /2.Also, we assigned the same number of samples to the measurement of H and S. Here, we assumed A = ρ or A = I/d with d being the system dimension.Note that rescaling the A matrix does not affect the overall accuracy.
When we decompose the Hamiltonian into the linear combination of Pauli operators as H = a h a P a , we get |H ij | ≤ γ with γ = a |h a |.Thus, we obtain where we used E 0n ≤ H op ≤ γ, and also assumed S ij ≤ 1, which is satisfied in highly practical cases such as the power subspace (without additional bases) and fault subspace.Thus, we can achieve the required accuracy ε when N s ≥ 16γ 2 D 4 S −1 0 2 /ε 2 .We further discuss S −1 0 op for the power subspace in the case of D = 2 and D = 3, which is highly practical for near-term devices.For D = 2, when we set A = ρ for the ansatz given in Eq. ( 1), we have and we obtain S −1

S8. EFFECT OF SHOT NOISE ON OBSERVABLE ESTIMATION
As is discussed in the previous section, the values of physical observable necessarily deviates from the ideal ones unless we take infinite number of measurements.Such fluctuations are often called the shot noise.For instance, results provided in Fig. 1 in the main text take the effect of shot noise into account.Each matrix element H ij (or S ij ) is given as ij is the expectation value and δH ij is a Gaussian noise whose amplitude is determined from the variance that arise from projective measurements.In the following, we discuss the variance of estimators for physical observables such as H ij and S ij , which are computed by letting multiple quantum states interfere with each other.

A. Power subspace
It is instructive to start from a case where we have M identical copies to compute the expectation value of an operator O.As was proposed in Ref. [20,21], let us employ the cyclic shift operator S (M ) which act on M systems as follows, where |ψ m denotes an arbitrary quantum state of m-th system.Practically, S (M ) can be regarded as a product of shift operators between two systems as m,m+1 that can be constructed as a tensor product of swap operators acting on all corresponding qubits where S (2) m,m+1 |ψ m ⊗ |ψ m+1 = |ψ m+1 ⊗ |ψ m .After some computation, we can show that the expectation value for the power of quantum state ρ M can be expressed as where we take . The variance for the single-shot estimation of the numerator can be given as where ] is introduced to denote the expectation value computed in the extended Hilbert space, which consists of M systems.It is beneficial to remark on a case when the observable is a linear combination of Pauli operators; we decompose as O = NO k=1 c k P k where P k is a product of Pauli operators and N O denotes the number of terms.Then, the variance is given as Here, we assume that we measure Tr[ρ M P 1 ], Tr[ρ M P 2 ], • • • , Tr[ρ M P NO ] in separate experiments, and therefore the total variance S34 is estimated as a sum of each term.The variance for the single-shot estimation of the denominator can be obtained from a parallel argument as Finally, we can calculate the variance of the estimator of the observable itself by substituting the above expressions into the standard formula.If we take n s measurement shots for every term, the overall variance can be calculated as where the covariance between S (M ) O (M ) and S (M ) is set to zero in our numerical simulation.This is because we have chosen O (M ) to be [S (M ) O (M ) , S (M ) ] = 0.In this case, the denominator and numerator of Eq. (S32) cannot be measured simultaneously, and hence we can ignore the covariance by measuring them separately.Let us also remark that overall variance can be improved, e.g., by weighing the number of measurements according to the coefficient c k .
We have used Eq.(S36) to estimate the variance of the matrix elements H ij and S ij for the power subspace.Note that bases of subspaces {σ i }, including the additional terms ρ m H, are chosen so that all elements are given as Tr[ρ m H k ] (k ∈ Z + ).We later discuss the case when we must compute, e.g., Tr[ρHρH].

B. Fault subspace
Next, we discuss the case when the error-mitigated quantum state is expressed by non-identical quantum states.It is practical to first set M = 2 so that we can describe the case for the fault subspace.Given a quantum state ρ EM = m i,j=0 α * i α j ρ i ρ j with ρ i = ρ(λ i ) and M = 2m, the expectation value of a physical observable O is expressed as where we have defined Based on the discussion of Eq. (S33), we obtain the variance for the single-shot estimation of an element O ij as where we have defined . Also, in a similar way to Eq. (S35), the variance for single-shot estimation of S ij is given as Using the above expressions, we find that the variance under n s measurement shots for each term can be formally given as In the main text, we calculate the variance of the physical observable (in particular the energy H ) in a stochastic way rather than calculating from Eq. (S40).This procedure can be summarized as follows: (1) add a Gaussian noise to H ij and S ij whose variance is determined from Eqs. (S38) and (S39), ( 2) regularize S to omit small/negative eigenvalues, (3) solve the generalized linear equation, and ( 4) repeat ( 1)-( 3) until the variance can be estimated with sufficient accuracy.

C. General case
Here, we briefly comment on the variance using the most general subspace.For simplicity, we focus on the variance of estimating where Q m (m = 1, ..., M ) is taken as a local Pauli operator.This can be measured using the cyclic shift operator S (M ) and and hence the variance of the estimator can be given as follows, where here we have denoted

S9. DETAILS REGARDING THE STRUCTURE OF VARIATIONAL QUANTUM CIRCUITS
In the main text, we have simulated the ground state of the one-dimensional transverse-field Ising model H = − r Z r Z r+1 + h r X r where X r and Z r denote the x-and z-component of the Pauli matrices acting on the r-th site under open boundary condition.Throughout this paper, we have employed the hardware-efficient ansatz structure as shown in Fig. S8, where the repetition number of units has been taken as d = 12 for results in Figs. 1 and 4, and d = 6 for that in Fig. 3. Noisy quantum states are generated by adding local depolarizing noise after each quantum gate.

a t e x i t s h a 1 _ b a s e 6 4 = " 0 n G P 6 c Y o h z s h U b l t i n m a + C 5 q r c 8 = " >
k e 7 p h T 5 + r d X w a r S 9 1 H l W O l p h F c M n k 7 n 3 f 1 U V n l 0 c f q n + 9 O y i h B X P q 8 b e L Y 9 p 3 0 L t 6 G t H F 6 3 c a j b e m K N r e m X / V 9 S k B 7 6 B U X t T b z I i e 4 k Q f 0 D y 5 3 N 3 g 6 2 F R H I p Q Z n F W G r N / 4 o g p j C L e X 7 v Z a S w g T T y f K 7 A K c 5 w H n i W h q W o N N F J l Q K + J o p v I U 1 / A r 2 e i d 8 = < / l a t e x i t > b m z J l 7 7 p y Z U W x d c z 2 i x 5 D U 1 t 7 R 2 d X d E + 7 t 6 x + I R A e H 1 l y r 6 q g i r 1 q 6 5 W w o s i t 0 z R R 5 T / N 0 s W E 7 Q j Y U X a w r l a X G / n p N O K 5 m m a v e n i 0 K h l w 2 t Z K m y h 5 T m w e 0 7 c h m W R f F a J x S 5 M d 4 K 0 g H I I 4 g V q z o N b a x A w s q q j A g Y M J j r E O G y 2 0 L a R B s 5 g q o M + c w 0 v x 9 g U O E W V v l L M E Z M r M V H s u 8 2 g p Y k 9 e N m q 6 v V v k U n b v D y n E k 6 I F u 6 J X u 6 Z a e 6 O P X W n W / R s P L H s 9 K U y v s Y u R o N P f + r 8 r g 2 c P u l + p P z x 5 K m P O 9 a u z d 9 p n G L d S m v r Z / 8 p q b z y b q S b q k Z / Z / Q Y 9 0 x z c w a 2 / q V U Z k T x H m D 0 j / f O 5 W s D a V S s + k K D M d X 1 g M v q I b M U x g k t 9 7 F g t Y x g r y f K 6 B Y 5 z h P P Q i j U g x a a y Z K o U C z T C + h Z T 8 B E L o j V U = < / l a t e x i t > |0i < l a t e x i t s h a 1 _ b a s e 6 4 = " p t s N G 5 b Z Z U 0 2 F v + y 3 i d p r l 2 n / t I = " > b m z J l 7 7 p y Z U W x d c z 2 i x 5 D U 1 t 7 R 2 d X d E + 7 t 6 x + I R A e H 1 l y r 6 q g i r 1 q 6 5 W w o s i t 0 z R R 5 T / N 0 s W E 7 Q j Y U X a w r l a X G / n p N O K 5 m m a v e n i 0 K h l w 2 t Z K m y h 5 T m w e 0 7 c h m W R f F a J x S 5 M d 4 K 0 g H I I 4 g V q z o N b a x A w s q q j A g Y M J j r E O G y 2 0 L a R B s 5 g q o M + c w 0 v x 9 g U O E W V v l L M E Z M r M V H s u 8 2 g p Y k 9 e N m q 6 v V v k U n b v D y n E k 6 I F u 6 J X u 6 Z a e 6 O P X W n W / R s P L H s 9 K U y v s Y u R o N P f + r 8 r g 2 c P u l + p P z x 5 K m P O 9 a u z d 9 p n G L d S m v r Z / 8 p q b z y b q S b q k Z / Z / Q Y 9 0 x z c w a 2 / q V U Z k T x H m D 0 j / f O 5 W s D a V S s + k K D M d X 1 g M v q I b M U x g k t 9 7 F g t Y x g r y f K 6 B Y 5 z h P P Q i j U g x a a y Z K o U C z T C + h Z T 8 B E L o j V U = < / l a t e x i t > |0i < l a t e x i t s h a 1 _ b a s e 6 4 = " p t s N G 5 b Z Z U 0 2 F v + y 3 i d p r l 2 n / t I = " > b m z J l 7 7 p y Z U W x d c z 2 i x 5 D U 1 t 7 R 2 d X d E + 7 t 6 x + I R A e H 1 l y r 6 q g i r 1 q 6 5 W w o s i t 0 z R R 5 T / N 0 s W E 7 Q j Y U X a w r l a X G / n p N O K 5 m m a v e n i 0 K h l w 2 t Z K m y h 5 T m w e 0 7 c h m W R f F a J x S 5 M d 4 K 0 g H I I 4 g V q z o N b a x A w s q q j A g Y M J j r E O G y 2 0 L a R B s 5 g q o M + c w 0 v x 9 g U O E W V v l L M E Z M r M V H s u 8 2 g p Y k 9 e N m q 6 v V v k U n b v D y n E k 6 I F u 6 J X u 6 Z a e 6 O P X W n W / R s P L H s 9 K U y v s Y u R o N P f + r 8 r g 2 c P u l + p P z x 5 K m P O 9 a u z d 9 p n G L d S m v r Z / 8 p q b z y b q S b q k Z / Z / Q Y 9 0 x z c w a 2 / q V U Z k T x H m D 0 j / f O 5 W s D a V S s + k K D M d X 1 g M v q I b M U x g k t 9 7 F g t Y x g r y f K 6 B Y 5 z h P P Q i j U g x a a y Z K o U C z T C + h Z T 8 B E L o j V U = < / l a t e x i t > |0i < l a t e x i t s h a 1 _ b a s e 6 4 = " w v A r 9 p 2 a + b v w q a Y j H n D j w t s 0 I F r U O 8 z M m T P 3 3 D k z o 1 i 6 5 r h E z Y D U 0 9 v X P x A c D A 0 N j 4 y G I 2 P j G c e s 2 q p I q 6 Z u 2 j l F d o S u G S L t a q 4 u c p Y t 5 I q i i 6 z m 8 q g z 5 z I y / H 2 B E 4 R Z W + U s w R k q s 2 U e 9 3 m 1 G 7 A W r 1 s 1 P V + t 8 y k m d 5 e V U c T p m e 6 p S U / 0 Q K / 0 8 W u t u l + j 5 e W I Z 6 2 t F U 5 h 9 H Q y 8 / 6 v q s K z x M G X 6 k / P E i U s + 1 4 N 9 u 7 4 T O s W e l t f O z 5 v Z l b S 8 f o s 3 V C D / V / T C z 3 y D a z a m 3 6 b E u k r h P k D k j + f u x P k 5 h P J x Q S l F m K r a 8 F X 9 G M a M 5 j j 9 1 7 C K j a w i S y f e 4 g z X O A y 1 F D G l E l l q p 2 q h A L N B L 6 F E v s E u r O M L w = = < / l a t e x i t >

• • •
< l a t e x i t s h a 1 _ b a s e 6 4 = " X S T k 9 + a F 9 z m 8 q g z 5 z I y / H 2 B E 4 R Z W + U s w R k q s 2 U e 9 3 m 1 G 7 A W r 1 s 1 P V + t 8 y k m d 5 e V U c T p m e 6 p S U / 0 Q K / 0 8 W u t u l + j 5 e W I Z 6 2 t F U 5 h 9 H Q y 8 / 6 v q s K z x M G X 6 k / P E i U s + 1 4 N 9 u 7 4 T O s W e l t f O z 5 v Z l b S 8 f o s 3 V C D / V / T C z 3 y D a z a m 3 6 b E u k r h P k D k j + f u x P k 5 h P J x Q S l F m K r a 8 F X 9 G M a M 5 j j 9 1 7 C K j a w i S y f e 4 g z X O A y 1 F D G l E l l q p 2 q h A L N B L 6 F E v s E u r O M L w = = < / l a t e x i t >

• • •
< l a t e x i t s h a 1 _ b a s e 6 4 = " X S T k 9 + a F 9 o j m l 4 k u g l p H R 1 9 / T 2 9 Q + E B 4 e G R 0 Y j Y + M 5 z 6 6 6 u s j q t m m 7 O 5 r q C d O w R F Y a 0 h Q 7 j i v U i m a K b a 2 8 3 t r f r g n X M 2 x r S x 4 5 I l 9 R 9 y 2 j Z O i q Z C q 3 p x d t 6 R U i M U q Q H 9 F O k A x A D E F s 2 p E 7 7 K E I G z q q q E D A g m R s Q o X H b R d J E B z m 8 q g z 5 z I y / H 2 B E 4 R Z W + U s w R k q s 2 U e 9 3 m 1 G 7 A W r 1 s 1 P V + t 8 y k m d 5 e V U c T p m e 6 p S U / 0 Q K / 0 8 W u t u l + j 5 e W I Z 6 2 t F U 5 h 9 H Q y 8 / 6 v q s K z x M G X 6 k / P E i U s + 1 4 N 9 u 7 4 T O s W e l t f O z 5 v Z l b S 8 f o s 3 V C D / V / T C z 3 y D a z a m 3 6 b E u k r h P k D k j + f u x P k 5 h P J x Q S l F m K r a 8 F X 9 G M a M 5 j j 9 1 7 C K j a w i S y f e 4 g z X O A y 1 F D G l E l l q p 2 q h A L N B L 6 F E v s E u r O M L w = = < / l a t e x i t >

S10. ERROR-EXTRAPOLATION METHOD
Here, we briefly review the error-extrapolation method used for the quantum error mitigation [5].Let us assume that a quantum state can be characterized by an error-control parameters as ρ(λ ), where is a fixed value which can be considered as the minimum achievable noise level and λ ≥ 1 is the stretch factor.The expectation value of any observable O will be expressed as a power series around its noise-free value O * as The noise-free value O * can be calculated from n + 1 results corresponding to stretch factors {λ i } n i=0 as where β i = j =i λ j (λ j − λ i ) −1 are solutions of linear equation Since {β i } is common among any physical observable, this is equivalent to estimate the zero-noise density matrix as ρ ex = i β i ρ(λ i ).We remark that Eq. (S44) can be used as long as O(λ ) is controlled by λ.Namely, the error-extrapolation method can be applied not only for the calculation of the raw expectation value but also for the calculation with the VD/EES method such as Tr[ρ M (λ i )H]/Tr[ρ M (λ i )].
While the error-extrapolation technique is highly practical, it is crucial to perform perfect control over the stretch factor λ.This is demanding for the actual experimental setup, and therefore it is highly possible that λ deviates from the actual target value.Figure S9 shows that the effect of such an unwanted fluctuation can be somewhat suppressed by employing data points calculated from the VD/EES method.However, it is evident from Fig. S10(a) that the GSE method can estimate the error-free observable much more efficiently because such an effect barely affects the construction of the quantum subspace.This can be confirmed to be true for various error levels, especially when we take a sufficient number of measurement shots n s so that the effect of shot-noise is negligible (See Fig. 4 in the main text for a case with n s → ∞).When n s is finite, the result by the GSE method may not necessarily satisfy E GSE ≥ E GS where E GS is the exact ground-state energy (See Fig. S10(b)).However, we numerically confirm that the fluctuation is much smaller and hence is more reliable.
It must be noted that the GSE method using the fault subspace can outperform the conventional error-extrapolation method even when the stretch factor λ i is precisely known.Let us consider distilling the error-extrapolated density matrix ρ ex using M = 2 copies as ρ 2 ex /Tr[ρ 2 ex ].We can see that this state is included in the variational ansatz employed in the GSE method , i.e., ρ EM = ij α * i α j ρ(λ i )ρ(λ j ).When ρ ex is a physical state, this also holds for higher orders since we can always construct a variational ansatz that includes ρ M ex /Tr[ρ M ex ].For instance, we can simply take σ i = ρ(λ i ) and A = ρ M −2 ex which yields ρ EM = ij α * i α j ρ(λ i )ρ M −2 ex ρ(λ j ).
< l a t e x i t s h a 1 _ b a s e 6 4 = " r w A g A j / I q W T r v M l w S c q 1 D I g Y d V A = " > A A A C d 3 i c h V H L S s N A F D 2 N 7 / q q u h F c W C y K q 3 I j i u K q 6 M a l r 6 p g p S R = ρ M /Tr[ρ M ], FIG. S2.The quantum circuits for computing γ (ij) kk when U1 and V1 are (a) local Pauli operators or (b) unitary operators.Since γ (ij) kk is a complex number, we need to calculate the real and imaginary parts by measuring Pauli-X and -Y operators of the ancilla qubit, respectively.ζ (ij) kk can be similarly evaluated by excluding the operator Pa from the circuit.
FIG. S3.Simulation of all 16 eigenstates of 1d transverse-field Ising model with N = 4 qubits.(a) Eigenenergies computed from the solution of the generalized eigenvalue problem (S17), which is based on the energy-based variational principle.Here, GSE+ indicates that we additionally include non-Hermite term ρ m H (m = 0, 1, ..., M/2 ) to the bases of subspace.The squares, triangles, and circles denote the data from VD/EES, GSE, and GSE+ methods, respectively.In (b) we display the result for the computation using Eq.(S18), which follows from the variance-based variational principle.We can see that the GSE method can efficiently mitigate the error over the entire eigenspectra, which is even more significant in the GSE+ method.The variational ansatz consists of depth d = 20 and the total error number is Ntot = 3.0.
FIG.S4.Simulation of 6 minimal eigenstates using the GSE method with the fault subspace (blue) and the error-extrapolation method for the VD/EES calculation using M = 2 copies (orange) for N = 8 qubit system.Here, we adopt the energy-based variational principle.As in the calculation merely for the ground state, we clearly observe that the error-extrapolation method suffers from uncertainty in the stretch factor of errors.Here, we fix the error unit to be = 1.0, and generate 500 sets of noisy quantum states ρ( λi ) where λi = λi + N (0, λi σ 2 ) for λi ∈ {1, 2, 3} and σ = 0.1.We assume that each Pauli term is estimated without any shot-noise.The variational ansatz consists of depth d = 18.
FIG. S5.Simulation of all 16 eigenstates of N = 4 qubit system using the GSE method with the fault subspace with energybased (blue) and variance-based variational principle (red) and the error-extrapolation method for the VD/EES calculation using M = 2 copies (orange).The variational ansatz consists of depth d = 20.The inset shows the log-scale plot of energy error ∆E for individual eigenstate.
FIG.S6.Analysis on the performance of error-mitigated states that are effectively realized by the VD/EES, GSE, and GSE+ methods.(a) Quantum state fidelity F between the error-mitigated states ρ and the exact ground state ρGS.The inset shows log plot of the infidelity 1 − F .(b) Trace distance ρ − ρGS 1 with its log plot provided as inset, and (c) expectation values of 2-body correlators Z0Zr where r is the site index.In all panels, the blue, orange, and red points denote data by VD/EES, GSE, and GSE+ methods.In (c), we fix the number of copies to M = 2.The black stars indicate the exact values.The variational ansatz with depth d = 6 is optimized via the VQE algorithm.The depolarizing error rate p dep is taken so that expected number of total error in ρ (n) is given as Ntot = Ngatep dep where Ngate is the number of gates.Here, we set Ntot = 2.5.

− 1 2 s 1 2s 2 s
), we have δE n = O(DN − ) and δ α n = O(DN − 1 n D y U p o r Z u r j d E F P 7 P + c H u i W b + D U X s 3 L g i i e I s 4 P k P v e 7 p 9 g Z T K b m 8 5 S Y S o 9 v x A 9 R S d G M I Y J 7 v c M 5 r G E P M p h n 4 9 x i r P Y s z a o D W u j z V Q t F m k G 8 S W 0 z A c d A 4 z M < / l a t e x i t > ⇥d < l a t e x i t s h a 1 _ b a s e 6 4 = " X S T k 9 FIG. S8.Structure of hardware-efficient ansatz employed in this work.

< l a t e x i t s h a 1 _ b a s e 6 4 =
FIG.S9.Error-extrapolation method to estimate the zero-noise limit of the ground-state energy.Red and orange dots denote the estimated energy from raw quantum states and VD/EES method using M = 2 copies, respectively.The colored lines are the extrapolation fit for noisy values to obtain the noise-free values, and the black dotted line indicates extrapolation under perfect noise control.For each error unit , we generate 500 sets of noisy quantum states ρ( λi ) where λi = λi + N (0, λiσ 2 ) for λi ∈ {1, 2, 3} and σ = 0.1.