Lattice Renormalization of Quantum Simulations

With advances in quantum computing, new opportunities arise to tackle challenging calculations in quantum field theory. We show that trotterized time-evolution operators can be related by analytic continuation to the Euclidean transfer matrix on an anisotropic lattice. In turn, trotterization entails renormalization of the temporal and spatial lattice spacings. Based on the tools of Euclidean lattice field theory, we propose two schemes to determine Minkowski lattice spacings, using Euclidean data and thereby overcoming the demands on quantum resources for scale setting. In addition, we advocate using a fixed-anisotropy approach to the continuum to reduce both circuit depth and number of independent simulations. We demonstrate these methods with Qiskit noiseless simulators for a $2+1$D discrete non-Abelian $D_4$ gauge theory with two spatial plaquettes.

Alongside the necessary hardware improvements, theoretical questions must be resolved to fully utilize a digital quantum computer. Due to finite resources, one must regulate the quantum field theory. This regularization occurs in multiple steps: discretization, digitization, state preparation, propagation, and evaluation. Each can introduce new operators and potentially break symmetries. In addition, quantum noise can be interpreted as additional terms in the Hamiltonian. In order to recover the physical theory, the effects of regularization and quantum noise must be renormalized.
Following classical lattice field theory (LFT), it seems natural to first regularize the theory by discretizing spacetime. Then one could represent the (Minkowski or Euclidean) spacetime lattice in the qubits. This allows direct access to the entire path integral. The authors of [22,23] suggest this is useful for finite-density field theory. Alas, the number of qubits scales with the spacetime volume V, albeit improving upon the e^V scaling of classical computations. For time-dependent field theories, the preferred method is to use the Hamiltonian formalism. In this case, the number of qubits scales with the spatial volume. Discretization reduces spacetime symmetries and introduces new operators into the LFT which are not present in the continuum theory, modifying the nonperturbative renormalization.
In this paper we investigate the renormalization of LFT in Minkowski spacetime due to trotterizing U(t). We will show that trotterization introduces a temporal lattice spacing, along with new operators depending upon it which vanish in the Hamiltonian limit.
In the continuum limit, Minkowski and Euclidean results are the analytic continuation of each other [128-130]. At finite a_t and finite statistics, this exact relation is complicated, but approximate relations remain [9,131-134]. While analytic continuation of lattice observables suffers from signal-to-noise problems [92,134-141], observables suitable for scale setting have been studied [142,143]. Since knowledge of a, a_t is required for any continuum extrapolation, performing scale setting with classical computations would significantly improve the common error budget of quantum computations, as the uncertainties for scale setting could be reduced using Euclidean data. We will explore two different schemes for performing analytic continuation of the renormalized lattice spacings, and demonstrate their capabilities for reliable Minkowski scale setting through classical Euclidean computations.
A crucial part of our study is to explore how Minkowski lattice observables computed with quantum circuits can be extrapolated to the continuum in an efficient manner. We will show that trotterized time-evolution can be understood as a Minkowski path integral on an anisotropic lattice. We present a toy model, a D_4 gauge group in 2+1D with two spatial plaquettes, to exemplify the power of a fixed-anisotropy trajectory for extrapolating quantities to the continuum limit. As a first step, this requires establishing the scale setting for the lattice spacings a, a_t, which can profit from our analytic continuation schemes. This paper is organized as follows. In Sec. II we briefly review the Euclidean lattice action formalism and its connection through the transfer matrix to the Hamiltonian formalism. In Sec. III we derive the trotterized real-time evolution operator and relate it to the transfer matrix. Based on this, we propose two schemes to obtain Minkowski lattice spacings via analytic continuation and advocate the use of a fixed-anisotropy approach to the continuum. In Sec. IV, we discuss the systematic errors from computing a, a_t via analytic continuation. Further, in Sec. V we present numerical results in our toy model for these techniques. Finally, we conclude in Sec. VI.

II. LATTICE BASICS
To understand how renormalization arises in quantum simulations, it is useful to review the connection between the Kogut-Susskind Hamiltonian [96] and the Euclidean Wilson action. We summarize the derivation of [145], which begins with the anisotropic Wilson action in Euclidean time τ = it, defined on a spacetime lattice in terms of temporal and spatial plaquettes U_i (i = t, s) formed from gauge links given by elements of the group. The anisotropy comes from using different couplings for spatial and temporal plaquettes, with β_i(a, a_0), g_i(a, a_0) depending nonperturbatively on the temporal and spatial lattice spacings a_0, a. The first step in the process of computing physical observables from LFT is the determination of the lattice spacings through scale setting. For simplicity we will consider the isotropic case, β_E ≡ β_t = β_s, a = a_0, and in analogy with Eq. (2) define β_E = 2N g_E^{-2} for an SU(N) group. The anisotropic case merely requires performing the procedure for both a and a_0 independently. One scale sets by computing a lattice quantity a m(β_E), where m has a known physical value m_phys (e.g. the pion mass). Any lattice m(β_E) differs from the true m_phys by a-dependent errors, but for this one specific observable we set m(β_E) = m_phys to obtain a dimensionful value a = [a m(β_E)]/m_phys. With this, β_E is removed from our theory and we can speak only in terms of a. All other lattice masses can then be written as a m_k(a), and their continuum values can be predicted by computing them at multiple lattice spacings and extrapolating to a → 0. When working with continuous gauge theories, there is no theoretical issue with computing at arbitrarily small a, but one is limited by computing resources due to topological freezing and critical slowing-down. On quantum devices, the current resources require dramatic approximations of the continuous group. Here, we will consider discrete gauge theories.
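The scale-setting logic above, fixing a from one observable and then extrapolating all others to a → 0, can be sketched with purely synthetic numbers; every value below is invented for illustration and does not come from any simulation:

```python
import numpy as np

# Hypothetical scale-setting data: lattice pion masses a*m_pi computed at
# several bare couplings beta_E (values are illustrative only).
m_phys = 0.140                           # GeV, physical pion mass input
am_pi = np.array([0.070, 0.105, 0.140])  # lattice values a*m_pi
# Scale setting: set m(beta_E) = m_phys to obtain a dimensionful a [GeV^-1].
a = am_pi / m_phys

# A second observable a*m_k at the same spacings (synthetic, fabricated
# with an O(a^2) lattice artifact on top of a "true" continuum value).
m_k_true = 0.500  # GeV, used only to generate the fake data
am_k = a * (m_k_true + 0.1 * a**2)

# Continuum extrapolation: fit m_k(a) = m_k(0) + c*a^2 and take a -> 0.
slope, m_k_continuum = np.polyfit(a**2, am_k / a, 1)
```

The fitted intercept recovers the continuum value; in practice several lattice spacings and a careful choice of fit ansatz are needed.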
Certain discrete subgroups of continuous groups are effective field theories for the continuous groups [146,147], which break down below a minimum lattice spacing [30,31,38]. Therefore one cannot take the lattice spacing a of a discrete group arbitrarily close to zero.
Lattice quantities like a m(β_E) are obtained from correlation functions, e.g. the temporal correlator ⟨O_i(n a_0) O_j(0)⟩. In the limit where the temporal length of the lattice goes to infinity, this correlator becomes a vacuum expectation value. From these correlators, one extracts a_0 m_k, which correspond to the lattice eigenenergies. For scale setting, one usually wants the lowest energy state a_0 m_1 of a specific sector, which can be extracted from the sum by taking n large, where excited-state contamination is suppressed by e^{-n a_0 ΔE}, with ΔE the energy gap between the lowest energy state and the next lowest energy state. An equivalent way of expressing the renormalized parameters is by defining the anisotropy parameter ξ ≡ a/a_0 and considering ξ and a as independent parameters. By allowing ξ ≠ 1, and especially ξ ≫ 1, lattice practitioners have achieved great success in probing glueballs [148,149], high-temperature thermodynamics [150], etc. As we approach the Hamiltonian limit (a_0 → 0), another coupling, g_H^2 = g_s g_t, and the bare speed of light, c = g_s g_t^{-1}, become useful. These bare couplings are related to each other in the weak coupling limit through functions c_i(ξ) [151,152]. The c_i(ξ) were computed perturbatively for SU(N) at ξ = ∞ for D = 4 in [151]. This was generalized to arbitrary ξ in [152] and to arbitrary dimensions in [153].
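The large-n extraction of a_0 m_1 from a temporal correlator can be sketched as follows, using a synthetic two-state correlator whose energies and overlaps are invented for illustration:

```python
import numpy as np

# Synthetic Euclidean correlator: two states with lattice energies a0*E_k
# and overlaps A_k (illustrative values, ground state a0*m_1 = 0.3).
a0E = np.array([0.3, 0.9])
A = np.array([1.0, 0.5])
n = np.arange(0, 30)
C = (A[None, :] * np.exp(-np.outer(n, a0E))).sum(axis=1)

# Effective mass a0*m_eff(n) = log(C(n)/C(n+1)) plateaus at a0*m_1 once the
# excited-state contamination, suppressed by exp(-n*a0*DeltaE), dies off.
m_eff = np.log(C[:-1] / C[1:])
a0m1 = m_eff[-1]
```

The plateau value at large n gives the scale-setting quantity a_0 m_1; the approach to the plateau is governed by the gap ΔE exactly as described above.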
For typical values of β_i considered in simulations, there are large corrections to these weak-coupling results, and thus a nonperturbative determination of a, ξ is required [154-156]. In pure gauge theory, one method to determine ξ is to compare ratios of spatial-spatial Wilson loops to spatial-temporal Wilson loops [157]. Once ξ is measured, a can be determined using standard methods such as the Sommer scale r_0 [148,149,158] or the Wilson flow [159,160].
Euclidean lattice theories satisfying reflection positivity have a well-defined Hamiltonian with real eigenvalues [128,161]. To derive this Hamiltonian, we first define a transfer matrix T(a, a_0) which takes a state at time τ, |τ⟩, to |τ+1⟩. T is related to the action through the partition function, Z = Tr T^N, where N is the number of temporal lattice sites. The matrix elements ⟨τ+1|T(a_0)|τ⟩ follow from [145], where we have symmetrically split the potential term. In order to extract a Hamiltonian from T(a_0), it is convenient to have T(a_0) only in terms of a single time slice. While for U_s this presents no issues, U_t couples the same link at two times, U_ij(x, τ) and U_ij(x, τ+1), via two time-like links U_0i(x, τ), U_0j(x+1, τ). Fixing to the temporal gauge, U_0i = 1, simplifies the kinetic term in the action. To proceed, we need to remove the dependence on U†_ij(τ+1) and express T(a_0) in terms of operators. The link operator is easy to define: Û_ij|τ⟩ = U_ij|τ⟩, so the potential term can be written in terms of Re Tr Û_s.
For T_K, we need an operator that changes a given link; this operator has the group property R_ij(g) R_ij(h) = R_ij(gh). This gauge link translation can be used to define a conjugate momentum to Û_ij by performing a rotation on U_ij(x, τ+1). With this, we can write T_K as a product over all spatial links U_ij(τ) of group integrals of the form ∫ Dg R_ij(g) e^{β_t Re Tr g} (Eq. (13)). Any group element can be written as g = e^{iω·λ}, where the λ_i are the adjoint generators, and R_ij(g) = e^{iω·l_ij} can be written in terms of the generators l_ij for that representation. Defining Π_α(Dω_α) J(ω) as the invariant group measure with Jacobian J, it is possible to rewrite T_K(a_0) as Eq. (14). This transfer matrix is exact for any a_0. At the cost of a sum over all character functions of the group, the integral can be performed analytically. This is done in [112] and appears practical for discrete groups, but for continuous groups the summation contains infinitely many terms, which is undesirable. As a_0 → 0 for a unitary continuous group, Eq. (14) can be expanded to O(ω²), leaving Gaussian integrals; integrating yields Eq. (16), where N is an overall normalization. For discrete groups, the contribution to T_K is also dominated by group elements close to 1 when a_0 → 0. However, since for a discrete group g cannot be arbitrarily close to 1, naively taking a_0 → 0 leads to a degenerate spectrum for T_K. Special care has to be taken [162] to avoid this degeneracy, as we also show for the D_N group in Appendix A. Neglecting the normalization factor N, the final transfer matrix T(a_0) of Eq. (17) combines T_K with the potential term in Re Tr Û_s. Since T(a_0) corresponds to the translation from τ to τ+1, it can be used to define a Hamiltonian H(a_0, a) ≡ −a_0^{-1} log T(a_0). These steps form the link between S_E and T(a_0) in Fig. 1.
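Numerically, the relation H = −a_0^{-1} log T can be illustrated on a toy Hilbert space; the matrix below is an arbitrary Hermitian stand-in, not the D_4 transfer matrix:

```python
import numpy as np
from scipy.linalg import expm, logm

a0 = 0.1  # temporal lattice spacing (illustrative)

# A toy Hermitian "lattice Hamiltonian" on a 3-dimensional Hilbert space.
H = np.array([[0.0, 0.2, 0.0],
              [0.2, 1.0, 0.3],
              [0.0, 0.3, 2.0]])

# The transfer matrix advances states by one temporal lattice unit.
T = expm(-a0 * H)

# Recovering the Hamiltonian as -(1/a0) * log T.
H_rec = -logm(T).real / a0
```

For a reflection-positive theory T is positive, so the matrix logarithm is well defined and H_rec is real and Hermitian.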
However, because l_ij and Û_ij are non-commuting operators, writing T(a_0) as a single exponential requires the Baker-Campbell-Hausdorff (BCH) relation, Eq. (19). Using this, we obtain H(a, a_0) of Eq. (20). For conciseness, we define certain Hamiltonian terms in Eq. (21). It is important to emphasize that varying either a_0 or a requires an adjustment of c(a, a_0) and g_H(a, a_0) to preserve the scale-setting condition. Taking the continuous-time limit of the transfer matrix, the BCH terms in H(a, a_0) vanish and we obtain the Kogut-Susskind Hamiltonian [96], H_KS ≡ −(1/τ) log T(τ) (see Fig. 1), given in Eq. (23). Besides the ξ-dependent terms, another difference between H_KS and H(a, a_0) is that c, g_H only depend upon a in Eq. (23), while in Eq. (20) they also depend on a_0.
Historically, the Hamiltonian formalism was used with limited success to evaluate lattice theories by computing results analytically [163-166], variationally [167-170], and with exact diagonalization [171]. In such cases, there was no benefit to keeping a_0 finite, and therefore all were done in the Hamiltonian limit. In contrast, quantum simulations have good reasons to consider the Hamiltonian at finite a_0, as we demonstrate in the next section.
FIG. 1. Schematic of the relations between the various lattice and continuum functions discussed in this work.

III. TROTTERIZATION AND TIME-EVOLUTION
The starting point for real-time evolution on quantum computers is to define U(t) = e^{−iHt}. For gauge theories, one typically takes the lattice Hamiltonian to be H_KS in Eq. (23). This U(t) cannot be implemented easily on quantum computers and must be approximated, resulting in effects that have to be renormalized. Trotterization discretizes the evolution into N = t/a_t steps formed of products of e^{i x_i H_j}, where different choices of x_i ∝ a_t lead to errors at different orders of a_t, and the H_i are a decomposition of H into mutually noncommuting terms; in the case of H_KS, there are two such terms, the kinetic and potential ones. The number of terms and the x_i for a given O(a_t^p) error can be derived by repeated applications of the BCH relation, Eq. (19); at O(a_t^2), this corresponds to the symmetric splitting of Eq. (27). By trotterizing U(t), we have introduced a temporal lattice spacing a_t. One might be tempted to believe that the renormalized a_t is a parameter that can be directly tuned, but this is incorrect. This can be seen by inserting Eq. (21) into Eq. (27), from which one observes that, in the same way as in the Euclidean results, a_t is always multiplied by [c(a, a_t) a]^{-1}. Thus, changes in a_t are compensated by modifying the bare speed of light. It is therefore natural to define a lattice bare parameter which we can control in simulations, δ_t ≡ a_t/[c(a, a_t) a] (Eq. (28)). Thus we find that a_t must be determined nonperturbatively by scale setting. Naively, a_t, a would be obtained by performing a quantum simulation, which on near-term hardware is likely to be noisy, and therefore the lattice spacings will have large uncertainties. Since all other lattice observables depend upon the scale setting, minimizing the uncertainties of a, a_t is crucial to the overall program of quantum simulations of LFT. In practice, keeping these uncertainties small requires high statistics of very deep circuits with error mitigation.
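The O(a_t²) accuracy of a symmetric second-order splitting can be checked numerically with generic non-commuting random matrices standing in for the kinetic and potential terms (a toy check, not the D_4 circuit):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

def rand_herm(n):
    """Random Hermitian matrix standing in for a Hamiltonian term."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (A + A.conj().T) / 2

# Two non-commuting pieces standing in for the kinetic and potential terms.
HK, HV = rand_herm(4), rand_herm(4)
H = HK + HV

def trotter2(dt):
    # Symmetric second-order step: e^{-i HV dt/2} e^{-i HK dt} e^{-i HV dt/2}
    return expm(-0.5j * dt * HV) @ expm(-1j * dt * HK) @ expm(-0.5j * dt * HV)

def err(dt):
    # Deviation of one Trotter step from the exact evolution over dt.
    return np.linalg.norm(trotter2(dt) - expm(-1j * dt * H), 2)

# The per-step error is O(dt^3), so halving dt shrinks it by roughly 2^3 = 8
# (giving the global O(a_t^2) error over a fixed evolution time t).
ratio = err(0.1) / err(0.05)
```

The observed ratio near 8 confirms the third-order per-step (second-order global) scaling that defines the O(a_t²) scheme.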
Instead of computing a_t, a on the quantum device, it is possible to utilize classical Euclidean computations of a, a_0 for scale setting on a quantum computer. If the Minkowski and Euclidean lattice Hamiltonians were the same, we could trivially identify a, a_0 with a, a_t, establishing a link between the lattice results in each metric (see Fig. 1). However, the Hamiltonians only match when a_t, a_0 → 0, when they both reduce to H_KS. Instead, at finite a_t, a_0 they differ, as we show explicitly in the following. In analogy to Eq. (28), we define the quantity δ_τ ≡ a_0/[c(a, a_0) a] on the Euclidean side. With this, we can recast T(a_0) of Eq. (17) as Eq. (29), with the corresponding Hamiltonian terms. Taking δ_τ → iδ_t in Eq. (29), we analytically continue to what we recognize as Eq. (25) written in terms of bare parameters. From these, we can define the dimensionless Hamiltonians using only bare parameters, Eq. (32), where δ = δ_τ or iδ_t depending on the metric signature. From this we see that there are differing signs for the BCH terms in real and imaginary time. Given that the Hamiltonians differ, correlation functions and the scale-setting observables a_t m and a_0 m must also differ, even if we take δ_t = δ_τ. But these differences arise at O(δ_t², δ_τ²) and vanish in the continuous-time limit δ_t = δ_τ = 0.
One possible scheme (Scheme A) for using a, a_0 to determine a, a_t would be to simply neglect these O(δ_t², δ_τ²) errors and assume the two sets of scales are equal for the same g_H and δ_τ = δ_t. A benefit of this scheme is that only one set of Euclidean couplings is simulated. While this scheme introduces an O(δ_t², δ_τ²) systematic error into the scale setting, one could easily imagine it being tolerable compared to errors from near-term quantum computers.
In principle, this systematic error can be reduced by observing that if one takes δ_τ → iδ_t, then the two Hamiltonians agree. Formally, this means that the eigenvalues m_k(δ_t, g_H²) are the analytic continuation of m_k(δ_τ, g_H²). While the spatial correlators in Minkowski space behave exactly like the Euclidean ones of Eq. (5) with the replacement of m_k(δ_t, g_H²), the temporal correlators require Eq. (34), where |k⟩ are the Minkowski eigenstates. While the excited-state contributions do not decrease as n a_t → ∞, provided that we can isolate a single scale-setting parameter m, the lattice results a_t m(a, a_t) are the analytic continuation of a_0 m(a, a_0). This suggests a scale-setting scheme (Scheme B) with reduced systematic error through analytic continuation. In contrast to Scheme A, this procedure requires the determination of the lattice spacings at multiple values of g_H, δ_τ in the region around the desired lattice spacings. With this set of values, one derives a fit function for a(δ_τ, g_H²), a_0(δ_τ, g_H²). This function can then be analytically continued to Minkowski space, reducing the nonperturbative BCH errors. The effectiveness of this method, like all analytic continuations of lattice results, depends sensitively on the statistics and the fitting function, as we will discuss later.
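A minimal sketch of the Scheme B workflow, assuming synthetic Euclidean data that is exactly even in δ_τ (coefficients invented for illustration): fit the scale in even powers of δ_τ, then continue δ_τ → iδ_t, which flips the sign of the δ² term while keeping the result real:

```python
import numpy as np

# Synthetic Euclidean "scale" data, even in delta_tau by construction;
# the coefficients are illustrative, not from any lattice calculation.
lam0, lam2, lam4 = 0.30, 0.05, 0.01
dtau = np.array([0.1, 0.2, 0.3, 0.4, 0.5])
lam_euc = lam0 + lam2 * dtau**2 + lam4 * dtau**4

# Step 1: fit in even powers of delta_tau (quadratic in x = dtau^2),
# which guarantees the analytically continued result is real.
c4, c2, c0 = np.polyfit(dtau**2, lam_euc, 2)

# Step 2: analytic continuation delta_tau -> i*delta_t, i.e.
# delta^2 -> -delta_t^2 in the fitted even polynomial.
def lam_minkowski(dt):
    return c0 - c2 * dt**2 + c4 * dt**4

a_t_scale = lam_minkowski(0.3)
```

With noisy data the fit quality (ε_B in Sec. IV) controls how reliable the continued value is; here the data is exact, so the continuation is too.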
In the preceding discussion, the relation between the lattice spacings depends upon the Hamiltonians being analytic continuations of each other. Traditionally, Euclidean calculations are performed with an action like Eq. (1). Different actions correspond to different lattice Hamiltonians. For the Wilson action, we observe that H(a, a_0) of Eq. (20) is not the exact lattice Hamiltonian being computed, but arises only when we expand T_K to O(ω²). Thus, a systematic mismatch occurs if one tries to scale set a, a_t with Wilson action results. Furthermore, including the higher-order ω terms from Eq. (1) leads to a non-trivial dependence on δ_τ. This causes a non-Hermitian Hamiltonian upon analytic continuation [172], although this behavior may prove manageable [130]. In contrast, using an action with a heat-kernel kinetic term (the Laplace-Beltrami operator) [173] together with a Wilson single-plaquette potential term, the higher-order ω terms vanish and the mapping is exact. Another approach to obtain a Hermitian lattice Hamiltonian useful in Minkowski space is to analytically continue the character expansion of T_K(δ_τ = δ_t, g_H²) term by term [9,172]. With observables computed at multiple a_t, a, one can perform continuum extrapolations analogous to the Euclidean calculations. Influenced by nonrelativistic results, the literature has emphasized first approaching the Hamiltonian limit a_t → 0, then taking a → 0. This procedure introduces two inefficiencies. First, taking the continuum limit as a two-step procedure requires m × n separate simulations, at m values of a_t for each of n values of a in the (a, a_t) plane, extrapolating each fixed-a set to a_t → 0, and then extrapolating the remaining a-dependent results to a → 0. Second, because the uncertainty in the a → 0 extrapolation is controlled by the quality of each a_t → 0 extrapolation, one desires smaller a_t. This increases gate costs ∝ δ_t^{-1}.
Instead of first approaching the Hamiltonian limit, a more efficient trajectory is to compute a set of points a, a_t → 0 at fixed ξ_t. This clearly reduces the total number of quantum simulations required. Additionally, since only one extrapolation is performed, lattice errors are easier to control. This would allow larger a_t. We thus expect the fixed-ξ_t trajectory to avoid deep quantum circuits and to suffer less from noise.

IV. THE ERRORS OF SCALE-SETTING IN MINKOWSKI METRIC
Although the analytic continuation between Euclidean and Minkowski scales proposed in Scheme B is formally exact, in practice one can only perform the analytic continuation from fits to discrete, noisy Euclidean data a(δ τ , g 2 H ), a 0 (δ τ , g 2 H ). This leads to a signal-to-noise problem when performing the analytic continuation [138]. Indeed, to achieve a certain precision the Euclidean data has to be exponentially more accurate: intuitively, this is because excitations caused by BCH operators decay exponentially in τ but oscillate in t. As a result, low-energy observables in Euclidean lattices tend to be less sensitive to the variation of δ than their Minkowski counterparts. Hence the analytic continuation is ill-posed [174]. Fortunately, at small δ, the calculations with lower-energy states are less influenced by the higher order BCH operators, and hence one can reasonably tame the errors intrinsic to the Euclidean data. This observation is of crucial relevance for our scale-setting procedure.
On the other hand, the difference between Minkowski and Euclidean renormalized lattice scales must be smaller for smaller trotter steps δ_{t,τ}, as both H(δ_τ, g_H²) and H(iδ_t, g_H²) are closer to the same Hamiltonian limit. This implies that there might be a parameter space where Scheme A yields more accurate results for Minkowski scale setting, leading one to question the feasibility of Scheme B. In this section, we give upper bounds for the scale-setting error of both schemes and discuss whether the advantage of analytic continuation in a given scheme is balanced out by its errors.
For δ_τ = δ_t, Scheme A approximates λ(iδ_t, g_H²) as λ_m(δ_τ, g_H²), with the error given by Eq. (36), where λ(0, g_H²) is the corresponding energy gap evaluated in the continuous-time limit δ = 0. The first two terms on the RHS of Eq. (36) quantify the errors from the BCH contributions. The last term is ε_A, the statistical error of the Euclidean temporal scale at δ_τ. We obtain the constraint of Eq. (37) on the trotterization error according to the Bauer-Fike theorem [175]. At small δ, H(δ, g_H²) − H(0, g_H²) is dominated by the leading-order BCH commutators of order |δ|².
Here d is the number of spatial dimensions, d_U is the dimension of the representation of the group element U in H_V, N_link is the number of links in the spatial lattice, and ‖l̂²‖ is the spectral norm of l̂²; we have introduced additional definitions in the second line. The inequality of Eq. (38) is bounded by M + ε_A for δ ≤ max δ. As M quantifies the upper limit of the trotterization error in Eq. (37), and also the error of Scheme A, we refer to M as the error-bound parameter from the BCH expansion.
As Scheme B requires knowledge of the functional dependence of the scales on g_H and δ_τ, we assume that the theoretical λ(δ, g_H²) is both analytic and even in powers of δ within the radius |δ| ≤ max δ, i.e. λ(δ, g_H²) = λ(−δ, g_H²). This is based on the following perturbative argument: for small |δ|, the BCH commutators in H(δ, g_H²) − H(0, g_H²) can be treated as perturbations to the Hamiltonian limit H(0, g_H²). With the second-order trotterization, all the non-vanishing BCH commutators depend on even orders of δ. Therefore, perturbatively, the difference in the spectra of H(δ, g_H²) and H(0, g_H²) should be analytic and even in powers of δ order by order. One thus fits the functional form λ_f(δ_τ, g_H²) for the Euclidean temporal scales in even powers of δ_τ. This guarantees that the analytic continuation λ_f(iδ_t, g_H²) is real, thus avoiding nonunitarity in the Minkowski metric.
Define ε_B as the maximum deviation of λ_f(δ_τ, g_H²) from the theoretical λ(δ_τ, g_H²) across the Euclidean region, such that |λ_f(δ_τ, g_H²) − λ(δ_τ, g_H²)| ≤ ε_B for all 0 < δ_τ ≤ max δ. ε_B is affected by both the precision of the measurements on the Euclidean data and the fitting procedure. Since the computational resources required grow with decreasing δ_τ, |λ_f(0, g_H²) − λ(0, g_H²)| is likely to be the largest deviation. The region δ ≤ max δ is defined such that the O(δ⁴) terms are at most equal to the O(δ²) terms. Therefore, within this region, a quartic λ_f(δ_τ, g_H²) should reasonably approximate λ(δ_τ, g_H²). By performing calculations at three or more δ_τ, each with statistical error ε_A, one would expect the value of |λ_f(0, g_H²) − λ(0, g_H²)|, and therefore ε_B, to be larger than ε_A only by an order-one factor. In addition, we will require the condition in Eq. (C2): |λ_f(δ, g_H²) − λ(δ, g_H²)| ≤ M + ε_B in the whole complex region satisfying |δ| < max δ. As shown in Appendix C, we derive an upper bound, Eq. (42), on the error of the temporal scale setting for Scheme B, with 0 < ω(δ_t) = (4/π) arctan(δ_t/max δ) < 1 following Lemma 1 in [174].
The advantage of using Scheme B can be seen by comparing the ratio of Eq. (42) to Eq. (41) when setting ε_A = ε_B = ε. The error in Scheme A has a quadratic dependence on δ_t. When ε/M ≪ 1, the growth of the errors in Scheme B is delayed with respect to that in Scheme A until δ_t is very close to max δ, as shown in Fig. 2. It follows that this advantage of Scheme B over Scheme A requires ε/M < 0.053; at ε/M = 0.053, the derivatives of the two prefactors with respect to δ_t coincide at δ_t = max δ. Ultimately, the overall accuracy of the scale setting in both schemes is controlled by the magnitude of the error-bound parameter M. All of the above holds unless M is so large that simulations at very small δ_t values, which are computationally very expensive, are necessary to keep the overall accuracy under control.
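The qualitative behavior of the two bounds can be sketched numerically. The functional forms below are schematic stand-ins for Eqs. (41) and (42), not the exact expressions: Scheme A's error is modeled as growing quadratically in δ_t on top of ε, and Scheme B's as M^ω ε^{1−ω} with the ω(δ_t) quoted above; M, ε, and max δ are illustrative values:

```python
import numpy as np

max_delta = 0.5      # radius of the Euclidean fitting region (illustrative)
M, eps = 1.0, 0.02   # BCH error-bound parameter and Euclidean precision

def omega(dt):
    # The interpolating exponent from Lemma 1 of [174].
    return (4.0 / np.pi) * np.arctan(dt / max_delta)

def err_A(dt):
    # Schematic Scheme A bound: quadratic growth in delta_t plus statistics.
    return M * (dt / max_delta) ** 2 + eps

def err_B(dt):
    # Schematic Scheme B bound: interpolates between eps (dt -> 0)
    # and M (dt -> max_delta) as omega runs from 0 to 1.
    w = omega(dt)
    return M**w * eps ** (1.0 - w)

# For eps/M << 1, err_B stays near eps well into the region where
# err_A has already grown substantially.
delayed = err_B(0.25) < err_A(0.25)
```

This reproduces the qualitative statement above: for small ε/M, Scheme B's error growth is delayed until δ_t approaches max δ.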
The result that Scheme B performs better at small ε/M has a clear physical interpretation. Correcting BCH errors via analytic continuation, as Scheme B does, becomes more important when the error-bound parameter M is larger. On the other hand, in Scheme B the analytic continuation is sensitive to the Euclidean precision, as seen from Eq. (42), while Scheme A is less affected. For Scheme B to yield a smaller error, a certain accuracy of the functional form λ_f(δ_τ, g_H²) has to be achieved. This leads to the practical concern that a large amount of resources for the Euclidean calculation might be required at small couplings, as the signals become weaker. In addition, for the low-energy states involved in the scale-setting procedure, the error-bound parameter M could be smaller than the estimate from Eq. (39). Therefore, we expect both Eq. (41) and Eq. (42) to be conservative bounds, as one can confirm by comparing Tab. I and Fig. 7. In such cases, a smaller ε would be required to obtain the same values of ε/M that govern the suppression of the errors in Scheme B.
In Table I we estimate, for D_4 models with 2 and 3 spatial dimensions and different numbers of plaquettes, numbers of links, and values of the bare coupling g_H, the largest possible trotter step compatible with 0.1 errors on the temporal scale setting for Scheme A and Scheme B, assuming ε = 0.02. We observe that, as expected, the advantage of Scheme B in allowing relatively large trotter steps at a given systematic error level is remarkable for larger systems and stronger couplings, which in turn correspond to larger values of M. The absolute error in Tab. I can be converted to the relative error in Fig. 7 by use of a factor δ_t/(a_t m_1×1) ≈ 1 from Tab. II.

V. NUMERICAL RESULTS
In this section we present a concrete demonstration of some of the theoretical ideas discussed above, using a two-plaquette theory with the discrete non-Abelian gauge group D_4. We perform classical simulations of a quantum computer using qiskit [176,177]. These calculations are performed without modeling realistic noise sources, corresponding to a perfect, error-free quantum computer.
We simulate the D_4 gauge field on the two-dimensional lattice shown in Fig. 3 (FIG. 3: The lattice geometry used for the D_4 gauge simulation. The plaquettes are given by U†_2 U†_0 U_3 U_0 and U†_3 U†_1 U_2 U_1; dashed lines indicate repeated links due to the periodic boundary conditions). This lattice represents the smallest two-dimensional lattice which cannot be reduced to a one-dimensional theory. The simulation requires five D_4 registers and uses a total of 17 qubits: 12 for physical degrees of freedom, 3 for an ancillary group register, and 2 ancillary qubits. Note that, for brevity, we have broken with the notation of previous sections, referring to a link not by the source and sink sites but with a single index i = 0 . . . 3.
We define a trace on D_4 (not uniquely specified by the group structure) by embedding D_4 into U(2) and defining the trace via the fundamental representation of that Lie group. The embedding D_4 < U(2) is generated by the elements σ_x and iσ_z. The Hamiltonian terms are given in Eq. (43), where log T^(i)_K is the one-link kinetic term for the i-th link, determined as discussed in Appendix A2. In total, the quantum simulations entailed ∼200 entangling gates per δ [112]. This is roughly the resources recently used in [178-180], suggesting that a single step of time evolution may be possible on current quantum devices.
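The embedding just described can be constructed explicitly: generating the closure of σ_x and iσ_z under matrix multiplication yields the eight elements of D_4, and the traces of the fundamental representation give the values entering Re Tr U:

```python
import numpy as np

# The embedding D4 < U(2) from the text, generated by sigma_x and i*sigma_z.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
isz = 1j * np.array([[1, 0], [0, -1]], dtype=complex)

# Build the group by closure under multiplication, starting from the identity.
elems = [np.eye(2, dtype=complex)]

def known(g):
    return any(np.allclose(g, h) for h in elems)

frontier = [sx, isz]
while frontier:
    g = frontier.pop()
    if not known(g):
        elems.append(g)
        frontier.extend([g @ sx, g @ isz])

order = len(elems)  # D4 has 8 elements
# Distinct traces in the fundamental representation of the embedding.
traces = sorted({round(np.trace(g).real, 6) for g in elems})
```

The closure terminates at the 8 elements of D_4, all unitary, with real traces {−2, 0, 2}; these are the values Re Tr g can take in the transfer-matrix sums.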
Stochastic state preparation has been demonstrated for thermal states in D 4 [84] and particle-like states in Z 2 [85].
While these results are promising, the initial states are found to have contamination from excited states that complicates the analysis. Therefore, to simplify the study of trotterization and the continuum limit, we use exact diagonalization of the Kogut-Susskind Hamiltonian to compute the eigenvalues and then construct our initial state as in Eq. (44), where |ψ_i⟩ corresponds to the i-th excited state. By preparing such initial states, the corresponding time-dependent state should be Eq. (45). As pointed out in [85], at finite a_t trotterization mixes Kogut-Susskind eigenstates through nonzero transition matrix elements ⟨ψ_k|H(a, a_t)|ψ(0)⟩. As a_t → 0, one can show that the time-dependent state in Eq. (46) reduces to Eq. (45). Performing measurements of the operator O on this state with a quantum computer yields Eq. (47), where l ∈ [0, N] is the integer trotter step, and c_k,j, s_k,j incorporate both the mixing effects entailed in λ_k and the matrix elements of O. The E_k here and in the following are eigenvalues of H(a, a_t) and have explicit dependence on a_t. This expression, analogous to Euclidean lattice field theory operators, can be used to fit the energies of states. It is useful here to compare the fitting procedure to that of Euclidean LFT. In Euclidean space, ⟨O(τ)⟩ ∼ Σ_i α_i e^{−E_i τ}, and thus taking τ → ∞ acts as a low-pass filter which removes higher-energy states. In this way, for sufficiently large τ, ⟨O(τ)⟩ should be exponentially dominated by a single state E_1 and one can extract a_0 E_1. In contrast, ⟨O(t)⟩ ∼ Σ_i β_i e^{−iE_i t}, and thus the excited-state contribution to the lattice matrix element does not decrease as t → ∞. This lack of a natural low-pass filter is why real-time evolution can access matrix elements that could be inaccessible to Euclidean LFT; it also means that the excited-state contamination from imprecise state preparation and trotterization cannot be trivially removed.
In this sense, the sampling advantage of quantum computers could be jeopardized by errors induced from excited state contamination.
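The fit to the oscillating form of Eq. (47) can be sketched with a synthetic single-mode signal; the energy and amplitudes below are invented for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic Minkowski "measurement" of a single oscillating mode,
# <O(t)> = c*cos(E t) + s*sin(E t); parameters are illustrative only.
E_true, c_true, s_true = 0.80, 0.6, 0.2
t = np.linspace(0.0, 20.0, 60)   # t = l * delta_t for integer trotter steps l
signal = c_true * np.cos(E_true * t) + s_true * np.sin(E_true * t)

def model(t, E, c, s):
    # One term of the sum over states in the form of Eq. (47).
    return c * np.cos(E * t) + s * np.sin(E * t)

# Fit the oscillation frequency, i.e. the lattice energy, from the signal.
popt, _ = curve_fit(model, t, signal, p0=[0.7, 0.5, 0.1])
E_fit = popt[0]
```

With real data several modes contribute and, since oscillations never decay, all of them must be fit simultaneously rather than filtered away as in Euclidean time.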
In this work, we will use two different spatial Wilson loops as our operators O, which have different quantum numbers and are therefore sensitive to different eigenstates. The first is the left plaquette, O_1×1 = Re Tr U_0 U_3 U†_0 U†_2, which, following the construction of [112], can be measured without any ancillary qubits. The second is the Wilson loop over the entire lattice, which requires an ancillary group register to compute [181]. For each of these operators, we construct different initial states from Eq. (44).
The computations are done for multiple values of $g^2_H(a, a_t) \in [0.71, 1.25]$ and $\delta_t \in [0.01, 0.7]$, with $N\delta_t \in [10, 20]$. The circuits used are detailed in [112]. The BCH contributions vanish as $g^2_H(a, a_t) \to 0$, and as a result the matrix elements $c_{k,j}$, $s_{k,j}$ in Eq. (47) vanish, with only $c_{i,0}$ and $s_{i,0}$ surviving. Thus the statistical errors required to resolve oscillations in $\langle O(t)\rangle$ must be decreased accordingly, calling for an increased number of shots. An additional complication in the approach to the continuum limit is that the gaps $a_t m_j - a_t m_i \to 0$ as $g^2_H \to 0$, and thus contamination due to trotterization errors grows unless $\delta_t$ is decreased as well. Together, these cause the cost of the approach to the continuum to scale poorly. For our model, we find that for $g^2_H(a, a_t) > 1$, the required number of shots for the Qiskit noiseless simulator qasm ranged from 1600 to 64000 as we decreased $\delta_t$ and $g^2_H$. For this reason, we utilized the state vector simulator, which reports exact probability distributions, for $g^2_H(a, a_t) \leq 1$ in order to investigate the continuum limit at reduced computational cost. This emphasizes the importance of being able to perform calculations at large $a$, $a_t$ to reduce the quantum resources required. For the $D_4$ theory, the eigenvalues of $H_V$ (Eq. (43)) are $1/g^2_H \times \{-4, -2, 0, 2, 4\}$. States evolved under the one-step time evolution operator $e^{i\delta_t H_V/2}$ built from $H_V$ will acquire phases in the range $\delta_t/(2g^2_H) \times [-4, 4]$. For the $\delta_t$ and $g^2_H$ chosen, the phase differences are smaller than $2\pi$, so that one can resolve states with different potential energies.
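The phase-resolution argument can be checked in a few lines. The eigenvalues are those of $H_V$ quoted above; the particular $\delta_t$ and $g^2_H$ below are example points from the scanned ranges, not a specific run from the paper.

```python
import numpy as np

# Check that one-step phases from the H_V spectrum stay within a 2*pi
# window, so states of different potential energy remain distinguishable.
# The eigenvalues are the D_4 values quoted in the text; dt and g2 are
# example points from the scanned ranges.
g2 = 0.71                                   # example g_H^2
dt = 0.7                                    # largest trotter step scanned
eigs = np.array([-4, -2, 0, 2, 4]) / g2     # eigenvalues of H_V
phases = dt * eigs / 2                      # phases from exp(i*dt*H_V/2)
spread = phases.max() - phases.min()        # total phase window
resolvable = spread < 2 * np.pi             # no 2*pi aliasing of states
```

Even at the largest $\delta_t$ and smallest $g^2_H$ of the scan, the phase window stays below $2\pi$.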

V.1. Scale Setting in Minkowski Spacetime
We evaluate $\langle O_{1\times1}(t)\rangle$ to investigate the effect of the renormalization of the temporal scale in the Minkowski calculation, performing fits to Eq. (47) and extracting the lowest energy gap $a_t m_{1\times1}$ from $\langle O_{1\times1}(t)\rangle$. The initial state is constructed from Eq. (44) with excited state $i = 2$. An example of these calculations is found in Fig. 4 for fixed $g^2_H(a, a_0) = 1.11$ and three different $\delta_t = \{0.5, 0.25, 0.1\}$. Comparing the trotterized results to the continuous-time ones calculated classically, the state contamination can be clearly observed and decreases for smaller values of $\delta_t$. The results for $a_t m_{1\times1}$ over the whole range of $g^2_H$ and $\delta_t$ are found in Table II. We observe that the bare mass $a_t m_{1\times1}/\delta_t$ changes by only 3% when $\delta_t$ increases from 0.1 to 0.5 at $g^2_H = 1.11$, while the mixing effects change by around 20%, as seen by comparing the $c_{k,j}$, $s_{k,j}$ amplitudes in Fig. 4. In the Euclidean lattice calculation, using the relation in Eq. (7) obtained in the weak-coupling limit [151-153], the perturbative renormalized anisotropy is given by Eq. (48). Even without a functional form for $c_i(\xi)$, one can see that solving Eq. (48) self-consistently for $\xi$ will break $a_0 \propto \delta_\tau$. Euclidean studies find these corrections to be small [151-153], suggesting that the nonlinear renormalization is generically small. While the exact functions $g^2_H(a, a_0)$ and $c_i(\xi)$ depend upon the metric, the form of Eq. (48) should remain unchanged. It is possible to investigate Eq. (48) in Minkowski space by rewriting it as $a_t = c(a, a_t)\, a\, \delta_t$. For the case of $g^2_H = 1.25$ plotted in Fig. 5, one sees that $a_t$ has a clear $\delta_t^2$ dependence. Furthermore, Fig. 6 shows that the linear dependence of $a_t$ upon $\delta_t$ is a $g^2_H$-dependent function. Together, these demonstrate the renormalization of $a_t$ with respect to the bare coupling $g_H$ and the trotter step $\delta_t$.
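The $\delta_t^2$ dependence of $a_t/\delta_t$ visible in Fig. 5 amounts to a linear fit in $\delta_t^2$. A minimal sketch with synthetic stand-in numbers (the coefficients 0.08 and 0.02 are hypothetical, not fitted values from the paper):

```python
import numpy as np

# Synthetic stand-in for the Fig. 5 analysis: with a_t = c1*dt + c3*dt**3,
# the ratio a_t/dt is linear in dt**2, so a first-order polyfit in dt**2
# separates the leading coefficient from the delta_t^2 correction.
dts = np.array([0.1, 0.2, 0.3, 0.4, 0.5])   # example trotter steps
a_t = 0.08 * dts + 0.02 * dts**3            # mock "measured" a_t (hypothetical)
slope, intercept = np.polyfit(dts**2, a_t / dts, 1)
# intercept -> leading a_t/dt coefficient; slope -> dt**2 correction
```

In practice the $a_t$ values would carry fit errors from Eq. (47), which should be propagated into a weighted fit.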
Although for our toy model with two spatial plaquettes we cannot extract the spatial scale $a$ from numerical results, we can relate $a$ to the running of $g$ using the perturbative result of [182], Eq. (49), in which $b_0$, $b_1$ are the standard $g^3$, $g^5$ coefficients of the perturbative $\beta$ function. Altogether, to extract physical properties from quantum simulations, we must first determine both scales explicitly.

V.2. Comparison of Schemes for Scale Setting
To reduce the quantum resources for the lattice calculation, we determine the scale using Euclidean data, following the procedure proposed in Sec. III. As a demonstration, we focus on the temporal lattice spacing and show its determination using Scheme A and Scheme B.
We solve for the eigenvalues of the Euclidean transfer matrix built from $H_K$ and $H_V$ in Eq. (43). By taking the logarithm of these eigenvalues, we extract the energies of the spectrum in Euclidean spacetime, with the ground-state energy normalized to zero. The observable $a_0 m_{1\times1}$ is the energy of the 2nd excited state. With the set of bare couplings $1/g^2_H$ and $\delta_\tau \in [0.01, 0.5]$, we obtain $a_0 m_{1\times1}$; in addition, we apply a 1% error to $a_0 m_{1\times1}$ for every bare coupling and $a_0$ used, in order to mimic the measurement error of classical Monte Carlo simulations. We then perform a 2D fit to obtain the functional dependence of $a_0 m_{1\times1}$ on both $\delta_\tau$ and $g^2_H$. Given that the spatial lattice spacing $a$ is related to $g^2_H$ via Eq. (49) and that the $f_i$ functions should be power-law expansions in $a$, the 2D fit ansatz for $a_0 m_{1\times1}$ is built accordingly. The fitted $a_0 m_{1\times1}$ ($\chi^2/\mathrm{d.o.f} = 1.3$) is at most 3% away from its true value at 68% C.L. With this, we can then analytically continue with either scheme. In Fig. 7, we show the percent systematic error for both schemes compared to the $a_t$ computed by diagonalizing $U(a_t)$ in Sec. V.1 (which agrees with Tab. II). The colored region shows the errors from the $1\sigma$ band of the Euclidean fit. For the three $g^2_H$ used in Fig. 7, the errors are $\leq 2\%$ at 68% C.L., with a weak dependence on $\delta_\tau$. As expected, Schemes A and B give precise evaluations of $a_t$ at small $\delta_t$. When $\delta_t$ gets larger, the BCH errors of Scheme A should increase as $\delta_t^2$. BCH errors shrink for smaller $g^2_H$, which lowers the error-bound parameter $M$. As shown in Fig. 7, for $g^2_H \leq 0.33$ ($\sim M$), Scheme A provides the better estimation of the lattice scales over the whole region $\delta_t < 0.5$. The errors from the analytic continuation of Scheme B barely increase for larger values of $\delta_t$, and are in principle less sensitive to $g^2_H$, as expected.
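A minimal sketch of such a 2D fit, assuming a generic power-law ansatz in $\delta_\tau$ and $g^2_H$ (not the paper's exact functional form) and synthetic data with 1% errors:

```python
import numpy as np
from scipy.optimize import curve_fit

# Sketch of a 2D fit of a0*m_{1x1} against (delta_tau, g_H^2). The ansatz
# below is a generic power-law stand-in, not the paper's exact form, and
# the data are synthetic with the 1% errors assumed in the text.
def ansatz(X, p0, p1, p2):
    dtau, g2 = X
    return dtau * (p0 * g2 + p1 * g2**2 + p2 * g2 * dtau**2)

dtau = np.repeat([0.01, 0.1, 0.3, 0.5], 3)     # example delta_tau grid
g2 = np.tile([0.71, 1.0, 1.25], 4)             # example g_H^2 grid
truth = ansatz((dtau, g2), 0.9, 0.05, 0.1)     # hypothetical true surface

rng = np.random.default_rng(0)
data = truth * (1 + 0.01 * rng.standard_normal(truth.size))  # 1% "errors"

popt, pcov = curve_fit(ansatz, (dtau, g2), data,
                       sigma=0.01 * truth, absolute_sigma=True,
                       p0=[1.0, 0.0, 0.0])
```

The covariance `pcov` then provides the $1\sigma$ band used for error propagation into the Minkowski spacings.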
For the assumed precision in the Euclidean data and the fitting procedure, the error is slightly larger in the small-$g^2_H$ region, and altogether a delayed growth of the errors is maintained for $g^2_H \geq 0.71$. One can conclude that if $g^2_H$ is not particularly small, Scheme B is more likely to render the lowest errors in the scale determination. In both cases, the observed 2% errors are well within Eq. (41) and Eq. (42), which predict a 10% error bound for $\delta_t \leq \delta_t^{A(B)}$ as shown in Tab. I. For sufficiently large systems, where the renormalized spatial lattice spacing can be measured, it can similarly be obtained using only Euclidean data.

V.3. Approaching the Continuum
In Euclidean LFT, one uses $a$, $a_0$ for extrapolating to the continuum limit. This is because performing extrapolations in terms of the bare parameters requires either very high-order fit functions (and therefore many calculations) or calculations with very small $g_H$ and $\delta_\tau$ (at large computational cost). Instead, using $a$, $a_0$ is an extrapolation in nonperturbative variables, so the bare couplings can be much larger and the fit functions simpler. The same should be true in Minkowski spacetime. In principle, one can approach the continuum along any trajectory in $a$, $a_t$, but experience from Euclidean LFT [157,160] suggests that a trajectory of fixed $\xi_t$ would smoothly and efficiently extrapolate to the continuum limit of Minkowski LFT. In approximations to $U(t)$ other than trotterization [99-102], definitions of a nonperturbative scale like $a_t$ are currently unknown, making the extrapolation to the continuum nontrivial.
In the continuum limit $a, a_t \to 0$, and thus $a_t m_i$ vanishes. One must instead examine finite physical quantities such as $a_t m_i / a_t m_{1\times1}$. Due to the smallness of our lattice, the only states which mix with $|\psi_2\rangle$ are cutoff states, for which $a_t m_i / a_t m_{1\times1} \to \infty$. Therefore, we perform another fit to $\langle O_{2\times1}(t)\rangle$, from which we extract a second physical mass, $a_t m_{2\times1}$. The initial state for this purpose is constructed with $i = 13$. With these, we can study different approaches to the continuum limit. As the scales set from real-time evolution are already available (Tab. II) and Scheme B gives comparable precision, in the following we use $a_t$ from Tab. II. We note that the precision of the spacings extracted from real-time evolution would be much worse once noise is included.
Numerically, the first trajectory to the continuum is to take $a_t \to 0$ first (the Hamiltonian limit) and then take $a \to 0$. Since the lattice-spacing errors of $H(a, a_t)$ are $O(a^2, a_t^2)$, we fit the mass ratio to Eq. (52). In Fig. 8 and Fig. 9, we show the best-fit lines of Eq. (52) to the data points extracted from the qasm and state vector simulators. The values of $\kappa_0$ correspond to the Hamiltonian-limit value of the mass ratio, and we tabulate them in Table III. We find good agreement between the $\kappa_0$ in Tab. III and those calculated by direct diagonalization of the Hamiltonian with $\delta_t = 0$ for different $g^2_H$. From Fig. 8, we see that the difference in $a_t m_{2\times1}/a_t m_{1\times1}$ is only 8% between its value at $\delta_t = 0.5$ and the Hamiltonian limit for $g^2_H = 1.11$, while from Fig. 4, for the same values of $g^2_H$ and $\delta_t$, $\langle O_{1\times1}(t)\rangle$ can differ from the Hamiltonian limit by up to 20%. The deviation in $a_t m_{2\times1}/a_t m_{1\times1}$ reflects BCH corrections to the eigenvalues, while the latter reflects corrections to the wavefunctions. Thus fewer resources (a larger trotter step) may suffice to control the error on $a_t m_{2\times1}/a_t m_{1\times1}$ below a given threshold. Assuming errors scaling as $O(a^2)$ and using Eq. (49), we can fit the $\kappa_0$ in Tab. III to the continuum limit using Eq. (53). In contrast to extrapolating in $a_t$, the reliability of this extrapolation depends much more sensitively on being in the perturbative regime; otherwise Eq. (49) receives large corrections and our functional form will be inadequate. Moreover, Eq. (53) requires a nonlinear fit in $g^2_H$, which introduces complicated correlations between the fitting parameters. Despite these limitations, we find that the continuum-limit result $\lambda_0 = 2.03(4)$ agrees with the continuum value $m_{2\times1}/m_{1\times1} = 2$ computed from direct diagonalization of the Hamiltonian with $g^2_H \sim 0$ and $\delta_t = 0$. The mass ratio in the continuous-time limit, the $\kappa_0$ in Tab. III (black points), and the best-fit line using Eq. (53) are shown in Fig. 10 (black line, with a grey shaded region for the 68% C.L.).
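The Hamiltonian-limit extrapolation is, in essence, a quadratic fit in the trotter step. A sketch under that assumption, with mock mass-ratio values (the numbers 2.1 and 0.3 are hypothetical, not the paper's $\kappa$ values):

```python
import numpy as np

# Hamiltonian-limit extrapolation in the spirit of Eq. (52): assume the
# mass ratio behaves as R(dt) = kappa0 + kappa2*dt**2 and extrapolate to
# dt -> 0. The data here are hypothetical mock values.
dts = np.array([0.5, 0.25, 0.1])        # trotter steps used in the text
R = 2.1 + 0.3 * dts**2                  # mock mass ratios (hypothetical)
kappa2, kappa0 = np.polyfit(dts**2, R, 1)   # linear fit in dt**2
# kappa0 is the Hamiltonian-limit estimate of the mass ratio
```

With real simulator data one would weight each point by its fit error from Eq. (47) and carry the $\kappa_0$ uncertainty into the subsequent continuum fit of Eq. (53).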
The number of trotter steps $N$ (and therefore the gate cost) is proportional to $\delta_t^{-1}$. It is therefore clearly undesirable to first take the Hamiltonian limit $a_t \to 0$ ($\delta_t \to 0$) and then the continuum limit $a \to 0$. Instead of working at fixed $a$ while decreasing $\delta_t$, one can work at fixed $\xi_t = a/a_t$ and approach the continuum limit $a, a_t \to 0$ directly along this trajectory. Up to $O(a^2)$, the extrapolation in terms of the bare coupling would be of analogous functional form to the right-hand side of Eq. (53), with the finite-$\xi_t$ effects entering the fitting parameters. It is thus important to explore how to define the finite-$\xi_t$ trajectory. One could perform this tuning on the quantum computer. This would proceed by computing $a_t$, $a$, or $\xi_t$ using some physical observables (e.g. Wilson loops or dispersion relations) and then adjusting the bare parameters until one has a fixed-$\xi_t$ trajectory. This procedure requires multiple quantum simulations to be performed, albeit at larger $\delta_t$ than an extrapolation to the Hamiltonian limit requires, and only the scale-setting observable needs to be computed. But, as demonstrated, we can reliably extract the dependence of the scales $a$, $a_t$ on $\delta_t$ and $g^2_H$ by analytic continuation using Scheme B. We can then directly invert these relations to find the $\delta_t$ and $g^2_H$ for a fixed $\xi_t$, avoiding the quantum computer entirely for scale setting.
Given the smallness of our system, we are unable to extract $a$ or the corresponding $\xi_t$ to determine a set of $\delta_t$ and $g^2_H$. This forces us to consider some approximations. While the previous discussions demonstrate renormalization, since the nonperturbative renormalization effects partially cancel [152], we expect the renormalization of $\xi_t$ to be milder than that of $a$, $a_t$ individually. Therefore, we approximate the fixed-$\xi_t$ trajectory as one of fixed $\delta_t$. In Fig. 10, we show together the extrapolation to the continuum limit through the Hamiltonian limit (gray line and black dots), as well as two fixed values $\xi_t \approx \{10, 2\}$, to highlight the power of fixed-$\xi_t$ trajectories in reaching the continuum limit. Indeed, following the trajectory of $\xi_t \approx 10$ (red circles), we successfully reach the correct continuum limit with clear advantages over the Hamiltonian extrapolation method. For $\xi_t \approx 10$, the number of simulated data points needed is reduced by 67% compared to the Hamiltonian-limit procedure, which requires all the measurements shown in Fig. 8 and Fig. 9. In addition, the circuit depth is greatly reduced by avoiding small trotterization steps, e.g. $\delta_t < 0.1$. In Fig. 10, we further investigate an even smaller $\xi_t \approx 2$, hence $\delta_t = 0.5$ (blue triangles), which implies a further improvement in circuit-depth demands. Due to the large uncertainties of the quantum simulations, represented by the large error bars in Fig. 10, we calculate $a_t m_{2\times1}/a_t m_{1\times1}$ through explicit diagonalization of the time evolution operator at fixed $\xi_t \approx 1/\delta_t = 2$. We find that the continuum limit is also successfully reached for this large value of the trotter step. Although multiple simulations would be needed to control the measurement uncertainties on a quantum computer, this highlights the advantage of fixed-$\xi_t$ calculations in avoiding the larger errors from enhanced circuit depth.
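Once $a_t(\delta_t, g^2_H)$ has been fitted from Euclidean data, the bare parameters for a target anisotropy follow from a one-dimensional root solve. The sketch below assumes a hypothetical fitted form $a_t = c_1\delta_t + c_2\delta_t^3$ (a stand-in for the Scheme B fit) and an assumed spatial spacing $a$.

```python
import numpy as np
from scipy.optimize import brentq

# Invert a fitted relation a_t(dt) to find the trotter step that realizes
# a target anisotropy xi_t = a/a_t. The functional form and all numbers
# are hypothetical stand-ins for the Scheme B fit output.
a = 0.5                    # assumed spatial lattice spacing at this g_H^2
c1, c2 = 0.08, 0.02        # assumed fitted coefficients of a_t(dt)
xi_target = 10.0           # target fixed anisotropy

def xi_of_dt(dt):
    a_t = c1 * dt + c2 * dt**3
    return a / a_t

# xi_of_dt is monotonically decreasing in dt, so a bracketing root-finder
# locates the unique dt with xi(dt) = xi_target.
dt_star = brentq(lambda dt: xi_of_dt(dt) - xi_target, 1e-3, 5.0)
```

Repeating this at each $g^2_H$ along the trajectory yields the set of $(\delta_t, g^2_H)$ pairs for the fixed-$\xi_t$ approach without any quantum-computer scale setting.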

VI. CONCLUSIONS
The natural construction of quantum field theories on quantum computers is as a lattice-regularized theory. Any meaningful calculation must therefore be renormalized and taken to the continuum limit before comparison to experiments can be made. As we have discussed, this involves many hurdles and it will not be easy. Typically quantum simulations are constructed within the Hamiltonian formalism where a spatial lattice with spacing a is time-evolved. Further approximations are required, as the time evolution operator U(t) built from the Hamiltonian usually cannot be efficiently implemented.
In this article, we have tackled three important issues related to the simulation of Minkowski lattice field theories on quantum computers. Firstly, we discussed how renormalization in the form of a temporal lattice spacing $a_t$ arises from trotterizing $U(t)$. Secondly, by relating this trotterized time evolution operator to the Euclidean transfer matrix, we proposed two different schemes using analytic continuation to set the Minkowski lattice spacings. The most straightforward scheme is to simply equate the Euclidean lattice spacings to those in the real-time calculation (Scheme A). This scheme introduces $O(\delta_t^2)$ errors in addition to the statistical errors in the Euclidean calculations. Our second method (Scheme B) can correct the $O(\delta_t^2)$ errors of Scheme A by analytically continuing the best-fit function of the Euclidean spacings. We have derived conservative bounds on the systematic errors of these schemes for small $\delta_t$, and a loose constraint on the error of the fit needed to take advantage of Scheme B. This enables us to reduce the quantum resources required for the scale-setting procedure. Thirdly, we showed that by taking a fixed-anisotropy $\xi_t$ approach to the continuum limit, one can further reduce the number of simulations, with the added benefit of shallower circuits and lower required gate fidelities. We demonstrated these ideas for a $2+1$D discrete non-Abelian $D_4$ model using Qiskit noiseless simulators.
The results of this work suggest a number of follow-ups. In particular, these procedures could be tested in the near term for $Z_2$ gauge theories in 2+1 dimensions on multiple lattice sizes, allowing for the incorporation of finite-volume effects and the explicit calculation of $a$. Additionally, since the quantum resources increase as $a_t, a \to 0$, improved Hamiltonians that account for both lattice effects and quantum noise could accelerate the approach to the continuum while reducing the quantum resources required. Finally, extending the error bounds on our scale-setting schemes to larger $\delta_t$ would be well motivated to ensure that the systematic errors introduced remain under control.
and is inherently positive due to the positivity of $T^1_K$ when $\beta_t > 0$ for finite $n$. Notice that in Eq. (A3), the group elements with $m = 1$ contribute only to the second piece in each summation of $c_r$ for $d_r = 1$ and do not contribute to $c_{2,k}$. For the $D_4$ group considered in Sec. V, we have $c_{1,1} = 6 + e^{2\beta_t} + e^{-2\beta_t}$, $c_{2,1} = e^{2\beta_t} - e^{-2\beta_t}$, and $c_{1,2} = c_{1,3} = c_{1,4} = -2 + e^{2\beta_t} + e^{-2\beta_t}$ (A4).
The corresponding $H_K$, after normalizing the ground-state energy to zero, is written in the character basis $|r, l\rangle$ as Eq. (A5), $H_K \sim \sum_{r,l} \log(c_{1,1}/c_r)\,|r, l\rangle\langle r, l|$, where $l$ labels the degeneracy of the $d_r^2$ modes in the irreducible representation $r$. The trotterization error discussed in Sec. IV is proportional to the maximal eigenvalue of $H_K$. Recall that $\beta_t = a/(g_t^2 a_0) = 1/(g_H^2 \delta_\tau)$. In the small-$a_0$ region, the contributions to the $c_r$ in Eq. (A4) are all dominated by the $e^{2\beta_t}$ terms, which quantify the contribution from the identity group element ($j = m = 0$). Taking the limit $a_0 \to 0$ for a discrete group would therefore result in degenerate spectra, whereas the energy ratios should instead approach a finite value in the $a_0 \to 0$ limit. To obtain non-degenerate spectra and keep these energy ratios fixed, one needs to view $\beta_t$ and $\beta_s$ as parameters, without referring to their dependence on $a$ and $a_0$ in Eq. (2). The continuous-time limit should then be taken while keeping the condition of [162], where $g_d^2(a)$ is some finite constant for a given spatial lattice spacing. For the numerical part of our simulation in Sec. V, we fix $\delta_t = \delta_\tau = 1$; the corresponding dimensionful spectrum in the small-$g_H^2$ limit matches the spectrum of the above Hamiltonian limit upon identifying $\exp(-2/g_H^2(a))/c = g_d^2(2a)/2$. This relation also indicates that, for the same measured spatial lattice spacing, the bare coupling for the discrete group should be exponentially suppressed compared to the bare coupling for a continuous group.