Lower bounding ground-state energies of local Hamiltonians through the renormalization group

Given a renormalization scheme, we show how to formulate a tractable convex relaxation of the set of feasible local density matrices of a many-body quantum system. The relaxation is obtained by introducing a hierarchy of constraints between the reduced states of ever-growing sets of lattice sites. The coarse-graining maps of the underlying renormalization procedure serve to eliminate a vast number of those constraints, such that the remaining ones can be enforced with reasonable computational means. This can be used to obtain rigorous lower bounds on the ground state energy of arbitrary local Hamiltonians, by performing a linear optimization over the resulting convex relaxation of reduced quantum states. The quality of the bounds crucially depends on the particular renormalization scheme, which must be tailored to the target Hamiltonian. We apply our method to 1D translation-invariant spin models, obtaining energy bounds comparable to those attained by optimizing over locally translation-invariant states of n & 100 spins. Beyond this demonstration, the general method can be applied to a wide range of other problems, such as spin systems in higher spatial dimensions, electronic structure problems, and various other many-body optimization problems, such as entanglement and nonlocality detection.


I. INTRODUCTION
A central task in quantum theory is the study of quantum many-body systems.To start, it is the key problem in condensed matter, quantum chemistry, and high energy physics.Beyond that, the nature of the quantum correlations arising in such systems is a central topic in quantum information (entanglement theory), as well as in quantum foundations, where the validity of physical theories is tested through the correlations they exhibit (such as Bell experiments, or experiments testing the limits of quantum theory).However, solving the quantum many-body problem is extremely challenging, both analytically and computationally.Its hardness stems from the rapid growth of the number of parameters that are required to specify a state of the system-a vector in an exponentially large Hilbert space-with the number of its constituents.
Yet, the vast majority of these parameters is not relevant for the study of any given system.At a fundamental level, this is due to locality.In any experiment, we can only probe a small number of properties, such as expectation values of local (few-body) operators, correlations, or functions thereof.At the same time, the laws governing physical systems-encoded in their Hamiltonians-are also written in terms of local operators.It is thus sufficient to deal with local quantities in order to describe and predict the properties of any given system.This holds true whether we are studying the ground-state properties of a solid or a molecule, designing a measurement protocol to certify the presence of entanglement in a quantum computer, or characterizing the correlations that can arise in a physical theory that we aim to test.
It is thus appealing to altogether drop the description of the system in terms of an exponentially big state vector, and instead characterize quantum many-body systems in terms of their relevant observable correlations-the local marginals of the full distribution.This can be done, e.g., by working directly with the set of expectation values of the relevant observables, or by formulating the problem in terms of the local reduced density matrices that encode all local expectation values [1][2][3].Clearly, this description is significantly more efficient, as only the few quantities required to predict the properties of interest need to be captured.However, in this description, a new difficulty arises: The fact that those marginals need to be consistent with a global quantum mechanical wavefunction imposes highly non-trivial constraints on them [4][5][6][7][8][9].The likely simplest such constraint is that for a spin-1/2 system, the Pauli expectation values must satisfy ⟨σ x ⟩ 2 + ⟨σ y ⟩ 2 + ⟨σ z ⟩ 2 ≤ 1, but significantly more subtle constraints arise for many-body systems.In order to study a system based on its marginals, one therefore must be able to characterize the set of values they can-or cannot-take.
1.The set of marginals M and its inner (a) and outer (b) approximations.The set M consists of those local variables (e.g.reduced density matrices on a fixed collection of subsystems) that are compatible with a global quantum state of the whole system, and is therefore a subset of the set of all possible local variables S (e.g.all tuples of positive semidefinite matrices of trace 1).Finding the ground-state energy E0 of a local Hamiltonian amounts to minimizing a linear function over the set of marginals M.This task is intractable because the set is very complex.Approximate solutions can be obtained by minimizing the energy over approximations of M: Variational ansatz methods (a) minimize the energy function over inner approximations V k ⊂ M, and produce upper bounds v k ≥ E0.Relaxation methods (b) minimize the energy over outer approximations R k ⊃ M, thereby obtaining lower bounds r k ≤ E0.Numerical methods often feature hierarchies of approximating sets that provide tighter bounds as the level k is increased (e.g.increasing the number of ansatz parameters).This increases the computational complexity of the problem, and the efficiency of such methods is then determined by the trade-off (c) they exhibit between the precision of the bound and the computational complexity of the problem that has to be solved to produce it.
In order to get a better understanding of this phenomenon in the quantum many-body context, let us consider a concrete example: the antiferromagnetic Heisenberg spin-1/2 chain H = i (σ i x σ i+1 x + σ i y σ i+1 y + σ i z σ i+1 z )/4.Its energy (per site) is fully determined by the 2-site reduced density matrix ρ 2 .If we only impose that ρ 2 itself is physical (i.e.positive and normalized) we find that the energy is minimized by the singlet state |Ψ − ⟩, with value −0.75.The true ground state energy per site is however equal to 0.25 − log(2) ≈ −0.44.The result we obtained is thus clearly unphysical.It requires each spin to be in a maximally entangled singlet state with both of its neighbors simultaneously, which is well-known to be impossible by the monogamy of entanglement [10].
Remarkably, this simple analysis can still be used to obtain rigorous estimates for the ground state energy.On the one hand, it is clear that when we minimized over all 2-body states in the above, we have disregarded certain constraints on ρ 2 which are needed to make it consistent with a global state-i.e.we relaxed the problem.We therefore end up with an energy which is below what can be obtained from any physical wavefunction, that is, a lower bound to the true ground state energy.At the same time, we can also use the solution above to construct an explicit ansatz wavefunction, by placing the energetically optimal singlet state between consecutive pairs of adjacent spins |Ψ − ⟩ ⊗ |Ψ − ⟩ ⊗ |Ψ − ⟩ ⊗ . . .(and possibly symmetrizing with respect to translation).This is clearly a physical ansatz for the many-body wavefunction, and thus its energy (on average, −.375 per site) is above the optimal value-it yields an upper bound to the true ground state energy.
Both the upper and the lower bound can be systematically improved by considering blocks of n sites, and minimizing the energy over all ρ n .The resulting optimal energy gives a hierarchy of increasingly better lower bounds to the ground state energy-known as Anderson bounds [11]-while the ansatz constructed from the optimal ρ n correspondingly gives increasingly better upper bounds.Clearly, this analysis is not bound to the Heisenberg model, but can be applied to any other ground state problem.Formally, we can understand the two approaches as either enlarging (through relaxation of constraints), or restricting (by specifying an ansatz), the space of two-body marginals we allow in the optimization, see Fig. 1 for a detailed discussion.
What is the precision of the energy obtained this way, and how does the computational cost depend on it?The lower and upper bounds differ precisely by the way in which they treat the interaction term across the boundary of n-site regions, and thus, the resulting energies per site differ by O(1/n).The required computation task, on the other hand, is the diagonalization of an n-site system, which scales exponentially in n.We thus find that the computational cost grows exponentially in the desired accuracy-a rather unfavorable scaling.This raises the question whether more efficient methods to obtain both upper and lower bounds exist.
For upper bounds, any variational wavefunction gives rise to physically allowed marginals, and thus to an upper bound for the energy.Variational wavefunctions are plentiful, tailored to the physics of the system at hand (such as BCS [12], Laughlin [13], or Gutzwiller projected wavefunctions [14,15]).A particularly powerful and versatile class are tensor network wavefunctions, such as MPS, PEPS [16], TTNs [17], or MERA [18], that can be understood as variational ansatzes arising from various renormalization schemes [19].Renormalization schemes, by construction, adapt to the given problem in the way in which they select which degrees of freedom to keep, and correspondingly, variational tensor network families come with a high degree of versatility.Their ability to adapt to the problem at hand, together with efficient numerical algorithms, has made them the prime tool for the study of many quantum many-body problems, most prominently in one dimension.Importantly, the computational cost of tensor network algorithms typically increases at most polynomially with the desired accuracy, an exponential improvement over the simple hierarchy introduced above.
For lower bounds, no comparably powerful approaches are known.The development of semidefinite programming [20]-an optimization framework that allows positivity constraints to be imposed on matrices-opened the way to formulating relaxations that significantly improve on the Anderson-bounds hierarchy.Yet such methods are typically formulated by first deriving an exhaustive hierarchy of constraints, and then discarding all constraints that exceed a certain level of complexity (e.g. as in Refs.[3,[21][22][23][24]). Since the complexity within the hierarchy generally grows exponentially, typically all but the lowest-level constraints have to be dropped.While for some systems, even few constraints can already produce useful lower bounds (e.g.[1,3,25]), and cleverly selecting which constraints to keep from the different levels of the hierarchy can lead to improvements [26][27][28], those methods still exhibit the same exponential growth of the computational cost with the desired accuracy, and therefore cannot match the performance of state-of-the-art variational methods.
This situation is all the more unsatisfactory given the fundamental importance of rigorous lower bounds.Firstly, of course, they are needed to complement variationally obtained upper bounds, if one is interested in assigning rigorous error bounds to numerically obtained estimates.However, the importance of lower bounds reaches down to a far more fundamental level: Any attempt at falsifying a probabilistic physical theory-be it entanglement witnesses rejecting a separable (i.e.not entangled) description [29], Bell inequality violations ruling out a local hidden variable model [30], or experiments designed to detect a breakdown of quantum theory [31]-requires us to prove that the experimentally measured values lie outside of the set of values compatible with the theory at hand.This is precisely what is achieved by rigorous lower bounds on the minimum value attainable in the compatible set: an observation of a lower value proves incompatibility.Given this fundamental importance of precise and rigorous lower bounds, the exponential scaling and lack of adaptability of the existing methods for finding such bounds is all the more pressing.
In this paper, we propose a general framework to construct efficient relaxations of optimization problems for quantum many-body systems, such as the ground state problem.It yields versatile families of relaxations which can be continuously tuned and adapted to the specific scenario.Crucially, their computation cost scales polynomially with the desired accuracy, and thus overcomes the exponential scaling of existing methods.Its construction is guided by renormalization, which allows us to use the established knowledge on renormalization transformations of quantum systems to closely adapt the relaxation to the problem at hand.More generally, our framework can be seen as a comprehensive way to efficiently and systematically relax sets of local many-body correlations with specific global consistency conditions, and is thus applicable to a wide range of problems, including tasks such as entanglement or non-locality detection.
The key idea of our method is as follows.We start by expanding the compatibility constraints on the physical reduced density matrices as a hierarchy of increasingly complex objects.However, unlike previous approaches, which truncate this hierarchy at some level n, we first compress (or "coarse-grain") the objects in the hierarchy, such that each of them becomes of constant (or otherwise tractable) size.This way, we circumvent the exponential growth of complexity with the level n in the hierarchy, which allows us to reach significantly larger values of n.The attainable accuracy is thus no longer limited by n, but rather by the chosen compression scheme, including the degree of compression.Unlike the static truncation at fixed n, this compression can now be adjusted continuously, and it can be tuned to the problem at hand, guided e.g. by renormalization ideas.This way, the relaxation can be adjusted to the specific scenario, and much higher accuracies can be obtained.
We present a detailed realization of our method for the central problem of finding the ground state energy density of a one-dimensional (1D) spin chain.We show that, in this case, both Matrix Product States (MPS) and Tree Tensor Networks (TTN) provide suitable coarse-graining schemes, corresponding to Wilson's NRG and real-space renormalization, respectively [19].We test the method on a range of paradigmatic spin chains, both critical and gapped.We find that our relaxation method yields lower bounds which are between one and two orders of magnitude better than what could be reached with existing approaches which truncate the hierarchy without compression (where for the same accuracy the exact solution of a problem involving on the order of 100 spins would be required).We find renormalization to be a useful guide for adjusting the coarse-graining scheme to the target Hamiltonian, that is, that using MPS or TTN describing variationally optimized wavefunctions for the compression in our relaxation, performs indeed rather well.Yet, we also demonstrate that further optimizing over those coarse-graining maps can additionally enhance the accuracy of the method.Our results provide rigorous lower bounds on the energy densities of infinite spin chains, and significantly improve upon lower bounds previously obtained for such systems using reduced-densitymatrix theory and similar approaches [3,24,26].
The method presented can be straightforwardly generalized to ground state problems in higher dimensions, different geometries, or fermionic systems, by choosing a suitable renormalization scheme and the corresponding tensor network ansatz.The general framework underlying the method is much broader, and can be adapted to handle a diverse set of problems, ranging from electronic structure problems in condensed matter and quantum chemistry all the way to entanglement [32][33][34][35][36] and nonlocality detection [37][38][39][40][41] in many-body systems, as well as other situations where the positivity of intractably large matrices, subject to semidefinite constraints, has to be ensured [42][43][44].
The presentation of the material in the paper aims to equally bridge two aspects: On the one hand, a comprehensive presentation of the general method, and on the other hand, a tangible discussion of its application to 1D spin systems.In order to enhance the accessibility of the manuscript to the reader, we have thus structured it as follows: • Section II provides a concise summary of the method : It derives the method specifically for the case of 1D spin chains with MPS as coarse-graining maps, and provides a brief summary of the central numerical findings.It is self-contained and can be read as an "executive summary" of our work.After having read Section II, the reader should be able to continue directly with those sections which interest them the most.
• Section III introduces the general framework and demonstrates its range of applicability.The general formulation is presented in Section III A. It is set up such that it can be particularized to a wide range of scenarios, as well as to various underlying renormalization schemes.Specific realizations of the general framework then follow: -In Section III B it is shown how a renormalization procedure based on tree tensor networks gives rise to yet another family of relaxations for 1D systems, distinct from the one derived in Section II.
-The application of the method to lattice spin systems in higher dimensions is discussed in Section III C.
-Section III D shows how our approach can be implemented within the reduced-density-matrix theory framework.
-Section III E discusses how our approach can be applied to the problems of non-locality and entanglement detection.
• Section IV deals with specialized aspects pertaining to our numerical implementation, namely how to optimize coarse-grainers and how to certify solutions.
• Section V reports our numerical findings for 1D spin chains, and discusses the results and the methods used in detail.
Finally, in Section VI we present our conclusions.

II. A CASE STUDY: 1D TRANSLATION-INVARIANT HAMILTONIANS
In this section, we will introduce our method and illustrate its key ideas by means of a concrete problem.Specifically, we will consider the paradigmatic problem of finding the minimum energy density of a translation-invariant local Hamiltonian on a one-dimensional (1D) spin chain, also known as the 1D translation-invariant local Hamiltonian problem.We will see that, in this scenario, a natural choice of coarse-graining maps to relax the problem will be given by translation-invariant matrix product states (MPS).Remarkably, the same MPS that have been variationally optimized to upper bound the ground state energy turn out to perform very well as coarse-graining maps for the lower bound as well.We conclude the section by providing numerical results that demonstrate the significant improvement in accuracy that can be obtained through our method.
A. Setting up the problem as a hierarchy of constraints We start by rephrasing the ground state energy problem as a hierarchy of semidefinite constraints.The problem that we wish to solve is finding the minimum energy density of a translation-invariant Hamiltonian consisting of an identical nearest-neighbors interaction term h, acting on every pair of consecutive spins on an infinite chain.As the Hamiltonian is translation invariant, we can restrict our minimization to states that have the same symmetry.Because all the interaction terms are identical, the energy density can be evaluated in terms of the 2-body reduced density matrix, ρ (2) .To obtain the true ground state energy density of the infinite system we need to restrict ρ (2)  to be compatible with a global translation-invariant (TI) state.We formally represent this condition by ρ (2) ← ψ TI .(We do not elaborate on what it means to be a state on the infinite chain as from now on we will only use rather obvious properties of the reduced states of ψ TI on finite regions.)With this notation the problem reads Tr hρ (2)  (1) The existence of a global state ψ TI implies that all its m-body reduced states ρ (m) also exist and are compatible with each other in the sense that ρ (m−1) is obtained from ρ (m) by tracing out one spin.Since ψ TI is assumed to be translation invariant, tracing out the leftmost spin gives the same reduced state as tracing out the rightmost one.We denote this compatibility condition as where Tr L denotes the partial trace of the leftmost spin (and similarly Tr R ).We call a state ρ (m) with this property locally translation invariant (LTI): We now rewrite Eq. ( 1) by explicitly including the compatibility conditions between the reduced states ρ (m) of ψ TI for m ∈ {2, . . ., n} in the constraints as follows Tr hρ (2)   s.t.Tr(ρ (2) ) = 1, ρ (m) ≥ 0 for all m ∈ {2, . . ., n} , ρ (2) B. First relaxation: truncating the hierarchy Next we relax Eq. ( 4) by dropping the condition requiring the existence of ψ TI but keeping all the other constraints.That is, we require that the reduced states ρ (m) for m = {2, . . ., n} exist and are compatible in the sense of Eq. (2).In particular this implies that ρ (n) is LTI.We refer to the resulting problem as the LTI problem of size n and its solution as E LTI (n): Tr hρ (2) ) , for all m ∈ {3, . . ., n} .
(5) By removing the constraint ρ (n) ← ψ TI we allow more states {ρ (m) } to be considered in the minimization and therefore obtain a lower bound, i.e.E TI ≥ E LTI (n) for all n ≥ 2. The resulting optimization problem, Eq. ( 5), involves matrices of finite sizes and can be solved numerically using semidefinite programming (SDP) [20].For an LTI state ρ (n+1) , Eq. (3) implies that ρ (n) is also LTI, we therefore have that the sequence {E LTI (n)} n is non-decreasing.Furthermore, it converges to the exact energy density in the limit n → ∞.
The analysis so far suggests the following strategy for solving Eq. ( 1): Solve Eq. ( 5) while keeping as many states ρ (m) as we can fit in memory, up to some m max .Unfortunately, this approach does not lead to accurate lower bounds because the resources needed to solve the resulting SDP scale exponentially with m max .
C. Second relaxation: applying coarse-graining to the constraints So far we have shown that the LTI problem, Eq. ( 5), is a relaxation of the translation-invariant local Hamiltonian problem, Eq. (1).We also pointed out that to obtain tight lower bounds one has to solve the LTI problem for n ≫ 1, which is not a tractable task due to the exponentially large state ρ (n) appearing in Eq. (5).To overcome this impasse we suggest to relax the hierarchy of constraints appearing in the LTI problem, which we have formally written as ρ (2) ← ρ (3) ← . . .← ρ (n) , in a smarter way than just truncating it at a low level m max ≪ n.In particular, we would like to keep some of the constraints arising due to the state ρ (n) rather than discard it completely.To achieve this we apply a renormalization procedure to the states ρ (m) .We now explain how one constraint of the form ρ (m−1) ← ρ (m) can be relaxed and compressed by applying a coarse-graining map.In the next subsection we explain how this can be iterated to relax and compress the entire LTI problem, Eq. (5).
To explain the following steps we rely on graphical tensor notation, in which the constraints ρ (m−1) = Tr L (ρ (m) ) = Tr R (ρ (m) ) appearing in Eq. ( 5) are written as follows (we focus on m = 3, 4, 5): ρ (4)   = ρ (5)   ρ (4)   = ρ (5)   , (6) where the partial trace is represented by connecting the corresponding legs of the tensor.Now consider the first two rows in Eq. ( 6).The following steps form the elementary building block of our procedure and are described in Eq. ( 7).First we apply to both sides of each equation a coarse-graining map W 2 : C d ⊗ C d → C χ , which maps two spins of dimension d into one new site of dimension χ, such that in both rows the two central spins of ρ (4) are acted upon, leading to the equations in the center of Eq. (7).
(In the diagrams d is indicated by thin lines and χ by thick ones.)Next, we can compress the coarse-grained relations, as described by the second arrow in Eq. ( 7): We notice that after the coarse-graining, ρ (3) is related only to the image of ρ (4) under the coarse-graining map We can therefore encode the coarse-grained constraints using a smaller-dimensional variable ω (4) .
What we have achieved is to formulate a relaxation of the constraints on ρ (3) in terms of a smaller variable.In the example we just described, the size reduction was not so significant because we only coarse-grained two spins.However, by iterating the above coarse-graining and compression steps as we will now explain, we can drastically reduce the overall size of the problem.

D. Iterative coarse-graining: a renormalization procedure
The next step is to iterate the above described coarse-graining and compression in order to coarse-grain all the states ρ (m) up to m = n for large values of n.To do so we require the coarse-graining maps to satisfy certain compatibility conditions that we shall now explain.Consider the constraints involving ρ (4) and ρ (5) (the last two rows in Eq. ( 6)).
In order to compress ρ (5) we need to apply a 3-body coarse-graining map W 3 .We would like to have the following: However, we notice that we may no longer write conditions involving ρ (4) as we have already compressed it and replaced it with ω (4) when we treated the first two constraints.In order to continue we thus need to be able to express the left-hand side of Eq. ( 8) in terms of ω (4) which appears in Eq. ( 7).This can be done if the coarse-graining maps W 3 and W 2 are related as follows.
i.e. we require the existence of two additional coarse-graining maps, L 2 and R 2 , that act on the output of W 2 together with an additional spin on the left or right side respectively.Given maps that satisfy Eq. ( 9) we can substitute W 3 in Eq. ( 8) with the appropriate expression, such that W 2 acts on the two central spins in ρ (4) , as shown in Eq. (10).
Then the left-hand side of Eq. ( 8) can be expressed in terms of ω (4) and L 2 (respectively R 2 ), and the right-hand side-in terms of ω (5) : C2(ρ (4) ) →ω (4)   − −−−−−−−− −→ C3(ρ (5) ) →ω (5)   ω (4) We can continue in this fashion and coarse-grain all the states up to ρ (n) if for all m = 2, . . ., n − 3 we have (analogously to Eq. ( 9)) E. An ansatz for the coarse-graining maps: matrix product states We have just shown that we can coarse-grain and compress all the states ρ (m) appearing in Eq. ( 5) consistently if the coarse-graining maps satisfy Eq. (11).There is a simple way to satisfy Eq. ( 11): Uniform matrix-product states (MPS) are translation-invariant states encoded by a single rank-3 tensor, A, with dimensions (D, d, D) where d is the dimension of each spin and D is the so-called bond dimension (or virtual dimension).Contracting m copies of the tensor along the bonds results in a map from m spins to the bonds at the edges.Representing the MPS tensor A by a circle and the bond indices by red lines, we define the maps W A m as follows (e.g. for m = 3) In this case the coarse-graining dimension χ equals D 2 .The maps L m (and respectively R m ) are then chosen to be the same for all m: L m ≡ L A (and R m ≡ R A ) and act by contracting an additional MPS tensor from the left (and respectively from the right) and as the identity on the other bond.With this choice of maps, we apply the above described steps to the LTI problem, Eq. ( 5), to obtain the following (shown up to m=5): ρ (3)   = ω (4)   ρ (3)   = ω (4)   ω (4)   = ω (5)   ω (4)   = ω (5)   .(13) For m > 5 the last two constraints on the right side of Eq. ( 13) are repeated between the pairs of states ω (m−1) , ω (m)  for m = 6, . . ., n.The final relaxation is then obtained by replacing the constraints in Eq. ( 5) starting from m = 3 by the ones in the right hand side of Eq. ( 13) and in addition demanding ω (m) ≥ 0 for m = 4, . . ., n.For a any MPS tensor A, we obtain the following relaxation of Eq. ( 5): Tr (h ⊗ I)ρ (3)   s.t.Tr(ρ ≥ 0 , ω (m) ≥ 0 , for all m ∈ {4, . . ., n} , Tr L (ρ (3) ) = Tr R (ρ (3) ) , where † and where W A 2 is given by Eq. ( 12), and similarly R A and L A act as R A (•)(R A ) † and L A (•)(L A ) † respectively (see the last two rows on the right had side of Eq. ( 13)).Note that every state where D is the bond dimension of the MPS, and thus the memory scaling of the relaxation is O(nd 4 D 4 ).
The error of the lower bound obtained with the relaxed constraints Eq. ( 13) depends on the MPS tensor A used to perform the coarse-graining.In the resulting SDP, the entries of the MPS tensor appear as hyperparameters which can be optimized in order to tighten the lower bound.The bond dimension D determines the dimension of the subspace to which the original constraints are restricted due to the action of the coarse-graining maps.Increasing D allows to keep more constraints.
Viewing MPS as describing coarse-graining maps is by no means a new insight.This idea lies at the heart of the density matrix renormalization group algorithm [45,46] where the MPS produced at each iteration serves to coarsegrain the Hamiltonian in the next one.This connection suggests that an MPS approximation to the ground state, when used in our scheme, might give rise to a tight relaxation for the same Hamiltonian.
Following this reasoning, in our implementation we used the tensor describing the MPS ground-state approximation obtained from a variational algorithm based on a uniform-MPS ansatz-the VUMPS algorithm [47].The results which we will now present demonstrate that this heuristic choice performed very well.(For further analysis of the choice of MPS tensor for the relaxation refer to Section V D below.)Distance ∆E relax. of the lower bounds to the ground state energy density, obtained by solving the MPS-based relaxation outlined in Section II, to the true value, for (a) the transverse field Ising model at the critical point, and (b) the spin 1/2 Heisenberg antiferromagnet.The differently colored dashed and dash-dotted lines show results obtained using MPS with different bond dimensions D (leading to better relaxations).The black circles give the energy obtained by solving the exact hierarchy up to n sites, whose computational cost grows exponentially in n; it displays an algebraic scaling.The MPS relaxations approximate its behavior increasingly well as D increases, at a fraction of the computation cost (which scales algebraically in the effective n), thus allowing to reach accuracies of between one and two orders of magnitude higher.

F. Numerical results
Let us now demonstrate the power of the method through the numerical study of some selected spin models.The models presented here serve to demonstrate the general performance of the method within the framework of this introductory section; results on a wider range of models, together with a more detailed discussion of the numerical findings and an analysis of the performance of the method, can be found in Section V.
In the following, we present results obtained for two paradigmatic spin 1/2 models: the critical transverse-field Ising (TFI) model, and the isotropic antiferromagnetic Heisenberg chain.For each model, we solved the relaxation in Eq. ( 14) with the coarse-graining maps constructed from a variationally optimized MPS, A ⋆ D , of bond dimensions D, for various values of n and D. The obtained energies are denoted by E relax.(n, D) and are equal to E relax.(14).We then determined their deviations from the exact ground state energy density

MPS(A
For comparison, we also computed the LTI energy density E LTI (n) of the exact hierarchy truncated at the n-th level, by solving Eq. ( 5) for attainable values of n, and the corresponding deviation ∆E LTI (n) := E TI − E LTI (n) from the exact value.Since our method is a relaxation of the LTI problem Eq. ( 5), it holds that ∆E LTI (n) ≤ ∆E relax.(n, D) for all D.
Figure 2 shows ∆E relax.(n, D) for different values of D (colored dashed and dash-dotted lines), as well as ∆E LTI (n) (black line with circles) as a function of n for the TFI model (panel a) and the Heisenberg model (panel b).We observe that the accuracy obtained from the exact hierarchy truncated at level n, ∆E LTI , displays an algebraic decay n −α with α ≈ 2; however, one is limited to rather small n and thus low accuracies.The curves obtained from our relaxation initially follow the same curve as ∆E LTI and in fact extrapolate it to larger n, and as n is increased eventually saturate at some ∆E relax.(n = ∞, D).It is immediately evident from the plots that the accuracy obtained from our relaxation is between one and two orders of magnitude higher than that obtained from the LTI hierarchy.From the plots, we can estimate the effective n ≡ n eff.(D) required to reach the same accuracy from the exact LTI hierarchy (by linear extrapolation of the latter in the log-log plot); we find n eff.(6) ≃ 120 for the critical TFI model, and n eff.(7) ≃ 60 for the Heisenberg antiferromagnet, both significantly beyond what can be achieved without the relaxation.
The same features were observed in all other models on which we tested our method; in fact, for gapped models, the observed convergence was even more rapid.For an overview of all results and a detailed discussion of the various findings and numerical methods, we refer the reader to Section V.

III. THE GENERAL METHOD
In Section II we dealt with the ground state problem of a 1D spin chain.We have shown how the constraints imposed on a local reduced density matrix by a global state can be organized in a hierarchical fashion due to the many-body structure of the problem.The hierarchy involves constraints arising from reduced states of increasing groups of spins.By applying a renormalization procedure to this hierarchy we were able to compress the ground state energy problem.Choosing the renormalization procedure according to the given target Hamiltonian allowed us to efficiently compute tight lower bounds on its ground state energy.
The many-body structure of the constraints, on which our renormalization approach hinged, is however not a property special to 1D spin systems.In this section we present the structure of our method in the most general terms and provide blueprints for its implementation in a wide range of scenarios.As we will see, the same ideas that we implemented in the 1D translation-invariant setting can be applied to lattice systems in any dimension, or with other connectivity graphs.They can be applied to fermionic or bosonic particles, and even to systems that do not follow the laws of quantum mechanics.Furthermore, the renormalization procedure we followed in Section II, by which we coarse-grained the spin chain one spin at a time, is also just one possible choice: other renormalization flows fit within our general framework as well, and each such flow gives rise to a different hierarchy of relaxations.
We first present the general formulation of the method in Section III A. We generalize the procedure we followed in Section II while making minimal assumptions on the structure or even kind of system.Yet, the purpose of Section III A is not to merely provide an abstract reformulation of what was already presented above, rather, it is to highlight the essential ingredients and the conditions that should be verified for setting up similar renormalization-based relaxations in various settings.Later, in the rest of this section, we will follow the prescription outlined in Section III A while applying our general method to different settings.
In Section III B we will demonstrate how a block-spin renormalization flow gives rise to yet another relaxation method for 1D TI systems.In Section III C we discuss how our method can be applied to higher dimensional spinlattice-systems and analyze which renormalization flows are suitable for this purpose.In Section III D we introduce the Reduced-Density-Matrix-Theory (RDMT) relaxation framework, whose scope of application includes any quantum mechanical system with local interactions, and show how to implement our method within this framework.Finally, in Section III E we explain how our method can be used to boost the performance of detection and certification methods in quantum information theory.

A. A general correspondence between renormalization flows and hierarchies of relaxations
Let us abstract what we did in Section II.Our goal there was to minimize an objective function, namely, the energy density of a 1D TI quantum system.This energy density was a linear function of ρ (2) , the 2-particle reduced density matrix of the 1D TI spin chain.Correspondingly, the starting point of a general formulation of the technique described in Section II is a linear function H of a small number of parameters of an otherwise large (or infinite) physical system.
Depending on the problem and the representation we choose to work in, those parameters could be the entries of a reduced density matrix (as in Section II), or multiple such reduced density matrices, as in the case when H is not translation invariant; they could be several expectation values which appear in H, e.g. the terms ⟨σ x ⊗ σ x ⟩, ⟨σ y ⊗ σ y ⟩, and ⟨σ z ⊗ σ z ⟩ which appear in the Heisenberg S = 1/2 Hamiltonian; or they could be some other representation of local states.Whatever the case may be, we denote these local parameters by Θ 0 .Like in the 1D example, our goal is to minimize H(Θ 0 ) over all local data Θ 0 compatible with some global state Ψ; we represent this constraint formally by Θ 0 ← Ψ and say that Ψ is a realization of Θ 0 .The problem we wish to solve is thus where the optimization is over the set of all 'physical states' Ψ ∈ K, a notion that is also problem specific: K could be the entire global Hilbert space; a symmetric subspace thereof (as in Section II where we restricted the optimization to translation invariant states); a convex set of states with a property of interest, such the set of separable (i.e.not entangled) states; or the state space of a certain so-called general probabilistic theory (not necessarily quantum theory, see Section III E below).
In our 1D example, we observed that, if ρ (2) admits a TI quantum realization Ψ, then there exists a sequence of quantum states of increasing size (ρ (m) ) m such that ρ (m−1) = Tr m (ρ (m) ).Each such state ρ (m) was not an arbitrary matrix; on the contrary, it had to belong to the set S LTI m of positive semidefinite and locally translation invariant matrices (see Eq. ( 3)) of trace one.Note that the proposition ρ (m) ∈ S LTI m can be verified efficiently (in time polynomial in the size of ρ (m) ).We therefore say that S LTI m is a tractable set.In a general setting, we similarly assume that the constraint "Θ 0 ← Ψ, for some physical Ψ" can be broken down into a (finite or infinite) hierarchy of conditions that hold between objects of increasing size.More specifically, we assume that there exists a sequence of (tractable) convex sets (S m ) n m=1 (where we allow n = ∞), and a sequence of marginalization maps (Tr m ) n m=1 , i.e. maps that discard information, with the following property: for all Θ 0 admitting a physical realization, there exists a sequence (Θ m ) n m=1 , with Θ m ∈ S m , for m ≥ 1, such that We might further assume that this hierarchy of conditions is complete, i.e. that Eq. ( 16) is both a necessary and sufficient condition for Θ 0 having a physical realization (this was the case in Section II).This assumption is, however, unnecessary for what comes next.
Since the size of Θ m might scale badly with m (in the 1D example, it scaled exponentially with m), it is impractical to optimize H(Θ 0 ) via the hierarchy of conditions ( 16), as was proposed in Eq. ( 4).In the 1D example, we escaped from this predicament by introducing a coarse-graining transformation m that, acting on all the qubits of ρ (m+2) but the first and the last, mapped this m-body density matrix to the four-partite (and low dimensional) matrix ω (m+2) .Our choice of the coarse-graining map W m ensured that ω (m+2) is also a state (up to normalization).
Following the same idea, for a general setting we require a sequence of coarse-graining transformations (C m ) n m=1 , which map each 'state' Θ m ∈ S m to some Ω m ∈ S ′ m , where S ′ m is the set of 'states' of the coarse-grained system.We choose the sets S ′ m such that their dimension grows at most polynomially with m.The maps C m are thus to be understood as physical coarse-graining operations.This notion of course depends on the definition of what are valid states in any given scenario.As we will see, for many scenarios of interest, the set of physical operations is easy to characterize.
Our next aim is to find relations between the coarse-grained variables Ω m , that capture some of the relations between the variables Θ m (Eq.( 16)).More precisely, in order to get a relaxation of the original problem, we need to find relations that follow from (and are therefore weaker than) Eq. ( 16).This can be achieved by imposing certain compatibility conditions between the maps C m and the marginalization maps Tr m .In order to formulate those conditions in the general setting we look back at Eq. ( 10) and analyse more closely the interplay between coarse-graining and marginalization maps there.
First, we realized that we need to express W 3 as the composition of I ⊗ W 2 (resp.W 2 ⊗ I) with the coarse-graining map L 2 (resp.R 2 ) (see Eq. ( 9) and similarly for all m ≥ 2 in Eq. ( 11)).In the general case, we therefore require that the maps (C m ) m form a renormalization flow, i.e. that each C m+1 implements one coarse-graining step more than its predecessor C m .Second, looking at what happens to ρ (5) in Eq. ( 10), we note that in order to substitute it with ω (5)  we used the fact that the partial trace maps commute with (I ⊗ W 3 ⊗ I).Overall, the relation we used in Eq. ( 10) can be formally written as: where Tr R/L means Tr R or Tr L as appropriate: Tr R was used in the first row of Eq. ( 10) and Tr L in the second row, and where we introduced the Tr ′ notation to emphasize that Tr and Tr ′ act on systems of different dimensions.
By analogy, we require in the general case that, for m ≥ 1, there exists a coarse-graining map B m and a marginalization map Tr ′ m such that: where the maps in Eq. ( 18) are in 1-to-1 correspondence with those in Eq. (17).
Given maps C m , B m , Tr m , and Tr ′ m that satisfy Eq. ( 18) we can perform the analogue of the procedure described in Section II D: namely, we apply each side of Eq. ( 18) to Θ m+1 and invoke the relation Θ m = Tr m+1 Θ m+1 on the LHS.This leaves us with Then by substituting all C m (Θ m ) by the compressed states Ω m in the above, we obtain the desired relation: We can now put everything together.Suppose that Eq. ( 16) defines a relaxation of the set of physical local parameters Θ 0 and that we find a renormalization flow with maps (C m , B m ) m and marginalization maps (Tr ′ m ) m satisfying Eq. (18).Then, if we start with C 1 := I (and thus Ω 1 = Θ 1 ), the following optimization problem is a relaxation of the original minimization task (15): Since B m , Tr ′ m are linear maps and the sets S 0 , S ′ 1 , S ′ 2 , ... are convex, we have that the problem variables Θ 0 , {Ω m } m in ( 21) are optimized over a convex region.In addition, the objective function H is linear (and therefore convex) on the variables Θ 0 .The relaxation ( 21) is thus a convex optimization problem [48] and, due to the tractability of the sets S 0 , S ′ 1 , S ′ 2 , ..., we can use general algorithms from convex optimization theory to tackle it.If, like in our 1D example in Section II, the sets S 0 , S ′ 1 , S ′ 2 , ... happen to be defined through affine and positive semidefinite constraints, then eq. ( 21) defines a semidefinite program [20].
Note that the coarse-graining maps C m appear in Eq. ( 21) as hyperparameters.The compatibility relation in Eq. ( 18), which was the key to arriving at the relaxation Eq. ( 21), typically does not determine the maps C m completely.Far from it, it leaves us with many free parameters that we can vary in order to tighten the lower bound given by Eq. (21).In the example in Section II we saw that any uniform MPS gives rise to suitable coarse-graining maps.In the following Sections III B to III D we will show further examples of how tensor networks provide us with renormalization flows compatible with Eq. ( 18).
Equipped with this general prescription, we will now use it to implement further examples of renormalization-based relaxations.

B. Block spin renormalization: relaxations using tree tensor networks
In this section we give another example of a renormalization flow that gives rise to a relaxation hierarchy, in line with the general correspondence which we described in Section III A. We consider a block-spin renormalization procedure and apply it to the same 1D LTI problem that was treated in Section II.In Section V we will present the full benchmarking results of both relaxation variants-the MPS-based one from Section II as well as the block-spin (or tree-tensor-network) based relaxation which we will now develop.The relaxation based on block-spin renormalization is presented here for the 1D case, but admits straightforward generalizations to higher spatial dimensions.We will discuss this point further in Section III C.
We first consider the exact n-body LTI constraint that was our starting point in Section II: ρ (2) ← ρ (n) , but this time we start from n = 2 N and break the constraint down into a different sequence of constraints, one which relates states that double in size at each step: If we were to keep the full LTI symmetry of the state ρ (2 N ) we would have to impose the equivalence of all of the ways in which ρ (2 N −1 ) can be obtained from ρ (2 N ) (there are 2 N −1 + 1 of them).However, since we are going to iteratively map each pair of spins into one blocked spin, we will not be able to keep the LTI property fully: After blocking two spins we can only consider translation by two sites, after blocking twice -by four, and so on.
Our starting point for this scheme is therefore a relaxation of the 2 N -body LTI problem where we impose only the constraints that are compatible with the block coarse-graining, namely, we require that for all k = 2, . . ., N − 1 the following three ways of obtaining ρ (2 k−1 ) from ρ (2 k ) are equivalent (see also the left side of Fig. 3): where Tr AB , and Tr CD trace out half of the spins on the left (1, . . ., 2 k−1 ), and on the right (2 k−1 + 1, . . ., 2 k ) respectively, and Tr AD traces out everything but the 2 k−1 spins in the middle.
Next we coarse-grain each state ρ (2 k ) by applying k − 2 layers of block coarse-graining maps 2 k →2 k−1 , where at each layer the systems size is halved (as indicated by the subscript 2 k → 2 k−1 ) until we end up with a 4-body state.Each layer is composed as follows with W (l) a completely positive map that maps two (possibly already blocked) spins into one blocked spin, i.e.W (l) maps states on C D l−1 ⊗ C D l−1 to states on C D l , where D 0 = d is the dimension of the physical spins and the dimension of the blocked spins after l layers of coarse-graining is D l .We denote by ω (2 k ) the resulting states, which are all 4-partite states with local dimensions D k−2 .
To obtain a relaxation in terms of (ω (2 k ) ) k we follow the prescription in Section III A, where the key step was making sure that Eq. ( 18) was satisfied.We now turn to this equation.Having chosen the coarse-graining maps C m as we have done in Eq. ( 25), and the marginalization maps Tr as in Eq. ( 23), we see that Eq. ( 18) in our setting is to be understood as follows: taking the partial trace of ω (2 k ) with respect to, say, the leftmost two blocked spins should give the same result as first tracing out the 2 k−1 leftmost spins in ρ (2 k ) and then coarse-graining the resulting state until only two blocked spins remain.The equivalence between the two different orders of operations (first partial trace and then coarse-grain or vice versa) can be ensured by choosing the maps W (l) composing each layer V (l) to be completely positive and trace preserving (CPTP) rather than just completely positive.In this setting Eq. ( 18) thus reads: ρ (4)   ρ (8)   ρ (16)   Tr CD/AD/AB Tr CD/AD/AB ω (8)   ω (16)     23) before coarse-graining is applied: Each state ρ (2 k ) is obtained from the state above it in three different ways by tracing out the appropriate spins, as indicated by the curved black arrows.On the right side, the coarsegrained 4-partite states ω (2 k ) are represented by the dashed rectangles.Each ω (2 k ) is obtained from ρ (2 k ) by applying the C k maps that consist of layers of 2-to-1 coarse-graining maps indicated by inverted 'Y' shapes, see Eq. (24).The maps B k−1 then implement an additional coarse-graining step on ω (2 k ) .The relaxed constraints between the states ω (2 k ) , Eq. ( 27 where Tr ′ AB , Tr ′ AD and Tr ′ CD act on the 4-body state ω (2 k ) and trace out spins (1,2), (1,4) and (3,4) respectively.The constraints on ω (2 k ) are then obtained according to Eq. ( 20) and read: where due to the LTI property of ω (2 k ) it does not matter whether we use Tr ′ AB or Tr ′ AD or Tr ′ CD .Figure 3 shows the initial states ρ (2 k ) for k = 2, 3, 4 and the constraints between them (Eq.( 23)).The figure further shows how the constraints between the coarse-grained states ω (2 k ) are obtained through Eq. (26).We thus arrive at the following relaxation ) , for all m ∈ {3, . . ., N } , which depends on the choice of coarse-graining maps W (l) .In our numerical implementation we solved the relaxation Eq. ( 28) starting with coarse-graining maps constructed from variationally optimized tree-tensor-network states obtained with the algorithm in Refs.[17,49].We then further optimized the maps following the procedure described in Section IV B. The results of the implementation of this method are presented in Section V.In Appendix A we explain how we constructed the initial coarse-graining maps for this scheme (which have to be CPTP maps) from variational tree-tensor-network states.

C. Spin systems in higher spatial dimensions
Our applications of the method in the 1D setting suggest straightforward generalizations to higher dimensions.Consider first the case of a square 2D lattice.The most obvious generalization of our 1D implementation in Section II is to use projected entangled-pair states (PEPS)-the 2D analogue of MPS-to do the coarse-graining.This approach would however quickly run into the following problem: As we coarse-grain larger and larger regions R by contracting more PEPS tensors, the coarse-graining dimension grows as D |∂R| , where |∂R| is the length of the boundary of R.This means that after applying the coarse-graining procedure we would still end up with a problem that grows exponentially in size as the region R is increased.Moreover the smallest region for which coarse-graining with PEPS would reduce the dimension, i.e. for which d |R| > D |∂R| , is itself rather large (for both D and d equal to 2 one would need to start from a 5 × 5 square) making the problem intractable for standard SDP software.
The tree-tensor-network variant of the method does not suffer from this drawback.The scheme we presented for the 1D case can be directly applied in the setting of a square 2D lattice if in Eq. (23) we replace the states supported on chains of spins {1, . . ., 2 k } with ones supported on rectangles {(i, j) | i = 1, . . ., 2 k , j = 1, . . ., 2 k ′ }.Coarse-graining maps can then be constructed from 2D TTN states [17] and applied to patches of different sizes and geometries.Higher dimensional tree coarse-graining schemes can similarly be implemented.
Another relaxation approach that is applicable to higher dimensional systems originates in quantum chemistry and goes by the name of Reduced Density Matrix Theory (RDMT).This approach has so far been the most successful in treating 2D lattice systems [26,28].We will introduce this approach in Section III D and dedicate that section to explaining how our renormalization approach can be applied there.Lattice systems in 2D would be an appropriate setting to test whether this leads to an advantage over current approaches.
Implementing these ideas goes beyond the scope of this paper as its focus is on introducing the renormalization approach and benchmarking it in 1D as a proof of concept.However, to get an idea of what to expect when dealing with 2D systems, we provide in Appendix B some preliminary results concerning the 2D LTI problem.There we show that, similarly to the 1D case, the 2D LTI relaxation exhibits a clear scaling of the accuracy of the lower bound, ∆E 2D LTI (n), with the system size n.This can be used as a benchmark for relaxation methods in 2D: one could infer from it the effective system size corresponding to a bound with a given accuracy, similarly to our estimate of n eff.from the scaling of the LTI energy in 1D in Section II F.
We further demonstrate that both ways of encoding the 2D LTI constraints presented in Fig. 4, using either square regions or triangular ones, lead to the same scaling of accuracy with system size.We then show that by applying 4. The constraints defining the 2D LTI hierarchies for square-shaped (a) and triangular (b) regions.Each shape depicts a state on a 2D region containing the spins that are indicated by black circles.Each equality sign between two shapes defines a constraint demanding that the application of partial traces on the spins marked by red ×s on both sides results in the same state.Symmetry with respect to rotations of the lattice by 90 • is assumed.Higher levels in each hierarchy are formulated similarly.Both formulation lead to the same scaling of ∆E 2D LTI (n) with n, as shown in Fig. 9 in Appendix B.
one coarse-graining step to the 2D LTI constraints we can recover the energy corresponding to the next level in the hierarchy already with coarse-graining dimension as low as D = 3.This leads to precision that would have been unattainable without coarse-graining, and is comparable to state-of-the-art results [26,28].Finally, we discuss further possible coarse-graining schemes applicable in 2D.

D. Reduced-density-matrix theory
Reduced-density-matrix theory (RDMT) is a relaxation method that is best known for its application in quantum chemistry, where it has been used to accurately compute the shape of the potential-energy curve of molecules [1, 2, 50-55] and see Ref. [56] for further references.The same method has also been applied to lattice systems, and has appeared in the literature under several different names [3,24,26,28,[57][58][59][60][61].RDMT also underlies recently proposed so-called bootstrap methods that can constrain spectral properties beyond the ground state energy [44,62].Viewed more generally, RDMT can be seen as the non-commutative generalization of the Lasserre-or sum-of-squares-hierarchy from polynomial optimization [22,63,64], and therefore underlies approximation algorithms in quantum complexity theory [65,66].

RDMT in a nutshell
The basic idea of this approach is based on exactly the same reasoning that led us to the formulation of Eq. ( 5) in Section II, namely: to constrain the 2-body reduced density matrix-which determines the energy of the system-by requiring its compatibility with larger and larger physical states.The implementation of this idea within the RDMT approach is however more general than Eq. ( 5), where physical states were simply n-body reduced density matrices, and hinges on a slightly more abstract representation of quantum states.
Mathematically, a quantum state is simply a function that assigns expectation values to observables g (that is, g → tr[ρg]).More precisely, a state is a linear functional L : A → C on the algebra of observables of the system A. In addition, any physical state must be a positive functional, i.e. it should assign non-negative numbers to observables of the form g † g, where g ∈ A is any global observable.
Demanding global positivity is of course an intractable task when we are dealing with a many-body system.In Section II this required optimizing over global wavefunctions Ψ (equivalently, global functionals L Ψ (g) := ⟨Ψ|g|Ψ⟩).We then relaxed this global positivity constraint by requiring the existence of a smaller, and thus tractable, positive object: an n-body state ρ (n) ≥ 0. In other words, we demanded that there is a functional defined on the algebra of local n-body observables L (n) : A n → C satisfying a weaker-local rather than global-positivity condition: for all g n ∈ A n ⊂ A we have L (n) (g † n g n ) := Tr(g † n g n ρ (n) ) ≥ 0. RDMT generalizes this idea as follows.Let O be a set of operators and let M = {m i } s i=0 be a finite set of monomials (products) of operators from O (with m 0 = I).Instead of considering a global functional L acting on the entire algebra of observables, we can now define L only on the subspace span(M † M ) := span{m † i m j |i, j = 0, . . ., s}, and demand its positivity there, i.e. that L(g † g) ≥ 0 for every g ∈ span(M ).
If the Hamiltonian we are studying, H, is in span(M † M ), then the energy of the system can be evaluated in L and is given by L(H).We can then formulate a relaxation of the ground state energy problem in terms of L with the relaxed positivity condition L(g † g) ≥ 0 ∀g ∈ span(M ).This turns out to be an SDP as the condition that L is positive can be expressed by the positivity of Γ(L), the so-called moment matrix of L: , for all i, j = 0, . . ., s .
The normalization condition for the state reads L(I) = 1.The linearity of L means that it is defined by its values on a basis of span(M † M ), i.e. in terms of dim span(M † M ) ≤ (s + 1) 2 parameters x i := L(X i ), where {X i } is a fixed chosen basis of span(M † M ) with X 0 = I.In terms of the x i s, the moment matrix of L is given by Γ The following SDP is thus a relaxation of the ground state problem: where the size of the matrix Γ(L) is (s + 1) × (s + 1) (s + 1 is the number of elements in M ).
As an example, consider a system composed of n spin-1/2 particles and let H be any 2-local Hamiltonian where each term h (i,j) acts on spins i and j.If we choose the set M to include all the single-site Pauli operators a } j=1,....,n a=x,y,z , where σ x , σ y and σ (j) z are the Pauli x, y, and z operators acting on the jth spin, we have that H ∈ span(M † M ) and we can thus obtain a lower bound on its ground-state energy by solving Eq. (31).Enlarging the set M to include higher degree monomials, i.e. products of k ≥ 2 Pauli operators, k i=1 σ (ji) ai , will improve the lower bound; however, this would cause the size of Γ to grow exponentially with k.
We thus see that the RDMT approach is limited by the same problem of exponential scaling which we highlighted in Section II.For spatially local Hamiltonians this can be mitigated to some extent, as realized in Refs.[26,28].There the operators in the set M are chosen according to a their locality: products of k Pauli operators are included only if all the spins on which they act are within a region of size r k , with r k quickly decreasing to zero with k.Within this approach one could also optimize the choice of the regions depending on the given Hamiltonian (in analogy to the ideas proposed in Ref. [27] for state ensembles).
However, this is still a rather rigid framework, as the choice one has is between a discrete set of parameters.Furthermore, as we have demonstrated in Section II, relaxation schemes based only on locality (e.g.: the relaxation in Eq. ( 5), which is only defined by a length scale n), include a vast number of constraints that are irrelevant for a given target Hamiltonian, and could thus be further relaxed without compromising the precision of the resulting energy.
Given the generality and scope of applicability of RDMT, any method to boost its performance would be a highly welcome development.We will now outline how our renormalization procedure can be implemented within the RDMT framework, enhancing it with the continuous degree of tunability provided by the coarse-graining maps which compose the renormalization flow.We leave the implementation of the proposed scheme for future work, as we expect that extensive numerical investigations will be needed to properly assess its performance on challenging many-body problems.

Extending Reduced-Density-Matrix theory through renormalization
The relaxations obtained through the RDMT approach possess a hierarchical structure and thus fit naturally within the general framework outlined in Section III.Analogous to Eq. ( 16), the global positivity constraint L(g † g) ≥ 0 ∀g ∈ A can be broken down into a sequence of constraints, defined on an increasing sequence of sets of operators M k ⊂ M k+1 .For each set in the sequence we define a functional L (k) on span(M † k M k ) and demand its positivity As the sets M k form an increasing sequence, the domain on which each L (k+1) is defined includes that of its predecessor L (k) .Since all of the functionals L (k) are supposed to refer to the same state, we demand the consistency condition We thus arrive at the analogue of Eq. ( 16), the starting point for the application of our renormalization approach: As we have demonstrated in the previous sections, for a given system there could be several possible ways of choosing such a sequence depending on the renormalization procedure we seek to implement.Our general method in Section III A hinged on choosing the sequence Eq. ( 16) and the renormalization flow in a compatible way, as expressed in Eq. ( 18).We will now give an example satisfying this compatibility relation, and derive-following the prescription in Section III A-a compressed relaxation of the ground state problem within the RDMT framework.
For this demonstration we consider a system of n qubits with a Hamiltonian allowing for interactions between every pair of qubits (Eq.( 32)).For this setting we propose the following renormalization procedure.We start from spins 1 and 2 and coarse-grain them into a system labeled Q 2 , next we coarse-grain Q 2 and spin 3 into Q 3 and so on.If we denote those coarse-graining maps by B (Q k−1 ,k)→Q k (acting only on systems Q k−1 and k, and mapping them to Q k ), the maps C j we choose are In order for the sequence Eq. ( 34) to be compatible with this coarse-graining, we choose the sets M k to consist of a full Pauli basis for every group of k + 1 spins which contains spins 1 through k, plus one further spin: {1, . . ., k, l} l>k , i.e.
where {σ (j) a } a=x,y,z are the Pauli matrices acting on spin j and σ (j) 0 = I.For large k, the specification of L (k) requires too many parameters.To relax the SDP optimization over L (k) we need to define their compressed counterparts, which we denote as Λ (k) := C k−1 (L (k) ), where C k−1 is our coarse-grainig map specified in Eq. ( 35) (with C 1 := I).To see what the compressed state functional Λ (k) should be, consider the action of the map C k−1 on the physical spins: After the map C k−1 is applied, the spins 1, . . ., k − 1 have been compressed into the system Q k−1 .As Λ (k) should represent a state on the coarse-grained system, we define it on a space of compressed operators span((M ′ k ) † M ′ k ), where and where ( σ ) q is a basis of operators of the system Q k .For any operator of the form where the map C * k−1 denotes the adjoint of C k−1 : If C k−1 maps states of the spins (1, . . ., k −1) to states of the system b ′ can be evaluated in L (k) because M k includes a full basis of operators for the spins (1, . . ., k). Defining ) is a functional on the set of monomials (M ′′ k ) † M ′′ k , with Noticing that M ′′ k ⊂ M ′ k+1 , we arrive at the analog of Eq. ( 18): with Tr ′ k+1 defined as restricting Λ (k+1) to span (M ′′ k ) † M ′′ k : Our proposed relaxation for the Hamiltonian minimization problem Eq. ( 31) is therefore: such that Λ (2) ) , for all k = 2, ..., n − 1.
Similarly to L in Eq. ( 31), each Λ (k) is determined by a vector of parameters x (k) specifying the values of Λ (k) on a basis of span((M ′ k ) † M ′ k ).The moment matrices Γ(Λ (k) ) are also linear functions of those parameters as in Eq. ( 31).Finally as both B (Q k−1 ,k)→Q k and Tr ′ k+1 act as linear transformations on Λ (k) and Λ (k+1) , we can represent them as matrices B k and T k+1 acting on x (k) and x (k+1) , respectively.The relaxation Eq. ( 42) can then be expressed explicitly in terms of the vectors x (k) .

E. Quantum information theory: entanglement and nonlocality detection
The general method described in Section III A also leads to sound relaxations when the variables Θ m , Ω m and the transformations Tr m , C m are interpreted not as representations of many-body quantum states and transformations thereof, but as representations of the many-body states of generalized probabilistic theories (GPTs) [67][68][69][70] and maps transforming such states.As observed in [71], key problems in quantum information theory, such as entanglement and nonlocality detection in many-body systems, can be interpreted as determining the existence of a state in a GPT.
The formalism of GPTs was introduced to describe and reason about physical theories of all conceivable sorts: quantum, classical or else.A GPT is specified by a list of physical system types, together with composition rules that detail the type one must use to describe composite systems.In quantum mechanics, system types are specified by the system's Hilbert space dimension, and the tensor composition rule tells us that the type of a composite system consisting of a subsystem with dimension d 1 and a subsystem with dimension d 2 is d 1 d 2 .Each system type T comes with a state space S T , namely, a convex set of vectors, each of which describes a possible state of system T .The composition rule must also specify how to derive the marginals of a composite system, given a description of its overall state, as well as how to represent independent system preparations in a joint system description.Transformations in a GPT correspond to linear maps that, acting on part of the state of a bipartite system, always return a valid bipartite state.That is, the linear map M , transforming systems of type T into systems of type T ′ , is physical if, for all system types U , and all states s U T ∈ S U T , (I U ⊗ M T )(s U T ) ∈ S U T ′ .In [71] such transformations are called connectors.In the case of quantum theory this is just the complete positivity condition.
Consider for example the entanglement marginal problem [72], i.e., deciding if the observed state ensemble {ρ I } I∈I of a many-body system corresponds to the marginals of a fully separable quantum state.This is equivalent to deciding if {ρ I } I∈I are the marginals of a global state in the GPT SEP defined in [71], whose state spaces are, precisely, the sets of fully separable quantum states.More generally, the GPT marginal problem would ask if the ensemble of GPT states Θ 0 = {ρ I } I∈I arises as marginalizations of a global GPT state Ψ ∈ K, where K is the cone of global states of the GPT.The dual of this problem can be called the GPT local Hamiltonian problem, and it can be cast in exactly the same form as Eq. ( 21): Finding a tight lower bound to Eq. ( 43) then allows to use the Hamiltonian H as a witness for the presence of the property of interest, e.g.: entanglement.(In the standard marginal problem setting, the detected property is incompatibility with a global state: An energy value lower than the ground state energy indicates that the state ensemble with that energy is incompatible with a global state.)Clearly, Eq. ( 21) corresponds to a lower bound on the GPT local Hamiltonian problem, Eq. ( 43), if we regard Θ 0 , Ω m as ensembles of states of a GPT; Tr (•) and Tr ′ (•) as the marginalization maps provided by the GPT's composition rule; B m as connectors; and the sets S 0 , S ′ m as the corresponding GPT state spaces.

IV. OPTIMIZING THE COARSE-GRAINING MAPS AND CERTIFYING NUMERICAL SOLUTIONS
In this section, we discuss two further numerical aspects of the optimization algorithms.First, we show that it is possible to include an optimization over the coarse-graining maps into the algorithm.This optimization takes itself the form of an SDP; thus, the entire optimization can be carried out as a sequence of alternating SDPs.Second, we discuss how the use of the dual SDP allows us to certify numerical solutions, yielding fully rigorous lower bounds.

A. Dual SDP
Both methods rely on the dual SDP of the relaxation in Eq. ( 21), which we shall now introduce.Recall that in Eq. ( 21) the variables are restricted to the sets of physical states: Θ 0 ∈ S 0 and Ω m ∈ S ′ m .In the cases we are dealing with, those sets are given by the intersection of a cone of positive semidefinite matrices with a subspace defined by a linear constraint (e.g. the LTI condition), where for Θ 0 we also have an additional normalization constraint.In the dual SDP we will need to deal with the dual cones S ′ * m .Let S = Ω ≥ 0|L(Ω) = 0 be the intersection of the cone Ω ≥ 0 with the subspace defined in terms of a linear operator L as L(Ω) = 0. Its dual cone, i.e. the set of Hermitian matrices Z satisfying Tr(ZΩ) ≥ 0 for all Ω ∈ S, is given by S * = Y + L * (X)|Y ≥ 0, X = X † , where L * is the adjoint of L and X is any Hermitian matrix.A constraint of the form Z ∈ S * thus can be written as a positivity constraint: Z − L * (X) ≥ 0 for some X.We denote by λ m the Lagrange multiplier of the mth constraint in Eq. ( 21).We further write the normalization constraint explicitly as Tr(Θ 0 ) = µ, and denote its Lagrange multiplier by ε.With this notation the dual of Eq. ( 21) is where (•) * denotes the adjoint of an operator.The solution of Eq. ( 44) is always a lower bound to the solution of Eq. ( 21) and the two are equal when strong duality holds [20].

B. Optimizing over the coarse-graining maps
In both of our implementations we used tensor networks as an ansatz for the coarse-graining maps, and we have provided heuristic arguments for constructing them from a variational ground-state approximation when formulating our relaxation.Our numerical results show that while there is merit to this heuristic, the results can be improved by optimizing the coarse-graining maps.In the MPS-based method (Eq.( 14), Section II) this was not necessary as the results obtained using the MPS ground-state approximation were very good.In the tree-tensor-network-based variant (Eq.( 28), Section III B) there was more room for improvement.We therefore implemented the following procedure to optimize the coarse-graining maps used in the tree-tensor-network-based variant.
The form of the dual problem Eq. ( 44) suggests a simple scheme to optimize over the maps (B m ) m .Namely, first solve the problem Eq. ( 44) with the initial maps, hence obtaining the solution ε (1) , (λ m ) m .Next, choose j ∈ {1, . . ., n − 1}, fix the variable λ j = λ (1) j and solve Eq. ( 44) optimizing over ε, B j , (X m ) m and (λ m ) m̸ =j with additional positive semidefinite and linear constraints on the Choi representation of B j to ensure it is completely positive and trace preserving.This provides a solution ε (2) , (λ with objective value at least as high as the previous one, i.e., ε (2) ≥ ε (1) .Next we can either fix B j = B (2) j and optimize again over all variables (λ m ) m and ε or choose another index j ′ and optimize B j ′ fixing λ j ′ .By conducting coordinate descent in this fashion, it is possible to improve the lower bound obtained with the initial coarse-graining maps.
We implemented this procedure in the tree-tensor-network-based example.In this setting there is an additional snag as the maps B m are always of the form W (l) ⊗ W (l) (recall Eq. ( 27)).This presents a problem since W (l) ⊗ W (l) is not linear in W (l) .To overcome this we parametrize B m in terms of two maps W (l) 1 and W (l) 2 as follows This parametrization is linear in both W (l) 1 and W (l) 2 .We can then optimize over one map while holding the other one fixed and alternate between the two.This more general form of the maps (B m ) m corresponds to a modified choice of the layers V (l) 2 k →2 k−1 comprising the block-coarse-graining procedure in Eq. ( 27): which is still a valid relaxation of the LTI problem: each such layer preserves the LTI symmetry corresponding to its length scale and the maps 2 k →2 k−1 defined in Eq. ( 46) and B l+1 (W 2 ) defined in Eq. ( 45) satisfy the required relation Eq. (26).
We note that as an alternative to the procedure described above, one could also pursue other methods to optimize over the coarse-graining maps, such as gradient-decent methods obtained by differentiating the SDP solution with respect to the entries of the tensors as described in Refs.[73,74].

C. Rigorous bounds from finite solver precision
In theory a solution to the relaxation in Eq. ( 21) provides a rigorous lower bound to the problem in Eq. ( 15).However, when performing the numerical optimization the solution will always be accurate up to some finite precision.Furthermore, for large-scale problems some SDP solvers exhibit slow convergence and one might wish to obtain a rigorous result from the last computed iteration and not wait for full convergence.According to basic SDP theory any feasible point of the dual of an SDP gives a lower bound to the primal's optimal value [20].To obtain a rigorous bound from a numerically obtained solution it is sufficient to take the dual variable produced by the solver and modify it to make it strictly feasible.
In our scheme this can be done by using the structure of the dual problem Eq. (44).Starting from a candidate solution (ε, {λ m }, {X m }) produced by the solver, which is not known to be feasible, we can check if it satisfies the constraints in Eq. ( 44) by exact diagonalization of the matrices involved (each one of them is typically of modest size).If it is not feasible, we can correct it as follows: Starting from the last constraint, we replace λ n−1 → λ n−1 = λ n−1 − e n−1 I, where e n−1 is the minimal eigenvalue of the left hand side of the last constraint, −Tr ′ * n (λ n−1 ) − L * n (X n ).After this substitution the last constraint is satisfied because the adjoint of the partial trace Tr ′ n simply inserts tensor products with identity terms (e.g. for a bipartite system AB: ).We then proceed to compute e n−2 , the minimum eigenvalue of the second to last constraint after the substitution with λ n−1 , and replace λ n−2 → λ n−2 = λ n−2 − e n−2 I.We proceed in this way until in the last step we add the minimum eigenvalue of H + λ 0 − ε µ − L * 0 (X 0 ) to ε, the objective energy function, thereby satisfying all constraints with a modified set of variables and a possibly lower bound on the ground state energy.
Note that after, the substitution λ m → λ m = λ m − e m I, a term e m B * m (I) is added to the previous constraint which can inflate the next required correction e m−1 .However, if the maps B m satisfy B * m (I) ≤ I, i.e. if B m are chosen to be trace non-increasing, then the errors of all the individual corrections at most add up in the final step where the energy ε is corrected.In our MPS-based implementation this condition on B m can be satisfied by transforming the MPS into e.g. the left gauge (see Ref. [47] for details) before constructing the coarse-graining maps.In the tree-tensor-networkbased implementation this condition is always satisfied as the maps used there are trace preserving.In theory, gauge transformations of the MPS should not change the value of the relaxation Eq. ( 14) as the constraints before and after changing the gauge are related by an invertible map.In practice, however, the numerical result can be affected by the choice of gauge if the map implementing the gauge transformation is poorly conditioned.
Finally note that depending on the specific scenario there might be other ways to modify the dual solution to make it strictly feasible.In our examples shifting the (λ m ) m variables by a constant is the straightforward choice since the (X m ) m variables correspond to the LTI constraint, meaning they appear in the equations as L * m (X m ) = I⊗X m −X m ⊗I and shifting them by a constant does not affect the constraints.

V. BENCHMARKING THE METHOD ON 1D HAMILTONIANS
In this section, we present numerical results obtained with our relaxation method on a broad range of models, where we have implemented the two variants of the method: the MPS-based variant presented in Section II, and the TTN-based variant presented in Section III B. In addition to the results themselves, we also provide a detailed analysis of the performance of the method.

A. Studied models
We have studied the following 1D spin models: The critical Ising model, the Heisenberg antiferromagnet, the XX model, the XXZ model in the symmetry broken Ising phase (the last three all being instances of the XXZ chain), a critical spin-1/2 J 1 -J 2 model, and the spin-1 Haldane chain.The precise Hamiltonians used are Transverse field Ising model (TFI) , and S X i , S Y i , S Z i the spin operators acting on site i of the chain, with eigenvalues ±1/2 for spin 1/2).An overview of of all the models and the corresponding parameters is given in Table I.The short names used for the models in the following (especially in Fig. 5) are typeset in boldface in the table.

Model Name
Hamiltonian Gapped/Critical (a) Critical TFI HTFI( 1) HJ 1 -J 2 (4.15, 1) Critical TABLE I. Models on which we benchmarked the method.The Hamiltonians of the models are specified in Eq. ( 47).In the main text and in the figures we refer to each model by the letter (a-f ) and abbreviated name appearing in boldface in the table.See Ref. [75] for the phase diagram of the S = 1/2 XXZ Hamiltonian (models b, c and d), and Ref. [76] for that of model f.

B. Numerical results
For all of the above models we solved the MPS-based relaxation, Eq. ( 14), with coarse-graining maps constructed from variationally optimized MPS tensors.For models a through d we solved the tree-tensor-network-based relaxation, Eq. ( 28), with maps first constructed from variationally optimized tree tensors, and then further optimized as described in Section IV B. For each model, we solved the relaxations for various values of n and bond dimensions D. We denote the obtained energy densities by E relax.
MPS(A ⋆ D ) (n) from Eq. ( 14), where A ⋆ D is the variationally optimized MPS with bond dimension D obtained with the VUMPS algorithm [47].

D
) (n) from Eq. ( 28), where (W opt.(l)D ) l are the optimized maps resulting from applying the procedure described in Section IV B to initial maps constructed from a variational TTN with dimension D, which was obtained with the TTN algorithm in Refs.[17,49].
For both variants the solutions have been certified as described in Section IV C.
For comparison, we have also computed the LTI energy density E LTI (n) of the exact SDP hierarchy truncated at the n-th level, Eq. ( 5).Since both variants of our method are relaxations of the LTI problem, it holds that E relax.MPS/TTN (n, D) ≤ E LTI (n).For certain systems, the optimization with VUMPS can fail to converge for specific bond dimensions D for a singlesite unit cell (see [47]).In some cases, this could be remedied by a sublattice rotation [specifically, a π rotation around the z-axis on every second spin which transforms H Where known, we used the exact ground state energy density for E TI (models a, b, c, d and e), and otherwise (model f ) a high-accuracy estimate obtained using VUMPS with a much larger MPS bond dimension 200; note that in the latter case, the deviation of our relaxations from the true energy density is strictly smaller than the error ∆E shown in Fig. 5 (due to the variational nature of the obtained energy density).In addition, for comparison we also provide the distance ∆E var.(D) of the energy of the variationally optimized MPS which we employed for the coarse-graining, i.e. the precision of the corresponding upper bound ; it is shown in the column to the right of each panel.(Note that the MPS data for models (a) and (b) has already been shown in Fig. 2.) The first feature observed in all plots is the linear decay (on a log-log-scale) of the LTI energy (black line with circles), ∆E LTI (n) ∝ n −α , with an exponent α ≈ 2 throughout all models.The MPS curves ∆E relax.
MPS (n, D) for different D (colored dotted and dash-dotted lines) initially follow the (extrapolated) LTI line, and eventually flatten out and converge to a constant value ∆E relax.
MPS (n = ∞, D) as n gets larger.As D is increased, the LTI slope is followed for longer, and the attainable energy is improved.
A closer inspection of this behavior for the gapped models c and e reveals that the curves obtained from our relaxation in fact fall below the linear extrapolation of the LTI line (whether for even or odd n); the same behavior was observed for the TFI model in its gapped phase (not shown).Since ∆E relax.MPS (n, D) ≥ ∆E LTI (n), this shows that the LTI energy actually converges super-algebraically (or at least with a larger slope) in gapped models.This improved convergence rate could not have been inferred from the LTI results themselves, as they are limited to small n, but was only possible using the data obtained from our relaxation.
In all the critical models (all but c and e), the line traced by the tightest result obtained with our relaxation for each n coincides with the linear extrapolation of the LTI results in the log-log-plot.(The relaxation results were computed for even values of n, and the slope obtained from a linear fit to those points lies between the even-n and odd-n LTI slopes.)From this extrapolation, we can determine the effective LTI problem size, n eff.(D), required to obtain the same accuracy as the one achievable with a given D ∆E relax.
MPS (n = ∞, D).We find that for the critical TFI model (a) n eff.In all cases, this lies significantly above the n reachable for the LTI problem, whose complexity grows exponentially in n.More generally, the effective n exhibits an algebraic scaling n eff.(D) ∝ D κ , with a slope κ consistent with the well-known correlation length vs. bond dimension scaling for MPS, κ = 6/(c( 12/c + 1)) (with c the central charge) [77][78][79][80].Together with the n −2 scaling of the LTI accuracy, this gives a scaling prediction ∆E relax.
MPS ∝ D −2κ attainable with the MPS-based relaxation for a given D.
A remarkable feature is that the distance of the lower bounds from the true energy ∆E relax.MPS (n = ∞, D) is generally comparable to the upper bounds obtained from the variationally optimized MPS ∆E var.(D), and in some cases even better by up to half an order.This is rather surprising, given that the MPS has been explicitly optimized to give the optimal upper bound, while we have simply used the same MPS for the coarse-graining for the lower bound, without optimizing over the MPS in any way to obtain optimal lower bounds.This seems to hint that a good renormalizationbased variational ansatz also forms a good basis for a coarse-graining relaxation, a connection that deserves further study.Motivated by this, we have carried out additional numerical studies on the correlation between the precision of the lower and upper bounds, on which we report in Section V D below.
In addition, Fig. 5 reports data obtained using the tree-tensor-network-based method for different D (blue lines).The tree-tensor-network-based variant did not perform as well as the MPS-based one; nevertheless, it improved upon the precision attainable by solving the LTI problem.That this variant performed less well than the MPS-based one was partly to be expected because the TTN-based method starts from the constraints in Eq. (23), which are already a relaxation of the LTI problem as was explained in Section III B. In this variant, we have included the optimization over the coarse-graining maps described in Section IV B; the data reported in Fig. 5 is the optimized data.Through this optimization, we were able to decrease the error of the lower bound by factors ranging from 2 to 6. (E.g., in the TFI model with D = 3 and n = 64 the error was ∆E relax.TTN (64, 3) = 0.0006 prior to optimizing, and 0.000096 afterwards.)Let us note that the way we construct the initial CPTP maps from the tree tensors for this relaxation is ad hoc and could certainly be improved.Alternatively, one could pursue other methods to optimize over the coarse-graining maps, such as gradient-decent methods described at the end of Section IV B.    A feature which merits discussion are the even-odd steps displayed both by the LTI energy and its approximation through the MPS relaxation for all models but the Ising model.All those models have antiferromagnetic couplings, suggesting that the origin of those steps could be traced back to magnetic frustration occurring in XXZ rings [81][82][83].To understand this, note that the LTI problem on n sites can be obtained from the ground state problem on any periodic chain of size m ≥ n by omitting some constraints: The LTI energy density E LTI (n) is thus upper bounded by the smallest ground state energy density for any ring of size m ≥ n, min m≥n E PBC (m).For the models (b) to (f ), the ground state energy density of the periodic chain shows an oscillatory behavior, with the odd-m energy density being higher due to frustration.Thus, the upper bound min m≥n E PBC (m) shows precisely the same step-like behavior as observed for E LTI (n) in Fig. 5.It is thus plausible that the step-like behavior of E LTI (n) has the same origin.Remarkably, we find that for models b through f , as well as in all spin-1/2 XXZ models with ∆ ∈ [−2, 2], E LTI (n) = min m≥n E PBC (m) (up to solver precision), i.e., the the solution of the LTI problem for even n is in fact equal to E PBC (n).This is certainly remarkable, as the LTI problem is a relaxation of the PBC problem, and deserves further study.Let us note that this equality, however, breaks down for more general models (e.g. in the XYZ family), and thus must be rooted in some special properties of the models considered.

C. Efficiency analysis
Let us now assess the performance of the method, that is, the way in which the resources required to solve the problem scale with the precision of the lower bound.To this end, we compare in Fig. 6 the improvement in precision as a function of the number of variables in the problem, N vars., for the MPS-based relaxation [Eq.( 14)] and the exact LTI problem [Eq.( 5)], using the entire data set obtained for all the models (see Table I) on which we tested our method.In Fig. 6, the dotted lines represent the LTI problem and the solid lines our MPS-based relaxation, for each of the models a through f (shown in different colors).We observe that LTI lines become increasingly flat; indeed, from the exponential scaling N vars.(LTI(n)) ∝ d 2n , together with the observed ∆E LTI (n) ∝ n −α , we expect it to scale as − log( • ) in the log-log-plot.Our relaxation results, on the other hand, seems to follow a clear linear trend in the plot.This demonstrates that our method results in algebraic convergence of the energy as a function of the size of the SDP.The observed rates of convergence for the different models scale as N −β vars.with β ∈ [0.59, 0.74], depending on the model; this is in line with the different exponents for n eff (D) ∝ D κ and the n −2 scaling of the energy accuracy.

FIG. 6.
The precision of the tightest lower bound obtained for a given memory budget (the number of variables in the problem) by solving the MPS-based relaxation Eq. ( 14) (solid lines), as well as the LTI problem Eq. ( 5) (dotted lines) for the models listed in Table I.The LTI lines follow a logarithmic curves whereas the MPS-relaxation data follow linear trends.The slopes of the lines were found to be between −0.59 and −0.74 depending on the model.

D. Correlation between upper and lower bounds
A remarkable observation was how well variationally optimized MPS performed when used as coarse-graining maps for our relaxation, resulting in lower bounds of comparable precision.To further investigate this issue, we have numerically studied the relation between the quality of the upper and lower bounds obtained from a given MPS.To this end, we have generated random MPS tensors and applied the VUMPS algorithm to them for a number of iterations between 0 and 64; this way, we could sample MPS which provide variational upper bounds of increasing accuracy.The resulting data is shown in Fig. 7.For each tensor, the upper bound was obtained by evaluating the energy density of the corresponding MPS; its distance from the ground state energy, ∆E VUMPS , is shown on the x-axis.The lower bound for each tensor was obtained by solving the MPS-based relaxation, Eq. ( 14), using the coarse-graining maps constructed from the given MPS.The corresponding error ∆E relax. of the lower bound is shown on the y-axis.
Figure 7 shows a clear correlation between the precision of the upper and the lower bounds.It also shows that while the variationally obtained MPS does not always give rise to the tightest lower bound attainable with that bond dimension, the difference between the best lower bound and that obtained from the MPS variationally optimized with VUMPS for a given bond dimension is small when compared to the improvement obtained when going to a larger bond dimension.

E. Comparison with existing methods
We give a brief overview of related works, and compare their results with ours where relevant.The only results that can be compared directly are those of Ref. [26] where translation-invariant XXZ models on infinite lattices in 1D and 2D are treated.The errors of the bounds they obtained for H 1/2 XXZ (∆) are 2 × 10 −3 for ∆ = 1; 3 × 10 −3 for ∆ = 2; and below 10 −4 for ∆ = 0 (i.e. the XX model, which is equivalent to a system of free fermions).We obtained 3 × 10 −4 , 2 × 10 −4 , and 3 × 10 −4 respectively.Several further works have implemented relaxation methods based on RDMT, or very similar approaches.Those, however, deal with periodic spin chains and therefore do not provide rigorous bounds on the infinite-system value.(We already mentioned above that the energy density on a ring of size   14) as a function of the precision of the variational upper bound corresponding to the MPS that was used to construct the coarse-graining maps for the relaxation.
For each i ∈ {0, 1, 2, 4, 8, . . ., 64} 300 random MPS tensors with bond dimension D were generated and updated by i iterations of the VUMPS algorithm.Each tensor was then used to obtain a lower bound using our relaxation scheme, and an upper bound by evaluating the energy density of the uniform MPS generated by the tensor.
n, E PBC (n), is lower bounded by the LTI enery E LTI (n); furthermore, one can show that E LTI (n) ≥ E PBC (n) − c/n for a constant c.) Ref. [3] gives lower bounds for translation-invariant spin Hamiltonians: for the spin 1/2 Heisenberg model on periodic chains of sizes up to 50 they obtained energy densities 10 −2 below the true infinite-system value.In Ref. [24] a method based on correlation matrices, equivalent to RDMT [84], is applied to finite periodic spin chains.For the critical Ising model and XXZ models with ∆ ∈ [0, 2] they obtained errors of the order 10 −2 and 10 −1 , respectively.Ref. [28] applies RDMT to bound the ground-state energy, as well as ground-state correlation functions in the Heisenberg model with nearest-neighbor and next-nearest-neighbor interactions in periodic 1D and 2D lattices.They obtain an error of 4 × 10 −4 for the Heisenberg model on a periodic chain of length 100.

F. Numerical implementation
We conclude this section with some details regarding our numerical implementations.Sample code is available at [85].We used predominantly general-purpose software packages to solve the SDPs involved.We used YALMIP [86] for modeling the problems and solved primarily using MOSEK [87].The unfavorable memory scaling of interior-point methods (see e.g.[88]) prevented us from going to bond dimensions higher than 7 in the MPS-based variant and 5 in the tree-tensor-network-based variant.To overcome this, we tried to use the splitting conic solver (SCS) [89] which is a first-order method and requires less memory, but observed slow convergence and were not able to improve the results in any of the models, with the exception of the S = 1 Heisenberg chain.In this model we used our own matrix-free implementation of SCS (where the constraint matrix is not explicitly constructed as a sparse matrix, but only implicitly applied, which provides significant savings due to the tensor-network structure of the constraints) to produce the results for bond dimensions 6 through 9.In addition, SCS was used to solve the larger instances of the LTI problem, Eq. ( 5), where it performed very well.
We have not explored further methods beyond the ones mentioned.We believe the results we presented could be markedly improved upon by using custom-tailored solvers.The chain structure of the constraints in the relaxed problem Eq. ( 14) (and more generally Eq. ( 21)) suggests that splitting methods [90] other than the one employed by SCS could lead to an advantage.Combining those with matrix-free solvers would further reduce the complexity and allow to perform computations for larger system sizes n and bond dimesnions D.
Finally we note that the presence of local symmetries can be naturally accounted for within our method.In Appendix C we explain how by using suitable coarse-graining maps, one ends up with a relaxation involving symmetric states thus reducing the problem's complexity.

VI. CONCLUSION
In this paper, we have introduced a general method to obtain rigorous lower bounds for a wide range of minimization problems in a quantum many-body setting.As a prime application, we demonstrated how our method can be applied to lower bound the ground state energy of quantum spin systems.At the same time, the method lends itself to a wide range of other optimization problems with a similar structure, such as the certification of entanglement and nonlocality in quantum information theory.The method achieves this by providing a systematic and customizable way to obtain outer approximations to the set of physical quantum correlations; it is thus suggestive that it can also provide new ideas on how to relax other numerical methods based on positivity constraints, such as reduced-density-matrix theory or various bootstrap techniques.
The key idea of our method is to relax the convex set of few-body density matrices or other correlations which are physical-that is, compatible with a global quantum state-in two steps.In a first step, the global compatibility constraint is decomposed into a hierarchy of constraints between increasingly complex objects.A first relaxation can then be obtained by truncating the hierarchy at a finite level n.Approaches of this kind have been pursued previously in various settings; however, they suffer from the exponential growth of the involved objects, which severely limits the level of the hierarchy that one can reach.
The key novelty of our method lies in the second relaxation: We apply a coarse-graining procedure to all objects in the hierarchy, in a way which compresses them down to objects of a fixed (or otherwise tractable) size.This compression must be chosen such as to keep the relevant degrees of freedom, which can be accomplished, e.g., by employing renormalization-based ideas.At the same time, any such compression scheme must act in a consistent way across all levels of the hierarchy, a demand yet again met by renormalization-based coarse-graining maps.
By utilizing this compression, we can effectively reach very large levels n of the original hierarchy, and thus achieve results with a much improved accuracy.At the same time, the optimization problem remains a semidefinite program (SDP), and can thus be efficiently solved.In addition, the possibility to smoothly adjust the coarse-graining maps to best suit the problem at hand, and to optimize them as well in the course of the optimization, makes our relaxation much more tunable than existing approaches, which are based on discrete, and thus rigid, hierarchies.Moreover, insights from renormalization procedures can be used as a guide in choosing those maps.
We have worked out our method in detail for 1D quantum spin chains, and demonstrated its power through the explicit numerical study of a range of relevant models.Here, we have employed two coarse-graining schemes, both of which are motivated by renormalization-based variational ansatzes: One based on Matrix Product States (MPS), and one on Tree Tensor Networks (TTN); in the latter case, we have additionally implemented an optimization over the coarse-grainers.In both cases, we observed that the variationally optimized wavefunction also provides a very good coarse-graining map for the relaxation of the hierarchy, in line with the intuition that the coarse-graining should keep low-energy degrees of freedom, akin to renormalization schemes.We have tested both approaches on a wide range of gapped as well as critical models.We found that both schemes clearly outperform the exact truncated hierarchy, with the MPS approach generally performing noticeably better.Specifically, using the MPS-based relaxation, we were able to gain between one and two orders of magnitude in precision in energy, on par with sizes of the truncated hierarchy between n ≈ 40 and n ≈ 120 (while the exact hierarchy was limited to n = 13).Overall, we observed that the resources required for our method scaled polynomially in the targeted energy precision, as opposed to the exponential scaling of the exact truncated hierarchy.
The general framework presented in this paper can be applied to a much broader range of problems than those that we used as a demonstration.First, the method opens avenues to obtain provable lower bounds on ground state energies for two-dimensional systems, where good variational upper bounds -e.g.: using tensor networks-are much harder to obtain, and where the performance of those methods is less well understood.It can also be applied to reduced-density-matrix theory and related approaches based on correlation matrices, and thus to problems without an underlying locality notion, such as in quantum chemistry.Finally, at its heart, our relaxation presents a way to certify when a (very large or even infinite) matrix that satisfies a set of linear and convex constraints on its entries (such as having certain two-body marginals or few-point correlations, or being a separable state) cannot be chosen to be positive; as such, it is also applicable to problems such as entanglement or non-locality detection, as well as to the so-called bootstrap approach to many-body problems.
A key challenge in generalizing our method is the right choice of coarse-graining maps.In this regard, we have demonstrated that maps based on tensor networks, which can in most cases be interpreted as variational wavefunctions underlying different renormalization schemes [19], are a suitable option.It is reasonable to expect that tensor network ansatzes will work in higher dimensional problems as well.Moreover, given the often excellent performance of tensor networks beyond quantum lattice problems, it is likely that suitably chosen tensor network ansatzes will also make good coarse-grainers in scenarios other than energy minimization.Remarkably, tensor networks optimized to derive variational upper bounds generally performed surprisingly well when used as coarse-grainers in our relaxation scheme; a better understanding of this connection, as well as of the ways in which different tensor networks can be used as coarse-graining maps, remain as open problems for future work.   in all the 1D models (not shown in Fig. 5).Second, The slopes of the LTI squares and LTI triangles bounds are very close in both models.It seems that the LTI conditions for the two different shapes have similar power in terms of "constraints per spin".When comparing the LTI slopes of the 2D models with those observed in 1D we notice that the convergence is faster in 1D (where the scaling was n −2 ).Finally, as expected, for increasing coarse-graining dimension D the value of the lower bound from the relaxation defined in Fig. 8 converges towards the extrapolation of the LTI line.(The compression would be lossless for D = 8 in the case of triangles and D = 16 in the case of squares.)These results demonstrate that our method can be used in 2D to push beyond what is possible using the LTI hierarchy.The bounds we obtained with this rudimentary demonstration are already very close to the best lower bounds we are aware of for those systems (given in Ref. [26]).From the linear extrapolation in Fig. 9 we can read off which region sizes have to be reached in a relaxation scheme based on the 2D LTI hierarchy in order to achieve a desired accuracy.The SDPs we solved to produce these results are very simple and are easy to set up using SDP modeling software packages (50-100 lines of code).Sample code is available at [85].Larger coarse graining dimensions could be handled by implementing matrix-free SCS for this problem.
It also demonstrates the challenges of applying our approach to relax the 2D-LTI hierarchy: to preserve the LTI symmetry of the problem when formulating the next level of the hierarchy we must increase the size of the patch of side length l by 2l + 1 spins in the case of squares, and by l + 1 spins in the case of triangles.In addition, even after coarse-graining we end up with a system consisting of all the spins on the boundary of the patch, which quickly becomes too large.Finally, it is not obvious how to implement the composition properties required from the coarse-graining maps if we would like to continue with a further coarse-graining step which is compatible with the LTI symmetry (maps based on PEPS would work but they result in a coarse-graining dimension which is too large).
We already mentioned in the main text in Section III C that those difficulties can be overcome by using a tree coarsegraining procedure, or by first formulating the problem within the RDMT framework and applying the procedure described in Section III D. Another way would be to give up the explicit LTI symmetry in our relaxation and instead relax a chain of constraints in which one increases the triangular region by at most two spins at each step, such that an additional side-length-2 triangle can fit in it.The region already covered can be then coarse-grained with an MPS using a snake pattern in order to cover triangles of increasing size.In this scheme the system size still grows from one step to the next, but it requires only l + 1 spins to be kept to relax the 2D LTI problem for a side-length-l triangle.We can use the extrapolation of the LTI bounds in Fig. 9 to estimate the precision we could hope to achieve by implementing this scheme.For example, in order to improve the current best result for the Heisenberg model (Ref.[26]) by an order of magnitude we would need to solve for a triangular region of side length l ≳ 11, which is still a tractable SDP.We leave this for future work.
FIG. 2.Distance ∆E relax. of the lower bounds to the ground state energy density, obtained by solving the MPS-based relaxation outlined in Section II, to the true value, for (a) the transverse field Ising model at the critical point, and (b) the spin 1/2 Heisenberg antiferromagnet.The differently colored dashed and dash-dotted lines show results obtained using MPS with different bond dimensions D (leading to better relaxations).The black circles give the energy obtained by solving the exact hierarchy up to n sites, whose computational cost grows exponentially in n; it displays an algebraic scaling.The MPS relaxations approximate its behavior increasingly well as D increases, at a fraction of the computation cost (which scales algebraically in the effective n), thus allowing to reach accuracies of between one and two orders of magnitude higher.

FIG. 3 .
FIG.3.Graphical representation of the tree-tensor-network-based relaxation scheme for n = 16.The left side of the figure shows the constraints Eq. (23) before coarse-graining is applied: Each state ρ (2 k ) is obtained from the state above it in three different ways by tracing out the appropriate spins, as indicated by the curved black arrows.On the right side, the coarsegrained 4-partite states ω (2 k ) are represented by the dashed rectangles.Each ω (2 k ) is obtained from ρ (2 k ) by applying the C k maps that consist of layers of 2-to-1 coarse-graining maps indicated by inverted 'Y' shapes, see Eq.(24).The maps B k−1 then implement an additional coarse-graining step on ω(2 k ) .The relaxed constraints between the states ω (2 k ) , Eq. (27), are indicated FIG.3.Graphical representation of the tree-tensor-network-based relaxation scheme for n = 16.The left side of the figure shows the constraints Eq. (23) before coarse-graining is applied: Each state ρ (2 k ) is obtained from the state above it in three different ways by tracing out the appropriate spins, as indicated by the curved black arrows.On the right side, the coarsegrained 4-partite states ω (2 k ) are represented by the dashed rectangles.Each ω (2 k ) is obtained from ρ (2 k ) by applying the C k maps that consist of layers of 2-to-1 coarse-graining maps indicated by inverted 'Y' shapes, see Eq.(24).The maps B k−1 then implement an additional coarse-graining step on ω(2 k ) .The relaxed constraints between the states ω (2 k ) , Eq. (27), are indicated by !=.

2 XXZ
(−∆)].Yet, VUMPS would not converge for all values of D, leading to missing data points (D = 5, 7 in model c and D = 2, 3 in models d and f ).Fig. 5 shows the distance of the obtained lower bounds from the actual ground state energy density E TI , denoted by ∆E relax.MPS/TTN (n, D) := E TI − E relax MPS/TTN (n, D) and ∆E LTI (n) = E TI − E LTI (n), respectively, for all six models.

FIG. 5 .
FIG.5.The error of the lower bounds to the ground-state energies of the models listed in TableIobtained by solving the MPSbased and tree-tensor-network (TTN) based relaxations, Eqs.(14) and (28), of the translation-invariant (TI) local Hamiltonian problem Eq. (1).Both methods are relaxations of the LTI problem, Eq. (5).The error of the LTI lower bound, ∆ELTI(n) := ETI − ELTI(n), is plotted by the solid black line with circles.The dotted and dash-dotted lines plot ∆E relax.MPS (n, D) := ETI − E relax.MPS (n, D) obtained from the MPS-based implementation, and the solid blue lines correspond to the TTN-based variant and plot ∆E relax.TTN (n, D), both of which are shown for various sizes n and various bond dimensions D. In the TTN-based method the coarse-graining maps were optimized numerically as described in Section IV B. The bar to the right of each panel shows the precision of the variational upper bound, ∆E var., corresponding to the MPS used in the MPS-based relaxation.

42 FIG. 7 .
FIG.7.Precision of the lower bound obtained by solving the MPS-based relaxation Eq. (14) as a function of the precision of the variational upper bound corresponding to the MPS that was used to construct the coarse-graining maps for the relaxation.For each i ∈ {0, 1, 2, 4, 8, . . ., 64} 300 random MPS tensors with bond dimension D were generated and updated by i iterations of the VUMPS algorithm.Each tensor was then used to obtain a lower bound using our relaxation scheme, and an upper bound by evaluating the energy density of the uniform MPS generated by the tensor.

FIG. 9 .
FIG.9.2D LTI results.The error ∆E of the lower bounds to the ground-state energies of the Heisenberg and XX models on a 2D square lattice obtained by solving the first two levels of the 2D LTI hierarchy for square-shaped regions (Fig.4 (a)), the first three levels for triangle-shaped regions (Fig.4 (b)), and the relaxations with coarse-graining dimension 1 ≤ D ≤ 3 of the next level for both squares and triangles (Fig.8(a) and (b) respectively).In addition Anderson bounds for square-shaped regions of sizes up to 5 × 5 are shown.The error ∆E is plotted against the number of spins n in the largest region appearing in the problem (before compression) on a log-log scale, and was computed with respect to energies obtained from quantum Monte Carlo computations (Refs.[92,93]).For each of the series: LTI squares, LTI triangles, and Anderson squares a linear extrapolation is plotted by a dotted line.