Entropy scaling law and the quantum marginal problem

Quantum many-body states that frequently appear in physics often obey an entropy scaling law, meaning that the entanglement entropy of a subsystem can be expressed as a sum of terms that scale linearly with its volume and boundary area, plus a correction term that is independent of its size. We conjecture that these states have an efficient dual description in terms of a set of marginal density matrices on bounded regions, obeying the same entropy scaling law locally. We prove a restricted version of this conjecture for translationally invariant systems in two spatial dimensions. Specifically, we prove that a translationally invariant marginal obeying three non-linear constraints -- all of which follow straightforwardly from the entropy scaling law -- must be consistent with some global state on an infinite lattice. Moreover, we derive a closed-form expression for the maximum entropy density compatible with those marginals, yielding a variational upper bound on the thermodynamic free energy. Our construction's main assumptions are satisfied exactly by solvable models of topological order and approximately by finite-temperature Gibbs states of certain quantum spin Hamiltonians.

I can illustrate the second approach with the same image of a nut to be opened. The first analogy that came to my mind is of immersing the nut in some softening liquid, and why not simply water? From time to time you rub so the liquid penetrates better, and otherwise you let time pass. The shell becomes more flexible through weeks and months; when the time is ripe, hand pressure is enough, the shell opens like a perfectly ripened avocado! A different image came to me a few weeks ago. The unknown thing to be known appeared to me as some stretch of earth or hard marl, resisting penetration ... the sea advances insensibly in silence, nothing seems to happen, nothing moves, the water is so far off you hardly hear it ... yet it finally surrounds the resistant substance. [Grothendieck, 1985-1987]

1 Introduction

The discovery that there are materials exhibiting superconductivity at a temperature significantly higher than what the BCS theory of superconductivity [1] predicts has been one of the major driving forces behind the study of strongly correlated electron systems over the past few decades [2]. The struggle to definitively identify the underlying microscopic physical mechanism has been a major theme of research in strongly correlated materials over this period.
While writing down a toy model of such systems is often straightforward, solving them is far from trivial. Kitaev showed that the problem of estimating the ground state energy of such models is QMA-complete, which is likely to be hard even for a quantum computer [3]. To make progress, one must resort to a set of physical assumptions that can alleviate this otherwise exorbitant computational cost.
A "model citizen" of such physical assumptions is the spectral gap. Hastings showed that one-dimensional quantum many-body systems with a constant spectral gap obey an area law, in which case the ground state wavefunction can be efficiently described by a tensor network called a matrix product state [4]. Moreover, there is a classical algorithm that, given a one-dimensional gapped Hamiltonian, finds the ground state wavefunction in polynomial time [5]. These developments firmly establish that one-dimensional gapped systems form an easy subclass of Hamiltonians that are amenable to classical computational methods.
However, attempts to extend these insights to higher dimensions have been stymied by many obstacles. Despite recent work establishing a subvolume law for the entanglement entropy in two spatial dimensions [6], a rigorous proof of the area law in higher dimensions remains open. Moreover, even if we can prove the area law in the future, that alone cannot be the end of the story. Ge and Eisert showed that a class of states in two dimensions that satisfy the area law for all Rényi entropies cannot be characterized by tensor networks with a polynomial number of parameters [7]. These authors further conclude that, even amongst the states that obey the area law, the states that have an efficient classical description constitute an even smaller corner of that already restrictive set.
If so, what kind of physical states can we describe efficiently? In this paper, we will take a top-down approach to tackle this problem. By postulating a plausible property of gapped systems, we will be able to derive an efficient tensor network that can accurately approximate their ground states. (In fact, the property we demand can be satisfied, albeit approximately, even by the finite-temperature Gibbs state of some locally interacting many-body Hamiltonian.) The derived tensor network, which we briefly mention in Appendix F, has two useful properties. First, the network can be contracted efficiently, in time that scales linearly with the system size. This is in stark contrast with more general tensor networks, which are computationally hard to contract [8]. Second, the global entropy of the tensor network can be computed efficiently. This is a highly nontrivial feature that not many tensor networks can boast; see, however, Ref. [10] for a notable exception.
While the existence of such a tensor network is undoubtedly surprising, why it has those properties is not apparent from its definition. A better perspective is to interpret the tensor network as a maximum-entropy state consistent with a set of marginal density matrices, marginals for short. From this point of view, the fundamental data that characterizes the global state is the set of marginals on balls of bounded radius that overlap with each other; see Fig. 1.
Of course, there is a well-known difficulty in such characterization. Given a set of marginals, one may not be able to efficiently decide if those marginals are consistent with some global state. A completely general version of this problem is known to be QMA-complete [11,12]. Therefore, it will be too ambitious to find a complete solution to this problem. This is where our main result comes in. We provide a physically motivated sufficient condition on the set of marginals such that, if the condition is satisfied, one can guarantee the existence of a global state consistent with those marginals. Moreover, under the same condition, we can derive an exact expression for the maximum global entropy consistent with the given marginals. Therefore, not only can we compute an upper bound to the ground state energy, but we can also compute an upper bound to the thermodynamic free energy.
Our conditions are satisfied if the underlying state, up to a finite-depth quantum circuit, obeys the following scaling law for the entanglement entropy of any disk-shaped region A:

S(A) = α_0|A| + α_1|∂A| − γ, (1)

where the first two terms are proportional to the volume and the boundary of A respectively, quantified in terms of the number of degrees of freedom in the interior/boundary of A, and the last term is a constant that is independent of the size of A. (See Fig. 2.) Here α_0 and α_1 are non-universal constants and γ is a universal term that is independent of these microscopic details. Modulo a correction term that vanishes in the |A| → ∞ limit, Eq. (1) is a commonly used form of the entanglement entropy that is expected to hold in physical systems with a finite correlation length, both at zero [13,14] and finite temperature [15]; see Ref. [16] for a review. Insofar as Eq. (1) is valid, our result is exact. Since Eq. (1) has been observed to hold in a large class of systems up to a correction that decays exponentially in |∂A|, we expect the approximation error to decay rapidly as we increase the size of A. Provided that our speculation is correct, by minimizing the free energy within the space of such marginals, we will be able to obtain a variational upper bound that well-approximates the true thermodynamic free energy. With Eq. (1) in hand, we can describe our main result somewhat more precisely, while still glossing over the mathematical details. Roughly speaking, we prove the following correspondence:

Translationally invariant marginal obeying Eq. (1) locally
⟷
Translationally invariant state obeying Eq. (1) globally. (2)

Specifically, there is a one-to-one map between a translationally invariant marginal on a finite cluster obeying Eq. (1) within that marginal and a global state consistent with those marginals obeying Eq. (1) at a larger scale. This means that a translationally invariant marginal on a bounded region obeying Eq. (1) defines a translationally invariant state of an infinite system. Moreover, the maximum entropy (density) of the latter can be computed exactly using Eq. (1).
To reach this conclusion, we make a simple but nontrivial observation: the states that satisfy Eq. (1) obey the following identity for a judiciously chosen set of subsystems (see Fig. 3 for examples):

S(ABC) = S(AB) + S(BC) − S(B). (3)

The states that satisfy Eq. (3) have a very special structure known as the quantum Markov chain structure [17,18]. The derivation of our result is an "elementary" consequence of this fact, in the sense that every argument is based only on the properties of the quantum Markov chain. However, the proof of Eq. (2) is rather long and technical. To alleviate this difficulty, we developed a theory of the merging algebra. This is an algebra equipped with two types of binary operations over a set of marginals, with nontrivial relations inherited from Eq. (1). Interestingly, these operations are generally non-associative, and one of them is not even commutative. Instead, the commutativity/associativity relations hold only in certain specific contexts. Despite this complication, these rules are still structured enough to let us prove our central claims.
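The identity in Eq. (3) is precisely the statement that the conditional mutual information I(A : C|B) = S(AB) + S(BC) − S(B) − S(ABC) vanishes. A minimal numerical illustration, which is not the paper's construction: embed a classical Markov chain A → B → C as a diagonal density matrix (for which the von Neumann entropy reduces to the Shannon entropy) and verify that this linear combination of entropies vanishes.

```python
import numpy as np

def shannon(p):
    # Shannon entropy of a probability vector (von Neumann entropy
    # of the corresponding diagonal density matrix)
    p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(0)
d = 3
pa = rng.dirichlet(np.ones(d))        # p(a)
pb_a = rng.dirichlet(np.ones(d), d)   # p(b|a), row index a
pc_b = rng.dirichlet(np.ones(d), d)   # p(c|b), row index b

# p(a,b,c) = p(a) p(b|a) p(c|b): a classical Markov chain A -> B -> C
p = np.einsum('a,ab,bc->abc', pa, pb_a, pc_b)

S_ABC = shannon(p.ravel())
S_AB = shannon(p.sum(axis=2).ravel())
S_BC = shannon(p.sum(axis=0).ravel())
S_B = shannon(p.sum(axis=(0, 2)))

cmi = S_AB + S_BC - S_B - S_ABC   # I(A:C|B), zero for a Markov chain
assert abs(cmi) < 1e-9
```

Any choice of the conditional distributions yields the same conclusion, since the Markov property p(c|a, b) = p(c|b) holds by construction.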
While the merging algebra is very different from what we ordinarily encounter in physics, there is clearly some structure hidden beneath all this new mathematics. That a generic-looking equation like Eq. (1) leads to such a novel mathematical structure is surprising. But what is even more surprising is that this very structure can lead to a physically well-motivated condition to ensure the existence of a global state consistent with some marginals. This is known as the quantum marginal problem, which has been notoriously difficult to make progress on; see Ref. [19] and the references therein.
One may wonder: is it really essential to invoke a new mathematical structure? Is there no simple way to derive the same result more concisely? Unfortunately, we do not have a satisfactory answer to these questions. However, let us point out that our construction becomes significantly simpler in the classical setting [20], giving hope that perhaps a more straightforward proof will be found in the future. For now, the main reason why the arguments in Ref. [20] do not directly generalize to the quantum setting is that in quantum mechanics, reduced density matrices generally do not commute with each other. For that reason, many of the operations we consider are intrinsically non-commutative. In fact, they are generally not even associative. Further progress in this direction seems to require a deeper understanding of the structure of multipartite quantum states that satisfy Eq. (3) on multiple subsystems.
An important message from our work is that one should take Eq. (1) seriously. As we have already discussed, Eq. (1) leads to a formulation of efficiently contractible tensor networks, physically motivated solutions to the quantum marginal problem, and even a new kind of algebra whose exact nature is still a mystery. Moreover, it was shown in Refs. [21-23] that Eq. (1) leads to a highly constrained set of consistency equations that govern the basic "laws" of the low-energy excitations in such systems. These results strongly indicate that there are rich physics and mathematics hidden beneath Eq. (1) that warrant further exploration. In particular, we would like to conjecture the following generalization of Eq. (2):

Locally consistent marginals obeying the identities in Fig. 3 internally
⟷?
Global quantum state obeying the identities in Fig. 3 everywhere. (4)

That is, one may be able to remove the translational invariance condition entirely. Proving or disproving Eq. (4) is an important open problem that is left for future work.
The rest of this paper is structured as follows. In Section 2, we provide a more in-depth summary of our main results. In Section 3, we introduce the fundamental objects behind our work, namely the marginals over bounded regions. We summarize the constraints imposed on those marginals and briefly sketch their implications. In Section 4, we provide a short review of the properties of the von Neumann entropy. In Section 5, we introduce and set out the basic rules of the merging algebra; moreover, we introduce an object called a snake and study its properties. In Section 6, we summarize the key statements that directly lead to the main results of this paper. This section only provides an overview of these statements, deferring the proofs to Appendices B, C, and D. In Section 7, we derive the main results from the statements in Section 6. We end with a discussion in Section 8.

Solving quantum many-body physics locally
In this section, we provide a more explicit explanation of our main result. The class of systems we consider is depicted on the right-hand side of Fig. 4: a (triangular) lattice of locally interacting quantum spins. After a sufficient number of coarse-graining steps, we can assume without loss of generality that the interaction terms of the Hamiltonian can be localized to a 2 × 2 cluster of the coarse-grained lattice.
Given a Hamiltonian that acts locally on this lattice, one may wish to study its properties at both zero and non-zero temperature, by minimizing the energy and the free energy respectively. Unfortunately, this is often a difficult problem because a direct minimization in the entire Hilbert space incurs an exponential cost.
One may hope to reduce this exorbitant cost by considering a smaller subset of physically relevant states and variationally minimizing the energy or the free energy within that family. In this paper, we take a top-down approach to identify such a family: instead of coming up with an ansatz and attempting to justify its physical relevance, we begin with a physical principle and explore its consequences seriously.

[Fig. 4 caption: By coarse-graining, we obtain a quantum spin system with a larger local Hilbert space dimension on a triangular lattice. Each of the coarse-grained degrees of freedom is referred to as a cluster. After coarse-graining enough times, the local terms of the Hamiltonian can be localized to a 2 × 2 cluster (green). The pair of integers represents the x- and y-coordinates of each lattice site on the coarse-grained lattice.]

Our starting point is Eq. (1), which for the reader's convenience is restated below: S(A) = α_0|A| + α_1|∂A| − γ. For the class of states that satisfy Eq. (1), there is a universal fact that is independent of α_0, α_1, and γ: the first two terms of Eq. (1) obey the inclusion-exclusion principle. Therefore, with a judicious choice of subsystems, certain linear combinations of entanglement entropies vanish. Of particular relevance to us are three identities. Consider a 2 × 2 and a 3 × 3 cluster. The marginals over these clusters should satisfy the pictorial identities Eqs. (5) and (6), where S(·) represents the entanglement entropy of the clusters in the parentheses over some (fixed) global state that satisfies Eq. (1). These are precisely the clusters appearing on the left-hand side of Fig. 4, representing coarse-grained degrees of freedom. The gray clusters indicate that a partial trace has been applied to those clusters. For instance, the first term on the right-hand side of Eq. (6) is the entropy of the reduced density matrix of a 2 × 2 cluster located at the bottom-left corner of the 3 × 3 cluster. Let us briefly comment on what the "volume" (|A|) and "area" (|∂A|) terms mean. There are two heuristic ways to calculate them, both leading to Eqs. (5) and (6). The first approach is to count clusters: for the volume term, simply count the number of clusters; for the area term, simply count the number of clusters placed at the boundary. It is a straightforward exercise to verify that both Eq. (5) and Eq. (6) follow from this simple rule.
Alternatively, one may view each cluster as a collection of smaller clusters. For instance, imagine replacing each cluster by a 2 × 2 cluster.
Now Eqs. (5) and (6) read as refined identities, which again follow from the same cluster-counting rule. The same conclusion continues to hold even if we replace each cluster in Eqs. (5) and (6) by an n × m cluster for n, m ∈ Z_+.
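As a concrete sanity check of this counting heuristic, one can take the ansatz of Eq. (1) with the perimeter 2(n + m) standing in for |∂A| of an n × m rectangle of clusters, and verify that the Markov identity S(ABC) + S(B) = S(AB) + S(BC) holds for vertical-strip partitions, in the spirit of Eqs. (5) and (6). The particular values of α_0, α_1, and γ below are arbitrary placeholders; the identity holds for any values.

```python
# Entropy ansatz of Eq. (1) for an n x m rectangle of clusters:
# the volume term counts clusters, the area term uses the perimeter.
def S(n, m, a0=0.7, a1=0.3, gamma=0.1):
    return a0 * n * m + a1 * 2 * (n + m) - gamma

# Partition an n x (wa+wb+wc) rectangle into vertical strips A|B|C and
# check the Markov identity S(ABC) + S(B) = S(AB) + S(BC): the volume
# terms are additive, the perimeters obey inclusion-exclusion, and the
# constant -gamma appears once on each side of the identity, so it cancels.
for n in range(1, 5):
    for wa in range(1, 4):
        for wb in range(1, 4):
            for wc in range(1, 4):
                lhs = S(n, wa + wb + wc) + S(n, wb)
                rhs = S(n, wa + wb) + S(n, wb + wc)
                assert abs(lhs - rhs) < 1e-12
```

Note that the identity is exact for the ansatz regardless of the strip widths, which is the inclusion-exclusion property invoked in the text.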
Admittedly, this discussion is somewhat heuristic. However, an important point is that there is a large class of quantum many-body systems that are expected to satisfy Eq. (5) and (6). We have supplemented our heuristic discussion with concrete many-body quantum states in Appendix E for which these conditions are satisfied either exactly (at zero temperature) or approximately (at finite temperature).
So far, we have found that the reduced density matrices of a state satisfying Eq. (1) obey Eqs. (5) and (6) over the 2 × 2 and 3 × 3 clusters. Moreover, these reduced density matrices must be locally consistent with each other, meaning that their measurement statistics must be identical on their overlapping supports. In this paper, we put forward the following conjecture, which would establish a map in the opposite direction: Conjecture. Consider a set of locally consistent marginals that satisfy Eqs. (5) and (6). The maximum-entropy state consistent with those marginals exists and furthermore obeys S(ABC) = S(AB) + S(BC) − S(B) for every A, B, and C such that (i) A, B, C, AB, BC, and ABC are all disk-like regions and (ii) A and C are not adjacent to each other.
The part of the conjecture that involves the entropy formalizes the expectation that Eq. (1) holds everywhere. In other words, this conjecture states that the local reduced density matrices satisfying Eqs. (5) and (6) form a "dual" (or more precisely, an equivalent) description of a density matrix that obeys Eq. (1) everywhere. If true, an immediate application of this duality would be the efficient computation of the expectation values of local observables and of the entanglement entropy. The former can be computed from the marginals directly, and the latter from the fact that the leading terms of the entropy obey the inclusion-exclusion principle.
We provide two nontrivial pieces of evidence in support of this conjecture, focusing on translationally invariant systems in two spatial dimensions. First, given a set of locally consistent marginals satisfying Eqs. (5) and (6), we show that for any N, M ≥ 2, there always exists a global state on an N × M cluster which is consistent with the marginal and its translations. Second, we derive a closed-form expression for the maximum entropy of the N × M cluster that is consistent with Eqs. (5) and (6). This expression is consistent with Eq. (1).
More concretely, we assume that the fundamental marginals on the 2 × 2 and 3 × 3 clusters obey a set of consistency constraints (7), together with Eqs. (5) and (6). Here T = ⟨t_x, t_y⟩ is the group of translations generated by the x- and y-translations (t_x and t_y, respectively), and T(S) of a set S is the orbit of S under T. The notation ρ_i c= σ_j means that ρ_i and σ_j have identical reduced density matrices on their overlapping supports, for all i and j. For instance, if ρ_1 acts on H_A ⊗ H_B and σ_1 acts on H_B ⊗ H_C, ρ_1 c= σ_1 means Tr_A(ρ_1) = Tr_C(σ_1). From these assumptions, we deduce that, for any N and M, there is a state ρ_{N×M} on an N × M cluster that is consistent with the marginal on the 2 × 2 cluster and its translations (Eq. (8)). Moreover, we obtain an exact expression for the maximum entropy that obeys Eq. (8); this expression can be readily computed from the marginals. These expressions are particularly useful for directly probing the thermodynamic limit. Given a translationally invariant Hamiltonian, the expectation value of the local term with respect to the marginal ought to be consistent with the energy density of some translationally invariant quantum state, upper bounding the ground state energy density. Moreover, in the infinite-volume limit, the maximum entropy density admits a closed-form expression in terms of the marginals. Combining these two results, we obtain an upper bound on the thermodynamic free energy density that can be readily computed from the fundamental marginals. To reach this conclusion, we extensively use a technique called merging, first introduced by Kato et al. [24]. The authors of Ref. [24] proved the following statement: consider two quantum states ρ_ABC and σ_BCD such that I(A : C|B)_ρ = 0, I(B : D|C)_σ = 0, and ρ_BC = σ_BC; then there exists a state on ABCD that is consistent with both. An important fact about the merging technique is that one can deduce the existence of a consistent global state by merely verifying local conditions.
Note that all the required conditions can be verified by either (i) computing the entanglement entropies of the given density matrices or (ii) comparing their reduced density matrices. Once those conditions are verified, we can deduce the existence of some state τ that is consistent with both ρ and σ. Moreover, the maximum-entropy state consistent with those marginals again has zero conditional mutual information.
That the outcome of the merging process is a state with zero conditional mutual information implies that the merging process can be bootstrapped. In the context of our work, this bootstrapping process can be expressed schematically, with tuples representing the locations of the clusters; we defer the exact meaning of this diagram to Section 7. Roughly speaking, by using the merging technique, we merge the marginals on the 2 × 2 clusters to obtain a set of density matrices on an N × 2 cluster for any N ≥ 2. Then, we merge these density matrices to obtain a density matrix over an N × M cluster for any N, M ≥ 2. By showing that this argument applies to any N and M, we conclude that there is a translationally invariant state consistent with the fundamental marginals if the requisite conditions in Eqs. (5), (6), and (7) are satisfied. Moreover, using the conditional independence of the maximum-entropy merged state, we can decompose the entropy of the merged state into a linear combination of entropies that can be readily computed from the marginals.
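The bootstrapping idea has a transparent one-dimensional classical analogue. The sketch below is not the paper's two-dimensional procedure, but it illustrates the mechanism: starting from a translationally invariant two-site marginal of a stationary Markov chain, repeatedly merging through the shared site produces a consistent N-site distribution for every N.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3
T = rng.dirichlet(np.ones(d), d)   # transition matrix p(y|x), rows sum to 1
# stationary distribution: left eigenvector of T with eigenvalue 1
w, v = np.linalg.eig(T.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi /= pi.sum()
assert np.allclose(pi @ T, pi)     # stationarity

p2 = pi[:, None] * T               # translationally invariant pair marginal

def extend(p, p2, p1):
    # merge p(x1..xk) with p2(xk, x_{k+1}) through the shared site:
    # p(x1..x_{k+1}) = p(x1..xk) * p2(xk, x_{k+1}) / p1(xk)
    return p[..., :, None] * (p2 / p1[:, None])[(None,) * (p.ndim - 1)]

p = p2
for _ in range(3):                 # grow a 2-site marginal into a 5-site state
    p = extend(p, p2, pi)

# every consecutive pair marginal reproduces p2 (local consistency)
for k in range(p.ndim - 1):
    axes = tuple(i for i in range(p.ndim) if i not in (k, k + 1))
    assert np.allclose(p.sum(axis=axes), p2)
```

The merged state here is the (maximum-entropy) Markov extension of the pair marginal; in the quantum setting the analogous step requires the Petz map and the conditional independence conditions discussed below.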
An important subtlety that we have not discussed up to this point is how to use the merging technique. To apply the technique, one must establish certain internal conditional independence relations. At first, one may think that such a relation follows straightforwardly from the repeated application of the merging technique. While this is, in some sense, true, establishing the requisite conditional independence relations is in fact not a straightforward task.
To get a hint of the underlying complexity of this problem, it is helpful to consider a simplified example. Suppose that, somehow, one managed to construct a density matrix over a certain set of clusters, and that the goal is to extend it to a density matrix supported on an enlarged set of clusters. At first, one may hope to do so by merging density matrices over two overlapping clusters. The key issue is that the condition I(A : C|B) = 0 involves subsystems that cannot be supported strictly inside a single 3 × 3 cluster. As such, one cannot directly use the conditions imposed on a single patch of 3 × 3 clusters to derive the requisite conditional independence condition. To do so, one must somehow combine the conditions on overlapping patches of 3 × 3 clusters.
Combining these local conditions into a nonlocal conditional independence condition is a difficult task, and the tools developed in Section 5 will prove indispensable for establishing such statements; without them, the proof would be significantly longer and much less modular. That said, a further simplification of the proof would undoubtedly be useful for better understanding our construction. This is left for future work.

Fundamental marginals
In this section, we introduce the key objects behind our work, the fundamental marginals. These are the basic building blocks of our theory, forming a two-element set M whose elements respectively represent density matrices over the 2 × 2 and 3 × 3 clusters.^2 As it stands, there is an ambiguity because we have not specified the location of these clusters. We will simplify this issue by imposing translational invariance on these marginals; these linear constraints are denoted C_L. In particular, the reduced density matrices of these marginals are locally consistent with each other. While C_L removes the need to specify the absolute location of these clusters in many cases, sometimes these locations will be crucial to our analysis. Two kinds of situations will occur. In one case, the location of the clusters within some bounded region will matter, while the exact location of the region itself is immaterial to our analysis. In that case, we shall specify this bounded region by dotted squares. One can view this as a "canvas" on which the clusters representing the support of the marginals are drawn; for instance, a 2 × 2 cluster may be drawn at the bottom-left corner of a 3 × 3 canvas.
In some cases, we will deal with unbounded regions, in which case we can no longer avoid specifying the absolute coordinates of the clusters. We will then label each cluster by a pair of integers (x, y), representing the marginal whose bottom-right corner of its support is located at (x, y); we will say these clusters are anchored at (x, y). The fundamental marginals obey extra constraints inherited from Eqs. (5) and (6). After using the translational invariance condition, we obtain the primary Markovian constraints, primaries for short.
The primaries are important because they imply a number of descendant constraints, descendants for short. The descendants are "emergent" constraints that arise from the primaries, following straightforwardly from the strong subadditivity of entropy (SSA) [25]. While the descendants are more intimately related to the core of our proof, the number of primaries is significantly smaller than that of the descendants. Therefore, the set of primaries is a more economical way of organizing the key assumptions behind our work.
The set of descendant constraints we use is summarized in Table 1; see Appendix A for the derivation.^3 In deriving our main results, we shall frequently consult this table and refer to these constraints by their category and lower-case Roman letters. For example, the Type I-a cluster constraint refers to the first equation in Eq. (18).
[Table 1: the snake constraints, the Type I cluster constraints, and the Type II cluster constraints.]

Interlude: Entropy
In this section, we provide a streamlined overview of the properties of the von Neumann entropy.
Of particular interest to us is the strong subadditivity (SSA) of entropy [25]. Consider a tripartite state ρ, say over H_A ⊗ H_B ⊗ H_C. SSA is the statement that S(AB) + S(BC) ≥ S(B) + S(ABC). We will frequently use an object called the conditional mutual information, defined as I(A : C|B) := S(AB) + S(BC) − S(B) − S(ABC). We will refer to B as the conditioning subsystem and to A and C as the conditioned subsystems. The conditional mutual information is useful because it obeys a number of nontrivial monotonicity relations. Specifically, it is non-increasing under the removal of a conditioned subsystem: I(AA′ : C|B) ≥ I(A : C|B).
Also, the conditional mutual information is non-increasing under moving (a part of) a conditioned subsystem into the conditioning subsystem: I(A : CC′|B) ≥ I(A : C|BC′).
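Both monotonicity relations can be checked numerically on random mixed states. The sketch below (an illustration, not part of the paper's formalism) assigns qubits 0-3 the roles of A, A′, B, and C, and computes I(A : C|B) from partial traces.

```python
import numpy as np

rng = np.random.default_rng(2)

def rand_rho(dim):
    # random full-rank density matrix
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def entropy(rho):
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def marginal(rho, keep, n=4):
    # reduced state on the qubits in `keep`, out of n qubits;
    # tracing highest-index qubits first keeps axis bookkeeping simple
    r = rho.reshape((2,) * (2 * n))
    for q in sorted((q for q in range(n) if q not in keep), reverse=True):
        r = np.trace(r, axis1=q, axis2=q + r.ndim // 2)
    d = 2 ** len(keep)
    return r.reshape(d, d)

def cmi(rho, A, B, C):
    # I(A:C|B) = S(AB) + S(BC) - S(B) - S(ABC)
    return (entropy(marginal(rho, sorted(A + B)))
            + entropy(marginal(rho, sorted(B + C)))
            - entropy(marginal(rho, sorted(B)))
            - entropy(marginal(rho, sorted(A + B + C))))

rho = rand_rho(16)
# SSA: conditional mutual information is nonnegative
assert cmi(rho, [0], [2], [3]) >= -1e-9
# removing a conditioned subsystem cannot increase it
assert cmi(rho, [0, 1], [2], [3]) >= cmi(rho, [0], [2], [3]) - 1e-9
# moving a conditioned subsystem into the condition cannot increase it
assert cmi(rho, [0], [2], [1, 3]) >= cmi(rho, [0], [1, 2], [3]) - 1e-9
```

Both inequalities reduce to SSA via the chain rule I(AA′ : C|B) = I(A′ : C|B) + I(A : C|BA′), each summand being nonnegative.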
Both of these inequalities follow straightforwardly from the SSA. Some states satisfy SSA with an equality. Such states are said to be conditionally independent and are important for us for two reasons. First, all the identities summarized in Table 1 are precisely statements about certain states being conditionally independent. Second, conditionally independent states have a very special structure. The following theorem was proved by Petz [17,18].
Theorem 1 (Petz). A tripartite state ρ_ABC satisfies I(A : C|B) = 0 if and only if ρ_ABC = Φ_{B→BC}(ρ_AB) = Φ_{B→AB}(ρ_BC), where Φ_{B→BC} and Φ_{B→AB} are Petz maps, defined as Φ_{B→BC}(X) = ρ_BC^{1/2} ρ_B^{−1/2} X ρ_B^{−1/2} ρ_BC^{1/2} (and similarly for Φ_{B→AB}), where we have suppressed the identity operators for convenience. For instance, ρ_B should really be viewed as I_A ⊗ ρ_B ⊗ I_C. This theorem will be useful because we can build up the "global" state ρ_ABC from its marginals ρ_AB and ρ_BC. The main difficulty in using Theorem 1 comes from the fact that reduced density matrices generally do not commute with each other. This motivates the development of the merging algebra, explained in the next section.
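To see Theorem 1 in action, the sketch below builds a classical (hence exactly Markov) three-party state, applies the standard Petz map ρ_BC^{1/2} ρ_B^{−1/2} (·) ρ_B^{−1/2} ρ_BC^{1/2} to ρ_AB, and recovers ρ_ABC exactly. The diagonal embedding and the matrix-function helper are illustrative choices, not the paper's construction.

```python
import numpy as np

def msqrt(M, inv=False):
    # matrix (pseudo-inverse) square root via eigendecomposition
    w, V = np.linalg.eigh(M)
    s = np.where(w > 1e-12, np.sqrt(np.clip(w, 0, None)), 0.0)
    if inv:
        s = np.where(s > 0, 1.0 / s, 0.0)
    return (V * s) @ V.conj().T

rng = np.random.default_rng(3)
d = 2
pa = rng.dirichlet(np.ones(d))
pb_a = rng.dirichlet(np.ones(d), d)
pc_b = rng.dirichlet(np.ones(d), d)
p = np.einsum('a,ab,bc->abc', pa, pb_a, pc_b)   # Markov chain A -> B -> C

I = np.eye(d)
rho_ABC = np.diag(p.ravel())                     # diagonal embedding in A⊗B⊗C
rho_AB = np.kron(np.diag(p.sum(2).ravel()), I)   # rho_AB ⊗ I_C
rho_BC = np.kron(I, np.diag(p.sum(0).ravel()))   # I_A ⊗ rho_BC
rho_B = np.kron(I, np.kron(np.diag(p.sum((0, 2))), I))  # I_A ⊗ rho_B ⊗ I_C

# Petz recovery: rho_BC^{1/2} rho_B^{-1/2} rho_AB rho_B^{-1/2} rho_BC^{1/2}
K = msqrt(rho_BC) @ msqrt(rho_B, inv=True)
recovered = K @ rho_AB @ K.conj().T
assert np.allclose(recovered, rho_ABC, atol=1e-10)
```

For diagonal states all the factors commute and the recovered entry is p(a,b) p(b,c) / p(b) = p(a,b,c); for genuinely quantum Markov states the same formula holds, but the operator ordering above becomes essential.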
^2 [...] the latter. For our purpose, this is not an issue because all the physically important quantities, such as the energy and the (maximum) global entropy consistent with these marginals, can be computed directly from the former.
^3 Let us make a side remark. Let C_{M,D} be the constraints summarized in Table 1. One can actually show that the set of density matrices obeying both C_{M,D} and C_L is equal to the set of density matrices obeying C_{M,P} and C_L. We can formally state this fact as (C_{M,P}, C_L) ≅ (C_{M,D}, C_L). Therefore, one can choose either of these constraints without losing any generality.

Merging algebra
The constraints in Table 1 imply that the reduced density matrices of the fundamental marginals can be merged in a particular way. In this section, we discuss a variety of facts that will be useful in proving such a statement.
The main content of this section is an algebra called the merging algebra. Most of the objects of this algebra belong to the following set.
This is the set S_Λ := ⋃_{A ∈ P(Λ)} S(H_A), where P(S) of a set S is the power set of S, Λ is the set of lattice sites, H_A is the Hilbert space associated with A ⊂ Λ, and S(H) of a Hilbert space H is the state space of H. From now on, unless specified otherwise, we will assume that density matrices belong to S_Λ. The merging algebra is equipped with two binary operations, referred to as the right-merge and max-merge operations, defined for σ, λ ∈ S_Λ as follows.
Here Φ_λ is the Petz map constructed from λ (see Theorem 1). Note that this is a slight abuse of notation because the map Φ_λ depends on its argument; arguments with different supports may give rise to different quantum channels. By definition, the max-merge operation is commutative, but the right-merge is not. Moreover, neither of them is generally associative. Nevertheless, the tools developed in this section, together with Table 1, will provide enough structure to let us derive the main results of this paper.
While the max-merge operation will always yield a valid density matrix in this paper, the same cannot be said in a more general context. For completeness, we introduce a special object called nil. Given two density matrices, there may not be any density matrix (on the union of their supports) that is consistent with both. In that case, we will simply say that the merge yields nil; similarly, any operation involving nil yields nil. Provided that we begin with density matrices, the set of objects generated by the max-merge and right-merge operations consists of density matrices and the nil object. The merging algebra is generated by these binary operations, applied to the reduced density matrices of the fundamental marginals; see Table 2.
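Since the displayed definitions of the two merges are not reproduced here, the following classical sketch should be read under the assumption, suggested by Theorem 1, that the right-merge extends one marginal by the Petz (conditional) channel built from the other. It illustrates the remark of the introduction that the classical setting is simpler: the two orders of merging coincide once the marginals agree on the shared subsystem B, whereas in the quantum case the right-merge is not commutative in general.

```python
import numpy as np

def merge_right(sigma_ab, lam_bc):
    # extend sigma(a,b) to (a,b,c) using lam's conditional c|b
    cond = lam_bc / lam_bc.sum(axis=1, keepdims=True)
    return np.einsum('ab,bc->abc', sigma_ab, cond)

def merge_left(sigma_ab, lam_bc):
    # extend lam(b,c) to (a,b,c) using sigma's conditional a|b
    cond = sigma_ab / sigma_ab.sum(axis=0, keepdims=True)
    return np.einsum('ab,bc->abc', cond, lam_bc)

rng = np.random.default_rng(4)
d = 2
sigma = rng.dirichlet(np.ones(d * d)).reshape(d, d)   # marginal on (A, B)
lam = rng.dirichlet(np.ones(d * d)).reshape(d, d)     # marginal on (B, C)

# with mismatched B-marginals the two merge orders disagree
assert not np.allclose(merge_right(sigma, lam), merge_left(sigma, lam))

# enforce a common B-marginal; the two merges then coincide
lam2 = lam / lam.sum(axis=1, keepdims=True) * sigma.sum(axis=0)[:, None]
assert np.allclose(merge_right(sigma, lam2), merge_left(sigma, lam2))
```

In the quantum setting this agreement fails because ρ_B^{±1/2} drawn from the two marginals need not commute with the state being extended, which is precisely the difficulty the merging algebra is designed to manage.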

[Table 2: the merging algebra. Generating set: the fundamental marginal and its reduced density matrices, over every 3 × 3 cluster; these must obey C_L and C_{M,P}. Binary operations: the max-merge and the right-merge.]

Basics of the merging algebra
In this section, we derive fundamental facts about the merging algebra. This section is a bit abstract, but the meaning behind each result should be well captured by its name.
We begin with the fundamental lemma, which establishes a web of equivalence relations between various statements about conditionally independent states. This is the lemma that we use the most in this paper. Lemma 1. (Fundamental lemma) The following statements are equivalent: where τ is the reduced density matrix of λ (and σ) on Supp(λ) ∩ Supp(σ).
Moreover, Φ(λ) and Φ(σ) appearing in condition 6 and 7 are both the maximum entropy state consistent with λ and σ.
Proof. First, 1 → 2 and 1 → 3 follow from Petz's theorem (Theorem 1). Moreover, 2 → 5 and 3 → 4 follow from the definition of the max-merge. Lastly, 4 → 6 and 5 → 7 are true by definition. Now we prove 6 → 1. Let B be the intersection of the supports of λ and σ. Also, let A = Supp(λ) \ B and C = Supp(σ) \ B. Because conditional entropy is concave [25], Note that λ = (Φ(λ)) AB by assumption. Moreover, where τ ABC = Φ(λ). By SSA, we have Therefore, it must be that Note that Φ(λ) is consistent with both λ and σ. Moreover, its entropy is maximal over all states that are consistent with both λ and σ. Therefore, a density matrix λ σ satisfying the first condition exists. Such a state is unique [26], so it must be Φ(λ). The proof of 7 → 1 is similar.
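Strong subadditivity, the workhorse of this proof, can be sanity-checked numerically. The sketch below (the helper functions are ours) draws a random full-rank three-qubit state and verifies I(A:C|B) = S(AB) + S(BC) − S(ABC) − S(B) ≥ 0:

```python
import numpy as np

def von_neumann(rho):
    """S(rho) = -Tr[rho log rho], natural log, ignoring numerical zeros."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-(w * np.log(w)).sum())

def partial_trace(rho, dims, keep):
    """Trace out every subsystem whose index is not in `keep`."""
    n = len(dims)
    rho = rho.reshape(dims + dims)
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        rho = np.trace(rho, axis1=i, axis2=i + rho.ndim // 2)
    d = int(np.prod([dims[i] for i in keep]))
    return rho.reshape(d, d)

# Random full-rank state on three qubits A, B, C.
rng = np.random.default_rng(1)
g = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
rho = g @ g.conj().T
rho /= np.trace(rho).real

dims = (2, 2, 2)
# SSA: I(A:C|B) = S(AB) + S(BC) - S(ABC) - S(B) >= 0.
ssa = (von_neumann(partial_trace(rho, dims, [0, 1]))
       + von_neumann(partial_trace(rho, dims, [1, 2]))
       - von_neumann(rho)
       - von_neumann(partial_trace(rho, dims, [1])))
assert ssa >= -1e-9
print(f"I(A:C|B) = {ssa:.6f}")
```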
Next, we have the commutation lemma. Colloquially speaking, the commutation lemma provides a sufficient condition under which two right-merges commute with each other. This is an intuitive result. If the two density matrices associated with the right-merges do not have overlapping supports, the Petz maps associated with them act on disjoint supports. As such, their actions should commute.
Proof. Specifically, Therefore, we conclude that It is straightforward to show that the right hand side of Eq. (38) is equal to (σ τ ) τ , establishing Eq. (41).
Next, we have the merging lemma. This lemma was originally proved by Kato et al. [24]. Given two conditionally independent states, it provides a sufficient condition under which one can merge those two states into a larger state. Importantly, the merged state possesses nontrivial conditional independence relations that are inherited from the states prior to the merging. and where the first line is our assumption and the second line follows from the commutation lemma (Lemma 2). The following identity is the key.
To see why, note that In the first line, we used the fact that the support of τ does not intersect with the support of λ. Therefore, the Petz map associated with λ "commutes" with the partial trace. Note that σ λ = λ σ follows from Lemma 1 and our assumption. Therefore, Moreover, λ σ c = τ . To prove this fact, it suffices to show that σ c = τ because the support of λ does not intersect with that of τ . Because σ τ = σ τ , and σ τ is a density matrix, σ τ cannot be nil. Therefore, σ c = τ , which implies λ σ c = τ . Thus, the fourth condition in Lemma 1 is satisfied, establishing the horizontal identity on the top of Eq. (40). Note that our conditions, by virtue of Lemma 1, can be rewritten as Thus, we can apply the same logic to conclude that By the commutativity of , establishing the horizontal identity on the bottom of Eq. (40). Now, it remains to establish the vertical identities. Note that establishing the vertical identity on the right side of Eq. (40). This completes the proof of Eq. (40).
Let us make an important remark on the merging lemma: the lemma can be bootstrapped. Specifically, the lemma begins with two conditionally independent states and ends up with another conditionally independent state over an enlarged system. The fact that this new state is conditionally independent (with an appropriate choice of subsystems) implies that it can be merged with yet another state, provided that the conditions stated in the merging lemma hold. Therefore, by chaining this argument, it is possible to start with a set of marginals on bounded regions and merge them to form a density matrix on an unbounded region. Importantly, the fact that this can be done can be verified locally, using only the originally given density matrices. We shall see a nontrivial example in Section 5.2.

Snake
There is a very important object that will repeatedly appear in the remainder of this paper. This is the snake. 4 In this section, we will study its properties. Let us begin with the definition. Then, we define a snake of (ρ_i)_{i=1}^N := (ρ_1, . . . , ρ_N) as Remark. A snake of a sequence (ρ_i)_{i=1}^N := (ρ_1, . . . , ρ_N), if it exists, is a quantum state [10]. Moreover, this state is consistent with every ρ_i from i = 1 to N.
Remark. If a sequence of density matrices (ρ_i)_{i=1}^N can form a snake, so can its subsequence (ρ_i)_{i=n}^m for any 1 ≤ n < m ≤ N.
A snake is a fluid object. It can take many different forms that are equivalent to each other. We will prove a number of results in this direction. First, let us introduce the mutation lemma. This lemma says two things. First, a snake forms a Markov chain; see Eq. (50). Second, it can be written as a sequence of right-merges, either from the left to the right or from the right to the left; see Eq. (51). The main point of this lemma is that a snake is a mutable object that can take many different forms. Depending on the context, we can "mutate" a snake into a form that is more amenable to our analysis.
for all 1 ≤ n < N . Moreover, Proof. The proof is based on induction. For n = 1, the statement is true by our assumptions. Suppose the claim is true for some n ≥ 1.
Let us first prove By our assumption, In the second line, we used the commutation lemma (Lemma 2). In the third line, we used our assumption.
By tracing out Supp(ρ_{n+1}) \ Supp(S((ρ_i)_{i=1}^n)), we obtain Furthermore, Therefore, by the equivalence of the fourth condition and the third condition in Lemma 1, we conclude A snake has a local entropy decomposition. The proof follows straightforwardly from Lemma 1 and the mutation lemma (Lemma 4).
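Both the chain of right-merges and the local entropy decomposition are easy to make concrete for classical states. The sketch below (our notation, not the paper's) chains the pair marginals of a classical Markov chain into a snake, checks consistency with every marginal, and verifies S(snake) = Σ_i S(ρ_i) − Σ_i S(τ_i), where the τ_i are the overlap marginals:

```python
import numpy as np

def shannon(p):
    """Shannon entropy of a probability vector, natural log."""
    p = p[p > 1e-15]
    return float(-(p * np.log(p)).sum())

# Pair marginals p_i(x_i, x_{i+1}) of a classical Markov chain on N sites.
rng = np.random.default_rng(2)
d, N = 3, 5
T = rng.random((d, d)); T /= T.sum(axis=1, keepdims=True)
singles = [rng.random(d)]
singles[0] /= singles[0].sum()
for _ in range(N - 1):
    singles.append(singles[-1] @ T)
pairs = [np.einsum('a,ab->ab', singles[i], T) for i in range(N - 1)]

# Build the snake by right-merges, left to right (the classical Petz map is Bayes):
# q(x_1..x_{i+1}) = q(x_1..x_i) * p_i(x_i, x_{i+1}) / p(x_i).
snake = pairs[0]
for i in range(1, N - 1):
    cond = pairs[i] / singles[i][:, None]  # p(x_{i+1} | x_i)
    snake = snake[..., None] * cond

# The snake is consistent with every pair marginal...
for i in range(N - 1):
    axes = tuple(k for k in range(N) if k not in (i, i + 1))
    assert np.allclose(snake.sum(axis=axes), pairs[i])

# ...and obeys the local entropy decomposition
# S(snake) = sum of pair entropies minus sum of overlap entropies.
lhs = shannon(snake.reshape(-1))
rhs = (sum(shannon(p.reshape(-1)) for p in pairs)
       - sum(shannon(s) for s in singles[1:-1]))
assert abs(lhs - rhs) < 1e-10
```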
Another very useful property is that snakes can be "split" into two snakes. This lemma says that a snake can be viewed as a merged state of two "shorter" snakes.
for all n such that 1 ≤ n < N .
Proof. By the mutation lemma (Lemma 4), , From Corollary 1, where τ n is the reduced density matrix of S( . Thus, the first condition of Lemma 1 is satisfied by S((ρ_i)_{i=1}^N). By the uniqueness of such a state [26], we establish the horizontal identity on the top. The remaining identities follow from the second and the third condition of Lemma 1.

Snakes in action
Up to this point, we have focused on the properties of the fundamental marginals. From here on, we will shift our focus to the composite marginals that are made out of the fundamental marginals by a sequence of merging operations.
Specifically, we will be studying the properties of the density matrices that are created by merging the following density matrices: which represent the marginals inherited from over the 2 × 1 and 2 × 2 clusters, anchored at an arbitrary point (x, y).
From these objects, we can define the level-1 and level-2 snakes, introduced below.
Note that both the level-1 and the level-2 snakes are indeed snakes in the sense of Definition 1, by the Snake-b) constraint and the Type I-a) cluster constraint respectively, by applying these constraints to the first condition of Lemma 1. These snakes have properties that will play a vital role in Sections 7.1 and 7.2. To wit, it will be useful to introduce a notion of extension maps. Let ρ ≤y be a density matrix supported on a set of sites with y-coordinate less than or equal to y, and let ρ ≥y be a density matrix supported on a set of sites {(x , y ) : y ≥ y}. The upward extension acting on the y-th row, denoted as E y,↑ , acts as follows: The downward extension acting on the y-th row, denoted as E y,↓ , acts as follows: Note that the extension maps "extend" the density matrix, either in an upward or a downward direction. Roughly speaking, these extension maps can be thought of as "inverses" of the partial trace of a single row. That is, for the class of quantum states we consider, we will see that Tr y+1 (E y,↑ (ρ ≤y )) = ρ ≤y , where Tr ··· means taking a partial trace over the clusters with y-coordinate specified in the subscript. Note that we only said that Eq. (64) holds on a class of states we consider. This is because the composition of the partial trace and the extensions is generally not equal to the identity map. 5 Let us summarize the important identities. The first equation is Eq. (65), the solid lines in particular, relating the level-2 snake at y to the level-1 snakes at y and y+1: where t y ∈ T is a translation in the y-direction by 1; see Section 2 for the setup.
We will also prove the following twist identity: More formally, the map from the level-2 snakes to the level-1 snakes is encapsulated by the following proposition.
The map from the level-1 snakes to the level-2 snakes can be summarized as follows.
The proofs of these statements are quite technical. We will explain them in great detail in Appendices B, C, and D. These propositions are important because, as a whole, they imply that one can form a snake from a sequence of level-2 snakes, as we explain in Section 7.1.

Main results
From Propositions 1, 2, and 3, our main claims follow immediately, as we explain below.

Consistency
Suppose and obey C L and C M,P . Then, we can show that is consistent with a translationally invariant state on an infinite lattice. 6 Our approach is to build up a global state gradually, from the 2 × 2 clusters to a level-2 snake, and then a snake made out of the level-2 snakes. Schematically, we have An important observation is that two level-2 snakes that overlap with each other can be merged together.
By the first identity in Proposition 1, Thus, the sixth condition of Lemma 1 is satisfied, with the following choice of λ, σ, and Φ: The third condition of Lemma 1 implies the main claim.
It follows that we can define the following snake: Importantly, this object is not nil. (See Definition 1.) Therefore, this snake must be consistent with for all 0 < y ≤ N − 1. It then follows for all x, y ∈ Z + ; the consistencies of the 2 × 2 clusters on the boundary invoke C L .
Theorem 3. For every and that satisfy C L and C M,P , for every integer N, M ≥ 2, there is a density matrix over an N × M cluster (see Fig. 4) that is consistent with over every 2 × 2 cluster.

Entropy
In this section, we derive an exact formula for the maximum entropy consistent with , subject to the constraints C L and C M,P . We will proceed in two steps. First, we will compute the entropy of the snakes made out of the level-2 snakes. Using Corollary 1, this calculation becomes straightforward. Second, we will prove that this entropy is in fact the maximum entropy consistent with . This way, we derive an expression for the maximum entropy.
Proof. From Corollary 1, Therefore, again from Corollary 1, Now, let us derive an upper bound on the entropy that matches Theorem 4. First, consider a density matrix on the first two rows, denoted as ρ[2]. By repeatedly using SSA, we obtain More generally, let ρ[k] be a density matrix over the first k rows. We can derive the following recursive inequality: Therefore, we obtain the following bound. Using the Snake-a) constraint, for any density matrix over the first M rows, Thus, we have proved Theorem 5, stated below.
Theorem 5. Consider a family of density matrices acting on an N × M cluster (see Fig. 4) which are consistent with obeying C L and C M,P . The maximum entropy within this family is We also obtain the following expression for the maximum entropy density in the thermodynamic limit.
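The recursive inequality used in the proof above is an instance of SSA. In our own shorthand (not the paper's notation), with ρ[k] a marginal on the first k rows, R_j the j-th row, ρ_{k−1} the single-row marginal on R_{k−1}, and ρ_{k−1,k} the two-row marginal on R_{k−1} ∪ R_k:

```latex
% SSA with A = R_1 \cdots R_{k-2}, B = R_{k-1}, C = R_k:
\begin{align}
  S(\rho[k]) + S(\rho_{k-1}) &\le S(\rho[k-1]) + S(\rho_{k-1,k}),\\
  \text{so, telescoping,}\quad
  S(\rho[M]) &\le S(\rho[2]) + \sum_{k=3}^{M}\bigl[S(\rho_{k-1,k}) - S(\rho_{k-1})\bigr].
\end{align}
```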

Beyond mean-field
From Theorems 3 and 5, so long as C L and C M,P are satisfied exactly, one can compute an upper bound on the global free energy. By minimizing this upper bound over a family of marginals that satisfy these constraints, we obtain a variational upper bound on the global free energy. In this section, we emphasize that this upper bound must be better than mean-field theory in some sense. At least for translationally invariant systems, our family of marginals includes the mean-field solutions. This is because mean-field states have a tensor product structure, and any state that has a tensor product structure automatically satisfies C M,P .
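To see why any tensor product state satisfies C M,P, note that its entropy is additive over sites, so every conditional mutual information vanishes. In our shorthand:

```latex
% For a mean-field state \sigma = \bigotimes_v \sigma_v, entropy is additive:
% S(X) = \sum_{v \in X} S(\sigma_v) for every region X.
% Hence, for any disjoint regions A, B, C,
\begin{equation}
  I(A:C|B) = S(AB) + S(BC) - S(ABC) - S(B) = 0,
\end{equation}
% so the conditional-independence conditions entering C_{M,P} hold trivially.
```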
However, there are states that satisfy C M,P that cannot be written as a product state. A notable example is the ground state of the Levin-Wen model [27], for which Eq. (1) holds exactly [13,14] locally everywhere. Therefore, the reduced density matrices of the ground state of the Levin-Wen model must obey C M,P exactly. Our class of density matrices thus includes local reduced density matrices of topologically ordered systems, which cannot be mapped from a product state by a finite-depth quantum circuit [28]. For concreteness, we have added an explicit calculation for Kitaev's toric code [3] in Appendix E.
Let us end with a side remark. While Theorem 3 does prove that the reduced density matrices of Levin-Wen models are consistent with some global state on an infinite lattice, it does not prove that they are consistent with some global state on a torus. That would require a periodic boundary condition, which is incompatible with the definition of the snake; see Definition 1. However, the state itself is still long-range entangled; see Ref. [21].

Discussion
In this paper, we have put forward a conjectured duality between many-body quantum states obeying Eq. (1) globally and the set of density matrices obeying the same equation locally. If true, in numerical studies of systems that are expected to obey Eq. (1), one can take a shortcut and just minimize the energy and the free energy of the ansatz over a set of density matrices on bounded regions that obey Eq. (1). The number of parameters in the latter case is polynomial in the system size, an exponential improvement over a naive approach in which the minimization is taken over exponentially many parameters.
We provided nontrivial evidence for this conjecture by proving it in the context of translationally invariant systems. This has led to a nontrivial upper bound on the ground state energy and the thermodynamic free energy of generic interacting quantum many-body systems, which subsumes the mean-field bound.
While there are systems in two spatial dimensions that break Eq. (1) [29,30], the requisite entropy scaling law can be restored by applying a finite-depth quantum circuit. Viewed this way, one can (in principle) simply redefine the Hamiltonian so that Eq. (1) holds, at which point the machinery of this paper again applies. We are unaware of any solvable models of gapped two-dimensional quantum many-body systems which cannot be accommodated using this approach. This is not to say that two-dimensional quantum many-body systems are solved, of course. In order to utilize our result, one must actually find a practical numerical method that can minimize the energy and the free energy over the family of states that satisfy C L and C M,P . The former set of constraints is linear, and as such, more benign. The latter set of constraints, on the other hand, is non-linear, which makes it more difficult to handle.
Moreover, realistic systems will not satisfy Eq. (1) exactly. The effect of this approximation error can be studied using the approach of Ref. [31]. However, for practical purposes, it will be desirable to tighten this bound.
Overcoming these challenges will be an important step we need to take to make our approach practical. We end with some thoughts on how to extend the current result.
1. While we have focused on quantum spins, an analogous statement for fermions is expected to hold. The main question is whether the statements in Section 5 can be reproduced. A fermionic version of SSA would follow from the monotonicity of quantum relative entropy [32], and its equality condition was investigated by Petz [17]. In this setup, the partial trace operation needs to be replaced by a conditional expectation.
2. The maximum-entropy state obtained in this paper, restricted to a quasi-one-dimensional cluster extending in the x̂-direction, becomes a snake constructed from a set of marginals that form a "chain." This is an instance of the Markovian matrix product density operator [10], which admits an exact contraction for any 2-point correlation function along the chain. One may be able to prove a similar statement for quasi-one-dimensional clusters in any direction. The readers are encouraged to verify that this is true in the ŷ-direction.
3. One may be able to simplify C M,P further. For classical states, the last condition of C M,P is implied by the first two conditions, reducing the number of nonlinear constraints to two and also reducing the size of the maximal cluster one needs to consider. To what extent we can reduce these constraints is an important open problem.
4. If we have a set of translationally invariant marginals that satisfy C M,P individually, their convex combinations are also consistent with a convex combination of translationally invariant states on an infinite lattice, each of which is the maximum-entropy extension of the corresponding marginal. This is true even if the convex combination of the marginals does not satisfy C M,P as a whole. Moreover, we can compute the convex combination of the extensions if the set is finite and if the maximum-entropy extensions of the marginals on an infinite lattice are mutually orthogonal to each other. 7 If these conditions are met, the entropy density converges to a convex combination of the individual maximum entropy densities in the thermodynamic limit. 8 This way, one can relax the condition C M,P .
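The exact contraction of 2-point functions claimed in item 2 can be illustrated classically (all names below are ours): for a Markovian chain, ⟨O_1 O_k⟩ reduces to a product of transfer matrices built from the pair marginals.

```python
import numpy as np

# A classical Markovian chain: p(x_1..x_N) = p(x_1) prod_i p(x_{i+1} | x_i).
rng = np.random.default_rng(3)
d, N = 3, 6
T = rng.random((d, d)); T /= T.sum(axis=1, keepdims=True)  # transfer matrix p(x_{i+1}|x_i)
p1 = rng.random(d); p1 /= p1.sum()                         # p(x_1)
O = rng.normal(size=d)                                     # a diagonal observable

def two_point(k):
    """<O_1 O_k> via exact transfer-matrix contraction, no full joint needed."""
    Tk = np.linalg.matrix_power(T, k - 1)
    return float((p1 * O) @ Tk @ O)

# Brute-force check against the full joint distribution.
joint = p1.copy()
for _ in range(N - 1):
    joint = joint[..., None] * T  # attach one more conditional
for k in range(2, N + 1):
    axes = tuple(i for i in range(N) if i not in (0, k - 1))
    p_1k = joint.sum(axis=axes)   # marginal p(x_1, x_k)
    assert abs(two_point(k) - float(O @ p_1k @ O)) < 1e-10
```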

Acknowledgement
This work was supported by the Australian Research Council via the Centre of Excellence in Engineered Quantum Systems (EQUS) project number CE170100009.

A Descendants
In this Appendix, we derive the identities in Table 1 from C M,P . To start with, the following identity will be useful.
Note that the last term vanishes because of Eq. (13). Moreover, the terms in the parentheses on lines 2, 3, and 4 must be nonnegative because of SSA. Because the overall sum must be 0 by Eq. (15), each of these terms must vanish. Moreover, we can alternatively consider the following decomposition: Again, the last term vanishes and the remaining terms must vanish. We can summarize these identities as follows.
Lemma 6. If satisfies C L and C M,P , Similarly, we can consider the following identity. and Like before, the last term vanishes because of Eq. (14). Moreover, the terms in the parentheses on lines 2, 3, and 4 are all nonnegative because of SSA. Because the overall sum is 0 by Eq. (15), each of these terms must vanish. Neglecting the ones that have already appeared in Lemma 6, we arrive at the following lemma.
We will also need the following identities: Proof. The first equation can be derived as follows. Let us rearrange the first result within Lemma 6 as follows: Interpreting this as a conditional mutual information between and conditioned on being 0, we can apply Eq. (22) to obtain the following result: By SSA, the left hand side should be nonnegative. Therefore, it must be equal to 0. Now, note that because of Eq. (14). Therefore, we can conclude that Viewing the left hand side of this equation as a conditional mutual information between and conditioned on , we can use Eq. (22) to derive the following result: Using translational invariance, we conclude The other equation can be derived in a similar way, by making the following changes to the argument above:

B Level-2 → Level-1

We prove that level-2 snakes become a level-1 snake upon taking a partial trace over either of the rows. Specifically, Proposition 1.
Below, we derive Proposition 1 in three steps. First, we derive a set of entropy identities in Appendix B.1. From these identities, we derive a set of merging algebra identities in Appendix B.2. With these identities, we complete the proof in Appendix B.3.

B.1 Entropy identities
We will need to prove the following identities: as well as their rotated versions (by π): Proof. By the Type I-a) cluster constraint, Interpreting this as a conditional mutual information between and conditioned on being 0, we can apply Eq. (22) to obtain the following result: By SSA, the first line of Eq. (106) must be nonnegative. By Eq. (105), the second line must be 0. Therefore, the first line must be 0. This proves our claim.
Proof. Using Lemma 9 and the Snake-c) constraint, one can derive which can be rewritten as Using Eq. (23), By SSA, the first line of Eq. (110) must be nonnegative. By Eq. (109), the second line must be 0. Therefore, the first line must be 0. The main claim follows from this fact, by using the Snake-c) constraint and the identity in Lemma 9.
The proof of Eqs. (102) and (103) follows the same strategy up to a π rotation in the chosen constraints and subsystems. As such, we state these results without proof.
Moreover, using Lemma 9 and the Snake-c) constraint, one can conclude: Similarly, by a π-rotation,

B.2 Merging algebra identities
Let us use the key identities that follow from Appendix B.1.
Proof. This is a straightforward application of Lemma 1 to Lemma 10, by using the equivalence of the first and the third condition of Lemma 1.
As in Appendix B.1, π-rotated versions of these statements can be proven in a similar way. We state them without proof.

B.3 Completing the proof
Now we are in a position to complete the proof of Proposition 1. We restate the proposition for the reader's convenience.

Proposition 1.
Tr y Proof. By the mutation lemma (Lemma 4), Consider an object S i , which is defined below: for i such that 1 < i < N , where We will show that The i = 3 case follows from Lemma 14. For i > 3, note that the right-merge in Eq. (122) "commutes" with the right-merge of the 2 × 2 cluster in Eq. (120), by the commutation lemma (Lemma 2). After exchanging these right-merges, we have using Lemma 13 in the second line. Moreover, using Lemma 14 in the first line and using the commutation lemma (Lemma 2) in the second line. After applying the splitting lemma (Lemma 5), Moreover, we can use Lemma 14 and the merging lemma (Lemma 3) in the following way. Let λ be the reduced density matrix of the level-1 snake over the clusters ranging from column 1 to i − 1, σ be the reduced density matrix of the level-1 snake over the clusters on columns i − 1 and i, and τ be the -shaped cluster anchored at (i + 1, y). We can apply Lemma 3 to precisely these choices of λ, σ, and τ . Thus, we conclude that By using the second condition of Lemma 1, we conclude that proving Eq. (122). From Eq. (122), the main claim readily follows. The remainder of our main claim can be shown in an exactly analogous way, by "rotating" every subsystem and constraint by π. Specifically, the order of the right-merges in Eq. (119) is reversed. Moreover, Lemma 13 and Lemma 14 are changed to Lemma 15 and Lemma 16, respectively.

C Level-1 → Level-2
Now, we prove that the upward and the downward extensions map level-1 snakes to level-2 snakes. The main result of this appendix is Proposition 2.

Proposition 2.
To prove Proposition 2, it will be convenient to introduce new composite marginals, defined below.
Composite 4: := . (132) We will study the properties of these objects in the remainder of Appendix C. As a side comment, the readers may soon notice that the analysis of these objects is significantly more involved than that of the fundamental marginals. This is to be expected. Unlike the fundamental marginals, it is a priori not even clear whether the composite marginals are consistent with the fundamental marginals. For this reason, the structure of Appendix C will be quite different from that of Appendix B. We will not be able to completely decouple the analysis of the entropy from the analysis of the merging algebra.
Our analysis is organized as follows. In Appendix C.1, we derive the basic identities that do not involve the composite objects. In Appendix C.2, we derive the identities involving the composite objects. By using these identities, we prove Proposition 2 in Appendix C.3.

C.1 Basic identities
In this appendix, we will prove simple identities over the marginals of .
Proof. From the Type I-b) cluster constraint, Using the monotonicity of conditional mutual information (Eq. (22)), The first line of Eq. (135) must be nonnegative because of SSA. The second line is 0. Therefore, the first line must be 0. Plugging this result into the first condition of Lemma 1, we arrive at our claim.
Of course, the π-rotated version can be proved in a similar way, by replacing the Type I-b) cluster constraint with the Type II-b) cluster constraint. This leads to the following lemma.
We can also derive the following identity: Proof. From the Type I-a) cluster constraint, Using Eq. (23), The first line of Eq. (139) must be nonnegative because of SSA. The second line is 0. Therefore, the first line must be 0. Plugging this result into the first condition of Lemma 1, we arrive at our claim.
The π-rotated version is proved in a similar way. We state the result without proof. (140)

C.2 Composite identities
Now, we will derive a number of basic identities involving the objects defined in Definition 4. The key findings are summarized below.
We will also state the π-rotated versions of these statements at the end. Since the underlying analysis is essentially the same, we will focus on the proof of Eqs. (141) and (142).
Lemma 21. Moreover, Proof. By Corollary 4, By Lemma 1 (specifically, using the equivalence of the first and the third condition), Moreover, applying the Snake-b) constraint to Lemma 1 (again, using the equivalence of the first and the third condition), We can plug in Eqs. (161) The first consistency condition can be readily verified from the bottom of Eq. (141) and the top-right corner of Eq. (142). By using this relation, after taking a partial trace on Supp(Composite 2) \ Supp(Composite 1) on Composite 2, we obtain Composite 1. The second consistency condition also follows easily. Take a partial trace on the bottom-right corner of Composite 2, using the top-right corner of Eq. (142). By the Type I-a) cluster constraint, the resulting density matrix must be consistent with , which proves the claim. Therefore, we have proved the consistency conditions in Eq. (161).
Second, we prove the entropy condition.
Proof. From Lemma 21, From Corollary 4, Combining these two, From Lemma 12, where in the second line we used the Snake-a) constraint. From Eqs. (165) completing the proof of Eq. (142). As before, we can apply the analogous proofs to Composite 3 and Composite 4 by rotating all the involved subsystems and constraints by π. We will state them below without proofs.

C.3 Completing the proof
With the identities derived so far, we are in a position to prove the main result of this appendix, Proposition 2. We restate this below for the reader's convenience.
S i is related to the main claim of this proposition in the following way. Using the commutation lemma (Lemma 2), one can show that We will show that The i = 3 case follows directly from Lemma 15. For i > 3, note that by the splitting lemma (Lemma 5). Therefore, which follows from the commutation lemma (Lemma 2). Note that Therefore, using the commutation lemma (Lemma 2). Using the Type I-a) cluster constraint and the splitting lemma (Lemma 5), we conclude proving the recursion relation. Using Eq. (172), we have using the top of Lemma 15 in the first line and Lemma 19 in the second line. Applying the splitting lemma (Lemma 5), the first identity of our main claim is proved. The second identity follows from an analogous analysis, by rotating every constraint and subsystem by π.

D Twist
In this appendix, we prove the following twist identity.
The proof of the twist identity involves the following new composite object: The key identity is the following: = . (182) Roughly speaking, we can use Eq. (182) in the following way. First, expand the left hand side of Eq. (69) by writing down the action of E y,↑ explicitly and also expressing the snake as a sequence of right-merges, involving marginals from the left end to the right end sequentially. One can apply Eq. (182) from the left end to the right end on this object. After applying some extra identities at the end, one can show that the end result is equal to the right hand side of Eq. (69). This fact can be verified by expanding the right hand side in a way similar to the left hand side, by explicitly writing down the action of E y,↓ and expressing the snake as a sequence of right-merges, involving marginals from the right to the left.
As we shall see later, it is not too difficult to show that Eq. (182) would follow if these "commutation relations" hold: While these identities are indeed correct, they do not immediately follow from the commutation lemma (Lemma 2). This is because the commutation lemma requires two right-merged marginals to have disjoint supports.
In the rest of this appendix, we will prove Eqs. (183) in Appendix D.1. We will then complete the proof of Proposition 3 in Appendix D.2.

D.1 Commutation relations
It will be convenient to work with the following objects.
To see why, note that which follows from the Type I-b) cluster constraint and the monotonicity of the conditional mutual information (Eq. (22)). Then by Petz's theorem (Theorem 1), Moreover, note that which follows from the Type I-b) cluster constraint and Eq. (23). By Petz's theorem (Theorem 1), where the first line follows from the above-mentioned merging process 9 and the second line follows from the Type I-d) cluster constraint and Lemma 1.
Proof. Let us recall the definition.
Viewing the two right-merges in Eq. (197) together as a quantum channel, Lemma 24 and Lemma 25 can be applied to the seventh condition of Lemma 1. To be more concrete, these two lemmas imply that 10 Using the first condition of Lemma 1, The last two terms in the second line of Eq. (199) can be broken down further. First, applying the monotonicity of conditional mutual information (Eq. (22)) to the Type II-d) cluster constraint, 11 Second, applying the monotonicity of conditional mutual information (Eq. (22)) to the Type II-a) cluster constraint, 12 Plugging in these results, we get Proposition 4.
= . (203) Proof. From Lemma 24, we see that Moreover, in Eq. (195), we have proved that = . (205) These two identities, together with Lemma 26, can be applied to the first condition of Lemma 1.
The main claim follows from the equivalence of the first and the third condition of Lemma 1.
After applying the Type I-d) cluster constraint, we obtain the following "commutation relation." Proposition 5.
= . (206) As before, a π-rotated version of the entire proof can be reproduced in an analogous way. = . (209) Proof. Note that = = = = . (210) The first line is the definition. In the second line, we applied the Type II-c) cluster constraint to Lemma 1. In the third line, we used the commutation lemma (Lemma 2). In the last line, we applied the Type I-c) cluster constraint to Lemma 1. Therefore, = = = . (211) In the second line, we have plugged in the result of Eq. (210). In the third line, we used the commutation lemma (Lemma 2). In the last line, we have used Proposition 5. One can apply a similar logic to the right hand side of Eq. (209) by rotating all the subsystems/identities by π. (For instance, Proposition 5 would change to Proposition 7.) The end result is the same as the last line of Eq. (211), completing the proof. Now, we are in a position to prove Proposition 3. We restate the result for the reader's convenience.
Using the commutation lemma (Lemma 2), Moreover, again using the commutation lemma, using the commutation lemma (Lemma 2) in the first line, applying the Type I-d) cluster constraint to the first condition in Lemma 1 in the second line, and using Proposition 6 in the last line. Moreover, using the definition of and then applying the Type II-b) cluster constraint and the Type I-a) cluster constraint to the first condition of Lemma 1, we obtain = . (224) Plugging in Eq. (224) to Eq. (221), we obtain a level-2 snake 13 which is right-merged by two 2 × 2 clusters anchored at a y-coordinate of y − 1. We can plug in this expression into Eq. (220), which is equivalent to the left hand side of Eq. (212). The obtained expression, by the mutation lemma (Lemma 4), establishes our main claim.

E Ground state example
In this section, we discuss a physical ground state that satisfies Eqs. (5) and (6). Note that these equations are satisfied trivially by product states, e.g., states of the tensor product form ρ = ⊗_i ρ_i, where the tensor product is taken over a set of sites in a lattice. Therefore, we will focus on states which are, in some sense, "far" from states of this form. The states we describe below are topologically ordered, in the sense that there is an obstruction to preparing the ground state from a product state by applying a "short" quantum circuit. To prepare a topologically ordered ground state from a product state, one would need to apply a quantum circuit whose depth grows extensively with the system size. Assuming that the allowed gates are geometrically local, i.e., supported on a ball of bounded radius, there is an Ω(L) lower bound for a system of size L × L [28]. This lower bound suggests that a topologically ordered ground state cannot be well-approximated by a product state. In contrast, the ansatz proposed in this paper provides an exact description of such states in the thermodynamic limit using a density matrix on a bounded region.

E.1 Toric Code
The toric code is a paradigmatic model of topological order [3]. It is defined on a square lattice, with the physical degrees of freedom living on the edges of the lattice. The Hamiltonian is H = −∑_s A_s − ∑_p B_p, where s is a site and p is a plaquette. Here A_s is a tensor product of Pauli-X operators acting on the set of edges incident to s, and B_p is a tensor product of Pauli-Z operators acting on the set of edges that surround the plaquette p. All terms mutually commute, and as such, any ground state |ψ⟩ must satisfy A_s|ψ⟩ = B_p|ψ⟩ = |ψ⟩ for all s and p. For our discussion, it will be convenient to consider the "rotated" version of the toric code, in which the qubits live on the sites of the square lattice and the stabilizers are alternating tensor products of Pauli-X and Pauli-Z operators around each plaquette; see Fig. 5. Embedding this lattice into a triangular lattice is straightforward: one can simply "shear" the lattice (see Fig. 6), after which the sites of the sheared lattice embed directly into a triangular lattice. The toric code is an example of a stabilizer code, for which simple machinery has already been developed to study ground state properties [33,34]. Given a subsystem, say A, we can compute the von Neumann entropy of the ground state reduced density matrix on A as follows. First, note that the operators A_s and B_p generate an abelian group, which we denote by S_TC. Consider the subgroup S_TC,A ⊆ S_TC generated by the elements of S_TC that are supported exclusively on A. The ground state entanglement entropy is [33] S(A) = |A| log 2 − log |S_TC,A|. (227)
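To make Eq. (227) concrete, the following sketch computes the entanglement entropy of a small stabilizer state by brute-force enumeration of the subgroup supported on A. The function name and the four-qubit GHZ example are ours, serving as a toy stand-in for the toric code subsystems (whose diagrammatic definitions are not reproduced here); a serious implementation would use GF(2) linear algebra as in Refs. [33,34].

```python
import itertools
import numpy as np

def stabilizer_entropy_bits(generators, region, n):
    """Entanglement entropy, in bits, of a stabilizer state's marginal on
    `region`, via Eq. (227) in base 2: S(A) = |A| - log2 |S_A|.

    `generators` are Pauli strings over {I, X, Y, Z}, assumed independent;
    `region` is the set of qubit indices constituting the subsystem A.
    We brute-force the stabilizer group and count the elements supported
    exclusively on A (fine for toy codes only).
    """
    supported = 0
    for bits in itertools.product([0, 1], repeat=len(generators)):
        # Multiply the chosen generators, tracking X- and Z-type support.
        x = np.zeros(n, dtype=int)
        z = np.zeros(n, dtype=int)
        for b, g in zip(bits, generators):
            if b:
                x ^= np.array([c in "XY" for c in g], dtype=int)
                z ^= np.array([c in "ZY" for c in g], dtype=int)
        support = {i for i in range(n) if x[i] or z[i]}
        if support <= set(region):
            supported += 1  # this product is an element of S_A
    return len(region) - np.log2(supported)

# Toy check on a 4-qubit GHZ state (stabilizers XXXX, ZZII, IZZI, IIZZ):
# the subgroup supported on A = {0, 1} is {I, ZZII}, so S(A) = 2 - 1 = 1 bit.
print(stabilizer_entropy_bits(["XXXX", "ZZII", "IZZI", "IIZZ"], {0, 1}, 4))
```

For the toric code itself, the same counting applies once the plaquette stabilizers of the sheared lattice are written out as Pauli strings.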

E.2 Verifying local constraints
We can use Eq. (227) to verify Eqs. (5) and (6). We shall use the embedding described in Fig. 6. Within this convention, we can first verify Eqs. (5) as follows; we restate these equations below for the reader's convenience. In this case, no subsystem appearing in the entropy calculation has a nontrivial stabilizer subgroup. This can be seen easily from the fact that S_TC contains no element of weight less than 4. Therefore, every relevant entropy follows from Eq. (227) with |S_TC,A| = 1, and Eqs. (5) are satisfied.
Verifying Eq. (6) is also straightforward. Let us restate the equation below.
Here, the subsystems involved in the entropy calculation can have nontrivial stabilizer subgroups. For instance, the stabilizer group of the subsystem appearing on the left-hand side of Eq. (6) is generated by four "plaquette operators" in our new convention, two of which are tensor products of Pauli-X operators and the other two tensor products of Pauli-Z operators. Similarly, the first four terms appearing on the right-hand side of Eq. (6) each have a correction of −1, owing to the fact that each such subsystem can support exactly one plaquette operator. Therefore, Eq. (6) is satisfied.

E.3 Finite temperature
The entropy of a subsystem at finite temperature was calculated by Castelnovo and Chamon [15]. (For interested readers, the relevant equation is Eq. (39) in Ref. [15].) Their formula, in the limit in which the size of the subsystem goes to infinity, is reproduced in Eq. (E.3), where α is a constant that depends on the temperature, m_A is the number of connected components, |A| is the number of X-type stabilizers strictly supported in region A, and |∂A| is the number of X-type stabilizers supported nontrivially on both A and its complement. Since Eq. (E.3) holds only in the |A| → ∞ limit, the subsystems chosen in Eqs. (5) and (6) satisfy the constraints only approximately. Similarly, one can replace the clusters in Eqs. (5) and (6) by larger square clusters; the conditions can then be satisfied up to an arbitrarily small error by increasing the cluster size.

F Efficient tensor network
In this Appendix, we show that the maximum-entropy state in Eq. (75) admits an efficient two-dimensional tensor network representation. The rest of this Appendix is structured as follows. First, we express the maximum-entropy state as a quantum state created by a sequence of linear maps acting on bounded regions. Second, we explain why the resulting state can be viewed as a tensor network. Lastly, we briefly explain in what sense our construction is efficient.

F.1 Construction
Let us begin by explaining how the maximum-entropy global state consistent with the marginals can be constructed. Using the convention in Fig. 4, we generate a sequence of reduced density matrices that add one coarse-grained site at a time, from bottom to top and from left to right. Specifically, we shall argue that the maximum-entropy global state in Theorem 4 is equal to the expression in Eq. (232), where S_1 is the level-1 snake at y = 1; see Definition 3. By the mutation lemma (Lemma 4), the level-1 snake can be created by a sequence of quantum channels acting from left to right. Also, E_{i,↑} is a sequence of quantum channels acting from left to right. Therefore, Eq. (232) is indeed a quantum state created by a sequence of localized quantum channels, in the order prescribed above. The proof of Eq. (232) is based on induction on a key identity that holds for 1 < k < M + 1. The k = 2 case follows from Proposition 2. The k = 3 case follows a logic similar to the proof of Theorem 2. To be more specific, in the proof of Theorem 2, one concludes that the sixth condition of Lemma 1 is satisfied, with the choice of λ, σ, and Φ specified in Eq. (74). We have already established in Lemma 1 that Φ(λ) is the maximum-entropy state, which is equal to the merged state of the two level-2 snakes, one at y = 1 and 2 and the other at y = 2 and 3. More generally, we have a chain of identities: the first line is based on the induction hypothesis, specifically the k = 2 case; the second line uses the mutation lemma (Lemma 4), with the order reversed; the third line uses the splitting lemma (Lemma 5); the fourth line is based on the observation that the map E_{k−1,↑} acts trivially on the snake that appears after it; the fifth line follows from the induction hypothesis, i.e., the k = 3 case; and the sixth line uses the splitting lemma (Lemma 5).
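The maps E_{i,↑} are specific recovery channels defined diagrammatically, but the bookkeeping of "a global state built by a prescribed sequence of channels on bounded regions" can be sketched abstractly. In the toy sketch below, every concrete choice (the fresh-site state, the random unitary standing in for a local channel, the two-site support) is an illustrative assumption, not the actual construction:

```python
import numpy as np

rng = np.random.default_rng(0)

def apply_channel(rho, kraus_ops):
    """Apply a CPTP map in Kraus form: rho -> sum_k K rho K^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

def append_site(rho, d=2):
    """Introduce a fresh site in |0><0|, as when a new coarse-grained
    site is added before the next channel acts."""
    zero = np.zeros((d, d), dtype=complex)
    zero[0, 0] = 1.0
    return np.kron(rho, zero)

def random_unitary(dim):
    """Haar-random unitary, a stand-in for a generic local channel."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(m)
    return q * (np.diag(r) / np.abs(np.diag(r)))

# Grow a chain one site at a time; at each step a channel acts only on the
# last two sites (a bounded region), mimicking the ordering in Eq. (232).
rho = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)  # one site, |0><0|
for _ in range(2):
    rho = append_site(rho)
    d = rho.shape[0]
    local = np.kron(np.eye(d // 4), random_unitary(4))  # acts on last 2 sites
    rho = apply_channel(rho, [local])
# rho is now a valid 3-site state: trace preserved, Hermitian, positive.
```

The point of the sketch is only the ordering: each step introduces a bounded region and applies a channel supported there, exactly the structure that makes Eq. (232) a legitimate sequentially generated state.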

F.2 Tensor network representation
In the recast form of Eq. (232), the maximum-entropy global state can be viewed as a two-dimensional tensor network; see Ref. [35] for a review. To see why, note that Eq. (232) essentially says that the global state can be created by a sequence of linear maps on bounded regions. One can view each such linear map as a tensor whose legs are connected to the set of physical qubits it acts on, with an additional set of legs representing the newly introduced physical degrees of freedom.
More concretely, we can formulate our argument as an induction. Suppose that the partially constructed state in Eq. (232), for k < M, is a tensor network on a k × L lattice with constant bond dimension, where L is the number of degrees of freedom in S_2. If we apply E_{k+1,↑}, it is easy to see that the resulting state is a tensor network on a (k + 1) × L lattice; see Fig. 7. (In Fig. 7, the outward-facing legs are the physical degrees of freedom, one representing the "bra" and the other the "ket"; the blue tensors are the newly introduced tensors, the green tensors are the ones nontrivially affected by the local channel that creates the blue tensors, and the yellow tensors remain unaffected.) Moreover, at most O(1) local recovery channels act on each site of the k × L tensor network. Therefore, the bond dimension of the tensor network grows by at most a constant amount at each step, and so at all steps the bond dimension is bounded from above by a constant. This argument establishes that the state in Eq. (75) has a two-dimensional tensor network representation with constant bond dimension.
Having established that the maximum-entropy state (Eq. (232)) is a two-dimensional tensor network, let us move on to the next topic: in what sense is this tensor network efficient? In the literature, the word "efficient" is used in two different ways. Sometimes, it means that the amount of data that defines the state grows polynomially with the system size. Our tensor network is certainly efficient in that sense.
A more stringent notion of efficiency concerns the calculation of expectation values: given a description of the tensor network, is there a polynomial-time algorithm to compute the expectation values of local observables? To be more specific, let us focus on geometrically local observables, i.e., observables supported on a ball of constant radius. Any such observable (provided that its support is sufficiently small) can be computed directly from the reduced density matrices that define the tensor network, since we have already proved that the global state is consistent with the marginals. Therefore, our tensor network is efficient also in the sense that there is a polynomial-time (in fact, constant-time) algorithm to compute such expectation values. We note that two-dimensional tensor networks generally do not possess such a property [8], although they can if the underlying tensors are "isometric" [9].
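The reason the computation is constant-time can be stated in one line: a local expectation value needs only the marginal on the observable's support, ⟨O_A⟩ = Tr(ρ_A O_A). A minimal numerical illustration (the two-site Bell-state marginal is an arbitrary example of ours, not one of the snakes in the construction):

```python
import numpy as np

# <O_A> = Tr(rho_A O_A): only the marginal on the support of O_A is needed,
# so the cost is independent of the system size.
Z = np.diag([1.0, -1.0])
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
rho_A = np.outer(bell, bell)                  # two-site marginal (Bell state)
expval = float(np.trace(rho_A @ np.kron(Z, Z)))  # equals 1 up to rounding
```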
In fact, our result shows that there is another interesting sense in which our tensor network is efficient: we can compute not just the local expectation values but also the global entropy. Recall that the entropy of a state is a nonlinear function of the state. As such, it is a highly nontrivial fact that one can compute the entropy in polynomial time.
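To see what makes the entropy special, contrast it with the expectation values above: S(ρ) = −Tr(ρ log ρ) is nonlinear in ρ and cannot be read off from a single trace against a fixed observable. A standard way to evaluate it when a density matrix is given explicitly (an illustration of the definition, not of our closed-form expression):

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """S(rho) = -Tr(rho log rho), evaluated through the eigenvalues of rho.
    Nonlinear in rho, unlike an expectation value Tr(rho O)."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > tol]        # 0 log 0 = 0 convention
    return float(-np.sum(evals * np.log(evals)))

print(von_neumann_entropy(np.diag([0.5, 0.5])))  # log 2 for a maximally mixed qubit
```

For a generic tensor network state, even evaluating this quantity on a large subsystem is intractable, which is why the closed-form entropy of our construction is noteworthy.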
To summarize, there is a sense in which the state we are studying, which does have a tensor network representation, is much more efficient than the garden-variety tensor networks. Not only is it possible to have a tensor network with constant bond dimension, but it is also possible to have an efficient scheme to estimate local expectation values, and even the global entropy.
However, these properties are more manifest from the perspective of the local reduced density matrices. As such, we believe a better point of view is to think in terms of the marginals.