Realistic Area-Law Bound on Entanglement from Exponentially Decaying Correlations

A remarkable feature of typical ground states of strongly-correlated many-body systems is that the entanglement entropy is not an extensive quantity. In one dimension, there exists a proof that a finite correlation length sets a constant upper-bound on the entanglement entropy, called the area law. However, the known bound exists only in a hypothetical limit, rendering its physical relevance highly questionable. In this paper, we give a simple proof of the area law for entanglement entropy in one dimension under the condition of exponentially decaying correlations. Our proof dramatically reduces the previously known bound on the entanglement entropy, bringing it, for the first time, into a realistic regime. The proof is composed of several simple and straightforward steps based on elementary quantum information tools. We discuss the underlying physical picture, based on a renormalization-like construction underpinning the proof, which transforms the entanglement entropy of a continuous region into a sum of mutual informations in different length scales and the entanglement entropy at the boundary.


I. INTRODUCTION
Understanding the universal nature of strongly-correlated many-body systems is one of the central topics in theoretical physics. Even though strongly-correlated systems are generally intractable, it is possible to uncover universal relationships between their characteristic attributes, providing a guiding principle for studying specific model Hamiltonians. For example, the existence or absence of a spectral gap, a finite or diverging correlation length, and the behavior of entanglement entropies are commonly studied attributes, which find intriguing mutual connections [1-13].
One of the prominent open problems in this context is whether the ground states of gapped Hamiltonians always obey the area law for entanglement entropy in any dimension, i.e., whether the entanglement between a subregion and its complement scales as the boundary size of the chosen region or can grow faster, e.g., as the volume of the region [1]. The underlying idea is that the existence of a gap significantly restricts the correlation that the ground state can accommodate. There is a well-established theorem, namely, the exponential-clustering theorem, which states that the existence of a spectral gap implies a finite correlation length in the ground state [2-4]. Indeed, a seminal work by Hastings [5] and several ensuing works [6-8] have given proofs of the area law in one-dimensional gapped systems, wherein the area law means a constant bound on the entanglement entropy. In higher dimensional cases, however, only partial results are available [9,10]. Originally spawned by the Bekenstein-Hawking entropy [14,15], the area law has attracted enormous interest over the last decade, thanks to its widespread relevance, e.g., to frameworks based on tensor network states [16,17], topological entanglement entropies [18,19], the holographic formula based on the AdS/CFT correspondence [20], Hamiltonian complexity theory [21], and so on.
Since the proof for one-dimensional gapped systems, a naturally ensuing question was whether a finite correlation length alone can imply the area law. Although this seems plausible at first glance, serious doubt was cast upon its possibility due to unfavorable examples such as quantum data-hiding states and quantum expander states, for which a small correlation and a large entanglement can coexist [11-13,22]. Amid such uncertainty, the recent proof that a finite correlation length indeed implies the area law in one dimension was a remarkable achievement [12,13]. However, the physical relevance of that proof is highly questionable because the obtained upper-bound of the entanglement entropy is so huge that it is never reachable in any physically sensible situation (having a constant of $\sim 10^8$ in the exponent, the bound easily surpasses the estimated number of atoms in the whole universe) [13]. Consequently, we are still facing a quite unsatisfactory situation: under the condition of a finite correlation length alone, does the upper-bound of entanglement entropies exist only in such a hypothetical limit? Answering this question is important in truly confirming our picture of one-dimensional systems: in one dimension, a finite gap implies exponential decay of correlations, which in turn implies the area law. Here, the aforementioned unfavorable examples again seem to suggest that this picture might be misleading in reality.
In this paper, we give a proof of the one-dimensional entanglement area law from exponentially decaying correlations, which dramatically reduces the previously obtained bound and, for the first time, brings the bound into a realistic regime. In addition to the constants involved, our bound also improves the asymptotic dependence on the correlation length. With $\xi$ being the correlation length, we obtain a bound of $\sim (\log\xi)\,2^{(\mathrm{const.})\xi}$, while the previous proof gives $\sim \xi^{(\mathrm{const.})\xi}$ [12,13]. In fact, the exponential dependence of the bound is unavoidable in general, as was exemplified in Ref. [11]. Our bound thus leaves only little room for improvement. Interestingly, the dependence on $\xi$ is even more favorable than that of Hastings' original proof for gapped systems, which reads $\sim \xi(\log\xi)\,2^{(\mathrm{const.})\xi}$ [5], although this bound was significantly improved by a recent work (for gapped systems) [8].
FIG. 1. The basic partitioning of the system for the proof. Here, $l_{B_1} \le l_{B_2}$ as a convention. We let $l_b = l_{B_1}$ and $l_B = l_{B_1} + l_{B_2}$. $l_C$ is defined to be $l_A + l_B$ because $S(C) = S(AB)$.
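To get a feel for how differently the three quoted scalings grow, the following minimal sketch compares their base-2 logarithms. It reads the scalings as $(\log\xi)2^{c\xi}$, $\xi^{c\xi}$, and $\xi(\log\xi)2^{c\xi}$, sets every unspecified constant $c$ to 1, and takes all logarithms to base 2. These choices, and the function names, are assumptions made purely for illustration and do not reproduce the actual constants of any of the proofs.

```python
import math

# Illustrative only: all "(const.)" factors are set to c = 1 and log means log2.

def log2_bound_this_work(xi, c=1.0):
    """log2 of (log xi) * 2**(c * xi)."""
    return math.log2(math.log2(xi)) + c * xi

def log2_bound_hastings(xi, c=1.0):
    """log2 of xi * (log xi) * 2**(c * xi)."""
    return math.log2(xi) + math.log2(math.log2(xi)) + c * xi

def log2_bound_previous(xi, c=1.0):
    """log2 of xi**(c * xi)."""
    return c * xi * math.log2(xi)

for xi in (4, 16, 64):
    print(xi,
          log2_bound_this_work(xi),
          log2_bound_hastings(xi),
          log2_bound_previous(xi))
```

Working with logarithms avoids overflow and makes the ordering visible: under these assumptions the first two differ only by an additive $\log_2\xi$, while the third carries an extra multiplicative $\log_2\xi$ in the exponent.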
Moreover, compared to the previous one, our proof is remarkably simpler and more straightforward. The proof directly addresses the internal structure of the states with exponentially decaying correlations using elementary quantum information tools. Such a direct nature allows us to envisage a clear and intuitive picture of the encountered situation. The central part of the proof is to show that when the length scale of the partitioning in Fig. 1 is increased by successive constant factors, with $n$ counting the steps, the upper bound of the mutual information between regions $A$ and $C$ initially increases indefinitely, but saturates at some point, and then decreases exponentially in $n$. Combined with a simple renormalization-like construction, this behavior of the mutual information accounts not only for the entanglement area law, but also for why the area-law bound is exponentially large in the correlation length and how the common intuition (with a finite correlation length, the entanglement of a large region is determined by the correlations around the boundary) indeed makes sense.
Thus, the present work makes our view of one-dimensional systems quite solid and consistent. We hope that our proof offers a more direct and detailed insight into the situation and becomes an important step towards the understanding of the area law in higher dimensions.

II. MAIN THEOREM
We consider a one-dimensional chain of qubits, i.e., $s = 1/2$ spins, in a pure state $\rho = |\Psi\rangle\langle\Psi|$. This setting covers the cases of higher dimensional spins because one can then decompose each spin into a set of qubits and rescale the length. We make the following assumption:
Assumption. For arbitrary operators $X$ and $Y$ supported, respectively, on regions $R_X$ and $R_Y$ separated by a graph distance $l$, the following inequality holds:
$$|\langle X \otimes Y\rangle - \langle X\rangle\langle Y\rangle| \le \|X\|\,\|Y\|\,2^{-l/\xi}, \tag{1}$$
where $\xi$ is the correlation length of the system, $\|\cdot\|$ denotes the operator norm, and $\langle\cdot\rangle = \mathrm{Tr}(\cdot\,\rho)$. Without loss of generality, we assume $\xi \ge 1$.
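As a concrete illustration of the quantity that the assumption controls, the sketch below evaluates the connected correlation $|\langle Z_1 Z_2\rangle - \langle Z_1\rangle\langle Z_2\rangle|$ for two-qubit states in plain Python. It takes the bound to have the form $\|X\|\,\|Y\|\,2^{-l/\xi}$, which is an assumption consistent with the factor $2^{-l_b/\xi}$ used in the appendix. The helper names are hypothetical, and testing a single operator pair is of course much weaker than the assumption itself, which quantifies over all operators.

```python
import math

def expectation_diag(state, diag):
    """<psi|O|psi> for an operator O that is diagonal in the computational basis."""
    return sum(abs(a) ** 2 * d for a, d in zip(state, diag))

def connected_zz(state):
    """|<Z1 Z2> - <Z1><Z2>| for a two-qubit state with amplitudes ordered 00, 01, 10, 11."""
    z1 = [1, 1, -1, -1]
    z2 = [1, -1, 1, -1]
    z1z2 = [a * b for a, b in zip(z1, z2)]
    return abs(expectation_diag(state, z1z2)
               - expectation_diag(state, z1) * expectation_diag(state, z2))

def within_assumed_bound(corr, distance, xi):
    """Check corr <= 2^(-distance/xi), using ||Z|| = 1 (assumed form of the bound)."""
    return corr <= 2.0 ** (-distance / xi)

product = [1.0, 0.0, 0.0, 0.0]                      # |00>
bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]  # (|00> + |11>)/sqrt(2)

print(connected_zz(product), within_assumed_bound(connected_zz(product), 1, 1.0))
print(connected_zz(bell), within_assumed_bound(connected_zz(bell), 1, 1.0))
```

A product state has vanishing connected correlation, whereas a maximally entangled pair at unit distance saturates it and therefore cannot satisfy the assumed bound with $\xi = 1$.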
Under the above assumption, we prove the following theorem.
Theorem. For any real parameter $\alpha_0 \in [2/3, 1)$, the entanglement entropy $S$ of an arbitrary continuous region is bounded by where with $\lceil x \rceil$ denoting the smallest integer larger than or equal to $x$.
We note that the theorem is obtained by a particular choice of the parameters used in the proof. Thus, the constants in the theorem do not represent the optimal ones. Having said that, we also note that it is nevertheless unlikely that the constants could be significantly reduced, unless the proof is enhanced by a completely new idea.

III. OVERALL PICTURE
Let us partition the chain into regions $A$, $B = B_1 + B_2$, and $C$, as shown in Fig. 1. This partitioning is our basic setting throughout the proof. Our plan is to inspect the entropic relations between the subregions while changing their sizes. The size of each region is denoted by $l$ with the corresponding subscript as in the caption of the figure. We will denote the reduced density matrix by $\rho$ with the corresponding superscript, e.g., $\rho^C = \mathrm{Tr}_{AB}\,\rho$, and use a similar convention for other density matrices. The von Neumann entropy of a local region is denoted, e.g., by $S(C) = S(\rho^C)$.
To begin with, recall the definition of the mutual information: $I(A : C) = S(A) + S(C) - S(AC)$. One can rearrange the terms using $S(C) = S(B_1 A B_2)$ and $S(AC) = S(B)$, ending up with a nice structure: where is defined for later convenience. The expression (6) can be utilized as follows. Consider the partitioning in Fig. 2. Let $\ell_n = 3^n \ell_0$ with some small unit length $\ell_0$ and integer $n$. We are interested in the entanglement entropy of large region A Performing a similar task to the end regions, Summing Eqs. (8) and (9) over $1 \le i \le n$, most of the $S(\cdot)$ terms cancel out. The result is where $f(\cdots)$ is the abbreviation of the sum of all the mutual information terms in Eqs. (8) and (9) with different length scales. Note here that all the mutual information terms corresponding to $I(A : C)$ in Fig. 1 are added, whereas those corresponding to $I(B_1 : B_2)$ are always subtracted.
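The rearrangement described above can be written out explicitly. The following is a sketch that uses only the standard definition of the mutual information together with the two purity identities $S(C) = S(B_1 A B_2)$ and $S(AC) = S(B)$ quoted in the text. The final grouping of terms is one consistent way to express the result and is not necessarily the exact form of the expression (6).

```latex
\begin{align}
I(A:C) &= S(A) + S(C) - S(AC) = S(A) + S(B_1 A B_2) - S(B), \\
S(B)   &= S(B_1) + S(B_2) - I(B_1:B_2), \\
\Rightarrow\quad
S(B_1 A B_2) &= S(B_1) + S(B_2) - S(A) + I(A:C) - I(B_1:B_2).
\end{align}
```

In this form the entropy of the combined region is expressed through the entropies of its pieces, with $I(A:C)$ entering with a plus sign and $I(B_1:B_2)$ with a minus sign, in line with the remark on the signs of the mutual information terms.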
The expression (10) reduces the problem to figuring out how the mutual information scales as the length scale is varied. We prove in the next section that if $\{l_{B_1}, l_A, l_{B_2}\}$ in Fig. 1 are all comparable to $x^n \ell_0$ with $x > 0$, the behavior of the upper-bound of $I(A : C)$ with respect to increasing $n$ is such that (i) it initially increases indefinitely for $n \ll n_0$ with $n_0$ being linearly large in the correlation length $\xi$ (transient behavior), but (ii) it saturates around $n \sim n_0$ (saturation), and then (iii) it decreases exponentially in $n$ for $n \gg n_0$ (asymptotic behavior). A similar behavior can be proven for $I(B_1 : B_2)$ by a slight modification of the proof, although we do not show it explicitly. Together with the expression (10), such behavior of the mutual information produces three straightforward implications.
First, the entanglement entropy of an arbitrarily large region of length $4\ell_n$ is upper-bounded by a constant as $f(\cdots)$ converges due to the asymptotic behavior of the mutual information. This leads to the entanglement area law as follows. For an arbitrary continuous region of length $l$, one can choose the partitioning in Fig. 1 such that $l_A = l$ and $l_{B_1} = l_{B_2} = 4\ell_n - l$ with sufficiently large $n$. Then, from the strong subadditivity $S(A) \le S(B_1 A) + S(A B_2) - S(C)$, we find that the entanglement entropy of an arbitrary region is upper-bounded by twice the maximum entanglement entropy of a region of length $4\ell_n$. This bound is the inequality (2) in the main theorem.
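The strong-subadditivity step can be spelled out as a short chain of inequalities. This is a sketch under the stated choice $l_A = l$ and $l_{B_1} = l_{B_2} = 4\ell_n - l$, so that both $B_1 A$ and $A B_2$ are contiguous regions of length $4\ell_n$. The symbol $S_{\max}(4\ell_n)$ is introduced here only as shorthand for the maximum entropy of a contiguous region of that length.

```latex
\begin{align}
S(A) &\le S(B_1 A) + S(A B_2) - S(B_1 A B_2)
      && \text{(strong subadditivity)} \\
     &=   S(B_1 A) + S(A B_2) - S(C)
      && \text{(purity of } \rho\text{)} \\
     &\le S(B_1 A) + S(A B_2)
      && (S(C) \ge 0) \\
     &\le 2\, S_{\max}(4\ell_n).
\end{align}
```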
Second, the upper-bound of the entanglement entropy is mostly determined by the transient and saturation behavior. As $n_0$ is linear in $\xi$, the area-law bound is exponentially large in the correlation length.
Third, Eq. (10) is valid for any choice of $\ell_0$. Our plan is to take the size of $A$ to be the saturation length scale, which will be denoted by $l_0$, instead of a small unit length $\ell_0$. Then, the entanglement entropy is bounded by the entanglement entropies of the two boundary regions, where the remaining mutual information terms in $f(\cdots)$ are treated as a finite correction to the bound. The point to be addressed here is that if we change $l_0$ to $3^m l_0$ with positive integer $m$, then while the size of the boundary region is increased, the sum of the remaining mutual information terms decays exponentially in $m$ due to the aforementioned asymptotic behavior. The physical meaning of this is that if the correlation length is finite, the entanglement entropy of a large region is determined by the correlations around the boundary, which is consistent with our common intuition. A caveat is that, as mentioned above, the boundary region responsible for the entanglement entropy is in general exponentially large in the correlation length.

IV. EXISTENCE OF AN AREA-LAW BOUND
The state of the system can be written in a Schmidt-decomposed form as where the Schmidt coefficients $p_i$ are sorted in descending order and $M \le 2^{l_A}$ is the Schmidt number. If a local projection on region $A$ is applied, the resulting normalized state of region $C$ is given by where we define The exponential decay of correlations implies that the larger $Q_{mn}$ and $l_b$ are, the closer $\rho^C_{mn}$ is to the original state $\rho^C$ of region $C$: where $D(\rho, \sigma)$ denotes the trace distance between two states $\rho$ and $\sigma$.
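For orientation, a Schmidt form consistent with the expression $\rho^{AC} = \sum_{i,j}\sqrt{p_i p_j}\,|i\rangle_A\langle j| \otimes \phi^C_{ij}$ used later in this section can be written as follows. The labels $|\phi_i\rangle_{BC}$ and the explicit forms of $P_{mn}$, $Q_{mn}$, and $\rho^C_{mn}$ are assumptions inferred from how these symbols are used in the text (for instance, $\langle I - P_{mn}/Q_{mn}\rangle = 0$ in the proof of Lemma 1); they are a sketch and not a verbatim restoration of the original expressions.

```latex
\begin{align}
|\Psi\rangle &= \sum_{i=1}^{M} \sqrt{p_i}\, |i\rangle_A |\phi_i\rangle_{BC}, &
\phi^{C}_{ij} &= \mathrm{Tr}_B\, |\phi_i\rangle\langle\phi_j|, \\
P_{mn} &= \sum_{i=m}^{n} |i\rangle_A\langle i|, &
Q_{mn} &= \langle P_{mn}\rangle = \sum_{i=m}^{n} p_i, \\
\rho^{C}_{mn} &= \frac{1}{Q_{mn}}\,
   \mathrm{Tr}_{AB}\!\left(P_{mn}\,\rho\,P_{mn}\right)
   = \frac{1}{Q_{mn}} \sum_{i=m}^{n} p_i\, \phi^{C}_{ii}.
\end{align}
```

With these definitions, $\rho^C_{1M} = \rho^C$, so the projected state reduces to the original one when no Schmidt component is discarded.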
Proof. For any operator $\Lambda^C$ supported on region $C$, where we use the assumption (1) and $\langle I - P_{mn}/Q_{mn}\rangle = 0$. By choosing $0 \le \Lambda^C \le I$ maximizing the trace, we obtain the lemma.
An important step is to define $q(\alpha) \in \{0, 1, \cdots, M\}$ with a control parameter $\alpha \in (0, 1)$ to be chosen later in such a way that the number of ii 's are basically uncertain and should be considered arbitrary apart from the constraint in Lemma 1. This uncertain portion amounts to For Fig. 1. For brevity, we simply denote $Q(\alpha)$, the meaning of which will be clear from the context. We will later see that when $\{l_{B_1}, l_A, l_{B_2}\}$ are all comparable, $Q(\alpha)$ asymptotically vanishes for a large length scale, which is an important part of the proof.
Fannes' inequality states that for states $\rho$ and $\sigma$ acting on a Hilbert space of [23]. If $\epsilon \in [0, 1/2]$, we can use a modified form as $-\epsilon\log\epsilon - (1 - \epsilon)\log(1 - \epsilon) \le -2\epsilon\log\epsilon$. From this modified Fannes' inequality, Lemma 1 directly leads to the following lemma. where As the Hilbert-space dimension for region $C$ is upper-bounded by $M 2^{l_B}$ from the expression (11), Fannes' inequality (19) leads to the lemma.
Note that $\log M \le l_A$. If $\{l_{B_1}, l_A, l_{B_2}\}$ are all comparable to each other, the bound in Lemma 2 decreases exponentially in $l_b$.
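The simplification used in the modified Fannes inequality, namely that the binary entropy $-\epsilon\log\epsilon - (1-\epsilon)\log(1-\epsilon)$ is at most $-2\epsilon\log\epsilon$ on $[0, 1/2]$, can be checked numerically. The sketch below is only a sanity check on a grid, with logarithms taken to base 2 (an assumption; the inequality itself does not depend on the base). The function names are hypothetical.

```python
import math

def binary_entropy(eps):
    """H(eps) = -eps log2(eps) - (1 - eps) log2(1 - eps), with H(0) = H(1) = 0."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * math.log2(eps) - (1 - eps) * math.log2(1 - eps)

def modified_bound(eps):
    """Right-hand side -2 eps log2(eps), taken as 0 at eps = 0."""
    return 0.0 if eps == 0.0 else -2 * eps * math.log2(eps)

def holds_on_grid(n=1000):
    """Check H(eps) <= -2 eps log2(eps) on a uniform grid over [0, 1/2]."""
    return all(
        binary_entropy(k / (2 * n)) <= modified_bound(k / (2 * n)) + 1e-12
        for k in range(n + 1)
    )

print(holds_on_grid())
print(binary_entropy(0.75) <= modified_bound(0.75))
```

The second check shows why the restriction to $\epsilon \le 1/2$ matters: the two sides coincide at $\epsilon = 1/2$, and beyond that point the simplified bound no longer dominates the binary entropy.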
Let us define another important quantity, namely, $S(l) \in [0, l]$, which denotes the maximum entropy of a continuous region of length $l$. Furthermore, let us define the maximum entropy per site with a slight modification: where The additional constant is added merely for simplicity of the ensuing formulae. Note that from the subadditivity of entropy, $\bar{s}(nl) \le \bar{s}(l)$ for any positive integer $n$.
It turns out that $Q(\alpha)$ is related to $\bar{s}(l)$ from the concavity of the entropy for region $A$.
Proof. If $q(\alpha) = M$, the lemma is satisfied as $Q(\alpha) = 0$. Suppose $q(\alpha) < M$. Given various possible cases of $\{p_i\}$, From this and the concavity of entropy, which implies the lemma.
In addition, the concavity of the entropy for region B results in the following lemma.
Proof. If $q(\alpha) = 0$, the lemma is satisfied as $Q(\alpha) = 1$. Suppose $q(\alpha) > 0$. From the concavity of entropy, The second inequality implies that there exists a certain $i_0 \in [1, q(\alpha)]$ whose entropy $S(\phi^B_{i_0 i_0})$ does not exceed the average bounded by that inequality. The lemma then follows because $S(\phi^B_{i_0 i_0}) = S(\phi^C_{i_0 i_0})$ and $S(C) \le S(\phi^C_{i_0 i_0}) + \epsilon(l_C, l_b, \alpha)$ from Lemma 2.
Suppose $l_{B_1} : l_A : l_{B_2} = 1 : (x - 2) : 1$ for some integer $x \ge 3$ and $l_b = l_0$, hence $l_C = x l_0$. Note $S(B) \le 2S(l_0)$ from the subadditivity of entropy. If $l_0$ is sufficiently large so that $\epsilon(l_C, l_b, \alpha) \le \epsilon_h$, Lemma 4 implies On the other hand, as $\bar{s}(l_A) = \bar{s}((x - 2)l_0) \le \bar{s}(l_0)$ from the subadditivity (24), $Q(\alpha)$ is also bounded by $\bar{s}(l_0)$ from Lemma 3. The inequality (29) thus turns into where we define a function $\gamma(x, l)$ accordingly. Consequently, if there exists $l_0$ with sufficiently small $\bar{s}(l_0)$ so that $\gamma(x, l_0) < 1$, then where $\gamma_0 \equiv \gamma(x, l_0)$. Note that $\lim_{n\to\infty} \gamma(x, x^n l_0) = 2/x$ if $\gamma_0 < 1$. The inequality (31) and Lemma 3 indicate that both $\bar{s}(x^n l_0)$ and $Q(\alpha)$ asymptotically decay exponentially in $n$. It turns out later that this is related to the asymptotic behavior of the mutual information mentioned in the previous section. Before proceeding, however, we need to make sure that there indeed exists such an $l_0$ that makes $\bar{s}(l_0)$ sufficiently small. Only then is the above argument valid. The inequality (29) is insufficient here as $Q(\alpha)$ may be arbitrarily close to one in the first place.
In order to obtain a complementary inequality, we revisit the subadditivity of entropy, which can be derived from the expression (11): where $H(\cdot)$ denotes the Shannon entropy. The second line is a general inequality [23], the third line uses $S(\phi^C_{ii}) = S(\phi^B_{ii})$, and the last line uses the concavity of entropy. The following lemma is obtained by incorporating Lemma 2 into the inequality (32).
Proof. Let us slightly modify the inequality (32). That is, But, again from the inequality the inequality (32), along with Eqs. (35) and (36), becomes Let us now actually group the indices so that which is always possible as $p_i < 2^{-\alpha l_b/\xi}$ for $i > q(\alpha)$ and To obtain the lower bound of $H(\{R_m\})$, we follow the same logic as in the proof of Lemma 3 and use the concavity of entropy to find out As min which implies the lemma.
Suppose $l_{B_1} : l_A : l_{B_2} = 1 : (x - 2) : 1$ and $l_b = \ell_0$. If $x \ge 4$ and $\ell_0$ is sufficiently large so that $\epsilon(l_C, l_b, \alpha) \le \epsilon_h$, Lemma 5 implies where we have used the subadditivity (24). It can be seen that the inequalities (29) and (42) can play complementary roles. Note that for any $Q_c \in (0, 1)$, if $Q(\alpha) \le Q_c$, $\bar{s}(xl) \le \frac{2\bar{s}(l)}{x(1 - Q_c)}$ from the inequality (29), and if $Q(\alpha) \ge Q_c$, $\bar{s}(xl) \le \bar{s}(l) - \frac{Q_c\,\alpha}{x\xi}$ from the inequality (42). We thus find For example, take $x = 4$ and $Q_c = 2/5$. Then, $\bar{s}(xl) \le \bar{s}(l) - \alpha/10\xi$ if $\bar{s}(l) \ge 3\alpha/5\xi$. Consequently, if we assume $\bar{s}(x^n \ell_0) \ge 3\alpha/5\xi$ for all positive integers $n$, $0 \le \bar{s}(x^n \ell_0) \le \bar{s}(\ell_0) - n\alpha/10\xi$ leads to a contradiction, which means there exists $n \lesssim 10\xi/\alpha$ such that $\bar{s}(x^n \ell_0) < 3\alpha/5\xi$. Once we reach this point, the inequality (43) implies $\bar{s}(x^{n+1} \ell_0) \le (5/6)\,\bar{s}(x^n \ell_0)$ and for positive integer $m$, $\bar{s}(x^{n+m} \ell_0) < (5/6)^m (3\alpha/5\xi)$ decreases exponentially in $m$ (in fact, the decrement is much faster because $Q(\alpha)$ also decreases from Lemma 3). We thus come to the conclusion that for arbitrarily small $\epsilon > 0$, there exists $l_0 = x^{n_0} \ell_0$ such that $\bar{s}(l_0) < \epsilon$ with $n_0$ being linearly large in $\xi$ and $\log(1/\epsilon)$, hence the inequality (31) is indeed valid. The detailed calculation is presented in the next section.
The next step is to see how the mutual information $I(A : C)$ scales with increasing length scale as $l_0 \to x l_0 \to x^2 l_0 \to \cdots$. As the mutual information quantifies the amount of information shared between two regions, the exponentially decaying correlation is expected to imply a decay of $I(A : C)$ with increasing length scale. The explicit bound of $I(A : C)$ can be obtained by separating the state $\rho^{AC} = \sum_{i,j=1}^{M} \sqrt{p_i p_j}\, |i\rangle_A\langle j| \otimes \phi^C_{ij}$ into three parts as and inspecting how each part contributes to $S(AC)$. From a rather straightforward analysis, we end up with the following lemma.
If $Q(\alpha) < 1/2$, The detailed proof is given in the Appendix. Suppose $\{l_{B_1}, l_A, l_{B_2}\}$ are small-integer multiples of $x^n l_0$. Then, x, the right-hand side of the inequality (44) is exponentially small in $n$. This is possible with $x > 4$ and sufficiently small $\bar{s}(l_0)$ as can be seen from the inequality (30).
We are now fully ready for the final stage of the proof. Let us define Then, from the expression (10), the following lemma follows.
Lemma 7. For $l_n = 3^n l_0$, This lemma states that $S(4 l_n)$ is upper-bounded by a constant $2S(l_0)$ independent of $n$, up to the correction terms. The remaining task is to ensure that the correction terms are indeed upper-bounded by a constant. This is easily done in view of our analysis above. We have shown above that when $\{l_{B_1}, l_A, l_{B_2}\}$ are small-integer multiples of $x^n l_0$ with $x > 4$, where it is understood that $\bar{s}(l_0)$ can be made arbitrarily small so that $\gamma_0^2 x < 1$. This directly implies that $\lambda_{2m}$ decays exponentially in $m$ as $l_{2m} = 9^m l_0$ (i.e., $x = 9$), hence $\sum_{m=0}^{\infty} \lambda_{2m}$ is a finite constant. Likewise, $\lambda_{2m+1}$ also decays exponentially in $m$, hence $\sum_{m=0}^{\infty} \lambda_{2m+1}$ is a finite constant. As a result, $\sum_{i=0}^{\infty} \lambda_i$ is upper-bounded by a constant, as we anticipated.
The analytical bound can be obtained by summing the right-hand side of the inequality (48) over $n$ and restoring the omitted constant prefactors. While the convergence is apparent from the functional form, the result is not particularly illuminating. Moreover, the actual bound is in fact much tighter than that, as we have not yet taken into account that $\gamma(x, x^i l_0)$ in the inequality (31) decreases with $i$. As the inequality (30) is nonlinear, the tighter bound can be obtained only with the aid of numerical summation. In the next section, we present the details of the calculation.

V. CALCULATION OF THE AREA-LAW BOUND
As explained in the previous section, the proof is composed of two main steps. In the first step, we use Lemmas 4 and 5 to prove that there exists a certain $l_0$ such that $\bar{s}(l_0)$ is below a small threshold. In the second step, we use Lemmas 3, 4, 6, and 7 to upper-bound $S(4 \cdot 3^n l_0)$. The area-law bound is then $S(l) \le 2S(4 \cdot 3^n l_0)$ for arbitrary $l$ from the strong subadditivity of entropy.
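The logic of the first step can be illustrated with the worked example of the previous section ($x = 4$, $Q_c = 2/5$), where the two complementary cases combine into the worst-case update $\bar{s} \to \max\{(5/6)\bar{s},\ \bar{s} - \alpha/10\xi\}$. The sketch below iterates this update with exact rational arithmetic and counts the steps needed to fall below the threshold $3\alpha/5\xi$. Treating the inequalities as equalities and starting from $\bar{s} = 1$ are assumptions made for illustration only; the function name is hypothetical.

```python
from fractions import Fraction

def steps_to_threshold(alpha, xi, s0=Fraction(1)):
    """Iterate s -> max(5s/6, s - alpha/(10 xi)) until s < 3 alpha/(5 xi).

    alpha and xi should be Fractions so that all comparisons are exact.
    Returns (number of steps, final value of s).
    """
    decrement = alpha / (10 * xi)
    threshold = 3 * alpha / (5 * xi)
    s, n = s0, 0
    while s >= threshold:
        # Above the threshold the subtractive branch is the larger (worse) one.
        s = max(Fraction(5, 6) * s, s - decrement)
        n += 1
    return n, s

alpha, xi = Fraction(4, 5), Fraction(10)
n, s = steps_to_threshold(alpha, xi)
print(n, s, 10 * xi / alpha)
```

For $\alpha = 4/5$ and $\xi = 10$ this worst case takes 120 steps, within the estimate $10\xi/\alpha = 125$, which reflects the statement that the number of rescalings needed is linear in $\xi$ and hence the corresponding length scale is exponential in $\xi$.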

A. First step
The following lemma incorporates Lemmas 4 and 5.

B. Second step
The remaining task is to obtain the upper-bound of the summation in Lemma 7. Let $l_n \equiv 3^n l_0$. As explained in the previous section, it is convenient to decompose the summation as $\sum_i \lambda_i = \sum_m \lambda_{2m} + \sum_m \lambda_{2m+1}$.
The summation $\sum_{m=0}^{\infty} \lambda_{2m+1}$ is obtained similarly. Here we frequently use the subadditivity, e.g., $\bar{s}(l_{2m+1}) \le \bar{s}(l_{2m})$, and $3 l_{2m+1} = l_{2m+2}$. We then obtain the following bounds similar to the above ones: From the set of upper-bounds of $\{\bar{s}(l_0), \bar{s}(l_2), \bar{s}(l_4), \cdots\}$ and the inequalities (57), (58), and (59), we obtain the following bound: By incorporating this and Lemma 9 into Lemma 7 and noting that $\eta(l_n, l_n, 2l_n)$ is exponentially small in $n$, we arrive at the following lemma:
Lemma 10. Inherit the definitions in Lemma 9. For any $\alpha_0 \in [2/3, 1)$, there exists integer $N$ such that for $n \ge N$,
The essential ingredient of the proof was defining the entropy per site $\bar{s}(l)$ and the cut-off index $q(\alpha)$ associated with the cut-off proportion $Q(\alpha)$. We obtained various relations of $\bar{s}(l)$ in terms of $Q(\alpha)$ for the partitioning in Fig. 1. An important observation was that as the length scale is increased as $\ell_0 \to x\ell_0 \to x^2\ell_0 \to \cdots \to x^n\ell_0$ with some positive integer $x$, $\bar{s}(x^n\ell_0)$ asymptotically decays exponentially in $n$, and this is accompanied by the asymptotic exponential decay of the mutual information $I(A : C)$ in $n$ when $\{l_{B_1}, l_A, l_{B_2}\}$ are all comparable to $x^n\ell_0$. This property motivated us to devise the renormalization-like construction in Fig. 2, revealing that the entropy of a large region is equivalent to the entropy of the fixed end regions up to the correction terms composed of the mutual informations with different length scales. The correction terms are upper-bounded by a constant thanks to the asymptotic decay of the mutual information.
Such an asymptotic behavior of $I(A : C)$ is, however, preceded by a transient stage in which $I(A : C)$ increases indefinitely. During the transient stage, the decrement of $\bar{s}(x^n\ell_0)$ is slower and $Q(\alpha)$ is unbounded (note that in Lemma 3, $Q(\alpha)$ can be bounded only when $\bar{s}(l_A) < \alpha l_b/\xi l_A = O[1/\xi]$). This transient stage persists until the length scale reaches a certain value $l_0$ that is exponentially large in $\xi$. Finding such an $l_0$ was the first step of the previous section. This transient behavior is responsible for the area-law bound being exponentially large in $\xi$. States of this kind were exemplified in Ref. [11] as expander graph states.
One can slightly modify the logic towards Lemma 6 in order to find the upper-bound of $I(B_1 : B_2)$ instead. That is, we write the state as instead of Eq. (11) and define (18). Then, we can obtain an upper-bound similar to that in Lemma 6: The behavior of $I(A : C)$ and $I(B_1 : B_2)$ (the initial increase, saturation, and asymptotic decay) is a characteristic feature of the states with exponentially decaying correlations, although the initial growth of the mutual information may be absent. There would be various kinds of states with different behaviors of the mutual information. It is an interesting point that a state obeys an entanglement area law even when $I(A : C)$ asymptotically decays polynomially as $O[n^{-k}]$ with $k > 1$. It seems that there is room between the exponential and the polynomial decay, so the assumption (1) might be slightly relaxed, although we do not have a further result. This viewpoint suggests that the behavior of the mutual information $I(A : C)$ in the IR limit is an important attribute governing the area-law scaling of entanglement entropies.
As a final remark, we note that the idea of taking a cutoff index q(α) is in line with the idea of approximating one-dimensional many-body states with matrix product states [16]. In the language of the matrix product state, q(α) plays the role of a bond dimension, while Q(α) governs the accuracy of the approximation. Again, Ref. [11] exemplifies the worst-case scenario in which a logarithm of the bond dimension is exponentially large in the correlation length, which is consistent with our proof.
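The role of the cut-off index as an effective bond dimension can be made concrete with a small sketch. It assumes that $q(\alpha)$ counts the Schmidt coefficients that are at least $2^{-\alpha l_b/\xi}$ (inferred from the statement that $p_i < 2^{-\alpha l_b/\xi}$ for $i > q(\alpha)$) and that $Q(\alpha)$ is the total weight of the remaining coefficients (inferred from $Q(\alpha) = 0$ when $q(\alpha) = M$ and $Q(\alpha) = 1$ when $q(\alpha) = 0$). The function name and the example spectrum are hypothetical.

```python
def cutoff(schmidt_coeffs, alpha, l_b, xi):
    """Return (q, Q) for a Schmidt spectrum.

    q: number of coefficients >= 2^(-alpha * l_b / xi)  (assumed definition)
    Q: total weight of the discarded coefficients       (assumed definition)
    """
    p = sorted(schmidt_coeffs, reverse=True)  # descending order, as in the text
    threshold = 2.0 ** (-alpha * l_b / xi)
    q = sum(1 for x in p if x >= threshold)
    Q = sum(p[q:])
    return q, Q

# Hypothetical spectrum; the threshold is 2^(-0.5 * 4 / 1) = 0.25.
spectrum = [0.5, 0.25, 0.125, 0.125]
q, Q = cutoff(spectrum, alpha=0.5, l_b=4, xi=1.0)
print(q, Q)
```

Because every retained coefficient is at least $2^{-\alpha l_b/\xi}$ and the coefficients sum to at most one, the retained number $q$ can never exceed $2^{\alpha l_b/\xi}$, which is the counting argument used in the appendix.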
ii) Suppose $0 < q(\alpha) < M$, hence $0 < Q(\alpha) < 1$. We can split $\rho^{AC}$ into three parts: Our aim here is to find the lower-bound of $S(\rho^{AC})$. Let us first deal with the last sum. Suppose we add to the system a single qubit $a$ initialized in state $|0\rangle_a$ and apply a local unitary transformation on $a + A$ such that $|0\rangle_a |i\rangle_A \to |0\rangle_a |i\rangle_A$ for $i \le q(\alpha)$ and $|0\rangle_a |i\rangle_A \to |1\rangle_a |i\rangle_A$ for $i > q(\alpha)$. The resulting state is which differs from Eq. (62) by the last sum. Note that $|0\rangle_a\langle 0| \otimes \rho^{AC}$ and $\tilde{\rho}^{aAC}$ can be transformed to each other by a local unitary transformation on $a + A$, which implies $S(\rho^{AC}) = S(\tilde{\rho}^{aAC})$. Thus, from the triangle inequality, Note $S(\tilde{\rho}^a) = H(\{Q(\alpha), 1 - Q(\alpha)\})$.
Let us now deal with the other two sums. Introduce another state $\tilde{\sigma}^{AC}$, which differs from $\tilde{\rho}^{AC}$ by the off-diagonal terms in the first sum. We find $D(\tilde{\rho}^{AC}, \tilde{\sigma}^{AC})$ Note where $X^A_{ij} = |i\rangle_A\langle j| + |j\rangle_A\langle i|$ and $Y^A_{ij} = -i|i\rangle_A\langle j| + i|j\rangle_A\langle i|$ are the Pauli matrices. As the term inside the norm is Hermitian, having real eigenvalues, one can write, e.g., the first term as for some matrix $\Lambda^C_{ij}$ that has $\pm 1$ as eigenvalues. Thus, where we use the Schmidt decomposition (11) in the first line, the assumption (1) in the second line, and $\langle X^A_{ij}\rangle = 0$ and $\|X^A_{ij}\| = \|\Lambda^C_{ij}\| = 1$ in the last line. The other norm is bounded in the same manner. We thus find $D(\tilde{\rho}^{AC}, \tilde{\sigma}^{AC}) \le (\#\text{ of terms in Eq. (66)}) \cdot 2 \cdot 2^{-l_b/\xi}$ where we use $q(\alpha) \le 2^{\alpha l_b/\xi}$ (otherwise $\sum_{i=1}^{q(\alpha)} p_i > 1$). Note that $\tilde{\sigma}^{AC}$ is purified by attaching a system with Hilbert-space dimension $q(\alpha) + 1$ and the region $B$, while $\tilde{\rho}^{AC}$ is purified by attaching a qubit and the region $B$ (i.e., $|\Psi\rangle$ is the purification), which means $\tilde{\sigma}^{AC}$ has a larger Hilbert-space dimension. Furthermore, both states share the same basis states. Consequently, Fannes' inequality (19) can be applied with $d = \{q(\alpha) + 1\}2^{l_B}$. From $d \le (2^{\alpha l_b/\xi} + 1)2^{l_B} < 2^{l_b + l_B}$, it follows
We are now in a position to lower-bound $S(AC)$. For convenience, let $\tilde{\sigma}^{AC}_{mn} \equiv (1/Q_{mn}) P_{mn} \tilde{\sigma}^{AC} P_{mn}$. Then, Eq. (65) can be written as Note where the first line comes from a general equality $S(\sum_i p_i |i\rangle\langle i| \otimes \rho_i) = -\sum_i p_i \log p_i + \sum_i p_i S(\rho_i)$ satisfied when $\{|i\rangle\}$ is an orthonormal basis [23] and the third line from Lemma 2. Note also $Q(\alpha) S(\tilde{\sigma}^{AC}_{q(\alpha)+1,M}) \ge Q(\alpha) S(\tilde{\sigma}^{A}_{q(\alpha)+1,M}) - Q(\alpha) S(\tilde{\sigma}^{C}_{q(\alpha)+1,M})$ where the first line comes from the triangle inequality of entropy, the second line from the concavity of entropy, and the third line from Lemma 2 and $S(\tilde{\sigma}^C) = S(C)$. From the inequalities (64) and (67) Combining these two inequalities and noting $S(\tilde{\rho}^a) = H(\{Q(\alpha), 1 - Q(\alpha)\}) \le -2Q(\alpha)\log Q(\alpha)$ for $Q(\alpha) \le 1/2$, the lemma follows.
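The general equality $S(\sum_i p_i |i\rangle\langle i| \otimes \rho_i) = -\sum_i p_i \log p_i + \sum_i p_i S(\rho_i)$ used above can be checked directly from eigenvalues: since the blocks are mutually orthogonal, the spectrum of the joint state consists of the products $p_i \lambda_{ik}$, with $\lambda_{ik}$ the eigenvalues of $\rho_i$. The sketch below verifies this for a small example, with logarithms to base 2 (an assumption). The function names and the example spectra are hypothetical.

```python
import math

def shannon(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def block_entropy(weights, spectra):
    """Entropy of sum_i p_i |i><i| (x) rho_i, each rho_i given by its eigenvalues."""
    joint = [p * lam for p, spec in zip(weights, spectra) for lam in spec]
    return shannon(joint)

def rhs(weights, spectra):
    """H({p_i}) + sum_i p_i S(rho_i)."""
    return shannon(weights) + sum(p * shannon(spec)
                                  for p, spec in zip(weights, spectra))

weights = [0.5, 0.5]
spectra = [[1.0], [0.5, 0.5]]  # a pure state and a maximally mixed qubit
print(block_entropy(weights, spectra), rhs(weights, spectra))
```

Both sides evaluate to the same number for this example, as the equality requires whenever the labels $|i\rangle$ are orthonormal.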