Quantum information measures for restricted sets of observables

We study measures of quantum information when the space spanned by the set of accessible observables is not closed under products, i.e., we consider systems where an observer may be able to measure the expectation values of two operators, $\langle O_1 \rangle$ and $\langle O_2 \rangle$, but may not have access to $\langle O_1 O_2 \rangle$. This problem is relevant for the study of localized quantum information in gravity since the set of approximately-local operators in a region may not be closed under arbitrary products. While we cannot naturally associate a density matrix with a state in this setting, it is still possible to define a modular operator for a state, and distinguish between two states using a relative modular operator. These operators are defined on a little Hilbert space, which parameterizes small deformations of the system away from its original state, and they do not depend on the structure of the full Hilbert space of the theory. We extract a class of relative-entropy-like quantities from the spectrum of these operators that measure the distance between states, are monotonic under contractions of the set of available observables, and vanish only when the states are equal. Consequently, these distance-measures can be used to define measures of bipartite and multipartite entanglement. We describe applications of our measures to coarse-grained and fine-grained subregion dualities in AdS/CFT and provide a few sample calculations to illustrate our formalism.


Introduction and Summary of Results
In studying entanglement and other quantum information measures associated with a system, we often assume that the Hilbert space factorizes as $\mathcal{H} \otimes \bar{\mathcal{H}}$, with a factor $\mathcal{H}$ associated with the system of interest and another factor $\bar{\mathcal{H}}$ associated with the rest of the world. A slightly more general way to state this assumption is that we have access to an algebra of observables that characterizes the system of interest. In the case where the Hilbert space admits a bipartite decomposition, this algebra simply comprises the set of all operators in the theory that act trivially on $\bar{\mathcal{H}}$ but may act non-trivially on $\mathcal{H}$. This set of operators forms a linear space and is closed under multiplication and the adjoint operation.
However, it often happens that while we can measure the expectation value of one operator in the system, say $O_1$, and also of another operator, say $O_2$, we cannot measure the product $O_1 O_2$ due to some physical constraints. In this paper, we explore the extent to which one can define various quantum information measures in such settings, where the space spanned by the available observables is not an algebra because it is not closed under multiplication.
We were originally motivated to study this problem with a view to understanding local entanglement in theories of quantum gravity. In theories of quantum gravity, except for some special regions, the set of approximately-local operators associated with a region does not form an algebra [1,2].
This point can be understood by contrasting gravity with local quantum field theories. In a local quantum field theory, given a region, R, the product of any two local operators in R is another operator in R and therefore the set of all local operators that belong to R is closed under multiplication. This algebraic structure holds even in gauge theories, where the algebras associated with regions contain a center [3]. The presence of this center creates ambiguities in quantum information measures, since the center can be considered either to belong to the region or to belong to its complement. But this ambiguity does not alter the fact that the set of local operators is closed under multiplication.
However, in a theory of quantum gravity, the situation is more subtle. Although there are no exactly local gauge-invariant operators in gravity, it is possible, in a sense, to approximately localize simple operators in a region $R$. However, if we start considering sufficiently complicated polynomials of these operators, then (except for some special regions) these complicated polynomials do not remain confined to $R$ in any meaningful sense. Therefore the set of approximately local observables does not span an algebra in a theory of quantum gravity. As a corollary, the Hilbert space of the theory does not factorize into a Hilbert space associated with $R$ and another factor associated with its complement. An explicit example of this phenomenon was given in [2], motivated by the results of [4,5,6], where the lack of factorization of the Hilbert space was important for the resolution of the information paradox proposed there. We give some additional examples in section 5.
In the absence of an algebraic structure, there is no natural way to associate a density matrix with $R$. However, as we will describe below, it is still possible to define a related quantity, called a modular operator. We can obtain some intuition for this object by thinking of a special case. Consider an ordinary local quantum field theory without gauge fields, where the Hilbert space factorizes as $\mathcal{H} \otimes \bar{\mathcal{H}}$ and where, with respect to the full set of operators (including operators in $\bar{\mathcal{H}}$), the system is in a pure state. Then, if the density matrix of $R$ is $\rho$, the density matrix of $\bar{\mathcal{H}}$ is $\bar{\rho}$, which has the same spectrum as $\rho$. In this setting, the modular operator associated with $R$ is $\rho \otimes \bar{\rho}^{-1}$. We also study the relative modular operator, which distinguishes between states. In the setting above, given two states, where the density matrix associated to $R$ is $\rho$ and $\sigma$ respectively, the relative modular operator between the states is $\sigma \otimes \bar{\rho}^{-1}$.
In the absence of a direct-product factorization, we will show that the modular and the relative modular operator continue to exist even though they cannot be interpreted simply in terms of density matrices.
We use the modular and the relative modular operators to define measures of distance between states. As we review below, an appropriate measure of distance underpins the study of other measures of quantum information, such as entanglement. In the usual setting, a common and useful measure of distance is provided by the relative entropy. While we are not able to directly generalize the relative entropy, we are able to find other measures that share the nice properties of the relative entropy.
We now describe our results in some more detail. We consider a set of operators, which we denote by $\mathcal{A}$. We assume that this set forms a complex vector space and also that it is closed under the adjoint operation: $A \in \mathcal{A} \implies A^\dagger \in \mathcal{A}$. For notational simplicity, we also take $1 \in \mathcal{A}$. We set $\dim(\mathcal{A}) = D$. We denote the state of the system by $\psi$. We do not insist that $\psi$ be pure, and this is reflected in our notation. We use $\psi(A)$ to denote the expectation value of $A$ in the state $\psi$. The action of the elements of $\mathcal{A}$ on $\psi$ creates a little Hilbert space, $\mathcal{H}_\psi = \text{span}\{|A_1\rangle, \ldots, |A_D\rangle\}$, whose vectors are in one-to-one correspondence with elements of $\mathcal{A}$ and where the norm is set by the state: $\langle A_j | A_i \rangle = \psi(A_j^\dagger A_i)$. The little Hilbert space also contains a vector dual to the identity operator, $1$, denoted by $|1\rangle$. Note that while the structure of the little Hilbert space requires us to measure one-point and two-point functions of operators in $\mathcal{A}$, we will not require any higher-point functions.
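As a concrete illustration of the little Hilbert space, the following minimal sketch builds the matrix of two-point functions $\psi(A_i^\dagger A_j)$ for a single qubit. The choice of observables, span{1, σ_x, σ_z}, the density matrix `rho` used to represent the state, and the helper names are all assumptions made for this example; they do not come from the text. The set is closed under adjoints but not under products, since σ_x σ_z is proportional to σ_y.

```python
import numpy as np

# Hypothetical restricted set of observables: span{1, sigma_x, sigma_z}.
I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [I2, sx, sz]

def expectation(rho, op):
    """psi(op) = Tr(rho op); the state is represented by a density matrix."""
    return np.trace(rho @ op)

def gram_matrix(rho, ops):
    """Inner products <A_i|A_j> = psi(A_i^dagger A_j) on the little Hilbert space."""
    return np.array([[expectation(rho, a.conj().T @ b) for b in ops] for a in ops])

# Hypothetical full-rank state, so that no nonzero element of the span has zero norm.
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
g = gram_matrix(rho, basis)
```

Only one- and two-point functions of the chosen operators enter `g`; nothing about the rest of the Hilbert space is used.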
We then define information measures using constructs that operate entirely within this little Hilbert space. This is an important physical constraint in our analysis and reflects the fact that the observer may have no prior information about the structure of the full Hilbert space of the theory.
The modular operator on $\mathcal{H}_\psi$ is defined as
$$\langle A_i | \Delta_\psi | A_j \rangle = \psi(A_j A_i^\dagger).$$
The relative modular operator is also an operator within $\mathcal{H}_\psi$. If we are given another state $\phi$, then the relative modular operator is defined as
$$\langle A_i | \Delta(\psi|\phi) | A_j \rangle = \phi(A_j A_i^\dagger).$$
In the case where $\mathcal{A}$ forms an algebra, Araki [7] defined the relative entropy as
$$S^{\text{ar}}_{\mathcal{A}}(\psi|\phi) = -\langle 1 | \log \Delta(\psi|\phi) | 1 \rangle.$$
This definition can be adopted, as it stands, in the case where $\mathcal{A}$ is not an algebra. However, we find that the physical quantity, so defined, shares some but not all of the desirable properties that the relative entropy has when $\mathcal{A}$ is an algebra. Even in the case where $\mathcal{A}$ is not closed under multiplication, we find that $S^{\text{ar}}_{\mathcal{A}}(\psi|\phi)$ continues to be nonnegative. $S^{\text{ar}}_{\mathcal{A}}(\psi|\phi)$ is also monotonic under contractions of the set of available observables, i.e., if we consider a subset of observables whose span is a subspace of the original space, $\mathcal{B} \subset \mathcal{A}$, then $S^{\text{ar}}_{\mathcal{B}}(\psi|\phi) \leq S^{\text{ar}}_{\mathcal{A}}(\psi|\phi)$. However, it turns out that it is no longer true that this quantity vanishes only when the states are the same.
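The following sketch checks, in the factorized special case described in the introduction, that an Araki-type expression reduces to the familiar relative entropy. It assumes that the relative modular operator is $\sigma \otimes \bar{\rho}^{-1}$ (as stated above for this special case) and that Araki's relative entropy takes the standard form $-\langle \Psi | \log \Delta(\psi|\phi) | \Psi \rangle$, with $|\Psi\rangle$ playing the role of $|1\rangle$. The Schmidt probabilities, the random state `sigma`, and all function names are hypothetical choices for this example.

```python
import numpy as np

def hermitian_log(mat):
    """Logarithm of a Hermitian positive-definite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * np.log(vals)) @ vecs.conj().T

d = 3
p = np.array([0.5, 0.3, 0.2])                      # hypothetical Schmidt probabilities |a_n|^2
rho = np.diag(p).astype(complex)                   # reduced state of psi (same matrix on H-bar)
psi_vec = np.diag(np.sqrt(p)).reshape(-1).astype(complex)  # |Psi> = sum_n a_n |n, n-bar>

rng = np.random.default_rng(0)
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
sigma = G @ G.conj().T + 0.1 * np.eye(d)           # hypothetical second state, full rank
sigma /= np.trace(sigma).real

# Relative modular operator in the special case: sigma on H, inverse of rho-bar on H-bar.
delta_rel = np.kron(sigma, np.linalg.inv(rho))

araki = -(psi_vec.conj() @ hermitian_log(delta_rel) @ psi_vec).real
standard = np.trace(rho @ (hermitian_log(rho) - hermitian_log(sigma))).real
```

Under these assumptions `araki` and `standard` agree up to rounding, which is the sense in which the modular definition generalizes $\text{Tr}\,\rho(\log\rho - \log\sigma)$.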

Therefore, we introduce two additional measures of distance between states. Let $X = (\Delta_\psi)^{-1/2} \Delta(\psi|\phi) (\Delta_\psi)^{-1/2}$, and let $\|X\|$ and $\|X^{-1}\|$ denote the operator norms of $X$ and its inverse respectively. Then, in (1.1), we set $S_{\mathcal{A}}(\psi|\phi) = \log \|X\| + \log \|X^{-1}\|$ and define a second measure, the $\chi$-distance $\chi_{\mathcal{A}}(\psi|\phi)$, through a function $f_\chi$. This choice of $f_\chi$ is not unique and we describe how the $\chi$-distance can be generalized in various ways. Even though it is not obvious from the formulas above, both the measures in (1.1) are symmetric between $\psi$ and $\phi$.
We show that both $S_{\mathcal{A}}(\psi|\phi)$ and $\chi_{\mathcal{A}}(\psi|\phi)$ are nonnegative and that they both have the property that
$$S_{\mathcal{A}}(\psi|\phi) = 0 \iff \chi_{\mathcal{A}}(\psi|\phi) = 0 \iff \psi(A_i^\dagger A_j) = \phi(A_i^\dagger A_j), \quad \forall A_i, A_j \in \mathcal{A}, \tag{1.2}$$
i.e., when these quantities vanish, the states $\psi, \phi$ are indistinguishable through the measurement of one- and two-point functions of $\mathcal{A}$. Just like the relative entropy, both these measures of distance are monotonic: if we shrink $\mathcal{A}$ by dropping some observables, the distance between states either remains constant or decreases. Moreover, both these measures share an additional important property of the relative entropy, which we call insularity.
If we simply add on an ancillary system in a state $\psi_{\text{anc}}$ whose operators, $\mathcal{A}_{\text{anc}}$, are unentangled with the original system, then this does not change the distance between states: $S_{\mathcal{A} \otimes \mathcal{A}_{\text{anc}}}(\psi \otimes \psi_{\text{anc}} | \phi \otimes \psi_{\text{anc}}) = S_{\mathcal{A}}(\psi|\phi)$ and $\chi_{\mathcal{A} \otimes \mathcal{A}_{\text{anc}}}(\psi \otimes \psi_{\text{anc}} | \phi \otimes \psi_{\text{anc}}) = \chi_{\mathcal{A}}(\psi|\phi)$. (The notation for product states, used in this relation, is standard but such states are defined more precisely in (2.7).)
$S_{\mathcal{A}}(\psi|\phi)$ can be interpreted as an entropy, since it is additive. More precisely, $S_{\mathcal{A}_1 \otimes \mathcal{A}_2}(\psi_1 \otimes \psi_2 | \phi_1 \otimes \phi_2) = S_{\mathcal{A}_1}(\psi_1|\phi_1) + S_{\mathcal{A}_2}(\psi_2|\phi_2)$. However, $S_{\mathcal{A}}(\psi|\phi)$ has the somewhat undesirable property that the distance between states can easily become infinite, as we explain in greater detail below. On the other hand, $\chi_{\mathcal{A}}(\psi|\phi)$ is not additive for product states, but it has the advantage that it always takes values in $[0, 1]$.
The measures $S_{\mathcal{A}}(\psi|\phi)$ and $\chi_{\mathcal{A}}(\psi|\phi)$ are just two of a large class of distance measures, derived from the spectrum of the modular and relative modular operators, which can discern between states as in (1.2), and are monotonic and insular as described above. We denote any measure of distance that satisfies these properties by $D_{\mathcal{A}}(\psi, \phi)$.
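The quantity $\log\|X\| + \log\|X^{-1}\|$ can be evaluated numerically from two-point functions alone. The sketch below assumes that the matrix elements of the modular and relative modular operators in a basis $\{A_i\}$ are $\psi(A_j A_i^\dagger)$ and $\phi(A_j A_i^\dagger)$ respectively. Since $X$ is positive and has the same spectrum as $\Delta_\psi^{-1}\Delta(\psi|\phi)$, its norm and the norm of its inverse follow from the largest and smallest eigenvalues of a generalized eigenvalue problem. The qubit observables, the two density matrices, and the function names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [I2, sx, sz]   # hypothetical restricted set; not closed under products

def modular_elements(rho, ops):
    """Assumed matrix elements: entry (i, j) is Tr(rho A_j A_i^dagger)."""
    return np.array([[np.trace(rho @ b @ a.conj().T) for b in ops] for a in ops])

def s_distance(rho_psi, rho_phi, ops):
    """log||X|| + log||X^{-1}|| via the eigenvalues of D_psi^{-1} D_phi."""
    d_psi = modular_elements(rho_psi, ops)
    d_phi = modular_elements(rho_phi, ops)
    # The eigenvalues are real and positive; the tiny imaginary parts are rounding noise.
    lam = np.linalg.eigvals(np.linalg.solve(d_psi, d_phi)).real
    return float(np.log(lam.max()) - np.log(lam.min()))

rho_psi = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)   # hypothetical states
rho_phi = np.array([[0.5, 0.0], [0.0, 0.5]], dtype=complex)

d_same = s_distance(rho_psi, rho_psi, basis)
d_forward = s_distance(rho_psi, rho_phi, basis)
d_backward = s_distance(rho_phi, rho_psi, basis)
```

In this example `d_same` vanishes and `d_forward` equals `d_backward`, because exchanging the two states inverts every eigenvalue and leaves the ratio of the largest to the smallest unchanged.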
We describe how any such measure, $D_{\mathcal{A}}(\psi, \phi)$, can be used to define a measure of bipartite and multipartite entanglement. The key point is that notions of purity and separability can be generalized easily to our setting. A state, $\psi$, is pure with respect to $\mathcal{A}$ if it cannot be written as a convex combination of two other states, i.e., if there are no distinct states $\psi_1, \psi_2$ and $0 < \lambda < 1$ such that $\psi = \lambda \psi_1 + (1 - \lambda) \psi_2$. Similarly, given a direct product decomposition, $\mathcal{A} = \mathcal{A}_1 \otimes \mathcal{A}_2$, we can define separable states as those that can be written as convex combinations of product states. A measure of bipartite entanglement is then given by
$$E_{\mathcal{A}}(\psi) = \inf_{\phi \in D_{\text{sep}}} D_{\mathcal{A}}(\psi, \phi),$$
where the infimum is taken over the set of all separable states. If the space of observables admits a multipartite factorization, $\mathcal{A} = \mathcal{A}_1 \otimes \ldots \otimes \mathcal{A}_n$, then the definition above immediately generalizes to a measure of multipartite entanglement.
The virtue of this measure of entanglement is that, first, it measures only quantum and not classical correlations; second, it is invariant under local unitary transformations; and third that it remains constant or decreases under local operations and classical communication (LOCC).
We describe some applications of our information measures to subregion duality in AdS/CFT [8], i.e., the question of what region in the bulk is probed by a given spacetime region, $R$, on the boundary. It is generally believed that this bulk region is given by the so-called "entanglement wedge" of the boundary region [9]. This duality can be most precisely phrased in terms of entanglement measures, and is equivalent to the claim that "bulk relative entropy on the entanglement wedge is equal to boundary relative entropy in the region R" [10,11]. This duality is a statement about the bulk dual to all operators in the region $R$; this is what one may call a "fine grained" subregion duality.
However, it is often convenient to consider another version of the subregion duality, which we call a "coarse grained" duality. Here, we consider the set of simple operators in the region $R$, which we call $\mathcal{A}^R_C$, and likewise restrict bulk observables in the dual region to simple polynomials in local field operators, which we call $\mathcal{A}^B_C$. In this setting, we can also consider arbitrary regions $R$, which extend in both space and time and may not be causally complete. Then, we argue that the bulk region dual to $R$ in a coarse grained sense is given by
$$\bigcup_{D \subset R} C_D, \tag{1.3}$$
which indicates the union of the causal wedges, $C_D$, of each causal diamond, $D$, that belongs to $R$. (Causal wedges are defined in (5.5).) In the conventional formalism, it is difficult to write down an information-theoretic analogue of (1.3) since the set of simple operators in $R$ does not form an algebra. However, in our formalism, if $\psi$ and $\phi$ are two states then (1.3) simply implies
$$D_{\mathcal{A}^R_C}(\psi, \phi) = D_{\mathcal{A}^B_C}(\psi, \phi),$$
where $D_{\mathcal{A}}(\psi, \phi)$ is any one of the measures of distance described above.
We also briefly comment on what happens when we consider the set of all operators in $R$ including arbitrarily complicated operators. Although we are not able to prove the entanglement-wedge proposal, we are able to show that, in this limit, the bulk dual, which we denote by $B$, at least contains a region $B_F$,
$$B \supseteq B_F = \bigcup_{S \subset R} C_S,$$
where the union runs over all spacelike intervals, $S$, that lie in $R$ and $C_S$ indicates the causal wedge of the causal diamond built on $S$. Our proof only uses principles from canonical gravity and, in particular, the fact that the Hamiltonian can be expressed as a boundary term.
In terms of information measures, if $\mathcal{A}^R$ is the set of all operators in $R$ and $\mathcal{A}^B_F$ is the set of all simple bulk operators in $B_F$ then (5.10) implies that
$$D_{\mathcal{A}^R}(\psi, \phi) \geq D_{\mathcal{A}^B_F}(\psi, \phi).$$
The inequality arises because the region $B_F$ is, in general, smaller than the entanglement wedge and because the measure of distance is monotonic.
After a description of our setup in section 2, we turn to a detailed discussion of the modular and relative modular operator in section 3. We then describe how the spectrum of these operators can be used to define measures of distance, and also entanglement, in section 4. Section 5 describes the applications of this formalism to AdS/CFT. Apart from the application of our measures to coarse-grained subregion dualities, we also discuss fine-grained dualities and the extent to which they can be derived in a theory of quantum gravity. Section 5 is somewhat independent from the rest of the paper. The reader who is interested in our formalism only from an information-theoretic perspective may skip this section. Conversely, this section can also be read independently of the information-theoretic content in this paper for some simple new results in bulk reconstruction. In Appendix A, we provide some sample calculations to illustrate our formalism.

Relation to previous work
We conclude this introduction by reminding the reader of some previous work in these directions. In the holographic context, [12] first discussed the question of evaluating the entropy relevant for an observer who had access to information only in a bounded spacetime region. The question of reconstructing the bulk given only restricted data on the boundary was also considered in [13]. These latter papers are also closely related to the seminal work on S-maximization by Jaynes [14].
However, the perspective in this extant literature differs somewhat from ours. In particular, the principle of S-maximization suggests that given a set of coarse-grained observables, we should evaluate quantum information measures by finding the fine-grained density matrix with the largest von Neumann entropy that is also consistent with the expectation values of these coarse-grained observables. However, this leads to answers that are sensitive to the structure of the full fine-grained Hilbert space of the theory including, especially, its dimension. This is the reason that the holographic prescriptions above lead to entropies that scale with the central charge of the boundary theory.
In this paper, we would like to consider an observer who has no prior information about the fine-grained degrees of freedom in the theory. This is why the information measures that we define below, in a holographic context, will not scale with the central charge of the boundary theory but just depend on the dimension of the set of restricted observables that we choose to include.

Setup
In this section, we describe our physical setup. We consider a general physical system in a state $\psi$. We also consider an observer, who has limited abilities to manipulate the system and make measurements. First, we allow the observer to access any linear combination of a set of simple operators $A_1, \ldots, A_D$. This set of operators, which includes the identity operator, is denoted by
$$\mathcal{A} = \text{span}\{A_1, \ldots, A_D\}. \tag{2.1}$$
The operators $A_i$ need not be Hermitian but we will demand that $\mathcal{A}$ is closed under the adjoint operation: $A \in \mathcal{A} \implies A^\dagger \in \mathcal{A}$. However, note that when we make reference to a basis for $\mathcal{A}$ as in (2.1), the Hermitian conjugates of all operators are assumed to be included in this basis and we do not need to display them separately. Note that $\mathcal{A}$ is not a $C^*$-algebra because, in general, $A_i A_j \notin \mathcal{A}$.
In this paper, we will assume that the complete set of physical quantities that are accessible to the observer is spanned by the two-point functions
$$g_{ij} = \psi(A_i^\dagger A_j). \tag{2.2}$$
We now describe some simple properties of this set of two-point functions.
First, note that since $1 \in \mathcal{A}$ this set of two-point functions also includes all one-point functions $\psi(A)$. We also assume that the state is normalized so that $\psi(1) = 1$. In general, we will assume that the state $\psi$ is separating with respect to $\mathcal{A}$, which can be stated as
$$\psi(A^\dagger A) > 0, \quad \forall A \in \mathcal{A}, \ A \neq 0. \tag{2.3}$$
At times we may consider states where $\psi(A^\dagger A) = 0$ for some $A \in \mathcal{A}$ (see, for example, the discussion of pure states below) but these cases can be handled by taking a limit of states that satisfy (2.3). Second, note that $\mathcal{A}$ itself is a complex vector space, and when we wish to consider operations on this vector space, we will denote elements of $\mathcal{A}$ using bra-ket notation as $|A\rangle$. This set of states includes $|1\rangle$. The two-point functions (2.2) establish an inner product on this vector space:
$$\langle A_i | A_j \rangle = g_{ij}. \tag{2.4}$$
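Note that (2.3) automatically implies that $g_{ij}$ is Hermitian. To see this, we note that both $\psi\big((A_i + A_j)^\dagger (A_i + A_j)\big)$ and $\psi\big((A_i + i A_j)^\dagger (A_i + i A_j)\big)$ must be real and this can only happen if $g_{ij} = g_{ji}^*$. Therefore, the inner product (2.4) is conjugate bilinear and positive-definite and endows $\mathcal{H}_\psi$ with the structure of a Hilbert space. We will call $\mathcal{H}_\psi$ the little Hilbert space.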
The "little Hilbert space" was first introduced in [5,6]. Here, this space was interpreted as the space of "small deformations" about the state ψ produced by "acting" with elements of A. The two-point functions (2.2) then just tell us about the "angles" between these deformations. In [15], the little Hilbert space was termed the "code subspace". We will not need these interpretations in this analysis but they are useful to keep in mind.
Note that there is no natural basis choice for $\mathcal{A}$. We can always make $GL(D, \mathbb{C})$ transformations on the basis to obtain a new metric. Under such a transformation, $A_i \rightarrow \sum_j M_{ji} A_j$, the metric transforms as $g \rightarrow M^\dagger g M$. We can use this freedom to always diagonalize $g$. This corresponds to choosing a basis for $\mathcal{A}$ that satisfies $g_{ij} = \lambda_i \delta_{ij}$, for some set of real numbers $\lambda_i > 0$. Note that the $\lambda_i$ can themselves be changed by rescaling the basis elements. We will see later how it is possible to extract invariants by combining the adjoint map and the metric.
The description above completely outlines the mathematical framework that we will need in the rest of the paper. However, to remove any possible confusion, we give a simple example. Consider a lattice quantum field theory, comprising a single local scalar field $\phi(t, x_i)$ on a lattice where the spatial coordinates can take on $N$ distinct values $x_1, \ldots, x_N$. Then one interesting restricted set of observables is obtained by taking $\mathcal{A}$ to be the set of polynomials with products of at most $M$ distinct field operators, with $M \ll N$, on a constant time surface, $t = 0$. A basis for $\mathcal{A}$ is given by monomials of the form $\phi(0, x_{\pi_1}) \ldots \phi(0, x_{\pi_q})$, where $x_{\pi_1}, \ldots, x_{\pi_q}$ is any selection of $q$ points of the $N$ available lattice points and $q \leq M$. Here, we clearly have $D = \sum_{q=0}^{M} \frac{N!}{(N-q)!\, q!}$. Note that one- and two-point functions of elements of $\mathcal{A}$ are not one- and two-point functions in terms of the elementary field; this set contains up to $2M$-point correlators of the fields. Second, note that $\mathcal{A}$ is not an algebra because if we multiply two $M$-point monomials, we get a monomial with $2M$ insertions of the field, and this is not part of $\mathcal{A}$.
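The counting in the lattice example can be checked with a short script. This is a minimal sketch; the function name and the sample values of $N$ and $M$ are chosen only for illustration.

```python
from math import comb

def restricted_dimension(n_sites: int, max_fields: int) -> int:
    """D = sum over q from 0 to M of N! / ((N - q)! q!): the number of monomials
    built from at most M distinct field insertions on N lattice sites."""
    return sum(comb(n_sites, q) for q in range(max_fields + 1))

# Hypothetical sample values: N = 4 sites, at most M = 2 insertions.
D_small = restricted_dimension(4, 2)
# With M = N every subset of sites is allowed, so D = 2**N.
D_full = restricted_dimension(5, 5)
```

For $M \ll N$ the dimension $D$ grows only polynomially in $N$, whereas allowing all products, $M = N$, gives $2^N$ monomials.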
This kind of setup is appropriate for a bulk observer in AdS/CFT. This observer can measure correlators with a small number of insertions of local bulk fields. However, in general, it is not possible for the observer to measure arbitrarily complicated polynomials of these local bulk fields or measure fields at arbitrarily small separations. The bulk observer also has limited knowledge about the structure of the CFT Hilbert space, and can only operate in the little Hilbert space that she can investigate by exciting the state with sources dual to simple bulk operators.

Subtleties with a density matrix interpretation
When A is an algebra, we can identify the density matrix of the system as an element of the algebra, ρ ∈ A, with the property that Tr(ρA) = ψ(A), ∀A ∈ A.
However, in the setup under consideration, where we have access only to the little Hilbert space, this is not a viable option. For example, insisting that the density matrix gives the right two-point functions would lead us to demand $\text{Tr}(\rho A_i^\dagger A_j) = g_{ij}$. First, to evaluate the contribution to the trace from within the little Hilbert space, we need to know three-point functions of elements of $\mathcal{A}$, which we do not have access to. But even worse, since in general $A_i A_j \notin \mathcal{A}$, the trace also receives contributions from outside the little Hilbert space. So to evaluate the trace we need to know the action of elements of $\mathcal{A}$ on the full Hilbert space and not just in the neighbourhood of the state $\psi$. This would violate our physical presumption that the observer does not have information about the global Hilbert space or the behaviour of $A_i$ outside of a neighbourhood of $\psi$.
Therefore, here, we will not attempt to construct a density matrix to reproduce the expectation values (2.2).

Pure and mixed states
We can now proceed to define the notion of a pure state. A state $\psi$ is said to be pure on the observables $\mathcal{A}$ if $\psi$ has the property that there is no other state $\psi'$ such that $\lambda \psi'$, with $0 < \lambda < 1$, is uniformly smaller than $\psi$. We denote the set of pure states by $D_{\text{pure}}$.
This can equivalently be stated as the criterion that a state $\psi$ is said to be pure on the observables $\mathcal{A}$ if it cannot be written as a convex combination of two other states: $\psi \in D_{\text{pure}} \iff \psi \neq \lambda \psi' + (1 - \lambda) \psi''$, with $0 < \lambda < 1$ and any distinct states $\psi', \psi'' \neq \psi$.
Note that all the states that appear in these definitions are assumed to be normalized so that $\psi(1) = 1$. These definitions are direct generalizations of the definitions used for pure states in quantum information theory when the set of observables forms an algebra. (See, for instance, definition 5.3.5 in [16].) The task here, as in the more complicated examples we encounter later, is to find the right definition, from among the many equivalent definitions that hold when $\mathcal{A}$ is an algebra, that can generalize to the case where $\mathcal{A}$ is not an algebra.
We remind the reader, who may be unfamiliar with the definitions above, that when A is an algebra and we can associate a density matrix with the state then, in some basis, the density matrix of a pure state can be written as diag{1, 0, 0, . . .}. This makes it clear that one cannot find another density matrix that is uniformly smaller than a pure-state density matrix, and also that a pure-state density matrix cannot be written as a convex combination of other density matrices.
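A simple example of a pure state is as follows. Consider the case where $\mathcal{A} = \text{span}\{1, A\}$, with $A^\dagger = A$, and consider a state that satisfies $\psi(1) = 1$; $\psi(A) = \mu$; $\psi(A^2) = \mu^2$ for any real value of $\mu$. We might try to construct a "smaller" state through $\lambda \psi'(1) = \lambda$; $\lambda \psi'(A) = \mu'$; $\lambda \psi'(A^2) = \kappa$, with $\lambda < 1$; $\kappa < \mu^2$. But a little thought shows that this does not work. First note that since $\lambda \psi'\big((1 - \frac{\lambda A}{\mu'})^2\big) \geq 0$, we must have $\kappa \geq \frac{\mu'^2}{\lambda}$. Then we find that
$$(\psi - \lambda \psi')\big((A - \mu)^2\big) = -\big(\kappa - 2 \mu \mu' + \lambda \mu^2\big) \leq -\frac{(\mu' - \lambda \mu)^2}{\lambda} \leq 0.$$
Moreover, the final inequality is saturated only if $\mu' = \lambda \mu$ and $\kappa = \lambda \mu^2$, but then we would have $\psi' = \psi$, which is not allowed. Therefore, we cannot have $\lambda \psi' < \psi$ for any choice of $\lambda$ and $\psi'$.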
The set of mixed states is the complement of the set of pure states. It comprises all those states that can be written as convex combinations of other states.

Separable and entangled states
It may happen at times that the space of observables, $\mathcal{A}$, admits a direct product factorization so that $\mathcal{A} = \mathcal{A}_1 \otimes \mathcal{A}_2$. For example, we may consider two separated systems and then consider separate observations on these systems. Even in gravity, we may consider two different observers who are localized in spacelike separated regions of the asymptotic spacetime. Observations made by these observers commute and if we consider the full set of possible observations this forms the direct-product space above. Of course, we do not assume that either $\mathcal{A}_1$ or $\mathcal{A}_2$ is an algebra. A product state on $\mathcal{A}$ is a state where all expectation values can be decomposed as a product of expectation values in a state on $\mathcal{A}_1$ and expectation values in a state on $\mathcal{A}_2$.
A convex combination of product states is called separable. We denote the set of separable states by $D_{\text{sep}}$:
$$\psi \in D_{\text{sep}} \iff \psi = \sum_i \lambda_i \, \psi^1_i \otimes \psi^2_i,$$
with $\lambda_i > 0$ and where the subscript $i$ indicates that the sum can run over arbitrary product states. The provision that $\lambda_i > 0$ is important, since otherwise all states on $\mathcal{A}$ can be written as linear combinations of product states. A separable state can be interpreted as a classical mixture of product states. A state is said to be entangled if it is not separable. In section 4, we will quantify entanglement. In appendix A.3, we give some examples of separable and entangled states. In general, the question of determining whether a state is separable or entangled is a difficult one, and in fact, it is known to be an NP-hard problem [17].
We have considered bipartite splittings of the set of observables here, but the same formalism extends in a natural way to multipartite splittings of the available observables.

Local operations and classical communication
For later use, we consider an important possible transformation of the state defined by $\psi \rightarrow L(\psi)$, where Here $M^{(n)}$ and $N^{(n)}$ are arbitrary invertible matrices indexed by $n$ and the sum over $n$ can run to an arbitrary number, $n_m$. These matrices must satisfy a completeness condition summarized by $L(\psi)(1) = 1$. Such a transformation is called an LOCC transformation in the quantum information literature since it can be achieved by an observer who can access observables in $\mathcal{A}_1$ and another observer who can access observables in $\mathcal{A}_2$ making local transformations on the state. However, these local transformations are correlated between the two observers, and these correlations can be achieved by purely classical communication.
It is clear that LOCC operations map separable states to separable states: $\psi \in D_{\text{sep}} \implies L(\psi) \in D_{\text{sep}}$. Moreover, by enhancing the set of observables to $\mathcal{B} = \mathcal{A}_1 \otimes \mathcal{A}_2 \otimes \mathcal{A}_{\text{anc}}$ by using an ancillary set of observables, we can realize the LOCC transformation in terms of a single transformation of the form (2.5) on an initial product state $\psi \otimes \psi_{\text{anc}}$. To see this we choose $\dim(\mathcal{A}_{\text{anc}}) = n_m$ and choose the initial state of the ancillary system to satisfy $\psi_{\text{anc}}(A_{\text{anc},i} A_{\text{anc},j}) = \delta_{ij}$. We then consider the following linear transformation acting on elements of $\mathcal{B}$ where $1_{\text{anc}}$ denotes the identity operator in the ancillary system. (This is just the analogue of lifting an operator-sum representation of a superoperator to a unitary transformation in a larger space. See, for example, section 3.2 in [18].) Although the equation above does not completely specify $L$, since we have not specified how it acts on those elements of $\mathcal{B}$ that involve non-trivial insertions of operators from $\mathcal{A}_{\text{anc}}$, we will not need this information.
We now consider the state whose two-point functions are given by where $B_i \in \mathcal{B}$ and where we have abused notation slightly by allowing $L$ to act both on the state and on $\mathcal{B}$.
Then it is clear that when restricted only to elements of $\mathcal{A}_1 \otimes \mathcal{A}_2$ this state coincides with the original LOCC-transformed state Therefore, the LOCC operation is just an active version of (2.5) in the space with the ancillary observables included.

The Modular and Relative Modular Operators
We now introduce the modular and relative modular operators. Although these operators can be defined independently, they naturally appear in the Tomita-Takesaki theory of modular automorphisms of von Neumann algebras [19], and our discussion follows this path. Several of the properties of the modular and relative modular operators that we describe in this section are well known in the literature on the Tomita-Takesaki theory, and follow from a simple application of the definitions of these operators. So our objective in this section is only to emphasize that these results continue to be true even when A is not an algebra. Moreover, this section serves to collect, in one place, all the properties of these operators that we will need.

The modular operator
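First, we introduce the anti-linear map, $S_\psi: \mathcal{H}_\psi \rightarrow \mathcal{H}_\psi$, that acts on the little Hilbert space through
$$S_\psi |A\rangle = |A^\dagger\rangle.$$
We can also define the adjoint of this map as usual. This is defined by using the usual rules for obtaining the Hermitian adjoint of an anti-linear operator:
$$\langle A_i | S_\psi^\dagger | A_j \rangle = \langle A_j | S_\psi | A_i \rangle.$$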
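The modular operator is then defined through
$$\Delta_\psi = S_\psi^\dagger S_\psi. \tag{3.2}$$
The relations above allow us to work out the matrix elements of the modular operator:
$$\langle A_i | \Delta_\psi | A_j \rangle = \langle A_j^\dagger | A_i^\dagger \rangle = \psi(A_j A_i^\dagger). \tag{3.3}$$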
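Note that the formula (3.3) holds in an arbitrary basis. It is clear from (3.2) that the modular operator is positive and Hermitian. We can also see this directly. Positivity follows because $\langle A | \Delta_\psi | A \rangle = \psi(A A^\dagger) \geq 0$. Hermiticity follows because $\langle A_i | \Delta_\psi | A_j \rangle^* = \psi(A_j A_i^\dagger)^* = \psi(A_i A_j^\dagger) = \langle A_j | \Delta_\psi | A_i \rangle$. We can write the matrix elements of $\Delta_\psi$ more explicitly in terms of the operator $S_\psi$ and the metric $g_{ij}$. Even though $S_\psi$ is anti-linear, in a given basis, we may write $S_\psi |A_i\rangle = \sum_j S_{ji} |A_j\rangle$. We then have $\langle A_i | \Delta_\psi | A_j \rangle = \sum_{k,l} S_{kj}^* \, g_{kl} \, S_{li}$.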

Interpretation of the modular operator when A is an algebra
We now pause in our discussion of the properties of the modular operator to discuss its interpretation in a simple setting. We consider the following Special Case (SC). We take the Hilbert space to have a bipartite factorization, $\mathcal{H} \otimes \bar{\mathcal{H}}$, with $\dim(\mathcal{H}) = \dim(\bar{\mathcal{H}})$. We take $\mathcal{A}$ to be the set of all operators that may act non-trivially on $\mathcal{H}$ but act trivially on $\bar{\mathcal{H}}$. Any state, $\psi$, on $\mathcal{H}$, can be represented as a pure state on $\mathcal{H} \otimes \bar{\mathcal{H}}$. We will denote this pure state by $|\Psi\rangle$. The little Hilbert space is just generated by the action of $\mathcal{A}$ on this state: $\mathcal{H}_\psi = \mathcal{A}|\Psi\rangle$.
In the text below, we will return to this simplified setting, as a means of gaining insights on the structures that we define.
We then claim that the matrix elements (3.3) are correct if
$$\Delta_\psi = \rho \otimes \bar{\rho}^{-1}. \tag{3.4}$$
This means that $\Delta_\psi$ acts as the density matrix on $\mathcal{H}$ and as the inverse of the density matrix on $\bar{\mathcal{H}}$. We remind the reader of condition (2.3), which ensures that $\rho$ is nonsingular.
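In terms of the modular Hamiltonian, $\Delta_\psi = e^{-H_{\text{mod}}} \otimes e^{\bar{H}_{\text{mod}}}$, where $H_{\text{mod}} = -\log \rho$ is the modular Hamiltonian for $\mathcal{H}$ and $\bar{H}_{\text{mod}} = -\log \bar{\rho}$ is the modular Hamiltonian for $\bar{\mathcal{H}}$. Note that, as a consequence of this relation, $\Delta_\psi$ is not an element of the algebra, nor is it an element of the commutant, which consists of operators that act trivially on $\mathcal{H}$.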
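We now verify the formula (3.4). In the Schmidt basis the state can be written as $|\Psi\rangle = \sum_n a_n |n, \bar{n}\rangle$, where $|n\rangle$ runs over some orthonormal basis and $|\bar{n}\rangle$ denotes the vector in $\bar{\mathcal{H}}$ that is entangled with $|n\rangle$. Then we have $\rho = \text{Tr}_{\bar{\mathcal{H}}}(|\Psi\rangle \langle \Psi|) = \sum_n |a_n|^2 |n\rangle \langle n|$.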
And the claimed modular operator is given by Let us now take two operator A p , A q : H → H and lift them to operators on the full space by demanding that they act as the identity on H. We now see that n,r,s,n (a n ) * |a s | 2 |a r | 2 a n n,n|A q ⊗ 1|s,r s,r|A p ⊗ 1|n ,n .
The dot-product of the barred-vectors just yields the delta functions δ nr δrn leading to We also note that the, due to (2.3), the set of states A|Ψ is dense in the full Hilbert space, H ⊗ H. Therefore the matrix elements of the modular operator above completely specify the operator in H ⊗ H.
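The result above can also be interpreted as follows. Since the space $\mathcal{H}_\psi$ in this case is isomorphic to $\mathcal{H} \otimes \bar{\mathcal{H}}$ we can also write the action of the modular operator as
$$\Delta_\psi |A\rangle = |\rho A \rho^{-1}\rangle.$$
It is easy to see then that
$$\langle A_q | \Delta_\psi | A_p \rangle = \psi(A_q^\dagger \rho A_p \rho^{-1}) = {\rm Tr}\left( A_q^\dagger \rho A_p \right) = \psi(A_p A_q^\dagger),$$
precisely as required. Indeed, the modular operator is often introduced as an operator that acts on the space of operators in the Hilbert space. (See, for instance, [20].) However, we find that (3.4) (which holds under the additional conditions outlined in SC) is more useful in providing intuition about $\Delta_\psi$.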
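The special-case claim can be checked numerically. The following is a minimal sketch, assuming the reading $\Delta_\psi = \rho \otimes \bar{\rho}^{-1}$ of (3.4) and matrix elements $\psi(A_p A_q^\dagger)$ for (3.3); the dimension, random seed and variable names are illustrative choices rather than anything fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3

# Schmidt coefficients a_n of |Psi> = sum_n a_n |n, nbar> (generic, so all non-zero).
a = rng.normal(size=d) + 1j * rng.normal(size=d)
a /= np.linalg.norm(a)

# |Psi> as a vector in H (x) Hbar with flat index n * d + nbar (np.kron ordering).
Psi = np.zeros(d * d, dtype=complex)
Psi[np.arange(d) * (d + 1)] = a

# Reduced density matrix in the Schmidt basis and the claimed modular operator.
rho = np.diag(np.abs(a) ** 2)
Delta = np.kron(rho, np.linalg.inv(rho))


def lift(op):
    """Lift an operator on H to H (x) Hbar, acting as the identity on Hbar."""
    return np.kron(op, np.eye(d))


A_p = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A_q = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# <A_q| Delta |A_p> evaluated in the full Hilbert space ...
lhs = np.vdot(lift(A_q) @ Psi, Delta @ lift(A_p) @ Psi)
# ... compared with the two-point function psi(A_p A_q^dagger).
rhs = np.trace(rho @ A_p @ A_q.conj().T)
print(lhs, rhs)
```

The two printed numbers should agree up to rounding for any choice of `A_p` and `A_q`, which is the content of the Schmidt-basis computation above.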

Spectrum of the modular operator
The matrix elements of (3.3) transform under GL(D, C) transformations of the basis that we use for $\mathcal{A}$. However, we now show how the spectrum of eigenvalues of $\Delta_\psi$ provides us with an invariant quantity that characterizes the state. The spectrum is defined by the usual eigenvalue equation
$$\Delta_\psi \sum_i c_i |A_i\rangle = \lambda \sum_i c_i |A_i\rangle, \qquad (3.5)$$
which yields the eigenvectors $\sum_i c_i |A_i\rangle$ and the eigenvalues $\lambda$.
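Note that the matrix elements of the modular operator are given by (3.3), but such a basis is not orthonormal in general, so we cannot simply diagonalize (3.3). Instead we go to an orthonormal basis denoted by
$$|A^O_i\rangle = \sum_j M^O_{ji} |A_j\rangle, \qquad \langle A^O_i | A^O_j \rangle = \delta_{ij}. \qquad (3.6)$$
Of course the transformation $M^O_{ji}$ is not unique since we can make unitary transformations. Any other basis that is related by
$$|A^{O'}_i\rangle = \sum_j U_{ji} |A^O_j\rangle, \qquad U^\dagger U = 1,$$
is also orthonormal. In the orthonormal basis, the matrix elements of the modular operator are given by
$$(\Delta_\psi)^O_{ij} = \langle A^O_i | \Delta_\psi | A^O_j \rangle.$$
The eigenvalues of the matrix $(\Delta_\psi)^O_{ij}$ give us the spectrum that we need. It is easy to see that this spectrum is invariant under the unitary ambiguity that exists in (3.6). In the primed basis above, we would have
$$(\Delta_\psi)^{O'} = U^\dagger (\Delta_\psi)^O U,$$
and this matrix clearly has the same spectrum as $(\Delta_\psi)^O$.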
We can also solve (3.5) directly by simply contracting it with $\langle A_j|$, leading to the equation
$$\sum_i (\Delta_\psi)_{ji} c_i = \lambda \sum_i g_{ji} c_i, \qquad (\Delta_\psi)_{ji} \equiv \langle A_j | \Delta_\psi | A_i \rangle.$$
If we denote the inverse of $g$ by $g^{-1}$ so that $\sum_j g^{-1}_{kj} g_{ji} = \delta_{ki}$ then the equation above becomes
$$\sum_{i,j} g^{-1}_{kj} (\Delta_\psi)_{ji} c_i = \lambda c_k.$$
We can now diagonalize the matrix $\sum_j g^{-1}_{kj} (\Delta_\psi)_{ji}$ to obtain the spectrum of eigenvalues $\lambda$. Under a GL(D, C) transformation as in (2.5), this matrix changes only by a similarity transformation. Since matrices related by similarity transformations have the same spectrum, it is clear that the original and transformed matrices have the same spectrum. This argument also shows that the spectrum of eigenvalues thus obtained is the same as the spectrum of $(\Delta_\psi)^O_{ij}$ introduced above. Therefore the spectrum of the modular operator characterizes the state independent of the basis chosen for $\mathcal{A}$.
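The basis independence of the spectrum can be illustrated numerically. The sketch below assumes that a state is represented by a full-rank density matrix on a small auxiliary Hilbert space, that the set of observables is the span of $\{1, A, A^\dagger\}$ for a random non-Hermitian matrix $A$ (so it is not closed under products), and that the matrix elements are $g_{ij} = \psi(A_i^\dagger A_j)$ and $\langle A_i|\Delta_\psi|A_j\rangle = \psi(A_j A_i^\dagger)$. The helper names `gram`, `modular` and `spectrum` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)


def random_state(d):
    """Random full-rank density matrix (an illustrative choice of state)."""
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = m @ m.conj().T
    return rho / np.trace(rho).real


def gram(rho, ops):
    """Metric g_ij = psi(A_i^dagger A_j) on the little Hilbert space."""
    return np.array([[np.trace(rho @ a.conj().T @ b) for b in ops] for a in ops])


def modular(rho, ops):
    """Matrix elements <A_i|Delta_psi|A_j> = psi(A_j A_i^dagger)."""
    return np.array([[np.trace(rho @ b @ a.conj().T) for b in ops] for a in ops])


def spectrum(rho, ops):
    """Eigenvalues of g^{-1} Delta, i.e. solutions of the contracted eigenvalue equation."""
    lam = np.linalg.eigvals(np.linalg.solve(gram(rho, ops), modular(rho, ops)))
    return np.sort(lam.real)


d = 3
rho = random_state(d)
a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
ops = [np.eye(d), a, a.conj().T]  # span is closed under adjoints, not under products

# A random GL(3, C) change of basis for the same linear span.
m = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
new_ops = [sum(m[i, j] * ops[j] for j in range(3)) for i in range(3)]

spec_old = spectrum(rho, ops)
spec_new = spectrum(rho, new_ops)

# Same spectrum from an orthonormal basis built with a Cholesky factor g = L L^dagger.
l_inv = np.linalg.inv(np.linalg.cholesky(gram(rho, ops)))
delta_orth = l_inv @ modular(rho, ops) @ l_inv.conj().T
spec_orth = np.linalg.eigvalsh(delta_orth)
print(spec_old, spec_new, spec_orth)
```

The Cholesky factor is only one of many ways to orthonormalize; any other choice differs by a unitary matrix and gives the same eigenvalues.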

Pairing of eigenvalues of the modular operator
We now show that eigenvalues in the spectrum of the modular operator appear in reciprocal pairs. This can be written as
$${\rm SP}(\Delta_\psi) = {\rm SP}(\Delta_\psi^{-1}),$$
where we use the notation SP to denote the spectrum of an operator.
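We start by proving the identity
$$\Delta_\psi^{-1} S_\psi = S_\psi \Delta_\psi. \qquad (3.7)$$
Note that from (3.1) we obviously have $S_\psi^2 = 1$. We also have $(S_\psi^\dagger)^2 = 1$ since for arbitrary $A_i$ and $A_j$,
$$\langle A_i | (S_\psi^\dagger)^2 | A_j \rangle = \langle S_\psi^\dagger A_j | S_\psi A_i \rangle = \langle A_j | S_\psi^2 A_i \rangle^* = \langle A_i | A_j \rangle.$$
Now multiplying both sides of (3.7) from the left by $S_\psi \Delta_\psi$ we see that the relation we need to prove is
$$S_\psi \Delta_\psi S_\psi \Delta_\psi = 1.$$
But using the definition of $\Delta_\psi$ we see that this is just
$$S_\psi S_\psi^\dagger S_\psi S_\psi S_\psi^\dagger S_\psi = S_\psi (S_\psi^\dagger)^2 S_\psi = S_\psi^2 = 1.$$
Now consider a solution to the eigenvalue equation for $\Delta_\psi$ (3.5),
$$\Delta_\psi |A\rangle = \lambda |A\rangle.$$
Then, using (3.7) in the equivalent form $\Delta_\psi S_\psi = S_\psi \Delta_\psi^{-1}$, we have
$$\Delta_\psi S_\psi |A\rangle = S_\psi \Delta_\psi^{-1} |A\rangle = S_\psi \lambda^{-1} |A\rangle = \lambda^{-1} S_\psi |A\rangle,$$
where, in the last equation, we have used the fact that all eigenvalues of $\Delta_\psi$ are real and positive. Therefore $S_\psi |A\rangle$ is an eigenvector of $\Delta_\psi$ with eigenvalue $\lambda^{-1}$.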
Note also that if $|A\rangle$ is an eigenvector of $\Delta_\psi$ and if $A^\dagger = A$ then we must have $\lambda = 1$.
The fact that eigenvalues of $\Delta_\psi$ appear in reciprocal pairs is natural in the situation where $\mathcal{A}$ is an algebra acting on one part of a bipartite Hilbert space as in SC. We see from (3.4) that if $\rho_i$ are the eigenvalues of $\rho$ then the eigenvalues of $\Delta_\psi$ are just the ratios $\rho_i / \rho_j$, and so they naturally occur in reciprocal pairs corresponding to all possible ratios of the eigenvalues of the density matrix. But we see above that even when $\mathcal{A}$ is not an algebra, the modular operator continues to have this property.
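As a quick numerical illustration of the pairing when the observables do not form an algebra, the sketch below takes the span of $\{1, A, A^\dagger\}$ on a four-dimensional auxiliary space, with a random full-rank density matrix standing in for the state. These are illustrative assumptions, and the matrix elements are taken to be $\psi(A_i^\dagger A_j)$ for the metric and $\psi(A_j A_i^\dagger)$ for the modular operator.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4

# Random full-rank state and a three-dimensional set of observables.
m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = m @ m.conj().T
rho /= np.trace(rho).real
a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
ops = [np.eye(d), a, a.conj().T]

# Metric and modular-operator matrix elements on the little Hilbert space.
g = np.array([[np.trace(rho @ x.conj().T @ y) for y in ops] for x in ops])
delta = np.array([[np.trace(rho @ y @ x.conj().T) for y in ops] for x in ops])

# Spectrum of the modular operator: eigenvalues of g^{-1} Delta.
spec = np.sort(np.linalg.eigvals(np.linalg.solve(g, delta)).real)
print(spec)        # expected pattern: {1/lam, 1, lam}
print(1.0 / spec)  # the same set in reversed order
```

The eigenvalue 1 belongs to the vector $|1\rangle$, which corresponds to a Hermitian element, and the remaining two eigenvalues are reciprocals of each other.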

The relative modular operator
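The modular operator characterizes a state. We now describe the relative modular operator, which depends on two states and can be used to characterize their difference. To define the relative modular operator, we consider a second state $\phi$. This second state also induces a separate Hilbert space structure on $\mathcal{A}$, leading to another little Hilbert space $\mathcal{H}_\phi$. We denote vectors in this space through $|A\rangle_\phi$ and the inner product is given by
$${}_\phi\langle A_i | A_j \rangle_\phi = \phi(A_i^\dagger A_j).$$
Note that we denote vectors in $\mathcal{H}_\phi$ with an additional subscript. But, in this section, we will continue to denote vectors in $\mathcal{H}_\psi$ using the notation $|A\rangle$ with no additional subscripts. Then we define a map $S_{(\psi|\phi)} : \mathcal{H}_\psi \to \mathcal{H}_\phi$ through
$$S_{(\psi|\phi)} |A\rangle = |A^\dagger\rangle_\phi.$$
The relative modular operator is then defined through
$$\Delta(\psi|\phi) = S^\dagger_{(\psi|\phi)} S_{(\psi|\phi)}. \qquad (3.8)$$
From the definitions above, it is clear that $\Delta(\psi|\phi)$ maps $\mathcal{H}_\psi \to \mathcal{H}_\psi$ and its matrix elements are
$$\langle A_i | \Delta(\psi|\phi) | A_j \rangle = \phi(A_j A_i^\dagger).$$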

Interpretation of the relative modular operator when A is an algebra
We now describe the interpretation of the relative modular operator in the special case SC. We additionally assume that the state $\phi$ is also pure in the full Hilbert space and we denote this pure state by $|\Phi\rangle$. As usual, we denote the density matrix of $\mathcal{H}$ in the state $\psi$ by $\rho$ and remind the reader that the density matrix of $\bar{\mathcal{H}}$ is then $\bar{\rho}$, with the same spectrum as $\rho$.
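Similarly, we denote the density matrix of $\mathcal{H}$ in the state $\phi$ by $\sigma$. We will then show that
$$\Delta(\psi|\phi) = \sigma \otimes \bar{\rho}^{-1}. \qquad (3.9)$$
Here, as in (3.4), the second factor acts on the complementary space. We assume that both states have a Schmidt decomposition as follows
$$|\Psi\rangle = \sum_n a_n |n, \bar{n}\rangle, \qquad |\Phi\rangle = \sum_\alpha b_\alpha |\alpha, \bar{\alpha}\rangle. \qquad (3.10)$$
Notice that the Latin indices, $n$, and the Greek indices, $\alpha$, run over different sets of states, corresponding to the Schmidt bases that diagonalize entanglement in the two states. The density matrices are given by
$$\rho = \sum_n |a_n|^2 \, |n\rangle\langle n|, \qquad \sigma = \sum_\alpha |b_\alpha|^2 \, |\alpha\rangle\langle\alpha|,$$
and the claimed form of the relative modular operator is
$$\Delta(\psi|\phi) = \sum_{\alpha, m} \frac{|b_\alpha|^2}{|a_m|^2} \, |\alpha, \bar{m}\rangle \langle \alpha, \bar{m}|.$$
We see, using the form above, that if $A_p, A_q$ are any two operators $\mathcal{H} \to \mathcal{H}$ then
$$\langle A_q | \Delta(\psi|\phi) | A_p \rangle = \sum_{n,n',m,\alpha} a_n^* a_{n'} \frac{|b_\alpha|^2}{|a_m|^2} \, \langle n, \bar{n} | A_q^\dagger \otimes 1 | \alpha, \bar{m} \rangle \langle \alpha, \bar{m} | A_p \otimes 1 | n', \bar{n}' \rangle$$
$$= \sum_{n,\alpha} |b_\alpha|^2 \, \langle n | A_q^\dagger | \alpha \rangle \langle \alpha | A_p | n \rangle = \phi(A_p A_q^\dagger),$$
which is precisely what we need. Note that in going from the first to the second line, we used the identity operators to equate $n = m = n'$.

Just as in section 3.1.1 the space $\mathcal{H}_\psi$ is isomorphic to $\mathcal{H} \otimes \bar{\mathcal{H}}$ due to the separating nature of the state $\psi$. This has two implications. First, it tells us that the matrix elements above completely specify the relative modular operator in the special case SC. Second, it allows us to rewrite this result in the slightly more general form $\Delta(\psi|\phi)|A\rangle = |\sigma A \rho^{-1}\rangle$, when $\mathcal{A}$ is an algebra.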
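We can easily verify that, in this case,
$$\langle A_q | \Delta(\psi|\phi) | A_p \rangle = \psi(A_q^\dagger \sigma A_p \rho^{-1}) = {\rm Tr}\left( A_q^\dagger \sigma A_p \right) = \phi(A_p A_q^\dagger),$$
precisely as required. Once again, the reader may find the result (3.9) (which requires additional conditions) somewhat more useful to develop intuition about the relative modular operator.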
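When the observables form the full matrix algebra, the form $\sigma \otimes \bar{\rho}^{-1}$ predicts that the spectrum of the relative modular operator consists of all ratios $\sigma_i / \rho_j$. The sketch below checks this on a qubit, assuming matrix elements $\phi(A_j A_i^\dagger)$ and using the matrix units $E_{ij}$ as a (non-orthonormal) basis; the states are random full-rank density matrices chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 2


def random_state():
    """Random full-rank density matrix on C^d."""
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    r = m @ m.conj().T
    return r / np.trace(r).real


rho, sigma = random_state(), random_state()

# Matrix units E_ij span the full algebra of operators on C^d.
ops = []
for i in range(d):
    for j in range(d):
        e = np.zeros((d, d), dtype=complex)
        e[i, j] = 1.0
        ops.append(e)

# Metric from psi and relative modular matrix elements from phi.
g = np.array([[np.trace(rho @ x.conj().T @ y) for y in ops] for x in ops])
rel = np.array([[np.trace(sigma @ y @ x.conj().T) for y in ops] for x in ops])

spec = np.sort(np.linalg.eigvals(np.linalg.solve(g, rel)).real)

# Prediction from sigma (x) rho^{-1}: all ratios sigma_i / rho_j.
expected = np.sort(np.outer(np.linalg.eigvalsh(sigma),
                            1.0 / np.linalg.eigvalsh(rho)).ravel())
print(spec)
print(expected)
```

Setting `sigma = rho` in this sketch reduces it to the modular operator, whose eigenvalues are then the ratios $\rho_i / \rho_j$.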

Spectrum of the relative modular operator
We define the eigenvalues of the relative modular operator through the eigenvalue equation
$$\Delta(\psi|\phi) \sum_i c_i |A_i\rangle = \lambda \sum_i c_i |A_i\rangle.$$
The spectrum of possible values of $\lambda$ can be computed by going to an orthonormal basis, just as we described for the modular operator. Once again, this spectrum does not depend on the choice of basis for $\mathcal{A}$. We will not repeat the proof of this claim, since this discussion is almost identical to the discussion for the modular operator above.

Relationship between the eigenvalues of ∆(ψ|φ) and ∆(φ|ψ)
We now show that the spectrum of $\Delta(\psi|\phi)$ is the inverse of the spectrum of $\Delta(\phi|\psi)$:
$${\rm SP}(\Delta(\psi|\phi)) = {\rm SP}(\Delta(\phi|\psi)^{-1}).$$
In the special case SC, this property is obvious from (3.9) but we will show that it holds more generally.
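To see this we note that the maps $S_{(\psi|\phi)} : \mathcal{H}_\psi \to \mathcal{H}_\phi$ and $S_{(\phi|\psi)} : \mathcal{H}_\phi \to \mathcal{H}_\psi$ satisfy
$$S_{(\phi|\psi)} S_{(\psi|\phi)} = 1_\psi, \qquad S_{(\psi|\phi)} S_{(\phi|\psi)} = 1_\phi, \qquad (3.11)$$
where the subscript on 1 distinguishes the identity operators in the two spaces. It follows that
$$S^\dagger_{(\psi|\phi)} S^\dagger_{(\phi|\psi)} = 1_\psi, \qquad S^\dagger_{(\phi|\psi)} S^\dagger_{(\psi|\phi)} = 1_\phi.$$
Some simple manipulations of these identities lead to
$$S_{(\psi|\phi)} S^\dagger_{(\psi|\phi)} = \left( S^\dagger_{(\phi|\psi)} S_{(\phi|\psi)} \right)^{-1} = \Delta(\phi|\psi)^{-1},$$
where, in the last step, we used the definition (3.8). But this means that
$$\Delta(\psi|\phi) S_{(\phi|\psi)} = S^\dagger_{(\psi|\phi)} = S_{(\phi|\psi)} S_{(\psi|\phi)} S^\dagger_{(\psi|\phi)} = S_{(\phi|\psi)} \Delta(\phi|\psi)^{-1}. \qquad (3.12)$$
The relation (3.12) now tells us that given an eigenvector of $\Delta(\phi|\psi)^{-1}$ that satisfies
$$\Delta(\phi|\psi)^{-1} |A\rangle_\phi = \lambda |A\rangle_\phi,$$
we have an eigenvector of $\Delta(\psi|\phi)$ with the same eigenvalue:
$$\Delta(\psi|\phi) S_{(\phi|\psi)} |A\rangle_\phi = S_{(\phi|\psi)} \Delta(\phi|\psi)^{-1} |A\rangle_\phi = \lambda S_{(\phi|\psi)} |A\rangle_\phi,$$
where in the last step we have assumed that $\lambda \in \mathbb{R}^+$ and so it commutes with $S_{(\phi|\psi)}$.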
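The reciprocity of the two spectra can be checked numerically when the observables do not form an algebra. This is a minimal sketch under the same illustrative assumptions as before: states are random full-rank density matrices on an auxiliary space, the observables span $\{1, A, A^\dagger\}$, and $\Delta(\psi|\phi)$ has matrix elements $\phi(A_j A_i^\dagger)$ with respect to the metric of $\mathcal{H}_\psi$. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
d = 3


def random_state():
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    r = m @ m.conj().T
    return r / np.trace(r).real


def gram(state, ops):
    """Metric of the little Hilbert space built on `state`."""
    return np.array([[np.trace(state @ x.conj().T @ y) for y in ops] for x in ops])


def two_point(state, ops):
    """Matrix elements state(A_j A_i^dagger)."""
    return np.array([[np.trace(state @ y @ x.conj().T) for y in ops] for x in ops])


def rel_spectrum(first, second, ops):
    """Spectrum of Delta(first|second), an operator on the little space of `first`."""
    mat = np.linalg.solve(gram(first, ops), two_point(second, ops))
    return np.sort(np.linalg.eigvals(mat).real)


rho, sigma = random_state(), random_state()
a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
ops = [np.eye(d), a, a.conj().T]

spec_psi_phi = rel_spectrum(rho, sigma, ops)  # SP(Delta(psi|phi))
spec_phi_psi = rel_spectrum(sigma, rho, ops)  # SP(Delta(phi|psi))
print(spec_psi_phi)
print(np.sort(1.0 / spec_phi_psi))
```

The two printed arrays should coincide up to rounding, even though the two operators act on different little Hilbert spaces.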

Relationship between the relative modular and modular operators
It is clear from their respective definitions that the modular operator and the relative modular operator are related through ∆ ψ = ∆(ψ|ψ).
It is also clear that the state $\phi$ is equal to the state $\psi$ if and only if
$$\Delta_\psi = \Delta(\psi|\phi). \qquad (3.13)$$
We will also need one additional property in what follows. The spectrum of ∆ ψ ∆(ψ|φ) −1 is the same as the spectrum of ∆(φ|ψ)∆ φ −1 . This is natural when we consider the special case (3.9). However, the result holds more generally and we can demonstrate it as follows.
First we note that as can be easily checked through their action on a general state. Correspondingly, we also have and using the fact that ∆(ψ|φ) −1 = S (φ|ψ) S † (φ|ψ) , which follows from (3.11), we find that But now we see that Using the fact that the spectrum of the operator is left unchanged by a similarity transform, this result easily extends into the following result.
for arbitrary values of x, y.

Measures of Distance and Entanglement
We have now set the stage for defining quantum information measures for sets. In quantum information theory, an important role in defining measures of information is played by a notion of distance between states. When A is an algebra, the relative entropy is a commonly used notion of distance between states. However, we describe how some simple attempts to directly generalize the notion of relative entropy fail. Then we describe some new measures of distance between states for sets of observables and show that these measures meet our requirements.
We will look for a measure of distance $D_{\mathcal{A}}(\psi, \phi)$ that depends on the observables at hand (two-point functions of elements of $\mathcal{A}$) and satisfies a set of properties. The significance of these properties is that it was shown in [21] that a measure of distance that satisfies properties (1)-(3) can be used to define a good measure of bipartite and multipartite entanglement in a sense that we will review below.
We demand the following properties from our distance measure D A (ψ, φ).

1. Basis independence: A very basic property, that we will demand from all distance measures, is that they should not depend on the basis chosen for $\mathcal{A}$.

2. Specificity: In general, we would like the distance to be positive and to vanish only for equal states: $D_{\mathcal{A}}(\psi, \phi) \geq 0$, with $D_{\mathcal{A}}(\psi, \phi) = 0 \iff \psi = \phi$. Here equality between states means that one-point and two-point functions of elements of $\mathcal{A}$ are the same.

3. Monotonicity: Reducing the number of observables with which one probes the state should make states less distinguishable. Therefore if $\mathcal{B} \subset \mathcal{A}$ then $D_{\mathcal{B}}(\psi, \phi) \leq D_{\mathcal{A}}(\psi, \phi)$.

4. Insularity: The distance between two states should not change if we simply add on a spectator system. If we write $\mathcal{A} = \mathcal{A}_1 \otimes \mathcal{A}_2$ as in (2.6) and we consider states $\psi = \psi_1 \otimes \psi_2$ and $\phi = \phi_1 \otimes \psi_2$ (where product states are defined in (2.7)) then $D_{\mathcal{A}}(\psi, \phi) = D_{\mathcal{A}_1}(\psi_1, \phi_1)$. We call property (4) "insularity" because for the distance measure to be meaningful it should not "care" about the state of the rest of the Universe. From a physical perspective, this is an obvious property that a measure of distance must obey but we will see that it is surprisingly effective in ruling out several plausible distance-measures.

It is often useful to have a measure of distance that satisfies two additional nice properties: additivity and finiteness. We emphasize that these properties are often not necessary; indeed, even in the algebraic setting, the relative entropy is not finite.

5. Additivity: For a product state as defined in (2.7), we would like $D_{\mathcal{A}}(\psi_1 \otimes \psi_2, \phi_1 \otimes \phi_2) = D_{\mathcal{A}_1}(\psi_1, \phi_1) + D_{\mathcal{A}_2}(\psi_2, \phi_2)$. This is useful, particularly if we use the distance measure to define a notion of entropy, which needs to be extensive.

6. Finiteness: This is simply the property that the distance between any two states is bounded above: $\exists K$ such that $D_{\mathcal{A}}(\psi, \phi) < K, \ \forall \psi, \phi$.

Distance and Entanglement
We now explain the relationship between measures of distance and entanglement following [21]. We will need to introduce additional structure on the space of observables to make sense of the notion of entanglement. In this subsection, we assume that the set of accessible observables satisfies a direct-product structure as in (2.6). This helps us define a notion of bipartite entanglement. We set this measure $E_{\mathcal{A}}(\psi)$ to be
$$E_{\mathcal{A}}(\psi) = \inf_{\phi \in D_{\rm sep}} D_{\mathcal{A}}(\psi, \phi), \qquad (4.1)$$
where $\phi$ runs over the set of separable states, $D_{\rm sep}$, as defined in (2.8). A notion of multipartite entanglement can be defined along similar lines.

Intuitively speaking, this measure is a generalization of the "mutual information." We remind the reader that the mutual information can be defined as the relative entropy between a state and the corresponding product state. More precisely, given a state $\psi$, we define an associated product state $\psi_{\rm prod}$ by $\psi_{\rm prod}(A_1 A_2) = \psi(A_1) \psi(A_2)$ for $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$. Then, the mutual information is just defined as the relative entropy between these two states: $S^{ar}_{\mathcal{A}}(\psi|\psi_{\rm prod})$. So the mutual information measures the distance between a state and a specific separable state (the associated product state), using a specific notion of distance (the relative entropy). The definition (4.1) has two differences. First, it does not commit itself to using a particular separable state since it may happen that the product state is not the closest separable state. Second, it allows us to use an arbitrary notion of distance and this is important for us since, as we explain below, we are unable to satisfactorily generalize the relative entropy when $\mathcal{A}$ is not an algebra.
It was shown in [21] that the measure of entanglement defined through (4.1) satisfies the following desirable properties.
1. It measures only quantum correlations. This is because some correlations between observations in $\mathcal{A}_1$ and $\mathcal{A}_2$ can simply be explained through classical probabilistic physics. The states that have only such correlations are precisely the set of separable states. But we see that $\psi \in D_{\rm sep} \iff E_{\mathcal{A}}(\psi) = 0$.
So, E A (ψ) gives us a measure of quantum correlations.

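2. $E_{\mathcal{A}}(\psi)$ is invariant under local unitary transformations.
Here we assume that $\exists U_1 \in \mathcal{A}_1$ and $\exists U_2 \in \mathcal{A}_2$ such that $U_1 A_1 U_1^\dagger \in \mathcal{A}_1, \ \forall A_1 \in \mathcal{A}_1$ and $U_2 A_2 U_2^\dagger \in \mathcal{A}_2, \ \forall A_2 \in \mathcal{A}_2$. Then the observer can act with $U_1$ and $U_2$ and again make measurements in $\mathcal{A}_1$ and $\mathcal{A}_2$. More precisely, we change the state to $U_1 U_2(\psi)$, which is defined by evaluating $\psi$ on observables conjugated by $U_1 U_2$. Now we see that if $\phi \in D_{\rm sep}$ then $U_1 U_2(\phi) \in D_{\rm sep}$ and moreover that $D_{\mathcal{A}}(U_1 U_2(\psi), U_1 U_2(\phi)) = D_{\mathcal{A}}(\psi, \phi)$, which follows from Property (1). So $E_{\mathcal{A}}(U_1 U_2(\psi)) = E_{\mathcal{A}}(\psi)$ and therefore, as expected, entanglement is unchanged by local unitary transformations.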
3. The entanglement between two subsystems is not changed by simply adding on a separate auxiliary system, and the property (4) of the distance ensures this.
4. Finally, we can consider LOCC operations as discussed in section 2.3.1. We see that entanglement can only decrease under such operations. This is because, in evaluating $E_{\mathcal{A}}(L(\psi))$, if the infimum of (4.1) is attained at $\phi$ then $E_{\mathcal{A}}(L(\psi)) \leq D_{\mathcal{A}}(L(\psi), L(\phi))$, since $L(\phi) \in D_{\rm sep}$. But since, as explained in section 2.3.1, the LOCC operation can be achieved by a simple basis-transformation in an enlarged space with ancillary observables, $D_{\mathcal{A}}(L(\psi), L(\phi)) \leq D_{\mathcal{A} \otimes \mathcal{A}_{\rm anc}}(L(\psi \otimes \psi_{\rm anc}), L(\phi \otimes \psi_{\rm anc})) = D_{\mathcal{A}}(\psi, \phi)$.
Here, in the first step we used the monotonicity of the distance function and in the next step we used the basis-independence and insularity of the distance function. Therefore we see that E A (L(ψ)) ≤ E A (ψ).
This is a pleasing result since it shows that entanglement is fundamentally a quantum resource that can only be destroyed and not created by two observers acting locally and communicating classically.
So we see that an appropriate measure of distance and the notion of separable states is essential to quantifying entanglement.

Distance measures
In the rest of the section, we now discuss measures of distance that satisfy the properties above.
When A is an algebra, the relative entropy is a very useful notion of distance. So, we will first discuss some attempts to construct analogues of the relative entropy by generalizing a construction due to Araki [7]. The measures constructed in this manner reduce to the relative entropy when A is an algebra but when A is not an algebra they either fail to satisfy property (2) (specificity) or else they fail to satisfy property (3) (monotonicity). Therefore, in subsection 4.2.2, we then construct new measures of distance that satisfy all the necessary properties (1) (basis independence), (2) (specificity), (3) (monotonicity) and (4) (insularity) . One of these measures -the "normed entropy" -also satisfies property (5) (additivity) although it does not satisfy property (6) (finiteness). We also find a large class of measures that do not satisfy property (5) but satisfy property (6).

Generalizations of the relative entropy
When $\mathcal{A}$ is an algebra, Araki [7] showed that the usual relative entropy is just given by
$$S^{ar}_{\mathcal{A}}(\psi|\phi) = -\langle 1 | \log \Delta(\psi|\phi) | 1 \rangle. \qquad (4.2)$$
It is easy to verify this formula under the assumptions of SC, when $\mathcal{A}$ is an algebra and when $\psi$ and $\phi$ can be represented by density matrices $\rho$ and $\sigma$ respectively. In that situation recall that the standard definition of the relative entropy is
$$S^{ar}_{\mathcal{A}}(\psi|\phi) = {\rm Tr}\left( \rho \log(\rho) - \rho \log(\sigma) \right). \qquad (4.3)$$
Using the expressions for the density matrices in (3.10) it is easy to see that this is precisely the same as (4.2). We now consider the quantity defined by (4.2) when $\mathcal{A}$ is not an algebra. We will call this quantity the "Araki relative entropy" and check whether it obeys the various properties that one expects of the relative entropy. We will follow the approach of [22].
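The agreement between the Araki form and the standard relative entropy can be checked numerically when the observables form the full matrix algebra. The sketch below assumes the reading $-\langle 1|\log\Delta(\psi|\phi)|1\rangle$ of (4.2) and builds the relative modular operator only from two-point functions, orthonormalizing the little Hilbert space with a Cholesky factor; the helper `herm_log` and the random states are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(6)
d = 2


def random_state():
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    r = m @ m.conj().T
    return r / np.trace(r).real


def herm_log(mat):
    """Logarithm of a positive-definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(mat)
    return (v * np.log(w)) @ v.conj().T


rho, sigma = random_state(), random_state()

# Matrix units E_ij, ordered so that E_ij sits at flat index i * d + j.
ops = []
for i in range(d):
    for j in range(d):
        e = np.zeros((d, d), dtype=complex)
        e[i, j] = 1.0
        ops.append(e)

g = np.array([[np.trace(rho @ x.conj().T @ y) for y in ops] for x in ops])
rel = np.array([[np.trace(sigma @ y @ x.conj().T) for y in ops] for x in ops])

# Orthonormal coordinates u = L^dagger c, where g = L L^dagger.
chol = np.linalg.cholesky(g)
l_inv = np.linalg.inv(chol)
rel_orth = l_inv @ rel @ l_inv.conj().T
rel_orth = (rel_orth + rel_orth.conj().T) / 2  # remove rounding asymmetry

# The identity operator is the sum of the diagonal matrix units.
c_one = np.eye(d).ravel().astype(complex)
u_one = chol.conj().T @ c_one

araki = -np.vdot(u_one, herm_log(rel_orth) @ u_one).real
standard = np.trace(rho @ (herm_log(rho) - herm_log(sigma))).real
print(araki, standard)
```

Nothing in the first computation uses the density matrices except through expectation values, which is what allows the same expression to be evaluated for sets of observables that are not algebras.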

Proof of monotonicity of the Araki relative entropy
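We now show that the Araki relative entropy, (4.2), is monotonic under projections of the set of observables. This means that if we take $\mathcal{B} \subset \mathcal{A}$, where $\mathcal{B}$ is also a linear space closed under the adjoint operation, then we find that
$$S^{ar}_{\mathcal{B}}(\psi|\phi) \leq S^{ar}_{\mathcal{A}}(\psi|\phi). \qquad (4.4)$$
To prove this relation we first note that the relative modular operator, when the set of observables is $\mathcal{B}$, can be obtained from the relative modular operator defined on $\mathcal{A}$ through
$$\Delta_{\mathcal{B}}(\psi|\phi) = P_{\mathcal{B}} \, \Delta(\psi|\phi) \, P_{\mathcal{B}},$$
where $P_{\mathcal{B}}$ is a projector on $\mathcal{H}_\psi$ that projects onto the little Hilbert space generated by $\mathcal{B}$.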
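Then we have $S^{ar}_{\mathcal{B}}(\psi|\phi) = -\langle 1 | \log\left( P_{\mathcal{B}} \Delta(\psi|\phi) P_{\mathcal{B}} \right) | 1 \rangle$. To compare this quantity with the original Araki relative entropy we adapt the argument of [22] and use the following identity. Let $X$ be a bounded invertible matrix. Then, if $P$ is a projector and $Q = 1 - P$,
$$P X^{-1} P = P (P X P)^{-1} P + P X^{-1} Q \, (Q X^{-1} Q)^{-1} \, Q X^{-1} P, \qquad (4.5)$$
where $(PXP)^{-1}$ and $(Q X^{-1} Q)^{-1}$ denote inverses on the ranges of $P$ and $Q$ respectively. What is important about this decomposition is that when $X$ is a positive operator, we have
$$P X^{-1} P \geq P (P X P)^{-1} P. \qquad (4.6)$$
So, inverting and then projecting leads to a larger operator than projecting and inverting. Since we can write (for any bounded, positive and invertible operator)
$$\log X = \int_0^\infty dt \left[ \frac{1}{1+t} - \frac{1}{t + X} \right],$$
we see that the inequality (4.6), applied to $t + X$, also implies that $P \log(X) P \leq P \log(P X P) P$.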
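In our case, we note that $P_{\mathcal{B}} |1\rangle = |1\rangle$ and therefore we immediately see that
$$S^{ar}_{\mathcal{B}}(\psi|\phi) = -\langle 1 | \log\left( P_{\mathcal{B}} \Delta(\psi|\phi) P_{\mathcal{B}} \right) | 1 \rangle \leq -\langle 1 | \log \Delta(\psi|\phi) | 1 \rangle = S^{ar}_{\mathcal{A}}(\psi|\phi),$$
which proves the relation (4.4). We see that the nonnegativity of (4.2) also follows immediately. Since the trivial set of observables spanned by the identity operator, $\{1\}$, is always a subset of $\mathcal{A}$ and since (4.2) evaluates to 0 for this set of observables we see that we always have $S^{ar}_{\mathcal{A}}(\psi|\phi) \geq 0$.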
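The two operator inequalities used in this proof can be tested on random matrices. The sketch below takes a random positive-definite $X$ and a coordinate projector $P$ onto the first $k$ basis vectors, so that compressing to the range of $P$ is just taking the upper-left block. It checks that $P X^{-1} P - (PXP)^{-1}$ and $\log(PXP) - P \log(X) P$ are both positive semi-definite on that range; the sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 6, 3

# Random positive-definite matrix X.
m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
x = m @ m.conj().T + 0.1 * np.eye(n)


def herm_fn(mat, fn):
    """Apply a scalar function to a Hermitian matrix through its eigendecomposition."""
    w, v = np.linalg.eigh(mat)
    return (v * fn(w)) @ v.conj().T


# Compression to the range of P is the upper-left k x k block.
inv_gap = np.linalg.inv(x)[:k, :k] - np.linalg.inv(x[:k, :k])
log_gap = herm_fn(x[:k, :k], np.log) - herm_fn(x, np.log)[:k, :k]

min_inv_gap = np.linalg.eigvalsh(inv_gap).min()
min_log_gap = np.linalg.eigvalsh(log_gap).min()
print(min_inv_gap, min_log_gap)  # both should be non-negative up to rounding
```

The first gap is a Schur-complement statement, and the second follows from it through the integral representation of the logarithm.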

Failure of the Araki relative entropy to distinguish some states
However, we now find that the Araki relative entropy cannot always distinguish states, when $\mathcal{A}$ is not an algebra. The Araki relative entropy vanishes whenever one-point functions are equal,
$$S^{ar}_{\mathcal{A}}(\psi|\phi) = 0 \iff \psi(A_i) = \phi(A_i), \ \forall A_i \in \mathcal{A}, \qquad (4.7)$$
but this does not imply that $\psi = \phi$ since we may still have $\psi(A_i A_j) \neq \phi(A_i A_j)$ for some two-point functions.
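To prove (4.7) we again follow [22] and take $\mathcal{B}$ to consist of only the identity operator (and its c-number multiples). Then $P_{\mathcal{B}} |A_i\rangle = \psi(A_i) |1\rangle$. Since (4.2) vanishes for this choice of $\mathcal{B}$, the Araki relative entropy on $\mathcal{A}$ can vanish only if the inequality used in the proof of monotonicity is saturated. Moreover, with $X = t + \Delta(\psi|\phi)$ and applying (4.5), we see that saturation requires
$$P_{\mathcal{B}} X^{-1} Q_{\mathcal{B}} \, (Q_{\mathcal{B}} X^{-1} Q_{\mathcal{B}})^{-1} \, Q_{\mathcal{B}} X^{-1} P_{\mathcal{B}} = 0, \qquad Q_{\mathcal{B}} = 1 - P_{\mathcal{B}},$$
for all values of $t$. But then acting on $|1\rangle$ we must have $Q_{\mathcal{B}} X^{-1} |1\rangle = 0$, which can only happen if $X^{-1}|1\rangle \propto |1\rangle$ or, multiplying both sides by $X$, if $X|1\rangle \propto |1\rangle$. The constant of proportionality can be set by sandwiching the state on the left with $\langle 1|$ and we see that $S^{ar}_{\mathcal{A}}(\psi|\phi) = 0 \iff \Delta(\psi|\phi)|1\rangle = |1\rangle$. Sandwiching this expression with $\langle A|$ for arbitrary $A$ leads immediately to (4.7).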
To see a simple example of how non-identical states can have the same relative entropy consider a simple case where a basis for A is formed by three elements {1, A, A † } with A † = A. Take the states and a second state Then it is clear that the states are not equal. But we see that, in the orthonormal basis |1 , 1 |A † , the relative modular operator is given by It is clear that 1| log(∆(ψ|φ))|1 = 0, although the states are clearly unequal.
Note that if A had been an algebra, this would not have happened. When A is an algebra the state is completely characterized by its one-point functions. Indeed the difference between one-and higher-point functions is moot since all products of operators correspond to another operator in the algebra. From (4.7) we see then that when A is an algebra, the relative entropy vanishes only when the states are equal. This is not in contradiction with the result on monotonicity which simply tells us that further projections will not reduce the entropy any further and keep it at 0.
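A concrete instance of this failure can be worked out on a qubit. The example below is a hypothetical illustration and is not the example given in the text: it takes $A = |0\rangle\langle 1|$, the set spanned by $\{1, A, A^\dagger\}$, and two diagonal states with different populations. All one-point functions agree, some two-point functions differ, and the quantity $-\langle 1|\log\Delta(\psi|\phi)|1\rangle$ still vanishes.

```python
import numpy as np

# Hypothetical qubit example: A = |0><1| and two diagonal states.
a = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
ops = [np.eye(2, dtype=complex), a, a.conj().T]
rho = np.diag([0.7, 0.3]).astype(complex)    # state psi
sigma = np.diag([0.4, 0.6]).astype(complex)  # state phi

one_point_psi = np.array([np.trace(rho @ x) for x in ops])
one_point_phi = np.array([np.trace(sigma @ x) for x in ops])

# A two-point function that tells the states apart: <A^dagger A>.
two_point_psi = np.trace(rho @ a.conj().T @ a).real
two_point_phi = np.trace(sigma @ a.conj().T @ a).real

# Relative modular operator in an orthonormal basis of the little Hilbert space.
g = np.array([[np.trace(rho @ x.conj().T @ y) for y in ops] for x in ops])
rel = np.array([[np.trace(sigma @ y @ x.conj().T) for y in ops] for x in ops])
chol = np.linalg.cholesky(g)
l_inv = np.linalg.inv(chol)
rel_orth = l_inv @ rel @ l_inv.conj().T
rel_orth = (rel_orth + rel_orth.conj().T) / 2

w, v = np.linalg.eigh(rel_orth)
log_rel = (v * np.log(w)) @ v.conj().T

# |1> has coefficient vector (1, 0, 0) in the basis {1, A, A^dagger}.
u_one = chol.conj().T @ np.array([1.0, 0.0, 0.0], dtype=complex)
araki = -np.vdot(u_one, log_rel @ u_one).real

print(one_point_psi, one_point_phi)  # identical one-point functions
print(two_point_psi, two_point_phi)  # different two-point functions
print(araki)                         # vanishes up to rounding
```

In this example the relative modular operator is diagonal with $|1\rangle$ as an eigenvector of eigenvalue 1, which is the condition $\Delta(\psi|\phi)|1\rangle = |1\rangle$ derived above.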
We have not been able to find a distance measure that reduces to the relative entropy when $\mathcal{A}$ is an algebra but also satisfies properties (1)-(5) when $\mathcal{A}$ is not an algebra. We should mention one other obvious generalization of the relative entropy expressed in terms of modular operators.

S ar
, (4.8) where the traces are taken only over the little Hilbert space in an orthonormal basis. We can check that the expression (4.8) reduces to the relative entropy when A is an algebra by using the expression (3.9). It also uniquely distinguishes states. This can be proved by recognizing that (4.8) has the same form as the expression for the relative entropy in terms of two density matrices and therefore it can only vanish when ∆ ψ /Tr(∆ ψ ) = ∆(ψ|φ)/Tr(∆(ψ|φ)). However, since 1|∆ ψ |1 = 1|∆(ψ|φ)|1 , we see that this in fact implies ∆ ψ = ∆(ψ|φ). Thus the distance measure in (4.8) satisfies property 2 . It is easy to see that also satisfies properties 1 (basis-independence), 4 (insularity) and 5 (additivity). However, this expression fails to satisfy property 3 -that of monotonicity.
We feel that such formulas deserve some further attention, and perhaps a refinement of (4.8) or (4.2) can be engineered to satisfy all of properties (1) - (5) and also reduce to the relative entropy when A is an algebra.
We now turn to other measures of the distance that do satisfy the necessary properties but do not reduce to the relative entropy.

The normed entropy and other distance measures
We now describe some measures of distance that satisfy all the properties 1 to 4 including one measure -the normed entropy -that is also additive (property 5).
The fundamental fact that we will exploit is (3.13): two states are equal only if the modular operator of one is equal to the relative modular operator to the other state. So, we will use the difference between these two operators as a measure of the distance between the two states. Accordingly, we define
$$X = \Delta_\psi^{-1/2} \, \Delta(\psi|\phi) \, \Delta_\psi^{-1/2}. \qquad (4.9)$$
If $X = 1$, then the modular and relative modular operators coincide, and so the states are the same. We want to define a measure of distance based on the spectrum of $X$.
One might imagine, at first glance, that this is simply a matter of comparing $X$ to the identity and using any of the standard matrix norms. However, we note that there is a tension between properties (3) (monotonicity) and (4) (insularity). Insularity implies that the measure of distance should not change if we take $X \to X \otimes 1$ where 1 denotes the identity matrix in an arbitrary number of dimensions. On the other hand, a simple way to satisfy monotonicity is to use a norm that always decreases under $X \to P X P$, where $P$ is a projector.$^{3}$ To see the tension between these two requirements, consider the potential distance measure ${\rm Tr}((X - 1)^2)$. This satisfies specificity and monotonicity. But clearly, it is not insular. We may attempt to correct for this by dividing by the dimension: $\frac{1}{D} {\rm Tr}((X - 1)^2)$. But then we lose monotonicity.
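A two-dimensional toy example makes this tension explicit. The sketch below uses $X = {\rm diag}(3, 1)$, an arbitrary choice made only for illustration, and compares the unnormalized and dimension-normalized candidates under tensoring with an identity and under compression to a one-dimensional subspace.

```python
import numpy as np


def hs_distance(x):
    """Candidate measure Tr((X - 1)^2)."""
    diff = x - np.eye(x.shape[0])
    return float(np.trace(diff @ diff).real)


def normalized_hs_distance(x):
    """Candidate measure (1/D) Tr((X - 1)^2)."""
    return hs_distance(x) / x.shape[0]


x = np.diag([3.0, 1.0])
x_with_spectator = np.kron(x, np.eye(4))  # X -> X (x) 1
x_compressed = x[:1, :1]                  # X -> P X P on the range of P

print(hs_distance(x), hs_distance(x_with_spectator))
# 4.0 16.0 : the unnormalized measure is not insular
print(normalized_hs_distance(x), normalized_hs_distance(x_with_spectator))
# 2.0 2.0 : dividing by D restores insularity ...
print(normalized_hs_distance(x), normalized_hs_distance(x_compressed))
# 2.0 4.0 : ... but the measure now grows under compression
```

Operator norms avoid both problems because they depend only on the extreme eigenvalues, which are unchanged by tensoring with an identity and are bounded by their original values under compression.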
One way out is to use the operator norms $\|X\|$ and $\|X^{-1}\|$. Since $X$ is Hermitian and positive, these two norms simply correspond to the largest eigenvalue of $X$ and the inverse of its smallest eigenvalue. Note, by the result proved in (3.14), that
$$\|X\| = \|Y^{-1}\|, \qquad \|X^{-1}\| = \|Y\|, \qquad (4.10)$$
with
$$Y = \Delta_\phi^{-1/2} \, \Delta(\phi|\psi) \, \Delta_\phi^{-1/2}.$$
Furthermore, it is easy to see that $\langle 1 | X | 1 \rangle = 1$ and also that ${}_\phi\langle 1 | Y | 1 \rangle_\phi = 1$, and therefore the largest eigenvalue satisfies $\|X\| \geq 1$, and the smallest eigenvalue satisfies $\|X^{-1}\|^{-1} \leq 1$. The condition for states to be equal, which is that $X = 1$, then becomes
$$\|X\| = \|X^{-1}\| = 1.$$
This just states that both the largest and smallest eigenvalues of $X$ become 1.

The fact that any measure based on these operator norms will satisfy insularity (property (4)) is obvious since
$$\|X \otimes 1\| = \|X\|, \qquad \|X^{-1} \otimes 1\| = \|X^{-1}\|.$$
Furthermore, these operator norms behave simply under contractions of $\mathcal{A}$. Using the inequality (4.6) and using a similarity transformation to move the power of the modular operator entirely to the left we see that for $\mathcal{B} \subset \mathcal{A}$ the norm $\|X\|$ computed with $\mathcal{B}$ is no larger than the norm computed with $\mathcal{A}$. Therefore under a contraction of $\mathcal{A}$, we see that $\|X\|$ decreases. By applying the same logic to the operator $Y$, and using (4.10), we see that $\|X^{-1}\|$ also decreases under a contraction of $\mathcal{A}$. Therefore the smallest eigenvalue of $X$, $\|X^{-1}\|^{-1}$, increases.

So we conclude that any measure of distance specified by a nonnegative function of two variables,
$$D_{\mathcal{A}}(\psi, \phi) = D(\|X\|, \|X^{-1}\|),$$
will satisfy the conditions of specificity, insularity and monotonicity provided that the function satisfies $D(x, y) = 0 \iff x = y = 1$, and does not decrease when either argument increases, $\partial D / \partial x \geq 0$ and $\partial D / \partial y \geq 0$. In fact, the simplest choice for this function is just $D(x, y) = x - \frac{1}{y}$, which translates into $D_{\mathcal{A}}(\psi, \phi) = \|X\| - \frac{1}{\|X^{-1}\|}$. If we additionally demand additivity (property (5)) then this translates to the statement that $D(x_1 x_2, y_1 y_2) = D(x_1, y_1) + D(x_2, y_2)$ and a choice that satisfies this condition is the "normed entropy". This is given by
$$S_{\mathcal{A}}(\psi|\phi) = \log \|X\| + \log \|X^{-1}\|. \qquad (4.11)$$
Using (4.10) we can also write the normed entropy as $S_{\mathcal{A}}(\psi|\phi) = \log \|Y\| + \log \|Y^{-1}\|$. The arguments above immediately tell us that the normed entropy is symmetric in its arguments, specific, monotonic, insular and additive.
However, it is not finite. This can be easily seen by considering a pure state. For pure states, in the appropriate basis we see that several eigenvalues of the modular operator are zero. Thus, unless the second state also has the same set of zero-norm states in its little Hilbert space, the normed entropy diverges when one uses it to measure the distance between one pure state and another state.
However, if we do not insist on additivity, then it is not difficult to define a finite measure of distance by simply choosing another function $D(x, y)$. For instance, we may take the measure of distance to be the one given in (4.12). This measure of distance varies between (0, 1) and vanishes only when $\|X\| = \|X^{-1}\| = 1$. Since $\|X\| \geq 1$ and $\|X^{-1}\| \geq 1$, this distance decreases with decreasing $\|X\|$ and decreasing $\|X^{-1}\|$. So, it is specific, insular and monotonic, and it is also clearly finite. However, it is not additive. The specific choice of the function in (4.12) is clearly not unique but is motivated by simplicity, and a desire to treat reciprocal eigenvalues in $X$ symmetrically.
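The normed entropy is straightforward to evaluate from two-point functions. The sketch below assumes that $X$ is the modular operator's inverse square root sandwiching the relative modular operator, so that its eigenvalues coincide with those of the matrix $T_\psi^{-1} T_\phi$, where $(T_\psi)_{ij} = \psi(A_j A_i^\dagger)$ and $(T_\phi)_{ij} = \phi(A_j A_i^\dagger)$. The states, the set spanned by $\{1, A, A^\dagger\}$, the contraction to $\{1, A + A^\dagger\}$ and the helper names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 3


def random_state():
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    r = m @ m.conj().T
    return r / np.trace(r).real


def two_point(state, ops):
    """T_ij = state(A_j A_i^dagger)."""
    return np.array([[np.trace(state @ y @ x.conj().T) for y in ops] for x in ops])


def normed_entropy(rho, sigma, ops):
    """log||X|| + log||X^{-1}||, with SP(X) obtained from T_psi^{-1} T_phi."""
    lam = np.linalg.eigvals(np.linalg.solve(two_point(rho, ops), two_point(sigma, ops))).real
    return float(np.log(lam.max()) - np.log(lam.min()))


rho, sigma = random_state(), random_state()
a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
ops_full = [np.eye(d), a, a.conj().T]
ops_sub = [np.eye(d), a + a.conj().T]  # a contraction of the set of observables

s_same = normed_entropy(rho, rho, ops_full)
s_full = normed_entropy(rho, sigma, ops_full)
s_swapped = normed_entropy(sigma, rho, ops_full)
s_sub = normed_entropy(rho, sigma, ops_sub)
print(s_same, s_full, s_swapped, s_sub)
```

The output illustrates the properties claimed above: the distance from a state to itself vanishes, the measure is symmetric in its two arguments, and it does not increase when the set of observables is contracted.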

Summary of distance measures
In this section, we first described how an appropriate measure of distance could be used to induce measures of entanglement on the set of states.
We then discussed various notions of distance. For the convenience of the reader, the table below provides a summary of the properties of the various measures of distance that we have considered. As we see, only the normed entropy and the χ-distance satisfy all of the necessary properties 2, 3, 4. The normed entropy is additive (property (5)) but not finite whereas the χ-distance is finite (property (6)) but not additive. All the measures are independent of the basis chosen for A, and so they automatically satisfy property (1)

Coarse and Fine-Grained Subregion Dualities in AdS/CFT
We now describe some applications of our formalism to the formulation of subregion dualities in the AdS/CFT correspondence.
We should clarify the relationship of our approach to the existing literature on holographic entanglement, following the Ryu-Takayanagi conjecture [23] and its generalization by Hubeny, Rangamani and Takayanagi [24]. The literature largely deals with the question of understanding bulk geometry from boundary entanglement [25]. Here, our perspective is rather different. We are interested in studying quantum information measures directly from the point of view of the bulk gravitational theory.
The question of bulk-entanglement has been studied only in the free-field limit [26,10], where bulk quantities appear as one-loop corrections to the Ryu-Takayanagi formula. (See, for example, [27], for nice reviews and calculations of quantum information measures in free quantum field theories.) Here, we take some initial steps towards clarifying the conceptual meaning of bulk quantum information measures and placing them in a framework where, in principle, there is no obstacle to turning on interactions in Newton's constant.
Accordingly, we consider a large-N conformal field theory that is dual to a bulk theory of quantum gravity in AdS d+1 . Since the boundary theory is at large N , we are in a regime where all curvature scales are large compared to the Planck scale and we additionally assume that the string coupling is small so that curvature scales are also widely separated from the string scale. In this regime, the boundary theory has a natural set of "simple operators" called generalized free-field operators. (In N = 4 SYM, the generalized free-field operators are the single-trace operators at low dimension.) In the discussion below, we will denote such an operator by O(t, x). To lighten the notation, we suppress tensor indices, and other indices required to distinguish between different generalized fields.
We now consider the following questions:

1. Coarse-Grained Subregion Duality Problem: Given a spacetime region $R$ on the boundary, is there a spacetime region $B_C$ in the bulk so that all information about $B_C$ can be obtained through measurements of low-order polynomials of generalized free-field operators in $R$?

2. Fine-Grained Subregion Duality Problem: Given a spacetime region $R$ on the boundary, is there a spacetime region $B$ in the bulk so that all information about $B$ can be obtained through arbitrary measurements in $R$?
In formulating the questions above, we do not demand that R is a causal diamond, but allow it to be an arbitrary region in spacetime. Moreover, we note that the fine-grained subregion duality problem is not reflexive. We do not demand that all information about R can be obtained from B. We now examine the answer to these questions, and note the utility of our quantum information measures in testing these subregion dualities.

Coarse grained subregion dualities
To determine what information about the bulk may be obtained by measuring low-point correlation functions of generalized free-fields, we can organize these correlation functions as one and two-point functions of a set of operators that we can call A R C .
More precisely, we need to do the following. We consider a lattice of points (t i , x i ) ∈ R and then consider the set of polynomials in these operators with an order limited by n coarse . We can choose n coarse to be any number parametrically separated from N . 4 A basis for this set is given by monomials in the generalized free fields.
$$\mathcal{A}^C_R = \text{span of} \, \{ O(t_1, x_1) \ldots O(t_n, x_n) \}, \qquad n \leq n_{\rm coarse}. \qquad (5.1)$$
Note that $\mathcal{A}^C_R$ is not an algebra. The one- and two-point functions of operators in $\mathcal{A}^C_R$ (which translate into correlators with up to $2 n_{\rm coarse}$ insertions of the generalized free-fields) have information about a region in the bulk that we will call $B_C$; this is a "coarse-grained" version of a subregion duality in AdS/CFT. We now describe the geometric structure of $B_C$ more carefully.
First consider a single causal diamond inside $R$ that we call $D$. A causal diamond is specified by two points that are timelike to each other. If we call the later point $P$ and the earlier point $\bar{P}$, then the causal diamond defined by these two points is given by
$$D = \tilde{J}^+(\bar{P}) \cap \tilde{J}^-(P), \qquad (5.2)$$
where $\tilde{J}^+(\bar{P})$ and $\tilde{J}^-(P)$ denote the causal future of $\bar{P}$ and the causal past of $P$ respectively, taken only on the boundary. Now, consider a scalar single-trace operator $O(t, x)$ with dimension $\Delta$, with $(t, x) \in R$. This operator is dual to a bulk field $\phi(t, x, r)$ with mass given by $\Delta(\Delta - d) = m^2$. Here, we have introduced the coordinate $r$ for the radial direction, with $r = \infty$ being the boundary and where the bulk metric diverges as $r^2$ as $r \to \infty$. At large $N$, this bulk field obeys the bulk equation
$$(\Box - m^2) \phi = O\left(\tfrac{1}{N}\right), \qquad (5.3)$$
where the corrections come from interaction terms. We can solve this bulk equation of motion with boundary conditions specified on $D$ as
$$\lim_{r \to \infty} r^{\Delta} \phi(t, x, r) = O(t, x), \qquad (t, x) \in D. \qquad (5.4)$$
This solution simply leads to the standard HKLL kernel which expresses the bulk field as a function of its boundary values [28].$^{5}$ When the bulk geometry is close enough to the AdS vacuum, the solution to (5.3), subject to the boundary conditions (5.4), is valid within the causal wedge of $D$ in the bulk. The causal wedge is defined as
$$C_D = J^+(D) \cap J^-(D), \qquad (5.5)$$
where now $J^+(D)$ and $J^-(D)$ denote the causal future and causal past of a region but now including those points that lie in the bulk and are not on the boundary. For the causal diamond specified by (5.2), note that we also have $C_D = J^+(\bar{P}) \cap J^-(P)$. The result that the bulk equations of motion can be solved within the causal wedge in a general background has not been established rigorously but we believe that it should be true because of the following simple physical intuition: if one draws the future and past null cone from any point in $B_C$ then both these null cones intersect $R$. Therefore, operators in $R$ can "sense" the presence of any excitation within $B_C$ purely through the classical propagation of gravity waves.

4 In any actual calculation, it is also necessary to regulate the local generalized free-fields to turn them into bounded operators. This can be done, for example, by transforming to Fourier space to obtain the modes of these fields, and then cutting off the inverse Fourier transform by writing the "local" operators as a sum over a finite but large number of Fourier modes. We will discuss this further in forthcoming work.

5 As explained in [29,5], to avoid some of the difficulties outlined in [30] it is important to examine this kernel in momentum space rather than position space.
We can now use this to understand the coarse-grained dual of an arbitrary spacetime region on the boundary. First, we consider all causal diamonds that fit inside $R$. Then it is clear that we have
$$R = \bigcup_{D \subseteq R} D.$$
This is because any region $R$ can be completely tiled by boundary causal diamonds. Note that many of the diamonds within $R$ overlap. The causal wedge of any such diamond is defined as $C_D$, as in (5.5).
Since each causal diamond, $D$, in $R$ is precisely dual to the set of operators in $C_D$ in a coarse-grained sense, it is clear that the union of all causal diamonds is dual to the union of all causal wedges. Therefore, the coarse-grained bulk dual of the region $R$, which we denote by $B_C$, is given by
$$B_C = \bigcup_{D \subseteq R} C_D. \qquad (5.6)$$
The only subtlety to keep in mind is that, in general,
$$B_C \neq J^+(R) \cap J^-(R).$$
This is because we do not necessarily have $J^{\pm}(R) \subseteq R$ for arbitrary spacetime regions that are not complete causal diamonds. We remind the reader that what the duality (5.6) means is that the simple operators that are part of $A_R^C$ give information about correlation functions of simple polynomials in bulk fields that live in $B_C$. Note also that we derived (5.6) at large $N$. Even perturbatively in $1/N$, the notion of bulk locality is gauge-dependent in the bulk. However, we believe that it should be possible to choose a gauge so that (5.6) remains valid perturbatively in $1/N$. This conjecture can be checked using our entanglement measures both at large $N$ and perturbatively in $1/N$, as described below.

Information measures in coarse-grained subregion dualities
The entanglement measures that we have described above are tailor-made to check the duality (5.6). Consider two states $\psi$ and $\phi$. We can probe these states both on the boundary, by using operators in $A_R^C$, and through simple bulk operators. Using a truncation procedure analogous to (5.1), we define the set of low-order polynomials in bulk fields to be $A_B^C$. Then we claim that the distance measures between $\psi$ and $\phi$ computed with $A_R^C$ on the boundary agree with those computed with $A_B^C$ in the bulk (5.7). It would be tempting to conjecture that the full spectrum of the bulk and boundary relative modular operators matches. However, this is likely to depend on the details of how precisely we regulate the bulk and the boundary theories. As a result, (5.7) is likely to be more useful in practice.
Note that conventional entanglement measures cannot be applied to (5.6) (except at infinite $N$) since neither the bulk nor the boundary set of operators forms an algebra.
The simplest example of (5.6) is where we restrict $A_R^C$ to consist of just single insertions of single-trace operators. Then our entanglement measures are sensitive to two-point functions, and this is already sufficient to check (5.6) at large $N$. At the next level, we can restrict $A_R^C$ to consist of single insertions of either single-trace or double-trace operators and consider the first non-trivial power of $1/N$ in all correlation functions. Then (5.7) already becomes sensitive to perturbative corrections but can still be calculated, at least in some simple examples. This can be used to check (5.6). We will comment further on this in forthcoming work.
The boundary time band: an example of coarse-grained subregion dualities

As an example of a subregion duality of the kind that we are interested in, we revisit an example first considered in [2]. Consider global AdS$_{d+1}$ with metric
$$ds^2 = -(1 + r^2)\,dt^2 + \frac{dr^2}{1 + r^2} + r^2\, d\Omega_{d-1}^2$$
and a time band on the boundary that covers the entire sphere but extends in time from $-\frac{T}{2}$ to $\frac{T}{2}$, where $T < \pi$. (For more details we refer the reader to [2].) Then this time band can be tiled with multiple overlapping diamonds as shown in Figure 1b. The coarse-grained dual of this time band is the complement of a bulk causal diamond, as shown in Figure 1c, that is bounded by the light-sheets $r = \cot\left(\frac{T}{2} \pm t\right)$. As we will see in the next subsection, the fine-grained dual of this same region is all of AdS!

Fine-grained subregion dualities
We will now enlarge the set of operators from $A_R^C$ to a larger set that we call $A_R^F$. If we denote the set of all boundary operators within $R$ by $A_R$, then the set $A_R^F \subset A_R$. Nevertheless, we show that by considering the set of operators $A_R^F$, we can obtain information about a bulk region $B_F$ that is, in general, larger than $B_C$. The interesting part of this fine-grained duality $R \leftrightarrow B_F$ is that it may violate bulk causality in the sense that there may be points in $B_F$ that are not causally connected to $R$.
To define $A_R^F$, we again consider the set of generalized free-fields. One light generalized free-field that must exist in any conformal field theory is the stress tensor $T_{\mu\nu}$. For notational simplicity we assume that the boundary is either $S^{d-1} \times R$ or $R^{d-1} \times R$ so that $\frac{\partial}{\partial t}$ is a Killing vector on the geometry. We then consider the set of all spacelike slices, $S$, within the region $R$, and to each slice we associate the operator $H\{S\}$ obtained by integrating the stress-tensor component along $\frac{\partial}{\partial t}$ over $S$ with the measure $d\Sigma^\mu$, where $d\Sigma^\mu = \sqrt{h}\, n^\mu\, d^{d-1}x$, $n^\mu$ is the future-directed unit normal to $S$, and $h$ is the induced metric on $S$. Note that if $S$ had been a complete Cauchy slice then $H\{S\}$ would have reduced to the Hamiltonian, $H$, of the theory.

Now, denote the causal completion of $S$ on the boundary by $\hat{S}$. The causal completion is defined as follows. We consider all points that are spacelike to $S$ and denote them by $S'$. Then $\hat{S}$ is the set of all points that are spacelike to $S'$. Now, since the boundary theory is exactly causal, for any Heisenberg operator on the boundary theory that is localized within $\hat{S}$ we have
$$[H\{S\}, O(t, x)] = [H, O(t, x)].$$
This is because $O(t, x)$ commutes with the Hamiltonian density outside $S$. Therefore all operators within $\hat{S}$ can be obtained through time-evolution with $H\{S\}$. In particular an operator at a point $(t + \tau, x)$ can be written as
$$O(t + \tau, x) = e^{i H\{S\} \tau}\, O(t, x)\, e^{-i H\{S\} \tau} \qquad (5.8)$$
$$\phantom{O(t + \tau, x)} = \sum_{n} \frac{(i\tau)^n}{n!} [H\{S\}, [H\{S\}, \ldots [H\{S\}, O(t, x)] \ldots ]], \qquad (5.9)$$
with $n$ nested commutators in the $n$th term. If we cut this series off using $N_c = \mathrm{O}(N)$ terms, and if we also ensure that the operators (5.9) are only used within correlators where the minimum separation between points is parametrically larger than $\mathrm{O}\left(\frac{1}{N}\right)$, then we see that (5.9) gives an excellent approximation to (5.8). We now define $A_R^F$ to be the set of simple polynomials in the operators (5.8). More precisely, we take Note that, using (5.9), $A_R^F$ can be thought of as a combination of the simple polynomials in $A_R^C$ and a set of simple polynomials of a very specific set of complicated polynomials in the elements of $A_R^C$.
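The truncation of the time-evolution series discussed above can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the finite random Hermitian matrices `H` and `O` are hypothetical stand-ins for $H\{S\}$ and a regulated operator, and the function names are ours. It compares the exact conjugation $e^{iH\tau} O e^{-iH\tau}$ with the nested-commutator series cut off after a finite number of terms.

```python
import numpy as np

def random_hermitian(dim, rng):
    """Random Hermitian matrix (illustrative stand-in, not a field-theory operator)."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (m + m.conj().T) / 2

def evolve_exact(H, O, tau):
    """Return e^{i H tau} O e^{-i H tau} using the eigen-decomposition of H."""
    evals, evecs = np.linalg.eigh(H)
    U = evecs @ np.diag(np.exp(1j * evals * tau)) @ evecs.conj().T
    return U @ O @ U.conj().T

def evolve_truncated(H, O, tau, n_terms):
    """Sum the first n_terms terms of sum_n (i tau)^n / n! [H, [H, ... [H, O]]]."""
    term = O.astype(complex)
    total = term.copy()
    for n in range(1, n_terms):
        # Each new term is (i tau / n) times the commutator of H with the previous term.
        term = (1j * tau / n) * (H @ term - term @ H)
        total = total + term
    return total

rng = np.random.default_rng(0)
H = random_hermitian(6, rng)
O = random_hermitian(6, rng)
tau = 0.1

exact = evolve_exact(H, O, tau)
errors = [np.linalg.norm(evolve_truncated(H, O, tau, n) - exact) for n in (2, 5, 10, 20)]
print(errors)
```

In this toy setting the error falls rapidly with the number of retained terms. This is only meant to mirror the statement that the truncated series approximates the exact time evolution; it says nothing about the large-$N$ estimates in the text.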
The bulk dual to the set of operators $A_R^F$ follows immediately from the previous duality. The set of simple operators in each causal completion $\hat{S}$ is dual to a set of simple operators in the bulk region given by $C_{\hat{S}}$, which is defined just as in the previous subsection. Therefore if we define
$$B_F = \bigcup_{S \subset R} C_{\hat{S}}, \qquad (5.10)$$
then the set of operators in $A_R^F$ is dual to a set of simple bulk operators that live on $B_F$. We denote this set by $A_B^F$. It is clear that the boundary region $R$ must probe $B_F$ in any theory of quantum gravity in anti-de Sitter space. This is because all that we have used to establish the form of $B_F$ in (5.10) is the fact that the canonical Hamiltonian is a boundary term [31] and that asymptotic operators commute exactly at spacelike separation.
As we mentioned above, the curious part of the duality $A_R^F \leftrightarrow A_B^F$ is that it may violate bulk causality in the sense that there may be points in $B_F$ that are not causally connected to $R$. For instance, consider the time band shown in Figure 1. Then one spacelike slice that lives within this time band is simply the slice with $t = 0$ on the boundary. The causal completion of this slice is the entire boundary. Therefore the region $B_F$ corresponding to the time band is all of AdS! This is in sharp contrast to the coarse-grained bulk dual region, $B_C$, which is just the region shown in Figure 1c.
This duality also gives us an example where the set of approximately-local operators in a region may not form an algebra. Note that the elements of $A_B^F$ can also be generated by taking complicated polynomials of $A_B^C$. But now we see that these complicated polynomials should be understood as simple field operators in the region $B_F$ that, in general, is larger than $B_C$. Therefore complicated polynomials of approximately-local operators in $B_C$ do not remain meaningfully confined to $B_C$.

Information measures in fine-grained subregion dualities
Note that, as we have defined it above, $A_R^F$ is also not an algebra. However, our quantum information measures can still be applied to the duality above. This duality implies the equality of our distance measures between two states when they are probed by $A_R^F$ and when they are probed by $A_B^F$. Note, however, that we still have $A_R^F \subset A_R$. Therefore, by the monotonicity of the distance measures, the distance between two states probed by $A_R^F$ cannot exceed their distance probed by $A_R$. As we will see below, this may correspond to the fact that the region $B_F$ is still smaller than the full region dual to $A_R$.

Entanglement wedges
Before we close this section, we should note that it is, of course, possible to extend the sets $A_R^C$ and $A_R^F$ into algebras, simply by taking the set of all polynomials in the generalized free-fields. This leads us to the set of all operators in the region $R$, which we have called $A_R$.
The bulk dual to A R has been studied extensively in the literature, and it is generally believed that, for each spacelike slice within R, the bulk dual corresponds to the entanglement wedge of the slice [9]. The entanglement wedge of a boundary spacelike slice is the bulk region that is causally determined by data on the Ryu-Takayanagi surface that ends on the boundary slice. The union of such entanglement wedges should then give us the complete bulk dual to R, which we denote by B. The strongest evidence for this claim is that the relative entropy between two states evaluated on B is equal to the relative entropy evaluated on R [10,11]. We do not know of any direct proof of this conjecture although if one assumes that operators in B are dual to operators in R then it is possible to write down a formula relating the two sets of operators [32].
Note that, in general, the region specified in (5.10) is a subset of the entanglement wedge. Thus, the entanglement-wedge proposal suggests that locality is violated even more strongly than is suggested by (5.10). The fact that $B_F \subseteq B$ is just the geometric analogue of (5.11).
We note that if one is just interested in studying the duality R ↔ B, which holds when we consider the set of all operators in R then, since A R is an algebra, conventional quantum information measures work well, and our quantum information measures do not have any particular role to play.
We also note that the bulk region B is an example of a region, where the set of local operators can be completed to form an algebra. This algebra is just A R . If we take arbitrary products of local operators in the region, B, this gives us other elements of A R but does not allow us to expand our knowledge about the bulk to any region larger than B.

Conclusion
In this paper, we have described measures of quantum information that are applicable when we can only probe a system with a limited number of observables, and when the space spanned by these observables, A, does not close to form a von Neumann algebra. We believe that these measures are particularly relevant in quantum gravity where, for physical reasons, the set of approximately localized operators in a region may not close to form an algebra. A corollary to this fact is that the Hilbert space of gravity does not factorize into a Hilbert space associated with a region, and its complement.
However, even for simpler systems that do not contain gravity, the set of accessible observables may not be closed under multiplication due to physical or experimental limitations. We believe that the information measures we have defined here serve as a natural generalization of conventional information measures and may be useful in a study of such systems.
One of the central conclusions of this paper is that the objects that deserve attention in this situation are the modular and the relative modular operators. When A is an algebra, these operators can be written in terms of the density matrix associated with the state. However, these operators can be defined more generally, and their spectrum gives us a characterization of the state that is invariant under general linear transformations of the basis used for A. Moreover in sections 3.1.3, 3.2.2, 3.2.3, we showed that several properties of these spectra, which are obvious when A is an algebra, continue to hold even when A is not an algebra.
A key quantum information measure is the "distance" between states. An appropriate notion of distance, which obeys various properties that we reviewed in section 4, can then be used to define notions of bipartite or multipartite entanglement. When A is an algebra, the relative entropy is commonly used to describe the distance between states. However, when A is not an algebra, the simplest generalization of the relative entropy fails the test of specificity: the relative entropy may vanish even when two states are not equal.
We proceeded to describe an entire class of distance measures in section 4.2.2 which did meet all of our desired properties. We focused on two of these measures: the normed entropy, which was additive, and the χ-distance, which was finite. These distance measures rely on the operator norm of a combination of the relative modular operator and the modular operator. We used these distance measures to describe notions of entanglement that measured only quantum correlations between systems, were invariant under local unitary transformations, and decreased or remained constant under LOCC operations.
It is clear that there are several directions for further work. The first set of questions is purely information-theoretic. First, the distance measures we have defined are only a subset of a large class of distance measures that can be defined through the spectra of the modular and relative modular operators. We believe that other possible distance measures should also be investigated and classified. Second, our measure of entanglement is defined by calculating the minimum distance between a state and the set of separable states. As we mentioned above, determining the closest separable state is a difficult task, even numerically. So, it would be nice to understand if simpler measures of entanglement can be defined.
A third interesting question is whether we can define notions of distance that rely on a trace, rather than the operator-norm. As we explained above, the reason for using the operator-norm was to resolve the tension between the property of "insularity" and the property of "monotonicity." In the case where A is an algebra, trace-based measures like the relative entropy satisfy both these required properties since they are defined using density matrices, which are bounded and have unit trace. However, the modular operator does not share this property. Nevertheless, it is not clear that this difficulty is insurmountable and this question deserves further attention particularly since trace-based measures may be "smoother" than measures based on the operator norm.
Finally, an important property of entanglement is that it is monogamous. How, precisely, is the monogamy of entanglement reflected in these entanglement measures? These questions are particularly relevant, when these measures of entanglement are applied to theories of gravity since the monogamy of entanglement plays a significant role in precise formulations of the information paradox.
Another question, which is relevant while applying these measures to quantum field theory or gravity, has to do with the UV-sensitivity of our measures. In our discussions above, we assumed that all the relevant operators had been regulated, so that they could be treated as bounded operators in a finite Hilbert space. However, in quantum field theories, it is clear that various quantities, such as the spectrum of the modular and the relative modular operator, may be sensitive to the UV-cutoff. We believe that the distance measures defined in section 4 should be UV-safe, but it is clearly important to investigate this further.

There are several natural questions about entanglement in gravity, for which our formalism seems relevant. For example, in the AdS/CFT correspondence, a natural question is as follows. Consider the "annular" subregion of global AdS, at constant time, with $r > r_0$, where $r$ is the radial coordinate and $r_0$ is some cutoff. The bulk dual of the time band that is shown in Figure 1c is just the bulk causal completion of this annulus. The complement of this region is the "disk" with $r < r_0$. We would then like to analyze the entanglement between the region $r > r_0$ and the region $r < r_0$. (See Figure 2.) Within the conventional setting, this question can only be addressed at infinite $N$, where the effects of gravity are irrelevant. This is because, as we discussed in section 5.2, if we consider all operators in the annular region then, since the annular region includes a slice of the boundary at constant time, this set is the complete set of operators in the theory. So all operators in the region $r < r_0$ are already included in the region $r > r_0$! However, our measures of entanglement can be applied to this question, if we simply use the coarse-grained set of bulk and boundary operators described in section 5.1.
Moreover, if we restrict the set of coarse-grained observables to polynomials that contain only a small number of insertions of generalized free-fields, then we believe that our measures should be computable numerically, at least to the first non-trivial order in $1/N$. We believe that this is an interesting problem, and we hope to comment further on this in forthcoming work.

We take our states, $\psi$ and $\phi$, to be general mixed states in this system. They can be specified by $2^N \times 2^N$ density matrices, and we denote these density matrices by $\rho$ and $\sigma$ respectively. We generate these density matrices as random unit-trace operators that are positive and self-adjoint. We then calculate the following matrices that are defined in the text. (Here, we have implemented the simplification that $A_i^\dagger = A_i$.)
$$g_{ij} = \mathrm{Tr}(\rho A_i A_j); \qquad (\Delta_\psi)_{ij} = \mathrm{Tr}(\rho A_j A_i); \qquad (\Delta(\psi|\phi))_{ij} = \mathrm{Tr}(\sigma A_j A_i).$$
Then, it is not difficult to see that the spectrum of the matrix $X$ defined in (4.9) can be calculated simply through
$$\mathrm{SP}(X) = \mathrm{SP}\left(\Delta_\psi^{-1}\, \Delta(\psi|\phi)\right),$$
where $\Delta_\psi^{-1}$ just denotes the inverse of the modular operator and we have suppressed the matrix indices to lighten the notation. This is because the elements of $X$ in an orthonormal basis are related to the matrix on the right hand side of the equation above by a similarity transformation. The largest and smallest eigenvalues in this spectrum give us $||X||$ and $1/||X^{-1}||$.
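The numerical procedure just described can be sketched in a few lines. This is a minimal illustration under assumptions that are not taken from the text: a system of two qubits, a hypothetical Hermitian operator set consisting of the identity and single-site Pauli $X$ and $Z$ operators, and a particular way of drawing random density matrices. The spectrum of $\Delta_\psi^{-1} \Delta(\psi|\phi)$ is obtained through a Cholesky factorization, which is related to it by a similarity transformation and keeps the eigenvalues manifestly real.

```python
import numpy as np

def random_density_matrix(dim, rng):
    """Random positive, self-adjoint, unit-trace matrix (one possible choice)."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = m @ m.conj().T
    return rho / np.trace(rho).real

def correlation_matrices(rho, sigma, ops):
    """Return g, Delta_psi and Delta(psi|phi) for a Hermitian operator basis."""
    D = len(ops)
    g = np.empty((D, D), dtype=complex)
    delta_psi = np.empty((D, D), dtype=complex)
    delta_rel = np.empty((D, D), dtype=complex)
    for i, Ai in enumerate(ops):
        for j, Aj in enumerate(ops):
            g[i, j] = np.trace(rho @ Ai @ Aj)
            delta_psi[i, j] = np.trace(rho @ Aj @ Ai)
            delta_rel[i, j] = np.trace(sigma @ Aj @ Ai)
    return g, delta_psi, delta_rel

def spectrum_of_x(delta_psi, delta_rel):
    """Eigenvalues of Delta_psi^{-1} Delta(psi|phi), sorted in ascending order."""
    chol = np.linalg.cholesky(delta_psi)      # Delta_psi = L L^dagger
    chol_inv = np.linalg.inv(chol)
    # L^{-1} Delta(psi|phi) L^{-dagger} is Hermitian and similar to the product above.
    return np.linalg.eigvalsh(chol_inv @ delta_rel @ chol_inv.conj().T)

# Hypothetical operator set: identity plus single-site Pauli X and Z on two qubits.
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])
ops = [np.kron(I2, I2), np.kron(X, I2), np.kron(Z, I2), np.kron(I2, X), np.kron(I2, Z)]

rng = np.random.default_rng(1)
rho = random_density_matrix(4, rng)
sigma = random_density_matrix(4, rng)

g, delta_psi, delta_rel = correlation_matrices(rho, sigma, ops)
spectrum = spectrum_of_x(delta_psi, delta_rel)
norm_x = spectrum[-1]               # largest eigenvalue, ||X||
inv_norm_x_inverse = spectrum[0]    # smallest eigenvalue, 1/||X^{-1}||
print(norm_x, inv_norm_x_inverse)
```

Because the identity belongs to the operator set and both states have unit trace, the value 1 always lies between the smallest and largest eigenvalues, and the whole spectrum collapses to 1 when the two states coincide.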
To compute this spectrum is straightforward in principle, but quickly becomes computationally expensive for large N . All the matrices above are D × D sized matrices. In the table below, we give the value of D for various values of K and N . Note that we must have K ≤ N and so entries with K > N are omitted.  Figure 3 shows the normed entropy and the χ-distance for various values of N and K computed numerically for randomly generated states.

A.2 Distance measures for random matrices
While our example in the previous section was physically motivated, it is also possible to compute our quantum information measures when the matrix of correlation functions is given by a random matrix. More precisely, we take $A$ to consist of $D$ Hermitian operators $1, A_2 \ldots A_D$. Note that even if we are originally given a non-Hermitian basis of operators, given any pair of operators, $A, A^\dagger$, we can always transform to a Hermitian basis by taking the two combinations $\frac{1}{2}(A + A^\dagger)$ and $\frac{i}{2}(A - A^\dagger)$. Now, since the basis of operators is Hermitian, the matrix of correlation functions can be taken to be a random Hermitian positive matrix. Since the basis is Hermitian, the modular operator is just the transpose of this matrix,
$$(\Delta_\psi)_{ij} = g_{ji}.$$
Similarly, we can take $(\Delta(\psi|\phi))_{ij} = \phi(A_j A_i)$ to be another random Hermitian positive matrix. We can choose to probe this system with a subset of these operators, $B \subseteq A$, with $\dim(B) = D'$. For each subset of $A$ we can consider
$$\mathrm{SP}(X_B) = \mathrm{SP}\left((P_B \Delta_\psi P_B)^{-1} \cdot P_B\, \Delta(\psi|\phi)\, P_B\right).$$
The largest and smallest eigenvalues in this spectrum give us $||X_B||$ and $1/||X_B^{-1}||$, and we can use these two to define the normed entropy and the χ-distance when the system is probed with $B$. In our example, we take the subset of operators, $B$, to be just the first $D'$ operators of $A$, but it is easy to generalize this to random subspaces of $A$.
In Figure 4, we show the normed entropy and the χ-distance starting with 5 different values of $D$: (100, 200, 300, 400, 500) and then choosing subsets of operators with all values of $D'$ from $2 \ldots D$. It is clear that both measures are monotonic, as we expect.
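The monotonic behaviour can be checked directly on the two extreme eigenvalues that enter the measures. The sketch below is illustrative only and makes assumptions beyond the text: the random Hermitian positive matrices are drawn as products $m m^\dagger$ of rectangular Gaussian matrices, the dimension is kept small, and only $||X_B||$ and $1/||X_B^{-1}||$ are tracked rather than the normed entropy or the χ-distance themselves.

```python
import numpy as np

def random_positive(dim, rng):
    """Random Hermitian positive matrix m m^dagger (an assumed ensemble)."""
    m = rng.normal(size=(dim, 2 * dim)) + 1j * rng.normal(size=(dim, 2 * dim))
    return m @ m.conj().T

def extreme_eigenvalues(delta_psi, delta_rel, d_sub):
    """Smallest and largest eigenvalue of X_B for B = first d_sub operators."""
    a = delta_psi[:d_sub, :d_sub]   # P_B Delta_psi P_B restricted to B
    b = delta_rel[:d_sub, :d_sub]   # P_B Delta(psi|phi) P_B restricted to B
    chol_inv = np.linalg.inv(np.linalg.cholesky(a))
    evals = np.linalg.eigvalsh(chol_inv @ b @ chol_inv.conj().T)
    return evals[0], evals[-1]

rng = np.random.default_rng(3)
D = 30
delta_psi = random_positive(D, rng)
delta_rel = random_positive(D, rng)

sizes = list(range(2, D + 1))
pairs = [extreme_eigenvalues(delta_psi, delta_rel, d) for d in sizes]
smallest = np.array([p[0] for p in pairs])   # 1/||X_B^{-1}|| for each subset size
largest = np.array([p[1] for p in pairs])    # ||X_B|| for each subset size
print(largest[0], largest[-1], smallest[0], smallest[-1])
```

As the subset grows, the largest eigenvalue can only increase and the smallest can only decrease, since the generalized Rayleigh quotient is being extremized over a larger space. This is consistent with the monotonicity of the measures described above.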

A.3 Separable and entangled states
In the text, our measures of entanglement were given in terms of the distance from the closest separable state. Given a state, however, it is not easy to determine whether it is separable or not. Nevertheless, it is often possible to look for an entanglement witness that can distinguish an entangled state from a separable one [33]. An entanglement witness is just an operator that has positive expectation values in all separable states; therefore, if we find a state where this witness has a negative expectation value, we know that the state is entangled. One such entanglement witness, which is commonly used for low-dimensional systems, is the partial transpose, as we review below [34].
We consider a direct product splitting of $A = A_1 \otimes A_2$ with $\dim(A_1) = D_1$ and $\dim(A_2) = D_2$. If a state is separable as in (2.8) then the matrix of correlations defined in (2.2) can be written as
$$g = \sum_i p_i\, g_{1i} \otimes g_{2i}, \qquad (A.2)$$
where $g_{1i}$ is a matrix of correlations for elements of $A_1$, $g_{2i}$ is a matrix of correlations for $A_2$, and we have suppressed tensor indices to lighten the notation. As we explained above, this correlation matrix must be Hermitian and positive. Now, consider a positive map $\Lambda$ that acts on $D_2 \times D_2$ matrices and lift its action to $D \times D$ matrices by considering the map $1 \otimes \Lambda$. When acting on a convex decomposition as in (A.2), this map produces a positive matrix.
However, if $\Lambda$ is not a completely positive map and $g$ does not have a decomposition as in (A.2), then $(1 \otimes \Lambda)(g)$ need not be a positive matrix.
If we take Λ to be the transpose operation, then this gives us an example of a map between matrices that is positive but not completely positive. The action of 1 ⊗ Λ on a matrix is then given by the "partial transpose". Now, if we can find a matrix of correlations, g, that has the property that its partial transpose has a negative eigenvalue then this proves that the matrix cannot be written in terms of a convex sum as in (A.2).
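The partial-transpose test can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are ours, the convex sum is built from arbitrary random positive factors, and the second matrix is a generic positive matrix chosen only to show that the partial transpose can produce a negative eigenvalue. It is not claimed to be a realizable matrix of correlation functions.

```python
import numpy as np

def partial_transpose(m, d1, d2):
    """Apply 1 (x) transpose to a (d1*d2) x (d1*d2) matrix in Kronecker ordering."""
    return m.reshape(d1, d2, d1, d2).transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)

def random_positive(dim, rng):
    """Random Hermitian positive matrix (illustrative choice)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return a @ a.conj().T

rng = np.random.default_rng(2)
d1, d2 = 3, 3

# Convex sum of tensor products of positive matrices, of the form in (A.2).
weights = rng.random(4)
weights /= weights.sum()
g_sep = sum(p * np.kron(random_positive(d1, rng), random_positive(d2, rng))
            for p in weights)
min_eig_sep = np.linalg.eigvalsh(partial_transpose(g_sep, d1, d2)).min()

# A positive 4 x 4 matrix that is not of this form: a projector onto (e0 e0 + e1 e1)/sqrt(2).
v = np.zeros(4)
v[0] = v[3] = 1 / np.sqrt(2)
g_other = np.outer(v, v)
min_eig_other = np.linalg.eigvalsh(partial_transpose(g_other, 2, 2)).min()

print(min_eig_sep, min_eig_other)
```

The first minimum eigenvalue is positive, as it must be for any convex sum of tensor products of positive matrices, while the second is negative, which rules out such a decomposition for that matrix.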
It is easy to construct a concrete example. The simplest example requires D 1 = D 2 = 3 and we take A 1 = span of{1, A 1 , A † 1 }; A 2 = span of{1, A 2 , A † 2 }; A = span of{1, Then we only need to find a matrix of two-point correlations that has the property that its partial transpose has at least one negative eigenvalue. Consider the following numerically generated 9 dimensional matrix of two-point functions We can check that the spectrum of this matrix and its partial transpose are given by and we see that the last eigenvalue of the partial transpose is negative. So this matrix of correlations represents an entangled state.
Two remarks are in order. First, while we applied the partial transpose criterion to the matrix of two-point functions, we could just as well have applied it to the modular operator. Second, we should caution the reader that the matrix of two-point functions is quite different from the density matrix itself, even though both these matrices are positive. For instance, in the example above, since the operators in $A_1$ and $A_2$ commute, the matrix of two-point functions reflects this symmetry and not every positive $9 \times 9$ matrix is allowed. One such constraint, in the basis above, is that we must have $g_{24} = g_{42}$ because $\psi(A_1 A_2) = \psi(A_2 A_1)$.