The Characterization of Noncontextuality in the Framework of Generalized Probabilistic Theories

One of the most well-motivated and widely applicable notions of classicality for an operational theory is explainability by a generalized-noncontextual ontological model. We here explore what notion of classicality this implies for the generalized probabilistic theory (GPT) that arises from a given operational theory, focusing on prepare-and-measure scenarios. We first show that, when mapping an operational theory to a GPT by quotienting relative to operational equivalences, the constraint of explainability by a generalized-noncontextual ontological model is mapped to the constraint of explainability by an ontological model. We then show that, under the additional assumption that the ontic state space is of finite cardinality, this constraint on the GPT can be expressed as a geometric condition which we term simplex-embeddability. A simplex-embeddable GPT need not be classical in the traditional sense of its state space being a simplex and its effect space being the dual of this simplex; rather, it must admit of an explanation in terms of such a simplicial GPT. As such, simplex embeddability constitutes an intuitive and freestanding notion of classicality for GPTs. Our result also provides a novel means of testing whether a given set of experimental data admits of such an explanation.

In what precise sense does quantum theory necessitate a departure from a classical worldview? Although this is one of the central questions in the foundations of quantum theory, there is no consensus on its answer. Arguably the two most stringent notions of nonclassicality proposed to date are: the failure to admit of a locally causal ontological model [1,2] and the failure to admit of a generalized-noncontextual ontological model [3]. Both of these are operationally meaningful notions of nonclassicality, in the sense that one can determine in principle whether a given set of operational statistics admits of a classical explanation by their lights, regardless of its consistency with quantum theory. This implies that any experimental evidence for such nonclassicality imposes a constraint on any physical theory that hopes to be empirically adequate, including any putative successor to quantum theory.
For prepare-measure experiments on a single system, the notion of local causality is not applicable, and so, of the two notions, only generalized noncontextuality is a candidate for an operationally meaningful notion of classicality for such experiments. Elsewhere [13] it has been argued that its operational meaningfulness and its larger scope of applicability make the notion of generalized noncontextuality the best notion of classicality available today. Furthermore, it can be shown to subsume the central ideas behind several other notions of classicality, such as the existence of a nonnegative quasi-probability representation [14,15] or of a locally causal model [1,2]. Additionally, the failure of noncontextuality has been shown to be a resource for quantum information processing [16][17][18][19][20][21][22][23][24]. Finally, if one assumes the framework of ontological models, then assuming generalized noncontextuality can be understood as assuming an instance of a methodological principle for theory construction due to Leibniz (see Ref. [25] and the appendix of Ref. [12]). Given that Einstein made significant use of this principle when he developed the Theory of Relativity [25], it is seen to have impressive credentials in physics and therefore is a very natural constraint to impose on ontological models. From this perspective, the impossibility of finding generalized-noncontextual ontological models is best understood as a failing of the framework of ontological models itself, and hence as a type of nonclassicality.
It is our purpose here to understand what the existence of a generalized-noncontextual ontological model of an operational theory implies about the geometry of the generalized probabilistic theory [26,27] (GPT) which characterizes that operational theory. Ultimately we will prove the following:

Theorem 1. A prepare-and-measure operational scenario admits of a generalized-noncontextual ontological model on an ontic state space of finite cardinality if and only if the GPT which represents it is simplex-embeddable.
Simplex embeddability will be defined rigorously in Definition 1. It stipulates that there is a linear map that embeds the state space in a simplex and another linear map that embeds the effect space in the dual of this simplex, such that the pair of maps together preserve inner products.
This theorem implies that if one takes explainability by a generalized noncontextual model as one's notion of classicality for an operational theory, then one must take simplex-embeddability as one's notion of classicality for a GPT. This is in contrast with the prevailing idea that a GPT should be deemed classical if and only if its state space is a simplex and its effect space the dual thereof: simplex-embeddability deems a GPT as classical if and only if it can be simulated by such a simplex and its dual.
We begin by introducing operational theories and the generalized probabilistic theories associated to these, as well as ontological models of each.

Operational Theories-An operational theory is a very minimal type of theory that stipulates for a given system a set of preparation procedures and measurement procedures that can be implemented on that system, denoted Preps and Mmts respectively. These are conceived of as lists of lab instructions that one could, in principle, implement on the given system. We will here find it useful to consider operational effects, defined as the tuple consisting of a measurement and an outcome thereof. We obtain the set of all operational effects, denoted Effects, by considering the set of all outcomes k for each measurement M in the set Mmts. A particular operational effect will be denoted [k|M]. An operational theory stipulates a probability rule that determines the probability of obtaining operational effect [k|M] given preparation P, denoted Pr([k|M], P). This probability rule, however, is not completely arbitrary, but must be compatible with certain relations that hold between the procedures. For instance, if P1 is described as a procedure that convexly mixes P2 and P3, with the choice determined by a coin-flipping mechanism, then Pr([k|M], P1) must be equal to the corresponding mixture of Pr([k|M], P2) and Pr([k|M], P3) for all [k|M] [26]. For operational effects, analogous constraints from convexity hold, as do additional constraints due to coarse-graining relationships among operational effects. For example, if one operational effect [k1|M1] is described as being the coarse-graining of two others, [k2|M2] and [k3|M3], then Pr([k1|M1], P) must be the sum of Pr([k2|M2], P) and Pr([k3|M3], P) for all P. As we comment below, these constraints on Pr(·,·) have important consequences for the generalized probabilistic theory associated to the operational theory.
In all, an operational theory of a prepare-measure experiment on a single system is a triple T := {Preps, Mmts, Pr(·,·)}.
Finally, we define the notion of operational equivalence of procedures [3]. Preparation procedures P and P′ are said to be operationally equivalent, denoted P ≃ P′, if they give rise to the same statistics for all physically possible operational effects, that is, if Pr([k|M], P) = Pr([k|M], P′) for all [k|M] ∈ Effects. Operational equivalence of effects is defined analogously.

Generalized Probabilistic Theories-The full framework of GPTs allows for sequential and parallel composition; however, we will focus here on the fragment of a GPT that describes prepare-measure experiments on a single system. A given GPT associates to a system a convex set of states, Ω. One can think of this set as a generalization of the Bloch ball in quantum theory, where the states in the set are the normalized (potentially mixed) states of the theory. We make the standard assumptions that Ω is finite-dimensional and compact. While Ω naturally lives inside an affine space, AffSpan[Ω], for convenience we represent it as living inside a real inner product space (V, ⟨·,·⟩) of one dimension higher, where we embed AffSpan[Ω] as a hyperplane in V which does not intersect the origin 0. This is analogous to embedding the Bloch ball within the real vector space of Hermitian matrices. The reason for doing so is that we can then define both the GPT states and GPT effects within the same space. A GPT also associates to every system a set of GPT effect vectors, E. In the framework of GPTs, the probability of obtaining an effect e ∈ E given a state s ∈ Ω is given by the inner product:

Prob(e, s) := ⟨e, s⟩. (1)

We require that E satisfy the following constraints. Define the dual of Ω, denoted Ω*, as the set of vectors in V whose inner product with all state vectors in Ω is between 0 and 1:

Ω* := {e ∈ V : 0 ≤ ⟨e, s⟩ ≤ 1 ∀s ∈ Ω}.

Then E must be a compact convex set contained in Ω*, E ⊆ Ω*, which contains the origin 0 and the "unit effect" u, satisfying, respectively, ⟨0, s⟩ = 0 and ⟨u, s⟩ = 1 for all s ∈ Ω. Due to how we embedded AffSpan[Ω] within V, u necessarily exists and is unique.
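The inner-product probability rule above can be made concrete with a small numerical sketch. The representation below is an illustrative choice, not one prescribed by the text: a qubit state with Bloch vector n is written as the 4-vector s = (1, n), an effect αI + a·σ as e = (α, a), and the Euclidean dot product then reproduces tr[Eρ].

```python
import numpy as np

# Illustrative sketch of a qubit as a GPT: a state is a 4-vector
# s = (1, n_x, n_y, n_z) with |n| <= 1, so AffSpan[Omega] sits on the
# hyperplane "first coordinate = 1", which misses the origin of V = R^4.
# An effect alpha*I + a.sigma is the 4-vector e = (alpha, a), and
# Prob(e, s) = <e, s> is the ordinary Euclidean inner product.

def state(n):
    return np.concatenate(([1.0], n))

def prob(e, s):
    return float(np.dot(e, s))

s0    = state([0, 0,  1])          # |0><0|
splus = state([1, 0,  0])          # |+><+|

e0   = np.array([0.5, 0, 0, 0.5])  # effect |0><0|
u    = np.array([1.0, 0, 0, 0.0])  # unit effect: <u, s> = 1 for all s
zero = np.zeros(4)                 # zero effect: <0, s> = 0 for all s

print(prob(e0, s0))     # 1.0
print(prob(e0, splus))  # 0.5
print(prob(u, splus))   # 1.0
print(prob(zero, s0))   # 0.0
```

Note how the unit and zero effects satisfy the constraints ⟨u, s⟩ = 1 and ⟨0, s⟩ = 0 on every state, as required of any valid effect space.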
The state and effect spaces of any valid GPT must satisfy the principle of tomography, which states that the GPT states and GPT effects can be uniquely identified by the probabilities that they produce. Formally, for the GPT states we have that ⟨e, s1⟩ = ⟨e, s2⟩ for all e ∈ E if and only if s1 = s2, and for the GPT effects we have that ⟨e1, s⟩ = ⟨e2, s⟩ for all s ∈ Ω if and only if e1 = e2. A GPT, G, is therefore defined by G := (V, ⟨·,·⟩, Ω, E).

The GPT associated to an Operational Theory-The GPT associated to an operational theory T is the theory that one obtains when one quotients T relative to the notion of operational equivalence defined above. It is specified by a pair of quotienting maps: s : Preps → Ω, which takes each preparation P to a GPT state vector s_P, and e : Effects → E, which takes each operational effect [k|M] to a GPT effect vector e_{[k|M]}.
Note that the construction of the quotienting maps and the assumption of tomography guarantee that every operationally equivalent pair of preparations (effects) in the operational theory is mapped to the same GPT state (GPT effect) vector, and hence that each GPT vector is a representation of an operational equivalence class of operational procedures. That is,

P ≃ P′ ⟺ s_P = s_{P′}, and [k|M] ≃ [k′|M′] ⟺ e_{[k|M]} = e_{[k′|M′]}.

Furthermore, these maps ensure that nontrivial convex and coarse-graining relations holding among preparations (respectively, effects) in the operational theory are encoded in the geometric relations between the GPT state vectors (respectively, GPT effect vectors) in the GPT. For example, if P1 is a convex mixture of P2 and P3 with weights w and (1−w) (or if P1 is operationally equivalent to such a mixture), then it will follow that s_{P1} = w s_{P2} + (1−w) s_{P3}.
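The encoding of convex relations in GPT vectors can be checked numerically. The sketch below reuses a hypothetical 4-vector representation of qubit states and effects (states as (1, n), effects as (α, a)); the point is only that a convex mixture of state vectors yields the corresponding mixture of probabilities, by linearity of the inner product.

```python
import numpy as np

# Sketch: a coin-flip mixture of preparations maps to a convex mixture
# of GPT state vectors, so probabilities mix with the same weights.
s2 = np.array([1.0, 0, 0,  1])   # s_{P2}, e.g. |0><0| in a Bloch-style rep
s3 = np.array([1.0, 0, 0, -1])   # s_{P3}, e.g. |1><1|
w  = 0.3
s1 = w * s2 + (1 - w) * s3       # s_{P1} for the mixture of P2 and P3

e = np.array([0.5, 0, 0, 0.5])   # an effect (here: |0><0|)
lhs = np.dot(e, s1)                               # Pr on the mixture
rhs = w * np.dot(e, s2) + (1 - w) * np.dot(e, s3) # mixture of Pr's
print(abs(lhs - rhs) < 1e-12)    # True: linearity of the inner product
```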

Ontological Model of an Operational Theory-An ontological model of an operational theory T is an attempt to explain the operational statistics of T by positing that the system has an ontic state, drawn from a set of ontic states Λ, which determines (perhaps only probabilistically) the measurement outcome, and where there may be epistemic uncertainty about the ontic state for a given preparation procedure. Thus, it associates to each preparation P ∈ Preps a normalized probability distribution over Λ, denoted µ_P, representing an agent's knowledge of the ontic state when they know that the preparation was P. Denoting the set of such distributions by D[Λ], the ontological model specifies a map

µ : Preps → D[Λ], P ↦ µ_P.

Furthermore, an ontological model associates to each operational effect [k|M] ∈ Effects a response function on Λ, denoted ξ_{[k|M]}, where ξ_{[k|M]}(λ) represents the probability of obtaining the outcome k in a measurement of M when the ontic state of the system fed into the measurement device is λ ∈ Λ. Denoting the set of such response functions by F[Λ], the ontological model specifies a map

ξ : Effects → F[Λ], [k|M] ↦ ξ_{[k|M]}.

These two maps must preserve the convex and coarse-graining relations between operational procedures that were discussed above. For example, if P1 is a convex mixture of P2 and P3 with weights w and (1−w), then µ_{P1} = w µ_{P2} + (1−w) µ_{P3} [3], and similarly for operational effects. Finally, the ontological model must reproduce the probability rule of the operational theory T via

Pr([k|M], P) = Σ_{λ∈Λ} ξ_{[k|M]}(λ) µ_P(λ),

where we have assumed Λ to be discrete for simplicity.

Generalized Noncontextuality-We are now in a position to define the notion of classicality of an operational theory T with which we are concerned in this article, namely, the existence of a generalized-noncontextual ontological model of T. An ontological model of a prepare-and-measure experiment satisfies generalized noncontextuality if every two procedures which are operationally equivalent have identical representations in the ontological model.
In other words, the constraint for preparations is that

P ≃ P′ ⟹ µ_P = µ_{P′},

while the constraint for operational effects is that

[k|M] ≃ [k′|M′] ⟹ ξ_{[k|M]} = ξ_{[k′|M′]}.

Ontological Model of a GPT-Just like an ontological model of an operational theory (discussed above), an ontological model of a GPT is also an attempt to explain the operational statistics in terms of a space Λ of ontic states that the system can take and epistemic uncertainty thereof. In this case, however, what is being modelled ontologically are not preparations and measurements, but operational equivalence classes thereof. Thus, an ontological model of a GPT associates to each GPT state vector s ∈ Ω a normalized probability distribution over Λ, denoted µ̃_s ∈ D[Λ], and to each GPT effect vector e ∈ E a response function on Λ, denoted ξ̃_e ∈ F[Λ]. Hence, it specifies a pair of maps

µ̃ : Ω → D[Λ], s ↦ µ̃_s, and ξ̃ : E → F[Λ], e ↦ ξ̃_e,

which must be linear by the assumption that they preserve the convex and coarse-graining relations defined by the geometry of the GPT state and effect spaces. Finally, the ontological model must reproduce the probability rule of the GPT via

⟨e, s⟩ = Σ_{λ∈Λ} ξ̃_e(λ) µ̃_s(λ) ∀e ∈ E, s ∈ Ω.

It is now clear that a generalized-noncontextual ontological model of an operational theory T is equivalent to an ontological model of the GPT associated to T. (An explicit proof is given in the appendix.) Hence:
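As a minimal illustration of an ontological model of a GPT, consider the simplicial bit: there the identity maps already constitute an ontological model. This is a deliberately trivial sketch (the names mu_tilde and xi_tilde are our own), meant only to show the reproduction of the probability rule on a finite Λ.

```python
import numpy as np

# A minimal sketch: an ontological model of the simplicial bit GPT.
# Lambda = {0, 1}; GPT states are probability vectors s in R^2 and GPT
# effects are vectors e with entries in [0, 1]. Taking mu_tilde and
# xi_tilde to be identity maps (trivially linear), the ontological
# probability rule sum_lambda xi_e(lambda) mu_s(lambda) equals <e, s>.
Lambda = [0, 1]

def mu_tilde(s):   # element of D[Lambda]: a normalized distribution
    return s

def xi_tilde(e):   # element of F[Lambda]: a response function
    return e

s = np.array([0.25, 0.75])   # a GPT state
e = np.array([1.0, 0.0])     # a GPT effect ("outcome 0")

model_prob = sum(xi_tilde(e)[l] * mu_tilde(s)[l] for l in Lambda)
gpt_prob   = float(np.dot(e, s))
print(model_prob, gpt_prob)  # 0.25 0.25
```

For nonsimplicial GPTs no such trivial model exists, which is exactly what the simplex-embeddability criterion below diagnoses.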

Proposition 1. There exists a generalized-noncontextual ontological model of an operational theory T describing prepare-measure experiments on a system if and only if there exists an ontological model of the GPT G that T defines.
As we show in Appendix B, an ontological model of a GPT is equivalent to a positive quasiprobability representation of that GPT. Hence, Proposition 1 generalizes the results of Refs. [14,15] from quantum theory to arbitrary GPTs.
Note that ontological models of GPTs, unlike those of operational theories, cannot be said to be either generalized-contextual or generalized-noncontextual.
Recall that contexts are defined as differences among procedures that are operationally equivalent, so there is no notion of context in a GPT, since the latter is obtained by quotienting relative to operational equivalences. To ask whether the ontological representation of a GPT state (or GPT effect) varies with context when there is no variability of context for GPT states (or GPT effects) is a category mistake, in the same way that asking whether X varies with Y when Y exhibits no variability is a category mistake [30].
This implies another point of contrast between ontological models of GPTs and those of operational theories. For a given operational theory, one can always construct an ontological model by allowing this model to be generalized-contextual (see, e.g., Refs. [31,32]). But it is not the case that one can always construct an ontological model of a GPT, because such models do not have the benefit of the representational flexibility afforded by nontrivial context-dependences.

The Geometric Criterion associated to Noncontextuality-We argued in the introduction that an operational theory is best viewed as classical if its operational predictions admit of an explanation in terms of a generalized-noncontextual ontological model. It is natural, therefore, to determine what this notion of classicality for an operational theory entails for the GPT that the latter defines. By Proposition 1, this is equivalent to finding a criterion for when a GPT admits of an ontological model. It is desirable to have this criterion expressed in the native language of GPTs, that is, in terms of convex geometry. We now give such a geometric condition, which we term simplex-embeddability, under the assumption that the ontological model has an ontic state space Λ of finite cardinality. (At the end of the article, we show that this restriction is irrelevant for many practical purposes.)

Definition 1. A GPT G = (V, ⟨·,·⟩_V, Ω, E) is simplex-embeddable if there exists an inner product space (W, ⟨·,·⟩_W) of some finite dimension, a simplex ∆ ⊂ W with dual hypercube ∆* ⊂ W, and linear maps ι : V → W and κ : V → W such that ι(s) ∈ ∆ for all s ∈ Ω, κ(e) ∈ ∆* for all e ∈ E, and

⟨e, s⟩_V = ⟨κ(e), ι(s)⟩_W ∀e ∈ E, s ∈ Ω. (16)

Note that it is only the space of GPT states which embeds within a simplex, while the space of GPT effects embeds within a hypercube dual to this simplex; we nonetheless use the term 'simplex-embeddable' as an umbrella term for the pair of embedding relations. With this definition in hand, one can prove the following (as we show in the appendix).
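For a concrete GPT given by finite lists of (extremal) state and effect vectors, the conditions of Definition 1 for a fixed candidate pair of linear maps can be checked mechanically. The function below is our own illustrative sketch, using the canonical simplicial GPT in R^d (nonnegative normalized vectors for ∆_d, entrywise [0,1] vectors for its dual hypercube); by linearity, checking extremal states and effects suffices.

```python
import numpy as np

# Sketch: check Definition 1 numerically for candidate linear maps
# iota, kappa : V -> W = R^d, given as d x dim(V) matrices. We use the
# canonical simplicial GPT in R^d: Delta_d = { v >= 0, sum(v) = 1 },
# and its dual hypercube Delta_d^* = { w : 0 <= w_i <= 1 }.
def is_simplex_embedding(iota, kappa, states, effects, tol=1e-9):
    for s in states:
        v = iota @ s
        if np.any(v < -tol) or abs(v.sum() - 1) > tol:
            return False                  # iota(s) not in Delta_d
    for e in effects:
        w = kappa @ e
        if np.any(w < -tol) or np.any(w > 1 + tol):
            return False                  # kappa(e) not in Delta_d^*
    for s in states:
        for e in effects:
            if abs(e @ s - (kappa @ e) @ (iota @ s)) > tol:
                return False              # Eq. (16) violated
    return True

# Trivial sanity check: the classical bit embeds into itself (d = 2).
I2 = np.eye(2)
states  = [np.array([p, 1 - p]) for p in (0.0, 0.3, 1.0)]
effects = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
print(is_simplex_embedding(I2, I2, states, effects))   # True
```

Searching over all candidate maps (and all dimensions d) is the hard part, as the discussion of dimension bounds below makes clear; this sketch only verifies a given candidate.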

Theorem 2. A GPT G describing a prepare-measure experiment admits of an ontological model over an ontic space Λ of finite cardinality if and only if it is simplex-embeddable.
The proof of this result is quite simple. Note, however, a crucial aspect of the solution that is not obvious at first glance: the dimension of the vector space in which this embedding can be constructed may be greater than the native dimension of the GPT. We provide an explicit example of the necessity of such a 'dimension gap' in appendix D.
By combining Proposition 1 and Theorem 2, one immediately obtains our main result, Theorem 1, which gives a geometric characterization of the set of GPTs that are associated to operational theories which admit of generalized-noncontextual ontological models over ontic state spaces of finite cardinality.

Discussion-In the introduction, we listed several motivations for taking generalized noncontextuality as one's notion of classicality for operational theories. By Theorem 1, each of these motivations can be reappropriated as a motivation for taking simplex-embeddability as one's notion of classicality for a GPT. This notion of classicality is distinct from the prevailing notion that a GPT should be deemed classical if and only if its state space is a simplex and its effect space the dual of this simplex. Such a simplicial GPT is one in which all logically possible measurements can be performed simultaneously and without disturbance, and hence any such GPT is well-characterized as classical. However, a much broader class of GPTs is also well-characterized as classical: namely, all those which can be simulated by a simplicial GPT. This is exactly what our notion of simplex-embeddability captures.
The notion of simplex-embeddability is a more permissive notion of classicality than that of simpliciality, since the class of simplex-embeddable GPTs is strictly larger than the class of simplicial GPTs. In related work undertaken simultaneously by Shahandeh [33] and by Barnum and Lami [34], it was shown that these two classes coincide if one restricts to GPTs which satisfy the no-restriction hypothesis [29]. Consequently, for these GPTs (and only these) the two notions of classicality coincide.

Applications-Our result provides a novel way to witness the nonclassicality (or classicality) of a given set of prepare-measure experiments on a system. One first infers the set of GPTs that are consistent with one's data, using the techniques described in Ref. [35], and then one tests these GPTs for simplex-embeddability. This provides a very even-handed test of noncontextuality insofar as it does not privilege any particular preparations or measurements, but rather appeals to the geometry of the entire state space and the entire effect space.
We now address two important practical issues with testing for simplex-embeddability in such a scheme. Firstly, one needs some prescription for how to test whether a generic GPT G admits an embedding of the type described in Definition 1 for a given dimension d. Secondly, even supposing that such an algorithm is available and efficient, a further obstacle is the lack of any upper bound on the dimension d up to which one must apply this test. Because Theorem 2 does not, a priori, provide any bounds on d, it may be that one must repeat the embeddability test for a sequence of ever-larger dimensions d, and there is no guarantee that the process will terminate at any finite d.
A critical question, then, is whether there exists any finite dimension b(G) such that if a GPT G is simplex-embeddable, then that embedding can be constructed in a vector space of dimension d ≤ b(G). Without such a bound, one also cannot answer the question of whether there exist GPTs which only admit of ontological models where the cardinality of the set of ontic states is uncountable.
In the special case where the GPT in question has an effect space that is a polytope, such a bound can be computed by leveraging arguments from Ref. [36]. Such a GPT is necessarily consistent with some operational theory defined by a finite set of operational effects. But the arguments in Ref. [36] imply that any prepare-and-measure scenario where the set of operational effects is of finite cardinality admits of a generalized-noncontextual ontological model if and only if it admits of one with at most a particular finite number (which we denote d_max) of ontic states. By the construction used to prove Proposition 1, this generalized-noncontextual ontological model for the operational theory can be translated into an ontological model for the GPT on the same finite-cardinality ontic state space.
To witness nonclassicality of a generic GPT, it suffices to find inner approximations of its state and effect spaces which do not embed in a simplex and its dual. Similarly, to witness classicality, it suffices to find outer approximations of these that do embed in a simplex and its dual. In practice, one can always take the (inner or outer) approximations of the effect space to be polytopic, in which case the dimension bound described above can be leveraged. Note further that in a real experiment, the data will not pick out a single GPT but a set of compatible possibilities [35]; to witness nonclassicality (respectively, classicality) of the GPT that generated the experimental data, then, it suffices to find a polytopic inner (respectively, outer) approximation of all the GPTs in this set.

Outlook-A key question for future work is to extend our result beyond the prepare-and-measure scenario to more general compositional scenarios. As discussed above, it would also be interesting to remove the dimensional caveats from our main theorems, and to find efficient methods for testing whether a given GPT is simplex-embeddable or not.

Appendix A: Proof of Proposition 1

We now prove Proposition 1.
Proof. In one direction, given an ontological model of the GPT G (i.e., given the relevant maps µ̃ and ξ̃) which is associated to the operational theory T (via the quotienting maps s and e), we can construct a generalized-noncontextual ontological model for T by simply composing the quotienting map with the ontological map; that is, by constructing µ := µ̃ ∘ s (A1) and ξ := ξ̃ ∘ e. (A2)
It is then easy to check that these maps define an ontological model of the operational theory T. Firstly, µ preserves the convex relations among preparations, since s preserves these relations and µ̃ is linear. Similarly, ξ preserves the convex and coarse-graining relations for operational effects.
The resulting model is easily seen to be noncontextual, since

P ≃ P′ ⟹ s_P = s_{P′} ⟹ µ_P = µ̃_{s_P} = µ̃_{s_{P′}} = µ_{P′},

and similarly for operational effects. Finally, the two maps together reproduce the predictions of the operational theory, since

Σ_λ ξ_{[k|M]}(λ) µ_P(λ) = Σ_λ ξ̃_{e_{[k|M]}}(λ) µ̃_{s_P}(λ) = ⟨e_{[k|M]}, s_P⟩ = Pr([k|M], P).

Conversely, given a generalized-noncontextual ontological model of an operational theory T (i.e., given the relevant maps µ and ξ), we can construct an ontological model of the GPT G associated to it (via the relevant maps s and e) by defining the maps µ̃ and ξ̃ as the unique linear maps satisfying

µ̃_s := µ_P for all preparations P such that s_P = s,

and

ξ̃_e := ξ_{[k|M]} for all operational effects [k|M] such that e_{[k|M]} = e.

(These are well defined by the assumption of noncontextuality.) It is then easy to check that µ̃ preserves the convex relations between state vectors. E.g., suppose that P1 is a convex mixture of P2 and P3 with weights w and (1−w), and hence µ_{P1} = w µ_{P2} + (1−w) µ_{P3} by the fact that µ preserves the convex relations among preparation procedures. We therefore find that µ̃_{s_{P1}} = µ_{P1} = w µ_{P2} + (1−w) µ_{P3} = w µ̃_{s_{P2}} + (1−w) µ̃_{s_{P3}}. Hence whenever s_{P1} = w s_{P2} + (1−w) s_{P3} holds (which follows from the operational equivalence), so does µ̃_{s_{P1}} = w µ̃_{s_{P2}} + (1−w) µ̃_{s_{P3}}, and so µ̃ does indeed preserve the convex relations among preparation procedures. The proof that ξ̃ preserves the convex and coarse-graining relations among GPT effects is analogous. Finally, the two maps together reproduce the predictions of the GPT, since

Σ_λ ξ̃_e(λ) µ̃_s(λ) = Σ_λ ξ_{[k|M]}(λ) µ_P(λ) = Pr([k|M], P) = ⟨e, s⟩.

Appendix B: Ontological Models of GPTs as Positive Quasiprobability Representations

We have argued that a GPT should be deemed classical if and only if an ontological representation of it exists. We now prove the following proposition, which implies that this notion of classicality is equivalent to another notion of classicality, namely, the existence of a positive quasiprobability representation.

Proposition 2. An ontological model of a GPT G describing prepare-measure experiments on a system is equivalent to a positive quasiprobability representation of G.
A quasiprobability representation of a GPT associates to each GPT system a set Λ. It associates to each GPT state on this system, s ∈ Ω, a real-valued function over Λ, denoted µ̃_s and satisfying Σ_{λ∈Λ} µ̃_s(λ) = 1. This is termed a quasiprobability distribution over Λ because if the function were valued in [0,1] rather than the reals, it could be interpreted as a probability distribution over Λ. Formally, the representation specifies a linear map from state vectors to the real vector space of functions from Λ to R, denoted R^Λ. That is,

µ̃ : Ω → R^Λ, s ↦ µ̃_s.

It also associates to each GPT effect e a real-valued function over Λ, denoted ξ̃_e and satisfying Σ_{e∈χ} ξ̃_e(λ) = 1 for all λ ∈ Λ, where χ is any set of GPT effects corresponding to the set of outcomes of a possible measurement. Note that such a set must satisfy Σ_{e∈χ} e = u. The function ξ̃_e is termed a quasiresponse function because if it were valued in [0,1] rather than the reals, it could be interpreted as a response function. Formally, the representation specifies a linear map

ξ̃ : E → R^Λ, e ↦ ξ̃_e.

Now consider the probability of obtaining GPT effect e given a preparation associated to state s, which is given by ⟨e, s⟩_V in the GPT. In a quasiprobability representation of this GPT, one computes this probability using the same formula that would be appropriate if the quasiprobabilities were true probabilities, namely, Σ_{λ∈Λ} ξ̃_e(λ) µ̃_s(λ). Thus,

∀e ∈ E, ∀s ∈ Ω: ⟨e, s⟩_V = ξ̃_e · µ̃_s = Σ_{λ∈Λ} ξ̃_e(λ) µ̃_s(λ). (B3)

Finally, we say that a quasiprobability representation of a GPT is positive if ∀s ∈ Ω, ∀λ ∈ Λ: 0 ≤ µ̃_s(λ) ≤ 1, and ∀e ∈ E, ∀λ ∈ Λ: 0 ≤ ξ̃_e(λ) ≤ 1.

Appendix C: Proof of Theorem 2
We now wish to prove Theorem 2. It is convenient for the proof to formally define the notion of a simplicial GPT which was discussed in the main text. A simplicial GPT of given dimension d is a tuple (W, ⟨·,·⟩_W, ∆_d, ∆*_d) defined by a vector space W with inner product ⟨·,·⟩_W, a simplicial state space Ω = ∆_d (of intrinsic dimension d−1), and the dual hypercube E = ∆*_d (of intrinsic dimension d) as its effect space. Clearly, the inner product space, simplex, and dual appearing in the definition of simplex-embeddability (Def. 1) define such a simplicial GPT.⁵

Next, it is useful to characterize the inherent degeneracies in the definition of a GPT, that is, to characterize under what conditions two GPTs make the same operational predictions, and hence are essentially equivalent. As a simple example, note that (V, ⟨·,·⟩_V, Ω, T^T(E)) and (V, ⟨·,·⟩_V, T(Ω), E) are equivalent GPTs for any invertible linear map T, where T^T denotes the transpose of T [26]. Now, note that any simplicial GPT of given dimension d is equivalent (in the above sense) to one in a particular canonical form, defined as follows. The vector space is W = R^d with the standard inner product, the simplex ∆_d is the convex hull of the d standard basis vectors, and the dual hypercube ∆*_d is the set of vectors whose components all lie in [0,1].

⁵ For a GPT G which is simplex-embeddable but not itself simplicial, the embedding relation cannot be interpreted as merely due to technological limitations. To posit that a world is governed by the GPT G is to posit that the states and effects that are physically realizable are all and only the states and effects included in G. Consequently, any states and effects that are included in the simplicial GPT into which G embeds, but are outside of G, are by assumption not physically realizable. The embedding relation must therefore be understood as a fundamental (rather than practical) limitation on what is possible in a world governed by such a GPT.
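The duality between the canonical simplex and its hypercube can be verified on extreme points, since both the containment conditions and the probability rule are linear. The following short sketch checks this for d = 3.

```python
import numpy as np
from itertools import product

# Sketch of the canonical simplicial GPT in W = R^d: Delta_d is the
# convex hull of the standard basis vectors, and the dual hypercube
# Delta_d^* is [0,1]^d. We verify the duality on extreme points: every
# hypercube vertex pairs with every simplex vertex to a value in [0,1].
d = 3
simplex_vertices   = np.eye(d)   # extreme points of Delta_d (rows)
hypercube_vertices = [np.array(v) for v in product([0.0, 1.0], repeat=d)]

ok = all(0.0 <= e @ s <= 1.0
         for e in hypercube_vertices for s in simplex_vertices)
print(ok)   # True

# The unit effect u = (1, ..., 1) gives <u, s> = 1 on every state.
u = np.ones(d)
print(all(abs(u @ s - 1.0) < 1e-12 for s in simplex_vertices))   # True
```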
We now give an example of a simplex-embeddable GPT whose state and effect spaces cannot be embedded into a simplex and its dual (respectively) in a vector space of the same dimension as the GPT itself. The example is the GPT associated to the well-known subtheory of the stabilizer qubit theory containing only the real-amplitude states and effects, termed the stabilizer rebit quantum subtheory.
The GPT state space of the stabilizer rebit is the convex hull of the four pure stabilizer states |0⟩⟨0|, |1⟩⟨1|, |+⟩⟨+|, and |−⟩⟨−|, a square within the disk of real qubit states. This stabilizer rebit GPT admits of an ontological model; one such model is the toy model of Ref. [37]. (We present this model explicitly at the end of this appendix.) As demanded by Theorem 2, this implies that the stabilizer rebit state space can be embedded in a simplex whose dual contains the stabilizer rebit effect space. Indeed, the space of probability distributions over ontic states of the toy model of Ref. [37] defines a tetrahedron which contains the state space, and whose dual contains the effect space. Note that the stabilizer rebit state space is 2-dimensional, while the simplex (the tetrahedron) in which it is embedded is intrinsically 3-dimensional. As we now prove, this dimension mismatch is unavoidable: it is not possible to find an embedding into any intrinsically 2-dimensional simplex such that the effect space embeds in the dual.
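The toy-model embedding just described can be checked numerically. The sketch below encodes the standard presentation of the toy model of Ref. [37]: four ontic states, with each stabilizer state uniform over a pair of ontic states and each stabilizer effect the indicator function of such a pair; we then verify that the model reproduces the Born-rule statistics of the rebit.

```python
import numpy as np

# Sketch of the ontological model of the stabilizer rebit given by the
# toy model of Ref. [37]: Lambda = {0,1,2,3}; each stabilizer state is
# a uniform distribution over two ontic states, and each stabilizer
# effect is the indicator function of such a pair.
mu = {                                      # mu_tilde_s as vectors over Lambda
    '0': np.array([.5, .5, 0, 0]), '1': np.array([0, 0, .5, .5]),
    '+': np.array([.5, 0, .5, 0]), '-': np.array([0, .5, 0, .5]),
}
xi = {k: 2 * v for k, v in mu.items()}      # xi_tilde_e: indicator functions

# The model reproduces the Born-rule statistics |<bra|ket>|^2 of the rebit:
orth = {('0', '1'), ('1', '0'), ('+', '-'), ('-', '+')}
for ket in '01+-':
    for bra in '01+-':
        p = float(xi[bra] @ mu[ket])        # sum_lambda xi(lambda) mu(lambda)
        q = 1.0 if bra == ket else (0.0 if (bra, ket) in orth else 0.5)
        assert abs(p - q) < 1e-12

# The distributions mu live in the tetrahedron (probability simplex over
# four ontic states): entrywise nonnegative and normalized.
print(all(np.all(v >= 0) and abs(v.sum() - 1) < 1e-12 for v in mu.values()))
```

This makes the dimension gap concrete: the square state space sits inside the 3-dimensional probability simplex over four ontic states, one dimension higher than the state space itself.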
Proof. We assume that there exists a triangle which embeds the stabilizer rebit state space, whose dual embeds the stabilizer rebit effect space, and we then derive a contradiction.
Note that for each of the extremal states, there exists an effect that evaluates to zero on that state alone; for example, for the state |0⟩⟨0| the effect |1⟩⟨1| gives probability zero for that state and no other. The set of states for which a given effect evaluates to zero, i.e., those satisfying tr[|1⟩⟨1| ρ] = ⟨1|ρ|1⟩ = 0, is geometrically represented by a hyperplane in the state space.

Let us consider the embedding of the state |0⟩⟨0| into the triangle. A priori, this could go to any point in the triangle. However, we know that there should be some effect in the dual that evaluates to zero on this state, and not on any of the other states we will embed. This means that |0⟩⟨0| must be mapped to a point on the boundary of the triangle. (The only effect that evaluates to zero on an interior point of the triangle is the zero effect, which would also evaluate to zero on all of the other states.) Hence |0⟩⟨0| must lie on a proper face of the triangle, and the associated effect defines the hyperplane that picks out that face. There are two possibilities: |0⟩⟨0| is mapped either to a vertex or to the interior of an edge.

Now consider a second state, e.g., |+⟩⟨+|. The same argument applies again, so it too must be mapped to some proper face of the triangle. Moreover, this face must be disjoint from the face into which we embedded |0⟩⟨0|, since otherwise the effect associated to |0⟩⟨0| would also evaluate to zero on |+⟩⟨+|; that is, it would imply that tr[|1⟩⟨1| |+⟩⟨+|] = 0, and so the ontological model would give predictions different from those of the GPT. Hence, |0⟩⟨0| and |+⟩⟨+| must be mapped to disjoint faces. We can apply this same reasoning to each of the four extremal states to conclude that they must be mapped to four mutually disjoint proper faces of the triangle. But a triangle admits at most three mutually disjoint proper faces, so we reach a contradiction.