Nonlocality Without Entanglement: Quantum Theory and Beyond

Quantum nonlocality without entanglement (Q-NWE) encapsulates the nonlocal behavior of multipartite product states, which may require a global operation for optimal decoding of the classical information encoded in them. Here we show that the phenomenon of NWE is not specific to quantum theory; rather, a class of generalized probabilistic theories can exhibit such behavior. In fact several manifestations of NWE, e.g., asymmetric local discrimination, suboptimal local discrimination, and the notion of separable but locally unimplementable measurements, arise generically in operational theories other than quantum theory. We propose a framework to compare the strength of NWE in different theories and show that such behavior in quantum theory is limited, which points to a specific topological feature of quantum theory, namely, the continuity of its state space structure. Our work adds foundational appeal to the study of NWE phenomena along with its information theoretic relevance.

Introduction.-One of the most counter-intuitive aspects of quantum theory is its nonlocal behavior. John Bell in his seminal result showed that entangled states of composite quantum systems can result in correlations that do not allow any local realistic explanation [1] (see [2,3] for reviews on Bell nonlocality). Such correlations, however, are not available in their strongest form [4]. This limited behavior of Bell nonlocality was later proposed as an axiom for deriving quantum theory [5] and subsequently motivated several computational and information theoretic primitives that aim to single out the correlations allowed in the physical world [6][7][8][9][10][11].
Nonlocal behaviors of quantum theory, however, do not always necessitate entanglement. In a pioneering work Bennett et al. recognized that multipartite quantum systems can exhibit nonlocal properties involving only product states in a way fundamentally different from Bell nonlocality [12]. In particular, they constructed sets of product states that cannot be exactly discriminated using local operations and classical communication (LOCC) while mutual orthogonality assures their perfect global discrimination. The authors coined the term 'quantum nonlocality without entanglement' (Q-NWE) for this phenomenon as the sets allow local preparation (with some preshared strategy) but prohibit perfect local discrimination. Put differently, a global measurement can be more efficient than LOCC for extracting classical information encoded in a locally prepared ensemble of quantum states. Local indistinguishability has later been identified as a crucial primitive for a number of distributed quantum protocols, namely, quantum data hiding [13][14][15] and quantum secret sharing [16][17][18]; it also underlies the non-zero gap between single-shot and multi-shot classical capacities of noisy quantum channels [19]. On the foundational side, the recent Pusey-Barrett-Rudolph theorem uses such a nonlocal feature of nonorthogonal product states to establish the ψ-ontic nature of the quantum wave function [20].
Here we ask: what quantum feature is actually captured in 'quantum nonlocality without entanglement'? More particularly, we ask whether this phenomenon is specific to quantum theory or whether it is possible to devise generalized probabilistic models other than quantum mechanics that manifest similar behavior. Quite surprisingly, as in the case of Bell nonlocality, we find an affirmative answer to the latter. Recall that indistinguishability of pure states in quantum theory can be thought of as an artifact of the Hilbert space structure, as it entails nonorthogonality among the states; and the impossibility of perfect local discrimination of an orthogonal product basis, there, results in a separable measurement that cannot be implemented locally. The generalized probabilistic theory (GPT) framework admits a broader mathematical set-up that includes quantum and classical theory as special examples. It can incorporate the notion of indistinguishable pure states without invoking the much more constrained Hilbert space structure of quantum mechanics [21]. Here we show that this general framework can also exhibit a nonzero gap between the optimal success probabilities of product state discrimination under local and global protocols, i.e., it evinces the NWE phenomenon. In fact, we find that several aspects of NWE, as observed in quantum theory, are feasible in this generalized framework. For instance, it is possible to have a set of bipartite product states that allows perfect local discrimination when one of the parties starts the protocol whereas the protocol fails when the other party starts, a fact already known in quantum theory [22,23]. In the GPT framework, we then construct sets of product states that cannot be perfectly discriminated by any local protocol whereas a global measurement serves the purpose exactly.
Furthermore, all the effects constituting the perfectly discriminating measurement are product effects, ensuring the existence of separable but locally unimplementable measurements in the GPT framework. This mimics the NWE phenomenon as established in the seminal work of Bennett et al. [12]. We propose a methodology to compare the strength of NWE in different theories and subsequently show that such behavior in quantum theory is limited in nature. Importantly, it turns out that this limited behavior of NWE in quantum theory is deeply linked with one of its topological features. More particularly, the limited NWE is caused by the continuity of the state space structure in quantum theory, which assures the existence of a continuous reversible transformation on a system between any two pure states [24]. Our present work thus adds foundational appeal to the study of NWE phenomena. In the following we start with a brief discussion of the GPT framework.
Generalized probabilistic theory.-This mathematical framework is broad enough to encapsulate all possible probabilistic theories that use the notion of states to yield the outcome probabilities of measurements (see the Appendix, and Refs. [24][25][26][27] for details of this framework). In a GPT, a system Sys ≡ (Ω, E) is associated with a set of states Ω and a set of effects E. Typically Ω is considered to be a compact convex set embedded in the positive convex cone V_+ of some real vector space V, while E is embedded in the cone V*_+ which is dual to the state cone V_+. An effect e ∈ E corresponds to a linear functional on Ω that maps each state ω ∈ Ω onto a probability p(e|ω), representing the probability of a successful filtering of the effect e on the state ω. A collection of effects {e_i}_i forms a measurement whenever ∑_i e_i = u, with u being the unit effect such that p(u|ω) = 1, ∀ ω ∈ Ω. A preparation or state ω thus specifies the outcome probabilities for all measurements that can be performed on it. Given two systems Sys(A) ≡ (Ω_A, E_A) and Sys(B) ≡ (Ω_B, E_B), the theory also delineates their composition. The composite system Sys(AB) ≡ (Ω_AB, E_AB) satisfies some natural conditions, such as no-signaling and local tomography [24], that narrow down the possibilities of such compositions and assure that Ω_AB is embedded in the positive cone of V_A ⊗ V_B [28][29][30].
The Hilbert space description of quantum theory lies within this framework. The state of a system is represented by a density operator ρ ∈ D(H) [31] while effects correspond to positive semi-definite operators on H, where H is the Hilbert space associated with the system. The outcome probabilities follow the Born (trace) rule. The Hilbert space of a composite system consisting of the subsystems A_i is given by ⊗_i H_{A_i}, where H_{A_i} is the Hilbert space of the i-th subsystem.
Here we recall another class of GPTs, namely the polygonal models Ply(n) ≡ (Ω(n), E(n)) [32]. The state spaces Ω(n) for elementary systems are regular polygons with n vertices. The states and effects are represented by vectors in R^3 and p(e|ω) is given by the usual Euclidean inner product. For a fixed n, Ω(n) is the convex hull of n pure states {ω_i}_{i=0}^{n−1} with ω_i := (r_n cos(2πi/n), r_n sin(2πi/n), 1)^T ∈ R^3, where T denotes transpose and r_n := √sec(π/n); with this normalization every p(e|ω) defined below lies in [0, 1]. The unit effect is given by u := (0, 0, 1)^T. The set E(n) of all possible measurement effects is the convex hull of the zero effect, the unit effect, and the extremal effects {e_i, ē_i}_{i=0}^{n−1}, where e_i := (1/2)(r_n cos((2i + 1)π/n), r_n sin((2i + 1)π/n), 1)^T for even n, e_i := (1/(1 + r_n^2))(r_n cos(2πi/n), r_n sin(2πi/n), 1)^T for odd n, and ē_i := u − e_i. This class of models has attracted considerable interest in the recent past [33][34][35][36].
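The polygonal models are simple enough to be checked numerically. The following is a minimal sketch, not taken from the original work, that builds the states and extremal effects of Ply(n) and tabulates p(e|ω). It assumes the normalization r_n = √sec(π/n) and the (2i + 1)π/n convention for even n; the function names polygon_model and probability_table are hypothetical.

```python
import numpy as np

def polygon_model(n):
    """Return (states, effects, complements, unit) of the n-gon model.

    Assumption: r_n = sqrt(sec(pi/n)), the normalization under which
    all probabilities p(e|w) fall inside [0, 1].
    """
    r = np.sqrt(1.0 / np.cos(np.pi / n))
    i = np.arange(n)
    # Pure states: vertices of a regular n-gon in the plane z = 1.
    states = np.stack([r * np.cos(2 * np.pi * i / n),
                       r * np.sin(2 * np.pi * i / n),
                       np.ones(n)], axis=1)
    unit = np.array([0.0, 0.0, 1.0])
    if n % 2 == 0:
        ang = (2 * i + 1) * np.pi / n      # even n: effects sit between vertices
        scale = 0.5
    else:
        ang = 2 * np.pi * i / n            # odd n: effects aligned with vertices
        scale = 1.0 / (1.0 + r ** 2)
    effects = scale * np.stack([r * np.cos(ang), r * np.sin(ang), np.ones(n)], axis=1)
    complements = unit - effects           # e_bar_i = u - e_i
    return states, effects, complements, unit

def probability_table(n):
    """Rows: e_0..e_{n-1}, e_bar_0..e_bar_{n-1}; columns: pure states."""
    states, effects, complements, _ = polygon_model(n)
    return np.vstack([effects, complements]) @ states.T

if __name__ == "__main__":
    np.set_printoptions(precision=3, suppress=True)
    print(probability_table(5))
```

Each pair {e_i, ē_i} sums to u by construction, so it forms a two-outcome measurement; the table lets one read off which outcomes conclusively exclude which pure states.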
A composite system allows the possibility of a state ω_AB ∈ Ω_AB that cannot be prepared as a statistical mixture of product states, i.e.
\[ \omega_{AB} \neq \sum_i p_i \, \omega^i_A \otimes \omega^i_B , \]
with {p_i}_i being a probability distribution, ω^i_A ∈ Ω_A, and ω^i_B ∈ Ω_B. Such states are called entangled states. Entangled effects are defined similarly. Whenever such entangled states and/or entangled effects are invoked in a GPT, they must satisfy the basic self-consistency (SC) condition: any valid composition of systems, states, effects, and their transformations should produce non-negative conditional probabilities. Mathematically, however, several self-consistent compositions of the elementary system are possible when considering a multipartite system. For instance, the bipartite composition of the square-bit (squit) model Ply(4) allows four such nontrivial compositions: (i) PR model, (ii) HS model, (iii) Hybrid model, and (iv) Frozen model [37]. In the quantum case also, different (in fact infinitely many) self-consistent compositions are possible [28][29][30][38]. Among these only the quantum composite state space D(H^{⊗n}) possesses the property of self-duality [39], which in the GPT framework has recently been derived from a computational primitive [40]. With this prelude we now move to the main part of this work.
Nonlocality without entanglement.-This phenomenon is related to the multipartite state discrimination problem under local protocols. In the GPT framework the task can be defined as follows. Suppose an n-partite state chosen randomly from an ensemble is distributed among n spatially separated parties who know the ensemble but not the exact state and aim to identify it given one copy of the system. However, there are severe restrictions on their actions: they can only perform operations on their respective parts of the composite system and can communicate classically with each other. In general, the ensemble can consist of entangled states, but when studying the NWE phenomenon we consider ensembles of product states only. Before proceeding further let us discuss local operations in a bit more detail. To that end, in the following we define separable measurements in the GPT framework.

Definition 1. Consider an n-partite system

Figure 1. [Color on-line] Polygonal models Ply(4) (left) and Ply(5) (right). The normalized state plane (at z = 1) is depicted. The pentagon model is self-dual whereas the squit is not. In both models the ω_i's represent pure states. In the squit model there are only two extremal measurements, M_i ≡ {e_i, e_{i+2} | e_i + e_{i+2} = u} with i ∈ {0, 1}, whereas the pentagon model has five extremal measurements, M_i ≡ {e_i, ē_i | e_i + ē_i = u} with i ∈ {0, ..., 4}. In both models the e_i's are ray extremal effects, whereas the ē_i's are extremal effects of E(5) but not ray extremal.
Quite surprisingly, not all separable quantum measurements are locally implementable, i.e., some cannot be realized by LOCC [12]. Generally a LOCC protocol consists of multiple rounds, which makes its mathematical characterization hard in quantum theory [41]. During such a protocol, when some party performs a measurement on her/his subsystem and obtains some outcome, the question naturally arises how the given state gets updated. As noted in Ref. [42], any valid post-measurement update rule in a GPT must satisfy some basic consistency requirements imposed through Bayesian reasoning. While the von Neumann-Lüders rule in quantum theory is a consistent update rule, there is no natural way in an arbitrary GPT to come up with such a rule. Despite this, we now show that several features of the local state discrimination problem, as observed in quantum theory, have similar manifestations in GPTs.
We start with the example of asymmetric local discrimination. For composite quantum systems there exist orthogonal product states that can be perfectly discriminated locally if and only if one particular party starts the protocol [22,23]. For instance, the set is perfectly discriminable when Alice starts the protocol, but not the other way around. A similar situation is possible in other generalized probabilistic models.
Lemma 1. Consider the four states, $(4) : Fig.1). This set can be perfectly discriminated locally if and only if Alice starts the protocol.
Proof. The squit model Ply(4) allows two extremal measurements M_0 ≡ {e_0, e_2} and M_1 ≡ {e_1, e_3} (see Fig.1). Alice measures M_0 on her part. The outcome corresponding to the effect e_0 ensures that the state is one of the first two states of $(4), which Bob can discriminate through the M_1 measurement on his part; otherwise he performs M_0 and perfectly discriminates the other two states. When Bob starts the protocol, whatever measurement he performs, the outcomes divide $(4) into 1 vs 3 groups, which is impossible for Alice to perfectly discriminate.
As already mentioned, for Ply(4)^{⊗2} four different self-consistent nontrivial compositions are possible along with the trivial minimal composition Ply(4)^{⊗2}_min ≡ (Ω(4)^{⊗2}_min, E(4)^{⊗2}_min), where the state space Ω(4)^{⊗2}_min and the effect space E(4)^{⊗2}_min contain only product states and product effects, respectively. Among these, the minimal composition and the HS composition [37] are Bell local models as they contain no entangled state. However, Lemma 1 holds true in all of these five models. We now consider a more exotic manifestation of NWE. In their classic paper [12] Bennett et al. provided examples of orthogonal product bases for the (C^3)^{⊗2} and (C^2)^{⊗3} Hilbert spaces that cannot be perfectly discriminated under LOCC operations, although global separable measurements perfectly discriminate the states. To obtain a similar manifestation in GPTs we consider the tripartite system Ply(5)^{⊗3}. As with Ply(4)^{⊗2}, it would be interesting to find all possible self-consistent compositions here as well. However, the minimal composition Ply(5)^{⊗3}_min ≡ (Ω(5)^{⊗3}_min, E(5)^{⊗3}_min) suffices for our purpose.
. This set is perfectly discriminable under global operation whereas no local protocol can discriminate them exactly.
Proof. The symmetry of the states in $(5) assures that any of the parties (say Alice) can start the local discrimination protocol. Suppose that Alice performs the extremal measurement M_0 ≡ {e_0, ē_0} (see Fig. 1). Since a post-measurement update is not well defined in polygonal models, a measurement outcome should either exactly identify the given state or conclusively eliminate some possibilities. Alice's outcome e_0 ensures that the given state is none of {φ_2, φ_7, φ_8}. Similarly the outcome ē_0 excludes {φ_1, φ_5, φ_6}. Having this information, Bob and/or Charlie perform suitable measurements on their respective subsystems. At this step one of the outcomes corresponding to the best chosen measurement conclusively eliminates two states. Thus the last party has to identify the state from the remaining three states, which is impossible (a flow chart of the protocol is provided in the Appendix). If Alice measures M_1 ≡ {e_1, ē_1}, her outcome e_1 (ē_1) eliminates only one state φ_4 (φ_3), making the protocol less effective. It is not hard to see that whichever measurement Alice starts with, no perfect discrimination is possible.
The state-discriminating measurement in Theorem 1 is a separable measurement (see Definition 1). Since no local protocol can perfectly identify the states, this particular separable measurement cannot be implemented locally. Similar constructions are also possible in polygonal models with more vertices; see the Appendix for explicit constructions in the hexagonal and heptagonal models. Regarding the squit model, however, we have the following observation. This follows from the fact that an elementary squit system does not allow any pair of indistinguishable pure states. In other words, local indistinguishability among pure states is necessary for the existence of nonlocal product states as in Theorem 1.
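The difference between the squit and the pentagon noted here can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original construction: it takes r_n = √sec(π/n), the (2i + 1)π/n convention for even n, and it searches only over the extremal effects e_i and ē_i. That restriction should suffice because the effects with p(e|ω_i) = 1 and p(e|ω_j) = 0 form a face of E(n). The names polygon and indistinguishable_pairs are hypothetical.

```python
import itertools
import numpy as np

def polygon(n):
    """Pure states and extremal effects {e_i, e_bar_i} of the n-gon model.

    Assumption: r_n = sqrt(sec(pi/n)).
    """
    r = np.sqrt(1.0 / np.cos(np.pi / n))
    k = np.arange(n)
    states = np.stack([r * np.cos(2 * np.pi * k / n),
                       r * np.sin(2 * np.pi * k / n),
                       np.ones(n)], axis=1)
    if n % 2 == 0:
        ang, scale = (2 * k + 1) * np.pi / n, 0.5
    else:
        ang, scale = 2 * np.pi * k / n, 1.0 / (1.0 + r ** 2)
    e = scale * np.stack([r * np.cos(ang), r * np.sin(ang), np.ones(n)], axis=1)
    unit = np.array([0.0, 0.0, 1.0])
    return states, np.vstack([e, unit - e])

def indistinguishable_pairs(n, tol=1e-9):
    """Pairs (i, j) of pure states with no extremal effect giving p=1 on i and p=0 on j."""
    states, effects = polygon(n)
    p = effects @ states.T                  # p[a, i] = p(effect a | state i)
    pairs = []
    for i, j in itertools.combinations(range(n), 2):
        separated = np.any((np.abs(p[:, i] - 1.0) < tol) & (np.abs(p[:, j]) < tol))
        if not separated:
            pairs.append((i, j))
    return pairs

if __name__ == "__main__":
    print("squit   :", indistinguishable_pairs(4))
    print("pentagon:", indistinguishable_pairs(5))
```

Under these assumptions the squit yields no indistinguishable pair, while in the pentagon exactly the neighbouring vertices ω_i, ω_{i+1} fail to be perfectly distinguishable.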
So far we have studied different aspects of NWE in the broader mathematical framework of GPTs. Naturally the question arises how to compare the strength of NWE in different GPTs. To do so, first note that the elementary systems considered in different GPTs must be of the same type. Recall that the phenomenon of NWE fundamentally demonstrates a difference between local and global operations in extracting classical information encoded in product states. Thus, to be on a similar footing, the different systems must have the same 'classical information storage capacity', a notion recently studied for quantum systems in Ref. [43] and generalized to GPTs in Ref. [37] under the name 'signaling dimension'. The concept can be understood with the following communication scenario. Given two finite alphabets X = {x} and Y = {y} of cardinalities m and n respectively, let P^{m→n}_{Sys} denote the convex set of all m-input/n-output conditional probability distributions generated by transmitting an elementary system Sys from a sender to a receiver who may have pre-shared randomness. In such a scenario the signaling dimension is defined as follows.
Definition 2. (Dall'Arno et al., PRL 119, 020401) The signaling dimension of a system Sys, denoted by κ(Sys), is defined as the smallest integer d such that P^{m→n}_{Sys} ⊆ P^{m→n}_{C_d} for all m and n.
Here, P^{m→n}_{C_d} denotes the set of all m-input/n-output conditional probability distributions obtained by means of a d-dimensional classical noiseless channel and shared random data. Suppose that Sys^1 and Sys^2 are elementary systems of two different theories having identical signaling dimension and let (Sys^i)^{⊗n} be their n-partite self-consistent compositions, with i ∈ {1, 2}. Consider now sets of product states (having the same cardinality) in both these theories that cannot be distinguished locally while the respective global measurements discriminate the states perfectly. The quantity ∆[i] := 1 − P_L[i] amounts to the gap between the global and local success probabilities in discriminating the states, where P_L[i] is optimized over all local protocols allowed in the i-th theory. If it turns out that ∆[1] > ∆[2], then we say that Theory-1 exhibits stronger NWE than Theory-2. In quantum theory, for both the systems (C^3)^{⊗2} and (C^2)^{⊗3} we have sets of product states with cardinality 8 that exhibit the NWE phenomenon [12]. However, the example of (C^2)^{⊗3} is comparable with that of Theorem
Proof of the theorem is provided in the Appendix. Here it is worth mentioning that for the pentagon model the local success probability is optimized over all possible local protocols, which consist of 1-way protocols only. The corresponding quantum value is also obtained under a 1-way LOCC protocol. More general local protocols (involving 2-way LOCC) may further decrease the value of ∆[QT], which in general is difficult to optimize [44]. From the structure of the constructions it arguably follows that other polygonal models also have ∆ = 1/8. Furthermore, we note that if, instead of the uniform prior distribution of the states {φ_i}_{i=1}^{8}, one considers a biased prior distribution (a more feasible situation for experimental purposes), then the limited NWE behavior of quantum theory can also be established (see the Appendix).
Importantly, the continuity of the quantum state space plays the crucial role in producing the limited NWE behavior of quantum theory compared to the discrete polygonal models.
One may ask for the GPT analog of the nonlocal product bases in (C^3)^{⊗2} [12]. Our impression at this point is that no such analog is possible in bipartite compositions of the polygonal models. This is due to the fact that all the polygonal models have signaling dimension 2 whereas that of the qutrit system is 3. Of course, a rigorous mathematical proof of this intuition would be of interest. Such a result would generalize Theorem 4 of Ref. [23] to the GPT framework.
Discussion.-In this work we study the 'nonlocality without entanglement' phenomenon in the broader mathematical framework of generalized probabilistic theories. We show that this particular nonclassical behavior in quantum theory is limited as compared to other GPT models. This, in a sense, is similar to the limited Bell nonlocality of quantum theory as observed by Rohrlich and Popescu in their seminal work [5]. In fact, Rohrlich and Popescu proposed to take this limited Bell nonlocal behavior (along with relativistic causality) as an axiom for deriving quantum theory. Subsequently, several information theoretic and physical principles have been proposed to explain the limited Bell nonlocality of quantum theory [6][7][8][9][10][11], and its connection with other quantum features has also been established [45][46][47][48]. In our work, we observe that the limited nonlocality without entanglement of quantum theory is owing to the continuity of the quantum state space structure, which is presumed in axiomatic derivations of quantum theory either directly [24] or through other assumptions such as 'reversible transformation' [49] or 'purification' [50]. However, the present work shows that the same feature can be obtained as an emergent fact if we presume limited NWE as a fundamental characteristic of the theory. It therefore invites novel information theoretic or physical primitive(s) to explain this limited NWE behavior of quantum theory.
Our work also motivates further research on other exotic features of the NWE phenomenon in the GPT framework. For instance, the phenomenon of NWE in quantum theory was first anticipated by Peres and Wootters [51]. They conjectured that LOCC measurements are suboptimal for discrimination of a specific set of nonorthogonal product states (see also [52]). More recently, the authors of Ref. [53] revisited the classic problem of Peres and Wootters and proved that their conjecture is indeed true. A similar example in the GPT framework is yet to be obtained. On the other hand, multipartite generalizations of the NWE phenomenon have been studied very recently [54,55]. A similar study in the GPT framework may provide further insight into the structure of quantum theory.
Ω is considered to be topologically closed, i.e., there is no physical distinction between states that can be prepared exactly and states that can be prepared to arbitrary accuracy. Furthermore, finite dimensionality of V guarantees compactness of Ω.

ACKNOWLEDGMENTS
(B) Effect space: Effects are linear functionals on Ω that map each state onto a probability. The set of all such linear functionals is denoted as Ω* ⊂ V*_+. The framework of GPTs may assume, a priori, that not all mathematically well-defined states and observables are physically implementable. For example, the set of physically allowed effects E may be a strict subset of Ω*. A theory in which all elements of Ω* are allowed effects is called 'dual'. The property of duality is often assumed as a starting point in derivations of quantum theory and referred to as the 'no-restriction hypothesis' [50]. However, recently it has been shown that the set of 'almost-quantum correlations' violates the no-restriction hypothesis [56]. A d-outcome measurement M is specified by a collection of d effects, i.e., M ≡ {e_j | ∑_j e_j = u}. In the GPT framework the notion of distinguishability is defined in the following sense.
(C) Transformation: Transformations map states into states (and effects into effects), i.e., T : V → V, with T(V_+) ⊆ V_+. They are linear and preserve statistical mixtures. They cannot increase the total probability, but are allowed to decrease it.
(D) Joint system: For subsystems Sys(A) ≡ (Ω_A, E_A) and Sys(B) ≡ (Ω_B, E_B) a GPT also specifies their composition Sys(AB) ≡ (Ω_AB, E_AB). Clearly Ω_AB is convex by definition. In general, one can imagine many weird and wonderful relations among Ω_A, Ω_B, and Ω_AB. One can, however, narrow down the possibilities by imposing the following quite natural conditions: (i) Every joint state ω_AB ∈ Ω_AB should assign a joint probability to each pair of effects (e_A, e_B), with e_A ∈ E_A and e_B ∈ E_B.
(ii) Joint probabilities must respect the no-signaling condition, i.e., the marginal outcome probabilities of the measurements on B should not depend on the measurement choice on A and vice versa.
(iii) The joint probabilities for the pairs of effects (e_A, e_B) specify the joint state. This condition is known as the local tomography condition [24]. Real-vector-space quantum theory does not satisfy this condition [57].
Any joint state space Ω AB satisfying the aforesaid requirements is embedded in the positive cone of V A ⊗ V B . Furthermore, it must lie between two extremes, the maximal and the minimal tensor product [28][29][30].
Definition 4. The maximal tensor product Ω_A ⊗_max Ω_B is the set of all bilinear functionals φ : V*_A × V*_B → R such that (i) φ(e_A, e_B) ≥ 0 for all e_A ∈ E_A and e_B ∈ E_B and (ii) φ(u_A, u_B) = 1, where u_A and u_B are the unit effects for systems A and B respectively. The maximal tensor product has an important operational characterization: it is the largest set of states assigning probabilities to all product measurements but not allowing signaling.
Definition 5. The minimal tensor product Ω A ⊗ min Ω B , is the convex hull of the product states, where a product

Polygon theories (A) Elementary systems:
We denote the system as Ply(n) ≡ (Ω(n), E(n)). For a fixed n, Ω(n) is the convex hull of n pure states {ω_i}_{i=0}^{n−1} with ω_i := (r_n cos(2πi/n), r_n sin(2πi/n), 1)^T ∈ R^3, where T denotes transpose and r_n := √sec(π/n). The unit effect is given by u := (0, 0, 1)^T. The set E(n) is the convex hull of the zero effect, the unit effect, and the extremal effects {e_i, ē_i}_{i=0}^{n−1}, where e_i := (1/2)(r_n cos((2i − 1)π/n), r_n sin((2i − 1)π/n), 1)^T for even n and e_i := (1/(1 + r_n^2))(r_n cos(2πi/n), r_n sin(2πi/n), 1)^T for odd n. The pure effects {e_i}_{i=0}^{n−1} correspond to exposed rays and consequently to the extreme rays of V*_+ [58]. For odd n, due to self-duality of the state cone V_+ and its effect cone V*_+ [59], every pure effect e_i has a one-to-one ray-correspondence to the pure state ω_i. Consequently, for every pure state ω_i there exist exactly two other pure states, ω_{i+(n−1)/2} and ω_{i+(n+1)/2} (indices taken modulo n), each of which is perfectly distinguishable from ω_i. The discriminating measurement consists of the effects {e_i, ē_i} such that p(e_i|ω_i) = 1 and p(e_i|ω_{i+(n±1)/2}) = 0. The effects {ē_i}_{i=0}^{n−1} are extremal elements of E(n) but they are not ray extremal, i.e., they do not lie on an extremal ray of the cone V*_+. For even n, the scenario is quite different as the self-duality between V_+ and V*_+ is absent. Here all the e_i's and their complementary effects ē_i's correspond to extreme rays of V*_+.
(a) Construction in P ly (6) ⊗3 min ≡ Ω(6) ⊗3 min , E (6) ⊗3 min : the set of states are given by, where each ω i ∈ Ω(6). Local indistinguishability follows like in Proposition-3. The separable measurement that perfectly discriminate this set is given by, where each e i ∈ E (6). It is a straightforward observation that p(E i |φ j ) = δ ij (see Fig.2).

Strength of NWE in different theories
Pentagon model: We are interested in the optimal success probability of local discrimination of the states,

Figure 3. [Color on-line] The optimal local protocol for distinguishing the set $(5). In the last step, if there is a hit in a square shaped box the state is conclusively identified, else ambiguity arises. The flow chart of optimal local discrimination looks similar for the other polygonal models.
As discussed in the proof of Proposition-2, due to party symmetry any party can start the local discrimination protocol. Starting with Alice's measurement M_0 = {e_0, ē_0}, the full discrimination protocol is diagrammatically depicted in Fig. 3. The protocol shows that out of 8 states 6 can be discriminated perfectly whereas confusion arises for the other 2 states. Since the states are given with uniform probability, the success probability turns out to be
\[ P^{succ}_{L} = \frac{6}{8} \times 1 + \frac{2}{8} \times \frac{1}{2} = \frac{7}{8} . \]
It follows from the proof of Proposition-2 that no other strategy yields a greater success probability. Therefore we have
\[ \Delta = 1 - P^{succ}_{L} = \frac{1}{8} . \]
It is not hard to see that in the other polygonal models we also have ∆ = 1/8.
Quantum theory: A similar example in quantum theory is provided by Bennett et al. in a 3-qubit system [12]. The set of states is given by, where {|0⟩, |1⟩} are the eigenstates of the Pauli σ_z operator and |±⟩ := (|0⟩ ± |1⟩)/√2. In general it is very hard to characterize the set of LOCC operations in quantum theory [41]. It is also difficult to find the optimal discrimination probability under such protocols [44]. So for discriminating the set Q, we first consider a 1-way LOCC protocol, where one of the parties starts the protocol, then based on her/his result one of the remaining two parties performs some local operation, and finally the third party performs a local operation and tries to guess the state.
Due to the symmetry of the construction, for the set Q any of the parties can start the protocol. If Alice starts by performing a measurement in the Pauli σ_z basis and they follow a protocol like the one in the pentagon model, then the success probability turns out to be 7/8 (to see this, replace M_0 by σ_z and M_1 by σ_x in the flow chart of Fig. 3). Instead, let Alice perform a measurement in the {|θ⟩, |θ^⊥⟩} basis (see Fig. 4). Based on her outcome |θ⟩ or |θ^⊥⟩ she divides the states into two groups G_1 ≡ {φ_1, φ_3, φ_5, φ_6} and G_2 ≡ {φ_2, φ_4, φ_7, φ_8}, respectively, and informs Bob and Charlie of her outcome. Note that both these groups are perfectly locally discriminable by Bob and Charlie. Therefore errors occur at Alice's step only. An error can occur in two ways: the outcome |θ⟩ may click even when the state is from G_2, and similarly |θ^⊥⟩ may click even when the state is from G_1. Summing these two contributions and optimizing over θ, we obtain P_err = (1/8)(4 − √10) at θ = tan^{−1}(1/3), and thus P^{succ}_L = 1 − P_err = (1/8)(4 + √10), which subsequently yields ∆[QT] ≤ (1/8)(4 − √10) ≈ 0.105 < 1/8.
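The optimization over θ can be reproduced with a short numerical sketch. Because the explicit listing of the set Q is not reproduced above, the code assumes that Q is the 3-qubit product basis {|000⟩, |111⟩, |01±⟩, |1±0⟩, |±01⟩} of Ref. [12], that G_1 collects the states whose first qubit is |0⟩ or |+⟩ and G_2 those with |1⟩ or |−⟩, and that θ is the Bloch angle of |θ⟩ = cos(θ/2)|0⟩ + sin(θ/2)|1⟩. These choices and the names KET, G1_ALICE, G2_ALICE, and error_probability are assumptions made for illustration.

```python
import numpy as np

# Single-qubit kets; |+> and |-> are the sigma_x eigenstates.
KET = {
    "0": np.array([1.0, 0.0]),
    "1": np.array([0.0, 1.0]),
    "+": np.array([1.0, 1.0]) / np.sqrt(2),
    "-": np.array([1.0, -1.0]) / np.sqrt(2),
}

# ASSUMPTION: Alice's local states in the two groups (the labelling is hypothetical).
G1_ALICE = ["0", "0", "0", "+"]   # intended outcome |theta>
G2_ALICE = ["1", "1", "1", "-"]   # intended outcome |theta_perp>

def error_probability(theta):
    """Error of the 1-way protocol when Alice measures {|theta>, |theta_perp>}.

    Assumes uniform priors 1/8 and that Bob and Charlie never err once
    Alice has announced the correct group.
    """
    theta = np.asarray(theta, dtype=float)
    th = np.stack([np.cos(theta / 2), np.sin(theta / 2)])
    th_perp = np.stack([-np.sin(theta / 2), np.cos(theta / 2)])
    # A G1 state errs when |theta_perp> clicks; a G2 state errs when |theta> clicks.
    wrong_g1 = sum((KET[a] @ th_perp) ** 2 for a in G1_ALICE)
    wrong_g2 = sum((KET[a] @ th) ** 2 for a in G2_ALICE)
    return (wrong_g1 + wrong_g2) / 8.0

thetas = np.linspace(0.0, np.pi / 2, 20001)
errors = error_probability(thetas)
best = int(np.argmin(errors))

if __name__ == "__main__":
    print("theta* =", thetas[best], " arctan(1/3) =", np.arctan(1 / 3))
    print("P_err  =", errors[best], " (4 - sqrt(10))/8 =", (4 - np.sqrt(10)) / 8)
```

Under these assumptions the error reduces to (1/8)(4 − 3 cos θ − sin θ), so θ = 0 (the σ_z measurement) gives an error of 1/8, and the minimum on the grid sits near θ = tan^{−1}(1/3).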

Ensemble with biased probability distribution
In the above study we have considered that the set of states {φ_i}_{i=1}^{8} in Eqs.(B1),(B3),(B5), and (B8) are given with a uniform probability distribution. This is a rather idealistic demand for practical purposes. Here we assume that the states are chosen with a biased probability distribution, i.e., we consider an ensemble {p_i, φ_i}_{i=1}^{8} such that p_i > 0 ∀ i and ∑_{i=1}^{8} p_i = 1. Suppose that p 4 = p 4 = p and the rest are (1 − 2p)/6. As discussed earlier, due to the party symmetric nature of the set {φ_i}_{i=1}^{8}, in the uniform distribution case any of the parties can start the local discrimination protocol. However, for For 0 < p ≤ 1/8 the protocol (a) is advantageous whereas for 1/8 ≤ p < 1/2 the protocol (b) turns out to be superior. Quantum theory: (a) If Alice follows a protocol as of Fig 4, 5).