Optimal and tight Bell inequalities for state-independent contextuality sets

Two fundamental quantum resources, nonlocality and contextuality, can be connected through Bell inequalities that are violated by state-independent contextuality (SI-C) sets. These Bell inequalities allow for applications that require simultaneous nonlocality and contextuality. However, for existing Bell inequalities, the nonlocality produced by SI-C sets is very sensitive to noise. This precludes experimental implementation. Here we identify the Bell inequalities for which the nonlocality produced by SI-C sets is optimal, i.e., maximally robust to either noise or detection inefficiency, for the simplest SI-C [S. Yu and C. H. Oh, Phys. Rev. Lett. 108, 030402 (2012)] and Kochen-Specker sets [A. Cabello et al., Phys. Lett. A 212, 183 (1996)] and show that, in both cases, nonlocality is sufficiently resistant for experiments. Our work enables experiments that combine nonlocality and contextuality and therefore paves the way for applications that take advantage of their synergy.

Here we address the problem of combining nonlocality and contextuality in the same experiment. This will allow us to tackle tasks that cannot be accomplished using either nonlocality or contextuality individually. To this end, we consider the scenario depicted in Fig. 1, involving three nodes (Alice, Bob, and Charlie). A source of entangled pairs of particles is placed between Alice and Bob, which they use to produce nonlocal correlations. Furthermore, we assume that the measurements that Bob performs are nondemolition projective (also known as ideal [22]) measurements and that Charlie performs additional measurements on Bob's particle [23][24][25][26][27][28][29] (see Fig. 1). We aim at producing contextuality between Bob and Charlie using the same state and measurements that Bob uses for producing nonlocality with Alice. We refer to this target as simultaneous nonlocality and contextuality (SNC).
The straightforward application of SNC is employing two protocols with quantum advantage in the same experiment. These could be, for example, nonlocality-based secret communication [7] and a contextuality-based communication complexity protocol with quantum advantage [16]. In addition, SNC is important by itself, as there are applications that require both nonlocality and contextuality to achieve tasks that neither of them can accomplish individually [28]. For example, combining nonlocality- and contextuality-based self-testing [17,18] might facilitate certification of quantum transformations produced by Bob's device [30]. Finally, a third motivation for SNC is investigating the connections between nonlocality and contextuality [31].

* junior.gonzales@fysik.su.se † ana.predojevic@fysik.su.se ‡ adan@us.es
Simultaneous nonlocality and contextuality cannot be produced by simply combining the violation of the simplest Bell inequality, the Clauser-Horne-Shimony-Holt inequality [32], between Alice and Bob, and the violation of the simplest noncontextuality inequality, the Klyachko-Can-Binicioglu-Shumovsky inequality [33], between Bob and Charlie. The reason is that, in this case, there is a fundamental trade-off between nonlocality and contextuality [24,25,29]. However, it has been recently shown [34] that SNC is possible if all parties choose their measurements from any state-independent contextuality (SI-C) set [35,36]. A SI-C set contains two-outcome observables represented by rank-one projectors and produces contextual correlations (i.e., violates a given noncontextuality inequality) no matter what the initial quantum state is. In particular, a SI-C set produces contextuality also when the initial state is mixed, as is the case for the reduced state of Bob's particle before he performs his measurement (see Fig. 1). State-independent contextuality sets have been demonstrated experimentally [37][38][39] and can be considered fundamental quantum resources on their own.
The first SI-C set identified had 117 observables in dimension d = 3 and was used by Kochen and Specker to prove the KS theorem of impossibility of hidden variables [4]. State-independent contextuality sets that have the properties needed to prove the KS theorem are called KS sets (see the Supplemental Material [22]). Recently, it has been shown [40] that the simplest KS set has 18 observables in dimension d = 4 [41]. This set, here called KS18, is shown in Fig. 2(a). The optimal (i.e., maximally violated by KS18, for any state, including states with an arbitrary degree of noise) and tight noncontextuality inequalities (i.e., separating the set of noncontextual and contextual correlations) for KS18 are known [35,42,43].
While every KS set is a SI-C set, not every SI-C set is a KS set (see the Supplemental Material [22]). The simplest [44,45] SI-C set is the one with 13 observables in dimension d = 3 found by Yu and Oh [46] and shown in Fig. 3(a). The Yu-Oh set is not a KS set [22]. The optimal and tight noncontextuality inequalities for the Yu-Oh set are also known [43].
The correlations produced by measuring any SI-C set in dimension d on a two-qudit maximally entangled state violate a Bell inequality constructed from the SI-C set [41]. However, such inequalities are neither optimal (in this case meaning maximally resistant to either noise or detection inefficiency [47]) nor tight Bell inequalities (i.e., separating the set of local and nonlocal correlations [48]). Moreover, these inequalities do not allow for experimental Bell tests because nonlocality with respect to them is very sensitive to noise, which prevents experimental implementations and in particular those with spacelike separation. On the other hand, tightness is important for both fundamental and practical reasons [49][50][51][52][53].
The fact that the optimal and tight Bell inequalities are not known for any SI-C set contrasts with the fact that, as it was pointed out before, the optimal and tight noncontextuality inequalities for KS18 and the Yu-Oh set were already identified. This means that, in the scenario shown in Fig. 1, the optimal witnesses for detecting contextuality between Bob and Charlie using the most fundamental SI-C sets are known, but the optimal witnesses for detecting nonlocality between Alice and Bob are still missing.
The aim of this work is to identify the optimal and tight Bell inequalities for the correlations produced by measuring KS18 and the Yu-Oh set on maximally entangled states. Hereafter, we will refer to these correlations as KS18 correlations and Yu-Oh correlations, respectively.
Our motivation is rooted, first, in having Bell inequalities that can be exploited and deployed in experiments requiring spacelike separation and that enable the development of SNC and its applications. Second, we are motivated by the fact that optimal and tight Bell inequalities for SI-C sets are by themselves fundamental. On the one hand, they provide the optimal way of using a fundamental quantum resource (a SI-C set) for producing a fundamental quantum effect (nonlocality). On the other hand, they allow proving Bell's theorem [1] through the violation of Bell inequalities inspired by the KS theorem [4], thus connecting these two fundamental theorems.

FIG. 2. [...] (5). Color coding is used to emphasize that the coefficients in $I^{t}_{\text{KS18}}$ share the same symmetries as the graph shown in (a). The entries with white background correspond to graph nodes and edges shown in (a). The coefficients of the entries with white background are also color coded. The coefficients associated with the corresponding edges have the same color as used in (a) (red, blue, and black). The coefficients associated with nonadjacent nodes [not shown in (a)] have entries with three different backgrounds (orange, violet, and cyan), one for each of the three orbits of nonadjacent nodes in (a) (see the Supplemental Material [22]).

FIG. 3. (a) Yu-Oh set and its graph of compatibility. Each vector $v_i$ of the Yu-Oh set is represented by a node. Orthogonal vectors, which correspond to compatible observables, are represented by adjacent nodes. Same color nodes (edges) are equivalent (see the Supplemental Material [22]). (b) Bell operator $I^{t}_{\text{Yu-Oh},V}$. The Bell inequality $I^{t}_{\text{Yu-Oh},V} \le 12$ is tight and provides maximum resistance to noise for the Yu-Oh correlations. The coefficients of $I^{t}_{\text{Yu-Oh},V}$ are presented with the aid of a matrix of the form (5). Color coding is used to emphasize that the coefficients in $I^{t}_{\text{Yu-Oh},V}$ share the same symmetries as the graph shown in (a). The entries with white background correspond to graph nodes and edges. The coefficient associated with each of the nodes (edges) has the same color as used in (a) (red, blue, black, and green). The coefficients associated with nonadjacent nodes have entries with five different backgrounds (brown, violet, cyan, orange, and magenta), one for each of the five orbits of nonadjacent nodes [not shown in (a)] (see the Supplemental Material [22]).
Methods.-The set of local correlations for the Bell scenario with two parties, m measurement settings, and two outcomes, called the (2, m, 2) Bell scenario, is a polytope, called the local polytope, that has $2^{2m}$ vertices [48]. For the KS18 correlations, m = 18, which gives $2^{36} \approx 6.9 \times 10^{10}$ vertices. For the Yu-Oh correlations, m = 13, which gives $2^{26} \approx 6.7 \times 10^{7}$ vertices. This makes finding optimal and tight Bell inequalities difficult (see the Supplemental Material [22]).
To address this, we developed a three-step approach. In the first step, we identify Bell inequalities for which the nonlocality of the KS18 or Yu-Oh correlations has high resistance to noise or detection inefficiency. In the second step, we verify whether these inequalities are tight and if not we use them to construct tight inequalities. In the third step, we verify whether the resulting inequalities are maximally robust to either white noise or detection inefficiency, respectively.
In the first step, we implement a numerical technique based on Gilbert's algorithm for quadratic minimization [54]. This iterative algorithm minimizes the distance between a given matrix of correlations and the local polytope and yields a Bell inequality [55][56][57] (see the Supplemental Material [22] for details).
Depending on the type of robustness we want, we adopt a different approach. To obtain Bell inequalities with high resistance to white noise, we assume that the state shared by Alice and Bob is of the form

$$\rho(V) = V\,|\psi\rangle\langle\psi| + \frac{1-V}{d^{2}}\,\mathbb{1}, \qquad (1)$$

where $|\psi\rangle = \frac{1}{\sqrt{d}} \sum_{j=0}^{d-1} |jj\rangle$, $\mathbb{1}$ is the identity matrix, d is the dimension of the local subsystems (d = 4 and 3 for the KS18 and Yu-Oh correlations, respectively), and V is called the visibility. For any state of the form (1), the joint probability that Alice obtains outcome 1 for measurement $\Pi_i$ (with possible outcomes 0 and 1) on her particle and Bob obtains the outcome 1 for measurement $\Pi_j$ on his particle is

$$P(\Pi^{A}_{i} = 1, \Pi^{B}_{j} = 1) = \frac{V}{d}\,\mathrm{Tr}\!\left(\Pi_i \Pi_j^{T}\right) + \frac{1-V}{d^{2}}, \qquad (2)$$

where T denotes transposition in the basis $\{|j\rangle\}$. Similarly, since the projectors are rank one, the marginal probability that each of the parties obtains outcome 1 for measurement $\Pi_i$ is

$$P(\Pi_i = 1) = \frac{1}{d}.$$

For a given Bell inequality, we denote by $V_{\rm crit}$ the minimum value of V required to violate the inequality with the state (1).

To obtain Bell inequalities resistant to detection inefficiency, we assume that the source of pairs is heralded, the initial state is $|\psi\rangle$, and each of the parties assigns the outcome 0 when they fail to detect the particle [47]. Then

$$P_{\eta}(\Pi^{A}_{i} = 1, \Pi^{B}_{j} = 1) = \frac{\eta^{2}}{d}\,\mathrm{Tr}\!\left(\Pi_i \Pi_j^{T}\right), \qquad P_{\eta}(\Pi_i = 1) = \frac{\eta}{d},$$

where η is the detection efficiency; η is assumed to be the same for all parties, measurements, and outcomes. For each correlation (i.e., state and measurements) violating a Bell inequality, there is a critical value of the detection efficiency $\eta_{\rm crit}$ above which local models cannot simulate the quantum correlations [47].
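The probabilities above can be checked numerically. The following is a minimal sketch, assuming NumPy and real (unnormalized) vectors defining the rank-one projectors; the function names `noisy_state`, `projector`, `joint_prob`, and `marginal_prob` are illustrative and do not come from the paper or its code [86].

```python
import numpy as np

def noisy_state(d, V):
    """rho(V) = V |psi><psi| + (1 - V)/d^2 * identity, with |psi> = sum_j |jj> / sqrt(d)."""
    psi = np.eye(d).reshape(d * d) / np.sqrt(d)
    return V * np.outer(psi, psi) + (1 - V) * np.eye(d * d) / d**2

def projector(v):
    """Rank-one projector onto a real vector v (normalization handled here)."""
    v = np.asarray(v, dtype=float)
    return np.outer(v, v) / (v @ v)

def joint_prob(rho, Pi_a, Pi_b, eta=1.0):
    """Probability that both parties obtain outcome 1; a lost particle counts as outcome 0."""
    return eta**2 * np.trace(rho @ np.kron(Pi_a, Pi_b)).real

def marginal_prob(rho, Pi_a, eta=1.0):
    """Probability that Alice alone obtains outcome 1."""
    d = Pi_a.shape[0]
    return eta * np.trace(rho @ np.kron(Pi_a, np.eye(d))).real

# Usage: two orthogonal vectors in d = 3 and visibility V = 0.8.
rho = noisy_state(3, 0.8)
P, Q = projector([1, 0, 0]), projector([0, 1, -1])
print(joint_prob(rho, P, P), joint_prob(rho, P, Q), marginal_prob(rho, P))
```

For orthogonal vectors only the white-noise term $(1-V)/d^{2}$ survives, which is why noise directly degrades the perfect exclusivity that the Bell inequalities rely on.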
At the end of the first step, we have Bell inequalities with respect to which the KS18 or Yu-Oh correlations are robust to either noise or detection inefficiency. In the second step, we check whether these inequalities are tight. To this end, we collect all the vertices that saturate the local bound and extract from them the largest set of affinely independent vectors. If this set contains D vectors, then they span an affine subspace of dimension D − 1 (the polytope is in $\mathbb{R}^{D}$), hence a facet of the local polytope, so the Bell inequality is tight [50,58].
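The tightness test can be illustrated on a scenario small enough to enumerate. The sketch below is an assumption-laden toy, not the authors' code: it uses the (2, 2, 2) scenario, orders the coordinates as (Alice's marginals, Bob's marginals, joint terms), and computes the affine rank by appending a column of ones. The names `cg_vertices` and `is_tight` are hypothetical.

```python
import itertools
import numpy as np

def cg_vertices(m):
    """All 2^(2m) deterministic strategies as points (pA, pB, pAB) in R^(m^2 + 2m)."""
    pts = []
    for a in itertools.product((0, 1), repeat=m):
        for b in itertools.product((0, 1), repeat=m):
            pts.append(np.concatenate([a, b, np.outer(a, b).ravel()]))
    return np.array(pts, dtype=float)

def is_tight(coeffs, verts, tol=1e-9):
    """Return (local bound, number of saturating vertices, tightness flag)."""
    values = verts @ coeffs
    bound = values.max()
    sat = verts[values > bound - tol]
    # Affine rank = linear rank after appending a constant coordinate.
    affine_rank = np.linalg.matrix_rank(np.hstack([sat, np.ones((len(sat), 1))]))
    return bound, len(sat), bool(affine_rank == verts.shape[1])

verts = cg_vertices(2)
# Clauser-Horne: p(11|00) + p(11|01) + p(11|10) - p(11|11) - pA(1|0) - pB(1|0) <= 0
ch = np.array([-1, 0, -1, 0, 1, 1, 1, -1], dtype=float)
# Sum of two trivial facets: valid, saturated by more vertices, but not a facet.
not_tight = np.array([-2, 0, 0, 0, 1, 1, 0, 0], dtype=float)
print(is_tight(ch, verts), is_tight(not_tight, verts))
```

The second inequality shows that counting saturating vertices is not enough: it is saturated by ten vertices, yet they only span an affine subspace of dimension 6.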
However, in most cases the Bell inequalities obtained after the first step are not tight. We then use them to obtain tight inequalities. For that, we exploit three facts. (i) When the inequalities obtained after the first step are written using the Collins-Gisin parametrization [59] (explained below), their coefficients display symmetries that allow us to reduce the number of independent coefficients. (ii) The vertices of the local polytope that saturate the local bound have an orthogonal subspace of dimension 1. Therefore, the linear combination of all these vertices must be a vector with at most one component equal to zero. Otherwise there would be at least two linearly independent vectors that are orthogonal to all the vertices, leading to an orthogonal subspace of at least dimension 2. (iii) A facet of a polytope in $\mathbb{R}^{D}$ must be saturated by at least D vertices. Otherwise, this facet could not contain D affinely independent vectors [60,61]. (See the Supplemental Material [22] for details.)

Finally, the third step of our method consists in proving that the inequalities obtained after the second step are optimal with respect to white noise or detection inefficiency. In order to do so, we identify local models that, for the critical values of detection efficiency $\eta_{\rm crit}$ and visibility $V_{\rm crit}$, reproduce the KS18 or Yu-Oh correlations. (See the Supplemental Material [22] for details.)

The Collins-Gisin parametrization follows from the fact that any Bell inequality with two-outcome measurements can be written as

$$I = \sum_{i=1}^{m} c^{A}_{i}\,P(\Pi^{A}_{i} = 1) + \sum_{j=1}^{m} c^{B}_{j}\,P(\Pi^{B}_{j} = 1) + \sum_{i,j=1}^{m} c_{ij}\,P(\Pi^{A}_{i} = 1, \Pi^{B}_{j} = 1) \le L,$$

where the coefficients can be arranged in a matrix as

$$\left(\begin{array}{c|ccc} & c^{B}_{1} & \cdots & c^{B}_{m} \\ \hline c^{A}_{1} & c_{11} & \cdots & c_{1m} \\ \vdots & \vdots & \ddots & \vdots \\ c^{A}_{m} & c_{m1} & \cdots & c_{mm} \end{array}\right), \qquad (5)$$

and L is the upper bound of I for local models.
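To make the matrix arrangement concrete, the sketch below evaluates a Bell expression stored in Collins-Gisin form and computes its local bound. It assumes the layout with Bob's marginal coefficients in the first row and Alice's in the first column; the helper names `bell_value` and `local_bound` are hypothetical. The bound uses a standard shortcut (enumerate Alice's $2^{m}$ deterministic strategies and let Bob respond optimally setting by setting), which is not necessarily how the authors computed it.

```python
import itertools
import numpy as np

def bell_value(M, pA, pB, pAB):
    """Value of the Bell expression for marginals pA, pB and joint probabilities pAB."""
    M = np.asarray(M, dtype=float)
    return M[1:, 0] @ pA + M[0, 1:] @ pB + np.sum(M[1:, 1:] * pAB)

def local_bound(M):
    """Exact local bound from an (m+1) x (m+1) Collins-Gisin matrix; M[0, 0] is ignored."""
    M = np.asarray(M, dtype=float)
    cB, cA, C = M[0, 1:], M[1:, 0], M[1:, 1:]
    best = -np.inf
    for a in itertools.product((0, 1), repeat=len(cA)):
        a = np.array(a)
        gain = cB + a @ C          # Bob's gain for answering 1 on each of his settings
        best = max(best, cA @ a + gain[gain > 0].sum())
    return best

# Clauser-Horne expression in Collins-Gisin form (local bound 0).
CH = [[0, -1, 0],
      [-1, 1, 1],
      [0, 1, -1]]
# A Popescu-Rohrlich box, used here only as a simple nonlocal test point.
pA = pB = np.array([0.5, 0.5])
pAB = np.array([[0.5, 0.5], [0.5, 0.0]])
print(local_bound(CH), bell_value(CH, pA, pB, pAB))
```

For m = 18 this enumeration visits $2^{18}$ strategies instead of $2^{36}$ vertices, which is what makes an exact check of the local bound feasible on a laptop.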
Results.-Using the methods described before, we have obtained five Bell inequalities: two optimal and tight Bell inequalities for the Yu-Oh correlations and two optimal and one tight Bell inequalities for the KS18 correlations.
The tight inequalities for the (2, 13, 2) Bell scenario are [...], where $I^{t}_{\text{Yu-Oh},V}$ is given in Fig. 3(b) and $I^{t}_{\text{Yu-Oh},\eta}$ in the Supplemental Material [22]. The subindex Yu-Oh indicates the correlations used to obtain the inequality. The subindex V or η indicates that the correlations are maximally resistant to either noise or detection inefficiency, respectively. The superindex t indicates that the inequality is tight. The Yu-Oh correlations yield [...]. The critical visibility for $I^{t}_{\text{Yu-Oh},V}$ and the critical detection efficiency for $I^{t}_{\text{Yu-Oh},\eta}$ are $V_{\rm crit} = 0.7917$ and $\eta_{\rm crit} =$ [...], respectively, which, on the one hand, are a significant improvement compared to the values in [34], namely, $V_{\rm crit} = 0.9578$ and $\eta_{\rm crit} = 0.9710$, respectively (see the Supplemental Material [22] for details), and, on the other hand, are within the reach of currently attainable visibilities in experiments with high-dimensional systems [62][63][64][65][66] and current detection efficiencies for photons [67].

We have also obtained three Bell inequalities for the (2, 18, 2) Bell scenario, [...], where $I^{t}_{\text{KS18}}$ is given in Fig. 2(b) and $I_{\text{KS18},V}$ and $I_{\text{KS18},\eta}$ are given in the Supplemental Material [22]. The KS18 correlations yield [...]. The critical visibility for $I_{\text{KS18},V}$ and the critical detection efficiency for $I_{\text{KS18},\eta}$ are [...], respectively, which are a significant improvement over the values in [34], namely, $V_{\rm crit} = 0.9317$ and $\eta_{\rm crit} = 0.9428$, respectively (see the Supplemental Material [22] for details). Moreover, $I_{\text{KS18},\eta} \le 0$ allows for loophole-free experiments with nonheralded sources [47].

Finding tight Bell inequalities for the KS18 correlations proved to be more challenging due to the complexity of the corresponding local polytope. However, we obtained one tight inequality, $I^{t}_{\text{KS18}} \le 8$. This inequality displays an interesting feature: its quantum bound (i.e., the highest possible value allowed by quantum mechanics) matches the value attained by the KS18 correlations.
This is remarkable because it proves that the KS18 correlations lie on the boundary of the set of quantum correlations, which means that they are not only nonlocal, but also extremal [30]. Extremality has been recognized as the key feature for nonlocal correlations to allow for device-independent quantum key distribution [2,68] and self-testing of quantum devices [19]. (See the Supplemental Material [22] for further details on device-independent applications of the KS18 and Yu-Oh correlations.)

Finally, as shown in Figs. 2 and 3, two of the tight Bell operators, $I^{t}_{\text{KS18}}$ and $I^{t}_{\text{Yu-Oh},V}$, respectively, display the same (highly nontrivial) symmetries as the graph of compatibility of the corresponding set of local measurements (see the Supplemental Material [22]). This is surprising and requires further investigation, since, a priori, we do not expect any facet of the local polytope to be related to the graph of compatibility of a SI-C set.
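As a worked relation (a sketch under stated assumptions, not a formula taken from the paper), the critical values for a given inequality $I \le L$ follow from the linearity of $I$ in the probabilities. The symbols $I_Q$, $I_{\rm noise}$, $J$, and $M$ are introduced here only for illustration: $I_Q$ is the value of $I$ for the noiseless correlations, $I_{\rm noise}$ its value for the maximally mixed state, and $J$ and $M$ are the contributions of the joint and marginal terms to $I_Q$, so that $I_Q = J + M$. It is assumed that $I_Q > L > I_{\rm noise}$ and $J > 0$.

```latex
\begin{align}
  I(V) &= V\,I_Q + (1-V)\,I_{\rm noise}
  &&\Longrightarrow&
  V_{\rm crit} &= \frac{L - I_{\rm noise}}{I_Q - I_{\rm noise}}, \\
  I(\eta) &= \eta^{2} J + \eta\,M
  &&\Longrightarrow&
  \eta_{\rm crit} &= \frac{-M + \sqrt{M^{2} + 4 J L}}{2J},
\end{align}
% For a local bound L = 0 (and M < 0) the second expression reduces to
% \eta_{\rm crit} = -M / J.
```

The first line uses that the state is an affine function of V; the second uses that, when undetected events are assigned outcome 0, joint terms scale as $\eta^{2}$ and marginal terms as $\eta$.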
Conclusions.-Using a three-step method, we have obtained Bell inequalities that are optimal (maximally resistant to either noise or detection inefficiency) for the correlations produced by maximally entangled states and either KS18 (the simplest KS set in quantum mechanics) or the Yu-Oh set (the simplest SI-C set). They fundamentally connect the theorems of Bell and of Kochen and Specker, allow us to perform Bell tests with SI-C sets and spacelike separation, and achieve simultaneous Bell nonlocality (with spacelike separation) and contextuality (with timelike separation). Therefore, they pave the way to tasks requiring both resources simultaneously and, more importantly, to tasks that cannot be accomplished with each of the resources individually. We have demonstrated that the KS18 correlations maximally violate the Bell inequality $I^{t}_{\text{KS18}} \le 8$ and can be used for device-independent quantum key distribution. Moreover, they allow for Bell self-testing, while KS18 can also be used for certification with sequential measurements (Bob and Charlie in Fig. 1) [30]. Thus, the correlations for three parties (the KS18 nonlocal correlations between Alice and Bob and the contextual correlations produced by sequentially measuring KS18 between Bob and Charlie) could be used to certify quantum transformations in a device-independent way. All these functionalities contribute to closing the gap between general probabilistic theories (which refer to states, measurements, and transformations) and the device-independent framework (which refers only to the conditional probabilities of obtaining outputs from inputs) [69].

Here, we collect definitions of concepts related to Kochen-Specker (KS) contextuality for ideal measurements that are used in this work.
Firstly, we should point out that Bell nonlocality and KS contextuality for ideal measurements have a common origin. If ρ is a quantum state and S is a set of observables, quantum theory predicts the existence of pairs (ρ, S) such that, for every subset $s \subset S$ of jointly measurable observables, there is a probability distribution $P_{\rho}(a|s)$, where a is the set of outcomes for the observables in s, such that, for every observable $x \in S$, the marginal probability $P(a_x|x)$ is independent of which subset x belongs to, but such that the set of all possible $P_{\rho}(a|s)$ cannot be obtained from a single probability distribution in a single probability space. This phenomenon is generically called contextuality or measurement contextuality. Two manifestations of it are Bell nonlocality (in which events are produced by spacelike separated measurements) and KS contextuality between ideal sequential measurements (in which events are produced by ideal measurements).
Definition 1 An ideal measurement of an observable A is a measurement of A that gives the same outcome when repeated on the same physical system and does not disturb any compatible observable.
Definition 2 Two observables A and B are compatible if there exists a third observable C such that, for every initial state ρ and for every outcome a of A, [...] and, for every outcome b of B, [...], where $P(A = a|\rho)$ is the probability of obtaining outcome a for A given the state ρ.
Definition 3 A Kochen-Specker (KS) contextuality scenario is defined by a set of ideal measurements, their respective sets of outcomes, and a set of contexts.
Definition 4 In a KS contextuality scenario, a context is a set of ideal measurements of compatible observables.
Definition 5 A behavior (or matrix of correlations) for a KS contextuality scenario is a set of (normalized) probability distributions produced by ideal measurements satisfying the relations of compatibility of the scenario, one for each of the contexts, and such that the probability for every outcome of every measurement does not depend on the context (nondisturbance condition).
Definition 6 A behavior for a contextuality scenario is contextual if the probability distributions for each context cannot be obtained as the marginals of a global probability distribution on all observables. Otherwise the behavior is noncontextual.

Definition 7 The relations of compatibility between N observables can be represented by an N-node graph, called the graph of compatibility of the scenario, in which each node represents an observable and adjacent nodes correspond to compatible observables.
Definition 8 A noncontextuality (NC) inequality is an inequality satisfied by any noncontextual behavior.
Definition 9 A state-independent contextuality (SI-C) set in dimension d is a set of rank-one projectors that produces contextual behaviors for any quantum state in dimension d.
[...] and $0 \le y < 1$ such that $\sum_{j \in \mathcal{I}} w_j \le y$ for all $\mathcal{I}$, where $\mathcal{I}$ is any set of nodes in the graph of compatibility of S no two of which are adjacent, and $\sum_i w_i \Pi_i \ge \mathbb{1}$.
Definition 10 A KS set is a set of rank-one projectors that does not admit an assignment of 0 or 1 to each projector such that (I) two orthogonal projectors are never both assigned 1, and (II) for every set of mutually orthogonal projectors summing to the identity, one of them is assigned 1.
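As an illustration of Definitions 7 and 9, the sketch below builds the graph of compatibility of the Yu-Oh set from orthogonality relations. The 13 unnormalized vectors are written out from the original Yu-Oh construction [46]; their ordering and the helper names `compatibility_edges` and `projector_sum` are choices made here for illustration only.

```python
import itertools
import numpy as np

# The 13 (unnormalized) Yu-Oh vectors in dimension 3, ordering chosen for this example.
YU_OH = [
    (1, 0, 0), (0, 1, 0), (0, 0, 1),
    (0, 1, 1), (0, 1, -1), (1, 0, 1), (1, 0, -1), (1, 1, 0), (1, -1, 0),
    (1, 1, 1), (-1, 1, 1), (1, -1, 1), (1, 1, -1),
]

def compatibility_edges(vectors):
    """Edges (i, j) of the graph of compatibility: orthogonal vectors are adjacent."""
    return [(i, j) for i, j in itertools.combinations(range(len(vectors)), 2)
            if np.dot(vectors[i], vectors[j]) == 0]

def projector_sum(vectors):
    """Sum of the rank-one projectors onto the given vectors."""
    return sum(np.outer(v, v) / np.dot(v, v) for v in vectors)

edges = compatibility_edges(YU_OH)
print(len(YU_OH), "nodes,", len(edges), "edges")
print(np.round(projector_sum(YU_OH), 6))
```

The projectors sum to (13/3) times the identity, so the total probability of obtaining outcome 1 over the 13 measurements is the same for every state; this is one simple signature of state independence, although it does not by itself establish the weighted condition quoted above.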

Appendix B: Tight Bell inequalities
Here, we explain why obtaining tight Bell inequalities is a difficult problem for Bell scenarios with many measurements, and review some approaches followed in the literature.
Definition 11 A Bell scenario is defined by a set of parties, their respective sets of measurements, and their respective sets of outcomes.
For any Bell scenario, the classical (local realistic) set of correlations is a polytope called the local polytope [48,70,71]. For the simplest Bell scenario, the one with two parties, two settings, and two outcomes or (2, 2, 2) Bell scenario, the local polytope has 16 extremal points and 24 facets. Nonsignaling correlations can violate the Bell inequalities corresponding to 8 of these facets. Each of these facets defines a so-called tight Bell inequality whose violation detects nonlocality. The facets corresponding to Bell inequalities that cannot be violated by nonsignaling correlations are called trivial facets. In the case of (2, 2, 2), all nontrivial facets are associated to the same (up to relabelings) Bell inequality, the Clauser-Horne-Shimony-Holt inequality [32].
Another approach to deriving Bell inequalities is to use quantum correlations for their construction. For example, using the correlations produced by two maximally entangled ququarts and the measurements of the Peres-Mermin (or magic) square, one can obtain a tight Bell inequality for the (2, 3, 4) Bell scenario [81]. Another example is the family of Bell inequalities for the (n, 3, 2) Bell scenarios constructed from n-qubit graph states [13,82,83]. Other examples of this approach are a family of Bell inequalities for the (2, m, d) Bell scenario tailored for maximally entangled pairs of qudits [84], and a family of Bell inequalities based on multiple copies of the two-qubit maximally entangled state [57]. However, these inequalities are not tight. For a review on tight Bell inequalities, see [85].

Appendix C: Gilbert's algorithm
Here, we provide details of our implementation of Gilbert's algorithm for quadratic minimization [54]. In addition, practical examples are given in [86]. Gilbert's algorithm has been used for various tasks in quantum information such as finding better bounds for the Grothendieck constant [55,56] and reducing the detection efficiency threshold for Bell tests [57,87].
Gilbert's algorithm minimizes the distance between a target point $\vec r$ and a convex set S defined over $\mathbb{R}^{n}$, via calls to an oracle that can perform linear optimizations over S [55]. The algorithm determines whether $\vec r$ is inside S by finding a point $\vec s \in S$ such that $\|\vec r - \vec s\| \le \delta$, with $\delta > 0$. If the target lies outside the set, the algorithm yields a witness $\vec c$ that proves that the point does not belong to the convex set, i.e., $\vec c \cdot \vec s < \vec c \cdot \vec r$, $\forall\, \vec s \in S$.
In our case, the convex set is the local polytope L, the vectors represent the correlations, local or nonlocal, and the witnesses c are the Bell inequalities to start with.
The algorithm has the following four steps:

First step. We set the target point $\vec r(V)$, e.g., the KS18 (or the Yu-Oh) correlations for a given value of V, and we randomly choose a local point $\vec s_k$ for k = 0. An analogous procedure follows for the case of η.

Second step. We maximize the overlap $(\vec r(V) - \vec s_k) \cdot \vec l$ over all $\vec l \in \mathcal{L}$. That is, we compute

$$\max_{\vec l \in \mathcal{L}} \; (\vec r(V) - \vec s_k) \cdot \vec l$$

and call $\vec l_k$ the vertex that achieves the maximum. Notice that, since the local set is a polytope, it is sufficient to evaluate the overlap over all the vertices to find the global maximum.

Third step. We minimize the distance from $\vec r(V)$ to the convex combination of $\vec l_k$ and $\vec s_k$,

$$\min_{0 \le \epsilon \le 1} \; \left\| \vec r(V) - \left[ \epsilon\, \vec l_k + (1-\epsilon)\, \vec s_k \right] \right\|,$$

and use the optimal parameter $\epsilon^{*}$ to define the point $\vec s_{k+1}$ as

$$\vec s_{k+1} = \epsilon^{*}\, \vec l_k + (1-\epsilon^{*})\, \vec s_k.$$

Fourth step. We set $\vec s_k = \vec s_{k+1}$ and repeat the algorithm until we obtain $\|\vec r(V) - \vec s_k\| < \delta$. Notice that at the end of each iteration we can retrieve $\vec c = \vec r(V) - \vec s_k$.
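The four steps can be sketched on a toy instance. The code below is an illustrative implementation for the (2, 2, 2) scenario with exhaustive search over vertices, not the authors' implementation [86]. Two simplifications are assumptions of this sketch: the starting point is a fixed vertex rather than a random local point, and the loop stops as soon as the current witness separates the target from all vertices, whereas deriving a robust inequality would continue iterating. The names `local_vertices` and `gilbert` are hypothetical.

```python
import itertools
import numpy as np

def local_vertices(m):
    """Vertices of the (2, m, 2) local polytope in coordinates (pA, pB, pAB)."""
    bits = list(itertools.product((0, 1), repeat=m))
    return np.array([np.concatenate([a, b, np.outer(a, b).ravel()])
                     for a in bits for b in bits], dtype=float)

def gilbert(r, verts, delta=1e-3, max_iter=100_000, tol=1e-9):
    """Return (s, c, verdict): c = r - s, verdict is 'local', 'nonlocal' or 'undecided'."""
    s = verts[0].copy()                      # first step: a local starting point
    c = r - s
    for _ in range(max_iter):
        c = r - s
        if np.linalg.norm(c) < delta:        # target is delta-close to the polytope
            return s, c, "local"
        l = verts[np.argmax(verts @ c)]      # second step: vertex with maximal overlap
        if c @ l < c @ r - tol:              # c separates r from every vertex
            return s, c, "nonlocal"
        step = l - s
        eps = np.clip((c @ step) / (step @ step), 0.0, 1.0)  # third step: optimal epsilon
        s = s + eps * step                   # fourth step: update and repeat
    return s, c, "undecided"

verts = local_vertices(2)
pr_box = np.array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.0])   # nonlocal test point
white_noise = np.array([0.5] * 4 + [0.25] * 4)                # local test point
print(gilbert(pr_box, verts)[2], gilbert(white_noise, verts, delta=0.05)[2])
```

When the verdict is "nonlocal", the returned vector c is a valid (though generally not optimal or tight) Bell inequality whose local bound is the maximum of c over the vertices.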

Heuristic method to optimize the overlap
It is important to point out that the second step of the algorithm, the optimization of the overlap, runs over all the $2^{2m}$ vertices of the local polytope. This optimization is an NP-hard problem [48] and, for the cases studied in this work, is extremely time-consuming. Hence, it is useful to apply a heuristic method to optimize the overlap in a reasonable time [55,56].
In order to explain the heuristic method, it is easier to refer to $(\vec r(V) - \vec s_k)$ by its components $\Gamma_{a,b,x,y}$ and to $\vec l$ by $P^{A}_{a,x} P^{B}_{b,y}$. In this way, the overlap can be written as $\sum_{a,b,x,y} \Gamma_{a,b,x,y} P^{A}_{a,x} P^{B}_{b,y}$. Then, to optimize the overlap, we adopt the following strategy:

First step. We initialize $\vec l$ or, equivalently, $(P^{A}_{a,x}, P^{B}_{b,y})$, by randomly generating a seed inside the local polytope.

Second step. We keep $P^{A}_{a,x}$ fixed and try to find better values of $P^{B}_{b,y}$. To do so, we iterate over y, and, if the sum $\sum_{a,x} P^{A}_{a,x} (\Gamma_{a,0,x,y} - \Gamma_{a,1,x,y})$ is positive, we set $P^{B}_{0,y} = 1$ and $P^{B}_{1,y} = 0$. If the sum is negative, we do the opposite and set $P^{B}_{0,y} = 0$ and $P^{B}_{1,y} = 1$.

Third step. We repeat the procedure while keeping $P^{B}_{b,y}$ fixed instead. We iterate over x, and, if the sum $\sum_{b,y} P^{B}_{b,y} (\Gamma_{0,b,x,y} - \Gamma_{1,b,x,y})$ is positive, we set $P^{A}_{0,x} = 1$ and $P^{A}_{1,x} = 0$; otherwise we set $P^{A}_{0,x} = 0$ and $P^{A}_{1,x} = 1$.

Fourth step. We iterate the second and third steps until the overlap converges.
This procedure yields higher values of the overlap with every iteration. However, it could converge to a local maximum instead of the global maximum [55,56]. We tried to avoid this problem by repeating the optimization with different random seeds. While it is possible to impose some symmetry on the resulting Bell inequality [57,87], in this work we did not.
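The alternating optimization can be sketched as follows. This is a minimal illustration, assuming the coefficients are stored in an array `gamma[a, b, x, y]`, that the random seed is a deterministic strategy (a vertex) rather than an interior point, and that ties are resolved in favor of outcome 0; the function name `seesaw_overlap` is hypothetical. The test case uses the CHSH expression, whose maximum over local strategies is 2.

```python
import numpy as np

def seesaw_overlap(gamma, restarts=20, seed=0):
    """Heuristic maximum of sum_{a,b,x,y} gamma[a,b,x,y] PA[a,x] PB[b,y] over local vertices."""
    rng = np.random.default_rng(seed)
    mA, mB = gamma.shape[2], gamma.shape[3]
    best = -np.inf
    for _ in range(restarts):
        # First step: random deterministic seed; PA[a, x] = 1 iff Alice outputs a for setting x.
        PA = np.eye(2)[:, rng.integers(0, 2, size=mA)]
        PB = np.eye(2)[:, rng.integers(0, 2, size=mB)]
        prev = -np.inf
        while True:
            # Second step: best response of Bob for fixed Alice.
            gB = np.einsum("ax,axy->y", PA, gamma[:, 0] - gamma[:, 1])
            PB = np.stack([gB >= 0, gB < 0]).astype(float)
            # Third step: best response of Alice for fixed Bob.
            gA = np.einsum("by,bxy->x", PB, gamma[0] - gamma[1])
            PA = np.stack([gA >= 0, gA < 0]).astype(float)
            val = float(np.einsum("abxy,ax,by->", gamma, PA, PB))
            if val <= prev + 1e-12:          # fourth step: stop when no improvement
                break
            prev = val
        best = max(best, val)
    return best

a, b, x, y = np.indices((2, 2, 2, 2))
chsh = (-1.0) ** (a + b + x * y)             # CHSH in full-probability form
print(seesaw_overlap(chsh))
```

Each best-response update can only increase the overlap, which is why the loop terminates; the restarts address the risk of ending in a local maximum.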

Numerical details
There are a few considerations that one needs to take into account before putting the algorithm into practice. If the target correlations $\vec r$ are local, the algorithm is guaranteed to converge after a number of iterations of the order of $O(1/\delta^{2})$ [54]. Therefore, there is a tradeoff between the method's accuracy δ and its running time. Moreover, since $\delta > 0$, some correlations that are nonlocal will be regarded as local by the algorithm. However, for our objective, i.e., deriving robust Bell inequalities, we can always choose the last point that the algorithm identifies as nonlocal and retrieve its optimal witness $\vec c$. In our calculations we used $\delta = 10^{-3}$. We ran the algorithm in parallel for different values of V and different values of η. In both cases, the values range from 0.69 to 1 in steps of 0.01.
Finally, due to the heuristic nature of the algorithm, once we retrieve $\vec c$, we need to evaluate the overlap on all the vertices of the polytope to make sure that the local bound is correct. We performed this calculation in Python [86] and double-checked the results using the MATLAB package QETLAB [88].

Appendix D: Details on the second step of the method
Here, we detail how the facts (i)-(iii) in the main text allow us to obtain tight Bell inequalities.
For bipartite Bell scenarios with m measurement settings and two outputs, the local correlations are in a polytope in $\mathbb{R}^{D}$, where $D = m^{2} + 2m$, due to the normalization and nonsignaling conditions [2,61].
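For orientation, the sizes involved in the scenarios considered here can be tabulated with a few lines of Python. This is only a worked arithmetic example; the function names are hypothetical.

```python
def polytope_dimension(m):
    """Dimension D = m^2 + 2m of the space containing the (2, m, 2) local polytope."""
    return m**2 + 2 * m

def number_of_vertices(m):
    """Number of deterministic strategies, 2^(2m)."""
    return 2 ** (2 * m)

for m in (2, 13, 18):   # CHSH, Yu-Oh, and KS18 scenarios
    print(m, polytope_dimension(m), number_of_vertices(m))
```

For m = 13 and m = 18 this gives D = 195 and D = 360, so a facet must be saturated by at least 195 and 360 affinely independent vertices, respectively.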
After applying Gilbert's algorithm, we obtain a Bell inequality $\vec c_0$ for which the correlations have an improved resistance to white noise or detection inefficiency, respectively. In general, $\vec c_0$ is not tight. However, we can use it as a starting point to derive a tight inequality. To do so, first, we collect all the vertices that saturate the local bound of $\vec c_0$. If these vertices contain a set of D affinely independent vectors, then they fulfill the tightness condition and hence $\vec c_0$ is tight. In general, they do not, but the saturating vertices still give us a starting set of points $D_0$ that must be 'completed' in order to make the inequality tight. Considering fact (ii), we can make a convex combination of all the saturating vertices and check whether or not there are zero coefficients in the resulting vector $\vec v_r$. The presence of zero coefficients in the resulting vector implies that the inequality is over-penalizing certain vertices that need to be included in $D_0$ to fulfill the tightness condition. In practice, fact (ii) identifies which coefficients of $\vec c_0$ need to be set to zero in order to allow the necessary vertices to join $D_0$. Finally, fact (iii) leads us to optimize the coefficients of $\vec c_0$ to maximize the number of saturating vertices. To do so, we considered the symmetries displayed by the coefficients of $\vec c_0$ and their sign. Note that, in principle, an inequality with m inputs and two outputs has $m^{2} + 2m$ independent coefficients. For instance, in the case of the KS18 correlations there would be 360 coefficients, but after Gilbert's algorithm this number is reduced to 6 [see Eq. (E3)]. Taking advantage of this, we assign values to the coefficients in the range $c_0^{+} \in \{0, k\}$ for positive integer coefficients, and similarly $c_0^{-} \in \{-k, 0\}$ for the negative ones. The simplest case to start with is k = 1, and then we increment k until the inequality fulfills the tightness condition.
Using this second step, we obtained I t Yu-Oh,η and I t KS18 . For I t Yu-Oh,V only the first step was necessary.
Appendix E: Details on the Bell inequalities obtained in this work and how they compare to previous works

Here, we provide the explicit expressions of the five Bell inequalities that we have obtained in this work and compare them with the previously known Bell inequalities for the corresponding SI-C sets [34]. Hereafter, we will refer to the Bell inequalities in [34] as the graph-based Bell inequalities, and we will denote by $I_{(G,w)}$ their corresponding Bell operators.
In order to present the inequalities, we use the Collins-Gisin parametrization introduced in [59], where, to specify the coefficients of the Bell operator I in a Bell inequality I ≤ L, we write a matrix as in Eq. (5) (see main text). For example, the Bell operator of the Clauser-Horne inequality [89] is represented by

$$\left(\begin{array}{c|cc} & -1 & 0 \\ \hline -1 & 1 & 1 \\ 0 & 1 & -1 \end{array}\right).$$

Bell inequalities for the KS18 correlations
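For the KS18 correlations, the Bell operators for both the graph-based Bell inequality [34] and the three Bell inequalities that we have found in this work are of the following form:

$$
\begin{array}{c|ccc|ccc|ccc|ccc|ccc|ccc}
  & g & g & g & g & g & g & g & g & g & g & g & g & g & g & g & g & g & g \\ \hline
g & f & a & a & d & d & c & c & d & d & c & c & e & b & c & b & b & b & c \\
g & a & f & a & c & d & d & d & d & c & b & b & c & c & e & c & b & b & c \\
g & a & a & f & d & c & d & d & c & d & b & b & c & b & c & b & c & c & e \\ \hline
g & d & c & d & f & a & a & d & d & c & c & b & b & c & c & e & c & b & b \\
g & d & d & c & a & f & a & d & c & d & c & b & b & b & b & c & e & c & c \\
g & c & d & d & a & a & f & c & d & d & e & c & c & b & b & c & c & b & b \\ \hline
g & c & d & d & d & d & c & f & a & a & c & e & c & c & b & b & b & c & b \\
g & d & d & c & d & c & d & a & f & a & b & c & b & c & b & b & c & e & c \\
g & d & c & d & c & d & d & a & a & f & b & c & b & e & c & c & b & c & b \\ \hline
g & c & b & b & c & c & e & c & b & b & f & a & a & d & d & c & c & d & d \\
g & c & b & b & b & b & c & e & c & c & a & f & a & c & d & d & d & c & d \\
g & e & c & c & b & b & c & c & b & b & a & a & f & d & c & d & d & d & c \\ \hline
g & b & c & b & c & b & b & c & c & e & d & c & d & f & a & a & d & c & d \\
g & c & e & c & c & b & b & b & b & c & d & d & c & a & f & a & d & d & c \\
g & b & c & b & e & c & c & b & b & c & c & d & d & a & a & f & c & d & d \\ \hline
g & b & b & c & c & e & c & b & c & b & c & d & d & d & d & c & f & a & a \\
g & b & b & c & b & c & b & c & e & c & d & c & d & c & d & d & a & f & a \\
g & c & c & e & b & c & b & b & c & b & d & d & c & d & c & d & a & a & f
\end{array}
\tag{E3}
$$

The five additional horizontal and vertical lines are eye guides that help us to show that the matrix of coefficients can be divided into similar blocks. This will be important when studying the symmetries of the Bell operators.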
The graph-based inequality for the KS18 correlations [34] is obtained with $a = b = e = -1/2$, $f = 1$, and $c = d = g = 0$ in Eq. (E3). The Bell inequality that we have obtained and that is maximally robust against white noise is obtained with $a = -12/9$, $b = -32/9$, $c = 19/9$, $d = -1/9$, $e = -21/9$, $f = 8/9$, and $g = -1$ in Eq. (E3). The Bell inequality that is maximally robust against detection inefficiency is also of the form of Eq. (E3). Finally, the tight Bell inequality that is presented in the main text (see Fig. 2) is obtained with $a = b = e = -2$, $c = 1$, and $d = f = g = 0$ in Eq. (E3).
As mentioned in the main text, using this last inequality we can prove that the KS18 correlations are extremal. For this, we first calculate an upper bound on the maximum violation of $I^t_{\text{KS18}}$ that quantum systems of any dimension can achieve. This calculation is performed at level $1 + AB$ of the Navascués-Pironio-Acín hierarchy [90]. Remarkably, this upper bound matches the value attained by the KS18 correlations, which proves our statement.
The relevant features of these four inequalities are summarized in Table E

Bell inequalities for the Yu-Oh correlations
For the Yu-Oh correlations, we consider three Bell inequalities: the graph-based Bell inequality [34], with Bell operator $I_{(G,w)}$; the tight Bell inequality robust to noise obtained in this work, Eq. (E10); and the optimal inequality with respect to the detection inefficiency, Eq. (E12). The relevant features of these three inequalities are summarized in Table E

Here, we prove that the Bell inequalities (E5), (E6), (E10), and (E12) are optimal. That is, we prove that, for the KS18 correlations, the value of $V_{\text{crit}}$ [$\eta_{\text{crit}}$] for the Bell inequality (E5) [(E6)] is the smallest $V_{\text{crit}}$ [$\eta_{\text{crit}}$] that can be found for any Bell inequality. For $V \le V_{\text{crit}}$ [$\eta \le \eta_{\text{crit}}$], there is a local model reproducing the correlations. Similarly, we prove that, for the Yu-Oh correlations, the value of $V_{\text{crit}}$ [$\eta_{\text{crit}}$] for the Bell inequality (E10) [(E12)] is the smallest $V_{\text{crit}}$ [$\eta_{\text{crit}}$] that can be found for any Bell inequality.
A matrix of correlations (or behavior) $p$ is local if and only if it can be written as a convex combination of the vertices $v_\lambda$ of the local polytope [2],

$$
p = \sum_{\lambda} q_\lambda v_\lambda, \qquad q_\lambda \ge 0, \qquad \sum_{\lambda} q_\lambda = 1, \tag{F1}
$$

where $\lambda$ indexes all vertices. For the $(2, m, 2)$ Bell scenario, $\lambda \in \{1, \ldots, 2^{2m}\}$. If a smaller subset of vertices $\lambda'$ is enough to reproduce $p$, then the correlations are local, since the coefficients $q_\lambda$ with $\lambda \notin \lambda'$ can be set to zero in Eq. (F1) [87]. Taking this into account, we proved that inequalities (E5), (E6), (E10), and (E12) are optimal by explicit construction of the corresponding local models. To do so, we took as $p$ the KS18 (Yu-Oh) correlations evaluated at $V_{\text{crit}}$ (or $\eta_{\text{crit}}$, depending on the optimality to analyze). Then, we collected all the vertices that saturate the local bound of the inequality. In general, the number of saturating vertices is substantially smaller than $2^{2m}$, which allows us to use linear programming. Finally, we successfully solved the linear program in Eq. (F1) using Mathematica, thus proving that our inequalities are optimal.
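The construction of a local model from saturating vertices can be illustrated on a toy case. The sketch below is not the Mathematica program used in this work, and it swaps the linear program for an exact linear solve: it assumes the (2, 2, 2) scenario, the Clauser-Horne inequality, and the correlations of a maximally entangled two-qubit state with the measurements that maximally violate it, mixed with white noise at $V = 1/\sqrt{2}$. In that case the saturating vertices are affinely independent, so the weights $q_\lambda$ are unique and no optimization is needed. All function names are illustrative.

```python
from fractions import Fraction as F
from itertools import product

def vertices(m):
    """Deterministic strategies as (P(A_i=1), P(B_j=1), P(A_i=1, B_j=1))."""
    for bits in product((0, 1), repeat=2 * m):
        a, b = bits[:m], bits[m:]
        yield a + b + tuple(x * y for x in a for y in b)

def solve_weights(points, target):
    """Solve sum_k q_k * points[k] = target with sum_k q_k = 1, exactly,
    by Gauss-Jordan elimination. Returns None if there is no solution or
    if it is not unique (the latter would call for a genuine LP)."""
    cols = [(1,) + tuple(v) for v in points]
    rhs = (1,) + tuple(target)
    rows = [[F(c[i]) for c in cols] + [F(rhs[i])] for i in range(len(rhs))]
    r = 0
    for c in range(len(cols)):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            return None
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    if any(row[-1] != 0 for row in rows[r:]):
        return None
    return [rows[k][-1] for k in range(len(cols))]

# Clauser-Horne coefficients and the vertices saturating its local bound 0.
ch = (-1, 0, -1, 0, 1, 1, 1, -1)
sat = [v for v in vertices(2) if sum(c * x for c, x in zip(ch, v)) == 0]

# Behavior at the critical visibility: marginals 1/2, joint terms (1 +- 1/2)/4.
p_crit = (F(1, 2), F(1, 2), F(1, 2), F(1, 2), F(3, 8), F(3, 8), F(3, 8), F(1, 8))

q = solve_weights(sat, p_crit)
is_local = q is not None and all(x >= 0 for x in q)
```

Here every saturating vertex receives weight 1/8, which is an explicit local model for the behavior at the critical visibility. For the inequalities of this work the saturating vertices are far more numerous than the dimension, the decomposition is not unique, and a linear program is the appropriate tool.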

Appendix G: Relation between the Bell inequalities
Here, we explain why, for each type of correlations, the optimal Bell inequality with respect to white noise is different from the optimal Bell inequality with respect to detection inefficiency.
When correlations are affected by white noise, they can be written as a convex combination of the noiseless correlations, with weight V , and the correlations obtained measuring the maximally mixed state, with weight 1−V . For V = 1, the correlations are nonlocal because they violate the inequalities presented in [34]. For V = 0, the correlations belong to the local polytope, as they correspond to measurements over a classical state. Therefore, the trajectory in the space of correlations is a straight line that starts in the quantum set and ends in the local polytope (see Fig. 4).
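Because the trajectory is a straight line, the value of any Bell operator along it is linear in $V$, and the critical visibility follows from a one-line formula. The sketch below is a generic illustration; the function name and the CHSH numbers used as a check are not taken from this work.

```python
from math import sqrt

def critical_visibility(quantum_value, noise_value, local_bound):
    """Visibility V at which V*Q + (1 - V)*N equals the local bound L,
    where Q is the noiseless value and N the value for the maximally
    mixed state. For V above this value the inequality is violated."""
    return (local_bound - noise_value) / (quantum_value - noise_value)

# Check on CHSH: Q = 2*sqrt(2), N = 0, L = 2.
v_crit_chsh = critical_visibility(2 * sqrt(2), 0.0, 2.0)
```

For CHSH this returns $1/\sqrt{2} \approx 0.707$, the well-known threshold for the maximally entangled two-qubit state.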
When the detection efficiency decreases, the probabilities are of the form shown in Eq. (4) (see the main text). This is different from the case of white noise, where the state is changed instead. Again, for $\eta = 1$ the correlations are nonlocal, while for $\eta = 0$ the correlations correspond to a vertex of the local polytope. In fact, it is the deterministic point in which Alice and Bob never assign 1 to their outputs: $P_\eta(\Pi^A_i = \Pi^B_j = 1) = 0$, $P_\eta(\Pi^A_i = 1) = 0$, and $P_\eta(\Pi^B_j = 1) = 0$. This time, the trajectory followed by the correlations is more complicated. Moreover, since the model operates on the probabilities, the final point $\eta = 0$ is reached regardless of the dimension of the state. In contrast, in the white noise model the final point of the trajectory depends on the dimension $d$ of the local subsystems (see Fig. 4).
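The curved trajectory can be made explicit under a common assumption about the inefficiency model. The sketch below assumes that a non-detection is assigned the outcome 0, so that $P_\eta(\Pi^A_i = \Pi^B_j = 1) = \eta^2 P(\Pi^A_i = \Pi^B_j = 1)$ and $P_\eta(\Pi^A_i = 1) = \eta P(\Pi^A_i = 1)$; this is consistent with the $\eta = 0$ point described above, but the exact form of Eq. (4) is not reproduced here. The value of a Bell operator is then quadratic in $\eta$. The function name and the Clauser-Horne example are illustrative.

```python
from math import sqrt

def critical_efficiency(joint_part, marginal_part, local_bound, constant=0.0):
    """Largest root of  constant + eta*M + eta**2*J = L,  where J is the
    contribution of the joint probabilities and M that of the marginals
    at eta = 1. Assumes J > 0, so the operator exceeds L above the root."""
    J, M, c = joint_part, marginal_part, constant - local_bound
    disc = M * M - 4 * J * c
    return (-M + sqrt(disc)) / (2 * J)

# Clauser-Horne with a maximally entangled two-qubit state:
# joint terms (1 +- 1/sqrt(2))/4 with signs +,+,+,-, marginals P(A1) = P(B1) = 1/2.
J = 3 * (1 + 1 / sqrt(2)) / 4 - (1 - 1 / sqrt(2)) / 4
M = -0.5 - 0.5
eta_crit_ch = critical_efficiency(J, M, local_bound=0.0)
```

For this example the result is $2(\sqrt{2} - 1) \approx 0.828$, the known threshold for the maximally entangled two-qubit state. Because the dependence on $\eta$ is quadratic rather than linear, the behaviors $p(\eta)$ do not lie on a straight line in the space of correlations.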
In the first step of our approach, the numerical method searches iteratively for the closest local point $s$ with respect to a given correlation $r$ and yields the vector $c = r - s$, which is a Bell inequality. Given that the two models bring the correlations along different trajectories and end at different points, they enter the local polytope through different facets. Consequently, the Bell inequalities obtained are different (see Fig. 4).

To analyze the performance of the correlations in a device-independent quantum key distribution (DI-QKD) protocol, we use the Devetak-Winter formula [91],

$$
r_{\text{DW}} \ge H(A|E) - H(A|B),
$$

where $r_{\text{DW}}$ is the key rate, $H(A|E)$ is the quantum conditional entropy between Alice and an eavesdropper Eve, and $H(A|B)$ is the conditional Shannon entropy between Alice and Bob. $H(A|E)$ quantifies the amount of local randomness present in the outcomes of Alice's measurements, while $H(A|B)$ quantifies the strength of the correlations between the honest parties. In a DI-QKD protocol it is necessary to include $H(A|B)$ in the key rate calculation, since the aim is that both parties share the same key at the end of the protocol. This is only achieved after the raw key is post-processed using classical error correction and privacy amplification. In the case of device-independent random number generation (DI-RNG), the rate is given only by $H(A|E)$.
In addition, we consider that both parties use their first measurements, $A_1$ and $B_1$, to distill the key. Then, $H(A_1|B_1)$ is calculated as

$$
H(A_1|B_1) = -\sum_{a,b} P(a, b|A_1, B_1) \log_2 P(a|b, A_1, B_1).
$$

In order to compute $H(A_1|E)$, we use the numerical technique developed in [92]. For this calculation we use the complete probability distribution. In this way, we determine the thresholds for DI-RNG and DI-QKD when the correlations are affected by white noise and detection inefficiency.
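The classical part of the key rate is straightforward to evaluate from a joint distribution. The sketch below computes the conditional Shannon entropy and combines it with a given value of $H(A_1|E)$; the latter is taken as an input, since the numerical technique of [92] is not reproduced here. The function names and the example distributions are illustrative.

```python
from math import log2

def conditional_entropy(joint):
    """H(A|B) in bits, for joint[a][b] = P(A = a, B = b)."""
    h = 0.0
    for b in range(len(joint[0])):
        p_b = sum(row[b] for row in joint)
        for row in joint:
            if row[b] > 0:
                h -= row[b] * log2(row[b] / p_b)
    return h

def devetak_winter_rate(h_a_given_e, joint):
    """Lower bound on the key rate: H(A|E) - H(A|B)."""
    return h_a_given_e - conditional_entropy(joint)

# Perfectly correlated outcomes: Bob's outcome determines Alice's.
perfect = [[0.75, 0.0], [0.0, 0.25]]
h_perfect = conditional_entropy(perfect)

# Uniform outcomes that disagree 10% of the time.
noisy = [[0.45, 0.05], [0.05, 0.45]]
h_noisy = conditional_entropy(noisy)
rate_noisy = devetak_winter_rate(1.0, noisy)  # H(A|E) = 1 is a placeholder value
```

For perfectly correlated outcomes $H(A|B) = 0$ and the whole of $H(A|E)$ is available as key, whereas any disagreement between the parties reduces the rate by the cost of error correction.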
Our results for the Yu-Oh correlations are shown in Figs. 5 and 6. As expected, the requirements to distill a secret key are higher than those for randomness generation. The lower bounds we found show that DI-RNG requires $\eta \ge 0.90$ and $V \ge 0.92$, whereas distilling a secret key requires $\eta \ge 0.9330$ and $V \ge 0.9477$. These are minimal requirements, since we have performed the optimizations with only one source of error at a time.
For KS18, there are 18 measurements and thus computing $H(A|E)$ numerically is not possible. However, following [2,68], we expect that, since the parties have extremal correlations, the eavesdropper Eve cannot gain any information about the outcomes of the parties' measurements. Moreover, the parties have perfect correlations, yielding $H(A|B) = 0$. Therefore, the secret key rate against collective attacks is also $r_{\text{DW}} \ge 0.8113$ bits.
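The value 0.8113 coincides numerically with the binary entropy of a two-outcome measurement whose outcome 1 occurs with probability 1/4. The sketch below checks this under an assumption that is not stated explicitly above: that each KS18 measurement is a rank-one projector acting on a maximally mixed local state of dimension 4, so that $P(\Pi = 1) = 1/4$. The function name is illustrative.

```python
from math import log2

def binary_entropy(p):
    """Shannon entropy, in bits, of a binary variable with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))

h_quarter = binary_entropy(0.25)  # approximately 0.8113 bits
```

Under that assumption, if Eve holds no information then $H(A|E) = H(A) = h(1/4)$, and with $H(A|B) = 0$ the Devetak-Winter bound equals the same quantity.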
Appendix I: Proofs that two of the tight Bell operators have the same symmetries as the graph of compatibility of the corresponding SI-C set
Here, we explain the exact mathematical sense in which the tight Bell operator $I^t_{\text{KS18}}$ shown in Fig. 2(b) (see main text) has the same symmetries as the graph of compatibility of the KS set of Fig. 2(a) (see main text). We also explain why the tight Bell operator $I^t_{\text{Yu-Oh},V}$ shown in Fig. 3(b) (see main text) has the same symmetries as the graph of compatibility of the Yu-Oh set displayed in Fig. 3(a) (see main text).
For these purposes, we first explain what the symmetries of a graph are and how to compute them. Then, we detail the symmetries of the two graphs that we are considering. Finally, we prove our statements.

Symmetries of a graph
A (vertex) automorphism in a graph G = (V, E), with vertex set V and edge set E, is a permutation σ of its vertices that preserves adjacency. That is, σ(u)σ(v) ∈ E if and only if uv ∈ E. An automorphism of G is a graph isomorphism with itself, i.e., a mapping from the vertices of G back to vertices of G such that the resulting graph is isomorphic with G. The set of automorphisms defines a permutation group known as the graph's automorphism group. A number of software implementations exist for computing graph automorphisms, including nauty [93] and SAUCY [94].
The automorphisms of G induce a partition of its vertices into orbits. Two vertices belong to the same orbit if and only if there exists an automorphism that takes one to the other. Each of the orbits contains vertices that are structurally equivalent (or symmetrical).
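For very small graphs, automorphisms and vertex orbits can be obtained by brute force, which makes the definitions concrete. The sketch below is only an illustration: it tries all permutations, which is hopeless for graphs of the size considered here, where tools such as nauty [93] or SAUCY [94] are the appropriate choice. The function names and the two example graphs are illustrative.

```python
from itertools import permutations

def automorphisms(n, edges):
    """All permutations of range(n) that map every edge onto an edge.
    Since a permutation acts injectively on the finite edge set, this
    also guarantees that non-edges are mapped onto non-edges."""
    edge_set = {frozenset(e) for e in edges}
    return [perm for perm in permutations(range(n))
            if all(frozenset((perm[u], perm[v])) in edge_set for u, v in edges)]

def orbits(n, autos):
    """Partition of range(n) into orbits under the given permutations."""
    seen, result = set(), []
    for v in range(n):
        if v not in seen:
            orbit = sorted({perm[v] for perm in autos})
            seen.update(orbit)
            result.append(orbit)
    return result

# Path 0-1-2-3: the only nontrivial automorphism is the reversal.
path = [(0, 1), (1, 2), (2, 3)]
path_orbits = orbits(4, automorphisms(4, path))

# 5-cycle: vertex transitive, so there is a single orbit.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
cycle_autos = automorphisms(5, cycle)
cycle_orbits = orbits(5, cycle_autos)
```

The path has two orbits, the end points $\{0, 3\}$ and the inner vertices $\{1, 2\}$, while the 5-cycle has ten automorphisms and a single orbit containing all its vertices.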
To find which edges (or pairs of adjacent vertices) of G are structurally equivalent, one can compute the line graph of G, L(G), which is constructed in the following way: for each edge in G, make a vertex in L(G); for every two edges in G that have a vertex in common, make an edge between their corresponding vertices in L(G). Then, the (vertex) automorphisms of L(G) induce a partition of the edges of G into orbits. Each one of these orbits contains edges of G that are structurally equivalent in G.
To find which pairs of nonadjacent vertices of $G$ are structurally equivalent, one can compute the line graph of the complement of $G$, which is the graph $\bar{G}$ on the same vertices such that two distinct vertices of $\bar{G}$ are adjacent if and only if they are not adjacent in $G$. Then, the (vertex) automorphisms of $L(\bar{G})$ induce a partition of the pairs of nonadjacent vertices of $G$ into orbits. Each one of these orbits contains pairs of nonadjacent vertices of $G$ that are structurally equivalent in $G$.
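The two line-graph constructions can be sketched in the same brute-force style. Again this is only an illustration on a four-vertex path, with illustrative function names; it is not how the orbits of the KS18 and Yu-Oh graphs were computed.

```python
from itertools import combinations, permutations

def automorphisms(n, edges):
    """All permutations of range(n) that map every edge onto an edge."""
    edge_set = {frozenset(e) for e in edges}
    return [perm for perm in permutations(range(n))
            if all(frozenset((perm[u], perm[v])) in edge_set for u, v in edges)]

def orbits(n, autos):
    """Partition of range(n) into orbits under the given permutations."""
    seen, result = set(), []
    for v in range(n):
        if v not in seen:
            orbit = sorted({perm[v] for perm in autos})
            seen.update(orbit)
            result.append(orbit)
    return result

def line_graph(edges):
    """Nodes of L(G) are the edges of G; two nodes are adjacent when the
    corresponding edges of G share a vertex."""
    nodes = [tuple(sorted(e)) for e in edges]
    adjacency = [(i, j) for i, j in combinations(range(len(nodes)), 2)
                 if set(nodes[i]) & set(nodes[j])]
    return nodes, adjacency

def complement(n, edges):
    """Edges of the complement graph on the same n vertices."""
    edge_set = {frozenset(e) for e in edges}
    return [p for p in combinations(range(n), 2) if frozenset(p) not in edge_set]

def pair_orbits(pairs):
    """Orbits of the given vertex pairs under the automorphisms of their line graph."""
    nodes, adjacency = line_graph(pairs)
    autos = automorphisms(len(nodes), adjacency)
    return [[nodes[i] for i in orbit] for orbit in orbits(len(nodes), autos)]

path = [(0, 1), (1, 2), (2, 3)]
edge_orbits = pair_orbits(path)                    # uses L(G)
nonedge_orbits = pair_orbits(complement(4, path))  # uses L of the complement
```

For the path, the outer edges $(0,1)$ and $(2,3)$ form one orbit and the middle edge $(1,2)$ another, while the nonadjacent pairs split into $\{(0,2), (1,3)\}$ and $\{(0,3)\}$.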

Symmetries of the graph of compatibility of KS18
The 18 vertices of the graph of compatibility of KS18 only have one orbit. That is, all vertices are structurally equivalent. In this case, it is said that the graph is vertex transitive.
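The 63 edges can be partitioned into three orbits [see Fig. 2(a) in the main text]: the A (or red) orbit, with 18 edges, which are the edges of cliques (sets of mutually adjacent vertices) with 6 edges each; the B (or black) orbit, with 36 edges, which are the 6 × 6 edges of further cliques; and the C (or blue) orbit, with 9 edges.

The pairs of nonadjacent vertices can also be partitioned into three orbits: the α (or violet background) orbit, with 18 nonadjacent pairs; the β (or orange background) orbit, with 36 nonadjacent pairs; and the γ (or cyan background) orbit, with 36 nonadjacent pairs.

All this information can be summarized in the following matrix:

$$
\begin{array}{c|cccccccccccccccccc}
  & a & a & a & a & a & a & a & a & a & a & a & a & a & a & a & a & a & a \\ \hline
a & aa & A & B & \beta & \beta & \alpha & \alpha & \beta & \beta & \gamma & \gamma & C & B & \gamma & B & A & A & \gamma \\
a & A & aa & B & \alpha & \beta & \beta & \beta & \beta & \alpha & B & B & \gamma & \gamma & C & \gamma & A & A & \gamma \\
a & B & B & aa & \beta & \alpha & \beta & \beta & \alpha & \beta & B & B & \gamma & B & \gamma & B & \gamma & \gamma & C \\
a & \beta & \alpha & \beta & aa & A & B & \beta & \beta & \alpha & \gamma & A & A & \gamma & \gamma & C & \gamma & B & B \\
a & \beta & \beta & \alpha & A & aa & B & \beta & \alpha & \beta & \gamma & A & A & B & B & \gamma & C & \gamma & \gamma \\
a & \alpha & \beta & \beta & B & B & aa & \alpha & \beta & \beta & C & \gamma & \gamma & B & B & \gamma & \gamma & B & B \\
a & \alpha & \beta & \beta & \beta & \beta & \alpha & aa & A & B & \gamma & C & \gamma & \gamma & A & A & B & \gamma & B \\
a & \beta & \beta & \alpha & \beta & \alpha & \beta & A & aa & B & B & \gamma & B & \gamma & A & A & \gamma & C & \gamma \\
a & \beta & \alpha & \beta & \alpha & \beta & \beta & B & B & aa & B & \gamma & B & C & \gamma & \gamma & B & \gamma & B \\
a & \gamma & B & B & \gamma & \gamma & C & \gamma & B & B & aa & B & B & \beta & \beta & \alpha & \alpha & \beta & \beta \\
a & \gamma & B & B & A & A & \gamma & C & \gamma & \gamma & B & aa & A & \alpha & \beta & \beta & \beta & \alpha & \beta \\
a & C & \gamma & \gamma & A & A & \gamma & \gamma & B & B & B & A & aa & \beta & \alpha & \beta & \beta & \beta & \alpha \\
a & B & \gamma & B & \gamma & B & B & \gamma & \gamma & C & \beta & \alpha & \beta & aa & B & B & \beta & \alpha & \beta \\
a & \gamma & C & \gamma & \gamma & B & B & A & A & \gamma & \beta & \beta & \alpha & B & aa & A & \beta & \beta & \alpha \\
a & B & \gamma & B & C & \gamma & \gamma & A & A & \gamma & \alpha & \beta & \beta & B & A & aa & \alpha & \beta & \beta \\
a & A & A & \gamma & \gamma & C & \gamma & B & \gamma & B & \alpha & \beta & \beta & \beta & \beta & \alpha & aa & A & B \\
a & A & A & \gamma & B & \gamma & B & \gamma & C & \gamma & \beta & \alpha & \beta & \alpha & \beta & \beta & A & aa & B \\
a & \gamma & \gamma & C & B & \gamma & B & B & \gamma & B & \beta & \beta & \alpha & \beta & \alpha & \beta & B & B & aa
\end{array}
\tag{I1}
$$

Proof that the tight Bell inequality associated to KS18 has the same symmetries as the graph of compatibility of KS18
Eq. (I1) reflects the symmetries (automorphisms) of the graph of compatibility of KS18. Fig. 2(b) (see main text) provides the coefficients of $I^t_{\text{KS18}}$, which defines a facet of the local polytope of the (2, 18, 2) Bell scenario.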
$I^t_{\text{KS18}}$ has the same symmetries as the graph of compatibility of KS18 in the sense that we can associate to each different symbol in Eq. (I1) a unique coefficient in Fig. 2