Efficient device-independent entanglement detection for multipartite systems

Entanglement is one of the most studied properties of quantum mechanics for its application in quantum information protocols. Nevertheless, detecting the presence of entanglement in large multipartite sates continues to be a great challenge both from the theoretical and the experimental point of view. Most of the known methods either have computational costs that scale inefficiently with the number of particles or require more information on the state than what is attainable in everyday experiments. We introduce a new technique for entanglement detection that provides several important advantages in these respects. First, it scales efficiently with the number of particles, thus allowing for application to systems composed by up to few tens of particles. Second, it needs only the knowledge of a subset of all possible measurements on the state, therefore being apt for experimental implementation. Moreover, since it is based on the detection of nonlocality, our method is device independent. We report several examples of its implementation for well-known multipartite states, showing that the introduced technique has a promising range of applications.


I. INTRODUCTION
Entanglement is the key ingredient for several protocols in quantum information theory, such as quantum teleportation [1], quantum key distribution [2], measurementbased quantum computation [3] and quantum metrology schemes [4]. Therefore, developing techniques to detect the presence of entanglement in quantum states is crucial and in the past years several methods have been introduced.
The most general way to detect entanglement in a given system consists of reconstructing its quantum state using tomography and then applying any entanglement criterion to the resulting state [5]. This, however, is costly both from an experimental and a theoretical perspective. First, determining the state of large quantum systems is impractical in experiments, given that quantum tomography implies measuring a number of observables that increases exponentially with the number of systems, e.g., 3 N observables even in the simplest case of N qubits [6]. Second, determining whether an arbitrary state is entangled is known to be a hard problem -to the best of our knowledge, the computational resources of the most efficient known algorithm scale exponentially with N [7]. Because of these problems, it is very desirable to develop entanglement detection techniques with more accessible experimental and computational requirements.
One possible approach is to make use of entanglement witnesses. These are criteria for detecting entanglement that require measuring only some expectation values of local observables [8]. In particular, attempts have been made to derive witnesses that adapt to the limited amount of information that is usually available in a typical experiment. For instance, one can consider witnesses involving only two-body correlators [9] or a few global measurements [10,11]. Nonetheless, * flavio.baccari@icfo.eu entanglement witnesses constitute a method that lacks generality, given that the known methods are generally tailored to detect very specific states. There are techniques capable of deriving a witness for any generic entangled state, which can also be constrained to the available set of data [12], or adapted to require the minimal amount of measurements on the system [13]. However, they always involve an optimization procedure that runs on an exponentially increasing number of parameters. A method to detect metrologically useful (hence entangled) states based on a couple of measurements has recently been proposed [14]. However, these states represent only a subset of all entangled states.
A qualitatively different approach to entanglement detection is based on Bell nonlocality [15]. Indeed, the presence of nonlocality provides a certificate of the entanglement in the state. Moreover, it has the advantage that it can be assessed in a device-independent manner, i.e. without making any assumption on the actual experimental implementation [16]. The easiest way to detect nonlocality is by means of the violation of a Bell inequality. However, in analogy with the entanglement case, each inequality is usually violated by a very specific class of states. In the general case, verifying whether a set of observed correlations is nonlocal can be done via linear programing [17]. Nonetheless, the number of variables involved again grows exponentially with the number of particles, e.g. as 4 N already for the simplest scenarios where only two dichotomic measurements per party are applied [16].
To summarize, the methods to detect entanglement known so far are either not general or too costly, from a computational and/or experimental viewpoint, to be applied to large systems.
Here we present a novel technique for device-independent entanglement detection that is efficient both experimentally and computationally. On the one hand, it requires the knowledge of a subset of all possible measurements, most of them consisting of few-body correlation functions, which makes it suitable for practical implementations. On the other hand, it can be applied to any set of observed correlations and can be implemented by semidefinite programing involving a number of variables that grows polynomially with N .
Of course, all these nice properties become possible only because our method for entanglement detection is a relaxation of the initial hard problem. However, and despite being a relaxation, we demonstrate the power of our approach by showing how it can be successfully applied to several physically relevant examples for systems of up to 29 qubits. These examples demonstrate that our approach opens a promising avenue for entanglement detection of large many-body quantum systems.
This article is organized as follows: in Section II we introduce the basic notation and definitions, while in Section III we present the idea of the method together with the application to a simple scenario. Section IV is devoted to the presentation of its geometrical interpretation together with the comparison to the other techniques. In Section V and VI we list some examples of application of the method to relevant classes of states. Lastly, Section VII contains conclusions and some future perspectives.

II. NOTATION AND DEFINITIONS
We consider an entanglement detection scenario in which N observers, denoted by A 1 , . . ., A N , share an N -partite quantum state ρ N . Each A i performs m possible measurements, each having d outcomes. We represent the measurement of party i by M ai xi , where x i ∈ {0, . . ., m − 1} denotes the measurement choice and a i ∈ {0, . . ., d − 1} are the corresponding outcomes.
By repeating the experiment sufficiently many times, the observers can estimate the conditional probabilities of getting the different outcomes depending on the measurements they have performed. The conditional probability distribution p(a 1 , . . ., a N |x 1 , . . ., x N ) describes the correlations observed among the observers when applying the local measurements M ai xi on the state ρ N . One can also define the marginal distributions ..,i k is the reduced state of ρ N corresponding to the considered subset of parties. Marginals can equivalently be obtained from the full distribution (1) by summing over the remaining outcomes.
Since in what follows we mostly consider scenarios involving two-output measurements (resulting from projective measurements performed on qubits), it is convenient to introduce the concept of correlators where 0 ≤ i 1 < . . . < i k < N , x ij ∈ {0, m − 1} and 1 ≤ k ≤ N . The value of k represents the order of the correlators: for instance, expectation values M (i1) are of order two. Correlators of order N are often referred to as full-body correlators. In scenarios involving only dicothomic measurements, correlators encode all the information in the observed distribution (1). When working with correlators, it is also useful to introduce the measurement operators in the expectation value form, namely by using the notation M Now that the main concepts have been introduced, we proceed with outlining the proposed entanglement detection method.

III. METHOD
Our method is based on the following reasoning (discussed in detail below): 1. If a quantum state ρ N is separable, local measurements performed on it produce local correlations (i.e. correla-tions admitting a local model).
2. Any local correlations can be realized by performing commuting local measurements on a quantum state.
3. Correlations produced by commuting local measurements define a positive moment matrix with constraints associated to the commutation of all the measurements.
4. Our method consists in checking if the observed correlations are consistent with such a positive moment matrix. In the negative case the state ρ N producing the correlations is proven to be entangled.
Let us now explain all these points in detail. First, given a separable quantum state, i.e. ρ N = λ p λ i ρ Ai λ , any set of conditional probability distributions obtained after performing local measurements on it admits a decomposition of the following form p(a 1 , . . ., a N |x 1 , . . ., where p(a i |x i , λ) = tr(M ai xi ρ Ai λ ). In the context of Bell nonlocality, distributions that can be written in this form are called local [16]. Local correlations do not violate any Bell inequality. Conversely, if a given distribution cannot be described by a local model like (4), it is said to be nonlocal. We notice that, by the reasoning presented above, whenever the set of observed distributions (1) is nonlocal, we can conclude that the shared state is entangled. Moreover, since nonlocality is a property that can be assessed at the level of the probability distribution, it can be seen as a device-independent way of detecting entanglement. For the sake of brevity, throughout the rest of the paper we therefore refer to our method as a nonlocality detection one.
The second ingredient is that any local set of probability distributions has a quantum realization in terms of local commuting measurements applied to a quantum state [18]. In order to see it more explicitly we first realize that any decomposition of the form (4) can be rewritten as where D(a i |x i , λ) are deterministic functions that give a fixed outcome a for each measurement, i.e. D(a i |x i , λ) = δ ai,λ(xi) , such that a i = λ(x i ), with λ(·) a function from {0, . . . , m−1} to {0, . . . , d − 1} [16]. It is easy to see that any such decomposition can be reproduced by choosing the multipartite state ρ N = λ q λ |λ λ| ⊗N and measurement operators of The last step consists in using a modified version of the Navascues-Pironio-Acin (NPA) hierarchy [19,20] that takes into account the commutativity of the local measurements to test if the observed probability distribution is local (a similar idea was introduced in the context of quantum steering [21] -see also [22]). The NPA hierarchy consists of a sequence of tests aimed at certifying if a given set of probability distributions has a quantum realization (1). In NPA one imposes the commutativity of the measurements between the distant parties. Now, we will impose the extra constraints that the local measurements on each party also commute. The resulting semidefinite program (SDP) hierarchy is nothing but an application in this context of the more general method for polynomial optimization over noncommuting variables introduced in [23], see also [24]. As noticed there, by imposing commutativity of all the variables this general hierarchy reduces to the well-known Lassere hierarchy, namely the relaxation for polynomial optimization of commuting variables [25]. An application of this relaxation technique to describe local correlations was also proposed in [26]. However, to the best of our knowledge, no systematic analysis of its application to multipartite scenarios has been considered thus far.

Details and convergence of the hierarchy
It is convenient for what follows to recall the main ingredients of the NPA hierarchy [19,20], which, as said, was de-signed to characterize probability distributions with a quantum realization (1). Consider a set O, composed by some products of the measurements operators {M ai xi } or linear combinations of them. By indexing the elements in the set as O i with i = 1, . . .k, we introduce the so-called moment matrix Γ as the k × k matrix whose entries are defined by . For any choice of measurements and state, it can be shown that Γ satisfies the following properties: i) it is positive semidefinite, ii) its entries satisfy a series of linear constraints associated to the commutation relations among measurement operators by different parties and the fact that they correspond to projectors, iii) some of its entries can be computed from the observed probability distribution (1), iv) some of its entries correspond to unobservable numbers (e.g. when O i and O j involve noncommuting observables).
Based on these facts one can define a hierarchy of tests to check whether a given set of correlations has a quantum realization. One first defines the sets O ν composed of products of at most ν of the measurement operators, and creates the corresponding Γ matrix using the set of correlations and leaving the unassigned entries as variables. Then one seeks for values for these variables that could make the Γ positive. This problem constitutes a SDP, for which some efficient solving algorithms are known [27]. If no such values are found this means that the set of correlations used does not have a quantum realization. By increasing the value of ν, one gets a sequence of stricter and stricter ways of testing the belonging of a distribution to the quantum set.
We can now use the same idea to define a hierarchy of conditions to test whether a given set of correlations has a quantum realization with commuting measurements. To do so we simply impose additional linear constraints on the entries of the moment matrix resulting from assuming that the local measurements also commute (for a more detailed discussion, see Appendix A). Thus, given a set of observed probability distributions one can use them to build a NPA-type matrix with the additional linear constraints associated to the local commutation relations, and run a SDP to check its positivity to certify if the considered set of correlations can not be obtained by measuring a separable state.
Interestingly, the convergence of this hierarchy follows from the results in [20,23]. Roughly speaking, one can say that since the NPA hierarchy is proven to converge to the set of quantum correlations, our method provides a hierarchy that converges to the set of quantum correlations with commuting measurements, which we have shown to be equivalent to the set of local correlations [28]. Therefore, any nonlocal correlation would fail the SDP test at a finite step of the sequence given by the O ν .
Moreover, the commutativity of all the measurements implies that the total number of variables that can be involved in the SDP test is finite. The reason is that the longest nontrivial product of the operators that can appear in the moment matrix consists of the products of all the different M ai xi . Hence, the number of variables in the moment matrix stops growing after the first step at which this product appears. This also implies that the convergence of the hierarchy is met at a finite step as well, namely coinciding to the level ν at which the longest nontrivial products appear in the list of operators O ν . Indeed, it is easy to see that for µ > ν , there cannot appear new operators in the generating set, i.e. O µ = O ν . Of course, the aforementioned levels depend on the numbers (N, m, d) defining the scenario and it is, in general, high. Indeed, according to the Collins-Gisin representation [29], one has N m(d − 1) independent measurement operators M ai xi . Therefore, the product of all of them would first appear in the moment matrix To conclude, we stress that, depending on the level of the hierarchy, one might not need knowledge of the full probability distribution. Indeed, by looking at (2), it is evident that to define a marginal distribution involving k parties, one requires the product of k measurements M ai xi . Now, given that the operators of the set O ν contain products of at most ν measurement operators, the terms in the moment matrix at level ν can only coincide with the marginals of the observed distri-bution of up to k = 2ν parties. Therefore, in the multipartite setting, fixing the level of the hierarchy is also a way to limit the order of the marginals that can be assigned in the moment matrix.

Simple example
After presenting the general idea of the method, it is convenient to conclude the section by illustrating it with a concrete example. In what follows, we present the explicit form of the moment matrix for the bipartite case, two dicothomic measurements per party and level ν = 2 of the hierarchy. For the sake of simplicity, we rename the expectation value operators for the two parties as A x and B y , with x, y = 0, 1. In this scenario, the set of operators reads as where we define the following unassigned variables Now, if we further impose commutativity of all the measurements, namely [A 0 , A 1 ] = 0, [B 0 , B 1 ] = 0, the corresponding linear constraints reduce the number of variables. Explicitly, one gets v * i = v i for any i = 1, . . ., 15, and also (8) For a visual representation, the variables that become identical because of the commutativity constraints are represented by the same color in (6). For any set of observed correlations { A x , B y , A x B y }, testing whether it is local can be done in the following steps: assigning the values to the entries of Γ that can be derived from the observed correlations and leaving the remaining terms as variables, then checking whether there is an assignment for such variables such that the matrix is positive semidefinite. For instance, it is possible to check that any set of correlations that violates the well-known Clauser-Horne-Shimony-Holt (CHSH) inequality [30] is incompatible with a positive semidefinite matrix (6). We stress that a necessary condition to produce correlations that violate (9) is that the measurements performed by each party does not commute with each other. This shows how the commutativity constraints imposed in the SDP test are crucial for the detection of the nonlocality of the observed correlations.
To conclude, we notice that, in this particular scenario, any set of nonlocal correlations has to violate the CHSH inequality (or symmetrical equivalent of it) [16]. Therefore, it turns out that in this case the second level of the hierarchy is already capable of detecting any nonlocal correlation. That is, even if in this scenario the hierarchy is expected to converge at level ν = 4, the second level happens already to be tight to the local set.

IV. GEOMETRICAL CHARACTERIZATION OF CORRELATIONS
Before presenting the applications of our method, we review a geometrical perspective, schematically represented in Figure 1 [16], that is useful when studying correlations among many different parties. It is known that the set of local correlations (4) defines a polytope, i.e. a convex set with a finite number of extremal points. Such points coincide with the deterministic strategies D(a i |x i , λ) introduced in (5) and can be easily defined for any multipartite scenario. As represented in Figure 1, the set of quantum correlations (1) is strictly bigger than the local set. All the points lying outside the set L represent nonlocal correlations.
Determining whether some observed correlations are nonlocal corresponds to checking whether they are associated to a point outside the local set. A very simple way to detect nonlocality is by means of Bell inequalities. They are inequalities that are satisfied by any local distribution and geometrically they constitute hyperplanes separating the L set from the rest of the correlations. Violating a Bell inequality directly implies that the corresponding distribution is nonlocal. However, there can be nonlocal correlations that are not detected by a given inequality, meaning that they fall on the same side of the hyperplane as local correlations.
On the other hand, a very general technique to check if a point belongs to the local set consists in determining if it can be decomposed as a convex combination of its vertices [17]. Such a question is a typical instance of a linear programing problem, for which there exist algorithms that run in a time that is polynomial in the number of variables [31]. Nevertheless, finding a convex decomposition in the multipartite scenario is generally an intractable problem because the number of deterministic strategies grows as d mN . Already in the simplest cases in which each party measures only m = 2, 3 dicothomic measurements, the best approach currently known stops at N = 11 and N = 7 respectively [32].
Coming back to the SDP method presented in the previous section, we can now show how the technique can help in overcoming the limitations imposed on the linear program. Let us define the family of sets L ν as the ones composed by the correlations that are compatible with the moment matrix Γ defined by the observables O ν and the additional constraints of commuting measurements. Given that any local distribution has a quantum representation with commuting measurements, the series L 1 ⊇ L 2 ⊇ . . . ⊇ L defines a hierarchy of sets approximating better and better the local set from outside. In Figure 1 we show a schematic representation of the first levels of approximations.
Interestingly, it can be seen that the first level of the hierarchy is not capable of detecting any nonlocal correlations. A simple way to understand it is that, in the moment matrix generated by O 1 , imposing commutativity of the local measurement does not result in any additional constraint in the entries.
A clear example is given by the N = 2 case presented in the previous section. The moment matrix corresponding to the first level can be identified with the 5×5 top-left corner of (6). There, the only modification imposed by local commutativity is the condition for the matrix to be real, which can always be assumed when working with quantum correlations. Therefore, we can say that L 1 = Q 1 , meaning that the first level of our relaxation coincides with the first level of the original NPA, thus resulting in an approximation of the quantum set from outside.
Since we are interested in focusing on the first nontrivial level that allows for nonlocality detection, we then consider L 2 . We notice that, at this level of the hierarchy, specifying the entries Γ ij = tr(O † i O j ρ) requires knowledge of upto-four-body correlators. Moreover, the amount of terms in the set O 2 scales as the number of possible pairs of measurements M ai xi , that is, as N 2 m 2 d 2 . This implies that the size of the moment matrix scales only quadratically with the number of parties and measurements, which is much more efficient compared to the exponential dependence d mN of the linear program. Moreover, since the elements in the moment matrix involve at most four operators, this implies that the number of measurements to be estimated experimentally scales as N 4 m 4 d 4 .
As mentioned before, checking whether a set of observed correlations belongs to L 2 constitutes a SDP feasibility problem. Since we are addressing approximations of the local set, there will be nonlocal correlations that will fall inside L 2 and that will not be distinguishable from the local correlations. Therefore, our technique can provide only necessary conditions for nonlocality. Nonetheless, we are able to find several examples in which this method is able to successfully detect nonlocal correlations arising from various relevant states, proving that it is not only scalable, but also a powerful method despite being a relaxation.

V. APPLICATIONS
The goal of this section is to show that the SDP relaxation can be successfully employed for detection of nonlocality arising from a broad range of quantum states. We focus particularly in exploring the efficient scaling of the method in terms of number of particles. To generate the SDP relaxations, we use the software Ncpol2sdpa [33], and we solve the SDPs with Mosek [34].
We collect evidence that, from a computational point of view, the main limiting factor of the technique is not time but the amount of memory required to store the moment matrix. Indeed, the longest time that is taken to run one of the codes amounts to approximately 9 h [35]. Despite the memory limitation, the SDP technique allows us to consider multipartite scenarios that cannot be dealt with in the standard linear program approach to check locality. Indeed, for the scenarios with m = 2, 3, we are able to detect nonlocality for systems of up to N = 29 and N = 15 respectively, thus overcoming the current limits of [32].
In the following sections, we list the examples of states we L L 1 L 2 Q Bell-like inequality FIG. 1. Pictorial representation of the sets of correlations, together with our approach to detection of multipartite nonlocality. The L and Q sets delimit the local and quantum correlations respectively. As shown here, the first forms a polytope, namely a convex set delimited by a finite amount of extremal points, while the second, despite still being convex, is not a polytope. The light orange sets are the first representatives of the hierarchy L1 ⊇ L2 ⊇ . . . ⊇ L approximating the local set from outside. It can be seen that some of the quantum correlations lie outside the L2, meaning that they are detected as nonlocal from the SDP relaxation at the second level. The dotted line shows a Bell-like inequality that can be obtained by the corresponding dual problem.
consider. Given that we study cases with dichotomic measurements only, we present them in the expectation value form {M  We are able to show that the obtained probability distribution is detected as nonlocal at level L 2 for N ≤ 29. We recall that in this scenario the complexity of this test scales as O(N 4 ), in terms of both elements to assign in the moment matrix and measurements to implement experimentally.
We also study the robustness of our technique to white noise, where 0 ≤ p ≤ 1 and 1 N represents the identity operator acting on the space of N qubits. We estimate numerically the maximal value of p, referred to as p max , for which the given correlations are still nonlocal according to the SDP criterion. Table I reports the resulting values as a function of the number of parties. While the robustness to noise decreases with the number of parties, the method tolerates realistic amounts of noise, always larger than 6%, for all the tested configurations.  Finally, in order to study the robustness of the proposed test with respect to the choice of measurements, we also consider a situation where the parties are not able to fully align their measurements and choose randomly two orthogonal measurements [36]. More precisely, we assume that M are vectors chosen uniformly at random, with the only constraint of being orthogonal; namely x The results for random measurements also exemplify one of the advantages of our approach with respect to previous entanglement detection schemes. Given some observed correlations, our test can be run and sometimes detects whether the correlations are nonlocal and therefore come from an entangled state. To our understanding, reaching similar conclusions using entanglement witnesses or other entanglement criteria is much harder, as they require solving optimization problems involving N -qubit mixed states.

GHZ state
Another well-studied multipartite state is the Greenberger-Horne-Zeilinger (GHZ) state, given by Contrarily to the W state, such a state is not suited for detection of nonlocality with few-body correlations because all the k-body distributions arising from measurements on (12) are the same as those obtained by measuring the separable mixed state 1 2 (|0 0| ⊗k + |1 1| ⊗k ). Therefore, in order to apply our nonlocality detection method to the GHZ state we need to involve at least one full-body term.
The solutions we present are inspired by the self-testing scheme for graph states introduced in [37]: the first scenario involves m = 3 dichotomic measurements per party; namely M }. The moment matrix corresponding to such set represents a mixed level of the relaxation, containing also two full-body correlators in the entries. However, since the number of added columns and rows is fixed to 2 for any N , this level is basically equivalent to level L 2 . Therefore, we preserve the efficient O(N 4 ) scaling with the number of parties of elements in the moment matrix and measurements to implement.
By numerically solving the SDP associated to this mixed level of the hierarchy we are able to confirm nonlocality of the correlations arising from the GHZ state and the given measurement for up to N ≤ 15 parties. Moreover, we check that the number of full-body values that is necessary to assign is constant for any of the considered N , coinciding with the two correlators M Lastly, we estimate that the robustness to noise in this case does not depend on N and it amounts to p max ≈ 0.17.
As a second scenario, we also notice that one can produce nonlocal correlations from the GHZ at the level O mix by considering m = 2 measurement choices only. Indeed, if ones considers M the resulting correlations are detected as nonlocal for any N ≤ 28 (the fact that we are not able to reach N = 29 is due to the mixed level of the relaxations, which results in a bigger matrix compared the scenario for the W state). Table (II) shows the corresponding robustness to noise, computed in the same way as for the W state. For both configurations, the noise robustness of our scheme in detecting GHZ states seems to saturate for large N even if the computational (and experimental) effort scales polynomially.

Graph states
Graph states [38] constitute another important family of multipartite entangled states. Such states are defined as follows: consider a graph G, i.e. a set of N vertices labeled by i N pmax N pmax N pmax  connected by some edges E ij connecting the vertices i and j.
We associate a qubit system in the state |+ i for each edge i and apply a control-Z grate CZ ij = diag(1, 1, 1, −1) to every pair of qubits i and j that are linked according to the graph G. We notice that GHZ state is also a graph state, associated to the so-called star graph. However, due to its particular relevance in quantum information, we prefer to treat its case in the previous section. Here we consider some other exemplary graph states such as the 1D and 2D cluster states and the loop graph state illustrated in Figure 2. Inspired by the self-testing scheme in [37], we consider that each party applies three measurements given by σ x , σ z and σ d . We are able to detect nonlocality in the obtained correlations at level L 2 for states involving up to N = 15 qubits. Again, the method at this level scales as N 4 .
Interestingly, our approach for the detection of nonlocal correlations generated by graph states shows to be qualitatively different from McKague's scheme in [37]. While the latter requires correlators of an order that depends on the connectivity of the graph (namely, equal to 1 plus the maximal number of neighbors that each vertex has), our method seems -at least in some cases -to be independent of it. Indeed, we are able to detect nonlocality with four-body correlators in 2D cluster states, whose connectivity would imply five-body correlators for the self-testing scheme.

VI. EXPLICIT BELL INEQUALITIES
Another nice property of our nonlocality criterion comes from the fact that, as it can be put in a SDP form, it immediately provides a method to find experimentally friendly Bell inequalities involving a subset of all possible measurements. In fact, it turns out that the the SDP proposed in Sec. III has a dual formulation that can be interpreted as the optimization of a linear function of the correlations that can be seen as a Belllike functional, i.e. a functional that has a nontrivial bound for all correlations in L k [20] (see Appendix A for details). Thus, if a set of correlations is found to be nonlocal, then the solution of the SDP provides a Bell inequality that is satisfied by correlations in L ν and that is violated by the tested correlations. Importantly, this Bell inequality can further be used to test other sets of correlations.
By using the two sets of correlations obtained by measuring 3 dicothomic observables per party in the GHZ state we are able to find the following Bell inequality: Numerically, we could certify the validity of this inequality for up to N ≤ 15. Moreover, in principle the bound of β C = 2(N − 1) is only guaranteed to be satisfied by correlations in L mix . However, motivated by the obtained numerical insight, we could prove that this bound actually coincides with the true local bound, and therefore, (13) is a valid Bell inequality for all N (for all the analytical proofs regarding this section, see Appendix B). This shows that, at least in this instance, the L mix defined by the SDP relaxation associated to O mix is tight to the local set.
It is also easy to show that (13) is violated by the GHZ state and the previously introduced choice of measurements. In particular, the value reached is I 3 GHZ = (1 + √ 2)(N − 1) for any N . Given that both the local bound and the violation scale linearly with N , the robustness of nonlocality to white noise is constant and amounts to p max = Similarly, we also find the following Bell inequality by using the set of correlations involving only two measurements per party described for the GHZ state: Once more, although this inequality is found numerically for up to N ≤ 28 we prove that it is valid for any N . Moreover the bound β C = 2(N − 1) is not only valid for correlation in L mix but for any local set of correlations. The GHZ state and the given measurements result in a violation of Given that in this case the relative violation is lower, we also have a lower robustness to noise, coinciding with p max = √ 2−1 √ 2+3 ≈ 0.09 for any N . We notice that this value is different from the ones reported in Table (II). The reason is that, to derive inequality (14) from the dual, we restrict to assign only the values of the two-body correlations and the two full-body ones. On the other hand, the results in Table (II) also take into account the assignment of the threeand four-body correlators, showing that this additional knowledge helps in improving the robustness to noise.
As a final remark, we stress that the measurement settings considered to derive an inequality from the dual might not be the optimal ones. For instance, we are able to identify different measurement choices for the case of (14) that lead to a higher violation of such an inequality, hence resulting also in a better robustness to noise (see Appendix B for details).

VII. DISCUSSION
We introduce a technique for efficient device-independent entanglement detection for multipartite quantum systems. It relies on a hierarchy of necessary conditions for nonlocality in the observed correlations. By focusing on the second level of the hierarchy, we consider a test that requires knowledge of up to four-body correlators only. We show that it can be successfully applied to detect entanglement of many physically relevant states, such as the W , the GHZ and the graph states. Besides being suitable for experimental implementation, our technique also has an efficient scaling in terms of computational requirements, given that the number of variables involved grows polynomially with N . This allows us to overcome the limitation of the currently known methods and to detect entanglement for states of up to few tens of particles. Moreover, the proposed technique has a completely general approach and it can be applied to any set of observed correlations. This makes it particularly relevant for the detection of new classes of multipartite entangled states.
We note that our techniques can also be used as a semidefinite constraints to impose locality. Consider, for instance, a linear function f of the observed correlations. One could find an upper bound on the value of this function over local correlations by maximizing it under the constraint that the moment matrix Γ is positive semidefinite. A particular example could be to take f to be a Bell polynomial. Thus this approach would find a bound f ≤ β C satisfied by all local correlations.
As a future question in this direction, it would be interesting to study how accurate is the approximation of the local set of correlations provided by the second level of the hierarchy. In some of the scenario that we consider the approximation is actually tight, but this is not generally the case. A possible approach could be to compare the local bound of some known Bell inequalities with that resulting from the hierarchy.
Furthermore, we notice that the second level of the hierarchy also has an efficient scaling with the number of measurements performed by the parties. This would allow us to inquire whether an increasing number of measurement choices can provide an advantage for entanglement detection in multipartite systems.
Lastly, we believe that the present techniques can be readily applied in current state-of-the-art experiments. For instance, experiments composed by up to 7 ions have demonstrated nonlocality using an exponentially increasing number of full correlators [39]. Moreover, recent experiments have produced GHZ-like states in systems composed by 14 ions [40] and 10 photons [41,42] with visibilities within the range required to observe a violation of the Bell inequalities presented here. We notice, however, that the measurements required to certify the presence of nonlocal correlations using our approach are different from the ones reported in these works.

VIII. ACKNOWLEDGEMENTS
The authors thank J. Kaniewski and the referees for helpful comments. D.C. thanks R. Chaves for discussions and hospitality at the International Institute of Physics (Natal-Brazil). This work was supported by the ERC CoG grant QITBOX, Spanish MINECO (Severo Ochoa grant SEV-2015-0522, a Ramón y Cajal fellowship, a Severo Ochoa PhD fellowship, QIBEQI FIS2016-80773-P and FOQUS FIS2013-46768-P), the AXA Chair in Quantum Information Science, the Generalitat de Catalunya (SGR875 and CERCA Program) and Fundacion Cellex. The authors also acknowledge the computational resources granted by the High Performance Computing Center North (SNIC 2016/1-320).
Here, we present in more detail the SDP relaxation associated to quantum realizations with commuting measurements. In order to be consistent with the examples presented in the main text, we express it in terms of correlators, but we stress that a formulation in terms of projector and probabilities for higher numbers of outcomes is straightforward.
Let us consider that the N observers A i are allowed to perform m dicothomic measurements each. We can therefore define the operators M xi } up to order ν. Then, by indexing the operators in the sets as O i for i = 1, . . ., k, we define the k × k moment matrix as follows where ρ N is a generic N-partite quantum state. As it was shown in [19,20], for any set of quantum correlations P , the properties i)-iii) and the fact that the associated ρ N is a proper quantum state reflect into the following properties of the moment matrix: • Γ † = Γ, Up to this point, the method we describe coincides with the NPA hierarchy [19,20], which is used to check whether a set of observed correlations is compatible with a quantum realization. In order to define a hierarchy to test for local hidden variables realization, we introduce the additional condition that all the measurements for the same party have to also be commuting, namely It can be seen that property iv) implies a second set of linear constraints on the Γ matrix, which we identify as i,j (F m ) ij Γ ij = g m (P ) m = 1, . . ., l To make it clearer, we show an example of linear constraint that can come only if we impose condition iv). Let us consider the following four operators: O k = M xj . It is easy to see that, by exploiting i)-iii) plus iv), Γ kl = Γ nm for any choice of x i , y i , x j = 1, . . ., m and i, j = 1, . . ., N . Now, for any chosen O ν , we can test whether an observed distribution P is compatible with a local model via the following SDP maximize λ, subject to Γ − λ1 0, i,j (F m ) ij Γ ij = g m (P ) m = 1, . . ., l , (A1) i,j (F m ) ij Γ ij = g m (P ) m = 1, . . ., l , which is the primal form of the problem. A solution λ * min < 0 implies that it is not possible to find a semidefinite positive moment matrix satisfying the given linear constraints. Therefore P has no quantum realization with commuting measurements and we conclude it is nonlocal. We notice that by increasing the value of ν we get a sequence of more and more stringent tests for nonlocality. Indeed, the linear constraints for the level ν are always a subset of the ones coming from ν + 1. Moreover, in analogy with the NPA hierarchy, the series of tests is convergent; hence any nonlocal correlation will give a negative solution λ * min at a finite step of the sequence.
Interestingly, we can also study the dual form of the SDP problem, which reads as follows: minimize G(P ) = k y k g k (P ) + k y k g k (P ), Thanks to the strong duality of the problem, a negative solution for the primal implies also G(P ) = λ * min < 0. Since any point in L ν satisfies the SDP condition at level ν with G(P ) ≥ 0, we can interpret G(P ) as a Bell-like inequality separating the L ν from the rest of the correlations. Indeed, since g k (P ) and g k (P ) are linear in terms of the probability distribution, we derive that G(P ) ≥ 0 defines also a linear inequality for P . Violation of such an inequality directly implies nonlocality.