Sample-efficient device-independent quantum state verification and certification

Authentication of quantum sources is a crucial task in building reliable and efficient protocols for quantum-information processing. Steady progress vis-à-vis verification of quantum devices in the scenario with fully characterized measurement devices has been observed in recent years. When it comes to the scenario with uncharacterized measurements, the so-called black-box scenario, practical verification methods are still rather scarce. Development of self-testing methods is an important step forward, but these results so far have been used for reliable verification only by considering the asymptotic behavior of large, identically and independently distributed (IID) samples of a quantum resource. Such strong assumptions deprive the verification procedure of its truly device-independent character. In this paper, we develop a systematic approach to device-independent verification of quantum states free of IID assumptions in the finite-copy regime. Remarkably, we show that device-independent verification can be performed with optimal sample efficiency. Finally, for the case of independent copies, we develop a device-independent protocol for quantum state certification: a protocol in which a fragment of the resource copies is measured to warrant the rest of the copies to be close to some target state.


I. INTRODUCTION
Entangled states have been identified as a key resource in applications of quantum technologies, such as quantum communication [1], computation [2], cryptography [3], and sensing [4]. Harvesting a complete quantum advantage demands an increased precision in manufacturing the relevant components and devices. However, realistic quantum devices are rarely perfect: they suffer from unavoidable errors, noise or decoherence and in some cases, they are not trusted. Therefore, characterization and validation play a central role in real applications of quantum technologies. Verification techniques have a multifaceted complexity. Complexity is measured in terms of a minimal number of queries or instances of resource used (sample complexity), a different number of local measurement settings needed for the procedure (measurement complexity), quantum computational power needed to prepare verification circuit (quantum computational complexity), or classical resources needed to postprocess the verification results (postprocessing complexity). A compact comparison of different techniques in terms of complexity can be found in Ref. [5].
Complexity directly depends on the required confidence and informativeness [5,6]: how much information we would like to learn about the underlying system and how certain we want to be about the result. The required informativeness depends on the application, and usually there is a trade-off between informativeness and complexity. Take, for instance, sample complexity. A less informative technique is, for example, simple entanglement verification [7,8], in which the sample complexity can be reduced to only several copies [9] or even to the logical minimum of single-copy verification [10]. On the other hand, fully informative techniques, such as quantum state tomography [11-15], exhibit exponential sample complexity. Trading informativeness for complexity results in a reduction of resources. Somewhere in between low complexity (e.g., entanglement verification) and exponential complexity (full tomography), one finds resource-efficient tasks such as direct fidelity estimation [16] and shadow tomography [17-19]. Another important example is quantum state verification, which can be achieved with remarkably low sample complexity [20-22]. A specific feature of quantum state verification is that all measurements are local and the postprocessing complexity is much lower than that of tomographic methods. The technique has received a lot of attention and has been applied to construct verification protocols for various classes of states [10,21-28]. It has also been successfully implemented experimentally in Refs. [29,30]. This task is the central focus of our work.
One drawback of the techniques mentioned so far is that they all assume perfect characterization of all the measurements implemented during the process. This is inconsistent with a device-independent (DI) scenario, in which all devices, including the measurement devices, are treated as black boxes [31-34]. Such a paradigm is very useful in the context of protocols involving potential adversarial activity. A prominent candidate for device-independent state verification is the self-testing method [35,36]. However, self-testing conclusions are drawn from knowledge of the asymptotic behavior of an experiment, which in practice invokes the assumption of identically and independently distributed (IID) experimental runs. For this reason, self-testing is more of a theoretical prerequisite for practical DI quantum state verification. The literature concerning self-testing usually does not discuss the sample complexity of the procedure. Finite-statistics effects in self-testing in the non-IID setting are rigorously treated in Ref. [37] for the example of self-testing the maximally entangled pair of qubits. This is closely related to conditional estimators of the Bell inequality violation used in hypothesis tests of local realism without the IID assumption [38-42]. Other examples of taking non-IID effects into account can mostly be found in works incorporating self-testing results into protocols for delegated quantum computing, such as Refs. [43-45].
Another downside of verification, in general, is that the measurement process destroys the quantum resource and the conclusion is made about the resource, which is fully consumed. This hinders the possibility to use the resource for other protocols, without invoking further assumptions. A certification protocol, as we define it, represents a way out: a fragment of copies is measured and a conclusion is drawn about the remaining copies. Protocols of this kind have been used for entanglement certification [46], quantum state verification [47], authentication of teleportation [48], quantum contract signing [49], and single-copy fidelity certification [21,22].
In this paper, we develop a systematic way to build practical DI quantum state verification protocols starting from self-testing results. Our verification protocol achieves optimal sample efficiency by taking into account finite-statistics effects and is free of the IID assumption. With our methods, self-testing tools find direct application even in cryptographic protocols, where earlier they were deemed inappropriate because of the IID assumption they invoke. Based on these results, we design a DI certification protocol in which a certificate about the unmeasured copies can be guaranteed based on the DI verification procedure performed on the measured copies. This is done by joining insights from quantum state verification methods with those native to self-testing, and building a general protocol for sample-efficient DI quantum state verification and certification.

II. PRELIMINARIES
In this section, we clarify the terminology and notation and define rigorously the verification and certification tasks, which will be explained in the next sections. Furthermore, we introduce appropriate figures of merit.

A. Verification VS certification
Typically, the terms verification and certification are used interchangeably in the literature [5]. Here, we distinguish these two terms and associate them with two different tasks. Both tasks refer to a situation in which a specification of a certain quantum property is confirmed or refuted.
Quantum verification refers to the task in which a certain property of a quantum system is checked, i.e., verified. For example, a source emits N copies, and the aim is to confirm or refute that the states of the copies are to some degree close to the reference state, by performing appropriate measurements on all of them [Fig. 1(a)]. Importantly, all the copies produced by the source are consumed during the verification process.
Quantum certification refers to the scenario in which a subset of the emitted copies is measured, but a certificate of some property of the rest, e.g., proximity to the target state, can still be invoked. If the certification is successful, there is a certificate referring to the state of the unmeasured copies [Fig. 1(b)].

B. Figures of merit
Verification. In order to quantify the quality of the verification and certification procedures, one needs to choose an appropriate figure of merit. The scenario we consider is the following: a source produces N independent copies of a quantum system, S = {σ_1, ..., σ_N}, where σ_i is the state of the ith copy. For now, we abandon the assumption that the copies are identical but still assume that they are independently distributed. The generalization to the full non-IID case will follow by the end of the next section. In general, our goal is to verify and certify whether the source produces copies close to some target state |ψ〉. Formally, the aim is to infer, with some confidence level 1 − δ, that the average state fidelity of the states from S with |ψ〉,
\[ F_{\mathrm{av}}(S,\psi) = \frac{1}{N}\sum_{i=1}^{N}\langle\psi|\sigma_i|\psi\rangle, \tag{1} \]
is larger than some value 1 − η, with η ∈ (0, 1). The closeness of the average state fidelity ensures that any (local) measurement on the emitted copies will give statistics close to the target state statistics. To see this, let Π be an arbitrary measurement operator, ψ = |ψ〉〈ψ|, and ||·, ·||_1 the trace distance. The following chain of inequalities holds:
\[ \left|\frac{1}{N}\sum_{i=1}^{N}\mathrm{Tr}[\Pi\sigma_i]-\mathrm{Tr}[\Pi\psi]\right| \le \frac{1}{N}\sum_{i=1}^{N}\|\sigma_i,\psi\|_1 \le \frac{1}{N}\sum_{i=1}^{N}\sqrt{1-\langle\psi|\sigma_i|\psi\rangle} \le \sqrt{1-F_{\mathrm{av}}(S,\psi)}. \]
The first inequality is a property of the trace distance, while the second follows from the relation between trace distance and fidelity [50]. The last one is obtained by using Jensen's inequality. From these bounds, we see that the average mean is close to the target mean value for an arbitrary measurement. Consequently, we know from the Chernoff bound for independent variables that the sample mean concentrates around the average mean [51,52], which in turn implies that local measurements on the sample will give approximately the same statistics as the target state.
However, in a device-independent scenario, one cannot hope to verify fidelity with some particular state. This is due to the fact that the device-independent scenario cannot detect potential local isometries, so a pair of states related by a local isometry can perform equally well in all device-independent tasks. More details about this are given in the next section. The figure of merit has to be the fidelity optimized over all local isometries applied to one of the states. In the device-independent literature, this quantity is called extractability [53,54]. Formally, it is defined as follows. Let H_j be a Hilbert space native to the jth emitted state and let D(H_j) be the set of all density operators on it. We also define the ancillary Hilbert space H_a, which is the same as the Hilbert space native to the target state, and the set of density operators on it, D(H_a). Consider the state the source emits in the jth round, σ_j ∈ D(H_j), an arbitrary local isometry Φ : D(H_j) → D(H_j ⊗ H_a), and a target state ψ ∈ D(H_a). Then the extractability of ψ from σ_j [55] is given by
\[ \Xi(\sigma_j,\psi) = \max_{\Phi}\,\langle\psi|\,\mathrm{Tr}_{\mathcal{H}_j}[\Phi(\sigma_j)]\,|\psi\rangle . \]
For a sequence of states S = {σ_1, ..., σ_N} we define the average extractability of ψ from S as
\[ \bar{\Xi}(S,\psi) = \frac{1}{N}\sum_{j=1}^{N}\Xi(\sigma_j,\psi). \]
For all practical purposes, the extractability is the DI equivalent of fidelity, as in a device-dependent scenario it simply reduces to fidelity. As we commented above, if some state has high fidelity with a given pure target state, performing an arbitrary measurement on these two states results in approximately the same statistics. High extractability of the target state from the physical state means that there is an isometry bringing the physical state close to the target state. This isometry has to be taken into account when comparing the measurement statistics, i.e., the statistics obtained by applying an arbitrary measurement to the target state will be approximately the same as the statistics obtained when the image of that same measurement under the inverse isometry is applied to the physical state [56].
Figure 1: Comparison of quantum state verification and quantum state certification. The crucial difference is that the former provides a conclusion only after completely consuming the resource, while the latter provides a certificate about the unmeasured resource. For untrusted measurements and the black-box scenario we speak of DI quantum state verification and certification. (a) Protocol for quantum state verification. In quantum state verification all available copies are measured and the sample is verified to be (or not) close to the target state. (b) Protocol for quantum state certification. In quantum state certification a part of the sample is measured, and if verified, the rest of the sample is certified to be (or not) close to the target state.
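The claim that a high average fidelity forces the measurement statistics to stay close to those of the target state can be checked numerically in the trusted-measurement setting. The following is a minimal sketch, not part of the protocol: the Bell-state target, the noise model, and the helper names `random_density_matrix` and `random_effect` are illustrative assumptions. It compares the deviation of the averaged statistics with the bound sqrt(1 − F_av) for one randomly drawn measurement operator.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_density_matrix(dim, rng):
    """Random full-rank density matrix (Ginibre construction)."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def random_effect(dim, rng):
    """Random measurement operator Pi with spectrum rescaled into [0, 1]."""
    h = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    h = (h + h.conj().T) / 2
    evals, evecs = np.linalg.eigh(h)
    evals = (evals - evals.min()) / (evals.max() - evals.min())
    return (evecs * evals) @ evecs.conj().T

dim, n_copies = 4, 50
psi = np.zeros(dim, dtype=complex)
psi[0] = psi[3] = 1 / np.sqrt(2)       # assumed target: two-qubit Bell state
proj = np.outer(psi, psi.conj())

# Independent but non-identical copies: target mixed with random noise.
copies = []
for _ in range(n_copies):
    eps = rng.uniform(0.0, 0.2)        # hypothetical noise strength per copy
    copies.append((1 - eps) * proj + eps * random_density_matrix(dim, rng))

fidelities = np.array([np.real(psi.conj() @ s @ psi) for s in copies])
f_av = fidelities.mean()               # average fidelity of the sequence

pi_op = random_effect(dim, rng)
avg_stat = np.mean([np.real(np.trace(pi_op @ s)) for s in copies])
target_stat = np.real(np.trace(pi_op @ proj))

deviation = abs(avg_stat - target_stat)
bound = np.sqrt(1 - f_av)
print(f"deviation = {deviation:.4f}, sqrt(1 - F_av) = {bound:.4f}")
```

In the device-independent setting the same comparison would involve the extractability and the isometry discussed above, which this sketch does not attempt to optimize over.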
Certification. In the certification process, the source produces a set of independent states S = {σ_1, ..., σ_N}. A subset of N_1 ≈ µN copies is randomly chosen to be measured, where µ is the probability for one copy to be measured. Let us denote the set of labels of the measured copies by I and the corresponding set of states by S_v = {σ_i | i ∈ I}. The remaining N_2 = N − N_1 copies, labeled by J, are preserved as the certificate S_c = {σ_j | j ∈ J}. We perform the verification task on S_v, and if the test passes we certify (with a certain confidence) that S_c has an average fidelity with the target state,
\[ \bar{F}(S_c,\psi) = \frac{1}{N_2}\sum_{j\in J}\langle\psi|\sigma_j|\psi\rangle, \]
higher than some value 1 − η. To adapt this figure of merit to the device-independent scenario, we optimize it over all local isometries and obtain the average extractability of the target state from the remaining copies:
\[ \bar{\Xi}(S_c,\psi) = \frac{1}{N_2}\sum_{j\in J}\Xi(\sigma_j,\psi). \]
So, device-independent certification aims to show that if the test passes, the average extractability of the target state from the unmeasured copies must be higher than some value with a given confidence level.
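The random partition of the emitted copies into a measured set I and a preserved set J can be sketched as follows. This is only an illustration of the sampling step: the function name `split_copies` and the use of a seeded pseudorandom generator are assumptions, and in an adversarial setting the randomness would have to come from a trusted source.

```python
import random

def split_copies(n_copies, mu, seed=None):
    """Assign each copy label to the measured set I with probability mu,
    otherwise to the preserved (certificate) set J."""
    rng = random.Random(seed)
    measured, kept = [], []
    for label in range(n_copies):
        (measured if rng.random() < mu else kept).append(label)
    return measured, kept

measured_labels, kept_labels = split_copies(n_copies=1000, mu=0.3, seed=1)
print(len(measured_labels), len(kept_labels))  # N_1 is close to mu*N, N_2 = N - N_1
```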
Non-IID case. Up to now, we assumed that the copies produced by the source are mutually independent, i.e., there are no correlations among them. To indicate how correlations can hinder the verification process, let us consider a source that over N rounds produces the full state 0.5 (|ψ〉〈ψ|)^{⊗N} + 0.5 χ^{⊗N}, where |ψ〉 is the target state and χ is some separable state with very low fidelity with ψ. Such a source is not reliable, as it has a 50% chance to produce a "bad" state in all rounds. In an actual experiment the source would emit either the sequence {|ψ〉, ..., |ψ〉} or the sequence {χ, ..., χ}. The first one would pass all verification tests, and based on such a test one might wrongly conclude that the source is good. To circumvent this behavior of non-IID sources, one way out is to verify not the correct functionality of the source, but the sequence generated in the specific experiment. To formalize the latter, one introduces conditional states, which correctly capture the above example. Of course, one may potentially formalize non-IIDness in a different way. Unfortunately, there is not much literature on this topic, and this is left for future consideration.
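The effect of such correlations on the pass probability can be made concrete with a short calculation. The sketch below assumes a hypothetical per-round pass probability `q_bad` for the state χ (a value not specified in the text) and compares the correlated mixture source with a source that draws the same mixture independently in every round.

```python
def pass_all_probability(n_rounds, q_bad, weight_good=0.5):
    """Correlated source: with probability weight_good every round carries the
    target state (which always passes), otherwise every round carries chi,
    which passes each round with probability q_bad."""
    return weight_good * 1.0 + (1 - weight_good) * q_bad ** n_rounds

def independent_pass_all_probability(n_rounds, q_bad, weight_good=0.5):
    """Independent source: the same mixture is drawn afresh in every round."""
    per_round = weight_good + (1 - weight_good) * q_bad
    return per_round ** n_rounds

correlated = pass_all_probability(n_rounds=100, q_bad=0.5)
independent = independent_pass_all_probability(n_rounds=100, q_bad=0.5)
print(correlated, independent)
```

For the correlated source the probability of passing all rounds stays near 0.5 however large N is, whereas for the independent source it decays exponentially. This is why a statement about the source itself cannot be inferred from a passed test, while a statement about the sequence actually measured can.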
In this work, we adhere to the idea of conditional fidelity, introduced in Ref. [37]. With the aid of the techniques introduced there, we can also consider the verification process in the case when the full state σ^N produced by the source is correlated, or even entangled. In that case, in every round k we consider the conditional state σ̃_k, i.e., the state of the kth copy conditioned on the measurements and outcomes of the preceding rounds, where M_{o_k|i_k} denotes the measurement performed on the kth copy. Thus, the sequence of states S̃ = {σ̃_1, ..., σ̃_N} consists of mutually independent copies, and it represents the sequence of states actually measured in this particular experiment of N runs. The conditional states trivially encompass the IID and independent-copies scenarios (by definition). Clearly, they also correctly capture the example provided above, i.e., the test fails if the "bad" state χ^{⊗N} is produced, while the test passes if the sequence of target states is generated. The figure of merit in this case is the average conditional fidelity
\[ \tilde{F}(\tilde{S},\psi) = \frac{1}{N}\sum_{k=1}^{N}\langle\psi|\tilde{\sigma}_k|\psi\rangle \]
and its device-independent counterpart, the average conditional extractability
\[ \bar{\Xi}(\tilde{S},\psi) = \frac{1}{N}\sum_{k=1}^{N}\Xi(\tilde{\sigma}_k,\psi). \]
Hence, the non-IID verification process at the end returns the average extractability of the target state |ψ〉 from the sequence of conditional states S̃.
In the following sections we design the first general protocol for DI quantum state verification, both in the scenario where different experimental rounds are independent but not identical, and in the full non-IID scenario. As far as we know, the only rigorous treatment of non-IID and DI quantum state verification was presented in Ref. [37] for the particular case of a maximally entangled pair of qubits. Our method is general and can be used to verify any quantum state that can be self-tested. Concerning DI quantum state certification, we propose the first protocol in the scenario with independent experimental rounds, and comment on the conceptual difficulties in designing a full non-IID protocol.

III. A FRAMEWORK FOR DEVICE-INDEPENDENT QUANTUM STATE VERIFICATION
In this section, we build the protocol for non-IID quantum state verification. We start by discussing two main ingredients for the protocol, which are quantum state verification, Sec. III A, and self-testing Sec. III B. Then we show how they can be used to construct a DI protocol for quantum state verification and present a few case studies.

A. Quantum state verification
Independent copies. In this section, we recall the framework for quantum state verification [20-22,57]. The aim is to verify that an uncharacterized source is producing the target state σ = |ψ〉〈ψ| by using only local measurements. Here, the source is assumed to produce a sequence of mutually independent copies S = {σ_1, ..., σ_N}.
The verification procedure consists of performing fully characterized measurements on the copies produced by the source. The measurement strategy Ω thus consists of l different binary local measurements {M_{o|i}}, where measurement settings are labeled by i ∈ {1, ..., l} and outputs by o ∈ {0, 1}. In the jth round, a label i is randomly sampled and the corresponding measurement is applied to the state σ_j. We say that the state σ_j has passed the round if it returned the output o = 1. Otherwise, we say it failed. The first time a round fails, the process is aborted. The measurements are chosen such that |ψ〉〈ψ| is the only state with 〈ψ|M_{1|i}|ψ〉 = 1 for all i = 1, ..., l. In other words, only the target state has a 100% chance to pass all the rounds. If all N rounds are passed, we verify with a certain confidence level 1 − δ that the average fidelity F_av(S, ψ), defined in Eq. (1), must be higher than some value 1 − η.
The analysis is based on finding the second-largest eigenvalue of the strategy operator Ω = Σ_i p_i M_{1|i}, where p usually corresponds to a uniform distribution. The largest eigenvalue of Ω is 1 and the corresponding eigenvector is |ψ〉. Of all states ρ with 〈ψ|ρ|ψ〉 = 1 − η, the highest average probability to pass a randomly chosen round is attained by the state (1 − η)|ψ〉〈ψ| + η|ψ′〉〈ψ′|, where |ψ′〉 is the eigenvector of Ω associated with the second-largest eigenvalue 1 − ν(Ω) [here ν(Ω) is defined as the gap between the largest and the second-largest eigenvalue of Ω], resulting in a probability of 1 − ην(Ω) to pass the round in the best case. Thus, for N rounds, the best strategy is achieved when all states have exactly the same fidelity with |ψ〉 [21], resulting in an overall success probability [1 − ην(Ω)]^N. Our aim is to confirm that the average fidelity of the states from S with |ψ〉 is larger than 1 − η. Hence, to verify the target state with confidence level 1 − δ, one needs [1 − ην(Ω)]^N ≤ δ, i.e., a number of copies
\[ N \ge \frac{\ln \delta^{-1}}{\ln\{[1-\eta\nu(\Omega)]^{-1}\}} \approx \frac{1}{\nu(\Omega)}\,\frac{1}{\eta}\,\ln\frac{1}{\delta}. \tag{10} \]
Note that the efficiency of the procedure depends on the spectral gap of the strategy operator Ω. In theory, the largest gap is equal to 1, in which case the strategy consists of a single measurement {|ψ〉〈ψ|, 1 − |ψ〉〈ψ|}, for which one gets the optimal scaling N = O(1/η). However, such a measurement is entangled whenever |ψ〉 is, and for the sake of experimental simplicity, one of the requirements of the protocol is that the strategy involves only local measurements. Surprisingly, Eq. (10) shows that in certain cases we can achieve the same optimal scaling for local strategies (up to a constant factor).
One can generalize the procedure to nonperfect strategies characterized by a strategy operator Ω such that Tr[Ωψ] < 1 and obtain quadratically worse scaling [57]. In that case, the procedure does not halt the first time a round is failed. After N rounds, the lower bound on the average fidelity is estimated based on the frequency of the successful rounds. This is particularly important for creating DI verification protocols, as for some states there are no self-tests that can be phrased as tasks passed by the target state with maximum probability. In this case, we get a quadratically worse scaling of the number of copies, O(1/η^2), which we call suboptimal scaling, but which is still efficient in terms of resources.
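To illustrate how the spectral gap enters the sample complexity, the following sketch builds a strategy operator for a concrete example and evaluates the number of copies from the condition [1 − ην(Ω)]^N ≤ δ. The choice of target (the two-qubit Bell state (|00〉 + |11〉)/√2), the three Pauli tests XX = +1, ZZ = +1, YY = −1 with uniform weights, and the helper names are assumptions made for illustration and are not taken from the text above.

```python
import math
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pass_projector(pauli, sign):
    """Projector onto the eigenspace of pauli (x) pauli with eigenvalue sign."""
    return (np.eye(4) + sign * np.kron(pauli, pauli)) / 2

# Assumed local strategy: each test is chosen with probability 1/3.
omega = (pass_projector(X, +1) + pass_projector(Z, +1) + pass_projector(Y, -1)) / 3

eigenvalues = np.sort(np.linalg.eigvalsh(omega))[::-1]
spectral_gap = eigenvalues[0] - eigenvalues[1]      # nu(Omega)

def copies_needed(eta, delta, nu):
    """Smallest integer N with (1 - eta*nu)**N <= delta."""
    return math.ceil(math.log(delta) / math.log(1 - eta * nu))

eta, delta = 0.01, 0.05                             # illustrative targets
n_local = copies_needed(eta, delta, spectral_gap)   # local Pauli strategy
n_ideal = copies_needed(eta, delta, 1.0)            # entangled projector, gap 1
print(f"gap = {spectral_gap:.3f}, N_local = {n_local}, N_ideal = {n_ideal}")
```

For this example the local strategy costs only a constant factor 1/ν(Ω) more copies than the entangled projective measurement, which is the sense in which local strategies can retain the O(1/η) scaling.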
Non-IID case. Until now, adapting the quantum verification procedure to a non-IID scenario has not received a lot of attention. A recent work [21] presented a method for building verification protocols adapted to such a scenario, starting from an already existing protocol for independent copies [22]. In that work, the authors construct a non-IID protocol in which measuring N − 1 copies allows a fidelity certificate for the remaining unmeasured copy to be proved. This procedure is discussed in more detail in Sec. IV. In our work, we implement a DI scenario and go beyond one-copy certification, with the goal of certifying an asymptotically large number of copies, O(N).

B. Self-testing
Self-testing is the only known device-independent verification procedure [35,36]. It aims to verify that a source is producing a certain target state |ψ〉. The state actually produced by the source is called the physical state, and the applied measurements are termed physical measurements. The key aspect of self-testing is device independence: local measurements are neither characterized nor trusted. All devices are treated as black boxes. Different parties, sharing a multipartite state σ, are spatially separated and noncommunicating. Each party queries their box with a classical input i, denoting the measurement choice, and the boxes return a classical output o. Usually, the sources are assumed to be IID. Therefore, by repeating the measurement process for many rounds, the parties can estimate the probabilities p(o_1, o_2, ..., o_n | i_1, i_2, ..., i_n) of obtaining the set of outputs o_1, o_2, ..., o_n when the set of inputs is i_1, i_2, ..., i_n. This set of probabilities is usually termed, simply, correlations. Assuming the correctness of quantum mechanics, the correlations are related to the physical state and measurements through the Born rule:
\[ p(o_1,\dots,o_n|i_1,\dots,i_n) = \mathrm{Tr}\big[\big(M_{o_1|i_1}\otimes\cdots\otimes M_{o_n|i_n}\big)\,\sigma\big]. \tag{11} \]
As all information is drawn only from the correlation probabilities, it is impossible to verify the exact form of the physical state. Indeed, certain transformations, such as simultaneous local rotations of the state and measurements, or embedding in Hilbert spaces of larger dimension, leave the observed probabilities unchanged. These transformations are captured by the notion of local isometries, and the best one can hope for is to verify that a local isometry maps the physical state to the target one. If such a local isometry exists, one says that the physical and the target state are equivalent.
Rigorously written, the states σ ∈ D(H) and ψ ∈ D(H′) are equivalent if there exists a local isometry Φ : D(H) → D(H ⊗ H′) such that
\[ \mathrm{Tr}_{\mathcal{H}}[\Phi(\sigma)] = \psi . \]
In some cases, if the observed correlations are the same as the correlations obtained from the target state, one can infer the equivalence between the physical and the target state. In general, the self-testing statement does not need to specify the form of the isometry, but in most cases the proof is based on a specific isometry.
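The statement that simultaneous local rotations of the state and the measurements leave the observed correlations unchanged can be verified directly from the Born rule. The sketch below is an illustration for two parties with two binary qubit measurements each; the chosen state, measurement angles, and helper names are assumptions, not part of any specific self-test.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(dim, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(g)
    return q

def projective_measurement(theta):
    """Two-outcome qubit measurement along angle theta in the X-Z plane."""
    v0 = np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)
    v1 = np.array([-np.sin(theta / 2), np.cos(theta / 2)], dtype=complex)
    return [np.outer(v0, v0.conj()), np.outer(v1, v1.conj())]

def correlations(state, meas_a, meas_b):
    """Born rule: p[o1, o2, i1, i2] = Tr[(M_{o1|i1} (x) M_{o2|i2}) state]."""
    p = np.zeros((2, 2, len(meas_a), len(meas_b)))
    for i1, ma in enumerate(meas_a):
        for i2, mb in enumerate(meas_b):
            for o1 in range(2):
                for o2 in range(2):
                    op = np.kron(ma[o1], mb[o2])
                    p[o1, o2, i1, i2] = np.real(np.trace(op @ state))
    return p

phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
state = np.outer(phi, phi.conj())
meas_a = [projective_measurement(t) for t in (0.0, np.pi / 2)]
meas_b = [projective_measurement(t) for t in (np.pi / 4, -np.pi / 4)]

# Rotate the state and each party's measurements by the same local unitaries.
ua, ub = random_unitary(2, rng), random_unitary(2, rng)
u = np.kron(ua, ub)
state_rot = u @ state @ u.conj().T
meas_a_rot = [[ua @ m @ ua.conj().T for m in setting] for setting in meas_a]
meas_b_rot = [[ub @ m @ ub.conj().T for m in setting] for setting in meas_b]

p_original = correlations(state, meas_a, meas_b)
p_rotated = correlations(state_rot, meas_a_rot, meas_b_rot)
print(np.allclose(p_original, p_rotated))
```

Since both descriptions produce identical correlations, no black-box test can distinguish them, which is why self-testing statements are formulated up to local isometries.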
One of the bigger challenges on the way towards constructing a self-testing protocol is to find candidate correlations. Natural candidates are the correlations reaching the quantum bound of some Bell inequality:
\[ \sum_{i_1,\dots,i_n}\;\sum_{o_1,\dots,o_n} \beta^{i_1,\dots,i_n}_{o_1,\dots,o_n}\, p(o_1,\dots,o_n|i_1,\dots,i_n) \le b_L . \]
The local bound b_L is the limit achievable by correlations compatible with local hidden variable models. The quantum bound b_Q is the maximal violation achievable by correlations compatible with quantum theory, i.e., those for which there exist measurement operators {M_{o_j|i_j}}_{j=1}^{n} and a density matrix σ satisfying Eq. (11). In many cases, the quantum bound can be achieved only by using a specific state (up to local isometries), named the target state, which is the basis of self-testing.
The simplest, and probably the most widely used in self-testing, are the correlations that maximally violate the Clauser-Horne-Shimony-Holt (CHSH) inequality [58]. It has been proven that two parties can maximally violate the CHSH inequality only if they apply anticommuting measurements on a maximally entangled state of two qubits. Thus, the CHSH inequality can be used to self-test the maximally entangled pair of qubits.
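As a quick numerical illustration, the maximal CHSH value 2√2 can be reproduced with anticommuting qubit observables on a maximally entangled state. The specific observables below (Z and X for one party, their rotated combinations for the other) are one standard choice and are an assumption of this sketch.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=float)
Z = np.array([[1, 0], [0, -1]], dtype=float)

# Observables of the two parties; A0 and A1 anticommute, as do B0 and B1.
A0, A1 = Z, X
B0, B1 = (Z + X) / np.sqrt(2), (Z - X) / np.sqrt(2)

# CHSH operator: A0 B0 + A0 B1 + A1 B0 - A1 B1.
chsh = np.kron(A0, B0) + np.kron(A0, B1) + np.kron(A1, B0) - np.kron(A1, B1)

phi = np.array([1, 0, 0, 1]) / np.sqrt(2)      # (|00> + |11>)/sqrt(2)
value_on_bell_state = phi @ chsh @ phi
largest_eigenvalue = np.linalg.eigvalsh(chsh).max()
anticommutator_norm = np.linalg.norm(A0 @ A1 + A1 @ A0)

print(value_on_bell_state, largest_eigenvalue, 2 * np.sqrt(2))
```

The value on the Bell state coincides with the largest eigenvalue of the CHSH operator for these settings, both equal to 2√2, above the local bound of 2.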
Let us now concentrate on quantum bounds that are also algebraic bounds, i.e., bounds constrained only by the positivity of the observed probabilities. Notably, for a fixed set of inputs i = (i_1, i_2, ..., i_n),
\[ \sum_{o_1,\dots,o_n} \beta^{i_1,\dots,i_n}_{o_1,\dots,o_n}\, p(o_1,\dots,o_n|i_1,\dots,i_n) \le \max_{o_1,\dots,o_n} \beta^{i_1,\dots,i_n}_{o_1,\dots,o_n} \equiv \beta_i . \]
If the quantum bound of the inequality is equal to the algebraic bound, the probabilities p(o_1, o_2, ..., o_n | i_1, i_2, ..., i_n) associated with coefficients β^{i_1,...,i_n}_{o_1,...,o_n} = β_i have to sum to 1. Probabilities associated with coefficients β^{i_1,...,i_n}_{o_1,...,o_n} < β_i are equal to 0. This holds for all sets of inputs. An example of a Bell inequality whose quantum bound is equal to the algebraic one and which self-tests the underlying state is the Mermin inequality [59]. All states maximally violating it have to be equivalent, up to local isometries, to the Greenberger-Horne-Zeilinger (GHZ) state |ψ_GHZ〉 = (|000〉 + |111〉)/√2.
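The coincidence of the quantum and algebraic bounds for the Mermin inequality can be checked explicitly on the GHZ state. The sketch below uses one common form of the Mermin expression, 〈XXX〉 − 〈XYY〉 − 〈YXY〉 − 〈YYX〉, with Pauli X and Y observables; the labeling of settings and the sign convention are assumptions and may differ from the form used elsewhere in the text.

```python
from functools import reduce
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
paulis = {"X": X, "Y": Y}

def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    return reduce(np.kron, ops)

ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)               # (|000> + |111>)/sqrt(2)

# Assumed Mermin expression: sign attached to each three-party correlator.
terms = {("X", "X", "X"): +1, ("X", "Y", "Y"): -1,
         ("Y", "X", "Y"): -1, ("Y", "Y", "X"): -1}

correlators = {}
for labels, sign in terms.items():
    op = kron_all([paulis[label] for label in labels])
    correlators[labels] = np.real(ghz.conj() @ op @ ghz)

mermin_value = sum(sign * correlators[labels] for labels, sign in terms.items())
print(correlators, mermin_value)
```

Every correlator equals +1 or −1 exactly, so for each set of inputs the product of the three outputs is deterministic. This is the property that lets the inequality be read as a game that the target state wins with certainty.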
A very important aspect of a self-testing procedure is its robustness. Let us consider a self-test based on the maximal violation of a certain Bell inequality. The maximal violation is very difficult to reach due to the unavoidable presence of noise and imperfections. It is desirable that one can establish a lower bound on the extractability of the target state from the physical state in the case when maximal violation is not exactly reached.
A self-test is robust if observing a violation b < b_Q allows putting a lower bound on the extractability of the target state from the physical one. The extractability Ξ(σ, ψ) of the target state ψ from any state σ achieving the Bell violation b_Q − ε is at least 1 − f(ε), where f is a function that depends on the characteristics of the given self-test. As shown in Ref. [54], a lower bound on the extractability can always be represented as a linear function of the Bell violation, i.e., without loss of generality we take f(ε) = c̃ε, where c̃ is a constant. Furthermore, if the isometry Φ used in the self-test is optimal, we say that the self-test is tight in terms of robustness. Such is the self-test of the GHZ state based on the violation of the Mermin inequality [54].

C. Device-independent quantum state verification
The previous two sections gave us all the ingredients for constructing a protocol for DI quantum state verification. For clarity, we start with the case where the target state for the self-test achieves the algebraic bound of the corresponding Bell inequality, and we discuss the general case afterwards. The verification can be performed in two different setups, i.e., in the case of independently distributed copies and the case of non-IID source. Both scenarios aim to construct a DI verification procedure characterized by the optimal sample efficiency.
Independently distributed copies. As the procedure is DI, the aim is to verify whether the average extractability Ξ̄(S, ψ) of the target state |ψ〉 from the set of states S = {σ_1, σ_2, ..., σ_N} is above some value or not. In the language of hypothesis testing, our hypothesis is the following: the average extractability Ξ̄(S, ψ) of the target state from S is higher than a given value, and we aim to test it. The first step is to view the Bell inequality as a procedure in which only the states equivalent to the target state can pass all rounds, i.e., as a nonlocal game. For the criteria and procedure to convert a Bell inequality into a nonlocal game, see Ref. [60]. For the set of inputs i = (i_1, i_2, ..., i_n), let us denote the outputs corresponding to β_i = max_{o_1,...,o_n} {β^{i_1,...,i_n}_{o_1,...,o_n}} as the correct outputs, while all the others are denoted as wrong outputs. As only a state equivalent to the target state is able to maximally violate the Bell inequality, such a state alone can return the correct outputs for every set of inputs. This already provides us with the contours of a DI verification procedure. The first time the boxes return a set of wrong outputs, the procedure can be halted, since either the underlying state is not equivalent to the target state or the measurements performed are not those allowing construction of a successful verification procedure.
For the second step, we need to estimate the number of copies necessary to exceed a certain bound on the average extractability. For that purpose, let us look into the robustness bounds. Conversely, the robustness statement (see the previous section) says that a state σ for which there exists an isometry Φ such that F[Φ(σ), |ψ〉] = 1 − η, and thus Ξ(σ, ψ) ≤ 1 − η, can at best achieve the Bell violation b_Q − η/c̃, where c̃ is a constant characterizing the linear dependence between extractability and the Bell violation. Furthermore, we translate the self-test procedure into a binary task, the so-called nonlocal game, in which local input settings correspond to binary questions queried at random. The performance is characterized by the percentage of correct answers, i.e., the winning probability. Thus, a Bell violation b_Q − η/c̃ directly translates to a probability of success 1 − η/(b_Q c̃) = 1 − ηc, where we set c = 1/(b_Q c̃), a constant that we use frequently throughout the text.
Note that, in the case of a nonalgebraic bound, i.e., a Bell inequality with a gap between the quantum and algebraic bounds, the maximal probability of success is p_QM < 1 (p_QM denotes the quantum bound), but the relation remains linear. This establishes a direct relation between extractability and probability of success in a self-test. Once this is known, it is easy to estimate the number of copies necessary to pass the verification test.
After a more general description, it is instructive to look at a specific case. The simplest example is the tripartite qubit GHZ state and self-testing based on Mermin's inequality [59]. Any violation of the Mermin inequality allows a nontrivial conclusion to be made about the extractability of the GHZ state from the underlying state. In particular, a state σ_j achieving the violation 4 − ε_j has extractability with GHZ higher than 1 − ε_j/c, where c = 2 − √2. This bound is proven to be tight [54]. Therefore, we translate the self-test procedure into the nonlocal game and characterize it using the probability of success. For example, if in a round we ask one of the global questions [set of inputs (0, 0, 1), (0, 1, 0), (1, 0, 0), or (1, 1, 1)] and get the correct outputs, the achieved score in the round is p_j = 1; otherwise p_j = 0. The final score is P = Σ_{j=1}^{N} p_j / N.
Since the self-test is tight, we can conclude that if the extractability of the GHZ state from the state σ_j is not greater than 1 − η_j, the highest violation of the Mermin inequality the state σ_j can achieve is 4 − η_j/c̃. It further means that the state has probability of success p_{η_j} = 1 − η_j/(4c̃) = 1 − cη_j.
The probability that the sequence of states S passes all rounds is bounded as

$$\prod_{j=1}^{N} (1 - c\eta_j) \le (1 - c\eta)^N, \qquad (16)$$

where $1 - \eta = 1 - \sum_{j=1}^{N} \eta_j/N$ is the average extractability $\bar{\Xi}(S, \psi)$. Relation (16) is a consequence of the inequality of arithmetic and geometric means:

$$\Big[\prod_{j=1}^{N} (1 - c\eta_j)\Big]^{1/N} \le \frac{1}{N}\sum_{j=1}^{N} (1 - c\eta_j) = 1 - c\eta.$$

The bound is saturated when $\eta_j = \eta$ for all $j = 1, \dots, N$. For the fixed average extractability $1 - \eta$, the optimal performance is achieved when the GHZ state has the same extractability $1 - \eta$ from all the states belonging to the sequence S. In this case, the probability that a sequence of states, characterized by the average extractability of the target state lower than $1 - \eta$, gives a correct answer in all N rounds is $\delta \le (1 - c\eta)^N$. In other words, the confidence level $1 - \delta$ that the average extractability $\bar{\Xi}(S, \psi)$ is larger than $1 - \eta$ has a lower bound $1 - (1 - c\eta)^N$. From here we can estimate the number of copies:

$$N \ge \frac{\ln \delta}{\ln(1 - c\eta)} \approx \frac{\ln \delta^{-1}}{c\eta}. \qquad (18)$$

Importantly, this is asymptotically the best sample efficiency one can achieve (up to a constant factor).
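As a rough numerical illustration of the case in which all rounds are passed, the sketch below assumes the bound discussed above, namely that a sequence with average extractability below 1 − η passes all N rounds with probability at most (1 − cη)^N, and solves (1 − cη)^N ≤ δ for N. The function names and the value c = 0.3 are placeholders chosen for illustration; they are not taken from the text.

```python
import math


def copies_all_pass(eta: float, delta: float, c: float) -> int:
    """Smallest integer N such that (1 - c*eta)**N <= delta.

    eta   : allowed deficit in average extractability (certify > 1 - eta)
    delta : allowed failure probability (confidence level is 1 - delta)
    c     : self-test constant relating extractability to success
            probability (hypothetical value supplied by the caller)
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if not 0.0 < c * eta < 1.0:
        raise ValueError("c * eta must lie in (0, 1)")
    return math.ceil(math.log(delta) / math.log(1.0 - c * eta))


def confidence_all_pass(n: int, eta: float, c: float) -> float:
    """Lower bound 1 - (1 - c*eta)**n on the confidence level."""
    return 1.0 - (1.0 - c * eta) ** n


if __name__ == "__main__":
    eta, delta, c = 0.1, 0.05, 0.3  # illustrative numbers only
    n = copies_all_pass(eta, delta, c)
    print("copies needed:", n)
    print("confidence reached:", confidence_all_pass(n, eta, c))
    # First-order estimate ln(1/delta) / (c*eta) for comparison
    print("first-order estimate:", math.log(1.0 / delta) / (c * eta))
```

Because −ln(1 − x) ≥ x, the exact count never exceeds the first-order estimate ln δ⁻¹/(cη) rounded up, which is why the latter is a safe approximation for small cη.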
So far we have discussed the case in which all the copies pass the test, which is unrealistic in real experiments. For that reason, we now adapt the procedure to a more realistic scenario in which some number of failed rounds is allowed before halting the protocol. Furthermore, in what follows we do not impose that the target state has to be self-testable through a Bell inequality whose quantum bound is equal to the algebraic one. The general protocol for DI quantum state verification can be summarized as follows.
1) Fix the lower bound $1 - \eta$ on the average extractability that we want to certify. This implies the lower bound on the average success probability of the whole sample $p_{QM} - \varepsilon_2$, i.e., $p_{QM} - c\eta$.
3) Fix the number of samples N according to the lower bound given in Eq. (23).
4) Run the protocol: measure all the available copies according to a procedure corresponding to a self-test for the corresponding target state.
5) If the success rate $P \ge p_1$, the protocol is successful and the average extractability of the measured sequence of states is $\bar{\Xi} \ge 1 - \eta$ with confidence level $1 - \delta$. Otherwise, the protocol is inconclusive.
We start the explanation of the protocol with the observation that a self-testing robustness bound provides the relation between the success probability and extractability. A state $\sigma_j$ providing extractability $1 - \eta_j$ to the target state has maximal achievable success probability $p_{\eta_j} = p_{QM} - c\eta_j$, where $p_{QM}$ is the success probability of the target state. A given sequence of independent copies $S = \{\sigma_1, \dots, \sigma_N\}$ can be characterized with the average extractability $\bar{\Xi}(S, \psi)$, which corresponds to the upper bound on the average success probability $\bar{p} \overset{\mathrm{def}}{=} \frac{1}{N}\sum_{j=1}^{N} p_{\eta_j}$. We say that the verification test is successful if the success rate satisfies $P \ge p_{QM} - \varepsilon_1 \equiv p_1$. The aim is to show that the probability for a sequence of states characterized with average extractability $\bar{\Xi} \le 1 - \eta$ to pass the test decreases exponentially with the number of copies. In order to do so, we prove that this holds under a stronger assumption, i.e., if the success probability satisfies $\bar{p} \le p_{QM} - \varepsilon_2$ (with $\varepsilon_2 = c\eta$).
The relation between the two is as follows. If we denote the set of states satisfying $\bar{\Xi}(S, \psi) \le 1 - \eta$ with $\Sigma_\eta$, and the set of states satisfying $\bar{p} \le p_{QM} - c\eta$ with $\Pi_\eta$, then on the level of sets we have $\Sigma_\eta \subseteq \Pi_\eta$. Our target quantity is the probability

$$\max_{S \in \Sigma_\eta} p[P \ge p_1 \mid S]. \qquad (19)$$

Similarly, we define

$$p_{\max} = \max_{S \in \Pi_\eta} p[P \ge p_1 \mid S]. \qquad (20)$$

Following the set relations one directly reads

$$\max_{S \in \Sigma_\eta} p[P \ge p_1 \mid S] \le p_{\max}. \qquad (21)$$

Further on, we derive the upper bound on $p_{\max}$ related to the probability of success, which will then automatically imply, by Eq. (21), the desired bound on extractability. This probability can be bounded with the help of the Chernoff-like bound (for details, see Appendix A):

$$p_{\max} \le e^{-N D(p_1 \| p_2)}, \qquad (22)$$

where $p_2 = p_{QM} - \varepsilon_2$ and $D(a\|b) = a \log(a/b) + (1 - a)\log[(1 - a)/(1 - b)]$ is the Kullback-Leibler (KL) divergence. From here we can estimate the number of copies needed to verify the lower bound of the extractability,

$$N \ge \frac{\ln \delta^{-1}}{D(p_1 \| p_2)}, \qquad (23)$$

where $1 - \delta$ is the required confidence level. Note that, if the test is passed, the condition $\bar{p} > p_{QM} - \varepsilon_2$ directly translates to the extractability bound, i.e., extractability must be greater than $1 - \varepsilon_2/c$. Now, we see how the KL divergence gives two completely different scalings in two regimes. Let us consider the case when the quantum bound is equal to the algebraic bound, $p_{QM} = 1$, and make a Taylor expansion of Eq. (23). In this case, we obtain that the number of copies N scales like $O(\ln \delta^{-1}/c\eta)$, which is the optimal scaling. On the other hand, if we work with a nonalgebraic bound, $p_{QM} < 1$, the Taylor expansion of Eq. (23) gives $N = O(\ln \delta^{-1}/c^2\eta^2)$. All details of the expansions can be found in Appendix A.
Finally, with this we can formulate our first result.

Result 1: DI quantum state verification.
The entangled state ψ can be device-independently verified if there exists a robust self-test for it, based on the set of correlations, which can be phrased as a nonlocal game. The verification test achieves optimal sample efficiency of $O(\ln \delta^{-1}/c\eta)$ if the state saturates the algebraic bound of a corresponding Bell inequality. For nonalgebraic bound self-tests, the state is verified with suboptimal sample efficiency of $O(\ln \delta^{-1}/c^2\eta^2)$.
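The two scalings can be seen numerically with a short sketch. It assumes that the required number of copies is ln δ⁻¹ divided by the KL divergence D(p₁‖p₂) between the pass threshold p₁ = p_QM − ε₁ and the assumed bound p₂ = p_QM − ε₂, as described above. The function names, the choice ε₁ = ε₂/2, and the value p_QM = 0.85 for the nonalgebraic case are illustrative assumptions only.

```python
import math


def kl_bernoulli(a: float, b: float) -> float:
    """D(a||b) for two Bernoulli distributions, natural logarithm.

    Terms of the form 0 * log(0) are taken to be zero.
    """
    if not 0.0 <= a <= 1.0 or not 0.0 < b < 1.0:
        raise ValueError("need 0 <= a <= 1 and 0 < b < 1")
    d = 0.0
    if a > 0.0:
        d += a * math.log(a / b)
    if a < 1.0:
        d += (1.0 - a) * math.log((1.0 - a) / (1.0 - b))
    return d


def copies_verification(p_qm: float, eps1: float, eps2: float,
                        delta: float) -> int:
    """Smallest N with exp(-N * D(p1||p2)) <= delta, where
    p1 = p_qm - eps1 (pass threshold) and p2 = p_qm - eps2."""
    if not 0.0 <= eps1 < eps2:
        raise ValueError("need 0 <= eps1 < eps2")
    p1, p2 = p_qm - eps1, p_qm - eps2
    return math.ceil(math.log(1.0 / delta) / kl_bernoulli(p1, p2))


if __name__ == "__main__":
    delta = 0.01
    for eps2 in (0.02, 0.01):
        # Algebraic bound reached: p_QM = 1, no failed rounds tolerated
        n_alg = copies_verification(1.0, 0.0, eps2, delta)
        # Nonalgebraic bound: hypothetical p_QM = 0.85, eps1 = eps2 / 2
        n_non = copies_verification(0.85, eps2 / 2, eps2, delta)
        print(f"eps2={eps2}: algebraic N={n_alg}, nonalgebraic N={n_non}")
```

Halving ε₂ roughly doubles N in the first case and roughly quadruples it in the second, in line with the linear and quadratic dependence on 1/(cη) stated in Result 1.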
In comparison with the device-dependent case, see Eq. (10) and comments below, the obtained results for the device-independent case differ only by a constant factor, both for optimal and suboptimal scaling.
In Fig. 2(a) we compare the sample efficiency of device-dependent and device-independent quantum state verification of the tripartite GHZ state. The DI protocol is based on the performance in Mermin's GHZ game. As a device-dependent verification protocol we use the optimal protocol proposed in Ref. [61]. The scaling of the device-dependent version of quantum state verification is better only by a constant factor. This scaling is obtained for the protocol not tolerating any failed rounds, as in the standard quantum state verification. The theoretical plot for stabilization of the confidence level for different average fidelities $1 - \eta$ in a more realistic scenario is given in Fig. 2(b). In particular, we assume that a fraction 0.95 of the copies successfully passed the experimental runs; therefore the maximal estimated value for extractability is approximately 0.90 (obtained using the linear dependence between fidelity and the average probability of success).
Non-IID source. The previous discussion assumes that the source produces independent copies. Here we adopt the strategy used in the context of the experimental self-testing protocol [37] to adapt our verification protocol to the non-IID case. As indicated in Sec. II, the idea is to verify not the correct functioning of the source, but the average extractability of the target state from the states produced by the source in a given sequence. Let us return to the example from Sec. II: a source "flips a coin" and, depending on the outcome, decides whether to send a sequence of target states |ψ〉 or a sequence of useless separable states χ, such that F(χ, ψ) = 0.5. The full state the source produces is $\rho = (|\psi\rangle\langle\psi|^{\otimes n} + \chi^{\otimes n})/2$. Obviously, the source is not good, as the extractability cannot exceed 0.75. However, with probability 50% the source sends the sequence of target states, which can pass the test in all rounds. If this happens, we can indeed verify that the actually produced sequence is good. For practical purposes, this is a perfectly plausible task: verification of the sequence of states one had access to, rather than verification of the source itself.
To formalize this, let us introduce the conditioning. Conditional extractability of the target state from the state available in round j is defined in the following way: where $\sigma^{[j]}$ is the full state over j rounds and $M_{o_k|i_k}$ is the measurement operator used in the k-th round. Average conditional extractability for the sequence of states S, denoted $\bar{\Xi}_{\mathrm{cond}}$, is defined as the mean of the conditional extractabilities over all N rounds. If the states used in different rounds are correlated, as in the example given above, the score in round j can be viewed as conditional, as it might depend on all the previous scores. Of course, the rounds are performed sequentially in order to get adequate conditional probabilities. Such a conditional score in round j is denoted with $p_{j|\mathrm{past}}$, and importantly all conditional scores are mutually independent by construction: $p_{j+1, j|\mathrm{past}} = p_{j+1|\mathrm{past}}\, p_{j|\mathrm{past}}$. The final score is the average of all the conditional scores, $\bar{p} = \frac{1}{N}\sum_{j=1}^{N} p_{j|\mathrm{past}}$. This score can be used to estimate the number of rounds needed to state with a certain confidence level that the average conditional extractability of the target state from all states is above some limit. With the introduced figures of merit, everything we said in the previous section can be translated to the scenario with a non-IID source, if the probability to pass a random round $p_j$ is exchanged with the conditional probability $p_{j|\mathrm{past}}$ and the average extractability is exchanged with the average conditional extractability. As all conditional probabilities are independent, in the derivation of the final result given by Eq. (23), in Eq. (19) we use $\bar{\Xi}_{\mathrm{cond}}$ instead of $\bar{\Xi}$, and in Eq. (20) we use the modified definition of $\bar{p}$.

IV. A FRAMEWORK FOR DEVICE-INDEPENDENT QUANTUM STATE CERTIFICATION
In practice, there is a strong interest to verify the functionality of an uncharacterized source before using it in some information-processing protocol. The DI quantum state verification protocol we introduced above is not applicable in such a scenario for two reasons. Firstly, as we explained, it is not the functionality of the source that is verified, but the average extractability of the target state from the produced copies. Secondly, all the copies are consumed in the process of verification, and they cannot be reused.
In this section, we address this issue and construct a protocol for DI certification: from the copies emitted by the source, some fraction is randomly chosen to be measured, the same way they would be measured in the DI quantum state verification, while all the remaining copies are preserved to be used in some other protocol of interest. Importantly, we show that the performance of the measured copies in the quantum state verification test allows us to say something about the average extractability of the target state from the remaining copies. The task of certifying one copy based on the previous N measurements has been discussed in the fully non-IID scenario in Refs. [21,22]. While the protocol works in a full adversarial scenario, it can be used to certify just a single (out of N) copy. In this work, we rather consider the problem of certifying a large number of copies in a DI manner. We develop an efficient scheme for the case of independent copies, while the full non-IID solutions still remain for future investigations. As in the scenario for the DI quantum state verification, the underlying self-testing procedure can be seen as a nonlocal game, in which for every set of inputs, the collective set of outputs is either correct or wrong. Hence, every measurement round is either passed, if the set of outputs is correct, or failed otherwise.

Figure 2: Examples of DI quantum state verification. (a) The verification of the tripartite GHZ state. We compare the number of state copies N needed for quantum state verification in the device-independent, Eq. (18), versus device-dependent scenario for a fixed confidence level [61]. The scaling of the device-dependent version of verification is better only by a constant factor, which in the particular example has the value $2(2 + \sqrt{2})/3$. This is the type of test where we assume that all rounds are successfully passed (i.e., $p_1 = 1$ in our protocol 1). (b) Verification of the tripartite GHZ state in a realistic scenario. For different values of η, where $1 - \eta$ is the average extractability, we compare the growth of the confidence level as a function of the number of copies N of the prepared quantum state under the assumption that on average 95% of experimental runs are successful (i.e., the overall success rate P is greater than $p_1 = 0.95$).
Independent copies. As a starting point for this section, let us state the protocol for device-independent quantum state certification, free of IID assumption.

1) Fix the lower bound $1 - \eta_c$ on the certificate average extractability we want to certify. This implies the lower bound $1 - \eta$ on the average extractability of the whole sample, and consequently on the average success probability of the whole sample, $p_{QM} - \varepsilon_2$.
3) Fix the number of samples N according to the lower bound given in Eq. (27).
4) Run the protocol: randomly choose $N_1 \approx \mu N$ instances of the given quantum system and measure them as in the DI quantum state verification protocol.
5) The number of successful rounds is $q_1$. If $q_1/N_1 \ge p_1$, the protocol is successful and the average extractability of the certificate can be found with confidence level $1 - \delta$. Otherwise, the protocol is inconclusive.
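The sampling and decision steps of the protocol can be mimicked with a small classical simulation. The sketch below is only a toy model: each copy is represented by a hypothetical per-round success probability, a µ-biased coin decides whether it is measured, and the verdict follows the rule q₁/N₁ ≥ p₁. The function name, the returned dictionary, and all numerical values are assumptions made for illustration.

```python
import random


def simulate_certification(success_probs, mu, p1, seed=0):
    """Toy simulation of the measure-or-keep step.

    success_probs : per-copy probability of winning a round (hypothetical)
    mu            : probability that a given copy is measured
    p1            : success-rate threshold for the measured copies
    """
    rng = random.Random(seed)  # seeded for reproducibility
    measured, kept, q1 = [], [], 0
    for j, p in enumerate(success_probs):
        if rng.random() < mu:            # mu-biased coin: measure this copy
            measured.append(j)
            q1 += int(rng.random() < p)  # round passed with probability p
        else:                            # copy is preserved as certificate
            kept.append(j)
    n1 = len(measured)
    rate = q1 / n1 if n1 else None
    verdict = "successful" if n1 and rate >= p1 else "inconclusive"
    return {"measured": measured, "kept": kept, "q1": q1,
            "rate": rate, "verdict": verdict}


if __name__ == "__main__":
    # 2000 independent, non-identical copies with slightly varying quality
    quality = random.Random(1)
    probs = [1.0 - 0.02 * quality.random() for _ in range(2000)]
    result = simulate_certification(probs, mu=0.5, p1=0.98, seed=2)
    print(result["verdict"], result["rate"], len(result["kept"]))
```

The unmeasured indices in `kept` play the role of the certificate; the simulation says nothing about extractability itself, which requires the self-testing bound.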
Assume that the source has produced N independent copies $S = \{\sigma_1, \dots, \sigma_N\}$. For each copy we randomly choose whether it will be measured or not, for example by flipping a µ-biased coin. In the end, a fraction $N_1 \approx \mu N$ of them is chosen for the verification process, and we denote the set of their indices with I. The set of indices labeling the remaining $N_2 = N - N_1$ copies is denoted with J. The $N_1$ chosen copies undergo the same procedure we describe in the DI quantum state verification. In each round, the boxes provide a correct ($p_j = 1$) or wrong ($p_j = 0$) answer, hence one can regard the measurement process as revealing the output of a dichotomic random variable. The measurement phase is characterized by the number of times the boxes provided a correct answer, which we denote with $q_1$. Thus the overall verification success is $P = \sum_{j \in I} p_j/N_1 = q_1/N_1$. The certification is said to be successful if $P \ge p_{QM} - \varepsilon_1 \equiv p_1$.
Here $p_{QM}$ denotes the quantum bound achieved for the target state. The aim is to show that if the average extractability $\bar{\Xi}$ is upper bounded by $1 - \eta$, i.e., the success probability is upper bounded by $p_{QM} - \varepsilon_2 \equiv p_2$, the verification task can be passed only with exponentially small probability. Note that, in order to derive exponential tail bounds on the certification probability, we assume $\varepsilon_2 > \varepsilon_1$. Namely, having N copies at our disposal, the probability to achieve success, i.e., to have $P \ge p_1$, given average extractability $\bar{\Xi} \le 1 - \eta$, i.e., $\bar{p} \le p_2$ [argumentation on the level of sets is the same as in the quantum state verification section, see discussion around Eq. (21)], obeys the following Chernoff-like bound (for the proof, see Appendix A):

$$p[P \ge p_1 \mid \bar{p} \le p_2] \le \big[\mu e^{-D(p_1 \| p_2)} + 1 - \mu\big]^N.$$

From here, we can estimate the number of copies

$$N \ge \frac{\ln \delta}{\ln\big[\mu e^{-D(p_1 \| p_2)} + 1 - \mu\big]}, \qquad (27)$$

where $1 - \delta$ is the confidence level. Now we derive lower bounds on the success probability of the remaining copies with the confidence level $1 - \delta$. The success probability of the whole sample is

$$\bar{p} = \frac{N_1}{N}\, p_v + \frac{N - N_1}{N}\, p_c,$$

where $p_v = \sum_{j \in I} p_{\eta_j}/N_1$ is the success probability of the measured states and $p_c = \sum_{j \in J} p_{\eta_j}/(N - N_1)$ the success probability of the remaining ones. Given that trivially $p_v \le p_{QM}$, the expression $\bar{p} \ge p_2$ implies

$$p_c \ge p_{QM} - \frac{N}{N - N_1}\,\varepsilon_2 \approx p_{QM} - \frac{\varepsilon_2}{1 - \mu}.$$

Due to the existence of the self-testing robust bounds (e.g., $p_{\eta_c} = p_{QM} - c\eta_c$), we can conclude that the remaining, unmeasured copies have the average extractability $\bar{\Xi}_c$ higher than $1 - \eta_c$, with confidence level $1 - \delta$, under the certification condition demanding that a randomly chosen sequence of measured quantum systems successfully passed $q_1$ rounds in the verification part.
Again, we discuss the case of the algebraic and nonalgebraic bound separately. In the case of $p_{QM} = 1$, the Taylor expansion of Eq. (27) gives the optimal sample efficiency $N = O\big(\frac{\ln \delta^{-1}}{\mu(1 - \mu)c\eta_c}\big)$. On the other hand, for a nonalgebraic bound, i.e., $p_{QM} < 1$, one obtains sample efficiency $N = O\big(\frac{\ln \delta^{-1}}{\mu(1 - \mu)^2 c^2 \eta_c^2}\big)$. Details of the expansions are presented in Appendix A.
Result 2: DI quantum state certification. The entangled state |ψ〉 can be device-independently certified if there exists a robust self-test for it, based on the Bell inequality, which can be phrased as a nonlocal game. The sample efficiency is $N = O(\ln \delta^{-1}/c\eta_c)$ in the case of a Bell inequality whose quantum bound is also the algebraic bound and $N = O(\ln \delta^{-1}/c^2\eta_c^2)$ in the case of a nonequivalence between the quantum and the algebraic bound.
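To make the role of the measured fraction µ concrete, the following sketch evaluates the number of copies and the resulting certificate bound. It rests on two assumptions that should be checked against Appendix A: that the per-copy factor in the Chernoff-like bound is µ·exp[−D(p₁‖p₂)] + 1 − µ, and that the unmeasured copies satisfy p_c ≥ p_QM − ε₂/(1 − µ), so that their average extractability deficit is at most ε₂/[c(1 − µ)]. Function names and numbers are illustrative.

```python
import math


def kl_bernoulli(a: float, b: float) -> float:
    """D(a||b) for Bernoulli distributions (natural log, 0*log 0 := 0)."""
    if not 0.0 <= a <= 1.0 or not 0.0 < b < 1.0:
        raise ValueError("need 0 <= a <= 1 and 0 < b < 1")
    d = 0.0
    if a > 0.0:
        d += a * math.log(a / b)
    if a < 1.0:
        d += (1.0 - a) * math.log((1.0 - a) / (1.0 - b))
    return d


def copies_certification(p_qm, eps1, eps2, mu, delta):
    """Smallest N with (mu*exp(-D(p1||p2)) + 1 - mu)**N <= delta.

    Assumed form of the bound; mu = 1 reduces to plain verification.
    """
    if not 0.0 < mu <= 1.0 or not 0.0 <= eps1 < eps2:
        raise ValueError("need 0 < mu <= 1 and 0 <= eps1 < eps2")
    d = kl_bernoulli(p_qm - eps1, p_qm - eps2)
    per_copy = mu * math.exp(-d) + 1.0 - mu
    return math.ceil(math.log(delta) / math.log(per_copy))


def certificate_eta_bound(eps2, mu, c):
    """Upper bound eps2 / (c * (1 - mu)) on the certificate's average
    extractability deficit eta_c (assumes p_c >= p_qm - eps2/(1 - mu))."""
    if not 0.0 < mu < 1.0:
        raise ValueError("need 0 < mu < 1 to keep a certificate")
    return eps2 / (c * (1.0 - mu))


if __name__ == "__main__":
    # Illustrative numbers: algebraic bound, half of the copies measured
    n = copies_certification(p_qm=1.0, eps1=0.0, eps2=0.02, mu=0.5, delta=0.01)
    print("copies:", n)
    print("eta_c bound:", certificate_eta_bound(eps2=0.02, mu=0.5, c=0.25))
```

Measuring a smaller fraction leaves more copies for later use but increases the total N needed for the same confidence, while a larger µ weakens the bound on the few copies that remain.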
We give a concrete example of certification of a tripartite GHZ state through a Mermin test. In our numerical simulations, we take µ = 1/2 and $N_1 \approx N/2$ and we set the overall verification success to 0.98. We provide an estimation of the average extractability $1 - \eta_c$ of the GHZ state from the remaining copies $S_c$. Figure 3 illustrates how the confidence level grows with the total size of the sample (number of copies) if we want to certify that the average extractability of the remaining copies is greater than some value. By making this number lower, i.e., having a smaller success rate in the verification phase, the size of the sample for the reliable certification grows and the bound on the average extractability of the GHZ state from the certificate decreases for the same confidence level.

Figure 3: The average extractability of the GHZ state from the certificate in the case of a tripartite GHZ state. The verification phase is done using a Mermin test. Confidence level for certifying that the average extractability of the GHZ state from the remaining copies $S_c$ is greater than $1 - \eta_c$, depending on the total number of copies N. Here µ = 1/2 and the success rate in the verification phase is set to 0.98.

V. DISCUSSION
In this paper, we developed a protocol for sample-efficient device-independent quantum state verification and certification. The procedures merge well-explored protocols for quantum state verification and self-testing. The biggest room for improvement is the verification procedure in the fully non-IID certification scenario. In the verification scenario we tackled the problem by introducing the conditional fidelities, but it remains an open question whether some nontrivial statements can be made about the full state produced by the source. We managed to adapt the procedure to the certification scenario, where not all copies are consumed through verification, in cases where different copies are not identical but are independent from each other. While trying to build a fully non-IID DI certification protocol, we encountered technical difficulties whenever we aimed to certify the average extractability for more than one remaining copy. In the situation where all copies but one are measured, a fully non-IID DI protocol could be constructed by using the approach of conditional extractability. The aim is to build such a protocol in the near future. For future research, we are left to explore the possibility of building a certification protocol in a strictly sequential scenario, where resorting to permutational invariance is not allowed. We hope that this might be possible with the help of entropy accumulation theorems [46,62]. Let us also note that our methods could be used for DI verification of resources other than states, for example, quantum gates, by merging quantum verification [63] and self-testing methods [64]. Finally, a trivial consequence of our work is sample-efficient nonlocality detection. While our work focused on the certification of quantum states, if one aims simply to detect nonlocality, it is possible to do so with very few copies of the state, as was similarly done for entanglement in Refs. [9,10]. Similar work has been done in Ref. [40], where optimal sample efficiency was achieved for detecting nonlocality. The same method was adapted for entanglement detection [65] and confidence-interval construction [66]. The method presented in Ref. [40] works with early stopping rules, such that the number of samples to be used need not be fixed in advance.

VI. ACKNOWLEDGEMENTS
We acknowledge fruitful discussions with Dragoljub Gočanin and Jean-Daniel Bancal. We thank Boris Bourdoncle and Flavio Baccari for their valuable comments about the paper. We especially thank Yunguang Han for pointing out the mistake in Fig. 2.

APPENDIX A

Let us take N copies of the quantum system produced by an untrusted source. In our protocol we assume that the copies $\sigma_1, \sigma_2, \dots, \sigma_N$ are independent, but not necessarily identically distributed. We denote with $p_{\eta_j} = p_{QM} - c\eta_j$ the maximal probability for the jth copy to pass the round, where $1 - \eta_j$ is the extractability of the given copy and c the constant that depends on the self-test. The average probability of success is given by $\bar{p} = \frac{1}{N}\sum_{j=1}^{N} p_{\eta_j}$. In order to obtain a certificate, we measure part of the sample and compare the result with the quantum bound $p_{QM}$. The quantum bound $p_{QM}$ represents the probability of success of the target quantum state |ψ〉. We set µ to be the probability for a copy to be measured. On average, the experiment consumes $N_1 \approx \mu N$ copies for verification. The remaining $N_2 = N - N_1$ copies are left as a certificate. The protocol can be summarized in the following steps:
• In each run we toss a µ-biased coin and we get a sequence of bits $\{m_1, m_2, \dots, m_N\}$. If $m_j = 1$ we measure the jth copy and obtain result $o_j$; otherwise we leave it as a (potential) certificate. The total number of measured copies is $N_1 = \sum_{j=1}^{N} m_j$.
• In order to extend our analysis to all copies, we artificially define a measurement result for both measured and unmeasured copies by $\tilde{o}_j = m_j o_j$, $j = 1, 2, \dots, N$.
• The success rate in the verification phase can be written as $P = \frac{1}{N_1}\sum_{j=1}^{N} \tilde{o}_j$.
• The test is successful if $P > p_{QM} - \varepsilon_1$.
Suppose that our sample has the average extractability $\bar{\Xi}$, which is upper bounded by $1 - \eta$; consequently, the average success probability of the whole sample $\bar{p}$ is smaller than $p_{QM} - c\eta = p_{QM} - \varepsilon_2$. Our target quantity is the upper bound on the probability to obtain a success rate of the measured part greater than $p_{QM} - \varepsilon_1$, under the assumption $\bar{\Xi} \le 1 - \eta$, i.e., $\bar{p} \le p_{QM} - \varepsilon_2$. Here $\varepsilon_1$ and $\varepsilon_2$ are free parameters lying in the interval (0, 1). In what follows, we see that a valid expression can only be derived for $\varepsilon_2 > \varepsilon_1$. For the sake of compactness, we set $p_1 = p_{QM} - \varepsilon_1$ and $p_2 = p_{QM} - \varepsilon_2$ and write the certification condition as $p[P \ge p_1 \mid \bar{p} \le p_2]$. The probability to pass the test given average success probability $\bar{p} \le p_2$ is given by $p[P \ge p_1 \mid \bar{p} \le p_2] = p\big[e^{(s_1 + \dots + s_N)t} \ge 1 \mid \bar{p} \le p_2\big]$, for t > 0. We introduce random variables $s_k = m_k o_k - m_k p_1$ with possible values $\{0, -p_1, 1 - p_1\}$ and respective probabilities $\{1 - \mu, \mu(1 - p_k), \mu p_k\}$. For example, we have $m_k = 1$ with probability µ, while $o_k = 1$ with probability $p_k$, thus the outcome $s_k = 1 - p_1$ appears with probability $\mu p_k$. Then, we apply Markov's inequality and use that the $s_k$ are independent random variables (fourth line below): The fifth line is a consequence of the relation between the arithmetic and the geometric means. The sixth line follows from $\sum_k p_k = N\bar{p}$, while the seventh line is a consequence of the condition $\bar{p} \le p_2$.
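Certification. In any other case µ < 1. Again, by fixing $\varepsilon_1 < \varepsilon_2$ and taking confidence level $1 - \delta$, we can estimate the number of copies, Eq. (27), necessary for the certification protocol. If the protocol is successful, we have a lower bound on the average success probability $\bar{p}$ and an estimate for the lower bound on the success probability of the remaining copies $p_c$, with confidence level $1 - \delta$. We have

$$\bar{p} = \mu p_v + (1 - \mu) p_c, \qquad (A5)$$

where $p_v$ is the average success probability of the measured copies. If $\bar{p} \ge p_2$, directly from Eq. (A5) and from the logical bound $p_v \le p_{QM}$ we obtain

$$p_c \ge p_{QM} - \frac{\varepsilon_2}{1 - \mu},$$

which directly translates to the average certificate extractability $\bar{\Xi}_c \ge 1 - \varepsilon_2/[c(1 - \mu)]$, with confidence level greater than $1 - \delta$. For completeness, we make a Taylor expansion of Eq. (27) and comment on the scaling. In the case of $p_{QM} = 1$, the Taylor expansion to leading order gives

$$N \approx \frac{\ln \delta^{-1}}{\mu\,[\varepsilon_2 - \varepsilon_1 - \varepsilon_1 \ln(\varepsilon_2/\varepsilon_1)]},$$

which after introducing $\varepsilon_2 \ge c\eta_c(1 - \mu)$ and $\varepsilon_1 < \varepsilon_2$ recreates the optimal scaling $N = O[\ln \delta^{-1}/\mu(1 - \mu)c\eta_c]$. Similarly, for $p_{QM} < 1$, after expansion we obtain

$$N \approx \frac{2(1 - p_{QM})\,p_{QM} \ln \delta^{-1}}{\mu(\varepsilon_2 - \varepsilon_1)^2},$$

which gives the quadratically worse scaling $N = O[\ln \delta^{-1}/\mu(1 - \mu)^2 c^2 \eta_c^2]$.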
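A Chernoff-type bound of this kind can be sanity-checked against an exact calculation in the special case of identical copies. The sketch below assumes the bound takes the form [µ·exp(−D(p₁‖p₂)) + 1 − µ]^N, which is what optimizing over t in the moment-generating function of the variables s_k defined above would give; this form, the function names, and the numbers are assumptions for illustration. The exact pass probability is computed for the event Σ_k s_k ≥ 0, i.e., q₁ ≥ p₁N₁, which includes the vacuous case N₁ = 0.

```python
import math


def kl_bernoulli(a: float, b: float) -> float:
    """D(a||b) for Bernoulli distributions (natural log, 0*log 0 := 0)."""
    d = 0.0
    if a > 0.0:
        d += a * math.log(a / b)
    if a < 1.0:
        d += (1.0 - a) * math.log((1.0 - a) / (1.0 - b))
    return d


def exact_pass_probability(n: int, mu: float, p: float, p1: float) -> float:
    """Exact probability that q1 >= p1 * N1 for n identical copies.

    N1 ~ Binomial(n, mu) copies are measured; each measured copy passes
    its round independently with probability p.
    """
    total = 0.0
    for n1 in range(n + 1):
        weight = math.comb(n, n1) * mu**n1 * (1.0 - mu) ** (n - n1)
        q_min = math.ceil(p1 * n1 - 1e-12)  # guard against float round-up
        tail = sum(math.comb(n1, q) * p**q * (1.0 - p) ** (n1 - q)
                   for q in range(q_min, n1 + 1))
        total += weight * tail
    return total


def chernoff_bound(n: int, mu: float, p1: float, p2: float) -> float:
    """Assumed bound (mu * exp(-D(p1||p2)) + 1 - mu)**n, valid for p1 > p2."""
    return (mu * math.exp(-kl_bernoulli(p1, p2)) + 1.0 - mu) ** n


if __name__ == "__main__":
    mu, p1, p2 = 0.5, 0.9, 0.7  # illustrative values with p1 > p2
    for n in (20, 60, 100):
        exact = exact_pass_probability(n, mu, p2, p1)
        bound = chernoff_bound(n, mu, p1, p2)
        print(f"N={n}: exact={exact:.3e}  bound={bound:.3e}")
```

Both quantities decay exponentially in N, with the exact probability staying below the bound; copies with success probability lower than p₂ only make passing less likely.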