Security and Composability of Randomness Expansion from Bell Inequalities

The nonlocal behavior of quantum mechanics can be used to generate guaranteed fresh randomness from an untrusted device that consists of two nonsignalling components; since the generation process requires some initial fresh randomness to act as a catalyst, one also speaks of randomness expansion. Colbeck and Kent proposed the first method for generating randomness from untrusted devices, but without providing a rigorous analysis. This was addressed subsequently by Pironio et al. [Nature 464 (2010)], who aimed at deriving a lower bound on the min-entropy of the data extracted from an untrusted device, based only on the observed non-local behavior of the device. Although that article succeeded in developing important tools towards this goal, it failed to put the tools together in a rigorous and correct way, and the given formal claim on the guaranteed amount of min-entropy needs to be revisited. In this paper we show how to combine the tools provided by Pironio et al. so as to obtain a meaningful and correct lower bound on the min-entropy of the data produced by an untrusted device, based on the observed non-local behavior of the device. Our main result confirms the essence of the improperly formulated claims of Pironio et al., and puts them on solid ground. We also address the question of composability and show that different untrusted devices can be composed in an alternating manner under the assumption that they are not entangled. This enables superpolynomial randomness expansion based on two untrusted yet unentangled devices.


Introduction
Background. One of the counter-intuitive features of quantum mechanics is its non-locality: measuring possibly far apart quantum systems in randomly selected bases (chosen out of some given class) may lead to correlations that are impossible to obtain classically. Anticipated by Einstein, Rosen and Podolsky [EPR35], it was John Bell [Bel64] who put this property on firm ground by proposing an inequality that is satisfied by any classical correlation, but is violated when the correlation is obtained from measuring entangled quantum states. Such inequalities are called Bell inequalities.
An important example of such a Bell inequality was proposed by Clauser, Horne, Shimony and Holt [CHSH69] and states that if $X$ and $Y$ are independent uniformly distributed bits, and if the bit $A$ is obtained by "processing" $X$ without knowing $Y$, and the bit $B$ is obtained by "processing" $Y$ without knowing $X$, then the probability that $A \oplus B = X \wedge Y$ is at most 75%. This bound on the probability holds if the processing is done classically with shared randomness, but can be violated when the processing involves measuring an entangled quantum state; in this latter case, a probability of roughly 85% can be achieved.
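The two numbers quoted above can be checked with a short sketch. Shared randomness only mixes deterministic strategies, so it suffices to enumerate the 16 deterministic ones; for the quantum case the sketch simply evaluates $\cos^2(\pi/8)$, the known optimal winning probability. The function names are illustrative and not taken from the paper.

```python
import itertools
import math


def classical_chsh_win_probability():
    """Best probability that a XOR b == x AND y over deterministic strategies."""
    best = 0.0
    # A deterministic strategy fixes the output bit for each possible input bit.
    for a0, a1, b0, b1 in itertools.product((0, 1), repeat=4):
        a = (a0, a1)
        b = (b0, b1)
        wins = sum((a[x] ^ b[y]) == (x & y) for x in (0, 1) for y in (0, 1))
        best = max(best, wins / 4)  # inputs are uniform, so each pair has weight 1/4
    return best


def quantum_chsh_win_probability():
    """Winning probability of the optimal quantum strategy, cos^2(pi/8)."""
    return math.cos(math.pi / 8) ** 2


print(classical_chsh_win_probability())          # 0.75
print(round(quantum_chsh_win_probability(), 4))  # 0.8536
```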
Violating a Bell inequality necessarily means that there must be some amount of fresh randomness in the outputs $A$ and $B$ (given the inputs $X$ and $Y$). More formally, consider an untrusted device $D$, prepared by an adversarial manufacturer Eve. The device consists of two components, set up by Eve, which on respective inputs $X$ and $Y$ produce respective outputs $A$ and $B$ without communicating. No matter how the two components work, as long as a given Bell inequality is violated during $n$ sequential interactions with $D$ (which can be observed by doing statistics), there must be a certain amount of uncertainty in the $n$ output pairs $(A_1, B_1), \ldots, (A_n, B_n)$, even given the $n$ input pairs $(X_1, Y_1), \ldots, (X_n, Y_n)$, and thus it should be possible to apply a randomness extractor to obtain nearly-random bits.
This kind of randomness expansion from untrusted devices was first suggested by Colbeck [Col09] and Colbeck and Kent [CK11], who presented a scheme that uses GHZ states and reaches a linear expansion, but without providing a rigorous security analysis. The main point missing in these works is a method to rigorously bound the min-entropy of a device's output. The work of Pironio et al. [PAM+10] addresses this issue: they propose a technique to numerically compute a lower bound on the min-entropy of the output pair $AB$ (conditioned on $X$ and $Y$) as a function of the Bell value of the device $D$ (which quantifies the violation of the Bell inequality). For the special case of CHSH, they also show an analytical bound.
The authors of [PAM+10] also consider the case of $n$ sequential interactions with $D$, and they show how to estimate the average Bell value of $D$ over the $n$ rounds by doing statistics over the observed data. This is non-trivial because the Bell value of $D$ may change over the different rounds, and, for each round, it may depend on the behavior of the previous rounds. In other words, the Bell value of $D$ during round $i+1$ depends on the history $(A_1, B_1, X_1, Y_1), \ldots, (A_i, B_i, X_i, Y_i)$. Combining things, Pironio et al. then claim to have a bound on the min-entropy of $(A_1, B_1), \ldots, (A_n, B_n)$, conditioned on $(X_1, Y_1), \ldots, (X_n, Y_n)$, as a function of the observed data, i.e., as a function of $(A_1, B_1, X_1, Y_1), \ldots, (A_n, B_n, X_n, Y_n)$. However, such a statement does not make sense, since the considered min-entropy is a value determined by the experiment description (which specifies the probability distribution), whereas the claimed bound depends on the specific outcome of the experiment. Furthermore, not only is the claim improperly formulated, but there is also a flaw in its derivation, which has no obvious fix. Thus, even though the necessary tools are provided in [PAM+10], they are not put together in a rigorous way that would allow one to control the min-entropy of $(A_1, B_1), \ldots, (A_n, B_n)$ produced by an untrusted device $D$.
Our Result. In this paper, we make up for this shortfall in [PAM+10]. Specifically, we show how to rigorously and correctly put together the tools provided in [PAM+10] in order to obtain a meaningful (and correct) bound on the min-entropy of $(A_1, B_1), \ldots, (A_n, B_n)$, conditioned on $(X_1, Y_1), \ldots, (X_n, Y_n)$, by means of the observed data. The trick is to consider and bound the min-entropy conditioned on the event that the estimator for the average Bell value lies in some interval. This gives us some control over the average Bell value of the device, but, as we show, still leaves enough uncertainty in the data to get a good bound on its min-entropy.
We also address the question of the composability of untrusted devices, and we show that under the assumption that different devices are not entangled, the output of one device, after privacy amplification, can be used as input for a second device, and the resulting output of the second device, after privacy amplification, can again be fed into the first device, etc. Using an extractor with a short seed for doing the privacy amplification, this allows for a superpolynomial randomness-expansion scheme using two untrusted (but guaranteed-to-be unentangled) devices.
Concurrent and Related Work. In concurrent and independent work, Vazirani and Vidick [VV11] as well as Pironio and Massar [PM11a] came up with results that overlap with ours. We briefly discuss here the similarities and the differences between our results and those of Vazirani and Vidick and of Pironio and Massar. We encourage the reader to also look at the comparisons given in [VV11, PM11a].
Vazirani and Vidick obtain a randomness-expansion scheme with superpolynomial expansion and security against quantum side information. We do not achieve security against quantum side information, and our superpolynomial randomness-expansion scheme uses two unentangled devices in an iterative way, whereas their scheme works with just one single device. On the other hand, their result is tailored to CHSH and requires an almost full violation of the Bell inequality, while our result is generic and holds for any Bell inequality, and we show that any violation leads to some amount of fresh randomness.
Pironio and Massar's results, on the other hand, are very similar to ours, and differ only in some minor details.
In a very recent preprint, Barrett, Colbeck and Kent point out the possibility of Trojan-horse attacks on device-independent randomness-expansion protocols [BCK12, Appendix]. It seems impossible to prevent Eve from programming devices (that are used multiple times) to release in later rounds information about previous outputs. We note that although such an attack seems unavoidable, in a single activation of our randomness-expansion scheme (see [FGS11, Section 5] for details), we can re-use the same devices over and over again and still prevent such a Trojan-horse attack by only releasing the output of the very last round (and aborting if things go wrong before the last round is reached).

Preliminaries
We assume the reader is familiar with quantum information processing, and we merely fix our notation and some basic concepts in this section. Throughout the paper, all logarithms are base 2.

Quantum States
The state of a quantum system $A$ is given by a density matrix $\rho_A$, i.e., a positive-semidefinite trace-1 matrix acting on some Hilbert space $\mathcal{H}_A$. We denote the set of all such matrices, acting on $\mathcal{H}_A$, by $\mathcal{D}(\mathcal{H}_A)$. The state space of a joint quantum system $AB$, which consists of two (or more) subsystems $A$ and $B$, is given by the tensor product $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$. If the state of the joint system is given by $\rho_{AB}$, then the state of the subsystem $A$ when considered as a "stand alone" system is given by the reduced density matrix $\rho_A = \mathrm{tr}_B(\rho_{AB}) \in \mathcal{D}(\mathcal{H}_A)$, obtained by tracing out system $B$.
A random variable $X$ over a finite set $\mathcal{X}$ with probability distribution $P_X$ can be represented by means of the density matrix $\rho_X = \sum_x P_X(x)\,|x\rangle\langle x| \in \mathcal{D}(\mathcal{H}_X)$, where $\{|x\rangle\}_{x \in \mathcal{X}}$ forms a basis of $\mathcal{H}_X = \mathbb{C}^{|\mathcal{X}|}$. Thus, we may view $X$ as a quantum system, and we say that its state, $\rho_X$, is classical. If the state of a quantum system $E$ depends on the random variable $X$, in that the state of $E$ is given by $\rho_E^x$ if $X = x$, then we can view the pair $XE$ as a bipartite quantum system in state $\rho_{XE} = \sum_x P_X(x)\,|x\rangle\langle x| \otimes \rho_E^x$. This naturally extends to multiple random variables and quantum systems.
The distance between two states $\rho_E, \tilde{\rho}_E \in \mathcal{D}(\mathcal{H}_E)$ is measured by their trace distance $\frac{1}{2}\|\rho_E - \tilde{\rho}_E\|_1$, where $\|\cdot\|_1$ is the $L_1$ norm. In case of classical states $\rho_X$ and $\tilde{\rho}_X$, corresponding to distributions $P_X$ and $\tilde{P}_X$, the trace distance coincides with the statistical distance $\frac{1}{2}\sum_x |P_X(x) - \tilde{P}_X(x)|$.
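For classical states, the distance is therefore half the $L_1$ distance between the two probability vectors. A minimal sketch of this computation follows; the dictionary representation and the function name are choices made for this illustration only.

```python
def statistical_distance(p, q):
    """Return (1/2) * sum_x |p(x) - q(x)| for distributions given as dicts."""
    support = set(p) | set(q)  # outcomes missing from a dict have probability 0
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


uniform_bit = {0: 0.5, 1: 0.5}
biased_bit = {0: 0.75, 1: 0.25}
print(statistical_distance(uniform_bit, biased_bit))  # 0.25
```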

Closeness to Uniform, Min-Entropy, and Extractors
In the following definitions, we consider a bipartite system $XE$ with classical $X$, given by $\rho_{XE}$. $X$ is said to be random and independent of $E$ if $\rho_{XE} = \rho_U \otimes \rho_E$, where $\rho_U$ is the fully mixed state on $\mathcal{H}_X$ (i.e., $U$ is classical and, as a random variable, uniformly distributed).
Definition 2.1. The distance to uniform of $X$ given $E$ is $d(X \mid E) := \frac{1}{2}\|\rho_{XE} - \rho_U \otimes \rho_E\|_1$.
If $\Omega$ is some event, determined by the random variable $X$, then $d(X \mid E, \Omega)$ is naturally defined by means of replacing the distribution $P_X$ by $P_{X|\Omega}$. The same applies to the next two definitions.
Definition 2.2. The guessing probability of $X$ given $E$ is $\mathrm{Guess}(X \mid E) := \sup_{\{M_x\}_x} \sum_x P_X(x)\,\mathrm{tr}(M_x \rho_E^x)$, where the supremum is over all POVMs $\{M_x\}_x$ on $\mathcal{H}_E$.
Definition 2.3. The min-entropy of $X$ given $E$ is given by $H_{\min}(X \mid E) := -\log \mathrm{Guess}(X \mid E)$. This definition was shown in [KRS09] to coincide with the definition originally introduced by Renner [Ren05], which in turn coincides with the classical definition of conditional min-entropy in the case where $E$ is classical.
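When $E$ is classical, the optimal guess is simply the most likely $x$ for each observed value $e$, so the guessing probability becomes $\sum_e \max_x P_{XE}(x, e)$. The sketch below computes this quantity and the resulting min-entropy; the representation of the joint distribution as a dictionary and the function names are assumptions of this illustration.

```python
import math


def guessing_probability(joint):
    """Guess(X|E) for classical E: sum over e of max over x of P_XE(x, e).

    `joint` maps pairs (x, e) to their joint probability.
    """
    best_per_e = {}
    for (x, e), prob in joint.items():
        best_per_e[e] = max(best_per_e.get(e, 0.0), prob)
    return sum(best_per_e.values())


def min_entropy(joint):
    """H_min(X|E) = -log2 Guess(X|E)."""
    return -math.log2(guessing_probability(joint))


# X is a uniform bit and E is an independent uniform bit: one bit of min-entropy.
independent = {(x, e): 0.25 for x in (0, 1) for e in (0, 1)}
print(min_entropy(independent))  # 1.0

# E is a perfect copy of X: X can be guessed with certainty.
copied = {(0, 0): 0.5, (1, 1): 0.5}
print(guessing_probability(copied))  # 1.0
```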
Definition 2.4. A function $\mathrm{Ext}$ is a $(k, \varepsilon)$-strong extractor, if for any bipartite quantum system $XE$ with classical $X$ and with $H_{\min}(X \mid E) \geq k$, and for a uniform and independent seed $Y$, we have $d(\mathrm{Ext}(X, Y) \mid Y E) \leq \varepsilon$.
Note that we find "extractor against quantum adversaries" too cumbersome a terminology; thus we just call Ext a (strong) extractor, even though it is a stronger notion than the standard notion of a (strong) extractor.

Bell-Inequality and CHSH
For given finite sets $\mathcal{A}, \mathcal{B}, \mathcal{X}, \mathcal{Y}$, consider a conditional probability distribution $P_{AB|XY}$, specified as follows. There exists $\rho_{AB} \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_B)$ for an arbitrary finite-dimensional bipartite quantum system $AB$, and families of measurements $\{M_x^a\}$ and $\{N_y^b\}$, indexed by $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, acting on $A$ and $B$, and with measurement outcomes $a \in \mathcal{A}$ and $b \in \mathcal{B}$, respectively, such that $P_{AB|XY}(a, b \mid x, y) = \mathrm{tr}\big((M_x^a \otimes N_y^b)\,\rho_{AB}\big)$.
$P_{AB|XY}$ is called classical (or local) if there exist (conditional) probability distributions $P_R$, $P_{A|XR}$ and $P_{B|YR}$ such that $P_{AB|XY}(a, b \mid x, y) = \sum_r P_R(r)\,P_{A|XR}(a \mid x, r)\,P_{B|YR}(b \mid y, r)$ for all $a, b, x, y$; this is equivalent to requiring that $P_{AB|XY}$ can be specified by means of a separable state $\rho_{AB}$. We let $I_0$ denote the maximal Bell value (see Definition 2.5) achievable, for a given set of Bell coefficients, with a classical $P_{AB|XY}$. We speak of a violation of the Bell inequality if there exists a quantum system resulting in a conditional probability distribution with a Bell value greater than $I_0$.
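As an illustration, the Bell value $\sum_{abxy} c_{abxy} P_{AB|XY}(a, b \mid x, y)$ of Definition 2.5 can be evaluated directly for small examples. The sketch below assumes the common CHSH normalization $c_{abxy} = (-1)^{a \oplus b \oplus (x \wedge y)}$, which the text does not spell out but which matches the values $I_0 = 2$ and $I_{\max} = 2\sqrt{2}$ used later; all function names are hypothetical.

```python
import itertools
import math

BITS = (0, 1)


def chsh_coefficient(a, b, x, y):
    """Assumed CHSH coefficients: +1 if a XOR b == x AND y, else -1."""
    return (-1) ** (a ^ b ^ (x & y))


def bell_value(behavior, coefficient=chsh_coefficient):
    """Sum of c_abxy * P(a, b | x, y); `behavior` maps (a, b, x, y) to P(a, b | x, y)."""
    return sum(coefficient(a, b, x, y) * behavior[(a, b, x, y)]
               for a, b, x, y in itertools.product(BITS, repeat=4))


def constant_zero_behavior():
    """Classical behavior: both components always output 0."""
    return {(a, b, x, y): 1.0 if (a, b) == (0, 0) else 0.0
            for a, b, x, y in itertools.product(BITS, repeat=4)}


def optimal_quantum_behavior():
    """Correlations achieving the maximal quantum CHSH value 2*sqrt(2)."""
    return {(a, b, x, y): (1 + chsh_coefficient(a, b, x, y) / math.sqrt(2)) / 4
            for a, b, x, y in itertools.product(BITS, repeat=4)}


print(bell_value(constant_zero_behavior()))              # 2.0
print(round(bell_value(optimal_quantum_behavior()), 4))  # 2.8284
```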

Fresh Randomness from Untrusted Devices
In this section, we recall (some of) the findings of [PAM+10], and also discuss and fix a subtle issue that was neglected there. Throughout this and the upcoming sections, we consider fixed finite sets $\mathcal{A}, \mathcal{B}, \mathcal{X}, \mathcal{Y}$, and a fixed set $C = \{c_{abxy}\}$ of Bell coefficients. The reader may think of CHSH, but our results hold generally.

A Single Interaction
We consider an untrusted device $D$, prepared by an adversary Eve. As discussed in the introduction, $D$ consists of two components, which, on respective inputs $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, produce respective outputs $a \in \mathcal{A}$ and $b \in \mathcal{B}$ without communicating. Formally, $D$'s behavior is given by an unknown conditional probability distribution $P_{AB|XY}$, which is specified by an unknown quantum state $\rho_{AB} \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_B)$ of unknown dimension, and unknown families of measurements $\{M_x^a\}$ and $\{N_y^b\}$, acting on the respective systems $A$ and $B$. We are interested in the guaranteed amount of uncertainty in $A$ and $B$ (conditioned on $X$ and $Y$), under the promise that $P_{AB|XY}$ has some given Bell value, greater than $I_0$. This motivates the following definition.
Definition 3.1. For a given set of Bell coefficients, we define $h^\circ$ to be the function $h^\circ(I) := \inf \min_{x,y} H_{\min}(AB \mid X = x, Y = y)$, where the outer infimum is over all finite dimensional Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$, all states $\rho_{AB} \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_B)$, and all families of measurements $\{M_x^a\}$ and $\{N_y^b\}$ such that the resulting conditional probability distribution $P_{AB|XY}$ has Bell value $I$. Also, we define $h$ to be the convex closure of $h^\circ$, i.e., the maximal convex function that does not exceed $h^\circ$.
Pironio et al. [PAM+10] show that by means of a hierarchy of semi-definite programs (SDPs) [NPA07, NPA08], $h^\circ(I)$ can be numerically computed up to arbitrary precision (by means of a possibly expensive computation). They also show an analytical lower bound of $1 - \log\left(1 + \sqrt{2 - I^2/4}\right)$ for $h^\circ(I)$ in the case of CHSH, which reaches 1 for $I = I_{\max} = 2\sqrt{2}$ (whereas the numerical calculation gives $h^\circ(2\sqrt{2}) \approx 1.23$), and monotonically decreases to 0 as $I$ goes down to $I_0 = 2$; see Figure 2 in [PAM+10]. Since this lower bound is convex, it is also a lower bound on $h$; we will need this later on. For now, we can conclude that if an unknown bipartite quantum system (with fixed measurements $\{M_x^a\}$ and $\{N_y^b\}$) is promised to have a CHSH value of $I = 2\sqrt{2}$, then the joint min-entropy in the measurement outcomes $A$ and $B$ is lower bounded by approximately 1.23 bits (respectively 1 bit, if one wants to rely on the analytical bound).
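The analytical CHSH bound $1 - \log(1 + \sqrt{2 - I^2/4})$ is easy to evaluate numerically. The following sketch does so for Bell values between $I_0 = 2$ and $I_{\max} = 2\sqrt{2}$; the function name and the small tolerance used to absorb floating-point rounding at the endpoints are choices made for this illustration.

```python
import math

I_CLASSICAL = 2.0                  # I_0 for CHSH
I_QUANTUM = 2.0 * math.sqrt(2.0)   # I_max for CHSH


def chsh_min_entropy_bound(bell_value, tolerance=1e-9):
    """Analytical lower bound 1 - log2(1 + sqrt(2 - I^2 / 4)) on the min-entropy."""
    if not I_CLASSICAL - tolerance <= bell_value <= I_QUANTUM + tolerance:
        raise ValueError("CHSH value must lie between 2 and 2*sqrt(2)")
    # Clamp at 0 so that rounding errors near I_max cannot produce a negative root.
    inner = max(0.0, 2.0 - bell_value ** 2 / 4.0)
    return 1.0 - math.log2(1.0 + math.sqrt(inner))


for value in (2.0, 2.3, 2.6, I_QUANTUM):
    print(f"I = {value:.4f}: at least {chsh_min_entropy_bound(value):.4f} bits")
```

The bound is 0 at $I = 2$, roughly 0.36 at $I = 2.6$, and 1 at $I = 2\sqrt{2}$, and it increases with $I$ in between.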

Sequential Repetitions
In order to get more uncertainty, and in order to be able to estimate the Bell value, we consider a sequential repetition of extracting uncertainty from an untrusted device $D$ as above. Informally, rather than interacting with $D$ once (i.e., inputting $(x, y) \in \mathcal{X} \times \mathcal{Y}$ and observing $(a, b) \in \mathcal{A} \times \mathcal{B}$), $D$ is interacted with $n$ times in sequence, by inputting $(x_1, y_1) \in \mathcal{X} \times \mathcal{Y}$ and observing $(a_1, b_1) \in \mathcal{A} \times \mathcal{B}$, inputting $(x_2, y_2) \in \mathcal{X} \times \mathcal{Y}$ and observing $(a_2, b_2) \in \mathcal{A} \times \mathcal{B}$, etc. This procedure is formalized as follows.
Modeling. We consider an arbitrary but fixed bipartite state $\rho_{AB} \in \mathcal{D}(\mathcal{H}_A \otimes \mathcal{H}_B)$ of an arbitrary finite-dimensional bipartite quantum system $AB$, and a sequence of $n$ arbitrary but fixed pairs of families of measurements $(\{M_{x_1}^{a_1}\}, \{N_{y_1}^{b_1}\}), \ldots, (\{M_{x_n}^{a_n}\}, \{N_{y_n}^{b_n}\})$. For each pair, $\{M_{x_j}^{a_j}\}$ is a family of measurements, indexed by $x_j \in \mathcal{X}$, acting on $A$, with measurement outcomes $a_j \in \mathcal{A}$, and similarly for $\{N_{y_j}^{b_j}\}$. We allow the two components of the device $D$ to communicate between the rounds; this is captured by a sequence $U_2, \ldots, U_n$ of unitary transformations acting on $\mathcal{H}_A \otimes \mathcal{H}_B$, where $U_j$ is applied to the (collapsed) state before the $j$th interaction. For $j \in \{1, \ldots, n\}$, we denote by $a^j$ the concatenation of the outputs of the first $j$ rounds, $a^j = a_1 \cdots a_j$, and similarly for $b$, $x$ and $y$. Let $A^j, B^j, X^j, Y^j$ be the corresponding random variables. To ease notation, we use bold letters as shortcuts for the concatenation of all $n$ rounds, e.g. $\mathbf{a} = a^n$, $\mathbf{A} = A^n$, etc.
Formally, the conditional probability distribution $P_{\mathbf{AB}|\mathbf{XY}}$ is defined as $P_{\mathbf{AB}|\mathbf{XY}}(\mathbf{a}, \mathbf{b} \mid \mathbf{x}, \mathbf{y}) = \prod_{j=1}^{n} \mathrm{tr}\big((M_{x_j}^{a_j} \otimes N_{y_j}^{b_j})\,\rho_{AB|\mathrm{Hist}_j = \mathrm{hist}_j}\big)$, where $\mathrm{Hist}_j = (A^{j-1}, B^{j-1}, X^{j-1}, Y^{j-1})$ and $\mathrm{hist}_j = (a^{j-1}, b^{j-1}, x^{j-1}, y^{j-1})$, and where $\rho_{AB|\mathrm{Hist}_j = \mathrm{hist}_j}$ is inductively defined for $j = 1, \ldots, n$ as follows. $\rho_{AB|\mathrm{Hist}_1 = \mathrm{hist}_1} = \rho_{AB}$, and, for $1 \leq j < n$, $\rho_{AB|\mathrm{Hist}_{j+1} = \mathrm{hist}_{j+1}}$ is the state obtained by applying $U_{j+1}$ to the state to which $\rho_{AB|\mathrm{Hist}_j = \mathrm{hist}_j}$ collapses when $A$ and $B$ are measured by $\{M_{x_j}^{a_j}\}$ and $\{N_{y_j}^{b_j}\}$, respectively, and $a_j$ and $b_j$ are observed. What is important to realize is that before every round $j$, the situation is exactly as in the previous Section 3.1, with a fixed state $\rho_{AB|\mathrm{Hist}_j = \mathrm{hist}_j}$ and fixed measurements $\{M_{x_j}^{a_j}\}$ and $\{N_{y_j}^{b_j}\}$ in the device $D$, and thus $P_{A_j B_j|X_j Y_j \mathrm{Hist}_j}(\cdot, \cdot \mid \cdot, \cdot, \mathrm{hist}_j)$ here behaves as $P_{AB|XY}$ does in Section 3.1.
We would like to point out that there is no need to make $\{M_{x_j}^{a_j}\}$ (and the same for $\{N_{y_j}^{b_j}\}$) dependent on previous in- and outputs, i.e., on $\mathrm{hist}_j$ using the above notation, because we may assume that the measurement $\{M_{x_j}^{a_j}\}$ encodes $x_j$ and $a_j$ into the post-measurement state of $A$, and that the subsequent unitary $U_{j+1}$ copies this (classical) information into the state of $B$. The subsequent measurements can then be control measurements, which perform a measurement depending on the history. Similarly, we may assume the $\{M_{x_j}^{a_j}\}$'s to be identical for different $j$'s (and the same for the $\{N_{y_j}^{b_j}\}$'s), since the quantum system $A$ may maintain a counter that is increased by every unitary $U_j$, and $\{M_{x_j}^{a_j}\}$ can then be chosen as a control measurement that is controlled by the counter (see footnote 7).
Given the conditional probability distribution $P_{\mathbf{AB}|\mathbf{XY}}$ as specified above, which describes the input-output behavior of the $n$ sequential interactions with the device $D$, once a distribution $P_{\mathbf{XY}}$ is decided upon, which specifies how the inputs $x_j$ and $y_j$ are chosen in each round, the joint probability distribution $P_{\mathbf{ABXY}}$ is determined as $P_{\mathbf{ABXY}} = P_{\mathbf{XY}} P_{\mathbf{AB}|\mathbf{XY}}$.
Estimating the Bell value.
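Once the device $D$ is given, i.e., the state $\rho_{AB}$, the measurements $(\{M_{x_1}^{a_1}\}, \{N_{y_1}^{b_1}\}), \ldots, (\{M_{x_n}^{a_n}\}, \{N_{y_n}^{b_n}\})$ and the unitaries $U_2, \ldots, U_n$ are fixed, $P_{A_1 B_1|X_1 Y_1}$ and thus the Bell value of the first round of interaction, $I_1 = I(P_{A_1 B_1|X_1 Y_1})$, is determined. For the other rounds, this is slightly more subtle. The reason is that the state $\rho_{AB|\mathrm{Hist}_2 = \mathrm{hist}_2}$ before the second round, and thus the probability distribution $P_{A_2 B_2|X_2 Y_2 \mathrm{Hist}_2}(\cdot, \cdot \mid \cdot, \cdot, \mathrm{hist}_2)$, depends on what happened in the first round, i.e., on $\mathrm{hist}_2 = (a_1, b_1, x_1, y_1)$. Thus, the Bell value of the second round, $I_2 = I(P_{A_2 B_2|X_2 Y_2 \mathrm{Hist}_2}(\cdot, \cdot \mid \cdot, \cdot, \mathrm{hist}_2))$, is a function of $\mathrm{hist}_2$. Similarly, the Bell value of the $j$-th round, $I_j = I(P_{A_j B_j|X_j Y_j \mathrm{Hist}_j}(\cdot, \cdot \mid \cdot, \cdot, \mathrm{hist}_j))$, is a function of $\mathrm{hist}_j$. Let $\bar{I} = \frac{1}{n}\sum_{j=1}^{n} I_j$ be the average Bell value, averaged over the $n$ rounds; we write $\bar{I} = \bar{I}(\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y})$ to make its dependency on $\mathbf{a}$, $\mathbf{b}$, etc. explicit.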
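Pironio et al. show in [PAM+10] that the average Bell value $\bar{I}$ can be estimated by analyzing the data collected over the $n$ rounds. Specifically, defining $\hat{I} = \hat{I}(\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}) = \frac{1}{n}\sum_{j=1}^{n}\sum_{abxy} c_{abxy}\,\frac{\chi(a_j = a, b_j = b, x_j = x, y_j = y)}{P_{XY}(x, y)}$, where $\chi(e)$ is the indicator of the event $e$ (that is, $\chi(e) = 1$ if the event $e$ occurs and 0 otherwise), the following holds.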
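Proposition 3.2 ([PAM+10]). For $\bar{I}$ and $\hat{I}$ as above, for arbitrary but iid $(\mathbf{X}, \mathbf{Y})$, meaning that $P_{\mathbf{XY}} = \prod_j P_{X_j Y_j}$ with $P_{X_j Y_j} = P_{XY}$ for all $j$, and for any $\varepsilon > 0$: $P[\bar{I} \leq \hat{I} - \varepsilon] \leq 2^{-c(p_{\min})\varepsilon^2 n}$, where $c(p_{\min})$ is a positive constant determined by $I_{\max}$, $p_{\min}$ and $c_{\max}$; here $I_{\max}$ is the maximal value of $I$ achievable by means of a quantum system, and $p_{\min} = \min_{x,y} P_{XY}(x, y)$ and $c_{\max} = \max\{c_{abxy}\}$.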
7 These observations on the independence of the measurements on the history and the round are not crucial for our proofs; they merely simplify the notation.
We would like to point out that for the bound on $P[G]$ to hold, it is crucial that $\rho_{AB}$ is independent of $(\mathbf{X}, \mathbf{Y})$ (and the $(X_i, Y_i)$'s are iid): clearly, the device can fool you if it knows the inputs it will get in advance. However, for the event $G$ as defined in the proof below, the bound on the guessing probability holds irrespective of the distribution of $\mathbf{X}$ and $\mathbf{Y}$. Indeed, the value of $\mathrm{Guess}(\mathbf{AB} \mid \mathbf{X} = \mathbf{x}, \mathbf{Y} = \mathbf{y}, L_\varepsilon = \ell, G)$ is determined by the conditional probability distribution $P_{\mathbf{AB}|\mathbf{XY}}(\cdot, \cdot \mid \mathbf{x}, \mathbf{y})$ alone (which is determined by $\rho_{AB}$, the families of measurements and the unitaries); this holds because $L_\varepsilon$ as well as $G$ (this, we will see below) are uniquely determined by $\mathbf{A}$, $\mathbf{B}$, $\mathbf{X}$ and $\mathbf{Y}$.
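Proof. Let $B_{\mathrm{guess}}$ be the bad event $\bar{I}(\mathbf{A}, \mathbf{B}, \mathbf{X}, \mathbf{Y}) \leq \hat{I}(\mathbf{A}, \mathbf{B}, \mathbf{X}, \mathbf{Y}) - \varepsilon$ that the estimated Bell value $\hat{I}$ is significantly larger than the average Bell value $\bar{I}$, and let $G_{\mathrm{guess}}$ be its complement (which we understand as a good event); by Proposition 3.2, we know that $P[B_{\mathrm{guess}}] \leq 2^{-c(p_{\min})\varepsilon^2 n}$. We define $B_1$ to be the set of all "bad inputs" $(\mathbf{x}, \mathbf{y})$ with the property that [...]. Finally, we define $B_2$ to be the set of all $(\mathbf{x}, \mathbf{y}, \ell)$ with the property that [...]. It follows from the definition of [...]. We slightly abuse notation and identify the set $B_1$ with the bad event $(\mathbf{X}, \mathbf{Y}) \in B_1$, and we write $G_1$ for its complementary good event, and correspondingly for $B_2$ and $G_2$. We now define the good event $G$ as $G := G_{\mathrm{guess}} \wedge G_1 \wedge G_2$. Using the union bound over the bad events, it is not too hard to show that [...].
It remains to argue the bound on the min-entropy. Let $\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}$ be such that $\bar{I}(\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}) > \hat{I}(\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}) - \varepsilon$, i.e., they have positive probability conditioned on the good event $G_{\mathrm{guess}}$. Furthermore, let $\ell$ be the unique value with $\hat{I}(\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}) - \varepsilon \in \Omega_\ell$. If $(\mathbf{x}, \mathbf{y}) \notin B_1$, then $P[G_{\mathrm{guess}} \mid \mathbf{X} = \mathbf{x}, \mathbf{Y} = \mathbf{y}] \geq \frac{1}{2}$ and hence conditioning on the event $G_{\mathrm{guess}}$ can increase the probabilities by at most a factor of 2. For those $(\mathbf{x}, \mathbf{y}) \notin B_1$, it then follows from (3) that [...]. If additionally we have $(\mathbf{x}, \mathbf{y}, \ell) \notin B_2$, then [...]. Note that additionally conditioning on $G_1$ and $G_2$ does not change the above conditional probability distribution if $(\mathbf{x}, \mathbf{y}) \notin B_1$ and $(\mathbf{x}, \mathbf{y}, \ell) \notin B_2$. Thus, the same bound also applies to $P_{\mathbf{AB}|\mathbf{XY} L_\varepsilon, G}(\mathbf{a}, \mathbf{b} \mid \mathbf{x}, \mathbf{y}, \ell)$, for all $\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}$ and $\ell$ with $P_{\mathbf{ABXY} L_\varepsilon|G}(\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}, \ell) > 0$. By definition of the guessing probability and the min-entropy, this proves the claim.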
A specific example.
Consider now $n$ sequential interactions with an untrusted device $D$, where in each round $x_j$ and $y_j$ are chosen (according to $P_{X_j Y_j}$) and input into $D$, and $a_j$ and $b_j$ are obtained as output from $D$. Let us say that from the collected data, we get $\hat{I}(\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}) = 2.7 \in \Omega_3$ as estimation for the average Bell value. By Theorem 3.3, we have that given $\mathbf{x}$ and $\mathbf{y}$ and $L_\varepsilon = 3$, the min-entropy of $\mathbf{a}$ and $\mathbf{b}$ is at least $n \cdot (h(2.6) - \delta) - 1 \approx n \cdot (0.36 - \delta) > n/3$ bits, except with probability $4 \cdot 2^{-\delta n} + 3 \cdot 2^{-c(q)\varepsilon^2 n}$. Thus, when applying a suitable randomness extractor to $\mathbf{a}, \mathbf{b}$, we can extract, say, $n/4$ bits that are exponentially close to uniformly distributed (given $\mathbf{x}$ and $\mathbf{y}$ and $L_\varepsilon = 3$).
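The arithmetic of this example can be reproduced with a few lines. The sketch below lower-bounds $h$ by the analytical CHSH bound $1 - \log(1 + \sqrt{2 - I^2/4})$ (valid because that bound is convex) and evaluates $n \cdot (h(\hat{I} - \varepsilon) - \delta) - 1$. The values $n = 10000$, $\varepsilon = 0.1$ and $\delta = 0.01$ are assumptions chosen for illustration, as are the function names.

```python
import math


def chsh_entropy_rate(bell_value):
    """Analytical CHSH lower bound on h: 1 - log2(1 + sqrt(2 - I^2 / 4))."""
    inner = max(0.0, 2.0 - bell_value ** 2 / 4.0)
    return 1.0 - math.log2(1.0 + math.sqrt(inner))


def min_entropy_lower_bound(n, estimate, eps, delta):
    """Evaluate n * (h(estimate - eps) - delta) - 1 using the analytical bound for h."""
    return n * (chsh_entropy_rate(estimate - eps) - delta) - 1


n = 10_000        # number of rounds (assumed)
estimate = 2.7    # observed estimate of the average Bell value, as in the text
eps = 0.1         # assumed, so that estimate - eps = 2.6 as in the text
delta = 0.01      # assumed slack parameter

bound = min_entropy_lower_bound(n, estimate, eps, delta)
print(f"per-round rate: {chsh_entropy_rate(estimate - eps):.3f} bits")
print(f"min-entropy of the outputs: at least {bound:.1f} bits (n/3 = {n / 3:.1f})")
```

With these parameters the bound exceeds $n/3$, in line with the estimate in the text.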
In order to sample the inputs according to the biased input distribution $P_{XY}$, as suggested in [PAM+10], it is known to be sufficient (on average) to have access to $n \cdot O(q \log(1/q))$ random bits [KY76]. Since $q \log(1/q)$ converges to 0 for $q \to 0$, if $q$ is chosen to be a small enough constant, then, say, $n/4$ random bits are sufficient. Thus, by starting off with $n/4$ random bits, we obtained another $n/4$ almost-random bits and thus now hold $n/2$ random bits. Thus, we have expanded the randomness by a factor 2. Choosing $q = O(1/\sqrt{n})$, one obtains an expansion factor $O(\sqrt{n}/\log n)$ while still being negligibly close to perfect randomness (since c(1/ [...]).
Having generated fresh randomness from an untrusted device $D$, one is now tempted to use the newly obtained randomness to generate even more fresh randomness from the device $D$, and so on. This does not work. The reason is that the generated randomness is not random to the device $D$, or, more formally, not independent of the internal state of $D$; indeed, $D$ has already observed $\mathbf{x}$ and $\mathbf{y}$ and it has itself produced $\mathbf{a}$ and $\mathbf{b}$. We argue below, however, that we can use the fresh randomness to generate even more randomness from another device, as long as the devices are not entangled with each other nor with the adversary.

Classical Side Information
The case where the adversarial producer Eve of the devices holds classical side information about the device D, can be reduced to the case without side information by conditioning on particular values of the side information.

Composability
Consider two (or more) untrusted devices $D$ and $D'$, prepared by the adversary Eve. We assume that $D$ and $D'$ cannot communicate and are not entangled with each other. The case when Eve holds classical side information about the devices can be treated as described in the previous section. We can then apply Theorem 3.3 to argue that the output $\mathbf{AB}$ produced by $D$ has high min-entropy (except with small probability) given the internal state of $D'$ (because $D'$ is independent of $D$), assuming that a large enough average Bell value is observed. It thus follows that by applying an extractor (with suitable parameters and a freshly chosen seed) to $\mathbf{AB}$, we obtain a bitstring $K$ that is close to random and independent of the internal state of $D'$. This in particular implies that if we use the randomness $K$ to sample the input $\mathbf{X}'\mathbf{Y}'$ to $D'$ (according to a prescribed distribution), then $\mathbf{X}'\mathbf{Y}'$ is close to independent of the internal state of $D'$. As the dependency between the internal (quantum) state of $D$ and the in-/outputs of $D'$ is purely classical, we can condition on this classical information and apply Theorem 3.3 to argue that the output $\mathbf{A}'\mathbf{B}'$ produced by $D'$ has high min-entropy given the current internal state of $D$. Therefore, we are in the same situation as above, and so can use the randomness extracted from $\mathbf{A}'\mathbf{B}'$ to sample again inputs for $D$, and we can keep on going like this as long as a large enough Bell value is observed. We stress that the above line of reasoning only works because we assumed the devices $D$, $D'$ to be unentangled to start with. In order to see quantitatively how this procedure can lead to a superpolynomial randomness expansion, we refer to [FGS11, Section 5].

Conclusion and Open Problems
An interesting extension of our result would be to generalize Theorem 3.3 to the setting of quantum side information. This would allow a composition theorem for the more general case in which the devices can be entangled with each other and with Eve. Numerical calculations seem to suggest that the bound on the min-entropy does carry over to the quantum setting. Unfortunately, we are unable at the moment to give a rigorous proof of this claim, and we leave it as the main open question.
Definition 2.5 (Bell Value). For any set $C = \{c_{abxy}\}$ of Bell coefficients, the Bell value of $P_{AB|XY}$ (with respect to $C$) is defined as $I(P_{AB|XY}) = \sum_{abxy} c_{abxy}\,P_{AB|XY}(a, b \mid x, y)$.

Pironio et al. show in [PAM+10] that the average Bell value $\bar{I}$ can be estimated by analyzing the data collected over the $n$ rounds. Specifically, defining $\hat{I} = \hat{I}(\mathbf{a}, \mathbf{b}, \mathbf{x}, \mathbf{y}) = \frac{1}{n}\sum_{j=1}^{n}\sum_{abxy} c_{abxy}\,\frac{\chi(a_j = a, b_j = b, x_j = x, y_j = y)}{P_{XY}(x, y)}$
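Since the indicator $\chi$ selects exactly one term per round, the estimator reduces to $\hat{I} = \frac{1}{n}\sum_j c_{a_j b_j x_j y_j} / P_{XY}(x_j, y_j)$. A minimal sketch of this computation is given below; it assumes the CHSH coefficients $c_{abxy} = (-1)^{a \oplus b \oplus (x \wedge y)}$ and a uniform input distribution purely for the sake of the example, and the function names are hypothetical.

```python
def chsh_coefficient(a, b, x, y):
    """Assumed CHSH coefficients: +1 if a XOR b == x AND y, else -1."""
    return (-1) ** (a ^ b ^ (x & y))


def estimate_bell_value(rounds, input_distribution, coefficient=chsh_coefficient):
    """Average of c_{a_j b_j x_j y_j} / P_XY(x_j, y_j) over all observed rounds.

    `rounds` is a list of tuples (a_j, b_j, x_j, y_j);
    `input_distribution` maps (x, y) to P_XY(x, y).
    """
    if not rounds:
        raise ValueError("at least one round is required")
    total = sum(coefficient(a, b, x, y) / input_distribution[(x, y)]
                for a, b, x, y in rounds)
    return total / len(rounds)


uniform_inputs = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

# Toy data: a device that always outputs (0, 0), queried once on every input pair.
observed = [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1)]
print(estimate_bell_value(observed, uniform_inputs))  # 2.0
```

In this toy run the estimate equals the classical value $I_0 = 2$, as expected for a deterministic device.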