Systems of random variables and the Free Will Theorem

The title refers to the Free Will Theorem by Conway and Kochen, whose flashy formulation is: if experimenters possess free will, then so do particles. In more modest terms, the theorem says that individual pairs of spacelike separated particles cannot be described by deterministic systems provided their mixture is the same for all choices of measurement settings. We reformulate and generalize the Free Will Theorem in terms of systems of random variables, and show that the proof is based on two observations: (1) some compound systems are contextual (non-local), and (2) any deterministic system with spacelike separated components is non-signaling. The contradiction between the two is obtained by showing that a mixture of non-signaling deterministic systems, if they exist, is always noncontextual. The "experimenters' free will" (independence) assumption is not needed for the proof: it is made redundant by assumption (1) above, which is critical for the proof. We next argue that the reason why an individual pair of particles is not described by a deterministic system is more elementary than in the Free Will Theorem. A system, contextual or not and deterministic or not, includes several choices of settings, each of which can be factually used without changing the system. An individual pair of particles can only afford a single realization of random variables for a single choice of settings. With this conceptualization, the "free will of experimenters" cannot even be meaningfully formulated, and the choice between determinism and the "free will of particles" becomes arbitrary and inconsequential.

In this paper the issues related to the Free Will Theorem (FWT) [1,2] are discussed in terms of random variables. Conway and Kochen in [2,3] emphasize that their theorem does not use probabilistic notions. This seems to plunge our paper into controversy from the outset [3-5]: our analysis of the FWT would be suspect if we used conceptual means that are not acceptable in the original formulation of the theorem. This is not the case, however. We use the language of random variables to describe quantum experiments involving large numbers of replications with multitudes of particles. On the level of individual particles (more specifically, individual pairs of entangled particles), our focus, the same as in the original FWT, is exclusively on whether they can be described as systems of deterministic outcomes (which are, of course, a special case of random variables). Probabilistic description of quantum experiments is hardly controversial, and we have a good demonstration of the benefits it offers for the FWT in [6]. Moreover, it is unavoidable. It is known [5,6] that the FWT can be found in, or extracted from, much earlier work than [1,2], with the Kochen-Specker system used by Conway and Kochen being replaced with other contextual systems (in fact, any contextual system, as we will see later). The contextuality of many of these systems is more saliently probabilistic than that of the Kochen-Specker system. This prominently includes the EPR/Bohm system, with a variant of the FWT being already seen in Bell's pioneering work [7].
Our reformulations lead us to critically re-examine and modify the FWT, although not invalidate it. In particular, we show that the assumption of the free will of experimenters (or independence assumption, as many authors prefer to call it [6,9-13]) is not needed in the FWT. This assumption is only needed to ensure that experimental observations correctly identify the system experimented on as contextual. It is therefore made unnecessary by another assumption, critical for the FWT and underivable from the independence assumption: that a contextual system with certain properties exists. Furthermore, we argue that while the question of whether individual particles can be described by deterministic systems is indeed to be answered negatively, and while the FWT is indeed one way of demonstrating this, there is a more elementary reason for this negative answer: the notion of a system, deterministic or not, is not applicable to an individual pair of particles to begin with. The latter is a realization of random variables for a single choice of settings, whereas the notion of a system involves several mutually exclusive settings, each of which can be factually and repeatedly used.
Our analysis is based on the Contextuality-by-Default (CbD) theory (e.g., [14-16]), but its utilization in this paper is confined to only two basic principles. The first one is that each random variable is identified not only by the property it measures but also by the context (settings) in which it measures this property. The second principle is that no two random variables recorded in different, mutually exclusive contexts possess a joint distribution. Moreover, these principles are only applied to a special subset of systems of random variables, the compound, "Alice-Bob"-type systems with spacelike separation. These systems are non-signaling, and this makes it unnecessary for us to invoke most of the content of CbD.
Because of this the reader need not be familiar with CbD to understand this paper.
However, a brief comment may be needed on the two principles just mentioned. The double-indexation of random variables means that if Alice chooses a setting $x$ and Bob chooses a setting $y$, their measurement outcomes (random variables) are represented as, respectively, $A^{x,y}$ and $B^{x,y}$. And if Bob changes his setting to $y'$ while Alice maintains her setting, her measurement is represented by another random variable, $A^{x,y'}$. A reader might erroneously interpret this as indicating that Bob somehow influences Alice's measurements despite their spacelike separation (a "spooky action at a distance"). This is not the case. The distribution of $A^{x,y'}$ is the same as that of $A^{x,y}$, so Bob transfers no information to Alice. There is no "action." The difference between $A^{x,y}$ and $A^{x,y'}$ simply reflects the relational nature of random variables in classical probability theory. A random variable is a measurable function on a probability space, and any variable defined on the same space is jointly distributed with it: their observed realizations are paired. Therefore, if $A^{x,y}$ and $A^{x,y'}$ were the same random variable, they would be jointly distributed. But this would mean that realizations of $A^{x,y}$ and realizations of $A^{x,y'}$ co-occur (and are equal), while in reality they occur in mutually exclusive contexts.
The scheme of the paper is as follows. In the next section we introduce formal notions and definitions related to systems of random variables. In Sections III and IV we present the FWT in the language of such systems. In Section V we show that a systematic use of the language of random variables makes the FWT unnecessary (though not wrong): the experimenters' free will (independence) becomes unformulable, and the choice between determinism and the free will of particles becomes arbitrary and inconsequential. The concluding section provides a brief summary.

II. PRELIMINARIES
A compound system of random variables is an indexed set of random variables
$$\mathcal{R} = \left\{ \left(A^{x,y}, B^{x,y}\right) : (x,y) \in C \right\}, \tag{1}$$
where $x$ is a property measured by Alice, $y$ is a property measured by Bob, $(x, y)$ is the context in which the measurements are made, and $C$ is the set of all possible contexts. Every random variable is therefore identified by the property it measures and the context in which it measures it. To simplify discussion, we will assume that all random variables have a finite number of values. The properties $x, y$ are also referred to as settings, although there is an obvious semantic difference: a setting $x$ designates the decision and arrangements made to measure property $x$. Alice and Bob are always assumed to be spacelike separated. Because of this, by special relativity, the system is non-signaling: the distributions of the variables are context-independent,
$$A^{x,y} \overset{dist}{=} A^{x,y'}, \tag{2}$$
for any $x, y, y'$ such that $(x, y), (x, y') \in C$. The symbol $\overset{dist}{=}$ indicates equality of distributions. Analogously,
$$B^{x,y} \overset{dist}{=} B^{x',y},$$
for any $y, x, x'$ such that $(x, y), (x', y) \in C$.
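The non-signaling requirement just stated can be checked mechanically for any finite compound system. The following is a minimal sketch, not part of the original formalism: it assumes a system is stored as a dictionary mapping each context $(x, y)$ to the joint distribution of its AB-pair, and the names `marginal`, `same_dist`, and `is_non_signaling`, as well as the two example systems, are hypothetical illustrations.

```python
from itertools import product

# Assumption of this sketch: a compound system is a dict
#   context (x, y) -> joint distribution {(a, b): probability} of the AB-pair.

def marginal(joint, index):
    """Marginal distribution of Alice's (index 0) or Bob's (index 1) variable."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[index]] = out.get(outcome[index], 0.0) + p
    return out

def same_dist(d1, d2, tol=1e-9):
    """True if two distributions agree on every value (up to a tolerance)."""
    keys = set(d1) | set(d2)
    return all(abs(d1.get(k, 0.0) - d2.get(k, 0.0)) <= tol for k in keys)

def is_non_signaling(system, tol=1e-9):
    """Alice's distribution may depend on x only, Bob's on y only."""
    for (x, y), (x2, y2) in product(system, repeat=2):
        if x == x2 and not same_dist(marginal(system[(x, y)], 0),
                                     marginal(system[(x2, y2)], 0), tol):
            return False
        if y == y2 and not same_dist(marginal(system[(x, y)], 1),
                                     marginal(system[(x2, y2)], 1), tol):
            return False
    return True

# Hypothetical example: uniform, perfectly correlated AB-pairs in all contexts.
correlated = {(0, 0): 0.5, (1, 1): 0.5}
system = {(x, y): dict(correlated) for x in (1, 2) for y in (1, 2)}

# A signaling variant: for x = 1, Alice's distribution changes with Bob's y.
signaling = dict(system)
signaling[(1, 2)] = {(1, 1): 1.0}
```

Note that the check compares only distributions within each context and across contexts sharing a setting; it never forms a joint distribution of variables from different contexts, in keeping with the second principle above.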
One prominent example of a compound system is the EPR/Bohm system [7,18],
$$\mathcal{R}_{EPRB} = \left\{ \left(A^{x,y}, B^{x,y}\right) : (x,y) \in C \right\}, \quad C = \{x = 1, x = 2\} \times \{y = 1, y = 2\}.$$
Another prominent example is the compound version of the Kochen-Specker-Peres system [19,20],
$$\mathcal{R}_{KSP} = \left\{ \left(A^{x,y}, B^{x,y}\right) : (x,y) \in C \right\},$$
with $C = \{x = 1, \ldots, x = 40\} \times \{y = 1, \ldots, y = 33\}$. In $\mathcal{R}_{EPRB}$, the $x$-values and $y$-values enumerate choices of axes by Alice and Bob, and the random variables are 0/1 (say, spin values in spin-1/2 particles). In $\mathcal{R}_{KSP}$, the $y$-values represent 33 special axes in Peres's proof of the Kochen-Specker theorem [20], and the $x$-values encode Peres's 40 triples formed using these 33 axes; the $A$-variables have values 011, 101, 110, and the $B$-variables are 0/1. Any two random variables recorded in the same context, referred to as an AB-pair, are jointly distributed: this means that $\Pr\left[A^{x,y} = a, B^{x,y} = b\right]$ is well defined for $(x, y) \in C$. However, two random variables from different contexts are stochastically unrelated, i.e., have no joint distribution: if $(x, y) \neq (x', y')$, event conjunctions such as $A^{x,y} = a, A^{x',y'} = a'$ are not well-defined events, and no probabilities can be assigned to them. This formal distinction reflects the obvious fact that random variables from mutually exclusive contexts can never be observed together, in any empirical meaning of "together." In particular, $A^{x,y}$ and $A^{x,y'}$ in (2) are not equal, because they are not jointly distributed. All this means that the system $\mathcal{R}$ in (1) is a collection of stochastically unrelated AB-pairs, combined within a single system only because every AB-pair shares at least one property it measures with at least one other AB-pair.
Random variables attaining a given value with probability 1 are deterministic variables. A deterministic system is a system containing only deterministic variables. Thus, the two systems below are deterministic versions of the EPR/Bohm system, non-signaling ($\mathcal{D}_{EPRB}$) and signaling ($\mathcal{D}'_{EPRB}$): In presenting these deterministic systems we conveniently identify the random variables with their supports, say, writing 0 instead of $A^{2,1} \equiv 0$ ($\equiv$ meaning "equal with probability 1"). Because one loses the indexation as a result, one has to indicate for each number what properties it measures (at the bottom of the tables). Note that we use capital Roman letters to designate random variables, and the script letters $\mathcal{R}, \mathcal{D}$ to refer to systems, because a system is not a random variable: it is a set of stochastically unrelated random variables (the AB-pairs).
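A coupling $\bar{R}$ for a system $\mathcal{R}$ is an identically double-labeled set of jointly distributed random variables
$$\bar{R} = \left\{ \left(\bar{A}^{x,y}, \bar{B}^{x,y}\right) : (x,y) \in C \right\} \tag{7}$$
such that every AB-pair of $\bar{R}$ is distributed as the corresponding AB-pair of $\mathcal{R}$:
$$\left(\bar{A}^{x,y}, \bar{B}^{x,y}\right) \overset{dist}{=} \left(A^{x,y}, B^{x,y}\right)$$
for every $(x, y) \in C$. Note that we can write $\bar{R}$ rather than $\bar{\mathcal{R}}$ in (7) because a coupling is a random variable in its own right. Thus, while a coupling of $\mathcal{R}_{EPRB}$ can be presented as
$$\bar{R}_{EPRB} = \left\{ \left(\bar{A}^{x,y}, \bar{B}^{x,y}\right) : (x,y) \in \{1,2\} \times \{1,2\} \right\},$$
it is no longer a set of four stochastically unrelated pairs, but a single random variable with $2^8$ possible values.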
Note also that any deterministic system $\mathcal{D}$ has a unique coupling $\bar{D}$, and the two are easy to confuse if one uses our convenient identification of deterministic random variables with their supports. Thus, the coupling of $\mathcal{D}'_{EPRB}$ in (6) is written precisely as $\mathcal{D}'_{EPRB}$ itself.
A (non-signaling) compound system $\mathcal{R}$ is noncontextual if it has a coupling $\bar{R}$ such that
$$\Pr\left[\bar{A}^{x,y} = \bar{A}^{x,y'}\right] = 1, \qquad \Pr\left[\bar{B}^{x,y} = \bar{B}^{x',y}\right] = 1,$$
whenever the indicated contexts are defined (belong to $C$). If such a coupling does not exist, the system is contextual.
Overlooking logical subtleties [17], this definition is equivalent to the traditional definitions of contextuality and locality, in terms of the non-existence of joint distributions for single-indexed random variables [8] and in terms of hidden variables with noncontextual/local mapping into observables [7,19]. Perhaps this becomes clearer on observing that noncontextuality is equivalent to the existence of a set of jointly distributed single-indexed random variables $\left\{A^{x}, B^{y}\right\}$ such that
$$\left(A^{x}, B^{y}\right) \overset{dist}{=} \left(A^{x,y}, B^{x,y}\right)$$
for every $(x, y) \in C$.
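For the special case of a non-signaling system with two settings per party and 0/1 outcomes (the EPR/Bohm format), the existence of such a coupling is known to be equivalent to the CHSH inequalities (Fine's theorem), so noncontextuality can be tested numerically. The sketch below is an illustration under that assumption only; it does not apply to larger systems, and the function names and the two example systems (with correlations chosen by us) are hypothetical.

```python
import math

contexts = [(1, 1), (1, 2), (2, 1), (2, 2)]

def correlation(joint):
    """E = Pr[A = B] - Pr[A != B] for one AB-pair with 0/1 values."""
    return sum(p if a == b else -p for (a, b), p in joint.items())

def chsh_value(system):
    """Maximum of |E11 + E12 + E21 + E22 - 2*Exy| over the four contexts."""
    e = [correlation(system[c]) for c in contexts]
    return max(abs(sum(e) - 2 * ei) for ei in e)

def is_noncontextual_2x2(system, tol=1e-9):
    """Valid only for non-signaling 2x2 systems of 0/1 variables."""
    return chsh_value(system) <= 2 + tol

def pair_with_correlation(e):
    """AB-pair with uniform marginals and correlation e (hypothetical helper)."""
    same, diff = (1 + e) / 4, (1 - e) / 4
    return {(0, 0): same, (1, 1): same, (0, 1): diff, (1, 0): diff}

# Hypothetical correlations of magnitude 1/sqrt(2), one of them negative.
c = 1 / math.sqrt(2)
quantum_like = {(1, 1): pair_with_correlation(c),
                (1, 2): pair_with_correlation(c),
                (2, 1): pair_with_correlation(c),
                (2, 2): pair_with_correlation(-c)}

# Hypothetical weakly correlated system, the same in every context.
classical = {ctx: pair_with_correlation(0.5) for ctx in contexts}
```

With these inputs `quantum_like` violates the bound (its CHSH value is $2\sqrt{2}$) while `classical` satisfies it, so only the former is contextual in the sense of the definition above.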

III. FREE WILL THEOREM
If all AB-pairs in a compound system $\mathcal{R}$ are set to specific values, the resulting system is called a realization of $\mathcal{R}$. The reason this has to be presented as a definition is that, unlike a realization of a random variable (e.g., an AB-pair), a realization of a system is not an observable outcome of any experiment: it is a pure mathematical abstraction, as the variables do not co-occur across contexts. Recall that each random variable in $\mathcal{R}$ is finite-valued, so the set of possible realizations of $\mathcal{R}$ is finite. Thus, the system $\mathcal{R}_{EPRB}$ has $4^4$ realizations, whereas $\mathcal{R}_{KSP}$ has $6^{40 \cdot 33}$ realizations.
The question posed in the FWT can be formulated thus: given that an idealized experiment involving an unlimited number of particle pairs is described by a system $\mathcal{R}$, is it possible that each individual pair of particles is a deterministic system that coincides with one of the realizations of $\mathcal{R}$? The crux of the issue here is whether the realizations of the system can describe individual particle pairs. If not for this constraint, it would be innocuous (although still objectionable, as we will see later) to say that the system $\mathcal{R}$ is presentable as a mixture of some of its realizations $\mathcal{D}_1, \ldots, \mathcal{D}_k$ taken as deterministic systems:
$$\mathcal{R} = \begin{cases} \mathcal{D}_1 & \text{with probability } p_1 \\ \;\vdots & \\ \mathcal{D}_k & \text{with probability } p_k \end{cases} \tag{15}$$
However, in the question asked by the FWT, these deterministic systems are assumed to describe real physical entities (particle pairs), because of which they should be physically realizable. In particular, they are subject to special relativity, and have to be non-signaling. For instance, in the decomposition of $\mathcal{R}_{EPRB}$, the signaling deterministic system $\mathcal{D}'_{EPRB}$ in (6) is not allowed.
Theorem (reformulated and generalized FWT). A contextual system $\mathcal{R}$ cannot be decomposed as in (15), where $\mathcal{D}_1, \ldots, \mathcal{D}_k$ are non-signaling deterministic systems each of which coincides with a realization of $\mathcal{R}$.
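Proof. If $k = 0$ (no non-signaling realizations of $\mathcal{R}$ exist), the theorem is proved. Assume that (15) holds with $k > 0$. Introduce a random variable $\Lambda$ such that $\Pr[\Lambda = i] = p_i$ ($i = 1, \ldots, k$). Each $\mathcal{D}_i$, being deterministic, has a unique coupling $\bar{D}_i$, and the mixture
$$\bar{R} = \bar{D}_i \text{ if } \Lambda = i \quad (i = 1, \ldots, k)$$
is a coupling of $\mathcal{R}$. In this coupling,
$$\bar{A}^{x,y} = a_i^{x,y} \text{ if } \Lambda = i,$$
where $a_i^{x,y}$ is the realization of $A^{x,y}$ in system $\mathcal{D}_i$. But $a_i^{x,y} = a_i^{x,y'}$ for any $(x, y), (x, y') \in C$ (non-signaling). We have then
$$\Pr\left[\bar{A}^{x,y} = \bar{A}^{x,y'}\right] = 1.$$
Analogously,
$$\Pr\left[\bar{B}^{x,y} = \bar{B}^{x',y}\right] = 1$$
for all $(x, y), (x', y) \in C$. By definition then, $\mathcal{R}$ is noncontextual, contrary to the theorem's premise.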
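The construction used in the proof can be traced on a small example. The sketch below is only an illustration under our own assumptions: it takes three hypothetical non-signaling deterministic systems of the EPR/Bohm format with hypothetical weights, treats the index of the component as the value of $\Lambda$, and checks that in the resulting coupling Alice's variables for a fixed $x$ (and Bob's for a fixed $y$) coincide with probability 1.

```python
from fractions import Fraction

contexts = [(1, 1), (1, 2), (2, 1), (2, 2)]

def deterministic_system(a1, a2, b1, b2):
    """Non-signaling by construction: Alice's value depends on x only,
    Bob's value depends on y only."""
    a, b = {1: a1, 2: a2}, {1: b1, 2: b2}
    return {(x, y): (a[x], b[y]) for (x, y) in contexts}

# Hypothetical components D_1, D_2, D_3 and mixture weights p_1, p_2, p_3.
components = [deterministic_system(0, 0, 0, 1),
              deterministic_system(1, 0, 1, 1),
              deterministic_system(0, 1, 1, 0)]
weights = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

def prob(event):
    """Probability of an event that is determined by the value of Lambda."""
    return sum(p for d, p in zip(components, weights) if event(d))

def coupling_is_noncontextual():
    """Check Pr[A^{x,y} = A^{x,y'}] = 1 and Pr[B^{x,y} = B^{x',y}] = 1."""
    for (x, y) in contexts:
        for (x2, y2) in contexts:
            if x == x2 and prob(lambda d: d[(x, y)][0] == d[(x2, y2)][0]) != 1:
                return False
            if y == y2 and prob(lambda d: d[(x, y)][1] == d[(x2, y2)][1]) != 1:
                return False
    return True

def mixture_pair_distribution(context):
    """Distribution of the AB-pair of the mixed system in a given context."""
    dist = {}
    for d, p in zip(components, weights):
        dist[d[context]] = dist.get(d[context], 0) + p
    return dist
```

Because every component satisfies the equalities deterministically, the check succeeds for any choice of weights, which is the content of the proof.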
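Equivalently, and perhaps more familiarly to physicists, the proof could be formulated as a demonstration that $\mathcal{R}$ has a local hidden variable model. Using the same $\Lambda$ as in the proof, we have a coupling $\bar{R}$ of $\mathcal{R}$ such that
$$\left(\bar{A}^{x,y}, \bar{B}^{x,y}\right) = \left(f(\Lambda, x, y), g(\Lambda, x, y)\right),$$
where for each value $\Lambda = i$ and each $(x, y)$, the function $(f, g)$ reads the value of $\left(a_i^{x,y}, b_i^{x,y}\right)$, with $b_i^{x,y}$ being the realization of $B^{x,y}$ in system $\mathcal{D}_i$. Because every $\mathcal{D}_i$ is non-signaling, $f$ does not depend on $y$ and $g$ does not depend on $x$, so that
$$\left(\bar{A}^{x,y}, \bar{B}^{x,y}\right) = \left(f(\Lambda, x), g(\Lambda, y)\right),$$
which is a local (i.e., noncontextual) model with $\Lambda$ as a hidden variable.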
When this theorem is applied to the system $\mathcal{R}_{KSP}$ used in Conway and Kochen's proof, one finds that this system has no non-signaling realizations (by the Kochen-Specker theorem). Rather surprisingly, therefore, the proof of the Conway-Kochen version of the FWT is contained in the first sentence of the proof above. For $\mathcal{R}_{EPRB}$, we have 16 non-signaling realizations of this system, and the proof says that their mixtures can only be noncontextual.
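The counts quoted for the EPR/Bohm system can be confirmed by brute-force enumeration. The following sketch assumes the four contexts $(1,1), (1,2), (2,1), (2,2)$ and 0/1 values; the helper name `is_non_signaling` is ours, not the paper's.

```python
from itertools import product

contexts = [(1, 1), (1, 2), (2, 1), (2, 2)]
pair_values = list(product((0, 1), repeat=2))  # the 4 values (a, b) of an AB-pair

def is_non_signaling(realization):
    """Alice's value may depend on x only, Bob's value on y only."""
    for (x, y), (a, b) in realization.items():
        for (x2, y2), (a2, b2) in realization.items():
            if x == x2 and a != a2:
                return False
            if y == y2 and b != b2:
                return False
    return True

# A realization assigns one value (a, b) to the AB-pair of every context.
realizations = [dict(zip(contexts, values))
                for values in product(pair_values, repeat=len(contexts))]
non_signaling = [r for r in realizations if is_non_signaling(r)]

print(len(realizations), len(non_signaling))  # 256 16
```

The 16 non-signaling realizations correspond to the free choice of four bits: Alice's values for $x = 1, 2$ and Bob's values for $y = 1, 2$.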

IV. WHERE IS THE FREE WILL ASSUMPTION IN THE PROOF?
The formulations and proofs given in the previous section do not even mention the hypothetical freedom with which Alice and Bob choose their settings. How is it possible? The answer is that the experimenters' free will assumption is indeed redundant. The proof above is contingent on the assumption that R is a correct description of an idealized experiment involving an unlimited number of particle pairs, those whereof we ask whether they could be deterministic systems. The experimenters' free will is only needed to dismiss a conspiracy of nature leading to an incorrect identification of the system in such an experiment. Let us explain this using the system R EP RB .
Consider the possibility that in an EPR/Bohm experiment only four types of entangled particle pairs are possible, described by the deterministic systems $\mathcal{D}_1$, $\mathcal{D}_2$, $\mathcal{D}_3$, and $\mathcal{D}_4$. Then the true system $\mathcal{R}_{EPRB}$ obtained from any mixture of these four systems is noncontextual. Suppose, however, that whenever Alice and Bob choose $(x, y) = (1, 1)$ or $(2, 1)$ (the first and third rows in the matrices), nature chooses to supply a pair of particles described either by $\mathcal{D}_1$ or by $\mathcal{D}_2$, equiprobably; whereas for $(x, y) = (1, 2)$ and $(2, 2)$ (the second and fourth rows in the matrices) nature chooses between $\mathcal{D}_3$ and $\mathcal{D}_4$ equiprobably. Neither Alice nor Bob nor anyone analyzing their experimental data has any way of knowing this. Following many replications, the results will be a statistical estimate of a system in which all random variables are uniformly distributed and the AB-pairs are jointly distributed as in a PR box [21]. This is a non-signaling contextual system, and if it is used in the proof of the FWT in place of what we assumed to be the true system, it will lead one to the wrong conclusion that no decomposition (15) is possible.
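A conspiracy of this kind is easy to simulate exactly. In the sketch below the four deterministic systems are hypothetical values chosen by us so that the scenario works (they are not claimed to be the matrices of the original example), and contextuality is tested with the CHSH criterion, which is valid only for non-signaling systems of this 2x2 binary format.

```python
from fractions import Fraction

contexts = [(1, 1), (1, 2), (2, 1), (2, 2)]
half, quarter = Fraction(1, 2), Fraction(1, 4)

def deterministic_system(a1, a2, b1, b2):
    """Non-signaling deterministic system in the EPR/Bohm format."""
    a, b = {1: a1, 2: a2}, {1: b1, 2: b2}
    return {(x, y): (a[x], b[y]) for (x, y) in contexts}

# Hypothetical types of particle pairs (an assumption of this sketch).
D1 = deterministic_system(0, 0, 0, 0)
D2 = deterministic_system(1, 1, 1, 1)
D3 = deterministic_system(0, 1, 0, 0)
D4 = deterministic_system(1, 0, 1, 1)

def observed_system(supply):
    """supply: context -> list of (deterministic system, probability)."""
    system = {}
    for c in contexts:
        dist = {}
        for d, p in supply[c]:
            dist[d[c]] = dist.get(d[c], 0) + p
        system[c] = dist
    return system

def chsh_value(system):
    """Values above 2 indicate contextuality (2x2, 0/1, non-signaling only)."""
    e = [sum(p if a == b else -p for (a, b), p in system[c].items())
         for c in contexts]
    return max(abs(sum(e) - 2 * ei) for ei in e)

# Independence: the same mixture of D1..D4 is supplied in every context.
unbiased = {c: [(D1, quarter), (D2, quarter), (D3, quarter), (D4, quarter)]
            for c in contexts}
# Conspiracy: D1/D2 are supplied when y = 1, D3/D4 when y = 2.
conspiracy = {c: ([(D1, half), (D2, half)] if c[1] == 1
                  else [(D3, half), (D4, half)])
              for c in contexts}

print(chsh_value(observed_system(unbiased)))    # 2
print(chsh_value(observed_system(conspiracy)))  # 4
```

Under the unbiased supply the estimated system stays within the noncontextual bound, whereas the setting-dependent supply produces the maximal value 4 characteristic of a PR box, although every individual pair is deterministic and non-signaling.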
To avoid such conspiratorial scenarios one can postulate that the distribution of the non-signaling deterministic systems in (15) is the same for all contexts (all choices of settings). This can be interpreted in terms of Alice's and Bob's free will, but it does not have to be. It would therefore be better to call this assumption unbiasedness, but, so as not to multiply terminology, we follow the authors who call it (measurement or setting) independence. The assumption, of course, is consistent with the choices of settings being perfectly predetermined but simply uncorrelated with the occurrences of the different types of deterministic systems. Moreover, Alice's and Bob's choices may very well be correlated with each other; it makes no difference.
The example just given, of a noncontextual system being mistaken for a contextual one, suggests the logical possibility of taking the decomposition (15) for granted, and accounting for the (apparent) contextuality of the observed system $\mathcal{R}$ either by relaxing the requirement of non-signaling of the deterministic systems $\mathcal{D}_i$ or by exploring deviations from the independence assumption. There is an obvious reciprocity between the two, and it has indeed been researched and quantified [9-12]. This line of study is outside the scope of our paper. In the FWT we are only interested in whether individual particle pairs can be described by deterministic non-signaling systems, and the answer given is negative.
As the above reasoning shows, the independence assumption is not a necessary premise of this theorem, because it is obviated by the critical assumption that certain contextual systems exist. The situation is this: (i): if we do not assume, e.g., that R EP RB for certain quadruples of axes is contextual, then the FWT for this system cannot be proved whether or not one adopts the independence assumption; and (ii): if we do assume the contextuality of R EP RB (presumably because we believe experiments or quantum-mechanical theory), the proof can be carried out without mentioning the independence assumption.
Analogously, Conway and Kochen have to postulate the existence of a system R KSP with certain properties (the SPIN assumption), ensuring that no consistent assignment of values to Alice's measurements is possible (the Kochen-Specker theorem); and they also have to postulate that if Bob's axis coincides with one of the three axes chosen by Alice, then the corresponding measurement outcomes always coincide (the TWIN assumption). With these postulates, however, the independence (part of their MIN assumption) is not needed.
Summarizing, the independence assumption (or "experimenters' free will") is only needed if we consider the epistemological question: how can one be sure that the compound system estimated from an experiment is truly contextual? The FWT is a conditional statement: if we have a contextual compound system with certain properties, it cannot be decomposed as in (15).

V. SYSTEMS VERSUS ISOLATED AB-PAIRS
There is, however, a simpler reason not to use deterministic systems when describing individual particle pairs. This reason is that a single pair of particles is a realization of an isolated AB-pair of a system rather than a realization of an entire system. The difference is that the AB-pair is determined by the factual context chosen by Alice and Bob, while any given realization of a system also includes the counterfactual contexts that Alice and Bob "could have chosen." Thus, experimental trials for $\mathcal{R}_{EPRB}$ produce a series of outcomes, each recorded in its factual context, such as the value $\left(A^{1,2}, B^{1,2}\right) = (0, 1)$ recorded in the context $(x, y) = (1, 2)$.
We see no logical reason to think that the observed value $\left(A^{1,2}, B^{1,2}\right) = (0, 1)$ is somehow related to any specific values of the AB-pairs for contexts other than the factual $(x, y) = (1, 2)$. Such a relation would only be reasonable if there existed a way to observe these alternative AB-pairs by factually performing the measurements for $(x, y) = (1, 1)$, $(2, 1)$, and $(2, 2)$ in addition to $(x, y) = (1, 2)$ on the same pair of particles. We know that this is impossible. In an idealized experiment all particle pairs are generated and prepared in precisely the same way, their spin values, for any given choice of settings, are measured in precisely the same way, and these spin values are not further related to observable circumstances other than the settings (if they are, these circumstances should be included as part of redefined contexts). For instance, one may very well be interested in, say, how previous measurements in various contexts affect the measurements in a present context, but this would mean that the previous measurements (settings and/or outcomes) should be formally considered part of the present context, which would radically redefine the system one is dealing with. Barring such redefinitions, a sequence like (25) should be treated as several (here, four) mutually unrelated sequences, each defined by a specific context.
One may now ask seemingly the same question as in the FWT but applied to pairs of particles treated as realizations of specific AB-pairs: can they be considered deterministic variables? In other words, the question is whether each pair of particles can be described as a deterministic AB-pair, with the specific settings $(x, y)$ under which the measurements are recorded. This is, however, a very different question, and the answer is: yes, if one so wishes, but this makes no difference. Flips of a fair coin can always be considered a mixture of two deterministic variables with respective values Head and Tail, each occurring with probability 1/2. Following the prevailing tradition in statistics, a sequence of realizations of an AB-pair can be treated as a realization of a random sample (a set of identically distributed independent random variables). However, it can also be treated as a realization of a set of different random variables (e.g., deterministic ones), randomly alternating. Let us again use $\mathcal{R}_{EPRB}$ for an illustration. Assume that one of its AB-pairs is distributed as
$$\begin{array}{c|cccc} \text{value} & (1,1) & (1,0) & (0,1) & (0,0) \\ \hline \text{probability} & p & 1-p & 0 & 0 \end{array}$$
Clearly, this variable is indistinguishable from any mixture
$$\begin{cases} X & \text{with probability } q \\ Y & \text{with probability } 1-q \end{cases}$$
of random variables $X, Y$, each attaining only the values $(1, 1)$ and $(1, 0)$, provided the mixture assigns probability $p$ to the value $(1, 1)$. In particular (and trivially), $X, Y$ can be viewed as deterministic variables, mixed as
$$\begin{cases} X & \text{with probability } p \\ Y & \text{with probability } 1-p \end{cases}$$
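The indistinguishability claim is a matter of simple arithmetic, which the following sketch checks exactly. The numerical values of $p$, $q$ and the distributions of the non-deterministic components are hypothetical; the matching condition used for them ($q r + (1 - q) s = p$, with $r$ and $s$ denoting the probabilities that $X$ and $Y$ assign to the value $(1, 1)$) is our own statement of what the arithmetic requires, not a formula quoted from the text.

```python
from fractions import Fraction

def mixture(components):
    """Distribution of a mixture; components = [(weight, distribution), ...]."""
    dist = {}
    for weight, component in components:
        for value, prob in component.items():
            dist[value] = dist.get(value, 0) + weight * prob
    return dist

# Target AB-pair: value (1, 1) with probability p, (1, 0) with probability 1 - p.
p = Fraction(3, 10)  # hypothetical value
target = {(1, 1): p, (1, 0): 1 - p}

# Deterministic decomposition: X is always (1, 1), Y is always (1, 0).
X_det, Y_det = {(1, 1): Fraction(1)}, {(1, 0): Fraction(1)}
mix_det = mixture([(p, X_det), (1 - p, Y_det)])

# Non-deterministic decomposition with hypothetical q and r; s is solved
# from the assumed matching condition q*r + (1 - q)*s = p.
q, r = Fraction(1, 2), Fraction(1, 2)
s = (p - q * r) / (1 - q)
X = {(1, 1): r, (1, 0): 1 - r}
Y = {(1, 1): s, (1, 0): 1 - s}
mix_random = mixture([(q, X), (1 - q, Y)])

print(mix_det == target, mix_random == target)  # True True
```

Both decompositions reproduce the same observable distribution, so no data recorded in this context can favor one description of the individual pairs over the other.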
We see that the question of whether the individual pairs of particles have "free will" of their own, i.e., whether they are deterministic entities, loses its meaning. It should also be noted that once this context-wise view of the individual particle pairs is adopted, one need not be concerned with the independence assumption ("free will" of Alice and Bob). In fact, this assumption is now unformulable. Each AB-pair corresponds to a fixed context, and one cannot informatively say that the AB-pair in a given context does not depend on context.

VI. CONCLUSION
To summarize, the view that follows from the conceptual framework of CbD does not invalidate the FWT-type theorems, but makes them unnecessary. These theorems can be viewed as reductio ad absurdum demonstrations that individual pairs of particles should not be viewed as deterministic systems. In CbD this is true because the individual pairs of particles should not be viewed as systems to begin with, only as realizations of random variables for a given choice of settings. As Peres famously put it, "unperformed experiments have no results" [22], and these non-existing results should not be appended to factual results, lest one run into a contradiction. The reason we can speak of $\mathcal{R}$ in (1) as a system is that as we switch from one context to another, all other macroscopic circumstances of measurements (overall experimental set-up and preparation procedure) remain the same. We can repeatedly experiment with such a system without changing its defining parameters. In particular, a deterministic system allows one to repeatedly experiment with it and obtain the same results for any given context. Individual pairs of particles afford only one pair of measurements, in one particular context.