Cosmic Bell Test using Random Measurement Settings from High-Redshift Quasars

In this Letter, we present a cosmic Bell experiment with polarization-entangled photons, in which measurement settings were determined based on real-time measurements of the wavelength of photons from high-redshift quasars, whose light was emitted billions of years ago, the experiment simultaneously ensures locality. Assuming fair sampling for all detected photons and that the wavelength of the quasar photons had not been selectively altered or previewed between emission and detection, we observe statistically significant violation of Bell's inequality by $9.3$ standard deviations, corresponding to an estimated $p$ value of $\lesssim 7.4 \times 10^{-21}$. This experiment pushes back to at least $\sim 7.8$ Gyr ago the most recent time by which any local-realist influences could have exploited the"freedom-of-choice"loophole to engineer the observed Bell violation, excluding any such mechanism from $96\%$ of the space-time volume of the past light cone of our experiment, extending from the big bang to today.

In this paper we present a cosmic Bell experiment with polarization-entangled photons, in which measurement settings were determined based on real-time measurements of the wavelength of photons from high-redshift quasars, whose light was emitted billions of years ago; the experiment simultaneously ensures locality. Assuming fair sampling for all detected photons, and that the wavelength of the quasar photons had not been selectively altered or previewed between emission and detection, we observe statistically significant violation of Bell's inequality by 9.3 standard deviations, corresponding to an estimated p value of 7.4 × 10 −21 . This experiment pushes back to at least ∼7.8 Gyr ago the most recent time by which any local-realist influences could have exploited the "freedom-of-choice" loophole to engineer the observed Bell violation, excluding any such mechanism from 96% of the space-time volume of the past light cone of our experiment, extending from the big bang to today.
Background. To Erwin Schrödinger, entanglement was "the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought" [1]. He referred to an analysis by Einstein, Podolsky, and Rosen (EPR) [2], regarding the quantummechanical predictions for perfect correlations in certain quantum systems. EPR made two assumptions explicit. Regarding locality, they wrote: "Since at the time of measurement the two systems no longer interact, no real change can take place in the second system in consequence of anything that may be done to the first system." They also articulated a "reality criterion": "If, without in any way disturbing a system, we can predict with certainty (i.e., with probability equal to unity) the value of a physical quantity, there exists an element of physical reality corresponding to this physical quantity." In the light of these two assumptions and their analysis of a particular two-particle state, EPR concluded that quantum mechanics is incomplete. While the EPR reasoning is logically unassailable, Niels Bohr pointed out that the EPR assumptions need not hold for quantum observations [3].
This discussion had laid dormant for several decades, but in 1964 John Stewart Bell demonstrated that a complete theory based on the EPR premises makes predictions that are in conflict with those of quantum mechanics [4,5]. In such local-realist theories, it is assumed that every individual system carries its own set of properties prior to measurement, which are presumed to be inde-pendent of any possible influence from outside its past light cone. Bell concluded that in a local-realist theory the strength of correlations among measurements on different particles' properties is limited and smaller than the predictions of quantum physics. This is expressed by Bell's inequality.
With Bell's result, a question that previously had been dismissed as "merely philosophical" became experimentally testable. In 1969, Clauser, Horne, Shimony, and Holt (CHSH) published their inequality as an experimentally accessible variant of Bell's version [6]. The idea was to measure the four probabilities p(A, B|a i , b j ) of measurement results A, B ∈ {+1, −1}, in which Alice chooses between two measurement bases a 1 and a 2 , and likewise Bob chooses between the two measurement bases b 1 and b 2 . For systems in particular states subject to judicious choices of measurement bases, the predictions for correlations among the measurement outcomes A, B under various combinations of settings a i , b j differ markedly between quantum mechanics and models that satisfy EPR's assumptions of locality and realism. Subsequently, entangled-particle states have been shown to violate Bell's inequality in numerous situations, consistent with the predictions of quantum theory [7][8][9]. Yet experiments always require sets of assumptions for their interpretation [10,11]. In tests of local realism, these assumptions can be seen as loopholes, by which, in principle, it could be argued that local realism has not been completely ruled out [8,12]. Closing the locality loophole [13,14], for example, requires that the measurement settings are changed by Alice shortly before the arrival of an entangled particle at her detector, such that no signal could inform Bob about Alice's measurement setting or outcome before Bob completes a measurement at his own detector (and vice versa). The fair sampling assumption, on the other hand, states that the measured subset of particles is representative of the complete set. This loophole is closed if a sufficiently high fraction of the entangled pairs is detected [15][16][17]. Recently, several experiments have observed significant violations of Bell's inequality while simultaneously closing both the locality and fair-sampling loopholes [18][19][20][21].
The freedom-of-choice loophole, as usually understood, concerns events that might have transpired within the causal past of a given experiment, which a local-realist mechanism could have exploited in order to mimic the predictions from quantum mechanics [42]. In a recent pilot test [38], measurement settings for a test of Bell's inequality were determined by real-time observation of light from Milky Way stars, thereby constraining any such local-realist mechanism to have acted no more recently than ∼600 years ago, rather than microseconds before a given experimental run (as in previous tests [36]). The magnitude of that leap reflected how comparatively little attention had been devoted previously to experimentally addressing this loophole. Given the expansion history of the universe since the big bang, however, the pilot test [38] excluded only about one hundredthousandth of one percent of the relevant space-time volume within the past light cone of the experiment.
In this paper, we describe a Cosmic Bell experiment that pushes the origin of the measurement settings considerably deeper into cosmic history, constraining any local-realist mechanism to have acted no more recently than 7.78 Gyr ago. Based on the arrangement of highredshift quasars used in our experiment, these results exclude any local-realist mechanism that might have exploited the freedom-of-choice loophole from 96.0% of the space-time volume of the past light cone of the experiment, extending from the big bang to today.
Experimental implementation. Figure 1 shows a schematic of the experimental setup at the Observatorio del Roque de los Muchachos on the Canary Island of La Palma. A central entangled photon source was located in a container next to the Nordic Optical Telescope. One entangled-photon observer, Alice, was situated in another container next to the Telescopio Nazionale Galileo (TNG), and Bob was stationed at the ground floor of the William Herschel Telescope (WHT). The quasar photons were collected by the TNG [43] and the WHT [44]. The random numbers extracted from these signals were transmitted to the observers using BNC cables. The polarization-entangled photons were distributed from the source to the receivers via free-space optical links. A more detailed schematic of the setup can be seen in Figure 2.
The entangled photon source (see Supplementary Materials [45]) was based on type-0 spontaneous parametric down-conversion (SPDC) in a Sagnac loop configuration [46,47]. It generated fiber-coupled photon pairs at center wavelengths λ A = 850 nm and λ B = 773.6 nm in a state close to the maximally entangled Bell state where subscripts A and B label the respective single-mode fiber for Alice and Bob. Each photon was guided to a transmitting telescope (Tx), and distributed via free-space optical channels to the receiving stations of Alice and Bob. Each station consisted of a receiving telescope for entangled photons (Rx), a polarization analyzer (POL), a control and data acquisition unit (CaDa) locked to a rubidium frequency standard, and a cosmic random number generator (CRNG). The entangled photons were guided from the Rx to the polarization analyzer, where an electrooptical modulator (EOM) performed the fast switching between the complementary measurement bases accordingly. The polarization was measured using a polarizing beam splitter followed by a single-photon avalanche diode (SPAD) in each output port.
The CRNGs at TNG (Alice) and WHT (Bob) were essentially identical. The optical path for each CRNG featured a magnified intermediate image, which enabled one to adjust the field of view with an iris in order to minimize background light. A dichroic mirror with a cutoff wavelength of 630 nm split the incoming light in a transmitted "red" and a reflected "blue" arm. Additional filters (shortpass at 620 nm in the blue arm and longpass at 637 nm in the red arm) efficiently filtered out misdirected photons whose wavelengths were near the cutoff wavelength of the dichroic mirror. Incorporating these additional filters yielded much smaller fractions of misdirected astronomical photons than in our previous experiment [38], with f w < 2 × 10 −5 (see the Supplemental Material [45]). Light from each arm was fed to a SPAD. Electric signals from the SPADs were processed by the CaDa, which triggered the EOM to apply the corresponding measurement settings. Alice measured linear polarization along 22.5 • /112.5 • (red) and 67.5 • /157.5 • (blue), while Bob measured linear polarization along 0 • /90 • (red) and 45 • /135 • (blue). All SPAD events, from the CRNGs and the polarization analyzers, were timestamped in the CaDa and recorded by a computer.
Using the wavelength of cosmic photons to implement the measurement settings requires the assumption that the wavelength of each photon was set at emission and has not been selectively altered or previewed between emission and detection. (Well-known processes, such as cosmological redshift and gravitational lensing, treat all photons from a given astronomical source in a uniform way, independent of the photons' wavelength at emission [48][49][50].) Within an optically linear medium, there does not exist any known physical process that can absorb and reradiate a given photon at a different wavelength along the same line of sight, without violating the local conservation of energy and momentum. We further assume that the detected cosmic photons represent a fair sample, despite significant losses in the intergalactic and interstellar media, the Earth's atmosphere, and the experimental setup.
Various scenarios could (in principle) lead to corrupt choices of measurement settings within our experiment. For example, local sources of photons ("noise") rather than genuine astronomical photons could trigger the CRNGs. The most significant sources of local noise include sky glow, light pollution, and detector dark counts. The overall background was measured by pointing each telescope to a dark sky patch 10 arcseconds away from its target quasar before and after each observing period.
Space-time arrangement. Ensuring locality requires that any information leaving Alice's quasar at the speed of light along with her setting-determining cosmic photon could not have reached Bob before his measurement of the entangled photon is completed, and vice versa.
The projected space-time diagram in Figure 3 illustrates the situation for the first observed quasar pair (pair 1) of our experiment. The entangled photons are generated at point S and travel through 12 m of optical fiber, resulting in a delay of τ fiber ≈ 50 ns. The distance over the free-space channels is x A = 534 m to A and x B = 500 m to B.
To ensure the locality conditions, measurements of entangled photons must only be accepted within a certain valid time window, τ k valid , which has to be chosen such that the selection and implementation of the corresponding settings on one side remain space-like separated from the measurements on the other side. τ k valid is constrained to within a certain time window τ k geom , which depends on the time-dependent directions of the quasars relative to A and B. Given the moderate time dependence of τ k geom over the relatively brief observing periods (≤ 17 min), we use the shortest value per side within the observing period:τ k geom = min t (τ k geom ), wherē τ A geom = 2.81 µs (2.67 µs) andτ B geom = 1.48 µs (1.11 µs) for pair 1 (2). Various delays from signal transmission through fibers and BNC cables, and to implement a given setting with the EOM, have to be subtracted from τ k geom to compute the correct validity time τ k valid . The delay until a certain setting was implemented, τ k set , was measured to be 325 ns and 430 ns for Alice and Bob, respectively. An additional buffer was used on both sides with τ buffer = 150 ns to account for small inaccuracies in tim- A photon pair source located in the middle produced polarization-entangled photons at center wavelengths of 773.6 nm and 850 nm. The photons were separated into two spatial modes via a dichroic mirror and sent via free-space channels to the quantum receivers at Bob (773.6 nm) and Alice (850 nm). Fast steering mirrors guided the photons to the receivers using a green LED as a reference. Electro-optical modulators (EOM) rotate the measurement basis according to the input signals from the CRNGs. Polarization measurements are performed using a polarizing beam splitter (PBS) with avalanche photodiodes in each output path. Detection events are timestamped by the control and data acquisition unit (CaDa) and stored locally. Quasar light is collected by the astronomical telescopes and fed into an optical system that creates a magnified image with an iris to restrict the field of view. The quasar light is then split according to its wavelength into a "blue" and a "red" channel, whereby each channel contains additional filters to remove misdirected photons. The detector signals are used to trigger the implementation of the corresponding measurement basis at the EOM. TABLE I. For Alice and Bob's side (k = {A, B}), we list the QSO Simbad identifiers, azimuth (az k ) (clockwise from due North) and altitude (alt k ) above horizon at the start of the observing periods, and redshift (z) and lookback time to emission (t lb ) for quasars observed in pairs 1 and 2, beginning at UTC 2018-01-11 00:20:00 (pair 1) and 2018-01-11 01:21:00 (pair 2). Pair 1 was observed for a total of 17 min, pair 2 for 12 min. τ k valid is the time the detector setting was valid, taking into account delays and safety margins (see Figure 3). The last three columns show the measured CHSH parameter, as well as the p value and the number of standard deviations ν by which our local-realist model can be rejected (see the Supplemental Material [45] For pairs 1 and 2, measurement settings at Bob's station were determined based on observations of a quasar with redshift z B = 3.911 [51], corresponding to a lookback time to the emission of that light t B lb = 12.21 Gyr ago. Measurement settings at Alice's station were determined based on observations of quasars with z A = 0.964 [52] (pair 1) and z A = 0.268 [53] (pair 2), corresponding to t A lb = 7.78 Gyr and 3.22 Gyr ago, respectively. (See Table I.) These times may be compared with the age of our observable universe since the big bang, t lb = 13.80 Gyr [54]. We consider possible implications of inhomogeneities along the lines of sight to these objects, such as gravitational lensing effects, in the Supplemental Material [45].  (1+1)D space-time diagram for pair 1, with the origin at the source S of entangled pair creation (black dot) and a spatial projection axis chosen to minimize its distance to Alice and Bob. After a short fiber delay (too small to see), entangled photons are sent via free-space channels (black lines) to be measured by Alice and Bob at events A and B.
Galaxy symbols indicate examples of measurements of valid settings from quasar photons emitted far away at space-time events QA and QB. Ensuring locality limits settings to the shaded regions. Delays to implement each setting and an added safety buffer shorten the validity time windows that were actually used to the darker shaded regions. [45].) Events associated with any local-realist mechanism that could have affected detector settings and measurement outcomes of our experiment would need to lie within the past light cones of Q A and/or Q B , and hence are restricted to have acted no more recently than t A lb = 7.78 Gyr or 3.22 Gyr ago for pairs 1 and 2, respectively.

ment. (See Supplemental Material
Analysis and results. We performed two Cosmic Bell tests with the quasars listed in Table I, for a total measurement time of 17 min (pair 1) and 12 min (pair 2). In the analysis of our acquired data, we follow the assumption of fair sampling and fair coincidences [12]. Thus, our data can be postselected for coincidence events at Alice's and Bob's stations. We correct for the clock drift as in Ref. [55] and identify coincidences within a time window of 2.66 ns. We then check for correlations between measurement outcomes A, B ∈ {+1, −1} for particular settings choices (a i , b j ), i, j ∈ {1, 2} using the Clauser-Horne-Shimony-Holt (CHSH) inequality [6]: where E ij = 2p(A = B|a i b j ) − 1 and p(A = B|a i b j ) is the probability of Alice and Bob obtaining the same measurement outcome for the joint settings (a i , b j ). While four probabilities can arithmetically add up to 4, localrealistic correlations cannot exceed an S value of 2 and the quantum-mechanical limit is 2 √ 2 [56]. As can be seen from Table I  ceed the local-realist bound of 2. However, not all of our settings were determined by genuine cosmic photons. A certain fraction of settings k on each side (k ∈ {A, B}) was produced by some kind of local process, including sky glow, ambient light, and detector dark counts. We therefore consider such settings to be "corrupt" and assume that a local-realist mechanism could have exploited them to produce maximal CHSH correlations, with S = 4. Such a (hypothetical) mechanism could produce CHSH correlations as large as S = 2(1 − A − B ) + 4( A + B ) [38,40,57].
In our analysis we account for such "corrupt" settings as well as unequal (biased) frequencies for various combinations of detector settings (a i , b j ), and possible "memory effects" by which a local-realist mechanism could exploit knowledge of settings and outcomes of previous trials (see Supplemental Material [45]). From this detailed treatment, we find that correlations at least as large as observed in our data could have been produced by a local-realist mechanism only with probabilities p ≤ 7.4 × 10 −21 for pair 1 and p ≤ 7.0 × 10 −13 for pair 2, corresponding to experimental violations of the Bell-CHSH bound by at least 9.3 and 7.1 standard deviations, respectively.

Conclusions.
For each Cosmic Bell test reported here, we assume fair sampling and close the locality loophole. We also constrain the freedom-of-choice loophole with detector settings determined by extragalactic events, such that any local-realist mechanism would need to have acted no more recently than 7.78 Gyr or 3.22 Gyr ago for pairs 1 and 2, respectively-more than six orders of magnitude deeper into cosmic history than the experiments reported in Ref. [38]. This corresponds to excluding such local-realist mechanisms from 96.0% (pair 1) and 63.5% (pair 2) of the relevant space-time regions, compared to ∼ 10 −5 % of the relevant space-time region as in Ref. [38] (see Supplemental Materials [45]).
We have therefore dramatically limited the space-time regions from which local-realist mechanisms could have affected the outcome of our experiment to early in the history of our universe. To constrain such models further, one could use other physical signals to set detector settings, such as patches of the cosmic microwave background radiation (CMB), or even primordial neutrinos or gravitational waves, thereby constraining such models all the way back to the big bang-or perhaps even earlier, into a phase of early-universe inflation [31,38]. Such extreme tests might ultimately prove relevant to the question of whether quantum entanglement undergirds the emergence of space-time itself. (For a recent review, see Ref. [58]).
Note Added. After we completed our experiment, a similar experiment was conducted by another group, the results of which are reported in Ref. [59].

Causal Alignment
As described in Ref. [38], the time-dependent locations of astronomical sources on the sky relative to our ground-based experimental site complicates the enforcement of the space-like separation conditions needed to address both the locality and freedom-of-choice loopholes. For example, the photon from quasar emission event Q A must be received by Alice's cosmic-photon receiving telescope (Rx-CP) before that photon's causal wavefront reaches either the Rx-CP or the entangledphoton receiving telescope (Rx-EP) on Bob's side, and vice versa.
In this section we first present our main result for the causal-alignment windows, τ k valid (t) (for sides k, = {A, B}), within which settings chosen by astronomical photons remain valid, and then derive the various terms in our expression. As shown in Figure 3 of the Main paper, we parameterize τ k valid (t) as where τ k geom (t) arises from the geometrical arrangement of the quasars relative to the locations of relevant instrumentation on Earth, andτ k is the minimum value of τ k geom (t) during an observing window. The term τ k set indicates the time required to electronically output a bit and implement the detector setting, while τ k buffer accommodates total delays due to atmosphere, telescope optics, and detector response. Negative validity times for either k would indicate an instantaneous configuration that was out of "causal alignment," in which at least one setting would be invalid for the purposes of closing the locality loophole.
For τ k geom (t), we find where r k and m k are the spatial 3-vectors for the locations of the cosmic receiving telescopes (Rx-CP) and entangled-particle detectors (Rx-EP) for side k, respectively; s is the spatial 3-vector for the location of the source of entangled particles; and c is the speed of light in vacuum. The time-dependent unit vectorn k (t) points toward the quasar used to set detector settings on side k, and is computed using astronomical ephemeris calculations. Additionally, n is the index of refraction of air and γ k , which acts like an index of refraction, parameterizes the group velocity delay through fiber optics and/or electrical cables connecting the telescope and entangled photon detector on side k.
We work with a space-time metric signature (+, −, −, −), so that space-time events represented by four-vectors A µ and B µ will be space-like separated if (A µ − B µ ) 2 < 0. We represent spatial and temporal intervals of cosmological magnitude-such as the interval between emission of light from a distant quasar and its detection on Earth today-in terms of a spatially flat Friedmann-Lemaître-Robertson-Walker (FLRW) line element because on length-scales greater than O(100) Mpc, our universe has been measured to be homogeneous [63], isotropic [64], and spatially flat [54] to high accuracy. We have where t is cosmic time, equal to the proper time recorded by a freely falling observer at the origin of the spatial coordinate system, χ is a (dimensionless) comoving distance, and dΩ 2 (2) (θ, φ) = dθ 2 + sin 2 θdφ 2 is the line-element for a unit 2-sphere. In this section we ignore possible complications from inhomogeneities, such as gravitational-lensing effects.
In Eq. (3), a(t) is the (dimensionless) cosmic scale factor and the constant R 0 ≡ c/H 0 has dimensions of length. We use H 0 = 67.74 km s −1 Mpc −1 as the present value of the Hubble parameter [54], corresponding to a Hubble time t H = H −1 0 = 14.43 Gyr and hence R 0 = 14.43 Glyr. Physical distances at a given cosmic time, r(t), are related to comoving distances by r(t) = R 0 a(t)χ. The observed redshift, z, for astronomical objects arising from cosmic expansion is given by where t 0 is the present time and t e is the time of emission. We set a(t 0 ) = 1 and take t = 0 to be the time of the hot big bang (following any primordial phase of inflation, if inflation occurred). We further assign the origin of the spatial coordinates to be the center of the Earth. Errors introduced by treating the rotating Earth as an inertial frame are less than one part in 10 6 , and are easily accommodated within τ buffer . Quasar emission event Q k occurred a long time ago, in a galaxy far, far away. Hence it is convenient to introduce (dimensionless) conformal time, dη ≡ H 0 dt/a(t), to take into account the cosmic expansion between the emission and detection of the cosmic photons. Then Eq. (3) becomes and (radial) null geodesics correspond to dη = dχ. In these coordinates, the 4-vector corresponding to the receipt of a quasar photon at detector k on Earth may be written R µ k = (η r k , r k ), and the 4-vector for the emission of the photon at the quasar is where χ r k is the (comoving) spatial location of the quasar photon detector Rx-CP on side k, andn k (t) is the unit spatial 3-vector pointing from the center of the Earth toward the quasar. The entangled pair is emitted from the source at S µ = (η s , χ s ). See Figure 1.
The entangled-pair emission event should be space-like separated from the arrival of the quasar photons, which (We further note that in the limit (η r k − η q k ) |χ r k |, the spherical waves emitted from the quasar arrive at the Earth as plane waves to a very good approximation.) The quantities in Eq. (7) all refer to Earthbound events, and hence we may transform back to coordinates more convenient for describing a given experimental trial. Upon recalling that a(t 0 ) = 1, we have χ r k = R −1 0 r k , where r k is the present (physical) spatial location of the quasarphoton detector Rx-CP as reckoned from the center of the Earth, and likewise χ s = R −1 0 s. We also have The cosmic scale factor varies imperceptibly during the course of the experiment, so we may expand , and overdots denote derivatives with respect to cosmic time, t. During a given experimental trial, δt = t r k − t s is typically a fraction of a second, so we Each measurement on an entangled particle should be completed before the causal wavefront from the quasar emission on the other side arrives, which requires , using Eq. (6) for Q µ k , and proceeding as above to relate Meanwhile, each quasar photon must be received, processed, and converted to a stable detector setting before the entangled photon from the source arrives. To calculate τ k geom (t), we consider only the spatial arrangement of the various events, since we separately accommodate additional delays (from telescope optics, electronics, cables, and the like) with the factors τ k set and τ k buffer . For τ k geom (t), we may therefore parameterize where t d k is the time when the setting for the entangledphoton detector on side k is set. Eq. (11) takes into account the fact that the quasar-photon reception and the detector-setting event can occur at different spatial locations. For measurements on the future light cone of the entangled-particle emission, we can use the particles' travel time from the source to the detectors to write For the setting to be valid, meanwhile, it must be set before the measurement, t d k < t m k . Then we may rearrange Eqs. (11) and (12) to write Similarly substituting Eq. (12) for t m into Eq. (10) we have We want the most conservative limit on τ k geom (t). Subtracting the lefthand side of Eq. (9) from the lefthand side of Eq. (14) yields c −1 (n + cos θ)|m − s| ≥ 0, since the index of refraction satisfies n ≥ 1, and | cos θ| ≤ 1, where θ is the angle betweenn k and (m − s). Therefore the lefthand side of Eq. (14) provides the tighter lower bound on (t r k − t s ). The validity time is determined by the difference between the upper and lower bounds on (t r k − t s ): subtracting the lefthand side of Eq. (14) from the righthand side of Eq. (13) yields our expression for τ k geom (t) in Eq. (2).
For our experiment, we may set m k r k for each side, and accommodate measured delays for signal propagation and processing within the factors τ k set . Then we may compute values for τ k geom (t) using the coordinates for the various experimental stations shown in Table I. We used τ buffer = 150 ns on each side, as well as τ A set = 325 ns and τ B set = 430 ns. Incorporating these values for τ k set and τ buffer as in Eq. (1) yielded the final values of τ k valid that we used, shown in Table II.

Excluded Spacetime Regions
By using light from distant quasars to determine detector settings, we may constrain the space-time region within which any putative local-realist mechanism could have engineered the observed correlations among measurements on the entangled particles. In this section we consider the space-time regions excluded from such localrealist scenarios.
Following the discussion in Ref. [65], we may relate the measured redshift for a given astronomical object to the conformal time at which the light we receive on Earth was emitted by the object, η q . We take η = 0 to correspond to the time of the hot big bang. We may also compute the lookback time to the emission event (in cosmic time), t lb , reckoned from the present, t 0 .
We parameterize the Friedmann equation governing the evolution of a(t) in terms of the function where H(a) is the Hubble parameter for a given scale factor a = a(t), and we again use the best-fit value H 0 = 67.74 km s −1 Mpc −1 [54]. The Ω i ≡ ρ i /ρ c are the present-day ratios of the energy densities of dark energy (ρ Λ ), cold matter (ρ M ), and radiation (ρ R ) to the criti-cal density, ρ c = 3H 2 0 /(8πG), where G is Newton's gravitational constant. (The quantity ρ M includes contributions from both baryonic matter and cold dark matter.) We also define the total fractional density of dark energy, cold matter, and radiation (Ω T ≡ Ω Λ + Ω M + Ω R ), and the fractional density associated with spatial curvature (Ω k ≡ 1 − Ω T ). We assume that ρ Λ arises from a genuine cosmological constant with equation of state w = p/ρ = −1, and hence Ω Λ a −3(1+w) = Ω Λ , which is consistent with observations [54]. We adopt the best-fit cosmological parameters from Ref. [54], Here Ω R = Ω M /(1 + z eq ), with the redshift for matter-radiation equality given by z eq = 3371 [54].
Redshifts for the three quasars we observed are listed in Table II. For QSO J0831+5245, which was observed for both quasar pairs, we use a reported host galaxy redshift of z = 3.9114 ± 0.0003 from Ref. [51]. For QSO B0350-073, we use the reported redshift of z = 0.9635389 ± 0.00011 from the Sloan Digital Sky Survey (SDSS) Quasar Catalog Fourteenth Data Release [52]. For QSO B0422+004, reported redshifts include z = 0.31 [66] and z = 0.268 [53]. Neither reported redshift included an uncertainty, so we conservatively adopt the smaller value, z = 0.268. Given the small redshift uncertainties for QSO J0831+5245 and QSO B0350-073, and that no redshift uncertainties were reported for QSO B0422+004, we assume that all redshift uncertainties are negligible.
The conformal time for the emission event from a distant quasar at redshift z may be written [65] upon using a e = 1/(1+z) from Eq. (4) (and recalling our convention that a(t 0 ) = 1). Using dη = H 0 dt/a(t), we may similarly compute the (cosmic time) lookback time to the emission event from today as again using a e = 1/(1+z). In Table II we list the quasars used in pairs 1 and 2 for k = A, B, their measured redshifts, z, and the corresponding values of η k (z) and t k lb (z) for each quasar. The present age of the universe corresponds to η 0 = η(z = 0) = 3.20, and the lookback time to the hot big bang is t lb (∞) = 13.80 Gyr.
Neglecting (for the moment) any possible effects from inhomogeneities along the lines of sight between the quasar emission events and our receipt of the cosmic photons on Earth, we assume that any local-realist mechanism that could have engineered the observed violations For each quasar, we list its QSO ID number from the Simbad database, celestial coordinates, angular separations (α) of each pair, azimuth (clockwise from due North) and altitude above horizon at the start of each observating run, and redshift (z). We also list the (dimensionless) conformal time of quasar emission (η k ), the lookback times (in Gyr) to each quasar emission event (t k lb ), as well as the fraction of the physical 4-volume of the past light cone of our experiment, extending back to the big bang, from which any local-realist mechanism that might account for the measured violations of the Bell-CHSH inequality is excluded (F excl ). Finally, we list the validity time τ k valid from Eq. (1), which gives the minimum time that detector settings are valid for side k = {A, B} during each experimental run, taking into account various delays and safety margins.
of the Bell-CHSH inequality must have acted within the past lightcone of either quasar emission event. Only within those spacetime regions could the local-realist mechanism have altered or previewed the bit that we would later receive on Earth, and shared that information (at or below the speed of light) with other elements of our experimental apparatus, such as the source of entangled particles or the detectors on the other side of our experiment [31,38]. For pair 1, any such local-realist mechanism is constrained to have acted no more recently than t lb = 7.78 Gyr ago, while for pair 2 the constraint is t lb = 3.22 Gyr ago.
We may further characterize the space-time region within which any local-realist mechanism could have acted in order to produce the observed violations of the Bell-CHSH inequality. That region consists of the union of the past lightcones from the quasar emission events utilized for a given experimental run, V We calculate the 4-volume contained within the past light cone of a quasar emission event by integrating the invariant volume element dV = √ −g d 4 x over the region bounded by past-directed null geodesics extending from the quasar emission event, where g = det[g µν (x)] is the determinant of the space-time metric. As we saw above, null geodesics take the form dη = dχ in the coordinates of Eq. (5). Taking the spatial origin to lie along the worldline of quasar k, the 4-volume of the past light cone from emission event Q k may be evaluated as where Θ(x) is the Heaviside step function. In a FLRW universe, the most recent (conformal) time at which the past light cones from emission events Q A and Q B overlap is given by [65] Here χ L is the comoving spatial distance between the worldlines of quasars A and B, which, in a spatially flat universe, is given by where α is the angle between the quasars as seen from Earth, and [65] Along any η = constant surface, with 0 ≤ η ≤ η AB , the past light cones from emission events Q A and Q B appear as three-dimensional spheres that partially overlap. In Euclidean space, the (spatial) three-volume of the intersection region of two spheres of radii r 1 and r 2 , with distance between their centers d, is given by [67] V (23) Upon substituting d → χ L , r 1 → η A −η, and r 2 → η B −η, making use of Eq. (20) for η AB , and performing some straightforward algebra, we find the space-time 4-volume of the intersection region of the past light cones from emission events Q A and Q B to be V (4) The union of the past light cones from emission events Q A and Q B therefore has the 4-volume while the 4-volume of the past light cone of our experiment is given by V exp = V (4) (η 0 ). For an experimental run using a pair of quasars with redshifts z A , z B , and relative angle α, the space-time region that is excluded from playing any role in an explanation based on a local-realist mechanism is given by V Q . As a fraction of V (4) exp , this may be written The relative angle (as seen from Earth) between the quasars in pair 1 is α = 83.81 • , and for the quasars of pair 2 is α = 72.84 • . Given the redshifts for each quasar listed in Table II, we find F excl = 0.960 for pair 1, and F excl = 0.635 for pair 2. See Figure 2 and Figure 3. We may compare these values for F excl with the corresponding values for our Vienna pilot test involving Milky Way stars [38]. We again use Eq. (4) to relate redshift to cosmic scale factor. Since the motions of Milky Way stars are dominated by local peculiar velocities independent of Hubble expansion, astronomers do not measure cosmicexpansion redshifts for Milky Way stars. Nonetheless, we may compute effective values of z with which to parameterize the various emission times. For the nearby sources used in Ref. [38], we may Taylor expand  4) and (27), these correspond to effective redshifts z A = 4.19 × 10 −8 and z B = 1.32 × 10 −7 for run 1, and z A = 4.00 × 10 −8 and z B = 2.51 × 10 −7 for run 2. Given α 1 = 119 • for run 1 and α 2 = 112 • for run 2, the times when the past light cones from emission events A and B most recently overlapped were t AB = 2409 years ago (run 1) and t AB = 4039 years ago (run 2), corresponding to z AB = 1.67 × 10 −7 (run 1) and z AB = 2.80 × 10 −7 (run 2). Repeating the calculation as above, we then find F excl = 1.38 × 10 −7 for run 1, and F excl = 1.45 × 10 −7 for run 2. In other words, the Vienna pilot test excluded about one hundred-thousandth of one percent of the relevant spacetime volume, compared to the exclusion of 96.0% (pair 1) and 63.5% (pair 2) achieved in the present experiment.

Effects of Inhomogeneities along the Line of Sight
Our discussion to this point has assumed that the quasar photons were emitted from point-like astronom-ical objects. In reality, quasars are large, messy objects; a given photon may be subject to complicated interactions involving optically thick atmospheres before escaping from the quasar. To address such scenarios, we consider the "effective emission time" to be the latest time η k that any local interactions associated with the quasar could have altered the wavelength of the photon. Any corrections arising from strong electromagnetic fields, plasma effects, or related atmospheric phenomena near the quasar would affect a precise calculation of η k by some small quantity (with cosmic-time values for the corrections ∆t t k lb ). Compared to our simple estimates of η k (z) and t k lb (z) based on the measured redshifts of the quasars, any such corrections would be indiscernible, given t k lb (z) ∼ O(1−10) Gyr for the quasars used in pairs 1 and 2.
Atmospheric or related interactions at the quasar could introduce delays between the arrival at Earth of the causal wavefront from the emission event of a given photon and the receipt of that photon on Earth. In principle, a local-realist mechanism could therefore exploit information about the wavelength of the incoming quasar photon prior to its detection, in order to engineer the observed violations of the Bell-CHSH inequality. However, any such advanced signal about the incoming quasar photon would need to be correlated with the detector-setting photon itself, and therefore the information carried by the advanced signal must also have originated within the past light cone of the quasar emission event Q k , with effective emission time η k (z).
Similar considerations apply to the case in which photons from a given quasar are subject to strong gravitational lensing en route from the quasar to Earth. For example, it is known that light from quasar QSO J0831+5245 (which we used to determine Bob's settings in pairs 1 and 2) is lensed by a large, intervening mass [68,69], producing multiple images of the original quasar as seen from Earth. The multiple images arise from different paths that quasar photons take between the lens and Earth. In the case of this particular quasar, it has been estimated that the distinct paths correspond to a difference in arrival times at Earth of as much as 1 day [69].
Given the delay in arrival times, it is possible (in principle) that a local-realist mechanism could receive information from a short-path photon about the wavelength of a long-path photon before the latter arrives at Earth, and exploit that advanced signal to engineer the observed violations of the Bell-CHSH inequality. Much like the case of atmospheric delays at the quasar itself, however, any information of value to the local-realist mechanism would need to have originated within the past light cone of the emission event, with effective emission time, η k (z). Such scenarios are therefore constrained to the same spacetime regions described above. See Figure 4.
One scenario in which gravitational lensing could affect . Note that the current age of the universe is t0 = 13.80 Gyr. The teardrop shape of the past light cones when plotted in (t, r) coordinates arises from the varying expansion rate of the universe over time, given by the scale factor a(t). The paths traveled by the quasar photons along the surface of the gray past light cone of the experiment are shown in red and blue. In each figure, a local-realist mechanism could have exploited information from the red and/or blue regions (and their overlap) to engineer the observed violations of the Bell-CHSH inequality. The gray regions outside of the red and blue regions are excluded from any such local-realist explanation. Assuming negligible uncertainties for the reported redshifts, these excluded regions amount to F excl = 96.0% (pair 1) and 63.5% (pair 2) of the total space-time 4-volume within the past light cone of our experiment, spanning all of cosmic history since the big bang.
our conclusions would arise if the wavelength of a quasar photon could somehow be measured without altering the photon's wavelength or trajectory. (No such measurement is possible according to quantum mechanics, but a local-realist mechanism, by design, is meant to be distinct from quantum mechanics.) If an "in-flight" measurement were possible, then a local-realist mechanism could potentially measure the wavelength of a quasar photon as it approaches the gravitational lens, and arrange for information about that photon's wavelength to arrive at Earth via a short path from the lens, rather than a long path. In such a scenario, the most recent time by which the local-realist mechanism would need to have acted would be bounded by the time the quasar photons reach the lens, which is more recent than the emission-time from the quasar. (We do not consider a scenario in which the local-realist mechanism could alter the wavelength of the quasar photon without changing its trajectory; any such mechanism would violate the conservation of energy and momentum.)

FIG. 4. A scenario in which light from quasar emission event
QB is lensed by a large, intervening mass between the quasar and Earth. The lens can produce multiple images of the quasar as seen from Earth. In the scenario shown, photons following path 1 arrive at Earth before photons that follow path 2. However, to be valuable to a local-realist mechanism, the information that arrives along path 1 must be correlated with information that originates within the past light cone of the effective emission event QB (red or purple regions).
In the case of the lensed quasar in our study, the intervening lens has an estimated redshift z lens 3 [68]. This corresponds to a lookback time from Earth of t lb (z lens ) 11.6 Gyr ago, compared to the quasar emission time t B lb = 12.21 Gyr ago. The time at which photons arrive at the lens is considerably earlier than the emission times from either of the quasars with which this quasar was paired (t A lb = 7.78 Gyr for pair 1 and t A lb = 3.22 Gyr for pair 2), so even such an in-flight measurement scenario would have no impact on our overall conclusions. Likewise, because the fraction of the relevant space-time 4-volume is dominated by the volume of the past light cone of the more recent emission event, re-calculating F excl using z lens 3 rather than z B = 3.911 yields virtually no change compared to the values computed above: F excl,lens = 0.960 for pair 1, and F excl,lens = 0.635 for pair 2.
Photons from distant quasars can be affected in other ways between emission and detection, beyond gravitational lensing. In particular, the intergalactic medium can affect the quasar spectra observed on Earth. Quasar sources typically have strong emission at the Lymanα line, which, in the laboratory frame, corresponds to λ α emit = 121.6 nm (deep in the ultraviolet). However, en route, photons from high-redshift quasars encounter clouds of neutral hydrogen gas, which preferentially absorb photons at the Lyman-α wavelength -for quasar photons that have been redshifted during their travel to 121.6 nm by the time they encounter the gas cloud. Photons from very distant quasars can encounter multiple gas clouds en route to Earth, resulting in a dense "Lyman-α forest" of absorption lines at wavelengths shorter than λ α obs = (1 + z)λ α emit , where z is the redshift of the original quasar emission [70,71].
In our experiment, Alice's receiving station observed quasars at redshifts z A = 0.964 (pair 1) and z A = 0.268 (pair 2). Hence the Lyman-α forest affected quasar photons that would have been received with wavelengths λ obs < 238.8 nm (pair 1) and λ obs < 154.2 nm (pair 2). But, as discussed further below, the detectors of our cosmic random number generators (CRNGs) were largely insensitive to λ obs < 400 nm. Hence selective absorption by the Lyman-α forest would have had no observable effect for either of the "quasar A" sources in our experiment. This also means that effects from transmission through the intergalactic medium could not have introduced correlations between the detected photons from quasars A and B, because any effects on the quasar-A photons would have fallen outside the sensitivity range of our detectors.
On the other hand, at Bob's receiving station we used a more distant quasar, with z B = 3.911. The photons from this distant quasar certainly could have been affected by selective absorption within the Lyman-α forest, for observed wavelengths λ < λ α obs = (1 + z B )λ α emit = 597.2 nm. Our CRNGs used dichroic filters to distinguish "red" from "blue" astronomical photons on either side of λ filter = 630 nm. Therefore the entire range of photons from quasar B that could have been affected by the Lyman-α forest falls within the "blue" channel. Akin to the gravitational-lensing scenario described above, one may imagine some local-realist "conspiracy" that used the Lyman-α forest to purposefully alter the spectra that would be received at Bob's station (by suppressing "blue" photons), and/or that could have "alerted" other elements of our experimental apparatus with a preview of the patterns in the upcoming sequence of "red" and "blue" detections at Bob's station.
However, given our detectors' sensitivity for λ ≥ 400 nm, the closest gas cloud that could have affected the observable portion of the spectrum from quasar B would be at some redshift z cloud such that 400 nm = (1 + z cloud )λ α emit , or z cloud = 2.29, corresponding to a lookback time at which the quasar photons encountered that cloud of t cloud lb = 10.93 Gyr ago. This falls considerably longer ago than the lookback times to the emission events from quasar A: t A lb = 7.78 Gyr ago for pair 1, t A lb = 3.22 Gyr ago for pair 2. Therefore, any "conspiracy" that might have occurred as recently as t cloud lb = 10.93 Gyr ago is consistent with our overall conclusion: we have constrained any such local-realist mechanisms to have occurred no more recently than t A lb . Likewise, we may calculate the excluded space-time volume fraction under the assumption that some "conspiracy" occurred at a gas cloud at z cloud = 2.29, by substituting z B → z cloud . Repeating the calculation as above, we then find F excl = 0.958 rather than 0.960 for pair 1, whereas for pair 2, we find F excl = 0.635, unchanged from our original calculation.

Quasar Selection
In anticipation of our limited observing opportunities at the observatory, we searched for quasar pairs whose space-time arrangement would exclude the largest fraction of the 4-volume of the past light cone of our experiment, while maintaining a sufficiently high signal-to-noise ratio to yield a statistically significant result, given constraints on telescope time. (Following Refs. [38,40,57], we discuss requirements for signal-to-noise below.) We started with a database of the ∼62 000 quasars from the Simbad database that had a Sloan r -band magnitude brighter than 19. We cross-referenced this to the SDSS DR14 database, taking the most conservative of their redshift estimates (or other redshifts from Simbad and the literature, where SDSS redshifts were not available). We considered only quasars that were the brightest for their distance within each 5 • by 5 • patch of sky. This left ∼4 000 quasars. For each pair of these, and for every minute of allotted telescope time, we simulated the number of quasar and skyglow photons that each telescope's detectors would record when pointed at the relevant in- stantaneous elevation angle > 25 • and observed through the associated airmass. Figure 5 shows the path on the sky that the quasars of pair 1 took during a period of 1.5 hours on one of our observing nights; our experiment with pair 1 was conducted near the end of that window.
To estimate the rate at which entangled photon coincidences would accumulate, we required that the relevant signal-to-noise threshold be exceeded in each of the red/blue detector-setting ports for both quasars, and that detectors be triggered while both quasars were in causal alignment with respect to the experimental stations such that τ k valid (t) > 0 for both sides k = A, B. This ensured that we only included entangled-photon coincidences while closing the locality loophole and ensuring the signal-to-noise was sufficiently high.
For a given experimental visibility, entangled-photon coincidence rate, and quasar-photon rate, we estimated the statistical significance we could achieve during the time window while all these conditions were met. Then we chose the highest-redshift pairs with the largest observable angular separations that our simulations predicted could yield significant results in the time allotted.
For the best of these candidates, we further required each object to have published spectra and verified redshifts (either SDSS DR14 or elsewhere in the literature). We manually vetted these to ensure they were legitimate quasar spectra and accurately estimated redshifts, and not, for example, misidentified stellar spectra of r ∼13-19 magnitude stars (which the SDSS algorithms some-times misclassified as extragalactic objects with redshifts z > 0.1 [52]). These cases were non-negligible as contaminants to the subset of the SDSS DR14 database that our software preferentially selected, so the manual checks were required before choosing final target quasars to observe. For each of the four initially scheduled time slots on the telescopes, we performed this procedure to select ∼10-20 vetted targets, yielding several possible pairs that would be optimal at the beginning of the scheduled observation window. The best pairs balanced the trade-off between the largest excluded cosmological 4-volume and the highest statistical significance that could be expected to accumulate within the relevant time window.
Although we were originally scheduled for observation windows over the course of four consecutive nights, due to bad weather and technical problems, we were only able to conduct experiments during the last of our scheduled sessions (early in the morning of 11 January 2018).

Experimental Details
We used Type-0 spontaneous parametric down conversion in a 30 mm periodically poled KTiOPO 4 (ppKTP) crystal placed in a Sagnac-interferometer loop to produce entangled photon pairs. The crystal was bi-directionally pumped by a grating-stabilized 405 nm laser to produce pairs of horizontally polarized down-converted photons at wavelengths of 773.6 nm and 850 nm with a spectral bandwidth of ≈2.5 nm FWHM. The entangled state was rotated near the maximally entangled Bell state |ψ − = 1 √ 2 (|HV − |V H ) with fiber polarization controller paddles. The relative phase was adjusted by the polarization of the pump beam by optimizing the Bell test visibility. We locally measured heralding efficiencies of 31% and 41%, which differed by about 1% before and after the experiment, confirming that these parameters remained stable over the duration of the measurement. For pair 1 (pair 2) the duty cycle of Alice's and Bob's measurements -i.e. the temporal sum of used valid setting intervals divided by the total measurement time per run -were 0.32 % (1.17 %) and 1.62 % (0.96 %), respectively, resulting in a duty-cycle for valid coincidence detections between Alice and Bob of 5.2 × 10 −5 (1.1 × 10 −4 ).

Dichroic Selection of CRNGs
As emphasized in our previous work [38,40], it is critical to select appropriate dichroic filters for generating astronomical random bits on the basis of their wavelength. The filters should be chosen to minimize the predictability of the random bits generated. This means minimizing cross-talk between the "red" and "blue" detection channels, and choosing the infrared cutoff of our red band to be opaque to skyglow, which at the Roque de los Muchachos observatory increases rapidly with increasing wavelength over the range ∼ 700 − 1000 nm [72] due to transitions in the rovibrational states of OH radicals which are abundant in the upper atmosphere [73]. We employed a system of four dichroic elements which define our detection bands as shown in Figure 7. First, a longpass dichroic beamsplitter (BS) is used to reflect incoming light with λ 635 nm and send it towards our "blue" detector. After the dichroic beamsplitter, we place additional shortpass (SP1: λ < 620 nm) and longpass filters (LP1: λ > 637 nm) in the blue and red output ports to reject photons near the transition wavelength, which may go either way at the dichroic beamsplitter. This way, the fraction of red photons detected in the blue arm and vice versa is negligible (f w < 2 × 10 −5 ). This represents a significant improvement over our previous experiment, in which the wrong-way fractions for each detector were f w ∼ O(10 −2 ) [38]. Finally, after the longpass filter in the transmitted (red) output port, we define the long-wavelength cutoff of our red band with a shortpass filter (SP2) at ≈ 745 nm, a wavelength chosen to optimize the trade-off between rejection of infrared skyglow and transmission of quasar photons.
In order to make the selection of the dichroic elements BS, SP1, LP1, and SP2, we began with a hand-prepared list of available filtersets (BS, SP1, LP1) whose wavelengths were compatible as well as a list of available filters SP2. Then, for every combination of (BS, SP1, LP1) and SP2, we computed the signal-to-noise ratio in the red and blue channels up to an unknown overall constant which varies from quasar to quasar. We chose the final filter combination that yielded a strong signal-to-noise ratio for all three quasars observed. For every observation except that of QSO B0422+004, our first-principles computation of the red-blue imbalance agreed with the measured count rates to within 11%. For the violently-variable BL Lac object QSO B0422+004, our observation was redder than predicted by our model by a factor of ≈ 1.6. This is consistent with recent photometric observations, which report a brightening of QSO B0422+004 in the V band by a factor of 2.5 between December 9, 2014, and December 29, 2015 [74]. Similar dramatic variations in brightness were also observed in the 1.2 − 2.2 micron J, H, and K bands between February 2013 and January 2015 [75][76][77].
The optical efficiency of our CRNGs is shown in Table III. It varies from quasar to quasar due to their diverse spectral shape and different observing altitudes. For all computations, we employ a spectral model, formulated in Ref. [38], which takes as input a quasar spectrum (counts per second per wavelength), its observation altitude, and choices for the dichroic elements BS, SP1, LP1, and LP2. Quasar spectra are corrected for atmospheric extinction with tabulated values for atmospheric reddening at the Roque de los Muchachos [78]. Both quasar and skyglow spectra are weighted by the transmission curves of the  telescopes' aluminum mirror coatings, our lenses' antireflection coatings, and our ID120 detectors' quantum efficiency curve. These constituent curves are plotted in Figure 6, and the weighted spectra are plotted in Figure 7.

Data Analysis
In this section we analyze the data from the two experimental runs, which were each conducted early in the morning of 11 January 2018. We make the assumptions of fair sampling and fair coincidences [12]. Thus, for testing local realism, all data can be postselected to coincidence events between Alice's and Bob's measurement stations. These coincidences were identified using a time window of 2.66 ns.
As in Ref. The quasar spectra are reddened by Rayleigh scattering. Bottom: Our system of four dichroic elements splits incoming photons cleanly into a reflect (blue) channel and a transmit (red) channel. The red channel is shortpassed to reject a large fraction of Meinel rovibrational emissions which grow rapidly at wavelengths longer than ∼ 700 nm. Quasar spectra were taken from: QSO J0831+5245 [79], QSO B0350-073 (SDSS DR14, [52]), and QSO B0422+004 [80]. The skyglow spectrum is from Ref. [72]. is given by and the total number of all recorded coincidences is A point estimate of the joint setting probabilities is given by We may then test whether the probabilities q ij can be factorized, that is, whether they can be written (approximately) as where We may also estimate the conditional probabilities for correlated outcomes in which both Alice and Bob observe the same result: If we define the quantity then the Bell-CHSH inequality for local-realist models [6] takes the form C ≤ 0 (if one neglects the "freedom-ofchoice" loophole [57]). We may also construct the correlation functions in terms of which we may construct the quantity The Bell-CHSH inequality may then be written S ≤ 2.
The quantities C and S are related as S = 2|C + 1|.
For pair 2, we have N = 12, 420 total coincidence counts. Using the measured coincidence counts in Eqs. (38) and (37), we find the values for the Bell-CHSH parameters C and S of Eqs. (34) and (36) as shown in Table IV. There we also show the visibility fraction, defined as where S QM max = 2 √ 2, the Tsirelson bound [56], is the maximum value that the quantity S may attain according to quantum mechanics.

Statistical Independence of Settings Choices
We first consider whether the joint settings frequencies q ij for the data from pairs 1 and 2 may be factorized. For the individual settings probabilities p(a i ) and p(b j ) defined in Eq. (32), we find the values shown in Table VI. The joint frequencies q ij and the inferred joint probabilities p ij are shown in Table VII. Note that for each dataset, we find ij q ij = ij p ij = 1.00, as required. We expect that for each dataset, q ij p ij . If we did not find q ij p ij , then (in principle) there could have been some common cause that established correlations among the various setting choices; the choices ij would not be independent. To test for any such violation of independence among the joint settings ij, we may conduct a Pearson's χ 2 test by calculating the statistic and then computing the p-value corresponding to this value of χ 2 [38]. For pair 1, we find χ 2 = 0.1504, which implies that (under the assumption of independent setting choices) there is a purely statistical chance of p = 0.698 that the observed frequencies q ij (or data that deviate even more from the inferred p ij ) would be obtained. For pair 2, we find χ 2 = 2.405, corresponding to a statistical chance p = 0.121. Given that each of these p values is larger than typical thresholds (such as p < 0.05), we conclude that there is no strong statistical support to refute the hypothesis that the settings ij were independent of each other, for both pairs 1 and 2.

Testing for Violations of No-Signaling
Another statistical test to consider is whether our data are consistent with the assumption of no-signaling. To be consistent with the principle of no-signaling, the local outcome probabilities must not depend on the setting of the distant detector, which is to say [38]: The analogous expressions for the '−' outcomes follow from p(A = −|a i b j ) = 1−p(A = +|a i b j ). Point estimates for these conditional probabilities may be estimated from the measured coincidence counts. We define with analogous expressions for measurement outcomes A = − and B = ±. Point estimates for the various probabilities are shown in Table VIII. We may assess whether these conditional probabilities show any statistical evidence of the violation of nosignaling by performing pooled two-proportion z-tests for each relevant pair, such as p(A = +|a 1 b 1 ) and p(A = +|a 1 b 2 ) [38], which yields the p-values listed in Table IX. As can be seen in Table IX, nearly all of the single-sided outcomes from pairs 1 and 2 are consistent with the hypothesis of no-signaling. The only anomalously low p-value is for p(A = +|a 2 b j ) for pair 2, which yields p = 0.023 < 0.05. However, for 8 independent tests, the probability that one shows a p-value at least as bad as 0.023 is 1−(1−0.023) 8 = 0.170, so the overall suite of tests is consistent with the hypothesis of no-signaling.

Predictability of Settings
In this section we consider imperfections in the experiment that can lead to an excess predictability [57] of the setting choices. Such excess predictability quantifies the fraction of runs in which one could predict a specific setting better than would be inferred from the overall bias of the setting choices, given all possible knowledge about the setting-generation process that could be available at the emission event for the entangled particles and thus at the distant measurement events. Loosely speaking, quantifies the fraction of runs in which the assumptions of locality and freedom-of-choice fail to hold.
In general, we consider a given trial "corrupt" if any one of three possibilities occurred at either Alice's or Bob's detector: it (1) involved a noise photon rather than a cosmic photon (where noise photons could arise from either detector dark counts or skyglow), or (2) involved a cosmic photon that was misdirected by a dichroic filter, or (3) involved a cosmic photon that was previewed by the local-realist model for the purpose of considering a dichroic-mirror error, but was passed over because the cosmic photon already had the desired color.
Given these considerations, we parameterize the excess predictabilities for each detector setting as [38] where the rate of detected photons at Alice's detector is given by with a comparable expression for Bob's detector, r bj .
Here n ai is the measured rate of noise photons (dark counts and skyglow), which may be quantified by pointing the receiving telescope marginally away from its quasar target; f i→i is the fraction of cosmic photons whose color (if correctly identified) would have led to setting choice a i , but which were misdirected by the dichroic mirror toward the wrong port, yielding setting a i ; and s (A) i is the detected rate of cosmic photons which have a color that (when correctly identified) yield setting choice a i . Because the fractions of misdirected cosmic photons were so small (as noted above), with f w < 2 × 10 −5 for both red-to-blue and blue-to-red at Alice's and Bob's stations, we may neglect the effects from nonzero f w . This simplifies our analysis compared to Ref. [38], in which the various wrong-way fractions were as large as f w = O(10 −2 ). For pairs 1 and 2, we find the signal rates and noise rate shown in Table V.
As in Refs. [38,57], we assume that a local-realist model could exploit each "corrupt" trial so as to produce measurement outcomes that exceed the usual Bell-CHSH inequalities of C ≤ 0 or S ≤ 2. In particular, we make the conservative assumption that predictable setting events do not occur simultaneously at both detectors, so that the total fraction of corrupt joint settings is simply the sum of the corrupt settings on each side: If any value calculated as in Eq. (45) exceeds 1, then the corresponding ij is set to 1. We further assume that the local-realist model can maximally exploit each corrupted trial, so that the maximum value of C that  the local-realist model could attain by exploiting excess predictabilities of detector settings would be [57] C ≤ , where ≡ max ij ij = max i ai + max j bj .
Following Refs. [38,40,57], to ensure that a sufficient number of genuine quasar photons are detected compared to skyglow, dark current, and misdirected photons, the inequality of Eq. (46) places a constraint on the visibility fraction, such that we require for all times during the experiment, where the visibility fraction V is defined in Eq. (39). Eq. (48), in turn, sets a constraint on the signal-to-noise ratio on each of the four settings ports. Given the values of V shown in Table IV, the constraint of Eq. (48) becomes < 0.322 (pair 1) and < 0.314 (pair 2). As shown in Table XI, all values for the excess predictabilities ij easily satisfy this constraint, for both pairs of quasars. Meanwhile, each of the corrupt fractions ai and bj has some statistical uncertainty, σ a i and σ b j , solely due to fluctuating skynoise during the measurement run. Since the total number of runs is recorded, the only unknown is in the total number of runs that were conducted with noise photons in our CRNGs over our measurement period. This is estimated by measuring the average noise count rates before and after the measurement period at each telescope and using the higher of the two count rates to estimate the total number of corrupt runs. In each of our noise measurements we find that the estimated unbiased mean-square error in the count rates is consistent with Poisson noise.
We temporarily drop the labels for Alice and Bob, and assume that the rates r i and n i are independent (which follows from our assumption of fair sampling for all detected photons). Then we find If we further assume that Alice's and Bob's predictabilities are independent, then we find with an estimated uncertainty on , as defined in Eq. (47), of Values of ai ± σ a i and bj ± σ b j for pairs 1 and 2 are shown in Table X, and values of the excess predictabilities for joint settings, ij ± σ ij , are shown in Table XI.   Since the exact number of runs is known and recorded, we consider the probabilities ij to be known (to within some uncertainty σ ij ), but the actual number of corrupt trials to be subject to statistical fluctuations. In other words, the occurrence of a corruption in any trial is taken to be an independent random event, which has probability ij that depends on the settings pair ij. We assume that for "uncorrupt" trials, the local-realist model has no information about what the settings pair will be beyond the joint settings probabilities q ij [38,57].
In general, the total rates (r ai , r bj ) and noise rates (n ai , n bj ) can vary during the course of an observing period. For example, when observing a given quasar over a substantial period of time, the receiving telescope will collect its light through varying amounts of airmass, as the quasar rises above or moves toward the horizon, thereby affecting the noise rate. In practice, however, our observing periods for both pairs 1 and 2 were brief enough (≤ 17 min) that the measured values for r ai , r bj and n ai , n bj did not change substantially; the largest variation among all four detector settings yielded a difference ∆ a1 = 8.6% in the excess predictability. Therefore we adopt the conservative approach of assuming constant values of each ai and bj during a given experimental run, and use the largest values for each detector setting. Such an approach will (modestly) underestimate the statistical significance of our results.

Statistical Predictability of Random Bits
Throughout our analysis, we assume that the bits within the sequence gathered from a given quasar are independent of each other. That is, we assume that there is no sequence of bits within the bitstream that contains any information about any future bit. If this independence did not hold, then (in principle) a localrealist mechanism would be able to exploit any excess predictability (beyond the natural bias) to engineer a measured violation of Bell's inequality. Although complete independence among the bits within each bitstream can never be rigorously proven, we can calculate bounds for the available information. As we assume fair-sampling for the cosmic photons, we included all detection events, instead of postselecting for those that satisfied the requirements of τ k valid (t) to yield a valid setting. For the calculation of the mutual informationÎ(m) we adopt the approach developed in Ref. [40]: We calculate the mutual information between every bit and every sequence of the m = 17 preceding bits, correcting the biased estimator for the finite size of our bitstream. The value m has been chosen such that it is strictly larger than log 2 (L) − 7 for each measurement file, where L is the total number of detection events [81,82]. The results are presented in Table XII. The calculated values ofÎ(m) are 2 − 3 orders of magnitude smaller than the values of the excess predictabilities ai and bj that we already incorporate in our data analysis. These small values ofÎ(m) therefore make no quantitative impact on our analysis or conclusions.

Statistical Significance of Bell Violation
As discussed in Ref. [38], there exist several different approaches to estimating the statistical significance for Bell experiments. The result of any such statistical analysis is a p-value, that is, a bound for the probability that the null hypothesis-in our case, local realism with  XII. We characterize the statistical predictability of our random bitstreams generated by each quasar by computing the mutual information of each bit with the m = 17 bits preceeding it. This measure corresponds to the probability of guessing the bit correctly given knowledge of the prior 17 bits. The number of lookback bits m for which our estimator is valid is set by L, the length of the bitstream.
excess predictability of the detector settings ij , biased detector-setting frequencies q ij , fair sampling, fair coincidences, and any other additional assumptions-could have produced the experimentally observed data by a random, statistical variation. Until recently, it was typical in the literature to make several simplifying assumptions when calculating the pvalue for a Bell test, such as that each trial was independent and identically distributed, that the local-realist mechanism could not make any use of "memory" of the settings and outcomes of previous trials, and that excess predictabilities ij could be neglected. Under those assumptions, one typically applied Poisson statistics for single coincidence counts. As emphasized in Ref. [38], however, such an approach assumes that the measured coincidence counts N AB ij are equal to their expected values, only to contradict that assumption by calculating the probability that the N AB ij could have values differing by several standard deviations from their expected values. More recent, sophisticated analyses do not make such assumptions, and represent significant improvements over previous methods [57,[83][84][85][86]. However, even these newer approaches are not optimal for our particular experimental arrangement. For example, they are not optimized for experiments with unequal (biased) settings probabilities, and/or they yield non-tight upper bounds for p that can dramatically underestimate the statistical significance.
For these reasons, in Ref. [38] we presented an ab initio calculation of the p-value tailored more specifically to experiments like ours. We follow the same steps here, and refer readers to the more detailed description of the calculation in Ref. [38]. First we construct the quantity ]. If we estimate the conditional probabilities p(A = B|a i b j ) as in Eq. (33), then W = (3+C)N , with C defined in Eq. (34). As described in Ref. [38], to avoid the ambiguity that would arise by the need to assume a specific prior probability distribution for the various detector settings, we take the actual number N ij of the occurrences of each settings pair a i b j as given. The relevant ensemble with respect to which we calculate probabilities is the ensemble of all possible orders in which the settings choices could have occurred. The p-value we calculate is then the fraction of orderings for which the local-realist mechanism, using its best strategy, could obtain a value of W greater than or equal to the value obtained in the experiment.
In the absence of noise or errors, the local-realist mechanism could specify which outcomes (A, B) will arise for each of the possible settings (a i , b j ). The best plans will win for three of the four possible settings pairs, but will lose for one of the possible settings pairs. Therefore a plan may be fully specified by identifying which settings pair will be the loser for a given trial. In the presence of noise and errors, for each time the settings pair is a i b j , there is a probability ij that the trial is corrupt. If the trial is corrupt, it automatically registers as a win. If it is not corrupt, then it has a probability P win ij of registering as a win, where we take P win ij = p(A = B|a i b j ) for (ij) = (22), and p(A = B|a i b j ) for the other cases. These considerations motivate the form of W in Eq. (52) [38].
The ensemble average of W takes the form [38] W = N (3 +¯ ), where we have defined To calculate the statistical significance, we also must calculate σ 2 W ≡ W 2 − W 2 . As in Ref. [38], we calculate σ opt W , subject to the condition that the local-realist mechanism may choose the fractions f ij so as to optimize its strategy, where the f ij are defined as the fraction of trials for which the local-realist mechanism chooses settings pair (ij) to be the losing pair. In Ref. [38] we found Each value of f ij must be non-negative. (If any value is negative, we use a second Lagrange multiplier and recalculate the f opt ij [38].) For pairs 1 and 2, we find all values f opt ij > 0 when calculated as in Eq. (55), so we may use our expression for σ opt W from Ref. [38], namely We may then calculate the number of standard deviations ν by which the correlations among the measured coincidence counts exceed those that the local-realist mechanism could have engineered by exploiting excess predictabilities:ν The quantities W , W , and σ opt W each depend on ai and bj as well as on the coincidence counts N AB ij , which we take as given. Whereas Eq. (57) takes into account the excess predictabilities, ij , however, it does not incorporate the uncertainties on the excess predictabilities, σ a i and σ b j . For the next step, we therefore propagate uncertainties on ai and bj to compute the uncertainty on ν, which we denote ∆ ν . The uncertainty ∆ ν takes the form [38] The naive estimate of the number of standard deviations, ν in Eq. (57), implicitly assumed σ a i , σ b j = 0, and therefore ∆ ν = 0. If we allow for an uncertainty in ν equal to n times the 1-σ uncertainty inν, then we should calculate the p-value using ν n ≡ν − n∆ ν .
If we choose n so that n = ν n , then we find Assuming a Gaussian distribution for large-sample experiments, we conclude that the conditional probability that the local-realist mechanism could achieve a value of W as large as the observed value W obs , assuming that the true value of ν ≥ ν n , is given by Moreover, if we assume Gaussian statistics for the uncertainty in ν, then there is an equal probability that the true value of ν is less than ν n , in which case we must conservatively assume that W might exceed W obs . Therefore the p-value corresponding to the total probability that W ≥ W obs is bounded by