Systematic errors in high-precision gravity measurements by light-pulse atom interferometry on the ground and in space

We focus on the fact that light-pulse atom interferometers measure the atoms' acceleration with only three data points per drop. As a result, the measured effect of the gravity gradient is systematically larger than the true one, an error linear with the gradient and quadratic in time almost unnoticed so far. We show how this error affects the absolute measurement of the gravitational acceleration $g$ as well as ground and space experiments with gradiometers based on atom interferometry such as those designed for space geodesy, the measurement of the universal constant of gravity and the detection of gravitational waves. When atom interferometers test the universality of free fall and the weak equivalence principle by dropping different isotopes of the same atom one laser interrogates both isotopes and the error reported here cancels out. With atom clouds of different species and two lasers of different frequencies the phase shifts measured by the interferometer differ by a large amount even in absence of violation. Systematic errors, including common mode accelerations coupled to the gravity gradient with the reported error, lead to hard concurrent requirements --on the ground and in space-- on several dimensionless parameters all of which must be smaller than the sought-for violation signal.

Light-pulse atom interferometers (AIs) are based on quantum mechanics. As the atoms fall, the atomic wave packet is split, redirected, and finally recombined via three atom-light interactions at times $0$, $T$, and $2T$. The phase that the atoms acquire during the interferometer sequence is proportional to the gravitational acceleration that they are subjected to.
Although one might think that the phase shift depends on quantum mechanical quantities, "... this is merely an illusion since we can write the scale factor [between the phase shift and the gravitational acceleration] in terms of the parameters we control experimentally, i.e. Raman pulse vector $k$ and pulse timing $T$. It then takes the form $kT^2$ ... We can simply ignore the quantum nature of the atom and model it as a classical point particle that carries an internal clock and can measure the local phase of the light field." (Ref. [1], Sec. 2.1.3). The same reference also demonstrates that, in the case of a gravitational field with a linear gradient, both the exact path integral approach and the purely classical one lead to the same exact closed form for the phase shift and free fall acceleration measured by the AI, which is then expanded in a power series of the local gravity gradient $\gamma$ for convenience [1]. The recoil velocity is the only part of the atom-light interaction which is not found in the classical model; however, it does not appear in the phase shift actually measured by AIs, because they are operated symmetrically so as to cancel it out [2,3]. Thus, the classical approach gives excellent predictions of the phase shift measured by the interferometer, while including the quantum mechanical details related to the internal degrees of freedom is needed to account for smaller effects, such as the finite length of the light pulses.
We focus on the fact that AIs measure the atoms' position along the trajectory only three times per drop (in correspondence with the three light pulses), unlike laser interferometers in falling corner-cube gravimeters which make hundreds to a thousand measurements per drop [4]. Hence, although predicted exactly, the gravitational acceleration measured by AIs is the true one only in a uniform field.
The predicted value of the measured acceleration first appeared in Ref. [5], and it has been confirmed ever since [1,6-9]. However, none of these works mentions that in the presence of a gravity gradient this value is the average free fall acceleration (at time $T$ of the middle pulse) based on three position measurements. This is, of course, only an approximation to the true acceleration, expressed mathematically by the second time derivative of the position, or obtained experimentally with a sufficiently large number of measurements per drop.
Initially, the lack of precise measurements made the difference unimportant, and by the time the precision improved nobody went back to this issue. However, its physical consequences do deserve to be carefully addressed. Using the classical approach [1] we point them out when AIs are used to measure the absolute value of the gravitational acceleration g, for gravity gradiometry and for testing the universality of free fall (UFF), both on the ground and in space.
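Since UFF tests are included, we allow from the start for the possibility that the equivalence of inertial and gravitational mass may be violated for atoms of different species $A$, $B$ in the field of the Earth (violation of the weak equivalence principle, WEP), hence violating UFF [10]. We therefore write the masses as $m^g_{A,B} = m^i_{A,B}(1+\eta_{A,B})$, where the superscripts $i$, $g$ refer to inertial or gravitational mass and the Eötvös parameters $\eta_A$, $\eta_B$, $\eta_\oplus$ may not be exactly zero (although they must be smaller than 1 by many orders of magnitude [11,12]). The equation of motion for atoms $A$ or $B$ reads

$$\ddot z_{A,B} = -\frac{G M^i_\oplus (1+\eta_\oplus)(1+\eta_{A,B})}{(R_\oplus + z_{A,B})^2}, \tag{1}$$

where $R_\oplus$ is the Earth's radius and the $z$ axis points upward. UFF is tested by measuring the differential acceleration $\ddot z_B - \ddot z_A$; then $\eta_\oplus$ cancels out, and there is a violation if, with identical initial conditions and no noise, the ratio

$$\eta = \frac{\ddot z_B - \ddot z_A}{(\ddot z_A + \ddot z_B)/2} \simeq \eta_B - \eta_A \tag{2}$$

differs from zero; thus, what matters is the different composition of the atoms under test, which should be maximized [13-15]. Hence we assume $M^g_\oplus = M^i_\oplus \equiv M_\oplus$. Using a perturbative approach for a gravity field with a linear gradient $\gamma$ (see, e.g., Ref. [16]) the equation of motion reads

$$\ddot z_{A,B} = -g_\circ(1+\eta_{A,B}) + \gamma\left(z^\circ_{A,B} + v^\circ_{A,B}\,t - \tfrac{1}{2} g_\circ t^2\right), \tag{3}$$

where $g_\circ = GM_\oplus/R_\oplus^2$ and $\gamma = 2g_\circ/R_\oplus$ are the local gravitational acceleration and gravity gradient, and $z^\circ_{A,B}$, $v^\circ_{A,B}$ are the initial position and velocity errors of the atoms at release (the exact values are assumed to be zero). The solution is

$$z_{A,B}(t) = z^\circ_{A,B} + v^\circ_{A,B}\,t - \tfrac{1}{2} g_\circ(1+\eta_{A,B})\,t^2 + \gamma\left(\tfrac{1}{2} z^\circ_{A,B}\,t^2 + \tfrac{1}{6} v^\circ_{A,B}\,t^3 - \tfrac{1}{24} g_\circ t^4\right). \tag{4}$$

We compute the phase shift $\delta\phi_{A,B}$ measured by the AI following the step-by-step algorithm outlined in Ref. [1]. Assuming the same $k$ for all three pulses ($\hbar k$ is the momentum transfer, with $\hbar$ the reduced Planck constant) and the same time interval $T$ between subsequent pulses, it is

$$\delta\phi_{A,B} = k\left[z_{A,B}(0) - 2 z_{A,B}(T) + z_{A,B}(2T)\right] \tag{5}$$

and, using (4),

$$\delta\phi_{A,B} = -k g_\circ T^2\left[1 + \eta_{A,B} + \tfrac{7}{12}\gamma T^2 - \frac{\gamma}{g_\circ}\left(z^\circ_{A,B} + v^\circ_{A,B}T\right)\right]. \tag{6}$$

With the scale factor $kT^2$ ($k$ and $T$ measured experimentally), this gives the free fall acceleration $g^{\rm meas}_{A,B}(T)$ that the AI is predicted to measure at time $T$ of the middle pulse. In modulus,

$$g^{\rm meas}_{A,B}(T) = g_\circ\left[1 + \eta_{A,B} + \tfrac{7}{12}\gamma T^2 - \frac{\gamma}{g_\circ}\left(z^\circ_{A,B} + v^\circ_{A,B}T\right)\right]. \tag{7}$$

If $\eta_{A,B} = 0$ (WEP and UFF hold) this is the same as in [1]; it is the expansion to order $\gamma$ of an exact result which can be obtained in closed form by an exact path integral treatment or within a purely classical description.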
In a gravitational field with a linear gradient the free fall acceleration of the atoms at time $T$ is obtained from (3),

$$|\ddot z_{A,B}(T)| = g_\circ\left[1 + \eta_{A,B} + \tfrac{1}{2}\gamma T^2 - \frac{\gamma}{g_\circ}\left(z^\circ_{A,B} + v^\circ_{A,B}T\right)\right]$$

(with $z^\circ_{A,B}$, $v^\circ_{A,B}$ the position and velocity errors at release), while the AI measurement gives, at the same time $T$, the value (7), which is systematically larger (in modulus) than the true one by the amount

$$\Delta a = \tfrac{1}{12}\gamma g_\circ T^2, \tag{8}$$

with a relative error $\Delta a/g_\circ = \tfrac{1}{12}\gamma T^2$. The discrepancy was pointed out in Ref. [17], where it was explained with the simple algebra involved in computing (6) from (5) and (4). In physical terms, it is due to the limitation, intrinsic to the AI instrument, of making only three position measurements per drop. Whether it can be neglected or not will depend on the specific experiment. If not, appropriate systematic checks are required in order to partially model this term, while any remaining unknown fraction of it must be too small to matter.
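The coefficients quoted above can be checked with exact rational arithmetic. The sketch below is only an illustration: it assumes the first-order trajectory $z(t) = z_0 + v_0 t - g t^2/2 + \gamma(z_0 t^2/2 + v_0 t^3/6 - g t^4/24)$ and the three-pulse combination $z(0) - 2z(T) + z(2T)$; the helper names `pulse_sum` and `accel_at_T` are hypothetical.

```python
from fractions import Fraction


def pulse_sum(n):
    """[z(0) - 2*z(T) + z(2T)] / T**n for the monomial z(t) = t**n."""
    z_at_0 = 1 if n == 0 else 0
    return Fraction(z_at_0 - 2 + 2**n)


def accel_at_T(n):
    """[second time derivative of t**n at t = T] / T**(n - 2)."""
    return Fraction(n * (n - 1))


# Monomials of the assumed first-order trajectory (signs omitted, since the
# same sign multiplies the measured and the true value of each term):
# label -> (power of t, coefficient of the monomial)
terms = {
    "g": (2, Fraction(1, 2)),
    "gamma*z0": (2, Fraction(1, 2)),
    "gamma*v0*T": (3, Fraction(1, 6)),
    "gamma*g*T**2": (4, Fraction(1, 24)),
}

# Three-point (AI) estimate versus true second derivative at t = T
measured = {name: c * pulse_sum(n) for name, (n, c) in terms.items()}
true = {name: c * accel_at_T(n) for name, (n, c) in terms.items()}
excess = {name: measured[name] - true[name] for name in terms}

for name in terms:
    print(f"{name:14s} measured {measured[name]}  true {true[name]}  excess {excess[name]}")
```

Under these assumptions only the $\gamma g T^2$ term differs between the three-point estimate and the true acceleration; the release-error terms coincide at time $T$.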
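With the same perturbative approach the calculation can be extended to order $\gamma^2$ by using the solution $z_{A,B}(t)$ to first order in $\gamma$ as given by (4), rather than to zeroth order, $z_{A,B}(t) \simeq z^\circ_{A,B} + v^\circ_{A,B}\,t - \tfrac{1}{2} g_\circ t^2$, as used in (3). The new equation of motion reads

$$\ddot z_{A,B} = -g_\circ(1+\eta_{A,B}) + \gamma\left[z^\circ_{A,B} + v^\circ_{A,B}\,t - \tfrac{1}{2} g_\circ t^2 + \gamma\left(\tfrac{1}{2} z^\circ_{A,B}\,t^2 + \tfrac{1}{6} v^\circ_{A,B}\,t^3 - \tfrac{1}{24} g_\circ t^4\right)\right]. \tag{9}$$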
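Its solution leads to the phase shift, hence to the acceleration measured by the AI to order $\gamma^2$ (with $z^\circ_{A,B}$, $v^\circ_{A,B}$ the release errors):

$$g^{\rm meas}_{A,B}(T) = g_\circ\left[1 + \eta_{A,B} + \tfrac{7}{12}\gamma T^2 + \tfrac{31}{360}\gamma^2 T^4\right] - \gamma\left(z^\circ_{A,B} + v^\circ_{A,B}T\right) - \gamma^2 T^2\left(\tfrac{7}{12} z^\circ_{A,B} + \tfrac{1}{4} v^\circ_{A,B}T\right). \tag{10}$$

The result is the same as Eq. (2.19) in Ref. [1] (where it was obtained by expanding to second order the exact result in closed form) and it is generally accepted. However, it differs from the true acceleration given by (9) (at the same time $T$) with a relative systematic error which, omitting the release errors, is

$$\frac{\Delta a}{g_\circ} = \tfrac{1}{12}\gamma T^2 + \tfrac{2}{45}\gamma^2 T^4, \tag{11}$$

though we limit our analysis to order $\gamma$.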
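The pattern behind the $\gamma$ and $\gamma^2$ coefficients can be sketched to any order. The snippet assumes zero release errors and the closed-form classical solution of $\ddot z = -g + \gamma z$, namely $z(t) = -(g/\gamma)[\cosh(\sqrt{\gamma}\,t) - 1]$, expanded as a power series; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def measured_coeff(m):
    """Coefficient of gamma**(m-1) * T**(2m-2) in g_meas / g.

    Comes from applying z(0) - 2 z(T) + z(2T) to the series term
    -g * gamma**(m-1) * t**(2m) / (2m)! and dividing by -T**2.
    """
    return Fraction(2 ** (2 * m) - 2, factorial(2 * m))


def true_coeff(m):
    """Coefficient of gamma**(m-1) * T**(2m-2) in |z''(T)| / g,
    i.e. the series of cosh(sqrt(gamma) * T)."""
    return Fraction(1, factorial(2 * m - 2))


for m in (1, 2, 3):
    diff = measured_coeff(m) - true_coeff(m)
    print(f"order gamma^{m - 1}: measured {measured_coeff(m)}, "
          f"true {true_coeff(m)}, excess {diff}")
```

Within this model the excess of the three-point value over the true acceleration is $1/12$ at order $\gamma$ and $2/45$ at order $\gamma^2$, and it is positive at every order.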
Let us now consider an AI experiment in space, inside a spacecraft in low Earth orbit such as the International Space Station (ISS). The ISS is Earth pointing, the AI axis is aligned with the radial direction, and the nominal point $O$ of the atoms' release (origin of the radial axis $\zeta$ pointing away from Earth) is at distance $h$ from the center of mass of the spacecraft (e.g., closer to Earth than the center of mass itself). When testing UFF with atom species $A$ and $B$ we can assume $\eta_{\rm spacecraft} = \eta_\oplus = 0$ since they cancel out anyway [18]. The equation of motion reads

$$\ddot\zeta_{A,B} = -\frac{GM_\oplus(1+\eta_{A,B})}{(r - h + \zeta_{A,B})^2} + n^2\,(r - h + \zeta_{A,B}), \tag{12}$$

with $r$ being the orbital radius of the spacecraft (constant for simplicity) and $n$ being its orbital angular velocity obeying Kepler's third law, $n^2 r^3 = GM_\oplus$. To first order in $(h - \zeta_{A,B})/r$ this gives

$$\ddot\zeta_{A,B} \simeq -a_{\rm tide} - g_{\rm orb}\,\eta_{A,B} + \gamma_{\rm orb}\,\zeta_{A,B}, \tag{13}$$

where $a_{\rm tide} = \gamma_{\rm orb} h$ is the tidal acceleration at the nominal release point and $g_{\rm orb} = GM_\oplus/r^2 \simeq 8.7\ {\rm m\,s^{-2}}$, $\gamma_{\rm orb} = 3g_{\rm orb}/r \simeq 3.8\times10^{-6}\ {\rm s^{-2}}$ are the gravitational acceleration and gravity gradient of Earth (the numerical values refer to an orbiting altitude of 400 km). This equation shows that in orbit the largest acceleration is the tidal one, with $a_{\rm tide}/g_{\rm orb} \simeq 3h/r \ll 1$, while the driving acceleration of UFF violation is $g_{\rm orb}$ (slightly weaker than in ground drop tests), meaning that when the free fall accelerations of two atom species are subtracted a composition dependent violation signal would be $g_{\rm orb}\eta$, with $\eta = \eta_B - \eta_A$. The violation signal, if any, is an anomalous acceleration in the same direction (and unknown sign) as the monopole gravitational attraction from the source body, in this case the radial direction to the Earth's center of mass. Hence, the equation of motion (13) contains the tidal acceleration (due to the gravity gradient) in the radial direction and not its transversal component [19,20]. The ratio of the variable acceleration $\gamma_{\rm orb}\zeta_{A,B}$ relative to the constant term $a_{\rm tide}$ is

$$\frac{\gamma_{\rm orb}\,|\zeta_{A,B}|}{a_{\rm tide}} \simeq \tfrac{1}{2}\gamma_{\rm orb}T^2,$$

in analogy to the corresponding ratio on the ground, $\gamma |z_{A,B}|/g_\circ \simeq \tfrac{1}{2}\gamma T^2$.
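The orbital numbers quoted above can be reproduced with a few lines. This is a sketch under stated assumptions: the values of $GM_\oplus$ and of the mean Earth radius are standard reference values not given in the text, and the release offset `H` of 1 m is purely hypothetical.

```python
GM_EARTH = 3.986004418e14   # m^3 s^-2, geocentric gravitational constant (assumed value)
R_EARTH = 6.371e6           # m, mean Earth radius (assumed value)
ALTITUDE = 400e3            # m, orbiting altitude quoted in the text
H = 1.0                     # m, hypothetical distance of the release point from the c.o.m.

r = R_EARTH + ALTITUDE              # orbital radius
n_squared = GM_EARTH / r**3         # Kepler's third law: n^2 r^3 = GM
g_orb = GM_EARTH / r**2             # gravitational acceleration at orbital radius
gamma_orb = 3.0 * g_orb / r         # radial gradient in the orbiting frame (= 3 n^2)
a_tide = gamma_orb * H              # tidal acceleration at the release point

print(f"g_orb     = {g_orb:.3f} m/s^2")
print(f"gamma_orb = {gamma_orb:.3e} s^-2")
print(f"a_tide / g_orb = {a_tide / g_orb:.3e}  (3h/r = {3 * H / r:.3e})")
```

With these inputs the script gives values that round to the $8.7\ {\rm m\,s^{-2}}$ and $3.8\times10^{-6}\ {\rm s^{-2}}$ of the text, and a tidal acceleration several hundred thousand times smaller than $g_{\rm orb}$ for a metre-scale offset.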
Note that $\gamma_{\rm orb}$ is only slightly larger than $\gamma$, while it is expected that $T$ can be several times larger in space than on the ground, because of near-weightlessness conditions. This is considered the key motivation for moving the experiment to space, since it means, for a given free fall acceleration, a larger phase shift and hence higher sensitivity (as $T^2$). However, it also means a larger gradient effect (also as $T^2$). With this warning we proceed with a perturbative approach as on the ground. To order $\gamma_{\rm orb}$, it is

$$\ddot\zeta_{A,B} \simeq -a_{\rm tide} - g_{\rm orb}\,\eta_{A,B} + \gamma_{\rm orb}\left(\zeta^\circ_{A,B} + \Upsilon^\circ_{A,B}\,t - \tfrac{1}{2} a_{\rm tide}\,t^2\right), \tag{14}$$

where $\zeta^\circ_{A,B}$ and $\Upsilon^\circ_{A,B}$ are position and velocity errors at release and the last term is of order $\gamma^2_{\rm orb}$ but cannot be neglected because the free fall acceleration to be measured is of order $\gamma_{\rm orb}$. We are led to the measured acceleration (in modulus):

$$a^{\rm meas}_{A,B}(T) = a_{\rm tide}\left[1 + \frac{g_{\rm orb}}{a_{\rm tide}}\eta_{A,B} + \tfrac{7}{12}\gamma_{\rm orb}T^2 - \frac{\gamma_{\rm orb}}{a_{\rm tide}}\left(\zeta^\circ_{A,B} + \Upsilon^\circ_{A,B}T\right)\right], \tag{15}$$

which, by comparison with its theoretical counterpart (14) at the same time, shows a systematic relative error $\Delta a/a_{\rm tide} = \tfrac{1}{12}\gamma_{\rm orb}T^2$, similar to the ground experiment. When testing UFF, release errors result in position and velocity offsets between the two atom clouds which, because of the gravity gradient, give rise to a systematic differential acceleration error that mimics a violation signal [17]. The effect of release errors is known to be a major issue in all UFF experiments based on "mass dropping", while it does not occur if the test masses oscillate around an equilibrium position, as in torsion balance tests or in the proposed Galileo Galilei (GG) experiment in space [21]. As proposed by Roura [22], the effect of release errors coupled to the local gradient can be eliminated if the momentum transfer of the second laser pulse is modified by a small quantity of order $\gamma$ such that the atoms fall as if they were moving in a uniform field.
On the ground the nominal value $k_2$ to be applied at the second pulse is $k_2 = k + \Delta k_2$ with $\Delta k_2 = \tfrac{1}{2}\gamma T^2 k$ [22]; if this value is not implemented exactly, a term in which the release errors $z^\circ_{A,B} + v^\circ_{A,B}T$ couple to a residual gradient $\gamma_{\rm res}$ remains (a successful reduction $\gamma_{\rm res}/\gamma \simeq 10^{-2}$ has been reported [3]). The acceleration term (8) remains too, in which the gradient is unaffected by whatever reduction has been achieved for the previous one, as pointed out in the Comment [23] and acknowledged by Roura [24]. This is inevitable because $\Delta k_2$ has been computed in order to nullify the effect of the local gradient on the atoms whose motion is governed by (3). Instead, the acceleration measured by the AI and used for tuning the change $\Delta k_2$ is affected by the error (8), which therefore cannot be compensated. Indeed, attempts to compensate it [25] are questionable because compensation would alter the free fall acceleration of the atoms and force it to equal a measured value which (already to first order in $\gamma$) is not fully correct.
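A short numerical sketch illustrates the point. It assumes the first-order trajectory of a uniform-gradient field, a phase of the form $k\,z(0) - 2k_2\,z(T) + k\,z(2T)$, and the nominal shift $\Delta k_2 = \tfrac{1}{2}\gamma T^2 k$; the gradient value and the release errors `Z0`, `V0` are hypothetical numbers chosen for illustration.

```python
G0 = 9.8            # m/s^2, local gravitational acceleration (illustrative)
GAMMA = 3.086e-6    # s^-2, assumed free-air gravity gradient
T = 0.16            # s, time between pulses
Z0, V0 = 1e-3, 1e-3 # m and m/s, hypothetical release errors


def trajectory(t, g, gamma, z0, v0):
    """Position to first order in gamma (z axis pointing upward)."""
    return (z0 + v0 * t - 0.5 * g * t**2
            + gamma * (0.5 * z0 * t**2 + v0 * t**3 / 6 - g * t**4 / 24))


def measured_g(g, gamma, T, z0, v0, dk2_over_k=0.0):
    """Acceleration inferred from the three-pulse phase, -delta_phi / (k T^2).

    dk2_over_k is the fractional change of the wave vector at the second pulse.
    """
    z = lambda t: trajectory(t, g, gamma, z0, v0)
    phase_over_k = z(0.0) - 2.0 * (1.0 + dk2_over_k) * z(T) + z(2.0 * T)
    return -phase_over_k / T**2


plain = measured_g(G0, GAMMA, T, Z0, V0)
compensated = measured_g(G0, GAMMA, T, Z0, V0, dk2_over_k=0.5 * GAMMA * T**2)

print(f"no compensation:   g_meas - g = {plain - G0:+.3e} m/s^2")
print(f"with compensation: g_meas - g = {compensated - G0:+.3e} m/s^2")
print(f"(1/12) gamma g T^2            = {GAMMA * G0 * T**2 / 12:+.3e} m/s^2")
```

In this model the compensated value no longer depends on `Z0` and `V0` to first order in $\gamma$, but it still exceeds $g$ by $\tfrac{1}{12}\gamma g T^2$.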
The very fact that in proposing the gravity gradient compensation scheme Roura did not address the acceleration term (8) indicates that the systematic error made by taking only three measurements per drop has not been recognized.
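A similar approach in space leads to a residual gradient $\gamma_{\rm orb-res} < \gamma_{\rm orb}$ after applying [22], and to the phase shift

$$\delta\phi_{A,B} \simeq -k\,a_{\rm tide}T^2\left[1 + \frac{g_{\rm orb}}{a_{\rm tide}}\eta_{A,B} - \frac{\gamma_{\rm orb-res}}{a_{\rm tide}}\left(\zeta^\circ_{A,B} + \Upsilon^\circ_{A,B}T\right) + \tfrac{1}{12}\gamma_{\rm orb}T^2\right],$$

where the error given by the last term contains $\gamma^2_{\rm orb}$, but amounts to $\tfrac{1}{12}\gamma_{\rm orb}T^2$ relative to $a_{\rm tide} = \gamma_{\rm orb}h$, which is the quantity to be measured, and therefore cannot be ignored, as hinted by Refs. [23,24].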
A previous approach to reducing gravity gradient and initial offset errors in a proposed test of UFF on the ISS was based on the idea of rotating the interferometer axis [26]. For a dedicated mission the idea of rotating the whole spacecraft has been proposed by Rasel's group as the key to reducing tidal effects [27]. In both cases the authors invoke a similarity with the MICROSCOPE space experiment [12].
In MICROSCOPE, the offset vectors between the centers of mass of the macroscopic test bodies, being due to construction and mounting errors, are fixed with the apparatus and therefore follow its rotation at all times, allowing the main tidal effect to be distinguished from a violation signal during the offline data analysis of a sufficiently long run; this is not a mass dropping experiment (Ref. [21], Sec. 7). Instead, mass dropping tests with AIs require a huge number of drops to reduce single shot noise, each one with its own initial conditions and mismatch vector between different atom clouds, and the assumption that all these vectors are fixed with the apparatus cannot be taken for granted. The argument presented in Ref. [26] that the proposed instrument "has random but specified mismatch tolerances" is a weak one. Being systematic, this error must be below the target acceleration of the test in all drops; otherwise, should mismatch reversal not occur even in a small number of drops during the entire run (which is hard to rule out by direct measurement), the resulting average acceleration will be larger than the target, thus questioning the significance of a possible "violation" detection.
Roura's proposal [22] is therefore to be preferred, as long as the acceleration term (8) is recognized and dealt with, if necessary.
On the ground the error (8) affects the absolute measurement of $g$. The best such measurement has achieved $\Delta g/g \simeq 3\times10^{-9}$ [7,28], only about three times worse than that obtained by the absolute gravimeter with a free-falling corner cube and laser interferometry [29]. With $T = 160$ ms, the acceleration $\tfrac{7}{12}\gamma g_\circ T^2$ in (7) exceeds the target error and has required a series of ad hoc measurements (drops from different heights) to be modelled and reduced below the target. Should it be possible to improve the sensitivity of the instrument by increasing $T$, and to reduce the gradient and its effect coupled to initial condition errors as proposed by Ref. [22], the error (8) would still remain and should be taken care of for the absolute measurement of $g$ to be improved.
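To give a feeling for the magnitudes involved, the following sketch evaluates the two gradient terms for $T = 160$ ms. The gradient value is an assumption (the standard free-air value of about $3.086\times10^{-6}\ {\rm s^{-2}}$), not a number taken from the measurement in Refs. [7,28].

```python
GAMMA = 3.086e-6   # s^-2, assumed free-air gravity gradient
T = 0.160          # s, pulse separation quoted in the text
TARGET = 3e-9      # fractional accuracy Delta g / g quoted in the text

full_term = 7.0 / 12.0 * GAMMA * T**2    # gradient term appearing in the measured value
residual = 1.0 / 12.0 * GAMMA * T**2     # part that is not a true acceleration

print(f"(7/12) gamma T^2 = {full_term:.2e}  ({full_term / TARGET:.1f} x target)")
print(f"(1/12) gamma T^2 = {residual:.2e}  ({residual / TARGET:.1f} x target)")
```

With this assumed gradient both terms come out above the $3\times10^{-9}$ level, the second one by roughly a factor of two, and both grow as $T^2$ if the pulse separation is increased.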
In gravity gradiometers, two spatially separated AIs with atoms of the same species interrogated by the same laser (hence $\Delta T = 0$ and $\Delta k = 0$) measure their individual free fall accelerations at their specific locations and compute their difference. The advantage is that the differential (tidal) acceleration is less affected than $g$ by disturbances that are mostly common mode, such as vibration noise. Gradiometers are used for geodesy applications, but also for the measurement of the universal constant of gravity $G$ and the detection of gravitational waves. On the ground, if the release points $A$ and $B$ are separated vertically by $\Delta h$ ($A$ at the reference level and $B$ higher by $\Delta h$), the differential acceleration measured by the gradiometer differs from the true one by a systematic acceleration error proportional to $\gamma^2$, which cannot be neglected relative to the tidal acceleration measured by the gradiometer, the fractional error being $\tfrac{7}{48}\gamma T^2$. In space, with the release point $A$ as in (12) and $B$ at a radial distance $\Delta h$ (farther away from Earth), the gradiometer would measure the differential acceleration with a fractional systematic error $\tfrac{1}{12}\gamma_{\rm orb}T^2$. The error, like the physical quantity to be measured, contains the gradient. Therefore, depending on the target precision and accuracy of the experiment, ad hoc independent measurements are needed in order to model and reduce it below the target.
In tests of UFF with AIs in which different atoms $A$ and $B$ are dropped "simultaneously", the individual phase shifts are measured and their difference $\delta\phi_B - \delta\phi_A$ is computed, to yield zero if no composition dependent effect is detected (i.e., $\eta = \eta_B - \eta_A = \frac{\ddot z_B - \ddot z_A}{(\ddot z_A + \ddot z_B)/2} = 0$, UFF and WEP hold).
Different isotopes of the same atom can be interrogated with the same laser. In this case $T$ is the same and the gradient term with the $\tfrac{1}{12}$ or $\tfrac{7}{12}$ coefficient cancels out. Different atom species need different lasers, and a requirement arises, for a given target $\eta$ of the UFF test, on the time difference $\Delta T$, as pointed out by Refs. [23,24]. However, the main problem with different lasers is that different frequencies ($k_A \neq k_B$) result in widely different phase shifts measured by the interferometer even in the case of perfect synchronization ($\Delta T = 0$), zero gradient ($\gamma = 0$), no noise and no violation. For instance, using $^{87}$Rb and $^{39}$K, the fractional difference of the phase shifts is of the order of $(k_{\rm K} - k_{\rm Rb})/k_{\rm K} \simeq 1.67\times10^{-2}$ ($k_{\rm K} = 4\pi/767\ {\rm nm^{-1}}$, $k_{\rm Rb} = 4\pi/780\ {\rm nm^{-1}}$). When seeking a violation signal many orders of magnitude smaller this is a major problem, which never occurred before in the long history of these experiments that goes back to Galileo. Taking into account only $k_A \neq k_B$, the difference of phase shifts reads

$$\delta\phi_B - \delta\phi_A = -g_\circ T^2\left[(k_B - k_A)(1 + \eta_A) + k_B\,\eta\right]. \tag{21}$$

As mentioned earlier, there is a large term even if WEP holds, which makes this quantity hardly suitable to detect a tiny violation. Moreover, the Eötvös parameters $\eta_A$ or $\eta_B$ appear in addition to $\eta = \eta_B - \eta_A$ [see Eq. (2)], mixed with the Raman vectors. This mixing disappears and violation is correctly expressed by $\eta$ if we use the ratio of phase shifts instead:

$$\frac{\delta\phi_B}{\delta\phi_A} = \frac{k_B}{k_A}\,\frac{1+\eta_B}{1+\eta_A} \simeq \frac{k_B}{k_A}\,(1 + \eta). \tag{22}$$

By defining $k_A = \bar k_A + \Delta k_A$, $k_B = \bar k_B + \Delta k_B$, with $\bar k_A$, $\bar k_B$ being the exact values and $\Delta k_A$, $\Delta k_B$ being the respective experimental errors, we get Eq. (23), in which we have included gravity gradient and initial condition errors, the synchronization and Raman vector errors, and also perturbations resulting in common mode accelerations $a_{cm}$ with inevitable differential residuals $\Delta a_{dm}$. The most relevant and best studied is vibration noise [30]; by using the same mirror it is ideally common mode, but a differential residual remains due to imperfect rejection.
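The size of the wave-vector mismatch can be checked directly from the two wavelengths quoted above. The sketch assumes $k = 4\pi/\lambda$ for both species; the target value `ETA_TARGET` is a hypothetical figure used only to express the mismatch in units of a sought-for violation.

```python
from math import pi

LAMBDA_K = 767e-9    # m, wavelength quoted for 39K
LAMBDA_RB = 780e-9   # m, wavelength quoted for 87Rb
ETA_TARGET = 1e-15   # hypothetical target of a UFF test

k_K = 4.0 * pi / LAMBDA_K
k_Rb = 4.0 * pi / LAMBDA_RB

# Fractional difference of the two phase shifts with no gradient, no noise,
# perfect synchronization and no violation
fractional_difference = (k_K - k_Rb) / k_K

print(f"(k_K - k_Rb) / k_K = {fractional_difference:.4e}")
print(f"ratio to target eta = {fractional_difference / ETA_TARGET:.2e}")
```

The fractional difference equals $1 - 767/780$, so the term that must be known and subtracted is larger than a target at the $10^{-15}$ level by about thirteen orders of magnitude.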
Systematic errors must obey the conditions (24). The requirement on initial condition errors is very severe [17] and must be relaxed by applying, for each species, an appropriate frequency shift at the second laser pulse [22] in order to make the residual gradient $\gamma_{\rm res}$ as small as possible. The gravity gradient is relevant because it also couples to common mode accelerations which would otherwise cancel out (also to higher order). Thus, vibration noise must meet the tight requirements (24) in differential and in common mode. Note that after applying [22], the requirement on $a_{cm}$ is relaxed only by a factor 7 because the residual (8) contains the actual gradient $\gamma$ and not the reduced one $\gamma_{\rm res}$. Concerning synchronization, the requirement (24) comes from the fact that each phase shift grows as $T^2$ times the leading free fall acceleration, hence the relative error $2\Delta T/T$ competes with $\eta$. The issue has been faced by [31] in a WEP test with $^{87}$Rb and $^{39}$K, though only to a few parts in $10^7$. By chirping the lasers at a particular rate a wave acceleration was applied in order to compensate to some extent (for each species) the leading acceleration of the atoms, which gives a phase shift proportional to $T^2$. Raman pulse errors must meet similarly tight requirements, and it is envisaged to make use of frequency comb technology [30].
In Ref. [30] the authors propose a data analysis that would allow a violation term containing $\eta$ to be separated from vibration noise. However, this is not the only violation term in their equations, due to the mixing shown by (21). Violation appears only as $\eta$ in the ratio of the phase shifts, and (23) shows beyond question that vibration noise, both in common and differential modes, cannot be separated from $\eta$; $\Delta a_{dm}/g_\circ$ and $\tfrac{1}{6}\gamma T^2\,a_{cm}/g_\circ$ could be misinterpreted as a violation and therefore must be below the tight bounds (24).
For a violation at level $\eta$ to be detected, it is necessary (i) that all errors are negligible with respect to $\eta$ and (ii) that the absolute value of the ratio of the Raman vectors $k_B/k_A$ is measured to both precision and accuracy better than $\eta$, so as to distinguish a deviation from the measured value at this tiny level. Instead, in UFF tests with macroscopic test masses, the physical quantity of interest is zero if $\eta = 0$ ("null experiments" [21]).
With $\eta$ already established at levels below $10^{-13}$, $10^{-14}$ [11,12], the requirements (24) are challenging, and each one needs a specific challenging technology, all to be implemented together. Every error could be a WEP violation and needs specific systematic checks in order to be distinguished from it; all checks must have the target sensitivity and therefore each requires a total integration time [17,22].
In space the violation enters the ratio of phase shifts through the term $(g_{\rm orb}/a_{\rm tide})\,\eta$, hence the requirements on systematic errors include

$$\frac{\Delta T}{T} < \frac{\eta}{2}\,\frac{g_{\rm orb}}{a_{\rm tide}}, \qquad \frac{\Delta k_A}{k_A} < \eta\,\frac{g_{\rm orb}}{a_{\rm tide}}, \qquad \frac{\Delta k_B}{k_B} < \eta\,\frac{g_{\rm orb}}{a_{\rm tide}},$$

together with conditions on the terms containing $\gamma_{\rm orb-res}$ and $\gamma_{\rm orb}T^2$, normalized to $g_{\rm orb}$, which must also be smaller than $\eta$. By comparison with (23) and (24), the requirements on synchronization and Raman vector errors are relaxed by the large factor $g_{\rm orb}/a_{\rm tide}$, as noticed by [30], because in orbit the driving violation signal is $g_{\rm orb}$ while the leading free fall acceleration is $a_{\rm tide}$ [see Eq. (14)]. However, this factor is gained only for systematic errors linear with $a_{\rm tide}$, not for gradient and initial condition errors and for vibration noise, both in common and differential mode, in which case the requirements are as tight as on the ground.
The intrinsic limitations and severe requirements of UFF tests performed by dropping atoms of different species are the reasons why almost all tests, especially if aiming at high precision [33], drop two isotopes of the same atom, $^{87}$Rb and $^{85}$Rb. However, with only two neutrons difference, chances are low that these experiments may detect composition dependent effects which would lead to new physics [11,32].
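The synchronization requirement on the ground ($2\Delta T/T < \eta$) and its relaxed counterpart in orbit can be compared numerically. In the sketch below the target `ETA`, the pulse separation `T` and the release offset `H` are hypothetical values; the only relation used beyond the text is $g_{\rm orb}/a_{\rm tide} = r/(3h)$, which follows from $a_{\rm tide} = \gamma_{\rm orb}h$ and $\gamma_{\rm orb} = 3g_{\rm orb}/r$.

```python
ETA = 1e-15        # hypothetical target of the UFF test
T = 1.0            # s, hypothetical pulse separation
R_ORBIT = 6.771e6  # m, orbital radius for a 400 km altitude (mean Earth radius assumed)
H = 1.0            # m, hypothetical offset of the release point from the c.o.m.


def max_delta_t_ground(eta, T):
    """Largest timing mismatch allowed by 2 * dT / T < eta."""
    return 0.5 * eta * T


def max_delta_t_space(eta, T, r, h):
    """Same bound relaxed by g_orb / a_tide = r / (3 h)."""
    return 0.5 * eta * T * r / (3.0 * h)


dt_ground = max_delta_t_ground(ETA, T)
dt_space = max_delta_t_space(ETA, T, R_ORBIT, H)

print(f"ground: Delta T < {dt_ground:.2e} s")
print(f"space:  Delta T < {dt_space:.2e} s  (relaxed by {dt_space / dt_ground:.2e})")
```

The relaxation factor scales as $1/h$, so it shrinks if the release point is placed farther from the center of mass; as stated above, it does not apply to gradient, release-error and vibration terms.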
Light-pulse atom interferometers have the advantage that atoms provide both the test mass and the readout. However, they have only three time-position measurements per drop to recover the acceleration, unlike falling corner-cube gravimeters, which can rely on hundreds to a thousand data points per drop. The resulting systematic error grows linearly with the gradient and quadratically with the time interval $T$ between laser pulses. This error must be addressed in attempts to improve the absolute measurement of $g$ and must be proved to be irrelevant, or taken care of, in gravity gradiometers for the measurement of the absolute value of the universal constant of gravity $G$, for space geodesy and for the detection of gravitational waves.

This work has been supported by the European Space Agency (ESA), Contract No. 4000125653, through an ITI type B grant. Thanks are due to Neil Ashby, Giuseppe Catastini, Chris Overstreet, Marco Pisani and Massimo Zucco for useful discussions.