Super-Nyquist ultralight dark matter searches with broadband atom gradiometers

Atom gradiometers have emerged as compelling broadband probes of scalar ultralight dark matter (ULDM) candidates that oscillate with frequencies between approximately $10^{-2}$ Hz and $10^3$ Hz. ULDM signals with frequencies greater than $\sim 1$ Hz exceed the expected Nyquist frequency of atom gradiometers, and so are affected by aliasing and related phenomena, including signal folding and spectral distortion. To facilitate the discovery of super-Nyquist ULDM signals, in this work we investigate the impact of these effects on parameter reconstruction using a robust likelihood-based framework. We demonstrate that accurate reconstruction of ULDM parameters can be achieved as long as the experimental frequency resolution is larger than the ULDM signal linewidth. Notably, as ULDM candidates whose frequencies differ by integer multiples of the sampling frequency are identified at the same aliased frequency, our discovery analysis recovers discrete islands in parameter space. Our study represents the first comprehensive exploration of aliasing in the context of dark matter direct detection and paves the way for enhanced ULDM detection strategies with atom gradiometers.

1 Introduction E ver since the formulation of the dark matter (DM) hypothesis, the search for dark matter in direct detection experiments has been one of the greatest priorities in particle physics [1,2].Until recently, the possibility of charting the strikingly diverse and vast landscape of DM models beyond conventional GeV-scale candidates seemed like a remote possibility.Now, thanks to extraordinary advancements in a wealth of cutting-edge technologies with ever-increasing sensitivity to minute effects, it is expected that large regions of DM model space will be within the reach of the next generation of direct detection experiments, such as atom interferometers.
In addition to being excellent probes of gravitational waves in the mid-frequency gap [3], large-scale atom interferometer experiments, such as AION [4], MAGIS [5], MIGA [6], EL-GAR [7], and ZAIGA [8], would be powerful probes of ultralight dark matter (ULDM).In particular, these experiments would be especially ideal probes of scalar ULDM signatures through their exquisite sensitivity to changes in atomic structures.Indeed, scalar ULDM with dilatonic couplings to Standard Model (SM) operators would give rise to time-varying oscillations in atomic transition frequencies [9,10], which in turn would generate an oscillatory non-vanishing phase difference between pairs of spatially-separated interferometers that are interrogated by the same set of lasers through a gradiometer configuration [11,12].
As first shown in Ref. [11], and later studied in detail within the context of the AION and MAGIS experiments, terrestrial long-baseline single-photon vertical atom gradiometers and space-based experiments that operate in broadband mode would be especially powerful probes of scalar ULDM with masses between ∼ 10 −18 eV and 10 −13 eV, corresponding to signals oscillating at ∼ 10 −2 Hz and ∼ 10 3 Hz, respectively.Importantly, these experiments are expected to outcompete other complementary probes in this frequency range, such as atomic clocks [10], the MICROSCOPE experiment [13], torsion balance experiments [14], the AURIGA experiment [15] and superradiance constraints from the observations of fastspinning stellar-mass black-holes in X-ray binaries [16].
The high-frequency range of broadband interferometer experiments, by which we mean the window containing signal frequencies greater than ∼ 1 Hz, offers particularly interesting search prospects in light of projected exclusion limits from future experiments, and theoretical considerations.For instance, future non-interferometer experiments, such as resonant cavity experiments [17], will only set weak limits in the 1-100 Hz frequency window compared with those arising from proposed vertical long-baseline gradiometers operating single-photon atomic transitions, e.g., AION and MAGIS.Hence, it is expected that only long-baseline interferometers will be able to set constraints on non-conventional scalar dark matter production mechanisms in this mass window, such as the thermal misalignment mechanism [18], and probe characteristic signatures of ULDM in direct detection experiments that are most amplified in this frequency range, such as gravitational focusing [19].Moreover, the high-frequency region of parameter space accessible to longbaseline gradiometers is not fine-tuned, which is often considered as an important criterion for selecting a region of parameter space to target.Assuming a UV-cutoff at 10 TeV, loop-corrections to the mass of a scalar ULDM candidate with linear couplings to electron masses would be smaller than the renormalised mass for couplings satisfying the relation d ϕ ≲ m ϕ /(3.3×10 −10 eV), which would be within the reach of a 1-km gradiometer operating with the parameters proposed in Ref. [20]. 1side from these considerations, the projected reach of long-baseline atom interferometers may start to become limited by mass-density fluctuations induced by seismic activity below ∼ 0.5 Hz; in turn, this means that the peak sensitivity of a 1-km gradiometer oper-ating with the parameters proposed in Ref. [20] would be shifted from 0.1 Hz to ∼ 1 Hz, which lies close to or beyond the expected sampling frequency of these experiments.As is well known in signal analysis, any signal oscillating at a frequency larger than half of the experimental sampling frequency, also known as the Nyquist frequency, is aliased to a lower frequency between zero and the Nyquist frequency, see e.g., Refs.[21,22].Additionally, important spectral distortions due to aliasing would affect the qualitative features of a putative signal, and thus impact the ability to correctly identify a signal as being of a ULDM origin, as briefly discussed first in Ref. [23] within the context of ULDM searches with a network of atomic clocks.As a study in this direction is lacking within the context of ULDM searches with atom interferometers, here we provide a complete and versatile likelihoodbased analysis framework, which makes use of the machinery developed in Refs.[20,24], for discovering these high-frequency signals.With the aid of Monte Carlo (MC) simulations, we also confirm the robustness of our analysis strategy in reconstructing the properties of an injected signal.Importantly, leveraging on these statistical tools, we provide preand post-data collection strategies that broadband atom interferometer experiments should consider to maximise the potential to discover a super-Nyquist ULDM signal.
The rest of the paper is organised as follows.In section 2, we review the scalar ULDM signal in broadband vertical atom gradiometers employing single-photon atomic transitions.In section 3, we present the likelihood-based analysis of a ULDM signal in the frequency domain.In section 4, we provide a detailed discussion of aliasing within the context of ULDM searches: in section 4.1, we provide the reader with an overview of the concepts and phenomena associated with aliasing, such as Nyquist windows and folding; in sections 4.2-4.5 we present several techniques to correctly reconstruct super-Nyquist ULDM signals that are affected by aliasing; and in section 4.6 we apply these techniques to a discovery search based on several MC realisations of the data.Finally, we summarise our results in section 5. Several appendices provide further details that complement and validate our analysis and results.

Scalar ULDM signal in vertical atom gradiometers
In light of its large occupation number, small mean velocity and velocity dispersion that is characteristic of DM in the Milky Way, scalar ULDM can be modeled as a temporally and spatially oscillating, non-relativistic classical field with frequency largely set by the DM mass, m ϕ = 2πf ϕ , and small kinetic corrections [25]. 2 As a result of the wave's dispersion ∆ω ϕ = 2π∆f ϕ ∼ m ϕ v 0 σ v where {v 0 , σ v } ∼ 10 −3 , ULDM is characterised by a coherence time τ c ≡ 2π/(m ϕ v 2 0 ), which sets the timescale on which the field amplitude and phase vary considerably.Assuming the random phase model, the ULDM field can be expressed as a sum of its Fourier components with uncorrelated phases [26].
In the context of direct detection experiments, which aim to measure a time-varying signal that is proportional to the ULDM field itself over an integration time T int , it is advantageous to express the ULDM field in terms of an experiment's frequency resolution ∆f ≡ 1/T int .Without loss of generality, assuming the random phase model and neglecting its spatial variation, a scalar ULDM field can be expressed as [20] where ρ DM = 0.3 GeV/cm 3 is the local DM density, ω a ≃ m ϕ (1 + v 2 a /2) is the angular frequency of the ULDM wave for a given speed v a and θ a ∈ [0, 2π) is a random phase.The sum is over velocity classes that are identified by the index a; the size of each velocity class ∆v is related to the experiment's frequency resolution ∆f = 1/T int ≃ m ϕ v 0 ∆v/2π. 3The variable α a is Rayleigh distributed with ⟨α 2 a ⟩ = 2; its probability density function is given by The DM speed distribution is denoted by f DM (v), which we assume to correspond to the Standard Halo Model (SHM) [27,28], namely where σ v is the velocity dispersion which is set, at the solar position, by the value of the local standard of rest v 0 = √ 2σ v ≈ 238 km/s, and v obs ≈ 252 km/s is the average speed of the Earth relative to the halo rest frame [29]. 4  We end this discussion by noting that the spatial structure of the ULDM wave can be neglected in individual gradiometer experiments by observing that: (i) on average the magnitude of wave vector k ϕ = m ϕ v = 1/λ dB , where λ dB is the DM's de Broglie wavelength, is suppressed by a factor of v ∼ 10 −3 relative to the angular frequency; (ii) the longest length scale in an atom gradiometer experiment is set by the length of the baseline, which, in the mass range of interest, is significantly smaller than λ dB ≃ 10 −14 eV/m ϕ 2 × 10 7 km.Hence, corrections to the phase from the ULDM spatial structure are highly subdominant. 5  3 The former equality follows from the discrete Fourier transform of the data, whereas the latter follows from the signal's kinetic energy. 4Although the DM speed distribution is characterised by a cut-off at the escape velocity vesc ∼ 800 km/s in the Earth's frame and may feature substantial radial anisotropic components, commonly referred to as the Gaia Sausage or Gaia-Enceladus [30], we expect the simple SHM form in Eq. (2.3) with no cut-off and anisotropies to be sufficient for the analysis strategy presented here. 5The spatial structure of the ULDM wave could be relevant when performing ULDM searches with networked atom gradiometers.When the distance between gradiometer experiments (e.g.MAGIS and AION) is comparable to the field's de Broglie wavelength, the correction to the ULDM's phase would be on the order of the time-dependent phase [31].Hence, two networked experiments that are located at antipodal points could be used to explore the spatial structure of the ULDM scalar candidates with masses m ϕ ≳ 10 −11 eV.A study on the physics potential of such searches is beyond the scope of this work.

ULDM-induced differential phase shift
Linear interactions between scalar ULDM and SM photons/electrons give rise to the oscillation of the electron mass m e and fine-structure constant α [32,33], ) where the parameters {d e , d me } denote the coupling strengths relative to the Planck mass, which we explicitly express in terms of Newton's gravitational constant G N .These couplings lead to oscillations in atomic transition frequencies which could be detected in the differential phase shift Φ measured between two atom interferometers that compose an atom gradiometer.Explicitly, as first calculated in Ref. [11] and subsequently generalised to baselines of arbitrary length in Refs.[12,20], the ULDM gradiometer phase shift measured after a time m∆t from the start of the first experiment between two coupled interferometers, which are separated by a distance ∆z, takes the following form: where ) Here, we have assumed the 'broadband' atom gradiometer sequence shown in Fig. 1, which consists of an initial π/2-pulse, (4n − 3) π-pulses from alternating directions, and a final π/2-pulse, where n is the number of large momentum transfer (LMT) kicks.In eq.(2.6), ϕ a,m ⊃ ω a m∆t + θ a is the phase of the DM wave at the end of the interferometer sequence; the amplitude A a depends both on experimental parameters (the number of LMT kicks n, the interrogation time T , the length of the baseline L characterising the interferometric sequence, and the angular frequency ω A of the optical transition 6 ) and phenomenological parameters (the local dark matter density ρ DM , the ULDM mass m ϕ , and the ULDM-SM coupling strength d ϕ ∈ {d e , d me }).In this work, we assume the parameters listed in Table 1.
The dependence of Eq. (2.6) on the random and uncorrelated variables α a and θ a implies that the ULDM signal is stochastic [34].In the next section, we review the details regarding the characterisation of stochastic signals in the frequency domain, and we show how a likelihood-based framework can be used to search for a ULDM signal.This framework was developed in Ref. [20] and builds on previous work in the context of axion-like particle searches [24,31].
. Schematic representation of the atom gradiometer sequence considered in this work for n = 4 LMT kicks.The atom's excited (|e⟩) and ground (|g⟩) states are shown in blue and cyan, respectively.π/2and π-pulses are displayed as wavy lines in fuchsia (dashed) and red (solid), respectively.Atom-light interactions are indicated with black dots.The length of the baseline is L, the distance between the atom interferometers is ∆z, and T is the interrogation time.Here, we show two sequences, which are temporally separated by the sampling interval ∆t.

Frequency-domain ULDM signal analysis
The data that is measured by an interferometer consists of a finite number of phase measurements at discrete points in time, and hence constitutes a time-series of finite length.Here, we assume a constant time interval ∆t between successive measurements.As a DM signal is characterised by a frequency largely set by its mass and a frequency spread dictated by its speed distribution, the appropriate tool for analysing the data is the discrete power spectral density (PSD), which is defined as where Φ DM,k is the discrete Fourier transform of the data, k ∈ {0, 1, ..., N − 1} labels the frequency bin accessible to the experiment, and N = T int /∆t.As derived in Ref. [20], without loss of generality, the expectation value of the signal's PSD takes the form where we define v ω = 2ω/m ϕ − 2 , and express the k th angular frequency and resolution as ω k = 2πk/T int and ∆ω = 2π/T int , respectively.Here, A k is as defined in Eq. (2.7) under the redefinition ω a → ω k .By construction, the observable spectral content of the ULDM-induced signal depends on the duration of the experiment relative to the signal's coherence time, T int /τ c .When T int /τ c ≪ 1, the size of the frequency bins exceed the signal's spectral linewidth ∆f ϕ ≈ 1/τ c .In turn, this implies that the signal is observed in a single frequency bin, i.e., in Eq. (2.6), F DM = 1 and the sum is carried over a single frequency component, so that Eq. (3.2) is Table 1.List of experimental parameters assumed in this paper.These could be implemented in future vertical gradiometers, e.g., AION-km.Here, L is the length of the baseline, T is the interrogation time, 4n − 1 is the total number of large momentum transfer kicks transferred during a single cycle, ∆z is the gradiometer length, δϕ is the shot noise-limited phase resolution per interferometer, T int is the total integration time for the experiment and f s is the sampling frequency.
only non-zero for a single value of k.The ULDM field is then well-described by oscillations at ω ϕ ≈ m ϕ with a fixed but random amplitude and phase, such that the signal's PSD corresponds to a single spike in the frequency bin centred at ∼ f ϕ .Thus, in this case, we would expect the mass resolution to be comparable to the experiment's frequency resolution, namely ∆f = 1/T int .On the other hand, when T int /τ c ≫ 1, the size of the frequency bins would be smaller than the signal's dispersion (∝ 1/τ c ) such that the spectral content of the signal could be resolved, i.e., in this limit, F DM (v a ) = f DM (v a ) ∆v and the sum in Eq. (2.6) is over all resolvable DM speeds [24].In practice, to resolve the DM speed distribution which is imprinted onto the signal, it is necessary to choose a sufficiently long integration time so that the frequency resolution satisfies ∆f ≲ 10 −6 f ϕ .For DM masses in the highfrequency regime considered in this paper (i.e., f ϕ ≳ 1 Hz), this requires T int ≳ 10 6 s.For book-keeping purposes, we summarise the relationship between the key parameters in the frequency and time domain in Table 2.

Likelihood-based analysis
As was shown in Ref. [20], both the ULDM signal and dominant irreducible backgrounds in future single-photon atom gradiometers will be Gaussian distributed with zero mean.Hence, the PSD of the expected signal and background will be exponentially distributed.For a signal-plus-background model M S + B , we define the parameter vector θ = {θ sig , θ nuis }, where θ sig describes the signal parameters that characterize the ULDM signal contribution, and θ nuis describes the background (nuisance parameters).In this case, the appropriate likelihood 7 to determine the correct upper limits on the couplings and to claim a discovery is given by where the product is over frequency indices k ∈ {1, 2, ..., N − 1}, excluding k = N/2, and where is the sum of the expected PSD of the signal and background.Above ∼ 1 Hz, atom shot noise is expected to dominate the background [20], so we will assume that the expected PSD of the noise is frequency-independent.
To reduce the number of frequency bins post-data collection over which to evaluate the likelihood, which drastically shortens the computational time required to evaluate the test statistic (TS) to claim a discovery or set upper limits without changing their value [24], it is possible to: i) break up the time series into N stacks chunks of duration T int /N stacks ; ii) compute the PSD on each time series, which we label as S (l) data, k ; and iii) compute the average of these PSD for each frequency bin k ∈ {1, ..., N/N stacks − 1}, excluding k = N/2N stacks .We refer to this averaged quantity as the "stacked PSD", which we define mathematically as The distribution of the sum of N stacks -independent and identically distributed random variables, each having an exponential distribution with the same mean, is given by the Erlang distribution [35].Upon a change of variable, the probability distribution function (PDF) of the stacked PSD is then given by the re-scaled Erlang distribution where ⟨S k ⟩ is the expectation value of the PSD at the k th frequency bin.It then follows that the likelihood8 for the stacked data can be defined as where In light of the reduction in the effective integration time by a factor of N stacks , the frequency resolution will be decreased by a factor of N stacks , as shown in Table 2. Thus, in order to resolve the spectral features of a putative ULDM signal with f ϕ ∼ 1 Hz which is integrated over a time T int = 10 8 s, an experimentalist should choose N stacks ≲ 10 2 .

Test statistic for discovery
With these likelihoods, we can define the discovery test statistic, namely where θ nuis denotes the vector of nuisance parameters that maximise the likelihood in the background-only hypothesis (i.e., the denominator term), and θ nuis represents the vector of nuisance parameters that maximise the likelihood in the signal-plus-background hypothesis for a given set of signal parameters.The best fit point in ULDM parameter space is then defined by the vector θ sig that maximises Eq. (3.9).Unless otherwise stated, we set θ sig = (m ϕ , d 2 ϕ ).9 Since the likelihoods are Gaussian, we define the confidence region (C.R.) with confidence level (C.L.) α in ULDM parameter space as the set {θ sig } satisfying where TS max = TS( θ sig ) corresponds to the maximum value of the discovery test statistic, and Q α is a quantile of order α of the χ 2 distribution, and as such depends on the confidence level α and the number of fitted parameters [36].When a ULDM signal is non-zero in multiple frequency bins, the Wald approximation and Wilks' theorem [37] are valid.In this case, the C.R. at the 68.3% and 95.4% C.L. are associated with Q 68.3% = 2.3 and Q 95.4% ≈ 5.99, respectively.Obtaining the threshold value of the test statistic to determine the global significance of a signal is involved and requires the application of the look elsewhere effect.The simplest approach would be to evaluate the test statistic for discovery at fixed masses over a range of independent (i.e., non-overlapping) frequency windows, whose width is given by the expected line-width of the ULDM signal.With this prescription, we can estimate the number of independent frequency windows given a minimum and maximum frequency over which to perform the scan, f min and f max respectively, as N windows ≈ 10 6 ln(f max /f min ) for N windows ≫ 1.Hence, the threshold for claiming a discovery can be related to the p-value via where Φ −1 is the inverse cumulative distribution function of the normal distribution.For example, a 5σ local discovery (i.e.N windows = 1) would require p ≈ 2.87 × 10 −7 and √ TS thresh = 5.If we scanned between 2 Hz and 22 Hz (i.e.N windows ≈ 2.4 × 10 6 ), a 5σ global discovery (i.e.N windows = 1) would instead require p ≈ 2.87 × 10 −7 and √ TS thresh ≈ 7.32. 10rom these likelihoods it is also possible to define the test statistic for setting upper limits on d ϕ at specific mass values.Since the focus of this work is on understanding how to discover a super-Nyquist signal, we provide a short discussion on this test statistic in Appendix B.

Discovering a super-Nyquist ULDM signal
In this section, we will explore the approach towards discovering a super-Nyquist ULDM signal in a broadband atom gradiometer.After introducing the concept of aliasing, we show how a signal oscillating close to or beyond the Nyquist frequency of the experiment can be correctly reconstructed by making use of the framework presented in section 3.

An overview of aliasing, Nyquist windows, and folding
The well-motivated high-frequency window to which vertical atom gradiometers would be sensitive lies above O(1 Hz), which is greater than and comparable to the Nyquist frequency f Ny = f s /2 of projected experiments.As is well-known in Fourier analysis [21,40], any signal whose spectral content is greater than f Ny will be mapped to frequencies between 0 and f Ny due to a phenomenon known as aliasing.Specifically, it follows from the Nyquist-Shannon theorem [41] that a signal oscillating with frequency f is detected at two frequencies between 0 and f s : f * 1 = f − κf s and f * 2 = (κ + 1)f s − f , where κ is the largest non-negative integer for which 0 < f * 1 < f s .Hence, the spectrum of a signal oscillating at a frequency greater than f Ny and aliased to f * 1 will be added to the spectrum of a non-aliased signal oscillating with frequency f * 1 .By definition, f * 1 will be measured either in the range [0, f Ny ] or in the range [f Ny , f s ], which we refer to as the first and second Nyquist window, respectively.Hence, if f * 1 is in the first Nyquist window, then f * 2 is in the second Nyquist window, and vice versa.Since f * 1 is related to f by a frequency shift, the original lineshape of the spectrum will be preserved at f * 1 ; this is to be contrasted with the spectrum identified at f * 2 , which is related to f by both a frequency shift and a parity transformation, and so will be a mirror image of the signal's original lineshape.This phenomenon is commonly referred to in the literature as folding [21,40].In this sense, the Nyquist frequency acts as an axis of reflection, so that the spectral content measured in the first Nyquist window is a mirror image of the spectral content in the second Nyquist window.All of these principles follow from the symmetries of the Fourier transform, which we review in Appendix A.
To illustrate aliasing in the context of ULDM searches, in the top row of Fig. 2 we show a MC realisation of the broadband signal induced by a scalar ULDM candidate with mass First Nyquist window Second Nyquist window  1 and N stacks = 100.In the left column, the aliased signals are mapped to the first Nyquist window, whereas in the right column the signals are mapped to the second Nyquist window.Spectral folding occurs for signals whose aliased frequency is f * ϕ,2 , and so can be seen in the second Nyquist window for f ϕ = 9.1 Hz and in the first Nyquist window for f ϕ = 9.9 Hz.For comparison, we also show the aliased ULDM mass at 0.1 (0.9) Hz with a purple dash-dotted (red dotted) line in the left (right) panels.
m ϕ = 2πf ϕ = 2π × 9.1 Hz and coupling strength d ϕ = 3 × 10 −4 that would be measured by a gradiometer which operates with the parameters in Table 1 and N stacks = 100.In this case, f * ϕ,1 = 0.1 Hz and f * ϕ,2 = 0.9 Hz; hence, f * ϕ,1 and f * ϕ,2 are measured in the first and second Nyquist windows, respectively.Since f * ϕ,1 is contained in the first Nyquist window, the spectrum identified at f * ϕ,1 = 0.1 will not be affected by folding.This can be clearly seen in the upper left panel, where the signal rises sharply around the mass of the signal and falls at high frequencies.On the other hand, f * ϕ,2 , which is contained in the second Nyquist window, will be affected by folding.Indeed, as shown in the upper right panel, the signal now rises sharply at f * ϕ,2 = 0.9 Hz and falls at low frequencies.For the sake of completeness, in the second row of Fig. 2 we show a MC realisation of the broadband signal induced by a scalar with mass m ϕ = 2πf ϕ = 2π × 9.9 Hz and coupling strength d ϕ = 10 −4 as measured by the same instrument.In this case, f * ϕ,1 = 0.9 Hz and f * ϕ,2 = 0.1 Hz, so that the aliased spectrum in the first Nyquist window will now be affected by folding, whilst the aliased spectrum in the second Nyquist window will not.

Disentangling an aliased from a non-aliased ULDM signal
To discover ULDM in the super-Nyquist frequency range it is imperative to be able to distinguish between aliased and non-aliased signals.This can only be achieved when a subset of the signal's features is unaffected by aliasing.In the context of scalar ULDM searches with broadband atom gradiometers, this set includes the amplitude of the signal and its spectral line-width.Owing to aliasing, a ULDM signal with frequency f ϕ > f Nyq would be identified at a smaller frequency between zero and f s , but crucially, would inherit its original frequency spread and amplitude.This is because the signal amplitude depends on both the coupling strength and the ULDM mass, while the spectral width depends on the properties of the dark matter's speed distribution and the ULDM mass.
Prima facie, it would then seem that an aliased ULDM signal could always be correctly disentangled from a non-aliased one.This statement, however, is not correct.Indeed, because of the amplitude's degeneracy with coupling strength, aliased and non-aliased signals cannot be disentangled when their spectral content is contained within a single frequency bin, which occurs when the stacked integration time T st int = T int /N stacks exceeds the ULDM's coherence time.Therefore, even if T int > τ c in the frequency range of interest, a post-data collection choice of stacking could weaken the ability to distinguish between aliased and non-aliased signals.To illustrate this point, let's consider an experiment operating with the parameters shown in Table 1, and hunting for signals satisfying T int > τ c .In the left column of Fig. 3  Instead, the correct statement is the following: aliased and non-aliased signals can only be disentangled when their spectral content is spread over multiple frequency bins.Indeed, in this regime, aliased and non-aliased signals that are mapped to the same frequency between zero and f s , and have identical maximum PSD amplitude, can still be distinguished because of the spectral line-width's linear dependence on the ULDM mass (or equivalently, f ϕ ).To illustrate this point, we consider the usual experiment operating with the parameters shown in Table 1, but in the case of limited stacking (N stacks = 10).For this choice of stacking and integration time, the non-aliased signal at f ϕ = 0.4 Hz with d 2 ϕ = 2 × 10 −12 and the aliased signal at f ϕ = 8.4 Hz with d 2 ϕ = 4 × 10 −8 satisfy the condition T st int > τ c .In the right column of Fig. 3 we show the expected PSD of these two signals.In each case, the signals are mapped to f * ϕ,1 = 0.4 Hz but exhibit significantly different spectral broadening.Since the spectral width of a ULDM signal scales linearly with f ϕ , the PSD of the f ϕ = 8.4 Hz signal is the broadest.Hence, despite their identical maximum amplitude, these two signals can be easily distinguished.
If we did not take aliasing into account, an aliased signal satisfying the condition T st int > τ c would be confused for a non-aliased one with different coupling strength and In the top row we show the spectra of a ULDM signal with f ϕ = 0.4 Hz, which is not affected by aliasing in light of the chosen sampling frequency (see Table 1), for different coupling strengths.In the second row, we show the spectra of a ULDM signal with f ϕ = 8.4 Hz, which is subject to aliasing, for different coupling strengths.In the regime of short stacked integration time, the aliased and non-aliased signals that we show are identical; in the opposite regime, the aliased and non-aliased signal are characterised by a marked difference in spectral broadening.
much larger (and unphysical) velocity dispersion, which we remind the reader is given by v 0 / √ 2. We show this in Fig. 4, where we provide a comparison of the reconstructed coupling and v 0 of an injected ULDM signal using the discovery test statistic defined in Eq. (3.9) at fixed mass.In particular, we analyse the Asimov data set 11  of the SHM in Eq. (2.3), thereby implying that the signal at 0.4 Hz cannot be confidently attributed to ULDM.
In summary, in light of the difficulty in disentangling a non-aliased signal from an aliased one, we conclude that: i) N stacks should be chosen post-data collection so that the effective integration time (i.e., the duration of each stacked time series, T st int ) should be larger than the largest coherence time that the experimentalist wishes to probe; and ii) the speed parameters should be constrained by the predictions of the SHM, so that the spectral breadth of the signal depends exclusively on the ULDM mass.

Disentangling a folded from a non-folded ULDM signal
In light of the Nyquist-Shannon theorem, the spectral content of the first and second Nyquist windows is identical under a parity transformation.Specifically, the spectrum measured at where κ is the largest non-negative integer for which 0 < f * ϕ,1 < f s .However, the aliased copy identified at f * ϕ,1 will preserve the signal spectrum's original orientation, while the aliased copy identified at f * ϕ,2 will be its mirror image.
The degree to which the excesses at ∼ f * ϕ,1 and ∼ f * ϕ,2 can be correctly identified with the folded and non-folded alias, respectively, of a super-Nyquist ULDM candidate depends on the resolution of the characteristic lineshape of the ULDM signal, which in turn depends on the stacked integration time.To measure this, we introduce the folding discriminant, which quantifies the degree to which an excess is attributed to a folded or a non-folded alias.The definition of this measure relies on the properties of an injected ULDM signal f ′ ϕ .We choose to have a non-folded alias in the first Nyquist window, and therefore a folded alias in the second window.The folding discriminant is then defined as the difference between the maximised test statistic for discovery assuming that: the first Nyquist window features a non-folded alias of a signal with frequency ∼ f ′ ϕ ; and the second Nyquist window contains a non-folded alias of a different signal with frequency ∼ f ′′ ϕ .Mathematically, this is equivalent to where N f ϕ is defined as the neighbourhood around f ϕ .Since the injected signal contains a non-folded alias in the first Nyquist window only, in Eq. (4.1) the second term will be bounded above by the first term.The difference between these test statistics is then analogous to the definition of quantile Q α of the χ 2 distribution which was used in the definition of confidence regions in ULDM parameter space (cf.Eq. (3.10)).Hence, the larger the difference between these two test statistics (i.e. the larger the value of the folding discriminant), the larger the discriminating power between folded and non-folded signals.
In Fig. 5, we make use of the folding discriminant on two Asimov data sets containing an injected signal with ≈ 5σ local significance (f ϕ = 9.1 Hz and d 2 ϕ = 10 −9 ) and an injected signal with ≈ 5σ global significance (f ϕ = 9.1 Hz and d 2 ϕ = 1.4 × 10 −9 ). 12 In both cases, we fix T int (i.e. the data has been already taken by the experimentalist) but allow for N stacks to be tuned, which implies that we are effectively scanning over different values of T st int .We further choose the experimental parameters stated in Table 1 and two different values of the sampling frequency: 0.3 Hz and 3 Hz.For this choice of sampling frequencies, the alias of the injected signal identified in the first Nyquist window will not be affected by folding.The folded alias of the injected signal will instead be identified at 0.2 Hz and 2.9 Hz for f s = 0.3 Hz and f s = 3 Hz, respectively.For both sampling frequencies, we assume that the aliases in the second Nyquist window are attributable to ULDM fields with f ϕ = 8.9 Hz, for which the aliases in the second Nyquist window would not be folded.
As shown in both panels of Fig. 5, the folding discriminant is approximately zero and constant for T int /τ c ≳ 1, which corresponds to N stacks ≲ 10 3 for T int = 10 8 s, independently of the significance of the injected signal.This follows from the fact that the folded and non-folded signal spectra will each be contained within a single bin, which implies that the two signals will be identical (up to a difference in spectral amplitude); hence, the analysis is unable to discriminate between the folded signal at f ϕ ≈ 8.9 Hz and the non-folded signal at f ϕ ≈ 9.1 Hz.Furthermore, since increasing N stacks (i.e.decreasing T st int ) does not improve the resolution of the signal's spectral content, the ability to distinguish between these two signals is independent of N stacks .We assume the experimental parameters stated in Table 1 with the exception of the sampling frequency, which we set to 0.3 Hz in the left panel and 3 Hz in the right panel.The folding discriminant is saturated by signals satisfying T int /τ c ≳ 5.The similarity between the panels implies that the ability to disentangle folded and non-folded signals is independent of f s .
For T int /τ c ≳ 5, the folding discriminant is maximal and constant.This can be understood as follows: the limit T int > τ c implies that the signal's spectral content is well-resolved.In this case, the folded and non-folded aliased copies of the true signal will be characterised by well-resolved line-shapes with opposite parity; hence, the spectra at f * ϕ,1 and f * ϕ,2 can be correctly identified with the non-folded and folded aliases, respectively.Since we have assumed no DM substructure, increasing T st int further (i.e.choosing N stacks ≲ 10 2 for T int = 10 8 s) does not improve the resolution of the signal's characteristic lineshape; hence the ability to distinguish between these two signals is independent of N stacks .In both panels, however, the folding discriminant is largest for the signal that has the highest significance (i.e. the signal with the largest coupling), which follows from the definition of the folding discriminant and from the scaling of Eq. (3.9) with d 2 ϕ .This can also be understood as follows: for large couplings, the features of the expected signal's lineshape are more pronounced; hence, the folded and non-folded aliases of ULDM signals with large couplings can be more readily distinguished.
Finally, for our choice of sampling frequencies, f * ϕ,1 and f * ϕ,2 are not affected by f s ; hence, the degree to which the likelihood can distinguish between folded and non-folded signals is largely insensitive to the sampling frequency, which explains the substantial similarity between the left and right panels of Fig. 5.
In summary, we conclude that: i) the degree to which folded and non-folded signals can be distinguished depends on the ability to resolve the signal's characteristic lineshape; ii) folded and non-folded signals satisfying T st int /τ c ≳ 5 have well-resolved lineshapes with opposite parity and so can be readily disentangled, whilst those satisfying T st int /τ c ≲ 1 have unresolved lineshapes and so cannot be correctly reconstructed; and iv ) the degree to which folded and non-folded signals can be distinguished is independent of the sampling frequency f s .

Disentangling neighbouring aliased and equally-folded signals
From the Nyquist-Shannon theorem, two ULDM signals whose frequencies differ by integer multiples of the sampling frequency are both identified at the same frequency between zero and f s -i.e. two ULDM signals with f ′ ϕ = κf s + f * ϕ and f ′′ ϕ = κ ′ f s + f * ϕ , for integers κ and κ ′ , would be imaged at f * ϕ .Hence, different from the case of folded and non-folded signals, the spectra of neighbouring aliased signals will have the same parity (i.e. they will exhibit the same degree of folding).Since ∆f ϕ ∝ f ϕ , such signals may be differentiated post-data-taking by leveraging exclusively on the spectral linewidth's dependence on the DM mass.
The degree to which such signals can be disentangled largely depends on the sampling frequency of the experiment.In particular, for high-frequency ULDM signals, the larger the sampling frequency, the larger the frequency difference between super-Nyquist signals which are consistent with the same alias; the larger the frequency splitting between neighbouring signals, the larger the linewidth difference between neighbouring signals, and so the greater the incompatibility between such ULDM candidates.To measure this, we introduce the aliasing discriminant.In analogy to the folding discriminant defined in section 4.3, the aliasing discriminant is computed using the discovery test statistic.Additionally, the definition of this measure also relies on the properties of an injected ULDM signal f ϕ , which we choose to have a non-folded alias in the first Nyquist window.However, different from the folding discriminant, this object is defined as the difference between the test statistic for discovery assuming that: the first Nyquist window contains a non-folded alias of a signal with frequency ∼ f ′ ϕ ; and the first Nyquist window contains a non-folded alias of a signal with frequency f ′′ ϕ = f ′ ϕ + κf s .Mathematically, this is equivalent to Since the injected signal is at f ϕ , in Eq. (4.2) the second term will be bounded above by the first term.Hence, as for the folding discriminant, this object is analogous to the definition of the quantile Q α .Therefore, the larger the difference between these two test statistics (i.e. the larger the value of the aliasing discriminant), the larger the discriminating power between super-Nyquist signals with different masses.By making use of the aliasing discriminant, which we evaluate on two Asimov data sets containing an injected signal with ≈ 5σ local significance (f ϕ = 9.1 Hz and d 2 ϕ = 10 −9 ) and an injected signal with ≈ 5σ global significance (f ϕ = 9.1 Hz and d 2 ϕ = 1.4 × 10 −9 ), in Fig. 6 we illustrate the likelihood's ability to distinguish between neighbouring aliased and equally-folded signals as a function of κ.Similarly to Fig. 5, we focus on two different sampling frequencies: 0.3 Hz, which we display in the left panel, and 3 Hz, which we display in the right panel.Here, however, we set N stacks = 100.The aliasing discriminant is then computed over ULDM masses contained in integer multiples of the first Nyquist window, .We assume the experimental parameters stated in Table 1 with the exception of the sampling frequency, which we set to 0.3 Hz in the left panel and 3 Hz in the right panel.The dashed horizontal lines define the quantiles associated with the 68.3% and 95.4% C.L. The aliasing discriminant increases away from the injected signal; as the right panel shows, this object is largest for signals whose frequency is much larger or smaller than the injected one.
which we define as κ, closest to the one containing 9.1 Hz, which we define as κ 9.1 Hz , respectively.For illustrative purposes, we restrict ourselves to 28 ≤ κ ≤ 34 for f s = 0.3 Hz, and 1 ≤ κ ≤ 7 for f s = 3 Hz.The multiples of the first Nyquist window containing 9.1 Hz for f s = 0.3 Hz and f s = 3 Hz are then κ 9.1 Hz = 30 and κ 9.1 Hz = 3, respectively.
In both panels, the aliasing discriminant is zero and at its minimum for κ = κ 9.1 Hz , which implies that the reconstructed signal is favoured to be in the correct multiple of the Nyquist window.Furthermore, the aliasing discriminant increases with |κ − κ 9.1 Hz |, which implies that the ability to distinguish between neighbouring and equally-folded signals increases with the difference between the reconstructed signal's expected linewidth.Furthermore, as for the folding discriminant, neighbouring aliased signals with larger couplings (and larger significance) will be more readily distinguished, as shown by the higher values of the aliasing discriminant away from κ 9.1 Hz .
The degree to which equally-folded signals in neighbouring multiples of the first Nyquist window are less significant depends on the sampling frequency.For f s = 0.3 Hz, the ratio of f ϕ to 9.1 Hz in neighbouring windows is approximately unity, which implies that the spectral content of these signals is comparable with the injected one; hence, the aliasing discriminant is approximately zero, i.e. neighbouring signals cannot be easily disentangled.For f s = 3 Hz, however, 0.34 ≲ f ϕ /(9.1 Hz) ≲ 2.4, which implies that the spectral content of these signals is much wider or narrower than that of the injected signal; hence, the aliasing discriminant increases rapidly, i.e. neighbouring aliased signals will be more robustly disfavoured.In particular, we note that reconstructed signals that are at least twice or at most half as wide as the injected signal in the frequency domain would not be contained within the 95.4% C.L. of the injected signal.
From this last observation, we can infer a condition on f ϕ and f s to maximise the likelihood's ability to distinguish between neighbouring aliased and equally-folded signals.
As shown in Fig. 6, the aliasing discriminant of signals whose linewidth is at least twice or at most half as wide as that of the injected signal are not contained within the 95.4% C.L. quantile of the global maximum.Hence, signals satisfying (f ϕ + f s )/f ϕ ≳ 2 and (f ϕ − f s )/f ϕ ≲ 1/2 will not be in good agreement with the injected signal.Combining these two conditions, we find f s /f ϕ ≳ 3/5.
In summary, we conclude that: i) the degree to which neighbouring equally-folded aliased signals can be disentangled improves with the sampling frequency; and ii) in order to contain the 95.4% C.L. confidence region of a ≈ 5σ globally significant discovery in a single Nyquist window, it will be necessary to choose a value of the sampling frequency that satisfies the condition f s /f ϕ ≳ 3/5, where f ϕ is the largest ULDM mass to be considered in the scan; equivalently, for a given choice of f s , which is stipulated before the start of the measurement campaign, f ϕ = 3 f ϕ /5 would be the largest ULDM mass that could be unambiguously reconstructed by our analysis.

Discovering distorted signals
Another defining feature of aliased signals is distortion due to folding.This arises when the spectral content of a signal exceeds the frequency range of a given Nyquist window; in light of the symmetries of the Fourier transform (see Appendix A), the power spectrum that leaks into the neighbouring Nyquist window is reflected back to the original Nyquist window, and added to the PSD that was initially aliased to this Nyquist window.In the case of ULDM searches, this phenomenon would occur for signals satisfying is contained within the first Nyquist window, or f * ϕ + f ϕ v 2 esc /2 > f s , when f * ϕ is contained within the second Nyquist window.
All ULDM signals satisfying f ϕ ≳ 10 6 f Ny would be affected by distortions due to folding, independently of f * ϕ .This is because the spectral width of these very high-frequency super-Nyquist signals exceeds the size of a single Nyquist window.Here, we will not consider such signals: this part of parameter space is already competitively probed by complementary probes for typical values of f s , and so would not be a primary target of ULDM searches with broadband atom gradiometers.
Spectral distortions owing to folding would be of interest to broadband interferometers for high-frequency ULDM signals well below f ϕ ∼ 10 6 Hz and sufficiently close to integer multiples of the Nyquist frequency.This is illustrated in Fig. 7 where we show the effect of spectral distortions on three different ULDM signals with d 2 ϕ = 2 × 10 −8 and: f ϕ = (12 − 3.5 × 10 −5 ) Hz, f ϕ = (12 − 2 × 10 −5 ) Hz and f ϕ = (12 − 1 × 10 −5 ) Hz, which we show in the left, central and right panels respectively.For our choice of sampling frequency, f s = 1 Hz, the non-folded alias of the true ULDM signal is imaged in the second Nyquist window, while the folded alias of the true ULDM signal is imaged in the first Nyquist In purple, we show the non-folded spectrum, which is aliased to the second Nyquist window (2 nd NW) and leaks into the first Nyquist window (1 st NW); in green we show the folded spectrum, which is aliased to the 1 st NW and leaks into the 2 nd NW; and in grey we show the distorted aliased spectrum, which consists of the sum of the folded and non-folded spectra.We assume the experimental values stated in Table 1 and N stacks = 50.We see that, the closer an aliased signal is imaged to an integer multiple of the Nyquist, the greater the degree of spectral distortion.
window.The degree of distortion increases as f ϕ tends to 12 Hz, which corresponds to an even multiple of the Nyquist frequency.
By taking into account spectral distortions, the likelihoods defined in section 3.1 can correctly reconstruct such signals, independently of the degree of distortion induced by folding.We show this in Fig. 8, where we plot the posterior distribution of the reconstructed ULDM parameters, namely f ϕ and d ϕ is the same for both signals, the uncertainty on the mass differs appreciably: for the injected signal at f ϕ = (12−2×10 −5 ) Hz, the relative uncertainty on ( f ϕ − 12) Hz is 0.5%, while for the injected signal at f ϕ = (12 − 1 × 10 −5 ) Hz, the relative uncertainty on ( f ϕ − 12) Hz is ≈ 2%.This difference can be explained by noting that the degree of distortion in the signal's line-shape is highly sensitive to the value of the alias of f ϕ with respect to the characteristic line-width ∆f ϕ .For signals satisfying The inferred ULDM parameters are consistent with the injected signal to within 2σ credible region independently of the degree of spectral distortions due to folding.
esc /2 (e.g. the f ϕ = (12 − 2 × 10 −5 ) Hz signal shown in the central panel of Fig. 7), the distortion will affect the high-tail of the imprinted dark matter speed distribution.Since this deviation from the SHM cannot be accounted for by tuning d 2 ϕ , such signals would be more sensitive to changes in f * ϕ , and thus f ϕ .In summary, we conclude that: i) super-Nyquist signals that are affected by spectral distortions due to folding can be correctly reconstructed using the tools presented in section 3; and ii) the relative uncertainty on the reconstructed coupling d 2 ϕ is independent of the degree of distortion, while the relative uncertainty on the reconstructed value of f ϕ is smallest for signals satisfying ∆f ϕ ≪ f s − f * ϕ < f ϕ v 2 esc /2, i.e., distorted signals whose deviation from their corresponding non-distorted spectra predominantly affects the high-speed tail of the imprinted dark matter speed distribution.

Example discovery search
We complete this section with an example of a discovery analysis, which will provide a unified context for the phenomena related to aliasing discussed earlier.
For the purpose of comparison, we generate two MC data sets 13 in the frequency domain (see Appendix C for details) assuming a ULDM signal with f ϕ = 9.1 Hz, d 2 ϕ = 10 −9 , the experimental parameters mentioned in Table 1 and two different sampling frequencies: 0.3 Hz and 3 Hz.To maximise the degree to which folded and non-folded signals can be disentangled, while also minimising the number of bins over which to perform the scan (i.e.maximising N stacks at fixed T int ), we set N stacks = 100 (cf.section 4.3).
To reconstruct the injected signal parameters, we perform a parameter scan in the (m ϕ , d 2 ϕ ) plane of interest using a nested sampling algorithm as implemented in PyMultiNest [42], a Python interface to MultiNest [43][44][45].The range of ULDM masses (or frequencies) that are scanned over also includes multiples of the first and second Nyquist windows.In this example, we restrict the analysis to frequencies between ∼ 2 Hz and ∼ 22 Hz, for which we expect the reconstructed signals to not be contained within the C.R. at the 95.4% C.L. of the best-fit point for a 5.1σ locally (0.4σ globally) significant discovery (i.e.TS max ≈ 26, cf.section 4.4).Since N stacks > 1, the test statistic for discovery is evaluated for the stacked likelihood defined in Eq. (3.7).Additionally, to disentangle aliased from non-aliased signals, the ULDM model used in the scan is defined for fixed ULDM speed parameters, specifically v 0 , v obs and v esc , which we set to those of the SHM (cf.section 4.2).
In Fig. 9, we plot the results of this analysis for both data sets.In the upper row, we display the regions of parameter space that are consistent with the global maximum at the 95.4% C.L. for f s = 0.3 Hz and f s = 3 Hz, which we show on the left and right, respectively.For f s = 0.3 Hz, we see that most Nyquist windows between ∼ 2 Hz and 22 Hz are consistent with the best-fit value, which is identified at ≈ 8.5 Hz and is consistent with the injected signal.For f s = 3 Hz, however, fewer regions of parameter space are consistent with the best-fit, which is correctly reconstructed at ≈ 9.1 Hz.Different from other dark matter direct detection experiments, these regions of parameter space are disconnected, precisely in light of the symmetry between Nyquist windows.For f s = 3 Hz, however, fewer regions of parameter space are consistent with the best-fit value, as a result of aliasing: because the peak in the first Nyquist window is located at ≈ 0.1 Hz, scanned frequencies f scan ϕ satisfying 0.5 ≲ f scan ϕ /f ϕ ≲ 2 and identified at 0.1 Hz are consistent with the injected signal; in agreement with the conclusions of section 4.4, since f scan ϕ = f ϕ + af Ny , for integers a, the smaller f Ny , and so f s , the more disconnected regions of parameter space consistent with the results.
In the lower row of Fig. 9, we illustrate the profile likelihood ratio with respect to f ϕ for the corresponding data sets, which is defined in terms of the discovery test statistic as Here, we shade the range of frequencies that would be aliased to the first Nyquist window in beige, and the range of frequencies that would be aliased to the second Nyquist window in blue.In these panels we see that a peak is visible in each scanned Nyquist frequency range, which implies that a signal is reconstructed in all scanned multiples of the Nyquist frequency.However, not all of these signals will be compatible with the global maximum at 2σ.Indeed, disconnected islands are only visible in the corresponding mass-coupling plane when the profile likelihood ratio exceeds the 95.4% C.R.
To illustrate the impact of small changes in the size of d 2 ϕ on the ability to correctly reconstruct super-Nyquist ULDM signals, in Fig. 10, we plot the results of a discovery In the upper row, we show the islands of parameter space that are consistent with the inferred best-fit at the 95.4% C.L. in red, and the injected signal with a green cross.The yellow shaded region has been excluded by MICROSCOPE, while the fine-tuned region of parameter space assuming a UV-cutoff Λ = 10 TeV is located above the blue line.The lower row shows the one-dimensional profile likelihood ratio for the ULDM frequency.The beige (blue) shaded regions mark multiples of the first (second) Nyquist window.The horizontal magenta lines correspond to the C.R. at the 68.3% and 95.4% C.L. In these panels, we label the best-fit point with a purple star.
analysis on two MC data sets which contain a ULDM signal with f ϕ = 9.1 Hz and d 2 ϕ = 1.4 × 10 −9 , the latter differing from the value of d 2 ϕ used in generating the data sets for Fig. 9 by √ 2. All other parameters are identical to the ones that were used to generate the results illustrated in Fig. 9.In agreement with the results from sections 4.3 and 4.4, for this choice of coupling a reduced number of Nyquist windows will contain signals that are consistent with the best-fit value, which is identified at ≈ 9.1 Hz for both f s = 0.3 Hz and f s = 3 Hz and consists of a 7.6σ locally (5.4σ globally) significant discovery.Additionally, as shown in the lower panels of Fig. 10, for such a choice of coupling, no range of frequencies whose folded alias would lie in the second Nyquist window is contained within the C.R. at  9. The upper row shows (in red) the islands of parameter space that are consistent with the inferred best-fit value at the 95.4% C.L. The lower row shows the one-dimensional profile likelihood ratio for the ULDM frequency, where the beige (blue) shaded regions mark multiples of the first (second) Nyquist window.The horizontal magenta lines correspond to 68.3% (1σ) and 95.4% (2σ) confidence region of the right panels.The injected signal is shown with a green cross while the best-fit point is labelled with a purple star.the 95.4% C.L., independently of the sampling frequency.This confirms that folded and non-folded aliased signals can be disentangled when the data contains a highly significant signal (c.f.section 4.3).
In light of the ULDM signal's dependence on experimentally tuneable parameters, it also follows that the discovery analysis of super-Nyquist signals is highly sensitive to small changes in sequence parameters.For example, the scalar ULDM signal amplitude in gradiometer experiments depends linearly on the gradiometer length ∆z and the number of LMT pulses n.Therefore, if a super-Nyquist signal is detected with a 5σ local significance, it is possible to improve the detection significance to 5σ global significance, without changing the frequency associated with the experiment's peak sensitivity, by increasing ∆z by a factor In both cases, the mass resolution is on the order of the stacked frequency resolution.The contours, which correspond to the confidence regions at the 68.3% and 95.4% C.L., are tightest for the most significant injected signal. of 2 1/4 . 14We therefore recommend that future broadband atom gradiometer experiments consider implementing designs that could be modified post-construction.
Finally, for completeness, in Fig. 11 we enlarge the C.R. for the f s = 0.3 Hz case in the vicinity of 8.5 Hz and 9.1 Hz, for both choices of coupling.For the data set containing a signal with d 2 ϕ = 10 −9 , we observe that, despite being four Nyquist windows away from the injected mass, the reconstructed ULDM mass is still consistent with 9.1 Hz at the 68.3 C.L.This is to be contrasted with the dataset containing a signal with d 2 ϕ = 1.4 × 10 −9 , for which the best fit point lies in the vicinity of 9.1 Hz.Additionally, in both cases the resolution on the mass is on the order of the stacked frequency resolution ∆f st = 10 −6 Hz.In agreement with the notion of the TS as a measure of significance, the contours of the more significant signal (lower panels) are tighter.

Discussion and Summary
Super-Nyquist ULDM signals, which we defined as ULDM signals whose spectral content exceeds half of an experiment's sampling frequency, are a well-motivated target for future broadband atom gradiometer experiments.These signals, however, would be affected by spectral features that deviate substantially from those of sub-Nyquist signals, and so would not be correctly reconstructed using the analysis routines previously discussed in the literature.To address this lacuna, in this work we have provided the first systematic approach to discovering super-Nyquist ULDM signals with broadband atom gradiometers through the use of a comprehensive likelihood formalism and statistical framework.
To this end, we have conducted a detailed exploration of the phenomenon of aliasing, whereby any super-Nyquist signal is identified at a frequency between zero and the Nyquist frequency.Importantly, even after aliasing to a lower frequency, the width of the spectral lineshape, which exhibits a linear dependence on the ULDM mass (or, equivalently, the ULDM frequency), remains unchanged.As explained in section 4.2, this characteristic enables the differentiation of aliased signals from non-aliased ones, provided that the experimental frequency resolution, which is determined by the integration time and the number of stacks (cf.table 2), is sufficiently high.
Furthermore, in addition to the shift to lower frequencies, we discussed two other aspects of aliasing: folding and distortion.Folding refers to the reflection of the signal around the Nyquist frequency, while distortion occurs when the spectral content of a signal exceeds the frequency range of a given Nyquist window and is added onto the original signal spectrum, resulting in a substantially modified lineshape (cf.sections 4.3-4.5).By accounting for all these aspects of aliasing, we have shown that our likelihood formalism can give an accurate reconstruction of the original signal parameters, as long as the frequency resolution is large enough.Indeed, we found that an experimental (stacked) frequency resolution greater than approximately five times the signal linewidth is sufficient.
A notable feature that occurs in the reconstruction of super-Nyquist signals was demonstrated in Figs. 9 and 10.Because ULDM frequencies that differ by integer multiples of the sampling frequency are identified at the same aliased frequency, the discovery analysis recovers discrete islands of parameter space.Each island represents a set of ULDM frequencies consistent with the best fit point.Within each island, ULDM frequencies of order the experimental (stacked) frequency resolution are found to be consistent with the signal (cf.Fig. 11), while the overall number of islands depends on the statistical significance of the ULDM signal, in conjunction with the magnitude of the sampling frequency.
Our systematic exploration of the phenomenon of aliasing has shown that the ability to accurately reconstruct super-Nyquist ULDM signals depends primarily on experimentally tunable parameters that can be set pre-or post-data taking.These include, respectively, the sampling frequency and the sequence parameters, which affect the significance of the ULDM signal for a given ULDM coupling value, and the experimental stacked frequency resolution.These considerations may inform future experimental designs and enhance ULDM detection strategies with upcoming atom gradiometers.
There is scope to go beyond the analysis presented in this work in two primary ways.Firstly, we have neglected sources of coloured noise, such as gravity gradient noise (GGN), which could dominate the background at both frequencies below and above the Nyquist frequency.For example, at frequencies below ∼ 0.5 Hz, GGN is expected to eventually dominate the background of terrestrial long-baseline experiments [20,46].Therefore, highfrequency signals of a ULDM nature that are aliased to low frequencies, especially below 0.5 Hz, would have to contend with a frequency-dependent background of a geological nature.To accurately model this background, anti-aliasing filtering techniques like those proposed in Ref. [47] could be implemented to distinguish between the aliased and nonaliased parts of the background spectrum.By leveraging the dependence of the GGN contribution on the ground's vertical spectrum at the Earth's surface and on local geological properties, the filter could be modelled on local density measurements and seismometer data [48].
Furthermore, coloured noise that dominates the background above the Nyquist frequency would suffer from aliasing and related effects discussed in this work.Coloured noise whose spectrum leaks beyond the Nyquist frequency will be folded to lower frequencies and added to the non-aliased low-frequency spectrum.Therefore, to accurately reconstruct a super-Nyquist ULDM signal in the presence of this background, it would be necessary to subtract the expected non-aliased low-frequency background spectrum from the background model.Alternatively, to avoid introducing systematic errors, it may be feasible to design gradiometer sequences that are simultaneously sensitive to the non-aliased low-frequency spectrum signal and insensitive to the ULDM signal.In this case, a noise-free spectrum could be obtained through spectral subtraction [49].A more detailed investigation in this direction is left for future work.
In this work, we have also assumed that the sampling frequency of the experiment is constant and known with arbitrary precision.This assumption, which presents an idealised scenario, leads to the second primary extension of this work.Interestingly, as proven in Ref. [50], in the case of unevenly sampled, the Nyquist frequency is given by 1/(2Ω), where Ω is the largest factor such that the temporal spacing between any sampled point is given by an integer multiple of Ω.Hence, choosing Ω = 0.1 s and performing T int /Ω measurements at times t i = n i Ω, where n i is a uniformly sampled integer between zero and T int /Ω, the Nyquist frequency would be given by f Ny = 10 Hz.Following this argument, measurements that are performed at different times with finite precision would be characterised by a very large Nyquist frequency [51].For example, assuming a timing precision on the order of 10 −3 s, the Nyquist frequency will be bounded above by 5 × 10 2 Hz, where this bound will be reached in the limit that no larger factors exist.Since the measurement of the atom populations at the end of the experiment is dictated in large part by the timing of lasers, we expect this timing precision to be achievable.While this technique holds promise for completely eliminating aliasing effects from atom gradiometer data, its implementation would require the application of the Lomb-Scargle periodogram [52], which would modify both the statistical features of the signal as well as the analytical form of the signal and background.Therefore, further investigation of this approach is deferred to future studies.
Finally, we emphasise that while our discussion has been specifically tailored to scalar ULDM searches, the analysis and findings of this work can be readily adapted to other ULDM searches utilising atom gradiometers, such as spin-1 DM [53], provided that the assumptions made here remain valid.More generally, the conclusions of this work would also be relevant to other state-of-the-art broadband experiments hunting for ULDM candidates, including broadband sensors searching for axion-like particles.would be infinite.Therefore, when considering shot-noise limited experiments, we will assume that the noise spectrum is band-limited, which implies that aliasing and folding do not affect the background.

B Test statistic for setting upper limits
In this appendix, we study the test statistic for setting upper limits on the ULDM-SM coupling at particular mass values.As before, we set θ sig = (m ϕ , d 2 ϕ ), where we remind the reader that m ϕ is the ULDM mass and d ϕ is the coupling strength of the relevant linear interaction between ULDM and SM operators.Thus, we can define the test statistic for setting upper limits on the ULDM-SM coupling d ϕ as q m ϕ , d where d 2 ϕ is the value of d 2 ϕ that maximises the likelihood at fixed m ϕ .Here, θ nuis denotes the values of the nuisance parameters that maximise the likelihood in the signal-plusbackground hypothesis given the best-fit value of the squared coupling strength d 2 ϕ (i.e., the denominator term); θ nuis represents the values of the nuisance parameters that maximise the likelihood in the signal-plus-background hypothesis given the squared coupling strength d 2 ϕ (i.e., the numerator term).
In the limit of a large data sample, i.e., T int ≫ τ c , the signal is spread over multiple frequency bins.Hence, we can invoke the Wald approximation and Wilks' theorem [37], which imply that the test statistic q at fixed m ϕ is described by a half chi-squared distribution with one degree of freedom [61]. 16For a given m ϕ , the 95% confidence level limit on d 2 ϕ is set when q m ϕ , d 2 ϕ, 95% ≈ −2.706.In this limit, the N σ confidence intervals on d 2 ϕ, 95% can then be computed via q m ϕ , d 2 ϕ, 95% ± N σ = − Φ −1 (0.95) ± N 2 , (B.2) where Φ −1 is the inverse of the cumulative distribution function for the Normal distribution.For our choice of integration time T int = 10 8 s and no stacking, this regime applies for ULDM masses above ∼ 10 −17 eV, i.e., frequencies above 10 −2 Hz.With mild stacking (e.g.N stacks = 10), which reduces the value of the integration time by the number of stacks N stacks , this regime will still apply for signals oscillating at frequencies greater than O(0.1 Hz).

C Monte Carlo simulations
In this section, we discuss in detail our approach towards generating Monte Carlo (MC) simulations of the data.We will first focus on MC simulations of the signal in the time domain, which we convert to a PSD in the frequency domain.After showing that the statistical properties of the MC data agree with Ref. [20], we will argue that the MC simulations can be performed directly in the frequency domain.In light of the noticeable reduction in computational time required to generate the data, we highlight that all of the analysis results presented in this work were performed on data generated via the latter approach.

C.1 Time-domain approach
To simulate the signal in the time domain, we closely follow the methodology developed in Ref. [24], which consists of constructing the ULDM signal from the distributions describing individual non-relativistic classical scalar fields.We build the total signal by summing over N ϕ ≫ 1 single field contributions. 17In detail, we define the contribution to the signal from a single field as where i ∈ 1, . . ., N ϕ identifies a specific ULDM particle, and θ ∈ [0, 2π) is a random phase.The angular frequency ω i = m ϕ (1 + v 2 i /2) of each ULDM field is set by the DM mass m ϕ and DM speed v i ∼ 10 −3 .The amplitude of the signal is defined as where all of the variables are defined in section 2.1.We remind the reader that we assume a constant sampling frequency, so that the field will be evaluated at ∆t time intervals for t ≤ T int .
To check the validity of our likelihood model, we computed the average of the PSDs from multiple realisations of the total ULDM signal and found it to be in excellent agreement with the expected PSD in Eq. (3.2) for the case of non-aliased, aliased and distorted signals.
As an example, consider an aliased ULDM signal in Fig. 12 between 0 and f Ny = 0.5 Hz at f = 0.2 Hz when m ϕ = 2π×9.2Hz and d 2 ϕ = 1; the experimental parameters are the same as in Table 1, except for T int ≈ 100 τ c ∼ 10 3 s, atom shot noise variance of 10 −6 and unphysical speed distribution parameters, namely v 0 = 2.38 × 10 4 km/s and v obs = 2.52 × 10 4 km/s. 18 The dashed line in Fig. 12 shows the sum of the expected PSD for the ULDM signal in Eq. (3.2) and atom shot-noise, whereas the filled bars correspond to the result of PSD averaging over 500 MC simulations of the ULDM signal plus atom shot-noise data in the time-domain.As expected, there is an excellent agreement between the two results, thereby providing a validation of our approach. 17Owing to the computational cost associated with building fields with large occupation numbers, we set N ϕ ≳ 10 3 .
18 Similar to Fig. 1 in Ref. [24], we show the validation plot for unphysical speed parameters to reduce the overall computation time required for generating a ULDM signal in the time-domain via the MC approach and averaging over the PSDs from multiple simulations (500 in this case).A larger ULDM speed parameters relative to the SHM reduces (increases) the signal's coherence time (frequency spread), thereby alleviating some of the computation burden in the limit of Tint/τc ≫ 1. ϕ = 1 and an atom shot noise variance of 10 −6 , except for unphysical speed distribution parameters, namely v 0 = 2.38 × 10 4 km/s and v obs = 2.52 × 10 4 km/s; the experimental parameters are the same as in Table 1.

C.2 Frequency-domain approach
In the previous section, we showed that by generating MC simulations of a time-dependent ULDM signal in time-domain, and taking the PSD average of multiple simulations, the resulting PSD matches well with Eq. (3.2).However, for realistic ULDM signals that have a coherence time of O(years), simulations in the time-domain are not ideal due to a large computational cost associated with sampling the ULDM field over multiple coherence times, especially when the sampling rate (or time-separation) is high (small); the time-series data in such case would be too enormous to store and analyse.In this case, MC simulations in the frequency domain directly offer a viable solution.
As shown in Ref. [24], the expected PSD value for a time-dependent ULDM signal is exponentially distributed with a mean given by Eq. (3.2).Thus, we can generate an expected PSD sample for a ULDM signal by simply taking random samples from this distribution at each frequency bin, which drastically reduces the time required to generate a MC sample.

Figure 2 .
Figure 2. Comparison between the expected (black line) and MC-generated (filled bars) PSDs of an injected super-Nyquist ULDM signal with f ϕ = 9.1 Hz and d ϕ = 3 × 10 −4 (top row), and f ϕ = 9.9 Hz and d ϕ = 10 −4 (bottom row) for the experimental parameters shown in Table1and N stacks = 100.In the left column, the aliased signals are mapped to the first Nyquist window, whereas in the right column the signals are mapped to the second Nyquist window.Spectral folding occurs for signals whose aliased frequency is f * ϕ,2 , and so can be seen in the second Nyquist window for f ϕ = 9.1 Hz and in the first Nyquist window for f ϕ = 9.9 Hz.For comparison, we also show the aliased ULDM mass at 0.1 (0.9) Hz with a purple dash-dotted (red dotted) line in the left (right) panels.
we show the expected power spectrum density of a non-aliased signal at f ϕ = 0.4 Hz with d 2 ϕ = 4 × 10 −10 , and the expected power spectrum density of an aliased signal at f ϕ = 8.4 Hz with d 2 ϕ = 4.16 × 10 −7 , assuming aggressive stacking (N stacks = 10 4 ).In light of the sampling frequency, both signals are identified at 0.1 Hz; since the stacked integration time T st int = 10 4 s satisfies the condition T st int < τ c for both ULDM masses, both signals are contained in a single frequency bin.Therefore, despite the vastly different phenomenological parameters, both signals appear identical and hence cannot be distinguished.

1 ]Figure 3 .
Figure 3. Expected power spectrum densities of aliased and non-aliased ULDM signals which satisfy either T st int < τ c (left column) or T st int > τ c (right column).In the top row we show the spectra of a ULDM signal with f ϕ = 0.4 Hz, which is not affected by aliasing in light of the chosen sampling frequency (see Table1), for different coupling strengths.In the second row, we show the spectra of a ULDM signal with f ϕ = 8.4 Hz, which is subject to aliasing, for different coupling strengths.In the regime of short stacked integration time, the aliased and non-aliased signals that we show are identical; in the opposite regime, the aliased and non-aliased signal are characterised by a marked difference in spectral broadening.

− 14 Figure 4 .
Figure 4. Comparison of the posterior distributions of d 2 ϕ and v 0 for an injected signal with f ϕ = 8.4 Hz and d 2 ϕ = 4 × 10 −8 at two fixed masses: f fixed ϕ = 0.4 Hz, which assumes no aliasing (left panel), and f fixed ϕ = 8.4 Hz, which assumes aliasing (right panel).The 1σ and 2σ credible regions are shown with solid lines.The injected (v 0 , d 2 ϕ ) value is shown in the 2D plane of the right panel by a blue square; it is not visible in the left panel due to different axis scales.Here, we have taken the data set to be equal to the mean predictions of the model under consideration and neglecting statistical fluctuations, i.e. the Asimov approach.

Figure 5 .
Figure 5. Significance comparison between neighbouring folded and non-folded signals as a function of the ratio of the stacked integration time T st int to the ULDM coherence time τ c for f ϕ = 9.1 Hz.The comparison is quantified via the folding discriminant, as defined in the body of the paper, which we evaluate on two Asimov datasets containing a ULDM signal with: f ϕ = 9.1 Hz and d 2 ϕ = 10 −9 (solid blue), and f ϕ = 9.1 Hz and d 2 ϕ = 1.4 × 10 −9 (dashed green).We assume the experimental parameters stated in Table1with the exception of the sampling frequency, which we set to 0.3 Hz in the left panel and 3 Hz in the right panel.The folding discriminant is saturated by signals satisfying T int /τ c ≳ 5.The similarity between the panels implies that the ability to disentangle folded and non-folded signals is independent of f s .

Figure 6 .
Figure 6.Significance comparison between neighbouring equally-folded signals as a function of integer multiples of the first Nyquist window (lower axis), or equivalently as a function of the frequency (upper axis).The comparison is quantified via the aliasing discriminant, as defined in the body of the paper, which we evaluate on two Asimov datasets containing a ULDM signal with: f ϕ = 9.1 Hz and d 2 ϕ = 10 −9 (solid blue) and f ϕ = 9.1 Hz and d 2 ϕ = 1.4 × 10 −9 (dashed green).We assume the experimental parameters stated in Table1with the exception of the sampling frequency, which we set to 0.3 Hz in the left panel and 3 Hz in the right panel.The dashed horizontal lines define the quantiles associated with the 68.3% and 95.4% C.L. The aliasing discriminant increases away from the injected signal; as the right panel shows, this object is largest for signals whose frequency is much larger or smaller than the injected one.

Figure 7 .
Figure 7. Spectral distortion on ULDM signals close to an integer multiple of the Nyquist frequency, which we set to 0.5 Hz.We show this effect for three different signals with d 2 ϕ = 2 × 10 −8 : f ϕ = (12 − 3.5 × 10 −5 ) Hz (left panel), f ϕ = (12 − 2 × 10 −5 ) Hz (central panel) and f ϕ = (12 − 1 × 10 −5 ) Hz (right panel).In purple, we show the non-folded spectrum, which is aliased to the second Nyquist window (2 nd NW) and leaks into the first Nyquist window (1 st NW); in green we show the folded spectrum, which is aliased to the 1 st NW and leaks into the 2 nd NW; and in grey we show the distorted aliased spectrum, which consists of the sum of the folded and non-folded spectra.We assume the experimental values stated in Table1and N stacks = 50.We see that, the closer an aliased signal is imaged to an integer multiple of the Nyquist, the greater the degree of spectral distortion.

d 2 φ−Figure 8 .
Figure 8.Comparison between the posterior distributions of f ϕ and d 2 ϕ for Asimov data sets containing an injected signal with d 2 ϕ = 2 × 10 −8 and f ϕ = (12 − 2 × 10 −5 ) Hz (left panel) and f ϕ = (12 − 1 × 10 −5 ) Hz (right panel), which correspond to the grey curves plotted in the central and right panels of Fig. 7, respectively.The injected signal parameters are shown by blue solid lines in 1D and a blue square in 2D planes.The 1σ and 2σ credible regions are shown with solid lines.The inferred ULDM parameters are consistent with the injected signal to within 2σ credible region independently of the degree of spectral distortions due to folding.

Figure 9 .
Figure 9. Discovery analysis on a MC-generated data set containing an injected signal with f ϕ = 9.1 Hz and d 2 ϕ = 10 −9 for f s = 0.3 Hz (left column) and f s = 3 Hz (right column).In the upper row, we show the islands of parameter space that are consistent with the inferred best-fit at the 95.4% C.L. in red, and the injected signal with a green cross.The yellow shaded region has been excluded by MICROSCOPE, while the fine-tuned region of parameter space assuming a UV-cutoff Λ = 10 TeV is located above the blue line.The lower row shows the one-dimensional profile likelihood ratio for the ULDM frequency.The beige (blue) shaded regions mark multiples of the first (second) Nyquist window.The horizontal magenta lines correspond to the C.R. at the 68.3% and 95.4% C.L. In these panels, we label the best-fit point with a purple star.

Figure 10 .
Figure 10.Discovery analysis on a MC-generated data set containing an injected signal with d 2 ϕ = 1.4 × 10 −9 and other parameters as in Fig.9.The upper row shows (in red) the islands of parameter space that are consistent with the inferred best-fit value at the 95.4% C.L. The lower row shows the one-dimensional profile likelihood ratio for the ULDM frequency, where the beige (blue) shaded regions mark multiples of the first (second) Nyquist window.The horizontal magenta lines correspond to 68.3% (1σ) and 95.4% (2σ) confidence region of the right panels.The injected signal is shown with a green cross while the best-fit point is labelled with a purple star.

Figure 11 .
Figure 11.Enlargement of the parameter space shown in Figs. 9 and 10 close to 8.5 Hz (left column) and 9.1 Hz (right column) for f s = 0.3 Hz when the injected signal is at d 2 ϕ = 10 −9 (top row) and d 2 ϕ = 1.4 × 10 −9 (bottom row).In both cases, the mass resolution is on the order of the stacked frequency resolution.The contours, which correspond to the confidence regions at the 68.3% and 95.4% C.L., are tightest for the most significant injected signal.

Figure 12 .
Figure 12.Comparison between the expected PSD of a time-dependent ULDM signal with atom shot noise (dashed line) vs PSD averaging over 500 MC simulations performed in the time-domain (filled bars).The signal is generated for m ϕ = 2π × 9.2 Hz, d 2 ϕ = 1 and an atom shot noise variance of 10 −6 , except for unphysical speed distribution parameters, namely v 0 = 2.38 × 10 4 km/s and v obs = 2.52 × 10 4 km/s; the experimental parameters are the same as in Table1.

Table 2 .
Dictionary of conjugate variables in the time (left column) and frequency (right column) domains pertinent to the analysis of a time-dependent ULDM signal.