First indirect detection constraints on axions in the Solar basin

Axions with masses of order keV can be produced in great abundance within the Solar core. The majority of Sun-produced axions escape to infinity, but a small fraction of the flux is produced with speeds below the escape velocity. Over time, this process populates a basin of slow-moving axions trapped on bound orbits. These axions can decay to two photons, yielding an observable signature. We place the first limits on this solar basin of axions using recent quiescent solar observations made by the NuSTAR X-ray telescope. We compare three different methodologies for setting constraints, and obtain world-leading limits for axions with masses between 5 and 30 keV, in some cases improving on stellar cooling bounds by more than an order of magnitude in coupling.


I. INTRODUCTION
The Sun is an exquisite laboratory for new physics. Studies of particles produced in the Sun have provided us with a wealth of information about light, weaklyinteracting particles, including neutrinos [1][2][3], axions [4][5][6], and dark photons [7][8][9]. Most of these studies have focused solely on the flux of new particles that escape the gravitational potential of the Sun. However, a tiny fraction will be produced at sufficiently small velocities to become gravitationally trapped, forming a "Solar basin" of new particles. Ref. [10] showed that despite a small injection rate, the local density in this basin can grow to an appreciable fraction of the local dark matter density, thus providing a new target for direct detection experiments.
If these particles are unstable, however, their density in the Solar basin ceases to grow when decays balance injections, roughly when the Sun's age is comparable to the particle's lifetime. Once this steady-state is reached, the basin decays yield a constant flux of decay products. In the case of keV-mass axion-like particles [11][12][13][14][15] (hereafter "axions") with a coupling to photons, these decays appear as a narrow line in the X-ray band with a characteristic angular distribution. (Certain aspects of these phenomena were foreseen in Ref. [16], which attempted to address the solar coronal heating problem with a Kaluza-Klein spectrum of gravitationally-trapped axions.) This signal is therefore best probed by X-ray telescopes capable of direct solar observations, such as NuS-TAR [17]. While not initially designed as a solar ob-servatory, it has made observations of both the active Sun [18][19][20] and, more recently, the quiescent Sun [21,22] during its nine-year lifetime. Having been constructed with point sources in mind, it boasts a sub-arcminute angular resolution and energy range of 3-78 keV.
In this Letter, we use a subset of NuSTAR's recent quiescent solar limb dwells to constrain the axion-photon and axion-electron coupling, leveraging the characteristic spatial and spectral features of the basin signal. We compute the full production rate for a variety of production processes, and set limits using three different methodologies in order to adopt the most robust bound on parameter space. Two of these methods (Poisson and CL s limits) are familiar, but we also present the first multidimensional implementation of a backgroundagnostic limit-setting methodology based upon the work of Yellin [23,24]. We place constraints stronger than existing limits [25][26][27][28], by about an order of magnitude, on the axion-photon and axion-electron couplings, for axion masses between 5 keV and 30 keV. Figures 1 and 2 show our main results, taken along slices of parameter space given by g aγγ = α/2πf and g aee = m e /f in Fig. 1 where f is the axion decay constant, and g aee = 0 in Fig. 2.
The data and code used to obtain the results of this study is available on GitHub , and a link (ƭ) below each figure provides the code with which it was generated. We employ natural units wherein = c = k B = 1.

II. SIGNAL
We consider the leading axion couplings to electrons and photons: + g aee 2m e (∂ µ a)ψ e γ µ γ 5 ψ e − 1 4 g aγγ aF µν F µν , (2) , the regular red curve shows the optimum cuboid (Yellin) method (Sec. III B 2), and the thin red curve shows the CLs limit, which is the most stringent in most of the parameter space (Sec. III B 3). This suggests that the modeled smooth X-ray background dominates over the background from microflares in the data used in this analysis. The gray shaded regions are excluded by cooling of horizontal branch stars [26], white dwarfs [29], and red giants [30]. Note that the weakening of the CLs limit at low masses is due to the use of a 4 keV lower photon energy cutoff to avoid solar contamination in comparison to the 3 keV cutoffs for Poisson and Yellin. ƭ where m is the axion mass, a is the axion field, g aee is the axion coupling to electrons (contributing primarily to axion production), and g aγγ is the axion coupling to photons (allowing decays, and also contributing to production).
The energy density injection rate into the basin at a radius R as a result of these processes has been computed in general in Ref. [10], and can be put into the forṁ where M is the mass of the Sun and Φ(R ) is the gravitational potential at the site of production R , with the integral evaluated over the solar volume (0 < R < R ). The function Q(R ) encapsulates the production rate of non-relativistic particles into bound orbits [10] and is the sum of three primary components: Compton production ( Q C ), bremsstrahlung ( Q B ), and Primakoff production ( Q P ). Expressions for the first two of these components appear in Ref. [10], and we derive the Primakoff expression in App. A, where n e is the electron number density, T the temperature, and α the fine structure constant. The quantityρ b (R) scales quadratically with g aee (for the Q C and Q B terms) and g aγγ (for the Q P term); a plot of its dependence on m can be found in App. B.
The axions in the basin decay to photons (a → 2γ) either directly via their tree-level photon coupling or at loop level via their electron coupling, at a rate: If the axion's lifetime Γ −1 rad is shorter than the Sun's age t = 4.6 Gyr, then the system reaches a steady state in which the rate of basin energy density injection (ρ b , which we take to be time-independent) is balanced by the losses due to axion decay (ρ b Γ rad ). In general, the present-day signal flux is proportional to . We are primarily interested in the region R < R < 2R , where the majority of solar basin axions reside, and where secular perturbations from the planets can be neglected [31]. In App. B, we argue reabsorption of basin particles can also be ignore for the parameter space considered.
Since axions trapped in the solar basin have low velocities (v < v esc ), the decay to two photons takes place effectively at rest in the frame of the observer, so each photon acquires an energy E γ m/2. The spectral signature is therefore a line at X-ray energies.
The angular signature follows from the characteristic 1/R 4 dependence of the basin injection rate, a universal feature of injection into a 1/R potential [10]. Integrating along a non-Sun-crossing line of sight yields a 1/θ 3 fall-off in observed flux with increasing θ, where θ is the angle from the center of the Sun. Additionally, there is a doubling discontinuity of the signal at the solar limb (θ ≡ arcsin[R /AU]), where decays obscured by the Sun for θ < θ become visible when θ > θ . A plot of the resulting angular template can be found in App. B.
Explicitly, we compute the number flux of decay photons per unit solid angle dΩ at Earth by integrating along the line of sight distance z: with z * AU cos θ − sin 2 θ − sin 2 θ for θ < θ , and z * = ∞ for θ > θ . We neglect the ∼ 10 −3 R -thick region above the solar surface in which the optical depth drops abruptly to zero as it is negligible in comparison to the spatial resolution of the detector. Rewriting the radius in terms of the line-of-sight distance and angle, R = AU sin 2 θ + (cos θ − z/AU) 2 , we can extract the angular template function T (θ) ∝ dz R −4 , normalized such that T (0) = 1. The signal flux can then be put in the more convenient form with S 0 the signal flux per unit solid angle at θ = 0. The total number of expected signal events µ is the expression of Eq. 7 integrated over the exposure (corrected by livetime), ancillary response functions (ARFs), and field of view.

A. Data processing
The data used for our analysis were collected by NuS-TAR on Sep 12, 2020 during a series of quiescent limb dwells. Our data set is taken from the dwell with the least contamination from localized flares (Orbit 2) and further restricted to that orbit's combination of camera head units (CHUs) with maximal livetime (CHU12). (It is necessary to restrict the data to a particular combination of CHUs in order to ensure the consistency of the spatial coordinates in our analysis; when pointing at the Sun, CHU4 is blinded, hence an uncertainty of 2 arcmin is introduced on the relative pointing direction between different CHU combinations [32].) Our selection comprises about 1500 s of observations, during which the solar center underwent a 1.26 arcmin shift through NuS-TAR's field of view. All events in NuSTAR's calibrated energy range (3-78 keV) were recorded. (A spatial plot of the data can be found in Appendix C 1.) The photon collection efficiency of NuSTAR over the field of view is quantified in a discrete 13 × 13 partition of subfields for both detectors (A and B), each with angular extent of arcmin × arcmin and separate ancillary response function (ARF) computed using the extendedsource functionality of the nuproducts pipeline [32] under the assumption of a spatially uniform background. These ARFs are effective collection areas that take into account detector effects such as aperture stop obscuration, detector absorption, and vignetting [32] that vary over the field of view.

B. Limits
To set constraints, we choose three separate methodologies, each with their own associated benefits and challenges. The first is a simple Poisson limit, the second is a generalization of the optimum interval method of Refs. [23,24] (to our knowledge, the first multidimensional implementation of the algorithm over binned data), and the third is a likelihood-based CL s method.

Poisson limit
The premise behind the Poisson limit is simple: the expected signal counts should not exceed the total recorded counts at some level of confidence. We identify the signal region by integrating over the observation time and spatial coordinates, but restricting to a narrow window in energy E ∈ [m/2 − 2σ E , m/2 + 2σ E ], where σ E = 0.166 keV is the spectral resolution. (See App. D). Denoting by N the total number of recorded counts in that energy interval, and by µ the corresponding expected number of counts in a given signal model specified by (m, g aγγ , g aee , α , δ ), the Poisson likelihood is given by The 90% confidence limit µ lim is set at the value of µ, which depends on (m, g aγγ , g aee , α , δ ), such that the cumulative probability for µ > µ lim is less than 0.1. In practice, we compute a Poisson limit on S 0 (Eq. 7) for each (m, α , δ ), and conservatively report the least stringent limit among a set of 317 discrete, uniformlyspaced solar positions (α , δ ) within 6 arcmin around the fiducial solar position. This simple limit-setting procedure suffers from the loss of spatial and temporal information, and sets the standard for the methods of Secs. III B 2 and III B 3.

Yellin limit
The "Poisson" upper limit we obtained in Sec. III B 1 stems from a comparison between the expected signal events and the observed events integrated over the entire field of view and observation time. However, the signal has a known spatial, spectral, and temporal dependence, which one can exploit to mitigate some of the background contamination, even for a completely unknown distribution of the background, as proposed by Yellin in Refs. [23,24]. The method is based on selecting regions within the observed range that contain exceptionally few events in comparison to the signal expectation, or equivalently, exceptionally large regions that contain a given number of events. By construction, this method is relatively insensitive to parts of the signal region with high background (e.g. a flare). Yellin's method [23] was initially designed for searches using a signal model with a known one-dimensional distribution, for data that contain few events (e.g. [33]). It is relatively straightforward to find the largest interval which contains a certain number of events, from which one can then find the most improbable of these intervals, the "optimum interval," in the data by comparing to Monte Carlo (MC) simulations generated with only signal events. Hence, a backgroundindependent limit can be obtained.
In the case of our analysis, the data exists in a fourdimensional space spanned by (α, δ) ≡ (RA, DEC) coordinates in the field of view, energy, and time. A main background from the Sun comes from solar (micro)flares, which are localized features in both space and time. Therefore, we generalize the optimum interval method to a multi-dimensional space as outlined in Ref. [24], by identifying the "optimum cuboid" in the four spacetime dimensions. This dramatically increases the computational complexity, requiring efficient binning and downsampling strategies discussed in App. F that preserve the power of the optimum cuboid method, while handling the high dimensionality and statistics of our data set. We perform Monte Carlo simulations of pure signal to compute the cumulative distribution functions (CDFs) for the largest cuboid volumes enclosing n events for n = 0, 1, 2, . . . , N data . These CDFs can then be combined to place a 90% CL limit on the signal flux parametrized by S 0 in Eq. 7. As for the Poisson limit, we repeat the procedure for 317 different solar positions, and report the least stringent limit for each value of m.
The main advantage of this optimum cuboid method is that it is completely independent of any background model, making it ideal for data sets with unknown backgrounds.

CLs method
A (typically) more stringent upper limit can be set with the CL s method [34], which requires a sufficiently flexible model capable of capturing the background while not overfitting. We make no attempt at modeling the intermittent background from solar (micro-)flares. The dominant non-solar background in most of the energy range of interest arises from cosmic X-rays that enter the detector at a glancing angle, never having passed through the optical bench. Subleading contributions arise from solar lines and internal components of the telescope. We choose to model these three backgrounds using the spectral shapes measured in Ref. [35], but allow their respective normalizations to float; a plot of the spectra is shown in App. E 2. We treat these backgrounds as spatially uniform, unaffected by the ARFs, since they are not focused by the X-ray optics. Additionally, we raise the minimum energy cutoff for the data to 4 keV to avoid contamination from solar activity. See Appendix E 1 for more detail on the background modeling procedure.
The likelihood can be expressed as a product of Poisson likelihoods over spectral and spatial bins. We add a Gaussian prior around the fiducial solar position with standard deviation σ = 2 arcmin. (See Appendix E 2 for the full expression.) For each axion mass m, the model parameters are therefore the signal flux S 0 , the initial solar position, and the normalizations of the three background components. We perform Markov Chain Monte Carlo (MCMC) analyses at each m, and then apply the CL s method [34] on the resulting marginalized posterior for S 0 to place our constraint.

IV. RESULTS
The resulting limits are depicted in Figs. 1, 2, and 3. We represent our results in the three-dimensional parameter space spanned by m, g aγγ , and g aee in a few different ways. The first is to unify the two couplings in terms of an axion decay constant f via the relations g aγγ = α/2πf and g aee = m e /f , corresponding to an axion with a treelevel electron coupling, and one-loop anomaly coupling to photons, as in DFSZ-type models [36,37]. In this case, the axion is produced mainly through the axion-electron coupling (g aee ), and decays primarily through the axionphoton coupling (g aγγ ). This yields the limits displayed in Fig. 1, up to one order of magnitude stronger than existing stellar cooling constraints [25][26][27][28][29][30].
We plot in Fig. 2 the limits in m-g aγγ parameter space, with zero axion-electron coupling, where the axion-photon coupling is responsible for both the production and the decay of the axion. These limits also (approximately) apply to the case where the axion-electron coupling is generated at one-loop order, as in KSVZ-type models [38,39]. Our analysis places a limit on an axion coupling exclusively to photons, over an order of magnitude below existing constraints from horizontal branch star cooling [26].
Finally, we plot our results in g aee -g aγγ space in Fig. 3, with contours corresponding to different axion masses, encapsulating all information at discrete mass values, and showcasing the scaling of our constraints in various lim-  [26] and on the axionelectron coupling from white dwarfs [29], respectively. ƭ its. Tracing a single curve from the upper left to the lower right, one first sees the region in which the axion production is dominated by the Primakoff process, resulting in a bound independent of the electron coupling, and more stringent than horizontal branch star cooling constraints [26] for all but the highest mass displayed. At the dot, the Compton process overtakes as the dominant production mode, and our constraint is approximately independent of g aγγ . At even weaker photon couplings, the axion lifetime becomes longer than the age of the Sun, hence the basin decay has not yet fully reached detailed balance with the injection rate (see Eqs. 5 and 7), weakening the constraint to a contour at fixed product g 2 aee g 2 aγγ . Finally, at very small values of the photon coupling (lower right), the limits become vertical again, indicating the onset of axion decay via an electron loop, occurring well into the region constrained by white dwarf cooling [29] (vertical dashed line). The red dotted line corresponds to the "universal coupling" slice from Fig. 1.
The limits from the optimal cuboid (Yellin) and likelihood (CL s ) methods outperform the simple Poisson limit over the entire parameter space, as the former two are able to isolate the signal from the background. The optimum cuboid method offers the highest potential gain in data sets with localized, sporadic features such as flares. (In reality, the realized gain is artificially small, as we already picked a highly quiescent solar period.) Both the Yellin and Poisson limits are completely independent of any background modeling. The likelihood limit typically outperforms the optimum cuboid limit since it includes background fitting of the temporally constant cosmic and instrumental X-ray backgrounds.

V. CONCLUSION
In this Letter, we have presented a first search for Xrays from axion decays in the Solar basin, using observations by the NuSTAR telescope. We set new constraints on axion parameter space by over an order of magnitude beyond existing bounds in the mass range of 5-30 keV.
Our analysis can be augmented by including a background model of the quiescent, micro-flaring Sun, which would strengthen the CL s limit, and make possible a potential positive detection using the likelihood method. Longer, dedicated solar observations would also improve the sensitivity of both the likelihood and optimum cuboid methods. The optimum cuboid limit could also tighten with the inclusion of more (and possibly noisier) data, as it automatically selects the least likely region of observation space. Both analyses could also be marginally improved with a more precise knowledge of the fiducial position of the Solar center. A calibration of NuSTAR's ARFs down to lower energies would also extend the sensitivity down to about 2 keV.
The Sun is not the only star that can produce an axion basin: compact remnants such as white dwarfs and neutron stars are attractive targets, as are stars with hot, dense cores, such as (super-)giants and horizontal-branch stars. For targets with extremely low backgrounds, the smaller signal fluxes from these more distant sources could be compensated by stacking multiple exposures. Such strategies could extend the first-of-its-kind indirect detection search for a stellar basin presented here to a larger mass range, even smaller couplings, and other weakly-coupled particles, such as dark photons [40] and milli-charged particles [41]. As we have shown in this Letter, indirect detection of stellar basins can probe heretofore unexplored parameter space for weakly-coupled particles. With the plethora of extensions outlined above, stellar basins will be an exciting target in the hunt for new physics beyond the Standard Model.

Acknowledgments
Acknowledgements-We thank Iain Hannah, Crystal Kim, and David Smith for discussion regarding NuSTAR data products and instrumental response. We are grateful to Robert Lasenby for sharing preliminary results upon which App. A is based, as well as many clarifying discussions and comments on the draft. We thank Asimina Arvanitaki and Matthew Johnson for comments on the draft, and Andrea Caputo, Joshua Foster, In this section, we compute the production rate of nonrelativistic (NR) axions from the Sun through the axionphoton and axion-electron coupling. Axions can be produced via the Compton and bremsstrahlung processes, as well as the Primakoff and electro-Primakoff processes. The dominant production rate via the axion-electron coupling in the axion mass range of interest is the Compton process (see Ref. [10] for more details), with production function (to be used in Eq. 3): where ω pl ≡ 4παn e /m e is the plasma frequency. For the axion-photon coupling, the leading production mode is the Primakoff process depicted in Fig. 4, whose production rate vanishes at leading order in the NR expansion [42]. At this order, the only 3-momentum in the diagram is the momentum of the photon k γ and k a , so the matrix element of an aFF coupling, depending on the cross product of k γ and k a , vanishes as k a → 0. At subleading order, there are two expansion parameters, the electron velocity v e ∼ 3T /m e ∼ 0.1 and the axion velocity v a ∼ v esc ∼ 5 × 10 −3 . In the following, we will compute the Primakoff production rate at O(v 2 e ) in the NR expansion.
We start by expanding the spinors to first order in the velocity expansion. In the kinematic regime of interest, the incoming photon energy is approximately equal to the axion mass. Only the transverse mode of the photon contributes to the matrix element in vacuum, which is found to be for m a max{T, ω pl }, where k i and k f are the initial and final 3-momentum of the electron, and k γ is the 3momentum of the incoming photon.
Extending the result to lower axion masses requires accounting for thermal effects [43][44][45][46]. The production rate of an axion in the limit of vanishing v a is where β(Q) is the imaginary part of the the propagator, θ(q 0 ) is a Heaviside theta function, and g(K γ , Q) is the coupling function: where g L and g T refer to the coupling functions of longitudinal and transverse plasmons, respectively. The "Cut" plasmon 4-momentum is Q = (q 0 , q), and that of the "Pole" plasmon K γ = (ω γ , k γ ), while φ is the angle between the 3-momenta q and k γ [43]. Following [43], we refer to the off-shell plasmon with momentum Q (see Fig. 4) as a "Cut" plasmon, while the on-shell initial state plasmon with momentum K γ as a "Pole" plasmon. In the following, we will justify the approximations we made to arrive at Eq. A2 as well as its qualitative features by examining the properties of the functions g(K γ , Q) and β(Q).
The emission rate depends on the coupling function g(K γ , Q), which contains the Lorentz contraction of the vertices in Fig. 4. The emission of relativistic axions via the Primakoff effect is dominated by the exchange of a longitudinal plasmon. However, the longitudinal coupling function g L vanishes in the limit φ → 0 (k a → 0), signaling a v 2 a suppression as expected (and consistent with the in-vacuum result). In contrast, the transverse coupling function g T does not vanish in the same φ → 0 limit. In the following, we will therefore focus on the transverse-plasmon exchange.
The emission rate also depends on β(Q), the imaginary part of the propagator of the "Cut" plasmon where π(Q) is part of the polarization tensor π µν (Q) as defined in Ref. [43] and computed in Ref. [46]: with L and T subscripts referring to longitudinal and transverse plasmons. These results hold for a dilute NR plasma; see Ref. [46] for an in-depth analysis. In the above, we have employed the definitions ξ ± ≡ ξ(1 ± Q 2 2q0me ), the plasma dispersion function with ξ ≡ q 0 /(qσ), and σ is the electron velocity dispersion [46]. The plasma dispersion function Z(ξ) contains the entire imaginary part of π(Q). We thus find: for Q 2 (m e σ) 2 , implying that the emission of relativistic axions through exchange of a transverse plasmon is suppressed by v 2 e compared to the emission through exchange of a longitudinal plasmon, confirming our prior result.
This σ 2 difference between the longitudinal and transverse plasmon also shows up in the real part of π(Q). Whereas the function π L (Q) → ω 2 pl /σ 2 ∼ k 2 D in the limit q 0 → 0, the function π T (Q) → 0 in the same limit. Here, k D = e 2 n e /T is the Debye screening wave vector. This suggests that the exchange of a transverse plasmon does not experience the same Debye screening as the exchange of a longitudinal plasmon, also as expected.
We can now integrate over the distribution of the initial state electron and photon, as well as the final state phase space to find the production rate of non-relativistic axions as valid for axion masses 0.3 keV ω pl m a < 2m e σ ≈ 50 keV. As m a approaches ω pl , the production rate of non-relativistic axion through the "Cut-Pole" diagram becomes exponentially small due to the exponential suppression in the plasma dispersion function at large ξ.
The "Pole-Pole" contribution to the NR axion flux can only arise from the decay of the transverse plasmon to a longitudinal plasmon at the same momentum, but this rate is suppressed by the axion velocity v 2 a and will not dominate in the mass range of interest to this work. NR axions can also be produced through the electro-Primakoff effect ("Cut-Cut"). Similarly, the matrix element can also only get contributions at O(v 2 e ), so its rate is always subdominant compared to the Primakoff rate in the Sun.

Appendix B: Basin Evolution and Saturation
The energy density of axions in the solar basin depends not just on their injection rate from the Compton, bremsstrahlung and Primakoff production processes, but also on their effective accumulation time. There are three processes that contribute to basin depletion, and therefore set the effective lifetime of an axion in the stellar basin.
The first process is axion decay, with rate computed in Eq. 5, yielding a radiative lifetime within a few orders of magnitude from the age of the Sun (t ) for the parameter space of interest. In the main text (Eq. 7 in particular), we have taken this radiative decay as the primary depletion process.
The second process is gravitational ejection, from close encounters or orbital resonances. For our data set based on observations of the solar limb, the vast majority of Xray photons originate from axions on orbits with aphelia between R and 2R . For such tight orbits, gravitational ejection times are much longer than the Sun's age [31].
The third process is re-absorption of basin particles in the Sun, which can be estimated as [10]: where f sat = 1 e m/T −1 whose average is weighted by the production function Q over the solar volume, as in [40,Eq. 21]. This time is also much longer than Sun's age for parameter space that is not already excluded for the mass range of interest, as shown in the bottom panel of Fig. 5, which assumes f sat = 1 e m/T −1 with constant T = 1.3 keV for simplicity, and evaluates τ abs at R = R .
Appendix C: Data

Observation and detector response
We use data recorded by NuSTAR on 12-September-2020 during a series of quiescent limb dwells. Our dataset is taken from the dwell with the least contamination from localized flares (Orbit 2) and further restricted to that orbit's combination of camera head units (CHUs) with maximal livetime (CHU12). We choose to restrict the data to a particular combination of CHUs in order to ensure the consistency of the spatial coordinates in our analysis; Solar observations are performed in NuSTAR's Mode 06 (Science Mode) in which CHU4 is blinded, hence an uncertainty of ∼2 arcmin is introduced on the absolute pointing direction between different CHU combinations [32]. This uncertainty also requires our analyses to marginalize over the true solar position. The effective exposure for CHU12 was 1501.17 seconds for focal plane module (FPM) FPMA and 1481.86 seconds for FPMB, during which the solar limb underwent a 1.26 shift of (∆RA, ∆Dec) = (0.020 • , −0.0084 • ) through NuSTAR's field of view. See Table I  Event data was taken from Level 2 cleaned and calibrated event files. The spatial distribution is shown in Fig. 7. Energy was computed from the pulse invariant (PI) using the formula E = (0.04 keV)PI+1.6 keV. ARFs were computed using the nuproducts pipeline [32]. NuS-TAR is primarily a point-source telescope, so the ARFs  were computed using the extended source functionality and flat count distribution (assuming spatially uniform background) with no background extraction. Source regions were taken to be the cells associated with the gridding of the FOV into arcmin 2 regions (13 × 13 for both detectors) (see Sec. III A), with a different ARF calculated separately within each to account for the spatial dependence of the effective area due to detector effects (aperture stop, detector absorption, vignetting). The resulting expected spatial distribution of signal events is shown in Fig. 8.

Mock data generation
To compare to the true observation data, our limitsetting machinery was tested using mock data for which the axion mass and couplings are known. This mock data was generated in the following manner.
The background component was generated by sampling uniformly in time over the good time intervals for the observation and sampling energies from an interpolated distribution function formed from the 200-bin histogram of the true data. The background was sampled to be spatially uniform. The number of background counts was sampled according to Poisson statistics with expectation value given by the true number of counts in the data.
The signal component was generated by computing the expected number of counts of the signal model (Eq. 7) within each subgrid of the FOV for which an ARF was computed (taking the Sun to track its fiducial trajectory) and sampling from a Poisson distribution with that mean, then distributing those counts uniformly spatially over the subgrid and spectrally as a Gaussian with width σ E centered at E = m/2. The temporal dependence was taken to be uniform. While some approximations made in the generation of this mock data result in slight differences from a true expected signal, the mock data sufficed to verify the efficacy of our modeling framework.

Appendix D: Energy Resolution
In order to determine a fiducial energy resolution for spectral lines in NuSTAR, we fit a known X-ray line in data of the active Sun with a Gaussian, taking the best fit width of the Gaussian σ line as our fiducial resolution σ E . We target the iron line at E = 6.7 keV [47] in observations of the active Sun taken on 01-November-2014 [48, where {c 1 , c 2 , c line , σ line } were simultaneously fit to the data. The first term is included to model the solar background as a falling exponential, though since the line is narrow, the fit is insensitive to the particular functional form of this component. The second term models the line itself as a Gaussian centered at an energy E line with width σ line . The results of the fit are shown in Fig. 9. The fit yields a line energy E line = 6.45 keV and a width σ line = 0.166 keV, which we choose to use as our fiducial energy resolution σ E for our analyses. This is a known marginal shift in the line energy (which is not yet understood), however the energy resolution is typical for NuSTAR observations [e.g., 17].
Appendix E: Likelihood Analysis Method

Background characterization
The background present in our data arises from several different sources, as mentioned in the main body of the text. The dominant background arises from cosmic X-rays that enter the detector at a glancing angle, never having passed through the optical bench, however subleading contributions arise from solar lines and internal components of the telescope. We choose to model the background using the spectral shapes measured in [35] but allowing the respective normalizations to vary. A plot of the various spectra is shown in Fig. 11. B 1 ("aper-ture" in Fig. 9 of [35]) corresponds to the cosmic X-ray background that does not pass through the optical bench, B 2 ("int. lines") to internal lines arising from the radiation environment of the detector, and B 3 ("int. cont.") to gamma-ray Compton scattering and other featureless instrumental components. (See Appendix A of Ref. [35] for further discussion of these backgrounds.) There is an additional contribution at very low energies that arises from active regions on the Sun. These stochastic features are not possible to encapsulate in a simple model, hence when performing the CL s method, we have chosen to excise data contaminated by these microflares by raising the lower energy cutoff on the data used to set this limit. In order to select this cutoff, we performed a global fit of the background model to the observed data for E min ranging from 3 to 20 keV with a spacing of 0.24 keV. The χ 2 of the fit and coefficients B 1 , B 2 , and B 3 stabilize near 3.8 keV, hence we selected 4 keV as our E min for the CL s analysis. This cutoff explains the dramatic weakening of sensitivity of our CL s bounds below m 8 keV in Figs. 1 and 2. As can be seen in Fig. 12, with this choice of cutoff, the CL s method is biased towards finding higher p-values, hence is a conservative methodology.

Likelihood computation
As stated in Sec. C 1, we grid the field of view into 13 × 13 grid of arcmin 2 cells, for both detectors A and B. The data, d ijEt , is therefore binned by spatial cell (i, j = 1, . . . , 13), energy, and time, with energy bin widths of 40 eV. Each of the time bins is multiplied by the active exposure time of that bin. The data is also multiplied by the computed ancillary response function (ARF) that characterizes the detector effective area in each bin and the energy resolution σ E = 0.166 keV. The Poisson likelihood for our flux model is a product over all bins: where µ ijEt is the expected number of counts in each bin given the flux model, and n ijEt is the observed counts. The total flux model, including both the expected signal shape and backgrounds (denoted by f k for functional form and B k for free normalization, see Sec. E 1), is given by where Ω = 1 arcmin 2 is the solid angle of each subfield. This leaves six free parameters: B 1 , B 2 , and B 3 representing the background normalizations, S 0 for the overall signal strength, and α and δ for the (initial) solar position, which enters T (θ) through θ = α, δ − α (t), δ (t) . The angular flux template T (θ) is derived as shown in Sec. II and given explicitly by: where z min ≡ sin 2 θ − sin 2 θ. The expected signal is thus adjusted for the detector's effective area per bin (via the ARF), and for the motion of the Sun during the data collection.
To set a limit we run a Markov Chain Monte Carlo (MCMC) which maximizes the likelihood function: where LL m is the logarithm of the likelihood given the data within 2σ E of a given axion mass, and LL m is that for the data outside of the 2σ E range (separated such that only the energy spectrum of the non-axion region must be fitted to reduce computation time). We add a Gaussian prior to take into account our imprecise knowledge on the solar position around its fiducial value (α fid , δ fid ): where σ = 2 arcmin is the uncertainty in the solar position. We also force B 1 , B 2 , B 3 and the data in any ijEt bin to be strictly positive. We compute our MCMC chains via the emcee Python package [49], then extract a limit from the marginalized posterior on S 0 using the CL s test statistic [34], defined by where p S0 denotes the cumulative distribution function at a given value of S 0 . A 90% CL upper limit on S 0 is set by solving CL s = 0.1.

Diagnostic plots
Here we include various diagnostic plots for the CL s fit for a particular value of axion mass, m = 15 keV. Fig. 11 is the ultimate spectral fit of the various background components. It can be seen that the best fit (blue) coincides well with the observed data (black). Fig. 10 shows the corner plot for the MCMC fitting procedure. Fig. 12 shows the ultimate confidence intervals on S 0 as a function of axion mass.

Appendix F: Optimum cuboid method
As discussed in Section III B 2, Yellin's original method [23] was designed for searches with a signal model with a known one-dimensional distribution, and for data containing very few events. This method was generalized [24] to searches with a larger number of events by searching for the largest interval (fraction of the data set) which contains no more than certain number of events. In this way, instead of computing the distance in the phase space between any two consecutive events, one can scan the interval size, and find the interval which sees a largest downward fluctuation in the number of events inside it. The method presented in Ref. [24] suggests the The bias towards S0 > 0 demonstrates that the model used is in general conservative. ƭ possibility of coarsely discretizing the data region, which is essential in our generalization of the method to high statistics and to a higher-dimensional data space.
The main challenge for extending these methods to higher dimensional spaces is the rapid growth of computational complexity with the number of dimensions D. A naive expectation for the steps required for finding the largest cuboid with a certain number of data points n in a data set with N total points in D dimensions grows as N 2D−2 n, while more advanced algorithms have reduced the complexity for finding the largest empty cuboid to O(N 8/3 (log N ) 3 ) in 3 dimensions [50, 51]. For our data in D = 4, N ranges from O(10 2 ) at high axion masses to O(10 4 ) at low axion masses, where the complexity in the naive prescription is beyond our computational power.
As a result, we generalize the Yellin method to higher dimensions by discretizing the full experimental range into k 4 = 10 4 cuboids, constructed such that each cuboid contains exactly 1/k 4 of the expected signal events. We then generate large number of MC samples for a fixed average of µ signal events (uniformly distributed on the unit cuboid). We count the number of signal events in each unit cuboid and locate the largest cuboid V (i) n which contains n number of events for the ith MC sample. This list {V n } (i) over the many MC samples is then used to compute the cumulative distribution function of the V n , denoted C n (V ), for each n. In MC sample i, the maximum value of C n (V ) achieves for any particular V n ) is around 0.5 by construction, as this corresponds to the median. A large C n (V can be used to compute C max , the value for which the cumulative distribution function of C (i) max over all N sim samples reaches 90%. The procedure is carried out for a list of µ values corresponding to signal strength with which we obtain a function C max (µ), plotted in Fig. 13, which we can compare with data. (A glossary of these terms is provided at the end of this subsection for reference. ) We project each photon observed by NuSTAR, with coordinates (α, δ), arrival time, and energy, onto a unit 4D cube. The projection is such that a pure signal distribution (i.e. Eq. 7) would be uniformly distributed on this unit 4D cube; we follow the projection implementation of Ref. [24,App. C]. We then subdivide this unit cube into k 4 = 10 4 bins as we did for the MC samples. For each axion mass m and each of the 317 solar positions samples, this procedure results in a list of events indexed by four integers indicating the generalized bin to which each event belongs. This list can then be used to compute a list of V data n , as well as C n (V data n ) of this list of V n with the cumulative distribution functions obtained from the MC simulations for each µ. We can then find C data max (µ) as the largest of C n (V data n ) of each µ value and compare it to C lim max (µ) of Fig. 13 for each µ value. We have generated MC samples only for µ ≤ 128. If C data max (µ) < C lim max (µ) even for µ = 128, we employ a 10% for all µ. For limit-setting purposes, we conservatively use the red line C lim max , which is its 2-sigma upward fluctuation. ƭ down-sampling strategy. In that case, we randomly select 1/2 of the events projected on the 4D unit cube, and attempt to set a limit as outlined above on half of this data set. If µ = 128 is still not excluded, we downsample by another factor of 1/2, and so forth, in an iterative way. When this protocol converges after d downsampling steps to a limit µ lim (d) < 128, we take the limit on µ for the raw data to be µ lim = 2 d µ lim (d) . Finally, we take the least stringent limit of the 317 solar positions at each axion mass. The same process is carried out for all axion masses to find the 90%CL limit µ lim as a function of axion mass. This limit is plotted in the bottom panel of Fig. 15 as a red line, with a corresponding flux per unit solid angle at the solar center depicted in the top panel.
A list of the definition of the quantities we used to set limits are: • k: Number of generalized bins per MC sample, in each dimension.
• N (i) : Total number of simulated events in an MC sample, where i indexes the different samples.
• N MC : Number of MC samples generated.
• V (i) n : The largest volume that contains no more than n points in the i-th MC sample.
• {V n } i : The array of V (i) n for n = 0, . . . , N (i) in sample i. Note that V (i) N (i) = 1 because the largest volume that contains no more than N (i) points is the entire box.
• µ = N (i) : Expected number of data points in the MC samples. The following quantities will depend on µ implicitly.
• P n (V ) = PDF Vn (V ): The probability distribution function, for a particular n, over the V ( n i) of the MC samples. The location of the peak of this PDF increases monotonically with n, since the corresponding average largest volume containing no more than n points necessarily gets larger.
• C n (V ) = CDF Vn (V ): The cumulative distribution function of V n , or simply V 0 dṼ P n (Ṽ ). The point at which this curve rapidly saturates towards 1 increases monotonically with n, see above.
• C (i) max : In MC sample i, the maximum value that any C n (V ) achieves for the particular {V n } (i) of that sample: C • Υ(C) = PDF Cmax (C): The probability distribution function of C max over all N MC samples.
• C max : The value at which the CDF of Υ(C) reaches 0.9, i.e.
• C data max : This is the maximum of C n (V n ) across n on the observed data. The µ value for which C data max = C max (note that both sides depend on µ) this corresponds to our a 90%CL limit µ = µ lim,Yellin . The Poisson and Yellin limits in the full threedimensional parameter space. Each yellow/brown curve corresponds to a particular axion mass, while black dashed lines indicate stellar cooling bounds on the couplings. The red dotted line is the slice corresponding to the "universal coupling" displayed in Fig. 1. See text of App. G for details. ƭ

Appendix G: Additional limit plots
In this section, we include a few additional constraint figures. Figure 15 is our "raw" exclusion figure upon which all others are based. It shows the limits on S 0 , the signal flux per unit solid angle at the center of the Sun, as a function of mass, for the three methodologies. The bottom panel shows the corresponding total number of events µ, in comparison to the total number N of events recorded.
In Fig. 14, we show the Poisson and Yellin limits in the full three-dimensional parameter space, the analogs  Fig. 3 for the CL s limit. The limit curves have various turning points on this three-dimensional parameter space. As an example, consider the m = 15 keV line: going from left to right, the four distinct sections of the curve correspond to when • The axion-photon coupling determines the production and decay rates of the axion (horizontal).
• The axion-electron coupling determines the axion production rate and the axion-photon coupling determines the axion decay rate, and the axion lifetime is shorter than the age of the universe (vertical).
• The axion-electron coupling determines the axion production rate while the axion-photon coupling determines the axion decay rate, and the axion lifetime is longer than the age of the universe (oblique).
• The axion-electron coupling determines the axion production and decay rates (decay proceeds via an electron loop), and the lifetime of the axion is longer than the age of the universe (vertical).
The second and fourth section may not show up for some axion masses.