Microwave cavity searches for low-frequency axion dark matter

For low-mass (frequency $\ll$ GHz) axions, dark matter detection experiments searching for an axion-photon-photon coupling generally have suppressed sensitivity, if they use a static background magnetic field. This geometric suppression can be alleviated by using a high-frequency oscillating background field. Here, we present a high-level sketch of such an experiment, using superconducting cavities at $\sim$ GHz frequencies. We discuss the physical limits on signal power arising from cavity properties, and point out cavity geometries that could circumvent some of these limitations. We also consider how backgrounds, including vibrational noise and drive signal leakage, might impact sensitivity. While practical microwave field strengths are significantly below attainable static magnetic fields, the lack of geometric suppression, and higher quality factors, may allow superconducting cavity experiments to be competitive in some regimes.


I. INTRODUCTION
While dark matter (DM) has so far only been observed through its gravitational interactions, many theoretical candidates have additional interactions with the Standard Model (SM). This has motivated a wide range of laboratory searches for DM, most famously through the WIMP direct detection program. Another well-motivated dark matter candidate is the QCD axion (and more generally, axion-like particles), for which an extensive experimental program also exists [1].
Given the various existing and proposed axion DM detection experiments, a natural question is how sensitive an 'ideal' experiment could be, and how close we can get to that ideal. In a companion paper [2], we investigate this question for experiments searching for axion DM through the $a F_{\mu\nu} \tilde{F}^{\mu\nu}$ axion-photon-photon coupling. We focus on this coupling for a number of reasons: it is a generic prediction of QCD axion models [3,4], it arises in many other theories of axion-like DM [5], and it is particularly easy to analyse.
We show that, under fairly general assumptions, an experiment's sensitivity to axion DM is limited by the magnetic field energy maintained inside the detector. For example, suppose that dark matter consists of an axion-like particle $a$ of mass $m_a$, with a coupling $\mathcal{L} \supset -\frac{1}{4} g_{a\gamma\gamma} a F_{\mu\nu} \tilde{F}^{\mu\nu}$ to the SM. In an experiment searching for axions over a mass range $\Delta m_a$, the expected time-averaged power absorbed, $\bar{P}$, at the least favourable axion mass, satisfies an upper bound (equation 1) that depends on $U_B$ and $\rho_a$, where $U_B$ is the time-averaged magnetic field energy in the experimental volume (ignoring magnetic fields on very small spatial scales), and $\rho_a \simeq 0.3$ GeV cm$^{-3}$ is the axion DM energy density. Related expressions can also be obtained for the quantum-limited detectability of such a signal [2]. The highest (time-averaged) magnetic field energy densities are obtained from almost stationary currents, and correspondingly, most axion detection experiments use a static background magnetic field. In that case, for axion Compton wavelengths smaller than the length scale of the detector, it is relatively straightforward to attain the limit from equation 1, at least to $O(1)$. For example, cavity haloscopes such as ADMX [6], and dielectric haloscope proposals [7,8], can absorb $\gtrsim 60\%$ of this power into a single EM field mode.

* rlasenby@stanford.edu
However, for $m_a \ll L^{-1}$, where $L$ is the length scale of the experiment, the EM fields at frequencies $\sim m_a$ are naturally in the quasi-static regime. As discussed in [2], this suppresses the interaction with the axion oscillation, giving a parametric suppression of $\sim (m_a L)^2$ in the absorption rate, relative to the limit from equation 1. Roughly speaking, this arises because the electric fields associated with current fluctuations are suppressed compared to the magnetic fields, reducing the interaction with the axion-sourced effective current. This suppression affects low-frequency axion detection proposals such as ABRACADABRA [9] and DM Radio [10], reducing their sensitivity. As we point out in [2], an alternative approach is to use an oscillating background magnetic field, since oscillations at frequencies $\omega_0 \gtrsim L^{-1}$ are no longer in the quasi-static regime. This results in a signal at $\omega_0 \pm m_a$, which for $\omega_0 \gg m_a$ is at much higher frequencies than the original axion DM signal. Such approaches are generally referred to as 'up-conversion' experiments [11].
Such experiments have been proposed at optical frequencies [12-14]. However, these encounter a number of issues. The most serious is that achievable magnetic field strengths at optical frequencies are very small compared to static magnetic fields. Another is that, even if other noise sources are overcome, the shot noise penalty from absorbing fewer, but higher-energy, optical photons degrades the theoretical sensitivity limits still further. Consequently, it would be very difficult to match the sensitivity of static-field experiments, despite the lack of a $(m_a L)^2$ suppression.
We can address both of these issues by using lower-frequency magnetic field oscillations. In particular, reasonably large magnetic fields ($\sim 0.2$ T) are routinely attained at $\sim$ GHz frequencies inside superconducting radio frequency (SRF) cavities [15,16], and as we discuss below, it may be possible to achieve even higher fields. The lower frequency also alleviates shot noise issues. Thus, while the average magnetic fields will still be lower than those used in static-field experiments, the relative $(m_a L)^2$ enhancement may be enough to make up-conversion experiments interesting.
SRF up-conversion experiments were first proposed in [17]. However, this paper was mainly concerned with addressing GHz-scale axion frequencies, and did not consider the parametric scaling at low axion masses. Up-conversion experiments for low-mass axions were considered in [11], but calculational errors (see [2]) resulted in parametrically incorrect sensitivity limits, which were orders of magnitude too optimistic.
In this paper, we discuss the basic physics and design considerations involved in microwave cavity up-conversion experiments. One of the most obvious questions is how to choose the cavity geometry. We derive constraints on the signal power attainable from cavities, and show that for simple geometries (in which all of the walls are visible from an interior point) the RMS magnetic field is limited by the magnetic field at the walls. Since, for superconducting cavities, this is limited by the superconductor's material properties, the signal power from such cavities is bounded. We also show how this bound can be circumvented using cavities with more complicated shapes, and illustrate that, to probe significant QCD axion parameter space with small cavity volumes, such geometries may be practically necessary.
In addition to such considerations, which would be important for more advanced experiments, we also outline a nominal first-generation experiment, aiming to be as simple as possible while still having interesting reach. We give a high-level discussion of the noise issues that might arise, and derive representative sensitivity estimates. This discussion is not intended as a design study, and a realistic experiment might be significantly more complicated. Instead, our goal is to illustrate the physical parameters that might be required, and to motivate further study of this experimental direction.

II. AXION DM UP-CONVERSION
We will assume that dark matter consists of a light axion-like particle $a$, which couples to the SM via the electromagnetic $F_{\mu\nu} \tilde{F}^{\mu\nu}$ operator. The Lagrangian contains this coupling, $-\frac{1}{4} g_{a\gamma\gamma} a F_{\mu\nu} \tilde{F}^{\mu\nu}$, together with the potential $V(a)$ for the axion; in general, only the mass term $V(a) = \frac{1}{2} m_a^2 a^2$ will be important for us.
For light DM ($m_a \ll$ eV), the occupation number in the Milky Way is $\gg 1$, and almost all cosmological histories result in its state today being a coherent, classical-like oscillation [3,18-21]. Since $g_{a\gamma\gamma}$ (and other couplings) are constrained to be very small, interactions with a detector will have a negligible effect on the DM's state. Consequently, for the purposes of detection, we can treat the DM oscillation as a fixed classical background field.
Under integration by parts, the interaction term can be written as a coupling of the photon field to an effective current $J^{(a)}$. Since axion DM in the galaxy is non-relativistic, with typical velocity $\sim 10^{-3}$, the dominant term is the spatial current $J^{(a)} \simeq g \dot{a} B$. Hence, the effects of the axion field are equivalent to those of an oscillating current density, with profile set by the background magnetic field.
Suppose that the background magnetic field is oscillating at frequency $\omega_0$, with $B_0(t, x) \simeq B_0 \cos(\omega_0 t)\, b(x)$. For a single-frequency axion oscillation, $a(t) = a_0 \cos(m_a t)$, $J^{(a)}$ will have frequency components at $\omega_0 \pm m_a$ (where we assume $m_a \ll \omega_0$). For an EM mode with electric field profile $E_1$, the instantaneous power input from the axion current is $P_a = -\int dV\, E_1 \cdot J^{(a)}$. Considering only a single frequency component of $J^{(a)}$, say $\omega_J = \omega_0 + m_a$, the cycle-averaged input power is $\bar{P}_a = -\frac{1}{4} \cos\alpha \, g\, m_a a_0 B_0 \int dV\, E_1 \cdot b$, where $\alpha$ is the relative phase of the electric field response and the $J^{(a)}$ oscillation. The power dissipated is $P_{\rm diss} = \omega_J U / Q_l$, where $U$ is the energy stored in the mode and $Q_l$ is its (loaded) quality factor. Equating these gives the cycle-averaged absorbed power once the mode is fully rung up, expressed in terms of $U_0 = \frac{1}{2} B_0^2 V_b$ and an overlap factor $C_{01} \le 1$, with equality iff $E_1 \propto b$. For a high quality factor mode, the expression simplifies when the signal frequency $\omega$ is close to the resonance frequency $\omega_1$. The expression in equation 5 is equivalent to equation 3 in [17].
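The appearance of the two sidebands can be made explicit with a product-to-sum identity. The block below is a short worked sketch starting only from $J^{(a)} \simeq g \dot{a} B$ and the field profiles defined above. The explicit normalisation of $C_{01}$ in the last line is an assumption, chosen so that it reproduces the stated properties ($C_{01} \le 1$, with equality iff $E_1 \propto b$); the paper's own definition may differ by conventions.

```latex
\begin{align}
  a(t) &= a_0 \cos(m_a t), \qquad B_0(t,x) \simeq B_0 \cos(\omega_0 t)\, b(x), \\
  J^{(a)} &\simeq g\, \dot{a}\, B_0(t,x)
     = -g\, m_a a_0 B_0 \sin(m_a t) \cos(\omega_0 t)\, b(x) \\
  &= -\tfrac{1}{2}\, g\, m_a a_0 B_0
     \Big[ \sin\big((\omega_0 + m_a) t\big) - \sin\big((\omega_0 - m_a) t\big) \Big]\, b(x), \\
  % Assumed normalisation of the overlap factor (Cauchy-Schwarz gives C_{01} <= 1):
  C_{01} &\equiv \frac{\left( \int dV\, E_1 \cdot b \right)^2}
                     {\int dV\, |E_1|^2 \, \int dV\, |b|^2} \;\le\; 1 .
\end{align}
```

Each sideband therefore carries a current amplitude $\frac{1}{2} g\, m_a a_0 B_0$ with spatial profile $b(x)$, and the Cauchy-Schwarz inequality is saturated only when $E_1 \propto b$.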
In many circumstances (see appendix B), we are interested in the signal power averaged over different axion masses. Integrating over an axion mass range $\Delta m$ significantly larger than the bandwidth of the target mode yields a formula for the average power absorbed. This formula is valid even for low-frequency $B_0$ oscillations, as long as the integration time is long enough to resolve the variation of $B$. However, if $\omega_0 \ll L^{-1}$, where $L$ is the linear scale of the shielded experimental volume (or the magnetic field extent, whichever is smaller), then $C_{01}$ is generally suppressed compared to the theoretical limit, with $C_{01} \sim (\omega_0 L)^2$. This is because the EM fields in the volume are in the quasi-static regime, as discussed in [2]. Consequently, to avoid this geometrical suppression, we want to take $\omega_0 \gtrsim L^{-1}$, i.e. $\gtrsim$ GHz for laboratory-scale experiments.
A common limitation on the rate at which large magnetic fields can be varied is the large amount of field energy stored: taking some nominal parameters, Tesla$^2 \times$ m$^3 \sim$ MJ. Feeding that amount of energy into and out of a system must generally be done rather slowly. For example, the currents through high-field superconducting magnets generally take minutes or more to build up to their full values, with faster changes damaging the system. Faster rates of change are possible with specially-designed superconducting systems [22], but the dissipated power increases with frequency, and is generally prohibitive for rates of change $\gtrsim 1$ T/s. Resistive conductors can tolerate more heating and stress, but sustained operation at high field strengths dissipates very large powers. Similarly, varying the fields from magnetic materials will involve either mechanical motion or hysteretic energy losses. In all of these cases, achieving strong $\sim$ GHz frequency magnetic fields will not be possible.
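As a rough check of the nominal figure quoted above, the magnetic field energy $B^2 V / (2\mu_0)$ can be evaluated for a 1 T field filling 1 m$^3$. This is a minimal sketch in SI units; the function name is illustrative and does not come from the paper.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability in T*m/A (classical value, adequate here)


def magnetic_field_energy_joules(b_tesla, volume_m3):
    """Energy stored in a uniform magnetic field: U = B^2 V / (2 mu_0)."""
    return b_tesla**2 * volume_m3 / (2.0 * MU_0)


u = magnetic_field_energy_joules(1.0, 1.0)
print(f"{u / 1e6:.2f} MJ")  # roughly 0.4 MJ, i.e. of order a megajoule
```

Because the energy scales as $B^2 V$, the same function gives the energy for other nominal field strengths and volumes.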

A. SRF cavities
We can get around these issues by using cavities with high quality factors, in which magnetic field energy is exchanged back and forth with the electric field energy inside the cavity, rather than needing to be transferred in and out each cycle. Filling the cavity to high field amplitude is still a slow process, but once this amplitude has been established, a high-field oscillation can be maintained with only a small energy input, to counteract the small dissipation rate.
The basic setup of a cavity up-conversion experiment is illustrated in figure 1. The cavity's shape is tuned so that there are two modes with frequency difference $|\omega_1 - \omega_0| \simeq m_a$. One of these modes is driven to a high field amplitude, and in the presence of this background oscillation, an axion DM oscillation would lead to signal power in the other mode, according to equation 5. If the two modes have large $C_{01}$, then this process will be efficient [2].
To achieve a high quality factor, and allow for high drive fields without excessive energy dissipation, the cavity walls should be superconducting. SRF (superconducting radio frequency) cavities have been extensively developed for particle acceleration [15]. They are also starting to be used directly in hidden sector particle searches: the Fermilab DarkSRF project [23] is constructing a dark photon detection experiment using SRF cavities, and there have been other proposals for axion detection [24,25]. In this section, we will review some of the important properties of SRF cavities, which will affect the design and operation of an up-conversion experiment.
• Peak surface magnetic field: if the magnetic field at the walls of the cavity becomes too large, the behaviour of the superconducting material will change. Type-I superconductors generally have rather low critical fields (e.g. for aluminium, $H_c \simeq 0.01$ T [26]), so are unsuitable for high-field cavities. SRF cavities are fabricated using Type-II superconductors, and almost always use niobium. This has a critical field value $H_{c1}$ above which vortices penetrate; if this happens, then the radio-frequency oscillation of these vortices will lead to increased dissipation, and generally runaway thermal instability [15]. It is actually possible to operate slightly above $H_{c1}$, in a metastable 'Meissner state': while it is energetically favourable to have flux deep within the bulk, establishing this configuration involves penetrating a surface energy barrier [15]. Vortices start penetrating the surface at $H_{\rm sh}$, which for niobium is $\simeq 0.2$ T. Consequently, the magnetic field at the cavity walls should always be $\lesssim 0.2$ T. As we will see below, this puts limits on the achievable fields inside the cavity, and correspondingly, on the signal power attainable from axion DM.
• Surface resistance: the quality factor of a cavity mode is set by its magnetic fields at the cavity walls (which determine the wall currents), and the surface resistance there. This resistance has a 'BCS' component and a 'residual' component, $R_s = R_{\rm BCS} + R_{\rm res}$. The BCS component, for oscillations at frequency $\omega$, is given approximately by an expression in terms of $\lambda$, the (effective) penetration depth, $\sigma_n$, the normal-state conductivity, and $\Delta$, the superconducting gap energy [27]. The residual resistance is the component that persists as $T \to 0$, and is measured to be $\sim$ few n$\Omega$ for good niobium cavities (physically, it is not entirely clear what this residual resistance is dominated by [15,27]). The $R_{\rm BCS} \propto \omega^2$ dependence means that, for frequencies $\gtrsim 3$ GHz, the BCS resistance starts to dominate. Since it is an increasing function of temperature, this can lead to thermal instability problems [15]. Hence, SRF cavities are generally operated at lower frequencies.

FIG. 1. The left-most box illustrates a cavity in which we are interested in two modes, one with magnetic field profile $B_0$, and another with electric field profile $E_1$. The right-hand graph illustrates the frequencies of these modes as we vary the shape of the cavity (here, the length $d$), showing that they become degenerate at some $d$. When the cavity length is tuned to be close to this point, the frequency splitting between the modes will be small. If the $\omega_0$ mode is driven with high amplitude, then in the presence of an axion DM oscillation with $m_a \simeq |\omega_1 - \omega_0|$, power will be transferred to the $\omega_1$ mode, where it can be detected. The mode profiles shown here are schematic; Figure 2 shows actual mode profiles for a particular cavity geometry.
• Cooling: The quality factor of a mode sets the power dissipated, for a given energy stored. As we will see in the next section, taking the $H_{\rm sh}$ limit and $R_s$ values from above, and applying them to a simple laboratory-scale ($\sim 60$ L) cavity, gives $P_{\rm diss} \sim 30$ W (with $Q \simeq 2 \times 10^{11}$). Since this heat eventually needs to be dissipated to the $T_H \sim 300$ K environment, the maximum efficiency of the cooling system is $\eta_C = \frac{T_0}{T_H - T_0}$, where $T_0$ is the temperature of the cavity. So, we would need at least 10 kW of electrical power to cool the cavity to $T_0 = 1$ K. Taking a typical thermal efficiency of $\eta_T \simeq 0.2$, this becomes $\gtrsim 50$ kW. These figures illustrate that cooling the cavity to significantly sub-kelvin temperatures would be prohibitively power-hungry. The high cooling powers required generally necessitate the use of liquid helium cooling systems. The pump machinery involved leads to mechanical vibrations of the cavity, which can introduce noise and tuning issues, as we discuss in section III B.
• Field emission: if the electric fields at the cavity walls are high enough, then electrons can escape from the surface via tunneling. The field emission rate can be approximated via a modified Fowler-Nordheim formula [15], which depends on $\phi$, the work function of the wall material, $A_e$, the effective emitting area, $E_0$, the electric field at the wall, and $\beta$, a phenomenological 'field enhancement factor'. For pure niobium, $\phi \simeq 4.3$ eV [28], so $6 \phi^{3/2} \sqrt{m_e} \simeq 60$ GV/m $\simeq 200$ T. As we will see in section II B, for the cavities of interest to us, the peak electric field at the walls will be $\lesssim$ the peak magnetic field, which is restricted to $H_{\rm sh} \simeq 0.2$ T. Consequently, if $\beta \sim 1$, we would naively expect field emission to be negligible. However, experimentally, field emission is observed at significantly lower electric fields, down to $\sim 10$ MV/m $\simeq 0.03$ T [29]. This appears to arise from defects (especially foreign objects, such as metallic fragments) on the walls of the cavity, which can have $\beta$ up to $\sim 700$ [29]. Too much field emission can lead to quality factor degradation, as EM energy is lost to electrons. Even if this is not a concern, we may be worried about much smaller levels of field emission, if the electrons deposit energy into the signal mode. We discuss this noise source in section III D.
• Tuning: to change the axion mass that we are sensitive to, we need to change the frequency splitting between the drive and signal modes, which entails changing the shape of the cavity. If the axion mass is small compared to the mode frequencies, then only a small fractional change in the frequency of each mode is needed to cover an $O(1)$ axion mass range, simplifying the tuning problem. The usual method of tuning SRF cavities is simply via elastic deformation of the cavity walls, through an external forcing (such as a piezoelectric transducer, and/or a mechanical screw) [15,30]. To stay within the (low-temperature) elastic limit of niobium, the material strain should be $\lesssim$ few $\times 10^{-3}$ [31]. For typical $\sim$ GHz cavities, this usually translates to a tunable range of a few hundred kHz [15]; for higher cavity modes, the absolute range may be greater. As well as static deformations, we also need to worry about vibrations, as discussed in section III B.
These points only represent a simple sketch of the issues that an experiment would encounter, and much more detailed analysis would be needed for a real implementation. However, they give some idea of the properties of SRF cavities that are relevant to our setup.
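The cooling estimate in the list above can be reproduced with a few lines of arithmetic. The following is a minimal sketch assuming an ideal Carnot refrigerator with $\eta_C = T_0 / (T_H - T_0)$, derated by a thermal efficiency factor $\eta_T$; the function names are illustrative only.

```python
def carnot_cooling_efficiency(t_cold_k, t_hot_k):
    """Ideal (Carnot) heat removed per unit work: eta_C = T0 / (TH - T0)."""
    return t_cold_k / (t_hot_k - t_cold_k)


def wall_plug_power_watts(p_diss_w, t_cold_k, t_hot_k=300.0, eta_thermal=1.0):
    """Electrical power needed to remove p_diss_w of heat at t_cold_k.

    eta_thermal = 1 is the Carnot limit; smaller values model a real plant.
    """
    eta_c = carnot_cooling_efficiency(t_cold_k, t_hot_k)
    return p_diss_w / (eta_c * eta_thermal)


ideal = wall_plug_power_watts(30.0, 1.0)                  # Carnot limit
real = wall_plug_power_watts(30.0, 1.0, eta_thermal=0.2)  # with eta_T = 0.2
print(f"ideal: {ideal / 1e3:.1f} kW, realistic: {real / 1e3:.1f} kW")
```

For 30 W dissipated at 1 K this gives roughly 9 kW in the Carnot limit and roughly 45 kW with $\eta_T = 0.2$, in line with the rounded figures quoted in the cooling item.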

B. Cavity geometry
In designing an SRF up-conversion experiment, the most obvious question is what cavity geometry to use. To detect low-frequency axions, we need a pair of almost degenerate modes to act as the drive and signal modes, and these should have large $C_{01}$. If we want to use low-lying cavity modes, this generally requires a tuning of the cavity shape. For example, if we take a cylindrical cavity, and consider e.g. the $\mathrm{TE}_{011}$ mode, the mode this is naturally degenerate with is $\mathrm{TM}_{111}$ (i.e. they are degenerate at all height-to-radius ratios). However, due to the mismatch in the azimuthal mode number $m$, these modes have $C_{01} = 0$. We can attain good overlap with e.g. the $\mathrm{TE}_{011}$/$\mathrm{TM}_{020}$ mode pair, but the mode frequencies are only degenerate at $d/R \simeq 0.79$, where $d$ is the height of the cylinder and $R$ is its radius [11]. For more symmetrical cavities, such as a sphere, the overlaps for degenerate modes are always zero.
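The degenerate aspect ratios follow from the standard mode frequencies of an ideal closed cylinder, $(\omega R)^2 = x^2 + (p \pi R / d)^2$, where $x$ is the relevant Bessel zero and $p$ the axial index. The sketch below solves for the $d/R$ at which a given TE/TM pair coincides. The Bessel zeros are hard-coded tabulated values, and the function name is illustrative rather than taken from the paper.

```python
import math

# Standard tabulated Bessel zeros
X01 = 2.404826   # first zero of J_0   (TM_01p)
X02 = 5.520078   # second zero of J_0  (TM_02p)
XP01 = 3.831706  # first zero of J_0'  (TE_01p); equals the first zero of J_1 (TM_11p)


def degenerate_aspect_ratio(x_te, p_te, x_tm, p_tm):
    """Return d/R at which the TE and TM cylinder modes have equal frequency.

    Solves x_te^2 + (p_te*pi*R/d)^2 = x_tm^2 + (p_tm*pi*R/d)^2 for d/R.
    """
    num = math.pi**2 * (p_tm**2 - p_te**2)
    den = x_te**2 - x_tm**2
    if den == 0 or num / den <= 0:
        raise ValueError("no isolated degeneracy point for this mode pair")
    return math.sqrt(num / den)


print(f"TE011/TM020: d/R = {degenerate_aspect_ratio(XP01, 1, X02, 0):.2f}")
print(f"TE012/TM013: d/R = {degenerate_aspect_ratio(XP01, 2, X01, 3):.2f}")
```

The first pair reproduces $d/R \simeq 0.79$. The $\mathrm{TE}_{011}$/$\mathrm{TM}_{111}$ pair has equal radial zeros and equal axial indices, so it is degenerate at every aspect ratio and the function reports that there is no isolated crossing.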
If we are only concerned with thermal noise, the best possible sensitivity is determined by the signal power, and the bandwidths of the drive and signal modes (see section III A). Since the signal power scales as $B_0^2$, where $B_0$ is the magnetic field strength in the driven mode, the simplest way to increase the signal power is to increase the amplitude of the driven mode. However, the power dissipated increases $\propto B_0^2$ as well, and even if neither this nor other issues such as field emission are a problem, at some point the maximum magnetic field at the cavity walls will increase past $H_{\rm sh} \simeq 0.2$ T. This means that, for a given cavity geometry, there is a maximum achievable signal power.
Given some constraints on available volume and cooling power, we can ask how large a signal power can be obtained by optimising the cavity geometry. The volume constraint is necessary; since dissipation is a wall effect, whereas signal scales with volume (for fixed field amplitudes), if we increase the cavity dimensions by a factor α, while reducing the field amplitudes to keep the signal power constant, the dissipated power scales ∝ 1/α. Hence, by scaling up the cavity, we can always reduce the required cooling power for a given signal level.
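The scaling argument can be written out explicitly. This is a sketch assuming that the mode shapes are simply rescaled with the cavity and that the surface resistance $R_s$ is held fixed (any frequency dependence of $R_s$ under rescaling is ignored here).

```latex
\begin{align}
  % Signal power scales with field energy, i.e. with volume at fixed amplitude
  P_{\rm sig} &\propto B_0^2\, V \propto B_0^2\, \alpha^3
     \quad\Rightarrow\quad B_0^2 \propto \alpha^{-3}
     \ \text{ at fixed } P_{\rm sig}, \\
  % Dissipation is a wall effect, so it scales with surface area
  P_{\rm diss} &\propto R_s \oint_{\partial V} dA\, B^2
     \propto B_0^2\, \alpha^2 \propto \alpha^{-1} .
\end{align}
```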
Taking a cylindrical cavity geometry as our example, which height-to-radius ratio gives rise to a degenerate mode pair with the best (axion-mass-averaged) signal power per volume? If we hold $\max_{\partial V} B^2$ constant (where $V$ denotes the volume of the cavity, and $\partial V$ its boundary walls), then among the low-lying modes, the best choice is to drive $\mathrm{TE}_{012}$, and pick up in $\mathrm{TM}_{013}$, which are degenerate at $d/R \simeq 2.35$. This mode pair has $C_{01} = 0.19$, and maximises $U_0 C_{01}/V$ (for example, driving $\mathrm{TM}_{013}$ and picking up in $\mathrm{TE}_{012}$ gives the same $C_{01}$, but the stored energy in the TM mode is smaller, if we restrict the magnetic field at the walls to be $< 0.2$ T). The field profiles for these modes are illustrated in Figure 2. In addition to giving high signal power, this mode pair also has attractive noise-rejection properties, as discussed in Section III. Consequently, we will use it as our nominal experimental setup for most of this paper. Some features of this mode pair are summarised in Table IV, where its properties for $R = 20$ cm and $R = 50$ cm are given (at the degenerate $d/R$ ratio). Properties for other sizes can be obtained by scaling.

General constraints
We can also consider more general cavity geometries. Both the cooling power and $H_{\rm sh}$ limitations are based on the magnetic fields of the drive mode at the cavity walls. It is not immediately obvious that these cannot be made small by a clever choice of cavity geometry. For example, the wall electric fields for the cylindrical $\mathrm{TE}_{012}$ mode are everywhere zero (with the consequence that field emission can be highly suppressed; see section III D). However, as shown in appendix A, the wall fields can be related to the energy stored in the cavity via a surface integral weighted by $x \cdot n$, where $x$ is the vector from some origin to the wall location, $n$ is the outward-pointing normal to the wall, and the fields are time-averaged. Since $\int dA \, x \cdot n = 3V$, if we can choose an origin for which $x \cdot n \ge 0$ everywhere, then for a harmonic oscillation, the magnetic field energy inside the cavity can be bounded by the maximum magnetic field at the walls. Similarly, the power dissipated is set by the magnetic fields at the walls, and if $x \cdot n \ge 0$ everywhere, it can likewise be bounded in terms of the stored energy. From equation 7, the signal power is bounded by the energy in the drive mode. Consequently, for 'simple' cavities, for which there is an interior point from which all of the walls are visible, a given signal power implies a lower bound on $\max_{\partial V} B^2$, and on $P_{\rm diss}$ (in the latter case, assuming a given linear extent for the experiment, as per the scaling discussion above).

FIG. 2. Illustration of the optimum drive and signal modes for a cylindrical cavity, as discussed in section II B. The $B_0 \cdot E_1$ panel shows the dot product of the drive mode's magnetic field with the signal mode's electric field, with green indicating a positive value, and red a negative value; the integrated value over the volume gives $C_{01} \simeq 0.19$, as defined in section II.
Comparing these limits to the properties of the $\mathrm{TE}_{012}$/$\mathrm{TM}_{013}$ mode pair, the maximum signal power per volume, for a given $\max_{\partial V} B^2$, is $\lesssim 10$ times larger, using the above limits. Similarly, for a cavity with the same maximum linear extent, the signal power is $\lesssim 12$ times larger, for a given dissipated power. For realistic geometries, the limits are probably significantly lower.
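A relation of this kind between stored energy and wall fields can be checked numerically on a simple example. The sketch below assumes the identity takes the radiation-pressure form $U = \oint dA \, (x \cdot n) \, \langle (B^2 - E^2)/2 \rangle$ for perfectly conducting walls, in units with $c = 1$ and energy density $(E^2 + B^2)/2$. That explicit form is an assumption made here for illustration, not a quotation of the paper's appendix A. The test case is the $\mathrm{TE}_{101}$ mode of a rectangular box, chosen only because its fields are elementary.

```python
import math


def te101_avg_fields(x, z, a, d, e0=1.0):
    """Time-averaged <E^2> and <B^2> of the TE101 box mode (units with c = 1)."""
    kx, kz = math.pi / a, math.pi / d
    omega = math.hypot(kx, kz)
    e_amp = e0 * math.sin(kx * x) * math.sin(kz * z)
    bx_amp = e0 * (kz / omega) * math.sin(kx * x) * math.cos(kz * z)
    bz_amp = e0 * (kx / omega) * math.cos(kx * x) * math.sin(kz * z)
    return 0.5 * e_amp**2, 0.5 * (bx_amp**2 + bz_amp**2)


def midpoints(length, n):
    """Midpoint-rule sample positions on [0, length] and the step size."""
    h = length / n
    return [(i + 0.5) * h for i in range(n)], h


def stored_energy(a, b, d, n=100):
    """Volume integral of <E^2 + B^2>/2 (fields are uniform along y)."""
    xs, hx = midpoints(a, n)
    zs, hz = midpoints(d, n)
    total = 0.0
    for x in xs:
        for z in zs:
            e2, b2 = te101_avg_fields(x, z, a, d)
            total += 0.5 * (e2 + b2)
    return total * hx * hz * b


def wall_integral(a, b, d, n=100):
    """Surface integral of (x.n) <B^2 - E^2>/2 with the origin at the box centre."""
    def f(x, z):
        e2, b2 = te101_avg_fields(x, z, a, d)
        return 0.5 * (b2 - e2)

    xs, hx = midpoints(a, n)
    zs, hz = midpoints(d, n)
    # x.n equals a/2, b/2, d/2 on the x-, y-, z-walls respectively
    x_walls = (a / 2) * b * sum(f(0.0, z) + f(a, z) for z in zs) * hz
    z_walls = (d / 2) * b * sum(f(x, 0.0) + f(x, d) for x in xs) * hx
    y_walls = 2 * (b / 2) * sum(f(x, z) for x in xs for z in zs) * hx * hz
    return x_walls + y_walls + z_walls


a, b, d = 1.0, 0.7, 1.3
print(stored_energy(a, b, d), wall_integral(a, b, d))
```

For this mode the two numbers agree, and both equal $E_0^2\, a b d / 8$. The walls carrying a normal electric field contribute nothing on net, which illustrates how the magnetic wall fields alone control the stored energy when $x \cdot n \ge 0$ everywhere.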

Non-convex geometries
However, equation 11 also makes it fairly obvious how to get around these limitations. If $\int dA \, |x \cdot n| \gg 3V$, then $U$ can be large even if the wall fields are small. As an explicit example, we can consider a toroidal cavity (illustrated in figure 3), formed by bending a corrugated cylindrical waveguide of radius $a$ around to meet itself, resulting in a toroid with overall radius $R$. If we assume quarter-wavelength-deep corrugations, then at frequencies for which $\omega a \gg 1$, the modes of a corrugated waveguide are dominantly transverse, and the wall fields are suppressed by $\sim (\omega a)^{-1}$ relative to the fields in the interior [32-34]. Taking an explicit example, the linearly-polarised $\mathrm{HE}_{11}$ modes have transverse fields whose wall values are related to the interior values through a factor involving $\lambda$, the free-space wavelength of the mode (and similarly for the electric fields) [32]. Consequently, by making $\omega a$ large, the interior fields can be made parametrically larger than the wall fields.

FIG. 3. Illustration of a toroidal corrugated waveguide, as discussed in section II B 2. The radius of curvature $R$ is taken to be much larger than the waveguide radius $a$.
For a linear waveguide, the problem comes at the endcaps: here, the wall fields are $\sim |B(r = 0)|$. Bending the waveguide into a toroid eliminates these endcaps. Of course, in order to preserve the smallness of the wall fields, the radius of curvature $R$ must be large compared to $a$. Performing a naive perturbative calculation [35], in which we take both $R/a$ and $\omega a$ to be large, and treating the corrugated wall as a surface with uniform effective reactance [32], we obtain an approximate expression for the correction to the wall fields from the waveguide's curvature. The constant factor in this estimate should be treated as accurate only to $O(1)$, since we do not calculate the actual behaviour in the corrugations. However, the parametric form should hold, and illustrates that as long as $R \gg a$, it is possible to make $a\omega$ large and obtain interior fields significantly larger than the wall fields.
Taking some illustrative numbers, if we assume a toroid with waveguide radius $a = 10$ cm, and take $R = 4$ m, then choosing as high a mode frequency as practical, $\omega \simeq 2\pi \times 3$ GHz, gives $a\omega \simeq 6$. The total energy stored in the $\mathrm{HE}_{11}$ mode, for wall fields $\lesssim 0.2$ T, is $\simeq 40$ kJ. For a $\mathrm{TE}_{012}$/$\mathrm{TM}_{013}$ cylinder of comparable volume, $U \simeq 10$ kJ.
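The dimensionless parameter $a\omega$ and the enclosed volume for these illustrative numbers can be checked directly. The sketch below restores the factor of $c$ that is implicit in natural units and treats the toroid as a smooth tube of radius $a$ bent into a ring of radius $R$ (the corrugations are ignored in the volume); the function name is illustrative.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def toroid_parameters(a_m, big_r_m, freq_hz):
    """Return (a*omega/c, enclosed volume in m^3) for a smooth toroidal tube."""
    a_omega = a_m * 2.0 * math.pi * freq_hz / C_LIGHT
    volume_m3 = (2.0 * math.pi * big_r_m) * (math.pi * a_m**2)
    return a_omega, volume_m3


a_omega, volume = toroid_parameters(0.10, 4.0, 3e9)
print(f"a*omega = {a_omega:.2f}, V = {volume:.2f} m^3")
```

This gives $a\omega \approx 6.3$ and a volume of about 0.79 m$^3$, which is the sense in which the comparison cylinder is of 'comparable volume'.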
To use the toroidal cavity for axion detection, we need almost degenerate modes with good overlap. Conceptually, this is very simple: we use the $\mathrm{HE}_{11}$ modes with orthogonal polarizations, offset by a quarter-wavelength along the toroid. This is in exact analogy to the optical up-conversion experiments proposed in [12-14], which convert linearly-polarized optical photons to the orthogonal polarization. It has the advantage of having, in the large-$R$ limit, perfect overlap between the drive and signal modes. Consequently, the advantage in axion signal strength is even larger than the advantage in stored energy: using the same nominal parameters as above, the toroid's signal strength would be $\sim 20$ times higher (a more detailed calculation would be required to find the proper finite-$R$ behaviour). This compares to the factor 10 limit derived above for 'simple' cavities, showing that the non-trivial geometry is necessary for such improvements. If we instead used a linear corrugated waveguide, then the drive amplitude would be constrained by the end-caps, and the signal strength would be only a factor $\sim 2$ higher than a cylinder of the same volume.
The toroidal cavity serves as a particularly symmetrical, and thus easy-to-analyse, example of a cavity with large $\int dA \, |x \cdot n|$. There are, of course, many other possibilities. For example, we could instead take a linear corrugated waveguide, and replace its endcaps by large-area reflectors. In [36], it is claimed that this can reduce the peak magnetic field at the walls by a factor $\sim 2$; for a long waveguide, this would result in a signal power per volume $\sim 8$ times larger than for the $\mathrm{TE}_{012}$/$\mathrm{TM}_{013}$ cylinder pair. It is not immediately clear whether the enhancement can be made parametrically large, as for the toroid example.
These kinds of cavities may be significantly more complicated to fabricate than the simple cylindrical cavities discussed above. In addition, as we will discuss in the next section, they lack some of the noise-rejection properties of more symmetrical cavities, and would be more complicated to tune and control. Consequently, we will not attempt to analyse them in detail. They do, however, serve as an example of how larger signal powers could potentially be realised.
In table IV, we list some estimated parameters for a $R = 8$ m, $a = 13$ cm toroidal cavity used as an up-conversion experiment. This size is chosen to make significant QCD axion sensitivity possible, at least theoretically (see section III A and figure 5). Even these large sizes are significantly smaller than the axion coherence length for the masses of interest, $l_a \sim 10^3 \nu_a^{-1} \sim 300 \, {\rm m} \, \frac{\rm GHz}{\nu_a}$, so our approximation of the axion field as spatially constant will be valid.

III. BACKGROUNDS & SENSITIVITY
A. Thermal and amplifier noise
As mentioned in section II A, the power dissipated at high drive fields means that cooling the cavity to sub-kelvin temperatures is impractical. Since $2\pi$ GHz $\simeq 50$ mK, this means that the physical temperature is always significantly higher than the signal mode frequency, and thermal noise needs to be taken into account. In addition, the system we use to read out the signal (generally, a chain of microwave amplifiers) will introduce its own noise.
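The frequency-to-temperature conversion used above follows from $k_B T = h \nu$ once SI constants are restored. The short sketch below evaluates it for a 1 GHz mode and compares it with a cavity temperature of 1.4 K, the value used for the sensitivity projections; the function name is illustrative.

```python
H_PLANCK = 6.62607015e-34   # Planck constant in J*s (exact SI value)
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def frequency_to_temperature_kelvin(freq_hz):
    """Temperature T satisfying k_B T = h * nu."""
    return H_PLANCK * freq_hz / K_BOLTZMANN


t_ghz = frequency_to_temperature_kelvin(1e9)
print(f"1 GHz corresponds to {t_ghz * 1e3:.0f} mK")
# Ratio k_B T0 / (h nu) for T0 = 1.4 K: a rough measure of how far the
# signal mode is from the quantum regime at this temperature.
print(f"k_B T0 / (h nu) at 1.4 K: {1.4 / t_ghz:.0f}")
```

At 1 GHz the equivalent temperature is about 48 mK, so a cavity at around 1 K is a few tens of times hotter than the mode frequency, which is why thermal noise has to be included.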
In appendix B, we review the theory of signal detection for a high-Q target mode, assuming readout via an amplifier isolated behind a circulator (as for most microwave systems). If the noise associated with the amplifier system corresponds to a smaller temperature than the physical temperature of the cavity, then it is favourable to 'overcouple' the signal mode to the output port (i.e. have it lose more power to the output port than to environmental dissipation) [37]. This reduces the loaded quality factor of the target mode, which is naively bad, but also dilutes the thermal noise reaching the amplifier; the latter effect turns out to dominate.
The high quality factors attainable in SRF cavities mean that, even given this overcoupling, the bandwidth of the target mode will be $\ll \nu_a$, for axion masses of interest. This means that, to cover an $O(1)$ range in axion masses, we need to operate the experiment in multiple different configurations, with different frequency splittings between the drive and target modes (as discussed briefly in section II A). In the limit of long integration times, any sufficiently dense, and roughly equal, spacing of these frequency splittings will give approximately the same expected SNR. This SNR depends on the following quantities: $T_0$ is the physical temperature of the cavity walls (assumed to be $\gg \omega_1$), $T_n$ is related to the noise of the amplifier system (see appendix B), $\Delta m_a$ is the range of axion masses we want to cover (assumed to be $\lesssim O(1)$), $Q_a \simeq 10^6$ is the inverse fractional bandwidth of the axion signal, $t_{\rm tot}$ is the total integration time for all configurations, $Q_1$ is the unloaded quality factor of the target mode, $\omega_1$ is the frequency of the target mode (assumed to change by a small fractional amount between configurations), and $P_0$ is the power absorbed by the (unloaded) target mode when on-resonance with a monochromatic signal (see equation 5). Since $P_0 \propto g^2$, the smallest coupling we have sensitivity to improves with increasing $Q_1$ (and with decreasing $T_0$ and $T_n$), as expected [37]. From [2], we know that for fixed $t_{\rm tot}$, this improvement with increasing $Q_1$ must saturate at some point. Fairly obviously, this will happen when the time spent in each configuration is not long enough to resolve the loaded bandwidth of the target mode. If, as is the case in most of our parameter space, $\Delta\omega_a \simeq m_a / Q_a$ is such that $\Delta\omega_a \gtrsim \omega_1 / Q_l$, where $Q_l$ is the loaded quality factor of the target mode, then the minimum spacing of frequency splittings is $\sim \Delta\omega_a$ (otherwise some axion masses will fall into the gaps). This means that the time spent in each configuration is $t_1 \simeq t_{\rm tot} \frac{\Delta\omega_a}{\Delta m_a} \simeq \frac{t_{\rm tot}}{Q_a} \frac{m_a}{\Delta m_a}$.
If $t_1 \lesssim Q_l/\omega_1$, then the mode cannot fully ring up in the time available, and it would be more favourable to reduce $Q_l$ by overcoupling further. In this regime, the best attainable SNR is given by equation 19, in terms of $W \equiv \bar{P}\, t_{\rm tot}$, the expected energy absorbed from the axion signal.
In figure 4, the dot-dashed line shows the thermal-noise-limited sensitivity for the nominal 60 L cavity discussed above, given a total integration time of one year to cover an e-fold in axion mass range. It assumes that the physical temperature of the cavity is $T_0 = 1.4$ K, and the amplifier system is quantum-limited, so $T_n = \omega_1$. Near-quantum-limited amplifiers have been demonstrated at microwave frequencies, and incorporated in the ADMX [6,38] and HAYSTAC [39] axion DM experiments. The loaded quality factor, for the optimal overcoupling level, is $Q_l \simeq 2 \times 10^9$, so it takes $\sim 2$ s to resolve the signal mode bandwidth, and a total integration time of $\gtrsim 2 \times 10^6$ s $\simeq 24$ days to be in the regime of equation 18. Since the coupling sensitivity scales like $g_{\rm sens} \propto m_a^{1/2}$, but $g \propto m_a$ for the QCD axion, lower-mass QCD axions are still harder to detect. As the figure shows, an experiment with these parameters could not attain QCD axion sensitivity over an $O(1)$ axion mass range, even at higher masses (without violating our assumptions, e.g. by injecting a highly non-classical state into the target mode). Moreover, though we extend the sensitivity projection up to $\nu_a \sim 100$ MHz, tuning a cavity over an $O(1)$ range of frequency splittings would be difficult at such high masses, as discussed in section II A.
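The integration-time threshold quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the round numbers from the text ($Q_a \simeq 10^6$, a $\sim 2$ s bandwidth-resolution time, one e-fold in mass so that $\Delta m_a/m_a = 1$); the function and variable names are illustrative and do not come from the paper.

```python
# Sketch of the scan-time arithmetic; inputs are the round numbers quoted in the text.
SECONDS_PER_DAY = 86_400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def time_per_configuration(t_tot_s, q_a, frac_mass_range=1.0):
    """t_1 ~ (t_tot / Q_a) * (m_a / dm_a), with frac_mass_range = dm_a / m_a."""
    return t_tot_s / (q_a * frac_mass_range)


def min_total_time(resolve_time_s, q_a, frac_mass_range=1.0):
    """Total time at which t_1 equals the time needed to resolve the loaded bandwidth."""
    return resolve_time_s * q_a * frac_mass_range


Q_A = 1e6             # inverse fractional bandwidth of the axion signal
RESOLVE_TIME_S = 2.0  # approximate time to resolve the signal mode bandwidth

t_min = min_total_time(RESOLVE_TIME_S, Q_A)
t_1_year = time_per_configuration(SECONDS_PER_YEAR, Q_A)

print(f"threshold total time: {t_min:.1e} s = {t_min / SECONDS_PER_DAY:.0f} days")
print(f"time per configuration for a one-year scan: {t_1_year:.0f} s")
```

With these inputs the threshold is a few weeks, and a one-year scan spends tens of seconds in each configuration, comfortably longer than the $\sim 2$ s resolution time.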
The simplest way to obtain higher sensitivities would be to use larger cavities. Figure 5 shows the thermal-noise-limited sensitivities for two such examples. The first is a simple scaling-up of the TE$_{012}$/TM$_{013}$ configuration, using a cylinder of radius 50 cm (giving a volume of $\sim 900$ L). Its properties are summarised in Table IV; compared to a smaller cylinder, the signal power scales with the volume. Even an experiment of this size could not reach KSVZ axion sensitivity over an $O(1)$ axion mass range, at reasonable axion masses. At this size, the mode frequency is $\sim 440$ MHz, which is at the lower end of the frequency range generally used in SRF cavities.
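The scaling between the two cylinders can be sketched as follows, assuming a purely geometric rescaling at fixed aspect ratio (volume $\propto r^3$, mode frequency $\propto 1/r$). The helper name is hypothetical, and the back-scaled frequency of the 20 cm cavity is an inference from the quoted 440 MHz rather than a number stated in this section.

```python
def scale_cavity(volume_l, freq_hz, radius_cm, new_radius_cm):
    """Rescale a cavity at fixed shape: volume goes as r^3, mode frequency as 1/r."""
    s = new_radius_cm / radius_cm
    return volume_l * s**3, freq_hz / s


# Scale the nominal 60 L, 20 cm cavity up to 50 cm radius (frequency argument unused here).
big_volume_l, _ = scale_cavity(60.0, 1.0, 20.0, 50.0)

# Scale the quoted ~440 MHz mode of the 50 cm cavity back down to 20 cm radius.
_, small_freq_hz = scale_cavity(900.0, 440e6, 50.0, 20.0)

print(f"scaled-up volume: {big_volume_l:.0f} L")
print(f"inferred 20 cm mode frequency: {small_freq_hz / 1e9:.2f} GHz")
```

The scaled-up volume lands close to the quoted $\sim 900$ L, and the inferred frequency of the smaller cavity is of order a GHz.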
To illustrate the kind of parameters that would be required for significant DFSZ axion sensitivity, Figure 5 also shows the approximate thermally-limited sensitivity for a corrugated toroidal cavity, as introduced in section II B 2. This is taken to have waveguide radius a = 13 cm, and R = 8 m; the resulting properties are summarised in Table IV. The naive estimate of the signal mode quality factor is high enough to put us in the regime of equation 19, even for a total integration time of a year.
As emphasised in section II B 2, this kind of projection should not be taken as a concrete experimental proposal; for that, one would need to understand the control issues, noise problems etc. associated with these cavity designs. Instead, it illustrates what would be necessary, in principle, to probe very small couplings.

B. Vibrations
While thermal noise must be taken into account in all axion detection experiments, up-conversion experiments have the additional challenge of a very large-amplitude background EM oscillation. If there are any environmental oscillations at the axion frequency, and any non-linear processes which couple these to the drive and signal modes, this will represent an additional noise source.
The most obvious such process is mechanical vibrations of the cavity walls. If we write $\delta x(t, x)$ as the displacement of the cavity wall from initial position $x$, and $n$ as the outward-pointing normal to the initial wall position, then to linear order in $\delta x$, the interaction Hamiltonian of the wall displacement with the EM fields is given by equation 20, and the resulting interaction between the drive and signal modes by equation 21. This interaction actually represents the signal mechanism for proposed SRF gravitational-wave detectors, such as MAGO [40][41][42]: the effect of a gravitational wave can be treated as a deformation of the cavity walls, which up-converts photons from the drive mode to the signal mode.
We can see from equation 21 that, if the drive and signal modes of the unperturbed cavity are orthogonal at the walls, then to linear order in $\delta x$, wall vibrations do not couple them. This is one reason why the TE$_{012}$/TM$_{013}$ mode pair introduced in section II B is attractive: in an ideal cylinder, these modes have $E_0 \cdot E_1 = 0$, $B_0 \cdot B_1 = 0$ throughout the whole cavity. However, it will not be possible to fabricate the cavity shape perfectly, and some deformations will result in non-orthogonal fields at the walls. For example, if the cavity cross-section is elliptical rather than circular, then the deformed TE$_{012}$ mode has a small electric field at the cylinder's walls, given by equation 22, where $f$ is the flattening of the ellipse ($f = (a - b)/a$, where $a \ge b \ge 0$ are the axis lengths), and the normalisation is set by taking the maximum magnetic field at the walls to be 0.2 T. Since the TM$_{013}$ mode has $E_r \propto \sin(3\pi z/d)$ at the walls, the effect of a cavity vibration is controlled by the dimensionless overlap $C$ defined in equation 23, where $A_w$ is the area of the cavity walls, and we have normalised so that a vibration with $\delta x = \sin(2\phi)\,\sin(2\pi z/d)\,\sin(3\pi z/d)\,x(t)$ has $C = 1$. Then, for monochromatic oscillations, the power transferred to the signal mode is given by equation 24, where we have taken the parameters of the 20 cm cylindrical cavity listed in table IV. Consequently, the displacement noise in the relevant bandwidth must be very small, in order not to overwhelm the axion signal power.
The example of an elliptical cavity illustrates the general feature that, for a cavity deformation of fractional size ∼ f , we expect the wall fields to be perturbed by ∼ f [43,44], and in general to be non-orthogonal. It may be possible to deform the cavity post-fabrication to alleviate such issues, but we do not attempt to consider such possibilities here. Conversely, for other cavity geometries, such as a corrugated waveguide, the wall fields are non-orthogonal even for an unperturbed cavity.
The other information we need to determine the vibration-induced noise spectrum is the vibration spectrum of the cavity's walls. At high enough frequencies, this will probably be dominated by thermal vibrations. If ultrasonic waves from outside the cavity are sufficiently attenuated, then the amplitude will be set by the cavity's physical temperature. The ultrasonic attenuation length scale in liquid helium is $\sim$ cm for frequencies $\gtrsim 10$ MHz [45,46], though it may be significantly smaller for metals [45,47]. Investigation would be required to determine the actual properties of a cavity setup.
At lower frequencies, external acoustic noise will not be strongly attenuated, and in addition, above-thermal noise sources (such as machinery) will be present. A nominal spectrum for this external displacement noise, in a reasonably 'quiet' setting, is $\sqrt{S_{xx}(\nu)} \simeq 10^{-7}\,\frac{\rm cm}{\sqrt{\rm Hz}} \left(\frac{10\,{\rm Hz}}{\nu}\right)^2$ (25) for $\nu \gtrsim 10$ Hz [48]. The measured acoustic noise in the vicinity of the MAGO prototype (figure 14 of [42]) also shows a $S_{xx} \sim 1/\nu^4$ spectral density for $\nu \gtrsim 3$ kHz, with displacement amplitude a factor $\sim 10$ higher than equation 25. At lower frequencies, especially with the helium cooling system in operation, there is a complicated, spiky spectrum with significantly larger amplitude.
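As a rough illustration of the nominal acoustic spectrum, the sketch below evaluates equation 25 read as an amplitude spectral density, i.e. $\sqrt{S_{xx}} \propto \nu^{-2}$ so that $S_{xx} \propto \nu^{-4}$. That reading, the function name and the default parameter names are assumptions made for this example.

```python
def acoustic_asd_cm_per_rthz(nu_hz, amplitude=1e-7, nu_ref_hz=10.0):
    """Nominal displacement amplitude spectral density, in cm / sqrt(Hz).

    Assumes sqrt(S_xx) = amplitude * (nu_ref / nu)^2, quoted as valid for nu >~ nu_ref.
    """
    if nu_hz < nu_ref_hz:
        raise ValueError("nominal spectrum is only quoted for nu >~ 10 Hz")
    return amplitude * (nu_ref_hz / nu_hz) ** 2


for nu in (10.0, 1e3, 3e3, 3e5):
    asd = acoustic_asd_cm_per_rthz(nu)
    # The MAGO measurement is described as a factor ~10 higher in amplitude above ~3 kHz.
    print(f"nu = {nu:8.0f} Hz: nominal {asd:.2e} cm/rtHz, MAGO-like {10 * asd:.2e} cm/rtHz")
```

The steep fall-off with frequency is why acoustic noise matters mainly at lower axion frequencies in the estimates that follow.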
We can compare these acoustic spectra to the spectrum from thermal oscillations by noting that, if we can treat the material as a bulk medium of large extent, with a linear acoustic dispersion relation, then the PSD of surface displacements is given by equation 26 [49], where $\rho$ is the density of the material, $T$ is the temperature, $v_s$ is the sound speed, and $p$ is a dimensionless constant governing the interaction of the surface with bulk phonons (e.g. for aluminium, $p \simeq 2.3$ [49]). The resulting numerical estimate (equation 27) suggests that, very roughly, thermal noise will dominate for $\nu \gtrsim$ MHz.
Properly calculating the effects of vibrations would require assumptions about the imperfections of the cavity shape, and modelling the mechanical modes of the cavity-cryostat system. Here, we only attempt to make a rough estimate of these effects. To calculate the vibration-induced noise power as a function of frequency, we need the PSD $S^v_{xx}$ for oscillations with profile set by $B_0 \cdot B_1 - E_0 \cdot E_1$ at the cavity walls. We will assume that this profile is reasonably smooth, corresponding to slowly-varying deviations from a cylindrical shape (e.g. the ellipse example above).
Decomposing the cavity vibrations into weakly-damped modes, there will be contributions to $S^v_{xx}$ from modes with different resonant frequencies. At $\omega$ much higher than the resonant frequency of low-order mechanical modes, the vibrational modes with similar frequencies will have small overlap with the (assumed smooth) profile, while the low-order vibrational modes will have much smaller frequencies. For thermal vibrations, the contribution from the latter will be $S_{xx} \sim \frac{1}{Q_m} \frac{T}{M \omega^3}$, where $M$ is the mass of the cavity walls, and $Q_m$ is the mechanical quality factor of the low-order modes. The modes with resonant frequency $\sim \omega$ will contribute $S_{xx} \sim (\lambda/L)^c\, \frac{T}{M \omega^3}$, where $\lambda \sim v_s/\omega$ is the wavelength of the modes, and $(\lambda/L)^c$ corresponds to the suppressed spatial overlap with the profile (with higher $c$ corresponding to smoother profiles). At frequencies low enough that individual modes stop overlapping in frequency, the contributions to $S_{xx}$ will be spikier. For $\omega$ around that of the low-lying mechanical modes, it will have maximum value $S_{xx} \sim Q_m \frac{T}{M \omega^3}$. For acoustic, rather than thermal, noise, the same general considerations will apply, but the effective $T$ in the above equations will be frequency dependent.
In figure 4, we display an example of the vibration-noise-limited sensitivity, using the estimates discussed above. Taking acoustic noise with spectrum as per equation 25, we find that vibration-induced noise dominates thermal EM noise for $\nu_a \lesssim 300$ kHz (we assume that $Q_m \simeq 10^3$ and $c = 2$, in the notation of the previous paragraph, and assume 3 mm thick cavity walls, giving a cavity mass of $\sim 20$ kg). The thermal vibrations, assuming $T = 1.4$ K, are sub-dominant. We emphasise that all of these estimates are best viewed as guesses, and a much more careful treatment would be necessary for a realistic experiment.

1. Frequency variability
So far, we have considered how vibrations can give rise to a coupling between the drive and signal modes. However, vibrations will also change the energy of each of these modes themselves. Typically, this 'detuning' will (before any compensation) be $O(10\,{\rm Hz})$, corresponding to $\sim$ nm-scale displacements of the cavity walls [50][51][52]. Most of this variation will come from low-frequency ($\lesssim O(10\,{\rm Hz})$) vibrations.
If these vibrations cause the frequency splitting between the drive and signal modes to change over time, by more than their bandwidths, then we are effectively changing the up-conversion tuning over time. This change in 'scan strategy' can affect the thermal noise limited sensitivity, as discussed in section III A.
For SRF accelerator cavities, large frequency detunings can increase the power needed to drive the cavity [53], as well as moving the accelerating voltage out of phase with the electron bunches [52]. Consequently, feedback systems are generally employed which measure the detuning of the cavity, and apply a mechanical deformation (via a piezo transducer) to correct this. Using such systems, frequency stabilization to $\lesssim$ Hz has been demonstrated [52]. In our case, measurement of the changing mode frequencies is very important, since any unknown variation of the frequencies over time will reduce sensitivity. Further investigation of how well such a measurement system (and feedback control system) could perform in our setup would be required.
In principle, it may also be possible to measure the higher-frequency mechanical vibrations of the cavity discussed above, and use this to compensate for, or subtract out, the induced noise in the signal mode. We do not attempt to model the feasibility of such measurements here.

C. Drive leakage
In an ideal experiment, the cavity input power would couple only to the drive mode, and the detector would be coupled only to the signal mode. For mode pairs such as our TE$_{012}$/TM$_{013}$ example, it is easy to see how this could be accomplished in principle. Input and output is usually accomplished through small slots in the cavity wall, which connect to waveguides. For a slot much smaller than the wavelength of a cavity mode, the coupling is approximately $\propto B_{\rm slot} \cdot B_{\rm cav}$, where $B_{\rm cav}$ is the mode's magnetic field at the cavity wall [54]. Since the signal and drive modes are everywhere orthogonal, a correctly polarized input/output port should, to a good approximation, couple only to one of them. The different spatial profiles of the modes (see Figure 2) provide an extra discriminator: for example, we could place an input port at a wall location where the signal mode's magnetic field is very small, and an output port where the drive mode is small.
However, as was the case for the vibrational couplings discussed above, non-perfect fabrication of the cavity will spoil these assumptions. For example, if the output slot was slightly misaligned, it would have a small coupling to the drive mode. Generally, for a Fourier-domain signal $S_I(\omega)$ at the input port, and assuming that the drive and signal modes are the only ones at nearby frequencies, the output signal $S_O(\omega)$ is given by equation 28. Here, $C_{oD}$ represents the coupling between the drive mode and the output port, and $G_D(\omega)$ is the response function of the drive mode, etc. For a well-constructed cavity, $C_{oD}$ and $C_{Si}$ will be small, but still non-zero. Since the signal and drive modes have very narrow-bandwidth response functions, we expect $S_O(\omega)$ to be peaked around $\omega_0$ and $\omega_1$.
Naively, the output signal near ω 0 is not an issue -we can just Fourier transform the output data, and ignore it. However, the non-ideal nature of the output electronics may introduce problems. Most directly, if the amplitude of the ∼ ω 0 noise, relative to the ∼ ω 1 signal, is too large, then the dynamic range of the amplifier may be exceeded. One way to address this is to place a frequency filter on the output port, to reject frequencies too far from ω 1 . However, if this is not enough (which is especially likely at smaller m a ), it may be necessary to reduce C oD through some feedback control mechanism.
Using the nominal experimental parameters from above, the energy stored in the drive mode of the 60 L cavity is $\sim 700$ J. If the output port were critically coupled to the drive mode, this would result in $\sim 30$ W output power. At the thermal-noise-limited sensitivity (section III A), the axion-sourced signal power is $P_{\rm sig} \simeq 2 \times 10^{-23}$ W for $\nu_a \sim$ MHz (assuming an integration time of 1 year per e-fold in axion mass range). If we assume an amplifier with a dynamic range of 50 dB, then to avoid swamping this signal, the output power needs to be a factor of $\sim 10^{-20}$ lower than the 30 W level, corresponding to a $\sim 10^{-10}$ amplitude suppression. These numbers make it clear why leakage suppression might be challenging. Furthermore, in addition to leakage through the cavity, we would also need to worry about leakage through the laboratory environment. An experimental rule of thumb is apparently [55] that suppressions of more than $\sim 10^{-19}$ in power are difficult to achieve, for electronics in the same laboratory space.
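The leakage budget in this paragraph is simple arithmetic, sketched below using only the numbers quoted in the text (30 W at critical coupling, $2 \times 10^{-23}$ W of signal, 50 dB of dynamic range). The variable names are illustrative, and the result should be read at the order-of-magnitude level.

```python
import math

P_LEAK_CRITICAL_W = 30.0   # drive power at the output port if critically coupled
P_SIGNAL_W = 2e-23         # axion-sourced signal power for nu_a ~ MHz
DYNAMIC_RANGE_DB = 50.0    # assumed amplifier dynamic range

# Largest leakage power that still keeps the signal within the amplifier's dynamic range.
p_leak_max_w = P_SIGNAL_W * 10 ** (DYNAMIC_RANGE_DB / 10)

power_suppression = p_leak_max_w / P_LEAK_CRITICAL_W
amplitude_suppression = math.sqrt(power_suppression)

print(f"maximum tolerable leakage power: {p_leak_max_w:.1e} W")
print(f"required power suppression:      {power_suppression:.1e}")
print(f"required amplitude suppression:  {amplitude_suppression:.1e}")
```

The required power suppression comes out between $10^{-20}$ and $10^{-19}$, which is comparable to the rule-of-thumb limit quoted for electronics sharing a laboratory space.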
These kinds of issues were encountered by the MAGO SRF gravitational-wave detector mentioned in the previous section. Their setup used two identical cavities with a small coupling between them, giving rise to almost degenerate symmetric and antisymmetric modes. To drive the symmetric mode, a magic-tee was used to split the drive signal into in-phase components, while to sample the antisymmetric mode, another magic-tee was used to take the difference of outputs from each cavity. Since the drive and signal modes had the same profiles in each cavity, this relative phase was the only way of distinguishing between them. With this setup, they achieved a power suppression of ∼ 48dB for the output, corresponding to an amplitude suppression of ∼ 1/250 [40]. The output power was dominated by the ∼ ω 0 peak.
To improve this, they implemented a feedback system, using a variable phase shifter to control the phase of one of the input ports (and another to control the phase of an output port). This phase shifter was controlled by a feedback loop, designed to suppress the output power. Using this system, they achieved an output power suppression of $\sim 140$ dB [40] (which would seem to indicate that the amplitudes at the two output lines were matched to one part in $\sim 10^7$ by the geometry of the setup, unless they also applied feedback control to the amplitudes). One could imagine other ways of implementing feedback schemes, for example combining the (attenuated) drive signal with the output directly, but the MAGO scheme provides a concrete example of the kind of control system that may be required. In our case, the extra geometric rejection provided by the profiles of the drive and signal modes, along with a high-Q bandpass filter on the output, make it seem plausible that sufficiently high rejections could be obtained.
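The decibel figures quoted for MAGO convert to power and amplitude ratios as in the following sketch. It uses the standard definitions (power ratio $10^{-{\rm dB}/10}$, amplitude ratio $10^{-{\rm dB}/20}$); the helper names are hypothetical.

```python
def db_to_power_ratio(suppression_db):
    """Power ratio corresponding to a suppression given in dB."""
    return 10 ** (-suppression_db / 10)


def db_to_amplitude_ratio(suppression_db):
    """Amplitude ratio corresponding to a power suppression given in dB."""
    return 10 ** (-suppression_db / 20)


for label, db in (("passive magic-tee setup", 48.0), ("with phase feedback", 140.0)):
    amp = db_to_amplitude_ratio(db)
    print(f"{label}: {db:.0f} dB -> power {db_to_power_ratio(db):.1e}, amplitude 1/{1 / amp:.3g}")
```

This reproduces the amplitude suppression of about 1/250 for 48 dB, and the one-part-in-$10^7$ amplitude matching implied by 140 dB.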
If the drive signal has narrow bandwidth compared to the frequency splitting between the drive and signal modes, then leakage at frequencies close to ω 1 will be suppressed. However, it may interfere with the axion signal we hope to detect, so more care is required in dealing with it. Since we can measure the input drive signal, it does not necessarily represent irreducible noise -if we know the transfer function from the input to the output, e.g. via the signal mode of the cavity, then we can simply subtract it out. This can either be done in software, or via analogue mixing (though in the former case, dynamic range issues may still arise).
The more worrying case is if the transfer function varies over time, in an a-priori unknown way (for example, MAGO observed that temperature fluctuations in their input and output cables affected the phase shift experienced by the signal [42]). As a quantitative example, suppose that the time delay $\delta(t)$ along the signal path is time-varying, $\delta(t) = \delta \cos \omega_\delta t$. For $\omega_0 \delta \ll 1$, we can use the Jacobi-Anger expansion [56], $e^{i \omega_0 (t + \delta \cos \omega_\delta t)} = e^{i \omega_0 t} \left( J_0(\omega_0 \delta) + 2 i J_1(\omega_0 \delta) \cos \omega_\delta t + \dots \right)$ (29). So, given leakage at frequencies $\sim \omega_0$, the amplitude of the noise component at frequencies close to $\omega_1$ is $\sim (\omega_0 \delta)$ times the constant leakage amplitude. This gives a noise power, for our nominal setup, expressed in terms of $|C_{Si}|$, which we take to be the amplitude suppression, compared to critical coupling, of the output ports to the drive mode. This can be translated into an equivalent noise temperature, for a $\delta$ PSD of fractional bandwidth $1/Q_\delta$, and total power $\delta^2$. Consequently, unless the temporal variation in transfer characteristics is substantial, this noise contribution is likely to be smaller than e.g. that of the acoustic noise considered in the previous section. In figure 4, we show the effect on sensitivity for $Q_\delta \simeq 1$, $\omega_0 \delta \simeq 0.1$, illustrating that, for these parameters, it is fairly similar to our estimate for thermal vibrations.
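The first-order truncation of the Jacobi-Anger expansion can be checked numerically. The sketch below is a standalone illustration, assuming $z = \omega_0 \delta = 0.1$ as in the figure 4 example; the Bessel functions are computed from their power series with the standard library, and all function names are hypothetical.

```python
import cmath
import math


def bessel_j(n, z, terms=20):
    """Bessel function J_n(z) for integer n >= 0, from its power series."""
    return sum(
        (-1) ** k / (math.factorial(k) * math.factorial(k + n)) * (z / 2) ** (2 * k + n)
        for k in range(terms)
    )


def exact_phase_factor(z, theta):
    """exp(i z cos(theta)), the modulation factor multiplying exp(i w0 t)."""
    return cmath.exp(1j * z * math.cos(theta))


def first_order_expansion(z, theta):
    """Leading terms of Jacobi-Anger: J_0(z) + 2 i J_1(z) cos(theta)."""
    return bessel_j(0, z) + 2j * bessel_j(1, z) * math.cos(theta)


def max_truncation_error(z, samples=200):
    """Largest deviation between the exact factor and the truncation over one period."""
    thetas = (2 * math.pi * k / samples for k in range(samples))
    return max(abs(exact_phase_factor(z, t) - first_order_expansion(z, t)) for t in thetas)


z = 0.1  # omega_0 * delta
print(f"sideband amplitude 2 J_1(z) = {2 * bessel_j(1, z):.4f} (compare z = {z})")
print(f"max truncation error        = {max_truncation_error(z):.1e}")
```

For small $z$ the sideband amplitude $2 J_1(z)$ is close to $z$ itself, which is the statement that the noise component is $\sim (\omega_0 \delta)$ times the leakage amplitude, while the neglected terms are of order $z^2$.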
Of course, as well as this in-principle issue, there may be technical issues in suppressing the ∼ ω 1 leakage, especially for low-dynamic-range amplifiers. For the purposes of our sensitivity estimates, we will assume that these are surmountable at high axion masses, and sub-dominant to vibrational noise at lower ones.
If there are other modes close-to-degenerate with the drive and signal modes, then these may affect signal leakage. For example, in a perfect cylindrical cavity, the TE$_{012}$ mode is exactly degenerate with the TM$_{112}$ mode. While such degeneracies will be lifted by the inevitable cavity shape imperfections (e.g. the $\sim 10^{-3}$ fractional deformations we were assuming would generically give $\sim$ MHz splittings), if they are still a problem, then it may be simplest to give the cavity an intentional, slight deformation, as was done for the cylindrical MAGO prototype [40].
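The TE$_{012}$/TM$_{112}$ degeneracy follows from the textbook formula for an ideal cylindrical (pillbox) cavity, $f = \frac{c}{2\pi}\sqrt{(x/R)^2 + (p\pi/d)^2}$, where $x$ is a zero of $J_m$ for TM modes and of $J_m'$ for TE modes; since $J_0' = -J_1$, the two modes share the same root. The sketch below assumes $R = 20$ cm and infers the length $d$ from the quoted 60 L volume; that length, and the resulting frequencies, are illustrative assumptions rather than values taken from the paper's tables.

```python
import math

C_LIGHT_M_PER_S = 299_792_458.0

# First zeros of J_0 and J_1 (TM modes) and of J_0' (TE modes). J_0' = -J_1, so the
# TE_0n and TM_1n roots coincide, which is the origin of the degeneracy.
BESSEL_ROOTS = {
    ("TM", 0, 1): 2.404826,
    ("TM", 1, 1): 3.831706,
    ("TE", 0, 1): 3.831706,
}


def mode_frequency_hz(kind, m, n, p, radius_m, length_m):
    """Resonant frequency of the TE/TM_{mnp} mode of an ideal cylindrical cavity."""
    x = BESSEL_ROOTS[(kind, m, n)]
    k = math.hypot(x / radius_m, p * math.pi / length_m)
    return C_LIGHT_M_PER_S * k / (2 * math.pi)


RADIUS_M = 0.20
LENGTH_M = 0.060 / (math.pi * RADIUS_M**2)  # assumption: length chosen to give 60 L

te012 = mode_frequency_hz("TE", 0, 1, 2, RADIUS_M, LENGTH_M)
tm112 = mode_frequency_hz("TM", 1, 1, 2, RADIUS_M, LENGTH_M)
tm013 = mode_frequency_hz("TM", 0, 1, 3, RADIUS_M, LENGTH_M)

print(f"TE012: {te012 / 1e9:.4f} GHz, TM112: {tm112 / 1e9:.4f} GHz")
print(f"TM013: {tm013 / 1e9:.4f} GHz, TE012 - TM013 = {(te012 - tm013) / 1e6:.1f} MHz")
```

In this idealised geometry the TE$_{012}$ and TM$_{112}$ frequencies coincide exactly, while the TE$_{012}$/TM$_{013}$ splitting depends sensitively on the assumed aspect ratio, in line with the tuning discussion referenced in section II A.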

D. Free charges
As well as wall effects, such as the vibrational backgrounds considered in Section III B, EM noise can also arise through currents inside the volume, i.e. from the movement of free charges inside the cavity. These charges may originate from the cavity walls via field emission, or may come from outside the cavity.
For initially low-energy charged particles, such as those arising from field emission, the oscillating drive field inside the cavity will have a significant effect on their motion, and it is necessary to solve for their trajectories inside the cavity. However, for particles with sufficiently high initial kinetic energy, the effect of the drive field on the trajectory is unimportant, and the particle effectively travels in a straight line. This is the case for e.g. cosmic ray muons, which provide a simple test case we can work out.
We are interested in the effect of the charged particle on the signal mode. Working in a gauge where $A_0 = 0$, we can write the interaction Hamiltonian as $H_{\rm int} = q\, v(t) \cdot A(t, x(t))$. If we write the vector potential for the signal mode as $A(t, x) = A_s(t)\, a(x)$, then the dynamics of $A_s$ are analogous to those of a harmonic oscillator with 'mass' $\int dV\, a^2 \equiv V_a$ [2]. Writing $H_{\rm int} = (q\, v(t) \cdot a(x(t)))\, A_s(t) \equiv j(t) A_s(t)$, the linear response function governing the response of $A_s$ to the forcing $j$ is $\tilde\chi(\omega)$. The expected energy delivered to the signal mode is determined by ${\rm Im}\,\tilde\chi$ and $\tilde j$, where the averaging is taken over different phases of the signal mode oscillation. For a high quality factor mode, ${\rm Im}\,\tilde\chi$ will be very narrowly peaked compared to $\tilde j$, so $\tilde j$ can be evaluated at the mode frequency. Consequently, for a low-lying mode of the cavity, a single relativistic transit of a charged particle will deposit, on average, $\sim C q^2$ photons into the signal mode, where $q$ is the charge of the particle, and $C$ is a dimensionless geometric overlap factor. It should be noted that, in the presence of an oscillation of definite phase in the signal mode (for example, due to thermal fluctuations), there will also be an $O(q)$ contribution to the energy absorbed. However, as discussed in [2], the energy uncertainty of a coherent state is also larger, and the relative detectability of the perturbation is still set by $W/T$.
The flux of cosmic ray muons at sea level is $\sim 1/((10\,{\rm cm})^2\,{\rm s})$ [57]. They have an average energy of $\sim 4$ GeV, so are certainly high-energy for the purposes of our calculation. Considering our nominal 20 cm-radius cylindrical cavity, there are $\lesssim 20$ muons passing through the cavity per second, corresponding to a delivered power of $\sim \bar{C} \times 10^{-24}$ W to the signal mode, where $\bar{C}$ is the average value of $C$ over different trajectories. A large value of $C$, e.g. for a trajectory along the central z-axis, is $C \simeq 0.17$. In comparison, the thermal-noise-limited signal power, assuming an integration time per e-fold in axion mass range of one year, is $P_{\rm sig} \simeq 2 \times 10^{-23}\,{\rm W}\,(\nu_a/{\rm MHz})$.
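The muon estimate can be reproduced to order of magnitude as sketched below. Two inputs are assumptions not stated in this section: the charge is taken in natural (Heaviside-Lorentz) units, $q^2 = 4\pi\alpha$, and the signal mode frequency is taken to be about 1.1 GHz (obtained by scaling the quoted 440 MHz of the 50 cm cavity down to 20 cm radius). The variable names are illustrative.

```python
import math

PLANCK_H_J_S = 6.62607015e-34
ALPHA_EM = 1 / 137.035999

MUON_FLUX_PER_CM2_S = 1.0 / 10.0**2   # ~ one muon per (10 cm)^2 per second
CAVITY_RADIUS_CM = 20.0
SIGNAL_FREQ_HZ = 1.1e9                # assumption: inferred from geometric scaling
MUON_RATE_BOUND_PER_S = 20.0          # upper bound quoted in the text

# Rate through the horizontal end-cap area, as a cross-check on the quoted bound.
end_cap_rate_per_s = MUON_FLUX_PER_CM2_S * math.pi * CAVITY_RADIUS_CM**2

# Each transit deposits ~ C q^2 photons; q^2 = 4 pi alpha is an assumed unit convention.
q_squared = 4 * math.pi * ALPHA_EM
photon_energy_j = PLANCK_H_J_S * SIGNAL_FREQ_HZ

# Delivered power per unit overlap factor C, using the quoted rate bound.
power_per_unit_c_w = MUON_RATE_BOUND_PER_S * q_squared * photon_energy_j

print(f"end-cap muon rate: {end_cap_rate_per_s:.1f} per second")
print(f"delivered power:   {power_per_unit_c_w:.1e} W x C")
```

Under these assumptions the delivered power is of order $10^{-24}$ W times the overlap factor, below the thermal-noise-limited signal power at $\nu_a \sim$ MHz.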
Since we do not expect the flux of other high-energy charged particles to be significantly larger than the muon flux, we can conclude that cosmic rays will not be a significant background at higher ν a . These calculations would apply to standard cavity haloscopes as well, with similar results (see also [58]).
For cavities with small enough electric fields at the walls, field emission should be negligible. Taking the previous example of a slightly elliptical cavity, if the TE$_{012}$ drive mode has maximum magnetic field of 0.2 T at the walls, then from equation 22, the maximum wall electric field is $E_{\rm max} \simeq 40\,{\rm kV/m}\,(f/10^{-3})$, where $f$ is the flattening of the ellipse. Since electric fields $\gtrsim$ few MV/m are required for field emission, even fairly loose mechanical tolerances should be enough to suppress it. This is in contrast to the modes used in SRF cavities for particle acceleration, such as the TM$_{010}$ mode: to obtain an accelerating voltage down the axis of the cavity, these have large electric fields at the walls.
Other cavity geometries, such as a corrugated waveguide, do not necessarily have such suitable drive modes with small electric fields at the walls. In these cases, proper simulations would need to be carried out to determine whether the resulting electron trajectories transfer too much energy to the signal mode. If the rate of field emission is high enough, then multiple electrons could contribute coherently to such energy transfer, potentially worsening the problem.

IV. SENSITIVITY COMPARISONS
As mentioned in the introduction, there are a number of different approaches to low-frequency axion DM detection through the $F\tilde{F}$ coupling. Whether SRF up-conversion experiments are worth pursuing depends on their plausible sensitivity, relative to these alternatives.
Static background field experiments, such as the ABRACADABRA [9] and DM Radio [10] proposals, offer the best theoretical sensitivity at higher axion masses. In Figure 5, we take the DM Radio (DMR) sensitivity projections [68], and compare them to the sensitivity projections for the nominal SRF experiments discussed in the previous section, with parameters summarised in Table IV. Compared to the nominal DMR parameters, the corresponding SRF experiments have significantly smaller RMS magnetic fields, but gain by avoiding the quasi-static geometric suppression, and by having higher target mode quality factor. Correspondingly, they have different scaling of sensitivity with axion mass. Figure 5 illustrates that, for significant QCD axion sensitivity in a SRF up-conversion experiment, either very large (many cubic meters) conventional cavities, or few cubic meter advanced cavities, would be required (ignoring potential quantum enhancements). This is true even if we only take thermal noise into account; further study would be required to determine whether other noise sources could be mitigated. As discussed in the previous section, such mitigations will likely work best at higher axion masses; more realistic sensitivity projections for the larger cavities would likely have shapes similar to the small-scale cylinder projection of figure 4.
TABLE IV. [...] Figure 5. In each case, the maximum magnetic field at the cavity walls is taken to be 0.2 T. The various quantities are defined in section II. Note that the quality factors given are in the sense of dissipation, i.e. $P_{\rm diss} = \omega U/Q$, rather than frequency stability. The latter will depend on how well vibrations can be controlled, as per section III B 1.
We can also compare SRF up-conversion experiments to those at optical frequencies, as proposed in [12][13][14]. In figure 5, we show the projections for the ADBC experiment [14], taking the 2 meter cavity version for commonality with the other meter-scale experiments on the plot. They assume a circulating optical power of 10 kW, giving a stored energy of $\sim 10^{-4}$ J. This is many orders of magnitude below the cavity energies of the SRF experiments, and the shot-noise-limited sensitivity shown in the plot is correspondingly many orders of magnitude worse. The ADBC projection does not consider experimental issues such as polarizer efficiency, vibrational noise, etc, so the sensitivity of a realistic experiment may be worse. Of course, optical experiments may also be easier and/or cheaper to implement than SRF experiments.

V. CONCLUSIONS
The nominal SRF experiments we considered above have signal power limited by the maximum magnetic field at the cavity walls. Apart from increasing the cavity size, one possibility for improving this is to use a different superconducting material for the walls. For example, Nb$_3$Sn has $H_{\rm sh} \simeq 0.41$ T, and has been extensively investigated as a potential alternative to niobium [15,69]. While existing fabrication methods lead to worse high-field performance than niobium (potentially due to defects), research into improving these is ongoing [70]. Another potential benefit of alternative materials is that they can have higher $T_c$ (e.g. $T_c \simeq 18$ K for Nb$_3$Sn [15]); this can help by suppressing $R_{\rm BCS}$, or by allowing operation at higher temperatures, where cooling systems are more efficient.
If the thermal noise limited sensitivity projections can actually be achieved, then further progress (for the same cavity geometry) would depend on 'quantum engineering'. As discussed in [2], preparing the target mode in a non-classical state, such as a squeezed state or a Fock state, can improve sensitivity. At microwave frequencies, such technologies are being developed as part of the $\gtrsim$ GHz axion detection program. For example, squeezed state injection is being incorporated into Phase II of the HAYSTAC experiment [39].
Looking beyond axion DM detection experiments, the more complicated cavity geometries we have discussed may be useful in other situations where large, high-frequency magnetic fields are required. For example, microwave frequency light-through-wall experiments to search for hidden sector particles, independent of whether they are DM, have been proposed [25,71], and a dark photon experiment of this form is being constructed at Fermilab [23]. The signal strength in these experiments depends on how large a field amplitude can be sustained in the driven cavity. Understanding whether or not high-field cavities can be constructed with the correct geometry to source hidden-sector particles would require further work.
Coming back to the prospects for low-frequency axion DM detection, a pathfinder SRF experiment that covers significant ALP parameter space, along the lines of our nominal 60L cavity, seems plausibly feasible. In principle, larger volumes and/or more advanced cavity geometries could allow for QCD axion sensitivity, at least insofar as fundamental noise limits are concerned. Further work would be required to understand whether other backgrounds could be mitigated. As an alternative to static background field approaches, SRF up-conversion would present very different experimental challenges, and may be worth further investigation. To draw an analogy, resonant bar detectors and laser interferometers represented very different approaches to gravitational wave detection. While static-field experiments may well prove to be more practical, it is important to understand the alternatives, especially as technology (such as superconducting materials) evolves.
Similar experimental concepts to those we have discussed are also proposed in [72], which should appear on the arXiv at the same time as this paper.
FIG. 4. Sensitivity projection for an up-conversion experiment using a 60 litre (20 cm radius) cylindrical cavity, where the TE012 mode is driven, and signals are picked up in the TM013 mode. We assume that the maximum magnetic field for the drive mode at the cavity wall is H sh = 0.2 T, and an integration time of one year per e-fold in axion mass range. The 'thermal EM' line shows the sensitivity limit in the presence of thermal EM noise, at the assumed physical temperature of 1.4 K (see section III A). The 'thermal vibrations' line is an estimate of the effect of thermal vibrations of the cavity walls, which up-convert power from the signal mode to the drive mode (section III B). The 'acoustic vibrations' line also incorporates extra acoustic noise from external sources. The dashed line between them is an estimate of the effects of time-varying drive signal leakage (section III C). Taking all of these noise sources into account, the estimated sensitivity reach of the experiment is given by the blue shaded region. It should be noted that, while the 'thermal EM' limit is set by basic physical parameters, the vibrational limits depend on guesses about the properties of the cavity system, and could be very different in a real experiment. Also, while we have extended the projected reach to axion frequencies ∼ 100 MHz, scanning an order one axion mass range at these frequencies would be difficult (see section II A). In Figure 5, this reach is compared to other proposed experiments. The gray shaded regions correspond to the parameter space ruled out by observations of horizontal branch stars [59][60][61], existing cavity haloscope experiments [6,[62][63][64][65][66], and the ABRACADABRA-10cm experiment [67]. The green diagonal band corresponds to the 'natural' range of gaγγ values at each QCD axion mass -if we write gaγγ = α EM 2πfa E N − 1.92 [3], then the upper edge of the band is at E/N = 5 [4], and the lower edge at E/N = 2 [3]. 
The gray diagonal lines indicate the KSVZ (upper, E/N = 0) and DFSZ (lower, E/N = 8/3) models.
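As an illustration of the band definition in the Figure 4 caption, the following minimal Python sketch evaluates $|g_{a\gamma\gamma}|$ for a given axion mass and $E/N$. It assumes the standard QCD axion relation $m_a \simeq 5.7\,\mu{\rm eV} \times (10^{12}\,{\rm GeV}/f_a)$, which is not stated in the text above, and the function names are hypothetical.

```python
import math

ALPHA_EM = 1 / 137.036  # fine-structure constant

def f_a_from_mass(m_a_ueV):
    """Decay constant f_a in GeV.

    ASSUMPTION (not from the text): m_a ~ 5.7 ueV * (1e12 GeV / f_a).
    """
    return 5.7e12 / m_a_ueV

def g_agg(m_a_ueV, e_over_n):
    """|g_agammagamma| in GeV^-1 from the caption formula
    g = alpha_EM / (2 pi f_a) * (E/N - 1.92)."""
    f_a = f_a_from_mass(m_a_ueV)
    return ALPHA_EM / (2 * math.pi * f_a) * abs(e_over_n - 1.92)

if __name__ == "__main__":
    m_a = 1.0  # axion mass in micro-eV, chosen purely for illustration
    for label, ratio in [("KSVZ", 0.0), ("DFSZ", 8 / 3),
                         ("band upper edge", 5.0), ("band lower edge", 2.0)]:
        print(f"{label:16s} E/N = {ratio:.3f}  |g| = {g_agg(m_a, ratio):.3e} GeV^-1")
```

In this parametrisation the coupling scales linearly with $m_a$, and the KSVZ line lies above the DFSZ line by the fixed factor $1.92/(8/3 - 1.92) \approx 2.6$.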
FIG. 5. Comparison of sensitivity projections for different kinds of low-frequency axion DM detection experiments. The '60L cylinder' region corresponds to the 20 cm cavity up-conversion experiment described in the text, with parameters given in table IV (we assume an integration time of 1 year per e-fold in axion mass range). We also display the thermal-noise-limited sensitivities for the larger up-conversion experiments listed in table IV. The '900L cylinder' line corresponds to a scaled-up TE012 → TM013 experiment, with 50 cm radius. The '8m toroid' line corresponds to a corrugated toroidal cavity, as described in section II B 2; we assume an overall radius of 8 m, a waveguide radius of 13 cm, and a frequency of 2.8 GHz, resulting in the parameters listed in table IV. We leave an analysis of the less fundamental noise sources for these larger up-conversion experiments to future work. As a representative of static-magnetic-field experiments, we take the Dark Matter Radio proposal [10,68]. The 'DMR 50L' line corresponds to a 50 litre experiment, with a background magnetic field of 0.5 T, and a physical temperature of 10 mK; the resonator quality factor is taken to be $10^6$, with an integration time of 6 months per e-fold in axion mass. The 'DMR 1000L' line corresponds to a cubic metre volume, with a background magnetic field of 4 T. We also show an example of a proposed optical-frequency up-conversion experiment, the 'ADBC' proposal from [14]. This assumes a 2 metre Fabry-Perot cavity, with a circulating optical power of 10 kW. The existing haloscope limits, QCD axion band, and HB star constraints are as per Figure 4.
where we have used conservation of the SET, $\partial_\mu T^{\mu\nu} = 0$. Then, time-averaging over an oscillation period of the fields inside the cavity, Using the boundary conditions at the conducting walls, $n \times E = 0$, $n \cdot B = 0$, and the form of the Maxwell SET, we have where $n$ is the outward-pointing normal to the cavity wall. Thus, Physically, this is an integrated version of the Slater formula for the energy change due to cavity wall deformations [75]. If we imagine dilating the cavity dimensions from their initial value to zero, while keeping the wall fields fixed, we obtain equation A7.
FIG. 6. Signal diagram of a cavity readout system using an amplifier isolated by a circulator, as discussed in appendix B. The $T_0$ load represents the thermal noise from the environment (i.e. the cavity walls etc.), while the $T_c$ load is a cold load that absorbs the amplifier's back-action noise.
than the signal frequency $\omega_1$. However, as discussed in [2,37], it is possible to reduce the thermal noise contaminating the signal by overcoupling the signal mode to a 'cold' detector. Here, we review these SNR calculations, as they apply to our setup. Figure 6 illustrates the readout system we assume, with an amplifier isolated behind a circulator, and a cold load absorbing the amplifier's back-action noise. If the output port is coupled to the signal mode $\xi$ times more strongly than environmental dissipation (i.e. a mode fluctuation loses $\xi$ times more of its energy to the output port), then the transmission coefficient for thermal noise from the walls is $A(\omega) = \frac{4\xi}{(1+\xi)^2}\cos^2\alpha(\omega)$, where $\cos^2\alpha$ is as per section II. If we assume that the output line is impedance matched to its load, so that no reflections are sent back to the cavity, then the (single-sided) noise PSD at the detector input is where $T_c$ is the effective temperature of the back-action noise from the detector. Here, $S_T(\omega) = n_T(\omega)\,\omega$, where $n_T(\omega) \equiv (e^{\omega/T} - 1)^{-1}$ is the thermal occupation number, so $S_T \simeq T$ for $T \gg \omega$. As expected, if $T_0 = T_c$, then the system is in equilibrium, and the noise PSD is the same at all frequencies.
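The thermal quantities defined above are simple to evaluate numerically. The sketch below, in natural units ($\hbar = k_B = 1$), implements $n_T$, $S_T$ and the transmission coefficient $A$ as defined in the text. The function `s_in_assumed` is an assumption rather than a formula quoted from the text: it takes the wall noise to be transmitted with coefficient $A$ and the remainder $1 - A$ to come from the cold load, which reproduces the stated equilibrium behaviour at $T_0 = T_c$. All function names are hypothetical.

```python
import math

def occupation(omega, T):
    """Thermal occupation number n_T = 1 / (exp(omega/T) - 1)."""
    if T <= 0:
        return 0.0                 # zero-temperature limit
    x = omega / T
    if x > 700:
        return 0.0                 # avoid overflow; n_T is negligible here
    return 1.0 / math.expm1(x)

def s_thermal(omega, T):
    """Thermal noise PSD S_T = n_T * omega (approaches T for T >> omega)."""
    return occupation(omega, T) * omega

def transmission(xi, cos2_alpha=1.0):
    """A = 4 xi / (1 + xi)^2 * cos^2(alpha), as given in the text."""
    return 4.0 * xi / (1.0 + xi) ** 2 * cos2_alpha

def s_in_assumed(omega, xi, T0, Tc, cos2_alpha=1.0):
    """ASSUMED form of the detector-input noise PSD:
    wall noise transmitted with A, cold-load noise with (1 - A)."""
    A = transmission(xi, cos2_alpha)
    return A * s_thermal(omega, T0) + (1.0 - A) * s_thermal(omega, Tc)

if __name__ == "__main__":
    omega = 1.0
    print("S_T / T at T = 100 omega:", s_thermal(omega, 100.0) / 100.0)
    print("A at critical coupling  :", transmission(1.0))
    print("S_in, T0 = Tc = 2       :", s_in_assumed(omega, 5.0, 2.0, 2.0, 0.3))
```

With this assumed form, increasing $\xi$ well beyond 1 lowers $A$ and so replaces wall noise at $T_0$ with cold-load noise at $T_c$, which is the mechanism the overcoupling argument relies on.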
Assuming a high-gain amplifier, the noise PSD at the amplifier output also has contributions from the amplifier's output noise, and from the amplification of vacuum fluctuations at the input. We can refer the amplifier's output fluctuations to its input by dividing by the power gain $G$, For a phase-insensitive amplifier, if the input state is coherent, then $S_{\rm vac} = \frac{\omega}{2}$. For an SQL-limited amplifier, $S_{\rm amp} = \frac{\omega}{2}$ as well, so vacuum plus amplifier noise combine to give 'a single photon' of output noise, in the usual phrasing [76]. Below, we will write $S_a \equiv S_{\rm vac} + S_{\rm amp}$, which for a phase-insensitive amplifier with coherent input state is $\geq \omega$.
The PSD of absorbed power from the axion field can be written as where $j(t) = \dot{a}(t) B_0(t) V_b$ (in the notation of section II), and $P_0$ is the power that would be absorbed on-resonance from a monochromatic $j(t) = j_0 \cos^2(\omega t)$ oscillation. For a top-hat $j$ spectrum of bandwidth $\delta\omega_a$, we would have $S_{jj}/|j_0|^2 = 2\pi/\delta\omega_a$. A fraction $\frac{\xi}{1+\xi}$ of this will enter the detector port. Consequently, the ratio of signal to noise PSDs, referred to the amplifier input, is From the Dicke radiometer formula [77], the SNR contribution from a small frequency bin, of bandwidth $\delta\nu$, is $\mathrm{SNR} \simeq \frac{S_{\rm sig}}{S_n}\sqrt{t_1\,\delta\nu}$, where $t_1$ is the integration time. The contributions from different frequency bins add in quadrature, so assuming that the integration time is long enough to resolve the spectral features of $S_{\rm sig}$ and $S_n$, we have At this point, the obvious question is what value of $\xi$ we should select to maximise the total SNR, in different circumstances. To answer this, it is helpful to extract the $\xi$ dependence from $P_0$ and $\cos^2\alpha$; this gives in [37]. If $S_{jj}$ is even narrower than the unloaded bandwidth of the signal mode, then to maximise sensitivity at the axion mass we are tuned to, we simply want to maximise $S_{\rm sig}/S_n$ on-resonance, which is achieved at $\xi = 1$ (i.e. the usual critical coupling [76]). If, on the other hand, $S_{jj}$ is wide compared to the unloaded signal mode bandwidth, then we want to maximise
$$\mathrm{SNR}^2 \simeq t_1 \left(\left.\frac{S_{jj} P_0}{|j_0|^2}\right|_{\omega_1}\right)^2 \int d\nu \left(\frac{\xi}{4\xi\,(S_{T_0} - S_{T_c}) + \ldots}\right)^2 \qquad \text{(B7)}$$
We write $\xi_{\rm opt}$ for the value of $\xi$ that maximises this integral. For example, if $T_c = T_0$, then $\xi_{\rm opt} = 2$. Another case of physical interest is if $T_0 \gg T_c, S_a$. For an SRF cavity, while cooling the cavity walls to below 1 K would be prohibitively difficult, realising a cold load at significantly lower temperatures, and an amplifier noise temperature $\ll 1$ K, is feasible. In this case, $\xi_{\rm opt} \simeq \frac{2 S_{T_0}}{S_{T_c} + S_a}$. The best achievable parameters for an SQL-limited amplifier are $T_c = 0$, $S_a = \omega$, in which case $\xi_{\rm opt} \simeq 2T_0/\omega_1$. This is the optimum overcoupling found in [37].
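The two limiting values of $\xi_{\rm opt}$ quoted above can be checked numerically. The sketch below assumes a Lorentzian mode response whose width scales as $1 + \xi$, so that after integrating over frequency the $\xi$ dependence of the wide-signal SNR$^2$ reduces to $\xi^2 / \left[4\xi(S_{T_0} - S_{T_c}) + (1+\xi)^2 (S_{T_c} + S_a)\right]^{3/2}$. This closed form is an assumption reconstructed from the definitions in this appendix, not a formula quoted from the text, and the function names are hypothetical.

```python
import math

def figure_of_merit(xi, S_T0, S_Tc, S_a):
    """xi-dependent factor of the frequency-integrated SNR^2.

    ASSUMPTION: Lorentzian mode response, so the integral over frequency of
    (xi / (a + T_n y^2))^2 is proportional to xi^2 / a^(3/2), with
    a = 4 xi (S_T0 - S_Tc) + (1 + xi)^2 (S_Tc + S_a).
    """
    a = 4.0 * xi * (S_T0 - S_Tc) + (1.0 + xi) ** 2 * (S_Tc + S_a)
    return xi ** 2 / a ** 1.5

def xi_opt_numeric(S_T0, S_Tc, S_a, lo=1e-3, hi=1e9, iters=200):
    """Maximise the figure of merit by ternary search in log(xi).
    The function has a single maximum for xi > 0."""
    a, b = math.log(lo), math.log(hi)
    for _ in range(iters):
        m1 = a + (b - a) / 3.0
        m2 = b - (b - a) / 3.0
        f1 = figure_of_merit(math.exp(m1), S_T0, S_Tc, S_a)
        f2 = figure_of_merit(math.exp(m2), S_T0, S_Tc, S_a)
        if f1 < f2:
            a = m1
        else:
            b = m2
    return math.exp(0.5 * (a + b))

def xi_opt_closed_form(S_T0, S_Tc, S_a):
    """Stationary point of the same figure of merit:
    xi^2 - (1 + 2r) xi - 2 = 0, with r = (S_T0 - S_Tc) / (S_Tc + S_a)."""
    r = (S_T0 - S_Tc) / (S_Tc + S_a)
    return 0.5 * ((1.0 + 2.0 * r) + math.sqrt((1.0 + 2.0 * r) ** 2 + 8.0))

if __name__ == "__main__":
    print("T_c = T_0     :", xi_opt_numeric(1.0, 1.0, 0.5))
    print("T_0 >> T_c,S_a:", xi_opt_numeric(100.0, 0.0, 1.0),
          "vs 2 S_T0 / (S_Tc + S_a) =", 2 * 100.0 / 1.0)
```

Under these assumptions the optimum is exactly 2 whenever $S_{T_0} = S_{T_c}$, independent of $S_a$, and approaches $2 S_{T_0}/(S_{T_c} + S_a)$ once that ratio is large.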
In the $T_0 \gg T_c, S_a$ case, with $\xi = \xi_{\rm opt}$, the loaded quality factor of the signal mode is $Q_l \simeq Q_1 \frac{\omega_1}{2T_0}$. The equivalent quality factor for the expression in equation B6 is $Q_s \simeq Q_l/\sqrt{3}$, labelled the 'sensitivity quality factor' in [37]. Physically, overcoupling by $\xi_{\rm opt}$ reduces the loaded quality factor of the mode, but also dilutes the thermal noise reaching the amplifier down to $S_{\rm in} \simeq \frac{\omega_1}{2}\cos^2\alpha$. Increasing $\xi$ further results in $S_a$ dominating over $S_{\rm in}$, so we reduce the quality factor for little gain.
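For a sense of scale, the sketch below evaluates these expressions in SI units for an SQL-limited amplifier with $T_c = 0$. The wall temperature of 1.4 K is the value assumed for Figure 4; the 1 GHz signal frequency and the unloaded quality factor $Q_1 = 10^{10}$ are hypothetical illustrative inputs, as are the function names. The loaded quality factor is taken to be $Q_1/(1+\xi)$, which follows from the definition of $\xi$ as the ratio of output coupling to environmental dissipation.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J s

def xi_opt_sql(T0_kelvin, nu1_hz):
    """xi_opt ~ 2 T0 / omega1, i.e. 2 k_B T0 / (h nu1) in SI units.
    Valid for T0 >> omega1, T_c = 0 and S_a = omega."""
    return 2.0 * K_B * T0_kelvin / (H_PLANCK * nu1_hz)

def loaded_q(Q1, xi):
    """Loaded quality factor for output coupling xi times the intrinsic loss."""
    return Q1 / (1.0 + xi)

def sensitivity_q(Q_l):
    """'Sensitivity quality factor' Q_s ~ Q_l / sqrt(3)."""
    return Q_l / math.sqrt(3.0)

if __name__ == "__main__":
    T0 = 1.4       # K, wall temperature assumed for Figure 4
    nu1 = 1.0e9    # Hz, HYPOTHETICAL signal-mode frequency
    Q1 = 1.0e10    # HYPOTHETICAL unloaded quality factor
    xi = xi_opt_sql(T0, nu1)
    Q_l = loaded_q(Q1, xi)
    print(f"xi_opt ~ {xi:.1f}")
    print(f"Q_l    ~ {Q_l:.2e}")
    print(f"Q_s    ~ {sensitivity_q(Q_l):.2e}")
```

For these inputs the optimal overcoupling is a few tens, so the loaded quality factor sits roughly two orders of magnitude below the unloaded one.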
In [37], the improvement in scan-averaged sensitivity coming from using $\xi = \xi_{\rm opt} \simeq 2T_0/\omega_1$ versus $\xi \simeq 1$ is phrased as gaining sensitivity 'outside of the resonator bandwidth'. For scattering-type setups, this applies to the unloaded resonator bandwidth $\sim \omega_1/Q_1$. At $\xi = \xi_{\rm opt}$, the loaded resonator bandwidth (i.e. the physical bandwidth of mode fluctuations in the experiment) is, to within an $O(1)$ factor, the same as the sensitivity bandwidth, as per the previous paragraph. The improved scan-averaged sensitivity comes from this bandwidth being parametrically larger than the unloaded bandwidth, while the on-resonance SNR is only an $O(1)$ factor smaller. Overcoupling the signal mode reduces the on-resonance signal power, but reduces the thermal fluctuations in the signal mode by parametrically the same amount, while increasing the signal mode's loaded bandwidth.
This situation can be contrasted with the case of an amplifier used in 'op-amp' mode [76], which is discussed in [10,37] for the case of flux-to-voltage amplifiers. In the limit of very large amplifier power gain, the power lost from the mode to the amplifier necessarily vanishes, and the effect on the signal mode's quality factor is very small. As discussed in [10,37], one can obtain an analogous increase in scan-averaged sensitivity by 'overcoupling' to the amplifier, by the same $\xi_{\rm opt} \simeq 2T_0/\omega_1$ factor relative to the 'critical' coupling that optimises on-resonance sensitivity. In this case, however, the sensitivity bandwidth is $\sim \xi_{\rm opt}$ times larger than the physical resonator bandwidth, and the mode's fluctuations are larger than those expected from the temperature $T_0$.
Returning to the scattering case, if we take $\xi = \xi_{\rm opt}$, then for an axion signal with bandwidth $\delta\omega_a \gg \omega_1/Q_l$, equation B7 tells us that the SNR from a single tuned configuration, with frequency splitting within the axion bandwidth, is where $T_n \equiv S_{T_c} + S_a$. $P_0$ is evaluated for a monochromatic $j$ oscillation with $a_0$ set by the dark matter density, and $B_0$ set by the RMS magnetic field in the drive mode.
More generally, unless the axion mass is small compared to the sensitivity bandwidth, covering an $O(1)$ range in axion masses requires running the experiment in multiple different configurations: for up-conversion, with multiple different frequency splittings between the drive and signal modes. A 'scan strategy' specifies which frequency splittings to choose, and how long to stay in each of them. In order to cover the axion mass range equally, the frequency splittings should be spaced equally$^4$, at frequencies differing by $\lesssim \nu_b$, where $\nu_b = \max(\delta\nu_a, \nu_1/Q_s)$.
If the SNR formulae derived above apply, then any such scan strategy will give approximately the same SNR. The simplest example we can consider is to take $S_{jj}$ to be a top-hat of width $\delta\omega_a$, and take the different frequency splittings to be spaced equally at $\delta\omega_a$, such that only a single configuration responds strongly at each possible axion mass. Then, the time spent in each configuration is $t_1 \simeq t_{\rm tot}\frac{\delta\omega_a}{\Delta m_a}$, and the SNR from the responding configuration is $\mathrm{SNR}^2 \simeq 0.7\,(P_0/Q_1)^2\, t_{\rm tot}\, \frac{Q_a Q_1 \omega_1}{T_0 T_n m_a \Delta m_a}$ (B9). For a denser set of frequency splittings, multiple configurations will have significant response at each axion mass.
Since the absorbed power, averaged over axion masses, is given by equation 7, any sufficiently dense set of approximately equal spacings will give approximately the same signal power at each axion mass, so the SNR will still be given by equation B9. However, once the time spent covering each frequency becomes too small, these SNR formulae become invalid. In the case of a single configuration, once $t_1 \lesssim Q_l/\nu_1$, we cannot resolve the signal mode bandwidth, and the axion signal does not have time to fully ring up the mode.
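The simplest scan strategy described above, together with the ring-up condition, can be sketched in a few lines. The snippet below spaces the frequency splittings by the axion linewidth $\delta\nu_a = \nu_a/Q_a$, divides the total time equally among them, and checks whether $t_1$ exceeds $Q_l/\nu_1$. The value $Q_a = 10^6$ is an assumed typical figure that is not given in the text, and the function names and numerical inputs are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def scan_plan(nu_a_hz, range_hz, t_tot_s, Q_a=1.0e6):
    """Number of configurations and time per configuration when the
    frequency splittings are spaced by the axion linewidth nu_a / Q_a.

    ASSUMPTION: Q_a = 1e6 by default, and the linewidth is treated as
    constant across the scanned range.
    """
    delta_nu_a = nu_a_hz / Q_a
    n_configs = range_hz / delta_nu_a
    t_1 = t_tot_s / n_configs      # t_1 ~ t_tot * delta_nu_a / range
    return n_configs, t_1

def formulae_valid(t_1_s, Q_l, nu_1_hz):
    """True if t_1 exceeds the ring-up time Q_l / nu_1, so that the signal
    mode bandwidth can be resolved."""
    return t_1_s > Q_l / nu_1_hz

if __name__ == "__main__":
    # HYPOTHETICAL inputs: 1 MHz axion frequency, 1 MHz wide scan range,
    # one year of total integration, Q_l = 1.7e8 at a 1 GHz signal mode.
    n, t_1 = scan_plan(1.0e6, 1.0e6, SECONDS_PER_YEAR)
    print(f"configurations: {n:.3g}, time per configuration: {t_1:.3g} s")
    print("SNR formulae applicable:", formulae_valid(t_1, 1.7e8, 1.0e9))
```

A denser set of splittings would shorten $t_1$ proportionally, which is how a scan can run into the $t_1 \lesssim Q_l/\nu_1$ regime where the formulae above stop applying.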
We can gain some insight into this behaviour by rewriting equation B9 in terms of the average energy absorbed over the lifetime of the experiment, $\bar{W} \simeq P t_{\rm tot}$. This gives As discussed in [2], $\mathrm{SNR}/\bar{W}$ stops growing for $Q_l \gtrsim t_1 \nu_1$, attaining a maximum value of $\mathrm{SNR} \simeq 0.2\,\bar{W}/T_n$.
$^4$ For $\lesssim O(1)$ frequency ranges, the choice of axion mass prior, e.g. linear vs log-linear, does not make a significant difference.