Dissipative superradiant spin amplifier for enhanced quantum sensing

Quantum metrology protocols exploiting ensembles of $N$ two-level systems and Ramsey-style measurements are ubiquitous. However, in many cases excess readout noise severely degrades the measurement sensitivity, in particular in sensors based on ensembles of solid-state defect spins. We present a dissipative "spin amplification" protocol that allows one to dramatically improve the sensitivity of such schemes, even in the presence of realistic intrinsic dissipation and noise. Our method is based on exploiting collective (i.e., superradiant) spin decay, an effect that is usually seen as a nuisance because it limits spin-squeezing protocols. We show that our approach can allow a system with a highly imperfect spin readout to approach SQL-like scaling in $N$ within a factor of two, without needing to change the actual readout mechanism. Our ideas are compatible with several state-of-the-art experimental platforms where an ensemble of solid-state spins (NV centers, SiV centers) is coupled to a common microwave or mechanical mode.


I. INTRODUCTION
The field of quantum sensing seeks to use the unique properties of quantum states of light and matter to develop powerful new measurement strategies. Within this broad field, perhaps the most ubiquitous class of sensors are ensembles of two-level systems. Such sensors have been realized in a variety of platforms, including atomic ensembles in cavity QED systems [1][2][3], and collections of defect spins in semiconductor materials [4][5][6][7]. They have also been employed to measure a multitude of diverse sensing targets, ranging from magnetometry [8,9] to the sensing of electric fields [10] and even temperature [11]. Finding new general strategies for improving such sensors could thus have an extremely wide impact. A general and well-explored method here is to use collective spin-spin interactions to generate entanglement, with the prototypical example being the creation of spin-squeezed states. The intrinsic fluctuations of such states can be parametrically smaller than those of a simple product state [12][13][14], allowing in principle dramatic improvements in sensitivity.
Spin squeezing ultimately uses entanglement to suppress fundamental spin projection noise. However, this is only a useful strategy in settings where the extrinsic measurement noise associated with the readout of the spin ensemble is smaller than the intrinsic quantum noise of the ensemble's quantum state [14,15]. While this limit of ideal readout can be approached in atomic platforms, typical solid-state spin sensors [such as ensembles of nitrogen vacancy (NV) defect center spins that are read out using spin-dependent fluorescence] have measurement noise that is orders of magnitude higher than the fundamental intrinsic quantum noise [16]. Thus, in solid-state spin sensors with fluorescence readout, both reducing the readout noise down to the standard quantum limit (SQL) and (in a subsequent step) surpassing the SQL (e.g., using spin squeezing) are major open milestones. Many experimental efforts have been made to achieve the first one by changing the readout mechanism of the spins [17][18][19][20][21]. This strategy typically works well for single or few spins, but projection-noise limited readout of a large ensemble still remains an open problem [16]. Here, we propose a different method to reach the first milestone in spin ensembles: Starting from extremely large readout noise several orders of magnitude above the SQL, our method reduces the effective readout noise down to a factor of two above the SQL, notably without changing the actual fluorescence readout protocol. We stress that this paper is not considering spin-squeezed initial states and sensitivities beyond the SQL, although our method could potentially be extended in this way to approach the Heisenberg limit.
In situations where measurement noise is the key limitation, a potentially more powerful approach than spin squeezing is the complementary strategy of amplification: before performing readout, increase the magnitude of the "signal" encoded in the spin ensemble. The amplification then effectively reduces the imprecision resulting from any following measurement noise. This strategy is well known in quantum optics [22,23] and is standard when measuring weak electromagnetic signals. Different amplification mechanisms have been proposed [24,25], but amplification was only recently studied in the spin context [26][27][28][29][30][31][32]. Davis et al. [26,29] demonstrated that the same collective spin-spin interaction commonly used for spin squeezing [the so-called one-axis twist (OAT) interaction] could be harnessed for amplification. In the absence of dissipation, they showed that their approach allowed near Heisenberg-limited measurement despite having measurement noise that was on par with the projection noise of an unentangled state. This scheme (which can be viewed as a special kind of more general "interaction-based readout" protocols [31][33][34][35]) has been implemented in cavity QED [3], Bose-condensed cold atom systems [36], and atoms trapped in an optical lattice [37]; a similar strategy was also used to amplify the displacement of a trapped ion [38]. Unfortunately, despite its success in a variety of atomic platforms, the amplification scheme of Ref. [26] is ineffective in setups where the spin ensemble consists of simple two-level systems that experience even small levels of $T_1$ relaxation (either intrinsic, or due to the cavity mode used to generate collective interactions). As analyzed in the Discussion, the $T_1$ relaxation both causes a degradation of the signal gain and causes the measurement signal to be overwhelmed by a large background contribution. This is true even if the single-spin cooperativity is larger than unity. Consequently, this approach to spin amplification cannot be used in many systems of interest, including almost all commonly studied solid-state sensing platforms.
In this work, we introduce a conceptually new spin amplification strategy for an ensemble of $N$ two-level systems that overcomes the limitation posed by dissipation. Unlike previous work on interaction-based measurement, it does not use collective unitary dynamics for amplification, but instead directly exploits cavity-induced dissipation as the key resource. We show that the collective decay of a spin ensemble coupled to a lossy bosonic mode gives rise to a signal gain $G$ that exhibits the maximum possible scaling of $G \propto \sqrt{N}$. Crucially, in the presence of local dissipation, the amplification in our scheme depends only on the collective cooperativity (not on more restrictive conditions in terms of single-spin cooperativity), and this maximum gain can be reached even in regimes where the single-spin cooperativity is much smaller than unity. Moreover, our amplification mechanism has an "added noise" that approaches the quantum limit one would expect for a bosonic phase-preserving linear amplifier. In addition, the scheme is compatible with standard dynamical decoupling techniques to mitigate inhomogeneous broadening. Our scheme has yet another surprising feature: in principle, it allows one to achieve an estimation error scaling like $1/\sqrt{N}$ even if one only performs a final readout on a small number of spins $N_A \ll N$. Finally, unlike existing unitary amplification protocols, which require the signal to be in a certain spin component [26,29], our scheme amplifies any signal encoded in the transverse polarization of a spin ensemble (similar to phase-preserving amplification in bosonic systems [39]).
We stress that in contrast to the majority of interaction-based readout protocols, we are not aiming to use entangled states to reach the Heisenberg limit (HL). Instead, our goal is to approach the standard quantum limit (SQL) using conventional dissipative spin ensembles, in systems where extrinsic readout noise is extremely large compared to spin projection noise.
It is interesting to also consider our ideas in a broader context. Our scheme represents a previously unexplored aspect of Dicke superradiance [40][41][42][43], a paradigmatic effect in quantum optics. Superradiance is the collective enhancement of the spontaneous emission of $N$ indistinguishable spins interacting with a common radiation field: if the spins are initialized in the excited state, quantum interference effects will cause a short superradiant emission burst of amplitude $\propto N^2$ instead of simple exponential decay. In contrast to most work on superradiance, our focus is not on properties of the emitted radiation [40,44,45] or optical amplification [46,47], but rather on the "back-action" on the spin system itself. This back-action directly generates the amplification effect we exploit. Somewhat surprisingly, we show that our superradiant amplification mechanism continues to be effective in the limit of dissipation-free unitary dynamics, where the collective physics is described by a standard Tavis-Cummings model.
We stress that our work is also completely distinct from spin-amplification protocols in spintronics and nuclear magnetic resonance (NMR) systems: We are not aiming to measure the state of a single spin by copying it to a large ensemble [48] or to a distant spin which can be read out more easily [49]. Instead, our goal is to amplify a signal that is already encoded in the entire spin ensemble. On the level of a semiclassical description, superradiance is similar to radiation damping in NMR systems [50], which has been proposed as a method to amplify and measure small magnetizations in NMR setups [51,52]. However, these protocols cannot be used in quantum metrology (where quantum noise is important) and they use a qualitatively and quantitatively distinct sensing scheme from the ideas we present here (see Supplemental [53]).

A. Dissipative gain mechanism and basic sensing protocol
We consider a general sensing setup, where an ensemble of $N$ two-level systems is subject to a global magnetic field whose value we wish to estimate via a standard Ramsey-type measurement protocol (see Fig. 1(a)). This involves starting the ensemble in a fully polarized coherent spin state (CSS) at time $t_1$ and letting it rotate under the signal field by an angle $\phi$, such that the signal $\phi$ is encoded in the value of one component of the collective spin vector (here $\hat{S}_y$) at time $t_2$. We assume the standard case of an infinitesimal signal $\phi$, with $\langle \hat{S}_y \rangle$ depending linearly on $\phi$, and we set $t_2 = 0$ for convenience. The total error in the estimation of $\phi$ is then given by (see e.g. [15,16]):

$$ (\Delta\phi)^2 = (\Delta\phi)^2_{\rm int} + (\Delta\phi)^2_{\rm det} = \frac{(\Delta S_y)^2}{|\partial_\phi \langle \hat{S}_y \rangle|^2} + \frac{\Sigma^2_{\rm det}}{|\partial_\phi \langle \hat{S}_y \rangle|^2}, \tag{1} $$

where $(\Delta S_y)^2 = \langle \hat{S}_y^2 \rangle - \langle \hat{S}_y \rangle^2$. The first term $(\Delta\phi)^2_{\rm int}$ is the intrinsic spin-projection noise associated with the quantum state of the ensemble, while the second term $(\Delta\phi)^2_{\rm det}$ describes added noise associated with the imperfect readout of $\hat{S}_y$. This additional error can be expressed as an equivalent amount of $\hat{S}_y$ noise, $\Sigma^2_{\rm det}$, that is referred back to the signal $\phi$ using the transduction factor $|\partial_\phi \langle \hat{S}_y \rangle|$.

FIG. 1. Labels in panel (a): Ramsey sequence; signal amplification and readout; $-\pi/2$ $y$-rotation; signal acquisition; $-\pi/2$ $y$-rotation; signal amplification; $\pi/2$ $x$-rotation. […] Only collective decay leads to transient amplification, i.e., $G(t) > 1$. We use $N = 120$ spins and an initial coherent spin state in the $y$-$z$-plane as shown in the third Bloch sphere in (a) with $\phi = 10^{-5}$. The insets show sketches of the corresponding trajectories of the spin vector (the initial angle $\phi$ is exaggerated for readability).
Consider first the generic situation where the detection noise completely dominates the intrinsic projection noise, $(\Delta\phi)_{\rm det} \gg (\Delta\phi)_{\rm int}$. This is the typical scenario in many solid-state systems, e.g. ensembles of NV defects in a diamond crystal whose state is read out using spin-dependent optical fluorescence [16]. The goal is to reduce $(\Delta\phi)_{\rm det}$ without changing the final spin readout mechanism (i.e., $\Sigma^2_{\rm det}$ remains unchanged). The only option available is "spin amplification", i.e., enhancement of the transduction factor that encodes the sensitivity of the ensemble to $\phi$. Specifically, before doing the final readout of $\hat{S}_y$, we want to somehow implement a dynamics that yields

$$ \partial_\phi \langle \hat{S}_y(t) \rangle = G(t)\, \partial_\phi \langle \hat{S}_y(0) \rangle, \tag{2} $$

with a time-dependent gain factor $G(t)$ that is larger than unity at the end of the amplification stage, i.e., $t = t_3$ in Fig. 1(a). Achieving large gain will clearly reduce the total estimation error in the regime where measurement noise dominates: $\Delta\phi \to \Delta\phi / G(t_3)$. One might worry that in a more general situation, where the intrinsic projection noise is also important, this strategy is not useful, as one might end up amplifying the projection noise far more than the signal. We show in Sec. II C that this is not the case for our scheme: even if we use the optimal $t_3$ which maximizes the gain $G(t)$, the amplified spin-projection noise referred back to $\phi$ [i.e., $(\Delta\phi)_{\rm int}$ in Eq. (1)] is only approximately twice the value of this quantity in the initial state. This is reminiscent of the well-known quantum limit for phase-preserving amplification for bosonic systems [22,39] (see Supplemental for a detailed discussion [53]).

We next focus on what is perhaps the most crucial issue: how can we implement amplification dynamics in as simple a way as possible? Any kind of amplifier inevitably requires an energy source. Here, this will be achieved by preparing the spin ensemble in an excited state. For concreteness, we assume that the ensemble has a free Hamiltonian $\hat{H}_0 = \omega \hat{S}_z$, where $\omega > 0$ and $\hbar = 1$.
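Hence, at the end of the signal acquisition step at $t = t_2 = 0$ (see Fig. 1(a)), we rotate the state such that its polarization is almost entirely in the $+z$ direction (apart from the small rotation caused by the sensing parameter $\phi$), i.e., the ensemble is close to being in its maximally excited state. For the following dynamics, we consider simple relaxation of the ensemble towards the ground state of $\hat{H}_0$ (where the net polarization is in the $-z$ direction). Consider now a situation where each spin is subject to independent, single-spin $T_1$ relaxation (at rate $\gamma_{\rm rel}$) as well as a collective relaxation process (at rate $\Gamma$). In the rotating frame set by $\hat{H}_0$, the Lindblad master equation governing this dynamics is:

$$ \frac{d\hat{\rho}}{dt} = \Gamma\, \mathcal{D}[\hat{S}_-]\hat{\rho} + \gamma_{\rm rel} \sum_{j=1}^{N} \mathcal{D}[\hat{\sigma}_-^{(j)}]\hat{\rho}, \tag{3} $$

where $\hat{S}_- = \sum_{j=1}^{N} \hat{\sigma}_-^{(j)}$ is the collective lowering operator, $\hat{\sigma}_-^{(j)} = (\hat{\sigma}_x^{(j)} - i\hat{\sigma}_y^{(j)})/2$ is the lowering operator of spin $j$, $\hat{\sigma}_{x,y,z}^{(j)}$ are the Pauli operators acting on spin $j$, and $\mathcal{D}[\hat{O}]\hat{\rho} = \hat{O}\hat{\rho}\hat{O}^\dagger - \{\hat{O}^\dagger\hat{O}, \hat{\rho}\}/2$ is the standard Lindblad dissipation superoperator.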
At first glance, it is hard to imagine that such a simple relaxational dynamics will result in anything interesting. Surprisingly, this is not the case. It is straightforward to derive equations of motion that govern the expectation values of $\hat{S}_x$ and $\hat{S}_y$:

$$ \frac{d\langle \hat{S}_{x,y} \rangle}{dt} = \frac{\Gamma}{2} \left\langle \left\{ \hat{S}_z - \tfrac{1}{2}, \hat{S}_{x,y} \right\} \right\rangle - \frac{\gamma_{\rm rel}}{2} \langle \hat{S}_{x,y} \rangle. \tag{4} $$

Not surprisingly, we see that single-spin relaxation is indeed boring: it simply causes any initial transverse polarization to decay with time. However, the same is not true for the collective dissipation. Within a standard mean-field approximation, the first term on the right-hand side of Eq. (4) suggests that there will be exponential growth of both $\langle \hat{S}_x \rangle$ and $\langle \hat{S}_y \rangle$ at short times if the condition $\langle \hat{S}_z \rangle > 1/2$ holds, i.e., if the spins have a net excitation. This is the amplification mechanism that we will exploit, and that we maximize with our chosen initial condition.
The resulting picture is that with collective decay, the relaxation of the ensemble polarization towards the south pole is accompanied (for intermediate times at least) by a growth of the initial values of $\langle \hat{S}_{x,y} \rangle$. This "phase-preserving" (i.e., isotropic in the $S_x$-$S_y$-plane) amplification mechanism will generate a gain $G(t) \geq 1$ that will enhance the subsequent measurement. Numerically-exact simulations show that this general picture is correct, see Fig. 1(b). One finds that the maximum amplification gain $G(t)$ occurs at a time $t = t_{\max}$ that approximately coincides with the average polarization vector crossing the equator. We stress that the collective nature of the relaxation is crucial: independent $T_1$ decay yields no amplification. At a heuristic level, the collective dissipator in Eq. (4) mediates dissipative interactions between different spins, and these interactions are crucial to have gain.
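The transient gain described above can be checked with a small stand-alone calculation. The following is a minimal sketch, not the simulation code used for Fig. 1(b), and it rests on several assumptions: only collective decay is present ($\gamma_{\rm rel} = 0$), the signal is infinitesimal ($\phi \to 0$), and the state stays in the permutation-symmetric Dicke subspace with $j = N/2$. Under these assumptions the dissipator $\Gamma \mathcal{D}[\hat{S}_-]$ couples the density-matrix element $\rho_{m,m-1}$ only to $\rho_{m+1,m}$, so that $\langle \hat{S}_- \rangle$ (and hence the gain) follows from a closed linear system of $N$ equations. The function name `collective_decay_gain` and its parameters are hypothetical.

```python
import math


def collective_decay_gain(n_spins, gamma=1.0, t_end_factor=4.0, dt_factor=0.4):
    """Return (times, gains) with G(t) = <S_y(t)>/<S_y(0)> in the limit phi -> 0.

    Assumptions: collective decay only (gamma_rel = 0) and dynamics restricted
    to the symmetric Dicke ladder |j, m>, j = n_spins / 2.
    """
    n = n_spins
    # c[i] = <m-1| S_- |m> for m = j - i; c[n] = 0 at the south pole.
    c = [math.sqrt((n - i) * (i + 1)) for i in range(n + 1)]
    # r[i] = rho_{m, m-1} for m = j - i. To first order in phi, a weakly tilted
    # coherent spin state populates only r[0] (normalised to 1 here).
    r = [0.0] * n
    r[0] = 1.0

    def rhs(v):
        out = []
        for i in range(n):
            feed = c[i - 1] * c[i] * v[i - 1] if i > 0 else 0.0
            loss = 0.5 * (c[i] ** 2 + c[i + 1] ** 2) * v[i]
            out.append(gamma * (feed - loss))
        return out

    def s_minus(v):
        # <S_-> = sum_m c_m rho_{m, m-1}
        return sum(ci * vi for ci, vi in zip(c, v))

    t_end = t_end_factor * math.log(n) / (n * gamma)  # a few mean-field delay times
    dt = dt_factor / (n * n * gamma)                  # well below the fastest rate ~ N^2/4
    steps = int(t_end / dt)
    s0 = s_minus(r)
    times, gains = [0.0], [1.0]
    for k in range(1, steps + 1):
        # classical fourth-order Runge-Kutta step
        k1 = rhs(r)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(r, k1)])
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(r, k2)])
        k4 = rhs([a + dt * b for a, b in zip(r, k3)])
        r = [a + dt / 6.0 * (p + 2 * q + 2 * s + u)
             for a, p, q, s, u in zip(r, k1, k2, k3, k4)]
        times.append(k * dt)
        gains.append(s_minus(r) / s0)
    return times, gains


if __name__ == "__main__":
    n = 50
    times, gains = collective_decay_gain(n)
    g_max = max(gains)
    t_max = times[gains.index(g_max)]
    print("G_max / sqrt(N)      =", g_max / math.sqrt(n))
    print("t_max * N * Gamma / ln N =", t_max * n / math.log(n))
```

Because only one off-diagonal of the density matrix is propagated, the cost grows linearly with $N$ per time step; including single-spin relaxation would break the restriction to the symmetric subspace and requires a different method.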
We thus have outlined our basic amplification procedure: prepare a CSS close to the north pole of the generalized Bloch sphere (with $\phi$ encoded in the small $\hat{S}_x$ and $\hat{S}_y$ components of the polarization), then turn on collective relaxation. Stopping the relaxation at time $t = t_{\max}$ results in the desired amplification of information on $\phi$ in the average spin polarization; this can be then read out as is standard by converting transverse polarization into population differences via a $\pi/2$ rotation, as shown in Fig. 1(a). We stress that the generic ingredients needed here are the same as those needed to realize OAT spin squeezing and amplification protocols: a Tavis-Cummings model where the spin ensemble couples to a single, common bosonic mode (a photonic cavity mode [54][55][56], or even a mechanical mode [57][58][59][60]) and time-dependent control over the strength of the collective interaction [37,61]. In previously-proposed OAT protocols, cavity loss limits the effectiveness of the scheme, and one thus works with a large cavity-ensemble detuning to minimize its impact. In contrast, our scheme utilizes the cavity decay as a key resource, allowing one to operate with a resonant cavity-ensemble coupling. In such an implementation, the ability to control the detuning between the cavity and the spin ensemble provides a means to turn on and off the collective decay $\Gamma$. This general setup will be analyzed in more detail in Sec. II F and an analysis of timing errors is given in the Supplemental [53]. Alternatively, one can achieve time-dependent control over the collective decay rate by driving Raman transitions in a $\Lambda$-type three-level system [62].
Before proceeding to a more quantitative analysis, we pause to note that, for short times and $\gamma_{\rm rel} = 0$, one can directly connect the superradiant spin-amplification physics here to simple phase-preserving bosonic linear amplification. Given our initial state, it is convenient to represent the ensemble using a Holstein-Primakoff bosonic mode $\hat{a}$ via $\hat{S}_z \equiv N/2 - \hat{a}^\dagger \hat{a}$. For short times, one can linearize the transformation for $\hat{S}_x$ and $\hat{S}_y$, with the result that these are just proportional to the quadratures of $\hat{a}$. The same linearization turns the collective decay in Eq. (3) into simple bosonic anti-damping: $d\hat{\rho}/dt \sim \Gamma N \mathcal{D}[\hat{a}^\dagger]\hat{\rho}$. This dynamics causes exponential growth of $\langle \hat{a} \rangle$, and describes phase-preserving amplification of a non-degenerate parametric amplifier in the limit where the idler mode can be adiabatically eliminated [39]. While this linearized picture provides intuition into the origin of gain, it is not sufficient to fully understand our system: the nonlinearity of the spin system is crucial in determining the non-monotonic behaviour of $G(t)$ shown in Fig. 1(b), and in determining the maximum gain. We explore this more in what follows.
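For completeness, the exponential growth under bosonic anti-damping can be made explicit in one line. The short derivation below is a sketch that assumes only the linearized dissipator $\Gamma N \mathcal{D}[\hat{a}^\dagger]$ quoted above and the commutator $[\hat{a}, \hat{a}^\dagger] = 1$:

```latex
\begin{align}
\frac{d\langle \hat{a} \rangle}{dt}
  &= \Gamma N \left\langle \hat{a}\,\hat{a}\,\hat{a}^\dagger
     - \tfrac{1}{2}\left\{ \hat{a}\hat{a}^\dagger, \hat{a} \right\} \right\rangle
   = \frac{\Gamma N}{2} \left\langle \hat{a}\,[\hat{a}, \hat{a}^\dagger] \right\rangle
   = \frac{\Gamma N}{2} \langle \hat{a} \rangle , \\
\langle \hat{a}(t) \rangle &= e^{\Gamma N t / 2}\, \langle \hat{a}(0) \rangle .
\end{align}
```

The growth rate $\Gamma N/2$ is the same for both quadratures, which is the sense in which the amplification is phase-preserving. In this linearized picture nothing stops the growth; the saturation and the finite maximum gain come from the spin nonlinearity.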
Finally, we note that Eq. (3) (with γ rel = 0) has previously been studied as a spin-only, Markovian description of superradiance, i.e., the collective decay of a collection of two-level atoms coupled to a common radiation field [44,63]. The vast majority of studies of superradiance focus on the properties of the radiation emitted by an initially excited collection of atoms. We stress that our focus here is very different. We have no interest in this emission (and will not assume any access to the reservoir responsible for the collective spin dissipation). Instead, we use the effective superradiant decay generated by Eq. (3) only as a tool to induce nonlinear collective spin dynamics, which can then be used for amplification and quantum metrology.

B. Mean-field theory description of superradiant amplification
To gain a more quantitative understanding of our nonlinear amplification process, we analyze the dynamics of Eq. (3) with $\gamma_{\rm rel} = 0$ using a standard mean-field theory (MFT) decoupling, as detailed in App. B. This analysis goes beyond a linearized bosonic theory obtained from a Holstein-Primakoff transformation and is able to capture aspects of the intrinsic nonlinearity of the spin dynamics. We start by using MFT to understand the gain dynamics, which can be determined by considering the evolution of the mean values of the collective spin operator; fluctuations and added-noise physics will be considered in Sec. II C. Note that a simpler approach based on semiclassical equations of motion fails to capture the amplification dynamics correctly, i.e., superradiant amplification is a genuinely quantum effect and quantum fluctuations need to be taken into account (see Supplemental [53] and Sec. II F 3).
As detailed in App. B, the MFT equation of motion for $S_z \equiv \langle \hat{S}_z \rangle$ in the large-$N$ limit is

$$ \frac{dS_z}{dt} = -\Gamma \left[ \frac{N}{2}\left(\frac{N}{2} + 1\right) + S_z - S_z^2 \right], \tag{5} $$

where the constant term is obtained by using the fact that the dynamics conserves $\hat{\mathbf{S}}^2$. Starting from a highly polarized initial state with $S_z(0) = N\cos(\phi)/2$, this equation describes the well-known nonlinear superradiant decay of the $S_z$ component to the steady state $|\downarrow\rangle^{\otimes N}$ [45]. The corresponding equations of motion for average values $S_x$ and $S_y$ correspond to the expected decoupling of Eq. (4):

$$ \frac{dS_{x,y}}{dt} = \Gamma \left( S_z - \frac{1}{2} \right) S_{x,y} \equiv \lambda(t)\, S_{x,y}, \tag{6} $$

where we have introduced the instantaneous gain rate $\lambda(t)$. For $\lambda(t) > 0$ ($\lambda(t) < 0$), any initial polarization component of the collective average Bloch vector in the $x$-$y$ plane will be amplified (damped). Without loss of generality, we chose the initial transverse polarization to be entirely in the $y$ direction. Thus, the $S_x$ component will always remain zero since the initial state has $S_x(0) = 0$. In contrast, the highly polarized initial state $S_z(0) \approx N/2 \gg 1/2$ leads to amplification of the nonzero initial value $S_y(0) = N\sin(\phi)/2$ at short times. In the long run, the superradiant decay evolves $S_z(t)$ to its steady-state value $S_z(t \to \infty) = -N/2$. As a consequence, for sufficiently long times, the time-dependent gain rate $\lambda(t)$ will be reduced and amplification ultimately turns into damping if $S_z(t) < 1/2$. The MFT equation of motion (6) predicts that maximum amplification of $S_y$ is achieved at the time $t_{\max}$ where $S_z(t_{\max}) = 1/2$, which is clearly beyond the regime of applicability of a linearized theory based on the Holstein-Primakoff transformation. In the large-$N$ limit, the MFT result for $t_{\max}$ takes the form

$$ t_{\max} = \frac{\ln N}{N\Gamma}, \tag{7} $$

which is the well-known delay time of the superradiant emission peak [45]. The short transient period where $\lambda(t) > 0$ is enough to yield significant amplification:

$$ S_{x,y}(t) = S_{x,y}(0) \exp\left[ \int_0^t \lambda(t')\, dt' \right]. \tag{8} $$

Evaluating this at $t = t_{\max}$ given by Eq. (7) yields the following MFT result for the maximum value of $S_{x,y}$:

$$ S_{x,y}(t_{\max}) \approx \frac{\sqrt{N}}{2}\, S_{x,y}(0). \tag{9} $$

Note that the signal gain increases with increasing $N$ while the waiting time $t_{\max}$ required to reach the maximum gain decreases, giving rise to very fast amplification.
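The mean-field picture can be explored numerically with a few lines of code. The sketch below is illustrative only and makes explicit assumptions: it takes the mean-field equations to be $dS_z/dt = -\Gamma[\tfrac{N}{2}(\tfrac{N}{2}+1) + S_z - S_z^2]$ and $dS_y/dt = \Gamma(S_z - \tfrac{1}{2})S_y$ (the factorization $\langle \hat{S}_z \hat{S}_y \rangle \approx S_z S_y$ applied to the collective-decay dynamics with $\gamma_{\rm rel} = 0$), starts from $S_z(0) = N/2$ (i.e., $\phi \to 0$), and normalizes $S_y(0) = 1$ so that $S_y(t)$ is directly the gain. The function name `mft_trajectory` is hypothetical.

```python
import math


def mft_trajectory(n_spins, gamma=1.0, steps_per_unit=500):
    """Integrate the assumed mean-field equations with RK4.

    Returns (t_max, g_max): the time at which S_y peaks and the peak gain.
    """
    n = float(n_spins)
    s_squared = 0.5 * n * (0.5 * n + 1.0)  # conserved eigenvalue of S^2 for j = N/2

    def rhs(sz, sy):
        dsz = -gamma * (s_squared + sz - sz * sz)
        dsy = gamma * (sz - 0.5) * sy      # instantaneous gain rate times S_y
        return dsz, dsy

    sz, sy = 0.5 * n, 1.0                  # phi -> 0, S_y(0) normalised to 1
    dt = 1.0 / (steps_per_unit * n * gamma)
    t_end = 2.0 * math.log(n) / (n * gamma)
    t, t_max, g_max = 0.0, 0.0, 1.0
    while t < t_end:
        k1 = rhs(sz, sy)
        k2 = rhs(sz + 0.5 * dt * k1[0], sy + 0.5 * dt * k1[1])
        k3 = rhs(sz + 0.5 * dt * k2[0], sy + 0.5 * dt * k2[1])
        k4 = rhs(sz + dt * k3[0], sy + dt * k3[1])
        sz += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        sy += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
        if sy > g_max:
            g_max, t_max = sy, t
    return t_max, g_max


if __name__ == "__main__":
    n = 10_000
    t_max, g_max = mft_trajectory(n)
    print("t_max * N * Gamma / ln N =", t_max * n / math.log(n))
    print("G_max / sqrt(N)          =", g_max / math.sqrt(n))
```

Under the stated assumptions the equations can also be solved in closed form, $S_z(t) - \tfrac{1}{2} = \tfrac{N+1}{2}\tanh[\tfrac{N+1}{2}\Gamma(t_{\max} - t)]$ with $t_{\max} = \ln N / [(N+1)\Gamma]$ and peak gain $(\sqrt{N} + 1/\sqrt{N})/2$, which reduces to the $\ln N/(N\Gamma)$ delay time and the $\sqrt{N}$ gain scaling at large $N$.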
Importantly, the optimal amplification time $t_{\max}$ given in Eq. (7) is independent of the tilt angle $\phi$ in the metrologically relevant limit of $\phi \ll 1$. Therefore, the gain $G(t)$ is independent of the signal $\phi$. The breakdown of this relation defines the dynamic range of the spin amplifier and is analyzed in the Supplemental [53].
We now verify this intuitive picture derived from MFT using numerically-exact solutions of Eq. (3). To analyze the solutions, we define the time-dependent signal gain $G(t)$ as follows:

$$ G(t) = \frac{\langle \hat{S}_y(t) \rangle}{\langle \hat{S}_y(0) \rangle}, \qquad G_{\max} \equiv \max_t G(t) = G(t_{\max}), \tag{10} $$

where $t_{\max}$ is determined numerically. Note that this is identical to the definition given in Eq. (2), as $G(t)$ is independent of $\phi$ for $\phi \ll 1$. Combining Eqs. (9) and (10), we thus expect a scaling $G_{\max} \propto \sqrt{N}$ based on MFT. Numerically-exact master equation simulations shown in Figs. 1(c) and 2 confirm that (up to numerical prefactors) the scaling of $G_{\max}$ and $t_{\max}$ predicted by MFT are correct in the large-$N$ limit.
It is also interesting to note that on general grounds, $G_{\max} \propto \sqrt{N}$ is the maximal gain scaling that we expect to be possible. This follows from the fact that we would expect initial fluctuations of $\hat{S}_x$ and $\hat{S}_y$ to be amplified (at least) the same way as the average values of these quantities, and hence expect $(\Delta S_x)^2 \geq G_{\max}^2 N/4$, where $N/4$ represents the initial fluctuations of $\hat{S}_x$ in the initial CSS. Next, note that because of the finite dimensional Hilbert space, $(\Delta S_x)^2$ cannot be arbitrarily large and is bounded by $N^2/4$. This immediately tells us that $G_{\max}$ cannot grow with $N$ faster than $\sqrt{N}$. The gain scaling can also be understood heuristically by using the fact that there is only instantaneous gain for a time $t < t_{\max} = \ln(N)/(N\Gamma)$, and that, during this time period, the instantaneous gain rate is $\lambda(t) \approx N\Gamma/2$. Exponentiating the product of this rate and $t_{\max}$ again yields a $\sqrt{N}$ scaling. We stress that the spin-only quantum master equation (3) as well as the mean-field results for the behaviour of $S_z$ are well known in the superradiance literature (see e.g. [44,45,63]). The new aspect of our work here is to identify the amplification physics associated with superradiant decay, and use MFT to provide a quantitative description of it.

C. Improving sensitivity and approaching the SQL with extremely bad measurements
We now discuss how the amplification dynamics can improve the total estimation error $(\Delta\phi)$ introduced in Eq. (1). For concreteness, we focus on the general situation where the readout mechanism involves adding independent contributions from each spin in the ensemble, and hence the noise associated with the readout itself scales as $N$:

$$ \Sigma^2_{\rm det} = \Xi^2_{\rm det} \frac{N}{4}, \tag{11} $$

with $\Xi_{\rm det}$ an $N$-independent constant. Note that the factor of $1/4$ in the definition is convenient, as $\Xi^2_{\rm det}$ directly describes the ratio of readout noise to the intrinsic projection noise. Equation (11) describes the scaling of readout noise in many practically relevant situations, including standard spin-dependent-fluorescence readout of solid-state spin ensembles [16] and of trapped ions [64]. In this case and for $\phi \ll 1$, one has

$$ \Xi^2_{\rm det} = \frac{1}{\tilde{C}^2 n_{\rm avg}}, \tag{12} $$

where $\tilde{C}$ is the fluorescence contrast of the two spin states and $n_{\rm avg}$ is the average number of detected photons per spin in a single run of the protocol (see App. A for details).
In considering the estimation error, we will also now account for the fact that our amplification mechanism will not only cause $\langle \hat{S}_y \rangle$ to grow, but also cause the variance $(\Delta S_y)^2$ to grow over its initial CSS value of $N/4$. The very best case is that the variance is amplified exactly the same way as the signal, but in general there will be excess fluctuations beyond this. This motivates the definition of the added noise of our amplification scheme (similar to the definition of the added noise $\sigma_{\rm add}$ of a linear amplifier). Letting $(\Delta \hat{S}_y)^2|_{\rm amp}$ denote the variance of $\hat{S}_y$ in the final, post-amplification state of the spin ensemble after an optimal amplification time, we write:

$$ (\Delta \hat{S}_y)^2 \big|_{\rm amp} = G_{\max}^2\, \frac{N}{4} \left( 1 + \sigma^2_{\rm add} \right). \tag{13} $$

We have normalized $\sigma_{\rm add}$ to the value of the CSS variance; hence, $\sigma^2_{\rm add} = 1$ corresponds to effectively doubling the initial fluctuations (once the gain has been included).
For linear bosonic phase-preserving amplifiers, it is well known that the added noise is at best the size of the vacuum noise [22,39,65]. At a fundamental level, this can be attributed to the dynamics amplifying both quadratures of the input signal, quantities that are described by non-commuting operators. One might expect a similar constraint here, as our spin amplifier also amplifies two non-commuting quantities (namely $\hat{S}_x$ and $\hat{S}_y$). Hence, one might expect that the best we can achieve in our spin amplifier is to have the added noise satisfy $\sigma^2_{\rm add} = 1$. A heuristic argument that parallels Caves' classic calculation [22] suggests one indeed has the constraint $\sigma^2_{\rm add} \geq 1 - 1/G^2(T)N$ (see Supplemental [53]). For our system, full master equation simulations let us investigate how the added noise behaves for large $N$ and maximum amplification. Remarkably, we find $\sigma^2_{\rm add} \approx 1.3$ in the large-$N$ limit, which is just slightly above the expected level based on the heuristic argument (see Fig. 3(b)). This leads to a crucial conclusion: our amplification scheme is useful even if one cares about approaching the SQL. […] Fig. 1(c)), one can derive an upper bound on the added noise: […]

With the above definitions in hand, we can finally quantify the estimation error in Eq. (1) of our amplification-assisted measurement protocol. Combining Eqs. (2), (11), and (13), one finds that the general expression applied to our scheme reduces to:

$$ (\Delta\phi)^2 = \frac{1 + \sigma^2_{\rm add}}{N} + \frac{\Xi^2_{\rm det}}{G_{\max}^2 N} = \frac{1 + \sigma^2_{\rm add}}{N} + \frac{\Xi^2_{\rm det}}{c_0^2 N^2}, $$

where we have used the large-$N$ scaling of the maximum gain in the last equation: $G_{\max} = c_0 \sqrt{N}$ with $c_0 \approx 0.42$. There are two crucial things to note here: First, if readout noise completely dominates (despite the amplification), our amplification approach changes the scaling of the estimation error $(\Delta\phi)$ with the number of spins from $1/\sqrt{N}$ to $1/N$. While this scaling is reminiscent of Heisenberg-limited scaling, there is no connection: in our case, this rapid scaling with $N$ only holds if one is far from the SQL. Nonetheless, this shows the potential of amplification to dramatically increase sensitivity in this readout-limited regime.
Second, for very large $N \gg \Xi^2_{\rm det}$, the amplification protocol will make the added measurement noise negligible compared to the fundamental noise of the quantum state. In this limit, the total estimation error almost reaches the SQL: it scales as $(\Delta\phi) \approx \sqrt{(1 + \sigma^2_{\rm add})/N} \approx \sqrt{2.3/N}$. This is only off by a numerical prefactor $\sqrt{2.3}$ from the exact SQL, $(\Delta\phi)_{\rm SQL} = 1/\sqrt{N}$. We thus have established another key feature of our scheme: using amplification and a large enough ensemble, one can in principle approach the SQL within a factor of two regardless of how bad the spin readout is. For a fixed detector noise $\Xi_{\rm det}$, the crossover in the estimation error $(\Delta\phi)$ from a $1/N$ scaling to a $1/\sqrt{N}$ scaling is illustrated in Fig. 3(a).
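To get a feeling for the numbers, the two regimes can be evaluated directly. The snippet below is a rough sketch under explicit assumptions: it takes the amplified estimation error to be $(\Delta\phi)^2 = (1 + \sigma^2_{\rm add})/N + \Xi^2_{\rm det}/(c_0^2 N^2)$ and the unamplified one to be $(1 + \Xi^2_{\rm det})/N$, uses the large-$N$ values $c_0 \approx 0.42$ and $\sigma^2_{\rm add} \approx 1.3$ quoted in the text for all $N$ (which is not justified for small ensembles), and takes $\Xi_{\rm det} = 67$ as an example readout noise. The function names are hypothetical.

```python
import math


def phi_variance(n_spins, xi_det, c0=0.42, sigma_add_sq=1.3, amplified=True):
    """(Delta phi)^2 from the assumed large-N expressions."""
    if not amplified:
        # projection noise plus readout noise, no gain
        return (1.0 + xi_det ** 2) / n_spins
    gain_sq = c0 ** 2 * n_spins            # G_max^2 = c0^2 N
    intrinsic = (1.0 + sigma_add_sq) / n_spins
    detection = xi_det ** 2 / (gain_sq * n_spins)
    return intrinsic + detection


def crossover_size(xi_det, c0=0.42, sigma_add_sq=1.3):
    """Ensemble size at which the intrinsic and detection terms are equal."""
    return xi_det ** 2 / (c0 ** 2 * (1.0 + sigma_add_sq))


if __name__ == "__main__":
    xi = 67.0
    print("crossover N:", crossover_size(xi))
    for n in (1e3, 1e5, 1e7, 1e9):
        sql = 1.0 / math.sqrt(n)
        with_amp = math.sqrt(phi_variance(n, xi)) / sql
        without = math.sqrt(phi_variance(n, xi, amplified=False)) / sql
        print(f"N = {n:.0e}: error/SQL = {with_amp:.2f} (amplified), {without:.2f} (unamplified)")
```

Below the crossover size the detection term dominates and the error falls as $1/N$; above it the error approaches $\sqrt{1 + \sigma^2_{\rm add}}$ times the SQL.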

D. Enhanced sensitivity despite reading out a small number of spins
There are many practical situations where, even though the signal of interest $\phi$ influences all $N$ spins in the ensemble, one can only read out the state of a small subensemble $A$ with $N_A \ll N$ spins. For example, for fluorescence readout of an NV spin ensemble, the spot size of the laser could be much smaller than the spatial extent of the entire ensemble. For a standard Ramsey scheme (i.e., no superradiant amplification), there are no correlations between spins, and the unmeasured $N - N_A$ spins do not help in improving the measurement. In the best case, the estimation error then scales as $\Delta\phi \propto 1/\sqrt{N_A}$. Surprisingly, the situation is radically different if we first implement superradiant amplification on the full ensemble before reading out the state of the small subensemble. In this case, we are able to achieve an SQL-like scaling $\Delta\phi \propto 1/\sqrt{N}$ even though one measures only $N_A \ll N$ spins. This dramatically improved scaling reflects the fact that the superradiant amplification involves a dissipative interaction between all the spins, hence the final state of the small subensemble is sensitive to the total number of spins $N$.
Figure caption: The best reported fluorescence readout of an NV ensemble is a factor of $\Xi_{\rm det} = 1/\sqrt{\tilde{C}^2 n_{\rm avg}} = 67$ above the SQL [16]. We assume the ideal case where this factor is independent of the ensemble size (dashed blue line). Amplification suppresses the readout noise (solid red line) and allows one to approach the SQL (dash-dotted black line). The inset shows the scaling of the total estimation error $(\Delta\phi)^2$ with and without amplification (solid red and dashed blue lines, respectively), and the SQL (dash-dotted black line). The curves have been obtained using a MFT analysis of Eq. (3) for $\gamma_{\rm rel}/\Gamma = 0$ and agree qualitatively with numerically exact solutions of the master equation (3), which have been used to calculate the (b) added noise $\sigma^2_{\rm add}$ (defined in Eq. (15)) for $\gamma_{\rm rel}/\Gamma = 0$. The dash-dotted blue line illustrates the expected minimum amount of added noise, 1 − 1/G²_max N, based on a heuristic argument detailed in the Supplemental [53]. Note that this is not a strict lower bound.

To analyze this few-spin readout scenario, we partition the $N$ spins into two subensembles $A$ and $B$ of size $N_A$ and $N_B \equiv N - N_A$, respectively. Without loss of generality, we enumerate the spins starting with subensemble $A$, which allows us to define the subensemble operators

$$\hat{S}^A_k = \frac{1}{2}\sum_{j=1}^{N_A} \hat{\sigma}^{(j)}_k , \qquad \hat{S}^B_k = \frac{1}{2}\sum_{j=N_A+1}^{N} \hat{\sigma}^{(j)}_k , \qquad k \in \{x,y,z\} .$$

Their sum is the spin operator of the full ensemble, $\hat{S}_k = \hat{S}^A_k + \hat{S}^B_k$. We now consider a scheme where only the spin state of the $A$ ensemble is measured at the very end of the amplification protocol shown in Fig. 1. The statistics of this measurement are controlled by the operator $\hat{S}^A_y$, with the signal encoded in its average value. Note that our ideal amplification dynamics always results in a spin state that is fully permutation-symmetric (i.e., at any instant in time, the average values of single-spin operators are identical for all spins). Hence $\langle \hat{S}^A_y(t) \rangle = (N_A/N) \langle \hat{S}_y(t) \rangle$ at all times, and it follows immediately that the subensemble gain is identical to the gain associated with the full ensemble, $G_A(t) = G(t)$, with $G_{\max} = c_0 \sqrt{N}$ in the large-$N$ limit (and $c_0 \approx 0.42$). We stress that the gain is determined by the size of the full ensemble even though we are only measuring $N_A \ll N$ spins. This can also be seen by inspecting the equations of motion for the transverse components $\hat{\sigma}^{(k)}_{x,y}$ of an arbitrary spin $k$: the $y$ component of each individual spin is driven by a collective spin operator $\hat{S}_y$ whose expectation value is proportional to the ensemble size, $\langle \hat{S}_y \rangle = N\phi/2$.

Next, consider the fluctuations in $\hat{S}^A_y$. The variance of this operator cannot exceed $N_A^2/4$ in any state; we thus parameterize these fluctuations by

$$(\Delta S^A_y)^2 = q \, \frac{N_A^2}{4} , \qquad 0 \leq q \leq 1 .$$

If we now only consider the fundamental spin projection noise (i.e., ignore any additional readout noise), we can combine these results with the amplified signal $\langle \hat{S}^A_y \rangle = G_{\max} N_A \phi/2$ to write the estimation error in $\phi$ as

$$(\Delta\phi)^2_{\rm int} = \frac{(\Delta S^A_y)^2}{|\partial_\phi \langle \hat{S}^A_y \rangle|^2} = \frac{q}{G_{\max}^2} = \frac{q}{c_0^2 N} .$$

We thus have a crucial result: even in the worst-case scenario $q = 1$, for large $N$, the estimation error $(\Delta\phi)^2$ scales as $1/N$ despite measuring only $N_A \ll N$ spins. We can use a similar analysis to consider the contribution of detection noise to the estimation error in our subensemble readout scheme. We again assume (as is appropriate for fluorescence readout) that the detector noise scales with the number of spins that are read out, i.e., $\Sigma^2_{{\rm det},A} = \Xi^2_{\rm det} N_A/4$.
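We thus obtain the detection-noise contribution to the estimation error,

$$(\Delta\phi)^2_{\rm det} = \frac{\Sigma^2_{{\rm det},A}}{G_{\max}^2 N_A^2/4} = \frac{\Xi^2_{\rm det}}{c_0^2 N N_A} ,$$

i.e., the detection noise is again suppressed by a factor of $N$, the size of the full ensemble.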
Combining these results, we find

$$(\Delta\phi)^2 = \frac{1}{c_0^2 N} \left( q + \frac{\Xi^2_{\rm det}}{N_A} \right) ,$$

where $c_0 \approx 0.42$ in the large-$N$ limit. We thus find that, in the case where $N_A$ is held fixed while $N$ is increased, our superradiant amplification scheme yields a measurement sensitivity that scales as $\Delta\phi \propto 1/\sqrt{N}$. Surprisingly, it is controlled by the full size of the ensemble, and not by the much smaller number of spins that are actually measured, $N_A$. We illustrate this in Fig. 4 for the extreme case of readout of a single spin, $N_A = 1$, and for the case of readout of a small fraction of the spin ensemble, $N_A = 0.01N$. We stress that the analysis above (like the analysis throughout this paper) is done in the limit of an infinitesimally small signal $\phi$.
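The scaling argument of this subsection can be checked numerically. The following is a minimal sketch that assumes the combined error takes the form $(\Delta\phi)^2 = (q + \Xi^2_{\rm det}/N_A)/(c_0^2 N)$, obtained from the signal slope $G_{\max} N_A/2$ and the two noise terms quoted in the text. The function names and the unamplified Ramsey reference $(1 + \Xi^2_{\rm det})/N_A$ are illustrative assumptions, not expressions taken from the paper.

```python
import math


def amplified_error_sq(n_total, n_read, xi_det, q=1.0, c0=0.42):
    """Assumed (Delta phi)^2 when n_read spins are read out after amplifying all n_total spins."""
    gain_sq = c0 ** 2 * n_total                    # G_max^2 = c0^2 * N
    intrinsic = q / gain_sq                        # projection noise of the measured subensemble
    detection = xi_det ** 2 / (gain_sq * n_read)   # readout noise referred back to phi
    return intrinsic + detection


def ramsey_error_sq(n_read, xi_det):
    """Assumed reference: unamplified Ramsey readout of n_read uncorrelated spins."""
    return (1.0 + xi_det ** 2) / n_read


if __name__ == "__main__":
    # Single-spin readout (N_A = 1) with the readout factor Xi_det = 67 quoted in the text.
    for n_total in (10 ** 4, 10 ** 6, 10 ** 8):
        err = math.sqrt(amplified_error_sq(n_total, 1, 67.0))
        print(f"N = {n_total:>9d}: Delta phi = {err:.3e}")
```

Under these assumptions, doubling $N$ at fixed $N_A$ halves $(\Delta\phi)^2$, which is the SQL-like scaling discussed above.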
E. Impact of single-spin dissipation and finite temperature in the generic model

While our superradiant dissipative spin amplifier exhibits remarkable performance in the ideal case where the only dissipation is the desirable collective loss in Eq. (3), it is also crucial to understand what happens when additional unwanted forms of common dissipation are added.

Local dissipation
We first consider the impact of single-spin dissipation, namely Markovian dephasing and relaxation at rates $\gamma_\phi$ and $\gamma_{\rm rel}$, respectively. The master equation for our spin ensemble, Eq. (21), now contains these local processes in addition to the collective decay. Numerically exact solutions of Eq. (21), shown in Fig. 5, demonstrate that an initial signal is still amplified if the collective cooperativities

$$C_k \equiv \frac{N\Gamma}{\gamma_k} ,$$

with $k \in \{\phi, {\rm rel}\}$, exceed a threshold value of the order of unity. This is equivalent to the threshold condition for superradiant lasing [46,66]. Further, we find that achieving the maximum gain $G \propto \sqrt{N}$ does not require strong coupling at the single-spin level: it only requires a large collective cooperativity, and not a large single-spin cooperativity $\eta_k \equiv C_k/N$.
Note that the dependence of the gain on cooperativity can be understood at a heuristic level by inspecting the MFT equations of motion (6) in the presence of local dissipation. At short times, the collective decay term tends to increase $S_y$ at a rate $N\Gamma$, whereas local dissipation tends to decrease $S_y$ at rates $\gamma_\phi$ and $\gamma_{\rm rel}/2$, respectively. Amplification is only possible if the slope of $S_y$ at $t = 0$ is positive, which is equivalent to the conditions $C_\phi > 1$ and $C_{\rm rel} > 1/2$, respectively. For weak local dissipation, i.e., $C_k \gg 1$, the numerical results shown in Fig. 5 are well described by a mean-field expression for $G_{\max}(\gamma_k)$ with numerical coefficients $a_\phi \approx 3$ and $a_{\rm rel} \approx 6$. In the opposite limit $C_k \ll 1$, there is no amplification, $G_{\max}(\gamma_k) = 1$.
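As a quick illustration of the threshold argument, the sketch below evaluates the collective cooperativity $C_k = N\Gamma/\gamma_k$ and the short-time conditions $C_\phi > 1$ and $C_{\rm rel} > 1/2$ stated in the text. The helper names and the example numbers are hypothetical; the check only encodes the heuristic slope condition at $t = 0$, not the full gain dynamics.

```python
import math


def collective_cooperativity(n_spins, gamma_collective, gamma_local):
    """C_k = N * Gamma / gamma_k; infinite if the local process is absent."""
    if gamma_local == 0.0:
        return math.inf
    return n_spins * gamma_collective / gamma_local


def amplifies_at_short_times(c_phi=math.inf, c_rel=math.inf):
    """Heuristic mean-field condition for a positive initial slope of S_y."""
    return c_phi > 1.0 and c_rel > 0.5


if __name__ == "__main__":
    # Hypothetical example: weak single-spin coupling, large ensemble.
    n_spins, gamma_coll, gamma_phi = 10 ** 6, 1e-5, 1.0
    c_phi = collective_cooperativity(n_spins, gamma_coll, gamma_phi)
    eta_phi = c_phi / n_spins  # single-spin cooperativity
    print(c_phi, eta_phi, amplifies_at_short_times(c_phi=c_phi))
```

The example illustrates the point made above: the single-spin cooperativity can be far below unity while the collective cooperativity still exceeds the threshold.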

Finite temperature
Another potential imperfection is that the reservoir responsible for collective relaxation may not be at zero temperature, giving rise to an unwanted collective excitation process. This could be relevant in setups where collective effects stem from coupling to a mechanical degree of freedom, a promising approach for ensembles of defect spins in solids [57,58,60]. In this general case, the master equation is given by Eq. (25), which contains both collective decay and collective excitation. The parameter $n_{\rm th}$ determines the relative strength of the collective decay and excitation rates and can be interpreted as an effective thermal occupation of the bath generating the collective decay. The gain as a function of $n_{\rm th}$, based on numerically exact solutions of the full quantum master equation (25), is shown in Fig. 6(a). A nonzero $n_{\rm th}$ reduces the gain compared to the ideal gain $G_{\max}$ obtained for $n_{\rm th} = 0$, and ultimately prevents any amplification in the limit $n_{\rm th} \gg 1$. MFT again allows one to develop an intuitive picture of how a nonzero bath temperature degrades the amplification dynamics. In the presence of finite temperature and for large $N$, the mean-field equations of motion (5) and (6) acquire $n_{\rm th}$-dependent terms. The impact of finite temperature $n_{\rm th} > 0$ is thus twofold. First, the time-dependent gain factor in Eq. (26) is shut off at an earlier time, namely, once the condition $S_z(t) = 1/2 + n_{\rm th}$ holds. This implies that no amplification will occur if $n_{\rm th} > N/2$. If this were the only effect, the generation of gain would be largely insensitive to thermal occupancies $n_{\rm th} \ll N$. Unfortunately, there is a second, more damaging mechanism. As the mean-field equations show, the instantaneous gain rate $\lambda(t)$ is controlled by $S_z(t)$. The decay of this polarization is seeded by both quantum and thermal fluctuations in the environment. Hence, a nonzero $n_{\rm th}$ accelerates this decay, leading to a more rapid decay of polarization and a shorter time interval where the instantaneous gain rate is positive. This ultimately suppresses the maximum gain.

FIG. 4. The superradiant decay dynamics involves all $N$ spins; therefore, the gain factor still scales $\propto \sqrt{N}$. (a) As a consequence, even if only a single spin is measured, $N_A = 1$, amplification allows one to reduce the estimation error with an SQL-like scaling. Here, we consider the ideal case $\Xi^2_{\rm det} = 0$; adding detection noise will only change a constant prefactor, which shifts the dashed blue and solid red curves vertically relative to the dash-dotted SQL curve. (b) Comparison between a measurement of the full ensemble, $N_A = N$, and a measurement of only a small subensemble, $N_A = 0.01N$, in the presence of detection noise, $\Xi_{\rm det} = 67$. For the subensemble, the initial estimation error is higher due to the smaller number of measured spins, but the gain still allows one to reduce the readout noise with a $1/N$-like scaling until intrinsic and added noise become appreciable. The plots are based on Eq. (20), the MFT scaling relations (i.e., $c_0 = 1/2$, $\sigma^2_{\rm add} = 1$), and a worst-case estimate $q = 1$ (equivalent to maximum $(\Delta S^A_y)^2$ fluctuations of the measured subensemble).

FIG. 5. (a) Maximum gain in the presence of local dephasing, $G_{\max}(\gamma_\phi)$, as a function of the collective cooperativity $C_\phi = N\Gamma/\gamma_\phi$ [calculated by numerically exact integration of the master equation (21) with $\gamma_{\rm rel} = 0$]. Each data point is obtained by maximizing the time-dependent gain $G(t)$ over the evolution time $t$. Collective amplification and local dephasing compete, and amplification is observed if $C_\phi \gtrsim 2$, i.e., if the collective amplification rate dominates over local decay. (b) Analogous numerical results for maximum gain in the presence of local relaxation, $G_{\max}(\gamma_{\rm rel})$, as a function of the collective cooperativity $C_{\rm rel} = N\Gamma/\gamma_{\rm rel}$ (with $\gamma_\phi = 0$). We again see that the collective cooperativity is the relevant parameter for obtaining maximum gain.
The above argument can be made quantitative if we expand $S_z$ for short times around its initial value, $S_z = N/2 - \delta$, where the deviation $\delta$ is small. To leading order in $N$ and $\delta$, the equation of motion of the deviation $\delta$ is $d\delta/dt = N\Gamma(1 + n_{\rm th}) + N\Gamma\delta$, where the first term shows explicitly that both bath vacuum fluctuations and thermal fluctuations drive the initial decay of polarization. As a consequence, the superradiant emission occurs faster and, in the limit $N \gg 1 + 2n_{\rm th}$, the time to reach maximum amplification is shortened. In the same limit, the maximum gain $G_{\max}(n_{\rm th})$ is given by Eq. (29), which shows that a thermal occupation of $n_{\rm th} = 3$ will decrease the gain by 3 dB. Note that $G_{\max}(n_{\rm th})$ still scales $\propto \sqrt{N}$, i.e., for a fixed value of $n_{\rm th}$, the reduction can be compensated by increasing the number of spins. The experimental demonstration of superradiance in NV-center spins by Angerer et al. [67] was performed at 25 mK. The spins were resonant with a microwave cavity at a frequency of about 3 GHz, which corresponds to a thermal occupation of $n_{\rm th} \approx 0.002 \ll 1$.

FIG. 6. Maximum gain in the presence of a non-zero thermal occupancy $n_{\rm th}$ of the environment responsible for collective spin decay (horizontal axis: thermal occupation number $n_{\rm th}$), obtained by numerically exact solution of the master equation (25). Each data point is obtained by maximizing the gain $G(t)$ over the evolution time $t$. We see that a non-zero bath thermal occupancy rapidly degrades gain. At a heuristic level, the decay of the ensemble's $z$ polarization is seeded by both thermal and quantum bath fluctuations. Nonzero thermal fluctuations hence accelerate the decay, leading to a shorter time interval where the instantaneous gain rate $\lambda(t)$ is positive. This allows one to quantitatively understand the suppression of maximum gain seen here, see Eq. (29).
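The order of magnitude of the thermal occupation quoted for the microwave-cavity experiment can be estimated from the Bose-Einstein distribution, $n_{\rm th} = 1/(e^{hf/k_B T} - 1)$. The sketch below assumes a bath in thermal equilibrium at the stated temperature; the function name is illustrative, and because the text only gives the frequency as "about 3 GHz", the result should be read as an order-of-magnitude check (the value is exponentially sensitive to the exact frequency and temperature).

```python
import math

PLANCK_H = 6.62607015e-34      # Planck constant in J s (exact SI value)
BOLTZMANN_KB = 1.380649e-23    # Boltzmann constant in J/K (exact SI value)


def thermal_occupation(freq_hz, temp_k):
    """Bose-Einstein occupation of a mode at frequency freq_hz and temperature temp_k."""
    x = PLANCK_H * freq_hz / (BOLTZMANN_KB * temp_k)
    return 1.0 / math.expm1(x)  # expm1 keeps precision when x is small


if __name__ == "__main__":
    # Microwave mode near 3 GHz at 25 mK: n_th is of order 10^-3.
    for freq in (3.0e9, 3.2e9):
        print(f"f = {freq / 1e9:.1f} GHz: n_th = {thermal_occupation(freq, 0.025):.4f}")
```

For comparison, a mechanical mode at a much lower frequency would have a far larger occupation at the same temperature, which is why the finite-temperature analysis matters most for the mechanical implementations mentioned above.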

F. Implementation using cavity-mediated dissipation
While there are many ways to engineer the collective relaxation that powers our superradiant amplifier, we specialize here to a ubiquitous realization that allows the tunability we require: coupling the spin ensemble to a common lossy bosonic mode. To that end, we consider a setup where $N$ spin-1/2 systems are coupled to a damped bosonic mode $\hat{a}$ by a standard Tavis-Cummings coupling (see Fig. 7):

$$\hat{H} = \omega_{\rm cav} \hat{a}^\dagger \hat{a} + \sum_{j=1}^{N} \frac{\omega_j}{2} \hat{\sigma}^{(j)}_z + \sum_{j=1}^{N} g_j \left( \hat{\sigma}^{(j)}_+ \hat{a} + \hat{\sigma}^{(j)}_- \hat{a}^\dagger \right) . \qquad (30)$$

Here, $\omega_{\rm cav}$ and $\omega_j$ denote the frequencies of the bosonic mode and the spins, respectively, and $g_j$ denotes the coupling strength of spin $j$ to the bosonic mode. The bosonic mode is damped at an energy decay rate $\kappa$ and the entire system is thus described by the quantum master equation

$$\frac{d\hat{\rho}}{dt} = -i [\hat{H}, \hat{\rho}] + \kappa \mathcal{D}[\hat{a}] \hat{\rho} , \qquad (31)$$

with the standard Lindblad dissipator $\mathcal{D}[\hat{O}]\hat{\rho} = \hat{O}\hat{\rho}\hat{O}^\dagger - \{\hat{O}^\dagger \hat{O}, \hat{\rho}\}/2$. For collective phenomena, we ideally want all spins to have the same frequency, $\omega_j = \bar{\omega}$, and to be equally coupled to the cavity, $g_j = g$. For superradiant decay, we further want the spins to be resonant with the cavity, i.e., to have zero detuning $\omega_{\rm cav} - \bar{\omega} = 0$. If, in addition, the bosonic mode is strongly damped, $\kappa \gg \sqrt{N} g$, the $\hat{a}$ mode can be eliminated adiabatically, which gives rise to the spin-only master equation (3) with a collective decay rate

$$\Gamma = \frac{4 g^2}{\kappa} \qquad (32)$$

and $\gamma_{\rm rel} = 0$. Returning to Fig. 1, note that a crucial part of our protocol is the ability to turn the collective dissipation on and off on demand (i.e., to start the amplification dynamics at the appropriate point in the measurement sequence, and then turn it off once maximum gain is reached). This implementation provides a variety of means for doing this. Perhaps the simplest is to control the spin-cavity detuning $\Delta$ by, e.g., changing the magnetic field applied to the spins along the $z$ direction. In the limit of an extremely large detuning, the superradiant decay rate $\Gamma$ is suppressed compared to Eq. (32) by the small factor $\kappa^2/(\kappa^2 + 4\Delta^2) \ll 1$. In the following, we separately analyze the impact of coupling inhomogeneities, $g_j \neq g$, and of inhomogeneous broadening, $\omega_j \neq \bar{\omega}$.
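The on/off switching by detuning can be made concrete with a short numerical sketch. It assumes the resonant collective decay rate $\Gamma = 4g^2/\kappa$ (consistent with the rate $\gamma_0 = \sum_k 4g_k^2/\kappa N$ used later for inhomogeneous couplings) together with the Lorentzian suppression factor $\kappa^2/(\kappa^2 + 4\Delta^2)$ quoted in the text. The function names, the `margin` parameter, and the example values are hypothetical.

```python
import math


def collective_decay_rate(g, kappa, detuning=0.0):
    """Assumed single-spin-normalized decay rate Gamma after eliminating the damped mode."""
    resonant_rate = 4.0 * g ** 2 / kappa
    suppression = kappa ** 2 / (kappa ** 2 + 4.0 * detuning ** 2)
    return resonant_rate * suppression


def in_bad_cavity_regime(n_spins, g, kappa, margin=10.0):
    """Rough check of kappa >> sqrt(N) g; 'margin' is an arbitrary illustrative factor."""
    return kappa >= margin * math.sqrt(n_spins) * g


if __name__ == "__main__":
    g, kappa = 1.0, 1.0e4  # arbitrary units, hypothetical values
    on = collective_decay_rate(g, kappa)
    off = collective_decay_rate(g, kappa, detuning=50.0 * kappa)
    print(f"Gamma on resonance: {on:.3e}, far detuned: {off:.3e}, ratio: {off / on:.1e}")
```

In this picture, detuning the spins by many cavity linewidths reduces $\Gamma$ by several orders of magnitude, which is one way to end the amplification step once maximum gain is reached.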

Non-uniform single-spin couplings
To analyze the impact of inhomogeneous coupling parameters $g_j$, we follow the standard approach outlined in Ref. [44]. It uses an expansion of the mean-field equations to leading order in the deviations $\delta_j = g_j - \bar{g}$ from the average coupling $\bar{g} = \sum_{j=1}^{N} g_j/N$. The inhomogeneity rescales the effective number of spins associated with the ensemble by a disorder-dependent factor $\mu$, i.e., the maximum gain and the optimal time are now given by $G^{\rm ci} = \sqrt{\mu N}/2$ and $t^{\rm ci}_{\max} = \ln(\mu N)/(\gamma_0 \mu N)$, respectively, where we defined $\gamma_0 = \sum_{k=1}^{N} 4 g_k^2/(\kappa N)$. Hence, the maximum gain $G_{\max}$ is reduced by a disorder-dependent prefactor, but the fundamental scaling is retained.

FIG. 7. The master equations (3) and (21) can be implemented experimentally by coupling $N$ spin-1/2 systems (level splittings $\omega_j$) to a strongly damped bosonic mode (frequency $\omega_{\rm cav}$ and single-spin coupling strengths $g_j$). The mode is depicted here as a resonant mode of a photonic cavity, but one could use a wide variety of systems (e.g., microwave or mechanical modes). The energy decay rate of the bosonic mode is $\kappa$ and each spin may undergo local relaxation or dephasing processes at rates $\gamma_{\rm rel}$ or $\gamma_\phi$, respectively.

FIG. 8. Dynamical decoupling sequence (pulses $\pi_x$, $\pi_z$, $\pi_y$ versus time) to cancel inhomogeneous broadening in the Hamiltonian (30). The $\pi$ pulse about the $x$ axis cancels disorder in the spin frequencies $\omega_j$. The subsequent $\pi$ pulses about the $z$ and $y$ axes compensate unwanted interaction terms generated by the first $\pi$ pulse. The overall pulse sequence is applied repeatedly and generates the average Hamiltonian (34) if the repetition rate $1/T$ is much larger than the standard deviation of the distribution of the spin transition frequencies $\omega_j$.

Inhomogeneous broadening
Inhomogeneous broadening can be canceled by the dynamical decoupling sequence introduced recently in Ref. [68], which is summarized in Fig. 8. Different spin transition frequencies $\omega_j$ in Eq. (30) lead to a dephasing of the individual spins in the ensemble, which can be compensated by a $\pi$ pulse about the $x$ axis halfway through the sequence. However, this pulse will modify the interaction term in Eq. (30) and will turn collective decay into collective excitation. This can be compensated by a $\pi$ pulse about the $z$ axis at time $3T/4$, which changes the sign of the coupling constants $g_j$. Note that such a pulse can be generated using a combination of $x$ and $y$ rotations. The final $\pi$ pulse about the $y$ axis at time $T$ reverts all previous operations and restores the original Hamiltonian (30). The average Hamiltonian of this pulse sequence in a frame rotating at $\omega_0$ is given by Eq. (34), provided that the repetition rate $1/T$ of the decoupling sequence is much larger than the standard deviation of the distribution of the frequencies $\omega_j$. More details on the derivation of this decoupling sequence are provided in a recent publication [68]. If one chooses not to use dynamical decoupling, the analysis outlined in Sec. II F 1 can be adapted to estimate the effect of inhomogeneous broadening on the superradiant decay dynamics [63].
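The single-spin transformations behind the pulse sequence described above can be verified directly with Pauli matrices. The sketch below is a minimal check, assuming ideal instantaneous $\pi$ pulses $\exp(-i\pi\hat{\sigma}_a/2) = -i\hat{\sigma}_a$ acting on one spin in the basis $(|\uparrow\rangle, |\downarrow\rangle)$; the helper names are illustrative. It confirms that the $x$ pulse flips the sign of $\hat{\sigma}_z$ (refocusing frequency disorder) and exchanges $\hat{\sigma}_-$ and $\hat{\sigma}_+$ (turning decay into excitation), while the $z$ pulse flips the sign of $\hat{\sigma}_\pm$ (equivalent to $g_j \to -g_j$).

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def pi_pulse(sigma):
    """Ideal pi rotation about the axis of the given Pauli matrix: exp(-i pi sigma / 2)."""
    return scale(-1j, sigma)


def conjugate(u, op):
    """Transformed operator U op U^dagger."""
    return matmul(matmul(u, op), dagger(u))


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]
S_MINUS = [[0, 0], [1, 0]]   # lowers |up> to |down>
S_PLUS = [[0, 1], [0, 0]]

if __name__ == "__main__":
    print(close(conjugate(pi_pulse(SX), SZ), scale(-1, SZ)))            # frequency term changes sign
    print(close(conjugate(pi_pulse(SX), S_MINUS), S_PLUS))              # decay becomes excitation
    print(close(conjugate(pi_pulse(SZ), S_MINUS), scale(-1, S_MINUS)))  # coupling changes sign
```

This only checks the operator algebra of the individual pulses; the averaging over the full sequence and its validity condition are discussed in Ref. [68].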

Limit of an undamped cavity
Returning to our cavity-based implementation of the superradiant spin amplifier in Eqs. (30) and (31), one might worry about whether this physics also persists in the regime where the cavity damping rate $\kappa$ is not large enough to allow for an adiabatic elimination. To address this, we briefly consider the extreme limit of this situation, $\kappa \to 0$, where we simply obtain a completely unitary dynamics generated by the resonant Tavis-Cummings Hamiltonian (35) with $\omega_{\rm cav} = \omega$. Figure 9 shows numerical results for the time-maximized gain $G_{\max}$ starting from an initial state $e^{i\phi\hat{S}_x} |\uparrow \ldots \uparrow\rangle \otimes |0\rangle$, where $|0\rangle$ denotes the vacuum state of the cavity. A complementary analysis based on MFT is discussed in the Supplemental [53]. We find that spin amplification dynamics still holds in the unitary regime, with an identical $G_{\max} \propto \sqrt{N}$ scaling of the maximum gain. We stress that realizing this limit of fully unitary collective dynamics is challenging in most spin-ensemble sensing platforms. Nonetheless, this limit shows that our amplification dynamics will survive even if the adiabatic elimination condition $\sqrt{N} g \ll \kappa$ that leads to Eq. (31) is not perfectly satisfied. This further enhances the experimental flexibility of our scheme.
Although both the dissipative and the unitary case yield G max ∝ √ N , the underlying dynamics is quite different. The time t max to reach maximum amplification in the coherent case, shown in Fig. 9(a), is parametrically longer if we consider the limit of a large number of spins N : t max ∝ ln √ N / √ N (as opposed to a t max ∝ ln N/N scaling in the dissipative case). Consequently, the instantaneous gain rates λ(t) are also quite different in both cases: whereas dissipative superradiant decay has an almost constant instantaneous gain rate over a very short time, the gain in the Tavis-Cummings model is nonmonotonic, starts at zero, and grows at short times, as shown in Fig. 9(b).
Note that, for the coherent Tavis-Cummings model, the timescale for maximum amplification is analogous to the timescale that governs quasi-periodic oscillations of excitation number in the large-N limit; this latter phenomenon has been derived analytically in previous work [69][70][71]. However, the semiclassical approach used in these works fails to accurately describe the gain physics that is of interest here (see Supplemental [53]). Finally, in the Supplemental, we show that the added noise in the unitary case is also close to the expected quantum limit. Surprisingly, it is approximately equal to what we have found in the dissipative limit, σ 2 add ≈ 1.3.

A. Comparison and advantages over unitary OAT amplification schemes
The dissipative spin amplification scheme introduced in this work is effective in the presence of collective loss, and in fact harnesses it as a key resource. As discussed in the Introduction, this is in sharp contrast to conventional approaches that use unitary dynamics to improve sensing in the presence of measurement noise: such approaches become infeasible with even small amounts of $T_1$ relaxation (whether collective or single-spin in nature). To illustrate this, we focus on the scheme presented in the seminal work by Davis et al. [26], where OAT dynamics is used to generate effective spin amplification. This scheme involves starting a spin ensemble in a CSS $|\psi_0\rangle$ that is fully polarized in the $x$ direction. The protocol then corresponds to the composite unitary evolution

$$\hat{U} = \hat{U}_{\rm amp} \hat{R}_y(\phi) \hat{U}_{\rm sqz} .$$

The first step corresponds to the generation of squeezing using the OAT Hamiltonian $\hat{H}_{\rm OAT} = \chi \hat{S}_z^2$ for a time $t$, i.e., $\hat{U}_{\rm sqz} = \exp(-i\hat{H}_{\rm OAT} t)$. The next step is signal acquisition: the state is rotated by a small angle $\phi$ about the $\hat{S}_y$ axis, via the unitary $\hat{R}_y(\phi) = e^{-i\phi\hat{S}_y}$. Finally, the last step is another evolution under the OAT interaction Hamiltonian, for an identical time $t$ as the first step, but with an opposite sign of the interaction, $\chi \to -\chi$, i.e., $\hat{U}_{\rm amp} = \hat{U}_{\rm sqz}^{-1}$. In this scheme, the final signal gain is created entirely by the last OAT evolution step $\hat{U}_{\rm amp}$; the first "presqueezing" step only serves to control the fluctuations in the final state. The suppression of the readout-noise term $\Xi^2_{\rm det}$ depends only on the maximum gain $G_{\max}$, as shown in Eq. (15). Since we consider a regime where readout noise is dominant, we ignore the initial squeezing step in the following discussion and focus only on the gain of the OAT protocol.
We thus consider a CSS that is almost completely polarized in the $x$ direction, with a small $z$ polarization that encodes the signal rotation $\phi$ of interest. Without dissipation, the OAT Hamiltonian $\hat{H}_{\rm OAT} = \chi \hat{S}_z^2$ leads to the Heisenberg equation of motion

$$\frac{d\hat{S}_y}{dt} = i[\hat{H}_{\rm OAT}, \hat{S}_y] = \chi \left( \hat{S}_x \hat{S}_z + \hat{S}_z \hat{S}_x \right) .$$

For short times, we have $S_x \approx N/2$, and the OAT interaction causes the expectation value of $S_y$ to grow linearly in time at a rate set by the initial "signal" value of $S_z = N\sin(\phi)/2$. The amplified signal is thus contained in $S_y$, and it is this spin component that is ultimately read out. We can thus define the signal gain $G^{\rm OAT}(t)$ analogously to Eq. (10). Note that the amplification mechanism here is analogous to bosonic amplification using a QND interaction [72,73]. In the spin system, nonlinearity eventually causes the growth of $S_y$ to saturate, leading to a maximum gain at a time $t^{\rm OAT}_{\max} \propto 1/(\chi\sqrt{N})$ [26]. A crucial aspect of the OAT gain-generation mechanism is the conservation of $\hat{S}_z$ (analogous to a QND structure in the bosonic system). This leaves it vulnerable to any unwanted dissipative dynamics that breaks this conservation law. Unfortunately, such symmetry breaking is common in many standard sensing setups. Consider perhaps the simplest method for realizing an OAT Hamiltonian, where the spin ensemble is coupled to a bosonic mode (e.g., a photonic cavity mode or mechanical mode) via the Tavis-Cummings Hamiltonian (35). Working in the large detuning limit $|\Delta| = |\omega_{\rm cav} - \omega| \gg g$ allows one to adiabatically eliminate the bosonic mode. This results in both the desired OAT Hamiltonian interaction and a collective loss dissipator associated with the loss rate $\kappa$ of the cavity mode, as described by the master equation (39), where the OAT strength is $\chi = g^2/\Delta$ and the collective decay rate is $\Gamma_{\rm coll} = \chi\kappa/\Delta$. Equation (39) also includes the Lindblad terms for single-spin relaxation and dephasing. We now have an immediate problem: even with no signal (i.e., $\phi = 0$), the collective loss will cause $S_z$ to grow in magnitude during the amplification part of the protocol.
This will result in a relatively large contribution to $S_y$ that is indistinguishable from the presence of a signal. An approximate mean-field treatment shows that, for $\phi \ll 1$ and short times, the average of $\hat{S}_z$ contains a $\phi$-independent contribution induced by relaxation in addition to the contribution from the signal. In the limit of interest $\phi \to 0$, the average $z$ polarization induced by relaxation will completely dominate the contribution from the signal $\phi$, which translates into the final measured quantity $\hat{S}_y$ being swamped by a large $\phi$-independent contribution. This behaviour is indeed seen in full numerical simulations of the dynamics, as depicted in Fig. 10(a). Note that single-spin relaxation will have an analogous effect here to collective relaxation. One might think that this problem is merely a technicality that could be dealt with by simply subtracting off the $\phi$-independent background. However, this would require an extremely precise calibration that would be difficult if not impossible to reliably implement in most cases of interest. Alternatively, one could try to reduce the deleterious impact of $\Gamma_{\rm coll}$ by using a very large detuning $\Delta$ (since $\Gamma_{\rm coll}/\chi \propto 1/\Delta$). This strategy is also not effective if there is any appreciable single-spin dissipation. Consider for example the case where there is non-zero single-spin dephasing at a rate $\gamma_\phi$. In this case (and neglecting for a moment collective loss, i.e., $\Gamma = 0$), one can show using the exact solution of the master equation (39) reported in Refs. [74,75] that the gain $G(t)$ of the OAT protocol is reduced by a factor that decays exponentially in time at a rate set by $\gamma_\phi$. To obtain a large gain $G^{\rm OAT}(t) \propto \sqrt{N}$, it is thus crucial that $t^{\rm OAT}$ be at most of the order of $1/\gamma_\phi$, which precludes the use of indefinitely large detuning.
To study the joint impact of both collective and local relaxation in more detail, we use MFT and consider the gain after background subtraction, $G^{\rm OAT}_{\rm sub}(t)$, which is defined in terms of the signal after background subtraction, $\delta\langle\hat{S}_y(t)\rangle = \langle\hat{S}_y(t,\phi)\rangle - \langle\hat{S}_y(t,0)\rangle$. Figure 10(b) shows $G^{\rm OAT}_{\rm sub}(t)$ (evaluated at its first peak at time $t = \tau_1$, see inset of Fig. 10(a)) as a function of the single-spin cooperativity $\eta_k = 4g^2/\kappa\gamma_k$, where $k \in \{\phi, {\rm rel}\}$. For each data point, we optimize over the detuning $\Delta$; the optimal values are shown in the inset of Fig. 10(b). While the inset of Fig. 10(a) suggests that local maxima of $G^{\rm OAT}_{\rm sub}(t)$ beyond the first peak at $t = \tau_1$ may lead to larger amplification, this is an artifact of having no single-spin dissipation. By integrating the quantum master equation (39) numerically for $N = 20$ spins, we have explicitly verified that the performance of the OAT amplification scheme is not improved by considering time evolution past the first maximum of $G^{\rm OAT}_{\rm sub}(t)$ if local dissipation is taken into account (i.e., the first gain peak at $t = \tau_1$ is the optimal choice). In the presence of both collective and local dissipation, we find that amplification in the OAT scheme is strongly reduced unless the single-spin cooperativity is at least of order $\sqrt{N}$ (for dephasing, $\eta_\phi$) or $N^{0.9}$ (for relaxation, $\eta_{\rm rel}$); see Supplemental [53]. Note that this condition becomes harder to satisfy as the spin number $N$ grows. This is in sharp contrast to our dissipative amplification scheme, which only requires the collective cooperativity $C_k$ to exceed a threshold of order unity. We thus find that the OAT amplification scheme is of extremely limited utility in the standard case where dissipative two-level systems have a Tavis-Cummings coupling to a common bosonic mode: even if one could perform the subtraction of a large $\phi$-independent background, achieving maximum amplification requires an unrealistically large value of the single-spin cooperativity, which is out of reach on solid-state quantum sensing platforms.

FIG. 10. OAT amplification scheme [26] for an ensemble of $N$ standard two-level systems coupled to a detuned bosonic mode via a Tavis-Cummings coupling. The initial signal is encoded in $\langle\hat{S}_z\rangle$, whose value is then transduced to $\langle\hat{S}_y\rangle$ with gain, see Eq. (37). (a) Time evolution of $\langle\hat{S}_y\rangle$, both with (solid blue curve) and without (dashed orange curve) an initial small signal $\phi$ (obtained by numerically exact solution of the master equation (39) for $N = 500$, $\Gamma_{\rm coll}/\chi = 0.02$, and $\gamma_{\rm rel}/\chi = \gamma_\phi/\chi = 0$). Collective decay leads to a large average value of $\langle\hat{S}_y\rangle$ even without any initial signal $\phi$. The inset shows the tiny signal obtained after subtraction of this background, $\delta\langle\hat{S}_y(t)\rangle = \langle\hat{S}_y(t,\phi)\rangle - \langle\hat{S}_y(t,0)\rangle$. The black dashed line is the optimal gain one could reach in the absence of collective dissipation. (b) Scaling of the gain after background subtraction, $G^{\rm OAT}_{\rm sub}(\tau_1)$, in the presence of collective decay and either single-spin dephasing (blue curve) or single-spin relaxation (orange curve) as a function of the respective single-spin cooperativity $\eta_k = 4g^2/\kappa\gamma_k$ with $k \in \{\phi, {\rm rel}\}$. We evaluate the gain at its first peak at time $t = \tau_1$ (indicated in the inset of (a) by the gray dotted vertical line), which is the time of maximum gain if local dissipation is taken into account. The gain $G^{\rm OAT}_{\rm sub}(\tau_1)$ is strongly reduced compared to its ideal value in the absence of dissipation (dotted black line; note that MFT predicts this quantity to be slightly smaller than master-equation simulations), unless the single-spin cooperativity is at least of order $\sqrt{N}$ ($\eta_\phi$) or $N^{0.9}$ ($\eta_{\rm rel}$); see Supplemental [53]. Simulations were done using MFT for $N = 10^4$ spins. For each $\eta_k$, the detuning $\Delta$ was optimized, with the optimized values shown in the inset.
We stress that, as already discussed in Ref. [26], one can largely circumvent the above problems by using spin ensembles where each constituent spin has more than two levels. For instance, one can then use two extremely long-lived ground-state spin levels for the sensing and generate the OAT interaction using an auxiliary third level of each spin and a driven cavity [61]. In this case, cavity decay does not lead to a collective relaxation process, only collective dephasing. Since there is no net tendency for $S_z$ to relax, one does not need to do a large, calibrated background subtraction. Aspects of the effect of the collective dephasing (as well as incoherent spin flips generated by spontaneous emission) were analyzed in Ref. [26]. While this general approach is well suited to several atomic platforms, it is more restrictive than the case we analyze, where we simply require an ensemble of two-level systems.

B. Experimental implementations
The focus of this paper is not on one specific experimental platform, but is rather to illuminate the general physics of the collective spin amplification process, a mechanism relevant to many different potential systems. While there are many AMO platforms capable of realizing our resonant, dissipative Tavis-Cummings model, we wish to particularly highlight potential solid-state implementations based on defect spins. These systems have considerable promise in the context of quantum sensing, but usually suffer from the practical obstacle that the ensemble readout is far above the SQL [16].
We start by noting that recent work has experimentally demonstrated superradiance effects in sensing-compatible solid-state spin ensembles [67,76]. Angerer et al. [67] demonstrated superradiant optical emission from $N \approx 10^{16}$ negatively charged NV centers, which were homogeneously coupled to a microwave cavity mode in the fast cavity limit, i.e., with a decay rate $\kappa$ much larger than all other characteristic rates in the system. Moreover, improved setups with collective cooperativities larger than unity were reported, and ways to increase the collective cooperativities even more have been discussed [21,77]. The essential ingredients to observe superradiant spin amplification in large ensembles of NV defects coupled to microwave modes have thus been demonstrated experimentally. Instead of a microwave cavity mode, the bosonic mode $\hat{a}$ could also be implemented by a mechanical mode that is strain-coupled to defect centers [78], e.g., employing mechanical cantilevers [57], optomechanical crystals [60], bulk resonators [58], or surface-acoustic-wave resonators [79]. In addition to NV centers, silicon vacancy (SiV) defect centers could be used [80,81], which offer larger and field-tunable spin-mechanical coupling rates. Superradiant amplification could then pave a way to dramatically reduce the detrimental impact of detection noise and to approach SQL scaling.

IV. CONCLUSION
In this work, we have proposed and analyzed a simple yet powerful protocol to reduce the detrimental impact of readout noise in quantum metrology protocols. Unlike previous ideas for spin amplification, our protocol is effective for dissipative ensembles of standard two-level systems, and does not require a large single-spin cooperativity. It allows a system with a highly inefficient spin readout to ultimately reach the SQL within a factor of two.
Our protocol uses the well-known physics of superradiant decay for a new task, namely, amplification of a signal encoded initially in any transverse component of a spin ensemble. In contrast to usual treatments of superradiance, we are not interested in the emitted radiation. Instead, we use superradiance as a tool to induce nonlinear amplification dynamics in the spin system. The gain factor of our protocol achieves the maximum possible scaling, $G_{\max} \propto \sqrt{N}$ in the large-$N$ limit. The added noise associated with the amplification is close to the minimum allowed value one would expect for a quantum-limited bosonic amplifier. While single-spin dissipation and finite temperature do reduce the gain, they do not change the fundamental scaling $\propto \sqrt{N}$. In the case of single-spin dissipation, we stress that maximum gain can be achieved by having a large collective cooperativity, i.e., one does not need a large single-spin cooperativity. Our protocol is compatible with standard dynamical decoupling techniques to mitigate inhomogeneous broadening effects. Note that another unique aspect of our scheme is that it amplifies all spin directions perpendicular to the $z$ axis equally (as opposed to only amplifying a single direction in spin space). This could potentially be a useful tool in measurement schemes beyond generalized Ramsey protocols.
Our work also suggests several fruitful directions for future work. It would be interesting to combine the dissipative amplification mechanism introduced here with dissipative spin squeezing to achieve near-Heisenberg-limited sensitivity in systems with highly imperfect spin readout. On a fundamental level, the intrinsic nonlinearity of spin systems requires generalizations of the existing bounds on added noise of phase-preserving amplifiers. The fact that the amount of added noise is very similar both in the purely dissipative case and in the coherent $\kappa \to 0$ limit may hint at a more fundamental reason to explain the numerically found level of $\sigma^2_{\rm add} \approx 1.3$. Regarding experimental platforms for quantum sensing, it would also be interesting to study the dynamics and utility of dissipative spin amplification in ensembles where intrinsic dipolar spin-spin interactions are strong.
In this Appendix, we provide details on the fluorescence readout [16], which we model in terms of a positive operator-valued measurement (POVM). A more general derivation, which does not use the language of POVMs and keeps the readout method general, is given in the Supplemental [53].
Fluorescence readout measures each of the $N$ spins in the $z$ basis, i.e., $\hat{\sigma}_z^{(j)} |\sigma_j\rangle = \sigma_j |\sigma_j\rangle$ with $\sigma_j = \pm 1$ and $j \in \{1, \dots, N\}$. Each spin emits $n_j$ photons independently of the state of the other spins in the sample, with a state-dependent Poissonian probability distribution $P_{\sigma_j}(n_j)$. The mean and variance of $P_{\sigma_j}(n_j)$ are given by $n_b$ ($n_d$) if the spin is in the bright $\sigma_j = -1$ (dark $\sigma_j = +1$) state. Thus, the readout of each single spin can be modeled by a POVM with measurement operator $\hat{M}_{n_j,\sigma_j} = \sqrt{P_{\sigma_j}(n_j)}\, |\sigma_j\rangle\langle\sigma_j|$ and effect operator $\hat{E}_{n_j} = \sum_{\sigma_j} P_{\sigma_j}(n_j)\, |\sigma_j\rangle\langle\sigma_j|$ defining the probability that spin $j$ emits $n_j$ photons.
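The photodetector only measures the total number of photons $n = \sum_{j=1}^{N} n_j$, which is described by the effect operator
$$\hat{E}_n = \sum_{\sigma_1,\dots,\sigma_N} P_{\sigma_1,\dots,\sigma_N}(n)\, |\sigma_1,\dots,\sigma_N\rangle\langle\sigma_1,\dots,\sigma_N| . \tag{A1}$$
Here, $P_{\sigma_1,\dots,\sigma_N}(n)$ is the convolution of all $N$ single-spin probability distributions $P_{\sigma_j}(n_j)$, i.e., a Poissonian distribution with mean and variance $N_b n_b + N_d n_d$, where $N_b$ ($N_d$) is the number of spins in the bright (dark) state.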
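It is now convenient to switch to a basis $|j, m\rangle$ of simultaneous eigenstates of the collective operators $\hat{S}^2$ and $\hat{S}_z$, such that $N_b$ and $N_d$ are simply related to the $\hat{S}_z$ quantum number,
$$N_b = \frac{N}{2} - m , \qquad N_d = \frac{N}{2} + m .$$
Note that the permutation invariance of the spins allows us to focus on an effective basis which averages over the degeneracy of the total-angular-momentum subspaces with $j < N/2$ [82]. In this basis, we can rewrite the effect operator (A1) as
$$\hat{E}_n = \sum_{j,m} P_m(n)\, |j,m\rangle\langle j,m| ,$$
where $P_m(n)$ is a Poissonian distribution with mean and variance
$$N_b n_b + N_d n_d = n_{\mathrm{avg}} \left( N - 2 \tilde{C} m \right) .$$
Here, we defined the average number of emitted photons per spin $n_{\mathrm{avg}} = (n_b + n_d)/2$ and the contrast between the bright and the dark state $\tilde{C} = (n_b - n_d)/(n_b + n_d)$ [8,16,18]. The average number of detected photons for the state $\hat{\rho}$ is now given by
$$\bar{n} = \sum_n n \,\mathrm{Tr}\!\left[ \hat{E}_n \hat{\rho} \right] = n_{\mathrm{avg}} \left( N - 2 \tilde{C} \langle \hat{S}_z \rangle \right) ,$$
and its variance by
$$\begin{aligned}
(\Delta n)^2 ={}& \sum_{j,m} \rho_{jm} \left[ \sum_n n^2 P_m(n) - \Big( \sum_n n P_m(n) \Big)^2 \right] \\
&+ \sum_{j,m} \rho_{jm} \Big( \sum_n n P_m(n) \Big)^2 \\
&- \Big( \sum_{j,m} \rho_{jm} \sum_n n P_m(n) \Big)^2 ,
\end{aligned} \tag{A5}$$
with $\rho_{jm} = \langle j,m | \hat{\rho} | j,m \rangle$, where we added a zero in the last step. This allows us to separate two contributions to the variance $(\Delta n)^2$: the classical noise which is added by the detector due to the fact that the probability distribution $P_m(n)$ has a finite variance for each basis state $|j, m\rangle$ (the term in the square brackets), and the intrinsic quantum fluctuations of the state $\hat{\rho}$ expressed in terms of the measured photon number $n$ (the last two lines). Using the explicit expressions for the mean and variance of $P_m(n)$, we find
$$(\Delta n)^2 = n_{\mathrm{avg}} \left( N - 2 \tilde{C} \langle \hat{S}_z \rangle \right) + 4 n_{\mathrm{avg}}^2 \tilde{C}^2 (\Delta S_z)^2 ,$$
which can be referred back to an uncertainty in $\phi$ using the transduction factor $\partial_\phi \bar{n} = -2 n_{\mathrm{avg}} \tilde{C}\, \partial_\phi \langle \hat{S}_z \rangle$,
$$(\Delta \phi)^2 = \frac{(\Delta n)^2}{|\partial_\phi \bar{n}|^2} = \frac{N - 2 \tilde{C} \langle \hat{S}_z \rangle}{4 \tilde{C}^2 n_{\mathrm{avg}} |\partial_\phi \langle \hat{S}_z \rangle|^2} + \frac{(\Delta S_z)^2}{|\partial_\phi \langle \hat{S}_z \rangle|^2} . \tag{A7}$$
For a small signal $\langle \hat{S}_z \rangle = N\phi/2$ with $\phi \ll 1$, one can ignore the second term in the numerator of the detection-noise term.

Note that these equations are given in terms of the basis of the final measurement at times $t_3 < t \leq t_4$ in Fig. 1(a), which is the $\hat{S}_z$ spin component. In the main text, we discuss everything in terms of the final state of the amplification step at $t = t_3$. It differs from the measured state by the $\pi/2$ rotation at $t = t_3$, which maps $\hat{S}_y \to \hat{S}_z$ and $(\Delta S_y)^2 \to (\Delta S_z)^2$.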
To discuss the scaling of the two terms in Eq. (A7), we focus on two typical probe states in a Ramsey experiment: CSS and spin-squeezed states.
For a CSS, the slope of the signal depends on the length of the spin vector, $|\partial_\phi \langle \hat{S}_z \rangle| = N/2$, i.e., the first (detection-noise) term has an SQL-like scaling $1/(\tilde{C}^2 n_{\mathrm{avg}} N) \propto 1/N$ with a readout-dependent prefactor $1/(\tilde{C}^2 n_{\mathrm{avg}})$. The CSS variance is $(\Delta S_z)^2 = N/4$; therefore, the second (projection-noise) term reduces to $1/N$. In the absence of amplification, the measurement error $(\Delta\phi)^2$ thus has an SQL-like $1/N$ scaling with a readout-noise-dependent prefactor $1 + 1/(\tilde{C}^2 n_{\mathrm{avg}})$.
For a spin-squeezed state, the slope of the signal depends on the effective length of the spin vector along the mean spin direction, $|\partial_\phi \langle \hat{S}_z \rangle| = |\langle \hat{S}_{\mathrm{msd}} \rangle| \leq N/2$. Since a spin-squeezed state wraps around the Bloch sphere for sufficiently large squeezing, $|\langle \hat{S}_{\mathrm{msd}} \rangle|$ decreases with increasing squeezing strength. The first (detection-noise) term thus reduces to
$$\frac{N}{4 \tilde{C}^2 n_{\mathrm{avg}} |\langle \hat{S}_{\mathrm{msd}} \rangle|^2} \geq \frac{1}{\tilde{C}^2 n_{\mathrm{avg}} N} ,$$
i.e., detection noise is larger for a spin-squeezed state than for a simple CSS if squeezing is sufficiently strong to reduce $|\langle \hat{S}_{\mathrm{msd}} \rangle|$. The second (projection-noise) term can be expressed as $\xi_R^2/N$, where we introduced the Wineland parameter $\xi_R^2 = N \min_\perp (\Delta S_\perp)^2 / |\langle \hat{S}_{\mathrm{msd}} \rangle|^2$ and $\perp$ denotes the directions perpendicular to the mean-spin direction [13,14,83]. Using spin squeezing, one can push the Wineland parameter below unity such that the projection noise reaches at best a Heisenberg-like $1/N^2$ scaling. Note that, in the presence of a very bad readout, $1/(\tilde{C}^2 n_{\mathrm{avg}}) \gg 1$, this optimizes an almost irrelevant term of the overall measurement error and thus does not improve $(\Delta\phi)^2$ significantly. As a consequence, the overall measurement error $(\Delta\phi)^2$ still scales $\propto 1/N$, and the loss of signal slope, $|\langle \hat{S}_{\mathrm{msd}} \rangle| \to 0$, increases the detrimental impact of detection noise beyond the level one would have observed for a simple CSS probe state. Hence, spin squeezing is not a useful strategy if readout noise dominates.
Finally, we give typical values for the readout-noise prefactor $1/(\tilde{C}^2 n_{\mathrm{avg}})$. For fluorescence readout in trapped-ion setups [64], the decay of the dark state into the bright state is slow enough to allow for sufficiently long integration times such that $n_{\mathrm{avg}} \approx 30$ and $\tilde{C} \approx 98\,\%$ [84], yielding a strong suppression of detection noise by a factor of $1/(\tilde{C}^2 n_{\mathrm{avg}}) \approx 0.03 \ll 1$. However, the situation is dramatically different for solid-state defects, e.g., negatively charged NV defects in diamond [16]. Here, fluorescence readout leads to a rapid polarization of the NV spin into the bright state, such that the best values even for a single NV center are $n_{\mathrm{avg}} \approx 0.3$ and $\tilde{C} \approx 15\,\%$ [16,18]. Therefore, the detection noise dominates over the projection noise by a factor of $1/(\tilde{C}^2 n_{\mathrm{avg}}) \approx 150$. For ensembles of many NV centers, the detrimental impact of readout noise is even larger: the best reported value is $1/(\tilde{C}^2 n_{\mathrm{avg}}) \approx 67^2 \approx 4500$ [16,85].
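The quoted prefactors can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the definitions of the contrast and of $n_{\mathrm{avg}}$ given in this Appendix; the function names `readout_prefactor` and `css_phase_variance` are hypothetical helpers introduced here for illustration and do not appear in the original work.

```python
def readout_prefactor(contrast, n_avg):
    """Detection-noise prefactor 1 / (C^2 * n_avg) for fluorescence readout."""
    return 1.0 / (contrast ** 2 * n_avg)


def css_phase_variance(n_spins, contrast, n_avg):
    """(Delta phi)^2 for a CSS probe without amplification.

    Uses the SQL-like form (1 + 1/(C^2 n_avg)) / N stated in the text.
    """
    return (1.0 + readout_prefactor(contrast, n_avg)) / n_spins


# Values quoted in the text (trapped ion vs. single NV center).
trapped_ion = readout_prefactor(contrast=0.98, n_avg=30.0)
single_nv = readout_prefactor(contrast=0.15, n_avg=0.3)

print(f"trapped ion: {trapped_ion:.3f}")   # small compared to 1
print(f"single NV:   {single_nv:.1f}")     # large compared to 1
```

With these inputs the trapped-ion prefactor is of order $10^{-2}$, whereas the single-NV prefactor is of order $10^{2}$, which is the regime where amplification before readout pays off.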

Appendix B: Mean-field theory analysis
To gain intuition on the amplification dynamics, we use MFT to derive approximate nonlinear equations of motion for the system. The differential equations for the spin components $S_k = \langle \hat{S}_k \rangle$, where $k \in \{x, y, z\}$, generate an (infinite) hierarchy of coupled differential equations for higher-order spin correlation functions, which we truncate and close by performing a second-order cumulant expansion [86]. This treatment is exact if the state is Gaussian.
We start with the quantum master equation (B1). Without loss of generality, we take the initial state to be $e^{i\phi \hat{S}_x} |{\uparrow \dots \uparrow}\rangle$. This initial state has $S_x(t) = C_{xy}(t) = C_{xz}(t) = 0$, where the covariances are defined by $C_{kl} \equiv \langle \hat{S}_k \hat{S}_l + \hat{S}_l \hat{S}_k \rangle/2 - \langle \hat{S}_k \rangle \langle \hat{S}_l \rangle$ for $k, l \in \{x, y, z\}$. Since we are interested in the limit of a very small signal, $\phi \ll 1$, we drop all terms of the order $\phi^2$ in the initial conditions and in the equations of motion, which implies $C_{xx}(t) = C_{yy}(t)$. The mean-field equations are then given by Eqs. (B2) to (B6). For simplicity, we ignore local dissipation for now, $\gamma_{\mathrm{rel}} = \gamma_\phi = 0$. Then, the equations of motion conserve total angular momentum,
$$2 C_{xx} + C_{zz} + S_z^2 = \frac{N}{2} \left( \frac{N}{2} + 1 \right) , \tag{B7}$$
where $C_{zz}$ is found to be suppressed by an order of $N$ compared to $C_{xx}$ and $S_z^2$. We thus drop $C_{zz}$ and use Eq. (B7) to eliminate the covariance $C_{xx}$ from the mean-field equations. This step decouples the equation of motion for $S_z$ from the rest of the system, but the equation of motion of $S_y$ still depends on the covariance $C_{yz}$. At short times, $C_{yz}$ is suppressed compared to the $S_z S_y$ term by a factor of $N$; therefore, we drop it from the equation of motion. In this way, we obtain a very simple set of equations of motion for $S_y$ and $S_z$, Eqs. (B8) and (B9). Their solutions reproduce the exact dynamics (determined by numerically exact solution of the quantum master equation (B1)) qualitatively correctly, i.e., they allow us to derive the scaling laws in $N$ up to numerical prefactors. Note that Eq. (B9) for $n_{\mathrm{th}} = 0$ has already been obtained in the literature on superradiance using other derivations [44,45].
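The reduction described above can be explored numerically. The sketch below is an illustration under stated assumptions, not a reproduction of Eqs. (B8) and (B9), which are not shown in this text. It assumes $n_{\mathrm{th}} = 0$ and $\gamma_{\mathrm{rel}} = \gamma_\phi = 0$ and uses the pair of rate equations one obtains by following the stated steps for pure collective decay, $\dot{S}_y = \Gamma (S_z - 1/2) S_y$ and $\dot{S}_z = -\Gamma (N/2 + S_z)(N/2 - S_z + 1)$. The gain is taken to be $S_y(t)/S_y(0)$, which is also an assumption standing in for Eq. (10) of the main text. The function name `simulate_gain` is hypothetical.

```python
import math


def simulate_gain(n_spins, phi=1e-3, gamma=1.0, steps=20000):
    """Integrate the assumed reduced mean-field equations with classical RK4.

    Returns (t_max, g_max): the time of the gain peak and its height,
    where the gain is defined here as S_y(t) / S_y(0).
    """
    half_n = n_spins / 2.0

    def rhs(sy, sz):
        # Assumed reduced equations (n_th = 0, no local dissipation).
        dsy = gamma * (sz - 0.5) * sy
        dsz = -gamma * (half_n + sz) * (half_n - sz + 1.0)
        return dsy, dsz

    sy0 = half_n * phi           # small initial transverse polarization
    sy, sz = sy0, half_n         # S_z(0) = N/2 to leading order in phi
    t_end = 3.0 * math.log(n_spins) / (gamma * n_spins)
    dt = t_end / steps

    best_t, best_g = 0.0, 1.0
    for k in range(1, steps + 1):
        k1 = rhs(sy, sz)
        k2 = rhs(sy + 0.5 * dt * k1[0], sz + 0.5 * dt * k1[1])
        k3 = rhs(sy + 0.5 * dt * k2[0], sz + 0.5 * dt * k2[1])
        k4 = rhs(sy + dt * k3[0], sz + dt * k3[1])
        sy += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        sz += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        gain = sy / sy0
        if gain > best_g:
            best_t, best_g = k * dt, gain
    return best_t, best_g


for n in (100, 1000):
    t_max, g_max = simulate_gain(n)
    print(n, t_max, g_max, g_max / math.sqrt(n))
```

For these assumed equations, direct integration gives the closed forms $G_{\max} = (N+1)/(2\sqrt{N})$ and $\Gamma t_{\max} = \ln(N)/(N+1)$, with the peak reached when $S_z = 1/2$. These are consistent with the $\sqrt{N}$ gain scaling and the $\ln(N)/(\Gamma N)$ time scale quoted in the text, up to the numerical prefactors that mean-field theory is not expected to capture exactly.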
Supplemental Material for Dissipative superradiant spin amplifier for enhanced quantum sensing

In this section, we adapt Caves' derivation of the quantum limit on the added noise of a bosonic linear amplifier [22] to the phase-preserving spin amplifier considered in this work. We start by writing down a minimal description (in terms of Heisenberg equations of motion) of a spin amplifier that amplifies any polarization transverse to the $z$ direction. The amplification process translates an input state (as encoded in the initial-time $t = 0$ collective spin operators $\hat{S}_\alpha \equiv \hat{S}_\alpha(0)$ for $\alpha \in \{x, y, z\}$) to an output state (similarly encoded in the final-time $t = T$ Heisenberg-picture spin operators $\hat{S}_\alpha(T) \equiv \hat{T}_\alpha$). Assuming linear amplification dynamics suggests writing the solution of the Heisenberg equations of motion in the form
$$\hat{T}_x = G(T)\, \hat{S}_x + \hat{F}_x , \qquad \hat{T}_y = G(T)\, \hat{S}_y + \hat{F}_y .$$
The first term in each equation captures the linear amplification dynamics, with $G(T)$ denoting the gain. The remaining Hermitian operators $\hat{F}_\alpha$ describe all additional terms arising from solving the Heisenberg equations. Note that at this stage, we do not make any assumptions about the dynamics of $\hat{S}_z$ and the value of $\hat{S}_z(T) \equiv \hat{T}_z$. We next assume that the average values of the final-time spin operators are fully described by the linear gain terms (e.g., as is seen in our system for initial states with small transverse polarizations). As such, the $\hat{F}_\alpha$ can be viewed as zero-mean operators that describe added noise associated with the amplification dynamics. We further assume that these noise operators are uncorrelated with the initial spin state, i.e., $\langle \hat{S}_\alpha \hat{F}_\beta \rangle = 0$ and $[\hat{S}_\alpha, \hat{F}_\beta] = 0$. In reality, the $\hat{F}_\alpha$ also describe the nonlinear response of our system; we are implicitly assuming that the initial state of our system lets us safely ignore such terms (i.e., the initial transverse polarization is small).
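With these assumptions in hand, we can now construct a bound on the size of the added noise. We first use the fact that the final-time spin operators must obey canonical spin commutation relations, and hence we must have $[\hat{T}_x, \hat{T}_y] = i \hat{T}_z$. This results in the constraint
$$[\hat{F}_x, \hat{F}_y] = i \left( \hat{T}_z - G^2(T)\, \hat{S}_z \right) . \tag{S3}$$
The fluctuations in the final-time transverse spin operators are given by
$$(\Delta T_{x,y})^2 = G^2(T)\, (\Delta S_{x,y})^2 + \langle \hat{F}_{x,y}^2 \rangle .$$
Since the amplifier acts identically on the $x$ and $y$ components, $\langle \hat{F}_x^2 \rangle$ and $\langle \hat{F}_y^2 \rangle$ are identical and we can write
$$\langle \hat{F}_{x,y}^2 \rangle = \frac{1}{2} \langle \hat{F}_x^2 + \hat{F}_y^2 \rangle = \frac{1}{4} \langle \{ \hat{F}_+, \hat{F}_- \} \rangle \geq \frac{1}{4} \left| \langle [\hat{F}_+, \hat{F}_-] \rangle \right| = \frac{1}{2} \left| \langle \hat{T}_z \rangle - G^2(T) \langle \hat{S}_z \rangle \right| ,$$
where we defined $\hat{F}_\pm = \hat{F}_x \pm i \hat{F}_y$ and used Eq. (S3) in the last step. Using $|\partial_\phi \langle \hat{T}_{x,y} \rangle| = G(T)\, |\partial_\phi \langle \hat{S}_{x,y} \rangle|$, we thus find
$$(\Delta\phi)^2 = \frac{(\Delta T_{x,y})^2}{|\partial_\phi \langle \hat{T}_{x,y} \rangle|^2} \geq \frac{(\Delta S_{x,y})^2}{|\partial_\phi \langle \hat{S}_{x,y} \rangle|^2} \left[ 1 + \frac{\left| \langle \hat{S}_z \rangle - \langle \hat{T}_z \rangle / G^2(T) \right|}{2 (\Delta S_{x,y})^2} \right] .$$
In our specific spin amplification protocol, we start with a CSS close to a maximally polarized state, i.e., $\langle \hat{S}_z \rangle = 2 (\Delta S_{x,y})^2 = N/2$, and we interrupt the amplification dynamics at the time $T = t_{\max}$ when the condition $\langle \hat{T}_z \rangle = 1/2$ holds. This yields
$$(\Delta\phi)^2 \geq \frac{(\Delta S_{x,y})^2}{|\partial_\phi \langle \hat{S}_{x,y} \rangle|^2} \left[ 2 - \frac{1}{G^2(t_{\max})\, N} \right] ,$$
i.e., the spin amplification process doubles the input spin-projection noise in the large-$N$ limit.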
Note that this argument is not a strict theoretical lower bound on the noise that is added during the amplification step. Instead, it is a heuristic argument that helps one to develop a sense of how much added noise can be expected due to amplification. Even though our amplification scheme is conceptually very simple, we numerically find that it does not saturate the prediction of this heuristic argument (see Fig. 3(b) of the main text and Fig. S5). Hence, an interesting question for future research is to refine this argument to see if the actual lower bound on the added noise in the large-$N$ limit is larger than predicted here.

In this section, we discuss the dependence of the amplification time $t_{\max}$ and the maximum gain $G_{\max}$ on the initial tilt angle $\phi$, and we provide additional information on the dynamic range of the spin amplifier.
If all $N$ spins are initialized in the excited state, (quantum) fluctuations will seed the superradiant decay towards the collective ground state. If the initialization to the excited state is imperfect but the deviation of the collective spin vector from the north pole of the Bloch sphere is smaller than the level of fluctuations, the seeding of the superradiant decay is still dominated by fluctuations. As a consequence, the time $t_{\max}$ to reach maximum gain becomes independent of the signal angle $\phi$ for sufficiently small $\phi$ [as shown in Eq. (7) of the main text] and it depends only on the collective decay rate $\Gamma$ and the number of spins $N$ [as shown in Fig. 2(a) of the main text]. Quantum metrology protocols operate in this regime, $\phi \ll 1$. Figure S2(a) shows how $t_{\max}$ ultimately becomes $\phi$ dependent and decreases to zero as $\phi$ increases.
In the regime where $t_{\max}$ is independent of $\phi$, the maximum gain $G_{\max}$ defined in Eq. (10) of the main text is also independent of $\phi$. This is important because it establishes a simple linear relation between the amplified signal and the initial small value of $\phi$ after a constant amplification time $t_{\max}$. For larger values of $\phi$, $G_{\max}$ decreases as a function of $\phi$, as sketched in the inset of Fig. S2(b). The dynamic range of an amplifier quantifies the range of $\phi$ values over which $G_{\max}$ is constant and a linear amplification relation is obtained. We determine the dynamic range of the spin amplifier numerically by rescaling the gain $G_{\max}(\phi)$ to the range $[0, 1]$, which defines a normalized gain reduction $\delta G_{\max}(\phi)$, and numerically finding the angle $\phi_{-3\mathrm{dB}}$ where $1 - \delta G_{\max}(\phi)$ has decreased to $1/2$. The dynamic range $\phi_{-3\mathrm{dB}}$ is plotted in the main panel of Fig. S2(b).
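The threshold search described above can be sketched as follows. This is an illustration only: the precise rescaling used to define $\delta G_{\max}(\phi)$ is not reproduced in this text, so the sketch assumes that $1 - \delta G_{\max}(\phi)$ equals $G_{\max}(\phi)/G_{\max}(0)$. The function name `find_phi_3db` and the synthetic gain curve are hypothetical and serve only to exercise the code.

```python
def find_phi_3db(phis, gains):
    """Smallest angle at which gains / gains[0] has dropped to 1/2.

    phis must be increasing and start at the small-angle reference point.
    Uses linear interpolation between samples; returns None if the
    normalized gain never reaches 1/2 on the sampled grid.
    """
    reference = gains[0]
    for i in range(1, len(phis)):
        ratio_prev = gains[i - 1] / reference
        ratio_next = gains[i] / reference
        if ratio_prev > 0.5 >= ratio_next:
            frac = (ratio_prev - 0.5) / (ratio_prev - ratio_next)
            return phis[i - 1] + frac * (phis[i] - phis[i - 1])
    return None


# Synthetic stand-in for G_max(phi): flat at small phi, then rolling off.
# This Lorentzian shape is NOT the amplifier's actual response.
phi_half = 0.05
phis = [k * 1e-4 for k in range(2001)]
gains = [10.0 / (1.0 + (p / phi_half) ** 2) for p in phis]

phi_3db = find_phi_3db(phis, gains)
print(phi_3db)
```

In practice, `gains` would be replaced by numerically or experimentally determined values of $G_{\max}(\phi)$ on a grid of tilt angles.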

b. Calibration of tmax and Gmax
Note that precise knowledge of the number $N$ of spins is not required to calibrate the maximum amplification time $t_{\max}$ and the maximum gain $G_{\max}$. Instead, these quantities can be experimentally determined by the following measurement: The spin system is repeatedly prepared in a coherent spin state in the $y$-$z$ plane, which is tilted away from the $z$ axis by a small angle $\phi < \phi_{-3\mathrm{dB}}$. This can be achieved by standard control pulses. One then switches on the superradiant decay for different time delays $t$ and measures the final transverse $S_y$ polarization. In this way, one maps out $S_y(t)$ as a function of time, which allows one to determine the time-dependent gain $G(t)$ defined in Eq. (10) of the main text [see Fig. 1(b) of the main text]. From this measurement, one can determine $t_{\max}$ and $G_{\max}$.

c. Analysis of timing errors in the amplification step
The amplification step of the modified Ramsey sequence shown in Fig. 1(a) of the main text requires time-dependent control of the collective decay rate $\Gamma(t)$. Both the optimal amplification time $t_{\max}$ and the FWHM of the gain peak decrease with increasing $N$, as shown in Figs. 2(a) and (b) of the main text, respectively. Therefore, one may worry that timing errors in the control of $\Gamma(t)$ could limit the performance of our scheme. This is not the case.

[Figure caption fragment: ... $\delta t$ due to imperfect control of the amplification time $t_{\max}$ is smaller than the uncertainty due to spin-projection noise.]
Timing errors will contribute an additional term to the uncertainty budget in Eq. (1) of the main text, which can be estimated using mean-field theory. Using Eq. (8) of the main text, we find in the limit $N \gg 1$ that the gain is insensitive to timing fluctuations $\delta t$ to first order in $\delta t$, but that they decrease the gain quadratically. Including the corresponding uncertainty in the estimation of $\phi$, the overall uncertainty consists of three terms: projection noise, timing fluctuations, and detection noise. The third, detection-noise term can be ignored if $N$ and, thus, $G(t_{\max})$ are large enough. The second, timing-fluctuation term vanishes in the metrologically relevant limit $\phi \to 0$. As a worst-case estimate, we will now assume that $\phi$ does not vanish but is given by the dynamic range of the amplifier discussed in Sec. B 2 a, $\phi \approx 1/N^{0.4}$. From Fig. 2(a) of the main text, we know that one must be able to switch $\Gamma(t)$ on a timescale $t_{\max} \propto \ln(N)/(\Gamma N)$. Assuming there is a relative error $\varepsilon$ in the timing, $\delta t = \varepsilon t_{\max}$, the overall uncertainty can be rewritten in terms of $\varepsilon$. The first, projection-noise term dominates as long as the relative timing error $\varepsilon$ stays below a threshold, which puts only very moderate requirements on the timing error, as shown in Fig. S3.
d. Impact of the signal field during the amplification step

In the idealized protocol shown in Fig. 1(a) of the main text, the signal is imprinted on the sensing state only during the signal acquisition interval from $t_1$ to $t_2$, and it is switched off during the subsequent amplification step from $t_2$ to $t_3$. In practice, it may not be possible to switch off the signal. Therefore, we consider a modified quantum master equation
$$\frac{d\hat{\rho}}{dt} = -i [\hat{H}, \hat{\rho}] + \Gamma\, \mathcal{D}[\hat{S}_-] \hat{\rho} ,$$
where $\mathcal{D}$ denotes the standard Lindblad dissipator and the Hamiltonian $\hat{H} = \omega_{\mathrm{sig}} \hat{S}_z$ represents the collective rotation about the $z$ axis due to the signal to be sensed. The effect of the (unknown) signal can be gauged away by switching to a rotating frame, $\hat{\chi}(t) = e^{i \omega_{\mathrm{sig}} t \hat{S}_z} \hat{\rho}(t)\, e^{-i \omega_{\mathrm{sig}} t \hat{S}_z}$, in which we recover the pure superradiant decay dynamics described by Eq. (3) of the main text (for $\gamma_{\mathrm{rel}} = 0$),
$$\frac{d\hat{\chi}}{dt} = \Gamma\, \mathcal{D}[\hat{S}_-] \hat{\chi} .$$
The maximum gain $G_{\max}$ and the corresponding amplification time $t_{\max}$ are thus unaffected by the presence of the signal. However, the final state in the lab frame after the amplification step will be rotated by an angle $\phi_{\mathrm{sig}} = \omega_{\mathrm{sig}} t_{\max}$ as compared to the unperturbed dynamics, $\hat{\rho}(t_{\max}) = e^{-i \omega_{\mathrm{sig}} t_{\max} \hat{S}_z} \hat{\chi}(t_{\max})\, e^{+i \omega_{\mathrm{sig}} t_{\max} \hat{S}_z}$.
Its collective spin vector will be rotated in the equatorial plane away from the $y$ axis by the angle $\phi_{\mathrm{sig}}$. In the typical quantum metrology setting, the precession frequency is very small, $\omega_{\mathrm{sig}} \to 0$, and the additional rotation during the amplification step can be ignored since $\phi_{\mathrm{sig}} \ll 1$. In particular, for our scheme, the amplification time decreases if the number of spins is increased, $t_{\max} \propto \ln(N)/N \approx 1/N$, which suppresses $\phi_{\mathrm{sig}}$ even more.
Note that, even if $\phi_{\mathrm{sig}}$ was not negligible, it could be easily determined by repeating the protocol shown in Fig. 1(a) of the main text twice, measuring $S_x(t_{\max}) \propto \sin(\phi_{\mathrm{sig}})$ and $S_y(t_{\max}) \propto \cos(\phi_{\mathrm{sig}})$ in the two runs, respectively.
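The two-run estimate just described amounts to reading off an angle from two quadratures. A minimal sketch, assuming that both measured polarizations share the same positive prefactor (the amplified transverse spin length) and that noise is negligible; the function name `estimate_phi_sig` is hypothetical.

```python
import math


def estimate_phi_sig(s_x, s_y):
    """Rotation angle away from the y axis from the two measured quadratures.

    Assumes s_x = A * sin(phi_sig) and s_y = A * cos(phi_sig) with a common
    positive amplitude A, so that A drops out of the estimate.
    """
    return math.atan2(s_x, s_y)


# Synthetic example: amplitude and angle chosen arbitrarily for illustration.
amplitude, phi_true = 12.3, 0.2
phi_est = estimate_phi_sig(amplitude * math.sin(phi_true),
                           amplitude * math.cos(phi_true))
print(phi_est)
```

Using `atan2` rather than a ratio of the two signals keeps the estimate well defined when $S_y(t_{\max})$ is close to zero.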
3. Mean-field theory in the limit of an undamped cavity

As discussed in the main text, amplification can also be achieved in a regime where the cavity degree of freedom cannot be eliminated adiabatically, i.e., where $\sqrt{N} g$ is not small compared to $\kappa$. In this section, we use MFT to analyze the coherent limit of Eq. (31) of the main text (i.e., $\kappa \to 0$). MFT lets us explore substantially larger system sizes than direct numerical simulation of the Schrödinger equation (which was used to generate the data shown in Fig. 9 of the main text). We consider the resonant ($\omega_{\mathrm{cav}} = \omega$) Tavis-Cummings Hamiltonian (35) of the main text, which reads in a frame rotating at the cavity frequency
$$\hat{H} = g \left( \hat{a}^\dagger \hat{S}_- + \hat{a}\, \hat{S}_+ \right) ,$$
and we consider a separable initial state consisting of the cavity mode in the vacuum state and the spins maximally polarized in a state $e^{i\phi \hat{S}_x} |{\uparrow \dots \uparrow}\rangle$. Using a second-order cumulant expansion [86], we can derive a closed set of equations of motion (EOMs) for the spin-cavity system. Introducing the cavity quadrature operators $\hat{Q} = (\hat{a}^\dagger + \hat{a})/\sqrt{2}$ and $\hat{P} = i (\hat{a}^\dagger - \hat{a})/\sqrt{2}$ as well as the notation $Q = \langle \hat{Q} \rangle$, $C_{Px} = \langle \hat{P} \hat{S}_x + \hat{S}_x \hat{P} \rangle/2 - \langle \hat{P} \rangle \langle \hat{S}_x \rangle$, etc., we can readily write down the set of MFT equations of motion, Eqs. (S10) to (S21). All expectation values (within the second-order cumulant expansion approximation) which are not explicitly included in these equations have only a trivial evolution, i.e., they remain zero.
As an aside, we note that there are two constraints that are also satisfied by our system. The first one is total-angular-momentum conservation [Eq. (S22)], and the second one is conservation of the total excitation number [Eq. (S23)]. Either (or both) of these constraints could be used to further reduce the full set of EOMs.
a. Scaling of the gain

Solving the system of equations (S10) to (S21) numerically, we can study the scaling of the maximum gain $G_{\max}$ as well as the corresponding time scale $t_{\max}$ for systems with very large spin number $N$, a regime which is inaccessible by numerical integration of the Schrödinger equation. Figure S1(a) shows the scaling of $G_{\max}$, while panel (b) displays the corresponding time $t_{\max}$ required to reach the optimal gain value, both as a function of spin number $N$. We observe scaling behavior that very closely resembles the one observed for smaller-$N$ Schrödinger-equation simulations shown in Fig. 9 of the main text. As in the dissipative version of our amplification protocol, the gain scales $\propto \sqrt{N}$, while $t_{\max}$ has a parametrically slower $N$-scaling, $t_{\max} \propto \ln(\sqrt{N})/\sqrt{N}$.

b. Semi-classical theory
Following work in [71], one might hope that much of the core physics of the amplification in the coherent limit of our protocol could be captured by a semiclassical approximation, where all fluctuations (i.e., the covariances) are neglected. Such a case would let us simplify the set of Eqs. (S10) to (S21) to the semiclassical Eqs. (S24) to (S26), which, using Eq. (S23), could be further reduced to Eqs. (S27) and (S28). These semiclassical equations indeed predict that $S_y$ will grow at short times. However, solving Eqs. (S27) and (S28) numerically, one finds that $S_y$ increases monotonically over a time scale much longer than $t_{\max}$ obtained from MFT, see Fig. S4. Therefore, it is clear that one must include the effects of fluctuations to properly describe the amplification physics in the coherent $\kappa \to 0$ limit.

4. Added noise $\sigma^2_{\mathrm{add}}$ in the limit of an undamped cavity

Above, we gave a heuristic argument (assuming linear amplification dynamics) which showed that the added noise $\sigma^2_{\mathrm{add}}$ (defined in Eq. (13) of the main text) should approximately follow the relation
$$\sigma^2_{\mathrm{add}} \approx 1 - \frac{1}{G_{\max}^2 N}$$
in the limit of large spin number $N$. For dissipative amplification, we found numerically that the added noise stays close to this heuristic expectation and tends to $\sigma^2_{\mathrm{add}} \approx 1.3$ in the large-$N$ limit (see Fig. 3(b) of the main text). Here, we show that similar behavior is also present in the case of purely coherent evolution, where $\kappa \to 0$. Specifically, Fig. S5 shows $\sigma^2_{\mathrm{add}}$ as a function of spin number $N$. Black dots correspond to data obtained from solving the Schrödinger equation numerically, while the dashed blue line indicates $1 - 1/(G_{\max}^2 N)$. Curiously, we see similar behavior to the dissipative case and observe $\sigma^2_{\mathrm{add}} \lesssim 1.3$ in the large-$N$ limit. Consequently, our amplification protocol could also be useful in the coherent limit if the readout noise is not extremely large and one cares about approaching the SQL.

Comparison between semiclassical and mean-field-theory dynamics of the transverse magnetization
In this section, we briefly discuss the effect of radiation damping in NMR systems [50] (which is somewhat related to superradiance) and we show why our dissipative amplification scheme is very different from sensing protocols based on radiation damping [51,52].
In NMR setups, the time-dependent magnetic field $\mathbf{M} = (M_x, M_y, M_z)$ of a spin vector precessing about the quantization axis (typically the $z$ axis) induces a current in the measurement coil. This current generates a magnetic field which aims to rotate the spin vector back to its stable equilibrium position. This process is called radiation damping [50] and can be described by the classical Bloch equations (S30) and (S31), where $T_1$ and $T_2$ are the relaxation and dephasing time, respectively, and $\gamma$ is the gyromagnetic ratio of the spins. These equations are similar to the semiclassical equations of motion for superradiant decay [45], Eqs. (S32) and (S33), which can be obtained from Eq. (21) of the main text by deriving the equations of motion for $S_k \equiv \langle \hat{S}_k \rangle$, $k \in \{x, y, z\}$, and factorizing all higher-order expectation values, $\langle \hat{S}_j \hat{S}_k \rangle \approx \langle \hat{S}_j \rangle \langle \hat{S}_k \rangle$. Note that the two sets of differential equations have opposite stable steady-state solutions: $M_{x,y}(t \to \infty) = 0$ and $M_z(t \to \infty) = +N/2$ as opposed to $S_{x,y}(t \to \infty) = 0$ and $S_z(t \to \infty) = -N/2$. For $S_z(0) \approx N/2 \gg 1/2$, one can neglect the term $-\Gamma S_{x,y}/2$ in Eq. (S32) at short times. In the absence of local relaxation and dephasing, $\gamma_{\mathrm{rel}}, \gamma_\phi \to 0$ and $T_1, T_2 \to \infty$, one then finds that Eqs. (S30) and (S31) are identical to Eqs. (S32) and (S33) upon the substitution $M_k \to S_k$, except for the second term $-\Gamma S_z$ in Eq. (S33). This term captures spontaneous decay and is crucial to "seed" the superradiant decay dynamics out of a perfectly inverted state. Since it is not present in the radiation damping equation (S31), a magnetization aligned exactly along the $+z$ or $-z$ direction is a stable solution of Eqs. (S30) and (S31) in the absence of $T_1$ relaxation. Seeding of the radiation damping dynamics must be introduced manually by considering thermal fluctuations of the current in the pickup coil [51], experimental imperfections which cause small deviations from a perfectly inverted state [51], and dipole-dipole interactions between the spins [87]. These effects will cause the magnetization to ultimately flip back to the stable orientation along the $+z$ direction and lead to a large transient magnetization in the $x$-$y$ plane, similar to a superradiant emission burst.

[Fig. S5 caption: Added noise $\sigma^2_{\mathrm{add}}$ in the coherent spin-amplification protocol, calculated by numerical integration of the Schrödinger equation using the resonant Tavis-Cummings Hamiltonian given in Eq. (35) of the main text with $\omega_{\mathrm{cav}} = \omega$. The quantity $\sigma^2_{\mathrm{add}}$ at the optimal evolution time $t_{\max}$ (data points) is close to the amount of noise $1 - 1/(G_{\max}^2 N)$ that is expected based on a heuristic argument valid in the limit $N \gg 1$ (dashed-dotted line).]
Walls et al. proposed to use the time delay of this peak in the transverse magnetization for sensing [52]. They consider a system consisting of a solute in solution. The solvent spins are initialized in the metastable state, i.e., they are antialigned with the external magnetic field. The initial transverse magnetization of the solute spins triggers the solvent spins to flip back to the stable state. If the solute's magnetization is larger than the scale of the fluctuations around the metastable state, the delay time depends on the magnitude of the solute's initial magnetization [51,52]. For a smaller solute magnetization, the delay time becomes independent of the state of the solvent spins and the radiation-damping-based scheme becomes insensitive.
In our scheme, such a situation would correspond to an initial tilt angle φ larger than the fluctuations of the sensing state. While one could in principle reduce the level of thermal fluctuations by cooling the setup, unavoidable quantum fluctuations of the sensing state will pose a strict lower bound on the minimum detectable angle φ in the quantum sensor. However, quantum metrology protocols do operate in the regime where the angle φ is much smaller than the scale of the quantum or thermal fluctuations. In this regime, the delay time is independent of φ [as shown in Eq. (7) of the main text] and, thus, the radiation-damping-based amplification scheme is useless for quantum metrology.
[Fig. S6 caption fragment: The initial state is a coherent spin state in the $z$-$y$ plane tilted away from a perfectly inverted state (i.e., a state pointing along the $+z$ direction for superradiance, along the $-z$ direction for radiation damping) by an angle $\phi = 0.001$. While all methods describe the dynamics of the $z$ magnetization qualitatively correctly, radiation damping and a semiclassical treatment of superradiance fail to describe the dynamics of the $y$ magnetization correctly and predict a maximum transverse magnetization of $N/2$ (dotted black line). The same is true in the case of a unitary Tavis-Cummings interaction, discussed in Sec. B 3 and shown in Fig. S4.]

Instead of focusing on the delay time, our scheme measures the amplitude of the transverse magnetization peak, which remains sensitive to the initial tilt angle $\phi$ [as shown in Eq. (9) of the main text]. We stress that this amplitude dynamics cannot be generated by the classical backaction due to radiation damping. Figure S6 compares the decay dynamics due to radiation damping, the semiclassical equations of motion for superradiant decay, the mean-field theory for superradiant decay given by Eqs. (B2) to (B6) of the main text, and results obtained by numerically exact integration of the quantum master equation (21) of superradiant decay. As expected from the above discussion, the Bloch equations of radiation damping and the semiclassical equations of motion for superradiance predict very similar dynamics. While they manage to capture the dynamics of the $z$ component of the magnetization qualitatively, they fail completely to describe the dynamics of the transverse magnetization in the $x$-$y$ plane, which is at the heart of our spin amplification scheme. This is due to the fact that Eqs. (S30) and (S31) preserve the length of the magnetization vector (for $T_1, T_2 \to \infty$) and, thus, describe a rotation of a pure state on the surface of the collective Bloch sphere.
Superradiant decay, however, creates a highly mixed state at transient times, whose spin vector is in the interior of the Bloch sphere. In order to describe the transverse amplitude dynamics of superradiant decay, one must use at least a MFT approach, which takes quantum correlations into account.
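The failure of the factorized description can be illustrated numerically. The sketch below is hedged: Eqs. (S32) and (S33) are not reproduced in this text, so it assumes the factorized equations one obtains for pure collective decay without local dissipation and with $S_x = 0$, namely $\dot{S}_y = \Gamma S_z S_y - \Gamma S_y/2$ and $\dot{S}_z = -\Gamma S_y^2 - \Gamma S_z$, which contain the two terms named in the text. The function name `semiclassical_peak` is hypothetical.

```python
import math


def semiclassical_peak(n_spins, phi, gamma=1.0, steps=20000):
    """Peak transverse polarization S_y from the assumed factorized equations.

    Starts from a coherent spin state tilted by phi away from +z in the
    z-y plane and integrates with classical RK4 over a few burst times.
    """
    half_n = n_spins / 2.0

    def rhs(sy, sz):
        # Factorized (semiclassical) equations, no local dissipation.
        dsy = gamma * sz * sy - 0.5 * gamma * sy
        dsz = -gamma * sy * sy - gamma * sz
        return dsy, dsz

    sy, sz = half_n * math.sin(phi), half_n * math.cos(phi)
    t_end = 6.0 * math.log(2.0 / phi) / (gamma * n_spins)
    dt = t_end / steps

    peak = sy
    for _ in range(steps):
        k1 = rhs(sy, sz)
        k2 = rhs(sy + 0.5 * dt * k1[0], sz + 0.5 * dt * k1[1])
        k3 = rhs(sy + 0.5 * dt * k2[0], sz + 0.5 * dt * k2[1])
        k4 = rhs(sy + dt * k3[0], sz + dt * k3[1])
        sy += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        sz += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        peak = max(peak, sy)
    return peak


n = 1000
peak_a = semiclassical_peak(n, phi=1e-3)
peak_b = semiclassical_peak(n, phi=2e-3)
print(peak_a / (n / 2), peak_b / (n / 2))
```

Under these assumptions the peak height stays close to $N/2$ and is essentially unchanged when the tilt angle is doubled. The factorized equations therefore encode $\phi$ in the delay of the burst rather than in its amplitude, in line with the statement above that a treatment including correlations is needed to describe the amplitude dynamics.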
In conclusion, while radiation damping in NMR has some similarity with a semiclassical analysis of superradiance, it leads to completely different transient dynamics of the $x$-$y$ magnetization. Therefore, it cannot be used to implement our spin-amplification scheme. While radiation damping dynamics could be used to infer a sufficiently strong initial transverse magnetization in NMR more efficiently from the peak delay time than from a direct measurement [52], it cannot be used in the context of quantum metrology. In contrast, our scheme is compatible with a standard quantum-metrology Ramsey sequence and allows one to approach the SQL even in the presence of extremely large readout noise, which is an aspect that has not been analyzed in the context of NMR.

6. OAT amplification with single-spin dissipation using mean-field theory

In the main text we showed that the performance of the OAT amplification protocol is particularly sensitive to noise, both collective decay (governed by the cavity relaxation rate $\kappa$) as well as single-spin dephasing and single-spin relaxation (governed by the rates $\gamma_\phi$ and $\gamma_{\mathrm{rel}}$, respectively). Here, we present additional results obtained using MFT simulations, which explore this effect in more detail.
As discussed in the main text, collective decay generates a large background that needs to be subtracted to extract the amplified signal. Therefore, we consider the gain $G^{\mathrm{OAT}}_{\mathrm{sub}}(t)$ defined in Eq. (42) of the main text, which is the signal after background subtraction, $\delta \langle \hat{S}_y(t) \rangle = \langle \hat{S}_y(t, \phi) \rangle - \langle \hat{S}_y(t, 0) \rangle$, normalized to the initial signal $\langle \hat{S}_z \rangle = N\phi/2$. Single-spin decay decreases the gain over time and therefore limits the maximum possible amplification time. To determine the maximum gain, one thus has to optimize both the detuning (which determines the ratio between the OAT strength $\chi$ and the collective decay rate $\Gamma$, see main text) and the amplification time $t_{\max}$. The result of this optimization is shown in Fig. S7, where we compare $G^{\mathrm{OAT}}_{\mathrm{sub}}(t)$ to its ideal value obtained in the limit $\eta_k \to \infty$ (i.e., no local dissipation). The gain in this limit is equivalent to the gain in the absence of any dissipation, since the condition $\Delta_{\mathrm{opt}} \gg \kappa$ holds, i.e., collective dissipation is strongly suppressed. Similar to Fig. 10(a) of the main text, we evaluate the maximum gain at the time $t_{\max} = \tau_1$ of the first peak of $G^{\mathrm{OAT}}_{\mathrm{sub}}(t)$. Figures S7(a) and (b) show data for single-spin dephasing and single-spin relaxation, respectively, and the insets show the corresponding values of the optimal spin-cavity detuning $\Delta_{\mathrm{opt}}$. In both scenarios, very high single-spin cooperativities are required to obtain significant gain: one needs $\eta_\phi \gtrsim \sqrt{N}$ and $\eta_{\mathrm{rel}} \gtrsim N^{0.9}$.
7. More general model of the readout process

In this section, we provide an alternative derivation of the readout model. In contrast to the discussion in App. A of the main text, we consider a more generic situation where the overall measurement result $n \equiv \sum_{j=1}^{N} n_j$ is the sum of $N$ independent random measurement results $n_j$ which are collected in parallel from each spin. The probability distribution $P_{\sigma_j}(n_j)$ of each individual result $n_j$ depends on the quantum state of the corresponding spin $j$ in the measurement basis $|\sigma_j\rangle$. In contrast to App. A, we leave the measurement basis and the properties of the probability distribution $P_{\sigma_j}(n_j)$ completely general for now, and we will only specialize the final result to the case of fluorescence readout. Moreover, we do not use the language of POVMs, since the addition of readout noise is a classical process and a POVM is not necessarily needed to model it.
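For a general pure single-spin state $|\psi_j\rangle = \sum_{\sigma_j} c_{\sigma_j} |\sigma_j\rangle$, we assume that the measurement result $n_j$ is distributed according to the weighted sum of the probability distributions of the corresponding basis states, $P_{|\psi_j\rangle}(n_j) = \sum_{\sigma_j} |c_{\sigma_j}|^2 P_{\sigma_j}(n_j)$. Given an ensemble of $N$ spins, the probability distribution of the overall measurement result $n = \sum_{j=1}^{N} n_j$ for a product state $|\sigma_1,\dots,\sigma_N\rangle$ in the measurement basis is the convolution of all single-spin probability distributions, $P_{\sigma_1,\dots,\sigma_N}(n) = \mathop{\ast}_{j=1}^{N} P_{\sigma_j}(n)$. Similar to the single-spin case, we assume that the probability distribution of $n$ for a general pure $N$-spin state

\begin{equation}
|\psi\rangle = \sum_{\sigma_1,\dots,\sigma_N} c_{\sigma_1,\dots,\sigma_N} |\sigma_1,\dots,\sigma_N\rangle \tag{S34}
\end{equation}

is the correspondingly weighted sum

\begin{equation}
P_{|\psi\rangle}(n) = \sum_{\sigma_1,\dots,\sigma_N} |c_{\sigma_1,\dots,\sigma_N}|^2 P_{\sigma_1,\dots,\sigma_N}(n) . \tag{S35}
\end{equation}

The variance of $n$ is then

\begin{equation}
\begin{aligned}
(\Delta n)^2 &= \sum_n n^2 P_{|\psi\rangle}(n) - \Big[ \sum_n n P_{|\psi\rangle}(n) \Big]^2 \\
&= \sum_{\sigma_1,\dots,\sigma_N} |c_{\sigma_1,\dots,\sigma_N}|^2 \sum_n n^2 P_{\sigma_1,\dots,\sigma_N}(n) - \Big[ \sum_{\sigma_1,\dots,\sigma_N} |c_{\sigma_1,\dots,\sigma_N}|^2 \sum_n n P_{\sigma_1,\dots,\sigma_N}(n) \Big]^2 .
\end{aligned}
\tag{S36}
\end{equation}

The second line shows that calculating a moment $n^m$ of the probability distribution (S35) involves two different averages. First, an $n$-average with respect to the classical conditional probability distribution $P_{\sigma_1,\dots,\sigma_N}(n)$ describing the readout for a particular spin configuration $\{\sigma_j\}$ in the measurement basis. We will denote this average by

\begin{equation}
E_{\{\sigma_j\}}[n^m] \equiv \sum_n n^m P_{\sigma_1,\dots,\sigma_N}(n) . \tag{S37}
\end{equation}

Second, an average of the classical expectation values $E_{\{\sigma_j\}}[n^m]$ with respect to the probabilities $|c_{\sigma_1,\dots,\sigma_N}|^2$ to obtain a certain spin configuration $\{\sigma_j\}$ in the quantum state (S34). This is the step where the properties of the quantum state $|\psi\rangle$ enter, and we will denote this average by

\begin{equation}
\langle f_{\{\sigma_j\}} \rangle_{|\psi\rangle} \equiv \sum_{\sigma_1,\dots,\sigma_N} |c_{\sigma_1,\dots,\sigma_N}|^2 f_{\{\sigma_j\}} , \tag{S38}
\end{equation}

where $f_{\{\sigma_j\}}$ is a function that depends on the spin configuration $\{\sigma_j\}$. Note that this does not look like the typical quantum expectation value of an observable with respect to the quantum state $|\psi\rangle$. However, for a specific readout model, the moments of the probability distribution $P_{\sigma_1,\dots,\sigma_N}(n)$ will be related to moments of an observable of the quantum state: for instance, in the case of fluorescence readout discussed below, this will be the spin component $\hat{S}_z$ [see also Eq. (A4) of the main text]. Therefore, the expectation value $\langle f_{\{\sigma_j\}} \rangle_{|\psi\rangle}$ will turn into a familiar quantum expectation value.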
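With these definitions at hand, the variance of $n$ given by Eq. (S36) can be rewritten as follows:

\begin{equation}
\begin{aligned}
(\Delta n)^2 &= \langle E_{\{\sigma_j\}}[n^2] \rangle_{|\psi\rangle} - \langle E_{\{\sigma_j\}}[n] \rangle_{|\psi\rangle}^2 \\
&= \langle E_{\{\sigma_j\}}[n^2] - E_{\{\sigma_j\}}[n]^2 \rangle_{|\psi\rangle} + \langle E_{\{\sigma_j\}}[n]^2 \rangle_{|\psi\rangle} - \langle E_{\{\sigma_j\}}[n] \rangle_{|\psi\rangle}^2 ,
\end{aligned}
\tag{S39}
\end{equation}

where we added a zero in the last line. Similar to Eq. (A5) of the main text, the first term in Eq. (S39) describes the classical noise which is added by the detector due to the fact that $P_{\sigma_1,\dots,\sigma_N}(n)$ has a finite variance for each basis state $|\sigma_1,\dots,\sigma_N\rangle$. The remaining two terms together represent the variance of $E_{\{\sigma_j\}}[n]$ due to the intrinsic fluctuations of the state $|\psi\rangle$, i.e., its intrinsic spin-projection noise expressed in terms of the measured quantity $n$.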
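The decomposition in Eq. (S39) can be checked numerically for a small ensemble. The sketch below builds the distribution of $n$ exactly (by convolution for each spin configuration, then mixing with the configuration probabilities) and compares its variance with the sum of the classical readout term and the projection-noise term. The single-spin distributions, the configuration weights, and all function names are hypothetical choices made for illustration only.

```python
import itertools
import random


def convolve(p, q):
    """Discrete convolution of two PMFs given as lists indexed by n = 0, 1, ..."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def moment(pmf, m):
    """m-th moment of a PMF given as a list indexed by n."""
    return sum((n ** m) * p for n, p in enumerate(pmf))


def config_pmf(config, single_spin_pmfs):
    """PMF of n = sum_j n_j for one spin configuration (convolution over spins)."""
    pmf = [1.0]
    for sigma in config:
        pmf = convolve(pmf, single_spin_pmfs[sigma])
    return pmf


def variance_terms(weights, single_spin_pmfs):
    """Return (direct variance, classical readout term, projection-noise term)."""
    classical = mean_e1 = mean_e1_sq = 0.0
    mixture = None
    for config, w in weights.items():
        pmf = config_pmf(config, single_spin_pmfs)
        e1, e2 = moment(pmf, 1), moment(pmf, 2)
        classical += w * (e2 - e1 ** 2)   # <E[n^2] - E[n]^2>
        mean_e1 += w * e1                 # <E[n]>
        mean_e1_sq += w * e1 ** 2         # <E[n]^2>
        scaled = [w * p for p in pmf]
        mixture = scaled if mixture is None else [a + b for a, b in zip(mixture, scaled)]
    direct = moment(mixture, 2) - moment(mixture, 1) ** 2
    return direct, classical, mean_e1_sq - mean_e1 ** 2


# Hypothetical single-spin readout PMFs on n_j = 0, 1, 2 (equal support length,
# so every configuration PMF has the same length and can be mixed directly).
single_spin_pmfs = {0: [0.7, 0.2, 0.1], 1: [0.1, 0.3, 0.6]}

# Arbitrary normalized probabilities |c|^2 for all 2^N configurations of N = 3 spins.
random.seed(1)
configs = list(itertools.product((0, 1), repeat=3))
raw = [random.random() for _ in configs]
weights = {c: r / sum(raw) for c, r in zip(configs, raw)}

direct, classical, projection = variance_terms(weights, single_spin_pmfs)
print(direct, classical + projection)
```

Because only the probabilities $|c_{\sigma_1,\dots,\sigma_N}|^2$ enter, the check does not distinguish product states from entangled states; a single basis state gives a vanishing projection-noise term.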
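FIG. S7. Performance of the OAT spin amplification protocol proposed in Ref. [26] in the presence of (a) single-spin dephasing and (b) single-spin relaxation. Each plot shows the optimal gain $G^{\mathrm{OAT}}_{\mathrm{sub}}(t)$ after background subtraction (evaluated at the time $\tau_1$ of the first peak) as a function of the single-spin cooperativity $\eta_k$ (with $k \in \{\phi, \mathrm{rel}\}$), normalized to the gain calculated in the limit $\eta_k \to \infty$. Simulations were done using MFT for $N = 1000$, $5000$, and $10000$ spins, and the spin-cavity detuning was optimized for each value of $\eta_k$ (see insets). These plots suggest that, to achieve significant performance, the single-spin cooperativity must be large and satisfy $\eta_\phi \gtrsim \sqrt{N}$ and $\eta_{\mathrm{rel}} \gtrsim N^{0.9}$.

The average measurement result can be expressed as follows:

\begin{equation}
\bar{n} = \langle E_{\{\sigma_j\}}[n] \rangle_{|\psi\rangle} . \tag{S40}
\end{equation}

The change of $\bar{n}$ with respect to the signal $\phi$, $\partial_\phi \bar{n}$, is the transduction factor that we need to refer the measurement error $(\Delta n)^2$ back to the signal:

\begin{equation}
(\Delta\phi)^2 = \frac{(\Delta n)^2}{|\partial_\phi \bar{n}|^2} = \frac{\langle E_{\{\sigma_j\}}[n^2] - E_{\{\sigma_j\}}[n]^2 \rangle_{|\psi\rangle}}{|\partial_\phi \langle E_{\{\sigma_j\}}[n] \rangle_{|\psi\rangle}|^2} + \frac{\langle E_{\{\sigma_j\}}[n]^2 \rangle_{|\psi\rangle} - \langle E_{\{\sigma_j\}}[n] \rangle_{|\psi\rangle}^2}{|\partial_\phi \langle E_{\{\sigma_j\}}[n] \rangle_{|\psi\rangle}|^2} . \tag{S41}
\end{equation}

Note that both the numerator and denominator in the second term are expressed using only $E_{\{\sigma_j\}}[n]$, i.e., if $n$ is related to some spin observable $\hat{O}$ by a linear transformation [e.g., $\hat{S}_z$ as shown in Eq. (A4) of the main text], the conversion factors will drop out and the second term will become the bare spin-projection noise with respect to $\hat{O}$.

Finally, we specialize this result to the case of fluorescence readout [16]. In this case, the quantity $n$ denotes the number of detected photons and $P_{\sigma_j}(n_j)$ is a Poissonian distribution with mean $n_b$ ($n_d$) if spin $j$ is in the bright (dark) state. For a given spin configuration, the overall measurement result $n$ also follows a Poissonian distribution, whose expectation is a linear function of the corresponding $\hat{S}_z$ eigenvalue [cf. Eq. (A4) of the main text] with coefficients set by the average number of emitted photons, $n_{\mathrm{avg}} = (n_b + n_d)/2$, and the contrast between the bright and the dark state, $\tilde{C} = (n_b - n_d)/(n_b + n_d)$ [8,16,18]. Evaluating Eq. (S41) for this readout model yields the same result as Eq. (A7) of the main text.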
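As a worked illustration of this last step, the following sketch evaluates Eq. (S41) for Poissonian single-spin statistics. It assumes the labeling $\sigma_j = +1$ ($-1$) for the bright (dark) state and $m = \sum_j \sigma_j / 2$ for the corresponding $\hat{S}_z$ eigenvalue. This sign convention is an assumption made here and may differ from the one used in Eq. (A4) of the main text, so the result should be compared with Eq. (A7) rather than taken as its definitive form.

```latex
% Assumption: sigma_j = +1 (bright), sigma_j = -1 (dark), m = sum_j sigma_j / 2.
\begin{align}
n_{b} &= n_{\mathrm{avg}} (1 + \tilde{C}) , \qquad
n_{d} = n_{\mathrm{avg}} (1 - \tilde{C}) , \\
% A sum of independent Poissonian variables is Poissonian with the summed mean:
E_{\{\sigma_j\}}[n] &= \sum_{j=1}^{N} n_{\mathrm{avg}} (1 + \tilde{C} \sigma_j)
  = n_{\mathrm{avg}} (N + 2 \tilde{C} m) , \\
% For a Poissonian distribution the variance equals the mean:
E_{\{\sigma_j\}}[n^2] - E_{\{\sigma_j\}}[n]^2 &= E_{\{\sigma_j\}}[n] , \\
% Insert into Eq. (S41); averaging over |psi> replaces m by moments of S_z:
(\Delta\phi)^2 &= \frac{N + 2 \tilde{C} \langle \hat{S}_z \rangle}
  {4 \tilde{C}^2 n_{\mathrm{avg}} |\partial_\phi \langle \hat{S}_z \rangle|^2}
  + \frac{(\Delta S_z)^2}{|\partial_\phi \langle \hat{S}_z \rangle|^2} .
\end{align}
```

For instance, for a state with $\langle \hat{S}_z \rangle = 0$, $(\Delta S_z)^2 = N/4$, and $|\partial_\phi \langle \hat{S}_z \rangle| = N/2$ (the values of an unamplified coherent spin state, used here purely as an example), this expression reduces to $(\Delta\phi)^2 = [1 + 1/(\tilde{C}^2 n_{\mathrm{avg}})]/N$, in which the sign convention no longer matters.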