Ultralong Dephasing Times in Solid-State Spin Ensembles via Quantum Control

Quantum spin dephasing is caused by inhomogeneous coupling to the environment, with resulting limits to the measurement time and precision of spin-based sensors. The effects of spin dephasing can be especially pernicious for dense ensembles of electronic spins in the solid-state, such as for nitrogen-vacancy (NV) color centers in diamond. We report the use of two complementary techniques, spin bath control and double quantum coherence, to enhance the inhomogeneous spin dephasing time ($T_2^*$) for NV ensembles by more than an order of magnitude. In combination, these quantum control techniques (i) eliminate the effects of the dominant NV spin ensemble dephasing mechanisms, including crystal strain gradients and dipolar interactions with paramagnetic bath spins, and (ii) increase the effective NV gyromagnetic ratio by a factor of two. Applied independently, spin bath control and double quantum coherence elucidate the sources of spin dephasing over a wide range of NV and spin bath concentrations. These results demonstrate the longest reported $T_2^*$ in a solid-state electronic spin ensemble at room temperature, and outline a path towards NV-diamond magnetometers with broadband femtotesla sensitivity.

For NV ensembles, the DC magnetic field sensitivity is typically limited by dephasing of the NV sensor spins. In such instances, spin interactions with an inhomogeneous environment (see Fig. 1a) limit the experimental sensing time to the spin dephasing time $T_2^* \lesssim 1$ µs [21–24]. Hahn echo and dynamical decoupling protocols can restore the NV ensemble phase coherence by isolating the NV sensor spins from environmental noise and, in principle, permit sensing times approaching the spin-lattice relaxation time ($T_1 \sim$ ms) [25–27]. However, these protocols restrict sensing to AC signals within a narrow bandwidth. For this reason, the development of high-sensitivity, broadband magnetometers requires new approaches to extend $T_2^*$ for NV ensembles while retaining the ability to measure DC signals.

* These authors contributed equally to this work.
† rwalsworth@cfa.harvard.edu
To date, spin dephasing mechanisms for NV ensembles have not been systematically studied, as spatially inhomogeneous effects do not lead to single NV spin dephasing, which has traditionally been the focus of the NV-diamond literature [11, 28–30]. Here, we characterize and control the dominant NV spin ensemble dephasing mechanisms by combining two quantum control techniques, double quantum (DQ) coherence magnetometry [29,30] and spin bath driving [31,32]. We apply these techniques to three isotopically engineered $^{12}$C samples with widely varying nitrogen and NV concentrations. In combination, we show that these quantum control techniques can extend the NV spin ensemble $T_2^*$ by more than an order of magnitude.
Several inhomogeneous spectral broadening mechanisms can contribute to NV spin ensemble dephasing in bulk diamond.
First, the formation of negatively charged NV$^-$ centers (with electronic spin S = 1) requires the incorporation of nitrogen into the diamond lattice. As a result, paramagnetic substitutional nitrogen impurities (P1 centers, S = 1/2) [33–35] typically persist at densities similar to or exceeding the NV concentration, leading to a 'spin bath' that couples to the NV spins via incoherent dipolar interactions, with a magnitude that can vary significantly across the NV ensemble. Second, $^{13}$C nuclei (I = 1/2) can be a considerable source of NV spin dephasing in diamonds with natural isotopic abundance (1.07 %), with the magnitude of this effect varying spatially due to the random location of $^{13}$C within the diamond lattice [36,37]. Such NV spin ensemble dephasing, however, can be greatly reduced through isotope engineering of the host diamond material [11].
Third, strain is well-known to affect the diamond crystal and the zero-magnetic-field splitting between NV spin states [38,39]. The exact contribution of strain gradients to NV spin ensemble dephasing has not been quantified rigorously because strain varies throughout and between samples, and is in part dependent upon the substrate used for diamond growth [40,41]. Furthermore, the interrogation of spatially large NV ensembles requires the design of uniform magnetic bias fields to minimize magnetic field gradients across the detection volume.
We assume that the relevant NV spin ensemble dephasing mechanisms are independent and can be summarized by Eqn. 1, where $T_2^*\{\cdot\}$ describes the $T_2^*$ limit imposed by a particular dephasing mechanism, and the "≈" symbol indicates that individual dephasing rates add approximately linearly.
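The linear addition of dephasing rates described above can be illustrated with a short numerical sketch. The function name and the dictionary of mechanisms are hypothetical; the three input values are the Sample B estimates quoted elsewhere in the text (nitrogen bath ≈ 23 µs, $^{13}$C ≈ 100 µs, strain gradients ≈ 6 µs), used here only as an illustrative SQ-basis budget and not as the authors' full error model.

```python
def combined_t2_star(limits_us):
    """Combine independent T2* limits (in µs) by summing their dephasing rates.

    Assumes the mechanisms are independent and that rates add linearly,
    as stated for Eqn. 1; the individual terms passed in are up to the caller.
    """
    total_rate = sum(1.0 / t for t in limits_us.values())
    return 1.0 / total_rate


# Illustrative SQ-basis budget for Sample B (estimates quoted in the text).
sample_b_sq = {
    "NV-N dipolar": 23.0,      # nitrogen-limited estimate, µs
    "NV-13C": 100.0,           # 13C nuclear spin bath limit, µs
    "strain gradient": 6.0,    # strain-gradient-limited estimate, µs
}

print(f"combined T2* = {combined_t2_star(sample_b_sq):.1f} us")
```

With these inputs the combined limit is about 4.5 µs, which falls inside the 1–10 µs range of SQ dephasing times reported for Sample B. The shortest individual limit (strain) dominates the sum.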
DQ magnetometry employs the $\{-1, +1\}$ sub-basis of the NV spin-1 system for quantum sensing. In this basis, noise sources that shift the $|\pm1\rangle$ states in common mode (e.g., strain inhomogeneities and spectrum drifts due to temperature fluctuations of the host diamond; fourth and sixth terms in Eqn. 1, respectively) are suppressed by probing the energy difference between the $|+1\rangle$ and $|-1\rangle$ spin states. In addition, the NV DQ spin coherence accumulates phase due to an external magnetic field at twice the rate of traditional single quantum (SQ) coherence magnetometry, for which the $|0\rangle$ and $|+1\rangle$ (or $|-1\rangle$) spin states are probed. DQ magnetometry thus provides enhanced susceptibility to target magnetic field signals, but it also makes the spin coherence twice as sensitive to magnetic noise, including interactions with the paramagnetic spin bath. We therefore use resonant radiofrequency control to decouple the bath spins from the NV sensors (second and third terms in Eqn. 1). By employing both DQ magnetometry and spin bath driving with isotopically enriched samples, we elucidate and effectively eliminate the dominant sources of NV spin ensemble dephasing, realizing up to a 16× extension of the ensemble $T_2^*$ in diamond. These techniques are also compatible with Ramsey-based DC sensing, and we find up to an 8× improvement in DC magnetic field sensitivity. Our results should enable broadband DC sensing using NV spin ensembles with spin interrogation times approaching those used in AC sensing, and may aid in the fabrication of optimized samples for a wide range of solid-state sensor species.

[Fig. 1 caption, partial] Linewidths are Fourier-broadened. The peaks labeled i and ii correspond to dipole-forbidden transitions of the $^{14}$N electronic spins ($\Delta m_I \neq 0$, see Suppl. XI). The simulated spectrum using the full nitrogen Hamiltonian is shown in red, with linewidth and amplitudes chosen to reflect the experimental data.

Double Quantum Magnetometry
The enhanced sensitivity to magnetic fields and insensitivity to common-mode noise sources in the DQ basis can be understood by considering the full ground-state Hamiltonian for NV centers, given by (neglecting the hyperfine interaction) [10]

$$H/h = D S_z^2 + \frac{\gamma_{NV}}{2\pi}\, \mathbf{B}\cdot\mathbf{S} + M_z S_z^2 + M_x \left(S_y^2 - S_x^2\right) + M_y \left(S_x S_y + S_y S_x\right), \qquad (2)$$

where $D \approx 2.87$ GHz is the zero-field spin-state splitting, $\mathbf{S} = \{S_x, S_y, S_z\}$ are the dimensionless spin-1 operators, $\mathbf{B} = \{B_x, B_y, B_z\}$ are the local magnetic field components, $\gamma_{NV}/2\pi \approx 28$ GHz/T is the NV gyromagnetic ratio, and $\{M_x, M_y, M_z\}$ describe the strain and electric field contributions to $H$ [42]. Ignoring terms $\propto S_x, S_y$ due to the large zero-field splitting $D$ and a small applied bias $B_z \lesssim 10$ mT along $z$, the transition frequencies $f_{\pm1}$ (see Fig. 1b) are

$$f_{\pm1} \approx D + M_z \pm \frac{\gamma_{NV}}{2\pi} B_z. \qquad (3)$$

On-axis strain contributions ($\propto M_z$) as well as temperature fluctuations ($\partial D/\partial T \approx -74$ kHz/K) [21,43] shift the $f_{\pm1}$ transitions linearly.
Thus, when performing DQ magnetometry where the difference ∆f = f +1 − f −1 is probed, their effects are to first order suppressed.
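The common-mode rejection argument can be checked numerically with a minimal sketch. It assumes the first-order transition frequencies $f_{\pm1} \approx D + M_z \pm (\gamma_{NV}/2\pi) B_z$ with a linear temperature shift of $D$; this is our reading of the approximation described in the text, and the function names and the test values for strain and temperature offsets are hypothetical.

```python
D_HZ = 2.87e9            # zero-field splitting D, Hz
GAMMA_HZ_PER_T = 28e9    # gamma_NV / 2pi, Hz/T
DD_DT_HZ_PER_K = -74e3   # temperature coefficient of D, Hz/K


def transition_freqs(b_z, m_z=0.0, delta_t=0.0):
    """First-order f_{+1}, f_{-1} in Hz for on-axis field b_z (T),
    on-axis strain shift m_z (Hz), and temperature offset delta_t (K)."""
    d = D_HZ + DD_DT_HZ_PER_K * delta_t
    f_plus = d + m_z + GAMMA_HZ_PER_T * b_z
    f_minus = d + m_z - GAMMA_HZ_PER_T * b_z
    return f_plus, f_minus


def dq_splitting(b_z, m_z=0.0, delta_t=0.0):
    """Frequency difference f_{+1} - f_{-1} probed in DQ magnetometry."""
    f_plus, f_minus = transition_freqs(b_z, m_z, delta_t)
    return f_plus - f_minus


b = 2.2e-3  # bias field used for Sample A, T
sq_shift = transition_freqs(b, m_z=50e3, delta_t=1.0)[0] - transition_freqs(b)[0]
dq_shift = dq_splitting(b, m_z=50e3, delta_t=1.0) - dq_splitting(b)
print(f"SQ shift: {sq_shift:.0f} Hz, DQ shift: {dq_shift:.3f} Hz")
```

A 50 kHz strain shift and a 1 K temperature change move each SQ transition by tens of kHz, while the DQ difference frequency depends only on $B_z$ (and with twice the slope of a single transition), up to floating-point rounding.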
In addition, a perturbative analysis of the complete Hamiltonian in Eqn. 2 (see Suppl. VII) shows that the effects of off-axis strain contributions ($\propto M_x, M_y$) on DQ magnetometry are reduced by a factor $\sqrt{M_x^2 + M_y^2}/(\gamma_{NV} B_z/\pi)$, so that the suppression strengthens in proportion to the bias magnetic field $B_z$. Similarly, the effects of off-axis magnetic fields ($\propto B_x, B_y$) on DQ magnetometry are suppressed due to the large zero-field splitting $D$, and are also largely common-mode. Working in the DQ basis at moderate bias fields can therefore lead to an enhancement in $T_2^*$ for NV ensembles if strain inhomogeneities, small off-axis magnetic field gradients ($B_x, B_y \ll D$), or temperature fluctuations are significant mechanisms of inhomogeneous spin dephasing. This result should be contrasted with single NV measurements in which $T_2^*$ and $T_2$ in the DQ basis were found to be approximately half the values in the SQ basis, i.e., $\tau^{\mathrm{coh}}_{\mathrm{DQ}} \approx \tau^{\mathrm{coh}}_{\mathrm{SQ}}/2$ [29,30]. Since spatial inhomogeneities are not relevant for single centers, the reduced decay times are attributed to an increased sensitivity to magnetic noise in the DQ basis due to the paramagnetic spin bath.
For example, using vector magnetic microscopy (VMM) [19], we mapped the on-axis strain component $M_z$ in a 1 mm$^2$ region for one of the three NV ensemble diamond samples studied in this work ([N] = 0.75 ppm, Sample B) to quantify the length scale and magnitude of strain inhomogeneity (Fig. 1c). From this analysis, we estimate an average strain gradient $M_z/L \approx 2.8$ kHz/µm, which, as we show below, is in good agreement with the observed SQ $T_2^*$ in our samples.
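As a rough plausibility check on this strain estimate, the sketch below converts a strain gradient into a dephasing time. It assumes, purely for illustration, that the frequency spread across the interrogated spot acts as a Lorentzian full width $\Gamma$, so that $T_2^* \approx 1/(\pi\Gamma)$; this conversion and the function name are our assumptions, not a procedure stated in the text. The 20 µm spot diameter is the excitation beam diameter quoted in the experimental methods.

```python
import math


def strain_limited_t2_star(gradient_hz_per_um, length_um):
    """Estimate T2* (s) from a linear strain gradient across a spot.

    Assumption: the total frequency spread (gradient x length) is treated
    as a Lorentzian full width, giving T2* = 1 / (pi * spread).
    """
    spread_hz = gradient_hz_per_um * length_um
    return 1.0 / (math.pi * spread_hz)


t2 = strain_limited_t2_star(2.8e3, 20.0)  # 2.8 kHz/um across a 20 um spot
print(f"strain-limited T2* = {t2 * 1e6:.1f} us")
```

Under this assumption the estimate is about 5.7 µs, of the same order as the ≈ 6 µs strain-gradient limit quoted in the text.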

Spin Bath Driving
To mitigate NV spin dephasing due to the spin bath, we drive the bath electronic spins [31,32] using resonant radiofrequency (RF) radiation. In Fig. 1d, we display the spin resonance spectrum of a nitrogen-rich diamond sample ([N] = 0.75 ppm, Sample B), recorded via the NV double electron-electron resonance (DEER) technique [44] in the frequency range 100–500 MHz (see Suppl. IX). The data reveal six distinct spectral peaks attributed to $^{14}$N substitutional defects in the diamond lattice. The resonance peaks have an approximate amplitude ratio of 1:3:1:3:3:1 resulting from the four crystallographic Jahn-Teller orientations of the nitrogen defects at two possible angles with respect to an applied bias magnetic field ($B_z = 8.5$ mT, aligned along the [111]-axis), as well as three hyperfine states [45–47] (see Suppl. IX for details). Additional smaller peaks i and ii are attributed to dipole-forbidden nitrogen spin transitions and other electronic dark spins [48].
In pulsed spin bath driving [31], a multi-frequency RF π-pulse is applied to each of the bath spin resonances midway through the NV Ramsey sequence, decoupling the bath from the NV sensor spins in analogy to a refocusing π-pulse in a spin echo sequence [25]. Alternatively, the bath spins can be driven with continuous-wave (CW) radiation [31,32]. In this case, the Rabi drive strength $\Omega_{\mathrm{Bath}}$ at each bath spin resonance frequency must significantly exceed the characteristic coupling strength $\gamma$ between the bath spins and NV centers, i.e., $\Omega_{\mathrm{Bath}}/\gamma \gg 1$, to achieve effective decoupling. Under this condition, the bath spins undergo many Rabi oscillations during the characteristic dipolar interaction time $1/\gamma$. As a result, the dipolar interaction with the bath is incoherently averaged and the NV spin dephasing time increases.

RESULTS
We studied three diamond samples with increasing nitrogen concentrations, summarized in Table I. Samples A ([N] ≲ 0.05 ppm) and B ([N] = 0.75 ppm) each consist of a $^{14}$N-doped, ≈ 100 µm-thick chemical-vapor-deposition (CVD) layer (99.99 % $^{12}$C) deposited on top of a diamond substrate. Sample C ([N] = 10 ppm) possesses a 40 µm-thick, $^{15}$N-doped CVD layer (99.95 % $^{12}$C) on a diamond substrate. For all three samples, the nitrogen-limited NV dephasing times can be estimated from the average dipolar interaction strength between electronic spins, giving $T_{2,\mathrm{NV-N}}^* \approx$ 350 µs, 23 µs, and 2 µs for Samples A, B, and C, respectively. Analysis and measurements suggest that the $^{13}$C nuclear spin bath limit to $T_2^*$ is ≈ 100 µs for Samples A and B, and ≈ 20 µs for Sample C (for details, see Suppl. V). All samples are unirradiated and the N-to-NV conversion efficiency is ≲ 1 %. Contributions from NV-NV dipolar interactions to $T_2^*$ can therefore be neglected. The parameter regime covered by Samples A, B, and C was chosen to best illustrate the efficacy of DQ coherence magnetometry and spin bath driving.
We measured $T_2^*$ values in the SQ and DQ bases, denoted $T_{2,\mathrm{SQ}}^*$ and $T_{2,\mathrm{DQ}}^*$ from here on, by performing a single- or two-tone $\pi/2 - \tau - \pi/2$ Ramsey sequence, respectively (see inset Fig. 2). In both instances, the observed Ramsey signal exhibits a characteristic stretched exponential decay envelope that is modulated by the frequency detunings of the applied NV drive(s) from the NV hyperfine transitions. We fit the data to the expression

$$C(\tau) = C_0 \exp\left[-\left(\tau/T_2^*\right)^p\right] \sum_i \cos\left(2\pi f_i\,(\tau - \tau_{0,i})\right),$$

where the free parameters in the fit are the maximal contrast $C_0$ at $\tau = 0$, dephasing time $T_2^*$, stretched exponential parameter $p$, time offsets $\tau_{0,i}$, and (up to) three frequencies $f_i$ from the NV hyperfine splittings. The $p$ value provides a phenomenological description of the decay envelope, which depends on the specific noise sources in the spin bath as well as the distribution of individual resonance lines within the NV ensemble. For a purely magnetic-noise-limited spin bath, the NV ensemble decay envelope exhibits simple exponential decay ($p = 1$) [49,50], whereas a non-integer $p$ value ($p \neq 1$) suggests magnetic and/or strain gradient-limited NV spin ensemble dephasing.

[Table I caption, partial] [...] are calculated using the contributions of $^{13}$C and nitrogen spins as described in the main text. Reasonable agreement is found between the estimated $T_{2,\mathrm{NV-(^{13}C+N)}}^{*,\mathrm{est}}$ and twice the measured $T_{2,\mathrm{DQ}}^{*,\mathrm{meas}}$, consistent with the twice faster dephasing in the DQ basis. Values listed with a ∼ symbol are order-of-magnitude estimates. For all samples, [NV] ≪ [N] and NV contributions to $T_2^*$ can be neglected (1 ppm = 1.76 × 10$^{17}$ cm$^{-3}$).
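The Ramsey fit model described above can be sketched as a plain function. The form used here, a stretched-exponential envelope multiplying a sum of cosines, is inferred from the listed free parameters ($C_0$, $T_2^*$, $p$, $\tau_{0,i}$, $f_i$) and should be read as an assumption; the function and argument names are hypothetical.

```python
import math


def ramsey_signal(tau, c0, t2_star, p, freqs, offsets):
    """Stretched-exponential Ramsey decay modulated by up to three detunings.

    tau, t2_star and offsets share one time unit; freqs are in the inverse
    of that unit. Assumed form:
        C0 * exp(-(tau/T2*)^p) * sum_i cos(2*pi*f_i*(tau - tau0_i))
    """
    envelope = c0 * math.exp(-((tau / t2_star) ** p))
    modulation = sum(
        math.cos(2.0 * math.pi * f * (tau - t0)) for f, t0 in zip(freqs, offsets)
    )
    return envelope * modulation


# Envelope check with a single zero-frequency component:
# at tau = T2* the signal has dropped to C0/e regardless of p.
value = ramsey_signal(5.8, c0=1.0, t2_star=5.8, p=1.7, freqs=[0.0], offsets=[0.0])
print(f"signal at tau = T2*: {value:.4f}")
```

In practice such a function would be handed to a nonlinear least-squares routine; the choice of fitting routine is not specified in the text for this step.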
Strain-dominated dephasing (Sample A: low nitrogen density regime)

Experiments on Sample A ([N] ≲ 0.05 ppm, $^{14}$N) probed the low nitrogen density regime. In different regions of this diamond, the measured SQ Ramsey dephasing time varies over $T_{2,\mathrm{SQ}}^* \approx$ 5–12 µs, with $1 < p < 2$. Strikingly, even the longest measured $T_{2,\mathrm{SQ}}^*$ is ∼ 30× shorter than the calculated $T_{2,\mathrm{NV-N}}^*$ given by the nitrogen concentration of the sample (≈ 350 µs, see Table I) and is approximately 10× smaller than the expected SQ limit due to 0.01 % $^{13}$C spins (≈ 100 µs). This discrepancy indicates that dipolar broadening due to paramagnetic spins is not the dominant NV dephasing mechanism. Indeed, the spatial variation in $T_{2,\mathrm{SQ}}^*$ and the low concentration of nitrogen and $^{13}$C spins suggest that crystal lattice strain inhomogeneity is the main source of NV spin ensemble dephasing in this sample. For the measured NV ensemble volume (∼ 10$^4$ µm$^3$) and the reference strain gradient (Fig. 1c), we estimate a strain-gradient-limited dephasing time of ∼ 6 µs, in reasonable agreement with the observed $T_{2,\mathrm{SQ}}^*$. Measurements in the DQ basis at moderate bias magnetic fields are to first order strain-insensitive, and therefore provide a means to eliminate the dominant contribution of strain to NV spin ensemble dephasing. Fig. 2 shows data for $T_2^*$ in both the SQ and DQ bases for an example region of Sample A with SQ dephasing time $T_{2,\mathrm{SQ}}^* = 5.8(2)$ µs and $p = 1.7(2)$. For these measurements, we applied a small 2.2 mT bias field parallel to one NV axis (misalignment angle < 3°) to lift the $|\pm1\rangle$ degeneracy, and optimized the magnet geometry to reduce magnetic field gradients over the sensing volume (see Suppl. VI). In the DQ basis, we find $T_{2,\mathrm{DQ}}^* = 34(2)$ µs with $p = 1.0(1)$, which is a ∼ 6× improvement over the measured $T_2^*$ in the SQ basis. We observed similar $T_2^*$ improvements in the DQ basis in other regions of this diamond. Our results suggest that in the low nitrogen density regime, dipolar interactions with the $^{13}$C nuclear spin bath are the primary decoherence mechanism when DQ basis measurements are employed to remove strain and temperature effects. Specifically, the measured $T_2^*$ [...]

[Fig. 2 caption, partial] Upper inset: Illustration of the DQ Ramsey protocol with two-tone microwave (MW) pulses, where $\hat{U}_{S=1}(\pi/2)$ is the spin-1 unitary evolution operator [30]. For SQ measurements, a single-tone MW pulse is applied instead to generate the pseudo-spin-1/2 unitary evolution operator $\hat{U}_{S=1/2}(\pi/2)$. Lower inset: Discrete Fourier transform of the SQ (solid blue) and DQ (dashed black) Ramsey measurements with a MW drive detuned 0.4 MHz from the $\{0, \pm1\}$ transitions. NV sensor spins accumulate phase twice as quickly in the DQ basis as in the SQ basis.

[...] 0.05 ppm), we observed SQ Ramsey dephasing times $T_{2,\mathrm{SQ}}^* \approx$ 1–10 µs in different regions of Sample B, which are similar to the results from Sample A. We conclude that strain inhomogeneities are also a significant contributor to NV spin ensemble dephasing in Sample B. Comparative measurements of $T_2^*$ in both the SQ and DQ bases yield a more moderate increase in $T_{2,\mathrm{DQ}}^*$ for Sample B than for Sample A. Example Ramsey measurements of Sample B are displayed in Fig. 3, showing $T_{2,\mathrm{SQ}}^* = 1.80(6)$ µs in the SQ basis increasing to $T_{2,\mathrm{DQ}}^* = 6.9(5)$ µs in the DQ basis, a ∼ 4× extension. The observed $T_{2,\mathrm{DQ}}^*$ in Sample B approaches the expected limit set by dipolar coupling of NV spins to residual nitrogen spins in the diamond ($T_{2,\mathrm{NV-N}}^*/2 \approx 12$ µs), but is still well below the expected DQ limit due to 0.01 % $^{13}$C nuclear spins (≈ 50 µs).
Measuring NV Ramsey decay in both the SQ and DQ bases while driving the nitrogen spins, either via application of CW or pulsed RF fields [31,32], is effective in revealing the electronic spin bath contribution to NV ensemble dephasing. With continuous drive fields of Rabi frequency $\Omega_N = 2$ MHz applied to nitrogen spin resonances 1–6, i, and ii (see Fig. 1d), we find that $T_{2,\mathrm{SQ+Drive}}^* = 1.94(6)$ µs, which only marginally exceeds $T_{2,\mathrm{SQ}}^* = 1.80(6)$ µs. This result is consistent with NV ensemble SQ dephasing being dominated by strain gradients in Sample B, rendering spin bath driving ineffective in the SQ basis. In contrast, DQ Ramsey measurements exhibit a significant additional increase in $T_2^*$ when the bath drive is applied, improving from $T_{2,\mathrm{DQ}}^* = 6.9(5)$ µs to $T_{2,\mathrm{DQ+Drive}}^* = 29.2(7)$ µs. This ∼ 16× improvement over $T_{2,\mathrm{SQ}}^*$ confirms that, for Sample B without spin bath drive, dipolar interactions with the nitrogen spin bath are the dominant mechanism of NV spin ensemble dephasing in the DQ basis. Note that the NV dephasing time for Sample B with DQ plus spin bath drive is only slightly below that for Sample A with DQ alone (≈ 34 µs). We attribute this $T_2^*$ limit in Sample B primarily to NV dipolar interactions with 0.01 % $^{13}$C nuclear spins. There is also an additional small contribution from magnetic field gradients over the detection volume (∼ 10$^4$ µm$^3$) due to the four times larger applied bias field ($B_0 = 8.5$ mT), relative to Sample A, which was used in Sample B to resolve the nitrogen ESR spectral features (see Suppl. Tables S3 and S4). We obtained similar extensions of $T_2^*$ using pulsed driving of the nitrogen bath spins (see Suppl. X).

[Fig. 3 caption, partial] [...]; the SQ coherence with spin-bath drive (blue, 2nd from top); the DQ coherence with no drive (black, 3rd from top); and the DQ coherence with spin-bath drive (black, 4th from top). There is a 16.2× improvement of $T_2^*$ with spin-bath drive when the DQ coherence is used for sensing compared to SQ with no drive. Inset: Two-tone NV Ramsey protocol with applied spin-bath drive resonant with nitrogen spins.
We also characterized the efficacy of CW spin bath driving for increasing $T_2^*$ in both the SQ and DQ bases (see Fig. 4a). While $T_{2,\mathrm{SQ}}^*$ remains approximately constant with varying Rabi drive frequency $\Omega_N$, $T_{2,\mathrm{DQ}}^*$ exhibits an initial rapid increase and saturates at $T_{2,\mathrm{DQ}}^* \approx 27$ µs for $\Omega_N \gtrsim 1$ MHz (only resonances 1–6 are driven here). To explain the observed trend, we introduce a model that distinguishes between (i) NV spin ensemble dephasing due to nitrogen bath spins, which depends upon the bath drive strength $\Omega_N$, and (ii) dephasing from drive-independent sources (including strain and $^{13}$C spins),

$$1/T_2^* = 1/T_{2,\mathrm{NV-N}}^*(\Omega_N) + 1/T_{2,\mathrm{other}}^*. \qquad (4)$$

Taking the coherent dynamics of the bath drive into account (see Suppl. VIII), the data are well described by the functional form

$$1/T_{2,\mathrm{NV-N}}^*(\Omega_N) = \Delta m \times \gamma_{\mathrm{NV-N}}\, \frac{\delta_N^2}{\delta_N^2 + \Omega_N^2}, \qquad (5)$$

where $\Delta m = 1\,(2)$ is the change in spin quantum number in the SQ (DQ) basis and $\delta_N = \gamma_N/2\pi$ is the Lorentzian linewidth (half width at half maximum) of the nitrogen spin resonances measured through DEER ESR (Fig. 1d). Although we find that NV and nitrogen spins have comparable $T_2^*$ ($\gamma_{\mathrm{NV-N}} \approx \gamma_N$, see Suppl. XI), the effective linewidth $\delta_N$ relevant for bath driving is increased due to imperfect overlap of the nitrogen spin resonances caused by a small misalignment angle of the applied bias magnetic field.
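The drive-strength dependence described above can be sketched numerically. The code assumes a Lorentzian suppression of the nitrogen-bath dephasing rate, $\Delta m\,\gamma_{\mathrm{NV-N}}\,\delta_N^2/(\delta_N^2 + \Omega_N^2)$, added to a drive-independent rate $1/T_{2,\mathrm{other}}^*$; that functional form is our reading of the model and is labeled as an assumption. The parameter values are the Sample B estimates quoted in the text ($\gamma_{\mathrm{NV-N}} \approx 2\pi \times 7$ kHz, $\delta_N \approx 80$ kHz, $T_{2,\mathrm{other}}^* \approx 27$ µs), and the function name is hypothetical.

```python
import math


def t2_star_with_drive(omega_n_hz, delta_m, gamma_nv_n, delta_n_hz, t2_other_s):
    """T2* (s) versus bath Rabi frequency, under the assumed Lorentzian model.

    omega_n_hz : bath Rabi drive frequency (Hz)
    delta_m    : 1 for the SQ basis, 2 for the DQ basis
    gamma_nv_n : NV-N dipolar dephasing rate (1/s)
    delta_n_hz : nitrogen resonance half width at half maximum (Hz)
    t2_other_s : T2* limit from drive-independent sources (s)
    """
    lorentzian = delta_n_hz**2 / (delta_n_hz**2 + omega_n_hz**2)
    rate_bath = delta_m * gamma_nv_n * lorentzian
    return 1.0 / (rate_bath + 1.0 / t2_other_s)


GAMMA_NV_N = 2.0 * math.pi * 7e3   # 1/s, dipolar estimate for Sample B
DELTA_N = 80e3                     # Hz, from DEER measurements
T2_OTHER = 27e-6                   # s, saturation value in the DQ basis

for omega in (0.0, 0.1e6, 1.0e6, 2.0e6):
    t2 = t2_star_with_drive(omega, 2, GAMMA_NV_N, DELTA_N, T2_OTHER)
    print(f"Omega_N = {omega / 1e6:.1f} MHz -> T2*(DQ) = {t2 * 1e6:.1f} us")
```

With these inputs the undriven DQ value comes out near 8 µs and the curve saturates close to 27 µs once $\Omega_N$ reaches about 1 MHz, matching the qualitative trend described for Fig. 4a.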
Using the NV-N dipolar estimate for Sample B, $\gamma_{\mathrm{NV-N}} \approx 2\pi \times 7$ kHz, $\delta_N \approx 80$ kHz extracted from DEER measurements (Suppl. XI), and a saturation value of $T_{2,\mathrm{other}}^* \approx 27$ µs, we combine Eqns. 4 and 5 and plot the calculated $T_2^*$ as a function of $\Omega_N$ in Fig. 4a (black, dashed line). The good agreement between the model and our data in the DQ basis suggests that Eqns. 4 and 5 capture the dependence of $T_2^*$ on drive field magnitude (i.e., Rabi frequency). Alternatively, we fit the model to the DQ data (red, solid line) and extract $\gamma_{\mathrm{NV-N}}^{\mathrm{fit}} = 2\pi \times 9.3(2)$ kHz and $\delta_N^{\mathrm{fit}} = 60(3)$ kHz, in reasonable agreement with our estimated parameters. In summary, the results from Sample B show that the combination of spin bath driving and sensing in the DQ basis suppresses inhomogeneous NV ensemble dephasing due to both interactions with the nitrogen spin bath and strain gradients. Similar to Sample A, further enhancement in $T_2^*$ could be achieved with improved isotopic purity, as well as reduced magnetic gradients due to the applied magnetic bias field.

[...] Fig. 4b. At this high nitrogen density, interactions with the nitrogen bath dominate NV spin ensemble dephasing, and $T_{2,\mathrm{SQ}}^*$ and $T_{2,\mathrm{DQ}}^*$ both exhibit a clear dependence on spin bath drive strength $\Omega_N$. With no drive ($\Omega_N = 0$), we measured $T_{2,\mathrm{DQ}}^* \approx T_{2,\mathrm{SQ}}^*/2$, in agreement with dephasing dominated by a paramagnetic spin environment and the twice higher precession rate in the DQ basis [29,30,51]. Note that this result is in contrast to the observed DQ basis enhancement of $T_2^*$ at lower nitrogen density for Samples A and B (Figs. 2 and 3). We also find that $T_2^*$ in Sample C increases more rapidly as a function of spin bath drive amplitude in the DQ basis than in the SQ basis, such that $T_{2,\mathrm{DQ}}^*$ surpasses $T_{2,\mathrm{SQ}}^*$ with sufficient spin bath drive strength.
We attribute the $T_2^*$ limit in the SQ basis (≈ 1.8 µs) to strain inhomogeneities in this sample, whereas the longest observed $T_2^*$ in the DQ basis (≈ 3.4 µs) is in agreement with dephasing due to the 0.05 % $^{13}$C and 0.5 ppm residual $^{14}$N spin impurities. The latter were incorporated during growth of this $^{15}$N sample (see Suppl. Table S5).
In Fig. 4c we plot $T_{2,\mathrm{NV-N}}^* \equiv 2 \times T_{2,\mathrm{DQ}}^*$ versus sample nitrogen concentration [N] to account for the twice faster dephasing of the DQ coherence. To improve the range of [N] coverage, we include DQ data for additional diamonds, Samples D ([N] = 3 ppm) and E ([N] = 48 ppm). To our knowledge, the dependence of the NV spin ensemble dephasing time on [N] has not previously been experimentally reported. Fitting the data to the function $1/T_{2,\mathrm{NV-N}}^* = A_{\mathrm{NV-N}} \cdot [\mathrm{N}]$ (red shaded region), we find the characteristic NV-N interaction strength for NV ensembles to be $A_{\mathrm{NV-N}} = 2\pi \times 16.6(2.6)$ kHz/ppm [$1/A_{\mathrm{NV-N}} = 9.6(1.8)$ µs · ppm] in the SQ sub-basis. This value is about 1.8× larger than the dipolar estimate $\gamma_{\mathrm{e-e}} = 2\pi \times 9.1$ kHz/ppm (black dashed-dotted line), which is used above in estimates of NV dephasing due to the nitrogen spin bath. We also performed numerical spin bath simulations for the NV-N spin system and determined the second moment of the dipolar-broadened single NV ESR linewidth [49, Ch. III and IV]. By simulating 10$^4$ random spin bath configurations, we extract the ensemble-averaged dephasing time from the distribution of the single NV linewidths [50]. The results of this simulation (black dashed line) are in excellent agreement with the experiment and confirm the validity of our obtained scaling for $T_{2,\mathrm{NV-N}}^*([\mathrm{N}])$. Additional details of the simulation are provided in Ref. [52].
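The inverse-linear scaling with nitrogen concentration is easy to evaluate directly. The sketch below uses the two interaction strengths quoted in the text, the fitted $A_{\mathrm{NV-N}} = 2\pi \times 16.6$ kHz/ppm and the dipolar estimate $2\pi \times 9.1$ kHz/ppm; the function and constant names are hypothetical, and uncertainties are ignored.

```python
import math

A_FIT = 2.0 * math.pi * 16.6e3      # fitted NV-N interaction strength, 1/(s*ppm)
A_DIPOLAR = 2.0 * math.pi * 9.1e3   # dipolar estimate, 1/(s*ppm)


def t2_star_nv_n(n_ppm, a=A_FIT):
    """Nitrogen-limited SQ-basis T2* (s) from 1/T2* = A * [N]."""
    return 1.0 / (a * n_ppm)


for n_ppm in (0.05, 0.75, 10.0):
    fit_us = t2_star_nv_n(n_ppm) * 1e6
    dip_us = t2_star_nv_n(n_ppm, A_DIPOLAR) * 1e6
    print(f"[N] = {n_ppm:5.2f} ppm: fit {fit_us:6.1f} us, dipolar estimate {dip_us:6.1f} us")
```

At [N] = 1 ppm the fitted value gives about 9.6 µs, consistent with the quoted $1/A_{\mathrm{NV-N}}$, and the dipolar estimate reproduces the ≈ 350 µs, 23 µs, and 2 µs figures used for Samples A, B, and C.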

Ramsey DC Magnetic Field Sensing
We demonstrated that combining the two quantum control techniques can greatly improve the sensitivity of Ramsey DC magnetometry. Fig. 4d compares the accumulated phase for SQ, DQ, and DQ plus spin bath drive measurements of a tunable static magnetic field of amplitude $B_{DC}$, for Sample B. Sweeping $B_{DC}$ leads to a characteristic observed oscillation of the Ramsey signal $S \propto C \sin(\phi)$, where $C$ is the measurement contrast and $\phi = \Delta m \times \gamma_{NV} B_{DC} \tau$ is the accumulated phase during the free precession interval $\tau \approx T_2^*$. Choosing $\tau_{\mathrm{SQ}} = 1.308$ µs and $\tau_{\mathrm{DQ+Drive}} = 23.99$ µs (see Suppl. XII), we find a 36.3(1.9)× faster oscillation (at equal measurement contrast) when DQ and spin bath driving are both employed, compared to a SQ measurement. This enhancement in phase accumulation, and hence DC magnetic field sensitivity, agrees well with the expected improvement ($2 \times \tau_{\mathrm{DQ+Drive}}/\tau_{\mathrm{SQ}} = 36.7$).

[Fig. 4 caption, partial] Samples were selected to have a predominantly electronic nitrogen (P1) spin bath using DEER ESR measurements. The black dashed-dotted line is the dipolar-interaction-estimated dependence of $T_2^*$ on nitrogen concentration (Suppl. V). We fit the data using an orthogonal-distance-regression routine to account for the uncertainties in [N] and $T_2^*$. A fit to the form $1/T_2^* = A_{\mathrm{NV-N}}[\mathrm{N}]$ yields $A_{\mathrm{NV-N}} = 2\pi \times 16.6(2.6)$ kHz/ppm [$1/A_{\mathrm{NV-N}} = 9.6(1.8)$ µs · ppm]. The red shaded region indicates the 95 % standard error of the fit value for $A_{\mathrm{NV-N}}$. The black dashed line is the expected scaling extracted from numerical simulations using a second-moment analysis of the NV ensemble ESR linewidth (see text for details). (d) Measured Ramsey DC magnetometry signal $S \propto C \sin(\phi(\tau))$ for Sample B, in the SQ and DQ bases, as well as the DQ sub-basis with spin-bath drive (see main text for details). There is a 36× faster oscillation in the DQ sub-basis with spin-bath drive compared to SQ with no drive. This greatly enhanced DC magnetic field sensitivity is a direct result of the extended $T_2^*$, with the sensitivity enhancement given by $2 \times \tau_{\mathrm{DQ+Drive}}/\tau_{\mathrm{SQ}}$ at equal contrast. The slight decrease in observed contrast in the DQ + drive case for $|B_{DC}| > 0.05$ mT is a result of changes in the Zeeman resonance frequencies of the nitrogen spins due to the applied test field $B_{DC}$, which was not corrected for in these measurements.
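The expected phase-accumulation enhancement follows directly from $\phi = \Delta m \times \gamma_{NV} B_{DC} \tau$. The short sketch below evaluates the phase per unit field for the two interrogation times quoted in the text; the function name is hypothetical and $\gamma_{NV}/2\pi \approx 28$ GHz/T is taken from the main text.

```python
import math

GAMMA_NV = 2.0 * math.pi * 28e9   # NV gyromagnetic ratio, rad/(s*T)


def phase_per_tesla(delta_m, tau_s):
    """Accumulated Ramsey phase per unit DC field, d(phi)/dB = dm * gamma * tau."""
    return delta_m * GAMMA_NV * tau_s


sq = phase_per_tesla(1, 1.308e-6)          # SQ basis, no bath drive
dq_drive = phase_per_tesla(2, 23.99e-6)    # DQ basis with spin bath drive
ratio = dq_drive / sq
print(f"phase accumulation ratio = {ratio:.1f}")
```

The ratio evaluates to about 36.7, the expected improvement stated above, and the gyromagnetic ratio cancels so the result depends only on $\Delta m$ and the two free precession times.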

DISCUSSION
Our results (i) characterize the dominant spin dephasing mechanisms for NV ensembles in bulk diamond (strain and interactions with the paramagnetic spin bath); and (ii) demonstrate that the combination of DQ magnetometry and spin bath driving can greatly extend the NV spin ensemble $T_2^*$. For example, in Sample B we find that these quantum control techniques, when combined, provide a 16.2× improvement in $T_2^*$. Operation in the DQ basis protects against common-mode inhomogeneities and enables an extension of $T_2^*$ for samples with [N] ≲ 1 ppm. In such samples, strain inhomogeneities are found to be the main cause of NV spin ensemble dephasing. In samples with higher N concentration ([N] ≳ 1 ppm), spin bath driving in combination with DQ sensing provides an increase of the NV ensemble $T_2^*$ by decoupling paramagnetic nitrogen and other electronic dark spins from the NV spins. Our results suggest that quantum control techniques may allow the NV ensemble $T_2^*$ to approach the bare Hahn echo coherence time $T_2$. Note that spin bath driving may also be used to enhance the NV ensemble $T_2$ in Hahn echo, dynamical decoupling [25,26], and spectral decomposition experimental protocols [53].
Furthermore, we showed that the combination of DQ magnetometry and spin bath driving allows improved DC Ramsey magnetic field sensing. The relative enhancement in photon-shot-noise-limited sensitivity (neglecting experimental overhead time) is quantified by $2 \times \sqrt{\zeta}$, where the factor of two accounts for the enhanced gyromagnetic ratio in the DQ basis and $\zeta \equiv T_{2,\mathrm{DQ}}^*/T_{2,\mathrm{SQ}}^*$ is the ratio of the maximally achieved $T_2^*$ in the DQ basis (with spin bath drive when advantageous) to the non-driven $T_2^*$ in the SQ basis. For Samples A, B, and C, we calculate $2 \times \sqrt{\zeta}$ = 5.2×, 8.1×, and 3.9×, respectively, using our experimental values. In practice, increasing $T_2^*$ also decreases the fractional overhead time associated with NV optical initialization and readout, resulting in even greater DC magnetic field sensitivity improvements and an approximately linear sensitivity enhancement with $\zeta$ (see Suppl. XII). We expect that these quantum control techniques will remain effective when integrated with other approaches to optimize NV ensemble magnetic field sensitivity, such as high laser power and good N-to-NV conversion efficiency. In particular, conversion efficiencies of 1–30 % have been reported for NV ensemble measurements [13,21,23,54], such that the nitrogen spin bath continues to be a relevant spin dephasing mechanism.
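As a worked instance of the $2 \times \sqrt{\zeta}$ figure of merit, the sketch below evaluates it for Sample B using the dephasing times reported in the Results ($T_{2,\mathrm{SQ}}^* = 1.80$ µs without drive, $T_{2,\mathrm{DQ+Drive}}^* = 29.2$ µs). The function name is hypothetical, and the calculation neglects overhead time, as in the definition above.

```python
import math


def shot_noise_enhancement(t2_dq, t2_sq):
    """Relative photon-shot-noise-limited sensitivity gain, 2 * sqrt(zeta),
    with zeta = T2*(DQ, best case) / T2*(SQ, no drive). Overhead time is neglected."""
    zeta = t2_dq / t2_sq
    return 2.0 * math.sqrt(zeta)


gain_b = shot_noise_enhancement(29.2, 1.80)   # Sample B values, both in µs
print(f"Sample B: 2*sqrt(zeta) = {gain_b:.1f}x")
```

This gives approximately 8.1×, in line with the Sample B value quoted above.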
There are multiple avenues for further improvement in NV ensemble $T_2^*$ and DC magnetic field sensitivity, beyond the gains demonstrated in this work. First, the $^{13}$C limitation to $T_2^*$, observed for all samples, can be mitigated via improved isotopic purity ([$^{12}$C] > 99.99 %), or possibly through driving of the nuclear spin bath [55]. Second, more efficient RF delivery will enable faster spin bath driving (higher Rabi drive frequency $\Omega_N$), which will be critical for decoupling denser nitrogen baths and thereby extending $T_2^* \propto \Omega_N^2/\delta_N^2 \propto \Omega_N^2/[\mathrm{N}]^2$ (see Eqn. 5). Third, short NV ensemble $T_2^*$ times have so far prevented effective utilization of more exotic readout techniques, e.g., involving quantum logic [56–58] or spin-to-charge conversion [59,60]. Such methods offer greatly improved NV spin-state readout fidelity but introduce substantial overhead time, typically requiring tens to hundreds of microseconds per readout operation. The NV spin ensemble dephasing times demonstrated in this work ($T_2^* \gtrsim 20$ µs) may allow effective application of these readout schemes, which only offer sensitivity improvements when the sequence sensing time (set by $T_2^*$ for DC sensing) is comparable to the added overhead time. We note that the NV ensemble $T_2^*$ values obtained in this work are the longest for any electronic solid-state spin system at room temperature (see comparison in Fig. S2), suggesting that state-of-the-art DC magnetic field sensitivity [13,61] may be increased to ∼ 100 fT/$\sqrt{\mathrm{Hz}}$ for optimized NV ensembles in a diamond sensing volume ∼ (100 µm)$^3$ (see the discussion on NV ensemble DC magnetic field sensitivity optimization in Barry et al. [13]). In conclusion, DQ magnetometry in combination with spin bath driving allows for an order-of-magnitude increase in the NV ensemble $T_2^*$ in diamond, providing a clear path to ultra-high sensitivity DC magnetometry with NV ensemble coherence times approaching $T_2$.

I. EXPERIMENTAL METHODS
A custom-built, wide-field microscope collected the spin-dependent fluorescence from an NV ensemble onto an avalanche photodiode. Optical initialization and readout of the NV ensemble were accomplished via 532 nm continuous-wave (CW) laser light focused through the same objective used for fluorescence collection (Fig. 1a). The detection volume was given by the 532 nm beam excitation at the surface (diameter ≈ 20 µm) and the sample thickness (100 µm for Samples A and B, 40 µm for Sample C). A static magnetic bias field was applied to split the $|-1\rangle$ and $|+1\rangle$ degeneracy in the NV ground state using two permanent samarium cobalt ring magnets in a Helmholtz-type configuration, with the generated field aligned along one [111] crystallographic axis of the diamond ($\equiv \hat{z}$). The magnet geometry was optimized using the Radia software package [62] to minimize field gradients over the detection volume (see Suppl. VI). A planar waveguide fabricated onto a glass substrate delivered 2–3.5 GHz microwave radiation for coherent control of the NV ensemble spin states. To manipulate the nitrogen spin resonances (see Fig. 1d), a 1 mm-diameter copper loop was positioned above the diamond sample to apply 100–600 MHz radiofrequency (RF) signals, synthesized from up to eight individual signal generators. Pulsed measurements on the NV and nitrogen spins were performed using a computer-controlled pulse generator and microwave switches. The NV ESR measurement contrast (Fig. 2, 3, and 4d) is determined by comparing the fluorescence from the NV ensemble in the $|0\rangle$ state (maximal fluorescence) relative to the $|+1\rangle$ or $|-1\rangle$ state (minimal fluorescence) [10] and is defined as the visibility $C = (\max - \min)/(\max + \min)$. The DEER (Fig. 1d) and DC magnetometry contrast (Fig. 4d) are calculated in the same fashion, but are reduced by ≈ 1/e since the best phase sensitivity in those measurements is obtained at $\tau \approx T_2$ and $\tau \approx T_2^*$, respectively (see Suppl. XI and XII).
For noise rejection, most pulse sequences in this work use a back-to-back double measurement scheme [27], where the accumulated NV spin ensemble phase signal is first projected onto the $|0\rangle$ state and then onto the $|+1\rangle$ (or $|-1\rangle$) state. The contrast for a single measurement is then defined as the visibility computed from the two sequences.
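As a brief illustration of the visibility definition above, the following sketch evaluates $C$ for a pair of fluorescence levels. The numerical readings are hypothetical and chosen only so that the result is of the same order as the contrast values quoted later in this Supplement; they are not measured data.

```python
def visibility(f_max: float, f_min: float) -> float:
    """Visibility contrast C = (max - min) / (max + min)."""
    if f_max + f_min == 0:
        raise ValueError("total fluorescence must be non-zero")
    return (f_max - f_min) / (f_max + f_min)

# Hypothetical photodiode readings (arbitrary units), for illustration only.
bright = 1.000  # projection onto |0> (maximal fluorescence)
dark = 0.950    # projection onto |+1> or |-1> (minimal fluorescence)

contrast = visibility(bright, dark)
print(f"C = {contrast:.4f}")
```

In the back-to-back scheme, `bright` and `dark` would correspond to the two projections recorded in consecutive sequences.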

II. SAMPLE INFORMATION (ALL SAMPLES)
Information for all samples used in this study is summarized in Table S1. In Fig. S2 we show a survey of inhomogeneous dephasing times for electronic solid-state spin ensembles.

IV. STRAIN CONTRIBUTION TO $T_2^*$
The on-axis strain component $M_z$ in Sample B was mapped across a 1 mm × 1 mm area using a separate wide-field imager of NV spin-state-dependent fluorescence. A bias field $B_0 \sim 1.5$ mT was applied to split the spin resonances from the four NV orientations. Measurements were performed following the vector magnetic microscopy (VMM) technique [19]. Eqn. 3 in the main text was used to analyze the measured NV resonance frequencies from each camera pixel (ignoring $M_x$ and $M_y$ terms as small perturbations, see Suppl. VII). This procedure yielded the average $B_x$, $B_y$, and $B_z$ magnetic field components, as well as the $M_z$ on-axis strain components for all four NV orientations in each camera pixel, corresponding to 2.42 µm × 2.42 µm transverse resolution on the diamond sample. Figure 1c of the main text shows the resulting map of the on-axis strain inhomogeneity $M_z$ in Sample B for the NV orientation interrogated in this work. This map indicates an approximate strain gradient of 2.8 kHz/µm across the field of view. The estimated strain gradient was used for all samples, while recognizing the likely variation between samples and within different regions of a sample. Across a 20-µm diameter spot, the measured strain inhomogeneity corresponds to a $T_2^*$ limit of ≈ 6 µs, which compares well with the measured variation in $T^*_{2,\mathrm{SQ}}$ for Samples A and B (see Table 1). Note that the contributions to $M_z$ can be microscopic (e.g., due to nearby point defects) or macroscopic (e.g., due to crystal defects with size > 10 µm). In addition, the VMM technique integrates over macroscopic gradients within the depth of field of the VMM microscope. For the present experiments the resolution along the z-axis (i.e., perpendicular to the diamond surface) is given approximately by the thickness of the NV-diamond layer. Consequently, the strain gradient estimate shown in Fig. 1c is a measure of $M_z$ gradients in-plane within the NV layer, and strain gradients across the NV layer thickness are not resolvable in this measurement.
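The ≈ 6 µs strain limit can be reproduced with a short calculation. The sketch below assumes that the spread of $M_z$ across the detection spot acts as the full width at half maximum of a Lorentzian line, so that $T_2^* = 1/(\pi\Gamma)$ (the convention used for linewidths in Suppl. XI). Whether the estimate in the text was obtained in exactly this way is an assumption.

```python
import math

def t2star_from_fwhm(fwhm_hz: float) -> float:
    """Dephasing time for a Lorentzian line: T2* = 1 / (pi * FWHM)."""
    return 1.0 / (math.pi * fwhm_hz)

strain_gradient_hz_per_um = 2.8e3  # from the M_z map of Sample B
spot_diameter_um = 20.0            # optical detection spot diameter

# Assumption: the M_z spread across the spot is treated as the full linewidth.
strain_spread_hz = strain_gradient_hz_per_um * spot_diameter_um
t2star_strain = t2star_from_fwhm(strain_spread_hz)

print(f"strain spread = {strain_spread_hz / 1e3:.0f} kHz")
print(f"T2* limit     = {t2star_strain * 1e6:.1f} us")
```

Under these assumptions the spread is 56 kHz and the limit lies between 5.5 and 6 µs, consistent with the value quoted above.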
The NV spin ensemble $T_2^*$ as a function of nitrogen concentration is estimated from the average dipolar coupling between electronic nitrogen spins, which is given by $\gamma_{e-e} = a \times \frac{\mu_0}{4\pi}\,\frac{g^2\mu_B^2}{\hbar}\,\frac{1}{r^3} \approx 2\pi \times 9.1 \cdot [\mathrm{N}]$ kHz/ppm, where $\mu_0$ is the vacuum permeability, $g$ is the electron g-factor, $\mu_B$ is the Bohr magneton, $\hbar$ is the reduced Planck constant, $r = 0.55\,[\mathrm{N}]^{-1/3}$ is the average spacing between electronic nitrogen spins as a function of density [N] (in parts-per-million) within diamond [66], and $a$ is a factor of order unity collecting additional parameters from the dipolar estimate such as the angular dependence and spin resonance lineshape of the ensemble [49]. A sample with [N] = 1 ppm has an estimated $T^*_{2,\text{NV-N}} \approx 1/(2\pi \times 9.1\ \text{kHz}) = 17.5$ µs using this dipolar estimate. Similarly, Table S1 gives the estimates $T^*_{2,\text{NV-N}}$ for Samples A, B, and C.
Figure S2. Inhomogeneous spin dephasing times. Experimental results from this work are compared to those of related spin defect systems (see legend). Inhomogeneous dephasing due to paramagnetic bath spins (e.g., nitrogen and $^{13}$C nuclear spins in diamond), strain fields, and other effects limits the ensemble $T_2^*$ at lower sensor-spin densities $\ll 1$. At higher sensor-spin densities approaching unity, spin-spin interaction places an upper bound on the ensemble dephasing time (red shaded area). This limit to $T_2^*$ is estimated using $\gamma_{e-e}$ (see Suppl. V), and a fractional sensor-spin density of 1 corresponds to ∼ 10$^{23}$ cm$^{-3}$. Red arrows indicate the improvement from the bare $T_2^*$, as measured in the NV SQ basis, and the increase when DQ sensing and spin bath drive (where advantageous) are employed to suppress inhomogeneities. The maximal obtained $T_2^*$ values for Samples A, B, and C are multiplied by a factor of two to account for the twice higher gyromagnetic ratio in the NV DQ basis. The region in which individual, single spins are resolvable with confocal microscopy (∼ 200 nm average spin separation) is shown in gray.
Uncertainties in nitrogen concentration [N] used in Fig. 4c are estimated by considering: the values reported by the manufacturer (Element Six Inc.); fluorescence measurements in a confocal microscope (Sample A); and Hahn echo $T_2$ measurements using the calibration value $T_2(\mathrm{N}) \approx 165$ µs·ppm reported in Ref. [52] (Samples B and C). In the dilute $^{13}$C limit ($n_{13C} \lesssim 1.1$ %, where $n_{13C}$ is the $^{13}$C spin concentration in percent), the NV-$^{13}$C contact interaction can be neglected and thus the NV ensemble ESR linewidth is expected to be linearly dependent on the $^{13}$C concentration [6, 49], i.e., $1/T^*_{2,\text{NV-13C}} = A_{\text{NV-13C}} \cdot n_{13C}$. An NV spin ensemble $T_2^*$ measurement on a natural abundance sample with $n_{13C} = 1.07$ % therefore provides a reasonable lower-bound estimate for $A_{\text{NV-13C}}$, from which the $^{13}$C contribution in our diamond samples can be calculated. Fig. S3 shows a DQ Ramsey measurement of a natural $^{13}$C abundance sample. Via a fit to the Ramsey data in the time domain, we extract $T^*_{2,\mathrm{DQ}} = 445(30)$ ns and $p = 1.0(1)$. After correcting for the small contribution of 0.4 ppm nitrogen spins in the sample using the calibration found in Fig. 4c of the main text, we calculate $A_{\text{NV-13C}} \approx 2\pi \times 160$ kHz/% ($1/A_{\text{NV-13C}} \approx 1$ µs·%), from which we determine the NV-$^{13}$C limits given in Table 1 and the main text of the paper.
The NV-diamond epifluorescence microscope employs a custom-built samarium-cobalt (SmCo) magnet geometry designed to apply a homogeneous external field $B_0$ parallel to NVs oriented along the [111] diamond crystallographic axis. The field strength can be varied from 2 to 20 mT (Fig. S4a). SmCo was chosen for its low reversible temperature coefficient (−0.03 %/K). Calculations performed using the Radia software package [62] enabled the optimization of the geometry to minimize $B_0$ gradients across the NV fluorescence collection volume. This collection volume is approximately cylindrical, with a measured diameter of ≈ 20 µm and a length determined by the NV layer thickness along the z-axis (40–100 µm, depending on the diamond sample; see descriptions in the main text). To calculate the expected $B_0$ field strength along the target NV orientation, the dimensions and properties of the magnets were used as Radia input, as well as an estimated 3° misalignment angle of the magnetic field with the NV axis. We find good agreement between the calculated field strength and values extracted from NV ESR measurements in Sample B, over a length scale of a few millimeters. The simulation results and measured values are plotted together in Fig. S4b. The z-direction gradient is reduced compared to the gradient in the xy-plane due to a high degree of symmetry along the z-axis for the magnet geometry.
Using data and simulation, we calculate that the $B_0$ gradient at 8.5 mT induces an NV ensemble ESR linewidth broadening of less than 0.1 kHz across the collection volume of Sample B. This corresponds to a $T_2^*$ limit on the order of 1 ms. However, due to interaction of the bias magnetic field with nearby materials and the displacement of the collection volume from the magnetic field saddle point, the experimentally realized gradient for Sample B was found to contribute an NV ESR linewidth broadening of ≈ 1 kHz (implying a $T_2^*$ limit of ≈ 320 µs), which constitutes a small but non-negligible contribution to the $T_2^*$ values measured in this work. Ramsey measurements for Sample A were taken at a four times smaller bias field; we therefore estimate ≈ 4× better magnetic field homogeneity. For Sample C, with a layer thickness of 40 µm, the contribution of the magnetic field gradient at 10 mT to $T_2^*$ was similar to that of Sample B.
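The rate-to-time conversions quoted in the preceding paragraphs (NV-N dipolar estimate, NV-$^{13}$C coefficient, and bias-field gradient broadening) can be checked with a few lines of arithmetic. This is a sketch only: it assumes that angular rates convert as $T_2^* = 1/\text{rate}$ and that linewidths in Hz convert as $T_2^* = 1/(\pi\Gamma)$ for a Lorentzian line of full width $\Gamma$.

```python
import math

TWO_PI = 2.0 * math.pi

def t2star_from_rate(rate_rad_per_s: float) -> float:
    """T2* = 1 / rate for a dephasing rate given in rad/s."""
    return 1.0 / rate_rad_per_s

def t2star_from_fwhm(fwhm_hz: float) -> float:
    """T2* = 1 / (pi * FWHM) for a Lorentzian line of full width FWHM in Hz."""
    return 1.0 / (math.pi * fwhm_hz)

# NV-N dipolar estimate: gamma_e-e ~ 2*pi * 9.1 kHz per ppm of nitrogen.
n_ppm = 1.0
t2_nv_n = t2star_from_rate(TWO_PI * 9.1e3 * n_ppm)

# NV-13C: 1/T2* = A * n_13C with A ~ 2*pi * 160 kHz per percent of 13C.
a_13c = TWO_PI * 160e3
t2_13c_one_percent = t2star_from_rate(a_13c * 1.0)

# Bias-field gradient: measured (~1 kHz) and simulated (<0.1 kHz) broadening.
t2_grad_measured = t2star_from_fwhm(1.0e3)
t2_grad_simulated = t2star_from_fwhm(0.1e3)

print(f"NV-N, 1 ppm:        {t2_nv_n * 1e6:.1f} us")
print(f"NV-13C, 1 %:        {t2_13c_one_percent * 1e6:.2f} us")
print(f"gradient, 1 kHz:    {t2_grad_measured * 1e6:.0f} us")
print(f"gradient, 0.1 kHz:  {t2_grad_simulated * 1e3:.1f} ms")
```

The simulated-gradient value comes out at a few milliseconds, which matches the "order of 1 ms" statement only loosely, since 0.1 kHz is quoted as an upper bound on the broadening.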

VII. NV HAMILTONIAN IN SINGLE AND DOUBLE QUANTUM BASES
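In this section we discuss the influence of strain and magnetic fields in the single quantum (SQ) and double quantum (DQ) bases by considering several limiting cases. We first discuss how common-mode noise sources, i.e., sources that shift the NV $|-1\rangle$ and $|+1\rangle$ energy levels in-phase and with equal magnitude, are suppressed in the DQ basis. We then discuss how off-NV-axis strain fields are suppressed even by moderate bias magnetic fields. Lastly, we discuss the effect of off-axis magnetic fields on the NV spin-state energy levels and $T_2^*$. We begin with the negatively-charged NV ground electronic state electronic spin (S = 1) Hamiltonian, which is given by [10] (neglecting hyperfine and quadrupolar effects):
$$ H/h = D\,S_z^2 + \frac{\gamma_{NV}}{2\pi}\,\mathbf{B}\cdot\mathbf{S} + M_z S_z^2 + M_x\left(S_y^2 - S_x^2\right) + M_y\left(S_x S_y + S_y S_x\right). \tag{S1} $$
In the basis $\{|+1\rangle, |0\rangle, |-1\rangle\}$ this reads
$$ H/h = \begin{pmatrix} D + M_z + \frac{\gamma_{NV}}{2\pi}B_z & \frac{\gamma_{NV}}{2\pi}\frac{B_x - iB_y}{\sqrt{2}} & -M_x - iM_y \\ \frac{\gamma_{NV}}{2\pi}\frac{B_x + iB_y}{\sqrt{2}} & 0 & \frac{\gamma_{NV}}{2\pi}\frac{B_x - iB_y}{\sqrt{2}} \\ -M_x + iM_y & \frac{\gamma_{NV}}{2\pi}\frac{B_x + iB_y}{\sqrt{2}} & D + M_z - \frac{\gamma_{NV}}{2\pi}B_z \end{pmatrix}. \tag{S2} $$
Case 1: Zero strain, zero off-axis magnetic field
For zero strain/electric field ($\{M_x, M_y, M_z\} = 0$) and zero off-axis magnetic field ($B_\perp = 0$), the Hamiltonian in Eqn. S2 is diagonal:
$$ H_0/h = \mathrm{diag}\left(D + \frac{\gamma_{NV}}{2\pi}B_z,\; 0,\; D - \frac{\gamma_{NV}}{2\pi}B_z\right), $$
and the energy levels are given by the zero-field splitting $D$ and Zeeman energies $\pm\frac{\gamma_{NV}}{2\pi}B_z$,
$$ E_{|\pm1\rangle}/h = D \pm \frac{\gamma_{NV}}{2\pi}B_z, \qquad E_{|0\rangle}/h = 0, $$
where $|\pm1\rangle$ and $|0\rangle$ are the Zeeman eigenstates. NV spin ensemble measurements in the DQ basis, for which the difference between the $f_{-1} = E_{|0\rangle\to|-1\rangle}$ and $f_{+1} = E_{|0\rangle\to|+1\rangle}$ transitions is probed (see Fig. 1b), are to first order insensitive to inhomogeneities and fluctuations in $D$ (e.g., due to drift in temperature) and to other common-mode noise sources. However, DQ measurements are twice as sensitive to magnetic fields along $B_z$. The DQ basis therefore provides both enhanced magnetic field sensitivity and protection against common-mode noise sources (for higher order effects see, e.g., the Supplement of Ref. [29]).
Case 2: Non-zero strain, zero off-axis magnetic field
For non-zero strain/electric field components, but negligible off-axis magnetic fields ($B_\perp \approx 0$), the energy eigenvalues of the NV Hamiltonian (Eqn. S2) for the $|\pm1\rangle$ states become
$$ E_{|\pm1\rangle}/h = D + M_z \pm \sqrt{\left(\frac{\gamma_{NV}}{2\pi}B_z\right)^2 + \|M_\perp\|^2} \approx D + M_z \pm \left[\frac{\gamma_{NV}}{2\pi}B_z + \frac{\|M_\perp\|^2}{\gamma_{NV}B_z/\pi}\right], \tag{S7} $$
with $\|M_\perp\| = \sqrt{M_x^2 + M_y^2}$. From Eqn. S7 it follows that off-axis strain ($\propto \|M_\perp\|$) is suppressed by moderate on-axis bias fields by a factor $\|M_\perp\|/(\gamma_{NV}B_z/\pi)$, as noted in the main text. Reported values for $\|M_\perp\|$ are ∼ 10 kHz [29] and ∼ 100 kHz [38] for single NV centers in bulk diamond, and ∼ 7 MHz in nano-diamonds [38]. Fig. 1c in the main text shows that the measured on-axis strain $M_z$ in Sample B varies by 2–3 MHz (see Suppl. IV for details).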
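The two statements of Cases 1 and 2 (common-mode rejection in the DQ basis and suppression of off-axis strain by the bias field) can be illustrated numerically. For $B_\perp = 0$ the $|0\rangle$ state decouples, so the $|\pm1\rangle$ levels follow from a two-level problem with splitting $\sqrt{(\gamma_{NV}B_z/2\pi)^2 + \|M_\perp\|^2}$. The sketch below assumes the literature value $D \approx 2.87$ GHz, which is not stated in this section, and uses illustrative strain values within the ranges quoted above.

```python
import math

GAMMA_HZ_PER_T = 28e9  # gamma_NV / 2pi
D_HZ = 2.87e9          # zero-field splitting (assumed literature value)

def f_pm(bz_t, mz=0.0, m_perp=0.0, d=D_HZ):
    """Exact |0> -> |-1> and |0> -> |+1> frequencies (Hz) for B_perp = 0."""
    split = math.hypot(GAMMA_HZ_PER_T * bz_t, m_perp)
    return d + mz - split, d + mz + split

bz = 5e-3  # tesla, illustrative bias field
f_minus0, f_plus0 = f_pm(bz)

# Common-mode shift (on-axis strain M_z = 1 MHz): SQ lines move, DQ splitting does not.
f_minus1, f_plus1 = f_pm(bz, mz=1e6)
sq_shift = f_plus1 - f_plus0
dq_shift = (f_plus1 - f_minus1) - (f_plus0 - f_minus0)

# Off-axis strain ||M_perp|| = 100 kHz: residual SQ shift compared with
# the leading-order term ||M_perp||^2 / (gamma_NV * B_z / pi).
m_perp = 100e3
_, f_plus2 = f_pm(bz, m_perp=m_perp)
residual = f_plus2 - f_plus0
predicted = m_perp**2 / (2.0 * GAMMA_HZ_PER_T * bz)

print(f"SQ shift from M_z: {sq_shift:.1f} Hz, DQ shift: {dq_shift:.3g} Hz")
print(f"off-axis residual: {residual:.2f} Hz, leading-order estimate: {predicted:.2f} Hz")
```

For these illustrative values a 100 kHz off-axis strain term moves the SQ transition by only a few tens of Hz, i.e., it is reduced by the factor $\|M_\perp\|/(\gamma_{NV}B_z/\pi)$.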

Case 3: Non-zero off-axis magnetic field
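For non-zero off-axis magnetic field ($B_\perp \neq 0$) we find the energy values for the NV Hamiltonian (Eqn. S1) by treating $B_\perp$ as a small perturbation, with perturbation Hamiltonian $V \equiv H - H_0$. To simplify the analysis we set $M_\parallel = M_\perp = 0$. Using time-independent perturbation theory (TIPT, see for example Ref. [67]), the corrected energy levels are then given by
$$ E_{|m\rangle} \approx E^{(0)}_{|m\rangle} + \langle m|V|m\rangle + \sum_{n\neq m}\frac{|\langle n|V|m\rangle|^2}{E^{(0)}_{|m\rangle} - E^{(0)}_{|n\rangle}}, $$
$$ E_{|\pm1\rangle}/h \approx D \pm \frac{\gamma_{NV}}{2\pi}B_z + \frac{1}{2}\,\frac{(\gamma_{NV}B_\perp/2\pi)^2}{D}, $$
$$ E_{|0\rangle}/h \approx -\frac{(\gamma_{NV}B_\perp/2\pi)^2}{D}, $$
where we have used in the last two lines the fact that $\frac{\gamma_{NV}}{2\pi}B_z \ll D$ in our experiments. The new transition frequencies $f_{\pm1} = E_{|0\rangle\to|\pm1\rangle}$ are then found to be
$$ f_{\pm1} \approx D \pm \frac{\gamma_{NV}}{2\pi}B_z + \frac{3}{2}\,\frac{(\gamma_{NV}B_\perp/2\pi)^2}{D}. \tag{S11} $$
From Eqn. S11 it follows that energy level shifts due to perpendicular magnetic fields are mitigated by the large zero-field splitting $D$, and are further suppressed in the DQ basis, as they add (approximately) in common-mode. At moderate bias fields, $B_z$ = 2–20 mT, and typical misalignment angles of θ ∼ 3° (or lower), we estimate a frequency shift of 0.1–1 kHz in the SQ basis.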
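The perturbative result for an off-axis field can be compared with an exact solution of the three-level problem. The sketch below is a dependency-free check under stated assumptions: the Hamiltonian is taken as $H/h = D S_z^2 + \frac{\gamma_{NV}}{2\pi}(B_z S_z + B_\perp S_x)$ with the literature value $D \approx 2.87$ GHz, the field values are hypothetical, and the eigenvalues are found by bisection on the characteristic polynomial (a library eigenvalue routine would serve equally well). The second-order estimate used for comparison is the standard common-mode shift $\frac{3}{2}(\gamma_{NV}B_\perp/2\pi)^2/D$.

```python
import math

D_HZ = 2.87e9          # zero-field splitting (assumed literature value)
GAMMA_HZ_PER_T = 28e9  # gamma_NV / 2pi

def char_poly(lam, a, c):
    """det(H - lam) for H/h = [[D+a, c, 0], [c, 0, c], [0, c, D-a]]."""
    up, dn = D_HZ + a - lam, D_HZ - a - lam
    return up * (-lam * dn - c * c) - c * c * dn

def root_near(guess, a, c, half_width=1e6):
    """Bisect for the single eigenvalue within half_width (Hz) of guess."""
    lo, hi = guess - half_width, guess + half_width
    f_lo = char_poly(lo, a, c)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = char_poly(mid, a, c)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def transitions(bz_t, bperp_t):
    """Exact (f_minus, f_plus) in Hz."""
    a = GAMMA_HZ_PER_T * bz_t
    c = GAMMA_HZ_PER_T * bperp_t / math.sqrt(2)  # matrix element <0|V|+-1>
    e0 = root_near(0.0, a, c)
    return root_near(D_HZ - a, a, c) - e0, root_near(D_HZ + a, a, c) - e0

bz, bperp = 5e-3, 50e-6  # tesla; hypothetical values for illustration
fm0, fp0 = transitions(bz, 0.0)
fm, fp = transitions(bz, bperp)
sq_shift = fp - fp0                   # shift of one SQ transition
dq_shift = (fp - fm) - (fp0 - fm0)    # shift of the DQ splitting
tipt_shift = 1.5 * (GAMMA_HZ_PER_T * bperp) ** 2 / D_HZ

print(f"exact SQ shift: {sq_shift:.1f} Hz, second-order estimate: {tipt_shift:.1f} Hz")
print(f"DQ shift: {dq_shift:.1f} Hz")
```

The small remaining difference between the exact and perturbative SQ shifts reflects the neglected ratio $\gamma_{NV}B_z/(2\pi D)$; the same ratio sets the residual DQ shift.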

VIII. SPIN BATH DRIVING MODEL
The effective magnetic field produced by the ensemble of nitrogen spins is modeled as a Lorentzian line shape with spectral width $\delta_N$ (half width at half max) and a maximum $\gamma_{\text{NV-N}}$ at zero drive frequency ($\Omega_N = 0$). This lineshape is derived in the context of dilute dipolar-coupled spin ensembles using the method of moments [49, Ch. III and IV] and is consistent with NV DEER linewidth measurements (see Suppl. XI). The limit to the NV ensemble $T_2^*$ taking the bath drive into account is given by (see Eqn. 5 of main text)
$$ \frac{1}{T_2^*} = \gamma_{\text{NV-N}}\,\frac{\delta_N^2}{\delta_N^2 + \Omega_N^2}. \tag{S12} $$
At sufficiently high drive strengths ($\Omega_N \gg \delta_N$), the nitrogen spin ensemble is coherently driven and the resulting magnetic field noise spectrum is detuned away from the zero-frequency component, to which NV Ramsey measurements are maximally sensitive [68]. For this case, the NV spin ensemble $T_2^*$ increases $\propto \Omega_N^2/\delta_N^2$. At drive strength $\Omega_N \lesssim \delta_N$, however, the nitrogen spin ensemble is inhomogeneously driven and the dynamics of the spin bath cannot be described by coherent driving. Nonetheless, $1/T_2^*$ given by Eqn. S12 approaches $\gamma_{\text{NV-N}}$ in the limit $\Omega_N \to 0$, which is captured by the Lorentzian model. This model (Eqn. S12) is in excellent agreement with the data for Sample B ([N] = 0.75 ppm, $\delta_N \approx 11$ kHz), for which $\Omega_N > \delta_N$ for the range of drive strengths employed. $\Omega_N > \delta_N$ also holds when the slight mismatch of nitrogen spin resonances is taken into account, effectively increasing the nitrogen linewidth relevant for bath driving ($\delta_N \approx 60$ kHz, see discussion in main text). For Sample C ([N] = 10 ppm, $\delta_N \approx 150$ kHz), we find that the effective linewidth $\delta_N$ extracted from fitting the data in Fig. 4b is about 4× larger (≈ 600 kHz) than what is expected from the dipolar estimate, even after accounting for the small $B_0$ misalignment angle and the resultant slight mismatch of nitrogen spin resonance frequencies. We attribute this discrepancy to incoherent dynamics at drive strength $\Omega_N \sim \delta_N$.
Indeed, we find that for Sample C at drive strengths $\Omega_N \lesssim \delta_N$ the Ramsey signals exhibit multi-exponential decay with slow and fast decay rates, consistent with a larger effective $\delta_N$. To nonetheless enable a qualitative comparison with Sample B, in these instances the stretched exponential parameter is restricted to $p \geq 1$ when extracting the NV spin ensemble $T_2^*$. At drive frequencies $\Omega_N > \delta_N$, the observed Ramsey signal returns to a simple exponential decay, confirming the validity of our driving model in this regime for Sample C. A more complete driving model, beyond the scope of this work, should take into account the changes of spin bath dynamics at drive strengths $\Omega_N \sim \delta_N$.
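The limiting behavior of the Lorentzian driving model described above can be made concrete with a short sketch. It assumes the form $1/T_2^* = \gamma_{\text{NV-N}}\,\delta_N^2/(\delta_N^2 + \Omega_N^2)$, which follows from a Lorentzian of half width $\delta_N$ with maximum $\gamma_{\text{NV-N}}$ at $\Omega_N = 0$. The undriven rate below is a hypothetical value, and the function name is illustrative only.

```python
def dephasing_rate(omega_n: float, delta_n: float, gamma_nv_n: float) -> float:
    """Lorentzian bath-drive model: 1/T2* = gamma * delta^2 / (delta^2 + Omega^2).

    omega_n and delta_n must share the same units (here Hz).
    """
    return gamma_nv_n * delta_n**2 / (delta_n**2 + omega_n**2)

gamma = 1.0 / 7e-6  # s^-1, hypothetical undriven NV-N dephasing rate
delta = 60e3        # Hz, effective bath linewidth quoted for Sample B

for omega in (0.0, 60e3, 600e3, 1.5e6):
    t2 = 1.0 / dephasing_rate(omega, delta, gamma)
    print(f"Omega_N = {omega / 1e3:7.0f} kHz -> NV-N limited T2* = {t2 * 1e6:9.1f} us")
```

The values returned are the NV-N contribution alone; in a real sample the other dephasing channels cap the total $T_2^*$ well before the high-drive values are reached.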

IX. 14 N AND 15 N DOUBLE ELECTRON-ELECTRON RESONANCE SPECTRA
We account for the $^{14}$N and $^{15}$N spin resonances, observed in NV double electron-electron resonance (DEER) spectra (see Fig. 1d and S6), in terms of Jahn-Teller, hyperfine, and quadrupolar splittings. The relevant spin Hamiltonian for the substitutional nitrogen defect is given by [33][34][35][69]
$$ H/h = \frac{\mu_B}{h}\,\mathbf{B}\cdot\mathbf{g}\cdot\mathbf{S} + \frac{\mu_N}{h}\,\mathbf{B}\cdot\mathbf{I} + \mathbf{S}\cdot\mathbf{A}\cdot\mathbf{I} + \mathbf{I}\cdot\mathbf{Q}\cdot\mathbf{I}, \tag{S13} $$
where $\mu_B$ is the Bohr magneton, $h$ is Planck's constant, $\mathbf{B} = (B_x, B_y, B_z)$ is the magnetic field vector, $\mathbf{g}$ is the electronic g-factor tensor, $\mu_N$ is the nuclear magneton, $\mathbf{S} = (S_x, S_y, S_z)$ is the electronic spin vector, $\mathbf{A}$ is the hyperfine tensor, $\mathbf{I} = (I_x, I_y, I_z)$ is the nuclear spin vector, and $\mathbf{Q}$ is the nuclear electric quadrupole tensor. This Hamiltonian can be simplified in the following way. First, we neglect the nuclear Zeeman energy (second term above) since its contribution is negligible at the magnetic fields used in this work ($\lesssim 10$ mT). Second, the Jahn-Teller distortion defines a symmetry axis for the nitrogen defect along any of the [111] crystal axis directions [45,47]. Under this trigonal symmetry (as with NV centers), and by going into an appropriate coordinate system, the tensors $\mathbf{g}$, $\mathbf{A}$, and $\mathbf{Q}$ are diagonal and defined by at most two parameters:
$$ \mathbf{g} = \mathrm{diag}(g_\perp, g_\perp, g_\parallel), \qquad \mathbf{A} = \mathrm{diag}(A_\perp, A_\perp, A_\parallel), \qquad \mathbf{Q} = \mathrm{diag}(P_\perp, P_\perp, P_\parallel). \tag{S14} $$
Here, $g_\perp$, $g_\parallel$, $A_\perp$, $A_\parallel$, $P_\perp$, and $P_\parallel$ are the gyromagnetic, hyperfine, and quadrupolar on- and off-axis tensor components, respectively, in the principal coordinate system. Further simplifications can be made by noting that the g-factor is approximately isotropic [33], i.e., $g_\perp \approx g_\parallel \equiv g$, and that for exact axial symmetry the off-axis components of the quadrupole tensor, $P_\perp$, vanish [44]. Equation S13 may now be written as
$$ H/h = \frac{g\mu_B}{h}\,\mathbf{B}\cdot\mathbf{S} + A_\parallel S_z I_z + A_\perp\left(S_x I_x + S_y I_y\right) + P_\parallel I_z^2. \tag{S15} $$
$^{14}$N spectrum
$^{14}$N has S = 1/2 and I = 1, leading to six eigenstates $|m_S = \pm1/2, m_I = 0, \pm1\rangle$. The corresponding three dipole-allowed transitions ($\Delta m_S = \pm1$, $\Delta m_I = 0$, solid arrows) are shown in Fig. S5, along with the four first-order forbidden transitions ($\Delta m_S = \pm1$, $\Delta m_I = \pm1$, dashed arrows).
A nitrogen defect in diamond undergoes a Jahn-Teller (JT) distortion, which defines a hyperfine quantization axis along any of the four [111] crystallographic directions, irrespective of the applied magnetic field. Taking all JT orientations into account, the full $^{14}$N spin resonance spectrum displays a total of 12 dipole-allowed resonances. By aligning the magnetic field along any of the [111] directions of the diamond crystal, the 12 transitions become partially degenerate and reduce to six visible transitions in an NV DEER measurement, with an amplitude ratio 1:3:1:3:3:1, as shown in Fig. 2b of the main text and Fig. S6a. We obtain the spectrum for the off-axis and degenerate JT orientations from Eqn. S15 by rotating the bias field by θ = 109.471° around either the x or y axis, where θ is the angle between any two of the four [111]-type crystallographic axes, i.e., taking $\mathbf{B} \to R_{x\,\mathrm{or}\,y}(\theta = 109.471^\circ)\cdot\mathbf{B}$.
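The counting argument above (12 allowed resonances collapsing to six groups with weights 1 and 3) follows from geometry alone and can be sketched as below. The code only enumerates the angles between the bias field and the four JT axes; reproducing the ordering 1:3:1:3:3:1 along the frequency axis would require the hyperfine and quadrupole parameters, which are not used here.

```python
import math
from collections import Counter

# The four [111]-type bond directions in diamond (unnormalized).
JT_AXES = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

def angle_deg(u, v):
    """Angle between two vectors in degrees, clamped against rounding error."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

b_field = (1, 1, 1)  # bias field along one [111] axis
angles = [round(angle_deg(b_field, axis), 3) for axis in JT_AXES]
classes = Counter(angles)  # angle -> number of JT orientations at that angle

# Each orientation class carries three allowed lines (m_I = -1, 0, +1).
group_weights = sorted(w for w in classes.values() for _ in range(3))

print("angles (deg):", angles)
print("group weights:", group_weights, "total resonances:", sum(group_weights))
```

The single on-axis orientation contributes three lines of weight 1, and the three equivalent off-axis orientations at 109.471° contribute three lines of weight 3.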

X. CONTINUOUS VERSUS PULSED SPIN BATH DRIVING
As described in the main text, both continuous (CW) and pulsed driving can decouple the electronic spin bath from the NV sensor spins (see Fig. S7). In CW driving, the bath spins are driven continuously such that they undergo many Rabi oscillations during the characteristic interaction time $1/\gamma_{\text{NV-N}}$, and thus the time-averaged NV-N dipolar interaction approaches zero. For pulsed driving, π-pulses resonant with spin transitions in the bath are applied midway through the NV Ramsey free precession interval, to refocus bath-induced dephasing. Fig. S7a illustrates both methods for a given applied RF field with a Rabi frequency of $\Omega_N$.
Although we treat CW driving in the main text in detail, we find experimentally that pulsed driving yields similar $T_2^*$ improvements over the measured range of Rabi drives. For example, Fig. S7a compares $T_2^*$ for Sample B for both schemes at maximum bath drive strength $\Omega_N = 1.5$ MHz (for pulsed driving, $\tau_\pi \equiv 1/(2\Omega_N)$). Both decoupling schemes result in comparable $T_2^*$ improvements (13–15×) over the non-driven SQ measurement, which is shown for reference. We attribute the slightly lower maximum $T_2^*$ achieved in pulsed driving to detunings of the RF drive from the spin resonances of the main nitrogen groups, leading to less efficient driving of the spin population (see next section).
To study the efficacy of both driving schemes, we plot $T_2^*$ as a function of Rabi drive $\Omega_N$ in Fig. S7b. In the limit of $\tau_\pi \approx T_2^*$, pulsed driving resembles the CW case and both schemes converge to the same maximal $T_2^*$.
Despite the similar improvements in $T_2^*$ achieved using both methods, pulsed driving can reduce heating of the MW delivery loop and diamond sample, an important consideration for temperature-sensitive applications. For this reason, pulsed driving may be preferable in such experiments despite the need for π-pulse calibration across multiple resonances.

XI. NV AND NITROGEN SPIN RESONANCE LINEWIDTH MEASUREMENTS
The NV and nitrogen (P1) ensemble spin resonance linewidths are determined using pulsed ESR and pulsed DEER NV spectral measurements, respectively, as shown in Fig. S8. Low Rabi drive strength and consequently long π-pulse durations can be used to avoid Fourier power broadening [70]. We find that nitrogen spin resonance spectra are typically narrower than for NV ensembles in the SQ basis, due to the effects of strain gradients in diamond on NV zero-field splittings.
For the spin bath driving model described in the main text (Eqn. 4), we are interested in the natural (i.e., non-power-broadened) linewidth $\delta_N$ of spin resonances corresponding to, for example, $^{14}$N groups 1–6 (see Fig. 2b in main text and Fig. S8a in Supplement). In Ref. [25] it was reported that the different $^{14}$N groups have approximately equal linewidth, i.e., that $\delta_{N,i} \approx \delta_N$. However, we find that a bias field $B_z$ that is only slightly misaligned (∼ 3°) from one of the [111] crystal axes causes the three degenerate spin resonances to be imperfectly overlapped, leading to a larger effective linewidth.
In Fig. S8b and c we compare the NV pulsed DEER linewidths of $^{14}$N group 1 (a single resonance) with that of group 5 (three overlapped resonances) for different $\pi_{\text{bath}}$-pulse durations. At short $\pi_{\text{bath}}$-pulse durations (high MW powers), the linewidths are power broadened due to the applied microwave field, such that the measured linewidth is a convolution of the natural linewidth and the inverse duration of the $\pi_{\text{bath}}$-pulse [70]. At longer $\pi_{\text{bath}}$-pulse durations (reduced MW power), however, the measured linewidth approaches its natural width. In this instance, and for dipolar-limited linewidth broadening, the lineshape is Lorentzian with full width at half max $\Gamma = 1/(\pi T^*_{2,N})$. At the longest $\pi_{\text{bath}}$-pulse durations used in this work, we find that group 1 consists of a narrow, approximately 25 kHz-wide peak. In contrast, group 5 reveals two peaks, consisting of two overlapped $^{14}$N transitions and one detuned transition, which is attributed to imperfect magnetic field alignment. The splitting between the two peaks in group 5 is ≈ 80 kHz, which we use as the effective $^{14}$N linewidth $\delta_N$ in Eqn. 4 of the main text, and which is consistent with the value extracted from fitting the spin-bath driving model to the data (see Fig. 4a, $\delta_N^{\text{fit}} \approx 60$ kHz). In Fig. S8e we compare the measured NV and $^{14}$N group 1 ensemble linewidths (full width at half max) for Sample B as a function of π-pulse duration. For both species, the linewidth narrows at long π-pulse durations, as discussed above, reaching non-power-broadened (natural) values. Notably, the non-power-broadened NV linewidth [321(7) kHz, extracted from a fit to the data] is ∼ 16× larger than the natural $^{14}$N linewidth [20.6(1.2) kHz]. This order-of-magnitude difference is a manifestation of the strong strain field gradients in this sample. Specifically, pulsed ESR measurements of the NV ensemble linewidth (see Fig. S8a) are performed in the SQ {0, +1} or {0, −1} sub-basis, and are therefore strain gradient limited.
In contrast, nitrogen defects in diamond have S = 1/2, and thus do not couple to electric fields or strain gradients. As a consistency check, note that NV ensemble Ramsey measurements in Sample B, made in the DQ basis (with no spin-bath driving), yield a strain-independent dephasing time $T^*_{2,\mathrm{DQ}} = 6.9(5)$ µs. This dephasing time, presumably limited by the nitrogen spin bath, implies a $^{14}$N spin resonance linewidth given by $\frac{1}{2} \times 1/(\pi T^*_{2,\mathrm{DQ}}) = 23(2)$ kHz, which is in good agreement with our pulsed DEER measurements of the natural $^{14}$N linewidth. Similar consistency is found for measurements of the NV and $^{15}$N ensemble spin resonance linewidths in Sample C, as shown in Fig. S8f. Such agreement across multiple samples is further evidence that the DQ $T_2^*$ value for NV ensembles is limited by the surrounding nitrogen spin bath, as discussed in the main text. Note that for our samples [NV] $\ll$ [N] and we can therefore ignore the back action of NVs onto nitrogen spins in the DEER readout. For denser NV samples, however, this back action has to be taken into account [71].
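The consistency check between the DQ dephasing time and the natural $^{14}$N linewidth, as well as the quoted NV-to-$^{14}$N linewidth ratio, can be reproduced with the Lorentzian relation $\Gamma = 1/(\pi T_2^*)$. The snippet is a sketch of that arithmetic using only numbers given above.

```python
import math

t2star_dq = 6.9e-6  # s, DQ Ramsey dephasing time of Sample B (no bath drive)

# Factor 1/2 as in the text: the DQ coherence is twice as sensitive to the bath field.
implied_n14_fwhm = 0.5 / (math.pi * t2star_dq)

nv_fwhm = 321e3    # Hz, natural NV SQ linewidth (Fig. S8e)
n14_fwhm = 20.6e3  # Hz, natural 14N group 1 linewidth (Fig. S8e)
ratio = nv_fwhm / n14_fwhm

print(f"implied 14N linewidth: {implied_n14_fwhm / 1e3:.1f} kHz")
print(f"NV / 14N linewidth ratio: {ratio:.1f}")
```

The implied linewidth of about 23 kHz lies within the combined uncertainties of the 20.6(1.2) kHz DEER value and the 6.9(5) µs Ramsey value.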

XII. DC MAGNETOMETRY WITH DQ AND SPIN-BATH DRIVE
With Eqn. S20 we calculate and compare the sensitivities for the three measurement modalities (SQ, DQ, and DQ + spin-bath drive) applied to Sample B. Using C ≈ 0.026, which remains constant for the three schemes (see Fig. S9a), sensing times $\tau_{\mathrm{SQ}} = 1.308$ µs, $\tau_{\mathrm{DQ}} = 6.436$ µs, and $\tau_{\mathrm{DQ+Drive}} = 23.99$ µs, standard deviations $\sigma_{\mathrm{SQ}} = 0.0321$, $\sigma_{\mathrm{DQ}} = 0.0324$, and $\sigma_{\mathrm{DQ+Drive}} = 0.0325$ calculated from 1 s of data, a fixed sequence duration of $\tau + \tau_D = 70$ µs, and $\gamma_{NV} = 2\pi \times 28$ GHz/T, the estimated sensitivities for the SQ, DQ, and DQ+Drive measurement schemes are η = 70.7, 6.65, and 1.97 nT/√Hz, respectively. In summary, we obtain a 10× improvement in DC magnetic field sensitivity in the DQ basis, relative to the conventional SQ basis, and a 35× improvement using the DQ basis with spin bath drive. Note that this enhancement greatly exceeds the expected improvement when no dead time is present ($\tau_D \ll \tau$) and is attributed to the approximately linear increase in sensitivity with sensing time for $\tau \ll \tau_D$. We also plot the Allan deviation for the three schemes in Fig. S9b, showing a $\tau^{-1/2}$ scaling for a measurement time of ≈ 1 s and the indicated enhancements in sensitivity.
Lastly, we discuss an appropriate choice of spin concentrations in diamond and other sample material properties for enhanced-sensitivity magnetometry employing DQ coherence and spin bath driving. To simplify the discussion, we focus on the following combination of relevant parameters for NV magnetometry, which (in the appropriate limits discussed herein) is proportional to the photon-shot-noise-limited volume-normalized magnetic sensitivity $\eta_V$ (see Suppl. of Ref. [13]), given by
$$ \eta_N = \frac{1}{\Delta m\,\sqrt{n_{NV}\,[\mathrm{N}]\;T_2^*([\mathrm{N}])}}. \tag{S21} $$
Here, Δm = 1 (2) in the SQ (DQ) basis, [N] is the substitutional nitrogen (P1) center concentration, $n_{NV}$ = [NV]/[N] is the normalized concentration of NV centers relative to the nitrogen concentration, and $T_2^*([\mathrm{N}])$ is the NV ensemble dephasing time in the sensing basis chosen (SQ or DQ).
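The reported sensitivity gains can be cross-checked at the level of scaling, without relying on the full expression in Eqn. S20. The sketch below assumes that, at fixed sequence duration and fixed contrast, the sensitivity scales as $\sigma/(\Delta m\,\tau)$; this proportionality is an assumption made for illustration and does not reproduce the absolute values of η.

```python
# (sensing time in us, standard deviation from 1 s of data, delta-m) for Sample B
schemes = {
    "SQ": (1.308, 0.0321, 1),
    "DQ": (6.436, 0.0324, 2),
    "DQ+Drive": (23.99, 0.0325, 2),
}

def relative_eta(tau_us: float, sigma: float, dm: int) -> float:
    """Assumed scaling at fixed sequence duration and contrast: eta ~ sigma / (dm * tau)."""
    return sigma / (dm * tau_us)

ref = relative_eta(*schemes["SQ"])
gains = {name: ref / relative_eta(*vals) for name, vals in schemes.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}x relative to SQ")
```

Under this assumption the gains come out close to the ≈ 10× and ≈ 35× improvements stated above, which supports the interpretation that the enhancement is dominated by the linear dependence on sensing time when $\tau \ll \tau_D$.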
The quantity $\eta_N$ describes the dependence of the sensitivity on nitrogen concentration [N]. Since the nitrogen concentration enters Eqn. S21 both explicitly and also through $n_{NV}$ and $T_2^*$, we need to investigate Eqn. S21 for a range of [N]. In the case of fixed $n_{NV}$ and nitrogen-spin-bath-limited dephasing, i.e., $T_2^* \propto 1/[\mathrm{N}]$, $\eta_N$ remains constant as a function of [N]. In this simplistic picture, shorter $T_2^*$ values may be exchanged for higher nitrogen (and thus NV center) concentrations and vice versa, with no effect on sensitivity. Such a discussion, however, neglects experimental overhead due to NV state initialization and readout, which is characterized by the dead time $\tau_D$ (see Eqn. S20). Since $T_2^* \lesssim 1$ µs $\ll \tau_D$ in a typical SQ NV ensemble experiment, increasing $T_2^*$ through optimized sample fabrication, DQ coherence magnetometry, and/or spin bath driving is preferred, and larger sensitivity gains are obtained when compared to an equivalent increase in NV center concentration.
More generally, the ensemble $T_2^*$ depends on numerous diamond-related parameters (including the concentration of spin impurities and strain fields) and external conditions (such as temperature fluctuations of the diamond sample and magnetic field gradients due to the applied bias field). Focusing on the parameters intrinsic to diamond, the relevant contributions to $T_2^*$ are (compare to Eqn. 1 in main text)
$$ \frac{1}{T_2^*} \approx \frac{1}{T_2^*\{\text{NV-}^{13}\text{C}\}} + \frac{1}{T_2^*\{\text{NV-N}\}(\Omega_N)} + \frac{1}{T_2^*\{\text{NV-NV}\}([\mathrm{N}])} + \frac{1}{T_2^*\{\text{strain}\}} + \ldots, \tag{S22} $$
where we added the term $1/T_2^*\{\text{NV-NV}\}$ to account for dephasing due to NV-NV dipolar interactions. This dephasing mechanism was neglected in the main text due to the low N-to-NV conversion efficiencies of Samples A, B, and C ($n_{NV} \ll 1$), but its contribution becomes relevant at increased conversion efficiencies intended for optimized diamond magnetometry. To model Eqn. S21 across a range of nitrogen concentrations, we now combine Eqns. S21 and S22 and include the dependence of $T_2^*\{\text{NV-N}\}(\Omega_N)$ on bath drive strength $\Omega_N$ (see Eqn. 5 in the main text). We also anticipate optimized diamond samples to be isotopically engineered with $T_2^*\{^{13}\text{C} = 0.01\,\%\} \approx 100$ µs (or longer) and to possess strain field gradients comparable to this work's samples ($T_2^*\{\text{strain}\} \approx 5$ µs). N-to-NV conversion efficiencies of up to 30 % have been reported for NV ensembles [54], suggesting that $n_{NV} \approx 0.4$ is feasible for an optimized diamond sample. The simulation results for $\eta_N$ in this parameter regime are summarized in Fig. S10 for SQ (blue) and DQ coherence magnetometry (red) and plotted for spin bath drive strengths $\Omega_N$ = 0 (solid), 1, and 10 MHz (dashed).
Figure S10. $\eta_N$ given by Eqn. S21, which is proportional to the NV ensemble volume-normalized magnetic field sensitivity, as a function of nitrogen (P1) center concentration [N] for SQ (blue) and DQ (red) magnetometry with spin bath drive strengths $\Omega_N$ = 0 (solid), 1, and 10 MHz (dashed). Lower values correspond to higher sensitivities and vice versa. For the simulation we combine Eqns. S21 and S22 with the following parameters: $T_2^*\{^{13}\text{C} = 0.01\,\%\} \approx 100$ µs, $T_2^*\{\text{strain}\} \approx 5$ µs, $T_2^*\{\text{NV-N}\}$ is given by Eqn. 5 of the main text, $n_{NV} = 0.4$, and $T_2^*\{\text{NV-NV}\}([\mathrm{N}]) = (A_{\text{NV-NV}} \cdot n_{NV} \cdot [\mathrm{N}]/4)^{-1}$. Here, $A_{\text{NV-NV}} \approx 2A_{\text{NV-N}} \approx 2\pi \times 33$ kHz/ppm due to the twice higher spin multiplicity of the NV centers [49, Ch. III and IV], and the factor 1/4 accounts for the fraction of NV centers used for sensing when all four NV orientations are distinguished. In this instance and assuming perfect optical initialization of NV centers, 3/4 of the NV centers are in the $m_s = 0$ spin state and do not contribute to dephasing during the sensing sequence. Grey shaded regions indicate approximate improvements in sensitivity for DQ magnetometry with drive over SQ magnetometry alone.
Without spin bath drive applied ($\Omega_N = 0$) and at low nitrogen concentrations ([N] $\lesssim$ 1 ppm), $\eta_N$ in Eqn. S21 is larger (i.e., sensitivity is reduced) for SQ magnetometry due to the $T_2^*$ limit imposed by strain gradients. In this lower nitrogen regime, working in the DQ basis leads to substantial improvements in sensitivity (i.e., smaller values of $\eta_N$). At higher nitrogen concentrations ([N] $\gtrsim$ 1 ppm), dipolar interactions with the spin bath dominate NV dephasing, strain contributions become negligible, and DQ coherence approaches a $\sqrt{2}$ enhancement in sensitivity over SQ coherence measurements. Note that the crossover between the low and high nitrogen regime is set by the strain contributions to $T_2^*$ (here 5 µs), and lower (higher) strain contributions shift the crossover to lower (higher) nitrogen concentrations.
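The qualitative behavior described above can be explored with a minimal model. The sketch below is not the simulation used for Fig. S10; it rests on several assumptions that are labeled in the code: the figure of merit is taken as $\eta_N = 1/(\Delta m\sqrt{n_{NV}[\mathrm{N}]T_2^*})$, all magnetic dephasing channels are assumed to scale with Δm, strain is assumed to be fully cancelled in the DQ basis, $A_{\text{NV-N}}$ is taken as half of the quoted $A_{\text{NV-NV}}$, and the bath linewidth is assumed to scale as 15 kHz per ppm (a hypothetical value suggested by the $\delta_N$ estimates for Samples B and C).

```python
import math

TWO_PI = 2.0 * math.pi
A_NV_N = TWO_PI * 16.5e3          # s^-1 per ppm; half of A_NV-NV ~ 2*pi*33 kHz/ppm
A_NV_NV = 2.0 * A_NV_N            # s^-1 per ppm
DELTA_N_HZ_PER_PPM = 15e3         # assumed bath linewidth per ppm (hypothetical scaling)
T2_13C, T2_STRAIN = 100e-6, 5e-6  # s, simulation parameters quoted in the text

def t2star(n_ppm, dm, omega_hz, n_nv=0.4):
    """Total T2* (s) from summed dephasing rates, in the spirit of Eqn. S22."""
    delta = DELTA_N_HZ_PER_PPM * n_ppm
    drive = delta**2 / (delta**2 + omega_hz**2)   # Lorentzian bath-drive suppression
    magnetic = 1.0 / T2_13C + A_NV_N * n_ppm * drive + A_NV_NV * n_nv * n_ppm / 4.0
    strain = 1.0 / T2_STRAIN if dm == 1 else 0.0  # assumed fully cancelled in DQ
    return 1.0 / (dm * magnetic + strain)

def eta_n(n_ppm, dm, omega_hz=0.0, n_nv=0.4):
    """Assumed figure of merit 1 / (dm * sqrt(n_NV * [N] * T2*)); lower is better."""
    return 1.0 / (dm * math.sqrt(n_nv * n_ppm * t2star(n_ppm, dm, omega_hz, n_nv)))

for n in (0.01, 1.0, 100.0):
    gain = eta_n(n, 1) / eta_n(n, 2)
    print(f"[N] = {n:g} ppm: SQ/DQ ratio of eta_N without drive = {gain:.2f}")
```

Within this simplified model the SQ-to-DQ ratio is large at low nitrogen concentration, where strain dominates SQ dephasing, and tends towards $\sqrt{2}$ at high concentration, as stated in the text.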
With spin bath drive applied, however, we find that additional gains in sensitivity are obtained for both SQ and DQ coherence measurements. The largest improvements for $\eta_N$ are obtained when DQ coherence measurements are employed at high spin bath drive strengths (1–10 MHz) and for an optimal nitrogen regime of 1–100 ppm. In practice, we expect the 1–10 ppm nitrogen regime to be optimal when all parameters relevant for magnetic sensing are considered. Herein we discuss additional parameters only qualitatively: i) The necessary Rabi frequency for effective spin bath driving increases linearly with nitrogen concentration, meaning the required RF power increases quadratically. Spin bath driving at high nitrogen concentrations thus becomes increasingly challenging. ii) Samples A, B, and C in the main text were selected to have a predominantly electronic nitrogen (P1) spin bath. The incorporation of high nitrogen concentrations in diamond, however, can lead to a larger variety of nitrogen-related spin species in the bath, which include nitrogen clusters, NV$^0$, and NVH defects (for example, see Refs. [72][73][74] and references therein). A more diverse spin bath severely increases the complexity of the bath drive, eventually rendering it impractical. Finally, iii) there are indications that diamond samples with a high nitrogen content exhibit a larger fraction of NV$^0$ centers. In such samples, the NV$^-$ measurement contrast C (see Eqn. S20), and thus the magnetic sensitivity, is diminished.
Summarizing, our analysis suggests that the 1−10 ppm nitrogen regime is optimal for high sensitivity magnetometry using NV ensembles but further work is required to quantitatively account for all parameters relevant for sensing. Note that the 2×, 4×, and 6× enhancement in sensitivity indicated in Fig. S10 for DQ with drive over SQ magnetometry alone, corresponds to a 4×, 16×, and 36× reduction in measurement time, respectively; and even larger relative enhancements in sensitivity should be realized when accounting for the experimental dead time τ D in NV ensemble experiments. Table S3.

XIII. DEPHASING CHANNELS PER SAMPLE
NV spin ensemble dephasing mechanisms for Sample A. Individual contributions to dephasing are determined using the estimated/calibrated values described in the main text and Supplement (column 2). The data show good agreement between calculated and measured total dephasing times T * 2,SQ and T * 2,DQ (last two rows