Quantum Metrology with Strongly Interacting Spin Systems

Quantum metrology makes use of coherent superpositions to detect weak signals. While in principle the sensitivity can be improved by increasing the density of sensing particles, in practice this improvement is severely hindered by interactions between them. Using a dense ensemble of interacting electronic spins in diamond, we demonstrate a novel approach to quantum metrology. It is based on a new method of robust quantum control, which allows us to simultaneously eliminate the undesired effects associated with spin-spin interactions, disorder and control imperfections, enabling a five-fold enhancement in coherence time compared to conventional control sequences. Combined with optimal initialization and readout protocols, this allows us to break the limit for AC magnetic field sensing imposed by interactions, opening a promising avenue for the development of solid-state ensemble magnetometers with unprecedented sensitivity.

nals. While in principle the sensitivity can be improved by increasing the density of sensing particles, in practice this improvement is severely hindered by interactions between them. Using a dense ensemble of interacting electronic spins in diamond, we demonstrate a novel approach to quantum metrology.
It is based on a new method of robust quantum control, which allows us to simultaneously eliminate the undesired effects associated with spin-spin interactions, disorder and control imperfections, enabling a five-fold enhancement in coherence time compared to conventional control sequences. Combined with optimal initialization and readout protocols, this allows us to break the limit for AC magnetic field sensing imposed by interactions, opening a promising avenue for the development of solid-state ensemble magnetometers with unprecedented sensitivity.
Quantum metrology is a powerful tool for explorations of fundamental physical phenomena (1) and applications in material science (2) and biochemical analysis (3,4). In particular, electronic spins associated with color centers in diamond have recently emerged as a promising platform for nanoscale precision sensing and imaging, with superior sensitivity and spatial resolution (5,6). A common approach to improving sensitivity of quantum systems is to utilize a dense ensemble of individual sensors and take advantage of parallel averaging. However, beyond a certain density, undesired interactions between sensors result in a rapid decay of the ensemble coherence (7), limiting the overall sensitivity of the quantum system (Fig. 1A). Moreover, in practice, disorder and control errors further deteriorate coherence and metrological sensitivity of such interacting many-body quantum systems. Over the past few decades, pulsed control techniques realizing dynamical decoupling and motional averaging have been developed and deployed to manipulate ensembles of quantum systems. While extremely successful in the context of nuclear magnetic resonance (NMR), magnetic resonance imaging (MRI) (8)(9)(10)(11)(12)(13) and atomic gas magnetometers (14), the efficacy of these techniques is severely limited in the presence of strong disorder and other imperfections. Therefore, these techniques are not readily applicable to quantum sensors based on electronic spin ensembles (15,16), where such effects are prominent.
In this Report, we develop and demonstrate a new approach for quantum sensing with disordered interacting spin ensembles. Our method uses periodic pulsed manipulation (Floquet engineering) of a spin ensemble (8) to detect an external signal of interest with high sensitivity, while simultaneously decoupling the effects of interactions and disorder, and being faulttolerant against the leading-order imperfections arising from finite pulse durations and experimental control errors. Specifically, we introduce a set of simple rules imposed on the pulse sequence and a new, generalized picture of AC field sensing (15,16), which allow us to design and implement a sequence that breaks the sensitivity limit imposed by spin-spin interactions.
Our experimental system consists of a dense electronic spin ensemble of NV centers in diamond (17), as shown in Fig. 1B. NV centers exhibit long-lived spin coherence even at room temperature and are excellent sensors of magnetic fields, electric fields, pressure, and temperature (5,6,18,19). Our sample has a high density of NV centers (∼15 ppm, see (16)), with long-range magnetic dipolar interactions between the spins as well as strong on-site disorder originating from other paramagnetic impurities, inhomogeneous strain in the diamond lattice, and local electric fields (17). The bulk diamond is etched into a nanobeam to improve control homogeneity and confine the probing volume to V = 8.1(9) × 10 −3 µm 3 (16). The NV center ground state is an electronic S = 1 spin, and we apply a static magnetic field to isolate an ensemble of effective two-level systems formed of NVs with the same crystallographic orientation. We initialize and detect the spin states optically and use resonant microwave excitation to drive coherent spin dynamics. See (16) for further details of the measurement sequences.
Spin echo measurements (20) reveal that the coherence decay time T 2 in our dense NV ensemble is limited to only 1.0 µs (Fig. 2C, gray crosses). The conventional method to extend T 2 beyond the spin echo is the XY-8 dynamical decoupling sequence (21), consisting of equallyspaced π pulses along thex andŷ axes ( Fig. 2A, top row). In our system, however, the XY-8 sequence only provides a small improvement (T 2 = 1.6 µs, Fig. 2C, blue squares), since the XY-8 T 2 is limited by strong spin-spin interactions, which are not affected by π rotations.
In order to significantly extend T 2 in the presence of interactions and control imperfections, we use a novel approach to design pulse sequences (15). We model our spin system, including the control fields and the external AC magnetic field target signal, by the Hamiltonian (16,17) where the internal system Hamiltonian is , global spin-control pulses are given by H Ω (t) = i (Ω x i (t)S x i +Ω y i (t)S y i ) and the external target signal is H AC (t) = γ NV B AC cos(2πf AC t−φ) i S z i . Here, S µ i (µ = x, y, z) are spin-1/2 operators, h i is a random on-site disorder potential that follows a Gaussian distribution with standard deviation W = (2π) 4.0 MHz, J ij /r 3 ij is the anisotropic dipolar interaction strength between two spins of the same crystallographic orientation at a distance r ij , with strength J = (2π) 35 kHz at a typical separation of 11 nm, Ω x,y i (t) are the global control amplitudes exhibiting weak position dependence due to spatial field inhomogeneities, γ NV is the gyromagnetic ratio of the NV center, and B AC , f AC and φ are the amplitude, frequency and phase of the target AC signal, respectively.
Our approach employs average Hamiltonian theory to engineer the system evolution through pulsed periodic manipulation of the spins (8,20). A sequence composed of n equidistant control pulses {P k ; k = 1, 2, .., n} with spacing τ defines a unitary time-evolution operator U(T ) = P k e −iHsτ · · · P 1 e −iHsτ over the Floquet period T . If the pulse spacing τ is much shorter than the timescales of the system Hamiltonian (τ 1 W , 1 J ), the unitary operator U(T ) can effectively be approximated by a time-independent average Hamiltonian as U(T ) ≈ e −iHavgT , with H avg = 1 T n k=1H k , andH k = (P k−1 · · · P 1 ) † H s (P k−1 · · · P 1 ),H 1 = H s . Motivated by this picture, we develop a pulse sequence which generates a desirable form of H avg from the H s intrinsic to the system.
Hamiltonian engineering can be understood as the result of a sequence of frame transformations (also known as toggling frame transformations) which rotate the spin operators in the interaction picture (Fig. 2B): for example, a π-pulse flips S z i → −S z i , while a π/2-pulse rotates S z i → ±S x,y i depending on the rotation axis. Importantly, the average Hamiltonian is uniquely specified by such toggling frame transformations of the S z operator (15,16), resulting in simple decoupling conditions that facilitate the procedure to find desired pulse sequences. For example, any pulse sequence in which the transformed S z operator spends equal time along the positive and negative direction for each axis-effectively producing a spin echo along all three axes-suppresses the on-site disorder Hamiltonian. Similarly, in order to symmetrize the dipolar interaction into a Heisenberg interaction Hamiltonian (where polarized states are eigenstates and coherence is preserved (22)), we require that the transformed S z operator spend an equal amount of time in each directionx,ŷ,ẑ (9,16). Furthermore, we can prioritize one condition over another to find a pulse sequence that better suits a given system; since disorder is dominant in our spin ensemble (W J), we perform the echo operation more frequently than interaction symmetrization.
In realistic situations, the above strategy will be affected by various imperfections, such as disorder and interactions acting during the finite pulse durations and errors of each control pulse, resulting in imperfections δH avg to the target effective Hamiltonian, H eff = H avg + δH avg .
Indeed, if we simply design a pulse sequence that decouples disorder and interactions only in the ideal pulse limit (Seq. A in Fig. 2A), we see only a marginal increase in coherence time compared to XY-8, yielding T 2 = 2.8 µs (Fig. 2C, green diamonds). A careful examination of Seq. A reveals that pulse-related imperfections play a dominant role in the dynamics, illustrating the importance of robust sequence design (16).
To address this key challenge, a number of strategies have been proposed that aim to cancel the undesired Hamiltonian terms acting during the finite pulse duration (10-13); however, a simple and systematic approach to treating the imperfections in a general setting is still lacking. Remarkably, we find that the transformations of the S z operator during the free evolution intervals are in fact sufficient to predict and suppress the errors during pulse rotations. This is a consequence of the fact that the unwanted residual Hamiltonian acting within the finite pulse duration can be uniquely determined by the two toggling-frame Hamiltonians on either side of the control pulse (see Ref. (15) for a comprehensive discussion). This insight motivates us to describe the S z operator transformations using the matrix, F = [F µ,k ] = 2Tr[S µSz k ], µ =x,ŷ,ẑ, whereS z k is the transformed spin operator within the k-th free evolution period. Crucially, this allows us to construct a simple set of algebraic conditions imposed on the matrix F to formalize the above decoupling rules, enabling not only the suppression of disorder and interaction effects during free evolution periods, but also the cancellation of dominant imperfections arising from finite pulses as well as rotation angle errors (see Eqs. (S8-S11) in (16) for the exact expressions of decoupling conditions).
By satisfying these rules, we can thus systematically generate robust pulse sequences that to first order yield a pure Heisenberg Hamiltonian with δH avg = 0. Seq. B, as shown in Fig. 2A, is an example of a robust pulse sequence designed with our formalism (15,16). Due to its robustness against all leading-order effects, it shows a significant extension of coherence time compared to the sequences described above, reaching T 2 = 7.9 µs (Fig. 2C, squares). Moreover, the coherence time is independent of the initial state.
We now apply this method to quantum sensing, where our goal is to robustly engineer the dynamics of the spin ensemble to be sensitive to the target sensing signal. AC magnetic field sensing typically uses periodic inversions of the spin operator between S z and −S z in the interaction picture, driven by a train of equidistant π pulses at a separation of 1 2f AC . This modulation causes cumulative precession of the sensor spin when the AC field sign change coincides with the frame inversion, resulting in high sensitivity to a signal field at f AC . Our interactiondecoupling sequences explore all three frame directions S x , S y , S z and AC selectivity requires synchronized periodic frame inversions in each of the three axes, while preserving the desired H avg and suppressing δH avg to maintain long coherence times.
In Fig. 3A we illustrate how this is achieved in Seq. B. The pulses lead to periodic changes in the sign and orientation of the interaction-picture S z operator, depicted by the time-domain modulation functions for each axis direction, F x , F y and F z . The detailed resonance characteristics of the pulse sequence can be characterized by the Fourier transformsF µ (f ) = is the spectral phase for a given axis µ. |F z (f )| 2 , as well as the total intensity |F t (f )| 2 (16). At the dominant resonance of the total intensity (red arrow in Fig. 3B), all three axes exhibit a phase-locked periodic sign modulation ( Fig. 3A), leading to constructive phase accumulation and high sensitivity.
In order to intuitively understand our sensing protocol and quantify its sensitivity, we generalize the average Hamiltonian analysis to incorporate AC signal fields, finding (15,16) where B eff is an effective magnetic field vector in the interaction picture which appears static to the driven spins. This allows a simple interpretation of our scheme: the spins undergo a precession around B eff , with the field orientation and magnitude determined by the frequencydomain modulation functionsF µ and |F t |, respectively. In our optimal Seq. B, the signal at the principal resonance f AC gives rise to |F x | = |F y | = |F z | withφ x =φ y =φ z , leading to B eff ∝ [ 1 3 , 1 3 , 1 3 ] with the total strength | B eff | reduced by a factor 1/ √ 3. This reduction is fundamental and, in fact, close to optimal (15), given the requirements to suppress the effects of spin-spin interactions via symmetrization. However, despite the reduction, the sensitivity is still improved because the coherence time is extended by interaction suppression. Moreover, the sensitivity to the signal B AC is maximized when the spins are initialized perpendicular to the B eff direction-to allow the largest precession orbit (Fig. 3C)-and the corresponding optimal readout requires an unconventional rotation axis [−1, 1, 0] and angle arccos( 2/3) to bring the precession plane parallel to theẑ axis. Indeed, as shown in Fig. 3D, we observe a larger contrast when spins are initialized in an optimal direction along [1, 1, −2], compared to an initialization in thex direction.
We now proceed to characterize the AC magnetic field sensitivity, defined as the minimum detectable signal amplitude per unit time, η = σ S |dS/dB AC | . Here, S is the spin contrast, σ S is the uncertainty of S for one second of averaging, and |dS/dB AC | is the gradient of S with respect to the field amplitude B AC . Figure 4A shows the measured contrast S as a function of B AC under optimal conditions for Seq. B and the conventional sequence XY-8, where we choose acquisition parameters for each of the two sequences that optimize their respective absolute sensitivities (16). We find that the spin contrast shows faster oscillations and a significantly steeper maximum slope under Seq. B, indicating that it is more sensitive to the external signal than XY-8. This is due to a combination of optimal state preparation and readout schemes as well as significantly improved coherence times, despite a reduced effective signal strength | B eff |.
In Fig. 4B, we show the sensitivity scaling with phase accumulation time, for a fixed signal frequency and integration time, finding good agreement with the theoretical prediction (16). We find that the volume-normalized sensitivity η V = η √ V of Seq. B, designed with our optimal sensing approach, reaches more than 40% improvement in sensitivity over the conventional XY-8 sequence. With these enhancements, we demonstrate η V = 20(2) nT · µm 3/2 / √ Hz for Seq. B, among the best volume-normalized sensitivities for solid-state magnetometers measured thus far (16,23).
Our work establishes a novel approach to quantum sensing by utilizing robust interaction decoupling, and provides the first demonstration of a solid-state ensemble quantum sensor surpassing the interaction limit. Additionally, our design formalism and generalized effective field picture are also directly applicable to robust DC field sensing. While our approach already yields a significant improvement in sensitivity, the T 2 reached here is still shorter than the depolarization time T 1 ∼ 100 µs in our dense spin ensemble. This T 2 is likely limited by waveform distortions in control pulses and higher-order terms in the average Hamiltonian analysis. The former can be mitigated by waveform engineering (24), and the latter by sequence symmetrization (10) or disorder-reduction via spin-bath engineering (25,26). Together with new diamond growth techniques (27), double-quantum magnetometry (26,28) and improved photon collection (23), these improvements may push the volume-normalized sensitivity to single-digit picotesla level in a µm 3 volume (16), opening the door to many applications, such as high-sensitivity nanoscale NMR (3,4) and investigations of strongly correlated condensed matter systems (2). Beyond applications to diamond magnetometry, the robust sequence design presented in this work can be extended to engineer a broad class of many-body Hamiltonians (15,22) in a wide variety of quantum hardware platforms, providing a useful tool for quantum information processing (29), simulation (30,31), and metrology (32-34). Figure 1: Interaction limit to spin ensemble quantum sensing. (A) Volume-normalized magnetic field sensitivity as a function of total spin density. The dashed line denotes the standard quantum limit scaling and the solid curve shows the behaviour when interactions between spins are taken into account for the typical readout efficiency factor C = 0.01 (18). The sensitivity plateaus beyond a critical density due to a coherence time reduction. Robust interaction decoupling (red arrow) allows us to break the interaction limit. (B) Illustration of the black diamond nanobeam used as a spin ensemble quantum sensor. Microwave and optical excitation are delivered to the NV spins to control and read out their spin states, and an AC magnetic field is used as a target sensing signal. The inset shows the three magnetic sublevels, |0 and |±1 , in the ground state of NV centers, where two levels, |0, −1 are addressed using resonant microwave driving. All measurements are performed at room temperature under ambient conditions. NVs are initialized using pulsed laser excitation at 532 nm (green trace) and read out through emitted photons detected by a single photon counting module (red trace). We perform N repetitions of a sensing sequence unit of length T (blue trace) and repeat the same measurement with an additional π pulse (yellow trace) acting on the NV centers for differential readout of the spin state (16). The box illustrates the details of different pulse sequences, XY-8, Seq. A and Seq. B, composed of π/2 and π rotations alongx andŷ axes. Bars above (below) the line indicate driving along positive (negative) axis directions. (B) Key concepts for sequence design. The sequence is described by pulses P i and the time-dependent frame transformation of the system between the pulses. We highlight the orientation of each rotated frame by the axis that points along theẑ axis of the fixed external reference frame. Decoupling sequences are designed by imposing average Hamiltonian conditions on the evolution of the highlighted axis (15). For example, the effects of disorder can be cancelled by implementing an echo-like evolution, +μ → −μ where µ = x, y, z (top row), and interactions are symmetrized by equal evolution in each of thex,ŷ andẑ axes in the transformed frames (bottom row). Additionally, the pulse sequence is designed to mutually correct rotation angle errors and finite pulse duration effects (15,16).
The principal resonance is highlighted by a red arrow, yielding maximum sensitivity. (C) Illustration of the effective magnetic field created by Seq. B at the principal resonance. In the average Hamiltonian picture, theẑ-direction sensing field in the external reference frame transforms to the [1,1,1]-direction field B eff in the effective spin frame with a reduced strength (16). Optimal sensitivity is achieved by initializing the spins into the plane perpendicular to the effective magnetic field direction. This optimal state preparation allows the spins to precess along the trajectory with the largest contrast (red dashed line). For comparison, the precession evolution for initialization to the conventionalx axis is shown as a blue dashed line. (D) Sensing resonance spectra near the principal resonance. The optimal initialization (red) shows greater contrast than thex-axis initialization (blue). Markers indicate experimental data and solid lines denote theoretical predictions calculated from the frequency-domain modulation functions (16). . The fit is a sinusoidal oscillation with an exponentially decaying profile (16). Seq. B shows a steeper slope at zero field, indicating that it is more sensitive than XY-8 to the external field. The interrogation times, t = 1.80 µs for XY-8 and t = 6.52 µs for Seq. B, are independently optimized to achieve maximal sensitivity (16). (B) Extracted absolute sensitivity η and volumenormalized sensitivity η V = η √ V with sensing volume V = 8.1(9) × 10 −3 µm 3 as a function of phase accumulation time t (error bars are given for η). The pulse spacing τ is fixed at 25 ns to detect the AC signal oscillating at frequency 11 MHz (see Fig. 3D). A comparison of the two sequences at their respective optimal sensing times reveals that Seq. B outperforms XY-8 by ∼40%. The solid lines indicate the theoretical sensitivity scaling using the independently estimated sensor characteristics (16). This PDF file includes:

Supplementary Text
Figs. S1 to S10 Table S1 References

Experimental details
Diamond sample: The sample used in this work is a type-Ib, high-pressure high-temperature (HPHT) diamond with a total negatively-charged nitrogen-vacancy (NV − ) concentration of ∼15 ppm (see Characterization of NV − Disorder and Interaction Strength below). To achieve such a high density, the sample was irradiated with high-energy electron beams (2 MeV) at a flux of 1.3 × 10 13 cm −2 s −1 and simultaneously in-situ annealed at 700 − 800 • C for a total of 285 hours to reach a total fluence of 1.4 × 10 19 cm −2 . This leads to a high conversion efficiency (> 10%) of the initial nitrogen into NV centers. The average distance between NV centers and the corresponding NV-NV interaction strength, taking into account all NV lattice orientations, are estimated to be ∼7 nm and ∼ (2π) 140 kHz, respectively (1). The unpaired leftover nitrogen (P1 centers) act as paramagnetic defects, creating a local magnetic field to nearby NV centers. Due to the presence of such impurities, as well as 13 C nuclear spins, local charges, and strain in the sample, the linewidth of the NV resonance is inhomogeneously broadened to (2π) 4.0 MHz (Gaussian standard deviation), corresponding to on-site disorder terms. Due to strong inhomogeneous broadening, the hyperfine interaction of strength (2π) 2.2 MHz between the NV electronic spin and the 14 N host nuclear spin is not resolved in the measured optically-detected magnetic resonance spectrum (see Fig. S3A). To improve control homogeneity, we fabricated a 20 µm-long nanobeam structure from bulk diamond via Faraday cage angled etching (2). As shown in Fig. S1, the nanobeam has a triangular cross section, with dimensions labeled in the figure.
Adopting a definition similar to the way mode volumes are defined, we define the effective probing area A to be the integrated power within the 2D Airy pattern divided by the maximum intensity at the center of the beam. Using this definition, we have where r is the effective probing radius, NA = 1.3 is the numerical aperture of the objective, λ = 532 nm is the excitation wavelength, and J 1 is the Bessel function of the first kind of order one. Plugging in these values gives r = λ/(π · NA) ≈ 130 nm.
Denoting the bottom edge length as c, the height as h, we estimate the probing volume by integrating over the intersection of a cylinder of radius r with the profile of the nanobeam. Since 2r < c, the bottom part will be an integration over a circular profile, while the upper parts will be cut off due to the edges of the triangular nanobeam. This gives the following integration for the volume: Assuming a 10 nm uncertainty on each of the parameters of the nanobeam, we can perform error propagation to arrive at a probing volume of V ≈ (8.1 ± 0.9) × 10 −3 µm 3 . Optical setup: The optical setup consists of a room temperature home-built confocal microscope with an oil-immersion objective (Nikon 100x, NA = 1.3). The sample is mounted on a piezoelectric stage in the focal plane of the microscope. A green laser (532 nm) is used to excite the ensemble of NV centers inside the confocal volume. To switch the excitation laser on and off, an acousto-optic modulator (Gooch & Housego, 3200-125) is installed in a double-pass configuration. NV centers emit fluorescence into the phonon sideband (630-800 nm), which is isolated from the excitation laser by a dichroic mirror. After passing through a 650 nm longpass filter, the emitted photons are split via a 1-to-4 fiber optic coupler (Thorlabs, TNQ630HF) and detected by four single-photon-counting modules (SPCM). Due to the large number of NV centers residing in the probing volume, the photon count rate is relatively high, and at green excitation powers above 50 µW the SPCMs start to experience saturation effects. To compensate for this, we perform a calibration of the saturation characteristics for each SPCM, and determine the input photon count rate from the output numbers. All photon count rates and contrast numbers reported are determined through this procedure. For the sensing measurements, we optimize the green laser power to maximize the signal-to-noise ratio (SNR). This requires operating at a high photon count rate to maximize signal, which results in some saturation effects; subsequently, the noise on the signal will be larger than the expected photon shot noise, and our experimental system is not shot noise limited.
Microwave setup: A high sampling rate arbitrary waveform generator (Tektronix, model AWG7122C) defines waveforms of the pulses with a temporal resolution of 83 ps (=12 Gigasamples per second). Using two separate channels of the AWG, we synthesize π/2 and π pulses at a calibrated Rabi frequency of Ω = 25 MHz (π/2 pulse length of 10 ns) and generate a continuous-wave target sensing signal oscillating at an arbitrary frequency. The synthesized pulses and the sensing signal are separately amplified by microwave amplifiers (Mini-Circuits, ZHL-16W-43-S+ and LZY-22+, respectively). To reduce the nonlinearity of the microwave amplifier, a DC block (Picosecond, 5501a) and low-pass filter (Mini-Circuits, VLF-3000+) are added. The amplified microwaves are combined together by a diplexer (Microwave Circuits, D3005001) before being delivered to a coplanar waveguide where the NV-ensemble is located, with an estimated 1% inhomogeneity in Rabi frequency across the probing volume (3). The AC magnetic field target sensing signal will have the same amount of inhomogeneity, which will be neglected in the following, as the broadening associated with it is minimal. Attenuators and a circulator (Pasternack, PE83CR1004) are also inserted at various places along the microwave delivery line to isolate unwanted reflections at the interfaces between different cables and devices.
Experimental parameters: The diamond sample contains four subgroups of NV centers, each oriented along one of the four different crystallographic axes of the crystal. Each NV center has a magnetic-field-sensitive spin triplet in its electronic ground state, |m s = 0, ±1 , which can be coherently manipulated via microwaves. The combination of a permanent magnet and electromagnetic coils is used to produce a static magnetic field that adjusts relative energy spacings between the spin states via Zeeman shifts. In the experiments presented in the main text, the orientation of the static field is set parallel to the crystallographic axis of a single NV group with a magnitude of 260 Gauss. In such a setting, the transition frequencies of the NV centers from |0 to |−1 and |0 to |+1 correspond to ω 1 = 2.137 GHz and ω 2 = 3.602 GHz, respectively. Illumination by a green laser initializes the NV spin state into |0 via a stateselective intersystem crossing (4). A moderate laser power of around 50-75 µW is chosen to suppress fluorescence fluctuations associated with charge state instabilities in the dense NV ensemble while maintaining good fluorescence count rates (5).
Measurement sequences: In each measurement, illumination by a green laser initializes the NV spin state into |0 via a state-selective intersystem crossing (4). After initialization, a series of microwave pulses resonant with the |0 ↔ |−1 transition are applied to create coherent superpositions of the two states and control the effective two-level dynamics. As illustrated in Fig. 2A of the main text, each pulse sequence is repeated twice, with and without a final π pulse acting on the |0 ↔ |−1 transition. At the end of each sequence, the photons emitted from NV centers are detected for a short duration (t read = 1.2 µs, chosen to maximize the SNR at our operating laser power) to read out their spin states. If we define c 1 and c 2 as the two counters recording the photon numbers with and without the π pulse, respectively, then the contrast S is defined as S = 2(c 1 − c 2 )/(c 1 + c 2 ). Physically, S is linearly proportional to the population difference between |−1 and |0 . Such a differential measurement is beneficial for suppressing common noise sources between the two counters, improving the SNR. To confirm this, we experimentally characterize the Allan deviation for the individual counters c 1,2 as well as the contrast S. The Allan deviation is known to estimate the long-term stability of a sensor (6). As shown in Fig. S2, the individual counters start to deviate from the expected 1/ √ τ scaling when the averaging time τ is longer than ∼ 0.1 seconds, indicating that further averaging does not lead to an improvement in SNR. However, the Allan deviation of the contrast S continues to follow the desired scaling, suggesting that the contrast measurement cancels out common-mode noise present in adjacent counters. As seen in (C), the contrast S = 2(c 1 − c 2 )/(c 1 + c 2 ) effectively rejects common-mode noise between the consecutive counters, giving rise to an improved stability of the sensor.
For the spin-echo measurement, we vary the free evolution time and monitor the contrast as a function of the total evolution time. For the other dynamical decoupling measurements, we fix the pulse separation τ = 25 ns and monitor the contrast as a function of the number of repetitions of the dynamical decoupling block (equivalently, the total evolution time). We fit the decay profile with a stretched exponential S(t) = S 0 exp[−(t/T 2 ) α ], and plot the normalized contrast S(t)/S 0 as the coherence in Fig. 2C of the main text. For these measurements, we choose a low experimental duty cycle and a green laser power of 50 µW. More specifically, we use a readout green pulse of 5 µs with the photon counter duration being 1.2 µs, followed by a 100 µs wait time for full equilibration of the charge dynamics (5), and then a 20 µs green pulse to repolarize the NV centers.
For the sensing measurements, we use a fixed pulse separation τ = 25 ns, and choose the frequency of the AC sensing signal field to be at the most sensitive resonance, with its frequency calculated by taking into account the frequency-domain modulation function with experimental parameters for the pulse durations and spacings (see Eq. (3) of the main text as well as Sec. 2.1 and Ref. (7)). In addition, we experimentally verify that the chosen frequencies give rise to maximal signal contrast. In order to maximize the SNR, we fully optimize all measurement parameters and minimize the experimental overhead times. For this purpose, we utilize a single green pulse (75 µW) for both initialization and readout, and optimize its duration to be 6 µs, balancing the time overhead and imperfect repolarization effects of the NVs (see Sec. S4B for details). To guarantee a reset of the NV sensor, we insert a depolarizing π/2 pulse right after readout to eliminate any remnant polarization of the NV centers, preventing any correlations between neighboring measurements. The fixed duty cycle in these measurements ensures that charge dynamics does not affect the measurement results.
For each sensing sequence, we first fix the AC sensing signal frequency to the expected resonance frequency and optimize the relative phase between the sensing signal and the sequence to achieve maximal contrast (see Sec. 3.2). We then sweep the signal frequency to experimentally confirm the location of the resonance (see for instance Fig. 3D of the main text), and perform another iteration of phase optimization at the newly identified resonance frequency. Note that during this frequency sweep, the phase is carefully adjusted to prevent any additional artifacts arising from phase differences at different frequencies. Having confirmed the resonance frequency and optimal signal phase, we sweep the AC signal amplitude and measure the contrast oscillations as a function of AC signal magnetic field strength (see Sec. 3.1 for the calibration procedure of the field strength). Taking the maximum slope of the oscillations (typically achieved at zero field for sine magnetometry) and evaluating the fluctuations in each data point, as well as considering the measurement duration, we can extract the sensitivity, see Sec. 4 for more details. The entire procedure is then repeated for a number of sequence repetition cycles to obtain the optimal sensitivity for a given sequence, balancing the effects of longer phase accumulation times and increased decoherence.
Characterization of NV − Disorder and Interaction Strength: In order to characterize the on-site disorder and interaction strength of our NV ensemble, we compare the experimentallymeasured decay profile of a Ramsey sequence and XY-8 sequence to numerical simulations. We consider a system of N spin-1/2 particles with on-site disorder and dipolar interactions, with where the system Hamiltonian H s and the time-dependent control field H Ω (t) can be described as H Here, S µ i with µ ∈ {x, y, z} are spin-1/2 operators for spin i, h i is an on-site disorder Zeeman potential, Ω x/y are Rabi frequencies of the microwave driving alongx/ŷ-axes, J 0 = (2π) 52 MHz·nm 3 is the coefficient of dipolar interactions between two electronic spins, and r ij , θ ij are the distance and relative orientation between the two spins at sites i and j.
For simulations, we numerically integrate the time evolution under the Hamiltonian for various pulse-sequences. π-or π/2-pulses along eitherx-orŷ-axes are implemented by setting the corresponding Rabi frequency Ω x = Ω y = (2π) × 41.7 MHz and evolving the system for an appropriate time duration t p , thus incorporating the effects of both interactions and disorder during these pulses. The free evolution time is chosen to be τ = 20 ns, in accordance with this set of experiments. We provide exact numerical results for N = 16 particles and assume that the N spins are randomly positioned within a 3D box of dimension L × L × L (L ≡ n −1/3 ) with periodic boundary conditions, where n is the spin density, and require that no pair is closer than a cut-off distance r cut = 1.4 nm. Physically, this cut-off distance arises because any pair of NV centers much closer than r cut will be severely affected by direct charge tunneling, which in turn significantly modifies their energy level structure (8). Due to the finite system size, the simulations exhibit a residual polarization at long times, which we subtract out for comparison to experiments.
As shown in Fig. S3A, we first measure the random on-site disorder via continuous-wave electron spin resonance, which is well-described by a Gaussian distribution with standard deviation σ W = (2π) 4.0±0.1 MHz. Using this disorder strength, we then turn to an XY-8 sequence to extract the spin density by comparing simulated decay curves to the experimental observations (Fig. S3B). We expect the decoherence rate under the XY-8 sequence to be dominated by spin-spin interactions as the sequence is designed to efficiently decouple the static disorder, which we have also directly verified in previous measurements, where the coherence time under an XY-8 sequence scales proportionally to the number of resonant NVs (9). By varying spin densities in the simulations, we find that a total spin density of around 15 ppm best reproduces the experimental decay profile (Fig. S3B). We note however that this estimation is only a rough approximation to the actual spin density, as the simulations in such a small system may not fully reflect the effect of long-range interactions, especially for spin ensembles in three dimensions. Our previous density extraction based on the spin-echo sequence resulted in a higher estimated spin density of ∼45 ppm (1), indicating that the numerically-extracted density values depend on the details of the modeling of finite-size spin systems as well as the methods employed to quantify the spin density. 2 Pulse sequence design and performance analysis

Summary of design formalism
In this section, we summarize the key ingredients of our pulse sequence design framework. A more detailed description and analysis, as well as its use in other applications and different system Hamiltonians, can be found in the accompanying paper (7). The key idea of our approach is to adopt a description of the pulse sequence in terms of how the S z spin operator in the interaction picture is transformed during the free evolution periods (10,11), also known as the toggling frame picture. Using this pulse sequence description, we generalize previous techniques developed in the nuclear magnetic resonance (NMR) community to generic spin systems with general types of interactions, including our disorder-dominated dipolar spin ensembles. Crucially, we find that this enables a simple algebraic description of both the conditions for on-site disorder and interactions to be decoupled during the free evolution periods, and for the systematic suppression of many kinds of finite pulse duration effects, including the action of on-site disorder and residual interactions during the pulse, as well as rotation angle errors. Moreover, this description is also naturally suited to analyze the sensing properties of the sequence. Utilizing these simple algebraic conditions for decoupling and sensing, we are able to design different pulse sequences with varying decoupling performances, with Seq. A and Seq. B in the main text as specific examples.
We start by describing the pulse sequence representation, and the set of algebraic rules imposed on the representation that guarantee leading-order fault-tolerant dynamical decoupling. A given pulse sequence, consisting of n π/2 pulses with finite pulse duration t p as building blocks (a π pulse is treated as two π/2 pulses with zero time separation in between), {P k }, k = 1, 2, · · · , n, is represented by a frame-duration vector τ = [τ k ] and a 3-by-n frame matrix The elements τ k of the frame-duration vector indicate the free evolution duration preceding pulse P k . The elements F µ,k of the frame matrix characterize the form of the transformed Pauli spin operatorS z k in the free evolution duration preceding pulse P k . More precisely, we definẽ from which the frame matrix can also be easily obtained by inverting the relation: Based on this representation, we can easily formulate the conditions for disorder and interactions to be decoupled, and for all types of leading-order finite pulse imperfections to be suppressed, see Ref. While the full derivation of the above design rules can be found in Ref. (7), here we sketch the main idea of it. As discussed in the main text, a given periodic pulse sequence {P k ; k = 1, 2, .., n} with free evolution time before each pulse given by {τ k ; k = 1, 2, .., n} defines a unitary operator U(T ) = P k e −iHsτ k · · · P 1 e −iHsτ 1 over one cycle. Defining the toggling-frame HamiltoniansH k = (P k−1 · · · P 1 ) † H s (P k−1 · · · P 1 ), the unitary operator can be rewritten as where the leading-order average Hamiltonian is H avg = 1 T n k=1H k τ k and T is the Floquet period. The leading-order description is a good approximation to the dynamics when the driving frequency 1/T is much faster than the local energy scales of the system Hamiltonian H s . To further increase accuracy, higher-order corrections to this expression can be systematically accounted for using the Floquet-Magnus expansion (12).
The internal system Hamiltonian of NV ensembles includes disorder and interaction terms, given in Eq. (S4), and can be rewritten as where h i is the on-site disorder field for the spin at site i and J ij /r 3 ij is the dipolar interaction strength between two spins at sites i and j. As the Heisenberg component of the interaction Hamiltonian, S i · S j , is invariant under global rotations, the k-th toggling-frame Hamiltoniañ H k can be uniquely specified by how the S z spin operator is transformed after the preceding k − 1 pulses:H As discussed above,S z k = µ F µ,k S µ . Therefore, the matrix F µ,k allows us to directly write down the average Hamiltonian, H avg = 1 T n k=1H k τ k describing spin evolution. This method is in fact generally applicable to any Hamiltonian under a strong quantizing field, with up to three-body interactions (see Ref. (7) for a detailed explanation).
With this analysis, we can now understand the intuition behind the above rules for leadingorder fault-tolerant dynamical decoupling of disorder and interactions.
First, the cancellation of disorder [Rule 1] can be understood as the requirement that a spin echo is performed along each axis direction, such that a precession around a positive disorder axis (e.g. +S z ) is compensated by a negative precession (−S z ). This suggests that there should be an equal amount of free evolution time along the positive and negative directions for each axis. In addition to the contributions from free evolution periods, the disorder will also be acting during the finite pulse duration. Crucially, since the spin operator continuously rotates during the π/2 pulse (e.g. S z cos θ + S x sin θ when rotating from S z (θ = 0) to S x (θ = π/2)), the disorder acting during the finite pulse duration results in a contribution that is proportional to the disorder Hamiltonian immediately preceding and following the pulse, with an additional numerical prefactor 2/π after integrating over the pulse. Thus, the disorder Hamiltonian of the pulse P k can be written asH P k = 2 π (H dis k−1 +H dis k ). This effectively extends the disorder contribution from the k-th free evolution period τ k to a duration of τ k + 4t p /π. By treating π pulses or composite π/2 pulses as a combination of several π/2 pulses with zero time separation in between, and specifying the intermediate frame direction after each π/2 pulse, we ensure that the effects of finite pulse duration during all pulses are treated appropriately.
Similarly, the suppression of interaction effects [Rule 2] in our Hamiltonian can be easily understood from a generalization of the WAHUHA sequence (13). Here, by rotating the S z spin operator to spend equal time along thex,ŷ andẑ directions, the interactions are symmetrized into the Heisenberg form, which preserves the coherence of polarized initial states that are typically realized in ensemble experiments. The effects of finite pulse duration can again be analyzed by considering the continuous rotation of the spin operator; however, as the interaction involves two operators, there will be both a contribution that extends the effective duration of each free evolution period, and an interaction cross-term involving the axis directions before and after the pulse. The interaction cross-term cancellation can be easily phrased as a parity product condition between neighboring frame directions [Rule 3].
Furthermore, robustness against different control imperfections can also be readily incorporated with our representation. Here, we focus on the dominant source of control error in many experimental systems, namely rotation angle errors that may arise due to an imperfection calibration of microwave power or control field spatial inhomogeneities. Note that other types of imperfections can also be incorporated, as discussed in Ref. (7). Here, the intuition is that the average Hamiltonian contribution corresponding to a rotation angle error can be easily analyzed in the toggling frame, where a systematic over-/under-rotation along the +μ-axis (positive chirality) can be compensated by another rotation along the −μ-axis (negative chirality) in the toggling frame. This chirality can be conveniently described mathematically by the cross product between the neighboring frame directions, allowing a simple algebraic rule for the cancellation of rotation angle errors [Rule 4].
Combining these, we have a set of succinct algebraic conditions to describe leading-order fault-tolerant dynamical decoupling, which can enable a significant extension of spin coherence times. However, to perform effective sensing, it is also crucial to maintain high sensitivity to an AC external signal. We now show how the representation we introduced above also readily enables the analysis of sensing properties of a given sequence, allowing us to design pulse sequences that approach the optimal sensitivity to an external field under the constraints of interaction-decoupling.
First, we derive the average Hamiltonian contribution of the target AC sensing signal (see Ref. (7) for more details). For an AC magnetic field signal causing a time-dependent sinusoidal line shift, the Hamiltonian is given by where Re denotes taking the real part, B AC , f , φ are the amplitude, frequency and phase of the AC magnetic field. Note that under the secular approximation in the presence of a strong quantizing field, only the projection of B AC onto the quantization axis will be relevant. To capture the resonance characteristics of a given pulse sequence, we define the frequency-domain modulation functions where F µ (t) is the time-domain modulation function, as defined in Eq. (S7) but accounting for the finite pulse duration effects. N is the sequence repetition number, T is the duration of the Floquet period, andφ µ (f ) are spectral phases capturing the relative phase differences between different axes. The average Hamiltonian contribution corresponding to this sensing field can be written as where the frequency-dependent vectorial effective sensing field B eff is expressed as This implies that under the influence of the target AC sensing field, the spin will precess around an effective magnetic field B eff that is determined by the vector frame modulation functions of the sensing sequence. We note that this effective sensing field description can also be useful in the analysis of DC sensing sequences. In this vectorial sensing scheme, the total spectral response |F t | is evaluated as Thus, to optimize sensitivity, we require that the frequency-domain modulation functions |F µ (f )| along each of the three axis directions have resonances aligned at the same frequency f , and that the phase of the resonanceφ µ (f ) is the same along different axes to achieve coherent addition of sensing field contributions. This is particularly important when performing AC sensing with interacting spin ensembles, as interaction decoupling requires the toggling-frameS z spin operator to spend equal time along each of the three axes (see Rule 2 above). For conventional interaction-decoupling pulse sequences that only utilize the effective sensing field along theẑaxis, the corresponding sensing efficiency can only be 1/3 of that given by the XY-8 sequence.
However, utilizing the full vector nature of the phase-synchronized effective sensing field B eff , as is achieved with Seq. B, we can boost the sensitivity by a factor of √ 3, which is close to optimal for interaction-decoupling pulse sequences (7).

Detailed analysis of pulse sequences
We now apply the above rules for dynamical decoupling and fault-tolerance against leadingorder imperfections to the sequences introduced in the main text. First, we introduce a convenient pictorial representation of the frame matrix F µ,k (see Fig. S4), which will help visualize the decoupling properties of the sequences (7). Note that this representation is equivalent to that introduced in Fig. 3A of the main text, but has the advantage that the decoupling rules 1-4 can be readily analyzed for more complicated sequences. In each panel of Fig. S4, the top row indicates the conventional representation of the corresponding sequence in terms of the pulses applied. The matrix below is a pictorial version of the frame matrix representation introduced above, where the three rows indicate thex,ŷ,ẑ axis directions and the columns indicate different free evolution periods of duration τ . A yellow/green block in the µ-th row, k-th column matrix indicates that F µ,k = +1/ − 1, i.e. the toggling frame spin operatorS z k = +S µ /−S µ during the k-th free evolution period. Meanwhile, the narrow bars in between the blocks indicate the spin operator orientation during the intermediate frames at the middle of π pulses or composite π/2 pulses. Figure S4: Pictorial representation of pulse sequences introduced in the main text. The top row of each panel indicates the conventional pulse sequence representation in terms of applied π/2 and π pulses along thex andŷ axes, with above/below the black line indicating positive/negative rotations. The bottom row shows the frame matrix representation, illustrating the toggling-frame transformations of the S z spin operator; each row corresponds to a different axis direction, each column corresponds to a different free evolution period, and the filled boxes/bars indicate the spin operator orientations during free evolution periods/intermediate frames, with yellow/green indicating a positive/negative orientation. (A) XY-8 sequence, which decouples disorder but not interactions, (B) Sequence A, which decouples disorder and interactions in the infinitely short pulse limit, but does not address control imperfections arising from interactions acting during the finite pulse duration and rotation angle errors, (C) Sequence B, which suppresses disorder, interactions, and all finite pulse imperfections to leading order.
First, we analyze the XY-8 sequence, widely used for dynamical decoupling and AC sensing with non-interacting ensembles.
1. Disorder: There are an equal number of green and yellow squares/bars in each row of the matrix, indicating that disorder terms cancel for both free evolution periods and finite pulse durations.
2. Rotation angle errors: For each pair of axis directions, the total chirality of all rotations sum to zero. As an example, if we examine all interfaces betweenŷ andẑ frame directions, we see that the chirality of the first π pulse (+ẑ → +ŷ → −ẑ) cancels with the third π pulse (+ẑ → −ŷ → −ẑ), and similarly for the fifth and seventh pulses.
3. Interactions: All free evolution periods are located along theẑ-axis (all filled squares are in the third row), so interactions are not fully symmetrized.
Therefore, although the XY-8 sequence addresses on-site disorder and rotation angle errors, it does not suppress the effects of interactions, and consequently the observed coherence time is limited by spin-spin interactions.
Second, we analyze Seq. A, which is designed to cancel the effects of disorder and interactions during the free evolution periods, but does not fully suppress all pulse-related imperfections.
1. Disorder: There are again an equal number of green and yellow squares/bars in each row of the matrix, indicating that disorder terms cancel for both free evolution periods and finite pulse durations.

Rotation angle errors:
The chirality condition is not satisfied between theŷ andẑ directions; to see this, notice that the first π/2 pulse has chirality −x, while the fifth and six pulses combine to also give a chirality −x, such that the net chirality is not zero. This indicates that rotation angle errors are not fully suppressed.
3. Interactions during free evolution periods: There are an equal number of filled blocks along each row of the matrix, indicating that interaction terms are cancelled during free evolution periods.

Interactions from intermediate frames:
There are more narrow bars in thex andŷ directions compared to theẑ direction, so the average interaction contribution in rule 2 is not cancelled for finite pulse durations.

5.
Interaction cross-terms due to finite pulse durations: The parity condition is not satisfied between theŷ andẑ direction, since between these two rows there are three parity-preserving interfaces and one parity-changing interface. This indicates that the interaction cross-terms are also not fully suppressed for finite pulse durations.
Therefore, while Seq. A is expected to work well in the absence of pulse imperfections, the performance will be severely affected for finite pulse durations.
Finally, we apply the design rules to Seq. B, and confirm that it satisfies all conditions for dynamical decoupling and fault-tolerance against leading-order imperfections.
1. Disorder: For the free evolution periods, the sequence shows a clear echo structure with alternating yellow and green squares, indicating that it should efficiently suppress disorder. The intermediate frames are also mostly directly paired into yellow and green pairs, except for the middle yellow bar along thex axis, which is paired with the last green bar along the same axis.
2. Rotation angle errors: The interface chirality sum for each pair of axes is zero, so rotation angle errors are fully cancelled. One simple way to see this without explicitly calculating the individual chiralities is to note that the k-th block/bar has opposite color from the (n − k)-th block/bar, where n is the total number of blocks/bars, neglecting the final green bar in thexdirection. Therefore, the chirality contributions at each interface will be precisely cancelled by another contribution from the opposite side of the sequence, summing to zero at the end.
3. Interactions during free evolution periods: There are an equal number (16) of filled square blocks along each axis direction, so interactions during free evolution periods are fully cancelled.

Interactions from intermediate frames:
There are an equal number (16) of narrow bars along each axis direction, so interactions during finite pulse durations are fully cancelled.
In addition to satisfying all of the above rules, Seq. B is also designed with heuristics to suppress higher-order contributions. For example, it has a fast spin-echo structure for both free evolution periods and intermediate frames, such that the dominant disorder effects are echoed out as soon as possible. This is particularly important for our disorder-dominated spin ensemble, and will significantly reduce higher-order contributions to the effective Hamiltonian, since most of the commutators in higher-order terms that involve disorder will cancel out.
We now analyze the sensing properties of Seq. B. As discussed in the preceding section, the sensitivity of a given pulse sequence to an AC external field at frequency f and phase φ is characterized by the frequency-domain modulation function |F t (f )|. This is plotted for Seq. B in Fig. 3B of the main text, where the dominant resonance is highlighted. As can be seen in both Fig. 3A of the main text and Fig. S4C, the AC signal will be well-synchronized with the frame changes along each axis in this case, and the signal phase φ for which the field strength along each axis is maximized coincides for different axes. This implies that the effective field direction will point along the [1, 1, 1]-direction, efficiently utilizing the phase accumulation along all three axes.
We note that using our algebraic rules, one can prove the necessity of composite π/2 pulse structures for robust interaction-decoupling sensing sequences that exhibit a fast spin-echo structure (7). This is because to effectively accumulate phase for this frequency f , the frame modulation pattern along each axis must be fixed (e.g. always appearing as [+1, −1] alongx); this in turn implies that any interface between two axis directions will always have a fixed parity, leading to rule 3 above being violated. Thus, composite π/2 pulses, which can adjust the parity of frame-changing interfaces, must always be employed in such sequence design, as is the case with Seq. B presented in the main text. See Ref. (7) for a more detailed discussion.

Importance of fault-tolerant sequence design and numerical validation
We now numerically confirm that the algebraic decoupling rules summarized in Eqs. (S8-S11) provide a good characterization of the degree of robustness for a given pulse sequence. In the limit of zero pulse durations, there are a number of equivalent pulse sequences that generate the same leading-order average Hamiltonian. However, in realistic situations where the control pulses have finite duration (t p = 0) and rotation angle errors (θ = π/2), control pulse imperfections induce error terms in the average Hamiltonian, giving rise to rapid decoherence as well as undesired dynamics in spin systems. One such example is illustrated in Fig. S5, where a modified WAHUHA sequence designed to suppress disorder and interactions in the infinitely short pulse limit does not perform well for finite pulse durations, resulting in an effective disorder field along theŷ-direction and rapid decay of coherences along thex andẑ directions. This implies that depending on their level of designed fault-tolerance against experimental imperfections, sequences expected to have similar performance in the ideal pulse limit will behave very differently in the presence of finite pulse duration and rotation angle errors. Higher-order contributions to the effective Hamiltonian can also vary between different sequences depending on the details of pulse arrangement, further differentiating decoupling performances.
To understand the effects of control imperfections and identify better pulse sequences, we numerically examine the decoupling performance of different Floquet pulse sequences categorized into four distinct classes: Class I is a set of robust pulse sequences that satisfy all fault-tolerant rules and Class II, III and IV are a set of pulse sequences that are not robust to interaction cross-terms (violation of Rule 3), rotation angle errors (violation of Rule 4), or both cross-terms and rotation angle errors (violation of Rule 3 and 4), respectively. Here we only consider pulse sequences that maintain the fast spin-echo structure, ideal for AC sensing applications. Under this constraint, we generate all possible configurations of pulse sequences consisting of 12 free evolution intervals, and classify them into the aforementioned classes according to their robustness. A numerical search allows us to find 448, 576, 3072 and 9984 sequences belonging to Class I, II, III and IV, respectively. In our experiments, we also augment Figure S5: Importance of fault-tolerant sequence design. (A) A Floquet pulse sequence which decouples both disorder and interactions in the ideal, infinitely short pulse limit. Pulse legends as in Fig. S4. In the case of non-zero pulse duration, decoupling rule 2 (Eq. S9) is violated due to the presence of intermediate frames on theŷ-axis (thin yellow bar at the center). In other words, the middle π pulse produces an on-site disorder field along theŷ-axis in the togglingframe picture. (B) Experimental result probing coherence under the sequence presented in (A). The three curves show different coherence decay profiles for spins initialized along various axes. While spins initialized along thex-andẑ-axes decay quickly within ∼1 µs, the spins initialized along theŷ-axis exhibit extremely long coherence. This originates from the imperfect disorder cancellation (violation of rule 2), resulting in an on-site disorder field along theŷ-axis that defines a spin-locking field in the leading-order average Hamiltonian. (C) Numerical simulation probing coherence decay under the sequence presented in (A). For the simulation, we study the dynamics of a finite-size system consisting of N = 9 spins subject to periodic driving. The random on-site disorder and spin-spin interactions are assumed to follow box distributions with strengths estimated from experimentally measured parameters (see text for details). The solid and dashed lines indicate spin dynamics under the exact Floquet unitary and the leading-order average Hamiltonian, respectively. In both cases, we use the same finite pulse duration for global spin-rotation pulses and reproduce the experimental observations seen in (B). each sequence by repeating the same sequence but withx-andŷ-axes exchanged in its matrixbased representation, leading to a total of 24 free evolution frames in one Floquet cycle. Such xy-phase symmetrization does not modify the fault-tolerant character of pulse sequences, and is expected to cancel certain higher-order contributions through symmetrization. In addition, we also obtain 48-frame sequences by doubling the 24-frame sequences, where the second half of the sequence is flipped left-right and in sign compared to the first half to guarantee sensing properties and fault-tolerance while symmetrizing further; Seq. B is an example of a 48-frame sequence (see Fig. S4). Figure S6 summarizes simulation results for the numerically obtained sequences of 12 free evolution intervals, where we extract the 1/e coherence decay times T 2 averaged overx,ŷ,ẑ initial states (the average is performed over the decay rates). For the simulations, we performed exact diagonalization of Floquet unitary operators for a finite-size spin system and monitor ensemble coherence at stroboscopic times t = N T where N is an integer. We fix the system size to 8 spins and assume random on-site disorder, h i ∈ N (0, σ W ) and random interactions, J ij ∈ [−J, J], following Gaussian and uniform distributions with σ W = (2π) 4.0 MHz and J = (2π) 0.2 MHz, respectively. We use the pulse spacing τ = 25 ns and the pulse duration t p = 10 ns, consistent with the sequence parameters used in the experiments. As shown in Fig. S6, all sequences showing the longest coherence times belong to Class I, validating the importance of incorporating fault-tolerant decoupling rules. Figure S6: Numerical validation of the fault-tolerant decoupling rules. The figure shows a histogram of 1/e coherence decay times for a set of pulse sequences exhibiting different degrees of robustness. These sequences are classified into four distinct categories, Class I (blue), II (red), III (green) and IV(yellow), according to their robustness to control imperfections; for example, Class I sequences satisfy all fault-tolerant decoupling rules (see text for details). To extract 1/e coherence times, we run numerical simulations by solving the exact Floquet spin dynamics for a disordered, interacting 8-spin system and monitor global spin polarization as a function of time. We repeat the same sequence for three different initial states prepared along thex,ŷ and z axes. The mean 1/e coherence time is defined as the inverse of the mean 1/e decoherence rates averaged over the three different initial states. We numerically confirm that long coherence times are indeed achieved when dominant imperfections are properly treated in the design of Floquet pulse sequences.

Remaining limits to coherence
Our sequence decouples disorder and interactions and suppresses dominant pulse-related imperfections at the leading-order average Hamiltonian level. These provide a significant extension of spin coherence times, with a 5-fold improvement over conventional dynamical decoupling sequences. However, the coherence times achieved here are not yet limited by the depolarization time T 1 , which in the present sample is around 100 µs.
There are several factors that could limit the observed coherence times for Seq. B: First, higher-order terms in the Floquet-Magnus expansion are not fully suppressed, leading to residual disorder and interaction effects. This could be particularly important for contributions involving the large on-site disorder in our electronic spin ensemble. Indeed, performing exact diagonalization simulations for different disorder strengths, we find that the coherence time significantly extends as one reduces the disorder strength. Second, microwave waveform imperfections due to e.g. interface reflections, could affect the efficiency of dynamical decoupling and reduce coherence times. Indeed, we have found that when the attenuators along the microwave path are removed, the larger reflections significantly reduce the observed coherence times and cause oscillations in the signal. While the attenuators alleviate this issue, there could still be residual effects. Third, there may be small residual contributions such as time-dependent disorder effects from spin-bath dynamics of the environment. Finally, imperfect spin polarization into the |m s = 0 state could cause additional spin-exchange dynamics out of the two-level system formed by |m s = 0, −1 , causing decoherence. However, we find that intentionally moving part of the spin population into the |m s = +1 state during state preparation does not significantly affect the coherence time, indicating that this decay channel is probably not dominant. Note that imperfect polarization within the |m s = 0, −1 spin levels is not expected to have an impact on the coherence times; only the polarization into |m s = +1 should have an effect.
3 AC magnetic-field sensing with interacting spins

Calibration of AC magnetic field amplitude
In the following subsections, we provide details on the responses of our NV-ensemble magnetometer. To characterize the performance of AC magnetic field sensing, a continuous-wave sinusoidal signal is generated by the AWG with a controlled amplitude and phase relative to the NV sensing sequence. To calibrate the strength of the magnetic field, we employ an XY-8 sequence and measure the contrast S as a function of AWG input voltage (Fig. S7). The contrast S shows periodic oscillations with increasing input amplitude, indicating that the NV centers undergo a coherent precession due to the external AC magnetic field. Analytically, the contrast oscillation can be fitted to where S max is the maximal signal contrast, γ NV = (2π)28 GHz/T is the gyromagnetic ratio of the NV center, t is the phase accumulation time for sensing, β = 0.95 is a small correction factor due to the finite duration of π pulses used in the XY-8 sequence, and B AC is the AC magnetic field amplitude (6). A comparison of the extracted B AC with the applied voltage allows the precise calibration of the AC magnetic field generated for a given AWG output voltage. The results are shown in Fig. S7, where we obtain a calibration constant of 32.8(6) µT/V. Figure S7: Calibration of sensing signal amplitude. A conventional XY-8 sequence is employed to perform AC magnetometry and calibrate the magnitude of the AC magnetic field. A continuous-wave signal is directly synthesized by the AWG and optimally synchronized with the XY-8 sequence to maximize phase accumulation. To satisfy the resonance condition, the frequency of the AC signal is set to f = 1 2(τ +tπ) , where τ is the free evolution interval between adjacent π pulses of the XY-8 sequence and t π is the π pulse length. Here, we fix the phase accumulation time to t = 2.16 µs. The spin contrast is monitored as a function of AWG input voltage. The measured trace is fitted to a sinusoidal curve (solid line) to extract the frequency of the contrast oscillation. Using the extracted frequency, we obtain a calibration constant of 32.8(6) µT/V.

Sensing response as a function of relative phase between pulse sequence and signal
We now consider the sensing response as a function of the relative phase between the pulse sequence and signal. Here, we assume that the spectral phases at resonance frequency f are aligned between different axes, givingφ x (f ) =φ y (f ) =φ z (f ), as is the case for our sensing sequence Seq. B. In this case, the phase accumulation of the sensor due to an external signal is constructive if the signal and applied pulse sequence are in phase, i.e., when the relative phase φ 0 , defined as φ 0 = φ −φ z (f ), between the two is 0 or ±π. If φ 0 = ±π/2, then there will be zero phase accumulation. To experimentally confirm this relative phase dependence, we sweep the value of φ 0 from −π to +π using Seq. B and measure contrast variations as a function of signal phase (see Fig. S8A). Interestingly, the phase responses for thex,ŷ andẑ-axis initial states are not identical to each other, although the effective field strength along each axis is expected to be the same at the average Hamiltonian level (see discussion in Sec. 2.2). The observed asymmetries are attributed to the fact that at finite target sensing signal amplitudes, the rotations induced by the sensing signal at different times do not commute with each other (disorder-dominated higher-order contributions), and thus the dynamics exhibit deviations from the average Hamiltonian picture and the symmetry between different axis directions is broken. Despite this non-trivial phase dependence, an optimal state preparation orthogonal to the [1, 1, 1]-direction still shows largest contrast and results in maximal sensitivity. In addition, we note that the sign-inversion symmetry of the signal field is broken: The sensor response at φ 0 = 0 is not the same as that of φ 0 = π, likely due to the aforementioned non-commuting effect.
To lend support to this, a numerical simulation is performed by solving the time-dependent Schrödinger equation in the presence of a target AC sensing field, where we include the on-site disorder and interactions in the system. The result is shown in Fig. S8B, showing qualitatively similar behavior to the experimental data and suggesting that this effect indeed originates from the non-commuting nature of π/2 rotations in the presence of an external sensing field. Figure S8: Dependence of NV-ensemble magnetometer response on the relative phase delay between the sensing signal and Seq. B. (A) Experimental data, (B) numerical simulations. Different colors denote different initialization conditions, corresponding to initial spin orientations along thex (blue circles),ŷ (red squares),ẑ (yellow diamonds), and the optimal [1, 1, −2] (purple stars) axes. In (B), we numerically solve the time-dependent Schrödinger equation for the full many-body Hamiltonian consisting of 6 spins to capture disorder and interaction effects. We use a time step dt = 1 ns for numerical integration. The sensor spins are randomly generated in 3D using estimated experimental NV concentrations, with periodic boundary conditions imposed. Each spin has a random on-site disorder drawn from a Gaussian distribution with zero mean and a standard deviation of ∼ (2π) 4.0 MHz. Long-range dipolar interactions (∼ 1/r 3 ) are assumed with a coupling strength determined by the relative distance and orientation between the spins. The simulation parameters are chosen in accordance with the experiments (τ = 25 ns, t π = 20 ns, t π/2 = 10 ns, t = 6.516 µs, N = 3 Floquet periods, T 1 = 100 µs with stretched exponential of power 0.5, and the frequency/amplitude of the AC signal are set to 11 MHz and 6.6 µT).
We now rigorously derive the optimal signal phase at which the contrast is maximized, for a given effective magnetic field and initial state. This will allow us to model the electron spin resonance (ESR) frequency response in the general case of vectorial effective magnetic fields in the toggling frame, going beyond the conventional modeling of ESR responses that only considerẑ-directional sensing fields.
To be more concrete, we consider the evolution of a given initial state |ψ 0 for phase ac-cumulation time t under some pulse sequence and target AC magnetic field, with the Hamiltonian described in Sec. 2.1. In the toggling frame, the dynamics of the spin corresponds to a precession of angle θ = γtB t / around the axis (u x , u y , u z ) = (B x , B y , B z )/B t , with the total magnetic field B t = B 2 x + B 2 y + B 2 z . Here, the individual field components already take into account the relative phase alignment with respect to the target signal field, namely, Writing the initial state as a vector on the Bloch sphere, the evolution corresponds to the rotation matrix For cosine magnetometry, the observed contrast then corresponds to the deviation from unity of the projection of the rotated state onto the initial state. As discussed above, in the experiments, we optimize the relative phase delay between the sensing signal and the pulse sequence to maximize the contrast. For numerical comparisons, it is therefore important to determine the optimal signal phase for which the contrast is maximized.
In the following, we show that for small signal field rotations, the contrast is maximized when the phase is chosen such that the field strength orthogonal to the initialization direction is maximized, regardless of the field strength along the initialization direction. To illustrate this, we first consider the case of spins initialized along thex-direction. After a phase accumulation time t, the projection onto the initialization direction is given by the (1,1)-element of the rotation matrix in Eq. (S21). Expanding for small θ, we find the contrast to be Thus, to maximize contrast, we need to maximize the total field strength in theŷ andẑ directions. For a generic initialization direction, we can redefine the coordinate system to have one of the basis vectors pointing along the chosen initialization direction, and rewrite the effective The signal contrast is then maximized when the projection of B eff onto the latter two transverse directions is maximal, which can be easily written down based on linearity of the field component addition.
Combining these results with the calculated frequency-domain modulation functionsF µ (f ), we can easily choose the phase of the AC magnetic field signal that gives maximal contrast under each measurement condition, and use this information to determine the expected contrast. This allows us to generate the theoretical ESR curves in Fig. 3D of the main text.

Initialization and readout protocols
Similar to the conventional sine and cosine magnetometry protocols, for our new sequences, we can also design spin initialization and readout protocols to maximize the sensitivity to overall polarization (sine magnetometry) or fluctuations in polarization (cosine magnetometry) at close to zero signal field. Both of these protocols require that the spin is initialized in a direction perpendicular to the effective magnetic field direction in order to maximize precession.
In our experiment, we vary the initialization direction to implement sine and cosine magnetometry, while fixing the readout rotation. For readout, we rotate around an axis perpendicular to the effective magnetic field direction, and choose the rotation angle to rotate the precession plane to contain theẑ-axis for state measurement via spin-state-dependent fluorescence. Here, we choose to rotate the v 1 = [1, 1, 1]/ √ 3 effective field direction into v 2 = [1, 1, 0]/ √ 2, so that the precession plane perpendicular to the effective field direction contains theẑ-axis. This corresponds to a rotation around the [−1, 1, 0]-axis by an angle arccos( v 1 · v 2 ) = arccos 2/3 . With the readout rotation determined, one can now easily find the initialization rotations that are analogous to cosine and sine magnetometry. More specifically, cosine magnetometry corresponds to initializing the spins at the location of maximum contrast, while sine magnetometry corresponds to initializating the spins at the location of zero contrast. In our experiments, we realize cosine magnetomery (Fig. 3D of the main text) by a rotation of angle arccos 2/3 around the [1, −1, 0] axis, to rotate into the state [−1, −1, 2]/ √ 6, and realize sine magnetometry (Fig. 4 of the main text) by a rotation of angle π/2 around the [1, 1, 0] axis, to rotate into the state [1, −1, 0]/ √ 2. Under the readout pulse, the former is rotated to the [0, 0, 1] state (maximal contrast), while the latter is not rotated and remains on the equator (zero contrast). Note that this is slightly different from the conventional approach for switching between sine and cosine magnetometry, where one fixes the initialization rotation and changes the readout rotation, but the two methods are equivalent to each other. For example, we can choose instead to always rotate the spin into the state [1, −1, 0]/ √ 2 for initialization. The readout rotation for sine magnetometry remains unchanged, while we can choose a π/2 rotation around the [1, 1, 0] direction for cosine magnetometry. Note however, that this requires a rotation of a larger angle than our protocol; in fact, our initialization and readout protocol is the one that requires minimal rotation angle.

Sensing response as a function of sensing signal amplitude
In order to evaluate the sensitivity of our NV-ensemble magnetometer, the amplitude of a continuous-wave signal is swept while the relative phase between the pulse sequence and signal is locked to the optimal phase delay φ 0 = 0. With our new sensing sequence Seq. B, we monitor how the contrast S varies as a function of signal amplitude at a fixed phase accumulation time. As shown in Fig. S9A, the contrast exhibits sinusoidal oscillations for different initial states in response to the AC sensing signal. However, due to the effective precession axis along the [1, 1, 1] direction under the pulse sequence, spin initializations along thex-andŷ-axes result in asymmetric oscillations with reduced contrast, introducing positive and negative offsets to S, respectively. On the other hand, if spins are prepared along the [1, −1, 0] direction and rotated by angle arccos 2/3 around [−1, 1, 0] for readout (sine magnetometry explained above), then the contrast oscillation remains symmetric with zero offset, with an increased slope at the zero-field point (see Fig. S9A). This unconventional state initialization and readout is crucial to achieve optimal sensitivity. Figure S9: Dependence of NV-ensemble magnetometer response on AC signal amplitude under the leading-order fault-tolerant sensing sequence (Seq. B). (A) Experimental data and (B) numerical simulation. In (A), different markers denote different measurement conditions, each corresponding to initial spin orientations along thex (blue circles),ŷ (red squares), and optimal [1, −1, 0] (purple stars) axes, while different lines indicate a fit taking the form of a Gaussiandecaying sinusoidal function with an offset. In (B), we numerically solve the time-dependent Schrödinger equation for the full many-body Hamiltonian consisting of 6 spins to capture interaction effects. The same simulation parameters are used as in Fig. S7.
As the AC signal amplitude is further increased, we find that the peak-to-peak contrast starts to decrease. As shown in Fig. S9B, a qualitatively similar behavior is also observed in a numerical simulation where the time-dependent many-body dynamics is solved for a dipolar spin system subject to the same dynamical decoupling pulse sequence and an external AC signal. We attribute the reduction in constrast to a decrease in disorder and interaction decoupling efficiency of the pulse sequence, due to the presence of a large AC signal that modifies the effective spin frame significantly, causing the average Hamiltonian picture in the frames specified by the pulses to partially break down (14). Numerically, we find that this effect is dominated by contributions from the reduced disorder decoupling efficiency. To account for these effects, in Fig. S9, we fit the contrast oscillations to a sinusoidal curve with a Gaussian decay.

.1 AC magnetic field sensitivity
The AC magnetic-field sensitivity η is defined as the minimum detectable signal that yields unit SNR for an integration time of 1 second (15), and can be estimated by where t is the phase accumulation time, T 2 is the coherence time of the ensemble, α is the exponent of the stretched exponential decoherence profile, C is an overall readout efficiency parameter, K is the number of NV centers in a confocal volume, t m is the additional time needed to initialize and read out the NV centers and b is a pulse-sequence-dependent sensitivity factor (6). For example, b = 1 for the XY-8 sequence and b = √ 3 for Seq. B. Assuming a single group sensor density of 4 ppm and a sensor volume of V = 8.1 × 10 −3 µm 3 , we identify K ≈ 6 × 10 3 spins. By fitting Eq. (S23) to the measured sensitivity curves presented in Fig. 4B of the main text, we estimate the overall readout efficiency per NV as C ≈ 1.3 × 10 −3 .

Sensitivity optimization
The sensitivity can be experimentally quantified by where σ 1s S is the uncertainty of the contrast S for 1 second averaging and |dS/dB AC | is the gradient of S with respect to the field amplitude B AC . The uncertainty σ 1s S and the slope |dS/dB AC | are extracted at the zero-field point (B AC = 0), from which we estimate the maximum sensitivity (see Fig. 4A in the main text). The contrast S is proportional to the probability p since S measures the population difference between the |0 and |−1 states.
In order to optimize the sensitivity, NV state preparation and readout parameters, including counter length t read and green duration t gr , are fine-tuned. In our experiments, the measurement and readout overhead time is t m = t gr + 0.3 µs (including a short buffer to adjust for time delays between different equipment) and the sensor readout is performed at the leading edge inside the green illumination. As shown in Fig. S10, t read = 1.2 µs and t gr = 6 µs result in the best sensitivity values for both Seq. B and XY-8. With these parameters, we probe the sensitivity of our dense NV ensemble as a function of phase accumulation time. As presented in Fig. 4B in the main text, the sensitivity scaling calculated from Eq. (S23) is in excellent agreement with the experimental observations. The minimum sensitivity currently achieved in our system is estimated to be 220(3) nT/ √ Hz measured under Seq. B, which is a factor of ∼1.7 improved over the conventional XY-8 sequence. This translates into a volume-normalized sensitivity of 20(2) nT·µm 3/2 / √ Hz, which is among the best volume-normalized sensitivities demonstrated using electronic spin ensembles. Figure S10: Optimization of sensitivity. To identify the parameter regime where sensitivity is optimized, the counter length t read (A,B) and green duration t gr (C,D) are varied to find the optimal point balancing readout signal contrast and readout integration uncertainty. We find that t read = 1.2 µs and t gr = 5 − 6 µs show the highest sensitivity for both Seq. B (A,C) and XY-8 (B,D). The errorbars denote the standard deviations of sensitivity values, obtained via error propagation.

Comparison of spin-ensemble-based sensors
The figure of merit for ensemble-based sensors is the volume-normalized sensitivity. As shown in Tab. S1, we compare our volume-normalized sensitivity with existing work on 3D ensemble magnetic field sensing. We calculate the overall readout efficiency parameter C for each system using Eq. (S23) and values extracted from Refs. (16)(17)(18)(19)(20). Despite the relatively low C value extracted for our system, our dense spin ensemble achieves volume-normalized sensitivities among the best reported so far (21).
While our work is the first to push the sensitivity below the limit imposed by interactions, this advance may not be obvious from a direct comparison of the η V values. In addition to the differences in C, which arise mainly from different photon collection strategies, we attribute this fact to a sizable uncertainty in the quoted spin densities for NV-based diamond sensors. In principle, for XY sequences, the ρ · T 2 product (the figure of merit governing the sensitivity scaling) is expected to lie below a universal interaction limit, given by the upper limit of T 2 imposed by interactions, where ρ · T 2 reaches a constant number for any spin density (22). The quoted ρ · T 2 values, however, display large variations, with some a few orders of magnitude larger than our value, indicating that the high spin densities claimed in some works may have been over-estimated (20). This also leads to an under-estimate of the readout efficiency parameter C. In particular, at high densities, due to charge dynamics, the effective NV center density (as extracted in Sec. S1 above) is expected to be lower than the value simply estimated from the conversion efficiency of nitrogen defects to NV centers. Moreover, extracting the effective NV center density from numerical simulations can be a non-trivial task at these high densities, since it is challenging to accurately describe long-range interacting spin dynamics only using a limited system size. Table S1: Comparison of magnetometer performance for ensemble spin sensors with different spin densities. † : our experiments with Seq. B operate beyond the interaction limit.
We expect that the current sensitivity can be further enhanced by optimizing sensor characteristics and readout efficiencies. To this end, we estimate the volume-normalized sensitivity to improve up to single-digit picotesla levels of η V = 8 pT·µm 3/2 / √ Hz attainable under our phase-synchronized sensing sequence (Seq. B). Here, we used the realistic system parameters of lifetime-limited coherence times of T 2 = 2T 1 = 200 µs, a total spin density of 15 ppm and C = 0.2 (23)(24)(25)(26)(27). For the projected C we assume a collection efficiency above 5%, which can be achieved through nanofabrication techniques such as nanoscale parabolic reflectors (15,28).