High-Resolution Nanoscale Solid-State Nuclear Magnetic Resonance Spectroscopy

We present a new method for high-resolution nanoscale magnetic resonance imaging (nano-MRI) that combines the high spin sensitivity of nanowire-based magnetic resonance detection with high spectral resolution nuclear magnetic resonance (NMR) spectroscopy. By applying NMR pulses designed using optimal control theory, we demonstrate a factor of $500$ reduction of the proton spin resonance linewidth in a $(50\text{-nm})^{\text{3}}$ volume of polystyrene and image proton spins in one dimension with a spatial resolution below $2~\text{nm}$.

that combines ultra-sensitive spin detection with highresolution magnetic resonance spectroscopy.
As the spin detection and imaging success of MRFM were being established, there were also efforts to demonstrate its utility for dipolar [9,10], quadrupolar [11], and chemical-shift [12] spectroscopy by employing magic echo pulse sequences and other techniques developed for conventional solid-state magnetic resonance imaging. These experiments were able to achieve spectral resolution under 1 kHz, showing the spectroscopy possibilities of MRFM, but were limited to relatively large spin ensembles with spatial resolution around 1 µm. As Eberhardt et al. write, the static gradient sources used in many MRFM experiments complicate spectroscopy attempts [12]. They solved this problem by moving the gradient source out of range of their sample during the encoding pulse sequence using a piezo.
In our MRFM experiments, we use a Current-Focusing Field Gradient Source (CFFGS) which generates large magnetic fields on nanometer length scales by focusing electric currents through a narrow (∼ 100 nm) metallic constriction. MRFM using a CFFGS has the potential to combine nanoscale imaging with magnetic resonance spectroscopy. The CFFGS can generate very large timedependent magnetic fields (∼ 0.1 T) and magnetic field gradients (∼ 10 6 T/m) over a wide frequency bandwidth (DC -GHz) [13] in nanometer-scale volumes. As Nichol et al. demonstrated in a previous two-dimensional nano-MRI experiment, these characteristics provide the potential for very sensitive spin detection, high imaging resolution, and rapid spin manipulations on spin ensembles as large as (100 nm) 3 [3].
In the nano-MRI experiment by Nichol et al., the spatial resolution was determined by the strength of magnetic field gradients and the encoding time. The encoding time was limited by the T * 2 spin dephasing time, dominated by homonuclear dipolar interactions between proton spins [3]. To image with finer spatial resolution and to enable high-resolution spectroscopy, dipolar interactions between spins must be decoupled. The solidstate MRI community has developed a variety of techniques to acheive this. In particular, the magic echo pulse sequence has been used to decouple both homonuclear and heteronuclear interactions. In addition, it has long laboratory-and rotating-frame spin evolution periods, during which gradient pulses may be applied for spatial encoding.
Recently, nanoscale solid-state chemical-shift spectroscopy has been performed using nitrogen-vacancy centers [14]. In this work, the WAHUHA and MREV-8 pulse sequences were used to extend the spin coherence times. Such dynamical decoupling sequences typically require implementing the same unitary operation throughout the ensemble. While the CFFGS is capable of producing intense magnetic fields, the large B 1 variation over the sample volume makes the use of hard pulses unfeasible. Such experimental limitations can be efficiently dealt with using pulses designed with optimal control theory (OCT). Provided one has sufficient knowledge of the experimental conditions, such as resonance offsets, bandwidth limitations, and B 1 dispersion, OCT pulses can be designed to perform rapid, robust unitary transformations.
In this paper, we demonstrate rapid high-fidelity rotations in a nanoscale ensemble of proton spins by applying OCT-engineered pulses using the magnetic fields produced by the CFFGS. We present a new extension of optimal control theory (OCT) that incorporates Average Hamiltonian Theory (AHT) to allow for control of the average Hamiltonian over the duration of the pulse. We use this capability to perform dynamical decoupling during OCT pulses that have duration comparable to T * 2 . By incorporating OCT-AHT π 2 pulses into magic echo sequences, we demonstrate a factor of 500 narrowing of the proton-resonance linewidth, from 29 kHz to 57 Hz, in a (50-nm) 3 volume of polystyrene. Taking advantage of this greatly extended coherence time, we perform one-dimensional Fourier imaging of proton spins in polystyrene with a spatial resolution between 1.8 nm -2.6 nm within a 35-nm region of the sample. This work serves as a demonstration of the power of OCT to achieve high-fidelity spin control for spectroscopy and imaging in CFFGS-based nano-MRI systems.

II. APPARATUS
As shown in Fig. 1, we used an apparatus very similar to that described in our previous nano-MRI work [3]. Force-detected magnetic resonance measurements were performed using silicon nanowire (SiNW) resonators, grown via the vapor-liquid-solid method. The SiNWs were modified using a procedure developed to split the frequencies of the two lowest-frequency flexural modes, while maintaining high quality factors. The details related to mode engineering of SiNW resonanators are discussed in the Supplemental Information (SI) section I. Fi- nally, the SiNWs were coated at the tip with polystyrene dissolved in diethyl phthalate, using a micromanipulator. The modified SiNW resonator used for this experiment was approximately 20-µm long, with tip dimensions of 60 nm × 80 nm. At 4.2 K, the two fundamental flexural modes had resonance frequencies f 1,2 = (315 kHz, 369 kHz), quality factors Q 1,2 = (8000, 8300), and spring constant k ≈ 1.0 × 10 −4 N/m. We used a 2-µm wavelength laser interferometer polarized vertically along the length of the SiNW to measure its displacement [15]. The lower-frequency mode was used because its oscillation direction was better aligned with the interferometer.
To generate time-varying magnetic fields and magnetic field gradients, we fabricated a CFFGS device by electron-beam lithography and liftoff of 10 nm/80 nm thick Ti/Ag films deposited on a MgO(100) substrate by electron-beam evaporation. The device contained a 150nm wide and 100-nm long constriction, which served to focus electrical currents to produce the magnetic fields used for spin detection and control.
Spin detection was performed by flowing 70-mA peak current through the constriction near the SiNW resonance frequency, corresponding to a peak field gradient of dB z /dx = 1.0 × 10 6 T/m near the tip of the SiNW. Additional information about the spin detection method is presented in SI section II. Magnetic resonance measurements were made at a proton resonance frequency of 48 MHz. The radio-frequency (RF) magnetic field used for spin control was produced by applying 50-mA peak current to the constriction, corresponding to a Rabi field B 1 = 0.037 T near the tip of the SiNW. Fig. 2a shows a cross section of the proton Rabi frequency distribution at the center of the constriction. The magnetic fields produced by the constriction are calculated using the COMSOL Multiphysics finite-element simulation software. The simulations are based on the di-mensions of the constriction measured using a scanning electron microscope.
The OCT pulses that were designed for these measurements occupy a wide bandwidth (∼ 10 MHz). We found that waveform distortions caused by standing waves in the cables between the room temperature electronics and the CFFGS device can severly degrade the performance of the OCT pulses. We therefore developed a technique, based on force detection, that allowed us to characterize the complex transfer function of our electronics directly at the site of the spins. This capability was crucial for achieving good performance of the OCT pulses. The details of this technique will be provided in a separate publication. The spin-resonance frequency of 48 MHz was chosen to coincide with a relatively flat region of the transfer function.
The experiment took place in high vacuum at a base temperature of 4.2 K. We observed approximately 1-2 K heating of the nanowire due to the laser. Using Attocube nanopositioners and custom piezo walkers and scanners, the tip of the polystyrene-coated SiNW was positioned approximately 50-nm above the center of the constriction.

III. OPTIMAL CONTROL PULSE DESIGN
In order to implement most decoupling pulse sequences, one needs the ability to perform the same final spin rotation over the entire spin ensemble. Hard-pulse rotations are not achievable in our system due to the inhomogeneity of fields produced by the constriction. Certain state-to-state transfers such as adiabatic passages can be achieved easily even in the presence of an inhomogeneous B 1 field, but these are insufficient for decoupling sequences. OCT pulses have been used successfully in conventional NMR experiments to address various experimental constraints, including inhomogeneous B 1 fields.
The task of any OCT algorithm is to find a control sequence a(t ) over a period 0 ≤ t ≤ t that adheres to experimental constraints and generates the desired evolution for some ensemble of controlled systems. In the context of quantum control, a(t ) is generally an electromagnetic waveform, designed to generate a Hamiltonian evolution. Frequently, we are interested in a robust implementation of a particular unitary transformation V for each member among the ensemble of interest; in this setting, we are dealing with an ensemble of nuclear spins. If U i [a(t )] is the unitary operation generated on the i th spin at time t, we define a target function Φ ({U i }), a functional of a(t ), that measures the total distance between U i [a(t )] and V for the entire ensemble. The central problem of OCT is to find an a(t ) that minimizes Φ [a(t )]. Many examples of numerical algorithms for solving this problem exist in the literature [16,17].
The benefit of OCT algorithms over attempting to find a(t ) analytically is at least twofold: analytical solutions (e.g. composite hard pulses) are usually restricted to a single system rather than an ensemble, and generally ignore deterministic experimental distortions to the waveform seen by the spins. Numerical pulse searches can easily accommodate both of these demands by incorporating a set of ensemble member specific transfer functions to the control search, which account for waveform distortions and differences in the Hamiltonian parameters. For OCT algorithms, the presence of such transfer functions merely adds an extra step into the computation and typically does not appreciably affect their performance. Robust control in the presence of B 0 and B 1 inhomogeneities [18,19], and compensation for waveform distortions [20,21], have been demonstrated for conventional NMR measurements.
In this work, we use the Gradient Ascent Pulse Engineering (GRAPE) algorithm [16], which prescribes an efficient way to find the functional derivative δΦ[a(t )]/δa(t ) of Φ with respect to a(t ) for piecewise constant control sequences a(t ). This converts the control search into a multivariate optimization problem. Given the functional derivative δΦ[a(t )]/δa(t ) and a (usually) randomly generated initial guess a 0 (t), we use a simple multivariate gradient descent algorithm to reach the local minimum of Φ near a 0 (t). In practice, we generate a number of initial guesses until we find a waveform that yields a suitably low Φ. The details of this procedure are further discussed in SI section IV. In general, the length of the control sequence will increase with the range of demands (e.g. range of targeted B 0 or B 1 , bandwidth constraints for the waveform a(t ), etc.) required by the sequence. Additional demands typically require a larger number of initial guesses before a suitable sequence is found.
For this experiment, we searched for a pulse with a bandwidth of 10 MHz that would perform a π 2 rotation on all spins in a B 1 range, initially from 0.6-1.2 MHz, based on the experimental distribution of proton Rabi frequencies shown in Fig. 2b. Details of the Rabi frequency measurement are given in SI section III. Initially, we searched for a pulse that performed the desired unitary for this range of B 1 , without explicitly averaging the dipolar Hamiltonian. The shortest pulse we found had a length of t = 7 µs, which is comparable to the measured T * 2 = 10.9 µs. To avoid dephasing over the duration of the pulse, we used a new method that incorporates Average Hamiltonian Theory (AHT) into the optimal control pulse design [22]. We added constraints into our pulse optimization to minimize the 0 th -order Magnus terms of the homonuclear dipolar and chemical-shift Hamiltonians over the duration of the pulse. Because of these added constraints, the duration of the pulse increased from 7 µs to 13 µs, however, the dephasing during the pulse was dramatically reduced. Details of the pulse optimization and calculated performance metrics are given in SI sections IV-VI.
To experimentally test the performance of the 13-µs long OCT pulse, we applied repeated Ω[ π 2 ] x pulses to spins starting from the z-axis and measured the resulting z-axis magnetization. Here, Ω[ π 2 ] x refers to an OCT pulse that performs a π 2 ] x rotation. The data in Fig. 3 show the measured spin signal after sequential application of the Ω[ π 2 ] x pulse, in increments of 5 pulses. The signal decays to e −1 after 100 pulses, indicating that the operation works with 99% accuracy for spins starting from the zaxis.
This measurement serves as an initial demonstration of the power of OCT-AHT to produce high-fidelity unitary transformations tailored to our experimental conditions while averaging unwanted Hamiltonian terms to nearly zero. In this respect, these OCT pulses approximate delta-function hard pulses, making them perfectly suited to most NMR pulse sequences.

IV. MAGIC ECHOES
One of the most basic and important tests of the OCT pulses is to evaluate their performance as substitutes for hard-pulse rotations in dynamical decoupling pulse sequences. The magic echo is a basic decoupling pulse sequence frequently used in solid state NMR which refocuses the homonuclear dipolar Hamiltonian [23]. In Fig.  4 we show data and the pulse diagram for a magic echo performed using Ω[ π 2 ] y rotations. To perform the magic echo, we first created a coherence by applying an Ω[ π 2 ] x pulse. The spins then evolved under the static field B 0 for a time τ . During this period, the spins dephased over a time T * 2 , primarily due to homonuclear dipolar coupling. After the first free evolution period τ , an Ω[ π 2 ] y pulse was applied, followed by a rotary echo (RE) (x :x) lasting a total time 4τ . The RE removes dephasing resulting from B 1 inhomogeneity. Following the RE, a second Ω[ π 2 ] y pulse was applied. After another free evolution period τ , the echo formed as the total dipolar evolution was averaged to zero. In our implementation, we used two Ω[ π 2 ] y pulses, which also averaged the chemical-shift Hamiltonian to zero over the duration of the sequence, and caused an overall π phase flip, resulting in the negative echo signal shown.
Symmetric magic echo (SME) sequences are formed by combining several magic echoes around different axes [24]. By doing so, the effect of certain pulse imperfections can be significantly reduced, resulting in longer coherence times than simply repeating a single magic echo. SME sequences have been demonstrated to have excellent line-narrowing capabilities, and can be used for chemical-shift spectroscopy and Fourier-transform imaging by encoding static (B 0 ) and radio-frequency (B 1 ) field strengths. We tested the SME4 and SME16 sequences, both of which remove chemical-shift effects, by applying the sequences repeatedly and observing the signal decay. Fig. 5 shows data sets obtained by repeated applications of the SME4 and SME16 pulse sequences. We observed that the SME16 sequence performed slightly better than the SME4, with the longest spin coherence time (5.6 ms)  Normalized spin correlation vs. evolution time after sequential applications of an SME decoupling sequence. The evolution time includes the time during the OCT pulses. SME4 (τ = 25 µs) and SME16 (τ = 15 µs) shown. (Inset) observed for SME16 (τ = 15 µs). For all measurements, the value of τ was chosen to be longer than T * 2 to avoid spin-lock effects [25,26].

V. IMAGING
The long spin evolution periods within the magic echo sequence are ideal for imaging. Spatial information can be encoded by applying B 0 gradient pulses during the two laboratory-frame evolution periods lasting 2τ , and B 1 gradients during the rotating-frame evolution lasting 4τ . In this experiment, we offset the rotary evolution times, while fixing the total rotary evolution period to be 4τ , to encode spatial information using the large B 1 inhomogeneity produced by the CFFGS.
To test the high-resolution imaging capabilities of our system, we applied an SME4 sequence with the RE portion modified for B 1 encoding (Fig. 6a). With the modified RE, the spins evolve in the B 1 gradient for a time τ e = 8∆T within a single SME4 sequence, where ∆T is the offset introduced to each RE. To verify that the introduction of an offset into the RE did not degrade the refocusing properties of the SME4, we applied two backto-back modified SME4 sequences, changing the sign of ∆T between the SME4 sequences to undo the evolution in the B 1 gradient. We compared the spin signal after applying the two modified SME4 sequences to the spin signal after applying two SME4 sequences that were not modified. The magnitude of the signal was the same for the two cases, confirming that the modified RE maintained the refocusing properties of the original SME4.
For the imaging measurements, a 7.5-µs long OCT pulse was designed to target spins experiencing maximum Rabi frequencies from 0.9-1.75 MHz, which allowed us to accurately measure the spins closest to the constriction. These new pulses were used in the SME4 sequence to collect the imaging data shown in Fig. 6. To image the one-dimensional spin distribution as a function of Rabi frequency, we incremented ∆T and measured the corresponding spin correlation. Fig. 6b shows the correlation as a function of τ e . To find the distribution of spins as a function of ω 1 , we performed a cosine transform on the time-domain data (Fig. 6c). Note that the frequency resolution of the green data (25 kHz) is much finer than that seen in Fig. 2b (100 kHz), which was taken without the benefit of the SME4 sequence.
To convert the frequency-domain data to a function of position, we used the simulated magnetic field data shown in Fig. 2a. We took the B 1 field at each vertical position to be the average over a 50 × 50 (nm) 2 region in the x and y dimensions. Since the spin sensitivity of our measurement depended on the gradient of B 1 , we also used this magnetic field simulation to convert the measured spin correlation into a number of detected spins, giving the spin density plot shown in Fig. 6d. To validate the image reconstruction, we physically retracted the tip by 10 nm, from 55-nm to 65-nm above the surface of the constriction. After doing so, the calculated spin density reconstructed 10-nm further above the surface (Fig. 6d), confirming our model of the magnetic fields produced by the constriction.
Finally, by assuming an average SiNW radius of 27 nm, and a rotationally symmetric distribution of polystyrene on the circumference of the SiNW, we reconstructed a one-dimensional image of the polystyrene coating on the tip of the SiNW, shown in Fig. 6e. a Average gradients in a region 50 -80 nm above the constriction produced by applying 50-mA-pk (B1-encoding) and 70-mA-pk (B0-encoding) current through the CFFGS. b Maximum encoding times for an SME16 are 64τ (B1-encoding) and 32τ (B0-encoding). c Spatial encoding limits are calculated as λ = π/γ ∂B τe where γ is the proton gyromagnetic ratio, ∂B is the average encoding gradient, and τe is the maximum encoding time.

VI. CONCLUSION
We have demonstrated the ability to encode the onedimensional spin distribution with an average resolution of 2 nm over a 30-nm vertical region. To achieve this resolution, we used only a small fraction of the total encoding time available to us. Table I shows the available encoding times during a single SME16 (τ = 15 µs). We see that with the gradients and encoding times available to us, it is possible to encode spins with sub-Angstrom resolution.
This capability opens up the possibility for NMR "diffraction" measurements, first proposed by Mansfield and Grannell in 1973 [27]. In one dimension, NMR diffraction works by matching the encoding wavenumber k = γGτ e to an integer multiple of the reciprocal lattice vector 2π/a, where a is the lattice constant, γ is the spin gyromagnetic ratio, G is the magnetic field gradient, and τ e is the encoding time. The extension to three dimensions is straightforward. By analogy to Bragg scattering, in NMR diffraction a coherent signal is produced from a macroscopic ensemble of spins only if the Bragg condition is satisfied.
Atomic scale NMR diffraction has not been demonstrated, because until now the combination of long coherence times and the large magnetic field gradients necessary to encode spins at the Angstrom scale was not available. NMR diffraction would provide a fundamentally new way to study the atomic structure of crystalline solids by providing chemically-specific structure factor in-formation. In particular, NMR diffraction could be very useful for the study of biological systems that possess periodic structures.
In this work, we have used the intense magnetic fields produced by a CFFGS in combination with OCT pulses that incorporate AHT to realize high-fidelity spin control in nanometer-scale ensembles of nuclear spins. By combining this capability with high-sensitivity spin detection, we have demonstrated high-resolution solid-state spectroscopy and imaging in a system of proton spins with strong spin-spin interactions. We believe that the combination of these capabilities is very powerful and will enable new approaches in nano-MRI.

Supplementary Information for High-Resolution Nanoscale Solid-State Nuclear
Magnetic Resonance Spectroscopy

I. NANOWIRE MODE SPLITTING, PASSIVATION, AND SAMPLE ATTACHMENT
We used silicon nanowires (SiNWs) grown by the vapor-liquid-solid method with drop-cast gold nanoparticle catalysts. The nanowires were grown to be about 20-µm long, with a slight taper at the base, and a tip diameter of roughly 120 nm. The wires' radial symmetry makes the two lowest-frequency flexural modes nearly degenerate. The SiNW used for the experiment initially had fundamental modes f 1,2 = 617.7 kHz, 618.6 kHz with Q = 23, 000 for both modes at room temperature. For the spin measurements, we needed to couple to only one of the flexural modes. Therefore, we developed a procedure to separate the frequencies of the fundamental modes by removing radial symmetry.
Some of the processes required the SiNW to be exposed to high temperature. To prevent diffusion of the Au catalyst, we first used an ion mill to remove the Au catalyst by orienting the nanowire tips towards the ion source. We used a bias voltage of 100 V because the low voltage increases the milling selectivity of Au over Si to about 10:1. We had previously calibrated the milling rate, and confirmed with a scanning electron microscope that the gold particles were fully removed. Fig. S1a shows a diagram of the cross section of the nanowire after the Au catalyst removal.
Next, we used a electron-beam evaporator to directionally deposit high-purity silicon on the top and bottom of the nanowire (Fig. S1b). We deposited approximately a 20-nm thick Si layer on both the top and bottom of the SiNW at a rate of 0.2Å/s. By depositing equal amounts of Si on both sides, we prevented the nanowire from warping due to differential stress produced by the amorphous Si layer. This deposition step removed the nanowires radial symmetry, separating the frequencies of the two lowest-frequency flexural modes. However, the amorphous Si layer dramatically lowers the quality factor to Q ≈ 1, 000.
To improve the quality factor, we next grew oxide to consume the amorphous evaporated silicon. We used a tube furnace with 1 atm pure oxygen at 900 • C to consume approximately 60 nm of Si from the diameter of the SiNW (Fig. S1c). Finally, the thermal oxide was removed using vapor hydrofluoric acid, which very selectively etches SiO 2 but not Si, leaving behind a smaller, crystalline, assymetric nanowire (Fig. S1d). The resultant nanowire had tip dimensions of about 50×80 nm, f 1,2 = 362 kHz, 423 kHz with Q ≈ 7, 500 for both modes at room temperature.
Next, we grew and passivated a thin thermal oxide layer on the SiNW to try to reduce the concentration of dangling bonds and defects on the surface. Oxide quality is improved at higher growth temperatures. To grow a thin layer of oxide at high temperature, we preheated a furnace tube with 1 atm O 2 to 1000 • C, then gradually inserted the silicon nanowire chip over 30 s, left it in the furnace for 20 s, and gradually removed it over another 30 s. Based on calibration measurements made on Si wafers, this procedure should have grown about 3-4 nm of silicon oxide. Immediately after the oxide growth, we moved the nanowire chip to another tube furnace and annealed it for 2 hours at 300 • C in 1 atm of 5% H 2 /Ar forming gas.
To attach the polystyrene sample, we dissolved 5200 MW polystyrene in diethyl phthalate, which has a very low

Amorphous Silicon Evaporated
Thermal Oxide Grown vapor pressure (0.002 mmHg at room temperature). We coated the tip of a micropipette with a small drop of the solution, and used high precision manual translation stages under an optical microscope with a long working distance to carefully insert the tip of the target nanowire approximately 1-2 µm into the polystyrene solution. Using a scanning electron microscope would degrade the polystyrene, therefore, initial confirmation of polystyrene attachment was performed by measuring the nanowire resonance frequencies. The polystyrene mass-loads the nanowire, lowering the resonance frequencies to f 1,2 = 313 kHz, 367 kHz at room temperature. Final confirmation that there is polystyrene on the last 100 nm of the nanowire, where we can measure it, comes from the magnetic resonance images of the spin distribution.

II. MODIFIED MAGGIC PROTOCOL
We measure the √ N fluctuations in an ensemble of ∼ 10 6 proton spins. Because the mean spin signal is zero, we measure the average correlation between two consecutive measurements, where the duration of each measurement is chosen to be much shorter than the correlation time of the statistical fluctuations. The magnetic resonance pulse sequence is applied between the two spin-measurement blocks.
For all the measurements described in this paper, we measured the spin signal using a modified version of the MAGGIC protocol described by Nichol et al. in 2012 [1]. The overall pulse scheme, shown in Fig. S2a, consisted of a readout period, during which the MAGGIC protocol was applied at maximum amplitude, and an encoding period during which the MAGGIC protocol was off. The encoding sequence consisted of an SME sequence or other spin manipulations neccessary for a particular measurement. The MAGGIC protocol was ramped on and off exponentially with time constant τ = 2Q ω0 = 8.1 ms, where ω 0 = 2π × 315 kHz is the angular frequency of the fundamental resonance mode of the nanowire, and Q = 8000 is the quality factor of the fundamental mode. There is electrostatic coupling between the SiNW and the electric fields that drive the gradient modulation currents through the CFFGS. The exponential ramp allows the nanowire to ring up and down gradually when these fields are turned on and off, minimizing transients. Adiabatic spin inversions were not applied during the ramp times to minimize the decay of the statistical spin correlations.
To minimize the electrostatic coupling between the SiNW and the gradient modulation electric fields, we applied a DC bias voltage to the CFFGS to null the static charge on the tip of SiNW resulting from a potential difference between the SiNW and the CFFGS. We found that the electrostatic coupling was minimized by applying a potential FIG. S3. Contour map of dBz/dx gradients (T/m) calculated for 70-mA-pk current passing through the constriction. These gradients are generated at the nanowire resonance frequency during the readout periods of the MAGGIC protocol, and are those referred to as the gradient modulation and denoted by G( r) in the equations below. The center of the top surface of the constriction is at the position (0,0). difference of ∼ 0.6 V between the CFFGS and the SiNW. In addition, feed-forward cancellation was used to minimize the sidebands caused by the gradient modulation [1]. We experimented with the exact timings for this sequence, and settled on a ramp time of τ ramp = 30 ms, and a readout time of τ 0 = 80 ms. The encoding time τ e was varied between tens of microseconds and several milliseconds depending on the desired encoding sequence. Fig. S2b shows the MAGGIC protocol block, consisting of two short hyperbolic secant adiabatic full passages (AFPs), and two periods of sinusoidal gradient modulation at the nanowire resonance frequency f 0 . The second gradient-modulation block turns on with the same phase as the first at time t = (n + 1 2 )t 0 , where n is a positive integer (we used n = 63), and t 0 is the period of the nanowire resonance. This ensures that there is no Fourier component of the gradient modulation at the nanowire resonance frequency.
Following the MAGGIC protocol theory developed by Nichol et al. [1], during the readout periods, a single spin produces a force at the nanowire resonance frequency f 0 with average amplitude F 0 = µDG. µ is the magnetic moment of the spin, D is the duty cycle of the f 0 gradient waveform, and G = dBz dx is the peak magnetic field gradient at the location of the spin (Fig. S3). The longitudinal correlation of each spin is Markovian, described by the random telegraph function h(t) ∈ {±1}, with the correlations ∞ 0 h(t )h(t + τ )dt = e −τ /τm . In our measurements we determined τ m = 0.6 s. Now let us consider the correlation function for a single spin at position r. The signal obtained from such a spin will be the average correlation of theẑ spin measurement made before and after the encoding period: where ... refers to the average of an ensemble of measurements, and where U ( r) is the unitary transformation applied to a spin at position r during the encoding time τ e . In writing SI, we have assumed that the encoding time is much shorter than the longitudinal relaxation time (τ e T 1 ). We will consider how the random telegraph noise for a single spin affects the correlation measurement.
Now we can perform the integration: Returning to (S1), the average correlation for a single spin is given by: We treat each spin in the measurement ensemble as statistically independent. This allows us to express the total correlation as: where ρ( r) is the spin density and the integration is done over the measured volume.

III. RABI MEASUREMENT
To measure the distribution of Rabi frequencies over our spin sample, we applied hard pulses aroundx to the spins. This produced a field-dependent rotation, giving the unitary transformation: where t p is the duration of the pulse, and ω 1 ( r) = γB x ( r)/2 is the position-dependent Rabi frequency defined in the main text. It may be seen in Fig. S4 that within our sample volume B y ( r) is at least an order of magnitude smaller than B x ( r). Since B 1 ( r) = B 2 x ( r) + B 2 y ( r) 1/2 /2 and B 2 y ( r) B 2 x ( r) we used the simplifying approximation that B 1 ( r) = B x ( r)/2.
We define: To avoid dephasing errors, we kept t p constant while varying the amplitude of ω 1 ( r). However, we were interested in the distribution of Rabi frequencies for the maximum applied current through the constriction (50 mA). Therefore we define new variables, the position-dependent Rabi frequency at maximum current: ω 1max ( r), and an effective time t e such that: (S10) For the initial Rabi distribution measurement shown in Fig. 2b, we took steps corresponding to a t e increment of 0.25 µs from t e = 0 µs to t e = 5 µs, and measured the spin correlation at each step. We performed a discrete cosine transform (DCT-I) on these data to produce the spin correlation as a function of ω 1max ( r). The transformed data correspond to 0.1 MHz steps of ω 1max /2π from 0 -2.0 MHz.

IV. PULSE SEARCH
Pulse search techniques described in [2] provided us with a capability to simultaneously engineer arbitrary unitary operations and selectively suppress evolution under unwanted system Hamiltonian terms by averaging their respective Average Hamiltonian Theory (AHT) [3] terms to zero. Crucially, we could do this over a range of control parameters (e.g. Rabi frequencies and resonance offsets) and in the presence of known deterministic distortions to the control sequence. For this work we engineered control sequences that implemented π/2 spin rotations and suspended the evolution under homonuclear dipolar interactions (D = σ.σ − 3 σ z ⊗ σ z ) and σ z Hamiltonians. Accordingly, our pulse optimization target function combined three terms: fidelity to the target unitary, V = exp −i π 2 σx 2 , and the operator norms of zeroth order AHT terms for σ z and dipolar Hamiltonians.
We parametrized our control Hamiltonian as where we took the amplitude modulation function Ω(t) to be a positive unitless function 0 ≤ Ω(t) ≤ 1, and the phase modulation function φ(t) to satisfy −π ≤ φ(t) ≤ π. The Rabi strength parameter γ took values between 2πν min and 2πν max , with ν min and ν max being respectively the minimum and maximum targeted Rabi strengths. Over a period 0 ≤ t 1 ≤ t the control Hamiltonian H ctrl (t, γ) generates a unitary where T denotes the time ordered exponential. Zeroth-order AHT prescribes setting the following integrals to zero: γ)] to suspend the dipolar interaction and I σz (γ) = t 0 dt 1 U † (t 1 , γ).σ z .U (t 1 , γ) to eliminate chemical shift and resonance offset effects for a particular γ value. The three quantities minimized during the pulse engineering are the unitary metric F U (γ), dipolar metric F D (γ), and σ z metric F σz (γ), defined as . (S15) The quantities (S13), (S14) and (S15) only take values between 0 and 1 and are identically zero for a control sequence , Ω(t) sin [φ(t)]} which for a particular γ yields U (t, γ) = V while setting I D (γ) and I σz (γ) to zero. Following the method in [2,4] we divided the pulse length t into intervals of equal length ∆t = t N , N being the number of steps, and work with piecewise constant controls {Ω x (t) , Ω y (t)} = {Ω (i) x , Ω (i) y } for i∆t ≤ t < (i + 1)∆t and i ∈ {0, 1, ..., N − 1}. Consequently, the pulse metrics F U (γ), F D (γ) and F σz (γ) became functions of {Ω (i) x , Ω (i) y } and γ, with their partial derivatives, e.g.
y }, γ , easily evaluated by matrix exponential methods introduced in [2]. We minimized the cost function using a standard gradient descent algorithm, Γ being a set of fourteen γ values distributed roughly uniformly over the Rabi range 2πν min and 2πν max . The relative weights for the individual pulse metrics in (S16) were picked so to give approximately equal minimization rates for F U , F D and F σz . All searches were started with Ω and Ω (i) y drawn from independent uniform pseudorandom distributions over a range − 1 We kept increasing the pulse length until we observed sufficiently rapid convergence speeds. The final pulse for ν min = 0.6 MHz, ν max = 1.2 MHz turned out to be 13-µs long with N = 522, and for ν min = 0.9 MHz, ν max = 1.75 MHz we found a 7.5-µs long pulse with N = 360.
Furthermore, to minimize transients, we fixed the first and last five steps of the pulse to have zero amplitude. We also incorporated a combined transfer function of a numerical bandpass filter and experimentally determined amplitude and phase transfer functions to account for pulse distortions due to our radio frequency (RF) electronics. This procedure is described in the next section. The resulting waveform and plots of the individual metrics for the 13-µs long pulse are plotted in Fig. S5.

V. TRANSFER FUNCTION
We verified that our RF electronics exhibited a linear response over the range of output power used in our measurements. Therefore all distortions to the pulse due to the electronics could be characterized by the amplitude and phase transfer functions of the system, A(f ) and φ(f ), respectively. We measured A(f ) and φ(f ) of our electronics over a frequency range of 40 MHz ≤ f ≤ 70 MHz which are presented in Fig. S6. We used this transfer function in conjunction with a numerical bandpass filter to limit the frequency range of the waveform by multiplying A(f ) with a numerical bandpass filter A filter (f ) given by the function where f 0 denotes the carrier frequency and ∆f the bandwidth of the filter. This ensured smooth cut-off of frequencies outside of f 0 − ∆f 2 , f 0 + ∆f 2 . The numerical filtering was done to ensure that the spectral range of the pulse remained in the region where the amplitude transfer function was relatively flat and the phase transfer function relatively linear, thereby reducing distortions to the waveform caused by errors in determining the transfer function. We used f 0 = 48 MHz and ∆f = 10 MHz yielding a numerical amplitude transfer function shown in Fig. S6, while the limited spectral range of the 13-µs pulse can be seen in Fig. S5c.

VI. SIMULATIONS
To demonstrate the necessity for including AHT into our optimization we have compared the performance of the 13-µs pulse presented in Fig. S5 with a pulse optimized without the averaging targets. Despite the latter being roughly half the length of the averaging pulse it performed substantially worse in numerical simulations of an eight spin network. For the simulations, we evaluated the Hamiltonian evolution of eight spins under dipolar couplings and various Rabi strengths and resonance offsets; these simulations provided a good way to assess the performance of pulses, since the effects of experimental distortions and transients was totally removed, and the ability to change the strength of various Hamiltonian terms enabled us to easily discern between different contributors (e.g. Rabi dispersion, resonance offesets/chemical shifts, dipolar evolution) to pulse performance. This benefit did come with a Unitary metric FU (γ) defined in (S13) as a function of Rabi strength parameter γ, it can be seen that the pulse targets the range 2π · 0.6 MHz to 2π · 1.2 MHz. (e) Dipolar metric FD(γ) defined in (S14) as a function of Rabi strength parameter γ. (f) σz metric Fσ z (γ) defined in (S15) as a function of Rabi strength parameter γ.  severe limitation to the system size, hence we could only expect the coherence times found in eight spin simulations to put an upper bound to the experimental results.
• Single Spin Simulations: We performed single spin and eight spin simulations when analysing our pulses, which were represented as two-dimensional arrays {Ω x , Ω (i) y }, γ i , Γ being the same set of fourteen γ values used in Section IV. As before, we forced the beginning and the end of the pulse to have zero amplitude and used the same transfer function as for the search in Section IV. This time we terminated the minimization only once the performance of the pulse without any dipolar effects nor resonance offsets mimicked the performance of the 13-µs pulse. To determine that we evaluated σ z I (n), given by (S19), for 250 consecutive applications of either pulse, the results are displayed in Fig. S8. We use this kind of termination condition rather than matching the F U (γ) curves for the pulses to ensure that the eight spin simulations comparing the pulses will highlight errors resulting from evolution under the dipolar and σ z Hamiltonians rather than incoherence from the Rabi dispersion. Analogously to the search in Section IV, we increased the pulse length until we found a pulse satisfying our criteria. The pulse found was 7-µs long with N = 292. Its waveform and metrics are presented in Fig. S10. It can be seen that while the unitary metric F U (γ) for the shorter pulse is actually somewhat better than the one for the averaging pulse, its dipolar and σ z metrics F D (γ) and F σz (γ) are substantially worse.
Finally, we evaluated 2J z VIII given by (S22) for the pulses presented in Figs. S5 & S10 for 250 back-to-back applications of the pulses (Fig. S9). It can be seen that the pulse without AHT terms performs about 7.1 times worse despite being roughly half as long. Importantly, the performance of the 7-µs pulse is sufficiently bad that it would not have enabled the experiments discussed in the main text. Separate simulations revealed that the signal decay for the 7-µs pulse was mostly due to dipolar evolution; the removal of the resonance offsets increased the effective e −1 time by a factor of 1.3, while the removal of resonance offsets for the 13-µs pulse increased the e −1 time by a factor of 1.5.