Resolving Phonon Fock States in a Multimode Cavity with a Double-Slit Qubit

We resolve phonon number states in the spectrum of a superconducting qubit coupled to a multimode acoustic cavity. Crucial to this resolution is the sharp frequency dependence in the qubit-phonon interaction engineered by coupling the qubit to surface acoustic waves in two locations separated by $\sim40$ acoustic wavelengths. In analogy to double-slit diffraction, the resulting self-interference generates high-contrast frequency structure in the qubit-phonon interaction. We observe this frequency structure both in the coupling rate to multiple cavity modes and in the qubit spontaneous emission rate into unconfined modes. We use this sharp frequency structure to resolve single phonons by tuning the qubit to a frequency of destructive interference where all acoustic interactions are dispersive. By exciting several detuned yet strongly-coupled phononic modes and measuring the resulting qubit spectrum, we observe that, for two modes, the device enters the strong dispersive regime where single phonons are spectrally resolved.

Quantum control over mechanical degrees of freedom promises insight into fundamental physics as well as the development of innovative quantum technologies. As mechanical resonators are massive and macroscopic, they can probe quantum theories at large scales [1][2][3], while the ability of mechanical motion to couple to a variety of quantum systems has inspired numerous mechanics-based transduction schemes [4][5][6][7][8][9][10]. Additionally, mechanical elements are compact compared to their electromagnetic counterparts, enabling the on-chip fabrication of many-wavelength microwave structures such as high-performance filters and multimode resonators [11][12][13]. High-fidelity control over the large number of modes achievable in acoustic platforms would be a powerful resource for quantum information processing [14].
The field of circuit quantum electrodynamics (cQED) has provided both guidance and tools for achieving quantum control over mechanical excitations. In cQED, the state of a photonic mode is measured and manipulated using superconducting qubits. These qubits can also interact with mechanical systems using piezoelectric materials. Two seminal works leveraged this fact to couple a qubit to a dilatational resonator [15] and to propagating surface acoustic waves (SAWs) [16]. Both surface and bulk acoustic waves can be confined to form high-overtone resonators [17,18], leading to demonstrations of qubit-phonon coupling in multimode cavities [19][20][21][22][23]. Most recently, a pair of experiments used resonant interactions to create number and superposition states of an acoustic cavity mode, thereby demonstrating basic quantum control of acoustic phonons [3,24]. Following the example of cQED, achieving strong dispersive interactions in acoustic systems would lead to improved quantum control through quantum nondemolition phonon measurement [25,26] and qubit-mediated phonon-phonon interactions [27,28]. Realizing these techniques in acoustic systems would enable the exploration of a multimode analogy of cQED [29].
* lucas.sletten@colorado.edu
But coupling a single qubit with uniform strength to multiple modes of an acoustic cavity reduces the number of coherent interactions achievable with a given mode. Consider that for any qubit frequency inside the cavity bandwidth, there exists some nearest mode $k$ with detuning $\Delta_k$ less than half the cavity's free spectral range $f_s$. To be in the dispersive limit for all modes, the qubit must have coupling $g \ll \Delta_k < f_s/2$. This limited coupling then bounds the number of operations possible within the qubit's coherence time $(2\pi\gamma)^{-1}$ at approximately $g^2/(\Delta_k \gamma)$. For the $n$th cavity mode, whose detuning is roughly $|n-k| f_s$, the number of interactions is further reduced by a large factor $\approx |n-k| f_s/\Delta_k$. Eschewing the dispersive limit by choosing $g \approx f_s$ yields strong, resonant interactions between the qubit and multiple cavity modes. The resulting hybrid modes are composed predominantly of linear cavity modes, effectively diluting the qubit's nonlinearity and thereby increasing the time required for coherent operations [21]. This reduction in coherent operations can be overcome by engineering a frequency-structured interaction such that modes far from the qubit frequency couple with rates exceeding $f_s$, while the coupling to nearby modes is suppressed, preserving the dispersive limit.
Indeed, acoustic platforms excel at realizing such strongly frequency-dependent couplings. In SAW devices, an interdigitated transducer (IDT) converts between electrical and acoustic signals with a frequency dependence determined by the Fourier transform of the IDT geometry [12,30]. A desired frequency response can be engineered by computing its inverse Fourier transform and shaping the IDT accordingly. Moreover, the slow speed of sound (v = 2880 m/s on GaAs) implies that megahertz frequency resolution can be realized with millimeter geometries.
In this article, we engineer a frequency-dependent coupling between a transmon qubit and a multimode SAW cavity to realize $g \sim f_s$ together with dispersive operation. The qubit couples to phonons through an IDT that is bisected to create a pair of interaction regions separated by a long travel time, $\tau \approx 9$ ns [Fig. 1(a)]. In close analogy to double-slit interference, the many-wavelength separation between interaction regions creates sharp fringes in the frequency dependence of the qubit-phonon interaction strength [30,31]. We observe the designed frequency dependence as a high-contrast modulation of both the coherent exchange rate between the qubit and cavity modes and the qubit spontaneous emission rate into unconfined phonons. This frequency dependence greatly reduces the coupling to certain modes to create frequency windows for dispersive operation. We tune the qubit transition to such a window and observe the single-phonon Stark shift from three strongly coupled modes of the cavity by populating these modes while measuring the qubit spectrum. For two of these modes, we enter the strong dispersive regime where the single-phonon Stark shift exceeds the qubit and acoustic linewidths, demonstrating that spatially extended coupling can be leveraged to take full advantage of multimode acoustic systems.
The device we study comprises a tunable transmon qubit on a piezoelectric GaAs surface with two IDT halves embedded in a multimode SAW cavity [Fig. 1(a)]. The cavity is formed between two Bragg mirrors made of aluminum strips that reflect surface waves over a 100-MHz bandwidth to form a phononic Fabry-Perot cavity [Figs. 1(b) and 1(c)]. The effective cavity length extends beyond the mirror separation $L = 125$ µm by 20 µm from acoustic penetration into the mirror array to create a mode spacing $f_s \approx 10$ MHz. The mirrors and IDT were designed with periodicity $\lambda_c = 675$ nm, which corresponds to a center frequency near $f_c \approx 4.25$ GHz. The IDT halves, each 8 periods long, are mirror images of each other reflected across the center of the cavity and separated by $S = 26$ µm. The mechanical loading effects on the resonator from the IDT are minimized by using thin metal (30 nm of aluminum) and a split electrode design [Fig. 1(d)] [11]. Qubit readout and control are enabled by attaching antenna paddles [Fig. 1(e)] that strongly couple the qubit to a copper waveguide cavity at 5.9 GHz (see Appendix A).
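The design numbers quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the usual standing-wave relation $f_s = v/(2L_{\mathrm{eff}})$ for the mode spacing and $\tau = S/v$ for the delay between IDT halves; the function names are illustrative and do not come from the original work.

```python
# Design values quoted in the text, in SI units.
SAW_SPEED = 2880.0           # v on GaAs, m/s
PERIODICITY = 675e-9         # lambda_c of mirrors and IDT, m
MIRROR_SEPARATION = 125e-6   # L, m
PENETRATION = 20e-6          # added effective length from mirror penetration, m
IDT_SEPARATION = 26e-6       # S, m


def center_frequency(speed, wavelength):
    """Frequency whose acoustic wavelength matches the periodicity."""
    return speed / wavelength


def mode_spacing(speed, effective_length):
    """Free spectral range of a standing-wave cavity (assumed v / 2L)."""
    return speed / (2.0 * effective_length)


def delay_time(speed, separation):
    """Acoustic travel time between the two IDT halves (assumed S / v)."""
    return separation / speed


f_c = center_frequency(SAW_SPEED, PERIODICITY)
f_s = mode_spacing(SAW_SPEED, MIRROR_SEPARATION + PENETRATION)
tau = delay_time(SAW_SPEED, IDT_SEPARATION)
separation_in_wavelengths = IDT_SEPARATION / PERIODICITY

print(f"f_c = {f_c / 1e9:.3f} GHz")
print(f"f_s = {f_s / 1e6:.2f} MHz")
print(f"tau = {tau * 1e9:.2f} ns, 1/tau = {1e-6 / tau:.1f} MHz")
print(f"S   = {separation_in_wavelengths:.1f} acoustic wavelengths")
```

Under these assumptions the results land close to the quoted $f_c \approx 4.25$ GHz, $f_s \approx 10$ MHz, and $\tau \approx 9$ ns, and the separation comes out near the roughly 40 wavelengths mentioned in the abstract.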
An IDT split in half achieves a mode-selective coupling by creating a frequency profile $A(f)$ analogous to the spatial profile of double-slit diffraction. The IDT is split into two regions of length $D$ separated by distance $S$. The Fourier transform about the symmetry point between these two regions is real and the product of two factors: a slow sinc envelope centered on $f_c$ with period $v/D$ and a fast sinusoidal modulation with period $1/\tau = v/S$:

$$A(f) = \mathrm{sinc}\!\left[\frac{\pi D (f - f_c)}{v}\right] \sin(\pi f \tau).$$

Outside the mirror bandwidth, the qubit loses energy to propagating phonons at a rate $\Gamma_1 \propto A(f)^2$ [Fig. 1(f)] [21,30]. Within the mirror band, the qubit exchanges excitations with confined acoustic modes [Fig. 1(g)], as described by a multimode Jaynes-Cummings Hamiltonian of the standard form

$$H/h = \frac{f_q}{2}\sigma_z + \sum_m f_m a_m^\dagger a_m + \sum_m g_m \left(\sigma_+ a_m + \sigma_- a_m^\dagger\right), \qquad (1)$$

where the qubit is described by Pauli matrices and transition frequency $f_q$, the cavity modes are described by annihilation (creation) operators $a_m$ ($a_m^\dagger$) and frequencies $f_m$, and the qubit and cavity couple with strength $g_m$. If the IDT is symmetric about the cavity center, then $g_m$ has the form $g_m = g_0 A(f_m) \approx g_0 \sin(\pi f_m \tau)$, where $g_0$ is the maximal qubit-cavity coupling strength and the slowly varying sinc is approximated as unity. With the designed separation between IDT halves, the coupling varies with a periodicity approximately equal to the mirror bandwidth ($1/\tau \approx 100$ MHz), ensuring at least one cavity mode achieves coupling near $g_0$.

FIG. 1. [...] (e) Antenna paddles couple the qubit to a copper waveguide cavity at 5.9 GHz for readout and control. (f) When the qubit is tuned outside the mirror bandwidth (unshaded), the IDT launches phonons with a frequency dependence determined by the Fourier transform of the IDT geometry, creating fringes in the qubit loss rate $\Gamma_1$. (g) Inside the mirror band, the IDT modifies the qubit coupling strength $g_m$ to the evenly spaced cavity modes. By tuning the qubit frequency to a zero in the coupling at $f_z$, coupling rates exceeding the mode spacing can be achieved with dispersive operation, provided the slope near $f_z$ (red dotted line) is much smaller than one.
Dispersive operation can be achieved regardless of mode density or maximal coupling strength by designing $A(f)$ to cross zero with a sufficiently shallow slope. Consider tuning the qubit to a frequency $f_z$ such that $A(f_z) = 0$. A cavity mode with small detuning $\Delta_z$ from the qubit will couple with rate $g_z$ that is bounded above by this detuning multiplied by the slope of the coupling,

$$g_z \lesssim g_0 \left|A'(f_z)\right| \Delta_z.$$

Thus, the magnitude of $A'(f_z)$ constrains $g_z/\Delta_z$ and thereby sets a lower limit on how dispersive qubit-cavity interactions can be. We engineer this slope, approximated for the split-IDT design as $g_0 A'(f_z) \approx \pi g_0 \tau = 0.14$, to be much smaller than one.
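The slope argument can be checked numerically. The sketch below assumes the simplified coupling profile $g(f) = g_0 |\sin(\pi f \tau)|$ quoted in the text (sinc envelope taken as unity) and uses the values $g_0 = 5.1$ MHz and $\tau = 9.04$ ns reported later in the paper. The helper names and the sample detunings are hypothetical, chosen only for illustration.

```python
import math

G0 = 5.1e6      # maximal coupling g_0, Hz (value quoted in the text)
TAU = 9.04e-9   # intra-IDT delay, s (fit value quoted in the text)


def coupling(f, g0=G0, tau=TAU):
    """|g(f)| = g0 |sin(pi f tau)|, with the sinc envelope set to one."""
    return g0 * abs(math.sin(math.pi * f * tau))


def nearest_zero(f_guess, tau=TAU):
    """Coupling zero closest to f_guess, i.e. f * tau equal to an integer."""
    return round(f_guess * tau) / tau


f_z = nearest_zero(4.318e9)
slope = math.pi * G0 * TAU   # g0 * |A'(f_z)| for the sine profile

print(f"zero near {f_z / 1e9:.4f} GHz, slope = {slope:.3f}")
for detuning in (5e6, 10e6, 20e6):   # illustrative detunings, Hz
    ratio = coupling(f_z - detuning) / detuning
    print(f"detuning {detuning / 1e6:4.0f} MHz: g/Delta = {ratio:.4f}")
```

Because $\sin x \le x$ for $x \ge 0$, every ratio $g/\Delta$ stays at or below the slope $\pi g_0 \tau$. The zero located this way depends sensitively on the rounded value of $\tau$, so it should only be expected to land within a few megahertz of the quoted operating point.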
To confirm the designed frequency structure in the device, we measure the qubit spectrum as an applied magnetic flux tunes its frequency. We begin by tuning the qubit across the mirror bandwidth to investigate the frequency region where phonons are confined. We observe pronounced avoided crossings in the qubit spectrum where the qubit coherently exchanges energy with cavity phonons [Fig. 2(a)]. The extracted coupling rates [Fig. 2(b), see Appendix C] vary between the modes, with several strongly coupled modes in close spectral proximity to crossing-free regions wider than $f_s$ at both edges of the mirror band. Three main effects explain the observed behavior. First, the split IDT modulates the coupling in proportion to $A(f)$, coupling the qubit strongly to modes near 4.25 GHz with $g_0 = 5.1$ MHz while decoupling it from modes roughly $5 f_s$ above or below. Second, neighboring cavity modes couple to the qubit with alternating strength because the qubit IDT is approximately symmetric about the cavity center, strongly (weakly) coupling the qubit to modes with even (odd) spatial symmetry. Lastly, resonant exchange between the qubit and cavity modes at the edge of the mirror bandwidth is unresolved as the coupling rate is much less than the loss rate of these weakly confined modes.
FIG. 2. [...] The coupling strengths to each mode are extracted from the measured crossings. The qubit couples more strongly to modes at the center of the mirror band with maximum strength $g_0 = 5.1$ MHz. The IDT is symmetric about the cavity center, resulting in strong coupling to even modes (green) and weak coupling to odd modes (blue). Coupling to transverse modes (orange) is a factor of 5 smaller. An inference of the frequency dependence of the qubit coupling strength (gray dotted line) made from measurements of the qubit spontaneous emission rate (see main text) agrees well with the measured coupling rates.

To study $A(f)$ outside the mirror band, we tune the qubit over a 1-GHz span and examine the influence of propagating phonons on the qubit linewidth and transition frequency. In contrast to the discrete cavity modes, propagating modes form a continuum, enabling a dense sampling of $A(f)$ over a broad frequency range and affording a clear picture of how effectively the split IDT tailored the qubit-phonon interaction. In the measured qubit spectra [Fig. 3(a)], acoustic interactions are emphasized by subtracting the flux dependence expected from an acoustically uncoupled qubit (see Appendix B). At frequencies detuned from the central avoided crossings, the qubit linewidth oscillates with a period of 110 MHz that is consistent with the expected delay time and an amplitude that decays as the qubit tunes out of the IDT bandwidth. Additionally, the qubit frequency deviates from the uncoupled flux dependence with a similarly enveloped oscillation with matching 110-MHz periodicity. Both of these effects can be understood by modeling the qubit's emission of phonons from the IDT as a frequency-dependent resistance, which must be accompanied by a frequency-dependent reactance from Kramers-Kronig relations [11,12]. We observe this reactance as a modulation of the qubit frequency compared to its uncoupled flux tuning, an effect describable as a phononic Lamb shift [30,32]. We determine the qubit energy decay rate with increased precision by measuring the qubit excited-state lifetime ($T_1$) in the time domain. With the qubit far detuned from the acoustic cavity modes, we observe $\Gamma_1 = (2\pi T_1)^{-1}$ oscillating in frequency with large amplitude; the loss increases by a factor of 25 above its minimal value within a 55-MHz span [Fig. 3(b)]. A simple model that combines a prediction for the phonon emission rate from the IDT and a constant internal quality factor $Q_i$ closely fits the measured qubit loss rate, giving $Q_i = 1.2 \times 10^4$ and $\tau = 9.04$ ns (see Appendix D). The nulls in $\Gamma_1$ arise from destructive interference between the two IDT halves, an effect with close parallels to an atom interfering with its mirror image [33,34]. As the depth of these nulls is approximately uniform across the IDT bandwidth, phonon loss from imperfect destructive interference is less than 75 kHz.
Additionally, the extracted IDT parameters from the qubit loss rate can be used to calculate the frequency-dependent phononic Lamb shift, showing agreement with the measured qubit frequency [inset of Fig. 3(b)].
Our measurement of the qubit interaction with propagating modes also provides an independent inference of the interaction strength between the qubit and cavity modes. The best-fit model from Fig. 3(b) determines $A(f)$ using propagating modes and can be extended to frequencies inside the mirror band, where it closely follows the measured coupling rates [Fig. 2(b)].
Having characterized the qubit interaction with both confined and propagating phonons, we turn to resolving the qubit's Stark shift from individual cavity phonons. This resolution requires dispersive operation with all modes, i.e., $g_m \ll |\Delta_m|$ for all modes $m$, where $\Delta_m = f_q - f_m$, as well as a Stark shift that exceeds both the qubit and acoustic loss rates. Tuning the qubit to $f_z = 4.318$ GHz realizes dispersive operation; the least-dispersive interaction is with mode 7, where $\Delta_7/g_7 = 8.5 \gg 1$. In this multimode dispersive regime, the interaction term in the Hamiltonian [Eq. 1] becomes

$$H_{\mathrm{int}}/h = \sum_m \chi_m a_m^\dagger a_m \sigma_z,$$

where individual phonons shift the transmon frequency by $2\chi_m$. An accurate calculation of $\chi_m$ must include the higher levels of the transmon, and is well approximated as

$$\chi_m \approx \frac{g_m^2}{\Delta_m} - \frac{g_m^2}{\Delta_m + \alpha},$$

where $\alpha = -190$ MHz is the transmon's anharmonicity. With the qubit at $f_z$, its transition frequency is above the acoustic modes while the $|e\rangle \to |f\rangle$ transition is below, such that $\Delta_m > 0$ and $\Delta_m + \alpha < 0$ for all modes $m$. With this level ordering, the two terms in the expression for $\chi_m$ have the same sign and add constructively.
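To make the role of the level ordering concrete, the sketch below evaluates a dispersive shift under an assumption: the common two-term transmon approximation $\chi = g^2/\Delta - g^2/(\Delta + \alpha)$. Only $\alpha = -190$ MHz and the ratio $\Delta/g = 8.5$ come from the text. The coupling value `g_example` is a hypothetical number picked for illustration, not a measured device parameter.

```python
ALPHA = -190e6   # transmon anharmonicity, Hz (from the text)


def chi_terms(g, delta, alpha=ALPHA):
    """Return the two contributions to chi in the assumed two-term form.

    The first term comes from the g-e transition, the second from the
    e-f transition, which is detuned by delta + alpha.
    """
    term_ge = g**2 / delta
    term_ef = -g**2 / (delta + alpha)
    return term_ge, term_ef


def chi(g, delta, alpha=ALPHA):
    """Dispersive shift chi as the sum of both contributions."""
    return sum(chi_terms(g, delta, alpha))


# Hypothetical illustration: a coupling chosen so that delta / g = 8.5.
g_example = 3.2e6
delta_example = 8.5 * g_example

term_ge, term_ef = chi_terms(g_example, delta_example)
two_chi = 2.0 * chi(g_example, delta_example)

print(f"g-e term: {term_ge / 1e3:.0f} kHz, e-f term: {term_ef / 1e3:.0f} kHz")
print(f"single-phonon shift 2*chi = {two_chi / 1e3:.0f} kHz")
```

With $\Delta > 0$ and $\Delta + \alpha < 0$, both contributions are positive, so the shift is larger than the two-level estimate $g^2/\Delta$ alone.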
To populate a target cavity mode with phonons, we drive the qubit at a frequency far detuned from its own transition but resonant with the cavity mode [3]. In Fig. 4(a), spectroscopy shows the qubit transition at 4.318 GHz and, with much higher drive power, three acoustic resonances at lower frequencies. The measured qubit linewidth $\gamma = 550$ kHz is only marginally larger than the sum of contributions from $Q_i$, intrinsic dephasing, and expected power broadening (see Appendix E). The acoustic linewidths are measured to be $\kappa_m \approx 250$ kHz for all three modes, only slightly larger than the expected 200 kHz of diffraction loss from the flat-flat mirror design of the cavity [12,17,21].
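The linewidth budget mentioned here can be tallied explicitly. The sketch below uses the values reported in Appendix E and assumes the standard relation $1/T_2^* = 1/(2T_1) + 1/T_\varphi$ to estimate the intrinsic dephasing contribution; the function and dictionary names are illustrative.

```python
import math

F_Z = 4.318e9        # operating frequency, Hz (from the text)
Q_I = 1.2e4          # internal quality factor (from the text)
T1 = 415e-9          # energy relaxation time at f_z, s (Appendix E)
T2_STAR = 705e-9     # Ramsey coherence time at f_z, s (Appendix E)


def internal_loss_rate(frequency, quality_factor):
    """Frequency-independent-Q energy loss rate, f / Q, in Hz."""
    return frequency / quality_factor


def pure_dephasing_rate(t1, t2_star):
    """(2 pi T_phi)^-1 in Hz, assuming 1/T2* = 1/(2 T1) + 1/T_phi."""
    return (1.0 / t2_star - 1.0 / (2.0 * t1)) / (2.0 * math.pi)


# Contributions as rounded in Appendix E, in Hz.
budget = {
    "internal loss": 360e3,
    "intrinsic dephasing": 30e3,
    "drive Rabi rate": 100e3,
    "finite pulse duration": 50e3,
}

print(f"f_z / Q_i       = {internal_loss_rate(F_Z, Q_I) / 1e3:.0f} kHz")
print(f"(2 pi T_phi)^-1 = {pure_dephasing_rate(T1, T2_STAR) / 1e3:.0f} kHz")
print(f"budget total    = {sum(budget.values()) / 1e3:.0f} kHz")
```

The computed internal loss reproduces the quoted 360 kHz, and the dephasing estimate comes out in the low tens of kilohertz, comparable to the quoted 30 kHz given the rounding of $T_1$ and $T_2^*$.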
We measure the single-phonon Stark shift of the three strongly coupled modes by varying the population in these modes and measuring the qubit spectrum. A 3-µs drive pulse at $f_m$ creates a coherent state in mode $m$ with $\bar{n}_m$ average phonons [25]. The resulting Stark-driven qubit spectrum, measured with a spectroscopy pulse concurrent with the acoustic drive, consists of a sum of Lorentzians that each correspond to a phonon number state in the cavity [Fig. 4(b)]. These Lorentzians are spaced by $2\chi_m$ and broaden with higher phonon number in proportion to $\kappa_m$. As the drive power at one of the three modes is swept, the measured qubit spectrum broadens and shifts up in frequency. Crucially, several resolved peaks appear for modes 5 and 7, arising from a distribution of phonon Fock states in the cavity. To model the measurement, we assume the cavity occupation is Poissonian distributed and fit the average phonon number in each trace. We find good agreement between the model and measurement for acoustic linewidths $\kappa_{3,5,7} = 200, 250, 275$ kHz and single-phonon Stark shifts $2\chi_{3,5,7} = 500, 1050, 890$ kHz (see Appendix E). As the single-phonon Stark shifts for modes 5 and 7 exceed both the qubit and acoustic linewidths, we confirm that the device enters the strong dispersive regime for two acoustic modes.
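The strong-dispersive criterion stated above reduces to a pair of comparisons per mode. The following minimal sketch applies it to the fitted values quoted in the text; the data layout and function name are illustrative choices.

```python
GAMMA = 550e3   # measured qubit linewidth, Hz (from the text)

# Fitted acoustic linewidths and single-phonon Stark shifts, Hz.
MODES = {
    3: {"kappa": 200e3, "two_chi": 500e3},
    5: {"kappa": 250e3, "two_chi": 1050e3},
    7: {"kappa": 275e3, "two_chi": 890e3},
}


def is_strong_dispersive(two_chi, gamma, kappa):
    """True when the single-phonon shift exceeds both linewidths."""
    return two_chi > gamma and two_chi > kappa


resolved = [
    mode
    for mode, values in MODES.items()
    if is_strong_dispersive(values["two_chi"], GAMMA, values["kappa"])
]

for mode, values in MODES.items():
    ratio = values["two_chi"] / max(GAMMA, values["kappa"])
    print(f"mode {mode}: 2*chi / max(gamma, kappa) = {ratio:.2f}")
print("strong dispersive modes:", resolved)
```

Mode 3 fails the test only because its shift falls slightly below the qubit linewidth, which matches the statement that two of the three modes reach the strong dispersive regime.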
Resolving phonon Fock states in a multimode cavity through spatial engineering suggests multiple future directions. For the measured device, the dominant source of phonon loss was likely diffraction and could be eliminated by using curved reflectors to form a stable cavity [10]. Combining improved phonon lifetimes with the demonstrated coupling strengths would enable quantum nondemolition phonon detection and qubit-mediated interactions between phonon modes. Furthermore, the number of modes accessible to the qubit can be increased simply by elongating the cavity, highlighting the promise of SAW systems for multimode quantum information processing [3,28]. More generally, the engineering of time-delayed self-interactions not only enables a wide range of frequency structures but can also give rise to non-Markovian dynamics [36,37], suggesting delay may prove a valuable resource for quantum information processing [38].

Appendix A: Qubit readout
The qubit state is measured through its dispersive interaction with a 5.9-GHz copper waveguide cavity. The qubit has a large electric dipole moment, coupling it to the readout cavity with strength $g_c = 115$ MHz. Different readout techniques were used to probe the qubit state depending on the measurement details.
We used bright-state readout [40] to measure the qubit decay rate $\Gamma_1$ as a function of frequency [Fig. 3(b)]. This type of readout is well suited for measuring fast decays as the cavity can persist in the bright state for a time that exceeds the natural qubit lifetime.
For qubit spectroscopy, we used single-quadrature dispersive readout backed by a flux-pumped Josephson parametric amplifier. To measure the Stark-driven qubit spectra [Fig. 4], we used a pulsed readout scheme that minimized qubit dephasing from readout photons. Continuous readout was used for qubit spectroscopy as a function of flux [Figs. 2(a) and 3(a)]. In the broad qubit spectroscopy, we compensate for the varying excited-state contrast resulting from frequency-dependent qubit loss by adjusting the qubit drive power. This power level is independently determined from the measured $T_1$ times.

Appendix B: Qubit flux dependence
The qubit transition frequency is tuned using an off-chip coil to thread magnetic flux through the 50-µm$^2$ loop formed by the two Josephson junctions. Omitting acoustic interactions, we model the qubit frequency $f_q$ as a function of coil current $I$ with the standard asymmetric-SQUID transmon form,

$$f_q(I) = f_0 \left[\cos^2\!\left(\frac{\pi (I - I_0)}{2 I_c}\right) + a^2 \sin^2\!\left(\frac{\pi (I - I_0)}{2 I_c}\right)\right]^{1/4},$$

where $f_0$ is the zero-field qubit frequency, $I_c$ is the coil current required to thread a half flux quantum through the qubit loop, $I_0$ is the current offset required to cancel ambient fields, and $a$ is the normalized difference between the junction critical currents. From fitting the measured qubit frequency [Fig. 5(a)], we find $f_0 = 5.718$ GHz, $I_c = 1.168$ mA, $I_0 = 79.2$ µA, and $a = 0.14$.
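A flux-tuning curve of this kind is straightforward to evaluate. The sketch below is built on an assumption: that the model takes the common asymmetric-SQUID form $f_q = f_0[\cos^2\varphi + a^2\sin^2\varphi]^{1/4}$ with $\varphi = \pi(I - I_0)/(2I_c)$, which may differ in detail from the expression used in the original fit. The fit values are those quoted in the text, and the bisection helper is a hypothetical convenience for locating an operating current.

```python
import math

F0 = 5.718e9       # zero-field qubit frequency, Hz (from the text)
I_C = 1.168e-3     # current for a half flux quantum, A (from the text)
I_0 = 79.2e-6      # current offset, A (from the text)
ASYMMETRY = 0.14   # normalized junction asymmetry a (from the text)


def qubit_frequency(current, f0=F0, i_c=I_C, i_0=I_0, a=ASYMMETRY):
    """Qubit frequency versus coil current (assumed asymmetric-SQUID form)."""
    phase = math.pi * (current - i_0) / (2.0 * i_c)
    return f0 * (math.cos(phase) ** 2 + a**2 * math.sin(phase) ** 2) ** 0.25


def current_for_frequency(target, low=I_0, high=I_0 + I_C, steps=60):
    """Bisect for the current giving `target` on the monotonic branch
    between zero flux (low) and half a flux quantum (high)."""
    for _ in range(steps):
        middle = 0.5 * (low + high)
        if qubit_frequency(middle) > target:
            low = middle    # frequency still too high: move toward half flux
        else:
            high = middle
    return 0.5 * (low + high)


i_z = current_for_frequency(4.318e9)
print(f"maximum frequency: {qubit_frequency(I_0) / 1e9:.3f} GHz")
print(f"minimum frequency: {qubit_frequency(I_0 + I_C) / 1e9:.3f} GHz")
print(f"current for 4.318 GHz: {i_z * 1e3:.3f} mA")
```

Under this assumed form the tuning range spans from $f_0$ at zero flux down to $f_0\sqrt{a}$ at half a flux quantum, so the 4.318-GHz operating point sits well inside the accessible range.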
The qubit flux dependence is weakly modified by its interaction with the continuum of propagating phonon modes. We model the resulting phononic Lamb shift $\delta$ following Ref. [30], in terms of the maximal loss rate to phonons $\Gamma_0$, the center frequency of the IDT $f_c$, the number of finger periods in each IDT $N_q = 8$, and the intra-IDT delay $\tau$. The measured phonon loss rate (see Appendix D) independently determines $\Gamma_0$, $f_c$, and $\tau$, allowing the Lamb shift to be calculated with no free parameters. This calculated Lamb shift closely matches the residual from the flux fit [inset of Fig. 3(b) and Fig. 5(b)] except near avoided crossings.

Appendix C: Acoustic cavity characterization
Extracting the coupling strengths from the closely spaced avoided crossings requires a multimode formalism. The eigenmodes of the system are found by diagonalizing the interaction Hamiltonian, including 9 purely longitudinal modes and 5 transverse modes. The eigenvalues of the matrices as a function of flux are fit to the measured avoided crossing spectrum [Fig. 6(a)]. The general properties of the mirrors can be inferred from the precise measurement of the mode spacings. Near the center of the mirror bandwidth, the modes are spaced by $f_s = 10.6$ MHz, but they become more closely spaced near the edge of the mirror bandwidth due to deeper phonon propagation into the mirror stack [Fig. 6(b)]. We find a simple mirror model matches the measurements with a single-element reflectivity $r_s = 3.5\%$, which corresponds to a mirror bandwidth of 100 MHz.

Appendix D: Phonon emission rate
The qubit lifetime is measured over a wide frequency range to directly probe the qubit spontaneous emission rate into unconfined phonons. The qubit loss rate $\Gamma_1$ as a function of qubit frequency $f_q$ is modeled by

$$\Gamma_1(f_q) = \frac{f_q}{Q_i} + \Gamma_0\, \mathrm{sinc}^2\!\left[\frac{\pi N_q (f_q - f_c)}{f_c}\right] \sin^2(\pi f_q \tau),$$

where $\mathrm{sinc}(x) = \sin(x)/x$, $Q_i$ is the qubit internal quality factor, $\Gamma_0$ is the maximal loss rate to phonons, $f_c$ is the center frequency of the IDT, $N_q = 8$ is the number of finger periods in each IDT, and $\tau$ is the intra-IDT delay time. We find $Q_i = 1.2 \times 10^4$, $\Gamma_0 = 11$ MHz, $f_c = 4.24$ GHz, and $\tau = 9.04$ ns. The best-fit $\Gamma_0$ is close to the expected value of 12.5 MHz calculated using room-temperature GaAs properties [16].

The qubit studied constitutes a giant atom where the intra-IDT delay time approaches the phonon-limited qubit lifetime. Deep in this regime, the qubit fully decays before a phonon can travel between the IDT halves, leading to a host of effects such as nonexponential decay. The transition to this regime occurs when the product $\pi\tau\Gamma_0$ reaches 1 [30,37]. For this device, $\pi\tau\Gamma_0 \approx 0.3$. However, evidence of non-Markovian physics was obscured by the presence of mirrors and the short timescale (9 ns) associated with the nonexponential decays. A small fraction of the measured time traces display nonexponential features but with timescales far exceeding the intra-IDT delay time. These decays are excluded from the reported qubit energy decay rates [Fig. 7].
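The loss model can be explored numerically. The sketch below assumes the phonon emission rate is $\Gamma_0$ times the square of the split-IDT profile from the main text, $\mathrm{sinc}^2[\pi N_q (f - f_c)/f_c]\,\sin^2(\pi f \tau)$, added to an internal loss $f/Q_i$. This functional form is a reconstruction consistent with the parameters listed above rather than a confirmed transcription, and the function names are illustrative.

```python
import math

Q_I = 1.2e4       # internal quality factor (fit value from the text)
GAMMA_0 = 11e6    # maximal phonon loss rate, Hz (fit value from the text)
F_C = 4.24e9      # IDT center frequency, Hz (fit value from the text)
TAU = 9.04e-9     # intra-IDT delay, s (fit value from the text)
N_Q = 8           # finger periods per IDT half (from the text)


def sinc(x):
    """Unnormalized sinc, sin(x) / x, with the limit at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def phonon_emission_rate(f):
    """Assumed emission rate: slow sinc^2 envelope times sin^2 fringes."""
    envelope = sinc(math.pi * N_Q * (f - F_C) / F_C) ** 2
    fringes = math.sin(math.pi * f * TAU) ** 2
    return GAMMA_0 * envelope * fringes


def loss_rate(f):
    """Total modeled loss: internal loss f / Q_i plus phonon emission."""
    return f / Q_I + phonon_emission_rate(f)


fringe_period = 1.0 / TAU
giant_atom_parameter = math.pi * TAU * GAMMA_0
f_null = 39 / TAU        # a fringe minimum (f * tau integer)
f_peak = 39.5 / TAU      # the adjacent fringe maximum, half a period away

print(f"fringe period  = {fringe_period / 1e6:.1f} MHz")
print(f"pi*tau*Gamma_0 = {giant_atom_parameter:.2f}")
print(f"loss at null   = {loss_rate(f_null) / 1e3:.0f} kHz")
print(f"peak/null loss = {loss_rate(f_peak) / loss_rate(f_null):.1f}")
```

With these parameters the fringe period matches the observed 110-MHz oscillation, and the modeled loss rises by a factor in the mid-twenties between a null and the adjacent maximum half a period away, in line with the contrast described in the main text.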

Appendix E: Number splitting analysis
The measured Stark-driven spectra are fit to a sum of unit-area Lorentzians with weights assumed to be Poissonian distributed in the number basis with mean phonon number $\bar{n}$,

$$P_e(f, \bar{n}) = C_0 + C_1 \sum_{n=0}^{n_{\max}} P_n(\bar{n})\, S(f, n, \bar{n}),$$

where $n$ is the phonon number in mode $m$, $f$ is the spectroscopy frequency, $C_0$ is a constant offset, $C_1$ is an overall amplitude, and $n_{\max} = 6$ is a cutoff phonon number. The two factors in the sum are given by

$$P_n(\bar{n}) = \frac{e^{-\bar{n}}\, \bar{n}^{\,n}}{n!}$$

and

$$S(f, n, \bar{n}) = \frac{1}{2\pi}\, \frac{\gamma + \kappa_m (n + \bar{n})}{\left[f - (f_q + 2\chi_m n)\right]^2 + \left[\gamma + \kappa_m (n + \bar{n})\right]^2/4},$$

where $\gamma$ is the zero-phonon qubit linewidth, $f_q$ is the zero-phonon qubit frequency, $\kappa_m$ is the loss rate of mode $m$, and $2\chi_m$ is the single-phonon Stark shift from mode $m$. Fits of the average phonon number show a linear dependence on applied drive power for the three measured modes [Fig. 8]. The strong drive used to populate the acoustic modes also weakly excites the qubit, causing the trace offset $C_0$ to increase with $\bar{n}$. Additionally, the bare qubit frequency pulls weakly up with off-resonant drive power at a rate of about 150 kHz per phonon, an unexplained effect that is included in the fits.

The qubit coherence times at $f_z$ are measured to be $T_1 = 415$ ns and $T_2^* = 705$ ns. The $T_2^*$ time is almost twice $T_1$, and we calculate an intrinsic dephasing rate of $(2\pi T_\varphi)^{-1} = 30$ kHz. The spectroscopic qubit linewidth was measured to be $\gamma = 550$ kHz at $f_z$. Together, frequency-independent energy loss (360 kHz), intrinsic dephasing (30 kHz), the effective Rabi rate from the drive tone (100 kHz), and the finite duration of the drive pulse (50 kHz) sum to a 540-kHz qubit linewidth, marginally smaller than the measured value.
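The fit model can be written compactly in code. The sketch below is one possible implementation of a Poisson-weighted sum of unit-area Lorentzians, with the $n$-phonon peak placed at $f_q + 2\chi_m n$ so that the spectrum moves up in frequency with phonon number, as described in the main text. That sign choice, the function names, and the example mean phonon number are assumptions of this sketch; the linewidths and Stark shift are the mode-5 values quoted in the text.

```python
import math


def poisson_weight(n, n_bar):
    """Probability of n phonons in a coherent state with mean n_bar."""
    return math.exp(-n_bar) * n_bar**n / math.factorial(n)


def lorentzian(f, n, n_bar, f_q, two_chi, gamma, kappa):
    """Unit-area Lorentzian for the n-phonon peak.

    The full width grows as gamma + kappa * (n + n_bar) and the center
    is assumed to sit at f_q + two_chi * n.
    """
    width = gamma + kappa * (n + n_bar)
    center = f_q + two_chi * n
    return (width / (2.0 * math.pi)) / ((f - center) ** 2 + width**2 / 4.0)


def spectrum(f, n_bar, f_q, two_chi, gamma, kappa, c0=0.0, c1=1.0, n_max=6):
    """Offset plus amplitude times the Poisson-weighted sum of peaks."""
    total = sum(
        poisson_weight(n, n_bar)
        * lorentzian(f, n, n_bar, f_q, two_chi, gamma, kappa)
        for n in range(n_max + 1)
    )
    return c0 + c1 * total


# Mode-5 values quoted in the text; n_bar = 1 is an illustrative choice.
params = dict(f_q=4.318e9, two_chi=1050e3, gamma=550e3, kappa=250e3)
peak_0 = spectrum(params["f_q"], 1.0, **params)
peak_1 = spectrum(params["f_q"] + params["two_chi"], 1.0, **params)
valley = spectrum(params["f_q"] + 0.5 * params["two_chi"], 1.0, **params)

print(f"n=0 peak / valley = {peak_0 / valley:.2f}")
print(f"n=1 peak / valley = {peak_1 / valley:.2f}")
```

For these parameters the spectrum dips between the zero- and one-phonon peaks, which is the signature of number splitting that the fits rely on.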
Additionally, an unstable avoided crossing appeared intermittently between 4.312 and 4.322 GHz with sub-MHz coupling rate, fluctuating with a several-hour timescale. We reject data when the defect was present by interleaving independent diagnostics with the Stark-driven spectra and removing defect-present data in post-processing.