Coupling a superconducting quantum circuit to a phononic crystal defect cavity

Connecting nanoscale mechanical resonators to microwave quantum circuits opens new avenues for storing, processing, and transmitting quantum information. In this work, we couple a phononic crystal cavity to a tunable superconducting quantum circuit. By fabricating a one-dimensional periodic pattern in a thin film of lithium niobate and introducing a defect in this artificial lattice, we localize a 6 gigahertz acoustic resonance to a wavelength-scale volume of less than one cubic micron. The strong piezoelectricity of lithium niobate efficiently couples the localized vibrations to the electric field of a widely tunable high-impedance Josephson junction array resonator. We measure a direct phonon-photon coupling rate $g/2\pi \approx 1.6 \, \mathrm{MHz}$ and a mechanical quality factor $Q_\mathrm{m} \approx 3 \times 10^4$ leading to a cooperativity $C\sim 4$ when the two modes are tuned into resonance. Our work has direct application to engineering hybrid quantum systems for microwave-to-optical conversion as well as emerging architectures for quantum information processing.


I. INTRODUCTION
Compact and low-loss acoustic wave devices that perform complex signal processing at radio frequencies are ubiquitous in classical communication systems [1]. Much like their classical counterparts, emerging quantum machines operating at microwave frequencies [2] also stand to benefit from their integration with these devices. This is conditioned on the realization of sufficiently versatile quantum phononic technologies. Several promising approaches have emerged in the last few years. Each has its own strengths and weaknesses and they can be broadly categorized by the degree to which the acoustic waves are confined as compared to their wavelength. In a series of remarkable experiments, thin-film [3], surface [4][5][6][7], and bulk acoustic wave resonators [8,9] made of piezoelectric materials have coupled gigahertz phonons with varying levels of confinement to superconducting circuits. Nonetheless, smaller mode volumes, lower losses, and greater control over the mode structure are desired.
One of the most promising approaches for realizing ultra-low-loss mechanical resonators is to use phononic crystal cavities that confine acoustic waves in all three dimensions. Wavelength-scale confinement and periodicity qualitatively alter the properties of waves and allow far greater control over the phonon density of states. Periodic patterning of a thin slab of elastic material can give rise to a phononic bandgap, a range of frequencies devoid of propagating waves. By introducing defects in such a crystal, mechanical energy is localized at the wavelength scale [10][11][12][13][14] without any "clamping" losses. The existence of a full phononic bandgap eliminates all modes into which a phonon can be linearly scattered, leading to a significant increase in the coherence time of such resonators. For example, lifetimes on the order of one second, corresponding to $Q > 10^{10}$, have been optically measured in 5 GHz phononic crystal cavities made from silicon [15]. Moreover, the small mode volume of phononic crystal cavities leads to a dramatic reduction in the density of spurious modes that can negatively impact the performance of quantum acoustic systems, while enabling a denser packing of devices for greater scalability.
The greater confinement and control over the acoustic mode structure comes at the cost of weaker coupling. At gigahertz frequencies, the modes of phononic crystal cavities are confined to extremely small volumes ($< 1\,\mu\mathrm{m}^3$). This leads to smaller forces for a given oscillating voltage when compared to approaches with transducer dimensions of tens to hundreds of microns [16]. Up to now, it has only been possible to efficiently read out and couple to localized modes of phononic crystal cavities with optical photons, where the electromagnetic energy is similarly localized [10,17]. Nonetheless, to connect these systems to microwave superconducting quantum circuits, efficient and tunable coupling between microwave photons and phonons is needed. In this work we demonstrate the direct coupling of a superconducting circuit to a wavelength-scale phononic nanocavity, opening a new avenue in quantum acoustics.

II. DEVICE DESIGN AND FABRICATION
At the heart of our device lies a suspended quasi-one-dimensional phononic crystal fabricated from a 200-nanometer-thick film of lithium niobate (LiNbO$_3$). The crystal has a lattice constant of 1 µm and has a complete phononic bandgap in the vicinity of $\nu = 6$ GHz [Fig. 1(a)]. This bandgap is used to localize the resonances of a single defect site introduced in the center of the lattice. In particular, we engineer a defect mode with a strain field $S$ that generates a charge polarization $P_i = e_{ijk} S_{jk}$ that is predominantly aligned in-plane in the direction perpendicular to the lattice [Fig. 1(b)]; here $e$ is the piezoelectric coupling tensor of LiNbO$_3$. We then use this polarization to couple the defect mode to the microwave-frequency electric field of a readout circuit, applied by gate electrodes placed within 200 nanometers of the defect.

FIG. 1. (a) The quasi-one-dimensional phononic crystal with lattice constant $a = 1$ µm, showing the bands of all possible mode polarizations in the range of frequencies relevant to this work. A complete bandgap near $\nu = 6.5$ GHz is clearly visible, with a narrower gap also visible below. Other relevant simulation parameters (matching those of the fabricated structures) are the length and width of the connecting struts (320 nm and 240 nm, respectively), the film thickness (224 nm), and the sidewall angle (5°). (b) Deformation $u(\mathbf{r})$ and electrostatic potential $\phi(\mathbf{r})$ of a mode localized at the defect site, at frequency $\nu = 6.48$ GHz near the center of the bandgap. Modes of this polarization can be coupled to electric fields pointing in the direction perpendicular to the crystal lattice. Here the length and width of the defect are $a_\mathrm{def} = 1.6$ µm and $w_\mathrm{def} = 500$ nm, respectively. (c) Schematic of the device, including the drive/readout line (blue) capacitively coupled to the resonator, the flux line (red) used to flux bias the SQUID array, and the electrodes (gray) that couple the circuit to the phononic cavity (light blue). The LiNbO$_3$ crystal axes are indicated.

The readout circuit is a lumped-element microwave resonator formed from the capacitance $C_r$ of the gate electrodes and a series of Josephson junctions in a superconducting quantum interference device (SQUID) array configuration with total Josephson inductance $L_r = \Phi_0^2/E_J(\Phi_e)$ [Fig. 1(c)], where $\Phi_0 = \hbar/2e$ is the reduced flux quantum. The effective Josephson energy $E_J(\Phi_e)$ of the array depends on the external flux $\Phi_e$ threading the SQUIDs, making the resonator frequency $\omega_r = 1/\sqrt{L_r C_r}$ tunable by applying a small current to an on-chip flux line. In addition, the small parasitic capacitance of the array enables us to achieve a relatively large resonator impedance $Z_r = \sqrt{L_r/C_r} \approx 580\,\Omega$. This is an important feature of our device, as the piezoelectric coupling strength is proportional to the zero-point voltage fluctuations of the circuit and $V_\mathrm{zp} \sim \sqrt{Z_r}$. The resonator impedance is largely limited by the presence of the flux line (highlighted in red in Fig. 2), which is a major source of parasitic capacitance between the two nodes of the resonator.
Device fabrication is performed on a 500 nm film of X-cut LiNbO$_3$ on a 500 µm high-resistivity (> 3 kΩ·cm) Si substrate. The process involves seven lithography masks across the following four stages [see Fig. 2(a)]: 1) LiNbO$_3$ film thinning, 2) patterning of phononic nanostructures, 3) deposition of Al layers, including all microwave circuitry and Josephson junctions, and 4) masked undercut of structures. The film is first thinned down to the target thickness (approximately 200 nm for this device) by blanket argon milling. We then expose a mask on positive resist with a single step of electron-beam (ebeam) lithography and transfer it to the LiNbO$_3$ film with an optimized argon milling process to form the phononic nanostructures. With only the structures masked, a further argon milling step removes the LiNbO$_3$ film from the rest of the sample. This step allows us to place all microwave circuits on a high-resistivity silicon substrate where they are not vulnerable to acoustic radiation losses induced by the piezoelectric film. Aluminum microwave ground planes and feedlines are defined on the exposed silicon via liftoff, and the SQUID arrays are fabricated with a Dolan bridge double-angle evaporation process to grow the Al/AlOx/Al junctions [21,22]. The gate electrodes used to address the phononic defect sites are patterned with a separate ebeam mask and normal-incidence Al evaporation, and finally a bandage process [23] is used to ensure lossless superconducting connections between all metallization layers. As a final step, we release the structures with a masked XeF$_2$ dry etch that etches the underlying Si with extremely high selectivity to the LiNbO$_3$ and the Al [19,20], leaving all aluminum layers intact at the end of the process.
In Fig. 2(b) we show a set of scanning-electron micrographs of a finished device nearly identical to the one used in this experiment. The full microwave circuit is shown in the center. The charge line (highlighted blue) is capacitively coupled to the resonator and is used for driving and readout. The flux line (highlighted red) is used to apply either DC or RF magnetic fields to the SQUID array and tune the resonator frequency. The flux line is shorted to ground in a symmetric configuration in order to reduce leakage of photons through the mutual inductance between the resonator and the line. The junction array, placed 5 µm away from the flux line, is composed of $N_\mathrm{SQ} = 17$ nominally identical SQUIDs in series and has a total inductance $L_r \approx 11$ nH, inferred by measuring the normal-state resistance of three copies of the array on the same chip. The two terminals of the SQUID array are routed to a set of electrodes used to address six independent phononic crystal defect cavities. These electrodes, the rest of the wiring, and the immediate environment of the resonator amount to a total capacitance of $C_r = 33$ fF, determined from finite-element electrostatics simulations.
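As a quick consistency check on the circuit values quoted above, the lumped-element relations $\omega_r = 1/\sqrt{L_r C_r}$ and $Z_r = \sqrt{L_r/C_r}$ can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the quoted $L_r \approx 11$ nH and $C_r = 33$ fF describe the same (maximum-frequency) bias point.

```python
import math

# Values quoted in the text; treating them as a matched pair is an assumption.
L_R = 11e-9    # total array inductance, henries
C_R = 33e-15   # total resonator capacitance, farads


def lc_frequency_hz(inductance, capacitance):
    """Resonance frequency (Hz) of an ideal lumped LC resonator."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))


def lc_impedance_ohm(inductance, capacitance):
    """Characteristic impedance (ohms) of an ideal lumped LC resonator."""
    return math.sqrt(inductance / capacitance)


f_r = lc_frequency_hz(L_R, C_R)
z_r = lc_impedance_ohm(L_R, C_R)
print(f"f_r = {f_r / 1e9:.2f} GHz, Z_r = {z_r:.0f} ohm")
```

With these inputs the frequency comes out near 8.4 GHz and the impedance near 580 Ω, in line with the impedance quoted for the device.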
Each of the six cavities has the same nominal mirror cell design and therefore the same phononic band structure. As a result, the modes that are supported by the cavities appear in the same frequency bands. In order to spectrally resolve these modes, we sweep the length $a_\mathrm{def}$ of the defect cells from 1.4 µm to 1.65 µm in steps of 50 nm. Because the bandgap is quite small (only a small percentage of the center frequency), many defects do not support localized modes of the correct polarization. By scaling the defect across the six cavities, we therefore increase the likelihood of generating and observing a localized mode.

III. MODELING AND MEASUREMENT RESULTS
We model our system as a microwave-frequency electromagnetic mode with annihilation operator $\hat{a}$ and frequency $\omega_r$ that is linearly coupled, with a rate $g$, to a mechanical mode $\hat{b}$ at frequency $\omega_m$. This model is valid so long as we are interested in a range of frequencies sufficiently distant from other mechanical resonances in the system as compared to the relevant interaction rates. The Hamiltonian is

$$\hat{H}/\hbar = \omega_r \hat{a}^\dagger \hat{a} + \frac{\chi}{2}\,\hat{a}^\dagger \hat{a}^\dagger \hat{a}\hat{a} + \omega_m \hat{b}^\dagger \hat{b} + g\left(\hat{a}^\dagger \hat{b} + \hat{a}\hat{b}^\dagger\right), \tag{1}$$

where $\chi$ is the Kerr nonlinearity of the microwave mode introduced by the array. For an array of $N_\mathrm{SQ}$ identical SQUIDs this is given by $\chi = -E_C/\hbar N_\mathrm{SQ}^2$, where $E_C = e^2/2C_r$ is the charging energy [24]. For this device $\chi/2\pi \approx -2$ MHz, which is larger than the typical anharmonicity of parametric amplifier devices [25] but significantly smaller than that of transmon qubits [26]. We can further include the effect of a coherent drive sent into the input port by adding a drive term

$$\hat{H}_d/\hbar = i\sqrt{\kappa_e}\,\alpha_\mathrm{in}\left(\hat{a}^\dagger e^{-i\omega_d t} - \hat{a}\,e^{i\omega_d t}\right)$$

to the Hamiltonian, where $\alpha_\mathrm{in}$ is the input field amplitude, $\kappa_e$ is the extrinsic decay rate of the microwave mode into the readout channel, and $\omega_d$ is the drive frequency. For sufficiently weak driving the system response is linear and the Kerr term in Eq. (1) can be neglected. Specifically, this is valid if $|\chi|\langle \hat{a}^\dagger \hat{a}\rangle \ll \kappa$, that is, when the frequency shift induced by the drive is much smaller than the total electromagnetic linewidth $\kappa$ [27].
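The quoted Kerr nonlinearity can be cross-checked against the circuit parameters. The following sketch assumes the standard junction-array expression $\chi = -E_C/\hbar N_\mathrm{SQ}^2$ with charging energy $E_C = e^2/2C_r$ (an assumption about the exact form used here), together with the values $C_r = 33$ fF and $N_\mathrm{SQ} = 17$ given in the text. The function name is hypothetical.

```python
# Exact SI values of the elementary charge and Planck constant.
E_CHARGE = 1.602176634e-19   # coulombs
H_PLANCK = 6.62607015e-34    # joule seconds


def kerr_nonlinearity_hz(capacitance, n_squids):
    """Return chi/2pi in Hz for an array of identical SQUIDs.

    Assumes chi = -E_C / (hbar * N^2) with E_C = e^2 / (2 C), so that
    chi/2pi = -E_C / (h * N^2).
    """
    charging_energy = E_CHARGE ** 2 / (2.0 * capacitance)   # joules
    return -charging_energy / (H_PLANCK * n_squids ** 2)


chi_hz = kerr_nonlinearity_hz(33e-15, 17)
print(f"chi/2pi = {chi_hz / 1e6:.2f} MHz")
```

Under these assumptions the result is close to the $-2$ MHz quoted for the device, and the $1/N_\mathrm{SQ}^2$ scaling makes explicit why a long array is far more linear than a single-junction circuit with the same capacitance.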
We perform our characterization measurements at the bottom plate of a dilution refrigerator at a temperature of $T = 7$ mK. We probe the system by measuring the reflection spectrum $S_{11}(\omega)$ through the charge port of the device (full details of the measurement setup are provided in Appendix D). In Fig. 3(a) we show a typical normalized reflection spectrum of the resonator in the linear regime, in this case tuned to a frequency of $\nu = 5.9$ GHz, far detuned from any mechanical resonance. The reflection coefficient is $S_{11}(\omega) = -1 + 2\eta_e \chi_r(\omega)$, where $\eta_e \equiv \kappa_e/\kappa$ is the coupling efficiency and $\chi_r(\omega) = [2i(\omega - \omega_r)/\kappa + 1]^{-1}$ is the dimensionless susceptibility of the resonator. This operating frequency is more than 2 GHz detuned from the flux sweet spot, leading to an intrinsic linewidth $\kappa_i = \kappa - \kappa_e$ that is dominated by flux noise dephasing (see Appendix B). In Fig. 3(b) we show the linear spectroscopy results for a range of values of the external magnetic flux $\Phi_e$, illustrating the DC-bias response $\omega_r(\Phi_e) = \omega_{r,\mathrm{max}} |\cos(2\pi\Phi_e/\Phi_0)|$ of the resonator frequency. We infer $\omega_{r,\mathrm{max}}/2\pi = 8.31$ GHz, lying outside of our measurement band. Further, since the SQUIDs are composed of nominally identical junctions, the lower frequency part of our tuning curve also lies outside of the measurement band.
We can now use the tunable response of the resonator to look for additional signatures in the spectrum. Tuning the resonance from the top of our measurement band at $\nu \approx 8$ GHz down to $\nu \approx 5$ GHz, we find a series of resonances that anti-cross with the microwave mode, largely concentrated in the 6 to 6.5 GHz range. In Fig. 3(c) we show the anti-crossing of the most strongly coupled mechanical mode we found for this device, along with a line cut at the point of minimum detuning shown in Fig. 3(d).
Using the entire anti-crossing dataset, we extract the parameters of the mechanical mode by fitting the spectrum to the simple linear input-output model described in Appendix A. In the case of a single mechanical mode coupled to the readout resonator, the reflection spectrum can be written as

$$S_{11}(\omega) = -1 + \frac{2\eta_e\,\chi_r(\omega)}{1 + C\,\chi_r(\omega)\,\chi_m(\omega)}, \tag{2}$$

where $\chi_m(\omega) = [2i(\omega - \omega_m)/\gamma + 1]^{-1}$ is the dimensionless mechanical susceptibility and $C \equiv 4g^2/\kappa\gamma$ is the cooperativity. A least-squares fit to this model results in $\omega_m/2\pi = 5.9754$ GHz, $g/2\pi = 1.65 \pm 0.07$ MHz and $\gamma/2\pi = 220 \pm 70$ kHz, corresponding to a mechanical quality factor $Q_m = \omega_m/\gamma \approx 3 \times 10^4$. The maximum cooperativity, i.e., the ratio of the mechanical resonator's electromagnetic read-out coupling to its intrinsic losses, approaches $C \approx 4.5$ on resonance. Crucially, our mode lies in a "quiet" region where the closest observed mechanical modes are 50 MHz and 250 MHz below and above, respectively (see Appendix C).
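The single-mode reflection model is simple enough to evaluate numerically. The sketch below assumes the form $S_{11} = -1 + 2\eta_e\chi_r/(1 + C\chi_r\chi_m)$ built from the susceptibilities and cooperativity defined above. The fitted $g$, $\gamma$, and $\omega_m$ are taken from the text, while `KAPPA` and `KAPPA_E` are hypothetical placeholders (the total linewidth is chosen so that $C \approx 4.5$; neither value is a measured number). All rates are expressed as frequencies in MHz, which is consistent because only ratios enter.

```python
# Fitted values quoted in the text (as frequencies, in MHz).
OMEGA_M = 5975.4   # mechanical frequency
G = 1.65           # coupling rate
GAMMA = 0.22       # mechanical linewidth

# Hypothetical resonator linewidths, NOT taken from the text.
KAPPA = 11.0       # total linewidth, picked so that C is about 4.5
KAPPA_E = 5.0      # extrinsic linewidth, illustrative only


def susceptibility(omega, omega0, linewidth):
    """Dimensionless susceptibility [2i(omega - omega0)/linewidth + 1]^-1."""
    return 1.0 / (1.0 + 2j * (omega - omega0) / linewidth)


def cooperativity(g, kappa, gamma):
    """C = 4 g^2 / (kappa * gamma)."""
    return 4.0 * g ** 2 / (kappa * gamma)


def s11(omega, omega_r, omega_m, kappa, kappa_e, gamma, g):
    """Reflection coefficient of a resonator coupled to one mechanical mode."""
    chi_r = susceptibility(omega, omega_r, kappa)
    chi_m = susceptibility(omega, omega_m, gamma)
    coop = cooperativity(g, kappa, gamma)
    return -1.0 + 2.0 * (kappa_e / kappa) * chi_r / (1.0 + coop * chi_r * chi_m)


q_m = OMEGA_M / GAMMA
coop = cooperativity(G, KAPPA, GAMMA)
# Resonator tuned onto the mechanical mode and probed at the common frequency.
on_resonance = s11(OMEGA_M, OMEGA_M, OMEGA_M, KAPPA, KAPPA_E, GAMMA, G)
print(f"Q_m = {q_m:.0f}, C = {coop:.2f}, |S11| on resonance = {abs(on_resonance):.3f}")
```

On resonance both susceptibilities equal one, so the model reduces to $S_{11} = -1 + 2\eta_e/(1 + C)$: the mechanical mode suppresses the resonator response by a factor $1 + C$, which is what produces the visible splitting in the line cut.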
In order to better understand the measured electromechanical response, we perform finite-element simulations of the full LiNbO$_3$ structure, simultaneously solving the equations of elasticity, electrostatics, and their coupling via piezoelectricity. Following a procedure described in Ref. [16], we numerically calculate the electromechanical admittance function $Y_m(\omega)$ seen at the electrical terminals of a single phononic cavity [28] and generate an effective circuit using Foster synthesis [29]. Using this technique we calculate coupling rates in the range $g/2\pi \approx 1.5$ to $2.5$ MHz for the cavity geometries present in this device, in agreement with the measurement.

We measure the reflection spectra at higher drive power levels to verify the expected linearity of the mechanical resonance, and to distinguish it from other degrees of freedom, such as two-level systems (TLS) that have been observed in chip-scale devices [30]. The strong Kerr nonlinearity of the resonator allows us to calibrate the coherently-driven photon occupation. We set the resonator frequency to $\omega_r/2\pi = 5.90$ GHz, detuned from the mechanical mode, and vary the probe power. For very low powers, we can approximate the effect of the drive as a frequency shift $\delta\omega_r = \chi\langle \hat{a}^\dagger \hat{a}\rangle/2$ [Fig. 4(a)] and use this to extract the photon number. For low probe powers, we observe a linear dependence of the frequency shift on probe power, as expected from a linearized model in which the steady-state occupation of the resonator redshifts the frequency seen by the probe tone. However, as the probe power is increased, a more complex nonlinear response is observed, as evidenced by the deviation of the estimated $\delta\omega_r$ from the simplified linear dependence. We use the lower power points to obtain a nominal calibrated photon number $n_r = \langle \hat{a}^\dagger \hat{a}\rangle$, which is valid at low drive strengths and represents an upper bound to the occupation when extrapolated to stronger drives. This requires us to accurately estimate $\chi$, which we do in two different ways: first using the measured resonator frequency and the normal-state junction resistance, and second by simulating the capacitance matrix of the device. Both of these methods give us nearly the same value of $\chi/2\pi = -2.0$ MHz.

We now place the resonator to the red side of the mechanical mode and change the driving strength while sweeping the probe frequency to obtain the traces shown in Fig. 4(b). We observe the microwave mode broaden and redshift as the occupation is increased to a few photons, while the mechanical mode remains unchanged. We therefore conclude that the observed resonance is not due to a TLS. Additionally, we note that the frequency and linewidth of the observed resonance remained constant over several experimental runs that involved temperature cycling the device.

IV. OUTLOOK
We have demonstrated efficient coupling between a localized phononic cavity and a superconducting microwave circuit. The cooperativity $C \sim 4$ is already sufficient for efficient conversion of microwave photons to highly localized microwave phonons that can in turn be up-converted efficiently to optical photons [17], a promising route for microwave-to-optical conversion [31,32]. The performance of the device can be further improved by increasing the size of the bandgap to allow for higher mechanical $Q$. Larger phononic bandgaps lead to greater robustness to fabrication imperfections, which we believe currently limit the coherence time of the resonances (see Appendix C). In addition, optimizing the electrode placement and mode profile can lead to an increase in the coupling rate $g$.
For quantum acoustic structures to become competitive with the best electromagnetic cavities, higher interaction rates g and quality factors Q need to be achieved while minimizing spurious resonances to allow for fast gate operations [33]. Interestingly, due to the small capacitance of the transducer and the ability to minimize crosstalk between resonances through phonon bandgap engineering, this architecture lends itself well to engineering systems where many bosonic linear modes couple to a single qubit [34]. Whether such a quantum acoustic approach will be competitive in the realm of quantum information processing relies on improvements in the g and Q of the devices, which will be the focus of future work.

V. ACKNOWLEDGEMENTS
We gratefully acknowledge R. Patel, C. Sarabalis, R. Van Laer, and N. Lörch for useful discussions. This work was supported by NSF ECCS-1509107, NSF ECCS-1708734, ONR MURI QOMAND, and start-up funds from Stanford University. ASN is grateful for support from Terman, Hellman, and Packard Fellowships. PAA and JDW are partially supported by the Stanford Graduate Fellowship (SGF), and MP is partially funded by the Swiss National Science Foundation postdoctoral fellowship. Device fabrication was performed at the Stanford Nano Shared Facilities (SNSF) and the Stanford Nanofabrication Facility (SNF). The SNSF is supported by the National Science Foundation under Grant No. ECCS-1542152. EAW was partly supported by an SNSF fellowship.

Appendix A: Reflection spectra
We model the mechanical system as a collection of harmonic modes {b i } linearly coupled to a Kerr oscillator, which in turn is coupled to a single input/output channel for driving and readout.
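The Hamiltonian of the system is

$$\hat{H}/\hbar = \omega_r \hat{a}^\dagger \hat{a} + \frac{\chi}{2}\,\hat{a}^\dagger \hat{a}^\dagger \hat{a}\hat{a} + \sum_k \omega_m^{(k)}\, \hat{b}_k^\dagger \hat{b}_k + \sum_k g_k\left(\hat{a}^\dagger \hat{b}_k + \hat{a}\hat{b}_k^\dagger\right) + i\sqrt{\kappa_e}\,\alpha_\mathrm{in}\left(\hat{a}^\dagger e^{-i\omega_d t} - \hat{a}\,e^{i\omega_d t}\right) + \hat{H}_B/\hbar,$$

where $\omega_r$ is the frequency of the microwave mode, $\chi$ is the anharmonicity, $\{\omega_m^{(k)}\}$ are the frequencies of the mechanical modes, and $\{g_k\}$ are their coupling rates to the microwave mode. We have explicitly included a coherent driving field at frequency $\omega_d$ (which couples to the system at rate $\kappa_e$) and bundled all other bath terms into $\hat{H}_B$; following a standard input-output treatment, these terms simply generate additional decay terms in the Heisenberg equations. We can eliminate the time dependence in $\hat{H}$ by going into an interaction frame with respect to $\hat{H}_0/\hbar \equiv \omega_d\left(\hat{a}^\dagger \hat{a} + \sum_k \hat{b}_k^\dagger \hat{b}_k\right)$. The transformed Hamiltonian (now omitting the bath terms) becomes

$$\hat{H}/\hbar = \Delta_r\, \hat{a}^\dagger \hat{a} + \frac{\chi}{2}\,\hat{a}^\dagger \hat{a}^\dagger \hat{a}\hat{a} + \sum_k \Delta_k\, \hat{b}_k^\dagger \hat{b}_k + \sum_k g_k\left(\hat{a}^\dagger \hat{b}_k + \hat{a}\hat{b}_k^\dagger\right) + i\sqrt{\kappa_e}\,\alpha_\mathrm{in}\left(\hat{a}^\dagger - \hat{a}\right),$$

with detunings $\Delta_r \equiv \omega_r - \omega_d$ and $\Delta_k \equiv \omega_m^{(k)} - \omega_d$. We can neglect the nonlinear term in the weak-drive regime where $|\chi||\alpha_\mathrm{in}|^2 \ll \kappa^2$. Our experiment only measures the average output field amplitudes in steady state, given by $\langle \hat{a} \rangle \equiv \alpha$ and $\langle \hat{b}_k \rangle \equiv \beta_k$. These obey the Heisenberg equations of motion

$$\dot{\alpha} = -\left(i\Delta_r + \frac{\kappa}{2}\right)\alpha - i\sum_k g_k \beta_k + \sqrt{\kappa_e}\,\alpha_\mathrm{in}, \qquad \dot{\beta}_k = -\left(i\Delta_k + \frac{\gamma_k}{2}\right)\beta_k - i g_k \alpha,$$

which can be written in the Fourier domain (with $\omega \equiv \omega_d$ the probe frequency, and adopting the $e^{+i\omega t}$ sign convention, which amounts to complex conjugation of the steady-state amplitudes) as

$$\left[i(\omega - \omega_r) + \frac{\kappa}{2}\right]\alpha = i\sum_k g_k \beta_k + \sqrt{\kappa_e}\,\alpha_\mathrm{in}, \qquad \left[i\left(\omega - \omega_m^{(k)}\right) + \frac{\gamma_k}{2}\right]\beta_k = i g_k \alpha.$$

We define the bare dimensionless susceptibilities

$$\chi_r(\omega) = \left[\frac{2i(\omega - \omega_r)}{\kappa} + 1\right]^{-1}, \qquad \chi_m^{(k)}(\omega) = \left[\frac{2i\left(\omega - \omega_m^{(k)}\right)}{\gamma_k} + 1\right]^{-1},$$

where $\kappa = \kappa_e + \kappa_i$ and $\{\gamma_k\}$ are the total decay rates of the microwave and mechanical modes, respectively. Together with the input-output boundary condition $\alpha_\mathrm{out} = -\alpha_\mathrm{in} + \sqrt{\kappa_e}\,\alpha$, we can then directly solve for the reflection coefficient $S_{11} \equiv \alpha_\mathrm{out}/\alpha_\mathrm{in}$, and obtain

$$S_{11}(\omega) = -1 + \frac{2\eta_e\,\chi_r(\omega)}{1 + \chi_r(\omega)\sum_k C_k\,\chi_m^{(k)}(\omega)},$$

where $\eta_e \equiv \kappa_e/\kappa$ and $C_k \equiv 4g_k^2/\kappa\gamma_k$ is the cooperativity (or readout efficiency) for mode $k$. We can finally assume that a single mechanical mode $\hat{b}_k \equiv \hat{b}$ is relevant at a given resonator frequency $\omega_r$, in the sense that it is the only mode that imprints a measurable signature in the reflection signal. Some algebraic manipulation leads us to the expression for $S_{11}(\omega)$ shown in the main text and used for fitting the data.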
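The multimode result can be checked numerically by solving the linear steady-state equations directly and comparing with a closed-form expression. The sketch below is an illustration under stated assumptions rather than the fitting code used for the data: it assumes equations of the form $[i(\omega - \omega_r) + \kappa/2]\alpha = i\sum_k g_k\beta_k + \sqrt{\kappa_e}\,\alpha_\mathrm{in}$ and $[i(\omega - \omega_m^{(k)}) + \gamma_k/2]\beta_k = i g_k\alpha$, with $S_{11} = -1 + \sqrt{\kappa_e}\,\alpha/\alpha_\mathrm{in}$. The resonator linewidths and the second mechanical mode are hypothetical values chosen for the test, and all rates are expressed in MHz.

```python
import math


def susceptibility(omega, omega0, linewidth):
    """Dimensionless susceptibility [2i(omega - omega0)/linewidth + 1]^-1."""
    return 1.0 / (1.0 + 2j * (omega - omega0) / linewidth)


def s11_closed_form(omega, omega_r, kappa, kappa_e, modes):
    """Closed form; modes is a list of (omega_k, gamma_k, g_k) tuples."""
    chi_r = susceptibility(omega, omega_r, kappa)
    loading = sum(4.0 * g ** 2 / (kappa * gamma) * susceptibility(omega, w, gamma)
                  for (w, gamma, g) in modes)
    return -1.0 + 2.0 * (kappa_e / kappa) * chi_r / (1.0 + chi_r * loading)


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small complex system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        acc = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = acc / aug[r][r]
    return x


def s11_from_equations(omega, omega_r, kappa, kappa_e, modes, alpha_in=1.0):
    """Solve the coupled steady-state equations for alpha, then form S11."""
    n = len(modes) + 1
    matrix = [[0j] * n for _ in range(n)]
    rhs = [0j] * n
    matrix[0][0] = kappa / 2.0 + 1j * (omega - omega_r)
    rhs[0] = math.sqrt(kappa_e) * alpha_in
    for k, (w, gamma, g) in enumerate(modes, start=1):
        matrix[0][k] = -1j * g                      # coupling of beta_k into alpha
        matrix[k][0] = -1j * g                      # coupling of alpha into beta_k
        matrix[k][k] = gamma / 2.0 + 1j * (omega - w)
    alpha = solve_linear(matrix, rhs)[0]
    return -1.0 + math.sqrt(kappa_e) * alpha / alpha_in


# First mode: fitted values from the text. Second mode: hypothetical neighbor.
MODES = [(5975.4, 0.22, 1.65), (5925.4, 0.30, 0.10)]
for probe in (5900.0, 5925.4, 5975.4, 5976.0, 6000.0):
    direct = s11_from_equations(probe, 5975.4, 11.0, 5.0, MODES)
    closed = s11_closed_form(probe, 5975.4, 11.0, 5.0, MODES)
    print(f"{probe:8.1f} MHz  |direct - closed| = {abs(direct - closed):.1e}")
```

The two routes should agree to numerical precision at every probe frequency, which confirms that the contributions of the individual mechanical modes simply add inside the denominator of the reflection coefficient.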

Appendix B: Wide-band characterization of SQUID array resonator
The center frequency of the SQUID array resonator is widely tunable, allowing us to probe the mechanical mode spectrum over a large range of frequencies. In Fig. 5 we plot the resonator frequency $\omega_r/2\pi$ as a function of the bias voltage applied to run a current through the on-chip flux line (see Appendix D for details) along with a fit to the function $\omega_r(V) = \omega_{r,\mathrm{max}} |\cos(GV + \phi_\mathrm{offset})|$, giving us a calibration of the external flux $\Phi_e$ threading the SQUIDs. We infer that the flux-insensitive point is at $\omega_{r,\mathrm{max}}/2\pi = 8.31$ GHz and lies outside of our measurement band. At every bias point, we fit the reflection spectrum $S_{11}(\omega)$ to the model derived in Appendix A (Eq. 2 with $g = 0$) in order to extract the total and extrinsic resonator linewidths ($\kappa$ and $\kappa_e$, respectively), also shown in Fig. 5. We also define an intrinsic linewidth $\kappa_i \equiv \kappa - \kappa_e$, which contains contributions from both energy relaxation and pure dephasing. Since our scattering parameter measurement uses only mean field amplitudes, we do not have the ability to separate these two contributions, but we can still look at the frequency dependence of $\kappa_i$ in order to gain insight into the decoherence mechanisms affecting this device. We find that $\kappa_e$ increases with frequency, as expected from capacitive coupling to the feedline, whereas $\kappa_i$ decreases as $\omega_r/2\pi$ approaches the flux-insensitive point. We attribute this dependence to a strong contribution of flux-noise dephasing to the intrinsic linewidth. In fact, we see that at the mechanical frequency $\omega_m/2\pi \approx 6$ GHz the intrinsic linewidth is dominated by flux noise, suggesting that the coherence times in future experiments can be improved by operating the tunable circuits near or at the flux sweet spot.
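Once the tuning curve has been fitted, it can be inverted to find the bias voltage that places the resonator at a chosen frequency. The sketch below uses the fit function $\omega_r(V) = \omega_{r,\mathrm{max}}|\cos(GV + \phi_\mathrm{offset})|$ and the inferred maximum frequency of 8.31 GHz; the values of `GAIN` and `PHI_OFFSET` are hypothetical placeholders, since the fitted values are not quoted, and the function names are illustrative.

```python
import math

F_MAX_GHZ = 8.31     # flux-insensitive frequency inferred in the text
GAIN = 0.35          # rad per volt, hypothetical fit parameter
PHI_OFFSET = 0.10    # rad, hypothetical fit parameter


def resonator_frequency_ghz(bias_volts):
    """Tuning curve f_r(V) = f_max * |cos(G * V + phi_offset)|."""
    return F_MAX_GHZ * abs(math.cos(GAIN * bias_volts + PHI_OFFSET))


def bias_for_frequency(target_ghz):
    """Bias on the first falling branch of the curve that gives target_ghz."""
    if not 0.0 <= target_ghz <= F_MAX_GHZ:
        raise ValueError("target frequency is outside the tuning range")
    return (math.acos(target_ghz / F_MAX_GHZ) - PHI_OFFSET) / GAIN


v_mech = bias_for_frequency(5.9754)   # tune onto the mechanical mode
print(f"bias = {v_mech:.3f} V -> f_r = {resonator_frequency_ghz(v_mech):.4f} GHz")
```

Because the curve is periodic in $V$, the inversion returns only one of infinitely many equivalent bias points. Its slope at the chosen point also indicates how strongly flux noise converts into frequency noise, which is why operation near the flux-insensitive maximum is attractive.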

Appendix C: Complete mechanical spectrum of the system
In addition to the mode at $\omega_m/2\pi = 5.9754$ GHz presented in the main text, we observe other modes distributed over a wide range of frequencies. In Fig. 6 we show the complete mechanical spectrum of this device. The positions of all observed modes are indicated with vertical lines, and we plot the quality factor $Q_m$ and coupling rate $g$ of nine modes with sufficiently strong signatures in $S_{11}$ to be fit reliably to the model. The mode presented in the main text is indicated in red. Interestingly, we find that all modes have quality factors on the order of $10^4$, consistent with the hypothesis that their losses are dominated by acoustic radiation, or "clamping" loss. These measurements are also consistent with previous studies of losses in phononic crystal cavities made from silicon that lack full phononic bandgaps [35]. The size of the bandgap in this work is not very large (< 5% of its center frequency according to our finite element simulations), which is of comparable magnitude to the fabrication-induced disorder in the phononic crystals. This allows trapped phonons to tunnel out of the defect region and irreversibly escape through the clamping points [35]. In order to suppress this loss channel, future devices will require larger bandgaps, which can be achieved through further improvements to the design and fabrication. We also observe that with the exception of the mode presented in the main text (indicated by the red point at $g/2\pi \approx 1.6$ MHz), all modes have coupling rates on the order of 100 kHz. This reduction in the coupling rate has been observed in silicon optomechanical crystals as the modes are tuned outside of a bandgap region [11]. In our case, finite element simulations of all six cavity geometries present in this device predict rates between 1.5 and 2.5 MHz, leading us to conclude that only one resonance in the spectrum is tightly confined to the defect in the way predicted by the simulations. The coupling rates can therefore be improved by engineering cavities with larger bandgaps, optimizing the geometry of the defect, and changing the placement of the electrodes.

FIG. 6. ... the resonances tend to tightly cluster within certain regions. In both plots, the resonance at $\omega_m/2\pi = 5.975$ GHz presented in the main text is indicated by the red point. The quality factors and coupling rates are obtained through reflection spectra collected at various detunings $\Delta = \omega_r - \omega_m$ around each mechanical mode and fitting them to Eq. 2; error bars indicate the standard deviation of the parameter estimates for these fits.

Appendix D: Experimental setup
Our sample is packaged in a copper enclosure to protect it from stray radiation and limit spurious modes. The package is placed inside a multi-layer magnetic shield anchored to the mixing-chamber plate ($T \approx 7$ mK) of a cryogen-free dilution refrigerator. A Rohde & Schwarz ZNB20 vector network analyzer (VNA) generates a probe tone that is sent down to the input port of the device through a cascade of attenuators thermalized to various temperature stages of the refrigerator. A circulator (QuinStar QCY-060400C000) separates the input and output signals and an additional isolator (QuinStar QCY-060400C000) protects the device from hot ($T \sim 3$ K) radiation in the output line. The output signal is routed up to the 3 K stage through superconducting NbTi cables, where it is amplified by a high-electron-mobility transistor (HEMT) amplifier (Caltech CITCRYO1-12A). The signal is further amplified at room temperature by two low-noise amplifiers (Miteq AFS4-02001800-24-10P-4 & AFS4-00100800-14-10P-4) with a 4-8 GHz bandpass filter (Keenlion KBF-4/8-Q7S) between them before being detected at the VNA.
Flux biasing is provided by a programmable voltage source (SRS SIM928). The DC voltage passes through a cold low-pass filter (Aivon Therma-24G) at the 3 K stage and enters the DC port of a bias tee (Anritsu K250) mounted at the mixing-chamber plate. In addition an AC flux can be applied with a microwave generator (Keysight E8257D), which sends a tone to the RF port of the bias tee through an additional attenuated line, though this capability is not used in this experiment. Finally, the DC+RF output of the tee is sent directly to the flux port of the device.