Strong dispersive coupling between a mechanical resonator and a fluxonium superconducting qubit

We demonstrate strong dispersive coupling between a fluxonium superconducting qubit and a 690 megahertz mechanical oscillator, extending the reach of circuit quantum acousto-dynamics (cQAD) experiments into a new range of frequencies. We have engineered a qubit-phonon coupling rate of $g\approx2\pi\times14~\text{MHz}$, and achieved a dispersive interaction that exceeds the decoherence rates of both systems while the qubit and mechanics are highly nonresonant ($\Delta/g\gtrsim10$). Leveraging this strong coupling, we perform phonon number-resolved measurements of the mechanical resonator and investigate its dissipation and dephasing properties. Our results demonstrate the potential for fluxonium-based hybrid quantum systems, and a path for developing new quantum sensing and information processing schemes with phonons at frequencies below 700 MHz to significantly expand the toolbox of cQAD.


INTRODUCTION
Mechanical resonators are promising candidates for hardware-efficient quantum memory [1][2][3] and novel types of quantum sensors, as they offer a smaller spatial footprint and couple to degrees of freedom such as mass and force.In this work, we achieve strong dispersive coupling between a fluxonium superconducting qubit and a sub-gigahertz mechanical oscillator -an essential step toward developing quantum sensing and networking components operating at lower frequencies.Achieving significant coupling between qubits and mechanical oscillators poses a challenge within the established paradigm pursued in most recent circuit quantum acoustodynamics (cQAD) efforts, where piezoelectricity mediates resonant coupling between a weakly anharmonic transmon qubit and a mechanical oscillator [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21].Most of these demonstrations operate in the 2 to 8 gigahertz frequency range.We are motivated to work with lower-frequency resonators, which are expected to exhibit longer coherence times and provide new frequency windows for operating quantum devices.Seeking to operate at lower mechanical frequencies effectively compels us to move to a different type of qubit [22][23][24] to avoid rapidly diminishing coupling rates -particularly for nanomechanical oscillators that have minimal gate capacitance.Our work shows that a substantial qubit-mechanics coupling rate can be achieved at lower frequency by using a fluxonium qubit.We show that the resulting large dispersive interaction rates, exceeding the decoherence rates of both systems, enable phonon number-resolved measurements of mechanical resonators through the fluxonium and time-dependent coherence measurements of the os-cillator.Remarkably, we are able to achieve large dispersive cooperativities despite working at a large detuning (∆/g ≥ 10) as compared to previous cQAD demonstrations.Larger detunings allow our mechanical resonator to be more effectively isolated from the qubit, and viceversa, an aspect that will become more important for longer-lived resonances.
In this work we demonstrate strong dispersive coupling between a lithium niobate (LN) phononic crystal resonator at ω m /2π ≈ 690 MHz and a superconducting qubit, allowing us to resolve individual phonon levels.The device is composed of a fluxonium circuit capacitively coupled to an on-chip readout resonator [25,26] and heterogeneously integrated [18,20] with a nanomechanical phononic crystal cavity [16,17].The effective electrical circuit as well as microscope images of the key components at different scales are depicted in Fig. 1.The mechanical frequency is approximately a factor of 3 smaller than the similar resonators in Ref. [20].In this frequency range, and at a resonator temperature of T eff ∼ 30 mK, thermal excitations are significantly more common ( ω m /k B T ∼ 1).To address this lower frequency we use a "light" fluxonium qubit [25,27] that preserves the insensitivity to charge noise and GHz-frequency readout associated with transmons [28], while also realizing a large qubit-phonon coupling rate g/2π ≈ 13.5 MHz.The qubit and mechanics are fabricated on separate chips and coupled capacitively across a vacuum gap.We primarily operate the qubit in the strong dispersive regime (ω eg,0 = ω m0 + ∆ where |∆|/g 1).The dynamics are then described approximately by an effective Hamiltonian [29,30], Eqn. 1 suggests that we can perform quantum nondemolition (QND) measurements of phonon population by probing the qubit and that we can perform QND measurements of qubit population by probing the mechanics.When the shift-per-excitation 2χ m exceeds the linewidth of the qubit, individual phonon states become resolvable in the qubit excitation spectrum [31] as suggested by eqn. 1.In our system, ω/k B T ∼ 1, and the equilibrium thermal state contains excitations above the ground state for both mechanics and qubit.Therefore we anticipate an excitation spectrum that follows the thermal distribution of the system.High fidelity gates generally require starting from a pure state, e.g., by cooling the system to its ground state [32,33] or otherwise stabilizing in a low-entropy state [26,34].In this work we demonstrate mechanics-fluxonium coupling in the disper-sive regime, and observe the thermal excitation spectrum of the system.We also perform partially-coherent operations on the initial thermal state to demonstrate feasibility of phonon-number measurement and single-phonon state preparation.With modest improvements in qubit frequency stability, discussed in Appendix I, we anticipate high-fidelty state preparation, and single-phonon control in this platform.We organize our work in two parts.In Sec.II we focus on characterizing the coupling and measuring the level structure in the dispersive regime.We first observe strong resonant coupling by tuning the qubit through the mechanics, performing two-tone spectroscopy and measuring the minimum splitting of the avoided crossing.We then detune the qubit to ∆ coherent /g eg ≈ 9 and observe phonon-number resolved transitions following coherent excitation of the mechanics [13,17].In Sec.III we focus on implementing partially coherent gates between the phonon and qubit.We tune the qubit to ∆ swap /g eg ≈ 11 and modulate the qubit frequency to swap single-photonlike states from the qubit into the mechanics.Using swap-like operations by frequency modulation, we measure energy decay (T 1m ) and phase decay (T 2m ) of the mechanics [20].

A. Resonant coupling
The qudit arises from a fluxonium superconducting circuit [25][26][27][35][36][37][38] with Hamiltonian given by Ĥq = 4E C n2 q − E J cos φq + where φ e = 2πΦ e /Φ 0 is the external flux bias in units of reduced flux quantum.The qudit-mechanics coupling is described by a linear piezoelectric interaction [2], We are primarily interested in the coupling to the qubit (g, e) transition, which occurs with a rate g eg ≡ g m | q g|n q |e q |.By varying the DC current flowing in the flux line, we tune the qubit frequency ω eg through the anticipated mechanical frequency ω m0 and observe an avoided crossing of width 2g eg /2π ≈ 27.1 MHz near ω m0 /2π ≈ 692 MHz, shown in Fig. 2a.The normalized coupling g eg /ω m0 ≈ 1.9% is on the same order as in strongly coupled superconducting-only systems [30].
The experimental spectra we observe display features that are typically absent for transmon-mechanical avoided crossings measured with GHz-frequency mechanical resonators [12,17].We observe additional peaks between the outer branches of the avoided crossing.We interpret these peaks as representing transitions between levels above the ground state |g0 , visible in spectroscopy Spectroscopy of qubit-mechanics coupling.(a) Qubit spectrum as a function of applied magnetic flux Φe in units of the magnetic flux quantum Φ0.Solid curves denote first-order transition frequencies predicted by diagonalizing the coupled qudit-mechanical Hamiltonian; the procedure for determining model parameters is detailed in Appendix D. Transitions are labeled using the undressed basis states {|qubit ⊗ |mechanics } with greatest overlap to the eigenstates involved in each transition.These labels change when passing through the avoided crossing and are not intended as quantitative descriptions of the eigenstates, as the spectroscopy window covers a region of strong hybridization.(b) Finer spectrum taken along the vertical dashed line in (a), at the approximate center of the avoided crossing.The qubit excitation amplitude is reduced by a factor of 5, and the unlabeled peak seen near 681.5 MHz in (a) no longer appears.We attribute this peak to a second-order transition [11].Vertical dashed lines denote the transition frequencies predicted in (a) and align well to the lower peaks; less well to the upper peaks.
as they are thermally excited [39,40].To verify this interpretation, we diagonalize the original Hamiltonian [41,42] Ĥ = ω m0 b †b + Ĥq + Ĥint exactly and fit the energy differences between the eigenvalues to the observed peak frequencies.Except for a flux-independent feature at 697 MHz, which we attribute to a weakly coupled parasitic mechanical resonance, the observed transition frequencies agree with the model.The spectrum shown in Fig. 2a is power broadened, preventing us from resolving the individual transitions.We reduce the excitation power and perform a narrower band sweep at a fixed flux of 0.49Φ 0 , and observe resolved peaks near 685 MHz on the lower-frequency side of the window agreeing with the theoretical model (Fig. 2b).The corresponding peaks on the higher-frequency side do not agree quantitatively with the model, likely due to coupling to the parasitic mechanical mode.

B. Dispersive coupling
In the dispersive regime, the qubit frequency is shifted by 2χ m for each phonon excitation.We can resolve this splitting in the (g, e) transition by exciting the qubit with pulses at different center frequencies.In our measurement [17], we first coherently drive the mechanical resonator to modify the phonon number distribution.We then drive the qubit (g, e) transition while varying the pulse center frequency, after which we measure the qubit state.To select the qubit detuning ∆ coherent , we step the flux bias from ∆/g eg ∼ 3 to ∼ 13 and measure qubit coherence times, shown in Fig. 3a.We choose our qubit-mechanics detuning to be ∆ coherent /g eg ≈ 9 to simultaneously achieve a large detuning (which makes the measurement more QND) and maximize T 1q .The qubit frequency corresponding to this detuning is ω eg /2π ∼ 816 MHz.We then measure the qubit (g, e) spectrum for the case of no driving, and after driving the mechanics with increasing amplitude.In each case, the mechanics is driven for T pump = 1 µs and the qubit is excited for T probe = 5 µs at each frequency point [44].The measured spectra are shown in Fig. 3b.In each spectrum we observe multiple peaks, and we observe more peaks at larger drive amplitudes.To estimate the dispersive shift, we calculate the splitting between peaks representing phonon states |0 m and |1 m and obtain 2χ m /2π = 2.23 ± 0.01 MHz.
The area A(n) under the n th peak of the qubit excitation spectrum is proportional to the probability P (n) of phonon state |n m .This allows us to determine the phonon number distribution and to characterize the effect of the coherent driving on the phonon population [13,31].We fit each spectrum in Fig. 3b to a sum of Voigt profiles to obtain the areas A(n), and use the areas to calculate a mean phonon number n for each drive amplitude.We anticipate that a thermally excited mechanical system with a mean population of nth , when coherently displaced by α, will result in a mean phonon population of n = nth + |α| 2 .Because α should be proportional to drive amplitude, we plot the n obtained from P (n) as a function of squared drive amplitude, shown in Fig. 3d.The linear fit yields a thermal phonon number nth = 0.57±0.06,which for the 690 MHz mechanical mode corresponds to an effective temperature T eff = 33 ± 2 mK.
In addition to a shift of the dressed qubit frequency with mechanical excitation number, we expect to see an equivalent shift of the mechanical frequency corresponding to the qubit (g, e) state.Thermal population of |e q leads to a second peak in the mechanical spectrum.To verify the dispersive model, we measure the mechanical spectrum for varying flux biases with the qubit detuned outside the spectroscopy window.We observe two peaks at frequencies in good agreement with theory, shown in Fig. 3c.With the qubit detuned from the mechanics by ∆ coherent , the dispersive shift 2χ m obtained from the qubit peak splitting in Fig. 3b also agrees with the mechanical peak splitting.We observe that the mechanical peak splitting decreases faster with increasing ∆ than predicted by the simplified two-level qubit model where 2χ m ≈ 2|g eg | 2 /∆.We find (Appendix D) that this be-havior is consistent with the contribution of higher qubit levels to the dispersive shift, and similar to what is observed in the transmon-resonator system [17,28].In Fig. 3c we also observe regions of decreased peak amplitude.We attribute the reduced signal to resonant couplings between the qubit transition and parasitic mechanical modes at higher frequencies, where the qubit transition frequency is outside of the band gap of the phononic crystal (595−739 MHz).Frequency crowding involving parasitic modes may interfere with control of the target mode and will be addressed in future studies.

III. MEASURING MECHANICAL LIFETIMES WITH FREQUENCY MODULATION A. Measurement sequence
We use the fluxonium to better understand the coherence properties of sub-GHz phononic crystal resonators.We operate the qubit at a large detuning from the mechanical mode and use the measurement sequence shown in Fig. 4c.First, we excite the qubit with either a π pulse to exchange populations of |g q and |e q , or a π/2 pulse to create a superposition.Next we swap the qubit excitation into the mechanical mode by modulating the flux bias at frequencies near the qubit-mechanical detuning, generating an effective coupling rate g eff that depends on the modulation amplitude [45][46][47].After the swap we allow the system to evolve freely for a time t, during which the excitation swapped into the mechanics experiences decoherence from the mechanical environment.We then modulate the flux bias again to swap the excitation back into the qubit.Finally, we measure the qubit state.We measure mechanical energy decay by using a π pulse for the initial qubit excitation.To measure mechanical phase decay, we use a π/2 pulse for the initial qubit excitation, and perform a second π/2 pulse right before measuring the qubit in the (g, e) basis.For a qubit-mechanical system that begins in the ground state, these experiments measure the coherence properties of a qubit encoded in the |0 m , |1 m states of the mechanical oscillator [20].Due to thermal excitations in our system, the probability of the system beginning in the ground state is reduced, and we expect the experiments to give us some information about the the decay rates of phonon states up to approximately |3 m .
We choose the qubit control parameters for mechanical coherence measurements by considering the physical mechanism of the swap operation.Modulating the flux bias at frequency ω mod generates a time-dependent qubit frequency ω eg (t) = ωeg + ε mod cos(ω mod t + θ mod ).The time-dependent frequency creates sidebands of the qubit state |e q , shown in Fig. 4a, with frequency spacing equal to f mod = ω mod /2π [45].When a sideband is near resonance with the mechanical mode, Rabi oscillations exchange excitations between the qubit and mechanical mode.We couple the first lower sideband to the mechan-  FIG. 4. First-order sideband coupling.(a) Schematic of a Rabi oscillation experiment, driven by flux-modulating the qubit [45].Parasitic modes are sketched to emphasize the need for a frequency-selective interaction.Translucent circles represent residual probabilities due to initial thermal populations.(b) Pulse sequence for Rabi experiment.The pulse envelope describes a timedependent sideband coupling g eff (τ ).(c) Pulse sequence [20] for measuring T1m or T2m.A variable delay time t separates two swap pulses.(d) Qubit response as a function of frequency and amplitude of the flux-modulation pulse in (b).The response is measured relative to a reference experiment where the qubit Xπ pulse is performed with no modulation afterward.The dashed line indicates the modulation amplitude chosen for swap pulses.(e) Qubit response as a function of modulation frequency and short delay times for the pulse sequence in (c), with R = I.When the sideband is detuned by ∆ from the target mode, we observe oscillations at frequencies differing from ∆ and attribute them to parasitic couplings [11].The dashed line indicates the modulation frequency chosen for swap pulses.(f) T1m measurement for the mechanics, using R = I.Shading indicates one simultaneous standard error in all fit parameters.(g) T2m measurement for the mechanics, using R = X π/2 .
ics by modulating at a frequency f mod ∼ (ω eg − ω m )/2π, driving Rabi oscillations at a rate 2g eff given by, where J 1 is the first-order Bessel function of the first kind.To study the target mechanical mode, we avoid unwanted interactions by ensuring that no other qubit sidebands are near resonance with strongly-coupled mechanical modes [48].We find that a qubit-mechanical detuning of ∆ swap /g eg ≈ 11 is suitable as it avoids interactions with a second strongly-coupled mode at 950 MHz and reduces the parasitic coupling between the second mode and the first upper sideband.We calibrate a swap pulse by first exciting the qubit with a π pulse, then applying a flux modulation pulse with variable frequency and amplitude (Fig. 4b).The modulation pulse duration is fixed at τ mod = 100 ns including ramps of duration 10 ns on each side.We observe Rabi oscillation as we sweep the amplitude of the pulse as shown in Fig. 4d.A large response indicates a significant population transfer from |e q to |g q , and we choose the modulation amplitude that maximizes the response.An ideal Rabi oscillation pattern is symmetric about the resonant modulation frequency, however we observe a pattern that bends toward higher frequencies as modulation amplitude increases.We attribute this bending to nonlinearity in the flux modulation, causing a small shift in the time-averaged qubit frequency ωeg with increasing modulation amplitude [49,50].Because this frequency shift is of similar magnitude to the effective coupling g eff , we perform a second calibration to verify the resonant modulation frequency.For this calibration we measure energy relaxation of the mechanical mode using the pulse sequence in Fig. 4c, following the process described above for delays t up to 2 µs, while sweeping the modulation frequency.We observe multiple oscillations in the data (Fig. 4e), except at nearly-resonant modulation frequencies f mod ∼ 155.6 ± 0.8 MHz.We choose f mod = 155.6MHz for our swap pulse.

B. Mechanical coherence
After calibrating the swap operation, we measure mechanical coherence using the pulse sequence in Fig. 4c while sweeping delays t over a wider range.The result of an energy-relaxation experiment is shown in Fig. 4f.We observe a multi-exponential curve that is described well by the sum of three decaying exponentials, similarly to relaxation curves observed for phononic crys-TABLE I. Dispersive cooperativities in quantum acoustics.We use the following abbreviations for mechanical resonators: PNC (phononic crystal ), BAW (bulk acoustic waves), SAW (surface acoustic waves), and DRUM (voltagebiased drumhead )."+" represents, "coupled to".Cooperativities are rounded to two figures.Values with an asterisk ( * ) are predicted using a hypothetical detuning ∆ and device parameters reported in the corresponding reference.

IV. CONCLUSIONS
We have demonstrated dispersive phonon-resolving measurements of a piezoelectric resonator below 1 GHz using a superconducting qubit in the light-fluxonium regime.We engineered a large qubit-phonon coupling rate within an order of magnitude of the ultra-strong regime [51] and have leveraged this strong coupling to measure mechanical coherence by flux-modulating the qubit.We observe large dispersive cooperativities of a few hundred (Table I) while operating within the QND regime at a detuning ∆ swap /g eg > 10.The large cooperativities indicate a strong dispersive interaction between the qubit and mechanics, which exceeds the decoherence rates of both systems.They enable phonon number resolved measurements of our mechanical resonator, and we use these to perform a dissipation and dephasing study of our mechanical system [21].Our results open the way for new quantum sensing and infor-mation processing schemes with phonons at frequencies below 700 MHz.The mechanical frequencies of the resonator in our approach can be readily extended down to 100 MHz by modifying the fluxonium and phononic crystal parameters.Challenges with going to these even lower frequencies include the difficulty in reproducibly fabricating single Josephson junctions with low energies E J /h 1.5 GHz [37,52], which limits our ability to operate light-fluxonium qubits at arbitrarily low frequencies, and the requirement to etch deeper than 250 nm into lithium niobate to realize thicker phononic crystals at lower frequencies.For the latter, we have recently demonstrated high quality ion-mill etching of lithium niobate with a depth approaching 700 nm for optical devices [53].Another experimental limitation of this work was the slow fluctuation in the qubit transition frequency, which we discuss in Appendix E. Frequency fluctuation limited the usable lifetime of calibration measurements to less than one day, preventing us from effectively calibrating a cooling protocol.We suggest experimental modifications to improve frequency stability and implement cooling in Appendix I. Finally, in contrast to approaches using the transmon, understanding the phononic crystal response outside of its bandgap is important, particularly to effectively drive fluxonium dynamics beyond |g q and |e q .No ground plane is used, however a 50 nm-thick aluminum film is patterned underneath the 900 nm spacers (long horizontal rectangles) such that the base of the spacers is coplanar with the top surface of the coupling capacitor pads as if a ground plane were present.The top chip is designed with rotational symmetry to enable coupling mechanics on either side to the qubit.flip-chip geometry as the final step in fabrication.All electron-beam lithography (EBL) masks are patterned with a JEOL JBX-6300FS (100 kV), and all photolithography masks are patterned with a Heidelberg MLA150 direct-writer (405 nm).All lift-off masks are treated with gentle downstream oxygen plasma to remove polymer residues from interfaces before depositing additional material.An image of the final device is shown in Fig. 5e.
Mechanical oscillators are patterned in thin-film lithium niobate (LN), X-cut with 5 mol% MgO codoping, bonded to a silicon 111 substrate.The fabrication procedure consists of initial film preparation followed by six patterned masks.Starting with an LN thickness of approximately 500 nm, samples are thermally annealed for 8 hours at 500 C, then the LN film is thinned to a target of 250±5 nm by blanket argon ion milling.Mask 1 defines the mechanical structures by EBL using a hydrogen silsesquioxane (HSQ) mask, followed by argon ion milling.Remaining HSQ and redeposited material are removed in a heated bath of dilute hydrofluoric acid followed by baths of piranha and buffered oxide etchant.Qubit circuits are patterned in aluminum on a 525 µm high-resistivity silicon substrate (ρ > 10 kΩ • cm).The two-mask fabrication procedure is based on Refs.[37,54].Before patterning, the substrate is cleaned in baths of piranha and buffered oxide etchant.Mask 1 patterns qubit electrodes and Josephson junctions by EBL and liftoff, using a Dolan-bridge method [55] similar to the patterning of 3D antenna qubits.The geometry of the junctionarray inductor is adapted to the asymmetric double-angle evaporation recipe for the T-style single junction [54].Mask 2 patterns the ground plane, readout resonator, and all control lines for the qubit, by photolithography and liftoff (150 nm target Al thickness).Our circuit fabrication in this work prioritizes expedience rather than qubit coherence, and we discuss improvements in Appendix I.
The final fabrication step uses a submicron die bonder (Finetech Fineplacer Lambda) to align the mechanics top-chip to the qubit bottom-chip.The top chip is secured using an adhesive polymer (9:1 ethanol:GE Varnish) applied manually to opposing edges.In our circuit layout, it is necessary to complete the flux control line with a wirebond between on-chip bond pads [56], which was chosen to simplify routing of coplanar waveguides (CPWs) within the boundaries of the circuit chip (6.9 mm × 2.9 mm).The mechanics chip is relatively small (1.5 mm × 1.4 mm), to enable manual application of the adhesive without overlapping the superconducting circuits.Manual handling of the mechanics chips after the xenon difluoride etch is minimized, as some previous chips flew away or flipped over before flip-chip bonding due to small agitations on nearby surfaces.

Appendix B: Experimental setup
The experimental setup is shown in Fig. 6.We use a 5 GS/s arbitrary waveform generator (AWG) (Tektronix series 5200) for all pulsed experiments in this work.AWG channels (1, 2, 3) respectively output signals for qubit excitation, readout, and flux modulation; channel 4 could be utilized in future work to cool the qubit using few-GHz pulses.Analog upconversion is used to output readout signals near 4.92 GHz.All control lines are coaxial except between the DC source at room temperature and the 10 mK plate, which uses a shielded twisted pair with one terminal connected to fridge ground.We use Keysight E8257D sources for local oscillators and to pump a travelling wave parametric amplifier (TWPA).We operate the TWPA [60] at a conservative signal-to-noise gain of 15 dB near 4.92 GHz to minimize spurious frequency content, with pump frequency at 6.344 GHz.Heterodyne data are collected using analog downconversion of the readout signal to 125 MHz, 12 − bit digital acquisition at 500 MS/s (AlazarTech ATS9350), and digital down-conversion of one of ±125 MHz to DC.
We attempt to reduce current noise in the flux line using filtering, attenuation, and thermalization of components at the 10 mK stage using copper braid and largearea contact with copper mounts.While the qubit still displays a large pure-dephasing rate (see Appendix F), even a marginal improvement in T 2q is useful for the experiments in this work as the number-splitting measurements would be severely limited by a factor-of-two increase in qubit linewidth.The DC flux line is wired to favor voltage-biasing, in which case the on-chip current noise due to the source is limited by an 8 kΩ series resistance in the RC filter at the 3 K stage (Aivon Therma-24G).We use an SRS SIM928 for the DC voltage source and add an ultralowpass RC filter across the output [26], contributing another 1 kΩ of series resistance.Our use of GHz-and MHz-cutoff lowpass filters in the DC flux line and a modified bias tee with capacitor removed from the AC input port also follow Ref. [26].The RF flux line was originally intended for fast DC pulses [20], then reconfigured for RF modulation after frequency drift in similar qubit designs suggested instability in the DC flux bias.

Appendix C: Device design
We model the experimental device with three modes corresponding to qudit, target mechanical mode, and readout mode: A typical process for modeling Hamiltonian parameters is outlined in Fig. 7.We design the mechanical resonator is first, as it constrains designs for the coupling circuit.

Mechanics
We design the mechanical resonator using finiteelement simulations in COMSOL Multiphysics [17,[19][20][21].The crystal Z-axis of the lithium niobate film is oriented perpendicular (horizontal) to the cavity propagation axis (vertical in Fig. 7a).We assume target thicknesses t (LN, Al) = (250, 50) nm, and we approximate fabrication imperfections using a sidewall angle θ sw = 12.2 • and a corner rounding radius of 50 nm.To determine the mirror cell dimensions we seek a band gap scaled down in frequency by a factor of 3 relative to Ref. [20], tripling the lattice constant to a = 3.0 µm.We rescale other planar dimensions by similar factors, yielding a simulated band gap between 595 and 739 MHz.The fractional band gap (0.216) is smaller than in previous work, which we attribute to our use of a conservatively large strut width s = 300 nm.
We simulate the electroacoustic admittance across the electrodes to study mechanical resonances.To increase the qubit-mechanics coupling strength, we choose the electrode length L e = 1.05 µm to be a large fraction of the defect half-length L y /2 = 2.95/2 µm, and we sweep the defect width L x over a wide range.To reduce computation time for this sweep, we simulate an isolated defect cell with clamped boundary conditions halfway along the struts leading to the defect, which we find raises predicted resonant frequencies by 10s of MHz.A typical result is shown in Fig. 7c: pairs of admittance poles (red, left) and zeros (blue, right) form curves near the target frequency range, with large pole-zero splitting indicating strong coupling to the electrodes.Three distinct strongcoupling regions are visible in a column around 700 MHz.i.

Read Mech
Qudit (2) (5) b.The lowest pole-zero pair corresponds to mode shapes in previous work, resembling half-wavelength shear resonances.The next pole-zero pair corresponds to mode shapes resembling 3/2 shear wavelengths shown in Fig. 7d for the chosen L x = 7.0 µm.While we do not observe an increase in the pole-zero splitting using this mode shape, we predict that the increased capacitance between the wide electrodes nevertheless increases the coupling g m to the qudit charge.After choosing a defect width, we simulate the the full phononic crystal resonator and observe confinement of the mechanical mode displacement to the defect by over 4 orders of magnitude (Fig. 7d).We fit the admittance near the target mode frequency (Fig. 7e) to an LC model [19] and extract circuit parameters shown in Table II Parasitic mechanical resonances are visible as additional peaks and dips in Fig. 7e.The experimental device was designed by requiring at least 20 MHz of separation between the pole of the target mode and any other extremum in the simulated admittance.While Fig. 7e satisfies this, experiments were still limited in part by frequency-crowding due to non-design modes.An example of a non-design mode is shown in 7f, corresponding to a barely-visible blip in the admittance at 700.5 MHz.This feature was overlooked in designs, where the frequency sweep was not fine enough to detect it.However, we observe a parasitic mode near 697 MHz in experiments (Fig. 2), and this mode may have interfered with measurements of the qubit-mechanical level structure.Fu-ture experiments will benefit from a larger free spectral range between the design mode and other mechanical resonances.

Circuit
Circuit design amounts to choosing E J and E L , and simulating the capacitive network to predict E C .We design the metal geometry for large qubit-mechanics coupling g eg subject to a the following conditions: (1) to observe resonant coupling, the minimum qubit frequency lies below the mechanical frequency, (2) to improve qubit coherence in the dispersive regime, the minimum qubit frequency lies within the primary band gap of the phononic crystal, detuned below the mechanics by several g eg , and, (3) to obtain a large readout dispersive shift, the qudit-readout coupling g r /2π 25 MHz and detuning |ω r0 − ω hg,0 |/2π ∼ 100 MHz (in our target regime | q g|n q |h q | ∼ 0.3, so the coupling remains dispersive).The device we implemented experimentally in this work only partially satisfies these conditions.Condition (3) and the second half of (2) are not met.This is due to our use of a top chip with stronger mechanical coupling and smaller capacitive loading relative to designs considered for the bottom chip.Contributions to this effect include removing the top-chip ground plane used in previous works, and decreasing the target flip-chip gap from 1.0 µm to 0.9 µm.We summarize a design process that in principle enables satisfying all the conditions.The regime of (E C , E J , E L ) targeted in designs follows Ref. [37], in the neighborhood of qubits (A, D) tabulated therein.A representative example of target parameters is: E C /h = 0.7 GHz, E J /h = 3.0 GHz, E L /h = 1.0 GHz.The fluxonium regime typically satisfies 1 E J /E C 10 and E L /E J 1.Here E L /E J ∼ 1/3 pushes the upper edge of the fluxonium regime such that near halfflux, the harmonic confinement surrounding the doublewell potential is steep, and the computational states |g, e q are not strongly localized in the two wells.This "light fluxonium" is no longer protected from T 1 -type decay as the transition element | q g|n q |e q | ∼ 0.2 is not strongly suppressed by localization.The low spectrum remains strongly sensitive to E J /E C , so we target values of E C /h = e 2 /2C Σ within an accuracy of ±25 MHz, corresponding to an accuracy of ±1 fF in the effective qubit capacitance C Σ .We therefore attempt to account for all fF-scale contributions to C Σ .Starting from a simulated capacitive network sketched in 7h, we consider three additional sources of capacitance.First, an additional electrostatic simulation of the aluminum wires extending across LN tethering structures (7g) suggests an additional C tethers = 1.0 fF, added in parallel to circuit branch (5,6).Second, for the single junction we assume a plasma frequency ω J /2π = √ 8E CJ E J /h ∼ 20 to 25 MHz, suggesting an additional C J ∼ 1 ± 0.2 fF added across branch (1,2).Finally, for the array of N = 74 junctions we estimate a characteristic impedance Z A = L JA /C A ∼ 10 to 15 kΩ using techniques in Refs.[52,61], adding C A ∼ 1.2 ± 0.5 fF across branch (1, 2) [62].
To quantize the circuit, we first reduce the model in Fig. 7h to its three dynamical coordinates using a procedure similar to Ref. [35], yielding the equivalent circuit in Fig. 7i.We omit the readout effective resistance R r and replace the voltage driving node 4 with a short for simplicity in the diagram (extending the calculation to include drives is straightforward).A generic static Lagrangian modeling circuit QED is [42,63], where Φ is a vector of node-flux coordinates, Φ e is a vector of external flux biases, C is the Maxwell capacitance matrix, and U is a potential function describing inductors and Josephson junctions.For our relatively simple circuit we identify dynamical coordinates by eye.If the potential function U contains no couplings between a sub-graph G and the rest of the circuit (including ground), there is a conserved charge, Substituting each instance of eqn.C3 or C4 into eqn.C2 reduces the number of coordinates by one.We use this method to remove node 5 and substitute (Φ q ≡ Φ 1 − Φ 2 , Φ m ≡ Φ 7 − Φ 6 ).Quantization follows from Legendre transform in the reduced coordinates: H = j∈(q, m, r) (∂ Φj L red ) Φj − L red .A more general method for coordinate transformations in circuit QED is provided in Ref. [42].
The piezoelectric coupling to the (g, e) transition is given by, where red ) qq , and the bound on the charge transition element is derived in Ref. [2].The bound is saturated exactly for linear circuits, while for anharmonic qubits we find transmons achieve 99% of the bound and light fluxoniums can be engineered to achieve 70 − 80% of the bound.The utility of strongly-anharmonic circuits for strong coupling is dominated by a large charging energy E C or equivalently a small capacitance C Σ = 1/(C −1 red ) qq .
Using a light fluxonium, we predict an increase in β qm by a factor of 4 to 5 relative to a transmon near 700 MHz, compensating for the fluxonium's reduced charge element.Finally, we consider the largest resonant coupling achievable between a qubit and a piezoelectric mechanical mode.Starting with g eg /ω < β qm /2, we estimate an upper bound for β qm using a simplified model where the simulated admittance Y m (ω) in Fig. 7h is shunted by a capacitance C q and an arbitrary potential element that sets the qubit on resonance with the mechanics.This describes an ideal coupling circuit where the parasitic capacitance network is eliminated and the qubit is galvanically connected to the mechanical electrodes.In this model, where K 2 is the electroacoustic coupling constant [19,64], and the bound is obtained in the limit of negligible C q .Using this bound we predict that the ultra-strong coupling regime g eg /ω > 0.1 requires K 2 > 0.05.For the mechanical admittance in this work we fit K 2 = 0.16, and for other defect geometries we fit K 2 ∼ 0.2 − 0.25, suggesting that ultra-strong coupling may be possible using an improved coupling circuit.The bound in eqn.C6 is optimistic, and increasing β qm in experiments remains a topic of future work.

Appendix D: Fitting tuning spectrum
Experimental device characterization involves determining parameters in the coupled qudit-mechanics Hamiltonian, given explicitly by, where we have ignored coupling to the readout mode in eqn.C1.The qudit energies (E C , E J , E L ) are obtained by measuring and fitting the frequency of one of more qudit transitions for variable flux bias φ e = 2πΦ e /Φ 0 .While for fluxonium it may be preferable to to measure and fit an extended spectrum containing several transitions and/or a large fraction of a flux-tuning period [35,37], we observe inconsistent visibility of GHzfrequency qudit transitions in two-tone spectroscopy and therefore use a restricted fit to the qubit frequency ω eg (Φ e )/2π < 1 GHz.
To calibrate external flux, the tuning period with respect to voltage bias is determined independently from  7d, equivalent circuit parameters, and mode parameters for the test device measured at room temperature (RT) by reflection off test ports (1,2) in Fig. 5, using a VNA and −45 dBm output power.Table follows Ref. [21] and parameter conversions given in Ref. [ measurements of the readout mode with a vector network analyzer (VNA) as the voltage bias is swept (Fig. 8a), yielding V period = 25.56 ± 0.04 V. We perform this measurement before all qubit spectroscopy to avoid suspected hysteresis in the qubit frequency associated with larger variations in applied flux.While we observed hysteresis with previous iterations of the device, we do not observe hysteresis for the device in this work.The voltage bias at half-flux is determined as V half = 7.540 ± 0.01 V from symmetry about the minimum qubit-like frequency in Fig. 2a.
To fit Hamiltonian parameters given the flux calibration, we first combine measurements of qubit (g, e)-like peak frequencies below 1 GHz, including the symmetry point at half-flux, and excluding centers of avoided crossings with non-target modes (Fig. 8)b.Despite the frequency tuning extending far beyond the avoided crossing with the target mode, we include the coupling g m in the tuning fit because it contributes a large shift to the minimum qubit-like frequency.To expedite fitting we truncate the coupling in eqn.D1 to a Jaynes-Cummings a. Charge transition elements for the bare qudit, predicted using fitted energies.The (e, f ) transition is included because |e q has substantial thermal population and because the transition contributes non-negligibly to the qubit-mechanics dispersive shift in eqn.D4.(e) Comparing predicted dispersive shifts using numerical diagonalization (fine-dashed curve) and perturbation theory with varying level truncations.Solid curves represent truncations including |f q and higher, and agree better with diagonalization compared with truncation at |e q (coarse-dashed curve).Inset shows the main regime utilized in this work, with (solid, dashed) vertical lines respectively indicating the qubit biases labeled (∆ coherent , ∆swap) in the main text.
model [65], where g eg ≡ g m | q g|n q |e q |.The approximate qubit-like frequency is, where δ 0 ≡ ω eg,0 − ω m0 is the bare-basis detuning and ω eg,0 is calculated from the first line of eqn.D1.Fitting to eqn.D3 gives: E C /h = 0.8016 GHz, E J /h = 2.6349 GHz, E L /h = 0.7966 GHz, ω m0 /2π = 691.71MHz, and g m /2π = 67.0MHz.We then calculate the eigenfrequencies of the qubit-mechanical system and compare predicted transition frequencies to the spectroscopy measurements in Fig. 2a.We hold (E C , E J , E L ) fixed to the above values and adjust ω m0 and g m to improve agreement between the model curves and data.We evaluate the agreement by eye, so we assume error bars given by (ω m0 ) or propagated from (g m ) the frequency step in the spectra (0.25 MHz).We find ω m0 /2π = 691.75MHz and g m /2π = 66.6 MHz, used for all model calculations.A summary of system parameters and error bars is given in Table III.
Experimental data suggest that the qubit-mechanics dispersive shift |2χ m | < 2|g 2 eg /∆|, which can occur when qudit levels above |e q contribute to the shift.Secondorder perturbation theory gives an expression [36] for the mechanical frequency shift given qudit state |j q , where g jk ≡ ig m ( q j|n q |k q ), and ω kj,0 ≡ ω k,0 − ω j,0 are transition frequencies in the bare qubit spectrum (Fig. 8c).Eqn.D4 includes Bloch-Siegert shifts [66,67], which are important for the dispersive contribution of qudit transitions ω kj,0 that are far-detuned from the mechanical frequency ω m0 .The peak separation in number-splitting experiments is approximately 2χ m = χ m,e − χ m,g , and the mechanical frequency receives a vacuum shift δω m ≈ (χ m,e + χ m,g )/2.In Fig. 8 we predict 2χ m using joint diagonalization and perturbation theory (PT), sweeping the number of states k used in eqn.D4.The PT accuracy improves greatly when the third qudit level |f q is included, after which including levels above |f q contributes minimal shift, similarly to a transmon-resonator system [28].

Appendix E: Tracking frequency drift
When the qubit (g, e) transition is detuned in the dispersive regime above the mechanics, we observe frequency drift on the order of the qubit linewidth over time scales in the tens of minutes.In principle these drifts can be corrected by actively feeding back onto the flux line current to keep the qubit energy fixed.Given that the shifts are small, we find it more convenient to correct this effect in software while post-processing the data.To generate the spectra in Fig. 3b of the main text, we partially  correct for drift by measuring spectra repeatedly, detecting the frequency of a reference peak in post-processing, and aligning spectra to negate the drift of the reference peak [17].An example of this process is shown in Fig- ure 9 for the largest mechanical displacement shown in the main text.We choose the zero-phonon peak of the phonon-number spectra as the reference peak and do not measure additional spectra for peak-tracking purposes.
For larger excitation amplitudes the zero-phonon peak is smaller, and we improve the accuracy of peak detection by averaging neighboring spectra together in small bins of 2 or 3 before fitting the frequency of the reference peak.This approach benefits from fast repetition of measurements relative to the frequency drift.Resolution of phonon-number peaks up to |4 m can be seen in Fig. 9d even without post-processing.The post-processed data improves resolution and symmetry of the peaks and is used for the spectral fits shown in the main text.Using the Hamiltonian parameters extracted in section D, we predict that the dispersive shift 2χ m varies by no more than 3% during the observed frequency drifts, contributing a small broadening ∝ n to the |n m peak similarly to phonon loss.We expect this broadening to be negligible relative to the MHz-scale qubit linewidth.

Appendix F: Qubit dephasing
Fast qubit dephasing is a major limitation in this work.We fit the maximum T 2e,q < 4 µs at half-flux, T 2e,q < 2 µs in the dispersive regime, and we observe T 2e,q < T 1q in all cases, suggesting that pure dephasing dominates even with first-order insensitivity to flux noise.We use single-pulse echo measurements for T 2e,q [43] to suppress dispersive frequency components from thermal phonon occupations P m (n = 1, 2) ∼ (0.23, 0.09), noting that this also refocuses slow dephasing from 1/f noise so T 2e,q tends to exceed the Ramsey T 2q .To extract T 2e,q , we fit decay traces to stretched-exponential func-a.
b.  tions exp(−(t/T 2e ) n ) following Ref.[68], with fitted values of n shown in Fig. 10a.Physically, the decay might be described by the product of exponential decay due to white noise and Gaussian decay due to 1/f -noise [26,43], i.e. exp(−t/T C − t 2 /T 2 φ ).We interpret the stretchedexponential fits as approximations to extract an effective 1/e decay time that is easily bounded from the data, and a stretching index n that varies between 1 for dominant exponential decay and 2 for dominant Gaussian decay.In principle, n < 1 could approximate multi-exponential decay, and a comparison to an exponential decay fit is shown in Fig. 10b.
We also perform non-echo Ramsey measurements with qubit at ∆ swap to estimate T 2q .Fig. 10c shows a timedomain Ramsey fit and its Fourier transform, with model given by [20,21], where {A n , T 2q , ω 0 , 2χ m } are fit parameters and we take N max = 2 after initial fits yielded values of A 3 below the noise floor.The phase offset is ϕ n = 2χ m nt d , and we set t d = 1.13 × (τ pulse = 50 ns) following simulations in Ref. [20] for qubit X π/2 pulses of the same shape.For the T 2q decay envelope we find more accurate fits using a regular exponential compared to a stretched exponential or Gaussian.Fitting yields T 2q = 0.33 ± 0.01 µs and 2χ m /2π = 1.67 ± 0.02 MHz.To interpret the short dephasing times, we discuss two contributions to dephasing that may be particularly large for fluxonium qubits and sub-GHz mechanics: strong coupling to a thermal resonator, and 1/f flux noise.

Thermal mechanics
Near half flux T 2e,q may be limited by phonon number fluctuations from the target mode.We use the following expression [69,70] with caution to estimate the limiting order of magnitude for T 2e,q due to thermal occupation of the mechanics: 1 T2e,q Γ th φ , where, (F2) Eqn.F 1 is not limited to nth,m 1, but we do not anticipate quantitative accuracy because (1) the expression applies to time scales t (1/κ m = T 1m ) but T 2e,q /T 1m 4, (2) resonator decay is assumed to occur with a single rate κ m but we observe multiexponential decay, and (3) |∆/g eg | ∼ 1.8 is not in the dispersive regime.Nevertheless, to predict the order of magnitude we use nth,m = 0.57, T 1m = 0.85 µs, and 2χ m /2π = −11.2MHz and find 1/Γ th φ ∼ 1.5 µs.For comparison we estimate the qubit pure-dephasing lifetime from measurements, T −1 φe,q ≡ 1/T 2e,q − 1/(2T 1q ).This yields T φe,q = 6.1 µs at half flux, 4 times longer than predicted using eqn.F2.For the dispersive regime accessed in this work, 2χ m /2π ≥ 1.6 MHz, such that eqn.F 1 predicts 1/Γ th φ ∼ 1.5 − 1.6 µs over the entire regime.While the dispersive approximation is more accurate at detunings such as (∆ coherent , ∆ swap ) the flux-tuning slope is relatively steep and flux noise becomes a larger contribution to the dephasing rate.

1/f noise
Our discussion in this section closely follows Refs.[43,71,72].Flux-tunable superconducting qubits are broadly affected by 1/f -type flux noise, with a spectral density of form S Φ (ω) = A 2 Φ (2π × 1 Hz/|ω|) γΦ , where γ Φ ≈ 0.8 − 1.0 and A 2 Φ ∼ (1 µΦ 0 ) 2 /Hz.The scaling factor A 2 Φ may be larger if noise from electronics such as the DC bias source is not heavily attenuated, or due to unwanted ground loops.Because qubit coherence is not the main focus in this work, we estimate only the predicted limitation on T 2q and T 2e,q at the two main static biases used in this work: (∆ coherent , ∆ swap ).We take γ Φ = 1 for simplicity.The leading-order phase decay in an Npulse Carr-Purcell-Meiboom-Gill (CPMG) experiment is approximately, (F3) where the filter functions g N for experiments in this work are g 0 (ω, t) = sinc 2 (ωt/2) for Ramsey and g 1 (ω, t) = sin 2 (ωt/4) sinc 2 (ωt/4) for one echo pulse (approximated as instantaneous).It is typical to exclude frequencies smaller than a cutoff ω c if the integral would otherwise diverge.For Ramsey experiments, (F4) where γ ≈ 0.577 is the Euler constant and we assume ω c t 1 to ignore terms at O((ω c t) 2 ); the constant 3/2 − γ can be absorbed as ln(2.516/ωc t) = ln(0.400/fc t) in analogy to Ref. [73].The value of t in the logarithm can be set to a representative value on the order of the relevant experimental T 2 , and the cutoff ω c ∼ 2π/T exp can be calculated using the total data acquisition time.For one echo pulse [74], no cutoff is needed at leading order: We define the pure dephasing time using −χ N (t) ≡ −t 2 /T 2 φ,N , and for the cutoff logarithm set t = 5 µs and ω c /2π = 1/(600s).With qubit at (∆ coherent , ∆ swap ), ∂ωeg ∂Φ ≈ 2π × (10.67, 11.34) GHz/Φ 0 , yielding T φ,0 = (3.6,3.4) µs and T φ,1 = (18, 17) µs.The observed T φe,q = (1.9,1.7) µs are shorter than the calculated T φ,1 by an order of magnitude, and resemble the phonon-fluctuation dephasing time predicted in the previous section F 1. However eqn.F2 does not explain the observed trend with tuning away from half flux, where T 2e,q decreases and the echo decay becomes more Gaussian.Furthermore, the measured ratio between single-echo and Ramsey pure-dephasing times, T φe,q /T φq ≈ 4.7, resembles the predicted ratio from eqns.F4 and F5: T φ,1 /T φ,0 ≈ 4.9.These observations could be explained more straightforwardly by a larger noise amplitude A 2 Φ and a smaller phonon-fluctuation dephasing rate.For example, noise amplitudes in the range A 2 Φ ∼ 1 − 5 µΦ 2 0 /Hz have been observed for loops of Josephson junctions [26,72].Noise amplitudes may increase with increasing geometric aspect ratio loop perimeter wire width [75], and in our device this aspect ratio ∼ 200 is relatively large.Future studies will benefit from quantitatively modeling and reducing pure dephasing.spectral data show a raised baseline that increases with mechanics drive amplitude, which could be explained by off-resonant excitation of the qubit or by a small cross-Kerr interaction between the mechanics and readout resonator.To obtain the near-zero baselines shown in Fig. 11a, we perform reference measurements in which we excite the mechanics with a coherent drive but do not measure the qubit spectrum.We then measure the qubit spectrum following the same coherent drive on the mechanics, and subtract the reference measurement from the measured spectrum.We fit each spectrum to a model with six Voigt peaks [13], each with independent Lorentzian and Gaussian linewidth parameters.We anticipate that the observed lineshapes result from four main broadening mechanisms: white noise to finite T 1q and thermal noise (Lorentzian broadening), 1/f flux noise (Gaussian broadening), frequency-shift errors in post-processing (Gaussian by design), and the frequency spectrum of the spectroscopy pulse (sinc-like approximation to Gaussian, for a sinusoidal pulse envelope in time domain).For larger drive amplitudes, we anticipate that a small population 5% in higher phonon levels n ≥ 6 is not captured within the spectroscopy window.Because of this, for each drive amplitude we fit the distribution of peak areas to a model, rather than normalizing to the total area of all observed peaks.For the model we choose the Fock distribution of a displaced thermal state [76], where τ = exp(− ω m /k B T eff ) = nth nth +1 , and L n (x) are Laguerre polynomials.Two example distributions are compared in Fig. 11b and a summary for all drive amplitudes is shown in Fig. 11c.At larger drive amplitudes we observe an expected linear trend for the fitted α, though the linear fit (dashed line) would have a larger positive intercept if we did not include zero drive amplitude in the fit.At smaller drive amplitudes the fit does not distinguish a nonzero displacement α, so we re-fit to an undisplaced thermal distribution (α = 0) and interpret only n ≈ nth + |α| 2 quantitatively in the main text.When converted to the same units, the slopes of linear fits in Figs.3d and 11c agree within one standard error.
In our number-splitting measurements, the displaced mechanical state decays during the qubit spectroscopy pulse.We use long probe pulses to reduce Fourier broadening, so the bandwidth of the probe pulse resolves the dispersive shift: 2/T probe 2χ m /2π.An ideal choice to observe larger phonon distributions would be T probe < T 1m , however this requires χ m T 1m /2π 1 which is not satisfied in this work.In both this work and Ref. [17], T probe > T 1m , limiting the size of observed phonon distributions.Despite the mechanical state undergoing significant decay during the measurement, we model the extracted phonon probabilities using displaced thermal states.We motivate this choice by noting that for a resonator undergoing single-phonon loss at rate κ m , a displacement α(0) applied to an initial thermal state decays as α(t) = α(0)e −κmt/2 regardless of the thermal occupation [77].Measurements at larger drive amplitudes where |α| 2 > nth agree well with this model, as shown in Fig. 11b for 207.5 mV.However for smaller drive amplitudes e.g. 120 mV, the model in eqn.G1 fits less accurately for n ∈ (0, 1, 2), passing through none of the error bars.This type of discrepancy appears for the three smallest nonzero drive amplitudes, and was not improved by constraining the fitted displacement α to lie near the linear fit in Fig. 11c.This behavior suggests a systematic difference between the extracted P (n) and the model for small drive amplitudes, perhaps relating to dephasing in a coupled TLS ensemble [21].More experimental data are needed to evaluate this hypothesis.

Appendix H: Rabi swap
We investigate the flux modulation pulse used to swap excitations between the qubit and mechanical mode in section III of the main text.We perform time-domain simulations using the QuTiP package [78] to model an ideal Rabi experiment shown in Fig. 4b and compare the result with data shown in Fig. 4d.The pulse envelope includes a flat-top of duration τ mod − 2τ r between ramps of duration τ r .The upward ramp is given by, where τ d = τ r / 2 sin −1 (2 −1/4 ) , (τ mod , τ r ) = (100, 10) ns, and the downward ramp follows the upward shape in reverse.We allocate the time-dependent flux to the inductor [79,80], such that the drive Hamiltonian is, Ĥd (t) = kV (t)E L φq for some constant k.The simulation involves numerically integrating the Lindblad master equation for the qudit-mechanical system, where Ĥ = Ĥ0 + Ĥd (t), Ĥ0 is given by eqn.D1 and we simulate short times up to the pulse duration τ mod .We use the following collapse operators: where κ m↓ = ( and emission/absorption rates for the qubit are assigned similarly.The Hilbert space includes N q = 6 bare qudit levels obtained by diagonalizing with N q, Fock = 100, and N m = 10 bare phonon levels. Rabi experiments sweeping drive amplitude instead of drive duration often display spurious behavior at large amplitudes, for example due to AC Stark shifts or breakdown of the rotating wave approximation (RWA).Cleaner sinusoidal Rabi chevrons might be observed by sweeping drive duration at a fixed, lower drive amplitude, as performed in many works, for example Refs.[13,18,20,48].We sweep drive amplitude instead of drive duration to circumvent an apparent instrumentation bug where the AWG fails to output flux-modulation pulses longer than 100 ns.
Fig. 12a shows results of a simulation modeling the Rabi swap calibration attempted in section III.The initial state is prepared starting with thermal equilibrium at T eff = 33 mK, followed by an ideal X π pulse modeled as Û = n (|en gn| + h.c.), where we use |jn to denote the joint eigenstate with maximum overlap to the bare state |j q ⊗ |n m .Within the dispersive approximation, the readout signal is proportional to the qubit population asymmetry P (e) − P (g), and in experiments we subtract baseline measurements of readout signal obtained by applying the X π pulse without flux modulation afterward.We therefore plot the following quantity, Signal ∝ |(P (e) − P (g)) t=τ mod − (P (e) − P (g)) t=0 |, (H4) modeling the change in population asymmetry due to the modulation pulse relative to any state preparation done a. beforehand.In the simulated signal we observe rightward bending similar to the experimental data, and we calibrate the modulation voltage: k = 1.7 × 10 −5 Φ 0 /mVpp.The simulated signal does not return to zero between fringes, also consistent with experimental observations.We find that the finite signal between fringes is dominated by dynamics outside the single-excitation subspace (g1, e0) that occur due to thermal excitations.For example, Fig. 12b shows unconditioned qubit and phonon probabilities as a function of modulation amplitude using f mod = 155.6MHz.We find that the maximum readout signal does not coincide with the maximum single-phonon probability P (1), due to smaller, faster Rabi oscillations occurring in N -excitation subspaces that contribute to the unconditioned P (e).The simulated P (1) is approximately 0.45 at maximum readout signal, while its maximum of 0.54 occurs at a higher modulation amplitude.Additional work appears necessary to maximize the probability of phonon |1 m without initial cooling.To verify phonon distributions following qubit excitation and a qubit-mechanics swap pulse, we perform Ramsey measurements following Refs.[20,21].Fourier transforms of the Ramsey data are shown in 12c, and we interpret them qualitatively.We observe two resolved peaks with spacing 2χ m /2π ∼ 1.6 − 1.7 MHz, and their relative areas appear qualitatively consistent with exchange of population between |0 m and |1 m .Data for θ/π = 1 appear qualitatively consistent with the prediction P (1) > P (0) > P (2) obtained from Fig. 12b at the modulation amplitude where "Signal" is maximum.
For T 1m measurements (Fig. 12d), we vary the phase θ of the second modulation pulse relative to the first.Results of a two-swap experiment can depend on this phase when the target swap interaction is imperfectly calibrated, or when flux modulation drives unwanted interactions.To approximate a DC ringdown curve, we take the average of the (complex) readout signal over 4 phases θ = (0, π/2, π, 3π/2), shown in Fig. 12e.This approximately cancels small oscillations between out-ofphase traces for t 1 µs.

Appendix I: Proposed improvements
Fluxonium qubits often have externally-coupled transitions across a wide range of frequencies.For experiments in this work, the only desired coupling is between the (g, e) transition and the target mechanical mode, however we anticipate that few-GHz transitions such as (e, f ), (g, f ) and (g, h) also couple to modes in the upper mechanical spectrum, which may complicate readout and cooling protocols.To reduce these spurious couplings we suggest adding a compact, high-impedance lowpass filter between the qubit and mechanics, ideally with a cutoff frequency near 1 GHz.Such a filter could be realized with additional piezoelectric design [81], kinetic inductance, or the inductance of a Josephson junction array.To minimize additional processing, filters could be patterned within the LN tethers or by including additional junctions in the qubit metallization.
Fabricating superconducting ground planes and waveguides in the second liftoff mask may decrease T 1q and the internal quality factor of the readout mode.In established fabrication procedures [20,54], ground planes and wiring are typically patterned first on freshly acidcleaned substrate, followed by Josephson junctions.In superconducting-only systems T 1q may also be increased using a sapphire substrate with niobium or tantalum films patterned by etching [68], and by shortening the perimeter of qubit metal islands [82].In our system we observe another, stronger limitation on T 1q associated with strong coupling to the target mechanical mode, despite the heterogeneously integrated flip-chip geometry.We measured two additional devices fabricated using the same procedure as this work, but either with reduced qubit-mechanics coupling g eg /2π = 1.3 MHz, or without the mechanics top chip.For both devices, we observed an order-of-magnitude longer qubit lifetime T 1q ∼ 20 − 100 µs.Furthermore, we measured another device composed of a niobium-on-silicon circuit chip and a mechanics top chip, and we observed qubit lifetimes T 1q ∼ 2 − 3 µs similar to this work.While improved circuit fabrication may increase T 1q , future work will also benefit from understanding limitations on T 1q associated with the mechanics chip.
We suggest two modifications to our fabrication of Josephson junctions in this work.First, our patterning of junctions before the ground plane includes baking the junctions at 180 • C to prepare the second resist mask, which we find increases the array inductance by up to 30%.Although we calibrate the average inductance shift, we observe that the inductance distribution widens and drifts with deviations in bake temperature between fabrication runs.Fabrication control will therefore benefit from avoiding high-temperature bakes after junction fabrication, which may be critical for placing the minimum qubit frequency within the phononic band gap.Second, our use of the asymmetric T-junction evaporation [54] for the fluxonium junction array is unusual.The first evaporation at a large angle of 62 • relative to normal incidence results in larger metal islands between array junctions, increasing the parasitic capacitances associated with these islands.This may lower the frequencies of waveguide-like modes in the junction array [61,83,84], which contribute to dephasing of the qubit (g, e) transition through a dispersive shift and may couple resonantly to higher qubit transitions.The array island capacitances may be reduced using a symmetric, smaller-angle evaporation [26,68] that still yields a small single junction.
Reducing slow drift in the qubit frequency represents a critical improvement for future devices.We anticipate that fabricating superconducting crossovers [85,86] as DC shunts across coplanar waveguides may result in improved flux stability, in part by reducing the coupling between loops in the circuit and noisy environmental fields.
High-fidelity quantum operations require cooling the joint qubit-mechanical system.Cooling protocols for superconducting qubits [26,38,87] could be extended to cool the mechanics using sideband coupling or fast swaps.We attempted to cool the qubit transition using steadystate driving on the (e, h) qudit transition, relying on emission from the (g, h) transition into the readout mode to cool the qubit.While for previous devices we observed cooling of the (g, e) transition using this approach, we did not observe cooling for the device in this work.We attribute the absence of cooling to a slower Purcell decay of (g, h) through the readout mode, limited by a large detuning ω hg,0 − ω r0 .For future work we consider a more robust cooling protocol [87], in which population of |e q is transferred to the readout using simultaneous drives on the qudit-readout transitions (e0, f 0) and (f 0, g1).The method requires calibrating AC Stark shifts on these transitions, due to the large drive amplitudes needed to achieve fast cooling.We anticipate that improved frequency stability will facilitate these calibrations (and therefore cooling) in future studies.

FIG. 1 .
FIG. 1. Description of device.(a) Schematic of the energy levels for different systems in this work.A strongly-nonlinear Josephson oscillator (Qudit) is coupled to a sub-GHz mechanical resonator and a few-GHz readout resonator, with respective coupling rates gm,r between the qudit charge operator nq and the (mechanical, readout) charge quadratures.Qudit transitions dominating these respective interactions are labeled.(b) Circuit schematic of the flip-chip device.The qudit is patterned on the bottom chip (blue) and coupled to the mechanical mode on the top chip (maroon) through two vacuum-gap capacitors.The target mechanical mode is represented as a Butterworth-van Dyke equivalent circuit, omitting additional series-LC branches describing parasitic modes of the real device.(c) Optical micrograph of the qudit and control lines, with inset showing coupling pads leading to the mechanical resonator on the top chip.(d) Optical micrograph of the Josephson junction loop providing the qudit nonlinearity.(e) False-color scanning electron micrograph of a representative mechanical resonator.The experimental resonator was not imaged to minimize handling risks discussed in Appendix A. Scale bars for (c, d, e) respectively represent (50, 10, 2) µm.
FIG. 2.Spectroscopy of qubit-mechanics coupling.(a) Qubit spectrum as a function of applied magnetic flux Φe in units of the magnetic flux quantum Φ0.Solid curves denote first-order transition frequencies predicted by diagonalizing the coupled qudit-mechanical Hamiltonian; the procedure for determining model parameters is detailed in Appendix D. Transitions are labeled using the undressed basis states {|qubit ⊗ |mechanics } with greatest overlap to the eigenstates involved in each transition.These labels change when passing through the avoided crossing and are not intended as quantitative descriptions of the eigenstates, as the spectroscopy window covers a region of strong hybridization.(b) Finer spectrum taken along the vertical dashed line in (a), at the approximate center of the avoided crossing.The qubit excitation amplitude is reduced by a factor of 5, and the unlabeled peak seen near 681.5 MHz in (a) no longer appears.We attribute this peak to a second-order transition[11].Vertical dashed lines denote the transition frequencies predicted in (a) and align well to the lower peaks; less well to the upper peaks.

FIG. 3 .
FIG. 3. Characterization of phonon number-splitting.(a)Qubit coherence times as a function of coarsely-stepped flux bias.T2e,q is measured using single-pulse echo experiments[43] to suppress additional frequency components in the Ramsey signal due to thermal occupation of the mechanics[20].The vertical arrow indicates the bias chosen for numbersplitting measurements.(b) Number-splitting spectra for variable coherent drive amplitude.Solid curves show fits to Voigt profiles, and data are rescaled so that the total Fock population is normalized.To compensate for slow frequency drift, spectra are shifted to align centers of the |0 peaks.(c) Spectroscopy of the mechanical mode as the qubit is tuned across the quasi-dispersive regime.For each flux bias, the measured amplitudes are normalized to the mean amplitude over all frequencies, to improve visibility.Upper and lower overlaid curves show calculated transition frequencies continued from Fig. 2, and their splitting becomes approximately 2χm in the dispersive limit.The pentagram indicates the fitted χm from (b) relative to average of the outer curves.(d) Calibration of coherent displacement amplitudes extracted from (b).The mean phonon number is calculated from fitted peak areas; further details are given in Appendix G.The shaded area represents one simultaneous standard error in fit parameters.

FIG. 5 .
FIG. 5. Extended device images.(a) Optical micrograph of experimentally active regions on the bottom chip, including the meandered readout resonator.Defects in the aluminum ground plane are associated with debris particles in the photoresist during patterning.(b) Scanning electron micrograph of a representative single Josephson junction with identical geometry to the experimental device.Slight discoloration of the silicon substrate is typical; polymer residue on the aluminum is not ideal.(c) Optical micrograph of the top chip before flip-chip bonding.Corners are truncated by the microscope field of view.No ground plane is used, however a 50 nm-thick aluminum film is patterned underneath the 900 nm spacers (long horizontal rectangles) such that the base of the spacers is coplanar with the top surface of the coupling capacitor pads as if a ground plane were present.The top chip is designed with rotational symmetry to enable coupling mechanics on either side to the qubit.(d) Scanning electron micrograph of the representative mechanical resonator from Fig.1e, showing the suspended phononic crystal.Scale bars in (a-d) respectively represent (500, 2, 500, 10) µm.(e) Photograph of the experimental device after flip-chip assembly and packaging in a printed circuit board (PCB).Test ports are used to probe a copy of the experimental mechanics.Application of adhesive is intentionally biased toward the test pads to protect the experimental device from unintentional overflow.An example of unintentional overflow can be seen overlapping with the Test 1 bond pad.
FIG. 5. Extended device images.(a) Optical micrograph of experimentally active regions on the bottom chip, including the meandered readout resonator.Defects in the aluminum ground plane are associated with debris particles in the photoresist during patterning.(b) Scanning electron micrograph of a representative single Josephson junction with identical geometry to the experimental device.Slight discoloration of the silicon substrate is typical; polymer residue on the aluminum is not ideal.(c) Optical micrograph of the top chip before flip-chip bonding.Corners are truncated by the microscope field of view.No ground plane is used, however a 50 nm-thick aluminum film is patterned underneath the 900 nm spacers (long horizontal rectangles) such that the base of the spacers is coplanar with the top surface of the coupling capacitor pads as if a ground plane were present.The top chip is designed with rotational symmetry to enable coupling mechanics on either side to the qubit.(d) Scanning electron micrograph of the representative mechanical resonator from Fig.1e, showing the suspended phononic crystal.Scale bars in (a-d) respectively represent (500, 2, 500, 10) µm.(e) Photograph of the experimental device after flip-chip assembly and packaging in a printed circuit board (PCB).Test ports are used to probe a copy of the experimental mechanics.Application of adhesive is intentionally biased toward the test pads to protect the experimental device from unintentional overflow.An example of unintentional overflow can be seen overlapping with the Test 1 bond pad.
Mask 2 patterns aluminum electrodes on the LN by EBL and liftoff, and includes the larger coupling pads shown in the inset of Fig. 1c.Mask 3 patterns aluminum flip-chip alignment marks by photolithography and liftoff.Mask 4 patterns aluminum bandages by EBL and liftoff to ensure galvanic connection of electrodes across the vertical step between silicon and LN.Mask 5 patterns aluminum spacers by EBL and liftoff, with target thickness of 900 nm determining the flip-chip separation distance.Mask 6 performs a masked xenon difluoride dry etch to undercut and suspend the mechanical structures, with mask patterned by EBL.

FIG. 6 .
FIG.6.Experimental setup.The sample is located at the mixing-chamber plate of a dilution refrigerator (Bluefors LD250), packaged in a microwave PCB and copper enclosure, and surrounded by cryogenic magnetic shielding.The AWG provides a 10 MHz reference signal to phase-lock all RF instruments, including the ADC.Circulator passbands are 4 − 8 GHz and isolator passbands are 3 − 12 GHz; both are magnetically shielded with µ-metal."Ecco."denotes coaxial infrared filters made with Eccosorb[57,58], with lowpass cutoffs near 20 GHz.The TWPA pump is combined with the readout signal through the −20 dB port of a directional coupler mounted inside the shielding (RF Lambda RFDC2G8G20, not shown), and the 5.58 GHz lowpass after the HEMT attenuates pump feedthrough to avoid saturating the room-temperature amplifiers with pump power.Notation of this figure follows Refs.[17,59].

FIG. 7 .
FIG. 7. Design considerations.(a) Dimensions of phononic crystal cavity, showing electrostatic potential of the target mode from finite-element simulation.Notation follows Ref. [21].(b) Simulated band structure for the phononic crystal mirror cells, with the two largest band gaps shaded.Dashed lines show important qubit and mechanical frequencies in this work, of which only ωm0 would be targeted at this stage of design.(c) Admittance magnitude across on-defect electrodes as a function of frequency and defect width.To reduce simulation time, the mirror cells are omitted.The dashed line indicates the design width.(d) Target mode simulation of the full resonator, with color indicating the normalized mechanical displacement log 10 |u(r)/umax|.(e) Imaginary admittance magnitude across the electrodes in (d), fit to a single-mode model.The dashed box surrounds a small blip in the admittance associated with a non-design mode near the frequency of the "parasitic mode" suggested in section II A of the main text.(f) Simulated mechanical displacement and electrostatic potential for the parasitic mode indicated in (e).(g)Simulated electrostatic potential of wires on partially-released LN tethers.(h) Nearly-full circuit model used for design.An electrostatic capacitance matrix is simulated for all nodes except 7 (including capacitances to ground, not shown), then the mechanical admittance model Ym(ω) is inserted using the fit in (e).Shaded boxes identify the three dynamical coordinates in the model.(i) Equivalent circuit (ignoring drives) obtained by reducing the circuit in (h), to be quantized as eqn.C1.
. The representative mechanical resonator shown in Figs.1e and 5d was de-signed with slightly larger dimensions L x = 7.8 µm and L e = 1.25 µm.

FIG. 8 .
FIG.8.Flux-tuning spectrum.(a) VNA spectrum of readout mode.The voltage tuning period is estimated from the periodicity of the largest avoided crossing, likely involving the qudit (g, f ) transition.(b) Flux-tuning data used to fit qudit energies.Dashed curves are fits overlaid on experimental spectra.The spectroscopy signal disappears abruptly above 1 GHz.High-signal vertical bands represent an occasional bug in the measurement chain.(c) Transition frequencies and (d) Charge transition elements for the bare qudit, predicted using fitted energies.The (e, f ) transition is included because |e q has substantial thermal population and because the transition contributes non-negligibly to the qubit-mechanics dispersive shift in eqn.D4.(e) Comparing predicted dispersive shifts using numerical diagonalization (fine-dashed curve) and perturbation theory with varying level truncations.Solid curves represent truncations including |f q and higher, and agree better with diagonalization compared with truncation at |e q (coarse-dashed curve).Inset shows the main regime utilized in this work, with (solid, dashed) vertical lines respectively indicating the qubit biases labeled (∆ coherent , ∆swap) in the main text.

FIG. 9 .
FIG. 9. Qubit frequency drift.(a) Raw number-splitting data contributing to Fig. 3b for drive amplitude = 225 mV on the mechanical mode.Magnitude of qubit response is shown, with each horizontal slice representing one measurement of the full spectrum averaged over a 70-s interval.100 spectra were obtained over nearly 2 hours, with large drift during the first half hour.The anomalous high-amplitude spectra visible near the center and top of the plot likely represent a bug in the measurement chain, and were observed at rates of 3 to 7 per 100 spectra (b) Truncated and binned data after eliminating 7 anomalous traces from (a), averaging neighboring spectra in bins of size 2, and eliminating the remainder.(c) Alignment of binned traces obtained by detecting the highest-amplitude peak, fitting it to a Gaussian profile, and shifting the spectrum by an integer number of the original frequency step.(d) Number-splitting spectra obtained from averaging together the respective spectra in (a, all data), (b, truncated), and (c, post-processed)

FIG. 10 .
FIG.10.Example echo and Ramsey measurements.(a) Exponential stretching factors for the T2e,q fits shown in Fig.3aof the main text."Final" fits were constrained to n ≥ 1; at half-flux the preliminary unconstrained fit yielded n ≈ 0.64.(b) Example fits to echo data with different values of n.In the upper plot, data and fits at half flux suggest multiexponential decay.(c) Example Ramsey data and fit, including fast Fourier transform (FFT) of each.Weakly-resolved additional peaks suggest thermal phonon populations Pm(n = 1, 2), motivating our use of echo measurements to suppress dispersive frequency components when the phonon distribution is not of direct interest.

FIG. 11 .
FIG. 11.Example number-splitting data.(a) Two example number-splitting traces replotted from Fig.3, representing relatively low and high drive amplitudes.(b) Fock probabilities estimated from data in (a) by fitting relative peak areas to displaced thermal probabilities.Data are normalized such that the fitted distribution would sum to 1 over all n.(c) Results of fits similar to (b) for all drive amplitudes.For the two lowest nonzero amplitudes, the fit did not distinguish an accurate α, resulting in very large error bars.

FIG. 12 .
FIG. 12. Sideband Rabi swap details.(a) Simulated "calibration" for a sideband Rabi swap pulse, modeling experimental data shown in Fig. 4d.We plot the magnitude of the change in qubit population asymmetry after the pulse.(b) Simulated qubit and phonon probabilities (unconditioned) as a function of modulation amplitude for fixed frequency f mod = 155.6MHz used in experiments."Signal" represents one half of the value plotted in (a), and for an ideal swap would rise to a maximum of 1.The vertical dashed line denotes the modulation amplitude chosen for mechanical coherence measurements.(c) FFT of Ramsey data with measurement immediately after the sequence (qubit X θ , Rabi swap).Lines connecting data points represent only guides to the eye.Peaks representing |0, 1 m are visible; we attribute misalignment in frequency between traces to slow flux drift.The signal near DC is spurious, as the mean of each time-domain trace is subtracted before taking the FFT.(d) Pulse sequence for measuring mechanics lifetime T1m, indicating the modulation phase θ swept in the 4-phase average.(e) T1m data truncated to short delays < 2 µs showing few-mV variations in amplitude across the 4 phases.The mean corresponds to the decay curve fit in Fig. 4f of the main text.

TABLE II .
Design and test-device parameters for mechanical mode.Design values for simulation of the target mode shown in Fig. 19].

TABLE III .
Experimental device parameters.Uncertainties represent one standard error.For qubitlike frequencies ωeg, the uncertainty describes a typical scale of slow drift in the center of spectroscopy peaks observed over many experiments, with example shown in Fig.9.Readout mode parameters were obtained from data shown in Fig.8a.