Universal Quantum Transducers based on Surface Acoustic Waves

We propose a universal, on-chip quantum transducer based on surface acoustic waves in piezo-active materials. Because of the intrinsic piezoelectric (and/or magnetostrictive) properties of the material, our approach provides a universal platform capable of coherently linking a broad array of qubits, including quantum dots, trapped ions, nitrogen-vacancy centers or superconducting qubits. The quantized modes of surface acoustic waves lie in the gigahertz range, can be strongly confined close to the surface in phononic cavities and guided in acoustic waveguides. We show that this type of surface acoustic excitations can be utilized efficiently as a quantum bus, serving as an on-chip, mechanical cavity-QED equivalent of microwave photons and enabling long-range coupling of a wide range of qubits.


I. INTRODUCTION
The realization of long-range interactions between remote qubits is arguably one of the greatest challenges towards developing a scalable, solid-state spin-based quantum information processor [1]. One approach to address this problem is to interface qubits with a common quantum bus that distributes quantum information between distant qubits. The transduction of quantum information between stationary and moving qubits is central to this approach. A particularly efficient implementation of such a quantum bus can be found in the field of circuit QED, where spatially separated superconducting qubits interact via microwave photons confined in transmission line cavities [2][3][4]. In this way, multiple qubits have been coupled successfully over relatively large distances of the order of millimeters [5,6]. Fueled by dramatic advances in the fabrication and manipulation of nanomechanical systems [7], an alternate line of research has pursued the idea of coherent, long-range interactions between individual qubits mediated by mechanical resonators, with resonant phonons playing the role of cavity photons [8][9][10][11][12][13].
In this paper, we propose a new realization of a quantum transducer and data bus based on surface acoustic waves (SAWs). SAWs involve phononlike excitations bound to the surface of a solid and are widely used in modern electronic devices, e.g., as compact microwave filters [14,15]. Inspired by two recent experiments [16,17], where the coherent quantum nature of SAWs has been explored, here we propose and analyze SAW phonon modes in piezoactive materials as a universal mediator for long-range couplings between remote qubits. Our approach involves qubits interacting with a localized SAW phonon mode, defined by a high-Q resonator, which in turn can be coupled weakly to a SAW waveguide (WG) serving as a quantum bus; as demonstrated below, the qubits can be encoded in a great variety of spin or charge degrees of freedom. We show that the Hamiltonian for an individual node (for a schematic representation see Fig. 1) can take on the generic Jaynes-Cummings form ($\hbar = 1$),

$$H_{\mathrm{node}} = \frac{\omega_q}{2}\sigma^z + \omega_c a^\dagger a + g\left(\sigma^+ a + \sigma^- a^\dagger\right), \tag{1}$$

where $\sigma$ refers to the usual Pauli matrices describing the qubit with transition frequency $\omega_q$ and $a$ is the bosonic operator for the localized SAW cavity mode of frequency $\omega_c/2\pi \sim$ GHz [18]. The coupling $g$ between the qubit and the acoustic cavity mode is mediated intrinsically by the piezoproperties of the host material. It is proportional to the electric or magnetic zero-point fluctuations associated with a single SAW phonon and, close to the surface, can reach values of $g \sim 400$ MHz, much larger than the relevant decoherence processes and sufficiently large to allow for quantum effects and coherent coupling in the spin-cavity system, as evidenced by cooperativities [19] of $C \sim 10$-$100$ (see Sec. IV and Table I for definition and applicable values).
For $\omega_q \approx \omega_c$, $H_{\mathrm{node}}$ allows for a controlled mapping of the qubit state onto a coherent phonon superposition, which can then be mapped to an itinerant SAW phonon in a waveguide, opening up the possibility to implement on-chip many quantum communication protocols well known in the context of optical quantum networks [13,20]. The most pertinent features of our proposal can be summarized as follows. (1) Our scheme is not specific to any particular qubit realization but, thanks to the plethora of physical properties associated with SAWs in piezoactive materials (strain and electric and magnetic fields), provides a common on-chip platform accessible to various different implementations of qubits, comprising both natural (e.g., ions) and artificial candidates, such as quantum dots (QDs) or superconducting qubits. In particular, this opens up the possibility to interconnect dissimilar systems in new electroacoustic quantum devices.
(2) Typical SAW frequencies lie in the gigahertz range, closely matching transition frequencies of artificial atoms and enabling ground-state cooling by conventional cryogenic techniques. (3) Our scheme is built upon an established technology [14,15]: Lithographic fabrication techniques provide almost arbitrary geometries with high precision, as evidenced by a large range of SAW devices such as delay lines, bandpass filters, resonators, etc. In particular, the essential building blocks needed to interface qubits with SAW phonons have already been fabricated, according to design principles familiar from electromagnetic devices: (i) SAW resonators, the mechanical equivalents of Fabry-Perot cavities, with low-temperature measurements reaching quality factors of $Q \sim 10^5$ even at gigahertz frequencies [22][23][24], and (ii) acoustic waveguides as analogs of optical fibers [14]. (4) For a given frequency in the gigahertz range, due to the slow speed of sound of $\sim 10^3$ m/s for typical materials, device dimensions are in the micrometer range, which is convenient for fabrication and integration with semiconductor components, and about $10^5$ times smaller than corresponding electromagnetic resonators. (5) Since SAWs propagate elastically on the surface of a solid within a depth of approximately one wavelength, the mode volume is intrinsically confined in the direction normal to the surface. Further surface confinement then yields large zero-point fluctuations. (6) Yet another inherent advantage of our system is the intrinsic nature of the coupling. In piezoelectric materials, the SAW is accompanied by an electrical potential $\phi$, which has the same spatial and temporal periodicities as the mechanical displacement and provides an intrinsic qubit-phonon coupling mechanism. For example, recently, qubit lifetimes in GaAs singlet-triplet qubits were found to be limited by the piezoelectric electron-phonon coupling [25].
Here, our scheme provides a new paradigm, where coupling to phonons becomes a highly valuable asset for coherent quantum control rather than a liability.
In what follows, we first review the most important features of surface acoustic waves, with a focus on the associated zero-point fluctuations. Next, we discuss the different components making up the SAW-based quantum transducer and the acoustic quantum network it enables: SAW cavities, including a detailed analysis of the achievable quality factor Q, SAW waveguides, and a variety of different candidate systems serving as qubits. Lastly, as an exemplary application, we show how to transfer quantum states between distant nodes of the network under realistic conditions. Finally, we draw conclusions and give an outlook on future directions of research.

II. SAW PROPERTIES
Elastic waves in piezoelectric solids are described by

$$\rho\,\ddot{u}_i - c_{ijkl}\,\partial_j \partial_l u_k = e_{kij}\,\partial_j \partial_k \phi, \tag{2}$$

$$\epsilon_{ij}\,\partial_i \partial_j \phi = e_{ijk}\,\partial_i \partial_k u_j, \tag{3}$$

where the vector $\mathbf{u}(\mathbf{x}, t)$ denotes the displacement field ($\mathbf{x}$ is the Cartesian coordinate vector), $\rho$ is the mass density, and repeated indices are summed over ($i, j = x, y, z$); $c$, $\epsilon$, and $e$ refer to the elasticity, permittivity, and piezoelectric tensors, respectively [26]: they are largely defined by crystal symmetry [27]. For example, for cubic crystals such as GaAs, there is only one nonzero component for the permittivity and the piezoelectric tensor, labeled as $\epsilon$ and $e_{14}$, respectively [26]. Since elastic disturbances propagate much slower than the speed of light, it is common practice to apply the so-called quasistatic approximation [27], where the electric field is given by $E_i = -\partial_i \phi$. When considering surface waves, Eqs. (2) and (3) must be supplemented by the mechanical boundary condition that there should be no forces on the free surface (taken to be at $z = 0$ with $\hat{z}$ being the outward normal to the surface), that is, $T_{zx} = T_{zy} = T_{zz} = 0$ at $z = 0$ (where $T_{ij} = c_{ijkl}\,\partial_l u_k + e_{kij}\,\partial_k \phi$ is the stress tensor), and appropriate electrical boundary conditions [26].

FIG. 1. SAW as a universal quantum transducer. Distributed Bragg reflectors made of grooves form an acoustic cavity for surface acoustic waves. The resonant frequency of the cavity is determined by the pitch $p$, $f_c = v_s/2p$. Reflection occurs effectively at some distance inside the grating; the fictitious mirrors above the surface are not part of the actual experimental setup, but are shown for illustrative purposes only. Red arrows indicate the relevant decay channels for the cavity mode: leakage through the mirrors, internal losses due to, for example, surface imperfections, and conversion into bulk modes. Qubits inside and outside of the solid can be coupled to the cavity mode. In more complex structures, the elastic medium can consist of multiple layers on top of some substrate.
If not stated otherwise, the term SAW refers to the prototypical (piezoelectric) Rayleigh wave solution as theoretically and experimentally studied, for example, in Refs. [16,17,26,28] and used extensively in different electronic devices [14,15]. It is nondispersive, decays exponentially into the medium with a characteristic penetration depth of about one wavelength, and has a phase velocity $v_s = \omega/k$ that is lower than the bulk velocities in that medium, because the solid behaves less rigidly in the absence of material above the surface [27]. As a result, it cannot phase match to any bulk wave [14,29]. As usual, we consider specific orientations for which the piezoelectric field produced by the SAW is strongest [14,29], for example, a SAW with a wave vector along the [110] direction of a (001) GaAs crystal (cf. Refs. [16,26] and Appendixes A and B).

A. SAWs in quantum regime
In a semiclassical picture, an acoustic phonon associated with a SAW creates a time-dependent strain field, $s_{kl} = (\partial_l u_k + \partial_k u_l)/2$, and a (quasistatic) electrical potential $\phi(\mathbf{x}, t)$. Upon quantization, the mechanical displacement becomes an operator that can be expressed in terms of the elementary normal modes as $\hat{\mathbf{u}}(\mathbf{x}) = \sum_n [\mathbf{v}_n(\mathbf{x})\, a_n + \mathrm{H.c.}]$, where $a_n$ ($a_n^\dagger$) are bosonic annihilation (creation) operators for the vibrational eigenmode $n$ and the set of normal modes $\mathbf{v}_n(\mathbf{x})$ derives from the Helmholtz-like equation $\mathcal{W}\mathbf{v}_n(\mathbf{x}) = -\rho\,\omega_n^2\, \mathbf{v}_n(\mathbf{x})$ associated with Eqs. (2) and (3). The mode normalization is given by $\int d^3x\, \rho\, \mathbf{v}_n^*(\mathbf{x}) \cdot \mathbf{v}_n(\mathbf{x}) = \hbar/2\omega_n$ [25,30]. An important figure of merit in this context is the amplitude of the mechanical zero-point motion $U_0$. Along the lines of cavity QED [2], a simple estimate for $U_0$ can be obtained by equating the classical energy of a SAW, $\sim \int d^3x\, \rho\, \dot{\mathbf{u}}^2$, with the quantum energy of a single phonon, that is, $\hbar\omega$. This leads directly to

$$U_0 \approx \sqrt{\frac{\hbar}{2\rho v_s A}}, \tag{4}$$

where we used the dispersion relation $\omega = v_s k$ and the intrinsic mode confinement $V \approx A\lambda$ characteristic for SAWs. The quantity $U_0$ refers to the mechanical amplitude associated with a single SAW phonon close to the surface. It depends only on the material parameters $\rho$ and $v_s$ and follows a generic $\sim A^{-1/2}$ behavior, where $A$ is the effective mode area on the surface. The estimate given in Eq. (4) agrees very well with more detailed calculations presented in Appendix C. Several other important quantities that are central for signal transduction between qubits and SAWs follow directly from $U_0$: The (dimensionless) zero-point strain can be estimated as $s_0 \approx k U_0$. The intrinsic piezoelectric potential associated with a single phonon derives from Eq. (3) as $\phi_0 \approx (e_{14}/\epsilon)\, U_0$ [31]. Lastly, the electric field amplitude due to a single acoustic phonon is $\xi_0 \approx k\phi_0 = (e_{14}/\epsilon)\, k U_0$, illustrating the linear relation between electric field and strain characteristic for piezoelectric materials [8].
In summary, we typically find $U_0 \approx 2\,\mathrm{fm}/\sqrt{A\,[\mu\mathrm{m}^2]}$, yielding $U_0 \approx 2$ fm for micron-scale confinement (cf. Appendix D). This is comparable to typical zero-point fluctuation amplitudes of localized mechanical oscillators [32]. Moreover, for micron-scale surface confinement and GaAs material parameters, we obtain $\xi_0 \approx 20$ V/m, which compares favorably with typical values of $\sim 10^{-3}$ and $\sim 0.2$ V/m encountered in cavity and circuit QED, respectively [2].
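As a rough numerical cross-check of these single-phonon estimates, the following Python sketch evaluates the zero-point displacement, strain, potential, and field, assuming the estimate takes the form $U_0 \approx \sqrt{\hbar/(2\rho v_s A)}$. The GaAs parameter values (density, SAW velocity, $e_{14}$, relative permittivity) are assumed order-of-magnitude numbers that are not quoted in the text, and the function name `zero_point_estimates` is hypothetical.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)


def zero_point_estimates(rho, v_s, e14, eps, area, wavelength):
    """Order-of-magnitude single-phonon amplitudes for a SAW mode.

    rho: mass density (kg/m^3), v_s: SAW velocity (m/s),
    e14: piezoelectric coefficient (C/m^2), eps: permittivity (F/m),
    area: effective surface mode area (m^2), wavelength: SAW wavelength (m).
    """
    k = 2.0 * math.pi / wavelength
    u0 = math.sqrt(HBAR / (2.0 * rho * v_s * area))  # displacement (m)
    s0 = k * u0                                      # dimensionless strain
    phi0 = (e14 / eps) * u0                          # potential (V)
    xi0 = k * phi0                                   # electric field (V/m)
    return u0, s0, phi0, xi0


# Assumed, typical GaAs numbers (not taken from the text).
u0, s0, phi0, xi0 = zero_point_estimates(
    rho=5.3e3, v_s=2.9e3, e14=0.16, eps=12.5 * EPS0,
    area=1e-12,        # 1 um^2 mode area
    wavelength=1e-6,   # 1 um SAW wavelength
)
print(f"U0 = {u0 * 1e15:.2f} fm, s0 = {s0:.2e}, "
      f"phi0 = {phi0 * 1e6:.2f} uV, xi0 = {xi0:.1f} V/m")
```

With these assumed inputs the sketch returns a displacement of roughly 2 fm and a field of order 20 V/m for a 1 μm² mode area, in line with the values quoted above.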
For the sake of clarity, we have focused on piezoelectric materials so far. However, there are also piezomagnetic materials that exhibit a large magnetostrictive effect. In that case, elastic distortions are coupled to a (quasistatic) magnetic instead of an electric field [33,34]; for details, see Appendix D. For typical materials, such as Terfenol-D, the magnetic field associated with a single phonon can be estimated as $B_0 \approx (2\text{-}6)\,\mu\mathrm{T}/\sqrt{A\,[\mu\mathrm{m}^2]}$. Finally, we note that composite structures comprising both piezoelectric and piezomagnetic materials can support magnetoelectric surface acoustic waves [35,36].

A. SAW cavities
To boost single-phonon effects, it is essential to increase $U_0$. In analogy to cavity QED, this can be achieved by confining the SAW mode in an acoustic resonator. The physics of SAW cavities has been theoretically studied and experimentally verified since the early 1970s [14,37]. Here, we provide an analysis of a SAW cavity based on an on-chip distributed Bragg reflector in view of potential applications in quantum information science; for details, see Appendix E. SAW resonators of this type can usually be designed to host a single resonance $f_c = \omega_c/2\pi = v_s/\lambda_c$ ($\lambda_c = 2p$) only and can be viewed as an acoustic Fabry-Perot resonator with effective reflection centers, sketched by localized mirrors in Fig. 1, situated at some effective penetration distance into the grating [14]. Therefore, the total effective cavity size along the mirror axis is $L_c > D$, where $D$ is the physical gap between the gratings. The total cavity linewidth $\kappa = \omega_c/Q = \kappa_{\mathrm{gd}} + \kappa_{\mathrm{bd}}$ can be decomposed into desired (leakage through the mirrors) and undesired (conversion into bulk modes and internal losses due to surface imperfections, etc.) losses, labeled as $\kappa_{\mathrm{gd}}$ and $\kappa_{\mathrm{bd}}$, respectively; for a schematic illustration, compare Fig. 1. For the total quality factor $Q$, we can typically identify three distinct regimes (cf. Fig. 2): For very small groove depths $h/\lambda_c \lesssim 2\%$, losses are dominated by coupling to SAW modes outside of the cavity, dubbed the $Q_r$ regime ($\kappa_{\mathrm{gd}} \gg \kappa_{\mathrm{bd}}$), whereas for very deep grooves, losses due to conversion into bulk modes become excessive ($Q_b$ regime, $\kappa_{\mathrm{gd}} \ll \kappa_{\mathrm{bd}}$). In between, for a sufficiently high number of grooves $N$, the quality factor $Q$ can ultimately be limited by internal losses (surface cracks, defects in the material, etc.), referred to as the $Q_m$ regime ($\kappa_{\mathrm{gd}} \ll \kappa_{\mathrm{bd}}$). For $N \approx 300$, we find that the onset of the bulk-wave limit occurs for $h/\lambda_c \gtrsim 2.5\%$, in excellent agreement with experimental findings [37,38].
With regard to applications in quantum information schemes, the $Q_r$ regime plays a special role in that resonator phonons leaking out through the acoustic mirrors can be processed further by guiding them in acoustic SAW waveguides (see below). To capture this behavior quantitatively, we analyze $\kappa_{\mathrm{gd}}/\kappa_{\mathrm{bd}}$ (cf. Fig. 2): for $\kappa_{\mathrm{gd}}/\kappa_{\mathrm{bd}} \gg 1$, leakage through the mirrors is the strongest decay mechanism for the cavity phonon, whereas the undesired decay channels are suppressed. Our analysis shows that, for gigahertz frequencies $f_c \approx 3$ GHz, $N \approx 100$, and $h/\lambda_c \approx 2\%$, a quality factor of $Q \approx 10^3$ is achievable, together with an effective cavity confinement $L_c \approx 40\lambda_c$ (for $D \lesssim 5\lambda_c$) and $\kappa_{\mathrm{gd}}/\kappa_{\mathrm{bd}} \gtrsim 20$ (illustrated by the circle in Fig. 2); accordingly, the probability for a cavity phonon to leak through the mirrors (rather than into the bulk, for example) is $\kappa_{\mathrm{gd}}/(\kappa_{\mathrm{gd}} + \kappa_{\mathrm{bd}}) \gtrsim 95\%$. Note that the resulting total cavity linewidth of $\kappa/2\pi = f_c/Q \approx (1\text{-}3)$ MHz is similar to the ones typically encountered in circuit QED [6]. To compare this to the effective cavity-qubit coupling, we need to fix the effective mode area of the SAW cavity. In addition to the longitudinal confinement by the Bragg mirror (as discussed above), a transverse confinement length $L_{\mathrm{trans}}$ (in direction $\hat{y}$) can be provided, e.g., using waveguiding, etching, or (similar to cavity QED) focusing techniques [14,39,40]. For transverse confinement $L_{\mathrm{trans}} \approx (1\text{-}5)\,\mu$m and a typical resonant cavity wavelength $\lambda_c \approx 1\,\mu$m, the effective mode area is then $A = L_{\mathrm{trans}} L_c \approx (40\text{-}200)\,\mu\mathrm{m}^2$. In the desired regime $\kappa_{\mathrm{gd}}/\kappa_{\mathrm{bd}} \gg 1$, this is largely limited by the deliberately low reflectivity of a single groove; accordingly, the cavity mode leaks strongly into the grating such that $L_c \gg D$ (cf. Appendix E for details).
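The cavity figures quoted above follow from simple arithmetic, which the short Python sketch below reproduces: the linewidth $\kappa/2\pi = f_c/Q$, the probability $\kappa_{\mathrm{gd}}/(\kappa_{\mathrm{gd}} + \kappa_{\mathrm{bd}})$ that a phonon leaves through the mirrors, and the mode area $L_{\mathrm{trans}} L_c$ with $L_c \approx 40\lambda_c$. The function and argument names are hypothetical and chosen only for illustration.

```python
def cavity_figures(f_c_hz, quality_factor, decay_ratio,
                   l_trans_m, wavelength_m, cavity_length_in_wavelengths=40.0):
    """Return (kappa/2pi in Hz, mirror-leakage probability, mode area in m^2).

    decay_ratio is kappa_gd / kappa_bd, the ratio of desired to undesired loss.
    """
    linewidth_hz = f_c_hz / quality_factor
    leak_probability = decay_ratio / (1.0 + decay_ratio)
    mode_area_m2 = l_trans_m * cavity_length_in_wavelengths * wavelength_m
    return linewidth_hz, leak_probability, mode_area_m2


# Parameters quoted in the text: f_c = 3 GHz, Q = 1e3, kappa_gd/kappa_bd = 20.
for l_trans in (1e-6, 5e-6):
    lw, p_leak, area = cavity_figures(3e9, 1e3, 20.0, l_trans, 1e-6)
    print(f"L_trans = {l_trans * 1e6:.0f} um: kappa/2pi = {lw / 1e6:.1f} MHz, "
          f"P_leak = {p_leak:.3f}, A = {area * 1e12:.0f} um^2")
```

For a decay-rate ratio of 20 the leakage probability is 20/21, which is the origin of the 95% figure.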
While up to now we have focused on this standard Bragg design (due to its experimentally validated frequency selectivity and quality factors), let us briefly mention potential approaches to reduce $A$ and thus increase single-phonon effects even further. (i) The most straightforward strategy (that is still compatible with the Bragg mirror design) is to reduce $\lambda_c$ as much as possible, down to the maximum frequency $f_c = v_s/\lambda_c$ that can still be made resonant with the (typically highly tunable) qubit's transition frequency $\omega_q/2\pi$; note that fundamental Rayleigh modes with $f_c \approx 6$ GHz have been demonstrated experimentally [41]. (ii) In order to increase the reflectivity of a single groove, one could use deeper grooves. To circumvent the resulting increased losses into the bulk [cf. Fig. 2(b)], freestanding structures (where the effect of bulk phonon modes is reduced) could be employed. (iii) Lastly, alternative cavity designs such as so-called trapped energy resonators make it possible to strongly confine acoustic resonances in the center of plate resonators [42].

B. SAW waveguides
Not only can SAWs be confined in cavities, but they can also be guided in acoustic waveguides [14,43]. Two dominant types of design are (i) topographic WGs, such as ridge-type WGs, where the substrate is locally deformed using etching techniques, and (ii) overlay WGs (such as strip- or slot-type WGs), where one or two strips of one material are deposited on the substrate of another to form core and clad regions with different acoustic velocities. If the SAW velocity is slower (higher) in the film than in the substrate, the film acts as a core (cladding) for the guide, whereas the unmodified substrate corresponds to the cladding (core). An attenuation coefficient of $\sim 0.6$ dB/mm has been reported for a 10 μm-wide slot-type WG, defined by Al cladding layers on a GaAs substrate [39,40]. This shows that SAWs can propagate basically dissipation free over chip-scale distances exceeding several millimeters. Typically, one-dimensional WG designs have been investigated, but, to expand the design flexibility, one could use multiple acoustic lenses in order to guide SAWs around a bend [29].

FIG. 2. (a) Quality factor $Q$ for $N = 100$ (dashed blue curve) and $N = 300$ (red solid curve) grooves as a function of the normalized groove depth $h/\lambda_c$. For shallow grooves, $Q$ is limited by leakage losses due to imperfect acoustic mirrors ($Q_r$ regime, gray area), whereas for deep grooves conversion to bulk modes dominates ($Q_b$ regime); compare asymptotics (dash-dotted lines). (b) Ratio of desired to undesired decay rates $\kappa_{\mathrm{gd}}/\kappa_{\mathrm{bd}}$. The more strongly $Q$ is dominated by $Q_r$, the higher $\kappa_{\mathrm{gd}}/\kappa_{\mathrm{bd}}$. Here, $w/p = 0.5$, $D = 5.25\lambda_c$, and $f_c = 3$ GHz; typical material parameters for LiNbO$_3$ are used (cf. Appendix E).
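To put the attenuation figure quoted above for the slot-type waveguide into perspective, the following Python sketch converts a loss in dB/mm into the fraction of acoustic power transmitted over a given length, using the standard decibel definition. The function name is hypothetical, and the propagation lengths are illustrative choices rather than values from the text.

```python
def transmitted_fraction(attenuation_db_per_mm, length_mm):
    """Fraction of SAW power remaining after propagating length_mm."""
    return 10.0 ** (-attenuation_db_per_mm * length_mm / 10.0)


# Reported attenuation of ~0.6 dB/mm; lengths chosen for illustration only.
for length_mm in (0.1, 1.0, 3.0):
    frac = transmitted_fraction(0.6, length_mm)
    print(f"{length_mm:.1f} mm: {100.0 * frac:.1f}% of the power transmitted")
```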

A. Versatility
To complete the analogy with cavity QED, a nonlinear element similar to an atom needs to be introduced. Here, we highlight three different exemplary systems, illustrating the versatility of our SAW-based platform. We focus on quantum dots, trapped ions, and nitrogen-vacancy (NV) centers, but similar considerations naturally apply to other promising quantum information candidates such as superconducting qubits [7,8,17,44], Rydberg atoms [45], or electron spins bound to a phosphorus donor atom in silicon [41]. In all cases considered, a single cavity mode a, with frequency ω c close to the relevant transition frequency, is retained. We provide estimates for the single-phonon coupling strength and cooperativity (cf. Table I), while more detailed analyses go beyond the scope of this work and are subject to future research.

QD charge qubit
A natural candidate for our scheme is a charge qubit embedded in a lithographically defined GaAs double quantum dot (DQD) containing a single electron. The DQD can be well described by an effective two-level system, characterized by an energy offset $\epsilon$ and interdot tunneling $t_c$, yielding a level splitting $\Omega = \sqrt{\epsilon^2 + 4t_c^2}$. The electron's charge $e$ couples to the piezoelectric potential; the deformation coupling is much smaller than the piezoelectric coupling and can therefore safely be neglected [47]. Since the quantum dot is small compared to the SAW wavelength, we neglect potential effects coming from the structure making up the dots (heterostructure and metallic gates); for a detailed discussion, see Appendix I. Performing a standard rotating-wave approximation (RWA) (valid for $\delta, g_{\mathrm{ch}} \ll \omega_c$), we find that the system can be described by a Hamiltonian of Jaynes-Cummings form,

$$H_{\mathrm{dot}} = \delta S^z + g_{\mathrm{ch}}\frac{2t_c}{\Omega}\left(S^+ a + S^- a^\dagger\right), \tag{5}$$

where $\delta = \Omega - \omega_c$ specifies the detuning between the qubit and the cavity mode, and $S^\pm = |\pm\rangle\langle\mp|$ (and so on) refer to pseudospin operators associated with the eigenstates $|\pm\rangle$ of the DQD Hamiltonian (cf. Appendix F). The Hamiltonian $H_{\mathrm{dot}}$ describes the coherent exchange of excitations between the qubit and the acoustic cavity mode. The strength of this interaction, $g_{\mathrm{ch}} = e\phi_0 F(kd) \sin(kl/2)$, is proportional to the charge $e$ and the piezoelectric potential associated with a single phonon, $\phi_0$. The decay of the SAW mode into the bulk is captured by the function $F(kd)$ ($d$ is the distance between the DQD and the surface; see Appendix B for details), while the factor $\sin(kl/2)$ reflects the assumed mode function along the axis connecting the two dots, separated by a distance $l$. For (typical) values of $l \approx \lambda_c/2 = 250$ nm and $d \approx 50$ nm $\ll \lambda_c$, the geometrical factor $F(kd) \sin(kl/2)$ then leads to a reduction in coupling strength compared to the bare value $e\phi_0$ (at the surface) by a factor of $\sim 2$ only.
In total, we then obtain $g_{\mathrm{ch}} \approx 2\,\mathrm{GHz}/\sqrt{A\,[\mu\mathrm{m}^2]}$. For lateral confinement $L_{\mathrm{trans}} \approx (1\text{-}5)\,\mu$m, the effective mode area is $A = L_{\mathrm{trans}} L_c \approx (20\text{-}100)\,\mu\mathrm{m}^2$. The resulting charge-resonator coupling strength $g_{\mathrm{ch}} \approx (200\text{-}450)$ MHz compares well with values obtained using superconducting qubits coupled to localized nanomechanical resonators made of piezoelectric material, where $g \approx (0.4\text{-}1.2)$ GHz [7,8], or superconducting resonators coupled to Cooper pair box qubits ($g/2\pi \approx 6$ MHz) [3], transmon qubits ($g/2\pi \approx 100$ MHz) [48], and indium arsenide DQD qubits ($g/2\pi \approx 30$ MHz) [49]. Note that, in principle, the coupling strength $g_{\mathrm{ch}}$ could be further enhanced by additionally depositing a strongly piezoelectric material such as LiNbO$_3$ on the GaAs substrate [16]. Moreover, with a LiNbO$_3$ film on top of the surface, nonpiezoelectric materials such as Si or Ge could also be used to host the quantum dots [50]. The level splitting $\Omega(t)$ and interdot tunneling $t_c(t)$ can be tuned in situ via external gate voltages. By controlling $\delta$ one can rapidly turn on and off the interaction between the qubit and the cavity: For an effective interaction time $\tau = \pi/(2g_{\mathrm{eff}})$ ($g_{\mathrm{eff}} = 2g_{\mathrm{ch}} t_c/\Omega$) on resonance ($\delta = 0$), an arbitrary state of the qubit is swapped to the absence or presence of a cavity phonon; i.e., $(\alpha|-\rangle + \beta|+\rangle)|0\rangle \rightarrow |-\rangle(\alpha|0\rangle - i\beta|1\rangle)$, where $|n\rangle$ labels the Fock states of the cavity mode. Apart from this SWAP operation, further quantum control techniques known from cavity QED may be accessible [51]. Note that below we generalize our results to spin qubits embedded in DQDs.

TABLE I. Estimates for single-phonon coupling strength $g$ and cooperativity $C$. We use $A = (1\text{-}5)\,\mu\mathrm{m} \times 40\lambda_c$, $T = 20$ mK [17], (conservative) quality factors of $Q = (1, 1, 3, 1) \times 10^3$, and frequencies of $\omega_c = 2\pi \times (6, 1.5, 2 \times 10^{-3}, 3)$ GHz for the four systems listed. For the spin qubit, $T_2^\star = 2\,\mu$s [21], and for the trapped ion scenario, $g_{\mathrm{ion}}$ ($C_{\mathrm{ion}}$) is given for $d = 150\,\mu$m due to the prolonged dephasing time farther away from the surface ($C_{\mathrm{ion}}$ improves with increasing $d$, even though $g_{\mathrm{ion}}$ decreases, up to a point where other dephasing mechanisms start to dominate). Further details are given in the text.
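The SWAP operation described above can be checked numerically. The Python sketch below integrates the Schrödinger equation (with $\hbar = 1$) in the single-excitation subspace spanned by $|+\rangle|0\rangle$ and $|-\rangle|1\rangle$, where on resonance the Jaynes-Cummings coupling reduces to $g_{\mathrm{eff}}\sigma_x$, and confirms that after $\tau = \pi/(2g_{\mathrm{eff}})$ the amplitude has moved to $|-\rangle|1\rangle$ with a phase factor $-i$. The fixed-step fourth-order Runge-Kutta integrator and the value $g_{\mathrm{eff}} = 3 \times 10^8\,\mathrm{s}^{-1}$ are illustrative assumptions; decoherence is ignored.

```python
import math


def swap_evolution(g_eff, steps=2000):
    """Evolve (c_plus0, c_minus1) for a time tau = pi / (2 g_eff) on resonance.

    Solves dc/dt = -i H c with H = g_eff * sigma_x using classical RK4.
    Returns (tau, (c_plus0, c_minus1)).
    """
    tau = math.pi / (2.0 * g_eff)
    dt = tau / steps

    def deriv(c):
        # H couples |+,0> and |-,1> with strength g_eff.
        return (-1j * g_eff * c[1], -1j * g_eff * c[0])

    def shifted(c, k, h):
        return (c[0] + h * k[0], c[1] + h * k[1])

    c = (1.0 + 0.0j, 0.0 + 0.0j)  # start in |+>|0>
    for _ in range(steps):
        k1 = deriv(c)
        k2 = deriv(shifted(c, k1, 0.5 * dt))
        k3 = deriv(shifted(c, k2, 0.5 * dt))
        k4 = deriv(shifted(c, k3, dt))
        c = (c[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             c[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return tau, c


tau, (c_plus0, c_minus1) = swap_evolution(g_eff=3e8)
print(f"tau = {tau * 1e9:.2f} ns, |c_+0| = {abs(c_plus0):.1e}, c_-1 = {c_minus1:.4f}")
```

For the assumed coupling the swap time is about 5 ns, which is the time scale to compare with the dephasing times discussed for the charge qubit.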

Trapped ion
The electric field associated with the SAW mode not only extends into the solid; for a free surface, there is in general also an electrical potential decaying exponentially into the vacuum above the surface, $\sim \exp[-k|z|]$ [26]; cf. Appendix B. This allows for coupling to systems situated above the surface, without any mechanical contact. For example, consider a single ion of charge $q$ and mass $m$ trapped at a distance $d$ above the surface of a strongly piezoelectric material such as LiNbO$_3$ or AlN. The electric dipole induced by the ion motion couples to the electric field of the SAW phonon mode. The dynamics of this system are described by the Hamiltonian

$$H_{\mathrm{ion}} = \omega_c a^\dagger a + \omega_t b^\dagger b + g_{\mathrm{ion}}\left(a b^\dagger + a^\dagger b\right),$$

where $b$ refers to the annihilation operator of the ion's motional mode and $\omega_t$ is the (axial) trapping frequency. The single-phonon coupling strength is given by

$$g_{\mathrm{ion}} = q\phi_0 F(kd)\,\eta_{\mathrm{LD}}.$$

Apart from the exponential decay $F(kd) = \exp[-kd]$, the effective coupling is reduced by the Lamb-Dicke parameter $\eta_{\mathrm{LD}} = 2\pi x_0/\lambda_c$, with $x_0 = \sqrt{\hbar/2m\omega_t}$, since the motion of the ion is restricted to a region small compared with the SAW wavelength $\lambda_c$. For LiNbO$_3$, a surface mode area of $A = (1\text{-}5)\,\mu\mathrm{m} \times 40\lambda_c$, the commonly used $^9$Be$^+$ ion, and typical ion trap parameters with $d \approx 30\,\mu$m and $\omega_t/2\pi \approx 2$ MHz [52], we obtain $g_{\mathrm{ion}} \approx (3\text{-}6.7)$ kHz. Here, $g_{\mathrm{ion}}$ refers to the coupling between the ion's motion and the cavity. However, based on $H_{\mathrm{ion}}$, one can, in principle, generalize the well-known protocols operating on the ion's spin and motion to operations on the spin and the acoustic phonon mode [53].
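The two geometric reduction factors entering the ion coupling can be estimated with a few lines of Python. The sketch below computes the motional zero-point spread $x_0 = \sqrt{\hbar/2m\omega_t}$, the Lamb-Dicke parameter $\eta_{\mathrm{LD}} = 2\pi x_0/\lambda_c$, and the decay factor $\exp[-kd]$ for $^9$Be$^+$ at $\omega_t/2\pi = 2$ MHz and $d = 30\,\mu$m. The SAW velocity of $4 \times 10^3$ m/s, the resonance condition $\omega_c = \omega_t$ used to fix $\lambda_c$, and the function name are assumptions made for illustration only.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
AMU = 1.66053906660e-27  # atomic mass unit (kg)


def ion_geometry_factors(mass_amu, trap_freq_hz, wavelength_m, height_m):
    """Return (x0 in m, Lamb-Dicke parameter, exp(-k d)) for a trapped ion."""
    omega_t = 2.0 * math.pi * trap_freq_hz
    x0 = math.sqrt(HBAR / (2.0 * mass_amu * AMU * omega_t))
    k = 2.0 * math.pi / wavelength_m
    return x0, k * x0, math.exp(-k * height_m)


# Assumption: the cavity is resonant with the 2 MHz trap frequency and the
# SAW velocity is about 4e3 m/s, giving a wavelength of roughly 2 mm.
trap_freq = 2e6
wavelength = 4.0e3 / trap_freq
x0, eta_ld, decay = ion_geometry_factors(9.012, trap_freq, wavelength, 30e-6)
print(f"x0 = {x0 * 1e9:.1f} nm, eta_LD = {eta_ld:.2e}, exp(-kd) = {decay:.3f}")
```

Under these assumptions the Lamb-Dicke parameter is of order $10^{-5}$ to $10^{-4}$, whereas the exponential decay factor stays close to unity, so the Lamb-Dicke factor accounts for most of the reduction.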

NV center
Yet another system well suited for our scheme is the NV center in diamond. Even though diamond itself is not piezoactive, it has played a key role in the context of high-frequency SAW devices due to its record-high sound velocity [14]; for example, high-performance SAW resonators with a quality factor of $Q = 12\,500$ at $\omega_c \gtrsim 10$ GHz were experimentally demonstrated for AlN/diamond heterostructures [54,55]. To make use of the large magnetic coupling coefficient of the NV center spin, $\gamma_{\mathrm{NV}} = 2\pi \times 28$ GHz/T, here we consider a hybrid device composed of a thin layer of diamond with a single (negatively charged) NV center with ground-state spin $\mathbf{S}$ implanted a distance $d \approx 10$ nm away from the interface with a strongly piezomagnetic material. Equivalently, building upon current quantum sensing approaches [56,57], one could use a diamond nanocrystal (typically $\sim 10$ nm in size) in order to get the NV center extremely close to the surface of the piezomagnetic material and thus maximize the coupling to the SAW cavity mode; compare Fig. 3(a) for a schematic illustration. In the presence of an external magnetic field $\mathbf{B}_{\mathrm{ext}}$ [58], the system is described by

$$H_{\mathrm{NV}} = D S_z^2 + \gamma_{\mathrm{NV}} \mathbf{B}_{\mathrm{ext}} \cdot \mathbf{S} + \omega_c a^\dagger a + g_{\mathrm{NV}} \sum_{\alpha = x,y,z} \eta_{\mathrm{NV}}^{\alpha} S^{\alpha} \left(a + a^\dagger\right),$$

where $D = 2\pi \times 2.88$ GHz is the zero-field splitting, $g_{\mathrm{NV}} = \gamma_{\mathrm{NV}} B_0$ is the single-phonon coupling strength, and $\eta_{\mathrm{NV}}^{\alpha}$ is a dimensionless factor encoding the orientation of the NV spin with respect to the magnetic stray field of the cavity mode. For $d \ll \lambda_c$, a rough estimate shows that at least one component of $\eta_{\mathrm{NV}}^{\alpha}$ is of order unity [34]. For an NV center close to a Terfenol-D layer of thickness $h \gg \lambda_c$, we find $g_{\mathrm{NV}} \approx 400\,\mathrm{kHz}/\sqrt{A\,[\mu\mathrm{m}^2]}$. Thus, as compared to direct strain coupling $\lesssim 200\,\mathrm{Hz}/\sqrt{A\,[\mu\mathrm{m}^2]}$, the presence of the piezomagnetic layer is found to boost the single-phonon coupling strength by 3 orders of magnitude; this is in agreement with previous theoretical results for a static setting [34].
Typically, an interdigital transducer (IDT) consists of two thin-film electrodes on a piezoelectric material, each formed by interdigitated fingers. When an ac voltage is applied to the IDT on resonance (defined by the periodicity of the fingers as $\omega_{\mathrm{IDT}}/2\pi = v_s/p_{\mathrm{IDT}}$, where $v_s$ is the SAW propagation speed), it launches a SAW across the substrate surface in the two directions perpendicular to the IDT fingers [14,15,17].

B. Decoherence
In the analysis above, we ignored the presence of decoherence, which in any realistic setting will degrade the effects of coherent qubit-phonon interactions. In this context, the cooperativity parameter, defined as $C = g^2 T_2/[\kappa(\bar{n}_{\mathrm{th}} + 1)]$, is a key figure of merit. Here, $T_2$ refers to the corresponding dephasing time, while $\bar{n}_{\mathrm{th}} = (\exp[\hbar\omega_c/k_B T] - 1)^{-1}$ gives the thermal occupation number of the cavity mode at temperature $T$. The parameter $C$ compares the coherent single-phonon coupling strength $g$ with the geometric mean of the qubit's decoherence rate $\sim T_2^{-1}$ and the cavity's effective linewidth $\sim \kappa(\bar{n}_{\mathrm{th}} + 1)$; in direct analogy to cavity QED, $C > 1$ marks the onset of coherent quantum effects in a coupled spin-oscillator system, even in the presence of noise; cf. Ref. [10] and Appendix H for a detailed discussion. To estimate $C$, we take the following parameters for the dephasing time $T_2$. For system (i), $T_2 \approx 10$ ns is measured close to the charge degeneracy point $\epsilon = 0$ [46]. In scenario (ii), motional decoherence rates of 0.5 Hz are measured in a cryogenically cooled trap for an ion height of $150\,\mu$m and 1 MHz motional frequency [52]. Since this rate scales as $\sim d^{-4}$ [53,59], we take $T_2\,[\mathrm{s}] \approx 2\,(d\,[\mu\mathrm{m}]/150)^4$. Lastly, for the NV center (iii), $T_2 \approx 0.6$ s is demonstrated for ensembles of NV spins [60] and we assume an optimistic value of $T_2 = 100$ ms, similarly to Ref. [61]. The results are summarized in Table I. We find that $C > 1$ should be experimentally feasible, which is sufficient to perform a quantum gate between two spins mediated by a thermal mechanical mode [9].
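The cooperativity defined above is straightforward to evaluate. The Python sketch below does so for the charge-qubit parameters quoted in the text ($g_{\mathrm{ch}} \approx 200$-$450$ MHz, $T_2 \approx 10$ ns, $\omega_c = 2\pi \times 6$ GHz, $Q = 10^3$, $T = 20$ mK). It assumes that the quoted coupling strengths can be inserted directly as angular rates in s$^{-1}$ and that $\kappa = \omega_c/Q$; both are interpretive assumptions, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant (J s)
KB = 1.380649e-23       # Boltzmann constant (J/K)


def thermal_occupation(omega_c, temperature):
    """Bose-Einstein occupation of a mode with angular frequency omega_c."""
    return 1.0 / math.expm1(HBAR * omega_c / (KB * temperature))


def cooperativity(g, t2, omega_c, quality_factor, temperature):
    """C = g^2 T2 / [kappa (n_th + 1)] with kappa = omega_c / Q."""
    kappa = omega_c / quality_factor
    n_th = thermal_occupation(omega_c, temperature)
    return g ** 2 * t2 / (kappa * (n_th + 1.0))


omega_c = 2.0 * math.pi * 6e9  # 6 GHz cavity
for g in (2.0e8, 4.5e8):       # assumed to be angular rates in 1/s
    c = cooperativity(g, t2=10e-9, omega_c=omega_c,
                      quality_factor=1e3, temperature=20e-3)
    print(f"g = {g / 1e6:.0f} MHz -> C = {c:.1f}")
```

At 20 mK the thermal occupation of a 6 GHz mode is negligible, so the cooperativity is set almost entirely by $g^2 T_2/\kappa$ and falls in the range $C \sim 10$-$100$.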

C. Qubit-qubit coupling
When placing a pair of qubits into the same cavity, the regime of large single spin cooperativity $C \gg 1$ allows for coherent cavity-phonon-mediated interactions and quantum gates between the two spins via the effective interaction Hamiltonian $H_{\mathrm{int}} = g_{\mathrm{dr}}(S_1^+ S_2^- + \mathrm{H.c.})$, where $g_{\mathrm{dr}} = g^2/\delta \ll g$ in the so-called dispersive regime [4]. For the estimates given in Table I, we restrict ourselves to the $Q_r$ regime with $Q \approx 10^3$, where leakage through the acoustic mirrors dominates over undesired (nonscalable) phonon losses ($\kappa_{\mathrm{gd}} \gg \kappa_{\mathrm{bd}}$). However, note that small-scale experiments using a single cavity only (where there is no need for guiding the SAW phonon into a waveguide for further quantum information processing) can be operated in the $Q_m$ regime (which is limited only by internal material losses), where the quality factor $Q \approx Q_m \gtrsim 10^5$ is maximized (and thus overall phonon losses are minimal).
As a specific example, consider two NV centers, both coupled with strength $g_{\mathrm{NV}} \approx 100$ kHz to the cavity and in resonance with each other, but detuned from the resonator. Since for large detuning $\delta$ the cavity is only virtually populated, the cavity decay rate is reduced to $\kappa_{\mathrm{dr}} = (g_{\mathrm{NV}}/\delta)^2 \kappa \approx 10^{-2}\kappa$, whereas the spin-spin coupling is $g_{\mathrm{dr}} \approx 0.1\, g_{\mathrm{NV}} \approx 10$ kHz. Therefore, $T_2 = 1$ ms is already sufficient to approach the strong-coupling regime, where $g_{\mathrm{dr}} \gg \kappa_{\mathrm{dr}}, T_2^{-1}$.
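The numbers in this example follow from the dispersive-regime relations, which the Python sketch below evaluates, assuming the standard expressions $g_{\mathrm{dr}} = g^2/\delta$ and $\kappa_{\mathrm{dr}} = (g/\delta)^2\kappa$. The detuning $\delta = 10\,g_{\mathrm{NV}}$ is inferred from $g_{\mathrm{dr}} \approx 0.1\,g_{\mathrm{NV}}$, the cavity linewidth is left as a free input, and the function name is hypothetical.

```python
def dispersive_rates(g, delta, kappa):
    """Return (g_dr, kappa_dr) for coupling g, detuning delta, linewidth kappa.

    Assumes the standard dispersive-regime expressions
    g_dr = g^2 / delta and kappa_dr = (g / delta)^2 * kappa.
    """
    g_dr = g ** 2 / delta
    kappa_dr = (g / delta) ** 2 * kappa
    return g_dr, kappa_dr


g_nv = 100e3                 # 100 kHz single-phonon coupling (from the text)
delta = 10.0 * g_nv          # detuning inferred from g_dr = 0.1 g_NV
g_dr, kappa_ratio = dispersive_rates(g_nv, delta, kappa=1.0)
print(f"g_dr = {g_dr / 1e3:.0f} kHz, kappa_dr / kappa = {kappa_ratio:.2f}")
print(f"g_dr * T2 = {g_dr * 1e-3:.0f} for T2 = 1 ms")
```

Setting `kappa=1.0` returns the suppression factor $\kappa_{\mathrm{dr}}/\kappa$ directly; an absolute value of $\kappa_{\mathrm{dr}}$ requires choosing the cavity quality factor.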

D. Coherent driving
Finally, we note that, in all cases considered above, one could implement a coherent, electrical control by pumping the cavity mode using standard interdigital transducers (IDTs) [14,15,17]; compare Fig. 3(b) for a schematic illustration. The effect of the additional Hamiltonian $H_{\mathrm{drive}} = \Xi\cos(\omega_{\mathrm{IDT}}t)[a + a^\dagger]$ can be accounted for by replacing the cavity state by a coherent state; that is, $a \to \alpha$. For example, in the case of Eq. (5), one could then drive Rabi oscillations between the states $|+\rangle$ and $|-\rangle$ with the amplified Rabi frequency $\Omega_R = g\alpha$.

V. STATE TRANSFER PROTOCOL
The possibility of quantum transduction between SAWs and different realizations of stationary qubits enables a variety of applications, including quantum information architectures that use SAW phonons as a quantum bus to couple dissimilar and/or spatially separated qubits. The most fundamental task in such a quantum network is the implementation of a state transfer protocol between two remote qubits 1 and 2, which achieves the mapping $(\alpha|0\rangle_1 + \beta|1\rangle_1)\otimes|0\rangle_2 \to |0\rangle_1\otimes(\alpha|0\rangle_2 + \beta|1\rangle_2)$. In analogy to optical networks, this can be accomplished via coherent emission and reabsorption of a single phonon in a waveguide [13]. As first shown in the context of atomic QED [20], in principle perfect, deterministic state transfer can be implemented by identifying appropriate time-dependent control pulses.
Before we discuss a specific implementation of such a transfer scheme in detail, we provide a general approximate result for the state transfer fidelity $F$. As demonstrated in detail in Appendix H, for small infidelities one can take $F \approx 1 - \varepsilon - \mathcal{C}\,C^{-1}$ as a general estimate for the state transfer fidelity. Here, individual errors arise from intrinsic phonon losses $\sim\varepsilon = \kappa_{\mathrm{bd}}/\kappa_{\mathrm{gd}}$ and qubit dephasing $\sim C^{-1} \sim T_2^{-1}$, respectively; the numerical coefficient $\mathcal{C} \sim O(1)$ depends on the specific control pulse and may be optimized for a given set of experimental parameters [62]. This simple, analytical result holds for a Markovian noise model where qubit dephasing is described by a standard pure dephasing term leading to an exponential loss of coherence $\sim\exp(-t/T_2)$ and agrees well with numerical results presented in Ref. [62]. For non-Markovian qubit dephasing an even better scaling with $C$ can be expected [9]. Using experimentally achievable parameters $\varepsilon \approx 5\%$ and $C \approx 30$, we can then estimate $F \approx 90\%$, showing that fidelities sufficiently high for quantum communication should be feasible for all physical implementations listed in Table I.
In the following, we detail the implementation of a transfer scheme based on spin qubits implemented in gate-defined double quantum dots [63]. In particular, we consider singlet-triplet-like qubits encoded in lateral QDs, where two electrons are localized in adjacent, tunnel-coupled dots. As compared to the charge qubits discussed above, this system is known to feature superior coherence time scales [64][65][66][67], which are largely limited by the relatively strong hyperfine interaction between the electronic spin and the nuclei in the host environment [65], resulting in a random, slowly evolving magnetic (Overhauser) field for the electronic spin.
To mitigate this decoherence mechanism, two common approaches are (i) spin-echo techniques, which allow us to extend spin coherence from a time-ensemble-averaged dephasing time $T_2^\star \approx 100$ ns to $T_2 \gtrsim 250$ μs [67], and (ii) narrowing of the nuclear field distribution [65,68]. Recently, real-time adaptive control and estimation methods (that are compatible with arbitrary qubit operations) have made it possible to narrow the nuclear spin distribution to values that prolong $T_2^\star$ to $T_2^\star > 2$ μs [21]. For our purposes, the latter is particularly attractive, as it can be done simply before loading and transmitting the quantum information; spin-echo techniques can be employed as well, however, at the expense of more complex pulse sequences (see Appendix G for details). In order to couple the electric field associated with the SAW cavity mode to the electron spin states of such a DQD, the essential idea is to make use of an effective electric dipole moment associated with the exchange-coupled spin states of the DQD [69][70][71][72]. As detailed in Appendix G, we then find that in the usual singlet-triplet subspace spanned by the two-electron states $\{|\Uparrow\Downarrow\rangle, |\Downarrow\Uparrow\rangle\}$, a single node can well be described by the prototypical Jaynes-Cummings Hamiltonian given in Eq. (1). As compared to the direct charge coupling $g_{\mathrm{ch}}$, the single-phonon coupling strength $g$ is reduced, since the qubit states $|l\rangle$ have only a small admixture of the localized singlet, $\langle S_{02}|l\rangle$ ($l = 0, 1$). Using typical parameter values, we find $g \approx 0.1\,g_{\mathrm{ch}} \approx 200\ \mathrm{MHz}/\sqrt{A[\mu\mathrm{m}^2]}$ [73]. In this system, the coupling $g(t)$ can be tuned with great flexibility via the tunnel coupling $t_c$ and/or the detuning parameter $\epsilon$.
The state transfer between two such singlet-triplet qubits connected by a SAW waveguide can be adequately described within the theoretical framework of cascaded quantum systems, as outlined in detail, for example, in Refs. [13,20,74,75]: The underlying quantum Langevin equations describing the system can be converted into an effective, cascaded master equation for the system's density matrix $\rho$. For the relevant case of two qubits, it can be written as $\dot{\rho} = \mathcal{L}_{\mathrm{ideal}}\rho + \mathcal{L}_{\mathrm{noise}}\rho$. Here, $\mathcal{D}[a]\rho = a\rho a^\dagger - \frac{1}{2}\{a^\dagger a, \rho\}$ is a Lindblad term with jump operator $a$, and $H_S(t)$ describes the coherent Jaynes-Cummings dynamics of the two nodes. The ideal cascaded interaction is captured by $\mathcal{L}_{\mathrm{ideal}}$, which contains the nonlocal coherent environment-mediated coupling transferring excitations from qubit 1 to qubit 2 [76], while $\mathcal{L}_{\mathrm{noise}}$ summarizes undesired decoherence processes: We account for intrinsic phonon losses (bulk-mode conversion, material imperfections, etc.) with a rate $\kappa_{\mathrm{bd}}$ and (nonexponential) qubit dephasing. Since the nuclear spins evolve on relatively long time scales, the electronic spins in quantum dots typically experience non-Markovian noise leading to a nonexponential loss of coherence on a characteristic time scale $T_2^\star$ given by the width of the nuclear field distribution $\sigma_{\mathrm{nuc}}$ as $T_2^\star = \sqrt{2}/\sigma_{\mathrm{nuc}}$ [21,65]. Recently, a record-low value of $\sigma_{\mathrm{nuc}}/2\pi = 80$ kHz has been reported [21], yielding an extended time-ensemble-averaged electron dephasing time of $T_2^\star = 2.8$ μs. In our model, to realistically account for the dephasing induced by the quasistatic, yet unknown Overhauser field, the detuning parameters $\delta_i$ are sampled independently from a normal distribution $p(\delta_i)$ with zero mean (since nominal resonance can be achieved via the electronic control parameters) and standard deviation $\sigma_{\mathrm{nuc}}$ [68]; see Appendix G for details. In Appendix J, we also provide numerical results for standard Markovian dephasing, showing that non-Markovian noise is beneficial in terms of faithful state transfer.
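The quasistatic noise model described above can be illustrated with a short Monte Carlo sketch. It checks that $\sigma_{\mathrm{nuc}}/2\pi = 80$ kHz corresponds to $T_2^\star \approx 2.8$ μs and that averaging over normally distributed detunings produces a Gaussian (nonexponential) decay $\exp[-(t/T_2^\star)^2]$ of the ensemble coherence. The function names, sample size, and seed are assumptions for illustration; this is not the full master-equation simulation.

```python
import math
import random

def t2_star(sigma_nuc):
    """Ensemble dephasing time T2* = sqrt(2) / sigma_nuc (sigma_nuc in rad/s)."""
    return math.sqrt(2.0) / sigma_nuc

def ensemble_coherence(t, sigma_nuc, n_samples=20000, seed=0):
    """Average of cos(delta * t) over detunings delta ~ Normal(0, sigma_nuc)."""
    rng = random.Random(seed)
    total = sum(math.cos(rng.gauss(0.0, sigma_nuc) * t) for _ in range(n_samples))
    return total / n_samples

sigma_nuc = 2.0 * math.pi * 80e3   # value reported in Ref. [21] of the text
t_star = t2_star(sigma_nuc)        # about 2.8 microseconds
coherence_at_t_star = ensemble_coherence(t_star, sigma_nuc)
```

Analytically, the Gaussian average gives $\exp(-\sigma_{\mathrm{nuc}}^2t^2/2) = \exp[-(t/T_2^\star)^2]$, so the sampled coherence at $t = T_2^\star$ should lie close to $1/e$.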
Under ideal conditions where $\mathcal{L}_{\mathrm{noise}} = 0$, the setup is analogous to the one studied in Ref. [20] and the same time-symmetry arguments can be employed to determine the optimal control pulses $g_i(t)$ for faithful state transfer: if a phonon is emitted by the first node, then, upon reversing the direction of time, one would observe perfect reabsorption. By engineering the emitted phonon wave packet such that it is invariant under time reversal and using a time-reversed control pulse for the second node, $g_2(t) = g_1(-t)$, the absorption process in the second node is a time-reversed copy of the emission in the first and therefore in principle perfect. Based on this reasoning (for details, see Ref. [20]), we find the explicit, optimal control pulse shown in Fig. 4(c).
To account for noise, we simulate the full master equation numerically. The results are displayed in Fig. 4(a), where for every random pair $\delta = (\delta_1, \delta_2)$ the fidelity of the protocol is defined as the overlap between the target state $|\psi_{\mathrm{tar}}\rangle$ and the actual state after the transfer $\rho(t_f)$; that is, $F_\delta = \langle\psi_{\mathrm{tar}}|\rho(t_f)|\psi_{\mathrm{tar}}\rangle$. The average fidelity $\bar{F}$ of the protocol is determined by averaging over the classical noise in $\delta$; that is, $\bar{F} = \int d\delta_1\,d\delta_2\,p(\delta_1)p(\delta_2)F_\delta$. Taking an effective mode area $A \approx 100\ \mu\mathrm{m}^2$ as above and $Q \approx 10^3$ to be well within the $Q_r$ regime where $\kappa_{\mathrm{bd}}/\kappa_{\mathrm{gd}} \approx 5\%$, we have $g \approx \kappa_{\mathrm{gd}} \approx 20$ MHz. For two nodes separated by millimeter distances, propagation losses are negligible and $\kappa_{\mathrm{bd}}/\kappa_{\mathrm{gd}} \approx 5\%$ captures well all intrinsic phonon losses during the transfer. We then find that for realistic undesired phonon losses $\kappa_{\mathrm{bd}}/\kappa_{\mathrm{gd}} \approx 5\%$ and $\sigma_{\mathrm{nuc}}/2\pi = 80$ kHz (such that $\sigma_{\mathrm{nuc}}/\kappa_{\mathrm{gd}} \approx 2.5\%$) [21], transfer fidelities close to 95% seem feasible. Notably, this could be improved even further using spin-echo techniques, such that $T_2 \approx 10^2\,T_2^\star$ [67]. Therefore, state transfer fidelities $F > 2/3$ as required for quantum communication [77] seem feasible with present technology. Near-unit fidelities might be approached from further optimizations of the system's parameters, the cavity design, the control pulses, and/or from communication protocols that correct for errors such as phonon losses [78][79][80]. Once the transfer is complete, the system can be tuned adiabatically into a storage regime that immunizes the qubit against electronic noise, and dominant errors from hyperfine interaction with ambient nuclear spins can be mitigated by standard, occasional refocusing of the spins [67,69]. Alternatively, one could also investigate silicon dots: while this setup requires a more sophisticated heterostructure including some piezoelectric layer (as studied experimentally in Ref. [41]), it potentially benefits from prolonged dephasing times $T_2^\star > 100$ μs [81], since nuclear spins are largely absent in isotopically purified $^{28}$Si.

VI. SUMMARY AND OUTLOOK
In summary, we propose and analyze SAW phonons in piezoactive materials (such as GaAs) as a universal quantum transducer that allows us to convert quantum information between stationary and propagating realizations. We show that a sound-based quantum information architecture based on SAW cavities and waveguides is very versatile, bears striking similarities to cavity QED, and can serve as a scalable mediator of long-range spin-spin interactions between a variety of qubit implementations, allowing for faithful quantum state transfer between remote qubits with existing experimental technology. The proposed combination of techniques and concepts known from quantum optics and quantum information, in conjunction with the technological expertise for SAW devices, is likely to lead to further, rapid theoretical and experimental progress.
Finally, we highlight possible directions of research going beyond our present work. First, since our scheme is not specific to any particular qubit realization, novel hybrid systems could be developed by embedding dissimilar systems such as quantum dots and superconducting qubits into a common SAW architecture. Second, our setup could also be used as a transducer between different propagating quantum systems such as phonons and photons. Light can be coupled into the SAW circuit via (for example) NV centers or self-assembled quantum dots, and structures guiding both photons and SAW phonons have already been fabricated experimentally [39,40]. Finally, the SAW architecture opens up a novel, on-chip test bed for investigations reminiscent of quantum optics, bringing the highly developed toolbox of quantum optics and cavity QED to the widely anticipated field of quantum acoustics [11,16,17,82]. Potential applications include quantum simulation of many-body dynamics [83], quantum state engineering (yielding, for example, squeezed states of sound), quantum-enhanced sensing, sound detection, and sound-based material analysis.

FIG. 4. (a) Average fidelity $\bar{F}$ of the state transfer in the presence of quasistatic (non-Markovian) Overhauser noise, as a function of the root-mean-square fluctuations $\sigma_{\mathrm{nuc}}$ in the detuning parameters $\delta_i$ ($i = 1, 2$), for $\kappa_{\mathrm{bd}}/\kappa_{\mathrm{gd}} = 0$ (solid line, circles) and $\kappa_{\mathrm{bd}}/\kappa_{\mathrm{gd}} = 10\%$ (dash-dotted line, squares). (b) After $n = 100$ runs with random values for $\delta_i$, $\bar{F}$ approximately reaches convergence. The curves refer to $\sigma_{\mathrm{nuc}}/\kappa_{\mathrm{gd}} = (0, 2, \dots, 10)\%$ (from top to bottom) for $\kappa_{\mathrm{bd}}/\kappa_{\mathrm{gd}} = 10\%$. (c) Pulse shape $g_1(t)$ for first node.

APPENDIX A: CLASSICAL DESCRIPTION OF NONPIEZOELECTRIC SURFACE ACOUSTIC WAVES
In this appendix, we review the general (classical) theoretical framework describing SAW in cubic lattices, such as diamond or GaAs. We derive an analytical solution for propagation in the [110] direction. The latter is of particular interest in piezoelectric systems. The classical description of a SAW is explicitly shown here to make our work self-contained, but follows standard references such as Ref. [26].

Wave equation
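The propagation of acoustic waves (bulk and surface waves) in a solid is described by the equation

$$\rho\,\ddot{u}_i(\mathbf{x},t) = \frac{\partial T_{ij}}{\partial x_j}, \qquad \mathrm{(A1)}$$

where $\mathbf{u}$ denotes the displacement vector with $u_i$ being the displacement along the Cartesian coordinate $\hat{x}_i$ ($\hat{x}_1 = \hat{x}$, $\hat{x}_2 = \hat{y}$, $\hat{x}_3 = \hat{z}$), $\rho$ gives the mass density, and $T$ is the stress tensor; $T_{ij}$ is the $i$th component of force per unit area perpendicular to the $\hat{x}_j$ axis. Moreover, $\mathbf{x}$ is the Cartesian coordinate vector, where in the following we assume a material with infinite dimensions in $\hat{x}, \hat{y}$ and a surface perpendicular to the $\hat{z}$ direction at $z = 0$. The stress tensor obeys a generalized Hooke law (stress is linearly proportional to strain), $T_{ij} = c_{ijkl}\,\varepsilon_{kl}$, where the strain tensor is defined as $\varepsilon_{kl} = \frac{1}{2}(\partial u_k/\partial x_l + \partial u_l/\partial x_k)$. Using the symmetry $c_{ijkl} = c_{ijlk}$, in terms of displacements we find $T_{ij} = c_{ijkl}\,\partial u_k/\partial x_l$, such that Eq. (A1) takes the form of a set of three coupled wave equations, $\rho\,\ddot{u}_i = c_{ijkl}\,\partial^2 u_k/\partial x_j\partial x_l$. The elasticity tensor $c$ obeys the symmetries $c_{ijkl} = c_{jikl} = c_{ijlk} = c_{klij}$ and is largely defined by the crystal symmetry.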

Mechanical boundary condition
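The free surface at $z = 0$ is stress free (no external forces are acting upon it), such that the three components of stress across $z = 0$ shall vanish; that is, $T_{i\hat{z}}|_{z=0} = 0$ for $i = x, y, z$. This results in the boundary conditions $c_{i\hat{z}kl}\,\partial u_k/\partial x_l\,|_{z=0} = 0$.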

Cubic lattice
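For a cubic lattice (such as GaAs or diamond), the elastic tensor $c_{ijkl}$ has three independent elastic constants, generally denoted by $c_{11}$, $c_{12}$, and $c_{44}$; compare Table II. Taking the three direct twofold axes as the coordinate axes, the wave equations then read

$$\rho\,\ddot{u}_x = c_{11}\frac{\partial^2 u_x}{\partial x^2} + c_{44}\left(\frac{\partial^2 u_x}{\partial y^2} + \frac{\partial^2 u_x}{\partial z^2}\right) + (c_{12} + c_{44})\left(\frac{\partial^2 u_y}{\partial x\,\partial y} + \frac{\partial^2 u_z}{\partial x\,\partial z}\right)$$

(and cyclic permutations), while the mechanical boundary conditions can be written as

$$\frac{\partial u_x}{\partial z} + \frac{\partial u_z}{\partial x} = 0, \qquad \frac{\partial u_y}{\partial z} + \frac{\partial u_z}{\partial y} = 0, \qquad c_{11}\frac{\partial u_z}{\partial z} + c_{12}\left(\frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y}\right) = 0$$

at $z = 0$. In the following, we seek solutions that propagate along the surface with a wave vector $\mathbf{k} = k(l\hat{x} + m\hat{y})$, where $l = \cos(\theta)$, $m = \sin(\theta)$, and $\theta$ is the angle between the $\hat{x}$ axis and $\mathbf{k}$. Following Ref. [84], we make the ansatz $(u_x, u_y, u_z) = (U, V, W)\,e^{-kqz}e^{ik(lx + my - ct)}$, where the decay constant $q$ describes the exponential decay of the surface wave into the bulk and $c$ is the phase velocity. Plugging in this ansatz, the mechanical wave equations can be rewritten as a homogeneous linear system $M(U, V, W)^T = 0$; nontrivial solutions require $\det(M) = 0$, which determines the allowed decay constants $q$. If the medium lies in the half-space $z > 0$, the roots with negative real part will lead to a solution that does not converge as $z \to \infty$. Thus, only the roots that lead to vanishing displacements deep in the bulk are kept. Then, the most general solution can be written as a superposition of surface waves with allowed $q_r$ values as

$$(u_x, u_y, iu_z) = \sum_{r=1,2,3}(\xi_r, \eta_r, \zeta_r)\,K_r\,e^{-kq_rz}\,e^{ik(lx + my - ct)},$$

where, for any $q_r = q_r(c, \theta)$, the ratios of the amplitudes can be calculated according to Eq. (A14). Note that for each root $q_r$ and displacement $u_i$ there is an associated amplitude. The phase velocity $c$, however, is the same for every root $q_r$, and needs to be determined from the mechanical boundary conditions as we describe below.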
Similarly to the acoustic wave equations, the boundary conditions can be rewritten as $B(K_1, K_2, K_3)^T = 0$, where the boundary condition matrix $B$ involves the ratio $a = c_{11}/c_{12}$. Again, nontrivial solutions are found for $\det(B) = 0$. The requirements $\det(M) = 0$ and $\det(B) = 0$, together with Eq. (A14), constitute the formal solution of the problem [84]; $\det(M) = 0$ and $\det(B) = 0$ may be seen as determining $c^2$ and $q^2$, and Eq. (A14) then gives the ratios of the components of the displacement. In the following, we discuss a special case where one can eliminate the $q$ dependence in $\det(B) = 0$, leading to an explicit, analytically simple equation for the phase velocity $c$, which depends only on the material properties.

Propagation in [110] direction
The wave equations simplify for propagation in high-symmetry directions. Here, we consider propagation in the [110] direction, for which the condition $\det(M) = 0$ [Eq. (A19)] yields the roots $q_1^2, q_2^2$. We choose the roots commensurate with the convergence condition, yielding the general ansatz of Eq. (A20). The amplitude ratios $\gamma'_r = iW_r/U'_r$ can be obtained from the kernel of $M$ and are expressed in terms of $X = \rho c^2/c_{11}$. In the coordinate system $\{\hat{x}', \hat{z}\}$, the mechanical boundary conditions read $\partial u_{x'}/\partial z + \partial u_z/\partial x' = 0$ and $c_{11}\,\partial u_z/\partial z + c_{12}\,\partial u_{x'}/\partial x' = 0$ at $z = 0$. For the ansatz given in Eq. (A20), they can be reformulated as $B_{110}(U'_1, U'_2)^T = 0$. The requirement $\det(B_{110}) = 0$ can be written as

$$q_1\left[c_{12} + \rho c^2 + c_{11}q_1^2\right]\left[c_{12}(c_{44} - \rho c^2) + c_{11}c_{44}q_2^2\right] - q_2\left[c_{12} + \rho c^2 + c_{11}q_2^2\right]\left[c_{12}(c_{44} - \rho c^2) + c_{11}c_{44}q_1^2\right] = 0.$$

From the symmetry of this equation it is clear that one can remove a factor $(q_1 - q_2)$. Using simple expressions for $q_1^2q_2^2$ and $q_1^2 + q_2^2$ obtained from Eq. (A19), one arrives at an explicit equation for the wave velocity $c$ [26,84], which is cubic in $X = \rho c^2/c_{11}$. If not stated otherwise, we consider the mode with the lowest sound velocity, referred to as the Rayleigh mode; compare Fig. 5.

APPENDIX B: SURFACE ACOUSTIC WAVES IN PIEZOELECTRIC MATERIALS
In a piezoelectric material, elastic and electromagnetic waves are coupled. In principle, the field distribution can be found only by solving simultaneously the equations of both Newton and Maxwell. The corresponding solutions are hybrid elastoelectromagnetic waves, i.e., elastic waves with velocity $v_s$ accompanied by electric fields, and electromagnetic waves with velocity $c \approx 10^5 v_s$ accompanied by mechanical strains. For the first type of wave, the magnetic field is negligible, because it is due to an electric field traveling with a velocity $v_s$ much slower than the speed of light $c$; therefore, one can approximate Maxwell's equations as $\nabla\times\mathbf{E} = -\partial\mathbf{B}/\partial t \approx 0$, such that $E_i = -\partial\phi/\partial x_i$. Thus, the propagation of elastic waves in a piezoelectric material can be described within the quasistatic approximation, where the electric field is essentially static compared to electromagnetic fields [27]. The potential $\phi$ and the associated electric field are not electromagnetic in nature but rather a component of the predominantly mechanical wave propagating with velocity $v_s$.

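General analysis

a. Wave equation
The basic equations that govern the propagation of acoustic waves in a piezoelectric material connect the mechanical stress $T$ and the electrical displacement $D$ with the mechanical strain and the electrical field. The coupled constitutive equations are $T_{ij} = c_{ijkl}\,\varepsilon_{kl} - e_{kij}E_k$ and $D_i = \epsilon_{ij}E_j + e_{ijk}\,\varepsilon_{jk}$, where $e$ (with $e_{ijk} = e_{ikj}$) and $\epsilon$ are the piezoelectric and permittivity tensor, respectively. Here, Hooke's law is extended by the additional stress term due to the piezoelectric effect, while the equation for the displacement $D_i$ includes the polarization produced by the strain. Therefore, Newton's law becomes $\rho\,\ddot{u}_i = c_{ijkl}\,\partial^2u_k/\partial x_j\partial x_l + e_{kij}\,\partial^2\phi/\partial x_j\partial x_k$. For an insulating solid, the electric displacement $D_i$ must satisfy Poisson's equation $\partial D_i/\partial x_i = 0$, which yields $e_{ijk}\,\partial^2u_j/\partial x_i\partial x_k - \epsilon_{ij}\,\partial^2\phi/\partial x_i\partial x_j = 0$.

b. Mechanical boundary conditions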
In the presence of piezoelectric coupling, the mechanical boundary conditions [compare Eq. (A6)] generalize to $(c_{i\hat{z}kl}\,\partial u_k/\partial x_l + e_{ki\hat{z}}\,\partial\phi/\partial x_k)|_{z=0} = 0$. Using the symmetries $c_{ijkl} = c_{jikl}$ and $e_{kij} = e_{kji}$, it is easy to check that this is equivalent to Eq. (41) in Ref. [26].

c. Electric boundary condition
In addition to the stress-free boundary conditions, piezoelectricity introduces an electric boundary condition: The normal component of the electric displacement needs to be continuous across the surface [39]; that is, $D_z(z = 0^+) = D_z(z = 0^-)$ [Eq. (B6)], where by definition $D_z = e_{\hat{z}jk}\,\partial u_j/\partial x_k - \epsilon_{\hat{z}j}\,\partial\phi/\partial x_j$. Outside of the medium ($z < 0$), we assume vacuum; thus, $D_z = -\epsilon_0\,\partial\phi_{\mathrm{out}}/\partial z$, where the electrical potential has to satisfy Laplace's equation $\Delta\phi_{\mathrm{out}} = 0$. The ansatz $\phi_{\mathrm{out}} = A_{\mathrm{out}}\,e^{ik(lx + my - ct)}e^{\Omega kz}$ gives $\Delta\phi_{\mathrm{out}} = (-k^2 + \Omega^2k^2)\phi_{\mathrm{out}} = 0$. Thus, for proper convergence far away from the surface, $z \to -\infty$, we take the decay constant $\Omega = 1$; accordingly, the electrical potential decays exponentially into the vacuum above the surface on a typical length scale given by the SAW wavelength $\lambda = 2\pi/k \approx 1$ μm. Therefore, for the electrical displacement outside of the medium, we find $D_z = -\epsilon_0k\phi$. Lastly, the electrical potential has to be continuous across the surface [26], i.e., $\phi(z = 0^+) = \phi_{\mathrm{out}}(z = 0^-)$, which allows us to determine the amplitude $A_{\mathrm{out}}$. In summary, Eq. (B6) can be rewritten as $(e_{\hat{z}jk}\,\partial u_j/\partial x_k - \epsilon_{\hat{z}j}\,\partial\phi/\partial x_j + \epsilon_0k\phi)|_{z=0} = 0$.
UNIVERSAL QUANTUM TRANSDUCERS BASED ON … PHYS. REV. X 5, 031031 (2015) 031031-13
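The exponential decay of the potential into the vacuum can be made concrete with a short sketch. It evaluates the attenuation factor $e^{-kd}$ implied by the decay constant $\Omega = 1$ at a height $d$ above the surface; the function name and the example heights are assumptions chosen for illustration.

```python
import math

def potential_attenuation(d, wavelength):
    """Factor exp(-k d) by which the SAW potential is reduced at height d above the surface."""
    k = 2.0 * math.pi / wavelength
    return math.exp(-k * d)

wavelength = 1e-6                                        # SAW wavelength of about 1 micron
near = potential_attenuation(100e-9, wavelength)         # 100 nm above the surface
far = potential_attenuation(wavelength, wavelength)      # one full wavelength above
```

A qubit placed well within one wavelength of the surface therefore still experiences a sizable fraction of the surface potential, while at a height of one wavelength the potential is suppressed by a factor $e^{-2\pi}$.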

Cubic lattice a. Specific analysis for cubic systems
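For a cubic, piezoelectric system there is only one independent nonzero component of the piezoelectric tensor, called $e_{14}$ [26,27]. With this piezoelectric coupling, the wave equations are given by four coupled partial differential equations,

$$\rho\,\ddot{u}_x = c_{11}\frac{\partial^2u_x}{\partial x^2} + c_{44}\left(\frac{\partial^2u_x}{\partial y^2} + \frac{\partial^2u_x}{\partial z^2}\right) + (c_{12} + c_{44})\left(\frac{\partial^2u_y}{\partial x\,\partial y} + \frac{\partial^2u_z}{\partial x\,\partial z}\right) + 2e_{14}\frac{\partial^2\phi}{\partial y\,\partial z},$$

$$\epsilon\,\Delta\phi = 2e_{14}\left(\frac{\partial^2u_x}{\partial y\,\partial z} + \frac{\partial^2u_y}{\partial x\,\partial z} + \frac{\partial^2u_z}{\partial x\,\partial y}\right),$$

and cyclic permutations for $u_y$ and $u_z$. Here, $\Delta$ is the Laplacian and $\epsilon$ is the dielectric constant of the medium. For a cubic lattice, the mechanical boundary conditions at $z = 0$ explicitly read $c_{44}(\partial u_x/\partial z + \partial u_z/\partial x) + e_{14}\,\partial\phi/\partial y = 0$, $c_{44}(\partial u_y/\partial z + \partial u_z/\partial y) + e_{14}\,\partial\phi/\partial x = 0$, and $c_{11}\,\partial u_z/\partial z + c_{12}(\partial u_x/\partial x + \partial u_y/\partial y) = 0$, while the electrical boundary condition [compare the general relation in Eq. (B9)] leads to $e_{14}(\partial u_x/\partial y + \partial u_y/\partial x) - \epsilon\,\partial\phi/\partial z + \epsilon_0k\phi = 0$ at $z = 0$. In general, the wave equations can be formulated in terms of a $4\times4$ matrix $M$; the condition $\det M = 0$ can then be used to find the four decay constants. In addition, the mechanical and electrical boundary conditions can be recast into a $4\times4$ boundary condition matrix $B$, from which one can deduce the allowed phase velocities of the piezoelectric SAW by solving $\det B = 0$.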

b. Perturbative treatment
For materials with weak piezoelectric coupling (such as GaAs), the properties of surface acoustic waves are primarily determined by the elastic constants and density of the medium. Then, within a perturbative treatment of the piezoelectric coupling, one can obtain analytical expressions for the strain and piezoelectric fields. Here, we summarize the results for SAWs propagating along $\hat{x}' \parallel [110]$ of the $\hat{z} \parallel [001]$ surface following Refs. [26,28]. Since the piezoelectric coupling $e_{14}$ is small, it follows from Eq. (B11) that $\phi$ will be order $e_{14}$ smaller than the mechanical displacements $u$; that is, $\phi \sim (e_{14}/\epsilon)\,u$. This results in additional terms in the wave equations that are of order $\sim e_{14}^2/\epsilon \approx 10^8\ \mathrm{N/m^2}$. Since the elastic constants are 2-3 orders of magnitude bigger than this piezoelectric term, the wave equations [Eq. (B10) and cyclic versions for $u_y$, $u_z$] will be solved by the nonpiezoelectric solution with corrections only at order $e_{14}^2$. The nonpiezoelectric solution derived in detail in Appendix A is summarized in Eq. (B17). The sound velocity $v$ for the Rayleigh mode follows from the smallest solution of a cubic equation in $v^2$ involving $c'_{11} = c_{44} + (c_{11} + c_{12})/2$. The decay constant $q$ and the parameters $\gamma$ and $\varphi$ are likewise fixed by the elastic constants. Now, based on the nonpiezoelectric solution given in Eq. (B17), the potential $\phi$ is constructed such that both the wave equation in Eq. (B11) and the electrical boundary condition in Eq. (B15), expressed in the $\{\hat{x}', \hat{y}', \hat{z}\}$ coordinate system, are solved. One can readily check that this is achieved by the form proposed in Refs. [26,28], in which $\phi_0 = (e_{14}/\epsilon)U$ and $A_{\mathrm{out}} = i\phi_0\mathcal{F}(0)$. Here, we introduce the dimensionless function $\mathcal{F}(kz)$, which determines the length scale on which the electrical potential generated by the SAW decays into the bulk. It is expressed in terms of $A_1 = |A_1|e^{-i\xi}$, $A_3$, and $q = \alpha + \beta i$. For Al$_x$Ga$_{1-x}$As, we obtain the following parameter values (compare Ref. [26]): $|A_1| \approx 1.59$, $A_3 = -3.1$, $\alpha \approx 0.501$, $\beta \approx 0.472$, $\varphi = 1.06$, and $\xi = -0.33$. The electric potential for this parameter set is shown in Fig. 7.

APPENDIX C: MECHANICAL ZERO-POINT FLUCTUATION
Here, we provide more detailed calculations and estimates for the mechanical zero-point motion $U_0$ of a SAW. We show that they agree very well with the simple estimate given in the main text. Finally, we provide details on the material parameters used to obtain the numerical estimates.
Our first approach follows closely the one presented in Ref. [32]. The analysis starts out from the mechanical displacement operator in the Heisenberg picture: $\mathbf{u}(\mathbf{x}, t) = \sum_n[\mathbf{v}_n(\mathbf{x})\,a_n e^{-i\omega_nt} + \mathrm{H.c.}]$. To obtain the proper normalization of the displacement profiles, we assume a single-phonon Fock state, that is, $|\Psi\rangle = a_n^\dagger|\mathrm{vac}\rangle = |0, \dots, 0, 1_n, 0, \dots\rangle$, where $|\mathrm{vac}\rangle = \prod_n|0\rangle_n$ is the phonon vacuum, and compute the expectation value of additional field energy above the vacuum $E_{\mathrm{mech}}$, defined as twice the kinetic energy, since for a mechanical mode half of the energy is kinetic and the other half potential [32]. We find $E_{\mathrm{mech}} = 2\rho\,\omega_n^2\int d^3r\,|\mathbf{v}_n(\mathbf{r})|^2 \equiv 2\rho V\omega_n^2\max[|\mathbf{v}_n(\mathbf{r})|^2]$, where the last equality defines the effective mode volume $V$ for mode $n$. Setting $U_0 = \max[|\mathbf{v}_n(\mathbf{r})|]$, and assuming the phonon energy as $E_{\mathrm{mech}} = \hbar\omega_n$, we arrive at the general result for a phonon mode, $U_0 = \sqrt{\hbar/2\rho V\omega_n}$; this confirms the simple estimate given in the main text.
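To get a feeling for the magnitude of $U_0 = \sqrt{\hbar/2\rho V\omega_n}$, the following sketch evaluates it for a GaAs-like surface mode. The mass density, the sound velocity, and the choice of mode volume $V \approx A\lambda$ (the mode extends roughly one wavelength into the bulk) are assumptions made for this illustration, not values taken from the tables of this work.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant (J s)

def zero_point_amplitude(rho, volume, omega):
    """U0 = sqrt(hbar / (2 rho V omega)) for a single phonon mode (SI units)."""
    return math.sqrt(HBAR / (2.0 * rho * volume * omega))

rho = 5.3e3            # assumed mass density of GaAs (kg/m^3)
v_s = 3.0e3            # assumed SAW velocity (m/s)
wavelength = 1e-6      # SAW wavelength (m)
area = 1e-12           # surface mode area A = 1 um^2 (m^2)

omega = 2.0 * math.pi * v_s / wavelength   # mode frequency (rad/s), about 2 pi x 3 GHz
volume = area * wavelength                 # assumed mode volume V ~ A * lambda
u0 = zero_point_amplitude(rho, volume, omega)
```

Under these assumptions $U_0$ comes out slightly below one femtometer, consistent in order of magnitude with the single-phonon displacement quoted for a micron-sized mode.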

Explicit example
Next, we provide a calculation based on the exact analytical results derived in Appendixes A and B. In what follows, we assume that, in analogy to cavity QED, cavity confinement leads to the quantization $k_n = n\pi/L_c$, where $A = L_c^2$ is the effective quantization area. In a full 3D model, $A = L_xL_y$, where $L_y$ is related to the spread of the transverse mode function as discussed (for example) in Ref. [14]. For simplicity, here we take $L_x = L_y$. Surface wave resonators can routinely be designed to show only one resonance $k_0$ [14]. Within this single-mode approximation, based on results derived in Appendix A for a SAW traveling wave, we take the quantized mechanical displacement describing a SAW standing wave along the propagation axis. Here, the functions $\chi_0(z)$ and $\zeta_0(z)$ describe how the SAW decays into the bulk, with material-dependent parameters $\Omega = \Omega_r + i\Omega_i$, $\gamma = |\gamma|\exp[-i\theta]$, and $\varphi$; numerical values are presented in Table III. We note that for GaAs we find $\zeta(0)/\chi(0) \approx 1.33$. This is in very good agreement with the numerical values of $c_x = |u_x/\phi| = 0.98$ nm/V and $c_z = |u_z/\phi| = 1.31$ nm/V as given in Ref. [15]. Normalization of the mode function allows us to determine the parameter $U_0$. Performing the integration, we find an expression for $U_0$ in which the parameter $\delta$ depends on the material parameters; see Table III. Using typical material parameters, we obtain for GaAs (diamond) $\sqrt{\Omega_r/\delta} = 0.64\ (1.17)$. This is in very good agreement with the numerical values presented in the main text.

Estimates derived from literature
In Ref. [26], it is shown that the SAW Rayleigh mode studied in Appendix A has a classical energy density $E$ (energy per unit surface area) that is expressed in terms of the amplitude of the wave $U$, the wave vector $k$, and a material-dependent factor $H$, which is given as $H \approx 28.2\times10^{10}\ \mathrm{N/m^2}$ for GaAs. By equating the classical energy of the SAW given by $EA$, where $A$ is the quantization area, with its quantum-mechanical analog $N_{\mathrm{ph}}\hbar\omega$ ($N_{\mathrm{ph}}$ is the number of phonons), we estimate the single-phonon displacement $U_0$. This estimate is also found to be in very good agreement with a result given in Ref. [39], which involves a normalization constant $C$ with numerical value $C \approx 0.45$ for GaAs [39]. Therefore, for an effective mode size of $L_c = \sqrt{A} = 1$ μm, we find a single-phonon displacement of $U_0 \approx 1$ fm. This confirms the estimates given in the main text.

APPENDIX D: ZERO-POINT ESTIMATES
In this appendix, we provide details on piezomagnetic materials and numerical estimates of the zero-point quantities for several relevant materials. The results are summarized in Table IV. The underlying input parameters are given below.
Theoretically, piezomagnetic materials with a large magnetostrictive effect are typically described in a 1:1 correspondence to Eqs. (2) and (3), with the appropriate replacements (using standard notation) $E \to H$, $D \to B$, $\epsilon_{ij} \to \mu_{ij}$, and $e_{ijk} \to h_{ijk}$ [33]. Coupling between mechanical and magnetic degrees of freedom is described by the piezomagnetic tensor $h$, which can reach values as high as $\sim 700$ T/strain [36]; for our estimates we refer to terfenol-D, where $h_{15} \approx 167$ T/strain. The magnetic field associated with a single phonon can then be estimated as $B_0 \approx h_{15}s_0$, where $h_{15}$ refers to a typical (nonzero) element of $h$.
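The estimate $B_0 \approx h_{15}s_0$ can be turned into a number with a minimal sketch. It assumes that the single-phonon strain scales as $s_0 \approx kU_0$ and uses an illustrative zero-point amplitude of 1 fm at $k = 2\pi/\mu\mathrm{m}$; both the function names and these input values are assumptions and do not reproduce the entries of Table IV.

```python
import math

def single_phonon_strain(u0, wavelength):
    """Assumed scaling s0 ~ k * U0 with k = 2 pi / wavelength."""
    return 2.0 * math.pi / wavelength * u0

def single_phonon_b_field(h15, strain):
    """B0 ~ h15 * s0, with h15 in T/strain."""
    return h15 * strain

s0 = single_phonon_strain(u0=1e-15, wavelength=1e-6)   # illustrative U0 = 1 fm
b0 = single_phonon_b_field(h15=167.0, strain=s0)       # terfenol-D value from the text
```

With these inputs the single-phonon magnetic field is of the order of a microtesla for a mode area of about $1\ \mu\mathrm{m}^2$.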
For the piezoelectric materials GaAs, LiNbO$_3$, and quartz, all material parameters are obtained from Ref. [27]. Phase velocities for typical cut directions are used; that is, (100)[001] GaAs, Y-Z LiNbO$_3$, and ST quartz. For the piezomagnetic (magnetostrictive) materials CoFe$_2$O$_4$ and terfenol-D, all material parameters are taken from Ref. [35]. We use the phase velocities of the bulk shear waves given there as $v_{\mathrm{sh}} = 3.02\times10^3$ m/s and $v_{\mathrm{sh}} = 1.19\times10^3$ m/s for CoFe$_2$O$_4$ and terfenol-D, respectively. This gives a conservative estimate for $U_0$, since Rayleigh modes have phase velocities that are lower than the ones of bulk modes [14]. For example, in the case of CoFe$_2$O$_4$ in Ref. [36], wave velocities for Rayleigh-type surface waves in a piezoelectric-piezomagnetic layered half-space are found to be $v_s \approx 2840$ m/s $< v_{\mathrm{sh}}$.

TABLE IV. Estimates for zero-point fluctuations (mechanical amplitude $U_0$, strain $s_0$, electrical potential $\phi_0$, electric field $\xi_0$, and magnetic field $B_0$) close to the surface ($d \ll \lambda$) for typical piezoelectric and piezomagnetic (magnetostrictive) materials. All values must be multiplied by the universal scaling factor $1/\sqrt{A[\mu\mathrm{m}^2]}$; thus, they refer to an effective surface mode area of size $A = 1\ \mu\mathrm{m}^2$. Lower (upper) bounds for $\phi_0$ and $\xi_0$ comprise minimum (maximum) nonzero element of $e$ with maximum (minimum) nonzero element of $\epsilon$. We have set $k = 2\pi/\mu\mathrm{m}$. Details on cut directions and material parameters are given in the text.

APPENDIX E: SAW CAVITIES
In this Appendix, we present a detailed discussion of the theoretical model describing the SAW resonator.
Typically, a SAW cavity is based on an on-chip distributed Bragg reflector formed by a periodic array of either metal electrodes or grooves etched into the surface; see Fig. 1. In such a grating, each strip reflects only weakly, but, for many strips $N \gg 1$, the total reflection $|R|$ can approach unity if the pitch $p$ equals half the wavelength, $p = \lambda_c/2$. This Bragg condition defines the center frequency $f_c = v_s/2p$. At $f = f_c$, the total reflection coefficient is given by $|R| = \tanh(N|r_s|)$, where $N$ is the number of strips and $r_s$ is the reflection coefficient associated with a single strip [14,15]. The total reflection coefficient $|R|$ goes to unity in the limit $N|r_s| \gg 1$; see Fig. 8. Typically, $N \gtrsim 200$ and $|r_s| \approx (1\text{-}2)\%$ [14]. For $f \approx f_c$, $|r_s|$ increases with the normalized groove depth as $|r_s| = C_1(h/\lambda_c)\sin(\pi w/p) + C_2(h/\lambda_c)^2\cos(\pi w/p)$, with material-dependent prefactors [38]. For LiNbO$_3$, $C_1 = 0.67$ and $C_2 = 42$ [38]. As argued in Ref. [38], the first term $\sim C_1$ is due to an impedance mismatch, while the second one $\sim C_2$ is due to the stored energy effect. Because of the distributed nature of the mirror, strong reflection occurs over a fractional bandwidth only, given by $\delta f/f_c \approx 2|r_s|/\pi$. In practice, the cavity formed by two reflective gratings can be viewed as an acoustic Fabry-Perot resonator with effective reflection centers, sketched by localized mirrors in Fig. 1, situated at some effective penetration distance into the grating, given by $L_p = \tanh[(N-1)|r_s|]\,\lambda_c/(4|r_s|) \approx \lambda_c/4|r_s|$ [14,15,37]. Therefore, the total effective cavity size along the mirror axis is $L_c \approx D + 2L_p$, where $D$ is the physical gap between the gratings; compare Fig. 1. For $N \approx 100\text{-}300$, $h/\lambda_c \approx 2\%$, we then obtain $L_c \approx 38\lambda_c$ and $L_c \approx 42\lambda_c$ for $D = 0.75\lambda_c$ and $D = 5.25\lambda_c$, respectively. In analogy to an optical Fabry-Perot resonator, the mode spacing can then be estimated as $\Delta f/f_c = \lambda_c/2L_c \approx |r_s|$. Since this is larger than $\delta f/f_c$, SAW resonators can be designed to host a single resonance only [14].
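The grating estimates above can be reproduced with a short sketch. It evaluates the single-strip reflectivity for LiNbO$_3$, the penetration depth in the limit $N|r_s| \gg 1$, the effective cavity length, and the single-mode criterion. The groove-width ratio $w/p = 0.5$ and all function names are assumptions made for this illustration.

```python
import math

def strip_reflectivity(h_over_lambda, w_over_p, c1, c2):
    """|r_s| = C1 (h/lambda) sin(pi w/p) + C2 (h/lambda)^2 cos(pi w/p)."""
    x = math.pi * w_over_p
    return c1 * h_over_lambda * math.sin(x) + c2 * h_over_lambda ** 2 * math.cos(x)

def penetration_depth(r_s, n_strips=None):
    """Penetration depth L_p in units of lambda_c; large-N limit if n_strips is None."""
    depth = 1.0 / (4.0 * r_s)
    if n_strips is not None:
        depth *= math.tanh((n_strips - 1) * r_s)
    return depth

def cavity_length(gap, r_s, n_strips=None):
    """Effective cavity length L_c = D + 2 L_p in units of lambda_c."""
    return gap + 2.0 * penetration_depth(r_s, n_strips)

# LiNbO3 prefactors from the text; w/p = 0.5 is an assumed groove-width ratio.
r_s = strip_reflectivity(h_over_lambda=0.02, w_over_p=0.5, c1=0.67, c2=42.0)
l_c = cavity_length(gap=0.75, r_s=r_s)
mode_spacing = 1.0 / (2.0 * l_c)        # Delta f / f_c
stop_band = 2.0 * r_s / math.pi         # delta f / f_c
single_mode = mode_spacing > stop_band
```

For these inputs the single-strip reflectivity is about 1.3% and the effective cavity length is close to $38\lambda_c$ for $D = 0.75\lambda_c$, with the mode spacing exceeding the reflection bandwidth as stated above.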
The total decay rate of this resonance $\kappa$ can be decomposed into four relevant contributions [38], $\kappa = \kappa_{\mathrm{bk}} + \kappa_d + \kappa_m + \kappa_r$, which includes conversion into bulk modes $\sim\kappa_{\mathrm{bk}}$, diffraction losses $\sim\kappa_d$, internal losses due to material imperfections $\sim\kappa_m$, and leakage (radiation) losses due to imperfect mirrors $\sim\kappa_r$. The associated $Q$ factors are given by $Q_i = \omega_c/\kappa_i$. The desired decay rate is $\kappa_{\mathrm{gd}} = \kappa_r$, whereas the undesired one is $\kappa_{\mathrm{bd}} = \kappa_{\mathrm{bk}} + \kappa_m + \kappa_d$. Here, $\kappa_d$ is associated with diffraction losses due to spillover beyond the aperture of the reflector. It can be made negligible by lateral confinement using, for example, waveguide structures, focusing, or etching techniques [14,39,40]. $Q_m$ refers to losses due to interaction with thermal phonons, losses due to defects in the material, and propagation losses due to contamination [37,38]. These losses ultimately limit $Q$: Low-temperature experiments on quartz have demonstrated SAW resonators with $Q_m \times f[\mathrm{GHz}] > 10^5$ [22,23]. Another source of losses is due to mode conversion into bulk modes. Measurements show that $Q_{\mathrm{bk}} = 2\pi N_{\mathrm{eff}}/[C_b(h/\lambda_c)^2]$, with $N_{\mathrm{eff}} = L_c/\lambda_c$ and a material-dependent prefactor $C_b$ [85]; for LiNbO$_3$ (quartz), $C_b = 8.7\ (10)$ [85,86]. Typically, $\kappa_{\mathrm{bk}}$ is found to be negligible for small groove depths, $h/\lambda_c < 2\%$ [38]. Finally, $\kappa_r$ arises from leakage through imperfectly reflecting gratings ($|R| < 1$); in direct analogy to optical Fabry-Perot resonators, the associated $Q$ factor is given by $Q_r = 2\pi N_{\mathrm{eff}}/(1 - |R|^2)$. Assuming negligible diffraction losses (that can be minimized via waveguide-like confinement [14,43]) and cryostat temperatures, the total $Q$ factor is then given by $Q^{-1} = Q_m^{-1} + Q_r^{-1} + Q_{\mathrm{bk}}^{-1}$.
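Because the decay rates add, the individual quality factors combine like parallel resistances. The sketch below evaluates $Q_r$, $Q_{\mathrm{bk}}$, and the resulting total $Q$ for one illustrative parameter set. The grating reflectivity is taken from the standard relation $|R| = \tanh(N|r_s|)$, and the values of $N$, $|r_s|$, $N_{\mathrm{eff}}$, and $Q_m$ are assumptions chosen for illustration.

```python
import math

def q_radiation(n_eff, reflectivity):
    """Q_r = 2 pi N_eff / (1 - |R|^2): leakage through imperfect mirrors."""
    return 2.0 * math.pi * n_eff / (1.0 - reflectivity ** 2)

def q_bulk(n_eff, h_over_lambda, c_b):
    """Q_bk = 2 pi N_eff / [C_b (h/lambda_c)^2]: conversion into bulk modes."""
    return 2.0 * math.pi * n_eff / (c_b * h_over_lambda ** 2)

def q_total(*q_factors):
    """Total Q from independent loss channels: 1/Q = sum_i 1/Q_i."""
    return 1.0 / sum(1.0 / q for q in q_factors)

# Assumed example: N = 200 strips with |r_s| = 1.34%, N_eff = 38, LiNbO3 grooves.
reflectivity = math.tanh(200 * 0.0134)
q_r = q_radiation(n_eff=38.0, reflectivity=reflectivity)
q_bk = q_bulk(n_eff=38.0, h_over_lambda=0.02, c_b=8.7)
q_m = 1e5                                   # assumed material-limited Q
q = q_total(q_m, q_r, q_bk)
```

In this example the mirror leakage is the dominant loss channel, so the total $Q$ stays just below $Q_r$; reducing $N$ lowers $|R|$ and moves the cavity deeper into the $Q_r$ regime.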

APPENDIX F: CHARGE QUBIT COUPLED TO SAW CAVITY MODE
We consider a GaAs charge qubit embedded in a tunnel-coupled double quantum dot (DQD) containing a single electron. In the one-electron regime, the single-particle orbital level spacing is on the order of $\sim 1$ meV. Therefore, the system is well described by an effective two-level system: The state of the qubit is set by the position of the electron in the double-well potential, with the logical basis $|L\rangle, |R\rangle$ corresponding to the electron localized in the left (right) orbital. The Hamiltonian describing this system reads

$$ H_{ch} = \frac{\epsilon}{2}\sigma^z + t_c \sigma^x, $$

with the (orbital) Pauli operators defined as $\sigma^z = |L\rangle\langle L| - |R\rangle\langle R|$ and $\sigma^x = |L\rangle\langle R| + |R\rangle\langle L|$, respectively. In Eq. (5), $\epsilon$ refers to the level detuning between the dots, while $t_c$ gives the tunnel coupling. The level splitting between the eigenstates of $H_{ch}$ is given by $\Omega = \sqrt{\epsilon^2 + 4t_c^2}$, with a pure tunnel splitting of $\Omega = 2t_c$ at the charge degeneracy point ($\epsilon = 0$); typical parameter values are $t_c \sim \mu$eV and $\epsilon \sim \mu$eV, such that the level splitting $\Omega \sim$ GHz lies in the microwave regime. At the charge degeneracy point, where to first order the qubit is insensitive to charge fluctuations ($d\Omega/d\epsilon = 0$), the coherence time has been found to be $T_2 \approx 10$ ns [46].
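The level splitting and its first-order insensitivity to detuning noise at $\epsilon = 0$ can be checked directly. This is a minimal sketch: the tunnel coupling of 1 μeV is a hypothetical value inside the quoted "$\sim\mu$eV" range, and the function names are illustrative.

```python
import math

H_PLANCK_UEV_NS = 4.135667696  # Planck constant in ueV * ns

def level_splitting(eps, t_c):
    """Omega = sqrt(eps^2 + 4 t_c^2), all energies in ueV."""
    return math.sqrt(eps ** 2 + 4.0 * t_c ** 2)

def detuning_sensitivity(eps, t_c):
    """d(Omega)/d(eps) = eps / Omega; vanishes at the charge degeneracy point."""
    return eps / level_splitting(eps, t_c)

def to_gigahertz(energy_uev):
    """Convert an energy in ueV to an ordinary frequency f = E / h in GHz."""
    return energy_uev / H_PLANCK_UEV_NS

T_C = 1.0  # hypothetical tunnel coupling in ueV

for eps in (0.0, 1.0, 2.0):
    omega = level_splitting(eps, T_C)
    print(f"eps = {eps:.1f} ueV: Omega = {omega:.3f} ueV "
          f"= {to_gigahertz(omega):.3f} GHz, "
          f"dOmega/deps = {detuning_sensitivity(eps, T_C):.3f}")
```

The conversion uses $f = E/h$; the corresponding angular frequency is larger by a factor of $2\pi$.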
We now consider a charge qubit as described above inside a SAW resonator with a single resonance frequency $\omega_c$ close to the qubit's transition frequency, $\omega_c \approx \Omega$, that is, the regime of small detuning $\delta = \Omega - \omega_c \approx 0$; note that single-resonance SAW cavities can be realized routinely with today's standard techniques [14]. Within this single-mode approximation, the Hamiltonian describing the SAW cavity simply reads

$$ H_{cav} = \omega_c a^\dagger a, $$

where $a^\dagger$ ($a$) creates (annihilates) a phonon inside the cavity. The electrostatic potential associated with this mode is given by $\hat{\varphi}(x) = \phi(x)[a + a^\dagger]$, where the mode function $\phi(x)$ can be obtained from the corresponding mechanical mode function $w(x)$ via the relation $\epsilon \Delta\phi(x) = e_{kij}\partial_j\partial_k w_i(x)$; here, $\Delta$ is the Laplacian, $e_{kij}$ the piezoelectric tensor, and $\epsilon$ the permittivity of the material. The electron's charge $e$ couples to the phonon-induced electrical potential $\hat{\varphi}$. In second quantization, the piezoelectric interaction reads $H_{int} = e\int dx\, \hat{\varphi}(x)\hat{n}(x)$, where $\hat{n}(x) = \sum_\sigma \psi_\sigma^\dagger(x)\psi_\sigma(x)$ is the electron number density operator and $\psi_\sigma^\dagger(x)$ creates an electron with spin $\sigma$ at position $x$ [25]. Since $\hat{\varphi}(x)$ varies on a micron length scale, which is large compared to the spatial extension $\sim 40$ nm of the electron's wave function in a QD [65], the electron density is approximately given by a delta function at the center of the corresponding dots. For the DQD system under consideration, $H_{int}$ is then approximately given by $H_{int} = e\sum_i \hat{\varphi}(x_i)\hat{n}_i$; here, $x_i$ refers to the center of the electronic orbital wave function $\psi_i(x)$ of dot $i = L, R$. Note that this form of $H_{int}$ becomes exact if the overlap integral vanishes, that is, if $\int dx\, \phi(x)\psi_L^*(x)\psi_R(x) = 0$ is satisfied. As shown below, for a mode function $\phi(x)$ of sine form, this condition maximizes the piezoelectric coupling strength between the electronic DQD system and the phonon mode.
For the charge qubit system under consideration, coupling to the cavity mode is then described by $H_{int} = e(a + a^\dagger)[\phi(x_L)|L\rangle\langle L| + \phi(x_R)|R\rangle\langle R|]$; here, $x_i$ refers to the center of the electronic orbital wave function $\psi_i(x)$ of dot $i = L, R$ and the transverse direction $\hat{y}$ has been integrated out already. To obtain strong coupling between the qubit and the cavity, we assume a mode profile $\phi(x) = \varphi_0 \sin(kx)$, with a node tuned between the two dots, such that $\phi(x_L) = \varphi_0 \sin(kl/2) = -\phi(x_R)$; here, $l$ gives the distance between the two dots. Note that the single-phonon amplitude, defined as $\varphi_0 = \phi_0 F(kd)$, with $F(kd) \approx 0$ for $d \gg \lambda$, accounts for the decay of the SAW into the bulk. For $\lambda = (0.5$-$1)\ \mu$m and a 2DEG (where the DQD is embedded) situated a distance $d = 50$ nm below the surface, however, the single-phonon amplitude is reduced by a factor of $\sim 2$ only, $\varphi_0 \approx (0.45$-$0.52)\phi_0$; see Appendix B for details. Then, with $k = 2\pi/\lambda$ and $\sigma^z = |L\rangle\langle L| - |R\rangle\langle R|$, the coupling between qubit and cavity reads

$$ H_{int} = g_{ch}\,\sigma^z \otimes (a + a^\dagger), $$

where the single-phonon coupling strength is

$$ g_{ch} = e\phi_0 F(kd) \sin(\pi l/\lambda). $$

Here, we assume $\lambda \approx 2l$, such that the geometrical factor $\sin(\pi l/\lambda) \approx 1$ [47]. In principle, the coupling strength $g_{ch}$ could be further enhanced by additionally depositing a strongly piezoelectric material such as LiNbO$_3$ on the GaAs substrate. Moreover, comparison with standard literature shows that the piezoelectric electron-phonon coupling strength can be expressed as $g_{pe} = \sqrt{P}\,U_0 \approx e(e_{14}/\epsilon)U_0 = e\phi_0$, where $P = (e e_{14}/\epsilon)^2$ is a material parameter quantifying the piezoelectric coupling strength in zinc-blende structures [87,88]. Using $P = 5.4 \times 10^{-20}$ for GaAs, the single-phonon Rabi frequency can be estimated as $g_{pe} \approx 2.87\ \mu\mathrm{eV}/\sqrt{A[\mu\mathrm{m}^2]}$. This corroborates our estimate for $g_{ch}$.
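To get a feeling for the numbers, the area scaling of $g_{pe}$ and the geometrical factor $\sin(\pi l/\lambda)$ can be evaluated for a few illustrative cases. This is only a sketch: the cavity areas and the dot separation are hypothetical inputs, the bulk-decay factor $F(kd)$ is not included, and the function names are not from the text.

```python
import math

G_PE_PREFACTOR_UEV = 2.87  # ueV * um, from g_pe ~ 2.87 ueV / sqrt(A[um^2])

def piezo_coupling_uev(area_um2):
    """Single-phonon coupling estimate g_pe in ueV for a mode area A in um^2."""
    return G_PE_PREFACTOR_UEV / math.sqrt(area_um2)

def charge_qubit_geometry(l_um, wavelength_um):
    """Geometrical factor sin(pi l / lambda) for a node between the dots."""
    return math.sin(math.pi * l_um / wavelength_um)

for area in (1.0, 10.0, 100.0):  # hypothetical mode areas
    print(f"A = {area:6.1f} um^2 -> g_pe = {piezo_coupling_uev(area):.3f} ueV")

# lambda = 2 l maximizes the geometrical factor, as assumed in the text.
print(charge_qubit_geometry(0.25, 0.5))
```

Shrinking the mode area by a factor of 100 raises the coupling by a factor of 10, which is why small cavities are attractive despite their lower Q.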
In summary, the total system can be described by the Hamiltonian $H = H_{ch} + H_{cav} + H_{int}$. This corresponds to the generic Hamiltonian for a qubit-resonator system [32]. It is instructive to rewrite $H$ in the eigenbasis of $H_{ch}$, given by

where $g$ is the electron g factor and $\mu_B$ the Bohr magneton.
In the presence of a micromagnet or nanomagnet, the two local magnetic fields $B_i$ are inhomogeneous, $B_L \neq B_R$. We can then write

$$ B_i = B_0 + B_m(x_i), $$

where $B_0$ is the external homogeneous magnetic field, while $B_m(x_i)$ is the micromagnet slanting field at the location $x_i$ of dot $i$. In practice, $B_0$ is a few tesla, at least larger than the saturation field of the micromagnet, $B_0 \gtrsim 0.5$ T, while the magnetic gradient $\Delta B = \|B_m(x_R) - B_m(x_L)\|$ can reach $\Delta B \approx 100$ mT, corresponding to an electronic energy scale of $|g\mu_B \Delta B| \approx 2\ \mu$eV [89]. Field derivatives realized experimentally are $\partial B_{m,z}/\partial x \approx 1.5$ mT/nm. Alternatively, the magnetic gradient can be realized via the Overhauser field, as experimentally demonstrated, for example, in Ref. [67]. Note that the Fermi contact hyperfine interaction between electron and nuclear spins reads $H_{hf} = \sum_i h_i \cdot S_i$. Here, $h_i$ is the Overhauser field in QD $i = L, R$. When treating $h_i$ as a classical (random) variable, $H_{hf}$ is equivalent to $H_Z$, and thus one can absorb $h_i$ into the definition of the magnetic field $B_i$ in Eq. (G5); also see Ref. [89].
To facilitate the discussion, we introduce the magnetic sum field $B = (B_L + B_R)/2$ and the difference field $\Delta B = (B_R - B_L)/2$. While $B$ conserves the total spin, that is, $[B \cdot (S_L + S_R), (S_L + S_R)^2] = 0$, the gradient field $\Delta B$ does not. We set the quantization axis $\hat{z}$ along $B = B\hat{z}$. For sufficiently large magnetic field $B$, the electronic levels with $S^z_{tot} = S^z_L + S^z_R \neq 0$ are far detuned and can be neglected for the remainder of the discussion. Therefore, in the following, we restrict ourselves to the $S^z_{tot} = 0$ subspace. The components $\Delta B_{x,y}$ give rise to transitions out of the (logical) subspace $S^z_{tot} = 0$. Since these processes are assumed to be far off resonance, they are neglected, leaving us with the only relevant magnetic gradient $\Delta = g\mu_B \Delta B_z/2$; compare also Refs. [66,67]. For a schematic illustration, compare Fig. 9.
In summary, in the regime of interest $H_0$ simplifies to

The eigenstates of $H_0$ within the relevant $S^z_{tot} = S^z_L + S^z_R = 0$ subspace can be expressed as

with corresponding eigenenergies $\epsilon_l$ ($l = 0, 1, 2$). The spectrum is displayed in Fig. 10. For large negative detuning $-\epsilon \gg t_c$, the level $|2\rangle$ is far detuned, and the electronic subsystem can be simplified to an effective two-level system comprising the levels $\{|0\rangle, |1\rangle\}$, that is,

which can be identified with a "singlet-triplet-like" logical qubit subspace. Here, $\omega_0 = \epsilon_1 - \epsilon_0$ refers to the qubit's transition frequency. Note that the magnetic gradient causes efficient mixing between $|T_0\rangle$ and $|S_{11}\rangle$ for $\Delta \gtrsim |t_c^2/\epsilon|$. In the regime of interest, the dominant character of the qubit's levels is $|1\rangle \approx |{\Downarrow\Uparrow}\rangle$, $|0\rangle \approx |{\Uparrow\Downarrow}\rangle$ (or vice versa) [66] and the transition frequency is approximately $\omega_0 \approx 2\Delta$. For $\Delta \approx 1\ \mu$eV, the transition frequency $\omega_0 = \epsilon_1 - \epsilon_0 \approx 2\ \mu\mathrm{eV} \approx 3$ GHz matches typical SAW frequencies $\sim$GHz.

Coupling to SAW phonon mode
Along the lines of Appendix F, again we consider a SAW resonator with a single relevant confined phonon mode of frequency $\omega_c$ close to the qubit's transition frequency $\omega_0$. For a DQD in the two-electron regime, in the basis of Eq. (G6) coupling to the resonator mode can be written as [47]

where $g_0 = e\varphi_0 \eta_{geo}$. Here, $\eta_{geo}$ is a geometrical factor accounting for the DQD's position with respect to the mode function $\phi(x)$, defined according to $\varphi_0 \eta_{geo} = \phi(x_R) - \phi(x_L)$. For example, taking a standing-wave pattern along $\hat{x}$ as demonstrated experimentally in Ref. [90], together with a transverse mode function restricting the spread in the $\hat{y}$ direction [40,43], we obtain $\eta_{geo} = \sin(2\pi x_R/\lambda) - \sin(2\pi x_L/\lambda)$. It takes on its maximum value $\eta_{opt}$ when tuning a node of the standing wave at the center between the two dots, that is, $x_R = l/2$, $x_L = -l/2$; this gives $\eta_{opt} = 2\sin(\pi l/\lambda)$, where $l$ is the distance between the two dots [47]. As compared to the charge qubit described in Appendix F, there is an additional factor of 2, since here we consider a DQD in the two-electron regime, whereas the charge qubit consists of one electron only. For typical parameters ($l = 220$ nm, $\lambda \approx 1.4\ \mu$m) as used in Ref. [47], we get $\eta_{opt} \approx 0.95$, while $l = 220$ nm, $\lambda \approx 0.5\ \mu$m leads to the largest possible value of $\eta_{opt} \approx 2$.
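The quoted values of the geometrical factor follow directly from the standing-wave expression. The short sketch below evaluates $\eta_{geo}$ for the two parameter sets mentioned in the text; the function names are illustrative.

```python
import math

def eta_geo(x_left_um, x_right_um, wavelength_um):
    """eta_geo = sin(2 pi x_R / lambda) - sin(2 pi x_L / lambda)."""
    k = 2.0 * math.pi / wavelength_um
    return math.sin(k * x_right_um) - math.sin(k * x_left_um)

def eta_opt(l_um, wavelength_um):
    """Node centered between the dots: eta_opt = 2 sin(pi l / lambda)."""
    return 2.0 * math.sin(math.pi * l_um / wavelength_um)

L_DOTS_UM = 0.22  # dot separation l = 220 nm, as quoted in the text

for wavelength in (1.4, 0.5):
    centered = eta_geo(-L_DOTS_UM / 2, L_DOTS_UM / 2, wavelength)
    print(f"lambda = {wavelength} um: eta_opt = {eta_opt(L_DOTS_UM, wavelength):.3f} "
          f"(direct evaluation: {centered:.3f})")
```

Shifting the node away from the midpoint between the dots (for instance calling `eta_geo(0.0, L_DOTS_UM, 1.4)`) reduces the factor, which is why the node position matters.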
In summary, within the effective electronic two-level subspace $\{|0\rangle, |1\rangle\}$, the system is described by the Hamiltonian

where $\hat{V}_{pe} = g_0[a + a^\dagger]$. Applying a unitary transformation to a frame rotating at the cavity frequency $\omega_c$, performing a RWA, and dropping a global energy shift $\bar{\epsilon} = (\epsilon_0 + \epsilon_1)/2$, we arrive at the effective (time-independent) Hamiltonian of Jaynes-Cummings form,

$$ H_{DQD} = \bar{\delta} S^z + g_{QD}\left(S^+ a + S^- a^\dagger\right), $$

where we introduce the spin operators $S^+ = |1\rangle\langle 0|$ and $S^z = (|1\rangle\langle 1| - |0\rangle\langle 0|)/2$. Moreover, $\bar{\delta} = \omega_0 - \omega_c$ is the detuning between the qubit's transition frequency $\omega_0$ and the cavity frequency $\omega_c$, and the effective single-phonon Rabi frequency is defined as

$$ g_{QD} = \kappa_0 \kappa_1 g_0 = \kappa_0 \kappa_1 \eta_{geo}\, e\varphi_0. $$

The coupling between the qubit and the cavity mode is mediated by the piezoelectric potential; therefore, it is proportional to the electron's charge $e$ and the single-phonon electric potential $\phi_0$. Because of the prolonged decoherence time scales, here we consider an effective (singlet-triplet-like) spin qubit rather than a charge qubit, such that the coupling $g_{QD}$ is reduced by the (small) admixtures with the localized singlet, $\kappa_l = \langle S_{02}|l\rangle$.

FIG. 9. Illustration of the relevant electronic levels under consideration. The triplet levels with $S^z_{tot} \neq 0$ can be tuned off resonance by applying a sufficiently large homogeneous magnetic field.

Cooperativity
In this context, an important figure of merit is the single-spin cooperativity [61], $C = g_{QD}^2 T_2/[\kappa(\bar{n}_{th} + 1)]$, where $\kappa = \omega_c/Q$ is the mechanical damping rate and $\bar{n}_{th} = 1/(e^{\hbar\omega_c/k_B T} - 1)$ is the equilibrium phonon occupation number at temperature $T$; here, since $\hbar\omega_c \gg k_B T$ for cryostatic temperatures, $\bar{n}_{th} \approx 0$. For singlet-triplet qubits in lateral QDs, $T_2^\star \approx 100$ ns [64]; using spin-echo techniques, experimentally this has even been extended to $T_2 = 276\ \mu$s. Even in the absence of spin-echo pulses, with a far-from-optimistic dephasing time $T_2^\star \approx 100$ ns [21], for a moderately small cavity size $A \approx 100\ \mu\mathrm{m}^2$, a quality factor of $Q = 900$ is sufficient to reach the cooperativity estimated in Table V. Note that $C > 1$ allows us to perform a quantum gate between two spins mediated by a thermal mechanical mode [9].
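The cooperativity formula is straightforward to evaluate once a parameter set is fixed. The sketch below is illustrative only and does not reproduce the entries of Table V: the coupling rate is borrowed from the value $g_{QD} \approx 6$ MHz quoted for the numerical simulations in this appendix, the cavity frequency $\omega_c/2\pi \approx 1.5$ GHz from the Table V caption, and the temperature of 20 mK is an assumption.

```python
import math

HBAR = 1.054571817e-34  # J s
K_B = 1.380649e-23      # J / K

def thermal_occupation(omega_c, temperature):
    """Bose-Einstein occupation n_th = 1 / (exp(hbar w / k_B T) - 1)."""
    return 1.0 / math.expm1(HBAR * omega_c / (K_B * temperature))

def cooperativity(g, t2, q_factor, omega_c, temperature):
    """C = g^2 T2 / [kappa (n_th + 1)] with kappa = omega_c / Q."""
    kappa = omega_c / q_factor
    return g ** 2 * t2 / (kappa * (thermal_occupation(omega_c, temperature) + 1.0))

G_QD = 6.0e6                    # rad/s, illustrative (value quoted for the simulations)
T2_STAR = 100e-9                # s, dephasing time without echo
Q_FACTOR = 900.0
OMEGA_C = 2.0 * math.pi * 1.5e9 # rad/s
TEMPERATURE = 0.02              # K, assumed cryostat temperature

n_th = thermal_occupation(OMEGA_C, TEMPERATURE)
c_single = cooperativity(G_QD, T2_STAR, Q_FACTOR, OMEGA_C, TEMPERATURE)
print(f"n_th = {n_th:.3f}, C = {c_single:.3f}")
```

Because $C$ is linear in both $T_2$ and $Q$ and quadratic in $g$, extending the dephasing time with echo techniques or increasing the coupling moves this estimate above unity much faster than improving the mirrors alone.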

Discussion of approximations
To arrive at the effective Hamiltonian given in Eq. (G11), we make two essential approximations: (i) first, we neglect the electronic level $|2\rangle$, yielding an effective two-level system, and (ii) second, we apply a RWA leading to a major simplification of $H_{DQD}$; see Eq. (G11) as compared to Eq. (G10). In order to corroborate these approximations, we now compare exact numerical simulations of the full system, where none of the approximations have been applied, to the simplified, approximate description described above. While the dynamics of the former is described by the master equation

$$ \dot{\rho} = -i[H_0 + H_{cav} + H_{int}, \rho] + \kappa \mathcal{D}[a]\rho, $$

with $H_0$, $H_{cav}$, and $H_{int}$ given in Eqs. (G6), (F2), and (G9), respectively, the latter is described by a similar master equation

$$ \dot{\rho} = -i[H_{DQD}, \rho] + \kappa \mathcal{D}[a]\rho, $$

with $\rho$ referring to the density matrix of the combined system comprising the DQD and the cavity mode. Here, we also account for decay of cavity phonons out of the resonator with a rate $\kappa$, described by the Lindblad term $\mathcal{D}[a]\rho = a\rho a^\dagger - \frac{1}{2}\{a^\dagger a, \rho\}$. As a figure of merit to validate approximation (i) we determine the population of the level $|2\rangle$, that is, $\mathrm{Tr}[\rho|2\rangle\langle 2|]$, describing the undesired leakage out of the logical subspace; ideally, this should be zero. Note that leakage into the triplet levels $|T_\pm\rangle$ could be accounted for along the same lines, but they can be tuned far off resonance by another, independent experimental knob, the external homogeneous magnetic field. The results are summarized in Fig. 12: We find very good agreement between the exact and the approximate model, with a negligibly small error $\mathrm{Tr}[\rho|2\rangle\langle 2|] \sim O(10^{-5})$. This justifies the approximations made above and shows that (in the regime of interest) the system can simply be described by Eq. (G15).

FIG. 11. The product $\kappa_0\kappa_1$ directly affects the effective single-phonon Rabi frequency $g_{QD}/g_0 = \kappa_0\kappa_1$ [69], while the difference $|\kappa_1^2 - \kappa_0^2|$ determines the robustness of the qubit against charge noise. Here, $t_c = 5\Delta$.

TABLE V. Estimates of the single-spin cooperativity $C$ for a DQD singlet-triplet qubit with $T_2^\star \approx 100$ ns, in a SAW cavity at gigahertz frequencies $\omega_c/2\pi \approx 1.5$ GHz and cryostat temperatures where $\bar{n}_{th} \approx 0$, for both a small, low-Q and a large, high-Q SAW resonator. The coupling strength $g_{QD}$ could be further increased by additionally depositing a strongly piezoelectric material such as LiNbO$_3$ on the GaAs substrate, and spin-echo (and/or narrowing) techniques allow for dephasing times extended by up to 3 orders of magnitude [21,64].

Noise sources for the DQD-based system

a. Charge noise

In a DQD device, background charge fluctuations and noise in the gate voltages may cause undesired dephasing processes. In a recent experimental study [91], voltage fluctuations in the interdot detuning parameter $\epsilon$ have been identified as the main source of charge noise in a singlet-triplet qubit. Charge noise can be treated by introducing a Gaussian distribution in $\epsilon$, with a variance $\sigma_\epsilon$; typically, $\sigma_\epsilon \approx (1$-$3)\ \mu$eV [89]. The qubit's transition frequency $\omega_0$, however, turns out to be rather insensitive to fluctuations in $\epsilon$, with a (tunable) sensitivity of approximately $\partial\omega_0/\partial\epsilon \lesssim 10^{-2}$; see Fig. 13. In agreement with experimental results presented in Ref. [91], we find $\partial\omega_0/\partial\epsilon \sim \omega_0$, indicating $\omega_0$ to be an exponential function of $\epsilon$. At very negative detuning $\epsilon$, dephasing due to charge noise is practically absent, and $T_2^\star$ will be limited by nuclear noise [91]. Fluctuations in the tunneling amplitude $t_c$ can be treated along the same lines: we find $\omega_0$ to be similarly insensitive to noise in $t_c$, $\partial\omega_0/\partial t_c \approx 10^{-2}$.

b. Nuclear noise: Spin echo
The electronic qubit introduced above has been defined for a fixed set of parameters $(t_c, \epsilon, \Delta)$; compare Eq. (G6). Now, let us consider the effect of deviations from these fixed parameters, $H_0 \to H_0 + \delta H$, where $\delta H$ can be decomposed as $\delta H = \delta H_{el} + \delta H_{nuc}$, with

where $\delta t_c$ and $\delta\epsilon$ can be tuned electrostatically and basically in situ. In most practical situations this does not hold for $\delta\Delta$: The primary source of decoherence in this system has been found to come from (slow) fluctuations in the Overhauser field generated by the nuclear spins [21,66,67]. In our model, this can be directly identified with a random, slowly time-dependent parameter $\delta\Delta = \delta\Delta(t)$.

FIG. 12. ... (c) the error $\mathrm{Tr}[\rho|2\rangle\langle 2|]$ quantifying the leakage to $|2\rangle$. The latter is found to be negligibly small, $\sim 10^{-5}$. We set $\bar{\delta} = \omega_0 - \omega_c = 0$. Numerical parameters are $t_c = 10\ \mu$eV, $\epsilon = -7\ \mu$eV, $\Delta = 1\ \mu$eV, $\eta_{geo} e\varphi_0 = 5.2 \times 10^{-2}\ \mu$eV, such that $g_{QD} = 4 \times 10^{-3}\ \mu\mathrm{eV} \approx 6$ MHz. The cavity decay rate is $\kappa = g_{QD}/2$, corresponding to $Q \approx 10^3$.
In the following, assuming nominal resonance $\bar{\delta} = 0$, we detail a sequence of Hahn-echo pulses that cancels the undesired noise term and restores the pure, resonant Jaynes-Cummings dynamics: We consider four short time intervals of length $\tau$, for which the unitary evolution is approximately given by $U_i \approx 1 - iH_i\tau$, interspersed by three $\pi$ pulses. First, we let the system evolve with $H_1 = \delta S^z + g_{QD}[S^+ a + S^- a^\dagger]$, where $\delta$ denotes the residual, noise-induced detuning. Then we apply a $\pi$ pulse along $\hat{x}$ ($S^z \to -S^z$, $S^x \to S^x$, $S^y \to -S^y$) such that $S^\pm \to S^\mp$, and the system evolves in the second time interval with $H_2 = -\delta S^z + g_{QD}[S^- a + S^+ a^\dagger]$. Next, we apply a $\pi$ pulse along $\hat{z}$ ($S^z \to S^z$, $S^x \to -S^x$, $S^y \to -S^y$) such that $S^\pm \to -S^\pm$, leading to $H_3 = -\delta S^z - g_{QD}[S^- a + S^+ a^\dagger]$. Finally, a $\pi$ pulse along $\hat{y}$ ($S^z \to -S^z$, $S^x \to -S^x$, $S^y \to S^y$) is applied such that $S^\pm \to -S^\mp$, giving $H_4 = \delta S^z + g_{QD}[S^+ a + S^- a^\dagger]$. In summary, to first order in $\tau$ the system evolves over a time interval of $4\tau$ according to

$$ U_4 U_3 U_2 U_1 \approx 1 - i\left(H_1 + H_2 + H_3 + H_4\right)\tau = 1 - i\,\frac{g_{QD}}{2}\left[S^+ a + S^- a^\dagger\right] 4\tau. $$

Thus, in order to cancel the noise term, the effective single-phonon coupling strength is lowered by only a factor of $1/2$.
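The cancellation can be verified by explicitly summing the four toggled Hamiltonians. The following sketch uses plain Python lists as matrices and truncates the phonon mode to the two lowest Fock states, which is an illustrative simplification; the numerical values of $\delta$ and $g_{QD}$ are arbitrary.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(*mats):
    """Element-wise sum of equally sized matrices."""
    return [[sum(m[i][j] for m in mats) for j in range(len(mats[0][0]))]
            for i in range(len(mats[0]))]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
E01 = [[0.0, 1.0], [0.0, 0.0]]  # S^+ in the basis (|1>, |0>); a in the basis (|0 ph>, |1 ph>)
E10 = [[0.0, 0.0], [1.0, 0.0]]  # S^- and a^dagger in the same bases
SZ = kron([[0.5, 0.0], [0.0, -0.5]], IDENTITY)

JC = add(kron(E01, E01), kron(E10, E10))       # S^+ a + S^- a^dagger
FLIPPED = add(kron(E10, E01), kron(E01, E10))  # S^- a + S^+ a^dagger

def echo_average(delta, g):
    """Average of H_1..H_4 over the four intervals of the echo sequence."""
    h1 = add(scale(delta, SZ), scale(g, JC))
    h2 = add(scale(-delta, SZ), scale(g, FLIPPED))
    h3 = add(scale(-delta, SZ), scale(-g, FLIPPED))
    h4 = add(scale(delta, SZ), scale(g, JC))
    return scale(0.25, add(h1, h2, h3, h4))

average = echo_average(delta=0.7, g=1.3)   # arbitrary test values
target = scale(1.3 / 2.0, JC)              # (g/2)(S^+ a + S^- a^dagger)
print(max_abs_diff(average, target))
```

The check only confirms the first-order (average-Hamiltonian) statement; higher-order terms in $\tau$ are not captured by this sum.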

Different spin-resonator coupling
In Appendix G, we show how to realize the prototypical Jaynes-Cummings Hamiltonian for SAW phonons interacting with a DQD; see Eq. (G11). Alternatively, if one does not absorb the gradient $\Delta$ into the definition of the qubit basis, one can identify the logical subspace with the electronic states $|T_0\rangle$ and $|S\rangle$, where $|S\rangle$ is one of the two hybridized singlets (while the other one, $|S'\rangle$, is far detuned and neglected) [21,66,67,91]. Here, the electronic Hamiltonian reads $H_0 = -J(\epsilon)|S\rangle\langle S| - \tilde{\Delta}(|T_0\rangle\langle S| + \mathrm{H.c.})$, where $\tilde{\Delta} = \langle S_{11}|S\rangle\Delta$ and $J(\epsilon)$ describes the exchange interaction. In this regime, the spin-resonator interaction takes on a form that is well known from other (localized) implementations of mechanical resonators [9], namely,

$$ H_{int} = g_{QD}\,|S\rangle\langle S| \otimes (a + a^\dagger), $$

which can be viewed as a phonon-state-dependent force, leading to a shift of the qubit's transition frequency depending on the position $\hat{x} = (a + a^\dagger)/\sqrt{2}$. Here, the single-phonon Rabi frequency is $g_{QD} = \kappa_S^2 \eta_{geo}\, e\phi_0 F(kd)$, with $\kappa_S = \langle S_{02}|S\rangle$. Based on the coupling of Eq. (G21), one can envisage a variety of experiments known from quantum optics: For example, in the limit of vanishing gradient $\Delta = 0$, the $\hat{x}$ quadrature of the phonon mode could serve as a quantum nondemolition variable, as it is an integral of motion of the coupled system of the phonon mode and electronic meter.

APPENDIX H: GENERALIZED DEFINITION OF THE COOPERATIVITY PARAMETER
Here, we provide a generalized discussion of the cooperativity parameter $C$, which, in particular, accounts for losses of the cavity mode other than leakage through the nonperfect mirrors. Furthermore, we derive a simple, analytical estimate for the state transfer fidelity $F$ in terms of the parameter $C$ and undesired phonon losses with a rate $\sim\kappa_{bd}$.
We consider a single qubit $\{|0\rangle, |1\rangle\}$ coupled to a cavity mode. The system is described by the master equation

$$ \dot{\varrho} = \kappa \mathcal{D}[a]\varrho - i[H_{JC}, \varrho] + \Gamma_{deph}\mathcal{D}[|1\rangle\langle 1|]\varrho, $$

where $\mathcal{D}[a]\varrho = a\varrho a^\dagger - \frac{1}{2}\{a^\dagger a, \varrho\}$. The first term describes decay of the cavity mode. The corresponding decay rate can be decomposed into desired (leakage through the mirrors) and undesired (bulk-mode conversion, material imperfections, etc.) contributions, labeled as $\kappa_{gd}$ and $\kappa_{bd}$, respectively. Thus, we write $\kappa = \kappa_{gd} + \kappa_{bd}$. The second term with (on-resonance) $H_{JC} = g(S^+ a + S^- a^\dagger)$ refers to the coherent interaction between qubit and cavity mode, while the last term describes pure dephasing of the qubit with a rate $\Gamma_{deph}$.
In the bad cavity limit (where $\kappa \gg g, \Gamma_{deph}$), one can adiabatically eliminate the cavity mode by projecting the system onto the cavity vacuum, $\mathcal{P}\varrho = \mathrm{Tr}_{cav}[\varrho] \otimes \rho^{ss}_{cav} = \rho \otimes |\mathrm{vac}\rangle\langle\mathrm{vac}|$. Standard techniques (perturbation theory up to second order in $V$, compare Ref. [92]) then yield the effective master equation for the qubit's density matrix $\rho = \mathrm{Tr}_{cav}[\mathcal{P}\varrho]$ only,

$$ \dot{\rho} = \tilde{\kappa}\mathcal{D}[S^-]\rho + \Gamma_{deph}\mathcal{D}[|1\rangle\langle 1|]\rho, $$

with the effective decay rate $\tilde{\kappa} = 4g^2/\kappa$. For comparison, the same procedure in standard cavity QED, where $\kappa = \kappa_{gd}$ and $\Gamma_{deph}\mathcal{D}[|1\rangle\langle 1|]\rho \to \gamma\mathcal{D}[S^-]\rho$, yields the effective master equation for the atom only, $\dot{\rho} = \tilde{\kappa}\mathcal{D}[S^-]\rho + \gamma\mathcal{D}[S^-]\rho$. Therefore, the atom decays with an effective spontaneous emission rate $\gamma_{tot}$ enhanced by the Purcell factor, $\gamma_{tot} = \gamma + \tilde{\kappa} = (1 + 4g^2/\kappa\gamma)\gamma$. Comparing good ($\sim\tilde{\kappa}$) to bad ($\sim\gamma$) decay channels, here one defines the cooperativity parameter in a straightforward way as $C_{atom} = g^2/\kappa\gamma$. This is readily read as the cavity-to-free-space scattering ratio, since the effective rate at which an excited atom emits an excitation into the cavity is given by $\tilde{\kappa} \sim g^2/\kappa$. For $C_{atom} > 1$, the atom is then more likely to decay into the cavity mode rather than into another mode outside the cavity. In cavity QED, large cooperativity $C_{atom} \gg 1$ has allowed for a number of key experimental demonstrations, such as an enhancement of spontaneous emission [93], photon blockade [94], and vacuum-induced transparency [95].
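The effective decay rate $\tilde{\kappa} = 4g^2/\kappa$ can be checked against a direct integration of the coupled qubit-cavity dynamics. The sketch below is a simplified check, not the full master equation: it sets $\Gamma_{deph} = 0$ and integrates the no-jump amplitude equations in the single-excitation subspace, which reproduce the excited-state population because a phonon that has leaked out never returns. The ratio $\kappa/g = 50$ and the integration time are arbitrary illustrative choices.

```python
import math

def effective_rate(g, kappa):
    """Bad-cavity (Purcell-type) decay rate 4 g^2 / kappa."""
    return 4.0 * g ** 2 / kappa

def deriv(c_e, c_c, g, kappa):
    """Amplitude equations for |1, vac> (c_e) and |0, one phonon> (c_c)."""
    return (-1j * g * c_c, -1j * g * c_e - 0.5 * kappa * c_c)

def excited_population(g, kappa, t_final, dt=1e-3):
    """Fourth-order Runge-Kutta integration starting from the excited qubit."""
    c_e, c_c = 1.0 + 0j, 0j
    for _ in range(int(round(t_final / dt))):
        k1 = deriv(c_e, c_c, g, kappa)
        k2 = deriv(c_e + 0.5 * dt * k1[0], c_c + 0.5 * dt * k1[1], g, kappa)
        k3 = deriv(c_e + 0.5 * dt * k2[0], c_c + 0.5 * dt * k2[1], g, kappa)
        k4 = deriv(c_e + dt * k3[0], c_c + dt * k3[1], g, kappa)
        c_e += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        c_c += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return abs(c_e) ** 2

G, KAPPA, T_FINAL = 1.0, 50.0, 10.0   # units of g; kappa >> g (bad cavity)
numeric = excited_population(G, KAPPA, T_FINAL)
adiabatic = math.exp(-effective_rate(G, KAPPA) * T_FINAL)
print(f"numerical: {numeric:.4f}, adiabatic elimination: {adiabatic:.4f}")
```

Reducing $\kappa/g$ toward unity makes the two numbers drift apart, which marks the breakdown of the adiabatic elimination.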
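Here, we aim for a theoretical description that singles out the desired trajectories, where phonon emission through the mirrors happens first, from all others. To do so, we rewrite Eq. (H2) as

$$ \dot{\rho} = -i\left(\mathcal{H}\rho - \rho\mathcal{H}^\dagger\right) + \mathcal{J}_{gd}\rho + \mathcal{J}_{bd}\rho, \qquad \mathrm{(H4)} $$

where we define an effective (non-Hermitian) Hamiltonian $\mathcal{H}$ and jump operators according to

$$ \mathcal{H} = -\frac{i}{2}\gamma_{eff}|1\rangle\langle 1|, \qquad \gamma_{eff} = \tilde{\kappa} + \Gamma_{deph}, $$

$$ \mathcal{J}_{gd}\rho = \tilde{\kappa}_{gd} S^- \rho S^+, \qquad \mathrm{(H6)} $$

$$ \mathcal{J}_{bd}\rho = \tilde{\kappa}_{bd} S^- \rho S^+ + \Gamma_{deph}|1\rangle\langle 1|\rho|1\rangle\langle 1|. $$

Here, we decompose the effective decay rate $\tilde{\kappa}$ as

$$ \tilde{\kappa} = \tilde{\kappa}_{gd} + \tilde{\kappa}_{bd}, \qquad \tilde{\kappa}_{gd} = \frac{\kappa_{gd}}{\kappa}\tilde{\kappa}, \qquad \tilde{\kappa}_{bd} = \frac{\kappa_{bd}}{\kappa}\tilde{\kappa}. $$

Formally solving Eq. (H4) gives

$$ \rho(t) = \mathcal{U}(t)\rho(0) + \int_0^t d\tau\, \mathcal{U}(t - \tau)\mathcal{J}\rho(\tau), \qquad \mathrm{(H9)} $$

with the total jump operator $\mathcal{J}\rho = \mathcal{J}_{gd}\rho + \mathcal{J}_{bd}\rho$ and

$$ \mathcal{U}(t)\rho = e^{-i\mathcal{H}t}\rho\, e^{i\mathcal{H}^\dagger t}. $$

The exact solution given in Eq. (H9) can be iterated, giving an illustrative expansion in terms of the jumps $\mathcal{J}$. It reads

$$ \rho(t) = \mathcal{U}(t)\rho(0) + \int_0^t d\tau_1\, \mathcal{U}(t - \tau_1)\mathcal{J}\mathcal{U}(\tau_1)\rho(0) + \int_0^t d\tau_2 \int_0^{\tau_2} d\tau_1\, \mathcal{U}(t - \tau_2)\mathcal{J}\mathcal{U}(\tau_2 - \tau_1)\mathcal{J}\mathcal{U}(\tau_1)\rho(0) + \dots $$

Here, the $n$th-order term comprises $n$ jumps $\mathcal{J}$ with free evolution $\mathcal{U}$ between the jumps. Now, we can single out the desired events where the first quantum jump is governed by $\mathcal{J}_{gd}$. This leads to the definition

$$ \rho(t) = \mathcal{U}(t)\rho(0) + \rho_{gd}(t) + \rho_{bd}(t), $$

where $\rho_{gd}(t)$ subsumes all desired trajectories:

$$ \rho_{gd}(t) = \int_0^t d\tau_1\, \mathcal{U}(t - \tau_1)\mathcal{J}_{gd}\mathcal{U}(\tau_1)\rho(0) + \int_0^t d\tau_2 \int_0^{\tau_2} d\tau_1\, \mathcal{U}(t - \tau_2)\mathcal{J}\mathcal{U}(\tau_2 - \tau_1)\mathcal{J}_{gd}\mathcal{U}(\tau_1)\rho(0) + \dots $$

We focus on a qubit initially in the excited state, i.e., $\rho(0) = |1\rangle\langle 1|$. Using the relations $\mathcal{U}(\tau_1)\rho(0) = e^{-\gamma_{eff}\tau_1}|1\rangle\langle 1|$, $\mathcal{J}_{gd}\mathcal{U}(\tau_1)\rho(0) = \tilde{\kappa}_{gd} e^{-\gamma_{eff}\tau_1}|0\rangle\langle 0|$, and

$$ \mathcal{J}_{gd}\mathcal{U}(\tau_2 - \tau_1)\mathcal{J}_{gd}\mathcal{U}(\tau_1)\rho(0) = 0, \qquad \mathrm{(H13)} $$

$$ \mathcal{J}_{bd}\mathcal{U}(\tau_2 - \tau_1)\mathcal{J}_{gd}\mathcal{U}(\tau_1)\rho(0) = 0, \qquad \mathrm{(H14)} $$

the qubit's density matrix evaluates (to all orders in $\mathcal{J}$) to

$$ \rho(t) = e^{-\gamma_{eff}t}|1\rangle\langle 1| + \rho_{gd}(t) + \rho_{bd}(t), \qquad \mathrm{(H15)} $$

$$ \rho_{gd}(t) = \frac{\tilde{\kappa}_{gd}}{\gamma_{eff}}\left(1 - e^{-\gamma_{eff}t}\right)|0\rangle\langle 0|. $$

In the long-time limit $t \to \infty$, the system reaches the steady state, $\rho(t \to \infty) = \rho_{gd} + \rho_{bd}$, where $\rho_{gd} = (\tilde{\kappa}_{gd}/\gamma_{eff})|0\rangle\langle 0|$.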
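The associated success probability $p_{suc} = \mathrm{Tr}[\rho_{gd}]$ for faithful decay through the mirrors is then

$$ p_{suc} = \frac{\tilde{\kappa}_{gd}}{\tilde{\kappa}_{gd} + \tilde{\kappa}_{bd} + \Gamma_{deph}}, \qquad \mathrm{(H17)} $$

which is a simple branching ratio comparing the strength of the desired decay channel $\sim\tilde{\kappa}_{gd}$ to the undesired ones $\sim\tilde{\kappa}_{bd} + \Gamma_{deph}$. In the limit where $\kappa_{bd} = 0$, i.e., $\kappa = \kappa_{gd}$, the expression for $p_{suc}$ simplifies to

$$ p_{suc} = \frac{1}{1 + 1/(4C)}, $$

with the usual definition found in the literature, $C = g^2/(\kappa\Gamma_{deph}) = g^2 T_2/\kappa$; here, $\kappa = \omega_c/Q_{eff} = (\bar{n}_{th} + 1)\omega_c/Q$ is understood to account for thermal occupation of the environment $\bar{n}_{th}$ in terms of a decreased mechanical quality factor (compare, for example, Refs. [10,61]). It is instructive to rewrite the general expression for $p_{suc}$ given in Eq. (H17) as

$$ p_{suc} = \frac{1}{1 + \varepsilon}\,\frac{1}{1 + 1/(4C)}, \qquad \mathrm{(H19)} $$

with $\varepsilon = \kappa_{bd}/\kappa_{gd}$. Based on this definition, it is evident that two conditions need to be satisfied in order to reach $p_{suc} \to 1$ in the regime where $\kappa_{bd} > 0$: (i) a low undesired loss rate, $\varepsilon = \kappa_{bd}/\kappa_{gd} \ll 1$, and (ii) high cooperativity, $C \equiv g^2/(\kappa\Gamma_{deph}) = g^2 T_2/\kappa \gg 1$, with $\kappa = \kappa_{gd} + \kappa_{bd}$.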
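The branching ratio can be evaluated numerically to see how the two error sources add up. The sketch below assumes the reading in which the effective rate $\tilde{\kappa} = 4g^2/\kappa$ is split in proportion to $\kappa_{gd}$ and $\kappa_{bd}$, so that $p_{suc} = [(1 + \varepsilon)(1 + 1/(4C))]^{-1}$; the values $C = 10$ and $\varepsilon = 5\%$ are illustrative, and the squared probability is shown only as the chance that two such independent steps both succeed.

```python
def success_probability(cooperativity, epsilon):
    """p_suc = 1 / [(1 + eps)(1 + 1/(4C))], with eps = kappa_bd / kappa_gd."""
    return 1.0 / ((1.0 + epsilon) * (1.0 + 1.0 / (4.0 * cooperativity)))

def success_probability_linear(cooperativity, epsilon):
    """First-order expansion for small errors: 1 - eps - 1/(4C)."""
    return 1.0 - epsilon - 1.0 / (4.0 * cooperativity)

C_EXAMPLE = 10.0     # illustrative cooperativity
EPS_EXAMPLE = 0.05   # illustrative loss ratio kappa_bd / kappa_gd

p_exact = success_probability(C_EXAMPLE, EPS_EXAMPLE)
p_linear = success_probability_linear(C_EXAMPLE, EPS_EXAMPLE)
two_steps = p_exact ** 2                                          # two independent steps
two_steps_linear = 1.0 - 2.0 * EPS_EXAMPLE - 1.0 / (2.0 * C_EXAMPLE)

print(f"p_suc = {p_exact:.4f} (linearized: {p_linear:.4f})")
print(f"p_suc^2 = {two_steps:.4f} (linearized: {two_steps_linear:.4f})")
```

In this parameter range the linearized expressions underestimate the exact values, so they act as slightly pessimistic estimates.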
For an illustration, compare Fig. 14. This shows that the usual definition and interpretation of the cooperativity $C$ holds, provided that $\varepsilon \ll 1$ is fulfilled. In order to quantify the cooperativity $C$ for SAW cavity modes in both the $\sim$MHz ($\bar{n}_{th} \gg 1$) and the $\sim$GHz ($\bar{n}_{th} \ll 1$) regime, in the main text we take a (conservative) estimate as $C \equiv g^2 T_2 Q/[\omega_c(\bar{n}_{th} + 1)]$. For artificial atoms (quantum dots, superconducting qubits, NV centers, etc.) with resonant frequencies $\sim$GHz, at cryostatic temperatures this definition reduces to $C \approx g^2 T_2/\kappa$, as discussed above, whereas for a trapped ion with $\omega_t/2\pi \approx \omega_c/2\pi \sim$ MHz, it correctly gives $C \approx g^2 T_2 Q/(\omega_c \bar{n}_{th})$ with a decreased effective quality factor $Q_{eff} = Q/\bar{n}_{th}$ [10,96,97]. For small errors, the expression given in Eq. (H19) can be approximated as $p_{suc} \approx 1 - \varepsilon - 1/(4C)$. Since the absorption process is just the time-reversed copy of the emission process in the state transfer protocol for two nodes, we can estimate the state transfer fidelity as $F \geq p_{suc} \times p_{suc}$. For small infidelities, we then find

$$ F \gtrsim 1 - 2\varepsilon - \frac{1}{2C}, $$

where the individual errors arise from intrinsic phonon losses $\sim\varepsilon$ and qubit dephasing $\sim C^{-1} \sim T_2^{-1}$, respectively. This simple analytical estimate agrees well with numerical results presented in Ref. [62], where (except for noise sources that are irrelevant for our problem) $F \approx 1 - (2/3)\varepsilon/(1 + \varepsilon) - \mathcal{C}C^{-1}$ with a numerical coefficient $\mathcal{C} = O(1)$ depending on the specific pulse sequence. Since $\varepsilon \ll 1$, this relation can be simplified to $F \approx 1 - (2/3)\varepsilon - \mathcal{C}C^{-1}$. For the state transfer of the coherent superposition $|\psi\rangle = (|0\rangle - |1\rangle)/\sqrt{2}$ as described in detail in Appendix J, we explicitly verify the linear scaling $\sim\varepsilon$ and find numerically $\sim(1/2)\varepsilon$ for intrinsic phonon

where $a^\dagger_{k,\mu}$ creates an acoustic phonon with wave vector $k$ and polarization $\mu$, and $M_k$ refers to the matrix element for electron-phonon coupling; for free bulk modes, $M_k = \sum_{i,j}\sum_\sigma \langle i|\exp[ik \cdot r]|j\rangle d^\dagger_{i\sigma} d_{j\sigma}$. In agreement with our notation, the coupling constant is $\beta = e e_{14}/\epsilon$ and the square-root factor can be identified with $U_0$. Replacing the sum by a single relevant cavity mode $a$, we recover the Hamiltonian describing the cavity-qubit coupling with $g \sim \beta U_0 = e\phi_0$; compare Eq. (F1).

APPENDIX J: STATE TRANSFER PROTOCOL
Here, we provide further details on the numerical simulation of the state transfer protocol as described by Eqs. (9) and (10), in the presence of Markovian noise. In line with previous theoretical studies [97], we show that the simple approximate Markovian noise treatment results in a pessimistic estimate for the state transfer fidelity $F$.
We first provide results of the full time-dependent numerical simulation of the cascaded master equation given in Eqs. (9) and (10), including an exponential loss of coherence for $\Gamma_{deph} > 0$; see Fig. 15. In contrast to the non-Markovian noise model discussed in the main text, the qubits are assumed to be on resonance throughout the evolution, that is, $\delta_i = 0$ ($i = 1, 2$), but experience undesired noise as described by

Here, the second term refers to a standard Markovian pure dephasing term that leads to an exponential loss of coherence $\sim\exp(-\Gamma_{deph}t/2)$. As a reference, we also show the results for the ideal, noise-free scenario ($\mathcal{L}_{noise} = 0$), where perfect state transfer is achieved [20]. The results of this type of time-dependent numerical simulations are then summarized in Fig. 16. For optimized, but experimentally achievable parameters $T_2^\star \approx 3\ \mu$s [21], and accordingly $\Gamma_{deph}/\kappa_{gd} \approx 3\%$, we then obtain $F \approx 0.85$, in the presence of realistic undesired phonon losses $\kappa_{bd}/\kappa_{gd} = 5\%$.
FIG. 15. ... (blue dashed line) in the presence of Markovian noise. Black (dash-dotted) curves refer to the ideal, noise-free scenario ($\mathcal{L}_{noise} = 0$), where perfect transfer is achieved, while colored curves take into account decoherence processes. (a) Transfer fidelity $F$. (b) Excited-state occupation $\langle S_i^+ S_i^- \rangle_t$ for first and second qubit. Numerical parameters are $g_1(t \geq 0) = \kappa_{gd}$, $\kappa_{bd}/\kappa_{gd} = 5\%$, and $\Gamma_{deph}/\kappa_{gd} = 5\%$.

FIG. 16. ... as a function of the qubit's dephasing rate $\Gamma_{deph}$ for different values of intrinsic phonon losses $\kappa_{bd}/\kappa_{gd} = 0$ (black solid line), $\kappa_{bd}/\kappa_{gd} = 5\%$ (blue dashed line), and $\kappa_{bd}/\kappa_{gd} = 10\%$ (red dash-dotted line). Inset: Pulse shape $g_1(t)$ for first node.

This shows that our non-Markovian noise model yields even higher state transfer fidelities than the Markovian noise model. Intuitively, this can be readily understood as follows: The simple Markovian noise model gives a coherence decay $\sim\exp(-t/T_2)$, whereas our non-Markovian noise model yields $\sim\exp(-t^2/T_2^2)$. Therefore, for Markovian noise, one can estimate the dephasing-induced error on the relevant time scale for state transfer $\sim\kappa^{-1}$ as $\sim\kappa^{-1}/T_2 \approx \Gamma_{deph}/\kappa \ll 1$, whereas non-Markovian noise leads to a considerably smaller error $\sim(T_2^{-1}/\kappa)^2$.
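The difference between the two noise models can be made concrete by comparing the coherence lost after one transfer time $t \sim \kappa^{-1}$. This is a back-of-the-envelope sketch using the exponential and Gaussian decay laws quoted above, with the illustrative ratio $1/(\kappa T_2) = 3\%$; it does not replace the full cascaded simulation.

```python
import math

def markovian_error(ratio):
    """Coherence lost after t = 1/kappa for decay ~ exp(-t/T2); ratio = 1/(kappa T2)."""
    return 1.0 - math.exp(-ratio)

def non_markovian_error(ratio):
    """Coherence lost after t = 1/kappa for decay ~ exp(-t^2/T2^2)."""
    return 1.0 - math.exp(-ratio ** 2)

RATIO = 0.03  # illustrative value of 1/(kappa T2)

err_markov = markovian_error(RATIO)
err_gauss = non_markovian_error(RATIO)
print(f"Markovian: {err_markov:.2e}, non-Markovian: {err_gauss:.2e}, "
      f"ratio of errors: {err_markov / err_gauss:.1f}")
```

For small ratios the errors scale linearly and quadratically, respectively, so the advantage of slow (non-Markovian) noise grows as the transfer becomes faster compared to $T_2$.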