Brillouin optomechanics in the quantum ground state

Bulk acoustic wave (BAW) resonators are attractive as intermediaries in a microwave-to-optical transducer, due to their long coherence times and controllable coupling to optical photons and superconducting qubits. However, for an optomechanical transducer to operate without detrimental added noise, the mechanical modes must be in the quantum ground state. This has proven challenging in recent demonstrations of transduction based on other types of mechanical resonators, where absorption of laser light caused heating of the phonon modes. In this work, we demonstrate ground state operation of a Brillouin optomechanical system composed of a quartz BAW resonator inside an optical cavity. The system is operated at $\sim$200 mK temperatures inside a dilution refrigerator, which is made possible by designing the system so that it self-aligns during cooldown and is relatively insensitive to mechanical vibrations. We show optomechanical coupling to several phonon modes and perform sideband asymmetry thermometry to demonstrate a thermal occupation below 0.5 phonons at base temperature. This constitutes the heaviest ($\sim$494 $\mu$g) mechanical object measured in the quantum ground state to date. Further measurements confirm a negligible effect of laser heating on this phonon occupation. Our results pave the way toward low-noise, high-efficiency microwave-to-optical transduction based on BAW resonators.

Quantum information processing has the potential to solve crucial problems that are beyond the reach of classical hardware [1,2]. Microwave-frequency solid-state qubits, such as superconducting circuits [3-6] and spins in semiconductors [7,8], have emerged as powerful platforms for quantum state manipulation, whereas optical photons are the natural choice for transporting quantum information over long distances [9,10]. A microwave-to-optical conversion process [11,12] would enable a modular, large-scale quantum processor [13] by connecting microwave qubits in physically separate dilution refrigerators through optical photons, which can be transported at room temperature through optical fibers. To be useful for quantum information processing, a microwave-to-optical transducer would have to be efficient and introduce fewer than one noise photon (referred to the input). One promising scheme uses a mechanical resonator as an intermediary to boost the effective electro-optical interaction [14,15]. Over the past decade, such optomechanical transducers have been demonstrated with rapidly improving performance [16-23], such as efficiencies up to 47% using MHz-frequency membranes as mechanical resonators [17,23] and added noise as low as 0.57 photons in devices based on GHz-frequency phononic crystal resonators [24]. Achieving high efficiency and low noise simultaneously, however, remains an outstanding challenge: MHz-frequency resonators suffer from thermal noise even at mK temperatures, whereas in phononic crystal resonators the circulating photon number, and therefore the efficiency, is limited by laser absorption and poor thermalization. Bulk acoustic wave (BAW) resonators have recently been shown to couple strongly to both superconducting qubits [25-27] and infrared photons [28-30], and form an attractive candidate for quantum transduction between the microwave and optical domains. They could potentially combine high-efficiency transduction with low noise due to their high frequency and extremely low optical absorption [15,30]. However, since previous optomechanical experiments on BAW resonators [28-30] were done at 4 K with elevated phonon mode occupancy, it remains to be shown that BAW resonators can operate in the quantum ground state in the presence of the strong laser pump necessary to boost the optomechanical coupling rate.
Here, we report on ground-state operation of an optomechanically addressed BAW resonator. The device is composed of a quartz BAW resonator inside an optical Fabry-Perot cavity, similar to earlier systems measured at 4 K [29,30]. Ground-state operation at a thermal mode occupation of 0.3-0.4 phonons is achieved by operating at $\sim$200 mK inside a dilution refrigerator (hereafter abbreviated as fridge), which is made possible by several modifications to improve alignment and stability against vibrations. Any efficient transduction process would require a pump laser power sufficient to reach an optomechanical cooperativity of unity. We show that our device remains in the ground state even under continuous illumination with pump powers that approach this regime, demonstrating its potential for simultaneously efficient and low-noise microwave-to-optical transduction. Finally, BAW resonators are also interesting for fundamental tests of physics due to their high mass and frequency. For example, a stricter bound on spontaneous wavefunction collapse rates could be obtained by measuring an increasingly heavy mechanical resonator in as low a thermal occupation as possible [31]. To our knowledge, the measurement presented in this paper constitutes the heaviest mechanical resonator measured in the ground state to date.
In our experiment, we use an optomechanical interaction between infrared photons in an optical cavity and acoustic waves in a BAW resonator. The optical cavity has linewidth $\kappa/2\pi \approx 2.4$ MHz and is formed by a planar and a concave mirror, between which we place a 5 mm long z-cut quartz crystal with planar surfaces that acts as a high-overtone bulk acoustic wave resonator (HBAR, see Fig. 1a). The standing-wave phonon modes are formed by reflections of the acoustic waves at the flat crystal interfaces to vacuum. This configuration leads to the Brillouin optomechanical coupling Hamiltonian
$$\hat{H}_{\rm int} = -\hbar g_{0,m}\left(\hat{a}_1 \hat{a}_2^\dagger \hat{b}_m + \hat{a}_1^\dagger \hat{a}_2 \hat{b}_m^\dagger\right) \qquad (1)$$
with single-photon coupling rate $g_{0,m}$ between two optical modes $\hat{a}_{1/2}$ and a mechanical mode $\hat{b}_m$ [28,29].
The interaction is caused by the interplay of electrostriction and photoelasticity of the crystal material. The electrostrictive effect allows the beat note between the two optical modes to create an elastic wave (the phonon mode). This elastic deformation in turn modifies the refractive index via the photoelastic effect, creating a grating that allows for Bragg scattering between the two optical modes. This interaction leads to the up-(down-)conversion of photons between the optical modes while simultaneously destroying (creating) a phonon. To enhance the interaction, a strong pump tone of classical intra-cavity amplitude $\alpha_p^{\rm cav}$ can be applied to the lower (higher) frequency optical mode, hereafter referred to as the red (blue) mode. This effectively linearizes the Hamiltonian and leads to a beam-splitter (two-mode squeezing) type interaction with a cavity-enhanced coupling rate $g_m = g_{0,m}\,\alpha_p^{\rm cav}$. Since the interaction has to fulfill energy and momentum conservation, appreciable coupling is only observed for mechanical frequencies $\Omega_m$ near the Brillouin frequency, which for optical wavelengths of $\sim$1550 nm in quartz is $\Omega_B/2\pi = 12.65$ GHz.
The implementation of this setup inside a fridge was accomplished by overcoming three outstanding challenges: coupling light in and out of the optical cavity in a dilution refrigerator, isolating the experiment from vibrations, and correcting for thermal misalignment. The optical modes of the cavity are addressed by light that is guided to and from the experiment by optical fibers. To both ferrules at the fiber ends, we glued a gradient-index (GRIN) lens at a specific distance that matches the outcoupled light to the cavity mode (see Fig. 1a). All optical components are placed on a compact steel mounting bracket that is material-matched to reduce thermal misalignment (see Fig. 1b). The whole bracket is then mounted to the base stage of a fridge via a system of springs for vibration isolation (see supplementary information, Section A) and we place the cavity back mirror at a position that minimizes the effect of residual vibrations on the cavity mode frequency spacing (see supplementary information, Section B). We connect a dedicated thermometer to the steel mount holding the front mirror and the HBAR, which we will refer to later as the experiment thermometer. The input and output lens mounts are at the ends of the supporting bracket and have to be operated by hand before cooldown. The mount holding the concave mirror of the cavity can be adjusted while the experiment is cold using stick-slip piezos. This is necessary to tune the frequency difference between the optical modes $\hat{a}_{1/2}$ to coincide with $\Omega_B$ (see Fig. 1c), such that the optomechanical interaction is resonant. Note that the optical cavity modes are not spaced equidistantly due to reflections at the crystal interfaces [29]. Finally, to mitigate optical misalignment due to thermal contractions, we devised a method that lets the setup self-align during cooldown, the details of which are described in the supplementary information, Section C. The results of the procedure are shown in Fig. 1d: at room temperature, the cavity reflection spectrum exhibits higher-order modes and a reduced reflection baseline compared to a reference measurement with optimum alignment, indicating slight misalignment. After the cooldown, the baseline increases to almost the value of the reference measurement. By additionally moving the back mirror mount to let the cavity mode overlap with the now aligned input beam, the higher-order modes disappear almost completely, showing that the cavity is aligned to the fundamental transverse mode.
FIG. 1. (c) One mode pair is tuned to be separated by the Brillouin frequency $\Omega_B$. (d) Cavity reflection spectrum before (top) and after (bottom) cooldown, normalized to the maximum reflection when aligned by hand. The modes that approach zero reflection are the fundamental transverse modes. Higher-order Laguerre-Gaussian modes appear as shallower dips.
With the experimental setup compatible with the fridge environment, we characterize our BAW modes and their optomechanical coupling using optomechanically induced transparency (OMIT) and amplification (OMIA) measurements. In our triply-resonant scheme, an OMIT (OMIA) measurement is done by locking a strong pump laser to the red (blue) optical mode and sweeping a weak probe laser over the blue (red) mode [29] (see Fig. 2a). The strong pump increases the cavity-enhanced coupling strength $g_m$, such that a narrow dip (peak) appears in the probe transmission spectrum when the pump-probe detuning $\Omega$ equals a mechanical resonance frequency $\Omega_m$, as shown in Fig. 2b and c. The number of modes that show appreciable optomechanical coupling depends on the optical wavelength and cavity geometry [29]. In this work we consider the three most prominent modes in our spectrum, labeled modes 0, 1 and 2 in increasing order of frequency. From fits to these spectra (see supplementary information, Section E), we retrieve several system parameters, including the optomechanical cooperativity $C_m$ and the effective mechanical linewidth $\Gamma_{m,{\rm eff}}$. These show a linear dependence on pump laser power $P_p$ (see Fig. 2d,e), as expected from theory [29]. We extract intrinsic mechanical linewidths $\Gamma_m/2\pi$ of $\sim$50-55 kHz for all modes, limited by diffraction loss, and estimate vacuum coupling rates $|g_{0,m}|/2\pi$ of {4.02 ± 0.09, 8.39 ± 0.05, 7.75 ± 0.03} Hz for modes 0, 1 and 2, respectively. Both results are in good agreement with earlier measurements on flat-flat quartz crystals [29].
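As a rough numerical illustration of the parameters quoted above, the following sketch estimates the intra-cavity pump photon number at which the cooperativity would reach unity. It assumes the commonly used definition $C_m = 4 g_m^2/(\kappa \Gamma_m)$ with $g_m = g_{0,m}\sqrt{n_p}$; this definition, the function names and the rounded linewidth of 50 kHz are assumptions made for illustration and are not taken from the fits in the supplementary information.

```python
def cooperativity(g0_hz, n_photons, kappa_hz, gamma_hz):
    """Assumed definition C = 4 g0^2 n / (kappa * Gamma).

    All rates are given as rate/(2*pi) in Hz; the factors of 2*pi cancel.
    """
    return 4.0 * g0_hz ** 2 * n_photons / (kappa_hz * gamma_hz)


def photons_for_unit_cooperativity(g0_hz, kappa_hz, gamma_hz):
    """Intra-cavity pump photon number at which C = 1 under the same assumption."""
    return kappa_hz * gamma_hz / (4.0 * g0_hz ** 2)


KAPPA_HZ = 2.4e6   # optical linewidth kappa/2pi quoted in the text
GAMMA_HZ = 50e3    # intrinsic mechanical linewidth, lower end of the quoted range
G0_HZ = {0: 4.02, 1: 8.39, 2: 7.75}  # |g0|/2pi for modes 0, 1, 2

for mode, g0 in G0_HZ.items():
    n_unit = photons_for_unit_cooperativity(g0, KAPPA_HZ, GAMMA_HZ)
    print(f"mode {mode}: pump photons for C = 1 ~ {n_unit:.2e}")
```

Under these assumptions the required photon number scales as $1/g_{0,m}^2$, so the weakly coupled mode 0 needs roughly four times more circulating pump photons than modes 1 and 2.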
Having established optomechanical coupling to several HBAR modes, we proceed to measure the thermal phonon occupations. This is done using optical sideband asymmetry thermometry, which relies on the difference that arises between Stokes and anti-Stokes scattering rates when the mechanical mode is near its quantum ground state [32]. (Anti-)Stokes scattering corresponds to the second (first) term in Eq. (1) and is associated with the creation (annihilation) of a phonon. Its probability scales with the thermal mode occupation as $n_{\rm th}+1$ ($n_{\rm th}$), reflecting the fact that it is impossible to destroy phonons if the resonator is in the ground state. Thus, by measuring the asymmetry between Stokes and anti-Stokes signals, $n_{\rm th}$ can be determined. Note that $n_{\rm th}$ refers to the occupation of the resonator in the absence of optomechanical backaction. While this technique has been frequently applied in various optomechanical systems [33-35], our system differs from these in that it uses not one but two optical modes: one that is resonant with the pump laser and the other with the Stokes or anti-Stokes signal. The expressions for these signals are therefore slightly modified and are presented in Section F of the supplementary information. The mechanical sidebands are measured by locking the pump to one of the optical modes and mixing the cavity transmission with a frequency-shifted local oscillator (LO) in a balanced heterodyne detection setup (see Fig. 2a). Each measurement is performed in a $\sim$4 min time window during which the pulse tube of the cryostat is turned off to reduce vibrations.
FIG. 3. Note that these frequencies are with respect to the LO, which is at a frequency $\Omega_{\rm LO}$ roughly 115 MHz from the scattered signal frequencies. (d-f) Same as (a-c), but measured at $T \sim 200$ mK. Here, the signal for mode 0 is weaker than that of modes 1 and 2 because of its lower coupling rate $g_0$ and because it is further off resonance from the optical mode (see Fig. 2b and c). (g) Thermal mode occupations $n_{\rm th}$ extracted from the red/blue asymmetries in (a-c) versus the mode cooperativity. The grey area indicates the expected occupation based on the experiment thermometer temperatures between start and end of the measurement. (h) Same as (g), but for the measurements shown in (d-f). The blue dashed line indicates $n_{\rm th} = 0.5$.
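The principle of sideband asymmetry thermometry can be sketched in a few lines. The example below assumes the idealized case in which the integrated Stokes and anti-Stokes signals differ only through the factors $n_{\rm th}+1$ and $n_{\rm th}$, so that their ratio $r = (n_{\rm th}+1)/n_{\rm th}$ gives $n_{\rm th} = 1/(r-1)$. It deliberately ignores the two-mode corrections, mode-spacing drifts and residual backaction that the full analysis accounts for, and the function name is hypothetical.

```python
def occupation_from_asymmetry(stokes_area, anti_stokes_area):
    """Idealized estimate of n_th from integrated sideband areas.

    Assumes stokes_area / anti_stokes_area = (n_th + 1) / n_th with all
    other sources of asymmetry already corrected for.
    """
    if anti_stokes_area <= 0 or stokes_area <= anti_stokes_area:
        raise ValueError("need stokes_area > anti_stokes_area > 0")
    ratio = stokes_area / anti_stokes_area
    return 1.0 / (ratio - 1.0)


# A Stokes peak 3.5 times larger than the anti-Stokes peak
# corresponds to n_th = 1 / (3.5 - 1) = 0.4 in this idealized picture.
print(occupation_from_asymmetry(3.5, 1.0))
```

Because $n_{\rm th}$ diverges as the ratio approaches one, small errors on the asymmetry translate into large, asymmetric errors on the occupation at higher temperatures.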
We first verify the accuracy of our thermometry by measuring the mechanical modes at 4 K without helium mix circulation. At this temperature, we expect the HBAR modes to be well thermalized to the surrounding environment because the 4 K stage and all stages below it, as well as the experiment, are at the same temperature. We perform measurements with the pump laser locked to the red or the blue mode, using sufficiently low pump power to ensure our optomechanical cooperativities are around 0.1, so as to minimize optomechanical backaction on the mechanical modes. Three thermal noise peaks appear at the mechanical frequencies of the three modes visible in Fig. 2b-c, shifted by the LO frequency (see Fig. 3a-c). A clear asymmetry can be seen between the noise peaks in the cases of Stokes and anti-Stokes scattering. Since we correct for other sources of asymmetry, such as cavity mode spacing drifts between the two measurements or the residual optomechanical backaction (see Section G of the supplementary information), the remaining asymmetry can be ascribed to the scaling with $n_{\rm th}+1$ versus $n_{\rm th}$ of the two scattering processes. The mode occupations we measure through the ratio of the areas under these peaks (see Fig. 3g) show good agreement with the occupation of 5.9-7.6 phonons we expect based on the experiment thermometer readings throughout these measurements. To rule out the possibility that laser phase noise affects our thermometry, we measure the phase noise of our laser [36,37] and find it to be sufficiently small to have a negligible influence on these results (see Section J of the supplementary information). Note that the uncertainties on the occupations of modes 1 and 2 are large because during the $\sim$4-minute-long measurement, the cavity mode spacing shifts slightly due to heating, leading to an increased uncertainty on the detuning between the optical mode and mechanical modes. This affects modes 1 and 2 more than mode 0, as the former are situated on the flank of the optical resonance during the measurements at 4 K (see Section E 1 of the supplementary information for more details on our error sources). Furthermore, up until calculating the asymmetry between scattering signals, the errors are propagated via linear error propagation. However, because the function relating asymmetry and thermal mode occupation is strongly nonlinear in the range given by our error bars, the error bars for the thermal mode occupations shown in Fig. 3g,h indicate the values corresponding to the extrema of the error bars on the asymmetry.
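The occupation expected from a thermometer reading follows from Bose-Einstein statistics, $n_{\rm th} = 1/(e^{hf/k_B T} - 1)$. The sketch below evaluates this relation and its inverse; it assumes a mode frequency equal to the Brillouin frequency of 12.65 GHz (the individual mode frequencies differ slightly), and the function names are hypothetical.

```python
import math

H_PLANCK = 6.62607015e-34   # J s (exact SI value)
K_BOLTZMANN = 1.380649e-23  # J/K (exact SI value)


def bose_occupation(freq_hz, temp_k):
    """Thermal occupation of a mode at frequency freq_hz and temperature temp_k."""
    return 1.0 / math.expm1(H_PLANCK * freq_hz / (K_BOLTZMANN * temp_k))


def mode_temperature(freq_hz, n_th):
    """Inverse relation: effective temperature for a given occupation n_th > 0."""
    return H_PLANCK * freq_hz / (K_BOLTZMANN * math.log1p(1.0 / n_th))


F_MODE_HZ = 12.65e9  # assumed equal to the Brillouin frequency

for temp in (4.0, 0.2):
    print(f"T = {temp} K -> n_th = {bose_occupation(F_MODE_HZ, temp):.3f}")
```

With this assumed frequency, a temperature of 4 K corresponds to roughly 6 phonons, in line with the range expected from the thermometer readings during the 4 K measurements.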
We now cool our experiment down to $\sim$200 mK and show that this brings the mechanical modes into the quantum ground state. At this lower temperature the noise peaks, shown in Fig. 3d-f, decrease in amplitude and the asymmetry between red and blue pumping configurations increases. The extracted occupations for modes 0, 1 and 2 are $0.24^{+0.13}_{-0.17}$, $0.44^{+0.07}_{-0.08}$ and $0.38^{+0.07}_{-0.08}$ phonons, respectively (see Fig. 3h). In contrast to the 4 K measurement, here the uncertainty is largest for mode 0 because it is at the flank of the optical resonance. All three modes are in the quantum ground state, with occupations $n_{\rm th} < 0.5$. Interestingly, these occupations are higher than what one would expect based on the thermometer readings, which predict occupations in the range of 0.015 to 0.12 phonons. A mode temperature above that of the crystal mount could arise from either laser absorption heating the crystal or heating from another source of radiation such as blackbody radiation from the higher-temperature stages. To investigate these possible causes, we first repeat the sideband thermometry measurements at mK temperatures, but precede each measurement with two minutes where the pump laser is locked to the cavity and set to a power that is higher than during the thermometry measurement, effectively acting as a heat source. We find that the phonon mode occupations show no clear increase with increasing heating laser power (see Fig. 4a), even though the crystal is subject to $\sim$1 W of intra-cavity power at the highest heating power. The experiment thermometer temperature, however, increases significantly with laser power. As a further test, and to rule out the possibility that the crystal cooled down in between the heating step and the thermometry measurement, we perform thermometry on our crystal using the same set of laser powers as shown in Fig. 2d-e. The resulting mode occupation shows no dependence on laser power (see Fig. 4b), demonstrating that laser absorption is not the cause of our elevated mode occupations. The laser power responsible for the elevated experiment thermometer readings in Fig. 4a is therefore also not absorbed in the crystal. Instead, it is likely scattered due to imperfect alignment and eventually absorbed by other parts of the experiment. Since the experiment thermometer temperature never exceeds the effective phonon mode temperatures, however, this heating of the external environment does not have a significant effect on the mode temperatures.
Having ruled out heating by laser absorption as the source of elevated phonon mode temperatures, we then investigate the evolution of the mode temperature while the fridge warms up in order to test whether the heating is due to blackbody radiation from a higher-temperature stage. To allow for faster measurements, we do this without relying on pairs of red and blue pumping measurements taken under the same conditions. Instead, we first extract the phonon occupations from a pair of reference measurements right before the warmup, and then use the fact that the corrected integrated signals from a subsequent red (blue) pumping measurement should scale as $n_{\rm th}$ ($n_{\rm th}+1$) (see supplementary information, Section H). Further reference pairs are taken during periods of stable temperature during the warmup, and used for subsequent measurements. We find that the temperatures agree well between phonon modes and follow the experiment temperature sensor, with the exception of low temperatures (see Fig. 4c). These observations are consistent with blackbody radiation from an additional heat source that starts at a higher temperature than the crystal mount before the warmup, but whose temperature increases more slowly than the still stage during the warmup. One possible culprit is the still shield, which surrounds the experiment and has a finite thermal conductivity to the still stage. In Section I of the supplementary information, we discuss why blackbody radiation from the still stage or other stages directly does not explain our result.
We have demonstrated the operation of a cryogenic Brillouin cavity optomechanics system and used it to measure the modes of a BAW resonator in the quantum ground state. While the measurement of mechanical resonators in the quantum ground state has now been achieved in many mechanical systems, either through passive or active cooling [15,38], the BAW modes studied here, with an effective mass of $\approx$494 µg, are to date the most massive mechanical objects measured with a thermal occupation of less than half a phonon (see supplementary information Section K for a calculation of the effective mass). Our results represent an important step toward using BAW resonators for quantum transduction. We have overcome several crucial technical challenges, for example ensuring the alignment and stability of a free-space optomechanical cavity at mK temperatures. While the measured thermal occupations are higher than expected and not ideal for noiseless transduction, this is likely a particular issue of the current geometry. Importantly, we find no evidence of laser heating, and we point out that BAW resonators enclosed in microwave cavities have been measured to have much lower thermal occupations [39]. Further improvements and upgrades to our system will include lower-loss mechanical resonators, higher-finesse optical cavities, and the incorporation of superconducting circuits. This work forms a solid foundation for these next steps toward a quantum transducer between the microwave and optical domains.
In contrast to an optical table, a dilution refrigerator is a mechanically very noisy environment. The main source of noise is the pulse tube, but there are also vibrations from the turbo pumps during mix circulation as well as other vibrations from the lab. While this noise is mostly at frequencies below 1 kHz and therefore does not cause noise on our HBAR mechanical states, it does deteriorate our cavity lock quality and it causes noise on the optical mode spacing (see Section B for more details). We therefore mount our experiment onto a vibration isolation stage consisting of a plate suspended by six CuBe springs from the MXC stage (Fig. S1a). Loaded by the weight of the experiment, this mass-spring system has resonance frequencies between 1 and 10 Hz, and acts as a low-pass filter that suppresses mechanical vibrations above the resonance frequencies [40].
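The low-pass behaviour of such a stage can be illustrated with the textbook transmissibility of a single damped mass-spring oscillator driven through its suspension point. This is only a sketch under strong assumptions: the real stage has six springs and several resonances, and the damping ratio used here is a hypothetical placeholder rather than a measured value.

```python
import math


def transmissibility(freq_hz, resonance_hz, damping_ratio=0.01):
    """Displacement transmissibility of a single-degree-of-freedom isolator.

    Standard base-excitation result; damping_ratio is an assumed value.
    With damping_ratio = 0 the expression diverges exactly at resonance.
    """
    r = freq_hz / resonance_hz
    num = 1.0 + (2.0 * damping_ratio * r) ** 2
    den = (1.0 - r * r) ** 2 + (2.0 * damping_ratio * r) ** 2
    return math.sqrt(num / den)


# Compare the attenuation for resonances at the two ends of the quoted range.
for f0 in (1.0, 10.0):
    for f in (100.0, 1000.0):
        print(f"f0 = {f0} Hz, f = {f} Hz: |T| = {transmissibility(f, f0):.2e}")
```

Well above resonance the undamped response falls off as $(f_0/f)^2$, which is why a resonance frequency of a few Hz is sufficient to suppress vibrations in the 100 Hz to 1 kHz band in this idealized picture.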
To test the effect of the isolation stage on the vibrations in our experiment, we perform time-resolved measurements of the cavity resonance frequency by sweeping a laser over one of our cavity resonance frequencies using a 2 kHz triangular sweep. The reflection spectrum is recorded on an oscilloscope and shows two resonance dips per sweep period. By finding the time differences between every second dip we obtain an array of time differences which would be equal to a sweep period if the cavity were perfectly stable, but in reality contain fluctuations due to the noise on the cavity frequency. We convert the time differences to frequencies using the known cavity linewidth as a frequency calibration and Fourier transform the array of frequencies to get the noise spectrum $\mathcal{F}(f_c(t))(\omega)$ of the cavity frequency $f_c$ up to 1 kHz. Note that a simple lock to the cavity resonance and recording of the error signal could give the same information, but locking was not possible in some of our measurements due to the large amount of noise.
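The analysis described above can be sketched as follows. The example takes a list of dip times, keeps every second dip, converts the deviation of each interval from the sweep period into a frequency using a sweep-rate calibration, and computes a one-sided amplitude spectrum with a plain discrete Fourier transform. The function names and the `sweep_rate_hz_per_s` calibration parameter are hypothetical, and the sketch is not the analysis code used for Fig. S1.

```python
import cmath
import math


def frequency_series(dip_times, sweep_period, sweep_rate_hz_per_s):
    """Convert dip timing jitter into cavity frequency fluctuations.

    dip_times: times of all reflection dips (two per sweep period).
    sweep_rate_hz_per_s: assumed calibration of the laser sweep rate.
    """
    same_slope = dip_times[::2]  # every second dip, i.e. same sweep direction
    intervals = [t1 - t0 for t0, t1 in zip(same_slope, same_slope[1:])]
    return [(dt - sweep_period) * sweep_rate_hz_per_s for dt in intervals]


def amplitude_spectrum(samples, sample_rate_hz):
    """One-sided DFT amplitude spectrum (naive O(n^2) implementation)."""
    n = len(samples)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        freqs.append(k * sample_rate_hz / n)
        amps.append(abs(total) / n)
    return freqs, amps
```

With a 2 kHz sweep, one frequency sample is obtained per sweep period, so the resulting series has a 2 kHz sample rate and the spectrum extends to the 1 kHz Nyquist frequency, matching the bandwidth quoted in the text.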
The recorded noise spectra with and without isolation stage are shown in Fig. S1b, where the measurement without isolation is done by clamping the stage rigidly to the base plate using the failsafes shown in Fig. S1a. To generate reproducible vibrations, a sound containing all frequencies up to 1 kHz is played on a vibration speaker (Adin B1BT) placed on the still stage. These noise spectra show an attenuation of the noise by more than three orders of magnitude up to 100 Hz, whereas in the 300 to 1000 Hz range the attenuation is less. We speculate that there might be some higher-order resonances or transmission through the thermal braids in that regime. Despite that, the total integrated frequency noise over the whole spectrum is reduced from 1.5 GHz without stage to 110 MHz with stage. While the isolation stage is instrumental to reducing the vibration noise on our experiment, we still require the pulse tube to be turned off to be able to lock our laser to our cavity. With the pulse tube on, we measure a total integrated frequency noise of 34 MHz, compared to just 3.6 MHz with the pulse tube off (Fig. S1c). With the total noise commensurate with our cavity linewidth of $\sim$2 MHz, our cavity lock performs well. Finally, we note that during our measurements, we had to disconnect the control cables to the piezo motors on our back mirror from their driving modules because the voltage noise on the piezo control signals caused noticeable noise on our cavity frequency. This is despite them being stick-slip piezos, which are kept at 0 V when not moving.

B. Displacement-insensitive point
Vibrations of the cavity mirrors not only affect the individual cavity resonance frequencies, but also the frequency spacing between these resonances. The former is mitigated by locking our laser to the pump mode, but this leaves the noise on the frequency spacing unaffected. All our measurements require that the frequency difference $\Delta_{12}$ between the two optical cavity modes is equal to the mechanical frequency $\Omega_m$ for the optomechanical interaction to be resonant. Thus, noise on the frequency spacing causes noise on the position of the broad optical resonance we observe in the OMIT and OMIA spectra. Since we average several spectra, this noise will result in a reduced height (depth) and a change of the lineshape for the OMIA (OMIT) features. Moreover, our thermometry signal strength at mechanical resonance $\Omega_m$ depends on $\Delta_{12}$ as a Lorentzian with the optical linewidth, peaking when $\Delta_{12} = \Omega_m$ (see Section F). Noise on $\Delta_{12}$ therefore leads to a reduction of the averaged thermometry signals.
In a vacuum-filled optical cavity of length $L$, the resonance frequency of the $m$-th mode is given by $f_{c,m} = mc/2L$, with $c$ the speed of light, while the mode spacing is given by $\Delta f_c = c/2L$. In a $\sim$1 cm cavity at 1550 nm wavelength, $m \sim 1.3 \times 10^4$. Thus, the dependence of $\Delta f_c$ on small length changes $\delta L$ is $\sim 1.3 \times 10^4$ times weaker than that of the resonance frequency, and vibrations would have a negligible effect. However, in our cavity, the reflections at the crystal interfaces lead to a strong dependence of mode spacing on both wavelength and mirror position, as observed in earlier work [29]. We use an analytical transmission matrix model, adapted from [29], to calculate cavity reflection and transmission spectra, and from those find how $\Delta f_c$ depends on small changes in cavity length (Fig. S2). This reveals that for our cavity geometry, $\Delta f_c$ oscillates between $\sim$9.7 and 12.7 GHz. The gradient $d(\Delta f_c)/d(\delta L)$ of this oscillation has a maximum value of 7.5 MHz/nm, as compared with an average gradient $d(f_c)/d(\delta L)$ of the individual resonance frequencies of $\sim$15 MHz/nm. This shows that the mode spacing can depend nearly as strongly on mirror position in our system as the individual resonance frequencies. We therefore tune our cavity length to the 'displacement-insensitive point', where the Brillouin frequency coincides with the maximum of the mode spacing oscillation for one of the mode pairs in our spectrum. This corresponds to the situation shown in Fig. S2. At this point, the mode spacing is first-order insensitive to mirror position, which greatly mitigates the noise on our measurements. We are able to match such a mode spacing maximum with the Brillouin frequency to within $\sim$1 MHz by adjusting the cavity length.
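The empty-cavity estimate at the start of this section can be checked numerically. The sketch below evaluates the mode number $m = 2L/\lambda$ and the length sensitivities $|df_{c,m}/dL| = f_{c,m}/L$ and $|d(\Delta f_c)/dL| = c/2L^2$ for an idealized vacuum cavity of exactly 1 cm; it does not include the crystal and therefore does not reproduce the transmission matrix model. The function names are hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def mode_number(length_m, wavelength_m):
    """Longitudinal mode number m = 2L / lambda of an empty cavity."""
    return 2.0 * length_m / wavelength_m


def resonance_sensitivity(length_m, wavelength_m):
    """|d f_{c,m} / dL| = f / L in Hz per metre, from f = m c / 2L."""
    return (C_LIGHT / wavelength_m) / length_m


def spacing_sensitivity(length_m):
    """|d (Delta f_c) / dL| = c / (2 L^2) in Hz per metre, from Delta f = c / 2L."""
    return C_LIGHT / (2.0 * length_m ** 2)


L_CAV = 1e-2        # assumed cavity length of 1 cm
WAVELENGTH = 1550e-9

m = mode_number(L_CAV, WAVELENGTH)
df_res = resonance_sensitivity(L_CAV, WAVELENGTH) * 1e-9  # Hz per nm
df_gap = spacing_sensitivity(L_CAV) * 1e-9                # Hz per nm
print(f"m ~ {m:.3g}")
print(f"resonance shift ~ {df_res / 1e6:.1f} MHz/nm")
print(f"mode spacing shift ~ {df_gap / 1e3:.2f} kHz/nm")
```

The ratio of the two sensitivities equals $m$, so in an empty cavity the mode spacing would shift by only about a kHz per nm. This is the baseline against which the 7.5 MHz/nm maximum gradient found for the crystal-loaded cavity should be compared.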

C. Alignment procedure
During cooldown, the cryogenic components of our setup undergo thermal contraction, causing a misalignment between input and transmission optics and the cavity. While the experiment was designed to minimize such misalignments by matching materials and by fiber coupling rather than sending free-space beams through the fridge (see Fig. S3a), some misalignment remains. We therefore use a series of test cooldowns to determine the angular misalignments of the input and transmission lenses, and then we pre-compensate for these before cooling our experiment down in the dilution refrigerator. We will first discuss how cavity reflection and transmission spectra can be described in the case when input and output optics are not perfectly aligned to the cavity mode. We then present a model that parameterizes the effect of angular misalignment on the reflection and transmission spectra. This model is used to fit reflection and transmission spectra during the test cooldowns to find the room temperature settings for which the cavity is aligned at low temperatures (hereafter referred to as the 'cold optimum'). Next, we discuss how we test our model and fix some of its parameters by using room-temperature misalignments. Finally, we present the results of our test cooldowns and show that, if we position our cavity at the cold optimum, this leads to an improved alignment during cooldown.

Misaligned cavity input-output theory
To understand the effect of misalignment on cavity transmission and reflection, we use cavity input-output theory, adapted to describe the coupling between input or output optics and the cavity ports using scattering matrices (see Fig. S3b). This same theory will also be used in Section G 2 to describe how we can determine input coupling rates from reflection spectra. We consider a cavity and its two input-output ports (1 and 2), described as usual by input-output theory. The Langevin equation of motion in the harmonic basis for the classical amplitude $a$ of a cavity coupled to two such input-output channels contains the cavity resonance frequency $\omega_0$ and the total loss rate $\kappa = \kappa_{\rm ext1} + \kappa_{\rm ext2} + \kappa_{\rm int}$, composed of the coupling rates $\kappa_{\rm ext1}$ and $\kappa_{\rm ext2}$ to ports 1 and 2 and the intrinsic losses $\kappa_{\rm int}$. The input and output fields at port $i$ are written as $a_{{\rm in},i}$ and $a_{{\rm out},i}$, respectively, and the reflected field (at port 1) and transmitted field (at port 2) follow from the corresponding input-output relations. While $a_{\rm in}$ and $a_{\rm out}$ describe the input and output modes of the cavity, those do not, in general, correspond to the input and output modes of our input and transmission optics. We therefore introduce scattering matrices that couple these modes to each other in order to describe the mode mismatches due to misalignment, but also to include any further losses between the laser and cavity input, or cavity output and detector. Specifically, matrix $S_1$, with elements $s_{jk,1}$, couples the modes of our input optics $b_{{\rm in},1}$ and $b_{{\rm out},1}$ and the cavity modes $a_{{\rm in},1}$ and $a_{{\rm out},1}$. Similarly, $S_2$, with elements $s_{jk,2}$, couples the modes of our transmission optics $b_{{\rm in},2}$ and $b_{{\rm out},2}$ and the cavity modes $a_{{\rm in},2}$ and $a_{{\rm out},2}$. For simplicity, we assume that $s_{22,1} = s_{22,2} = 0$, i.e. nothing coming out of the cavity gets reflected back into it (as this would create additional cavities and complicate analysis).
We now consider the situation where light is only inserted at port 1, such that $b_{{\rm in},2} = a_{{\rm in},2} = 0$. The reflection and transmission spectra $R_1$ and $T_1$ [Eqs. (S6) and (S7)] are then functions of the detuning $\Delta = \omega - \omega_0$. The reflection and transmission coefficients $R_2$ and $T_2$ for the situation where light is inserted only at port 2 are obtained by swapping the final indices 1 and 2.
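For reference, one set of equations consistent with the description in this subsection can be sketched as follows. It assumes the standard input-output sign convention with fields oscillating as $e^{-i\omega t}$ and a scattering matrix that maps $(b_{{\rm in},i}, a_{{\rm out},i})$ onto $(b_{{\rm out},i}, a_{{\rm in},i})$; both conventions are assumptions made here and may differ from the original by phases or index ordering.

```latex
\begin{align}
\dot{a} &= -\left(i\omega_0 + \frac{\kappa}{2}\right) a
          + \sqrt{\kappa_{\mathrm{ext1}}}\, a_{\mathrm{in},1}
          + \sqrt{\kappa_{\mathrm{ext2}}}\, a_{\mathrm{in},2}, \\
a_{\mathrm{out},i} &= a_{\mathrm{in},i} - \sqrt{\kappa_{\mathrm{ext}i}}\, a,
          \qquad i = 1, 2, \\
\begin{pmatrix} b_{\mathrm{out},i} \\ a_{\mathrm{in},i} \end{pmatrix}
        &= \begin{pmatrix} s_{11,i} & s_{12,i} \\ s_{21,i} & s_{22,i} \end{pmatrix}
           \begin{pmatrix} b_{\mathrm{in},i} \\ a_{\mathrm{out},i} \end{pmatrix}, \\
R_1 &= \left|\frac{b_{\mathrm{out},1}}{b_{\mathrm{in},1}}\right|^2
     = \left| s_{11,1} + s_{12,1}\, s_{21,1}
       \left(1 - \frac{\kappa_{\mathrm{ext1}}}{\kappa/2 - i\Delta}\right) \right|^2, \\
T_1 &= \left|\frac{b_{\mathrm{out},2}}{b_{\mathrm{in},1}}\right|^2
     = \left| s_{12,2}\, s_{21,1} \right|^2
       \frac{\kappa_{\mathrm{ext1}}\,\kappa_{\mathrm{ext2}}}{\kappa^2/4 + \Delta^2}.
\end{align}
```

In this sketch the last two lines follow from the first three with $s_{22,1} = s_{22,2} = 0$ and $b_{{\rm in},2} = 0$. Far off resonance the reflection reduces to $|s_{11,1} + s_{12,1} s_{21,1}|^2$, and the transmission depends on the optics only through $s_{21,1}$ and $s_{12,2}$, in agreement with the statements made in the following subsection.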

The effect of angular misalignments on cavity reflection and transmission
Having established a framework for how misalignments may affect the cavity reflection and transmission spectra, we now proceed to find closed-form expressions for reflection and transmission as a function of input and transmission lens tilt angles. At the end of this section, we present the procedure that we use to determine the optimal input and transmission lens tilt angles at low temperature.
The input lens is well aligned when the input beam reflects off the first (flat) mirror under normal incidence, such that the reflected light (off resonance from the cavity modes, i.e. for |∆| ≫ κ) is directed back into the fiber. When this condition is met, the cavity mode can be spatially aligned in the x-y plane (mirror surface) to the position where the input beam hits the front mirror by tilting the curved back mirror, which displaces the cavity modes in x and y. An angular misalignment dθ in , dφ in of the input lens leads to a displacement of the reflected beam from the center of the fiber. As both the fundamental mode of the fiber and the reflected beam are Gaussian in shape, and as the overlap integral of two Gaussians is a Gaussian as well, the reflection coefficient is expected to show the Gaussian dependence on these misalignments given by Eq. (S8), where θ 0 is the angle for which the beam is displaced by roughly one cavity waist on the front mirror, A is a dimensionless fit parameter that we can determine by room-temperature misalignment tests and R 1,max is the reflected power at perfect alignment (and |∆| ≫ κ). Note that θ 0 is redundant with A, and we therefore fix it to 50°, such that A is of order 1. To write the overlap integral in this way, we assumed that the phase front mismatch is negligible. From Eq. (S6) we see that R 1,|∆|≫κ = |s 12,1 s 21,1 + s 11,1 |², so input misalignments result in a change of s 12,1 , s 21,1 and s 11,1 .

The transmission lens is well aligned when the transmitted power measured at the end of the fiber is maximized. Ideally, this would involve optimizing both the angle and the position of the transmission lens, but the limited space in our setup does not allow for independent control of both. Therefore, we only control the lens mount tilt, which affects both angle and position. From Eq. (S7) we see that, insofar as they affect transmission, misalignments must be reflected by a change in s 21,1 and s 12,2 , where s 21,1 captures misalignments of the input lens and s 12,2 those of the transmission optics.
Since s 21,1 describes how much of the input mode b in,1 is converted to the cavity input mode a in,1 , we take it to be proportional to the overlap integral of the respective mode fields E a,1 and E b,1 on the front mirror surface. Taking this surface to be the z = 0 plane, and assuming the input beam and cavity mode to have the same waist size w 0 , the fields E a,1 and E b,1 are respectively described by the fields of a Gaussian beam and a tilted and displaced Gaussian beam, i.e.

$$E_{a,1}(x, y, z) = E_{a,1}\, e^{-(x^2 + y^2)/w_0^2}\, e^{i k_z z}, \tag{S9}$$

$$E_{b,1}(x, y, z) = E_{b,1}\, e^{-\left[(x - \Delta x - d\theta_{\mathrm{in}} z)^2 + (y - \Delta y - d\phi_{\mathrm{in}} z)^2\right]/w_0^2}\, e^{i k_z \left(z + d\theta_{\mathrm{in}} (x + \Delta x) + d\phi_{\mathrm{in}} (y + \Delta y)\right)}. \tag{S10}$$

Here, dθ in and dφ in are the input beam tilt angles in the xz- and yz-plane, respectively, with respect to the z-axis. We have assumed them to be small, such that sin(dθ in ) ≈ dθ in and sin(dφ in ) ≈ dφ in . The displacements ∆x and ∆y are those between the input beam and the cavity mode on the front mirror, i.e. ∆x = ∆x in − ∆x c and ∆y = ∆y in − ∆y c , with {∆x in , ∆y in } and {∆x c , ∆y c } the displacements of the input beam and cavity mode, respectively, on the front mirror with respect to the point of optimum alignment. In Eqs. (S9) and (S10), we have also taken the beam waists to be at the mirror and ignored the Gaussian beam divergence (our input beam has a Rayleigh range of more than ∼10 mm). Due to the large distance d IL ∼ 22 mm between the input lens and the front mirror and the fact that we consider displacements on the order of the cavity waist (77 µm), our tilts will be of order $\arctan(77 \times 10^{-3}/22) \approx 3.5 \times 10^{-3}$ rad and we may safely ignore the tilt-dependent terms in E b,1 (x, y, z). The field E b,1 (x, y, z) is thus simply a Gaussian displaced by ∆x and ∆y, which we can rewrite as a function of the tilts {dθ in , dφ in } by expressing the displacements through the tilt angles, where B and C, like A before, are dimensionless fit parameters we can determine from room-temperature tests, and {dθ bm , dφ bm } are the tilt angles of the concave back mirror. Note that in reality ∆x c and ∆y c do not depend on d IL but rather on the radius of curvature of the back mirror, and including that correction would simply lead to a different value for C.

Using these definitions, and calculating the overlap integral of E a,1 and E b,1 in the z = 0 plane, we then find that s 21,1 , normalized to the value at perfect alignment (dθ in = dφ in = dθ bm = dφ bm = 0), shows a simple Gaussian dependence on tilts, described by Eq. (S16). With the same arguments as used to get to Eq. (S16), the dependence of the scattering matrix element s 12,2 , which describes how much of the cavity output mode a out,2 is converted to the output mode b out,2 on the transmission side of the cavity, can be written as

$$\frac{s_{12,2}(d\theta_{\mathrm{tr}}, d\phi_{\mathrm{tr}}, d\theta_{\mathrm{bm}}, d\phi_{\mathrm{bm}})}{s_{21,1}(0, 0, 0, 0)} = e^{-\left[(D\, d\theta_{\mathrm{tr}} - E\, d\theta_{\mathrm{bm}})^2 + (D\, d\phi_{\mathrm{tr}} - E\, d\phi_{\mathrm{bm}})^2\right]/2\theta_0^2}, \tag{S17}$$

with {dθ tr , dφ tr } the tilt angles of the transmission lens and D and E two more dimensionless fit parameters. Plugging Eqs. (S16) and (S17) into Eq. (S7) and evaluating at cavity resonance (∆ = 0), we find Eq. (S18), where T 1,max is the resonant transmission at perfect alignment.
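The Gaussian dependence with a factor of 2 in the exponent's denominator can be checked numerically. The sketch below is an illustration, not the authors' calculation: it integrates the overlap of two equal-waist Gaussians along one axis (the two-dimensional overlap is the product of the x and y factors) and compares it with exp(−∆x²/2w 0 ²). It also evaluates the tilt estimate quoted in the text. The function name and the integration grid are hypothetical choices.

```python
import math

def normalized_overlap(dx, w0, n=4001, span=10.0):
    """Overlap of exp(-x^2/w0^2) with a copy displaced by dx,
    normalized to the overlap at dx = 0 (1D Riemann sum)."""
    half = span * w0 + abs(dx)
    step = 2 * half / (n - 1)
    num = den = 0.0
    for i in range(n):
        x = -half + i * step
        ea = math.exp(-x * x / w0**2)
        eb = math.exp(-((x - dx) ** 2) / w0**2)
        num += ea * eb   # overlap with the displaced beam
        den += ea * ea   # overlap at perfect alignment
    return num / den

w0 = 77e-6                                   # cavity waist from the text, in m
numeric = normalized_overlap(w0, w0)         # displacement of one waist
analytic = math.exp(-(w0**2) / (2 * w0**2))  # exp(-1/2)

# Tilt that displaces the beam by one waist over d_IL = 22 mm
tilt_rad = math.atan(77e-3 / 22)
```

A displacement of one waist therefore reduces the amplitude overlap to about 61% per axis, and the corresponding tilt is about 3.5 mrad.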
All the tilt angles used in Eqs. (S8), (S16) and (S17) are defined with respect to the point of optimal alignment, i.e.
Because the back mirror piezo motors that control {θ bm , φ bm } are open loop and suffer from hysteresis and unequal step sizes in their two directions of movement, as well as a change of step size with temperature, we cannot generally know the back mirror tilts. We can only know when the back mirror is positioned such that the cavity mode is perfectly aligned to the input beam, i.e. ∆x in = ∆y in = 0, by finding the position for which higher-order Laguerre-Gaussian modes disappear from our reflection spectrum. If we take both our input lens and the back mirror to be perfectly aligned, such that dθ in = dφ in = dθ bm = dφ bm = 0, Eq. (S18) simplifies to Eq. (S22). Equations (S8) and (S22) can be used to find the optimal input lens and transmission lens alignments at low temperatures, using the following procedure:
1. At room temperature, determine the parameter A in Eq. (S8) by measuring reflection spectra for a set of controlled misalignments {dθ in , dφ in } of the input lens, and fitting average off-resonant reflection values to Eq. (S8).
2. At room temperature, determine the parameter D in Eq. (S22) by measuring transmission spectra for a set of controlled misalignments {dθ tr , dφ tr } of the transmission lens (with input lens and back mirror perfectly aligned), and fitting average resonant transmission values to Eq. (S22).
3. Set the input lens to its room-temperature optimum, i.e. where dθ in = dφ in = 0, by maximizing off-resonant reflection. Cool down the cavity and measure the reflection spectrum. Repeat such cooldowns for four other input lens alignment settings.
5. With the input lens set to the cold optimum {θ in,0 , φ in,0 } cold , cool down and align the back mirror in situ. Record the transmission spectrum. Repeat this step for five different transmission lens alignment settings.

Room temperature calibrations
Here we discuss the results of the room-temperature calibrations done to determine parameters A and D in Eqs. (S8) and (S22), i.e. steps 1 and 2 of the procedure listed at the end of Section C 2. By constraining our fits with these room-temperature values, we need only a minimum of 3 cooldowns to determine the input cold optimum, and another 3 for the transmission lens cold optimum. This is preferred over fitting more parameters on a larger cooldown dataset, because the cooldowns are the most time-consuming part of the procedure.
We align our cavity manually to the warm optimum, i.e. dθ in = dφ in = dθ bm = dφ bm = dθ tr = dφ tr = 0 at room temperature. To fit for A, we then record reflection spectra at 1550-1552 nm wavelength at a set of controlled misalignments {dθ in , dφ in } of the input lens, centered at the optimum. Throughout this work, input and transmission lens tilts are measured in degrees of rotation of the adjustment screws on the tip-tilt stages of these lenses, where a 10° rotation corresponds to a 32 µrad physical tilt of the lens. Fig. S4a shows the resulting fit of Eq. (S8) to the average off-resonant reflection values, normalized to reflection at perfect alignment, at all these misalignment positions. This reflection follows the expected Gaussian dependence. To fit for D, we keep the input lens and back mirror at their optima and record transmission spectra at a set of controlled misalignments {dθ tr , dφ tr } of the transmission lens, centered at the optimum. The fit of the average resonant transmission values, normalized to resonant transmission at perfect alignment, to Eq. (S22) also shows good agreement with a Gaussian (see Fig. S4b).
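For readers who want to translate screw rotations into physical lens tilts, the conversion stated above (10° of screw rotation per 32 µrad of tilt) can be written as a one-line helper. This is a minimal sketch that assumes the conversion is linear over the range used; the constant and function names are hypothetical.

```python
# Conversion factor from the text: a 10 degree screw rotation
# corresponds to a 32 microradian physical tilt of the lens.
URAD_PER_SCREW_DEGREE = 32.0 / 10.0

def screw_degrees_to_urad(rotation_deg):
    """Physical lens tilt in microradians for a given screw rotation,
    assuming a linear relation between rotation and tilt."""
    return rotation_deg * URAD_PER_SCREW_DEGREE

tilt_for_10_deg = screw_degrees_to_urad(10.0)
```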

Finding the cold optimum using cooldowns
Here we discuss the results of the cooldowns used to determine the input lens and transmission lens cold optima, i.e. steps 3-6 of the procedure listed at the end of Section C 2. The cooldowns are done in a liquid nitrogen dipstick, consisting of a ∼ 1.5 m tube with a chamber at the bottom hosting our experiment and vacuum, electrical and fiber connections at the top. The bottom half of the dipstick is immersed in liquid nitrogen. A temperature sensor is mounted onto the cavity and, after evacuation, helium is added to the chamber for faster thermalization. The experiment typically requires ∼ 30 min to reach a temperature of 78 K. This is still far above the base temperature of a dilution fridge, but we found that most thermal contraction happens in this first ∼ 220 K drop, consistent with the strong decrease of the thermal expansion coefficient of stainless steel around this temperature [41]. A reflection and transmission spectrum is recorded every minute during cooldown, but here we only use the spectra taken at 78 K. One can, however, use the full dataset to track the optimal alignment during the cooldown. After reaching base temperature, the dipstick is pulled out of the nitrogen dewar, warmed up and vented once above 0 °C. The cavity input or transmission lens tilts are adjusted to the next value we want to measure, and the measurement is repeated. To normalize our measurements, we always also record a spectrum taken at optimal alignment (at room temperature).

We fit the average off-resonant reflection values at 78 K from five cooldowns with different input alignment settings to Eq. (S8) (see Fig. S4c,d). This produces the input lens cold optimum {θ in,0 , φ in,0 } cold , which we find to be displaced from the warm optimum by (-15,23) degrees of rotation of the tip-tilt mount screws. We then proceed with the transmission lens alignment by placing the input lens at its cold optimum and performing cooldowns with five different transmission lens alignments. Before taking the 78 K spectra, we align the back mirror in situ to its cold optimum, as is necessary to render Eq. (S22) valid. We then fit Eq. (S22) (see Fig. S4e,f) to the average resonant transmission of these spectra for all five cooldowns to find the transmission lens cold optimum {θ tr,0 , φ tr,0 }, which we find to be displaced from the warm optimum by (15,-11) degrees of rotation of the tip-tilt mount screws.
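The reason three cooldowns are the minimum for such a constrained fit can be made explicit with a sketch. It assumes, as an illustration only, that Eq. (S8) has the form R = R max exp(−A²[(θ − θ c )² + (φ − φ c )²]/θ 0 ²). With A fixed from room temperature, ln R + k(θ² + φ²) with k = A²/θ 0 ² is linear in three unknowns, so the cold optimum follows from linear least squares. All names and the data below are synthetic and hypothetical, not measured values.

```python
import math

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x

def fit_cold_optimum(tilts, values, A, theta0=50.0):
    """Fit the optimum (theta_c, phi_c) and peak value with A held fixed."""
    k = (A / theta0) ** 2
    rows = [(1.0, th, ph) for th, ph in tilts]
    rhs = [math.log(v) + k * (th * th + ph * ph)
           for (th, ph), v in zip(tilts, values)]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(3)]
    c0, c1, c2 = solve3(m, b)
    th_c, ph_c = c1 / (2 * k), c2 / (2 * k)
    return th_c, ph_c, math.exp(c0 + k * (th_c**2 + ph_c**2))

# Synthetic check: five tilt settings (screw degrees), assumed optimum (-15, 23)
tilts = [(0, 0), (20, 0), (-20, 0), (0, 20), (0, -20)]
truth = (-15.0, 23.0, 0.95)
values = [truth[2] * math.exp(-((t - truth[0])**2 + (p - truth[1])**2) / 50.0**2)
          for t, p in tilts]
theta_c, phi_c, r_max = fit_cold_optimum(tilts, values, A=1.0)
```

With noisy data, more than three settings (five in the procedure above) overdetermine the same three unknowns and reduce the uncertainty on the optimum.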
To test the result of our alignment procedure, we align our cavity to the cold optimum and cool it down in our dilution fridge. We find an increase of the off-resonant reflectivity from 95% to 100% and an increase in resonant transmission from 50% to 64% (both quantities normalized to the values at warm optimum) as we cool our cavity down to ∼ 30 mK (see Fig. 1d in the main paper). Note that the values quoted here are averaged over the full ∼ 1 THz spectrum, while Fig. 1d only shows the first 100 GHz of these spectra, so the off-resonant reflection values do not correspond exactly.
Between the cavity alignment and the thermometry measurements presented in the main paper, 15 months passed during which the cavity was thermally cycled in our fridge nine times. To compensate for any slow drifts of the alignment, we redetermined the 'warm optimum' (the input and transmission lens optima at room temperature) before each cooldown and then changed our tilts by the warm-to-cold-optimum shifts that we found during our initial alignment.
We should mention that this alignment only works if the cavity expands and contracts reproducibly during a cooldown, that is, if it returns to its original position when warmed up. This was the case, but only after a first 'settling' cooldown, during which we infer that mechanical elements overcome some stresses introduced during assembly, allowing them to remain in position in subsequent cooldowns. Furthermore, the unmounting of an element from our experiment bracket, or even the loosening and retightening of a mounting screw, would usually lead to a loss of the calibration for that element. A new cold optimum would then have to be found by repeating the test cooldowns.
Finally, we did observe a decrease of our off-resonant reflectivity at base temperature between the first fridge cooldown and the second. Afterwards, this reflectivity remained stable at ∼ 85% of the optimal room-temperature value over the course of 9 cooldowns. We do not attribute this decrease to input lens misalignment, which would lead to asymmetric Fano lineshapes for our cavity resonances that we do not observe. Instead, we attribute it to a failure of the anti-reflection coatings on our input GRIN lens or fiber ferrule, which are not specified to such low temperatures. Such degradation does not appear to occur for our mirror coatings, since we do not observe a systematic broadening of the cavity linewidth.

D. Selected Equipment
The 1550 nm light was created by a Toptica CTL 1550 tunable laser, intensity- and phase-modulated by Optilabs IM-1550-20-PM and iXblue MPX-LN-0.1 modulators, and sent to the experiment in a Bluefors LD400 dilution refrigerator. The local oscillator light was frequency-shifted by an iXblue MXIQER-LN-30 IQ modulator acting as a single-sideband modulator, which was driven by a Keysight MXG N5183B signal generator. For OMIT and OMIA spectra, the intensity modulator was driven by a Keysight P5004A vector network analyzer, which received its input signal from a Thorlabs RXM25AF photodetector. The spontaneous scattering signals were captured by a Thorlabs PDB570C auto-balanced photodetector and digitized by a Signalhound SM200A electrical spectrum analyzer.
The optomechanical cavity was formed by a 5 mm thick flat-flat z-cut quartz crystal from Rocky Mountain Instrument Co. between two > 99.9% reflectivity mirrors from Layertec.The quartz crystal is separated from the flat front mirror by a 0.2 mm thick Teflon spacer.The back mirror with a radius of curvature of 25 mm was mounted into a JPE cryo tip-tilt piston stage driven by three CLA 2201 stick-slip piezo actuators.The in-and outcoupling GRIN lens assemblies were mounted into Thorlabs POLARIS-K05F6 mounts.

OMIT and OMIA for thermometry corrections
As described in the main text, we perform optomechanically induced transparency and amplification measurements before and after the sideband thermometry measurements to characterize the optomechanical coupling. Fig. S5 shows the average of 20 OMIT/OMIA spectra recorded before and after the measurements in Fig. 3 of the main text. We observe that at ∼ 4 K, a slight shift of the optical mode spacing occurred during the measurement, which we attribute to a thermal expansion of the experiment. At mK temperatures, no significant shift is visible. We attribute this to the fact that the helium circulation still provides active cooling to the experiment at mK, whereas it is turned off at 4 K. The spectra taken at millikelvin temperatures (Fig. S5c,d) were recorded right before and after the sideband asymmetry measurements presented in Fig. 3d-f. Compared to the 4 K measurements, more noise is visible, which comes from vibrations induced by the helium mix circulation pumps.

Fitting OMIT and OMIA spectra
We fit the averaged OMIT/OMIA spectra to extract the relevant parameters, such as the mechanical peak positions with respect to the optical resonance, the optical and mechanical linewidths and the optomechanical coupling rates. The transmitted probe tone intensity spectrum for an optical cavity coupled to a single mechanical mode is given by Eq. (S23) [30], where ∆ 21 = ω 2 − ω 1 is the frequency detuning between the two optical modes, κ is the optical linewidth, g is the cavity-enhanced coupling rate, Ω m is the mechanical frequency, Γ m is the intrinsic mechanical linewidth and A 0 is the transmission amplitude. We have assumed our pump laser to be resonant with one of the optical modes. The effective mechanical linewidth Γ m,eff , which includes optomechanical backaction and is shown in Fig. 2d,e of the main paper, is calculated using Eq. (S40). The last term in the denominator enters with a plus (minus) sign and causes a narrow dip (peak) on the broad optical resonance when the pump is locked to the low (high) frequency optical mode at ω 1 (ω 2 ), corresponding to the case of OMIT (OMIA).
For each OMIT/OMIA measurement, the recorded traces are preprocessed in multiple stages and several fits are performed to extract the optical resonance and the mechanical resonances. First, a simple Lorentzian lineshape (the 'optical fit') is fitted to the broad optical mode by ignoring the data points near the mechanical peaks (see Fig. S6a,e). With the optical mode parameters ∆ 21 , κ and A 0 fixed by the optical fit, we then fit the region around each mechanical peak (dip) individually using Eq. (S23), as shown in Fig. S6b-d,f-h (the 'mechanical fits'). The uncertainties on the fit parameters are propagated using linear error propagation when using them for the signal corrections as described in Section G.
There are two further sources of error on the position of the optical resonance which are not captured by a single fit. First, any change in parameters between before and after the thermometry measurement, as discussed in Section E 1, is taken into account by taking the average parameter value and adding an error of half the change to either side. Second, at mK, there are sinusoidal oscillations of the optical resonance spacing ∆ 21 due to noise from the turbo pumps that circulate the helium mix (see Fig. S5c,d). These oscillations mostly cancel out when averaging multiple OMIT/OMIA spectra, but are clearly visible in a single trace. From a single OMIT trace taken for the mK measurement in Fig. 3d-f, we estimate the amplitude of the signal mode frequency fluctuations by using the amplitude of the oscillations of the transmitted intensity and the slope of the trace at this position. We find a conservative estimate for the root mean square amplitude of the frequency fluctuations of 0.266 MHz, which we add to the uncertainty on ∆ 21 for every OMIT/OMIA measurement taken at mK temperatures. This source of error is roughly equal to the total contribution of the other error sources when calculating the error bars for the thermal mode occupations.

In the following we will derive the signal observed on the electrical spectrum analyzer (ESA, see Fig. 2a) created by optomechanical up- or downconversion when pumping either the red or the blue mode with a strong classical pump tone. We will limit the derivation to the interaction with a single mechanical mode b for notational brevity, even though in the experiment we observe optomechanical coupling to multiple mechanical modes b m . However, since the mechanical modes are well separated in frequency, we can treat them as independent. Although the different mechanical modes exhibit similar linewidths, their single-photon coupling rates g 0 are strongly modulated, as found previously by Kharel et al. [29].
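The two additional error contributions to ∆ 21 discussed above can be sketched as follows. This is a hedged illustration, not the authors' analysis code: the first helper implements the "average plus half the change" rule, and the second converts an intensity oscillation amplitude into a frequency fluctuation via the local slope of the trace. The division by √2 assumes a purely sinusoidal oscillation, and all names and example numbers are hypothetical.

```python
import math

def combine_before_after(before, after):
    """Average of a parameter measured before and after a run,
    with half the change taken as the error to either side."""
    return 0.5 * (before + after), 0.5 * abs(after - before)

def rms_frequency_fluctuation(intensity_amplitude, slope):
    """RMS frequency fluctuation from the peak amplitude of the
    intensity oscillation and the local slope dI/df of the trace.
    Assumes a sinusoidal oscillation (rms = amplitude / sqrt(2))."""
    return intensity_amplitude / abs(slope) / math.sqrt(2)

# Illustrative numbers only (not measured values)
mean_val, half_change = combine_before_after(10.0, 12.0)
df_rms = rms_frequency_fluctuation(0.02, 0.1)  # slope in intensity per MHz
```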
The system model consists of two optical modes â1/2 and a mechanical mode b coupled with rate g 0 .The optical modes have linewidths κ 1/2 = κ int 1/2 + κ ext1 1/2 + κ ext2 1/2 , with κ int 1/2 the internal losses, κ ext1 1/2 the rate with which light from âext1 1/2,in is coupled into the cavity mode, and κ ext2 1/2 the rate with which light exits the cavity into the propagating mode âext2 1/2,out leading to the detector.A strong classical pump tone with amplitude α p,in and frequency ω p drives one of the two modes, designated â1/2 for the red (blue) pumping case.The full Hamiltonian for this system is The Langevin equation of motion for the pump mode is thus where we have assumed weak single photon coupling to neglect the term ig 0 â2/1 b.Neglecting also quantum fluctuations of the pump mode amplitude, i.e. replacing â1/2 → ā1/2 , we solve for the classical pump mode amplitude where ∆ p,1/2 = ω p − ω 1/2 and α cav p is the intra-cavity amplitude of the pump mode.Thus we identify α cav as the number of intra-cavity photons of the pump mode.For simplicity, we define the time t to absorb the complex phase of α cav p such that α cav p = N 1/2 is real.We insert ā1/2 for â1/2 in Eq. (S24) to obtain for blue pumping. (S27) where we defined the cavity-enhanced coupling rates g r/b = g 0 N 1/2 for the red (blue) pumping case.Going to the rotating frame with respect to ω p â † 2/1 â2/1 and assuming that the pump beam is on resonance, i.e.
where ∆ 21 = ω 2 − ω 1 > 0. This leads to the following Langevin equations of motion for the signal mode â2/1 and the mechanical mode b: where we consider the mechanical mode only coupled to one single loss channel, and defined the coupling terms Applying the Fourier transformation defined as and defining the optical noise input to the signal mode where and we define the effective mechanical frequency and linewidth, modified by optomechanical backaction, as Inserting Eq. (S36) back into Eq.(S34) yields â2/1 (ω) = 1 (S47) Finally, Eq. (S44) together with âext2 2/1,out (ω) = âext2 2/1,in (ω)− κ ext2 2/1 â2/1 (ω) then yields the signal in the optical output mode âext2 2/1,out (ω).We can write the relations between all mechanical and optical input and output modes in terms of a scattering matrix: Due to energy conservation, the magnitude squared of the four scattering matrix elements S r/b 2j add up to one, although in the blue pumping case, S b 23 (ω) 2 enters with a minus sign (recall the different definitions of Br/b in/out (ω) in Eq. (S48)).
For later reference, we now prove the relation For this, we write down the scattering matrix elements for the blue pumping case: where So the magnitude squared of the scattering matrix elements are Here, we identify and therefore the last term in Eq. (S60) becomes Using Dividing by , adding Γ m and subsequently multiplying by κ1 By comparing the left side of this equation with Eqs.(S40) and (S47), we identify Γ m − Γ b m as δΓ b m (−ω).inserting −ω into Eq.(S43), we find which is just the right side of Eq. (S70), so Eq.(S52) holds.

Output voltage of the balanced detector
The output mode of the optical cavity is collected into the single-mode fiber that guides the signal to the detector with an amplitude collection efficiency of √ η.Before the detector, the signal is split by a beamsplitter with intensity transmission T into two paths.From the second input port of the beamsplitter, a second mode is added, in this case the strong local oscillator with amplitude âLO = α LO e −i∆ r/b LO t in the rotating frame of the pump laser, where In our case, the local oscillator is tuned close to the signal frequency such that ∆ r/b LO /2π ∼ ±12.5 GHz.The signal in the mode impinging onto one of the detectors is thus In the following, we will denote âext2 2/1,out as âs , ω 2/1 as ω s and ∆ r/b LO as ∆ LO for clarity, until it becomes necessary again to distinguish the red/blue pumping cases.We will first derive the voltage V 1 of one of the photodiodes in our auto-balanced detector, and later consider the effect of substracting the photocurrents from both photodiodes.The voltage produced by a photodiode upon light absorption is given by where from the first to the second line we dropped two of the four total cross-terms, since the correlator of any bath operator (or its Hermitian conjugate) with itself is zero.Also, we assume that all optical and mechanical baths are uncorrelated to each other.Going forward, we use the same arguments, as well as the correlators corresponding to negligible optical bath occupancy and mechanical mode occupancy where k goes over the different internal and external optical baths.We also note that Â † (ω − ∆ LO ) Â(ω + ∆ LO )) = Â † (ω) Â(ω ) , and Inserting what we found previously for âs (ω) thus gives As shown in Section F 1, the following two relations for the scattering matrix elements hold: This leaves us with the final expression for the power spectral density of the voltage generated by the balanced detector where N r = n th (N b = n th +1) for red (blue) pumping captures the well-known asymmetry of the two 
scattering signals.
The constant offset contributes to the shot noise level.For an accurate prediciton of the true shot noise level, one would have to treat the losses leading to the collection efficiency η as an effective beamsplitter onto which vacuum noise is impinging from the other port, but we don't do this here.We note that S The power displayed by the spectrum analyzer is SV V (ω)/R L where R L = 50 Ω is the load resistance, integrated over the bandwidth of the intermediate filter of the spectrum analyzer, i.e. the resolution bandwidth RBW .We operate in the limit where the resolution bandwidth is much smaller than the width of the signal peak (5 kHz vs. ∼ 50 kHz), such that we can replace the integration by simply multiplying with the resolution bandwidth.The final expression for the power displayed on the spectrum analyzer is thus: from fits to the cavity reflection spectra (see Section G 2), and the other parameters are obtained from fits to OMIT and OMIA spectra.Thus, we can divide these parameters out and obtained the corrected integrated signals We solve this for the thermal mode occupation n th and obtain This assumes that n th is the same for both red and blue pumping measurements, which is ensured by waiting for the fridge to return to similar temperatures and pressures after the pulse tube is turned off for a previous measurement.
For measurements during fridge warmup, this assumption is not true anymore, which is why we employ a different method to extract n th , see Section H. Finally, we note that when we show the corrected thermometry signals in Fig. 3a-f, instead of dividing out all the prefactors mentioned above, we multiply the red spectrum (not including its baseline of 1) by the ratio of the prefactors for the blue and red data.This ensures that the amplitude of both peaks can still be compared to the baseline of 1 to estimate our signal-to-noise ratio, while the blue/red asymmetry is given by the ratio of the areas under the blue and red peaks, respectively.
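Because the corrected integrated signals scale as n th for red pumping and n th + 1 for blue pumping with a common prefactor, the occupation follows from their ratio as n th = I r /(I b − I r ). The sketch below implements this relation together with linear error propagation, assuming uncorrelated uncertainties on the two signals. The function and variable names are hypothetical, and the numbers are illustrative rather than measured.

```python
import math

def thermal_occupation(i_red, i_blue, sig_red=0.0, sig_blue=0.0):
    """Thermal occupation from corrected integrated sideband signals.

    Uses I_red ~ n_th and I_blue ~ n_th + 1 with a common prefactor,
    so n_th = I_red / (I_blue - I_red). Uncertainties are propagated
    linearly, assuming the two signals are uncorrelated.
    """
    diff = i_blue - i_red
    n_th = i_red / diff
    # dn/dI_red = I_blue / diff^2,  dn/dI_blue = -I_red / diff^2
    sigma = math.hypot(i_blue * sig_red, i_red * sig_blue) / diff**2
    return n_th, sigma

# Illustrative numbers: a blue/red ratio of 3 gives n_th = 0.5
n_th, n_err = thermal_occupation(1.0, 3.0, 0.1, 0.1)
```

The uncertainty grows quickly as the two signals approach each other, which is why the asymmetry is easiest to resolve at low occupation.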

Optical input coupling characterization
According to Eq. (S92), we need to correct for differences in the external coupling rates κ ext1 1/2 , κ ext2 1/2 .As the integrated signal for red (blue) pumping I r (I b ) is proportional to κ ext1 corr /I r corr .While, in principle, the external coupling rates can be determined from the cavity reflection and transmission spectra, as done in [30], this is not possible in the presence of unknown losses due to misalignments or e.g.fiber transmission.We can, however, determine the ratio of external coupling rates of our two modes, which is sufficient for the purpose of this correction.
As discussed in Section C, in the presence of fiber losses and a mismatch between input or output optics and the cavity modes, the cavity reflection is described by Eq. (S6). This can be rewritten as Eq. (S97), with R 1,|∆|≫κ the reflection far from resonance and S = |(s 12,1 s 21,1 )/(s 12,1 s 21,1 + s 11,1 )|. Equation (S97) describes a Fano resonance [42], with φ the Fano phase that determines the asymmetry of the resonant feature. By fitting this expression to the reflection spectra of our red and blue cavity modes, we can obtain R 1,|∆|≫κ , φ, κ, ω 0 and the product S κ ext1 . Since only this product enters, we cannot uniquely determine the input coupling rate through a reflection fit. If we assume, however, that S 1 and S 2 are frequency-independent within the frequency range spanning our two cavity modes, we can determine the ratio κ ext1 1 /κ ext1 2 through a fit of both cavity modes. Similarly, by fitting the reflection spectra of both modes with the laser entering from port 2 (the 'back' side of the cavity), we can find the ratio κ ext2 1 /κ ext2 2 . We therefore record the reflection spectra of both our red and blue modes, illuminated through port 1 and port 2, to obtain the ratios necessary for the thermometry signal correction. Such spectra are recorded for every cavity setting at which we do thermometry, i.e. after alignment and tuning of the mode pair to the displacement-insensitive point (see Section B), both at 4 K and at mK temperatures.
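To illustrate why only the product S κ ext1 is accessible, the sketch below writes the reflection in a Fano form that is consistent with the definition of S given above. It assumes the input-output convention a_out = a_in − √κ_ext a and is a reconstruction, not necessarily the exact parametrization of Eq. (S97). The function and argument names are hypothetical.

```python
import cmath

def fano_reflection(delta, r_off, phi, kappa, s_kappa_ext1):
    """Reflection near one cavity mode in an assumed Fano form:
    R = R_off * |1 - S*kappa_ext1 * exp(i*phi) / (kappa/2 - i*delta)|^2.
    S and kappa_ext1 enter only through their product."""
    dip = s_kappa_ext1 * cmath.exp(1j * phi) / (kappa / 2 - 1j * delta)
    return r_off * abs(1 - dip) ** 2

def coupling_ratio(s_kappa_mode1, s_kappa_mode2):
    """kappa_ext1 ratio of two modes, assuming S is the same for both."""
    return s_kappa_mode1 / s_kappa_mode2

# Illustrative values (arbitrary units, not fit results)
r_far = fano_reflection(1e6, 0.8, 0.3, 2.0, 0.5)   # far off resonance
r_res = fano_reflection(0.0, 0.8, 0.0, 2.0, 0.5)   # on resonance, phi = 0
ratio = coupling_ratio(0.5, 0.4)
```

A nonzero φ makes the dip asymmetric, while the frequency-independent factor S cancels in the ratio of the two fitted products.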

H. Warmup measurements
For the main results of the paper, the phonon mode occupation is extracted by observing the asymmetry between the (corrected) integrated Stokes and anti-Stokes scattering signals (see Eq. (S95)). This assumes, however, that the phonon mode temperature is the same between both measurements, which is not true during fridge warmup. Thus, we employ a different method which is based on interpolating the signals from a reference measurement pair before warmup. We take a pair of measurements (red and blue pumping) shortly before starting the fridge warmup, and extract the thermal mode occupation n th in the usual way according to Eq. (S95). We call this the reference measurement pair. For subsequent measurements at higher temperatures, we expect the corrected integrated signals to scale with n th for red pumping and with n th + 1 for blue pumping (see Eqs. (S93) and (S94)). Thus, by assuming β stays constant during fridge warmup, we can extract n th from just a single measurement via Eq. (S98). During fridge warmup, there are periods of time where the experiment temperature stays on a plateau that is long enough for two measurements, such that we can declare them as a new reference measurement pair (see hollow markers in Fig. 4c). The measurements coming afterwards then refer back to this reference pair for the interpolation according to Eq. (S98). To test whether defining these new reference measurement pairs is valid, we perform the same analysis on the same data but using only the reference measurements from before the start of the fridge warmup. We observe the same qualitative behaviour as explained in the main text, see Fig. S7a. In a cooldown separate from the one in which all data in the main text was taken, Eccosorb foam was added into the copper heat shield that surrounds the experiment. This was done in the hope of better thermalizing the blackbody environment to the experiment temperature. However, no significant effect on the crystal temperatures was observed, as can be seen from the qualitatively identical results in Fig. S7b.
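The interpolation from a reference measurement pair can be sketched as follows. The sketch assumes that the corrected signals obey I r = β n th and I b = β (n th + 1) with a single prefactor β, so that the reference pair fixes β = I b − I r . This is a reading of the method described above, not a reproduction of Eq. (S98), and all names and numbers are hypothetical.

```python
def reference_beta(i_red_ref, i_blue_ref):
    """Prefactor beta from a reference pair, assuming
    I_red = beta * n_th and I_blue = beta * (n_th + 1)."""
    return i_blue_ref - i_red_ref

def occupation_from_single(i_corr, beta, pump):
    """Thermal occupation from a single corrected signal,
    assuming beta is unchanged since the reference pair."""
    if pump == "red":
        return i_corr / beta
    if pump == "blue":
        return i_corr / beta - 1.0
    raise ValueError("pump must be 'red' or 'blue'")

# Illustrative numbers: reference pair with n_th = 0.5
beta = reference_beta(1.0, 3.0)
n_from_red = occupation_from_single(4.0, beta, "red")
n_from_blue = occupation_from_single(6.0, beta, "blue")
```

Declaring a new reference pair on a temperature plateau amounts to recomputing β there, which guards against slow drifts of the prefactor during the warmup.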

I. An equivalent thermal circuit model to describe the crystal temperature
To test whether blackbody radiation by a higher-temperature stage could explain the elevated occupations we observe at mK temperatures in Fig. 3 of the main paper, we develop an equivalent thermal circuit model and use it to fit the data from our warmup measurement, shown in Fig. 4c of the main paper. We assume that such blackbody radiation will be dominated by the still plate, which is the warmest plate that can easily exchange radiation with our experiment, since radiation from higher plates is shielded by the still plate and still can. Heat exchange between the crystal and its mount (the temperature of which we monitor with our experiment thermometer) happens either through blackbody radiation or through conductive exchange through the front mirror and Teflon spacer that the crystal is clamped to.
Under these assumptions the equivalent thermal circuit for our system can be drawn as in Fig. S8.Temperatures of the crystal, front mirror (the facet that is facing the crystal), mount, and still plate are represented by voltages V c , V f , V m and V s , respectively.The thermal resistance between front mirror and mount (front mirror and crystal) is given by the resistance R f −m (R c−f ), while front mirror and crystal each have a thermal capacitance (i.e.heat capacity) of C f and C c .The voltages V m and V s are set by voltage sources to the values we measure on our experiment measurement time such that we measure our crystal always in the steady state, i.e.Vc = Vf = 0.This seems reasonable, given how fast the crystal temperature follows the sharp increase in mount temperature at t ∼ 55 min in Fig. 4c of the main paper.We may then solve Eq. (S101) for V f and plug that into Eq.(S100) to eliminate V f (which is unknown) and obtain where If we now plug in the temperature dependencies of the blackbody currents as in Eq. (S99) (and similarly for the other currents), we obtain the following quartic equation for V c : Here, as discussed above, the thermal resistance R c−m is given as the sum of three terms with different temperature dependencies, i.e.
To test whether this model can predict the temperatures we measure for our mechanical modes (taken to represent the crystal temperature) during the fridge warmup, we try to fit the model to the temperature data shown in Fig. 4c of the main paper. At each time when we have a temperature measurement of the crystal, we find the corresponding mount and still temperatures by interpolating the temperature sensor data. This gives us a data set of input voltages (temperatures) $\{V_m, V_s\}$ and resulting crystal voltages $\{V_c\}$. We find crystal voltages from our model for each combination of $V_m$ and $V_s$ by finding the four roots of Eq. (S103), postselecting the positive real roots between $V_m$ and $V_s$ (values outside that range are unphysical) and then picking the smallest root if there are multiple candidates. We also tried picking the largest root, but this does not change the result. Initial guesses for the fit parameters (the contributions to $R_{c-m}$, together with $b_{s-c}$ and $b_{m-c}$) are based on literature values of low-temperature conductivities of Teflon and fused silica, and on the physical dimensions of the crystal and other elements involved.

An automated minimization routine did not manage to find a good fit to our data. We therefore manually adjusted the fit parameters to see if we could make the model fit the data. We were able to find parameters for which we were in the regime dominated by conductive heat transport, where the crystal always follows the mount temperature. By increasing $b_{s-c}$ we can go to a regime where radiation from the still plate dominates and the crystal follows the still temperature closely. Picking an intermediate value for $b_{s-c}$ brings us into the regime where the crystal temperatures lie somewhere in between those of the mount and the still. However, regardless of what values we choose for our parameters in that region, we cannot reproduce the measured temperature behaviour of our crystal, which is to follow the mount temperature closely at high mount temperatures ($T_m \gtrsim 400$ mK), yet to stabilize around a crystal temperature of $T_c \approx 400$ mK when the mount temperature drops below $\sim 400$ mK. We also tried setting $R_{c-m}$ to infinity and increasing $b_{m-c}$ to balance blackbody radiation from the still and to the mount, but this also cannot explain our data.
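The steady-state solve described above can be illustrated with a minimal Python sketch. Eq. (S103) is not reproduced in this text, so the heat balance used here is an assumption: radiative exchange with the still and the mount scaling as the difference of fourth powers, plus conduction through a single resistance `r_cm`. All function and parameter names are hypothetical. Under this assumed form the net heat flow decreases monotonically with the crystal temperature, so the sketch uses bisection between the mount and still temperatures instead of the four-root search and postselection used in the analysis.

```python
def net_heat_flow(v_c, v_m, v_s, r_cm, b_sc, b_mc):
    """Net heat flow into the crystal (ASSUMED form, arbitrary units)."""
    radiation_from_still = b_sc * (v_s**4 - v_c**4)
    radiation_from_mount = b_mc * (v_m**4 - v_c**4)
    conduction_to_mount = (v_c - v_m) / r_cm
    return radiation_from_still + radiation_from_mount - conduction_to_mount


def steady_state_crystal_temperature(v_m, v_s, r_cm, b_sc, b_mc, tol=1e-12):
    """Find the crystal temperature at which the net heat flow vanishes.

    The net flow is non-negative at the colder of (v_m, v_s) and
    non-positive at the warmer one, so the root is bracketed by them.
    """
    lo, hi = min(v_m, v_s), max(v_m, v_s)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_heat_flow(mid, v_m, v_s, r_cm, b_sc, b_mc) > 0:
            lo = mid  # crystal still heats up at mid: root lies above
        else:
            hi = mid  # crystal cools at mid: root lies below
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Illustrative numbers only (kelvin and arbitrary coupling units).
    for b_sc in (0.0, 0.1, 1.0, 10.0):
        t_c = steady_state_crystal_temperature(0.2, 0.8, 1.0, b_sc, 0.0)
        print(f"b_sc = {b_sc:5.1f} -> T_c = {t_c:.3f} K")
```

Sweeping `b_sc` in this way moves the solution from the conduction-dominated regime, where the crystal follows the mount, toward the radiation-dominated regime, where it follows the still.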
We believe that the reason our model cannot predict the qualitative temperature behaviour we measure is the following. Let us assume that blackbody radiation from the still causes the elevated crystal temperatures at low mount temperatures. That means that at those low temperatures, heating by this blackbody radiation and cooling by the mount balance each other. However, blackbody radiation grows in proportion to $T_s^4$, while the heat flow through $R_{c-m}$ can grow at most in proportion to $T_m^2$ (in the case where $R_{c-m}$ is dominated by $R_{c-m,2}$). Thus, as time progresses during the warmup and both still and mount rise in temperature, heat flow into the crystal by blackbody radiation from the still must grow compared to the removal of heat from the crystal through $R_{c-m}$, thus bringing the crystal temperature closer to that of the still. Even if the heat exchange between mount and crystal were dominated by blackbody radiation (and would thus be proportional to $T_m^4$), we still cannot obtain the desired behaviour. In that case, neglecting the contributions of $I_{m-c}$ and $I_{c-s}$, the ratio $T_s^4/T_m^4$ determines the ratio of input to output heat flows for the crystal. So if $T_s$ and $T_m$ grow by the same factor, this ratio is conserved. However, we see in Fig. 4c that in the beginning of the warmup (around $t \sim 10-20$ min), the still stage grows more rapidly in temperature, while only in the last phase the mount temperature grows faster. Such a pure blackbody radiation model thus always predicts the crystal temperature to approach the still temperature more closely in this early phase of the warmup, which is not what we observe.
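The scaling argument in this paragraph can be written compactly. As a sketch, let the heating power scale as $T_s^4$ and the cooling power as $T_m^k$, with $k \le 2$ for conduction through $R_{c-m}$ and $k = 4$ for purely radiative exchange with the mount. The symbols $P_\mathrm{in}$, $P_\mathrm{out}$, $k$ and the common scale factor $\lambda$ are introduced here for illustration only.

```latex
\frac{P_\mathrm{in}}{P_\mathrm{out}} \propto \frac{T_s^4}{T_m^k}
\quad\longrightarrow\quad
\lambda^{4-k}\,\frac{T_s^4}{T_m^k}
\qquad \text{under } T_s \to \lambda T_s,\; T_m \to \lambda T_m ,
\\[4pt]
\lambda^{4-k} \ge \lambda^{2} > 1 \;\; (k \le 2,\ \lambda > 1),
\qquad
\lambda^{4-k} = 1 \;\; (k = 4).
```

For conductive cooling the heating term therefore gains on the cooling term whenever both stages warm up together. For radiative cooling the ratio changes only if the two temperatures grow by different factors.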
We thus conclude that radiation from an object at the temperature of the still plate cannot explain why our crystal temperatures are higher than those of the mount. Neither can radiation from higher-temperature stages, since those warm up even more rapidly than the still stage during the warmup. A source of heat that would be consistent with our data, however, is radiation from a poorly thermalized still shield. This shield surrounds our experiment, is long and thin-walled, and is heated by radiation from the 4 K stage and its shield, so it is likely that it is not always at the same temperature as the still stage. One would then expect it to be at higher temperatures than the still plate during normal operation, but to lag behind the increasing temperature of the still plate during the fridge warmup, which could help explain the behaviour we see.

J. Laser phase noise measurements
Laser phase noise produces sidebands that can appear in sideband thermometry measurements. If the noise is at the same frequency as the mechanical sideband, it can, depending on relative phase conditions, increase or decrease the measured sideband asymmetry in optomechanical experiments in addition to affecting the actual mechanical mode occupation [37].
To account for this contribution, we follow the procedure described in [37] to evaluate the influence of laser phase noise on the detected phonon occupancy in sideband asymmetry experiments. We use a calibrated electro-optical phase modulator (EOM) and a Mach-Zehnder interferometer (MZI) to generate a calibrated phase modulation signal that is then used to quantify the phase noise intrinsic to the laser (see Fig. S9a). The same Toptica CTL 1550 tunable laser at 1550 nm as used for our sideband thermometry measurements is used as the light source in this phase noise measurement, at a power of 673 µW. The EOM is calibrated using a fiber-based narrow band-pass filter. The relative transmission peak size of carrier and sidebands allows us to characterize the modulation depth $\beta$ of our modulator at the desired frequencies around the mechanical mode, using

$P_1/P_0 = J_1^2(\beta)/J_0^2(\beta)$,

where $P_0$ ($P_1$) is the power in the carrier (first sideband) and $J_k$ are the Bessel functions of the first kind. The phase modulation created by the EOM leads to frequency noise on the laser, with their noise power spectral densities related by $S_{\omega\omega}(\omega) = \omega^2 S_{\phi\phi}(\omega)$ [37]. If we use a sinusoidal modulation of frequency $\omega_\mathrm{mod}$, we create a frequency noise spectrum (in rad$^2$ Hz)

The laser light is then passed through a fiber MZI, with the laser locked to its half-max point, which converts the phase noise to amplitude noise. The MZI free spectral range is designed to be 36 GHz, which is well-suited to the noise frequencies of interest, which lie around the mechanical mode frequency of $\approx 12.66$ GHz. The MZI transmission is recorded on a fast photodetector (Thorlabs RXM25AF, operated with nominal gain of 1500 V/W), the output

As discussed in Section G, we infer the mechanical occupation from the ratio of the corrected red- and blue-pumped integrals through Eq. (S95), which we can also write as

where we assumed again that $\Delta_{21} = \Omega_m$ (such that $\Gamma^{r/b}_{m,\mathrm{eff}} = (1 \pm C)\Gamma_m$), and where we defined $\bar{I}_{r/b}$ to represent the integrals of the thermometry signals, corrected for all factors that differ between them except that of the difference in mechanical linewidth, i.e. $\bar{I}_r = \beta n_\mathrm{th}/\Gamma^{r}_{m,\mathrm{eff}}$ and $\bar{I}_b = \beta (n_\mathrm{th} + 1)/\Gamma^{b}_{m,\mathrm{eff}}$. We note that this description is equivalent to that used in [37], where the integrals are defined as $\bar{I}_r = \beta\, n^{+}_\mathrm{eff}$ and $\bar{I}_b = \beta (n^{-}_\mathrm{eff} + 1)$, with $\beta$ a common prefactor and $n^{\pm}_\mathrm{eff}$ the 'effective' mode occupations. In the absence of phase noise, these effective mode occupations correspond to the occupation including back-action of the laser. In the presence of phase noise, they can no longer be interpreted as such, and can be expressed as [37]

where we assumed again that $\Omega_m \gg \kappa$ and $\kappa = \kappa_1 \approx \kappa_2$. If we insert these expressions into $\bar{I}_r$ and $\bar{I}_b$ we can find from Eq. (S110) a new relation between the inferred occupancy $n^\mathrm{inf}_\mathrm{th}$ and the real occupancy $n_\mathrm{th}$ in the absence of laser light, which is Eq. (S113).

We can estimate how laser phase noise affects our thermometry measurements by inverting Eq. (S113) to express $n_\mathrm{th}$ as a function of $n^\mathrm{inf}_\mathrm{th}$, and assuming typical experimental values ($C = 0.15$, $\kappa_\mathrm{ext}/\kappa \approx 1/2$, $|E_0|^2 \approx 6.4 \times 10^{14}$ photons s$^{-1}$ and $n^\mathrm{inf}_\mathrm{th} = 0.4$). This leads to a real occupancy of $n_\mathrm{th} = 0.407$ or $n_\mathrm{th} = 0.417$ when using $S^\mathrm{av}_{\omega\omega}(\Omega_m)$ or $S^\mathrm{av+std}_{\omega\omega}(\Omega_m)$ for the frequency noise level, respectively. This shows that the difference between real and inferred occupations is negligible compared to the occupations we measure and their error bars. Similarly small relative differences are found for the occupations of $\sim 7$ that we measure at 4 K temperature.
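Two steps of this section lend themselves to a short numerical illustration: extracting the modulation depth from the measured sideband-to-carrier power ratio, and inferring the occupation from the corrected red/blue integrals. The Python sketch below is a minimal example under stated assumptions. It uses the standard phase-modulation relation $P_1/P_0 = J_1^2(\beta)/J_0^2(\beta)$, restricted to modulation depths below the first zero of $J_0$, and it uses only the definitions of $\bar{I}_r$ and $\bar{I}_b$ given above with $\Gamma^{r/b}_{m,\mathrm{eff}} = (1 \pm C)\Gamma_m$. It does not include the phase-noise correction of Eq. (S113), and all function names are hypothetical.

```python
import math


def bessel_j(order, x, terms=40):
    """Bessel function of the first kind J_order(x), from its power series."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k
                  / (math.factorial(k) * math.factorial(k + order))
                  * (x / 2) ** (2 * k + order))
    return total


def sideband_to_carrier_ratio(beta):
    """P1/P0 for pure phase modulation with modulation depth beta."""
    return (bessel_j(1, beta) / bessel_j(0, beta)) ** 2


def modulation_depth(power_ratio, beta_max=2.4, tol=1e-12):
    """Invert P1/P0 by bisection; valid below the first zero of J0 (~2.405)."""
    lo, hi = 0.0, beta_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sideband_to_carrier_ratio(mid) < power_ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def asymmetry_ratio(n_th, cooperativity):
    """I_b / I_r for I_r ~ n_th/(1+C) and I_b ~ (n_th+1)/(1-C)."""
    return (n_th + 1) / n_th * (1 + cooperativity) / (1 - cooperativity)


def occupation_from_ratio(ratio, cooperativity):
    """Invert asymmetry_ratio for n_th (no phase-noise correction)."""
    linewidth_factor = (1 + cooperativity) / (1 - cooperativity)
    return 1.0 / (ratio / linewidth_factor - 1.0)


if __name__ == "__main__":
    print(modulation_depth(0.01))  # depth for a sideband at 1% of the carrier
    ratio = asymmetry_ratio(0.4, 0.15)
    print(ratio, occupation_from_ratio(ratio, 0.15))
```

The power series keeps the example free of external dependencies. A library Bessel implementation would serve equally well.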

K. Effective mass
To calculate the effective mass of the measured phonon modes, we follow the same calculations as Bild et al. [45] but use updated parameters. In this experiment, the phonon mode we couple to is not strictly an eigenmode of the system, but is instead formed by a superposition of eigenmodes. The exact superposition is found by maximizing the coupling Hamiltonian $\hat{H}_\mathrm{int}$, a volume integral that couples $p\,\hat{S}(\vec{r})$ to the two optical fields, where $p$ is the photoelastic tensor, $\hat{S}(\vec{r})$ is the mechanical strain field, and $\hat{E}_{o,j}(\vec{r})$ and $\hat{E}_{o,j+1}(\vec{r})$ are the two optical modes. Since both electric field modes have approximately identical Gaussian shapes with width $w_0 \approx 77$ µm, the coupling Hamiltonian is an overlap integral of the strain field with an effective field with Gaussian width $w_0/\sqrt{2}$. Thus, the superposition of mechanical eigenmodes will form an effective mode field that has the same width $w_0/\sqrt{2}$. Since the Rayleigh length of the optical modes ($\sim 18.4$ mm inside quartz) is larger than the crystal thickness of $L = 5$ mm, we treat the mode field diameter as constant.
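The two geometric statements in this paragraph can be checked with a few lines of Python. This is a sketch under stated assumptions: the refractive index of quartz at 1550 nm is taken as roughly 1.53 (a value not given in the text), $w_0$ is interpreted as the usual Gaussian beam radius, and the function names are illustrative. With these inputs the Rayleigh length comes out close to the quoted 18.4 mm, and the product of two identical Gaussians of width $w_0$ has width $w_0/\sqrt{2}$.

```python
import math

WAVELENGTH = 1550e-9      # m, vacuum wavelength of the laser (from the text)
WAIST = 77e-6             # m, optical mode width w0 (from the text)
CRYSTAL_THICKNESS = 5e-3  # m, crystal thickness L (from the text)
N_QUARTZ = 1.53           # ASSUMPTION: approximate index of quartz at 1550 nm


def rayleigh_length(waist, wavelength, refractive_index):
    """Rayleigh length of a Gaussian beam in a medium: pi * w0^2 * n / lambda."""
    return math.pi * waist**2 * refractive_index / wavelength


def product_width(width_a, width_b):
    """Width w such that exp(-r^2/w_a^2) * exp(-r^2/w_b^2) = exp(-r^2/w^2)."""
    return 1.0 / math.sqrt(1.0 / width_a**2 + 1.0 / width_b**2)


z_r = rayleigh_length(WAIST, WAVELENGTH, N_QUARTZ)
effective_width = product_width(WAIST, WAIST)

print(f"Rayleigh length: {z_r * 1e3:.1f} mm")
print(f"Effective width: {effective_width * 1e6:.1f} um")
print(f"Rayleigh length exceeds crystal thickness: {z_r > CRYSTAL_THICKNESS}")
```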
Knowing the shape of the mechanical strain field, we can equate the mechanical energy with the potential energy of an effective mechanical mode

Figure 1. Design and alignment of the cryogenic Brillouin cavity. (a) Schematic of the cryogenic Brillouin cavity. (b) Picture of the assembled cavity. See text for detailed description. (c) Frequency landscape. The optical modes are irregularly spaced due to reflections at the crystal surfaces. One mode pair is tuned to be separated by the Brillouin frequency $\Omega_B$. (d) Cavity reflection spectrum before (top) and after (bottom) cooldown, normalized to the maximum reflection when aligned by hand. The modes that approach zero reflection are the fundamental transverse modes. Higher-order Laguerre-Gaussian modes appear as shallower dips.

Figure 2. Observing optomechanical interaction through OMIT and OMIA measurements. (a) Setup used for OMIT/OMIA measurements and sideband asymmetry thermometry. The output of a tunable diode laser is split into a signal and a LO arm and is locked on resonance with one of the optical modes using a Pound-Drever-Hall (PDH) lock. When measuring OMIT/OMIA spectra, we use a vector network analyzer (VNA) to drive an intensity modulator (IM) in the signal arm, creating a weak probe tone detuned from the pump frequency $\omega_\mathrm{pump}$ by the VNA frequency $\Omega$. The beating of the transmitted pump and probe tones is recorded on a fast photodetector and sent to the VNA input. The transmission $S_{21}$ measured by the VNA is proportional to the probe transmission. For sideband thermometry, no probe tone is created and the weak spontaneously scattered signal is measured through balanced heterodyne detection, using an electrical spectrum analyzer (ESA) and a strong LO that is frequency shifted by a single-sideband modulator (SSBM). An optical switch allows for rapid switching between these two configurations. (b-c) Examples of OMIT (b) and OMIA (c) spectra measured at $\sim 200$ mK. Three mechanical modes are clearly visible as narrow dips or peaks on top of a broad cavity transmission peak. (d-e) Linear scaling of optomechanical cooperativity $C_m$ (d) and effective mechanical linewidth $\Gamma_{m,\mathrm{eff}}$ (e) with pump power, for the center mechanical mode (mode 1) at $\sim 12.6553$ GHz. Red and blue dash-dotted lines show linear fits. Black dotted line in (e) shows the intrinsic mechanical linewidth of $54.5 \pm 0.4$ kHz.

Figure 3. Sideband asymmetry thermometry at $\sim 4$ K and $\sim 200$ mK. (a-c) ESA spectra of the spontaneous Stokes (blue) and anti-Stokes (red) scattering at $T \sim 4$ K mediated by the three mechanical modes, measured with the pump laser locked to the blue and the red mode, respectively. Spectra are normalized to the noise floor and the red traces are corrected for several differences in pre-factors between red and blue pumped measurements (see supplementary information, Section G). Note that these frequencies are with respect to the LO, which is at a frequency $\Omega_\mathrm{LO}$ roughly 115 MHz from the scattered signal frequencies. (d-f) Same as (a-c), but measured at $T \sim 200$ mK. Here, the signal for mode 0 is weaker than that of modes 1 and 2 because of its lower coupling rate $g_0$ and because it is further off resonance from the optical mode (see Fig. 2b and c). (g) Thermal mode occupations $n_\mathrm{th}$ extracted from the red/blue asymmetries in (a-c) versus the mode cooperativity. The grey area indicates the expected occupation based on the experiment thermometer temperatures between start and end of the measurement. (h) Same as (g), but for the measurements shown in (d-f). The blue dashed line indicates $n_\mathrm{th} = 0.5$.

Figure 4. Thermometry with laser or fridge heating. Mode 0 is not shown due to its poor signal-to-noise ratio in these measurements. (a) Thermometry with laser preheating. Effective mode temperatures after heating with the pump laser for 2 minutes, for different laser powers at the cavity input. The red dashed line indicates the pump power used during the thermometry measurements. The grey area indicates the measured temperature range between start and end of each measurement. The blue dashed lines in panels (a) and (b) indicate $n_\mathrm{th} = 0.5$. (b) Pump power sweep. The measurement is similar to Fig. 3h, but the 4 min measurement window is split up into separate measurements with different pump powers. The grey area indicates the thermometer readings at the beginning and end of the entire sweep. Error bars are larger than in Fig. 3h because integration time per measurement is shorter. (c) Thermometry during fridge warmup. Red (blue) markers show effective mode temperatures extracted from single red (blue) pumping measurements. The hollow markers indicate which measurements were used as reference pairs. Horizontal error bars indicate the measurement time of 4 minutes. Blue (orange) lines show the temperature sensor readings at the experiment (still flange).

Figure S1. Vibration isolation stage. (a) Experimental setup mounted in the fridge. The cavity is mounted on the vibration stage, which is suspended by springs from the mixing chamber plate of the fridge. Failsafes surround the vibration stage and serve to mount the experiment rigidly during manual alignment. The experiment temperature sensor is attached to the front mirror mount. (b) Vibration spectra recorded by the cavity at room temperature and with the fridge open, with and without vibration isolation. Vibrations were generated by a vibration speaker placed at the still plate. Background spectra, i.e. noise due to other fridge vibrations or ambient sound not caused by the speaker, were recorded and subtracted before plotting. (c) Vibration spectra recorded by the cavity in an evacuated fridge at 4 K temperature, with and without the pulse tube running.

Figure S2. Dependence of cavity mode spacing on cavity length. The frequency spacings $\Delta f_c$ between 6 neighbouring modes (shown as 5 lines with different colors) are shown as function of cavity length changes $\delta L$. Each mode pair shows an oscillatory dependence on $\delta L$ with a period of roughly 1.3 µm. Dashed line indicates our Brillouin frequency. We use cavity dimensions similar to those of our experiment. Since the exact value of the total physical cavity length is unknown in the experiment, we set it to 10.4 mm to match the peak of the oscillations in $\Delta f_c$ with the Brillouin frequency, as is the case in the experiment. This length corresponds to $\delta L = 0$. To change $\delta L$, we change the distance between crystal and back mirror.

Figure S3. Experiment design and description using scattering matrices. (a) Exploded view of the cavity design. All metal mounting parts are made of stainless steel to minimize relative movement during cooldown, except the back mirror mount which is made of titanium. The crystal is clamped to the front mirror using a CuBe clamp and using Teflon spacers on each side. (b) Schematic of our cavity and its input and output channels. Scattering matrix $S_1$ describes the coupling between the input optics and the cavity port 1, and $S_2$ that between the transmission optics and the cavity port 2.

Figure S4. Determination and compensation of thermal misalignments. (a) Calibration fit of off-resonant reflection to Eq. (S8) for several room temperature input lens misalignments. From this fit we obtain fit parameter $A$, which sets the Gaussian width. (b) Calibration fit of resonant transmission to Eq. (S22) for several room temperature transmission lens misalignments. From this fit we obtain $D$, which sets the Gaussian width. (c) Off-resonant reflection fit for data taken at 78 K temperature, during 5 cooldowns with different input lens misalignments. This fit produces the input lens cold optimum. (d) Resonant reflection fit for data taken at 78 K temperature, during 5 cooldowns with different transmission lens misalignments (and with input lens and back mirror at cold optimum). This fit produces the transmission lens cold optimum. (e) Same as (c), but plotted as function of $d\theta_\mathrm{in}$ and $d\phi_\mathrm{in}$ separately, and with the origin set to the warm optimum. Color map shows the fit, circles show the data. The warm-to-cold optimum shift is the difference between the color map maximum and the origin. (f) Same as (e), but plotted as function of $d\theta_\mathrm{tr}$ and $d\phi_\mathrm{tr}$ separately, and with the origin set to the warm optimum.

Figure S5. OMIT and OMIA spectra before and after sideband thermometry. (a) and (b) show spectra taken at around 4 K, right before and after the sideband asymmetry measurements presented in Fig. 3a-c. (c) and (d) show spectra taken at millikelvin temperatures, right before and after the sideband asymmetry measurements presented in Fig. 3d-f. Compared to the 4 K measurements, more noise is visible, which comes from vibrations induced by the helium mix circulation pumps.

Figure S6. Fitting OMIT and OMIA spectra. (a-d) OMIT and (e-h) OMIA spectra taken after the thermometry measurement at $\sim 200$ mK presented in Fig. 3d-f. (a) and (e) show the full spectra with the fit to extract the optical resonance parameters. (b-d) and (f-h) show closeups of the regions used to fit the mechanical resonances individually.
ig r bin (ω) for red pumping ig b b † in (ω) for blue pumping.(S45) Note that by taking the Hermitian conjugate of b(ω) in the blue pumping case, the frequency argument of Ω b m and Γ b m flipped its sign, meaning Ω ω) for blue pumping.

m
ω) is already symmetric in frequency, such that Sr/b V V (ω) = S r/b V V (ω).Looking back at Eq. (S51), we observe that S .(S87) peaks at positive ω, the other one will not be shown in the spectrum.We also recall that in the blue pumping case, the sign of the frequency argument in Ω b m and Γ b m is flipped (cf.Eqs.(S46) and (S47)).So Ω are centered at ±∆ 21 , the same frequency as where S

1 )
, the ratio I b /I r must be multiplied with a

Figure S7. Warmup analysis with one reference measurement pair. Instead of 3 reference measurement pairs, just the one before the start of the warmup is defined, as indicated by the hollow markers. (a) Data is identical to that presented in the main text. (b) Data shown is from a different cooldown than the data in the rest of the paper, during which the experiment was surrounded by Eccosorb foam.
2 , b s−c and b m−c are based on literature values of low-temperature conductivities of Teflon and fused silica, and the physical dimensions of the crystal and other elements involved.

Figure S9. Laser frequency noise measurement. (a) Setup used to measure contributions of frequency noise in the spectrum. VOA: variable optical attenuator, EOM: electro-optical modulator, MZI: fiber Mach-Zehnder interferometer, PD: photodetector. (b) Comparing ESA spectra measured with (blue) and without (orange) the MZI, with the same optical power and the detector noise ($\sim 1.7$ dB below signals) subtracted. (c) Laser frequency noise spectral density, obtained by subtracting the blue and orange traces in (b) and multiplying by $A(\omega)$. Orange and green dashed lines show the average and the average plus one standard deviation, respectively, of the data within a 10 MHz span around $\Omega_m$.