Sideband cooling of molecules in optical traps

Sideband cooling is a popular method for cooling atoms to the ground state of an optical trap. Applying the same method to molecules requires a number of challenges to be overcome. Strong tensor Stark shifts in molecules cause the optical trapping potential, and corresponding trap frequency, to depend strongly on rotational, hyperfine, and Zeeman states. Consequently, transition frequencies depend on the motional quantum number and there are additional heating mechanisms, either of which can be fatal for an effective sideband cooling scheme. We develop the theory of sideband cooling in state-dependent potentials, and derive an expression for the heating due to photon scattering. We calculate the ac Stark shifts of molecular states in the presence of a magnetic field, and for any polarization. We show that the complexity of sideband cooling can be greatly reduced by applying a large magnetic field to eliminate electron- and nuclear-spin degrees of freedom from the problem. We consider how large the magnetic field needs to be, show that heating can be managed sufficiently well, and present a simple recipe for cooling to the ground state of motion.


I. INTRODUCTION
In recent years there has been rapid progress in the development of techniques for producing and manipulating ultracold molecules [1][2][3][4][5][6][7][8][9][10][11][12]. Arrays of molecules interacting via the dipole-dipole interaction can be used as a platform to study many-body quantum physics [13][14][15][16] or to implement two-qubit quantum gates [17]. Small arrays can be made using tweezer traps, and larger arrays using optical lattices. Molecules produced by association of ultracold atoms have been loaded into lattices at high enough filling factors to begin studying many-body effects [18]. Molecules have also been formed by associating pairs of atoms in tweezer traps [19].
Very recently, laser-cooled molecules were captured in tweezer traps for the first time [20]. To exploit the potential of these low-entropy arrays, it is necessary to initialize each molecule in a single quantum state. An important current challenge is how to cool these molecules to the ground state of motion in tweezer traps or lattices. This is frequently done for alkali-metal atoms using Raman sideband cooling [21,22], and these methods are now being extended to alkaline-earth-metal atoms [23,24]. Application of the same techniques to molecules is difficult because (i) the complex structure of the molecule tends to complicate all laser-cooling methods and (ii) molecules have large tensor Stark shifts, resulting in state-dependent trapping potentials that, in most circumstances, make sideband cooling impossible.
Raman sideband cooling consists of a repeated two-step process illustrated in Fig. 1. The first step drives a stimulated two-photon transition between a pair of internal states, reducing the motional quantum number $n$. In order to selectively drive the red sideband of the transition, the linewidth must be narrow compared to the energy spacing of the motional levels of the trapped atom; typical trap frequencies are of order 100 kHz. The second step provides the dissipation necessary for cooling by optically pumping the atom back to its original internal state.
There are two key requirements for effective cooling. Both are challenging for molecules. First, in order to cool from a thermal state, the frequency required to drive the stimulated transition must be independent of the motional state of the molecule. In alkali-metal atoms, which typically have very small tensor Stark shifts [25], it is straightforward to find pairs of internal states for which the trapping potentials are nearly identical. Provided the potential is sufficiently harmonic, the transition frequency is then independent of the initial motional state. Because molecules have large tensor Stark shifts, the trap frequency depends strongly on internal state, so the two-photon transition frequency depends on the motional state.
The second requirement is that the optical pumping step must have a high probability of preserving the motional quantum number. When the potentials are identical, different motional states of the two potentials are orthogonal and the probability of changing $n$ depends only on the Lamb-Dicke parameter, the square root of the ratio of the photon recoil energy to the level spacing of the traps. Provided this parameter is small, transitions that change the motional quantum number are strongly suppressed. However, for state-dependent potentials, the different motional states of the two traps are no longer orthogonal and the heating involved in scattering a photon has additional contributions associated with projection of the molecule from one potential to the other. Moreover, the additional complexity of a molecule compared to an atom means that optical pumping to a desired state often requires more photons to be scattered, each contributing to the heating.
FIG. 1. (a) Step 1: two-photon Raman transition with detuning set to change internal state from $g_A$ to $g_B$ while removing one quantum of motional energy. This step is forbidden when the molecule reaches the motional ground state. (b) Step 2: optical pumping via an electronic excited state returns the molecule to its original internal state. For cooling to work efficiently, the optical pumping should preserve the motional quantum number.
A potential advantage of state-dependent potentials is that they could enable projection cooling schemes [26,27]. For example, one could drive a rotational transition resonant only with molecules in the motional ground state followed by state-dependent detection to determine whether or not the molecule has made the transition. If it has, the molecule has been projected into the motional ground state; if it has not, we can reapply cooling light to scramble the motional state and try again. While such schemes may be useful for a single molecule, they scale poorly and so are not suitable for arrays. An active cooling method is required.
This paper is organized as follows. In Sec. II we outline the theory of sideband cooling in state-dependent harmonic potentials and establish a quantitative set of criteria for efficient cooling to take place. In Sec. III we outline the effective Stark shift operator for the interaction of molecules with the trapping light. The results are applied in Sec. IV to determine the energy levels of a simplified molecule in an idealized tweezer trap and consider how they can be engineered to meet the requirements for sideband cooling. In Sec. V we consider complications that arise when we include the complex structure of real molecules: we use CaF as a case study since it has already been loaded into a tweezer trap [20]. In Sec. VI we discuss potential complications arising from the light field produced by a real tweezer trap. Finally, in Sec. VII, we propose a complete recipe for Raman sideband cooling of laser-cooled molecules. The coordinate system and set of angles that we use throughout the paper are illustrated in Fig. 2.
FIG. 2. Schematic of the coordinate system used in this paper. The tweezer light propagates along $x$ and is (except where otherwise specified) linearly polarized along $z$. The $y$ axis is into the page. The quantization axis of the molecule $Z$, taken along the $B$ field in Sec. V onward, makes an angle $\beta$ to the $z$ axis. Inset: For the optical pumping step we consider the angles $\theta_{\rm abs}$ and $\theta_{\rm sp}$ that the incoming and outgoing photons make with one of the three trap axes, chosen to be $z$ in our illustration.

II. THEORY OF SIDEBAND COOLING IN STATE-DEPENDENT POTENTIALS
In this section, we develop the general theory of sideband cooling in state-dependent potentials in order to derive expressions for the frequency and amplitude of the transitions, and the heating arising from the optical pumping step. The Hamiltonian $H_0$ describing a molecule with a pair of ground states $|g_A\rangle$ and $|g_B\rangle$ in a state-dependent harmonic trap is
$$H_0 = (H_{tA} + \hbar\omega_0)\,|g_A\rangle\langle g_A| + H_{tB}\,|g_B\rangle\langle g_B|. \tag{1}$$
Here, $H_{ti}$ is the harmonic oscillator Hamiltonian associated with the external motion of a molecule in internal state $|g_i\rangle$, $i \in \{A, B\}$. The trap frequencies are $\omega_{ti}$ and the motional eigenstates are $|q\rangle_i$ such that $H_{ti}|q\rangle_i = \hbar\omega_{ti}(q + \tfrac{1}{2})|q\rangle_i$. $\hbar\omega_0$ is the energy difference between the minima of the two trapping potentials, shown in Fig. 1.

A. Raman step
In the first step of Raman sideband cooling, shown in Fig. 1(a), the two-photon detuning required to coherently transfer the molecule from $|g_A\rangle|n\rangle_A$ to $|g_B\rangle|n-1\rangle_B$ is
$$\Delta_{\rm coh} = \omega_0 + \omega_{tA}\left(n + \tfrac{1}{2}\right) - \omega_{tB}\left(n - \tfrac{1}{2}\right), \tag{2}$$
where we have assumed that the energy of $|g_A\rangle$ is above that of $|g_B\rangle$. We see that when the trap frequencies are different for the two internal states, $\Delta_{\rm coh}$ depends on the motional quantum number $n$. The matrix element for a transition between the two internal states via interaction with a light field is proportional to ${}_B\langle m|e^{i\Delta k z}|n\rangle_A$, where $\hbar\Delta k$ is the momentum kick imparted to the molecule from absorption of the photon. For two-photon transitions, $\hbar\Delta k$ is the difference in momenta between the absorbed and emitted photons. We can reexpress $\Delta k z$ as
$$\Delta k z = \Delta k\sqrt{\frac{\hbar}{2M\omega_{ti}}}\,(a_i^\dagger + a_i) = \eta_i\,(a_i^\dagger + a_i), \tag{3}$$
where $a_i^\dagger$, $a_i$ are the harmonic oscillator raising and lowering operators associated with $H_{ti}$ and $M$ is the mass of the molecule. We have also introduced the Lamb-Dicke parameter $\eta_i = \sqrt{E_{\rm rec}/\hbar\omega_{ti}}$, the square root of the ratio of the recoil energy along the trap axis, $E_{\rm rec} = \hbar^2\Delta k^2/2M$, to the energy spacing of the motional states of the trap. When this ratio is small, commonly referred to as the Lamb-Dicke regime, we can expand the matrix element in powers of $\eta_A$:
$${}_B\langle m|e^{i\Delta k z}|n\rangle_A = {}_B\langle m|n\rangle_A + i\eta_A\left(\sqrt{n+1}\;{}_B\langle m|n+1\rangle_A + \sqrt{n}\;{}_B\langle m|n-1\rangle_A\right) + O(\eta_A^2). \tag{4}$$
When the potentials associated with the two states are identical, we have ${}_B\langle m|n\rangle_A = \delta_{m,n}$ and, to the order shown, the strength of the first-order sidebands $m = n \pm 1$ is proportional to $\eta_A^2$. Under these conditions, transitions that change the motional quantum number are strongly suppressed. By using a two-photon Raman transition as shown in Fig. 1(a), $\Delta k$, and therefore $\eta_i$, can be varied by changing the relative directions of the two photons. Counterpropagating optical photons give sufficiently large $\Delta k$ to allow higher-order sidebands to be addressed. In general, the two harmonic oscillator potentials will have different trap frequencies and different equilibrium positions; an explicit expression for the overlap integral in this case can be found in [28].
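The Lamb-Dicke expansion can be checked numerically for the simplest case of identical potentials. The following is a minimal sketch, not part of the original analysis: it builds $e^{i\eta(a + a^\dagger)}$ in a truncated Fock basis and compares the carrier and first-order sideband amplitudes with the first-order estimates $\eta\sqrt{n}$ and $\eta\sqrt{n+1}$. The helper name `displacement_matrix`, the basis size, and the values of `eta` and `n` are illustrative assumptions.

```python
import numpy as np

def displacement_matrix(eta, dim=80):
    """Matrix of exp(i*eta*(a + a^dagger)) in a Fock basis truncated at `dim` levels.

    Hypothetical helper for illustration; identical potentials are assumed,
    so the same Fock basis is used for both internal states.
    """
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)   # lowering operator: a|n> = sqrt(n)|n-1>
    x = a + a.T                                    # a + a^dagger, real symmetric
    vals, vecs = np.linalg.eigh(x)
    # Exponentiate through the eigen-decomposition of the Hermitian operator.
    return (vecs * np.exp(1j * eta * vals)) @ vecs.conj().T

eta, n = 0.1, 3                      # assumed Lamb-Dicke parameter and motional level
U = displacement_matrix(eta)
carrier = abs(U[n, n])               # |<n|U|n>|
red = abs(U[n - 1, n])               # |<n-1|U|n>|, red sideband
blue = abs(U[n + 1, n])              # |<n+1|U|n>|, blue sideband

print("carrier:", carrier, " second-order estimate:", 1 - eta**2 * (n + 0.5))
print("red    :", red, " first-order estimate:", eta * np.sqrt(n))
print("blue   :", blue, " first-order estimate:", eta * np.sqrt(n + 1))
```

The truncation only matters for levels close to `dim`, so low-lying $n$ are unaffected. For state-dependent potentials the overlaps ${}_B\langle m|n\rangle_A$ are no longer Kronecker deltas and this simple check does not apply.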

B. Optical pumping step
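We now turn to the optical pumping step of the cooling cycle, which involves spontaneous emission. Consider the photon scattering event illustrated in Fig. 1(b). A particle in $|g_B\rangle|n\rangle_B$ absorbs a photon from the laser and then decays to $|g_A\rangle$ as it spontaneously emits a photon. The angles of the incoming and outgoing photons relative to the trap axis are labeled $\theta_{\rm abs}$ and $\theta_{\rm sp}$, and are as shown in the inset of Fig. 2. The probability of ending up in state $|m\rangle_A$ is $|{}_A\langle m|e^{i\Delta k z}|n\rangle_B|^2$, where $\Delta k = k(\cos\theta_{\rm abs} - \cos\theta_{\rm sp})$ depends implicitly on $\theta_{\rm abs}$ and $\theta_{\rm sp}$ (footnote 2). For given directions of the incoming and outgoing photons, the mean change in motional quantum number is $\sum_m (m - n)\,|{}_A\langle m|e^{i\Delta k z}|n\rangle_B|^2$.
Averaging over all possible directions of spontaneous emission gives
$$\Delta n^{\rm sc} = \int_0^{\pi} d\theta_{\rm sp}\, Y(\theta_{\rm sp}) \sum_m (m - n)\,\left|{}_A\langle m|e^{i\Delta k z}|n\rangle_B\right|^2, \tag{5}$$
where $Y(\theta_{\rm sp})$ is the probability density for the photon to be emitted at angle $\theta_{\rm sp}$ to the trap axis (footnote 3). Using the completeness relation with respect to the set of $|m\rangle_A$ we have
$$\Delta n^{\rm sc} = \int_0^{\pi} d\theta_{\rm sp}\, Y(\theta_{\rm sp}) \left[\frac{{}_B\langle n|e^{-i\Delta k z} H_{tA}\, e^{i\Delta k z}|n\rangle_B}{\hbar\omega_{tA}} - \left(n + \tfrac{1}{2}\right)\right]. \tag{6}$$
The remaining matrix element can be expanded, with $z$ measured from the minimum of the $|g_B\rangle$ potential, as
$${}_B\langle n|e^{-i\Delta k z} H_{tA}\, e^{i\Delta k z}|n\rangle_B = {}_B\langle n|\frac{(p + \hbar\Delta k)^2}{2M} + \frac{1}{2}M\omega_{tA}^2 (z - \Delta z)^2|n\rangle_B = \frac{\hbar}{2}\left(\omega_{tB} + \frac{\omega_{tA}^2}{\omega_{tB}}\right)\left(n + \tfrac{1}{2}\right) + \frac{\hbar^2\Delta k^2}{2M} + \frac{1}{2}M\omega_{tA}^2\Delta z^2, \tag{7}$$
where $\Delta z$ is the displacement between the minima of the two potentials. In the first step we have used $e^{-i\Delta k z}\, p\, e^{i\Delta k z} = p + \hbar\Delta k$ and in the last step ${}_B\langle n|p^2|n\rangle_B = \hbar M\omega_{tB}(n + \tfrac{1}{2})$, ${}_B\langle n|z^2|n\rangle_B = (\hbar/M\omega_{tB})(n + \tfrac{1}{2})$, and ${}_B\langle n|p|n\rangle_B = {}_B\langle n|z|n\rangle_B = 0$. The second term in the last line of Eq. (7), the recoil energy associated with the process, is the only part which depends on the directions of the absorbed and emitted photons. We define the average of this recoil energy over spontaneous emission directions,
$$E_{\rm rec} = \int_0^{\pi} d\theta_{\rm sp}\, Y(\theta_{\rm sp})\,\frac{\hbar^2\Delta k^2}{2M} = \frac{\hbar^2 k^2}{2M}\left(\cos^2\theta_{\rm abs} + \Upsilon\right), \tag{8}$$
where $\hbar k$ is the single-photon momentum, and
$$\Upsilon = \int_0^{\pi} d\theta_{\rm sp}\, Y(\theta_{\rm sp})\cos^2\theta_{\rm sp} \tag{9}$$
is a geometric factor that depends only on the polarization of the outgoing photon and the angle between the trap and quantization axes. Its value lies in the range $\tfrac{1}{5} \le \Upsilon \le \tfrac{2}{5}$. In the second step of Eq. (8), we have used the fact that $Y(\theta_{\rm sp})$ is symmetric about $\theta_{\rm sp} = \pi/2$ so that the integral over the term linear in $\cos\theta_{\rm sp}$ is zero. Finally, we can write
$$\Delta n^{\rm sc} = \Delta_{\rm rec} + \Delta_{\rm disp} + \Delta_{\rm curv}, \tag{10}$$
with
$$\Delta_{\rm rec} = \frac{E_{\rm rec}}{\hbar\omega_{tA}}, \tag{11a}$$
$$\Delta_{\rm disp} = \frac{M\omega_{tA}\Delta z^2}{2\hbar}, \tag{11b}$$
$$\Delta_{\rm curv} = \frac{(\omega_{tA} - \omega_{tB})^2}{2\omega_{tA}\omega_{tB}}\left(n + \tfrac{1}{2}\right). \tag{11c}$$
The heating induced by the photon recoil, $\Delta_{\rm rec}$, is independent of $n$ and equivalent to the heating in free space or in state-independent potentials. The expression for $E_{\rm rec}$ in Eq. (8) shows that this contribution to the heating can be split into a part due to the momentum of the absorbed photon and a part due to that of the spontaneously emitted photon. The distribution of the former among the three trap axes can be controlled by choosing the direction of the optical pumping beam. We note that the sum of $E_{\rm rec}$ evaluated for any three perpendicular axes is $2 \times \hbar^2 k^2/(2M)$. The second contribution, $\Delta_{\rm disp}$, is the additional heating associated with the displacement between the two potentials. The quantity $\hbar\omega_{tA}\Delta_{\rm disp}$ is the gain in potential energy from moving the wave packet a distance $\Delta z$ up the side of the trap. Finally, $\Delta_{\rm curv}$ is the heating resulting from the difference in curvature of the two trap potentials. This part depends linearly on $n$ and is independent of the direction of the transition $|g_A\rangle \leftrightarrow |g_B\rangle$.
Footnote 2: The operator $e^{i\Delta k z}$ is unitary and so $|{}_A\langle m|e^{i\Delta k z}|n\rangle_B|^2$ gives a normalized probability: $\sum_m |{}_A\langle m|e^{i\Delta k z}|n\rangle_B|^2 = 1$.
Footnote 3: If $G(\theta)$ is the angular distribution of photon emission relative to the quantization axis, then $Y(\theta_{\rm sp}) = \int_0^{2\pi} d\phi_{\rm sp}\, \left|\frac{\partial(\theta,\phi)}{\partial(\theta_{\rm sp},\phi_{\rm sp})}\right| G(\theta(\theta_{\rm sp},\phi_{\rm sp}))$, with $\left|\frac{\partial(\theta,\phi)}{\partial(\theta_{\rm sp},\phi_{\rm sp})}\right|$ the Jacobian determinant for the coordinate transformation which rotates the quantization axis on to the trap axis.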
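The three heating contributions described above (recoil, displacement, and curvature) can be cross-checked against a direct operator calculation for fixed photon directions. The sketch below is illustrative only: it works in units of the $|g_B\rangle$ oscillator, with the assumed dimensionless parameters `r` $= \omega_{tA}/\omega_{tB}$, `kappa` $= \Delta k\sqrt{\hbar/M\omega_{tB}}$, and `d` $= \Delta z/\sqrt{\hbar/M\omega_{tB}}$, none of which are notation from the text. The closed form is written in terms of these parameters.

```python
import numpy as np

def delta_n_closed_form(n, r, kappa, d):
    """Recoil + displacement + curvature heating for one scattering event
    with fixed photon directions, in the dimensionless units described above."""
    rec = kappa**2 / (2 * r)                        # recoil energy / (hbar * omega_tA)
    disp = r * d**2 / 2                             # M * omega_tA * dz^2 / (2 * hbar)
    curv = (r - 1)**2 / (2 * r) * (n + 0.5)         # (w_A - w_B)^2 / (2 w_A w_B) * (n + 1/2)
    return rec + disp + curv

def delta_n_operator(n, r, kappa, d, dim=200):
    """Same quantity from <n| U^dagger (H_tA / hbar omega_tA) U |n> - (n + 1/2),
    evaluated in a truncated Fock basis of the |g_B> oscillator."""
    a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
    X = (a + a.T) / np.sqrt(2)                      # position in units of sqrt(hbar / M omega_tB)
    P = 1j * (a.T - a) / np.sqrt(2)                 # momentum in units of sqrt(hbar M omega_tB)
    eye = np.eye(dim)
    H_A = (P @ P) / (2 * r) + 0.5 * r * (X - d * eye) @ (X - d * eye)
    vals, vecs = np.linalg.eigh(X)
    U = (vecs * np.exp(1j * kappa * vals)) @ vecs.conj().T   # momentum kick exp(i * dk * z)
    psi = U[:, n]                                   # kicked state, starting from |n>_B
    return float(np.real(psi.conj() @ H_A @ psi)) - (n + 0.5)

n, r, kappa, d = 5, 0.8, 0.3, 0.4                   # illustrative values only
print(delta_n_closed_form(n, r, kappa, d), delta_n_operator(n, r, kappa, d))
```

Averaging `kappa**2` over emission directions turns the first term into the angle-averaged recoil contribution; the other two terms do not depend on the photon directions.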
In general, several photons will be scattered in the optical pumping step. Each scattering event begins with the molecule in some state $i$ and ends in some state $j$ with the associated mean change in motional quantum number $\Delta n^{\rm sc}_{i,j}$. We define the mean change in $n$ for the complete process of optical pumping to the desired (dark) state, $\Delta n_{\rm op}$, which is the sum of $\Delta n^{\rm sc}_{i,j}$ for each step of the process. We will calculate $\Delta n_{\rm op}$ for a realistic case in Sec. IV. For efficient cooling, the number of motional quanta removed during the coherent step, $\Delta n_{\rm coh}$, must be greater than $\Delta n_{\rm op}$, remembering that the heating during optical pumping occurs for each axis regardless of which is being cooled during the coherent step. While it is possible to use higher-order sidebands during the coherent step to satisfy this condition, cooling to the motional ground state requires $\Delta n_{\rm coh} = 1$ since driving higher-order sidebands leaves population in other motional states.

III. STARK SHIFT
To derive the potential for a molecule in a tweezer trap or an optical lattice, we need to understand its response to the trapping light. The interaction of a molecule with light is described by a term in the Hamiltonian $-\mathbf{d}\cdot\mathbf{E}$, where $\mathbf{d}$ is the dipole moment operator and $\mathbf{E} = \tfrac{1}{2}E_0(\hat{\epsilon}\, e^{-i\omega_L t} + \hat{\epsilon}^* e^{i\omega_L t})$ is the electric field of the light. Here, $E_0$ is the electric field amplitude, $\omega_L$ is the angular frequency of the light, and $\hat{\epsilon}$ is a unit polarization vector. We divide the complete Hamiltonian into a zeroth-order part $H_0$ that describes the energy level structure of the molecule down to the rotational structure, and a part $H_1$ that describes level shifts smaller than the rotational splitting. $H_1$ includes the spin-rotation interaction, the hyperfine interaction, and the Zeeman and Stark interactions. We are interested in the small degenerate subspace of $H_0$ corresponding to a single rotational state. The effective Stark Hamiltonian $H_S$ that operates within this subspace is developed in the Appendix and is given by Eq. (12), a sum over $K = 0, 1, 2$. Each of the three terms in the sum over $K$ in Eq. (12) is the scalar product of two rank-$K$ spherical tensors. The first, $A^K$, is an operator related to the frequency-dependent polarizability of the molecule, and is given in terms of $\mathbf{d}$ and $\omega_L$ by Eq. (A10). The second, $P^K$, relates to the polarization of the light, and is given in terms of $\hat{\epsilon}$ by Eq. (A11). The matrix elements of $A^K_P$ for $^1\Sigma$ and $^2\Sigma$ molecules are given in Secs. A 2 and A 3 of the Appendix, respectively. They are functions of the relevant angular momentum quantum numbers, and are proportional to the three molecular constants, $\alpha_K$, given by Eq. (A26). The $\alpha_K$ express the size of the scalar, vector, and tensor polarizabilities, and their values depend on the frequency of the light. The scalar part shifts all the levels within the subspace equally. This shift is $W_0 = -\alpha_0 E_0^2/4 = -\alpha_0 I/(2c\epsilon_0)$, where $I$ is the intensity of the light. It is convenient to absorb the factor $1/(2c\epsilon_0)$ into the polarizabilities, $\alpha_K/(2c\epsilon_0) \to \alpha_K$, so that the scalar Stark shift is simply $W_0 = -\alpha_0 I$. From here on, $\alpha_K$ denotes these rescaled quantities.
The vector and tensor polarizabilities produce different shifts for different states, leading to state-dependent trapping potentials. For an angular momentum eigenstate $|j, m_j\rangle$, the expectation value of $A^1_0$ is proportional to $m_j$. The effect of this vector part can be large when the detuning of the light from a molecular transition is smaller than, or comparable to, the fine-structure splitting of the transition. At larger detunings, the value of $\alpha_1$ is proportional to the ratio of the fine-structure interval to the transition energy, as can be seen from Eq. (A27). This ratio is normally small, which suppresses the vector part. The value of $\alpha_1$ is also proportional to $\omega_L$, so it goes to zero when $\omega_L = 0$. When the light is linearly polarized, all components of $P^1$ are zero, so the vector polarizability contributes nothing to $H_S$. When the light is circularly polarized, the vector part has the same effect as a magnetic field applied along the axis of circular polarization, so it can be suppressed by applying a magnetic field orthogonal to that axis [21,29]. As shown by Eq. (A26c), the value of $\alpha_2$ is proportional to the difference between the polarizabilities parallel and perpendicular to the molecular bond. These are typically very different, leading to a large value for $\alpha_2$. This tensor part results in trap potentials that depend strongly on the state of the molecule and on the polarization of the light. In Secs. IV and V, we show how to minimize the problems associated with these state-dependent potentials.

IV. SIMPLE MOLECULE
We first consider a simple diatomic molecule that has no electronic or nuclear spin. We concentrate on the first rotationally excited state, $N = 1$, within the ground electronic state. Excitation from this state to an electronically excited state with $N = 0$ is rotationally closed, as needed for the optical pumping step of the sideband cooling. In this case, we are interested only in the three $m_N$ states of $N = 1$, where $m_N$ is the projection of the rotational angular momentum onto a laboratory-fixed $Z$ axis. To make the link with the $^2\Sigma$ molecule considered later, we allow the states $m_N = -1, 0, 1$ to be nondegenerate at zero intensity, with energies $-w$, $0$, and $w$, respectively.

A. Linearly polarized light
Our coordinate system is shown in Fig. 2. The trapping light propagates along the $x$ axis. We first assume the light is linearly polarized along the $z$ axis, making an angle $\beta$ to $Z$. The effective Hamiltonian for this system, $H_{\rm simple}$, is given by Eq. (13) as the sum of three $3\times 3$ matrices. The first matrix gives the energies in the absence of the light, the second is the scalar Stark shift, and the third is the tensor part of the Stark interaction. There is no vector part because our model system has no spin, and because the light is linearly polarized. The Stark shifts are defined in terms of the eigenvalues of $H_{\rm simple}$, which we may write as $E(\alpha_0 I, \ldots)$.
We can learn a great deal from this simple Hamiltonian. When $w = 0$ the Stark shifts are independent of $\beta$, and when $\beta = 0$ the Stark shifts are independent of $w$. In both cases, the $m_N = \pm 1$ states have equal Stark shifts of $\delta E_{\pm 1} = -(\alpha_0 - \alpha_2/5)I$, while the $m_N = 0$ state shifts by $\delta E_0 = -(\alpha_0 + 2\alpha_2/5)I$. When $\beta = 0$ and $\alpha_2/w$ is positive (negative), the $m_N = -1$ ($+1$) and $m_N = 0$ states cross at the intensity where $\tfrac{3}{5}\alpha_2 I = w$ ($-w$). This becomes an avoided crossing when $\beta \neq 0$, and the size of the gap at the avoided crossing is $w\sin(2\beta)/\sqrt{2}$. These features can be seen in Fig. 3, where we plot the energies of the three states as a function of $\alpha_2 I/w$ for the case where $\beta = \pi/24$, chosen here and later to clearly highlight the presence of the avoided crossings. We have removed the scalar Stark shift since it shifts all states equally. We note that the Stark shifts cease to be linear in intensity near the avoided crossing, and that this nonlinearity may translate into anharmonicity of the trapping potential if the trap intensity is in this range.
Figure 4 shows the ratio of the tensor Stark shift to the scalar Stark shift, as a function of the polarization angle $\beta$. We have chosen $\alpha_2 = -0.6\alpha_0$, and explore various values of $w$. The dashed lines show the limiting case where $|w/(\alpha_2 I)| \gg 1$. In this case, the eigenvalues of $H_{\rm simple}$ are very nearly equal to its diagonal elements. By inspection of these elements, we see that the $m_N = \pm 1$ states have identical tensor Stark shifts for all values of $\beta$, whereas the tensor Stark shift of $m_N = 0$ is twice as large and has the opposite sign. We also see that the tensor Stark shift is zero for all three states at the "magic angle" where $\beta = \beta_{\rm magic} = \cos^{-1}(1/\sqrt{3})$. The solid lines in Fig. 4(a) show the case where $w/(\alpha_2 I) = 4$. The results follow the dashed lines closely, but the $m_N = \pm 1$ states no longer have identical Stark shifts when $\beta \neq 0$. This difference increases as $w$ decreases, as can be seen in Fig. 4(b), which shows the case of $w/(\alpha_2 I) = 1$. In particular, we note that there is no longer any angle where all three states have the same tensor shift.
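The features described above can be reproduced with a small numerical model. The sketch below is an assumption-laden illustration rather than the paper's own matrix: it takes the tensor part within $N = 1$ to be $-\alpha_2 I\,[2 - 3(\mathbf{N}\cdot\hat{\epsilon})^2]/5$ for real $\hat{\epsilon}$, a form chosen because it reproduces the $\beta = 0$ shifts, the magic angle, and the avoided-crossing gap quoted in the text. The function names and parameter values are hypothetical.

```python
import numpy as np

# N = 1 angular momentum matrices in the basis m_N = -1, 0, +1 (quantized along Z)
NZ = np.diag([-1.0, 0.0, 1.0])
NX = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 1.0],
               [0.0, 1.0, 0.0]]) / np.sqrt(2.0)

def h_simple(w, a0I, a2I, beta):
    """Three-level model: zero-intensity energies, scalar shift, tensor shift.

    a0I and a2I stand for alpha_0*I and alpha_2*I; beta is the angle between
    the linear polarization and Z. The tensor form is an assumption (see lead-in).
    """
    n_eps = np.cos(beta) * NZ + np.sin(beta) * NX          # N projected on the polarization
    tensor = -a2I * (2.0 * np.eye(3) - 3.0 * n_eps @ n_eps) / 5.0
    return w * NZ - a0I * np.eye(3) + tensor

def min_gap(w, beta, a2I_grid):
    """Smallest spacing between the two lowest eigenvalues over an intensity scan."""
    return min(np.diff(np.linalg.eigvalsh(h_simple(w, 0.0, x, beta)))[0]
               for x in a2I_grid)

beta = 0.05                                                # small angle, illustrative
gap = min_gap(1.0, beta, np.linspace(1.0, 2.5, 3001))
print("numerical gap:", gap, " w*sin(2*beta)/sqrt(2):", np.sin(2 * beta) / np.sqrt(2))

beta_magic = np.arccos(1.0 / np.sqrt(3.0))
print("tensor diagonal at magic angle:", np.diag(h_simple(0.0, 0.0, 1.0, beta_magic)))
```

The scan assumes $w > 0$ and $\alpha_2 I > 0$, so that the crossing is between $m_N = -1$ and $m_N = 0$ and these are the two lowest levels.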

B. Elliptically polarized light
Next, we consider the case of elliptically polarized light. The light again propagates along $x$ and $Z$ is aligned with this axis also. The polarization of the light is described by $\hat{\epsilon} = \cos(\xi)\hat{z} - i\sin(\xi)\hat{y}$. The tensor Stark part of $H_{\rm simple}$ is then given by Eq. (15). [...] The solid lines show the case where $w/(\alpha_2 I) = 1$. The shift of the $m_N = 0$ state has no dependence on $\xi$, while the shifts of the $m_N = \pm 1$ states depend on $\xi$ and are different from one another. This difference is largest at $\xi = 0$ (linearly polarized along $z$), zero at $\xi = \pi/4$ (circularly polarized), and reduces as $|w/(\alpha_2 I)|$ increases.
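The dependence on $\xi$ described above can be illustrated with a short numerical sketch. It assumes that, within $N = 1$, the tensor part for a complex polarization vector is $-\alpha_2 I\,\{2 - \tfrac{3}{2}[(\mathbf{N}\cdot\hat{\epsilon})^\dagger(\mathbf{N}\cdot\hat{\epsilon}) + (\mathbf{N}\cdot\hat{\epsilon})(\mathbf{N}\cdot\hat{\epsilon})^\dagger]\}/5$, which reduces to the linear-polarization form when $\hat{\epsilon}$ is real. This operator form, the function names, and the parameter values are assumptions made for illustration.

```python
import numpy as np

# N = 1 matrices in the basis m_N = -1, 0, +1, quantized along Z (the propagation axis)
NZ = np.diag([-1.0, 0.0, 1.0])
NP = np.sqrt(2.0) * np.array([[0.0, 0.0, 0.0],
                              [1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0]])            # raising operator N+
NA = (NP + NP.T) / 2.0                                     # component along lab z
NB = (NP - NP.T) / 2.0j                                    # component along lab y

def tensor_part(a2I, xi):
    """Assumed tensor Stark operator for polarization cos(xi) z - i sin(xi) y."""
    n_eps = np.cos(xi) * NA - 1j * np.sin(xi) * NB
    sym = n_eps.conj().T @ n_eps + n_eps @ n_eps.conj().T  # symmetrized, drops the vector part
    return -a2I * (2.0 * np.eye(3) - 1.5 * sym) / 5.0

def stark_shifts(w, a2I, xi):
    """Tensor Stark shift of each state, labeled by its dominant m_N character."""
    vals, vecs = np.linalg.eigh(w * NZ + tensor_part(a2I, xi))
    shifts = {}
    for row, m in enumerate((-1, 0, 1)):
        k = int(np.argmax(np.abs(vecs[row, :]) ** 2))      # eigenstate with most m_N weight
        shifts[m] = float(vals[k] - w * m)
    return shifts

for xi in (0.0, np.pi / 8, np.pi / 4):
    print(xi, stark_shifts(1.0, 1.0, xi))
```

Labeling eigenstates by their dominant $m_N$ component is only meaningful when $w$ is not much smaller than $\alpha_2 I$, so the degenerate case $w = 0$ is outside the scope of this sketch.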

C. Sideband cooling of simple molecule
For this simple molecule, sideband cooling could be done with any choice of polarization where two of the three states have equal tensor shifts. This ensures that the Raman frequency for transitions between these two states is independent of the motional state, as required. Here, we consider the specific case where the Raman step is between the $m_N = -1$ and $+1$ states. The cooling proceeds as follows: (i) optically pump into $m_N = -1$, (ii) drive the Raman transition from $m_N = -1$ to $m_N = +1$ on a red motional sideband, (iii) repeat. The optical pumping should be on the rotationally closed transition, which excites the molecule to an $N = 0$ state. A pair of laser beams, one linearly polarized along $Z$ and the other circularly polarized about $Z$, achieves the desired optical pumping.
If it is possible to work in a regime where $|w/(\alpha_2 I)| \gg 1$, the polarization of the tweezer is not important for the Raman step since the trapping potential is always the same for the $m_N = \pm 1$ states. The tweezer polarization is relevant for the optical pumping step due to spontaneous emission to $m_N = 0$ for which the trapping potential is, in general, different. The extra heating this produces can be eliminated by choosing the polarization at the magic angle where the trapping potential is identical for all three states. If it is not feasible to work in the regime where $|w/(\alpha_2 I)| \gg 1$, the tweezer should be linearly polarized along $Z$, or circularly polarized relative to $Z$, so that the trap potential is identical for the $m_N = \pm 1$ states. We emphasize that other configurations are possible using different pairs of states for the Raman step.

D. Heating during optical pumping
We now evaluate the extra heating that occurs during the optical pumping step when the third state, not used for the Raman transition, has a different ac Stark shift to the other two. We again focus on the case where the Raman step is between the $m_N = -1$ and $+1$ states and the third state is $m_N = 0$. The $N = 0$ excited state has equal branching ratios to each of the three ground states and so it takes, on average, three scattered photons for the molecule to reach the dark state; two of these leave the molecule in $|m_N| = 1$ and one in $m_N = 0$. The heating that each of these scattered photons produces is given by the three terms in Eq. (10). Let us consider each in turn.
The recoil heating, Eq. (11a), depends on the direction of the absorbed photon and the angular distribution of the emitted photon. In our case, with a single excited state, specifying the initial and final value of $|m_N|$ is sufficient to uniquely define both the beam from which the photon is absorbed and the angular distribution of the emitted photon. By analogy with Eq. (8) we define $E^{i,j}_{\rm rec}$, the average recoil energy for a scattering event which takes the molecule from a ground state with $|m_N| = i$ to a ground state with $|m_N| = j$. The circularly polarized beam which couples to $m_N = +1$, from which on average two photons are scattered, is necessarily parallel to the $B$ field, but the linearly polarized beam which couples to the $m_N = 0$ state, from which on average one photon is scattered, can propagate along any direction perpendicular to that. We can control the heating along a particular trapping axis to some extent by choosing this angle appropriately. It is likely to be helpful to choose it orthogonal to the optical axis of the trapping light where, as we will see in Sec. VI, the confinement is weakest.
The contribution from $\Delta_{\rm disp}$, Eq. (11b), is zero because the potentials are not displaced with respect to one another. To understand the contribution of $\Delta_{\rm curv}$, given by Eq. (11c), we need to know how many of the scattering events change $|m_N|$. Let $y_{m_N}$ be the mean number of $|m_N|$-changing events needed to reach the dark state, starting from state $m_N$. Consider a molecule initially in $m_N = -1$. If it scatters a photon and decays to $m_N = 1$, the dark state is reached with no $|m_N|$-changing events. If it decays to $m_N = 0$, there has been one $|m_N|$-changing event, and there are an average of $y_0$ more to come. If it decays back to $m_N = -1$, there are an average of $y_{-1}$ events to come. Each outcome has a probability of $\tfrac{1}{3}$. Thus,
$$y_{-1} = \tfrac{1}{3}(0) + \tfrac{1}{3}(1 + y_0) + \tfrac{1}{3}\,y_{-1}.$$
By a similar argument,
$$y_0 = \tfrac{1}{3}(1) + \tfrac{1}{3}\,y_0 + \tfrac{1}{3}(1 + y_{-1}).$$
Together, these equations give $y_{-1} = \tfrac{4}{3}$ and $y_0 = \tfrac{5}{3}$. Since the molecule begins and ends the optical pumping step in a state with $|m_N| = 1$, half of the $\tfrac{4}{3}$ events change $|m_N|$ from 1 to 0, and half the reverse.
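The counting argument above amounts to a pair of linear equations for $y_{-1}$ and $y_0$. As a quick check, the sketch below rearranges the two balance conditions (each decay outcome has probability 1/3) into a 2x2 linear system and solves it with exact rational arithmetic. The variable names are illustrative.

```python
from fractions import Fraction

third = Fraction(1, 3)

# Balance conditions, with each decay outcome having probability 1/3:
#   y_-1 = (1/3)*0 + (1/3)*(1 + y_0) + (1/3)*y_-1
#   y_0  = (1/3)*1 + (1/3)*y_0       + (1/3)*(1 + y_-1)
# Rearranged into  a11*y_-1 + a12*y_0 = b1  and  a21*y_-1 + a22*y_0 = b2:
a11, a12, b1 = 2 * third, -third, third
a21, a22, b2 = -third, 2 * third, 2 * third

# Solve the 2x2 system by Cramer's rule.
det = a11 * a22 - a12 * a21
y_m1 = (b1 * a22 - a12 * b2) / det
y_0 = (a11 * b2 - a21 * b1) / det

print("y_-1 =", y_m1, " y_0 =", y_0)
```

Using `Fraction` avoids rounding, so the results can be compared directly with the values 4/3 and 5/3 quoted in the text.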
We can now use Eq. (10) to estimate the mean change in motional quantum number during optical pumping, $\Delta n_{\rm op}$. If $\Delta n^{\rm sc}_{i,j} \ll 1$ for all $i, j \in \{-1, 0, 1\}$, so that we can assume a fixed $n$ in Eq. (11c), then to a good approximation $\Delta n_{\rm op}$ is obtained by summing the recoil and curvature terms of Eq. (10) over the scattering events counted above. Here, $\omega_{ti}$ is the trap frequency corresponding to the ground states with $|m_N| = i$. Figure 6 shows how $\Delta n_{\rm op}$ varies as a function of both $n$ and the ratio of the trap frequencies, for $\eta_1 = 0.2$, which is a realistic Lamb-Dicke parameter for trap frequencies near 200 kHz. The blue areas of the plot show the parameter space where $\Delta n_{\rm op} < 1$. This corresponds to net cooling along the axis being cooled when the coherent step drives the first-order red sideband. We will see later that $\omega_{t0}/\omega_{t1}$ is typically in the range 0.8-1.25. Throughout this range, $\Delta n_{\rm op} < 1$ all the way up to $n \approx 25$.
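A rough estimate of the optical-pumping heating can be assembled from the event counts given earlier: on average three photons are scattered, two ending in $|m_N| = 1$ and one in $m_N = 0$, with 4/3 events changing $|m_N|$. The sketch below is not the paper's expression. It assumes a single geometric factor `geom` $= \cos^2\theta_{\rm abs} + \Upsilon = 2/3$ for every event (its average over three perpendicular axes), takes $\eta_1$ to be the single-photon Lamb-Dicke parameter of the $|m_N| = 1$ trap, and adds a curvature term $(\omega_{t0} - \omega_{t1})^2/(2\omega_{t0}\omega_{t1})\,(n + \tfrac{1}{2})$ for each $|m_N|$-changing event at fixed $n$.

```python
def delta_n_op(n, ratio, eta1=0.2, geom=2.0 / 3.0):
    """Estimated mean heating (in motional quanta) for one optical pumping cycle.

    n     : motional quantum number, held fixed during the cycle (assumption)
    ratio : omega_t0 / omega_t1, trap frequency of m_N = 0 relative to |m_N| = 1
    eta1  : single-photon Lamb-Dicke parameter for the |m_N| = 1 trap
    geom  : cos^2(theta_abs) + Upsilon, assumed identical for every event
    """
    rec_to_1 = eta1**2 * geom            # recoil in units of hbar*omega_t1
    rec_to_0 = eta1**2 * geom / ratio    # recoil in units of hbar*omega_t0
    recoil = 2.0 * rec_to_1 + 1.0 * rec_to_0
    curvature = (4.0 / 3.0) * (ratio - 1.0) ** 2 / (2.0 * ratio) * (n + 0.5)
    return recoil + curvature

for ratio in (0.8, 1.0, 1.25):
    print(ratio, [round(delta_n_op(n, ratio), 3) for n in (0, 10, 25, 40)])
```

With these assumptions the estimate stays below one quantum up to $n \approx 25$ across the range 0.8 to 1.25 of trap frequency ratios, in line with the behavior described for Fig. 6, though the exact boundary depends on the beam geometry.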

V. REAL MOLECULE
Next, we consider a real molecule with a $^2\Sigma$ ground state and a nuclear spin. We will use CaF as an illustrative example, though our discussion will apply to other molecules of this type. For laser cooling, the electronic transition to either the $A^2\Pi_{1/2}$ state or the $B^2\Sigma^+$ state can be used. For sideband cooling, we choose to use the transition $B^2\Sigma^+(v = 0, N = 0) \leftarrow X^2\Sigma^+(v = 0, N = 1)$. Here, $v$ and $N$ refer to the vibrational and rotational quantum numbers, respectively. With this choice of excited state, decay to any other rotational state of $X$ is forbidden by the parity and angular momentum selection rules, so the transition is rotationally closed. The branching ratio for decays to other vibrational states depends on the choice of molecule. For CaF, it is particularly small, about $10^{-3}$, so that for the purpose of sideband cooling we can consider the transition to be vibrationally closed. For other laser-coolable molecules with less favorable branching ratios, vibrational repump lasers can be used.
To understand how to apply sideband cooling, we need to consider the hyperfine interactions in the ground state. For CaF, and similar molecules, the Hamiltonian describing these interactions is
$$H_{\rm hfs} = \gamma\,\mathbf{N}\cdot\mathbf{S} + \left(b + \frac{c}{3}\right)\mathbf{I}\cdot\mathbf{S} + \frac{c\sqrt{6}}{3}\,T^2(C)\cdot T^2(\mathbf{I}, \mathbf{S}), \tag{18}$$
where $\mathbf{N}$, $\mathbf{S}$, and $\mathbf{I}$ are the dimensionless operators for the rotational angular momentum, electron spin, and nuclear spin. The first term is the spin-rotation interaction, while the second and third represent the interaction between the electron and nuclear magnetic moments. Here, $T^2(\mathbf{I}, \mathbf{S})$ is the rank-2 spherical tensor formed from $\mathbf{I}$ and $\mathbf{S}$, while $T^2(C)$ is a spherical tensor whose components are the spherical harmonics $C^2_q(\theta, \phi)$, where $\theta$ and $\phi$ are the polar angles of the internuclear axis in the laboratory frame. We have neglected the nuclear-spin-rotation interaction which is much smaller than the other terms.

A. Reduction to the simplified molecule
The hyperfine interactions couple together the angular momenta, and as a result the ac Stark shift is, in general, much more complicated than the simple picture described above. However, that simple picture can be recovered by applying a magnetic field $B$ that is large enough to uncouple the angular momenta. The Zeeman Hamiltonian $H_Z$ for a $^2\Sigma$ state is given by Eq. (19), in which $\hat{\lambda}$ is a unit vector along the internuclear axis, and we have assumed that only one nucleus has a spin. The first term is due to the unpaired electron spin and is typically $10^3$ times larger than the other terms. We will often only need to consider this term. When $B$ is large, so that the Zeeman interaction is much larger than the hyperfine interaction, the eigenstates are well described by uncoupled angular momentum eigenstates $|N, m_N\rangle|S, m_S\rangle|I, m_I\rangle$. Each rotational state splits into two manifolds with $m_S = \pm\tfrac{1}{2}$, whose Zeeman shifts are $E_Z \approx g_S\mu_B m_S B \approx \pm\mu_B B$. Here, we have used $g_S \approx 2$ and have neglected the small terms. The hyperfine interaction lifts the degeneracy with respect to $m_N$ and $m_I$ within each of these manifolds. In the limit where the angular momenta are completely uncoupled, the ac Stark shift has no dependence on $m_S$ and $m_I$. Furthermore, the values of $m_S$ and $m_I$ cannot change in either the Raman step or the optical pumping step. Having chosen a particular $(m_S, m_I)$ pair, their values are fixed, so that (for $N = 1$) we are left with only three states, just as in Sec. IV.
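Taking $N = 1$ and $S = I = \tfrac{1}{2}$, the shifts due to $H_{\rm hfs}$ to first order in perturbation theory are
$$E^{(1)} = \gamma\, m_N m_S + \left(b + \frac{c}{3}\right) m_I m_S + \frac{2c}{15}\left(2 - 3m_N^2\right) m_I m_S. \tag{20}$$
Relative to $m_N = 0$, the energies of the $m_N = \pm 1$ states are
$$w_\pm = \pm\gamma\, m_S - \frac{2c}{5}\, m_I m_S. \tag{21}$$
In the limit where $c \ll \gamma$, the splitting is symmetric and the description is identical to that of Sec. IV. For CaF, $\gamma$ and $c$ are almost equal, so the splitting is not quite symmetric, though this makes little difference to the description.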
In practice, the magnetic field is limited in strength. This has two important consequences. First, in the optical pumping step, the residual state mixing by H hfs can result in decay to a different manifold of states from the one selected. Second, the ac Stark shifts deviate from the simple behavior shown in Figs. 3-5. Next, we work out the severity of these imperfections to our scheme.

B. Residual state mixing
Let us write the uncoupled states using the notation |m N , m S , m I . All three terms in H hfs result in mixing of these uncoupled states, and we can calculate the mixing amplitudes by perturbation theory (assuming that μ B B  γ , b, c). The spin-rotation interaction γ N · S has no effect on the upper state of the transition which has N = 0, but it does change the lower states since they have N = 1. The state |m N , ± 1 2 , m I obtains an admixture of |m N ± 1, ∓ 1 2 , m I (where that state exists), with amplitude ±γ /(2 √ 2μ B B). It follows that the excited state with m S = ± 1 2 can decay to the ground state with (nominally) m S = ∓ 1 2 with a branching ratio of Importantly, this branching ratio is suppressed with increasing B. For CaF at B = 300 G, we find b r,1 = 7.4 × 10 −4 . Similarly, due to the I · S term of Eq. where and the subscript indicates which state the hyperfine coefficients belong to. For CaF at B = 300 G, we find b r,2 = 7.2 × 10 −3 . This decay route can be eliminated by choosing to use a manifold with m S = S and m I = I or with m S = −S and m I = −I. Next, consider the last term of Eq. (18). In the uncoupled basis, it has nonzero matrix elements between all pairs of states with equal m F = m N + m S + m I . This means that as well as coupling states that differ in m S , it couples states of the same m S that differ in m N and m I . The former couplings are suppressed by the large Zeeman splitting between opposite m S manifolds, in the same way as for the first two terms of Eq. (18) discussed above, but the latter couplings are not suppressed by the field because the terms in the Zeeman Hamiltonian that depend on m N and m I are very small. As an example, consider the nominal state |−1, − 1 2 , 1 2 . In perturbation theory, its admixture with |−1, 1 2 , − 1 2 has amplitude −c /(60μ B B). This can cause m S to change in the excited state decay. 
For CaF, the branching ratio is smaller than the terms already discussed above because $c$ and $\gamma$ are approximately equal. Our chosen nominal state also has an admixture with $|0, -\tfrac{1}{2}, -\tfrac{1}{2}\rangle$. A rough estimate of its amplitude can be obtained by treating the last term in $H_{\rm hfs}$ as a perturbation to the other two terms, giving an amplitude
Similarly, using the same approximation, the nominal state
Finally, we note that there is another mechanism for mixing states of different $m_N$ and $m_I$ but the same $m_S$. This is through the combination of the $\mathbf{S}\cdot\mathbf{N}$ and $\mathbf{I}\cdot\mathbf{S}$ terms. The amplitude for this is proportional to the product of two matrix elements, one for each term, but scales only as $1/B$ because the mixing is with a state from the same $m_S$ manifold through an intermediate state of opposite $m_S$. This mechanism affects 8 out of the 12 states of $X$ and can be just as strong as the more direct mechanisms discussed above.
Figure 7 shows the exact branching ratio, calculated numerically, for each of the spin manifolds of the $B\,^2\Sigma^+ (N = 0)$ state of CaF to decay to a different spin manifold of $X\,^2\Sigma^+ (N = 1)$, as a function of $B$. The behavior is as discussed above: the branching ratios scale as $1/B$ toward a constant value that is close to $b_{r,3}$. The branching ratios depend on the choice of spin manifold, and we see that using the $(m_S, m_I) = (-\tfrac{1}{2}, -\tfrac{1}{2})$ manifold minimizes the leak to other manifolds.

C. Tensor Stark shifts
Next, we calculate the tensor Stark shifts of CaF molecules in the presence of a strong magnetic field, and compare the results to those of the three-level model presented in Sec. IV. We calculate the eigenvalues of $H_{\rm tot} = H_{\rm Stark} + H_{\rm hfs} + H_Z$ given by Eqs. (12), (18), and (19). We suppose the optical trap has a wavelength of 780 nm, and estimate the values of $\alpha_K$ by assuming that the $A\,^2\Pi$ and $B\,^2\Sigma^+$ states dominate the sums over states in Eq. (A24). The energies of the states are calculated using the molecular constants given in [30], and the dipole moments using the data given in [31][32][33]. We find $\alpha_0 \approx 1.4 \times 10^{-3}$ Hz/(W/m$^2$), $\alpha_1 \approx 3 \times 10^{-5}$ Hz/(W/m$^2$), and $\alpha_2 \approx -8 \times 10^{-4}$ Hz/(W/m$^2$). Figure 8 shows the eigenvalues of $H_{\rm tot}$, focusing on the levels that have $N = 1$ and $m_S = \tfrac{1}{2}$. We have chosen $B = 300$ G and linearly polarized light at angle $\beta = \pi/24$, the same as used for Fig. 3. The upper three levels have $m_I = \tfrac{1}{2}$ and their shifts with intensity are similar to those in Fig. 3 (remembering that $\alpha_2 I/w$ is negative). The lower three levels have $m_I = -\tfrac{1}{2}$ and again show similar shifts with intensity. Our three-level model predicts an avoided crossing between $m_N = 1$ and $0$ at an intensity $I_c = -\tfrac{5}{3} w_+/\alpha_2$, where $w_+$ is given by Eq. (21). These values are $I_c = 34$ and $52$ GW m$^{-2}$ for $m_I = \tfrac{1}{2}$ and $-\tfrac{1}{2}$, respectively. These intensities are indicated by the dashed lines in Fig. 8, and we see that the avoided crossings do indeed occur very close to these values. At intensities close to $I_c$ the trapping potential will be distorted due to the nonlinearity of the Stark shift with intensity around the avoided crossing. We note that, for CaF, with $m_S = m_I = \tfrac{1}{2}$, $\alpha_0 I_c h/k_B = 2.1$ mK. Figure 9(a) shows the ratio of the tensor Stark shift to the scalar Stark shift for the $N = 1$ levels of CaF, at an intensity of $I = 25$ GW m$^{-2}$, a magnetic field of $B = 30$ G, and as a function of polarization angle $\beta$. Every level has a different Stark shift and a different dependence on $\beta$. 
Figure 9(b) shows the same information for $B = 300$ G, showing that at this higher field the levels group together and the pattern of shifts resembles the simple one shown in Fig. 4. Our chosen intensity gives $\alpha_2 I = -19.3$ MHz, which corresponds to $\alpha_2 I/w_+ = -1.22$ for positive $m_S m_I$, and $\alpha_2 I/w_+ = -0.81$ for negative $m_S m_I$, with $w_+$ given by Eq. (21). Thus, at high $B$, we expect a close resemblance to Fig. 4(b), which is indeed what we see in Fig. 9(b). The small splitting of the three curves into closely spaced pairs is due to the different values of $w_+$ for opposite signs of $m_S m_I$. Figure 9(c) shows the ratio of the tensor Stark shift to the scalar Stark shift for $\beta = 0$ as a function of $B$. At fields approaching 300 G the ac Stark shifts of the 12 states separate into two groups corresponding to states with $|m_N| = 1$ and $|m_N| = 0$, as expected.
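The avoided-crossing intensities quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it infers $w_+$ from the quoted ratio $\alpha_2 I/w_+$ rather than evaluating Eq. (21), and it uses only the magnitudes of the quoted ratios (1.22 and 0.81), since $\alpha_2$ is negative and the quoted $I_c$ are positive. The function name is hypothetical.

```python
# Cross-check of I_c = -(5/3) * w_plus / alpha_2 using numbers quoted in the text.
# Assumption: w_plus is inferred from the quoted ratio |alpha_2 * I / w_plus|,
# not computed from Eq. (21), which is not reproduced here.

INTENSITY = 25e9          # W/m^2, intensity used for Fig. 9
ALPHA2_I = -19.3e6        # Hz, tensor shift alpha_2 * I quoted in the text
ALPHA2 = ALPHA2_I / INTENSITY  # Hz/(W/m^2), about -7.7e-4


def crossing_intensity(ratio_magnitude):
    """Return I_c in W/m^2 from the magnitude of alpha_2 * I / w_plus."""
    w_plus = abs(ALPHA2_I) / ratio_magnitude   # Hz, taken as positive
    return -(5.0 / 3.0) * w_plus / ALPHA2      # positive because ALPHA2 < 0


i_c_positive = crossing_intensity(1.22)  # positive m_S * m_I
i_c_negative = crossing_intensity(0.81)  # negative m_S * m_I

print(f"I_c (m_S m_I > 0): {i_c_positive / 1e9:.1f} GW/m^2")
print(f"I_c (m_S m_I < 0): {i_c_negative / 1e9:.1f} GW/m^2")
```

Both results land close to the 34 and 52 GW m$^{-2}$ quoted above; the small difference in the second value reflects the rounding of the quoted ratio.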

VI. REAL LIGHT: A TWEEZER TRAP
Real tweezer traps are produced using a high numerical aperture (NA) lens to focus light down to a spot size comparable to the wavelength. To model a real trap, we use the vector Debye integral [34] to compute the distribution of intensity and polarization close to the focus of a trap with parameters suitable for CaF. We then find the trap potential for all the $N = 1$ states of the molecule by calculating the eigenvalues of $H_{\rm tot}$ at each point in the distribution. The calculations are for a 780-nm input beam propagating along $x$ and linearly polarized along $z$, focused through a lens of 0.55 NA. The $1/e^2$ diameter of the input beam is equal to the lens diameter and the total power is 20 mW. These parameters give a peak intensity at the center of the tweezer of 25 GW m$^{-2}$. For CaF molecules in the $N = 1$ manifold with a 300-G $B$ field applied parallel to the incident polarization vector, the trap frequencies are nearly equal (less than 1% difference) for states with the same value of $|m_N|$. For a molecule in $m_N = \pm 1\,(0)$, we calculate trap frequencies of 213 (174) kHz parallel to the incident polarization, 224 (187) kHz perpendicular to both the incident polarization and the optical axis, and 38 (32) kHz parallel to the optical axis.
The intensity profile of the trap is not perfectly harmonic and the tight focusing gives rise to polarization gradients close to the focus which can further distort the trap shape. Here, we consider the effects of these imperfections.
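To give a feel for the trap frequencies quoted above, the following sketch converts them into a harmonic-oscillator length and a Lamb-Dicke parameter for a photon travelling along the corresponding axis. Two inputs are assumptions not stated in this section: a CaF mass of about 59 u and a scattered-photon wavelength of 531 nm (the $B$-$X$ transition). The function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34      # J s
AMU = 1.66053907e-27        # kg
M_CAF = 59.0 * AMU          # assumption: approximate mass of CaF
WAVELENGTH = 531e-9         # assumption: B-X transition wavelength in metres


def oscillator_length(trap_freq_hz):
    """Ground-state rms position spread sqrt(hbar / (2 M omega)) in metres."""
    omega = 2.0 * math.pi * trap_freq_hz
    return math.sqrt(HBAR / (2.0 * M_CAF * omega))


def lamb_dicke(trap_freq_hz):
    """Lamb-Dicke parameter for a photon wave vector along the trap axis."""
    return (2.0 * math.pi / WAVELENGTH) * oscillator_length(trap_freq_hz)


eta_radial = lamb_dicke(213e3)  # m_N = +/-1, along the incident polarization
eta_axial = lamb_dicke(38e3)    # m_N = +/-1, along the optical axis

print(f"radial: length = {oscillator_length(213e3) * 1e9:.1f} nm, eta = {eta_radial:.2f}")
print(f"axial:  length = {oscillator_length(38e3) * 1e9:.1f} nm, eta = {eta_axial:.2f}")
```

Under these assumptions the axial Lamb-Dicke parameter is more than twice the radial one, simply because it scales as the inverse square root of the trap frequency.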

A. Anharmonicity
The tweezer trap potential is anharmonic away from the trap center due to the approximately Gaussian intensity profile of the trap. Further distortions of the potential are introduced when the intensity is close to an avoided crossing between the internal states of the molecule (see Fig. 8). Any anharmonicity causes the sideband frequencies to depend on the motional state of the molecule.
At intensities far from any avoided crossings, we find the potential near the trap center is approximately harmonic with a small, negative, quartic perturbation. The motional Hamiltonian is
$$H_t = \frac{p^2}{2M} + \frac{1}{2} M \omega^2 z^2 - f z^4.$$
Working in the natural units of the system with $\tilde{z} = z\sqrt{M\omega/\hbar}$, $\tilde{p} = p/\sqrt{\hbar M \omega}$, the dimensionless motional Hamiltonian $\tilde{H}_t = H_t/(\hbar\omega)$ can be written
$$\tilde{H}_t = \tfrac{1}{2}\left(\tilde{p}^2 + \tilde{z}^2\right) - \tilde{f}\,\tilde{z}^4,$$
where $\tilde{f} = (\hbar/(M^2\omega^3)) f \ll 1$. First-order perturbation theory gives the dimensionless energy $\tilde{E}_n = E_n/(\hbar\omega)$ of the $n$th motional eigenstate as
$$\tilde{E}_n = n + \tfrac{1}{2} - \tfrac{3}{4}\tilde{f}\left(2n^2 + 2n + 1\right).$$
The anharmonicity is apparent from the $n$ dependence of the trap level spacing
$$\tilde{E}_{n+1} - \tilde{E}_n = 1 - 3\tilde{f}(n + 1).$$
The $N = 1$ states of CaF in a 300-G field have $\tilde{f} \sim 7 \times 10^{-4}$ in the radial direction and $\sim 1 \times 10^{-4}$ in the axial direction, meaning the level spacing changes by less than 5% for motional states up to $n \sim 25$ and $n \sim 170$, respectively. This effect is negligible, provided the effective Rabi frequency for the Raman process is not chosen too small compared to the trap frequency.
Figure 10 shows example elements of the polarization tensor at positions across the focal plane, multiplied by the ratio of the local intensity to the maximum intensity at the trap focus: $S = (I/I_{\max})P$. The elements are evaluated with respect to the incident polarization vector. Figure 10(a) shows the scalar component $S^0_0$. As shown in Eq. (A13a), $P^0_0$ is 1 everywhere and so the plot mirrors the Gaussian intensity profile of the light with a waist of 0.74 μm. Figures 10(b)-10(d) illustrate how the nonparaxial focusing of the light creates polarization gradients across the trapping volume. The polarization tensor is, in general, complex, but useful real quantities can be obtained from linear superpositions of elements, as in Fig. 10.
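The size of the anharmonic correction discussed in this subsection can be checked numerically. The sketch below assumes the dimensionless Hamiltonian $\tilde{H}_t = \tfrac{1}{2}(\tilde{p}^2 + \tilde{z}^2) - \tilde{f}\tilde{z}^4$, for which first-order perturbation theory gives $\tilde{E}_n = n + \tfrac{1}{2} - \tfrac{3}{4}\tilde{f}(2n^2 + 2n + 1)$. It compares that expression with a direct diagonalization in a truncated number basis and finds the largest $n$ for which the level spacing has changed by less than 5%. The basis size and function names are illustrative choices.

```python
import numpy as np


def first_order_energy(n, f):
    """First-order energy of level n in units of hbar*omega (assumed form)."""
    return n + 0.5 - 0.75 * f * (2 * n**2 + 2 * n + 1)


def numerical_energies(f, dim=80):
    """Eigenvalues of 0.5*(p^2 + z^2) - f*z^4 in a truncated number basis."""
    n = np.arange(dim)
    a = np.diag(np.sqrt(n[1:]), k=1)          # annihilation operator
    z = (a + a.T) / np.sqrt(2.0)              # dimensionless position
    h = np.diag(n + 0.5) - f * np.linalg.matrix_power(z, 4)
    return np.linalg.eigvalsh(h)              # sorted ascending


def max_n_within(f, tol=0.05):
    """Largest n whose spacing E_{n+1} - E_n differs from 1 by less than tol."""
    n = 0
    while 3.0 * f * (n + 2) < tol:
        n += 1
    return n


f_radial, f_axial = 7e-4, 1e-4
exact = numerical_energies(f_radial)
for level in range(5):
    print(level, exact[level], first_order_energy(level, f_radial))

print("radial limit:", max_n_within(f_radial))
print("axial limit: ", max_n_within(f_axial))
```

With a strict 5% threshold the limits come out slightly below the rounded values of $n \sim 25$ and $n \sim 170$ quoted in the text, which is the level of agreement one would expect from order-of-magnitude values of $\tilde{f}$.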

B. Polarization gradients
For displacements away from the focus along the polarization vector of the incident light, the local polarization has a circular component with handedness along $y$, perpendicular to both the incident polarization vector and the optical axis. The quantity $\frac{1}{\sqrt{2}}\,\mathrm{Im}(S^1_{-1} + S^1_{1})$, shown in Fig. 10(b), is proportional to the intensity of circularly polarized light along this axis. The component has opposite handedness on either side of the focus, reflected by the change of sign in the plot, and couples to the vector part of the polarizability. Within a given hyperfine level, the vector contribution to the Hamiltonian looks like a fictitious magnetic field along the axis of the circular component. The gradient of polarization creates a gradient of this fictitious magnetic field that can shift the center of the trap for different $m_F$ states. The effect can be reduced to a negligible level by applying a large magnetic field orthogonal to the fictitious magnetic field, as previously demonstrated for atoms [21,29]. Our method for sideband cooling of molecules already requires this large applied field, so the suppression will be automatic.
Figure 10(c) shows component $S^2_0$. It is equal to 1 at the trap center where the light field is linearly polarized. The structure is very similar to that of the intensity distribution in Fig. 10(a) with a slight and asymmetric narrowing caused by the polarization gradient. The component couples to the tensor polarizability to provide a second mechanism by which the vector contribution is suppressed. To see this, choose a quantization axis for the molecule along $z$, the incident polarization direction. The effect of the vector Stark shift is that of a magnetic field orthogonal to this axis, introducing an off-diagonal matrix element (let us call it $a_1$) that couples $m_N = \pm 1$ to $m_N = 0$. 
The $P^2_0$ component of the polarization tensor introduces diagonal matrix elements which shift the $m_N = \pm 1$ states relative to $m_N = 0$ by an amount $a_2$. When $a_2 \gg a_1$, as is often the case in molecules where the tensor polarizability is large compared to the vector polarizability, the coupling $a_1$ is ineffective because it couples states far apart in energy; the effect of the vector part is suppressed relative to $a_1$ by the factor $a_1/a_2$.
Figure 10(d) shows $\frac{1}{\sqrt{2}}\,\mathrm{Re}(S^2_{-2} + S^2_{2})$. This part is zero for a light field linearly polarized with $\beta = 0$, as at the center of the trap, but is not quite zero at other positions across the trap volume. The component can split the energies of the $m_N = \pm 1$ states but its size is comparatively small (note the ×40 scaling) and so has no significant effect on the trapping potential. This is confirmed by the nearly identical trapping frequencies calculated for the two states at the beginning of this section.
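The suppression of the vector coupling by the tensor shift can be illustrated with a toy three-level model. The matrix below is an assumption made for illustration only: a real, symmetric coupling $a_1$ between $m_N = \pm 1$ and $m_N = 0$, and a diagonal offset $a_2$ for the $m_N = \pm 1$ states, in arbitrary units. The exact matrix elements in the full Hamiltonian differ in their numerical factors and phases.

```python
import numpy as np


def toy_levels(a1, a2):
    """Eigenvalues of a toy N = 1 manifold in the basis (m_N = -1, 0, +1).

    a1: off-diagonal coupling standing in for the vector Stark shift.
    a2: diagonal offset of m_N = +/-1 standing in for the tensor Stark shift.
    """
    h = np.array([
        [a2, a1, 0.0],
        [a1, 0.0, a1],
        [0.0, a1, a2],
    ])
    return np.linalg.eigvalsh(h)  # sorted ascending


a1, a2 = 0.05, 1.0
levels = toy_levels(a1, a2)
shift_of_m0 = levels[0]                 # m_N = 0 level starts at zero energy
second_order = -2.0 * a1**2 / a2        # perturbative estimate for this toy matrix

print("eigenvalues:", levels)
print("shift of m_N = 0 level:", shift_of_m0)
print("second-order estimate: ", second_order)
print("shift relative to a1:  ", abs(shift_of_m0) / a1)
```

In this toy model the level shift is of order $a_1^2/a_2$, that is, smaller than $a_1$ by a factor of order $a_1/a_2$, which is the scaling described in the text.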

VII. COOLING RECIPE AND CONCLUSIONS
We have shown how to apply sideband cooling techniques to laser-coolable $^2\Sigma$ molecules in optical tweezer traps. The cooling must proceed from the $N = 1$ rotational level to avoid decays to other rotational levels, but the resulting state-dependent potentials introduce significant additional complexity. This complexity can be greatly reduced by applying a large magnetic field to decouple the nuclear and electron spins from the rotational angular momentum. Under these conditions, families of states can be found with three ground states coupled to a single excited state. For certain choices of laser polarization, two of the three ground states have equal ac Stark shifts, so the frequency of the Raman transition between them is independent of the motional state. The reduction to only three ground states also greatly reduces the number of photons scattered during the optical pumping step, thereby reducing the heating. We have derived a formula for the additional heating caused by the different ac Stark shift of the third state, and have calculated the branching ratio out of this family of states as a function of applied magnetic field.
These considerations lead us to a recipe for Raman sideband cooling of laser-coolable molecules such as CaF. For trapping light with linear incident polarization, the recipe is a straightforward extension of the scheme proposed for the simple molecule in Sec. IV: (i) Apply a weak magnetic field 5
The circularly polarized optical pumping beam must be orientated along the $B$ field. The linearly polarized beam must be orthogonal to this and, as discussed in Sec. IV, to minimize heating should also be perpendicular to the weakly confining optical axis of the trap. Using Eq. (17) and the trap frequencies for CaF in a real tweezer calculated in Sec. VI, the mean change in motional quantum number during the optical pumping step can be written
Under these conditions, we find $(\kappa, \rho)$ equal to (0.17, 0.03) parallel to the incident polarization vector, (0.11, 0.02) perpendicular to both the incident polarization and the optical axis, and (0.29, 0.02) parallel to the optical axis. These calculations show sideband cooling should be effective on the first red sideband for small $n$. Driving higher-order sidebands may be helpful for initial cooling of hot clouds, particularly along the optical axis where the Lamb-Dicke parameter is much larger. As noted in Sec. IV, the cooling can also work for other polarization choices: the main requirement is that two of the three states have equal tensor shifts. For example, choosing a linear polarization at an angle close to $\beta_{\rm magic}$ can satisfy this requirement and also reduce the heating due to curv , which is useful at high $n$. We conclude that the heating due to state-dependent potentials is not a major obstacle for effective cooling. Figure 7 shows that, at 300 G, the branching ratio to other spin manifolds is about $4 \times 10^{-3}$. Increasing the magnetic field reduces the branching ratio further, but only by a factor of 2 for realistic fields. 
Recalling that an average of 3 photons are scattered in the optical pumping step, we see that under these conditions, the cooling cycle can be applied 58 times before half the population is lost to a different spin manifold. For the tweezer parameters considered here, 58 cycles of sideband cooling on the first red sideband corresponds to an energy reduction of 600 μK in either of the radial directions, or 100 μK in the axial direction. It is straightforward to use higher-order sidebands, resulting in proportionally larger energy reductions. Considering that molecular samples with temperatures of 5 μK have already been demonstrated by free-space laser cooling [10,11], we see that it is feasible to reach the ground state with little loss. We also note that loss to other spin manifolds is not fatal since the cooling process could be applied to each spin manifold. Moreover, the entire cooling process, beginning with the optical pumping step at low magnetic field, can be applied multiple times if that proved necessary.
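The figures in this paragraph follow from simple arithmetic, sketched below. The calculation assumes that each scattered photon independently carries the quoted branching ratio of about $4 \times 10^{-3}$, that three photons are scattered per cycle, and that each cycle on the first red sideband removes one motional quantum $h\nu$; the function names are hypothetical.

```python
import math

H_OVER_KB_UK_PER_KHZ = 0.04799243  # h / k_B in microkelvin per kHz


def cycles_to_half(branching_ratio, photons_per_cycle):
    """Number of cooling cycles after which half the population remains."""
    survival_per_cycle = (1.0 - branching_ratio) ** photons_per_cycle
    return math.log(0.5) / math.log(survival_per_cycle)


def energy_removed_uk(trap_freq_khz, cycles):
    """Energy removed in microkelvin, one motional quantum per cycle."""
    return cycles * trap_freq_khz * H_OVER_KB_UK_PER_KHZ


n_cycles = cycles_to_half(4e-3, 3)
radial_uk = energy_removed_uk(213.0, 58)  # trap frequency along the polarization
axial_uk = energy_removed_uk(38.0, 58)    # trap frequency along the optical axis

print(f"cycles to half population: {n_cycles:.1f}")
print(f"radial energy removed: {radial_uk:.0f} uK")
print(f"axial energy removed:  {axial_uk:.0f} uK")
```

The results are consistent with the rounded values in the text: about 58 cycles, roughly 600 μK radially, and roughly 100 μK axially.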
Our analysis here has focused on CaF, but the methods and conclusions also apply to other similar molecules amenable to laser cooling. The ability to cool these molecules to the ground state of tweezer traps is a key advance that will open the door to molecules as processors of quantum information and simulators of many-body quantum systems.

ACKNOWLEDGMENTS
We are grateful to J. Hutson and E. Hinds for helpful discussions. This work was supported by EPSRC under Grants No. EP/M027716/1 and No. EP/P01058X/1.

Operator
Consider a diatomic molecule interacting with light which has electric field amplitude $E_0$, angular frequency $\omega_L$, and unit polarization vector $\boldsymbol{\epsilon}$. The interaction Hamiltonian is $H = -\mathbf{d}\cdot\mathbf{E}$, where $\mathbf{d}$ is the dipole moment operator of the molecule and $\mathbf{E} = \tfrac{1}{2} E_0 (\boldsymbol{\epsilon}\, e^{-i\omega_L t} + \boldsymbol{\epsilon}^* e^{i\omega_L t})$ is the electric field. We suppose that all effects much larger than this molecule-light interaction are included in a zeroth-order Hamiltonian $H_0$, while the molecule-light interaction and all effects of a similar (or smaller) size are treated by perturbation theory. In second-order perturbation theory, the energy shift of a nondegenerate level $i$ is
Here, $\omega_{ji}$ is the transition angular frequency between states $j$ and $i$ and the sum is over all states of the molecule. More generally, we may wish to know the energy shift of levels that are degenerate in the absence of the light, or handle cases where the ac Stark shifts are comparable to other level shifts and splittings, such as those arising from the hyperfine or Zeeman interactions. What is needed is an effective operator, which we will call $H_S$, that describes the effect of the light within a small subspace of levels, for example, a single rotational state. The matrix elements of the effective operator between states $|i\rangle$ and $|i'\rangle$ within the subspace are the generalization of Eq. (A1):
where the sum is over all states of $H_0$ that lie outside the subspace [35]. In spherical coordinates, this expression is
where we have defined the operator

Provided $H_0$ does not include external fields, $R_\pm$ is invariant under rotations. The formula for building a spherical tensor of rank $k_{12}$ from the product of two other spherical tensors of ranks $k_1$ and $k_2$ is
Applying this to the tensor product of two vectors $\mathbf{u}$ and $\mathbf{v}$ gives
The inverse relation gives us the expansion of the product $u_p v_q$ as
Equation (A3) contains two products of this form, one relating to the transition dipole moments of the molecule, and the other to the polarization of the light. Expanding each using Eq. (A7), then evaluating the sums over $p$ and $q$, we find that
Note that the transformation $d_p R_\pm d_q \to d_q R_\pm d_p$ on the left-hand side of this equation multiplies the terms in the sum over $K$ on the right-hand side by $(-1)^K$. Applying these results to Eq. (A3), we find that the effective Stark shift operator is
where we have introduced the polarizability operators
and the polarization tensors $P^K_P = z_K T^K_P(\boldsymbol{\epsilon}, \boldsymbol{\epsilon}^*)$.

Matrix elements for $^1\Sigma$ states
We consider a ground-state molecule with no orbital angular momentum, no electronic spin, and no nuclear spin. In this simple case, the basis states are $|\Lambda, N, m_N\rangle$ with $\Lambda = 0$. Here, the quantum numbers are the projection of the orbital angular momentum onto the internuclear axis ($\Lambda$), the rotational angular momentum ($N$), and its projection onto the $z$ axis ($m_N$). The matrix elements of the polarizability tensor are
To evaluate the reduced matrix element, we rotate into the frame of the molecule using
Here, the index $P$ is used for laboratory-frame components, and the index $Q$ for molecule-frame components, and $D^K$ is the rotation operator of rank $K$ that transforms between them. This gives
In the first line, the dot in the subscript of the rotation operator indicates that the matrix element is reduced relative to the index $P$. In the last step, we have set $Q = 0$ since this is the only nonzero term in the sum over $Q$. Let us define the molecule-frame parallel and perpendicular polarizability components:
Here, $X$ labels the $^1\Sigma$ ground state of interest, the index $j$ labels the set of $\Sigma$ excited states, $k$ labels the set of $\Pi$ excited states, and the dipole operators are acting in the molecule frame. We note that $|\langle X|d_{-1}|k, \Pi\rangle|^2 = |\langle X|d_{1}|k, \Pi\rangle|^2$ because a $\Pi$ state is an equal superposition of $\Lambda = \pm 1$. We introduce the molecular parameters $\alpha_K = \langle \Lambda = 0|A^K_{Q=0}|\Lambda = 0\rangle$, which we can think of as the scalar, vector, and tensor polarizabilities in the molecular frame. Using the definitions for the components of $A$ given by Eq. (A12), the definitions of $\alpha_\parallel$, $\alpha_\perp$, and $\alpha_K$, and the relation $\langle i|d_q|j\rangle = (-1)^q \langle j|d_{-q}|i\rangle$, we find the complete expression for the matrix elements (for $\Lambda = 0$, $S = 0$):
where

Matrix elements for $^2\Sigma$ states
Now we consider a more complicated case where the basis states are $|\Lambda, N, S, J, I, F, m_F\rangle$. In this order, the quantum numbers are the projection of the orbital angular momentum onto the internuclear axis, the rotational angular momentum, the total electronic spin, the total electronic angular momentum, the nuclear spin, the total angular momentum, and the projection of the total angular momentum onto the $z$ axis. Later, we shall also introduce the quantum numbers $\Sigma$ and $\Omega$, which are the projections of $S$ and $J$ onto the internuclear axis. Using the Wigner-Eckart theorem, and the fact that the operator acts in the space of the electronic coordinates, we have