Preparation of $^{87}$Rb and $^{133}$Cs in the motional ground state of a single optical tweezer

We report simultaneous Raman sideband cooling of a single $^{87}$Rb atom and a single $^{133}$Cs atom held in separate optical tweezers at 814\,nm and 938\,nm, respectively. Starting from outside the Lamb-Dicke regime, after 45\,ms of cooling we measure probabilities to occupy the three-dimensional motional ground state of 0.86$^{+0.03}_{-0.04}$ for Rb and 0.95$^{+0.03}_{-0.04}$ for Cs. Our setup overlaps the Raman laser beams used to cool Rb and Cs, reducing hardware requirements by sharing equipment along the same beam path. The cooling protocol is scalable, and we demonstrate cooling of single Rb atoms in an array of four tweezers. After motional ground-state cooling, a 938\,nm tweezer is translated to overlap with an 814\,nm tweezer so that a single Rb and a single Cs atom can be transferred into a common 1064\,nm trap. By minimising the heating during the merging and transfer, we prepare the atoms in the relative motional ground state with an efficiency of 0.81$^{+0.08}_{-0.08}$. This is a crucial step towards the formation of single RbCs molecules confined in optical tweezer arrays.


Introduction
Optical tweezer arrays trapping individual neutral atoms have been demonstrated as a platform where quantum simulation and quantum information processing could be used to solve complex problems in physics and chemistry [1,2,3,4,5,6,7]. Dynamic control over trap positions allows deterministic preparation of a fixed number of particles in a scalable geometry [8,9,10,11]. Microwave transitions or optical Raman transitions control the internal state of individual particles on timescales much faster than the decoherence time [12]. Entanglement can be generated within an array by using on-site collisions [13] or dipole-dipole interactions between Rydberg states [14,15,16]. Along with high-fidelity readout, the optical tweezer array has all of the essential elements for quantum computation [17,18]. The majority of experiments to date have focused on a single atomic species, although dual-species experiments are emerging [19,20,21].
Utilising dual-species tweezer arrays opens new avenues for research. The energy difference between atomic transitions in different species of atoms allows species-selective loading and imaging [20]. Species-selective imaging could be used to perform quantum non-demolition measurements with low cross-talk [22]. Furthermore, one can create species-selective traps using tweezers with different wavelengths [19,23,24]. By merging traps to bring two atoms together, one can produce molecules using microwaves and spin-motion coupling [25], by using photoassociation [26], or by associating across a magnetic Feshbach resonance [27].
Ultracold heteronuclear molecules offer several advantages for experiments in quantum science [28], including controllable long-range dipole-dipole interactions, a diverse set of energy levels associated with rotation and vibration, and strong coupling to applied electric and microwave fields. An array of optical tweezers containing single ultracold molecules is an enticing platform for quantum simulation [29,30,31,32], quantum computation [33,34,35,36,37,38,39], and investigations into ultracold molecular collisions [40,41,42,43]. New opportunities arise for quantum information processing exploiting the rich internal structure [38,44]. All of the aforementioned applications benefit from the preparation of the particle in a single motional state of the tweezer. For example, reduction in thermal dephasing improves the fidelity of the transfer between quantum states [45,46,47], an essential component for implementing quantum gates. Single atoms can be prepared in the ground state of the harmonic trapping potential through the process of Raman sideband cooling (RSC) [46,48,49,50,51,47]. Certain species of molecule can be directly cooled [52,53,54,55] and have recently been trapped in optical tweezers [56]. However, cooling to the motional ground state is an ongoing challenge due to the complex energy level structure of molecules [57]. Alternatively, if a pair of atoms in the relative motional ground state are magnetoassociated into a molecular state [58,59,27], the molecule will inherit the initial motional state of the atom pair [25,27]. In fact, the initial preparation of the relative motional ground state is essential for the efficient magnetoassociation of the two atoms into a molecule [60,27]. This approach has recently been used to prepare a tweezer array of NaCs molecules in their motional ground states [21]. Extending this approach to other bialkali molecules will open up new opportunities, leveraging the existing work on bulk gases. 
In the case of RbCs molecules, the rotational and hyperfine structure [61,62,63] and AC Stark shifts [62,64] have already been characterised in great detail. This has enabled the experimental demonstration of robust storage qubits based upon long-lived coherent superpositions of hyperfine states [65]. Moreover, the molecular structure of the $X\,^1\Sigma^+ \rightarrow b\,^3\Pi$ transition in RbCs has been shown to be ideally suited for the construction of a magic trap for multiple rotational transitions [66], permitting long rotational-state coherences and opening up interesting possibilities to encode synthetic dimensions in the molecule [67].
This work outlines progress towards creating an array of ultracold $^{87}$Rb$^{133}$Cs molecules. We demonstrate simultaneous RSC of a single rubidium ($^{87}$Rb) atom and a single caesium ($^{133}$Cs) atom held in separate optical tweezers. We achieve a probability of 0.86$^{+0.03}_{-0.04}$ that a Rb atom occupies the motional ground state of an 814 nm optical tweezer. Similarly, we achieve a probability of 0.95$^{+0.03}_{-0.04}$ that a Cs atom occupies the motional ground state of a 938 nm tweezer. Our cooling protocol is designed to cool atoms from outside the Lamb-Dicke regime in order to access the benefits of using moderate tweezer powers. The challenge of cooling in a relatively shallow trap is overcome using higher order sideband transitions [50]. Furthermore, we show that the cooling is possible using a condensed optical setup that shares hardware between the RSC laser beam paths for both species. We demonstrate the scalability of our setup by simultaneously cooling four Rb atoms in a one-dimensional tweezer array to a mean motional ground-state probability of 0.64$^{+0.03}_{-0.05}$. Finally, following cooling, a Rb and a Cs atom are transferred with minimal heating to the same trapping potential of a 1064 nm tweezer. We achieve a probability of 0.81$^{+0.08}_{-0.08}$ for preparing the atoms in their relative motional ground state in the Zeeman hyperfine states $|f_{\mathrm{Rb}} = 2, m_{f,\mathrm{Rb}} = 2\rangle$ and $|f_{\mathrm{Cs}} = 4, m_{f,\mathrm{Cs}} = 4\rangle$.
The structure of this paper is as follows. To begin, section 2 describes the general protocol for RSC. Section 3 provides a description of our experimental setup. Section 4 outlines the theoretical model we use to design the optimal cooling sequence, before presenting the design of a pulse sequence that reaches the motional ground state having started outside the Lamb-Dicke regime. In section 5, sideband thermometry is used to quantify the ground-state occupation after RSC and demonstrate high-fidelity preparation of atoms in the motional ground state, including results from cooling an array. Finally, section 6 reports the optimisation and performance of the merging sequence used to prepare a Rb-Cs atom pair in the motional ground state of a common 1064 nm tweezer.

Method of Raman Sideband Cooling
RSC relies on two processes to transfer the atom between the motional Fock states $|n\rangle$ of the optical tweezer trap. In the first step, a stimulated two-photon Raman transition transfers the atom between hyperfine spin states, $|\uparrow\rangle$ and $|\downarrow\rangle$. When the transition is on resonance with a lowering sideband, it performs a spin flip and reduces the motional level: $|\uparrow; n\rangle \rightarrow |\downarrow; n-1\rangle$. Then, in the second step, optical pumping (OP) transfers the population back into the original hyperfine state while preserving the motional level: $|\downarrow; n-1\rangle \rightarrow |\uparrow; n-1\rangle$. The combination of these two processes reduces the motional level by one quantum, as illustrated in Fig. 1. Iterating this procedure cools the atom into the lowest motional level, at which point there is no further level to descend to, and so the atom decouples from both the Raman and the OP light.
Figure 1. The two stages of a Raman sideband cooling iteration. First, a coherent two-photon Raman transition transfers some of the population from $|\uparrow; n\rangle$ to $|\downarrow; n-1\rangle$. One beam is circularly polarised for $\sigma^+$ transitions with Rabi frequency $\Omega_\sigma$. The other is linearly polarised for $\pi$ transitions with Rabi frequency $\Omega_\pi$. The single-photon detuning from the excited state is $\Delta_R$. Then a dissipative optical pumping step resets the spin, preserving the motional level: $|\downarrow; n-1\rangle \rightarrow |\uparrow; n-1\rangle$. Each iteration of these stages removes one quantum of motional energy, $\hbar\omega_{\mathrm{trap}}$, where $\omega_{\mathrm{trap}}$ is the trap frequency. The hyperfine spin state manifolds are labelled by the total angular momentum quantum number, $f$.
The tight confinement of optical tweezers puts the atom in the Lamb-Dicke (LD) regime, allowing control over the motional level through atom-light interactions. Atoms are illuminated by laser light, leading to photon scattering events which result in atomic recoil due to the conservation of momentum. The LD parameter $\eta = \sqrt{\omega_{\mathrm{recoil}}/\omega_{\mathrm{trap}}} = k\sqrt{\hbar/(2m\omega_{\mathrm{trap}})}$ is determined by the trap frequency, $\omega_{\mathrm{trap}}$, and the photon recoil energy, $\hbar\omega_{\mathrm{recoil}} = \hbar^2k^2/2m$ for resultant wavevector $k$ and mass $m$. The LD parameter satisfies $\eta^2(n+1) \ll 1$ in the LD regime, resulting in a suppression of motional excitation during photon scattering events [68]. Being in the LD regime is important for both of the aforementioned steps of RSC. In the OP step, the excitations and subsequent spontaneous emissions are on the carrier transition, i.e. they preserve the motional level. But it is also possible to make transitions between specific motional states (sideband transitions), provided the transition linewidth is smaller than the spacing of the energy levels. A stimulated two-photon Raman transition satisfies this condition by coupling two long-lived states via an excited state that is not populated [69,70]. Sideband transitions that change the motional level occur at intervals of the trap frequency.
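As a rough numerical illustration of the LD parameter defined above, the following minimal sketch evaluates $\eta = k\sqrt{\hbar/(2m\omega_{\mathrm{trap}})}$ for trap frequencies quoted in this work. The function name `lamb_dicke` and the parameter `k_scale` are hypothetical; setting `k_scale = 1` (the projected two-photon wavevector equals one single-photon wavevector) is an assumption about the beam geometry, and the wavelengths are rounded to 852 nm and 780 nm.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant (J s)
AMU = 1.66053906660e-27   # atomic mass unit (kg)

def lamb_dicke(mass_amu, wavelength_m, trap_freq_hz, k_scale=1.0):
    """Return eta = k_eff * sqrt(hbar / (2 m omega_trap)).

    k_scale is the projection of the resultant two-photon wavevector onto
    the trap axis, in units of the single-photon wavevector 2*pi/lambda.
    k_scale = 1 is an assumed geometry, not a measured quantity.
    """
    k_eff = k_scale * 2.0 * math.pi / wavelength_m
    omega_trap = 2.0 * math.pi * trap_freq_hz
    mass = mass_amu * AMU
    return k_eff * math.sqrt(HBAR / (2.0 * mass * omega_trap))

def ld_product(eta, n):
    """eta^2 (n + 1), which must be much smaller than 1 in the LD regime."""
    return eta**2 * (n + 1)

eta_cs_radial = lamb_dicke(132.905, 852e-9, 120e3)   # Cs, 120 kHz radial axis
eta_rb_radial = lamb_dicke(86.909, 780e-9, 163e3)    # Rb, 163 kHz radial axis
eta_cs_axial = lamb_dicke(132.905, 852e-9, 17e3)     # Cs, 17 kHz axial axis

print(f"Cs radial eta = {eta_cs_radial:.3f}")
print(f"Rb radial eta = {eta_rb_radial:.3f}")
print(f"Cs axial eta = {eta_cs_axial:.3f}, "
      f"eta^2 (n + 1) at n = 10: {ld_product(eta_cs_axial, 10):.2f}")
```

Under these assumptions the radial value for Cs is close to 0.13, while the axial product $\eta^2(n+1)$ at $n = 10$ exceeds unity, which is the sense in which the axial motion starts outside the LD regime.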
In standard notation, a blue sideband (BSB) transition increases the motional level, whereas a red sideband (RSB) transition reduces the motional level. The direction of the atomic recoil momentum determines which trap axes the Raman transition can couple to. To cool to the 3D ground state, the laser beams driving Raman transitions must be arranged to allow coupling to the different trap axes. The effectiveness of an RSC protocol is determined by the competition between cooling rates and heating rates. The main limitation on the cooling rate is the reduced sideband transfer due to dephasing from differential light shifts, beam power fluctuations, and magnetic field noise. The important sources of heating are intensity and pointing noise from the tweezer trap, recoil from OP photons, and off-resonant carrier and BSB transitions. These obstacles to effective ground-state cooling are addressed in section 4.
Figure 2. (a) Lasers operating at 780 nm and 852 nm generate light for driving Raman transitions in Rb and Cs, respectively. Electro-optic modulators (EOMs) add frequency sidebands to the RB1 beams at the hyperfine splitting of the electronic ground state, prior to overlap. For the remaining Raman beams, the 780 nm and 852 nm light is first overlapped and then split into three separate beam paths, each with an acousto-optic modulator (AOM) to control the power. (b) Raman beams are fibre-coupled to the main experiment and focused down to beam waists of 100-160 µm at the position of the atoms. RB1 drives $\sigma^+$ transitions and is pulsed on with one of the other linearly polarised Raman beams to give a two-photon transition coupling the harmonic motion along one of the trap axes. We use RB1+RB4 to couple to the atomic motion along the propagation direction of the tweezer, $z$. Both RB1+RB2 and RB1+RB3 can couple to the motion in both radial directions. We use RB1+RB2 for the $x$-direction, and RB1+RB3 for the $y$-direction. The inset depicts the radial asymmetry where the beam waists satisfy $w_x > w_y$. The optical pumping beam is aligned with the quantisation axis to drive $\sigma^+$ transitions with high fidelity.

Experimental Setup
Here we give an overview of the experimental setup for RSC by outlining a typical cooling sequence. The most relevant components are displayed in Fig. 2. The initial preparation of Rb and Cs in species-selective tweezers is described in detail in Ref. [23]. A typical cooling sequence begins with loading a Rb atom from a magneto-optical trap (MOT) into an optical tweezer with a wavelength of 814 nm, and subsequently loading a Cs atom from a MOT into a tweezer with wavelength 938 nm. This is a stochastic process that takes ∼250 ms, after which each tweezer has ∼50% probability of being occupied by a single atom. The tweezers are formed by focusing the laser beams through the same high numerical aperture objective lens. Their foci are not radially symmetric due to clipping of the laser beams before they enter the objective lens. At the focus the 938 nm tweezer has a beam waist $\{w_x^{938}, w_y^{938}\} = \{1.29(4), 1.06(2)\}$ µm, and the 814 nm tweezer has a beam waist $\{w_x^{814}, w_y^{814}\} = \{1.03(3), 0.83(2)\}$ µm. When the atoms are loaded into the tweezers, they are separated by a distance of 4.5 µm in the $x$-direction. We use fluorescence imaging to detect the occupation of the tweezers. Initially, we use a release and recapture technique [71] to measure the temperature of the atoms in the tweezers. This temperature corresponds to the mean energy of the atom's radial motion after averaging over many iterations of the experiment, where each iteration samples motional levels in the $x$- and $y$-directions from independent thermal distributions. After 10 ms of polarisation gradient cooling, we typically measure a temperature of ∼15 µK for Cs in a 1.3 mK deep 938 nm trap, and ∼30 µK for Rb in a 1.3 mK deep 814 nm trap. Following the initial cooling, the powers of both tweezers are increased so that the trap depths are 2 mK for Cs and 1.5 mK for Rb. The typical trap frequencies are $\{\nu_x, \nu_y, \nu_z\}^{938}_{\mathrm{Cs}} = \{84, 120, 17\}$ kHz and $\{\nu_x, \nu_y, \nu_z\}^{814}_{\mathrm{Rb}} = \{107, 163, 25\}$ kHz.
At this point typical mean motional levels are $\{\bar{n}_x, \bar{n}_y, \bar{n}_z\} \approx \{2, 1, 10\}$. Then RSC pulses of 780 nm light for Rb and 852 nm light for Cs are applied simultaneously to cool the atoms to the motional ground state. In order to detect the mean motional state, a Raman pulse is used to transfer population out of the upper hyperfine level. Finally, a resonant pushout pulse ejects any atom remaining in the upper hyperfine level and hence maps the atomic spin state onto the trap occupancy in a second fluorescence image [72,73,23].
To perform efficient RSC it is crucial to maintain the motional state while changing the spin state. A limiting factor is the spin-motion coupling introduced by the vector light shift of the optical tweezer trap [74,25]. For a linearly polarised tweezer, the tight focusing of the light introduces some ellipticity to its polarisation around the focus, resulting in a vector light shift equivalent to a nonuniform fictitious magnetic field [48,46,75]. As the vector light shift has a spatial dependence, there is an effective magnetic field gradient that offsets the trap centre for different $m_f$ states. The displacement in the trap centre for a spin flip with $\Delta m_f = 1$ between hyperfine states is similar to the ground-state atomic wavepacket size. Therefore, there is a high probability of motional excitation during the OP step. In our setup, the tweezer polarisation is set along the $x$-axis, which results in an effective magnetic field that points in the $y$-direction. We suppress the effective field gradient, and consequently the spin-motion coupling, by applying a magnetic field of 4.8 G along the $x$-direction during OP and RSC, perpendicular to the effective magnetic field [48]. This applied magnetic field contributes 1 mG of magnetic field noise, and we measure drifts in the ambient field of order 1 mG from day to day. We will compare the dephasing on Raman transitions from vector light shifts and magnetic field noise in section 5.
It is also important to have high-fidelity state preparation and OP during RSC. The same laser beams are used for both tasks. Our scheme pumps to a spin-stretched hyperfine sub-level with total angular momentum quantum number $f$ using resonant excitation on the D$_2$ line with circularly polarised light driving $\sigma^+$ transitions. To ensure that there is only a single dark state, we use two overlapped laser beams for each atomic species, derived from the lasers used for the MOT beams. The first beam drives $|f = i - 1/2, m_f\rangle \rightarrow |f' = i + 1/2, m_f' = m_f + 1\rangle$ transitions, where the nuclear spin quantum number is $i = 3/2$ for Rb and $i = 7/2$ for Cs. The second beam drives $|f = i + 1/2, m_f\rangle \rightarrow |f' = i + 1/2, m_f' = m_f + 1\rangle$ transitions. In this case, the spin-stretched states, $|f = 2, m_f = 2\rangle$ for Rb and $|f = 4, m_f = 4\rangle$ for Cs, are dark to the OP light provided the polarisation is pure. High-purity circular polarisation is achieved using a polariser with extinction > 5000:1 followed by an achromatic quarter waveplate.
We measure a polarisation purity of > 2000:1 using the atoms after optimising the angle of the quarter waveplate and the direction of the bias field. Fig. 2 depicts the lasers used for Raman transitions. All of the Raman beams for one species are generated from the same laser, which ensures the phase coherence of two-photon Raman transitions. Our choice of laser frequency is a compromise between reducing off-resonant single-photon scattering and maintaining sufficiently strong coupling for two-photon Raman transitions. The 780 nm laser is red-detuned with a single-photon detuning of $\Delta_R = 50$ GHz from the Rb D$_2$ line, and the 852 nm laser is red-detuned by $\Delta_R = 41$ GHz from the Cs D$_2$ line. Fig. 2(a) displays the hardware controlling the Raman beams. The 852 nm light is overlapped with the 780 nm light using a dichroic mirror so that they share hardware in the RB2, RB3, and RB4 beam paths. Frequency and power control is achieved using five acousto-optic modulators (AOMs). The Bragg diffraction angle of each AOM is wavelength dependent, so we must compromise the diffraction efficiency between the optimum for 780 nm and the optimum for 852 nm. Despite this, we typically achieve first order diffraction efficiencies of > 50% for both wavelengths. Electro-optic modulators (EOMs) are used to add frequency sidebands at 6.8 GHz for Rb and 9.2 GHz for Cs. To suppress the possibility of driving unwanted transitions and introducing pathways for quantum interference [76], we offset the EOM frequency by 10 MHz from the Zeeman-shifted ground-state hyperfine splitting [46] (see Appendix A). The use of separate AOMs for RB1 (AOM1a for Cs and AOM1b for Rb) allows for independent control over the two-photon detuning and Rabi frequency of each species. However, sharing AOMs means that the pulse durations of RB2, RB3, and RB4 are constrained to be the same for both wavelengths. We therefore set the Rabi frequencies such that the sideband $\pi$-pulse duration is equal for Rb and Cs.
The geometry of Raman beams in Fig. 2(b) allows us to couple to the motional state along the three orthogonal axes of the trap. The two-photon Raman transitions also couple the spin states $\{|\downarrow\rangle, |\uparrow\rangle\}$. To couple to the motion along a given axis, the resultant wavevector from the combination of the two beams must have a non-zero projection along that trap axis. The tweezer propagates along the $z$-direction and the radial asymmetry in the $x$-$y$ plane is depicted in the inset of Fig. 2(b). It should be noted that both RB1+RB2 and RB1+RB3 are capable of coupling to both radial axes. The trap asymmetry means sideband transitions along both radial axes can be spectrally resolved. Strictly speaking, only one of RB2 or RB3 is required. Although the current work makes use of both RB2 and RB3, we have verified that the cooling protocol reaches the same final ground-state fraction using only RB2. The Raman beams are focused onto the atoms to produce beam waists of 100-160 µm so that modest laser powers can be used to achieve the desired Rabi frequencies. We note that a combined Raman beam power of 2 mW at the atoms is sufficient for our cooling scheme. The Rabi frequency is chosen to maximise the cooling rate, but is constrained by other parameters such as the trap frequency, as will be explained in the next section.
Figure 3. (c) Starting in $|n = 0\rangle$, the probability of excitation after applying a Raman pulse on the RSB increases with the Rabi frequency, $\Omega_R$. The peak Rabi frequency is scaled to keep the pulse area the same for the different pulse profiles. The pulse shapes with a broad Fourier spectrum have a higher probability of excitation. The trap frequency is $\omega_{\mathrm{trap}} = 120$ kHz and the pulse duration is $T = \pi/\eta\Omega_R$, with LD parameter $\eta = 0.13$.

Designing the Protocol for Raman Sideband Cooling
Our RSC protocol uses a sequence of Raman pulses to cool an atom to the motional ground state, starting from outside the LD regime. In order to accomplish this, the Raman pulses are shaped with a smoothed temporal profile and the pulse sequence targets several different sideband transitions. Using these techniques enables cooling at lower trap depths where both the decoherence caused by differential light shifts and the heating from photon scattering are reduced. More specifically, lowering the trap depth can increase the effectiveness of RSC by increasing the cooling rate and reducing heating rates. The cooling rate is proportional to the sideband transfer efficiency, which is improved by reducing dephasing from differential light shifts that scale in proportion to the trap depth [45,48]. The differential light shifts are significant in our case because the tweezers are near-detuned to ensure species-selectivity [23]. Furthermore, the heating rates from tweezer photon scattering, intensity noise, and pointing noise are reduced by using lower trap depths [77,78]. Finally, given the practical limit on the maximum tweezer power available, using less power per trap allows the production of larger arrays. There is therefore a strong incentive to design a cooling protocol that works for shallow traps where the atom is initially outside the LD regime. Below we provide more details of the measures we have implemented to overcome the challenges of implementing RSC in relatively shallow traps.

Pulse shaping
Pulse shaping is required when using lower trap depths to reduce the probability of off-resonant Raman carrier or BSB transitions. Lowering the trap depth decreases the sideband splitting such that the finite width of the Raman pulse's Fourier spectrum results in an increased probability of off-resonant excitation. These undesired transitions reduce the cooling rate by skipping a cooling step and necessitating another OP step to reset the spin, which has associated heating from photon recoil.
A suitable pulse shape is chosen based on the balance between its spectral width and the temporal duration required to achieve a $\pi$-pulse. Fig. 3(a) compares a square pulse profile to a Blackman-Harris profile and a Tukey profile with cosine fraction 2/3 [79]. The first consideration is to minimise the spectral width to avoid off-resonant excitation from two-photon Raman transitions. The abrupt change in amplitude of the square pulse results in the appearance of sidelobes in the Fourier spectrum in Fig. 3(b). The Tukey profile smooths the edges of the pulse to suppress sidelobes. The Blackman-Harris profile is shaped to minimise the sidelobes in the Fourier spectrum. Fig. 3(c) displays the results of solving the Schrödinger equation for the application of a Raman $\pi$-pulse on the RSB, having started in the motional ground state. Here we consider the radial direction, using a trap frequency of $\omega_{\mathrm{trap}} = 120$ kHz and LD parameter $\eta = 0.13$. The square pulse profile has a significant probability of off-resonant excitation. The excitation probability is reduced by using a Tukey profile, but still increases with the Rabi frequency. For our radial trap frequencies, the Tukey profile maintains the minimal excitation given by the Blackman-Harris profile provided that the Rabi frequency is < 40 kHz. The side effect of smoothing the pulse profile is a lower mean Rabi frequency; in order to achieve the same pulse area either a longer duration or a higher peak Rabi frequency is required. Longer pulses are undesirable as they allow time for spontaneous scattering from the tweezer or the Raman beams. Given that there is limited laser power available for the Raman beams, we choose the Tukey profile for the radial directions so that a lower peak Rabi frequency is required. However, in the axial direction the smaller trap frequency necessitates using a Blackman-Harris profile.
In practice, RB1 is always a square pulse, so the Raman coupling is the convolution of the RB2, RB3 or RB4 pulse profile with a square pulse. In the radial directions, the resultant square-root Tukey profile still maintains an acceptable excitation probability of $< 10^{-5}$ for Rabi frequencies < 30 kHz. However, for the axial direction we shape the RB4 pulse with the square of a Blackman-Harris profile, such that the convolution of RB1+RB4 is a Blackman-Harris profile.
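The trade-off between pulse area and spectral width can be sketched numerically. The following minimal example compares the three profiles on a unit time interval; it is an illustration only and does not reproduce the Schrödinger-equation calculation of Fig. 3(c). The four-term Blackman-Harris coefficients are the commonly tabulated ones and are assumed here, since the exact variant used in the experiment is not specified; the function names are hypothetical.

```python
import cmath
import math

def square(x):
    """Square profile on 0 <= x <= 1."""
    return 1.0

def tukey(x, alpha=2.0 / 3.0):
    """Tukey profile on 0 <= x <= 1 with cosine fraction alpha."""
    edge = alpha / 2.0
    if x < edge:
        return 0.5 * (1.0 - math.cos(math.pi * x / edge))
    if x > 1.0 - edge:
        return 0.5 * (1.0 - math.cos(math.pi * (1.0 - x) / edge))
    return 1.0

def blackman_harris(x):
    """Four-term Blackman-Harris profile on 0 <= x <= 1 (assumed coefficients)."""
    a0, a1, a2, a3 = 0.35875, 0.48829, 0.14128, 0.01168
    return (a0 - a1 * math.cos(2 * math.pi * x)
            + a2 * math.cos(4 * math.pi * x)
            - a3 * math.cos(6 * math.pi * x))

def mean_amplitude(shape, steps=4000):
    """Midpoint-rule average of the profile: pulse area per unit peak amplitude."""
    return sum(shape((i + 0.5) / steps) for i in range(steps)) / steps

def relative_spectrum(shape, offset_times_duration, steps=4000):
    """Magnitude of the Fourier component at a frequency offset (in units of
    1/duration), normalised to the component at zero offset."""
    total = 0j
    for i in range(steps):
        x = (i + 0.5) / steps
        total += shape(x) * cmath.exp(2j * math.pi * offset_times_duration * x)
    return abs(total) / steps / mean_amplitude(shape, steps)

for name, shape in [("square", square), ("Tukey", tukey),
                    ("Blackman-Harris", blackman_harris)]:
    print(f"{name:16s} mean amplitude {mean_amplitude(shape):.4f}, "
          f"relative spectrum at 4.5/T {relative_spectrum(shape, 4.5):.2e}")
```

The mean amplitude sets how much the peak Rabi frequency (or the duration) must increase to keep the pulse area fixed: analytically it is $1 - \alpha/2 = 2/3$ for the Tukey profile and $a_0 \approx 0.36$ for the Blackman-Harris profile, which is the cost paid for the suppressed spectral wings.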

Sideband Transitions Outside the Lamb-Dicke Regime
Using shallow trap depths means that the atom starts outside the LD regime, such that the Raman coupling depends on the motional level. The rate at which population is transferred between motional states is defined by the Rabi frequency [80]:
$$\Omega_i(n, m) = \Omega_R\, e^{-\eta_i^2/2} \sqrt{\frac{n_<!}{n_>!}}\; \eta_i^{|n-m|}\, L_{n_<}^{|n-m|}(\eta_i^2). \qquad (1)$$
Here $n_<$ is the smaller of the motional levels $\{n, m\}$, and $n_>$ is the larger. The Raman Rabi frequency, $\Omega_R = \Omega_\pi\Omega_\sigma/(2\Delta_R)$, depends on the single-photon Rabi frequencies, $\Omega_\pi$ and $\Omega_\sigma$, and the single-photon detuning from the P$_{3/2}$ manifold, $\Delta_R$. $L_{n_<}^{|n-m|}(\eta^2)$ is an associated Laguerre polynomial [81]. The LD parameter, $\eta_i$ for trap axis $i \in \{x, y, z\}$, is dependent on the beam geometry that determines the direction of atomic recoil. In our setup the recoil momentum is always at 45 degrees to the trap axis that it couples to. In the LD regime, the Rabi frequency has only a weak dependence on the motional level. However, in the axial direction where $\eta_z \sim 0.3$, the initial thermal distribution with $\bar{n} \sim 10$ starts outside of the LD regime. Fig. 4(a) shows the result of solving the Schrödinger equation for the application of a Raman pulse on the first RSB, demonstrating population trapping where the Raman coupling vanishes at $n = 35$. In contrast, Fig. 4(b)-(d) demonstrate how higher order sideband transitions can be used to achieve strong coupling to motional levels of $n > 20$. In order to address a wide range of motional states, the pulse duration is set to a $\pi$-pulse for $n_{\max}$, the motional level that maximises the sideband Rabi frequency: $T = \pi/\Omega_i(n_{\max}, n_{\max} + \Delta n)$ where $\mathrm{d}\Omega_i(n_{\max}, n_{\max} + \Delta n)/\mathrm{d}n = 0$. Cooling from outside the LD regime is achieved using a sequence of pulses targeting different sideband transitions.
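The dependence of the sideband coupling on the motional level can be explored with a short numerical sketch. It evaluates the standard expression $\Omega(n,m)/\Omega_R = e^{-\eta^2/2}\sqrt{n_<!/n_>!}\,\eta^{|n-m|}L_{n_<}^{|n-m|}(\eta^2)$ using a recurrence for the Laguerre polynomial. The value $\eta_z = 0.33$ is an assumption chosen to be compatible with $\eta_z \sim 0.3$, and the function names are hypothetical.

```python
import math

def gen_laguerre(n, alpha, x):
    """Generalised Laguerre polynomial L_n^alpha(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, 1.0 + alpha - x
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1 + alpha - x) * curr
                            - (k + alpha) * prev) / (k + 1)
    return curr

def rabi_ratio(n, m, eta):
    """Omega(n, m) / Omega_R for a transition between Fock states n and m."""
    lo, hi = min(n, m), max(n, m)
    dn = hi - lo
    # sqrt(lo! / hi!) computed with log-gamma to avoid overflow at large n
    log_fact = 0.5 * (math.lgamma(lo + 1) - math.lgamma(hi + 1))
    return (math.exp(-0.5 * eta**2 + log_fact) * eta**dn
            * gen_laguerre(lo, dn, eta**2))

def strongest_level(dn, eta, n_cut=150):
    """Level n (below n_cut) with the largest |coupling| on the n -> n - dn sideband."""
    return max(range(dn, n_cut), key=lambda n: abs(rabi_ratio(n, n - dn, eta)))

ETA_Z = 0.33  # assumed axial LD parameter

for dn in (1, 2, 3, 4):
    n_best = strongest_level(dn, ETA_Z)
    print(f"sideband n - {dn}: strongest coupling at n = {n_best}, "
          f"ratio = {abs(rabi_ratio(n_best, n_best - dn, ETA_Z)):.3f}")

# The first-order sideband coupling changes sign at intermediate n, so a
# nearby level is only weakly coupled and population can become trapped.
print(rabi_ratio(20, 19, ETA_Z), rabi_ratio(50, 49, ETA_Z))
```

The sign change of the first-order coupling between low and high $n$ is what produces population trapping, while the higher-order sidebands have their strongest coupling at progressively larger $n$. The exact levels depend on the assumed $\eta_z$.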

Assembling a Pulse Sequence
We use simulations to model different pulse sequences and guide the choice of several experimental parameters relevant to the cooling scheme. We do so by solving the Lindblad master equation for the evolution of the atomic state during an RSC pulse sequence [82,46]. The simulations include the coherent transfer from the Raman beams, and dissipative OP steps. The methodology is outlined in Appendix B. Even without considering other heating effects such as scattering from the tweezer, we are able to draw several conclusions. In the radial direction, using a fixed pulse duration achieves a similar fidelity of motional ground-state preparation compared to an optimised routine where each pulse duration is an independent variable [83]. However, in the axial direction, achieving high-fidelity motional ground-state preparation requires addressing the motional levels $n > 20$ using sideband transitions with $|\Delta n| > 1$. Our simulations showed that the higher motional levels should be addressed first, bunching the population distribution in the lower motional levels.
Randomising the two-photon detuning between time steps allows us to simulate drifts in two-photon detuning and places limits on the optimal choice of Rabi frequency. While reducing the Rabi frequency reduces the probability of off-resonant Raman transitions, it also reduces the width of the sideband transition and reduces sideband transfer in the presence of dephasing. The consequence is increased sensitivity to changes in the two-photon detuning. The increased π-pulse duration is also detrimental due to the photon scattering and trap heating effects discussed earlier. Therefore, guided by simulations, we compromise by choosing mean Rabi frequencies of 4 kHz for the axial direction and 20-30 kHz for the radial directions.
Finally, we bring the previous considerations together to construct a pulse sequence that cools an atom to the 3D motional ground state starting from an initial temperature of 10-30 µK. The resulting pulse sequence, displayed in Fig. 5, is composed of 4 groups of pulses which are repeated 5, 10, 10, and 15 times, respectively. Groups 1, 2, and 3 have the form shown in Fig. 5(a). The radial pulses target the $n - 1$ sideband with a duration that corresponds to a $\pi$-pulse for atoms in $n = 3$. In the axial direction, the pulse duration is chosen to maximise the Raman coupling for the desired sideband transition: $T = \pi/\eta_z\Omega_z(n_{\max}, n_{\max} + \Delta n)$. Group 1 applies pulses on the $n - 4$ sideband with $n_{\max} = 66$, group 2 applies pulses on the $n - 3$ sideband with $n_{\max} = 41$, and group 3 applies pulses on the $n - 2$ sideband with $n_{\max} = 21$. Group 4 is the final set of pulses illustrated in Fig. 5(b). In the radial direction with the smaller trap frequency, $x$, we alternate the pulse duration between a $\pi$-pulse for $n = 1$ or $n = 3$ on the $n - 1$ sideband. In the axial direction, we alternate between pulses on the $n - 2$ and $n - 1$ sidebands with $n_{\max} = 21$ and $n_{\max} = 7$, respectively. The whole pulse sequence takes 45 ms. See Appendix C for a table with full details of the pulse sequence.
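The logic of addressing the higher motional levels first can be illustrated with a toy population model of the axial pulses. This is a deliberately simplified sketch and not the Lindblad simulation of Appendix B: each pulse moves population from $n$ to $n - \Delta n$ with probability $\sin^2[\tfrac{\pi}{2}\,\Omega(n, n-\Delta n)/\Omega(n_{\max}, n_{\max}-\Delta n)]$, optical pumping is treated as ideal, and all heating and coherences are ignored. The value $\eta_z = 0.33$, the ladder truncation, and the reading of group 4 as one $n - 2$ and one $n - 1$ pulse per repetition are assumptions.

```python
import math

ETA_Z = 0.33   # assumed axial LD parameter (the text quotes eta_z ~ 0.3)
N_BAR = 10.0   # initial mean axial level, as quoted in the text
N_CUT = 200    # truncation of the Fock ladder (illustrative)

def coupling(n, dn, eta=ETA_Z):
    """|Omega(n, n - dn)| / Omega_R for n >= dn, via the Laguerre recurrence."""
    lo = n - dn
    x = eta**2
    prev, lag = 0.0, 1.0                    # L_{-1} = 0, L_0 = 1
    for k in range(lo):
        prev, lag = lag, ((2 * k + 1 + dn - x) * lag - (k + dn) * prev) / (k + 1)
    log_fact = 0.5 * (math.lgamma(lo + 1) - math.lgamma(n + 1))
    return abs(math.exp(-0.5 * x + log_fact) * eta**dn * lag)

def apply_pulse(pops, dn, n_max):
    """One pulse on the n - dn sideband (a pi-pulse for n_max), then ideal OP."""
    omega_ref = coupling(n_max, dn)
    new = pops[:]
    for n in range(dn, len(pops)):
        p = math.sin(0.5 * math.pi * coupling(n, dn) / omega_ref) ** 2
        moved = p * pops[n]
        new[n] -= moved
        new[n - dn] += moved
    return new

def mean_level(pops):
    return sum(n * p for n, p in enumerate(pops))

# Thermal starting distribution, P(n) proportional to (n_bar / (n_bar + 1))**n.
weights = [(N_BAR / (N_BAR + 1.0)) ** n for n in range(N_CUT)]
norm = sum(weights)
initial = [w / norm for w in weights]

# (sideband order, n_max, repetitions) for the axial pulses of groups 1 to 3.
GROUPS = [(4, 66, 5), (3, 41, 10), (2, 21, 10)]

pops = initial
for dn, n_max, reps in GROUPS:
    for _ in range(reps):
        pops = apply_pulse(pops, dn, n_max)
for _ in range(15):                         # group 4: alternate n - 2 and n - 1
    pops = apply_pulse(pops, 2, 21)
    pops = apply_pulse(pops, 1, 7)

print(f"P(n = 0): {initial[0]:.3f} -> {pops[0]:.3f}")
print(f"mean n:   {mean_level(initial):.2f} -> {mean_level(pops):.2f}")
```

Because the model omits heating, dephasing and off-resonant transitions, its output should be read only as a qualitative picture of how the sequence funnels population down the ladder, not as a prediction of the measured ground-state probability.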

Ground-state Cooling in Separate Tweezers
The RSC pulse sequence is designed for high-fidelity preparation of single atoms in the motional ground state. We confirm the effectiveness of the RSC pulse sequence using sideband thermometry in Fig. 6. Assuming a thermal distribution, the ratio of the RSB and BSB peak amplitudes gives the probability of occupying the motional ground state, $P(n = 0) = 1 - A_{\mathrm{RSB}}/A_{\mathrm{BSB}}$ [84]. After RSC, we extract mean motional levels of $\{\bar{n}_x, \bar{n}_y, \bar{n}_z\}_{\mathrm{Cs}}$ = {0.000 +0. ... The distribution after RSC is not thermal, but the sideband ratio method is still expected to give a sufficiently accurate estimate of the ground-state probability. In the rest of this section, we examine the robustness of the cooling protocol, the dephasing of Rabi oscillations, and the application of the cooling protocol to atoms in an array of tweezers.
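The sideband-ratio estimate can be written out in a few lines. The sketch below assumes a thermal distribution, for which $A_{\mathrm{RSB}}/A_{\mathrm{BSB}} = \bar{n}/(\bar{n}+1)$, and combines the axes by a simple product, which further assumes that the three axes are independent. The amplitudes used are hypothetical and are not measured values from this work.

```python
def ground_state_probability(a_rsb, a_bsb):
    """P(n = 0) = 1 - A_RSB / A_BSB for a thermal distribution."""
    return 1.0 - a_rsb / a_bsb

def mean_motional_level(a_rsb, a_bsb):
    """Mean level n_bar = R / (1 - R), with R = A_RSB / A_BSB."""
    ratio = a_rsb / a_bsb
    return ratio / (1.0 - ratio)

def ground_state_probability_3d(per_axis):
    """Product of per-axis probabilities (assumes independent axes)."""
    result = 1.0
    for p in per_axis:
        result *= p
    return result

# Hypothetical fitted peak amplitudes (RSB, BSB) for the x, y and z axes.
amplitudes = {"x": (0.02, 0.80), "y": (0.01, 0.80), "z": (0.04, 0.80)}

p0 = {axis: ground_state_probability(r, b) for axis, (r, b) in amplitudes.items()}
nbar = {axis: mean_motional_level(r, b) for axis, (r, b) in amplitudes.items()}
p0_3d = ground_state_probability_3d(p0.values())

for axis in amplitudes:
    print(f"{axis}: P(n = 0) = {p0[axis]:.3f}, mean n = {nbar[axis]:.3f}")
print(f"3D ground-state probability = {p0_3d:.3f}")
```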

Robustness of the Raman Sideband Cooling Protocol
The RSC protocol is resilient against fluctuations in the two-photon detuning and Rabi frequency. This resilience is achieved by applying enough pulses to saturate the ground-state probability. The required number of pulses depends on the detuning from two-photon resonance and the pulse duration, but in the radial direction we can saturate the ground-state probability after ∼30 pulses. Our protocol uses 55 pulses on each radial axis so that the radial sideband detuning can drift by 6 kHz with minimal effect on the final ground-state probability given the 20-30 kHz Rabi frequencies. Similarly, the axial sideband detuning must be set to within 2 kHz. This insensitivity to frequency offsets is promising for the extension of this scheme to larger arrays; normalisation of the array's trap intensities to within 10% should result in comparable cooling performance across the array. The Raman beam powers fluctuate by < 3% (< 2 kHz change in light shift), such that the variations in the two-photon Rabi frequency and two-photon detuning are within the boundaries previously stated. Furthermore, the reported 3D ground-state probabilities are achieved when simultaneously cooling Rb and Cs, demonstrating that crossover effects from the influence of the 780 nm light on the Cs RSC, or of the 852 nm light on the Rb RSC, are negligible. Altogether, our simultaneous RSC protocol is a robust method of achieving 3D ground-state probabilities of > 80% for both species.

Dephasing of Rabi oscillations
The effectiveness of the RSC protocol can be seen by examining Rabi oscillations on the Raman carrier transition before and after cooling. Fig. 7 displays a measurement of carrier Rabi oscillations of a Rb atom using RB1+RB3, which can couple to the motion in the radial direction. The carrier Rabi frequency can be evaluated using Eq. 1 with $m = n$. Without applying the RSC pulse sequence, the thermal distribution of motional levels leads to a distribution of Rabi frequencies and hence causes dephasing. This is evident in Fig. 7(a), taken in an 814 nm tweezer without any RSC, where fitting a damped sine function allows us to extract a 1/e time of 0.053(8) ms. We can extract the temperature by instead fitting a sum over the Rabi oscillations from the different motional levels [46], where $P_{\rm MB}(n, T)$ is the Boltzmann probability of occupying motional state $n$ for a thermal distribution with temperature $T$. We include the coupling to both radial axes using $\Omega_r(n, n) = \Omega_x(n, n)\,\Omega_y(n, n)/\Omega_R$ in order to extract the mean temperature. We fit a Raman Rabi frequency of 32.8(3) kHz and a temperature of 25(2) µK. For comparison, the mean temperature from the radial sideband spectroscopy before RSC shown in Fig. 6(b) is 15(2) µK. Typically, we expect a temperature of ∼ 30 µK, as measured using the release and recapture method. Fig. 7(b) shows the extended coherence time of carrier Rabi oscillations straight after applying the RSC protocol in the 814 nm tweezer. The fitted 1/e decay time is 0.25(4) ms. In a precursor sideband thermometry measurement we measured a motional ground-state probability of $0.86^{+0.06}_{-0.06}$. This high-fidelity preparation into the motional ground state removes the effect of thermal dephasing. We attribute the remaining dephasing to a combination of factors stemming from the bare diode laser source used for the tweezer. 
Firstly, the tweezer wavelength of 814 nm is relatively near-detuned from the Rb D1 and D2 lines, leading to significant differential light shifts of ∼ 15 kHz. Secondly, we measure increased broadband intensity noise on the tweezer light after it has passed through the optical fibre which delivers light to the experiment and a subsequent polariser. We believe this noise originates from multiple modes propagating in the fibre. Together, these factors result in an enhanced spread of differential light shifts and increased dephasing. Note that this additional dephasing is also present in Fig. 7(a), meaning that the fitted temperature is likely an overestimate. To confirm that the additional dephasing is a result of differential light shifts from the tweezer, Fig. 7(c) displays Rabi oscillations for Rb in a 938 nm tweezer after applying the RSC protocol, where we extract a 1/e time of 1.2(6) ms. The 938 nm tweezer has a similar level of intensity noise, and hence we measure fast dephasing of Rabi oscillations for a Cs atom. Yet for Rb the dephasing due to the tweezer is greatly suppressed owing to the greater detuning from the Rb D1 and D2 lines. We note that the additional dephasing associated with the bare diode laser can be removed by using a single-frequency laser source. However, the robustness of our RSC protocol is demonstrated by the fact that we still prepare a single motional state with high fidelity despite the presence of this additional dephasing.
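The thermal dephasing mechanism described in this section can be illustrated with a one-dimensional sketch. It assumes the standard carrier matrix element $\Omega(n,n) = \Omega_R\,e^{-\eta^2/2} L_n(\eta^2)$ (with $L_n$ a Laguerre polynomial) and a thermal distribution parametrised by a mean occupation, and it sums $\sin^2(\Omega(n,n)t/2)$ over levels. The fit described in the text couples to both radial axes and is parametrised by temperature, and it also includes tweezer-induced dephasing, so this is not the fitting function itself. All function names and the example parameters ($\eta = 0.15$, mean occupation 4) are illustrative choices.

```python
import math


def laguerre(n, x):
    """Laguerre polynomial L_n(x) from the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, 1.0 - x
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1 - x) * curr - k * prev) / (k + 1)
    return curr


def carrier_rabi(n, eta, omega_r):
    """Carrier Rabi frequency for motional level n (same units as omega_r)."""
    return omega_r * math.exp(-eta ** 2 / 2) * laguerre(n, eta ** 2)


def thermal_weights(n_bar, n_max):
    """Thermal populations of levels 0..n_max, renormalised after truncation."""
    q = n_bar / (1.0 + n_bar)
    weights = [q ** n for n in range(n_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def spin_flip_probability(t_ms, n_bar, eta, rabi_khz, n_max=100):
    """Thermally averaged carrier Rabi oscillation at time t_ms (1D sketch)."""
    omega_r = 2 * math.pi * rabi_khz  # rad/ms
    weights = thermal_weights(n_bar, n_max)
    return sum(
        p * math.sin(carrier_rabi(n, eta, omega_r) * t_ms / 2) ** 2
        for n, p in enumerate(weights)
    )


# Compare a ground-state atom with a thermal atom after five pi-pulse durations
t_pi = math.pi / carrier_rabi(0, 0.15, 2 * math.pi * 32.8)
cold = spin_flip_probability(5 * t_pi, n_bar=0.0, eta=0.15, rabi_khz=32.8)
hot = spin_flip_probability(5 * t_pi, n_bar=4.0, eta=0.15, rabi_khz=32.8)
```

In this sketch the ground-state atom returns to full contrast while the thermal atom does not, which is the qualitative behaviour seen when comparing the oscillations before and after RSC.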

Cooling an Array
In order to prepare an array of ultracold RbCs molecules, we must first prepare an array of atoms in the motional ground state. Fig. 8 displays sideband spectroscopy after applying the RSC protocol simultaneously to four tweezers with wavelength 817 nm trapping Rb atoms in a 1D array with separation 4 µm. The measured motional ground-state probability is $0.72^{+0.05}_{-0.05}$, $0.70^{+0.07}_{-0.10}$, $0.48^{+0.08}_{-0.12}$, and $0.67^{+0.07}_{-0.09}$ for traps 0, 1, 2, and 3, respectively. Details of the generation of the array using a two-axis acousto-optic deflector (AOD) added to our setup can be found in Appendix D. The trap frequencies of the array were normalised to within 3% of the mean value so that the two-photon detuning was within the bounds of efficient cooling by the RSC protocol; this level of array intensity normalisation has been achieved for arrays of over 100 atoms [85]. The motional ground-state preparation has a lower fidelity than in a single trap because of two technical issues that can be easily resolved in future work. Firstly, the intensity noise on the tweezer light previously mentioned increases with the laser power. Secondly, the polarisation purity measured straight after the two-axis AOD is reduced by a factor of between 10 and 70 across the array. This leads to a reduction in purity as measured using the atoms and to a decrease in the efficiency of the cooling. Implementing a polariser after the AOD would restore the intended linear polarisation. Therefore, we conclude that our RSC protocol is suited for effective cooling of atoms in an array.

Preparing Atom Pairs in the Relative Motional Ground State
Once a Rb and a Cs atom have been prepared in the motional ground states of their respective tweezers, the next step on the route to creating molecules is to merge the traps in order to prepare an atom pair in a single optical tweezer. For molecule creation the atom pair must be in the relative motional ground state of the trap. Therefore, it is important that the transportation of the atoms and merging of the traps maintains the motional state. A balance must be found between merging slowly enough to avoid motional excitation from the movement and not leaving excess time for photon scattering. Our merging sequence, displayed in Fig. 9(a), moves the Cs atom in the 938 nm tweezer to the position of a Rb atom in a stationary 814 nm tweezer, before transferring both atoms into a 1064 nm tweezer at the same position. We optimise the merging by decomposing the sequence to isolate the effects on each atom, as described below. We find that in order to avoid heating, we must carefully choose the trajectory and duration of movement for the 938 nm tweezer in conjunction with the powers of both tweezers. First, we place a limit on the total duration of the merge by considering photon scattering. As previously mentioned, off-resonant scattering from the tweezers causes heating from photon recoil through Rayleigh scattering, or changes to the spin state through Raman scattering, which necessitates an additional OP step with associated photon recoil. We calculate a heating rate in the axial direction of 0.02 quanta ms$^{-1}$ for Rb in the 814 nm tweezer with a typical power of 1 mW, and 0.004 quanta ms$^{-1}$ for Cs in the 938 nm tweezer for a typical power of 4 mW. These heating rates limit the merge duration to a few milliseconds.
Secondly, we consider the limitations on the movement of the Cs atom in the 938 nm tweezer alone. The position of the 938 nm tweezer is dynamically controlled in the $x$-direction by an AOD, as previously described in Refs. [23,86]. Chirping the frequency of the RF signal driving the AOD translates the tweezer, but the sweep must be slow enough to avoid excitation. To reliably keep the motional excitation < 0.01 quanta in the direction of transport while adiabatically transporting a Cs atom 4.5 µm in a 3.8 mW 938 nm tweezer, the duration must be > 0.11 ms [87]. However, faster transport while maintaining the motional state is possible using shortcuts to adiabaticity [88,89,90,91,92,93,94], assuming that the trap frequency is constant throughout the trajectory.
Unfortunately, the diffraction efficiency of the AOD has an oscillatory dependence on the driving frequency, which can cause resonant intensity modulation at certain sweep rates (see Appendix E). The resulting heating can be avoided by using a constant sweep rate where the intensity modulation is not resonant with the trap frequency. However, the linear chirp has significant jerk, which will heat the atom. Therefore, we form a hybrid minimum-jerk trajectory that starts and ends with a minimum-jerk function [26]. Fig. 9(b) displays a trajectory with a duration of 1.6 ms where the sweep rate is constant for the central 10 %. Using a trajectory for 4.5 µm of movement with 10 % linear sweep, we can avoid the resonant intensity modulation when the sweep duration is ∼ 1 ms or ∼1.6 ms.
To maintain both atoms in the motional ground state, we must also consider the effect of combining the potential of the 938 nm tweezer with that of the 814 nm tweezer. We investigate this experimentally using the merging sequence displayed in Fig. 9(a). We perform the experiment separately for each atomic species in order to avoid pair loss and interaction shifts complicating the interpretation of the sideband spectra used to measure any heating. The pair loss is a result of spin relaxation from all spin state combinations except when Rb is in $|f = 1, m_f = 1\rangle$ and Cs is in $|f = 3, m_f = 3\rangle$. The final motional state of each atom is sensitive to both the trap depths (set by the tweezer powers, $P_{938}$ and $P_{814}$) and the overlap of the tweezers. We overlap the tweezers to within 100 nm in both radial directions by pushing out a Cs atom from the 938 nm or 1064 nm tweezer using the repulsive potential of the 814 nm tweezer. With the tweezers well overlapped, we observe that sweep durations longer than 1 ms are required to avoid heating. Therefore we choose a sweep duration of 1.6 ms, which avoids resonant intensity modulation without leaving excess time for off-resonant scattering from the tweezer. All that remains is to choose the balance of the trap powers during the merge sequence. The Rb atom could be transferred to an excited motional state if the trap depths of the merging potentials are similar [13]. Alternatively, when the combined potential experienced by Cs is very shallow, the close spacing of axial harmonic levels means there is a high probability of motional excitation. We explore the balance of trap powers by fixing $P_{938} = 3.8$ mW and varying $P_{814}$. The powers are adiabatically ramped in 1 ms before merging the traps. The Rb atom starts to spill into the 938 nm tweezer when the power ratio $P_{938}/P_{814} > 7$. In contrast, the trapping potential for the Cs atom vanishes when $P_{938}/P_{814} < 2$. 
In practice, we find that the heating of the Cs atom is more severe than that of the Rb atom, and a power ratio of P 938 /P 814 ∼ 6 is required. The last step of the merging sequence transfers the atoms to a 1064 nm tweezer at the same position. An atom in the 1064 nm tweezer experiences a lower scattering rate and therefore permits a greater precision for the sideband thermometry used to assess the merging performance. The wavelength is chosen in anticipation of the subsequent transfer to a molecular state, as for RbCs the molecular polarisability in the rovibrational ground state is similar to that of the Feshbach molecule at 1064 nm [95,96,64].
Finally, we use the constraints previously discussed to determine parameters for merging a Rb atom in its motional ground state with a Cs atom in its motional ground state. The tweezer powers are set at $P_{938} = 3.8$ mW and $P_{814} = 0.64$ mW. This power for the 938 nm tweezer gives trap depths of $U^{938}_{\rm Cs} = 1.2$ mK and $U^{938}_{\rm Rb} = 0.42$ mK, whereas the 814 nm tweezer provides a trap depth of $U^{814}_{\rm Rb} = 0.50$ mK for Rb but functions as a barrier for Cs with $U^{814}_{\rm Cs} = -0.34$ mK. The separated potentials are plotted in Fig. 9(c). The trap powers are ramped in 1 ms and the hybrid minimum-jerk trajectory has a duration of 1.6 ms, with 10% of the total duration being a linear sweep. The whole merging sequence is completed within 6 ms. Testing the sequence with a Cs atom, the mean motional levels extracted from the sideband spectroscopies in Fig. 10(a) are $\{\bar{n}_x, \bar{n}_y, \bar{n}_z\}_{\rm Cs} = \{0.13^{+0.04}_{-0.04}, 0.05^{+0.03}_{-0.03}, 0.09^{+0.04}_{-0.04}\}$. Repeating the measurements with a Rb atom, we extract mean motional levels of $\{\bar{n}_x, \bar{n}_y, \bar{n}_z\}_{\rm Rb} = \{0.08^{+0.03}_{-0.03}, 0.02^{+0.02}_{-0.02}, 0.09^{+0.03}_{-0.03}\}$ from the sideband spectroscopies in Fig. 10(b). This corresponds to a 3D ground-state probability of $0.78^{+0.05}_{-0.06}$ for Cs and $0.83^{+0.04}_{-0.04}$ for Rb. While the decrease in ground-state probability of the Rb atom is consistent with heating from tweezer scattering, the optimised parameters have not completely prevented motional excitation of the Cs atom. We expect an increase of 0.028 quanta due to photon scattering from the tweezer. Yet, the mean motional level in the $x$-direction has increased by $0.13^{+0.04}_{-0.04}$ quanta. Most likely this is a consequence of heating from the resonant modulation of the AOD's diffraction efficiency, and the heating is greater than measured with just Cs in the 938 nm tweezer because the trap frequencies are modified when the two tweezer potentials combine. 
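The step from per-axis mean motional levels to a 3D ground-state probability can be sketched as follows. The sketch assumes a thermal distribution on each axis, so that $P(n_i = 0) = 1/(1 + \bar{n}_i)$, and independent axes; the function name is illustrative. The reported values may have been computed directly from the sideband ratios with full error propagation, so small differences from this central-value estimate are expected.

```python
def ground_state_probability_3d(n_bars):
    """3D ground-state probability from per-axis mean motional levels.

    Assumes independent axes, each with a thermal distribution,
    so that P(n_i = 0) = 1 / (1 + nbar_i).
    """
    probability = 1.0
    for n_bar in n_bars:
        probability *= 1.0 / (1.0 + n_bar)
    return probability


# Central values of the mean motional levels quoted above
p_cs = ground_state_probability_3d([0.13, 0.05, 0.09])  # approximately 0.77
p_rb = ground_state_probability_3d([0.08, 0.02, 0.09])  # approximately 0.83
```

Both estimates agree with the quoted 3D ground-state probabilities to within the stated uncertainties.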
The proportion of atoms in the relative motional ground state can be estimated using [27] P (n rel = 0) = P (n Rb = 0)P (n Cs = 0) 1 − m Cs . ( The derivation of this formula makes three assumptions. Firstly, that the centre-of-mass and relative motion are separable. In the 1064 nm tweezer the Rb and Cs trap frequencies are similar, satisfying $\omega_{\rm Cs} \approx 1.08\,\omega_{\rm Rb}$, such that the centre-of-mass and relative motion are approximately separable. Secondly, that the distribution of motional levels follows a Boltzmann distribution. Thirdly, the derivation ignores interactions between the atoms. While the interactions between the atoms will shift the energy eigenstates [97], the motional ground state that we prepare is adiabatically connected to $|n_{\rm rel} = 0\rangle$ as the interaction strength is tuned to zero, provided that we can ignore the unlikely effects of trap-induced shape resonances [98,99]. Therefore, the probability of preparing an atom pair in the relative motional ground state, which becomes $|f_{\rm Rb} = 2, m_{f,{\rm Rb}} = 2; f_{\rm Cs} = 4, m_{f,{\rm Cs}} = 4; n_{\rm rel} = 0\rangle$ in the limit of weak interactions, is $P(n_{\rm rel} = 0) = 0.81^{+0.08}_{-0.08}$ following our optimised sequence. Using adiabatic rapid passage with a microwave field [100,101] or a Raman carrier π-pulse before merging, we can transfer to the desired hyperfine spin state with > 90% fidelity. Then > 60% of the initial atom pairs will be transferred into the $|1, 1; 3, 3; n_{\rm rel} = 0\rangle$ state. This is the state from which magnetoassociation to a molecular state can be achieved following a similar scheme employed with bulk mixtures of Rb and Cs [102,103].
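As a rough consistency check of the > 60% figure, one can multiply the central value of the relative ground-state probability by the spin-transfer fidelities. This assumes that the transfers for the two atoms are independent, that each has the lower-bound fidelity of 0.9, and that the transfer leaves the motional state unchanged; it is an estimate rather than a result from the measurements.

```latex
P\left(|1,1;3,3;n_{\rm rel}=0\rangle\right)
  \gtrsim P(n_{\rm rel}=0)\times 0.9 \times 0.9
  = 0.81 \times 0.81
  \approx 0.66 > 0.60
```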

Conclusion
We have demonstrated a pulsed Raman sideband cooling sequence that simultaneously cools a Rb atom and a Cs atom to the 3D motional ground states of their respective traps with high fidelity. Since the atoms start outside of the Lamb-Dicke regime, the sequence starts with axial sideband transitions on the fourth red sideband, $\Delta n = -4$, to ensure the portion of the population distribution initially in high motional levels is also cooled. Using sideband thermometry we measure a 3D ground-state fraction of $0.86^{+0.03}_{-0.04}$ for Rb in an 814 nm tweezer, and $0.95^{+0.03}_{-0.04}$ for Cs in a 938 nm tweezer. The scalability and robustness of the cooling protocol are promising for expansion to arrays of tweezer traps, and we demonstrate simultaneous cooling of Rb atoms in an array of four tweezers. Application of our RSC protocol to dual-species arrays of Rb and Cs atoms [19] would allow initialisation of both species in the motional ground state, improving the fidelity of tweezer-based quantum logic gates [104,105]. In addition, this would realise an ideal system for implementing the mid-circuit measurements [22] that are necessary for quantum error correction schemes [106].
We have also demonstrated that the Rb and Cs atoms can be transferred into the same 1064 nm tweezer trap with little heating; we measure 3D ground-state fractions of $0.83^{+0.04}_{-0.04}$ for Rb and $0.78^{+0.05}_{-0.06}$ for Cs in the combined trap. The far-detuned 1064 nm tweezer has the benefit of a reduced scattering rate. This trap is also well suited to molecule formation, as at this wavelength the polarisabilities of the Feshbach and ground molecule states are similar [64]. The inferred occupation fraction of the relative motional ground state $|f_{\rm Rb} = 2, m_{f,{\rm Rb}} = 2; f_{\rm Cs} = 4, m_{f,{\rm Cs}} = 4; n_{\rm rel} = 0\rangle$ is $0.81^{+0.08}_{-0.08}$. With efficient microwave or Raman carrier transitions we can populate the $|1, 1; 3, 3; n_{\rm rel} = 0\rangle$ state with efficiency > 60%. The natural next step is to establish magnetic field control that allows sweeping of the magnetic field across an interspecies Feshbach resonance to form weakly bound RbCs molecules [102,103], paving the way to the production of single polar molecules in the rovibrational ground state using STIRAP [107,108].

Acknowledgements
The authors thank L A McArd and A Hunter for designing and building the DDS AOM driver circuit. The authors thank R Sawant for initial ideas in simulating RSC, S White for simulations of merging traps, and A Alampounti for developing software control of the AWG. This work made use of the facilities of the Hamilton HPC Service of Durham University. This work was supported by U.K. Engineering and Physical Sciences Research Council (EPSRC) Grant EP/P01058X/1 and Durham University.

Data Availability Statement
The data that support the findings of this study are openly available at the following URL/DOI: 10.15128/r1f4752g787.

Fig. A1 displays two two-photon Raman transitions between the spin-stretched hyperfine states used in this paper which are allowed by dipole selection rules. The desired transition, in blue, occurs via a single virtual excited state. The other transition, in red, is possible through several virtual excited states and destructively interferes with the desired transition. Therefore, we suppress the undesired transition by using a 1000:1 polariser and a quarter waveplate with a retardance of 0.267 waves at 780 nm, and 0.243 waves at 852 nm. Assuming linearly polarised incident light and that the waveplate angle is set to within 1 degree, this maintains a polarisation purity of 500:1 (calculated using Jones matrices [109]). Secondly, we offset the EOM frequency by 10 MHz, $\nu_{\rm EOM} = \nu_{\rm HFS} + 10$ MHz, such that the undesired transition is detuned by −20 MHz from two-photon resonance. AOM1 controls the frequency of the laser beam driving $\sigma^\pm$ transitions, and AOM2 controls the frequency of the laser beam driving $\pi$ transitions. Achieving two-photon resonance with the two laser beams marked by blue lines in Fig. A1 implies that $\nu_{\rm AOM1} = \nu_{\rm AOM2} + \nu_{\rm HFS} - \nu_{\rm EOM} = \nu_{\rm AOM2} - 10$ MHz. Note that the upper EOM sideband is used for the blue line driving $\sigma^+$ transitions, whereas the lower EOM sideband is used for the red line driving $\sigma^-$ transitions. In both cases the carrier frequency and other EOM sideband are detuned from two-photon resonance by at least the ground-state hyperfine splitting. The detuning of the laser beams marked by red lines in Fig. A1 is then given by

Figure A1. Simplified diagram of the hyperfine structure of a Rb atom showing the desired two-photon Raman transition and an undesired Raman transition. The hyperfine structure for Cs has different values for the total angular momentum quantum number $f$. Blue: a two-photon Raman transition between two of the spin-stretched states is possible when one Raman beam drives a $\sigma^+$ transition and the other drives a $\pi$ transition. The energy difference is added to one of the Raman beams using the upper sideband of an EOM. Red: if the circularly polarised Raman beam has residual light of the opposite handedness, it can drive a two-photon Raman transition with the lower EOM sideband that destructively interferes with the desired transition. We suppress this possibility by offsetting the EOM frequency by 10 MHz and using the AOM frequencies to bring the desired transition back onto two-photon resonance.
motional levels with the internal spin states [111]: Where the momentum kick is $\Delta k = |k_\pi - k_\sigma|$, with $\Delta k\,x = \eta(a + a^\dagger)$ for Raman LD parameter $\eta$. $\Omega_R$ is the Raman Rabi frequency. We introduce the spin-raising operator $\sigma^+ = \sigma_x + i\sigma_y$. For the dissipative OP term, we assume that each OP step scatters 3 photons [26], giving $\sigma^- = \sigma_x - i\sigma_y$ is the spin-lowering operator. $\Gamma_{\rm OP}$ is the scattering rate of the OP beams, and the recoil from the OP photons defines the OP LD parameter $\eta_{\rm OP}$. We have simplified the discussion to a single dimension. The generalisation to 3D is trivial, but since the directions are independent we retain the simplicity of using 1D and analyse each direction separately. An exception is the radial Rabi oscillations in Fig. 7, where we consider the coupling to both radial axes, assuming that the position operators are separable:

$$\Omega_R \left| \langle n_x | \exp\left(i\eta_x (a + a^\dagger)\right) | m_x \rangle \langle n_y | \exp\left(i\eta_y (a + a^\dagger)\right) | m_y \rangle \right| = \Omega_x(n_x, m_x)\,\Omega_y(n_y, m_y)/\Omega_R. \tag{B.4}$$
The last line follows from Eq. 1, and the fitted line in Fig. 7 assumes $n_x = n_y$ in order to extract the mean temperature. The simulation of a full pulse sequence runs in several minutes provided that the basis of harmonic levels included is sufficiently small. For our typical starting temperatures, the thermal distribution contains 99% of the population within motional levels $n < 21$ for the radial direction with $\bar{n} = 4$ and $\eta \sim 0.15$. However, in the axial direction with $\bar{n} = 10$ and $\eta \sim 0.3$ we must include up to $n = 65$ in order to represent 99% of the population. Table C1 gives full details of the transition frequencies and Rabi frequencies used in the different stages of the RSC pulse sequence.
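The size of the harmonic-level basis needed to hold a given fraction of a thermal distribution can be estimated in closed form. The sketch below reads the quoted radial value of 4 as the mean occupation $\bar{n}$ and uses the geometric form of the thermal distribution, for which the population in levels $n < N$ is $1 - [\bar{n}/(\bar{n}+1)]^N$; the function name is illustrative.

```python
import math


def levels_needed(n_bar, fraction=0.99):
    """Smallest N such that levels n = 0..N-1 hold at least `fraction` of a thermal state."""
    if n_bar <= 0 or not 0 < fraction < 1:
        raise ValueError("need n_bar > 0 and 0 < fraction < 1")
    q = n_bar / (1.0 + n_bar)  # ratio of successive thermal populations
    return math.ceil(math.log(1.0 - fraction) / math.log(q))


radial_cutoff = levels_needed(4.0)  # levels n < 21 hold 99 % of the population
```

This reproduces the radial cutoff quoted above. The axial cutoff quoted in the text is larger than this simple estimate gives for a mean occupation of 10, so it may include additional margin.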

Appendix D. Generation of an Array of Four Tweezers
We generate an array of tweezer traps by deflecting the tweezer laser in several directions using a 2D AOD (AA Opto Electronic DTSXY-400-810). The traps originate from the same laser at a wavelength of 817 nm, different to the wavelength used for Rb throughout the rest of this paper. The AOD is driven by an arbitrary waveform generator (Spectrum Instrumentation M4i.6631-x8) set to a sample rate of 1024 MS/s. One channel is used for horizontal deflection, and a second channel is used for vertical deflection. We drive the horizontal AOD with a signal composed of four RF tones equally spaced across a range of 6 MHz, which creates four tweezer traps in the atom plane with a 4 µm separation between each trap. The frequencies are chosen so that there are an integer number of periods within the data loaded onto the card, in order to avoid noise associated with phase slips when looping segments of data loaded onto the AWG. The phase of each tone is optimised to reduce interference from the mixing of sum and difference tones generated by the nonlinear response of the amplifier and AOD [112,113].
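The requirement that each tone completes an integer number of periods within the looped data segment can be met by snapping each target frequency to the nearest multiple of the sample rate divided by the segment length. The sketch below illustrates this for four equally spaced tones. The sample rate and 6 MHz span are taken from the text, while the segment length `N_SAMPLES` and the centre frequency `CENTRE_HZ` are hypothetical placeholders; phase optimisation and the AWG driver interface are not shown.

```python
SAMPLE_RATE_HZ = 1_024_000_000  # 1024 MS/s, as stated in the text
N_SAMPLES = 200_000             # hypothetical length of the looped data segment
CENTRE_HZ = 100e6               # hypothetical centre frequency of the tones
SPAN_HZ = 6e6                   # four tones equally spaced across 6 MHz


def snap_to_segment(freq_hz, sample_rate_hz=SAMPLE_RATE_HZ, n_samples=N_SAMPLES):
    """Return (periods, frequency) with an integer number of periods per segment."""
    periods = round(freq_hz * n_samples / sample_rate_hz)
    return periods, periods * sample_rate_hz / n_samples


def tone_frequencies(n_tones=4, centre_hz=CENTRE_HZ, span_hz=SPAN_HZ):
    """Equally spaced tones across the span, each snapped to the segment grid."""
    step = span_hz / (n_tones - 1)
    start = centre_hz - span_hz / 2
    return [snap_to_segment(start + i * step) for i in range(n_tones)]


tones = tone_frequencies()
```

With these placeholder values the frequency grid has a spacing of 5.12 kHz, so each tone moves by at most 2.56 kHz from its target.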
We homogenise the intensities of the traps across the array in order to ensure the Raman transitions during RSC are on two-photon resonance. The aforementioned phase optimisation is an essential first step, after which driving each tone with the same RF power results in optical powers differing by ∼ 20 %. Then we change the amplitudes of the RF tones until the powers measured on a CCD camera placed soon after the AOD are normalised to within 3 %. We then confirm the axial trap frequencies are normalised to within 3 % using a parametric heating measurement. Balancing the trap intensities means that the tweezer light shifts are the same across the array.

Appendix E. Etaloning in an Acousto-Optic Deflector Crystal
There is a technical issue that can occur when using an AOD to move an optical tweezer, which can cause severe heating of the atom during transport. The problem originates from the AOD crystal acting as an etalon [114]. An input RF signal drives a piezoelectric transducer to create an acoustic travelling wave in the AOD crystal, which periodically modifies the refractive index to deflect an input laser beam through Bragg diffraction. However, reflections of the acoustic wave from the opposite edge of the crystal will interfere and create a standing wave if the wavelength matches the cavity length. The consequence is that the diffraction efficiency of the AOD oscillates with a period given by the free spectral range of the cavity, $c/2L$, where $c$ is the speed of sound in the cavity and $L$ is the cavity length. For our AOD (ATD-1803DA2.850 from IntraAction) with speed of sound 4200 m s$^{-1}$, we measure a cavity free spectral range of 175 kHz. This implies a cavity length of 12 mm, which roughly matches the crystal dimensions. Typically the reflections are small and so the oscillations in diffraction efficiency are only a few per cent [26]. The interference from reflection can be mitigated by angling the far edge of the crystal. However, our AOD is not angle-cut, and the relative diffraction efficiency modulation is between 0.1% and 0.5% (the variation in the modulation is dependent on the RF drive). When the driving frequency is swept to move the deflected beam, the intensity of the beam follows the oscillations in diffraction efficiency. If this intensity modulation is at double the trap frequency, there would be parametric heating of the atom [77]. However, we find experimentally that the most significant heating occurs when the intensity modulation is at the trap frequency, rather than double the trap frequency. 
This suggests that the heating is from pointing noise rather than intensity noise [77], and that there is a coupling between the tweezer intensity and the position of the atom. This coupling could arise from the presence of strong vector light shifts as discussed in the main text. To minimise the heating from this effect, we avoid the sweep rates given by

$$\frac{dx}{dt} = \frac{dx}{df}\,\nu_{\rm FSR}\,\frac{\omega_{{\rm trap},i}}{2\pi}, \tag{E.1}$$

where $\omega_{{\rm trap},i}$ is the trapping frequency along the axis $i$, $\nu_{\rm FSR}$ is the free spectral range of the AOD cavity, and $dx/df = 0.322$ µm MHz$^{-1}$ is the proportionality constant between the frequency of the RF input and the movement of the optical tweezer in the atom plane. Using the radial trap frequencies of 64 kHz and 92 kHz for a 938 nm tweezer power of 3.8 mW, we calculate that sweep rates of 3.6 µm ms$^{-1}$ or 5.2 µm ms$^{-1}$ cause resonant intensity modulation.
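The numbers in this appendix can be checked with a few lines of arithmetic. The sketch assumes that the resonance condition is that the RF chirp crosses one free spectral range per trap period, so that the intensity modulation frequency equals the trap frequency; the constant and function names are illustrative.

```python
FSR_HZ = 175e3                # measured free spectral range of the AOD cavity
SPEED_OF_SOUND_M_S = 4200.0   # speed of sound in the AOD crystal
DX_DF_UM_PER_HZ = 0.322e-6    # 0.322 um per MHz of RF frequency


def cavity_length_m(fsr_hz=FSR_HZ, speed_m_s=SPEED_OF_SOUND_M_S):
    """Cavity length implied by the free spectral range, L = c / (2 * FSR)."""
    return speed_m_s / (2.0 * fsr_hz)


def resonant_sweep_rate_um_per_ms(trap_freq_hz, fsr_hz=FSR_HZ):
    """Sweep rate at which the intensity modulation equals the trap frequency."""
    chirp_hz_per_s = trap_freq_hz * fsr_hz            # one FSR crossed per trap period
    return chirp_hz_per_s * DX_DF_UM_PER_HZ / 1e3     # um/s converted to um/ms


length_mm = cavity_length_m() * 1e3                             # 12 mm
rates = [resonant_sweep_rate_um_per_ms(f) for f in (64e3, 92e3)]  # about 3.6 and 5.2
```

Both the 12 mm cavity length and the two resonant sweep rates quoted in the text follow from these expressions.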
The trajectory used to move the optical tweezer is chosen to avoid resonant intensity modulation and heating from jerk. Varying the driving frequency of the AOD with a constant sweep rate makes it easy to avoid resonant intensity modulation. However, the sudden change in acceleration at the start and end of the sweep can cause motional excitation of the atom being transported. Using a minimum-jerk trajectory [115] resolves the heating associated with jerk. However, the sweep rate varies throughout the trajectory and is likely to scan over a range that causes parametric heating. A compromise can be found using a hybrid minimum-jerk trajectory, which we define as in [26]:

$$x_{\rm min-jerk}(t, d, T) = d\left[10\,(t/T)^3 - 15\,(t/T)^4 + 6\,(t/T)^5\right], \tag{E.2}$$

$$x_{\rm hybrid}(t) = \begin{cases} x_{\rm min-jerk}(t, 2\Delta f, 2\Delta t) & \text{for } 0 \le t \le \Delta t, \\ \Delta f + \dfrac{15}{4}\dfrac{\Delta f}{2\Delta t}\,(t - \Delta t) & \text{for } \Delta t < t < T - \Delta t, \\ x_{\rm min-jerk}(t - (T - 2\Delta t), 2\Delta f, 2\Delta t) + \dfrac{15}{4}\dfrac{\Delta f}{2\Delta t}\,(T - 2\Delta t) & \text{for } T - \Delta t \le t \le T. \end{cases} \tag{E.3}$$

Here, $\Delta f$ is the distance travelled during each minimum-jerk portion of the trajectory, and $\Delta t = T(1-\alpha)/2$ is the time elapsed during each minimum-jerk portion. The fraction of the total duration which follows the linear trajectory ($\alpha = 0$ for fully minimum-jerk and $\alpha = 1$ for fully linear) is colloquially named the hybridicity. Both the duration of the sweep, $T$, and the hybridicity, $\alpha$, determine whether we scan across a bad sweep rate. In this paper trajectories with a hybridicity of 0.1 were used to move a distance of 4.5 µm (a frequency chirp of 14 MHz). This implies that the intensity modulation will be resonant with the trap frequencies $\{\omega_x, \omega_y\}^{938}_{\rm Cs} = \{64, 92\}$ kHz for sweep durations of $\{2.1, 1.5\}$ ms respectively. We have demonstrated that the motional ground state can be maintained with high fidelity when merging a Cs atom with a Rb atom using a trajectory with hybridicity 0.1 and duration 1.6 ms. Future work might explore using a different hybridicity and sweep duration.
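A hybrid minimum-jerk trajectory of this kind can be sketched numerically. The sketch assumes that the first and last segments are the two halves of a minimum-jerk profile, joined by a linear segment that moves at the mid-point velocity of that profile, with the final segment fixed by continuity. The variable `delta_d` plays the role of $\Delta f$ in the text but is expressed as a distance, and all function names are illustrative.

```python
def min_jerk(t, d, T):
    """Minimum-jerk position after time t for a move of distance d in duration T."""
    s = t / T
    return d * (10 * s ** 3 - 15 * s ** 4 + 6 * s ** 5)


def hybrid_params(d, T, alpha):
    """Return (delta_d, delta_t, v_lin) for total distance d, duration T, hybridicity alpha."""
    if not 0 <= alpha < 1:
        raise ValueError("this sketch needs 0 <= alpha < 1")
    delta_t = T * (1 - alpha) / 2
    # Total distance: d = 2*delta_d + v_lin*alpha*T, with v_lin = (15/8)*delta_d/delta_t
    delta_d = d / (2 + (15 / 8) * alpha * T / delta_t)
    v_lin = (15 / 8) * delta_d / delta_t
    return delta_d, delta_t, v_lin


def hybrid(t, d, T, alpha):
    """Position along the hybrid minimum-jerk trajectory at time t."""
    delta_d, delta_t, v_lin = hybrid_params(d, T, alpha)
    if t <= delta_t:                      # first half of the minimum-jerk profile
        return min_jerk(t, 2 * delta_d, 2 * delta_t)
    if t < T - delta_t:                   # constant sweep rate
        return delta_d + v_lin * (t - delta_t)
    # second half of the minimum-jerk profile, shifted by the linear segment
    return min_jerk(t - alpha * T, 2 * delta_d, 2 * delta_t) + v_lin * alpha * T


def linear_sweep_rate(d, T, alpha):
    """Sweep rate during the linear segment (distance units per time unit)."""
    return hybrid_params(d, T, alpha)[2]


# 4.5 um move with hybridicity 0.1; the product rate * T is independent of T
rate_times_T = linear_sweep_rate(4.5, 1.0, 0.1)
resonant_durations_ms = [rate_times_T / r for r in (3.6, 5.2)]
```

Under these assumptions the linear segment reaches the resonant sweep rates of 3.6 µm ms$^{-1}$ and 5.2 µm ms$^{-1}$ for durations of roughly 2.2 ms and 1.5 ms, close to the values quoted above, while a 1.6 ms sweep has a linear rate of about 4.8 µm ms$^{-1}$ that lies between the two resonances.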