Quantum state transfer via acoustic edge states in a 2D optomechanical array

We propose a novel hybrid platform where solid-state spin qubits are coupled to the acoustic modes of a two-dimensional array of optomechanical nanocavities. Previous studies of coupled optomechanical cavities have shown that, in the presence of strong optical driving fields, the interplay between the photon-phonon interaction and the respective inter-cavity hopping allows the generation of topological phases of sound and light. In particular, the mechanical modes can enter a Chern insulator phase where time-reversal symmetry is broken. In this context, we exploit the robust acoustic edge states as a chiral phononic waveguide and describe a state transfer protocol between spin qubits located in distant cavities. We analyze the performance of this protocol as a function of the relevant system parameters and show that a high-fidelity and purely unidirectional quantum state transfer can be implemented under experimentally realistic conditions. As a specific example, we discuss the implementation of such topological quantum networks in diamond-based optomechanical crystals where point defects such as silicon-vacancy centers couple to the chiral acoustic channel via strain.


I. INTRODUCTION
In recent years the efforts towards building scalable quantum information processing devices have intensified to an unprecedented degree. For this purpose, a number of physical platforms, such as superconducting circuits [1], cold atoms in optical lattices [2,3], trapped ions [4], Rydberg atoms [5] and defect centers in solids [6][7][8][9][10], are being actively investigated. In parallel, various strategies for implementing hybrid quantum systems are currently being explored [11][12][13], with the long-term goal of combining the strengths of the different architectures and mitigating system-specific weaknesses. In this context, high-Q mechanical elements play a particularly important role for realizing coherent quantum interfaces [14][15][16][17][18][19][20][21], as they can be coupled efficiently to a large variety of other quantum systems [22] while being themselves only weakly affected by decoherence. Similar to optical fields, acoustic waves can be guided along coupled resonator arrays or continuous phononic waveguides [23][24][25], which can be used to implement chip-scale quantum networks where quantum information is distributed via individual propagating phonons [23,26][27][28][29]. In particular, such phononic quantum channels have been proposed to overcome the problem of coherently integrating a large number of electronic spin qubits associated with defect centers in diamond [14,28,30][31][32][33][34]. However, the control of acoustic waves on the quantum level is still in its infancy and faces many challenges, which must be met both on an experimental and on a conceptual level. These include, for example, the scattering of phonons along the channel by fabrication imperfections and, even more fundamentally, the ability to emit phonon wavepackets with a specified shape and direction, which is a prerequisite for many quantum state transfer protocols [35].
In this work we propose and analyze a hybrid phononic quantum network, where spin qubits or other two-level systems (TLS) are coherently coupled to the chiral acoustic edge channels of a two-dimensional (2D) optomechanical (OM) array. This architecture is motivated by the progress in engineering spin-phonon interactions in solid-state systems [36][37][38][39][40][41][42][43], as well as in fabricating 2D OM crystals [44][45][46][47] with different geometries and band structures. In a previous work [48] it has been shown that 2D OM arrays can exhibit a rich set of topological phases of sound and light that can be fully explored by tuning in situ the optical driving of the cavities. In particular, for weak OM interactions, the acoustic excitations are expected to enter a Chern insulator phase where chiral edge states propagate along the array boundaries. This hybrid quantum system thus offers a platform to study the rich physics emerging from the interplay between spin, mechanical and optical degrees of freedom in phases where time-reversal symmetry is broken.
As a first application for this setup we focus on the quantum-state transfer between TLS located in distant cavities via chiral acoustic edge channels. Compared to state transfer protocols in regular 1D phononic waveguides [23,26,28,29], this platform offers the advantage of unidirectional propagation [35,49][50][51], which is robust against local perturbations and whose direction can be controlled by external optical driving fields. While the basic protocol discussed in this work is very general, a naturally suited system where these ideas can be implemented is an array of separated silicon-vacancy (SiV) centers in a diamond OM crystal. In this case, quantum information can be stored in the long-lived spin degrees of freedom of the SiV ground state [52][53][54][55][56], where at low temperatures of $T \lesssim 1$ K coherence times exceeding $T_2 \sim 10$ ms have been demonstrated [55,56]. At the same time, the orbital degrees of freedom of the defect allow strong and tunable coupling to vibrational modes, as recently discussed in Ref. [28]. Combined with the ability to design chiral acoustic channels via OM interactions, this coherent spin-phonon interface offers many new tools to overcome fundamental challenges in phononic quantum network applications.

[FIG. 1, caption partially recovered: (b) SiV ground states: $|1\rangle$ and $|2\rangle$ form a long-lived spin qubit. A microwave drive $\Omega(t)$ couples opposite spin states while the strain associated with a single phonon couples the orbital degrees of freedom $|e_\pm\rangle$ with strength $g_s$. The combination of the two processes leads to a tunable interaction between the spin states and phonons of frequency $\sim\omega_0$. (c) State transfer between distant TLS via topologically protected chiral acoustic waves propagating along the boundaries of the structure.]

II. MODEL
We consider a 2D array of OM cavities as depicted in Fig. 1, where each lattice site contains a single TLS. The OM array can be realized, for example, in so-called "snowflake" structures [57], where high-Q vibrational and optical modes are co-localized in regions of engineered defects created by smoothly varying the size of the patterned periodic snowflake holes [cf. Fig. 1 (a)].
At each lattice site $j$, the variation in the index of refraction due to mechanical vibrations leads to a strong OM coupling that can be described by the standard OM Hamiltonian [58]. Here $\hat a_j$ ($\hat b_j$) represents the annihilation operator of the photonic (phononic) mode of frequency $\omega_C$ ($\omega_M$) and $g_0$ is the OM coupling per photon. Due to the strong localization of both photons and phonons, this coupling can reach values of about $g_0 \sim 250$ kHz [46], which we will assume for all the following estimates. The optical cavities are driven by a strong external laser field of frequency $\omega_L$, which drives each optical mode into a coherent state with amplitude $\alpha_j(t) = \sqrt{n_c}\,e^{i\theta_j}e^{-i\omega_L t}$, where $n_c \gg 1$ is the mean intracavity photon number. By redefining $\hat a_j \to \hat a_j + \alpha_j(t)$, the OM interactions can be linearized and, in a frame rotating with $\omega_L$, the resulting Hamiltonian $\hat H_{\rm OMC}$ for the whole 2D OM array is given by Eq. (1) (with $\hbar = 1$). In Eq. (1), $J > 0$ ($K > 0$) denotes the nearest-neighbor photon (phonon) hopping rate and $\Delta = \omega_L - \omega_C < 0$ is the detuning between the drive and the cavities. Moreover, $G = g_0\sqrt{n_c}$ is the linearized OM coupling, which is enhanced by the square root of the number of photons in a cavity. The positive hopping rates considered here are in contrast to the model proposed in Ref. [48] and, as described below, lead to qualitatively different scenarios. At this stage, we consider all parameters identical throughout the lattice except for the driving phases $\theta_j$. Note that in Eq. (1) we made an additional rotating-wave approximation by neglecting processes that do not conserve the number of excitations ($\sim G e^{-i\theta_j}\hat a_j \hat b_j + {\rm H.c.}$). The validity of this approximation is discussed in Appendix A.
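To give a feeling for the scale of the enhanced coupling $G = g_0\sqrt{n_c}$, the following worked number may help. It assumes that the quoted 250 kHz refers to $g_0/2\pi$ and uses an illustrative photon number $n_c = 10^6$; both are assumptions made only for this estimate.

```latex
\frac{G}{2\pi} = \frac{g_0}{2\pi}\sqrt{n_c}
\approx 250\,\mathrm{kHz}\times\sqrt{10^{6}}
= 250\,\mathrm{MHz}
```

Under these assumptions the linearized coupling is three orders of magnitude larger than the single-photon coupling.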
In addition to the localized optical and mechanical modes, we consider a TLS embedded in each site of the array, which is coupled to the acoustic vibrations via strain. We model the interaction by a Jaynes-Cummings coupling with time-dependent strength $g_{\rm sp}(t)$, and the effective Hamiltonian describing the full hybrid 2D array is given by Eq. (2). Here $\omega_0$ is the transition frequency of the TLS, $\hat\sigma_z$ is the usual Pauli-Z operator and $\hat\sigma_\pm$ are the corresponding raising and lowering operators. While the spin-phonon coupling assumed in Hamiltonian (2) is very generic and could be realized with various types of TLS [16,22,26], we explicitly consider the example of SiV centers in diamond in the following analysis. As depicted in Fig. 1, the electronic ground-state manifold of this center contains two long-lived spin states denoted by $|1\rangle$ and $|2\rangle$, which can be coupled to a mechanical vibrational mode via a microwave-assisted Raman process involving the higher-energy state $|3\rangle$. More precisely, the strength of the time-dependent spin-phonon coupling, $g_{\rm sp}(t) = \Omega(t) g_s/\delta$, and the qubit frequency $\omega_0$ can be externally tuned via the microwave drive amplitude $\Omega(t)$ and its detuning $\delta$ relative to the state $|3\rangle$, respectively. Here $g_s$ is the intrinsic strain coupling between the states $|1\rangle$ and $|3\rangle$. Further details about SiV defects and their strain coupling are given in Appendix B.

III. ACOUSTIC EDGE CHANNELS
The main purpose of considering a 2D OM array instead of a simple 1D phononic waveguide is to use the OM interaction for engineering topologically protected acoustic edge channels, along which phonon propagation becomes unidirectional and immune to local disorder. As first proposed in Ref. [48], such a scenario can be achieved by imposing a non-trivial pattern of the driving phases $\theta_j$, which mimics the presence of a strong effective magnetic field. Similar to electronic systems in real magnetic fields, the resulting band structure of the OM crystal may then exhibit topologically protected bands with a non-trivial Chern number, which for a finite system are associated with left- or right-propagating edge modes. In contrast to Ref. [48], we consider here a different band structure, which leads to much larger topological gaps in the presence of weaker optical driving power.

A. Topological phases of sound in an OM Kagome lattice
While chiral acoustic edge channels can be implemented with various OM lattice geometries, we here focus exclusively on the Kagome lattice, for which topological phases of sound and light have already been described in Ref. [48]. The Kagome crystal structure (see Fig. 1) is defined by a triangular Bravais lattice spanned by the unit vectors $\{\vec R_1 = -(1, \sqrt{3})a,\ \vec R_2 = (2, 0)a\}$ and a three-cavity basis $\{\vec r_A, \vec r_B, \vec r_C\}$. Here $a$ is the distance between two adjacent cavities and $\{A, B, C\}$ refer to the different cavities within a unit cell. This structure possesses the full $C_{6v}$ symmetry of the corresponding Bravais lattice. In the absence of the external driving fields, i.e. $G = 0$, the OM crystal is time-reversal symmetric and contains six energy bands. The three acoustic (optical) bands are centered around $\omega_M$ ($-\Delta$) and have a total width of $6K$ ($6J$). A close-up of the non-interacting band structure in a spectral range that includes all acoustic bands but excludes the far-detuned optical modes is shown in Fig. 2 (a). We see that the $C_6$ and time-reversal symmetries impose essential degeneracies at the high-symmetry points of the Brillouin zone, i.e. at $\vec K = (2\pi/3a, 0)$ and $\vec K' = (\pi/3a, \pi/\sqrt{3}a)$, where Dirac cones are formed, and at $\Gamma = (0, 0)$, where a quadratic band-crossing point appears. Importantly, one of the optical (mechanical) bands is flat. This feature reflects the existence of localized normal modes describing a standing wave where the six cavities along the edges of the same Wigner-Seitz cell are excited with equal amplitude but alternating sign. More details about the diagonalization of the OM Hamiltonian in quasi-momentum space are given in Appendix C.
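The non-interacting acoustic bands described above can be checked numerically with a short sketch. It assumes the standard textbook nearest-neighbor Kagome Bloch matrix with positive hopping $K$ and nearest-neighbor vectors of length $a$; the function name `kagome_bands` and the choice of gauge are illustrative and not taken from the paper. Frequencies are measured relative to $\omega_M$.

```python
import numpy as np

def kagome_bands(kx, ky, K=1.0, a=1.0):
    """Acoustic band frequencies (relative to omega_M) at quasi-momentum (kx, ky)."""
    # Vectors connecting the three cavities of the basis: A-B, A-C and B-C.
    d = a * np.array([[1.0, 0.0],
                      [0.5, np.sqrt(3) / 2],
                      [-0.5, np.sqrt(3) / 2]])
    c1, c2, c3 = np.cos(d @ np.array([kx, ky]))
    h = 2 * K * np.array([[0.0, c1, c2],
                          [c1, 0.0, c3],
                          [c2, c3, 0.0]])
    return np.linalg.eigvalsh(h)  # eigenvalues in ascending order

gamma_pt = kagome_bands(0.0, 0.0)                      # Gamma point
k_pt = kagome_bands(2 * np.pi / 3, 0.0)                # K point
kp_pt = kagome_bands(np.pi / 3, np.pi / np.sqrt(3))    # K' point
print(gamma_pt, k_pt, kp_pt)
```

In this sketch the flat band sits at $\omega_M - 2K$, it touches the dispersive band at $\Gamma$, the two upper bands meet at $\omega_M + K$ at both $\vec K$ and $\vec K'$, and the total width is $6K$, in line with the description in the text.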
For finite driving of the OM cavities ($G \neq 0$) the acoustic and photonic bands hybridize. Following Ref. [48], we choose the pattern of phases $\Delta\theta = \theta_B - \theta_A = \theta_C - \theta_B = \theta_A - \theta_C = \pm 2\pi/3$ for every unit cell. In other words, we consider a driving of one of the optical modes at the $\Gamma$ point. Such a phase pattern can be generated by simply using three optical drives oriented at 120° from each other [48] [cf. Fig. 1 (a)]. Most importantly, it breaks time-reversal symmetry without breaking the spatial symmetries of the Kagome lattice and thus lifts the essential degeneracies, giving rise to topological band gaps [59].
Here, we focus on the weak OM coupling limit where the detuning of all optical modes is larger than the OM coupling, i.e. $G \ll |\omega_M + \Delta + 2J|$, such that all the excitations are still almost purely phononic or optical in nature. In this regime, the existence of topologically non-trivial phases for sound can be understood from the fact that phonons can also hop to neighboring lattice sites through virtual optical excitations. In the conceptually simplest setting where the optical bandwidth is small compared to the detuning, $J \ll |\omega_M + \Delta|$, this hopping is restricted to nearest neighbors and has a complex amplitude $K^{\rm opt}_{ij}$ that inherits the phase difference of the drives. Since the resulting overall phonon hopping amplitude, $K^{\rm eff}_{ij} = K + K^{\rm opt}_{ij}$, then becomes a complex quantity, a phonon moving anti-clockwise around a crystal unit cell basis (i.e. $A \to B \to C \to A$) acquires a phase $\Phi$. This is reminiscent of a charged particle moving in a magnetic field, where $\Phi$ represents the normalized magnetic flux encircled by the three cavities of the basis [61]. Note that the total flux within a Bravais unit cell is zero, as moving anti-clockwise over a hexagonal path leads to a phase shift of $-2\Phi$ [cf. Fig. 1 (a)], simulating what is known as the anomalous quantum Hall effect [59]. In a more realistic situation, as considered in this work, the optical hopping rate is larger than the detuning, i.e. $J \gg |\omega_M + \Delta|$. In this case, the same intuition holds but the optically induced phonon hopping becomes longer-ranged and one must resort to a full numerical evaluation of the band structure, as exemplified in Fig. 2 (b). Note that in this same limit with $K \ll J$, the corresponding flux $\Phi_{\rm opt}$ experienced by the light field remains negligible, thus suppressing any non-trivial topology of the optical field.

B. Topological gaps
The breaking of time-reversal symmetry opens gaps between the acoustic bands, bringing the vibrational excitations into a Chern insulator phase. This is confirmed by computing the topological invariant, known as the Chern number [61], $C_n = \frac{i}{2\pi}\int_{\rm BZ} d^2k\,\left[\langle\partial_{k_x} m_{\vec k,n}|\partial_{k_y} m_{\vec k,n}\rangle - \langle\partial_{k_y} m_{\vec k,n}|\partial_{k_x} m_{\vec k,n}\rangle\right]$, for each acoustic band $n$ with energy eigenstates $|m_{\vec k,n}\rangle$. Here the integral is performed over the first Brillouin zone. For the two lowest-energy mechanical bands, one finds $C_1 = 1$ and $C_2 = 0$ [cf. Fig. 2]. As shown in Fig. 2 (b), for weak OM interactions the gaps open at the symmetry points $\vec K$ and $\vec K'$, while for larger couplings, competing processes taking place at quasi-momenta near high-symmetry points close the gap again. The dominant OM processes allowed by the conservation of angular momentum can be captured using a simple analytic model, from which one accurately predicts the band gap for $G < G_c$, where $G_c$ is the critical coupling above which the gap decreases (cf. Appendix D). Here $\delta_{\rm OM} = -\Delta - 2J - \omega_M - K$ is the detuning between the lowest optical band and the mechanical Dirac points in the non-interacting limit [cf. Fig. 2 (b)]. From the analytic model it follows that the band gap reaches its maximum value, equal to $K$, for a detuning $\delta_{\rm OM}$ above the threshold $\delta^{\rm th}_{\rm OM} = 2K$ and for driving strengths in a finite range between $G_{\min}$ and $G_c$. In Fig. 3 (a), we show the gap as a function of $\delta_{\rm OM}$ and $G$ while explicitly indicating $G_c$ and $G_{\min}$. In panel (b), we show the gap as a function of $G$ for $\delta_{\rm OM}/K = 1.5, 2.0$. For experimentally relevant phonon hopping rates $K/2\pi \sim 100$ MHz, the minimal coupling $G_{\min} \approx \sqrt{3}K$ at the threshold $\delta^{\rm th}_{\rm OM}$ is reached with a number of intra-cavity photons $n_c \sim 0.75 \times 10^6$. While this is generally challenging, we note that recent experiments suggest that diamond structures [60] can tolerate stronger drives than silicon-based systems [44][45][46][47].
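The Chern numbers of gapped bands can be evaluated numerically on a discrete momentum grid. The sketch below is not the full OM calculation of the paper: it assumes an effective phonon-only Kagome model in which every directed hop $A \to B$, $B \to C$, $C \to A$ carries a phase $\Phi/3$ (nearest neighbors only, optical modes eliminated), and it replaces the continuum integral by the lattice link-variable method of Fukui, Hatsugai and Suzuki. The function names and the flux value $\Phi = \pi/2$ are illustrative assumptions, and the overall sign of the result depends on conventions.

```python
import numpy as np

def bloch_h(t1, t2, phi, K=1.0):
    """Kagome Bloch matrix in a periodic gauge; t1 = k.a1 and t2 = k.a2."""
    h = np.zeros((3, 3), dtype=complex)
    hop = K * np.exp(1j * phi)                      # phase phi = Phi/3 per directed hop
    h[1, 0] = hop * (1 + np.exp(1j * t1))           # A -> B (two bonds)
    h[2, 1] = hop * (1 + np.exp(1j * (t2 - t1)))    # B -> C (two bonds)
    h[0, 2] = hop * (1 + np.exp(-1j * t2))          # C -> A (two bonds)
    return h + h.conj().T

def chern_numbers(phi, n=30):
    """Lattice Chern number of each band, ordered by increasing energy."""
    ts = 2 * np.pi * np.arange(n) / n
    # vecs[i, j, component, band]: eigenvectors on the n x n momentum grid
    vecs = np.array([[np.linalg.eigh(bloch_h(t1, t2, phi))[1] for t2 in ts]
                     for t1 in ts])

    def link(u, v):
        ov = np.sum(u.conj() * v, axis=-2)          # band-resolved overlap <u|v>
        return ov / np.abs(ov)

    v1 = np.roll(vecs, -1, axis=0)                  # shifted by one step along t1
    v2 = np.roll(vecs, -1, axis=1)                  # shifted by one step along t2
    v12 = np.roll(v1, -1, axis=1)                   # shifted along both directions
    plaq = link(vecs, v1) * link(v1, v12) * link(v12, v2) * link(v2, vecs)
    total = np.angle(plaq).sum(axis=(0, 1)) / (2 * np.pi)
    return np.rint(total).astype(int)

c_plus = chern_numbers(np.pi / 6)     # flux Phi = +pi/2 per triangle
c_minus = chern_numbers(-np.pi / 6)   # reversed flux
print(c_plus, c_minus)
```

In this effective model the middle band is trivial while the outer bands carry opposite unit Chern numbers, and reversing the sign of the flux reverses all signs, which mirrors the role of the sign of $\Delta\theta$ in selecting the propagation direction.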

C. Edge channels
For a finite-size system, the existence of separated energy bands with non-trivial Chern number is associated with a set of topologically protected chiral edge states, which propagate along the boundaries of the OM array. Specifically, the difference between the number of such edge states propagating clockwise and anti-clockwise within a given gap is given by the sum of the topological invariants $C_n$ of all lower-energy bands.
To study the properties of these edge channels in more detail, we consider in Fig. 4 (a) a stripe of infinite length along $x$ with straight edges at $y = 0$ and $y = -(N_y - 1)\sqrt{3}a$. Here $N_y$ is the number of unit cells along $\vec R_1$ and the upper straight boundary is obtained by excluding the cavities $A$ of all cells at $y = 0$. For this geometry, the full OM crystal Hamiltonian $\hat H_{\rm OMC}$ in Eq. (1) can be diagonalized within each quasi-momentum $k_x$ subspace, allowing us to capture the dispersion relation and the structure of the edge states localized at the boundaries. Details of the diagonalization are presented in Appendix C. In Fig. 4 (b), we show an example of the mechanical band structure as a function of $k_x$ for $G = 2K$, $\delta_{\rm OM} = 3K$ and $\Delta\theta = 2\pi/3$. On the upper boundary, a single edge state is present and its dispersion relation $\omega_E(k_x)$ is shown by the black curve crossing the gap for $\pi/2 < k_x a \le \pi$. The group velocity of phonons propagating along this channel is given by $v_g(k_x) = \partial\omega_E(k_x)/\partial k_x$ and is positive for the phase pattern $\Delta\theta = 2\pi/3$. The situation is completely symmetric for the edge state at the lower boundary: its energy crosses the gap for $0 < k_x a \le \pi/2$ and it has a negative group velocity. For the purpose of using such edge modes as phononic quantum channels, two other key properties must be taken into account: their penetration depth into the bulk and how strongly they are hybridized with the optical bands. The latter plays an important role for dissipation and the former characterizes how strongly the edge modes couple to the TLSs. In general, the annihilation operator for an edge-state excitation with quasi-momentum $k_x$ can be written as a superposition of the operators $\hat b_{s,m}(k_x)$ and $\hat a_{s,m}(k_x)$, where $\hat b_{s,m}(k_x)$ [$\hat a_{s,m}(k_x)$] is the phononic (photonic) annihilation operator acting on the basis site $s \in \{A, B, C\}$ of the $m$-th unit cell along $\vec R_1$ with quasi-momentum $k_x$. The upper boundary corresponds to $m = 0$.
In this superposition, the coefficients $u_s(k_x)$ and $v_s(k_x)$ are the respective mechanical and optical probability amplitudes, $\xi(k_x)$ is the penetration depth and $\phi_m(k_x)$ is a generic phase. We also define the optical fraction $P_{\rm opt}(k_x)$ of the edge state, whose upper bound is set by the normalization condition.
In view of a state transfer between TLS embedded in the outermost cavities along the boundaries of the array, we are mostly interested in the edge-state excitation that lies within the gap and has the smallest penetration depth and photonic amplitude. This condition motivates us to define $k_0$ such that $u_{s_0}(k_0)$ is maximal [cf. Fig. 4 (e)], where $s_0$ represents the outermost cavities along the boundary. In Fig. 4, we show the optical fraction, the penetration depth $\xi(k_x)$ and the group velocity, all evaluated for $k_x = k_0$. We finally note that for the straight edges considered, i.e. where only the cavities $B$ and $C$ form the outermost layer of the boundaries, no $A$ cavity throughout the crystal supports an edge mode (i.e. $u_A = 0$). We conclude this section by noting several important differences between the results presented here and those in Ref. [48]. In this work, we consider positive hopping amplitudes $K > 0$ and $J > 0$, which results in inverted dispersion relations compared to Ref. [48]. As a consequence, the lowest-energy optical band is flat (for $G = 0$). This feature qualitatively changes the OM interaction. In particular, the band gap and the optical fraction $P_{\rm opt}$ become independent of $J$ for large $J/\delta_{\rm OM}$. In contrast, for negative hopping amplitudes these quantities are suppressed by powers of the small ratio $K/J$ [48]. Since $J/K \sim 10^4$ is of the order of the ratio between the speed of light and the speed of sound in the material, this allows us to reach much larger band gaps at the expense of a larger optical fraction. Finally, for positive hopping amplitudes, the driven optical mode coincides with the lowest-energy band, such that the detuning from the drive frequency is considerably smaller than for the case of the highest-energy band considered in Ref. [48]. Due to this reduction of the detuning by about $6J \sim 100$ GHz, the power of the external drive necessary to reach $n_c \sim 10^6$ is strongly reduced.

IV. QUANTUM STATE TRANSFER
So far, we have focused solely on the OM cavities without considering finite couplings to the TLS. In this section, we exploit the time-dependent spin-phonon coupling $g_{\rm sp}(t)$ and the acoustic chiral edge states to transfer an arbitrary quantum state between any pair of TLS embedded in distant cavities along the boundaries of the structure.
In this protocol, only the emitting (e) and receiving (r) defects are driven, i.e. only $g^{(l)}_{\rm sp}(t) \neq 0$ with $l \in \{e, r\}$, while all the other, undriven centers are far off-resonant with any mechanical excitation. We also consider a low-temperature environment $T \le \omega_M/k_B$ (corresponding to $T \lesssim 1$ K for SiV centers), in which case the system remains in the single-excitation subspace. Finally, we account for dissipative processes by including photon and phonon loss in every cavity of the crystal. We denote by $\kappa_C = \omega_C/Q_C$ ($\kappa_M = \omega_M/Q_M$) the photon (phonon) decay rate, where $Q_C$ ($Q_M$) is the optical (mechanical) quality factor. By restricting the dynamics to the single-excitation subspace, we can account for losses by simply considering a non-unitary time evolution in which the mode frequencies acquire imaginary parts set by these decay rates.

A. Markovian channel
In the limit of weak spin-phonon couplings [$g^{(l)}_{\rm sp}(t) \ll K$], the topological phase of sound described in the previous section is approximately unperturbed by the TLS. Moreover, for resonance frequencies $\omega_0$ of both TLS deep within the topological band gap, the defects only couple efficiently to the acoustic edge modes. In this limit, the coherent dynamics of the state transfer protocol can be described by the effective Hamiltonian of Eq. (9), where $k$ and $n_l$ represent the quasi-momentum of the edge state and the position of the TLS $l$ along the edge of $N$ cavities, respectively. The effective spin-phonon coupling $g^{(l)}_{\rm eff}(t)$ depends on the distance of the defect from the boundary ($m_l$) and captures the properties of the edge modes previously derived in the context of the semi-infinite stripe. Although the structure supporting the state transfer is a finite 2D crystal [cf. Fig. 5 (a)], Eq. (9) is valid for defects positioned far from any irregularity of the edge, e.g., a corner. Similarly, one can estimate the decay rate of the chiral channel as $\kappa_E(k) = |u_s|^2\kappa_M/[1 - e^{-2a/\xi(k)}] + P_{\rm opt}(k)\kappa_C$. For the scenarios considered in this work, where the mechanical frequencies are in the GHz regime while the optical modes are in the hundreds of THz, $\kappa_C/\kappa_M \sim \omega_C/\omega_M \sim 10^4$ for similar quality factors. As a consequence, the optically induced decay rate is expected to exceed by far the intrinsic mechanical loss.
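The relative size of the two loss contributions in $\kappa_E(k)$ can be illustrated with a few lines of arithmetic. All numerical inputs below (mode frequencies, quality factors, $|u_s|^2$, $\xi/a$ and $P_{\rm opt}$) are hypothetical placeholders chosen only to be of a plausible order of magnitude; they are not values taken from the paper, and `edge_decay_rate` is an illustrative helper name.

```python
import math

def edge_decay_rate(u_s_sq, xi_over_a, p_opt, kappa_m, kappa_c):
    """Return (total, mechanical part, optical part) of the edge-channel decay rate."""
    mech = u_s_sq * kappa_m / (1.0 - math.exp(-2.0 / xi_over_a))
    opt = p_opt * kappa_c
    return mech + opt, mech, opt

# Hypothetical parameters (not from the paper), angular frequencies in rad/s.
omega_m = 2 * math.pi * 20e9      # mechanical frequency in the GHz regime
omega_c = 2 * math.pi * 200e12    # optical frequency in the hundreds of THz
q_m = q_c = 1e7                   # equal quality factors
kappa_m = omega_m / q_m
kappa_c = omega_c / q_c

total, mech, opt = edge_decay_rate(u_s_sq=0.3, xi_over_a=1.5, p_opt=0.01,
                                   kappa_m=kappa_m, kappa_c=kappa_c)
print(kappa_c / kappa_m, opt / mech)
```

With these placeholder numbers, even a percent-level optical fraction makes the optical term dominate, which is the point made in the text.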
We consider a single-excitation ansatz for the state $|\psi(t)\rangle$, with amplitudes $a_e(t)$ and $a_r(t)$ for the excitation to reside in the emitting and receiving TLS, respectively, where $|0\rangle$ represents the vacuum state with both centers in their lowest-energy state and no acoustic excitations in the waveguide. A perfect transfer of an arbitrary state then corresponds to $a_e(t_0) = a_r(t_f) = 1$ and $a_e(t_f) = a_r(t_0) = 0$. Here $t_0$ and $t_f$ are the initial and final times of the protocol, respectively. In the case where the TLS see a constant density of states of the acoustic modes, it is possible to apply the standard Born-Markov approximation to the Schrödinger equation $i\partial_t|\psi(t)\rangle = \hat H_{ss}(t)|\psi(t)\rangle$ (cf. Appendix E), leading to local equations of motion for $a_e(t)$ and $a_r(t)$. In these equations, the transfer rate $\gamma_l(t)$ between the chiral waveguide and the defects is determined by the effective coupling $g^{(l)}_{\rm eff}(t)$ and the group velocity $v_g \equiv v_g(k_0)$, and $\theta_l(t) = \arg[g^{(l)}_{\rm eff}(t)]$. The input field of the receiving node is $f_{\rm in,r}(t) = f_{\rm out,e}(t - \tau_p)e^{i\phi_p}$, with $\tau_p$ and $\phi_p$ the time and phase acquired during the propagation from the emitter to the receiver. In the case of a perfectly straight edge, $\tau_p = 2(n_r - n_e)a/v_g$ and $\phi_p = 2k_0(n_r - n_e)a$. The strategy to achieve a high-fidelity state transfer is to control $g_{\rm eff}(t)$ in time in order to suppress the output field $f_{\rm out,r}(t) = f_{\rm in,r}(t) + \sqrt{\gamma_r(t)}\,e^{-i\theta_r(t)}a_r(t)$ of the receiver. Thus, in this idealized limit, the state transfer problem becomes equivalent to the scenario discussed in the original work by Cirac et al. [35] and similar optimized pulse shapes can be used to achieve close-to-unity state transfer fidelities. The main limitation then arises from propagation losses and from the ratio between the TLS decoherence rate and the maximal transfer rate $\gamma_{\max}$ that one can reach in a specific implementation.
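The idea of suppressing the receiver output by shaping the couplings can be illustrated with a minimal cascaded two-node simulation. The sketch below assumes the standard input-output form of the Markovian equations with a $\sqrt{\gamma}$ coupling, real amplitudes ($\theta_l = 0$), no propagation delay or phase ($\tau_p = \phi_p = 0$) and no losses. The emitter rate is a commonly used shape that produces a time-symmetric wavepacket, and the receiver uses its time reverse; this is an assumed pulse shape for illustration, not necessarily the one used by the authors. Time is in units of $1/\gamma_{\max}$.

```python
import math

def gamma_e(t, g=1.0):
    """Emitter rate giving a time-symmetric wavepacket (assumed pulse shape)."""
    if t >= 0:
        return g
    x = math.exp(g * t)
    return g * x / (2.0 - x)

def gamma_r(t, g=1.0):
    """Receiver rate: time-reversed emitter pulse."""
    return gamma_e(-t, g)

def rhs(t, a_e, a_r):
    ge, gr = gamma_e(t), gamma_r(t)
    f_in = math.sqrt(ge) * a_e                 # emitter output drives the receiver
    return -0.5 * ge * a_e, -0.5 * gr * a_r - math.sqrt(gr) * f_in

def transfer(T=15.0, steps=6000):
    """Integrate from -T to T with RK4; return (|a_e|^2, |a_r|^2) at the end."""
    h = 2 * T / steps
    t, a_e, a_r = -T, 1.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, a_e, a_r)
        k2 = rhs(t + h / 2, a_e + h / 2 * k1[0], a_r + h / 2 * k1[1])
        k3 = rhs(t + h / 2, a_e + h / 2 * k2[0], a_r + h / 2 * k2[1])
        k4 = rhs(t + h, a_e + h * k3[0], a_r + h * k3[1])
        a_e += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        a_r += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return a_e ** 2, a_r ** 2

p_emitter, fidelity = transfer()
print(p_emitter, fidelity)
```

In this lossless toy model the excitation ends up almost entirely in the receiver; adding a decay term for the channel or a finite TLS dephasing rate would reduce the final population, as discussed in the text.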

B. Exact evolution
While the above description properly highlights the physics underlying the state-transfer protocol, it becomes exact only for extremely weak couplings to perfect Markovian 1D channels. In contrast, we here make no such approximations and numerically simulate the full time evolution as governed by Eq. (2). We use a slowly varying pulse for the emitter coupling and consider a finite 2D crystal [cf. Fig. 5 (a)], such that the non-trivial transfer of excitations around a corner is included in the simulations. We compare two scenarios: the first one with a higher cavity quality factor $Q_C = 5.0 \times 10^7$ and a near-resonant OM coupling, $G = 2K$ with $\delta_{\rm OM} = 4K$; and the second with a lower $Q_C = 10^7$ and a more detuned OM coupling, $G = 2K$ with $\delta_{\rm OM} = 20K$. In the first scenario, the edge state is more localized and has a slower group velocity (see Fig. 4), leading to a faster state transfer via a larger ratio $\gamma_{\max}/g^{\max}_{\rm sp} \sim 0.03$ [cf. Eq. (12)]. However, the larger optical hybridization and the longer time spent in the waveguide make the optically induced decay more detrimental, hence the need for a higher $Q_C$. In the second scenario, the transfer is slower, with $\gamma_{\max}/g^{\max}_{\rm sp} \sim 0.006$, but more resilient to dissipation.
In Fig. 6 (a), we analyze the maximal fidelity $F = |a_r(t_f)|^2$ as a function of the detuning $\delta_{\rm OM}$ for $G = 2K$ and $Q_C = \{0.5, 1.0, 5.0\} \times 10^7$. It highlights the larger optically induced decay rate for smaller detunings. In the short-time limit, one can approximate the optical loss as $1 - F \sim N_s P_{\rm opt}\kappa_C a/v_g$, with $N_s$ the number of traveled cavities. For larger detunings, the optical loss is reduced, but the smaller transfer rates require a longer time $t_f$ to transfer the state, which can become an issue compared to the coherence time of the TLS. As an example, for $K/2\pi = 100$ MHz and $g^{\max}_{\rm sp}/K = 0.06$, one finds $t_f \sim 2\,\mu$s for $G = 2K$ and $\delta_{\rm OM} = 4K$. This is still fast compared to the expected inhomogeneous dephasing times $T_2^* \sim 10$ to $100\,\mu$s and much shorter than the intrinsic coherence time of $T_2 \sim 10$ ms demonstrated for SiV centers at low temperatures.
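The short-time estimate $1 - F \sim N_s P_{\rm opt}\kappa_C a/v_g$ is easy to evaluate for given parameters. In the sketch below the optical frequency, the optical fraction, the number of traveled cavities and the group velocity are hypothetical placeholders (only the quality factors are those quoted in the text), so the resulting numbers should be read as an order-of-magnitude illustration rather than as results of the paper.

```python
import math

def optical_infidelity(n_sites, p_opt, kappa_c, vg_over_a):
    """Short-time estimate 1 - F ~ N_s * P_opt * kappa_C * a / v_g."""
    return n_sites * p_opt * kappa_c / vg_over_a

omega_c = 2 * math.pi * 200e12      # hypothetical optical frequency (rad/s)
vg_over_a = 2 * math.pi * 100e6     # hypothetical v_g / a, taken of order K
n_sites, p_opt = 20, 0.02           # hypothetical path length and optical fraction

loss_high_q = optical_infidelity(n_sites, p_opt, omega_c / 5.0e7, vg_over_a)
loss_low_q = optical_infidelity(n_sites, p_opt, omega_c / 1.0e7, vg_over_a)
print(loss_high_q, loss_low_q)
```

Because the estimate is linear in $\kappa_C = \omega_C/Q_C$, lowering the optical quality factor by a factor of five raises the estimated loss by the same factor.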

C. Disorder
So far, we have considered identical parameters over the entire system, i.e. perfectly matched mechanical frequencies, detunings and OM couplings. In experiments, reaching a high level of homogeneity is challenging and any realization is expected to have a certain level of disorder. We here analyze the robustness of the state transfer in the presence of such imperfections. To do so, we consider all parameters to be slightly different for every cavity. For example, we consider $\omega^{(j)}_M = (1 + p_j)\omega_M$, where $p_j$ is randomly chosen from a uniform distribution ranging from $-W/2$ to $W/2$. The same applies to $\Delta$, $G$, $\kappa_C$ and $\kappa_M$. In Fig. 6 (b), we plot the state transfer fidelity as a function of $W$, averaged over 50 disorder realizations. We compare the robustness for $G = 2K$ and $\delta_{\rm OM} = 4K$, where the gap is equal to $K$, to the scenario with $G = 2K$ and $\delta_{\rm OM} = 20K$, where the gap is approximately $0.34K$. The state-transfer fidelity starts to decrease for disorder strengths large enough to close the topological gap, which roughly occurs when $W\omega_M$ becomes comparable to the gap [as indicated by the vertical lines in Fig. 6 (b)]. An additional advantage of working with larger OM interactions is thus the increased resilience to disorder due to the larger gap.
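The disorder model described above can be sketched as follows. The helper name `disordered`, the array size, the random seed and the mechanical frequency $\omega_M/2\pi = 5$ GHz are hypothetical choices for illustration; only the uniform distribution on $[-W/2, W/2]$, the 50 realizations and $K/2\pi = 100$ MHz follow the text.

```python
import numpy as np

def disordered(value, n_sites, W, rng):
    """Site-dependent parameter (1 + p_j) * value with p_j uniform in [-W/2, W/2]."""
    p = rng.uniform(-W / 2, W / 2, size=n_sites)
    return (1.0 + p) * value

rng = np.random.default_rng(0)        # fixed seed for reproducibility
omega_m = 2 * np.pi * 5e9             # hypothetical mechanical frequency (rad/s)
K = 2 * np.pi * 100e6                 # phonon hopping rate from the text
W, n_sites = 0.01, 300                # hypothetical disorder strength and size

realizations = [disordered(omega_m, n_sites, W, rng) for _ in range(50)]
max_shift = max(np.abs(r - omega_m).max() for r in realizations)

# Disorder strength at which W * omega_M equals a gap of size K.
w_crit = K / omega_m
print(max_shift / K, w_crit)
```

The same function can be reused for $\Delta$, $G$, $\kappa_C$ and $\kappa_M$; each realization would then be passed to the full time-evolution solver and the resulting fidelities averaged.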

V. CONCLUSION AND OUTLOOK
In this work, we have proposed and analyzed a 2D hybrid system where the acoustic excitations within a Kagome lattice of coupled OM cavities interact with the spin degrees of freedom of point defects. In this context, we have described the emergence of a topological phase where time-reversal symmetry is effectively broken for the acoustic excitations as a result of the interplay between the OM interaction and the inter-cavity hopping. As a potential application, we have shown that the resulting acoustic chiral edge states can serve as phononic quantum channels, which are purely unidirectional and robust with respect to on-site disorder. Our analysis revealed how the key properties of such topological channels depend on the relevant OM coupling and detuning parameters and how an optimized trade-off between optical losses, propagation speed and disorder protection can be achieved. Apart from the considered example of SiV defects in diamond OM crystals, most of these considerations will also be relevant for other types of qubits or other artificial realizations of topological systems.
Beyond quantum communication applications, the proposed hybrid system provides a versatile platform to study interacting quantum many-body systems, where topological phases with broken time-reversal symmetry are combined with strong nonlinearities provided by the spin qubits. The rich physics expected for such interacting topological insulators is still poorly understood and could be probed in this setting in various parameter regimes, employing only static spin-phonon interactions, which are in general much simpler to engineer.

Appendix A: Validity of the rotating-wave approximation
The counter-rotating terms ($\sim G e^{-i\theta_j}\hat a_j\hat b_j + {\rm H.c.}$) describe the creation and annihilation of photon-phonon pairs and have been dropped during the derivation of Eq. (1) based on a rotating-wave approximation (RWA). The processes that dominantly contribute to the corrections to the RWA describe the creation (annihilation) of a phonon accompanied by the creation (annihilation) of a photon in the flat Kagome band. The typical oscillation frequency $\delta_K$ of these terms in the rotating frame is set by the distance of the flat optical band from its blue sideband, $\delta_K \sim \delta_{\rm OM} + 2\omega_M$. As a consequence, the leading-order corrections to the RWA are of the order $\sim G/(\delta_{\rm OM} + 2\omega_M)$. It is important to keep in mind that, contrary to the usual case of time-reversal-preserving OM systems, such terms can lead to an OM instability even in the regime where they represent a small perturbation and when the driving is red-detuned compared to all optical resonances; see Ref. [48] for a detailed analysis. The reason for this behavior is that the same optical mode couples to different mechanical modes on its blue and red sidebands. As a consequence, the mechanical modes that couple only to the blue sideband of the flat Kagome band but not to its red sideband are subject to a small overall optically induced amplification with rate $\sim \kappa_C G^2/\delta_K^2$.
This implies that a small mechanical decay rate of the order $\kappa_M \sim \kappa_C G^2/\delta_K^2$ is required to stabilize the system, which is present for all parameter regimes considered in this manuscript.

Appendix B: Negatively charged Silicon-Vacancy centers in Diamond
In this section we describe in more detail the negatively charged silicon-vacancy center in diamond. More precisely, we focus on its electronic ground state and its strain coupling to the vibrational modes of its host crystal.
The molecular structure of the SiV center belongs to the $D_{3d}$ point group and its electronic ground state is formed by an unpaired hole of spin $S = 1/2$ subject to a strong spin-orbit interaction. The resulting fourfold ground-state subspace is comprised of two doublets, $\{|1\rangle \equiv |e_-, \downarrow\rangle, |2\rangle \equiv |e_+, \uparrow\rangle\}$ and $\{|3\rangle \equiv |e_+, \downarrow\rangle, |4\rangle \equiv |e_-, \uparrow\rangle\}$, which are separated by $\Delta_{\rm SiV}/2\pi \simeq 46$ GHz [53,54]. Here, $|e_\pm\rangle$ are eigenstates of the orbital quasi-angular momentum operator $\hat L_z$ associated with a $2\pi/3$ rotation about the symmetry axis of the defect (taken to be along $z$), i.e. $\hat R_{2\pi/3}|e_\pm\rangle = e^{-i\frac{2\pi}{3}\hat L_z}|e_\pm\rangle = e^{\mp i\frac{2\pi}{3}}|e_\pm\rangle$. In the presence of a magnetic field $\vec B = B_0\vec e_z$, the Hamiltonian for a single SiV center is given by Eq. (B1) (with $\hbar = 1$), where $\omega_B = \gamma_s B_0$ and $\gamma_s$ is the spin gyromagnetic ratio. In Eq. (B1), we have included a time-dependent driving field of frequency $\omega_d$ with a tunable Rabi frequency $\Omega(t)$ and phase $\phi(t)$, which couples the lower and upper states of opposite spin. This drive can be implemented locally on individual defects either directly with a microwave field of frequency $\omega_d \sim \Delta_{\rm SiV}$ [62], or indirectly via an equivalent optical Raman process [28].

Strain coupling to mechanical modes
Within an OM cavity, we consider a single mechanical mode associated with a displacement profile $\vec u(\vec r)$. In addition to modifying the index of refraction seen by the optical mode, such a deformation of the cavity modifies the electronic environment seen by the SiV center, resulting in a coupling of its orbital states $|e_\pm\rangle$ [41,42,63]. The SiV-phonon coupling can be described by a Hamiltonian (with $\hbar = 1$) in which $\hat J_- = (\hat J_+)^\dagger = |1\rangle\langle 3| + |2\rangle\langle 4|$ is the spin-conserving lowering operator and $g_s$ is the strain-induced coupling rate, which is proportional to the local strain tensor. The resulting coupling rate can be written as $g_s = 2\pi d\,\frac{x_{ZPF}}{\lambda_{\Delta_{SiV}}}\,\xi(\vec r_{SiV})$, where $d/2\pi \sim 1$ PHz is the strain sensitivity [41,42], $x_{ZPF} \sim 1$ fm is the mechanical zero-point motion [57], $\lambda_{\Delta_{SiV}} \sim 200$ nm is the characteristic phonon wavelength in diamond and $\xi(\vec r_{SiV})$ is the dimensionless strain distribution evaluated at the position $\vec r_{SiV}$ of the SiV center. From deformations $\vec u(\vec r)$ observed in previous experiments [44] and state-of-the-art positioning of SiV defects [64], we expect $\xi(\vec r_{SiV}) \sim 1$, leading to interaction rates as large as $g_s/2\pi \sim 30$ MHz. This estimate is consistent with finite-element simulations performed for 1D diamond nano-cavities [28]. For matching frequencies ($\Delta_{SiV} = \omega_M$) and mechanical quality factors $Q \sim 10^5$, the strong-coupling regime $g_s > \omega_M/Q,\ 1/T_2^*$ should be reached, allowing coherent excitation transfer between the SiV center and the mechanical resonator.
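The quoted estimate can be checked with a few lines of arithmetic. The Python sketch below evaluates $g_s/2\pi = 2\pi\,(d/2\pi)\,(x_{ZPF}/\lambda_{\Delta_{SiV}})\,\xi$ with the order-of-magnitude values given in the text and compares the result with the mechanical linewidth $\omega_M/(2\pi Q)$ for $\omega_M = \Delta_{SiV}$. The function name is hypothetical, and the spin dephasing rate $1/T_2^*$ is not included because no value is quoted here.

```python
import math


def strain_coupling_hz(d_hz, x_zpf_m, wavelength_m, xi=1.0):
    """Return g_s / 2pi in Hz, given the strain sensitivity d / 2pi in Hz.

    Implements g_s = 2 pi d (x_ZPF / lambda) xi, divided by 2 pi on both sides.
    """
    return 2.0 * math.pi * d_hz * (x_zpf_m / wavelength_m) * xi


# Order-of-magnitude values quoted in the text.
g_s = strain_coupling_hz(d_hz=1e15, x_zpf_m=1e-15, wavelength_m=200e-9)

# Mechanical linewidth omega_M / (2 pi Q) for omega_M / 2pi = 46 GHz and Q = 1e5.
mech_linewidth = 46e9 / 1e5

print(f"g_s/2pi ~ {g_s / 1e6:.1f} MHz, linewidth ~ {mech_linewidth / 1e6:.2f} MHz")
```

With these inputs the coupling comes out close to the $\sim 30$ MHz quoted above and exceeds the mechanical linewidth by roughly two orders of magnitude.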

Time-dependent effective spin-phonon coupling
In the specific case of a state transfer protocol, one has to control in time an effective coupling between the spin degree of freedom, encoded in the two lowest-energy ground states $|1\rangle$ and $|2\rangle$, and the mechanical mode. This can be achieved by off-resonant driving of the state $|3\rangle$ [cf. Fig. 1 (b) and Eq. (B1)], leading to a standard three-level $\Lambda$ system. For large drive detunings $\delta = \omega_0 - \Delta_{SiV}$, i.e. $|\delta| \gg |g_s|, |\omega_M - \omega_0|, |\Omega|$, with $\omega_0 = \omega_d + \omega_B$ the frequency of the emitted phonons, the higher-energy state $|3\rangle$ can be adiabatically eliminated, resulting in an effective time-dependent spin-phonon coupling $g_{sp}(t) = g_s\Omega(t)/\delta$. Assuming $0 \le \Omega(t)/2\pi < 100$ MHz and $\delta/2\pi \simeq 400$ MHz, this rate can be tuned between $g_{sp} = 0$ and a maximal value of $g_{sp}^{\max}/2\pi \sim 7$ MHz, which is still large enough to reach the strong-coupling regime.

$\ldots\,(K \hat b^\dagger_{r,i}\hat b_{s,j} + J \hat a^\dagger_{r,i}\hat a_{s,j} + \mathrm{H.c.})$.
Here, n.n. represents the sum over the nearest neighbors.
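As a quick numerical check of the tuning range quoted in Appendix B for the effective spin-phonon coupling $g_{sp}(t) = g_s\Omega(t)/\delta$, the following Python sketch evaluates the formula at the largest Rabi frequency considered. The function name is hypothetical, and the expression is only meaningful in the regime $|\delta| \gg |g_s|, |\Omega|$ where the adiabatic elimination holds.

```python
def effective_coupling(g_s, rabi, detuning):
    """Effective coupling g_sp = g_s * Omega / delta (all in the same units).

    Valid only for |detuning| much larger than |g_s| and |rabi|.
    """
    return g_s * rabi / detuning


# Values quoted in the text, in MHz (all divided by 2 pi).
g_sp_max = effective_coupling(g_s=30.0, rabi=100.0, detuning=400.0)
g_sp_off = effective_coupling(g_s=30.0, rabi=0.0, detuning=400.0)
print(g_sp_off, g_sp_max)
```

The upper value is consistent with the $\sim 7$ MHz quoted in the text, and switching off the drive turns the coupling off completely.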

Infinite 2D crystal
In the limit where the system is an infinite 2D array with $N \to \infty$ cavities, one can define momentum-space operators $\hat b_s(\vec k)$, each of which destroys a mechanical excitation in cavity $s$ of the unit cell with the conserved quasimomentum $\vec k$. The same definition applies for the optical modes, so that the Hamiltonian separates into independent blocks $\hat H(\vec k)$, each comprising a mechanical part, an optical part $\hat H_C(\vec k)$ of the equivalent form, and the OM interaction. Here, $\hat b^\dagger_s(\vec k) = [\hat b_s(\vec k)]^\dagger$. Since $H_{OMC}$ is quadratic, one can fully solve the excitation spectrum by considering only a single excitation, for which $\hat H(\vec k)$ is a $6 \times 6$ matrix. Diagonalizing $\hat H(\vec k)$ for every $\vec k$ within the first Brillouin zone of the Kagome lattice leads to a six-band dispersion relation as shown in Fig. 2 (a).
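The diagonalization step can be illustrated on the mechanical block alone. The Python sketch below builds a $3 \times 3$ nearest-neighbor Kagome Bloch matrix and evaluates its bands at the high-symmetry points. It is a minimal sketch and not the full $6 \times 6$ optomechanical matrix: the optical modes and the coupling $G$ are omitted, and the convention $H(\vec k) = \omega_M + K\,M(\vec k)$ (sign of the hopping, bond vectors, function name) is an assumption, chosen so that the Dirac point lies at $\omega_M + K$ as stated in Appendix D.

```python
import numpy as np


def kagome_mech_bands(kx, ky, omega_m=1.0, K=0.1, a=1.0):
    """Eigenfrequencies of a 3x3 nearest-neighbour Kagome Bloch matrix.

    Assumed convention: H(k) = omega_m * 1 + K * M(k), with nearest-neighbour
    distance a (lattice constant 2a) and sites A, B, C in each unit cell.
    """
    k = np.array([kx, ky])
    # Bond vectors connecting A-B, A-C and B-C; each bond occurs as +d and -d.
    d_ab = a * np.array([1.0, 0.0])
    d_ac = a * np.array([0.5, np.sqrt(3.0) / 2.0])
    d_bc = d_ac - d_ab
    c_ab, c_ac, c_bc = (2.0 * np.cos(k @ d) for d in (d_ab, d_ac, d_bc))
    M = np.array([[0.0, c_ab, c_ac],
                  [c_ab, 0.0, c_bc],
                  [c_ac, c_bc, 0.0]])
    return np.linalg.eigvalsh(omega_m * np.eye(3) + K * M)  # ascending order


# High-symmetry points for lattice constant 2a with a = 1.
gamma_pt = kagome_mech_bands(0.0, 0.0)
k_pt = kagome_mech_bands(2.0 * np.pi / 3.0, 0.0)
m_pt = kagome_mech_bands(np.pi / 2.0, -np.pi / (2.0 * np.sqrt(3.0)))
print(gamma_pt, k_pt, m_pt)
```

Within this convention the two upper bands touch at $\omega_M + K$ at the $K$ point, and the middle band sits at $\omega_M$ at the $M$ point, which matches the band positions used in Appendix D.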

Semi-infinite stripe
For a stripe that is infinite in the $x$ direction (along $\vec R_2$) with $N_y$ unit cells along $\vec R_1$, only the quasi-momentum along $x$ ($k_x$) is conserved, and $\hat b_{s,j}$ (and likewise $\hat a_{s,j}$) is expanded in terms of operators $\hat b_{s,m}(k_x)$. Here, $N_x \to \infty$ is the number of unit cells along $x$ and $\hat b_{s,m}(k_x)$ is the destruction operator for a mechanical mode with quasi-momentum $k_x$ in cavity $s$ of the $m$th unit cell along $\vec R_1$. We thus consider a stripe that extends from $y = 0$ to $y = -N_y\sqrt{3}a$, with increasing $m$ corresponding to moving along $-y$.
Similar to the infinite 2D case, one obtains a mechanical Hamiltonian for each $k_x$, whose last line describes the hoppings within the first unit cell and so determines the form of the edge. In the case considered here, the site A is missing, which leads to a straight edge as pictured in Fig. 4 (b). The optical Hamiltonian $\hat H_C(k_x)$ adopts the same form.
Within the single-excitation subspace, $H_{k_x}$ is a matrix of dimension $3N_y - 1$, and its diagonalization leads to the dispersion relation shown in Fig. 4 (a). For finite $G$ and a phase pattern $\Delta\theta = \pm 2\pi/3$, the edge states appear in the energy spectrum and can be expressed within the same basis, see Eq. (C9). Here, the coefficients $u_s$ and $v_s$ are the mechanical and optical probability amplitudes on the basis site $s$, respectively. The edge state decays exponentially into the bulk with a penetration depth $\xi(k_x)$ and phases $\phi_m(k_x)$.

Appendix D: Topological gap
In this section, we derive in more detail the effective models to describe the opening and closing of the topological gap. We consider an infinite 2D array and utilize the modes derived in Eq. (C2).
In addition, a phase pattern of the drive $\Delta\theta = \theta_A - \theta_B = \theta_B - \theta_C = \theta_C - \theta_A = \pm 2\pi/3$ means that only the optical mode $|o_{\Gamma,\pm}\rangle$ is driven. Owing to the conservation of the total quasi-angular momentum, only a few OM processes are allowed. For example, given $\Delta\theta = 2\pi/3$, the lowest-energy optical eigenstate is $|o_{K,-}\rangle$ and can only interact with the mechanical eigenstate of quasi-angular momentum $\sigma = +$ at the Dirac point $\omega_M + K$, leading to a simple two-mode effective model

$$H_{\mathrm{eff},K} = \delta_{OM}\, a^\dagger_{K,-} a_{K,-} + G\,(a^\dagger_{K,-} b_{K,+} + a_{K,-} b^\dagger_{K,+}). \qquad \text{(D2)}$$

Here, $a_{K,-}$ destroys a photon in mode $|o_{K,-}\rangle$ and the effective Hamiltonian is written in a frame that rotates at the frequency $\omega_M + K$. Each time a phonon is destroyed, a photon of quasi-angular momentum $\sigma = +$ is also absorbed. $H_{\mathrm{eff},K}$ is easily diagonalized and leads to the gap presented in Eq. (4). Note that the highest mechanical band remains untouched with $E_{m,2}(K) = \omega_M + K$.
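The diagonalization of the two-mode model in Eq. (D2) can be sketched numerically. The Python snippet below writes $H_{\mathrm{eff},K}$ as a $2 \times 2$ matrix in the basis $(a_{K,-}, b_{K,+})$ and returns its eigenvalues $\delta_{OM}/2 \mp \sqrt{\delta_{OM}^2/4 + G^2}$. The function name and the parameter values are illustrative assumptions; energies are measured in the frame rotating at $\omega_M + K$, so the uncoupled mechanical band stays at zero.

```python
import numpy as np


def two_mode_levels(delta_om, G):
    """Eigenvalues of H_eff,K in the (a_{K,-}, b_{K,+}) basis, ascending order."""
    H = np.array([[delta_om, G],
                  [G, 0.0]])
    return np.linalg.eigvalsh(H)


# Hypothetical parameters with G << delta_OM (same arbitrary units).
levels = two_mode_levels(delta_om=1.0, G=0.1)
phonon_like_shift = levels[0]  # shift of the phonon-like level away from zero
print(levels, phonon_like_shift)
```

For $G \ll \delta_{OM}$ the phonon-like level is shifted by approximately $-G^2/\delta_{OM}$ with respect to the untouched band, which is the perturbative origin of the splitting at the Dirac point in this simplified picture.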
For larger coupling rates $G$, processes occurring near the quasi-momenta $M_1$, $M_2$ and $M_3$ start to play a role (in what follows we omit the subscript for clarity). These high-symmetry points are invariant under $C_2$ rotations. As a consequence, the normal modes are divided into symmetric and anti-symmetric normal modes at these points. For this reason, the anti-symmetric mechanical band $E_{m,2}(M)$ does not interact with the symmetric optical band $E_{o,1}(M)$. The consequence is that, no matter how large the gap at $K$ and $K'$ becomes, the middle mechanical band stays at $E_{m,M} = \omega_M$ and thus bounds the total gap at $\Delta_{\max} = K$.
Moreover, the allowed interaction between the optical band $E_{o,1}(M)$ and the highest mechanical band $E_{m,3}(M)$ has the net effect of pushing down the mechanical band, which closes the gap. In order to accurately capture these processes, we also need to include the lowest mechanical band, leading to a three-mode effective model $H_{\mathrm{eff},M}$. From $H_{\mathrm{eff},M}$, one can find the critical value $G_c$ at which the gap starts to close, leading to Eq. (5).
Using this result in the equation for the receiver's cavity and performing a Born-Markov approximation, one recovers the standard equation, in which the TLS decays into the waveguide with an effective rate $\gamma$. Here, $u_{s\sigma} \equiv u_{s\sigma}(k_x = k_0)$, $v_g \equiv v_g(k_x = k_0)$ and $\phi_{m=0} \equiv \phi_{m=0}(k_x = k_0)$, where the momentum $k_0$ is defined by $\omega_E(k_0) = \omega_0$, i.e. the momentum at which the frequency of the TLS crosses the dispersion relation of the edge modes. Note that the factor 2 in the second line of Eq. (E6) comes from the distance of $2a$ between two unit cells in the Kagome lattice. For example, in cases where only the sites B and C are excited along the straight edges, i.e. in the truly 1D limit [cf. Fig. 4 (b)], $u_{s\sigma} = 1/\sqrt{2}$ and $\gamma = a|g_{sp}|^2/v_g$, as expected in a 1D unidirectional waveguide. Finally, the resulting expression for the incoming field is the one expected from the input-output formalism.
The role of the penetration depth $\xi \equiv \xi(k_x = k_0)$ is implicitly included in the coefficients $u_{s\sigma}$, as the normalization constraint imposes $\sum_{s,m} |u_{s\sigma}|^2 e^{-2ma/\xi} \equiv 1$.
As expected, as the penetration depth increases, the strength at which the TLS couples to the edge state decreases.
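This trend can be made explicit under simple assumptions. The Python sketch below assumes a semi-infinite stripe ($m = 0, 1, 2, \dots$), equal weight on the two edge sites B and C, and a decay rate of the form $\gamma = 2a|g_{sp}|^2|u_{s\sigma}|^2/v_g$. This form is inferred from the factor 2 and the 1D limit quoted above rather than copied from Eq. (E6), and the function names are hypothetical. Summing the geometric series in the normalization constraint then gives $|u_{s\sigma}|^2 = (1 - e^{-2a/\xi})/2$.

```python
import math


def edge_site_weight(xi_over_a):
    """|u|^2 per edge site from 2 |u|^2 sum_{m>=0} exp(-2 m a / xi) = 1.

    xi_over_a is the penetration depth in units of a and must be positive.
    """
    q = math.exp(-2.0 / xi_over_a)  # decay factor between neighbouring unit cells
    return 0.5 * (1.0 - q)


def decay_rate(g_sp, v_g, a, xi_over_a):
    """Assumed form gamma = 2 a |g_sp|^2 |u|^2 / v_g."""
    return 2.0 * a * abs(g_sp) ** 2 * edge_site_weight(xi_over_a) / v_g


# Dimensionless illustration: g_sp = v_g = a = 1, increasing penetration depth.
rates = [decay_rate(g_sp=1.0, v_g=1.0, a=1.0, xi_over_a=xi) for xi in (0.05, 1.0, 5.0)]
print(rates)
```

For $\xi \ll a$ the sketch recovers $|u_{s\sigma}|^2 = 1/2$ and $\gamma = a|g_{sp}|^2/v_g$, while larger penetration depths spread the edge state over more unit cells and reduce $\gamma$.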