Design of an ultra-low mode volume piezo-optomechanical quantum transducer

Coherent transduction of quantum states from the microwave to the optical domain can play a key role in quantum networking and distributed quantum computing. We present the design of a piezo-optomechanical device formed in a hybrid lithium niobate on silicon platform that is suitable for microwave-to-optical quantum transduction. Our design is based on acoustic hybridization of an ultra-low mode volume piezoacoustic cavity with an optomechanical crystal cavity. The strong piezoelectric nature of lithium niobate allows us to mediate transduction via an acoustic mode which only minimally interacts with the lithium niobate, and is predominantly silicon-like, with very low electrical and acoustic loss. We estimate that this transducer can realize an intrinsic conversion efficiency of up to 35% with < 0.5 added noise quanta when resonantly coupled to a superconducting transmon qubit and operated in pulsed mode at 10 kHz repetition rate. The performance improvement gained in such hybrid lithium niobate-silicon transducers makes them suitable for heralded entanglement of qubits between superconducting quantum processors connected by optical fiber links.


Introduction
Recent landmark demonstrations have established superconducting quantum circuits as a leading platform for quantum computing [1-4]. Superconducting processors encode quantum information in microwave-frequency photons and require operation at milliKelvin temperatures. Due to the high propagation loss and thermal noise of microwave links at room temperature, transmitting quantum information from superconducting processors over long distances remains an outstanding challenge. In contrast, optical photons are naturally suited for low-loss, long-distance transmission of quantum information [5,6]. The complementary properties of these two systems have spurred interest in transducers that can coherently convert quantum information between microwave and optical frequencies. Such transducers would enable optically connected networks of remote superconducting quantum processors, analogous to the classical networks underlying the internet and large-scale supercomputers with optical interconnects.
A quantum transducer operated as a frequency converter can be specified as a linear device with a certain conversion efficiency, added noise level, and repetition rate for pulsed operation. Current approaches for microwave-to-optical frequency conversion rely on a strong optical pump to mediate the conversion process between single photon-level signals at both frequencies [7-22]. Increasing pump power allows for higher conversion efficiency, but it also adds noise to the conversion process, owing to parasitic optical absorption in various components of the transducer and the vast difference in energy scales between optical and microwave frequencies. For quantum applications of the transducer, the number of added noise photons per up-converted signal photon should be $\ll 1$ [23]. This trade-off between efficiency and noise has been a key obstacle to transduction of quantum signals.
One particularly promising approach to microwave-to-optical transduction is piezo-optomechanics [12-18], in which acoustic phonons are used as intermediate excitations in the conversion process. This is achieved through a highly engineered mechanical mode with simultaneous piezoelectric and optomechanical couplings. Recent design and materials advances in these devices have led to a demonstration of optical readout of the quantum state of a superconducting qubit with added noise below one photon [12]. However, the aluminum nitride piezoelectric element contributed significantly to acoustic loss, compromising device performance.
In this work, we propose a new device design on a hybrid material platform, lithium niobate (LN) on silicon, which addresses the limitations of previous work. Our design features a highly miniaturized piezoelectric element so that its negative impact on device performance is negligible, while at the same time maintaining strong piezoelectric coupling rates. This design approach is enabled by the strong piezoelectric coefficients of LN [24]. Combined with the excellent optomechanical properties of silicon, our design achieves state-of-the-art piezoelectric and optomechanical performance. While some recent demonstrations have used a lithium niobate-on-silicon material platform [14,17], the design techniques outlined in this work were not fully exploited in these devices. We show that by leveraging the strengths of the individual materials, our design approach can yield the performance improvements necessary to employ these transducers for remote entanglement of superconducting quantum processors.
Fig. 1. a) Mode schematic for piezo-optomechanical transduction. Blue represents mode $\hat{c}$ of a microwave circuit, purple represents 'supermode' $\hat{b}$ of a mechanical oscillator, and red represents mode $\hat{a}$ of an optical cavity. b) Device schematic for the transducer in this work. The device can be split into two regions, one which couples to microwave electric fields and one which couples to optical fields. Both are part of the same mechanical 'supermode' $\hat{b}$.
Fig. 1a illustrates the mode picture of our transduction scheme. An intermediary mode $\hat{b}$ of a nanomechanical oscillator simultaneously couples to microwave photons from mode $\hat{c}$ of a microwave circuit at rate $g_{pe}$, and to optical photons from mode $\hat{a}$ of an optical cavity at rate $g_{om}$. Microwave photons are converted to phonons via a resonant piezoelectric interaction, and these phonons are subsequently converted into optical photons via a parametric optomechanical interaction.
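The microwave photon-phonon conversion is realized by tuning the circuit frequency $\omega_\mu$ on resonance with the mechanical frequency $\omega_m$, yielding the Hamiltonian $\hat{H}_{pe}/\hbar = g_{pe}(\hat{b}^\dagger\hat{c} + \hat{b}\hat{c}^\dagger)$ [25]. The phonon-optical photon conversion is realized by driving the optical cavity at a frequency $\omega_p$ that is red-detuned from the cavity frequency $\omega_o$ by exactly the mechanical frequency, $\omega_p - \omega_o = -\omega_m$. The resulting Hamiltonian is $\hat{H}_{om}/\hbar = g_{om}\sqrt{n_c}(\hat{a}^\dagger\hat{b} + \hat{a}\hat{b}^\dagger)$, where $n_c$ is the number of intracavity optical photons corresponding to the input drive power [26]. Here $\hat{a}$, $\hat{b}$, and $\hat{c}$ are the annihilation operators of the optical, mechanical, and microwave modes, and $g_{pe}$ and $g_{om}$ are the piezoelectric and optomechanical coupling rates.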
We realize the intermediary mechanical mode in the above scheme by connecting an ultra-low mode volume piezoacoustic cavity and an optomechanical crystal (OMC) cavity, shown in Fig. 1b. The acoustic modes of these components are strongly hybridized to form a mechanical supermode, whose mechanical displacement overlaps in one region with the field of a microwave circuit, and in another region with the field of an optical cavity. Using physically separate cavities allows us to independently optimize the piezoacoustic and optomechanical components of the transducer. Our design is formed from thin-film LN on the device layer of a silicon-on-insulator (SOI) chip. We define the piezoacoustic cavity in LN, which has large piezoelectric coefficients [24]. We define the OMC in silicon, since its large photoelastic coefficients and refractive index allow high optomechanical coupling. Well-established nanofabrication processes also allow high optical and mechanical quality factors for silicon OMCs [27-29]. For the microwave circuit in this design, we consider a transmon qubit with electrodes routed over the LN region to allow for capacitive coupling to the piezoacoustic cavity. The transmon is patterned in niobium on silicon, a standard material platform for realizing high-coherence qubits [30]. The insulating layer underneath these components is etched away, leaving a suspended silicon membrane as the substrate for our device.
Our design procedure utilizes finite element simulation in COMSOL Multiphysics® and numerical optimization of the device geometry. We begin by independently optimizing the piezoacoustic and OMC cavities for high piezoelectric coupling rate $g_{pe}$ and optomechanical coupling rate $g_{om}$, respectively. We design for closely matched acoustic modes at 5 GHz in both resonators, and for an optical mode at telecom wavelength (1550 nm). During the design process, it is crucial to maintain a low acoustic mode density such that the transduction schematic in Fig. 1a using a single acoustic mode remains valid. Further, since thin-film LN has higher microwave dielectric and acoustic loss than silicon [29,31-34], we aim to minimize the volume of LN in our device. The two independently optimized cavities are then physically connected and the parameters of the resulting hybrid acoustic modes are analyzed.

Piezo Cavity Design
The piezoacoustic cavity consists of a slab of lithium niobate on top of a suspended silicon membrane patterned in the shape of a box. We work with 100 nm-thick thin-film -z-cut LN, on top of a 220 nm-thick suspended silicon device layer. We choose the -z LN orientation so that the dominant LN piezoelectric coefficient, $d_{15} = 69$ pC/N [24], couples to modes with favorable symmetry properties (further discussion in Sec. 3). 80 nm-thick Nb electrodes run over the top of the slab, and are routed in the form of an interdigital transducer (IDT) which capacitively couples the cavity to a microwave circuit such as a transmon qubit, as seen in Fig. 2a.
The box is surrounded by a periodically patterned phononic shield to mitigate acoustic radiation losses and to clamp the membrane to the surrounding substrate. The clamps are spaced periodically so that the IDT electrodes are routed over the top of each clamp, providing a means for electrical routing which is not acoustically lossy.
The phononic shield uses an alternating block and tether pattern (see relevant dimensions in Fig. 2c) consisting of metal electrodes on top of a silicon base. By tuning the shield dimensions indicated in Fig. 2c, we achieve a > 1 GHz acoustic bandgap centered around 5.1 GHz, shown in Fig. 2d. This yields strong confinement of 5 GHz mechanical modes inside the piezo region for a sufficient number of shield periods, enabling high mechanical quality factors. By simulating the mechanical energy density across the entire cavity, we find that 5 shield periods provide > 4 orders of magnitude suppression of acoustic radiation into the environment, as shown in Fig. 2e.
The dimensions of the piezo box (outlined in Fig. 2b) are designed to support a periodic mechanical mode whose periodicity matches that of the IDT fingers. This results in high overlap between the electric field from the IDT and the electric field induced by mechanical motion in the piezo box. This overlap gives a microwave photon-phonon piezoelectric coupling rate, which is derived using first-order perturbation theory:
$$g_{pe} = \frac{\omega_m}{4\sqrt{U_m U_e}} \int_{\mathrm{LN}} \mathbf{D}\cdot\mathbf{E}\, dV \qquad (1)$$
Here the integral is taken over the entire LN slab, $\mathbf{D}$ is the electric displacement field induced from mechanical motion in the piezo region, and $\mathbf{E}$ is the single-photon electric field generated by the transmon qubit across the IDT electrodes. The fields are normalized to their respective zero-point energies $\hbar\omega_m/2$, yielding the pre-factor in front of the integral in eq. (1). $U_m$ is the total cavity mechanical energy, and $U_e = \frac{1}{2}(C_q + C_{\mathrm{IDT}})V_0^2$ is the total IDT electrostatic energy, where $V_0$ is the zero-point voltage across the qubit. We note that for fixed electrostatic energy, the zero-point voltage (and thus $\mathbf{E}$) is dependent on both the qubit capacitance $C_q$ and IDT finger capacitance $C_{\mathrm{IDT}}$, and therefore the coupling rate scales as $(C_q + C_{\mathrm{IDT}})^{-1/2}$. For our calculations in this work, we assume $C_q = 70$ fF, which is a typical value for transmon qubit capacitance [35]. Replacing the transmon qubit with a high-impedance microwave resonator will allow a lower capacitance of ∼ 2 fF [36,37], and can therefore further increase this coupling rate. $C_{\mathrm{IDT}}$ is calculated with finite-element electrostatic simulation, and for the wavelength-scale devices considered here is typically ∼ 0.25 fF, a small contribution compared to the transmon capacitance.
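As a rough illustration of the capacitance scaling described above, the following Python sketch evaluates how the coupling rate would change if the 70 fF transmon were replaced by a ∼ 2 fF high-impedance resonator. It assumes that the mode overlap and frequency are unchanged, so that only the (C + C_IDT)^(-1/2) factor matters. The function name and the fixed-overlap assumption are ours for illustration and are not part of the simulated design.

```python
import math

def coupling_scale(c_new_fF, c_ref_fF, c_idt_fF):
    """Ratio g_pe(new) / g_pe(ref), assuming g_pe scales as (C + C_IDT)^(-1/2)
    with the field overlap held fixed (an illustrative assumption)."""
    return math.sqrt((c_ref_fF + c_idt_fF) / (c_new_fF + c_idt_fF))

C_IDT_FF = 0.25        # IDT finger capacitance quoted in the text
C_TRANSMON_FF = 70.0   # transmon capacitance assumed in the text
C_RESONATOR_FF = 2.0   # high-impedance resonator capacitance quoted in the text

gain = coupling_scale(C_RESONATOR_FF, C_TRANSMON_FF, C_IDT_FF)
print(f"g_pe enhancement factor: {gain:.2f}")
```

Under these assumptions the coupling rate would increase by roughly a factor of 5.6.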
The small value of $C_{\mathrm{IDT}}$ also minimizes the energy participation of the qubit electric field in the lossy piezo region, given by the ratio
$$\zeta_q = \frac{C_{\mathrm{IDT}}}{C_q + C_{\mathrm{IDT}}}$$
The contribution of lithium niobate to the qubit loss rate $\kappa_q$ is then estimated as $\zeta_q \kappa_{q,\mathrm{LN}}$. Using reported dielectric loss tangents in lithium niobate at milliKelvin temperatures, $\tan\delta = 1.7 \times 10^{-5}$ [32,33], corresponding to $\kappa_{q,\mathrm{LN}}/2\pi = 85$ kHz at 5 GHz, we estimate the LN contribution to qubit loss to be $\zeta_q \kappa_{q,\mathrm{LN}}/2\pi \sim 300$ Hz. This contribution is much smaller than typical loss rates $\kappa_{q,\mathrm{SOI}}/2\pi \sim 50$ kHz reported in transmon qubits fabricated on SOI [38]. As a result, the contribution of the piezoacoustic cavity to qubit loss is not a limiting factor, which justifies the on-chip coupling scheme outlined in Fig. 2a. We note that the IDT capacitance can be increased by adding more IDT fingers, thereby increasing $g_{pe}$. However, as the number of fingers increases, the increased cavity size results in a more crowded mode structure, and it is more difficult to isolate a single mechanical mode without coupling to parasitic modes in the vicinity of the mode of interest. This is of key importance, as these parasitic modes may not hybridize well with the OMC cavity and may reduce overall transduction efficiency. This is shown in Fig. 3b, where we simulate the mechanical mode structure vs. the number of IDT fingers in the cavity. We see that $g_{pe}$ of the mode of interest saturates, while the mode isolation is significantly reduced with increasing IDT fingers.
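The qubit-loss estimate above can be reproduced with a few lines of arithmetic. The sketch below assumes that the participation ratio is the capacitance ratio C_IDT / (C_q + C_IDT) and that a fully participating dielectric gives a loss rate (divided by 2π) equal to the loss tangent times the frequency. Both are our reading of the text, and the variable names are illustrative.

```python
C_Q_FF = 70.0          # transmon capacitance (fF), from the text
C_IDT_FF = 0.25        # IDT capacitance (fF), from the text
LOSS_TANGENT = 1.7e-5  # LN dielectric loss tangent at mK temperatures
FREQ_HZ = 5e9          # operating frequency

# Assumed form of the energy participation of the qubit field in the LN
zeta_q = C_IDT_FF / (C_Q_FF + C_IDT_FF)

# Loss rate (divided by 2*pi) for full participation in the LN
kappa_ln_hz = LOSS_TANGENT * FREQ_HZ

# LN contribution to the qubit loss rate (divided by 2*pi)
kappa_q_ln_hz = zeta_q * kappa_ln_hz

print(f"participation: {zeta_q:.4%}")
print(f"full-participation loss: {kappa_ln_hz / 1e3:.0f} kHz")
print(f"LN contribution to qubit loss: {kappa_q_ln_hz:.0f} Hz")
```

With the quoted values this gives a participation of about 0.36% and an LN contribution of about 300 Hz, consistent with the estimate in the text.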
Reducing the size of the piezo region is also important to reduce mechanical loss, as lithium niobate has high acoustic loss tangents compared to silicon (further discussion in Sec. 4). For these reasons, we choose a 2-finger design to minimize these effects. The strong piezoelectric nature of lithium niobate allows for $g_{pe}$ values high enough for strong microwave photon-phonon coupling, even in the limit of 2 IDT fingers. We emphasize the small dimensions of the piezoacoustic cavity in this design in contrast with previous work on piezo-optomechanical quantum transducers [13,15,16,18,39]. The benefits of this approach come at the cost of higher sensitivity of the piezoacoustic modes to changes in cavity dimensions. This can have large effects on hybridization with the OMC cavity and the performance of the final transducer device, which relies on resonant matching of acoustic modes in both regions. We show further in Sec. 4 that the achievable hybridization between piezo and OMC modes with this small piezo volume approach is large enough to protect the design against typical fabrication disorder.
The mechanical mode of interest is periodic with out-of-plane (Lamb-wave) and in-plane breathing components. The Lamb-wave component of the mode induces an electric field in the piezo region with high overlap with the IDT electric field, while the breathing component of the mode enables hybridization with the breathing mode of the optomechanical crystal to be attached in the full device (see Fig. 3c). The piezoacoustic mode can be tuned with three key parameters: $a_{\mathrm{IDT}}$, $w_p$, and $b$. $a_{\mathrm{IDT}}$ is the periodicity of the IDT fingers and is used to parameterize the piezo box length, given by $l_p = N a_{\mathrm{IDT}}/2$, where $N$ is the number of IDT fingers. $a_{\mathrm{IDT}}$ is used to tune the frequency of the mode of interest while maintaining appropriate phase-matching of the mode periodicity with the IDT fingers (shown in Fig. 3d). $w_p$ is the piezo box width, which can be increased to increase $g_{pe}$ via larger mode volumes or decreased to reduce the mode crowding that results from larger box size. Finally, we define a silicon-piezo buffer parameter $b$, which extends the silicon box length and width by an amount $b$ compared to the piezo box. This buffer is needed to protect against silicon/piezo box misalignment in the fabrication process, and acts as an added degree of freedom for tuning frequency, $g_{pe}$, and mode isolation. We use numerical optimization to tune the parameters ($a_{\mathrm{IDT}}$, $w_p$, $b$) to arrive at a design with high piezoelectric coupling and mode isolation. We employ a Nelder-Mead simplex optimization similar to that described in Ref. [28]. After optimization, we obtain the mechanical mode structure shown in Fig. 3a. We identify a single mechanical mode with $g_{pe}/2\pi = 9.01$ MHz, which is isolated by > 110 MHz from other mechanical modes. We will use this single mode to strongly couple to the modes of an optomechanical crystal cavity to create the mechanical supermode of Fig. 1a.
A point of concern with LN is the in-plane crystal orientation with respect to the IDT electrodes. Due to the strong anisotropy of LN, in general the mode structure and $g_{pe}$ will vary with this orientation. For -z-cut LN and our choice of mode, both $g_{pe}$ and the mode frequencies vary minimally with in-plane crystal rotations, with $g_{pe}$ varying by < 5% over a full $2\pi$ rotation.

Optomechanics Design
The optomechanical crystal (OMC) cavity is a periodically patterned silicon nanobeam which is designed in a similar fashion to previous work [28], with the crucial change of a modified unit cell design on one side of the cavity to enable strong mechanical hybridization with the piezoacoustic cavity. This separates the OMC into three distinct regions: a phonon mirror, defect region, and phonon waveguide (see Fig. 4 for details). The phonon mirror unit cell (Fig. 4b) is designed to have a simultaneous mechanical and optical bandgap for modes of certain symmetry classes. In the defect region, the phonon mirror unit cell transitions to a defect cell designed to co-localize a 5.1 GHz mechanical breathing mode and a 194 THz ($\lambda_0 = 1550$ nm) optical mode. The phonon waveguide unit cell (Fig. 4c) is mechanically transparent to breathing mode phonons at 5.1 GHz, while maintaining a large bandgap for optical modes. This is achieved by modifying the ellipticity of the phonon mirror unit cell. We see in Fig. 5b that the resulting mechanical mode is permitted to leak out into the phonon waveguide region, while the optical mode remains highly localized within the defect region. The optomechanical coupling rate is calculated from the mechanically-induced optical frequency shift arising due to the photoelastic effect and moving dielectric boundaries [28], giving $g_{om} = g_{om,\mathrm{PE}} + g_{om,\mathrm{MB}}$. The photoelastic contribution is derived from first-order perturbation theory as
$$g_{om,\mathrm{PE}} = \frac{\omega_o \epsilon_0 n^4}{2}\, \frac{\int \mathbf{E}^* \cdot (\mathbf{p}\,\mathbf{S}) \cdot \mathbf{E}\, dV}{\int \mathbf{E}^* \cdot \mathbf{D}\, dV}\, x_{\mathrm{zpf}}$$
where $\omega_o$ is the optical cavity frequency, $n$ is the refractive index, $\mathbf{E}$ is the electric field, $\mathbf{p}$ is the photoelastic tensor, and $\mathbf{S}$ is the strain tensor. Here $\epsilon_0$ is the vacuum permittivity, $\mathbf{D}$ is the electric displacement field, and $x_{\mathrm{zpf}}$ is the zero-point displacement amplitude of the mechanical mode.
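The moving boundaries component is derived similarly as
$$g_{om,\mathrm{MB}} = -\frac{\omega_o}{2}\, \frac{\oint (\mathbf{Q}\cdot\hat{n}) \left(\Delta\epsilon\, |\mathbf{E}_{\parallel}|^2 - \Delta\epsilon^{-1}\, |\mathbf{D}_{\perp}|^2\right) dS}{\int \mathbf{E}^* \cdot \mathbf{D}\, dV}\, x_{\mathrm{zpf}}$$
where $\omega_o$ is the optical cavity frequency, $x_{\mathrm{zpf}}$ is the zero-point displacement amplitude, $\mathbf{Q}$ is the normalized mechanical displacement field, $\hat{n}$ is the surface normal, $\mathbf{E}_{\parallel}$ is the electric field parallel to the surface, $\mathbf{D}_{\perp}$ is the electric displacement field perpendicular to the surface, $\Delta\epsilon = \epsilon_{\mathrm{Si}} - \epsilon_{\mathrm{Air}}$, and $\Delta\epsilon^{-1} = \epsilon_{\mathrm{Si}}^{-1} - \epsilon_{\mathrm{Air}}^{-1}$. One may expect the coupling rates in this design to suffer due to the delocalization of the mechanical mode. However, we find that after a Nelder-Mead simplex optimization of various OMC dimensions similar to [28], the resulting design gives multiple modes with high simulated values of $g_{om}/2\pi$, with the maximum coupling rate exceeding 850 kHz (Fig. 5a). This is comparable to state-of-the-art OMC designs in silicon, which achieve $g_{om}/2\pi$ up to 1.1 MHz [28,29,40].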
The radiation-limited optical quality factor $Q_o$ can be simulated and is found to be in excess of $10^6$, corresponding to an intrinsic optical loss rate $\kappa_{o,i}/2\pi \sim 200$ MHz. However, $Q_o$ is usually practically limited to $\sim 5 \times 10^5$ ($\kappa_{o,i}/2\pi \sim 400$ MHz) due to optical scattering from surface defects introduced in the fabrication process [28]. To ensure this limit is reached, we configure the optimization such that $g_{om}$ is maximized while maintaining the simulated $Q_o$ above $\sim 10^6$, well above the practical limit. The total optical loss rate is given by $\kappa_o = \kappa_{o,i} + \kappa_{o,e}$, where $\kappa_{o,e}$ is the decay rate associated with input coupling. $\kappa_{o,e}$ is controlled with a coupling waveguide using well-established design techniques [41], and is typically designed so that $\kappa_{o,e} = \kappa_{o,i}$. The total optical loss rate is then $\kappa_o/2\pi \approx 2\kappa_{o,i}/2\pi = 800$ MHz.
When hybridizing the modes of the piezoacoustic and OMC cavities, we must consider the relative motional symmetry of the two cavity modes. Our OMC cavity design contains only breathing motion, whereas the piezoacoustic cavity design contains both breathing and Lamb-wave components. If the phonon waveguide permits propagation of 5 GHz Lamb-wave modes, then the OMC defect region will contain both breathing and Lamb-wave motion when hybridized with the piezoacoustic cavity. This will significantly reduce optomechanical coupling, as only breathing motion yields high values of $g_{om}$, whereas Lamb-wave motion produces negligible $g_{om}$. For this reason, the phonon waveguide dimensions are chosen such that the bandstructure exhibits a bandgap for 5 GHz Lamb-wave-like modes, but is transparent to 5 GHz breathing modes. Fig. 5c further illustrates this idea by showing the mechanical bandgap of unit cells across the defect region for both breathing and Lamb-wave-like modes, shaded in red and blue respectively. At the phonon waveguide side, the breathing mode bandgap falls below 5 GHz, permitting the breathing motion of the piezoacoustic mode to couple strongly to the defect region.
However, 5 GHz lies inside the Lamb-wave bandgap, so that Lamb-wave motion from the piezoacoustic cavity decays in the phonon waveguide and does not interact with the OMC cavity defect region. The phonon mirror exhibits a bandgap for both breathing and Lamb wave symmetries, so that acoustic radiation of the full transducer mode to the environment is effectively suppressed.
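The optical loss rates quoted earlier in this section follow from the standard relation between linewidth and quality factor, linewidth = frequency / Q. The sketch below is a minimal check of those numbers, assuming a vacuum wavelength of exactly 1550 nm. The helper name is ours, and the quoted values in the text are rounded.

```python
C_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum
WAVELENGTH_M = 1550e-9           # telecom design wavelength

def linewidth_hz(freq_hz, quality_factor):
    """Energy decay rate divided by 2*pi for a resonance with the given Q."""
    return freq_hz / quality_factor

f_opt_hz = C_LIGHT_M_PER_S / WAVELENGTH_M

kappa_i_radiation_hz = linewidth_hz(f_opt_hz, 1e6)    # simulated radiation limit
kappa_i_practical_hz = linewidth_hz(f_opt_hz, 5e5)    # fabrication-limited Q
kappa_total_hz = 2.0 * kappa_i_practical_hz           # critical coupling assumed

print(f"optical frequency: {f_opt_hz / 1e12:.1f} THz")
print(f"radiation-limited intrinsic loss: {kappa_i_radiation_hz / 1e6:.0f} MHz")
print(f"practical intrinsic loss: {kappa_i_practical_hz / 1e6:.0f} MHz")
print(f"total loss at critical coupling: {kappa_total_hz / 1e6:.0f} MHz")
```

This gives approximately 193 MHz, 387 MHz, and 774 MHz, which the text rounds to 200 MHz, 400 MHz, and 800 MHz.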

Full Device Design
After independently designing the piezoacoustic and optomechanical cavities, we connect the two as shown in Fig. 6d, and simulate the resulting hybridized mode structure. To observe the hybridization of the piezoacoustic and optomechanical modes, we sweep the IDT period in the piezo region to tune the piezoacoustic mode through the multiple optomechanical resonances. The results are shown in Fig. 6a and b. We find that over a frequency window > 250 MHz, there is a large number of mechanical modes with simultaneously high piezoelectric and optomechanical coupling rates. The phonon waveguide allows for strong enough mode hybridization that the piezoelectric coupling is distributed across a large number of modes. As shown in Fig. 6c, the mechanical energy participation in the piezo region, $\zeta_m$, is in the range 1-10%. We find that across the entire hybridization window, at least one mode can be identified with $g_{om}/2\pi > 650$ kHz, $g_{pe}/2\pi > 1$ MHz, and $\zeta_m < 5\%$. In Fig. 6c, this mode is highlighted in red with $g_{om}/2\pi = 826$ kHz, $g_{pe}/2\pi = 2.8$ MHz, and $\zeta_m = 2.3\%$. We will use the values from this mode to quantify further calculations in this work. In practice, the frequencies and couplings of these mechanical modes are subject to change due to multiple sources of fabrication disorder. The multi-mode structure and relatively large hybridization ensure that the full device is robust to these shifts. While the exact frequencies and couplings may shift, Fig. 6a and 6b illustrate that the qualitative nature of the mode structure remains unchanged for a large range of frequency shifts. Additionally, the modes are separated far enough in frequency that their parasitic effect on each other's transduction efficiencies is minimal.
We may use the simulated piezo participation ratio and radiation loss to estimate the mechanical decoherence rate of our device. There are two dominant contributions to decoherence in our design. The first is acoustic radiation loss into the surrounding substrate. This can be simulated and is found to be in the range $\gamma_{\mathrm{rad}}/2\pi \sim 1$-$10$ kHz for all modes in Fig. 6, with $\gamma_{\mathrm{rad}}/2\pi = 2.3$ kHz for the mode highlighted in Fig. 6c. The second is coupling to two-level systems (TLS), which in both lithium niobate [34] and silicon [29] has been found to be the dominant loss mechanism for GHz-frequency acoustic cavities at single-phonon-level powers and milliKelvin temperatures. For a mechanical piezo participation ratio $\zeta_m$, the TLS decoherence rate can be estimated by $\gamma_{\mathrm{TLS}} = \zeta_m \gamma_{\mathrm{LN}} + (1 - \zeta_m)\gamma_{\mathrm{Si}}$. Using reported TLS-limited linewidths, $\gamma_{\mathrm{LN}}/2\pi \sim 300$ kHz in lithium niobate [34] and $\gamma_{\mathrm{Si}}/2\pi \sim 4$ kHz in silicon [29], and taking $\zeta_m = 2\%$, we estimate a TLS decoherence rate of $\gamma_{\mathrm{TLS}}/2\pi \sim 10$ kHz. The total mechanical decoherence rate is then estimated to be in the range $\gamma_m/2\pi \sim 10$-$20$ kHz.
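The participation-weighted TLS estimate above is simple enough to check directly. The following sketch evaluates the weighted sum with the quoted linewidths and adds the simulated radiation loss of the highlighted mode. Treating the two loss channels as additive is our assumption, and the function name is illustrative.

```python
def tls_rate_khz(zeta_m, gamma_ln_khz, gamma_si_khz):
    """Participation-weighted TLS linewidth (all rates divided by 2*pi, in kHz)."""
    return zeta_m * gamma_ln_khz + (1.0 - zeta_m) * gamma_si_khz

ZETA_M = 0.02          # mechanical energy participation in the LN, from the text
GAMMA_LN_KHZ = 300.0   # TLS-limited linewidth in lithium niobate
GAMMA_SI_KHZ = 4.0     # TLS-limited linewidth in silicon
GAMMA_RAD_KHZ = 2.3    # simulated radiation loss of the highlighted mode

gamma_tls_khz = tls_rate_khz(ZETA_M, GAMMA_LN_KHZ, GAMMA_SI_KHZ)
gamma_total_khz = gamma_tls_khz + GAMMA_RAD_KHZ  # assumes the channels simply add

print(f"TLS contribution: {gamma_tls_khz:.2f} kHz")
print(f"TLS + radiation: {gamma_total_khz:.2f} kHz")
```

The result, about 9.9 kHz from TLS and about 12 kHz in total, falls within the 10-20 kHz range quoted in the text. Note that even at 2% participation the lithium niobate accounts for roughly 6 kHz of the TLS term, more than the silicon.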

Efficiency and Added Noise
To analyze the efficiency and noise of our design, we consider a pulsed scheme for microwave-to-optical state transfer on a transmon qubit connected to the transducer [12]. The qubit is first tuned on resonance with the mechanical mode for a time $\tau = \pi/2g_{pe}$ to complete a microwave photon-phonon swap operation, and subsequently detuned far off-resonance. A red-detuned laser pulse, with pump frequency $\omega_p$ set below the optical cavity frequency $\omega_o$ by the mechanical frequency ($\omega_p - \omega_o = -\omega_m$), is then used to upconvert this phonon into an optical photon. The intrinsic efficiency of such a pulsed scheme is simply given by $\eta_i = \eta_{pe}\eta_{om}$, where $\eta_{pe}$ is the piezoelectric photon-phonon swap efficiency and $\eta_{om}$ is the optomechanical phonon-photon conversion efficiency.
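For a sense of scale, the swap time in the scheme above can be evaluated for the mode highlighted in the Full Device Design section, which has a piezoelectric coupling rate of 2.8 MHz (divided by 2π). The sketch below is a direct evaluation of π/(2 g_pe) with g_pe in angular units. It ignores loss and finite tuning speed, and the variable names are ours.

```python
import math

G_PE_OVER_2PI_HZ = 2.8e6  # piezoelectric coupling of the highlighted mode

g_pe_rad_per_s = 2.0 * math.pi * G_PE_OVER_2PI_HZ

# Ideal (lossless) photon-phonon swap time, tau = pi / (2 * g_pe)
tau_swap_s = math.pi / (2.0 * g_pe_rad_per_s)

print(f"swap time: {tau_swap_s * 1e9:.1f} ns")
```

The swap takes about 89 ns, far shorter than the decay times implied by the qubit and mechanical loss rates of tens of kHz, which is why a high swap efficiency is plausible.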
The photon-phonon swap efficiency, $\eta_{pe}$, can be calculated from a master equation simulation of the qubit-mechanics system with the Hamiltonian $\hat{H}_{pe}$ described in Sec. 1. Using $g_{pe}/2\pi = 2.8$ MHz, an estimated qubit loss rate $\kappa_q/2\pi = 50$ kHz from Sec. 2, and an estimated mechanical decoherence rate $\gamma_m/2\pi = 20$ kHz from Sec. 4, we find $\eta_{pe} = 95\%$. The optomechanical readout step determines both $\eta_{om}$ and the dominant noise contribution to the transducer, which arises from optical absorption heating of the mechanical mode. For a laser pulse duration $\tau_p$, $\eta_{om}$ is given by [12]
$$\eta_{om} = \frac{\gamma_{om}}{\gamma_{om} + \gamma_m}\left(1 - e^{-(\gamma_{om} + \gamma_m)\tau_p}\right)$$
where $\gamma_{om} = 4 g_{om}^2 n_c/\kappa_o$ is the optomechanical scattering rate, $\kappa_o$ is the total optical loss rate, and $n_c$ is the number of intracavity optical photons corresponding to the peak power of the optical pulse. In principle, this efficiency may be unity in the limit $\tau_p \gg 1/(\gamma_{om} + \gamma_m)$ and $\gamma_{om} \gg \gamma_m$. However, optically-induced heating of the mechanical mode severely limits $n_c$ in order to maintain < 1 added noise photon. This leads to a fundamental tradeoff between efficiency and added noise resulting from OMC heating dynamics at milliKelvin temperatures. Maximizing efficiency for a given level of added noise requires careful choice of the pulse duration $\tau_p$ and the optical power, parameterized by $n_c$.
The added noise phonons ($n_m$) during optical readout are thought to originate from optical excitation of material defect states which undergo phonon-assisted relaxation, thereby heating the mechanical mode of interest [29,40]. The timescale $\tau_h$ for $n_m$ to exceed 1 noise phonon depends strongly on the mechanical damping rate $\gamma_m$ and is found to vary greatly in different devices. Experiments with low-loss ($\gamma_m/2\pi \lesssim 10$ kHz) pure silicon OMCs report $\tau_h \sim 1$ μs [29,40,42], whereas those with silicon OMCs integrated in a piezo-optomechanical transducer with $\gamma_m/2\pi = 1$ MHz report much shorter $\tau_h \sim 100$ ns [12]. This suggests the presence of additional sources of optically induced heating and mechanical damping in piezo-optomechanical transducers that are potentially correlated. Possible sources are optical absorption by the IDT electrodes, TLSs in the piezo region, and surface defects in the OMC region from additional steps in the transducer fabrication process. While the dynamics of optically induced heating in piezo-optomechanical devices is a subject of future studies, it is clear that a transducer design aimed at improving optomechanical readout efficiency and noise should make the acoustic mode involved in the transduction process as silicon-like as possible, since silicon is the lowest-loss material in the transducer.
In the design presented above, we minimize the dimensions of the piezoacoustic cavity so that most of the energy in the mechanical mode lives in the OMC region. The estimated mechanical damping rates based on participation ratios of various regions and calculated optomechanical coupling rates are comparable to those realized in pure silicon OMCs. Therefore, we may approximate the heating dynamics of our design as similar to that reported in previous silicon OMC work [29,40]. Using this heating model, we estimate ∼ 0.5 added noise photons for a pulse with intracavity photon number $n_c = 45$ and duration $\tau_p = 500$ ns. Using the previously estimated values $\gamma_m/2\pi = 20$ kHz, $\kappa_o/2\pi = 800$ MHz, and $g_{om}/2\pi = 826$ kHz (yielding an optomechanical scattering rate $\gamma_{om}/2\pi = 153$ kHz), we estimate that this pulse can achieve an optomechanical conversion efficiency $\eta_{om} = 37\%$. Combined with the swap efficiency $\eta_{pe} = 95\%$, the estimated intrinsic efficiency is $\eta_i \sim 35\%$. There are additional noise sources which we have not considered here, such as photodetector dark counts and residual photons from the optical pump pulse. However, given the measured photon count rates for these noise sources in previous milliKelvin optomechanics experiments [40], these noise contributions are negligible compared to the one from optical absorption heating.
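The numbers in this estimate can be cross-checked with a short calculation. The sketch below assumes the scattering rate has the form 4 g_om^2 n_c / kappa and that the pulsed conversion efficiency takes the exponential-saturation form eta_om = gamma_om / (gamma_om + gamma_m) * (1 - exp(-(gamma_om + gamma_m) * tau)). The latter is our assumed reading of the expression from Ref. [12]; it reproduces the quoted 37%. All names are illustrative.

```python
import math

TWO_PI = 2.0 * math.pi

G_OM_HZ = 826e3      # g_om / 2pi
KAPPA_HZ = 800e6     # total optical loss rate / 2pi
GAMMA_M_HZ = 20e3    # mechanical decoherence rate / 2pi
N_C = 45             # intracavity photon number
TAU_P_S = 500e-9     # optical pulse duration
ETA_PE = 0.95        # swap efficiency quoted in the text

# Optomechanical scattering rate (divided by 2pi); the 2pi factors cancel
gamma_om_hz = 4.0 * G_OM_HZ**2 * N_C / KAPPA_HZ

# Assumed pulsed conversion efficiency (rates converted to angular units)
total_rate = TWO_PI * (gamma_om_hz + GAMMA_M_HZ)
eta_om = gamma_om_hz / (gamma_om_hz + GAMMA_M_HZ) * (1.0 - math.exp(-total_rate * TAU_P_S))

eta_intrinsic = ETA_PE * eta_om

print(f"gamma_om/2pi = {gamma_om_hz / 1e3:.1f} kHz")
print(f"eta_om = {eta_om:.3f}")
print(f"eta_i = {eta_intrinsic:.3f}")
```

This yields a scattering rate of about 153.5 kHz, an optomechanical efficiency of about 0.37, and an intrinsic efficiency of about 0.35, in agreement with the quoted values. Because the exponent is only about 0.55 for a 500 ns pulse, the efficiency is limited mainly by the pulse length rather than by mechanical loss.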
The total efficiency of the transducer includes efficiencies in the extraction and collection of optical photons produced in the transduction process, and is given by $\eta_{\mathrm{tot}} = \eta_i \eta_\kappa \eta_{\mathrm{ext}}$. Here, $\eta_i$ is the intrinsic efficiency, $\eta_\kappa = \kappa_{o,e}/\kappa_o$ is the ratio of the input-coupling decay rate to the total optical loss rate and determines the fraction of optical photons emitted into the coupling waveguide, and $\eta_{\mathrm{ext}}$ is the external photon collection efficiency. For $\kappa_{o,e} \approx \kappa_{o,i}$, with $\kappa_{o,i}$ the intrinsic optical loss rate (as is the case for typical 1D silicon OMC designs [40]), we have $\eta_\kappa \approx 50\%$. In typical optomechanics experiments, $\eta_{\mathrm{ext}}$ is mainly determined by the fiber-to-device coupling efficiency, insertion loss of the optical pump filtering setup, and quantum efficiency of single photon detectors. Based on values from recent experiments [12,14,17], these factors can be estimated to be 60%, 20%, and 90%, respectively, leading to $\eta_{\mathrm{ext}} \sim 10\%$. The product of all three efficiency estimates above yields a total transducer conversion efficiency from microwave to optical photons of $\eta_{\mathrm{tot}} \sim 2\%$.
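The efficiency budget above is a product of independent factors, which the following sketch multiplies out using the values quoted in the text. The variable names are ours.

```python
ETA_INTRINSIC = 0.35       # intrinsic efficiency estimated in the text
ETA_KAPPA = 0.50           # fraction of photons emitted into the waveguide

FIBER_COUPLING = 0.60      # fiber-to-device coupling efficiency
FILTER_TRANSMISSION = 0.20 # pump-filter setup transmission
DETECTOR_EFFICIENCY = 0.90 # single-photon detector quantum efficiency

eta_ext = FIBER_COUPLING * FILTER_TRANSMISSION * DETECTOR_EFFICIENCY
eta_total = ETA_INTRINSIC * ETA_KAPPA * eta_ext

print(f"external collection efficiency: {eta_ext:.3f}")
print(f"total efficiency: {eta_total:.4f}")
```

The external efficiency comes out at about 0.11 and the total at about 0.019, consistent with the quoted 10% and 2%. The pump-filter transmission is the smallest single factor in this budget.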
Finally, we consider the expected repetition rate for the transduction sequence in the pulsed scheme described above. In previous work with similar piezo-optomechanical transducers [12], this was limited to 100 Hz by the ∼ 10 ms timescale for quasiparticle relaxation in the aluminum transmon coupled to the transducer. We expect that using qubits and resonators fabricated in superconductors such as niobium and niobium nitride, with quasiparticle relaxation timescales in the ∼ ns range [43,44], will allow for repetition rates in the 10 kHz range, limited by the decay rate $\gamma_{\mathrm{rad}}$ of added noise phonons in the acoustic mode after the optical pulse. At this repetition rate and estimated total efficiency of ∼ 2%, we expect a single photon count rate of order 200 Hz, and a photon coincidence rate of order 4 Hz. These rates are a key figure of merit for heralded remote entanglement generation schemes [23,45], indicating that these sorts of experiments and applications are practically realizable with our proposed new transducer design. We note that the above estimates are calculated for microwave-to-optical frequency conversion and do not carry over to operation in the reverse direction, due to the nature of parasitic heating in our device. However, this does not impact the utility of the device for remote entanglement of microwave qubits, since we envision its use as a probabilistic microwave-to-optical frequency converter connected to a qubit. The resulting quantum correlations between the qubit and transduced optical photons can serve as a key resource to implement well-known probabilistic schemes for long-distance quantum communication based on optical photon detection [23,45].
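The count rates quoted above follow from the repetition rate and the total efficiency. The sketch below assumes that the single-photon rate is the repetition rate times the total efficiency, and that the coincidence rate corresponds to two independent, identical transducers each delivering a photon in the same trial, so that it scales with the efficiency squared. The coincidence model is our interpretation, and the names are illustrative.

```python
REPETITION_RATE_HZ = 10e3  # pulsed repetition rate from the text
ETA_TOTAL = 0.02           # total conversion efficiency from the text

# One detected photon per trial with probability ETA_TOTAL
single_rate_hz = REPETITION_RATE_HZ * ETA_TOTAL

# Two independent, identical transducers both succeeding in the same trial
coincidence_rate_hz = REPETITION_RATE_HZ * ETA_TOTAL**2

print(f"single-photon count rate: {single_rate_hz:.0f} Hz")
print(f"coincidence rate: {coincidence_rate_hz:.1f} Hz")
```

This reproduces the rates of order 200 Hz and 4 Hz quoted in the text. The quadratic dependence of the coincidence rate on efficiency is what makes improvements in total efficiency especially valuable for two-photon heralding schemes.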

Conclusion
We have presented an optimized design for an ultra-low mode volume piezo-optomechanical transducer. The choice of lithium niobate on silicon-on-insulator as the material platform, together with our design strategy, allows for transducer performance metrics in a parameter regime favorable for quantum network applications. Crucially, while these performance metrics are only valid for one-way conversion, this should not impact the effectiveness of the transducer in remote qubit entanglement protocols.
The efficiency, noise, and repetition rate of the above transducer design are expected to be limited by optomechanical heating rates. Future improvements can be made by employing 2D optomechanical crystal cavities [46], which, through better thermal conductance to the substrate, mitigate the impact of heating of the mechanical mode due to optical absorption. Further, the transmon qubit connected to the transducer can be replaced with a resonator made in disordered superconductors such as NbN or NbTiN [36,37], which provide high kinetic inductance and fast quasiparticle relaxation timescales. Such a transducer device could be connected to an off-chip qubit in a modular approach [7], which alleviates performance restrictions due to absorption of the transducer optical pump by superconducting qubits. With an order of magnitude higher impedance, kinetic inductance resonators can allow for further miniaturization of the piezo volume and the possibility of novel transducers with ultra-thin lithium niobate of thickness in the tens of nanometers range.