Design and operation of a microfabricated phonon spectrometer utilizing superconducting tunnel junctions as phonon transducers

To fully understand nanoscale heat transport, it is necessary to spectrally characterize phonon transmission in nanostructures. Towards this goal we have developed a microfabricated phonon spectrometer. We utilize microfabricated superconducting tunnel junction (STJ) phonon transducers for the emission and detection of tunable, non-thermal, and spectrally resolved acoustic phonons, with frequencies ranging from ~100 to ~870 GHz, in silicon microstructures. We show that phonon spectroscopy with STJs offers a spectral resolution of ~15-20 GHz, which is ~20 times better than thermal conductance measurements, for probing nanoscale phonon transport. The STJs are Al-AlxOy-Al tunnel junctions, and phonon emission and detection occur via quasiparticle excitation and decay transitions in the superconducting films. We elaborate on the design geometry and constraints of the spectrometer, the fabrication techniques, and the low-noise instrumentation that are essential for successful application of this technique to nanoscale phonon studies. We discuss the spectral distribution of phonons emitted by an STJ emitter and the efficiency of their detection by an STJ detector. We demonstrate that the phonons propagate ballistically through a silicon microstructure, and that submicron spatial resolution is realizable in a design such as ours. Spectrally resolved measurements of phonon transport in nanoscale structures and nanomaterials will further the engineering and exploitation of phonons, and thus have important ramifications for nanoscale thermal transport as well as the burgeoning field of nanophononics.


Importance of nanoscale phonon spectroscopy
One of the grand challenges of nanoscience is to develop experimental tools to understand the fundamental science of heat flow at the nanoscale [1,2]. In insulators and dielectrics, acoustic phonons are the dominant heat carriers [3,4]. In nanostructures, as the sample's dimension or surface morphology becomes comparable to phonon characteristic lengths (wavelength, mean free path, and coherence length), the interactions of phonons with these structural features lead to regimes of phonon propagation in which the effect of confinement, scattering, and/or interference of phonons dominates heat transport [5,6]. To probe these nanoscale effects on phonon transport, one needs a measurement technique that can precisely distinguish the wavelength (or frequency) and position of the phonon modes. Previous studies have investigated the effects of nanoscale geometries on thermal transport using Joule-heated metal films on suspended structures [7-11], but because a thermal conductance measurement employs a broad spectral distribution of phonons, the frequency dependence of the phonon transport in such measurements is difficult to distinguish. Therefore, there is a strong need for a nanoscale technique that will spectroscopically measure phonon transport at hypersonic (>1 GHz) frequencies, particularly at frequencies above 100 GHz, which are most relevant to heat flow [12]. Such a technique will be apt for the development of the burgeoning field of nanophononics [13].
An ability to fully understand the propagation of phonons will inform the engineering and exploitation of nanostructures and nanomaterials. For instance, through careful phonon engineering the realization of more efficient thermoelectric materials and microelectronic coolers will be feasible [10,14,15]. Such phonon engineering strategies have recently been demonstrated with silicon phononic crystal structures, which displayed a reduction in phonon thermal conductivity in comparison to bulk crystals [16,17]; however, the exact mechanism and frequency dependence of this reduction are not completely understood because diagnostic tools for nanoscale phonon spectroscopy were not available.
In this paper we describe a new tool for nanoscale phonon spectroscopy using microfabricated superconducting tunnel junctions (STJs). We detail its design and principle of operation, the fabrication techniques and challenges, the instrumentation and measurement procedures, and the results of selected phonon transport measurements. Phonon spectroscopy with STJs uses a narrow, non-thermal, and tunable frequency distribution of acoustic phonons to probe the phonon transport through nanostructures. STJ-based phonon spectroscopy has previously been performed extensively in macroscale samples, though by only a few research groups [18-21]. However, with the development in recent years of advanced micro/nanofabrication techniques, the phonon spectrometer can now be fabricated at the microscale and offer exceptional spatial resolution. The microfabricated phonon spectrometer has the advantage of probing nanoscale effects such as phonon confinement [3], end-coupling diffraction [22], and surface scattering [23], with submicron spatial resolution. We have recently demonstrated a prototype microfabricated spectrometer for emission and detection of non-equilibrium phonons with frequencies ranging from 0 to ~200 GHz [24], and have now tuned the phonon source (emitter) to emit phonons with frequencies ranging from 0 to ~870 GHz. The spectrometer comprises a pair of aluminum-aluminum oxide-aluminum (Al-AlxOy-Al) superconducting tunnel junctions serving as phonon emitter and phonon detector on opposite sides of a silicon microstructure. The spectrometer measures the rate of phonons that propagate ballistically through the microstructure. Here we discuss in full detail the design, fabrication steps, required characterization, electronics, and measurement techniques involved in successfully realizing phonon spectroscopy with microscale STJ phonon transducers.

Spectrometer design
The device design for each spectrometer consists of two STJ phonon transducers, one emitter and one detector, attached on opposite sides of a mesa that is monolithically etched on a silicon substrate (see figure 1a). The mesas, which are ~0.8 µm high and have widths ranging from 7 to 15 µm, allow for the isolation of a ballistic path for phonon propagation. The devices are fabricated on a 525 µm thick silicon (100) wafer and the mesa sidewalls are on the Si (111) plane. (Because the mean free path of phonons at our experimental temperature and frequencies is >>1 mm [25], the detected phonons will also include phonons that backscatter from the bottom of the substrate.) The ballistic path along the <110> direction between emitter and detector may be blocked by etching a trench into the mesa in order to determine this contribution of backscattered phonons [24]. This phonon transport measurement platform also enables the monolithic integration of nanostructures into the mesa. Microfabrication methods make the experiments very scalable: spectrometers are fabricated in lots of 100 on 100 mm Si wafers. Each 4.5 mm square chip contains up to 6 spectrometers, as shown in figure 1b.
The phonon emitter is a single Al-AlxOy-Al tunnel junction with the majority of the junction area lying on the sidewall of the mesa. The aluminum films are designed to be thin enough (<100 nm) to ensure that the decay length of the phonons is greater than the film thickness, in order to minimize phonon reabsorption (emitted phonons breaking Cooper pairs within the emitter film) [26]. As will be described, we isolate narrow bands of phonon energy by modulation of the emitter voltage. The emitter junction resistance should therefore be made low enough to maximize the amount of current (and therefore phonon signal) flowing at a given modulation amplitude, while the resistance must also be large enough to inhibit overinjection of electrons through the tunnel barrier into the Al film. Such overinjection may locally suppress the superconducting gap and thereby degrade energy resolution [27]. Residual inhomogeneities in the gap are inherent in the film and may be assessed from an I-V curve of the junction. Typically we use emitters having junction resistances from ~800 to ~5000 Ω, and we observe an inhomogeneity of ~60 to ~80 µeV (~15 to ~20 GHz), which represents the limit of our energy resolution.
The detector has a double-junction (SQUID) geometry with a 'hot electron finger' extending onto the mesa sidewall to capture the incident phonons. The actual detector junctions lie on the (100) plane of the silicon substrate. The SQUID geometry enables the suppression of Josephson current via the application of a magnetic field. With the Josephson current suppressed, we can readily distinguish an incident phonon flux as an increase in the 'subgap' tunnel current. Incident phonons of energy ≥ 2Δ break Cooper pairs in the aluminum film of the detector finger, and the excited quasiparticles diffuse to the junctions and tunnel through the oxide barrier. Some of the detectors have quasiparticle traps made from thin Au films. These traps are designed to ensure that excess quasiparticle energies do not reach the junction, and they also prevent back tunneling of quasiparticles [28]. When we changed the length of the detector fingers from 10 µm to 20 µm (see figure 1a), we found no discernible difference in the detected phonon transmission signal levels. We conclude from this that the quasiparticle diffusion length is much longer than the finger length, a conclusion that agrees with diffusion lengths reported in the literature [29]. The tunnel junction therefore faithfully measures the rate of phonon arrival at the tip of the finger several microns distant. Forming the double junctions on the flat Si (100) plane reduces their asymmetry (thereby facilitating Josephson-current suppression), simplifies fabrication, and offers great flexibility in spatial resolution, which can be adjusted merely by changing the width and position of the finger [24].

Phonon emission with STJ
Phonon emission in STJs occurs via the excitation and decay of quasiparticles (single electrons) in superconducting films. As depicted in figures 2a and 2b, when the emitter STJ is biased above the superconducting gap (2Δ) such that eV > 2Δ (where I, V, and R_n are the current through, voltage across, and normal-state tunneling resistance of the emitter junction, respectively), the Cooper pairs (paired electrons) in the first aluminum film break apart and quasiparticles tunnel through the oxide barrier into excited energy states ranging from Δ to eV − Δ (all energies referenced to the Fermi level in the aluminum film at the opposing side of the junction) [18,19]. The Al emitters in our experiments have 2Δ of ~400 µeV at a temperature of ~0.3 K. These excited quasiparticles rapidly decay towards the edge of the superconducting gap, emitting phonons as they decay. Due to the singularity in the density of states at the gap edge, this 'relaxation' process typically requires only one or two decay steps before the quasiparticle energy is reduced to Δ. This process thus emits a broad distribution of phonons of energies ranging from 0 to eV − 2Δ. The phonons are incoherent and to a first approximation will have both random polarization and random direction due to elastic scattering of the tunneled electrons within the Al film. The shape of this 'relaxation' phonon distribution includes a sharp cutoff at energy eV − 2Δ, thus allowing a small modulation of V to isolate a narrow portion of the spectrum that is sharply peaked at energy eV − 2Δ. Subsequent recombination of the quasiparticles into Cooper pairs leads to the emission of recombination phonons of energy 2Δ. The average relaxation and recombination times are on the order of ~1 ns and ~30 µs respectively [30].
When the STJ emitters are attached to one end of a microstructure, relaxation and recombination phonons (of both longitudinal and transverse polarizations) are emitted and propagate ballistically through the microstructure; however, only the relaxation phonons are controlled by modulation techniques for spectroscopic studies.
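The relation between emitter bias and the relaxation-phonon cutoff can be illustrated numerically. The following is a minimal sketch, assuming the cutoff energy is eV − 2Δ and taking the emitter gap 2Δ as ~400 µeV (our reading of the value quoted above); the function names are illustrative and are not taken from the original work.

```python
# Planck constant in eV*s (CODATA exact value).
PLANCK_EV_S = 4.135667696e-15


def energy_to_ghz(energy_uev):
    """Convert an energy in micro-eV to a frequency in GHz using f = E/h."""
    return energy_uev * 1e-6 / PLANCK_EV_S / 1e9


def peak_phonon_frequency_ghz(bias_mv, two_delta_uev):
    """Cutoff (peak) frequency of relaxation phonons, (eV - 2*Delta)/h.

    Returns 0.0 when the bias is at or below the gap, where this simple
    picture predicts no relaxation phonons.
    """
    excess_uev = bias_mv * 1e3 - two_delta_uev  # 1 mV of bias gives 1000 micro-eV
    if excess_uev <= 0:
        return 0.0
    return energy_to_ghz(excess_uev)


if __name__ == "__main__":
    # Assumed emitter gap of 400 micro-eV; the bias values are illustrative.
    for bias_mv in (0.8, 2.0, 4.0):
        print(bias_mv, "mV ->", round(peak_phonon_frequency_ghz(bias_mv, 400.0), 1), "GHz")
```

Under these assumptions, a bias of about 4 mV places the cutoff near 870 GHz, consistent with the upper end of the frequency range quoted in this paper.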

Modeling the phonon emission spectrum
When considering the spectral precision of a phonon source, a convenient figure of merit is the ratio of the phonon power near the peak of the distribution to the total power in the measurement. For instance, a thermal conductance measurement at temperature T employs a Planck distribution of phonon modes, having power spectral density ~ω³/[v²(exp(ħω/k_BT) − 1)], where ω is the angular frequency of the phonon and v is the phonon speed. This distribution is peaked at the so-called 'dominant phonon frequency' ω_dom ≈ 2.82 k_BT/ħ, but the distribution is quite broad, and therefore a slice of spectrum of width δω around the peak contains only a small fraction of the total power. For instance, if we wish to interrogate a spectral feature at 400 GHz with 20 GHz precision, a Planck distribution at T = 6.9 K peaks at ω_dom/2π ≈ 400 GHz but contains only 3% of its power within ±10 GHz of this peak.
To model the phonon emission profile of the modulated STJ phonon spectrum and to estimate this figure of merit for the distribution, we must carefully consider the non-equilibrium electron-phonon interactions within the superconducting film in the emitter STJ. These include phonon attenuation due to Cooper-pair breakage [26,30,31] and acoustic-mismatch transmission across the Al/Si boundary [32], as well as quasiparticle diffusion and reemission of absorbed phonons. The total emitted phonon power resulting from the quasiparticle relaxation process will comprise the phonons emitted in first-step relaxation, plus any emitted in second-step relaxation, minus the fraction reabsorbed by Cooper-pair breakage within the aluminum, plus the power that is reemitted following this reabsorption process.
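The thermal-source figure of merit quoted above can be checked numerically. This is a minimal sketch, assuming a power spectral density proportional to f³/(exp(hf/k_BT) − 1) and a simple midpoint-rule integration; the function names and integration limits are our own choices.

```python
import math

H = 6.62607015e-34   # Planck constant, J*s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def planck_density(freq_hz, temp_k):
    """Unnormalized Planck power spectral density, f^3 / (exp(hf/kT) - 1)."""
    x = H * freq_hz / (K_B * temp_k)
    return freq_hz ** 3 / math.expm1(x)


def integrate(func, lo, hi, steps):
    """Midpoint-rule integral of func from lo to hi."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


def band_fraction(center_hz, half_width_hz, temp_k):
    """Fraction of total Planck power within center +/- half_width."""
    upper = 40.0 * K_B * temp_k / H  # the spectrum is negligible beyond hf = 40 kT
    density = lambda f: planck_density(f, temp_k)
    total = integrate(density, 0.0, upper, 20000)
    band = integrate(density, center_hz - half_width_hz, center_hz + half_width_hz, 200)
    return band / total


def dominant_frequency_hz(temp_k):
    """Peak of the f^3 Planck distribution, at hf of about 2.82 kT."""
    return 2.8214 * K_B * temp_k / H


if __name__ == "__main__":
    print(dominant_frequency_hz(6.9) / 1e9, "GHz")
    print(band_fraction(400e9, 10e9, 6.9))
```

With T = 6.9 K, the peak falls close to 400 GHz and the ±10 GHz band holds roughly 3% of the total power, in line with the estimate in the text.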
For voltages not greatly exceeding , nearly all injected quasiparticles decay to energy , so that first-step relaxation dominates, the entire modulated spectral power falls at frequency , and if we neglect the effect of reabsorption then [19]. For higher bias voltages, a fraction of the quasiparticles will relax first to intermediate energies before undergoing secondary relaxation to the band edge energy . The precise distribution of generated phonon energies may be found by convolution integral of the quasiparticle injection rates, densities of states and decay rates [30,33]. For simplicity, we will assume that the phonon density of states in the Al follows a Debye model, and adopt an approximate model of phonon production rate, presented by This rate of phonon production per unit bandwidth is shown in figure 2c, and extends from to a sharp cutoff at . This is a good approximation for voltages [30,34]. It is evident from the shape of this distribution that a portion of the differential phonon power is produced at the peak while the remainder is produced at energies broadly distributed over the range to . The power spectral density may be found from equation (1) as . Second-step relaxation may add up to 25% additional phonon power, mostly at frequencies well below the cutoff at [30]. We must also consider reabsorption and reemission of phonon energy. Attenuation of the phonon population within the superconductor will occur as phonons of energy break Cooper pairs, creating fresh quasiparticles. The probability that a phonon will survive traveling a distance within the aluminum is , the mean absorption length of being dependent on phonon energy and band gap energy [30]. 
If we treat the phonons as point-particles traveling ballistically within the Al, then the probability of a phonon generated at a distance from the Al/Si interface and traveling at an angle to the normal, to escape into the Si before reabsorption is [30,35] ( Here is an acoustic-mismatch transmission factor for wave transmission from Al into Si. The films of some of our emitter STJs have lower and upper layer thicknesses of ~20 nm and ~79 nm respectively on the mesa sidewall (as determined by profilometry measurement and adjusted for sidewall angle). For simplicity, we treat all phonons as being generated within the lower layer at a spatially uniform rate. We assume the phonons' velocities are distributed uniformly in all directions, and that those entering the top layer may reflect from the Al/vacuum boundary, reenter the lower layer, and reach the Al/Si boundary. For phonons to emerge and travel directly across the mesa towards the detector (an angle ~35.3 degrees to the sidewall normal), we estimate the refraction angle within the Al using Snell's law, assuming average wave speeds in Al and m/s in Si, to be . From reported values of the acoustic impedances of Al and Si, we estimate to be > 0.9 for such an angle and to be frequency-independent [6,30]. Kaplan et al. have calculated values for phonon decay time in Al as a function of phonon energy and bandgap energy [26]. We multiply these by to find m, m and m. While these values are greater than some reported experimental values of in Al at energy , they are comparable to measured values of normal-state acoustic attenuation corrected to the[30-32, 35-37] superconducting state. Averaging equation (2) over our full Al layer thicknesses, we estimate that in the direction pointing out towards the detector, ~90% of phonons at will escape into the Si, ~78% at and ~68% at . We use these attenuation factors to modify the spectrum in equation (1), as shown in figure 2c.
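The escape estimate can be sketched in code. The example below assumes that the escape probability takes the form T·exp(−x/(Λ cos θ)), where x is the depth of generation, θ the angle to the interface normal, Λ the mean absorption length, and T the acoustic-mismatch transmission factor; this functional form and all symbol names are our assumptions. The absorption length used in the demonstration is purely hypothetical, and reflection from the Al/vacuum boundary is not modeled.

```python
import math


def escape_probability(depth_nm, angle_rad, absorption_length_nm, transmission=1.0):
    """Probability that a phonon generated at depth_nm reaches and crosses
    the Al/Si interface, assuming exponential attenuation along a straight
    path of length depth/cos(angle)."""
    path_nm = depth_nm / math.cos(angle_rad)
    return transmission * math.exp(-path_nm / absorption_length_nm)


def depth_averaged_escape(thickness_nm, angle_rad, absorption_length_nm,
                          transmission=1.0, steps=2000):
    """Average escape probability for phonons generated uniformly through
    a layer of the given thickness (midpoint rule over depth)."""
    step = thickness_nm / steps
    total = sum(
        escape_probability((i + 0.5) * step, angle_rad,
                           absorption_length_nm, transmission)
        for i in range(steps)
    )
    return total / steps


if __name__ == "__main__":
    # Hypothetical inputs: 20 nm layer, 0.3 rad propagation angle,
    # 300 nm absorption length. None of these values are measured data.
    print(depth_averaged_escape(20.0, 0.3, 300.0))
```

Longer absorption lengths or thinner films push the averaged escape probability towards T, which is the qualitative reason for keeping the emitter films thin.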
To find the total rate of absorbed phonons, we must average equation (2) over all depths and angles. At large values of we note that will be <<1, regardless of phonon frequency, and for angles above about 45 degrees, will be zero due to total internal reflection within the Al [30, 32]. Transmission coefficients averaged over all angles and phonon polarizations have been calculated by Kaplan, from which we estimate assuming the three phonon polarizations to be equally populated [32]. Thus at any frequency , at least 56% of all phonons produced are liable to be reabsorbed within the Al. We can approximate the additional frequency-dependency by multiplying this by the average of equation (2) over the full Al layer thickness and all angles less than the critical angle. Therefore among all phonons at all angles we estimate that ~61% are reabsorbed at , ~67% at , and ~71% at . For each bias voltage , we apply these proportions to the spectrum of equation (1) and integrate to find the total reabsorbed power.
By conservation of energy, all of this reabsorbed power must be reemitted. The quasiparticles created in the reabsorption subsequently relax and recombine to emit additional phonons of lower frequency than the ones initially absorbed. We estimate based on typical decay times and on the geometry of our STJ on the mesa sidewall that the quasiparticles do not travel far prior to reemission, so that about 80% of the power is reemitted at the same or nearby location as the original tunneling injection in the Al film on the mesa sidewall.
Taking together first-step relaxation, second-step relaxation (constituting up to ~25% of the total relaxation phonon power), attenuation, and reabsorbed/reemitted power, we find that for typical values of up to a few mV, the total modulated power emitted from the emitter STJ is roughly proportional to the modulated emitter current . The power emitted due to recombination on the other hand (see figure 2a) should remain fixed as is varied, and for large we take this to be a negligibly small fraction of the total power. Therefore the total emitted differential phonon rate is . To find at a given peak frequency , we take from equation (1), for a given peak width (e.g. ), attenuate this quantity according to equation (2) as described above, and divide by the total power found as described above at . The result of this calculation for our typical emitter film thicknesses appears in figure 2d. For a peak width GHz, at a peak frequency of , is ~50%. This diminishes to ~32% at peak , and further at higher peak frequencies. As shown in figure 2d, the values of from the STJ-emitted phonon spectrum compare very favorably to a Planck distribution, exceeding it by more than an order of magnitude for . This analysis demonstrates that aluminum STJs made of films a few tens of nm thick will emit narrow spectral distributions of acoustic phonons into Si at frequencies up to several hundred GHz.
Phonon emission from aluminum STJs has been reported elsewhere at frequencies up to ~2 THz, but is likely to be very small at such a peak frequency even if the films are made very thin [38]. The wavelength in Al at 700 GHz is ~6 nm while the granularity in the Al film and the roughness at the Al/Si interface are most likely a few nm; hence, for above ~700 GHz, we expect to see the spectrum further modified by the effects of elastic scattering of phonons within the Al film[35], inelastic phonon scattering at the Al/Si boundary [39] and modification of phonon spectra due to excess injected quasiparticle population in the Al film [27]. All such effects are liable to become more severe as and are increased.

Phonon detection with STJ
The phonons incident on the detector are registered as an increase in the tunnel current through the detector junctions. The STJ detector is voltage-biased below its superconducting gap (figure 2b). Phonons incident on the detector finger with energy greater than or equal to 2Δ will break Cooper pairs in the detector films, and the resulting quasiparticles will diffuse until a portion reaches the detector junctions and tunnels through. The STJ detectors are made from aluminum films with superconducting gap 2Δ of ~360 µeV (corresponding to ~90 GHz), and in essence these detectors act as high-pass filters of acoustic phonons with cut-off frequency ~90 GHz. A lock-in detector selects only the modulated portion of the detector current, corresponding to the modulated emitter phonons that strike the detector. The detected phonon spectrum therefore comprises phonons of frequencies between ~90 GHz and the emitter cutoff frequency, with a sharp peak at that cutoff frequency. Because the modulated emitter phonon power is proportional to the modulated emitter current, the measured differential transfer function tells us the fraction of this spectrum that is transmitted from the emitter through the sample to the detector.

Modeling the detector behavior
We may use quasiparticle-phonon interactions to model and quantify the phonon detector behavior.
For a differential rate of phonons of frequency striking the detector finger, we expect the average differential rate of phonon-induced quasiparticle generation to be for for (3) for In equations (3), is the acoustic transmission factor for phonons transiting from Si into Al, which we estimate from acoustic impedances to be >0.9 over all incidence angles [32]. The fraction of phonons absorbed in the finger will be approximately . In our detector fingers, the thickness in the direction of phonon incidence is 140 to 205 nm, thus we expect to equal at least 0.2 for , and at least 0.8 for . In our devices, the diminishing fraction as peak frequency is increased (figure 2d) motivates us to treat as independent of peak frequency and having value ~ 0.25. In the signal of a typical spectrometer transmitting through bulk Si, we see a modulated signal that is consistent with this assumption and with the detector response behavior of equations (3).
To find the quasiparticle generation rate, and thereby the phonon arrival rate, from the measured differential detector tunnel current, we must account for quasiparticle loss processes in the detector. The primary loss process comprises diffusion of the quasiparticles into the attached wiring leads, followed by recombination into Cooper pairs [30,40,41]. Using conventional theories of tunneling rate and quasiparticle recombination, we may express a nondimensional efficiency factor for each detector (equation (4); see Appendix B) [30,39,41]. This factor depends on the normal-state tunneling resistance of the junction, the normal density of states at the Fermi level in Al [30], the average total width and thickness of the wiring trace connected to the detector STJ, the diffusion constant for quasiparticles in Al (20 cm²/s), and the average quasiparticle recombination time in Al at a temperature of 0.3 K [29,30,40,42,43]. In our detectors this efficiency factor is typically ~0.1.

Fabrication techniques and challenges
Figure 3a illustrates the step-by-step fabrication of the mesas and transducers. The mesas are formed by a shallow anisotropic etch of silicon using KOH (50% KOH, 48 °C, 4 min) with a low-stress silicon nitride etch mask. We found that standard RCA cleaning of the wafers prior to etching is crucial to obtaining smooth surfaces. The smoothness of the (100) and (111) planes is necessary to enable deposition of continuous Al films, and to minimize phonon scattering from rough surfaces. Simultaneous magnetic stirring and ultrasonication during the KOH etch help to improve the smoothness of the etched surfaces. The trenches are simultaneously formed, where needed, on the mesas. Neither hydrochloric acid nor surfactants were added.
We fabricate the emitter tunnel junctions on the sidewall of the mesa using double-angle evaporation as shown in figure 3b. A bilayer of S1818 photoresist (Rohm and Haas Inc.) and LOR liftoff resist (Microchem Inc.) is spun onto the fabricated mesas, and the emitter geometry, wiring trace, and bond pads are photolithographically patterned into the resist. The depth of field of our photolithography tool (±2.42 µm) limits the range of mesa heights and resist thicknesses that can be used to form the junctions. The patterned resist is developed in AZ MIF 300 (AZ Electronic Materials) for ~60 seconds until a "Dolan photoresist bridge" is formed with sufficient undercut (see figures 3a and 3b) [44]. The surfaces must be cleaned with argon or oxygen plasma prior to evaporation to prevent poor aluminum film adhesion and aging of the tunnel junctions [45,46].
As shown in figure 3a, arrays of detector and emitter STJs are patterned and deposited. The film thicknesses in the emitter are made only a few tens of nm (lower layer ~20 nm thick and upper layer ~38 to 80 nm thick on the mesa sidewall) to enable phonons to escape into the Si without reabsorption, whereas the detector films are made several hundred nm thick to maximize the absorption of incident phonons. We utilize a two-step electron-beam angle evaporation interspersed with a static oxidation procedure to form the aluminum tunnel junctions. The overlap area of the tunnel junctions depends on the angles at which the evaporation is done (figure 3b): the overlap width is set by the height of the bridge (the thickness of the LOR layer), the width of the bridge, and the two deposition angles measured from the normal to the substrate surface. The wiring traces should be made wide enough that the double-angle evaporation forms a single overlapping metal trace (figure 3c). We found that the best quality films were obtained at evaporation rates of ~4.5 to 5 Å/s. The films were evaporated at base pressures ranging from 2 × 10⁻⁷ Torr to 1.2 × 10⁻⁶ Torr. The base pressures were sometimes lowered further by an initial evaporation of 50-100 nm of Al in the chamber, since the evaporated aluminum acts as a getter for particles in the chamber. The tunnel barrier is formed by static oxidation in between the two Al deposition steps, with an exposure parameter defined as in [47]. The emitter tunnel barrier was grown in 3 Torr of oxygen for 60 minutes, resulting in emitter resistances of ~1.5 kΩ. The detector tunnel barrier was grown at 300 mTorr for ~70 minutes, resulting in resistances of ~200 Ω. Figure 4a shows the exposure parameter plotted against the area-specific resistances of the tunnel junctions.
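The geometry and oxidation bookkeeping for the double-angle process can be sketched as follows. Two assumptions are made here that are not confirmed by the text: the overlap width follows the common Dolan-bridge estimate h(tan θ1 + tan θ2) − b for bridge height h and bridge width b, and the oxygen exposure is taken as the product of pressure and time. The function names and the example dimensions are illustrative only.

```python
import math


def overlap_width(bridge_height, bridge_width, angle1_deg, angle2_deg):
    """Assumed Dolan-bridge overlap estimate, h*(tan(a1) + tan(a2)) - b.

    Angles are measured from the substrate normal. The result is in the
    same length unit as the inputs, and is 0.0 if the films do not overlap.
    """
    shift = bridge_height * (
        math.tan(math.radians(angle1_deg)) + math.tan(math.radians(angle2_deg))
    )
    return max(0.0, shift - bridge_width)


def oxygen_exposure(pressure_torr, minutes):
    """Assumed exposure parameter, pressure times time, in Torr*min."""
    return pressure_torr * minutes


if __name__ == "__main__":
    # Hypothetical bridge: 1.0 um high and 1.0 um wide, deposited at +/-45 degrees.
    print(overlap_width(1.0, 1.0, 45.0, 45.0))
    # Oxidation conditions quoted in the text for the emitter and detector.
    print(oxygen_exposure(3.0, 60.0), oxygen_exposure(0.3, 70.0))
```

Under the pressure-times-time assumption, the emitter and detector recipes quoted above correspond to about 180 and 21 Torr·min respectively.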
The plot in figure 4a can be used as a guide to estimate the exposure parameters that will produce emitters or detectors with desired junction resistances. The data were fitted to a power law with an adjusted R² value of 0.997. The area of each junction was calculated from scanning electron micrograph (SEM) inspection and can be estimated prior to fabrication based on the overlap area calculations discussed above. The post-evaporation processing includes metal lift-off, dicing of the wafer into 4.5 mm square chips, and the evaporation of ~500 nm thick silver on the backside. Silver has been shown to be a good absorber of phonons [25]; hence, the addition of silver reduces the backscattered signals from the bottom of the chip. The junctions are very sensitive to static discharge, and therefore proper grounding is essential at all times.
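A power-law calibration of this kind can be fitted and inverted with a few lines of code. The sketch below performs an ordinary least-squares fit in log-log space and then solves for the exposure that gives a target area-specific resistance. The data points are synthetic, generated from an arbitrary power law purely to exercise the code; they are not the measured values of figure 4a, and the function names are our own.

```python
import math


def fit_power_law(exposures, resistances):
    """Least-squares fit of R = a * E**b in log-log space; returns (a, b)."""
    xs = [math.log(e) for e in exposures]
    ys = [math.log(r) for r in resistances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


def exposure_for_target(prefactor, exponent, target_resistance):
    """Invert R = a * E**b to find the exposure giving target_resistance."""
    return (target_resistance / prefactor) ** (1.0 / exponent)


if __name__ == "__main__":
    # Synthetic calibration points from R = 5 * E**0.6 (arbitrary units).
    exposures = [10.0, 30.0, 100.0, 300.0]
    resistances = [5.0 * e ** 0.6 for e in exposures]
    a, b = fit_power_law(exposures, resistances)
    print(a, b)
    print(exposure_for_target(a, b, 5.0 * 50.0 ** 0.6))
```

In practice the measured exposure and area-specific resistance pairs would replace the synthetic lists, and the fitted exponent would be whatever the data support.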
The base pressure at which the Al films are deposited is important, as it may affect their room-temperature resistivity, which in turn affects their critical temperature and superconducting gap. Such variations in the superconducting gaps of aluminum films with respect to their room-temperature resistivity have been reported to be due to oxygen doping [48,49]. In figure 4b, we show the dependence of the room-temperature resistivity of thin aluminum films of identical dimensions on increased oxygen partial pressure during evaporation. The dimensions of the films were patterned by photolithography and the first film was evaporated at a base pressure of 0.44 µTorr. By raising the base pressure through a continuous flow of oxygen into the chamber, we show that the resistivity of the films varies with base pressure. The typical transition temperature, T_c, for films evaporated at base pressures of 27 µTorr and 35 µTorr was measured to be 1.75 K and 1.89 K respectively (increased compared to a pure Al film with T_c ~1.12 K).

Low temperature apparatus
The apparatus for the low temperature phonon transport experiments includes a He-3-cryostat with a custom designed sample stage immersed in a liquid helium Dewar. The fridge wiring consists of twisted pair lines with room temperature pi-filters (Tusonix 4701 EMI) enclosed in brass block Faraday cage, allowing up to ~90 dB attenuation at frequencies > 100 MHz. The cold stage filters are 'tapeworm' type low-pass filters [28], but the extent of cold stage filtering is limited by the space in our vacuum can. The fridge is cooled down to a base temperature of 0.3 K and the sample is held in vacuum. The thermometer at the He-3 stage is a Cernox™ RTD (Lakeshore Cryotronics) and a silicon diode thermometer (DT-470-SD-12A, Lakeshore Cryotronics) monitors temperature at the 1 K-pot. Attempts are made to minimize the coupling of noise from the thermometry wiring into the measurement wiring. Metal film resistors are used in all bias networks, as this type of resistor is known to exhibit superior temperature stability and reduced noise. As shown in figure 5a, the chips containing the spectrometers are wire-bonded onto the gold plated copper sample stage. The backside of the chips must be properly anchored to the sample stage by thermalizing with Apiezon N grease or silver paint. A 5000 turn superconducting magnetic coil is attached to the top of the sample box for Josephson current suppression as shown in figure 5b, c, and d. Once the fridge is immersed in the helium Dewar, we ensure proper grounding of the fridge and equipment rack. We place rubber pads underneath the wheels of the Dewar to reduce mechanical vibration.

DC characterization of STJ emitters and detectors
The DC characteristics of emitter and detector tunnel junctions are determined from current-biased current-voltage (I-V) measurements at ~0.3 K. Figure 6a shows how we estimate the superconducting gap from the I-V behavior. From the I-V curves we also calculate the normal-state resistance of the junctions. In figure 6b, we show the current-biased I-V curves in the subgap regime for four SQUID detectors, with the current normalized by their normal-state resistances for comparison. The red plot shows significant rounding-off, which is due to poor filtering on that particular signal line, allowing stray voltage noise to add a random perturbation to the junction voltage. In figure 6c, we show the resistance-normalized I-V curves for four emitters with normal-state resistance values of 212 Ω, 935 Ω, 2250 Ω, and 5559 Ω. This plot illustrates several possible problems in emitter performance. In the 212 Ω emitter (red plot), we observe 'back bending' of the gap rise step at the gap voltage. This is a signature of quasiparticle overinjection, which appears consistently in emitter STJs with resistances < 700 Ω, leading to local suppression of the superconducting gap and poor phonon energy resolution. In the 2250 Ω junction (magenta plot), the I-V curve shows a signature of being partially shorted (this could occur either at formation or during processing), which will add an uncontrolled thermal phonon population to the junction's emission. The black and blue curves indicate a limitation on emitter energy resolution. For an ideal STJ, the 'gap rise' step at the gap voltage should be infinitely sharp, but in practice we observe a breadth of ~60-80 µV (~15 to 20 GHz). This behavior most likely indicates that the superconductor's gap varies within the junction by ~60-80 µeV (corresponding to a ~15 to 20 GHz imprecision in emitted phonon frequency).

Josephson current suppression
Josephson current (or supercurrent) in the detector must be suppressed, so that the detector may be voltage-biased and its quasiparticle tunneling current clearly distinguished. To do so, we apply a magnetic field perpendicular to the SQUID loop, using a small superconducting coil mounted as close as possible to the top of the chip to minimize vibration-coupled flux noise. For our coil geometry (see figure 5c), we calculate (using the Biot-Savart law) the axial magnetic field to be 1.27 Gauss/mA. The heat load resulting from a typical coil current is ≤2 μW. The maximum supercurrent in the SQUID detector, assuming perfect symmetry, is given as $I_c(\Phi) = I_c(0)\,|\cos(\pi\Phi/\Phi_0)|$, where $\Phi_0$, $\Phi$ and $I_c(0)$ are the flux quantum (2.07 × 10⁻¹⁵ Wb), the applied flux, and the critical current at zero magnetic field, respectively [50]. By applying a magnetic flux of $n\Phi_0/2$, where $n$ is an odd integer, the supercurrent should be fully suppressed. We typically employ the minimum effective flux (equivalent to $n = 1$) in order to minimize flux trapping. In practice, we find that the supercurrent is not always fully suppressed, probably due to asymmetry between the two junctions. Figure 7a illustrates our technique for determining the detector bias point for phonon transport studies. The detector voltage is swept in the subgap regime from ~-300 to ~300 μV. At each voltage step, the coil current is swept from 0 to 2 mA and the tunnel current is measured at each coil-current step. The 3D plot in figure 7a shows the measured current as a function of detector bias voltage and coil current. We set the voltage bias point of the detector to $\Delta/e$ (~180 μV) and the coil current to ~1 mA, where the minimum critical current is obtained. The measured zero-voltage, zero-field supercurrent for the detector ($R_N$ = 116 Ω) in figure 7a is ~1.2 μA (z-axis) and is closely predicted by the Ambegaokar-Baratoff expression for $T \approx 0$ K, $I_c R_N = \pi\Delta/2e$ [50]. By applying a magnetic field of ~1 Gauss at the bias point, the supercurrent is suppressed to ~1 nA.
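As a rough illustration of the suppression condition, the sketch below evaluates the ideal symmetric-SQUID modulation and the coil current needed to thread half a flux quantum through the loop. The loop area of 10 μm² is an assumption taken from the range of designs described in this section (the area of the specific detector in figure 7a is not stated), and flux focusing by the superconducting films is ignored; the function names are hypothetical.

```python
import math

FLUX_QUANTUM_WB = 2.067833848e-15  # h / (2e)


def squid_critical_current(ic_zero_field_a: float, flux_wb: float) -> float:
    """Ideal symmetric SQUID: I_c(Phi) = I_c(0) * |cos(pi * Phi / Phi_0)|."""
    return ic_zero_field_a * abs(math.cos(math.pi * flux_wb / FLUX_QUANTUM_WB))


def coil_current_for_half_flux_quantum(loop_area_m2: float,
                                       coil_constant_t_per_a: float) -> float:
    """Coil current (A) that threads Phi_0 / 2 through a loop of the given area."""
    field_needed_t = 0.5 * FLUX_QUANTUM_WB / loop_area_m2
    return field_needed_t / coil_constant_t_per_a


if __name__ == "__main__":
    coil_constant = 1.27e-4 / 1e-3   # 1.27 Gauss/mA expressed in T/A
    assumed_area = 10e-12            # ASSUMPTION: 10 um^2 loop, not stated for this device
    i_coil = coil_current_for_half_flux_quantum(assumed_area, coil_constant)
    print(f"Coil current for Phi_0/2: {i_coil * 1e3:.2f} mA")
    print(f"Field at that current:    {i_coil * coil_constant * 1e4:.2f} Gauss")
    residual = squid_critical_current(1.2e-6, 0.5 * FLUX_QUANTUM_WB)
    print(f"Ideal residual supercurrent: {residual:.1e} A")
```

Under these assumptions the required field is about 1 Gauss at a little under 1 mA, of the same order as the operating point quoted above. The ideal model predicts complete suppression, so the ~1 nA residual observed experimentally reflects non-idealities such as junction asymmetry.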
The extent to which the supercurrent in the SQUID detectors may be suppressed depends on two geometric properties: self-induced flux and junction symmetry. The self-induced flux is proportional to the self-inductance, $L$, of the SQUID loop, which we estimate based on the inductance of a rectangular loop [51]. The more nearly identical the two junctions are, the more closely the currents flowing through them may be made to cancel. In figure 7b, we plot the ratio of the minimum obtainable critical current to the maximum zero-voltage critical current versus the ratio of self-induced flux to the flux quantum. Each symbol in figure 7b represents a unique SQUID design, distinguished by the location of the junction and the loop area. Junctions formed on a flat surface are represented by solid symbols, while open symbols represent junctions formed on the sidewall. Loop areas vary from ≤ 2 μm² (squares), to ~10 μm² (circles), to ~120 μm² (triangles), and to ~180 μm² (diamond). Smaller loop areas and larger junction resistances lead to smaller values of this flux ratio and in general to better supercurrent suppression; however, for the SQUID detectors formed on the sidewall, we observe a large variation in suppression for devices with similar flux ratios. This is likely due to junction asymmetry. For devices formed on the flat (100) surface, supercurrent suppression is more consistent and exceeds ~3 orders of magnitude at the smallest flux ratios, indicating more symmetric junction formation. We also note a tradeoff in detector design: while the Josephson critical current scales inversely with the normal-state tunnel resistance $R_N$, the detector efficiency (equation (4)) also scales inversely with $R_N$. In practice we find that a loop area of ~2 μm² and a detector resistance of ~200 to 300 Ω enable both suppression of the supercurrent to levels smaller than the thermal quasiparticle tunneling current and detector efficiencies of ~0.1 that permit readily measurable spectrometer signals.
With the supercurrent suppressed, we measured the subgap tunnel current due to thermally excited quasiparticles at detector voltage $\Delta/e$ and at different temperatures (~0.3 to 0.4 K), as shown in figure 7c (red plot). We compare the results to the BCS approximation of the subgap tunnel current for an S-I-S junction (blue plot) [52]. The measurement shows an exponential dependence of subgap current on temperature, as predicted by BCS theory. The deviation between the data and the prediction may be due to our inability to fully suppress the supercurrent and to possible inaccuracies of our cold stage thermometer at temperatures below ~0.34 K.

Modulated phonon transport measurements
The schematic of our phonon transport experiments is shown in figure 8a. For phonon emission, the emitter is current-biased by applying a DC voltage through a bias resistor of ~500 kΩ. We denote by $V_e$ the voltage across the emitter junction and by $R_N$ the normal-state resistance of the emitter junction. All the device wiring comprises filtered twisted-pair lines, and shielded coaxial cables are used for all connections. The DC current through the emitter junction is stepped from ~0.35 to 2 μA, which corresponds to emitter voltages of ~0.35 to 2 mV for a junction resistance of $R_N$ = 1 kΩ. In addition to the DC current, an AC modulation current of ~20 nA RMS is applied by adding an AC modulation to the DC level through a unity-gain isolation amplifier (Burr Brown ISO124P) and a 100:1 voltage divider; the output is independent of frequency between 4 and 1000 Hz and exhibits noise of ~10⁻⁶ V/√Hz. The typical modulation frequencies for our measurements range between 7 and 11 Hz.
For phonon detection, the detector is voltage-biased in the subgap regime with the Josephson current suppressed. The detector signal comprises a steady-state component plus a modulated component, as indicated in figure 8a. The steady-state DC detector current, $I_{DC}$, is ~1 to 2.5 nA for emitter voltages of 0.35 to 5 mV, as shown in figure 8b. For DC detector tunnel currents up to 1.5 times the unperturbed (thermal) level, we treat the quasiparticle recombination time $\tau_r$ as constant, and therefore equation (4) as valid and the detector efficiency as fixed [41,53]. We checked this assumption by raising the device temperature until the thermal detector current rose by a factor of 3, and observed only a very small change in the differential transfer function. Thus for $I_{DC}$ up to 1.5 times its thermal level, we can safely assume that the detector response remains linear with incident phonon flux. (We note that for $I_{DC}$ above 1.5 times its thermal level, the detector response may be nonlinear with the incident phonon flux.) In our devices $\tau_r$ may be limited by magnetic flux trapped in the Al detector film as well as by the quasiparticle population [41].
The modulated AC detector current (also called the differential response or differential transfer function) of our detector (figure 8c), which represents the modulated portion of the incident phonons, is isolated via a low-noise current preamplifier (DL 1211) and a lock-in amplifier (SRS 830) over a range from 0 to ~1 pA RMS. As shown in figure 8c, the emitter tunnel junction turns on at emitter voltages above $2\Delta/e$. The subsequent step in detector response occurs because, above the corresponding emitter voltage, the emitted relaxation phonons (peak energy $eV_e - 2\Delta$) are energetic enough to break Cooper pairs in the detector (gap energy $2\Delta$, i.e. ~90 GHz). At still higher emitter voltage, we observe a further change in detected signal level, as the emitted relaxation phonons acquire enough energy to break multiple Cooper pairs in the detector (see also equation (3)). We have also considered the effect of microwave Josephson radiation on the detector signal [29]. In one spectrometer, we biased the emitter at zero voltage and modulated the Josephson branch of the emitter I-V curve. We observed zero detector response. We conclude that our measurement is not influenced by Josephson radiation or by inductive coupling of the emitter Josephson current into the detector.
The peak frequency of the emitted relaxation phonon distribution is related to the emitter bias voltage as $h f_{peak} = eV_e - 2\Delta$. The feature in figure 8c at 4 mV is believed to be due to backscattering by oxygen impurities in the silicon. This peak was observed at ~870 GHz in past studies of STJ phonon spectroscopy [54], [55]. While this behavior confirms that our aluminum STJ-based spectrometer emits a strong and tunable signal well above 800 GHz, we note that at such high frequencies (figure 2d) we estimate only ~20% of the total phonon power to be at the peak frequency. In figure 8d, we present voltage-biased I-V curves of a detector recorded while varying the emitter voltage from 0 to ~5 mV. (We note that in this detector we were unable to suppress the Josephson current below ~5 nA.) For emitter voltage $V_e$ = 0 V, the subgap current at detector voltage $\Delta/e$ (180 μV) is exactly the same as that shown in figure 7c at a temperature of ~313 mK. As a larger and larger phonon flux is transmitted to the detector, the total quasiparticle density in the detector increases well beyond the thermal level, and the detector current rises. In figure 8e, we calculate the differential conductance from the subgap I-V measurements of figure 8d. The conductance of the detector remains essentially the same as the emitter voltage is varied. At the typical bias point of $\Delta/e$, the conductance remains fixed at ~5×10⁻⁶ Ω⁻¹. The only difference is in the total current level.
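The relation between emitter bias and peak phonon frequency can be checked numerically. The sketch below assumes that relaxation phonons peak at an energy of $eV_e - 2\Delta$ and takes the gap energy $2\Delta$ to correspond to the ~90 GHz quoted for these Al junctions; the function name and default argument are illustrative only.

```python
# Exact SI values of the Planck constant and the elementary charge.
PLANCK_J_S = 6.62607015e-34
ELECTRON_CHARGE_C = 1.602176634e-19


def peak_phonon_frequency_ghz(emitter_voltage_mv: float,
                              gap_frequency_ghz: float = 90.0) -> float:
    """Peak relaxation-phonon frequency, assuming h*f = e*V_e - 2*Delta.

    gap_frequency_ghz is 2*Delta/h; the default of 90 GHz is the approximate
    value quoted in the text and is an assumption of this sketch.
    Returns 0.0 below the emission threshold V_e = 2*Delta/e.
    """
    bias_frequency_ghz = ELECTRON_CHARGE_C * emitter_voltage_mv * 1e-3 / PLANCK_J_S / 1e9
    return max(bias_frequency_ghz - gap_frequency_ghz, 0.0)


if __name__ == "__main__":
    for v_mv in (0.35, 1.0, 2.0, 4.0):
        print(f"V_e = {v_mv:.2f} mV -> f_peak ~ {peak_phonon_frequency_ghz(v_mv):.0f} GHz")
```

With these inputs, an emitter bias of 4 mV maps to a peak near 877 GHz, close to the ~870 GHz feature attributed to oxygen impurities.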
These measurements motivate a simplified equivalent circuit model for our STJ phonon detector, shown in figure 8f. The phonon detector is modeled as a current source in parallel with a resistance $R_d$. The DC current $I_{DC}$ and modulated current $\delta I$ follow the incident flux of phonons. The detector is in series with the current amplifier (input impedance $R_{amp}$, carrying current $I_{amp}$) and the line resistance $R_{line}$. The bias point on the detector is maintained by an isolated voltage source (Stanford Research SIM928, output through a 10⁵:1 voltage divider) across the entire network. Typical values for $R_{line}$ and $R_{amp}$ are ~70 Ω and 2 kΩ respectively ($R_{amp}$ is the manufacturer's specification). This model, together with the measurements of figures 8d and 8e, makes clear that the STJ maintains a steady bias throughout our measurement range: even if $I_{DC}$ rises by 1 nA, the bias across the STJ will change by only a few μV. Similarly, the current through the amplifier, $I_{amp}$, accurately registers the modulated current through the detector. The modulated amplifier current equals $\delta I \, R_d/(R_d + R_{line} + R_{amp})$, which is only ~1% different from $\delta I$ for typical values of $R_d$, $R_{line}$ and $R_{amp}$.
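The two claims of the equivalent-circuit model (a ~1% current-division error and a bias shift of a few μV) can be reproduced with a simple current-divider calculation. The sketch below takes the detector resistance as the inverse of the ~5×10⁻⁶ Ω⁻¹ subgap conductance quoted earlier, which is an assumption about which resistance enters the model; the variable and function names are hypothetical.

```python
def amplifier_current_fraction(r_detector_ohm: float,
                               r_line_ohm: float,
                               r_amp_ohm: float) -> float:
    """Fraction of the detector's source current that flows through the amplifier.

    The source current divides between the parallel detector resistance and the
    series combination of line resistance and amplifier input impedance.
    """
    return r_detector_ohm / (r_detector_ohm + r_line_ohm + r_amp_ohm)


def detector_bias_shift_v(delta_i_source_a: float,
                          r_detector_ohm: float,
                          r_line_ohm: float,
                          r_amp_ohm: float) -> float:
    """Change in voltage across the STJ when the source current changes."""
    i_series = delta_i_source_a * amplifier_current_fraction(
        r_detector_ohm, r_line_ohm, r_amp_ohm)
    return i_series * (r_line_ohm + r_amp_ohm)


if __name__ == "__main__":
    r_det = 1.0 / 5e-6          # ASSUMPTION: inverse of the measured subgap conductance
    r_line, r_amp = 70.0, 2e3   # typical values quoted in the text
    frac = amplifier_current_fraction(r_det, r_line, r_amp)
    shift = detector_bias_shift_v(1e-9, r_det, r_line, r_amp)
    print(f"Amplifier registers {frac * 100:.1f}% of the modulated current")
    print(f"Bias shift for a 1 nA rise: {shift * 1e6:.2f} uV")
```

Both results agree with the figures stated in the text: roughly a 1% shortfall in registered current and about 2 μV of bias shift per nA.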

Energy resolution and sensitivity
The energy resolution of our measurement is limited by noise, by the gap inhomogeneity of the emitter STJ, and by the modulation amplitude. Voltage noise across the emitter STJ adds random fluctuations to the bias voltage $V_e$, while inhomogeneity in the emitter gap likewise reduces the precision of phonon energies. In practice, we assess these effects based on the width of the gap rise in the emitter I-V curve (figure 6c), typically ~60-80 μeV. The modulation current applied to the emitter may also reduce energy resolution by adding a voltage oscillation to the emitter voltage $V_e$. For a typical emitter junction resistance of ~800 Ω and a modulation current of ~20 nA RMS, this modulation envelope is only ~40 μeV, and therefore the gap inhomogeneity imposes the limit on energy resolution: ~60-80 μeV, corresponding to a frequency resolution of ~15-20 GHz.
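The size of the modulation envelope can be estimated from Ohm's law. The sketch below assumes that the junction's differential resistance at the bias point equals the quoted junction resistance, which is only a rough approximation near the gap edge, and it reports RMS, peak, and peak-to-peak values because the text does not specify which measure defines the envelope. The function name is hypothetical.

```python
import math


def modulation_voltage_uv(i_mod_rms_na: float, r_junction_ohm: float):
    """Return (rms, peak, peak_to_peak) modulation voltage in microvolts.

    ASSUMPTION: the junction responds to the small AC current with a
    differential resistance equal to r_junction_ohm.
    """
    v_rms = i_mod_rms_na * 1e-9 * r_junction_ohm * 1e6
    v_peak = math.sqrt(2.0) * v_rms
    return v_rms, v_peak, 2.0 * v_peak


if __name__ == "__main__":
    rms, peak, p2p = modulation_voltage_uv(20.0, 800.0)
    print(f"RMS: {rms:.1f} uV, peak: {peak:.1f} uV, peak-to-peak: {p2p:.1f} uV")
```

Under this assumption the peak-to-peak excursion is about 45 μV, comparable to the ~40 μeV envelope quoted above and below the 60-80 μeV gap-rise width.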
The sensitivity of the measurement is limited by detector noise, which may comprise electrical pickup noise, vibrational pickup in wiring, and amplifier noise, as well as fundamental contributions such as Johnson noise in the wiring and shot noise in the tunnel junction. Figure 9 shows a typical noise spectrum of the detector, exhibiting peaks at 60 Hz and its multiples due to power-line noise pickup, as well as an unexplained resonance at ~600 Hz. Wiring and apparatus to minimize noise are discussed in the section on instrumentation. Based on detector noise spectra such as figure 9, we typically choose modulation frequencies between 3 and 12 Hz, adding line-frequency notch filters and low-pass filters at the input of the preamplifier and lock-in amplifier to avoid amplifier overload. The lowest noise level obtained at a modulation frequency of 11 Hz was ~60 fA/√Hz. We note that a tunnel junction passing a DC current of 1 nA should exhibit a shot noise of ~18 fA/√Hz (assuming a Fano factor of 1), so our experimental noise is not far above the shot noise level. To reduce uncertainty in a spectral measurement, we typically repeat it 25 times and average the results. Considering the typical detector efficiencies of ~0.1 (equation (4)) as well as the acoustic transmission and absorption factors (see equation (3)), we estimate the noise equivalent power (NEP) for phonon detection to be ~10⁻¹⁵ W/√Hz, or ~2 × 10⁷ phonons of energy $2\Delta$ per second per √Hz. A comparative analysis of similar low-temperature thermal detectors found similar sensitivities [56].
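The shot-noise figure and the phonon-rate form of the NEP can be reproduced from first principles. The sketch below uses the standard expression $i_n = \sqrt{2 e I F}$ for shot noise and converts the quoted NEP to a phonon rate by assuming each phonon carries the gap energy, taken here as $h \times 90$ GHz; that energy, like the function names, is an assumption of this sketch rather than a value stated with the NEP.

```python
import math

# Exact SI values of the Planck constant and the elementary charge.
PLANCK_J_S = 6.62607015e-34
ELECTRON_CHARGE_C = 1.602176634e-19


def shot_noise_a_per_rthz(i_dc_a: float, fano_factor: float = 1.0) -> float:
    """Shot-noise current spectral density sqrt(2 e I F), in A/sqrt(Hz)."""
    return math.sqrt(2.0 * ELECTRON_CHARGE_C * i_dc_a * fano_factor)


def nep_as_phonon_rate(nep_w_per_rthz: float, phonon_frequency_hz: float) -> float:
    """Express an NEP as phonons per second per sqrt(Hz) at one phonon energy."""
    return nep_w_per_rthz / (PLANCK_J_S * phonon_frequency_hz)


if __name__ == "__main__":
    shot = shot_noise_a_per_rthz(1e-9)
    print(f"Shot noise at 1 nA: {shot * 1e15:.1f} fA/sqrt(Hz)")
    print(f"Measured / shot:    {60e-15 / shot:.1f}")
    # ASSUMPTION: phonon energy equals the ~90 GHz gap energy quoted in the text.
    rate = nep_as_phonon_rate(1e-15, 90e9)
    print(f"NEP of 1e-15 W/sqrt(Hz) ~ {rate:.1e} phonons/s/sqrt(Hz)")
```

The measured ~60 fA/√Hz floor is therefore a little over three times the shot-noise level, and the phonon-rate equivalent of the NEP comes out near 2 × 10⁷ per second per √Hz.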

Ballistic phonon propagation
The ballistic nature of phonon transport is evidenced by comparing the differential detector response of spectrometers with varying mesa widths, varying detector finger widths, a blocked ballistic path, and an offset line-of-sight between emitters and detectors (figure 10a-d). To enable measurements made with different detectors to be compared on an equal footing, we divide each measured value of the differential response by the efficiency of that detector to obtain the phonon transmission signal. Equations (3) and (4) give the expected value of the resulting scaled signal in each phonon-energy range. Since the acoustic transmission and absorption factors are expected to be roughly the same from one detector to another, we do not rescale the data for these factors. We note that the quasiparticle diffusion length is of order 100 μm, so that phonons reflected from the bottom of the Si chip and striking the wiring leads far from the junction or the mesa may also contribute to a measured 'background' signal level, which is subject to the same efficiency as the signal resulting from phonons striking the detector finger [39]. The rate of ballistic phonons striking the detector finger, as measured by the differential detector response, is proportional to the product of the following factors: the fraction of the emitter STJ visible from the detector, a Lambert-law phonon emission distribution, the phonon focusing factor, the acoustic transmission factors described previously, and the solid angle subtended by the detector with respect to the emitter STJ [30], [57], [58], [59], [60]. Figure 10a shows the phonon transmission signal between emitter and detector formed on mesas of different widths (7 μm (blue) and 10 μm (red)) with 6 μm detector finger widths. As the mesa width increases from 7 to 10 μm, the solid angle subtended by the detector with respect to the emitter decreases; hence, the differential detector signal decreases as expected. We further verified the ballistic phonon transmission by varying the width of the detector fingers.
For a 10 μm mesa, we show the phonon transmission signal for a 6 μm wide (red plot) and a 3 μm wide (magenta plot) detector finger (figure 10b). The wider finger subtends a larger solid angle; hence, the detector signal is larger for the 6 μm wide detector finger, as expected. In figure 10c, we blocked the ballistic path between the emitter and detector by etching a trench into the mesa. The mesa width and detector finger width are 10 μm and 6 μm respectively for both the bulk (open circle) and trench (hatched circle) devices. The latter measurement reveals that a significant portion of the transmitted phonon signal is due to backscattering from the bottom of the chip (the 'background signal'). The difference between the trench transmission and the transmission through the mesa represents the dynamic range of our measurements. In figure 10d, we compare the phonon transmission signals for emitters and detectors that have a straight line-of-sight along the mesa width (along the <110> crystal direction, solid green plot) with those for emitters and detectors whose line-of-sight is offset by ~50° (near the <100> crystal direction, open green plot). A slightly higher detector signal level is observed for the offset geometry. For this geometry, the ballistic signal is affected by phonon focusing, the attenuation or enhancement of phonon propagation along preferred directions in an anisotropic crystal such as silicon [59]. In silicon crystals, the phonon focusing factor is ~2 times higher in the <100> direction than in the <110> direction [60]. These measurements evince the sensitivity of our phonon spectrometer to submicron variations in device geometry. We point out, however, that the measured differential response of the detector must be scaled by the efficiency factor in order to compare measurements from different detectors.
In figure 11, we replot the results of figure 10a (phonon transmission through different mesa widths) using the unscaled detector response, and we show that with typical detector efficiency factors of ~0.1, there is an order of magnitude difference between the scaled and unscaled signals.

Conclusion
We have designed and microfabricated a phonon spectrometer utilizing superconducting tunnel junction transducers for the emission and detection of hypersonic (100 to ~870 GHz) acoustic phonons in silicon microstructures. We model the phonon emission profile of the modulated STJ phonon spectrum by considering the electron-phonon interactions within the superconductor films of the emitter STJ, and we also model the phonon detector behavior by considering quasiparticle-phonon interactions. Our energy resolution of ~60-80 μeV, corresponding to a frequency resolution of ~15-20 GHz, is about 20 times better than the energy resolution obtainable from conventional thermal transport measurements, which rely on a Planck distribution of phonons. We have demonstrated that, with a phonon detection noise equivalent power (NEP) of ~10⁻¹⁵ W/√Hz, the sensitivity of our STJ phonon detectors is comparable to that of similar low-temperature thermal detectors. The design of our spectrometer, comprising a silicon mesa with STJs on the sides, serves as a good platform for phonon transport studies. The ballistic phonon transmission through the mesa alone can be distinguished from backscattering from the substrate by subtracting the mesa-with-trench phonon transmission signal from the mesa-without-trench signal, a method which eliminates the need for the more complicated suspended structures typical of thermal conductance measurements. The silicon mesa platform is adaptable to studies of phonon transmission through nanostructures or nanomaterials by etching or depositing these into the ballistic path defined by the mesa. Finally, we have evinced spectrally resolved ballistic phonon transport in microstructures with submicron spatial resolution. Our STJ-based spectrometer provides a state-of-the-art tool for examining nanoscale effects on phonon transport.

nW of total phonon power and about 0.4 nW of modulated phonon power.
Because of the geometry of our spectrometer, only about 0.1% of this modulated power, or ~0.4 pW, will participate in the measurement. Of this, ~32%, i.e. roughly phonons/sec, will be carried by the peak phonons in the 20 GHz band around 400 GHz; the remainder of the power (roughly phonons/sec) is carried by phonons of energy lower than the peak. In contrast, a thermal source peaked at 400 GHz and emitting the same experimental power (~0.4 nW) will emit a similar fraction (0.4 pW) in the proper direction to participate in the experiment, but only ~3% of this, or roughly phonons/sec, will be carried by phonons within ±10 GHz of the peak. Of the remaining power, roughly half will be carried by phonons of frequency >410 GHz and half by phonons of frequency <390 GHz.
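A power carried by phonons in a narrow band can be converted to a phonon rate by dividing by the phonon energy $hf$. The sketch below is an independent estimate of that kind, treating every phonon in the 20 GHz band as having the 400 GHz peak energy; it uses only the powers and fractions stated above, and the function name is hypothetical.

```python
PLANCK_J_S = 6.62607015e-34  # exact SI value of the Planck constant


def phonon_rate_per_s(power_w: float, band_fraction: float,
                      phonon_frequency_hz: float) -> float:
    """Phonons per second carrying band_fraction of power_w at a single frequency.

    ASSUMPTION: all phonons in the band are assigned the same energy h*f.
    """
    return power_w * band_fraction / (PLANCK_J_S * phonon_frequency_hz)


if __name__ == "__main__":
    participating_power_w = 0.4e-12  # ~0.4 pW reaching the measurement
    stj = phonon_rate_per_s(participating_power_w, 0.32, 400e9)
    thermal = phonon_rate_per_s(participating_power_w, 0.03, 400e9)
    print(f"STJ emitter, 32% in band:    {stj:.1e} phonons/s")
    print(f"Thermal source, 3% in band:  {thermal:.1e} phonons/s")
    print(f"In-band advantage of the STJ: {stj / thermal:.1f}x")
```

On these assumptions the STJ emitter delivers roughly ten times more in-band phonons than a thermal source of the same total power, which is simply the ratio of the two stated fractions.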

Appendix B. Estimating the detector efficiency
The measured differential tunnel current in our detector will be proportional to the change in nearby quasiparticle density [30, 39, 41], as expressed in equation (B.1), where $R_N$ is the normal-state tunneling resistance of the junction, $N_0$ is the normal density of states at the Fermi level in Al, and the last factor, $(eV + \Delta)/\sqrt{(eV + \Delta)^2 - \Delta^2}$, reduces to 1.15 at our detector bias voltage $V = \Delta/e$ [30]. Equation (3) of the main text presents the differential rate of quasiparticle generation as a function of the differential rate of phonons incident on the detector. From this, we can determine the differential change in quasiparticle density by the steady-state assumption that the rate of quasiparticles generated must balance all quasiparticle loss rates. The primary loss process comprises diffusion of the quasiparticles into the attached wiring leads, followed by recombination into Cooper pairs [40]. We will assume that the tunneling itself does not contribute significantly to quasiparticle loss. For quasiparticles diffusing into a volume $V_{qp}$, the recombination loss rate is given by equation (B.2) [30,41].
The recombination time $\tau_r$ is strongly sensitive to the total quasiparticle density $n_{qp} = n_{th} + n_{ph} + \delta n$, where $n_{th}$ is the thermally activated quasiparticle density, $n_{ph}$ is the quasiparticle density due to the full rate of incident phonons, and $\delta n$ is due to modulated incident phonons. However, as long as the phonon-induced density remains sufficiently small compared with $n_{th}$, we may treat $\tau_r$ as constant [53]. At a temperature of 0.3 K, $\tau_r$ is roughly 30 μs [30, 42, 43]. To check the dependence of detector response on $\tau_r$, we repeated one of our spectral measurements at a temperature of 0.36 K, at which $n_{th}$ was 3 times its value at 0.3 K. We found that the detector response was degraded by only ~10% compared to the 0.3 K measurements. Thus, we expect that restricting $n_{qp}$ to only 1.5 times its unperturbed (thermal) value should maintain this condition, and therefore maintain a consistent detector sensitivity. We note that the ~10% reduction upon raising the temperature to 0.36 K is less than what would be predicted by the theory of Rothwarf and Taylor [53], suggesting that in our devices $\tau_r$ is less temperature-dependent than this theory predicts. One possible explanation is that magnetic flux trapped in the Al detector film contributes to the quasiparticle recombination rate in our detectors [41]. In some cases, cycling our devices above $T_c$ resulted in variations of a few percent in the measured phonon transmission signal, which is consistent with the presence of detector efficiency variations due to trapped flux.
In considering equation (B.2), the volume primarily comprises the wiring trace attached to the finger, so we have $V_{qp} \approx w\,t\,L_D$, where $w$ and $t$ are respectively the average total width and thickness of the trace, which in our devices are respectively 3.2 μm and 530 to 580 nm, and $L_D = \sqrt{D\tau_r}$ is the diffusion length of the quasiparticles. For diffusion constant $D$ = 20 cm²/s, this length is ~250 μm [29,40]. The recombination rate then follows from equation (B.2). In steady state we take the total rate of change of quasiparticle density to be zero, so that quasiparticle generation balances recombination loss; this yields equation (B.3) for the differential quasiparticle density. Thus the tunnel current may be related to the rate of quasiparticle generation by incident phonons found from equation (3) (equation (B.4)), and the efficiency is obtained for each detector as the ratio of measurable current to charge production rate (equation (B.5)). We note that the relatively large magnitude of the quasiparticle diffusion length (of order 100 μm) means that phonons reflected from the bottom of the Si chip and striking the wiring leads very far from the junction or the mesa may generate quasiparticles that register as a tunneling current at the detector STJ, therefore contributing to the measured backscatter signal level. It is natural to ask whether we could reduce or eliminate the measured background level (which represents a source of experimental uncertainty) by redesigning the detector or wiring traces. However, we note from equations (B.2) to (B.5) that changes in the wiring trace dimensions may not achieve this goal: if we reduce the width of the wiring traces in order to diminish the intercepted phonon flux, we also reduce the volume occupied by the quasiparticles and thereby increase the tunneling efficiency for both the quasiparticles formed in the finger and those formed in the leads.
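The diffusion length and the quasiparticle volume quoted above can be checked with a short calculation. The sketch below assumes the usual diffusive estimate $L_D = \sqrt{D\tau_r}$ and a trace volume of width × thickness × diffusion length, using the midpoint of the stated 530 to 580 nm thickness range; the function names are hypothetical.

```python
import math


def diffusion_length_m(diffusion_constant_cm2_per_s: float,
                       recombination_time_s: float) -> float:
    """Quasiparticle diffusion length sqrt(D * tau), returned in metres."""
    d_m2_per_s = diffusion_constant_cm2_per_s * 1e-4
    return math.sqrt(d_m2_per_s * recombination_time_s)


def trace_volume_m3(width_m: float, thickness_m: float, length_m: float) -> float:
    """Volume of wiring trace occupied by diffusing quasiparticles."""
    return width_m * thickness_m * length_m


if __name__ == "__main__":
    length = diffusion_length_m(20.0, 30e-6)          # D = 20 cm^2/s, tau_r ~ 30 us
    thickness = 0.5 * (530e-9 + 580e-9)               # ASSUMPTION: midpoint of stated range
    volume = trace_volume_m3(3.2e-6, thickness, length)
    print(f"Diffusion length: {length * 1e6:.0f} um")
    print(f"Trace volume:     {volume * 1e18:.0f} um^3")
```

The computed length of about 245 μm matches the ~250 μm stated in the text, and the corresponding volume is a few hundred cubic micrometres.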