Improving wafer-scale Josephson junction resistance variation in superconducting quantum coherent circuits

Quantum bits, or qubits, are an example of coherent circuits envisioned for next-generation computers and detectors. A robust superconducting qubit with a coherent lifetime of $O$(100 $\mu$s) is the transmon: a Josephson junction functioning as a non-linear inductor shunted with a capacitor to form an anharmonic oscillator. In a complex device with many such transmons, precise control over each qubit frequency is often required, and thus variations of the junction area and tunnel barrier thickness must be sufficiently minimized to achieve optimal performance while avoiding spectral overlap between neighboring circuits. Simply transplanting our recipe optimized for single, stand-alone devices to wafer-scale (producing 64, 1x1 cm dies from a 150 mm wafer) initially resulted in global drifts in room-temperature tunneling resistance of $\pm$ 30%. Inferring a critical current $I_{\rm c}$ variation from this resistance distribution, we present an optimized process developed from a systematic 38 wafer study that results in $<$ 3.5% relative standard deviation (RSD) in critical current ($\equiv \sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$) for 3000 Josephson junctions (both single-junctions and asymmetric SQUIDs) across an area of 49 cm$^2$. Looking within a 1x1 cm moving window across the substrate gives an estimate of the variation characteristic of a given qubit chip. Our best process, utilizing ultrasonically assisted development, uniform ashing, and dynamic oxidation has shown $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ = 1.8% within 1x1 cm, on average, with a few 1x1 cm areas having $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ $<$ 1.0% (equivalent to $\sigma_{f}/\left\langle f \right\rangle$ $<$ 0.5%). Such stability would drastically improve the yield of multi-junction chips with strict critical current requirements.


I. INTRODUCTION
Josephson junctions, fabricated by isolating two superconductors with a thin insulating barrier, are the core circuit element for superconducting solid-state quantum coherent devices. When shunted with a capacitor, the non-linear inductance of the junction forms an anharmonic oscillator, making energy levels individually addressable [1]. Precise control over junction properties is crucial for state-of-the-art devices such as quantum processors utilizing the cross-resonance gate [2], single microwave photon detectors based on ensembles of identical qubits [3], and travelling wave amplifiers where variations in nominally identical junctions lead to unwanted impedance variations [4]. Therefore, in this work we specifically focus on the reproducibility of shadow-evaporated sub-micron Al/AlO$_x$/Al Josephson junctions common to nearly all current qubits [5].
The critical current, $I_{\rm c}$, of a Josephson junction, inversely proportional to its inductance, is tuned by varying either the critical current density, $J_{\rm c}$, or the junction area. The former involves modifying the tunnel barrier thickness via the oxidation time or pressure when using a thermally grown barrier. Our wafer-scale fabrication process produces 64, 1 cm$^2$ dies from a 150 mm wafer, the maximum size accommodated by our evaporator. The junctions are located within the central ≈ 49 cm$^2$ of the die array and thus high uniformity is desired over this length scale. Previous works describe two types of Josephson tunnel junctions: large junctions, $I_{\rm c} \sim O$(µA), typically realized with a Nb/AlO$_x$/Nb trilayer process suitable for superconducting digital electronics or microwave amplifiers; and small junctions, $I_{\rm c} \sim O$(nA), typically realized with Al/AlO$_x$/Al suitable for qubits. Regarding the former, 2-4% intrachip variations have been reported [6] and ≈ 15% variation is observed across a wafer [7,8]; a notable exception is [9], where 8.2% and 2.9% variation in resistance is reported for 300 nm and 800 nm diameter junctions, respectively, across a 200 mm wafer. Junctions with sizes ranging from 0.015 to 3.27 µm$^2$ mentioned in [10] had variations of 2.3% on 39 mm$^2$ chips. For qubits, it is advantageous to reduce the physical size of the junction to minimize the inclusion of noisy two-level defects [11]. Authors fabricating deep sub-micron junctions typically report fluctuations of ≈ 5% within chips smaller than 50 mm$^2$ [12], 3.5% within a few mm$^2$ [13], and fluctuations of 2-3% for 0.04 µm$^2$ junctions patterned with hard masks across 50 mm wafers [14].
In this work, we strive to further improve this absolute level of resistance variation, and to realize it over a larger substrate in order to increase the yield of functional multi-qubit chips which have tight tolerances on qubit frequency. Furthermore, we investigated designs where a SQUID replaces a single junction and the magnetic flux-tunability of the circuit inductance is limited by introducing asymmetry in the SQUID junction areas (≥ 5:1) to reduce the susceptibility to flux noise [1,15]. As such, we produced small junctions over a range of areas spanning 0.0036 to 0.013 µm$^2$. It is important to note that in such SQUIDs, the smaller junction only affects the tuning range, so we focus on tight control over the critical current of the larger junction.
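As a rough illustration of how junction asymmetry limits flux tunability, the sketch below evaluates the standard two-junction SQUID expression $I_{\rm c}(\Phi) = I_{\Sigma}\sqrt{\cos^2(\pi\Phi/\Phi_0) + d^2\sin^2(\pi\Phi/\Phi_0)}$ with $d = (I_1 - I_2)/(I_1 + I_2)$. It assumes negligible loop inductance, equal $J_{\rm c}$ for both junctions (so that $I_{\rm c}$ scales with area), and the transmon approximation $f \propto \sqrt{I_{\rm c}}$. The function names are hypothetical and not taken from this work.

```python
import math

def squid_ic(phi_frac, ic_large, ic_small):
    """Effective critical current of a two-junction SQUID.

    phi_frac is the applied flux in units of the flux quantum (Phi/Phi_0).
    Assumes negligible loop inductance.
    """
    total = ic_large + ic_small
    d = (ic_large - ic_small) / total  # junction asymmetry parameter
    c = math.cos(math.pi * phi_frac)
    s = math.sin(math.pi * phi_frac)
    return total * math.sqrt(c * c + d * d * s * s)

def tuning_range(area_ratio):
    """Return (Ic_min/Ic_max, f_min/f_max) for an area_ratio:1 SQUID.

    Assumes Ic is proportional to junction area and f is proportional
    to sqrt(Ic), which neglects the small charging-energy correction.
    """
    ic_max = squid_ic(0.0, area_ratio, 1.0)  # zero flux: currents add
    ic_min = squid_ic(0.5, area_ratio, 1.0)  # half flux quantum: currents subtract
    ratio = ic_min / ic_max
    return ratio, math.sqrt(ratio)

if __name__ == "__main__":
    for r in (5.0, 6.0, 8.0):
        ic_ratio, f_ratio = tuning_range(r)
        print(f"{r:.0f}:1 SQUID  Ic_min/Ic_max = {ic_ratio:.3f}  f_min/f_max = {f_ratio:.3f}")
```

Under these assumptions the critical current of an n:1 SQUID tunes between (n+1) and (n-1) units, so a larger asymmetry narrows the tunable range and with it the sensitivity to flux noise.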

II. METHODS AND OBSERVATIONS
For this study, both 100 and 150 mm wafers were used. Junctions were fabricated using the bridge-free "Manhattan Style" [16,17] on > 8000 Ω-cm intrinsic (100) Si using e-beam lithography, see Fig. 1. Bridgeless junctions have an advantage over bridged designs, such as Dolan style [18], in that the junction area is independent of resist thickness. Layouts were generated in Python with GDSpy [19], proximity effect corrected with Beamer from GenISys, and exposed with 100 keV electrons in a Raith Electron Beam Pattern Generator (EBPG) 5150. The EBPG is housed in an enclosure made by MCRT within a class 100 cleanroom. The enclosure re-filters the air to at least class 10 and stabilizes temperatures to ± 0.05 $^\circ$C over month-scale time frames. A Spicer Consulting SC24 provides active 3-axis magnetic field cancellation from DC to 13 kHz, measured at a single point next to the e-beam column. The environmental stability of the setup, combined with the Raith EBPG 5150 self-calibration protocol, provides highly reproducible lithography. Once exposed, samples are developed and subsequently coated with e-beam evaporated Al in a Plassys MEB550s with a base pressure of 3 × 10$^{-8}$ mbar. After liftoff, junctions were individually probed to measure their room-temperature resistance, from which $I_{\rm c}$ can be inferred using the Ambegaokar-Baratoff formula [20]. These values can be converted into a qubit frequency using an estimate of the shunt capacitance. Initially, wafers were probed by hand; for the last 11 wafers, a Micromanipulator P200L semi-automatic probe station was used to gather statistics on a larger number of junctions. Plots highlighting improvements made during this study can be found in Fig. 2.
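The conversion from probed resistance to critical current and qubit frequency can be sketched as follows. This is a minimal illustration, not the calibration used in this work: it uses the zero-temperature Ambegaokar-Baratoff relation $I_{\rm c} = \pi\Delta/(2eR_{\rm n})$, treats the room-temperature resistance as the normal-state resistance $R_{\rm n}$, and uses the transmon approximation $f \approx \sqrt{8E_{\rm J}E_{\rm C}}/h - E_{\rm C}/h$ with $E_{\rm J}/h = I_{\rm c}/(4\pi e)$. The superconducting gap (170 µeV) and charging energy (0.25 GHz) are assumed placeholder values, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs

def critical_current(r_n_ohm, gap_uev=170.0):
    """Zero-temperature Ambegaokar-Baratoff estimate of Ic in amperes.

    gap_uev is an ASSUMED superconducting gap for Al in micro-eV.
    """
    gap_joule = gap_uev * 1e-6 * E_CHARGE
    return math.pi * gap_joule / (2.0 * E_CHARGE * r_n_ohm)

def transmon_frequency_ghz(r_n_ohm, ec_ghz=0.25, gap_uev=170.0):
    """Approximate transmon frequency in GHz from junction resistance.

    ec_ghz is an ASSUMED charging energy E_C/h set by the shunt capacitance.
    """
    ic = critical_current(r_n_ohm, gap_uev)
    ej_ghz = ic / (4.0 * math.pi * E_CHARGE) * 1e-9  # E_J/h = Ic / (4 pi e)
    return math.sqrt(8.0 * ej_ghz * ec_ghz) - ec_ghz

if __name__ == "__main__":
    for r in (8e3, 10e3, 12e3):
        ic_na = critical_current(r) * 1e9
        print(f"R = {r / 1e3:.0f} kOhm  Ic = {ic_na:.1f} nA  f = {transmon_frequency_ghz(r):.2f} GHz")
```

Because $f$ scales approximately as $\sqrt{I_{\rm c}}$, a given relative spread in $I_{\rm c}$ maps to roughly half that relative spread in frequency, which is the conversion between $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ and $\sigma_{f}/\left\langle f \right\rangle$ quoted in the abstract.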

A. Resist/Exposure
The resist bi-layer was spun with a Laurell Technologies WS-650-23B spin coater. MicroChem MMA-EL13 (copolymer in ethyl lactate) was used as the high-sensitivity bottom undercut layer for all wafers. Zeon Corp. ZEP 520A-7, MicroChem 950k PMMA A4, and AllResist GmbH AR-P 6200.9 (CSAR) were all tested as the high-resolution upper layer. It was found that the small (≈ 20 mm diameter) vent hole in the top of the spin coater had to be covered to create a uniform spin of the MMA, which was unnecessary for the CSAR and ZEP, likely because of the differences in viscosity of anisole and ethyl lactate. We initially had difficulties spinning defect-free CSAR on MMA, behavior which was not observed with ZEP. This issue was solved by degassing the resist: the lid was opened and the bottle left to sit for 2 hours, allowing its pressure and humidity to equilibrate with ambient conditions. CSAR was ultimately selected as the resist of choice over PMMA because of the flexibility it offered in having (mostly) orthogonal development chemistry to MMA, and over ZEP because of its lower cost. For our developers, described below, MMA and CSAR had optimal doses of 180 and 1100 µC/cm$^2$, respectively. We note that partial clearing of CSAR in MMA developer was observed for doses above 1100 µC/cm$^2$ when immersed for extended times.
Proximity effect correction (PEC) in Beamer was first optimized by observing the uniformity (or lack thereof) of residual undercut, as the MMA provides a sensitive indicator of long-range substrate backscattering compensation. The software's 3D-Edge mode of 3D PEC was chosen due to its ability to simultaneously proximity effect correct both resist layers, which require different doses. A default point spread function (PSF), 500 nm PMMA on Si at 100 keV (Z-Position: 0.325), was used initially. Before the addition of short-range corrections to this PSF, we had low yield of sub-100 nm features with CSAR, which we did not observe with ZEP. The short-range corrections that were added to improve yield were: an effective short-range blur FWHM of 50 nm, a short-range separation value of 5 µm, and a mid-range activation threshold of 2%. A 200 pA beam and 200 µm aperture (calculated spot size = 2 nm) were used with a 1 nm beam step size to ensure that designed area variations on the order of a few nm were reproduced. To avoid backscatter dosing from the probe pads (which are not written on device wafers), the pads were written 130 µm away from the junctions (∼ 4x the backscattering parameter for 100 keV electrons on Si), ensuring that test wafers produced junctions equivalent to those on device wafers. SEM observations of as-evaporated junctions showed worse line edge roughness (LER) on the second evaporation compared to the first (see Fig. 1). Our theory is that Al deposited on the sidewall of the CSAR in the first evaporation introduces additional LER for subsequent evaporations. A trilayer resist (MMA/CSAR/MMA) was briefly considered in an attempt to reduce this effect, utilizing the top layer of MMA to shield the CSAR during off-axis evaporations. We did observe an improvement in LER, but since it did not reduce global $I_{\rm c}$ variations, the trilayer was abandoned because of its added complexity and because the additional forward beam scattering from the top MMA would increase developed linewidths [21], limiting achievable SQUID asymmetry ratios.

B. Development
Cold development with manual agitation (or ultrasonication for wafer 36) was used for CSAR and ZEP. A Thermo Scientific PC200 immersion circulator filled with 50:50 H$_2$O:propylene glycol was used to chill N-amyl acetate (NAA) baths to 0 ± 0.02 $^\circ$C. NAA from Zeon Corp. (ZED-N50) was used initially and AllResist GmbH AR 600-546 was used after wafer 26. No difference was noted between these nominally identical developers. The MMA was developed at room temperature. Puddle development was briefly considered, but it led to many CSAR constrictions and so was abandoned in favor of immersion development on PTFE wafer holders. Initially, IPA:MIBK was used to develop the MMA, but we observed many open junctions due to small resist bridges constricting the CSAR near the junction, especially for < 0.01 µm$^2$ junctions. Our hypothesis was that swollen, gel-like MMA removed by the developer [22] was the cause of these constrictions. Studies with PMMA (which has a much higher molecular weight than MMA) showed that the co-solvent IPA:H$_2$O was a superior developer, resulting in reduced swelling, and that the addition of sonication increased the rate at which developed resist is removed [23][24][25][26]. Although the switch of developer alone did not drastically improve small junction yield, the addition of sonication did. Care had to be taken to attenuate the ultrasonication power to prevent collapse of the CSAR overhang. This was accomplished by using the lowest bath power and, crucially, lining the bath with a polyurethane/vinyl sound-absorbing foam, leaving the central 1x1 cm open to allow some power transmission.
After development, oxygen plasma ashing of the newly opened channels is performed. We used a Plasma Etch PE-50 with a 50 kHz pure oxygen plasma (80 s, ≈ 500 mbar, ≈ 60 W). It was found that large, non-radially symmetric $I_{\rm c}$ gradients were reduced and made more radially symmetric by splitting a single ashing step into four 20 s steps with 90$^\circ$ substrate rotations between steps. In an attempt to further improve the ashing uniformity, the sample was rotated four times in each corner of the chamber, for a total of 16 x 5 s ashes. This resulted in the best wafer-scale statistics at the time: $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ = 3.5% for single junctions across 49 cm$^2$. Eliminating ashing resulted in worse $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ but also a 2x reduction in $J_{\rm c}$, strong evidence that residual organics have an effect on tunnel barrier properties [13,27,28].
After implementing 16x ashing, the dominant source of non-uniformity was found to be junction area variations, which showed approximately radial dependence. First, the effect was reduced simply by increasing the junction area (and decreasing $J_{\rm c}$ to keep $I_{\rm c}$ constant). To test whether the area variation was introduced during development in NAA, manual agitation was replaced by ultrasonication for wafer 36, motivated by the contrast improvements seen in [29] and an assumed higher uniformity. However, this showed no improvement, and a ∼ 1 cm$^2$ patch of abnormally low $J_{\rm c}$ on the wafer caused an overall $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ degradation. Pinpointing the cause of, and a solution to, the area fluctuations is the path towards better wafer-scale uniformity for this process. To this end, a hard mask process would be helpful, as it should be more robust during evaporation and diagnostic post-development SEM imaging.

C. Evaporation and Oxidation
Motivated by the hypothesis that high-energy electrons and UV radiation released during the evaporation could warp or distort the resist non-uniformly and produce the observed area fluctuations, a deposition rate of 3 nm/s was used for the majority of this study. However, after other process optimizations, better uniformity was observed using a rate of 0.3 nm/s. Lower deposition rates provide more time for a growing film and substrate to thermalize, forming smaller grains [30]. Since the tunnel barrier thickness is not uniform grain to grain or at grain boundaries [31][32][33], we hypothesize that more grains per junction results in better averaging of the effective barrier thickness, improving site-to-site uniformity. To investigate this, cross-section TEM analysis of junctions fabricated with the two deposition rates is ongoing [34]. Dynamic and static oxidations were also A/B tested. In a static oxidation, the chamber is filled with oxygen (in our case 95%/5% Ar/O) to a set pressure and then evacuated after a set time. In a dynamic oxidation, gas is continuously introduced and pumped out with rates balanced such that the pressures are the same as in the static oxidation case. Interestingly, we found dynamic oxidation produced a lower $J_{\rm c}$ and, since it provided better uniformity, it was used for the remainder of the study.

III. RESULTS
Wafers made after delivery of the automated probe station are summarized in Table I. Each had 1000 fixed-frequency junctions, 1000 6:1 SQUIDs, and 1000 8:1 SQUIDs, patterned in alternating rows of 50.
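The windowed statistic used in this work, the relative standard deviation (RSD) of $I_{\rm c}$ inside a 1x1 cm window moved across the substrate, can be sketched as below. This is an illustrative implementation only: the input layout (a list of `(x_cm, y_cm, ic)` tuples), the 0.5 cm step, and the minimum junction count per window are assumptions, and the function names are hypothetical.

```python
import statistics

def rsd(values):
    """Relative standard deviation: sample standard deviation over the mean."""
    return statistics.stdev(values) / statistics.fmean(values)

def windowed_rsd(junctions, window_cm=1.0, step_cm=0.5, min_count=10):
    """RSD of Ic inside a square window moved across the wafer.

    junctions: list of (x_cm, y_cm, ic) tuples (ASSUMED data layout).
    Returns a list of (x0, y0, rsd) for each window holding at least
    min_count junctions; (x0, y0) is the lower-left window corner.
    """
    xs = [j[0] for j in junctions]
    ys = [j[1] for j in junctions]
    results = []
    x0 = min(xs)
    while x0 + window_cm <= max(xs) + 1e-9:
        y0 = min(ys)
        while y0 + window_cm <= max(ys) + 1e-9:
            inside = [ic for x, y, ic in junctions
                      if x0 <= x <= x0 + window_cm and y0 <= y <= y0 + window_cm]
            if len(inside) >= min_count:
                results.append((x0, y0, rsd(inside)))
            y0 += step_cm
        x0 += step_cm
    return results

if __name__ == "__main__":
    # Synthetic example: a 7x7 cm grid with a linear global drift in Ic.
    data = [(0.25 * i, 0.25 * j, 30.0 * (1.0 + 0.0125 * i))
            for i in range(29) for j in range(29)]
    local = windowed_rsd(data)
    print("global RSD:", rsd([d[2] for d in data]))
    print("mean windowed RSD:", statistics.fmean(w[2] for w in local))
```

Restricting the statistic to a chip-sized window removes most of a slowly varying global drift, which is why the windowed RSD is the more relevant figure for detunings between qubits on one chip.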

A. Qubit Coherence and Frequency Predictability
Many measurements are still needed to rigorously correlate the observed improvements in junction uniformity with ultimate device performance. Nonetheless, we describe here example measurements of two co-fabricated 8-qubit quantum processors. The 8 fixed-frequency transmon qubits on each chip had a mean target frequency of 5.6 GHz with detunings between neighbors optimized for the cross-resonance gate [2]. Fabricating 64 chips on a 150 mm Si wafer and binning the 64 junctions of each size across the wafer, we find an average $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ = 6.9%, a 3.8x improvement over an 8-qubit ring wafer made using a process similar to wafer 29 (where MIBK was used instead of H$_2$O for MMA development). With this narrower distribution of critical currents, we found 3/64 chips had optimal qubit frequencies, consistent with numerical estimates of chip yield given the measured $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$. We hypothesize that the remaining ∼ 2x discrepancy in $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ between test wafers 35, 37 and 38 and the latest device wafer may be explained by the additional round of lithography that device wafers require after junction deposition (including resist baking and ion-milling) to define the low-loss junction-capacitor interconnects [36], or by the different substrate surface between test and device wafers (RIE etched vs polished Si). The chips were wirebonded in two designs of Cu boxes and tested in separate dilution refrigerators. Coherence measurements and frequency predictability are summarized in Table II. Given the long lifetimes measured on sample #1, we conclude that the fabrication modifications made to improve uniformity do not come at the expense of qubit coherence. See the supplementary material for discussion of the observed offsets between probing estimates and cryogenic measurements, the suppressed coherence of sample #2, and data on individual qubits.

IV. CONCLUSIONS
Motivated by the challenging task of maintaining high Josephson junction uniformity when scaling quantum coherent circuit fabrication beyond a few qubits, we undertook a systematic study to identify and rectify sources of $I_{\rm c}$ variation. We have developed a process which has shown a $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ as low as 3.1% over 49 cm$^2$ for single junctions. Looking within a chip-sized 1 cm$^2$ window to remove global drift, an average $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ = 1.8% was measured, with some areas < 1.0%. To accomplish this, a reliable resist stack was found by changing proximity effect correction parameters and studying different development strategies, of which ultrasonication played a key role in producing high-yield structures. Large gradients introduced by non-uniform ashing were mitigated by adding substrate rotations into that process, which may not be necessary with a more uniform asher. Slower evaporation rates and dynamic oxidations were then shown to further improve uniformity. Current levels of uniformity should be improved by minimizing the observed junction area fluctuations, whose origin is not currently understood. However, since $\sigma_{I_{\rm c}}/\left\langle I_{\rm c} \right\rangle$ within chip-sized areas is small, detunings between qubits on a single chip can be accurately set and the non-zero global $I_{\rm c}$ drift can be used to target absolute frequencies; a useful capability as tolerances become tighter for quantum processors and microwave photon detectors growing in complexity, size, and qubit number.

The fixed-frequency transmon qubits ($E_{\rm J}/E_{\rm C}$ = 70) measured in this study are coupled to nearest neighbors, forming a ring topology. The qubit state is read out dispersively with frequency-multiplexed $\lambda/4$ CPW resonators ($\kappa_{\rm ext}$ ≈ 1 MHz) coupled to a common $\lambda/2$ CPW resonator acting as a Purcell filter to allow fast readout without excess qubit energy relaxation into the bus.
In addition to a control line for each qubit, these structures are defined with Cl-based inductively coupled reactive ion etching of sputtered Nb on > 8000 Ω-cm intrinsic (100) Si. After resist stripping in Microposit Remover 1165 at 80 $^\circ$C for 1 hour, junctions were added using the recipe above. The final round of lithography opens a window above the junction-capacitor overlap region, which was then ion-milled at 400 V for 6 min (with a duty cycle of 2 s on and 20 s off to prevent excessive substrate heating/resist cross-linking) and subsequently covered with a 200 nm normal-angle Al deposition serving as a low-loss junction-capacitor galvanic interconnect [S1]. Two test junction sites within each die were also fabricated, for a total of 640 junctions on the wafer. We probed the room-temperature resistance of these test sites approximately every 48 hours for a month, with the wafer continuously exposed to air, to investigate any aging of the junctions. Fitting the normalized change in junction resistance with an exponential function, we find a fit amplitude of 5.5 ± 0.5% and a time constant of 132 ± 37 hours. In order to minimize the physical damage that occurs when the probe tips touch the surface, the actual qubit sites were only probed after the wafer aged into the steady state. In order to remove all resist residue, chips are cleaned in Microposit Remover 1165 at 80 $^\circ$C for 12 hours. This cleaning treatment induces additional aging of 4.3 ± 0.8% in $I_{\rm c}$ (with an exponential rise to this new steady state in ≈ 48 hours), so a few candidate dies are cleaned and the exact chip is chosen with post-1165 probing data. Chips are then wirebonded in a copper cryopackage and mounted to the mixing chamber of a dilution refrigerator. Additional wirebonds are used on chip to suppress unwanted slotline modes.
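For concreteness, the aging fit described above can be written in the following form. The text states only that an exponential function was fitted; the saturating form below is an assumption, with $R_0$ denoting the initial resistance, $A$ the fit amplitude, and $\tau$ the time constant.

$$
\frac{R(t) - R_0}{R_0} = A\left(1 - e^{-t/\tau}\right), \qquad A = 5.5 \pm 0.5\,\%, \quad \tau = 132 \pm 37\ \mathrm{hours}
$$

Under this assumed form, the change reaches $1 - e^{-3} \approx 95\%$ of its steady-state value after $3\tau \approx 400$ hours (about 16.5 days), which fits within the month-long monitoring period.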
Sample #1 was cooled to 8 mK in a Bluefors XLD400 with a mixing chamber shield, cryoperm shield, Sn shield, and an inner radiation shield. The sample box was light-tight and In-sealed. All 8 control lines and the common readout bus were connected to coaxial lines from room temperature that were attenuated (60 dB for control lines, 80 dB for readout) and low-pass filtered using both LC and lossy Eccosorb filters. Roughly 60 dB of isolation was installed between the sample and the HEMT. Sample #2 was cooled to 14 mK in an Oxford DR200 with a cryoperm shield, inner radiation shield, and a previous-generation In-sealed sample box. Due to the limited number of coaxial lines in this refrigerator, qubit control pulses were injected through the common readout bus, and the control lines were wirebonded to the PCB and 50 Ω terminated on the cryopackage. The readout line had 80 dB of attenuation, LC and Eccosorb low-pass filters, and 80 dB of isolation from the HEMT. The qubit energy relaxation times, $T_1$, the dephasing times measured with a Ramsey experiment, $T_2^*$, and the dephasing times found by echoing away low-frequency noise with a $\pi$ pulse in the middle of the Ramsey evolution, $T_{2\rm Echo}$, are summarized in Table S1. Since careful A/B testing of the box designs is still ongoing, we can only suggest possible mechanisms for the reduced coherence of sample #2 compared to sample #1. We suspect that $T_2^*$ and $T_{2\rm Echo}$ might have been improved had the 50 Ω terminations on the control lines been replaced with shorts to reduce shot noise dephasing. Additionally, sample #2 was characterized at a higher temperature in a fridge with less shielding.
Sample #1 had larger frequency offsets between room-temperature probing estimates and cryogenic measurements than sample #2. As mentioned above, in preparation for cooldown, samples are cleaned in 1165 and then probed. Sample #1 spent 24 hours at room temperature after probing (with ≈ 20 of those hours under vacuum), whereas sample #2 spent only 2 hours at room temperature. If the 1165-induced aging that occurs at atmosphere also occurs under vacuum, this could explain the larger frequency offset for sample #1. However, the small standard deviation of the frequency differences for both samples (< 0.5%) shows that relative detunings between qubit pairs can be accurately predicted with room-temperature probing.