Cylindrical microlensing for enhanced collection efficiency of small pixel SPAD arrays in single-molecule localisation microscopy

Single-photon avalanche photodiode (SPAD) image sensors offer time-gated photon counting, at high binary frame rates of >100 kFPS and with no readout noise. This makes them well-suited to a range of scientific applications, including microscopy, sensing and quantum optics. However, due to the complex electronics required, the fill factor tends to be significantly lower (<10%) than that of EMCCD and sCMOS cameras (>90%), whilst the pixel size is typically larger, impacting the sensitivity and practicalities of the SPAD devices. This paper presents the first characterisation of a cylindrical-shaped microlens array applied to a small (8 µm) pixel SPAD imager. The enhanced fill factor, ≈50% for collimated light, is the highest reported value amongst SPAD sensors with comparable resolution and pixel pitch. We demonstrate the impact of the increased sensitivity in single-molecule localisation microscopy, obtaining a resolution of below 40 nm, the best reported figure for a SPAD sensor.


Introduction
Single Photon Avalanche Diode (SPAD) arrays offer a widefield alternative to more traditional scanning imaging modes for the detection and timing of individual photons of light. Despite significant advances in recent years, they remain a developing technology, showing potential in a diverse range of applications, including Fluorescence Lifetime Imaging (FLIM) [1], Positron Emission Tomography (PET) [2], the imaging of quantum correlations [3], and automotive [4] as well as long-range LIDAR [5].
One of the limiting factors of SPAD arrays is the fill factor, the overall percentage of photosensitive area of the device. Due to the complex in-pixel electronics required for each SPAD to act as a single photon counting/timing module, as well as special "guard ring" structures designed to prevent premature avalanche breakdown, SPAD array fill factors are traditionally low. Historically, the fill factor of SPAD image sensors has been limited to modest values of below 10%. This compares to values beyond 90% for most commercial CMOS, CCD and EMCCD cameras. Newer SPAD architectures have sought to improve the fill factor via both the detector and the pixel circuit design. Virtual guard rings have been adopted, enabling N-well sharing between pairs of SPAD rows [6] (at the potential cost of increased cross-talk and an irregular modulation transfer function [7]). In addition, compact analogue circuit approaches have replaced counting electronics with simple binary pixels, together with oversampled readout and external frame summation. The resulting improvements in fill factor have been considerable, with 61% being recently achieved at 16 µm pixel pitch [7]. However, further improvements with this approach are unlikely to be significant, as the large pixel active area inherently increases the dark count of the SPAD array. An alternative approach for raising the fill factor is to use back-side illumination (as commonly seen in conventional CMOS imagers), so as to prevent metallisation layers from obscuring the photo-sensitive area, and to adopt a stacked structure, moving the pixel electronics to a separate tier. This has recently been demonstrated in a 7.8 µm pitch SPAD device attaining 45% fill factor, albeit at the cost of a reduced spectral response across the blue-green wavelengths [8].
One of the more traditional approaches to increasing fill factor has been to employ microlens technology, aiming to couple the focus of each individual microlens to the typically micron-sized photoactive area of each SPAD pixel. There are a number of examples in the literature of microlensed SPAD imagers, whether using spherical refractive [9,10] or planar diffractive [11] lenses. More recently, a Fresnel lens design was proposed for microlensing large pixel SPADs for near-infrared imaging [12]. However, in implementations reported to date, the initial fill factor of the SPAD is relatively low (at only a few %), which can present difficulties in fabrication and alignment, as well as restricting the final fill factor that can be attained. Furthermore, the pixel pitch in these implementations is quite large: [10] achieves a microlensed fill factor of >50%, but at a large pixel pitch of 25 µm, which can be restrictive for many applications in microscopy. For optimum resolution in both widefield and single-molecule localisation microscopy, the effective pixel size (accounting for the magnification) should be close to the Nyquist sampling limit, which results in an optimal pixel size of around 100 nm in the case of a 1.4 NA objective and 525 nm emission [13]. For pixels as large as 25 µm, this corresponds to over 200× optical magnification, which is challenging, as 200× objectives are not generally available. Thus additional magnification optics are required [14], which introduce complexity and further aberrations, and reduce the field of view. Smaller SPAD pixel sizes may be achieved by moving to a shared-well layout, in which case the above microlensing solutions are no longer compatible, due to the resulting irregular pattern of active areas. Thus, so far the optimal pairing of a small pixel pitch SPAD with microlensing has not been achieved.
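The pixel-size argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the Abbe limit λ/(2·NA) as the resolution criterion and two pixels per resolution element as the Nyquist condition, and the function names are our own.

```python
# Illustrative sketch: Nyquist-matched pixel size and the magnification it implies.
# Assumption: resolution is taken as the Abbe limit, sampled at two pixels per element.


def nyquist_pixel_nm(wavelength_nm: float, numerical_aperture: float) -> float:
    """Sample-plane pixel size giving two pixels per Abbe-limited resolution element."""
    abbe_limit_nm = wavelength_nm / (2.0 * numerical_aperture)
    return abbe_limit_nm / 2.0


def required_magnification(pixel_pitch_um: float, target_pixel_nm: float) -> float:
    """Magnification mapping a physical pixel pitch onto the target sample-plane pixel size."""
    return pixel_pitch_um * 1000.0 / target_pixel_nm


if __name__ == "__main__":
    # 525 nm emission and a 1.4 NA objective give about 94 nm, i.e. "around 100 nm".
    print(f"Nyquist pixel size: {nyquist_pixel_nm(525.0, 1.4):.1f} nm")
    for pitch_um in (25.0, 8.0):
        mag = required_magnification(pitch_um, 100.0)
        print(f"{pitch_um:4.1f} um pitch -> {mag:.0f}x magnification")
```

With a 100 nm target, a 25 µm pitch calls for 250× magnification, whereas an 8 µm pitch needs only 80×, which is within reach of standard objectives.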
In this paper, we combine the front-side illuminated SPAD device of [6], which features a small pixel size of 8 µm and an optimised initial fill factor of 26.8%, with new, custom-designed microlenses. The microlens array features cylindrical rather than the typical circular-shaped elements, and achieves a fill factor of 50%. We apply this technology to an exemplar application of single-molecule localisation microscopy, and show optical resolutions of below 40 nm, which are not viable with the non-microlensed array.

Microlens fabrication
The SPAD sensor (Fig. 1(a)) is fabricated using STMicroelectronics' 130 nm imaging process technology and features a 320 × 240 array of detectors, which are arranged in pairs of back-to-back rows. Within these pairs of rows, the detectors are closely packed such that, to a first approximation, they form a continuous, elongated photo-sensitive area. This maximizes fill factor, but results in an irregular array of detectors in one direction, preventing traditional high efficiency microlensing. However, the architecture lends itself instead to cylindrical microlenses (Figs. 1(c) and 1(d)). In the optimal configuration, each microlens serves a pair of rows, and is aligned on the centreline between the two rows (Fig. 1(c)). The exact microlens shape adopted here was designed according to the dimensions of the combined active area using the Zemax ray tracing package (Zemax LLC, USA); the effect of microlensing is observable on micrographs of the pixel array (Fig. 1(b)), with the (darker coloured) active areas appearing visibly magnified by the microlens array.
The microlens fabrication process was similar to that described in [10]. First, a mould master was manufactured using thermal reflow [15], based on the microlens design. A UV curable hybrid polymer was then deposited directly on the SPAD sensor die, and the mould was pressed onto it to create the desired microlens profile using an MA-6 mask aligner from Süss Microtec. This was followed by a combination of UV exposure and thermal treatment. In all, microlens arrays were successfully imprinted (using the same mould) onto eight separate SPAD arrays, which were then wire-bonded and packaged. The height of the microlens array, indicated as 14 µm in Fig. 1(c), was varied between replications to investigate its effect on the collection efficiency, and was measured using a profilometer (Tencor P10, USA) at the four corners of the array prior to bonding.

Characterisation
To characterise the microlensed SPAD imager, the SPAD sensor was operated in digital readout (or Quanta Image Sensor [16]) mode, in which binary sub-exposures (or "bit-planes") are captured, each pixel presenting a '0' (for no photons detected) or a '1' (at least one photon detected). These bit-planes are then summed to compose grayscale image frames. An important corollary of the binary mode of operation is that there is effectively no read noise affecting output frames (due to the large voltage swing associated with a photon detection). The main noise source is the pixel dark count rate (DCR, with a median value of around 100 cps in this sensor when operated at room temperature and 2 V excess bias), which refers to the spurious firings of the SPAD due to thermal events.
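The binary read-out can be illustrated with a minimal single-pixel simulation. This is a sketch under stated assumptions rather than a model of the actual sensor: photon and dark events are taken to be Poisson distributed, dead time and afterpulsing are ignored, and the function names are hypothetical.

```python
import math
import random


def firing_probability(mean_photons: float, dcr_cps: float, exposure_s: float) -> float:
    """Probability that a pixel reads '1' in one bit-plane (assumes Poisson events)."""
    # Detected photons and dark counts both trigger avalanches, so their means add.
    mean_events = mean_photons + dcr_cps * exposure_s
    return 1.0 - math.exp(-mean_events)


def summed_pixel_value(mean_photons: float, dcr_cps: float, exposure_s: float,
                       n_planes: int, rng: random.Random) -> int:
    """Sum n_planes simulated binary bit-planes into one grayscale pixel value."""
    p_one = firing_probability(mean_photons, dcr_cps, exposure_s)
    return sum(1 for _ in range(n_planes) if rng.random() < p_one)


if __name__ == "__main__":
    rng = random.Random(0)
    # 100 cps median DCR and a 100 us bit-plane give 0.01 dark events per exposure.
    value = summed_pixel_value(mean_photons=0.05, dcr_cps=100.0,
                               exposure_s=100e-6, n_planes=10_000, rng=rng)
    print(f"Summed pixel value over 10000 bit-planes: {value}")
```

Because each bit-plane is a thresholded event rather than an analogue sample, no read-noise term appears in the model; the only spurious contribution is the DCR term.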
Image sequences were captured, using both microlensed and non-microlensed SPADs, using diffused light with the source placed over 1 m from the SPAD array to provide flat and even illumination of wavelength 450 nm or 650 nm. The source used was a white LED (Thorlabs MCWHL5) restricted by 10 nm wide bandpass filters (FPB450-10 or FB650-10). The incident light intensity on the sensor (3.48 mW/m² for 450 nm illumination and 8.12 mW/m² at 650 nm) was monitored using a calibrated power meter (Thorlabs PM200 with S120C sensor). DCR compensation was applied as per the method in [17] and hot pixels of DCR >5 kcps, accounting for fewer than 10% of all pixels, were interpolated over using the scheme of [18]. The improvement in effective fill factor through microlensing was quantified in terms of the concentration factor, defined here as the ratio of photon counts registered with and without the microlens array.
Due to microlens alignment and manufacturing variability, a certain level of mismatch (or offset) in the position of the lens is inevitable. The mismatch is compounded by this prototype microlensing being carried out on a per-die basis, on a sensor lacking the usual alignment fiducials. The use of fiducial markers, as is now common in large-scale volume production, would likely reduce this misalignment. In the present microlensed sensor, mismatch manifests itself as a strong odd-even row pattern in the output, as well as a reduction in the overall concentration factor across the array. The effect is illustrated in Fig. 2: a shift in position of the array with respect to the ideal case (Fig. 2(a)) reduces the collection efficiency of every second pixel row (Fig. 2(b)). The mismatch can be countered by tilting the sensor (Fig. 2(c)) so as to redirect the focused light back onto photo-sensitive areas. Figure 2(d) plots the resulting concentration factor across a range of tilt angles for one example microlensed sensor, identified as having the highest concentration factor at optimal tilt (the concentration factor being calculated as the ratio of the mean output of the microlensed and non-microlensed chips for an exposure time of 1 µs). The concentration factor is seen to peak at 11° (450 nm light) and 12° (650 nm), which were thus the tilt angles used to quantify output response in subsequent measurements.
In a Quanta Image Sensor the response to light in terms of the mean photon count is linear at low light levels but undergoes logarithmic compression as the light level (or exposure time) is increased [19]. The effect is akin to that seen with photographic film and is due to pile-up distortion, with multiple incident photons being recorded with the same logical high value as a single photon. Thus, to capture the full response curve, it is necessary to record the output of the device over a range of exposure levels. Figures 3(a) and 3(b) plot the response of the SPAD sensor, with and without microlensing, when exposed to light of wavelength 650 nm and 450 nm respectively. The output of the sensor is presented in terms of the bit-density (or average pixel value) at increasing levels of exposure. The improvement in fill factor is indicated by a lateral shift in the response curve, representing increased sensitivity under comparable illumination conditions. A concentration factor of around 1.80 is seen at 650 nm, increasing to around 1.95 for 450 nm, representing an effective fill factor of 48% and 52%, respectively, for the microlensed SPAD.
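The response curve and the fill factor figures quoted above can be reproduced with a short calculation. The sketch assumes the usual Poisson model for a binary pixel, in which the bit-density is 1 − exp(−H) for a mean of H detected photons per bit-plane; this model and the function names are our assumptions, while the 26.8% native fill factor and the concentration factors are taken from the text.

```python
import math

NATIVE_FILL_FACTOR = 0.268  # fill factor of the sensor without microlenses (from the text)


def bit_density(mean_detected_photons: float) -> float:
    """Expected fraction of '1' values per bit-plane (assumes Poisson photon arrivals)."""
    return 1.0 - math.exp(-mean_detected_photons)


def exposure_for_density(density: float) -> float:
    """Inverse of bit_density: mean detected photons needed to reach a given bit-density."""
    return -math.log(1.0 - density)


def effective_fill_factor(concentration_factor: float) -> float:
    """Effective fill factor as native fill factor times concentration factor."""
    return NATIVE_FILL_FACTOR * concentration_factor


if __name__ == "__main__":
    for wavelength_nm, cf in ((650, 1.80), (450, 1.95)):
        print(f"{wavelength_nm} nm: effective fill factor "
              f"{100.0 * effective_fill_factor(cf):.0f}%")
    # The lateral shift: the microlensed sensor reaches a bit-density of 0.5
    # with 1/CF of the incident exposure needed by the bare sensor.
    h_half = exposure_for_density(0.5)
    print(f"Relative exposure for 50% bit-density with CF = 1.80: {h_half / 1.80 / h_half:.2f}")
```

On a logarithmic exposure axis, multiplying the detected photon number by the concentration factor appears as a pure horizontal translation of the curve, which is why the shift can be read directly as a sensitivity gain.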

The microlensed SPAD sensor characterised here is one of the eight successful microlens replications, featuring different microlens heights. Figure 3(c) shows a scatter plot of the measured concentration factor (at zero tilt) versus mean microlens height for all replications (the horizontal error bars on the data points representing the standard deviation in the microlens height as measured at the four corners of the array). Alongside the measured data points, a simulated curve, based on a Zemax optical model, is given. The measurements follow the general trend suggested by the simulations; the disparity between the two could be attributed to offsets in the position of the lens array, collimation of the incident light during the experiment, as well as deviations in the microlens profile from the perfect spherical shape. Based on these results, it would be useful to target a microlens height of just over 20 µm in future replications, to maximise the concentration factor, the simulation suggesting a maximum achievable fill factor of around 56%.
In view of the potential nonidealities in the microlens array, it is also important to assess the spatial uniformity in the response of the SPAD sensor, in addition to measuring the overall response. Figure 4 compares the response, under 650 nm uniform illumination as described previously, of the microlensed SPAD (as before, the unit giving the highest concentration factor at optimal tilt was used) with the non-microlensed version, both in terms of photon counts across the pixel array and histograms of these photon counts. The standard deviation in photon counts, which is <1% of the mean count in the non-microlensed case (based on a Gaussian fit to the histogram), shows a moderate increase to 3.1% for the microlensed SPAD (or, more precisely, 3.3% after compensating for the slight pile-up in the counts due to the binary read-out of the sensor [18]). In the latter case, the histogram features heavier tails than a standard Gaussian function, so a t location-scale distribution is fitted [20]. Most of the increase in non-uniformity is likely to be due to the misalignment and non-uniform height in the microlens array rather than the variability (aberration) in the microlens elements themselves (as discussed previously, this would likely be reduced by the use of suitably placed alignment markers in any large-scale manufacturing of the microlens array). This is evidenced by the edge and corner effects seen in Fig. 4(c) (where the uniformity is poorer). Indeed, considering a central region of interest of 160 × 120 pixels, the non-uniformity in the response of the microlensed chip reduces to 1.4%.
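The step from 3.1% to 3.3% can be understood as a first-order propagation of the spread through the pile-up correction. The sketch below assumes the correction H = −ln(1 − p), where p is the measured bit-density, and takes p ≈ 0.12 from the mean count rate quoted in the caption of Fig. 4; both the form of the correction and the function name are assumptions made for illustration, not the method of [18].

```python
import math


def pileup_corrected_relative_std(relative_std_counts: float, bit_density: float) -> float:
    """Propagate a relative spread in counts through H = -ln(1 - p), to first order."""
    corrected_mean = -math.log(1.0 - bit_density)
    std_density = relative_std_counts * bit_density
    # dH/dp = 1 / (1 - p), so the spread is stretched more as p approaches saturation.
    std_corrected = std_density / (1.0 - bit_density)
    return std_corrected / corrected_mean


if __name__ == "__main__":
    corrected = pileup_corrected_relative_std(relative_std_counts=0.031, bit_density=0.12)
    print(f"Non-uniformity after pile-up compensation: {100.0 * corrected:.1f}%")
```

The compression of the binary response slightly masks pixel-to-pixel differences, so undoing it raises the apparent non-uniformity by a few percent of its value at this light level.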

Fluorescence cell imaging
To test the platform with a benchmark imaging application, the SPAD camera was used to image a fluorescently labelled cell test slide (F36924, Thermo Fisher Scientific) on a widefield microscope (Olympus IX71 with 60×, NA = 1.25 water objective). The SPAD image sensor is removable from the driving circuit board without any impact on alignment and therefore swappable between the microlensed and non-microlensed versions. The camera was configured to acquire back-to-back 100 µs bit-plane exposures, which were then summed to compose grayscale images in each of the three imaging channels (red, green and blue), and the channels merged into a single colour image. Alongside the SPAD, an sCMOS camera (Hamamatsu ORCA-Flash4.0 V1) was used to provide reference images. The two cameras were coupled to the microscope via a 50:50 beam splitter, enabling simultaneous imaging of the same field of view.
Figures 5(a)-5(c) show three-channel images of a given (bovine pulmonary artery) cell, as captured by the SPAD sensor, the SPAD with microlens, and the sCMOS camera, using the same equivalent total exposure time (with the sCMOS image cropped to match the field of view of the SPAD). The microlensed SPAD image shows more detail and a better signal to noise ratio than the output of the non-microlensed SPAD, especially in the green channel. The sCMOS camera remains the benchmark, with a still higher signal to noise ratio. However, the microlensed SPAD array begins to approach this level, returning equivalent detail in the image whilst offering the additional benefits of photon counting and potential timing applications. Figure 5(d) demonstrates this further, plotting the intensity profiles over a given section of the cell from each camera, as indicated by the yellow lines in Figs. 5(a)-5(c). In the case of the sCMOS image, the pixel values were converted into photoelectron number based on the ADC offset and conversion gain. These profile plots indicate an increase of around ×2 in photon numbers from microlensing, which is in agreement with the characterisation results of Fig. 2. As an example, a peak in the profile at around d = 250 µm becomes distinguishable from the background in the microlensed SPAD image.
Accounting for the smaller pixel size of the sCMOS (6.5 µm vs. 8 µm pixel pitch of the SPAD), and considering photon counts over the whole field of view, the sCMOS image has ×3.3 as many photons in the blue channel, ×3.2 in the green and ×6.2 in the red, compared with the microlensed SPAD. This is broadly in line with expectations given the relative external quantum efficiencies (internal QE × fill factor) of the sensors at the respective wavelengths. The SPAD has EQE ≈ 0.35 × 0.51 = 0.18 for blue and green, reducing to 0.25 × 0.51 = 0.13 for red, whilst for the sCMOS these numbers are EQE ≈ 0.60 (blue) and 0.70 (green, red). The specified EQEs assume emissions at the peak emission wavelengths of the three fluorescent dyes, namely 461 nm (blue), 512 nm (green) and 599 nm (red).
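As a quick consistency check on these figures, the expected photon ratio in each channel can be computed directly from the quoted efficiencies. All numbers below are taken from the text; the dictionary and function names are ours.

```python
# Expected sCMOS-to-SPAD photon ratio per channel, from the EQE values quoted in the text.
FILL_FACTOR = 0.51  # effective fill factor of the microlensed SPAD

SPAD_INTERNAL_QE = {"blue": 0.35, "green": 0.35, "red": 0.25}
SCMOS_EQE = {"blue": 0.60, "green": 0.70, "red": 0.70}
MEASURED_RATIO = {"blue": 3.3, "green": 3.2, "red": 6.2}


def spad_eqe(channel: str) -> float:
    """External quantum efficiency of the SPAD: internal QE times fill factor."""
    return SPAD_INTERNAL_QE[channel] * FILL_FACTOR


def expected_ratio(channel: str) -> float:
    """Ratio of sCMOS to SPAD photon counts expected from the EQE values alone."""
    return SCMOS_EQE[channel] / spad_eqe(channel)


if __name__ == "__main__":
    for channel in ("blue", "green", "red"):
        print(f"{channel:5s}: expected x{expected_ratio(channel):.1f}, "
              f"measured x{MEASURED_RATIO[channel]:.1f}")
```

The expected ratios of roughly ×3.4, ×3.9 and ×5.5 lie within about 20% of the measured values, which is the sense in which the agreement is "broadly in line".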
In comparing even the high efficiency microlensed SPAD imager with the sCMOS camera, the above results illustrate that when the sCMOS is imaging a static scene in a shot-noise dominated regime (i.e. with sufficient photon numbers so that read noise is negligible), then it has a significant SNR gain over the microlensed SPAD. This is expected, as discussed above. However, it is in photon-starved imaging conditions over relatively short time-scales, so that dark counts are insignificant, where the SPAD offers an advantage. To demonstrate this, Fig. 6 plots the SNR versus the mean number of incident photons (also known as the photon transfer curve), comparing measured data points for the SPAD (with/without microlens) with standard noise models for EMCCD, ICCD and sCMOS, as used in the literature (see, for example, [21]). To measure the SNR for the SPAD, the experiment of Fig. 3 was repeated: a sequence of 2 × 10⁶ bit-plane exposures was taken, under uniform illumination. The exposure time was set to 10 µs per bit-plane; back-to-back exposures may be obtained at this exposure setting by reading out 24 rows only. Light of 450 nm wavelength was used (close to the peak QE of the sensor at 480 nm), its photon flux established by the power meter, and a pixel with median DCR was considered. The incident photons per exposure were at all times fewer than 0.1. Photon counts were obtained by summing exposures in groups of different sizes, and for each grouping the SNR was calculated as:

$$\mathrm{SNR} = \frac{\mathrm{mean}(N_{\mathrm{light}}) - \mathrm{mean}(N_{\mathrm{dark}})}{\sqrt{\mathrm{var}(N_{\mathrm{light}})}}$$

where $N_{\mathrm{light}}$ and $N_{\mathrm{dark}}$ denote the summed counts obtained with and without illumination, respectively.
The resulting SNR curve is seen to be similar to the ICCD. When imaging in a regime with fewer than one incident photon/pixel, the microlensed SPAD gives higher SNR than sCMOS, otherwise sCMOS has the advantage. At photon levels of >100/pixel, as in the cell image, the sCMOS is projected to offer twice the SNR. The results from Fig. 5 suggest a factor of √3.3 ≈ 1.82 difference in SNR in the blue channel, close to the idealised factor of 2 from the modelled sCMOS camera.
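The grouping procedure used to build the SNR curve can be sketched as follows. The example is a simulation with hypothetical parameters, not the measured data: it assumes that the SNR is taken as the dark-subtracted mean count divided by the standard deviation of the illuminated counts, and it uses a shorter sequence than the 2 × 10⁶ bit-planes of the experiment to keep the run time low.

```python
import math
import random
import statistics


def simulate_bit_planes(p_one: float, n_planes: int, rng: random.Random) -> list:
    """Simulate a single pixel's binary output over n_planes exposures."""
    return [1 if rng.random() < p_one else 0 for _ in range(n_planes)]


def grouped_counts(bit_planes: list, group_size: int) -> list:
    """Sum consecutive bit-planes in non-overlapping groups of group_size."""
    n_groups = len(bit_planes) // group_size
    return [sum(bit_planes[i * group_size:(i + 1) * group_size]) for i in range(n_groups)]


def snr(light_counts: list, dark_counts: list) -> float:
    """Assumed SNR definition: dark-subtracted mean over the std of the illuminated counts."""
    signal = statistics.fmean(light_counts) - statistics.fmean(dark_counts)
    return signal / math.sqrt(statistics.pvariance(light_counts))


if __name__ == "__main__":
    rng = random.Random(1)
    dark_events = 100.0 * 10e-6   # 100 cps DCR over a 10 us bit-plane
    detected_photons = 0.05       # hypothetical mean detected photons per bit-plane
    light = simulate_bit_planes(1.0 - math.exp(-(detected_photons + dark_events)), 200_000, rng)
    dark = simulate_bit_planes(1.0 - math.exp(-dark_events), 200_000, rng)
    for group_size in (10, 100, 1000):
        value = snr(grouped_counts(light, group_size), grouped_counts(dark, group_size))
        print(f"{group_size:5d} bit-planes per frame: SNR = {value:.2f}")
```

Each group size yields one point on the curve: larger groups correspond to more incident photons per summed frame, and the SNR grows approximately as the square root of the group size while the pixel remains far from saturation.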

Single-molecule localisation microscopy
To further test the performance of the microlensed SPAD array, we perform single-molecule localisation microscopy (SMLM) on GATTA-PAINT nanorulers [22]. SMLM [23,24] utilises successive localisation of sparse subsets of stochastically blinking or photoactivatable molecules to construct a high-resolution image from the localisation map. The standardised DNA origami samples are ideal for studying the capabilities of cameras for SMLM [25]. Camera sensitivity is key, as fluorophores, which appear as diffraction-limited point spread functions with millisecond blink durations, have to be localised to sub-pixel precision, the localisation performance being heavily dependent on the number of photons collected. EMCCD cameras are the benchmark imaging device, as they provide the highest sensitivity amongst scientific imagers for low photon counts (as seen in Fig. 6), owing to a very high QE (90%) and minimal read noise. Another advantage of EMCCD is a homogeneous pixel response to light, which is desirable when localising faint signals. In this experiment, a sample of GATTA-PAINT HiRes 40G nanorulers was imaged using the SPAD and an EMCCD camera (Hamamatsu ImageEM). Each nanoruler (Fig. 7(a)) consists of a triplet of blinking fluorescent markers (based on the ATTO 542 dye), which are spaced 40 ± 5 nm apart, and as such are unresolvable by standard optical microscopy. The cameras were coupled to a widefield microscope (Olympus Cell Excellence IX81), which was used with a 150×, NA = 1.45, TIRF oil objective, and a 561 nm excitation laser.
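The dependence of localisation performance on photon number can be illustrated with the widely used estimate of Thompson et al. (2002) for the lateral localisation precision. The sketch is illustrative only: the PSF width and background values are assumptions, the effective pixel size follows from the 8 µm pitch and 150× magnification given above, and the formula is not necessarily the uncertainty estimator applied in the analysis software.

```python
import math


def thompson_precision_nm(n_photons: float, psf_sigma_nm: float,
                          pixel_nm: float, background_std: float) -> float:
    """Lateral localisation precision after Thompson et al. (2002).

    background_std is the background noise (standard deviation, in photons) per pixel.
    """
    s2 = psf_sigma_nm ** 2
    a2 = pixel_nm ** 2
    shot_and_pixelation = (s2 + a2 / 12.0) / n_photons
    background_term = 8.0 * math.pi * s2 ** 2 * background_std ** 2 / (a2 * n_photons ** 2)
    return math.sqrt(shot_and_pixelation + background_term)


if __name__ == "__main__":
    pixel_nm = 8000.0 / 150.0   # 8 um pitch at 150x magnification, about 53 nm
    psf_sigma_nm = 80.0         # assumed PSF standard deviation
    for photons in (500, 1000, 2000):
        sigma = thompson_precision_nm(photons, psf_sigma_nm, pixel_nm, background_std=2.0)
        print(f"{photons:5d} photons -> precision {sigma:.1f} nm")
```

In the background-free limit the precision improves as the inverse square root of the photon number, so doubling the collected photons through microlensing would be expected to tighten the localisations by roughly a factor of 1.4.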
Different fields of view of the sample were imaged, sequentially, by the EMCCD and SPAD. In the case of the EMCCD, a total of 10,000 exposures of 30 ms each were taken, with the region of interest cropped to match the field of view of the SPAD (the frame time was 46 ms). A similar acquisition time was used with the SPAD, which captured 5 × 10⁶ bit-plane exposures of 100 µs each. SPAD bit-planes were post-processed using smart aggregation [26], which exploits the high frame rate of the camera to produce optimised molecule images, where background has been suppressed by aggregating signal-only frames. The resulting SPAD image frames, together with the EMCCD frames, were then analysed using ThunderSTORM [27] to obtain molecule localisations (based on Maximum Likelihood fitting). Table 1 specifies the molecule detection settings used in this analysis. The thresholding of SPAD images was modified on account of the non-uniform noise (DCR) affecting the sensor array, but otherwise the fitting parameters are identical between the three cameras.
Figure 7(a) shows example super-resolution images of the nanorulers acquired using the EMCCD and the SPAD with or without microlensing. The characteristic three spots can be seen for all three detectors, but are more readily distinguishable on the image from the microlensed rather than the non-microlensed SPAD, the former producing a comparable super-resolution image to the EMCCD. The EMCCD frames result in around four times as many localisations as the microlensed SPAD (Fig. 7(b)); this is due to the higher inherent EQE of the EMCCD, the exact difference in detection sensitivity being dependent on the molecule thresholding criteria used. We compared the localisation results for the different cameras, reported as mean values and s.d. for each camera. As far as the effect of microlensing is concerned, it is important to note the significant improvement in the performance of the microlensed SPAD compared with the non-microlensed SPAD in terms of the number of localisations per nanoruler (35.8 ± 10.2 versus 15.2 ± 8.0, respectively, P<0.0001, Kruskal-Wallis test) (Fig. 7(b)), the number of nanorulers detected (1.4 ± 0.3 µm⁻² versus 0.3 ± 0.1 µm⁻², respectively, P<0.0005, ordinary ANOVA test) (Fig. 7(c)), the number of photons per localisation (2676 ± 2589 versus 936 ± 1628, respectively, P<0.0001, Kruskal-Wallis test) (Fig. 7(d)) and the uncertainty of the localisation (10.1 ± 3.9 nm versus 12.9 ± 3.8 nm, respectively, P<0.0001, Kruskal-Wallis test) (Fig. 7(e)). The localisation maps produced by ThunderSTORM were further analysed using the GATTAnalysis software tool, which determines the distance between the fluorescent markers and the localisation precision (Figs. 7(f) and 7(g), respectively). The distance between markers is found to be within the 40 ± 5 nm expected for all three cameras and a ≈12% improvement is seen in the precision of localisations for the SPAD compared with the EMCCD (27.4 ± 4.4 nm versus 31.1 ± 4.2 nm, respectively, P<0.0001, Kruskal-Wallis test).
The fact that the SPAD can provide a similar localisation performance to the EMCCD, despite the four times higher EQE of the latter, results from a number of camera attributes that become significant in SMLM. The EMCCD, when set to a typical electronic multiplication gain setting of >100, exhibits an excess noise factor, such that shot noise variance is amplified by a factor of two (an effect equivalent to halving the EQE). Furthermore, whilst the SPAD captures continuously, the EMCCD has a readout deadtime, amounting to up to a third of the frame time. Photons incident on the EMCCD during these intervals are ignored. Finally, through smart aggregation, the SPAD camera avoids accumulating background photons for blinks that are on for only a fraction of the EMCCD frame time.
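The first two effects can be combined into a rough effective efficiency for the EMCCD. The sketch treats the excess noise and the readout deadtime as independent multiplicative penalties, which is a simplifying assumption; the QE, exposure time and frame time are those quoted in the text, and the function name is ours.

```python
def emccd_effective_eqe(qe: float = 0.90, excess_noise_variance_factor: float = 2.0,
                        exposure_ms: float = 30.0, frame_ms: float = 46.0) -> float:
    """Rough effective EQE of the EMCCD after excess noise and readout deadtime."""
    # Doubling the shot-noise variance is equivalent to halving the EQE.
    noise_equivalent_qe = qe / excess_noise_variance_factor
    # Only photons arriving during the exposure window are recorded.
    duty_cycle = exposure_ms / frame_ms
    return noise_equivalent_qe * duty_cycle


if __name__ == "__main__":
    effective = emccd_effective_eqe()
    print(f"Deadtime fraction: {1.0 - 30.0 / 46.0:.2f}")
    print(f"Effective EMCCD EQE: {effective:.2f} (nominal 0.90)")
```

Under these assumptions the nominal 90% QE reduces to an effective value of about 29%, a threefold derating that accounts for much of the gap between the two cameras before the benefit of smart aggregation is considered.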

Conclusions
The use of compact pixels with binary output has enabled SPAD image sensors to attain fill factors significantly higher than previously possible. We have here demonstrated that by applying a custom-designed microlens array to such a SPAD imager, the effective fill factor can be further increased by almost a factor of two. The resulting fill factor of the 320 × 240, 8 µm pixel pitch sensor is around 50%, which is the highest reported value for SPADs with similar resolution and pixel pitch (see the comparison of SPAD imagers in Table 2 [5,7,8,15,28-30]).
We have used this microlensed SPAD array, along with previously established smart aggregation analysis, to demonstrate comparable SMLM performance to an EMCCD camera. Scientific camera technologies such as EMCCD and sCMOS will continue to offer a higher external quantum efficiency, and therefore provide the highest sensitivity when imaging scenes evolving slower than their frame times. However, the SPAD, with its much higher frame rate, combined with negligible readout noise, can offer advantages when events occurring over short time scales are to be captured, such as in SMLM, particle tracking and dynamic systems. Applications relying on the time-gating functionality of SPADs (such as fluorescence lifetime estimation, indirect Time-of-Flight and quantum imaging) will also benefit from the increase in sensitivity afforded by the microlens array, in the form of shorter acquisition times.
Considering SPAD imagers in general, recent developments have included sensors with wide spectral (or near-infrared-enhanced) responses, combined with high fill factors [8,28,31]. There is therefore the prospect of using microlens arrays to address the quantum efficiency gap, even at longer wavelengths, driven by the need for high sensitivity SPADs for automotive LIDAR applications [31].

Fig. 1. a) Micrograph of SPAD sensor, detailed in [6], viewed from top, b) close up on pixel array with and without microlens, c) cross-sectional diagram of microlens array, d) electron microscope image of microlens array showing cylindrical structure

Fig. 4. Uniformity in response across SPAD array after DCR correction, obtained under diffused 650 nm light, as indicated by a) aggregated image frame without microlens (sum of 2 × 10⁶ bit-planes with a mean count rate of ≈0.07 photons/pixel/bit-plane), b) histogram of pixel values (photon counts) without microlens, c) aggregated image frame with microlens (sum of 2 × 10⁶ bit-planes with a mean count rate of ≈0.12 photons/pixel/bit-plane), d) histogram of pixel values with microlens.

Fig. 7. a) Representative super-resolution images of individual GATTA-PAINT 40G HiRes nanorulers acquired using EMCCD (left), SPAD without microlens (centre) or SPAD with microlens (right). Scale bar = 50 nm. Box plots comparing 40G HiRes nanorulers acquired using EMCCD (black), SPAD without microlens (blue) or SPAD with microlens (red): b) number of localisations per nanoruler, c) number of nanorulers detected, d) number of photons detected for each localisation, e) uncertainty in localisation, f) distance between fluorescent points on nanorulers and g) FWHM of the fluorescent points on nanorulers. The grey box represents the manufactured distance between fluorescent points (40 ± 5 nm); if the FWHM is above this box then individual fluorescent points would not be resolvable. A minimum of 25 nanorulers were analysed per experiment; the results are from at least 4 experiments. The horizontal line within the box indicates the median, the boundaries of the box indicate the 25th and 75th percentiles and the whiskers indicate the 5th and 95th percentiles. The marker inside the box indicates the mean and "-" indicates the maximum and minimum values. Statistical significance was tested with Kruskal-Wallis one-way analysis of variance followed by Dunn's post hoc test for multiple comparisons or, for c), ordinary one-way analysis of variance followed by Tukey's post hoc test for multiple comparisons. **P<0.001, ***P<0.0005, ****P<0.0001.

Table 1. ThunderSTORM settings for single molecule localisation
Wave.F1 is the first wavelet level of the input image, and Med.F is the median filtered input image.