Large area high-speed metrology SPM system

We present a large area high-speed measuring system capable of rapidly generating nanometre resolution scanning probe microscopy data over mm² regions. The system combines a slow moving but accurate large area XYZ scanner with a very fast but less accurate small area XY scanner. This arrangement enables very large areas to be scanned by stitching together the small, rapidly acquired images from the fast XY scanner while simultaneously moving the slow XYZ scanner across the region of interest. To merge the image sequences successfully, two software approaches for calibrating the data from the fast scanner are described. The first utilizes the low uncertainty interferometric sensors of the XYZ scanner, while the second implements a genetic algorithm with multiple parameter fitting during the data merging step of the image stitching process. The basic uncertainty components related to these high-speed measurements are also discussed. Both techniques are shown to enable high-resolution, large area images to be generated at least an order of magnitude faster than with a conventional atomic force microscope.


Introduction
There is an increasing demand from manufacturing industries for dimensional measurements of surfaces with nanometre precision over ranges from a few square micrometres to several square centimetres. This need arises primarily from the current trend for the miniaturization of mass-produced devices and the increasing availability of low roughness and pristine technical surfaces (free of defects, scratches etc). Often, conventional microscopic techniques are not accurate enough to address all the metrology problems inherent in the nanoscale characterization of such large sample areas and this demand is posing an important challenge for dimensional metrology [1,2].
There are many traditional methods for surface morphology measurements in the range of micrometres to centimetres, such as stylus profiler measurements and optical profilometry. There are also ISO standards, e.g. ISO 25178, for performing such measurements and evaluating different surface characteristics from them, such as roughness parameters. However, there are many caveats when using these techniques, such as their limited spatial resolution or unsuitability for certain classes of samples, e.g. transparent surfaces when using optical techniques or soft samples in stylus measurements. Scanning probe microscopy (SPM) is ideal for such samples and can be directly applied with the present methodologies. However, SPM is typically limited to small scanning ranges of up to 100 μm × 100 μm due to the limitations of the piezoelectric actuators used. A possible solution is to construct SPMs with significantly larger scanning ranges [3][4][5][6] using mechanical amplification of the piezo actuator motion or other actuation principles such as voice coils [7,8]. However, this simply brings one to the next limitation, namely the low scan rate of SPM instruments. Since all SPMs are based around the serial measurement of each pixel in turn, they are typically not very fast devices even when measuring small image areas. Acquiring measurements over a millimetre range with nanometre sized pixels therefore requires many tens of hours for a single image. This is not only impractical but also problematic when considering sources of drift, such as from the long term thermal instability of the instrument and surrounding environment. One successful route to reducing imaging times is to perform measurements using an adaptive stepping algorithm to produce non-equidistant sampling points (i.e. varying data density) that correspond to the sample features of interest [9,10].
Unfortunately, this approach is suitable only for certain types of samples, such as terraced steps on flat regions. For the majority of surfaces that are examined using an atomic force microscope (AFM), adaptive stepping does not provide a significant reduction of the time necessary for a scan. On particular, well defined samples (e.g. calibration gratings) this approach can even be simplified using several very long profiles [11]. However, this approach cannot be used generally for unknown samples. Another route to increasing the imaging rate is by operating multiple conventional speed AFM cantilevers in parallel [10]. However, the fabrication and signal processing required by large cantilever arrays raise significant challenges of their own. The most successful approaches to overcoming the speed limitation of SPMs to date are custom-built instruments collectively classed as high-speed or video-rate SPMs. These operate at tip-sample velocities in the order of mm s⁻¹ to cm s⁻¹ and are developed to overcome the challenges of imaging the fast dynamic behaviour of delicate biological samples [12][13][14]. These devices operate hundreds to thousands of times faster than conventional SPMs and generate tens to thousands of whole images per second [15]. It is this large throughput of pixels, incomparable to conventional SPMs, that makes it interesting to consider the metrological properties of such instruments and to investigate the possibility of converting them into high accuracy and high-speed devices capable of overcoming the challenge of providing nanometre precision across mm² and cm² areas.
In this article we describe the combination of an interferometer based large area XYZ positioning stage (NMM-1 by SIOS, Germany) [4] with a high-speed SPM XY scanner (custom-built at the University of Bristol, UK). With this combination we are able to take information from the NMM-1 and use it to calibrate the properties of the custom high-speed scanner. Then, by merging all the data produced by this combined high-speed SPM we generate traceable large area measurements hundreds of times faster than can be achieved with current SPMs, making such large high-resolution imaging practical and feasible for the first time. We discuss the associated algorithms for data processing and provide detailed discussion of important uncertainty sources related to high-speed measurements and the combination of the interferometric data from slow motion with high-speed data.

Experimental arrangement
The system combines the NMM-1, which has a 25 mm × 25 mm × 5 mm XYZ range, with the custom high-speed XY scanner developed at the University of Bristol, which has a small 5 μm × 5 μm range. These two parts are used to scan the sample in all three axes, including coarse sample approach to the AFM cantilever tip. A simple AFM head is used to detect surface height variations according to the widely used cantilever beam deflection detection scheme [16]. As in some other high-speed systems [15,17,18], the Z feedback is used only to compensate for height variations on a frame-by-frame basis rather than to follow sample topography at each pixel.
During measurement, the high-speed scanner continuously produces images of the sample surface in a scan window several square micrometres in size. Simultaneously, the NMM-1 performs either coarse motion around the sample as requested by the user or follows a pre-defined scan pattern. Instead of recording a single value at every scan location the NMM-1 visits (as would be done in a conventional SPM) we collect whole images continuously at a rate of 250 000 pixels per second. This leads to a significant decrease in the time taken to image a given sample size without the loss in resolution typically observed with conventional AFMs when imaging larger areas.

The custom high-speed XY scanner
To generate the high-speed sample motion with a range of several square micrometres we use a scanner developed at the University of Bristol. This custom XY scan system is based on a combination of piezoelectric stack actuators held within a 2D flexure guidance system as described elsewhere [19]. A low cost audio amplifier (model KSD8251 from EZK, Czech Republic) with an output power of 125 W per channel is used to drive the fast scan piezoactuators at frequencies of up to 5 kHz while a commercial piezo amplifier from ThorLabs is used to drive the slow scan piezoactuators at frequencies up to 10 Hz.
Prior to mounting the XY scanner on the NMM-1, the scan trajectory was measured and characterized for a range of scan frequencies and amplitudes using a simple Michelson interferometer as shown in figure 1. The fast scan axis was driven with a sinusoidal drive signal (to reduce the excitation of higher harmonics at the turning point of each scan line) and the interferometric measurements were recorded with a high-speed simultaneous sampling data acquisition card (PCI-6143 from National Instruments, USA). Figure 2 graphs the results of a series of scan amplitude measurements obtained at a range of different drive frequencies. The stage was driven without a sample mounted using a peak-to-peak drive voltage of 500 mV, as measured at the input to the audio amplifier. The voltage at the output of the amplifier varied with the drive frequency, especially at frequencies below 50 Hz. This is due to the design and specifications of the audio amplifier and results in the strong increase in scan amplitude as the drive frequency increases towards 50 Hz and beyond. The fundamental resonance of the scan stage can also be seen between 2 and 3 kHz. The measured values of stage amplitude for given frequencies and drive signal amplitudes were used for the initial estimation of the size of the high-speed frames as discussed in section 3.
The measured trajectories were also used to linearize the scan data to a conventional equally spaced grid from the nonlinear distribution created by the use of sinusoidal scan velocities. A harmonic function consisting of several higher harmonic sine functions was fitted to the interferometer values and the ratio between the amplitude of the drive frequency and the amplitude of the higher harmonic components was evaluated (as shown in figure 3) and is discussed later in the paper. Figure 4 shows the residual trajectory after removal of the base frequency and the base frequency together with two higher harmonics components. This can be used to estimate the uncertainty components related to data linearization if the higher harmonics components are not treated properly.
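The linearization step described above can be illustrated with a minimal sketch. It assumes an ideal, purely sinusoidal fast-axis trajectory (no higher harmonics), samples taken uniformly in time over one trace between turning points, and simple linear interpolation; the function name `linearize_scan_line` and these simplifications are illustrative assumptions, not the processing code used for the measurements.

```python
import numpy as np

def linearize_scan_line(heights, n_out=None):
    """Resample one fast-axis trace onto an equally spaced position grid.

    Assumes the samples were taken uniformly in time while the stage
    followed x(t) = sin(phase), with phase going from -pi/2 to +pi/2
    (turning point to turning point). Positions are normalised to [-1, 1].
    """
    heights = np.asarray(heights, dtype=float)
    n = heights.size
    if n_out is None:
        n_out = n
    phase = np.linspace(-np.pi / 2, np.pi / 2, n)
    x_sampled = np.sin(phase)                  # non-uniform, increasing positions
    x_uniform = np.linspace(-1.0, 1.0, n_out)  # target equally spaced grid
    return np.interp(x_uniform, x_sampled, heights)
```

A measured trajectory containing higher harmonics could be handled in the same way by replacing `x_sampled` with the fitted harmonic function, provided it remains monotonic over the trace.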
Finally, using the interferometer we can monitor the stability of the amplitude and absolute position of the high-speed stage. Due to imperfections in the electronic and mechanical components there can be small variances in both the amplitude and average value of the high-speed stage. We have found that the offsets of the average value are smaller than 20 nm (including the uncertainty of the interferometer setup) and that they randomly evolve with a characteristic time of milliseconds. The scan amplitude fluctuations were found to be even smaller, less than 1% of the scan range (around 1 μm) used here [18].

The commercial large area XYZ positioning system
For slow and accurate sample movement we have used an NMM-1 from SIOS that is available at the Central European Institute of Technology. This is a unique positioning system featuring very small uncertainties (around 1 nm) in a large 25 mm × 25 mm × 5 mm volume. To reach such performance, the system uses Zerodur ceramic for most of the parts in order to significantly reduce thermal drift. Three interferometers using stabilized HeNe lasers are combined with voice coil actuators to monitor and drive the XYZ positioning stage. The system design is relatively open, thereby enabling novel sensors to be fitted. The software interface is similarly open in that it can be controlled either from a Matlab software environment or via a library in C. We created a simple server application to report the system position and to receive commands via Ethernet so that the computer controlling the high-speed scanner and data capture could also directly control the NMM-1.
It should be noted that much less expensive positioning systems could be used for large area movement if they are equipped with some metrology sensors, ideally interferometers. This is something we are also investigating and will report on in a future paper. However the NMM-1 system from SIOS surpasses other systems available to us in terms of accuracy, which was the reason for choosing it for our experiments.

The combined system with SPM head
A simple SPM head with optical beam deflection detection of cantilever bending [16] was constructed for measuring the topography in a constant height contact mode (i.e. no fast feedback was implemented in the Z axis). Long, low-stiffness probes (model PPP-CONTR from Nanosensors, Switzerland) with a nominal stiffness of 0.2 N m⁻¹ and a tip radius below 10 nm were used during imaging. To determine the calibration factor for the beam deflection detection system, the NMM-1 was used to deflect the cantilever by known amounts; it was stepped sequentially in 1 μm increments towards the sample and back while voltages from the position sensitive detector amplifier were recorded. The data obtained were used for the conversion from deflection to height.
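As an illustration of how such a calibration factor might be extracted, the sketch below fits a straight line to the recorded detector voltages against the known stage displacements. The linear detector response, the function name `deflection_sensitivity` and the argument names are assumptions made for this example only.

```python
import numpy as np

def deflection_sensitivity(z_steps_um, detector_volts):
    """Least-squares line through (voltage, displacement) pairs.

    Returns (slope, intercept) such that
    height_um = slope * volts + intercept,
    assuming the detector responds linearly over the stepped range.
    """
    z = np.asarray(z_steps_um, dtype=float)
    v = np.asarray(detector_volts, dtype=float)
    slope, intercept = np.polyfit(v, z, 1)
    return slope, intercept
```

Raw deflection voltages would then be converted to heights by applying the fitted slope and intercept.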
The combined system including all the parts discussed above is schematically shown in figure 5.

Data processing
As described in section 2, high-speed frames from the custom XY scanner are acquired continuously and independently of the motion of the NMM-1 used for large area positioning. A simultaneous sampling data acquisition card (PCI-6143 from National Instruments, USA) is used to collect both the deflection signal from the SPM head and the drive signals of the high-speed stage. Actual positions reported from the NMM-1 are received via Ethernet communication protocols and are recorded as well to create a set of six data values that can be used for data processing at each measured pixel. To merge height data from sequential high-speed frames together correctly it is necessary to measure or otherwise determine the following parameter sets (see figure 6):
Set 1 The XY centre point of each of the high-speed frames.
Set 2 The X and Y scan amplitude for any given frequency and driving voltage.
Set 3 Small XY offsets of the high-speed scanner centre.
Set 4 The uncertainty in the scan amplitude of the high-speed scanner.
The first two unknowns are necessary for any processing of the whole dataset, while the final two are necessary to estimate or minimize the uncertainty related to the data processing.

Frame size determination and image stitching from interferometer data
Our approach to data processing focuses on maximum utilization of the sensor information available from the NMM-1 (parameter set No. 1) because the interferometers have good traceability. Estimates of the high-speed scanner amplitude and phase (parameter set No. 2), as determined earlier in section 2, are used as the initial parameters for stitching the high-speed frames together. In the simplest approach the high-speed frames could be merged one by one, using the frame shift reported by the NMM-1 and fitting the high-speed frame amplitude to find the best correlation between the overlapping regions of the two frames. This could, however, lead to an accumulation of errors as the number of frames that are successively added is very large (from hundreds to hundreds of thousands). Moreover, if we move the NMM-1 slowly, there will be many frames with overlapping regions in the resulting dataset. This occurs even during a 'fast' scan at the beginning and the end of each line.
In order to fit all the frames together we need to quantify their misfit for the given set of XY positions, XY amplitudes, and XY offsets and amplitude variations (parameter sets No. 3 and No. 4). For every position in the final image (every pixel), all the high-speed frames that contain measured data that can be used to estimate the height value at that position (by interpolation) are first found. All the values at that position, coming from different high-speed frames, should be identical if everything is measured and fitted ideally. Therefore, the variance of all the estimated values for the position is calculated. The resulting factor for determining the total misfit is the sum of these variances, calculated for all the positions in the final image. As discussed above, the number of frames that contain information about the same position in the final image varies with the scanning speed of the NMM-1 positioning stage; in our case it was between 4 (fast scanning with the NMM-1) and 60 (when scanning slowly with the NMM-1).
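The total misfit defined above can be sketched as follows. For brevity this example assigns each frame pixel to the nearest pixel of the final image instead of interpolating, and it treats the amplitude as the full frame width and height; both choices, together with the function and argument names, are assumptions of this sketch rather than a description of the actual processing software.

```python
import numpy as np

def total_misfit(frames, centres, amp_xy, pixel, shape):
    """Sum over final-image pixels of the variance of the frame values there.

    frames  : (n_frames, ny, nx) height values
    centres : (n_frames, 2) frame centre positions (x, y)
    amp_xy  : (ax, ay) full frame width and height, same units as centres
    pixel   : final-image pixel size
    shape   : (rows, cols) of the final image, origin at (0, 0)
    """
    frames = np.asarray(frames, dtype=float)
    _, ny, nx = frames.shape
    # Offsets of the frame pixels relative to the frame centre
    dx = (np.arange(nx) / (nx - 1) - 0.5) * amp_xy[0]
    dy = (np.arange(ny) / (ny - 1) - 0.5) * amp_xy[1]
    count, s1, s2 = np.zeros(shape), np.zeros(shape), np.zeros(shape)
    for frame, (cx, cy) in zip(frames, centres):
        cols = np.rint((cx + dx) / pixel).astype(int)
        rows = np.rint((cy + dy) / pixel).astype(int)
        rr, cc = np.meshgrid(rows, cols, indexing="ij")
        ok = (rr >= 0) & (rr < shape[0]) & (cc >= 0) & (cc < shape[1])
        # Accumulate count, sum and sum of squares per final-image pixel
        np.add.at(count, (rr[ok], cc[ok]), 1)
        np.add.at(s1, (rr[ok], cc[ok]), frame[ok])
        np.add.at(s2, (rr[ok], cc[ok]), frame[ok] ** 2)
    seen = count >= 2                      # variance needs at least two values
    mean = s1[seen] / count[seen]
    var = s2[seen] / count[seen] - mean ** 2
    return float(np.sum(np.clip(var, 0.0, None)))
```

With correctly chosen amplitudes, overlapping frames of a noise-free surface agree and the misfit vanishes; wrong amplitudes map the same sample location to different final pixels and the misfit grows.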
The easiest way to minimize the total misfit between all the frames is to do a 'brute force' search for the minimum variance. If we choose the X and Y amplitude of the high-speed frames as the values that should be found, we can perform this calculation relatively easily. Note that in this way we determine the actual X and Y amplitudes of the high-speed stage using values of XY positions as determined by the (traceable) interferometer data from the NMM-1. This is a much simpler (albeit with potentially higher uncertainty) way to ensure traceability of the amplitude of the high-speed scanner than using independent sensors to directly measure both X and Y amplitudes of the high-speed stage. The result of fitting X and Y amplitudes for one measurement set is shown in figure 7. The position of the minimum in the total misfit shown in figure 7 gives us an estimate for the X and Y amplitudes. The value of the minimum itself should ideally be zero (all the frames are perfectly aligned), but due to interpolation and noise effects there is some residual value observed in the total misfit. The result of merging all the frames is shown in figure 8. Note that this set covered a relatively small area (up to 10 μm × 10 μm), with about 30 overlapping pixels between successive high-speed frames.
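A brute force search of this kind can be sketched as a plain grid evaluation. The callable `misfit(ax, ay)` is a hypothetical stand-in for whatever routine returns the total misfit for a trial pair of amplitudes; the grid ranges are chosen by the user.

```python
import itertools
import numpy as np

def brute_force_amplitudes(misfit, ax_values, ay_values):
    """Evaluate misfit(ax, ay) on a grid and return the minimising pair.

    Also returns the full misfit surface, which can be plotted against
    the trial X and Y amplitudes.
    """
    surface = np.empty((len(ax_values), len(ay_values)))
    grid = itertools.product(enumerate(ax_values), enumerate(ay_values))
    for (i, ax), (j, ay) in grid:
        surface[i, j] = misfit(ax, ay)
    i_min, j_min = np.unravel_index(np.argmin(surface), surface.shape)
    return ax_values[i_min], ay_values[j_min], surface
```

The cost grows with the product of the grid sizes, which is acceptable for two parameters but not for the per-frame parameters discussed in the next section.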

Frame size determination and image stitching via genetic algorithm (GA) optimization method
To treat the unknown values of small shifts of the high-speed stage centre (Set 3 whose values were estimated in section 2) and any unknown amplitude variations, we will need to fit many more parameters. At the very least it will be necessary to look at the individual values of XY shift and scan amplitude for every high-speed frame. This would require fitting hundreds to thousands of parameters, which cannot be easily done by a brute force search or by conventional minimization algorithms. Even though we find that the XY shifts in our experimental arrangement are relatively small we have tried to devise a fitting scheme for treating them. This is in order to create a methodology that could be equally well applied to other high-speed, large area scanning systems where these XY shifts are perhaps more significant.
To fit all the shifts and amplitude variances we are using a GA approach. GAs are a strategy for solving complex optimization problems in a way that is similar to processes of evolution in biological systems [20]. In brief, GAs use a population of possible solutions to the problem that interact via simple crossover and mutation rules in a way that favours the current best solutions but still allows many different random changes, so that the population converges to a globally optimum solution. The details of the application of the GA to our data set are described below. In our case, the unknown values of all the shifts and amplitude variances together represent one member of the population (known as a single chromosome). The global value that should be minimized is again the total misfit described earlier.
Figure 7. A surface plot of the x and y amplitude of the high-speed scanning system obtained during a 10 μm × 10 μm measurement (620 frames in total). The Z axis denotes total misfit. The values of the x and y calculated scan amplitudes were 3.6 μm and 1.8 μm, respectively. Fitting was performed on raw Z data, hence the units of the total misfit are volts.
At the beginning the population is initialized by random numbers introduced as a random walk sequence (we assume that the XY shifts and amplitude variances are continuously, but randomly, changing functions). The following steps are then performed until a desired value of total misfit is reached:
(1) The total misfit is calculated for every member of the population.
(2) The members of the population are sorted in ascending order of their total misfit.
(3) The member with the smallest total misfit (i.e. the first member after sorting) is the winner; this member is kept unchanged for the new population.
(4) The next 60% of members of the new population are created from the present population via random mutation (addition of a random walk sequence again) and crossover of part of their values with those of another member. All this is governed by random number selection; however, in at least 20% of cases some part of their information is copied from the winner.
(5) The rest of the new population is formed from newly created, randomly initialized members.
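The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mutation step size, the way crossover segments are chosen, the fixed number of generations used as the stopping rule, and all names (`ga_minimise`, `random_walk`, `misfit`) are hypothetical choices made for this example.

```python
import numpy as np

def random_walk(rng, n, step):
    """Random walk sequence used for initialisation and mutation."""
    return np.cumsum(rng.normal(0.0, step, n))

def ga_minimise(misfit, n_params, pop_size=50, n_steps=300, step=0.05, seed=0):
    """Minimise misfit(chromosome) with the elitist scheme described above."""
    rng = np.random.default_rng(seed)
    pop = [random_walk(rng, n_params, step) for _ in range(pop_size)]
    n_bred = int(0.6 * pop_size)
    history = []
    for _ in range(n_steps):
        pop.sort(key=misfit)                 # steps (1) and (2)
        winner = pop[0]
        history.append(misfit(winner))
        new_pop = [winner.copy()]            # step (3): keep the winner
        for _ in range(n_bred):              # step (4): mutation and crossover
            parent = pop[rng.integers(pop_size)]
            child = parent + random_walk(rng, n_params, 0.1 * step)
            # In 20% of cases the crossover partner is the winner
            donor = winner if rng.random() < 0.2 else pop[rng.integers(pop_size)]
            a, b = np.sort(rng.integers(0, n_params + 1, size=2))
            child[a:b] = donor[a:b]
            new_pop.append(child)
        while len(new_pop) < pop_size:       # step (5): fresh random members
            new_pop.append(random_walk(rng, n_params, step))
        pop = new_pop
    pop.sort(key=misfit)
    return pop[0], history
```

Because the winner is always carried over unchanged, the best misfit recorded in `history` can never increase from one generation to the next.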
After a large number of steps, the solution should converge to an optimum value, assuming that we have set up the initialization and mutation rules reasonably and the problem is suitable for GA treatment. To test the algorithm, we performed a virtual measurement, using a known surface topography and simulating a sequence of high-speed frames having a very simple X axis shift dependence governed by a sine function. The resulting values of the total misfit in successive steps of the algorithm are shown in figure 9, and the resulting X shift dependence is plotted in figure 10 together with the real values of the X axis shift. In this example we used a population of 150 members and ran the GA for 2000 steps, even though the solution had stabilized after about the 300th step. The GA optimization procedure took about 15 min on a regular computer. In figure 11 the data used for simulation, merged data with no GA treatment and merged data after GA optimization are plotted. Figure 12 shows the result of the fit and the effect of reducing the data misfit from 10⁻¹⁶ m² to 10⁻¹⁷ m².
When we ran the GA on our real high-speed frames we calculated only very small XY shifts with almost no amplitude variance. This corresponds to the results of our characterization of the high-speed scanner as described in section 2 and prior work [18]. In figure 12 the results of two separate runs of the GA are shown. It is worth noting that the GA values are smaller than a one pixel shift in this case. However, if the NMM-1 were replaced with a large area positioning system with lower accuracy, then the systematic and random errors in its position could also be treated as a part of the XY shift in exactly the same manner as discussed here, which would increase the overall value of the XY shift measured.

Results and discussion
In figure 13 a large area measurement of a grating imaged by the setup presented in this article and related data processing is shown. The final image, with a pixel resolution of 8000 × 800 pixels, was merged from 5650 high-speed frames (approximately 120 × 120 pixels each) with an average overlap of ten frames for every point in the final image. This is still quite a high overlap factor and could be reduced if we needed to measure the same total area more quickly. It is worth noting that the measurement presented in figure 13 was performed in 20 min, substantially faster than a conventional AFM scan at the same pixel density. The time for data processing was approximately 5 min. Note that the poor contrast of the image at the right side was caused by an uncompensated sample tilt. The Z height of the NMM-1 was fixed at some value and it was not changed during this particular measurement. Hence, at some point the deflected laser beam was hitting the position sensitive detector near its edge, which introduced some error in the Z height. At present, the typical time necessary for imaging is around 2 min for smaller rectangular scans (about 20 μm lateral scan size) and about 50 min for large rectangular areas (approximately 100 μm lateral scan size). All this depends on the fast scan axis drive frequency and desired resolution. The overall speed is affected by the number of overlapping frames that we have for data processing. In this work this number was intentionally kept very high (10-60 frames contributed to each pixel in the final image) to study the potential contributions to measurement uncertainty. We found that an overlap of about 30% between successive images is enough for data processing and successful image stitching. This means that for a 200 by 200 μm scan area we need at least 100 by 100 images, each with a scan amplitude of 3 μm × 3 μm, if we want to maintain a 30% (1 μm) overlap between frames.
At the present sampling rate, this could take over 3 h to measure if we want to obtain a final image of 8000 pixels × 8000 pixels (representing a 40 nm × 40 nm pixel size), which is still quite a long time. Our next step is therefore to switch to a data acquisition card that will be 20 times faster (up to 5 MS s⁻¹) and will allow us to go to higher speeds by at least a factor of ten. Moreover, it should be noted that if we were just using a conventional AFM head on the NMM-1 (without the custom high-speed XY scanner), it would take at least one order of magnitude longer to perform the same scan as shown in figure 13.
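The frame count quoted for the 200 μm scan area follows from the step between frame centres being the frame size minus the overlap. The short sketch below reproduces that arithmetic; the function name and the assumption that frames are laid out on a regular grid covering the scan length edge to edge are illustrative.

```python
import math

def frames_per_axis(scan_len_um, frame_um, overlap_um):
    """Number of frames along one axis for a regular grid of frames.

    Successive frame centres are separated by (frame size - overlap);
    the first frame starts at one edge and the last must reach the other.
    """
    step = frame_um - overlap_um
    return math.ceil((scan_len_um - frame_um) / step) + 1

# 200 um scan, 3 um frames, 1 um overlap between successive frames
n = frames_per_axis(200, 3, 1)
print(n, "x", n, "frames")
```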
Another important issue is the amount of data that is collected. Figure 13, for example, was assembled from more than 40 GB of data. Handling tens of gigabytes of data is not easy, and if we go to really large area measurements (millimetres or centimetres) data storage could be problematic. A technique to carry out real-time frame merging is therefore very desirable. This could work with small batches of frames (say, 1000) at a time so that only the resulting final image would need to be stored and the majority of the overlapping data could be discarded.

Uncertainty analysis
Since the objective of this work is to explore the potential to build a metrology device using high-speed SPM technology, it is important to discuss some aspects of the uncertainty budget for our measurements. Generally, we can start from the detailed uncertainty budgets for metrological SPMs presented in [21,22] and take into account the effects of tip radius and geometry as described elsewhere [23]. In addition to conventional uncertainty components there are also various effects related to the high-speed measurement and the data processing developed here. These are discussed below.
4.1.1. Uncompensated motion of the high-speed stage due to higher harmonics. Prior to merging all the high-speed frames together we removed the sine distortion caused by harmonic motion of the scanner in the fast scan direction. If the motion of the stage does not correspond to a sine function, this step could introduce significant errors. The amount of higher harmonic components and the effect of their removal were discussed in section 2 and have a minimal effect. We can see that at the scanning frequencies used, the maximum error caused by uncompensated higher harmonic components is around 5 nm, which is less than the pixel size in the measurements that we have performed. However, as shown in figure 2, if we were to switch to a significantly different driving frequency we would need to perform further analysis similar to that presented in section 2 to account for the frequency response of the custom XY scanner.
4.1.2. Hysteresis and friction effects. The motion of the high-speed stage will also exhibit some hysteresis caused by the properties of the piezoceramic transducers. This could be included in the higher harmonics motion uncertainty component, but it is better to discuss it separately together with friction issues. Hysteresis due to the transducers could be prevented by using a charge amplifier for driving the transducers [24] or by using another driving mechanism, e.g. a voice coil [6,9]. The second effect leading to hysteresis is the friction between probe and sample, which causes the cantilever to twist (torsionally deflect) in opposite directions in the forward and reverse scans. In our first measurements we scanned relatively slowly and did not merge forward and reverse scans together, so the effect of hysteresis was negligible. Another possible effect related to friction could be tip and sample wear, which would affect the reliability of the data evaluation. This was not observed in our measurements; however, we encountered it in one of our previous studies [18].
4.1.3. Data noise. As the data fitting procedure is based on the evaluation of the variance of values in successive frames, the presence of noise could affect its performance significantly. In order to determine the sensitivity of the calculations to noise we have simulated data with different types of artefacts and determined the difference of the evaluated data from the ideal data (used for simulation). Three effects were considered: (1) Random noise added to the data, e.g. due to noise on the AD converter. (2) Single frequency noise addition, e.g. due to some multiples of 50 Hz in the electronics. (3) Tip not following the sample surface on downwards motion (parachuting effect).
We found that the most critical of these three effects is the parachuting of the tip because it can distort the data significantly and non-uniformly, leading to poor results from our data fitting routines. However, in our measurements we did not observe any significant parachuting effects that would need taking into account, and the other effects were also negligible compared to the uncertainty in data synchronization and calibration of the cantilever response. A more systematic study of the influence of data noise on the reconstruction of the final image will be presented in a forthcoming paper.
4.1.4. Data synchronization. The performance of the combined high-speed and large area system is also influenced by the synchronization of the data from both parts of the system: the XY position data of the NMM-1 and the height and drive signal data from the high-speed XY scanner. Ideally a simultaneous sampling card would read all the values that are changing in time; however, this is not technically possible here. Values from the NMM-1 are already digitized and need to be synchronized with the high-speed data. In principle, there will always be some delay between the high-speed data and the reported NMM-1 values. This could cause additional offsets and distortion of the high-speed frames within the final image. We found that if the ratio between the high-speed frame rate and the NMM-1 scan velocity was high enough, this effect was negligible. While moving much faster with the NMM-1 we did observe this effect with the present setup, which led to a difference of up to 50 nm in the position of the grating edge for large stage motion in opposite directions.
4.1.5. Higher harmonics effects. Fast motion of the tip can lead to excitation of higher harmonic frequencies of the cantilever, causing image distortion [19,25,26,27]. This is mainly an issue for measurements at much higher frequencies and was not observed in our data at the speeds used in the present study. Generally, the best approach to prevent this effect is to use interferometric tip-sample force detection or a laser vibrometer. These effects have been studied thoroughly for contact mode high-speed AFMs elsewhere [25,27].
In summary, we have identified the contributions to the measurement uncertainty associated with higher scanning speeds, but to fully quantify them is beyond the scope of this paper.

Conclusion
We have presented our approach to create a specialized SPM system with the capability to image large surface areas rapidly. It is based around the integration of a small area high-speed stage with a high accuracy large area scanning system. This is followed by advanced data processing to make use of the accurate position information from the large area system to calibrate the high-speed stage motion, including variances of the amplitude and position of every high-speed frame.
The device presented here is quite complex, but we have demonstrated that using very special and expensive equipment as building blocks we can achieve high accuracy with a minimum of additional uncertainty components. Moreover, we can treat this as a basis for creating simpler devices with similar design principles and data processing steps. It is also worth noting that the software for data processing can be easily adapted to another system combining a high-speed stage with a large area positioning system. We also showed that the GA used for multiple parameter fitting can be easily modified to determine the desired quantities and dependences (high-speed frame positions or amplitudes, hysteresis effects in high-speed images). Even if the GA approach cannot substitute for the use of the true (analytically known) dependences of different parameters in SPM data evaluation, we believe that in this case, when many different effects are mixed together, it is a good alternative.
Figure 13. 210 μm × 18 μm measurement on a calibration grating. The resolution of the image is 4094 × 340 pixels and the pixel size is 50 nm in both axes.