Interlaboratory comparison measurements of aspheres

The need for high-quality aspheres is rapidly growing, necessitating increased accuracy in their measurement. A reliable uncertainty assessment of asphere form measurement techniques is difficult due to their complexity. In order to explore the accuracy of current asphere form measurement techniques, an interlaboratory comparison was carried out in which four aspheres were measured by eight laboratories using tactile measurements, optical point measurements, and optical areal measurements. Altogether, 12 different devices were employed. The measurement results were analysed after subtracting the design topography and subsequently a best-fit sphere from the measurements. The surface reduced in this way was compared to a reference topography that was obtained by taking the pointwise median across the ensemble of reduced topographies on a 1000×1000 Cartesian grid. The deviations of the reduced topographies from the reference topography were analysed in terms of several characteristics including peak-to-valley and root-mean-square deviations. Root-mean-square deviations of the reduced topographies from the reference topographies were found to be on the order of some tens of nanometres up to 89 nm, with most of the deviations being smaller than 20 nm. Our results give an indication of the accuracy that can currently be expected in form measurements of aspheres.


Introduction
In recent years, aspheric lenses have started to play an increasingly important role in a wide range of optical applications [1]. The requirements placed on the quality of these aspheres are high; currently, the available capabilities of asphere form measurement techniques limit the accuracy of asphere production [2][3][4]. Due to the complexity of current form measurement techniques, reliable uncertainty assessments are difficult to make.
In a project carried out in 2015 and 2016, CC UPOB e.V. investigated the state of the art in measuring aspheres with a variety of techniques and instruments. CC UPOB e.V. [5] is an association dedicated to developing competence in manufacturing and to characterizing technical surfaces with ultrahigh precision. This competence centre joins the efforts and capabilities of companies, universities, and research institutes to further expedite asphere and freeform metrology (see, e.g. [6][7][8]). Another attempt to compare different measurement principles was carried out in an EMRP project in 2015 [9]. However, the number of specimens and participants brought together in this study by CC UPOB e.V. is unprecedented.
Within the scope of this project, four specimens were measured by eight laboratories using 12 different instruments, yielding the 29 measurements considered in the analysis. This paper presents and assesses the results of these measurements. Since measurement uncertainties have not been made available by all participants, formal checking of the accuracies claimed is not possible. Nevertheless, our quantitative analysis suggests an estimate of the accuracy that can be expected in current asphere surface measurements.
The paper is organized as follows. Section 2 gives an overview of the measurement methods and devices employed. In section 3, the specimens measured are described, and in section 4, the data analysis methods are introduced. In section 5, the results are then presented in anonymised form, and a discussion and concluding remarks follow in section 6.

Measurement methods and devices
The following measurement methods were applied within this project: tactile measurements [10]; optical point measurements [11]; and optical areal measurements. The latter group of measurements can be divided into four subgroups: measurements using a computer generated hologram (CGH) [12]; sub-aperture stitching methods [13]; full-aperture interferometric methods [14,15]; and deflectometric methods [16].
Tactile coordinate measuring machines (CMMs) use a stylus to obtain a pointwise scan of the specimen surface. CMMs have the advantage of providing absolute form information, but this process is slow and gives only pointwise information [17], even though the point density in the scanning direction can be very high. The tactile devices used in the study were the Isara 400 [18]; MarSurf LD260 Aspheric 3D [19]; UA3P 3D Profilometer [20]; and Taylor Hobson PGI [21].
Optical point measurements are significantly faster than tactile CMMs, but provide only pointwise information. In addition, the optical point has a considerable width of some micrometres, limiting the lateral resolution. On the other hand, the measurement procedure is contact-less, thereby sparing the surface from potential scratches or digs. The optical point sensors used were MarForm MFU200 Aspheric 3D [19,22] and LuphoScan [23,24].
CGHs are expensive and must be re-manufactured for every new design [25]. Thus, they are of economic value only if many measurements are conducted for the same design or if the specimens measured are very expensive. Initial adjustment of a conventional CGH is time-consuming and, if not done accurately, can lead to additional measurement errors [12,25,26]. However, these errors can be reduced by combining the CGH and the reference wave generating surface [26]. Furthermore, single measurements with CGHs are fast [25]. Here, a Zygo GPI interferometer and a TRIOPTICS μPhase interferometer [27] were used in combination with a CGH.
Areal measurements without a CGH are fast and yield a high point density; however, depending on the specific method used, they require stitching or elaborate computational processes. In sub-aperture interferometry, small sections of the specimen are measured in such a way that the section's deviation from a spherical or planar shape is small. From a large number of sub-aperture interferograms, small topography sectors are computed and stitched together to yield a full-aperture topography [13]. This can also be a time-consuming process. For the sub-aperture stitching, an SSI-A interferometer from QED [28] was used.
The tilted-wave interferometer (TWI) is a recently developed optical areal measurement principle [29] that uses a source array to illuminate the specimen from several angles. For data analysis purposes, the measurement process is simulated. In the simulation, the specimen surface is adjusted until the simulated data match the measurement data [15]. This method is advantageous because the instrument has no moving parts during measurement. On the other hand, careful calibration is necessary in order to distinguish between the specimen's surface and retrace effects [30]. Due to the novelty of this method, one device used in this study was a preliminary lab breadboard setup.
In large-angle deflectometry (e.g. phase measuring deflectometry), the reflection of a pattern from the specimen's surface is observed by a camera. The form of the surface is calculated from distortions of the image. Phase measuring deflectometry is suitable for obtaining high-spatial-frequency information but has difficulty obtaining low-spatial-frequency form information [31].
In total, four tactile CMMs, two optical CMMs, and six areal instruments were used in this study. For the sake of anonymity, they are labelled T1-T4 (tactile CMMs), O1-O2 (optical CMMs), and A1-A6 (areal measurements) in the following.

Specimens
Four different aspheres were measured (see figure 1). The aspheres were chosen in such a way that a wide variety of possible designs could be measured with a small number of specimens. Asphere 1 is a weak asphere from series production at Leica Camera AG [32]. It has a diameter of 39 mm. Asphere 2 was provided by Schneider GmbH & Co. KG [33] and NTG GmbH [34]. It consists of two spherical parts with differing curvature radii: the curvature radius of the inner part is 250 mm, while the outer annulus has a curvature radius of 150 mm. The inner segment has an aperture diameter of 25 mm; the diameter of the outer segment is 60 mm. The inner segment contains a Siemens star-like structure. Siemens stars are used to assess the resolution capabilities of cameras and other optical devices [35]. Asphere 3 was provided by Thales Angenieux [36], and contains a turning point connecting a convex part in the centre and a concave annulus. The specimen's diameter is 60 mm. Asphere 4 was provided by Schneider GmbH & Co. KG. It is a strong asphere with a diameter of 40 mm. A brief overview of the specimens is also given in table 1. Aspheres 1, 3, and 4 can be described by the asphere formula [3]:
$$z(r) = \frac{r^2}{R\left(1 + \sqrt{1 - (1+\kappa)\, r^2/R^2}\right)} + \sum_i \alpha_i r^i \qquad (1)$$
where $z$ is the surface sag at the radial distance $r$ from the optical axis; $R$ is the vertex radius of curvature; $\kappa$ is the conic constant; and $\alpha_i$ are further coefficients describing the asphericity. The asphericity coefficients of aspheres 1, 3, and 4 are given in table 2.
Figure 1. The specimens used in this study from left to right: asphere 1, weak asphere; asphere 2, two radii and Siemens star pattern in central part; asphere 3, containing turning point; asphere 4, strong asphere. The specimens have diameters of 39 mm, 60 mm, 60 mm, and 40 mm, respectively (see table 2).
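The asphere formula referred to above can be evaluated numerically. The following is a minimal sketch, assuming the common form of the sag equation (a conic term plus a polynomial in the radial coordinate); the function name `asphere_sag`, the dictionary layout of `alphas` (power mapped to coefficient), and the numerical values are illustrative assumptions and are not the design data of the specimens.

```python
import numpy as np

def asphere_sag(r, R, kappa, alphas=None):
    """Sag z(r) of a rotationally symmetric asphere.

    r      : radial distance(s) from the optical axis
    R      : vertex radius of curvature
    kappa  : conic constant
    alphas : optional dict {power: coefficient} for the polynomial terms
             (hypothetical layout chosen for this sketch)
    """
    r = np.asarray(r, dtype=float)
    # Conic contribution: r^2 / (R * (1 + sqrt(1 - (1 + kappa) r^2 / R^2)))
    conic = r**2 / (R * (1.0 + np.sqrt(1.0 - (1.0 + kappa) * r**2 / R**2)))
    # Polynomial contribution: sum_i alpha_i * r^i
    poly = np.zeros_like(r)
    for power, coeff in (alphas or {}).items():
        poly = poly + coeff * r**power
    return conic + poly

# Illustrative values only (lengths in mm): a paraboloid with one r^4 term
example_sag = asphere_sag(np.linspace(0.0, 20.0, 5), R=100.0, kappa=-1.0,
                          alphas={4: 1e-7})
```

With `kappa = 0` and no polynomial terms, the expression reduces to the sag of a sphere of radius `R`, which is a convenient sanity check.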
The CGHs used in this study for aspheres 1, 2, and 3 were provided by DIOPTIC GmbH [37]. The CGH for asphere 4 was provided by Schneider GmbH & Co. KG.

Data analysis
The measurement results were preprocessed and thereafter compared to a reference surface. This section describes how the preprocessing steps were carried out, along with the data analysis subsequently performed. The data processing procedures applied in this study for removing the design topography are based on the strategy introduced in [38]: in the first step, all measurement data were adapted to have a common format, and the analysis was performed in Cartesian coordinates. Some of the measurements were given as deviations from the design topography, while others yielded the complete topography. The subsequent analyses were performed in terms of deviations from the design topography. The design was subtracted from the measurements that referred to the complete topography in the following way: the measurement data were aligned with the design topography by minimizing the differences between the measurement and the design in a least-squares sense, allowing shifts of the measurement point cloud along the three Cartesian axes and rotations about the x- and y-axes. This fitting took place by means of a tool developed at PTB (an advanced form of the tool described in [39]) that uses MATLAB® [40]. The design data (the nominal surface data) were then subtracted from the measured surface data (the measured topography data). After this point, all measurement data were processed the same way.
Figure 3. Virtual reference topography (VRT) (see (2) and (3)), for asphere 1. Measurement reduction consisted of subtracting the design data and thereafter a best-fit sphere prior to median computation. The median was computed pointwise on a regular Cartesian grid on which all measurement data were resampled using spline interpolation. The resulting VRT has an RMS of 7 nm and an MAD of 4 nm.
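The least-squares alignment of a measured point cloud to the design, with three shifts and rotations about the x- and y-axes as free parameters, can be sketched as follows. This is not the PTB tool; it is a minimal illustration that assumes a hypothetical conic design (`design_sag`), uses the height difference along z as the residual, and relies on `scipy.optimize.least_squares`. All function names and numerical values are assumptions made for the sketch.

```python
import numpy as np
from scipy.optimize import least_squares

def design_sag(x, y, R=50.0, kappa=-1.0):
    """Hypothetical rotationally symmetric design (conic term only), in mm."""
    r2 = x**2 + y**2
    return r2 / (R * (1.0 + np.sqrt(1.0 - (1.0 + kappa) * r2 / R**2)))

def rotation_xy(rx, ry):
    """Rotation about the x-axis followed by a rotation about the y-axis."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    return rot_y @ rot_x

def move_points(params, points):
    """Apply the five rigid-body parameters (tx, ty, tz, rx, ry) to an (N, 3) cloud."""
    tx, ty, tz, rx, ry = params
    return points @ rotation_xy(rx, ry).T + np.array([tx, ty, tz])

def height_residuals(params, points, design=design_sag):
    """Height of the moved points above the design surface."""
    moved = move_points(params, points)
    return moved[:, 2] - design(moved[:, 0], moved[:, 1])

def align_to_design(points, design=design_sag):
    """Least-squares fit of the five parameters; returns them and the residuals."""
    fit = least_squares(height_residuals, x0=np.zeros(5),
                        args=(points, design), method="lm")
    return fit.x, height_residuals(fit.x, points, design)
```

The residuals returned by `align_to_design` correspond to the deviations from the design after alignment. Rotation about the z-axis is deliberately left out of the parameter set, because a rotationally symmetric design cannot constrain it.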
The differences between the measurements and the design topography still showed significant spherical surface contributions due to the fact that many devices cannot determine the spherical component of the measured surface unambiguously. Therefore, the spherical surface contribution was removed by additionally subtracting a best-fit sphere (BFS). The BFS was determined using the Levenberg-Marquardt algorithm [41, chapter 5.2] as implemented in MATLAB® [40]. The radii of the BFSs subtracted from each topography are shown in figure 2 in section 5. As expected, there were significant differences between the different measurements. This confirms the fact that, without subtracting a best-fit sphere from the results, the results will be dominated by the influence of the measurement error of the spherical contribution, and a meaningful comparison of the remaining form characteristics will not be possible. The next step was outlier removal. Data points that had an absolute difference to the ensemble median 7 times larger than the median absolute deviation (MAD) of the data set were removed.
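Both steps described above, the best-fit sphere subtraction and the MAD-based outlier removal, can be illustrated in a few lines. The sketch below is written in Python with SciPy rather than MATLAB and makes several simplifying assumptions that are not taken from the study: the sphere is centred on the z-axis, it is parameterized by its curvature (so that nearly flat residuals with very large radii remain well behaved), and only curvature and a height offset are fitted. The function names are hypothetical. For the outlier mask, the reference defaults to the median of the data set itself; an ensemble median can be passed in via `reference`.

```python
import numpy as np
from scipy.optimize import least_squares

def sphere_sag(x, y, curvature, z0):
    """Height of a sphere centred on the z-axis; curvature = 1 / radius."""
    r2 = x**2 + y**2
    return z0 + curvature * r2 / (1.0 + np.sqrt(1.0 - curvature**2 * r2))

def fit_best_fit_sphere(x, y, z):
    """Fit curvature and offset with Levenberg-Marquardt and subtract the sphere."""
    def resid(p):
        return sphere_sag(x, y, p[0], p[1]) - z
    fit = least_squares(resid, x0=[0.0, float(np.mean(z))], method="lm")
    curvature, z0 = fit.x
    return curvature, z0, z - sphere_sag(x, y, curvature, z0)

def mad_outlier_mask(values, reference=None, factor=7.0):
    """True for points whose distance to the reference is at most factor * MAD."""
    values = np.asarray(values, dtype=float)
    if reference is None:
        reference = np.median(values)
    mad = np.median(np.abs(values - np.median(values)))
    return np.abs(values - reference) <= factor * mad
```

The BFS radius follows as `1 / curvature`; a curvature close to zero corresponds to the very large radii reported for almost flat residuals.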
Finally, the data were rotated about the z-axis in such a way that the correlation to the same reference residual data set (chosen from the measurement data sets at random) was maximized. This ensured that for a given specimen all residuals had the same orientation for further analysis.
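The rotational alignment can be illustrated with a brute-force search over candidate angles. The sketch below assumes that both residual maps have already been resampled onto the same square grid, and it uses `scipy.ndimage.rotate` with linear interpolation and a 1 degree step; the function name, the step size, and the use of the Pearson correlation inside a circular mask are assumptions for illustration, not a description of the procedure actually used.

```python
import numpy as np
from scipy.ndimage import rotate

def best_rotation_angle(moving, reference, mask, angles=None):
    """Return the angle (degrees) that maximizes the correlation with the reference."""
    if angles is None:
        angles = np.arange(0.0, 360.0, 1.0)
    best_angle, best_corr = 0.0, -np.inf
    for angle in angles:
        # Rotate about the array centre without changing the array shape
        turned = rotate(moving, angle, reshape=False, order=1,
                        mode="constant", cval=0.0)
        # Correlation evaluated only inside the mask (e.g. the common aperture)
        corr = np.corrcoef(turned[mask], reference[mask])[0, 1]
        if corr > best_corr:
            best_angle, best_corr = float(angle), float(corr)
    return best_angle, best_corr
```

The mask should be somewhat smaller than the aperture so that pixels filled in from outside the data do not enter the correlation. A finer search around the best coarse angle could follow if sub-degree alignment is needed.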
For asphere 2, the design was removed in a two-step procedure. First, the spherical component of the design was removed, leaving a possible spherical contribution with a different radius and the Siemens star pattern. After that, the BFS was subtracted. Then, the Siemens star pattern was removed using the aforementioned PTB tool. The reason for this procedure was that the BFSs had to be removed before removing the Siemens star pattern; otherwise, the BFS would have dominated the differences between the measurements and the Siemens star pattern design when aligning the residuals with a reference.
The topography obtained after completing these steps is called the reduced topography $\tilde{T}_n$:
$$\tilde{T}_n(x_i, y_i) = T_n(x_i, y_i) - D(x_i, y_i) - S^{\mathrm{bf}}_n(x_i, y_i) \qquad (2)$$
where $T_n(x_i, y_i)$ is the topography of the nth measurement at point $(x_i, y_i)$; $D(x_i, y_i)$ is the design topography; and $S^{\mathrm{bf}}_n(x_i, y_i)$ is the value of the best-fit sphere for measurement n.
Since the specimens' true forms, and thus their deviations from the design topography, are unknown, the pointwise median of reduced topographies across the measurement ensemble (see column 3 of table 1) was computed in order to function as a virtual reference surface for those deviations:
$$V(x_i, y_i) = \operatorname{median}_n \, \tilde{T}_n(x_i, y_i) \qquad (3)$$
We will refer to this virtual reference topography as the VRT.
In the absence of measurement uncertainties, we chose the pointwise median as a robust estimator [42] for the mean value of the measurement ensemble. It disregards single large deviations and is a robust strategy to find a reference that approximates the true surface. The differences between the VRT and each reduced topography will be called the difference topography, defined as
$$\Delta_n(x_i, y_i) = \tilde{T}_n(x_i, y_i) - V(x_i, y_i) \qquad (4)$$
The reduced topographies, VRTs, and difference topographies were calculated pointwise on a regular 1000×1000 Cartesian grid on which the measurement data were resampled using spline interpolation.
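The pointwise median and the difference topographies are straightforward to compute once all reduced topographies live on the same grid. The sketch below is a minimal illustration under stated assumptions: scattered data are resampled with `scipy.interpolate.griddata` using its piecewise cubic method as a stand-in for the spline interpolation mentioned in the text, grid nodes outside the measured area become NaN, and the median ignores NaN entries so that measurements with smaller coverage do not remove grid nodes from the reference. Function names are hypothetical.

```python
import numpy as np
from scipy.interpolate import griddata

def resample_to_grid(x, y, z, grid_x, grid_y):
    """Interpolate scattered (x, y, z) data onto a regular grid (NaN outside the data)."""
    return griddata((x, y), z, (grid_x, grid_y), method="cubic")

def virtual_reference(reduced_stack):
    """Pointwise median across the ensemble; reduced_stack has shape (n_meas, ny, nx)."""
    return np.nanmedian(reduced_stack, axis=0)

def difference_topographies(reduced_stack):
    """Reduced topography minus virtual reference, for every measurement."""
    return reduced_stack - virtual_reference(reduced_stack)

# Usage sketch on a small grid (the study used 1000 x 1000 nodes)
grid_x, grid_y = np.meshgrid(np.linspace(-1.0, 1.0, 50), np.linspace(-1.0, 1.0, 50))
```

Because the median is taken node by node, a single measurement with a large local deviation does not pull the reference towards it, which is the robustness property the text relies on.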

Results
As mentioned in section 4, BFSs were subtracted from the measurements' deviations from the design topography. The radii of these BFSs are shown in figure 2. They show significant outliers for aspheres 1, 2, and 4, the rest of the radius values being of the same magnitude. A possible reason for the outliers is that BFSs may already have been subtracted prior to data delivery. Only for asphere 3 are all of the values very large, indicating almost flat surfaces. The radius for measurement A6 of asphere 4 is quite small at 15.72 m, indicating a considerable curvature. Table 3 shows the root-mean-square error (RMS), median absolute deviation (MAD), and peak-to-valley (PV) values for each of the difference topographies. The MAD is also considered because it is a robust measure of the variability of measurement values [42].
Figure 5. Virtual reference topography (VRT) (see (2) and (3)), for asphere 2. Measurement reduction consisted of subtracting the design data and thereafter a best-fit sphere prior to median computation. The median was computed pointwise on a regular Cartesian grid on which all measurement data were resampled using spline interpolation. The resulting VRT has an RMS of 12 nm and an MAD of 7 nm.
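The three characteristic values can be computed from a difference topography as sketched below. The definitions used here are assumptions where the text leaves room: the RMS is taken as the root of the mean squared deviation without subtracting the mean, the MAD is the median absolute deviation about the median, and NaN nodes (outside the measured area) are ignored. The function name is hypothetical.

```python
import numpy as np

def form_statistics(diff):
    """Return (RMS, MAD, PV) of a difference topography, ignoring NaN nodes."""
    values = np.asarray(diff, dtype=float)
    values = values[np.isfinite(values)]
    rms = float(np.sqrt(np.mean(values**2)))
    mad = float(np.median(np.abs(values - np.median(values))))
    pv = float(values.max() - values.min())
    return rms, mad, pv

# Usage sketch with made-up numbers (nm); the single spike dominates the PV value
rms, mad, pv = form_statistics([-2.0, -1.0, 0.0, 1.0, 2.0, 150.0, np.nan])
```

The made-up example shows why the MAD is reported alongside the RMS and PV: one isolated spike, such as a dust particle, changes the PV and RMS strongly while the MAD stays close to the typical deviation.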

Asphere 1
For asphere 1, the RMS of the difference topography ranges between 6 and 27 nm; the MAD ranges between 3 and 21 nm (see table 3). In the whole ensemble, the variability of RMS, MAD, and PV values is small. No order between measurement principles can be identified (see table 3 for a complete listing of RMS, MAD, and PV values). The VRT of specimen 1 (see figure 3) has very low values. It has a PV of 102 nm and an RMS of 7 nm, which suggests good manufacturing quality of this weak asphere. The difference topographies suggest that measurements T2 and O2 might have systematic error influences for this specific measurement task (figure 4).

Asphere 2
In the analysis of asphere 2, only the inner segment was considered. The VRT of specimen 2 (figure 5) still shows some contributions of the Siemens star structure. This suggests that these parts of the structure were manufactured with uncertainties in the range of the deviations shown. The RMS values of the difference topographies range up to about 20 nm, but most of the values are smaller than 10 nm (see table 3). Most of the MAD values are smaller than or equal to 5 nm; only one value reaches 12 nm. The PV values are smaller than 200 nm, with most values being smaller than 100 nm. The relatively large PV values are mostly due to small outliers that could be caused by small disturbances such as dust particles. The sphericity of the design sphere was captured well by all of the participants, which is reflected by relatively low values of the difference topographies in the spherical section of the specimens (see figure 6, top half of each aperture). The areal methods seem to have slightly more difficulty reproducing the Siemens star pattern, which results in slightly higher characteristic values in table 3. Figure 7 shows profiles through the structure at an 8 mm radius. The profile path is shown in the left panel of figure 7. Only measurement A4 shows some larger deviations over the whole profile. The other profiles agree very well with the design along the profile line. The difference topographies suggest that measurements A1, A2, and A4 have systematic error influences for this specific measurement task.
Figure 8. Virtual reference topography (VRT) (see (2) and (3)), for asphere 3. Measurement reduction consisted of subtracting the design data and thereafter a best-fit sphere prior to median computation. The median was computed pointwise on a regular Cartesian grid on which all measurement data were resampled using spline interpolation. The resulting VRT has an RMS of 25 nm and an MAD of 14 nm.
Figure 9. Difference topographies: differences between reduced topographies (best-fit sphere corrected deviation of measurements from the design) and virtual reference topography (VRT), i.e. pointwise median of reduced topographies, for asphere 3. For the definition of difference topography, see (4). Ti: tactile CMMs, Oj: optical CMMs, Ak: areal measurements.

Asphere 3
For asphere 3, the difference topographies' RMS and MAD values range from a few tens of nanometres up to 28 nm RMS and 20 nm MAD (see table 3). PV values range from 82 nm up to 239 nm. The variability of the RMS and MAD values is comparable to asphere 1 and slightly larger than for aspheres 2 and 4, provided that measurement A6 is not considered for asphere 4 since it is a clear outlier. There is no significant difference between the three measurement principles. The optical point measurements (green values in figure 12) have slightly smaller RMS and MAD values than the other methods. For this specific measurement task, the difference topographies suggest that measurement T2 might have systematic error influences (see figure 9). The VRT of specimen 3 (figure 8) shows some rotationally symmetric features. We may conclude that the specimen was manufactured with the error shown.

Asphere 4
The MAD values are slightly smaller at 4 nm to 47 nm. The PV values range from 141 nm to about one micrometre, with most of the values being smaller than or equal to 250 nm. For this strong asphere, the results shown in figure 11 suggest that measurements T2 and A6 might have systematic error influences for this specific measurement task. Except for measurement A6, all residuals are of the same magnitude. The VRT of this strong asphere (figure 10) has a PV of 299 nm and an RMS of 56 nm, and the deviations shown seem to be polishing errors in the manufacturing process. We observe that, for the specimen with the strongest asphericity in this study, the deviations between measurement data and design data are larger than for the other specimens. However, since the aspheres were not chosen with the primary focus on exact manufacturing, this may not be significant.
Figure 12 shows plots of all RMS, MAD, and PV values. The variability between measurements for one asphere is quite small. There is no group of devices (colour coded) that stands out from the rest in one direction or the other (see table 3 for characteristic values). In every group of devices there are instruments with very low characteristic values. Only measurement A6 shows significantly larger characteristic values for asphere 4 than the rest of the measurements.
Figure 10. Virtual reference topography (VRT) (see (2) and (3)), for asphere 4. Measurement reduction consisted of subtracting the design data and thereafter a best-fit sphere prior to median computation. The median was computed pointwise on a regular Cartesian grid on which all measurement data were resampled using spline interpolation. The resulting VRT has an RMS of 56 nm and an MAD of 41 nm.

Discussion and conclusions
The aim of this study was to explore the state of the art in asphere metrology for a variety of measurement devices and different measurement tasks by measuring an ensemble of specimens. The measurement results of the different instruments differ in data point density and grid layout (e.g. Cartesian, spiral, concentric circles or profiles through the centre). Therefore, a sophisticated evaluation procedure has been developed that allows a comparison to take place without penalizing the results that have low data point density.
Since the true forms of the specimens are unknown, virtual reference topographies were chosen. For this purpose, pointwise median topographies were computed. Depending on the specimen, MAD values of the difference topographies were found to be in a range from a few nanometres to about 50 nm.
The difference topographies suggest that, depending on the measurement task, some measurement systems may have systematic error influences. Note, however, that a VRT does not represent the true topography. Nevertheless, very different measurement principles contributed data to building each VRT and the measurements' systematic deviations from these VRTs are apparent for different measurement principles (tactile point measurement, optical point measurement, optical areal measurement). Therefore, comparing the reduced topographies to the VRTs seems to be a fair choice, since no measurement uncertainties were available.
The results of this study show that the variability between the different measurement systems is as large as the variability between the different measurement principles (tactile point measurement, optical point measurement, and optical areal measurement). There are pointwise measurement systems as well as areal measurement systems with results close to the VRTs. The data available do not suggest that one measurement principle is superior to another. Nevertheless, the results of some measurement systems are close to the VRTs for all measured samples.
The VRT is a good estimate of the deviation of the manufactured specimen from its design topography (except for an unknown spherical component), and can therefore be used as an indicator of the manufacturing accuracy of the specimens used in this study. The VRTs of the four specimens in this study show values that increase with asphericity. This may indicate that the manufacturing process is more challenging for stronger aspheres. Note that this does not necessarily reveal the best possible manufacturing accuracy currently available for each specimen type, as the efforts put into form optimization depend on the customers' specifications.
It is important to note that the observed deviations from the VRTs are based on differences between measurement data and design topographies from which a best-fit sphere was subtracted for every measurement data set. Therefore, the total measurement uncertainties, which include potential errors in the spherical component of the measurement results (see figure 2), are expected to be higher.