Multi-frame linear regressive filter for the measurement of infrared pixel spatial response and MTF from sparse data

: A challenging point in the prediction of the image quality of infrared imaging systems is the evaluation of the detector modulation transfer function (MTF). In this paper, we present a linear method to get a 2D continuous MTF from sparse spectral data. Within the method, an object with a predictable sparse spatial spectrum is imaged by the focal plane array. The sparse data is then treated to return the 2D continuous MTF with the hypothesis that all the pixels have an identical spatial response. The linearity of the treatment is a key point to estimate directly the error bars of the resulting detector MTF. The test bench will be presented along with measurement tests on a 25 µm pitch InGaAs detector.


Introduction
The current evolution of infrared focal plane arrays (IRFPA) is accompanied by a reduction in pixel size, close to the wavelength, especially for HgCdTe IRFPAs [1,2]. Thus, it becomes challenging to measure their spatial response. Optical system designers need to know the modulation transfer function (MTF) of a detector to evaluate the impact of the spatial filtering of the pixel on the final image [3][4][5]. This generally requires measuring the MTF of the pixel up to twice the Nyquist frequency of the detector. With regard to the manufacturer of the detector itself, the main goal is to have the clearest view of the pixel shape -which mainly depends on electron scattering and cross-talk effect -in order to better understand physical phenomena and optimize the design of the pixel (doping level, geometry...). In this case, it can be interesting to measure the pixel MTF up to higher frequencies, typically up to four times the Nyquist frequency.
Among the different MTF measurement methods, two categories can be distinguished. On the one hand, local methods such as spot scan, are based on the hypothesis that pixel response is time invariant during measurement. On the other hand, global methods rely on the hypothesis that all pixels of the focal plane array (FPA) have the same spatial response. In such methods, a known pattern O(x, y) is projected on the FPA and a specific treatment allows to determine the global MTF of a mean pixel. For instance an array of spots can be used to create a "global" spot scan method [6]. The method presented in this paper belongs to this second category.
In our case, we look at specific objects O(x, y) whose Fourier transform, O(f x , f y ), is composed of discrete spatial frequencies (f x i , f y j ). The sparse property of O is crucial as it concentrates the spectral content into sparse data and increase the Signal to Noise Ratio (SNR). In this paper, we propose a measurement method using this sparse property to obtain both the spatial pixel response and MTF. We illustrate our method with Continuously Self-Imaging Gratings (CSIGs), which were designed at the ONERA [7,8]. Their principle is illustrated in Fig. 1 and detailed hereafter. Still, the method can be applied, with little adjustments, to any measurement based on objects with predictable sparse transfer function.
A CSIG has a transmittance pattern [ Fig. 1(a)] mathematically defined in the Fourier domain. The N diffracted orders [ Fig. 1(b)], named CSIG orders, are at the intersection of a circle and a 2D periodic grid [9]. When illuminated by a polychromatic plane wave, the transmittance produces a field whose intensity profile is a propagation-invariant bi-periodic array of bright spots [Fig. 1(c)]. The Fourier Transform of the produced intensity pattern is composed of discrete spatial frequencies [ Fig. 1(d)], named CSIG harmonics, displayed within a disk of radius F max = 2η/a 0 where a 0 is the period of the periodic pattern and η 2 is an integer that describes the CSIG geometry, determining N, the number of diffracted orders [10]. The greater the number of orders is, the denser the harmonics are.
Those frequencies excited by the CSIG constitute the sparse data from which we can extract the pixel MTF. The radius F max of the disk gives the highest measurable spatial frequency. Choosing the right CSIG parameters (η, a 0 ) allows us to adapt F max to the pixel size and measurement requirements. The invariant propagative image O(x, y) is then captured by the FPA which we aim to characterize. Thanks to the z invariance property, the CSIG method does not require focusing. This has two advantages: first, the working distance between the network and the detector is adaptable, which provides simplicity and flexibility in positioning the elements of the optical bench. Second, the method is free from deconvolution of a focus lens, which can be difficult to estimate and cause additional error. For most of the experiments and simulations in this paper, we take the case of a 25 µm pitch InGaAs detector. We note I(x, y) the resulting image captured by the FPA. In order to extract the MTF from the image I, we search to divide I by the initial spectrum of the pattern O which is composed of the theoretical CSIG harmonics which can be easily modeled [ Fig. 1(d)]. The main difficulty here is that O is equal to zero almost everywhere so that the fraction I/ O is not well defined. To overcome the difficulty, two deconvolution methods have been developed. The original one consists in extracting from the image only discrete spatial frequencies that correspond to the CSIG harmonics positions, and then interpolate them to get a continuous MTF [10]. In this paper, we propose a new methodology which consists in applying an apodization window onto both I and O to get rid of the zero areas. The strengths of this second approach are in its linearity and in its ability to provide a direct measure of continuous 2D MTF. However, the windowing process needs a crucial hypothesis -generally accepted in the infrared photo-detection community -that cross-talk effects are negligible beyond the first eight neighbours of each pixel [11].
Moreover, since we aim at measuring the MTF beyond the Nyquist frequency, we have to deal with aliasing effect. Unfolding methods are often used to reconstruct the original signal, but here we made the alternative choice to raise the Nyquist frequency by a original spatial 2D over-sampling. The deconvolution algorithm will be detailed in section 2, the link between density of sparse data and the accuracy of the method is discussed in section 3 and section 4 presents simulation of the method on a modeled pixel shape. Noise propagation estimation is studied in section 5. The section 6 presents the experimental test bench, insisting on the over-sampling method used, and finally MTF measurement of a 25 µm pitch InGaAs detector will be presented in section 7.

Description of the deconvolution algorithm
The objective of the algorithm is to get the continuous 2D pixel Transfer Function, TF PIX . This function is the Fourier transform of the Point Spread Function of the pixel PSF pixel . The Modulation Transfer Function is then defined as : MTF(f x , f y ) = |TF PIX |(f x , f y ). Image formation comes down to : where * is the convolution product. By applying Eq. (1) in Fourier domain, we then have : As pointed out in the introduction, this straightforward inversion, called the main solution, is not usable in our case because O(f x , f y ) equals zero almost everywhere [ Fig. 1(d)]. Therefore, the goal of the algorithm is to treat the incoming images to remove the zero areas of O and I without losing any relevant spectral information. The idea is then to reduce the image format by applying a short-width window F w to I and O. Indeed, the reduction of the selected area of O by F w results in Fourier domain in the convolution of O by a wide-width window function F w : Thanks to convolution in Fourier domain, the discrete harmonics broaden and overlap with their first neighbours, erasing the zero areas. Concerning the size of the window function, on the one hand the window F w should be sufficiently small so that the interpolation of the harmonics fill the entire disk [ Fig. 2]. On the other hand, we must be sure that we do not lose any relevant information in terms of cross-talk effect. In practice, the selection window should not be smaller than a square of 3 pixels side. Indeed, a common hypothesis consists in considering that cross-talk phenomena in the FPA should be taken into account only for the first neighbours of a pixel, and are negligible beyond [11]. Thus applying locally a 3 pixels-width window should not alter the measurement of the pixel PSF. In practice, we do not use a square window but a Tukey window to avoid discontinuities for the FFT algorithm. The Full-Width at Half Maximum (FWHM) of the window is then fixed to the 3 pixels limit. An example of such a window is presented in Fig. 3. Besides, we assume here that an oversampling process has been applied to get the image I (cf. section 6). Typically, one pixel of the detector corresponds to 5 by 5 sampling points of I.  Moreover, the typical size of most of the IRFPA is at least 320 × 256 pixels. Since we only use a 3 × 3 pixels square I w of the image, we can turn this into an advantage to increase the SNR which is often a practical limitation. Indeed, we can split the detector image I and model of the CSIG object O into small images I (k) and O (k) , that we called thumbnails and whose dimension is fixed by the size of the Tukey window. By adding the spectrum of each thumbnail, we average different spatial contributions and raise the SNR . Note here that O is pure modeled data, so that only I suffer from experimental noise effects. The windowing process can be expressed in the following way : where δ is the Dirac distribution, * the convolution product and (x k , y k ) the center position of the k-th thumbnail. In fact, summing the different realizations I (k) comes down to use a least-square method. A similar approach was used for image deconvolution in astronomy [12]. It leads to the formulation of a multi-frame linear regressive filter [13]. Indeed, in such a method, we aim to find the Transfer Function TF PIX that minimizes the quantity : After calculation [12], this lead to the following expression of TF PIX : where * is the complex conjugate. By inverse Fourier transform, we then get the PSF pixel .

Discussion on sparsity and on the 3×3 cross-talk hypothesis
The accuracy of the method mainly depends on the CSIG object used. Indeed, the sparsity of the CSIG harmonics in Fourier space (which depends on the numbers of diffracted orders and on the maximum frequency F max ) should be adapted in regards to the 3 × 3 pixels hypothesis, i.e. to the pixel pitch of the detector. If the density of harmonics is not sufficient, that is to say the adjacent harmonics are too far from one another, the interpolation generated from the windowing process will not erase properly all the zero of the denominator in Eq. (6). Subsequently, the resulting continuous MTF will lack measurement points and will not be accurate in these areas.
To illustrate our discussion, we distinguish three typical cases in Fig. 4, where harmonics are represented by Dirac distributions and d harmonics is the distance between two adjacent harmonics. In a first case, the harmonics are too far from one another and the convolution by F w makes no overlap at all. As a result, some zero areas remain, the division is not well defined and the method is not applicable. The second case is an intermediate situation: the overlap between convolved harmonics is sufficient to erase all zero areas, but the interpolation is weak. Finally, the third case is the ideal situation where all zero areas are erased and the interpolation is strong. In both case 2 and 3, the method is applicable, but the denser is the sparse data, the more accurate the recovery will be. However the SNR will decrease as the number of harmonics increases, since the energy is distributed over all the harmonics. As a result, a trade-off has to be made between sparse density and energy available for each harmonic. For instance, the configuration we chose for the experimental measurement on the 25 µm pitch InGaAs detector, with a 24 orders CSIG (288 harmonics) with F max = 102 mm −1 is between case 2 and case 3 of Fig. 4, which ensure the extinction of zeros in the denominator of Eq. (6), and still offers a sufficient SNR.
Moreover, in the specific and rare case of a detector where the hypothesis of a 3 × 3 pixel cross-talk is weak, the method can still be workable with a 5 × 5 pixels Tukey window, but we must then increase the density of harmonics, for a given F max , to meet the conditions previously discussed.

Simulation of the method on a known MTF/pixel shape
To illustrate the capacity of the method to reconstitute the MTF and the pixel shape, we made a simulation on a known complex pixel [Fig. 5]. The simulated shape is a square of 25 µm side, with an amplitude equal to 1, and contains a truncated area of 10µm side (amplitude 0.2) at the bottom left [ Fig. 5(a)]. We introduced an electronic diffusion length in the pixel response with the convolution of the pixel by a Gaussian function whose standard deviation is 2.5µm. Besides, we used in this simulation a 48 orders CSIG with a maximum frequency F max = 120mm −1 , so that the 10µm truncation can be restored. From the restitution of the pixel shape, we can extract the precise dimensions of our pixel [ Fig. 5(b)].
Furthermore, we compare the output result to the input model in Figs. 5(c) and 5(d), which respectively present the 1D slices (in the directions f y = 0 and y = 0) of the MTF and the PSF.
The resulted MTF given by the algorithm is consistent to the input function [ Fig 5(c)] in the bandwidth [0, F max ]. The errors we measure between input and output are less than 1%. However, since we only reconstruct the pixel shape from this finite MTF bandwidth, we cannot expect the PSF to be truly accurate [Fig 5(d)] since the original pixel PSF, being a finite object in real space, contains frequencies greater than F max . The maximal measured error is 7% and occurs near the truncated area, where the pixel profile contains frequencies higher than F max = 120mm −1 , which can not be recovered. To solve the problem of the finite MTF bandwidth, the only solution here is to extend the MTF via some prior knowledge on the pixel shape [17], which is out of the scope of this paper.

Image noise propagation
The linearity of the method allows us to easily determine the propagation of the noise coming from the experimental image through the process. According to Eq. (6), each thumbnail I (k) will give a different realization of the transfer function TF (k) PIX : where is the model factor. Equation (6) is then equivalent to summing the different contributions : As a result, the noise will be reduced by √ n where n is the number of thumbnails. The noise propagation can be expressed as : where δ I , δ TF are noise realizations with temporal standard deviation σ I (x, y), σ TF (f x , f y ).
According to Eqs. (7) and (8), a standard deviation σ I on image I will give rise to an standard deviation σ TF proportional to σ I and M, where σ I is the standard deviation associated to I(f x , f y ) and M is the average value of the model factor over the thumbnails M (k) (f x , f y ). Besides, Parseval's identity allows us to write σ I = σ I . Finally, we get the linear relation : To valid Eq. (10), we simulated noise propagation by adding white normal noise realizations to a simulated image I. We got 100 noisy simulated images of standard deviation σ I which were then treated by the algorithm. σ TF is then calculated as the standard deviation of the 100 resulting TF PIX . With this simulation, we can retrieve the linear relation between σ TF and σ I [ Fig. 6(a)]. The scalar <σ TF > represents the averaged value over the disk |f x + f y | 2 < F 2 max . The model factor M(f x , f y ) stays fixed during simulation (window selection of 3 pixels width). We can also measure the impact of the number of thumbnails on σ TF . In logarithmic scale [ Fig. 6(b)], we find a straight line curve which is coherent with reduction of σ TF by the factor √ n.

Experimental set up
The test bench is mainly composed of a tungsten-halogen IR lamp from Arcoptix (temperature 2230 K), a collimator, and a translation stage on which is fixed the CSIG. The detector is fixed on a goniometric cradle for over-sampling needs and faces the CSIG. For this paper, we applied our MTF measurement method on a 25 µm pitch InGaAs detector and used a 24 orders CSIG containing 288 harmonics with a maximum frequency F max = 102mm −1 (η 2 = 650, a 0 = 500µm). The grating was manufactured with a high resolution, typically 1 µm for a 500 µm period CSIG transmittance. Still, it may contain some low-amplitude unwanted orders which are distributed all over the Fourier plane. In order to get rid of those, we took advantage of the sparsity of O to treat beforehand our experimental image in Fourier plane by selecting only the 288 harmonics of the 24 orders CSIG and multiply the data by zero elsewhere. As a result, the spectral data contains only the 288 measurement points. As explained in the introduction, we need to oversample the images in order to raise the Nyquist frequency and avoid aliasing effects. The goal is to oversample the images by a factor above 4 (typically 5 or more), so that we can comfortably deal with frequencies up to four times the original Nyquist frequency of the detector. We proposed an original method which makes it possible to realize a 2D oversampling with a single translation stage [ Fig. 7], simplifying the optical bench and reducing its cost. From a metrological point of view, such a technique is very interesting to limit residual bias and vibrations during measurement. The CSIG object is moved horizontally step by step in accordance with the oversampled grid, while facing the FPA which stays still and is tilted with a precise angle α, thanks to the goniometric cradle. This angle is chosen so that the oversampled pixels match the pixels of the detector periodically along the grid. More precisely, α = arctan(M/N), where M and N are respectively the number of lines and columns of pixels that have to be scanned by the pixel (i,j) before reaching the position of the next pixel of the FPA (i+N, j+M). By moving the CSIG horizontally, the oversampled grid needs K = M 2 + N 2 positions to be fully scanned. At each position of the CSIG, an image is registered, and step by step, the oversampled grid is filled. Actually, we average 100 detector images at each position to improve the SNR. The choice of M and N (and so the associated angle α) are adapted to the measurement requirements. The new sampling pitch equals to p detector × 1/ √ N 2 + M 2 where p detector is the initial pixel pitch of the detector. An example of an oversampled image is presented in Fig. 8.

Experimental results
In this section we first present some results of the algorithm applied to the 25 µm pitch InGaAs detector and then discussed the two main features of our method : complex PSF restitution and metrological aspects.

MTF and PSF experimental results
The results are presented with error bars which come from noise propagation simulation. In a first approach, the simulated noise in input is white normal noise with standard deviation σ I corresponding to the experimental estimated noise of acquired image I. The error bars correspond to a ±3 σ TF .
On Fig. 9 are respectively presented the 2D TF and 1D TF curve projection obtained with the algorithm. For this measurement, the detector was tilted with an angle α = 9.5°(so as to oversample the image, with N = 6 and M = 1, corresponding to a new 4.1 µm pitch and a Nyquist frequency F Ny = 122 mm −1 ). We can easily see the sign changes on the Transfer Function of the pixel, which correspond to the zeros of the TF. The first zero, also known as cutoff frequency f cut , is given by the formula f cut = By computing the inverse FFT of the resulting TF of the pixel, we have access to an estimation of the 2D PSF, which is presented in Fig. 10. We recognize on this figure a typical shamrock shape. This particular shape was already mentioned in [14] on a HgCdTe MWIR planar technology, and is most likely due to the non-homogeneous diffusion which depends on the pixel's first neighbours position. Indeed, the diffusion effect is more present with the diagonal pixels than with the aligned neighbour pixels due to the distancing of neighbours.

Discussion
When we have a strong a priori on the pixel shape, the use of non-linear methods is often very effective. For instance, in our previous method [10], the knowledge of pixel square form and pitch is an input data which assists the algorithm in finding the associate MTF.
However, our feedback over the last few years has turned towards more complex pixel responses. Indeed, the current trend towards pixel pitch reduction leads to non-negligible cross-talk effects between adjacent pixels, which should be properly quantified [2,15,16]. In this context, it is of interest to use more straightforward methods that do not require an a priori on the geometric shape of the pixel response. Indeed, in this paper, we make no other hypothesis than the spatial extension limited to a 3 X 3 pixel window which allows us to identify the specific shape presented in Fig. 10. The robustness of the restitution was also demonstrated on a complex simulated pixel in Fig. 5 of section 4.
Moreover, in the competitive context of infrared detection industry, especially with regard to the race to high resolution, metrological aspects of MTF measurement are of particular importance. In this paper, thanks to the linearity of our method, we presented an efficient and straightforward technique to get access to error bars on MTF.

Conclusion
In this paper, we presented a linear deconvolution treatment to extract MTF from the CSIG sparse data. This method allows to get directly a continuous 2D MTF, and also an estimation of the 2D PSF of the pixel detector. So far, we successfully applied the method on a 25 µm pitch InGaAs detector. Yet we plan in the future to characterize the MTF of other detectors with different pixel pitches in different spectral bands to ensure the generality of the method. At the heart of the process comes the hypothesis that cross-talk effects are negligible beyond the first neighbours of the pixels. In other words, we measure the TF on a 3 × 3 pixels support. Although this hypothesis is generally accepted for most IRFPA, we still aim in the future to reduce its importance, by improving the sparse data objects that we use, finding other ways to fill the CSIG spectral disk, or working with CSIGs that have more harmonics, such as the 32 or 48 orders CSIGs which could be relevant candidates for the method. Besides, in more advanced versions of the method, this hypothesis could be potentially coupled with other prior knowledge on the MTF. In this sense, regularization techniques [17] may be applied as a complement of our method. From an experimental point of view, we demonstrate an original over-sampling method to avoid aliasing problems. We also pointed out the linearity of the deconvolution method which allowed us to easily understand the noise propagation. For now, we made simulations with white normal noise. Further objective is to simulate error propagation with more complex incoming noises, and compare the results to experimental error bars. At last, we aim to adapt our optical bench to use a monochromatic source, in order to study the impact of the wavelength on the detector MTF, especially near the cut-off wavelength.