Transformation of point spread functions on an individual pixel scale

In some types of imaging systems, such as imaging spectrometers, the spectral and geometric pixel properties like center wavelength, center angle, response shape and resolution change rapidly between adjacent pixels. Image transformation techniques are required to either correct these effects or to compare images acquired by different systems. In this paper we present a novel image transformation method that allows the manipulation of the geometric and spectral properties of each pixel individually. The linear transformation employs a transformation matrix to associate every pixel of a target sensor B with all related pixels of a source sensor A. The matrix is derived from the cross-correlations of all sensor A pixels and the cross-correlations of sensor A and sensor B pixels. We provide the mathematical background, discuss the propagation of uncertainty, demonstrate the use of the method in a case study, and show that the method is a generalization of the Wiener deconvolution filter. In the study, the transformation of images with random, non-uniform pixel properties to distortion-free images leads to errors that are one order of magnitude smaller than those obtained with a conventional approach. Published by The Optical Society under the terms of the Creative Commons Attribution 4.0 License. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.


Introduction
Many scientific applications place high demands on the quality of data acquired with optical instruments such as remote sensing imaging spectrometers [1][2][3][4]. Consequently, great effort is put into building imaging spectrometers with uniform angular and spectral pixel properties [5][6][7]. Calibration measurements show, however, that some imaging spectrometers significantly suffer from spatial and spectral non-uniformities [8,9]. This manifests in different center wavelengths of pixels of the same spectral channel (smile), different center angles of pixels of the same spatial pixel (keystone), but also in changes of the response function shapes and resolution within a few pixels. In other words, the Point Spread Functions (PSFs) of these instruments are not constant but change rapidly with detector location. Image transformation techniques are therefore required to correct such optical distortions and non-uniformities.
In this paper we propose a linear transformation of images from a sensor A into images of a sensor B considering pixel-specific image properties. Sensor B can either be a distortion-free version of sensor A or a different instrument which allows the comparison of images acquired with both sensors. The transformation employs a transformation matrix to relate every pixel of sensor B to a set of pixels of sensor A. The transformation matrix can also be used to propagate measurement uncertainties from sensor A to sensor B. The proposed transformation generalizes the well-known Wiener deconvolution to sensors exhibiting a rapid pixel-to-pixel variation of geometric or spectral responsivity. Though developed for the correction of image distortions in imaging spectrometers, the presented algorithm is very general in nature. In fact, it should be applicable to almost any ground-based, air-, and spaceborne camera and imaging spectrometer.
Due to the vast amount of publications on Wiener filtering and deconvolution, a comprehensive literature review is beyond the scope of this paper. Since imaging spectroscopy for earth observation is our core focus, we begin with a brief review of related publications from this field of research.
Although many methods for the correction of pixel center angle and wavelength for imaging spectrometers exist, the number of techniques that also take the homogenization of image resolution into account is rather limited. Schläpfer et al. [2] investigate the homogenization of geometric response functions over all pixels of an imaging sensor with linear and nearest neighbor interpolation. The method is based on a single parameter, the Full Width at Half Maximum (FWHM), and neglects further information about the shape of the response function. Ceamanos and Douté [10] address the issue of varying resolution and center wavelength of the spectral response by sharpening the spectral bands after adjusting the center wavelengths by cubic spline interpolation. Zhao et al. [11] and Jia et al. [12] suggest using an iterative method to create a superresolution image, which is subsequently convolved with the geometric and spectral model of a desired instrument. However, none of these publications discuss the propagation of uncertainties to the transformed images.
Besides the publications on imaging spectrometers discussed so far, there are several publications from other research fields that are relevant for our work. Nevas et al. [13] propose the simultaneous correction of stray light and bandpass effects of array spectroradiometers (non-imaging spectrometers). Therein the authors present a Tikhonov-regularized spectral deconvolution method. This approach seems to be similar to the method presented here, but a detailed comparison is hampered by the lack of mathematical details provided by the authors. Similar techniques are also applied by astronomers, who typically refer to them as "PSF matching" or "PSF homogenization" methods. Within the field of astronomy these techniques are usually applied before comparing images taken by two or more cameras with different optical properties. In such a scenario homogenization is necessary to suppress image differences caused solely by different camera properties and not by actual variations in the light emitted by a star or other target of interest. Notable approaches in this category can be found in [14][15][16][17][18][19]. These publications served as an important inspiration during the development of this work. While the homogenization techniques described therein are in principle the solution we are looking for, they require pixel-to-pixel variations of the responsivity to be either negligible or to be representable by a slowly varying function, e.g. a low-order polynomial as in [17].
The remainder of this paper is structured as follows: In Sec. 2 we introduce the algorithm we developed for the transformation of the imaging properties of an optical sensor and we discuss its most important characteristics. We proceed with a case study using simulated data in Sec. 3. Therein we apply our method to synthetic data of a virtual sensor to demonstrate that the method works as expected and to analyze its behavior. For the interested reader we provide a rigorous mathematical derivation of all used equations in Appendix A, and we derive the Wiener deconvolution as a special case of these equations in Appendix B.

Theoretical background
In the following we discuss imaging systems that focus on infinity, such as remote sensing instruments. In this case the mapping of the object plane of the imaging system can be reinterpreted as the mapping of the angle of the incident electromagnetic radiation. For different focus ranges the angular coordinates used in the following can be exchanged with other coordinate systems, such as Cartesian coordinates.

Pixel response function
The imaging properties of an optical system can be expressed in terms of its impulse response, the so-called PSF. The PSF describes the output of an optical system under illumination with a perfectly collimated, monochromatic point source. The PSF typically depends on the direction of the incident radiation and on its wavelength. If the PSF is measured for many different angles of incidence and wavelengths, one obtains the so-called Pixel Response Function (PRF) for each pixel. Each PRF describes one pixel's sensitivity to radiation depending on the radiation's direction of incidence and wavelength. Figure 1 visualizes the relation between PSF and PRF.
Throughout this paper f_k^A designates the PRF of pixel k of sensor A. k is an index (a number) used to identify a pixel of sensor A. Even if A is an imaging sensor with multiple spectral and geometric dimensions, we use a single number as index for each pixel. For an RGB camera we could e.g. start with the leftmost top red (R) channel, proceed with its right neighbor and then continue down the first row, followed by all pixels in the second row and so on. Once we reach the last red pixel in the last row we continue in the same fashion for all green (G) and blue (B) pixels. All parameters upon which a PRF depends are summarized into a single vector q. A typical choice would be q = (λ, θ, φ)^T, where λ is the wavelength and the direction of incidence is defined by the angles φ and θ. Using this notation, we can express the reading s_k^A of sensor A's k-th pixel by

s_k^A = ∫_Q f_k^A(q) L_s(q) dq,   (1)

where L_s(q) is the at-aperture radiance entering the sensor and Q symbolises the parameter space of possible combinations of angles of incidence and wavelengths.
To keep the notation compact, we henceforth express scalar products of the above form as ⟨·, ·⟩. In this notation the previous expression simply becomes

s_k^A = ⟨f_k^A, L_s⟩.   (2)

Following common conventions, we assume f_k^A to be normalized such that

⟨f_k^A, 1⟩ = ∫_Q f_k^A(q) dq = 1.   (3)

This implies that the sensor is radiometrically calibrated, because a spectrally and spatially flat at-aperture radiance (L_s(q) = L_0 = const.) obviously yields a reading s_k^A = L_0 identical to the incident radiance. Furthermore, we assume that the measured signal s_k^A has been corrected for non-linearity, dark current and any electronic offset. Note that keystone, smile and stray light can all be expressed in terms of the PRF and are thus treatable by the formalism discussed in the following sections.
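To make Eqs. (2) and (3) concrete, the following sketch evaluates a pixel reading as a discretized scalar product. All grids, the Gaussian PRF shape, and the radiance values are illustrative assumptions, not data from the paper:

```python
import numpy as np

# Discretized parameter space Q, here reduced to a single wavelength axis.
lam = np.linspace(400e-9, 700e-9, 301)   # wavelength grid [m]
dlam = lam[1] - lam[0]

# A Gaussian PRF, normalized so that its discrete integral is unity (Eq. (3)).
f_k = np.exp(-0.5 * ((lam - 550e-9) / 10e-9) ** 2)
f_k /= f_k.sum() * dlam

# A spectrally flat at-aperture radiance L_s(q) = L_0.
L0 = 2.5
L_s = np.full_like(lam, L0)

# Pixel reading s_k = <f_k, L_s> (Eq. (2)); for flat illumination the
# radiometric calibration implies s_k == L_0.
s_k = (f_k * L_s).sum() * dlam
```

For the flat illumination used here the reading reproduces L_0 exactly, which is the radiometric calibration property stated above.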

Calculation of the transformation matrix
In this section we present a set of equations for the manipulation of a sensor's PRFs. A detailed mathematical derivation of these equations can be found in Appendix A. The proposed algorithm is a generalization of the well-known Wiener deconvolution insofar as it extends the Wiener theory [20] to sensors with arbitrarily varying PRFs. While the classical Wiener deconvolution requires the PRF to be the same for all pixels, the method proposed here allows the use of a dedicated PRF for each pixel.
We assume that we have a sensor A with given properties expressed in terms of its PRFs. Additionally we assume a sensor B with different and typically more desirable PRFs. The PRFs of both sensors must be known. For real instruments these parameters are usually determined by calibration measurements in a dedicated laboratory. Our goal is to predict the result of a hypothetical measurement with sensor B based on an actual measurement acquired with sensor A. It should be noted that we do not make any assumptions about the sensors beyond their respective PRFs. Thus A and B could be anything from a simple monochromatic camera, via a single geometric pixel spectrometer, to an imaging spectrometer. Obviously the conversion has limits: it is e.g. not possible to predict the output of a high resolution imaging spectrometer based on a measurement with a monochromatic camera. However, limited increases in geometric or spectral resolution (superresolution) are possible with the proposed method. Before we discuss the possibilities and limitations of the algorithm, we begin with the presentation of the essential equations.
Let sensors A and B consist of M and M′ individual pixels, respectively. Then we can express a measurement by each sensor through the vectors

s^A = (s_1^A, …, s_M^A)^T,  s^B = (s_1^B, …, s_{M′}^B)^T.   (4), (5)

The reading of a single pixel of sensor A is given by Eq. (2). The readings for sensor B are obtained in complete analogy from

s_k^B = ⟨f_k^B, L_s⟩,   (6)

where f_k^B, k = 1, …, M′ are the PRFs of sensor B. To predict a measurement with sensor B based on a measurement of sensor A, we require a transformation matrix K such that

s^B ≈ K · s^A.   (7)

The matrix K basically maps every pixel of sensor B to the set of pixels of sensor A. At this point it is not clear if, or under which circumstances, such a matrix exists. We will return to this question later and proceed here under the assumption that a suitable K can be found. If sensor A's PRFs overlap for neighboring pixels, then the readings of these pixels are correlated. In mathematical terms this correlation can be expressed in terms of the cross-correlation matrix

C^{AA}_{kl} = ⟨f_k^A, f_l^A⟩.   (8)

In a similar fashion, pixels of sensor B are correlated to pixels of sensor A, provided their respective PRFs overlap to some extent. Again, we can formalize this correlation with the cross-correlation matrix C^{BA}:

C^{BA}_{kl} = ⟨f_k^B, f_l^A⟩.   (9)

It may be intuitively clear that the linear map of measurements from sensor A to sensor B is equivalent to a linear map of the associated cross-correlation matrices. A formal derivation of this argument is found in Appendix A. This implies that

C^{BA} = K · C^{AA}.   (10)

Thus we can formally obtain K from the (least squares) inversion of Eq. (10), which yields

K̂ = argmin_K ‖K · C^{AA} − C^{BA}‖₂²,   (11)

where K̂ symbolises the least squares estimator for the matrix K and ‖·‖₂ is the Euclidean norm.
Since we expect this inversion to be ill-conditioned, we choose the regularized least-squares solution proposed by Tikhonov [21][22][23][24], which yields

K̂ = argmin_K ( ‖K · C^{AA} − C^{BA}‖₂² + γ² ‖K · Γ^T‖₂² ).   (13)

Here γ and Γ are the regularization parameter and the Tikhonov regularization matrix, respectively. Both can be chosen to tune the balance between data fidelity and the penalization of undesired solutions. Typical choices for the regularization matrix are e.g. the identity matrix or a discrete Laplacian operator. Since PRFs normally have a smooth shape, a Laplacian that penalizes high frequencies seems to be a suitable first choice. Thus a two-dimensional Laplacian is used in Sec. 3.4. However, other choices are possible and may yield better results depending on the exact problem at hand. The interested reader is referred to [24] and references therein for additional considerations regarding various regularization strategies. We would like to point out that Eq. (13) can be solved independently for each row of K and C^{BA} and therefore lends itself to an individual treatment of each sensor B pixel. Moreover the transformation matrix K has to be calculated only once for a given sensor combination (A, B) before it can be used to correct any measurement s^A with a simple matrix multiplication following Eq. (7).
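A minimal sketch of this regularized estimate, solved for all rows of K at once via the normal equations; the function name is our own and the closed form assumes the row-wise Tikhonov penalty described above, not the authors' actual implementation:

```python
import numpy as np

def transformation_matrix(C_AA, C_BA, Gamma, gamma):
    """Tikhonov-regularized least-squares estimate K with K @ C_AA ~ C_BA.

    Each row k_i of K solves the normal equations
    (C_AA @ C_AA.T + gamma^2 * Gamma.T @ Gamma) k_i = C_AA @ c_i,
    where c_i is the corresponding row of C_BA. A sketch, not the
    authors' code.
    """
    A = C_AA @ C_AA.T + gamma**2 * (Gamma.T @ Gamma)
    # Solve for all rows simultaneously (rows of C_BA become columns here).
    K = np.linalg.solve(A, C_AA @ C_BA.T).T
    return K
```

For a well-conditioned problem and a vanishing penalty the estimate reduces to the plain least-squares solution of Eq. (10).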

Transformation matrix constraints
As discussed in Sec. 2.1, the PRFs are normalized. The normalization condition Eq. (3) leads to a constraint on the rows of the transformation matrix K. We can derive this constraint by investigating the special case of a spectrally and spatially flat illumination (L_s(q) = L_0 = const.) on both sensors. In this case we obtain the signals

s_k^A = L_0,  s_k^B = L_0.   (14), (15)

Inserting these into Eq. (7) immediately leads to

Σ_l K_{kl} = 1 for all k.   (16)

In other words, the sum over each row of the transformation matrix K has to be unity. Since K is independent of the incident radiation L_s(q), this result can be used as a general normalization condition for the matrix, e.g. to suppress numerical noise.
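Numerically, this row-sum constraint can be enforced after computing K; the helper below is our own illustrative sketch:

```python
import numpy as np

def normalize_rows(K):
    """Rescale each row of K so that it sums to unity (the constraint of
    Eq. (16)), suppressing numerical noise from the matrix inversion."""
    return K / K.sum(axis=1, keepdims=True)

# After normalization a flat scene s_A = L0 * ones maps to s_B = L0 * ones.
K = np.array([[0.5, 0.6],
              [0.2, 0.1]])
K_norm = normalize_rows(K)
s_B = K_norm @ np.full(2, 3.0)
```

The rescaling is a small correction in practice, since a well-computed K already satisfies the constraint up to numerical accuracy.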

Propagation of uncertainty
If we interpret the measurement s^A as a random variable with expectation value E[s^A] and covariance matrix Σ^A, then s^B is also a random variable with expectation value (mean)

E[s^B] = K · E[s^A]   (17)

and covariance matrix

Σ^B = K · Σ^A · K†.   (18)

Eq. (18) can be used to propagate the uncertainty of measurement s^A to measurement s^B. In the special case where the covariance matrix of measurement s^A can be decomposed into a product Σ^A = D^A D^A†, the same is true for the covariance matrix of measurement B: we can write Σ^B = D^B D^B† with D^B = K · D^A. Noise of the transformation matrix K is generally of minor importance, since the PRFs can usually be measured with high accuracy and numerical noise can be suppressed, as shown in Sec. 2.3. Because Eq. (7) is a linear transformation of the signals s^A, any systematic change of the PRFs f^A of sensor A compared to the time of calibration is linearly transformed to the signals s^B. It is also noteworthy that noise in the PRF measurements results in systematic bias in the transformation matrix K.
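The propagation rule of Eq. (18) and the factorization property can be illustrated with random matrices (the sizes and values below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
K = rng.random((4, 6))          # transformation matrix, sensor B x sensor A
D_A = rng.random((6, 6))        # factor of the sensor A covariance
Sigma_A = D_A @ D_A.T           # a valid covariance matrix by construction

# Eq. (18): covariance propagation through the linear map s_B = K @ s_A.
Sigma_B = K @ Sigma_A @ K.T

# The factor propagates as well: D_B = K @ D_A gives Sigma_B = D_B @ D_B.T.
D_B = K @ D_A
```

This confirms the decomposition argument: K Σ^A K† = (K D^A)(K D^A)†.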


Test method
We test the presented algorithm by defining two synthetic camera sensors, a source sensor A and a target sensor B, with their PRFs f_k^A and f_k^B. With these we calculate the cross-correlation matrices C^{AA} and C^{BA} according to Eqs. (8) and (9), respectively. The transformation matrix K is then computed with Eq. (13). To judge the performance of the algorithm, we use the method depicted in Fig. 2. A test scene is sampled with both sensor models f_k^A and f_k^B according to Eqs. (2) and (6), which yield the images s^A and s^B. Finally, we apply the transformation matrix K to the source image s^A following Eq. (7) and compare the resulting transformed image with the target image s^B. For a simpler description of test scenes and sensors, we only consider the transformation of two-dimensional geometric image properties here. However, the results do not change if instead a spectral coordinate system is assigned to one of the geometric axes, as long as the relative position and the overlapping of the response functions do not change.


Test scenes
We define two synthetic test scenes, see Fig. 3. Both scenes consist of point sources which are defined by their location in a two-dimensional angle space and have relative intensities between 0 and 1. Test scene 1 consists of 13 point sources on a dark background. This is equivalent to stars surrounded by deep space. Test scene 2 consists of a checkerboard pattern, which tests the algorithm performance especially near the high-contrast edges where transformation errors are expected to occur. We selected these test scenes due to their high frequency components. Usually it is more challenging for transformation algorithms to handle high than low frequency components.

Synthetic test sensors
We define the synthetic test sensors A and B by their two-dimensional PRFs f_k^A(θ, φ) and f_k^B(θ, φ) with normalized Gaussian functions h for the two angular axes, which yields

f_k(θ, φ) = h(θ; θ_{c,k}, ∆θ_k) · h(φ; φ_{c,k}, ∆φ_k),

where θ_{c,k} and φ_{c,k} are the center angles and ∆θ_k and ∆φ_k are the FWHMs of the Gaussians of pixel k. The normalized Gaussian functions are defined as

h(x; x_c, ∆x) = (2/∆x) √(ln 2 / π) exp(−4 ln 2 (x − x_c)² / ∆x²).

Both sensors have 31 pixels along the y-axis and 61 pixels along the x-axis. Sensor A has response functions with FWHMs randomly ranging from 0.1 mrad to 0.125 mrad. These differences between PRF center positions result in the largest distance between sampling points of the two sensors, thus the largest interpolation errors are to be expected. The sampling of both test scenes with sensor A and sensor B is depicted in Fig. 5. Due to the random response function widths, the sensor A images appear noisy.
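The FWHM-parameterized, unit-integral Gaussian h can be sketched as follows; the prefactor is our reconstruction of the standard normalized form, and the grid and FWHM value are illustrative (within the stated 0.1 to 0.125 mrad range):

```python
import numpy as np

def h(x, x_c, fwhm):
    """Normalized 1D Gaussian parameterized by its FWHM; its integral
    over x is unity, matching the PRF normalization of Eq. (3)."""
    return (2.0 / fwhm) * np.sqrt(np.log(2.0) / np.pi) * \
        np.exp(-4.0 * np.log(2.0) * (x - x_c) ** 2 / fwhm ** 2)

# Separable 2D PRF of a pixel k: f_k(theta, phi) = h(theta; ...) * h(phi; ...).
theta = np.linspace(-1e-3, 1e-3, 401)   # angle grid [rad]
dtheta = theta[1] - theta[0]
f_theta = h(theta, 0.0, 0.11e-3)        # illustrative FWHM of 0.11 mrad
```

Multiplying two such factors gives a separable 2D PRF whose integral over both angles is likewise unity.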



Transformation matrix generation
We assume that the radiometric calibration is performed as a separate procedure before the PRF transformation. This means that the integrals of all PRFs are normalized to unity. Before determining the cross-correlations, we calculate the discrete representations of the response functions with the trapezoidal rule. Since this process may introduce additional noise, i.e. the integrals of the discrete responses may differ slightly from the actual integrals, we normalize the results again to unity. For each sensor, the cross-correlations C^{AA} and C^{BA} are calculated according to Eqs. (8) and (9), respectively. The transformation matrix is then computed from Eq. (13).
To limit the computation time and memory usage we define subkernels for each pixel of sensor B before calculating the matrices C^{BA} and C^{AA}, i.e. we make the matrices sparse. This is possible due to the limited extent of the PRFs. In the case study these subkernels have a size of 15 × 15 pixels of sensor A for each pixel, since the maximum distance of PRF interaction is here less than 15 pixels. The size is different for every sensor pair and depends on the actual extent of the PRFs of sensors A and B. If stray light is treated, much larger subkernels may be required due to the long-range interaction between pixels. When PRFs do not overlap, their cross-correlation is zero, which is forced to be the case for any pair of pixels not sharing the same subkernel in matrix C^{BA}. Obviously, subkernels that are too small cause additional errors. The use of subkernels becomes especially necessary for real instruments that have a large number of pixels. For instance, in the case of our HySpex VNIR-1600 imaging spectrometer with 1600 geometric pixels and 160 spectral channels, the cross-correlation matrix C^{AA} would otherwise consist of (1600 × 160)² elements. These subkernels are then used to solve the linear equation system individually for each pixel of sensor B.
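The selection of a subkernel neighborhood could be sketched as below; the function name, the flattened-index convention, and the clipping at the image border are our own illustrative assumptions:

```python
import numpy as np

def subkernel_indices(center, shape, half=7):
    """Flattened indices of the (2*half+1) x (2*half+1) sensor A pixels
    around `center` = (row, col) on a detector grid of `shape`.
    Neighbors are clipped at the detector border, so edge pixels get
    smaller subkernels. half=7 gives the 15 x 15 window of the case study."""
    r0, c0 = center
    rows = np.clip(np.arange(r0 - half, r0 + half + 1), 0, shape[0] - 1)
    cols = np.clip(np.arange(c0 - half, c0 + half + 1), 0, shape[1] - 1)
    rr, cc = np.meshgrid(rows, cols, indexing="ij")
    return np.unique(rr * shape[1] + cc)
```

Restricting Eq. (13) to these indices for each sensor B pixel yields the small, independent linear systems mentioned above.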
Since the simulated camera is radiometrically calibrated, i.e. the integrals of the PRFs are normalized to unity, we apply Eq. (16) to the transformation matrix K. This reduces errors introduced by the finite accuracy of the numerical calculations.
Following [19], we choose a second-order differential operator (a two-dimensional discrete Laplacian) for the regularization matrix Γ. The equivalent operator d for the discrete convolution of a two-dimensional image is

d = [[0, 1, 0], [1, −4, 1], [0, 1, 0]].

This translates to the matrix representation

Γ_ij = −4 if i = j,  Γ_ij = 1 if i = j − 1 ≠ rN_c, i = j + 1 ≠ rN_c + 1, or i = j ± N_c,  Γ_ij = 0 otherwise,

where N_c is the number of image columns and r is the row index of the 2D-image. The conditions i = j − 1 ≠ rN_c and i = j + 1 ≠ rN_c + 1 account for row switching. Otherwise the rightmost pixel of a row in the 2D-image would regularize the leftmost pixel of the next row and vice versa. As regularization parameter we use γ² = 10⁻¹⁵. The parameter was found by trial and error; the calculated transformation matrix is not very sensitive to the parameter when it is changed by one order of magnitude. For comparison with classical approaches, we calculated a constant transformation kernel by averaging the two-dimensional subkernels of the transformation matrix K.
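The construction of Γ with the row-switching exclusions can be sketched as follows (a dense illustration of our own; a real implementation for large detectors would use a sparse matrix format):

```python
import numpy as np

def laplacian_matrix(n_rows, n_cols):
    """2D discrete Laplacian in matrix form for images flattened row by
    row. Horizontal couplings across row boundaries are excluded, i.e.
    the last pixel of one image row is not coupled to the first pixel
    of the next row (the 'row switching' conditions in the text)."""
    n = n_rows * n_cols
    G = -4.0 * np.eye(n)
    for i in range(n):
        if (i + 1) % n_cols != 0:      # right neighbor within the same row
            G[i, i + 1] = 1.0
            G[i + 1, i] = 1.0          # symmetric left-neighbor coupling
        if i + n_cols < n:             # vertical neighbor one row below
            G[i, i + n_cols] = 1.0
            G[i + n_cols, i] = 1.0
    return G
```

For a 2 × 3 image, for example, pixels 2 and 3 (end of row one, start of row two) are not coupled, while pixels 0 and 1 and the vertical pair 0 and 3 are.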

Test results
In the representation of Fig. 5, the transformed images of sensor A appear identical to the images of sensor B. We compare the transformed images of sensor A with the images of sensor B by determining their absolute difference, see Fig. 6.
The images transformed with the constant kernel show differences that are one order of magnitude larger, see Fig. 6(a) and (b), than images to which the transformation matrix was applied, see Fig. 6(c) and (d). Larger differences are visible at two edges. This is caused by the shift of the response functions of sensor B with respect to sensor A by half a pixel to the left and to the top. Hence, the response functions of sensor B are not entirely covered by response functions of sensor A. This means that relevant information for the transformation is missing, which causes unavoidable errors.

Impact on noise
As we have discussed in Sec. 2.4, the covariances of a transformed image can be calculated by Eq. (18) from the covariances of its source image. Assuming that the noise of the synthetic sensor is the same for all pixels, we can calculate the relative noise of sensor B compared to sensor A by replacing the covariance matrix Σ^A with the unit matrix I_M in Eq. (18), yielding

Σ^B = K · I_M · K† = K · K†.

The diagonal of the covariance matrix Σ^B then contains the variances relative to sensor A. The relative change of the standard deviation is shown in Fig. 7. As we can see, the noise of each pixel of the transformed images is below 50 % of the noise of the pixels of sensor A. Since sensor A and sensor B have a sampling ratio greater than two, the transformation matrix partly acts as a low-pass filter. This means that high-frequency noise is suppressed by the transformation. At two image edges higher noise is expected, due to the image shift and the missing information there.
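A small numerical illustration of this relative-noise calculation, using a hypothetical 2 × 4 transformation matrix of simple averaging kernels (not the matrix from the case study):

```python
import numpy as np

# Two example rows of K: a uniform 4-pixel average and a 2-pixel average.
K = np.array([[0.25, 0.25, 0.25, 0.25],
              [0.0,  0.5,  0.5,  0.0 ]])

# Relative covariance for unit-variance, uncorrelated sensor A noise:
# Sigma_B = K @ I @ K.T = K @ K.T; the square roots of its diagonal are
# the standard deviations of sensor B pixels relative to sensor A pixels.
rel_std = np.sqrt(np.diag(K @ K.T))
```

Averaging rows act as low-pass filters: the 4-pixel average halves the noise, the 2-pixel average reduces it by a factor of 1/√2, consistent with the noise suppression observed in Fig. 7.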


Summary and conclusion
We have proposed a novel method that allows the transformation of images of an optical imaging sensor A into images of a sensor B. This approach can be used to correct optical distortions, to homogenize spectral and spatial resolution, and to make images of different instruments comparable. The core idea of the method is to apply a transformation matrix to sensor A images that associates every pixel of sensor B with the set of related pixels of sensor A. This matrix is derived from two cross-correlation matrices: one that describes correlations between the pixels of sensor A and one that correlates each pixel of sensor B with those of sensor A. Since this problem tends to be ill-posed, we assume smooth response functions and use Tikhonov regularization to suppress high-frequency noise. It turns out that the derived transformation matrix has a low sensitivity to the regularization parameter, which can be varied within an order of magnitude without causing significant changes. An advantage of the method is that the transformation matrix for a given sensor pair (A, B) needs to be calculated only once before it can be applied to many images with a simple matrix multiplication at comparatively low computational cost. Additionally, we developed an approximate matrix inversion for sparse problems based on sub-matrices, which considerably reduces the cost of computing the transformation matrix for sensors with many individual pixels (e.g. hyperspectral imaging spectrometers). This approximation is applicable as long as there are no interactions between pixels beyond a certain distance.
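As a minimal numerical sketch of this procedure, the transformation matrix can be assembled from the two cross-correlation matrices and a Tikhonov-regularized solve. The Gaussian PRFs, grid, and parameter values below are illustrative choices of our own, not those of the case study, and the regularized form K = C_BA C_AA (C_AA² + γ² I)⁻¹ is our reading of Eq. (13):

```python
import numpy as np

# Pixel response functions (PRFs) sampled on a fine wavelength grid.
x = np.linspace(0.0, 10.0, 2001)
dx = x[1] - x[0]

def gaussian_prf(center, width):
    g = np.exp(-0.5 * ((x - center) / width) ** 2)
    return g / (g.sum() * dx)  # normalize to unit area

# Sensor A: slightly irregular centers and widths (non-uniform pixel
# properties); sensor B: a coarser, perfectly uniform target grid.
rng = np.random.default_rng(1)
centers_A = np.arange(1.0, 9.0, 0.25) + rng.normal(scale=0.02, size=32)
widths_A = 0.15 + rng.normal(scale=0.01, size=32)
A = np.array([gaussian_prf(c, w) for c, w in zip(centers_A, widths_A)])
B = np.array([gaussian_prf(c, 0.2) for c in np.arange(1.5, 8.5, 0.5)])

# Cross-correlation matrices from scalar products, cf. Eqs. (8) and (9).
C_AA = A @ A.T * dx
C_BA = B @ A.T * dx

# Tikhonov-regularized solution of K C_AA = C_BA, cf. Eq. (13).
gamma2 = 1e-6
K = C_BA @ C_AA @ np.linalg.inv(C_AA @ C_AA + gamma2 * np.eye(len(C_AA)))
```

Once computed, K is applied to every sensor A signal vector by a single matrix multiplication, which is the low-cost application step described above.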
In addition, the covariances of a target image can easily be calculated from the covariances of the source image. Our method can handle highly non-uniform geometric and spectral pixel properties that differ from pixel to pixel, as evidenced by a case study with both point and extended sources. In the case study we investigated the transformation of two-dimensional images; the formalism of Sec. 2, however, shows that the transformation extends to three- or higher-dimensional data sets.
For real instruments, the method may be limited by the accuracy of the geometric and spectral instrument model, which is usually determined by instrument calibration. Provided that an instrument is thoroughly calibrated, the method can significantly improve data quality. Considering the presented approach during instrument design may also allow a relaxation of the specifications for image distortion and uniformity.
An interesting aspect of the method is its theoretical capability to increase the resolution of sensor A if the PRFs of sensor B are chosen to be narrower than those of sensor A. Obviously, an increase in the information content of the data is not possible, so information theory bounds the extent to which super-resolution is achievable. We performed case studies with airborne hyperspectral data, which indicate promising results if there is sufficient overlap between neighboring PRFs, i.e. if the sensor is oversampled. In our opinion these results warrant further investigation of both the theoretical and practical possibilities and limitations of the proposed method.

Appendix A Mathematical derivation of the transformation matrix equations
In this section we provide a derivation of Eqs. (10) and (13) in Sec. 2.2.
In the following equations, K can be either R or C. Although the physical quantities discussed in this paper are exclusively real valued, we assume that they can be analytically continued to the complex domain. This is advantageous, e.g., in the context of Fourier analysis, as demonstrated in Sec. B. The scalar product ⟨·, ·⟩ introduced in Sec. 2.1 can then be formally defined as ⟨f, g⟩ := ∫_Q f*(q) g(q) dq for two square-integrable functions f, g : Q → K.
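For concreteness, this scalar product can be approximated numerically. The sketch below assumes the standard L² form ⟨f, g⟩ = ∫_Q conj(f(q)) g(q) dq (the printed integral was lost in extraction) on the example domain Q = [0, 1]:

```python
import numpy as np

# Quadrature grid on the example domain Q = [0, 1].
q = np.linspace(0.0, 1.0, 10001)
dq = q[1] - q[0]

def inner(f, g):
    # <f, g> = integral of conj(f) * g over Q; a plain Riemann sum is
    # sufficient here because the example integrand vanishes at both ends.
    return np.sum(np.conj(f(q)) * g(q)) * dq

# Example: <sin(2*pi*q), sin(2*pi*q)> = 1/2 over one full period.
val = inner(lambda t: np.sin(2 * np.pi * t), lambda t: np.sin(2 * np.pi * t))
```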
To derive an expression for the desired transformation matrix K ∈ K^{M′×M}, which satisfies Eq. (7), we proceed with a formal inversion of Eq. (37) using the Moore-Penrose inverse A^+. Inserting the result into Eq. (38) yields s_B = B A^+ s_A, which in combination with Eq. (7) implies K = B A^+. Note that the inversion of A is most likely a severely ill-posed problem, which has to be stabilized by the multiplication with B. At this point it is helpful to take a closer look at the cross-correlation matrices C_AA ∈ K^{M×M} and C_BA ∈ K^{M′×M}, which we defined by Eqs. (8) and (9). If the spectral decomposition of the PRFs is inserted into the scalar products, we observe that the scalar products can be expressed in terms of the row vectors of A and B, namely (C_AA)_{ij} = ⟨a_i, a_j⟩ and (C_BA)_{ij} = ⟨b_i, a_j⟩. In matrix form these equations become C_AA = A A^† and C_BA = B A^†, and therefore K C_AA = B A^+ A A^† = B A^† = C_BA. This is exactly the desired result, Eq. (10), which we obtained from a more intuitive argument in Sec. 2.2. To obtain the Tikhonov solution, Eq. (13), we transpose Eq. (49), which yields C_AA K^† = C_AB, solve the result using the matrix version of the Tikhonov equation, K^† = (C_AA^† C_AA + γ² I_M)^{-1} C_AA^† C_AB, and finally transpose again to obtain Eq. (13). Here we used the fact that C_AA is self-adjoint and introduced the conjugate transpose C_AB of C_BA, which is defined in analogy to Eq. (9) by swapping A and B everywhere.
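The central identity of this derivation, namely that the least-squares solution K = B A^+ can be computed from the cross-correlation matrices alone, can be checked numerically. The random matrices below are illustrative stand-ins for the spectral-decomposition matrices of the PRFs; A must have full row rank for C_BA C_AA^+ = B A^+ to hold exactly:

```python
import numpy as np

rng = np.random.default_rng(2)

# Random stand-ins for the coefficient matrices A (M x N) and B (M' x N)
# of the PRF spectral decompositions (illustration only).
M, M_prime, N = 8, 6, 30
A = rng.normal(size=(M, N))
B = rng.normal(size=(M_prime, N))

# Least-squares transformation matrix K = B A^+ (Moore-Penrose inverse).
K_direct = B @ np.linalg.pinv(A)

# The same K from the cross-correlation matrices C_AA = A A^H, C_BA = B A^H.
C_AA = A @ A.conj().T
C_BA = B @ A.conj().T
K_corr = C_BA @ np.linalg.pinv(C_AA)
```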
Choosing Γ = Ã^† F, followed by multiplication with F^† from the right, results in K = F^† B̃ (Ã^† Ã + γ² I_M)^{-1} Ã^† F and thus K = F^† B̃ W F, where we introduced the Wiener deconvolution matrix W := (Ã^† Ã + γ² I_M)^{-1} Ã^† ∈ C^{M×M}; for a diagonal Ã = diag(ã_1, ..., ã_M) its components reduce to the classical filter coefficients ã_i*/(|ã_i|² + γ²). In the classical Wiener deconvolution the Tikhonov parameter γ² is chosen as the inverse of the signal-to-noise ratio. In the context of Fourier theory this result is not surprising, as it states s_B = F^† B̃ W F s_A when inserted into Eq. (7). In other words, the least-squares solution K acts like a multiplication in the frequency domain: first the Fourier transform of the signal s_A is multiplied by the Wiener deconvolution matrix W, which is a regularized inverse of the Fourier-transformed PRF matrix Ã; the result is then multiplied with the Fourier transform of the desired PRF matrix B̃ before it is transformed back to the wavelength domain. In essence, this is a direct consequence of the convolution theorem of the Fourier transform, which states that convolution in the wavelength domain is equivalent to multiplication in the frequency domain. This demonstrates that the solution proposed in Sec. 2.2 is a generalization of the Wiener deconvolution to cases where the PRFs vary with channel and/or pixel.
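As a concrete illustration of this special case, the sketch below applies the classical scalar Wiener filter conj(ã_i)/(|ã_i|² + γ²) to a signal blurred by a single shift-invariant kernel. Signal, kernel, noise level, and γ² are illustrative choices of our own:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 256
x = np.arange(n)

# True signal and a periodic Gaussian blur kernel: all "PRFs" are shifted
# copies of one profile, so the transformation is a per-frequency product.
signal = np.sin(2 * np.pi * x / 64) + 0.5 * np.sin(2 * np.pi * x / 16)
kernel = np.exp(-0.5 * (np.minimum(x, n - x) / 3.0) ** 2)
kernel /= kernel.sum()

# Measurement: circular convolution plus a little noise.
blurred = np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel)).real
blurred += rng.normal(scale=1e-3, size=n)

# Wiener filter coefficients conj(a~)/( |a~|^2 + gamma^2 ), with gamma^2
# playing the role of the Tikhonov regularization parameter.
a_tilde = np.fft.fft(kernel)
gamma2 = 1e-4
W = np.conj(a_tilde) / (np.abs(a_tilde) ** 2 + gamma2)
recovered = np.fft.ifft(np.fft.fft(blurred) * W).real
```

Without the γ² term the division would amplify noise at frequencies where ã is small; the regularization rolls the filter off there instead.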