COMPACT HYPERSPECTRAL IMAGING SYSTEM (COSI) FOR SMALL REMOTELY PILOTED AIRCRAFT SYSTEMS (RPAS) – SYSTEM OVERVIEW AND FIRST PERFORMANCE EVALUATION RESULTS

This paper gives an overview of the new COmpact hyperSpectral Imaging (COSI) system recently developed at the Flemish Institute for Technological Research (VITO, Belgium) and suitable for remotely piloted aircraft systems. A hyperspectral dataset captured from a multirotor platform over a strawberry field is presented and explored in order to assess the co-registration quality of the spectral bands. Thanks to line-based interference filters deposited directly on the detector wafer, the COSI camera is compact and lightweight (total mass of 500g) and captures 72 narrow bands (FWHM: 5 nm to 10 nm) in the spectral range of 600-900 nm. Covering the red-edge region (680 nm to 730 nm) allows plant chlorophyll content, biomass and hydric status indicators to be derived, making the camera suitable for agricultural purposes. In addition to the orthorectified hypercube, a digital terrain model can be derived, enabling various analyses that require object height, e.g. plant height in vegetation growth monitoring. The geometric data quality assessment shows that the COSI camera and the dedicated data processing chain are capable of delivering very high resolution data (centimetre level) from which spectral information can be correctly derived. The obtained results are comparable to or better than the results reported in similar studies for an alternative system based on the Fabry–Pérot interferometer.


INTRODUCTION
New technological concepts and device miniaturization have enabled the development of hyperspectral systems suitable for small, unstable Remotely Piloted Aircraft Systems (RPAS). Several compact hyperspectral imagers based on various imaging concepts are currently available on the market. In classical hyperspectral pushbroom imagers a single line on the ground is observed through a slit in the optical system, which disperses the light and projects it onto a 2-dimensional detector. For every ground location all spectral bands are recorded simultaneously, and a scanning motion is required to cover the area of interest. Such devices are available for RPAS and have already been flown on fixed- and rotary-wing platforms (Buettner and Roeser, 2013; Lucieer and Veness, 2014; Suomalainen et al., 2014). Nonetheless, because they require high-accuracy IMU and GNSS information (which usually adds to the weight and cost of the device), reconstruction of geometrically correct image data remains a challenging task.
An alternative data capturing approach is used in so-called image-frame cameras (Aasen and Burkart, 2015), which grab a 2D perspective image similar to images captured in conventional photography. Examples of such cameras are the SM5X5-NIR (Ximea, 2015) and UHD 185-Firefly (Cubert GmbH, 2015) snapshot cameras and the Rikola FPI device (Rikola Ltd., 2015). The first two systems capture all spectral bands simultaneously (each band covers a slightly different ground location), but at the cost of decreased spatial resolution. The Rikola camera captures hyperspectral bands at full sensor resolution but at different times, and thus band co-registration is required to build the data hypercube (Saari et al., 2011).
Another example of a hyperspectral frame camera is the COmpact hyperSpectral Imaging system (COSI, Figure 1). This imager has recently been developed at the Flemish Institute for Technological Research (VITO) with co-funding of the EC FP7 Airbeam security project. The COSI imager employs line-based interference filters, or linearly variable filters (LVF), to capture the spectral data. These interference filters with varying thickness can be deposited directly on the detector in different spatial configurations, i.e. mosaic filters used in some snapshot cameras, or stepwise line filters as used in the COSI system.
In an LVF-based imager using a stepwise line filter, every image row corresponds to a different spectral band as well as a different location on the ground (Figure 2). Therefore a scanning motion is required to cover the area of interest and to retrieve the complete spectrum for every spatial location. Although the imager is used similarly to a line scanner, it captures a series of traditional 2D perspective images and therefore allows extraction of 3D information such as a digital surface model, which is required to produce a geometrically correct, orthorectified hypercube. In contrast to snapshot cameras, LVF-based imagers capture hyperspectral data with very high spatial resolution. Another advantage is the absence of the moving parts used in imagers based on the Fabry-Pérot interferometer (e.g. in the Rikola camera (Saari et al., 2011)).
The scope of this paper is to present hyperspectral data acquired with the COSI camera over an experimental strawberry field, and to report first estimates of the geometric data reconstruction quality.

COMPACT HYPERSPECTRAL IMAGING SYSTEM (COSI)
The COSI camera uses a 2048 x 1088 pixel sensor (pixel pitch of 5.5µm) with an LVF deposited directly on the sensor surface (Tack et al., 2011). The 72 narrow spectral bands of the filter (FWHM: 5 nm to 10 nm) cover the spectral range of 600-900 nm. Such spectral information is highly favourable for vegetation studies, since it covers the main chlorophyll absorption feature centred around 680nm as well as the red-edge region (680nm to 730nm), which is often linked to plant stress. The NIR region furthermore reflects the internal plant structure and is often linked to leaf area index and plant biomass.
The payload is compact (6cm x 7cm x 11.6cm) and lightweight, with the total mass of 500g including: an embedded computer, power distribution unit, data storage and optics (330g without optics).
The imager captures very high spatial resolution data, i.e. images captured with a 9mm lens at 40m altitude cover the swath of ~40m with a ~1.5cm ground sampling distance (GSD).
Geometrically correct (orthorectified) hyperspectral data can be reconstructed with a GSD of ~4cm. The acquired images are processed into a conventional hypercube using a dedicated processing chain developed at our institute. Auxiliary data, such as geolocation of the images or ground control points (GCPs), are not required, although their presence improves data scaling and georeferencing. The DN pixel values need to be radiometrically and spectrally corrected in order to derive reflectance values of the imaged area. Details of these corrections are beyond the scope of this paper and are sketched in (Livens et al., 2016).
Figure 3. Single image frame acquired over a strawberry field near Sint Truiden, Belgium.

IMAGING USING LINEARLY VARIABLE FILTERS
Due to the specific way in which the spectral data is captured by LVF-based systems, correct geometric image reconstruction can be a challenging task (Serruys et al., 2014; Skauli et al., 2014). Very accurate knowledge of the relative image orientation is required to track points as they move within the spectral filter and to reconstruct the spectral data. Errors in relative orientation between successive images may result in errors in the reconstructed spectra. LVF-based imagers share many of the challenges related to the way the data is captured with imagers based on the Fabry-Pérot interferometer, e.g. the Rikola camera.
One such challenge is caused by the varying angles at which the scene is observed by different bands. In the current configuration of the COSI camera, band 49 (813.8nm) is captured at nadir and band 1 (604.0nm) at 10.4 degrees off-nadir. In imagers based on the Fabry-Pérot interferometer, successive spectral bands are captured with a time delay from a moving RPAS platform, so a point on the ground is viewed from varying angles during the acquisition of the spectral bands.
Because the radiance of the scene depends on the viewing angle, spectral artefacts may result, especially for surfaces with specular reflections or complex 3D shapes (many shadows).
Another challenge is that the spectral bands are not acquired simultaneously for a ground location (time delay of milliseconds for the COSI system and the Rikola camera). Different bands may represent different states of the ground point, which is especially pronounced for moving objects. This effect may cause errors in the reconstructed spectrum similar to the spatial image co-registration errors (see examples in (Krauß, 2014)).
All the above-mentioned factors distort the final hypercube locally, meaning that a band mismatch may appear in one part of the dataset in specific spectral bands. An incorrectly co-registered image introduces artefacts in all spectral bands, but with different magnitudes and at different spatial locations. This makes the geometric data quality evaluation a challenging task. Additionally, one has to cope with a significant difference in image content between spectral bands, i.e. vegetation has a relatively low reflectance in the visible part of the electromagnetic spectrum compared to its reflectance in the near infrared range.

STUDY CASE
The functionality of the COSI camera has been demonstrated in a number of test flights covering diverse agricultural environments, e.g. experimental strawberry fields, natural grassland, wheat fields and pear orchards. A description of what is, to the authors' knowledge, best practice in hyperspectral data acquisition, together with practical observations based on in-flight experience with the COSI camera, is sketched in (Livens et al., 2016).
In this paper a dataset covering an experimental strawberry field in Belgium (pcfruit vzw, 2015) will be presented and evaluated.

Dataset characteristics
In May and June 2015 a series of octocopter flights were performed with the COSI camera over an experimental strawberry field near Sint Truiden, Belgium (pcfruit vzw, 2015). More than 15 000 images were captured in each mission, covering an area of 60 x 80m with a 1.4cm ground sampling distance (GSD). In this paper a dataset of 14 400 images acquired on May 21, 2015, in 9 flight lines with 80% sidelap, is presented and studied. An example of a single image frame from this dataset is presented in Figure 3. The image data were successfully processed using the in-house software into an orthorectified hypercube and transformed into reflectance values using dedicated spectral targets (see details in (Livens et al., 2016)). The GSD of the reconstructed hyperspectral product was set to 2cm to retain the highest data resolution while minimizing data gaps caused e.g. by sudden platform movements. Nine geometric reference targets were distributed inside the area of interest, but unfortunately their geolocation was not surveyed. The hyperspectral data were scaled and georeferenced using the flight data. An overview and enlarged samples of the covered area are shown in Figure 4 and Figure 5. As previously mentioned, one of the advantages of LVF-based imagers is that they capture perspective image frames and therefore allow extraction of 3D information about the covered area. A sample of such a digital surface model, represented by a triangular mesh, is shown in Figure 6.

Estimation of band co-registration quality
As previously mentioned (Section 3), many factors influence the quality of the final hypercube. They mostly appear as local band mis-registration, which also affects the spectral measurements. A first impression of the geometric coherence of the hypercube bands can be gained by visual checks of various data composites. An example of such an image composite is shown in Figure 7, where successive blocks of 30 rows are copied from different spectral bands. Another interesting way to visually assess how closely the spectral bands are co-registered is to visualize a slice of the hypercube that includes objects with sharp contrast changes. An example of such a slice is shown in Figure 8. Another efficient technique for spotting geometric image artefacts (difficult to depict in a paper, though) is to cycle through the spectral bands as grayscale images to mimic a band animation. The artefacts show up as pixel flickering between bands and are thus easier for the human eye to spot.
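The striped composite used for the visual check can be sketched in a few lines of NumPy. This is only an illustration and not the authors' tooling: the function name `striped_composite`, the `(bands, rows, cols)` array layout and the choice of which bands to cycle through are assumptions.

```python
import numpy as np


def striped_composite(cube, band_indices, stripe_height=30):
    """Build a 2D image whose successive row stripes come from different bands.

    cube: array of shape (bands, rows, cols), assumed to be co-registered.
    band_indices: bands to cycle through, one band per stripe (hypothetical choice).
    """
    _, n_rows, n_cols = cube.shape
    out = np.empty((n_rows, n_cols), dtype=cube.dtype)
    for stripe, start in enumerate(range(0, n_rows, stripe_height)):
        band = band_indices[stripe % len(band_indices)]
        stop = min(start + stripe_height, n_rows)
        # Copy this block of rows from the selected band only.
        out[start:stop] = cube[band, start:stop]
    return out
```

If the bands are well co-registered, linear features such as row edges should continue without a visible jump across the stripe boundaries.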
While such a visual check provides an overview of the geometric data coherence, it does not quantify the quality. In order to gain better insight into the data geometry, the band co-registration quality was evaluated using two different image matching approaches: pixel based (normalized cross-correlation) and feature based, using the Scale Invariant Feature Transform (SIFT). For both approaches band 31 (736.8nm), at the red edge, was selected as the reference band due to its intermediate reflectance values between the visible and the infrared part of the COSI imager spectral range (Figure 7).

Estimation of band co-registration quality using template matching and normalized cross-correlation
The template matching algorithm reports the position (col, row, with sub-pixel accuracy) of the template in the search window at which the normalized cross-correlation measure reaches its maximum.
Normalized cross-correlation is a relatively robust measure and is capable of working successfully even for data captured under different illumination conditions. Nonetheless, because vegetation observed in the visible and infrared parts of the electromagnetic spectrum differs significantly (Figure 7, top), it was not capable of deriving correct results in all the bands. In other words, when using spectral bands in which vegetation appears very different, e.g. bands 1 (604.0nm) and 31 (736.8nm), template matching (search window of 100pix x 100pix, template of 50pix x 50pix) with normalized cross-correlation failed for most of the image locations. However, it was successful for all the bands in the neighbourhood of the nine geometric reference targets (see Figure 4) placed in between the strawberry rows, with only a few pixels representing vegetation in the template (Figure 9). Average and extreme residuals in the horizontal (dx) and vertical (dy) image directions, as well as the 2D RMSE resulting from template matching between all spectral bands and the reference band 31, are reported per geometric target in Table 1. The 2D residuals (emphasized 200 times) between all spectral bands and the reference band 31 are plotted in Figure 10. Average horizontal (dx) and vertical (dy) residuals and the 2D RMSE between the respective spectral bands and the reference band 31 for all nine reference targets, plotted per band, are shown in Figure 11.

Target    dx avg   dx min   dx max   dy avg   dy min   dy max   2D RMSE   2D max
GT 1      [...]    -0.27     0.78    -0.09    -0.36     0.06     0.32      0.79
GT 2      -0.07    -0.36     0.17    -0.02    -0.30     0.40     0.20      0.42
GT 3      -0.04    -0.28     0.25    -0.14    -0.60     0.18     0.24      0.60
GT 4      -0.17    -0.44     0.06    -0.09    -0.30     0.09     0.22      0.47
GT 5      -0.18    -0.44     0.13    -0.05    -0.33     0.13     0.24      0.52
GT 6      -0.01    -0.42     0.54     0.00    -0.31     0.31     0.24      0.62
GT 7      -0.31    -0.61     0.09     0.12    -0.09     0.44     0.35      0.67
GT 8      -0.28    -0.73     0.09    -0.05    -0.29     0.25     0.33      0.73
GT 9      -0.03    -0.21     0.27    -0.01    -0.15     0.17     0.11      0.29
Mean      -0.10                      -0.04                       0.25

Table 1. Average and maximal (absolute) residuals in horizontal (dx) and vertical (dy) image direction and 2D distance/residual (2D res) between all the spectral bands and the reference band 31, per geometric reference target. Values in pixels.
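A minimal sketch of template matching with normalized cross-correlation, as described in this section, is given here in plain NumPy. It is not the implementation used for the reported results. The function names are hypothetical, and the parabolic peak interpolation is an assumption, since the text does not state how sub-pixel accuracy was obtained.

```python
import numpy as np


def ncc(patch, template):
    """Normalized cross-correlation of two equally sized 2D arrays."""
    a = patch - patch.mean()
    b = template - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)


def match_template(search, template):
    """Return the (col, row) of the best match, its score, and the score map."""
    th, tw = template.shape
    sh, sw = search.shape
    scores = np.empty((sh - th + 1, sw - tw + 1))
    for r in range(scores.shape[0]):
        for c in range(scores.shape[1]):
            scores[r, c] = ncc(search[r:r + th, c:c + tw], template)
    row, col = np.unravel_index(np.argmax(scores), scores.shape)
    return (int(col), int(row)), float(scores[row, col]), scores


def parabolic_offset(s_minus, s_peak, s_plus):
    """Sub-pixel offset of a 1D peak from a parabola through three samples."""
    denom = s_minus - 2.0 * s_peak + s_plus
    return 0.0 if denom == 0 else 0.5 * (s_minus - s_plus) / denom


def refine_subpixel(scores, col, row):
    """Refine an integer peak position; peaks on the border are left unchanged."""
    x, y = float(col), float(row)
    if 0 < col < scores.shape[1] - 1:
        x += parabolic_offset(scores[row, col - 1], scores[row, col], scores[row, col + 1])
    if 0 < row < scores.shape[0] - 1:
        y += parabolic_offset(scores[row - 1, col], scores[row, col], scores[row + 1, col])
    return x, y
```

With a 50pix x 50pix template cut from the reference band and a 100pix x 100pix search window cut from another band around the same target, the residual (dx, dy) would be the refined peak position minus the position expected for perfect co-registration. Because the score is invariant to gain and offset changes, a uniformly brighter or darker band still matches; a contrast reversal, as seen for vegetation between visible and infrared bands, does not.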

Estimation of band co-registration quality using SIFT
Because template matching with normalized cross-correlation across all the spectral bands failed in vegetated areas, the SIFT operator was employed. Despite the robustness of SIFT to radiometric image differences reported in the literature, significant differences in object appearance in visible and infrared bands resulted in very few points matched (~10 including false matches) when using the SIFT parameters suggested by (Lowe, 2004) and adopted by many authors.
Therefore, the values of several SIFT parameters were empirically optimised to yield a higher number of matched points.
More than 124 000 points were matched (NNratio = 0.8) in total between the 71 spectral bands and the reference band 31. Unfortunately, as is typical for feature based image matching, the results also included a large fraction of false matches (Figure 12). It was very difficult to visually confirm the correctness of all the matches in the band combinations used. After very careful data examination using various visualisation techniques, it was decided to remove all the (false) matches for which (1D) residuals in both horizontal and vertical direction were larger than 2 pixels (absolute value). This removed about 12 000 false matches.
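The two filtering steps described above can be sketched as follows. The sketch assumes that keypoint descriptors and image positions have already been extracted by some SIFT implementation; the function names are hypothetical. It also assumes one particular reading of the residual criterion, namely that a match is rejected as soon as either |dx| or |dy| exceeds the threshold.

```python
import numpy as np


def ratio_test_matches(desc_ref, desc_other, nn_ratio=0.8):
    """Nearest-neighbour ratio test on descriptor arrays of shape (n, d).

    Returns index pairs (i_ref, i_other) whose best distance is smaller than
    nn_ratio times the second-best distance. Needs at least two descriptors
    in desc_other.
    """
    dist = np.linalg.norm(desc_ref[:, None, :] - desc_other[None, :, :], axis=2)
    order = np.argsort(dist, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(desc_ref))
    keep = dist[rows, best] < nn_ratio * dist[rows, second]
    return np.column_stack([rows[keep], best[keep]])


def filter_by_residual(xy_ref, xy_other, max_abs=2.0):
    """Keep matches whose dx and dy residuals are both within max_abs pixels."""
    residuals = xy_other - xy_ref
    keep = (np.abs(residuals[:, 0]) <= max_abs) & (np.abs(residuals[:, 1]) <= max_abs)
    return keep, residuals
```

The residual filter relies on the bands already being orthorectified into a common grid, so that a correct match should lie at nearly the same pixel position in both bands.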
Figure 11. Average dx, dy and 2D RMSE (green, standard deviation marked in blue) for nine reference targets between respective spectral bands and the reference band 31.
Figure 12. SIFT matches, including false matches, between reference band 31 (736.8nm, left) and band 1 (604.0nm, right).
The remaining 112 000 points were statistically evaluated. All matches with a Mahalanobis distance >3 were excluded, leaving about 100 000 matches for further assessment (Figure 13). The band matching results were searched for points present in all the spectral bands, but unfortunately none were found. An example of successfully matched SIFT points (after elimination of false matches) between the reference band 31 (736.8nm) and bands at the beginning (band 25, 708.5nm) and end (band 70, 888.5nm) of the infrared range covered by the COSI camera is shown in Figure 14. In order to make the matching results more suitable for further statistical processing and comparison, the sample sizes (number of matches) per spectral band were balanced to the size of the smallest sample (299 points matched between reference band 31 and band 2) using random sampling. As in the previous image matching approach, average residuals in the horizontal (dx) and vertical (dy) image directions as well as the 2D RMSE between all the spectral bands and the reference band 31 were computed and are reported in Figure 15. The mean (all bands together) absolute values of the horizontal (dx) and vertical (dy) residuals and the 2D RMSE are equal to 0.32pix, 0.24pix and 0.45pix respectively.
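The statistical screening, sample balancing and residual statistics described above might look like the sketch below. The text does not specify on which quantities the Mahalanobis distance was computed, so applying it to the 2D residual vectors (dx, dy) is an assumption, as are all function names.

```python
import numpy as np


def mahalanobis_inliers(residuals, max_dist=3.0):
    """Flag residual vectors (n, 2) within max_dist of the sample mean."""
    diff = residuals - residuals.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(residuals, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(d2) <= max_dist


def balanced_sample(residuals_per_band, rng):
    """Randomly subsample every band to the size of the smallest sample."""
    n = min(len(r) for r in residuals_per_band.values())
    return {
        band: r[rng.choice(len(r), size=n, replace=False)]
        for band, r in residuals_per_band.items()
    }


def residual_stats(residuals):
    """Return mean dx, mean dy and the 2D RMSE of residual vectors (n, 2)."""
    dx, dy = residuals[:, 0], residuals[:, 1]
    rmse_2d = float(np.sqrt(np.mean(dx ** 2 + dy ** 2)))
    return float(dx.mean()), float(dy.mean()), rmse_2d
```

Balancing to the smallest sample keeps well-matched bands from dominating statistics pooled over all bands.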

DISCUSSION
Assessment of the hyperspectral band co-registration is not trivial due to the varying image content in bands captured at different wavelengths. The number of image bands (72) additionally increases the complexity of the analysis. Visualisation of the data, e.g. as shown in Figure 7 or Figure 8, can ease understanding of the spectral changes within the dataset as well as observation of (larger) problems with band co-registration.
Analysis of the band co-registration quality using template matching and normalized cross-correlation on nine reference points showed that the average RMSE between the spectral bands and the reference band 31 is equal to 0.25pix.This result is similar to the one recently obtained for the Rikola DT-0014 camera and reported in (Tommaselli et al., 2015) and better than numbers reported in (Vakalopoulou and Karantzalos, 2014).
Although the obtained result is considered to be of high quality, the extreme 2D residual vectors between some bands can approach 1pix (Figure 10) and can thus influence spectral measurements (a similar conclusion was drawn in (Tommaselli et al., 2015) for the Rikola camera). Nonetheless, one should keep in mind the very high spatial resolution (2cm) of the hypercube. This value was chosen relatively close to the average GSD of the raw image set (1.4cm) in order to enable the above-described analysis at the highest resolution while minimizing the gaps (missing data) in the dataset. Reconstructing the hyperspectral data with e.g. a two times larger GSD will guarantee correct spectral band co-registration, and thus enable correct spectral measurement. The resulting ground sampling distance of 4cm is still sufficient for spectral analyses within the strawberry rows. If smaller ground sampling distances are required, the flight should be performed at a lower altitude above the ground.
In Figure 11 it can be observed that the residuals were higher for bands covering the visible part of the spectrum (see standard deviation marks in Figure 11). This result may be influenced by the differences in spectral content between these bands and the reference band, which cause problems for the template matching.
Although the choice of the reference band approximately in the middle of the reflectance values change between the visible and infrared seemed reasonable, it can be optimized in a more extensive study looking at matching results of all possible band combinations.Alternative similarity measures, e.g. based on the mutual information (Shannon, 1948) can also be investigated.
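As an illustration of the mutual-information alternative mentioned above, the measure can be estimated from the joint grey-value histogram of two equally sized patches. This is a generic textbook estimator, not something evaluated in this study; the function name and the number of bins are assumptions.

```python
import numpy as np


def mutual_information(a, b, bins=32):
    """Mutual information (in nats) of two equally sized image patches."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()                 # joint probability
    px = pxy.sum(axis=1, keepdims=True)     # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)     # marginal of b, shape (1, bins)
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))
```

Unlike normalized cross-correlation, this measure stays high when the grey values of one patch are a reversed version of the other, which is the situation for vegetation that appears dark in visible bands and bright in infrared bands.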
For some of the geometric targets (e.g. GT4, GT7 or GT8) the 2D residual vectors show a systematic effect (Figure 10). It will be further investigated whether this is related to the number of flight lines covering the geometric target location. Residuals estimated in section 4.2.1 were computed for geometric reference targets and thus for points corresponding to a defined spatial location observed in all bands. Therefore the range between the extreme residuals is a valid estimate of the maximal band co-registration error. This is not the case for the results obtained using the SIFT operator, where different (ground) points were matched between the reference band 31 and each of the other bands (section 4.2.2). Despite efforts made to increase the overall number of image matches, no points appearing in all spectral bands were found. Nonetheless, this part of the study showed very interesting results. While residuals in the vertical image direction (dy) oscillate around zero (Figure 15), residuals in the horizontal image direction (dx) increase in absolute value with increasing distance between the matched bands. In other words, the further the band is from the reference band, the larger the co-registration error. This effect can be caused by the specific way in which data are captured by LVF-based imagers, and specifically by the fact that every spectral band observes the scene from a slightly different viewing angle (parallax). The influence of different viewing angles on the band co-registration should be maximal in the direction parallel to the flight (scanning) direction, and thus in this case in the horizontal image direction (see Figure 4). On the other hand, it should be minimal in the direction perpendicular to the flight direction, and thus in the vertical image direction. That seems to be the case in this study, but this conclusion should be confirmed by analyses of other datasets acquired in other flight configurations.
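The size of the parallax effect can be estimated with simple geometry. The sketch below assumes a flat-terrain model in which a height error in the surface model shifts an orthorectified point along the flight direction by the height error times the tangent of the off-nadir viewing angle. The function name is hypothetical. Since the viewing angle of reference band 31 is not given in the text, the example uses the two angles that are stated: band 1 at 10.4 degrees off-nadir and band 49 at nadir.

```python
import math


def parallax_shift(height_error_m, angle_band_deg, angle_ref_deg=0.0):
    """Relative along-track shift (m) between two bands for a given height error."""
    tan_band = math.tan(math.radians(angle_band_deg))
    tan_ref = math.tan(math.radians(angle_ref_deg))
    return height_error_m * (tan_band - tan_ref)


# Band 1 (10.4 degrees off-nadir) against the nadir band 49, 10 cm height error.
shift_m = parallax_shift(0.10, 10.4)
print(f"relative shift: {100 * shift_m:.1f} cm")
```

Under these assumptions a 10 cm height error already produces a relative shift of roughly 1.8 cm, which is close to one 2cm pixel of the reconstructed hypercube. This is consistent with larger horizontal residuals for bands further from the reference band and for points on 3D objects.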
The average residual RMS error between all the bands and the reference band 31 obtained using the SIFT operator was at the level of 0.45pix. This value is higher than the corresponding value found by the template matching technique on the geometric reference targets, most probably because points across the whole scene, including those on 3D objects like strawberry plants, were matched with the SIFT operator. The position of such points is heavily influenced by inaccuracies and smoothing of the DSM used at the orthorectification stage of data processing. This influence is much smaller for flat geometric reference targets placed in between the strawberry rows. Again, these results are comparable with the ones reported in (Tommaselli et al., 2015) for the Rikola camera.

CONCLUSIONS
This paper presents and explores hyperspectral data acquired over an experimental strawberry field with a compact hyperspectral imaging system (COSI) recently developed at the Flemish Institute for Technological Research (VITO). An overview and comparison of various hyperspectral imaging approaches suitable for small RPAS platforms is also sketched.
First results of the geometric data assessment, and in particular of the spectral band co-registration, show the capability of the COSI system to provide data from which the spectral information can be correctly retrieved with very high spatial resolution. In addition to the orthorectified hypercube, a digital terrain model can be derived, enabling various analyses that require object height, e.g. plant height in vegetation growth monitoring.
In the explored dataset the geometric band co-registration is comparable to or better than the corresponding values reported in the literature for data captured by the Rikola camera (Tommaselli et al., 2015; Vakalopoulou and Karantzalos, 2014). The obtained results are very encouraging; nonetheless, evaluation of more datasets covering different environments is required. Flights with a dedicated geometric calibration field and ground control points are foreseen in order to gain a more in-depth understanding of the error distribution throughout the scene and to assess the accuracy of data scaling and georeferencing based solely on the RPAS platform GPS data.

Figure 2. Principle of data acquisition with LVF based imagers.

Figure 4. An overview of an experimental strawberry field near Sint Truiden, Belgium. The area marked in yellow is enlarged in Figure 5. Geometric reference targets are marked in blue. False colour composite (R=801.7nm, G=672.6nm, B=604.0nm).

Figure 6. The DSM extracted from the COSI images, seen from two different perspectives. The area shown in Figure 5 is marked in red.

Figure 8. Cross section through all the spectral bands (top). The slice location is marked in yellow (bottom).

Figure 9. Neighbourhood (100pix x 100pix) of reference target GT1 in eight spectral bands (band number in brackets). The position of the matched template (50pix x 50pix, extracted from band 31) is marked in red.
Although the template matching results summarized above are very promising, they estimate geometric errors on flat objects (targets) and thus are not necessarily representative of the entire dataset, which includes irregular 3D objects like strawberry plants. In order to get better estimates of the geometric data quality in the vegetated areas, feature based matching was used.

Figure 10. Residuals (emphasized 200 times) between all spectral bands and the reference band 31.

Figure 13. Number of points matched between the reference band 31 and other spectral bands using optimized SIFT parameters.

Figure 15. Average residuals (green, standard deviation marked in blue) in horizontal (dx) and vertical (dy) image direction and 2D RMSE between the reference band 31 and respective spectral bands.