Multi-resolution terrestrial hyperspectral dataset for spectral unmixing problems

Recent developments in the miniaturization of hyperspectral imaging sensors have increased the use of hyperspectral imagery as the primary data for evaluating spectral unmixing algorithms in applications such as industrial quality control, agriculture, mineral mapping, and defence. This article presents an ultra-high-resolution hyperspectral imagery dataset for undertaking benchmark studies on spectral unmixing. A terrestrial hyperspectral imager (THI) is used for imaging the target scene, with the camera sensor pointing horizontally towards the scene. The datasets are acquired at various spatial resolutions ranging from 1 mm to 2 cm. The targeted scene contains several paper-based panels, each 2 cm x 2 cm in size and filled with different colours in different proportions, glued to a black background board with a clear separation between adjacent panels. In addition to the hyperspectral imagery, reference spectral signatures of the candidate mixture materials are obtained by in-situ hyperspectral reflectance measurements using a spectroradiometer. The hyperspectral imagery and the in-situ spectral signatures of the target scene are collected under natural illumination conditions. The proposed datasets are designed for undertaking proof-of-concept (PoC) studies in spectral unmixing. The datasets are also valuable for evaluating the performance of different statistical and machine learning algorithms for target detection, classification, and sub-pixel classification.

Description of data collection
• Five paper materials of different colours and a black background were chosen as the candidate materials for acquiring imagery
• The colour proportions were arranged in 2 cm x 2 cm grids in different ways
• A ground-based hyperspectral imaging sensor was used for imagery acquisition
• A spectral library was built from in-situ reflectance spectra collected using a field spectroradiometer

Value of the Data
• Spectral unmixing has been one of the classical approaches to analysing remote sensing imagery at the sub-pixel level. Initially developed as an analytical technique for estimating fractional abundances of surface materials in coarse-resolution multispectral remote sensing, spectral unmixing has been extensively studied during the last three decades [1][2][3][4][5]. The advances in hyperspectral imaging technology have the potential to expand the application of spectral unmixing to the precise retrieval of material fractions in military, mineral mapping, food processing, and quality control applications. Due to the lack of pixel-to-pixel ground reference measurements, the methods and algorithms designed for spectral mixture modelling often have to rely on generic or synthetic datasets for modelling and validation [6, 7]. In addition, benchmark datasets for assessing the impact of varying spatial resolution on sub-pixel classification are scarce.
• The problem of spectral unmixing is being pursued using several linear and non-linear approaches. This dataset is useful for carrying out proof-of-concept (PoC) studies, testing mathematical constructs, and evaluating the performance of the developed processes and algorithms.
• It has often been found that theoretical algorithms have certain limitations when deployed in realistic environments at different spatial scales. As the dataset allows algorithms to be implemented at different spatial resolutions of hyperspectral imagery, it will enable the development of scale-invariant methods for spectral unmixing problems.
• The datasets are a valuable resource for researchers and scientists developing algorithms and concepts for solving spectral unmixing, classification, and target detection problems.

Data Description
The entire dataset is gathered in a single folder, 'Hyperspectral_Unmixing_Data', and then zipped. On extracting this zip file, there are two folders, 'Hyperspectral_Image_Data' and 'Spectral_Library'. 'Hyperspectral_Image_Data' has two subfolders, 'Raw_Data' and 'Processed_Data'. In the 'Raw_Data' folder, there are five radiance files in the standard ENVI ".hdr" format. The 'Processed_Data' folder is divided into five subfolders (named 5m, 10m, 25m, 50m, and 62m), each of which is further divided into three subfolders (named BFM, SBM, and MBM). Each of the BFM, SBM, and MBM subfolders contains ten reflectance files in ENVI ".hdr" file format with 743 spectral bands each.
The file name of each reflectance file follows the pattern "C_Dm_X_Np.hdr", where the capital letters C, D, X, and N are placeholders for the letters and numbers described below.
"C" - indicates the mixture category by one of three letters: M - background-free mixture (BFM), S - single-background mixture (SBM), and D - multiple-background mixture (MBM).
"D" - represents the distance between the sensor and target: 5 m, 10 m, 25 m, 50 m, or 62 m.
"X" - specifies the mixture composition as listed in Tables 1-3, ranging from 1 to 10.
"N" - gives the number of pixels in a row or column: 16, 10, 8, 5, or 4, depending on the distance.
The 'Spectral_Library' folder is subdivided into two folders. The first, "Raw", contains the raw reflectance spectral library in the ".hdr" and ".ascii" file formats. The second, "Processed", contains the reflectance spectra resampled to the bands of the hyperspectral imagery, in the ".xlsx" file format.
The entire hierarchy of this section is shown in Fig. 1 .
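The file naming convention described above lends itself to automated parsing. The following is a minimal sketch, assuming the file names follow the pattern literally (for example, a name such as "S_25m_7_8p.hdr"); the example name, the regular expression, and the helper `parse_reflectance_filename` are illustrative assumptions and are not part of the dataset.

```python
import re
from typing import NamedTuple

# Assumed literal form of the "C_Dm_X_Np.hdr" pattern, e.g. "S_25m_7_8p.hdr".
FILENAME_RE = re.compile(
    r"^(?P<category>[MSD])_(?P<distance>\d+)m_(?P<composition>\d+)_(?P<pixels>\d+)p\.hdr$"
)

# Category letters as described in the text: M -> BFM, S -> SBM, D -> MBM.
CATEGORY_NAMES = {"M": "BFM", "S": "SBM", "D": "MBM"}


class ReflectanceFile(NamedTuple):
    category: str         # "BFM", "SBM", or "MBM"
    distance_m: int       # sensor-to-target distance in metres
    composition: int      # mixture composition index (1 to 10)
    pixels_per_side: int  # number of pixels in a row or column


def parse_reflectance_filename(name: str) -> ReflectanceFile:
    """Split a reflectance file name into its four components."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"file name does not follow the assumed pattern: {name!r}")
    return ReflectanceFile(
        category=CATEGORY_NAMES[match["category"]],
        distance_m=int(match["distance"]),
        composition=int(match["composition"]),
        pixels_per_side=int(match["pixels"]),
    )


# Hypothetical file name used only for illustration.
info = parse_reflectance_filename("S_25m_7_8p.hdr")
```

If the actual file names differ in separators or capitalisation, only the regular expression would need to be adjusted.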

Experimental Design
The experimental design is based on a proof-of-concept test of the assumptions in spectral unmixing, as described in [8]. In this experiment, each grid has an area of 2 cm x 2 cm and is filled with up to five distinct colour materials. Each colour represents a distinct material, and hence its reflectance spectrum is treated as an endmember. Various scenarios of mixture combinations are prepared with different sizes, orientations, and spatial distributions of materials, with delineable boundaries within a grid. These grids are realized physically by printing the colours in grid form. Single-colour candidates are treated as pure pixels, and the combination of different colours in a grid represents mixed pixels.
We designed three different categories of spectral mixtures using five different colour materials: green, magenta, red, blue, and violet. The first category has ten candidate images: five images representing pure materials and five images representing mixtures of materials with combinations of variable size and spatial orientation inside a grid. In this category, there is no local background in the imagery, which presents only the spectra of pure and mixed candidate materials. For ease of reference, this category of images is named 'background-free-mixture (BFM)'. The composition of the mixtures is shown in Fig. 2, and the areal proportions are presented in Table 1.
The second category has ten images representing different mixtures of two different materials indicated by two different colours. Each candidate mixture consists of different proportions of the green colour material, behaving as local background, and the violet colour material as foreground. This set is termed 'single-background-mixture (SBM)'. The areal distributions of the two endmembers are listed in Table 2, and the visualization of the dataset is shown in Fig. 3.
The third category, labelled 'multiple-background-mixture (MBM)', has a background of four different materials with fixed areal proportions. The foreground material (violet colour) has different proportions, indicating different levels of interaction between background and foreground materials. Fig. 4 shows the candidate mixture set, and its areal distributions are listed in Table 3.
All three categories of images representing different mixtures were realized by printing the respective colour grids on standard printing paper and affixing them to a wooden board coated with a black material. A radial distance of 10 cm was maintained to separate the candidate mixtures from one another. The black colour behaves as a global background to all candidate mixtures. The wooden board carrying the candidate mixtures was mounted on a tripod for imaging with the hyperspectral camera.
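Under the linear mixing assumption, the spectrum of a mixed grid is the areal-proportion-weighted sum of the endmember spectra, so the known areal proportions can serve as reference abundances. The sketch below illustrates this idea with synthetic spectra and non-negative least squares with a weighted sum-to-one row; it is one common linear approach, not a method prescribed by this article. The endmember spectra, the abundance values, and the function names `mix_linear` and `unmix_constrained` are hypothetical and are not taken from Tables 1-3.

```python
import numpy as np
from scipy.optimize import nnls


def mix_linear(endmembers: np.ndarray, abundances: np.ndarray) -> np.ndarray:
    """Linear mixing: endmembers is (bands, materials), abundances is (materials,)."""
    return endmembers @ abundances


def unmix_constrained(endmembers: np.ndarray, pixel: np.ndarray,
                      weight: float = 1e3) -> np.ndarray:
    """Estimate non-negative abundances that approximately sum to one.

    The sum-to-one constraint is imposed softly by appending a heavily
    weighted row of ones to the endmember matrix before solving NNLS.
    """
    n_materials = endmembers.shape[1]
    augmented = np.vstack([endmembers, weight * np.ones((1, n_materials))])
    target = np.concatenate([pixel, [weight]])
    abundances, _residual = nnls(augmented, target)
    return abundances


# Synthetic stand-ins for five colour materials (hypothetical values).
rng = np.random.default_rng(0)
endmembers = rng.uniform(0.05, 0.9, size=(50, 5))

# Hypothetical areal proportions of one mixed grid.
true_abundances = np.array([0.5, 0.25, 0.25, 0.0, 0.0])
mixed_pixel = mix_linear(endmembers, true_abundances)
estimated = unmix_constrained(endmembers, mixed_pixel)
```

With real imagery, the synthetic `endmembers` would be replaced by the library or image-derived spectra, and the estimated abundances would be compared against the areal proportions of the corresponding grid.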

Data Acquisition
The data acquisition set-up designed for multiple resolutions is shown in Fig. 5. The hyperspectral imagery data was acquired using a terrestrial hyperspectral imager (THI) (Headwall Photonics Inc., USA; Model: A-Series). The spectral imager used in the acquisition has 854 spectral bands in the visible and infrared range of 400 to 1000 nm of the electromagnetic spectrum. The sensor uses push-broom imaging mode and can record 1004 pixels in a single column.
An additional rotating stage was attached between the sensor and the tripod to scan an area. The speed of the rotational stage (angular view range 0°-360°) was manually adjusted between 0.01° and 30° according to the frame rate of the sensor. During acquisition, signal saturation was avoided by manually adjusting the exposure time. The exposure time was determined by focusing on a white reference plate (made of barium sulphate) until the black and white lines were separable on the live feed screen of the connected computer. Based on the exposure time, the rotation speed was calculated and given as input through the rotating stage control unit attached to the same computer.
The effective spatial resolution depends on the lens used and the distance between the targeted scene and the sensor. We acquired multiple datasets of hyperspectral imagery at various spatial resolutions ranging from 1 mm (5 m distance between sensor and scene) to 2 cm (62 m distance between sensor and scene) using a 23 mm lens. The actual targeted image frame used in the field for acquisition is shown in Fig. 6 .
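The dependence of the effective spatial resolution on lens and distance can be approximated with the small-angle relation: ground pixel size = distance x (detector pixel pitch / focal length). The sketch below applies this relation to the five acquisition distances with the 23 mm lens. The detector pixel pitch is not stated in this article, so the default value used here is a placeholder assumption, and the function name `ground_pixel_size_mm` is hypothetical.

```python
def ground_pixel_size_mm(distance_m: float,
                         focal_length_mm: float = 23.0,
                         pixel_pitch_um: float = 7.4) -> float:
    """Approximate size of one pixel on the target, in millimetres.

    pixel_pitch_um is a placeholder assumption (not from the article);
    replace it with the value from the sensor datasheet.
    """
    # Instantaneous field of view of one detector pixel, in radians.
    ifov_rad = (pixel_pitch_um * 1e-3) / focal_length_mm
    # Small-angle approximation: footprint = distance * IFOV.
    return distance_m * 1e3 * ifov_rad


# Acquisition distances reported in the text.
distances_m = (5, 10, 25, 50, 62)
pixel_sizes_mm = {d: ground_pixel_size_mm(d) for d in distances_m}
```

The relation is linear in distance, so doubling the distance doubles the pixel footprint; the absolute values depend entirely on the assumed pixel pitch.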
The in-situ measurements were collected using a high-resolution spectroradiometer (HR-1024i, Spectra Vista Corporation, USA). This instrument records the reflected radiance of the target material in the 350-2500 nm portion of the electromagnetic spectrum. The hyperspectral image acquisition and in-situ spectral measurements were carried out on January 31, 2019, from 11:00 to 13:00 hrs IST. The procedure suggested by [9] was adapted for acquiring the reflectance measurements.

Data Pre-processing
The hyperspectral imagery datasets acquired are in radiance units. In the first stage, the radiance data were converted into reflectance data using the reference spectral imagery acquired over a white reference plate of known spectral calibration. The reflectance data were further smoothed in the spectral dimension using the Savitzky-Golay filter [10]. After the filtering, the data were spectrally subset to the 400-920 nm range to eliminate uncalibrated, extremely noisy bands. The image data were spatially cropped to carve out several subsets based on the mixture categories (BFM, SBM, and MBM). The number of distinct pixels in each subset representing different material proportions, at the different spatial resolutions indicated by the distance from the sensor, is listed in Table 4.
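The pre-processing chain described above (white-reference normalization, Savitzky-Golay smoothing, spectral subsetting) can be sketched as follows. This is a minimal illustration on synthetic data, not the processing code used for the dataset: the filter window length and polynomial order, the assumption of a white plate with unit reflectance, the evenly spaced band centres, and the function names are all assumptions.

```python
import numpy as np
from scipy.signal import savgol_filter


def radiance_to_reflectance(radiance, white_radiance, white_reflectance=1.0):
    """Ratio each pixel spectrum to the white-reference spectrum.

    radiance: (rows, cols, bands); white_radiance: (bands,).
    white_reflectance is the plate's known reflectance (assumed 1.0 here).
    """
    return radiance / white_radiance * white_reflectance


def preprocess(radiance, white_radiance, wavelengths_nm,
               window_length=11, polyorder=2, keep_range=(400.0, 920.0)):
    """Convert to reflectance, smooth spectrally, and keep the chosen range."""
    reflectance = radiance_to_reflectance(radiance, white_radiance)
    # Smooth along the last (spectral) axis; window and order are assumed values.
    smoothed = savgol_filter(reflectance, window_length, polyorder, axis=-1)
    low, high = keep_range
    keep = (wavelengths_nm >= low) & (wavelengths_nm <= high)
    return smoothed[..., keep], wavelengths_nm[keep]


# Synthetic demonstration: 854 evenly spaced bands between 400 and 1000 nm.
wavelengths = np.linspace(400.0, 1000.0, 854)
true_reflectance = np.broadcast_to(0.2 + 0.0005 * (wavelengths - 400.0), (4, 4, 854))
white = 50.0 + 10.0 * np.sin(wavelengths / 100.0)   # arbitrary positive spectrum
radiance = true_reflectance * white

reflectance, kept_wavelengths = preprocess(radiance, white, wavelengths)
```

Because a Savitzky-Golay filter of order two reproduces linear spectra exactly, the synthetic reflectance is recovered unchanged here; on real data the window length controls the trade-off between noise suppression and preservation of narrow spectral features.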
The reflectance spectral measurements acquired using the field spectroradiometer were organized in the form of a spectral library. This spectral library was resampled to conform to the bands of the hyperspectral imagery using spectral response function modelling. The spectral signatures of the different colour materials are shown in Fig. 7.
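Resampling a field spectrum to the imager's bands can be illustrated by convolving the spectrum with a modelled spectral response function for each band. The sketch below assumes Gaussian response functions with a common full width at half maximum (FWHM) and evenly spaced band centres; the actual response functions, FWHM, and band centres of the imager are not given in this article, so these values and the function name `resample_gaussian_srf` are assumptions.

```python
import numpy as np


def resample_gaussian_srf(src_wavelengths, src_spectrum, band_centers, fwhm_nm):
    """Resample a spectrum to new bands using Gaussian response functions.

    Each output band is the response-weighted mean of the source spectrum.
    Weights include the local source sampling interval, so non-uniformly
    sampled source spectra are handled approximately.
    """
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    # Response of every band (rows) at every source wavelength (columns).
    offsets = src_wavelengths[None, :] - band_centers[:, None]
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    weights = weights * np.gradient(src_wavelengths)[None, :]
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ src_spectrum


# Synthetic field spectrum on a 1 nm grid over 350-2500 nm (constant reflectance).
field_wavelengths = np.arange(350.0, 2501.0, 1.0)
field_spectrum = np.full(field_wavelengths.shape, 0.4)

# Assumed band centres: 743 evenly spaced bands over 400-920 nm; FWHM is hypothetical.
image_band_centers = np.linspace(400.0, 920.0, 743)
resampled = resample_gaussian_srf(field_wavelengths, field_spectrum,
                                  image_band_centers, fwhm_nm=2.0)
```

In practice the band centres and widths would be read from the image header rather than generated, and a different response shape could be substituted without changing the rest of the routine.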

Ethics Statements
No animal or human subjects are used in the experimental set-up. The data is not collected from any social media platform.

CRediT Author Statement
Manohar Kumar C. V. S. S.: data acquisition and pre-processing, writing original draft; Sudhanshu Shekhar Jha: data acquisition, editing and review; Rama Rao Nidamanuri: conceptualization, experiment design, review; Vinay Kumar Dadhwal: data quality review, supervision.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.

Data Availability
Multi-resolution Terrestrial Hyperspectral Dataset for Spectral Unmixing Problems (Original data) (Mendeley Data).