A novel hyperspectral compressive sensing framework of plant leaves based on multiple arbitrary-shape regions of interest

Massive plant hyperspectral images (HSIs) result in huge storage space and put a heavy burden for the traditional data acquisition and compression technology. For plant leaf HSIs, useful plant information is located in multiple arbitrary-shape regions of interest (MAROIs), while the background usually does not contain useful information, which wastes a lot of storage resources. In this paper, a novel hyperspectral compressive sensing framework for plant leaves with MAROIs (HCSMAROI) is proposed to alleviate these problems. HCSMAROI only compresses and reconstructs MAROIs by discarding the background to achieve good reconstructed performance. But for different plant leaf HSIs, HCSMAROI has the potential to be applied in other HSIs. Firstly, spatial spectral decorrelation criterion (SSDC) is used to obtain the optimal band of plant leaf HSIs; Secondly, different leaf regions and background are distinguished by the mask image of the optimal band; Finally, in order to improve the compression efficiency, after discarding the background region the compressed sensing technology based on blocking and expansion is used to compress and reconstruct the MAROIs of plant leaves one by one. Experimental results of soybean leaves and tea leaves show that HCSMAROI can achieve 3.08 and 5.05 dB higher PSNR than those of blocking compressive sensing (BCS) at the sampling rate of 5%, respectively. The reconstructed spectra of HCSMAROI are especially closer to the original ones than that of BCS. Therefore, HCSMAROI can achieve significantly higher reconstructed performance than that of BCS. Moreover, HCSMAROI can provide a flexible way to compress and reconstruct different MAROIs with different sampling rates, while achieving good reconstruction performance in the spatial and spectral domains.


INTRODUCTION
Hyperspectral imaging (HSI) provides a potential insight into plant-pathogen interactions (Mahlein et al., 2018). Physiological and biochemical parameters of plant leaves can be effectively predicted with the reflectance spectra extracted from plant leaf hyperspectral images (HSIs), which can be an appropriate measure to evaluate the health and vitality state of plants. HSI has been used to do early detection of grapevine leafroll disease in a red-berried wine grape cultivar (Gao et al., 2020). The moisture content of tea leaves can be detected with hyperspectral imaging technology (Wei et al., 2019). HSIs can detect leaves at the stages of freezing and post-thawing injury, and quantify the impacts of freezing injury on leaf water and pigment contents (Wei et al., 2017). Therefore, with the wide use of HSIs in plant leaves, it brings huge challenge to their acquisition, storage and transmission.
Compressive sensing (CS) can not only combine the process of acquisition and compression, but also reconstruct them accurately. In recent years, CS has been a hotspot in the field of HSI. The applications of CS includes reconstruction, denoising, and unmixing et al. For reconstruction applications, based on a multidimensional multiplexing (MDMP) CS scheme, a tensor nonlinear CS (T-NCS) algorithm for noniterative recovery of HSIs using example-aided and self-learning compressive hyperspectral imaging (CHI) approaches was proposed (Yang et al., 2015). Martin, Bioucas-Dias & Plaza (2015) exploited the high correlation existing among the components of the hyperspectral data sets to introduce a new hyperspectral coded aperture (HYCA), which largely reduced the number of measurements to reconstruct the original data correctly. Zhang et al. (2016b) brought forward a structured sparsity-based hyperspectral blind CS (SSHBCS) algorithm in which a unified optimization framework was derived to jointly learn the dictionary and the corresponding sparse representation from the measurements. Wang et al. (2017) considered a single HSI as a tensor with three modes (width, height and band) and then identified the hidden spatial-spectral structures using low-rank Tucker decomposition to model the global spatial-spectral correlation. Gelvez, Rueda & Arguello (2017) proposed a joint sparse and low rank HSI recovery algorithm, which included a nuclear norm regularization term in the objective function, and then obtains the optimal solution through iteration. Xu et al. (2018) proposed a Mahalanobis distanceregularized tensor robust principal component analysis framework for hyperspectral CS with anomaly detection which reconstructed the HSI and detected the anomalies simultaneously. Meza et al. (2018) studied the spatial and spectral correlation of HSI cube, and reconstructed hyperspectral data through non-local means regularization and Bregman optimization techniques. Xue et al. (2019) proposed a nonlocal tensor sparse and low-rank regularization (NTSRLR) approach, which can encode essential structured sparsity of an HSI and explore its advantages for HSI-CSR task. For denoising applications, Hassanzadeh & Karami (2017) put forward a new lossy compression and noise reduction algorithm for HSIs based on non-negative Tucker decomposition (NTD) and CS. To model the distribution of structured sparsity in HSIs, and to make it adaptable to the unknown noise, Zhang et al. (2016a) proposed a novel reweighted Laplace prior based on hyperspectral compressive sensing (RLPHCS) algorithm. Tan et al. (2016) extended the AMP-Wiener algorithm to three-dimensions and applied it to the coded aperture snapshot spectral imager (CASSI) system for hyperspectral image reconstruction. For unmixing applications, Li et al. (2012) proposed a novel numerical procedure, and proposed an augmented Lagrangian algorithm for reconstructing the model. Junmin & Jiangshe (2014) casted the spectral unmixing into the CS framework by multiplying a random Gaussian measurement matrix on the SR model and investigated the effect of the proposed model on the recovered abundances obtained by the l1-minimization algorithms. A locally similar sparsity-based hyperspectral unmixing CS (LSSHUCS) method was proposed to unmix the HSIs with an established redundant endmember library (Zhang et al., 2016c). Based on spectral linear mixing model, a CS reconstruction algorithm with spectral unmixing for HSIs is proposed (Wang et al., 2018).
In terms of plant HSIs, especial for plant leaf HSIs, it is common that plant leaves belong to multiple arbitrary-shape regions of interest (ROIs) and background wastes too much memory resource. These above state-of-the-art hyperspectral CS algorithms can only compress and reconstruct the whole cubic HSIs and cannot process the arbitrary shape ROIs to achieve good compression performance.
In our previous works, we introduced the CS into the compression and reconstruction of plant HSIs and chose three critical physiological and biochemical parameters of plant to verify the retrieving efficacy of the proposed algorithms (Ping et al., 2016;Xu et al., 2017). In this paper, a general hyperspectral CS framework for plant leaf based on multiple arbitrary-shape ROIs (HCSMAROI) is proposed to further improve the compression efficiency of plant leaves HSIs at low sampling rates. HCSMAROI can not only compress different plant leaf HSIs, but also have the potential to be applied in other hyperspectral images with multiple arbitrary-shape bands of interest (MAROIs). HCSMAROI can compress different ROIs with different target rates. The spatial spectral decorrelation criterion (SSDC) is used to obtain the optimal band of plant leaf HSIs. Then, the labeled mask of the multiple arbitrary-shaped ROIs can be extracted according to the optimal band. After discarding the useless background data, each arbitrary-shape ROI can be divided into different blocks, which can be transformed into one-dimension data, and then compressed and reconstructed at the preset target bitrates using the random Gaussian matrix and staged orthogonal matching pursuit (StOMP) (Donoho et al., 2012).
The main contributions of this paper are as follows: 1. A novel hyperspectral CS framework is proposed to compress and reconstruct plant leaf HSIs; 2. The hyperspectral CS framework can not only be used in different plant leaf HSIs, but also have the potential to be applied in other HSIs; 3. A labeled mask is extracted to obtain multiple arbitrary-shape ROIs according to the optimal band; 4. Different arbitrary-shape ROIs can be compressed and reconstructed with different target bitrates after blocking and expansion.

ALGORITHM HCSMAROI
For most hyperspectral applications, plant leaf HSIs can be divided into multiple arbitrary shape region of interests and a background region in the spatial domain. This framework has the novelty that it can compress different arbitrary shape ROIs independently with different target rates and discard the background region. It can not only compression plant leaf HSIs, but also have the potential to be applied in other hyperspectral images with MAROIs. Fig. 1 shows the overview of HCSMAROI. Different types of HSIs can be effectively divided with appropriate strategies. For plant leaf HSIs, SSDC can be used to obtain the optimal band and then different ROIs can be labeled. For other types of HSIs, such as HSIs of satellite remote sensing, many image terrain classification algorithms can be used to extract the ROIs.
The proposed framework can be summarized as follows: Firstly, SSDC is used to obtain the key band of plant leaf HSIs; Secondly, the primary mask of the multiple arbitrary-shape ROIs could be extracted, and the final labeled mask of different ROIs can also be obtained in the key band; Thirdly, after discarding the useless background data for saving the memory storage, all arbitrary-shape ROIs of plant HSIs can be divided into blocks in the spatial domain and transformed into one-dimensional data one by one, then a random Gaussian matrix is selected to compress each arbitraryshape ROI; Fourthly, StOMP is chosen to reconstruct each arbitrary-shape ROI; Finally, the reconstructed plant leaf HSI can be obtained by combining all reconstructed MAROIs.
Extraction the optimal band using SSDC Gao et al. (2013) had compared the noise estimation performance of six main evaluation methods of homogeneous area (HA), geo-statistical (GS), local means and local standard deviations (LMLSD), SSDC, residual-scaled local standard deviation (RLSD), and homogeneous regions division and spectral de-correlation (HRDSDC), and obtained the conclusion that SSDC was the most reliable evaluation method for hyperspectral noise estimation performance among these six evaluation methods. Therefore, SSDC is chosen to evaluate the image quality of HSIs. After calculating the SSDC values of all bands, the band with the minimal SSDC is regarded as the optimal band.

ROI mask extraction and labeling
After obtaining the optimal band, k-means clustering algorithm is used to extract the mask for the arbitrary-shape ROIs of the HSIs. According to the characteristics of the optimal band, the number of classes, m, needs to be preset. The optimal band can be divided into m classes. But for the background region, the remaining m-1 classes can be combined to obtain the primary binary mask and construct the whole leaf regions. Each region in the primary binary mask can represent a ROI in the optimal band. Each 8-connected region in the primary binary mask can be labeled with an index value to construct the labeled mask.

Blocking and expansion for arbitrary-shape ROIs
In order to obtain good reconstructed performance, hyperspectral data of ROIs were compressed and reconstructed while those of background were discarded. Because of stronger spectral correlation of HSIs, compared with the spatial correlation, all ROIs were divided into multiple sub blocks only in the spatial domain. In the light of our extensive experiments, spatial block size of 2 × 2 can be used to achieve good reconstructed performance.
Considering the high correlation of neighboring pixels in the spatial domain, each block of the same position of all bands is firstly expanded into one-dimensional data in the order of zigzag band by band, and then combined to construct the whole one-dimensional data. Figure 2 shows the expansion and inverse expansion scheme of a block.

CS and reconstruction for arbitrary-shape ROIs
The measurement matrix f 2 R MÂN is used to sample each one-dimension expansion data of arbitrary-shape ROIs randomly; M is the signal length after sampling, N is the expansion length of a block, M << N; f is the random Gauss measurement matrix. The sampling rate can be defined as follow.
The sampling process is as below.
where X is a one-dimension expansion data, and Y is the observed data. We use Discrete Cosine Transform (DCT) matrix to construct the sparse basis w, w 2 R NÂN , w T Â w ¼ I, I is the identity matrix, which is an orthonormal matrix. Ψ is given by (3) The sensing matrix A is constructed by Ψ and Φ and is shown by Eq. (4). To ensure the convergence of the reconstruction algorithm, the sensing matrix A must satisfy the principle of the restricted isometry property (RIP) (Candès, 2008).
Due to the high efficiency and high accuracy of StOMP, it is chosen as the reconstructed algorithm to reconstruct each one-dimension expansion data.
The inverse expansion process (Fig. 2B) is performed to reconstruct a block. After all blocks have been reconstructed, each ROI of plant leaf HSIs can be reconstructed. Finally, the reconstructed plant leaf HSI can be obtained by combining all reconstructed MAROIs according to the labeled mask.

PERFORMANCE EVALUATION METHODS
Peak Signal to Noise Rate (PSNR) is chosen to evaluate the reconstructed performance of each ROI in the spatial domain. Mean Square Error (MSE) and PSNR of ROI are defined as follows: where idx is the index of ROIs, Area ROIðidxÞ is the total pixel count of the idx-th ROI, Xrec is the reconstructed data, Xori is the original data, and n is the number of bits per sample.
The reconstructed plant spectral indices Triangular Vegetation Index (TVI) and Double Difference (DD) are used to evaluate the reconstructed effectiveness in the spectral domain. Two physiological indices in Table 1 are chosen to evaluate performance, where Rx represents the spectral reflectance at the wavelength of x nm (Broge & Leblanc, 2000;le Maire, François & Dufrêne, 2004).

EXPERIMENTAL RESULTS AND ANALYSIS
In the experiments, two types of plant leaf HSIs are chosen. One is soybean leaf and the other is tea leaf.

HSIs of soybean leaves and tea leaves
In this study, a hyperspectral imaging system ImSpector V10E (ImSpector V10E, SPECIM, Finland) covering the spectral wavelengths of 400-1,000 nm was used to snap the soybean leaf HSIs. The system includes a CCD camera (DL-604M, Andor, Ireland), an imaging spectrograph, a lens, a light sources provided by 150 W quartz tungsten halogen lamp (2900-ER, Illumination, Santa Monica, CA, USA) and software (Isuzu Optics Corp, Zhubei, Taiwan) for the computer operating the spectral image system. The spectral resolution is 0.65 nm and the resolution of CCD array detector of the camera is 1,344 × 1,024. The system is driven by electronically controlled displacement platform (IRCP0076, Isuzu Optics Corp, Taiwan) to scan the samples line by line.
Two sets of soybean leaf HSIs are used in the experiment. The wavelength range of the first soybean leaf HSIs is from 476 nm to 939 nm, in which there are totally 596 bands. A single pixel is defined by a 16-bit unsigned integer and their resolution is 834 × 1,004, that is to say, the total pixel count of a band is 837,336. The wavelength range of the second soybean leaf HSIs is from 460 to 900 nm, in which there are totally 568 bands, and their resolution is 706 × 887. The bands of 666, 558 and 476 nm of these two sets of soybean leaf HSIs are selected as the red, green and blue channels to construct the false color composite RGB images as shown in Figs

Mask extraction
Primary mask extraction Figure 5A indicates the SSDC curve of the first soybean leaf HSIs in which the 702 nm band shows the lowest SSDC value. Figure 5B shows the SSDC curve of the first tea leaf HSIs in which the 690 nm band obtains the lowest SSDC value. Therefore, the 702 nm band and the 690 nm band are chosen as the optimal bands which can be used to extract the mask of multiple arbitrary-shape ROIs of four pieces of soybean leaves and six pieces of tea leaves, respectively. Figures 6 and 7 show the 702 nm band image of soybean leaf HSIs and the 690 nm band image of tea leaf HSIs, respectively. The k-means clustering algorithm is used to cluster these optimal bands. The number of classifications m is set as four and three for the first soybean leaf HSIs and the first tea leaf HSIs, respectively. Figures 8 and 9 show their primary binary masks by the k-means clustering algorithm. Labeled mask extraction The ROI regions of the primary mask are labeled from left to right and from top to bottom. In the labeled mask, the larger ROI label value, the higher its gray value. Figures 10 and 11 show the labeled masks of multiple arbitrary-shape ROIs of soybean leaves and tea leaves. The total pixel area Area_ROIs of all ROIs in the labeled mask can be obtained as follows.
where Area BOIðiÞ is pixel area of the ith ROI in the labeled mask, n is the total number of ROIs. Reconstruction performance analysis of multiple arbitrary-shape ROIs using the primary mask In this case, multiple arbitrary-shape ROIs can be compressed and reconstructed using the primary binary mask at the whole target sampling rate. Single spectral compressive sensing (SSCS) and blocking compressive sensing (BCS) are chosen to make the performance comparison. The sampling rates of soybean leaf HSIs are 5%, 10%, 15% and 20%, and the sampling rates of tea leaf HSIs are 3%, 5%, 8%, 12%, 17% and 20%. That is to say, the actual sampling rates of different ROIs are respectively written as follows.
Ratio p ¼ Area whole Area ROIs Â p (8) where Area whole is the whole area of all ROIs in the primary mask, p is the sampling rates of the HSIs, and Ratio p is the sampling rates of all ROIs.

Spatial domain reconstruction analysis
Figures 12-15 illustrate the experimental results of soybean leaf HSIs for SSCS, BCS and HCSMAROI at the sampling rates from 5% to 20%. Figures 16-21 illustrate the experimental results of tea leaf HSIs for SSCS, BCS and HCSMAROI at the sampling rates from 3% to 20%. Compared with SSCS, BCS preserves the spatial correlation to achieve better reconstructed image. HCSMAROI allocates all the bit rates to the leaf regions and can achieve significantly better reconstructed subjective quality for leaf regions than that of the other two algorithms at low sampling rates. The reconstructed PSNRs performances of soybean leaf HSIs as shown in Table 2, the reconstructed PSNRs performances of tea leaf HSIs as shown in Table 3. The reconstructed PSNRs of HCSMAROI and BCS are higher than that of SSCS at different sampling rates. The PSNRs of reconstructed images of different algorithms are all improved with the increasing of the sampling rate. HCSMAROI achieves the highest PSNRs at all sampling rates. When the sampling rate is lower than 10%, the reconstructed PSNR values of HCSMAROI are significantly higher than that of BCS. When the sampling rate is greater than or equal to 20%, the reconstructed PSNR of HCSMAROI is close to that of BCS. It is shown that HCSMAROI can achieve quite good reconstructed PSNR in the spatial domain, especially at low sampling rate.

Spectral domain reconstruction results analysis
Spectral analysis of plant HSIs plays an important role in monitoring of plant growth, disease and insect pest detection and the inversion of physiological parameters. Figure 22  shows the average reconstructed spectra of the multiple arbitrary-shape ROIs of SSCS, BCS and HCSMAROI at different sampling rates from 5% to 20% for the first soybean leaf HSIs. Figure 23 shows that the average reconstructed spectra of the multiple arbitraryshape ROIs of SSCS, BCS and HCSMAROI at the sampling rates from 3% to 20% for the first tea leaf HSIs. Experimental results shows that the reconstructed spectra of BCS are  obviously closer to the original one than that of SSCS at different sampling rates. Besides, especially at low sampling rates, the reconstructed spectra of HCSMAROI are closer to the original ones than those of the others.  Table 4 gives RMSE comparisons of spectral indices of TVI and DD of the first soybean leaf HSIs for different algorithms at different sampling rates. For spectral indices of TVI and DD at different sampling rates, RMSE of HCSMAROI is relatively less than that of the other algorithms. At the sampling rate of 10%, HCSMAROI has especially obvious advantage. And RMSE of SSCS shows the largest RMSE values at different sampling rates. Table 5 shows RMSE comparisons of spectral indices of TVI and DD of tea leaf HSIs for different algorithms at different sampling rates. HCSMAROI can achieve obviously the best performance at low sampling rates of 3% and 5%.
Reconstruction performance analysis for multiple arbitrary-shape ROIs using the labeled mask In the "Reconstruction Performance Analysis of Multiple Arbitrary-shape ROIs using the Primary Mask", the experimental results for multiple arbitrary-shape ROIs using the primary mask show that the proposed HCSMAROI can achieve significantly better reconstructed performance than that of the SSCS and BCS in the spatial and spectral domains. Therefore, HCSMAROI is chosen to perform the following experiments.
In this case, multiple arbitrary-shape ROIs can be compressed and reconstructed respectively using the labeled mask at different target sampling rates. This case can be regarded as a general one. Reconstruction of multiple arbitrary-shape ROIs using the   where Area BOI i ð Þ and p i ð Þ are the area and sampling rate of the ith ROI respectively, Area whole is the whole area of all ROIs in the primary mask, and Ratio whole is the whole sampling rate. Figures 24 and 25 show the reconstructed images of the first soybean leaf HSIs and the first tea leaf HSIs, in which the ROI with higher sampling rate obtains better reconstructed quality, respectively. For soybean leaf HSIs, Table 6 also shows the same result that average reconstructed PSNR of ROI1 is the worst and that of ROI4 is the best. For tea leaf HSIs, Table 7 shows that average PSNR of reconstructed ROI1 is the worst and that of ROI6 is the best. These results shows that when the sampling rate of a certain ROI is greater than or equal to 10%, it can achieve quite good reconstructed quality in the spatial domain. Figure 26 illustrates the reconstructed spectra of soybean leaf HSIs for different ROIs. For those ROIs whose sampling rates are larger than or equal to 10%, the reconstructed spectra can be quite closer to the original spectra. Figure 27 illustrates the reconstructed spectra of tea leaf HSIs for different ROIs. For those ROIs whose sampling rates are larger than or equal to 5%, the reconstructed spectra can be closer to the original spectra than the others. The higher the sampling rate of a ROI, the closer to the original spectra it is. Table 6 and Table 7 show the RMSE of reconstructed spectral indices TVI and DD for the first soybean leaf HSIs and the first tea leaf HSIs, respectively. Similarly, the higher the sampling rate of a ROI, the smaller the reconstructed RMSE of reconstructed spectral indices TVI and DD. When the sampling rate of a certain ROI is greater than or equal to 15% for soybean leaf HSIs or 10% for tea leaf HSIs, the ROI can also achieve quite good reconstructed quality in the spectral domain.
Therefore, the proposed HCSMAROI can provide a flexible way to effectively compress and reconstruct different arbitrary-shape ROIs with different strategy meanwhile good reconstructed performance can be obtained in the spatial and spectral domains.

CONCLUSION
There are high inter-spectral and inter-spatial correlations for plant leaf HSIs. The proposed HCSMAROI can achieve better reconstruction effects in both the spectral and spatial domains than those of SSCS and BCS. At low sampling rates, HCSMAROI shows quite good reconstructed performance. HCSMAROI can also be closer to the original spectra and retain the spectral indices of TVI and DD more effectively than those of SSCS and BCS. Besides, the proposed HCSMAROI shows special advantage that it can effectively compress and reconstruct different arbitrary-shapes ROIs with different sampling rates, and brings good reconstructed performance in both the spatial and spectral domains.