Feasibility Study of Combining Hyperspectral Imaging with Deep Learning for Chestnut-Quality Detection

The rapid detection of chestnut quality is a critical aspect of chestnut processing. However, traditional imaging methods pose a challenge for chestnut-quality detection due to the absence of visible epidermal symptoms. This study aims to develop a quick and efficient detection method using hyperspectral imaging (HSI, 935–1720 nm) and deep learning modeling for the qualitative and quantitative identification of chestnut quality. Firstly, we used principal component analysis (PCA) to qualitatively visualize chestnut quality, followed by the application of three pre-processing methods to the spectra. To compare the accuracy of different models for chestnut-quality detection, traditional machine learning models and deep learning models were constructed. Results showed that the deep learning models were more accurate, with FD-LSTM achieving the highest accuracy of 99.72%. Moreover, the study identified important wavelengths for chestnut-quality detection at around 1000, 1400, and 1600 nm to improve the efficiency of the model. The FD-UVE-CNN model achieved the highest accuracy of 97.33% after incorporating the important-wavelength identification process. By using the important wavelengths as input for the deep learning network model, recognition time decreased on average by 39 s. After a comprehensive analysis, FD-UVE-CNN was determined to be the most effective model for chestnut-quality detection. This study suggests that deep learning combined with HSI has potential for chestnut-quality detection, and the results are encouraging.


Introduction
Chestnuts have a long history as an agricultural product and are widely distributed worldwide, mainly concentrated in Asia and Europe. China is a major producer and consumer of chestnuts, with a chestnut-growing area reaching 295,661 ha in 2021, according to FAO. Chestnuts are rich in nutritional value, consisting of 42.2-66.5% starch, 40.3-60.1% moisture, 9.5-23.4% total sugar, 4.8-9.6% crude protein, 2.2-3.7% crude fiber, 1.8-3.4% ash, and 2.8-3.2% fat, as well as essential amino acids and minerals [1]. In recent years, the demand for chestnuts has significantly increased due to the growing popularity of cooked chestnuts and various chestnut-based foods. However, there are numerous low-quality chestnuts in the market, affected by factors such as planting terrain, ripening time, and storage conditions. These low-quality chestnuts can pose a serious health risk to consumers after processing and reaching the market.
In southern China, most chestnuts are planted in the hills and mountains. These chestnuts are grown by individual farmers and purchased from them by processing companies, without large-scale plantation industrial parks having formed, and the ripening time of the chestnuts varies accordingly. The main contributions of this study are as follows:
1. The feasibility of the hyperspectral method combined with the deep learning model for chestnut-quality classification was verified.
2. To determine the optimal identification band for different-quality chestnuts, the important bands of chestnuts of varying quality were extracted and compared with the full-band spectra, allowing the impact of important-band classification on chestnut-quality identification to be verified.
3. The visualization of chestnut-quality detection by hyperspectral data combined with principal component analysis is presented.
4. Hyperspectral detection has the potential to determine the quality of agricultural products such as grapefruit, lychee, peanut, and mangosteen, whose quality may not be easily observable by the naked eye.

Material Acquisition and Processing
To ensure a uniform level of freshness, all 170 chestnut samples used in this study were freshly picked and provided by chestnut-processing companies. The chestnuts were transported by courier, and the first day of chestnut storage (28 December 2022) was used as the starting time of storage, with the number of days recorded from "1st d". Chestnut data were acquired with the hyperspectral instrument on the first day of sample acquisition, and the chestnuts were stored normally after this first phase of data collection. The chestnuts were stored in semi-enclosed boxes at a temperature and humidity of 13 °C and 67%, respectively, for the first thirty days of storage, and at 20 °C and 80%, respectively, for the second thirty days. The average temperature and humidity were significantly higher in the second thirty days, and quality changes in chestnuts are accelerated in a higher-temperature, higher-humidity environment [15]. Spectral data were obtained on the 1st, 30th, and 60th days of storage and recorded as fresh, sub-fresh (slightly moldy), and rotten (severely moldy), respectively; these data were used for the subsequent experimental analysis.

Spectral Image Acquisition and Correction
The HSI system mainly consists of a hyperspectral imaging lens (Specim FX17, Spectral Imaging Ltd., Oulu, Finland), a light source, a mobile platform, and a computer, with the lens operating in the near-infrared band (935-1720 nm). The lens has an image resolution of 640 × 640 px and a spectral resolution of 8 nm, and the light source is a 280 W halogen lamp. To obtain clear image information, the height between the lens and the sample is 32 cm; the specific workflow is shown in Figure 1. During spectral data acquisition there is dark-current noise as well as the effect of uneven illumination [6]. To increase the accuracy of the data, corrections are made by transforming the original image (I_raw) into a reflectance image (I_c) using a standard white reference image (I_white) and a dark reference image (I_dark), as in Equation (1):

I_c = (I_raw - I_dark) / (I_white - I_dark) (1)

where I_raw is the original hyperspectral image, I_c is the corrected image, I_white is obtained by imaging a white PTFE rod with almost 100% reflectivity, and I_dark is obtained by completely covering the lens with an opaque cap.
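The correction can be sketched in plain Python. The following is a minimal illustration on hypothetical toy intensity values, treating each image as a nested list with one value per pixel; a real FX17 cube would be processed per band, typically with array libraries:

```python
def correct_reflectance(raw, white, dark):
    """Per-pixel radiometric correction: I_c = (I_raw - I_dark) / (I_white - I_dark)."""
    corrected = []
    for raw_row, white_row, dark_row in zip(raw, white, dark):
        corrected.append([
            (r - d) / (w - d) if w != d else 0.0  # guard against a dead pixel
            for r, w, d in zip(raw_row, white_row, dark_row)
        ])
    return corrected

# Hypothetical 2 x 2 intensities for one spectral band
raw = [[50.0, 80.0], [20.0, 110.0]]
white = [[100.0, 100.0], [100.0, 120.0]]
dark = [[10.0, 20.0], [10.0, 20.0]]
reflectance = correct_reflectance(raw, white, dark)
```

Each corrected value falls between 0 (as dark as the dark reference) and 1 (as bright as the white reference), which is what makes spectra from different acquisition sessions comparable.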

Moisture Loss Measurement
Immediately after each phase of spectral data collection, the samples were weighed using an electronic scale (longbei, Home Electronic Scale, Guangzhou, China). To ensure consistency, each chestnut sample was labeled with a serial number so that the same sample was measured each time. After each measurement, the sample was stored at room temperature. The weight of the 1st d sample was measured as W_f, the weight on the 30th and 60th d was measured as W_d, and the moisture loss W_l was calculated as in Equation (2) [16]:

W_l = (W_f - W_d) / W_f × 100% (2)

The mean, maximum, and variance data of the experimental sample weights are shown in Table 1.
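The moisture-loss calculation amounts to a one-line computation; a minimal Python sketch with hypothetical weights:

```python
def moisture_loss(w_f, w_d):
    """W_l = (W_f - W_d) / W_f * 100, following Equation (2)."""
    return (w_f - w_d) / w_f * 100.0

# e.g. a chestnut weighing 10.0 g fresh and 7.5 g after storage
loss = moisture_loss(10.0, 7.5)  # 25.0 (% moisture loss)
```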

Principal Component Analysis Method
Principal component analysis (PCA) is an unsupervised data dimensionality-reduction method and an effective statistical tool for exploring and interpreting the sample space. An excessive number of variables can be transformed into a smaller number of new latent variables, also called principal components (PCs) [17]. Each PC has a corresponding score, presented in the form of scatter plots that illustrate the associations between samples; these score plots allow the spectral data to be interpreted successfully. In this study, cluster analysis of the spectral data of good-quality and poor-quality chestnuts was performed using principal component analysis to achieve a qualitative analysis of chestnut-quality detection.
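To make the scoring step concrete, here is a dependency-free Python sketch that extracts PC1 scores by power iteration on the covariance matrix; it illustrates the idea and is not the exact implementation used in the study:

```python
def first_pc_scores(X, iters=200):
    """Return the PC1 scores of the rows of X (a list of equal-length lists)."""
    n, p = len(X), len(X[0])
    # mean-center each variable (wavelength)
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    # sample covariance matrix
    C = [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) / (n - 1)
          for b in range(p)] for a in range(p)]
    # power iteration for the dominant eigenvector (the PC1 loading)
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # scores = projection of the centred samples onto the PC1 loading
    return [sum(Xc[i][j] * v[j] for j in range(p)) for i in range(n)]
```

Plotting the PC1 scores against PC2 scores (obtained analogously after deflating the covariance matrix) gives exactly the kind of score scatter plot used for the cluster analysis here.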

Spectral Image Pre-Processing
Proper pre-processing of the spectral data can increase the correlated chemical peaks in the spectra and reduce the effects of baseline shift and overall curvature [18]. In this paper, three pre-processing methods are used: standard normal-variate transformation (SNV), multiplicative scatter correction (MSC), and first-order derivative (FD). SNV can effectively eliminate the scattering problem and provide high-quality data for building the identification model [19]. MSC can effectively eliminate the spectral differences caused by different scattering levels and thus enhance the correlation between the spectra and the data [20]. FD amplifies the trends in the spectra through derivative processing [21]. In this study, the pre-processing methods are compared to find the optimal one.
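The three pre-processing steps can be sketched compactly. The following dependency-free Python versions use common textbook formulations (simple adjacent differences for FD, a least-squares fit against a reference spectrum for MSC) and are illustrative rather than the study's exact implementation:

```python
def snv(spectrum):
    """Standard normal variate: centre a spectrum and scale by its own std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = (sum((x - mean) ** 2 for x in spectrum) / (n - 1)) ** 0.5
    return [(x - mean) / std for x in spectrum]

def first_derivative(spectrum):
    """First-order derivative approximated by adjacent differences."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def msc(spectrum, reference):
    """Multiplicative scatter correction: fit spectrum ~ a + b * reference
    by least squares, then return (spectrum - a) / b."""
    n = len(spectrum)
    mr = sum(reference) / n
    ms = sum(spectrum) / n
    b = (sum((r - mr) * (s - ms) for r, s in zip(reference, spectrum))
         / sum((r - mr) ** 2 for r in reference))
    a = ms - b * mr
    return [(s - a) / b for s in spectrum]
```

In practice the MSC reference is usually the mean spectrum of the calibration set, so a spectrum that differs from the reference only by additive and multiplicative scatter is mapped back onto the reference.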

Feature Selection Algorithm
Hyperspectral data are characterized by strong correlation between adjacent bands and high redundancy [22]. Appropriate use of feature-extraction methods can effectively reduce the dimensionality of the spectral data and simplify the model. In this study, three feature-extraction algorithms were used: competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and uninformative variable elimination (UVE). SPA is a forward-selection algorithm designed for selecting spectral features; it selects the least redundant bands from the original spectrum in order to reduce the effect of spectral collinearity [23]. CARS is a feature-selection method that combines Monte Carlo sampling (MC) and partial least-squares (PLS) regression, and it uses cross-validation (CV) to determine the subset with the lowest root-mean-square error [24]. The root-mean-square error of cross-validation (RMSECV) is defined as in Equation (3):

RMSECV = sqrt( Σ_{i=1}^{n} (y_i - ŷ_cv,i)² / n ) (3)

where y_i denotes the true value, ŷ_cv,i denotes the value predicted in cross-validation, and n is the number of samples. UVE is a variable-selection method based on stability analysis of PLS regression models for eliminating redundant or uninformative spectral variables [25].
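The RMSECV criterion is straightforward to compute once the cross-validation predictions are available; a minimal sketch:

```python
def rmsecv(y_true, y_cv):
    """Root-mean-square error of cross-validation over n samples."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_cv)) / n) ** 0.5

# Hypothetical true values vs. cross-validated predictions
error = rmsecv([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```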

Traditional Machine Learning Methods
The partial least-squares discriminant analysis (PLS-DA) model solves classification problems in two stages: the first stage applies PLS components for dimensionality reduction, and the second stage builds the predictive model, i.e., discriminant analysis. In classification, PLS-DA transforms categorical variables into continuous variables and then calculates the latent variable scores (LVS) to fit the model using covariance [26]. Support vector machine (SVM) is a well-established classification method that mainly solves nonlinear classification, function-estimation, and pattern-recognition problems. Its basic idea is to map data that are not linearly separable in the low-dimensional space into a high-dimensional feature space, where a hyperplane can correctly partition the training data set with the maximum margin. In this study, the radial basis function (RBF) is chosen as the kernel function, which handles nonlinear data well [27]. Random forest (RF) is an effective method for analyzing high-dimensional data; it constructs multiple decision trees in parallel and combines their results to produce an output [28], and the minimum number of leaves of the decision trees in this paper is set to 2.

Convolutional Neural Network
In recent years, convolutional neural networks (CNN) have been widely applied in the field of spectroscopy [29]. A CNN consists of an input layer, convolutional layers, a maximum pooling layer, a fully connected layer, and an output layer. In this study, a chestnut-quality classification model called "1DCNN" is proposed and found to be effective. The structure of the one-dimensional CNN is shown in Figure 2. The 1DCNN model comprises three convolutional layers with a kernel size of 2 × 1; the numbers of kernels were 16, 32, and 32, respectively. The step size and padding were set to 2 and 0, respectively. The maximum pooling layer has a size of 2 × 1 with a step size of 1. The fully connected layer reduces the neuron parameters to 3, and finally the output is passed through softmax.
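Given these layer settings, the sequence length at each stage can be checked with the standard 1-D output-length formula. This sketch assumes 224 input wavelengths and that the pooling layer follows the last convolution:

```python
def conv1d_out(n, kernel, stride, padding=0):
    """Standard output length of a 1-D convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

def feature_lengths(n_wavelengths=224):
    """Trace the sequence length through three conv layers (kernel 2,
    stride 2, padding 0) and one max-pooling layer (size 2, stride 1)."""
    lengths = [n_wavelengths]
    for _ in range(3):  # three convolutional layers
        lengths.append(conv1d_out(lengths[-1], kernel=2, stride=2))
    lengths.append(conv1d_out(lengths[-1], kernel=2, stride=1))  # max pooling
    return lengths
```

For the 224-wavelength spectra used here, the lengths work out to 224, 112, 56, 28, and 27, so each stride-2 convolution roughly halves the sequence before the fully connected layer maps it to the 3 quality classes.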
To speed up the training process and avoid the vanishing-gradient problem, the rectified linear unit (ReLU) is used as the activation function [30] (Equation (4)):

ReLU(x) = max(0, x) (4)

For classification problems, the cross-entropy function (Equation (5)) is often applied as the loss function; it measures the distance between the actual output and the desired output [31]:

L_C = -(1/s) Σ_{i=1}^{s} Σ_{j=1}^{q} y_{ij} log(ỹ_{ij}) (5)

where L_C represents the classification loss function; y is the label vector; ỹ is the predicted vector; s is the number of samples; and q is the number of classes. In a classification task, the softmax operator (Equation (6)) is applied to the predicted output to obtain the predicted probabilities, against which the cross-entropy loss is then computed with the true label:

ỹ_j = exp(z_j) / Σ_{k=1}^{q} exp(z_k) (6)

where z is the vector of network outputs for one sample.
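The softmax and cross-entropy computations can be illustrated for a single sample; this is a sketch with hypothetical logits, not the study's MATLAB code:

```python
import math

def softmax(z):
    """Convert raw network outputs z into class probabilities."""
    exps = [math.exp(x) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y_onehot, y_pred):
    """Per-sample loss: -sum_j y_j * log(y~_j)."""
    return -sum(y * math.log(p) for y, p in zip(y_onehot, y_pred))

# Hypothetical logits for the three chestnut classes (fresh, sub-fresh, rotten)
probs = softmax([2.0, 0.5, 0.1])
loss = cross_entropy([1.0, 0.0, 0.0], probs)  # small loss: correct class dominates
```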

Long Short-Term Memory Network
The long short-term memory network (LSTM) is a special kind of recurrent neural network (RNN) that solves the long-term dependency problem of general recurrent neural networks [32]. LSTM can perform a single operation across all sequence lengths and mitigate vanishing-gradient problems through its gating features [33]. Due to these advantages, the LSTM model has been more widely used in the field of spectroscopy in recent years [9,34]. The LSTM model in this study is shown in Figure 2; it consists of an input layer, two LSTM layers, and one fully connected layer. ReLU is chosen as the activation function, speeding up learning and avoiding the problem of vanishing gradients. The deep learning models in this study were executed in MATLAB R2022a (MathWorks, Natick, MA, USA).

Change in Water Content
The study found that the moisture content of chestnuts decreased as the number of storage days increased, and that the rate of moisture loss was significantly faster during the first 30 days than during the last 30 days; measurements of the weight of fresh, sub-fresh, and rotten chestnuts led to this conclusion. The weight change and water loss of the chestnuts increased with storage days, as shown in Figure 3, and water loss slowed during days 30-60, probably because chestnut mold develops and multiplies rapidly during this period [35]. The average weight of the chestnuts decreased by 2.68 g after 60 days of storage at room temperature, and the average moisture loss was 29.1%, indicating a large difference in moisture between long-stored rotten chestnuts and fresh chestnuts.

Spectral Overview
The average spectra of the different chestnut qualities are presented in Figure 4, with the spectral wavelength range in the NIR band (935-1720 nm). In general, the three different qualities of chestnuts have the same spectral waveform, with similar peaks and troughs, owing to the chestnuts' consistent internal composition, but different reflectance values [22]. Among them, the sub-fresh and rotten chestnuts showed more scattered spectral curves, which may be due to the different degrees of quality variation in these chestnuts. Compared to fresh chestnuts, sub-fresh chestnuts showed a higher reflectance at the water-absorption peaks associated with the O-H bond at 970 and 1450 nm (Figure 4) [36]. The measurement of chestnut moisture loss W_l also showed that moisture loss was rapid during the first thirty days of storage, producing a large difference in the spectral water-absorption peaks. In the second thirty days, it is likely that moisture was produced by the rapid multiplication of chestnut mold bacteria, as has been seen in some studies on chestnut mold [35]. The appearance of an absorption peak at 1200 nm may be related to C-H bonds in sugars and starch [37]. Although the average spectra of different-quality chestnuts exhibit differences, the overall spectral map makes clear that these averages are weighted by large variations among individual chestnuts. Therefore, a more reliable way to identify chestnuts with different degrees of variation needs to be established.

PCA Qualitative Analysis
Before building the model, the raw spectral data were visualized and qualitatively clustered using PCA. Since the cumulative contribution of the first two principal components (PC1, PC2) reached 97.4%, with PC1 explaining 79.8% of the variance and PC2 explaining 17.6%, PC1 and PC2 were chosen for the clustering analysis of the chestnut spectral data. As shown in the scatter plot of PCA scores in Figure 5, the distributions of the three chestnut classes can be effectively separated using confidence ellipses. There is a certain degree of overlap between fresh and sub-fresh chestnuts, and a small overlap between sub-fresh and rotten chestnuts, which may be caused by the different rates of change among individual chestnuts. In the visualization of the hyperspectral data, the first three principal component images were obtained by PCA dimensionality reduction. As shown in Figure 6, there was no obvious difference between the original images of fresh and sub-fresh chestnuts, while there was a slight difference in the principal component images; mold appearing on sub-fresh chestnuts could thus be distinguished more intuitively in the principal component images. When comparing the principal component images of the sub-fresh and moldy chestnuts, there is a significant difference, which further illustrates the feasibility of spectra for chestnut-quality differentiation.

Chestnut Quality Identification Results Using Machine Learning Models
For the RAW spectra and the three pre-processing methods (MSC, SNV, and FD) (Figure 7), a chestnut-quality classification model was developed at all 224 wavelengths. The identification models used PLS-DA, SVM, and RF for fresh (1st d), sub-fresh (30th d), and rotten chestnuts (60th d). The data were divided into training and test sets in a ratio of 3:1, with the samples shuffled for RF to improve its generalization ability. The recognition results of the PLS-DA, SVM, and RF models for the different pre-processing methods are shown in Table 2, and Figure 8 shows the corresponding confusion matrices for the test sets with higher accuracy.
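The 3:1 split with shuffling can be sketched as follows (a generic illustration; the seed and helper name are arbitrary, not taken from the study):

```python
import random

def split_3_to_1(samples, seed=0):
    """Shuffle the sample indices, then hold out a quarter as the test set."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    cut = (3 * len(samples)) // 4
    train = [samples[i] for i in idx[:cut]]
    test = [samples[i] for i in idx[cut:]]
    return train, test
```

Fixing the seed makes the partition reproducible, so different models can be compared on exactly the same train and test sets.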
It can be observed that in most cases, the use of pre-processing can improve the accuracy of the model for the spectra. The RF + SNV and RF + MSC approaches achieved 95.28% and 95.56% in the training set, respectively, while the RF + FD and SVM + SNV approaches achieved the highest accuracy of 94% and 93.33% in the test set, respectively, with good results in overall accuracy using both PLS-DA and RF models.
The confusion matrices were used to further analyze the degree of misclassification within each model, with the labels A, B, and C representing fresh, sub-fresh, and rotten chestnuts, respectively. From an overall perspective, A and B are susceptible to misclassification as each other, while C is also somewhat susceptible to misclassification as B. This mirrors the behavior of the spectral curves: fresh and sub-fresh chestnuts exhibit similar curves, while the curves of rotten chestnuts are considerably more dispersed, leading to a higher likelihood of their misclassification as fresh or sub-fresh chestnuts. It is because of this similarity in spectral curves that traditional classification models struggle to reach satisfactory accuracy directly [9], so deep learning approaches are needed.


Chestnut Quality Recognition Results Using Deep Learning Models
The dataset was partitioned using the same method as before, and the data were shuffled to enhance the model's generalization ability. CNN and LSTM deep learning models were built to quickly detect chestnuts of different quality. The models were trained for 500 rounds using a segmented learning-rate schedule: a learning rate of 0.01 for the first 80% of iterations, reduced by a factor of 0.1 for the last 20%. The recognition performance of the models is shown in Table 3. The results show that the deep learning models are significantly better than the traditional machine learning models, which is not an isolated finding: in potato-disease detection [38], CNN models likewise outperform traditional models. The same phenomenon is demonstrated in this experiment, where the CNN model also generally outperforms the LSTM model. The CNN models achieved 100% accuracy in the training set after pre-processing, and the highest CNN test-set accuracy of 98.67% was achieved by MSC-CNN. However, FD-LSTM achieved the highest overall test-set accuracy of 99.72%; the model accuracy and loss curves are shown in Figure 9.
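The segmented schedule can be expressed as a small helper; this sketch assumes "reduced by a factor of 0.1" means the base rate is multiplied by 0.1 for the final 20% of iterations:

```python
def learning_rate(iteration, total=500, base=0.01, drop=0.1):
    """Base LR for the first 80% of iterations, base * drop for the last 20%."""
    return base if iteration < 0.8 * total else base * drop
```

With the values used here, iterations 0-399 train at 0.01 and iterations 400-499 at 0.001, letting the network settle into a minimum after the initial fast descent.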
Although the recognition accuracy is improved by using a deep learning network, such a network requires constant iterations and has a low recognition efficiency, with an average running time of 53 s. Therefore, for practical use it is important to reduce the amount of input information and improve model recognition efficiency by screening spectral feature wavelengths.

Identification of Important Wavelengths
The traditional classification models did not achieve satisfactory accuracy, while the deep learning models achieved good recognition accuracy but low recognition efficiency. In this paper, we therefore identify important wavelengths by spectral feature selection, aiming to improve the recognition efficiency of the deep learning models and, at the same time, the accuracy of the traditional models. Feature selection was performed with the SPA, CARS, and UVE algorithms on spectra pre-processed with FD, which performed best in both the traditional classification models and the deep learning models, with the RAW spectra used as a control.
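Of the three selection algorithms, UVE is the most self-contained to illustrate. The sketch below is a simplified, generic UVE implementation, not the authors' code: random-noise columns are appended to the spectra, regression coefficients are estimated across cross-validation folds, and real wavelengths are kept only if their coefficient stability exceeds the best stability among the noise columns. A plain least-squares fit stands in for the PLS regression normally used with UVE.

```python
import numpy as np

def uve_select(X, y, n_noise=None, n_splits=10, seed=0):
    """Uninformative Variable Elimination (UVE), simplified sketch.
    Appends random-noise columns to X, fits least-squares models on
    cross-validation folds, and keeps only the real wavelengths whose
    coefficient stability (|mean| / std over folds) exceeds the largest
    stability observed among the noise columns."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    n_noise = n_noise or p
    Xa = np.hstack([X, rng.normal(size=(n, n_noise))])  # real + noise variables
    idx = np.arange(n)
    coefs = []
    for k in range(n_splits):
        mask = idx % n_splits != k          # leave one fold out
        b, *_ = np.linalg.lstsq(Xa[mask], y[mask], rcond=None)
        coefs.append(b)
    C = np.array(coefs)
    stability = np.abs(C.mean(axis=0)) / (C.std(axis=0) + 1e-12)
    cutoff = stability[p:].max()            # noise benchmark
    return np.where(stability[:p] > cutoff)[0]  # indices of retained wavelengths
```

On synthetic data where only a few columns carry signal, those columns' coefficients stay consistent across folds and clear the noise cutoff, while uninformative columns tend to fall below it.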
The extraction results of the different algorithms are shown in Figure 10. Although different feature wavelength-identification algorithms were used, the identified wavelengths are mostly concentrated around 1000, 1400, and 1600 nm, and these can be used as important wavelengths for chestnut-quality identification.
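Selection algorithms typically return band indices rather than wavelengths; mapping indices back to nanometres requires the sensor's spectral range. The helper below uses the 935–1720 nm range stated for the instrument; the band count of 256 is a hypothetical example, not a specification from this study.

```python
def band_to_nm(index, n_bands=256, wl_min=935.0, wl_max=1720.0):
    """Map a hyperspectral band index to its centre wavelength in nm,
    assuming n_bands evenly spaced bands across [wl_min, wl_max]."""
    return wl_min + index * (wl_max - wl_min) / (n_bands - 1)
```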


Important Wavelength-Identification Results
RF and CNN, the most accurate of the machine learning and deep learning models, respectively, were applied in further studies; the specific results are shown in Tables 4 and 5, and the model confusion matrix is shown in Figure 11. With the RF model, using the identified important wavelengths gave a small improvement in test-set accuracy over the full spectra. The RAW-UVE-CNN model showed a slight increase in test-set accuracy, while with FD pre-processing the test-set accuracy decreased slightly, which was likely caused by the reduction in the number of network input parameters. Although identifying significant wavelengths slightly reduced the accuracy of the deep learning network, it substantially reduced the network's recognition time and improved its recognition efficiency: the average running time of the deep learning network fell from 50 s to 11 s. The model results and confusion matrix are shown in Figure 12. This improvement in recognition efficiency offers potential for practical applications of chestnut-quality detection.
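One way to see why fewer input wavelengths speed up recognition is to count the parameters that the input dimension drives. The sketch below uses a hypothetical small fully connected head (the layer sizes are illustrative, not the paper's architecture) to show how shrinking the spectral input shrinks the model that must be evaluated at every iteration.

```python
def dense_param_count(n_inputs, hidden=(64, 32), n_classes=3):
    """Parameter count (weights + biases) of a small fully connected
    head, as a rough proxy for how the number of input wavelengths
    drives model size and hence per-iteration cost."""
    total, prev = 0, n_inputs
    for width in (*hidden, n_classes):
        total += prev * width + width   # weight matrix + bias vector
        prev = width
    return total
```

With these illustrative layer sizes, feeding a few dozen selected wavelengths instead of the full spectrum cuts the first-layer parameter count by roughly an order of magnitude, consistent with the observed drop in running time.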

Discussion
In food processing, it is crucial to identify the quality of chestnuts quickly and nondestructively. Previous studies on chestnuts have investigated the identification of quality, mold, and origin, and this paper builds on them to conduct a more in-depth study. Among similar studies, one applied a backpropagation neural network (BPNN) to classify healthy and moldy chestnuts with an accuracy of 0.99 [35]. Chestnut origin-identification experiments using 1D-CNN and PLS-DA algorithms both achieved an accuracy of 0.971 [39]. In addition, HSI combined with deep learning achieved an overall classification accuracy of 0.958 in nut-quality assessment [40]. This paper offers significant improvements in chestnut-quality detection, distinguishing itself from prior research in the following ways. Firstly, a more fine-grained quality classification is used, categorizing chestnuts into three groups for detection, and the latest deep learning methodologies further improve accuracy. Additionally, this paper shows that the moisture-loss value of chestnuts is strongly correlated with the 1450 nm water absorption peak, which increases as chestnuts are stored over time. Wavelength-identification algorithms such as SPA, CARS, and UVE show that the key wavelengths for chestnuts are mainly concentrated around 1000, 1400, and 1600 nm. Utilizing the identified wavelength distribution, multispectral equipment can be developed, making chestnut-quality inspection more accessible and efficient.

Conclusions
This paper utilized a combination of HSI and traditional classification models, as well as deep learning models, to detect chestnut quality. By comparing the recognition performance of different models under different pre-processing treatments, the best model for chestnut-quality detection and the important wavelengths for this task were identified. Among the traditional classification models, FD-RF achieved the highest accuracy of 94%, while among the deep learning networks, FD-LSTM achieved the highest accuracy of 99.72%.
Furthermore, feature extraction was used to identify the important wavelengths for chestnut-quality identification, which were mainly distributed around 1000, 1400, and 1600 nm. In the RF model, the identified wavelengths gave a slight increase in accuracy over the original spectra; used as input to the deep learning network, they caused a slight decrease in accuracy but significantly reduced the network identification time (by 39 s on average).
Overall, the aim of this paper is to investigate the feasibility of using hyperspectral imaging in conjunction with deep learning models for chestnut-quality detection. Through the identification of significant wavelengths, we were able to enhance the efficiency of the hyperspectral detection method. Furthermore, an FD-UVE-CNN model was developed, which achieved high accuracy (97.33%) with a recognition time of 11 s. These findings suggest that deep learning models have the potential to be applied in the practical detection of chestnut quality.

Data Availability Statement:
The datasets generated for this study are available on request to the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.