Crop Type Classification by DESIS Hyperspectral Imagery and Machine Learning Algorithms

Developments in space-based hyperspectral sensors, advanced remote sensing, and machine learning can support crop yield measurement, modelling, prediction, and crop monitoring for loss prevention and global food security. However, the precise and continuous spectral signatures that are important for large-area crop growth monitoring and early yield prediction with cutting-edge algorithms can only be provided by hyperspectral imaging. Therefore, this article used new-generation Deutsches Zentrum für Luft- und Raumfahrt Earth Sensing Imaging Spectrometer (DESIS) images to classify the main crop types (hybrid corn, soybean, sunflower, and winter wheat) in Mezőhegyes (southeastern Hungary). A wavelet-attention convolutional neural network (WA-CNN), random forest, and support vector machine (SVM) algorithms were utilized to automatically map the crops over the agricultural lands. The best accuracy was achieved with the WA-CNN, a feature-based deep learning algorithm, and a combination of two images, with an overall accuracy (OA) of 97.89% and user's and producer's accuracies ranging from 97% to 99%. To obtain this, factor analysis was first introduced to decrease the size of the hyperspectral image data cube. A wavelet transform was then applied to extract important features and combined with the spectral attention mechanism CNN to gain higher accuracy in mapping crop types. The SVM algorithm followed, with an OA of 87.79% and producer's and user's accuracies of its classes ranging from 79.62% to 96.48% and from 79.63% to 95.73%, respectively. These results demonstrate the potential of DESIS data to observe the growth of different crop types and predict the harvest volume, which is crucial for farmers, smallholders, and decision-makers.


I. INTRODUCTION
The major concerns of modern society include crop and food security, and crop production and management are facing challenges due to population growth and environmental changes [1], [2], [3]. Crop-type classification provides essential information for various decision-making processes required to manage agricultural resources [4]. Crop-type information makes it possible to map agricultural land use intensity in terms of crop sequences. The duration and diversity of crop sequences directly impact landscape complexity [5], [6] and can, consequently, lead to a decline in yields as soils are depleted, pest infestations become more likely, and pollinators or biological agents are deprived of resources [7], [8], [9]. Reliable information on crops must be available so that agricultural management can be improved and costs can be reduced. Studies that monitor agricultural productivity and assess food security depend on accurate and reliable crop classification maps [10]. Detailed maps of crop types are required to develop strategies for a sustainable agricultural industry [11]. With advanced classification techniques, satellite image processing can give timely and accurate data on crop type and reliable yield estimation. Remote sensing has enabled significant crop monitoring [12]; for example, the combination of Landsat and Sentinel images has allowed increased temporal resolution, which is essential for this application [13]. However, the detailed spectral information that is essential for crop classification often cannot be obtained with multispectral sensors.
Hyperspectral (HS) images provide data in hundreds of narrow bands, allowing advancement in the understanding and classification of crop types [14], [15]. Effective features derived from an HS image are important for improving classification performance. Among the image features used in classification tasks, one approach integrates harmonic analysis (HA) optimized by a multiscale guided filter (GF) with morphological operations, and the resulting features are input to an ensemble learning (EL) model for HS image classification [16]. HA performs well by converting the spectral signatures into multiple frequency-domain components, and the GF can preserve edges and reduce the presence of noisy or redundant features [17]. The morphological opening by reconstruction and closing by reconstruction lead to the breakage of the outline shape and the mismatch of the target in the HS image [18]. HS data have been used in several kinds of studies, such as invasive species control [19], [20], biodiversity assessment [21], vegetation/land-cover/plant-residue classification [22], [23], modeling of biochemical properties [24], pollution assessment, and various agricultural applications [25]. Accurate spectral responses can be obtained from HS images with more than 100 bands for determining subtle changes in the Earth's surface over time [14]. Nonetheless, since HS images are not publicly available, they have not been widely used in precision agriculture. Moreover, the use of HS information is still somewhat limited by its large data volume and dimensionality, as well as its complex analysis [26]. However, these issues can be overcome by utilizing advanced machine learning (ML) techniques and big data analysis.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Deep learning methods such as two-dimensional (2-D) and 3-D convolutional neural networks (CNNs) have recently been widely used for hyperspectral image (HSI) classification tasks [27]. Because both models are efficient, their simultaneous use has been presented in [14]. Fu-SE-Net is another deep learning model, presented in [28]. In [29], a 3-D-CNN was suggested that addresses the complex feature maps and reduces the excess spatial information captured when a conventional 3-D-CNN is applied to HSI classification. The authors of [30] applied pseudo-3-D modules together with a densely connected model. Unlike the common 3-D-CNN, the proposed pseudo-3-D modules can capture both spatial and spectral features at the same time. In [31], a 3-D-CNN method was used and the effect of dimension reduction was investigated; the training time was reduced by about 60%. Although the 3-D and 2-D methods can extract both spectral and spatial features from an HSI, the models' efficiency is limited on huge datasets. Additionally, the computational cost of 3-D CNNs is much higher than that of 2-D ones.
A wavelet transform (WT) is a good choice for analyzing a signal or image in detail. Wavelets operate on a time-frequency window and can therefore represent the original signal at both lower and higher resolutions. The WT is a strong instrument for magnifying the details of an image, which is why it is called a mathematical magnifier. This extraction ability of the WT is useful for resolving the abovementioned issues of CNNs. By regulating translations and scales, the WT can capture features that are otherwise difficult to learn. Therefore, by joining the WT to CNNs, rich features can be learned. The extent of the WT functions can be controlled by changing the scale, and changes in the scale parameters lead to changes in the information extraction ability [32]. In [33], Hesser et al. proposed a model that contains skip connections equipped with an inverse wavelet transform and uses a discrete wavelet transform to reinforce feature reuse across the layers. The inverse wavelet transform improves the feature representation by fully recovering the details lost during down-sampling. To reduce the computational cost, element-wise aggregation was applied for the skip connections. The results with a two-level wavelet decomposition showed that the model is lightweight without losing notable efficiency. Practical investigations on 3-D estimation indicated that the method surpasses the point-pillars-based model by about 14% while decreasing the number of training parameters. They also reported the use of Haar transforms for training the wavelet model.
In [34], Liu et al. showed a wavelet neural network that adopts the structure of a back-propagation neural network to achieve faster training. In the wavelet neural network, the activation functions are wavelet functions, which resolves issues regarding local minima. Liu et al. [35] indicated that wavelet CNNs can gain higher precision in image processing tasks and texture classification than available models while possessing notably fewer parameters than traditional CNNs. Bastidas Rodriguez et al. [36] used the Haar wavelet for down-sampling and up-sampling in the network, which was the foremost choice in terms of numerical and comparative evaluation. Haar has a training time equivalent to that of the dilated CNN and U-Net but achieves higher PSNR results, which shows the effectiveness of MWCNN for the tradeoff between performance and efficiency. Yang et al. [37] applied factor analysis (FA) to HSI. They argued that using FA in the preprocessing stage is very useful because FA can explain the variability among the several correlated and overlapping spectral bands, which helps the model to better classify similar samples. In contrast, the commonly used reduction based on principal component analysis (PCA) does not directly address this objective in HSI: PCA only provides an approximation to the required factors, which does not help to differentiate similar samples as well.
Many spaceborne HS sensors have been recently developed, including the project for onboard autonomy, the hyperspectral imager onboard the Indian microsatellite-1, the hyperspectral infrared imager, the hyperspectral imager for the coastal ocean, the Italian Precursore Iperspettrale della Missione Applicativa (PRISMA), and the German Deutsches Zentrum für Luft- und Raumfahrt (DLR) Earth Sensing Imaging Spectrometer (DESIS) [14], [38]. Furthermore, the German Environmental Mapping and Analysis Program, the Israeli and Italian Spaceborne Hyperspectral Applicative Land and Ocean Mission, and the NASA Surface Biology and Geology mission will launch new HS sensors. DESIS acquires information within the visible and near-infrared wavelength range of 400-1000 nm and is integrated into the Multi-User System for Earth Sensing (MUSES) platform onboard the International Space Station; it records HS data in 235 bands with a spectral resolution of 2.55 nm [39].
Thanks to free data access and the instrument characteristics, DESIS data can be used for many purposes, such as medium- and long-term environmental monitoring in mining areas, vegetation monitoring, soil degradation measurement, etc. [40]. In fact, they have already served for agricultural crop classification, forest health monitoring, grassland degradation measurement, water quality mapping, and landscape archaeology, but not yet for crop yield prediction. Spectral libraries of HS reflectance data have been widely utilized for automatic crop identification and classification; thus, this application of crop HS libraries is currently one of the main research areas [41]. Several ML classification algorithms, such as pixel-based supervised random forest (RF) and support vector machine (SVM), and traditional methods, such as k-nearest neighbours, maximum likelihood estimation, and unsupervised K-means and ISODATA clustering, are available. Moreover, data on crop development stages, crop classification, and early yield estimation within the field variability are of strategic interest to farmers, cooperatives, and decision-makers.
Aneece and Thenkabail [42] classified five major world crops (corn, soybean, winter wheat, rice, and cotton) and their growing phase by using 99 Earth Observing-1 (EO-1) Hyperion HS images based on the HS library of crops in the US. The classification algorithms were run on Google Earth Engine (GEE) by using linear discriminant analysis and SVM; optimal HS narrow bands (HNBs) were obtained through principal component analysis to reduce the large data dimensionality. They achieved the best results when analyzing 15-20 HNBs with the SVM, with an overall accuracy (OA) range of 75%-95%. Marshall et al. [43] predicted biomass and yield for corn, rice, soybean, and wheat by using PRISMA and Sentinel-2 data at the field level. Their study consisted of three stages: determination of two-band vegetation indices, performance estimation of partial least squares regression, and RF run. They used normalized difference vegetation indices derived from the two-band HNBs of PRISMA and Sentinel-2 spectral bands throughout the three main growing stages (vegetative, reproductive, and maturity). The PRISMA RF model achieved better performance, with mean root-mean-square error (RMSE) values of 0.42 and 0.17 kg/m² for the biomass and yield, respectively, while the Sentinel-2 RF model provided corresponding mean RMSE values of 0.48 and 0.18 kg/m².
Aneece and Thenkabail [42] compared two generations of HS sensors, Hyperion and DESIS, by studying the classification of three crops (corn, soybean, and winter wheat) in Ponca City (Oklahoma, USA) with ML techniques run on GEE. Ten EO-1 Hyperion images from 2010 to 2013 and three DESIS images from 2019 were used; they utilized 15 earlier established Hyperion optimal bands out of 242 for the crop-type mapping and selected 29 DESIS HNBs based on lambda-lambda correlation analysis. Overall, the best results were obtained with SVM and RF by using both HS image types, with an OA range of 96%-100% for Hyperion data with triple image sets and 67%-83% for DESIS data with double image sets. This article presents several important case studies that will increase the understanding of and knowledge about HS data by testing how a narrow bandwidth of 2.55 nm can help improve crop classification accuracy and characterization, and how to reduce the large datasets to overcome data redundancy and autocorrelations using deep learning and ML algorithms.
In this article, a wavelet attention 2-D-CNN is presented for HSI classification for crop-type mapping. A wavelet transform is applied as a powerful feature extractor, so that higher classification accuracy can be achieved by combining the spectral attention CNN with the wavelet transform. The spectral attention mechanism (AM) is added to increase the capability of the wavelet CNN: it concentrates on informative features and can extract spatial and spectral correlations across the different types of features. Factor analysis is used to reduce the dimension of the HSI in the preprocessing stage. The wavelet transform is then applied to extract spectral features, which are fed into the attention CNN. In comparison with a 3-D-CNN, the extracted features are easily estimated by wavelets.

A. Study Area
The study area (see Fig. 1) represents agricultural farmland located in Mezőhegyes, Békés County, next to the Romanian border (46°19′ N, 20°49′ E). Mezőhegyes is a town with a total administrative area of 15 544 ha and a population of 4950. The soil in its meadows and lowlands is mostly chernozem, which is a very common soil type with high lime content that is excellent for agriculture, especially for cereal and oilseed crops. There is an experimental farm, Mezőhegyesi Ménesbirtok Zrt., that also plays an important role in the neighboring settlements; it is one of the strongest agricultural companies in Hungary, with 9862 ha of land. According to climate records at the Mezőhegyes station (next to the selected fields), the annual rainfall was 575 mm (458 mm in-crop) for the 2021 season. The main land use/land cover (LULC) classes include hybrid corn, sunflower, wheat, soybean, and noncrop classes such as grasslands, built-up areas, water bodies, and forested areas. Additional crop types present in the study area, that is, feed corn (for horses), silicone corn, barley, lucerne, silage, and sorghum [44], were merged into the other classes in the classification procedure.

B. Satellite Data
DESIS data are primarily intended for commercial purposes. However, DESIS images can be obtained free of charge for scientific and humanitarian purposes by presenting a proposal to the DLR showing the intended use. Thus, DESIS imagery was ordered from the DLR over the study area; the imagery comes from a Teledyne Brown Engineering HS camera installed onboard the Teledyne-operated MUSES.
Two DESIS level 2A bottom-of-atmosphere reflectance images from June were downloaded from the EOWEB GeoPortal (https://eoweb.dlr.de/egp/, accessed on May 25, 2021) and then georeferenced in the ERDAS IMAGINE 2020 software. At the nadir view, the ground sampling distance depends on the ISS flight altitude and is around 30 m. DESIS covers an area of 30 km × 30 km (∼900 km²) [45]; its detailed characteristics are given in Table I.

C. Field Data
A total of 5080 sample pixels were generated randomly in the Point Sampling Tool of QGIS v3.16; they consisted of 1000 corn, 600 soybean, 820 sunflower, 860 winter wheat, and 1800 other crop samples for June (see Table II). DESIS samples were randomly split into two subsets for training and validation. For the agricultural classification, 70:30 training/validation splits were used (see Fig. 2). Reference data were compared with high-resolution Sentinel-2 and georeferenced Google Earth images based on ground truth. Finally, the samples were filtered using the official crop plan map as a mask layer, and the samples outside that layer were deleted. At the end of the growing season, the sunflower crop was harvested on September 26 with a John Deere W650i combine harvester equipped with a yield mapping system using the Green Star software, which recorded crop yield data in a point shape format. Approximately one yield record was obtained every 2 s, and the records could be viewed and manipulated in a geographic information system.
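The 70:30 split of the sample pixels can be sketched as follows. This is a minimal illustration, not the authors' procedure: the per-class counts come from the text, while the function name `stratified_split`, the fixed seed, and the choice to split each class separately (stratification) are assumptions made here.

```python
import random

# Per-class sample counts reported for the June imagery (5080 pixels in total).
CLASS_COUNTS = {
    "corn": 1000,
    "soybean": 600,
    "sunflower": 820,
    "winter_wheat": 860,
    "other": 1800,
}


def stratified_split(class_counts, train_fraction=0.7, seed=0):
    """Randomly split the sample indices of each class into two subsets.

    Splitting class by class is an assumption of this sketch; the text
    only states that the samples were split randomly at a 70:30 ratio.
    """
    rng = random.Random(seed)
    train, validation = [], []
    for label, count in class_counts.items():
        indices = list(range(count))
        rng.shuffle(indices)
        n_train = round(count * train_fraction)
        train += [(label, i) for i in indices[:n_train]]
        validation += [(label, i) for i in indices[n_train:]]
    return train, validation


train_samples, validation_samples = stratified_split(CLASS_COUNTS)
print(len(train_samples), len(validation_samples))  # 3556 1524
```

Splitting within each class keeps the class proportions identical in both subsets, which matters here because the "other" class is three times larger than soybean.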
Crops are very sensitive to visible and near-infrared wavelengths. The DESIS spectral profiles showed distinct spectral signatures according to the crop type (see Fig. 3). HS narrow bands and continuous spectral sampling, along with the strong near-infrared reflection of vegetation, make it easy to distinguish crop types from each other. For example, the reflectance value for hybrid corn was very low because this crop was in the vegetative period on June 16, and soybean was also in its mid-vegetative phase, that is, additional trifoliate leaves were developing. In contrast, the sunflower had reached its stem elongation and flower bud development stage, and wheat had just entered its ripening and maturation stage (see Fig. 4). The ground truth contains five classes, and details of the samples are shown in Tables III-V.

D. Methodology
The proposed method takes advantage of the wavelet transform and the AM to extract valuable features. Initially, the HSI is dimensionally reduced by factor analysis. The model framework consists of the wavelet transform, spectral attention, and a CNN feature extractor. The HSI has dimensions W × H × M, where W and H are the spatial dimensions (width and height) of the input data and M is the number of spectral bands. It is passed through factor analysis, which reduces the large number of spectral bands and yields a cube of size W × H × C. It is confirmed that dimension reduction dramatically decreases training time. The output is a label vector that assigns each sample to one of the defined landcover classes; the labels are $\{y_1, y_2, \ldots, y_L\} \in \mathbb{R}^{1 \times 1 \times L}$, where L denotes the number of landcover classes. Factor analysis maintains the spatial dimensions W × H of the HSI; only the number of spectral bands is reduced to C. Factor analysis can describe the variability across the various overlapping and highly correlated spectral bands, which greatly reduces noise and degraded spectral bands while emphasizing valuable features. Then, the input is divided into 3-D patches, giving an array of size NS × D × D × C, where NS is the number of samples, D × D is the patch window size, and C is the number of reduced bands. The patches are then sent to the Haar wavelet transform, which generates a pair of kernels $K_h$ and $K_l$, where $K_{h,t}$ denotes the Haar wavelet and $K_{l,t}$ the scaling function. The 2-D Haar wavelet transform is performed with four kernels, $(K_{HH}, K_{LL}, K_{LH}, K_{HL})$.
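The patching step, which turns a reduced W × H × C cube into an NS × D × D × C array, can be sketched with NumPy. The sketch assumes the factor-analysis reduction has already been applied; the function name `extract_patches`, the random stand-in cube, and the non-overlapping tiling are illustrative assumptions, since the text does not state how the patch windows are placed.

```python
import numpy as np


def extract_patches(cube, patch_size):
    """Cut a (W, H, C) cube into non-overlapping (D, D, C) patches.

    Returns an array of shape (NS, D, D, C). Border pixels that do not
    fill a complete patch are dropped; this tiling scheme is an assumption.
    """
    w, h, c = cube.shape
    d = patch_size
    rows, cols = w // d, h // d
    trimmed = cube[: rows * d, : cols * d, :]
    # Split both spatial axes into (block index, offset inside the block).
    blocks = trimmed.reshape(rows, d, cols, d, c)
    # Bring the two block indices to the front and merge them into NS.
    return blocks.transpose(0, 2, 1, 3, 4).reshape(rows * cols, d, d, c)


# Hypothetical reduced cube: 100 x 150 pixels with C = 3 factors.
reduced_cube = np.random.default_rng(0).random((100, 150, 3))
patches = extract_patches(reduced_cube, patch_size=48)
print(patches.shape)  # (6, 48, 48, 3)
```

With D = 48, a 100 × 150 cube yields 2 × 3 = 6 complete patches; a sliding window with overlap would give more samples at the cost of correlated patches.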
The content at position (i, j) of the patches produced by the Haar wavelet transform can be expressed as in (1).
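For reference, a widely used form of the one-level 2-D Haar decomposition, as found in multilevel wavelet CNN formulations, gives the (i, j)-th value of each sub-band of an input x as written below. It is offered as an assumption about the kind of expression that (1) refers to, not as the authors' exact notation; sign and normalization conventions vary between implementations.

```latex
\begin{aligned}
x_{LL}(i,j) &= x(2i-1,2j-1) + x(2i-1,2j) + x(2i,2j-1) + x(2i,2j),\\
x_{LH}(i,j) &= -x(2i-1,2j-1) - x(2i-1,2j) + x(2i,2j-1) + x(2i,2j),\\
x_{HL}(i,j) &= -x(2i-1,2j-1) + x(2i-1,2j) - x(2i,2j-1) + x(2i,2j),\\
x_{HH}(i,j) &= x(2i-1,2j-1) - x(2i-1,2j) - x(2i,2j-1) + x(2i,2j).
\end{aligned}
```

Each output value combines one 2 × 2 block of the input, so every sub-band has half the width and half the height of the patch.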
Sublevels are created after passing the input patches through the wavelet transform; in fact, the Haar wavelet transform decomposes the input patches. The results of the four sublevels are forwarded to the CNN to extract spatial and spectral features. Level 1 of the proposed model consists of two 2-D-CNNs with a 3 × 3 kernel size. After each convolution operation, spectral AMs are used to focus on spectral features. The attention system is designed according to human visual perception, which can focus on global and local features [15]. The goal of the attention system is to obtain a new representation that relies on correlated features. In this article, the AM is used to increase the number of effective features by suppressing irrelevant features and highlighting useful ones, as shown in Fig. 5.
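The decomposition of a patch into four Haar sublevels (sub-bands) can be illustrated with a few lines of NumPy. This is a sketch under stated assumptions: the function name `haar_dwt2`, the orthonormal 1/2 scaling, and the sign convention are choices made here, and the authors' implementation may differ.

```python
import numpy as np


def haar_dwt2(patch):
    """One-level 2-D Haar decomposition of a (D, D, C) patch with even D.

    Returns the LL, LH, HL, and HH sub-bands, each of shape (D/2, D/2, C).
    The 1/2 scaling makes the transform orthonormal; sign and scaling
    conventions differ between implementations.
    """
    a = patch[0::2, 0::2]  # top-left pixel of every 2 x 2 block
    b = patch[0::2, 1::2]  # top-right
    c = patch[1::2, 0::2]  # bottom-left
    d = patch[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0   # local average (approximation)
    lh = (-a - b + c + d) / 2.0  # difference between the two rows
    hl = (-a + b - c + d) / 2.0  # difference between the two columns
    hh = (a - b - c + d) / 2.0   # diagonal difference
    return ll, lh, hl, hh


patch = np.random.default_rng(0).random((48, 48, 3))
ll, lh, hl, hh = haar_dwt2(patch)
print(ll.shape)  # (24, 24, 3)
```

Because the four filters are orthonormal, the total energy of the sub-bands equals that of the input patch, so no information is lost when the spatial size is halved.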
The outputs of levels 1 and 2 are concatenated to preserve the extracted spectral and spatial features. While the model extracts features from the concatenated outputs of levels 1 and 2 with two 2-D-CNNs and attention modules, the input patches are also sent to level 3, and the results are combined. The same operation is performed for level 4. To reduce the feature size after the first convolution, a stride of 2 is used. To prevent overfitting, mean pooling is applied after each convolution layer, and two dropout layers are used, along with the rectified linear unit (ReLU) as the activation function and batch normalization. At the end of the process, a fully connected layer with the SoftMax function is utilized, which outputs the probability of each class.
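The final classification step can be made concrete with a small example: a softmax turns the scores of the fully connected layer into class probabilities, the most probable class is the prediction, and a cross-entropy loss (the usual companion of a softmax output) scores the prediction during training. The logits below are hypothetical values for the five classes, not outputs of the actual model.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probabilities[true_class])


# Hypothetical scores for the five classes of one patch.
logits = [2.0, 0.5, -1.0, 0.1, 3.0]
probs = softmax(logits)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, round(cross_entropy(probs, true_class=4), 3))
```

The loss is small when the true class receives a high probability and grows without bound as that probability approaches zero.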
The class with the highest probability is then taken as the output. Cross entropy is applied as the loss function to measure how well the model predicts new data. The input tensor with a 3-D shape is sent through the 2-D-CNN, giving a tensor of size CF × B × B, where C × B × B is the size of each patch and F is the number of CNN outputs, named feature maps. The feature maps of all channels in the spectral dimension are retrieved in order to reshape a 2-D tensor CF × B × B and further refine the spatial-spectral properties. After being processed by the CNN, each spectral band yields new bands containing different information. Weights are then attached to the bands to represent the correlation between the bands and the main data. Larger weights indicate a stronger relation, so the AM extracts features from the more relevant bands more precisely. The attention mechanism is shown in Fig. 6.
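The idea of attaching weights to bands can be sketched as a squeeze-and-excitation style channel attention. The exact layout of the authors' attention module is not given in the text, so the bottleneck structure, the function name `spectral_attention`, and the random weights `w1` and `w2` below are assumptions used only to show the mechanism.

```python
import numpy as np


def spectral_attention(features, w1, w2):
    """Reweight the channels of a (C, B, B) feature tensor.

    features: feature maps, one B x B map per spectral channel.
    w1, w2:   weights of a small two-layer bottleneck (hypothetical shapes
              (C // r, C) and (C, C // r) for a reduction ratio r).
    """
    # Squeeze: summarize every channel by its spatial mean.
    channel_summary = features.mean(axis=(1, 2))
    # Excite: bottleneck with ReLU, then sigmoid to get weights in (0, 1).
    hidden = np.maximum(w1 @ channel_summary, 0.0)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Scale: multiply each channel map by its weight.
    return features * weights[:, None, None], weights


rng = np.random.default_rng(0)
features = rng.random((8, 24, 24))
w1 = rng.normal(size=(2, 8))
w2 = rng.normal(size=(8, 2))
reweighted, weights = spectral_attention(features, w1, w2)
print(reweighted.shape, weights.round(2))
```

In a trained network `w1` and `w2` are learned, so channels that help the classification receive weights close to 1 and uninformative ones are suppressed.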

A. Implementation of Proposed Method
The model started with a 2-D-CNN with a 3 × 3 kernel and a padding of one. Instead of a pooling layer, we applied CNNs with stride 2. To keep the model from overfitting, various tools were applied, such as global mean pooling and batch normalization, and ReLU was used as the activation function. To use the wavelet transform more effectively, dense layers were added, which ensure that all extracted features pass through the whole model. After each CNN, spectral attention was applied so that the highlighted features are focused on. We ran the model for 250 epochs and used stochastic gradient descent with a learning rate of 0.001. A fixed batch size of 30 was used. The proposed method was implemented using various libraries in the Colab environment. We measured the efficiency of the model using the HSI dataset. The dataset is first preprocessed by factor analysis to reduce the size of the HSI and is then separated into training, validation, and test sets. Patches of size D × D × C are then generated from the HSI, where C denotes the number of factors and D × D is the patch size. These parameters strongly affect the classification results. The ground truth includes five categories, and the details of the classes are given in Tables III-V.
1) Effect of Spatial Size on Classification Accuracy: Patches are sent to the network as inputs. Here, various spatial sizes were tested to evaluate the classification efficacy [46]. Fig. 7 shows the impact of the patch size on the accuracy. The model was tested with patches of 24 × 24, 48 × 48, 96 × 96, and 192 × 192. The accuracy showed an increasing trend as the size of the spatial patches increased, because small patches provide less information. On the other hand, larger patches include more data but also larger amounts of noise, which may directly have a negative effect on accuracy. For the present dataset, increasing the spatial patches to 48 × 48 notably improved the performance of the network.
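The effect of the stride-2 convolutions on the tested patch sizes follows from the standard output-size formula, floor((n + 2p - k) / s) + 1. The short sketch below applies it with the 3 × 3 kernel, padding of one, and stride of 2 stated in the text; the function name is illustrative.

```python
def conv_output_size(size, kernel=3, padding=1, stride=2):
    """Spatial size after one convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


for patch in (24, 48, 96, 192):
    print(patch, "->", conv_output_size(patch))
# 24 -> 12
# 48 -> 24
# 96 -> 48
# 192 -> 96
```

With these settings, each strided convolution halves the spatial size, which is why it can replace a pooling layer.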
2) Effect of Factor Analysis on Classification Accuracy: Reducing the size and complexity of the HSI is an important step. FA extracts the features that contain the most information. The extracted features are unique and uncorrelated with each other. Here, FA is used to reduce the dimension of the input data. Different numbers of factors were tested with the 48 × 48 patch size to find the best classification performance. Fig. 8 illustrates the accuracies for the three sizes of the training dataset. As can be seen, when the number of factors is increased from 2 to 3, the OA rises gradually, and after that, the accuracy increases only slightly.

B. Implementation of ML Methods
To achieve better results with ML methods, only 29 earlier established optimal DESIS bands were used for the classification [41]. The selected bands fell within the 500-1000 nm spectral range and were as follows: 41 The classification was performed with only the selected bands to reduce the data dimensionality because optimal band selection in imaging spectroscopy can improve the classification accuracy [47]. Moreover, a spectral profile of the features of interest was created to assess the spectral response characteristics of each class.
Single and double DESIS image sets were examined for the crop classification; no triple image sets were realized because of the cloud coverage of the image captured on June 6, which was therefore not considered in this article. Since there were no images for April, July, and August, those acquired in June were utilized. On the one hand, the two most widely used pixel-based supervised algorithms, RF and SVM, were applied to classify the crops (hybrid corn, soybean, sunflower, wheat, and others). On the other hand, the proposed model was also compared with two recent deep learning algorithms, MSRN [50] and MDBRSSN [51]. RF is an EL method in ML proposed by Breiman [48] that can be used for both classification and regression tasks. This method is widely used in the remote sensing community because of its high classification accuracy [49]. The classification was implemented using the random forest package in R 4.2.1. The number of variables used for tree node splitting (mtry) was set to the default value. An optimal number of trees (ntree) was selected based on the relationship between the decrease in out-of-bag error and the number of trees (see Fig. 9); subsequently, ntree was set at 500. SVM is an ML algorithm that constructs a hyperplane in multidimensional space to separate different classes [52]. Its main advantage is that it can be utilized for both classification and regression tasks. Several types of kernels have been developed, and the most common one is the radial basis function (RBF).
Here, the SVM model was applied using the e1071 package in R 4.2.1, and the RBF kernel was used. SVM requires two parameters, a gamma (γ) value and a cost (C) value. Finding the best hyperparameters is essential, and the best combination cannot be estimated in advance. Thus, through cross-validation, the OA values were compared and the best parameters were determined based on a trial-and-error plot (see Fig. 10). The regularization parameter C and the kernel parameter γ were set at 64 and 0.0625, respectively.
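The trial-and-error search over C and γ amounts to a grid search, which can be sketched generically. Everything in the block is illustrative: `grid_search` and `toy_score` are hypothetical names, the power-of-two grid is an assumed (though common) layout, and `toy_score` is a stand-in constructed to peak at the reported values rather than a real cross-validated accuracy.

```python
from itertools import product


def grid_search(score_fn, costs, gammas):
    """Return the (C, gamma) pair with the highest cross-validated score.

    score_fn(C, gamma) is a placeholder for whatever routine trains the
    RBF SVM and returns its cross-validated overall accuracy.
    """
    return max(product(costs, gammas), key=lambda pair: score_fn(*pair))


# Candidate values on a power-of-two grid (an assumed, commonly used layout).
costs = [2.0 ** k for k in range(-2, 9)]     # 0.25 ... 256
gammas = [2.0 ** k for k in range(-8, 3)]    # 0.0039 ... 4


def toy_score(c, gamma):
    """Stand-in for cross-validated OA; built to peak at C = 64, gamma = 0.0625."""
    return -((c - 64) ** 2) - (gamma - 0.0625) ** 2


best_c, best_gamma = grid_search(toy_score, costs, gammas)
print(best_c, best_gamma)  # 64.0 0.0625
```

In practice `score_fn` would wrap a k-fold cross-validation of the classifier; the reported optimum corresponds to C = 2^6 and γ = 2^-4 on such a grid.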
These distinct spectral patterns allowed discrimination among the different crop types through RF and SVM classifications. The classification was run twice for RF and SVM, with single and double DESIS image sets. When using a single image, the SVM classifier achieved an OA and κ of 85.23% and 0.80, respectively, while the RF classification obtained an OA of 83.3% and a κ of 0.78. When using double image sets, the OA and κ values increased to 87.79% and 0.84 for SVM and to 86.28% and 0.82 for RF. The results indicate that the SVM algorithm outperformed the RF one, with a small difference in OA of approximately 2% in both cases. However, the two recent state-of-the-art models, MSRN and MDBRSSN, apply newer ideas, such as 3-D-CNNs, hybrid 2-D-3-D CNNs, skip connections, and dense connections, which improve classification precision and efficiency in comparison with traditional models. The MSRN and MDBRSSN deep learning models achieved OAs of 92.2% and 93.4% and κ values of 0.90 and 0.91, respectively, while the proposed model obtained an OA of 97.89% and a κ of 0.97 (see Table VI); the corresponding classified images are displayed in Fig. 11.
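The accuracy measures used in this comparison (OA, κ, and the producer's and user's accuracies) all derive from a confusion matrix. The following sketch shows the computation on a hypothetical 3-class matrix that is not taken from the study; the row/column layout is an assumption stated in the docstring.

```python
def accuracy_metrics(confusion):
    """Compute OA, kappa, producer's and user's accuracies.

    confusion[i][j] counts reference-class-i pixels mapped to class j
    (rows = reference, columns = prediction; an assumed layout).
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    diagonal = [confusion[i][i] for i in range(n_classes)]
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]

    overall = sum(diagonal) / total
    # Agreement expected by chance, from the row and column marginals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - chance) / (1 - chance)
    producers = [d / r for d, r in zip(diagonal, row_totals)]
    users = [d / c for d, c in zip(diagonal, col_totals)]
    return overall, kappa, producers, users


# Hypothetical 3-class confusion matrix, not taken from the study.
matrix = [[45, 3, 2],
          [4, 40, 6],
          [1, 7, 42]]
oa, kappa, pa, ua = accuracy_metrics(matrix)
print(round(oa, 4), round(kappa, 4))  # 0.8467 0.77
```

Note that κ is a unitless coefficient between -1 and 1, not a percentage, and that it is always lower than OA when some agreement is expected by chance.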
According to the classification report, hybrid corn was the most cultivated crop type in the study area, with an area coverage of 2464.29 ha, whereas the other classes ranked second, with an area coverage of 2340.99 ha, followed by wheat, soybean, and sunflower. Based on these results and the average sunflower yield, 4961.88 tons of grain were expected to be harvested from all the sunflower fields in 2021. The proposed model was run 10 times on our dataset to obtain a more accurate estimate. The SVM and RF classifiers do not perform well on our dataset because it is noisy and its classes overlap; they also perform poorly on imbalanced datasets. In contrast, MSRN and MDBRSSN have a deeper network structure and use novel concepts, such as 3-D-2-D CNNs, hybrid models, multiscale features, skip connections, and dense connections, which increase classification efficacy. On the PU (see Table VII), IP (see Table VIII), and WHU-HI (see Table IX) datasets, the MSRN and MDBRSSN methods performed at least 15%, 40%, and 7% better, respectively, than the previous best model (SVM). Another technique that has increased the efficiency of models, and that several researchers have recently applied, is the AM. The proposed model, which also uses the AM for the classification task, achieved better efficiency than the other methods. For instance, based on Tables VI-IX, the AA of the proposed method on the PU, IP, and WHU-HI datasets is 1.22%, 1.38%, and 0.85% higher, respectively, than that of the MDBRSSN model. Our proposed model, which extracts the main features via the wavelet transform, 3-D-CNN, and AM with various kernel sizes, can discriminate between classes. In terms of OA, the proposed model outperforms MDBRSSN, the most accurate of the compared models, by 0.9%, 1%, and 0.75% on the SA, PU, IP, and WHU-HI datasets, respectively.

IV. DISCUSSION
The results indicated that the selection of effective and representative HS bands is critical to overcoming data redundancy and autocorrelation and reducing the computational time for the potential real-time applications of imaging spectroscopy. However, DESIS bands are less redundant and more informative because of their narrow bandwidth (2.55 nm) compared with other HS images (e.g., Hyperion) [42]. Here, 29 DESIS HNBs were finally selected out of 235 within the 500-1000 nm range. The selected HNBs have already been used in many other agricultural studies [53], [54], [55], including the estimation of biophysical and biochemical parameters (e.g., leaf area index (LAI), nitrogen, biomass, and pigments), crop growth stage classification, yield prediction, weed and disease detection, LULC classification, and stress assessment. For instance, the bands at about 504, 522, and 540 nm are good for disease, LAI, and stress applications, while those at around 556 and 625 nm can be used to map the crop growth stages. Moreover, the reflectance values at 648, 763, 778, 824, and 848 nm are important for biomass, yield, and crop classification studies.
TABLE VIII. Accuracy assessment of results obtained for the crop-type classes for the Indian Pines dataset (10% sample for training).
TABLE IX. Accuracy assessment of results obtained for the crop-type classes for the WHU-Hi-LongKou dataset (25% sample for training).
A dedicated DESIS HS library can readily differentiate the crops based on their spectral profiles (see Fig. 5), improving the classification accuracy [56]. The results presented in Tables VI-IX demonstrate an increase in OA for crop classification when double image sets are used. Among the traditional ML algorithms, the best results in terms of OA and κ were obtained with the SVM algorithm.
Regarding the individual classes, wheat achieved the highest user's accuracy (UA) and producer's accuracy (PA), ranging from 92.80% to 96.12% with both the RF and SVM models, while soybean recorded the lowest UA with the two algorithms (64.9% and 88.2%, respectively). This occurred because some soybean pixels were misclassified as sunflower and vice versa; the spectral similarity of these pixels therefore lowered the accuracy for both soybean and sunflower. This result is in line with that reported by Aneece and Thenkabail [41], who classified three leading world crops (corn, soybean, and winter wheat) in the USA using DESIS data and ML algorithms run in GEE and the R software, obtaining the highest accuracy, an OA of 85%, with the SVM model on June-August images. Several studies have shown that SVM and RF techniques enhance classification accuracy when using spaceborne HS sensors [44].
The wavelet attention CNN method was run ten times for every patch size and for various FA numbers to obtain a more reliable evaluation of the outcome. The primary metrics and per-class accuracies are reported in Table VI. The SVM classification method does not perform properly when the classes overlap, the dataset is noisy, or the dataset is imbalanced. The RF and SVM models performed worse than the proposed model: Table VI shows that, with 10% of the dataset used for training, their OA values are 86.23% and 82.34%, respectively. The proposed method improves on these results significantly, reaching an OA of 97.89%, and its use of an AM led to a large gain in performance. Compared with the classic SVM and RF classifiers, the average accuracy (AA) improved by 14.5% and 18.6%, respectively. The model also has higher classification accuracy in all classes from 1 to 5. The OA of the proposed model is 13.5% and 17.5% higher than that of the SVM and RF, respectively.
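The metrics compared in this section (OA, AA, PA, UA, and κ) can all be derived from a single confusion matrix. The sketch below uses the standard definitions with rows as reference classes and columns as predicted classes; the function name and the 2 x 2 example matrix are hypothetical and do not reproduce any table of this article. It assumes that every class occurs at least once in both the reference and the predicted labels, so that no division by zero arises.

```python
# Sketch: standard accuracy metrics from a confusion matrix.
# Convention (an assumption of this example): cm[i][j] is the number of
# reference samples of class i that were predicted as class j.


def accuracy_metrics(cm):
    n = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(n)]
    ref_totals = [sum(cm[i]) for i in range(n)]                          # row sums
    pred_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]    # column sums

    oa = sum(diag) / total                                # overall accuracy
    pa = [diag[i] / ref_totals[i] for i in range(n)]      # producer's accuracy
    ua = [diag[j] / pred_totals[j] for j in range(n)]     # user's accuracy
    aa = sum(pa) / n                                      # mean per-class accuracy

    # Chance agreement expected from the marginal totals, then Cohen's kappa.
    pe = sum(ref_totals[k] * pred_totals[k] for k in range(n)) / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return {"OA": oa, "AA": aa, "PA": pa, "UA": ua, "kappa": kappa}


# Hypothetical two-class example, for illustration only.
example_cm = [[45, 5],
              [10, 40]]
metrics = accuracy_metrics(example_cm)
print({name: metrics[name] for name in ("OA", "AA", "kappa")})
```

With this convention, a low UA for one class together with a low PA for another, as seen for soybean and sunflower, appears as large off-diagonal counts between the two classes.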

V. CONCLUSION
In this article, we examined the wavelet attention 2-D-CNN for crop-type classification of DESIS images, taking into account image dimension reduction and a spectral AM. By using FA and wavelet attention to reduce the size of the HSI, we could successfully filter out useless information in the low-frequency domain. A 48 × 48 spatial patch size performed best on the HSI dataset, and an FA number of 2 to 3 gave the highest OA. The results show that the newly developed WA-CNN for crop-type mapping can incorporate the specific details of features in the high-frequency domain, improving the CNN's capacity to learn features for image classification. A DESIS HS library was established for four major crops (hybrid corn, sunflower, wheat, and soybean). A total of 29 important DESIS bands out of 235 were selected, based on previously determined narrow bands, as input for the RF and SVM models. Thanks to their high spectral resolution (2.55 nm), these selected narrow bands can help discriminate among crops with similar spectral characteristics. The performance of two ML algorithms, RF and SVM, in automatically classifying the target crops using the established HS library was investigated. This article is one of the few to use DESIS HS data, which became available only recently. Of the two, the SVM supervised classifier was more robust in agricultural crop-type mapping, with an OA of 87.79% and a κ of 0.84. The classification accuracies (OA, PA, and UA) increased when two combined images were utilized. However, the newly proposed feature-based algorithm, built on a wavelet attention 2-D-CNN, obtained higher accuracy than the traditional ML algorithms, with κ and OA values of 97.28 and 97.89, respectively.
Overall, this article demonstrates how the very fine spectral resolution of DESIS narrow bands can support agricultural crop classification and the identification of low-yield crops, which is crucial for improving food security in vulnerable regions. Continuous spectral information from DESIS imagery can improve the assessment of the crop biophysical and biochemical parameters needed for yield mapping, measuring, monitoring, and modeling.