The agricultural sector is recognized as one of the most important contributors to the global economy [1]. With a growing population and limited food supplies, agricultural activities need to be monitored regularly to ensure that food is produced more efficiently while preserving the natural ecosystem [2]. Crops such as wheat, corn and barley are among the most important sources of food worldwide. Consequently, information on their spatial distribution and condition plays a vital role in food production at regional, national and even global levels [3]. The timely and accurate classification of crop types is one of the most fundamental parts of remote sensing (RS) monitoring in agriculture and is becoming indispensable due to its wide range of applications, such as yield estimation, crop transport and soil productivity [4–6]. It also has important practical implications for the management of agricultural production and the formulation of agricultural policy [7, 8]. Traditional methods of crop classification rely heavily on visual interpretation, which depends on expert knowledge and suffers from poor timeliness and low operational efficiency [9]. Therefore, it is beneficial to explore crop classification based on RS imagery to determine agricultural status and to propose more advanced strategies that improve classification performance [10, 11].
With the rapid development of Earth observation (EO) satellites in recent years, low-cost, wide-area RS images have become widely available for crop classification, although the growing data volumes are increasingly time-consuming to process [12]. In addition, the accuracy of crop classification can be improved by using image data with different spatial, temporal and spectral resolutions [13]. In RS, the last few years have been known as the era of big free data. From 2013 to 2016, a large number of optical and synthetic aperture radar (SAR) RS satellites with high spatial resolution (10–30 m) were launched, in particular Sentinel-1A/B and Sentinel-2A [14, 15]. In addition, multi-temporal, multi-source satellite imagery, including multispectral [16], hyperspectral [4], and SAR [17, 18] data, is used to identify the specific growth stages of crops.
Traditional machine learning (ML) techniques are widely used to classify RS images. Several ML-based classification algorithms, such as Random Forest (RF) [19], Decision Tree (DT) [20], Support Vector Machine [21, 22], k-Nearest Neighbor [23], Maximum Likelihood [24], and Artificial Neural Network [24], can be used for crop classification [12, 14]. While these traditional methods offer significant advantages and have proven effective, they still face a number of challenges. They mostly rely on handcrafted features, which depend heavily on expert experience and manual design. In addition, applying these methods to large and complex regions is difficult, and their accuracy is limited. Traditional crop classification techniques also require many sources of information, such as ground and aerial surveys; these procedures are slow and expensive, and their results are inconsistent.
In the last decade, the development of deep learning (DL) techniques has accelerated greatly [25]. Overcoming the challenges of traditional ML algorithms, DL-based methods provide excellent feature extraction and nonlinear characterization capabilities for complex RS data. They have made significant advances and breakthroughs in a wide range of RS tasks, such as building reconstruction [26], classification [27], crop mapping [28, 29], and damage assessment [30]. DL methods have also shown clear superiority and remarkable performance in crop classification, and many effective DL methods have been proposed that improve classification accuracy. In this context, Yang et al. [31] employed multi-temporal Sentinel-2 imagery and developed a new crop classification approach that integrates optimal feature selection with a hybrid CNN-RF model to identify summer crops in Jilin Province, northeastern China. Ji et al. [32] introduced a novel method that uses multi-temporal RS images with a 3D CNN to classify crops. According to their results, the proposed model performed significantly better than the 2D-CNN model. Zhao et al. [33] evaluated three DL techniques for crop classification using Sentinel-1A image time series in Zhanjiang, China: 1D-CNN, long short-term memory recurrent neural network (LSTM-RNN), and gated recurrent unit RNN (GRU-RNN). First, these NN-based models were trained to produce three classifiers with optimal architectures and hyperparameters. Then, a classification network with all parameter values at each time point was created. Finally, the optimal length of the time series was determined by evaluating each time point for each crop. Li et al. [34] combined generative adversarial network (GAN), CNN, and LSTM models and presented a novel technique for classifying corn and soybean crops from Landsat-8 time-series imagery. Meanwhile, Seydi et al. [35] presented a new crop classification framework that uses deep CNNs and dual attention modules (DAMs) with Sentinel-2 datasets. The results indicate a high level of classification performance, with the proposed technique achieving an overall accuracy of 98.54% and a kappa coefficient of 0.981.
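For readers unfamiliar with the two metrics quoted above, the following minimal sketch shows how overall accuracy and Cohen's kappa coefficient are typically derived from a confusion matrix. The function name and the example matrix are hypothetical and purely illustrative; they are not taken from any of the cited studies.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j (illustrative convention, assumed here).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: product of row and column marginals per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Hypothetical three-class example (e.g. wheat, corn, barley).
    example = [[50, 2, 3], [4, 40, 6], [1, 5, 39]]
    oa, kappa = overall_accuracy_and_kappa(example)
    print(f"overall accuracy = {oa * 100:.2f}%, kappa = {kappa:.3f}")
```

Overall accuracy is usually reported as a percentage, whereas kappa is a unitless value that corrects the observed agreement for agreement expected by chance.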
Many agricultural crop type mapping models have been developed in various studies. In general, these models suffer from one or more of the following drawbacks:
(1) Many of the frameworks rely on optical time-series datasets for the classification of crop types. Although optical datasets are easy to interpret, they are not usable under cloudy and rainy conditions. As a result, crop type classification in cloudy areas is limited.
(2) More attention has been paid to traditional classifiers than to advanced deep learning models. These classifiers need highly informative input features to provide useful results. Especially for large areas and time-series datasets, generating robust features is challenging and time-consuming.
(3) Existing deep learning-based frameworks use convolutional and LSTM layers and ignore the potential of advanced transformer models for crop type mapping. In addition, CNN and transformer models can be combined to improve the capability of deep feature generation.
This paper proposes a novel dual-stream crop type mapping framework based on the combination of CNN and NesT frameworks. The proposed Crop-Net model uses time-series Sentinel-1 SAR images for the classification of agricultural crops. The main contributions of the current study can be summarized as follows: (1) the design, for the first time, of a novel dual-stream framework that uses convolution layers and a hierarchical transformer model for crop type mapping; (2) the use of spatial/spectral attention modules in the Crop-Net model for crop mapping; (3) an evaluation of the effect of increasing the number of months in the time series on crop mapping; and (4) an analysis of the sensitivity of crop type accuracy to polarisation type and a comparison with the polarimetric radar vegetation index.
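To make the comparison in contribution (4) concrete, the sketch below computes a radar vegetation index from dual-polarisation backscatter. It assumes the commonly used Sentinel-1 dual-pol form RVI = 4·σVH / (σVV + σVH), evaluated in linear power units; this section does not state which RVI definition the study adopts, so the formula, the function names and the sample values are illustrative assumptions only.

```python
def db_to_linear(value_db):
    """Convert backscatter from decibels to linear power units."""
    return 10.0 ** (value_db / 10.0)


def dual_pol_rvi(vv_db, vh_db):
    """Dual-pol radar vegetation index, assumed form 4*VH / (VV + VH).

    Inputs are VV and VH backscatter in dB; the index is computed on
    linear power values, so both are converted first.
    """
    vv = db_to_linear(vv_db)
    vh = db_to_linear(vh_db)
    return 4.0 * vh / (vv + vh)


def rvi_time_series(observations):
    """Apply the index to a sequence of (VV dB, VH dB) pairs, one per date."""
    return [dual_pol_rvi(vv_db, vh_db) for vv_db, vh_db in observations]


if __name__ == "__main__":
    # Hypothetical monthly backscatter values for a single field, in dB.
    monthly = [(-10.0, -20.0), (-9.0, -16.0), (-8.5, -14.0)]
    for month, rvi in enumerate(rvi_time_series(monthly), start=1):
        print(f"month {month}: RVI = {rvi:.3f}")
```

Under this assumed definition, the index grows as the cross-polarised (VH) return increases relative to the co-polarised (VV) return, which is generally associated with denser vegetation.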