Article

Hyperspectral Classification of Frost Damage Stress in Tomato Plants Based on Few-Shot Learning

1 College of Information Science and Technology, Shihezi University, Shihezi 832061, China
2 Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization (Xinjiang Production and Construction Corps), College of Agriculture, Shihezi University, Shihezi 832000, China
3 School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 201100, China
* Authors to whom correspondence should be addressed.
Agronomy 2023, 13(9), 2348; https://doi.org/10.3390/agronomy13092348
Submission received: 3 August 2023 / Revised: 2 September 2023 / Accepted: 6 September 2023 / Published: 9 September 2023
(This article belongs to the Section Precision and Digital Agriculture)

Abstract

Early detection and diagnosis of crop anomalies is crucial for enhancing crop yield and quality. Recently, the combination of machine learning and deep learning with hyperspectral images has significantly improved the efficiency of crop detection. However, acquiring a large amount of properly annotated hyperspectral data on stressed crops requires extensive biochemical experiments and specialized knowledge. This limitation poses a challenge to the construction of large-scale datasets for crop stress analysis. Meta-learning is a learning approach that is capable of learning to learn and can achieve high detection accuracy with limited training samples. In this paper, we introduce meta-learning to hyperspectral imaging and crop detection for the first time. In addition, we gathered 88 hyperspectral images of drought-stressed tomato plants and 68 images of freeze-stressed tomato plants. The data related to drought serve as the source domain, while the data related to frost damage serve as the target domain. Due to the difficulty of obtaining target domain data from real-world testing scenarios, only a limited amount of target domain data and source domain data are used for model training. The results indicated that meta-learning, with a minimum of eight target domain samples, achieved a detection accuracy of 69.57%, precision of 59.29%, recall of 66.32% and F1-score of 62.61% for classifying the severity of frost stress, surpassing other methods with a target domain sample size of 20. Moreover, for determining whether the plants were under stress, meta-learning, with a minimum of four target domain samples, achieved a detection accuracy of 89.1%, precision of 89.72%, recall of 93.08% and F1-score of 91.37%, outperforming other methods at a target domain sample size of 20. The results show that meta-learning methods require significantly less data across different domains compared to other methods. The performance of meta-learning techniques thoroughly demonstrates the feasibility of rapidly detecting crop stress without the need for collecting a large amount of target stress data. This research alleviates the data annotation pressure for researchers and provides a foundation for detection personnel to anticipate and prevent potential large-scale stress damage to crops.

1. Introduction

With the increase in population and the changing environment, food production is under severe strain. As one of the most widely cultivated and consumed vegetable crops in the world [1], tomato production is closely intertwined with global food security. In the process of tomato cultivation, early detection of abnormal conditions affecting the crops (including biotic and abiotic stressors) can effectively improve crop yield and quality [2,3]. In order to achieve food security and ensure grain yield, digital technology has been widely utilized to assist agricultural production [4].
By applying machine learning and deep learning to classify crop RGB images, the growth status of crops can be detected automatically. P. Dhiman et al. [5] provided an overview of classification models and concluded that methods such as SVM and CNN perform strongly in detecting citrus fruit diseases. Sujatha et al. [6] conducted a classification study on citrus leaf diseases using both traditional machine learning and deep learning methods. Singh et al. [7] presented a detailed and accessible classification of machine learning methods, which can help plant researchers apply appropriate machine learning strategies and best-practice rules to different biotic and abiotic stresses. For the early detection and classification of sunflower diseases, Y. Gulzar et al. [8] achieved effective classification using EfficientNet. Such methods replace labor-intensive, time-consuming manual inspection, which relies on professional knowledge. However, RGB images with only three wavelengths cannot reveal deeper information about crop growth. Because a hyperspectral image (HSI) contains many contiguous bands that embody rich spectral and spatial information, crop growth status is readily reflected in HSI [9]. However, the challenges of creating a large-scale crop spectral dataset lie in the longer recording times, the conversion of physiological measurements into labels, and the complexity of field environments. Various machine learning techniques have been used to build crop classification models from multi-spectral and multi-temporal satellite imagery [10,11]. When applied to real-world agricultural conditions, ML-based methods cannot extract discriminative and representative features because of their shallow architectures. As a branch of machine learning, deep learning has been applied to HSI classification and has shown its effectiveness.
Deep-learning-based methods construct a multi-layer structure to extract deeper and more essential features [12]. Recently, various deep-learning-based methods, such as the convolutional neural network (CNN), have been applied to HSI classification of crop diseases. Despite the viability and efficiency of the abovementioned deep learning methods in HSI classification of crop diseases, the diagnostic models perform well on the training data, i.e., the source domain, while their performance is diluted when applied to realistic conditions, i.e., the target domain. Therefore, to ensure high classification accuracy on the target domain, most applications simply assume that the distributions of the source domain and the target domain are identical [13,14]. However, in most practical cases, changing environments, diverse diseases and unknown factors result in a domain shift between the source domain and the target domain. This domain shift breaks the consistency of the source and target distributions and ultimately degrades detection accuracy. Furthermore, training deep learning models typically requires a large amount of labeled data, yet obtaining sufficient labeled HSI is time-consuming and expensive [15]. To reduce the model's reliance on large datasets and make better use of labeled and unlabeled data, generative adversarial networks (GANs) and transfer learning (TL) have been applied to the problem of limited labeled samples. GANs enable image augmentation by learning the distribution of the training data with a generative model, producing high-quality synthetic images that closely resemble real ones [16]. Hu et al. [17] reported using DCGAN with a conditional label constraint (named C-DCGAN) to identify tea leaf diseases (red scab, red leaf spot and leaf blight). Abbas et al. [18] employed CGANs to augment ten classes of tomato leaf images from the Plant Village dataset [19]. Cap et al. [20] introduced a leaf segmentation module into CycleGAN, resulting in LeafGAN, a model capable of transforming regions of interest in plant disease images and thereby enriching the versatility of image generation. However, instability and mode collapse remain major obstacles for GANs in this classification problem [21]. Furthermore, when target domain data are insufficient, the data generated by GAN models also fail to capture the underlying data patterns. TL is based on the observation that different tasks and domains often share common features or knowledge, so learning can be accelerated and improved by transferring previously acquired knowledge and features from related tasks or domains. This approach can enhance the classification performance of a model when training data are limited [22]. Espejo-Garcia et al. [23] used transfer learning for robust plant identification with limited datasets. Too et al. [24] compared the performance of various fine-tuned deep learning models for plant disease identification. Paymode et al. [25] used a VGG network based on transfer learning to solve a multi-crop leaf disease image classification problem. Y. Gulzar [26] combined MobileNetV2 with deep transfer learning to formulate the TL-MobileNetV2 model, achieving effective fruit classification. N. Mamat et al. [27] employed deep learning, YOLO (You Only Look Once) and transfer learning to develop an automated annotation technique for fruit images.
When the source domain and target domain are distributed differently, TL-based methods can suffer from negative transfer, which weakens their real-world effectiveness. Moreover, mitigating negative transfer and achieving better classification performance usually requires more labeled target domain data. It is usually expensive for current learning models to learn new knowledge effectively by increasing the number of categories and samples in the training set. In contrast, humans can learn to classify unseen samples from only a few labeled examples. How to endow a model with the ability to classify new classes from existing labels, i.e., few-shot learning classification, has therefore recently attracted attention [28,29].
In few-shot classification, unlike the conventional setting, the source and target domains are not required to share the same distribution, so few-shot learning (FSL) is able to classify new, unseen classes in the target domain. An FSL approach is essentially a meta-learning (learning to learn) approach [30]: general knowledge is obtained by training on source domain data and is then used as an aid to predict the target classes from only a few labeled samples. More concretely, given multiple N-way K-shot classification tasks, the goal of each task is to learn an N-class classification model from K labeled training samples per class [31]. Usually, K is small. To acquire meta-knowledge tailored to few-shot tasks, the method uses labeled source class data to simulate few-shot tasks through episodes [32]. The meta-knowledge obtained by training in this episodic paradigm is finally assembled into the model, which is then applied to new few-shot classification tasks. Ultimately, the few labeled target class samples, together with the meta-knowledge obtained from the source domain, are used to predict the categories of the target class data. Based on this idea, the prototypical network creates a prototype for each category, and samples are discriminated by computing their Euclidean distance to the prototypes [33]. The Relation Network obtains relation scores between the support set and the query set by building a learnable metric module; the relation score is the basis for determining the query category [34].
Meta-learning models can learn how to draw experience from one type of crop stress and apply it to other stress situations. This implies that when faced with new crops or stress types, the model can adapt and detect stress more rapidly, empowering it to play a crucial role in real-time crop monitoring and prompt decision making. This is particularly significant for identifying crop stress in emergency situations: accurate classification under limited samples can help farmers promptly take action to mitigate the losses caused by crop stress. In this paper, to address the low detection accuracy caused by the small sample size of tomato plant hyperspectral data, we cultivated tomato plants and produced hyperspectral datasets covering different stresses and stress levels. In this work, meta-learning is introduced for the first time to the small-sample problem in agricultural hyperspectral imaging, addressing the low recognition rates of traditional methods under small-sample conditions. We explore the ability of meta-learning to identify stress across domains in agriculture and examine the impact of meta-learning parameters on hyperspectral classification. Finally, we summarize the experimental results and explain how the various meta-learning methods affect hyperspectral classification. In the last part, the deficiencies of the experiment are discussed and future improvements are proposed.
The main contributions of this paper are as follows:
(1)
Hyperspectral data of tomato plants under abiotic stress were collected, and the collected data were used to build a small-sample hyperspectral dataset covering multiple stress types and severity levels.
(2)
Based on the created hyperspectral dataset of abiotic stress in tomato plants, we constructed machine-learning-based models such as SVM and PLS-DA, a traditional deep-learning-based CNN, and meta-learning-based models such as the Prototypical Network and the Relation Network. Detection experiments were conducted under both same-domain and cross-domain conditions, and the classification results of the models were summarized and analyzed.
(3)
The experimental results were analyzed, and the reasons for model misclassifications were discussed. The classification performance was explored from two perspectives: the data and the models' classification mechanisms.
(4)
This work offers new insights into the rapid detection of crop stress and provides a basis for agricultural personnel to proactively prevent large-scale stress damage.

2. Materials and Methods

2.1. Experimental Materials

Abiotic stress cultivation experiments on tomato plants were conducted at Shihezi University (44°19′ N, 86°03′ E), Shihezi City, Xinjiang Production and Construction Corps, China. Hyperspectral data from the tomato plants were collected from April to May 2022. The cultivated tomato variety was Shi Ji Bao Guan, which has a wide cultivation range and high disease resistance. The cultivation substrate was PINDSTRUP peat soil with a pH of 5.5 and an electrical conductivity (EC) of 0.5.
Stress treatments were started when the tomato seedlings reached four to five leaves. The cultivated tomato plants were divided into two groups and subjected to different stress treatments. Plants subjected to drought stress were kept in a greenhouse at 25–26 °C, while plants subjected to freeze stress were incubated in a low-temperature artificial climate chamber (model DRXM-1008F-2) set to 6 °C, 8000 lux of light and 50% ambient humidity. For the irrigation treatment, the tomato plants under drought stress were left in the greenhouse to undergo natural drought. Tomato plants were placed in a thermostat and transferred to the laboratory for hyperspectral image acquisition. The hyperspectral data acquisition system is shown in Figure 1. Before the hyperspectral images were acquired, the reference plate was positioned at the average height of the tomato plants. The hyperspectral image acquisition device was a Surface Optics Corporation (SOC) 710-SWIR with a wavelength range of 900–1700 nm and 288 bands. The hyperspectral image acquisition software was SOC's HyperScanner-SWIR (https://surfaceoptics.com/wp-content/uploads/2019/04/710-Series-Brochure.pdf, accessed on 2 August 2023), and the hyperspectral image processing software was SRAnalysis710e (https://aaronia.com/es/produkte/spectrum-analyzer?gclid=EAIaIQobChMIy_28yq-cgQMVwCeDAx2uaAInEAAYASAAEgLvu_D_BwE, accessed on 2 August 2023).

2.2. Spectral Data Extraction and Preprocessing

Hyperspectral data had to be converted into spectral reflectance before the mean spectral reflectance of the tomato plants could be calculated. The reference spectral Digital Number (DN) data and the spectral calibration, spatial radiometric calibration and spectral radiometric calibration functions of the SRAnalysis710 software were used to convert the tomato plant hyperspectral data into spectral reflectance.
To efficiently mitigate non-uniform illumination and spectral scattering intensity distortion arising from uneven sample surfaces, the approach of mean normalization is commonly used for hyperspectral data [35,36]. However, in this work, the raw hyperspectral images of tomato plants contain interfering noise including soil and background. This background noise interferes with the extraction of the average spectral reflectance of the plants. It is necessary to remove the background from the image before extracting the average spectral reflectance. Moreover, the presence of a certain amount of noise in the spectral reflectance of the different bands results in a jagged characteristic in the reflectance between adjacent bands. Smoothing of spectral curves is also equally essential. Park et al. [37] demonstrated in their work that smoothed spectra exhibit higher detection accuracy compared to raw spectra.
For the initial parameters of the SG smoothing algorithm, the window length was set to 5 and a third-order polynomial fit was used. To further reduce the effect of noise on the target spectrum, spectral differentiation was used to remove part of the linear or near-linear background and to decompose overlapping spectral features, which are then more easily identified. In addition, the standard normal variate (SNV) method was used to correct for spectral errors caused by inter-sample scattering [38]. The main steps are illustrated in Figure 1. First, the bands with the maximum and minimum reflectance of the tomato plant hyperspectrum were selected and their spectral images subtracted. Next, the resulting image was thresholded to obtain a mask containing only the tomato plant. The mask was then applied to the original data cube to obtain a hyperspectral image containing only the tomato plant, and the average spectral reflectance was calculated over all plant pixels, yielding the average spectral curve of an individual tomato plant. Finally, the mean spectral reflectance curves were SG-smoothed, SNV-corrected and first-order differentiated.
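As a minimal illustration of this preprocessing chain (not the authors' code), the sketch below applies SG smoothing, SNV and first-order differencing to mean reflectance curves with SciPy and NumPy. The window length of 5 and third-order polynomial follow the text; the function name and the example array shapes are assumptions for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter

def preprocess_mean_spectra(spectra: np.ndarray) -> np.ndarray:
    """Apply SG smoothing, SNV correction and a first-order difference.

    spectra: array of shape (n_samples, n_bands), e.g. (88, 288) mean
    reflectance curves. Returns the preprocessed spectra.
    """
    # Savitzky-Golay smoothing: window length 5, third-order polynomial (as in the text)
    smoothed = savgol_filter(spectra, window_length=5, polyorder=3, axis=1)

    # Standard normal variate: per-sample centering and scaling to unit standard deviation
    snv = (smoothed - smoothed.mean(axis=1, keepdims=True)) / smoothed.std(axis=1, keepdims=True)

    # First-order differentiation along the wavelength axis (one band is lost)
    return np.diff(snv, n=1, axis=1)

# Example: 88 drought-stressed samples with 288 bands of mean reflectance
if __name__ == "__main__":
    dummy = np.random.rand(88, 288)
    print(preprocess_mean_spectra(dummy).shape)  # (88, 287)
```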

2.3. Meta Learning-Based Methods for Hyperspectral Classification (HSC)

In this subsection, the meta-learning methods we use are described in two main parts. First, we introduce the generic learning strategy of meta-learning; all the meta-learning models used in this paper are based on this strategy. Then, the two meta-learning model frameworks, ProtoNet and RelationNet, are described in detail, and the Embedding functions used in this paper are presented.

2.3.1. Meta-Training Strategy

As mentioned in the introduction, meta-learning-based learning strategies enable the meta-learner to acquire task-level, rather than sample-level, classification capability from the classification tasks in the source domain. This task-level capability allows the model to achieve better classification results after fine-tuning with a small number of target domain samples. In this section, we detail the learning strategy of meta-learning.
We define the source domain dataset as $D_S$ and the number of source domain categories as $C_S$. The target domain dataset is defined as $D_T$ and the number of target domain categories as $C_T$. Within the target domain dataset $D_T$, the labeled subset $D_f$ is used for fine-tuning and the unlabeled subset $D_t$ is used for testing, i.e., $D_T = D_f \cup D_t$.
The N-way K-shot setting is a critical issue in meta-learning-based learning strategies, and each few-shot task is considered an episode. Episodes are constructed by randomly selecting $N$ classes ($N \le C_S$) from the source set $D_S$, with $K$ samples drawn from each class to form the support set $S$, i.e., $S = \{(x_i, y_i)\}_{i=1}^{N \times K}$. In addition, $M$ samples are randomly selected from each of the $N$ categories to form the query set $Q$, i.e., $Q = \{(x_j, y_j)\}_{j=1}^{N \times M}$. Here $x$ denotes the average spectrum of a sample, i.e., $x \in \mathbb{R}^D$, and $y$ denotes the corresponding label, i.e., $y \in \mathbb{R}$. For task-based meta-learning, the training data within each task consist of this support set and query set, and we define a single task as $\Gamma = S \cup Q$. The source set is organized in this way into the multi-task training set $D_{train\_source} = \{\Gamma_1, \Gamma_2, \ldots, \Gamma_{train\_number}\}$, where $train\_number$ is the number of constructed tasks. In the training phase, the goal of meta-learning within a single task is to minimize the model's loss on the query set predictions by training on the support set. The purpose of constructing multiple tasks is to allow the model to accumulate meta-knowledge, which is used to accelerate the model's learning of new tasks. Tasks for the labeled target set are constructed in the same way as tasks for the source set, i.e., $D_{train\_target} = \{\Gamma_1, \Gamma_2, \ldots, \Gamma_{target\_number}\}$, where $target\_number$ is the number of tasks constructed for the target domain. The initialization parameters for training on the target set are determined by the meta-information, i.e., the parameters of the meta-knowledge learned from the source set. In this way, the update of the model parameters $\theta$ is driven by the features of the target set, and the loss function can be quickly minimized within a small number of tasks for detection on the target set. After the parameters have been updated on the target domain data, the trained model is used to make predictions on $D_t$, and the classification ability of the model is evaluated based on the prediction results.
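The episodic construction described above can be sketched as follows. This is an illustrative implementation rather than the authors' code; the function name sample_episode and the default values of n_way, k_shot and m_query are assumptions.

```python
import numpy as np

def sample_episode(X, y, n_way=3, k_shot=5, m_query=5, rng=None):
    """Build one N-way K-shot episode (support set S and query set Q).

    X: (n_samples, n_bands) array of average spectra; y: (n_samples,) labels.
    Returns (support_x, support_y, query_x, query_y).
    """
    if rng is None:
        rng = np.random.default_rng()
    # randomly choose N of the available classes for this episode
    classes = rng.choice(np.unique(y), size=n_way, replace=False)

    support_x, support_y, query_x, query_y = [], [], [], []
    for new_label, c in enumerate(classes):
        idx = rng.permutation(np.where(y == c)[0])
        support_x.append(X[idx[:k_shot]])
        query_x.append(X[idx[k_shot:k_shot + m_query]])
        # relabel the selected classes 0..N-1 inside the episode
        support_y += [new_label] * k_shot
        query_y += [new_label] * m_query

    return (np.concatenate(support_x), np.array(support_y),
            np.concatenate(query_x), np.array(query_y))
```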

2.3.2. Prototypical Network

The Prototypical Network (ProtoNet) assumes that the input data can be encoded as feature vectors, i.e., points in an embedding space, in which the samples of a class cluster around a single point, the class prototype. The non-linear mapping of the input data to the embedding space is achieved by an encoder. The prototype of each class is generated from the embedding vectors of the support set, and the distance between the embedding vector of a query input and each class prototype determines the category of that query sample. ProtoNet computes the prototype $p_c \in \mathbb{R}^m$ using an Embedding function $h(x): \mathbb{R}^D \to \mathbb{R}^m$, where $h(x)$ contains all the parameters $\phi$ that the neural network needs to learn. The prototype $p_c$ is obtained by averaging the embedding vectors of the class-$c$ samples in the support set.
The calculation formula is shown below:
$p_c = \frac{1}{K} \sum_{i} h(x_i)$
where $K$ denotes the number of samples per class in the support set. The distance function $F_{distance}: \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^+$ is used to measure the distance between query embeddings and prototypes in the embedding space; the specific calculation is as follows:
$F_{distance}(v_1, v_2) = \sum_{i=1}^{M} (v_{1,i} - v_{2,i})^2, \quad v \in \mathbb{R}^M$
The category probability of a query set sample vector $\tilde{x}_i$ is calculated as follows:
$p_\phi(y = c \mid \tilde{x}_i \in Q) = \frac{\exp[-F_{distance}(h(\tilde{x}_i), p_c)]}{\sum_{c'} \exp[-F_{distance}(h(\tilde{x}_i), p_{c'})]}, \quad c = 1, 2, 3, \ldots, N$
The classification of the query set is achieved by the maximum of the category probabilities, that is:
$\hat{y} = \arg\max_{c} \; p_\phi(y = c \mid \tilde{x}_i \in Q), \quad c = 1, \ldots, N$
In the training process, the loss function is the cross-entropy loss function, which is calculated as shown below:
$L(y, \tilde{y}) = -\sum_{c=1}^{N} y_c \log \tilde{y}_c$
The model minimizes the loss value by calculating a loss function and back propagating to update the parameters. The model framework for ProtoNet is shown in Figure 2. In this work, the specific hyperparameter settings for ProtoNet are as follows. The number of constructed tasks is 1000. The number of epochs used for training between tasks is 200, and within each task, an episode consists of 100 iterations. The model employs the Adam optimization algorithm with an initial learning rate of 0.01, and the learning rate between different epochs is decayed by a factor of 0.95.
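The prototype computation, distance metric and cross-entropy loss described by the formulas above can be sketched in PyTorch as follows. This is an illustrative episode step, not the authors' implementation, and it assumes an embedding network embed that maps spectra to m-dimensional vectors.

```python
import torch
import torch.nn.functional as F

def protonet_episode_loss(embed, support_x, support_y, query_x, query_y, n_way):
    """Compute class prototypes from the support set and the cross-entropy loss on the query set.

    support_x: (N*K, D) spectra, support_y: (N*K,) integer labels in 0..N-1,
    query_x: (N*M, D), query_y: (N*M,) integer labels in 0..N-1.
    """
    z_support = embed(support_x)          # (N*K, m)
    z_query = embed(query_x)              # (N*M, m)

    # class prototypes: mean embedding of each class's support samples
    prototypes = torch.stack(
        [z_support[support_y == c].mean(dim=0) for c in range(n_way)]
    )                                      # (N, m)

    # squared Euclidean distance between every query embedding and every prototype
    dists = torch.cdist(z_query, prototypes) ** 2   # (N*M, N)

    # softmax over negative distances gives class probabilities; cross-entropy loss
    log_p = F.log_softmax(-dists, dim=1)
    loss = F.nll_loss(log_p, query_y)
    acc = (log_p.argmax(dim=1) == query_y).float().mean()
    return loss, acc
```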

2.3.3. Relation Network

Unlike ProtoNet, the Relation network (RelationNet) does not calculate similarity by using a pre-defined distance method but by training a similarity measure. RelationNet achieves a non-linear mapping of the input to the embedding space by constructing Embedding functions. In addition, a relationship learning module is constructed in order to learn the relationship between the embedding vector of the query sample and each embedding vector in the support set, and finally to derive a relationship score. The relationship score is an indispensable basis for predicting the category to which the query sample belongs. The RelationNet model framework is shown in Figure 2.
The aim of RelationNet is to learn how to compare two vectors. First, all the samples $\{x_i\}_{i=1}^{N \times K}$ in the support set $D_{\Gamma_j}^{support}$, together with the samples in the query set, are fed into the constructed Embedding function $h(x): \mathbb{R}^D \to \mathbb{R}^m$, resulting in the feature vectors $r_i = h(x_i)$, $i = 1, \ldots, N \times K$, and $\tilde{r}_j = h(\tilde{x}_j)$. $h(x)$ contains the parameters $\phi$ that the neural network needs to learn. Second, each support embedding is concatenated with each query embedding, i.e., $z(i, j) = (r_i, \tilde{r}_j)$, $i = 1, \ldots, N \times K$. In learning to measure relations, the relation score $s(i, j)$ is obtained by a second Embedding function $g(x): \mathbb{R}^{2m} \to \mathbb{R}^+$, expressed as $s(i, j) = g(z(i, j))$. When there is more than one sample per class in the support set, the relation score for a class becomes the average of the relation scores of its support samples: $c(i, j) = \frac{1}{K} \sum_{i=1}^{K} s(i, j)$. The query set is classified by the maximum relation score: $\hat{y} = \arg\max_{c} \; p_\phi(y = c \mid \tilde{x}_j \in Q)$, $c = 1, \ldots, N$. Similarly, the loss function during training is the cross-entropy loss, and the model parameters are updated through backpropagation to minimize the loss value.
In this work, the specific hyperparameter settings for RelationNet are as follows. The number of constructed tasks is 1000. The number of epochs used for training between tasks is 200, and within each task, an episode consists of 100 iterations. The model employs the Adam optimization algorithm with an initial learning rate of 0.001, and the learning rate between different epochs is decayed by a factor of 0.95.
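A sketch of the relation-score computation in PyTorch is shown below. It is illustrative rather than the authors' implementation, uses assumed layer sizes for the relation head, and averages each class's support embeddings before scoring, which slightly simplifies the per-sample score averaging described above.

```python
import torch
import torch.nn as nn

class RelationHead(nn.Module):
    """Learnable metric g: concatenated (class, query) features -> relation score in [0, 1]."""

    def __init__(self, feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),
        )

    def forward(self, z_support, z_query, support_y, n_way):
        # average the support embeddings of each class (K-shot case)
        class_feats = torch.stack(
            [z_support[support_y == c].mean(dim=0) for c in range(n_way)]
        )                                                  # (N, m)
        n_query = z_query.size(0)
        # build all (class, query) pairs and concatenate their features
        pairs = torch.cat(
            [class_feats.unsqueeze(0).expand(n_query, -1, -1),
             z_query.unsqueeze(1).expand(-1, n_way, -1)],
            dim=2,
        )                                                  # (N*M, N, 2m)
        scores = self.net(pairs).squeeze(-1)               # (N*M, N) relation scores
        return scores.argmax(dim=1), scores
```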

2.3.4. Embedding Function

The Embedding function $h(x)$ consists of multiple convolution blocks. Each convolution block contains four layers: a one-dimensional convolution layer $f_{conv}$, a Batch Normalization (BN) layer $f_{bn}$, an activation function layer $f_{relu}$, and a pooling layer $f_{pool}$. Details of the convolution blocks are shown in Table 1. BN layers are used to speed up training and to reduce the internal covariate shift caused by the differing input distributions of each layer. Batch normalization is represented by the following equation:
$f_{bn}(x_i, x) = \gamma \frac{x_i - \mu(x)}{\sqrt{\sigma(x)^2 + \epsilon}} + \beta$
where $x_i$ denotes an input sample. The learnable parameters $\gamma$ and $\beta$ were initialized to 1 and 0, respectively, with the hyperparameter $\epsilon = 1 \times 10^{-5}$. The mean $\mu(x)$ and variance $\sigma(x)^2$ are expressed as:
$\mu(x) = \frac{1}{M} \sum_{i=1}^{M} x_i$
$\sigma(x)^2 = \frac{1}{M} \sum_{i=1}^{M} (x_i - \mu(x))^2$
where $M$ denotes the mini-batch size. The activation function layer $f_{relu}$ can be expressed as $f_{relu}(x) = \max(0, x)$. The output of the $n$-th convolution block is $x_i^{(n)} \in \mathbb{R}^{C \times W}$, where $C$ and $W$ denote the channel size and the length of the input data. Taken together, $x_i^{(n)}$ can be expressed as:
$x_i^{(n)} = f_{pool}\left(f_{relu}\left(f_{bn}\left(f_{conv}(x_i^{(n-1)}),\, f_{conv}(x^{(n-1)})\right)\right)\right)$
It is worth noting that the CNN network model also utilizes the same structure as the Embedding function. However, in this work, the final classifier of the CNN employs fully connected layers for classification. The hyperparameters for the CNN model in this context are as follows: The model was trained for 150 epochs using the Adam optimization algorithm, with an initial learning rate of 0.001. The learning rate between different epochs was decayed by a factor of 0.95.
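The convolution block of Table 1 can be assembled into an embedding network as sketched below. This is an illustrative PyTorch version; the number of blocks (four) and the input band count of 288 are assumptions, since the paper specifies the block layout but not the number of stacked blocks.

```python
import torch
import torch.nn as nn

def conv_block(in_channels: int, out_channels: int = 64) -> nn.Sequential:
    """One convolution block: 1-D conv (64 filters, kernel 3) -> BatchNorm -> ReLU -> MaxPool (kernel 2)."""
    return nn.Sequential(
        nn.Conv1d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm1d(out_channels),
        nn.ReLU(),
        nn.MaxPool1d(kernel_size=2),
    )

class SpectralEmbedding(nn.Module):
    """Embedding function h(x): stacks several convolution blocks and flattens the output."""

    def __init__(self, n_blocks: int = 4):
        super().__init__()
        blocks = [conv_block(1)] + [conv_block(64) for _ in range(n_blocks - 1)]
        self.encoder = nn.Sequential(*blocks)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_bands) average spectra -> add a channel dimension for Conv1d
        z = self.encoder(x.unsqueeze(1))
        return z.flatten(start_dim=1)

# Example: embed a batch of 10 spectra with 288 bands
if __name__ == "__main__":
    h = SpectralEmbedding()
    print(h(torch.rand(10, 288)).shape)   # (10, 1152) with 4 blocks: 288 -> 144 -> 72 -> 36 -> 18, times 64 channels
```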

2.4. Experiment

To investigate the ability of meta-learning to detect stress in tomato plants under limited-sample conditions, we created a limited-sample dataset and performed classification tests. Firstly, we established a small-sample hyperspectral dataset of tomato plants subjected to different types and levels of stress; a detailed description of the dataset is given in Section 2.4.1. Secondly, we used traditional machine learning, deep learning and meta-learning techniques to perform detection under limited-sample training and report the classification accuracy and result analysis. By comparing multiple methods, we aim to identify the optimal approach for tomato stress detection under limited sample sizes. Thirdly, the experiments were run in a software environment of Python 3.9 and PyTorch 1.19, on hardware consisting of an Intel i5-11400H CPU and an NVIDIA RTX 3060 Laptop GPU.

2.4.1. Dataset Introduction

The average spectra of the collected plants were recorded, and the activities of three antioxidant enzymes (CAT, SOD and POD) and the content of malondialdehyde (MDA) were used as labels to divide the dataset. The dataset includes two types of stress (drought and frost) and four levels of stress severity experienced by the tomato plants. Detailed information about the dataset is provided in Table 2. In this study, the objective is to achieve favorable detection performance in the target domain while working with a limited number of samples from that domain. The samples subjected to drought stress, 88 in total, were divided into a training set and a validation set in a 3:1 ratio. Additionally, $N_t$ target domain samples were provided for fine-tuning the conventional models and for serving as feature vectors for the metric-based meta-learning methods. All data affected by frost damage, a total of 68 samples, were used as test samples. It is worth emphasizing that during fine-tuning, the parameters of a conventional model are updated using the target domain samples as input. In contrast, within the meta-learning framework, the additional target domain samples do not lead to parameter updates in the model; they function solely as feature vectors for metric evaluation. The specific utilization scheme of the datasets from the different domains is shown in Figure 3.

2.4.2. Model Evaluation

In this section, the primary evaluation metrics are introduced. The four metrics we employed are accuracy, precision, recall and F1-score. Accuracy is the proportion of samples correctly predicted by the classification model out of the total number of samples; it measures the overall accuracy of the model's predictions. The calculation formula is as follows:
$Acc = \frac{TP + TN}{TP + TN + FP + FN}$
Precision refers to the proportion of samples correctly predicted as the positive class by the model compared to the total number of samples predicted as the positive class by the model. Precision measures the accuracy of the model’s positive class predictions. The calculation formula is as follows:
$Precision = \frac{TP}{TP + FP}$
Recall, also known as sensitivity or true positive rate, refers to the proportion of samples correctly predicted as the positive class by the model compared to the total number of actual positive class samples. Recall measures the extent to which the model captures positive class samples. The calculation formula is as follows:
$Recall = \frac{TP}{TP + FN}$
The F1-score is the harmonic mean of precision and recall, providing a balanced assessment of a model's performance by considering the trade-off between the two. The calculation formula is as follows:
$F_1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall}$
TP, TN, FP and FN are true positive, true negative, false positive and false negative, respectively.
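For reference, these four metrics can be computed from predicted and true labels as in the sketch below. The use of scikit-learn and macro averaging over the classes is an assumption, since the paper does not state how the per-class values are aggregated.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall and F1-score for multi-class predictions."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
    }

# Example with the four frost-damage severity labels 0-3
print(evaluate([0, 1, 2, 3, 2, 1], [0, 1, 2, 2, 2, 0]))
```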

2.4.3. Few-Shot Frost Damage Stress Classification Experiment

In this experiment, to simulate the classification of frost damage levels under conditions of limited frost-damaged samples, we used all the drought data as the source domain and the frost damage data as the target domain. Additionally, to ensure diversity within each task and differentiation between tasks, we set the N-way quantity of meta-learning to 3, considering that there are only four categories of stress severity.
Regarding the classification of frost damage levels, Table 3 shows the accuracy of machine learning methods, including SVM, BP and PLS-DA, traditional deep learning method CNN, and two meta-learning methods, trained with a limited number of samples. From the table, it can be observed that ProtoNet achieves the highest accuracy in frost damage level classification when trained with different amounts of target domain data, followed by RelationNet. As expected, CNN, being a typical deep learning model designed for large-scale data, exhibits lower accuracy when trained with a small amount of data. Among the three machine learning methods, PLS-DA performs the best, reaching an accuracy of 67.84% when the target data size reaches 20. Under the same number of samples in the target domain, BP achieved the highest F1 score of 56.6%.
As the number of training samples in the target domain increases, the detection accuracy of the machine learning, deep learning and meta-learning methods also improves. Averaged over target domain sample sizes of up to 20, the highest average accuracy achieved by the machine learning methods is 62.3%, while the traditional deep learning method reaches 53.7%. Under domain shift, CNN performed worse than the machine learning methods. RelationNet and ProtoNet surpass the accuracy of CNN even when trained with only four target domain samples. ProtoNet trained with eight samples already exceeds the accuracy that the machine learning methods achieve with 20 samples, and RelationNet does so when trained with 12 samples. Moreover, on the other metrics, both meta-learning methods also surpass the conventional machine learning and deep learning methods. This strongly demonstrates that these two few-shot learning methods can achieve better detection performance with limited training samples.
As shown in Figure 4, we investigated the impact of shot quantity on the detection accuracy of the meta-learning methods by varying the number of shots per task. From Figure 4a, it can be observed that as the shot quantity increases, the accuracy of ProtoNet also improves. Specifically, with 5 shots, ProtoNet achieves an average accuracy of 71.01% across the different numbers of target domain training samples, and a maximum detection accuracy of 75.77% when trained with the largest number of target domain samples. For RelationNet, the best detection performance is achieved with 1 shot, with an average accuracy of 68.73% across the different target domain sample sizes and a maximum detection accuracy of 69.69% when trained with 20 target domain samples. As shown in Figure 4b, the precision of ProtoNet is lower than that of RelationNet across the different shot settings. However, in terms of recall (sensitivity, Figure 4c), ProtoNet outperforms RelationNet.
Furthermore, in addition to the above experiments, we conducted a classification of whether the crops suffered frost damage stress, instead of assessing its severity. The detailed results are shown in Table 4. From the table, it can be observed that ProtoNet achieved an accuracy of 89.1% in classifying whether crops suffered frost damage stress with only 4 target domain training samples, and reached a maximum accuracy of 94.43% with 20 target domain samples. In this classification task, ProtoNet achieved the best performance across all metrics. RelationNet achieved a detection accuracy of 87.63% with 4 target domain training samples, approaching the highest accuracy of machine learning and traditional deep learning, 87.71%, obtained with 20 target domain samples. In terms of classifying whether crops suffered frost damage stress, the meta-learning methods trained with a small number of target domain samples achieved detection accuracies comparable to or even higher than those of traditional methods trained with a larger number of samples.

3. Discussion

In real-world crop detection, two primary factors constrain the development of rapid plant detection. Firstly, the specific condition of the crops must be confirmed through specialized biochemical experiments, so labeling large-scale crop datasets demands significant time and expertise. Secondly, in practical application scenarios, the exact type of stress affecting the crops is often unknown, which results in catastrophic detection performance when the data used to build the model are inconsistent with the real-world data. The task-based learning approach allows meta-learning to achieve effective classification in scenarios where constructing large-scale datasets is not feasible. Additionally, across different tasks, meta-learning focuses on acquiring the ability to distinguish novel classes. When the source domain differs from the target domain, the effectiveness of meta-learning in cross-domain scenarios is notably superior to other methods, as illustrated in Table 3 and Table 4. Furthermore, meta-learning methods offer a more practical route for rapidly applying models to crop detection.
Additionally, the reasons for model misclassification from both data and model perspectives have been discussed. At the data level, as observed in Figure 4, ProtoNet and RelationNet exhibit a notable advantage in classifying tomato plant frost damage stress levels categorized as “Mild” and “Extreme” compared to “Moderate” and “Severe” stress levels. When examining the confusion matrix, it is evident that ProtoNet and RelationNet primarily misclassify “Moderate” stress level samples as “Severe”, while misclassifying “Severe” stress level samples is concentrated between “Moderate” and “Extreme”. Moreover, as the number of target domain samples increases, the quantity of misclassified samples gradually decreases. This indicates that when classifying highly similar spectral samples, meta-learning still poses a challenge under limited sample conditions, but increasing the number of trainable samples can help improve the classification of similar samples. Additionally, Figure 5 demonstrates a significant improvement in detection accuracy for both methods when using samples with greater dissimilarity, further emphasizing the impact of inter-class sample variability on detection performance.
At the model level, we visualized the features extracted by the models using the T-SNE method. Figure 6 illustrates the distribution of features in the feature space for both methods. The abundance of feature points in RelationNet is due to the continuous feature extraction performed during detection to establish relationships with the target domain samples, resulting in denser sample points. It can be observed that ProtoNet, through the establishment of centroid points, allows the features to gather around the prototypes, as shown in Figure 6. This facilitates accurate classification of samples that are slightly similar but belong to different classes. As demonstrated in Figure 5e and Figure 6e, under appropriate sample sizes, the probability of correctly classifying samples with low inter-class differences increases. RelationNet, on the other hand, performs classification by autonomously learning relation scores between two features, as depicted in Figure 7. This approach directs the model's attention to similar features across different classes, causing similar but different-class features to cluster together in the feature space, as shown in Figure 6j. Consequently, this leads to a decrease in classification accuracy, as depicted in Figure 5j. ProtoNet exhibits better detection capability than RelationNet when classifying samples with low similarity.
From the exploration above, we can draw two conclusions and identify future directions for our work. The first conclusion is that both ProtoNet and RelationNet exhibit a certain classification capability under small-sample conditions (Figure 8). Furthermore, it can be observed in the feature space that the clustering of the feature vectors is positively correlated with the number of target domain samples. The second conclusion is that, under limited-sample conditions, both meta-learning methods still require further improvement in classifying similar samples. Regarding this challenge, we have made some attempts and outline two directions for future work. The first direction is to enhance the feature extraction capability of the Embedding function for samples from new classes. Zhou et al. [39] summarized various embedding algorithms used for the Network Representation Learning (NRL) task and found that, across different tasks, different embedding algorithms influence feature extraction and thereby affect feature clustering and prediction. In this case, deeper network architectures are required to obtain more distinctive feature vectors, and more distinguishable feature vectors lead to better classification performance. The second direction is to adapt the distance metric to the task. Traditional distance metrics do not consider the weight relationships between different dimensions of the feature vectors; adopting different distance metrics for different feature vectors can expand inter-class distances and reduce intra-class distances, effectively enhancing the model's ability to classify similar samples. In summary, improving the model's classification accuracy for crop stress under small-sample conditions can provide relevant personnel with more accurate crop information during rapid monitoring and serves as a crucial foundation for subsequent production measures. These two approaches to improving the classification of similar samples offer a fresh perspective for addressing diseases, pests and stressors that are difficult to distinguish in subsequent classification work.

4. Conclusions

In this study, we introduced meta-learning for the first time to crop hyperspectral classification, addressing the challenge of large-scale data collection and annotation arising from the need for physical and chemical experiments and expert knowledge. For the detection of abiotic stress in tomato plants, we used drought-stressed hyperspectral data as the source domain and trained the model with a small amount of hyperspectral data from plants subjected to freeze stress. We conducted experiments to classify both the degree of stress and whether the crops were under stress. Compared with machine learning and traditional deep learning methods, our meta-learning approach achieved better accuracy with a small number of samples, both for classes with certain similarities and for classes with larger inter-class differences. Furthermore, as the sample size increased, the detection accuracy of our method continued to surpass that of the traditional methods. This demonstrates the effectiveness of meta-learning in classifying both whether crops are under stress and the degree of stress. The introduction of meta-learning enables detection personnel to quickly learn and adapt to various stress conditions affecting crops with only a small amount of spectral data, allowing for more accurate assessment of crop health in a shorter time frame and providing a basis for relevant personnel to promptly take targeted management measures.

Author Contributions

Conceptualization, S.R., P.G. and W.X.; methodology, S.R., T.Y. and P.G.; software, S.R., F.T. and H.C. (Hao Cang); validation, S.R., H.C. (Hao Cang) and Y.Z.; formal analysis, S.R., F.T. and H.C. (Hao Cang); investigation, H.C. (Huixin Chen), L.D., F.T. and P.X.; resources, W.X., H.C. (Huixin Chen) and P.G.; data curation, S.R. and T.Y.; writing—original draft preparation, S.R.; writing—review and editing, S.R. and P.G.; visualization, S.R. and H.C. (Hao Cang); supervision, P.G., L.G. and W.X.; project administration, S.R. and P.G.; funding acquisition, P.G. and W.X. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financially supported by the “National Natural Science Foundation of China” (grant numbers: 32060685 and 61965014) and the International Cooperation Promotion Plan of Shihezi University (GJHZ202104).

Data Availability Statement

All data can be obtained by email from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhu, G.; Wang, S.; Huang, Z.; Zhang, S.; Liao, Q.; Zhang, C.; Lin, T.; Qin, M.; Peng, M.; Yang, C.; et al. Rewiring of the fruit metabolome in tomato breeding. Cell 2018, 172, 249–261. [Google Scholar] [CrossRef] [PubMed]
  2. Kibriya, H.; Rafique, R.; Ahmad, W.; Adnan, S. Tomato Leaf Disease Detection Using Convolution Neural Network. In Proceedings of the 2021 International Bhurban Conference on Applied Sciences and Technologies (IBCAST), Islamabad, Pakistan, 12–16 January 2021; pp. 346–351. [Google Scholar] [CrossRef]
  3. Altaf, M.A.; Shahid, R.; Altaf, M.M.; Kumar, R.; Naz, S.; Kumar, A.; Alam, P.; Tiwari, R.K.; Lal, M.K.; Ahmad, P. Melatonin: First-line soldier in tomato under abiotic stress current and future perspective. Plant Physiol. Biochem. 2022, 185, 188–197. [Google Scholar] [CrossRef] [PubMed]
  4. Vasconez, J.P.; Delpiano, J.; Vougioukas, S.; Cheein, F.A. Comparison of convolutional neural networks in fruit detection and counting: A comprehensive evaluation. Comput. Electron. Agric. 2020, 173, 105348. [Google Scholar] [CrossRef]
  5. Dhiman, P.; Kaur, A.; Balasaraswathi, V.; Gulzar, Y.; Alwan, A.A.; Hamid, Y. Image acquisition, preprocessing and classification of citrus fruit diseases: A systematic literature review. Sustainability 2023, 15, 9643. [Google Scholar] [CrossRef]
  6. Sujatha, R.; Chatterjee, J.M.; Jhanjhi, N.; Brohi, S.N. Performance of deep learning vs. machine learning in plant leaf disease detection. Microprocess. Microsystems 2021, 80, 103615. [Google Scholar] [CrossRef]
  7. Singh, A.; Ganapathysubramanian, B.; Singh, A.K.; Sarkar, S. Machine learning for high-throughput stress phenotyping in plants. Trends Plant Sci. 2016, 21, 110–124. [Google Scholar] [CrossRef]
  8. Gulzar, Y.; Ünal, Z.; Aktaş, H.; Mir, M.S. Harnessing the power of transfer learning in sunflower disease detection: A comparative study. Agriculture 2023, 13, 1479. [Google Scholar] [CrossRef]
  9. Song, X.P.; Huang, W.; Hansen, M.C.; Potapov, P. An evaluation of Landsat, Sentinel-2, Sentinel-1 and MODIS data for crop type mapping. Sci. Remote Sens. 2021, 3, 100018. [Google Scholar] [CrossRef]
  10. Chakhar, A.; Ortega-Terol, D.; Hernández-López, D.; Ballesteros, R.; Ortega, J.F.; Moreno, M.A. Assessing the accuracy of multiple classification algorithms for crop classification using Landsat-8 and Sentinel-2 data. Remote Sens. 2020, 12, 1735. [Google Scholar] [CrossRef]
  11. Viskovic, L.; Kosovic, I.N.; Mastelic, T. Crop classification using multi-spectral and multitemporal satellite imagery with machine learning. In Proceedings of the 2019 International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, Croatia, 19–21 September 2019; pp. 1–5. [Google Scholar]
  12. Feng, Y.; Chen, J.; Zhang, T.; He, S.; Xu, E.; Zhou, Z. Semi-supervised meta-learning networks with squeeze-and-excitation attention for few-shot fault diagnosis. ISA Trans. 2022, 120, 383–401. [Google Scholar] [CrossRef]
  13. Li, Z.; Liu, M.; Chen, Y.; Xu, Y.; Li, W.; Du, Q. Deep Cross-Domain Few-Shot Learning for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–18. [Google Scholar] [CrossRef]
  14. Gao, K.; Guo, W.; Yu, X.; Liu, B.; Yu, A.; Wei, X. Deep induction network for small samples classification of hyperspectral images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3462–3477. [Google Scholar] [CrossRef]
  15. Peng, Y.; Liu, Y.; Tu, B.; Zhang, Y. Convolutional Transformer-Based Few-Shot Learning for Cross-Domain Hyperspectral Image Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 1335–1349. [Google Scholar] [CrossRef]
  16. Lu, Y.; Chen, D.; Olaniyi, E.; Huang, Y. Generative adversarial networks (GANs) for image augmentation in agriculture: A systematic review. Comput. Electron. Agric. 2022, 200, 107208. [Google Scholar] [CrossRef]
  17. Hu, G.; Wu, H.; Zhang, Y.; Wan, M. A low shot learning method for tea leaf’s disease identification. Comput. Electron. Agric. 2019, 163, 104852. [Google Scholar] [CrossRef]
  18. Abbas, A.; Jain, S.; Gour, M.; Vankudothu, S. Tomato plant disease detection using transfer learning with C-GAN synthetic images. Comput. Electron. Agric. 2021, 187, 106279. [Google Scholar] [CrossRef]
  19. Hughes, D.; Salathé, M. An open access repository of images on plant health to enable the development of mobile disease diagnostics. arXiv 2015, arXiv:1511.08060. [Google Scholar] [CrossRef]
  20. Cap, Q.H.; Uga, H.; Kagiwada, S.; Iyatomi, H. Leafgan: An effective data augmentation method for practical plant disease diagnosis. IEEE Trans. Autom. Sci. Eng. 2020, 19, 1258–1267. [Google Scholar] [CrossRef]
  21. Feng, Y.; Chen, J.; Yang, Z.; Song, X.; Chang, Y.; He, S.; Xu, E.; Zhou, Z. Similarity-based meta-learning network with adversarial domain adaptation for cross-domain fault identification. Knowl.-Based Syst. 2021, 217, 106829. [Google Scholar] [CrossRef]
  22. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359. [Google Scholar] [CrossRef]
  23. Espejo-Garcia, B.; Mylonas, N.; Athanasakos, L.; Fountas, S.; Vasilakoglou, I. Towards weeds identification assistance through transfer learning. Comput. Electron. Agric. 2020, 171, 105306. [Google Scholar] [CrossRef]
  24. Too, E.C.; Yujian, L.; Njuki, S.; Yingchun, L. A comparative study of fine-tuning deep learning models for plant disease identification. Comput. Electron. Agric. 2019, 161, 272–279. [Google Scholar] [CrossRef]
  25. Paymode, A.S.; Malode, V.B. Transfer learning for multi-crop leaf disease image classification using convolutional neural network VGG. Artif. Intell. Agric. 2022, 6, 23–33. [Google Scholar] [CrossRef]
  26. Gulzar, Y. Fruit image classification model based on mobilenetv2 with deep transfer learning technique. Sustainability 2023, 15, 1906. [Google Scholar] [CrossRef]
  27. Mamat, N.; Othman, M.F.; Abdulghafor, R.; Alwan, A.A.; Gulzar, Y. Enhancing image annotation technique of fruit classification using a deep learning approach. Sustainability 2023, 15, 901. [Google Scholar] [CrossRef]
  28. Chen, W.Y.; Liu, Y.C.; Kira, Z.; Wang, Y.C.F.; Huang, J.B. A closer look at few-shot classification. arXiv 2019, arXiv:1904.04232. [Google Scholar]
  29. Wang, Y.; Yao, Q.; Kwok, J.T.; Ni, L.M. Generalizing from a few examples: A survey on few-shot learning. ACM Comput. Surv. (Csur) 2020, 53, 1–34. [Google Scholar] [CrossRef]
  30. Ravi, S.; Larochelle, H. Optimization as a model for few-shot learning. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017. [Google Scholar]
  31. Wang, D.; Zhang, M.; Xu, Y.; Lu, W.; Yang, J.; Zhang, T. Metric-based meta-learning model for few-shot fault diagnosis under multiple limited data conditions. Mech. Syst. Signal Process. 2021, 155, 107510. [Google Scholar] [CrossRef]
  32. Vinyals, O.; Blundell, C.; Lillicrap, T.; Kavukcuoglu, K.; Wierstra, D. Matching networks for one shot learning. Adv. Neural Inf. Process. Syst. 2016, 29. [Google Scholar] [CrossRef]
  33. Snell, J.; Swersky, K.; Zemel, R. Prototypical networks for few-shot learning. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar] [CrossRef]
  34. Sung, F.; Yang, Y.; Zhang, L.; Xiang, T.; Torr, P.H.; Hospedales, T.M. Learning to compare: Relation network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1199–1208. [Google Scholar] [CrossRef]
  35. Mollazade, K.; Omid, M.; Akhlaghian Tab, F.; Rezaei Kalaj, Y.; Mohtasebi, S.S. Data mining-based wavelength selection for monitoring quality of tomato fruit by backscattering and multispectral imaging. Int. J. Food Prop. 2015, 18, 880–896. [Google Scholar] [CrossRef]
  36. Darnay, L.; Králik, F.; Oros, G.; Koncz, Á.; Firtha, F. Monitoring the effect of transglutaminase in semi-hard cheese during ripening by hyperspectral imaging. J. Food Eng. 2017, 196, 123–129. [Google Scholar] [CrossRef]
  37. Park, B.; Lawrence, K.C.; Windham, W.R.; Smith, D.P. Performance of hyperspectral imaging system for poultry surface fecal contaminant detection. J. Food Eng. 2006, 75, 340–348. [Google Scholar] [CrossRef]
  38. Jia, B.; Wang, W.; Ni, X.; Lawrence, K.C.; Zhuang, H.; Yoon, S.C.; Gao, Z. Essential processing methods of hyperspectral images of agricultural and food products. Chemom. Intell. Lab. Syst. 2020, 198, 103936. [Google Scholar] [CrossRef]
  39. Zhou, J.; Liu, L.; Wei, W.; Fan, J. Network representation learning: From preprocessing, feature extraction to node embedding. ACM Comput. Surv. (CSUR) 2022, 55, 1–35. [Google Scholar] [CrossRef]
Figure 1. Schematic diagram of a hyperspectral imaging system and preprocessing.
Figure 2. Framework of ProtoNet and RelationNet.
Figure 3. Traditional learning method and meta-learning data training approach.
Figure 4. The performance in accuracy, precision, recall and F1-score of ProtoNet and RelationNet. P1 represents ProtoNet trained with a shot of 1; P2, P3 and so on follow the same naming convention. R1 represents RelationNet trained with a shot of 1; R2, R3 and so on follow the same naming convention. (a) Accuracy. (b) Precision. (c) Recall. (d) F1-Score.
Figure 5. Confusion matrices for the 5-shot results of ProtoNet and the 1-shot results of RelationNet at different numbers of target domain training samples. Labels 0 to 3 correspond to four levels of stress severity: mild, moderate, severe and extreme. (a) ProtoNet-4. (b) ProtoNet-8. (c) ProtoNet-12. (d) ProtoNet-16. (e) ProtoNet-20. (f) RelationNet-4. (g) RelationNet-8. (h) RelationNet-12. (i) RelationNet-16. (j) RelationNet-20.
Figure 6. T-SNE visualization of the features extracted by RelationNet and ProtoNet under different numbers of target domain training samples. Labels 0 to 3 correspond to four levels of stress severity: mild, moderate, severe and extreme. (a) ProtoNet-4. (b) ProtoNet-8. (c) ProtoNet-12. (d) ProtoNet-16. (e) ProtoNet-20. (f) RelationNet-4. (g) RelationNet-8. (h) RelationNet-12. (i) RelationNet-16. (j) RelationNet-20.
Figure 7. RelationNet Metric module.
Figure 8. ProtoNet Metric module.
Table 1. The details for the convolutional block (CB).
Layer | Operation | Details
1 | 1-D convolution | 64@1 × 3
2 | 1-D normalization | Batch normalization
3 | Activation | ReLU
4 | 1-D MaxPooling | 1 × 2
Table 2. The details of the dataset.
Types of Stress | Levels of Stress | Level Labels | Number of Samples
Drought | Mild | 0 | 38
Drought | Moderate | 1 | 20
Drought | Severe | 2 | 20
Drought | Extreme | 3 | 10
Freeze | Mild | 0 | 19
Freeze | Moderate | 1 | 20
Freeze | Severe | 2 | 20
Freeze | Extreme | 3 | 9
Table 3. The detection accuracy (%) of tomato frost damage stress at different levels of stress severity.
Methods | N4t | N8t | N12t | N16t | N20t | Average
Each cell lists Acc, Precision, Recall and F1-Score (%).
SVM | 56.14, 44.4, 54.55, 48.95 | 57.14, 44.02, 57.39, 49.82 | 57.93, 43.59, 57.11, 49.44 | 62.3, 45.09, 61.61, 52.07 | 58.37, 44.28, 57.67, 50.07
BP | 58.46, 42.25, 59.89, 49.55 | 60.38, 44.35, 64.06, 52.41 | 61.4, 46.72, 62.04, 53.3 | 63.93, 46.59, 66.21, 54.69 | 67.35, 48.54, 67.87, 56.6 | 62.3, 45.69, 64.01, 53.31
PLS-DA | 53.85, 49.75, 43.57, 46.46 | 62.3, 45.34, 59.71, 51.54 | 63.16, 47.08, 63.04, 53.9 | 64.15, 47.53, 65.01, 54.91 | 67.84, 49.22, 65.15, 56.08 | 62.26, 47.78, 59.3, 52.58
CNN-FT | 51.43, 53.07, 59.71, 56.19 | 52.78, 54.44, 58.27, 56.29 | 53.11, 56.1, 58.28, 57.17 | 54.29, 57.35, 58.23, 57.79 | 56.9, 56.19, 58.39, 57.27 | 53.7, 55.43, 58.58, 56.94
RelationNet | 63.49, 63.69, 56.07, 59.64 | 67.19, 66.86, 59.99, 63.24 | 68.65, 68.48, 61.26, 64.67 | 69.39, 69.38, 62, 65.48 | 69.69, 69.51, 62.11, 65.6 | 67.68, 67.58, 60.29, 63.73
ProtoNet | 62.47, 50.42, 58.78, 54.28 | 69.57, 59.29, 66.32, 62.61 | 72.53, 62.91, 69.5, 66.04 | 74.73, 65.69, 71.89, 68.65 | 75.77, 66.64, 72.9, 69.63 | 71.01, 60.99, 67.88, 64.24
N4t denotes that the number of target domain samples used for training is 4; N8t, N12t, N16t and N20t follow the same naming convention. The red font indicates the current optimal value.
Table 4. The detection performance (%) for classifying whether tomato plants were subjected to frost damage stress.
Methods | N4t | N8t | N12t | N16t | N20t | Average
Each cell lists Acc, Precision, Recall and F1-Score (%).
SVM | 81.96, 85.87, 83.72, 84.78 | 84.9, 85.89, 88.64, 87.24 | 85.71, 86.67, 88.9, 87.77 | 87.71, 87.11, 92.02, 89.5 | 85.07, 86.39, 88.32, 87.32
BP | 83.07, 85.72, 86.52, 86.12 | 85.24, 86.41, 88.44, 87.41 | 85.71, 86.24, 89.46, 87.82 | 86.79, 87.39, 90.04, 88.69 | 87.71, 87.54, 90.91, 89.19 | 86.36, 86.9, 89.71, 88.28
PLS-DA | 72.13, 83.44, 67.98, 74.92 | 73.68, 83.54, 71.27, 76.92 | 81.53, 85.36, 83.58, 84.46 | 83.67, 86.47, 86.2, 86.34 | 86.79, 89.32, 88.35, 88.83 | 81.42, 86.17, 82.35, 84.14
CNN-FT | 72.3, 84.26, 69.05, 75.9 | 80.7, 86.25, 81.85, 83.99 | 79.24, 85.91, 80.44, 83.09 | 85.71, 85.99, 89.37, 87.65 | 86.88, 85.81, 91.88, 88.74 | 83.13, 85.99, 85.89, 85.87
RelationNet | 87.63, 88.42, 92.76, 90.54 | 88.68, 88.46, 93.86, 91.08 | 88.75, 89.31, 92.11, 90.69 | 88.97, 89.53, 93.12, 91.29 | 88.73, 88.36, 93.27, 90.75 | 88.78, 88.92, 93.09, 90.95
ProtoNet | 89.1, 89.72, 93.08, 91.37 | 91.93, 93.52, 93.47, 93.55 | 93.6, 95.70, 93.80, 94.74 | 94.1, 95.74, 94.69, 95.21 | 94.43, 95.35, 95.34, 95.34 | 93.52, 95.08, 94.33, 94.71
The red font indicates the current optimal value.

Share and Cite

MDPI and ACS Style

Ruan, S.; Cang, H.; Chen, H.; Yan, T.; Tan, F.; Zhang, Y.; Duan, L.; Xing, P.; Guo, L.; Gao, P.; et al. Hyperspectral Classification of Frost Damage Stress in Tomato Plants Based on Few-Shot Learning. Agronomy 2023, 13, 2348. https://doi.org/10.3390/agronomy13092348


