A Classification Method for Seed Viability Assessment with Infrared Thermography

This paper presents a viability assessment method for Pisum sativum L. seeds based on infrared thermography. Different artificial treatments were applied to prepare seed samples with different viability. Thermal and visible images were recorded every five minutes during the standard five-day germination test. After the test, the root length of each sample was measured and used as the viability index of that seed. Each individual seed area in the visible images was segmented with an edge detection method, and the average temperature of the corresponding area in the infrared images was calculated as the representative temperature of that seed at that time. The temperature curve of each seed during germination was plotted. Thirteen characteristic parameters extracted from the temperature curve were analyzed to show the differences in temperature fluctuation between seed samples with different viability. With these parameters, a support vector machine (SVM) was used to classify the seed samples into three categories (viable, aged and dead) according to root length, with a classification accuracy of 95%. On this basis, another SVM model using only the temperature data of the first three hours of germination was proposed, and its accuracy was 91.67%. These experimental results show that infrared thermography, combined with the SVM algorithm, can be applied to predict seed viability.


Introduction
Seed viability assessment is a key component of agricultural production and commercialization. Seed viability can be affected by several factors, including overheating, physical damage and natural aging. Seed users need assurance of high seed productivity in agricultural production, while seed companies must guarantee high seed viability to remain commercially competitive. Hence, both seed users and seed supply companies need to invest in seed viability testing and classification technologies.
Many conventional methods have been proposed for seed viability assessment, including the standard germination test, electrical conductivity test, seedling growth test, accelerated aging test and triphenyltetrazolium chloride (TTC) quantitative analysis [1,2]. However, these methods still have several shortcomings, such as invasiveness, heavy workload, long test periods, low accuracy and obvious subjective effects [3]. Therefore, fast and nondestructive diagnosis methods are urgently required in seed viability assessment. In view of this, several optical techniques, such as infrared thermography, Fourier transform infrared, Fourier transform near-infrared, bio-speckle, nuclear magnetic resonance, ultraviolet-visible and Raman spectroscopy and hyperspectral imaging, have been developed to estimate seed viability [4,5,6,7,8,9,10].

Plant Materials
Pisum sativum L. seeds were selected as the experimental samples and submitted to standard germination tests for quality control. A total of 120 seeds were divided into two groups, class A (80 seeds) and class B (40 seeds), for different treatments. Class A seeds were stored in a refrigerator at 5 °C for three days, while class B seeds were heated at 100 °C (±1 °C) for three days in a draught drying cabinet. The root length of each individual seed was then measured with a Vernier caliper on the fifth day after imbibition to quantify seed viability.
Polycarbonate plates (cryogenic vial holders with holes) were used in place of Petri dishes in the germination experiment. The polycarbonate plate was placed in a water bath and covered with filter paper. Every seed was placed above a well of the polycarbonate plate, with a small, well-hydrated dip formed under each seed. Ambient temperature, including both air and water temperature, was kept constant at 24 °C with minimal convection to reduce the environmental impact on the seeds. The infrared thermography system, including a light source, an infrared thermal camera, a charge-coupled device (CCD) camera and the water bath, was placed in the constant temperature incubator. The visible and thermal images, captured by the CCD and the infrared camera respectively, are shown in Figure 1. During the standard germination test, the incubator was held constant at 24 °C (±0.4 °C) and the seeds were continuously exposed to light until the end of the experiment.
Figure 2 shows a schematic of the infrared thermography system used to capture and analyze the thermal and visible images of the seeds. This system consists of an infrared thermal camera, a digital color CCD camera, a directional light source, a constant temperature incubator, a thermostatic water bath, and a host computer. Thermal images with a resolution of 320 × 240 pixels were recorded by the Ti55 infrared thermal camera (Fluke, Everett, WA, USA) with a sensitivity of 0.02 °C and preliminarily processed with the Smartview software (Fluke Systems). Visible images with a resolution of 900 × 600 pixels acquired from the CCD were digitized to 8-bit (256 grey levels) data and stored. Both thermal and visible images were stored every 5 min over five days and could be exported as individual images or as a series of images in time sequence. Afterwards, these thermal and visible images were post-processed with MATLAB (MathWorks, Natick, MA, USA).
As shown in Figure 3, owing to water uptake and respiration, heat convection between the seeds and their surroundings (both air and water) persists throughout the germination period until the end of the experiment. Because of this convection, the edges of the seeds in the thermal profiles merge into the background, whereas the complete edges can still be detected in the visible images. With image fusion, the temperature information of the seed areas delimited by the shapes in the visible images can be acquired to plot the temperature curves of seeds in the different viability categories during germination. Given the characteristics of the visible images, we propose an image processing flow to acquire each individual seed area in the images.
As a pre-processing approach, background subtraction can eliminate any non-uniform brightness areas and reflection points in the visible images. The edges of individual seeds were detected by the edge extraction method, and a region growing method was introduced to form the segmented regions. A disk whose area was approximately one-third of the seed area was placed in the center of each seed and defined as the seed area.
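As a minimal sketch of the last step, the radius of a central disk covering about one-third of a segmented seed can be derived from the pixel count of the region. The function names, the use of the region centroid as the seed center, and the toy square region are assumptions made for illustration; they are not taken from the original MATLAB processing.

```python
import math


def disk_radius(seed_area_px, fraction=1.0 / 3.0):
    """Radius (in pixels) of a disk covering `fraction` of the seed area."""
    return math.sqrt(fraction * seed_area_px / math.pi)


def centroid(pixels):
    """Mean (row, col) position of a list of (row, col) seed pixels."""
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def disk_pixels(pixels, fraction=1.0 / 3.0):
    """Seed pixels that fall inside the central disk (assumed centred on the centroid)."""
    cr, cc = centroid(pixels)
    radius = disk_radius(len(pixels), fraction)
    return [(r, c) for r, c in pixels
            if (r - cr) ** 2 + (c - cc) ** 2 <= radius ** 2]


# Toy segmented region: a filled 21 x 21 square standing in for a real seed.
seed = [(r, c) for r in range(21) for c in range(21)]
core = disk_pixels(seed)
print(len(core) / len(seed))  # close to one-third of the region
```

On a pixel grid the disk can only approximate the target area, so the covered fraction is near, not exactly, one-third.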

Image Analysis
Thermal images obtained during the experiment were displayed as pseudo-color images and were transformed into greyscale images in which the grey value records the temperature. The maximum (255) and minimum (0) grey levels corresponded respectively to the maximum (27 °C) and minimum (21 °C) temperature values preset before the experiment.
Image fusion technology was adopted to extract the seed regions in the thermal images using the disk areas in the visible images. The temperature data of each individual seed were then obtained through multiplication of the thermal images and the visible images. The average over all pixels of the seed region was calculated as the resulting temperature value T, which can be expressed as Equation (1):
$$T = \frac{1}{n}\sum_{i=1}^{n}\left[\frac{I_i}{255}\,(T_{max} - T_{min}) + T_{min}\right] \tag{1}$$
where $I_i$ is the grey value of the i-th pixel in the thermal image and n is the total number of pixels in the seed region. $T_{max}$ and $T_{min}$ are the preset maximum and minimum temperature values, and 255 and 0 are the maximum and minimum grey values in the thermal image.
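The grey-level-to-temperature conversion and region averaging described above can be sketched as follows, assuming a linear mapping between grey level 0 (21 °C) and grey level 255 (27 °C). The function names are hypothetical, and the seed region is represented simply as a list of grey values.

```python
def grey_to_temperature(grey, t_min=21.0, t_max=27.0):
    """Map an 8-bit grey value (0..255) linearly onto [t_min, t_max] in deg C."""
    return t_min + (grey / 255.0) * (t_max - t_min)


def mean_region_temperature(grey_values, t_min=21.0, t_max=27.0):
    """Average temperature over the pixels of one seed region."""
    temps = [grey_to_temperature(g, t_min, t_max) for g in grey_values]
    return sum(temps) / len(temps)


# Toy region of four pixels (grey values are illustrative, not measured data).
print(mean_region_temperature([120, 125, 130, 128]))
```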
To correct the temperature value, the temperature of the filter paper area around each individual seed area was defined as the environmental temperature of the seed. This ambient area temperature was then introduced into the calculation. Similar to the temperature of the seed area, the environment temperature can be calculated with Equation (1). Define rT as the temperature difference between the seed area and the environment area, as expressed in Equation (2):
$$rT = T_{seed} - T_{env} \tag{2}$$
where $T_{seed}$ and $T_{env}$ denote the temperatures of the seed area and the environment area, respectively. A total of 1440 thermal images were used to describe the temperature variation of each individual seed during the experiment. This variation was analyzed in MATLAB to obtain the temperature curve of each individual seed.
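A short sketch of the relative temperature curve follows, assuming the seed and environment temperatures have already been computed frame by frame. The function names and the toy values are hypothetical; the 5 min frame interval and the 1440 frames come from the text.

```python
FRAME_INTERVAL_MIN = 5  # one thermal image every five minutes


def relative_temperature(seed_temps, env_temps):
    """rT per frame: seed temperature minus surrounding filter-paper temperature."""
    if len(seed_temps) != len(env_temps):
        raise ValueError("seed and environment series must have equal length")
    return [s - e for s, e in zip(seed_temps, env_temps)]


def frame_times_hours(n_frames, interval_min=FRAME_INTERVAL_MIN):
    """Acquisition time of each frame, in hours since the start of imbibition."""
    return [i * interval_min / 60.0 for i in range(n_frames)]


times = frame_times_hours(1440)  # 1440 frames span the five-day test
rt = relative_temperature([24.5, 23.0], [24.0, 24.0])  # toy values
print(times[-1], rt)
```

A positive rT indicates a seed warmer than its surroundings and a negative rT a cooler one.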
The temperature-time curve of each seed was analyzed to extract its characteristic parameters. These parameters were then compared using the Least Significant Difference (LSD) multiple comparison method. In the LSD method, the ratio of the absolute difference between two variables, $|x_i - x_j|$, to the standard error of the mean difference is calculated. The standard error of the mean difference was calculated using Equation (3):
$$S_{x_i - x_j} = \sqrt{\frac{2\,MS_e}{n}} \tag{3}$$
where $MS_e$ is the mean square error in the F test and n is the number of observations per group. The ratio is compared with the critical value of the two-tailed t-test (α = 0.05). Equivalently, the threshold $LSD_\alpha$ can be written as Equation (4):
$$LSD_\alpha = t_{\alpha(df_e)}\,S_{x_i - x_j} \tag{4}$$
where $df_e$ is the error degrees of freedom. The variables $x_i$ and $x_j$ differ significantly at the α level if $|x_i - x_j|$ is higher than $LSD_\alpha$; otherwise, the difference between $x_i$ and $x_j$ is not significant.
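The LSD comparison can be sketched as below. This assumes equal group sizes, takes n as the number of observations per group, and expects the caller to supply the two-tailed critical t value from a table or a statistics package. The function names and the numbers in the example are illustrative only.

```python
import math


def lsd_threshold(ms_error, n_per_group, t_critical):
    """Least significant difference: t critical value times the SE of the mean difference."""
    se_diff = math.sqrt(2.0 * ms_error / n_per_group)
    return t_critical * se_diff


def differs_significantly(mean_i, mean_j, ms_error, n_per_group, t_critical):
    """True if the absolute mean difference exceeds the LSD threshold."""
    return abs(mean_i - mean_j) > lsd_threshold(ms_error, n_per_group, t_critical)


# Illustrative numbers only: MS_e = 0.5, 4 observations per group, t = 2.0.
print(lsd_threshold(0.5, 4, 2.0))
print(differs_significantly(1.5, 0.2, 0.5, 4, 2.0))
```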

Classification Algorithm
The support vector machine (SVM) was first introduced by Cortes and Vapnik as a universal feed-forward classification algorithm. The SVM approach is a supervised method based on statistical learning theory and the structural risk minimization principle, and can be used to analyze data and recognize images. The strategy of SVM classification is to find an optimal separating hyperplane between classes by focusing on the training samples located at the edge of the class distributions. The main characteristics of SVM are as follows: (1) SVM generalizes well in high-dimensional spaces with only a small number of training samples; (2) SVM gives the optimum result by transforming the problem into a quadratic programming problem; (3) SVM can model nonlinear functional relationships.
A brief description of SVM classification is given below. In a binary classification problem, the aim is to develop a classifier that generalizes accurately in predicting the class membership $y_i \in \{-1, +1\}$ from m-dimensional input data represented by a vector $X = \{x_1, x_2, \ldots, x_m\}$. In the case of seed viability assessment based on infrared thermography, m is the number of characteristic parameters of the temperature variation curve during germination. Before prediction, the classifier must be trained on a data set containing the characteristic parameters of n experimental samples of known class.
The core of the SVM algorithm is to search for the optimal hyperplane that separates different classes. This hyperplane can be described by Equation (5):
$$w \cdot X + b = 0 \tag{5}$$
where w is the normal vector of the hyperplane and b is the offset. During the training process, SVM tries to find the hyperplane that not only maximizes the shortest distance from this hyperplane to the closest training sample of each class (the class $y_i = +1$ and the class $y_i = -1$), but also minimizes the classification error. The support vectors of the two classes lie on two hyperplanes that are parallel to the optimal hyperplane. The distance between these two planes is defined as the margin associated with the separating hyperplane. The optimization of this margin can be converted into a constrained quadratic optimization problem, Equation (6):
$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\xi_i \quad \text{subject to} \quad y_i\,(w \cdot X_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0 \tag{6}$$
where $\xi_i$ represents the classification error, i.e., the distance between the misclassified sample i and the corresponding margin hyperplane, and C is the regularization meta-parameter that controls the trade-off between the two conflicting objectives, i.e., margin maximization and error minimization.
When C is small, margin maximization is emphasized; whereas when C is large, error minimization is predominant. According to the Lagrangian dual formulation, the optimal hyperplane can be expressed as a linear combination of the training observations, Equation (7):
$$w = \sum_{i=1}^{n}\alpha_i\,y_i\,X_i \tag{7}$$
where $\alpha_i$ is a Lagrange multiplier that corresponds to a coefficient associated with each object. The magnitude of $\alpha_i$ is related to the parameter C and varies between 0 and C.
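To illustrate the trade-off controlled by C, the following sketch evaluates the standard soft-margin objective (half the squared norm of w plus C times the sum of the slack values) for a fixed hyperplane. The hyperplane, the samples and the function names are toy assumptions; no optimization is performed here.

```python
def decision_value(w, b, x):
    """Signed score w . x + b of sample x with respect to the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def slack(w, b, x, y):
    """Slack of one sample: zero when it lies on the correct side of its margin."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))


def soft_margin_objective(w, b, samples, labels, c):
    """0.5 * ||w||^2 (margin term) plus C times the summed slack (error term)."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    error_term = c * sum(slack(w, b, x, y) for x, y in zip(samples, labels))
    return margin_term + error_term


# Toy data: the third sample lies on the wrong side of the hyperplane.
w, b = [1.0, 0.0], 0.0
samples = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
labels = [1, -1, -1]
print(soft_margin_objective(w, b, samples, labels, c=2.0))
print(soft_margin_objective(w, b, samples, labels, c=0.1))
```

With the larger C, the single misclassified sample dominates the objective; with the smaller C, the margin term dominates.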
For nonlinear classification problems, the input data are mapped into a high-dimensional space through a mapping function, after which the data can be separated with a linear SVM. In the dual representation, a kernel, i.e., the inner product of the two mapped vectors $\phi(X_1)$ and $\phi(X_2)$, is used in place of the explicit mapping. In this study, the radial basis function (RBF) kernel is used because it performs well in representing almost all boundary shapes. The RBF kernel function is given by Equation (8):
$$K(X_1, X_2) = \phi(X_1)\cdot\phi(X_2) = \exp\left(-\frac{\lVert X_1 - X_2\rVert^2}{2\sigma^2}\right) \tag{8}$$
where $\phi(X_1)$ and $\phi(X_2)$ are the mapping functions of the objects $X_1$ and $X_2$ respectively, and σ is the kernel parameter determined by the kernel width meta-parameter. In conclusion, the two meta-parameters, the regularization parameter C and the kernel parameter G, need to be selected properly, as they determine the boundary complexity and the observed classification rate.
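A minimal sketch of an RBF kernel evaluation is given below, assuming the common parameterization exp(−‖X1 − X2‖² / (2σ²)). The function name and the example vectors are hypothetical, and some SVM packages express the width through a different parameter.

```python
import math


def rbf_kernel(x1, x2, sigma):
    """Gaussian RBF similarity between two feature vectors (assumed 2*sigma^2 scaling)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


print(rbf_kernel([0.0, 0.0], [0.0, 0.0], sigma=1.0))  # identical vectors
print(rbf_kernel([0.0, 0.0], [1.0, 1.0], sigma=1.0))  # decays with distance
```

The kernel equals 1 for identical vectors and decreases towards 0 as the vectors move apart, with σ setting how quickly this happens.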
The multi-class classifier is built by combining individual classifiers that differ from one another. In this work, a multi-class classifier y (viable seeds, aged seeds, non-viable seeds) consisted of three two-class classifiers: y1 (viable seeds, aged seeds), y2 (viable seeds, non-viable seeds) and y3 (aged seeds, non-viable seeds). To combine these classifiers, a weighted voting algorithm is adopted. The resultant class is the class voted for by the majority of the classifiers. Only samples from the two classes of each individual classifier are used for its training.
To be specific, each classifier was regarded as a voter with a weight value in producing the classification result. The performance differences between the base classifiers were taken into account by assigning a weight value to each base classifier according to Equation (9), where α i is the weight value of the i-th base classifier and p i represents the average accuracy of the i-th base classifier on the training set.
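The voting step can be sketched as follows. The weighting rule used here (training accuracy normalised to sum to one) is an assumption chosen only for illustration and is not the weight formula of Equation (9). The function names and the accuracy values are likewise hypothetical.

```python
def normalised_weights(accuracies):
    """Hypothetical weighting: each classifier's accuracy divided by the total."""
    total = sum(accuracies)
    return [p / total for p in accuracies]


def weighted_vote(predictions, weights):
    """Return the class label with the largest summed weight."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get)


# One vote per pairwise classifier:
# y1 (viable vs aged), y2 (viable vs non-viable), y3 (aged vs non-viable).
weights = normalised_weights([0.90, 0.95, 0.85])  # made-up training accuracies
print(weighted_vote(["aged", "viable", "aged"], weights))
```

When each pairwise classifier votes for a different class, the weights break the tie in favour of the most accurate classifier.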

Viability Test
After the standard germination test, seed viability was defined as the percentage of germinated seeds, and the seeds were classified by root length. Over the five days of the germination experiment, the germination rate of class A was 91.25%, whereas the class B seeds did not germinate. The seeds (120 in total) were classified into three viability categories according to root length (Table 1). Class A seeds were divided into three viability types, namely viable seeds (viability type 2), aged seeds (viability type 1) and non-viable seeds (viability type 0); the classification standards are explained in Table 1.
Figure 4 shows the temperature variations of the seeds in the different categories. The three temperature curves were calculated as the averages over viable seeds A2 (green), aged seeds A1 (yellow), and non-viable seeds A0 and B0 (red). In these curves, the relative seed temperature (rT), i.e., the difference between each individual seed temperature and the environment temperature, was used to describe the temperature variations over the whole five-day experiment. The heat variation, i.e., the warming or cooling of the seed, is revealed by the positive or negative sign of the rT value.
Based on the differences between the three categories of seeds, the curve of an individual seed is illustrated in Figure 5, where several characteristic parameters are marked. Here, rTmax represents the maximum temperature value; trTmax represents the time at which the maximum temperature value is reached; rTdrop represents the temperature value at the beginning of the sharp decline; trTdrop represents the time when the sharp decline begins; rTmin represents the minimum temperature value; trTmin represents the time at which the minimum temperature value is reached; rT0h, rT20h, rT40h, rT60h, rT80h, rT100h and rT120h represent the temperature values at 0 h, 20 h, 40 h, 60 h, 80 h, 100 h and 120 h, respectively. All these characteristic parameters of the three categories of seeds were statistically calculated and are listed in Table 2.
As shown in Figure 4, in viable type A2 seeds (green), rT first shows a gentle dip during the first hour and then drops sharply over the next two hours until it reaches the inflection point, after which it rises. As a result, the maximum temperature value (rTmax) occurs at the beginning and the minimum temperature value (rTmin) appears at the inflection point. By contrast, in aged type A1 seeds, rT first shows a small peak and reaches the maximum temperature value (rTmax). The sharp decline in this curve is delayed by nearly half an hour relative to that of viable seeds. In non-viable seeds (types A0 and B0), the sharp decline appears at the very beginning of the curve, and the minimum temperature value (rTmin) is reached earlier than in the viable and aged types.
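The extraction of the thirteen characteristic parameters from one rT curve can be sketched as follows. The rule used to locate the start of the sharp decline (the first interval whose slope falls to or below a threshold, `drop_rate`) is a hypothetical heuristic, since the text does not state how this point was detected. The function name and the toy curve are illustrative as well.

```python
def extract_features(times_h, rt, drop_rate=-0.5):
    """Return the 13 curve parameters; drop_rate (deg C per hour) is an assumed threshold."""
    n = len(rt)
    i_max = max(range(n), key=lambda i: rt[i])
    i_min = min(range(n), key=lambda i: rt[i])

    # Start of the sharp decline: first sample followed by a steep enough fall.
    i_drop = None
    for i in range(n - 1):
        slope = (rt[i + 1] - rt[i]) / (times_h[i + 1] - times_h[i])
        if slope <= drop_rate:
            i_drop = i
            break

    features = {
        "rTmax": rt[i_max], "trTmax": times_h[i_max],
        "rTdrop": None if i_drop is None else rt[i_drop],
        "trTdrop": None if i_drop is None else times_h[i_drop],
        "rTmin": rt[i_min], "trTmin": times_h[i_min],
    }
    # rT sampled at 0, 20, ..., 120 h, using the frame closest to each time.
    for hour in range(0, 121, 20):
        nearest = min(range(n), key=lambda i: abs(times_h[i] - hour))
        features[f"rT{hour}h"] = rt[nearest]
    return features


# Toy curve with irregular sampling; the values are illustrative, not measured.
times = [0, 1, 2, 3, 20, 40, 60, 80, 100, 120]
rt = [0.2, 0.1, -0.8, -1.0, -0.5, -0.3, -0.2, -0.1, 0.0, 0.1]
features = extract_features(times, rt)
print(features)
```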

Temperature Variation
These differences in the temperature curves appear mainly in the imbibition period, and can be explained by the biophysical and biochemical changes occurring during the germination test. The imbibition period is affected by the membrane permeability. Compared with viable seeds, the membrane permeability of aged seeds had changed and water absorption during the water uptake period became slower, which delayed the sharp decline in aged seeds. Although the non-viable seeds had lost their viability, they still had a certain water absorbing capacity [22,26]. Because the different categories of seeds differ in their internal metabolic activity, they showed distinct differences in their behavior. To be specific, the non-viable seeds had the fastest temperature rise in this process, followed by the aged seeds, while the viable seeds had the slowest.
In summary, considerable cooling was observed for all three categories of seeds during the experiments. To exclude the cooling produced by evaporation, the relative seed temperature was used to evaluate the cooling. In this situation, the decline in temperature was caused by the biophysical and biochemical changes in the seeds rather than by evaporation. Multiple comparisons were used to analyze the characteristic parameters in Table 2, and all the parameters that differ significantly among the three categories of seeds are shown in Table 3.
As shown in Figure 6, the temperature variations of all three categories of seeds mainly occur in the first three hours. The parameters rTdrop, rTmin and rT0h can be acquired from these temperature data, so the temperature data of the first three hours, when the seeds can still be redried and stored again, were used for seed viability assessment. Hence, an SVM model can be developed to assess seed viability from the temperature data in this time period.

Classification Model
According to the analysis, the above 13 characteristic parameters show significant differences. These parameters were used as the input of the SVM model to explore the classification result obtained with the temperature data of the whole germination period. Both the regularization parameter and the kernel parameter of this SVM model were set to 2. Five-fold cross-validation was used in the training of this model, and the cross-validation results of the SVM model with the whole germination temperature data are shown in Table 4.
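Five-fold cross-validation over the 120 seeds can be sketched as below. The shuffling, the fixed random seed and the function name are assumptions; the text does not state how the folds were formed or whether they were stratified by class.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k folds of roughly equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # every k-th index per fold
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(120, k=5))
for train, test in splits:
    print(len(train), len(test))  # 96 training and 24 test seeds per fold
```

Each seed appears in exactly one test fold, so every sample is used once for validation and four times for training.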
Based on the analysis and statistics of Table 4, the classification results of the SVM model with the whole germination temperature data are shown in Table 5. As seen in these data, the classification accuracy for all types of seeds is 95%. Moreover, the classification accuracy of each type of seeds can be more than 90%. This result indicates that the infrared thermography technique can be used in the viability assessment of pea seeds.
Although the above model demonstrates the accuracy and effectiveness of seed viability assessment with infrared thermography, we also explored the possibility of classifying the different types with less temperature data, especially with the temperature data recorded before the seeds germinate. According to the results in Table 3 and the subsequent analysis, the first three hours of temperature data were selected as the SVM input. These data were represented by the temperature values obtained every ten minutes in the first three hours. Both the regularization parameter and the kernel parameter of this SVM model were set to 2. Five-fold cross-validation was used in the training of this model, and the cross-validation results of the SVM model with the first three hours of temperature data are shown in Table 6.
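Selecting the early-window input from a curve recorded every five minutes can be sketched as follows. Including both the 0 min and the 180 min samples (19 values in total) is an assumption, as the text does not specify the endpoints. The function name is hypothetical.

```python
def early_window(rt_curve, frame_interval_min=5, step_min=10, window_min=180):
    """Keep one sample every step_min minutes from 0 to window_min inclusive."""
    stride = step_min // frame_interval_min      # frames between kept samples
    last_frame = window_min // frame_interval_min
    return rt_curve[0:last_frame + 1:stride]


# Stand-in curve: the value at each position is simply its frame index.
curve = list(range(1440))
early = early_window(curve)
print(len(early), early[:3], early[-1])
```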
Based on the analysis and statistics of Table 6, the classification results of the SVM model with the first three hours of temperature data are shown in Table 7. As shown in Table 7, the classification accuracy for all types of seeds is 91.67%.
Comparing the two SVM models, the overall accuracy of the model with the first three hours of temperature data is 91.67%, slightly lower than that of the model with the whole germination temperature data (95%). With the whole germination temperature data, the accuracy rate for the viable seeds is 97.56%; that for the aged seeds is 75%, which is the lowest; that for the non-viable seeds is up to 97.87%, which is the highest. By contrast, with the first three hours of temperature data, the accuracy rate for the viable seeds is 92.68%; that for the aged seeds is 90.63% and that for the non-viable seeds is up to 100%. From this comparison, it can be seen that the accuracy rates for viable seeds and non-viable seeds show no significant differences between the two classification models, but the overall accuracy rate of the SVM model with the whole germination temperature data is higher than that of the SVM model with the first three hours of temperature data.
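Per-class and overall accuracy rates of the kind compared above can be computed from a confusion matrix as sketched below. The counts are made up purely for illustration and are not the values of Tables 4 to 7. The function names are hypothetical.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class that was predicted correctly."""
    return {label: row[label] / sum(row.values())
            for label, row in confusion.items()}


def overall_accuracy(confusion):
    """Correct predictions divided by all predictions."""
    correct = sum(row[label] for label, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total


# confusion[true_label][predicted_label] = count (made-up toy counts).
confusion = {
    "viable": {"viable": 9, "aged": 1, "non-viable": 0},
    "aged": {"viable": 2, "aged": 7, "non-viable": 1},
    "non-viable": {"viable": 0, "aged": 0, "non-viable": 10},
}
print(per_class_accuracy(confusion))
print(overall_accuracy(confusion))
```

Because the overall rate weights each class by its sample count, a model can keep a high overall accuracy while one class, typically the intermediate aged class, is recognized less reliably.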
From the analysis and comparison of these two SVM models, it can be concluded that the infrared thermography technique can be used to predict the viability categories of the seeds within the first three hours, when the seeds can still be redried and stored again. The method proposed in this work can therefore be applied to seed viability assessment, with the advantages of being fast and nondestructive.

Conclusions
In this study, the infrared thermography technique was proposed as a viability assessment method for pea seeds based on their temperature variations. The temperature on the surfaces of the experimental samples was measured in a non-destructive and non-contact way by this technique, and seeds of different viability categories were obtained by artificial aging. The thermal profiles of the seeds were recorded during the germination experiment, and the temperature curve of each individual seed was plotted by means of image processing and image fusion. Finally, the SVM model, as a multi-class classification method, was used to classify the seeds into viable, aged and non-viable types according to root length.
The results showed that the parameters used for characterizing the temperature variations of the seeds differ significantly depending on seed viability. With these parameters, SVM was used to classify the seed samples into three categories, and the classification accuracy rate was 95%. On this basis, another SVM model was proposed to predict seed viability within the first three hours, when the seeds can still be redried and stored again, and it resulted in an overall accuracy rate of 91.67%.
Our work indicates that the infrared thermography technique can be used as a fast, non-invasive method in seed viability assessment, and has great potential in the viability assessment for various agricultural specimens. In the future, more data from extensive experiments are required to illuminate the relationship between the temperature variations and the specific biophysical and biochemical activities.