Algorithm of Strawberry Disease Recognition Based on Deep Convolutional Neural Network

The growth of strawberries is stressed by biotic and abiotic factors, among which the various strawberry diseases pose a great threat to yield and quality. Traditional identification methods, which rely mainly on personal experience and naked-eye observation, have a high misjudgment rate and poor real-time performance, and with the demand for strawberry yield and quality increasing they can no longer meet the need for strawberry disease identification and control. Therefore, it is necessary to find a more effective method that identifies strawberry diseases efficiently and provides the corresponding disease descriptions and control methods. In this paper, the recognition of common strawberry diseases is studied on the basis of deep convolutional neural network (DCNN) technology, and a new DCNN-based strawberry disease recognition algorithm is proposed. Feature representations of strawberry images in different scenes are first learned through normal training; transfer learning is then applied to add strawberry disease image features to the training set; finally, the features are classified and recognized to achieve disease recognition. Moreover, an attention mechanism and a center loss function are introduced into the classical convolutional neural network to address the loss of information from key feature areas, which degrades the classification performance of existing convolutional neural network classifiers, and thereby further improve the accuracy of convolutional neural networks in image classification.


Introduction
Strawberry is a popular fruit nicknamed the "Queen of Fruits." It is favoured by most people because of its rich nutritional value, sweet taste, juiciness, and affordable price. As material needs and quality of life gradually increase, more and more people choose to eat strawberries, demand for strawberries is rising, and the scale of planting expands further as consumption grows. At the same time, as a fruit with a short history of artificial hybridization, strawberries have not adapted well to the natural environment [1]. First, the strawberry is a naked fruit whose skin provides no protection; second, the stems, leaves, and fruits of the strawberry are close to the ground, and its rich water and sugar content easily attracts insects; finally, strawberries are prone to disease because of climate changes during planting [2]. There are many types of strawberry diseases and insect pests; if they break out at a strawberry planting base, output drops sharply and economic benefits are reduced. To take correct and effective countermeasures against strawberry pests and diseases, they must be identified quickly and accurately [3].
Traditional plant disease identification methods are mainly based on image processing technology. Although certain results have been achieved, these methods extract specific image characteristics of leaves and use traditional classification algorithms. The image preprocessing is complicated and has poor anti-interference ability [4]. The robustness is poor, the processing speed is slow, and there is not enough support for mobile devices with limited computing and storage resources. The feature extraction methods are not universal, which makes the generalization ability of the overall method poor. On the other hand, artificial intelligence technology represented by deep learning has developed rapidly in recent years; in particular, research on convolutional neural networks in the field of computer vision has made major breakthroughs [5]. Within computer vision, image classification refers to taking an image as input and determining its category. Many scholars have carried out image classification research based on convolutional neural networks and achieved very good performance [6]. However, most existing image classification algorithms based on convolutional neural networks only extract global image features and do not pay attention to the important regions of the image. In addition, most of them lose a large amount of information in the process of image feature extraction. At present, convolutional neural network technology has made great progress in image classification and target recognition, and it is widely used in the recognition of plant diseases. Li Yan proposed a deep convolutional neural network algorithm based on Fisher's criterion, which recognized 4 potato diseases with a recognition accuracy of 87.04%. Liu Tianfu et al. proposed a grape leaf detection algorithm based on a convolutional neural network, which has a detection rate of 87.2% against a complex background. Sedan et al. used convolutional neural networks to recognize 13 kinds of plant diseases. The model can also distinguish plant leaves from their surrounding environment [7]. The correct recognition rate of the final model was 96.3%. Mohanty et al. trained AlexNet and GoogLeNet models to classify and recognize images of 14 plants, 26 diseases, and some healthy plants in the PlantVillage data set. The recognition accuracy reached 97.82% and 99.35%, respectively [8].
The above-mentioned plant disease recognition methods based on convolutional neural networks do not need to segment images, recognize more disease types, and generalize well. They have obvious advantages over plant disease recognition methods based on traditional image processing technology.
While major breakthroughs have been made in the theoretical research of image classification based on convolutional neural networks, many researchers have also explored its use in practical applications. It is therefore worthwhile to further improve the classification ability of image classification algorithms based on convolutional neural networks and apply them to the identification of strawberry diseases and insect pests, so as to realize automatic, accurate, fast, and real-time recognition and to help fruit farmers understand and study the occurrence patterns of strawberry diseases and insect pests. Prevention and control measures for strawberry diseases and insect pests play a key role in reducing the economic losses they cause, achieving high-quality and high-yield strawberries, and accelerating the development of strawberry planting. In the proposed recognition algorithm, the positioning network detects most diseased areas under the guidance of the feedback network, and the classification network identifies and classifies the diseased areas according to the proposed regions. The model combines a pretrained, fine-tuned image classification model of common strawberry diseases with a fine-grained, attention-based image classification model of common strawberry diseases. The algorithm has a high recognition rate and fast recognition speed, can overcome interference from the external environment to the greatest extent, and achieves rapid and accurate identification of target diseases.

Structure Design of Deep Convolutional Neural Network

Deep Convolutional Neural Network.
The convolutional neural network (CNN) is optimized by the BP algorithm. It is also the first successfully trained deep learning model with multiple hidden layers, and it is one of the representative network structures of deep learning technology. The framework of the convolutional neural network was designed to minimize the preprocessing required on the data [9]. Weight sharing reduces the number of training parameters in the network; the pooling operation reduces the number of neurons in the network; and local receptive fields mean that neurons are no longer fully connected, which reduces the number of neuron connections. These three characteristics reduce the complexity of the CNN model, simplify training, and give the features extracted by each layer a degree of translation invariance [10]. A common convolutional neural network is generally composed of an input layer, convolutional layers, activation layers, pooling layers, and fully connected layers. The common CNN structure is shown in Figure 1. The lower hidden layers of the convolutional neural network consist of alternating convolutional and pooling layers, and the upper layers consist of fully connected hidden layers and a logistic regression classifier.
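The effect of weight sharing on the parameter count can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, and the layer sizes are borrowed from the first convolutional layer reported later in this section (a 148 × 148 × 3 input, 64 templates of 9 × 9 × 3, and 35 × 35 output maps). It compares the shared-weight layer with a hypothetical fully connected layer producing the same number of outputs.

```python
def conv_params(kernel_h, kernel_w, in_channels, num_kernels):
    """Weights plus one bias per kernel; the count does not depend on image size."""
    return num_kernels * (kernel_h * kernel_w * in_channels + 1)


def dense_params(num_inputs, num_outputs):
    """Every output neuron is connected to every input, plus one bias each."""
    return num_outputs * (num_inputs + 1)


# Illustrative comparison for one layer with the same input and output sizes.
shared = conv_params(9, 9, 3, 64)
full = dense_params(148 * 148 * 3, 35 * 35 * 64)
print("convolutional layer parameters:", shared)
print("fully connected layer parameters:", full)
print("ratio:", full // shared)
```

The convolutional count depends only on the template size and number, which is why the same templates can be reused at every image position.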
This article uses two methods: designing the model from scratch and fine-tuning an existing model. First, a comparative experiment is conducted on colour images to observe which method has the higher classification accuracy on the colour test set, and the method with the higher accuracy is selected as the final training method. After multiple experiments on the training set and test set, the structure of the deep convolutional neural network designed for the object database is shown in Figure 2. The comparative experimental results show that fine-tuning the classic network performs better than the several convolutional neural networks designed by ourselves [11]. The reason may be that the database is not large enough to fit a multi-layer convolutional neural network model, or that settings such as the number of convolutional layers and the number of neurons in each layer were not adjusted properly during training and do not match the statistical characteristics of the target database [12].

Image Processing and Convolutional Operation.
One advantage of using deep convolutional neural networks to extract features from images is that low-level image processing operations on the original pictures are not needed, and the original pictures can be input directly to the network model. However, because of the limited computing power of computers, it is not yet possible to rely entirely on more training data to train large models. Instead, some prior knowledge has to be added to the training data, which is also called data augmentation. The main method is to randomly select several large picture blocks from the original pictures, flip these blocks left and right, and use the resulting blocks as part of the training data [13]. The advantage of this is that it reduces overfitting. When training this model from scratch, image processing only includes uniformly transforming the resolution of the images to 148 × 148 × 3 and then dividing the images into a training set and a test set. The test set covers all 51 types of objects: for each type, all pictures of one randomly chosen object are taken as test data, all remaining pictures are used as the training set, and the average of the training set is computed. When training and testing the model, the average of the training set is subtracted from the input data to reduce the similarity between the data and improve the training speed [14]. The convolution operation has been introduced in the previous section. In the experiment of designing the model from scratch, 64 convolution templates with a size of 9 × 9 × 3 are first randomly initialized, and the input image is then convolved with a step size of 4 pixels.
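The convolution step just described fixes the size of the resulting feature maps. As a minimal sketch (single channel, pure Python, no padding; the function names are illustrative and not taken from the paper), the number of positions a template can occupy along one axis is the floor of (input size minus template size) divided by the step size, plus one, and each output value is the sum of elementwise products over the covered block:

```python
def output_size(input_size, kernel_size, stride):
    """Number of valid kernel positions along one axis (no padding)."""
    return (input_size - kernel_size) // stride + 1


def convolve2d(image, kernel, stride):
    """Valid, strided sliding-window product sum (the CNN-style convolution)."""
    k = len(kernel)
    out_h = output_size(len(image), k, stride)
    out_w = output_size(len(image[0]), k, stride)
    result = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    total += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(total)
        result.append(row)
    return result


# A 148-pixel axis, a 9-pixel template, and a step of 4 pixels.
print(output_size(148, 9, 4))  # 35

# Small check: a 5 x 5 image of ones with a 3 x 3 kernel of ones and step 2.
ones_image = [[1.0] * 5 for _ in range(5)]
ones_kernel = [[1.0] * 3 for _ in range(3)]
print(convolve2d(ones_image, ones_kernel, 2))
```

For a colour image the same sum would also run over the three channels of the 9 × 9 × 3 block; that loop is omitted here to keep the sketch short.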
After the activation function, the 64 convolution templates produce 64 feature maps with a size of 35 × 35. Each feature map corresponds to one convolution template, and the value of each pixel on the feature map is the result of convolving the corresponding template with the 9 × 9 × 3 image block at the corresponding position. The height H and width D of the feature map satisfy $H = D = \lfloor (A - B)/L \rfloor + 1$, where A is the height or width of the input image, B is the height or width of the convolution template, and L is the step length; for A = 148, B = 9, and L = 4 this gives 35. The role of the activation function is to introduce non-linear factors to solve the problem of the insufficient expressive ability of linear functions. Different activation functions correspond to different neurons. This article selects linear threshold neurons, whose output is $f(x) = \max\left(0, \sum_{i=1}^{n} \sigma_i x_i + \tau\right)$, where n represents the number of input neurons, $x_i$ is the i-th input, $\sigma_i$ is the corresponding weight, and $\tau$ is the bias. This forces the output to 0 whenever the weighted input is not positive. The trained network therefore has a certain degree of sparsity, which reduces data redundancy and makes the extracted features more expressive [15]. This kind of neuron is often used in deep neural networks. Behind the pooling layer there is a normalization layer, which applies the local response normalization technique. The main purpose of this technique is to suppress large excitation outputs of the hidden layer and thereby improve the generalization ability of the model. The basic idea is to normalize a local input area, and there are two main forms: one normalizes the excitation across adjacent feature maps, and the other normalizes over the adjacent local area within the same feature map. Let $\varphi^{i}_{x,y}$ be the excitation at position (x, y) of the i-th feature map generated by the maximum pooling layer and $\delta^{i}_{x,y}$ the normalized response; $\delta^{i}_{x,y}$ is obtained by dividing $\varphi^{i}_{x,y}$ by a term that grows with the sum of the squared excitations in its neighbourhood.

The Process from Output Layer to Back-Propagation.
The output layer of the model uses the SoftMax classifier to output the probability distribution over the different prediction results. Since the object database has 51 types of objects, the SoftMax classifier has 51 outputs, which correspond to the prediction scores of a picture belonging to each category. Suppose there is a training set $(\alpha^{(1)}, \beta^{(1)}), (\alpha^{(2)}, \beta^{(2)}), \ldots, (\alpha^{(100)}, \beta^{(100)})$, where $\alpha^{(i)}$ represents the output feature vector of the fully connected layer and $\beta^{(i)}$ is the true label of the data, taking one of the values {1, 2, ..., 51}; 100 is the number of pictures contained in the input data for each training iteration. After forward propagation, that is, after the data is transmitted to the output layer, the output of the SoftMax classifier for a single training sample $\alpha$ assigns category j the probability $e^{\sigma_j^{T}\alpha} / \sum_{l=1}^{51} e^{\sigma_l^{T}\alpha}$, where $\sigma_j$ is the weight vector of the j-th output. When using the SoftMax classifier, the loss function generally adopts cross-entropy. The loss function of this model is the cross-entropy over the training samples plus a weight attenuation term. In it, the indicator $1\{\beta^{(i)} = j\}$ equals 1 when the i-th training sample belongs to the j-th category and 0 otherwise, and $50\lambda \sum_{i=1}^{51} \sum_{j=0}^{2048} \sigma_{i,j}^{2}$ is the weight attenuation term. Adding this term limits the size of the weights, reduces the complexity of the model, and ensures that the loss function is strictly convex, so that the global minimum can be obtained when solving; it also reduces overfitting. Here λ is the weight attenuation coefficient. When training this model, the value of λ has a relatively large impact on network performance. If λ is too large, the model tends to underfit; if λ is too small, the model tends to overfit. With the other parameters in the model held fixed, models trained with a series of different values of λ are compared to choose its value; the loss function is then evaluated, and the weight parameters in the model are updated through the back-propagation algorithm.
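The output layer and its loss can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the paper's implementation: labels are 0-based here (the text uses 1 to 51), the bias is omitted, the cross-entropy is averaged over the batch, and the attenuation term is written as `decay` times the sum of squared weights, so its scaling may differ from the paper's. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, feature):
    """Class probabilities for one feature vector; weights[j] scores class j."""
    scores = [sum(w * x for w, x in zip(row, feature)) for row in weights]
    return softmax(scores)


def loss(weights, features, labels, decay):
    """Mean cross-entropy plus an L2 weight attenuation term."""
    data_term = 0.0
    for feature, label in zip(features, labels):
        data_term -= math.log(predict(weights, feature)[label])
    penalty = decay * sum(w * w for row in weights for w in row)
    return data_term / len(features) + penalty


def sgd_step(weights, features, labels, decay, rate):
    """One gradient-descent update of every weight on the given batch."""
    grads = [[2.0 * decay * w for w in row] for row in weights]
    for feature, label in zip(features, labels):
        probs = predict(weights, feature)
        for j, row in enumerate(grads):
            error = probs[j] - (1.0 if j == label else 0.0)
            for d, x in enumerate(feature):
                row[d] += error * x / len(features)
    return [[w - rate * g for w, g in zip(row, grow)]
            for row, grow in zip(weights, grads)]


# Toy data: three classes, two-dimensional features.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_labels = [0, 1, 2]
toy_weights = [[0.0, 0.0] for _ in range(3)]
loss_before = loss(toy_weights, toy_features, toy_labels, decay=0.001)
for _ in range(50):
    toy_weights = sgd_step(toy_weights, toy_features, toy_labels, decay=0.001, rate=0.5)
loss_after = loss(toy_weights, toy_features, toy_labels, decay=0.001)
print(loss_before, loss_after)
```

Raising `decay` pulls the weights toward zero and flattens the predicted probabilities, which is the mechanism by which too large a coefficient leads to underfitting.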
The parameters in neural networks are usually learned with the stochastic gradient descent method. The goal is to learn the optimal weight parameters (σ, τ) that minimize the value of the loss function. The key to the parameter update is to find the partial derivative of the loss function with respect to each weight parameter. The parameters (σ, τ) are updated as $\sigma_{ij}^{(\ell)} \leftarrow \sigma_{ij}^{(\ell)} - \eta\, \partial J / \partial \sigma_{ij}^{(\ell)}$ and $\tau_{i}^{(\ell)} \leftarrow \tau_{i}^{(\ell)} - \eta\, \partial J / \partial \tau_{i}^{(\ell)}$, where J denotes the loss function, η is the learning rate, $\sigma_{ij}^{(\ell)}$ represents the weight parameter of the connection between the i-th neuron in layer ℓ + 1 and the j-th neuron in layer ℓ, and $\tau_{i}^{(\ell)}$ represents the bias term of the i-th neuron in layer ℓ + 1. The greater the learning rate, the faster the training, but this may trap the model in a local optimum; the smaller the learning rate, the slower the training, although the optimal solution can often be found. In actual training, the choice of learning rate should be a compromise between the two, and the partial derivatives with respect to (σ, τ) can be further transformed by the back-propagation algorithm.
This article takes ten kinds of pests and diseases as examples, mainly strawberry gray mold, strawberry powdery mildew, strawberry black spot, strawberry leather rot, strawberry anthracnose, and Phytophthora disease, and uses the proposed methods to classify and identify them. Examples of typical strawberry disease images are shown in Figure 3.

Data set Construction.
Because the strawberry pest data set is small and the sizes of the collected strawberry images are not consistent, the data set needs to be preprocessed. Image preprocessing includes data augmentation, adjusting the images to a uniform size, and so on. Deep neural networks with good generalization performance usually need a large training data set [16]. However, limited by the capacity for data collection and preprocessing, many training data sets are small, which leads to overfitting. To avoid this, data augmentation is usually used to expand the training data set. In computer vision problems, the commonly used data augmentation methods include flipping, cropping, and so on. Flipping mirrors the original image horizontally or vertically to form a new training image. Cropping cuts out a part of the original image to form a new training image. In the original data set, the sizes of different images may not be the same; even if all the images are the same size, that size may not match the input size required by the convolutional neural network. Therefore, before the original image is input into the convolutional neural network, its size needs to be adjusted [17].
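The augmentation and resizing operations named above are simple array manipulations. The following sketch is illustrative only: it represents an image as a list of rows of pixel values, the function names are hypothetical, and nearest-neighbour interpolation is an assumption made for brevity rather than the method used in the experiments.

```python
import random


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize to the fixed input size of the network."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


sample = [[1, 2], [3, 4]]
rng = random.Random(0)
augmented = [
    flip_horizontal(sample),
    flip_vertical(sample),
    resize_nearest(random_crop(sample, 1, 1, rng), 2, 2),
]
print(augmented)
```

Each original image yields several training images this way, which is how a small data set is expanded without collecting new pictures.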

Flow of Strawberry Disease Identification Algorithm.
This chapter takes greenhouse crops as the main research object and elaborates the pest identification algorithm. It mainly focuses on locating the fruit parts of crops and identifying six kinds of diseases and insect pests that occur on strawberry fruit [18]. For the first aspect, the Haar feature extraction algorithm from image processing is combined with the classic AdaBoost algorithm from machine learning to judge whether the video contains crop fruits; for the second, a convolutional neural network with a simple structure is used for training and classification. The process and steps of the specific crop disease and pest identification algorithm are shown in Figure 4.
As the flow chart shows, the experiment in this article is mainly divided into two parts: crop positioning and crop pest identification. For the first part, the acquired pictures of strawberry diseases and insect pests are screened to obtain training sample images, which are then sent to the neural network architecture used in this experiment for training to obtain the recognition network for strawberry diseases and insect pests. For the second part, the crop video is first obtained from the camera in Daylong, and the video frame difference method from image processing is used to convert the video into pictures; Gaussian filtering is then used to preprocess each obtained image and remove the influence of noise; next, it is judged whether the picture contains strawberries. If so, the image is saved and sent to the trained pest identification neural network for identification; otherwise, the image is discarded and the next cycle continues [19]. Figure 5 shows the deep network structure used for feature extraction in common strawberry disease recognition.
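The second part of the flow can be sketched as a small loop. This is a hedged illustration rather than the authors' code: frames are grey-level images stored as lists of rows, frame differencing is interpreted here as keeping only frames that differ enough from the previous one, the threshold value is arbitrary, and `contains_fruit` and `classify` are hypothetical callables standing in for the Haar/AdaBoost detector and the trained network.

```python
def frame_difference(previous, current):
    """Mean absolute grey-level difference between two equally sized frames."""
    total = sum(abs(a - b) for rp, rc in zip(previous, current) for a, b in zip(rp, rc))
    return total / (len(current) * len(current[0]))


def gaussian_blur(image):
    """Smooth with the 3x3 kernel [[1,2,1],[2,4,2],[1,2,1]]/16; borders are replicated."""
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            acc = 0
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    rr = min(max(r + i, 0), h - 1)
                    cc = min(max(c + j, 0), w - 1)
                    acc += kernel[i + 1][j + 1] * image[rr][cc]
            row.append(acc / 16)
        out.append(row)
    return out


def identify_stream(frames, contains_fruit, classify, change_threshold=5.0):
    """Yield one diagnosis for every changed frame that shows fruit."""
    previous = None
    for frame in frames:
        changed = previous is None or frame_difference(previous, frame) > change_threshold
        previous = frame
        if not changed:
            continue                     # no new picture extracted from the video
        picture = gaussian_blur(frame)   # suppress noise before detection
        if contains_fruit(picture):
            yield classify(picture)      # otherwise the picture is discarded


# Toy run with stand-in detector and classifier.
dark = [[0, 0], [0, 0]]
bright = [[100, 100], [100, 100]]
results = list(identify_stream([dark, dark, bright],
                               contains_fruit=lambda p: p[0][0] > 50,
                               classify=lambda p: "diseased"))
print(results)
```

Passing the detector and classifier in as callables keeps the control flow separate from the models, so either stage can be swapped without touching the loop.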

The Influence of Network Depth and Convolution Kernel Size on Model Recognition Rate.
The convolutional layers of the convolutional neural network models in the experiment adopt padding, so that the length and width of the image do not change when it passes through a convolutional layer while the depth increases; the sampling layers use the most common construction method, maximum pooling. The strawberry leaf image database consists of 201 healthy leaves on the front, 201 healthy leaves on the back, 637 early powdery mildew leaves on the front [20], 637 early powdery mildew leaves on the back, and 891 late powdery mildew leaves. In the curves of recognition rate against the number of iterations for the 9 convolutional neural network models, the initial recognition rate of the first 6 models is low, rises quickly in the first 10 iterations, and then gradually stabilizes; for the last 3 models, the initial recognition rate is high, rises rapidly in the first 5 iterations, and then gradually stabilizes. DCNN-1, DCNN-2, and DCNN-3 have the lowest network depth, and their recognition rate curves are at the bottom; DCNN-4, DCNN-5, and DCNN-6 have intermediate network depth, and their recognition rate curves are in the middle; DCNN-7, DCNN-8, and DCNN-9 have the highest network depth, and their recognition rate curves are at the top. At the same time, once the curves are relatively stable, the recognition rate curve of DCNN-9 is above those of DCNN-7 and DCNN-8, and the recognition rate curve of DCNN-6 is above those of DCNN-5 and DCNN-4. The recognition rate curves of DCNN-3 and DCNN-1 are above that of DCNN-2. Figure 6 shows how the recognition rate of convolutional neural network models with different structural depths changes with the number of iterations.
Based on this, we can conclude that (1) as the number of iterations and the depth of the network increase, the correct recognition rate of the convolutional neural network gradually increases; (2) for networks of the same depth, as the number of training iterations increases, the network model with hybrid convolution kernels (5 × 5, 3 × 3) has the highest recognition rate, followed by the network model with a 5 × 5 convolution kernel, while the network model with a 3 × 3 convolution kernel has a lower recognition rate than the other two; (3) the deeper the network, the more obvious the advantage of the network model with hybrid convolution kernels (5 × 5, 3 × 3). With 5 convolution operations and hybrid convolution kernels (5 × 5, 3 × 3), model DCNN-9 performs best, with a recognition rate of 93.16%.
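The size-preserving padding used by the convolutional layers in this experiment can be checked with a short calculation. The sketch below assumes stride 1 and odd kernel sizes; the input size of 100 is an arbitrary illustrative value, and the function names are hypothetical.

```python
def same_padding(kernel_size):
    """Zero padding per side that preserves size for an odd kernel at stride 1."""
    return (kernel_size - 1) // 2


def padded_output_size(input_size, kernel_size, padding, stride=1):
    """Output length along one axis after padding both sides."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


input_size = 100  # illustrative value, not taken from the experiments
for k in (3, 5):
    print(k, same_padding(k), padded_output_size(input_size, k, same_padding(k)))
```

With this padding both the 3 × 3 and the 5 × 5 kernels leave the height and width unchanged, so layers with different kernel sizes can be mixed in one network without shape mismatches.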

The Influence of Sampling Layer Construction Method on Recognition Effect.
To avoid an unbalanced sample structure, in which late-stage powdery mildew leaves with obvious characteristics make up a disproportionately large share, the image database of healthy strawberry leaves was expanded. The structure of the strawberry leaf image library was adjusted, and the image library used for testing has the structure 176 : 176 : 131 : 131 : 179. It can be seen from Table 1 that the recognition effect of the maximum pooling model is similar to that of the mean pooling model, and the recognition effects of intermediate value pooling and mixed pooling are similar. Among them, intermediate value pooling has the best recognition effect, at 98.36%. Considering that there is no need to distinguish the front and back sides of healthy leaves in practical applications, such misidentifications are not counted. The models built with the four sampling layer construction methods had different recognition effects on healthy, early powdery mildew, and late powdery mildew leaves. Among them, the overall recognition rates of intermediate value pooling and mixed pooling were 98.36% and 98.61%, respectively. For healthy strawberry leaves, the recognition rates of mean pooling and mixed pooling were 99.15% and 99.43%, respectively; for the early stage of strawberry powdery mildew, the recognition rates of intermediate value pooling and mixed pooling were 98.85% and 96.56%, respectively; for the late stage of strawberry powdery mildew, mixed pooling performed best, reaching 100.00%. Considering the identification of all leaf types, mixed pooling is chosen as the final sampling layer construction method. Finally, the selected convolutional neural network model uses mixed pooling, five convolution operations, and convolution kernels of 5 × 5 and 3 × 3, and it identifies strawberry leaf powdery mildew well. Figure 7 shows the comparison of the recognition rates of the pooling methods.
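The four sampling layer construction methods differ only in how each window of the feature map is reduced to one value. The sketch below is an illustration under assumptions: "intermediate value pooling" is taken to mean median pooling, and mixed pooling is modelled as a weighted blend of max and mean pooling, which is one common definition and may not match the variant used in these experiments. The function names are hypothetical.

```python
import statistics


def pool2d(feature_map, size, reduce):
    """Apply `reduce` to each non-overlapping size x size window."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, h - size + 1, size):
        row = []
        for c in range(0, w - size + 1, size):
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(reduce(window))
        pooled.append(row)
    return pooled


def mixed(weight):
    """Blend of max and mean pooling; weight=1 is pure max, weight=0 is pure mean."""
    return lambda window: weight * max(window) + (1 - weight) * statistics.mean(window)


feature_map = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 9, 0, 0],
    [9, 1, 0, 4],
]
print("max:   ", pool2d(feature_map, 2, max))
print("mean:  ", pool2d(feature_map, 2, statistics.mean))
print("median:", pool2d(feature_map, 2, statistics.median))
print("mixed: ", pool2d(feature_map, 2, mixed(0.5)))
```

On the lower-left window the mean is pulled down by a single small value while the median is not, which illustrates why the methods can rank leaf classes differently.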

The Influence of Sampling Layer Construction Method on Recognition Effect.
The trained deep convolutional neural network model was used to identify 112 test images of strawberry diseases covering 10 kinds of diseases: strawberry big spot, small spot, round spot, gray spot, stem rot, common rust, head smut, ear rot, Uvularia leaf spot, and sheath blight. The results are shown in Figure 8.

Recognition and Analysis of Strawberry Disease Leaves Based on Image Set.
A k-fold cross-validation strategy is used in the experiment, and the following methods are compared: the method based on colour features (CT), the method based on colour, texture, and shape features (CTSF), the method based on feature fusion and local discriminant projection (FFLDP), and the method based on DCNNs. The first three methods first preprocess and segment the diseased leaf image and then extract classification features from it. Among them, the CT method extracts the colour features of the disease spot image; the CTSF method extracts the colour, texture, and shape features of the disease spot image; and the FFLDP method extracts the centrosymmetric local binary pattern features together with the colour, texture, and shape features of the disease spot image, fuses the features, and uses the local discriminant projection (LDP) algorithm to reduce the dimension of the fused features. Although the recognition rate of these three methods is higher when the disease spot segmentation is more accurate, in order to illustrate the advantages of the proposed method, this paper did not perform disease spot segmentation on the diseased leaf images when applying these methods. Each algorithm repeats the cross-validation experiment 50 times; the average recognition rate over the 10 runs of each experiment is recorded, and the mean and variance of the resulting 50 average recognition rates are then calculated. There are 5 methods for strawberry leaf disease recognition. The comparison of the recognition rates of the methods is shown in Figure 9.
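The evaluation protocol described above (repeat a cross-validation experiment many times, average within each repetition, then report the mean and variance of those averages) can be sketched as follows. This is a minimal illustration: `evaluate` is a hypothetical callable that trains on the given training indices and returns the recognition rate on the held-out indices, and the fold count and repetition count are passed in as parameters rather than taken as established settings.

```python
import random
import statistics


def k_fold_indices(n_samples, k, rng):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cross_validation(n_samples, k, repeats, evaluate, seed=0):
    """Return mean and variance of the per-repeat average recognition rates."""
    rng = random.Random(seed)
    averages = []
    for _ in range(repeats):
        folds = k_fold_indices(n_samples, k, rng)
        rates = []
        for held_out in folds:
            train = [i for fold in folds if fold is not held_out for i in fold]
            rates.append(evaluate(train, held_out))
        averages.append(statistics.mean(rates))
    return statistics.mean(averages), statistics.pvariance(averages)


# Stand-in evaluator that always reports the same recognition rate.
mean_rate, variance = repeated_cross_validation(
    n_samples=20, k=10, repeats=50, evaluate=lambda train, test: 0.9)
print(mean_rate, variance)
```

Reshuffling before every repetition is what makes the variance informative: it measures how much the average recognition rate depends on the particular split.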
It can be seen from Figure 9 that the recognition rate of DCNNs is much higher than that of CT, CTSF, and FFLDP. The reason is that CT, CTSF, and FFLDP cannot directly extract high-quality classification features from the original diseased leaf images because of the complexity and diversity of the images, and the recognition rate of these three methods depends mainly on the quality of the image preprocessing and disease spot segmentation algorithms. DCNNs can learn deep discriminant features directly from the images of diseased leaves instead of extracting artificially designed features.

Conclusion
In this paper, an algorithm for strawberry disease recognition based on a deep convolutional neural network is proposed. Feature representations of strawberry images in different scenes are first learned through normal training; transfer learning is then applied to add strawberry disease image features to the training set; finally, the features are classified and recognized to achieve disease recognition. This paper mainly designs a novel attention mechanism, which makes effective use of the informative regions of the image, and uses transfer learning to quickly establish a fine-grained, attention-based classification model of common strawberry diseases. The results showed that the attention mechanism could improve the accuracy of strawberry disease classification. In the recognition algorithm, the location network detects most disease areas under the guidance of the feedback network, and the classification network identifies and classifies the disease areas according to the proposed regions. The model combines fine-tuning of a pretrained image classification model with fine-grained image classification based on the attention mechanism. The algorithm has a high recognition rate and fast recognition speed, can overcome interference from the external environment to the greatest extent, and achieves rapid and accurate identification of target diseases. However, this method does not take into account the different characteristics of a disease at each stage of its course, because it does not include enough disease images of each stage obtained in the actual planting environment, so it may cause misjudgment and affect the recognition rate.
To solve the problem of accurately recognizing and controlling strawberry diseases from images in the actual planting environment, the further development of image recognition of common strawberry diseases toward non-destructive, fast, and convenient recognition will provide technical support for the scientific and accurate control of strawberry diseases and will support the intelligent research and application of strawberry disease recognition.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.