DBA_SSD: A Novel End-to-End Object Detection Using Deep Attention Module for Helping Smart Device with Vegetable and Fruit Leaf Plant Disease Detection

Abstract: In response to the difficulty of detecting and classifying vegetable and fruit leaves affected by pests and diseases, this study proposes a novel vegetable and fruit leaf pest detection method called deep block attention SSD (DBA_SSD) for identifying pests and diseases and classifying their severity on vegetable and fruit leaves. We propose three vegetable and fruit leaf pest detection methods, namely, squeeze-and-excitation SSD (Se_SSD), DB_SSD, and DBA_SSD. Se_SSD fuses the SSD feature extraction network with a channel attention mechanism, DB_SSD improves the VGG feature extraction network, and DBA_SSD fuses the improved VGG network with the channel attention mechanism. To reduce the training time and accelerate the training process, the convolutional layers of the VGG model trained on the ImageNet dataset are migrated to this model, and the collected vegetable and fruit disease image dataset is randomly divided into training, validation, and test sets in the ratio of 8:1:1. In addition, data enhancement methods, such as histogram equalization and horizontal flip, were used to expand the image data. The performance of the three improved algorithms is compared and analyzed in the same environment and against the classical target detection algorithms YOLOv4, YOLOv3, Faster RCNN, and YOLOv4 tiny. Experiments show that DBA_SSD outperforms the two other improved algorithms, and its performance in the comparative analysis is superior to that of the other target detection algorithms.


Introduction
Fruits and vegetables are susceptible to various diseases, which seriously affect their quality and yield. Formulating prevention and control plans as early as possible before an outbreak maximizes the effect of prevention and control and reduces economic losses. Therefore, the identification of vegetable and fruit diseases and insect pests is an effective way to inhibit the rapid development of diseases and avoid their occurrence. Previously, people categorized crop diseases by subjective judgment based on experience. However, the ability of this process to distinguish various diseases is limited, and it is time consuming.
With the development of artificial intelligence technology, agricultural detection based on artificial intelligence, such as crop yield prediction [1], weed identification processing [2], and pest and disease detection [3,4], is widely used. Machine learning-based disease detection requires preprocessing the dataset, extracting the features of disease regions in the image using feature extraction algorithms, sending the obtained feature information to the classifier to obtain the model parameters, and obtaining the disease and pest categories and the degree of disease and pest of the object to be detected. However, the generalization ability of machine learning-based image recognition models is weak. When the number of categories is excessive, the features of each class cannot be distinguished effectively. Moreover, the categories can only be recognized in a specific image context. Thus, the needs of large-scale planting cannot be met, which makes it important to research a fast end-to-end vegetable and fruit leaf pest detection method.
In recent years, deep learning algorithms based on convolutional neural networks (CNNs) [5] have been widely used. CNNs have a large number of adjustable parameters [6] and are very effective for real-life object detection, recognition, and classification [7]. In agricultural research, applications of deep learning technology include crop/weed recognition, fruit harvesting [8], and plant recognition [9]. Similarly, recent studies have also focused on the identification of plant diseases. Some of the latest network models have been applied to the classification of plant diseases. In addition, some researchers have introduced deep learning algorithms to improve the performance of plant species disease classification. Longsheng Fu [10] proposed an orchard kiwi fruit target detection algorithm. According to the characteristics of kiwi fruit images, 3*3 and 1*1 convolutions were introduced into the YOLOv3-tiny [11] model to obtain the DY3TNet model, which was compared with R-CNN [12], YOLOv2 [13], and YOLOv3-tiny. The experimental results show that the improved DY3TNet model is small in size and high in efficiency. Guoxu Liu et al. [14] detected tomatoes based on the YOLOv3 model [15], combined a dense structure for feature extraction, replaced the traditional R-Bbox with C-Bbox to match the shape of the tomato and reduce the number of parameters, and compared the result with YOLOv2 and Faster RCNN [16]. Literature [17] proposed a tomato gray spot recognition method based on a lightweight network model combining MobileNetV2 [18] and YOLOv3. This method improves the accuracy of tomato gray spot recognition by introducing the GIoU regression loss function, and it uses a pre-training method that combines hybrid training and migration learning to improve the generalization ability of the model. Literature [19] compared the performance of five networks, namely, AlexNet [20], VGG-16 [21], ResNet-101 [22], DenseNet-161 [23], and SqueezeNet [24], for nutrient deficiency symptom identification based on the Deep Nutrient Deficiency for Sugar Beet dataset and discussed their limitations.
Building a fast model with high classification accuracy is necessary to ensure the detection quality for plant and fruit leaf diseases and insect pests. The current mainstream target recognition networks include the YOLO series, Faster RCNN, SSD [25], and FPN [26]. The SSD target detection network uses an end-to-end method to regress features and extracts different levels of image features, which cover low-level and high-level semantic information. Previous studies have shown that the SSD network is fast. However, the direct application of SSD to detect vegetable and fruit pests and diseases cannot meet the high precision requirements of agricultural production. This paper proposes a feature extraction module that fuses the residual network and 1*1 convolution. It strengthens the feature extraction capability of SSD and improves the positioning and recognition accuracy of SSD for detecting vegetable and fruit pests. We also use data enhancement to perform spatial transformation and pixel transformation on images, which not only improves the richness of the features, the detection accuracy, and the detection efficiency but also reduces the labor costs of agricultural fruit and vegetable pest detection.
This study focuses on proposing a novel end-to-end pest detection algorithm called Deep Block Attention SSD (DBA_SSD) for vegetable and fruit leaves. Our main work and contributions are presented as follows:
1. We propose a novel end-to-end detection algorithm for plant and fruit leaf diseases and insect pests, DBA_SSD, by combining the attention mechanism and convolution kernel. It exploits the attributes of plant and fruit leaf disease and insect pest pictures and pays more attention to pest and disease details when testing fruit and vegetable leaves.
2. We grade the health of the fruit and vegetable leaves. According to the grading results, different measures can be taken depending on the severity of the leaf diseases, which is of great significance for increasing the yield of fruits and vegetables.
3. We implement the classic SSD, YOLOv4, YOLOv3, Faster RCNN, and YOLOv4 tiny models and compare them with our proposed DBA_SSD. Our method outperforms the classic baseline methods on the vegetable and fruit leaf dataset.
The main structure of this article is as follows. The first chapter introduces the related work on the detection of leaf diseases and insect pests of fruits and vegetables and reviews the corresponding detection technology. The second chapter introduces the SSD model and related improvement modules and proposes two improved methods for the SSD target detection algorithm. The third chapter introduces the experimental environment, dataset structure, experimental procedure, and evaluation standard. The fourth chapter analyzes two sets of experiments: one is a comparative and ablation analysis of SSD and its improved algorithms, including the proposed DBA_SSD, and the other is a comparative analysis of the improved SSD algorithms and other target detection algorithms. Finally, we summarize the research in this article and discuss future work.

Related work
At present, research on plant disease recognition mainly focuses on two aspects. One is disease recognition based on machine learning, whose general steps include diseased leaf image segmentation, feature extraction, and disease recognition. The other is target recognition technology based on deep learning, in which end-to-end target detection is favored by many researchers because of its fast recognition speed and efficient feature extraction.
In the research on the identification of vegetable and fruit diseases based on machine learning, Literature [27] proposed a DCNN-based apple tree leaf disease (ATLD) diagnosis method and established a dataset of five common ATLDs and healthy leaves. The DCNN model combines the DenseNet and Xception [28] models and uses a support vector machine to classify apple leaf diseases. The experimental results show that the accuracy of the DCNN model is better than that of Inception-v3 [29], MobileNet [30], VGG-16, DenseNet-201, Xception, and VGG-INCEP. Shrivastava et al. [31] proposed a rice disease image classification method using only color features and explored the feature extraction methods of 14 different color channels. They obtained 172 different color channel features and compared the performance of 7 different classifiers. The results show that the classification accuracy of the support vector machine classifier reaches 94.65%. Literature [32] introduced a hybrid method for detecting plant leaf diseases and insect pests. The first stage applies image enhancement and image conversion to overcome the problems related to low illumination and noise. The second stage combines the feature extraction techniques of GLCM, complex Gabor filter, Curvelet, and image moments. The third stage uses the extracted features to train a neuro-fuzzy logic classifier. The proposed combination of feature extraction and image preprocessing can improve classification accuracy. Abdulridha [33] used hyperspectral imaging and machine learning to develop a technique for detecting pumpkin powdery mildew in the asymptomatic, early, middle, and late stages. This method uses a radial basis function to distinguish diseased plants from healthy plants and to classify the severity of diseased plants. Abdu [34] proposed a method for identifying the surface of plant diseased leaves, extracting optimized features from the diseased area, and identifying plant diseased leaves with a feature-based machine learning classifier. The disease features are concatenated to form a pathological feature vector for disease recognition to improve detection accuracy.
In deep learning-based research on fruit and vegetable diseases, Salma Samiei [35] used red clover and alfalfa as research objects and proposed CNN-LSTM models combined with denoising algorithms to classify the different growth stages of the two plant species. Based on high-resolution remote sensing data, Alin-Ionuț Pleșoianu et al. [36] proposed an integrated deep learning model for individual tree crown detection and species classification. Mohamed Kerkech et al. [37] proposed a new method of grape disease detection based on the SegNet [38] architecture. It segments visible light and infrared images to identify shadows, ground, healthy vines, and symptomatic vines, and then merges the segmentations obtained from the visible light and infrared images to generate the whole disease map of the grapes. In the literature [39], a U-Net method for pixel-level purple rapeseed segmentation was proposed, in which the model parameters were obtained by adjusting the sample size. In the literature [40], a new thermal imaging method was proposed to address the color similarity between unripe citrus fruits and leaves. It exploits the temperature difference between fruit and leaf surfaces, which arises because water mist changes their temperatures at different rates, and builds a deep learning model based on the thermal imaging system. Meanwhile, disease detection algorithms are becoming more lightweight, which makes deployment on embedded devices easier. Chongke Bi [41] proposed a lightweight method for apple leaf disease identification based on the MobileNet model. This method was also compared with ResNet152 and InceptionV3. It provides stable recognition results and is easily deployed on mobile devices. Utpal Barman [42] compared MobileNet CNN and Self-Structured CNN (SSCNN) on a citrus disease dataset of smartphone images. The experiments show that SSCNN is more accurate in classifying citrus leaf diseases from smartphone images and takes less computation time. An increasing number of scholars tend to detect plant diseases using deep learning-based target detection methods, especially one-stage methods represented by YOLO and SSD. These methods omit tedious machine learning steps, such as image preprocessing, segmentation, and feature extraction, and achieve high recognition accuracy in a one-step end-to-end manner. Therefore, this paper explores the effectiveness of target detection algorithms for vegetable and fruit leaf disease detection and grading by using SSD as the baseline method.
Novel end-to-end method for leaf disease detection of fruits and vegetables

The SSD algorithm model is a one-stage real-time target detection model proposed in the same period as the YOLO series. SSD combines the one-stage regression prediction idea of the YOLO series with the Anchor Box mechanism of Faster RCNN. It uses VGG as the base feature extraction network and extracts six feature layers of different sizes, from the bottom to the top layer, as the regression prediction features. The advantage of SSD is that it greatly improves the operation speed of the algorithm while maintaining the detection accuracy. Moreover, the detection of both small targets and large objects is considered. Figure 1 shows the SSD backbone network structure.

Fig. 1 SSD backbone network structure

SSD Network
The loss function of SSD contains a log loss for classification and a smooth L1 loss for regression, and it controls the proportion of positive and negative samples, which can improve the speed of optimization and the stability of training results. The total loss function is the weighted sum of the classification and regression errors. α is used to adjust the weight between the confidence loss and the location loss (default α = 1), and N denotes the total number of default boxes that eventually match with ground truth boxes. The confidence loss is a typical softmax loss, and the location loss is a typical smooth L1 loss.

Total loss:

$$L(x, c, l, g) = \frac{1}{N}\left(L_{conf}(x, c) + \alpha L_{loc}(x, l, g)\right)$$

Classification loss:

$$L_{conf}(x, c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_{i}^{p}\right) - \sum_{i \in Neg} \log\left(\hat{c}_{i}^{0}\right), \quad \hat{c}_{i}^{p} = \frac{\exp\left(c_{i}^{p}\right)}{\sum_{p} \exp\left(c_{i}^{p}\right)}$$

Of which, $x_{ij}^{p} \in \{0, 1\}$ represents whether the i-th default box matches the j-th ground truth box of category p, c denotes the class confidences, l the predicted boxes, and g the ground truth boxes.

SSD adopts the direct regression prediction of a fully convolutional network and no longer generates candidate boxes, which greatly improves its detection speed. However, the SSD algorithm produces missed and false detections when facing similar surface features and leaf occlusion, which often occur in actual leaf disease detection. For this reason, SSD needs to be improved to enhance feature recognition.
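The loss described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`smooth_l1`, `softmax`, `ssd_loss`) and the data layout are hypothetical, background is assumed to be class index 0, and the negative boxes are assumed to have already been selected by whatever positive/negative balancing the training procedure applies.

```python
import math


def smooth_l1(x):
    """Smooth L1: quadratic for |x| < 1, linear otherwise."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ssd_loss(pos, neg, alpha=1.0):
    """Total loss (L_conf + alpha * L_loc) / N.

    pos: list of (class_logits, true_class, pred_offsets, target_offsets)
         for default boxes matched to a ground truth box.
    neg: list of class_logits for unmatched boxes (assumed pre-selected);
         their target is the background class, assumed to be index 0.
    """
    n = len(pos)
    if n == 0:
        # No matched boxes: the loss is defined as zero.
        return 0.0
    conf = 0.0
    loc = 0.0
    for logits, cls, pred, target in pos:
        conf -= math.log(softmax(logits)[cls])
        loc += sum(smooth_l1(p - t) for p, t in zip(pred, target))
    for logits in neg:
        conf -= math.log(softmax(logits)[0])
    return (conf + alpha * loc) / n
```

With one matched box whose two class scores are equal and whose offsets are predicted exactly, the location term vanishes and the loss reduces to log 2, which is a convenient sanity check.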

Squeeze-and Excitation SSD (Se_SSD) Network
Se_Block [43] mainly focuses on the relationship between channels. With the "Squeeze-and-Excitation (SE)" structural unit, it explicitly models the interdependencies between feature channels and adaptively adjusts the feature response of each channel. The Se_Block module works as shown in Fig. 2. First, feature compression is performed along the spatial dimension of the feature map, and each two-dimensional feature channel is turned into a real number that has a global receptive field to a certain extent. The output has the same number of dimensions as the input feature channels. Then, based on the correlation between the feature channels, a weight is generated for each feature channel to represent its importance. Finally, the original features are recalibrated in the channel dimension by multiplying the weights channel by channel onto the previous features.

The residual module is shown in Figure 4(a). X is the input feature map, Wi is the weight of the i-th layer of the network, F(X, Wi) is the residual mapping computed inside the module, and F(X, Wi) + X is the feature output. The residual network is superior to the traditional convolutional network. The residual module makes ultra-deep networks trainable and avoids the bottleneck in which accuracy saturates as the network is continuously deepened. In addition, directly connecting the input to the output simplifies the learning objective and reduces the learning difficulty. The 1*1 convolution [46] is shown in Figure 4(b). A 1*1 convolution is usually followed by a ReLU nonlinear layer so that more features can be learned. In addition, a 1*1 convolution can change the channel dimensionality of the feature map, which improves the generalization ability and reduces overfitting. By increasing or reducing the number of channels, it also reduces the computational effort and achieves cross-channel information interaction and feature integration.

As shown in Figure 5, two kinds of rich feature extraction modules are designed in this paper. As shown in Figure 5(a), Deep_Block enhances the network feature extraction capability by using 1*1 convolution to reduce the number of channels after convolution and fuse multi-channel information, while introducing a residual structure to prevent the loss of feature layer information. As shown in Figure 5(b), Deep_Block_Attention adds a channel attention mechanism at the end of the Deep_Block structure for fine-tuning at the channel level. The feature extraction network of SSD is reconstructed with the rich feature extraction module as the basic feature extraction unit, as shown in Figure 6, to deepen the feature extraction of each layer and increase the richness of feature learning.

Fig. 6 DBA_SSD network structure
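The combination of 1*1 convolution and a residual connection can be illustrated with a small dependency-free sketch. It only shows the F(X, Wi) + X pattern with channel reduction and expansion; the exact layer layout of Deep_Block (for example, any 3*3 convolutions around the 1*1 layers) is not specified here, so the two-layer structure and the names `conv1x1`, `relu`, and `deep_block` are assumptions for illustration.

```python
def conv1x1(x, weight):
    """1*1 convolution without bias.

    x: feature map as nested lists [C_in][H][W].
    weight: [C_out][C_in]; each output channel is a per-pixel
    weighted sum of the input channels (cross-channel mixing).
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(wc * x[c][i][j] for c, wc in enumerate(row)) for j in range(w)]
            for i in range(h)
        ]
        for row in weight
    ]


def relu(x):
    """Element-wise ReLU on a [C][H][W] feature map."""
    return [[[max(0.0, v) for v in r] for r in ch] for ch in x]


def deep_block(x, w_reduce, w_expand):
    """Residual unit: output = F(X, W) + X.

    F is assumed to be: 1*1 conv (reduce channels) -> ReLU -> 1*1 conv
    (restore channels), so the skip connection can be added directly.
    """
    f = conv1x1(relu(conv1x1(x, w_reduce)), w_expand)
    return [
        [[fv + xv for fv, xv in zip(fr, xr)] for fr, xr in zip(fc, xc)]
        for fc, xc in zip(f, x)
    ]
```

If all weights are zero, the block returns its input unchanged, which is the property that lets the residual path preserve feature layer information.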

Experimental environment
This experiment builds a deep learning model under the PyTorch deep learning framework, using a dataset of 3,000 vegetable and fruit leaf images. The final output prediction box identifies the leaf species and determines the severity of leaf disease. The experimental environment comprises an AMD Ryzen 7 4800H processor, an NVIDIA GeForce RTX 2060 graphics card, and 32 GB of RAM.

Experimental design
To ensure the balance of the dataset and to increase its richness and quality, data enhancement and image preprocessing were performed on the images before the experimental tests [44]. The means of enhancement are Histogram Equalization, Horizontal Flip + Hue Saturation Value, Vertical Flip + Channel Shuffle, and Horizontal Flip + Vertical Flip + Channel Shuffle. The enhanced images are shown in Figure 8. Each of the 15 classes was expanded to 1,000 images, so the dataset grew from 3,000 to 15,000 images, which were randomly assigned to the training, validation, and test sets in the ratio of 8:1:1.
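The pixel-level and spatial enhancement operations listed above can be sketched on small nested-list images. This is an illustrative sketch only: images are assumed to be stored as [channel][row][column] with integer values, the function names are hypothetical, the Hue Saturation Value adjustment is omitted, and the equalization uses the textbook cumulative-histogram mapping rather than any particular library routine.

```python
def horizontal_flip(img):
    """Mirror each row of every channel (left-right flip)."""
    return [[row[::-1] for row in ch] for ch in img]


def vertical_flip(img):
    """Reverse the row order of every channel (up-down flip)."""
    return [ch[::-1] for ch in img]


def channel_shuffle(img, order):
    """Reorder channels, e.g. order=[2, 0, 1] maps RGB to BRG."""
    return [img[c] for c in order]


def equalize_channel(ch, levels=256):
    """Histogram equalization of one channel with values in [0, levels-1].

    Each gray level k is mapped to (levels - 1) * cdf(k), where cdf is
    the cumulative share of pixels with value <= k.
    """
    hist = [0] * levels
    for row in ch:
        for v in row:
            hist[v] += 1
    n = sum(hist)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    return [[round((levels - 1) * cdf[v] / n) for v in row] for row in ch]


def equalize_color(img, levels=256):
    """Equalize each channel separately and recombine them."""
    return [equalize_channel(ch, levels) for ch in img]
```

Combined augmentations such as Horizontal Flip + Vertical Flip + Channel Shuffle are then simple compositions, for example `channel_shuffle(vertical_flip(horizontal_flip(img)), [2, 0, 1])`.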
Among these methods, histogram equalization is an image pixel processing method that brings out objects with insignificant contrast. It is essentially a nonlinear grayscale transformation that spreads concentrated grayscale intervals over the entire grayscale range. The histogram of a digital image with gray levels in the range [0, L-1] is the discrete function

$$p(r_k) = \frac{n_k}{n}, \quad k = 0, 1, \ldots, L-1$$

where n is the total number of pixels in the image, $n_k$ is the number of pixels at the current gray level $r_k$, and L is the total number of possible gray levels in the image. Color histogram equalization equalizes each of the three channels of the image and then fuses them. Histogram equalization is performed on the dataset because some of its images are blurred or have insignificant contrast.

Experiment 1: The Se_SSD network with the Se_Block channel attention mechanism added is trained, and the average accuracy of this network for the detection of fruit and vegetable leaves is tested.
Experiment 2: The DB_SSD network with the Deep_Block module added is trained under the environment and hardware conditions of Experiment 1. The Deep_Block module does not contain the attention mechanism, so this experiment tests the effect of the improved structure alone, without the attention mechanism, on the detection of fruit and vegetable leaves.
Experiment 3: The DBA_SSD network with the Deep_Block_Attention module added, which contains the attention mechanism, is trained and tested under the environment and hardware conditions of Experiment 1. This is the SSD network with both the improved structure and the added attention mechanism.
Experiment 4: The original SSD network is trained and tested under the environment and hardware conditions of Experiment 1.
All four experiments were trained on the dataset of 15,000 vegetable and fruit pest leaf images and tested on 1,500 randomly selected images. The experiments followed the flow in Fig. 9, an experiment-comparison-optimization-experiment pattern, to obtain the mean average precision (mAP) of each model and to compare the mAP values of different models.
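The random partition of the 15,000 images can be sketched as follows, assuming the 8:1:1 training/validation/test ratio stated in the abstract, which is consistent with the 1,500 test images mentioned above. The function name `split_dataset` and the fixed seed are illustrative choices, not details taken from the experiments.

```python
import random


def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items and split them into train/validation/test parts.

    ratios: relative sizes of the three parts (assumed 8:1:1).
    seed: fixed only so that the sketch is reproducible.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test part
    return train, val, test
```

For 15,000 items this yields 12,000 training, 1,500 validation, and 1,500 test samples, with every item appearing in exactly one part.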

Performance Evaluation Metrics
Precision is a measure of the accuracy of a model's predictions, and its value is the number of correctly predicted positive samples over the total number of samples predicted as positive. Recall is a measure of the model's ability to identify positive samples, and its value is the number of correctly predicted positive samples over the total number of actual positive samples. The prediction results of the model are categorized as TP, FP, FN, and TN, as shown in Table 1.
Table 1

The PR curve is a graph drawn with Recall as the horizontal axis and Precision as the vertical axis. Precision is generally negatively correlated with Recall, and the recall rate tends to decrease as precision increases. AP (average precision), the indicator for a single category, is the integral of the PR curve:

$$AP = \int_{0}^{1} P(R)\,dR$$

The value of mAP (mean average precision), one of the important metrics for evaluating the whole model, is the average of the APs of all categories:

$$mAP = \frac{1}{N}\sum_{n=1}^{N} AP_n$$

where n is the category index and N is the total number of categories.
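The metrics above can be computed with a short sketch. It is a minimal illustration under assumptions: detections of one category are given as (confidence, is_true_positive) pairs that have already been matched to ground truth, and the area under the PR curve is approximated with all-point interpolation, which is one common convention and not necessarily the one used in the experiments. All function names are hypothetical.

```python
def precision(tp, fp):
    """TP / (TP + FP): share of predicted positives that are correct."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """TP / (TP + FN): share of actual positives that are found."""
    return tp / (tp + fn) if tp + fn else 0.0


def average_precision(detections, n_positive):
    """Area under the PR curve for one category.

    detections: list of (confidence, is_true_positive) pairs.
    n_positive: number of ground truth objects of this category.
    """
    if n_positive == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_positive)
        precisions.append(tp / (tp + fp))
    # Make precision non-increasing from left to right (PR envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_r) * p
        prev_r = r
    return ap


def mean_average_precision(aps):
    """mAP: arithmetic mean of the per-category AP values."""
    return sum(aps) / len(aps)
```

For instance, three ranked detections that are correct, wrong, and correct against two ground truth objects give an AP of 5/6 under this interpolation.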

DBA_SSD model experimental comparison analysis
The first 50 epochs were trained with some of the network layer weights frozen, and each batch contained 8 images. For the last 50 epochs, the frozen layers were unfrozen and the full network was trained. The learning rate started at 5e-4 and was 1e-4 after unfreezing, so that the model parameters were fine-tuned.

Fig. 11 SSD and its improved algorithm loss variation graph

The test results of SSD and its improved algorithms are shown in Table 2. DBA_SSD has the highest accuracy because, on the one hand, Deep_Block strengthens the network's feature extraction ability and, on the other hand, the channel attention mechanism accelerates network learning, so that the network focuses on the channels with high information content for feature learning. The prediction accuracy of SSD and its improved algorithms for different species of fruit and vegetable diseases is shown in Figure 12. The prediction accuracy of DBA_SSD is relatively high in most of the categories, and the mAP value of DBA_SSD is 92.20%, while the mAP values of SSD, Se_SSD, and DB_SSD are 9.96%, 90.77%, and 89.93%, respectively.
Table 2 Comparison of accuracy of improved SSD algorithm
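The two-stage training schedule described at the start of this subsection can be written down as a small lookup function. This is a sketch under assumptions: the learning rate is treated as constant within each stage, the batch size of 8 is assumed to carry over to the second stage, and which layers are frozen is not specified, so the flag below is only a placeholder. The name `training_plan` is hypothetical.

```python
def training_plan(epoch, freeze_epochs=50, total_epochs=100):
    """Return the training settings for a zero-based epoch index.

    Stage 1 (epochs 0-49): part of the network frozen, lr = 5e-4.
    Stage 2 (epochs 50-99): full network trained, lr = 1e-4.
    """
    if not 0 <= epoch < total_epochs:
        raise ValueError("epoch outside the training schedule")
    frozen = epoch < freeze_epochs
    return {
        "backbone_frozen": frozen,  # placeholder for the frozen layers
        "lr": 5e-4 if frozen else 1e-4,
        "batch_size": 8,  # assumed unchanged after unfreezing
    }
```

A training loop would query this function once per epoch and toggle gradient updates for the frozen layers accordingly.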

Comparative analysis with classical target detection algorithms
This experiment compares and analyzes the test results against the classical target detection algorithms YOLOv4 [45], YOLOv4 tiny [46], Faster RCNN, and YOLOv3. The experiment is conducted on the same dataset in the same experimental environment, and the loss variation of each algorithm is shown in Figure 14.
Fig. 14 Target detection algorithm loss diagram

The leaves of each vegetable and fruit plant in this paper are classified into three categories according to the degree of disease: healthy, general, and severe (Table 3). Figure 15 then averages the detection accuracy of the same kind of leaf on the basis of Table 3, so the prediction accuracy of a kind is the average of the prediction accuracies of its three degrees. Its horizontal coordinates indicate the different target detection algorithms, and its vertical coordinates indicate the average prediction accuracy of each kind of fruit and vegetable leaf and the total average prediction accuracy (mAP).
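The averaging over the three disease degrees can be expressed as a simple grouping step. The sketch below assumes the per-class accuracies are available as a mapping keyed by (species, degree); the function name and the numbers in the usage note are made up for illustration and are not results from Table 3.

```python
def per_species_accuracy(ap_by_class):
    """Average the accuracy of each species over its disease degrees.

    ap_by_class: {(species, degree): accuracy}, where degree is one of
    "healthy", "general", or "severe".
    Returns {species: mean accuracy over the degrees present}.
    """
    grouped = {}
    for (species, _degree), ap in ap_by_class.items():
        grouped.setdefault(species, []).append(ap)
    return {s: sum(v) / len(v) for s, v in grouped.items()}
```

With hypothetical inputs of 0.9, 0.8, and 0.7 for the healthy, general, and severe classes of one species, the function returns approximately 0.8 for that species.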
Compared with DBA_SSD, YOLOv4 has lower prediction accuracy for Strawberry and Chili, YOLOv4 tiny has weaker prediction ability for Tomato, and YOLOv3 has lower prediction accuracy for Strawberry. These differences arise because the feature extraction networks of different algorithms focus on different information in the learned images, and DBA_SSD addresses this deficiency by covering all levels of semantic information. The rightmost column indicates the average detection accuracy of the DBA_SSD algorithm in different categories, with the highest classification accuracy of 100% and the lowest of 82.24%.

Figure 16 shows that YOLOv4 corresponds to the largest rectangular box area, and its upper quartile edge is close to 100%, indicating that a certain number of its prediction accuracies are higher than 95%. However, its per-category prediction accuracy is more dispersed. YOLOv3 has a smaller rectangular area, but the top of its rectangle is not as high as that of DBA_SSD, indicating that it has fewer high accuracy values than DBA_SSD. Although the upper quartile line of SSD touches the 100% line, its rectangle area is larger, indicating that its prediction accuracy varies widely and is unstable. The rectangle area of DBA_SSD is the smallest among all algorithms, indicating that its prediction accuracy is more concentrated, and it is closer to the 100% line, suggesting that a large part of its prediction accuracies are high and the prediction for each kind is more stable. The experiment shows that the DBA_SSD model has a high accuracy rate for the recognition of fruit and vegetable leaves, and SSD is a one-stage target recognition algorithm with the advantage of fast recognition speed. The comprehensive performance of DBA_SSD is improved compared with the original SSD and is also higher than that of the other target detection algorithms. The detection effect is shown in Fig.

Conclusion

This paper discusses the work related to vegetable and fruit pest leaf detection and augments the original dataset with spatial transformation and pixel processing. To address the problem that the recognition rate of the SSD model is not high and detail information receives insufficient attention, 1×1 convolution, the residual network, and the attention mechanism are incorporated into the SSD algorithm, and the DBA_SSD network model for vegetable and fruit leaf pest detection is proposed. It is compared and analyzed against multiple classical target detection algorithms in vegetable and fruit leaf health detection. DBA_SSD raises the mAP of the SSD algorithm to 92.20% with high robustness and speed. In this paper, we mainly performed research on vegetable and fruit pest identification algorithms, but a gap remains before target detection algorithms can be applied in actual production. Future work will mainly focus on reducing and optimizing the algorithms so that they can be deployed on embedded devices for real-time monitoring of agricultural plant diseases.

Conflicts of Interest:
The authors declare no conflict of interest.

Fig. 2 Se_Block Attention Module

To increase the feature extraction capability of the SSD feature extraction model and focus more on the feature layers with higher importance, this paper adds the Se_Block attention mechanism module in front of the last six effective feature layers used for regression prediction on the basis of the SSD model. The feature layers are rescaled along the channel dimension. The structure of the Se_SSD network is shown in Fig. 3.
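The channel rescaling performed by Se_Block can be sketched without any framework. The sketch follows the usual squeeze (global average pooling), excitation (two fully connected layers with ReLU and sigmoid), and scale steps; biases are omitted, and the weight shapes and the name `se_block` are assumptions for illustration rather than the configuration used in Se_SSD.

```python
import math


def se_block(x, w1, w2):
    """Squeeze-and-excitation rescaling of a [C][H][W] feature map.

    w1: [C_mid][C] weights of the first (reducing) layer.
    w2: [C][C_mid] weights of the second (restoring) layer.
    """
    # Squeeze: one real number per channel via global average pooling.
    z = [sum(sum(r) for r in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: FC -> ReLU -> FC -> sigmoid gives a weight in (0, 1).
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    s = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value of a channel by that channel's weight.
    return [[[s_c * v for v in r] for r in ch] for s_c, ch in zip(s, x)]
```

The output has the same shape as the input, so the block can be placed in front of each prediction feature layer without changing the rest of the network.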

Figure 5 Enriched feature extraction module ((a) Deep_Block, a feature extraction module combining residual network and 1*1 convolution; (b) Deep_Block_Attention, a feature extraction module adding an attention mechanism to (a))

Fig. 7 Data set composition structure

Fig. 8 Data enhancement

To better test the performance of the improved algorithm, four experiments were designed. Se_SSD with the channel attention mechanism added at the end of the feature extraction network, DB_SSD (Deep Block SSD) with the improved VGG feature extraction network, DBA_SSD with the fusion of the improved VGG network and the channel attention mechanism, and the original SSD network are compared. In all cases, the convolutional layers of the VGG model trained on the ImageNet dataset are migrated to the model.
Fig. 9 Experimental flow

True Positives (TP): the number of correctly identified positive samples; True Negatives (TN): the number of correctly identified negative samples; False Positives (FP): the number of negative samples incorrectly identified as positive; False Negatives (FN): the number of positive samples incorrectly identified as negative.
In the loss variation graph, the horizontal coordinate is the number of epochs trained, and the vertical coordinate is the loss value at the end of each training epoch. Different line shapes indicate different improved algorithms. As shown in Figure 11, the loss value of the model decreases as the number of iterations increases. The loss values in the training log gradually stop changing around epochs 90-100. The thin red solid line in the figure indicates the loss value of DBA_SSD, which is lower than the loss of the SSD, Se_SSD, and DB_SSD algorithms.
Fig. 12 AP diagram of SSD and its improved algorithms for the detection of different kinds of diseases

The data distribution of the experimental results is examined further in Figure 13. The horizontal coordinates indicate the improved algorithm types, the vertical coordinates show the distribution of the predicted AP values for the 15 classes, the triangle markers indicate the mean, and the thin solid line in the middle of each rectangle indicates the median. Figure 13 shows that among the four algorithms SSD, Se_SSD, DB_SSD, and DBA_SSD, the prediction accuracy of DBA_SSD is the most concentrated, and its median and mean are the highest. The DBA_SSD algorithm therefore performs better than the other improved algorithms.
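The quantities that such a box diagram displays (mean, median, quartiles, and box height) can be computed from a list of per-class AP values with the Python standard library. This is a sketch for illustration: the function name `box_stats` is hypothetical, and the "inclusive" quartile convention is an assumption, since plotting tools may use slightly different quartile definitions.

```python
import statistics


def box_stats(ap_values):
    """Summary statistics behind a box plot of per-class AP values."""
    q1, median, q3 = statistics.quantiles(ap_values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(ap_values),  # triangle marker
        "median": median,                     # line inside the box
        "q1": q1,                             # lower edge of the box
        "q3": q3,                             # upper edge of the box
        "iqr": q3 - q1,                       # box height (spread)
    }
```

A smaller `iqr` corresponds to a shorter box, that is, to more concentrated prediction accuracy across classes.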

Fig. 13 Box diagram of SSD and its improved algorithms

Fig. 16 Box plot of AP statistics under target detection algorithm

Table 3 Comparison of the accuracy of the improved SSD model and other target detection algorithms for the detection of different kinds of diseases

Fig. 15 Heat map of correlation between different target detection algorithms and vegetable and fruit leaf types