A novel method for detecting morphologically similar crops and weeds based on the combination of contour masks and filtered Local Binary Pattern operators

Abstract

Background: Weeds are a major cause of low agricultural productivity. Some weeds have morphological features similar to crops, making them difficult to discriminate.

Results: We propose a novel method that combines filtered features extracted by combined Local Binary Pattern (LBP) operators with features extracted from plant-leaf contour masks to improve the discrimination rate between broadleaf plants. Opening and closing morphological operators were applied to filter noise in plant images. Images at four stages of growth were collected using a testbed system. Mask-based local binary pattern features were combined with filtered features and a coefficient k. Crops and weeds were classified using a support vector machine with a radial basis function kernel. With optimal parameters, this method reached a classification accuracy of 98.63% on the four classes of the "bccr-segset" dataset (published online), compared with an accuracy of 91.85% attained by a previously reported method.

Conclusions: The proposed method enhances the identification of crops and weeds with similar appearance and demonstrates its capability for real-time weed detection.

Weed infestation poses a threat to the environment, crop yields and quality. Weeds in a field retard crop growth by competing for access to sunshine, water and nutrients. In particular, the density, spreading time and growth characteristics of weeds are important factors for weed management [1]. One of the most invasive and serious weeds is wild radish, which causes significant crop yield losses and low-quality crops due to its fast growth rate, contaminants, multiple-herbicide resistance and vigorous competition [2][3][4]. Currently, blanket herbicide spraying is the most common practice used to eradicate weeds. However, the excessive use of herbicides has negative impacts on the environment, in addition to the development of herbicide resistance in weeds. The central challenge in controlling weeds is to attain optimal eradication efficacy with minimum herbicide usage. Note that reducing herbicide application rates brings down the cost of weed management; hence, it is a worthwhile objective in precision agriculture.

Selectively spraying weeds automatically in vegetation fields is considered a promising way to reduce the environmental and economic costs of weed management. Wild radish is a dominant weed in all broadacre field crops, including wheat, barley, sorghum, maize and canola. Canola is the most difficult crop to discriminate from wild radish because of their morphological similarity [5]. Therefore, canola, corn and wild radish were selected for experimental investigation in this study. Classifying crops and wild radish plants is a vital practical problem in agriculture. The ability to accurately detect and classify weeds in row crops in real time enables the selective application of herbicides, thus enhancing the quality and productivity of crops.

There have been numerous studies on weed-from-crop discrimination.
Spectral techniques based on the calculation of Normalised Difference Vegetation Indices (NDVIs) [6, 7] have long been proposed for identifying plant species. However, this method has some deficiencies in typical farm fields.

As a result, there is a correlation among some characteristics of the structuring elements.

There are two basic morphological operations for binary and grey-scale images: erosion and dilation. Erosion is a shrinking transformation, which reduces the size of regions within the image while expanding the size of holes within those regions. Dilation is an expansion transformation, which increases the size of regions within the image while reducing the size of holes in the regions and gaps between regions. It is important to note that the erosion operator filters the inner image, while the dilation operator filters the outer image. Opening and closing morphological operators, which are extensions of the erosion and dilation operators, are also used to find specific shapes in an image. Specifically, the opening operation comprises an erosion followed by a dilation, and helps to smooth the contour of an image and eliminate small objects. The closing operation, on the other hand, tends to remove small holes and fill gaps in contours [53]. Morphological operations have gained popularity because they are useful for detecting image edges and suppressing noise.

In this paper, opening and closing morphological operators are applied to grey-scale images, mainly to filter noise [53], while erosion and dilation operations are used for processing image edges. Let I(x, y) be a grey-scale two-dimensional image and S a structuring element.
The erosion of a grey-scale image I(x, y) by a structuring element S(a, b) is defined as [52, 54]:

(I ⊖ S)(x, y) = min_{(a,b) ∈ S} { I(x + a, y + b) − S(a, b) }

The dilation of the grey-scale image I(x, y) is defined as:

(I ⊕ S)(x, y) = max_{(a,b) ∈ S} { I(x − a, y − b) + S(a, b) }

Based on the erosion and dilation operators, the opening and closing of the image I by the structuring element S are respectively defined as:

I ∘ S = (I ⊖ S) ⊕ S
I • S = (I ⊕ S) ⊖ S

The first step is to select structuring elements, which are regarded as matrices able to probe the shape of the image. The shape and size of the structuring element are chosen based on the condition of the image and the processing requirements. In this paper, we used a 5×5 square structuring element as input to the opening and closing morphological operators for filtering. The opened and closed images were then converted to binary images by thresholding, for the subsequent feature extraction and classification processes.
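The filtering step above can be sketched in pure NumPy for a flat (all-zero) structuring element, for which grey-scale erosion and dilation reduce to moving minimum and maximum filters. This is an illustrative sketch, not the paper's implementation; the threshold value of 30 and the toy image are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _window_op(img, k, op):
    # pad so the output keeps the input shape, then min/max over each k x k window
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    return op(sliding_window_view(padded, (k, k)), axis=(-2, -1))

def erode(img, k=5):    # shrinking transformation (moving-minimum filter)
    return _window_op(img, k, np.min)

def dilate(img, k=5):   # expansion transformation (moving-maximum filter)
    return _window_op(img, k, np.max)

def opening(img, k=5):  # erosion followed by dilation: removes small bright objects
    return dilate(erode(img, k), k)

def closing(img, k=5):  # dilation followed by erosion: fills small holes and gaps
    return erode(dilate(img, k), k)

# toy example: a plant-like bright region plus one isolated noise pixel
img = np.zeros((32, 32), dtype=np.uint8)
img[8:24, 8:24] = 200   # "plant" region
img[2, 2] = 255         # salt noise
filtered = closing(opening(img))          # 5x5 opening and closing, as in the paper
mask = (filtered > 30).astype(np.uint8)   # threshold to a binary image (value assumed)
```

The opening wipes out the isolated bright pixel while leaving the large region essentially intact, which is exactly the noise-filtering behaviour described above.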

Local Binary Pattern Operators
The LBP algorithm was introduced by Ojala et al. in 1996 [55], and the LBP operator has long been used to detect textures and objects in images. It is considered a robust texture descriptor for analysing images because of its ability to represent discriminative plant information and its computational efficiency [55]. It is also one of the best texture descriptors and has been effectively used in various applications; the potential and effectiveness of LBP have been demonstrated in identifying objects, recognizing faces and facial expressions, and classifying demographics. In this paper, the LBP operator is used for leaf description due to its effectiveness in pattern description.

The main limitation of the originally reported LBP operator was that it covered only a small 3×3 neighbourhood, thus failing to capture dominant textural features in images with large-scale structures. To overcome this drawback, the number of pixels and the radius of the circular neighbourhood have been increased [14]. Typically, it is more flexible and effective to enhance the performance of the LBP method by using textures at different scales. In general, the LBP code of a centre pixel (x_c, y_c) is calculated as follows [14]:

LBP_{P,R} = Σ_{p=0}^{P−1} s(g_p − g_c) 2^p,  with s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise

where g_c is the grey value of the central pixel and g_p (p = 0, …, P − 1) denotes the grey values of the circularly symmetric neighbourhood. P stands for the number of surrounding pixels in the circular neighbourhood and R is the spatial resolution (radius) of the neighbourhood. The thresholding function s(x) gives the LBP algorithm invariance against any monotonic illumination transformation. The probability distribution of the 2^P LBP patterns characterises the texture image.
These parameters of the LBP algorithm control how patterns are computed for each pixel of the input images.

Rotating an image produces different LBP codes. Therefore, LBP codes need to be rotated back to a reference position in order to cancel the effect of translating a pixel location, which would otherwise generate multiple versions of the same binary code. To address the image-rotation effect, a rotation-invariant LBP has been defined as follows [14, 56]:

LBP_{P,R}^{ri} = min{ ROR(LBP_{P,R}, i) | i = 0, 1, …, P − 1 }

where the function ROR(x, i) performs an i-step circular bit-wise right shift on the P-bit number x. The rotation-invariant LBP is formed by circularly rotating the basic LBP code and keeping only the rotationally unique patterns, which results in a significant reduction in feature dimensionality.

For uniform patterns, U(LBP_{P,R}) refers to the number of spatial transitions in the pattern, and the LBP_{P,R}^{u2} patterns have at most two bitwise transitions from 0 to 1 or vice versa. For a given pattern of P bits, the uniform descriptor produces P(P − 1) + 3 output bins, which consist of P(P − 1) + 2 bins for the distinct uniform patterns and a single bin assigned to all non-uniform patterns. To overcome the poor discrimination caused by the crude quantization of angular space at 45° intervals, the rotation-invariant uniform descriptor LBP_{P,R}^{riu2}, which has a U value of at most 2, is defined as follows [14]:

LBP_{P,R}^{riu2} = Σ_{p=0}^{P−1} s(g_p − g_c) if U(LBP_{P,R}) ≤ 2, and P + 1 otherwise

The other patterns are labelled "miscellaneous" and grouped into a single value. In the mapping from LBP_{P,R} to LBP_{P,R}^{riu2}, the number of output bins is P + 2. Correspondingly, the LBP_{8,1}^{riu2}, LBP_{16,2}^{riu2} and LBP_{24,3}^{riu2} operators have 10, 18 and 26 bins, respectively.
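The riu2 mapping above can be illustrated with a minimal NumPy sketch for P = 8, R = 1, where the circular neighbourhood coincides with the 8 adjacent pixels, so no interpolation is needed. This is a didactic sketch rather than the paper's code; libraries such as scikit-image provide an optimized equivalent (`local_binary_pattern(..., method='uniform')`).

```python
import numpy as np

def lbp_riu2_8_1(img):
    """Rotation-invariant uniform LBP (P=8, R=1) histogram sketch.

    For each interior pixel: threshold the 8 neighbours against the
    centre, count the 0/1 transitions U around the circle, and map
    uniform patterns (U <= 2) to the number of set bits, all others
    to the miscellaneous bin P + 1. The histogram has P + 2 bins.
    """
    P = 8
    # neighbour offsets in circular order around the centre pixel
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    hist = np.zeros(P + 2, dtype=int)   # P + 2 = 10 bins for P = 8
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gc = img[y, x]
            bits = [1 if img[y + dy, x + dx] >= gc else 0 for dy, dx in offs]
            # U: number of bitwise transitions in the circular pattern
            U = sum(bits[p] != bits[(p + 1) % P] for p in range(P))
            code = sum(bits) if U <= 2 else P + 1   # riu2 mapping
            hist[code] += 1
    return hist

# tiny test image: a bright 2x2 patch on a dark background
img = np.array([[10, 10, 10, 10],
                [10, 200, 200, 10],
                [10, 200, 200, 10],
                [10, 10, 10, 10]], dtype=int)
hist = lbp_riu2_8_1(img)
```

Each of the four bright interior pixels sees exactly three brighter-or-equal neighbours in one contiguous arc (U = 2, a uniform pattern), so all four counts fall in bin 3 of the 10-bin histogram.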

Support Vector Machines (SVM)
After the dominant features are extracted using the LBP method, the next stage is classification. There are several classification methods, including decision trees, SVM, neural networks, the k-nearest-neighbour method and the Bayesian classifier. SVM is one of the most efficient, owing to its high performance in many applications such as face recognition. The soft-margin SVM solves the optimization problem

min_{w,b,ξ} (1/2) w^T w + C Σ_{i=1}^{l} ξ_i

subject to the constraint y_i(w^T φ(x_i) + b) ≥ 1 − ξ_i, with ξ_i ≥ 0, i = 1, …, l.

According to Eq. (9), the training data are mapped into a higher-dimensional space by the function φ, and every constraint can be satisfied if ξ_i is sufficiently large. C > 0 is the regularization parameter, w is the weight vector and b is the bias. The SVM method generates an optimal hyperplane with the maximal margin between classes in the higher-dimensional space. A kernel function K(x_i, x_j) is represented as φ(x_i)^T φ(x_j), and two kernels, polynomial and radial basis function (RBF), are applied in this paper. The polynomial and RBF kernels, with kernel parameters γ, r and d, are given by [68]:

K(x_i, x_j) = (γ x_i^T x_j + r)^d, γ > 0   (polynomial)
K(x_i, x_j) = exp(−γ ||x_i − x_j||²), γ > 0   (RBF)

Kernel selection has long been a problem. In this paper, a study is conducted using independent test sets to compare the kernels and select the best one.
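The two kernels can be written directly in NumPy; the following is a minimal sketch not tied to any particular SVM library.

```python
import numpy as np

def poly_kernel(x, z, gamma=1e-5, r=0.0, d=2):
    """Polynomial kernel: K(x, z) = (gamma * x^T z + r)^d."""
    return (gamma * np.dot(x, z) + r) ** d

def rbf_kernel(x, z, gamma=1e-5):
    """RBF kernel: K(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0, 3.0])
z = np.array([1.5, 2.5, 3.5])
assert rbf_kernel(x, x) == 1.0                # identical vectors map to 1
assert rbf_kernel(x, z) == rbf_kernel(z, x)   # kernels are symmetric
```

In practice, a library implementation such as scikit-learn's `SVC` exposes these via `kernel='poly'` / `kernel='rbf'` with the parameters `C`, `gamma`, `degree` and `coef0`; which implementation the authors used is not stated in this excerpt.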

Data Collection
As mentioned in [43], all data were captured on the custom-built testing facility shown in Figure 1. The captured plant images have a spatial resolution of ≈1 mm/pixel and a size of 228×228 pixels, down-sampled by a factor of 2 from 456×456 pixels. The vertical height of the camera above the surface of the plant pots was 980 mm and the camera focal length was 9 mm.

Figure 1: A high-speed testbed system used for controlled data capture [43].

In this paper, we continue to use the "bccr-segset" dataset to compare the performance of the novel combination of the LBP algorithm and contour mask with coefficient k against that of the combined LBP operators reported in [43]. In addition, a new dataset of broadleaf images containing only canola and radish leaves was captured to objectively evaluate the detection capability of the proposed approach.

Method

In the previous paper [43], three different LBP operators were combined for plant classification. To overcome the limitation of the combined LBP operators, a novel method is proposed here.

To begin with, we input the "bccr-segset" dataset into the plant classification program. The dataset was processed in two branches: (i) the dataset was fed to the feature extraction block without applying the morphological operations, and (ii) the dataset underwent morphological opening and closing, generating contour masks with different thicknesses, as shown in Figure 3. To be more specific about the second branch, a 5×5 morphological filter was created to implement the morphological opening and closing on all plant images in the dataset. By selecting a threshold, grayscale images were converted into binary images to obtain better accuracy. All plant images were then masked with their contours.

Figure 4 illustrates an example of the process shown in the flowchart (Figure 3). Figure 4(a) shows an original canola leaf image and its three histograms corresponding to the LBP_{8,1}^{riu2}, LBP_{16,2}^{riu2} and LBP_{24,3}^{riu2} operators. The 9th, 17th and 25th bins of each operator have the highest level of the distribution of patterns.
The LBP-based canola leaf image and contour mask, the original histogram and the filtered histogram of the contour masks are shown in Figure 4(b), (c) and (d) for the LBP_{8,1}^{riu2}, LBP_{16,2}^{riu2} and LBP_{24,3}^{riu2} operators, respectively. With bin removal, the feature distribution in the other bins of the LBP histogram becomes much easier to observe. Interestingly, dominant features such as edge and corner patterns in the other bins can be seen clearly after removing the specific bins (the 9th, 17th and 25th) from the LBP histograms. Similarly, plant features in the histogram of the LBP-based contour mask with bin removal also show their significance. Note that the bin index of the LBP histogram in Figure 4, as calculated in the Python code, ranges from 0 to (P + 2) − 1, whereas the bin numbers mentioned in this paper run from 1 to P + 2. For example, the LBP_{8,1}^{riu2} operator has index range 0 to 9 but bin numbers 1 to 10.

Multiresolution analysis can be achieved by altering P and R of the LBP operators and then combining the operators. Figure 5(a-d) shows four different LBP histograms of a canola leaf image obtained by combining the three operators (LBP_{8,1}^{riu2}, LBP_{16,2}^{riu2} and LBP_{24,3}^{riu2}), eliminating the 9th, 27th and 53rd bins, applying the LBP method with contour masking, and removing the 9th, 27th and 53rd bins from the joint contour-mask histogram, respectively.
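As we read the fusion step, the joint 54-bin histograms (10 + 18 + 26 bins from the three operators) have their dominant 9th, 27th and 53rd bins removed, and the whole-plant features are combined with the contour-mask features weighted by the coefficient k. The sketch below follows that reading; the function names and the additive form of the fusion are our assumptions, not the paper's code.

```python
import numpy as np

def remove_bins(hist, bins_1based=(9, 27, 53)):
    """Delete the dominant bins (1-based numbering, as in the paper)
    from the 54-bin joint histogram of the three riu2 operators."""
    idx = [b - 1 for b in bins_1based]   # convert to 0-based indices
    return np.delete(hist, idx)

def k_flbpcm_features(h_plant, h_mask, k=0.2):
    """Hypothetical sketch of the k-FLBPCM fusion: filtered whole-plant
    LBP features plus k times the filtered contour-mask features."""
    return remove_bins(h_plant) + k * remove_bins(h_mask)

h_plant = np.arange(54, dtype=float)   # stand-ins for real joint histograms
h_mask = np.ones(54)
feat = k_flbpcm_features(h_plant, h_mask, k=0.5)   # 51-dimensional feature vector
```

Removing three bins from the 54-bin joint histogram leaves a 51-dimensional vector per image, which is what would then be fed to the SVM.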
After the feature extraction step, the plant images were classified using SVM kernels. Initially, 5-fold cross validation was used to divide the dataset into five subsets. Because the dataset covers different plant growth stages, the images at each growth stage were divided equally among the subsets as well. A single subset was used for testing while the remaining four subsets were used for training. The cross-validation process was applied five times, with the test subset changed each time; this procedure helps to prevent overfitting. After generating the training model with the RBF kernel in SVM and making predictions, the classification performance of the methods was calculated using metrics such as accuracy, precision, recall and F1-score.
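The growth-stage-balanced 5-fold split can be sketched as follows; the helper name and the use of NumPy's random generator are our own choices, not the paper's code.

```python
import numpy as np

def stratified_5fold(stage_labels, n_folds=5, seed=0):
    """Split image indices into folds so that each growth stage is
    represented equally in every fold (a sketch of the protocol)."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(n_folds)]
    for stage in np.unique(stage_labels):
        idx = np.where(stage_labels == stage)[0]
        rng.shuffle(idx)                                 # randomize within the stage
        for i, chunk in enumerate(np.array_split(idx, n_folds)):
            folds[i].extend(chunk.tolist())              # spread the stage over folds
    return folds

# toy labels: 20 images across 4 growth stages of 5 images each
stages = np.repeat([1, 2, 3, 4], 5)
folds = stratified_5fold(stages)
# each of the 5 folds then holds one image per growth stage;
# iterate over folds, testing on one and training on the other four
```

Each fold serves once as the test subset while the remaining four are pooled for training, matching the five iterations described above.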

The results are divided into two sections. The first section presents the average classification accuracies for the broadleaf classes, canola and radish. The effectiveness of the proposed k-FLBPCM method is evaluated with respect to feature extraction (by comparing the FLBP, FLBPbCM and k-FLBPCM methods), different SVM kernels (the second-order polynomial kernel and the RBF kernel), contour thickness, the LBP parameters P (the total number of neighbouring pixels) and R (the radius), as well as the coefficient k. In the second section, the parameters (C, gamma (γ), coefficient k and thickness) for the classification of all four classes in the "bccr-segset" dataset (canola, corn, radish and background) are optimized to obtain improved classification accuracy. The computer used in these experiments had a 3.4 GHz processor and 16 GB RAM, and ran Python 2.7.13.

Results of the k-FLBPCM, FLBPbCM and FLBP methods in classifying two different broadleaf plants
Canola and radish images were taken from the "bccr-segset" dataset. The training and test sets of the canola and radish classes consist of 15000 images (7500 images per class). After applying the FLBP, FLBPbCM or k-FLBPCM method, SVM was used to classify the two broadleaf classes, canola and radish. The classification accuracies of the second-order polynomial kernel and the RBF kernel were compared. In this experiment, C = 10, 60, γ = 10⁻⁵, 10⁻⁶ and thickness = 2 were selected; these were typical values of C and γ, chosen before any optimization had been performed.

The results of using the two SVM kernels (the second-order polynomial and RBF kernels) on the given dataset are summarised in Table 2. In particular, the average classification accuracy of the k-FLBPCM method (C = 10, γ = 10⁻⁵, k = 0.5 and 0.2) with the RBF kernel was 97.32%, followed by 96.40% for the k-FLBPCM method with coefficient k = 0.1. Meanwhile, the average classification accuracy of the k-FLBPCM method (C = 10, γ = 10⁻⁵, k = 0.5) with the second-order polynomial kernel was just 95.46%. Similarly, for the case (C = 60, γ = 10⁻⁶), the k-FLBPCM method was more accurate with the RBF kernel than with the polynomial kernel of degree 2. In addition, the FLBP method achieved a higher classification rate with the RBF kernel than with the polynomial kernel. As for the FLBPbCM method (C = 10, γ = 10⁻⁵), the RBF kernel achieved a classification accuracy of 94.07%, compared with 88.53% for the second-order polynomial kernel. These results show that the RBF kernel, which nonlinearly maps features into a higher-dimensional space, yields higher classification accuracy for all three methods (FLBP, FLBPbCM and k-FLBPCM).

A second experiment was conducted to investigate the effects of the hyper-parameters C and γ, as well as the coefficient k, on the classification accuracy for canola and radish images.
We chose the ranges of C, γ and coefficient k as follows: C = 1, 10, 30, 60, 100, 1000; γ = 10⁻⁴, 10⁻⁵, 10⁻⁶, 10⁻⁷; and k = 0.1, 0.2, 0.5, 0.7, 0.8, 1.0. As shown in Table 3, the k-FLBPCM method had the highest classification accuracy, averaged over the 5 folds of the cross validation, for the first pair (C = 30, γ = 10⁻⁵, thickness = 2, k = 0.2) and the second pair (C = 60, γ = 10⁻⁶, thickness = 2, k = 1), at 97.50%. In addition, the average classification accuracies of the k-FLBPCM method with different parameters were sorted from high to low; due to the large number of possible combinations, only the top 10 cases are listed in Table 3. Given the low accuracy obtained with γ = 10⁻⁴, the parameter γ should be less than 10⁻⁵ to improve the classification accuracy of the k-FLBPCM method. Although experiments were conducted with various coefficients k, this parameter should be less than or equal to 1: as shown in Figure 6, the average classification accuracies of the proposed k-FLBPCM method with k ≤ 1 were higher than those with k > 1.

To check the effectiveness of the k-FLBPCM method on a different dataset, a new set of canola and radish images at four different growth stages was collected and designated the "can-rad" dataset (published online). A total of 19600 broadleaf images (9800 images per class) were collected at four different growth stages. The parameters C = 10, 30, 60, 100, 1000, γ = 10⁻⁵, 10⁻⁶, and thicknesses from 1 to 8 were selected. Note that the SVM classifier was used only with the RBF kernel in the remaining experiments. Further, only the 10 highest classification accuracies for each method are listed in Tables 4-6, with the average classification accuracy scores sorted from high to low.
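The parameter sweep described above can be organized as a simple grid search. In the sketch below, the scoring function is only a placeholder, rigged to peak at the optimum the paper reports (C = 30, γ = 10⁻⁵, k = 0.2) so the example is self-contained; in the real pipeline it would be the 5-fold cross-validated SVM accuracy.

```python
import itertools

# parameter grid from the experiments
Cs = [1, 10, 30, 60, 100, 1000]
gammas = [1e-4, 1e-5, 1e-6, 1e-7]
ks = [0.1, 0.2, 0.5, 0.7, 0.8, 1.0]

def cv_accuracy(C, gamma, k):
    """Placeholder score: a real implementation would train the RBF-SVM
    with 5-fold cross validation here and return the mean accuracy.
    This stand-in simply peaks at (C=30, gamma=1e-5, k=0.2)."""
    return -abs(C - 30) * 1e-4 - abs(gamma - 1e-5) - abs(k - 0.2)

# exhaustive search over all C x gamma x k combinations
best = max(itertools.product(Cs, gammas, ks),
           key=lambda p: cv_accuracy(*p))
```

The same loop structure covers the later four-class sweep; only the grid values and the scoring function change.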
As can be seen from Table 4 and Table 5, the classification accuracy of the FLBP method was 95.13% with C = 100 and γ = 10⁻⁶, while that of the FLBPbCM method was 93.95%, lower than the FLBP method. However, when the FLBP and FLBPbCM methods were combined (in the k-FLBPCM method), the classification accuracy was significantly higher: Table 6 shows that the highest average classification accuracy of the k-FLBPCM method was 96.21%.

Effects of the contour thickness on the classification accuracy
Next, we evaluated the average classification accuracy of the k-FLBPCM method for varying thicknesses of the contour lines. The "can-rad" dataset was used for this investigation. We selected C = 10, 30, 100, γ = 10⁻⁵, coefficient k = 0.5 and thicknesses from 1 to 8, as can be seen in Figure 7.

The plants were grown through four growth stages, as illustrated in Figure 9. The number of plant images in each class and each growth stage is indicated in Figure 9 [43]. The k-FLBPCM method again achieved the highest accuracies among all compared methods, confirming the results obtained on the "can-rad" dataset.

In order to find optimal (C, γ) pairs, we investigated the following parameter ranges: C = 1, 10, 30, 60, 100, 1000; γ = 10⁻⁵, 10⁻⁶; k = 0.1, 0.2, 0.5, 0.8, 1; and a thickness of 2. Only the 10 highest classification accuracies of the k-FLBPCM method are listed in Table 8. The method attained its highest classification accuracy of 98.63% with C = 30, γ = 10⁻⁵ and coefficient k = 0.2.

The k-FLBPCM method can classify plant images under different conditions, as shown with our two datasets, and improves on the classification accuracies achieved previously [43]. In particular, there is a significant improvement in performance when LBP features are combined with a contour-based mask: the average classification accuracies of the k-FLBPCM method have increased over the previously described method by up to 6.78% [45].

The F1-score results for each class are given in Table 9. In particular, the F1 scores of the k-FLBPCM method increased significantly to 97.40% for both canola and radish, from 84.41% and 83.43% respectively, obtained with the combined LBP operators in the previously published paper [45]. In addition, the testing time (milliseconds/image) of the k-FLBPCM method was shorter than that of the combined LBP method [45].
With the aim of reducing misclassification, we investigated the misclassified images through visual inspection, as shown in Figure 10. The first-stage plants (Figure 10 (a), (b) and (c)) appear to have been misclassified due to close morphological similarities. In addition, deformity of the leaves and stems, especially arising from perspective distortions (Figure 10 (e), (f)) and leaf diseases (Figure 10