Accurate Weed Mapping and Prescription Map Generation Based on Fully Convolutional Networks Using UAV Imagery

Chemical control is necessary to control weed infestation and to ensure rice yield. However, the excessive use of herbicides has caused serious agronomic and environmental problems. Site-specific weed management (SSWM) recommends an appropriate dose of herbicides according to the weed coverage, which may reduce the use of herbicides while enhancing their chemical effects. In the context of SSWM, a weed cover map and a prescription map must be generated in order to carry out accurate spraying. In this paper, high-resolution unmanned aerial vehicle (UAV) imagery was captured over a rice field. Different workflows were evaluated to generate the weed cover map for the whole field. A fully convolutional network (FCN) was applied for pixel-level classification. Theoretical analysis and practical evaluation were carried out to seek an architecture improvement and a performance boost. A chessboard segmentation process was used to build the grid framework of the prescription map. The experimental results showed that the overall accuracy and mean intersection over union (mean IU) for weed mapping using FCN-4s were 0.9196 and 0.8473, respectively, and the total time (including data collection and data processing) required to generate the weed cover map for the entire field (50 × 60 m) was less than half an hour. Different weed thresholds (0.00–0.25, with an interval of 0.05) were used for the prescription map generation. High accuracies (above 0.94) were observed for all of the threshold values, and the corresponding herbicide savings ranged from 58.3% to 70.8%. The experimental results demonstrated that the method used in this work has the potential to produce an accurate weed cover map and prescription map in SSWM applications.


Introduction
Chemical control is necessary to control weed infestation and to ensure rice production [1]. Traditionally, the chemical control strategy has three steps: a single pre-emergence herbicide application, a post-emergence herbicide treatment, and an optional late post-emergence chemical spray [2].
Data collection was carried out on 2nd and 10th October 2017, when the rice and weeds were both in their early growth stages and herbicide treatment was recommended. A Phantom 4 (SZ DJI Technology Co., Ltd., Shenzhen, China) was used for data collection, and a 50 × 60 m plot was delimited in order to perform the flights. During the experiments, the flight altitude was set to 10 m, with a spatial resolution of 0.5 cm per pixel. Image sequences with 70% forward-lap and 60% side-lap were collected to cover the entire experimental plot. On 2nd and 10th October, 54 and 50 images (3000 × 4000 pixels) were collected, respectively.

Dataset Preparation
Image mosaicking is an important step prior to image analysis [4]. In this work, the collected images were stitched together to form the ortho-mosaicked imagery using the Photoscan software [25]. However, the ortho-mosaicked imagery is usually quite large (14,000 × 13,000 pixels in our work), which makes it difficult to process with limited CPU and GPU memory. To address this problem while retaining the original spatial resolution, we split the ortho-mosaicked imagery into small patches (1000 × 1000 pixels), similar to the work of Zhang et al. [26]. Following this strategy, the datasets D02-1 and D10-1 were generated from the ortho-mosaicked imagery obtained on 2nd and 10th October 2017. In addition, we directly split the collected images into small patches (1000 × 1000 pixels), which generated the datasets D02-2 and D10-2, as shown in Table 1. For each image in the dataset, the corresponding ground truth (GT) label data was produced by careful manual labeling. With the high spatial resolution of UAV imagery, the weed-crop discrimination can be visually assessed, making it feasible to manually label the imagery at the pixel level. Thus, each sample in the dataset is one image-GT label pair, and the GT label is used as the standard when evaluating the performance of the classifiers. Three image-GT label pairs are illustrated in Figure 2.
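The patch-splitting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `split_into_patches` is hypothetical, and zero-padding of incomplete edge tiles is an assumption, since the text does not say how image borders were handled.

```python
import numpy as np

def split_into_patches(image, patch_size=1000):
    """Yield (row_index, col_index, patch) tiles of size patch_size x patch_size.

    Edge tiles that would be smaller than patch_size are zero-padded
    (an assumption made for this sketch).
    """
    height, width = image.shape[:2]
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            tile = image[top:top + patch_size, left:left + patch_size]
            pad_h = patch_size - tile.shape[0]
            pad_w = patch_size - tile.shape[1]
            if pad_h or pad_w:
                # Pad only the two spatial axes; leave any channel axis untouched.
                pad = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
                tile = np.pad(tile, pad, mode="constant")
            yield top // patch_size, left // patch_size, tile

# Small demonstration on a 5 x 7 RGB array with 3 x 3 tiles.
demo = np.arange(5 * 7 * 3).reshape(5, 7, 3)
tiles = list(split_into_patches(demo, patch_size=3))
print(len(tiles))  # 2 rows x 3 columns of tiles
```

With 1000 × 1000 patches, a 14,000 × 13,000 mosaic gives 14 × 13 = 182 tiles and a 3000 × 4000 image gives 3 × 4 = 12 tiles, which is consistent with the sample counts reported for the datasets in the experiments.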

Methodology
In this work, two different workflows were applied to produce the weed cover map for the whole field, and their performance was evaluated and compared. A fully convolutional network (FCN) was employed for the pixel-level classification. Finally, the chessboard segmentation method was used to produce the prescription map based on the weed cover map.

Workflow
Two different workflows were adopted as candidates, as shown in Figure 3. The first workflow conducted the mosaicking operation to generate the ortho-mosaicked imagery for the whole field, and then performed a per-pixel classification to create the weed cover map. However, some sections of the ortho-mosaicked imagery were blurred, which made them difficult to distinguish and may cause misclassification during the classification stage. The second workflow therefore applied the labeling process directly to the collected imagery, which may avoid the ambiguous pixels in the classification stage. After that, the mosaicking process was conducted on the classification results, using the geo-information in the collected imagery. All of the mosaicking operations were performed using the Photoscan software.
The workflows were evaluated for accuracy and efficiency. The accuracy was evaluated by the overall accuracy and the mean intersection over union (mean IU) [16], and the time efficiency was measured using the total time required to generate the weed cover map, including data collection and data processing.
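As an illustration of the two accuracy metrics named above, the following sketch computes overall accuracy and mean IU from a confusion matrix, using the standard definitions (correctly classified pixels over all pixels, and the per-class intersection over union averaged over classes). The function names and the class indexing (0, 1, 2) are assumptions made for this example.

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    gt = np.asarray(gt).ravel().astype(np.int64)
    pred = np.asarray(pred).ravel().astype(np.int64)
    index = gt * num_classes + pred
    counts = np.bincount(index, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def overall_accuracy(cm):
    """Fraction of pixels whose predicted class equals the ground truth."""
    return np.trace(cm) / cm.sum()

def mean_iu(cm):
    """Mean over classes of TP / (TP + FP + FN); absent classes are skipped."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    valid = union > 0
    return (tp[valid] / union[valid]).mean()

# Toy example with three classes (e.g., rice, weeds, others).
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(gt, pred, num_classes=3)
print(overall_accuracy(cm), mean_iu(cm))
```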

Semantic Labeling
Classical FCN-8s was proven to be effective for weed mapping of UAV imagery [21], outperforming the traditional methods in terms of accuracy and efficiency. In this work, we sought a network architecture that would bring about a further performance improvement.
Similar to the network architecture of classical FCN-8s [16], an ImageNet pre-trained convolutional neural network (CNN) [27] was adapted into a fully convolutional network and was transferred to our study using fine-tuning. In addition, two modifications were made to the baseline architecture of FCN-8s. (1) In the previous experiments on the skip architecture [21], it was shown that fusing the prediction results (fc8) with the shallow layer pool4 can effectively increase the prediction accuracy, as shown in Figure 4a, whereas fusion with other shallow layers brings no performance boost. This result indicated that the information from pool4 is crucial for the classification task, so that fusion with this layer can compensate for the information loss caused by the downsampling operation. However, this strategy cannot fully address the problem, which resulted in low precision and blurred edges in the classification result [26]. Based on this result, the skip architecture and the last pooling operation (pool5) were removed so as to avoid losing the information of pool4. (2) The original network was designed for the dataset of the PASCAL VOC 2011 segmentation challenge [28], which has 21 classes (20 object categories plus background). However, our dataset only has three categories (rice, weeds, and others). According to the work of Stathakis et al. [29],
Besides the modified FCN-4s, the classical FCN-8s [21] and Deeplab [22] were also applied and evaluated for comparison. For FCN-8s, an ImageNet pre-trained CNN [27] was applied as the baseline architecture. The final classification layer was removed, and all of the fully connected layers were converted to convolutions. A skip architecture was built to improve the prediction precision: the lower layers (pool4 and pool5) were fused with the higher layer (fc8), as shown in Figure 4a. For the Deeplab approach, a 101-layer ResNet [30] was adapted into fully convolutional form, similar to the approach of FCN-8s. The weights pre-trained on ImageNet [30] were transferred to our dataset using fine-tuning. Atrous convolution [17] was applied to extend the field of view (FOV) of the convolutional filters, and the fully connected conditional random field (CRF) [31] was used to further improve the prediction accuracy.
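To make the effect of atrous convolution on the field of view concrete, the sketch below shows a one-dimensional atrous (dilated) convolution in plain Python. It relies on the standard relation that a kernel of size k with dilation rate r spans k + (k − 1)(r − 1) input positions. The function names and the example rate are illustrative assumptions and are not taken from the Deeplab configuration used in the experiments.

```python
def effective_kernel_size(kernel_size, rate):
    """Number of input positions spanned by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)

def atrous_conv1d(signal, kernel, rate):
    """Valid-mode 1D correlation with gaps of (rate - 1) between kernel taps."""
    span = effective_kernel_size(len(kernel), rate)
    output = []
    for start in range(len(signal) - span + 1):
        value = sum(w * signal[start + j * rate] for j, w in enumerate(kernel))
        output.append(value)
    return output

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

# With rate=1 the kernel covers 3 samples; with rate=2 it covers 5 samples
# while still using only 3 weights.
print(effective_kernel_size(3, 1), atrous_conv1d(signal, kernel, rate=1))
print(effective_kernel_size(3, 2), atrous_conv1d(signal, kernel, rate=2))
```

The wider span comes at no extra parameter cost, which is why the technique is used to enlarge the context seen by each filter without further downsampling.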
In this section, the accuracy was evaluated by the overall accuracy and mean intersection over union (mean IU) [16], similar to Section 3.1. However, the time efficiency was also measured using the processing time for one single image, which is the normal way for the evaluation adopted by most semantic labeling approaches [16,17].

Prescription Map Generation
The prescription map can be generated from the weed cover map. According to the work of López-Granados [4], a chessboard segmentation process was applied to build a grid framework of the prescription map. The weed cover map was split into small grids, and the comparison between the weed coverage of each grid and a given threshold was conducted: if the weed coverage of the grid is larger than the threshold value, it will be marked as a treatment area, otherwise it will be marked as a non-treatment area. The grid size is adjustable according to the different spraying machines, and it was set to 0.5 × 0.5 m in this work, in accordance with the site-specific sprayer [32].
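The grid-and-threshold procedure described above can be sketched as follows. At 0.5 cm per pixel, a 0.5 × 0.5 m grid corresponds to 100 × 100 pixels. The function name `prescription_map`, the label value used for weeds, and the cropping of incomplete cells at the map border are assumptions made for this illustration.

```python
import numpy as np

WEED_LABEL = 1  # assumed class id for weeds in the weed cover map

def prescription_map(weed_cover, cell_px=100, threshold=0.0, weed_label=WEED_LABEL):
    """Return (treatment, coverage) arrays with one entry per grid cell.

    treatment[i, j] is True when the weed coverage of cell (i, j) is
    strictly larger than the threshold. Incomplete border cells are
    cropped (an assumption of this sketch).
    """
    rows = weed_cover.shape[0] // cell_px
    cols = weed_cover.shape[1] // cell_px
    cropped = weed_cover[:rows * cell_px, :cols * cell_px]
    is_weed = cropped == weed_label
    # Group pixels into cells, then average within each cell.
    cells = is_weed.reshape(rows, cell_px, cols, cell_px)
    coverage = cells.mean(axis=(1, 3))
    return coverage > threshold, coverage

# Toy 4 x 4 weed cover map split into 2 x 2 pixel cells.
toy = np.array([
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 2],
])
treatment, coverage = prescription_map(toy, cell_px=2, threshold=0.0)
print(coverage)
print(treatment)
```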
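For this section, the accuracy was calculated by comparing two prescription maps (one generated from the weed cover map output by our algorithm, and the other from the GT label), which can be given by the following:

$$\text{accuracy} = \frac{\text{number of correctly classified grids}}{\text{total number of grids}} \qquad (1)$$

where a grid is counted as correctly classified if it receives the same label (treatment or non-treatment) in both maps.
For each prescription map, its corresponding herbicide saving was calculated. According to the work of de Castro [8], herbicide saving is calculated in terms of the non-treatment area, which can be given by the following:

$$\text{herbicide saving} = P_{\text{non}} = 1 - P_{\text{treatment}} \qquad (2)$$

where $P_{\text{non}}$ and $P_{\text{treatment}}$ represent the proportions of the non-treatment and treatment areas, respectively.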
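The two quantities in this section can be computed directly from boolean treatment grids. The sketch below assumes that accuracy is the fraction of grid cells on which the predicted and GT-derived prescription maps agree, and takes herbicide saving as one minus the treated proportion, as in Equation (2). The function names are hypothetical.

```python
import numpy as np

def prescription_accuracy(predicted, reference):
    """Fraction of grid cells with the same treatment decision in both maps."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    if predicted.shape != reference.shape:
        raise ValueError("prescription maps must have the same grid shape")
    return float((predicted == reference).mean())

def herbicide_saving(treatment):
    """Proportion of non-treatment cells, i.e. 1 - P_treatment."""
    return 1.0 - float(np.asarray(treatment, dtype=bool).mean())

# Toy 2 x 2 grids: True marks a treatment cell.
predicted = [[True, False], [True, True]]
reference = [[True, False], [False, True]]
print(prescription_accuracy(predicted, reference))
print(herbicide_saving(predicted))
```

In this toy case the predicted map treats one cell more than the reference, so the accuracy drops while the saving of the predicted map is lower than that of the reference.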

Results and Discussions
In this section, the experiments on workflows, semantic labeling, and prescription map generation are presented. In the experiments on workflows and semantic labeling approaches, the dataset was divided into training, validation, and testing sets, which were used for parameter updating, hyperparameter tuning, and performance evaluation, respectively. All of the experiments were conducted on a computer with an Intel i7-7700 CPU and an NVIDIA GTX 1080 Ti GPU. During the process of weed mapping, the mosaicking operation was carried out on the CPU, while the semantic labeling approaches were performed on the GPU.

Workflow
In this section, two workflows (mosaicking-labeling and labeling-mosaicking) were applied in order to generate the weed cover map for the whole field. For the workflow of mosaicking-labeling, the dataset D02-1 (182 samples) was adopted as the training set. From dataset D10-1, 30% was randomly selected as the validation set (54 samples), and the remaining samples in D10-1 (128 samples) were used as the testing set. There were two reasons for this choice, namely: (1) the training set and testing set were chosen from different dates, which allows the generalization capability of the algorithm to be evaluated, and (2) the validation set and testing set were selected from the same date, which helps to ensure that the two sets belong to the same distribution. For the workflow of labeling-mosaicking, the dataset D02-2 (648 samples) was used as the training set, and 30% of dataset D10-2 (180 samples) was randomly selected as the validation set. However, we still used the testing set of the previous workflow (mosaicking-labeling) as the testing set of this workflow (labeling-mosaicking), since the classification of the ortho-mosaicked imagery is the ultimate objective of our algorithm.
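The random 30% validation split described above can be sketched as follows. The function name, the fixed seed, and the choice to truncate the validation count to an integer are assumptions for illustration; the text only states that 30% was randomly selected.

```python
import random

def split_validation_test(sample_ids, val_fraction=0.3, seed=0):
    """Randomly split sample ids into (validation, test) lists."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_val = int(len(ids) * val_fraction)  # truncation is an assumption
    return ids[:n_val], ids[n_val:]

# 182 patches, as in a 14 x 13 tiling of the ortho-mosaicked imagery.
validation, test = split_validation_test(range(182))
print(len(validation), len(test))
```

With 182 samples and truncation, this gives 54 validation samples and 128 testing samples, matching the counts reported for D10-1.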
In this section, the classical FCN-8s was used for the semantic labeling tasks. The quantitative results are listed in Table 2. From Table 2, it can be seen that both workflows obtained similar accuracy. However, because of the high overlap in the collected imagery, directly processing the collected imagery introduced too much redundant computation, which significantly lowered the inference speed. From this perspective, the workflow of mosaicking-labeling is the optimal solution, and it is used as the default framework for the following experiments.

Table 2. Experimental results of different workflows. The speed was measured using the total time required to generate the weed cover map for the whole field, including data collection and data processing. Mean IU: mean intersection over union.
Workflow | Overall Accuracy | Mean IU | Speed

Semantic Labeling
In this section, the dataset (training, validation, and testing sets) was the same as that of the mosaicking-labeling workflow (Section 4.1). FCN-8s, Deeplab, and our modified FCN-4s were applied to our dataset. The quantitative results and confusion matrices of the different approaches are shown in Tables 3 and 4. From Table 3, it can be seen that Deeplab and FCN-4s outperformed FCN-8s in accuracy. From Table 4, it can be seen that the weed recognition rate of Deeplab and FCN-4s is above 0.90, which is higher than that of FCN-8s. There are two possible reasons for this result, namely: (1) Deeplab used the CRF to refine the spatial details, which increased the prediction accuracy, and (2) FCN-4s removed the last pooling layer, which reduced the information loss and yielded a performance boost.

Table 3. Experimental results on different semantic labeling approaches. Speed-1 was measured using the inference time for a single image (1000 × 1000 pixels), and speed-2 was measured using the total time required to generate the weed cover map for the whole field, including data collection and data processing. FCN: fully convolutional networks.

Although the Deeplab method achieved a satisfactory result for accuracy, the CRF introduced too much computation, which significantly slowed down the inference speed (Table 3). Therefore, it can be concluded that FCN-4s strikes the best tradeoff between accuracy and efficiency. From Table 3, it can be seen that the total time (including data collection and data processing) needed to generate the weed cover map for the entire field (50 × 60 m) using FCN-4s is less than half an hour, demonstrating its rapid response capability for weed infestation monitoring.
The weed cover maps generated by the different approaches are shown in Figure 5. From Figure 5, it can be seen that (1) the weeds (in yellow dashed lines) were misclassified as others by FCN-8s, while they were properly recognized by Deeplab and FCN-4s; and (2) the rice (in blue dashed lines) was misclassified as weeds by FCN-8s, while it was well classified by Deeplab and FCN-4s. From the qualitative results of Figure 5, it can be concluded that FCN-4s obtained a satisfactory result with a simplified architecture in an end-to-end mode, which required no post-processing.

Prescription Map Generation
The prescription map can be generated from a weed cover map with a given weed threshold. According to the experimental results in Table 3, the weed cover map obtained by FCN-4s was used to generate the prescription map. For a given weed threshold, a grid with a higher weed coverage is marked as a treatment area. In this section, six thresholds (0.00–0.25, with an interval of 0.05) were evaluated. The accuracy using different weed thresholds is shown in Figure 6. From Figure 6, it can be seen that the accuracy consistently increases with increasing threshold values. The reason for this result is that large weed patches were easier for the classifiers to detect, thus resulting in a higher accuracy. High accuracies (above 0.94 for all thresholds) were observed in Figure 6, demonstrating that our algorithm is suitable for treatment area prediction. The treatment area and herbicide saving with different weed thresholds were calculated and are shown in Table 5. From Table 5, it can be seen that the treatment area consistently decreases as the weed threshold increases. The corresponding herbicide saving ranges from 58.3% to 70.8%, demonstrating great potential to reduce the use of herbicides in SSWM applications. From a practical perspective, a threshold of 0.0 would be recommended as the optimal weed threshold for SSWM applications. There are two reasons for this choice, namely: (1) the accuracy (above 0.94) at this threshold is adequate and the corresponding herbicide saving (58.3%) is acceptable, and (2) this threshold would minimize the risk of missing weed infestation, which may cause weed-crop competition.
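The threshold sweep described above can be illustrated with a short sketch. The per-grid coverage values below are synthetic and serve only to show the mechanics; they are not data from the experiments, and the function name `sweep` is hypothetical.

```python
import numpy as np

# Hypothetical per-grid weed coverage (fractions in [0, 1]); not data from the paper.
coverage = np.array([0.0, 0.0, 0.02, 0.07, 0.12, 0.30, 0.0, 0.22])

# Six thresholds from 0.00 to 0.25 with an interval of 0.05,
# built from integers and rounded to avoid floating-point drift.
thresholds = [round(i * 0.05, 2) for i in range(6)]

def sweep(coverage, thresholds):
    """Return (threshold, treatment fraction, herbicide saving) per threshold."""
    results = []
    for t in thresholds:
        treated = coverage > t  # a grid is treated if its coverage exceeds t
        treatment_fraction = float(treated.mean())
        results.append((t, treatment_fraction, 1.0 - treatment_fraction))
    return results

for t, treatment_fraction, saving in sweep(coverage, thresholds):
    print(f"threshold={t:.2f}  treatment={treatment_fraction:.3f}  saving={saving:.3f}")
```

Because a grid treated at a higher threshold is also treated at every lower one, the treatment fraction can only stay constant or fall as the threshold rises, which is the trend reported for the treatment area.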
The prescription maps generated with different thresholds are illustrated in Figure 7. From Figure 7, it can be seen that changes in the threshold value have little influence on the areas with a high weed coverage (in blue dashed lines), as the weed coverage of these areas is higher than all of the threshold values. However, for the areas with a lower weed coverage (in yellow dashed lines), the weed threshold can effectively adjust the treatment areas, as the areas with a weed coverage lower than the threshold are ignored. The prescription maps generated by our method (with all thresholds) generally correspond to those generated from the GT label, thanks to the high accuracy of the output weed cover map. From Figure 7, it can also be seen that the treatment areas were overestimated for all of the thresholds. However, from an agronomic perspective, this is acceptable, as it reduces the risk of allowing weeds to go untreated [33].

Conclusions
Prescription maps can provide decision-making support for the spraying machine, which may effectively reduce the use of herbicide while enhancing the chemical effects. In this paper, a study on weed mapping and prescription map generation was conducted using UAV imagery. (1) UAV imagery over a rice field was captured at a high spatial resolution, and pre-processing was performed to generate our dataset. (2) Two different workflows (mosaicking-labeling and labeling-mosaicking) were applied in order to generate the weed cover maps. These workflows were evaluated and compared. The experimental results showed that the mosaicking-labeling workflow was more efficient than the labeling-mosaicking workflow at a similar accuracy. (3) A modified FCN-4s introduced pixel-to-pixel translation from UAV imagery to weed cover maps. Theoretical analysis was conducted to seek an architecture improvement. The improved architecture was evaluated and compared with the classical FCN-8s and Deeplab. The experimental results showed that the modified FCN-4s outperformed the others in both accuracy and efficiency. (4) A chessboard segmentation method was used to build the grid framework of the prescription map. Different weed thresholds were applied and evaluated. High accuracies (above 0.94) were observed for all of the thresholds, and the corresponding herbicide savings ranged from 58.3% to 70.8%. The method applied in this paper was efficient: it can produce a prescription map for a rice field (50 × 60 m) within half an hour, demonstrating its rapid response capability to the emergence of weed infestation.
However, for the study of weed mapping and prescription map generation, more data is needed to extend and evaluate the generalization capability of the algorithm. Besides rough weed recognition, classification for specific weed species is also important for the SSWM applications, which can be extended based on our current work. All of these issues will be left as our future work.
