GobhiSet: Dataset of raw, manually, and automatically annotated RGB images across phenology of Brassica oleracea var. Botrytis

This research introduces an extensive dataset of unprocessed aerial RGB images and orthomosaics of Brassica oleracea crops, captured via a DJI Phantom 4. The publicly accessible dataset comprises 244 raw RGB images, acquired over six distinct dates in October and November 2020, as well as six orthomosaics from an experimental farm located in Portici, Italy. The images, uniformly distributed across crop spaces, have undergone both manual and automatic annotation to facilitate the detection, segmentation, and growth modelling of crops. Manual annotations were performed using bounding boxes via the Visual Geometry Group Image Annotator (VIA) and exported in the Common Objects in Context (COCO) segmentation format. The automated annotations were generated using a Grounding DINO + Segment Anything Model (SAM) framework, guided by YOLOv8x-seg weights obtained by training on manually annotated images dated 8 October, 21 October, and 29 October 2020. The automated annotations were archived in Pascal Visual Object Classes (PASCAL VOC) format. Seven classes, designated Row 1 through Row 7, have been identified for crop labelling. Additional attributes, such as individual crop ID and the repetitiveness of individual crop specimens, are delineated in the Comma Separated Values (CSV) version of the manual annotation. This dataset not only furnishes annotation information but also assists in the refinement of various machine learning models, thereby contributing significantly to the field of smart agriculture. The transparency and reproducibility of the processes are ensured by making the code used accessible. This research marks a significant stride in leveraging technology for vision-based crop growth monitoring.



Value of the Data
• This dataset is a collection of multi-date aerial imagery of the Brassica oleracea var. Botrytis crop [1]. The images were acquired between the first and seventh weeks after sowing the cauliflower, with the intention of observing its growth over this period. The images carry two types of annotations: manual bounding boxes that encompass the crops along with the sub-soil area, and automated annotations that span only the canopy of individual cauliflower specimens. This data is valuable for monitoring growth across different dates.
• The dataset contains raw and annotated aerial RGB image data, broadly classified into 5 subdirectories according to the nature of the annotations and the operations performed over the cauliflower crops. The manual annotations are available in COCO segmentation format [2] and the automatic annotations in PASCAL VOC format [3].
• Manually annotated images dated 8 October, 21 October, and 29 October 2020 were used to train YOLOv8x-seg [4], which in turn guided GroundedSAM (Grounding DINO + SAM) in generating canopy-bound annotations over cauliflower crops across all dates.
• The cauliflowers contained in the segmentation masks of the aerial images and orthomosaics are classified into 7 crop rows. The availability of row-wise segmentation masks facilitates both intra-date and inter-date class comparisons of cauliflower growth across these two modalities.
• Automated annotation was based on a transfer learning approach [5], where the reference point for training the GroundedSAM [6] annotator was the annotated imagery of the preceding date.
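As a minimal sketch of how the COCO-format manual annotations described above could be read, the snippet below groups bounding boxes by crop-row class using only the standard library. The field names follow the COCO specification; the embedded record is illustrative and does not contain actual dataset values.

```python
import json
from collections import defaultdict

# Illustrative COCO-style annotation snippet (made-up values, not from GobhiSet):
# the seven categories correspond to crop rows, annotations carry bounding boxes.
coco_text = json.dumps({
    "categories": [{"id": i, "name": f"Row {i}"} for i in range(1, 8)],
    "images": [{"id": 1, "file_name": "DJI_0001.JPG"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [120, 80, 64, 60]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [300, 85, 70, 66]},
        {"id": 3, "image_id": 1, "category_id": 2, "bbox": [118, 210, 62, 58]},
    ],
})

coco = json.loads(coco_text)
names = {c["id"]: c["name"] for c in coco["categories"]}

# Group bounding boxes by crop-row class name.
boxes_per_row = defaultdict(list)
for ann in coco["annotations"]:
    boxes_per_row[names[ann["category_id"]]].append(ann["bbox"])

print({row: len(b) for row, b in boxes_per_row.items()})
# → {'Row 1': 2, 'Row 2': 1}
```

The same grouping applied to the real annotation files would yield per-row crop counts for each imaging date.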

Objective
The primary motivation for compiling this dataset was to create a repository containing raw, manually, and automatically annotated pixel information about the crop of interest, B. oleracea var. Botrytis, also known as cauliflower, captured via aerial images and orthomosaics [1]. The underlying metadata can be utilized to improve growth modelling and monitoring of cauliflowers across different dates, as well as for intra-date comparison among different crop rows, making use of the two modalities: aerial images and orthomosaics. Automatic annotations play a vital role in agricultural imagery for deep learning by enabling efficient and accurate analysis at scale, contributing to the advancement of precision agriculture [7]. Annotating the exact periphery of a crop in aerial images is costly and time-consuming. Bounding box-based annotation may include other objects or free space in the background, potentially reducing the accuracy of the datasets. This can be particularly problematic in agriculture, where precision is crucial for tasks such as acreage estimation, disease mapping, growth monitoring [8], and precision spraying [9]. To perform automatic crop-bound annotations, we trained a YOLOv8x-seg model [4] before further tuning the Grounding DINO [10] + SAM [11] deep learning architecture, also known as GroundedSAM [6]. GroundedSAM uses Grounding DINO as an open-set object detector, enabling the detection and segmentation of crop regions based on arbitrary text inputs. This aids in refining the bounding box annotations into annotations bound to the periphery of the crop. Based on the morphological differences observed in the segmentation masks of this multi-class annotated imagery, the growth of cauliflowers across different dates can be calculated. The same rate of growth can also be calculated using orthomosaics. In this manner, the discrepancy in growth over different dates between these two types of imagery can be modelled.
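The point about background pixels inside bounding boxes can be quantified with a small synthetic example. The sketch below, which assumes a circular canopy as a stand-in for a cauliflower's leaf-bound mask, computes how much of a tight bounding box is background rather than crop; the values are illustrative, not measured from the dataset.

```python
import numpy as np

# Synthetic canopy mask: a filled circle standing in for one cauliflower's
# leaf-bound segmentation, placed inside its tight axis-aligned bounding box.
h = w = 100
yy, xx = np.mgrid[0:h, 0:w]
canopy = ((yy - 50) ** 2 + (xx - 50) ** 2) <= 40 ** 2  # radius-40 "crop"

# Tight bounding box of the mask (what a box annotation would cover).
ys, xs = np.where(canopy)
y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
box_area = (y1 - y0) * (x1 - x0)
crop_area = int(canopy.sum())

background_fraction = 1 - crop_area / box_area
print(f"{background_fraction:.1%} of the box is background")
# Continuous limit for a circle in its tight box: 1 - pi/4 ≈ 21%
```

Even in this best case of a perfectly round canopy and a tight box, roughly a fifth of the annotated pixels are background, which is why the periphery-bound GroundedSAM annotations are preferred for pixel-level growth metrics.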

Data Description
Our dataset is publicly available in a Mendeley data library [12]. Fig. 1 shows the schema of the dataset. The cauliflower dataset 'GobhiSet' consists of 5 products: raw RGB images, manual annotations, automated annotations, binary masks extracted from the automated annotations, and Python scripts to perform binary mask extraction (for specific crop rows or whole images) and automated annotation based on bounding boxes. The 'Raw images' subdirectory consists of multi-date aerial RGB images of the cauliflower farm, captured in JPG format. The imaging was performed on 8 October, 21 October, 29 October, 11 November, 18 November, and 25 November 2020, and the subdirectory consists of 244 images in addition to 6 multi-date orthomosaics (Fig. 2) in JPG format. The subdirectory 'Manual Annotations' consists of six multi-date annotation files (3 COCO segmentation, 3 CSV) titled with the dates on which the imaging was performed; the manual annotations were performed for the first three dates only. The COCO annotation file [2] contains details about the extent of the bounding box used for annotation, whereas the CSV annotation file contains each crop specimen's unique identity number across all overlapping image scenes and dates. These annotations were performed in the VIA annotator [13] using the bounding box region shape, and their nomenclature is displayed in Fig. 3. A few randomly selected sample images, their annotations, and binary masks are demonstrated in Table 1. The automated annotations are saved in the PASCAL VOC format (Fig. 4) as XML files [3]; each annotation shares the same filename as its corresponding image, saved in JPG format, and is classified accordingly into two sub-directories: 'Aerial Images' and 'Orthomosaic'. The segmentation masks derived from the automated annotations are saved in the subdirectory 'Binary Masks - Automated'. These masks are in PNG format and are grouped by date. They are also classified by row number and stored in two sub-directories: 'Aerial Images' and 'Orthomosaic'. The former sub-directory consists of segmentation masks of all images as well as masks derived from individual row classes, saved in subdirectories 'Row_<class_name>'. The Python scripts to extract these binary masks and to generate the bounding box-based automated annotations are also included in the dataset.
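A minimal sketch of reading one of the PASCAL VOC XML annotation files with the standard library is shown below. The tag names follow the standard VOC schema; the embedded XML, filename, and coordinates are made up for illustration and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Illustrative PASCAL VOC annotation (standard VOC tags; made-up values).
voc_xml = """
<annotation>
  <filename>DJI_0001.JPG</filename>
  <object>
    <name>Row 1</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>184</xmax><ymax>140</ymax></bndbox>
  </object>
  <object>
    <name>Row 2</name>
    <bndbox><xmin>118</xmin><ymin>210</ymin><xmax>180</xmax><ymax>268</ymax></bndbox>
  </object>
</annotation>
"""

root = ET.fromstring(voc_xml)
objects = []
for obj in root.iter("object"):
    b = obj.find("bndbox")
    objects.append({
        "class": obj.findtext("name"),
        "box": tuple(int(b.findtext(t)) for t in ("xmin", "ymin", "xmax", "ymax")),
    })

print(objects)
```

Because each XML file shares its filename with the corresponding JPG image, the parsed boxes can be paired with their images by filename alone.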

Experimental Design, Materials and Methods
The images were acquired over six different dates at the experimental farm of the Department of Agriculture, University of Napoli Federico II, in Portici, Italy. This dataset covers the cauliflower vegetable crop, specifically B. oleracea var. Botrytis. The crops were imaged at an early stage of development, from the 1st to the 6th week after transplantation. The crop rows were neither hoed nor treated with phytosanitary products. The acquisition was performed using a DJI Phantom 4 Pro-Obsidian. The imaging was performed at nadir, with an elevation varying between 4.275 m and 4.749 m due to wind turbulence, and a forward and lateral overlap of 75% was maintained. The orthomosaics were generated by stitching together overlapping nadir images captured by the drone, making use of the image overlaps [14,15]. The lighting conditions correspond to noon, between 11:45 and 12:30 h, for every date on which acquisition was performed in the field environment. The datasets dated 8 October, 21 October, and 29 October 2020 were used to independently train YOLOv8x-seg. This training generated optimal weights for tuning the hyperparameters of GroundedSAM [6]. These weights were then used to perform automated annotations across the entire batch of multi-date imagery. This process resulted in leaf-bound automated annotations for 250 multi-date RGB images and orthomosaics. The training results over 200 iterations, as depicted in Fig. 5 and detailed in Table 2, suggest that the imagery from 21 October 2020 provides the best metrics. The trained weights from this imagery were used as a benchmark in the calibration of GroundedSAM. The overall workflow is illustrated in Fig. 6.
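The low flight elevation implies a very fine ground sampling distance (GSD). The sketch below estimates it for the reported altitude range; note that the camera parameters (1-inch sensor 13.2 mm wide, 8.8 mm focal length, 5472 px image width) are nominal DJI Phantom 4 Pro specifications assumed for illustration, not values stated in the article.

```python
# Back-of-the-envelope ground sampling distance for the nadir flights.
# Camera values below are nominal DJI Phantom 4 Pro specs, assumed here:
SENSOR_WIDTH_MM = 13.2   # 1-inch sensor width
FOCAL_MM = 8.8           # focal length
IMAGE_WIDTH_PX = 5472    # image width in pixels

def gsd_mm_per_px(altitude_m: float) -> float:
    """GSD = (altitude * sensor_width) / (focal_length * image_width)."""
    return (altitude_m * 1000 * SENSOR_WIDTH_MM) / (FOCAL_MM * IMAGE_WIDTH_PX)

# Altitude range reported for the acquisitions (4.275 m to 4.749 m).
for alt in (4.275, 4.749):
    print(f"{alt} m -> {gsd_mm_per_px(alt):.2f} mm/px")
```

Under these assumptions, each pixel covers roughly 1.2 to 1.3 mm on the ground, which is consistent with the leaf-level detail needed for canopy-bound segmentation.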

Limitations
A few automated annotations and corresponding segmentation masks may suffer from oversegmentation errors. For instance, boundary detection becomes difficult in the later stages of cauliflower growth, particularly in the imagery dated 11 November, 18 November, and 25 November 2020, when the leaves start overlapping each other. As a result, the crops cannot be uniquely located, and growth monitoring may be limited to a row-based approach rather than a crop-based one. Occasionally, there were infestations by predatory birds, which resulted in some crop damage in row 6 and row 7 in the imagery dated 25 November 2020.
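The row-based approach mentioned above can be sketched as a simple canopy-area comparison between dates. The snippet uses tiny synthetic binary masks as stand-ins for the PNG masks in 'Binary Masks - Automated'; real masks would be loaded from disk instead.

```python
import numpy as np

# Row-based growth proxy: canopy pixel area of one row mask on two dates.
# Tiny synthetic masks stand in for the dataset's per-row PNG binary masks.
mask_date1 = np.zeros((8, 8), dtype=bool)
mask_date1[2:4, 2:6] = True          # small canopy early in the season
mask_date2 = np.zeros((8, 8), dtype=bool)
mask_date2[1:5, 1:7] = True          # larger canopy at a later date

area1, area2 = int(mask_date1.sum()), int(mask_date2.sum())
relative_growth = (area2 - area1) / area1
print(f"area: {area1} -> {area2} px, relative growth {relative_growth:.0%}")
# → area: 8 -> 24 px, relative growth 200%
```

Because the area is aggregated per row rather than per crop, overlapping leaves within a row do not invalidate the measurement, which is why row-wise masks remain usable for the late-season dates.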

Table 1
Details of GobhiSet with Examples.

Table 2
Training parameters for manually annotated imagery over 200 iterations.