Annotated image dataset of fire blight symptoms for object detection in orchards

The monitoring of plant diseases in nurseries, breeding farms and orchards is essential for maintaining plant health. Fire blight (Erwinia amylovora) is still one of the most dangerous diseases in fruit production, as it can spread epidemically and cause enormous economic damage. All measures are therefore aimed at preventing the spread of the pathogen in the orchard and containing an infection at an early stage [1–6]. Plant disease control becomes more efficient with a digital monitoring system if that system increases the spatial and temporal resolution of disease monitoring in orchards [7]. In this context, a digital disease monitoring system for fire blight based on RGB images was developed for orchards. Between 2021 and 2024, data were collected on nine dates under different weather conditions and with different cameras. The data source locations in Germany were the experimental orchard of the Julius Kühn Institute (JKI), Institute of Plant Protection in Fruit Crops and Viticulture in Dossenheim, the experimental greenhouse of the Julius Kühn Institute for Resistance Research and Stress Tolerance in Quedlinburg and the experimental orchard of the JKI for Breeding Research on Fruit Crops located in Dresden-Pillnitz. The RGB images were taken on different apple genotypes after artificial inoculation with Erwinia amylovora, including cultivars, wild species and progeny from breeding. The presented ERWIAM dataset contains manually labelled RGB images with a size of 1280 × 1280 pixels of fire blight-infected shoots, flowers and leaves in different stages of development as well as background images without symptoms. In addition, symptoms of other plant diseases were acquired and integrated into the ERWIAM dataset as a separate class. Each fire blight symptom was annotated with the Computer Vision Annotation Tool (CVAT [8]) using 2-point annotations (bounding boxes) and is provided in YOLO 1.1 format (.txt files). The dataset contains a total of 1611 annotated images and 87 background images. This dataset can be used as a resource for researchers and developers working on digital systems for plant disease monitoring.

© 2024 The Author(s).

Data collection
The raw RGB images were collected using the following cameras: Samsung smartphone [10, 11], Samsung tablet [12], Canon EOS 70D [13] and Canon EOS 90D [14]. The annotation tool CVAT [8] was used for the 2-point bounding box annotation. The images were resized to 1280 × 1280 pixels with the Python library Pillow using the Lanczos filter [15].

Data source location
The data source location was the experimental orchard of the
• Digital monitoring should contribute to the early identification of the phenotypic characteristics of pathogens and facilitate the targeted use of plant protection products.

Background
Global trade and changing climatic conditions mean that immigrating pests and diseases are increasingly able to establish themselves in Europe. Erwinia amylovora was introduced to Europe in the 1950s and has spread throughout most of Europe since around 2006 [3, 16]. If it spreads uncontrollably, fire blight can cause considerable economic damage to orchards. Although an infestation can be controlled with antibiotics such as streptomycin, their use is banned or strictly regulated in most European countries [2, 5, 6]. Early detection of the first symptoms of fire blight and localisation of the diseased trees in the orchard are important so that containment methods can be applied in a controlled and timely manner. Compared with cameras using other sensors, high-resolution RGB cameras are simple to handle, cost-effective and easy to integrate into systems [7]. The ERWIAM dataset was created because no comparable open-source dataset of annotated RGB images with a size of 1280 × 1280 pixels was available that included shoot, flower and leaf symptoms as well as symptoms similar to fire blight. A digital method for detecting fire blight symptoms in orchards can be an effective way of monitoring this pathogen. To utilise the possibilities of monitoring and mapping with RGB images, it is essential to create a dataset with the specific symptoms of the model pathogen Erwinia amylovora.

Data Description
The aim was to capture the diversity of fire blight symptoms in apple genotypes that occurred in the greenhouse and in the field. The data sources were the experimental orchard of the Julius Kühn-Institute (JKI, Federal Research Centre for Cultivated Plants), Institute for Plant Protection in Fruit Crops and Viticulture in Dossenheim (Germany), located in Kirschgartshausen; the experimental greenhouse of the JKI for Resistance Research and Stress Tolerance located in Quedlinburg (Germany); and the experimental orchard of the JKI for Breeding Research on Fruit Crops located in Dresden-Pillnitz (Germany). In total, 1698 RGB images with different resolutions (Table 1) were used to create the ERWIAM dataset.
The resulting 1611 original images were carefully checked for quality and for the presence of disease symptoms before they were labelled with 2-point annotations in the annotation tool CVAT [8]. A further 87 RGB images that showed no fire blight symptoms were added to the dataset as background images. Rectangular bounding boxes were drawn around each symptomatic area. Flowers infected with Erwinia amylovora were labelled as the class "FLOWER" (Fig. 1).
Leaves with fire blight symptoms covered several disease stages (Fig. 2) and were labelled as the class "LEAF__". Fire blight symptoms on the shoot were included in the dataset as the class "SHOOT_" (Fig. 3).
In addition, symptoms of other plant conditions, as well as symptoms that may be attributable to fire blight but could not be clearly assigned on the basis of RGB images, were recorded and integrated into the ERWIAM dataset as a separate class "MAYBE_" (Fig. 4).
After the annotation work was completed, a labelling file (.txt) in YOLO 1.1 format was created for each image and used for model training. In total, 15,761 annotations across all classes were created manually on 1611 images (Table 2).
The text file of each annotated image contains the following values in this order: class_ID, x_centre, y_centre, width, height. Each image file in the ERWIAM dataset has a corresponding text file in YOLO 1.1 format, and these .txt files are located in the labels folder. In addition to the images and labels folders, the ERWIAM dataset also has an information folder (Fig. 5). The information folder contains the file All_annotations.xlsx in TensorFlow Object Detection format, which lists all annotations with the values class, height, width, xmin, ymin, xmax, ymax for each file name. The metadata [9] of the images are stored in the file Metadata_RGB_image.xlsx in the information folder so that users can select or re-sort the dataset by camera model (Table 1), frequency of annotations (Table 2), location and date (Table 3), camera settings (Table 4) or other parameters.
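The relationship between the two annotation layouts described above can be illustrated with a short sketch. It assumes the usual YOLO convention that x_centre, y_centre, width and height are normalised to the range 0 to 1 relative to the image size; the text does not state this explicitly. The names PixelBox, parse_yolo_line and parse_yolo_text are hypothetical helpers, not part of the dataset.

```python
from dataclasses import dataclass

IMAGE_SIZE = 1280  # ERWIAM images are 1280 x 1280 pixels


@dataclass
class PixelBox:
    """Bounding box in pixel corner coordinates (xmin, ymin, xmax, ymax)."""
    class_id: int
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def parse_yolo_line(line: str, img_w: int = IMAGE_SIZE, img_h: int = IMAGE_SIZE) -> PixelBox:
    """Convert one 'class_ID x_centre y_centre width height' line to pixel corners.

    Assumption: the four box values are normalised to [0, 1].
    """
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return PixelBox(int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


def parse_yolo_text(text: str) -> list[PixelBox]:
    """Parse the content of a label file; an empty file yields an empty list."""
    return [parse_yolo_line(ln) for ln in text.splitlines() if ln.strip()]
```

Under that assumption, the line `2 0.5 0.5 0.25 0.5` describes a box of class 2 spanning x from 480 to 800 and y from 320 to 960 pixels.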
In pre-processing, the annotated images were resized to 1280 × 1280 pixels with the Python library Pillow [15] and saved in JPG format. In addition, 87 background images (1280 × 1280 pixels) without disease symptoms were added in JPG format (Table 3).
For each background image in the images folder, which shows no fire blight symptoms and therefore has no annotations, there is an empty .txt file in the labels folder (Fig. 5).
The one-stage detector You Only Look Once (YOLO) was chosen because it detects different objects very quickly and precisely and, owing to its efficient training, requires less computing capacity than two-stage models [17]. The images folder contains 1611 images with fire blight symptoms (Fig. 6) and 87 background images (Fig. 7).
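The folder layout described above lends itself to a quick consistency check after download. The sketch below counts annotated images, background images (empty label file) and images without any label file. The function name summarise_dataset is hypothetical, and the accepted file extensions are an assumption based on the statement that the images are stored in JPG format.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg"}  # assumption: JPG files, any letter case


def summarise_dataset(images_dir: Path, labels_dir: Path) -> dict[str, int]:
    """Count annotated images, background images and images lacking a label file."""
    counts = {"annotated": 0, "background": 0, "missing_label": 0}
    for image_path in sorted(images_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not an image file
        label_path = labels_dir / f"{image_path.stem}.txt"
        if not label_path.exists():
            counts["missing_label"] += 1
        elif label_path.read_text().strip():
            counts["annotated"] += 1   # at least one annotation line
        else:
            counts["background"] += 1  # empty label file
    return counts
```

For the complete dataset as described, such a check would be expected to report 1611 annotated images, 87 background images and no missing label files.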

Data collection
The RGB raw data of fire blight symptoms were collected in 2021 and 2022 on different days in the experimental orchard and in the experimental greenhouse. The image data were acquired by employees of the Julius Kühn-Institute for Breeding Research on Fruit Crops, the Leibniz Institute for Agricultural Engineering and Bioeconomy (ATB) and the company geo-konzept GmbH. The images were sorted manually with regard to image quality and the appearance of fire blight symptoms on flowers, leaves and shoots. For example, RGB images taken at the Kirschgartshausen and Quedlinburg trial sites that did not show any symptoms of disease were removed from the annotation dataset. In contrast, images taken in the Dresden-Pillnitz experimental orchard with only healthy leaves, flowers and shoots and no fire blight symptoms were sorted into the background images. To ensure that there were no fire blight symptoms in the background images, these were taken in Dresden-Pillnitz in 2022 and 2023 (Fig. 7).
A broad spectrum of symptoms on different genotypes was recorded, and clear fire blight symptoms (classes "FLOWER", "LEAF__" and "SHOOT_") were separated from possible fire blight symptoms (class "MAYBE_"). As a result, the dataset includes a large number of possible symptoms that could lead to false-positive results, but whose inclusion prevents potential fire blight symptoms from being overlooked. Features in the image that are similar in colour and texture to fire blight symptoms were included. For example, branch breakage could lead to false-positive results; including the class "MAYBE_" and the background images is intended to reduce such errors.
After labelling, the labelled images and the background images were reduced to 1280 × 1280 pixels with the Pillow Python library using the Lanczos filter [15], as the original image sizes varied widely and the original images took up a lot of memory. Different camera settings, varying distances to the objects and multiple shooting angles are ways of creating a diverse dataset. The light value of the different camera systems used to create the ERWIAM dataset ranged from +7.9 to +14.7 LV. The ISO varied from 25 to 6400, the shutter speed from 1/32 s to 1/1203 s and the f-number from 1.8 to 14.0 (Table 4).
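The light value summarises the three exposure settings listed above in one number. The text does not state how the reported values were computed; the sketch below assumes the common definition normalised to ISO 100, LV = log2(N² / t) − log2(ISO / 100), with f-number N and exposure time t in seconds. The function name light_value is hypothetical.

```python
import math


def light_value(f_number: float, exposure_time_s: float, iso: float) -> float:
    """Light value normalised to ISO 100.

    Assumed definition: LV = log2(N^2 / t) - log2(ISO / 100),
    where N is the f-number and t the exposure time in seconds.
    """
    return math.log2(f_number ** 2 / exposure_time_s) - math.log2(iso / 100)
```

Under this definition, a brighter scene allows a smaller aperture, a shorter exposure or a lower ISO, and each of these raises the light value.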
Data acquisition under different weather and light conditions (Table 5) should also help to make the detection of the trained model robust. On the recording days, a cloud cover of 7 to 8 okta and a mean leaf wetness of 5 to 89 % were recorded for the experimental orchard.

Annotation work and data set
The CVAT labelling tool [8] and the 2-point bounding box labelling method were selected for the annotation. Rectangular bounding boxes were drawn around shoots, flowers and leaves. Prior to manual annotation, criteria for classifying the fire blight symptoms on the RGB images were compiled on the basis of the literature [3–5]:

Flower blight (class "FLOWER", Fig. 1)
- Discolouration of flower heads in brown-black with white-grey coating
- Bacterial slime (milky-white to yellowish-orange) can emerge from the flower head and flower stalk

Table 5
Cloud cover [18, 19] and the daily average of the leaf wetness [20]

Symptomatic flower heads were annotated without the flower stalk. Green fruit mummies or green flowers were classified neither as a fire blight symptom nor as "MAYBE_". Dark-coloured flower stalks could indicate an early fire blight infection, but were not annotated as a symptom of a flower infection.
Leaf blight (class "LEAF__", Fig. 2)
- Black or brown discolouration of the leaf veins, starting from the base of the blade or petiole and spreading over the midribs and lateral ribs
- Bacterial slime (milky-white to yellowish-orange) may emerge from the petiole

Shoot blight (class "SHOOT_", Fig. 3)
- The Shepherd's Crook was annotated as the main characteristic
- Shoot bends downwards in an arc shape
- Initially still green to light green
- Bacterial slime may appear
- Shoots are later dry with a brown to black colouring

Sometimes a white/grey or light brown colouring of the shoot was observed. However, a sole white/grey or light brown colouration of a shoot was not classified as a clear sign of fire blight and was annotated as "MAYBE_". If the described leaf symptoms of fire blight also occurred, the infected shoot was annotated as "SHOOT_" together with the symptomatic leaves on the infected shoot. Within the "SHOOT_" bounding box, only leaves that exhibited the symptoms of the "LEAF__" class were annotated as "LEAF__". An attempt was made to record and annotate different perspectives of the Shepherd's Crook.
Class "MAYBE_" (Fig. 4)
- Leaves that showed no black or brown colouration from the base of the blade to the midrib and lateral rib
- Flowers that showed a different colouring (e.g. light brown) from the symptoms described
- Shoots that were clearly identified as branch breakage on the days of recording in the experimental orchard
- Sole white/grey colouration or light brown colouration of a shoot

Overlapping and nested fire blight symptoms were also labelled (Fig. 6, samples A and D). Fire blight symptoms were also annotated if they showed light reflections or shading in addition to the infection (Figs. 1–3 and 6, samples C and D) or were wet from rain.
It is also important to note that fire blight can spread quickly in orchards and cause devastating damage if it is not detected [1–6, 16]. However, it is often not clearly recognisable in RGB images (or in nature), as the symptoms are similar to those of other pathogens and can differ significantly depending on the genotype. Genetic analyses are required for a clear identification of Erwinia amylovora [1–3, 6]. To avoid overlooking unclear fire blight symptoms and thereby risking the spread of the disease in the orchard, the class "MAYBE_" was considered necessary in the ERWIAM dataset. This allows users to control the sensitivity to fire blight symptoms themselves when creating object detection models. Once the annotation work was completed, a labelling file (.txt) in YOLO 1.1 format was created for each image and used to train the YOLO model.
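One way in which users might adjust this sensitivity is by editing the label files before training: dropping the "MAYBE_" annotations gives a stricter detector, while relabelling them as a clear symptom class gives a more sensitive one. The sketch below is illustrative only. The class index MAYBE_CLASS_ID is a hypothetical value that must be checked against the class list shipped with the dataset, and the helper names drop_class and remap_class are not part of the dataset.

```python
# Hypothetical class index for "MAYBE_"; verify it against the dataset's class list.
MAYBE_CLASS_ID = 3


def drop_class(label_text: str, class_id: int = MAYBE_CLASS_ID) -> str:
    """Remove all annotations of one class from the text of a YOLO label file."""
    kept = [
        ln for ln in label_text.splitlines()
        if ln.strip() and int(ln.split()[0]) != class_id
    ]
    return "\n".join(kept) + ("\n" if kept else "")


def remap_class(label_text: str, source_id: int, target_id: int) -> str:
    """Relabel every annotation of source_id as target_id, leaving the boxes unchanged."""
    out = []
    for ln in label_text.splitlines():
        parts = ln.split()
        if not parts:
            continue  # skip blank lines
        if int(parts[0]) == source_id:
            parts[0] = str(target_id)
        out.append(" ".join(parts))
    return "\n".join(out) + ("\n" if out else "")
```

An image whose only annotations belong to the dropped class ends up with an empty label file and is then treated like a background image.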

Limitations
Not applicable.

Ethics Statement
For this research and analysis, no human or animal subjects were used and no data from social media platforms were used. The authors confirm that the provided dataset and presented work strictly meet the ethics requirements for publication in Data in Brief as mentioned in https://www.elsevier.com/de-de/researcher/author/policies-and-guidelines.

Fig. 1. Examples of images with fire blight-infected flowers used for the annotation of the class "FLOWER" in the ERWIAM dataset. Images were taken in the experimental orchard in Kirschgartshausen.

Fig. 2. Examples of images with fire blight-infected leaves used for the annotation of the class "LEAF__" in the ERWIAM dataset. Images were taken in the experimental greenhouse in Quedlinburg (samples in the top row) and in the experimental orchard in Kirschgartshausen (samples in the bottom row).

Fig. 3. Examples of images with fire blight-infected shoots used for the annotation of the class "SHOOT_" in the ERWIAM dataset. Images were taken in the experimental greenhouse in Quedlinburg (samples in the top row) and in the experimental orchard in Kirschgartshausen (samples in the bottom row).

Fig. 4. Examples of images with symptoms of other plant diseases that look like fire blight symptoms, used for the annotation of the class "MAYBE_" in the ERWIAM dataset. Images were taken in the experimental orchard in Kirschgartshausen.

Fig. 5.

Fig. 6. Examples of annotated images with fire blight-infected leaves (red bounding boxes), flowers (purple bounding boxes) and shoots (rose bounding boxes), as well as the class "MAYBE_" (blue bounding boxes), used for the ERWIAM dataset. Images were recorded in the experimental greenhouse (Quedlinburg, QLB) in 2021 (A) and 2022 (B) and in the experimental orchard (Kirschgartshausen, KGH) in 2021 (C) and 2022 (D).

Fig. 7. Examples of background images from 2022 (A) and 2023 (B-D) used in the ERWIAM dataset. Images were taken with the camera model Canon EOS 90D at the experimental orchard in Dresden-Pillnitz.
Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Specifications Table
Subject: Computer Science; Computer Vision and Pattern Recognition
Specific subject area: The dataset contains resized and annotated RGB image files of fire blight-infected leaves, shoots and flowers, as well as symptoms of diseases similar in appearance to fire blight symptoms, from the experimental orchard and greenhouse for computer vision and pattern recognition applications.
Type of data: Analysed, filtered and processed digital 2D RGB image files (.JPG). Size of the RGB images: 3 × 1280 × 1280 pixels. YOLO 1.1 annotation files [8] (.txt). Table of the image metadata [9] (.xlsx). Table of the annotation metadata (.xlsx).

Value of the Data
• In the experimental greenhouse of the JKI-Quedlinburg Institute, around 2000 different genotypes of apple breeding material were artificially inoculated with Erwinia amylovora in 2021 and 2022, which could be used to record fire blight symptoms. The JKI-Dossenheim Institute has a heterogeneous apple orchard consisting of 600 to 700 trees, mainly 'Gala' but also single trees of other apple varieties and wild apple species. Every year, the third of the orchard with the oldest trees was used for fire blight inoculation. Artificial inoculation took place four to six weeks before the recording period. The high genetic diversity captured in the RGB images of the ERWIAM dataset can help to utilise the responses of different genotypes to fire blight in the application of deep learning methods and is therefore important for the scientific community.
• Plant breeders and scientists can use the ERWIAM dataset to test and refine object detection using the model pathogen Erwinia amylovora in precision agriculture and plant disease monitoring.

Table 1
Camera models used with the different resolutions, image sizes, focal lengths and the range of light values of the collected RGB images.

Table 2
Overview of all annotations in the ERWIAM dataset by date of recording, location and class.
a KGH: experimental orchard in Kirschgartshausen, JKI-Institute for Plant Protection in Fruit Crops and Viticulture in Dossenheim (Germany).
b QLB: experimental greenhouse in Quedlinburg, JKI-Institute for Resistance Research and Stress Tolerance in Quedlinburg (Germany).

Table 3
Number of RGB images with a size of 1280 × 1280 pixels from the experimental orchards and the experimental greenhouse that were used for annotation and as background images.
b QLB: experimental greenhouse in Quedlinburg, JKI-Institute for Resistance Research and Stress Tolerance in Quedlinburg (Germany).
c DD: experimental orchard in Dresden-Pillnitz, JKI-Institute for Breeding Research on Fruit Crops in Dresden-Pillnitz (Germany).

Table 4
ERWIAM dataset overview with the RGB image and sensor metadata [9] from 2021 up to and including 2023 in the experimental orchards and in the experimental greenhouse.
a KGH: experimental orchard in Kirschgartshausen, JKI-Institute for Plant Protection in Fruit Crops and Viticulture in Dossenheim (Germany).
b QLB: experimental greenhouse in Quedlinburg, JKI-Institute for Resistance Research and Stress Tolerance in Quedlinburg (Germany).
[Table 5 caption, continued] from 2021 up to and including 2023 in the experimental orchards in Kirschgartshausen (KGH) and Dresden-Pillnitz (DD).
DD: experimental orchard in Dresden-Pillnitz, JKI-Institute for Breeding Research on Fruit Crops in Dresden-Pillnitz (Germany).
c The cloud cover was recorded by the Mannheim weather station, which is located about 12 km south-east (air distance) of the experimental orchard in Kirschgartshausen.
d The cloud cover was recorded by the Dresden-Klotzsche weather station, which is located about 9 km north-west (air distance) of the experimental orchard in Dresden-Pillnitz.