HVDROPDB datasets for research in retinopathy of prematurity

Retinopathy of prematurity (ROP) is a retinal disorder that may bring about blindness in preterm infants. Early detection and treatment of ROP can prevent this blindness. The gold standard technique for ROP screening is indirect ophthalmoscopy performed by ophthalmologists. The scarcity of medical professionals and inter-observer heterogeneity in ROP grading are two of the screening concerns. Researchers employ artificial intelligence (AI) driven ROP screening systems to assist medical experts. A major hurdle in developing these systems is the unavailability of annotated data sets of fundus images. Anatomical landmarks in the retina, such as the optic disc, macula, blood vessels, and ridge, are used to identify ROP characteristics. HVDROPDB is the first dataset to be published for the retinal structure segmentation of fundus images of preterm infants. It is prepared from two diverse imaging systems on the Indian population for segmenting the lesions mentioned above and annotated by a group of ROP experts. Each dataset contains retinal fundus images of premature infants with the ground truths prepared manually to assist researchers in developing explainable automated screening systems.

Value of the Data
• ROP may cause blindness in preterm infants. Preterm births are increasing due to improved neonatal intensive care, and the burden of ROP is expected to rise dramatically. Unfortunately, the ophthalmologist-to-patient ratio is very low, and different experts' diagnoses are not unanimous. AI-based automated screening systems are needed to assist clinicians in ROP screening, but ROP datasets are not published.
• This dataset provides annotated fundus images of premature infants acquired by two imaging systems, RetCam and Neo. The ground truths (masks) of the fundus images are prepared manually with Adobe Photoshop to segment the optic disc, vessels, and demarcation line/ridge. Researchers can use these data to segment the retinal structures essential for detecting zones and stages and to develop explainable automated ROP screening systems.
• A framework has been developed for automatically detecting and explaining zones, plus disease, and stages in the fundus images of infants.

Objective
Retinopathy of prematurity (ROP) is a disease that affects the retina of a premature infant. It usually affects both eyes and can result in lifelong vision impairment or blindness. ROP blindness is increasing due to improving neonatal intensive care in low- and middle-income countries [1]. ROP may progress or regress a few weeks after the infant's birth. Timely screening is necessary to control ROP progress because, if the disease progresses to stage 3 with plus disease, invasive procedures may be required to stop further retinal detachment [2]. Due to the scarcity of medical experts, researchers are developing automated screening systems to assist the experts. The lack of annotated public datasets is a major issue in designing and explaining such systems [3,4].
The following characteristics define the severity of ROP: the zone (location of the disease, determined by blood vessel growth), the stage (severity of abnormal growth), the presence of plus disease (vessel size and tortuosity), and the extent (number of clock hours involved) of the disease [5]. This work aims to provide an ROP dataset for segmenting the demarcation line/ridge, optic disc, and vessels in order to create an explainable ROP diagnosis system.

Data Description
The HVDROPDB dataset consists of posterior and temporal view fundus images of premature infants, as shown in Figs. 1 and 2. Figs. 1a and 2a display posterior images, and Figs. 1b and 2b depict temporal images. HVDROPDB was named after the H.V. Desai Eye Hospital in Pune, India, where the fundus images of premature infants were collected. These images were captured by the RetCam (Clarity MSI, US) and Neo (Forus Healthcare, Bangalore, India) imaging systems shown in Fig. 3. RetCam is used worldwide. Neo is very popular in India as it is reasonably priced and portable. The RetCam and Neo images are provided separately in these datasets. HVDROPDB-RetCam-Neo-Segmentation is the first dataset to be published to segment ROP images. It aims to aid research on automated ROP screening systems and their explanation. The fundus images and their ground truths will facilitate the segmentation of retinal structures essential for detecting zones and stages.
The HVDROPDB_RetCam_Neo_Segmentation dataset was prepared with three primary datasets, HVDROPDB-OD, HVDROPDB-BV, and HVDROPDB-RIDGE, for the optic disc, blood vessels, and demarcation line/ridge segmentation, respectively. Each dataset contained four sub-datasets of 50 images and their masks (ground truths), as described in Table 1. An optic disc is seen in the images taken from a posterior view. The HVDROPDB-OD dataset was therefore prepared with posterior view images. It contains two subsets, RetCam_OpticDisc_images and RetCam_OpticDisc_masks, which were used to segment the optic discs in RetCam images. In addition, Neo_OpticDisc_images and Neo_OpticDisc_masks were included for segmenting optic discs in Neo images. The masks for segmentation were manually created using Adobe Photoshop Reader, as shown in Fig. 4.
For the creation of HVDROPDB-BV, 100 images captured from the temporal and posterior views were selected, and their ground truths were prepared as shown in Fig. 5. HVDROPDB-BV held the RetCam_Vessels_images, RetCam_Vessels_masks, Neo_Vessels_images, and Neo_Vessels_masks datasets, each with 50 images.
HVDROPDB-RIDGE contained 100 images of ROP stages 1, 2, and 3 captured from both posterior and temporal views, along with their ground truths, as depicted in Fig. 6. The dataset was divided into four sub-datasets: RetCam_Ridge_images, RetCam_Ridge_masks, Neo_Ridge_images, and Neo_Ridge_masks. Therefore, a total of 12 sub-datasets were provided for segmentation.
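The organisation described above (three primary datasets, each with image and mask sub-datasets for RetCam and Neo) can be sketched in Python as follows. This is only an illustrative sketch: the sub-dataset names come from the description, but the on-disk nesting (a folder per primary dataset under a root directory), the helper names expected_subdatasets and pair_images_with_masks, and the assumption that an image and its mask share the same file stem are hypothetical and should be checked against the downloaded data.

```python
from pathlib import Path

# Primary dataset -> structure keyword used in the sub-dataset names.
# The names follow the data description; the folder nesting is an assumption.
STRUCTURES = {
    "HVDROPDB-OD": "OpticDisc",
    "HVDROPDB-BV": "Vessels",
    "HVDROPDB-RIDGE": "Ridge",
}
CAMERAS = ("RetCam", "Neo")


def expected_subdatasets():
    """Return {primary dataset: [sub-dataset names]} as listed in the text."""
    return {
        dataset: [
            f"{camera}_{structure}_{kind}"
            for camera in CAMERAS
            for kind in ("images", "masks")
        ]
        for dataset, structure in STRUCTURES.items()
    }


def pair_images_with_masks(root, dataset, camera):
    """Pair each image with its mask by file stem (hypothetical convention)."""
    structure = STRUCTURES[dataset]
    image_dir = Path(root) / dataset / f"{camera}_{structure}_images"
    mask_dir = Path(root) / dataset / f"{camera}_{structure}_masks"
    masks = {p.stem: p for p in mask_dir.iterdir() if p.is_file()}
    return [
        (image, masks[image.stem])
        for image in sorted(image_dir.iterdir())
        if image.is_file() and image.stem in masks
    ]
```

For example, pair_images_with_masks("data", "HVDROPDB-OD", "RetCam") would return (image, mask) path pairs for the RetCam optic disc subset, provided the files are laid out as assumed; images without a matching mask are skipped.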

Experimental Design, Materials, and Methods
The dataset preparation process is depicted in Fig. 7. The images, captured between 2009 and 2022, were provided by PBMA's H. V. Desai Eye Hospital in Pune. The subjects were premature infants screened for ROP by the hospital team. The images were obtained by trained optometrists using one of two cameras, Neo or RetCam, with a 120° field of view (FOV). Posterior and temporal view images were saved in the database. A team of ROP experts with a minimum of 5 years of experience annotated them under the guidance of a senior ROP expert with 25 years of experience. Before annotation, an interobserver variability test was carried out (Kappa value 0.92). However, the possibility of subjective bias cannot be ruled out, as no external expert was involved in the annotation. The images were saved as different ROP classes in the HVDROPDB dataset.
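As an illustration of how an interobserver agreement value of this kind can be computed, the following Python sketch implements Cohen's kappa for two raters. The text does not state which kappa statistic was used, how many raters took part, or which labels were compared, so the choice of Cohen's kappa, the function name cohen_kappa, and the example labels are assumptions made purely for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: two raters and categorical labels; the source does not
    specify the exact statistic behind the reported value.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty item set.")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both raters used a single identical label throughout.
    return (observed - expected) / (1.0 - expected)


# Hypothetical stage labels from two graders, for illustration only.
grader_1 = ["stage1", "stage1", "stage2", "stage2"]
grader_2 = ["stage1", "stage1", "stage2", "stage1"]
kappa = cohen_kappa(grader_1, grader_2)
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is commonly preferred over simple percentage agreement when the label distribution is imbalanced.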

Annotation of images
All collected fundus images of premature infants were gathered in a database. A team of medical experts experienced in grading ROP images for telemedicine models labelled these images. Each expert was trained in order to standardize the annotation process for developing an AI algorithm. Two hours per week were allotted for annotation. The authors reviewed the available literature and discussed it with the ROP experts. As the temporal and posterior views of the images were sufficient for the diagnosis, the team selected a pair of images with these views