Machine learning for determining the stage of the estrous cycle in bitches: a preliminary data collection

Reproductive biotechnologies, such as artificial insemination, are important tools in the reproduction of female dogs. Accurate determination of the specific stage of the estrous cycle is crucial for the successful application of these technologies. Vaginal cytology serves as a cost-effective and rapid diagnostic solution. However, it relies on the examiner’s expertise, which makes it subject to human error. Additionally, it may involve a prolonged interval between sample collection and result analysis. To minimize these limitations and streamline the diagnostic process, this study developed and tested software in the Python language on the Windows platform to automate the identification of the main phases of the estrous cycle that are important for artificial insemination. Eighteen vaginal cytology images were used, with six images representing each of the phases studied (proestrus, estrus, and diestrus). Images were analyzed using the open-source CellProfiler software, with subsequent classification of the images using the Tanagra software. Sensitivity and specificity were 0.99 and 0.86 for proestrus, 0.95 and 1.0 for estrus, and 0.95 and 0.82 for diestrus, respectively. These findings suggest that the model is capable of correctly identifying the different phases of the estrous cycle. The proposed model proved effective for the study’s objective, and the authors suggest that it could be applied to other economically important species such as cattle, horses, and small ruminants.


Introduction
Artificial insemination (AI) is defined as the insertion of previously collected semen into females. In bitches, this can occur through vaginal insemination, surgical intrauterine insemination, or transcervical insemination (Mason, 2018). Reasons for artificial insemination in dogs include reactive temperaments, inability to achieve natural mating, or the use of fresh, chilled, or frozen semen (Bittencourt et al., 2014). The veterinarian is the professional responsible for determining the timing of the insemination and for performing the procedure itself (Lindh, 2022), although the interpretation of cytology can be done by technicians or virtually.
The AI process is of economic interest to the attending veterinarian and represents an economic risk for dog owners. In Brazil, the cost of artificial insemination varies depending on factors such as the chosen method and geographical location. Key elements for the success of this reproductive technique include identifying the optimum time for insemination and developing methods for accurately assessing this time frame (Gaytán, 2020). It is extremely important to identify the day of ovulation during estrus in order to choose the appropriate moment for mating or insemination and, consequently, to achieve high fertility rates (Romagnoli, 2017). Many approaches have been proposed in the recent scientific literature for determining the timing of estrus in dogs, such as the quantification of gonadotropin-releasing hormone (GnRH) (Goericke-Pesch, 2016), the quantification of luteinizing hormone (Alm; Holst, 2018), or immunohistochemistry (Vermeirsch et al., 2002). However, in practice, the timing of estrus, and consequently the timing of insemination, is determined clinically by the attending veterinarian. According to Sharma and Sharma (2016), vaginal cytology is the most used procedure to assess the status of the female reproductive system. Diagnosis of the estrous cycle phase is based on vaginal cytology, specifically on the types of cells and their quantities observed under microscopy (Calderón et al., 2020; Oliveira et al., 2021).
Vaginal cytology, combined with clinical-reproductive evaluation and hormone assays, is a crucial step in defining ovulation in this species. This method is simple and useful for determining the stages of the estrous cycle, assisting in identifying the optimal breeding time (Turmalay et al., 2011). However, incorrect interpretation of the results based on human observation can lead to artificial insemination at an inappropriate time and consequently result in low conception rates (Calderón et al., 2020). In a review conducted by Reckers et al. (2022), the authors noted that a uniform and standardized definition of canine vaginal cell types has yet to be published. A closer look at the literature reveals that authors suggest different parameters and definitions for the cell types, for example regarding the diameter of the different cell types.
Machine learning, a subfield of artificial intelligence (Hooper; Hecker; Artemiou, 2023), involves data analysis through iterative algorithms, aiming to build analytical models in an automated way (Witten; Frank; Hall, 2005). It has a multitude of applications, including in human medicine and veterinary medicine, as cited by Cihan, Gokçe, and Kalipsiz (2017). A study conducted by Wolcott et al. (2022) showed that the algorithm's performance in classifying the stages of the estrous cycle based on vaginal cytology images was comparable to that of expert human examiners for classifying estrus, proestrus, and metestrus, and was superior to human classification for diestrus images. Similarly, Sano et al. (2020) proposed an automatic model for identifying the stages of the estrous cycle in mice. Specifically, they conducted a test comparing the machine's performance with that of human examiners, using 100 images. The algorithm obtained 91% correct answers, matching the first human examiner and surpassing the second examiner, who obtained 79% correct answers. These results demonstrate the algorithm's analytical capacity for accurately detecting cellular patterns. However, the authors highlight the challenge of obtaining samples from all stages of the estrous cycle.
The authors of the present study hypothesized that an inexpensive and precise determination of the estrous cycle phase of bitches may be of interest to both veterinarians and dog owners. Therefore, the objective of this study was to develop a method for determining the phase of a bitch's estrous cycle based on automated analysis of vaginal cells using machine learning and artificial intelligence.

Materials and Methods
The study protocol was approved by the local Ethics Committee of the Centro Universitário Católica do Leste de Minas Gerais (UNILESTE), Brazil, and was registered under No. 33.92.23. This approval was in accordance with the precepts of Law No. 11,794, dated October 8, 2008, Decree No. 6,899, dated July 15, 2009, and the standards published by the National Council for the Control of Animal Experimentation (CONCEA).

Animals and evaluation time points
For this study, seven female dogs between 2 and 8 years old were used. Their weights ranged from 5.00 to 36.00 kg; they belonged to different breeds and were provided by volunteer owners. Collections were initiated upon the first clinical signs of proestrus, which included swelling of the vulva and serosanguineous discharge. Subsequently, the bitches were monitored throughout the estrous cycle with consecutive sample collections. A total of six samples were selected for each stage of the estrous cycle studied (proestrus, estrus, and diestrus). The selection criteria required that samples show characteristics typical of the cycle phase, contain a substantial cellular area, and be free from residue, contamination, or any artifacts that could hinder the appropriate functioning of the software.

Cytological samples for creation of the image bank
Cytological samples were collected across various laboratories using a standardized procedure (Aydin; Sur; Dinc, 2011). Using a swab and sterile saline, epithelial cells were exfoliated from the superficial vaginal cavity and then transferred to a glass microscope slide. The samples were stained with panoptic stain (5 s for each component), in accordance with the manufacturer's guidelines (Laborclin, 2019), and allowed to dry. Subsequently, the cytological images were captured, and each sample was classified into a specific estrous stage (proestrus, estrus, or diestrus) based on the consensus of two expert examiners.

Test dataset
The dataset consists of 18 images, each with dimensions of 5184 x 3456 pixels. Six images were selected for each stage of the estrous cycle, namely, proestrus, estrus, and diestrus. These images were captured using a Canon EOS REBEL T5i camera (Coronel Fabriciano, Brazil), with a microscope magnification of 100x, achieved through a 10x ocular and a 40x objective (Nikon ECLIPSE E200, Coronel Fabriciano, Brazil). The images were selected and classified into their respective stages by an experienced veterinarian (Post, 1985).

Analysis of images
The images were analyzed using the CellProfiler software version 4.2.5 (Vila Velha, Brazil). Initially, the original images were uploaded (Figure 1a) and then converted to grayscale (Figure 1b). Subsequently, the intensities were inverted (Figure 1c), and the objects were identified with a minimum of 300 pixels and a maximum of 500 pixels in area. The "Robust Background" algorithm was applied for cell identification (segmentation) (Figure 1d). Following this, objects below a user-defined form factor were excluded (Figure 1e) to eliminate non-cellular artifacts. The remaining objects underwent multiparametric analysis, including parameters such as shape (e.g., area, minor axis length, major axis length, form factor) and intensity (e.g., integrated intensity, mean intensity, median intensity, minimum intensity, maximum intensity). A total of 145 parameters were quantified for each analyzed object. The collected data were then exported to an Excel spreadsheet using the CellProfiler software (Figure 1f).
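The size and shape filtering described above can be sketched in a few lines of Python. This is a minimal illustration and not the CellProfiler pipeline used in the study: the object records, the `MIN_FORM_FACTOR` cutoff of 0.6, and the helper names are hypothetical, and the form factor is assumed to follow the definition CellProfiler documents for its FormFactor measurement, 4π·area/perimeter² (1.0 for a perfect circle).

```python
import math

# Area limits reported in the text (pixels). The form-factor cutoff is a
# hypothetical value: the study only states that it was user-defined.
MIN_AREA, MAX_AREA = 300, 500
MIN_FORM_FACTOR = 0.6


def form_factor(area, perimeter):
    """Return 4*pi*area / perimeter**2, which is 1.0 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2


def keep_object(obj):
    """Keep objects inside the area range that are round enough to be cells."""
    in_range = MIN_AREA <= obj["area"] <= MAX_AREA
    return in_range and form_factor(obj["area"], obj["perimeter"]) >= MIN_FORM_FACTOR


# Hypothetical segmented objects (not measurements from the study).
objects = [
    {"id": "round_cell", "area": 400, "perimeter": 2 * math.sqrt(math.pi * 400)},
    {"id": "elongated_debris", "area": 350, "perimeter": 200.0},
    {"id": "small_speck", "area": 120, "perimeter": 40.0},
    {"id": "irregular_cell", "area": 450, "perimeter": 90.0},
]

kept = [obj["id"] for obj in objects if keep_object(obj)]
print(kept)
```

In this toy set the elongated object is rejected by its low form factor and the small speck by the area limits, which mirrors how the two filters remove different kinds of artifacts.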
For classification purposes, the Tanagra software version 1.4.41 (Vila Velha, Brazil) was used. To this end, the optimal parameters for distinguishing the different classes (proestrus, diestrus, and estrus) were selected through data mining and feature selection. As a final step, linear discriminant analysis and cross-validation were performed to calculate the sensitivity and specificity in the identification of the objects belonging to the different classes. The classification algorithms seek to identify the most significant quantitative differences among the parameters of the distinct classes.
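The classification step can be illustrated with a small, self-contained sketch of linear discriminant analysis evaluated by cross-validation. It is not the Tanagra workflow used in the study. It assumes only two hypothetical features per object (area and mean intensity), synthetic values, and leave-one-out cross-validation, since the text does not state which cross-validation scheme was applied.

```python
import math
from collections import defaultdict


def fit_lda(samples, labels):
    """Estimate class means, priors, and the inverse pooled covariance (2 features)."""
    groups = defaultdict(list)
    for sample, label in zip(samples, labels):
        groups[label].append(sample)
    n, k = len(samples), len(groups)
    means, priors = {}, {}
    sxx = sxy = syy = 0.0
    for label, points in groups.items():
        mx = sum(p[0] for p in points) / len(points)
        my = sum(p[1] for p in points) / len(points)
        means[label] = (mx, my)
        priors[label] = len(points) / n
        for px, py in points:
            sxx += (px - mx) ** 2
            sxy += (px - mx) * (py - my)
            syy += (py - my) ** 2
    # Pooled within-class covariance, followed by its 2x2 inverse.
    sxx, sxy, syy = sxx / (n - k), sxy / (n - k), syy / (n - k)
    det = sxx * syy - sxy ** 2
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return means, priors, inverse


def predict_lda(model, sample):
    """Return the class with the highest linear discriminant score."""
    means, priors, inv = model
    scores = {}
    for label, (mx, my) in means.items():
        wx = inv[0][0] * mx + inv[0][1] * my
        wy = inv[1][0] * mx + inv[1][1] * my
        scores[label] = (sample[0] * wx + sample[1] * wy
                         - 0.5 * (mx * wx + my * wy) + math.log(priors[label]))
    return max(scores, key=scores.get)


def leave_one_out(samples, labels):
    """Predict each sample with a model fitted on all the other samples."""
    predictions = []
    for i in range(len(samples)):
        model = fit_lda(samples[:i] + samples[i + 1:], labels[:i] + labels[i + 1:])
        predictions.append(predict_lda(model, samples[i]))
    return predictions


# Synthetic (area, mean intensity) pairs; hypothetical, not study data.
FEATURES = [
    (310, 0.30), (330, 0.30), (320, 0.28), (320, 0.32),
    (450, 0.60), (470, 0.60), (460, 0.58), (460, 0.62),
    (370, 0.85), (390, 0.85), (380, 0.83), (380, 0.87),
]
LABELS = ["proestrus"] * 4 + ["estrus"] * 4 + ["diestrus"] * 4

predictions = leave_one_out(FEATURES, LABELS)
accuracy = sum(p == t for p, t in zip(predictions, LABELS)) / len(LABELS)
print(accuracy)
```

Holding out each object before fitting keeps the evaluation from being inflated by objects the model has already seen, which matters particularly with an image bank as small as the one described here.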

Results
Extracting valuable and new information from large volumes of raw data using various techniques and algorithms requires the identification of optimal classification parameters.
The system demonstrated proficiency in distinguishing the proestrus, estrus, and diestrus phases based on the presence of the anucleate, superficial, intermediate, and parabasal cells, neutrophils, and erythrocytes observed in the classical stages. The sensitivity and specificity values were 0.99 and 0.86 for proestrus, 0.95 and 1.0 for estrus, and 0.95 and 0.82 for diestrus, respectively (± 0.022 for all stages), demonstrating the model's capacity for correctly identifying estrous cycle phases.
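For reference, per-phase sensitivity and specificity of this kind can be derived from a three-class confusion matrix by treating each phase in turn as the positive class. The sketch below assumes this one-vs-rest convention, which the text does not state explicitly, and uses hypothetical counts that are not the study's results.

```python
CLASSES = ["proestrus", "estrus", "diestrus"]

# Hypothetical confusion matrix (rows: true phase, columns: predicted phase).
# These counts are illustrative only and are not the study's results.
CONFUSION = [
    [18, 1, 1],
    [0, 19, 1],
    [2, 0, 18],
]


def one_vs_rest_metrics(matrix, index):
    """Return (sensitivity, specificity) for the class at position `index`."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp                 # true class missed
    fp = sum(row[index] for row in matrix) - tp  # other classes mislabeled
    tn = total - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)


metrics = {name: one_vs_rest_metrics(CONFUSION, i) for i, name in enumerate(CLASSES)}
for name, (sensitivity, specificity) in metrics.items():
    print(f"{name}: sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```

Sensitivity answers how many objects of a phase were recognized as that phase, while specificity answers how many objects of the other two phases were correctly kept out of it.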
The scatter plots in Figure 2 illustrate the software's capability in recognizing these three phases.

Discussion
The model presented in this study obtained values of 99% sensitivity and 86% specificity in identifying proestrus, 95% sensitivity and 100% specificity in estrus, and 95% sensitivity and 82% specificity in diestrus. These preliminary results suggest that the model can recognize the different phases of the estrous cycle with greater accuracy than that reported by Hernández et al. (2019), who performed automated classification of the estrous cycle in rats based on vaginal cytology, using two groups. The first group, referring to proestrus and estrus, achieved an accuracy of 82%, while the second group, referring to metestrus and diestrus, had 98.38% accuracy.
CellProfiler was chosen for image analysis because it is open-access software, available for Windows, macOS, and Linux, making it an accessible and reproducible tool. Furthermore, a study conducted by Moscon et al. (2018) shows the efficiency of CellProfiler in differentiating degrees of cell damage in the cervix of human patients with a set of just fifteen images divided into three distinct groups, which suggests that the algorithm created in the present study may be valid even with the reduced number of images used in its creation.
Despite the good preliminary results presented by the authors, there is a limitation regarding the number of images used to create the database. Only images characteristic of the specific phase of the estrous cycle, without the presence of artifacts and dirt, were selected to compose the database used in the analyses in CellProfiler, resulting in a small number of images (n=18). Calderón et al. (2020) showed that it is possible to obtain a significant improvement in processing speed, with 91.6% accuracy in determining the estrous cycle stage in female dogs, but emphasized the dependence on a large database of 250 images. Curti et al. (2023) also reported the use of machine learning in agricultural contexts, which involve a large amount of data, emphasizing its role in decision-making processes for the production and reproduction of farm animals. However, few systems have been implemented to automatically identify the estrous cycle phase in female dogs. Most systems rely on extensive data sets (Hernández et al., 2019; Calderón et al., 2020; Wolcott et al., 2022), in contrast to the model created by the authors.
With this approach, it is possible to reevaluate the diagnostic methods used in animal reproduction, reducing the time between sample collection and examination. Process automation allows for the rapid analysis of numerous samples, thereby enhancing the efficiency of the production process. Although the study focused on bitches, there is potential for its application in other economically important species, such as cattle, goats, horses, and other pets, as the cellular patterns of each phase of the estrous cycle exhibit similarities across species (Porto et al., 2007; Pimentel et al., 2014; Teodoro et al., 2023).
The use of CellProfiler software for diagnosing the estrous cycle phase in bitches is new and therefore lacks supporting literature, which underscores the importance of this article in encouraging new tests and supporting other researchers.

Conclusion
The preliminary model proposed by the authors was effective for automated detection of the phases of the estrous cycle in bitches. However, new tests with a larger image bank are necessary to better evaluate the sensitivity and specificity of the software.

Figure 1 - Image processing and analysis carried out by the CellProfiler software on cytological samples from the estrus phase in bitches. a) Original image; b) Conversion of the image to grayscale; c) Inversion of the intensity of the objects of interest (dark background); d) Identification of the objects of interest; e) Filtered objects of interest (exclusion of artifacts); f) Data from the filtered objects of interest analyzed by CellProfiler and exported to an Excel spreadsheet.

Figure 2 - Linear discriminant analysis with the Tanagra software: classification. Scatter plots illustrate the software's capability in recognizing the three estrous cycle phases (red: proestrus; green: diestrus; yellow: estrus).