A deep learning approach to quantify auditory hair cells



Introduction
Hearing loss is considered the most common sensory deficit and imposes a high socioeconomic burden on society (WHO, 2018). Many environmental stressors can lead to hearing loss, such as exposure to noise, ototoxic drugs, or infections. Other causes of hearing loss are based on genetic defects or are age-related. All these factors can lead to sensorineural hearing loss, which results from loss of or damage to the sensory auditory hair cells (HCs), spiral ganglion neurons, HC synapses, the auditory nerve, or the central auditory pathway.
Loss of auditory HCs is a major cause of hearing loss and is irreversible in mammals. Currently, there are no therapies to prevent HC loss or restore HCs (Wang and Puel, 2018). To design future curative therapies, research in inner ear biology has centered for many years on understanding death and survival pathways in HCs. An established method in the research field is the investigation of mammalian auditory HCs in vitro using neonatal organotypic cultures of the organ of Corti (OC) (Landegger et al., 2017; Sobkowicz et al., 1975, 1993). Among the advantages of this method is the in vitro culture of the postmitotic HCs together with the supporting cells of the inner ear, recapitulating the complex anatomic structure of the inner ear. Further, this in vitro culture system has numerous applications, such as the testing of protective substances and the investigation of structural components or molecular mechanisms. A common strategy in the field is the investigation of HC survival in organotypic cultures of the OC after exposure to ototoxic and/or protective substances. Using molecules to activate or inhibit signaling pathways led to the discovery of different death and survival mechanisms in HCs (Chung et al., 2006; Ebnoether et al., 2017; Jiang et al., 2016; Pirvola et al., 2000; Sekulic-Jablanovic et al., 2017; Sha et al., 2001; Tabuchi et al., 2007). In many research laboratories, the assessment of HC survival is performed by manual counting, since conventional cell counting approaches based on thresholding do not produce reliable results in the OC, especially in damaged organs. This is due to the three-dimensional conformation of these explants, the different morphologies of dying HCs depending on the insult, and co-staining of apoptotic bodies/cells. A recent report has described a method to count HCs in optically cleared cochleae of adult mice using a series of MATLAB scripts (Urata et al., 2019).
A further method to quantify HC survival has been described using Imaris (Bitplane) software (Saleur et al., 2016). In Imaris, the authors used a (thresholding) spot detection algorithm to create a binary mask, which they later imported into ImageJ to automatically count the HCs using a custom macro (Saleur et al., 2016). This two-step approach has not been widely adopted in the field, and most researchers still rely on manual counting. Yet manual counting requires trained observers, is time-consuming, and may be inconsistent between observers. Therefore, there is a great need for further automation.
A deep learning approach to quantify HC survival in OC explants has not yet been published. Artificial neural networks can integrate various parameters and are superior to simple thresholding in image analysis. Therefore, the objective of our study was to test such an approach to quantify HC survival in OC explants cultured in vitro. We decided to use the Fiji (Fiji Is Just ImageJ) plugin StarDist because it is publicly available and works directly in Fiji (Schindelin et al., 2012), a user-friendly open-source image analysis software. The Fiji StarDist plugin (Schmidt et al., 2018; Weigert et al., 2020) comes with pre-trained models for 2D cell/nuclei detection in microscopy images (fluorescence and immunohistochemistry), where it detects objects with star-convex shapes. While the default StarDist model for fluorescent nuclei detection already performed better than conventional thresholding methods, its counts were not comparable to manual counts of our images, even in untreated OCs. Hence, we decided to generate a custom StarDist model to detect HCs in murine organotypic OC cultures. We validated the model by comparing the generated counts with manual counts in different treatment conditions. We show that deep learning is a viable approach for HC quantification and that our trained custom StarDist model is a fast and reliable method for HC quantification. Our trained StarDist model, as well as a semi-automated script to count the HCs in Fiji, are available on GitHub and Zenodo (https://github.com/DBM-MCF/hair-cell-counting; https://zenodo.org/record/4590066#.YEdB_C1Q2w6).

Organ of Corti explant cultures
All animal experiments were approved by the Animal Care Committee of the Canton of Basel, Switzerland in accordance with the Animal Welfare Act and the Animal Protection Ordinance of Switzerland.
Wild-type C57BL/6JRj mice (Janvier Labs, Le Genest-Saint-Isle, France) of either sex were used at an age of postnatal day 4-5. Pups were decapitated, the skin and mandible were removed, and the skull was opened midsagittally. After removal of the brain, the temporal bones were collected in sterile 60-mm petri dishes containing ice-cold sterile phosphate-buffered saline (pH 7.4). The cochlear capsule was opened under a surgical stereo-microscope and the stria vascularis, spiral ganglion neurons, and spiral limbus were removed from the OC. Each OC was then randomly assigned to an experimental group, and the entire OCs were cultured in poly-D-lysine-coated (P7405; all chemicals from Sigma-Aldrich Chemie GmbH, Steinheim, Germany, unless indicated otherwise) 4-well Ibidi μ-slides (80426, Vitaris AG, Baar, Switzerland).

Culture treatments
OC explants were cultured in medium containing Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum, 25 mM HEPES buffer, and 30 U/mL penicillin (P3032) at 37 °C in 5% CO2. After a 24 hour (h) recovery period, the explants were treated either for 48 h with 10 μM gentamicin (G1397) or for 24 h with 150 μM cisplatin (P4394). The chosen concentrations are based on preliminary experiments, where we determined a median lethal dose (LD50) for the basal cochlear turn (data not shown).
OC explants in the control group were cultured for 24 h + 48 h (72 h total) in culture medium. There were 5 OCs per experimental group.

Image acquisition
All images were acquired on a Nikon Ti microscope, equipped with a Yokogawa CSU-W1 spinning disk (pinhole size 25 μm) confocal unit (Nikon AG Instruments, Egg, Switzerland) and a Photometrics Prime 95B camera (11 μm × 11 μm pixel size), using a 20× air objective (numerical aperture 0.75, final image pixel size 550 nm). Fluorescence was excited with a 561 nm laser and emission was filtered with an ET630/75m bandpass filter.
To capture the entire OC, images were taken as z-stacks (step size of 0.9 μm) using the large image acquisition tool, where 3 × 3 adjacent fields of view were automatically stitched together (using the microscope's Nikon NIS software).

Manual counting of hair cells
HC survival was assessed by counting HCs containing intact phalloidin-stained stereocilia and cuticular plates. First, using Fiji, the 3D stacks were maximum projected. Then, the HCs were counted in sections corresponding to 20 inner hair cells (IHCs) at five randomly selected fields for each of the basal, medial, and apical cochlear turns. Segments containing mechanical damage from the dissection were excluded from the analysis. IHCs and outer hair cells (OHCs) were counted separately, and the results are presented as the number of surviving HCs per cochlear turn.
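The maximum projection step can also be reproduced outside of Fiji. The following numpy sketch, with a random placeholder array standing in for an acquired z-stack, performs the same maximum intensity projection as Fiji's Z Project command:

```python
import numpy as np

# Placeholder z-stack with shape (n_slices, height, width); in practice
# this would be loaded from the acquired tif stacks (e.g. with tifffile).
stack = np.random.rand(40, 512, 512).astype(np.float32)

# Maximum intensity projection along z, equivalent to
# Fiji's Image > Stacks > Z Project... with "Max Intensity".
mip = stack.max(axis=0)
```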

Annotation of images in Labkit
Manual annotation of image crops for the StarDist model was done in Fiji by two different observers, using the Labkit plugin available through the update sites. We followed the instructions provided in the StarDist GitHub repository (https://github.com/mpicbg-csbd/stardist) for the annotation.
Briefly, we generated random crops (128 × 128 pixels) of the maximum-intensity-projected images (single channel) from different conditions and saved them as tif images. These crops included regions of interest (displaying HCs) but also regions without HCs, giving a good representation of the whole tissue and entire image (68 crops in total).
Subsequently, these crops were opened in the Fiji plugin Labkit, and each cell (including those touching the image border) was manually annotated with an individual label and the override option on (to prevent overlapping cells). Finally, all annotated images were saved as separate tif files, resulting in pairs of raw crops and ground-truth crops.
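As an illustration of the crop-generation step, here is a minimal numpy sketch; the function name `random_crops` and the placeholder projection image are ours, not part of the published scripts:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crops(image, size=128, n=8, rng=rng):
    """Cut n random size x size crops from a 2D maximum projection.

    Crop origins are drawn uniformly, so both HC regions and empty
    tissue are sampled, mirroring the annotation strategy above.
    """
    h, w = image.shape
    crops = []
    for _ in range(n):
        y = rng.integers(0, h - size + 1)
        x = rng.integers(0, w - size + 1)
        crops.append(image[y:y + size, x:x + size])
    return crops

# Placeholder for a stitched, maximum-projected OC image
projection = np.zeros((1536, 1536), dtype=np.uint16)
crops = random_crops(projection)
```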

Training the StarDist model
Training of the StarDist 2D model was performed on a CUDA-enabled Windows 10 machine, equipped with an NVIDIA GeForce RTX 2080 SUPER graphics card. Python training environments were set up with Anaconda, following the instructions from the StarDist GitHub repository (https://github.com/mpicbg-csbd/stardist), and TensorFlow version 1.14.
We then followed the example Jupyter notebooks provided in the same repository (Schmidt et al., 2018; Weigert et al., 2020) to train our own 2D (U-Net-based) StarDist model; the same notebooks can be used to train further models. The notebook splits the training dataset randomly into 85% training and 15% validation images (the full training data are available on Zenodo: https://zenodo.org/record/4590066#.YEdDJS1Q2w7). Further, the notebook also provides data augmentation functions (random image flipping and random intensity changes), which were used without modification.
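The split and augmentation logic of the notebook can be sketched in plain numpy; `split_train_val` and `augment` are illustrative stand-ins for the notebook's functions, not the actual notebook code:

```python
import numpy as np

rng = np.random.default_rng(42)

def split_train_val(X, Y, val_frac=0.15, rng=rng):
    """Random 85/15 split of paired raw/ground-truth crops."""
    idx = rng.permutation(len(X))
    n_val = max(1, int(round(val_frac * len(X))))
    val, trn = idx[:n_val], idx[n_val:]
    return ([X[i] for i in trn], [Y[i] for i in trn],
            [X[i] for i in val], [Y[i] for i in val])

def augment(x, y, rng=rng):
    """Random flips plus a random intensity change.

    Geometric flips are applied to image and label alike; the
    intensity change only affects the raw image.
    """
    if rng.random() < 0.5:
        x, y = np.flipud(x), np.flipud(y)
    if rng.random() < 0.5:
        x, y = np.fliplr(x), np.fliplr(y)
    x = x * rng.uniform(0.8, 1.2) + rng.uniform(-0.1, 0.1)
    return x, y
```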
For reproducibility, the following Config2D parameters are key: n_rays = 32 (the default value, confirmed with the first Jupyter notebook) and train_patch_size = (128, 128) (to fit our training data input size); all other parameters were left at their defaults. A detailed list of all parameters is supplemented (Suppl. doc. 1), along with the model's training performance on the validation dataset.
The training was performed for 400 epochs. Subsequently, the probability and non-maximum suppression thresholds for the StarDist model were optimized and the model was saved to file. The optimized thresholds were used when applying the model in Fiji with the StarDist plugin (probability threshold = 0.58035, non-maximum suppression threshold = 0.3). Our trained StarDist model is available on Zenodo (https://zenodo.org/record/4590066#.YEdDJS1Q2w7).
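For orientation, the training and threshold-optimization steps roughly correspond to the following StarDist Python calls. This is a sketch based on the public StarDist API, not a runnable fragment of our notebooks: the variables `X_trn`, `Y_trn`, `X_val`, `Y_val`, and `augmenter` are placeholders for the crops and augmentation function described above, and executing it requires the stardist package and TensorFlow 1.14.

```python
from stardist.models import Config2D, StarDist2D

conf = Config2D(
    n_rays=32,                    # default, confirmed with the first notebook
    train_patch_size=(128, 128),  # matches our 128 x 128 crops
    n_channel_in=1,               # single-channel phalloidin images
)
model = StarDist2D(conf, name='hair-cell-model', basedir='models')
model.train(X_trn, Y_trn, validation_data=(X_val, Y_val),
            augmenter=augmenter, epochs=400)

# Optimize probability and non-maximum suppression thresholds on the
# validation data; the optimized values are stored with the model and
# picked up by the Fiji StarDist plugin.
model.optimize_thresholds(X_val, Y_val)
```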

Validation of the StarDist model
None of the explants used for validation of the model were included in the training. For each experimental group (control, gentamicin, cisplatin), 5 segments per cochlear turn (base, middle, apex) were selected in 5 different OC explants. These segments were counted either manually or using the trained StarDist model and our custom script. For the counting with the StarDist model, HC counts outside of the HC region were manually deleted; no other corrections were performed. All images (Fig. 1-4) displaying the StarDist output (raw images with region of interest (ROI) overlay or label images) represent uncorrected counts. Manual counting was performed by two independent observers not blinded to treatment, of whom only one (observer 1) participated in annotating the images in Labkit.

Statistical analysis
Results are presented as means ± SDs. The statistical analysis was performed with R version 4.1.0 (R Core Team, 2021) using a Poisson regression (unless stated otherwise) with a Tukey post hoc test with false discovery rate (FDR) correction, using the MASS (Venables and Ripley, 2002) and multcomp (Hothorn et al., 2008) packages, respectively. Prism 9 (GraphPad Software, La Jolla, CA, USA) was used to create the graphs. Results were considered statistically significant at a p-value < 0.05.
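The analysis itself was done in R; purely to illustrate the model underlying a Poisson regression, here is a minimal numpy-only fit via iteratively reweighted least squares (IRLS). This is an illustrative sketch, not the MASS/multcomp code used in the study:

```python
import numpy as np

def poisson_irls(X, y, n_iter=25):
    """Fit a Poisson GLM with log link via iteratively reweighted
    least squares (Fisher scoring).

    Assumes X[:, 0] is an intercept column of ones and y holds
    non-negative counts with a positive mean. Illustrative only.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())          # sensible starting intercept
    for _ in range(n_iter):
        mu = np.exp(X @ beta)           # expected counts
        W = mu                          # Poisson variance equals the mean
        z = X @ beta + (y - mu) / mu    # working response
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta
```

For a simple two-group design (e.g. control vs. treated counts), the fitted intercept recovers the log of the control group's mean count, and the group coefficient recovers the log rate ratio.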

Exemplary StarDist output and results of validation dataset
StarDist provides a default deep learning model for instance segmentation of fluorescent nuclei. While the performance of this default model is superior to conventional thresholding, the segmentation of HCs was not comparable with manual counting, even for untreated OCs (Suppl. Fig. 1 ). Hence, we decided to train our own StarDist model, which performed significantly better than the default model.
To train the StarDist model, we used raw and corresponding ground-truth image crops from full OC images. The images were annotated in Labkit (see Section 2.6 and Suppl. doc. 2). The training images were chosen from several earlier experiments and represent untreated as well as treated (gentamicin, cisplatin) OC explants from all cochlear turns (all crops used for training are available on Zenodo: https://zenodo.org/record/4590066#.YEdB_C1Q2w6). After training the StarDist model with 68 images in total, it already achieved qualitatively satisfactory results. The model works both on image crops of HC regions and on the entire OC (Fig. 1). In Fiji, the counted HCs can be displayed as an overlay from the ROI Manager (Fig. 1A' and Fig. 1B') or as a label image (Fig. 1A'' and Fig. 1B''), where only recognized cells are presented.
The validation dataset (see Section 2.7) had an average precision of 58%, calculated at an intersection over union (IoU; between predicted and ground-truth objects) threshold τ of 0.5. Further metrics are shown in Suppl. doc. 1. In addition, for a broad overview of the performance of our trained StarDist model, we pooled all data together for a first analysis (all treatments, all cochlear turns, IHCs and OHCs together). There was no significant difference between the observers and the trained StarDist model (Observer 1 vs StarDist, p-value = 0.872; Observer 2 vs StarDist, p-value = 0.872) or between the observers themselves (p-value = 0.872).
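The average precision metric can be made concrete with a small sketch. This greedy matcher over label images follows the TP / (TP + FP + FN) convention used by StarDist's matching metrics; it is illustrative, not the exact implementation:

```python
import numpy as np

def average_precision(gt, pred, tau=0.5):
    """Precision at IoU threshold tau between two 2D label images.

    Each ground-truth object is greedily matched to the unmatched
    predicted object with the highest IoU; matches with IoU >= tau
    count as true positives.
    """
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [j for j in np.unique(pred) if j != 0]
    matched, tp = set(), 0
    for i in gt_ids:
        gi = gt == i
        best_iou, best_j = 0.0, None
        for j in pred_ids:
            if j in matched:
                continue
            pj = pred == j
            union = np.logical_or(gi, pj).sum()
            iou = np.logical_and(gi, pj).sum() / union if union else 0.0
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= tau:
            tp += 1
            matched.add(best_j)
    fp = len(pred_ids) - tp
    fn = len(gt_ids) - tp
    return tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
```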
Although our trained StarDist model reliably recognizes the HCs (Fig. 1), some structures outside of the HC region are also counted. To eliminate such false-positive objects, the ROI Manager allows manual correction. To provide a standardized, semi-automated workflow for HC counting, we generated a script (Script for HC counting; https://github.com/DBM-MCF/hair-cell-counting). In this script, HC regions to be quantified can be selected by drawing lines of comparable length (e.g., corresponding to 20 IHCs) along the OC. Only HCs in a rectangle around each line are counted by the script, avoiding counts outside of the selected HC region. For each selected region, the script allows manual correction of detected HCs. Additionally, the actual number of IHCs, as well as the cochlear turn (base, middle, apex) to which the region corresponds, can be indicated. Class labeling (IHC or OHC) is performed manually in the script, since this is not accomplished by the StarDist model. The resulting Excel file lists the HC counts, separated into IHCs and OHCs, for each cochlear turn.
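The idea of counting only detections near a drawn line can be sketched as a point-to-segment distance test. Note that this yields a capsule-shaped (rounded-rectangle) region around the line, a close stand-in for, but not identical to, the rectangle used by our script; the function name is ours:

```python
def near_segment(px, py, x1, y1, x2, y2, half_width):
    """Return True if point (px, py) lies within half_width of the
    line segment (x1, y1)-(x2, y2). All distances in pixels."""
    dx, dy = x2 - x1, y2 - y1
    length2 = dx * dx + dy * dy
    # Parameter of the closest point on the segment, clamped to [0, 1]
    t = 0.0 if length2 == 0 else max(
        0.0, min(1.0, ((px - x1) * dx + (py - y1) * dy) / length2))
    cx, cy = x1 + t * dx, y1 + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2 <= half_width ** 2
```

Filtering detections then amounts to keeping only those ROI centroids for which `near_segment` returns True for the drawn line.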
We supply a further script to help generate a cochleogram, available on GitHub (Script for cochleogram; https://github.com/DBM-MCF/hair-cell-counting). In this script, an automatically detected or manually annotated HC region is divided into 10 segments from the apex to the base. This allows the HCs automatically detected by the StarDist model to be quantified in 10% intervals from the apex, which can be plotted as a cochleogram as described by Viberg and Canlon (Viberg and Canlon, 2004). Suppl. doc. 2 provides detailed instructions for the usage of the custom scripts.
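The 10-segment binning can be sketched as follows, assuming HC positions along the OC have been normalized to run from 0 (apex) to 1 (base); the function name is ours, not part of the published script:

```python
import numpy as np

def cochleogram_bins(positions, n_bins=10):
    """Count HCs in n_bins equal segments along the normalized
    apex-to-base axis (positions in [0, 1], apex = 0)."""
    positions = np.asarray(positions, dtype=float)
    # Map each position to a segment index; a position of exactly 1.0
    # is clipped into the last (most basal) segment.
    idx = np.minimum((positions * n_bins).astype(int), n_bins - 1)
    return np.bincount(idx, minlength=n_bins)
```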

Validation of trained StarDist model in untreated OCs
To further quantitatively validate our StarDist model, we compared the counts generated by the model with the current gold standard, manual counting. Manual counting was performed by two independent observers. The OC explants used for validation were cultured for 72 h in total, with a medium change after the first 24 h. Our StarDist model reached comparable results in all cochlear turns for both IHCs and OHCs, and there was no statistically significant difference between the groups (Fig. 2A). Fig. 2B shows a representative image of HCs recognized by the trained StarDist model in the medial cochlear turn of an untreated OC. The middle row shows the ROI overlay of detected HCs (Fig. 2B'), the lower row the label image (Fig. 2B''). The objects detected by the StarDist model outside of the HC region (arrows in Fig. 2B' and 2B'') were manually deleted and excluded from the analysis.

Validation of trained StarDist model in gentamicin or cisplatin-treated OCs
The aminoglycoside gentamicin and the chemotherapeutic agent cisplatin are widely used drugs to treat bacterial infections and cancer, respectively. However, their use is limited by side effects, among them ototoxicity (Schacht et al., 2012). In the inner ear, these compounds lead to damage and death of the HCs, especially OHCs in the basal cochlear turn (Schacht et al., 2012). In OC explants, they are often used as ototoxicity models. To validate the HC counts from our trained StarDist model under damage conditions, we used gentamicin- or cisplatin-exposed OC explants for our analysis.
In gentamicin-treated OC explants, there was no significant difference between counts from our trained StarDist model and the manual counts by the two observers, or between the two observers themselves (Fig. 3A). Fig. 3B shows an image of HCs recognized by the trained StarDist model in the medial cochlear turn. With few exceptions, the StarDist model recognized almost all HCs (Fig. 3B' and 3B'').
Since the morphologies and cell death pathways involved in gentamicin- and cisplatin-induced HC death appear to differ (Schacht et al., 2012), we also validated our trained StarDist model in cisplatin-exposed OC explants. There was no significant difference in IHC or OHC counts in any cochlear turn between the StarDist model and the two observers, or between the two observers themselves (Fig. 4A). The arrows in Fig. 4B' and 4B'' point to objects counted by the StarDist model outside of the HC region, which were excluded from the analysis.

Discussion
Here we present a deep learning approach to quantify HC survival in murine neonatal OC explants cultured in vitro . Using the publicly available StarDist plugin ( Schmidt et al., 2018 ;Weigert et al., 2020 ) for Fiji, we present a custom model to identify HCs in OC explants. Further, we confirm the reliability of this model for the identification of HCs in both control and damaged OC explants.
The culture of the mammalian OC in vitro allows the study of the complex structures of the peripheral auditory organ involved in sensorineural hearing loss and remains an important tool in inner ear research. Given their high vulnerability, lack of regenerative capacity in mammals, and importance for hearing, many research groups have focused on auditory HCs. Therefore, HC survival is commonly assessed in OC in vitro cultures. Due to the complex and specialized structure of the OC, as well as the different cell death morphologies, image analysis approaches using conventional thresholding to count surviving HCs do not perform well. Consequently, HC survival is commonly assessed by manual counting. However, manual counting of HCs is time-consuming, requires some degree of experience, and is subjective.
To address these issues, we tested a deep learning approach to quantify HC survival in OC explants. We used the Fiji (Schindelin et al., 2012) plugin StarDist (Schmidt et al., 2018; Weigert et al., 2020) because it is open-source and user-friendly. Using a small set of images, we trained our custom StarDist model and validated it by comparing its HC counts with manual counts from two independent observers. Our custom model produced reliable results: there was no significant difference compared to the two observers, whether the counting data were analyzed as the entire pooled dataset or separately for the individual treatments (control, gentamicin, cisplatin), cochlear turns, and IHCs/OHCs. Thus, we conclude that the performance of our custom StarDist model is comparable to the current gold standard, manual counting.
All images used for training our custom StarDist model, as well as those used for validating it, were acquired with the same microscope. We also observed that the model works with images taken with a point-scanning confocal microscope (Nikon A1, data not shown). However, it is possible that our custom StarDist model will not reliably identify HCs in images acquired with other microscopes. Its reliability might also depend on the objective and the resulting pixel size. While scaling such images to a similar pixel size might improve the results from our StarDist model, a new StarDist model matching the acquired images can easily be trained with another set of images.
We usually counted and presented the HC counts in sections corresponding to 20 IHCs. Nonetheless, our custom script can be easily adapted to display the results of HC counts to other references ( e.g. , 200 μm).
Another method to count HCs has been previously described for cochlear histological samples from adult mice (Saleur et al., 2016). In contrast to our approach, the authors propose the use of the thresholding spot detection algorithm of Imaris. Using Imaris, the authors created a binary mask of the 3D cochlear image stack to identify the HCs. Using a custom ImageJ macro, they subsequently overlaid the binary mask with the corresponding cochlear image to automatically count the HCs in a given stretch (Saleur et al., 2016). In contrast, our approach can be performed entirely in Fiji, using the deep learning StarDist plugin. The advantages are that both Fiji and StarDist are open-source and easily accessible, and the entire analysis is performed in a single program, namely Fiji. Moreover, we also validated our StarDist model in damaged OCs, which has not been done for automatic HC quantification by others (Saleur et al., 2016). Also, using the StarDist Python package, other custom models can be trained and adapted to one's requirements.
To our knowledge, this is the first report to use deep learning to assess auditory HC survival in OC explants. In a recent study, Urata et al. described an automated HC detection method using a series of MATLAB scripts, among them also deep learning algorithms, in optically cleared tissue of adult mice ( Urata et al., 2019 ). Although more complex than our described approach, the method described by Urata et al. allows the visualization and reconstruction of the entire adult sensory epithelium in situ ( Urata et al., 2019 ). The advantage of the approach described by Urata et al. is that no dissection with concomitant opening of the bony cochlear capsule is necessary ( Urata et al., 2019 ). This allows the detection of HC damage along the entire length of the OC and extensive analyses in adult models. In contrast, our described approach requires less computational image processing and should therefore be easier to apply. Although we trained our model on neonatal OCs, a similar model can be used to count HCs in adult animals using decalcified and dissected cochleae stained for HCs. Moreover, we also noted that the model works in rat OC explants (data not shown). In addition to HCs, our described approach could also be used, for example, for the quantification of synapse and spiral ganglion neuron counts in the inner ear.
In image segmentation, instance segmentation identifies all single objects of a certain class in an image, such as individual cells (Moen et al., 2019). StarDist uses star-convex polygons as a deep learning approach for instance segmentation to detect nuclei or cells (Schmidt et al., 2018; Weigert et al., 2020). This method has been shown to be superior to others, for example in situations where neighboring cells overlap (Schmidt et al., 2018). In addition to the star-convex polygon approach, there are other deep learning approaches for instance segmentation that might be suitable to identify HCs (Moen et al., 2019). Very recently, a generalist deep learning approach has also been developed that does not require new training data (Stringer et al., 2021). We used the StarDist approach developed for 2D images (Schmidt et al., 2018), with maximum intensity projections as 2D input. StarDist also allows object detection and segmentation in 3D images (Weigert et al., 2020). However, this method is not yet implemented in Fiji, which makes it less accessible. Overall, deep learning opens many possibilities for image analysis and will gain importance in medical research in the future (Esteva et al., 2019; Moen et al., 2019).
To conclude, we show as a proof-of-concept, that deep learning is a valuable and time-saving approach for HC survival quantification in OC explants. We trained and validated a custom StarDist model, which produces reliable HC counts. This model can be easily applied using the open-source StarDist plugin for Fiji. Our semi-automated method is objective and thus increases inter-rater reliability. In addition, it is faster than manual counting. Therefore, it is an important tool to facilitate the study of HCs making the results more comparable, which hopefully will foster hearing research.