Single-cell transcriptomic profiling of the mouse cochlea: An atlas for targeted therapies

Significance An increasing number of therapeutic strategies are being designed and tested in animal models for numerous forms of hereditary deafness, the most frequent genetic sensorineural disorder. One major challenge is the implementation of these therapies for diverse isolated and syndromic forms of hearing loss, taking into account the spatial and temporal patterns of expression of the causal gene in the auditory sensory organ, the cochlea. Here, combining single-cell and single-nucleus RNA sequencing with in situ RNA hybridization assays, we present a large-scale single-cell transcriptomic atlas for three crucial stages in the maturation of the mouse cochlea. This detailed atlas of gene expression provides key information for the development of effective therapeutic approaches.


Animals
C57BL6/J wild-type mice were purchased from Janvier Laboratories.

Single cell/nucleus isolation and sequencing
The cochleae of P8, P12, and P20 animals were extracted by removing the osseous labyrinth and dissected in phosphate-buffered saline (PBS) solution. Briefly, after removal of the osseous shell, whole cochleae were microdissected and then subjected to enzymatic dissociation (Adult Brain Dissociation kit, 130-107-677, Miltenyi Biotec) (Fig. S1A) and mechanical trituration with a gentleMACS dissociator (Miltenyi Biotec). The resulting suspension was filtered sequentially through 70 µm and 40 µm meshes and then sorted by flow cytometry on the basis of size (forward scatter), granularity (size scatter), and viability ( Fig. S1B) with a BD FACSAria III cell sorter (BD Biosciences) in PBS with 0.04% bovine serum albumin (BSA). To overcome the biases of cell sorting, which are dependent on the viability and morphology of a given cell type, we complemented this scRNAseq approach by generating a transcriptomic dataset for single nuclei, which are morphologically similar between the different cell types and the preservation of which is less dependent on tissue health. The snRNAseq dataset was obtained on P8 ( Fig. S1B) but not at older ages, as the number of harvested viable nuclei was not high enough for subsequent sequencing and analysis. For snRNAseq, we used the "Nucleus isolation from cell suspensions for ScRNAseq protocols" (CG000124 RevF, 10x genomics).
DAPI-positive nuclei (1/10,000, incubation for 10 minutes) were sorted on the basis of forward scatter and size scatter. The sorted single-cell/single-nucleus suspension was treated (approximatively 3 h after animal sacrifice) according to the "Chromium Next GEM Single Cell 3' v3.1 protocol" (CG000204 Rev D, 10x genomics) in accordance with the manufacturer's instructions, for library construction. The quality of the libraries was checked with an Agilent 2100 Bioanalyzer and the High-Sensitivity DNA kit (Agilent), and the libraries were then sequenced with a HiSeqX Illumina sequencer at a depth of 50 000 reads per cell (Macrogen).

Single cell/nucleus RNA sequencing analysis
Sequencing data were processed and analyzed with Partek Flow analysis software (Partek, St. Louis, Missouri). The STAR 2.7.8a algorithm was used to align sequences with the whole mouse genome index and the mm10 assembly. The duplicated Unique Molecular Identifiers were removed and barcodes were quantified with the Ensembl100 annotation model. Intronic reads were included for snRNAseq.
Gene expression was normalized in counts per million (by dividing by the total number of mapped reads per sample and multiplying by 1 × 10 6 ). A count of 1 was then added before log2 transformation of the data. Genes not expressed in 99.9% of cells were filtered out, resulting in 22,588 gene entries being considered in subsequent analyses. Principal component analysis focusing on the most variable features was performed and the top 15 principal components selected before data visualization with a t-distributed stochastic neighbor-embedding (t-SNE) representation for dimension reduction, with a Euclidean distance metric with 30 as perplexity and 1000 iterations. Cells were classified into cell types manually, based on their clustering and the combined expression patterns of several markers already known and published or characterized by our RNAscope assays. The classification was strengthened further by comparing the findings for the top 200 most differentially expressed genes between these cell types with published scRNAseq data for the cochlea. The cells without genes displaying high levels of differential expression (i.e. those without a clear transcriptomic signature) could not be classified and were therefore labeled as "nonassigned". The high heterogeneity in cell size, morphology and For snRNAseq, we applied the following criteria: 400-3000 genes, 500-5000 counts, 0-4% mitochondrial reads and 0-10% ribosomal reads. The same PCA and t-SNE algorithm parameters were used to visualize the cells as for scRNAseq, and the same cell-type classification method was employed. Erythrocytes were detected in the single-nucleus sequencing dataset. As erythrocytes have no nuclei, this observation suggests that whole cells may have escaped the filtering and were sorted with the nuclei. Indeed, on analysis of the snRNAseq t-SNE plots, we occasionally observed two wellseparated clusters, which therefore had different transcriptomes, but were of the same cell type, suggesting that a fraction of whole cells had been collected with the nuclei. Cluster duplication for a given cell type was not due to batch effects (see Fig. S2) and was not observed in the scRNAseq dataset. Moreover, for each cell type, a comparison of the 500 most differentially expressed genes between the two corresponding snRNAseq and scRNAseq clusters on P8 showed that the smaller snRNAseq cluster had the strongest transcriptomic correspondence to the scRNAseq data cluster. We therefore considered this smaller cluster to be the one containing the unwanted whole cells in our snRNAseq data and we removed this cluster from the analysis.
The total number of genes detected for the whole atlas was similar for the two approaches: 22 588 genes for scRNAseq and 20 344 genes for snRNAseq. However, an average expression of 2068 genes per cell was detected across all ages with scRNAseq in comparison to 828 genes for snRNAseq.

RNA in situ hybridization and immunohistofluorescence assays
The cochleae were extracted in cold freshly prepared PBS solution and fixed by incubation in 4% paraformaldehyde (PFA) for 1 h at room temperature (RT). For cryosections, the organs were decalcified by incubation in 0.35 M EDTA, pH 7.5 at 4°C for 24 h for the P8 samples, 48 h for the P12 samples and 72 h for the P20 samples. The cochleae were then post-fixed by incubation in 4% PFA for 1 h, and were then incubated overnight in 20% sucrose at 4°C. They were embedded in optimal cutting temperature (OCT) compound (VWR International), and the resulting blocks were frozen in Tissue Tek Cryomold Intermediate (Q93740, Interchim) in liquid nitrogen and stored at -80°C. The cochlear blocks were cut into 10-12 µm-thick slices on a cryostat (CryoStar NX70, Epredia), and allowed to dry for 1 h at RT before being stored at -20°C. For RNA in situ hybridization assays, the RNAscope kit (RNAscope Multiplex Fluorescent Reagent Kit v2 Assay, catalog no. 323100) purchased from Advanced Cell Diagnostics (Bio-Techne SAS, Rennes) was used according to the manufacturer's instructions. The OCT was removed, and slices were heated for 30 minutes at 40°C and post-fixed by incubation in 4% PFA for 15 minutes at RT. For whole-mount RNAscope assays, the organs of Corti were microdissected.
Both types of preparation were washed in H2O, dehydrated in a successive series of ethanol solutions and dried for 30 minutes at 40°C. The slices (but not the whole mounts) were incubated with hydrogen peroxide and heated in a HybEZ oven for 5 minutes at 99°C. The tissues were incubated with Protease Plus solution in the hybridization oven at 40°C for 30 minutes and were then hybridized with the target probes at 40°C for 2 h. Each target probe contained a mixture of short oligonucleotides designed to bind to a specific target mRNA and detectable in one of three fluorescence channels (see Dataset S5 for full list of probes), C1, C2 or C3. Different fluorophores were assigned to the C1, C2 and C3 channels, depending on the Opal TM dye (Akoya Biosciences) selected for the channel concerned. The tissue was washed several times and then subjected to sequential hybridization procedures with the RNAscope Multiplex Fl V2 Amp1, Amp2 and Amp3 and horseradish peroxidase, to amplify the signal.
The fluorescent probes Opal dye 488, 570 or 649 (1/1000 dilution) were then added for channel C1, C2 or C3 labeling. If additional immunofluorescence staining was required, the tissues were then incubated in a blocking solution (0.3% Triton-W100, 1% BSA, 20% goat serum) for 1 h at RT, and then with the primary antibody overnight at 4°C, followed by the secondary antibody for 2 h at RT. DAPI

Statistical and data analysis
The most differentially expressed genes were identified by Student's t-tests on the classified cell types with Partek Flow analysis software, comparing each group, one-by-one, with all the other groups together. The upregulated genes were identified as markers, and were sorted on the basis of their pvalues (Dataset S1). For multiple comparisons (Fig. S8), values of p < 0.05 were considered significant in Kruskal-Wallis (K-W) tests followed by non-parametric multiple-comparison tests (NPMC, Dunn-Holland-Wolfe test) for non-normally distributed data and datasets of equal size, in Igor Pro 6.3 (Wavemetrics). All genes were taken into account for the hierarchical clustering presented in Figs. 5 and S3. The average linkage method was used to determine cluster distance metrics and the Euclidean method was used to determine point distance metrics in Partek software. For Fig. 7, hierarchical clustering tools, such as Morpheus (software.broadinstitute.org/morpheus) and Clustergrammer (maayanlab.cloud/clustergrammer) (1), were used. For bubble plots presented in Figs 1, 3 and S6, mean gene expression level and the percent of cells with a gene expression value above 0 were considered. For Gene Ontology Enrichment Analysis, the 200 most differentially expressed genes in a given cell type were analyzed with the open-source python-based library GOATOOLs (tanghaibao/goatools on github) (2). For the tonotopy study (Fig. 6), the biological processes and molecular pathways were analyzed with online gene ontology (http://geneontology.org/) (3) and molecular network analysis tools (https://cytoscape.org/) (4). The frequency of genes for a particular GO term was compared to the background frequency. The uncorrected p-value was then calculated in a Fisher's exact test, with correction for false discovery rate according to the Benjamini/Hochberg procedure.               showing the levels of expression on P8 (counts per million, log2-normalized) of genes differentially expressed in tympanic border cells (ranked among the most differentially expressed genes, indicated in brackets after the gene name of interest) classified according to Emilin2 expression that is split into four equal quartiles (Q1 to Q4) assumed to represent four tonotopic subregions of the cochlea from the base to the apex, based on RNAscope assays. Non-siginificant differences are indicated by n.s, no indications mean a significant difference. Kruskal-Wallis test followed by NPMC DunnHolland-Wolfe test. . Differential gene expression during cochlear developmental maturation. Volcano plots displaying the differential pattern of gene expression between P8 and P20 for the various cochlear cell types. Fold-changes in expression are indicated on the x axis, with positive values indicating an upregulation on P8 and negative values indicating an upregulation on P20. The false-discovery rate-corrected p-value is indicated on the y axis. Only genes with a fold-change in expression of at least 2 in either direction (-2/+2) and an FDR-corrected p-value less than 0.05 are displayed in gray. The upregulated deafness/key genes are highlighted in red or blue for P8 and P20, respectively. (1) Similarity matrix built on the hierarchical clustering of cochlear cell types based on the expression of 195 deafness genes and key regulatory genes, for the scRNAseq (P8, P12, P20) and snRNAseq (P8, OHCs only) data. The x and y axes depict the cell types, and the heat map indicates the similarity, with white and red for low and high values, respectively. (2)(3) Hierarchical clustering of cochlear cell types (x axis) and 195 deafness/key genes (y axis). The heat map indicates the level of gene expression, with white/yellow and red/black for low and high values, respectively. Vc: vascular cells (endothelial cells, pericytes and smooth muscle cells), Bs + SS: basal stria cells + surrounding structures, Fb + SS: fibrocytes + surrounding structures, GC: glial cells, Is: intermediate stria cells, SC: supporting cells, Rt/Sp: root cells and spindle cells, Ms: marginal stria cells, HC: hair cells, Neu: neurons.