Identification of sensory hair-cell transcripts by thiouracil-tagging in zebrafish

Sensory hair cells are exquisitely sensitive to mechanical stimuli and, as such, are prone to damage and apoptosis during dissections or in vitro manipulations. Thiouracil (TU)-tagging is a noninvasive method to label cell type-specific transcripts in an intact organism, thereby meeting the challenge of how to analyze gene expression in hair cells without the need to sort cells. We adapted TU-tagging to zebrafish to identify novel transcripts expressed in the sensory hair cells of the developing acoustico-lateralis organs. We created a transgenic line of zebrafish expressing the Toxoplasma gondii uracil phospho-ribosyltransferase (UPRT) enzyme specifically in the hair cells of the inner ear and lateral line organ. RNA was labeled by exposing 3 days post-fertilization (dpf) UPRT transgenic larvae to 2.5 mM 4-thiouracil (4TU) for 15 hours. Following total RNA isolation, poly(A) mRNA enrichment, and purification of TU-tagged RNA, deep sequencing was performed on the input and TU-tagged RNA samples. Analysis of the RNA sequencing data revealed the expression of 28 transcripts that were significantly enriched (adjusted p-value < 0.05) in the UPRT TU-tagged RNA relative to the input sample. Of the 25 TU-tagged transcripts with mammalian homologs, the expression of 18 had not been previously demonstrated in zebrafish hair cells. The hair cell-restricted expression for 17 of these transcripts was confirmed by whole mount mRNA in situ hybridization in 3 dpf larvae. The hair cell-restricted pattern of expression of these genes offers insight into the biology of this receptor cell type, and the genes may serve as useful markers to study the development and function of sensory hair cells. In addition, our study demonstrates the utility of TU-tagging to study nascent transcripts in specific cell types that are relatively rare in the context of the whole zebrafish larva.


Background
Sensory hair cells are the highly specialized mechanoreceptors of the auditory, vestibular, and lateral line (acoustico-lateralis) organs in vertebrates. Due to their mechanical sensitivity, relative scarcity, and the complex anatomy of the acoustico-lateralis organs, hair cells have been a difficult cell type to dissociate, purify and analyze. Uncovering new hair cell-enriched transcripts would complement the genetic approaches that have identified many, but not all, of the genes required for the function of hair cells. The zebrafish is an excellent genetic model for hearing and balance [1,2]; however, only one study has attempted to experimentally define the zebrafish hair-cell transcriptome, using dissociated macular cells from adults [3]. As such, larval zebrafish hair-cell gene expression remains poorly characterized. Uncovering additional hair cell-specific transcripts would both increase the usefulness of zebrafish as a model for hearing and balance disorders, and deepen our understanding of the development and function of vertebrate hair cells.
Although successful at enriching for rare cell types, the drawback of invasive cell-purification techniques is that they require extensive tissue manipulation that may disrupt endogenous patterns of gene expression. Furthermore, cell sorting captures the entire transcriptome of a cell, and does not discriminate between preexisting RNA and newly synthesized transcripts. For these two reasons, fluorescence-activated cell sorting (FACS) may not be the best choice for analyzing dynamic changes in gene expression. To enrich for cell type-specific RNA from an intact organism, a number of innovative techniques have been developed, including INTACT nuclei purification [10][11][12], translating ribosome affinity purification (TRAP) [13][14][15], and thiouracil (TU) RNA tagging [16,17]. To further our understanding of the biology of zebrafish hair cells, and to develop an alternative to invasive cell-purification techniques, in this study we describe the use of TU-tagging in zebrafish to label hair cell-expressed transcripts in vivo.
TU-tagging is a method to enrich for actively transcribed RNA from a specific cell type of interest. This is achieved through the cell type-restricted expression of the Toxoplasma gondii uracil phospho-ribosyltransferase (UPRT) enzyme together with the global application of its substrate, 4-thiouracil (4TU). UPRT-positive cells will preferentially convert 4TU to 4-thiouridine monophosphate, a thiol-substituted form of uridine that can be readily incorporated into nascent RNA. By taking advantage of the fact that thiol (sulfur-containing) groups do not normally exist in ribonucleic acids, thiol-tagged RNA from rare or difficult to isolate cell types (such as hair cells) can be biotinylated in vitro and selectively purified from the greater RNA pool [16,17]. Moreover, because RNA is labeled in the live, intact organism, TU-tagging alleviates concerns about disrupting endogenous patterns of gene expression by invasive cell isolation techniques. We have created a transgenic line of fish that expresses an HA-epitope tagged UPRT enzyme and a red fluorescent protein (Tg(myo6b:HA-UPRT-P2A-mCherry)) under the control of the myosin 6b minimal promoter to restrict UPRT expression to the zebrafish auditory, vestibular and lateral-line hair cells. TU-tagged and input RNA samples were subjected to RNA sequencing and transcript abundance was analyzed by DESeq to identify putative hair cell-expressed transcripts. In all, we found 28 significantly enriched transcripts (adjusted p-value < 0.05), only seven of which were known to be expressed in zebrafish hair cells. Using whole mount mRNA in situ hybridization, we confirmed the hair cell-restricted expression of an additional 17 genes whose spatial expression pattern had not been previously described in zebrafish. To our knowledge, this is the first demonstration of TU-tagging in zebrafish, and suggests that this technique may be useful in other zebrafish cell types.

Results
Generation and characterization of Tg(myo6b:HA-UPRT-P2A-mCherry) transgenic fish
Using the Tol2/Gateway system [18] we created transgenic zebrafish that expressed an HA-epitope tagged version of the Toxoplasma gondii UPRT enzyme in auditory, vestibular, and lateral line hair cells under control of the myosin 6b promoter (Fig. 1a). Additionally, we used a P2A-mCherry marker to visually score for transgenesis. We selected a line of Tg(myo6b:HA-UPRT-P2A-mCherry) (hereafter: Tg(myo6b:UPRT)) fish that exhibited bright, hair cell-restricted mCherry fluorescence, and confirmed UPRT expression by co-staining for the HA tag at 3 dpf (Fig. 1b).
Functionality of the T. gondii UPRT enzyme has not been previously demonstrated in zebrafish. To test if UPRT activity in zebrafish hair cells enhanced 4-thiouracil incorporation into nascent RNA, we treated 5 dpf wild type and Tg(myo6b:UPRT) larvae with either 1 % DMSO or 5 mM 4TU/1 % DMSO for 3 h. Total RNA was isolated, biotinylated in vitro, and dotted onto a membrane. TU-tagged, biotinylated RNA was detected with streptavidin-HRP (Fig. 1c). Wild type larvae exposed to 4TU did show some UPRT-independent labeling. However, the level of 4TU incorporation was greatly enhanced in Tg(myo6b:UPRT) larvae. RNA from Tg(myo6b:UPRT) larvae exposed to DMSO alone did not exhibit any detectable biotinylation. These dot blot results indicate that UPRT is functional when expressed in zebrafish hair cells.

TU-tagging enriches for hair cell-expressed transcripts
To label and purify hair cell mRNA from zebrafish, we adapted the general strategy outlined in Gay et al. [17] (see Methods and Fig. 2). We treated 3 dpf wild type and Tg(myo6b:UPRT) larvae with 2.5 mM 4TU/1 % DMSO for 15 h at 29°C. Following total RNA extraction and poly(A) mRNA enrichment, the mRNA was fragmented and biotinylated, and TU-tagged fragments were isolated using streptavidin-mediated pulldown. Barcoded Illumina RNA seq libraries were prepared from the following four sources and sequenced on one lane of a HiSeq 2000 sequencer: (1) Tg(myo6b:UPRT) input (pre-pull down) RNA, (2) Tg(myo6b:UPRT) TU-tagged (pull down) RNA, (3) wild-type (non-transgenic) input RNA, and (4) wild-type TU-tagged RNA. For each of the experimental groups, we mapped the sequencing reads to the Zv9 zebrafish genome using Tophat2 [19] and counted the number of reads aligning with each annotated gene region using SeqMonk [20]. Read counts were imported to DESeq [21] to determine statistically significant differences in transcript abundance between the input and TU-tagged samples derived from both Tg(myo6b:UPRT) and wild-type control larvae.
Our statistical analysis revealed 32 transcripts that were significantly enriched (adjusted p-value < 0.05) by more than 2-fold in the Tg(myo6b:UPRT) TU-tagged sample relative to the input (Additional file 1: Table S1). We filtered this list further by excluding four transcripts (si:dkey-22f5.9, slc10a2, slc20a1a, and tmem27) that were enriched >2-fold in the wild-type TU-tagged sample relative to the corresponding wild-type input (Additional file 2: Table S2), as the enrichment of these transcripts in non-Tg(myo6b:UPRT) larvae was not related to hair cell-specific expression. As a result, we found 28 transcripts whose abundance was significantly enriched in the TU-tagged RNA sample (Fig. 3, Table 1).
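The two-step selection described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `select_enriched`, the dictionary layout, the gene label `hypothetical_gene`, and every numeric value in the toy data are hypothetical and are not taken from Table S1 or S2. Only the thresholds (adjusted p-value < 0.05, >2-fold) and the gene names myo6b and si:dkey-22f5.9 come from the text.

```python
def select_enriched(tg, wt, padj_cutoff=0.05, fold_cutoff=2.0):
    """Return genes enriched in the transgenic pulldown but not in wild type.

    tg, wt: dict mapping gene -> (fold_change, adjusted_p), where fold_change
    is TU-tagged / input for the transgenic and wild-type samples respectively.
    """
    # Step 1: significant, >2-fold enrichment in the Tg(myo6b:UPRT) sample.
    candidates = [
        gene for gene, (fold, padj) in tg.items()
        if padj < padj_cutoff and fold > fold_cutoff
    ]
    # Step 2: drop genes that are also >2-fold enriched in the wild-type
    # pulldown, since that enrichment cannot depend on UPRT expression.
    kept = [
        gene for gene in candidates
        if wt.get(gene, (1.0, 1.0))[0] <= fold_cutoff
    ]
    return sorted(kept)


# Toy data with made-up values (assumption, for illustration only).
toy_tg = {
    "myo6b": (8.0, 0.001),
    "si:dkey-22f5.9": (5.0, 0.01),
    "hypothetical_gene": (1.5, 0.2),
}
toy_wt = {
    "myo6b": (1.1, 0.9),
    "si:dkey-22f5.9": (4.0, 0.02),
    "hypothetical_gene": (1.0, 1.0),
}
selected = select_enriched(toy_tg, toy_wt)
```

With these toy values, only the gene that passes both steps is retained; a gene enriched in both the transgenic and wild-type pulldowns is removed at the second step.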
To determine if these Tg(myo6b:UPRT)-enriched transcripts were selectively expressed in zebrafish hair cells, we searched the PubMed and ZFIN [22] databases for data on spatial patterns of gene expression. Of the 28 enriched genes, only seven (cabp2b, myo6b, pcsk5a, s100s, s100t, slc17a8, and tmc2a) have been previously shown by in situ hybridization to be expressed in zebrafish sensory hair cells [23][24][25][26][27][28][29][30], while there were no data available for the remaining 21 (Table 1). For 18 of these 21 putative hair cell-enriched transcripts, we identified a homologous mouse gene by either querying the Ensembl database, or by BLASTP similarity. We used this homology information to search the Shared Harvard Inner-ear Laboratory Database (SHIELD), a repository for an RNA sequencing dataset derived from FAC-sorted mouse hair cells [4,9]. We found that 16 of the 18 mouse homologs had detectable expression in either vestibular or auditory hair cells, and that 12 homologs were significantly enriched (FDR ≤ 0.1) in GFP+ hair cells relative to GFP− inner ear cells. Additionally, because the mouse Gene Ontology annotation is more detailed than that for zebrafish, we used the 23 unique identifiable mouse homologs of the entire Tg(myo6b:UPRT)-enriched zebrafish gene set to perform a Gene Ontology (GO) term analysis [31,32]. Amongst the Tg(myo6b:UPRT)-enriched dataset, the significantly over-represented Biological Process GO terms are all related to hair-cell development and function (Corrected p-value < 0.01; Table 2). Taken together, these in silico analyses suggest that our TU-tagging experiment successfully enriched for hair cell-expressed genes in zebrafish.

Fig. 1c Dot blot for TU-tagged, biotinylated total RNA demonstrating the enzymatic activity of UPRT in the hair cells of 5 dpf zebrafish larvae. Nontransgenic wild-type (WT) larvae exhibited low levels of 4TU incorporation in contrast to Tg(myo6b:UPRT) larvae when exposed to 5 mM 4TU for 3 h. RNA from Tg(myo6b:UPRT) larvae exposed to DMSO only did not exhibit any detectable biotinylation.
Using whole mount in situ hybridization to characterize the spatial expression of TU-tagged transcripts in zebrafish
Of the 28 significantly enriched TU-tagged transcripts, the spatial expression pattern of 21 genes has not been reported in zebrafish. To directly test if these genes are expressed in zebrafish hair cells at 3 dpf, we performed in situ hybridization (ISH) for those 18 genes that had clearly identifiable mammalian homologs. In total, we were able to confirm hair cell-restricted expression for 17 of these 18 TU-enriched transcripts (Fig. 4). We found that one of the previously uncharacterized genes, a zebrafish ortholog of gpr113, was expressed in taste buds, and not in hair cells (data not shown). Control sense probes for anxa5a, cd164l2, otofb, strc, and tekt3 did not yield specific signals in hair cells (data not shown). Considering both the previously reported expression patterns and the 17 new patterns described here, these TU-enriched genes were primarily if not exclusively expressed in hair cells. Most genes (n = 18) were detected in both ear and lateral-line hair cells, while four genes were primarily expressed in the ear (gpx2, si:dkey-229d2.7/kif5-like, strc, and tmc2a), and the expression of two genes was detected in the lateral line organ only (CR293520.1/strc-like and s100t). These ISH results confirm that our TU-tagging experiment successfully enriched for auditory, vestibular, and lateral line hair-cell transcripts.

Discussion
Our results demonstrate that TU-tagging is a viable, noninvasive method for identification of cell type-specific mRNA in zebrafish. Specifically, we employed this profiling technique to identify genes that are selectively expressed during development in sensory hair cells. By adapting the method of TU labeling of transcripts to a larval stage (3 dpf) in zebrafish, we found 28 transcripts that were significantly enriched in the TU-tagged mRNA sample of newly developed hair cells compared to the untagged input mRNA at the same developmental stage. Using in situ hybridization, we confirmed the specific expression pattern of 17 genes in hair cells that have not been previously described in zebrafish. Our work has substantially added to the number of confirmed hair cell-enriched transcripts in zebrafish and serves as an example of how TU-tagging can be used for characterization of newly synthesized transcripts in a rare cell type.
Our TU-tagging experiment sought to purify transcripts from auditory, vestibular and lateral-line hair cells from whole larvae without any prior tissue enrichment. We estimate that hair cells represent <1 % of the total cell number in a 3 dpf zebrafish larva (~750 hair cells in a larva of >>100,000 cells [33]). In addition to being scarce, zebrafish hair cells are clustered at different locations within the otic vesicle or distributed in neuromasts at the surface of the skin, making the enrichment of hair cell transcripts a demanding test for TU-tagging. Ideally, any RNA-enrichment experiment would identify hair cell-expressed transcripts with high specificity and high sensitivity; that is, identify only hair cell transcripts and detect even the rarest ones, regardless of whether they were also expressed in other cell types. Given our experimental design, we found that TU-tagging enriched for hair cell transcripts with good specificity, but poor sensitivity. This means that the majority of our significantly enriched transcripts are bona fide hair cell-expressed genes. However, the experiment was not sensitive enough to identify anything other than hair cell-specific transcripts. The in situ hybridization experiments confirm the limitations on sensitivity; all of the significantly enriched TU-tagged transcripts were exclusive to hair cells. Our experiment did not identify known hair cell-expressed transcripts that are also robustly expressed in other tissues, such as the deafness genes pcdh15a, cacnad1a, or cdh23 (Additional file 1: Table S1). While our TU-tagging experiment successfully identified novel hair cell-specific transcripts in zebrafish, as performed, it was not an effective tool for analyzing the entire hair-cell transcriptome.

Fig. 2 TU-tagging workflow diagram. Larvae (3 dpf) were exposed to 2.5 mM 4TU for 15 h and then homogenized to isolate total RNA. Purified poly(A) mRNA was then fragmented and biotinylated for streptavidin-mediated pull down. RNAseq libraries were constructed and sequenced for comparison of transcript abundance between TU-tagged and input control RNA.
To improve the sensitivity of TU-tagging in zebrafish, manual tissue enrichment prior to RNA isolation is an option. This approach, similar to that taken in mice by Gay et al. [17], is more cumbersome in zebrafish due to the large number of small larvae required for the experiment. An alternative is to use adult tissues if the developmental stage is not an issue. Other possible changes to the experimental protocol could include reducing the duration and concentration of 4TU exposure, as this may reduce UPRT-independent labeling in non-target cell types. Furthermore, performing the experiment with discrete biological replicates will increase the statistical power during data analysis and may increase the sensitivity of transcript detection. However, due to the UPRT-independent thiol-labeling we observed, it is likely that TU-tagging of rare cell types will always have a signal-to-noise problem to some extent.

Conclusions
Despite the limitations on sensitivity, in our hands the TU-tagging method was robust using undissected larvae, revealing 17 hitherto unknown cell type-specific transcripts in developing zebrafish hair cells. In the context of the whole larva, acoustico-lateralis hair cells are a relatively rare cell type; thus, this approach is likely to be useful for analyzing gene expression in other tissues or specific types of cells as well. The major appeal of TU-tagging is the ability to spatially control the expression of UPRT and temporally control the exposure to 4TU. The ability of TU-tagging to discriminate between newly synthesized and pre-existing transcripts will enhance future studies of changes in gene expression during dynamic processes such as development or synaptic plasticity.

Zebrafish husbandry
Zebrafish were cared for in accordance with standard protocols and overseen by the Institutional Animal Care and Use Committee at Oregon Health & Science University. Larvae were obtained from pair-wise natural matings and kept at 29°C in E3 embryo media (5 mM NaCl, 0.17 mM KCl, 0.33 mM CaCl2 and 0.33 mM MgSO4).

Generating UPRT transgenics
The Tol2 Gateway transgenesis vectors and Tg(myo6b:HA-UPRT-P2A-mCherry) transgenic fish were generated essentially as previously described (Kwan et al. [18]). To make the HA-epitope tagged UPRT middle entry vector, the uracil phosphoribosyltransferase (UPRT) gene from Toxoplasma gondii (a kind gift from the lab of Richard Goodman) was amplified by PCR using custom attB1F and attB2R Ultramer oligonucleotide primers (Integrated DNA Technologies) containing an in-frame HA-epitope tag sequence upstream of the UPRT-specific sequence on the forward primer. Similarly, the P2A-mCherry 3' entry vector was generated from the pME-NLSmCherry (#233) template using attB2F-attB3R oligonucleotide primers containing an in-frame P2A viral peptide sequence [34] upstream of the mCherry-specific sequence on the forward primer. The resulting PCR products were cloned into the pDONR-221 or pDONR-P2R-P3 vectors respectively by standard protocols and verified by sequencing. The final Tol2 myo6b:HA-UPRT-P2A-mCherry transgenesis construct was assembled with the pDestTol2pA2 (#394) and p5e-6.5myo6b minimal promoter [29] vectors using the standard LR cloning procedure and verified by a

RNA processing and purification of TU-tagged RNA

DNase treatment
Total RNA was treated with DNase (Ambion Turbo DNA-free -AM1907) as per the manufacturer's protocol. DNase-treated total RNA was precipitated in 1/10 volume of NaOAc, 200 ng/μL glycogen, and 2.5 volumes of ice-cold 100 % ethanol for 1 h at −80°C. Precipitates were spun at maximum speed for 15 min at 4°C, washed in ice-cold 75 % ethanol, and dissolved in RNase-free water.

mRNA purification, fragmentation and biotinylation
Poly-adenylated mRNA was purified using the Ambion Dynabeads mRNA Purification Kit (Ambion -61006) according to the manufacturer's protocol. Purified mRNA was fragmented for 4 min at 94°C using the NEBNext Magnesium RNA Fragmentation Module (NEB -E6150S) to approximately 200-500 bases, recovered by ethanol precipitation, and dissolved in 50 μl RNase-free water. TU-tagged RNA was biotinylated using EZ-Link HPDP-Biotin (Thermo Scientific -21341) by the addition of 25 μl 4x TE and 25 μl 1 mg/ml EZ-Link (dissolved in DMF) to the 50 μl RNA. Following a 3 h incubation at room temperature in the dark, excess biotin was removed by a chloroform/isoamyl alcohol (24:1) extraction as described [35]. The RNA was recovered by ethanol precipitation, at which point 80 ng of RNA was set aside as the "input" RNA sample.
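As a quick check of the biotinylation reaction composition, the final concentrations implied by the stated volumes can be worked out as below. This is a minimal sketch that assumes the three volumes are simply additive; the helper name `final_concentration` and the variable names are illustrative and not part of the published protocol.

```python
def final_concentration(stock, stock_volume_ul, total_volume_ul):
    """Dilution arithmetic: C_final = C_stock * V_stock / V_total."""
    return stock * stock_volume_ul / total_volume_ul


# Volumes stated in the text (μl).
rna_ul = 50.0     # fragmented mRNA in RNase-free water
te_ul = 25.0      # 4x TE buffer
biotin_ul = 25.0  # 1 mg/ml EZ-Link HPDP-Biotin in DMF

# Assumption: volumes are additive, giving a 100 μl reaction.
total_ul = rna_ul + te_ul + biotin_ul

te_final_x = final_concentration(4.0, te_ul, total_ul)                  # in "x" units
biotin_final_mg_per_ml = final_concentration(1.0, biotin_ul, total_ul)  # in mg/ml
```

Under that assumption the reaction is 100 μl at 1x TE with HPDP-Biotin at 0.25 mg/ml.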

RNA dot blot
To test if UPRT expression in hair cells increased the rate of 4TU incorporation into nascent RNA relative to nontransgenic larvae, 5 dpf wild type or Tg(myo6b:UPRT) larvae were treated with 5 mM 4TU/1 % DMSO for 1.5 h at 29°C. Total RNA was extracted using TRIzol reagent and, omitting the DNase, mRNA-enrichment, and fragmentation steps, 10 μg was biotinylated as described above. Following reaction clean up with the Qiagen RNeasy mini kit (Qiagen -74104), 250 ng of total RNA was pipetted directly onto a PVDF membrane and immobilized using the "Auto Cross Link" setting on a Stratalinker 1800 UV Crosslinker. The membrane was blocked for 15 min in 1x PBS/1 mM EDTA/1 % SDS at room temperature on an orbital shaker. After incubating in a 1:5000 dilution of 1 mg/ml Streptavidin-HRP (Thermo Scientific -21126) in block for 10 min, the membrane was washed 1 × 10 min in block, 3 × 5 min in 1x PBS/0.1 % SDS, and 1 × 5 min in 1x PBS. Chemiluminescent

Bioinformatics
The RNA seq reads were processed by Trimmomatic v0.32 [36] to remove Illumina adaptor sequences and discard sequence reads shorter than 36 bases. Trimmed reads were mapped against the Zv9 version of the zebrafish genome with Tophat v2.0.12 using the "--b2-sensitive" option [19]. The resulting BAM file was imported into SeqMonk v0.29 [20] for data visualization and quantification (Scatter plot, see Fig. 3). For the purpose of counting the RNA seq reads assigned to each annotated gene region, probes were defined as "Gene" features, including 250 bases up- and downstream. Duplicated reads were counted only once. The resulting raw read count table was imported into DESeq v1.14.0 [21] for normalization and statistical analysis of differential transcript abundance between the "Input" and "TU-tagged" RNA samples. Prior to statistical analysis, the bottom 20 % of low-count genes was filtered from the data set [37]. Because all "Input" and "TU-tagged" samples were pooled, dispersion estimates were done using the options for data without discrete replicate sets (method = "blind", sharingMode = "fit-only", fitType = "local"). Genes whose transcripts were significantly enriched (adjusted p-value < 0.05) in the "TU-tagged" RNA sample were considered for further analysis.
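The low-count filtering step can be illustrated with a short Python sketch. The analysis itself was run in DESeq (an R package), so this is not the code used in the study; it only shows the idea of discarding the lowest-ranked 20 % of genes before testing. Ranking genes by their summed raw counts across samples is an assumption made here, and the function name `filter_low_count_genes` and the toy gene labels and counts are hypothetical.

```python
def filter_low_count_genes(counts, fraction=0.2):
    """Drop the lowest-count fraction of genes from a raw count table.

    counts: dict mapping gene -> list of raw read counts, one per sample.
    Genes are ranked by their summed count across samples (an assumption),
    and the bottom `fraction` of the ranking is removed.
    """
    totals = {gene: sum(values) for gene, values in counts.items()}
    ranked = sorted(totals, key=lambda gene: totals[gene])  # ascending
    n_drop = int(len(ranked) * fraction)
    dropped = set(ranked[:n_drop])
    return {gene: values for gene, values in counts.items()
            if gene not in dropped}


# Toy table: ten hypothetical genes, two samples (input, TU-tagged).
toy_counts = {f"gene{i}": [i * 10, i * 12] for i in range(1, 11)}
filtered = filter_low_count_genes(toy_counts)
```

Removing genes with very few reads before testing reduces the number of hypotheses that enter the multiple-testing correction, which is the usual motivation for this kind of independent filtering.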

Immunostaining and in situ hybridization
Larvae were fixed in 4 % paraformaldehyde/PBS for 4 h at room temperature followed by 5 × 5 min washes in 1x Phosphate Buffered Saline/0.1 % Tween-20 (PBST). To permeabilize, fixed larvae were rinsed twice in water and, after removing all liquid, submerged in −20°C acetone for 3 min. Larvae were rinsed twice in water, washed 3 × 5 min in 1x PBS/0.01 % Tween-20, and blocked in FSGGB (0.5 % fish skin gelatin, 1 % goat serum, 1 % BSA, 1x PBS, 0.02 % sodium azide) at room temperature for 2 h on a Nutator mixer. To label HA-tagged UPRT, 3 dpf wild type and Tg(myo6b:UPRT) larvae were incubated in a 1:750 dilution of rat anti-HA clone 3F10 antibody (Roche -1186742300) in FSGGB overnight at 4°C, washed 5 × 15 min in 1x PBS/0.01 % Tween-20, incubated in a 1:1000 dilution of Alexa Fluor 488 goat anti-rat IgG (Life Technologies -A-11006), and washed again 5 × 15 min in 1x PBS/0.01 % Tween-20. Specimens were mounted on a depression slide in 1.2 % low-melting point agarose and imaged on a Zeiss LSM 700 confocal microscope using Zeiss Zen acquisition software. Whole mount mRNA in situ hybridization (ISH) and probe synthesis were performed essentially as described [38,39]. Using the SuperScript® III One-Step RT-PCR kit (Life Technologies -12574-018), probe templates were amplified from 3 to 5 dpf total zebrafish RNA using gene-specific oligos with T3 and T7 RNA polymerase sites on the 5'-end of forward and reverse primers, respectively. Specimens were mounted on a depression slide in 1.2 % low-melting point agarose and imaged on a Leica DMLB microscope fitted with a Zeiss AxioCam MRc 5 camera using Zeiss AxioVision acquisition software (Version 4.5).