Small RNA-Seq Analysis Reveals miRNA Expression Dynamics Across Tissues in the Malaria Vector, Anopheles gambiae

Malaria continues to be a major global health problem, where disease transmission is deeply linked to the repeated blood feeding nature of the anautogenous mosquito. Given the tight link between blood feeding and disease transmission, understanding basic biology behind mosquito physiology is a requirement for developing effective vector-borne disease control strategies. In the mosquito, numerous loss of function studies with notable phenotypes demonstrate microRNAs (miRNAs) play significant roles in mosquito physiology. While the field appreciates the importance of a handful of miRNAs, we still need global mosquito tissue miRNA transcriptome studies. To address this need, our goal was to determine the miRNA transcriptome for multiple tissues of the pre-vitellogenic mosquito. To this end, by using small RNA-Seq analysis, we determined miRNA transcriptomes in tissues critical for mosquito reproduction and immunity including (i) fat body-abdominal wall enriched tissues, (ii) midguts, (iii) ovaries, and (iv) remaining tissues comprised of the head and thorax. We found numerous examples of miRNAs exhibiting pan-tissue high- or low- expression, tissue exclusion, and tissue enrichment. We also updated and consolidated the miRNA catalog and provided a detailed genome architecture map for the malaria vector, Anopheles gambiae. This study aims to build a foundation for future research on how miRNAs and potentially other small RNAs regulate mosquito physiology as it relates to vector-borne disease transmission.

Malaria is a major health problem world-wide, where disease transmission is tightly linked to the repeated blood feeding nature of the anautogenous mosquito, Anopheles gambiae. Thus, understanding how the mosquito processes a blood meal for egg production is vital for future vector control strategies. Upon digesting a blood meal, the female mosquito's physiology drastically changes, with over 50% of her genes showing significant changes in transcript levels (Marinotti et al. 2006). While mechanisms are known for how gene expression fluctuates during a blood meal at the transcriptional level (Attardo et al. 2005), the mechanisms of post-transcriptional regulation of gene expression are largely unknown. As a much-needed step to address this gap in knowledge, the vector biology field is currently exploring post-transcriptional regulation of mRNA transcripts and its effects on female mosquito physiology by studying a class of small RNAs called microRNAs (miRNAs) (Mead and Tu 2008;Li et al. 2009;Gu et al. 2013;Hu et al. 2015;Skalsky et al. 2010;Biryukova et al. 2014;Liu et al. 2017;Jain et al. 2014;Castellano et al. 2015;Liu et al. 2014b;Jain et al. 2015;Allam et al. 2016;Lampe and Levashina 2018;Fu et al. 2017;Liu et al. 2014a;Zhao et al. 2017;Zhang et al. 2016;Lucas et al. 2015a;Lucas et al. 2015b;Ling et al. 2017;Zhang et al. 2017;Akbari et al. 2013).
Since the discovery of lin-14 in 1993 (Lee et al. 1993) and the functional characterization of let-7 in 2000 within Caenorhabditis elegans (Reinhart et al. 2000), numerous studies in other metazoans have demonstrated that miRNAs regulate developmental timing, drive convergent evolution, and maintain tissue homeostasis (Carthew et al. 2017). miRNAs are small non-coding RNA molecules of approximately 22 nucleotides (nt) in length, where the coding region of the miRNA reside either as individual coding sequences or as a cluster, and are either expressed in numerous or single tissues (Carthew et al. 2017). Approximately two-thirds of miRNAs are expressed from intergenic regions of the genome. The remaining third of miRNAs are intragenic, mostly found in introns, many of which are thought to be hot spots for novel miRNA evolution (Berezikov 2011). Intragenic miRNAs are conventional miRTrons or 39 tailed miRTrons, depending on their placement within the intron of the gene (Westholm and Lai 2011). Intergenic miRNAs follow canonical processing using RNA polymerase II for transcription to form a pri-miRNA, Drosha for cleavage to form the pre-miRNA, and Exportin-5 for exportation from the nucleus to the cytoplasm (Liu et al. 2008). In contrast, miRTrons do not require Drosha for cleavage (Westholm and Lai 2011). Both intergenic and intragenic miRNAs control mRNA transcript abundance by physically binding the mRNA transcript, resulting in either degradation or translation inhibition (Okamura et al. 2007;Liu et al. 2008).
While the vector biology field has made significant progress in identifying how some miRNAs regulate mosquito physiology, we still lack an overall understanding for how mosquito tissues vary in miRNA abundance. To this end, our overarching goal was to elucidate how across tissues miRNA transcriptomes in the pre-vitellogenic mosquito are different or similar. In addition, we updated and consolidated the miRNA catalog, provided a miRNA loci genome map, and found certain tissues to possess more multi-mapping small RNAs over other tissues. Altogether, this study provides an essential foundation for furthering our appreciation of the importance of miRNAs and other small RNAs for mosquito physiology in the malaria vector, An. gambiae.

Animals and Tissues
The An. gambiae G3 strain was reared as previously described . Adult female mosquitoes were collected at 3-5 days post-eclosion and four tissue groups were dissected. Adult female (i) midgut, (ii) ovaries, (iii) fat body-enriched abdominal walls (fat body-Ab), and (iv) remaining mosquito tissues including head and thorax were obtained. Dissected tissues were stored in DNA/RNA Shield (Zymo Research, Irvine, CA, USA) at 4°until further processing. For each of the four mosquito tissue groups, three separate cages of mosquitoes were used for the three biological replicates.

Small RNA Library Preparation and Sequencing
Mosquito tissues were removed from DNA-RNA Shield (Zymo Research) and further processed for RNA by using The Direct-zol RNA MiniPrep Kit (Zymo Research) following manufacturers protocol. While Direct-zol removes most DNA, remaining DNA was digested by on-the-column DNAse I treatment. Prior to library construction, RNA quality was assessed with Agilent 2100 Bioanalyzer using an RNA6000 Nano kit chip (Agilent, Santa Clara, CA, USA). For each tissue sample, 1 mg of total RNA was used as input into Illumina TruSeq Small RNA Sample Preparation Kit v2 (Illumina, San Diego, CA, USA). Following library construction, libraries were indexed with Illumina Small RNA adapters, followed by 15 cycle PCR amplification (Illumina). Following library preparation, a Sage Pippin Prep 3% cassette size fractionation system (Sage Science, Inc, Beverly, MA, USA) was used to capture fragment sizes ranging from 125-160bp, which includes the 113bp adapter sequence. The Sage Pippin Prep cassette size capture system targets small RNAs approximately within 13 -48bp. To validate purified libraries, the Agilent 2100 Bioanalyzer was used with the High Sensitivity DNA Kit (Agilent, Santa Clara, CA, USA). Following quality control of library preparation and subsequent library quantification, small RNA libraries were adjusted to 2 nM and pooled for multiplexed sequencing. The small RNA libraries were sequenced for 50 cycles on a HiSeq2500 in Rapid Read Run using the TruSeq Rapid SBS kit-HS (Illumina). Sequence data were converted to FASTQ files and de-multiplexed into individual sequences for further downstream analysis.

Small RNA Sequence Analysis
Analysis of FASTQ files obtained from the small RNA libraries was performed using CLC Genomics Workbench version 11.0.1 (Qiagen, Hilden, Germany). Raw reads from 12 libraries, composed of four mosquito tissue types with three biological replicates, were trimmed of adapter sequences. For miRNA analysis, trimmed sequences were size filtered for reads of 20 -24nt in length. Trimmed and size-selected reads were mapped to the An. gambiae PEST genome, AgamP4 (Giraldo-Calderón et al. 2015). Next, mapped reads were divided into multi-mapping-or unique-reads. As multi-mapping reads are repetitive in nature, only their percentage from the total reads was determined. Further, the multi-mapping reads were removed from further analysis, routine practice for miRNA transcriptomic studies (Sherstyuk et al. 2017;Wen et al. 2014). Unique reads were queried against miRBase v22 (Kozomara and Griffiths-Jones 2014) pre-miRNA sequences from An. gambiae and Ae. aegypti mosquitoes and miRNAs from recent studies (Fu et al. 2017;Biryukova et al. 2014), allowing up to 2nt mismatches. Reads matching these sequences were designated as annotated reads. Reads not matching these sequences were designated as unannotated reads. For annotated reads, each miRNA was manually inspected to ensure accuracy of annotated miRNA read counts determined by CLC Workbench. For unannotated reads, these sequences were fed into miRDeep2 with default parameters (Friedländer et al. 2012) as a means to determine candidate miRNAs. Predicted candidate miRNAs were examined manually to assess their potential based on guidelines suggestive of RNase III enzymatic cleavage (Wen et al. 2014;Berezikov et al. 2011): (1) candidate sequence must form a stable hairpin structure of at least -30kcal/mol using RNAfold (Gruber et al. 2008), (2) candidate sequence must have at least one star strand read, and (3) the 5p and 3p miRNA sequence must pair with at least a 2-nt overhang. To account for redundancy of miRNA nomenclature, all candidate miRNAs were crossreferenced to miRNAs reported in (Fu et al. 2017;Biryukova et al. 2014), which are not currently represented in v22 miRBase. Finally, candidate miRNAs were then included in the annotated read group for chromosome mapping analysis.
For miRNA genome map illustration purposes, chromosomal locations of miRNAs obtained from VectorBase through BLAST (Giraldo-Calderón et al. 2015) were constructed with Adobe Illustrator (Adobe Systems, San Jose, CA, USA) using a 10,000 bp:1 pixel ratio. For miRNA expression analysis, all miRNAs were interrogated for expression in our twelve small RNA-Seq libraries with annotated reads converted to reads per million (RPM). Normalized log 10 miRNA expression values were used to generate a heatmap using Morpheus from the Broad Institute (software.broadinstitute.org/morpheus/). Values were hierarchically clustered by Euclidean distance for the metric, average for linkage method, and rows (miRNAs) and columns (tissues) were clustered. The heatmap for the miRNA cluster expression analysis used the relative color scheme option in Morpheus (software.broadinstitute.org/ morpheus/), which assesses the minimum and maximum values for each miRNA across tissues and converts the expression data into colors for strictly qualitative non-statistical assessment purposes. Principle component analysis (PCA) was performed on annotated reads RPMs across tissues with singular value decomposition (SVD) along with no scaling as a means to keep the original variability of the data (Metsalu and Vilo 2015). Qualitative guidelines for categorization of miRNA tissue expression patterns include the following. miRNAs with expression values $3.5 log 10 RPM across all tissues were designated pantissue high expression. miRNAs with expression values #2.0 log 10 RPM across all tissues were designated pan-tissue low expression. miRNAs with tissue exclusion or tissue enrichment were determined by comparing the average expression values between the two lowest and highest expressing tissues for each miRNA, respectively. miRNAs with $0.5 log 10 RPM difference between the two lowest tissues had a tissue exclusion expression pattern. Conversely, miRNAs with $0.5 log 10 RPM difference between the two highest had a tissue enrichment expression pattern.
Quantitative PCR Quantification of miRNA expression was performed as previously demonstrated (Bryant et al. 2010). RNA was extracted with same methods described for small RNA libraries. Reverse transcription (RT) was performed using Qiagen miScript II RT Kit (Qiagen) followed by quantitative PCR (RT-qPCR) using the Qiagen miScript SYBR Green PCR Kit (Qiagen). The PCR condition was as follows: Step 1, 95°15 min; Step 2, 95°for 15 s, 56°for 30 s, 70°for 30 s for 40 cycles; Step 3, 95°1 min; melt curve analysis. For miRNA expression, (i) forward primers were the sequence of the mature miRNA up to 58°Tm, where miRNA candidates with a low Tm contained extra sequence added to the 59 end of the forward primer, and (ii) the reverse primer for all miRNAs was the universal reverse primer, as described previously (Bryant et al. 2010). All primer sequences are listed in Table S1. Ribosomal S7 gene served as the normalizer. Transcript expression was determined by 2 -ΔCt (Schmittgen and Livak 2008).

Small RNA sequencing of An. gambiae tissues
To determine the miRNA transcriptome at the tissue level for An. gambiae, we divided the female mosquito into four tissue groups, (i) abdominal walls for fat body-Ab sample, (ii) midguts, (iii) ovaries, and (iv) remaining tissues comprised of the head and thorax. From these four tissue groups, each with three biological replicates, we generated and sequenced twelve small RNA libraries, which resulted in a total 148.2 million reads. Reads trimmed of adaptor sequences yielded a total of 141.3 million reads (Table S2). Across all tissues, the number of trimmed reads was roughly similar with a range of 10.8-12.3 million reads (Table S2). However, size profile distributions of these small RNA reads demonstrated a noticeable difference across tissues ( Figure 1A-D). Fat body-Ab and midgut tissues possessed peaks at 22-23, 29, 32, and 41nts ( Figure 1A and B), ovary tissue possessed peaks mostly from 25-30nts ( Figure 1C), and remainder tissue only yielded a slight peak at 29nt ( Figure 1D).
To focus on miRNAs, we size-filtered for reads within 20-24nts. Here, we obtained a total of 20.1 million reads across all tissue groups ( Figure 1E-H). On average for these size-filtered small RNAs: the fat body-Ab had $1.9 million reads, the midgut had $3.1 million reads, the ovary had $0.4 million reads, and the remainder had $1.2 million reads ( Table S2). The size-filtered reads mapped to the genome with a range of 99.4-99.7% (Table S2). Further, these reads were partitioned into four groups based on two parameters, (i) genome mapping properties and (ii) miRNA gene annotation (explained in Materials and Methods) ( Figure 1E-H). The first group, all reads, simply contained all reads and represented the sum of the remaining three groups discussed below. The second group, annotated reads, contained reads with unique mapping properties and miRNA gene annotation based on miRBase databases, miRNAs reported in (Fu et al. 2017;Biryukova et al. 2014), and miRDeep2 candidate miRNAs. The third group, multi-mapping reads, contained reads with non-unique mapping properties with no annotation. The fourth group, unannotated reads, contained reads with unique mapping properties but no available annotation.
The unannotated group contained the smallest number of reads representing $6.5% of size-filtered small RNA reads across all tissues (Figure 1E-H and Table S2), where small RNAs species within this group remain to be discovered in future studies. Interestingly, on average across all tissues $43.3% of the size-filtered small RNA reads were multi-mapping (Figure 1E-H and Table S2), in agreement with other mosquito small RNA studies demonstrating that close to half of small RNA reads are multi-mapping (repetitive or non-unique) (Fu et al. 2017;Biryukova et al. 2014). Furthermore, each tissue group varied in the percentage of multi-mapping small RNAs. On average, multi-mapping reads accounted for $26.2% for the fat body-Ab, $41.8% for the midgut, $36.6% for the ovary, and $68.6% for remaining tissues of the size-filtered small RNAs ( Figure 1E-H and Table S2). The annotated group represented the majority of size-filtered small RNA reads at $50.2% across all tissues (Figure 1E-H and Table S2), and were the focus of the study. For the rest of the results, only annotated reads are discussed.
Genome mapping properties for annotated miRNA reads As our main goal was to determine the uniqueness of the various mosquito tissue miRNA transcriptomes, (i) we mapped annotated reads across all chromosomes to determine the chromosomal contribution for each tissue's miRNA transcriptome, and (ii) for each chromosome, determined genome placement and the level of annotated read intensity across all tissues. However, it is important to note that the number of annotated reads vary for each tissue group. For example, while on average the fat body-Ab and midgut had $1.4 million and $1.8 million annotated reads, respectively (Table S2), ovary and remainder tissues had a lower number of annotated reads at $0.2 and $0.3 million reads, respectively (Table S2). Both fat body-Ab and midgut possessed more annotated reads mapped to individual chromosomes compared to ovary and remaining tissues (Figure 2A).
By determining the percentage of mapped annotated reads across chromosomes ( Figure 2B), we determined the chromosomal contribution for each mosquito tissue miRNA transcriptome. We found chromosomes 2L, 2R, 3L, and 3R contributed roughly similar to ovary-and remainder-tissues' miRNA transcriptomes with a range of 12.51-39.58% and 8.74-31.11%, respectively ( Figure 2B). Conversely, for fat body-Ab and midgut tissues, chromosome 3L at 68.60% and chromosome 2L at 74.58% contributed the most to these tissue miRNA transcriptomes, respectively ( Figure 2B). Across all tissues, X and UNKN chromosomes contributed minimally with highest values only at 3.27% for the remainder tissue and 0.08% for the fat body-Ab, respectively (Figure 2A-B).
At the chromosomal level with varying annotated read intensities across all tissues, we found multiple mapping hot spots ( Figure 2C-H). As the number of annotated reads mapped to each chromosome varies for each tissue, it is important to pair mapping data ( Figure 2C-H) with total number of reads per tissue (Figure 2A). For example, on chromosome 2L, a majority of the annotated reads mapped to the $18Mb locus across all tissues ( Figure 2C). However, the midgut contributed 90.4% to these 2L chromosome mapped annotated reads across all tissues ( Figure 2C). Similarly, on chromosome 3L, a major portion of the annotated reads across tissues mapped to the $39Mb locus across all tissues, but the fat body-Ab contributed 81.9% to these 3L chromosome mapped annotated reads across all tissues ( Figure 2E). Lastly, while 2R and X chromosomes shared similar trends of tissue contribution to annotated reads mapped to these chromosomes (compare Figure 2D and G), it is important to note that 2R chromosome contributed more to the mosquito's overall miRNA transcriptome over the X chromosome (Figure 2A-B). miRNA genome architecture In addition to updating and consolidating the An. gambiae miRNA catalog (Datasets S1 thru S3), we further developed a concise miRNA genome architecture map for future work on An. gambiae miRNAs. The miRNA genome architecture map illustrates the An. gambiae chromosome placement for all 139 miRNAs (Figure 3), further corresponding datasets are found in (Datasets S1 and S3). The miRNA genome architecture map highlights intergenic-and intragenic-miRNAs, as well as clusters for these two different miRNA types (Figure 3). Similar to previous studies, 64.0% of the miRNAs were intergenic, while the remaining 36.0% were intragenic (Westholm and Lai 2011;Rodriguez et al. 2004;Biryukova et al. 2014). For the intragenic miRNAs, 88.0% were intronic, 6.0% were conventional miRTrons, 4.0% were 39 tailed miRTrons, and 2.0% were within an exon (Dataset S3). Corresponding Anopheles gambiae Pest strain (AGAP) numbers for these miRNAs can be found in Datasets S1 and S3. Based on previous studies, groups of miRNAs were designated as clusters if they resided within 10 kb (Marco et al. 2013;Mohammed et al. 2014). Indeed, 37.4% of miRNAs resided in clusters, of which we found conserved miRNA clusters (Marco et al. 2013;Liu et al. 2014a) and novel miRNA clusters comprised of our miRDeep2 candidate miRNAs and previously described miRNAs (Fu et al. 2017;Biryukova et al. 2014) (Dataset S3). Lastly, 42.3% of miRNA clusters were intragenic (Figure 3 and Dataset S3).
By combining the genome mapping properties of annotated reads (Figure 2) with the miRNA genome architecture map (Figure 3), we correlated mapping annotated read peaks to a single miRNA or multiple miRNAs. For example, on chromosome 2L across all tissues, a prominent peak of mapped reads at approximately 18Mb ( Figure 2C) represented miR-281 (Figure 3). On chromosome 3L across all tissues, a prominent peak of mapped reads at approximately 39Mb ( Figure 2E) represented miR-8 (Figure 3). On chromosome 3L in ovary tissue, a prominent peak at approximately 3Mb ( Figure 2E) represented miR-989 ( Figure 3). Lastly, on chromosome 2R in ovary tissue, a prominent peak at approximately 39Mb ( Figure 2D) represented miR-92a and miR-92b ( Figure 3). Altogether, these prominent specific miRNAs accounted for a substantial proportion of the mapped annotated reads.
Multiple tissue miRNA transcriptome-wide analysis in An. gambiae First, PCA was performed on annotated reads RPM across tissues, where biological replicates for each tissue sufficiently grouped ( Figure S1). Further, to illustrate miRNA expression across mosquito tissue groups, reads were converted to RPM and the average for the three biological replicates were log 10 transformed (Dataset S4). Dendrograms of the different tissue groups illustrated fat body-Ab and remainder tissues clustered closely together, followed by midgut and ovary tissues ( Figure  4A, top of heatmap). To illustrate the range of miRNA expression across tissues, we separated the heatmap into three clades, Clade I representing the highest expressed miRNAs ( Figure 4B), Clade II representing middle ( Figure 4C), and Clade III representing lowest expressed miRNAs ( Figure 4D). To validate our mosquito tissue miRNA transcriptome data, fourteen miRNAs were assessed for tissue expression trends by RT-qPCR analysis. In all cases, miRNA tissue expression trends as determined by RT-qPCR agreed with miRNA tissue expression trends as determined by small RNA-Seq ( Figure S4).
Our small RNA-Seq analysis found a myriad of miRNA tissue expression patterns for most but not all miRNAs. To determine miRNA tissue expression patterns, miRNA expression values were interrogated with a qualitative set of guidelines to yield (i) pan-tissue high expression, (ii) pan-tissue low expression, (iii) tissue exclusion, and (iv) tissue enrichment (see Methods for guidelines, see Dataset S5 for miRNA list). Of important note, there were varying degrees of tissue -exclusion or -enrichment across all miRNAs. Examples of miRNAs with pan-tissue high expression included miR-281, miR-8-3p, miR-306, miR-184, bantam, miR-276, and miR-14 which resided within Clade I ( Figure 4B) and within the top twenty most abundant miRNAs across all tissues ( Figure 5). On the converse, the majority of the miRNAs exhibited pantissue low expression (Dataset S5), some within Clade II but mostly within Clade III ( Figure 4C, D). Examples of miRNAs excluded from ovary include miR-10, miR-34, miR-2940, and miR-317 (Dataset S5) within Clade I ( Figures 4B). Examples of miRNAs excluded from midgut include miR-276 3p, miR-263a, and miR-71 (Dataset S5) within Clade I ( Figure 4B). Examples of miRNAs excluded from remainder tissue include miR-281 3p and miR-9b (Dataset S5) within Clade I ( Figure 4B).
Hierarchical clustering of all miRNA expression data did not find any of the miRNAs within their respective cluster to exhibit co-expression patterns across tissues ( Figure 4). However, by analyzing miRNA clusters alone with the 'relative color scheme' option in Morpheus (see Methods) for strictly qualitative non-statistical assessment, nine of the sixteen miRNA clusters demonstrated co-expression patterns within their respective cluster across tissues. Examples included: (i) miR-1174 and miR-1175 on chromosome 2R, (ii) miR-92a and miR-92b on chromosome 2R, and (iii) miR-309, miR-2944a-1, miR-2944b, and miR-286 on chromosome 3R ( Figure S5).

DISCUSSION
By small RNA-Seq analysis, we determined the diverse range of small RNAs across An. gambiae mosquito tissues. This study (i) updated and consolidated the malaria vector miRNA catalog with additional Aedes orthologs and seven candidate miRNAs, (ii) made a genome architecture map to highlight chromosome loci for intragenic-and intergenic-miRNAs and their respective clusters, (iii) found mosquito tissues to vary in amount of multi-mapping small RNAs, and (iv) determined assorted miRNA tissue expression patterns in the malaria vector, An. gambiae. A recent report used microarray technology to determine miRNA abundance in mosquito tissues in the An. gambiae strain, An. coluzzi Ngousso (TEP1 Ã S1) (Lampe and Levashina 2018). Despite differences in mosquito strains and technology, there are numerous examples of where data from both studies agree. For example, both studies found miR-989 and miR-1175 to be ovary-and midgut-enriched, respectively (Lampe and Levashina 2018). Additionally, both studies found miRNAs miR-8, miR-306, and miR-184 to be highly expressed across tissues (Lampe and Levashina 2018). However, we did find additional data from our small RNA-Seq libraries. Our data found miR-1174 to be expressed higher than its miRNA cluster companion miR-1175, which agrees with Ae. aegypti (Liu et al. 2014a) but disagrees with (Lampe and Levashina 2018). We found bantam, miR-276, miR-263a, and miR-14 to be of the top twenty highest miRNAs across all tissues, which disagrees with (Lampe and Levashina 2018) who reported these miRNAs to be lowly expressed across tissues. RNA-seq has many advantages over microarray technology: (i) lower false discovery rate, (ii) lower background noise, and (iii) a larger dynamic range to quantify gene expression (Wang et al. 2009;Nault et al. 2015), which can explain the data variability between these studies. However, despite these differences, taken together both the (Lampe and Levashina 2018) and this current RNA-seq study, further drive our appreciation and understanding of miRNA differences across mosquito tissues in the malaria vector, An. gambiae.
Additionally, our mosquito miRNA transcriptome analysis found multiple miRNAs to exhibit either global tissue expression, tissue exclusion, or tissue enrichment. miRNAs with strong enrichment in a specific tissue are termed tissue-specific miRNAs, where current hypothesis suggests these small RNAs regulate the development of novel cell types as well as maintain tissue homeostasis (Wheeler et al. 2009;Christodoulou et al. 2010;Stark et al. 2005;Chen et al. 2004). Examples include, (i) mouse thymus-specific miR-181, which regulates hematopoietic tissue (Chen et al. 2004), (ii) fruit fly muscle-specific miR-1, which regulates muscle physiology (Sokol and Ambros 2005), (iii) mosquito midgut-specific miR-1174, which regulates blood digestion enzymes (Liu et al. 2014a), and (iv) mosquito ovary-specific miR-309, which regulates egg development (Zhang et al. 2016). In this study, we found miRNAs enriched in specific tissues to varying degrees. Given the drastic transcriptional changes induced by blood feeding (Marinotti et al. 2006), future research is needed to decipher which of these tissueenriched miRNAs retain or lose their specific tissue expression. Of interest to mosquito reproduction is miR-989. This miRNA was exclusively expressed in the ovary, which is in agreement with other mosquito studies (Lampe and Levashina 2018;Mead and Tu 2008;Castellano et al. 2015;Jain et al. 2015) and the butterfly (Pararge aegeria) (Quah et al. 2015). In Drosophila, functional data demonstrate miR-989 controls border cell migration within the ovary (Kugler et al. 2013), where defects in border cell migration severely affect egg fertilization and result in sterility phenotypes (Montell et al. 2012). Given the ongoing interests in mosquito sterility for vector population control strategies (Catteruccia et al. 2009), future work on how the ovary-specific miR-989 regulates the mosquito gonadotrophic cycle should prove fruitful. A curious part of our mosquito tissue small RNA-Seq data are the substantial number of multi-mapping small RNAs. While common in small RNA-Seq studies (Sherstyuk et al. 2017;Wen et al. 2014;Johnson et al. 2016;Biryukova et al. 2014;Fu et al. 2017), the current practice is to remove multi-mapping reads from miRNA analysis (Sherstyuk et al. 2017;Wen et al. 2014). Indeed, here we removed many Anopheles v22 miRBase and previously reported miRNAs (Fu et al. 2017;Biryukova et al. 2014) due to their multi-mapping nature. However, while we removed these small RNA sequences from the An. gambiae miRNA catalog, we are not suggesting they are not important for mosquito physiology. Rather, we suggest they were simply miscategorized. Indeed, some small RNA researchers suggest multi-mapping (or repetitive) small RNAs must regulate some aspect of physiology given their substantial abundance in plants and animals (Johnson et al. 2016;Axtell 2013;Slotkin 2018). Reproduction studies in the fruit fly find multi-mapping small RNAs, called repeat associated siRNAs (rasiRNAs), to regulate genomic stability, DNA damage in the germline, embryonic axis specification, and mRNA localization during oogenesis (Theurkauf et al. 2006;Vagin et al. 2006;Klattenhoff et al. 2007). Also, mutant flies lacking the ability to produce rasiRNAs result in sterile females with oogenesis defects (Pane et al. 2007). Interestingly, our data show the fat body-Ab and remainder tissues to greatly vary in their abundance of multi-mapping small RNAs, while dendrogram analysis of annotated reads grouped these mosquito tissues together. While the above mentioned small RNAs differ in length to rasiRNAs, 20-24nt vs. 25-29nt, respectively, both small RNA groups are multimapping. Given the vast abundance of multi-mapping small RNAs across mosquito tissues and the inherent challenges of studying them (Johnson et al. 2016;Axtell 2013), future work is needed to decipher their multiple genomic loci, as well as their importance in mosquito physiology. Lastly, this study focused on miRNAs, which account for less than 20% of all small RNAs obtained in small RNA-seq studies, in agreement with other studies performed in mosquitoes (Biryukova et al. 2014;Fu et al. 2017;Akbari et al. 2013). Thus, future work is also needed to classify, categorize, and quantify small RNAs outside the 20-24nt range.
Our tissue miRNA expression data represents the resting reproductive state of the previtellogenic adult female mosquito (Clements 2000). Interestingly, we found less than a third of miRNAs to be highly expressed across mosquito tissues. Further, given the high number of lowly expressed miRNAs in the previtellogenic adult female mosquito, stimuli required to induce expression for the majority of miRNAs awaits discovery. Several insect studies demonstrate a wide range of stimuli can induce expression of lowly expressed miRNAs. Hormonal signals serve as developmental stimuli in both the fly and mosquito (Sempere et al. 2003;Liu et al. 2014b;Hu et al. 2015;Gu et al. 2013;Li et al. 2009). Environmental stimuli like insect swarm aggregation in the migratory locust, Locusta migratoria, induces expression of a miRNA required for synchronous egg hatching (He et al. 2016). Extreme cold temperature induces specific miRNA expression required for freeze tolerance in the gall fly, Eurosta solidaginis (Courteau et al. 2012). Lastly, lipopolysaccharide injection and infection with ZIKA virus serve as stimuli to induce expression of particular miRNAs in the tick, Rhipicephalus haemaphysaloides, and mosquito, Ae. aegypti, respectively (Wang et al. 2015;Saldaña et al. 2017). Thus, future work on these lowly expressed miRNAs will identify stimuli needed to induce their expression as well as decipher their importance in disease transmission by An. gambiae.
As human malaria continues to be one of the most important vectorborne diseases, there is always a need to learn basic mosquito physiology. Previous studies demonstrate the importance of miRNAs in mosquito reproduction (Lucas et al. 2015b;Ling et al. 2017;Zhao et al. 2017;Fu et al. 2017;Bryant et al. 2010;Liu et al. 2014a;Zhang et al. 2016;Lucas et al. 2015a) and immunity (Dennison et al. 2015;Winter et al. 2007). However, we lack basic biology behind miRNA expression in various mosquito tissues. To this end, this study yielded an updated and consolidated miRNA catalog, a genome architecture map highlighting intragenic and intergenic miRNAs, and small RNA transcriptomes for mosquito tissues critical for reproduction and immunity. As a whole, this data will provide a stronger foundation for future work on miRNAs and potentially other small RNAs in the malaria vector, An. gambiae.