Reproductive functions and genetic architecture of the seminal fluid and sperm proteomes of the mosquito Aedes aegypti

The yellow fever mosquito, Aedes aegypti, transmits several viruses, including dengue, Zika, and chikungunya. Some proposed efforts to control this vector involve manipulating reproduction to suppress wild populations or replacing them with disease-resistant mosquitoes. The design of such strategies requires an intimate knowledge of reproductive processes, yet our basic understanding of reproductive genetics in this vector remains largely incomplete. To accelerate future investigations, we have comprehensively catalogued sperm and seminal fluid proteins (SFPs) transferred to females in the ejaculate using tandem mass spectrometry. By excluding female-derived proteins using an isotopic labelling approach, we identified 870 sperm proteins and 280 seminal fluid proteins. Functional composition analysis revealed parallels with known aspects of sperm biology and SFP function in other insects. To corroborate our proteome characterization, we also generated transcriptomes for testes and the male accessory glands—the primary contributors to Ae. aegypti sperm and seminal fluid, respectively. Differential gene expression of accessory glands from virgin and mated males suggests that protein translation is upregulated post-mating. Several SFP transcripts were also modulated after mating, but >90% remained unchanged. Finally, a significant enrichment of SFPs was observed on chromosome 1, which harbors the male sex determining locus in this species. Our study provides a comprehensive proteomic and transcriptomic characterization of ejaculate production and composition and thus provides a foundation for future investigations of Ae. aegypti reproductive biology, from functional analysis of individual proteins to broader examination of reproductive processes.

Introduction

microcentrifuge tube on ice, and pooled sperm samples were flash frozen in liquid nitrogen every 2 h.

Two biological replicates included sperm combined from 400 and 470 randomly selected males, respectively. Pooled samples were centrifuged at 25,000 x g and the supernatant was removed, leaving 18 µl saline with the pellet. An equal volume of 2x Laemmli buffer + 5% β-mercaptoethanol was added to the pellet, and samples were solubilized by sonicating for 30 s, boiling for 15 min, and re-sonicating for 30 s. To remove any particulate debris, samples were spun down at 10,000 x g for 10 min, and the supernatant was placed in a fresh tube. Protein was quantitated using a 1:169 5 dilution of the sample using the EZQ assay (Thermo Fisher Scientific, Waltham, MA). Protein used for subsequent mass spectrometry was standardized across biological replicates (16 µg).

Transferred ejaculate isolation

Males were reared as described above. As larvae, females were labelled with 15N using the rearing methodology of Sirot et al. [27]. Briefly, a prototrophic yeast strain (D273-10B) was grown in media whose only nitrogen source was 15N ammonium sulfate (Cambridge Isotope Laboratories; Cambridge, MA). Yeast were grown to saturation, pelleted, and resuspended in PBS to a final volume of one sixteenth of the growth media. A few drops of yeast slurry were provided to newly hatched first instar larvae after vacuum hatching. One day after hatching, 200 larvae were placed in a rearing tray with 200 mL water from a previous cohort of 15N-yeast-reared mosquitoes and 800 mL of deionized water. Larvae were fed 4 mL of labeled yeast slurry a day after hatching, and again at 4 d after hatching. Pupae were isolated in individual tubes to ensure virginity upon eclosion, and females were put in an 8 L bucket cage with sucrose solution ad libitum.
Sucrose was replaced every 2 d to preclude the introduction of unlabeled nitrogen via microbial contamination.

At 4-5 dpe, matings between labeled females and unlabeled males were observed as in Degner and Harrington [35]. After mating, females were immediately placed on ice and dissected within 3 min. Because mosquito seminal fluid is known to contain proteases [27], mosquitoes were dissected in saline with protease inhibitors (cOmplete Mini Protease Inhibitor Cocktail; Sigma Aldrich, St. Louis, MO). In contrast to Sirot et al. [27], we dissected only bursae (and not spermathecae), cutting the bursae just

Tandem mass spectrometry analysis

Solubilized proteins were separated on a 1-dimensional SDS-PAGE gel and split into 6 fractions, with two biological replicates run in parallel (Figure S1). Gel fractions were reduced in dithiothreitol, alkylated with iodoacetamide, and digested with trypsin. Lyophilized, digested proteins were reconstituted in 0.5% formic acid and subjected to nanoLC-ESI-MS/MS analysis using an Orbitrap Fusion Tribrid mass spectrometer (Thermo-Fisher Scientific, San Jose, CA) equipped with a nanospray Flex Ion Source, and coupled with a Dionex UltiMate 3000RSLC nano system (Thermo, Sunnyvale, CA). Peptide samples were injected onto a PepMap C-18 RP nano trap column (5 µm, 100 µm i.d. x 20 mm, Dionex) with nanoViper fittings at 20 µL/min flow rate for desalting. Samples were then separated on a PepMap C-18 RP nano column (2 µm, 75 µm x 15 cm) at 35 °C, followed by elution on a 90 min gradient of 5% to 35% acetonitrile in 0.1% formic acid at 300 nL/min. Finally, a 5 min ramping to 90% acetonitrile in 0.1% formic acid and a 5 min hold at this eluent completed each run cycle. Between cycles, the column was re-equilibrated for 25 min using 0.1% formic acid.
The Orbitrap Fusion was run in positive spray ion mode with spray voltage set at 1.6 kV and a source temperature at 275 °C. External calibration for FT, IT, and quadrupole mass analyzers was performed. In data-dependent acquisition analysis, the instrument was operated using the FT mass analyzer in MS scan to select precursor ions, followed by 3 s "Top Speed" data-dependent CID ion trap MS/MS scans at 1.6 m/z quadrupole isolation for precursor peptides with multiple charged ions above a threshold ion count of 10,000 and normalized collision energy of 30%. MS survey scans at a resolving power of 120,000

Raw data from each MS/MS run were analyzed by X!Tandem [36] and Comet [37] against the Ae. aegypti L5.0 protein database (GCF_002204515.2; [38]

Iodoacetamide derivative of cysteine was specified as a fixed modification, whereas oxidation of methionine and deamidation of glutamine and asparagine were specified as variable modifications. Peptides were allowed up to two missed trypsin cleavage sites.

greater than 95.0% iProphet probability, and protein assignations were accepted if they could be established at greater than 99.0% probability. Proteins that contained identical peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy parsimony principles.

To ensure the reproducibility of protein identifications, two biological replicates of each tissue were analyzed and robust identification criteria were applied (see above). Each replicate was prepared from independent cohorts of mosquitoes. To control for the possibility that unlabelled female-derived proteins were identified in our ejaculate samples, we assessed labelling efficiency by conducting mass spectrometry on virgin bursae alone, using a representative gel slice from each biological replicate (Figure S1).
False Discovery Rates (FDRs) were estimated with a randomized decoy database using PeptideProphet [40], employing the accurate mass binning model and the nonparametric negative distribution model. Corrections for multiple testing were applied where appropriate to ensure the conservative nature of statistical tests.
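The idea behind decoy-based FDR estimation can be illustrated with a deliberately simplified sketch. This is not PeptideProphet's mixture model; it is the plain target-decoy counting estimate, shown only to make the concept concrete. The function name, the score scale, and the example values are all hypothetical.

```python
def decoy_fdr(scores, is_decoy, threshold):
    """Simple target-decoy FDR estimate at a score threshold.

    Assumption: decoy hits above the threshold approximate the number of
    false target hits, so FDR ~= decoys / targets. PeptideProphet itself
    fits score distributions rather than counting hits this way.
    """
    targets = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and not d)
    decoys = sum(1 for s, d in zip(scores, is_decoy) if s >= threshold and d)
    return decoys / targets if targets else 0.0


# Hypothetical peptide-spectrum match scores and decoy flags.
example_scores = [0.99, 0.98, 0.97, 0.60, 0.50]
example_is_decoy = [False, False, True, False, True]
example_fdr = decoy_fdr(example_scores, example_is_decoy, threshold=0.95)
```

Raising the threshold trades identifications for a lower estimated FDR, which is the same trade-off the probability cut-offs above control.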

Transcriptome analysis of testes and male accessory glands

Testes were harvested from males at 1 dpe and transferred to TRIzol. Because mature sperm are actively produced at this age [43], and spermatogenesis is at its peak [44,45], testes at this age likely contain the majority of transcripts that contribute to the testicular sperm proteome. Male accessory glands (MAG) including the connecting ejaculatory duct were dissected from virgin males aged 6 and 8 dpe. We also analyzed MAG from mated males at the same age. Previous work has demonstrated that Ae. aegypti males become depleted after mating with three to five females in succession, and seminal fluid is slowly regenerated over 48 h [46]. In our study, we provided males with four virgin females for a period of 8 h to allow for male seminal fluid depletion. On average, each male mated with more than three females in this period (as determined by dissection of females' spermathecae). Males were dissected in saline 16 h after their female mates had been removed. We generated four biological replicates from independent cohorts for each treatment, and each replicate contained combined tissue from 20-40 (testes) or 40-60 (MAG) males. Total RNA was extracted from each sample in TRIzol following the manufacturer's instructions (Invitrogen, Carlsbad, CA). Poly-A mRNA was isolated and cDNA libraries were prepared using the QuantSeq 3'

For further analysis of the transcriptomes, we included additional, publicly available data to evaluate tissue-biased gene expression. These include gonadectomized male carcass (SRP075464; [48]) and a virgin female reproductive tract sample (SRP068996; [49]). Raw RNA-seq reads were processed by trimming the first 10 bases from the 3' position, followed by quality trimming of both ends to a minimum quality Phred score of 20 (Sickle v.1.210; [50] [52]).
Raw counts for each sample were extracted from the StringTie abundance estimates using the auxiliary "prepDE.py" script provided on the StringTie website (https://ccb.jhu.edu/software/stringtie/). Signal peptides in the translated transcriptome were predicted in silico using a local installation of SignalP (v.4.1; [53]). We used raw counts from the RNA-seq samples to (1)

(Table S1). The nearly two-fold disparity in PSMs is due to the contribution of labelled female proteins in the ejaculate sample. In total, 870 and 811 proteins with at least 2 unique PSMs were identified in sperm and ejaculate, respectively (Table S2). Sperm proteins were identified by an average of 11.4 unique PSMs and 62.8 total PSMs per protein.

Ejaculate proteins were identified with an average of 8.0 unique PSMs and 37.8 total PSMs. As was expected given the substantial contribution of sperm cells to ejaculate composition, extensive overlap was observed between sperm and ejaculate proteomes; 516 proteins were detected in both samples, while 354 proteins were only identified in sperm and 295 proteins were uniquely detected in the ejaculate (Figure 1A).
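As a quick consistency check, the overlap counts above can be recombined to recover the reported proteome sizes. The short sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts reported in the text (proteins with at least 2 unique PSMs).
shared = 516          # detected in both sperm and ejaculate
sperm_only = 354      # detected only in sperm
ejaculate_only = 295  # detected only in the ejaculate

sperm_total = shared + sperm_only              # size of the sperm proteome
ejaculate_total = shared + ejaculate_only      # size of the ejaculate proteome
union_total = shared + sperm_only + ejaculate_only  # distinct proteins overall

print(sperm_total, ejaculate_total, union_total)
```

The first two sums reproduce the 870 sperm and 811 ejaculate proteins given earlier; the third is the number of distinct proteins across both samples.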

The primary goal of this study was to use MS/MS with higher sensitivity and accuracy to expand upon the prior characterization of Ae. aegypti SFPs and sperm by Sirot et al. [27]. They identified 74 SFPs that mapped to the recently refined Ae. aegypti genome [38]; some of the 93 SFPs described by Sirot et al. [27] do not map to the new genome or are now part of larger, fused gene models. Of these 74, we identified 60 (81%) in our ejaculate sample, 32 of which were also identified in our purified sperm sample. It is noteworthy that we detected an additional 5 SFPs from Sirot et al. [27], but these were not included in our SFP list because they did not meet our inclusion criterion of two unique peptides. As such, our proteomic characterization expands the previous Ae. aegypti SFP characterization.

SFPs and sperm proteins. We first removed 10 proteins involved in protein translation (i.e. ribosomal proteins, translation initiation factors, and elongation factors; Table S2) from the list of putative SFPs. These proteins exhibit ubiquitous patterns of expression, including both MAG and testes, and are unlikely to be bona fide secreted SFPs.

Although we cannot rule out that they are secreted SFPs, their presence may be the result of cell rupture during seminal secretion [60] or transfer of MAG cells to the female, as has been described in D. melanogaster [61]. To reduce the possible inclusion of sperm proteins that were absent in our sperm proteome but present in the ejaculate (perhaps due to low abundance), we define "high confidence SFPs" as proteins with a minimum of 6 total PSMs in the ejaculate, 2 unique PSMs, and not present in our sperm proteome; this resulted in 177 high-confidence SFPs (Table S2).

Previous analyses of insect sperm proteomes have consistently identified proteins generally considered to be SFPs (i.e. highly expressed in the MAG and believed to be secreted molecules transferred to females as non-sperm components) [32,62,63]. To identify proteins predominantly produced by the MAG, but identified in both our sperm and ejaculate sample, we used a 2.5-fold greater abundance threshold in the ejaculate relative to sperm. This resulted in the identification of 103 additional putative SFPs, which we label as "sperm/SFP overlap" (Table S2), including 53 with 5-fold greater protein abundance in the ejaculate relative to sperm. In total, this results in a combined SFP proteome of 280 proteins.
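The classification rules described above (at least 6 total and 2 unique PSMs in the ejaculate and absence from the sperm proteome for high-confidence SFPs; a 2.5-fold ejaculate-to-sperm abundance ratio for the sperm/SFP overlap set) can be sketched as a small decision function. This is an illustrative reading of the stated criteria, not the authors' pipeline: the record fields, the abundance measure, the label for proteins that meet neither rule, and the use of "greater than or equal to" at the fold threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ProteinEvidence:
    """Hypothetical per-protein summary; field names are illustrative."""
    ejaculate_total_psms: int
    ejaculate_unique_psms: int
    in_sperm_proteome: bool
    ejaculate_abundance: float = 0.0  # assumed normalized abundance measure
    sperm_abundance: float = 0.0


def classify(p, min_total=6, min_unique=2, fold=2.5):
    """Assign a protein to one of the categories described in the text."""
    if p.ejaculate_unique_psms < min_unique:
        return "not identified in ejaculate"
    if not p.in_sperm_proteome:
        # Ejaculate-only proteins need enough total PSMs to be high confidence.
        if p.ejaculate_total_psms >= min_total:
            return "high-confidence SFP"
        return "unclassified"
    # Found in both samples: compare abundance in ejaculate relative to sperm.
    if p.sperm_abundance > 0 and p.ejaculate_abundance / p.sperm_abundance >= fold:
        return "sperm/SFP overlap"
    return "sperm protein"


# The combined SFP proteome is the sum of the two SFP categories.
combined_sfps = 177 + 103
```

Under these rules the 177 high-confidence SFPs and 103 sperm/SFP overlap proteins add up to the combined SFP proteome of 280.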

To better understand the relationship between candidate SFPs identified in the

Wilcoxon rank sum test; W = 3005, p < 0.001). We therefore conclude that the sensitivity of our SFP characterization resulted in the addition of a far greater number of low abundance SFPs.

To evaluate our SFP characterization, we explored three types of analyses. First, we determined the presence of predicted signal peptides in identified proteins, a hallmark of secreted proteins. Amongst our high-confidence SFPs, ∼33% contained signal peptides, in comparison to only ∼9% of sperm proteins (χ² = 69.2, p < 0.001).

Orthology with sperm proteins and SFPs in other species

We next examined orthology of sperm proteins and SFPs in two different species: D. melanogaster, given its well-characterized sperm proteome and SFPs [32,32], and Ae. albopictus, which is the closest species to Ae. aegypti with characterized SFPs [28]. Orthology was determined between the complete genomes of all three species, and then orthologs of Ae. aegypti SFPs and sperm proteins also classified as SFPs or sperm proteins in the other species were identified. In the comparison with Ae. albopictus, we focus solely on SFPs because a thorough sperm proteome is lacking. Overall, ∼87% and ∼98% of proteins in the Ae. aegypti genome have an ortholog (either as one-to-one or one-to-many relationships) in D. melanogaster and Ae. albopictus, respectively. Among our identified proteins unique to the Ae. aegypti sperm proteome, 760 (99%) have an ortholog in D. melanogaster, and 451 (59%) of these are also found in the D. melanogaster sperm proteome (Table S2; [32,63]

Table 1. Gene Ontology analysis of sperm proteins, high-confidence SFPs, and sperm/SFP overlap. List of significant terms is abbreviated to exclude redundancy and to focus on terms discussed in text. For exhaustive list, see Table S3. FDR, false discovery rate.

the association between tissue-biased mRNA expression and differential protein abundance for proteins that were detected in both ejaculate and sperm samples. These results demonstrate that proteins with significant protein abundance differences between sperm and ejaculate samples also tend to show >2-fold tissue-biased mRNA expression (Figure 2C and 2D), further supporting our SFP classification criteria (see above).

However, we note that this relationship is less faithful for lower abundance proteins.

Because males regenerate seminal fluid over the course of 48 h after depleting their reserves by repeated insemination [46], we reasoned that MAG transcriptional regulation after mating might inform our understanding of pathways required to restore depleted SFPs. Differential expression analysis of virgin and mated MAG transcriptomes revealed a significant bias towards gene upregulation in mated males, with 320 transcripts that are upregulated and 126 that are downregulated after mating (binomial test; p < 0.001; Figure 3A). In contrast to downregulated transcripts, which

Previously, we have shown that males transfer mRNA to females in the ejaculate [49]. Using the newly annotated genome [38], we re-analyzed data from those experiments

Figure 3C). The MAG therefore appears to be a primary source of RNA transferred to females in the ejaculate. In total, 27 proteins encoded by transferred mRNA transcripts were identified in our proteomes, with 22 in the high-confidence SFP proteome, three in the sperm/SFP overlap proteome, and five that were in the sperm proteome.
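The bias towards upregulation reported above (320 upregulated versus 126 downregulated transcripts) can be checked with an exact two-sided binomial test against an even split. The sketch below is one plausible way to reproduce such a test with the standard library; the function name is illustrative, and the null probability of 0.5 is an assumption about how the test was set up.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided binomial p-value for k successes in n trials, null p = 0.5.

    With p = 0.5 the distribution is symmetric, so the two-sided p-value is
    twice the upper-tail probability of the more extreme count, capped at 1.
    """
    extreme = max(k, n - k)
    tail = sum(comb(n, i) for i in range(extreme, n + 1))
    return min(1.0, 2 * tail / 2**n)


# Counts from the text: 320 upregulated and 126 downregulated transcripts.
p_value = binomial_two_sided_p(320, 320 + 126)
```

With these counts the p-value falls far below the 0.001 level quoted in the text.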

Interestingly, the putatively transferred transcripts whose products were present in our seminal fluid proteome encode highly abundant proteins that were on average six times more plentiful than the remainder of the seminal fluid proteins. Lastly, transferred

as Ae. aegypti [15,16,18]. The primary goals of this study were to comprehensively catalog male proteins transferred to Ae. aegypti females during insemination and establish a reliable methodology for delineating between sperm proteins and SFPs. To accomplish this, we (1) conducted an in-depth proteomic characterization of sperm, (2) utilized a whole-female labelling approach to identify unlabelled male proteins transferred by the male during insemination and (3) characterized the transcriptomes of the testis and male accessory gland (MAG). Importantly, we note that the whole-female labelling approach has been employed previously in Ae. aegypti but the assignment of proteins as SFPs was limited by the lack of information regarding proteins found in sperm. Thus, distinctions between sperm proteins and SFPs were previously difficult to achieve. It is also noteworthy that advances in MS/MS sensitivity and accuracy have resulted in far greater power of detection in our study, and our analysis has also benefited tremendously from the recent resequencing and reannotation of the Ae.

Our work differs from previous SFP characterization studies [27,28,57] in that our classification was supported by a detailed knowledge of sperm proteome composition.

Nonetheless, several independent validation approaches were helpful in assessing the quality of our proteomic characterization. For example, we quantified the proportion of proteins with predicted secretion signals and analyzed transcriptome profiles in testes and MAGs. As would be predicted, SFPs identified in this study possessed a significantly higher proportion of predicted secretion signals than sperm proteins and were, on average, highly specific or biased towards expression in the MAG. Additionally, analysis of the functional composition of our proteomes revealed that they were closely aligned with the results of previous sperm [32][33][34]63] and SFP studies in insects [30,57]. For example, our expanded sperm proteome was highly enriched for proteins related to flagellar structure, including microtubules, dynein complexes, and ciliary components, and proteins likely associated with the mitochondrial derivatives, which are a predominant structure in mosquito sperm [68,69] and that of other insects. Consistent with what has been described by Sirot et al. [27], as well as in other insects (reviewed in [29,30,57]) and humans [70], proteases were highly enriched amongst our high-confidence SFPs, supporting the likely accuracy of our expanded characterization (reviewed in [64]). The observed enrichment of vesicle-mediated transport proteins is also consistent with the fact that mosquito seminal fluid is in part produced by apocrine secretion [60]. Additionally, exosomes and other vesicles are believed to play a role in a variety of post-insemination cellular interactions. For example, vesicles transferred in Drosophila seminal fluid have been reported to fuse with sperm and interact with the female reproductive tract [71], exosomes of the mouse epididymis have recently been implicated in the control of sperm RNA stores [72], and the abundance of exosome markers in avian SFPs has led to speculation about vesicle-mediated mechanisms in post-testicular sperm maturation [73]. Therefore, the accuracy of our expanded proteomic characterization of sperm and SFP proteomes is corroborated by several independent lines of evidence.

It is important to note that, despite the application of stringent proteomic thresholds, some proteins could not be definitively assigned as either sperm protein or SFP. Previous studies in Drosophila and Lepidoptera have consistently identified known SFPs (such as Acp36DE) at appreciable abundance levels in sperm that have yet to be combined with MAG secretions [32,62,63]. Our identification of a relatively large protein set that is highly MAG-biased in expression but also present in sperm further suggests that the incorporation of "SFPs" during testicular sperm maturation occurs and is worthy of additional functional investigation. Although Drosophila expression profiles in the testis and accessory gland are quite distinct, many SFPs exhibit low levels of co-expression in the testis (Dorus, unpublished data). Our transcriptomic analyses here further support such patterns of co-expression. As such, dichotomous distinctions between sperm proteins and SFPs may be an oversimplification of a more nuanced relationship between these reproductive systems. We acknowledge this uncertainty in our classification of MAG-biased proteins that were also identified in our sperm proteome.

We also note that despite our expanded proteomic coverage, several proteins that we anticipated to be identified were absent. The most notable case was Head Peptide-1, a seminal fluid peptide which has been shown to be transferred in the ejaculate [74] and has been reported to induce short term monogamy in the female after mating [15].

Head Peptide-1, as is the case for many SFPs, undergoes extensive post-translational modification and may therefore be challenging to identify bioinformatically without a priori knowledge of the biochemical composition of the proteolytic products (such as in the case of the well-studied Drosophila Sex Peptide; [75]). Another example was adipokinetic hormone (AAEL011996), which did not meet our two unique peptide inclusion threshold, although we did identify five copies of one peptide from its precursor protein that was also identified in Ae. albopictus seminal fluid [28]. This protein has been postulated to contribute to sperm protection from oxidative stress [76] and the regulation of feeding behavior [77] in other insects. We suggest that the complexity of proteolytic pathways, governed both by male and female interacting proteins, is a major barrier in the use of shotgun proteomics to study SFP identity and function in the female reproductive tract. Future investigations would likely benefit from the inclusion of a targeted proteomic approach (reviewed in [78]). Such approaches require an a priori list of candidate peptides; in Ae. aegypti, the neuropeptides and

Male reproductive proteins, including SFPs, are consistently among the fastest evolving classes of protein (reviewed in [80]). Although initially a goal of our study, conducting a robust analysis of the molecular evolution of the proteins identified in this study was limited by the availability of genomic resources appropriate for both inter- and intraspecific tests of positive selection. Obtaining high quality genomic data for different populations of Ae. aegypti has proven difficult, given the genome's repetitive nature [38,81]. Furthermore, this mosquito's ability to move globally as diapausing eggs has allowed for frequent mixing and a complex population structure [82,83].
The development of appropriate population level genetic data for the analysis of recent selective sweeps should be a priority in Ae. aegypti, as it has been in Anopheles gambiae [84,85]. Furthermore, given the extent of molecular divergence between Ae. aegypti and Ae. albopictus [28], the development of genomic resources for a more closely related outgroup to Ae. aegypti will assist in understanding evolutionary patterns at the gene level. Despite these limitations, our analysis of orthology did reveal that the suite of proteins contributing to seminal fluid, but not sperm, has diverged substantially from other Dipterans. Although sperm proteins and SFPs possess levels of orthology to the Drosophila genome that are comparable to the genome as a whole, only 59% and 4% of orthology was observed when comparing the Ae. aegypti sperm and SFP proteomes (respectively) with those of Drosophila [32,57]. While some of this disparity may be attributed to differences in overall proteome size and coverage, such a stark contrast is nonetheless compelling evidence of tissue-specific evolutionary patterns. Orthology between Ae. aegypti SFPs and Ae. albopictus SFPs [28], while more extensive (43%), was still low relative to orthology between the sperm proteomes of Ae. aegypti and D. melanogaster, two distantly related Dipterans. These results suggest a process of "turn-over" in seminal fluid proteomes, whereby overall protein composition diverges rapidly even when there is evidence for conservation with regard to overarching molecular functions represented in seminal fluid. For example, a priori expectations about Gene Ontology enrichment were met for both Ae. aegypti sperm (e.g., cilium and mitochondrial proteins) and SFPs (extracellular localization and hydrolase activity), despite overall SFP divergence.
SFPs are a pronounced target of selection and have been discussed as a driver of sexual conflict (reviewed in [86]), and thus they are expected to rapidly diverge. By contrast, we note that strong conservation of sperm proteins exists across distant taxa, with different insect orders displaying 25% orthology between sperm proteomes [34], and even D. melanogaster and mammals with 20% sperm proteome orthology [32]. The overall lack of conservation in seminal fluid proteomes makes comparing the roles of specific SFPs across species difficult, but conserved molecular functions amongst SFPs will nevertheless allow the wealth of knowledge in Drosophila to be leveraged towards an understanding of SFP function in non-model insects.

Unlike other mosquitoes with heteromorphic sex chromosomes, Culicine mosquitoes (e.g., Aedes and Culex) harbor male determining loci on undifferentiated, homomorphic sex chromosomes [87]. Theory predicts the evolution of heteromorphic sex chromosomes following the acquisition of a sex determining locus, suppression of recombination, and expansion of the non-recombining region. It remains unclear why homomorphic sex chromosomes appear to be retained in some taxa [88,89]. One proposed mechanism to mediate the selective effect of sexually antagonistic alleles on the promotion of recombination suppression is the establishment of efficient sex-biased expression [90]. Support for this hypothesis has previously been lacking in Ae. aegypti; the significant enrichment of SFPs on chromosome 1 is the first evidence for it. This trend was restricted to SFPs and was not observed for genes solely over-expressed in the MAG or testis. It is intriguing to speculate that this distinction between SFPs and other male reproductive genes might be due to the prevalence (and selective strength) of sexually antagonistic alleles specifically amongst SFPs, which may favor their localization on chromosome 1. This is consistent with their putative role as drivers of sexual conflict (reviewed in [86]), including the mediation of female post-mating responses such as sexual receptivity and longevity [14,18].

including ∼1000-fold higher expression in testes than in MAG, ∼50 times more transcript in whole male carcasses than gonadectomized carcasses [91], and upregulation during later stages of spermatogenesis [48]. S-LAP orthologs constitute a significant proportion of the protein composition of Drosophila [92] and Lepidoptera [34,62].
Little is known about the specific function of S-LAPs, although it has been postulated that they may serve a structural function given the inferred loss of enzymatic capacity of several S-LAPs during Drosophila evolution [92]. Additionally, a Y-linked S-LAP in D. pseudoobscura has been implicated in a cryptic meiotic drive system, where suppression of this locus results in aberrant spermatogenesis and a higher proportion of X-bearing sperm [93]. It will be of great interest to establish the specific function of these proteins in mosquito sperm, given their high abundance and expression patterns during spermiogenesis. Furthermore, the proteins and transcripts involved in spermatogenesis described in this study may assist in the identification of other genes involved in meiotic drive systems (reviewed in [94]), which have been proposed as potential genetic means to reduce wild populations through the induction of sex ratio biases [95].

Although no SFP was as abundant as cytosol aminopeptidase in sperm, the top ten most abundant proteins each accounted for 1.2-2.6% of the protein in our ejaculate sample. L-asparaginase (AAEL002796) was the most abundant SFP (61% more abundant than the next protein) and the tenth most abundant mRNA transcript in the MAG out of over 11,000 transcripts. While the relevance of the abundance of this enzyme is currently unclear, it may relate to several other notable observations. First, transcript AAEL020035, whose protein product is comprised of ∼60% asparagine residues, is the single most abundant MAG transcript and was also, by far, the most abundant

observations alone, it is intriguing to speculate that the SFP proteome has the capacity to conduct gluconeogenesis (of asparagine and potentially other amino acids) and that this may feed into the citric acid cycle.
The citric acid cycle is believed to be functional in mammalian sperm (reviewed in [96]) and many citric acid cycle enzymes are present in our Ae. aegypti sperm proteome.

Ae. aegypti [27], Ae. albopictus [28], Cx. quinquefasciatus [97], and several non-mosquito taxa [21,22] (reviewed in [64,98]), and are a common function of many insects' seminal fluid. Based on studies in other insects, functions of these enzymes may include the activation of sperm motility or the cleavage of propeptides into their active forms [99]. Our seminal fluid proteome also contains abundant enzymes that catabolize smaller substrates, such as amino acids and carbohydrates. Taken together, the enzymatic cocktail in seminal fluid may be well equipped to break down many of the molecules it contains. Seminal fluid proteins were also enriched for proteins involved in maintaining proton and redox homeostasis. We identified several proteins contributing to V-type proton ATPases, which use ATP to regulate pH via proton transport. Maintaining an optimal pH in seminal fluid may allow for efficient sperm motility (reviewed in [100]). Regulating pH may also create an ideal environment in

Ejaculate RNAs Transferred to Females

There has been much conjecture about the importance of spermatozoal RNA to fertility [101], and recent work has confirmed that the regulation of sperm ncRNA stores in the mammalian epididymis is necessary for proper embryogenesis [102]. Little is known about the function of spermatozoal RNAs in insects, although they have been demonstrated to have substantial functional coherence, including an overwhelming enrichment of loci involved in translation [103]. New data in this study allowed us to probe for patterns in previously described transcripts that are putatively transferred to females during mating. A total of 106 transcripts were identified, including both coding and non-coding transcripts, and a majority of these exhibit high levels of expression in the MAG. Based on our SFP proteome, most of the protein coding transcripts are translated at high levels. Their high expression in the MAG suggests that they may simply hitchhike into seminal fluid with other secreted molecules. Alternatively, as has been demonstrated in Drosophila, they could be transferred in intact MAG cells [61], or via vesicles derived from the MAG [71]. Interestingly, these vesicles, which may carry RNA cargo including miRNAs, fuse with sperm and have the capacity to interact with the female reproductive tract. Some male-derived transcripts are detectable in the female for up to 24 h post-mating [46], and it has been postulated that they could be used by females in some capacity [104]. In Ae. aegypti, both vesicles and RNAs are [...] [14,16,17], including short-term mating refractory behavior [15]. To date, the molecule(s) responsible for long-term refractoriness have yet to be identified.
Given the strength and duration of responses to low SFP "doses" [14], identification of the responsible proteins will provide powerful tools for manipulating female reproduction in a species-specific manner. In addition, such knowledge may provide a molecular metric by which the quality of males in modified mosquito release strategies (such as those employing sterile or Wolbachia-infected males; reviewed in [105]) may be monitored and optimized. Functional analysis of specific sperm proteins and SFPs may yield insights into processes such as sperm motility and activation [21,22,23], sperm storage [106], and sperm-egg recognition [107]. Very few studies have explored these processes in Ae. aegypti (reviewed in [108]). A mechanistic understanding of complex post-copulatory male-by-female interactions is critical to genetically modified mosquito release strategies that manipulate reproduction. Our detailed characterization of the male contributions to these interactions should serve as the foundation for the design and improvement of vector control strategies that limit the transmission of arboviruses that cause serious human illness and mortality.

Table S1. Database of all proteins identified by tandem mass spectrometry in this study. Classification of each protein is based on criteria in text. Proteomic (probability of identification, percent coverage, total PSMs, and unique PSMs) and transcriptomic (TPM in virgin MAG, mated MAG, and testes) data are included, as well as whether each protein includes a signal peptide, and how each protein was previously classified by Sirot et al. [27].
Table S2. Full GO analysis for high confidence SFPs, sperm/SFP overlap, and sperm proteins. CC, cellular component; BP, biological process; MF, molecular function.

We thank Sylvie Pitcher, Sheng Zhang, Jen Grenier, Peter Schweitzer, and the staff at the Cornell Biotechnology Resource Center for technical support, and Laura Sirot for experimental guidance and feedback. This study was supported by NIH/NIAID grant R01AI095491 to MFW and LCH, NIH/NICHD grant R21HD088910 to SD and MFW, a Cornell Graduate School fellowship to ECD, and a Cornell Entomology Department Griswold grant to ECD and LCH. YHAB was supported by NIH/NICHD grant R01HD059060 to MFW and Andrew G. Clark. RNA-seq data and mass spectrometry data were made possible by NIH grants 1S10OD010693-01 and 1S10OD017992-01,