Linking Spermatid Ribonucleic Acid (RNA) Binding Protein and Retrogene Diversity to Reproductive Success

Spermiogenesis is a postmeiotic process that drives development of round spermatids into fully elongated spermatozoa. Spermatid elongation is largely controlled post-transcriptionally after global silencing of mRNA synthesis from the haploid genome. Here, rats that differentially express EGFP from a lentiviral transgene during early and late steps of spermiogenesis were used to flow sort fractions of round and elongating spermatids. Mass-spectral analysis of 2D gel protein spots enriched >3-fold in each fraction revealed a heterogeneous RNA binding proteome (hnRNPA2/b1, hnRNPA3, hnRPDL, hnRNPK, hnRNPL, hnRNPM, PABPC1, PABPC4, PCBP1, PCBP3, PTBP2, PSIP1, RGSL1, RUVBL2, SARNP2, TDRD6, TDRD7) abundantly expressed in round spermatids prior to their elongation. Notably, each protein within this ontology cluster regulates alternative splicing, sub-cellular transport, degradation and/or translational repression of mRNAs. In contrast, elongating spermatid fractions were enriched with glycolytic enzymes, redox enzymes and protein synthesis factors. Retrogene-encoded proteins were over-represented among the most abundant elongating spermatid factors identified. Consistent with these biochemical activities, plus corresponding histological profiles, the identified RNA processing factors are predicted to collectively drive post-transcriptional expression of an alternative exome that fuels finishing steps of sperm maturation and fitness.

Post-transcriptional regulation of gene expression is essential for cells to transition into and out of distinct developmental, physiological, and pathological states (1-4). Accordingly, post-transcriptional control of gene expression plays essential roles during gametogenesis and embryo development by helping cells undergo dynamic changes in fate and function (5-10). In a classic model, transcriptionally inactive oocytes store large reserves of de-adenylated transcripts in a translationally repressed state (11).
Then, in response to meiotic maturation, polyadenylation of the stored mRNAs signals their translation into maternal proteins required for early embryogenesis (11). Spermatozoan development is also known for its diverse post-transcriptional modes of gene expression (12-17). However, in contrast to many oocyte mRNAs, translational activation in developing spermatozoa is commonly associated with poly-A tail shortening, rather than polyadenylation (18-20).
In the typical fertile human male, energy is expended for net biosynthesis of >35 million new spermatozoa each day (21). This equates to >25,000 spermatozoa generated/male/minute throughout a 65-year reproductive life span to help parent an average family of ~3 children (21). In adult males, haploid gametes that form spermatozoa are continuously being produced from spermatogonial stem cells in the testes through the developmental process of spermatogenesis (21,22). During spermatogenesis, a subset of spermatogonial stem cells within the basal compartment of seminiferous tubules give rise to differentiating spermatogonia that amplify in number through a series of mitotic divisions. Differentiating spermatogonia then initiate meiosis and traverse the Sertoli cell blood-testis barrier to enter the seminiferous tubules as newly formed spermatocytes (23). Once inside the adluminal compartment of the seminiferous epithelium, spermatocytes complete meiosis to generate nascent haploid germ cells, termed round spermatids (24). To mature into spermatozoa, round spermatids must transform dramatically in size, shape, and organelle composition through the post-meiotic process of spermiogenesis (24).
In rodents, newly formed round spermatids undergo up to 19 well-defined steps of spermiogenesis before being shed into the lumen of the seminiferous epithelium as fully elongated, yet functionally immature spermatozoa (reviewed by Yves Clermont) (24). Acrosome biogenesis is a sperm-specific process adapted from the Golgi complex, and is commonly used to classify spermatids at distinct steps of spermiogenesis as they mature during progressive stages of the seminiferous epithelium cycle (24). The periodic acid-Schiff (PAS) staining method differentially highlights step-specific morphological changes to the acrosome and nucleus of developing spermatids (24). It should be stressed that linear "steps" in germ cell development are different from the "stages" of the seminiferous epithelium cycle (24). This is because the epithelial stages are defined by physical associations formed between different testis cell types during an epithelial cycle (see Supplemental Fig. 1 for review). Each unique cellular association defining a respective stage is organized vertically in space within a tubular segment by successive generations of spermatogenic cell types (24). Developmental gaps between each generation of germ cells comprising a given stage are defined by the time taken to complete one cycle of the seminiferous epithelium (i.e. ~12.9 days/cycle in rats) (24). As such, each epithelial stage merely represents a successive snapshot in cycle time within the same seminiferous tubule segment (Supplemental Fig. 1).
In rodents, it is estimated that >5% of mRNAs are specifically expressed to support the meiotic and post-meiotic processes of sperm development and fertilization (25-27). This includes numerous testis-specific isoforms of metabolic enzymes required for sperm function (28-30). Spermatogenic cells also express an unusually high percentage of retrogenes, a subset of which encode glycolytic enzymes hypothesized to have been selected because they enhance sperm fitness (15). Additionally, spermatogenic cells express a highly diverse array of alternatively processed mRNAs unique to the germline (16,17). Still, global silencing of transcription occurs during spermatid elongation as haploid nuclei remodel their chromatin into a hyper-compacted state within the spermatozoan head piece (31,32). As a result, major fractions of "silenced" transcripts are stored for up to a week in messenger ribonucleic acid particles (mRNPs) until factors in the elongation phase of spermiogenesis trigger their translation to support final steps in spermatozoan development and fertilization (12-14). Thus, sperm development provides an extraordinary system to study molecular events that regulate post-transcriptional gene expression.
Here, we present mass spectral analysis of proteins distinctly detected in fractions of round and elongating spermatids collected by flow cytometry using Green Elongating Spermatid (GESptd1) transgenic rats. This cell sorting approach was made possible because GESptd1 rats differentially express EGFP post-transcriptionally from viral elements by mechanisms that mimic many spermatid genes. Protein profiles revealed by these studies demonstrate global down-regulation of an abundant RNA binding proteome expressed prior to up-regulation of metabolic proteins during spermatid elongation. Biochemical activities reported for these RNA processing factors shed light on how the multiplicity of germline-specific mRNA variants are uniquely generated, stored and then expressed to drive spermatozoan biology following transcriptional silencing of the male haploid genome.

MATERIALS AND METHODS
Complete methodologies used for spermatid isolation, Western blot, Northern blot, immunofluorescence, histological staining, in situ hybridization, and sucrose gradient fractionation are detailed in Supplementary Information.
Animal Protocols-Protocols for use of rats in this study were approved by the Institutional Animal Care and Use Committee (IACUC) at UT-Southwestern Medical Center in Dallas, as certified by the Association for Assessment and Accreditation of Laboratory Animal Care International (AAALAC).
Mass Spectrometry and Analyses-Round and elongating spermatid fractions were collected from adult rats between 165 and 240 days of age by flow cytometry, as described. A total of ~2 × 10^7 cells for each respective EGFP-Dim and EGFP-Bright fraction were collected from five cumulative sorts. Post-sort, each fraction was washed twice with PBS by centrifugation at ~200 × g, and pellets were flash frozen in liquid nitrogen and then stored at −80°C. Frozen pellets were shipped on dry ice to Applied Biomics, Hayward, CA, for 2D DIGE expression profiling. Proteins from pooled round spermatid fractions were labeled with Cy3, whereas proteins from pooled elongating spermatid fractions were labeled with Cy5 prior to fractionation. 2D gels were scanned with a Typhoon image scanner (G.E. Healthcare Life Sciences, Inc.) and images were generated with ImageQuant software (G.E. Healthcare Life Sciences, Inc.). Differential protein expression profiles were analyzed with DeCyder software (G.E. Healthcare Life Sciences, Inc.). Spots of interest were picked with the Ettan Spot Picker (G.E. Healthcare Life Sciences, Inc.), treated with trypsin and identified by MALDI TOF/TOF mass spectrometry (Applied Biosystems, Foster City, CA) using the non-redundant NCBI database to search 111,467 entries (October, 2010), allowing 1 missing cleavage/entry and including the following variable modifications: carbamidomethylation of cysteine and oxidation of methionine. Mass tolerances for precursor ions and fragment ions were set at 100 ppm and 0.5 Da, respectively; cutoff score/expectation value for accepting individual MS/MS spectra = 20; noise threshold <5%. Accession numbers of identified proteins were imported into the Database for Annotation, Visualization, and Integrated Discovery (DAVID) for pathway analysis (http://david.abcc.ncifcrf.gov/).
Peaklist-generating software and release version was GPS Explorer™ v3.6; search engine and release version MASCOT 2.0 (Matrix Science).
The abbreviations used are: GESptd1 rats, green elongating spermatid rats; EGFP, enhanced green fluorescent protein; CMV, cytomegalovirus; hnRNP, heterogeneous nuclear ribonucleoprotein; mRNP, messenger ribonucleoprotein particle; LTR, long terminal repeat; UTR, untranslated region.
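The search tolerances stated above (100 ppm for precursor ions, 0.5 Da for fragment ions) can be made concrete with a short calculation. The following is a minimal sketch in Python, not the MASCOT or GPS Explorer implementation; the function names and the example m/z values are illustrative assumptions that do not come from this study.

```python
def ppm_window(mz, tol_ppm):
    """Return (low, high) m/z bounds for a symmetric tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6  # ppm is a relative tolerance, so it scales with m/z
    return mz - delta, mz + delta


def precursor_matches(observed_mz, theoretical_mz, tol_ppm=100.0):
    """True if an observed precursor m/z lies within the ppm window (100 ppm here)."""
    low, high = ppm_window(theoretical_mz, tol_ppm)
    return low <= observed_mz <= high


def fragment_matches(observed_mz, theoretical_mz, tol_da=0.5):
    """True if an observed fragment m/z lies within an absolute tolerance (0.5 Da here)."""
    return abs(observed_mz - theoretical_mz) <= tol_da


if __name__ == "__main__":
    # Hypothetical peptide ion at m/z 1500.0 (illustrative value only).
    low, high = ppm_window(1500.0, 100.0)
    print(f"100 ppm window at m/z 1500.0: {low:.2f} to {high:.2f}")
    print(precursor_matches(1500.10, 1500.0))  # inside the window
    print(precursor_matches(1500.20, 1500.0))  # outside the window
```

Because a ppm tolerance is relative, the accepted window widens with mass (±0.15 at m/z 1500 for 100 ppm), whereas the fragment tolerance in Da stays fixed across the spectrum.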

RESULTS
Spermiogenic Expression of EGFP in Rats-Novel strains of transgenic rats were generated by transducing donor spermatogonial stem cells with a self-inactivating, lentiviral vector (33), and evaluated as animal models for studying sperm development. The lentiviral vector used was designed to express EGFP from an internal cytomegalovirus (CMV) promoter (Fig. 1A) (33). Consistent with reports in mice (34,35), testicular expression of EGFP was visualized by fluorescence microscopy in 7 of 12 (~58%) rat strains generated (Fig. 1B). Genomic sites where the lentiviral vector integrated were defined for two of the single-copy strains expressing highest relative levels of EGFP in testes (i.e. GESptd1 and GESptd2 rats) (Fig. 1C). In all 7 strains where transgene expression could be visualized microscopically, a consistent pattern of fluorescence was observed in spermatogenic cells nearest to the luminal centers of seminiferous tubules (Figs. 2A, 2B). Northern and Western blot analyses of GESptd1 rats demonstrated selective expression of the reporter transgene in testes (Figs. 2C, 2D). The same Western blot profile was detected for EGFP in GESptd2 rat tissues (not shown). This localization pattern indicated that EGFP was being expressed in the germline during late steps of spermiogenesis, thus prompting the name "Green Elongating Spermatid rats." To more precisely define steps of spermiogenesis expressing EGFP in GESptd1 rats, PAS-Feulgen staining and immunofluorescence were performed on parallel cross-sections prepared from their testes. Expression of EGFP was most abundant in the cytoplasm of step 12-19 spermatids during stages XI to VIII of the seminiferous epithelium cycle (Fig. 3). By comparison, the spermatocyte and round spermatid transcription factor, CREM-τ (36), was most abundant in late pachytene spermatocytes during stages XI to XIV, and in step 1 to 11 spermatids during stages I to XI of the epithelial cycle (Fig. 3).
However, an antisense probe to EGFP specifically hybridized to RNA in both meiotic and post-meiotic spermatogenic cells of GESptd1 rat testes (Fig. 4A). Silver grains produced from the EGFP antisense probe became apparent over pachytene spermatocytes after stage IV of the epithelial cycle and then most densely labeled subsequent steps in spermatocyte and spermatid development (Fig. 4B). Thus, in adult rats, EGFP expression appeared to be up-regulated at the transcriptional level during meiotic prophase-I (Fig. 4C; blue gradient bar) and at the translational level during elongation steps of spermiogenesis (Fig. 4C; green gradient bar).
FIG. 1. Transgenic rats produced with lentiviral reporter vector.
A, Diagram of self-inactivating lentiviral vector, pHR'CMV-EGFP-SIN 18 (107,108), used to generate transgenic rat lines by in vitro transduction of donor spermatogonia. Depicted are the U3, R, and U5 regions of the 5′ and 3′ long terminal repeat regions (LTRs); surface glycoprotein element (G); rev response element (RRE); splice acceptor sequence (SA); splice donor sequence (SD); cytomegalovirus promoter (CMV); enhanced green fluorescent protein (EGFP). Also shown is the region deleted from the 3′ LTR (ΔU3), which helps to disrupt viral replication (107). B, Fluorescence microscopy of EGFP expression (green) in seminiferous tubules of transgenic rat strains GESptd1, GESptd2, and GESptd3 hemizygous for a single copy of the lentiviral construct. Wild-type rat seminiferous tubules illustrate background fluorescence. Right panels show respective bright field images of seminiferous tubules from each rat. Scale bar, 1 mm. C, Lentiviral transgenes integrated between the Trpm6 and Rorb genes at 1q43 on chromosome 1 in GESptd1 rats, and between the Sema5a and LOC100360282 genes at 2q23 on chromosome 2 in GESptd2 rats.

The relative timing of EGFP expression in GESptd1 rats was analyzed during the onset of spermiogenesis, which initiates in rats after ~25 days of age when round spermatids are initially produced through meiotic divisions of first-generation spermatocytes (Fig. 4C; arrows). During postnatal development, EGFP transcripts were first detectable by Northern analysis in testes of 30-day-old rats, and then steadily increased in relative abundance as a function of age (Fig. 4D). Subsequent to detection of EGFP mRNA, EGFP was initially detected by Western blot on day 40, after development of the first elongating spermatids, and proceeded to increase in abundance by day 45 (Fig. 4E). By comparison, detection of testicular transcripts and protein encoded by the deleted in azoospermia-like (dazl) gene each peaked by postnatal day 21, consistent with expression of DAZL in spermatogonia and spermatocytes (Figs. 4D, 4E) (37). Accordingly, transcription and translation of the GESptd1 transgene appeared uncoupled postnatally during sperm development.
However, because mature pachytene spermatocytes are present in 21-25-day-old GESptd1 rats, signals for EGFP detection by Northern blot in total testis lysates at these ages did not correlate precisely with in situ hybridization silver grains formed over pachytene spermatocytes in adult rat testes (Fig. 4B). This could reflect differences in detection thresholds for Northern versus in situ hybridization assays at these respective ages, because first-generation spermatocyte and spermatid populations take time to accumulate in abundance relative to other testis cell types (see supplemental Materials and Methods), and/or actual differences in the onset of EGFP expression by pachytene spermatocytes in adolescent versus adult rats.
Packaging EGFP into mRNPs-To help explain the presumptive delay between expression of EGFP transcripts and EGFP during spermiogenesis, lysates prepared from GESptd1 rat testes were fractionated over sucrose density gradients (Fig. 5). Most EGFP transcripts ranging from 0.8-1.5 kb were detected in fractions near the top of the gradient (Fractions 1-3), whereas lower molecular weight EGFP transcripts of ~1 kb migrated into EDTA-sensitive polysome-like fractions at the bottom of the gradient (Fractions 9-11) (Fig. 5A). Similarly, the mRNP marker protein, MSY2 (14), comigrated predominantly with a majority of EGFP transcripts in Fractions 2 and 3, but was also detected in fractions throughout the sucrose gradient (Fig. 5B). By contrast, a majority of transcripts encoding DAZL migrated in Fractions 9-11 of the sucrose gradient, and thus, were enriched in the polysome-like fractions (Fig. 5C). Hybridization of total RNA from testes of 45-day-old rats with oligo-dT(18-24) and treatment with RNase-H increased the relative mobility of EGFP transcripts, shifting a larger portion into bands migrating closer in size to, or less than, the ~1 kb transcript within the polysome-like fractions (Fig. 5A). RNase-H treatment also increased the relative mobility of DAZL transcripts (Fig. 5C). Thus, larger EGFP transcripts potentially represent longer poly-adenylated species, which, like many endogenous spermatid genes, could be translationally repressed in mRNPs of spermatocytes and/or round spermatids prior to de-adenylation and incorporation into polysomes for translation in elongating spermatids (13,18,20). However, such modifications to GESptd1 EGFP poly-A tails in nonpolysome and polysome fractions will require verification by sequence analysis.
Flow Sorting Round and Elongating Spermatids-Based on robust expression of EGFP in GESptd1 rat spermatids, we postulated that enriched fractions of elongating spermatids could be isolated from their testes by flow cytometry to facilitate biochemical analyses on spermiogenesis. Testis cells from GESptd1 rats were enzymatically disaggregated, and then two major peaks of EGFP-positive cells were identified by flow cytometry and collected by cell sorting (Figs. 6A, 6B). A dimmer peak of EGFP+ cells expressed ~8-fold lower levels of EGFP when compared with the second, brighter peak of EGFP+ cells (Figs. 6A, 6B). However, only 34.7  Tables S1, S2), the EGFP-Dim fraction was enriched with round spermatids, whereas the EGFP-Bright fraction was enriched in both elongating spermatid cytoplasts (also termed residual bodies) and intact elongating spermatids. Both spermatid fractions were depleted of somatic testis cells (supplemental Fig. S3A, S3B).
Mass Spectral Analysis of Spermatid Proteins-Because of the relative purity and abundance of each sorted population, we conducted mass-spectral analysis to identify proteins differentially sorted into the EGFP-Dim and EGFP-Bright spermatid fractions after purification by 2D-electrophoresis (Fig. 7). Respective EGFP-Dim and EGFP-Bright spermatid fractions from five total sorts were pooled for the analysis (~20 million cells/fraction). A total of 88 differentially expressed (≥2-fold) protein-like factors were submitted for analysis (Fig. 7A) (supplemental Data Files S1-S3). Consistent with histological and cytometric fluorescence signals in round and elongating spermatids, EGFP was found by mass spectrometry to be ~10-fold more abundant in the EGFP-Bright fraction versus the EGFP-Dim fraction (Fig. 7B). Similarly, the EGFP signal was 9.4-fold higher in the EGFP-Bright fraction than the EGFP-Dim fraction by Western blot (supplemental Fig. S3A; supplemental Table S2); in contrast, EGFP transcript levels were similar, but slightly more abundant in the EGFP-Dim fraction than in the EGFP-Bright fraction based on qtPCR (supplemental Table S1). Thus, the GESptd1 rat transgene provided an internal standard for analyzing the relative expression of round and elongating spermatid proteins (Fig. 7B).
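The fold-enrichment comparisons described above can be illustrated with a short calculation. The sketch below, in Python, assigns a spot to the round (EGFP-Dim, Cy3) or elongating (EGFP-Bright, Cy5) fraction from a pair of spot signals. It is an assumption-laden illustration, not the DeCyder analysis used in this study: the function names and all signal values are hypothetical, and dye normalization is not modeled.

```python
def signed_fold_change(round_signal, elongating_signal):
    """Signed fold change between fractions.

    Positive values mean the spot is more abundant in the elongating
    (EGFP-Bright) fraction; negative values mean it is more abundant in
    the round (EGFP-Dim) fraction.
    """
    if round_signal <= 0 or elongating_signal <= 0:
        raise ValueError("spot signals must be positive")
    if elongating_signal >= round_signal:
        return elongating_signal / round_signal
    return -(round_signal / elongating_signal)


def classify_spot(round_signal, elongating_signal, threshold=3.0):
    """Label a spot by the fraction in which it is enriched at least `threshold`-fold."""
    fc = signed_fold_change(round_signal, elongating_signal)
    if fc >= threshold:
        return "elongating-enriched"
    if fc <= -threshold:
        return "round-enriched"
    return "below threshold"


if __name__ == "__main__":
    # Hypothetical normalized spot signals (round, elongating); not study data.
    spots = {
        "reporter-like spot": (100.0, 1000.0),   # 10-fold higher in elongating
        "RNA-binding-like spot": (900.0, 300.0),  # 3-fold higher in round
        "unchanged spot": (100.0, 250.0),         # 2.5-fold, below a 3-fold cutoff
    }
    for name, (r, e) in spots.items():
        print(name, signed_fold_change(r, e), classify_spot(r, e))
```

Using a signed fold change keeps both directions of enrichment on the same scale, so a single cutoff (2-fold for submission, 3-fold for the tabulated proteins) can be applied to either fraction.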
Sequences for 20 proteins in spots enriched ≥3-fold in the EGFP-Dim fraction (Table I), and 28 proteins in spots enriched ≥3-fold in the EGFP-Bright fraction were positively identified (Table II). Histological localization studies for 26 of the 50 non-EGFP proteins listed in Table I and Table II were previously reported in round and elongating spermatids of rats or mice (38-67). In all reported cases (i.e. 26 of 26 reports), immunolocalization of the identified proteins during spermiogenesis within testicular cross-sections and isolated spermatids was in agreement with results of this study (Tables I, II). As additional standards, immunolabeling rat testis sections with antibodies to three of these proteins confirmed the mass-spectral results in this study by matching the reported localization of hnRNPK and GLUL in mice, and STMN in rats (Fig. 8) (41,47,58). To our knowledge, up to ~56% (13 of 23) and ~29% (8 of 27; excluding EGFP) of proteins in spots that differed by ≥3-fold in relative abundance between respective round and elongating spermatid fractions represent newly reported spermiogenic factors (Tables I, II). Moreover, in rat testis sections, PCBP1 (Fig. 9A) and EE2F (Fig. 9B) were selectively expressed in round and elongating spermatids, respectively. This was also consistent with mass-spectrometry results (Tables I, II). Interestingly, PCBP1 antibody labeling demonstrated a diffuse granular pattern throughout the cytoplasm and in the nucleus, but was also concentrated in distinct germinal granule-like foci in spermatocytes and round spermatids; PCBP1 was also highly localized to a relatively large type of cytoplasmic granule in elongating spermatids (Fig. 9A). EE2F immunolabeling was most prominent in differentiating spermatogonia, preleptotene spermatocytes, step 14-17 elongating spermatids and small nuclear granules in Sertoli cells (Fig. 9B).
FIG. 5. A, Northern blot of EGFP transcripts in whole testis lysates from an adult GESptd1 rat after centrifugation into a sucrose density gradient. RNase-H (RH) treated samples after hybridizing with oligo(dT). Note: a majority of EGFP transcripts are detected in low density mRNP-like particles. Controls (Ct) = testis lysates prepared from GCS-EGFP transgenic (+) and wildtype (−) rats. B, Western blot of MSY2 protein in whole testis lysates (T) from an adult GESptd1 rat after centrifugation into a sucrose density gradient. C, Northern blot of DAZL transcripts in whole testis lysates from an adult GESptd1 rat after centrifugation into a sucrose density gradient. Note: a majority of DAZL transcripts are detected in polysome-like particles. Controls (Ct) = testis lysates prepared from tgGCS-EGFP transgenic (+) and wildtype (−) rats.

Ontology analysis of all sequenced proteins revealed the round spermatid fraction to be dominated by proteins that regulate RNA binding, processing and transport (supplemental Data File S4). This cluster included hnRNPA2/b1, hnRNPA3, hnRPDL, hnRNPK, hnRNPL, hnRNPM, PABPC1, PABPC4, PCBP1, PCBP3, PTBP2, PSIP1, RGSL1, RUVBL2, SARNP2, TDRD6, and TDRD7 (Table I). Interestingly, with respect to post-transcriptional phases of spermatozoan biology, fractions containing elongating spermatids were dominated by alternative forms of glycolytic/metabolic enzymes, redox enzymes, protein synthesis enzymes, and factors that mediate protein stability (Table II).
Retrogene products were also over-represented within the elongating spermatid proteins sequenced; five of the top seven proteins in Table II are predicted to be encoded by retrogenes.

DISCUSSION
Here, a proteomics approach using flow sorted fractions of round and elongating spermatids revealed global down-regulation of an abundant RNA binding proteome expressed prior to up-regulation of metabolic factors during spermatid elongation. Based on biochemical activities in other biological processes, the identified factors are predicted to drive post-transcriptional RNA processing during meiotic and post-meiotic steps in spermatozoan development. Genome-wide studies find testes to express an extraordinarily diverse repertoire of tissue-specific mRNAs (25-27), alternatively spliced mRNAs (16,17), translationally repressed mRNAs (12-14), and translated retrogene mRNAs (15). Testes not only express an unusually wide array of total exons compared with most tissues, but also express exons encoding tissue-specific, cis-acting splicing elements; a majority of these elements bind hnRNP-family proteins that mediate exon skipping (68). Accordingly, testes express a highly diverse collection of transcripts encoding proteins that regulate alternative splicing (69). This conduit of post-transcriptional genetic diversity unleashed by the testis is directly related to its fundamental role in supporting sex-specific processes required for spermatozoan development and fertility (25-27). Thus, it seems logical that we identified RNA regulatory factors as a major, differentially expressed class of proteins in fractions of round spermatids.
With respect to nuclear localization of pre-mRNA splicing factors, a strong positive correlation exists between several nuclear RNA binding proteins identified in the flow-sorted round spermatid fraction and their histological localization in testis sections (39,40,42-47,87). Experimentally, we obtained 100% agreement between our mass-spectrometry results and actual localization of spermatid proteins identified in 26 independent studies previously conducted in rats and mice (Tables I and II). Moreover, 17 of the 26 validating reports were conducted using mice. This demonstrates clear conservation of processes that control protein expression during spermiogenesis in these rodents.
However, it is interesting that only ~35% of spermatids in the elongating spermatid fraction retained their nucleus following the isolation procedure. Although the fractions are biochemically suitable for comparing proteins sorted as different cellular and subcellular fractions, theoretically this should bias our results by ~threefold when measuring the relative abundance of nuclear proteins down-regulated during spermiogenesis. Thus, the clear absence of false-positive discoveries following histological validation of >50% of the total identified proteins in testis sections was intriguing. This fact potentially underscores dramatic remodeling of nuclear protein composition that normally occurs during late steps in spermiogenesis (91-94). Consequently, normal depletion of representative round spermatid nuclear proteins from fully compacted and elongated spermatids mirrors the relative abundance of these proteins detected in our collected round and elongating spermatid fractions. As such, this current approach would also be less sensitive for identifying up-regulated elongating spermatid proteins tightly associated with more mature flagella and nuclei (95). Optimization of methods for testis cell disaggregation and sorting could result in higher yields of intact elongating spermatids. Additional purification steps could also better resolve intact sorted elongating spermatids from cytoplasts/residual bodies (96). At any rate, additional studies are required to validate localization of the remaining ~48% of spermiogenic factors in Tables I and II. We also demonstrated that post-transcriptional regulation of EGFP expression in GESptd1 transgenic rats mimics genes up-regulated during later steps of spermiogenesis. This is because, like many spermiogenic genes, transcription and translation of the EGFP reporter appear to be largely uncoupled in GESptd1 rats.
This developmental paradigm is highly conserved in sexually reproducing organisms (9,13,97), and is well established for a large fraction of total transcripts (~75%) that are stored in the nonpolysomal fractions of rodent spermatocytes and round spermatids prior to elongation (98,99). The requirement for this protracted translational delay for many sperm-specific genes is highlighted by early ectopic expression of protamine 1 or transition protein 2 in round spermatids (100,101). Forced premature expression of these sperm histone proteins in round spermatids stimulates precocious chromatin condensation prior to spermatid elongation, and results in infertility due to a block in spermatid development or function (100,101). Thus, based on functional roles in other cell types (88-91), RNA binding proteins identified in this study, such as hnRNPA2/b1, hnRNPK, hnRNPL, PCBP1, and/or PTBP2, potentially represent factors that mediate translational repression of mRNAs in spermatocytes and round spermatids (Fig. 10). This theory further implies that some unknown signal generated in the seminiferous epithelium inactivates and/or stimulates degradation of such RNA binding factors to facilitate spermatid elongation (Fig. 10).
Most recently, biochemical insight into mechanisms that trigger translational activation and de-repression of stored mRNAs was reported (19,102,103). Yanagiya and colleagues demonstrated that a double knockout of the poly-A-binding protein-interacting proteins (i.e. PAIP2a and PAIP2b) resulted in premature translation of stored mRNAs and a lack of spermatid elongation (19). Under normal conditions, up-regulation of these proteins in round spermatids just prior to elongation proved instrumental in titrating out repressive effects of PABPC1 on translation (19). More recently, Delbes and colleagues showed PAIP2A to regulate translation of a specific subset of spermatid mRNAs by this pathway (102). Genetically, this study further predicts the existence of multiple Paip2-like factors that orchestrate translational activation during spermiogenesis (Fig. 10; also see Supplementary Discussion).
It is quite interesting that a relatively high percentage of retrogenes are abundantly expressed specifically in spermatids, many of which encode glycolytic enzymes (15). Five of the top seven most-enriched 2D gel spots in elongating spermatid fractions contained proteins predicted to be expressed from retrogene-like elements (Table II). Again, this is similar to the GESptd1 rat lentiviral transgene, which was identified in the 11th most-enriched protein spot in elongating spermatid fractions (Table II). A recent study by Vemuganti and colleagues used bioinformatics to study this relationship, and found an over-representation of LINE and LTR elements flanking glycolytic retrogenes in rodent and primate genomes (15). Likewise, EGFP is expressed from an LTR-based lentiviral retroelement we experimentally integrated directly into the germline of GESptd1 rats (Fig. 1A).
Thus, in addition to chromatin factors regulating transcript levels, the striking abundance of proteins expressed from such retrogenes invokes questions regarding the existence of proteinaceous and/or RNA translational enhancers that operate with their mRNA sequences during spermiogenesis (Fig. 10). Additionally, chromosomal DNA generated by reverse transcription, including retrogenes and other LTR-expressed factors, may lack elements that negatively regulate protein synthesis from their encoded mRNAs during spermatid elongation.
The unusually robust expression and sperm-specific nature of retroelements seem remarkably well tailored for driving sperm fitness by selecting activities that promote ATP production while generating relatively low levels of oxidative stress (104). In fact, the reproductive advantage of mammalian glycolytic genes expressed during sperm maturation is well documented genetically in mice (28-30), and three of the six core glycolytic enzymes listed in Table II represent retrogenes. Thus, it is fitting that glycolytic and redox enzymes were also identified as enriched in the elongating spermatid fraction isolated from GESptd1 rats. Theoretically, this glycolytic power endows spermatozoa in many nonruminant mammalian species with greater potential for driving motility (104), thus providing their haploid genomes an advantage for sexually equilibrating into a population (105). Similarly, it can now be asked how increased heterogeneity and abundance of the round spermatid RNA binding proteome would select for sperm fitness.

Along with regulating the time, location and abundance of alternative mRNAs expressed in spermatids, additional clues into this latter question may emerge from studies demonstrating that splicing elements in spermatogenic cells largely incorporate RNA binding proteins that select against exon inclusion (68). Consequently, variants produced by exon skipping uniquely in spermatogenic cells could select for gamete fitness by eliminating RNA/protein coding domains that constrain spermatozoan development and function, but which are beneficial to reproductive success when expressed in somatic tissue. Another intriguing hypothesis alluded to above reflects additional roles of the germline RNA binding proteome in restricting replication and coordinating expression from integrated reverse-transcribed sequences that escape transcriptional repression specifically during gametogenesis (as modeled in GESptd1 rats) (44,106). As such, deleterious retroelement-derived copy number variation could further drive purifying selection so as to enrich genomes with replication-restricted or deficient integrants expressed in nonmitotic cells. Thus, evolutionarily, spermiogenic down-regulation of the RNA binding proteome, as observed here in GESptd1 rats, would further capture benefits gained from retroelements post-transcriptionally, and selectively during spermatid elongation.

Table I), reported small RNA processing pathways (110), poly-A tail de-adenylation (18,20) and incorporation of processed transcripts into translational complexes (98,99). Mechanisms controlling relative abundance of hnRNPs upon spermatid elongation remain to be determined.
Also illustrated, based on information from other studies, are hypothetical GW182/AGO-like protein components in the RNA-induced silencing complex (RISC) (110), associated poly-A tail, de-adenylation complex (DAC) (18,20), 40S and 60S ribosomal subunits, 5′ and 3′ untranslated regions (UTR), eIF4A, eIF4E, eIF4G and eIF3 eukaryotic initiation factors, poly-A-binding protein C1 (PABPC1), poly(A)-tail binding protein-interacting protein 2 (PAIP2) (19,102), and additional putative translational enhancers (TE) (19,102).