Proteomic Analysis Reveals that Topoisomerase 2A is Associated with Defective Sperm Head Morphology*

In Brief
Human spermatogenesis is far from a perfect process, and even fertile men produce high numbers of defective spermatozoa. These cells are often characterized by chromatin defects. Following the isolation of normal and abnormal sperm cells, a label-free comparison (SWATH) of nuclei from these populations demonstrates a retention of nuclear proteins in abnormal cells. Validation of mass spectrometry results by immuno-analysis reveals Topoisomerase 2A as a promising biomarker of sperm nuclear quality.

Highlights
Human sperm populations include cells of poor morphology that lack nuclear integrity. These cells can be isolated by density separation. Mass spectrometry reveals that their nuclei contain excess protein. TOP2A is a promising marker of this poor nuclear development.

Male infertility is widespread and estimated to affect 1 in 20 men. Although in some cases the etiology of the condition is well understood, for at least 50% of men the underlying cause is yet to be classified. Male infertility, or subfertility, is often diagnosed by looking at the total number of sperm produced, the motility of the cells and their overall morphology. Although counting spermatozoa and assessing their motility is routine, morphology assessment is highly subjective, mainly because the procedure is based on microscopic examination. A failure to diagnose male infertility or subfertility has led to a situation where assisted conception is often used unnecessarily. As such, biomarkers of male infertility are needed to help establish a more consistent diagnosis. In the present study, we compared nuclear extracts from both high- and low-quality spermatozoa by LC-MS/MS based proteomic analysis. Our data show that nuclear retention of specific proteins is a common facet among low-quality sperm cells. We demonstrate that the presence of Topoisomerase 2A in the sperm head is highly correlated with poor head morphology. Topoisomerase 2A is therefore a potential new biomarker for confirming male infertility in clinical practice.



Jacob Netherton‡, Rachel A. Ogle‡, Louise Hetherington‡, Ana Izabel Silva Balbin Villaverde§, Hubert Hondermarck¶, and Mark A. Baker‡**

A spermatozoon is a unique cell that is well equipped for its main purpose: to deliver the paternal DNA to the oocyte (1,2). Spermatogenesis, a process that involves intense cell division and morphological changes, is far from being a perfect process (3,4). In fact, so poor is human spermatogenesis that all men, even fertile men, produce defective spermatozoa (5). However, when the number of defective cells reaches a critical level, a man can become either subfertile or infertile (5).
One of the major issues of male-factor infertility is the actual diagnosis. Typically, men are classified as infertile or subfertile after looking at the amount of sperm they produce, the motility of the spermatozoa and the morphology of the cells.
Although measurements of total sperm counts and sperm motility are consistent worldwide (nowadays commonly measured by computer), analysis of sperm morphology is proving to be problematic (6). According to the World Health Organisation Manual (1992), a spermatozoon can have several morphological abnormalities, including defects within the head, tail and/or midpiece, or cells may possess excess residual cytoplasm. Detailed guidance on how sperm morphology should be assessed has been set out (7). However, even though the assessment of sperm morphology is guided by specific criteria, inter- and intra-laboratory differences are prevalent (8), with one report showing user variation in excess of 12% (9). To illustrate how variable the analysis of sperm morphology can be, the initial examination of 1745 sperm samples collected from 1980 to 1994 showed a significant decline in sperm morphology (10); however, the independent re-assessment of the same samples revealed no significant drop in morphology. Instead, the previously observed decline was caused by inconsistent analysis of sperm morphology over this time period (10). The inability to assess sperm morphology and understand its impact on fertility leads to unnecessary treatment and associated costs for many couples. Indeed, of 1391 couples that were referred for assisted conception, including male and female infertility cases, 45.6% ultimately conceived spontaneously (11). This suggests that many couples are misdiagnosed and that better prognostic markers of fertility, particularly regarding male fertility potential, are desperately needed to avoid overtreatment of patients (12,13).
The most common morphological abnormality reported is that pertaining to the shape of the sperm head. Of concern, changes to head morphology may reflect the quality and/or packaging of the DNA (14). During the transition from a round spermatid to elongating spermatid to spermatozoon, a dramatic reduction in the size of the nucleus occurs (15), such that the sperm nuclear volume is about one-seventh that of any somatic cell (16). This is accomplished as the histones are replaced by two smaller, highly charged proteins known as protamines (PRM1, PRM2). Estimates suggest that 90% of genomic DNA is bound to PRM1 or PRM2, while the remaining 10% maintains canonical histone-DNA interaction (17). Following histone-protamine replacement, DNA compaction achieves an almost crystalline level when inter- and intra-molecular disulfide bonds are formed between PRM1 and PRM2 (18-20). Such dense packaging affords critical protection of the genomic DNA from environmental factors and offers a hydrodynamic shape for motility. However, at times, it is evident that chromatin compaction does not occur correctly. Indeed, poor chromatin compaction is observed in cohorts of infertile men (21-23) and is also associated with lower in vitro fertilization (IVF) outcomes (24).
Armed with this background knowledge, the IVF industry has sought to increase success rates by passing sperm ejaculates through density gradients. During this process, denser cells typically pass through the gradient and pellet at the bottom of the tube, whereas less dense cells (including those with increased nuclear volume) are retained within the gradient (25). In general, denser cells possess better levels of motility, morphology and chromatin compaction (26) and, as such, would be the same ones responsible for natural fertility.
In previous studies, we have compared the proteomic composition of high- and low-density spermatozoa and demonstrated that protein changes occur between these two cell populations (27). Of interest, many proteins found to be in higher or lower abundance within poor-quality cells were "common" across different donors. Although male-factor infertility is often suggested to have a very heterogeneous etiology, our dataset suggested the opposite, and potentially that the mechanisms leading to defective sperm formation may be more common than we previously understood (27). In that analysis, which was performed on intact spermatozoa, we noted that very few nuclear proteins were identified. Therefore, to understand whether the nuclear content of poor-quality spermatozoa is also impacted during spermatogenesis, we decided to perform subcellular fractionation of the spermatozoa. This approach allowed us to compare the nuclear protein composition between high- and low-quality spermatozoa and investigate its association with sperm morphology (28). In our quest to identify protein biomarkers that would better predict the fertility status of men, we found that nuclear retention of proteins, including Topoisomerase 2A (TOP2A), may be a hallmark of low-quality sperm cells. Such markers could be added to a panel of proteins for better diagnosis of men's fertility status in the future, rather than relying on subjective measurements of sperm morphology.
Preparation of Human Spermatozoa and Ethics. Institutional and State Government ethical approval was secured for the use of human semen samples in this research program. The work was carried out in accordance with The Code of Ethics of the World Medical Association (Declaration of Helsinki) for experiments involving humans. The study population comprised donors (n = 8) aged from 25 to 70 years and free of any detectable organic disease. None of the men reported any obvious infertility-impairment defects such as varicocele. The semen samples were produced by masturbation after a minimum of 3 days of abstinence. Samples were allowed to stand at 37°C for at least 30 min for liquefaction to occur before being processed. Sperm numbers, morphology and total motility were assessed according to the WHO criteria for fertility (29). Semen samples were then processed over a three-tier Percoll density gradient. Working solution was made by diluting Percoll with Biggers, Whitten & Whittingham (BWW) solution in a 9:1 ratio. A 2 ml aliquot of working solution (considered 100% Percoll) was placed in a 15 ml Falcon tube and overlaid with 2 ml of 60% Percoll, followed by 2 ml of 30% Percoll. The 60 and 30% Percoll solutions were obtained by diluting the working solution with BWW. Semen samples were gently layered over the gradients. Samples were then centrifuged at 500 × g for 30 min, and the spermatozoa at the bottom of the 100% Percoll solution (high-quality cells) and at the interface between the 100 and 60% layers (low-quality cells) were collected into separate tubes. Cells were then washed with 5 ml BWW and centrifuged at 500 × g for 15 min. Pellets were resuspended in 1 ml BWW and cell counts were performed. A 10 µl aliquot of anti-CD45-coated Dynabeads (Thermo Fisher, 11153D) was washed 3 times in BWW and added to the sample to remove any contaminating leukocytes, and the sample was placed on a rotator at room temperature for 30 min (1000 × g, 30°C). Dynabeads were removed using a magnet, and the cells were pelleted at 500 × g for 3 min and washed twice with BWW.
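The layer volumes implied by the gradient description can be worked out with a short calculation. The sketch below is illustrative only: it assumes that each 2 ml layer is made by mixing the working solution (treated as 100% Percoll) with BWW in simple volume proportions, and the function name percoll_layer is hypothetical rather than part of the published protocol.

```python
def percoll_layer(target_percent, final_volume_ml):
    """Return (working_solution_ml, bww_ml) for one gradient layer.

    Assumes the working solution counts as 100% Percoll and that
    volumes are additive; both are simplifications for illustration.
    """
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    working = final_volume_ml * target_percent / 100.0
    return round(working, 3), round(final_volume_ml - working, 3)


# Three-tier gradient from the text: 2 ml each of 100%, 60% and 30% Percoll.
layers = {pct: percoll_layer(pct, 2.0) for pct in (100, 60, 30)}
for pct, (working, bww) in layers.items():
    print(f"{pct}% layer: {working} ml working solution + {bww} ml BWW")
```

Under these assumptions, the 60% layer needs 1.2 ml of working solution plus 0.8 ml of BWW, and the 30% layer needs 0.6 ml plus 1.4 ml.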
Sperm Swelling and DAPI Staining. Approximately 0.5 × 10^6 spermatozoa were resuspended in 150 µl of buffer 3 (50 mM sodium borate, 1% (w/v) SDS and 10 mM EDTA; pH 9.0) for 50 min. The reaction was quenched by the addition of an equal volume of 4% paraformaldehyde. Cells were air dried onto glass slides and stained by overlaying ~50 µl of 10 mM DAPI (4′,6-diamidino-2-phenylindole). Cells were imaged using fluorescence microscopy.
Image J Analysis. Fluorescent images were converted to 8-bit and the threshold set to create a binary image. The plugin 'Analyze particles' was used to measure the area of fluorescence, excluding areas < 50 pixels and with the setting 'Exclude edges' selected. At least 100 cells were imaged and compared between groups using Student's t test (p < 0.05).
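The measurement steps described above (threshold to a binary image, measure connected areas, discard small particles and those touching the image border) can be sketched in plain Python. This is a minimal illustration of the idea, not ImageJ's implementation: the function name particle_areas, the fixed threshold of 128, the use of 8-connectivity and the synthetic demo image are all assumptions made for the example.

```python
from collections import deque


def particle_areas(image, threshold, min_area=50, exclude_edges=True):
    """Return pixel areas of bright particles in a 2D grid of 8-bit values."""
    rows, cols = len(image), len(image[0])
    # Binary mask: True where the pixel is at or above the threshold.
    mask = [[value >= threshold for value in row] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill one connected particle (8-connectivity assumed).
            queue = deque([(r, c)])
            seen[r][c] = True
            area, touches_edge = 0, False
            while queue:
                y, x = queue.popleft()
                area += 1
                if y in (0, rows - 1) or x in (0, cols - 1):
                    touches_edge = True
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Keep particles of at least min_area pixels that pass the edge rule.
            if area >= min_area and not (exclude_edges and touches_edge):
                areas.append(area)
    return areas


# Synthetic 30 x 30 image: one 64-pixel interior particle, one 9-pixel speck,
# and one 60-pixel particle touching the top border.
demo = [[0] * 30 for _ in range(30)]
for r in range(5, 13):
    for c in range(5, 13):
        demo[r][c] = 200
for r in range(20, 23):
    for c in range(20, 23):
        demo[r][c] = 200
for r in range(0, 6):
    for c in range(18, 28):
        demo[r][c] = 200

print(particle_areas(demo, threshold=128))
```

With these settings only the 64-pixel interior particle is reported; the speck falls under the 50-pixel cut-off and the border particle is removed by the edge rule.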
Sperm Nuclei Isolation. A total of 30 × 10^6 spermatozoa (determined using a hemocytometer) were resuspended in 750 µl of buffer 1 (1 mM phenylmethylsulfonyl fluoride (PMSF), 1% (w/v) cetyltrimethylammonium bromide (CTAB), 10 mM DTT and 50 mM Tris, pH 8.0) for 15 min on ice. Afterward, 750 µl of buffer 2 (buffer 1 without DTT) was added and the sample was incubated for 45 min at 4°C with gentle rotation. Cells were then centrifuged at 3000 × g for 5 min at room temperature, and the cell pellet, which consisted of nuclei, was gently resuspended in 500 µl of buffer 2 and incubated for 30 min. Isolated nuclei were then gently washed twice in 500 µl and once in 300 µl of washing buffer (10 mM Tris and 1 mM PMSF; pH 8.0) using 5-min centrifugations at 3000 × g and 4°C.
Nuclei Precipitation and Peptide Generation. Nuclei were resuspended in 450 µl of Tris-buffered saline, supplemented with 10 mM DTT, and incubated for 30 min at 37°C. An aliquot of 50 µl of 10x MNase buffer and 0.5 µl (1,000 U) of MNase were added, and samples were incubated for a further 30 min at 37°C. Immediately following MNase digestion, one volume of chloroform and two volumes of methanol were added to two volumes of sample, and the mixture was vortexed vigorously. This sample was centrifuged (10,000 × g, 2 min) and, after phase separation, the upper phase was removed. Three volumes of methanol were added to the remaining lower phase and the mixture was gently inverted twice. The solution was then centrifuged (10,000 × g, 15 min) to pellet the precipitated protein. The supernatant was discarded, and the pellet was allowed to air dry. Trypsin was added to the pellet and digestion proceeded overnight at 37°C. Samples were centrifuged at 17,000 × g for 20 min, transferred to sample vials and acidified.
Data Dependent Acquisition (DDA) and Library Generation. To generate the spectral library, 11 of the samples were randomly selected and fragmentation spectra were acquired in DDA mode on a Sciex 6600 Triple ToF mass spectrometer as previously described (26). Each file was loaded into Protein Pilot v5.0.1 (AB Sciex) and searched using the Paragon algorithm (5.0.1.0, 4895) in "thorough ID" mode, with trypsin as the digestion enzyme, no cys alkylation, acetylation emphasis as a special factor, and ID focus on biological modifications. The files were searched against the human SwissProt database (downloaded January 2017), which contained 42,324 proteins. Also included in the search database were proteins identified from the CRAPome (30). The final spectral library included 389 proteins, comprising 1331 peptides and 8797 spectra, with a global false discovery rate (FDR) of 1%. An in-depth explanation of how FDR is calculated by the Paragon algorithm, the handling of multiple spectra for peptide IDs and the assignment of variable modifications is given elsewhere (31).
Sequential Window Acquisition of all Theoretical Masses (SWATH) Analysis. To obtain peptide quantitative data, the 16 samples (8 high- and 8 low-quality spermatozoa) prepared above were again run through liquid chromatography, coupled to a 6600 Triple ToF mass spectrometer running a SWATH acquisition method. For the acquisition of SWATH data, the settings were as follows: an MS1 scan was performed, followed by 100 m/z isolation windows with 1 m/z overlap. The window sizes were determined using the Sciex Variable Window Calculator 1.0 using a DDA file generated above. The total cycle time was 2.8 s. To quantify peptides, the spectral library generated above was imported into PeakView (2.2). Files acquired by SWATH-MS were imported using the SWATH-MS/MSAll Micro app (1.0). The generated library was imported, excluding modified and non-unique peptides. Retention times were aligned by manually selecting peptides present across all samples from proteins known to be in high abundance. Peptides were taken from an even distribution across the elution time. All samples were manually inspected to ensure the peptide was detected and selected correctly by the SWATH algorithm before elution time recalibration. After recalibration, it was ensured that linearity between samples was uniform. Following retention time alignment, the samples were processed to match to the spectral library within a 5 min RT window. The top six most abundant ions (excluding the precursor ion) were used for identification, with a mass tolerance of 10 ppm for matching between the library and acquired SWATH data. The area under the curve for identified fragment ions was calculated automatically by the SWATH algorithm and summed for matched peptides (99% confidence).
Perseus. Following processing in PeakView, the raw data was exported into MarkerView (version 1.3.0.1). From here, the data was again exported as a raw text file, which allows for analysis using the freeware Perseus software (version 1.5.2.6) (www.perseus-framework.org). In Perseus, the data was log-transformed then normalized to the median of each individual sample. Sample groups were compared using a paired Welch's t test. Proteins with a fold change greater than 2.0 and a p value < 0.05 were considered significant.
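The processing steps named above (log transformation, per-sample median normalization, a Welch-type comparison and a fold-change cut-off) can be illustrated with a small standard-library sketch. This is not Perseus's implementation and the numbers are hypothetical; the function names are invented for the example. It computes an unpaired Welch t statistic with Welch-Satterthwaite degrees of freedom, and leaves the p value to a statistics package because the t-distribution CDF is not available in the Python standard library.

```python
import math
from statistics import mean, median, variance


def log2_median_normalize(intensities):
    """Log2-transform one sample's intensities and subtract the sample median."""
    logged = {protein: math.log2(value) for protein, value in intensities.items()}
    centre = median(logged.values())
    return {protein: value - centre for protein, value in logged.items()}


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical raw intensities for one sample (powers of two keep the maths exact).
sample = {"P1": 256, "P2": 1024, "P3": 4096}
normalized = log2_median_normalize(sample)

# Hypothetical normalized log2 values for one protein in each group.
low_quality = [2.0, 2.5, 3.0]
high_quality = [0.0, 0.5, 1.0]
t_stat, dof = welch_t(low_quality, high_quality)

# On the log2 scale, a fold change greater than 2 means a difference greater than 1.
log2_fold_change = mean(low_quality) - mean(high_quality)
passes_fold_change = abs(log2_fold_change) > 1

print(normalized, round(t_stat, 3), round(dof, 1), passes_fold_change)
```

Here the median protein is centred on zero after normalization, and the example protein differs by 2 log2 units (4-fold) between groups, so it would pass the fold-change criterion and then be checked against p < 0.05.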
Immunoblotting. Whole cells were prepared as described above, or cell nuclei were prepared as above (all steps prior to trypsin digestion). Proteins were solubilized in SDS extraction buffer (2% (w/v) SDS, 10% (w/v) sucrose, 114 mM Tris and 1 tablet of complete mini protease inhibitor; pH 7.5). Immunoblotting was performed as described elsewhere (27). For blots performed on whole cells, ~5 µg of protein was used for analysis, and alpha tubulin was used as a loading control. For protein extracts from nuclei, 10 × 10^6 nuclei were used for protein extraction, and a silver-stained gel loaded with the same amount of protein as used for immunoblotting served as the loading control. Antibodies used and their dilutions were as follows: mouse monoclonal antibody against TOP2A (Abcam, ab52934) diluted 1/1000, sheep polyclonal antibody against PDIA3 (Bio-techne, AF8219) diluted 1/1000, mouse monoclonal against alpha tubulin (Sigma, T5168) diluted 1/2000, rabbit polyclonal against histone H3 (Sigma, H0164) diluted 1/3000. Secondary antibodies used for detection: goat anti-rabbit HRP (Sigma, DC03L), goat anti-mouse HRP (Invitrogen, A16072), rabbit anti-sheep HRP (Sigma, 402100).
Experimental Design and Statistical Rationale. For the mass spectrometric analysis of sperm nuclei, ejaculates were processed through a Percoll gradient to isolate high- and low-quality spermatozoa populations. Isolated cells were treated with a detergent to isolate nuclei and treated with MNase prior to trypsin digestion of whole nuclei to generate peptides for analysis. Eleven samples were used for library generation (5 high quality/6 low quality, randomly selected), and 16 samples for SWATH data generation (high/low quality from 8 donors). Eight donors were selected as an appropriate sample size, as we reached the threshold power of detection at 6 donors, with minimal increases at 7 and none at 8. For sperm chromatin swelling, immunoblotting and immunocytochemistry experiments, high- and low-quality sperm/nuclei from 3 separate donors were used. For statistical analysis of SWATH data using Perseus software, data was log2-transformed and median normalized, and a Welch's t test was performed. Welch's t test was used because the data was assumed to be normally distributed, and because Welch's t test performs as well as Student's t test in situations of equal variance and outperforms it in situations of unequal variance (33). A p value < 0.05 and a > 2-fold change were used to threshold changes. For chromatin swelling, an unpaired 2-tailed Welch's t test was used, and for immunocytochemistry counts, a paired 2-tailed Student's t test was used.

Isolation, Purity and Biochemical Properties of Human Sperm Nuclei. We compared the composition of nuclear protein between high- and low-quality human spermatozoa, aiming to identify protein abundance changes that could be used to better classify the fertility potential of sperm samples. To achieve this, we first separated sperm samples into two populations, one with high-quality spermatozoa and the other with low-quality cells, using a discontinuous Percoll gradient protocol adapted from a previously described one (34). In brief, three different densities of Percoll are overlaid and the sperm samples placed on top, then centrifuged. Spermatozoa that pellet at the bottom of the tube are of higher density compared with those that collect at the 100/60% Percoll interface (Fig. 1A). We have previously characterized these populations and shown that the majority (> 90%) of the less dense cells have distinctly poor morphology and motility (herein referred to as low quality). In contrast, cells of high density are enriched with morphologically normal, motile and highly fertile sperm (27) (herein referred to as high quality). Images of spermatozoa from each fraction are shown in Fig. 1B. Cells in the low-quality fraction often show morphological defects, including amorphous heads (Fig. 1B, upper image, black arrows), which contrast with cells from the high-quality fraction, where morphologically normal cells are enriched (Fig. 1B, lower image, white arrow). Raw semen analysis of the donors used for SWATH can be found in supplemental Table S1.
To isolate the nucleus, intact sperm cells were treated with the detergent CTAB as previously described (35). To characterize the purity of nuclei preparations, differential interference contrast (DIC) images were taken (Fig. 1C). Counting over 400 nuclei derived from either high- (Fig. 1B, lower) or low-quality (Fig. 1B, upper) cells using DIC demonstrated there was no evidence of fractured tails or midpieces. In addition, no round cell contamination was present, suggesting that the populations of nuclei are highly pure, which has also been previously reported by others (35). In confirmation of this, after proteomic analysis of the isolated nuclei from both high- and low-quality sperm, the major non-nuclear sperm proteins (e.g. tubulin, protein kinase A anchoring protein 3 and 4, and several heat shock proteins) were not apparent.
Differences in Chromatin Compaction Within the Low-quality Fractions. Although cells from the low-quality fraction showed a higher percentage of abnormal morphology than the high-quality fraction, this was not the case when we observed only the nucleus. Both populations produced very similar images, with very little evidence of gross differences regarding nuclear morphology. As such, to determine whether there were biochemical changes, we challenged spermatozoa with SDS/EDTA (36). Previous work has shown that when incubated for short periods of time, sperm with poorly compacted chromatin (the definition in this case being that they look "granular" under the electron microscope) decondense rapidly, showing the chromatin was not tightly compacted (compact sperm nuclei are very electron dense) (36). As seen in Fig. 2A, the chromatin of some spermatozoa from the low-quality population undergoes rapid decondensation, or enlargement, following SDS/EDTA treatment (Fig. 2A, white arrow). To quantify this, the area of nuclei in both populations following treatment was measured using Image J. Examples from three donors are shown (Fig. 2B). In each case, there is a significant increase in the DAPI-stained area from high- to low-quality cells. However, the total number of cells that showed this increase varies between donors. This data demonstrates that, despite no gross differences in nuclear morphology being observed, chromatin derived from low-quality spermatozoa is more susceptible to SDS/EDTA treatment than that derived from high-quality cells and is therefore likely to be less compact.
Generation of SWATH Library. To determine quantitative differences between the nuclear composition of the high- and low-quality sperm populations, we performed a proteomic analysis using Sequential Window Acquisition of all Theoretical spectra, or SWATH. Following trypsin digestion, 11 samples consisting of 5 high- and 6 low-quality fractions from 6 donors were randomly selected and used to generate the spectral library by running the mass spectrometer in data-dependent mode. The library generated consisted of 389 proteins, comprising 1331 peptides and 8797 spectra at a global false discovery rate (FDR) of 1%. In comparison, proteomic analysis of somatic nuclei preparations usually produces ~2500 protein identifications (28). However, it needs to be recognized that, to protect the DNA, spermatozoa possess a highly compact nucleus, which has been described as crystal-like with (almost) no nucleoplasm (18-20). This explains why we find fewer total protein identifications compared with somatic cell nuclei (28).
A comparison of our list of sperm nuclear proteins to the only previously published study (34) is shown in supplemental Fig. S1. We were initially concerned that only a ~50% overlap of proteins was found between both studies. However, it appears that (1) sample purity and (2) reporting of redundant hits explain the discrepancies. Examining the list unique to De Mateo et al. (35), we could identify some proteins that are unlikely to be associated with the nucleus, suggesting that minor contamination of the nuclear preparation may have occurred. Examples include albumin, protein kinase A anchoring protein 3 or 4, many of the mitochondrial proteins (37), fibronectin (38), ferritin (39), lactate dehydrogenase (40), prostatic acid phosphatase (41) and tubulin. However, perhaps the greatest discrepancy is the way protein hits were reported. De Mateo et al. appear to have used redundancy in reporting protein hits (i.e. if one peptide matches to multiple proteins, then all proteins are reported), whereas our report uses non-redundant hits (that is, if one peptide is found to match to many proteins within the same protein group, and a second peptide is found that matches to only one protein isoform, then only this protein isoform is reported). There is no "correct" method: redundant reporting gives many false positives but leaves few false negatives, whereas non-redundant reporting gives fewer false positives but may fail to identify proteins that are present. However, the use of two different approaches greatly exaggerates the differences between the lists. For example, 15 isoforms of histone H2B are reported by De Mateo et al. (35), whereas our list only contains histone H2B2E, H2BFM and H2B1A. The 15 H2B isoforms reported by De Mateo et al. would not each have a unique peptide (something our study requires), because they only differ by a few amino acids at the N terminus of the protein, which is rich in lysine and arginine (the amino acids at which trypsin cleaves). Thus, in this example alone, we only have 3 "overlapping proteins," whereas in reality, if both methods used non-redundant reporting, the lists would be extremely close. Therefore, our criterion was more stringent, and that likely explains the difference between the identifications reported by both studies. The protein identifications from both lists (De Mateo et al., and ours) are given in supplemental Table S2.
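The effect of the two reporting conventions on list length can be seen in a toy example. The sketch below is a deliberately simplified illustration, not the protein grouping logic of either study or of any search engine: the peptide names and the isoform labels H2B_iso1 to H2B_iso3 are hypothetical, and "non-redundant" is reduced here to requiring at least one peptide that maps to a single protein.

```python
def redundant_report(peptide_to_proteins):
    """Report every protein matched by at least one peptide."""
    return sorted({protein
                   for proteins in peptide_to_proteins.values()
                   for protein in proteins})


def unique_peptide_report(peptide_to_proteins):
    """Report only proteins supported by a peptide that maps to them alone."""
    return sorted({next(iter(proteins))
                   for proteins in peptide_to_proteins.values()
                   if len(proteins) == 1})


# Hypothetical peptide evidence for three closely related isoforms.
evidence = {
    "shared_peptide_1": {"H2B_iso1", "H2B_iso2", "H2B_iso3"},
    "shared_peptide_2": {"H2B_iso2", "H2B_iso3"},
    "unique_peptide": {"H2B_iso1"},
}

print(redundant_report(evidence))       # all three isoforms are listed
print(unique_peptide_report(evidence))  # only the isoform with a unique peptide
```

The same peptide evidence yields three reported proteins under the redundant convention and one under the unique-peptide convention, which is the kind of inflation in apparent disagreement described above for the histone H2B isoforms.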
Quantitative SWATH Analysis. To quantitatively compare proteins from high- and low-quality sperm nuclei, eight men gave donations and the sperm nuclei were isolated and run on the Triple-ToF 6600 in SWATH mode. Relative quantification was performed using the freeware statistical analysis package Perseus (www.perseus-framework.org). Data was log2-transformed to reduce sample spread and then normalized to the median value for each individual sample to account for variation. Using a paired Welch's t test, we identified significant protein changes only in proteins meeting the minimum cut-off criteria of p < 0.05 and a fold change no less than 2. In total, 26 proteins were significantly different between the groups, which represents 6.7% (26 of 389) of the total protein identified here. Proteins with significant regulation can be found in Table I.
Protamine 2 and Histones Do Not Change Within the Low-quality Cells. Several reports have suggested that infertile men lack protamine 2 specifically (42), or have a change in the ratio of protamine 1/protamine 2 (43,44). Furthermore, others have suggested that within infertile spermatozoa, there are increased amounts of histone (21). However, our SWATH data indicated that neither protamine 2 nor any of the histone proteins were significantly different. To confirm this, we compared high- and low-quality sperm fractions using antibodies against histone H3. As shown (Fig. 3A), we found no change in the level of H3 when normalized to the loading control α-tubulin. The quantitative SWATH data for comparison are plotted (Fig. 3B).

Table I demonstrates that all proteins significantly different between groups were exclusively more abundant within the low-quality fraction cells. From a germline developmental point of view, this data can be reasonably explained. During spermatogenesis, the sperm genome is packaged to a near-crystalline state (45). During these events, there is a characteristic reduction in nuclear volume to around one-seventh that of any somatic cell (16), which enables a more hydrodynamic head shape and protects the DNA from toxic insult (45). The processes that lead to a reduction in nuclear volume are unclear but may involve ubiquitin-dependent degradation (46) and/or manchette removal of protein and nuclear compaction (47-53). Less likely, but potentially still a factor, are tubulobulbar complexes (54) and ectoplasmic specialization removal of proteins (55). The finding that low-density spermatozoa have more abundant proteins suggests a failure in one or more of these processes by which nuclear volume is reduced. To validate the SWATH analysis, we performed immunoblotting. The proteins topoisomerase 2 alpha (TOP2A) and protein disulfide isomerase A3 (PDIA3) were selected because of their significant fold change and their potential importance within the nucleus of a developing spermatozoon. Nuclei from high- and low-quality human spermatozoa were taken, and the proteins were extracted and separated via SDS-PAGE. Following transfer, the nitrocellulose was probed with antibodies against these proteins. As shown, PDIA3 protein was more abundant within the low-quality sperm nuclei (Fig. 4A). Inter-donor variation of the changes was seen. For example, donor 1 (Fig. 4A) appeared to have less PDIA3 than donor 3, despite near equal loading (Fig. 4B). However, in all cases, PDIA3 was consistently higher in the nucleus of the low-quality sperm fractions. This was in accord with the proteomic data plotted as normalized expression (Fig. 4C) (note that the negative value associated with the high-quality sperm cells is because of the normalization against the median). Immunocytochemical analysis of intact sperm against PDIA3 demonstrated great variation in staining (Fig. 5A). The protein was shown to be localized to several different areas of the cell, including the redundant nuclear envelope (Fig. 5A, ii, a), perinuclear theca (Fig. 5A, ii, b) and residual cytoplasm (Fig. 5B, ii, c). Note that it is for this reason (particularly the residual cytoplasm staining) that the immunoblot in Fig. 4 was performed against isolated nuclei. We observed that a greater number of spermatozoa were positive for PDIA3 in the low-quality sperm fraction compared with its high-quality counterpart. In most cases, staining was associated with poor sperm morphology, consistent with the notion that PDIA3 was higher in low-quality cells. However, given the variation in staining patterns, we did not pursue this further.

TOP2A Is Associated with Low-quality Sperm Heads-
Immunoblotting demonstrated that TOP2A (Fig. 6A) was more abundant within the low-quality fractions, which is consistent with the quantitative SWATH data (Fig. 6B). Immunohistochemical staining demonstrated that TOP2A was localized to the redundant nuclear envelope (Fig. 7A). The redundant nuclear envelope consists of excess vestments of the nuclear envelope after compaction of the nucleus during sperm development. The proximity of TOP2A with DAPI-stained nuclei is shown (Fig. 7A, iv). The overlay demonstrates that TOP2A is part of the nuclear compartment of the sperm cell.
During our analysis, we noted that some sperm cells were stained with TOP2A, whereas others were not. To investigate this further, we counted the number of sperm within the high- and low-quality fractions that were positive for TOP2A and found a significant difference, with more sperm positive for TOP2A in the low-quality fractions (Fig. 7B).
FIG. 3. Levels of histone H3 are consistent between high- and low-quality spermatozoa populations. A, Immunoblot of SDS-extracted proteins from high- and low-quality spermatozoa isolated from three ejaculates, probed using anti-H3. Cross-reacting bands were visualized with chemiluminescence after addition of horseradish peroxidase-conjugated secondary antibody. Lower blot: following recording of chemiluminescence, the nitrocellulose was stripped of primary antibody, re-blocked and probed with anti-α-Tubulin. The chemiluminescent signal obtained was used as a loading control. B, The abundance of histone H3 according to the SWATH analysis demonstrates no significant difference between high- and low-quality sperm cells (n = 8, p > 0.05).
FIG. 4. PDIA3 is detected in greater abundance in spermatozoa derived from the low-quality fractions. A, Human sperm ejaculates were taken from individual donors, and high- and low-quality fractions were isolated through Percoll density gradients. Nuclei isolated from spermatozoa were then lysed and separated via SDS-PAGE. Following transfer, the nitrocellulose was probed with primary antibodies against PDIA3. Cross-reacting bands were visualized with chemiluminescence after addition of horseradish peroxidase-conjugated secondary antibody. The approximate positions of the molecular mass markers are shown. B, An equal-sized aliquot of the same sample used for the immunoblot was loaded onto SDS-PAGE and run, and total protein was detected via silver stain. C, PDIA3 abundance according to the SWATH analysis is significantly greater in low-quality than in high-quality sperm cells (n = 8, *p < 0.01).
Finally, we determined the morphology of spermatozoa that were positive for TOP2A staining. To achieve this, we isolated both high- and low-quality spermatozoa and then characterized the morphology of sperm cells positive for TOP2A in both populations. Regardless of the Percoll fraction from which they originated, we found that the overwhelming majority of spermatozoa positive for TOP2A demonstrated amorphous heads (Fig. 8). As the high-quality Percoll fraction is enriched for morphologically normal spermatozoa, these data explain why TOP2A is more abundant in the lower-quality cells.
FIG. 6. TOP2A is more abundant in spermatozoa derived from the low-quality fractions. Human spermatozoa from individual donors were fractionated through Percoll density gradients. Intact cells were lysed and run into SDS-PAGE. A, Following transfer, the nitrocellulose was probed with primary antibodies against TOP2A (upper blot). Cross-reacting bands were visualized with chemiluminescence after addition of horseradish peroxidase-conjugated secondary antibody. Lower blot: following recording of chemiluminescence, the nitrocellulose was stripped of primary antibody, re-blocked and probed with anti-α-Tubulin. B, TOP2A abundance according to the SWATH analysis is significantly greater in low-quality than in high-quality sperm cells (n = 8, *p = 0.0001).

DISCUSSION
Male infertility is a growing medical problem. Although solutions are available through a variety of assisted reproductive technologies (ART), there are many drawbacks. First, the technology is expensive, with the average cost per cycle sitting at around $12,000-18,000. Second, there is no guarantee of success. In the United States (2015), 464 reporting clinics performed 231,936 cycles, of which only 60,778 resulted in live births (26% success rate).
Third, it is highly medicalized, with the most common form of assisted conception, namely intracytoplasmic sperm injection (ICSI), involving hormone stimulation of the female followed by oocyte retrieval. Yet for many couples, it is evident that the referral for assisted conception was unnecessary (11). Despite these concerns, the number of couples resorting to ART increases by ~10% each year and will continue to grow as the public becomes increasingly aware of ART options and availability (56).
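The 26% success rate quoted above for 2015 follows directly from the reported counts, as the short check below shows. The variable names are introduced here for illustration only; the two counts are those given in the text.

```python
# Counts quoted in the text for US clinics reporting in 2015.
cycles = 231_936
live_births = 60_778

# Fraction of ART cycles that resulted in a live birth.
live_birth_rate = live_births / cycles
print(f"Live births per cycle: {live_birth_rate:.1%}")  # prints "Live births per cycle: 26.2%"
```

Put another way, roughly three in every four cycles did not end in a live birth.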
The problem faced by health practitioners and in vitro fertilization clinics alike is the classification of men requiring help. Currently, diagnosis of male-factor infertility relies on two criteria. The first is a failure to conceive after at least 1 year of unprotected sexual intercourse. However, ~50% of couples that fail in the first year of trying will go on to conceive in the second year without any medical intervention (57), and as such, a further criterion is required. To this end, a semen analysis is performed that classifies the male patient as fertile or infertile based on the guidelines published by the World Health Organisation in 1980 (29). However, because of the inadequacies found in correctly predicting infertility, revised values were released in 1987, 1992, 1999 and again in 2010 (29). The semen analysis considers total sperm numbers and the percentages of motile and morphologically normal cells. On the surface, a semen analysis would benefit the classification of fertile and infertile men. However, several reports show that a semen analysis falls short of a true diagnosis (58-70).
An alternative method for the diagnosis of male-factor infertility is to build up a panel of protein biomarkers whose expression correlates with sperm fertilizing potential. The advantage of using a panel of markers is the ability to cover more aspects of "fertility" than a semen analysis alone. For example, in men classified as "idiopathic infertile," we have shown that outer dense fiber 1 (ODF1) was virtually absent from their spermatozoa (29). ODF1 is a chaperone that regulates the joining of the sperm head to the tail. Despite a "normal" semen profile, sperm lacking ODF1 are quite fragile and easily decapitate over time, which may explain their infertility (32). Of interest, the current report found a second ODF family member, namely ODF2, to be more abundant in the nuclei of low-quality sperm cells. Initially, this finding was surprising, as ODF2 is a well-recognized, major protein of the sperm flagellum (71,72) and not the nucleus. However, upon further inspection, we noted that our proteomic platform had detected a unique peptide from the ODF2 isoform 9 transcript. This isoform (Q5BJF6-7, NP_702910) is a testis-expressed transcript and arises because of an alternate start site, giving the protein a unique N terminus. Inspection of the sequence demonstrated that the N terminus of ODF2 isoform 9 has a nuclear localization signal (73). Indeed, in keeping with this idea, specific ODF2 isoforms have been located in the nuclei of germ cells during spermatogenesis (74).
Although lack of ODF1 is a biomarker of infertility for some men, it certainly does not work for all men. As such, a panel of markers will be required to diagnose an array of male-factor infertility phenotypes. Previously, alterations to the PRM1/2 ratio in mature sperm have often been associated with poor reproductive outcomes (75,76). In this study, we found no change in either PRM2 or the detected histone peptides between our populations (Fig. 3, supplementary Table S3). Although our results may initially appear to be at odds with previous findings, those studies are often comparisons of fertile and infertile men. When protamine ratios and levels have been compared between Percoll-separated fractions from the same individual, little (77) to no change (78) is observed. Both PDIA3 and TOP2A hold some promise to identify samples with poor sperm head shape. PDIA3 has previously been localized to the rat sperm membrane and is a chaperone known to be involved in the reduction of disulfide bonds of several proteins (79). Included in this list is the major histocompatibility complex class 1 (MHC) (80). Although this is a somatic protein, the reductase activity of PDIA3 leads to unfolding of the MHC protein, causing it to be targeted for destruction (80). Such a mechanism of action may explain why PDIA3 was largely associated with poor-quality spermatozoa. However, the very heterogeneous signals we obtained through PDIA3 staining of human spermatozoa made it difficult to investigate this phenomenon further.
FIG. 8. TOP2A is associated with amorphous sperm heads. Cells from either the high- or low-quality Percoll fraction were immunostained for TOP2A. Those cells positive for the protein were then assessed in terms of their morphology. The graph shows the combined cell counts for both high- and low-quality spermatozoa (n = 3 donors, 100 cells counted/fraction).
TOP2A is a protein normally involved in the alteration of DNA topology. Specifically, the enzyme activity of TOP2A induces transient double-strand breaks within DNA. These temporary breaks allow one DNA duplex to pass through another, creating or releasing torsional stress (81). The enzyme is critically important to spermatogenesis, as topoisomerase activity is required for successful meiosis (82). Additionally, long-term treatment of mice with clinical doses of etoposide (a TOP2A inhibitor) causes chromosomal and sperm morphology aberrations (83). This is not surprising, as in addition to its role in spermatogenesis, TOP2A is involved in the histone-to-protamine transition (84). This is of interest because, in infertile men, either the level of protamines or the ratio between protamine 1 and protamine 2 within the sperm population can differ from that of fertile men (85). Given these data, our initial hypothesis was that expression of TOP2A may be related to sperm chromatin compaction through aberrant histone or protamine levels. However, both SWATH and immunoblotting demonstrated this to be incorrect. Rather, we found that in virtually all cases, TOP2A was associated with poor head morphology.
The association between the etiology of poor sperm head morphology and TOP2A expression is yet to be explained. One idea is that during spermatogenesis, nuclear compaction occurs at the end of the cell elongation stage. At the same time, proteins that are not necessary for sperm function are removed. Several processes may affect this, including ubiquitin-dependent degradation and/or manchette removal of protein and nuclear compaction (47-53). With respect to the former, previous work in our laboratory shows no difference in proteins labeled with anti-ubiquitin antibody between high- and low-quality sperm cells (data not shown). As such, there is a lack of evidence to support a role for ubiquitin-dependent degradation. However, many genetic knockout models suggest that manchette formation is critical for nuclear compaction and, by implication, nuclear protein removal. Examples include KIF3A (47), MNS1 (48), HOOK1 (49), RIM-BP3 (50), MEIG1 (51), SPEM1 (52) and katanin p80 (53). The manchette is a skirt-like structure that forms around the elongating sperm head (86). As the manchette moves laterally from the sperm head toward the flagellum, it not only shapes the cell but also helps compact the nucleus (86). Thus, as low-quality cells contain more abundant nuclear proteins, we speculate that this finding is largely related to poor manchette formation. Unfortunately, this model will be difficult to test using human testis sections, as one cannot distinguish between high- and low-quality sperm at this stage of development in the testis.
In conclusion, we found that retention of nuclear proteins is a hallmark of poor-quality spermatozoa, as defined here. Furthermore, TOP2A is highly associated with amorphous sperm heads and is therefore a potential new biomarker for male-factor infertility in clinical practice.

DATA AVAILABILITY
All raw MS files (DDA and SWATH) and the generated libraries (containing annotated spectra, in both .group and .mzid formats) can be found on the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) (PXD014210) via the MassIVE partner repository with the dataset identifier MSV000083954. Processed data are available in supplemental Table S3.