Stable-protein Pair Analysis as a Novel Strategy to Identify Proteomic Signatures: Application to Seminal Plasma from Infertile Patients

Our aim was to define seminal plasma proteome signatures of infertile patients categorized according to their seminal parameters using TMT-LC-MS/MS. To that end, quantitative proteomic data were analyzed following two complementary strategies: (1) the conventional approach based on standard statistical analyses of relative protein quantification values; and (2) a novel strategy focused on establishing stable-protein pairs. By conventional analyses, the abundance of some seminal plasma proteins was found to be positively correlated with sperm concentration. However, this correlation was not found for all the peptides within a specific protein, bringing to light the high heterogeneity of the seminal plasma proteome, which arises from proteolytic fragments and/or post-translational modifications. This issue was overcome by conducting the novel stable-protein pairs analysis proposed herein. A total of 182 correlations comprising 24 different proteins were identified in the normozoospermic-control population, whereas this number was drastically reduced in infertile patients with altered seminal parameters (18 in patients with reduced sperm motility, 0 in patients with low sperm concentration and 3 in patients with no sperm in the ejaculate). These results suggest the existence of multiple etiologies causing the same alteration in seminal parameters. Additionally, repeating the stable-protein pair analysis in the control group while adding the data from a single patient at a time made it possible to identify alterations in the stable-protein pairs profile of individual patients with altered seminal parameters. These results suggest potential underlying pathogenic mechanisms in individual infertile patients, and might open up a window to the application of this approach in the personalized diagnosis of male infertility.


In Brief
The seminal plasma proteome from infertile patients differing in seminal parameters was determined using quantitative MS-based proteomics. Conventional analyses were conducted together with a new strategy based on the identification of stable-protein pairs. A stable-protein pair pattern was established for normozoospermic individuals but not for patient groups with altered seminal parameters, reflecting multiple causes affecting seminal parameters. Moreover, the evaluation of the stable-protein pair pattern in the control population, adding one individual patient at a time, opens a window to personalized male infertility diagnosis.

Infertility is a frequent worldwide problem that affects ~15% of reproductive-aged couples. Around 50% of fertility problems are due to a male factor and, of those, 40-60% of the infertile patients present alterations in at least one of the seminal parameters assessed by a routine semen analysis (sperm concentration, motility and morphology) (1). According to these seminal parameters, infertile patients are categorized as: (i) patients with low or absent sperm concentration (oligozoospermia or azoospermia, respectively), (ii) patients with defective sperm motility (asthenozoospermia) and/or (iii) patients with abnormal sperm morphology (teratozoospermia) (2). The semen evaluation, together with a complete medical history and physical examination, determines whether the initial assessment needs to be complemented with genetic and/or hormonal analyses, urinalysis or testicular biopsies. Unfortunately, the currently available tools for the evaluation of male fertility are limited and insufficient, and the development of new methodologies to better discern the male factor etiology is required (3-6). The application of high-throughput proteomics to the study of the human sperm cell has resulted in the identification of 6871 proteins as well as some pathogenic mechanisms involved in male infertility (7-11). For instance, studies on the sperm proteome from asthenozoospermic patients revealed alterations mainly in proteins and pathways related to energy production and the cytoskeleton (9,12). However, because semen is not composed of sperm cells alone, the exploration of seminal plasma could help to better elucidate the causes of male infertility.
Human seminal plasma is a complex and protein-enriched biological fluid that constitutes 95% of the semen volume, whereas only the remaining 5% corresponds to spermatozoa (10). This fluid is composed of secretions from the testis (1-2%), the epididymis (2-4%) and the male accessory sex glands, including the seminal vesicles (65-75%), the prostate gland (25-30%) and the bulbourethral glands (< 1%) (13-15). Seminal plasma contains a diversity of molecules including DNA, RNA, microRNAs, lipids, proteins and metabolites, together with a highly abundant population of extracellular vesicles, which are mainly secreted by the prostate (10,16). So far, the compiled proteome profile of human seminal plasma includes 2064 non-redundant proteins from 9 independent studies, revealing that this fluid is not merely a medium to carry the spermatozoa through the female reproductive tract (10,17). In fact, once spermatogenesis is completed, the testicular spermatozoa, although morphologically differentiated, are immotile germ cells unable to fertilize the oocyte on their own (10). Seminal plasma plays a crucial role in sperm maturation, motility and capacitation, prevention of premature acrosome reaction, and sperm-to-zona pellucida recognition and fusion, thereby providing the spermatozoa with the capability to fertilize the oocyte (18-25). Additionally, the contact of the seminal plasma with the female reproductive tract provides an optimal environment that enhances embryo implantation and development and contributes to promoting maternal immune tolerance of the semiallogeneic fetus (26-29).
The seminal plasma proteome is currently under intense study (14,30). With the aim of shedding light on the physiological role of the seminal plasma and seeking specific protein biomarkers for male infertility diagnosis and/or prognosis, several groups have studied the seminal plasma proteome in certain subtypes of male infertility and/or alterations associated with oxidative stress and sperm functional traits, among others (31-49). In addition, there is interest in identifying seminal plasma protein biomarkers that could predict the presence of sperm cells in the testis of azoospermic patients, which could avoid unnecessary testicular biopsies (34, 50-57).
Therefore, the analysis of the seminal plasma proteome mirroring the infertility classification according to seminal parameters is warranted, because it might help to decipher potential pathogenic mechanisms resulting in these sperm alterations. However, it is important to consider that a wide range of factors may alter both sperm and seminal plasma composition, leading to high heterogeneity among patients sharing the same phenotype. This is challenging for the conventional analysis of quantitative proteomic data, which is based on the search for differential proteins between groups of patients with similar characteristics. Therefore, the main aim of the present study was to define the seminal plasma proteome signatures of infertile patients categorized according to their seminal parameters (normozoospermia, NZ; asthenozoospermia, AS; oligozoospermia, OZ; azoospermia, AZ) by applying conventional and novel approaches to the analysis of quantitative proteomic data, in order to better stratify the different subgroups of infertile patients. The results derived from this study suggest that the combination of conventional and novel analytical approaches may be useful for identifying pathogenic mechanisms of male infertility and, furthermore, may provide the basis for future studies to design new therapies to improve male fertility and to move toward an individual and personalized diagnosis of male infertility.

EXPERIMENTAL PROCEDURES
Biological Material and Sample Collection-Semen samples -Human semen samples (n = 34) were obtained from patients undergoing routine semen analysis at the Assisted Reproduction Unit from the Clinic Institute of Gynaecology, Obstetrics and Neonatology, from the Hospital Clínic (Barcelona, Spain), after signed informed consent. The ejaculates were collected by masturbation into sterile containers after 3-5 days of sexual abstinence. The evaluation of the seminal parameters was performed using the automatic semen analysis system CASA (Proiser, Paterna, Spain), and sperm viability was assessed using 0.5% (w/v) Eosin Y, following the WHO recommendations (2). The semen samples included in this study were classified according to semen parameters as normozoospermic (NZ, n = 10), asthenozoospermic (AS, n = 4), oligozoospermic (OZ, n = 10), and azoospermic (AZ, n = 10) (supplemental Table S1).
Testis and epididymis tissues -Biopsies from normal testis (n = 1) and normal epididymis (n = 1) were provided by the Department of Pathology from the Hospital Clínic (Barcelona, Spain).
Ethics Statement -All samples were used in accordance with the appropriate ethical guidelines and Internal Review Board, and the storage and processing of the biological material was approved by the Clinical Research Ethics Committee of the Hospital Clínic of Barcelona (Barcelona, Spain). Written informed consent was obtained from all subjects in accordance with the Declaration of Helsinki.
Purification and Isolation of Proteins from Seminal Plasma-Liquefied semen samples were centrifuged at 500 × g for 10 min and 1500 × g for 10 min to separate the sperm cells from the seminal plasma. The resulting seminal plasma was filtered (0.45 µm pore size) to remove any cellular leftovers. All cell-free seminal plasma samples were frozen at -80°C until further processing. After thawing, seminal plasma samples were centrifuged at 16,000 × g for 10 min at 4°C and the protein concentration in the supernatant of each sample was determined using the BCA protein Assay Kit (Pierce™ BCA protein Assay Kit, Thermo Fisher Scientific, Rockford, IL), following the manufacturer's recommendations.
Abbreviations used: … assisted semen analysis; WHO, World Health Organization; BCA, bicinchoninic acid; TMT, tandem mass tag; TEAB, triethyl ammonium bicarbonate; TCEP, tris (2-carboxyethyl) phosphine; IAA, iodoacetamide; RT, room temperature; LC-MS/MS, liquid chromatography coupled with tandem mass spectrometry; MS/MS, tandem mass spectrometry; HCD, higher energy collision dissociation; FDR, false discovery rate; PSMs, peptide spectrum matches; ANOVA, one-way analysis of variance; HPA, Human Protein Atlas; TBST, TBS with 0.1% (v/v) Tween 20; PAS, periodic acid and Schiff's reagent; SEMGs, semenogelins; PTMs, post-translational modifications.
Seminal Plasma Peptide Isotopic Labeling (TMT 10-plex)-A total of 16 seminal plasma samples were selected for the proteomics study, including 4 NZ, 4 AS, 4 OZ and 4 AZ patients (Fig. 1). Differential peptide labeling was performed using the TMT 10-plex isotopic label reagent set (TMT 10-plex Mass Tag Labeling; Thermo Fisher Scientific), following the manufacturer's instructions. Briefly, 60 µg of protein from each seminal plasma sample was adjusted to a final volume of 60 µl with 100 mM TEAB, and protein quantification was repeated to ensure that all samples had the same concentration (1 µg/µl). Proteins were reduced in 9.5 mM TCEP for 1 h at 55°C, alkylated with 17 mM IAA for 30 min in the dark, and precipitated with 500 µl of cold 100% acetone at -20°C overnight. Samples were centrifuged at 17,500 × g for 10 min at 4°C and the acetone-precipitated protein pellets were resuspended in 60 µl of 100 mM TEAB. Trypsin was then added at a 1:20 protease-to-protein ratio and incubated overnight at 37°C with constant shaking. Prior to peptide labeling, aliquots with the same volume and concentration were taken from each of the 16 samples and combined to constitute the internal control.
After that, 30 µg of peptides from each individual seminal plasma sample (n = 16) and internal control (n = 1) were labeled with TMT isobaric tags (reporter ion intensities from m/z 127.1 to m/z 131.1 (TMT-127N, -127C, -128C, -129N, -129C, -130N, -130C, -131), and m/z 126 (TMT-126), respectively; Fig. 1). Specifically, 19.5 µl of the TMT label reagents, previously equilibrated at RT and dissolved in ACN (Sigma-Aldrich, St. Louis, MO), were added to the corresponding reduced and alkylated peptides. After 1 h of incubation at RT, the reaction was quenched with 4 µl of 5% hydroxylamine for 15 min. Labeled peptides from each sample were combined in equal amounts to constitute two different multiplex pools (Pool A and Pool B; each one consisting of 8 different samples plus the same internal control; Fig. 1), which were dried in a speed-vacuum centrifuge, and the peptides were resuspended in 20 µl of 0.5% TFA (Sigma-Aldrich) in 5% ACN. Finally, the peptides were cleaned up via reversed-phase C18 spin columns (Pierce C18 Spin Columns, Thermo Fisher Scientific), following the manufacturer's instructions.
LC-MS/MS Analysis-Labeled peptides were analyzed on a nano-LC Ultra 2D Eksigent system (AB Sciex, Brugg, Switzerland) coupled to an LTQ-Orbitrap Velos (Thermo Fisher Scientific). For HPLC separation, peptides were injected onto a C18 trap column (L 0.5 cm, 300 µm ID, 5 µm, 100 Å; Thermo Fisher Scientific). Chromatographic analyses were performed using an analytical column (L 15 cm, 75 µm ID, 3 µm, 100 Å; Thermo Fisher Scientific). Two buffer systems were used for the analysis: buffer A (97% H2O/3% ACN, 0.1% formic acid) and buffer B (3% H2O/97% ACN, 0.1% formic acid). The following gradient was applied for peptide separation on the analytical column at a flow rate of 400 nl/min: 0-4 min, 0% to 4% B; 4-300 min, 4% to 35% B; 300-305 min, 35% to 100% B; and 305-320 min, 100% B. MS/MS analyses were performed using an LTQ-Orbitrap Velos (Thermo Fisher Scientific) directly coupled to a nanoelectrospray ion source. The LTQ-Orbitrap Velos settings included one MS1 scan for precursor ions at a resolution of 30,000 at 400 m/z, followed by MS2 scans of the 50 most intense precursor ions at a resolution of 30,000 at 400 m/z, in positive ion mode. The lock mass option was enabled, and polysiloxane (m/z 445.12003) was used for internal recalibration of the mass spectra. MS/MS data acquisition was completed using Xcalibur 2.1 (Thermo Fisher Scientific). The normalized collision energy for HCD-MS2 was set to 40%.
Protein Identification-LC-MS/MS data were analyzed using Proteome Discoverer 1.4.1.14 (Thermo Fisher Scientific). For database searching, raw mass spectrometry files were searched against the in-house Homo sapiens UniProtKB/Swiss-Prot database with Sus scrofa trypsin added to it (HUMAN_Tryp_UP_SP_R_2016_03.fasta; released March 2016; 20155 protein entries) using SEQUEST HT version 28.0 (Thermo Fisher Scientific). For re-scoring, the Percolator search node was used. Searches were performed using the following parameters: a maximum of five missed cleavage sites for trypsin, TMT-labeled lysine (+229.163 Da) and methionine oxidation (+15.995 Da) as dynamic modifications, cysteine carbamidomethylation (+57.021 Da) as a static modification, 20 ppm precursor mass tolerance, 0.6 Da fragment mass tolerance, 5 mmu peak integration tolerance, and the most confident centroid peak integration method. Percolator was used for protein identification with the following identification criteria: at least one unique peptide per protein with an FDR of 1%.
Proteomic Data Analysis by Conventional and Novel Approaches-Conventional relative protein quantification -Normalized TMT quantitative values for each identified spectrum were derived from the ratio of the reporter ion intensities from HCD MS2 spectra corresponding to each individual sample (TMT-127N to TMT-131) to that of the internal control (TMT-126), and were obtained using Proteome Discoverer software (supplemental Fig. S1) (12,58). Different isoforms of the same protein were treated as dissociated or "ungrouped" from their respective families, to avoid any possible ambiguity (12,58). Only those proteins with at least 1 unique peptide quantified by ≥ 2 PSMs in all the samples and a coefficient of variation < 50% in at least 75% of the samples were considered for further statistical analyses. Statistically significant differences among the different subtypes of infertile patients were evaluated after normalizing the relative proteomic quantification values by log2 transformation, using ANOVA combined with Tukey's multiple comparison test. Additionally, the correlations between sperm concentration and normalized relative proteomic quantification values were assessed using the Pearson correlation test followed by adjustment of the p values to FDR. An MS expert checked the spectra of all the differential proteins.
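The quantification steps described above (ratio to the internal control, log2 transformation, Pearson correlation and p value adjustment) can be illustrated with a minimal Python sketch. It is not the code used in the study, which was run in R. The function names are hypothetical, and the Benjamini-Hochberg procedure is an assumption, because the text only states that p values were adjusted to FDR. The p values themselves are taken as input and would come from a statistics package.

```python
import math


def log2_ratio(sample_intensity, control_intensity):
    """log2 of a sample reporter-ion intensity relative to the internal control (TMT-126)."""
    return math.log2(sample_intensity / control_intensity)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, in input order (assumed FDR method)."""
    m = len(pvalues)
    # Walk from the largest p value down, enforcing monotonicity with a running minimum.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank of this p value in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

In such a sketch, each protein would contribute one vector of log2 ratios across samples, `pearson_r` would be computed against the sperm concentration vector, and the resulting p values would be passed together to `bh_adjust`.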
Establishment of Stable-Protein Pairs Profile-The intensity values from HCD MS2 spectra corresponding to each individual sample (TMT-127N to TMT-131), but not from the internal control (TMT-126), were used to establish the stable-protein pairs for each group of patients. This strategy has been previously applied to the study of stable-transcript pairs obtained from RNA-seq data (59,60). In our case, only those proteins with at least 2 unique peptides quantified by ≥ 2 PSMs in all samples, with a coefficient of variation < 50% in at least 75% of the samples, were considered. Stable-protein pairs were determined by applying the following statistical principle: 2 proteins (each with more than 1 quantified peptide) were considered highly correlated when ≥ 75% of the possible peptide combinations between them had a Pearson correlation coefficient ≥ 0.9 (Fig. 2). In order to determine alterations in individual samples, the stable-protein pair analysis was repeated for the control group (NZ patients) by adding patients with altered seminal parameters one at a time.
Functional Enrichment and Expression Analyses Using Public Databases-The seminal plasma proteomic datasets were uploaded to the Gene Ontology Consortium database (http://www.geneontology.org/) (61), based on the PANTHER v13.1 database (release date 2018-02-03), in order to predict the functional involvement of the seminal plasma proteins. The significance of the enrichment analyses was calculated by Fisher's exact test. p values < 0.05 after FDR adjustment were considered statistically significant.
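The stable-protein pair criterion, and the repetition of the analysis with one added patient, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' R implementation: the data layout (a dictionary mapping each protein to a list of per-peptide intensity vectors) and all function names are hypothetical, and peptide vectors with zero variance are simply treated as uncorrelated.

```python
import math
from itertools import combinations, product


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 when either vector has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def is_stable_pair(peptides_a, peptides_b, r_min=0.9, fraction_min=0.75):
    """True when >= 75% of the peptide combinations between two proteins have r >= 0.9."""
    combos = list(product(peptides_a, peptides_b))
    hits = sum(1 for pa, pb in combos if pearson_r(pa, pb) >= r_min)
    return hits / len(combos) >= fraction_min


def stable_pairs(proteins):
    """Set of stable pairs among proteins with more than one quantified peptide.

    proteins: dict mapping protein name -> list of peptide intensity vectors,
    each vector holding one value per sample (hypothetical layout).
    """
    eligible = {name: peps for name, peps in proteins.items() if len(peps) >= 2}
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(eligible), 2)
        if is_stable_pair(eligible[a], eligible[b])
    }


def retained_fraction(control, patient):
    """Fraction of control-group stable pairs kept after adding one patient.

    patient: dict mapping protein name -> list with one intensity per peptide,
    in the same peptide order as in `control`.
    """
    baseline = stable_pairs(control)
    extended = {
        name: [list(vec) + [patient[name][i]] for i, vec in enumerate(peps)]
        for name, peps in control.items()
    }
    kept = baseline & stable_pairs(extended)
    return len(kept) / len(baseline) if baseline else 0.0
```

A patient whose added values leave most control pairs intact would give a retained fraction close to 1, whereas a patient with a disrupted proteomic profile would give a low value.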
The HPA Database (http://www.proteinatlas.org/) (62, 63) was used to assess the expression of specific proteins in different human male reproductive tissues.
Immunoblotting-Protein extracts from cell-free seminal plasma samples from an independent set of patients (n = 18; 6 NZ, 6 OZ, 6 AZ; supplemental Table S1) were used for Western blotting validation of ECM1 protein. A total of 40 µg of seminal plasma protein extract from each sample was separated by SDS-PAGE and transferred onto Immobilon-P PVDF membranes (Merck Millipore, Tullagreen, Ireland) as described elsewhere (7). The membranes were blocked in TBST and 5% (w/v) skim milk for 1 h at RT. For immunostaining, anti-ECM1 antibody (polyclonal rabbit ECM1 antibody, #43263, SAB Signalway Antibody, Baltimore, MD) diluted 1:500 in TBST was used. After washing in TBST, membranes were incubated with an ECL horseradish peroxidase-labeled donkey anti-rabbit IgG antibody (Am-
Immunohistochemistry-Cross-sections from a normal testis (n = 1) and a normal epididymis (n = 1) were used to detect the expression pattern of ECM1 and NPC2. Bouin's fixed, paraffin-embedded testicular and epididymal sections (4 µm) were deparaffinized in toluene (3×) and hydrated through a graded series of ethanol (100%, 100%, 90%, 70%, Milli-Q H2O), with a 0.3% hydrogen peroxide (Sigma-Aldrich) intermediate incubation between the two 100% ethanol incubations. For antigen retrieval, sections were incubated with 10 mM sodium citrate (pH 6.0) at 99.5°C for 20 min. Sections were blocked with PBS-5% skim milk for 30 min at RT and then incubated with the Avidin/Biotin Blocking Kit (Vector Laboratories, Burlingame, CA). Afterward, sections were incubated for 16 h at 4°C with the primary antibody of interest: anti-ECM1 (polyclonal rabbit ECM1 antibody, #EPP12545, Elabscience, Houston, TX) diluted 1:20 in PBS-1% skim milk, and anti-NPC2 (polyclonal rabbit NPC2 antibody, #CQA1207, Cohesion Biosciences, London, UK) diluted 1:100 in PBS-5% skim milk. Negative controls for nonspecific binding of the primary antibodies were included using isotype rabbit IgG (Vector Laboratories). Then, sections were incubated with a biotinylated goat anti-rabbit secondary antibody (Vector Laboratories), and with an avidin-biotin-peroxidase detection kit (Vectastain ABC Elite Kit, Vector Laboratories). Finally, slides were incubated with the ImmPACT™ DAB Peroxidase Substrate kit (Vector Laboratories). All slides were then washed and stained with hematoxylin, and testis slides were additionally stained with PAS (Sigma-Aldrich). Sections were dehydrated in 100% ethanol (3×), cleared in toluene (3×) and mounted in Eukitt Mounting Medium (Sigma-Aldrich), to be finally analyzed with a transmission light microscope (Olympus BX50, Olympus, Tokyo, Japan).
Experimental Design and Statistical Rationale-The experimental design of this study is shown in Fig. 1. Specifically, the seminal plasma proteome from 16 infertile patients, including individuals with normal seminal parameters (NZ; control group) and infertile patients with altered seminal parameters (AS, OZ and AZ patients; supplemental Table S1), was characterized and compared. A total of 4 biological replicates per group were used for proteomic analysis. Statistically significant differences among the different subtypes of infertile patients were evaluated after normalization of the relative proteomic quantification values by log2 transformation, using ANOVA combined with Tukey's multiple comparison test. Because some differential proteins showed a gradual decrease dependent on sperm concentration, we tested the correlation between sperm concentration and normalized values of the relative proteomic quantification using the Pearson correlation test followed by p value adjustment to FDR. To further validate these results, immunoblotting analysis of one differential protein (ECM1) was performed in an independent set of infertile patients as biological replicates (6 NZ, 6 OZ, 6 AZ). The correlation between sperm concentration and the abundance of ECM1 protein was evaluated using the Pearson correlation test.
The expression profiles of the differential proteins in human reproductive tissues were also assessed using the information available at the HPA, in order to discern whether the proteins correlated with sperm concentration were just reflecting a variation in the amount of sperm leftovers in the ejaculate or, in contrast, were the result of proteomic alterations in the secretions from accessory sex glands. Additionally, immunohistochemistry was performed in testicular (n = 1) and epididymal (n = 1) biopsies to infer the tissue origin of 2 differentially expressed proteins, ECM1 and NPC2. Finally, the correlation between sperm concentration and normalized relative protein abundance was also explored at the peptide level, because discrepancies between our findings and results published by others suggested differences in the specific peptides analyzed for each protein (50-52).
In order to assess the heterogeneity of the seminal plasma proteomic profile within the subgroups of infertile patients characterized according to seminal parameters as well as the specific protein alterations in individual patients, a new approach based on the analysis of the stable-protein pairs was conducted.
Both the statistical analyses and the establishment of stable-protein pairs were carried out using R software version 3.4.4 (http://www.r-project.org) (64). p values < 0.05 were considered statistically significant. All graphs were constructed using GraphPad Prism software version 5.01 (GraphPad Software Inc., San Diego, CA).

RESULTS
Proteomic Analysis of Human Seminal Plasma-LC-MS/MS analysis resulted in the identification of a total of 349 proteins in the seminal plasma proteome from 16 infertile patients, with at least one unique peptide and 1% FDR (supplemental Tables S2 and S3). However, just 60 of the 349 seminal plasma proteins fit our strict quantification criteria (at least 1 unique peptide quantified with ≥ 2 PSMs in all samples with a coefficient of variation < 50% in at least 75% of the samples) and, therefore, only these proteins were used for subsequent analyses. Detailed information on the quantifiable peptides and corresponding proteins is presented in supplemental Tables S4 and S5, respectively.
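The quantification criteria above can be expressed as a simple filter. The following Python sketch is illustrative only: it assumes that the coefficient of variation is computed per sample across the PSM-level values of a peptide, using the sample standard deviation, which the text does not specify. The function names and the dictionary layout are hypothetical.

```python
import statistics


def cv_percent(values):
    """Coefficient of variation (%) using the sample standard deviation (assumed)."""
    mean = statistics.mean(values)
    if mean == 0:
        return float("inf")
    return 100.0 * statistics.stdev(values) / mean


def peptide_passes(psm_values_by_sample, min_psms=2, max_cv=50.0, min_fraction=0.75):
    """Apply the filter to one peptide.

    psm_values_by_sample: dict mapping sample name -> list of PSM-level
    quantification values for this peptide (hypothetical layout).
    """
    samples = list(psm_values_by_sample.values())
    # The peptide must be quantified by at least `min_psms` PSMs in every sample.
    if any(len(values) < min_psms for values in samples):
        return False
    # The CV must be below `max_cv` in at least `min_fraction` of the samples.
    low_cv = sum(1 for values in samples if cv_percent(values) < max_cv)
    return low_cv / len(samples) >= min_fraction


def protein_passes(peptides, min_peptides=1):
    """A protein is kept when at least `min_peptides` of its unique peptides pass."""
    return sum(1 for p in peptides if peptide_passes(p)) >= min_peptides
```

Calling `protein_passes(peptides, min_peptides=2)` would correspond to the stricter requirement of at least 2 unique peptides used for the stable-protein pair analysis.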
With the aim of assessing the potential role of seminal plasma proteins in the functionality of spermatozoa, as well as the ability of seminal plasma proteome signatures to reflect disturbances in spermatogenesis and sperm maturation processes that could explain the alteration of seminal parameters, we compared the seminal plasma proteome from infertile patients categorized according to their seminal parameters (NZ, AS, OZ, AZ). Two different strategies were applied in our comparative quantitative proteomics study: (i) a conventional approach based on standard statistical analyses (ANOVA and Pearson correlation test) of relative protein quantification values; and (ii) a novel analysis method based on the establishment of stable-protein pairs separately in groups of patients classified according to seminal parameters, as well as the identification of protein alterations in individual samples based on the variations of the stable-protein pairs defined in NZ patients (Fig. 1). The results of each analysis are shown below.
Altered Seminal Plasma Protein Abundance in Infertile Patients-Comparison of the seminal plasma proteomes from the 4 different subtypes of infertile patients (NZ, AS, OZ, AZ) by conventional data analysis revealed a set of 6 differentially expressed proteins among the groups (p value < 0.05; ANOVA with Tukey's post hoc test; Fig. 3). The abundance of three of these proteins (CRISP1, NPC2 and SPINT3) was reduced in patients with few or no sperm cells in the ejaculate (OZ and AZ patients, respectively), whereas SCPEP1 protein abundance was increased in AZ patients (Fig. 3). Additionally, only the protein ANPEP displayed reduced abundance in patients with decreased sperm motility (AS patients).

Relationship Between Altered Seminal Plasma Proteins and the Sperm Concentration-The correlation between the relative amount of seminal plasma proteins and sperm concentration was assessed in order to test whether the altered protein abundance detected in OZ and AZ patients depended on the number of sperm cells present in the ejaculate, independently of the sperm motility rate. Remarkably, the abundance of the proteins SPINT3, NPC2, ECM1, CRISP1, and IGHG2 increased with higher sperm concentration (Table I).
To validate these results, a Western blot analysis for ECM1 protein was performed in an independent set of samples not used for MS analysis (n = 18; 6 NZ, 6 OZ, 6 AZ). As expected, ECM1 protein abundance increased with sperm count (p value < 0.01; Fig. 4). The HPA database showed that 3 of the 5 seminal plasma proteins positively correlated with sperm concentration are mainly expressed in epididymis, but not detected in testis (Fig. 5A). We extended the protein patterns provided by the HPA database by immunohistochemical validation of ECM1 and NPC2 proteins in human testis (n = 1) and human epididymis (n = 1) biopsies (Fig. 5B). However, the analysis at the peptide level showed that, whereas the majority of peptides quantified for NPC2 protein (6 of 7 peptides) maintained the correlation with sperm concentration, only one peptide quantified for each of the remaining proteins was found to be correlated (1 of 1 peptide for SPINT3, 1 of 2 peptides for CRISP1 and IGHG2 and 1 of 3 peptides for ECM1; Table II).

Seminal Plasma Stable-Protein Pair Profiles Among Patient Groups and Individuals-A total of 182 stable-protein pairs between 24 different proteins were identified for patients with normal semen parameters (NZ) (Table III). Of note, those 24 proteins are functionally involved in processes already ascribed to seminal plasma, such as the regulation of sperm function, semen coagulation-liquefaction processes, immune system and lipid metabolism, among others. In contrast, very few stable-protein pairs were observed in the different subtypes of patients with altered seminal parameters: 18 stable-protein pairs comprising 16 proteins in AS, 0 in OZ and 3 comprising 5 proteins in AZ (Table III).
In order to assess alterations of stable-protein pairs in individual patients, the analysis of the stable-protein pairs for the NZ patient group was repeated by adding one single patient with an altered seminal parameter at a time (Table IV). This strategy revealed that the asthenozoospermic patient AS2 had a seminal plasma proteomic signature similar to that found in NZ men, because > 75% of the stable-protein pairs established for the NZ population were maintained after performing the stable-protein pair analysis with proteomic data from NZ samples (n = 4) and patient AS2 (Table IV). In contrast, when three of the four azoospermic patients (AZ1, AZ2 and AZ4) were added individually to the analysis of NZ stable-protein pairs, we detected < 50% of the stable-protein pairs determined in the NZ individuals, reflecting large differences in the seminal plasma proteome signature of those AZ patients (Table IV; supplemental Table S6).

DISCUSSION
The heterogeneous composition of the seminal plasma, together with the rapid changes that occur in its molecular composition after ejaculation, such as the proteolytic cascade associated with the coagulation-liquefaction process, introduces further complexity to seminal plasma proteomic studies (10,14). In the present study, a total of 349 proteins were identified in the 16 seminal plasma samples analyzed (supplemental Table S2), which are functionally related to metabolism, response to stress, proteolysis, immune system and energy production. Also, to a lesser extent, these identified human seminal plasma proteins seemed to be involved in processes related to fertilization and embryogenesis (8). This apparently low number of identified seminal plasma proteins could be explained by the detection of semenogelins I and II (SEMG1 and SEMG2) as the most abundant proteins of the seminal plasma. Specifically, around 40% of the PSMs identified in our proteomic study corresponded to SEMG1 and SEMG2, thus hindering the detection of low-abundance proteins.
This low number of protein identifications is also observed in other studies assessing the human seminal plasma proteome using MS methods and identification criteria comparable to ours (13, 31, 32, 40-42, 46, 47, 49, 65). For this reason, future studies should consider incorporating strategies to deplete SEMGs prior to the proteomic characterization of seminal plasma, for example the use of HPLC columns containing antibodies against SEMGs (10,13).
Conventional Approach to Analyze Quantitative Proteomics Data-Protein quantification of TMT-labeled peptides using conventional approaches is obtained from the average of relative ion abundance ratios for all peptides encompassing the same protein (12). The conventional statistical analyses conducted in this study showed: (i) the underexpression of the glycoprotein ANPEP in patients with altered sperm motility (Fig. 3), as previously reported by others (33); and (ii) a gradual decline of CRISP1, NPC2, and SPINT3 abundance in infertile patients, ranging from high to low sperm concentration (in decreasing order: NZ-AS, OZ, and AZ) (Table I, Fig. 3). Of note, this gradual decline in abundance was also observed for SPINT3, NPC2, ECM1, and CRISP1 independently of the sperm motility parameter (Table I). The low abundance of those proteins in seminal plasma from patients with few or no sperm cells could reflect either proteomic alterations in the accessory sex gland secretions, or the presence of low amounts of male germ cell remnants coming from apoptotic sperm or from sperm cytoplasmic droplets. This specific question may be elucidated by deciphering the potential tissue origin of these altered seminal plasma proteins. According to the HPA Database, the testicular or extra-testicular origin of NPC2 could not be assessed because, although NPC2 is a major component of epididymal secretions, it is also expressed in testis (43) (Fig. 5). In contrast, ECM1, SPINT3, and CRISP1 are mainly expressed in the epididymis and are not detected in testis (Fig. 5), suggesting that epididymal secretions could be regulated by the presence of sperm themselves in the epididymis. Interestingly, this potential cross-talk between the spermatozoa and epididymis has also been observed in rat and bovine species (66,67).

Figure caption (fragment): ... Table I). Intensity values are based on the immunohistochemical antibody staining score information available in the HPA database. B, Results from the subsequent immunohistochemical analysis in human testis (n = 1) and epididymis (n = 1) sections using ECM1 and NPC2 antibodies assayed in the present study. The obtained protein patterns validate the HPA data. ECM1 is an epididymis-specific protein whereas NPC2 is found in epididymis but also in testes. A lower magnification (×100; images on top) and a higher magnification (×400; images below) from the boxed area are shown. Negative controls with rabbit IgG showed nonspecific staining (data not shown). H: Hematoxylin stain, PAS: Periodic acid-Schiff stain.

TABLE II Pearson correlation analysis between sperm concentration and seminal plasma peptide abundance (LC-MS/MS data) for the 5 proteins identified as correlated in

Table fragment (column headers lost): ... 16, -, -, -; CPE 16, 1, -, -; B2M 16, 3, -, 1; Other: AZGP1 20, 3, -, -; TF 14, -, -, -; IDH1 14, 5, -, -; VWA1 12 ...

Drabovich and colleagues, in contrast, detected decreased levels of SPINT3, NPC2, ECM1, and CRISP1 only in obstructive azoospermic patients but not in patients with non-obstructive azoospermia (16). Other groups assessing potential seminal biomarkers to discern the different subtypes of azoospermia found other distinct differential proteins, such as LGALS3BP (57) and STAB2, CP135, GNRP, and PIP (34). Altogether, this indicates that seminal plasma proteomic data differ between studies. Of note, a similar lack of concordance has also been observed in quantitative proteomic data of sperm samples from the same type of patients. This is exemplified by the fact that only 17 of the 179 proteins reported as differentially expressed in sperm cells were detected in at least 2 of the 7 comparative proteomic studies conducted so far on asthenozoospermia (68). Differences in sample collection, handling and storage, proteomic strategies, and the biological intra- and inter-individual variance may be important causes contributing to this lack of reproducibility between studies (14,17,69). Moreover, the high and fast protease activity in seminal plasma after ejaculation could introduce even more heterogeneity to the results because of the presence of distinct proteolytic fragments, in addition to the protein PTMs not detected in standard proteomic procedures (70). To evaluate this putative heterogeneity of the seminal plasma proteome, we also assessed the correlation of the peptides from the proteins SPINT3, NPC2, ECM1, CRISP1, and IGHG2 with sperm concentration (Table II). Of note, NPC2 was the only protein for which most of the corresponding peptides correlated with sperm concentration (Table II).
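The contrast between protein-level and peptide-level analysis can be illustrated with a minimal sketch. All values and peptide names below are invented for illustration, and the pure-Python `pearson` and `protein_ratio` helpers are assumptions standing in for whatever statistical software was actually used; this is not the study's pipeline. The sketch averages peptide ratios into a protein value, as in the conventional approach, and then correlates both the protein value and each individual peptide with sperm concentration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def protein_ratio(peptide_ratios):
    """Conventional protein value: per-sample mean of its peptides' ratios."""
    series = list(peptide_ratios.values())
    n_samples = len(series[0])
    return [sum(s[i] for s in series) / len(series) for i in range(n_samples)]


# Hypothetical data: five samples, one protein quantified by two peptides.
sperm_conc = [5.0, 20.0, 40.0, 60.0, 90.0]   # invented concentrations
peptides = {
    "pep_A": [0.5, 0.8, 1.0, 1.3, 1.6],      # tracks sperm concentration
    "pep_B": [1.1, 0.9, 1.2, 0.8, 1.0],      # roughly flat across samples
}

protein = protein_ratio(peptides)
r_protein = pearson(sperm_conc, protein)
r_peptides = {name: pearson(sperm_conc, vals) for name, vals in peptides.items()}

print(f"protein-level r = {r_protein:.2f}")
for name, r in r_peptides.items():
    print(f"{name}: r = {r:.2f}")
```

In this toy case the averaged protein value correlates strongly with sperm concentration even though only one of its two peptides does, which is the kind of discrepancy a peptide-level table such as Table II is meant to expose.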
In agreement with our findings, Giacomini and colleagues, by using nano LC-electrospray ionization-MS/MS, found decreased levels of NPC2 in seminal plasma from idiopathic oligoasthenozoospermic patients (43). In summary, novel approaches are needed to analyze the results from quantitative shotgun proteomic studies, to overcome the limitations arising from the heterogeneity of proteolytic fragments in seminal plasma, compounded by other variations such as protein PTMs not detectable by standard proteomic procedures.

Novel Approaches to Analyze Quantitative Proteomics and Identify Patient-specific Alterations-A set of strictly co-regulated seminal plasma proteins was established by following a new approach based on the correlation of the intensities of all unique peptides comprising one specific protein with all the unique peptides quantified for all the other detected proteins in the sample (Fig. 2). This strategy, called stable-protein pairs analysis herein, may help reduce the heterogeneity observed in the seminal plasma proteomic data, because only those proteins displaying a consistent pattern in a specific phenotype are obtained. A total of 182 stable-protein pairs comprising 24 proteins were detected in patients with normal semen parameters (Table III), reflecting the strict co-regulation of these proteins in NZ individuals.
These stable-protein pairs include gene products involved in: (i) sperm function, such as CKB, which plays a critical role in meeting the energy demand of sperm motility (71), CST3, PAEP, and CLU, which regulate sperm capacitation (72-74), and CRISP1 and NPC2, which are necessary for sperm-oocyte binding and fertilization (75,76); (ii) the regulation of semen clotting-liquefaction processes, such as ACPP, KLK3, and MME, which are directly involved in the proteolysis of the SEMGs or other proteins (77,78), or WFDC2, ALB, and TGM4, which regulate other components required for the clotting-liquefaction process such as proteases, zinc ions, and polyamines (79-81); (iii) immunology, including proteins that could participate in the leukocyte-mediated immune response, such as IGKC, IGHG2, ANPEP, LGALS3BP, ECM1, and B2M (82,83), or in antimicrobial activity, for example CPE and QSOX1 (84,85); and finally, (iv) other functions such as lipid metabolism (AZGP1, TF, and IDH1) and matrix assembly (VWA1). In contrast to the high number of stable-protein pairs identified in NZ individuals, the stable correlations drastically decreased in the different groups of patients with altered semen parameters (Table III). Indeed, just 18, 3, and 0 stable-protein pairs were detected in AS, AZ, and OZ patients, respectively. The low number of stable-protein pairs observed in AZ patients was not surprising, because this group contains patients diagnosed with either obstructive or non-obstructive azoospermia without distinction, which probably results in different semen protein profiles as previously reported by others (16). However, a high heterogeneity was also observed in the proteomic profile of seminal plasma from AS and OZ patients. The few stable-protein pairs detected in infertile patients with altered seminal parameters indicate that alterations in different proteins may result in, or be a consequence of, the same altered phenotype.
Although some hints of the presence of protein-pair correlations were already reported in the sperm proteome using ancillary proteomic methods (86), the present study clearly demonstrates the potential of this approach using proteomics data at peptide level (Table III, Fig. 2).
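The pairing logic described above, in which every unique peptide of one protein is correlated with every unique peptide of each other protein, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the rule that a pair is "stable" only when all cross-protein peptide correlations reach a cutoff, the cutoff value `r_min = 0.8`, the function names, the protein labels P1 to P3, and all intensity values are hypothetical.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def stable_pairs(peptides_by_protein, r_min=0.8):
    """Protein pairs whose peptides ALL correlate across samples.

    Assumed rule (not taken from the source): a pair is kept only if every
    peptide of one protein correlates with every peptide of the other
    protein with r >= r_min.
    """
    stable = set()
    for prot_a, prot_b in combinations(sorted(peptides_by_protein), 2):
        peps_a = peptides_by_protein[prot_a].values()
        peps_b = peptides_by_protein[prot_b].values()
        if all(pearson(pa, pb) >= r_min for pa, pb in product(peps_a, peps_b)):
            stable.add((prot_a, prot_b))
    return stable


# Hypothetical peptide intensities for three proteins in five samples.
group = {
    "P1": {"pep1": [1.0, 1.2, 1.5, 1.9, 2.4], "pep2": [0.9, 1.1, 1.6, 1.8, 2.5]},
    "P2": {"pep1": [2.0, 2.3, 3.1, 3.7, 4.9]},
    "P3": {"pep1": [1.0, 1.3, 1.4, 2.0, 2.3], "pep2": [2.0, 1.0, 2.2, 0.9, 1.5]},
}

pairs = stable_pairs(group)
print(sorted(pairs))
```

Requiring agreement across all peptides means that a protein with one discordant peptide, such as the hypothetical P3 here, forms no stable pairs even though its other peptide follows the common trend. A stricter or looser rule would change the number of pairs reported.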
To assess which protein pairs might be associated with the alterations of the seminal parameters in individual samples, we repeated the analysis of the stable-protein pairs in the NZ population but adding data from one patient with altered parameters at a time. First, we observed that patient AS2 had a seminal plasma protein signature very similar to that of the NZ population, because it maintained more than 75% of the NZ stable-protein pairs (Table IV). Of note, one of the proteins losing the most correlations in patient AS2 is TF, a protein involved in lipid metabolism and sperm protection against oxidative stress (87) (Table IV). It is interesting to note that TF loses more than 75% of its correlations in another asthenozoospermic patient (AS1) (Table IV), suggesting that oxidative stress may be related to the impairment of sperm motility in both patients AS1 and AS2.
We also observed that proteins involved in the induction of proteolysis of SEMGs and other regulators of the semen clotting-liquefaction process lose >75% of their correlations in OZ and AZ patients (KLK3 in OZ2 and AZ2; ACPP in OZ2, AZ1, AZ2, and AZ3; and MME in AZ1; Table IV), suggesting that sperm also contribute regulators of semen clot proteolysis (88). Some of the protein correlations were lost in the majority of the individual patients, as observed for CLU, PAEP, and WFDC2, suggesting that the correlations established for these proteins in NZ samples are weak. In contrast, some seminal plasma proteins seem to be altered in only one sample, such as TGM4 in patient AS4, IDH1 in OZ1, CST3 and LGALS3BP in OZ4, CPE in Z1, B2M in Z3, and IGKC in Z4, although these alterations could not clearly explain the observed phenotype.
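The patient-by-patient evaluation can be sketched in the same spirit: the stable pairs are first computed in the control group, then recomputed after appending a single patient's values, and the fraction of control pairs lost is tallied per protein. As before, this is a hedged illustration. The all-peptides rule, the cutoff `r_min = 0.8`, the function names, the protein labels, and the numbers are assumptions; only the idea of adding one patient at a time and the 75% loss criterion come from the text.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def stable_pairs(data, r_min=0.8):
    """Assumed rule: all cross-protein peptide correlations must be >= r_min."""
    stable = set()
    for a, b in combinations(sorted(data), 2):
        cross = product(data[a].values(), data[b].values())
        if all(pearson(pa, pb) >= r_min for pa, pb in cross):
            stable.add((a, b))
    return stable


def lost_fraction(control, patient, r_min=0.8):
    """Per protein, fraction of control stable pairs lost when one patient is added."""
    baseline = stable_pairs(control, r_min)
    # Append the patient's single value to every control peptide series.
    augmented = {
        prot: {pep: vals + [patient[prot][pep]] for pep, vals in peps.items()}
        for prot, peps in control.items()
    }
    retained = stable_pairs(augmented, r_min) & baseline
    result = {}
    for prot in control:
        base = [pair for pair in baseline if prot in pair]
        if base:  # proteins without control pairs are skipped
            kept = sum(1 for pair in base if pair in retained)
            result[prot] = 1 - kept / len(base)
    return result


# Hypothetical control group (five samples) and one hypothetical patient.
control = {
    "P1": {"pep1": [1.0, 1.2, 1.5, 1.9, 2.4]},
    "P2": {"pep1": [2.0, 2.3, 3.1, 3.7, 4.9]},
    "P3": {"pep1": [1.0, 1.3, 1.4, 2.0, 2.3]},
}
patient = {"P1": {"pep1": 3.0}, "P2": {"pep1": 6.0}, "P3": {"pep1": 0.2}}

lost = lost_fraction(control, patient)
flagged = [prot for prot, frac in lost.items() if frac > 0.75]
print(lost, flagged)
```

In this toy example only the hypothetical P3 loses more than 75% of its control correlations, mirroring how individual proteins are singled out per patient in Table IV. With a small control group, a single added sample can shift correlations substantially, so the choice of cutoff matters.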
Quantitative Proteomics as A Tool to Provide Insights in Seminal Plasma Proteome Signatures Of Infertile Patients-So far, proteomics biomarker discovery experiments have shown a relatively low concordance among different studies. In fact, we demonstrated that the results from relative quantitative proteomics differ depending on whether the analyses are performed at protein or at peptide level. These differences could explain the apparent lack of reproducibility of some of the findings, a fact that should also be considered when using antibody-based techniques recognizing specific peptides (such as Western blotting or ELISA) or targeted proteomic approaches directed at selected peptides. Here, we propose a novel complementary approach for the analysis of quantitative proteomic data, which is based on the establishment of stable-protein pairs. This strategy has been previously applied to the study of RNA-seq data (59,60), but to the best of our knowledge this is the first time it has been applied to proteomic data. The use of this new approach in our seminal plasma proteome dataset has allowed us to determine highly stable seminal plasma proteome signatures of men presenting normal seminal parameters (NZ). In contrast, we demonstrated that the current classification of infertile patients based on altered semen parameters results in a highly heterogeneous seminal plasma proteomic profile, thereby suggesting that the current male infertility stratification performed in fertility clinics is not sufficient for an accurate diagnosis. Moreover, the stable-protein pairs approach has the potential to pinpoint proteins potentially related to pathogenic mechanisms in individual samples when it is applied to evaluate alterations of the NZ stable-protein pairs in individual infertile patients.
Although our study has limitations, the novel data analysis approach proposed herein could be valuable toward the identification of altered proteins and pathogenic mechanisms of male infertility and might open a window to the personalized diagnosis of male infertility in future studies.