Novel and conventional approaches for the analysis of quantitative proteomic data are complementary towards the identification of seminal plasma alterations in infertile patients

AUTHORS: Ferran Barrachina1*; Meritxell Jodar1*; David Delgado-Dueñas1; Ada Soler-Ventura1; Josep Maria Estanyol2; Carme Mallofré3; Josep Lluís Ballescà4; Rafael Oliva1
AFFILIATIONS: 1Molecular Biology of Reproduction and Development Research Group, Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Fundació Clínic per a la Recerca Biomèdica, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain and Biochemistry and Molecular Genetics Service, Hospital Clínic, Barcelona, Spain; 2Proteomics Unit, Scientific Technical Services, University of Barcelona, Barcelona, Spain; 3Department of Pathology, University of Barcelona, Hospital Clínic, Barcelona, Spain; 4Clinic Institute of Gynaecology, Obstetrics and Neonatology, Hospital Clínic, Barcelona, Spain.


INTRODUCTION
Infertility is a frequent problem worldwide, affecting approximately 15% of couples of reproductive age. Around 50% of fertility problems are due to a male factor and, of those, 40-60% of infertile patients present alterations in at least one of the seminal parameters assessed by routine semen analysis (sperm concentration, motility and morphology) (1). According to these seminal parameters, infertile patients are categorized as: (i) patients with low or absent sperm concentration (oligozoospermia or azoospermia, respectively), (ii) patients with defective sperm motility (asthenozoospermia) and/or (iii) patients with abnormal sperm morphology (teratozoospermia) (2). The semen evaluation, together with a complete medical history and physical examination, determines whether the initial assessment needs to be complemented with genetic and/or hormonal analyses, urinalysis or testicular biopsies. Unfortunately, the currently available tools for the evaluation of male fertility are limited and insufficient, and new methodologies are required to better discern the male factor etiology (3)(4)(5)(6). The application of high-throughput proteomics to the study of the human sperm cell has resulted in the identification of 6871 proteins as well as some pathogenic mechanisms involved in male infertility (7)(8)(9)(10)(11). For instance, studies on the sperm proteome from asthenozoospermic patients revealed alterations mainly in proteins and pathways related to energy production and the cytoskeleton (9,12). However, since semen is not composed of sperm cells alone, the exploration of seminal plasma could help to better elucidate the causes of male infertility.
The human seminal plasma is a complex, protein-rich biological fluid that constitutes 95% of the semen volume, whereas only the remaining 5% corresponds to spermatozoa (10). This fluid is composed of secretions from the testis (1-2%), the epididymis (2-4%) and the male accessory sex glands, including the seminal vesicles (65-75%), the prostate gland (25-30%) and the bulbourethral glands (< 1%). Its proteome includes 2064 non-redundant proteins from 9 independent studies, revealing that this fluid is not simply a medium to carry the spermatozoa through the female reproductive tract (10,17). In fact, once spermatogenesis is completed, the testicular spermatozoa, although morphologically differentiated, are immotile germ cells unable to fertilize the oocyte on their own (10). Seminal plasma plays a crucial role in sperm maturation, motility and capacitation, prevention of premature acrosome reaction, and sperm-zona pellucida recognition and fusion, thereby providing the spermatozoa with the capability to fertilize the oocyte (18)(19)(20)(21)(22)(23)(24)(25). Additionally, the contact of the seminal plasma with the female reproductive tract provides an optimal environment that enhances embryo implantation and development and contributes to promoting maternal immune tolerance of the semi-allogeneic fetus (26)(27)(28)(29).
The seminal plasma proteome is currently under intense study (14,30).
Therefore, the analysis of the seminal plasma proteome mirroring the infertility classification according to seminal parameters is warranted, since it might help to decipher potential pathogenic mechanisms resulting in these sperm alterations. However, it is important to take into account that a wide range of factors may alter both sperm and seminal plasma composition, leading to high heterogeneity among patients sharing the same phenotype. This is challenging for the conventional analysis of quantitative proteomic data, which is based on the search for differential proteins between groups of patients with similar characteristics. Therefore, the main aim of the present study was to define the seminal plasma proteome signatures of infertile patients categorized according to their seminal parameters (normozoospermia, NZ; asthenozoospermia, AS; oligozoospermia, OZ; azoospermia, AZ) by applying [...]

[...] added to the corresponding reduced and alkylated peptides. After 1 h of incubation at RT, the reaction was quenched with 4 μl of 5% hydroxylamine for 15 min. Labeled peptides from each sample were combined in equal amounts, constituting two different multiplex pools (Pool A and Pool B; each one consisting of 8 different samples plus the same internal control; Fig. 1), which were dried in a speed vacuum centrifuge, and the peptides were resuspended in 20 μl of 0.5% TFA (Sigma-Aldrich, St. Louis, MO, USA) in 5% ACN. Finally, the peptides were cleaned up using reversed-phase C18 spin columns (Pierce C18 Spin Columns, Thermo Scientific, Rockford, IL, USA), following the manufacturer's instructions.

LC-MS/MS analysis
Labeled peptides were analyzed using a nano-LC Ultra 2D Eksigent (AB Sciex, Brugg, Switzerland) attached to an LTQ-Orbitrap Velos (Thermo Scientific, San Jose, CA, USA). For HPLC separation, peptides were injected onto a C18 trap column (L 0.5 cm, 300 μm ID, 5 μm, 100 Å; Thermo Fisher Scientific, San Jose, CA, USA). Chromatographic analyses were performed using an analytical column (L 15 cm, 75 μm ID, 3 μm, 100 Å; Thermo Scientific, San Jose, CA, USA). Two different buffer systems were used for the analysis: buffer A (97% H2O-3% ACN, 0.1% formic acid) and buffer B (3% H2O-97% ACN, 0.1% formic acid). The following gradient was applied for peptide separation on the analytical column, at a flow rate of 400 nl/min throughout: from 0 to 4 min, 0% to 4% B; from 4 to 300 min, 4% to 35% B; from 300 to 305 min, 35% to 100% B; and from 305 to 320 min, 100% B. MS/MS analyses were performed using an LTQ-Orbitrap Velos (Thermo Fisher Scientific, Waltham, MA, USA) directly coupled to a nanoelectrospray ion source. The LTQ-Orbitrap Velos settings included one MS1 scan of precursor ions at a resolution of 30000 (at m/z 400), followed by MS2 scans of the 50 most intense precursor ions at a resolution of 30000 (at m/z 400), in positive ion mode. The lock mass option was enabled, and polysiloxane (m/z 445.12003) was used for internal recalibration of the mass spectra.
MS/MS data acquisition was completed using Xcalibur 2.1 (Thermo Fisher Scientific, Waltham, MA, USA). The normalized collision energy for HCD-MS2 was set to 40%.
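As an illustration only, the gradient described above can be written as a table of breakpoints. The sketch below assumes linear ramps between the stated time points; the function name `percent_b` is hypothetical and is not part of any instrument software.

```python
# Breakpoints (time in min, % buffer B) taken from the gradient described above.
GRADIENT = [(0, 0.0), (4, 4.0), (300, 35.0), (305, 100.0), (320, 100.0)]


def percent_b(t_min):
    """Return the % of buffer B at time t_min (assumption: linear ramps)."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the 0-320 min gradient")
    # Walk through consecutive breakpoints and interpolate within the matching segment.
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


print(percent_b(152))  # midpoint of the long 4-300 min ramp
```

Under this assumption, most of the run (296 of 320 min) is spent on the shallow 4% to 35% B ramp, which is where the peptide separation takes place.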

Protein identification
LC-MS/MS data were analyzed using Proteome Discoverer 1.4.1.14 (Thermo Fisher Scientific, Waltham, MA, USA). For database searching, raw mass spectrometry files were searched against the in-house Homo sapiens UniProtKB/Swiss-Prot database with Sus scrofa trypsin added to it (HUMAN_Tryp_UP_SP_R_2016_03.fasta; released March 2016; 20155 protein entries) using SEQUEST HT version 28.0 (Thermo Fisher Scientific, Waltham, MA, USA). For re-scoring, the Percolator search node was used. Searches were performed using the following parameters: a maximum of five missed cleavage sites for trypsin, TMT-labeled lysine (+229.163 Da) and methionine oxidation (+15.995 Da) as dynamic modifications, cysteine carbamidomethylation (+57.021 Da) as a static modification, 20 ppm precursor mass tolerance, 0.6 Da fragment mass tolerance, 5 mmu peak integration tolerance, and the most confident centroid peak integration method. Percolator was used for protein identification with the following identification criteria: at least one unique peptide per protein with an FDR of 1%.
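To make the tolerance setting concrete, the short sketch below converts a tolerance expressed in ppm into an absolute window. The modification masses are those quoted above; the precursor value of 500.0 and the helper name `ppm_window` are hypothetical and chosen only for illustration.

```python
# Mass shifts (Da) quoted in the search parameters above.
MODIFICATIONS = {
    "TMT (K, dynamic)": 229.163,
    "Oxidation (M, dynamic)": 15.995,
    "Carbamidomethyl (C, static)": 57.021,
}


def ppm_window(value, tol_ppm=20.0):
    """Return the (low, high) bounds around value for a tolerance given in ppm."""
    delta = value * tol_ppm / 1e6
    return value - delta, value + delta


# Hypothetical precursor at 500.0: a 20 ppm tolerance corresponds to +/- 0.01.
low, high = ppm_window(500.0)
print(round(low, 4), round(high, 4))
```

Because a ppm tolerance scales with the measured value, the absolute window widens for heavier precursors, whereas the 0.6 Da fragment tolerance is a fixed width.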

Proteomic data analysis by conventional and novel approaches
Conventional relative protein quantification - Normalized TMT quantitative values for each identified spectrum were derived from the ratio of the reporter-ion intensity from HCD MS2 spectra corresponding to each individual sample (TMT-127N to TMT-131) to that of the internal control (TMT-126), and were obtained using Proteome Discoverer software (Supplemental Fig. S1) (12,58). Different isoforms of the same protein were treated as dissociated or "ungrouped" from their respective families, to avoid any possible ambiguity (12,58). Only those proteins with at least 1 unique peptide quantified by ≥ 2 PSMs in all the samples and a coefficient of variation < 50% in at least 75% of the samples were considered for further statistical analyses. Significant statistical differences among the different subtypes of infertile patients were evaluated, after normalizing the relative proteomic quantification values by log2 transformation, using ANOVA combined with Tukey's multiple comparison test. Additionally, the correlations between sperm concentration and normalized relative proteomic quantification values were assessed using the Pearson correlation test followed by adjustment of the p-values to FDR. An MS expert checked the spectra of all the differential proteins.
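The quantification and filtering criteria above can be sketched as follows. This is a minimal illustration, not the Proteome Discoverer implementation: it assumes that the coefficient of variation is computed across the PSM-level ratios of a protein within each sample, and all function names are hypothetical.

```python
import math
import statistics


def log2_ratios(sample_intensities, control_intensity):
    """log2 of each sample reporter-ion intensity relative to the internal control."""
    return [math.log2(x / control_intensity) for x in sample_intensities]


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def passes_filter(psm_ratios_per_sample, max_cv=50.0, min_fraction=0.75):
    """Keep a protein if every sample has >= 2 PSMs and the CV is below
    max_cv in at least min_fraction of the samples (assumed reading of the text)."""
    if any(len(ratios) < 2 for ratios in psm_ratios_per_sample):
        return False
    n_ok = sum(coefficient_of_variation(r) < max_cv for r in psm_ratios_per_sample)
    return n_ok / len(psm_ratios_per_sample) >= min_fraction


# Hypothetical protein quantified in 4 samples; the last sample is too variable.
example = [[1.0, 1.1], [0.9, 1.0], [1.0, 1.2], [0.2, 2.0]]
print(log2_ratios([2.0, 0.5], 1.0), passes_filter(example))
```

In this toy example three of the four samples meet the CV criterion, which is exactly 75%, so the protein would be retained.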
Establishment of stable-protein pairs profile - The intensity values from HCD MS2 spectra corresponding to each individual sample (TMT-127N to TMT-131), but not from the internal control (TMT-126), were used to establish the stable-protein pairs for each group of patients. This strategy has been previously applied to the study of stable-transcript pairs obtained from RNA-seq data (59,60). In our case, only those proteins with at least 2 unique peptides quantified by ≥ 2 PSMs in all samples and a coefficient of variation < 50% in at least 75% of the samples were considered. Stable-protein pairs were determined by applying the following statistical principle: 2 proteins (each with more than 1 quantified peptide) were considered highly correlated when ≥ 75% of the possible peptide combinations had a Pearson correlation coefficient ≥ 0.9 (Fig. 2). In order to determine alterations in individual samples, the stable-protein pair analysis was repeated for the control group (NZ patients), adding one patient with altered seminal parameters at a time.
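The stable-protein pair criterion described above can be sketched as follows. This is a minimal standard-library illustration written for this text, not the R code used in the study; the data layout (one intensity vector per peptide, one value per sample) and all function names are assumptions.

```python
from itertools import combinations, product


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treated as uncorrelated (assumption)
    return sxy / (sxx * syy) ** 0.5


def is_stable_pair(peps_a, peps_b, r_min=0.9, frac_min=0.75):
    """True when >= frac_min of all peptide combinations have r >= r_min."""
    combos = list(product(peps_a, peps_b))
    hits = sum(pearson(p, q) >= r_min for p, q in combos)
    return hits / len(combos) >= frac_min


def stable_pairs(proteins, r_min=0.9, frac_min=0.75):
    """proteins: dict mapping protein name -> list of peptide intensity vectors."""
    return {
        (a, b)
        for a, b in combinations(sorted(proteins), 2)
        if is_stable_pair(proteins[a], proteins[b], r_min, frac_min)
    }


# Hypothetical intensities for 3 proteins, 2 peptides each, across 4 samples.
toy = {
    "P1": [[1, 2, 3, 4], [2, 4, 6, 8.5]],
    "P2": [[10, 20, 30, 41], [5, 10, 15, 20]],
    "P3": [[4, 3, 2, 1], [1, 5, 2, 3]],
}
print(stable_pairs(toy))
```

In the toy data only P1 and P2 rise together across samples in all four peptide combinations, so they form the single stable pair. Requiring agreement across most peptide combinations, rather than across protein-level averages, is what makes the criterion sensitive to peptide-level heterogeneity.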

Functional enrichment and expression analyses using public databases
The seminal plasma proteomic datasets were uploaded to the Gene Ontology Consortium database (http://www.geneontology.org/) (61), based on PANTHER v13.1 database (Release date 2018-02-03), in order to predict the functional involvement of the seminal plasma proteins. The significance of enrichment analyses was calculated by a Fisher's exact test. P-values < 0.05 after FDR adjustment were considered statistically significant.
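For readers unfamiliar with the statistics behind such enrichment tools, the sketch below shows a one-sided Fisher's exact test (over-representation) computed from the hypergeometric distribution, followed by a Benjamini-Hochberg adjustment. Both choices are assumptions made for illustration: the text specifies only "Fisher's exact test" and "FDR adjustment", and this is not the PANTHER code itself.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """P(X >= k) when drawing n proteins from a background of N proteins,
    K of which carry the annotation, and observing k annotated proteins."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Go from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical numbers: 3 annotated proteins in a list of 5, 10 annotated in a background of 100.
p = fisher_enrichment_p(3, 5, 10, 100)
print(p < 0.05, benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A term would then be reported as enriched when its adjusted p-value falls below 0.05, matching the threshold stated above.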
The HPA Database (http://www.proteinatlas.org/) (62, 63) was used to assess the expression of specific proteins in different human male reproductive tissues.

Immunoblotting
Protein extracts from cell-free seminal plasma samples from an independent set of patients (n=18; 6 NZ, 6 OZ, 6 AZ; Supplemental Table S1) were used for Western blotting validation of ECM1 protein.
A total of 40 μg of seminal plasma protein extracts from each sample were separated by SDS-PAGE and transferred onto Immobilon-P PVDF membranes (Merck Millipore, Tullagreen, Ireland) as described elsewhere (7). The membranes were blocked in TBST and 5% (w/v) skim milk for 1 h at RT. For immunostaining, anti-ECM1 antibody (polyclonal rabbit ECM1 antibody, #43263, SAB Signalway Antibody, Baltimore, MD, USA) diluted 1:500 in TBST was used. After washing in TBST, membranes were incubated with an ECL horseradish peroxidase-labeled donkey anti-rabbit IgG antibody [...] to be finally analyzed by a transmission light microscope (Olympus BX50, Olympus, Tokyo, Japan).

Experimental Design and Statistical Rationale
The experimental design of this study is shown in Fig. 1. Specifically, the seminal plasma proteome from 16 infertile patients, including individuals with normal seminal parameters (NZ; control group) and infertile patients with altered seminal parameters (AS, OZ and AZ patients; Supplemental Table S1), was characterized and compared. A total of 4 biological replicates per group were used for proteomic analysis. Significant statistical differences among the different subtypes of infertile patients were evaluated, after normalization of the relative proteomic quantification values by log2 transformation, using ANOVA combined with Tukey's multiple comparison test. Since some differential proteins showed a gradual decrease dependent on sperm concentration, we tested the correlation between sperm concentration and normalized values of the relative proteomic quantification using the Pearson correlation test followed by p-value adjustment to FDR. To further validate these results, immunoblotting analysis of one differential protein (ECM1) was performed in an independent set of infertile patients as biological replicates (6 NZ, 6 OZ, 6 AZ). The correlation between sperm concentration and the abundance of ECM1 protein was evaluated using the Pearson correlation test.
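The group comparison described above rests on the one-way ANOVA F statistic, which can be sketched with the standard library alone. This is an illustrative reimplementation and not the R code used in the study; it computes only the F statistic and its degrees of freedom, and leaves out the p-value and Tukey's post hoc test, which require the F and studentized range distributions.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical log2-normalized abundances for three groups of three samples.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)
```

With the design used here (4 groups of 4 biological replicates), the corresponding degrees of freedom are 3 between groups and 12 within groups.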
The expression profiles of the differential proteins in human reproductive tissues were also assessed using the information available at the HPA, in order to discern whether the proteins correlated with sperm concentration were just reflecting a variation in the amount of sperm leftovers in the ejaculate or, in contrast, were the result of some proteomic alterations in the secretions from accessory sex glands.
Additionally, immunohistochemistry was performed in testicular (n=1) and epididymal (n=1) biopsies to infer the tissue origin of 2 differentially expressed proteins, ECM1 and NPC2. Finally, the correlation between sperm concentration and normalized relative protein abundance was also explored at the peptide level, since discrepancies between our findings and results published by others suggested differences in the specific peptides analyzed for each protein (50)(51)(52).
In order to assess the heterogeneity of the seminal plasma proteomic profile within the subgroups of infertile patients characterized according to seminal parameters, as well as the specific protein alterations in individual patients, a new approach based on the analysis of stable-protein pairs was conducted.
Both the statistical analyses and the establishment of stable-protein pairs were carried out using R software version 3.4.4 (http://www.r-project.org) (64). P-values < 0.05 were considered statistically significant. All graphs were constructed using GraphPad Prism software version 5.01 (GraphPad Software Inc., San Diego, CA, USA).

Proteomic analysis of human seminal plasma
LC-MS/MS analysis resulted in the identification of a total of 349 proteins in the seminal plasma proteome from 16 infertile patients, with at least one unique peptide and 1% FDR (Supplemental Tables S2 and S3). However, just 60 of the 349 seminal plasma proteins fit our strict quantification criteria (at least 1 unique peptide quantified with ≥ 2 PSMs in all samples with a coefficient of variation < 50% in at least 75% of the samples) and, therefore, only these proteins were used for subsequent analyses. Detailed information on the quantifiable peptides and corresponding proteins is presented in Supplemental Tables S4 and S5 [...] Table S6).
Of interest, the post hoc test showed that the abundance of 3 of these 6 differentially expressed proteins (CRISP1, NPC2 and SPINT3) was reduced in patients with few or no sperm cells in the ejaculate (OZ and AZ patients, respectively), whereas SCPEP1 abundance was increased in AZ patients (Fig. 3). Additionally, only ANPEP displayed reduced abundance in patients with decreased sperm motility (AS patients).

Relationship between altered seminal plasma proteins and the sperm concentration
The correlation between the relative amount of seminal plasma proteins and sperm concentration was assessed in order to test whether the altered protein abundance detected in OZ and AZ patients depended on the number of sperm cells present in the ejaculate, independently of the sperm motility rate. Remarkably, the abundance of the proteins SPINT3, NPC2, ECM1, CRISP1 and IGHG2 increased with higher sperm concentration (Table I). To validate these results, a Western blotting analysis for ECM1 protein was performed in an independent set of samples not used for MS analysis (n=18; 6 NZ, 6 OZ, 6 AZ). As expected, higher ECM1 abundance was found with increased sperm count (p-value < 0.01; Fig. 4). The HPA database showed that 3 of the 5 seminal plasma proteins positively correlated with sperm concentration are mainly expressed in the epididymis, but not detected in the testis (Fig. 5A). We extended the protein patterns provided by the HPA database by immunohistochemical validation of ECM1 and NPC2 proteins in human testis (n=1) and human epididymis (n=1) biopsies (Fig. 5B). However, the analysis at the peptide level showed that, whereas the majority of peptides quantified for NPC2 (6 of 7 peptides) maintained the correlation with sperm concentration, only one peptide quantified for each of the remaining proteins was found to be correlated (1 of 1 peptide for SPINT3, 1 of 2 peptides for CRISP1 and IGHG2 and 1 of 3 peptides for ECM1; Table II).

<INSERT TABLE I>
<INSERT TABLE II>
<INSERT FIGURE 4>
<INSERT FIGURE 5>

Seminal plasma stable-protein pair profiles among patient groups and individuals
A total of 182 stable-protein pairs between 24 different proteins were identified for patients with normal semen parameters (NZ) (Table III). Of note, those 24 proteins are functionally involved in processes already ascribed to seminal plasma, such as the regulation of sperm function, semen coagulation-liquefaction processes, immune system and lipid metabolism, among others. In contrast, very few stable-protein pairs were observed in the different subtypes of patients with altered seminal parameters: 18 stable-protein pairs comprising 16 proteins in AS, 0 in OZ and 3 comprising 5 proteins in AZ (Table III).

<INSERT TABLE III>
In order to assess alterations of stable-protein pairs in individual patients, the analysis of the stable-protein pairs for the NZ patient group was repeated by adding one single patient with an altered seminal parameter at a time (Table IV). This strategy revealed that the asthenozoospermic patient AS2 had a seminal plasma proteomic signature similar to that found in NZ men, since > 75% of the stable-protein pairs established for the NZ population were maintained when the stable-protein pair analysis was performed with proteomic data from the NZ samples (n=4) plus patient AS2 (Table IV). In contrast, when three of the four azoospermic patients (AZ1, AZ2 and AZ4) were individually added to the analysis of NZ stable-protein pairs, < 50% of the stable-protein pairs determined in the NZ individuals were detected, reflecting large differences in the seminal plasma proteome signature of those AZ patients (Table IV).
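The per-patient comparison described above amounts to set operations on two lists of stable-protein pairs: those found in the NZ group alone and those found after adding one patient. The sketch below is an illustration under that assumption; the function names and the toy pairs (proteins A to D) are hypothetical and do not reproduce Table IV.

```python
from collections import Counter


def _normalize(pairs):
    """Make pairs order-independent, so that (A, B) equals (B, A)."""
    return {frozenset(p) for p in pairs}


def retained_fraction(reference_pairs, pairs_with_patient):
    """Fraction of reference (NZ) pairs still present after adding one patient."""
    ref, new = _normalize(reference_pairs), _normalize(pairs_with_patient)
    if not ref:
        raise ValueError("reference set is empty")
    return len(ref & new) / len(ref)


def lost_fraction_per_protein(reference_pairs, pairs_with_patient):
    """For each protein, the fraction of its reference correlations that are lost."""
    ref, new = _normalize(reference_pairs), _normalize(pairs_with_patient)
    total, lost = Counter(), Counter()
    for pair in ref:
        for protein in pair:
            total[protein] += 1
            if pair not in new:
                lost[protein] += 1
    return {protein: lost[protein] / total[protein] for protein in total}


# Hypothetical reference pairs and the pairs remaining after adding one patient.
nz_pairs = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
with_patient = [("B", "A"), ("C", "B")]
print(retained_fraction(nz_pairs, with_patient))
print(lost_fraction_per_protein(nz_pairs, with_patient))
```

The first value corresponds to the > 75% and < 50% thresholds quoted above, while the per-protein fractions identify which proteins lose most of their correlations in a given patient.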

DISCUSSION
The heterogeneous composition of the seminal plasma, together with the rapid changes that occur in its molecular composition after ejaculation, such as the proteolytic cascade associated with the coagulation-liquefaction process, introduces further complexity to seminal plasma proteomic studies (10,14). In the present study, a total of 349 proteins were identified in the 16 seminal plasma samples analyzed (Supplemental Table S2), which are functionally related to metabolism, response to stress, proteolysis, immune system and energy production. To a lesser extent, these identified human seminal plasma proteins also seemed to be involved in processes related to fertilization and embryogenesis (8). This apparently low number of identified seminal plasma proteins could be explained by the detection of the semenogelins I and II (SEMG1 and SEMG2) as the most abundant proteins of the seminal plasma.
Specifically, around 40% of the PSMs identified in our proteomic study corresponded to SEMG1 and SEMG2, thus hindering the detection of low-abundance proteins. This low number of protein identifications is also observed in other studies assessing the human seminal plasma proteome using MS methods and identification criteria comparable to ours (13, 31, 32, 40-42, 46, 47, 49, 65). For this reason, future studies should consider incorporating strategies to deplete SEMGs prior to the proteomic characterization of seminal plasma, for example the use of HPLC columns containing antibodies against SEMGs (10,13).

Conventional approach to analyze quantitative proteomics data
Protein quantification of TMT-labeled peptides using conventional approaches is obtained from the average of the relative ion abundance ratios for all peptides belonging to the same protein (12). The conventional statistical analyses conducted in this study showed: (i) the underexpression of the glycoprotein ANPEP in patients with altered sperm motility (Fig. 3), as previously reported by others (33); and (ii) a gradual decline of CRISP1, NPC2 and SPINT3 abundance in infertile patients, ranging from high to low sperm concentration (in decreasing order: NZ-AS, OZ and AZ) (Table I, Fig. 3). Of note, this gradual decline in abundance was also observed for the protein levels of SPINT3, NPC2, ECM1 and CRISP1, independently of the sperm motility parameter (Table I). The low abundance of those proteins in seminal plasma from patients with few or no sperm cells could reflect either proteomic alterations in the accessory sex gland secretions, or the presence of low amounts of male germ cell remnants coming from apoptotic sperm or from sperm cytoplasmic droplets. This specific question may be elucidated by deciphering the potential tissue origin of these altered seminal plasma proteins.
According to the HPA Database, the testicular or extra-testicular origin of NPC2 could not be assessed since, although NPC2 is a major component of epididymal secretions, it is also expressed in the testis (43) (Fig. 5). In contrast, ECM1, SPINT3 and CRISP1 are mainly expressed in the epididymis and are not detected in the testis (Fig. 5), suggesting that epididymal secretions could be regulated by the presence of sperm themselves in the epididymis. Interestingly, this potential cross-talk between the spermatozoa and the epididymis has also been observed in rat and bovine species (66,67). Drabovich and colleagues, in contrast, detected decreased levels of SPINT3, NPC2, ECM1 and CRISP1 only in obstructive azoospermic patients but not in patients with non-obstructive azoospermia (16). Other groups assessing potential seminal biomarkers to discern the different subtypes of azoospermia found other distinct differential proteins, such as LGALS3BP (57) and STAB2, CP135, GNRP and PIP (34). Altogether, this indicates the presence of some differences between seminal plasma proteomic data from different studies.
Of note, a similar lack of concordance has also been observed in quantitative proteomic data of sperm samples from the same type of patients. This is exemplified by the detection of only 17 proteins out of the 179 reported as differentially expressed in the sperm cells in at least 2 of the 7 comparative proteomic studies conducted so far for the study of asthenozoospermia (68). Differences in sample collection, handling and storage, proteomic strategies, and the biological intra- and inter-individual variance may be important causes contributing to this lack of reproducibility between studies (14,17,69). Moreover, the high and fast protease activity in seminal plasma after ejaculation could introduce even more heterogeneity to the results due to the presence of distinct proteolytic fragments, in addition to the protein PTMs not detected in standard proteomic procedures (70). In order to evaluate this putative heterogeneity of the seminal plasma proteome, we also assessed the correlation of the peptides belonging to the proteins SPINT3, NPC2, ECM1, CRISP1 and IGHG2 with sperm concentration (Table II). Of note, NPC2 is the only protein for which the majority of its corresponding peptides correlated with sperm concentration (Table II). In agreement with our findings, Giacomini and colleagues, using nano LC-electrospray ionization-MS/MS, found decreased levels of NPC2 in seminal plasma from idiopathic oligoasthenozoospermic patients (43). In summary, there is a need for novel approaches to analyze the results from quantitative shotgun proteomic studies, in order to overcome the limitations produced by the heterogeneity of seminal plasma proteolytic fragments, as well as by its combination with other variations such as the protein PTMs not detectable by standard proteomic procedures.

Novel approaches to analyze quantitative proteomics and identify patient-specific alterations
A set of strictly co-regulated seminal plasma proteins was established by following a new approach based on the correlation of the intensities of all unique peptides belonging to one specific protein with all the unique peptides quantified for all the other detected proteins in the sample (Fig. 2). This strategy, called stable-protein pair analysis herein, may contribute to reducing the heterogeneity observed in the seminal plasma proteomic data, since only those proteins displaying a consistent pattern in a specific phenotype are obtained. A total of 182 stable-protein pairs comprising 24 proteins were detected in patients with normal semen parameters (Table III), reflecting the strict co-regulation of these proteins in NZ individuals. These stable-protein pairs include gene products involved in: (i) sperm function, such as CKB, which plays a critical role in meeting the energy demand of sperm motility (71), CST3, PAEP and CLU, which regulate sperm capacitation (72)(73)(74), and CRISP1 and NPC2, which are necessary for sperm-oocyte binding and fertilization (75,76); (ii) the regulation of semen clotting-liquefaction processes, such as ACPP, KLK3 and MME, which are directly involved in the proteolysis of the SEMGs or other proteins (77,78), or WFDC2, ALB and TGM4, which regulate other components required for the clotting-liquefaction process such as proteases, zinc ions and polyamines (79-81); (iii) immunology, including proteins that could participate in the leukocyte-mediated immune response, such as IGKC, IGHG2, ANPEP, LGALS3BP, ECM1 and B2M (82,83), or in antimicrobial activity, for example CPE and QSOX1 (84,85); and finally, (iv) other functions such as lipid metabolism (AZGP1, TF and IDH1) and matrix assembly (VWA1). In contrast to the high number of stable-protein pairs identified in NZ individuals, the stable correlations drastically decreased in the different groups of patients with altered semen parameters (Table III).
Indeed, just 18, 3 and 0 stable-protein pairs were detected in AS, AZ and OZ patients, respectively. The low number of stable-protein pairs observed in AZ patients was not surprising, since this group contains patients diagnosed with either obstructive or non-obstructive azoospermia, which probably results in different semen protein profiles, as previously reported by others (16). However, high heterogeneity was also observed in the proteomic profile of seminal plasma from AS and OZ patients. The few stable-protein pairs detected in infertile patients with altered seminal parameters indicate that alterations in different proteins may result in, or be a consequence of, the same altered phenotype. Although some hints of the presence of protein-pair correlations were already reported in the sperm proteome using ancillary proteomic methods (86), the present study clearly demonstrates the potential of this approach using proteomics data at the peptide level (Table III, Fig. 2).
To assess which protein pairs might be associated with the alterations of the seminal parameters in individual samples, we repeated the analysis of the stable-protein pairs in the NZ population, adding data from one patient with altered parameters at a time. First of all, we observed that patient AS2 had a seminal plasma protein signature very similar to that of the NZ population, since it maintained more than 75% of the NZ stable-protein pairs (Table IV). Of note, one of the proteins that loses the most correlations in patient AS2 is TF, a protein involved in lipid metabolism and sperm protection against oxidative stress (87) (Table IV). It is interesting to note that TF loses more than 75% of its correlations in another asthenozoospermic patient (AS1) (Table IV), suggesting that oxidative stress may be related to the impairment of sperm motility in both patients AS1 and AS2.
We also observed that proteins involved in the induction of proteolysis of SEMGs and other regulators of the semen clotting-liquefaction process lose > 75% of their correlations in OZ and AZ patients (KLK3 in OZ2 and AZ2; ACPP in OZ2, AZ1, AZ2 and AZ3; and MME in AZ1; Table IV), suggesting that the sperm also contribute regulators of semen clot proteolysis (88). Some of the protein correlations were lost in the majority of the individual patients, as observed for CLU, PAEP and WFDC2, suggesting that the correlations established for these proteins in NZ samples are weak. In contrast, some seminal plasma proteins seem to be altered in only one sample, such as TGM4 in patient AS4, IDH1 in OZ1, CST3 and LGALS3BP in OZ4, CPE in AZ1, B2M in AZ3 and IGKC in AZ4, although these alterations could not clearly explain the observed phenotype.

Quantitative proteomics as a tool to provide insights in seminal plasma proteome signatures of infertile patients
So far, proteomic biomarker discovery experiments have shown relatively low concordance among different studies. In fact, we demonstrated that the results from relative quantitative proteomics differ depending on whether the analyses are performed at the protein or at the peptide level. These differences could explain the apparent lack of reproducibility of some of the findings, a fact that should also be taken into account when using antibody-based techniques recognizing specific peptides (such as Western blotting or ELISA) or targeted proteomic approaches directed at selected specific peptides. Here, we propose a novel complementary approach for the analysis of quantitative proteomic data, which is based on the establishment of stable-protein pairs. This strategy has been previously applied to the study of RNA-seq data (59,60), but to the best of our knowledge this is the first time it has been applied to proteomic data. The use of this new approach in our seminal plasma proteome dataset has allowed the determination of highly stable seminal plasma proteome signatures of men presenting normal seminal parameters (NZ). In contrast, we demonstrated that the current classification of infertile patients based on altered semen parameters resulted in a highly heterogeneous seminal plasma proteomic profile, thereby suggesting that the current male infertility stratification performed in fertility clinics is not sufficient to obtain a good diagnosis.
Moreover, the stable-protein pairs approach has the potential to pinpoint proteins potentially related to pathogenic mechanisms in individual samples when this strategy is applied for the evaluation of the NZ [...]

We greatly thank Dr Judit Castillo for critical revision of the manuscript. The authors also recognize Raquel Ferreti and Alicia Diez for their assistance in the routine seminograms and sample collection.