Identification of Bovine Sperm Surface Proteins Involved in Carbohydrate-mediated Fertilization Interactions*

Glycan-protein interactions play a key role in mammalian fertilization, but data on the composition and identities of protein complexes involved in fertilization events are scarce, with the added complication that the glycans in such interactions tend to differ among species. In this study we have used a bovine model to detect, characterize and identify sperm lectins relevant to fertilization. Given the complexity of the sperm-toward-egg journey, two important aspects of the process, both primarily mediated by protein-sugar interactions, have been addressed: (1) formation of the sperm reservoir in the oviductal epithelium, and (2) gamete recognition (oocyte-sperm interaction). Using whole sperm cells and a novel affinity capture method, several groups of proteins with different glycan specificities, including 58 hitherto unreported as lectins, have been identified on the sperm surface, underscoring both the efficacy of our selective approach and the complex composition and function of sperm. Based on these results and previous data, we suggest that sperm surface proteins play significant roles in fertilization events such as membrane remodeling, transport, protection and function, thus supporting the hypothesis that rather than a simple lock-and-key model, mammalian fertilization relies on a complex interactome involving multiple ligands/receptors and recognition/binding events.

Fertilization is a fundamental event that follows a highly coordinated sequence of cellular interactions between gametes in order to form a diploid zygote and, ultimately, a new individual. There is now considerable evidence that carbohydrate recognition plays a major role in fertilization, from lower species to man (1,2), and also that some sperm-surface carbohydrates are implicated in immune-mediated infertility (3). In this context, it is well established that oviductal sperm reservoir formation (4) and gamete recognition (5) in mammals are predominantly mediated by protein-carbohydrate interactions involving, on the one hand, sugar moieties on both the oviductal epithelium and the oocyte zona pellucida (ZP) and, on the other hand, carbohydrate-binding proteins (i.e. lectins) on the sperm cell surface. Different candidate molecules potentially involved in sperm-oviduct and/or sperm-egg interactions have been postulated for various mammalian species (6,7), suggesting that different carbohydrate ligands and lectins are involved in each case. However, there is still no consensus on the binding mechanisms and the molecules involved, mainly because not all players have been identified. Specifically, in the bovine species, it has been demonstrated that sperm binding to the oviductal epithelium involves fucose (Fuc) recognition (8,9) and that, following capacitation, spermatozoa are released from the reservoir and proceed further into the oviduct to meet the oocyte, the gamete interaction being predominantly mediated by sialic acid (Sia) residues on the oocyte (10).
To gain insights into gamete events associated with fertilization (oviduct adhesion, capacitation, ZP binding and acrosomal exocytosis), analyzing the sperm surface proteome is a must (11). Traditionally, sperm surface proteins have been studied using labeling strategies with membrane-impermeable tags to facilitate enrichment and identification (12,13). These approaches are not entirely specific for the plasma membrane, as some intracellular proteins may also be accidentally labeled during preparation and experimentation, or endogenous sperm proteins may co-purify with labeled ones. As an alternative to surface labeling strategies, sperm cell plasma membrane fractions can be purified. Some of these techniques, however, tend to give poor purification and poorly defined fractions, or involve treatments that denature proteins, inhibit enzyme activity, or affect the functional integrity of the membranes. Careful evaluation of the strategy is especially relevant for proteins involved in ZP recognition; if the preparation contains acrosomal contamination, intra-acrosomal ZP-binding proteins will be identified that may mask primary (plasma membrane) ZP-binding proteins (14). An alternative to subcellular fractionation is enrichment in protein types from a whole cell lysate. For instance, sperm phosphoproteomics studies often resort to affinity-based approaches where enrichment in phosphorylated peptides is achieved on immobilized metal ion (or TiO2) columns (15). Another relevant PTM, S-nitrosylation, has been characterized in humans using a biotin switch assay for protein enrichment that provided novel insights on the role of nitric oxide in capacitation (16).
It is also possible to combine subcellular fractionation and protein enrichment; in sperm, the best example is the use of nitrogen cavitation to produce a cytosolic fraction that, after enrichment by poly-Glu:Tyr affinity chromatography, enabled the isolation and identification of four tyrosine kinases specifically localized to the cell cytosol (17). Alternatively, immobilized lectins are used in affinity chromatography to extract surface glycoproteins (18), a method that can also be employed on nitrogen-cavitated and solubilized sperm plasma membrane material. Finally, sperm head plasma membrane proteins with high primary ZP binding affinity have been specifically isolated using ZP fragment columns (14,19).
Herein, we apply a novel affinity capture method using immobilized carbohydrates and combining proteolysis of protein-glycan complexes and mass spectrometry (CREDEX-MS, "Carbohydrate REcognition Domain EXcision Mass Spectrometry") (20-23) to examine complex samples of bovine sperm under single sided physiological conditions. In our approach, entire sperm cells rather than solubilized sperm proteins, whole cell lysate, or subcellular fractions were used to eliminate treatment-related uncertainties and to preserve as much as possible the native 3D architecture of sperm surface proteins, essential for ZP binding (24). In this way, 94 carbohydrate-binding sperm surface proteins addressing four different glycotopes were enriched, identified by state-of-the-art proteomics, and mapped to fertilization events in the bovine species. Of these, 58 proteins had not been previously found in bovine sperm by proteomics approaches, which suggests they represent low-abundance lectins that could only be identified by our selective methodology.
Furthermore, a comparison of the proteins identified under capacitating and noncapacitating media (CM and NCM) conditions showed differences in number and composition, demonstrating that the sperm membrane undergoes changes during capacitation, in preparation for downstream functions during fertilization. In addition, correlation of the present bull sperm results with recently published identifications in human sperm proteomes revealed some potentially novel bovine sperm proteins, corroborating the importance of species-specific reproductive biology characterization.

EXPERIMENTAL PROCEDURES
Materials-Fresh semen was obtained after electroejaculation of bulls (Asturiana de los Valles breed, Bos taurus) at the Cenero (Asturias, Spain) artificial insemination facility. Semen aliquots were stored in liquid nitrogen immediately after collection and kept at -196°C during transportation and storage.
Sperm Preparation-Prior to use, semen aliquots were thawed for 10 s at room temperature and immediately afterward placed in a water bath at 37°C for 40 s. Liquefied semen was next subjected to the washing swim-up technique. Briefly, samples (250 µl aliquots) were layered in cryotube vials under 1 ml of either capacitating (26)  O, 11 mM glucose, pH 7.6) (both media previously conditioned at 37°C), and incubated for 1 h at 37°C. During this time sperm are allowed to swim up in the medium, so that the most motile, active and normal cells can be collected free of debris and seminal plasma. The supernatant (~700 µl) was collected and centrifuged at 200 × g for 10 min. The top layer was discarded and the final pellet (~500 µl) kept at 37°C for subsequent affinity chromatography experiments. Samples prepared in this way were examined under a light microscope before use, and sperm motility and morphology were evaluated. Total sperm count was assessed using an improved Neubauer hemacytometer.
Trypsinization of Sperm Surface Proteins-A semen aliquot was thawed, divided into two fractions (CM or NCM, respectively; three replicates per fraction), and each sample submitted to the above swim-up procedure. Final pellets obtained under either condition were resuspended in 85 µl of 25 mM NH4HCO3 (pH 8.5), treated with 15 µl of 1 g/L trypsin (Promega, Madison, WI) in NH4HCO3 and incubated overnight at 37°C, then filtered (Amicon 10 kDa) to remove remaining trypsin and sperm heads and/or tails. The filtrates were lyophilized before LC-MS/MS analysis.
Experimental Design and Statistical Rationale-Preparation of CREDEX Affinity Chromatography Columns-For carbohydrate immobilization, 5 mg of each glycotope (Fuc4N, Le^a, 3′-SLN and SLe^x) dissolved in 50 µl of 0.5 M K2CO3 (pH 11) were incubated with 50 µg of divinylsulfonyl (DVS)-activated Sepharose [100 µg Sepharose beads, 10 µl DVS in 100 µl of 0.5 M K2CO3 (pH 11), 70 min, RT under stirring] and the mixture was evenly distributed into two Mobicol microcolumns. Glycoprobe coupling was carried out overnight at RT under continuous shaking (800 rpm), then the microcolumn was washed with 50 mM NH4OAc (pH 4) and 0.1 M Tris (pH 8) and re-equilibrated with either CM or NCM depending on the experiment to be conducted. For each of the four glycotopes, two microcolumns were prepared to allow replicate runs. Additional microcolumns with no immobilized glycan, serving as blanks, were prepared by activating Sepharose as above with monofunctional methylvinylsulfone (MVS) instead of DVS, then washed and equilibrated with either CM or NCM as above.
CREDEX-MS Excision Experiments-In a typical experiment, ~1.5 × 10^6 entire sperm cells were loaded immediately after swim-up treatment (with either CM or NCM) on the microcolumns and incubated for 24 h at 37°C with agitation by combined vibration/rotation in an IntelliMixer apparatus (http://www.elmi-tech.com/rm/). Flow-through from each column, containing unbound sperm, was collected and the column washed with the corresponding medium until only residual spermatozoa were observed by microscopy. Sugar-lectin complexes were then digested overnight with trypsin (150 µg/ml) in 25 mM NH4HCO3, pH 7.8, 37°C, with stirring. After digestion, each column flow-through, containing nonspecific digestion products, was removed and columns washed again with culture media. After gentle washing until no spermatozoa were observed, specifically bound peptides were eluted (2 × 300 µl ACN-H2O (2:1), 0.1% TFA, 15 min, 37°C, stirring). In excision experiments with Sia-containing (3′-SLN and SLe^x) microcolumns, an additional, competitive elution was done with 400 µl of 0.5 mM fetuin (Sigma-Aldrich, Madrid, Spain) for 15 min, 37°C, with stirring. Fetuin was removed by filtration (Amicon 30 kDa) and all elution samples were lyophilized prior to LC-MS/MS analysis. In total, for each pair of microcolumns prepared for every glycoprobe, three replicates (for each of the CM and NCM conditions) were performed, i.e. six replicates per glycoprobe, plus six blank replicates (three replicates for each of the CM and NCM blank columns). Thus, a total of 30 CREDEX-MS excision experiments (24 with glycoprobe + 6 with blank microcolumns) were carried out for each sperm condition.
Microcolumn functionality was tested by running binding tests with specific, pure lectins before excision experiments, as well as in between replicates and at the end of each replication set. Briefly, 20 µg of lectin (UEA-I for Fuc4N; LTA for Le^a; MAA for 3′-SLN and AAL for SLe^x) were added to the corresponding glycoprobe-Sepharose microcolumn and incubated in 100 µl HEPES running buffer (10 mM HEPES, 150 mM NaCl, 5 mM CaCl2 and 1 mM MnCl2, pH 7.4) for 24 h at 37°C. Unbound material was removed by extensive washing with running buffer, then bound lectin was eluted with 0.1% TFA in 2:1 (v/v) ACN:H2O, except for Sia-containing columns (3′-SLN and SLe^x glycoprobes), for which a second competitive elution with 1 mM fetuin was performed. The protein contents of each fraction (flow-through, wash and elutions) were analyzed by 1D-SDS-PAGE electrophoresis, and preservation of column functionality was confirmed by detection of the specific lectin gel band in the elution fraction. In addition, to further ensure that columns were suitable for reuse, a hydration (HEPES buffer)/dehydration (0.1% TFA in 2:1 (v/v) ACN/H2O) washing cycle was carried out after each excision experiment replicate, to eliminate excess BSA from the CM or to remove residual fetuin used in elution from Sia-containing microcolumns.
MS/MS Sample Preparation-Proteomic analyses were performed on the elution fractions from all CREDEX experiments and on sperm trypsinization samples (including all the replicates). In order to evaluate instrumental reproducibility, three analyses (three analytical replicates) were performed for each blank biological replicate, giving a total of 18 blank replicates for each capacitation condition. In total, 54 sample injections (12 fetuin glycoprobe elutions, 24 standard glycoprobe elutions and 18 blank elutions) for each sperm condition (CM and NCM) were performed for MS/MS analysis of CREDEX experiments. In order to minimize instrumental variability, a defined batch file was programmed for sample injection. Specifically, samples were injected in groups of nine, comprising the six replicates of the same glycotope alternated with three sample blanks. Additionally, six injections, corresponding to the triplicate sperm trypsinization samples for each condition, were run separately. In order to improve the assignments in the subsequent MS/MS analysis, prior to injection, lyophilized samples containing tryptic peptides were resuspended in 200 mM NH4HCO3, reduced with DTT (60 nmol, 1 h, 37°C), alkylated in the dark with iodoacetamide (120 nmol, 30 min, 25°C) and purified in UltraMicroSpin C18 columns (The Nest Group, Inc, Southborough, MA). Desalted and purified peptides were dried in a vacuum centrifuge and redissolved in H2O (0.1% HCOOH) for subsequent MS analysis.
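The injection totals quoted above can be checked with a short tally. The sketch below is only a bookkeeping aid: the constant and function names are hypothetical, and it assumes six replicates per glycoprobe and six blank biological replicates per sperm condition, which is the reading that reproduces the stated figures of 24, 12, 18 and 54.

```python
# Tally of LC-MS/MS injections per sperm condition (CM or NCM).
# All names are illustrative; the replicate counts are an assumed reading
# of the design that reproduces the totals stated in the text.
GLYCOPROBES = ["Fuc4N", "Le^a", "3'-SLN", "SLe^x"]
SIA_PROBES = ["3'-SLN", "SLe^x"]       # only these receive the extra fetuin elution
REPLICATES_PER_PROBE = 6                # assumed: 2 microcolumns x 3 replicates
BLANK_BIOLOGICAL_REPLICATES = 6         # assumed reading of the blank design
ANALYTICAL_REPLICATES_PER_BLANK = 3     # three analyses per blank replicate
BATCH_SIZE = 9                          # six glycotope replicates + three blanks


def injections_per_condition():
    """Return the number of injections by type, the total, and the batch count."""
    standard = len(GLYCOPROBES) * REPLICATES_PER_PROBE
    fetuin = len(SIA_PROBES) * REPLICATES_PER_PROBE
    blank = BLANK_BIOLOGICAL_REPLICATES * ANALYTICAL_REPLICATES_PER_BLANK
    total = standard + fetuin + blank
    return {
        "standard": standard,
        "fetuin": fetuin,
        "blank": blank,
        "total": total,
        "batches": total // BATCH_SIZE,
    }


if __name__ == "__main__":
    print(injections_per_condition())
```

Under these assumptions the 54 injections split into six batches of nine, one for each standard or fetuin elution series, each carrying three of the 18 blank injections.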
Nano-LC-MS/MS-Samples were analyzed in an LTQ-Orbitrap Velos Pro instrument (Thermo Fisher Scientific, San Jose, CA) coupled to an EasyLC (Thermo Fisher Scientific (Proxeon), Odense, Denmark). Peptides were loaded at 1.5-2 µl/min directly onto a reverse-phase column (12 cm × 75 µm, C18, 3 µm; Nikkyo Technos Co., Ltd. Japan), washed with 4-5 times the injection volume and separated by linear gradients of 3-7% B in A over 1 min (A: 0.1% HCOOH in H2O; B: 0.1% HCOOH in MeCN), followed by 7-35% B in A over 40 min, at a flow rate of 300 nL/min. After each analysis, the column was washed with 90% B for 10 min. The mass spectrometer was operated in positive ionization mode with nanospray voltage set at 2.2 kV and source temperature at 325°C. Ultramark 1621 (Thermo) was used for external mass analyzer calibration prior to analyses. Moreover, internal calibration was also performed using the background polysiloxane signal at m/z 445.1200. The instrument was operated in data-dependent acquisition (DDA) mode and full MS scans at a resolution of 60,000 FWHM were used over a mass range of m/z 30-2000. Automatic gain control (AGC) was set to 1 × 10^6, and dynamic exclusion (60 s) and charge state filtering disqualifying singly charged peptides were activated. In each DDA cycle, after each survey scan the top ten most intense multiply charged ions above a threshold count of 10,000 were selected for fragmentation at a normalized collision energy of 35%. Fragment ion spectra produced via higher-energy collisional dissociation (HCD) were acquired in the Orbitrap mass analyzer at a resolution of 7,500, with AGC set at 5 × 10^4, an isolation window of 2.0 m/z, activation time of 0.1 ms and maximum injection time of 100 ms. All data were acquired and processed with the Xcalibur software v2.2.
Database Searching and Data Set Composition-Proteome Discoverer software (v1.4, Thermo Fisher Scientific) and the Mascot search engine (v2.3.1, Matrix Science) were used for peptide identification and quantification. Searches using the UniProtKB/Swiss-Prot manually annotated and reviewed database were preferred over other alternatives (e.g. TrEMBL), to ensure high quality annotations and avoid redundancy. Specifically, data were searched against an in-house-generated database containing all (6,121) UniProtKB/Swiss-Prot Bos taurus proteins plus common contaminants (~600 entries). A precursor ion mass tolerance of 7 ppm at the MS1 level was used, and up to three missed cleavages for trypsin were allowed. Fragment ion mass tolerance was set to 20 mmu. Met oxidation, N-terminal acetylation, and Ser, Thr and Tyr phosphorylation were defined as variable modifications and Cys carbamidomethylation as a fixed modification. False discovery rate (FDR) in peptide identification was evaluated by using a decoy database and was set to a maximum of 5%. The log2 of the average area of the three most intense peptides per protein, as calculated by Proteome Discoverer, was used as quantitation indicator. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (28) (http://www.proteomexchange.org/) via the PRIDE partner repository with the data set identifier PXD003386.
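Two of the numerical criteria above, the 7 ppm precursor tolerance and the log2 of the mean area of the three most intense peptides, can be written out explicitly. The following is a minimal sketch and not the Proteome Discoverer or Mascot implementation; the function names are hypothetical, and the handling of proteins with fewer than three quantified peptides (averaging whatever is available) is an assumption made for illustration.

```python
import math


def within_ppm(observed_mz, theoretical_mz, tol_ppm=7.0):
    """True if the precursor mass error, in parts per million, is within tolerance."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


def top3_log2(peptide_areas):
    """log2 of the mean area of the (up to) three most intense peptides.

    Assumption: if fewer than three positive areas are available, the mean
    of those present is used; None is returned when there are none.
    """
    areas = sorted((a for a in peptide_areas if a > 0), reverse=True)[:3]
    if not areas:
        return None
    return math.log2(sum(areas) / len(areas))


if __name__ == "__main__":
    # Toy values, not data from the study.
    print(within_ppm(500.0030, 500.0))          # 6 ppm error
    print(top3_log2([1024, 1024, 1024, 1]))     # the fourth peptide is ignored
```

Working on a log2 scale means that a difference of one unit between two proteins corresponds to a two-fold difference in averaged peptide area.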
Data Mining and Bioinformatic Analyses-Script tasks and protein list comparisons were performed using the R software (http://www.R-project.org), and the gplots package (v.2.11.3) was used for plotting data in a graphical matrix. A 1/0 matrix for presence/absence analysis, minimizing the risk of false positives or negatives, was generated by defining changes in protein identification as relevant if occurring in two thirds (67%) of total replicates. Subsequent hierarchical clustering was performed using a Euclidean distance metric and the maximum (complete) linkage criterion, and final heat maps/dendrograms were generated by displaying x-axis (sample clustering) versus y-axis (clustering of identified proteins) data. Proportioned Venn diagrams were drawn using Venn Diagram Plotter (http://omics.pnl.gov/software/VennDiagramPlotter.php). Proteins identified were matched against a database containing all published studies on bovine sperm proteomics. PubMed (http://www.ncbi.nlm.nih.gov/pubmed/) and UniProt Knowledgebase (http://www.uniprot.org) were also used, whenever needed, to analyze identified proteins, especially to check which had been previously described in bovine sperm. For comparisons, Swiss-Prot accession numbers (if available) or protein names (in which case all alternative names were verified) were used. Gene Ontology (GO) resources and tools available at agriGO (a GO analysis toolkit and database for the agricultural community) were used to obtain all existing GO annotations available for known proteins in our data sets, as well as to identify the molecular functions, biological processes and cellular locations represented in differentially expressed proteins in our data sets. DAVID bioinformatics (29) was also employed to obtain a comprehensive set of functional annotations for our list of proteins/genes. Moreover, the KEGG PATHWAY database was used to map our data sets for biological interpretation of higher-level systemic functions.
An overrepresentation analysis was performed in order to recognize functional biological pathways and to draw wire diagrams of molecular interactions, reactions, and relations. Physical and functional interaction properties of the identified proteins were predicted with STRING v 9.05 (Search Tool for Retrieval of Interacting Genes/Proteins) (30); network analysis was set at medium stringency (STRING score = 0.4). Proteins were linked based on seven criteria: neighborhood, gene fusion, co-occurrence, co-expression, experimental evidence, existing databases, and text mining. Finally, human orthologs of bovine genes were batch-downloaded from Ensembl 65 using BioMart (http://www.ensembl.org/biomart/martview).
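The medium-stringency cutoff described above amounts to keeping only protein pairs whose combined score reaches 0.4 and then asking which proteins retain at least one link. The sketch below illustrates that filtering step in plain Python; it does not call STRING, the function name is hypothetical, and the protein labels and scores are invented toy values rather than results from this study.

```python
def linked_proteins(edges, min_score=0.4):
    """Return the set of proteins with at least one edge scoring >= min_score.

    `edges` is an iterable of (protein_a, protein_b, combined_score) tuples.
    Self-links are ignored.
    """
    linked = set()
    for protein_a, protein_b, score in edges:
        if protein_a != protein_b and score >= min_score:
            linked.update((protein_a, protein_b))
    return linked


if __name__ == "__main__":
    # Toy network: labels and scores are made up for illustration only.
    proteins = ["P1", "P2", "P3", "P4", "P5"]
    edges = [("P1", "P2", 0.90), ("P2", "P3", 0.41), ("P4", "P5", 0.20)]
    linked = linked_proteins(edges)
    print(sorted(linked), f"{len(linked) / len(proteins):.0%} linked")
```

Proteins left outside the returned set are the unconnected nodes of the network at that stringency; raising `min_score` can only shrink the linked set.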

Isolation of Carbohydrate-binding Proteins from Bull Spermatozoa by CREDEX-MS-Various carbohydrate structures chosen from the literature on the basis of their role in the above-mentioned carbohydrate-regulated steps (8,9,10,2) were immobilized onto DVS-activated Sepharose microcolumns as indicated in Fig. 1A. After immobilization and confirmation of functionality by binding tests with specific pure lectins and SDS-PAGE detection (not shown), they were used in excision experiments with both capacitated and noncapacitated entire sperm (Fig. 1B). A Sepharose column treated with methylvinylsulfone, i.e. without glycan binding properties, was used in each experiment as a blank to evaluate nonspecific and solid support binding. Phase-contrast microscopy was used to assess the different fraction compositions shown in Figs. 1C to 1G.
The CREDEX strategy (Fig. 1), in addition to allowing digestion after the native recognition event and hence emulating the in vivo situation, also favored removal of indirectly interacting proteins or noncovalent aggregates, i.e. amino acid interactions from domains not involved in carbohydrate recognition would be eliminated in the flow-through of the digestion. Interestingly, the second flow-through fraction obtained after digestion contained sperm heads and tails (Fig. 1E), confirming that binding was indeed taking place on the column, as in previous washes no sperm was observed under the microscope (Fig. 1D). Therefore, tryptic peptides belonging to the carbohydrate recognition domain of sperm surface proteins with affinity for the immobilized sugars had been retained in the column, and could be eluted and analyzed by LC-MS/MS.
In excision experiments with Sia-containing columns (3′-SLN and SLe^x glycoprobes), and in view of the results obtained in preliminary experiments with pure Sia-specific lectins, a special strategy was introduced to ensure effective separation of bound peptides that might not elute under the standard ACN/H2O/TFA conditions. To this end, an additional competitive elution step using fetuin, a glycoprotein with high Sia content, was included as a more effective elution step for peptides that remained retained. The structural basis for this interaction is not well understood, though it appears that glycoprotein conformation, hence specific steric relationships between oligosaccharide chains, plus proximity/orientation of the charged terminal residues, may bear on the ability of fetuin to interfere with sperm binding.
Proteomic Analysis of Proteins Isolated in CREDEX and Sperm Trypsinization Experiments-Initial MS/MS analysis of proteins from CREDEX and sperm trypsinization experiments allowed the identification of 394 and 230 entries, respectively (obtained under both CM and NCM conditions). Given the lack of information on which proteins might be involved in the carbohydrate-regulated fertilization events studied here, no targeted scanning and/or quantitative approach could be employed in our proteomic studies. Therefore, a post-processing analysis to determine presence/absence of protein identification under the different CREDEX parameters was chosen. To minimize the risk of false positives or negatives, changes in protein identifications were set to be relevant if occurring in two thirds (67%) of the total replicates. After this filtering, a total of 94 different proteins (out of 394) were found to be altered in the CREDEX experiments (see Table S1 for numbers and classification of identified proteins).
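The two-thirds criterion can be expressed as a simple call on the replicate detections of each protein in each elution sample. The sketch below is an illustration under stated assumptions, not the authors' R script: the function names and the nested-dictionary layout are hypothetical, and "two thirds of the total replicates" is read as "detected in at least two thirds". Exact fractions are used so that a protein seen in 4 of 6 replicates sits exactly on the threshold.

```python
from fractions import Fraction

TWO_THIRDS = Fraction(2, 3)


def presence_call(detections, threshold=TWO_THIRDS):
    """Return 1 if the protein was detected in >= threshold of replicates, else 0.

    `detections` holds one truthy/falsy value per replicate.
    """
    if not detections:
        raise ValueError("at least one replicate is required")
    detected = sum(1 for d in detections if d)
    return int(Fraction(detected, len(detections)) >= threshold)


def presence_matrix(replicate_hits, threshold=TWO_THIRDS):
    """Collapse {protein: {sample: [replicate detections]}} to a 1/0 matrix."""
    return {
        protein: {sample: presence_call(hits, threshold) for sample, hits in samples.items()}
        for protein, samples in replicate_hits.items()
    }


if __name__ == "__main__":
    # Toy input: protein and sample labels are placeholders, not study data.
    hits = {"protein_A": {"sample_1": [1, 1, 0], "sample_2": [1, 0, 0]}}
    print(presence_matrix(hits))
```

The resulting 1/0 matrix, with proteins as rows and elution samples as columns, is the kind of input that the heat maps and Venn diagrams are built from.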
Heat maps of the CREDEX proteomic results, for capacitated (Fig. 2A) and noncapacitated sperm (Fig. 2B) separately, made it possible to establish patterns across proteins and biological samples simultaneously in the large data sets without the need for subjective summarizing (31). Thus, two matrices were created so that each column contained data from a single sample (elution fraction of each different CREDEX column) and each row corresponded to a single feature (identified protein; 81/58 proteins for CM/NCM). The presence-absence criterion for a given sample was reflected by a 1 or a 0, respectively. The heat map software further reordered the matrix data so that rows and columns with similar profiles were clustered, thus making patterns more visible. When two clusters are merged, a line is drawn connecting the two clusters at a height corresponding to how similar the clusters are. A subsequent procedure translated the numerical matrix into a color image, white for 0 (absence) and red for 1 (presence). With these considerations, significant differences between the elution samples were found in both heat maps. A general, particularly marked tendency could be observed in second fractions from fetuin-mediated elutions of Sia-containing glycotopes (3′-SLN and SLe^x), which clustered in a similar area on the left side of the heat map for CM (Fig. 2A) and on the right for NCM conditions (Fig. 2B). These clearly merged clusters stressed the similarities between fetuin elutions, the corresponding similar red-heated areas representing proteins with a specific affinity for Sia columns, as they were mainly eluted only under such binding conditions. Cluster connections could also be observed between Fuc-containing columns (Fuc4N and Le^a glycoprobes), some proteins being detected in the ACN standard elution of both column types.
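The reordering and dendrogram heights described above follow from agglomerative clustering with Euclidean distance and maximum (complete) linkage. The sketch below implements that procedure in plain Python on four invented binary sample profiles. It is a didactic stand-in for the R/gplots routine used in the study, the function names are hypothetical, and ties between equally distant pairs are resolved by taking the first pair found, which is an arbitrary choice.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def complete_linkage(vectors):
    """Agglomerative clustering with complete (maximum) linkage.

    Returns the merges in order as (members_a, members_b, height), where the
    members are tuples of input indices and the height is the largest pairwise
    distance between the two merged clusters.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance = maximum pairwise distance.
                d = max(
                    euclidean(vectors[a], vectors[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


if __name__ == "__main__":
    # Toy presence/absence profiles for four elution samples (not study data).
    samples = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
    for left, right, height in complete_linkage(samples):
        print(left, right, height)
```

In this toy case the two identical profiles merge at height 0, the two similar profiles merge next, and the two resulting groups join last at the largest height, which is how samples with shared presence patterns end up side by side in a heat map.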
Data were also represented by means of Venn diagrams. In Fig. 3A, the intersection between the two different sets of proteins obtained in all combined CREDEX experiments using different (CM and NCM) sperm treatments contained almost 50% of the total number of different proteins in the collection. Hence, a large number of sperm proteins were conserved under both conditions and are likely essential for sperm form and/or function regardless of handling conditions. Some proteins, however, were only identified under CM conditions, consistent with capacitation involving subtle and intricate modifications of the sperm membrane, including alterations, removals or changes in organization and/or composition of the proteins and lipids at the sperm surface. The number of proteins identified in total sperm surface digests (188) was much larger than in the CREDEX experiments (94) (Fig. 3B). This somewhat predictable result can be explained by noting that CREDEX, an affinity enrichment method, reduces sample complexity by yielding predominantly low-abundance, specific proteins. In contrast, trypsinization is done upon whole cells, yielding more complex samples with more proteins identified. The same reasons could also explain the low number of coinciding proteins (32; 13% of total) in the two sets, in comparison with the 62 (25%) exclusively found by CREDEX or the 156 (62%) found only in the sperm trypsinization experiments. Venn diagrams in Figs. 3C and 3D show CREDEX results obtained with different glycoprobes under CM and NCM conditions, respectively. For Sia-containing columns, results from the two different elution fractions (standard and fetuin) were taken together. In comparison, fewer Fuc-binding (<10) than Sia-binding (>30) proteins were identified. Common proteins in multiple sets can be distinguished in the different intersections, representing primarily nonspecific proteins. The same trends observed in heat map areas were also found with Venn diagrams.
Specifically, up to 30 proteins were found in the intersection between the 3′-SLN and SLe^x glycoprobe sets under both conditions, i.e. proteins with specific affinity for Sia-containing core structures. By comparing these with those found with other glycoprobes, conclusions about specificity can be drawn. The complete list of proteins obtained in our proteomic study, organized from the Venn diagrams, is given in Tables I-III.
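The overlap figures quoted for the CREDEX and trypsinization sets (Fig. 3B) follow from the two set sizes and their intersection. The short sketch below reproduces that arithmetic; the function name is hypothetical and percentages are rounded to whole numbers, as in the text.

```python
def venn_two_sets(size_a, size_b, shared):
    """Region sizes and rounded percentages (of the union) for a two-set Venn diagram."""
    only_a = size_a - shared
    only_b = size_b - shared
    union = only_a + only_b + shared

    def pct(n):
        return round(100 * n / union)

    return {
        "only_a": (only_a, pct(only_a)),
        "only_b": (only_b, pct(only_b)),
        "shared": (shared, pct(shared)),
        "union": union,
    }


if __name__ == "__main__":
    # 94 CREDEX proteins, 188 trypsinization proteins, 32 in common.
    print(venn_two_sets(94, 188, 32))
```

With 94 and 188 proteins and 32 in common, the union holds 250 distinct proteins, of which 62 (25%) are exclusive to CREDEX, 156 (62%) exclusive to trypsinization and 32 (13%) shared.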
Bovine Sperm Surface Lectinome-The 94 proteins identified by the CREDEX-MS approach were screened against a database that included all proteins (>4000) identified in 12 different proteomic analyses, covering either whole sperm (32-36) or fractions including isolated plasma membrane (37) or cytosolic matrix (17); or seminal plasma (38,39) including epididymal (40,41) and accessory sex gland (42). To the best of our knowledge, more than 50% of these proteins are described herein for the first time (i.e. no previous description of their expression in bovine sperm; Fig. 4A). Specifically, out of 94 proteins in the study, 36 have been previously reported (Tables I-III; in italics) but 58 are reported here for the first time in bovine sperm by an MS/MS approach. These newly identified proteins, most of them not present in sperm trypsinization data, most likely represent low-abundance proteins identified only because of our glycan-affinity enrichment methodology.

FIG. 2. Heat maps representing the presence (red) and absence (white) of the identified proteins (y axis) through the different elution samples (x axis) for (A) capacitated sperm (81 proteins) or (B) noncapacitated sperm (58 proteins).
TABLE I Glycospecific proteins identified by CREDEX under both sperm conditions (capacitated on the left and noncapacitated on the right columns). Proteins are distributed in rows separated by solid lines, corresponding to elution fractions obtained with the various glycoprobes indicated in the central column. Each table displays the protein accession number and the description for each identified protein.
Glycoprobes with the same core structures, i.e. Fuc- or Sia-containing core structures, are shaded in the same grey tone. The ambiguous category contains proteins identified in multiple elution fractions of different columns, hence without defined glycan specificity. Common proteins identified under both CM and NCM conditions and eluted from the same glycoprobe column are listed first and separated from singular proteins (exclusively detected under one condition) by broken lines.
Proteins found under both conditions but not exactly with the same glycoprobe (different row), though sharing the same functional monosaccharide (same grey tone), are marked with the character (‡).
= and ≠ signs denote proteins identified/unidentified in sperm trypsinization experiments, where the maximum possible number of proteins was identified, thus pointing out which low-abundance proteins were probably enriched by the CREDEX approach.
Proteins previously found in other published bovine proteomic studies are shown in italics, whereas newly identified proteins are in normal type.
Asterisk (*) refers to proteins with no reported specific function within the field of fertilization.

To gain insight into the biological significance of these newly identified proteins, gene ontology functional enrichment analysis was performed using agriGO (43). Proteins related to genes possessing the same or similar biological function were grouped. Among the 58 hitherto unreported proteins, 55 had a GO annotation and could be classified according to molecular function, cellular component, and biological process (Figs. 4B, 4C, 4D). The categorization process was extended to the total (94) of proteins. Most newly identified proteins were classified into categories similar to those in the total set and included binding proteins, transporters, enzyme modulators, hydrolases, cytoskeletal proteins, and others. Potential functionalities were assigned using UniProtKB. Concerning localization (Fig. 4D), most were extracellular, as expected from the analytical strategy, but a few localized in cytoplasmic vesicles, cytoskeletal microtubules, plasma membrane and the nucleus. The categorization by main function (Fig. 4C) or by the biological processes in which they are putatively involved (Fig. 4B) revealed sperm surface proteins in standard functions such as metabolic regulation, stimuli and immune response, signaling, transport, etc. However, as anticipated, a number of proteins were also identified as players in different parts of the reproductive process such as spermatogenesis, sperm capacitation, fertilization, and acrosome reaction. Aiming at a systems-level understanding of cellular functions and at annotating all functional interactions among proteins in the cell, a STRING network analysis of protein-protein interactions was separately performed for newly and previously identified proteins (Fig. 5), with a view to establishing functional links and determining the potential underlying biological processes (44). Whenever a functional association between two proteins exists, an edge is set between the corresponding nodes in the network graph, whose weight represents a confidence value on the evidence of such an association. Fig. 5A shows predicted interactions between proteins in our study also previously reported in bovine proteomes. Of the 35 proteins present in the STRING database, 28 (80%) were found to be linked either directly or indirectly through one or more interacting proteins, suggesting the existence of known functional linkages between these highly conserved proteins. In order to further explore these connections, functional pathways were investigated with the KEGG PATHWAY database.
Results revealed two overrepresented pathways, namely "complement and coagulation cascades" within the immune system (including proteins such as alpha-2-macroglobulin, complement C3, complement factor B, complement factor H, plasminogen and alpha-2-antiplasmin) and "gap junction" cell communication (such as tubulin proteins). In contrast, the remaining 20% of the commonly observed proteins lacked functional interactions with other proteins. Interestingly, well-known proteins such as spermadhesins (45) or PDC-109 (46) were assigned to this set of proteins, hence reinforcing the evidence that they cover selective functions related to fertilization and rarely participate in other common cell processes. Furthermore, the regular observation of these specific proteins in almost all proteome studies performed on bovine sperm corroborates their relevance for the reproductive process.

With regard to newly identified proteins, as expected, fewer associations were observed (compare Fig. 5A and 5B). In this case, two different situations could be distinguished. On the one hand, around 50% of these newly found proteins were interlinked (Fig. 5B, right), and further KEGG analysis revealed these to be mainly proteins participating in either the immune system (e.g. prothrombin, coagulation factor V, complement C7, fibrinogen alpha and beta chains, kininogen 1, protein S, alpha-1 antiproteinase, antithrombin) or in extracellular matrix-receptor interactions (e.g. cartilage oligomeric matrix protein, collagen alpha 1, dystroglycan, fibronectin). It is likely that the carbohydrate-based enrichment process used here somehow selects for such extracellular matrix-receptor interactions. On the other hand, the remaining 50% (Fig. 5B, left) contained 24 proteins identified for the first time in bovine sperm, with neither known functional associations nor an extensive background in the bovine reproduction field. Of particular interest are protamines, which have been associated with male infertility (47), and dynein light chain 1, which has been found to be involved in generation of murine sperm motility (48) as well as male germ cell differentiation in Drosophila (49).

Footnotes as in Table I apply.

TABLE III. Proteins found in both conditions but with different glycospecificity. Each table displays the protein accession number and the description for each identified protein.
Footnotes as in Table I apply.

FIG. 4. Similarity with earlier bovine observations and gene ontology functional enrichment analysis. A, Bull sperm proteins in the present study that coincide with previously published descriptions. Of the 94 proteins identified and considered here, 58 were not previously described in bovine sperm. New versus total bovine sperm proteins identified in our study were grouped using the agriGO database according to: B, putative biological process, C, molecular function, and D, cellular component.
Correlation with Human Sperm Proteome-The 94 total proteins from our data sets were assigned to 65 bovine genes, which in turn were matched against the entire human genome and yielded 54 orthologs. Comparison of these 54 human orthologs with the complete list of ca. 4500 genes recently identified in the human sperm proteome (50) revealed 22 (41%) evolutionarily conserved genes present in humans, such as PDC-109, whereas 32 noncoincident genes (59%) corresponded to potentially new, bovine-specific sperm surface proteins (Table IV). Noteworthy among the latter are cadherins, expressed in human gametes and participating in sperm-oocyte interaction events (51), or bovine mannose-binding protein C, possibly related to a human mannose receptor, a candidate protein proposed to interact with the ZP via carbohydrate-binding domains in a capacitation-dependent manner (52). Indeed, we detected this mannose-binding protein C only in capacitated bovine sperm, with no functional associations with other proteins. We would emphasize that all these proteins were exclusively observed in our bovine proteomic analysis. Because some of them have putative functions in human fertilization processes, it will be interesting to study their bovine gene ortholog functions.

DISCUSSION

The studies presented here aim to unveil the molecular basis of bovine sperm-oocyte interaction as the initial phase of gamete recognition. To this end, well-defined carbohydrate structures, previously identified as key elements of ZP glycosylation, were displayed as bait in an experimental protocol for isolation and enrichment of sperm surface proteins of defined glycan specificity, using intact live bull sperm cells under both capacitating and noncapacitating conditions. The CREDEX-MS approach mimics the in vivo recognition events between live sperm membrane-associated components and glycotopes present in ZP or oviductal epithelium glycoproteins. 
This setup also reduces sample complexity by enriching for protein types of interest from a whole cell. As such, only sperm surface proteins with affinity for the homologous specific carbohydrate, namely Neu5Ac and Fuc, were targeted.
Using nano-LC-MS/MS, proteomic analyses of the Sia/Fuc-binding proteins isolated from the sperm surface by CREDEX-MS generated a lectinome of 94 proteins, 58 hitherto unidentified in the bovine species. These may represent low-abundance proteins not previously detected in whole sperm proteomic studies, i.e. glycospecific proteins detectable only in an enriched fraction. Potential functional information on the identified proteins was obtained from UniProt, agriGO, DAVID or KEGG databases; STRING was also employed to provide physical and functional protein-protein interactions. An overview of these bioinformatic analyses suggests that the sperm surface is a dynamic architecture, with proteins involved not only in fertilization but also in processes vital to sperm physiology, including metabolic regulation, signaling and transport, as well as proteases, hydrolases and cytoskeleton-associated proteins.
Our studies also show that medium composition, representing various stages in a spermatozoon's journey through the female tract, influences protein identification, suggesting that capacitation modifies the protein composition of the sperm membrane. Indeed, in vitro capacitation is assumed to mimic in vivo conditions in the oviduct, where spermatozoa interact with their immediate environment; this contact is likely to remodel sperm surface through, e.g. adsorption, removal, aggregation and/or migration of specific proteins, preparing and enabling it to fertilize the egg (53,54).
Regarding glycan specificity, two different glycoprobes, one slightly more complex than the other (di- versus trisaccharide), were selected for the study, yet both with a terminal functional monosaccharide (Fuc or Sia) for which an active role in gamete interaction is reported in the literature. Many proteins identified in our study show specific affinity for Sia-containing epitopes, a not unusual outcome as Sia units are typically at the termini of glycoconjugate chains as well as on secreted glycoproteins. This high Sia prevalence suggests its predominant function is modulating interactions with the environment (55), both primary and complex, reportedly including bovine and human gamete interaction (10,2). One consequence of this ubiquity, also evident in our study, is, however, that many entries with such carbohydrate specificity are surface proteins with standard cellular roles, though some of them may also perform sperm-related functions. One interesting example is plasminogen, the enzymatically inactive precursor of plasmin, which has been identified in the oviductal fluid and demonstrated to bind oocytes at ZP and oolemma level (56). Upon sperm contact with the oocyte, plasminogen activators triggering the conversion of plasminogen into plasmin are released, the protease thereby causing sperm detachment from the ZP and helping regulate sperm penetration in the oocyte (57). In our study, plasminogen shows specificity for Sia glycotopes, hence binds to sperm and may further interact with ZP Sia residues (58) during gamete interaction, thus contributing to plasminogen/plasmin system activation and regulation of the fertilization outcome. Fetuin-B, also identified in our Sia-glycoprobe fractions, is another protein that regulates gamete fusion by restraining ovastacin protease activity, thereby maintaining ZP permeability until the first spermatozoon intrudes. After penetration, ovastacin overrides fetuin-B and initiates ZP hardening, thus preventing polyspermy (59).
Fuc-binding proteins in our analysis include some well-known spermadhesins, main components of seminal plasma that associate peripherally with the sperm surface. They are believed to act as primary receptors in sperm-ZP binding and to participate in other events regulating sperm capacitation and acrosome stabilization. Specifically, two members of this family (spermadhesin-1 and spermadhesin Z13) were identified, both with a hitherto unreported specific affinity for Fuc residues. The former, in particular, has redox activity that may not only protect sperm from oxidative stress but also inhibit motility during storage in the cauda region (60). Its ability to bind to ejaculated sperm is lost after capacitation (61), suggesting it may act as a decapacitation factor on bull spermatozoa rather than as a ZP-binding molecule. This latter observation is in tune with the present results, as spermadhesin-1 was exclusively detected in noncapacitated sperm. Therefore, considering that (1) sperm reservoir formation appears to be regulated by recognition between Fuc-containing glycans at the surface of the oviductal epithelium and Fuc-binding lectins on the sperm surface, and that (2) noncapacitated spermatozoa bind to the oviductal epithelium, their release from the sperm reservoir apparently coinciding with the loss/modification of surface proteins associated with capacitation, our data would suggest spermadhesin-1 as a potential bovine sperm surface lectin candidate recognizing Fuc residues in the oviductal epithelium and leading to sperm reservoir formation. Interestingly, PDC-109 was also detected with our proteomic approach, confirming earlier findings. This major component of seminal plasma, binding to sperm surface via choline phospholipids, is an important mediator in capacitation, probably through sperm membrane lipid modification (62), and also appears to contribute to the formation of a sperm reservoir by binding to oviduct epithelium (63). 
Previous studies reported that PDC-109 remained on sperm membranes at the onset of capacitation in the sperm reservoir (64). If so, PDC-109 should be present under both CM and NCM conditions and predominate in Fuc-binding fractions (65). However, it was only detected in CM, and with little glycan specificity. One possible reason for its absence in NCM might be the strict annotation criteria of our approach, which admitted to the final list only proteins identified in at least four (out of six) replicates. By this standard PDC-109 did not make the NCM list, despite having been identified in five Fuc replicates (two Fuc4N, three Lea). In conclusion, although in our settings PDC-109 cannot be attributed any glycan specificity, it clearly presents predominant binding to Fuc rather than Sia glycotopes.
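The replicate criterion discussed above can be illustrated with a short Python sketch. It is a hedged illustration only: the function name `passes_filter`, the dictionary layout, and in particular the assumption that the four-of-six threshold is applied to each glycoprobe separately are hypothetical and are not drawn from the analysis pipeline used in the study. The replicate counts are those quoted in the text for PDC-109 under NCM.

```python
MIN_REPLICATES = 4  # threshold stated in the text: at least four of six replicates


def passes_filter(replicate_hits, min_replicates=MIN_REPLICATES):
    """Return the glycoprobes for which a protein meets the replicate threshold.

    replicate_hits maps a glycoprobe name to the number of replicates
    (out of six) in which the protein was identified.
    ASSUMPTION: the threshold is evaluated per glycoprobe, not on the pooled count.
    """
    return [probe for probe, n in replicate_hits.items() if n >= min_replicates]


# PDC-109 under NCM, using the counts quoted in the text
pdc109_ncm = {"Fuc4N": 2, "Lea": 3}

print(passes_filter(pdc109_ncm))   # [] : neither probe reaches four replicates
print(sum(pdc109_ncm.values()))    # 5  : pooled Fuc identifications
```

Under this per-probe reading, five pooled Fuc identifications still leave PDC-109 below the threshold for each individual glycoprobe, which is consistent with its absence from the NCM list.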
In addition to our survey of lectins in bull sperm surface, we have also performed a comparative analysis of human and bovine sperm lectinomes. The differential protein/gene expression profile obtained corroborates the importance of species-specific reproductive biology characterization, as some proteins may be present in both species but have sperm functions in only one of them. This interspecies cross-referencing will help establish parallel genetic models for the study of spermatogenesis and sperm function, eventually allowing, for instance, the identification of sperm factors responsible for male infertility in humans.
The number and diverse nature of ZP/oviduct-binding candidates identified to date, including the present findings, do not support the hypothesis of a single sperm protein being responsible for the various species-specific recognition events in mammalian fertilization. Indeed, most researchers favor scenarios with multiple sperm receptors or heteromeric protein complexes involved, diversity being necessary to facilitate the various intricate events taking place during gamete interaction. Sperm-oviduct and initial sperm-ZP adhesion, specific tight binding, induction of a signal transduction cascade to initiate the acrosome reaction and secondary binding to facilitate ZP penetration all need to be choreographed, probably with hierarchical interactions among various proteins. The concept of multiple proteins, either alone or in concert, is also supported by the temporal and spatial differences observed during sperm-ZP binding. Moreover, because fertilization is fundamental for species propagation, one might expect a number of regulatory processes to have evolved to ensure that the initial contact between spermatozoon and ZP is successful. Thus, a combination of binding events, even if some are nonessential, can improve the chances of fertilization. One could expect a degree of redundancy to be involved in what is after all a crucial process to the individual. In this context, lectin activities have been identified in the sperm heads of several species and are commonly implicated in somatic cell-cell recognition, being clearly extracellular proteins in some species. These would appear to be likely candidates for regulating the initial sperm-egg recognition events, if only because no other function within such a complex could easily be attributed to them.
Taken together, our studies constitute, to the best of our knowledge, the first proteomic analysis of isolated carbohydrate-binding sperm proteins, and have unveiled a new constellation of proteins previously not identified by MS/MS analysis in bovine spermatozoa. Further analysis of these newly identified leads, along with those already known, should afford valuable insights into potential mechanisms of biogenesis, maturation, and function during fertilization. Furthermore, characterization of the gamete glycoproteome should effectively aid in clarifying the mechanisms of fertilization and constitute a useful resource for future development of male contraceptives and diagnosis of male infertility. In this context, future cloning and site-specific mutagenesis experiments should help clarify the functional domains of sperm surface proteins currently known or discovered in the future to be involved in early egg recognition events. Moreover, genetically modified (i.e. knock-out) mouse models will provide robust physiological data about the role played by these proteins and their domains. It will be interesting to see if a common group of functionalities is shared by all mammals, and if such functionalities are associated with analogous or distinctly different polypeptide chains in different species. Moreover, correlations between saccharide moieties, glycan receptors and the outcome of mammalian IVF should be examined in order to help evaluate the precise carbohydrate binding criteria necessary for fertility.