Data set of the proteome of fresh and frozen thawed stallion spermatozoa

This paper provides the dataset of proteins of stallion ejaculates before and after cryopreservation. The data report the analysis and identification of stallion sperm proteins obtained from the same ejaculates, each split into two subsamples. The first aliquot consisted of fresh spermatozoa and the second of frozen and thawed spermatozoa. Samples were analyzed using a UHPLC/MS/MS system consisting of an Agilent 1290 Infinity series UHPLC coupled to an Agilent 6550 Q-TOF mass spectrometer (Agilent Technologies, Santa Clara, CA, USA). We provide a data set of 2226 different proteins, with 2180 aligned to the equine proteome database. The data can be used to identify potential targets for improving sperm cryopreservation techniques. This data article refers to the article “Proteomic profiling of stallion spermatozoa suggests changes in sperm metabolism and compromised redox regulation after cryopreservation” (Martín Cano et al., 2020) [1].


Specifications

Subject
Veterinary Medicine and Veterinary Science; Reproductive Biology.

Specific subject area
Proteomics of stallion spermatozoa.

Type of data
Raw data

How data were acquired
Samples were analyzed using a UHPLC/MS/MS system consisting of an Agilent 1290 Infinity series UHPLC coupled to an Agilent 6550 Q-TOF mass spectrometer (Agilent Technologies, Santa Clara, CA, USA).

Data format
Processed; analyzed

Parameters for data collection
Fresh and frozen thawed equine spermatozoa using a split sample design from the same ejaculates.

Description of data collection
Comparative mass spectrometry proteomic profiling of fresh and frozen thawed equine spermatozoa.

Data source location
Cáceres

Value of the Data
Proteomic data reported here can be useful for cryobiologists and spermatologists. The data can be used to identify signaling pathways affected by the procedure and potential targets to be monitored to improve current cryopreservation protocols. In addition, these data can be used to identify important signaling pathways involved in stallion sperm biology. The whole dataset of proteins can be used in diverse applications, including as a guide to identify biomarkers of stallion sperm quality and potential fertility. The data reported point to new lines of research for the improvement of cryopreservation technologies. The target pathways and proteins provided here can help to identify specific steps in the procedure (cooling, toxicity of cryoprotectants, etc.) that can be modified to reduce damage.

Data description
A data set of 2203 proteins obtained from equine spermatozoa, together with their changes after cryopreservation, is provided. This information could lead to a better understanding of cryodamage in spermatozoa. The data set has been deposited in the ProteomeXchange Consortium [2] via the PRIDE partner repository with the dataset identifier PRIDE PXD018111.

Sample preparation
Spermatozoa were washed three times in PBS (600 × g, 10 min), and samples of stallion semen before and after cryopreservation were pelleted and stored at −80 °C until processing and analysis.
To ensure the absence of debris and other contaminating cells, the samples were inspected visually by phase contrast microscopy. Before preparation for proteomic analysis, individual fresh and thawed samples from each stallion were pooled to generate 6 different pools, with the aliquots of semen before and after cryopreservation obtained from the same stallion and ejaculate. Samples containing 200 × 10⁶ cells were then processed as described in [1]. The first step was solubilization of proteins in a buffer consisting of C7:C7BzO [3-(4-heptyl) phenyl-(3-hydroxypropyl) dimethylammoniopropanesulfonate], 7 M urea, 2 M thiourea and 40 mM Tris (pH 10.4). For this purpose, 20 µL of lysis buffer were added per 10 × 10⁶ spermatozoa. The resulting solution was thoroughly vortexed and then incubated under constant rotation at −4 °C for 60 min. The amount of protein obtained in this step was measured with the 2-D Quant Kit (GE Healthcare, Sevilla, Spain). All the samples were

UHPLC-MS/MS analysis
We processed the samples as described in [1]. Analysis was initiated with 2% of Buffer B (water/acetonitrile/formic acid, 10:89.9:0.1) in isocratic mode for 5 min. The proportion of Buffer B was then increased linearly to 45% over 40 min, and then to 95% over a further 15 min. It was then held constant for a further 5 min. After this first run of 70 min, we conditioned the column for the next run for 5 min using the initial conditions. We operated the mass spectrometer in positive mode, with the nebulizer gas pressure set to 35 psi. The gas temperature was 250 °C with a flow rate of 10 L/min, and the sheath gas was set at 300 °C with a flow rate of 12 L/min. The capillary spray, fragmentor and octupole VR voltages were set at 3500, 340 and 750 V, respectively. The extended dynamic range mode was used to acquire profile data for MS and MS/MS scans. The mass range for both MS and MS/MS was 50-1700 m/z, and the scan rates were 8 and 3 spectra per second for MS and MS/MS, respectively. Precursors were selected by abundance in auto MS/MS mode, with a maximum of 20 precursors per cycle. A ramped collision energy with a slope of 3-6 and an offset of −4.8 was used. Finally, the same ion was excluded after two consecutive scans.
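The solvent gradient described above can be sketched as a piecewise-linear function of time. This is only an illustration, not the instrument method file: the breakpoints assume the four stated segments run back to back from time zero, the 5 min column re-conditioning step is not modelled, and the names `GRADIENT` and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % Buffer B) breakpoints. Assumption: the stated segments
# (5 min isocratic, 40 min ramp, 15 min ramp, 5 min hold) are consecutive.
GRADIENT = [(0.0, 2.0), (5.0, 2.0), (45.0, 45.0), (60.0, 95.0), (65.0, 95.0)]


def percent_b(t_min: float) -> float:
    """Return the % of Buffer B at time t_min by linear interpolation."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time outside the programmed gradient")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1  # index of the segment start
    if i == len(GRADIENT) - 1:          # exactly at the final breakpoint
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 5, 25, 45, 52.5, 60, 65):
        print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

Writing the program as a list of breakpoints keeps the sketch easy to compare with the text: each pair corresponds to the end of one stated segment.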

Data processing
The Spectrum Mill MS Proteomics Workbench (Rev B.04.01, Agilent Technologies, Santa Clara, CA, USA) was used for data processing and analysis as described in [1]. The raw data were extracted with the following settings: no fixed modifications; [MH]+ 50-10,000 m/z; maximum precursor charge +5; retention time and m/z tolerance ± 60 s; minimum MS signal-to-noise ratio (S/N) 25; finding ¹²C signals. The MS/MS search was run against the updated UniProt/Horse protein database using the following criteria: no fixed modifications; carbamidomethylated cysteines as a variable modification; tryptic digestion with a maximum of 5 missed cleavages. The ESI-Q-TOF instrument setting was used with the following criteria: minimum matched peak intensity 50%, maximum ambiguous precursor charge +5, monoisotopic masses, peptide precursor mass tolerance 20 ppm, product ion mass tolerance 50 ppm, and calculation of reversed database scores. We validated peptide and protein data using auto thresholds with a false discovery rate (FDR) of 1.2%. Protein results were obtained as a protein summary using all validations, with score > 4 and % Scored Peak Intensity (SPI), defined as the percentage of the extracted spectrum that is explained by the database search result, set at 60.
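To make the tolerance and validation thresholds above concrete, the following minimal sketch shows how a ppm mass tolerance and the score and SPI cut-offs could be applied to search hits. It is not Spectrum Mill code: the function names and the example hit records are hypothetical, and treating the SPI setting as a minimum (SPI ≥ 60) is an assumption.

```python
PRECURSOR_TOL_PPM = 20.0  # peptide precursor mass tolerance (from the text)
PRODUCT_TOL_PPM = 50.0    # product ion mass tolerance (from the text)
MIN_SCORE = 4.0           # hits must have score > 4 (from the text)
MIN_SPI = 60.0            # assumption: "set at 60" is read as SPI >= 60


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_ppm(observed_mz: float, theoretical_mz: float, tol_ppm: float) -> bool:
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def passes_validation(hit: dict) -> bool:
    """Apply the score and SPI thresholds to one hit (hypothetical record)."""
    return hit["score"] > MIN_SCORE and hit["spi"] >= MIN_SPI


if __name__ == "__main__":
    # Hypothetical hits, for illustration only; not taken from the dataset.
    hits = [
        {"id": "hit_A", "score": 12.3, "spi": 85.0},
        {"id": "hit_B", "score": 3.9, "spi": 90.0},
        {"id": "hit_C", "score": 7.1, "spi": 55.0},
    ]
    print([h["id"] for h in hits if passes_validation(h)])
    print(within_ppm(1000.015, 1000.0, PRECURSOR_TOL_PPM))
```

At m/z 1000, a 20 ppm precursor tolerance corresponds to a window of ± 0.02 m/z, and a 50 ppm product ion tolerance to ± 0.05 m/z.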

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.