Glycomic and sialoproteomic data of gastric carcinoma cells overexpressing ST3GAL4

Gastric carcinoma MKN45 cells stably transfected with the full-length ST3GAL4 gene were characterised by glycomic and sialoproteomic analysis. Complementary strategies were applied to assess the glycomic alterations induced by ST3GAL4 overexpression. The N- and O-glycome data were generated in two parallel structural analyses, both based on PGC-ESI-MS/MS. Data on glycan structure identification and relative abundance in ST3GAL4 overexpressing cells and the respective mock control are presented. A sialoproteomic analysis based on titanium-dioxide enrichment of sialopeptides with subsequent LC-MS/MS identification was also performed. This analysis identified 47 proteins with significantly increased sialylation. The data in this article are associated with the research article published in Biochim Biophys Acta “Glycomic analysis of gastric carcinoma cells discloses glycans as modulators of RON receptor tyrosine kinase activation in cancer” [1].


Data accessibility: Data are within this article.

Value of the data
The data show the N- and O-glycome of MKN45 gastric carcinoma cells and of ST3GAL4 sialyltransferase overexpressing cells.
The data provide a list of glycoproteins with altered sialylated N-glycans in ST3GAL4 overexpressing cells, including several cancer-associated proteins.
These data are valuable as a source of novel biomarkers for gastric cancer.

Data
This data article includes the N- and O-glycome of MKN45 cells and assesses the glycosylation alterations induced by ST3GAL4 overexpression. In addition, we provide data on the proteins with significantly increased sialylated N-glycans upon ST3GAL4 overexpression.

Experimental design, materials and methods
Glycomic and sialoproteomic analyses were performed as described below comparing MKN45 cells stably transfected with the full-length ST3GAL4 gene or empty vector (mock). Glycome data and statistical evaluation are shown in Tables 1, 2, 3 and 4. Sialoproteome data and statistical evaluation are shown in Tables 5 and 6.

N-glycomic strategy I: sample preparation and PGC LC-ESI-MS/MS
Frozen cell pellets (10^7 cells) of mock or ST3GAL4 transfected MKN45 cells [2] were directly resuspended in 7 M urea, 2 M thiourea, 40 mM Tris, 2% CHAPS, 10 mM DTT and 1% protease inhibitor (Sigma-Aldrich, St. Louis, MO). The cell membranes were disrupted by 10 rounds of 10 s sonication at an amplitude setting of 16, with 1 min on ice between rounds, followed by shaking at 4°C overnight. To reduce the viscosity of the lysates, the DNA was degraded by adding 1 μl Benzonase® nuclease (250 units, Sigma-Aldrich) and incubating for 30 min at 37°C. To prevent refolding of proteins, 25 mM iodoacetamide was added for alkylation during 1 h in the dark. The lysates were centrifuged for 30 min at 14,000 rpm and the supernatants were transferred to a fresh tube.
Then, solubilized proteins were concentrated by loading 150 μl of supernatant onto a 10 kDa cut-off spinfilter (PALL, Port Washington, NY), spinning down for 5 min at 12,000 × g and washing 3 times with 100 μl of 50 mM NH4HCO3, pH 8.4. N-linked oligosaccharides were released in the spinfilter using 20 μl of 50 mM NH4HCO3 and PNGase F (5 mU, Prozyme, Hayward, CA) with incubation at 37°C overnight. Subsequently, the N-glycans were collected by washing 3 times with 20 μl H2O and dried in a Speedvac. Reactions were quenched with 1 μl of glacial acetic acid and N-glycan samples were desalted and dried as previously described [3]. N-glycan samples were subjected to LC-ESI-MS/MS analysis using a 10 cm × 250 μm I.D. column, prepared in-house, containing 5 μm porous graphitized carbon (PGC) particles (Thermo Scientific, Waltham, MA). Glycans were eluted using a linear gradient from 0% to 40% acetonitrile in 10 mM NH4HCO3 over 40 min at a flow rate of 10 μl/min. The eluted N-glycans were detected using an LTQ ion trap mass spectrometer (Thermo Scientific) in negative-ion mode with an electrospray voltage of 3.5 kV, a capillary voltage of −33.0 V and a capillary temperature of 300°C. Air was used as a sheath gas and mass ranges were defined depending on the specific structure to be analyzed. The data were processed using the Xcalibur software (version 2.0.7, Thermo Scientific) and manually interpreted from their MS/MS spectra.
All analyses were performed in three independent replicates and the results were subjected to statistical analysis (average, standard deviation and unpaired t-test; Table 1).
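The triplicate statistics mentioned above (average, standard deviation and unpaired t-test) can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes a pooled-variance (Student) t-test, uses hypothetical abundance values, and replaces an exact p-value with the standard two-sided critical values of the t distribution for 4 degrees of freedom (3 + 3 replicates).

```python
import math
import statistics

# Two-sided critical values of Student's t for 4 degrees of freedom,
# ordered from the strictest level (p < 0.001) to the loosest (p < 0.05).
T_CRIT_DF4 = [(8.610, "***"), (4.604, "**"), (2.776, "*")]


def unpaired_t(a, b):
    """Return (mean_a, sd_a, mean_b, sd_b, t) for a pooled-variance t-test."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    sd_a, sd_b = statistics.stdev(a), statistics.stdev(b)  # sample SD (n - 1)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    t = (mean_b - mean_a) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_a, sd_a, mean_b, sd_b, t


def stars_df4(t):
    """Map a t statistic to the star notation; valid only for 3 vs 3 replicates."""
    for critical_value, label in T_CRIT_DF4:
        if abs(t) >= critical_value:
            return label
    return "ns"


# Hypothetical relative abundances (%) of one glycan in three replicates.
mock = [10.0, 11.0, 12.0]
st3gal4 = [20.0, 21.0, 22.0]

mean_mock, sd_mock, mean_st, sd_st, t_value = unpaired_t(mock, st3gal4)
significance = stars_df4(t_value)
print(mean_mock, sd_mock, mean_st, sd_st, round(t_value, 3), significance)
```

With unequal variances or a different number of replicates, the degrees of freedom change and the critical values above no longer apply.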

N-glycomic strategy II: sample preparation and PGC nanoLC-ESI MS/MS
Frozen cell pellets (10^7 cells) of mock or ST3GAL4 transfected MKN45 cells were directly resuspended in 2 mL of lysis buffer (50 mM Tris-HCl, 100 mM NaCl, 1 mM EDTA and protease inhibitor at pH 7.4) and stored on ice for 20 min. The cells were lysed using a Polytron homogenizer at least three times for 10 s in a cold room. Cellular debris and unlysed cells were sedimented by centrifugation at 2000 × g for 20 min at 4°C. The supernatant was collected and the pellets were resuspended in 1 mL of lysis buffer and centrifuged again at 2000 × g for 20 min at 4°C. All the supernatants were combined, diluted in 20 mM Tris-HCl (pH 7.4) and ultracentrifuged at 120,000 × g for 90 min at 4°C. The supernatant was separated from the pellet containing the cell membrane proteins. The membrane proteins were resuspended in 150 μL of 100 mM ammonium bicarbonate buffer and lyophilized overnight. The dried samples were solubilized in 50 μL of 8 M urea and 10 μL aliquots were dot-blotted onto PVDF membranes as described previously [4]. N- and O-glycan release as well as PGC-nanoLC ESI MS/MS analysis on an amaZon ETD Speed ion trap (Bruker, Bremen, Germany) were performed as described in detail previously [4,5] (Table 2).

Table 1 Identified N-glycan structures I. Structures are represented by the mass to charge ratio (m/z) in which they were identified and quantified, by monosaccharide composition and proposed structure based on MS/MS analyses. The relative quantities were determined by base-peak intensity of extracted ion chromatograms. The average value (Avg) and standard deviation (SD) of triplicates are shown, as well as the p-value (t-test; *p < 0.05; **p < 0.01; ***p < 0.001). Increased or decreased relative abundance is shown in red or blue, respectively. Structures marked as not analyzed (na) were either not detected or overlapped with other structures in the given sample, precluding their quantification. Unknown linkage is represented by "?".

Table 2 Identified N-glycan structures II. Structures are represented by their [M-H] value and charge state in which they were identified and quantified, by monosaccharide composition and proposed structure based on MS/MS analyses. The relative quantities were determined by base-peak intensity of extracted ion chromatograms. The ratio of glycan abundance in ST3GAL4 transfected cells relative to mock transfected cells is shown in the right column. Increases or decreases larger than 2-fold in glycan abundance are highlighted in red or blue, respectively.
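The relative quantification and the fold-change highlighting described for the glycan tables can be illustrated with a short sketch. It assumes, as one plausible reading, that each base-peak intensity is expressed as a percentage of the summed intensities within a sample; the glycan names and intensity values are hypothetical, and the threshold parameter stands in for the 2-fold cut-off.

```python
def relative_abundance(intensities):
    """Express each extracted-ion base-peak intensity as % of the sample total."""
    total = sum(intensities.values())
    return {name: 100.0 * value / total for name, value in intensities.items()}


def fold_changes(mock, treated, threshold=2.0):
    """Return {glycan: (ratio, flag)} for treated relative to mock."""
    rel_mock = relative_abundance(mock)
    rel_treated = relative_abundance(treated)
    result = {}
    for name, mock_value in rel_mock.items():
        # Structures missing from either sample cannot be compared ("na").
        if name not in rel_treated or mock_value == 0:
            result[name] = (None, "na")
            continue
        ratio = rel_treated[name] / mock_value
        if ratio > threshold:
            flag = "increased"   # would be highlighted in red
        elif ratio < 1.0 / threshold:
            flag = "decreased"   # would be highlighted in blue
        else:
            flag = "unchanged"
        result[name] = (ratio, flag)
    return result


# Hypothetical base-peak intensities (arbitrary units).
mock_sample = {"glycan_A": 600.0, "glycan_B": 300.0, "glycan_C": 100.0}
st3gal4_sample = {"glycan_A": 200.0, "glycan_B": 300.0, "glycan_C": 500.0}

changes = fold_changes(mock_sample, st3gal4_sample)
for glycan, (ratio, flag) in changes.items():
    print(glycan, ratio, flag)
```

Passing `threshold=1.5` gives the cut-off stated for the O-glycan structures II table.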

O-glycomic strategy I: sample preparation and LC-ESI-MS/MS
After the removal of N-glycans, as described in the section "N-glycomic strategy I: sample preparation and PGC LC-ESI-MS/MS", the O-linked glycans were released from the glycoproteins retained in the spinfilter using reductive β-elimination (0.5 M NaBH4, 50 mM NaOH at 50°C, 16 h). Reactions were quenched with 1 μl of glacial acetic acid and glycan samples were desalted and dried as previously described [3]. Glycans were subjected to LC-ESI-MS/MS analysis using a 10 cm × 250 μm I.D. column, prepared in-house, containing 5 μm porous graphitized carbon (PGC) particles (Thermo Scientific, Waltham, MA). Glycans were eluted using a linear gradient from 0% to 40% acetonitrile in 10 mM NH4HCO3 over 40 min at a flow rate of 10 μl/min. The eluted O-glycans were detected using an LTQ ion trap mass spectrometer (Thermo Scientific) in negative-ion mode with an electrospray voltage of 3.5 kV, a capillary voltage of −33.0 V and a capillary temperature of 300°C. Air was used as a sheath gas and mass ranges were defined depending on the specific structure to be analyzed. The data were processed using the Xcalibur software (version 2.0.7, Thermo Scientific) and manually interpreted from their MS/MS spectra (Table 3).

O-glycomic strategy II: sample preparation and PGC nanoLC-ESI MS/MS
Sample preparation and analysis were performed as described in the section "N-glycomic strategy II: sample preparation and PGC nanoLC-ESI MS/MS" (Table 4).

Cell lysis, protein digestion and iTRAQ labeling
Cell pellets were redissolved in ice-cold Na2CO3 buffer (0.1 M, pH 11) supplemented with protease inhibitor (Roche complete EDTA free), PhosSTOP phosphatase inhibitor cocktail (Roche) and 10 mM sodium pervanadate on ice. The suspensions were tip probe sonicated for 20 s (amplitude = 50%) twice and incubated at 4°C for 1 h. The lysates were then centrifuged at 100,000 × g for 90 min at 4°C to enrich membrane proteins (pellet). The pellets were washed with 50 mM triethylammonium bicarbonate (TEAB) to remove any remaining soluble protein. The membrane fraction was resuspended directly in 6 M urea and 2 M thiourea, reduced in 10 mM DTT for 30 min and then alkylated in 20 mM IAA for 30 min at room temperature in the dark.
Samples were incubated with endoproteinase Lys-C (Wako, Osaka, Japan) for 2 h (1:100 w/w). Following the incubation, the samples were diluted 8 times with 50 mM TEAB (pH 8) and trypsin was added at a ratio of 1:50 (w/w) and left overnight at room temperature. Trypsin digestion was stopped by the addition of 2% formic acid and then the samples were centrifuged at 14,000 × g for 10 min to precipitate any lipids present in the sample. The supernatant was purified using in-house packed stage tips with a mixture of Poros R2 and Oligo R3 reversed phase resins (Applied Biosystems, Foster City, CA, USA). Briefly, a small plug of C18 material (3M Empore) was inserted in the end of a P200 tip, followed by packing of the stage tip with the resins (resuspended in 100% ACN) by applying gentle air pressure. The acidified samples were loaded onto the micro-column after equilibration of the column with 0.1% trifluoroacetic acid (TFA), washed twice with 0.1% TFA and peptides were eluted with 60% ACN/0.1% TFA. A small amount of purified peptides (1 μl) from each sample was subjected to a Qubit assay to determine the concentration, while the remaining samples were dried by vacuum centrifugation. Afterwards, peptides were redissolved in dissolution buffer and a total of 150 μg for each condition was labeled with 4-plex iTRAQ™ (Applied Biosystems, Foster City, CA) as described by the manufacturer. After labeling, the samples were mixed 1:1:1:1 and lyophilized by vacuum centrifugation.

Table 3 Identified O-glycan structures I. Structures are represented by the mass to charge ratio (m/z) in which they were identified and quantified, by monosaccharide composition and proposed structure based on MS/MS analyses. The relative quantities were determined by base-peak intensity of extracted ion chromatograms. The average value (Avg) and standard deviation (SD) of triplicates are shown, as well as the p-value (t-test; *p < 0.05; **p < 0.01). Increased or decreased relative abundance is shown in red or blue, respectively. Unknown linkage is represented by "?".

Sialic acid containing glycopeptide enrichment by TiSH protocol
The method used for enrichment of sialylated glycopeptides is a modification of the TiSH protocol [6] described in [7,8]. Briefly, samples were resuspended in loading buffer (1 M glycolic acid, 80% ACN, 5% TFA) and incubated with TiO2 beads (GL Sciences, Japan, 10 μm; using a total of 0.6 mg TiO2 beads per 100 μg of peptides). The supernatant containing the un-modified peptides was carefully separated. The TiO2 beads were sequentially washed with loading buffer, washing buffer 1 (80% ACN, 1% TFA) and washing buffer 2 (20% ACN, 0.1% TFA), saving the washings with the previous supernatant. The bound peptides were eluted with 1.5% ammonium hydroxide by shaking for 15 min. The eluted fraction containing the phosphopeptides and sialylated glycopeptides was dried by vacuum centrifugation and subjected to enzymatic deglycosylation in 20 mM TEAB buffer using 500 U of PNGase F (New England Biolabs, Ipswich, MA) and 0.1 U of Sialidase A (Prozyme, Hayward, CA) overnight at 37°C. To separate phosphorylated peptides from formerly glycosylated peptides, the samples were subjected to a second TiO2 enrichment procedure. The supernatant containing the deglycosylated peptides was saved and the beads were washed with 50% ACN, 0.1% TFA. The washing was added to the supernatant.
The deglycosylated fraction was desalted on an Oligo R3 stage tip column and dried prior to the HILIC fractionation [7]. All fractions were dried by vacuum centrifugation prior to nLC-MS/MS analysis.

Table 4 Identified O-glycan structures II. Structures are represented by their [M-H] value and charge state in which they were identified and quantified, by monosaccharide composition, type of core and proposed structure based on MS/MS analyses. The relative quantities were determined by base-peak intensity of extracted ion chromatograms. The ratio of glycan abundance in ST3GAL4 transfected cells relative to mock transfected cells is shown in the right column. Increases or decreases greater than 1.5-fold in glycan abundance are highlighted in red or blue, respectively.

Sialic acid containing glycopeptide analysis by nLC-MS/MS
Samples were resuspended in 6 μL of 0.1% TFA for analysis. Peptides were loaded on an in-house packed Reprosil-Pur C18-AQ (2 cm × 100 μm, 5 μm; Dr. Maisch GmbH, Germany) pre-column and separated on an in-house packed Reprosil-Pur C18-AQ (17 cm × 75 μm, 3 μm; Dr. Maisch GmbH, Germany) column using an Easy-nLC II system (Thermo Scientific, Bremen, Germany) and eluted at a flow of 250 nL/min. The mobile phases were water (A) and 95% acetonitrile (B), both containing 0.1% formic acid. Depending on the samples, the gradient was from 1% to 30% solvent B in 80 or 110 min, 30-50% B in 10 min, 50-100% B in 5 min and 8 min at 100% B. Mass spectrometric analyses were performed on an Orbitrap Fusion Tribrid system (Thermo Scientific, Bremen, Germany). MS scans (400-1200 m/z) were acquired in the Orbitrap at a resolution of 120,000 at 200 m/z for an AGC target of 5 × 10^5 ions and a maximum injection time of 60 ms. Data-dependent HCD MS/MS analysis at top speed of the most intense ions was performed at a resolution of 30,000 at 200 m/z for an AGC target of 5 × 10^4 and a maximum injection time of 150 ms, using the quadrupole to isolate the ions with an isolation window of 1.2 m/z, an NCE of 38% and a dynamic exclusion of 20 s.
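The multi-step gradient given above can be written as a small piecewise-linear function, which is handy for checking the solvent composition at a given retention time. This is an illustrative sketch only: the function name and the `first_step_min` parameter (80 or 110 min, depending on the sample) are hypothetical, and column re-equilibration is not modelled.

```python
def percent_b(t_min, first_step_min=80.0):
    """Return % solvent B at time t_min (minutes) for the stated gradient."""
    # (duration in min, %B at start, %B at end) for each gradient segment.
    segments = [
        (first_step_min, 1.0, 30.0),   # 1-30% B in 80 or 110 min
        (10.0, 30.0, 50.0),            # 30-50% B in 10 min
        (5.0, 50.0, 100.0),            # 50-100% B in 5 min
        (8.0, 100.0, 100.0),           # hold at 100% B for 8 min
    ]
    if t_min < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, b_start, b_end in segments:
        end = start + duration
        if t_min <= end:
            fraction = (t_min - start) / duration
            return b_start + fraction * (b_end - b_start)
        start = end
    raise ValueError("time is beyond the end of the gradient")


# Composition halfway through the first step of the 80 min method.
print(percent_b(40.0))
# Composition at the end of the first step of the 110 min method.
print(percent_b(110.0, first_step_min=110.0))
```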
The raw data were processed and quantified by Proteome Discoverer (version 1.4.1.14, Thermo Scientific) against SwissProt and Uniprot human reference databases by using Mascot (v2.3.02, Matrix Science Ltd, London, UK) and Sequest HT, respectively. Database searches were performed using the following parameters: precursor mass tolerance of 10 ppm, product ion mass tolerance of 0.02 Da, 1 missed cleavage for trypsin, carbamidomethylation of Cys and iTRAQ labeling on protein N-terminal and Lys as fixed modifications, and phosphorylation on S/T/Y and deamidation of Asn as dynamic modifications. The iTRAQ datasets were quantified using the centroid peak intensity with the "reporter ions quantifier" node. Only peptides with a q-value of at most 0.01 (Percolator), Mascot and Sequest HT rank 1, a Sequest HT ΔCn of 0.1, a Mascot score ≥ 18 and an XCorr score higher than 1.5, 2, 2.25 and 2.5 for charge states of +1, +2, +3 and +4, respectively, were considered for further analysis.
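The acceptance criteria listed above can be sketched as a simple filter over peptide-spectrum matches. This is an illustration under stated assumptions, not the Proteome Discoverer workflow: the `PSM` record and its field names are hypothetical, each match is assumed to come from one search engine, charge states above +4 are rejected because no XCorr cut-off is given for them, and the ΔCn criterion is left out because its direction is not specified in the text.

```python
from dataclasses import dataclass

Q_VALUE_MAX = 0.01                              # Percolator q-value ceiling
MASCOT_MIN = 18.0                               # Mascot score must be >= 18
XCORR_MIN = {1: 1.5, 2: 2.0, 3: 2.25, 4: 2.5}   # XCorr must exceed these values


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record."""
    engine: str      # "mascot" or "sequest"
    rank: int        # search-engine rank of the match
    q_value: float   # Percolator q-value
    score: float     # Mascot ion score or Sequest HT XCorr
    charge: int      # precursor charge state


def passes_filters(psm):
    """Return True if the match satisfies the stated rank, q-value and score cut-offs."""
    if psm.rank != 1 or psm.q_value > Q_VALUE_MAX:
        return False
    if psm.engine == "mascot":
        return psm.score >= MASCOT_MIN
    if psm.engine == "sequest":
        cutoff = XCORR_MIN.get(psm.charge)
        return cutoff is not None and psm.score > cutoff
    return False


# Hypothetical matches.
psms = [
    PSM("mascot", 1, 0.005, 25.0, 2),    # kept
    PSM("mascot", 1, 0.005, 17.9, 2),    # Mascot score too low
    PSM("sequest", 1, 0.010, 2.3, 3),    # kept (XCorr > 2.25)
    PSM("sequest", 1, 0.002, 2.0, 2),    # XCorr not higher than 2.0
    PSM("sequest", 2, 0.001, 4.0, 2),    # not rank 1
]
kept = [psm for psm in psms if passes_filters(psm)]
print(len(kept))
```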

Data normalization and significance analysis
Three biological replicates were analyzed and submitted to the statistical analysis. The log2 values of the measured intensities were normalized by the median. Modified peptides were merged with the R Rollup function (http://www.omics.pnl.gov) allowing for one-hit-wonders and using the mean of the normalized intensities for each peptide. Quantification of proteins was obtained by merging the un-modified peptides with the R Rollup function, considering at least 2 unique peptides, not allowing for one-hit-wonders and using the mean of the intensities. Then the mean over the experimental conditions for each peptide in each replicate was subtracted in order to merge data from different iTRAQ runs. Formerly sialylated glycopeptides containing the consensus motif for N-linked glycosylation (NXS/T/C; where X ≠ P) were normalized based on the protein expression in each of the replicates. Significant up/down-regulations between experimental conditions were calculated allowing a false discovery rate of 0.05. To this end, we applied combined limma and rank product tests [9], subsequently corrected for multiple testing according to Storey. Since spontaneous deamidation is frequently observed for asparagine residues, especially when the residue immediately C-terminal to the asparagine is glycine (NG), sites with the NGS/T/C motif are considered only potential glycosylation sites. To reduce the contribution from spontaneous deamidation in the final list, we first sorted for the N-linked consensus site (NXS/T/C) and then filtered for proteins that are membrane-associated in order to exclude intracellular proteins that are not N-linked glycosylated (Tables 5 and 6).

List of significantly decreased sialylated N-glycan modified peptides in the ST3GAL4 overexpressing cells compared to mock control shown with accession number, protein name, peptide sequence and the identified N-glycan site, fold in increase and the p-value.
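The consensus-motif filter described above (NXS/T/C with X ≠ P, and NG sites treated as only potential glycosylation sites) can be sketched with a regular expression. The function name and the example sequence are hypothetical, and the sketch scans every asparagine in a sequence, whereas in the actual data set only the deamidated asparagine of each identified peptide would be tested.

```python
import re

# Asn followed by any residue except Pro, then Ser, Thr or Cys. The lookahead
# keeps the match one residue wide so that overlapping sequons are all found.
SEQUON = re.compile(r"N(?=[^P][STC])")


def find_sequons(sequence):
    """Return (1-based position, motif, is_NG) for each N-X-S/T/C sequon."""
    sites = []
    for match in SEQUON.finditer(sequence):
        pos = match.start()
        motif = sequence[pos:pos + 3]
        # NG sites are prone to spontaneous deamidation, so flag them.
        is_ng = sequence[pos + 1] == "G"
        sites.append((pos + 1, motif, is_ng))
    return sites


# Hypothetical peptide: NGS (flagged), NPT (excluded), NVT and NAC (accepted).
example_sites = find_sequons("AANGSKNPTLNVTNAC")
for site in example_sites:
    print(site)
```

Sites flagged as NG would be kept apart from the confident sites rather than discarded outright, in line with their description as potential glycosylation sites.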