Development of a Novel Method for Analyzing Collagen O-glycosylations by Hydrazide Chemistry

In recent years, glycopeptide purification by hydrazide chemistry has become popular in structural studies of glycoconjugates; however, applications of this method have been almost completely restricted to analysis of the N-glycoproteome. Here we report a novel method for analyzing O-glycosylations unique to collagen, which are attached to hydroxylysine and include galactosyl-hydroxylysine and glucosyl-galactosyl-hydroxylysine. We established a hydrazide chemistry-based glycopeptide purification method using (1) galactose oxidase to introduce an aldehyde into glycopeptides and (2) formic acid with heating to elute the bound glycopeptides by cleaving the hydrazone bond. This method allows not only identification of O-glycosylation sites in collagen but also concurrent discrimination of two types of carbohydrate substitutions. In bovine type I and type II collagens, galactosyl-hydroxylysine/glucosyl-galactosyl-hydroxylysine-containing peptides were specifically detected on subsequent comprehensive liquid chromatography (LC)/MS analysis, and many O-glycosylation sites, including unreported ones, were identified. The position of glycosylated hydroxylysine, which is determined by our unambiguous and simple method, could provide insight into the physiological role of the modifications.

Molecular & Cellular Proteomics 11: 10.1074/mcp.M111.010397, 1-9, 2012.
Galactosyl-hydroxylysine (GHL) and glucosyl-galactosyl-hydroxylysine (GGHL) are O-glycosylations unique to collagen (1). They are also found in several other proteins having a collagen-like sequence, such as the C1q complement protein (2). A monosaccharide or disaccharide is attached to the hydroxylysine residue lying in the Y position of repeating collagenous Gly-X-Y triplets within a triple helix. Specific enzymes add the carbohydrates to hydroxylysine before triple helix formation in the endoplasmic reticulum (3,4). The carbohydrate content varies with collagen type, tissue, and physiological conditions. There are few glycosylated hydroxylysines in fibril-forming collagens. For example, type I collagen alpha 1 chain has approximately one residue per 1000 amino acid residues, the alpha 2 chain has approximately two residues, and type II collagen alpha 1 chain has ~10 residues (5). In contrast, most lysine residues are glycosylated in network-forming collagens such as type IV collagen. Although disorder-related alterations, such as overglycosylation in osteogenesis imperfecta (6) and spondyloepiphyseal dysplasia (7), have been reported, the biological significance of the carbohydrates remains unclear. The position of glycosylated hydroxylysine in a primary amino acid sequence could provide insight into the physiological role of the modifications.
In the 1970s, extensive structural studies revealed nearly all the primary amino acid sequences of major collagens by automated Edman degradation on a protein sequencer, but a few sites have remained uncertain because of some modifications (8). When collagen peptides generated by cyanogen bromide (CNBr) or by proteases such as trypsin are sequenced from the N terminus on the protein sequencer, the cycles of hydroxylysine glycosides appear as "blanks." Most likely because of their hydrophilicity, glycosylated hydroxylysine derivatives are not recovered into nonpolar solvents (n-butyl chloride and ethyl acetate), which results in no peaks on subsequent reverse-phase chromatography (8). Additional analyses were required to confirm their existence and distinguish between the two types of carbohydrate substitutions. For example, amino acid and carbohydrate analyses of the sequenced peptides were needed; therefore, the position of hydroxylysine glycosides has only been predicted by combining data from several experiments. Later, in the 1990s, Gooley et al. improved the automated Edman degradation methodology used for identification of the sites of N- and O-glycosylation by using polar solvents, such as trifluoroacetic acid and methanol, for transferring the released amino acids (9,10). However, it is still difficult to comprehensively determine the O-glycosylation sites of collagen with the protein sequencer, and time-consuming separation of each peptide is required before the analysis.
The recent development and high sensitivity of mass spectrometers are invaluable in various glycan studies; however, ionization suppression by co-existing nonglycosylated peptides and the low ionization efficiency of glycopeptides hamper exhaustive MS glycopeptide analysis without purification/enrichment procedures. To this end, Zhang et al. have recently developed a new method for purification of N-linked glycopeptides using hydrazide chemistry to identify glycosylation sites (11,12). In their method, glycopeptides are oxidized by sodium periodate to generate aldehydes and then captured on hydrazide resin by forming hydrazone bonds. After removing the nonbinding peptides, the peptide released by PNGase F is subjected to liquid chromatography (LC)/MS analysis. This strategy is becoming popular and is used for analysis of various N-glycoproteome samples (13-16). More recently, applications have been extended to include O-glycosylation analysis. For example, the carbohydrate structures and protein attachment sites of N- and O-linked sialylated glycopeptides have been identified after cleavage of the glycosidic bond at the terminal sialic acid by heating with formic acid (17). Another study has analyzed sialylated glycopeptides with an intact glycan by cleaving the hydrazone bond using hydrochloric acid under cooling conditions (18). O-GlcNAc-modified glycopeptides were released from the resin using hydroxylamine, thus converting the sugar to an oxime derivative (19). Although various approaches have been adopted, most applications of hydrazide chemistry have been restricted to analysis of the N-glycoproteome.
Purification methods for collagen O-glycosylations, such as affinity purification using lectin or specific antibodies, have not been reported. In this study, we present a simple and definitive method for purification of peptides containing GHL and GGHL by hydrazide chemistry and subsequent LC/MS analysis. Galactose oxidase was used to introduce an aldehyde instead of sodium periodate oxidation, which is known to destroy O-linked collagen carbohydrates and has been used to produce deglycosylated hydroxylysine (20,21). Although, in general, the enzyme preferentially oxidizes nonreducing terminal galactose such as that in GHL, GGHL can also be oxidized despite steric hindrance (1,22). Galactose oxidase was immobilized on Sepharose 4B so that the enzyme could be reused and its contaminating proteolytic activity reduced; this stabilizes the enzyme and leads to an increase in the total enzyme activity (22). In addition, in the oxidation reaction, we exploited the coordinated addition of catalase and horseradish peroxidase (HRP) to enhance galactose oxidase activity, as has been reported recently (23,24). Hydrazide resin was heated with 0.1% formic acid to elute bound glycopeptides by cleaving the hydrazone bond; consequently, the eluted peptide retained the carbohydrate chain, and we could discriminate the two types of carbohydrate substitutions concurrently with determining the modification site. We first optimized this glycopeptide purification method based on hydrazide chemistry, which is referred to as the "hydrazide method" in this article, using purified GHL/GGHL, and practical analysis was then performed on bovine type I and type II collagens.

EXPERIMENTAL PROCEDURES
Materials and Reagents-Galactose oxidase, catalase, HRP, tosyl phenylalanyl chloromethyl ketone-treated trypsin, and trypsin-chymotrypsin inhibitor were purchased from Sigma-Aldrich Co. (St. Louis, MO). Sodium borodeuteride and 13C6,15N2-L-lysine were purchased from Cambridge Isotope Laboratories, Inc. (Andover, MA), and CNBr-activated Sepharose 4B was purchased from GE Healthcare (Pittsburgh, PA). Affi-gel Hz and Mini Bio-Spin chromatography columns were purchased from Bio-Rad Laboratories, Inc. (Hercules, CA). All other chemicals were purchased from Sigma-Aldrich. Pepsin-solubilized type I collagen was prepared from bovine skin, and pepsin-solubilized type II collagen was prepared from bovine articular cartilage, as reported previously (25,26). GHL and GGHL were purified from natural sponge (8), and galactose oxidase was immobilized on CNBr-activated Sepharose 4B (22). In brief, 150 U of galactose oxidase was added to 0.4 g of Sepharose 4B and rotated overnight at 4°C. Unreacted sites were blocked using 1 M glycine at room temperature for 2 h.
Oxidation and Purification of GHL/GGHL Standards by the Hydrazide Method-Both GHL (100 nmol) and GGHL (100 nmol) were dissolved in reaction buffer (100 mM sodium phosphate and 150 mM NaCl, pH 7.2). The samples were incubated with immobilized galactose oxidase (30 U), catalase (115 U), and HRP (1.5 U) in Bio-Spin chromatography columns with end-over-end rotation at 37°C for 24 h. The samples were collected by centrifugation, and the pH was adjusted to 4-5 with hydrochloric acid. The oxidized samples were coupled to hydrazide resin (200 µl), which was washed with coupling buffer (100 mM sodium acetate and 150 mM NaCl, pH 4.8) before use, in Bio-Spin chromatography columns with end-over-end rotation at 37°C for 6 h. After the capture reaction, the hydrazide resin was washed with the coupling buffer, 1.5 M NaCl, 100% methanol, and distilled water. The coupled compounds were eluted with 0.1% formic acid by heating at 80°C for 30 min, and the resin was then washed once with hot 0.1% formic acid to collect the remaining glycopeptides. The samples [preoxidation, postoxidation (oxidant), those unbound to hydrazide resin (flow-through), and postelution (eluant)] were then reduced with 1 mM sodium borodeuteride at room temperature for 1 h in alkaline conditions adjusted by triethylamine. The reduced samples were acidified with formic acid, and 13C6,15N2-lysine was added as an internal standard. The samples were subjected to multiple reaction monitoring (MRM) analysis to calculate the oxidation efficiency and recovery rate of the GHL/GGHL standards.
MRM Analysis of GHL/GGHL and Their Oxidation Products-Analysis was performed on a hybrid triple quadrupole/linear ion trap 3200 QTRAP mass spectrometer (AB Sciex, Foster City, CA) equipped with an electrospray ionization (ESI) source. The instrument was coupled to an Agilent 1200 Series HPLC system (Agilent Technologies, Inc., Palo Alto, CA). The samples were loaded onto a ZIC-HILIC column (5 µm particle size, L × I.D. 150 mm × 2.1 mm; Merck SeQuant, Umea, Sweden) at a flow rate of 200 µl/min and separated by a binary gradient as follows: 90% solvent B (100% acetonitrile) for 5 min, linear gradient of 10-60% solvent A (0.1% formic acid in water) for 5 min, linear gradient of 60-90% solvent A for 15 min, and 90% solvent B for 5 min. The settings for MRM analysis were determined by the compound optimization function provided in Analyst 1.5.1 (AB Sciex). Capillary voltage was 4.5 kV, declustering potential was 25 V, heater gas temperature was 700°C, curtain gas was 15 psi, nebulizer gas was 60 psi, heater gas was 80 psi, and collision energy was 19 V (GHL and its derivatives), 27 V (GGHL and its derivatives), and 21 V ( 13 C 6 15

Purification of GHL/GGHL Peptides of Bovine Collagen by the Hydrazide Method-Bovine type I or type II collagen (1 mg) in the reaction buffer was denatured by heating at 60°C for 30 min and digested by trypsin (50 µg) at 37°C for 16 h. After heating at 100°C for 5 min, a portion of each sample was taken and diluted to 0.1 mg/ml with 0.1% formic acid for control samples, which were analyzed by LC/MS with three 10 µl injections for peptide identification without the hydrazide method. The trypsin-chymotrypsin inhibitor (100 µg) was added to the remaining samples. Subsequently, the hydrazide method was used in a manner analogous to that for the GHL/GGHL standards. After elution from the hydrazide resin, the glycopeptides were reduced to their original form using 1 mM sodium borohydride at room temperature for 1 h in alkaline conditions. The glycopeptide solutions were acidified with formic acid and subjected to LC-MS/MS analysis.
LC-Tandem MS (MS/MS) Analysis-Samples prepared by the glycopeptide purification procedure were analyzed by LC-electrospray ionization (ESI)-MS/MS. The analysis was performed on a 3200 QTRAP mass spectrometer coupled to an Agilent 1200 Series HPLC system. The sample solutions were loaded onto an Ascentis Express C18 HPLC Column (2.7 µm particle size, L × I.D. 150 mm × 2.1 mm; Supelco, Bellefonte, PA) at a flow rate of 200 µl/min and separated by a binary gradient as follows: 98% solvent A (0.1% formic acid in water) for 5 min, linear gradient of 2-50% solvent B (100% acetonitrile) for 15 min, 90% solvent B for 5 min, and 98% solvent A for 5 min. The eluting peptides were analyzed by the information-dependent acquisition (IDA) method, which selects the two most intense precursor ions of the prior survey MS scan and then subjects them to MS/MS fragmentation. The collision energy was automatically determined based on the mass and charge state of the precursor ions using rolling collision energy. Capillary voltage was 5.5 kV, declustering potential was 30-50 V, heater gas temperature was 600°C, curtain gas was 40 psi, nebulizer gas was 50 psi, and heater gas was 80 psi. MS scan and MS/MS acquisition were operated over the m/z range of 400-1300 and 100-1700, respectively.
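The precursor selection step of the IDA method can be illustrated with a minimal Python sketch. It only reproduces the rule stated above (the two most intense ions of the survey scan within m/z 400-1300); the function name, the peak representation, and the example intensities are assumptions made for illustration, and instrument features such as charge-state filtering or dynamic exclusion are deliberately omitted.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); hypothetical representation


def select_precursors(survey_scan: List[Peak], top_n: int = 2,
                      mz_min: float = 400.0, mz_max: float = 1300.0) -> List[Peak]:
    """Return the top_n most intense peaks inside the survey m/z window."""
    in_range = [p for p in survey_scan if mz_min <= p[0] <= mz_max]
    # Sort by intensity, highest first, and keep the requested number of ions.
    return sorted(in_range, key=lambda p: p[1], reverse=True)[:top_n]


# Illustrative survey scan; the intensities are invented for this sketch.
example_scan = [(350.2, 9.0e5), (641.0, 4.2e5), (695.0, 3.1e5), (812.4, 1.0e5)]
selected = select_precursors(example_scan)
```

In this illustration the intense ion at m/z 350.2 is ignored because it lies outside the survey window, so the two selected precursors are m/z 641.0 and 695.0.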
Database Search of MS/MS Spectra-ProteinPilot software 4.0 (AB Sciex) with the Paragon™ algorithm was used for peptide identification (27). Search parameters included digestion by trypsin, biological modifications ID focus, and 95% protein confidence threshold. The software defaults were adopted for other parameters, including the number of missed cleavages permitted and the mass tolerance for precursor ions and fragment ions. Two modifications, galactosyl hydroxylation and glucosyl-galactosyl hydroxylation of lysine (+178 and +340, respectively), were added to the search criteria of post-translational modifications. The probabilities of hydroxylation of proline and lysine were set higher than the defaults for collagen analysis. The acquired MS/MS spectra were searched against the UniProtKB/Swiss-Prot database (release 2011_08, July 2011) for Bos taurus species (5857 protein entries). We defined the confidence threshold of the identified peptides to be 90%.
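As a check on the nominal mass shifts used in the search, the following Python sketch derives them from elemental compositions. It assumes that each modification equals one hydroxylation (addition of one oxygen) plus one or two hexose residues (C6H10O5); the function and constant names are illustrative and are not part of ProteinPilot.

```python
# Monoisotopic atomic masses (Da) for the elements needed here.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def mono_mass(formula: dict) -> float:
    """Monoisotopic mass of an elemental composition such as {'C': 6, 'H': 10, 'O': 5}."""
    return sum(MONO[element] * count for element, count in formula.items())


HYDROXYLATION = {"O": 1}                     # Lys -> Hyl adds one oxygen
HEXOSE_RESIDUE = {"C": 6, "H": 10, "O": 5}   # glycosidically linked Gal or Glc


def glycosyl_hyl_shift(n_hexose: int) -> float:
    """Mass shift relative to unmodified lysine for Hyl carrying n_hexose sugars."""
    return mono_mass(HYDROXYLATION) + n_hexose * mono_mass(HEXOSE_RESIDUE)


ghl_shift = glycosyl_hyl_shift(1)    # galactosyl-hydroxylysine
gghl_shift = glycosyl_hyl_shift(2)   # glucosyl-galactosyl-hydroxylysine
print(f"GHL: +{ghl_shift:.4f} Da, GGHL: +{gghl_shift:.4f} Da")
```

Rounded to integers, the two shifts agree with the +178 and +340 values entered as search criteria.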
Sequence Confirmation of GHL/GGHL Peptides-The glycopeptide-containing fraction was collected, and the molecular weight distribution of the fraction was surveyed by MALDI-TOF/MS analysis performed on a Voyager Linear DE apparatus (AB Sciex). The remainder of the fraction was subjected to N-terminal amino acid sequence analysis on a Procise 492 protein sequencer (Applied Biosystems, Invitrogen Co., Carlsbad, CA) in pulsed-liquid mode.

RESULTS
The workflow of the hydrazide method for collagen O-glycosylations is shown in Fig. 1A, and the details of the chemical reactions of GGHL are described in Fig. 1B. GHL/GGHL standards or trypsin-digested glycopeptides of collagen were oxidized by galactose oxidase to generate an aldehyde group in galactose (Fig. 1B-1). After the pH of the solutions was adjusted to weakly acidic for the coupling reaction, the oxidized molecules were coupled to hydrazide resin by forming a hydrazone bond (Fig. 1B-2). Unbound and nonspecifically bound compounds were removed by extensively washing the resin, and the captured compounds were then released using formic acid and heat (Fig. 1B-3). Finally, the eluted samples were reduced and subjected to LC/MS analysis.
Oxidation Efficiency and Recovery Rate of GHL/GGHL Standards-Initially, we optimized the oxidation and elution in the hydrazide method for collagen using purified GHL/GGHL. To enhance the reactivity of galactose oxidase, we immobilized the enzyme on agarose beads (22), which increased the generation of oxidized GHL and GGHL approximately threefold and eightfold, respectively (supplemental Fig. S1), and we added catalase and HRP to further enhance the reactivity (23,24). The use of a large amount of galactose oxidase enhanced the oxidation efficiency of GGHL, but it resulted in a decrease in oxidized GHL because of the increased peroxidized side product, which was not detected in the GGHL oxidation (supplemental Fig. S2). Hence, the amount of galactose oxidase was set to 30 U, which was considered adequate for the concomitant oxidation of both GHL and GGHL. In addition, we determined the conditions for elution to be 0.1% formic acid (pH 2.8) at 80°C for 30 min, at which point the release of captured GHL and GGHL seemed to almost reach a plateau (supplemental Fig. S3). GHL/GGHL standards were stable during acid/heat treatment under these conditions (GHL, 99.8 ± 6.8%; GGHL, 98.7 ± 6.5%). Fig. 2 shows the efficiencies of oxidation and purification of GHL/GGHL in the hydrazide method. The oxidation efficiency of GHL/GGHL was estimated by relative quantification of their oxidants versus preoxidation samples. Similarly, the recovery rate was estimated from the relative amounts of eluants. The samples were reduced by sodium borodeuteride before MRM analysis, which permitted a simple comparison of the amounts of original GHL/GGHL with those of their deuterium-labeled oxidation products. The oxidation efficiency of GHL and GGHL was 11.16% and 7.03%, respectively, when oxidized by galactose oxidase only, and the recovery rate of GHL and GGHL was 4.26% and 1.44%, respectively.
Substantial amounts of oxidized GHL/GGHL flowed through the resin, but the recovery rate was not improved by increasing the amount of hydrazide resin or the coupling time (data not shown). The particularly low oxidation efficiency and recovery rate of GGHL were presumably due to the nonreducing terminal glucose, which may disturb the interactions of galactose with galactose oxidase and hydrazide resin.
By adding catalase and HRP, the galactose oxidase activity in the GGHL standard was enhanced approximately twofold, leading to a striking improvement in the total recovery rate; however, the addition was less effective for GHL because the peroxidized side product of GHL increased approximately threefold in the oxidant with the coordinated addition of catalase and HRP (data not shown). All the side products shifted to flow-through. The total recovery rate of GHL and GGHL oxidized in the presence of catalase and HRP was 4.66% and 3.57%, respectively. Although the recovery rates appeared to be somewhat low, they were sufficient to identify GHL/GGHL peptides by LC/MS, as described below.
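The relative quantification behind these percentages can be sketched in Python as follows. This is an illustration rather than the authors' data-processing procedure: it assumes that each MRM peak area is first normalized to the internal standard in the same sample and that labeled and unlabeled species respond equally; all peak areas shown are invented.

```python
def normalized(area: float, istd_area: float) -> float:
    """Analyte peak area divided by the internal standard area of the same run."""
    if istd_area <= 0:
        raise ValueError("internal standard area must be positive")
    return area / istd_area


def percent_of_input(product_area: float, product_istd: float,
                     input_area: float, input_istd: float) -> float:
    """Deuterium-labeled product expressed as a percentage of the preoxidation amount.

    With the postoxidation sample as 'product' this gives the oxidation efficiency;
    with the eluant as 'product' it gives the recovery rate.
    """
    return 100.0 * normalized(product_area, product_istd) / normalized(input_area, input_istd)


# Hypothetical peak areas chosen only to illustrate the arithmetic.
oxidation_efficiency = percent_of_input(1116.0, 1000.0, 10000.0, 1000.0)
recovery_rate = percent_of_input(426.0, 1000.0, 10000.0, 1000.0)
print(f"oxidation efficiency {oxidation_efficiency:.2f}%, recovery {recovery_rate:.2f}%")
```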
Purification and LC/MS Analysis of GHL/GGHL Peptides in Bovine Collagen-The hydrazide method described in the above section was used on bovine type I and type II collagens. The glycopeptides were purified by the same procedure as for GHL/GGHL standards after trypsin digestion of denatured collagen (Fig. 1A). The glycopeptides eluted from the hydrazide resin were analyzed by LC/MS after reducing the galactose to its original form using sodium borohydride. An example of the results of sequence analysis of the GHL/GGHL peptides of type II collagen is shown in Fig. 3. Specific peaks were detected in the total ion current chromatogram (TIC) of LC/MS analysis, whereas no peaks were found in the control nonoxidized sample (Fig. 3A). Purification by the hydrazide method resulted in a decreased number of total peptides, thereby permitting MS/MS acquisitions of most of the glycopeptides (data not shown). Fig. 3B shows the MS spectrum of the survey scan obtained at 14.76 min. The m/z 641.0 ion was subjected to MS/MS analysis (Fig. 3C) and was identified as GFOGQDGLAGPK*GAOGER (charge = 3+; O indicates hydroxyproline and K* indicates GHL). Similarly, the m/z 695.0 ion (Fig. 3D) was identified as the identical peptide with GGHL instead of GHL (GFOGQDGLAGPK#GAOGER; charge = 3+; K# indicates GGHL). The retention time of the GHL peptide on reverse-phase chromatography was somewhat longer than that of the GGHL peptide. A similar tendency was found for all other peptides containing GHL/GGHL. Because collision-induced dissociation preferentially cleaves glycosidic bonds rather than peptide bonds, most of the carbohydrate moieties were lost on MS/MS, so the spectra of the GHL and GGHL peptides closely resembled each other.
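The two precursor m/z values can be reproduced from the assigned sequences with a short Python calculation. The sketch below uses standard monoisotopic residue masses and treats K* and K# as lysine plus the GHL and GGHL mass shifts; the parsing convention (a modification symbol directly following K) and the function name are assumptions made for illustration, and only the residues occurring in this peptide are tabulated.

```python
# Monoisotopic residue masses (Da); 'O' denotes hydroxyproline as in the text.
RESIDUE = {
    "G": 57.021464, "A": 71.037114, "P": 97.052764, "O": 113.047679,
    "L": 113.084064, "D": 115.026943, "Q": 128.058578, "K": 128.094963,
    "E": 129.042593, "F": 147.068414, "R": 156.101111,
}
# Mass added to the preceding lysine: '*' = GHL, '#' = GGHL.
MODIFICATION = {"*": 178.047738, "#": 340.100561}
WATER = 18.010565
PROTON = 1.007276


def peptide_mz(sequence: str, charge: int) -> float:
    """m/z of the protonated peptide for the given charge state."""
    mass = WATER
    for symbol in sequence:
        # A modification symbol contributes its shift; a letter contributes a residue.
        mass += MODIFICATION[symbol] if symbol in MODIFICATION else RESIDUE[symbol]
    return (mass + charge * PROTON) / charge


mz_ghl = peptide_mz("GFOGQDGLAGPK*GAOGER", 3)
mz_gghl = peptide_mz("GFOGQDGLAGPK#GAOGER", 3)
print(f"GHL peptide: {mz_ghl:.2f}, GGHL peptide: {mz_gghl:.2f}")
```

The computed values round to 641.0 and 695.0, and their difference of about 54 m/z units at charge 3+ corresponds to one additional hexose (162 Da).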
Verification of the Accuracy of the Hydrazide Method-We performed sequence analysis by a protein sequencer to verify the accuracy of the identification of GHL/GGHL peptides using the hydrazide method. The fraction that contained the peaks shown in Fig. 3B was collected, and MALDI-TOF/MS analysis revealed that there were two major peaks in agreement with the molecular weight of the GHL/GGHL peptides identified by LC/MS analysis (Fig. 3E). The remainder of the fraction was analyzed by protein sequencing and determined to be GFOGQDGLAGP(X)GAOGER, where X indicates "blank" and is considered to be GHL or GGHL, as expected (Fig. 3F). These results support the reliability of purification and identification of GHL/GGHL peptides by the hydrazide method.
Database Search of Acquired MS/MS Spectra for Peptide Identification-Whole MS/MS spectra obtained by LC/MS were subjected to peptide identification using a database search against the UniProtKB/Swiss-Prot database for Bos taurus species. Because the structures of GHL and GGHL were already identified, the glycosylations of hydroxylysine were added to the search criteria of post-translational modifications. To exclude possible random database hits, we defined the following three criteria for identification of GHL/GGHL peptides: (1) location of GHL/GGHL as being at the Y position of Gly-X-Y triplets, (2) missed cleavage at GHL/GGHL by trypsin most likely because of steric hindrance (28,29), and (3) high confidence level of more than 90%.
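The three criteria can be expressed as a simple filter, sketched here in Python. The record layout (`PeptideHit`) and function names are hypothetical and do not correspond to ProteinPilot output; the Y-position test uses only the residues visible within the peptide itself, so a glycosylated lysine within the first two residues of a peptide is conservatively rejected.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    sequence: str      # one-letter sequence without modification symbols
    mod_index: int     # 0-based index of the glycosylated lysine
    confidence: float  # peptide confidence in percent


def in_y_position(seq: str, i: int) -> bool:
    """True if residue i is a K at the Y position of a Gly-X-Y triplet."""
    if seq[i] != "K" or i < 2 or seq[i - 2] != "G":
        return False
    # If a following residue is present, it must start the next triplet (Gly).
    return i + 1 >= len(seq) or seq[i + 1] == "G"


def has_missed_cleavage(seq: str, i: int) -> bool:
    """True if trypsin did not cleave after residue i (it is not C-terminal)."""
    return i < len(seq) - 1


def passes_criteria(hit: PeptideHit, min_confidence: float = 90.0) -> bool:
    return (in_y_position(hit.sequence, hit.mod_index)
            and has_missed_cleavage(hit.sequence, hit.mod_index)
            and hit.confidence > min_confidence)


# The confidence values below are invented for illustration.
accepted = PeptideHit("GFOGQDGLAGPKGAOGER", 11, 99.0)
rejected = PeptideHit("GFOGQDGLAGPK", 11, 99.0)  # modified K is C-terminal
print(passes_criteria(accepted), passes_criteria(rejected))
```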
The lists of identified GHL/GGHL peptides and their modification sites are summarized in Table I. The charge state of the glycopeptides was relatively high (ranging from +2 to +4); therefore, high molecular weight peptides resulting from trypsin miscleavage at GHL/GGHL were also identified. We found five GHL/GGHL peptides in type I collagen alpha 1 chain, eight in the alpha 2 chain, and 24 in type II collagen alpha 1 chain. Nearly all the previously reported GHL/GGHL sites, which have been determined by many independent studies (30-34), were identified simultaneously using the hydrazide method. In addition, we found three glycosylation sites in type I collagen and two sites in type II collagen, which had not been reported in previous studies (supplemental Fig. S4). Diverse glycopeptides were identified as both GHL- and GGHL-containing peptides, whereas peptides substituted only with GHL also existed. In contrast, without the hydrazide method, only a few GHL/GGHL peptides were identified despite the high sequence coverage (supplemental Table S1). Thus, identification of GHL/GGHL peptides and exhaustive glycosylation site determination using LC/MS analysis were dramatically enhanced by the hydrazide method.

DISCUSSION
In this study, we established the hydrazide method for O-glycosylation analysis of collagen. To avoid degradation of labile carbohydrate moieties in GHL/GGHL by periodate oxidation, galactose oxidase was used to introduce an aldehyde into GHL/GGHL peptides. Because galactose oxidase activity, especially in the GGHL standard, was markedly enhanced by the coordinated addition of catalase and HRP (23,24), we applied this three-enzyme system. We used acid/heat treatment to cleave the hydrazone bond so that the eluted peptides contained entire carbohydrate chains, which has also been previously achieved with alternative methods (18,19). Using the hydrazide method, we could identify the glycan structures concurrently with determining the position of O-glycosylated hydroxylysine. In addition, the LC/MS equipment enabled more comprehensive and sensitive sequence analysis than a protein sequencer, which has been used for sequence analysis of collagen in past studies (8).
During oxidation of the GHL standard, we observed a peroxide that has been reported as a main side product of galactose oxidase (23), whereas it was not detected in the GGHL reaction. Use of a large amount of galactose oxidase or the coordinated addition of catalase and HRP enhanced the oxidation efficiency, as shown for the GGHL standard. In contrast, excessive enhancement of the enzyme activity seemed to increase the peroxide generation, thereby reducing the oxidation efficiency for the GHL standard. The elution time was set to 30 min with acid/heat treatment because the release of GHL/GGHL seemed to almost reach a plateau, and a longer heating time under acidic conditions led to a lower recovery rate of GHL/GGHL peptides, probably because of peeling of the carbohydrate or peptide cleavage (supplemental Fig. S3). In addition, two glycopeptides containing the Asp-Pro (Hyp) bond were identified among the three sites in type I and type II collagens, although the Asp-Pro bond was reported to be hydrolyzed by acid/heat treatment (17,35). Thus, the elution conditions seemed to be adequate for the O-glycopeptide identification of collagen despite the possible specific peptide cleavages.

Table I. Results of Comprehensive LC-MS/MS Analysis of O-glycosylated Peptides of Collagen
Purified GHL/GGHL peptides were identified by LC/MS analysis based on the following three criteria: (i) location of GHL/GGHL as being at the Y position of Gly-X-Y triplets, (ii) missed cleavage at GHL/GGHL by trypsin, and (iii) high confidence (conf) level of more than 90%. The numbering of residues begins with the triple-helical portion of the chains. First residue corresponds to residue 178 of P02453 (type I collagen alpha 1 chain), residue 89 of P02465 (type I collagen alpha 2 chain), and residue 201 of P02459 (type II collagen alpha 1 chain). O indicates hydroxyproline, K* indicates GHL, K# indicates GGHL, and K indicates hydroxylysine. The m/z values are monoisotopic.
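The residue numbering convention stated above can be made explicit with a small Python helper. The start positions are taken from the text; the function names are illustrative, and the conversion simply assumes that helical residue 1 coincides with the stated UniProt residue.

```python
# UniProt residue corresponding to residue 1 of the triple-helical portion.
HELIX_START = {
    "P02453": 178,  # type I collagen alpha 1 chain
    "P02465": 89,   # type I collagen alpha 2 chain
    "P02459": 201,  # type II collagen alpha 1 chain
}


def helix_position(accession: str, uniprot_position: int) -> int:
    """Convert a UniProt residue number to triple-helical numbering."""
    start = HELIX_START[accession]
    if uniprot_position < start:
        raise ValueError("residue lies before the triple-helical portion")
    return uniprot_position - start + 1


def uniprot_position(accession: str, helix_pos: int) -> int:
    """Convert triple-helical numbering back to a UniProt residue number."""
    if helix_pos < 1:
        raise ValueError("helical numbering starts at 1")
    return helix_pos + HELIX_START[accession] - 1


print(helix_position("P02453", 178), uniprot_position("P02459", 264))
```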
Few nonspecific peptides lacking a GHL/GGHL modification site were found among the peptides identified by the hydrazide method. A large number of GHL/GGHL peptides were identified, and the modification sites were consistent with those reported previously. For normal peptide identification with trypsin digestion, only 1/50 of the protein concentration used for the hydrazide method was required, and approximately half of the total sequences were identified in type I and type II collagens. However, only a few GHL/GGHL peptides were identified with the direct LC/MS analysis because a number of co-eluted peptides competitively hampered the MS/MS acquisition of the glycopeptides in the IDA method. These data indicate that the hydrazide method efficiently purified GHL/GGHL peptides and enhanced the identification of the glycosylation sites. In addition, the hydrazide method provides a greatly enhanced signal-to-noise ratio of the glycopeptides (data not shown), which would offer a substantial advantage for quantitative analysis, such as stable isotope labeling by amino acids in cell culture (SILAC). Interestingly, some new glycosylation sites, which had not been reported in previous sequence analyses using protein sequencing, were also identified. Most O-glycosylation sites of collagen are partially glycosylated and exist as mixtures of lysine, hydroxylysine, GHL, and GGHL; therefore, a possible O-glycosylation site, which is partially substituted by lysine or hydroxylysine, could be identified as lysine or hydroxylysine on N-terminal sequence analysis in some cases. It is presumed that several GHL/GGHL modification sites have been missed in this way, which further demonstrates the advantage of the hydrazide method in collagen analysis.
Among the GHL/GGHL peptides identified with a confidence level of more than 90%, most of the glycosylated lysine residues lie in the Y position of Gly-X-Y triplets, and missed trypsin cleavages were observed at these residues, which is consistent with our prediction. High molecular weight peptides resulting from missed trypsin cleavage at glycosylated hydroxylysine were identified with a high charge state of the precursor ions. However, the peptide containing consecutive GHL/GGHL presumably exceeds the MS range, and thus, the modification site could not yet be identified. Previously reported GHL/GGHL sites of type I collagen were fully covered in this study, but a few sites of type II collagen were not found (sites 264, 270, 648, and 657), probably because of the above-mentioned or other factors.
Although the amino acid sequences were clearly determined by LC/MS analysis, most GHL/GGHL sites were determined by the predicted molecular weight of the modifications. To dispel uncertainty about the site assignment, the accuracy of this collagen O-glycosylation analysis was verified by N-terminal amino acid sequence analysis. Thus, the hydrazide method is considered to be a highly accurate way of identifying O-glycopeptides of collagen. A few peptides were cleaved at sites not typical of trypsin, most likely by other protease activity, which has been reported as a contaminant in commercially available galactose oxidase (22,36). Although the proteolytic activity in galactose oxidase could be suppressed by immobilization on agarose beads and co-addition of the trypsin-chymotrypsin inhibitor, slight proteolytic activity may have remained. Repurification of the enzyme or use of recombinant galactose oxidase may be required for more precise analysis.
Recently, localization of hydroxylysine glycosides in the collagenous domain of the C1q complement protein has been studied by comprehensive proteomic analysis (37). In that study, hydroxylysine glycosides were clearly identified without glycopeptide purification procedures. Use of a highly sensitive nano-LC/linear quadrupole ion trap-orbitrap instrument and the relatively small size of the collagenous domain of the C1q protein are considered to permit direct analyses of the GHL/GGHL-containing peptides. In comparison, our hydrazide method permits comprehensive identification of hydroxylysine glycosides of the collagen macromolecule, which contains about 1000 amino acid residues per individual chain, with conventional LC/MS.
In this study, we developed a simple procedure in which captured glycopeptides were eluted from hydrazide resin by heating with 0.1% formic acid. Because the purpose of this study was to determine GHL/GGHL modification sites, this simple purification method was used. There are several other ways to release glycopeptides from hydrazide resin. One is to convert the glycopeptides to oxime derivatives with aminooxy reagents on hydrazide resin (19). Although our hydrazide method enabled us to identify the O-glycosylation sites of collagen, the low oxidation efficiency and recovery rate of GHL/GGHL remain challenges for precise quantitative analysis. We plan to further develop the method to establish quantitative analysis concurrently with identification of the glycosylation site by optimizing the conditions and, for example, using isobaric labeling on hydrazide resin or metabolic amino acid labeling by SILAC. GHL/GGHL sites of human type I collagen, which was purified from the culture supernatant of skin fibroblasts, have also been identified by using the hydrazide method (unpublished data). Development of a quantitative hydrazide method may enable us to identify the association of hydroxylysine glycosides with disorders, such as overglycosylation in osteogenesis imperfecta, and the physiological role of the modifications.