Label-free Quantitative Proteomics and N-Glycoproteomics Analysis of KRAS-activated Human Bronchial Epithelial Cells*

Mutational activation of KRAS promotes various malignancies, including lung adenocarcinoma. Knowledge of the molecular targets mediating the downstream effects of activated KRAS is limited. Here, we identify KRAS target proteins and N-glycoproteins using human bronchial epithelial cells with and without the expression of activated KRAS (KRASV12). Using OFFGEL peptide fractionation and the hydrazide method combined with subsequent LTQ-Orbitrap analysis, we identified 5713 proteins and 608 N-glycosites on 317 proteins in human bronchial epithelial cells. Label-free quantitation of 3058 proteins (≥2 peptides; coefficient of variation (CV) ≤ 20%) and 297 N-glycoproteins (CV ≤ 20%) revealed the differential regulation of 23 proteins and 14 N-glycoproteins caused by activated KRAS, 84% of which are novel. An informatics-assisted IPA-Biomarker® filter analysis prioritized some of the differentially regulated proteins (ALDH3A1, CA2, CTSD, DST, EPHA2, and VIM) and N-glycoproteins (ALCAM, ITGA3, and TIMP-1) as cancer biomarkers. Further, integrated in silico analysis of microarray repository data of lung adenocarcinoma clinical samples and cell lines containing KRAS mutations showed positive mRNA fold changes (p < 0.05) for 61% of the KRAS-regulated proteins, including the biomarker proteins CA2 and CTSD. The most significant discovery of the integrated validation is the down-regulation of FABP5 and PDCD4. A few validated proteins, including the tumor suppressor PDCD4, were further confirmed as KRAS targets by shRNA-based knockdown experiments. Finally, the studies on KRAS-regulated N-glycoproteins revealed structural alterations in the core N-glycans of SEMA4B in KRAS-activated human bronchial epithelial cells and a functional role of N-glycosylation of TIMP-1 in the regulation of lung adenocarcinoma A549 cell invasion. Together, our study represents the largest proteome and N-glycoproteome data sets for HBECs, which we used to identify several novel potential targets of activated KRAS that may provide insights into KRAS-induced adenocarcinoma and have implications for both lung cancer therapy and diagnosis.

Lung cancer is the leading cause of cancer deaths worldwide in males and the second in females (1). Adenocarcinoma is the single most common form of lung cancer, and it is frequently associated with mutationally activated RAS oncogenes (2,3). Activated RAS oncogenes are known to be involved in various biological processes that are relevant to cancer, including transcriptional and translational regulation, cell-cycle progression, cytoskeleton organization, apoptosis, survival, cell proliferation, adhesion, and motility (4). KRAS mutations account for 90% of the total RAS mutations, and ~95% of these occur in codon 12 or 13 in lung adenocarcinoma (3,5,6). These mutant forms of KRAS, or activated KRAS, are associated with poor prognosis, chemoresistance, and invasion and metastasis of cancer cells (7-9). Given the high prevalence and significance of mutational activation of KRAS in lung adenocarcinoma, identifying the activated KRAS target proteins is important for understanding the molecular mechanisms responsible for pathogenic development and progression. Several studies have identified activated KRAS targets at the transcriptome level (10,11) and phosphorylation level (12,13). However, protein changes associated with oncogenic KRAS have never been studied at the proteome level.
In addition to protein changes, altered glycoproteins have also been used as biomarkers for cancer diagnosis (14,15). Several studies have shown that protein glycosylation, which is one of the most abundant posttranslational modifications, plays a key role in tumorigenesis as well as in induction of tumor invasion and metastasis (14-17). However, the genes responsible for aberrant glycosylation and their downstream targets remain to be elucidated in cancers. Oncogenic RAS is involved in branching of N-linked carbohydrates by regulating the activity of N-acetylglucosaminyltransferase V (GnT-V) (18,19). In agreement, a positive correlation between mutationally activated KRAS and increased cell surface N-linked glycans has been observed in cancer cells (20). N-linked glycosylation is involved in cancer progression by disrupting receptor tyrosine kinase signaling in tumor cells (21,22). The regulation of cell-cell adhesion by N-glycosylation and its impact on the promotion of cancer metastasis have been shown (18,23). Further, Yan et al. have demonstrated that activated KRAS V12 leads to glycosylation of the integrin β1 chain of collagen, resulting in the inhibition of apicobasal polarization of epithelial cells. Another study has shown that activated RAS regulates sialylation of integrin β1, which may lead to tumorigenesis and metastasis (24,25). Little else is known about the glycoprotein alterations linked with oncogenic KRAS, and identifying the KRAS target N-glycoproteins will be important for understanding the role of molecular glycosylation in neoplastic processes.
Quantitative mass spectrometry (MS)-based proteomics is one of the most powerful tools to study the expression profiles of activated genes of interest (13,26,27). In this study, we established the proteomic and N-glycoproteomic profiles of human bronchial epithelial cell (HBEC)¹ lines that are isogenic and differ only by the presence or absence of activated KRAS. The quantitative proteomic analysis of HBEC lines in two biological replicates revealed differential regulation of 23 proteins caused by activated KRAS, of which 17 are reported here for the first time. We have validated the candidate proteins using both an in silico analysis of microarray repository data and an experimental analysis. The quantitative N-glycoproteomic analysis of HBEC lines in two biological replicates identified 14 N-glycoproteins regulated by activated KRAS, all reported here for the first time. Of the N-glycoprotein candidates, we have shown the role of TIMP-1 glycosylation in the promotion of lung adenocarcinoma. Finally, we have demonstrated the advantage of our approach, which combines isogenic cells, OFFGEL fractionation, hydrazide chemistry, MS-based label-free quantitation, integrated data validation, and functional analysis, in studying oncogene-induced protein and N-glycoprotein alterations on a proteome-wide scale.

EXPERIMENTAL PROCEDURES
Cell Culture-Cdk4 (cyclin-dependent kinase 4)/hTERT (human telomerase reverse transcriptase)-immortalized human bronchial epithelial cells (HBEC3-KT) and KRASV12-transformed HBEC3-KT cells were grown in K-CFM medium containing 50 μg/ml bovine pituitary extract (BPE) and 5 ng/ml EGF as previously described (28,29). These cells were the kind gift of Dr. John D. Minna (University of Texas Southwestern Medical Center, Dallas, TX, USA). The human lung adenocarcinoma A549 and H322 and large cell carcinoma H1299 cell lines were purchased from the American Type Culture Collection. A549 cells were grown in DMEM supplemented with 10% fetal bovine serum and 1% antibiotics. H1299 and H322 cells were grown in RPMI medium supplemented with 10% fetal bovine serum and 1% antibiotics. The cells were grown at 37°C with 5% CO2.
Lysis and Tryptic Digestion-To prepare cell lysates, cells were grown to ~85% confluence and lysed in modified RIPA buffer containing 1% Nonidet P-40 (Igepal CA-630), 300 mM NaCl, 1 mM EDTA, 0.5 mM dithiothreitol, 1 mM NaVO3, 10 mM NaF, and a mixture of protease inhibitors (Roche, Indianapolis, IN). Supernatants were collected from cell lysates after centrifugation at 15,000 × g for 5 min at 4°C. The in-solution tryptic digestion was performed using a standard protocol. The lysates were denatured using 6 M urea, reduced with 10 mM dithiothreitol, and alkylated with 40 mM IAA. Excess IAA was quenched by adding 30 mM dithiothreitol. The urea concentration in the sample solution was reduced to 1 M by diluting the samples with 25 mM NH4HCO3, and proteins were digested overnight with 20 μg of trypsin (Promega, Madison, WI) per mg of protein. Protein digestion was stopped by adding formic acid to a final concentration of 0.1%.
OFFGEL Fractionation-Tryptic peptide samples were desalted using a C18 cartridge (Waters, Milford, MA) and fractionated by means of a 3100 OFFGEL fractionator (Agilent Technologies, Santa Clara, CA). Peptides were fractionated according to the manufacturer's protocol using an Immobiline™ DryStrip, pH 3-10, 13 cm (GE Healthcare). Twelve fractions were collected from the fractionator; however, fractions 8 and 9 were combined, as were fractions 10 and 11, because fractions 8-11 showed low focusing quality relative to the other fractions (30). The fractions were desalted using C18, dried, reconstituted in 0.1% trifluoroacetic acid, and subjected to MS analysis.
N-Glycopeptide Enrichment-N-glycopeptides were purified from tryptic peptides by means of the hydrazide chemistry method (31) with modifications. Briefly, tryptic digested samples were desalted using C18 and dried in a SpeedVac. The dried samples were dissolved in coupling buffer (100 mM NaOAc, 150 mM NaCl, pH 5.5) at a concentration of 2 mg/100 μl and used for the following reactions. An oxidizing agent, sodium periodate, was added to the peptide solution at 15 mM final concentration and incubated in the dark for 30 min. Excess oxidant was quenched by adding sodium sulfite at 20 mM final concentration and incubating for 10 min. The coupling reaction was performed at 37°C overnight after adding hydrazide resin (Bio-Rad, Hercules, CA) to the quenched peptide solution at 20 mg/ml. The resin was washed twice sequentially with water, 1.5 N NaCl, MeOH, and 80% acetonitrile. PNGase F (New England Biolabs) at a concentration of 1 μl/2-6 mg of crude protein was used to cleave the N-linked peptides from the carbohydrate moiety attached to the resin. The enzymatic cleavage was performed overnight at 37°C in 18O-water containing 100 mM NH4HCO3. The supernatant was collected by gentle centrifugation and combined with the supernatant of an 80% ACN wash fraction. The peptide solution was desalted, dried, reconstituted with 0.1% trifluoroacetic acid, and subjected to MS analysis.
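The mass tag produced by PNGase F release in 18O-water can be illustrated with a short calculation. This is a minimal sketch, assuming standard monoisotopic atomic masses and assuming that one 18O atom is incorporated per formerly glycosylated asparagine; the helper name `shifted_mz` is hypothetical and is not part of any software used in this study.

```python
# Monoisotopic atomic masses in Da (standard reference values).
MASS_H = 1.00782503
MASS_N = 14.00307401
MASS_O16 = 15.99491462
MASS_O18 = 17.99915961

# Asn -> Asp replaces the side-chain amide NH2 with OH:
# the residue loses one N and one H and gains one O.
DEAMIDATION_16O = MASS_O16 - MASS_N - MASS_H  # ordinary deamidation
DEAMIDATION_18O = MASS_O18 - MASS_N - MASS_H  # deamidation in 18O-water


def shifted_mz(mz, charge, n_sites=1):
    """Return the m/z expected after n_sites Asn residues are converted
    to 18O-labeled Asp (hypothetical helper for illustration only)."""
    return mz + n_sites * DEAMIDATION_18O / charge


print(round(DEAMIDATION_16O, 4), round(DEAMIDATION_18O, 4))
```

Under these assumptions the 18O label adds roughly 2 Da on top of ordinary deamidation, which helps distinguish enzymatically released N-glycosites from spontaneous deamidation.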
MS Analysis of Peptides and N-Glycopeptides-LC-MS/MS analysis of the peptides and N-glycopeptides was conducted on an LTQ-Orbitrap XL ETD mass spectrometer (Thermo Fisher Scientific, San Jose, CA) equipped with a nanoelectrospray ion source (New Objective, Inc., Woburn, MA) and an Accela LC system (Thermo Fisher Scientific, San Jose, CA). The sample was injected (5 μl) at a flow rate of 10 μl/min onto a self-packed precolumn (150 μm I.D. × 30 mm, 5 μm, 200 Å). Chromatographic separation was performed on a self-packed reversed-phase C18 nanocolumn (75 μm I.D. × 200 mm, 3 μm, 200 Å) using 0.1% formic acid in water as mobile phase A and 0.1% formic acid in 80% acetonitrile as mobile phase B with a split flow rate of 300 nL/min. The full-scan mass range was set from m/z 320-2000 with resolution 60,000 at m/z 400. The ten most intense ions were sequentially isolated for MS2 by LTQ. The electrospray voltage was maintained at 1.8 kV and the capillary temperature was set at 200°C.

¹ The abbreviations used are: HBECs, immortalized human bronchial epithelial cells; BRs, biological replicates; CV, coefficient of variation; IDEAL-Q, ID-based elution time prediction by fragmental regression; KRAS, v-Ki-ras2 Kirsten rat sarcoma viral oncogene; KRASV12, activated KRAS or lung cancer-specific KRAS allele; LTQ, linear trap quadrupole; NSCLC, non-small cell lung cancer; PNGase F, peptide-N-glycosidase F; shRNA, short hairpin RNA; TRs, technical replicates; XIC, extracted ion chromatogram.
Database Search of MS/MS Spectra-LTQ-Orbitrap MS/MS data were processed using a Mascot search. The search options used in this study were the ipi_HUMAN_3.74 database, trypsin as the digestion enzyme, up to two missed cleavages, a fragment ion mass tolerance of 0.5 Da, and a parent ion tolerance of 10 ppm. Variable modifications were set to Oxidation (M) and Protein acetyl (N-term), and the fixed modification was set to Carbamidomethyl (C). For N-glycopeptide analysis, deamidation of asparagine to aspartic acid with incorporation of 18O was set as an additional variable modification. A total of 89,575 protein entries were searched in the database. We added the sequence of bovine serum albumin (BSA) to ipi_HUMAN_3.74 to identify our internal standard. Peptide ions were filtered using cut-off scores based on a p value < 0.01. We searched against the decoy database sequences to estimate the false discovery rate (FDR) in our study. This revealed an FDR of < 1 ± 0.4% for the proteomic analysis in 140 technical replicates and < 2.2 ± 1.6% for the N-glycoproteome analysis in 16 technical replicates. The MS/MS spectra of proteins identified on the basis of a single peptide and of the identified N-glycopeptides are included in supplemental Data S1-S5.
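The decoy-based FDR estimate mentioned above can be sketched as follows. This is a simplified illustration that assumes a separate decoy search and the common estimator FDR = (decoy hits) / (target hits) above a score threshold; the exact calculation performed by Mascot may differ, and the function name and example scores are hypothetical.

```python
def decoy_fdr(scores, is_decoy, threshold):
    """Estimate the FDR as decoy hits divided by target hits among
    all peptide-spectrum matches scoring at or above the threshold."""
    targets = 0
    decoys = 0
    for score, decoy in zip(scores, is_decoy):
        if score < threshold:
            continue
        if decoy:
            decoys += 1
        else:
            targets += 1
    # With no accepted target hits there is nothing to report an FDR for.
    return decoys / targets if targets else 0.0


# Hypothetical toy data: four target hits and one decoy hit pass the cut-off.
example_scores = [50, 45, 40, 35, 30, 20]
example_decoy = [False, False, False, True, False, True]
example_fdr = decoy_fdr(example_scores, example_decoy, threshold=30)
```

Raising the score threshold generally lowers the estimated FDR at the cost of fewer accepted identifications.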
IDEAL-Q-based Label-free Quantitative Analysis-The label-free quantitative analysis of peptides and N-glycopeptides was performed using the IDEAL-Q (ID-based elution time prediction by fragmental regression) software, v1.0.1.3 (32). The raw data files acquired from the LTQ-Orbitrap were converted into mzXML format files by ReAdW. Raw MS data files were processed by Raw2msm (v1.10) to generate the peak lists using default parameters. The peak lists were analyzed by Mascot (v2.2.06) and the results were exported in eXtensible Markup Language (XML) format. After data conversion, the peptide identification results from each LC-MS/MS run were loaded and processed to establish a global peptide information list according to the criteria described in IDEAL-Q. First, the elution time of unidentified ions was predicted from the peptide entry list using linear regression across different LC-MS/MS runs followed by a fragmental correction function. This allows for the detection and assignment of peptide ions that remained unidentified because of low abundance or limitations of the duty cycle of the mass spectrometer. To ensure correct assignments, the detected peptide peaks were checked by SCI validation using the following criteria: (a) signal-to-noise (S/N) ratio > 2, (b) correct charge state, and (c) correct isotope pattern. The data values were mean normalized and log centered by means of an IDEAL-Q algorithm and were used for the following processes. See the supplementary method (supplemental Data S1) for further information on protein and N-glycoprotein quantitation.
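The linear-regression step of the elution time prediction can be illustrated with a small sketch. It is not the IDEAL-Q implementation: it assumes an ordinary least-squares fit between the elution times of peptides identified in both a reference run and a second run, it omits the fragmental correction entirely, and all function names and values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_elution(ref_times, run_times, query_ref_time):
    """Predict where a peptide seen only in the reference run should
    elute in the other run, using peptides identified in both runs."""
    slope, intercept = fit_line(ref_times, run_times)
    return slope * query_ref_time + intercept


# Hypothetical elution times (min) of four peptides identified in both runs.
reference_run = [10.0, 20.0, 30.0, 40.0]
second_run = [12.0, 22.0, 32.0, 42.0]
predicted = predict_elution(reference_run, second_run, 25.0)
```

The predicted time defines a window in which to look for the precursor ion of a peptide that was not selected for MS/MS in that run.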
Knockdown Experiments-The lentivirus-based knockdown approach has been previously described (13). The pLKO.1-short hairpin RNA (shRNA) plasmids encoding shRNAs with sequences targeting the firefly luciferase and human KRAS (sh1: 5′-CTATGGTCCTAGTAGGAAAT-3′; and sh2: 5′-GAGGGCTTTCTTTGTGTATTT-3′) were purchased from the National RNAi Core Facility in Taiwan and introduced into HEK293T cells with the lentiviral packaging vectors pMD.G and pCMVΔ8.91. Viruses were collected from the medium 60 h after transfection. For knockdown experiments, cells were infected with the collected viruses over 24 h in the presence of Polybrene (at an MOI of 3 for H322 and 5 for A549). Cells were cultured in designated medium for 24 h prior to lysis.
Wound-healing Assay-The migration wound-healing assay was performed in six-well plates. Cells were transfected with empty vector or vectors containing cDNA of TIMP-1(WT) or TIMP-1(N30Q). A scratch was made through the confluent cell monolayer using a 200 μl pipet tip. The detached cells were removed by washing with PBS and the cells were then allowed to migrate into the scratched areas. Photographs of the migrating cells were taken at the indicated time points following wound initiation.
Invasion Assay-The cell invasion assay was performed using a 48-well Boyden chamber (Neuro Probe, Inc.). Briefly, medium containing 5% serum was added to the lower chamber wells. Fibronectin (Millipore), a chemotactic agent, was used at a concentration of 1 mg/ml. A Matrigel-coated (9.6 mg/ml; BD Bioscience) polycarbonate filter (pore size 8 μm) was placed above the lower chamber and the upper chamber was then assembled above the filter. The transfected cells (A549 or H1299) were trypsinized, washed in PBS, suspended in 0.1% serum-containing medium, loaded into the upper chamber wells, and incubated at 37°C in 5% CO2 for 18 h. After incubation, the cells that had traversed the filter were fixed with methanol, washed in PBS, and stained with Hoechst 33258 (Molecular Probes). The degree of invasion was scored by means of ImageJ software.

RESULTS
OFFGEL Fractionation Followed by LTQ-Orbitrap Analysis Identified Over 5500 Proteins in Normal and KRASV12-transformed HBECs-To characterize and quantify the proteomic changes induced by activated KRAS, we established a global strategy outlined in Fig. 1. In brief, whole-cell lysates prepared from HBEC3-KT (hereafter 3KT) and KRASV12-transformed HBEC3-KT (hereafter 3KTR) were mixed with an internal standard, BSA. The lysates were digested with trypsin and desalted using C18, then subjected to an OFFGEL fractionator. The OFFGEL fractionator assists in pI-based fractionation by using immobilized pH gradient strips (30,33). We collected 12 fractions from the OFFGEL fractionator for each cell line, which were desalted using C18 prior to LTQ-Orbitrap analysis. To ensure the reliability of the quantitative profiling results, the samples were prepared and fractions were collected on two independent occasions (two biological replicates). Each fraction was injected into the LTQ-Orbitrap twice (two technical replicates) in the first biological replicate (Exp. 1) and five times (five technical replicates) in the second biological replicate (Exp. 2). The raw data obtained from the LTQ-Orbitrap were further processed and analyzed using the IDEAL-Q system as described previously (32). IDEAL-Q is an automated system that can perform protein identification, normalization, and label-free quantitation functions simultaneously (32). The IDEAL-Q analysis resulted in identification of 25,998 peptides corresponding to 4741 proteins in Exp. 1 (Table S1) and 31,047 peptides corresponding to 5252 proteins in Exp. 2 (Table S2). The identification of such a large number of proteins could be attributed to the maximum recovery of peptides from in-solution digestion and OFFGEL fractionation as demonstrated previously (30). More than 91% of the peptides identified in Exp. 1 were observed in two technical replicates, whereas ~8% of the peptides were observed in any one of the technical replicates (Fig. 2A, left panel). Similarly, more than 82% of the peptides identified in Exp. 2 were observed in five technical replicate analyses (Fig. 2B, left panel). These results indicate that peptide identification was reproducible across the technical replicate analyses.
Label-free Quantitative Proteomics Analysis Shows 23 Differentially Regulated Proteins in 3KTR Cells-We identified a total of 4741 proteins in Exp. 1 and 5252 proteins in Exp. 2 (Fig. 2C). By comparing Exp. 1 to Exp. 2, we identified 5713 proteins and we found that 97% of the proteins identified in Exp. 1 were also identified in Exp. 2 (Fig. 2C). In addition, of 5713 proteins, 5610 were found in both 3KT and 3KTR cells. The majority (83.5%) of the proteins found in only one of the cell lines were identified by a single peptide. In order to distinguish the KRAS-regulated proteins, we performed a two-step data normalization process prior to protein quantification using IDEAL-Q. First, the peptide abundance data were mean normalized. Because the isogenic 3KT and 3KTR cells differ only in the expression of activated KRAS, the majority of the protein concentrations should remain unchanged, and therefore, the mean normalized peptides were log centered. Normalized peptides were examined for their abundance variation in technical replicates by assessing the coefficient of variation (CV). We observed less than 20% CV for the majority of the peptides identified in Exps. 1 and 2 (Fig. 2A and 2B, right panels), and only those that met this criterion were further considered for quantitation to minimize the identification of false positives. This resulted in quantitation of 4038 (85.2%) and 4045 (77%) proteins, respectively, in Exps. 1 and 2 between 3KTR and 3KT (Fig. 2D). Although we identified a relatively high number of peptides and proteins in Exp. 2 when compared with Exp. 1 (supplemental Tables S1 and S2, and Fig. 2C), the numbers of quantitated proteins containing peptides with less than 20% CV were almost equal in Exp. 1 and Exp. 2 (Fig. 2D). These results suggest that increasing the number of technical replicates favors protein identification but not quantitation.
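The two normalization steps and the CV filter described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the IDEAL-Q algorithm: mean normalization is taken to mean dividing each abundance by the run mean, log centering is taken to mean shifting log2 ratios so that their mean is zero, and the CV uses the sample standard deviation. All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_normalize(abundances):
    """Divide every peptide abundance in a run by the run mean."""
    run_mean = mean(abundances)
    return [a / run_mean for a in abundances]


def log_center(ratios):
    """Convert ratios to log2 and shift them so their mean is zero,
    reflecting the expectation that most proteins are unchanged."""
    logs = [math.log2(r) for r in ratios]
    center = mean(logs)
    return [v - center for v in logs]


def cv_percent(values):
    """Coefficient of variation across technical replicates, in percent."""
    return 100.0 * stdev(values) / mean(values)


def passes_cv(values, max_cv=20.0):
    """Keep a peptide only if its replicate CV is at most max_cv percent."""
    return cv_percent(values) <= max_cv
```

For example, replicate abundances of 100, 110, and 90 give a CV of 10% and would be retained, whereas 100, 150, and 50 give a CV of 50% and would be excluded.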
Among the proteins quantitated, we first examined the extracted ion chromatogram (XIC) profiles and IDEAL-Q quantitation ratios of BSA, the internal standard in both experiments. The observed XIC profiles of BSA in the technical replicates of Exps. 1 and 2 demonstrated the accuracy of OFFGEL peptide fractionation and the reproducibility of the LTQ-Orbitrap analysis (Fig. 3A). After normalization and quantitation by IDEAL-Q, the results showed a ratio close to one for BSA in both Exp. 1 and Exp. 2 (Fig. 3B), indicative of high quality sample preparation and OFFGEL peptide fractionation. In addition, the observed approximate protein levels of KRAS between 3KT and 3KTR in this study (supplemental Fig. S1A) are consistent with immunoblotting results shown previously for these two cell lines (29). Only proteins containing at least two peptides were considered for quantitation to ensure reliable identification of the KRAS-regulated proteins. We quantified 3058 proteins with at least two peptides, of which 2113 were quantitated in both Exp. 1 and Exp. 2 and further analyzed (supplemental Fig. S1B). The proteins with ≥1.7-fold differential expression in both Exp. 1 and Exp. 2 were considered to be KRAS-regulated proteins. Differences of ≥1.7-fold may reflect the KRAS-induced changes in 3KTR cells because these cells exhibit constitutive KRAS activation but do not show the activation of other oncogenes (12). In total, we identified 23 proteins with ≥1.7-fold differential expression in 3KTR cells (Table I, upper panel). The majority (74%) of these regulated proteins showed even higher differential expression (>2-fold) in at least one biological replicate experiment. Among these 23 proteins, seven were up-regulated and 16 were down-regulated. In addition, we observed unchanged protein levels of a loading control, β-actin, in both Exp. 1 and Exp. 2 (supplemental Fig. S1C). The peptide fold change values of the up-regulated oncogenic EPHA2 and the down-regulated tumor suppressor PDCD4 identified in both Exps. 1 and 2 are listed to show the internal consistency of the peptide ratios (Table I, lower panel). Of the identified KRAS-regulated proteins, six (AKR1C1, EPHA2, FABP5, GLUL, PDCD4, and VIM) were previously reported as KRAS targets (10,11,34-36). The other 17 are novel potential targets of activated KRAS.
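The selection rule for KRAS-regulated proteins (≥1.7-fold differential expression in both experiments) can be written as a small classifier. This sketch assumes that ratios are expressed as 3KTR/3KT, that down-regulation corresponds to a ratio of at most 1/1.7, and that the direction of change must agree between experiments; the function name is hypothetical.

```python
def classify_regulation(ratio_exp1, ratio_exp2, cutoff=1.7):
    """Classify a protein from its 3KTR/3KT ratios in two experiments.

    A protein is called regulated only when both ratios exceed the
    fold-change cutoff in the same direction.
    """
    if ratio_exp1 >= cutoff and ratio_exp2 >= cutoff:
        return "up"
    if ratio_exp1 <= 1 / cutoff and ratio_exp2 <= 1 / cutoff:
        return "down"
    return "unchanged"
```

Requiring agreement between the two biological replicates trades some sensitivity for fewer false positives, which is consistent with the conservative filtering applied throughout the quantitation.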
Enrichment of Glycopeptides Using Hydrazide Chemistry Approach Identifies 608 N-Glycosites on 317 Proteins in HBECs-To investigate whether activated KRAS has the potential to induce glycosylation changes, N-glycopeptides were enriched from tryptic peptides of 3KT and 3KTR in the presence of 18O-water by means of a hydrazide chemistry approach (Fig. 1). The enrichment of N-glycopeptides was performed on two independent occasions (Exps. 1 and 2) and the samples were analyzed by LTQ-Orbitrap in three (Exp. 1) and five (Exp. 2) technical replicates (supplemental Tables S3 and S4). The enriched N-glycopeptide samples were not fractionated prior to LTQ-Orbitrap analysis because fractionation did not increase the number of identified N-glycosites (37). The data from these two independent experiments were grouped together to give a final dataset containing a total of 317 proteins, 595 unique N-glycopeptides, and 608 unique N-glycosylation sites (Fig. 4A). Of the identified N-glycopeptides, 96.3% were identified in both cell lines. The majority (21 of 22) of the N-glycopeptides identified in only one of the cell lines were not reproducible between the two biological replicate experiments. We next used a motif-x algorithm to classify the position-weighted amino acid sequences that were over-represented among the identified N-glycosylation events (Fig. 4B). This revealed that 360 of the N-glycosylation events occurred in the context of the sequence, N-X P- Fig. 5 shows a representative example for the identification (Fig. 5A) and quantitation (Fig. 5B) of an N-linked glycopeptide derived from metalloproteinase inhibitor 1 (TIMP-1). The XIC profiles, IDEAL-Q ratios, and % CV of TIMP-1 observed in two biological replicates (Fig. 5B and Table II) indicate the high quality of the N-glycopeptide sample preparation and the reproducibility of the LTQ-Orbitrap analysis.
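Candidate N-glycosylation sites in a protein sequence can be located with a simple pattern search. The sketch below assumes the widely used consensus sequon N-X-S/T, where X is any residue except proline; this is a general convention and not a restatement of the motif-x result reported above. The function name and the example sequence are hypothetical.

```python
import re

# Lookahead keeps matches zero-width after the N, so overlapping
# sequons such as "NNTS" are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


# Hypothetical peptide: N2 and N9/N10 sit in sequons, N6 is followed by P.
sites = find_sequons("MNGTANPSNNTS")
```

Such a check is typically used as a sanity filter on sites identified by the 18O-labeled deamidation, since deamidated Asn residues outside a sequon are more likely to be chemical artifacts.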
A total of 15 N-glycosylation events on 14 proteins with ≥1.7-fold differential expression in two experiments were identified as potential components associated with activated KRAS (Table II). Moreover, the majority (11 of 15) of the regulated N-glycosylation events were observed with >2-fold differential expression in at least one experiment. The quantitative results demonstrate increased levels of N-glycosylation on ITGA3, SEMA4B, TIMP-1, and TPBG, and decreased levels of N-glycosylation on ALCAM, ANO6, ATP1B1, CLCA4, HSPG2, PTGFRN, SLC44A2, SLC46A1, SULF2, and UNC5C in response to activated KRAS in 3KTR cells (Table II). A few of these KRAS-regulated N-glycoproteins, including SEMA4B and TIMP-1, were also identified by N-glycopeptide enrichment experiments carried out in the presence of 16O-water (data not shown). Since our study represents the first application of quantitative glycoproteomics toward exploring KRAS-regulated N-glycosylation events, the results of this study may contribute to our understanding of the roles of activated KRAS in lung cancer pathogenesis.
IPA-Biomarker® Filter Analysis of KRAS-regulated Proteins and N-Glycoproteins-We next focused on identification of biomarkers of cancer from our lists of KRAS-regulated proteins and N-glycoproteins. In order to define the biomarkers, KRAS-regulated proteins and N-glycoproteins were subjected to IPA-Biomarker® filter (Ingenuity Systems Inc.) analysis. The IPA-Biomarker® filter prioritizes clinical biomarkers from experimental datasets based on various contextual factors of diseases. The analysis revealed six proteins (ALDH3A1, CA2, CTSD, DST, EPHA2, and VIM) and three N-glycoproteins (ALCAM, ITGA3, and TIMP-1) as biomarkers for diagnosis and prognosis of cancer (supplemental Fig. S2). Some of the identified biomarkers may also be used for the diagnosis and prognosis of respiratory diseases (supplemental Fig. S2). Our preliminary results suggest the identified candidates may have implications in the analysis of tissue-specific and disease-related biomarkers.

Validation of KRAS-targeted Proteins and N-Glycoproteins in Clinical Samples and Cell Lines by In Silico Analysis of Microarray Repository Data-To evaluate the clinical significance of activated KRAS targets identified in HBECs, we examined the expression status of the corresponding transcripts in lung adenocarcinoma clinical samples and cell lines using microarray datasets that have been deposited in NCBI's GEO database. Comparing human lung adenocarcinoma samples (n = 8) containing KRAS mutations with normal lung samples (n = 5) showed 8809 differentially regulated genes, of which 1072 were quantified between 3KTR and 3KT cells. We found positive mRNA fold changes for 6 of the 23 regulated proteins identified in 3KTR cells (supplemental Fig. S3A). Likewise, a comparison of human lung adenocarcinoma cell lines (n = 10) containing activated KRAS with immortalized normal lung epithelial cell lines (n = 9) showed 7731 differentially regulated genes, of which 2210 were quantified between 3KTR and 3KT cells. We found positive mRNA fold changes for 10 of the 23 regulated proteins identified in 3KTR cells (supplemental Fig. S3B). Interestingly, both lung adenocarcinoma clinical samples and cell lines had significantly lower levels of the fatty acid binding protein FABP5 (p < 0.005) and the tumor suppressor PDCD4 (p < 0.025) (Fig. 6A and 6B). The control gene DDX5, whose protein levels were unchanged in our experiment (supplemental Fig. S1C), showed constant mRNA expression (supplemental Fig. S3A and S3B). We also looked at the expression levels of regulated N-glycoproteins in both clinical samples and cell lines, which showed comparable levels of 9 of the 14 regulated N-glycoproteins (supplemental Fig. S3A and S3B). Interestingly, comparable expression levels of ATP1B1, TIMP-1, and UNC5C were observed in both clinical samples and cell lines. We show the unchanged expression levels of TIMP-1 in clinical samples and cell lines as a representative example (Fig. 6A and 6B). This is in agreement with the unchanged TIMP-1 protein levels in 3KT and 3KTR cells, indicating that TIMP-1 expression levels did not contribute to its altered N-glycosylation levels. The integration of MS-based proteomic results with microarray-based transcriptome data suggests that FABP5 and PDCD4 are potential targets of activated KRAS in lung adenocarcinoma.

Validation of KRAS-targeted Proteins in HBECs and Lung Cancer Cells by shRNA-based Knockdown and Immunoblot Analyses-In order to evaluate the targets of activated KRAS identified by MS analysis and IDEAL-Q quantitation, we assessed the expression levels of several proteins by Western blot analysis. The tumor suppressor protein PDCD4, cell adhesion molecule BCAM, and stress resistance protein HSPB1, which were all down-regulated in 3KTR cells in our proteomics data, showed reduced levels by Western blot analysis (Fig. 6C). Western blotting showed the up-regulation of the receptor protein EPHA2 and the cytoskeletal protein VIM in 3KTR cells, which is consistent with the quantitative proteomics results (Fig. 6C). Finally, we looked at the levels of the loading control ACTB (β-actin) and the known protein MAPK1 by Western blotting, and found that their protein levels were unchanged in 3KT and 3KTR cells (Fig. 6C). These Western blotting results are in agreement with the MS analysis and IDEAL-Q quantitation data (supplemental Fig. S1C). We further characterized the KRAS targets in lung adenocarcinoma activation of KRAS downstream signaling (13,38). The examination of endogenous KRAS levels in A549 and H322 cells and 3KT cells by immunoblotting confirmed the elevated levels of KRAS in H322 cells (Fig. 6D). We compared the endogenous expression levels of several KRAS targets including BCAM, EPHA2, HSPB1, PDCD4, and VIM in A549, H322, and 3KT cells. The results showed overexpression of EPHA2 and relatively lower expression of BCAM, HSPB1, and PDCD4 in A549 and H322 cells when compared with 3KT cells (Fig. 6D). The up-regulation of VIM was observed in A549 but not in H322 cells when compared with 3KT cells (Fig. 6D).
Next, studies were conducted to examine the KRAS regulation of EPHA2, BCAM, HSPB1, and PDCD4. We knockdown the KRAS in A549 and H322 cells by using two individual shRNAs targeting at KRAS. The knockdown of KRAS in A549 and H322 cells clearly showed the reduced expression levels of EPHA2 (Fig. 6E). In addition, as a consequence of KRAS knockdown, expression levels of HSPB1 in A549 and PDCD4 in H322 cells also increased (Fig. 6E). However, KRAS knockdown did not affect the expression levels of BCAM in either A549 or H322 cells (data not shown). These knockdown results further confirmed the EPHA2, HSPB1 and PDCD4 as Effects of TIMP-1 glycosylation at Asn 30 on NSCLC cell migration and invasion-The results of label-free quantitative N-glycoproteomics from two biological replicate experiments showed differential regulation of 15 N-glycosylation events caused by activated KRAS in 3KTR cells (Table II). In biological replicates, a relatively lower value of average CV (3.25%) was observed for KRAS-regulated TIMP-1glycosylation at Asn 30 (Table II). In-addition to the measures of elevated TIMP-1(Asn 30 ) glycosylation quality, our study has prioritized the TIMP-1, a secreted protein, as one of the cancer biomarkers (supplemental Fig. S2). A recent study has described that some of the most important effects of KRAS signaling may involve secretory proteins (39). Previous studies have documented the role of TIMP-1 in suppressing the NSCLC cell invasion (40,41). However, the functional roles of TIMP-1 glycosylation have not been demonstrated in lung cancer biology. Here, among the identified KRAS-regulated N-glycosylation events, we further analyzed the effects of TIMP-1 (Asn 30 ) glycosylation on migration and invasion of NSCLC A549 and H1299 cells. First, we generated expression vectors encoding the Flag-tagged wild-type (WT) TIMP-1 and TIMP-1 N30Q mutant, in which the Asn-30 was replaced by Gln. 
These vectors, along with the control vector, were transiently introduced into lung adenocarcinoma A549 cells (harboring activated KRAS G12D), and migration and invasion assays were performed (Fig. 7A). Compared with vector-transfected cells, ectopic expression of TIMP-1(WT) in A549 cells significantly (p < 0.01) suppressed cell invasion, and mutation of Asn-30 to Gln further enhanced the suppressive effect of TIMP-1 on cell invasion (p < 0.0001) (Fig. 7B). In contrast, no significant difference in migration was observed among A549 cells harboring the control vector and the vectors encoding TIMP-1(WT) and TIMP-1(N30Q) (supplemental Fig. S4A). To further explore the functional roles of TIMP-1 glycosylation at Asn 30, we examined the effects of TIMP-1 on the migration and invasion of large cell carcinoma H1299 cells, which harbor endogenous activated NRAS Q61K (Fig. 7A). Similarly, ectopic expression of TIMP-1 produced a modest but significant (p < 0.01) decrease in cell invasion (Fig. 7C) but had little effect on the migration of H1299 cells (supplemental Fig. S4B). Interestingly, in contrast to the effect observed in A549 cells, mutation of Asn-30 to Gln did not confer a further suppressive effect on cell invasion in H1299 cells (Figs. 7B and 7C). These results suggest a distinct role of TIMP-1 glycosylation at Asn-30 in regulating invasive abilities in lung cancer cells with different RAS backgrounds.
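As an illustration of the replicate-quality measure quoted for the TIMP-1 (Asn 30) glycosite, the coefficient of variation can be sketched as the standard deviation of the replicate values divided by their mean, expressed as a percentage. The sketch below is not the quantitation software used in this study; the function names, the example ratios, and the use of the sample standard deviation are assumptions made for illustration, and the 20% cutoff simply mirrors the CV criterion stated for label-free quantitation.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent (sample SD / mean * 100).

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the source does not state which estimator was applied.
    """
    if len(values) < 2:
        raise ValueError("at least two replicate values are required")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return stdev(values) / m * 100.0


def passes_cv_filter(values, max_cv=20.0):
    """True if the replicate CV does not exceed the cutoff (in percent)."""
    return coefficient_of_variation(values) <= max_cv


# Hypothetical 3KTR/3KT ratios from two biological replicates (not real data).
replicate_ratios = [2.0, 2.1]
cv_percent = coefficient_of_variation(replicate_ratios)
print(f"CV = {cv_percent:.2f}%, passes filter: {passes_cv_filter(replicate_ratios)}")
```

With only two biological replicates the CV is a coarse estimate, so a filter of this kind is best read as a consistency check rather than a formal test.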
Lectin Blot Analysis of KRAS-induced Structural Changes of N-linked Glycans in 3KTR Cells-Next, we focused on analyzing β1,6-branched structures of N-glycoproteins differentially regulated by KRAS in 3KTR cells, because oncogenic RAS is known to regulate the activity of GnT-V, which catalyzes the addition of a β1,6-GlcNAc branch to the core N-glycans (18-20). To this end, we performed lectin blot analysis of HBECs using L-PHA, which recognizes the β1,6-GlcNAc moiety. Based on antibody availability, SEMA4B, one of the KRAS-regulated N-glycoproteins (Table II), was selected for the L-PHA blot analysis. Immunoprecipitation of SEMA4B using a specific antibody followed by L-PHA blotting identified an increase in its β1,6-GlcNAc moiety levels in 3KTR relative to 3KT cells (Fig. 8, left panel). Stripping the blot and reprobing with the specific antibody showed unchanged levels of SEMA4B between the molecular weights of 95 and 130 kDa and at > 170 kDa (probably a dimer) in 3KT and 3KTR cells (Fig. 8, middle panel). In addition, a degradation product of SEMA4B was observed at < 95 kDa in 3KTR cells (Fig. 8). These results suggest that oncogenic KRAS mediates the N-glycosylation changes through the activation of GnT-V in 3KTR cells.

DISCUSSION
Research into the downstream targets of activated KRAS is relevant to many types of malignancy because KRAS is activated in almost all cancers, which in turn leads to pathogenic development and progression. Most research has focused on the transcripts regulated by activated KRAS in cancers. However, several studies have observed a poor correlation between the expression levels of transcripts and proteins. Studying the targets of activated KRAS at the protein and post-translational modification levels would be useful in understanding their downstream cellular effects. In this study, we identified downstream targets of activated KRAS by elucidating the proteomic and N-glycoproteomic profiles of isogenic cells (3KT and 3KTR) that differ only in the expression of activated KRAS and show no significant differences in cellular morphology and growth rate (29). LTQ-Orbitrap analysis identified 30,759 unique peptides corresponding to 5713 proteins and 595 unique N-glycopeptides of 317 proteins in HBECs. The label-free quantitation approach enabled the systematic quantification of 3058 proteins and 297 N-glycoproteins, which allowed us to identify the differentially regulated proteins (n = 23) and N-glycoproteins (n = 14) in KRAS-activated HBECs. The integrated analysis validated 61% of the differentially regulated proteins in clinical samples and cell lines, of which EPHA2, PDCD4, and HSPB1 were further confirmed as KRAS targets by knockdown experiments. Finally, we showed the functional significance of TIMP-1 glycosylation at Asn 30
We identified 23 proteins as being differentially regulated in response to activated KRAS in HBECs, of which 16 were down-regulated (Table I, upper panel). Similarly, a previous report has shown that KRAS V12 transformation of ovarian surface epithelial cells led to a relatively high number of down-regulated genes (10).
The reasons for the down-regulation of a large number of genes in KRAS V12-transformed HBECs are beyond the scope of this study. One possible explanation, however, could be the involvement of activated RAS signaling in protein degradation (42). Of the KRAS targets identified in this study, EPHA2, VIM, AKR1C1, GLUL, and FABP5 have been previously reported as KRAS targets at the transcriptome level. The up-regulation of EPHA2 in human ovarian duct epithelial cells expressing KRAS G12V (11) and the up-regulation of VIM in murine fibroblasts expressing KRAS G12V or over-expressing KRAS (34) have been previously demonstrated. In addition, KRAS transformation-dependent up-regulation of VIM has been observed in rat ovarian surface epithelial cells (10). The down-regulation of AKR1C1 and GLUL by KRAS transformation has also been observed in rat ovarian surface epithelial cells (10). Down-regulation of PDCD4 has been associated with microRNA-21 in rat thyroid cells expressing activated RAS (36). Similar down-regulation of PDCD4 has been observed in a mouse model of non-small cell lung cancer (35). In contrast to our results, up-regulation of FABP5 has been observed in murine fibroblasts that either express KRAS G12V or over-express KRAS (34). The remaining 17 candidates identified in this study are novel potential targets of activated KRAS. The KRAS targets identified in this study may help extend our knowledge of the mechanisms involved in the initiation and progression of activated KRAS-induced tumorigenesis.
Further, by analyzing the microarray repository data, we compared lung adenocarcinoma clinical samples and cell lines containing KRAS mutations with their corresponding normal controls. The comparison showed a large number of regulated genes in clinical samples (n = 1072) and cell lines (n = 2210) because of the heterogeneous genetic background of the samples. Similarly, a large number of regulated phosphoproteins were identified by comparing the expression profiles of heterogeneous cells but not isogenic cells (13). The current study took advantage of isogenic cells that differ only by the presence or absence of activated KRAS and identified 23 specific target proteins of KRAS (Table I, upper panel). The integrated microarray data analysis confirmed that 61% of the 23 KRAS-targeted proteins, including HSPB1 and PDCD4, were also affected at the mRNA level (supplemental Fig. S3). We sought to confirm the KRAS-mediated regulation of the tumor suppressor PDCD4 and a few other candidates by shRNA-based knockdown experiments in lung adenocarcinoma A549 and H322 cells (Fig. 6E). Knockdown of KRAS resulted in down-regulation of EPHA2 and up-regulation of HSPB1 in A549 cells harboring activated KRAS G12D. However, knockdown experiments did not confirm PDCD4 as a downstream target of activated KRAS in A549 cells, raising the possibility of crosstalk between KRAS and other signaling pathways in the regulation of PDCD4 expression. In contrast, KRAS knockdown resulted in the over-expression of PDCD4 in H322 cells. We also observed down-regulation of EPHA2 in KRAS-knockdown H322 cells, supporting the idea that amplified KRAS is involved in the activation of RAS/MAPK signaling and the regulation of its downstream targets (13,34). Of the identified candidates, PDCD4 is the best characterized as an inhibitor of neoplastic processes, especially cell invasion (43).
Although we have not explored the regulation mechanisms of the candidates, our findings reveal possible novel insights into the downstream targets of the activated KRAS in lung adenocarcinoma.
Previously, it has been shown that aberrant glycosylation is implicated in cancer progression (14-17, 21). However, the molecular determinants involved in aberrant glycosylation-induced transformation are unknown. Several studies have shown that KRAS is important for protein glycosylation in tumor cells, and a positive correlation between KRAS mutations and increased cell surface N-glycoproteins has been demonstrated (18-20). In this study, using KRAS-transformed HBECs, we identified a limited number (n = 14) of N-glycoproteins as differentially regulated by activated KRAS (Table II), suggesting that activated KRAS may require additional contributions from other oncogenic signals to induce aberrant glycosylation of a large number of proteins. The identified KRAS-regulated N-glycosylation events were not confirmed by other methods because there is no established method to quickly evaluate a specific glycosite in clinical samples and cell lines. Novel methods and techniques need to be developed to address this problem. According to UniProt subcellular location annotation, 11 of the KRAS-regulated N-glycoproteins were annotated as membrane (including organellar membrane) proteins, two (TIMP-1 and HSPG2) as secreted proteins, and one (SULF2) as an endoplasmic reticulum protein. According to the PANTHER Classification System, the regulated N-glycoproteins were mainly involved in cell communication (ALCAM, HSPG2, TPBG, SEMA4B, and UNC5C), cell adhesion (ALCAM, HSPG2, and ITGA3), and transport (SLC44A2, SLC46A1, and ATP1B1) processes.
Matrix metalloproteinases (MMPs) are a group of enzymes involved in the degradation of extracellular matrix components and are known to play key roles in several processes, including tumor growth and metastasis. Tissue inhibitors of metalloproteinases (TIMPs) can bind to and inhibit the catalytic activity of MMPs (44,45). TIMP-1, a member of the TIMP family, is a glycoprotein that is up-regulated in many cancers (44,45). In lung adenocarcinoma A549 cells, a link between TIMP-1 up-regulation and the anti-invasive action of cannabidiol and cisplatin has been demonstrated (40,41). Further, addition of human recombinant TIMP-1 protein to A549 cells in a Boyden chamber significantly inhibited their ability to invade through Matrigel (40). However, the effects of TIMP-1 glycosylation on lung cancer cell invasion are not known. A recent study has shown that aberrant glycosylation of TIMP-1 caused by GnT-V decreases its binding affinity for gelatinases, which in turn enhances the invasive ability of colon cancer cells (46). Our study demonstrated the up-regulation of TIMP-1 glycosylation at Asn 30 in KRAS-activated HBECs (Fig. 5B and Table II). We further showed that transient expression of TIMP-1(WT) suppressed the invasive ability of both adenocarcinoma A549 and large cell carcinoma H1299 cells (Figs. 7B and 7C). Because our results showed that TIMP-1 glycosylation did not significantly affect cell migration in either A549 or H1299 cells (supplemental Fig. S4), the effect of TIMP-1 on cell invasion is likely mediated through MMP-dependent hydrolysis of Matrigel components. Interestingly, disruption of TIMP-1 glycosylation at the Asn 30 residue (N30Q) dramatically increased the suppressive effect of TIMP-1 on cell invasion in A549 but not in H1299 cells (Figs. 7B and 7C). An important question is how TIMP-1(N30Q), compared with TIMP-1(WT), exhibits enhanced suppression of cell invasion in A549 but not H1299 cells.
Although A549 and H1299 cells harbor activated KRAS G12D and NRAS Q61K, respectively, our recent study has demonstrated that constitutively activated KRAS triggers downstream signaling in A549 cells but not in H1299 cells (13). In addition, other studies have shown that oncogenic KRAS is more effective than NRAS in triggering downstream oncogenic signals (47,48). In this study, we demonstrated elevated levels of the β1,6-GlcNAc moiety on N-linked glycans caused by oncogenic KRAS signaling in 3KTR cells (Fig. 8). We hypothesize that constitutively activated KRAS signaling may stimulate the enzymes involved in N-linked glycosylation of proteins and/or may activate the enzymes involved in the addition of molecules such as galactose, N-acetylglucosamine (GlcNAc), N-acetylgalactosamine, fucose, and sialic acid to the core N-linked glycans, which causes aberrant glycosylation of proteins in cancer cells. The up-regulated glycosylation at Asn 30 of TIMP-1 may lead to weaker binding of MMPs, a gain of MMP activity, and enhanced cell invasion. In support of this, mutation of Asn 30 further strengthens the inhibitory effect of TIMP-1 on MMPs (Fig. 7B). Although our results demonstrate the roles of TIMP-1 glycosylation in regulating lung cancer cell invasion, further studies are warranted to elucidate the structural alterations in the N-glycans of TIMP-1 caused by KRAS signaling and their importance in promoting lung adenocarcinoma cell invasion. This may help explain the correlation between high levels of TIMP-1 in tumors and poor patient prognosis.
Cancer develops through multiple stages, and biomarkers therefore have to be developed for early pathologic diagnosis. In addition to identifying protein biomarkers, several studies have established glycoprotein biomarkers of cancer, because cancer involves not only changes in protein levels but also aberrant glycosylation. We prioritized the biomarkers from the lists of KRAS-regulated proteins and N-glycoproteins using informatics-assisted analysis. The analysis revealed six proteins and three N-glycoproteins as cancer biomarkers (supplemental Fig. S2). Of the identified protein biomarkers, two (CA2 and CTSD) were further validated in lung adenocarcinoma clinical samples and cell lines (supplemental Figs. S3A and S3B). The biomarker proteins and N-glycoproteins identified in this study may serve as valuable candidates for the early detection of lung adenocarcinoma.
In conclusion, using a highly sensitive MS instrument, our study illustrates the advantage of label-free quantitative proteomics and N-glycoproteomics in identifying known and novel targets of the mutationally activated oncogene KRAS. We believe that follow-up studies on KRAS target proteins and N-glycoproteins may provide novel insight into the mechanisms of oncogenic KRAS-induced cellular transformation and tumorigenesis. In addition, the datasets of proteins and glycoproteins of lung cells provided here and in other related studies will be used to establish a lung proteome and glycoproteome database to further identify lung cancer biomarkers.