Phosphotyrosine-based Phosphoproteomics for Target Identification and Drug Response Prediction in AML Cell Lines

Phosphotyrosine (pY)-specific phosphoproteome profiles have been determined for 16 AML cell lines through a pY immunoprecipitation-based protocol. Kinase activity inference using dedicated ranking analysis, combining kinome-, activation loop-, and substrate-based analyses, identified potential drivers for all cell lines, and driver function was confirmed through drug experiments. Results for two patient samples show the potential of phosphoproteomics in a clinical setting.

Highlights
pY phosphoproteomes and dedicated ranking analyses for 16 AML cell lines.
RTK drivers, 6 mutant cell lines confirmed, identification for 4 more cell lines.
MAPK1/3 phosphorylation for cell lines without TK driver, indicating RAS mutation.
Drug target space phosphorylation correlates with drug IC50s in specific cell lines.

Acute myeloid leukemia (AML) is a clonal disorder arising from hematopoietic myeloid progenitors. Aberrantly activated tyrosine kinases (TK) are involved in leukemogenesis and are associated with poor treatment outcome. Kinase inhibitor (KI) treatment has shown promise in improving patient outcome in AML. However, inhibitor selection for patients is suboptimal. In a preclinical effort to address KI selection, we analyzed a panel of 16 AML cell lines using phosphotyrosine (pY) enrichment-based, label-free phosphoproteomics. The Integrative Inferred Kinase Activity (INKA) algorithm was used to identify hyperphosphorylated, active kinases as candidates for KI treatment, and efficacy of selected KIs was tested. Heterogeneous signaling was observed with between 241 and 2764 phosphopeptides detected per cell line. Of 4853 identified phosphopeptides with 4229 phosphosites, 4459 phosphopeptides (4430 pY) were linked to 3605 class I sites (3525 pY). INKA analysis in single cell lines successfully pinpointed driver kinases (PDGFRA, JAK2, KIT and FLT3) corresponding with activating mutations present in these cell lines. Furthermore, potential receptor tyrosine kinase (RTK) drivers, undetected by standard molecular analyses, were identified in four cell lines (FGFR1 in KG-1 and KG-1a, PDGFRA in Kasumi-3, and FLT3 in MM6). These cell lines proved highly sensitive to specific KIs. Six AML cell lines without a clear RTK driver showed evidence of MAPK1/3 activation, indicative of the presence of activating upstream RAS mutations. Importantly, FLT3 phosphorylation was demonstrated in two clinical AML samples with a FLT3 internal tandem duplication (ITD) mutation. Our data show the potential of pY-phosphoproteomics and INKA analysis to provide insight into AML TK signaling and identify hyperactive kinases as potential targets for treatment in AML cell lines. These results warrant future investigation of clinical samples to further our understanding of TK phosphorylation in relation to clinical response in the individual patient.

Acute myeloid leukemia (AML) is a clonal hematopoietic stem cell disorder, characterized by expansion of immature leukemic blasts in the bone marrow, resulting in suppression of normal hematopoiesis. In AML, protein kinase mutations are associated with proliferative and survival advantages (1,2), and treatment of AML with kinase inhibitors is therefore gaining much interest (3).
For example, the FMS-like receptor tyrosine kinase 3 (FLT3) gene is frequently mutated in AML, either by internal tandem duplication of the juxtamembrane domain (~20-30% of cases) (4-8) or a point mutation of the tyrosine kinase domain (TKD) (~7%) (5,7,8). These mutations lead to constitutive FLT3 signaling and are associated with high peripheral blast counts and a poor treatment outcome (4,6,8-10). Recently, addition of the (rather non-selective) FLT3 inhibitor midostaurin to standard chemotherapy was shown to increase overall survival compared with chemotherapy alone in FLT3-ITD/TKD-positive AML patients (11). Clinical studies, using more selective inhibitors, have also shown promising, but variable, results and suggested that patient selection based solely on mutational status is suboptimal for predicting patient response and/or that additional activated pathways may be missed by mutational analysis (12-15). Clearly there is a need for predictive markers that reflect the functional state of the cancer cells. In order to advance KI-based precision medicine for the individual patient, knowledge of individual AML kinase activity profiles could improve treatment decisions.
The emergence of mass spectrometry-based phosphoproteomics has enabled dissection of cellular signaling in cancer at a global scale. Previous global phosphoproteomic studies of AML have focused on unraveling AML signaling pathways (16-21) and assessing the effect of kinase inhibitors in the context of therapeutic response (17,20,22-26). Global phosphopeptide patterns have been found to be discriminative between AML cell lines on the one hand, and cell lines derived from other hematological malignancies on the other (23). Heterogeneity was further found among different AML cell lines, underlining the heterogeneous nature of the disease (22,23).
Global phosphoproteomic analyses, involving all major types of phosphorylation (serine, threonine, and tyrosine), are heavily dominated by serine- and threonine-specific modifications that are much more prevalent than pY modifications. In contrast, pY enrichment-based phosphoproteomics allows for a thorough study of TK signaling that is frequently disturbed in cancer and enables identification of driver kinases (27). In the context of AML, pY-phosphoproteomics has contributed to increased understanding of aberrant TK signaling (16,18,19,27), and identified JAK2 and FGFR1 as drivers in AML cell lines (28,29). Though limited in their depth and sample size, these studies have shown the ability of phosphoproteomics to detect kinase hyperactivity in AML.
We have recently developed an analysis strategy (called INKA) to infer kinase activity in a single biological sample, combining knowledge on the phosphorylation of both kinases and their substrates (30). To further evaluate the potential of phosphoproteomics for identifying kinase hyperactivity as a possible target in AML, we have performed pY enrichment-based phosphoproteomics of a panel of 16 AML cell lines, coupled to dedicated INKA analysis and functional testing. Additionally, we were able to detect the phosphorylation profile associated with activated FLT3 in two clinical samples. Altogether, our study provides a comprehensive overview of AML signaling and TK activity on an individual cell line basis and shows potential for target selection for effective treatment with kinase inhibitors.
Clinical Samples-Peripheral blood samples containing 93-97% blasts were obtained from two FLT3-ITD AML patients at the time of diagnosis at Amsterdam UMC, location VUmc, Cancer Center Amsterdam. These patients were included in the HOVON 102 trial (Netherlands Trial Register number NL2070). All study protocols were performed in accordance with the Declaration of Helsinki and approved by the central medical ethical committee (METC-2009-293). Written informed consent was obtained from each patient prior to study entry.
The start of the isolation procedure was within 30-60 min after sample collection. Mononuclear cells (MNC) were enriched using Ficoll-Paque Plus (GE Healthcare, Chicago, IL). Erythrocytes were lysed using lysis buffer (155 mM NH4Cl, 10 mM KHCO3, 0.1 mM Na2EDTA). Cell pellets were snap frozen and stored at -80 °C. Isolation was completed within ~2.5-3 h after collection.
Sample Processing and pY Immunoprecipitation-Cell lines and clinical samples were lysed in urea lysis buffer for phosphoproteomics (8 M urea, 1 mM orthovanadate, 2.5 mM pyrophosphate, 1 mM β-glycerophosphate in MilliQ water) followed by 1 min of vortexing and subsequent sonication. Sonicated lysates were cleared by spinning at 5400 × g for 15 min at 13 °C. Protein content was determined using the DC Protein Assay (BioRad, Hercules, CA). Sample quality was examined by SDS-PAGE and Coomassie Blue staining.
Ten milligrams of protein input was used as starting material for each cell line. Starting material for the two clinical samples consisted of two 5-mg workflow replicates. Lysates were brought to equal volumes at a concentration of 2 mg/ml protein. Sample preparation and phosphotyrosine immunoprecipitation (IP) procedures were performed as previously reported (31,32). IP was performed using PTMScan pTyr antibody beads (p-Tyr-1000) (Cell Signaling Technology, Danvers, IL) at a ratio of 4 µl bead slurry per mg protein. Lysate aliquots were taken before the pTyr IP step, and were diluted to 0.1 µg/µl in 0.1% TFA for proteomic analysis.
The abbreviations used are: AML, acute myeloid leukemia; TK, tyrosine kinase; KI, kinase inhibitor; pY, phosphotyrosine; INKA, integrative inferred kinase activity; ITD, internal tandem duplication; FLT3, FMS-like receptor tyrosine kinase 3; TKD, tyrosine kinase domain; PB, peripheral blood; MNC, mononuclear cell; IP, immunoaffinity precipitation; PDGFRA, platelet-derived growth factor receptor alpha; FIP1L1, pre-mRNA 3'-end-processing factor FIP1; KIT, Mast/Stem cell growth factor receptor KIT; JAK2, janus kinase 2; FGFR1, fibroblast growth factor receptor-1; RTK, receptor tyrosine kinase; AR, allelic ratio.
Phosphopeptide Identification and Quantification-Peptides were separated by an Ultimate 3000 nanoLC system (Dionex LC-Packings, Amsterdam, The Netherlands) coupled online to a Q Exactive mass spectrometer (Thermo Fisher, Bremen, Germany) and equipped with a 40 cm × 75 µm (ID) fused silica column custom packed with 2-µm, 120-Å-pore ReproSil Pur C18 aqua (Dr Maisch GMBH, Ammerbuch-Entringen, Germany). After injection, peptides were trapped at 6 µl/min on a 10 mm × 100 µm (ID) trap column packed with 5-µm, 120-Å-pore ReproSil Pur C18 aqua at 2% buffer B (buffer A: 0.5% acetic acid, buffer B: 80% ACN, 0.5% acetic acid) and separated at 300 nl/min in a 10-40% buffer B gradient in 90 min (120 min inject-to-inject). Eluting peptides were ionized at a potential of +2 kV and introduced into the mass spectrometer. Intact masses were measured at a resolution of 70,000 (at m/z 200) in the orbitrap using an AGC target value of 3E6 charges. The top 10 peptide signals (charge states 2+ and higher) were submitted to the higher-energy collision (HCD) cell for MS/MS (1.6 amu isolation width, 25% normalized collision energy). MS/MS spectra were acquired at a resolution of 17,500 (at m/z 200) in the orbitrap using an AGC target value of 2E5 charges, a maximum inject time of 80 ms, and an underfill ratio of 0.1%.
Dynamic exclusion was applied with a repeat count of 1 and an exclusion time of 30 s. MS/MS spectra for the cell line samples were searched against a Uniprot human FASTA file (release January 2014, no fragments; 42104 entries) using MaxQuant version 1.4.1.2 (33). Clinical samples were searched against the Swissprot human FASTA file (release September 2015, canonical and isoforms; 42122 entries) using MaxQuant version 1.5.2.8. Enzyme specificity was set to trypsin, and up to two missed cleavages were allowed. Cysteine carboxamidomethylation (+57.021464 Da) was treated as a fixed modification, and serine, threonine and tyrosine phosphorylation (+79.966330 Da), methionine oxidation (+15.994915 Da), and N-terminal acetylation (+42.010565 Da) as variable modifications. Peptide precursor ions were searched with a maximum mass deviation of 4.5 ppm, and fragment ions with a maximum mass deviation of 20 ppm. Peptide, protein and site identifications were filtered at a false discovery rate of 1% using the decoy database strategy. The minimal peptide length was set at 7 amino acids, the minimum Andromeda score for modified peptides was 40, and the corresponding minimum delta score was 17 for the cell lines, and 6 for clinical samples (default settings for the respective MaxQuant software versions). Proteins that could not be differentiated based on MS/MS spectra alone were clustered into protein groups (default MaxQuant settings). For the clinical samples, the "match between runs" option was selected in MaxQuant (not for the cell line samples). Phosphopeptide intensities derived from the integrated MS1 signal (isotope cluster) of the eluting peak for each precursor mass were obtained from the modificationSpecificPeptides.txt table from MaxQuant. Intensities were normalized on the summed intensity of all (phospho and non-phospho) peptides in a profiling run of the corresponding tryptic lysate digest and multiplied by the mean intensity across profiled cell line lysates.
Phosphopeptide MS/MS spectral counts (34) were calculated from the MaxQuant evidence.txt file using R (http://www.r-project.org) (35). For subsequent analysis, values for clinical sample replicates were averaged. Information on identified phosphopeptides, phosphosites, and proteins can be found in supplemental Table S1. For all phosphopeptide identifications, annotated, mass-labeled MS/MS spectra are provided in supplemental Figs. S8 and S9.
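The intensity normalization described above (divide by the summed lysate intensity of the same sample, then multiply by the mean summed intensity across lysates) can be sketched as follows. This is a minimal illustration and not the authors' script: the function name, the dictionary layout, and the example numbers are hypothetical.

```python
def normalize_intensities(phospho, lysate_totals):
    """Scale phosphopeptide intensities by lysate load.

    phospho:       {sample: {phosphopeptide: MS1 intensity}}
    lysate_totals: {sample: summed intensity of all peptides in the
                    profiling run of that sample's lysate digest}
    Both layouts are assumptions made for this sketch.
    """
    # Mean summed intensity across all profiled lysates
    mean_total = sum(lysate_totals.values()) / len(lysate_totals)
    normalized = {}
    for sample, peptides in phospho.items():
        # Per-sample scaling factor: mean lysate total / own lysate total
        factor = mean_total / lysate_totals[sample]
        normalized[sample] = {
            pep: intensity * factor for pep, intensity in peptides.items()
        }
    return normalized


# Hypothetical example: two samples with different lysate loads
example = normalize_intensities(
    {"line_A": {"pep1": 10.0}, "line_B": {"pep1": 30.0}},
    {"line_A": 100.0, "line_B": 300.0},
)
```

In this toy case both samples end up with the same normalized value for pep1, because the raw difference is fully explained by the difference in lysate load.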
INKA Analysis-Kinase activity was predicted using the INKA analysis tool as described (30). For the INKA analysis, phosphopeptide abundance measures were obtained from the individual cell line measurements. Label-free spectral counts were used as a proxy for abundance and used in a standard INKA analysis based on four in-silico metrics calculated from phosphopeptide spectral count data. Two kinase-centric metrics focus on phosphopeptides derived from protein kinases themselves: a "kinome" metric, for all phosphopeptides derived from a given kinase, and an "activation loop" metric, which is restricted to phosphopeptides derived from the kinase activation loop segment. Two substrate-centric metrics focus on phosphopeptides derived from proteins that are deemed substrates for a given kinase: a "NetworKIN" (NWK) metric, for phosphopeptides from proteins that are predicted to be a substrate for a specific kinase by the motif-based NetworKIN algorithm (36), and a "PhosphoSitePlus" (PSP) metric, for phosphopeptides from proteins that have been experimentally observed to be a substrate for a particular kinase, as documented in the PhosphoSitePlus catalogue (37). These four metrics, all related to kinase activity directly (kinase-centric) or indirectly (substrate-centric), are integrated into a single INKA score with a requirement for kinases to be implicated by both the kinase- and the substrate-centric side of the analysis. The INKA tool provides bar graphs for all individual analyses, and final INKA scores for the top 20 kinases.
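To illustrate how four metrics can be combined so that a kinase needs both kinase-centric and substrate-centric evidence, the sketch below uses a geometric mean of the two summed sides. This combination rule is an assumption chosen because it yields zero whenever one side is zero; the exact formula is defined in the INKA publication (30) and may differ. Function names and the example counts are hypothetical.

```python
from math import sqrt


def inka_like_score(kinome, activation_loop, psp, nwk):
    """Combine four spectral-count metrics into one score.

    Assumed rule (not confirmed by this text): geometric mean of the
    summed kinase-centric and summed substrate-centric counts.
    """
    kinase_centric = kinome + activation_loop
    substrate_centric = psp + nwk
    # The product is zero if either side lacks evidence
    return sqrt(kinase_centric * substrate_centric)


def rank_kinases(metrics, top_n=20):
    """Return up to top_n (kinase, score) pairs with nonzero score."""
    scores = {kinase: inka_like_score(**m) for kinase, m in metrics.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(kinase, score) for kinase, score in ranked if score > 0][:top_n]


# Hypothetical counts for three kinases in one sample
toy_metrics = {
    "FLT3": {"kinome": 3, "activation_loop": 1, "psp": 5, "nwk": 4},
    "KIT": {"kinome": 4, "activation_loop": 0, "psp": 0, "nwk": 0},
    "LYN": {"kinome": 1, "activation_loop": 0, "psp": 1, "nwk": 0},
}
toy_ranking = rank_kinases(toy_metrics)
```

In the toy data, KIT is phosphorylated but has no substrate evidence, so it drops out of the ranking, which mirrors the two-sided requirement described above.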

Analysis of Kinase Inhibitor Targeting of (Inferred) Kinases-The (kinase) target space of kinase inhibitors quizartinib, ponatinib, ibrutinib, tofacitinib, and GDC-0994 was compiled from literature (38-43), selecting kinases inhibited at concentrations below 500 nM. Inhibition data were supplemented with data on drug-kinase binding affinity (below 500 nM) for kinases for which no inhibition data were available. For each cell line, for each of its kinome, activation-loop, NWK, PSP and INKA metric analyses, kinase activity data were filtered for the kinases included in the pertinent target space (supplemental Table S2). For a given combination of cell line, metric analysis, and kinase inhibitor, the percentage potential kinase inhibition was calculated as the sum of activity values (ia, inferred activity) for kinases in the drug target space, divided by the sum of activity values for all kinases in the data for a cell line (equation 1) (supplemental Table S2A, G, M, S, and Y).
% potential kinase inhibition = 100 × (sum of ia for kinases in the drug target space) / (sum of ia for all kinases) (1)
Log EC50 values for drugs were compared with the percentage potential kinase inhibition of these drugs for the AML cell lines using Pearson correlation. The EC50 values for ponatinib were supplemented with data from the Genomics of Drug Sensitivity in Cancer (GDSC) resource (http://www.cancerrxgene.org, consulted Nov. 2018) (44). These are indicated in red in supplemental Table S2.
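The calculation of percentage potential kinase inhibition and its comparison with log EC50 values can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the dictionary layout, and all numbers are hypothetical, and the Pearson coefficient is computed by hand here rather than with the R function used in the study.

```python
from math import sqrt


def percent_potential_inhibition(activity, target_space):
    """Share (in %) of summed inferred activity that falls on kinases
    inside a drug's target space.

    activity:     {kinase: inferred activity value} for one cell line
    target_space: set of kinase names targeted by the drug
    """
    total = sum(activity.values())
    if total == 0:
        return 0.0
    targeted = sum(v for kinase, v in activity.items() if kinase in target_space)
    return 100.0 * targeted / total


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical activity values for one cell line
toy_activity = {"FLT3": 30.0, "KIT": 10.0, "LYN": 60.0}
toy_percent = percent_potential_inhibition(toy_activity, {"FLT3", "KIT"})

# Hypothetical % inhibition vs log EC50 across three cell lines
toy_r = pearson_r([10.0, 20.0, 30.0], [3.0, 2.0, 1.0])
```

A negative coefficient, as in the toy data, is the direction expected if a larger targeted share of activity goes with a lower EC50.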
In Silico Analysis of Cell Line KI Sensitivity-Thirty KIs for which drug sensitivity data were available in the GDSC database or from our own experiments were selected from the dose-dependent protein-drug interaction analysis (40) in the proteomicsDB analytics toolbox. For each of the 30 KIs the (kinase) target space was determined from literature (38-43, 45-59). Target spaces include kinases inhibited with an IC50 below 500 nM, supplemented with evidence of drug-kinase binding at a Kd below 500 nM as described above (supplemental Table S4). Drug response data were found in the GDSC database for 13 out of 16 cell lines in our data set. For independently tested drugs in this manuscript that were selected based on the INKA profile (see above), the corresponding EC50s were used instead of those in the GDSC database. Per cell line, drug EC50s were divided into two groups based on the presence of the INKA-implicated driver kinase in the target space of the drug. Drug EC50s were visualized in a scatterplot and distribution of the EC50s for each group is shown with a Tukey box plot. The difference in response between the two groups was evaluated using the Mann-Whitney U test.
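The two-group comparison described above can be sketched with a hand-written Mann-Whitney U statistic. This is an illustration only: the two-sided p-value uses a normal approximation without tie or continuity correction (an assumption made to keep the sketch dependency-free; statistical packages offer exact variants), and the group names and EC50 values are hypothetical.

```python
from math import erf, sqrt


def mann_whitney_u(group_a, group_b):
    """Return (U, approximate two-sided p-value) for two samples."""
    n_a, n_b = len(group_a), len(group_b)
    # Count pairs where a value in group_a exceeds one in group_b;
    # ties contribute one half
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u = min(u_a, n_a * n_b - u_a)
    # Normal approximation of the null distribution of U
    mean_u = n_a * n_b / 2.0
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u  # z <= 0 because u is the smaller U
    p_two_sided = min(1.0, 1.0 + erf(z / sqrt(2.0)))  # equals 2 * Phi(z)
    return u, p_two_sided


# Hypothetical log EC50 values: drugs that do / do not include the
# INKA-implicated driver kinase in their target space
driver_targeted = [1.0, 2.0, 3.0]
driver_not_targeted = [4.0, 5.0, 6.0]
toy_u, toy_p = mann_whitney_u(driver_targeted, driver_not_targeted)
```

With groups this small the normal approximation is rough, so an exact test would be preferable in practice.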
Molecular Analysis-AML cell lines (n = 16) and two primary patient AML mononuclear fractions were analyzed for common AML-associated molecular aberrancies according to standard diagnostic protocols. DNA and/or RNA was extracted from cell pellets made at the time of lysate preparation as described previously (60,61). Samples were analyzed for the presence of molecular aberrations in FLT3 (ITD), 14), and CEBPA, FIP1L1-PDGFRA, BCR-ABL, PML-RARA, RUNX1-RUNX1T1 (t(8;21)(q22;q22); AML-ETO), and CBFB-MYH11 (inv(16)) fusions, and 11q23 (MLL) rearrangements according to standard procedures (http://www.modhem.nl). When FLT3-ITD was present, the allelic ratio (AR) was determined by dividing the area under the curve (AUC) of the ITD peak by the total AUC of the ITD and wild-type peaks as found in fragment analysis.
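As a worked example of the allelic ratio defined above (ITD peak area divided by the summed ITD and wild-type peak areas), a minimal sketch follows. The function name and the peak areas are hypothetical.

```python
def allelic_ratio(auc_itd, auc_wild_type):
    """AR = AUC(ITD peak) / (AUC(ITD peak) + AUC(wild-type peak))."""
    total = auc_itd + auc_wild_type
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return auc_itd / total


# Hypothetical fragment-analysis peak areas
toy_ar = allelic_ratio(auc_itd=30.0, auc_wild_type=70.0)
```

Note that with this definition the ratio is bounded between 0 and 1, unlike an ITD-to-wild-type ratio.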
Pathway Analysis-Exploration of downstream pathway activation was done by matching detected phosphoprotein data to an AML scaffold pathway, consisting of the PI3K, MAPK, and JAK-STAT pathways. The AML scaffold pathway was constructed by combining pathway information from four sources, KEGG (62), PANTHER (63), REACTOME (64), and Cell Signaling Technology pathways (https://www.cellsignal.com/contents/science/cst-pathways/science-pathways). Gene symbols were converted to HGNC nomenclature for each source list, and combined lists were made for each pathway. Pathway lists were further integrated to create a list with pathway-associated proteins. The final list was matched to the phosphoprotein list of our data set, and pathway-associated proteins detected in our data set were included in an AML signaling list (n = 199). For each sample, phosphoprotein-level spectral counts were matched to members of this list to create a per sample overview of pathway activation.
Pathway Visualization-To provide a per sample view of pathway activation, we built a simplified pathway scaffold in Cytoscape (65) (version 3.3.0), based on the kinases and phosphatases present in the AML signaling list that were identified in our data set, supplemented with a few other pathway proteins identified in our data set. Protein relations were extracted from the STRING database (https://stringdb.org) (66). Phosphoprotein counts per sample were matched to this scaffold pathway to create a per sample visualization of pathway activation. The color key range was set to 0 to 50 for all samples, with proteins undetected in a sample visualized in green.
Signaling Profiles-The spectral count-based AML pathway activation lists (supplemental Table S3) were used to calculate z-scores with median normalization of phosphoprotein counts within each sample column, using the Perseus software package (49) (version 1.5.5.0). Hierarchical clustering (distance measure: Euclidean, linkage method: average) was done for all cell lines and clinical samples using the hclust() function in R.
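The per-sample z-scoring described above can be sketched as follows. The text does not spell out the exact definition used by Perseus, so this sketch assumes that "median normalization" means centering on the column median and scaling by the sample standard deviation; both the function name and that reading are assumptions. Clustering itself is left out because it was done with hclust() in R.

```python
from statistics import median, stdev


def median_centered_z(counts):
    """Z-score one sample column, centering on the median.

    counts: phosphoprotein spectral counts for a single sample
            (at least two values are needed for a standard deviation).
    Assumption: scale by the sample standard deviation.
    """
    center = median(counts)
    spread = stdev(counts)
    if spread == 0:
        # A constant column carries no relative information
        return [0.0 for _ in counts]
    return [(c - center) / spread for c in counts]


# Hypothetical counts for three pathway proteins in one sample
toy_z = median_centered_z([1, 2, 3])
```

Centering on the median rather than the mean makes the scores less sensitive to a few very highly phosphorylated proteins in a sample.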
Drug Sensitivity Assay (MTT)-Quizartinib (AC220), imatinib, ponatinib, masitinib, tofacitinib, GDC-0994, and binimetinib were acquired from Selleckchem (Munich, Germany). Ibrutinib was purchased from MedChem Express (Sollentuna, Sweden). Optimal cell amounts for plating were determined with growth curves for each cell line to ensure logarithmic growth (between 3000 and 13,000 cells/well). Proliferation was determined at 96 h to ensure a minimum of three doublings for each cell line. Cell proliferation was assessed after a 4-h incubation with 10% MTT (3-(4,5-dimethylthiazolyl-2)-2,5-diphenyltetrazolium bromide) (Sigma, Taufkirchen, Germany) at 37 °C following a previously described protocol (67). The EC50 value was defined as the drug concentration needed to inhibit 50% of cell growth compared with growth of untreated control cells. All experiments were done in quadruplicate.
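Given the EC50 definition above, one simple way to read an EC50 from a dose-response series is log-linear interpolation between the two concentrations that bracket 50% of control growth. The text does not state how the curves were fitted, so this is a swapped-in, simplified technique for illustration only; the function name and data are hypothetical.

```python
from math import log10


def ec50_by_interpolation(concentrations, viability_pct):
    """Estimate EC50 by interpolating on log10(concentration).

    concentrations: positive drug concentrations (any consistent unit)
    viability_pct:  growth as % of untreated control at each concentration
    Returns None if the series never crosses 50%.
    """
    points = sorted(zip(concentrations, viability_pct))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 > v_hi:
            # Fraction of the way from c_lo to c_hi at which 50% is reached
            frac = (v_lo - 50.0) / (v_lo - v_hi)
            log_ec50 = log10(c_lo) + frac * (log10(c_hi) - log10(c_lo))
            return 10 ** log_ec50
    return None


# Hypothetical mean viabilities at 1, 10 and 100 nM
toy_ec50 = ec50_by_interpolation([1.0, 10.0, 100.0], [90.0, 60.0, 40.0])
```

A fitted sigmoidal model would use all data points and is generally preferable when the curve is well sampled.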
Experimental Design and Statistical Rationale-Phosphoproteomics was performed for 16 AML cell lines with different genetic profiles and mutations. Reproducibility of the pY phosphoproteomics workflow was benchmarked previously (31). Samples were measured in a single-shot experiment. For cell lines, single biological samples (n = 1) were used for analysis. For clinical samples, biological material was split at the start of sample processing and measured as two technical replicates. For further analysis, values for phosphopeptides identified in both replicates were averaged into one value per clinical sample. For phosphopeptides identified in only one replicate, the value of that single identification was used. The INKA score and its component metrics are within-sample values, and statistical relevance of the former has been addressed (35). For drug response analyses, fourteen out of 16 cell lines were tested for multiple KIs in quadruplicate, and drug response curves show mean viability with the standard deviation indicated. For correlation between reduction of INKA scores for a drug's target space and drug EC50s in Fig. 6, Pearson correlation in R was used.
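The replicate handling for clinical samples described above (average when a phosphopeptide is found in both replicates, otherwise keep the single value) can be sketched as follows. The function name, the dictionary layout, and the peptide labels are hypothetical.

```python
def merge_replicates(rep1, rep2):
    """Merge two replicate measurements into one value per phosphopeptide.

    rep1, rep2: {phosphopeptide: value}; a peptide missing from a dict
                means it was not identified in that replicate.
    """
    merged = {}
    for peptide in set(rep1) | set(rep2):
        values = [rep[peptide] for rep in (rep1, rep2) if peptide in rep]
        # Mean of two values, or the single available value
        merged[peptide] = sum(values) / len(values)
    return merged


# Hypothetical spectral counts for two workflow replicates
toy_merged = merge_replicates({"pepA": 2, "pepB": 4}, {"pepA": 4})
```

Keeping single-replicate identifications avoids penalizing low-abundance phosphopeptides that were sampled in only one run.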

Phosphoproteomic Profiling of AML Cell Lines Shows Heterogeneity in Phosphorylation Patterns-Label-free phosphoproteomics has been shown to be a valuable tool to investigate kinase signaling (57-59), and, unlike isotopic labeling approaches, is feasible for application to large clinical multigroup cohorts that we envisage screening in the future. To investigate kinase hyperactivity and intracellular signaling in AML, we used our robust and reproducible workflow for label-free quantitative pY-based phosphoproteomics (31) combined with a dedicated analysis on a panel of 16 AML cell lines (Fig. 1). In total, 4853 phosphopeptides from 2279 phosphoproteins were identified, carrying 4229 phosphosites (3605 class I, localization probability > 0.75). The bulk of the class I phosphosites (3525) were phosphotyrosine sites on 4430 phosphopeptides (supplemental Tables S1 and S5). The number of identified phosphopeptides per sample was highly variable, ranging from 241 to 2764 phosphopeptides in individual cell lines (Fig. 1; supplemental Table S5). This high variability in phosphorylation patterns was confirmed by anti-pY Western blot analysis using aliquots of cell line lysates (supplemental Fig. S1) and underlines the need for analysis of individual samples.
Unsupervised clustering using all normalized phosphopeptide intensities separated cell lines with high phosphorylation levels and clusters displaying intermediate and low levels of phosphorylation (Fig. 2A). Cell lines did not cluster based on French-American-British (FAB) classification of hematologic diseases (Fig. 2A) or growth conditions such as doubling time or medium (data not shown). Importantly, the closely related KG-1 and KG-1a cell lines, where KG-1a is a less mature subline of KG-1, showed comparable phosphorylation patterns and clustered together.
Out of 2279 identified phosphorylated proteins, 138 were classified as protein kinases (Fig. 1, supplemental Table S5). The heterogeneous levels of phosphorylation per cell line observed on Western blotting and at the phosphopeptide level were also reflected in the unsupervised clustering using kinase counts (supplemental Fig. S2A).
Molecular characterization of the cell line panel following standard molecular diagnostic procedures identified common genomic aberrations in only 6 out of 16 cell lines (supplemental Table S6). These included genomic aberrations in the AML-linked kinase genes FLT3 (MV4-11, MOLM-13, Kasumi-6), PDGFRA (EoL-1), KIT (Kasumi-1), and JAK2 (HEL). Interestingly, analysis of Kasumi-6 revealed two distinct FLT3-ITDs, one of which was only detected at a low AR. No kinase aberrations were detected in any of the other 10 cell lines. Phosphopeptide clustering did not group the MV4-11, MOLM-13 and Kasumi-6 cell lines based on their FLT3-ITD mutation, indicating that phosphoproteomics adds a complementary layer of information.

INKA Ranking Pinpoints Hyperactive Kinases in Accordance with Corresponding Molecular Aberrations, Confirming Their Potential as Drug Targets-Several fusions and gain-of-function mutations in kinases are known driving factors in AML. An initial exploration into phosphorylation of the known AML TK drivers revealed that cell lines with an identified kinase mutation showed unusually high phosphorylation at the protein level when compared with the other cell lines in the panel (supplemental Fig. S2B). In addition, four cell lines without identified mutations according to standard genetic testing (KG-1, KG-1a, MM6 and Kasumi-3) also showed high phosphorylation of FGFR1, FLT3 or PDGFRA, indicating potential kinase driver activity in these cell lines (see outliers in supplemental Fig. S2B, Fig. 3, green panel).
To gain insight into (hyper)activity of kinases in a single cell line and pinpoint hyperactivity indicative of driver activity, we performed INKA analysis (35). The INKA approach is based on a 4-component analysis, for each given kinase, of phosphorylation of the kinase itself ("kinome" metric) and its activation loop segment ("activation loop" metric) on the one hand, and phosphorylation of all of its possible substrates as deduced from NetworKIN predictions and the PhosphoSitePlus catalogue of experimentally observed substrates ("NWK" and "PSP" metrics, respectively) on the other. Only kinases with both kinase-centric and substrate-centric evidence get a nonzero INKA score.
Bar graph visualizations of the INKA scores for the six cell lines with an identified TK mutation indicated high activity of these kinases, with ranks between position 1 and 6 of the highest INKA scores (Fig. 3). The high ranking of these mutant kinases was also apparent in the four individual INKA component analyses (supplemental Fig. S3A-S3F). INKA scores of mutant kinases were especially pronounced for the FIP1L1-PDGFRA fusion cell line EoL-1 and the Kasumi-1 cell line carrying an N822K KIT point mutation (Fig. 3). INKA ranking of the FLT3-ITD mutant cell lines MV4-11, MOLM-13, and Kasumi-6, and the V617F JAK2 cell line HEL showed a top 2-6 ranking of FLT3 and JAK2, next to several other kinases. Interestingly, other high-ranking kinases in these cell lines were generally located downstream in the FLT3 and JAK2 cellular signaling hierarchy, thereby still implicating FLT3 and JAK2 as primary suspects of driver activity. KIT activity was also indicated for Kasumi-6, indicating this kinase might also contribute to signaling in this cell line.
FIG. 1. Experimental outline. A panel of AML cell lines with known and unidentified kinase drivers was subjected to phosphoproteomics. To this end, immunoprecipitated phosphotyrosine peptides of 16 AML cell lines were analyzed by nanoLC-MS/MS. Combining phosphopeptides per protein allowed for quantification of overall protein phosphorylation state. Kinase activity in the cell lines was assessed by four different analyses. Kinase activation was inferred from total kinase phosphorylation and activation-loop phosphorylation, and substrate-driven analysis was done using known kinase-substrate relations and the motif-based kinase-substrate associations. Alongside, downstream pathway analysis was done, focused on AML-associated pathways (PI3K, JAK-STAT, and MAPK), to gain insight into downstream network activation. Based on the results, candidate drivers were selected for functional validation using kinase inhibitors.
Exposure of EoL-1, Kasumi-1, MV4-11 and MOLM-13 cells to quizartinib (a PDGFRA, c-KIT and FLT3 inhibitor) confirmed dependence on their drivers, with EC50 values below 10 nM (supplemental Fig. S3A-S3D). The JAK2 mutant HEL was only moderately sensitive to JAK2 inhibition, with an EC50 for tofacitinib of 5.5 µM (supplemental Fig. S3F). This result was comparable to sensitivities found with the JAK inhibitors ruxolitinib (0.66 µM) and fedratinib (1.02 µM) for this cell line in the GDSC web resource. To check whether these results were specific to the cell lines with activity of a kinase targeted by the inhibitor, a few of the other cell lines without the intended target driver were taken along as negative controls. As expected, these cell lines were much less sensitive to the KIs tested compared with those with an indicated driver targeted by the inhibitor (supplemental Fig. S4). The KI ibrutinib was also tested, as several relevant kinases (both the major target, BTK, and multiple other kinases to which the drug binds in the nanomolar range) showed high INKA rankings. In correspondence with the general presence of these kinases in all cell lines, responses to ibrutinib were moderate, with EC50s generally above 1 µM (supplemental Fig. S4). Overall, EoL-1, Kasumi-1, MV4-11, MOLM-13, and HEL cell lines were much less sensitive to ibrutinib than to the driver-targeting inhibitor (supplemental Fig. S3A-S3F).

Phosphoproteomics Reveals Unexpected Hyperphosphorylated Kinases and INKA Analysis Indicates Their Potential as Drug Targets in Four Cell Lines-Outlier analysis of phosphorylated kinases in our data set indicated four additional cell lines in our AML panel that potentially contain a kinase driver (supplemental Fig. S2B) although not indicated as such by a standard diagnostic mutation screen. These included FGFR1 (KG-1 and KG-1a), FLT3 (MM6), and PDGFRA (Kasumi-3), and kinome ranking for these cell lines showed high phosphorylation for each kinase (rank 1-4) (Fig. 3, green highlighted area; Fig. 4A). Kinase activity in these four cell lines was further supported by phosphorylation of the respective activation loops (rank 1-2) and by one or both substrate-centric analyses (supplemental Fig. S3B-S3C and S3I-S3J), resulting in a top 2 INKA ranking and thereby indicating high activity of these kinases in these cell lines.
Targeting of FGFR1 in the KG-1 and KG-1a cell lines with ponatinib further supports driver function of FGFR1 in these cell lines with drug EC 50 (29,68,69). Together, the results support driver activity of the respective kinases as predicted by our phosphoproteomics-based INKA analysis.
MAPK Signaling is Relatively High in Cell Lines Without a Clear TK Driver-In six of the sixteen cell lines in our panel, HL-60, ML-2, OCI-AML3, NB4, THP-1 and ME-1, we did not identify a clearly hyperactive TK driver, as assessed by group-based outlier analysis and INKA analysis (Fig. 3, blue highlighted area; Fig. 5A; supplemental Fig. S3K-S3P). Although a subset of these cell lines (ML-2, OCI-AML3, THP-1, and ME-1) showed some phosphorylation of known AML drivers such as FLT3 and KIT, phosphorylation levels detected in these cell lines were much lower than in the cell lines carrying a FLT3 or KIT mutation. Similarly, limited substrate-derived evidence for activation of these kinases was found in these cell lines, and INKA analysis only indicated potential activity of KIT (rank 5) in ME-1.
Figure legend (fragment): ... Table S1). Furthermore, phosphoproteomics uncovered potential driver kinases in 10 other cell lines (green and blue shaded cell lines) that were missed by the standard diagnostic tests.
Inspection of the INKA ranking in these cell lines revealed a relatively high INKA rank for MAPK3 (rank 5-8) in four cell lines, i.e. ML-2, HL-60, OCI-AML3 and THP-1, as compared with cell lines with a TK driver (Fig. 3; Fig. 5A). Importantly, activity of MAPK3 and of MAPK1 was supported by both the kinase-centric and substrate-centric analyses, in which the latter indicated activation of upstream activators MAP2K1 and MAP2K2 in these cell lines (supplemental Fig. S3K-S3N). An inspection of the Cancer Cell Line Encyclopedia (CCLE) database (
Fig. 4 (legend, partial): ... with an affinity below 500 nM. Targets indicated by circles were discovered through a chemical proteomics approach (40), and those indicated by triangles were identified using cell-free assays (39,43,72). Bold font indicates kinases that are intended targets of the drug, whereas plain font indicates additional target kinases. Kinases with a top 20 INKA score in the KG-1, KG-1a, MM-6 and Kasumi-3 cell lines are indicated with the corresponding colors as in panel A. C, Targeting of suspected driver kinases in the above cell lines with a KI effective against FGFR1 (ponatinib), or FLT3 and PDGFRA (quizartinib) shows effective inhibition of cell viability in the nanomolar range. Ibrutinib is used as a control.
underlying analyses could be linked to drug EC50 values for individual cell lines. To this end, a target space was specified for each drug, selecting experimentally proven targets inhibited by the drug at concentrations below 500 nM, supplemented with targets for which no inhibition data were available but with a binding affinity below 500 nM. For each cell line in each specific analysis, the percentage potential kinase inhibition was calculated as the ratio between the summed metric values of kinases in the target space of a KI and the summed metric values of all kinases. The resulting % potential kinase inhibition was then compared with the drug-specific log EC50 values for a cell line. In case of predictive value, an inverse correlation between % potential inhibition and log EC50 would be expected.
Fig. 5 (legend, partial): ... Fig. 4, the target space (40,42,43), is indicated in the phosphokinase profiles. Bold font indicates kinases that are intended targets, whereas plain font indicates additional targets of the drug. Kinases with a top 20 INKA score are indicated with corresponding colors as in panel A. C, Targeting of MAPK1 and MAPK3 with GDC-0994 shows effective inhibition of cell viability in the nanomolar range compared with treatment with ibrutinib (control) (p < 0.0001, two-way ANOVA). Targeting the MAPK pathway by upstream MAP2K1 and MAP2K2 inhibition using binimetinib was even more effective (p < 0.0001 and p < 0.0001, respectively). Colors as in panel A.
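The percentage potential kinase inhibition described in the text (summed metric values of kinases in a drug's target space divided by the summed metric values of all kinases) can be illustrated with a minimal Python sketch. The function name, the kinase values, and the target space below are hypothetical and chosen only for illustration; they are not data or code from the study, and the zero-total guard is an added assumption.

```python
from typing import Dict, Iterable


def percent_potential_inhibition(
    metric_by_kinase: Dict[str, float],
    target_space: Iterable[str],
) -> float:
    """Percentage of a cell line's summed kinase metric that lies in a drug's target space.

    metric_by_kinase: one activity metric per kinase for a single cell line
        (e.g. kinome, activation loop, PSP, NWK, or INKA values).
    target_space: kinases regarded as targets of the kinase inhibitor.
    """
    total = sum(metric_by_kinase.values())
    if total == 0:
        # Assumption: with no kinase signal at all, report 0% rather than divide by zero.
        return 0.0
    targets = set(target_space)
    in_target = sum(v for k, v in metric_by_kinase.items() if k in targets)
    return 100.0 * in_target / total


# Hypothetical metric values for one cell line (illustration only).
example_scores = {"FLT3": 60.0, "LYN": 25.0, "MAPK1": 10.0, "MAPK3": 5.0}
# Hypothetical target space of an illustrative FLT3-directed inhibitor.
example_targets = {"FLT3", "KIT"}

print(percent_potential_inhibition(example_scores, example_targets))  # prints 60.0
```

Kinases in the target space that were not detected in the cell line (KIT in this example) simply contribute nothing to the numerator.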
Overall, the kinase-centric data (phosphorylated kinase and phosphorylated activation loop) showed more consistent inverse correlations as compared with substrate-centric kinase analyses (Fig. 6). This was especially true for the drugs targeting an RTK driver, whereas the relation was less clear for the drugs targeting downstream kinases like tofacitinib (JAK2), GDC-0994 (MAPK1/3) and ibrutinib (BTK, LYN). This is most likely because of the broad involvement of these kinases in downstream cell signaling, thereby not limiting their involvement to those cell lines carrying a mutation. Though INKA scores are based on all four underlying analyses, correlation between % potential kinase inhibition and log EC50 showed similar results to the kinase-centric analyses, with correlation clearest for those inhibitors targeting RTKs.
In Silico Validation of INKA-guided Drug Selection-To further validate our INKA-guided drug selection in a wider panel of drugs, we performed an in silico analysis of cell line sensitivity for a panel of 30 KIs for which kinase inhibition data were available in the ProteomicsDB analytics toolkit, and for which data on drug sensitivity were available in the GDSC (supplemental Table S4). Comparison of drug EC50s for KIs targeting the INKA-identified driver kinase of an RTK-driven cell line versus selected non-profile KIs showed that EoL-1 (p = 0.0001), Kasumi-1 (p = 0.038), MV4-11 (p = 0.0015), MOLM-13 (p = 0.0006), KG-1 (p = 0.0075), and MM-6 (p = 0.0005) were significantly more sensitive to KIs targeting their respective drivers (supplemental Fig. S5).
Though sensitivity of HEL to other JAK2-inhibiting KIs was comparable to its response to tofacitinib, the JAK2-driven cell line HEL did not respond significantly better to JAK2-inhibiting drugs compared with other KIs (p = 0.2217), indicating that targeting of JAK2 alone might not be enough in this cell line. Analysis of RAS-mutant cell lines also showed variable results with p values ranging between 0.2245 and 0.0248. Overall, response to driver inhibition was less distinct for cell lines without an RTK driver.
Fig. 6 (legend, partial): Percentage potential kinase inhibition for quizartinib, ponatinib, ibrutinib, tofacitinib and GDC-0994 was separately calculated for "Kinome"-, "Activation loop"-, "PhosphoSitePlus" (PSP)- and "NetworKIN" (NWK)-based metrics of kinase activity as well as for the aggregate INKA score. This involves, for a given metric/score, taking the ratio of the summed values of kinases in the target space of a kinase inhibitor and the summed values of all kinases within a cell line. Log EC50 values for quizartinib, ponatinib, ibrutinib, tofacitinib and GDC-0994 were compared with the percentage potential kinase inhibition of these drugs for the AML cell lines using Pearson correlation.
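The comparison of log EC50 values with percentage potential kinase inhibition by Pearson correlation can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the helper name and all numbers are hypothetical, and base-10 logarithms are an assumption (the Pearson coefficient does not depend on the base of the logarithm).

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for one drug across four cell lines (illustration only).
pct_potential_inhibition = [5.0, 20.0, 40.0, 60.0]  # % potential kinase inhibition
ec50_nM = [1000.0, 100.0, 10.0, 1.0]                # drug EC50 in nM
log_ec50 = [math.log10(v) for v in ec50_nM]

r = pearson_r(pct_potential_inhibition, log_ec50)
# A predictive metric is expected to give a negative r (inverse correlation).
print(f"Pearson r = {r:.3f}")
```

In this made-up example the coefficient is strongly negative, which is the pattern the text describes as indicating predictive value.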
Phosphoproteomics Demonstrates FLT3 Activation in FLT3-ITD-positive Patient Samples-To assess if identification of hyperactive phosphorylated kinases is also feasible in AML clinical samples that undergo a ~1.5-hour purification protocol, we analyzed AML cells of two FLT3-ITD-positive AML patients. The first sample (Pt.1) was collected from a 66-year-old female AML patient carrying two FLT3-ITD variants, with an AR of 46% for the dominant variant, and an NPM1 mutation (supplemental Table S7). Blast percentage before MNC isolation using Ficoll-Paque was 97% as determined by immunophenotyping. The second sample (Pt.2) was acquired from a 45-year-old female AML patient with 93% blast cells before isolation. This sample also carried both an FLT3-ITD (AR of 95%) and an NPM1 mutation.
Phosphoproteomic analysis of the two clinical samples yielded 1089 phosphopeptides, with 880 phosphosites (768 class I) on 553 phosphoproteins, including 55 phosphorylated kinases (supplemental Table S8). Despite a lower sample input (5 mg versus 10 mg), phosphorylation of FLT3 was detected at a similar order of magnitude as observed for the FLT3-ITD mutant cell lines (supplemental Fig. S6, kinome analysis). Additionally, a higher extent of FLT3 phosphorylation was detected for the patient with the higher FLT3-ITD AR (Pt.2). INKA analysis of the two clinical samples indicated activity of FLT3 in both samples (rank 6), supported by both kinase-centric and substrate-centric analyses (Fig. 7A, supplemental Fig. S6). This ranking is similar to that found in FLT3-ITD AML cell lines (Fig. 3, supplemental Fig. S3C-S3E).
Overlapping phosphopeptide data of the patient samples with that of the three FLT3-ITD-positive cell lines MV4-11, MOLM-13 and Kasumi-6 revealed that nearly all phosphopeptides measured in the cell lines could also be identified in the clinical samples (supplemental Fig. S7A). Furthermore, despite the lower protein input, an additional 251 phosphopeptides were detected in the clinical samples. In total, 71 kinases were identified in the combination of both data sets, and 42 of these were detected in both cell lines and clinical samples (supplemental Fig. S7B).
To further investigate the potential effect of sample processing on phosphorylation in clinical samples, we clustered cell lines and patient samples together based on their AML signaling pathway components (Fig. 7B, supplemental Table S3). This analysis showed that most samples clustered by their identified RTK drivers. Although the Pt.2 patient sample clustered together with the FLT3 aberrant cell lines, the Pt.1 patient sample clustered in an adjacent sub-cluster together with RAS-driven cell lines. These results underscore the similarity in signaling patterns of cell lines and patient samples and suggest that sample processing has minimal impact on AML signaling.
Fig. 7B (legend, partial): Clustering is based on within-sample z-scores of summed phosphopeptide counts for proteins that are part of the defined AML signaling network (supplemental Table S3).
DISCUSSION
In view of the discovery of frequent molecular aberrations in RTKs in AML, and the limitations of mutation-based treatment selection, there is a clinical need to predict drug sensitivity of AML patients for kinase inhibitors. Phosphoproteomic approaches may provide an additional layer of functionally relevant molecular information that is advantageous for effective treatment selection. Here, as a first preclinical step, we have explored application of phosphoproteomics coupled to INKA analyses for KI selection for individual AML "cases" in an in vitro setting. To our knowledge, we provide the largest AML pY-phosphoproteome data set to date (4853 phosphopeptides from 2279 phosphoproteins), thereby significantly extending previous findings of limited depth (~48-450 phosphopeptides) in 13 AML cell lines (16,19,29,30), five of which are included in our data set.
We have analyzed cellular kinome states by ranking kinase phosphorylation levels in single samples and by cohort outlier analysis. Most notably, our study revealed that INKA analysis, based on quantification of kinase phosphorylation, both at the level of the entire molecule and at the level of the activation loop specifically, in combination with phosphosubstrate analysis, can serve to deduce aberrant kinase activation. Application of single-sample INKA ranking to a panel of AML cell lines enabled us to explore how different genomic kinase aberrations translate into differences in the phosphoproteome of six AML cell lines, and whether phosphoproteomics-based INKA analysis could reveal their RTK hyperactivity. Furthermore, we were able to identify unexpected FLT3, PDGFRA, FGFR1, and MAPK pathway activity in ten more cell lines. Analysis of two patient samples produced comparable results to those of the cell lines, indicating that the elaborate sample processing needed to purify AML blast cells from blood has no major influence on clinical sample phosphorylation state and that our method might be translatable to clinical samples. Our data show that phosphoproteomics in combination with INKA analysis can be used as a global readout of kinase activation that can be harnessed to select KIs for inhibition of identified targets. Potentially, this could also include elucidation of kinase-driven resistance mechanisms as well as stimulatory influences from the microenvironment that are missed by mutation-based analyses.
Evaluation of INKA-guided drug selection in a panel of 30 KIs from publicly available data showed that RTK-driven cell lines were significantly more sensitive to KIs targeting the INKA-identified RTK drivers, confirming the ability of our method to identify driver RTKs and the success of KI-based treatment of RTK-driven cell lines. However, though INKA was able to indicate activity of the mutant driver JAK2 and RAS pathway-related kinases MAPK1 and MAPK3 in non-RTK-driven cell lines, the response to inhibition of these downstream hyperactive kinases by a single KI was much less distinct from the response to selected non-profile kinase inhibitors. Possibly, monotherapy is not enough for these cell lines. Interestingly, despite evidence for only three kinases in its target space, the MEK inhibitor trametinib caused the largest response by far in HL-60, ML-2, and OCI-AML3 compared with other MEK inhibitors. Also, as the insulin receptor is ranked high (position 3-4) in ML-2, OCI-AML3, and THP-1, this receptor might be an interesting co-target in these cell lines. Further research is needed to reveal the mechanisms behind this variable response and the success of combination treatment in these cell lines.
Identification of kinase activity was not limited to the identification of driver kinases, as several other kinases showed high INKA score ranking. These were generally kinases acting downstream of RTKs, such as SRC-family kinases, and kinases functioning further downstream, such as MAPK14, GSK3A/B and CDKs. Though some components, such as adaptor proteins, were more specific to cell lines with a hyperactive RTK, kinases such as BTK, LYN, MAPK14, GSK3A/B and CDKs were generally phosphorylated in all cell lines and clinical samples, indicating a possible "household" function.
In the FLT3-ITD mutant cell lines, FLT3 generally scored lower than or similar to several downstream components in the INKA analysis. However, these cell lines still proved highly responsive to FLT3 inhibition, indicating that pathway hierarchy is equally important in selecting potential driver kinases for treatment. Indeed, despite the less striking difference in INKA scores in the FLT3-ITD mutant cell lines, sensitivity to a FLT3-targeting inhibitor was similar to that of the FLT3 point mutant cell line MM-6, where FLT3 scored exceptionally high. The importance of targeting RTKs over other kinases was also evident from our systematic analysis of kinase activity in relation to drug EC50 values, reflecting the involvement of RTKs as known drivers in AML. Co-activated kinases could potentially be interesting for use in combination treatment.
Previous efforts in AML mainly involved global phosphoproteomics in conjunction with phosphosubstrate-based analyses for kinase inference (17,54,55,57,58). For the phosphotyrosine-specific data set of AML phosphoproteomes presented here, we showed the superior correlation of kinase-centric ranking with drug EC50 values as compared with substrate-centric ranking. Substrate-centric metrics do significantly contribute to the performance of INKA scoring (35), however, and their low correlation with drug EC50 values did not compromise the correlation of the aggregate INKA scores with EC50. Whether these results may be extrapolated to global phosphoproteomics data dominated by serine and threonine phosphorylation remains to be determined. On a further note, most KIs demonstrate at least some degree of polypharmacology (see Figs. 4 and 5, and supplemental Fig. S3). This may be considered an added benefit, with one drug inhibiting multiple active kinases, and is therefore taken into account in the analysis shown in Fig. 6, where the target space of a drug is used to assess the relation between drug EC50 and the magnitude of individual kinase activity metrics for separate cell lines.
In summary, our study provides an in-depth global analysis of the phosphotyrosine-based phosphoproteome in AML. We demonstrated the potential of the INKA ranking strategy based on kinase phosphorylation, supported by evidence of substrate phosphorylation, to provide a readout for kinase activation in single AML samples. Underscoring our approach, we functionally verified target potential of hyperactive kinases in ten cell lines for which standard diagnostic mutational analysis did not reveal any drivers. Our analysis of two clinical samples illustrates, in principle, the feasibility of kinase activity analysis in patient material. A more elaborate assessment of the phosphoproteome-guided drug matching strategy in de novo and relapsed patient samples will shed light on its potential for future clinical application in guiding specific kinase inhibitor selection for individual AML patients.
Acknowledgments-We thank the molecular diagnostics team of the Department of Hematology for performing the molecular characterization of the cell lines and clinical samples used in this study. Furthermore, we would like to thank Peter Sminia for providing us with binimetinib, and Anneloes van Duijn for performing several of the cell viability assays.

DATA AVAILABILITY
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the set identifiers PXD007237 (cell line data), PXD015639 (clinical sample phosphoproteomics data), and PXD015662 (clinical sample proteomics data) (https://proteomecentral.proteomexchange.org).