An atlas of human kinase regulation

Abstract The coordinated regulation of protein kinases is a rapid mechanism that integrates diverse cues and swiftly determines appropriate cellular responses. However, our understanding of cellular decision‐making has been limited by the small number of simultaneously monitored phospho‐regulatory events. Here, we have estimated changes in activity in 215 human kinases in 399 conditions derived from a large compilation of phosphopeptide quantifications. This atlas identifies commonly regulated kinases as those that are central in the signaling network and defines the logic relationships between kinase pairs. Co‐regulation along the conditions predicts kinase–complex and kinase–substrate associations. Additionally, the kinase regulation profile acts as a molecular fingerprint to identify related and opposing signaling states. Using this atlas, we identified essential mediators of stem cell differentiation, modulators of Salmonella infection, and new targets of AKT1. This provides a global view of human phosphorylation‐based signaling and the necessary context to better understand kinase‐driven decision‐making.

Thank you again for submitting your work to Molecular Systems Biology. We have now heard back from the two referees who agreed to evaluate the study. As you will see, the referees find the topic of your study of potential interest and are supportive. However, they raise a series of concerns and make suggestions for modifications, which we would ask you to carefully address in a revision of the present work.
-Reviewer #2 makes a series of suggestions to clarify some of the numbers presented.
-The presentation of the data and method of the results shown in Figure 2 should be improved; we attach an annotated PDF file to make some suggestions and comments.
-Please confirm that the download and MySQL dump sections of phosfate.com will be activated upon acceptance.
Reviewer #1:
I reviewed this manuscript previously and had strongly recommended it for publication. I also see that the authors had integrated corrections in response to my comments and to a number of other issues that the other reviewer had raised. My opinion has not changed, but nonetheless, here are my original comments.
The title of this work is apt, for it seems the focus of the authors' efforts is to order the regulation of protein kinases according to cellular functions and to each other in the context of different cell functions. What Ochoa et al. describe seems to be straightforward. They have compiled just about all of the human condition-specific differential phosphoproteomic data that they could muster, matched the phosphosites to those of known substrates of kinases, and were able to come up with a compendium of condition-specific activation states for 209 human protein kinases, about half of the human kinome. This is very good indeed.

What follows then is a series of analyses in which the authors explore what this large-scale knowledge of the states of kinases under such a large number of conditions might tell us about the nature of individual kinases, their relationships to other kinases and to cellular functions. Ultimately, what we're provided with here is a collection of virtual reporters of signaling network regulation. To demonstrate the utility of these virtual reporters, the authors analyzed responses of different kinases during PMA-driven differentiation of hESCs, using phospho-substrate-specific antibodies as reporters to test the predictive power of their method to associate the regulation of different kinases with distinct conditions.

The analyses described in this manuscript are quite elegant and straightforward, and they answer some of the more basic questions one should ask given our present understanding that kinases form regulatory signaling networks that are intrinsically complex, limiting ourselves at the moment to describing kinase activity perturbations under different conditions as signatures of those conditions that may have some considerable predictive utility. That the authors have done such a large-scale analysis is a major contribution and should be of general interest and utility to those interested in signaling networks.

Reviewer #2:
The authors have compiled a large data collection of almost 3 million human phosphopeptide quantifications from 435 perturbations. They collected these based on existing data, renormalized and filtered them. Kinase activities were inferred from the quantifications and show good correlation with autophosphorylation signals. Generalist kinases are identified that are central in the signaling network and pass signals through to other kinases. The study is unique and of excellent quality, and the results support the conclusions. However, I find the presentation of numbers included in the study confusing and in places misleading. I understand that the authors try to represent the large amount of analytical work that has gone into this study, but as it is, it is difficult to read and understand the results. Although breakthrough results are missing, the Atlas could represent a very valuable resource for the PTM community.

Presentation of numbers
The abstract does not mention how many phosphopeptides actually change and are thus part of a profile. My first impression was that the 3 million represented phosphopeptide changes (instead of quantifications). The way it is presented now is confusing. Further, the authors report that in fact only 48% of these 3 million quantifications are retained, as the rest concern phosphopeptides seen in only one study and are likely false positives. Similarly, page 4 states: "In this study, we have compiled condition-dependent changes in human protein phosphorylation including 2,940,379 phosphopeptide quantifications." The word "including" seems to suggest all those quantified peptides are changed. It should rather be worded as "based on" or "derived from".
Please include in this sentence the number of changes in phosphopeptides, as well as the number of unique phosphopeptides that have been quantified and display significant changes. Only in the supplement is it clear that only about 100,000 unique phosphopeptides have been quantified. How many of those end up in the final profiles is unclear.
Also, the abstract mentions 399 conditions whereas on page 4, 435 perturbations are mentioned. Where does this difference come from? Do the other 36 conditions only contain phosphopeptides that are unique to one condition? Or do they not belong to a kinase?
Page 12: The authors present correlation as R in the text and the figure. It should rather be presented as R², which is the conventional way. This is again confusing and could be interpreted as misleading (a reader expects to see the R-squared here). Similarly on page 13 (R=0.27).
It is unclear how the correlation in Figure 4C is calculated. Is this Pearson correlation? Are the underlying assumptions for Pearson correlation fulfilled? That is, are the two variables normally distributed? Would a rank correlation not be more appropriate?
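The two points above (R versus R², and Pearson versus rank correlation) can be illustrated with a minimal sketch. The data values and the names `activity_a` and `activity_b` are made up for illustration and are not taken from the manuscript; the functions are a plain standard-library implementation, not the authors' method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical activity profiles: monotonically related, with one outlier.
activity_a = [0.1, 0.4, 0.5, 0.9, 1.2, 8.0]
activity_b = [0.2, 0.3, 0.7, 0.8, 1.0, 1.1]

r = pearson(activity_a, activity_b)
rho = spearman(activity_a, activity_b)
print(f"Pearson r = {r:.3f}, r^2 = {r * r:.3f}, Spearman rho = {rho:.3f}")

# A reported R of 0.27 corresponds to an R-squared of about 0.073.
print(f"R = 0.27 gives R^2 = {0.27 ** 2:.4f}")
```

Because the rank-based coefficient depends only on ordering, it is not pulled down by the single outlier in this toy example, whereas Pearson r is. Squaring a modest R also makes clear how little variance is explained, which is why the choice between reporting R and R² matters to a reader.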
Page 13: "Protein complexes are common signaling effectors that often display coordinated phospho-regulation with regulatory kinases" Is this an assumption? Is it based on previously published results? Is this a result of the study? Please clarify.

Thank you again for sending us your revised manuscript. We are now satisfied with the modifications made and I am pleased to inform you that your paper has been accepted for publication.
