Phosphoproteomic Approaches to Discover Novel Substrates of Mycobacterial Ser/Thr Protein Kinases

Mycobacterial STPKs are responsible for orchestrating phosphorylation-dependent signaling cascades that mediate bacterial growth and environmental adaptation. Recent advances in MS-based phosphoproteomics have significantly expanded the candidate substrate lists for individual mycobacterial STPKs. Integration of the available phosphoproteomic datasets provides new insights into the functional roles of specific STPKs in cell physiology. Future research should focus on in vivo phosphorylation network reconstruction to expose the fundamental signaling pathways in mycobacteria. Linking STPKs with their physiological substrates may reveal novel antimycobacterial agents.

Highlights
Mapping kinase-substrate relationships is vital in discovering new tuberculosis drug targets.
LC-MS/MS-based phosphoproteomics expands mycobacterial STPK substrate catalogues.
We review and integrate MS-generated datasets on novel candidate substrates.
Validation studies are necessary to confirm true physiological substrates of STPKs.

Mycobacterial Ser/Thr protein kinases (STPKs) play a critical role in signal transduction pathways that ultimately determine mycobacterial growth and metabolic adaptation. Identification of key physiological substrates of these protein kinases is, therefore, crucial to better understand how Ser/Thr phosphorylation contributes to mycobacterial environmental adaptation, including response to stress, cell division, and host-pathogen interactions. Various substrate detection methods have been employed with limited success, with direct targets of STPKs remaining elusive. Recently developed mass spectrometry (MS)-based phosphoproteomic approaches have expanded the list of potential STPK substrate identifications, yet further investigation is required to define the most functionally significant phosphosites and their physiological importance. Prior to the application of MS workflows, for instance, GarA was the only known and validated physiological substrate for protein kinase G (PknG) from pathogenic mycobacteria. A subsequent list of at least 28 candidate PknG substrates has since been reported with the use of MS-based analyses. Herein, we integrate and critically review MS-generated datasets available on novel STPK substrates and report new functional and subcellular localization enrichment analyses on novel candidate protein kinase A (PknA), protein kinase B (PknB), and PknG substrates to deduce the possible physiological roles of these kinases. In addition, we assess substrate specificity patterns across different mycobacterial STPKs by analyzing reported sets of phosphopeptides, in order to determine whether novel motifs or consensus regions exist for mycobacterial Ser/Thr phosphorylation sites. This review focuses on MS-based techniques employed for STPK substrate identification in mycobacteria, while highlighting the advantages and challenges of the various applications.


Seanantha S. Baros ‡, Jonathan M. Blackburn ‡ §, and Nelson C. Soares ¶ʈ
Posttranslational modifications (PTMs) are essential for the proper functioning of certain proteins and create heterogeneity and temporal complexity in mycobacterial proteomes (1-3). Recent developments in MS-based proteomics have highlighted the occurrence of numerous types of PTMs in mycobacteria (2, 3). Phosphorylation is a reversible PTM, mediated by protein kinases and phosphatases, that diversifies protein functions and orchestrates various signaling networks in mycobacteria (1, 4-6). Other well-studied PTMs in mycobacteria include glycosylation, lipoylation, acetylation, and pupylation, which have a wide array of physiological consequences, ranging from the regulation of cell functions to modulation of host-pathogen interactions, as reviewed elsewhere (2, 3).
Being the most dynamic of all common PTMs, Ser/Thr protein phosphorylation is a central control mechanism associated with the growth, virulence, and persistence of mycobacterial species, including Mycobacterium tuberculosis, the etiological agent of tuberculosis (7-11). Through phosphorylation, mycobacterial STPKs and two-component systems coordinate signal transduction cascades that regulate physiological states, environmental adaptation, and pathogenesis (10, 12, 13).
Two-component systems consist of a sensor histidine kinase and a cognate response regulator, which are typically genetically linked and transcriptionally coupled (14, 15). The histidine kinase spans the cytoplasmic membrane and activates the response regulator when prompted by specific external stimuli (14). Communication between the proteins occurs via phosphoryl-group transfer from a conserved histidine residue of the sensor kinase to a conserved aspartate residue of the response regulator (16). The response regulator is usually a transcription factor localized in the cytoplasm that mediates adaptive responses to the environment (14).
In contrast to two-component systems, STPKs are pleiotropic signal transducers that can phosphorylate multiple substrates to elicit branched signaling (14). The majority of mycobacterial STPKs are transmembrane proteins with an extracellular sensor domain and an intracellular kinase domain. The extracellular sensor domain transduces extracytoplasmic signals to the intracellular kinase domain, resulting in kinase activation and phosphorylation of Ser/Thr residues on substrate proteins. This phosphorylation may directly affect protein function or influence protein-protein interactions. Notably, Ser/Thr phosphorylation rarely results in direct regulation of transcription, which is often facilitated by two-component signal transduction (10).
A phylogenetic analysis clustered the 11 mycobacterial STPKs into five groups, Clades I-V (Table I) (20). The transmembrane sensor kinases PknA, PknB, and PknL belong to Clade I. PknD, PknE, and PknH belong to Clade II, which represents the group of integral membrane receptor and cytoplasmic kinases. Clade III comprises PknF, PknI, and PknJ. Finally, Clades IV and V represent the soluble kinases and contain PknK and PknG, respectively (20).
PknA and PknB are essential for sustaining mycobacterial growth and survival in vitro and in vivo (11, 19, 21) and also modulate cell morphology (22). PknG, a well-studied mycobacterial STPK, is involved in the regulation of metabolism (7, 23) and promotes the intracellular survival of mycobacteria within macrophages by blocking phagosome-lysosome fusion (24). The remaining kinases have proposed roles in various biological processes, including the regulation of growth and cell division, cell wall synthesis, environmental sensing, and adaptive responses to different abiotic stresses (25-29).
These STPKs, along with their cognate phosphatases, interact functionally to generate a complex signaling architecture (6, 13, 30). Notably, kinases are distinguished from one another based on their substrate repertoires and their modes of regulation (31). Mapping kinase-substrate relationships is, therefore, key to understanding fundamental biochemical pathways and signaling networks in mycobacteria and identifying new, plausible pharmaceutical targets for therapeutic intervention (1).
However, although the functional roles of these individual STPKs are increasingly well defined, the physiologically relevant substrates and networks responsible for integrating cellular signals remain obscure. Various techniques, including genetic screening, in vitro kinase assays, protein microarrays, two-hybrid-based studies, and proteomics, have been described for the screening of kinase substrates (32). These approaches often attempt to mimic in vivo contexts to provide useful insights; however, validation is still required. Consequently, improved methods and novel strategies for mycobacterial STPK substrate detection are essential.
The sensitivity and high-throughput capacity of MS have improved significantly in recent years and proved to be of considerable benefit to the detection of phosphorylation events. MS-based methods now surpass classical approaches, especially since MS is able to confidently assign the localization of phosphosites (7-9). Coupled with the use of kinase knock-out mutants or chemical inhibitors, MS also allows for the quantitative analysis of phosphorylation changes in the presence or absence of a specific kinase.
Quantitative phosphoproteomic strategies have been successfully applied to identify novel substrates of protein kinases in bacteria such as Bacillus subtilis (33,34) and Escherichia coli (35), but also in higher organisms, including yeast (36) and human cells (37,38). More recently, high-resolution MS has been applied for the identification of STPK physiological substrates and interacting partners in mycobacteria. Accordingly, this review will hereafter focus on quantitative MS-based mycobacterial STPK substrate identification.
MS-Based Ser/Thr Phosphoproteomic Analysis

Classical Methods for STPK Substrate Identification. Technical developments in MS, together with its increasing accessibility, have enabled the refinement of proteomic strategies for the identification of novel physiological substrates of mycobacterial STPKs (4, 7-9, 39). Commonly utilized methods combine in vitro kinase assays with MS to detect kinase substrates. This often entails the use of purified, active protein kinases to phosphorylate proteins or cell lysates in vitro, followed by protein purification and MS analysis for phosphopeptide identification. Applying this strategy, the forkhead-associated (FHA) domain-containing protein, GarA, was identified as a substrate of PknG in vitro (40). Employing a similar proteomic approach, in combination with 2D-PAGE and autoradiography, Rv2175c and GarA were found to be substrates of PknL and PknB, respectively (41, 42). This method allowed for the differentiation of several protein species and highlighted the complementarity of 2D-PAGE and MS for increasing proteome coverage.
An alternative approach, termed KESTREL, uses fractions of cell lysates to separate endogenous protein kinases from their various substrates, thereby minimizing background phosphorylation and sample complexity for subsequent MS analysis. A KESTREL-based approach for mycobacterial STPK substrate identification has been presented that combines proteome fractionation, in vitro kinase assays, and reversed-phase chromatography with 2D-PAGE and tandem MS protein identification (23). Using this method, GarA was, again, identified as a substrate for mycobacterial PknG. However, as demonstrated in previous studies, high enzyme-to-substrate ratios are required to detect phosphorylation using in vitro assays (1, 23, 40). This limitation suggests that these assays may not capture and reflect the true biological selectivity of protein kinases, which are also known to display promiscuity in vitro. The physiologically relevant specificity of these kinases is often lost because of the elevated, nonphysiological kinase and/or substrate concentrations used in kinase assays. Thus, mycobacterial phosphoproteomics research should ideally aim toward in vivo phosphorylation network reconstruction in order to link STPKs with their physiological substrates and binding partners.
Recent Advances in Mycobacterial STPK Substrate Identification. Various quantitative methods for ex vivo LC-MS/MS analysis have been successfully applied to generate phosphoproteomic datasets for mycobacterial species (12, 39, 44-46). For example, Prisic et al. reported global Ser/Thr phosphorylation profiles of the M. tuberculosis laboratory strain H37Rv grown under conditions of hypoxia, nitric oxide stress, and oxidative stress, and with glucose or acetate as a carbon source (39). MS analysis of these samples enabled the detection of 506 phosphosites on 301 proteins involved in a vast array of functions. Similar numbers of phosphoproteins were reported in other M. tuberculosis strains (12, 47) and other mycobacterial species (45, 46). However, high-throughput phosphoproteome profiling by MS is often unable to determine an exact relationship between protein kinases and their substrates, because multiple different kinases are active in a given biological system. Consequently, more laborious methods are required to associate these substrates with their cognate protein kinases (4).
Recently, Nakedi et al. employed label-free quantitative phosphoproteomics to screen for novel physiological substrates of M. bovis BCG PknG in vivo (7). The study compared the phosphoproteomic dynamics of the batch culture growth of M. bovis BCG wild-type against the respective PknG knock-out mutant strain. Titanium dioxide (TiO2) beads were used for phosphopeptide enrichment, followed by LC-MS/MS (Fig. 1). This ex vivo workflow facilitated the identification of 55 differentially phosphorylated phosphopeptides, of which 21 phosphopeptides were phosphorylated only in the M. bovis BCG wild-type and not in the PknG knock-out mutant. Phosphopeptides were identified by mapping acquired peptide MS/MS spectra against the M. bovis BCG Pasteur 1172 reference proteome using the Andromeda search engine within the MaxQuant suite (48). Modified peptides underwent score-based filtering, including an Andromeda search score >40, delta score >8, false discovery rate <0.05, and a localization probability ≥0.75 (Fig. 2). A selected number of these novel candidate PknG substrates were further validated through targeted parallel reaction monitoring MS assays. Notably, this study exemplified the utility of parallel reaction monitoring as a new means to validate phosphorylation events in phosphoproteomic experiments (49).

Fig. 1. Sample preparation for a standard mycobacterial phosphoproteomic experiment begins with the mycobacterial cells being lysed in the presence of protease and phosphatase inhibitors to preserve all phosphorylation events. Samples then undergo tryptic digestion and subsequent peptide desalting using C18 STAGE (stop-and-go-extraction) tips. Phosphopeptides are then enriched using immobilized metal affinity chromatography, TiO2, or antibody-based immunoaffinity methods, followed by another desalting step. LC-MS/MS data acquisition is then performed, and the generated data are analyzed using various software packages and/or bioinformatic pipelines.
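The score-based filtering and the wild-type versus knock-out comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the `PhosphoPeptide` record, its field names, and the example peptides are hypothetical, and treating the false discovery rate as a per-peptide `q_value` is a simplifying assumption. Only the four thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhosphoPeptide:
    """Hypothetical record for one identified phosphopeptide."""
    sequence: str
    score: float              # Andromeda search score
    delta_score: float
    q_value: float            # assumed per-peptide stand-in for the FDR cut-off
    localization_prob: float  # phosphosite localization probability
    in_wild_type: bool
    in_knockout: bool


def passes_filters(p: PhosphoPeptide) -> bool:
    """Apply the four thresholds quoted in the text."""
    return (
        p.score > 40
        and p.delta_score > 8
        and p.q_value < 0.05
        and p.localization_prob >= 0.75
    )


def wild_type_only(peptides):
    """Confident phosphopeptides seen in wild type but absent from the knock-out."""
    return [
        p for p in peptides
        if passes_filters(p) and p.in_wild_type and not p.in_knockout
    ]


# Made-up example data, for illustration only.
peptides = [
    PhosphoPeptide("AAAT(ph)PEPK", 95.0, 30.0, 0.001, 0.99, True, False),
    PhosphoPeptide("GGS(ph)LLR", 38.0, 12.0, 0.010, 0.90, True, False),  # score too low
    PhosphoPeptide("VVT(ph)DDK", 80.0, 20.0, 0.004, 0.80, True, True),   # also in knock-out
]
candidates = wild_type_only(peptides)
print([p.sequence for p in candidates])
```

Peptides that pass such a filter remain candidates only; they still require targeted validation of the kind described in the text.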
Following this publication, several other studies employed similar approaches in the search for mycobacterial protein kinase substrates (6, 8, 9, 50, 51). Gil et al. developed an affinity purification-MS strategy to stepwise recover PknG substrates and interactors (8). Using this tailored interactome approach, seven candidate PknG substrates were identified, of which six overlapped with those previously reported by Nakedi et al. (7). In addition, this study recovered the only two previously well-characterized PknG substrates, namely GarA and the 50S ribosomal protein L13, thereby substantiating the notion that these proteins represent plausible physiological substrates or interactors of PknG.
In order to identify potential PknA and PknB targets, [...] quantitative phosphoproteomics approach with a PknB tetra mutant, wherein all four putative ligand-interacting residues were modified. Simultaneous mutation of these penicillin-binding protein and Ser/Thr kinase-associated (PASTA) 3-4 linker region residues appeared to abrogate ligand binding (9). Nine of these 73 targets overlapped with those reported previously by Carette et al. (52), providing partial independent verification of this significantly expanded list of candidate PknB substrates. However, the general lack of overlap between the candidate PknB substrates reported by these two studies is notable, even when accepting that the data presented by [...] substrates. This suggests that further research is required to verify the full sets of candidate substrates to minimize overreporting and to identify the true physiological PknB substrates.
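Comparing candidate lists across studies, as with the nine shared targets out of 73 noted above (roughly 12%), reduces to a set intersection once both lists use the same protein identifiers. The sketch below shows one way to summarize such an overlap. The function name `overlap_summary` and the identifiers are placeholders chosen for illustration; they are not data from either study.

```python
def overlap_summary(set_a, set_b):
    """Summarize the overlap between two sets of candidate substrate IDs."""
    shared = set_a & set_b
    union = set_a | set_b
    return {
        "shared": sorted(shared),
        "n_shared": len(shared),
        # Fraction of study A's candidates that study B also reports.
        "fraction_of_a": len(shared) / len(set_a) if set_a else 0.0,
        # Jaccard index: shared candidates relative to all candidates.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder identifiers, not real substrates.
study_a = {"protein_01", "protein_02", "protein_03", "protein_04"}
study_b = {"protein_03", "protein_04", "protein_05"}
summary = overlap_summary(study_a, study_b)
print(summary)
```

A low overlap fraction does not by itself show that either list is wrong, since strains, growth conditions, and filtering thresholds may differ between studies.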
Notwithstanding caveats concerning the need for robust validation of new candidate substrates for individual kinases, LC-MS/MS phosphoproteomic-based workflows have proven to be the most promising strategy for the ex vivo identification of novel STPK substrates in mycobacteria thus far. Moreover, the cost of these workflows can be minimized using label-free quantitation, allowing for the investigation of multiple time points and/or conditions. Above all, these approaches allow for substrate discovery under more biologically relevant conditions. In contrast, in vitro strategies ideally need to factor in subcellular compartmentalization of proteins, as well as physiologically relevant enzyme/substrate concentrations, which ultimately determine whether a protein kinase can interact with, and subsequently phosphorylate, a putative substrate. In vitro kinase assays can, however, serve as a validation technique for novel putative substrates recovered during in vivo phosphoproteomic studies.

As part of this review, we have created a manually curated list of all currently reported mycobacterial kinase substrates and their specific phosphosites (Table S1). Table I presents a truncated table of [...] (Table S2) were used to perform statistical overrepresentation tests of GO biological processes and GO cellular components using the PANTHER Classification System (http://pantherdb.org) version 14.1 (55) and GO database version 1.2 (released 2019-07-03). The M. tuberculosis (strain ATCC 25618/H37Rv) proteome was used as a reference list. Fig. 3A shows the functions that were enriched in the PknA and PknB candidate substrate list, with a clear enrichment of protein folding, cellular macromolecule biosynthetic process, regulation of molecular function, and gene expression. Moreover, the cellular component enrichment suggests that these substrates are localized in the cell wall, external encapsulation structure, cell periphery, and plasma membrane (Fig. 4A).
Both of these results reaffirm the role of PknA and PknB in modulating cell division, cell wall synthesis, cell morphology, and metabolic processes (19, 21, 22, 56 -58), while corroborating the localization of these kinases to the cell wall and cell membrane (21,59).
The biological processes that were highly enriched in the PknG candidate substrate list were protein folding, response to temperature stimulus, and the regulation of various processes, including protein maturation, plasminogen activation, and protein processing (Fig. 3B). These findings agree with the conclusions drawn in previous studies that suggest PknG to be involved in the regulation of protein folding and translation (7,8). The cellular component enrichment shows these substrates to be considerably localized in the cytosol, cell wall, and capsule (Fig. 4B). This agrees with PknG being a soluble STPK and cytosolic protein, which is hypothesized to be translocated under specific environmental conditions (60,61). Furthermore, these results also support the localization of PknG to the cell wall (61-63) and its proposed regulatory role in cell wall biogenesis and integrity (62,64).
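The idea behind a statistical overrepresentation test can be illustrated with a short Python sketch. It assumes a one-sided hypergeometric (Fisher-type) test with Bonferroni correction; the exact test and correction applied within PANTHER for the analyses above are not stated here, so this is an illustration of the principle rather than a reproduction of them. The function names, protein identifiers, and term annotations are all hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, sample):
    """P(X >= k) when drawing `sample` proteins from `population`,
    of which `annotated` carry the term of interest."""
    total = comb(population, sample)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(k, min(annotated, sample) + 1)
    )
    return hits / total


def overrepresentation(candidates, background, term_to_proteins):
    """Return {term: (count, raw_p, bonferroni_p)} for each annotation term."""
    background = set(background)
    candidates = set(candidates) & background
    n_tests = len(term_to_proteins)
    results = {}
    for term, members in term_to_proteins.items():
        members = set(members) & background
        k = len(candidates & members)
        p = hypergeom_upper_tail(k, len(background), len(members), len(candidates))
        # Bonferroni: multiply by the number of terms tested, capped at 1.
        results[term] = (k, p, min(1.0, p * n_tests))
    return results


# Toy data: 20 background proteins, two made-up annotation terms.
background = {f"p{i}" for i in range(20)}
terms = {
    "protein folding": {"p0", "p1", "p2", "p3"},
    "transport": {"p10", "p11", "p12", "p13", "p14"},
}
candidate_substrates = {"p0", "p1", "p2", "p10"}
results = overrepresentation(candidate_substrates, background, terms)
for term, (count, raw_p, adj_p) in results.items():
    print(term, count, round(raw_p, 4), round(adj_p, 4))
```

The choice of reference list matters: using the whole proteome as background, as in the text, asks whether a term is more frequent among candidate substrates than among all annotated proteins, not among all detectable phosphoproteins.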
The subcellular localization of a substrate influences its ability to access protein kinases. PknA and PknB are both transmembrane proteins comprising an intracellular kinase domain (18), thereby accounting for the higher percentage of cytosolic protein substrates than membrane protein substrates. Nevertheless, it remains important to determine the circumstances by which a substrate becomes available to interact with a kinase anchored to the cell membrane.
PknG demonstrated preferential phosphorylation of cytoplasmic proteins, as anticipated. However, to understand the recovery of membrane and membrane-associated proteins by Nakedi et al. (7), we analyzed these substrates using the transmembrane topology predictor, Phobius (http://phobius.cgb.ki.se) (65). Intriguingly, none of the detected phosphosites appear to be localized in the cytoplasm, further supporting a possible role of PknG beyond the cytoplasm of the cell. However, further investigation and evidence are required to substantiate the presence and role of PknG in the cell wall.
Characterizing Substrate Specificity Patterns in Mycobacteria. Mycobacterial STPKs have long been suspected of promiscuity toward various substrates, a notion supported in part by the aggregated data on reported mycobacterial substrates (Table S1), which suggests an unexpectedly large number of substrates for each mycobacterial kinase. This contrasts with the tight specificity typically observed for eukaryotic STPKs that underpins the integrity of eukaryotic signaling networks. The apparent promiscuity of mycobacterial STPKs is further supported by the absence of any obvious sequence motifs surrounding preferential Ser/Thr phosphorylation sites for individual STPKs (Fig. 6). However, recent phosphoproteomic-based studies show that a subset of substrates is specifically phosphorylated in vivo only in the presence of STPK activity (7-9, 52), arguing that these STPKs demonstrate a preference for certain substrates. Furthermore, it is equally clear from the GO enrichment analyses previously presented that the substrates for the individual mycobacterial STPKs are nonrandom at the functional level. Future attempts to determine possible specificity across mycobacterial STPK substrates may, therefore, require the use of more sophisticated hidden Markov model-type approaches to identify potentially distal structural motifs that can account for the observed substrate selectivity of the mycobacterial STPKs.

[...] Table II. The M. tuberculosis (strain ATCC 25618/H37Rv) proteome was used as the background database. The red horizontal lines (±4.06) illustrate the relative statistical significance (p value ≤ 0.05, after Bonferroni correction) of residues flanking the central phosphorylation site. Overrepresented residues are above the midline, whereas underrepresented residues are below the midline. No distinct motifs were observed for phosphorylation on either Thr or Ser sites.

Future Perspectives. Quantitative phosphoproteomics is now established as a powerful and reliable strategy for mapping connections between phosphorylated substrates and their respective kinases or phosphatases. As discussed above, the employment of LC-MS/MS-based strategies has dramatically increased the number of potential substrates for mycobacterial STPKs, specifically for PknB and PknG. While this represents a unique opportunity to better understand the regulatory role of STPKs, it is important to note potential pitfalls associated with such approaches. For instance, it could be argued that differentially phosphorylated proteins may not be direct substrates of the studied kinase but, rather, are phosphorylated by another kinase whose activity was affected by the absence of the kinase and/or the presence of a kinase-specific inhibitor.
It is, therefore, imperative to further validate these novel substrates. This may include the use of in vitro assays to confirm that the newly identified phosphosites are, in fact, phosphorylated by the corresponding STPK. Additionally, in cases where a new role and/or phenotype is associated with the activity of a protein kinase, it is important to determine the phosphorylation site occupancy of the novel substrate(s). Researchers should then, ideally, verify whether the phenotype is altered by the replacement of the substrate phosphorylation site with an alanine or glutamate residue via site-directed mutagenesis (i.e. knockout or knock-in of a constitutive phosphorylation phenotype).
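The alanine or glutamate replacement described above amounts to a single-residue substitution at the phosphosite, which can be sketched as follows when designing mutant sequences in silico. The function name, the `mode` labels, and the example sequence are hypothetical; the sketch only edits a protein sequence string and says nothing about codon choice or cloning.

```python
# Replacement residue for each design: alanine cannot be phosphorylated,
# glutamate is used to mimic the negative charge of a phosphate group.
REPLACEMENTS = {"ablative": "A", "mimetic": "E"}


def substitute_phosphosite(sequence, position, mode):
    """Return `sequence` with the Ser/Thr at 1-based `position` replaced.

    Raises ValueError if the position is out of range, the residue is not
    Ser or Thr, or the mode is unknown.
    """
    if mode not in REPLACEMENTS:
        raise ValueError(f"unknown mode: {mode!r}")
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    residue = sequence[index]
    if residue not in ("S", "T"):
        raise ValueError(f"residue {residue}{position} is not Ser or Thr")
    return sequence[:index] + REPLACEMENTS[mode] + sequence[index + 1:]


# Made-up sequence for illustration.
wild_type = "MKTSAPLR"
ablative = substitute_phosphosite(wild_type, 3, "ablative")
mimetic = substitute_phosphosite(wild_type, 3, "mimetic")
print(wild_type, ablative, mimetic)
```

Checking that the targeted residue really is Ser or Thr guards against off-by-one errors when phosphosite positions are taken from different proteome versions.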
It is equally important that future studies investigate and validate the biological significance of these findings in order to increase understanding of the regulatory function(s) of newly identified substrate phosphorylation sites. For example, until very recently, the physiological role of PknG was largely associated with the regulation of mycobacterial metabolism, but newer datasets now suggest that PknG phosphorylation activity likely plays an important role in protein processing, translation, and folding machinery (7,8). Moreover, it is also important to develop phosphoproteomic strategies to investigate the role of mycobacterial STPKs in host cells in order to further reveal the complexity of host-pathogen interactions. In a recent study, 31 proteins were reported as exclusive interactors of PknG using a human proteome microarray (66). This study provides a valuable foundation for further research on host-pathogen protein-protein interactions to reveal the role of PknG during pathogenetic conditions. However, the biological relevance of these findings remains to be verified by phosphoproteomic studies that account for the subcellular compartmentalization of proteins and the accessibility of a host substrate to a mycobacterial protein kinase in vivo. We, therefore, propose that the next goal in MS-based mycobacterial research should be the identification of host substrates for mycobacterial STPKs during live infections. Work toward this objective is currently underway in our laboratory.
Concluding Remarks. We anticipate that a comprehensive integration of the large phosphoproteomic datasets, aimed at complete identification of novel STPK substrates, will provide the scientific community with significant new insight into the network mechanisms by which STPKs regulate mycobacterial environmental responses. In addition, this should allow researchers to pinpoint potential sites for new pharmacological interventions. The phosphorylation status of novel substrates can be utilized as a biomarker of the corresponding STPK activity. This, in turn, may aid in determining the efficiency and mode of action of potential inhibitors targeting specific mycobacterial STPKs. Adopting such strategies could provide a new platform to guide further development of next-generation drugs for clinical applications.