Molecular Basis of the Mechanisms Controlling MASTL*

In Brief
MASTL is a unique kinase because of a non-conserved insertion of 550 residues within its activation loop. Using different constructs of MASTL, we show that the NCMR is essential for target discrimination and the regulation of kinase activity. We also show that truncations of MASTL containing a cryptic C-lobe in the NCMR are active and can be found in cancer cell lines containing mutations in one of the MASTL alleles, suggesting an intriguing role for these short forms of the kinase in cancer.

Highlights
NCMR is crucial for substrate recognition and activity regulation.
MASTL conserves a cryptic C-lobe in the non-conserved middle region.
MASTL450 containing the cryptic C-lobe is observed in cancer cell lines.
Key phosphorylation sites for MASTL provide an activation model.

The human MASTL (Microtubule-associated serine/threonine kinase-like) gene encodes an essential protein in the cell cycle. MASTL is a key factor preventing early dephosphorylation of M-phase targets of Cdk1/CycB. Little is known about the mechanism of MASTL activation and regulation. MASTL contains a non-conserved insertion of 550 residues within its activation loop, which splits the kinase domain and makes it unique. Here, we show that this non-conserved middle region (NCMR) of the protein is crucial for target specificity and activity. We performed a phosphoproteomic assay with different MASTL constructs, identifying key phosphorylation sites for its activation and determining whether they arise from autophosphorylation or exogenous kinases, thus generating an activation model. Hydrogen/deuterium exchange data complement this analysis, revealing that the C-lobe in full-length MASTL forms a stable structure, whereas the N-lobe is dynamic and the NCMR and C-tail contain few localized regions with higher-order structure. Our results indicate that truncated versions of MASTL conserving a cryptic C-lobe in the NCMR display catalytic activity and different targets, thus establishing a possible link with truncated mutations observed in cancer-related databases.


The gene encoding Microtubule-associated serine/threonine kinase-like (MASTL)¹, also known as Greatwall (Gwl), was initially described in Drosophila as the Scant (Scott of the Antarctic) mutation (1). Soon after, Scant was shown to encode an essential protein kinase involved in mitosis (2, 3). Flies lacking MASTL displayed abnormal chromosome condensation and impaired mitotic progression because of a delay in the transition from late G2 phase to mitosis. Subsequent studies in Xenopus egg extracts revealed that MASTL was not only required for mitotic entry but also necessary for maintaining the CSF-arrested mitotic state in the extracts (4). Based on these findings, MASTL was proposed to be involved in the Cdk1 autoregulatory loop, but its contribution remained unclear (4). Later experiments revealed that MASTL's role in mitotic control, rather than targeting Cdk1 regulators, was to inhibit the PP2A-B55δ phosphatase (5, 6). Intriguingly, none of the phosphatase subunits was targeted by MASTL, and the means of achieving inhibition remained vague. The discovery of two related paralogs as substrates, Arpp19 and ENSA, which inhibit PP2A-B55δ once phosphorylated by MASTL, elucidated the mechanism by which the kinase inhibits PP2A (7,8). Remarkably, the inhibition was restricted to the PP2A-B55δ phosphatase complex, and other PP2A holoenzymes remained unaffected (8). These findings highlighted the importance of the precise activation and inactivation of MASTL for proper mitotic progression. Aside from this well-conserved role in cell cycle regulation, recent reports have associated MASTL with the control of DNA replication through ENSA (9) and coordination during recovery from DNA damage (10-12), and a recent study of a MASTL thrombocytopenia-associated mutation has suggested a possible function for MASTL in regulating actin and the cytoskeleton (13).
However, the link between MASTL and other pathways remains elusive, most likely because of the absence of additional well-known substrates besides Arpp19/ENSA. MASTL is classified as a member of the MAST subfamily of AGC kinases (14). Closely related MASTL/Greatwall kinases are also present in other insects, vertebrates and yeast. Interestingly, MASTL contains a unique long insertion of about 550 non-conserved amino acids between kinase subdomains VII and VIII, which is the typical location of the activation loop (3,15). This segment has been termed the non-conserved middle region (NCMR) (16) because of the low conservation between orthologues and paralogues. MASTL activation also differs from that of most other AGC kinases, which involves phosphorylation of three conserved regulatory motifs (17,18): (1) the activation segment, containing the T-loop (19), which is usually phosphorylated by another AGC kinase member (20); (2) the hydrophobic motif (21); and (3) the tail linker motif (22) (supplemental Fig. S1). The NCMR is phosphorylated (3,15,16), and some of the phosphorylation sites within the NCMR seem to be necessary for MASTL activity and have been shown to play a role in its localization (23-25). However, it appears that no sequence within the NCMR is indispensable (15,16). Therefore, it remains unclear whether MASTL requires activation loop phosphorylation for its regulation (15) or substrate selection (16). In addition, MASTL has a short AGC C-tail, which does not include the hydrophobic motif present in most AGC kinases (15) (supplemental Fig. S1). Although some studies have addressed MASTL activation (15,16) and structural details in the absence of the NCMR are available (26), the kinase activation mechanism and the role of the NCMR remain unclear.
In this manuscript, we perform a combined analysis indicating that the NCMR is essential for MASTL specificity and allosterically regulates enzyme catalysis. In addition, we show that truncated products of MASTL may be catalytically active, using a cryptic C-lobe contained in a section of the NCMR, and display a different protein target palette in a HEK293 cell extract. Finally, hydrogen/deuterium exchange mass spectrometry (HDX-MS) reveals MASTL dynamics.

EXPERIMENTAL PROCEDURES
Cloning-Full-length MASTL (H. sapiens) was obtained from a human cDNA library (Marcos Malumbres, CNIO, Spain). The DNA sequence corresponds to the serine/threonine-protein kinase Greatwall isoform 1 (Mammalian Gene Collection, MGC ID BC009107). The DNA sequence encoding the Bonsai construct was obtained as a codon-optimized DNA sequence (Life Technologies). MASTL constructs were amplified by PCR using the LIC-MASTL primers. PCR products were cloned into the protein expression vector using Ligation Independent Cloning (LIC). A modified version of the commercially available pCEP4 (Invitrogen) expression plasmid, containing LIC-compatible overhangs, was used for protein overexpression. The resulting MASTL recombinant proteins contained an N-terminal tag comprising a 6xHis tag, a Twin-Strep-tag and a TEV protease site (supplemental Table S1). Mutations were introduced into these constructs using the QuikChange II Site-Directed Mutagenesis Kit (Agilent Technologies). All constructs were confirmed by full sequencing. Details of the cloning will be provided upon request.
Cell Culture and Protein Expression-Human Embryonic Kidney EBNA 6E cell lines (HEK293 6E) were cultivated in Freestyle 293 F17 expression medium (Invitrogen) supplemented with 1% fetal bovine serum (FBS). One day before transfection, HEK293 6E cells were resuspended in fresh Freestyle 293 F17 expression medium to a cell density of 1.2 × 10^6 cells/ml and incubated at 37°C overnight. Approximately 15 min before transfection, cells were resuspended in fresh non-supplemented Freestyle 293 F17 expression medium at a cell density of 20 × 10^6 cells/ml and incubated in an orbital shaker incubator at 37°C, 70% humidity, 5% CO2 and 120 rpm (Ø50 mm) until transfection. GigaPrep (Qiagen, Germany) plasmid DNA (50 μg/ml final) and Polyethylenimine "MAX" (PEI) (Polysciences) (100 μg/ml final) were added directly to the cell suspension. Complete Freestyle 293 F17 expression medium (1% FBS) was added to a final volume of 3 L of cell suspension 4 h post-transfection. Three days post-transfection, the cell pellets were collected by centrifugation at 750 rpm for 10 min at 4°C.
MASTL FL and Bonsai Purification-Cellular pellets of HEK293 6E cells overexpressing either MASTL FL or Bonsai were resuspended in Lysis buffer (50 mM Tris pH 8.0, 200 mM NaCl, 0.5 mM TCEP, 1 mM EDTA, 50 U/ml Benzonase/50 ml, 2x Tablets Complete Inhibitor mixture EDTA Free (Roche, Switzerland), 0.5% Triton X 10). After disruption with a high-pressure EmulsiFlex-C3 Homogenizer (Avestin, Canada), cell debris and insoluble particles were removed by centrifugation at 10,000 × g at 4°C. The supernatant was loaded onto a StrepTrap HP column (GE Healthcare) equilibrated in buffer A (50 mM Tris pH 8.0, 200 mM NaCl, 0.5 mM TCEP, 1 mM EDTA). After sample loading, the column was washed with 20 column volumes (CV) of buffer A. The protein was eluted in a single step with buffer B (buffer A + 2.5 mM Desthiobiotin). Enriched protein fractions were pooled and incubated with ATP (300 μM final concentration) for 4 h at 4°C. After ATP incubation, the sample was loaded onto a HisTrap HP column (GE Healthcare) equilibrated with buffer C (20 mM Tris pH 8.0, 200 mM NaCl, 0.5 mM TCEP, 2 mM MgCl2, 100 μM ATP). The column was washed first with 5-10 CV of buffer C and then with 10 CV of buffer D (buffer C without ATP). The protein was eluted in a single step with buffer E (buffer D + 500 mM Imidazole). Protein-rich fractions were collected and dialyzed overnight in 10 kDa MWCO SnakeSkin Dialysis Tubing (Thermo Scientific) against buffer D at 4°C. The sample was concentrated (using 10 kDa MWCO Centriprep Amicon Ultra devices) and, if required, loaded onto an S200 10/300 GL column (GE Healthcare) equilibrated in buffer D. The protein peaks were concentrated (using 10 kDa MWCO Centriprep Amicon Ultra devices) and either used directly for experiments or flash-frozen in liquid nitrogen and stored at -80°C.
The protein concentration was determined using the theoretical molecular extinction coefficient at 280 nm calculated from the amino acid composition. The same purification procedure was used to purify all MASTL FL and Bonsai mutants.
¹ The abbreviations used are: MASTL, microtubule-associated serine/threonine kinase-like; Gwl, Greatwall; LC-MS/MS, liquid chromatography-tandem mass spectrometry; HCD, higher-energy collisional dissociation; HDX-MS, hydrogen/deuterium exchange mass spectrometry.
MASTL 450 Purification-Resuspension and supernatant preparation of cellular pellets of HEK293 6E cells overexpressing MASTL 450 were conducted as described above for cellular pellets overexpressing MASTL FL and Bonsai, except that the lysis buffer additionally contained 10% Glycerol (v/v). The supernatant was loaded onto a StrepTrap HP column (GE Healthcare) equilibrated in buffer A1 (buffer A + 10% Glycerol (v/v)). After sample loading, the column was washed with 20 column volumes (CV) of buffer A1. The protein was eluted in a single step with buffer B1 (buffer B + 10% Glycerol (v/v)). Enriched protein fractions were run on an SDS-PAGE gel (NuPAGE 4-12% Bis-Tris Gel, Invitrogen) and stained with the Colloidal Stain Kit (Invitrogen). After extensive de-staining of the gel with ddH2O, the SDS-gel was digitized using an Epson Perfection V750 Pro scanner. ImageQuant TL software (GE Healthcare) was used to perform a 1D gel analysis of the one-dimensional electrophoresis gel containing the samples. The protein concentration of the MASTL 450 present in the sample was calculated using the percentage factor obtained from the 1D gel analysis and the total protein concentration of the sample determined by 280 nm absorbance (A280) measurement.
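The two concentration estimates described above can be illustrated with a minimal sketch. It is not the procedure's actual software: the function names and all numeric values are hypothetical, and it assumes the Beer-Lambert law for the A280 measurement and a simple multiplication by the band fraction reported by the 1D gel analysis.

```python
def concentration_from_a280(a280, ext_coeff_m1_cm1, path_cm=1.0):
    """Beer-Lambert law: molar concentration = A280 / (epsilon * path length)."""
    if ext_coeff_m1_cm1 <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 / (ext_coeff_m1_cm1 * path_cm)


def target_concentration(total_mg_per_ml, band_fraction):
    """Scale total protein by the fraction of lane intensity in the target band."""
    if not 0.0 <= band_fraction <= 1.0:
        raise ValueError("band_fraction must be between 0 and 1")
    return total_mg_per_ml * band_fraction


# Illustrative numbers only; none of them are taken from this study.
molar = concentration_from_a280(a280=0.50, ext_coeff_m1_cm1=50_000)
mastl450 = target_concentration(total_mg_per_ml=2.0, band_fraction=0.40)
print(f"{molar:.2e} M, {mastl450:.2f} mg/ml")
```

The band fraction plays the role of the percentage factor: a sample at 2.0 mg/ml total protein in which 40% of the lane intensity belongs to the target band is assigned 0.8 mg/ml of the target.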
In Vitro Kinase Reaction by Autoradiography-Standard kinase assays were performed for 30 min at 30°C in 10 μl of Kinase buffer (10 mM Tris pH 7.5, 50 mM KCl, 10 mM MgCl2, 1 mM DTT) supplemented with 50 μM cold ATP, 1.5 μCi [γ-32P]ATP (3000 Ci/mmol) and 0.5 μM MASTL proteins prepared from human HEK293 6E cells. In early experiments, either 50 μM MBP (Merck Millipore, Catalogue #13-110) was added to the reaction mixture as a model kinase substrate or 50 μM recombinant full-length human Arpp19 protein (cAMP-regulated phosphoprotein 19, UniProtKB P56211) as a verified MASTL substrate (7,8) (supplemental Table S1). The kinase reaction was stopped by the addition of LDS Sample Buffer, and the samples were fractionated by SDS-polyacrylamide gel electrophoresis. Radioactive MASTL and substrate bands were identified by autoradiography, and the signal was quantified by densitometry analysis of the autoradiograms using the ImageStudioLite 5.2.5 Software (Li-Cor Biosciences). Finally, the activity levels were corrected for the amount of protein present in each sample and expressed as a percentage of the phosphorylation obtained with the MASTL FL form. For the kinetic characterization of the MASTL constructs, a constant MASTL concentration was maintained (0.25 or 0.5 μM) while the concentration of Arpp19 was increased (1-240 μM). The length of the time courses was set so that the reaction remained in the linear range used to determine the initial velocity at each substrate concentration. Signals were quantified as described above. Data analysis was performed using the Prism 6 software (Graphpad). In later experiments, 50 μM of several MASTL protein targets identified by our mass spectrometry experiments were used in the in vitro kinase assays.
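The normalization and kinetic analysis described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis performed in this study: the kinetic data here were analyzed in Prism, whereas the sketch substitutes a Hanes-Woolf linearization of the Michaelis-Menten equation so that it runs with the Python standard library alone. All function names and numbers are hypothetical.

```python
def percent_of_reference(signals, protein_amounts, reference="FL"):
    """Densitometry signal per unit protein, as a percentage of the reference."""
    specific = {k: signals[k] / protein_amounts[k] for k in signals}
    ref = specific[reference]
    return {k: 100.0 * v / ref for k, v in specific.items()}


def hanes_woolf_fit(substrate, velocity):
    """Estimate (Vmax, Km) from the line S/v = S/Vmax + Km/Vmax."""
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, velocity)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


# Synthetic, noise-free data generated with Vmax = 2.0 and Km = 20.0 (arbitrary units).
conc = [1, 5, 10, 20, 40, 80, 160, 240]
rates = [2.0 * s / (20.0 + s) for s in conc]
vmax, km = hanes_woolf_fit(conc, rates)

# Hypothetical band intensities and relative protein amounts.
activity = percent_of_reference({"FL": 1000.0, "Bonsai": 300.0},
                                {"FL": 1.0, "Bonsai": 0.5})
```

A direct nonlinear fit is generally preferable for noisy data because linearizations distort the error structure; the linear form is used here only to keep the example dependency-free.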
Fragments comprising residues 24-64 (RGRGRPRKQPPVSPGTALVGSQKEPSEVPTPKRPRGRPKGS) and 67-107 (KGAAKTRKTTTTPGRKPRGRPKKLEKEEEEGISQESSEEEQ) of the human High mobility group protein HMG-I/HMG-Y (HMGA1, UniProtKB P17096), residues 16-56 (IKNSSVPRRTLKMIQPSASGSLVGRENELSAGLSKRKHRND) of human Geminin (GMNN, UniProtKB O75496) and residues 42-82 (KPGGSDFLRKRLQKGQKYFDSGDYNMAKAKMKNKQLPTAAP) of human cAMP-regulated phosphoprotein 19 (Arpp19, UniProtKB P56211) were fused to an N-terminal tag containing a 6xHis tag, an LSL-tag (27) and a TEV protease site. Signals were quantified as described above.
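As a simple consistency check on the fragment boundaries listed above, the length of each peptide can be compared with the span implied by its first and last residue numbers. The sketch below is illustrative only; the dictionary layout and function name are hypothetical, and the sequences are copied from the text.

```python
# (protein, first residue, last residue) -> fragment sequence as given in the text
FRAGMENTS = {
    ("HMGA1", 24, 64): "RGRGRPRKQPPVSPGTALVGSQKEPSEVPTPKRPRGRPKGS",
    ("HMGA1", 67, 107): "KGAAKTRKTTTTPGRKPRGRPKKLEKEEEEGISQESSEEEQ",
    ("GMNN", 16, 56): "IKNSSVPRRTLKMIQPSASGSLVGRENELSAGLSKRKHRND",
    ("Arpp19", 42, 82): "KPGGSDFLRKRLQKGQKYFDSGDYNMAKAKMKNKQLPTAAP",
}


def boundary_mismatches(fragments):
    """List fragments whose length disagrees with their inclusive residue range."""
    mismatches = []
    for (protein, start, end), sequence in fragments.items():
        expected = end - start + 1
        if len(sequence) != expected:
            mismatches.append((protein, start, end, len(sequence), expected))
    return mismatches


print(boundary_mismatches(FRAGMENTS))
```

An empty list indicates that every fragment spans exactly the number of residues stated by its boundaries.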
Protein immunoprecipitation was performed with an in-house generated rat monoclonal antibody raised against the polypeptide comprising residues R22-S244 of mouse MASTL, crosslinked to Protein A (Invitrogen) magnetic beads. Cells were lysed in cold ELB lysis buffer (50 mM Hepes pH 7.5, 150 mM NaCl, 5 mM EDTA, 1% NP-40) with phosphatase and protease inhibitors, with rotatory agitation for 30 min, and then centrifuged at 13,000 rpm, 4°C for 10 min. The protein concentration of the supernatants was determined with a BCA assay (Pierce). For protein immunoprecipitation, the crosslinked antibody was added to the protein at a ratio of 2 μg of antibody per 1 mg of protein, and the IP incubation was performed in a cold room at 4°C with rotatory agitation for 16 h. After subsequent washes, the IP sample was mixed with sample buffer (350 mM Tris-HCl pH 6.8, 30% glycerol, 10% SDS, 0.6 M DTT, 0.01% bromophenol blue) and boiled for 5 min before loading onto the electrophoresis gel. Proteins were transferred to nitrocellulose membranes. Membranes were incubated with a mouse monoclonal MASTL C-lobe antibody (Monoclonal 4F9 Millipore MABT372) and with a different clone of the rat monoclonal antibodies against the MASTL N-terminal region. Antibodies for immunoblot were diluted 1:1000 in 3% BSA 0.05% PBS-Tween20. Membranes were then incubated with secondary antibodies conjugated with the reporter enzyme HRP (DAKO, Denmark) prepared at 1:10,000 in 0.05% PBS-Tween20 containing 5% non-fat dried milk powder. Membranes were developed with the chemiluminescent HRP substrate ECL (GE Healthcare) and exposed to a high-performance chemiluminescence film (Amersham Biosciences) in a dark room with a red light.
MASTL Kinase Assay Phosphoproteomics -Sample Preparation-Our stable isotope-labeled kinase assay-linked phosphoproteomics (siKALIP) MASTL experiments were based on the protocol previously described (28). First, 31 × 10^6 non-transfected HEK293 6E cells were lysed by sonication in 2 ml of lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 5 mM EDTA), and cell debris and insoluble particles were removed by centrifugation at 16,000 × g at 4°C. The supernatant containing 400 μg of soluble protein was collected. The sample volume was adjusted to 200 μl with lysis buffer. To inhibit endogenous kinases in the lysate, the sample was incubated with 1 mM 5′-(4-fluorosulfonyl-benzoyl) adenosine (FSBA) with 10% DMSO for 1 h at 30°C. After endogenous kinase inhibition, 10 U FastAP phosphatase (ThermoFisher Scientific) and 23 μl of 10x rAPid phosphate buffer were added to the sample, which was incubated for 3 h at 37°C. The rAPid phosphatase was heat-inactivated by heating the sample for 5 min at 75°C. Excess FSBA was removed using a Vivacon filtration unit with a 30 kDa cut-off (Merck-Millipore). Concentrated samples were washed in the concentrator with 200 μl of lysis buffer and concentrated again. Samples in the filter were incubated in 300 μl of 1× kinase buffer (10 mM Tris pH 7.5, 50 mM KCl, 10 mM MgCl2, 1 mM DTT, 1 mM ATP(γ-P18O3) (Cambridge Isotope Laboratory, Andover, MA)) and 100 nM of the desired MASTL construct (MASTL FL, Bonsai or 450) for 1 h at 30°C. The reaction was quenched with Guanidine-HCl denaturation buffer (6 M Guanidine-HCl (GndCl), 100 mM Tris (pH 8.5), 5 mM Tris (2-carboxyethyl) phosphine and 10 mM chloroacetamide), and the sample was spun off the filter and heated for 10 min at 99°C. For in-solution digestion, proteins were pre-digested with endoproteinase Lys-C (Wako) for 3 h and diluted 3-fold in 25 mM Tris buffer before overnight digestion with trypsin at 37°C (modified sequencing grade, Sigma-Aldrich). The enzyme activity was quenched by adding trifluoroacetic acid (TFA) to the samples, and samples were centrifuged for 5 min at 3000 rpm to remove precipitates.
Peptides were desalted and concentrated using reversed-phase Sep-Pak C18 cartridges (Waters) and eluted with 50% acetonitrile, 0.1% TFA. Phosphopeptides were enriched by Titansphere chromatography. In brief, titanium dioxide (TiO2) beads (5 μm, Titansphere, GL Sciences, Japan) were incubated in a solution of 20 mg/ml 2,5-dihydroxybenzoic acid (DHB) (Sigma-Aldrich) in 80% acetonitrile, 0.1% TFA for 30 min. One milligram of TiO2 beads in 10 μl of DHB solution was added to each sample and incubated with rotation for 30 min. Beads were washed once with 40% acetonitrile, 6% TFA and transferred in 80% acetonitrile, 6% TFA into a C8 STAGE tip. Beads were washed in the C8 STAGE tip with increasing concentrations of acetonitrile, and bound phosphopeptides were then eluted directly into a 96-well plate with 5 μl NH4OH followed by 10 μl NH4OH, 25% acetonitrile. The eluate was concentrated in a SpeedVac centrifuge at 60°C and acidified with 5% acetonitrile, 1% TFA. Samples were then desalted and concentrated by solid-phase extraction on reversed-phase C18 STAGE tips.
MASTL Phosphorylation Site Detection -Sample Preparation-Purified recombinant MASTL proteins from HEK293 6E cells (FL, Bonsai, 450 and their corresponding kinase-dead mutants) were used for these assays. The proteins were purified as described above. To obtain dephosphorylated MASTL FL, the recombinant protein was incubated with 1 U of FastAP phosphatase (Thermo Fisher Scientific) at 37°C for 1 h. The dephosphorylation reaction was stopped by the addition of 10 mM Sodium Orthovanadate (Na3VO4) (NEB) to the mixture. Later experiments used MASTL FL and dephosphorylated FL samples incubated with HEK mitotic extracts. To obtain HEK mitotic extracts (kindly provided by Jakob Nilsson's laboratory, CPR, Copenhagen), cells were arrested in medium containing 2.5 mM thymidine for 18 h, released into medium without thymidine for 8 h, and again cultured for 18 h in medium containing 2.5 mM thymidine. After release from the second thymidine block into medium without thymidine, mitotic cells were harvested by shake-off at the mitotic peak (~10 h after release from the thymidine block). MASTL samples were incubated for 1.5 h at 30°C with the mitotic extracts, and the reactions were supplemented with 5 mM MgCl2, 1 mM ATP and 50 mM KCl. After incubation with the mitotic extract, both samples were incubated with Ni-NTA Agarose beads (Qiagen, Germany) for 2 h at 4°C. The Ni-NTA beads were washed 3 times and eluted with a buffer containing 250 mM Imidazole.
For mass spectrometry analysis, proteins were resolved by SDS-PAGE, visualized by Coomassie staining and digested in-gel with trypsin as previously described (29). Before SDS-PAGE, proteins were reduced with 10 mM dithiothreitol (DTT) in 25 mM ammonium bicarbonate (ABC) buffer for 45 min and alkylated with 55 mM chloroacetamide (CAA) in 25 mM ABC solution for 30 min. LDS Sample Buffer was added to the samples before protein fractionation by SDS-PAGE (Invitrogen), and SDS-gels were stained with the Colloidal Blue Staining Kit (Invitrogen) according to the manufacturer's instructions. For each sample, the MASTL-specific band was excised from the gel and cut into 1 × 2-mm cubes. Gel slices were destained with 50% ethanol in 25 mM ABC solution and dehydrated with 96% ethanol. Proteins were digested with trypsin (modified sequencing grade, Sigma-Aldrich) overnight at 37°C. Trypsin activity was quenched by acidification with TFA, and peptides were extracted from the gel pieces using increasing concentrations of acetonitrile. Organic solvents were removed by evaporation in a SpeedVac centrifuge at 60°C, and samples were desalted and concentrated by solid-phase extraction on reversed-phase C18 STAGE tips.
Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS)-For all samples, peptides were eluted from the C18 STAGE tips with 40 and 60% acetonitrile in 0.1% formic acid before online nanoflow LC-MS/MS analysis. Samples were analyzed with a nanoscale UHPLC system (EASY nLC1000 or 1200, Thermo Fisher Scientific) connected to a Q Exactive HF mass spectrometer (Thermo Fisher Scientific) through a nanoelectrospray ion source as previously described. Briefly, peptides were separated in a 15-cm analytical column (75-μm inner diameter) packed in-house with 1.9-μm reversed-phase C18 beads (ReproSil-Pur AQ, Dr Maisch, Germany) with a 76 min gradient from 8 to 64% acetonitrile in 0.1% formic acid at a flow rate of 250 nl/min. The mass spectrometer was operated with a spray voltage of 2 kV, a heated capillary temperature of 275°C and an s-lens radio frequency level of 50%. Dynamic exclusion was set to 30 s, and all experiments were acquired in positive polarity mode. Full scan resolution was set to 120,000 at m/z 200, and the mass range was set to m/z 375-1500. The full scan ion target value was 3E6 with a maximum fill time of 25 ms. For every full scan, the 12 (for MASTL autophosphorylation analysis) or 7 (for kinase assay phosphoproteome analysis) most intense ions were isolated and fragmented (normalized collision energy 28%) by higher-energy collisional dissociation (HCD) with a fragment scan resolution of 30,000 and an ion target value of 1E5 with a maximum fill time of 45 ms (for MASTL autophosphorylation analysis), or a fragment scan resolution of 60,000 and an ion target value of 1E5 with a maximum fill time of 110 ms (for kinase assay phosphoproteome analysis).
Processing and Analysis of Mass Spectrometry Raw Data-All raw LC-MS/MS data files were analyzed using the MaxQuant software version 1.5.3.33 with the integrated Andromeda search engine (30,31). Data were searched against a target/decoy (forward and reversed) version of the reviewed part of the human UniProt database (April 2015 with 20202 entries) supplemented with commonly observed contaminants and the sequences of the MASTL constructs (Table SI). Trypsin was specified as the protease, with the maximum number of missed cleavages set to 2 for kinase assay phosphoproteome analysis and 3 for analysis of MASTL phosphorylation status. Cysteine carbamidomethylation was searched as a fixed modification. Protein N-terminal acetylation, pyroglutamate formation from glutamine, and phosphorylation of serine, threonine and tyrosine (both in standard and heavy 18O3-phospho versions for kinase assay phosphoproteome analysis) were searched as variable modifications. In addition, oxidized methionine and deamidation of asparagine and glutamine were searched as extra variable modifications. Phosphorylation site localization probabilities were determined by MaxQuant using the PTM (post-translational modification) scoring algorithm (30,32). A false discovery rate (FDR) of 1% or 5% was applied for peptide and site identifications for MASTL autophosphorylation or kinase assay phosphoproteome, respectively, whereas a protein FDR of 1% was applied. The minimum peptide length was set to 7 amino acids, and for modified peptides a minimum Andromeda score of 25 and a delta score of 6 were applied. A precursor mass tolerance of 20 parts per million (ppm) was specified for the first search and of 4.5 ppm for the main search. The mass tolerance for fragment ions was set to 20 ppm. The match between runs feature was applied in the analysis of kinase assay phosphoproteome and MASTL phosphorylation status samples.
Analysis of proteomics data was performed using the Perseus software version 1.5.1.12 (http://www.coxdocs.org/). Only peptides with a phosphorylation site localization probability of at least 0.75 (class I sites) (32) were included in the final list. Phosphorylation site identifications were filtered for reversed hits. For kinase assay phosphoproteome analysis, the presence of a heavy 18O3-phosphate was used as an indicator to determine the direct targets of MASTL in vitro. Data were filtered so that a phosphorylation site had to be identified in at least three of four replicates for at least one of the sample groups to be included in the downstream analysis. Data analysis was performed using the raw intensity values, which were log2-transformed and normalized by quantile-based normalization. Missing values were replaced by imputation from the lower end of the normal distribution.
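The filtering, transformation and imputation steps described above can be sketched in plain Python. This is not the Perseus implementation used in this study: the function names are hypothetical, missing values are represented as None, the quantile normalization step is omitted for brevity, and the down-shift (1.8 standard deviations) and width (0.3 standard deviations) of the imputation distribution are assumed values, not parameters reported here.

```python
import math
import random
import statistics


def passes_replicate_filter(groups, min_valid=3):
    """True if any sample group has at least min_valid quantified replicates."""
    return any(sum(v is not None for v in reps) >= min_valid
               for reps in groups.values())


def log2_transform(groups):
    """Log2-transform raw intensities, leaving missing values as None."""
    return {g: [math.log2(v) if v is not None else None for v in reps]
            for g, reps in groups.items()}


def impute_low(values, shift=1.8, width=0.3, seed=0):
    """Replace None with draws from a narrow normal distribution placed
    below the mean of the observed (log2) values."""
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)
    return [v if v is not None else rng.gauss(mu - shift * sd, width * sd)
            for v in values]


# Hypothetical intensities for one phosphorylation site.
site = {"FL": [1024.0, 2048.0, None, 4096.0], "dephospho": [None, None, 8.0]}
if passes_replicate_filter(site):
    logged = log2_transform(site)
    filled = impute_low(logged["FL"])
```

In this example the site is retained because the FL group has three quantified replicates, and the single missing FL value is filled with a low value so that downstream tests can be computed.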
Significantly regulated phosphorylation-target sites for the different MASTL constructs were determined using Student's t-test by comparing intensities for each construct to the dephosphorylated sample, and the results of the analysis were visualized by volcano plots. To identify potential kinase motifs of the different MASTL constructs, the IceLogo software (33) was used to assess sequence bias around the regulated heavy phosphorylation sites. Sequence motif logo plots of ±6 amino acids adjacent to the identified phosphorylation sites were generated using standard parameters with p < 0.01 and compared with all the non-heavy phosphorylation sites obtained from the phosphopeptide enrichment as background.
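The ±6 residue windows used as input for motif analysis can be extracted as in the following sketch. The function name, the padding character and the test sequence are hypothetical choices for illustration; the sketch only shows the windowing step, not the statistical motif analysis itself.

```python
def site_window(sequence, position, flank=6, pad="-"):
    """Return the residues from position-flank to position+flank (1-based),
    padded with `pad` where the window runs past either terminus."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position outside sequence")
    index = position - 1
    left = sequence[max(0, index - flank):index]
    right = sequence[index + 1:index + 1 + flank]
    return left.rjust(flank, pad) + sequence[index] + right.ljust(flank, pad)


# Synthetic sequence (the alphabet), used only to make the indexing visible.
demo = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(site_window(demo, 10))  # DEFGHIJKLMNOP
print(site_window(demo, 2))   # -----ABCDEFGH
```

Padding keeps every window at 13 positions so that sites near a protein terminus remain aligned on the central residue.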
To rank MASTL autophosphorylation sites, we used the estimated site occupancy for phosphorylation (34), defined as the fraction of protein for which the given site is modified. The proportion of phosphorylated peptide is calculated from the signal differences between the modified peptide and the corresponding unmodified peptide, together with the protein ratio. The class I sites obtained from the in-gel analysis were then filtered so that sites with a fractional phosphorylation site occupancy above 20% are highlighted in the final list.
Experimental Design and Statistical Rationale-For the described kinase assay phosphoproteomics analysis, four technical replicates were processed and analyzed for all samples (except for the dephosphorylated control, which included only three replicates because of a fault in sample preparation). The set of samples included the kinase reaction for each construct in addition to two controls: "basal" and "dephosphorylated." The basal control was included as background to assess the degree of dephosphorylation achieved after in vitro phosphatase treatment. The dephosphorylated control was included as a reference to identify phosphorylation sites specifically induced by the kinase reaction. MASTL-regulated phosphorylation sites were identified using a t-test by comparing the dephosphorylated control and the different kinase reaction samples, as described in the previous paragraph and the results section. In total, 19 samples were included in the final kinase assay phosphoproteomics experiments. For the MASTL phosphorylation status analysis, 1-2 replicates of each sample were included (samples included in the mitotic extract experiments were analyzed in duplicate). Accordingly, these data were only used to assess the presence or absence of protein phosphorylation and not for statistical analysis. A total of 20 samples were included in the final analysis set for MASTL phosphorylation status. For additional experiments incorporating statistical analysis, the number of replicates and the statistical tests used are provided in the results section and figure legends where relevant. The final list of phosphosites detected in this work is found in Table SII.
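The total of 19 kinase assay samples can be reproduced with a short tally. The grouping below is one reading of the design described above (three kinase reactions plus the basal and dephosphorylated controls) and is offered as an assumption rather than as the authors' sample sheet.

```python
# Replicates per sample group; the dephosphorylated control lost one replicate.
replicates = {
    "MASTL FL kinase reaction": 4,
    "Bonsai kinase reaction": 4,
    "MASTL 450 kinase reaction": 4,
    "basal control": 4,
    "dephosphorylated control": 3,
}

total_samples = sum(replicates.values())
print(total_samples)  # 19
```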
HDX-MS-HDX-MS experiments were performed with the recombinant full-length MASTL protein overexpressed in HEK293 6E cells and purified as described above, except that 200 μM Staurosporine was present during the final size-exclusion chromatography purification step to avoid autophosphorylation. Gel-filtration fractions containing only monomeric MASTL were used for HDX experiments in order to reduce sample heterogeneity. Deuterium exchange samples were prepared from MASTL stock solution (7-9 pmol/μl) in 20 mM HEPES, pH 7.5, 150 mM NaCl, 0.2 mM TCEP to a final D2O content of 90% and a MASTL concentration of 1 pmol/μl. Following a 15-min pre-incubation of the protein sample, deuterium labeling was performed at room temperature for the following time intervals: 0 min, 15 s, 10 min, 6 h, and 24 h. All control and time-course samples were prepared and analyzed in triplicate. A non-deuterated buffer was used to prepare the 0-min control samples. Equilibrium-labeled samples were prepared by overnight incubation in 6 M deuterated Gnd-HCl at a final D2O content of 90%.
At each time interval, the hydrogen exchange was quenched by a 1:1 (v/v) dilution of deuterated protein sample with ice-cold 300 mM potassium phosphate buffer pH 2.3. The quenched samples were immediately frozen and stored at -80°C. The quenched deuterated samples were thawed immediately before LC-MS analysis; 20-30 pmol of each sample was loaded onto a cooled (0°C) Waters Nano-Acquity UPLC system (Waters Corp., Milford, MA) for online digestion on an in-house-packed pepsin column (Pierce, Rockford, IL), followed by desalting (Waters VanGuard C18, 1.7 μm, 2.1 × 5 mm) and reverse phase separation (ACQUITY UPLC BEH C18 1.7 μm 1.0 × 100 mm column) using a 9 min gradient from 8 to 50% acetonitrile in 0.23% formic acid (pH 2.5) at a flow rate of 40 μl/min.
Mass analysis was performed with a Waters Synapt G2 HDMS mass spectrometer using positive ion electrospray. Peptides were identified by MS/MS using collision-induced dissociation with argon as collision gas in data-independent acquisition (DIA) mode (i.e. MSe). A peptide had to be identified in 2 out of 3 replicate MS/MS experiments in which the precursor mass accuracy was below 10 ppm and at least 0.2 product ions per amino acid of the peptide could be assigned. Based on this procedure, 90 identified peptic peptides, covering 89% of the sequence, were selected and used to measure the local HDX of MASTL. No phosphopeptides were identified in the monomeric preparation of MASTL used in the HDX experiments, as staurosporine was added to the sample. Deuterium incorporation was determined in DynamX ver. 3.0 (Waters Corp.). All HDX-MS experiments were performed in triplicate. The heat map was generated by calculating the percent exchange at each time point relative to the experimentally determined maximum deuterium incorporation for each peptide, obtained from the equilibrium-labeled control sample. In areas with overlapping peptides, HDX was localized to sub-peptide segments by subtracting the HDX of one peptide from that of the overlapping peptide. This method requires the back-exchange levels of the overlapping peptides to be comparable (35). For all subtracted peptides, the difference in back-exchange levels was below 10 percentage points (36). To allow access to the HDX data of this study, the HDX Data supplemental Tables S3, S4 and the HDX Data Summary Table (supplemental Table S5) are included in the Supporting Information according to the community-based recommendations (37).
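The calculations behind the heat map and the sub-peptide localization can be written out as a minimal sketch. It is not the DynamX workflow used here; the function names and numbers are hypothetical, and the sketch assumes that percent exchange is the uptake at a time point divided by the uptake of the equilibrium-labeled control, both measured relative to the non-deuterated sample.

```python
def percent_exchange(mass_t, mass_undeuterated, mass_equilibrium):
    """Deuterium uptake at one time point as a percentage of the maximum uptake."""
    span = mass_equilibrium - mass_undeuterated
    if span <= 0:
        raise ValueError("equilibrium mass must exceed the undeuterated mass")
    return 100.0 * (mass_t - mass_undeuterated) / span


def comparable_back_exchange(be_a_pct, be_b_pct, limit_pct_points=10.0):
    """True if two peptides differ in back-exchange by less than the limit."""
    return abs(be_a_pct - be_b_pct) < limit_pct_points


def subpeptide_uptake(uptake_long, uptake_short):
    """Uptake of the segment covered by the longer peptide but not the shorter one."""
    return uptake_long - uptake_short


# Hypothetical centroid masses (Da) and uptake values (Da).
pct = percent_exchange(mass_t=1003.0, mass_undeuterated=1000.0, mass_equilibrium=1006.0)
if comparable_back_exchange(28.0, 33.0):
    segment = subpeptide_uptake(uptake_long=5.0, uptake_short=3.5)
```

The subtraction is only applied when the back-exchange check passes, mirroring the requirement stated above that overlapping peptides have comparable back-exchange levels.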

RESULTS
Understanding MASTL Specificity-A detailed analysis of the MASTL polypeptide sequence and its orthologues shows that MASTL lacks 6 of the 15 AGC-conserved amino acids (16,17) (see supplemental Table S6). In addition, MASTL displays a unique kinase domain architecture (Fig. 1A). The NCMR is located where all other protein kinases contain a 20-30 residue activation loop (38). Several in silico predictions suggest that this is an unstructured region (15). Further, a sequence similarity alignment of MASTL's NCMR with the same region of its paralogs and orthologues does not find similarities between them, indicating that this insertion does not present sequence conservation (3). To understand the role of the NCMR insertion and MASTL specificity, we used different MASTL variants in a mass spectrometry-based phosphoproteomics assay and analyzed their protein targets in a HEK293 cell extract. Using the information obtained from the alignment and structure prediction servers (Phyre2 and I-Tasser), we determined the boundaries for two constructs (Fig. 1A, supplemental Fig. S1). The in silico structural analysis of MASTL allowed us to set initial boundaries for a canonical kinase domain within the MASTL full-length sequence. This construct, termed Bonsai, joins the N-lobe (residues 35-112) and the C-lobe (residues 113-180 and 728-879), disregarding the NCMR. Interestingly, the combined use of the alignment and the structural prediction servers allowed us to identify a new conserved region between residues 180 and 294, which was predicted to fold into a cryptic C-lobe, thus composing a possible additional kinase domain thanks to a caspase proteolytic site after V450. To test this possibility, we prepared another construct, termed MASTL450, including the initial 450 residues (Fig. 1A, supplemental Fig. S1).
We used purified MASTL, MASTL450 and Bonsai proteins (supplemental Table S1) in a slightly modified version of siKALIP (Stable Isotope Labeled Kinase Assay-Linked Phosphoproteomics) (28) to identify phosphorylation differences between the three constructs. The three MASTL variants were expressed in HEK293 cells and isolated using tandem Strep and His-tag affinity purifications. For the siKALIP experiment, we used a partially dephosphorylated cell extract from interphase HEK293 cells to perform an in vitro kinase assay. This assay makes use of heavy stable isotope-labeled γ-[18O4]-ATP as a substrate in the kinase reaction, thus enabling the identification of specific and direct phosphorylation targets (supplemental Fig. S2, Methods). Three samples were included for each of the variants: an untreated basal sample, a dephosphorylated sample and the sample containing the kinase reaction. The basal sample was included to assess the degree of dephosphorylation achieved in the other two samples after in vitro phosphatase treatment.
The dephosphorylated sample was used as a reference to identify phosphorylation sites specifically induced by the kinase reaction. Significantly MASTL-regulated phosphosites were determined using t-test volcano plots that compare the identified phosphorylation sites between the dephosphorylated and the different kinase reaction samples, thus providing a list of protein targets in the cellular extract for each of the variants (Fig. 1B, supplemental Table S2). In this experimental setup, MASTL phosphorylates 56 different proteins. Notably, the phosphorylation sites found included Ser62 on Arpp19 and Ser67 on ENSA, which are the only previously identified substrate sites (7,8). These sites were also phosphorylated by Bonsai, but not by MASTL450. Comparing MASTL and Bonsai clearly indicates that the deletion of the NCMR results in a significantly higher number of phosphosites for Bonsai. This difference suggests a role for the NCMR in target selection. Interestingly, MASTL450 phosphorylated several proteins in the extract, with 32 targets identified, suggesting that the predicted cryptic C-lobe can build an active kinase domain.
The correlation between the phosphorylated proteins can be observed in the scaled Venn diagrams (Fig. 1C, middle panel) illustrating the dimensions of the three data sets. The overlap between the identified targets (Fig. 1C, right panel) of MASTL and Bonsai represents ≈70% of the total MASTL proteins, indicating that the full-length protein shares most of its targets with Bonsai. Nonetheless, the overlap between MASTL and Bonsai represents only ≈15% of the proteins found for Bonsai. This analysis further highlights that Bonsai is promiscuous regarding its protein targets when compared with MASTL. The overlap between the proteins phosphorylated by MASTL and MASTL450 represents ≈30% of those for MASTL and ≈53% of those for MASTL450, showing that although MASTL450 has a smaller set of targets, half of them are common with the full-length protein.
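The overlap percentages quoted above follow directly from set sizes. A minimal sketch, assuming the counts stated in the text (56 MASTL targets, 32 MASTL450 targets, 17 shared); the protein identifiers are placeholders and the function name is hypothetical:

```python
def overlap_percentages(set_a, set_b):
    """Return the shared count and its share of each set, in percent."""
    shared = set_a & set_b
    return (
        len(shared),
        100.0 * len(shared) / len(set_a),
        100.0 * len(shared) / len(set_b),
    )


# Placeholder identifiers: 56 MASTL targets, 32 MASTL450 targets, 17 shared.
mastl = {f"P{i:03d}" for i in range(56)}
mastl450 = {f"P{i:03d}" for i in range(17)} | {f"Q{i:03d}" for i in range(15)}

n_shared, pct_of_mastl, pct_of_mastl450 = overlap_percentages(mastl, mastl450)
print(f"{n_shared} shared: {pct_of_mastl:.0f}% of MASTL, "
      f"{pct_of_mastl450:.0f}% of MASTL450")
```

With these counts the shared fraction is about 30% of the MASTL set and about 53% of the MASTL450 set, matching the values reported in the text.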
MASTL siKALIP Phosphorylation Motif-Arpp19 and ENSA, which are 71% identical, are the only known MASTL substrates (7,8). Hence, no consensus sequence motif is known for MASTL. Although our experiment was not performed in vivo, we took advantage of the larger list of phosphorylation sites on the identified proteins to perform a sequence motif enrichment analysis (33) to extract possible consensus sequences for MASTL, Bonsai and MASTL450 (Fig. 1D). Several residues within the Arpp19/ENSA site of phosphorylation (GQKYFD-pS-GDYNMA) are enriched in the consensus sites of MASTL and Bonsai. Within the determined consensus sites for MASTL and Bonsai, positions −2, 0, +1 and +3 are conserved in both proteins. The presence of those amino acids within the consensus sequences determined for MASTL and Bonsai is consistent with the fact that Arpp19/ENSA phosphopeptides were among the top targets in our siKALIP assays for both proteins. The absence of common residues between the determined consensus for MASTL450 and the Arpp19/ENSA phosphorylation site also agrees with the fact that Arpp19/ENSA was not found to be phosphorylated by MASTL450. A comparison between the consensus sequences defined for MASTL and Bonsai indicates that numerous positions within both consensus sequences contain similar residues. Nevertheless, most of the positions within the consensus of Bonsai exhibit a broader spectrum of residues, which illustrates the broad palette of proteins phosphorylated by this construct in the cell extract.
Finally, only a few residues forming the central region of the consensus sequences of MASTL and MASTL450 are present in both sequences, showing the common potential targets, but at the same time demonstrating why both proteins do not share a considerable fraction of proteins among their targets in the HEK293 extract.

The volcano plots graphically display the statistical significance of the identified phosphorylation sites (y axis; −log p) over the fold-change between two conditions, of which one was always the dephosphorylated extracts and the other was the extracts incubated with one of the MASTL constructs (x axis; Difference (Average log2(MASTLx) − Average log2(dephosphorylated))). C, Intersectional relationship of the siKALIP phosphorylation sites. The table contains the total number of phosphosites and phosphoproteins detected in the siKALIP assay for each sample. The Venn diagrams represent the proportional relationship between the phosphoproteins detected in all three assays and the number of shared phosphoproteins detected in all three assays. Venn diagrams were generated using the Venny 2.0 web tool, available at http://bioinfogp.cnb.csic.es/tools/venny/. D, Consensus sequences for each MASTL construct based on the siKALIP data. The MASTL consensus sequence was generated from the in vitro targets found in the siKALIP assay. All consensus motifs were produced using the IceLogo application (33).
MASTL Targets in siKALIP-Our siKALIP experiment is based on a kinase reaction performed in a cell extract to understand the role of the NCMR in MASTL. Only 56 out of around 10,000 possible target proteins (39) were phosphorylated in our assay, and ARPP19 and ENSA were found among the targets in the cellular extract (supplemental Table S2). Therefore, we analyzed the target proteins found in our experiment. The list of phosphorylated targets found in our assay includes proteins involved in processes that have been associated with MASTL, such as chromosome condensation and chromatin organization (Histone H1 and HIRIP-3), in agreement with MASTL phenotypes observed in Drosophila (3). We also observe a small set of targets, including actin, talin and phosphatidylinositol-4-phosphate 5-kinase (PIP5K1C), which is in line with the observation that MASTL overexpression in breast cancer induces not only transformation but also invasiveness (40). These proteins are involved in migration through the regulation of the actin cytoskeleton, and the targeting of talins to the plasma membrane and their efficient assembly into focal adhesions (41) (supplemental Table S2). The adhesion pathway has also been shown to be affected in the MASTL thrombocytopenia-associated mutation (E166D in mouse (13)). Nevertheless, our assay is focused on understanding the molecular role of the NCMR and is not meant to identify MASTL substrates in vivo. Therefore, the physiological role of MASTL in these pathways needs to be confirmed in future studies.
Regarding MASTL450, a small set of targets is well conserved when compared with the full-length protein: 17 out of the 30 identified proteins are shared between these two constructs (Fig. 1C, right panel, supplemental Table S2). In addition, MASTL450 incorporates seven unique targets. Although Bonsai is a synthetic construct not relevant for cellular physiological conditions, a comparison of its target protein network with that of the full-length protein shows its promiscuity, with many extra targets and processes, confirming the lack of phosphorylation selectivity in the absence of the NCMR (Fig. 1C, supplemental Table S2). Therefore, this differential behavior of the three variants reveals the influence of the NCMR on target selectivity.
Biochemical Characterization of MASTL Activity-Having defined the role of the NCMR in MASTL specificity, we decided to further characterize the effect of this segment on activity to understand the functional mechanism of MASTL. The kinase activity was assayed for the different variants in vitro using Arpp19 (7,8) and MBP (Myelin Basic Protein), a generic kinase substrate (15,16) (Fig. 2A-2B, supplemental Fig. S3). The kinase dead (KD) versions used in the assay contain a double mutation (G44S/D174A) (3,42) within the common N-terminal kinase lobe. MASTL and Bonsai phosphorylated Arpp19, as observed in the siKALIP experiment (Fig. 2A-2B). Bonsai phosphorylated six times more Arpp19 than the full-length protein. Additionally, MASTL450 did not phosphorylate Arpp19, in agreement with our phosphoproteomics data (Fig. 2A-2B, supplemental Table S2). In contrast, MASTL, Bonsai and MASTL450 were able to phosphorylate MBP (supplemental Fig. S3A). The Bonsai protein was the most active enzyme, phosphorylating MBP twenty-six times more than MASTL (supplemental Fig. S3A) or MASTL450, which also showed weak MBP phosphorylation, suggesting that this construct holds catalytic activity.
We characterized the enzymatic reaction of MASTL and Bonsai (Fig. 2C). Unfortunately, the absence of phosphorylation of Arpp19 by MASTL450 prevented its enzymatic characterization. Initial velocities for MASTL and Bonsai were calculated using the intensities obtained from the autoradiograms (supplemental Fig. S4). The reaction rates were plotted to calculate the Michaelis-Menten constant (Km) and the maximum rates (Vmax) (Fig. 2C). The values of the Km for MASTL and Bonsai are similar, suggesting that the affinity of Bonsai toward Arpp19 has not changed when compared with MASTL. However, the difference in the Vmax shows that Bonsai catalyzes the phosphoryl group transfer reaction four times faster than MASTL, suggesting that the NCMR is a modulatory element of the kinase activity. Accordingly, the in vitro kinase assays and the siKALIP experiment agree, indicating that the NCMR is dispensable for kinase activity (15,16). Consequently, when the protein lacks the NCMR, as in Bonsai, the enzyme specificity and catalysis are distorted (Fig. 1B-1C and Fig. 2C). Therefore, it was not surprising that the Bonsai protein also showed deregulated autophosphorylation rates (Fig. 2A-2C).
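As an illustration of how Km and Vmax can be estimated from initial velocities, the sketch below uses a Hanes-Woolf linearization with ordinary least squares. This is an assumption made for simplicity: the text does not state which fitting procedure was used, and nonlinear regression is generally preferred for noisy data. The substrate concentrations and kinetic parameters are synthetic and chosen only to mimic two enzymes with the same Km and a four-fold difference in Vmax.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, velocity):
    """Estimate (Vmax, Km) by least squares on [S]/v versus [S].

    Rearranging v = Vmax*[S]/(Km + [S]) gives
    [S]/v = [S]/Vmax + Km/Vmax, a line with slope 1/Vmax
    and intercept Km/Vmax.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Synthetic substrate concentrations (arbitrary units) and two hypothetical
# enzymes that share Km but differ four-fold in Vmax.
conc = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
slow = [michaelis_menten(s, vmax=1.0, km=2.0) for s in conc]
fast = [michaelis_menten(s, vmax=4.0, km=2.0) for s in conc]

vmax_slow, km_slow = hanes_woolf_fit(conc, slow)
vmax_fast, km_fast = hanes_woolf_fit(conc, fast)
```

On noise-free input the fit recovers the parameters used to generate the data, so the ratio of the two Vmax estimates is four while the Km estimates coincide.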
MASTL450 Displays Catalytic Activity-The International Cancer Genome Consortium (ICGC, https://dcc.icgc.org/) has around 350 different mutations annotated for MASTL in human cancers, some of which are single-base deletions (7%) or insertions (4%). These mutations end translation prematurely and result in truncated versions of the protein. To validate MASTL450, we checked for the presence of this polypeptide in several cancer cell lines using antibodies against the C-lobe and the N terminus of MASTL (Fig. 2D). We detected bands in agreement with the size of MASTL450 in the human gastric carcinoma cell line 23132/87 (carrying a heterozygous p.K391Nfs*12 mutation) and the colorectal cancer cell line DLD1 (wild-type for MASTL), indicating that MASTL450 is generated in cells. To further characterize MASTL450, we selected several high-scoring proteins from our siKALIP target list for MASTL450 and Bonsai and tested their direct phosphorylation in in vitro kinase assays. These proteins are the high mobility group protein HMG-I/HMG-Y (HMGA1, UniProt P17096) and Arpp19 (Arpp19, UniProt P56211), respectively. The detected phosphorylation sites on HMGA1 were found on serine residues at positions 44 and 99 (supplemental Table S2). We produced short recombinant peptides using E. coli. The peptides were purified together with an N-terminal LSL-tag (27). These fragments included the phosphorylatable residue flanked on each side by 20 residues from the original protein sequence (supplemental Table S1). Recombinant substrates for HMGA1 (LSL-HMGA1-44 and LSL-HMGA1-99) and Arpp19 (LSL-Arpp19) were generated, and the LSL domain was also included as a negative phosphorylation control. The in vitro kinase assay showed that MASTL450 phosphorylated both peptides of HMGA1, whereas only a very faint signal was observed for Arpp19 (Fig. 2E). Residual phosphorylation was observed on LSL-HMGA1-99 when incubated with MASTL450 KD.
Nevertheless, the phosphorylation level for HMGA1 at S99 was higher in the presence of active MASTL450, indicating that the detected phosphorylation arises from MASTL450. These results support our siKALIP findings, showing that MASTL450 can phosphorylate the HMGA1 peptide. This protein is involved in the regulation of gene transcription, integration of retroviruses into chromosomes, and the metastatic progression of cancer cells. Phosphorylation databases (43-45) contain annotations of HMGA1 (S-44 and S-99), but no specific functions or responsible kinase(s) have been assigned to them. To further characterize the phosphorylation of HMGA1, we also tested the ability of MASTL and Bonsai to phosphorylate the LSL-HMGA1 peptides (supplemental Fig. S3B-S3C). In this set of experiments, we included full-length Arpp19 as an additional control. MASTL, as expected, was able to phosphorylate both Arpp19 and LSL-Arpp19, whereas no or only residual phosphorylation was observed for both HMGA1 LSL-peptides. A similar pattern was observed for Bonsai, which was able to phosphorylate both Arpp19 and LSL-Arpp19 but not HMGA1. The possible functions of this truncation product need to be addressed.
MASTL Activation-The kinase activity of MASTL has been directly linked to its phosphorylation status (3,15,16). MASTL is activated by phosphorylations acquired during mitosis (4,15,16), yet a consensus on kinase activation has not been reached. A discrepancy on the importance of the different phosphorylation sites has led to the proposal of two models for MASTL activation (15,16). Further, the protein kinase(s) responsible for the reported phosphorylation sites have not been identified. In vitro kinase assays incubating MASTL with several mitotic kinases have suggested Cdk1 and/or Plk1 as responsible for MASTL activation (15,16). To identify important residues for MASTL activation, we performed a mass spectrometry analysis combined with a phosphatase treatment and incubation of the protein with mitotic extracts from HEK293 cells to identify the phosphorylated residues.
We detected 34 phosphorylation sites in MASTL isolated from HEK293 cells in interphase, 24 of them with high occupancy (Fig. 3A, supplemental Fig. S5A). MASTL autophosphorylation is a unimolecular reaction (16); therefore, the phosphorylation differences between wild type and the KD mutant discriminate between the self-regulated and externally regulated sites. We expressed and purified the KD mutant and performed the same mass spectrometry analysis. Only 14 phosphorylation sites were identified in the sample, 7 of them with high occupancy (Fig. 3A, supplemental Fig. S5A). Ten autophosphorylations occur within the NCMR, three are in the N-lobe and one in the C-lobe. Further, the difference between KD and wild type shows that these 14 phosphosites are due to other kinases phosphorylating MASTL (Fig. 3A). The large number of phosphorylation sites present in the NCMR highlights the importance of this region in MASTL regulation.
To further investigate the role of the different phosphorylations, we incubated the protein with FastAP phosphatase, resulting in an inactive enzyme unable to phosphorylate Arpp19 and MBP (supplemental Fig. S5B). After phosphatase treatment, 24 phosphorylation sites were identified (Fig. 4A, supplemental Fig. S5A), although activity is clearly affected by the phosphatase treatment (supplemental Fig. S5B). Only 6 phosphosites were removed in the NCMR, suggesting that the rest of the sites are protected. The difference between the total phosphorylation sites before and after phosphatase treatment in the wild type unveils the identity of at least 11 sites which are key for activity (Fig. 3A). By comparing the MASTL sites arising from autophosphorylation and after phosphatase treatment, we found that the autophosphorylation of 5 sites (S213, S552, S703, T847, and S875) and the action of external kinases on 6 residues, mainly in the NCMR (S216, S303, S384, S556, S572, and S660), are determinant for kinase activity. However, this observation does not exclude that some of the non-removed phosphosites are essential for activity, e.g. by providing a "primed" conformation that can be further modified by autophosphorylation or external regulators to tune catalysis but is not accessible to the phosphatase; this could be the role of certain sites such as T741 (see MASTL T-loop section).

FIG. 3. MASTL phosphorylations and activity. A, The scheme displays the positions of the phosphosites identified for MASTL in the different experimental conditions (MASTL isolated from the interphase culture, MASTL KD, MASTL after phosphatase treatment, MASTL after incubation with a HEK293 mitotic extract and MASTL after phosphatase treatment and HEK293 mitotic extract incubation). The phosphosites are depicted in green. Light green denotes the low-occupancy phosphorylation sites, whereas dark green indicates that the occupancy of the site was above 20%. Occupancy was defined as the fraction of protein for which the site is modified. This has a value between 0 and 1, corresponding to a percentage, and we set the cut-off at ≥0.2. The second column shows the analysis of the data, displaying the autophosphorylation, external kinase and mitotic sites and those removed by the phosphatase. B, In vitro kinase assays with MASTL, KD, MASTL T-loop and p+1 loop mutants (supplemental Fig. S6C-S6D). Autoradiograms display the incorporation of 32P in Arpp19 and MASTL. C, Densitometric analysis of the kinase activity (upper panel) and autophosphorylation (lower panel). The analysis was performed on 4 independent experiments (n = 4). Error bars display the standard error of the mean (S.E.).
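The occupancy cut-off given in the Fig. 3 legend can be expressed as a simple classification rule. The sketch below is illustrative only: the function name, site labels and occupancy values are hypothetical and do not correspond to measured data.

```python
HIGH_OCCUPANCY_CUTOFF = 0.2  # cut-off stated in the figure legend


def classify_occupancy(occupancy):
    """Label a site as 'high' or 'low' occupancy; input must lie in [0, 1]."""
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy is a fraction between 0 and 1")
    return "high" if occupancy >= HIGH_OCCUPANCY_CUTOFF else "low"


# Placeholder sites with invented occupancies, for illustration only.
example_sites = {"site_A": 0.45, "site_B": 0.20, "site_C": 0.05}
labels = {site: classify_occupancy(v) for site, v in example_sites.items()}
```

A site exactly at the cut-off is treated as high occupancy here, following the ≥0.2 convention in the legend.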
MASTL is isolated from an interphase culture. To find phosphorylations involved in mitotic regulation, we incubated our sample with mitotic extracts from HEK293 cells to reveal whether further phosphorylation sites may be required to regulate the protein activity during this stage of the cell cycle. After mitotic extract incubation, 33 phosphorylated residues were identified, all of them with high occupancy (Fig. 3A, supplemental Fig. S5A). Further, we carried out a phosphatase treatment and subsequently tried to restore the phosphorylations and the activity of MASTL by incubating the protein with the mitotic extract. Here, we identified 30 phosphorylated residues, 29 of them with high occupancy. Out of the 30 sites, S36, T194, S217, S494, S551, T611, S631, and Y744 appear to be phosphorylated exclusively after the incubation with the mitotic extract; these residues reside preferentially in the cryptic C-lobe and the NCMR. Interestingly, two of them (T194 and Y744) are in the activation loop and the p+1 loop, which would suggest that their modification is linked to the regulation of the enzyme during mitosis. To evaluate the effect of these phosphorylations on MASTL activity, we performed a kinase assay before and after incubation with the mitotic extract. MASTL exhibited a 1.5-times higher activity after incubation with the mitotic extract compared with the sample purified from the HEK293 culture, indicating that the enzyme activity is enhanced by these phosphorylations in mitosis (supplemental Fig. S6A-S6B).
MASTL T-loop-Activation loop phosphorylation is a hallmark of an active conformation in most protein kinases (46), and "T-loop" phosphorylation is viewed as a mechanism that promotes the assembly of the regulatory spine (47-49). Additionally, activation loop phosphorylation in one (primary phosphate) or more residues (secondary phosphates) allows the loop to refold such that the phosphorylated residue(s) in the activation loop can "neutralize" a basic pocket contained in the catalytic loop or help position the ATP molecule (46). A detailed alignment of the activation segment of selected members of the AGC kinase family locates the residues responsible for activation (supplemental Fig. S6C-S6D). These data, together with our results and previous studies (15,16), suggest that two different regions could contain the activation loop phosphorylation sites with their corresponding p+1 loops. These are the T193/T194 and the T741 residues, and the 196-SMAK-199 and 743-DYLA-746 sequences, termed P1 and P2, respectively. To better understand their role in the activation segment, we designed mutations in these regions and performed further activity experiments (Fig. 3B-3C and supplemental Fig. S6C-S6D).
The activity assays showed that the T193A/T194A, P1 (where we introduce the DYLA sequence in the p+1 loop of T193A/T194A) and T194E mutants were able to phosphorylate Arpp19, although they displayed 50%, 70%, and 80% of the MASTL wild-type activity (Fig. 3B-3C and supplemental Fig. S6C-S6D). Hence, mutants with an intact 743-DYLA-746 region conserve substantial activity levels. The relevance of T741 was also observed in the combined T194E/T741A mutant. This phosphomimetic mutant was not able to phosphorylate Arpp19 even though the 196-SMAK-199 and 743-DYLA-746 regions were intact. Further, the P2 mutant, which contains an intact T193/T194 and T741 but the sequence SMAA in the 743-746 region, was not able to phosphorylate Arpp19, indicating that T741 and the 743-DYLA-746 residues are essential for substrate phosphorylation. Therefore, variants containing changes in T741 or the 743-DYLA-746 region were unable to phosphorylate Arpp19. In contrast, although the remaining mutants were not able to phosphorylate Arpp19, they displayed different levels of autophosphorylation, suggesting that mutations in T741 and the 743-DYLA-746 region preferentially affect the phosphotransfer reaction to the substrate, whereas autophosphorylation is not fully impaired. Our mass spectrometry data show that pT741 arises from autophosphorylation (Fig. 3A), and at the same time, the kinase assays show that this modification is key for MASTL activity, supporting the view that phosphorylation of certain residues could promote conformational changes which in turn make other phosphosites accessible for regulation. In this scenario, pT741 would appear to be critical for substrate phosphorylation.
FIG. 4. HDX-MS exchange scheme in MASTL. Deuterium uptake is plotted with a color scheme representing the fractional uptake measured for each of the detected peptides. Four time points are described in the uptake bar below the protein sequence (0.25 min, 10 min, 360 min and 1440 min). The uptake bar also displays the position of the detected peptides. Secondary structure features come from the ab initio models obtained for Bonsai and MASTL450 (Fig. 1A). Secondary structure information for Bonsai contained in the PDB 5LOH (26) correlates well with that of our Bonsai model (data not shown). β1-9 and αC-D are common for Bonsai and MASTL450. αF-αI 450 help to fold the cryptic C-lobe. αF-αI Bonsai and the C-tail assist the folding of the canonical C-lobe. Residues between the DFG motif (Green Box) and the APE motif (Orange Box) compose the activation segment of MASTL. T1 (T194) is thought to be the T-loop phosphorylation for MASTL450 (Yellow Star #1), whereas T2 (T741) is the T-loop phosphorylation for Bonsai (Yellow Star #2). Sequence coverage observed for MASTL in the HDX-MS experiment was 89%. Peptide redundancy within the experiment is 1.98 (see supplemental Fig. S7 for peptide deuterium uptake).

Hydrogen/Deuterium Exchange Mass Spectrometry (HDX-MS) of MASTL-We performed an HDX-MS experiment to obtain insights into the conformational properties and dynamics of native full-length MASTL in solution (50-52). The HDX of backbone amide hydrogens is strongly dependent on their hydrogen-bonding status and strength, and HDX-MS thus provides a very sensitive measure of higher-order structure and dynamics of a protein in solution. By analysis of the deuterium uptake of MASTL at various time-points relative to an equilibrium-labeled control sample, we identified and mapped regions containing stable h-bonded (pronounced protection from HDX), transient/partial h-bonded (medium protection from HDX) and no h-bonded structure (no protection from HDX) (Fig. 4, supplemental Fig. S7-S8). The β-strands within the N-lobe (Fig. 4) exhibited only moderate to minor protection, showing that these secondary structure elements of the N-lobe, albeit being present, are not as stable as the ones observed for some other protein kinases (53). Moreover, the N-lobe helices αB and αC showed less protection compared with the β-strands, suggesting that these α-helices are also highly dynamic and flexible. These two regions could not be observed in the crystal structure, suggesting a large flexibility, which agrees with our data. The very modest protection observed in the glycine-rich loop between β1 and β2 (54), and the conserved lysine (K62 at β3) and glutamic residues (48) (E81 in αC) is noteworthy as these regions, in particular, have shown more protection from HDX in some other kinases (53,55) and they are usually involved in strong interactions either with the ATP molecule for the glycine-rich loop (56) or as ion pairs for the K62-E81 residues (48). Only the αC-β4 loop on the N-lobe (residues 89 to 96) displayed significant protection from HDX, showing that hydrogens of these residues are involved in stable secondary structure.
The residues composing the R-spine (57) (H154 (RS1), F175 (RS2), L85 (RS3) and L96 (RS4)) are flanking this loop. A significant protection mapped to this section is thus possibly reflecting the interaction between this loop, the C-lobe tether (CLT) and a buried water molecule proposed to coordinate the movement of the αC helix in AGC kinases (17). Further, the αC-β4 loop is the only secondary structure that is tightly attached to the C-lobe (58). In general, the low degree of protection of almost the entire N-lobe shows that this part of the kinase domain of MASTL is unusually dynamic in solution.
In contrast, multiple regions located within the common C-lobe region showed pronounced HDX protection. Here, V118 within αD potentially forms part of the so-called C-spine (59), which crosses the kinase lobes and aids the regulation of the catalytic process together with the R-spine. Stable hydrogen-bonding explains the significant protection from HDX detected in these two α-helices (αD and αE). Moreover, the slow uptake observed for the four small β-strands (β6-9) that locate most of the catalytic machinery of the kinase agrees with the fact that several of these residues are either interacting through hydrophobic interactions or H-bonds with the ATP molecule or with other residues of the C-spine. Further, protection from HDX observed for the catalytic loop agrees with reports for other kinases (51,60).
Most of the NCMR and the activation segment (residues between the DFG and the APE motifs) of the protein display no protection from HDX, suggesting that this entire segment is usually highly dynamic. However, some regions show minor but significant protection from HDX relative to the equilibrium-labeled control (res. 294-314, 391-402, 497-514 and 537-554). Further, the 713-745 region, still confined within the NCMR, shows moderate protection, clearly demonstrating that secondary structure elements defined by backbone amide h-bonds exist in this region of the protein. The presence of regions of weak to moderate hydrogen bonding in the NCMR indicates that, while it adopts an overall very dynamic structure, regions of localized structure exist which could be important in the known regulatory function of this region. Upon inspection of the HDX of the region of MASTL thought to form the part of the C-lobe preceding the activation loop, we observed a moderate protection in several regions. The protection found for the αF Bonsai is consistent with the notion of the C-terminal αF acting as a central platform within the kinase domain, providing the correct positioning in space to the R- and C-spines (61). Finally, the MASTL C-tail overall displayed fast HDX, indicating the dynamic nature of the tail. Nevertheless, a small region of the C-tail that contains the PxxP motif (853-861) within the CLT showed moderate protection, suggesting that those residues are forming a more stable interaction with the rest of the protein. The PxxP protection is possibly reflecting the interaction of this motif with the interlobar region of the kinase domain, which has been observed for MASTL's C-tail (26). Our HDX-MS results correlate well with the B-factor distribution of the recent structure of the conserved kinase domain of MASTL (26).
Several secondary structures that were not visible in the crystal structure, such as the αB and αC helices of the N-lobe, also show very modest protection in our HDX experiments. The HDX-MS experiments clearly show that the N-lobe, NCMR and the C-tail of MASTL are dynamic structures with regions of more transient structure, and only the C-lobe forms a more stable platform.

DISCUSSION
MASTL presents an exceptional case of kinase regulation; its NCMR makes it a unique protein in the kinome. The crystal structure of a construct of MASTL without the NCMR has been solved (26). Unfortunately, this structure lacks information regarding how key regulatory regions of the kinase influence MASTL activity. Our work shows that the kinase is active when isolated using a transient expression in an asynchronous HEK293 culture. Our assay to decipher the NCMR function using the HEK293 proteome has shown that MASTL can phosphorylate up to 56 different proteins in a HEK293 cell extract (Fig. 1B-1C, supplemental Table S2). These results warrant further research to investigate the possible role of MASTL in other cellular pathways.
Our study shows the double regulatory role of the NCMR in MASTL, modulating the selection of targets and the activity of the protein (Fig. 1B). The loss of this region increases the number of phosphorylated proteins, indicating that proper target recognition is severely disturbed. In addition, the activity assays show that the catalysis rate is also uncontrolled (Fig. 2C). Therefore, the NCMR links target recognition and catalytic activity in MASTL, playing a critical role in its allosteric regulation. Many kinases are regulated by phosphorylation in the activation segment; these members of the kinome contain the so-called RD-pocket (46,62). The phosphorylation of one or more residues in this region permits efficient substrate binding and catalysis (46). However, not all RD kinases require T-loop phosphorylation to trigger activity (46,63-66). MASTL lacks a phosphorylatable residue in this position (supplemental Fig. S9). Therefore, RD-neutralization and, consequently, T-loop activation are not required in MASTL. In agreement with this observation, our Bonsai variant is active and does not contain T194, T207, S213 or S453, which were proposed as necessary for full kinase activity (15,16). Although these results suggest that T741 could be the T-loop residue, mutations in its corresponding P+1 loop did not hamper MASTL autophosphorylation but did hamper Arpp19 phosphorylation (Fig. 3C), indicating that T741 is not required for the RD-pocket neutralization (supplemental Fig. S9).

FIG. 5. MASTL activation model. MASTL is divided into seven sections: the N-lobe, the C-lobe, the cryptic C-lobe, the NCMR, the new region (residues 715-735) preceding the C-lobe Bonsai and the C-tail. In a first priming stage (1), MASTL is phosphorylated by other kinases, inducing a conformational change that will allow the autophosphorylation of S875 in the C-tail (2). This event would allow the interaction of the C-tail with the N-lobe, which shows large flexibility in our HDX-MS data, triggering a series of autophosphorylations that yield an active enzyme regulating different cellular pathways in interphase (3). Once cells enter mitosis, MASTL will be further phosphorylated in the NCMR (Fig. 4A), the C-tail and the T1 (T194) and T2 (Y744) regions to promote an extra activation of the enzyme and the targeting of mitotic substrates (4).
We propose that phosphorylation sites within the NCMR regulate the kinase in an allosteric manner, linking substrate recognition and kinase activity. Accordingly, Bonsai, which lacks the NCMR, displays marked promiscuity in target phosphorylation and is hyperactivated. Thus, phosphorylations on the long activation segment of MASTL would serve as a modulation mechanism used by the protein to regulate kinase activity and the recruitment of substrates. This regulation may be influenced by the cellular localization of the enzyme, which has been shown to depend on phosphorylations in the NCMR (23-25). In agreement with these observations, our model for MASTL activation (Fig. 5) implies that the enzyme can be active outside mitosis. Our results suggest that phosphorylations in the NCMR influence the activity by relieving NCMR autoinhibitory conformations. The phosphatase incubation eliminates some of the phosphosites, rendering an inactive protein that can be reactivated (Fig. 3A, supplemental Fig. S5B). This indicates that phosphorylations in the NCMR promote a MASTL conformation whose activity can be tuned later by acting on other phosphorylation sites (Fig. 3A, Fig. 5). Unknown structures within the activation segment, such as the one observed in our HDX-MS experiment within residues 713-735, could help to place the 741-746 loop or to position the substrate. This is supported by the fact that mutations in T741 and the residues in the P2 region abrogate Arpp19 phosphorylation but not catalysis (Fig. 3B-3C).
To our surprise, we found that the predicted kinase catalytic domain comprising the cryptic C-lobe region, MASTL450, is present in cancer cell lines containing truncation mutations in that region (Fig. 2D). In addition, it shows kinase activity and displays a palette of protein targets that contains seven unique proteins. Interestingly, gwlSr6, a Drosophila mutant allele of Gwl that generates a stop codon near the end of the NCMR and therefore lacks the C-lobe, is a hypomorphic allele and not a null, and retains Gwl function in vivo (2). The biochemical assay of MASTL450 (Fig. 2E), which was performed to decipher the role of the NCMR, showed that this MASTL product can phosphorylate other polypeptides. Interestingly, the database that maintains a catalogue of somatic mutations in human cancer (COSMIC, http://cancer.sanger.ac.uk) contains detailed information on several insertions and deletions within the MASTL gene, most of which can result in a short, truncated version of MASTL because of frameshifts in the genetic sequence. For example, the most frequently reported insertion (c.1168_1169insA) creates a frameshift that results in a 407-residue protein that shares the initial 391 residues with wild-type MASTL, whereas the most frequent deletion (c.1168delA) creates a frameshift that results in a 401-residue protein that shares the first 396 residues with wild-type MASTL. These mutations terminate translation prematurely, resulting in a truncated polypeptide without the residues necessary to fold the canonical C-lobe. These mutations affect only one of the alleles and may alter the levels of MASTL, affecting some of its functions; however, they should not eliminate MASTL activity. This observation links truncated MASTL proteins with pathological conditions, although the exact role of MASTL in these diseases and in cancer is not yet clear.
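As an illustration only, the position of the two COSMIC frameshift mutations quoted above (c.1168_1169insA and c.1168delA) can be related to codon numbers with simple arithmetic. The minimal Python sketch below assumes the usual coding-DNA convention in which position 1 is the A of the ATG start codon; the function names are hypothetical and are not part of any analysis described in this work. The sketch only locates the codon that contains the change. It does not reproduce the shared-residue counts or truncated lengths quoted above, which depend on the actual MASTL coding sequence and reference transcript.

```python
def codon_of(cdna_pos: int) -> tuple:
    """Map a 1-based coding-DNA position to (codon number, position within codon).

    Assumes position 1 is the A of the ATG start codon (hypothetical helper).
    """
    if cdna_pos < 1:
        raise ValueError("coding-DNA positions are 1-based")
    codon = (cdna_pos - 1) // 3 + 1   # 1-based codon index
    offset = (cdna_pos - 1) % 3 + 1   # 1, 2 or 3 within that codon
    return codon, offset


def complete_codons_upstream(first_altered_pos: int) -> int:
    """Count the codons that lie entirely upstream of the first altered base."""
    return (first_altered_pos - 1) // 3


# c.1168delA: base 1168 itself is removed.
print(codon_of(1168))
print(complete_codons_upstream(1168))

# c.1168_1169insA: a base is inserted between 1168 and 1169,
# so 1169 is the first position whose reading frame is shifted.
print(codon_of(1169))
print(complete_codons_upstream(1169))
```

Under these assumptions, both changes fall within codon 390, which lies upstream of the residues needed to fold the canonical C-lobe.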