Mechanistic and evolutionary insights into isoform-specific ‘supercharging’ in DCLK family kinases

Catalytic signaling outputs of protein kinases are dynamically regulated by an array of structural mechanisms, including allosteric interactions mediated by intrinsically disordered segments flanking the conserved catalytic domain. The doublecortin-like kinases (DCLKs) are a family of microtubule-associated proteins characterized by a flexible C-terminal autoregulatory ‘tail’ segment that varies in length across the various human DCLK isoforms. However, the mechanism whereby these isoform-specific variations contribute to unique modes of autoregulation is not well understood. Here, we employ a combination of statistical sequence analysis, molecular dynamics simulations, and in vitro mutational analysis to define hallmarks of DCLK family evolutionary divergence, including analysis of splice variants within the DCLK1 sub-family, which arise through alternative codon usage and serve to ‘supercharge’ the inhibitory potential of the DCLK1 C-tail. We identify co-conserved motifs that readily distinguish DCLKs from all other calcium calmodulin kinases (CAMKs), and a ‘Swiss Army’ assembly of distinct motifs that tether the C-terminal tail to conserved ATP and substrate-binding regions of the catalytic domain to generate a scaffold for autoregulation through C-tail dynamics. Consistently, deletions and mutations that alter C-terminal tail length or interfere with co-conserved interactions within the catalytic domain alter intrinsic protein stability, nucleotide/inhibitor binding, and catalytic activity, suggesting isoform-specific regulation of activity through alternative splicing. Our studies provide a detailed framework for investigating kinome-wide regulation of catalytic output through cis-regulatory events mediated by intrinsically disordered segments, opening new avenues for the design of mechanistically divergent DCLK1 modulators, stabilizers, or degraders.

Introduction

[...] also serves as a pseudosubstrate, which physically blocks the substrate binding pocket until it is competed away by calmodulin (21). Notably, this autoregulatory pseudosubstrate can be phosphorylated (19), and phosphorylation of the C-tail makes CAMKII insensitive to calmodulin binding. Across the CAMK group, several other kinases share autoinhibitory activity via interactions between calcium/calmodulin binding domains and the C-terminal tail (22,23), and a major feature of these kinases is variation in the tail length across the distinct genetic isoforms.

Table 1: Names of DCLK isoforms discussed in this paper, along with their respective isoform number, UniProt identification, and alternate names that have been used.

Origin and evolutionary divergence of DCLK family members.

Shows domain annotations for sequences included in the phylogenetic tree. The length of the C-terminal tail segment for these sequences is shown as a histogram (green). The original tree generated using IQTREE is provided in Figure 2-source data 1.

The human DCLK repertoire is composed of three genes, termed DCLK1, 2 and 3 (Figure 1A, [...] invertebrate DCLKs, and vertebrate DCLK1 and DCLK2 predominantly contain two DCX domains at the N-terminus of the long isoforms (Figure 2B). In addition, we identified a putative active site-binding motif, VSVI, and a phosphorylatable threonine conserved within vertebrate DCLK1 and DCLK2, which are absent in all other DCLK sequences, including invertebrate DCLK1/2. This raises the possibility that the DCLK1/2 tail extensions are employed for vertebrate-specific regulatory functions.

Next, we compared the type of DCLK1 protein sequence encoded by a range of chordate mammalian genomes. The domain organization of each DCLK1 isoform was compared based on annotated sequences from UniProt, demonstrating the presence of at least one DCLK1 protein that lacks the DCX [...] variants was found (Figure 3A). To establish a model for DCLK1 biophysical analysis, we constructed a recombinant hybrid human DCLK1 protein equivalent to DCLK1.1 amino acids 351-689, which contains the catalytic domain and a short C-tail region. As shown in

In addition to enzyme activity, we monitored thermal denaturation of purified, folded DCLK1 351-689 protein in the presence of ATP, either alone or as a Mg:ATP complex, which is required for catalysis. As shown in

To study isoform-specific differences in the C-tail, we employed experimental techniques to compare protein stability and catalytic activity between purified DCLK1 proteins alongside molecular dynamics

We next performed MD simulations to study the dynamics within the distinct DCLK1 C-tails that might [...]

Sequence constraints that distinguish DCLK1/2/3 sequences from closely related CAMK sequences are shown in a contrast hierarchical alignment (CHA). The CHA shows DCLK1/2/3 sequences from diverse organisms as the display alignment. The foreground consists of DCLK sequences, while the background alignment contains related CAMK sequences. The foreground and background alignments are shown as residue frequencies below the display alignment in integer tenths (1-9). The histogram (red) indicates the extent to which distinguishing residues in the foreground diverge from the corresponding position in the background alignment. Black dots indicate the alignment positions used by the BPPS (Neuwald, 2014) procedure when distinguishing DCLK sequences from related CAMK sequences. Alignment numbering is based on the human DCLK1.2 sequence (UniProt ID: O15075-2). C) Sequence alignment of human DCLK1 isoforms.

Residues contributing to the co-evolution and unique tethering of the C-terminal tail to the DCLK catalytic domain.

To identify specific residues that contribute to the unique modes of DCLK regulation by the C-terminal tail, we performed statistical analysis of the evolutionary constraints acting on DCLK and related CAMK family [...]

An autoinhibitory ATP-mimic completes the C-spine and mimics the gamma phosphate of ATP.

The most stable segment of the C-tail based on the B-factor and RMSF fluctuations in MD simulations is

Interestingly, based on our BPPS analyses, these C-tail residues are uniquely vertebrate DCLK1-specific pattern constraints. At the tail end of this α-helix are two Thr residues, Thr 687 and Thr 688. As previously

To evaluate how sequence differences between DCLK1.1 and DCLK1.2 affected both thermal stability and catalytic potential, we generated targeted mutations at contact residues within the Gly-rich loop and [...]

Unfortunately, we were unable to assign site-specific Thr 688 phosphorylation in DCLK1.2, since it resides in a large, multiply phosphorylated tryptic peptide with multiple potential sites of phosphorylation.

Furthermore, quantitative abundance calculations between the two isoforms were not possible since Thr

In addition to the marked differences between DCLK1 splice variants relevant to nucleotide binding, small molecule interactions and catalysis, our work also reveals two unique pseudosubstrate segments present before and after the IBS. Before the IBS segment, we observe the formation of a transient anti-parallel beta sheet with the beta1 strand in the catalytic domain (Figure 5I, Figure 7-figure supplement 1A-B). [...] Figure S1). It is possible that, as in other CAMKs, the IDS helps facilitate [...]

For the DCLK family as a whole, we discovered phylogenetic divergence between DCLK1 and 2 as a relatively recent event (Figure 1A), in which metazoan DCLK3 is the more ancestral DCLK gene from between these paralogs. Moreover, for the first time, we quantify key differences between human DCLK1.1 and 1.2 isoforms in the C-tail that can be reversed by amino acid changes. The differences between isoforms 1 and 2 are generated by variations in exon splicing, which both change the C-tail protein sequence and introduce or exclude potential phosphorylation sites. Expression of the highly autoinhibitable DCLK1.2 isoform is believed to be predominant in the brain during embryogenesis, and DCLK1.1 is also thought to be present in the adult brain (45). It is therefore possible that an altered ratio

All structures were solvated using the TIP3P water model (62). Energy minimization was run for a maximum of 10,000 steps, performed using the steepest-descent algorithm, followed by the conjugate-gradient algorithm. The system was heated from 0 K to a temperature of 300 K. After two equilibration steps that each lasted 20 picoseconds, 1-microsecond-long simulations were run at a two femtosecond [...] GROMACS MD engine (63). We utilized the CHARMM36 force field (64). The resulting output was visualized using VMD 1.9.3 (65). All molecular dynamics analysis was conducted using scripts coded in [...]

[...] stabilizing and destabilizing mutations in the enzyme structure. We performed three replicates per mutation and averaged the Rosetta energies. All mutant energies were then subtracted by the wt Rosetta [...]

Fold change in the relative abundance of the two phosphopeptides in DCLK1.2 is computed with reference to these same two phosphopeptides in DCLK1.1, normalising against 3 non-modified peptides to account for potential difference in the amount analysed. B) DCLK1 substrate phosphorylation (reported as % total phosphopeptide in reaction) was quantified as a function of time for each of the indicated purified DCLK proteins. Assays were performed side-by-side in quadruplicate. C) As described in A, except fold change in abundance could not be calculated for the peptide containing pThr 395 given the presence of the inserted mutations and the potential differences in relative ionisation efficiency for the resulting tryptic peptide.
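The fold-change normalisation described in the caption above can be illustrated with a minimal Python sketch. The text does not specify the exact calculation, so the sketch assumes that each phosphopeptide intensity is divided by the mean intensity of the three non-modified peptides from the same sample before the DCLK1.2 to DCLK1.1 ratio is taken. The function names and all intensity values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def normalised_abundance(phospho_intensity, reference_intensities):
    """Divide a phosphopeptide intensity by the mean intensity of the
    non-modified reference peptides measured in the same sample.

    Using the mean as the normalisation factor is an assumption.
    """
    if not reference_intensities:
        raise ValueError("at least one reference peptide is required")
    return phospho_intensity / mean(reference_intensities)


def fold_change(sample, control):
    """Ratio of normalised phosphopeptide abundance (sample / control).

    Each argument is a (phospho_intensity, reference_intensities) pair.
    """
    return normalised_abundance(*sample) / normalised_abundance(*control)


# Hypothetical intensities in arbitrary units, not measured data.
dclk1_1 = (200.0, [100.0, 100.0, 100.0])  # reference isoform
dclk1_2 = (50.0, [50.0, 50.0, 50.0])      # isoform being compared

print(fold_change(dclk1_2, dclk1_1))  # prints 0.5
```

Under these assumptions, a value below 1 would indicate a lower relative abundance of the phosphopeptide in DCLK1.2 than in DCLK1.1 after correcting for the amount of protein analysed.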