Crystal structure of the C-terminal domain of tubulin-binding cofactor C from Leishmania major

Graphical abstract
The crystal structure of the C-terminal fragment of tubulin-binding cofactor C from Leishmania major has been determined. The structure identifies a single domain dominated by a right-handed β-helix and comparisons suggest key residues involved in stimulating the GTPase activity of β-tubulin. Image caption: The structure of LmTBCC-C.

trypanosomatids as the model system to help dissect the contributions that they make to microtubule assembly [9][10][11].
There are five TBCs, termed A-E. TBCB and TBCE bind α-tubulin whilst TBCA and TBCD interact with β-tubulin, delivering each tubulin subunit into a super-complex comprising the α/β-tubulin heterodimer and cofactors D and E [8]. TBCC is involved in the final-stage activation of GTP hydrolysis by β-tubulin, promoting release from the super-complex of the α/β-tubulin heterodimer, which can then proceed to polymerization [8,12,13]. TBCC is a polypeptide of about 340 amino acids, located in the centrosome and predicted to form three distinct domains [14]. The N-terminal domain of the human protein has been characterized by nuclear magnetic resonance spectroscopy (NMR) and the structures deposited in the Protein Data Bank (PDB) [14]. This N-terminal domain carries a flexible and unstructured N-terminus that interacts with tubulin and leads into a bundle of three α-helices [14]. The fold is similar to that of TBCA [11]. NMR structures of a truncated C-terminal fragment of human TBCC, consisting of 179 amino acids, are deposited in the PDB (code 2yuh, unpublished). This protein domain shares a sequence identity of approximately 20% with the corresponding domain of the trypanosomatid proteins.
Here we concentrate on TBCC from L. major (LmTBCC, Uniprot code Q4Q1A3) and describe the crystallographic analysis of its C-terminal domain, LmTBCC-C. Recombinant forms of the full-length protein from Trypanosoma brucei and L. major (335 amino acids, approximate mass 36.8 kDa) were prepared but proved recalcitrant to structural studies. In particular, speedy degradation of the polypeptides was noted.
The gene fragment encoding residues 152-355 of LmTBCC-C from Leishmania major strain Friedlin, identified in GeneDB (LmjF.36.3160, [21]), was amplified from genomic DNA using PCR. To permit the use of selenomethionine for phase determination, a single mutation, Leu223Met, was introduced (Quikchange mutagenesis, Stratagene). The gene was cloned into a modified pET15b plasmid to encode an N-terminal His-tag followed by a tobacco etch virus (TEV) protease cleavage site. The resulting vector was transformed into Escherichia coli B834 (DE3) and cells were grown in Selenomethionine Medium (Molecular Dimensions, UK). Expression was induced with 1 mM IPTG at an OD600 of 0.6 and growth continued at room temperature for 16 hours. Cells were harvested by centrifugation and resuspended in 50 mM Tris-HCl pH 7.5, 250 mM NaBr, 20 mM imidazole before storage at −20 °C. Thawed cells were lysed using a French press at 16 kpsi and the lysate was clarified by centrifugation at 37,500 × g for 30 min at 4 °C. The soluble supernatant was filtered (0.2 µm) and loaded onto a 5 mL HisTrap HP column (GE Healthcare) pre-equilibrated with 50 mM Tris-HCl, 250 mM NaBr pH 7.5 for an initial affinity chromatography capture step. LmTBCC-C was eluted by applying an imidazole gradient, with the target protein eluting at approximately 140 mM. The product was treated with TEV protease at 30 °C for 2 h.
Dialysis at room temperature, to remove excess imidazole, was followed by reverse affinity chromatography prior to a final purification step with size exclusion chromatography using a calibrated Superdex 200 26/60 gel filtration column and the equilibration buffer. The protein eluted with an estimated mass of 20 kDa, which corresponds to that expected for a monomeric sample (20.4 kDa). The sample was pooled, buffer exchanged into 10 mM Tris-HCl, 100 mM NaBr pH 7.5 and concentrated using a centrifugal concentrator (10 kDa cutoff, Sartorius) prior to crystallization. The protein concentration was determined by measurement of absorbance at 280 nm using an estimated extinction coefficient of 38,680 M⁻¹ cm⁻¹ [22].
Poor quality crystals were produced at 18 °C by the hanging drop vapor diffusion method using 0.75 µL of protein solution at a concentration of 7 mg mL⁻¹, mixed with 0.75 µL of reservoir containing 100 mM MES (4-morpholineethanesulfonic acid) pH 6.5, 25-30% PEG 2000 MME (polyethylene glycol monomethyl ether). A crystal was placed into an Eppendorf tube with 100 µL of reservoir and a small nylon ball was added before vortexing for 30 seconds to create a micro-crystal suspension. Fresh conditions were prepared with 100 mM MES pH 6.7, 22% PEG 2000 MME in the reservoir, and protein solution as before but at a reduced concentration of 3 mg mL⁻¹. A cryo-loop was used to streak the micro-seed suspension into the conditions and the plates were stored at 18 °C. Well-formed needles (40 × 40 × 500 µm) appeared in several days.
Single-wavelength anomalous dispersion (SAD) data were measured from a single crystal at −170 °C on beam line I24 of the Diamond Light Source with a Pilatus 6M detector. A helical data collection protocol was used to minimize radiation damage. Data were indexed and integrated using XDS [23] and scaled using AIMLESS [24]. The structure was solved via SAD phasing using Phenix AutoSol [25].
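The concentration determination mentioned above rests on the Beer-Lambert relationship, A = ε·c·l. The following is a minimal sketch of that arithmetic, not the authors' procedure: the extinction coefficient (38,680 M⁻¹ cm⁻¹) and monomer mass (20.4 kDa) are taken from the text, whereas the function names, the 1 cm path length and the example absorbance reading are assumptions made for illustration.

```python
# Beer-Lambert sketch: convert an A280 reading into protein concentration.
# EPSILON_280 and MOLAR_MASS come from the text; everything else is hypothetical.

EPSILON_280 = 38_680.0  # M^-1 cm^-1, estimated extinction coefficient
MOLAR_MASS = 20_400.0   # g/mol, expected mass of the monomeric sample


def molar_concentration(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Return concentration in mol/L from c = A / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


def mass_concentration(a280: float, epsilon: float, molar_mass: float,
                       path_cm: float = 1.0) -> float:
    """Return concentration in mg/mL (numerically equal to g/L)."""
    return molar_concentration(a280, epsilon, path_cm) * molar_mass


# Hypothetical reading of A280 = 1.0 in a 1 cm cuvette (assumed values).
example = mass_concentration(1.0, EPSILON_280, MOLAR_MASS)
print(f"A280 of 1.0 corresponds to about {example:.3f} mg/mL")
```

Under these assumptions an absorbance of 1.0 corresponds to roughly 0.53 mg mL⁻¹, so a concentrated stock would normally be diluted before measurement to stay within the linear range of the spectrophotometer.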
Two molecules of LmTBCC-C constitute the asymmetric unit and each contains two SeMet residues. These four Se positions were identified and provided an initial figure-of-merit of 0.44. The density modification step yielded an improved figure-of-merit of 0.69 and was followed by automated model building to produce a partial model consisting of 311 residues, giving an R/Rfree of 24.2%/29.1% and a map-model correlation coefficient of 0.79. The model was then completed with the graphics software COOT [26]. Refinement was performed in REFMAC5 [27] utilizing Translation/Libration/Screw refinement [28], alternated with rounds of electron and difference density map inspection, model manipulation and ligand incorporation using COOT, together with the incorporation of waters and alternate side chain conformers. Non-crystallographic symmetry restraints were not employed. MOLPROBITY [29] was used to investigate model geometry in combination with the validation tools provided in COOT.
a Values in parentheses are for the highest resolution shell. b Pearson correlation coefficient [30]. c Calculated using the Molprobity server (http://molprobity.biochem.duke.edu).
With the full-length polypeptides prone to degradation, limited proteolysis of the L. major protein with trypsin, followed by mass spectrometry fingerprinting, identified a C-terminal fragment. A recombinant form of this domain, comprising residues 152-355 with a Leu223Met mutation, was prepared (LmTBCC-C) to allow the production of selenomethionine (SeMet) derivative protein. The strategy behind the mutation was to enhance the chances of obtaining a good anomalous dispersion signal by placing selenium into the hydrophobic core of the protein fold at a position unlikely to influence the structure. Sequence comparisons (not shown) indicated that at positions corresponding to 223 a leucine, isoleucine (as in human TBCC, PDB code 2yuh) or methionine is observed. This form was crystallized and the structure determined at 2.2 Å resolution by exploiting the anomalous dispersion X-ray scattering properties of selenium. Crystallographic details are presented in Table 1 and the coordinates and structure-factor data have been deposited in the PDB with accession code 5aj8. Two polypeptides constitute the asymmetric unit and in each there are two segments, residues 155-157 and 326-355, which could not be modeled due to disorder. Non-crystallographic symmetry was not restrained during the refinement and it is noteworthy that the two molecules adopt a similar structure, with an r.m.s.d. for all atoms of just 0.77 Å between residues 158-325, reducing to just 0.18 Å when only the main chain atoms are considered. It is therefore only necessary to detail one molecule.
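The r.m.s.d. values quoted above summarize how far equivalent atoms in the two molecules lie from one another after superposition. The sketch below illustrates only the final arithmetic and makes a strong assumption: the two coordinate sets are already superimposed and listed in the same atom order. The function name and the toy coordinates are hypothetical and are not taken from the deposited structure.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation (in the input units, e.g. Å) between
    two equally ordered, pre-superimposed sets of atomic coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (hypothetical coordinates): every atom displaced by 1 Å along z.
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
chain_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(chain_a, chain_b))
```

Restricting the input to main chain atoms rather than all atoms would give the second kind of value reported in the text; the optimal superposition itself is a separate least-squares step that is not shown here.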
The N-terminal segment, residues 158-242, forms a right-handed parallel β-helix barrel consisting of five coils or layers. The helical barrel is shaped as an approximate triangular prism, with faces and strands labeled A, B, C (Fig. 1A). The sides are about 20 × 10 × 20 Å with height approaching 25 Å. Three short β-strands, A and C between four and six residues in length and B only three residues, together form each layer of the β-helix. This β-helix structure is classified as a CARP domain, identified in cyclase-associated proteins and X-linked retinitis pigmentosa 2 gene products (http://smart.embl-heidelberg.de/smart/do_annotation.pl?DOMAIN=CARP). The rest of the polypeptide is catalogued as the TBCC domain [14] but it forms an extended structure, not at all domain-like, mainly positioned over one side of the β-helix, the C-face. This section exits the β-helix, drops down (α1) then loops up (α2, α3) and over to place β6 as a cap of the β-helix (Fig. 1A). Numerous interactions, involving hydrogen bonds and van der Waals forces, are formed between residues in this extended segment and those on the β-helix, including a tyrosine "ladder" composed of Tyr188, Tyr207 and Tyr224, and these stabilize the fold. These interactions involve main chain and side chain groups.
Fig. 1. The structure of LmTBCC-C. (A) A ribbon diagram of the fold, colored blue to red from N-terminus to C-terminus. Helix α1 is obscured behind the β-helix structure in this orientation. (B) The tyrosine ladder and a nearby tryptophan (green C atoms) and stabilizing hydrogen bonding interactions (purple dashed lines) formed with the extended C-terminal residues (yellow C atoms). A water molecule is depicted as a cyan sphere; N and O atoms are colored blue and red, respectively.
The arrangement of the three tyrosine residues, together with some of the stabilizing hydrogen bonding interactions, is depicted in Fig. 1B. In addition, a significant hydrophobic core formed by aromatic residues (Phe170, Trp226, Tyr244, Trp247, Trp267, Phe316) helps to position the α-helical loop onto the C-face (not shown). The polypeptide crosses over the top of the β-helix then drops down, and the C-terminal region is tucked under the cylindrical structure, interacting with the N-terminal residues. The overall result is that the so-called "TBCC domain" is actually an extended structure that forms a shield covering one side of the molecule, leaving another to potentially interact with binding partners. The extended conformation and overall structure of residues 243-355 suggest this is not in itself a bona fide domain but rather contributes to an overall C-terminal domain.
A common feature of the right-handed β-helical fold is the existence of stacks of hydrophobic residues within the barrel [15]. LmTBCC-C is no exception. A stack of four cysteine residues is positioned within the core of the helix in the middle of the βB strands 2 to 5, while nearby on βC strands 2 and 3 another two cysteine residues form a minimal stack. The sulfur atoms are 3.8-4.7 Å apart vertically within a stack, and the two stacks are separated horizontally by 5.2 Å. Aliphatic hydrophobic residues dominate this part of the structure and form the core of the β-helix.
The structure of the LmTBCC-C domain matches that of the human retinitis pigmentosa 2 (RP2) protein with an r.m.s.d. of 2.3 Å and a sequence identity of about 25% over 139 residues (Fig. 2A). An alignment of the amino acid sequences is presented in Fig. 2B. The β-helix structures are closely related and the majority of strictly conserved residues occur on the β-strands. This extends to a cysteine stacking formation similar to that discussed above [16]. A further stack of conserved aliphatic and hydrophobic residues is positioned in the middle of βA strands 2-5. The particular arrangement of externally oriented tyrosine residues in the middle of βC strands 1-4 also creates the same tyrosine "ladder" described. The conservation of these features suggests that they are important for the correct folding and stability of the β-helix fold [17].
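A percentage sequence identity such as the one quoted above is a ratio of identical residue pairs to aligned residue pairs. The sketch below shows one common way to compute it from a pairwise alignment, counting only columns where neither sequence has a gap. This denominator is an assumption, since other conventions exist and the text does not state which was used. The function name and the toy sequences are hypothetical and are not the LmTBCC-C or RP2 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the aligned (non-gap) columns
    of two sequences taken from the same pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        aligned += 1
        if res_a == res_b:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / aligned


# Toy alignment (hypothetical): five aligned columns, four of them identical.
print(percent_identity("MKT-AYL", "MRTQA-L"))
```

With this convention, an identity of about 25% over 139 aligned residues would correspond to roughly 35 identical positions.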
The RP2 protein is not involved in tubulin heterodimerization but, like TBCC, it does facilitate the GTPase activity of tubulin in the presence of TBCD [18,19]. RP2 forms a complex with the Arf-like GTP-binding protein ARL3 (ADP ribosylation factor-like protein 3) and a crystal structure of the RP2:ARL3:GTP complex has been determined [16]. A comparison of the structures is instructive, allowing us to identify, with some confidence, the β-helix A-face of LmTBCC-C as the region for interaction with GTP and, by implication, β-tubulin. Strikingly, residues shown to be critical for RP2 activity are conserved. In particular, LmTBCC Arg214 and Glu234 correspond to RP2 Arg118 and Glu138. The arginine contributes directly to the catalytic activity of this type of GTPase: it interacts with the γ-phosphate and is sometimes termed the arginine finger [20]. Mutation of this arginine in TBCC has been shown to abolish GTPase stimulating activity [18]. Other key residues are conserved in and around the potential GTP:β-tubulin binding site on the β-helix A-face. In the RP2:ARL3:GTP complex, on one side of the catalytic arginine, a pair of glutamines (Gln115, Gln116) form hydrogen bonds to position the GTP ribose and α-phosphate moieties, respectively. In LmTBCC these correspond to His211 (not shown) and Gln212. Nearby, on the other side of the arginine, a basic patch is formed on both proteins, by Arg120 in RP2 and Lys216 in LmTBCC. On the strand below Arg118/214 (A3), and directed onto the surface of the A-face, lie conserved serine (Ser99, Ser195) and phenylalanine (Phe101, Phe197) residues. These are positioned such that they would seem likely to be involved in protein-protein interactions.
Our structural data and the availability of LmTBCC-C in a stable recombinant form can now support and inform further studies to elucidate the precise contributions that these residues make to the final stage processing of the α/β-tubulin heterodimer.
Fig. 2. (A) … [19]) superimposed on LmTBCC-C (yellow Cα). The two proteins share 24% sequence identity. DALI [31] was used to inform comparison with structures in the PDB. The second domain from RP2 (residues 224-350), for which there is no similarity to LmTBCC-C, has been removed for clarity. Several residues implicated in GTP binding in the RP2-ARL structure are highlighted where they remain conserved or conservatively altered in LmTBCC-C. These are labeled with the LmTBCC residue type and number, then the RP2-ARL residue type, if different, and number. Residues positioned to make hydrogen bond interactions with the nucleotide from RP2 and their corresponding residues in LmTBCC-C are shown as sticks. (B) Sequence alignment of LmTBCC-C with part of human RP2 (Uniprot: O75695). The secondary structure of LmTBCC-C is shown and strictly conserved residues are encased in black. Figure made with ALINE [32].