Crystal structures of the CPAP/STIL complex reveal its role in centriole assembly and human microcephaly

Centrioles organise centrosomes and template cilia and flagella. Several centriole and centrosome proteins have been linked to microcephaly (MCPH), a neuro-developmental disease associated with small brain size. CPAP (MCPH6) and STIL (MCPH7) are required for centriole assembly, but it is unclear how mutations in them lead to microcephaly. We show that the TCP domain of CPAP constitutes a novel proline recognition domain that forms a 1:1 complex with a short, highly conserved target motif in STIL. Crystal structures of this complex reveal an unusual, all-β structure adopted by the TCP domain and explain how a microcephaly mutation in CPAP compromises complex formation. Through point mutations, we demonstrate that complex formation is essential for centriole duplication in vivo. Our studies provide the first structural insight into how the malfunction of centriole proteins results in human disease and also reveal that the CPAP–STIL interaction constitutes a conserved key step in centriole biogenesis. DOI: http://dx.doi.org/10.7554/eLife.01071.001

but are not restricted to, microcephaly (McIntyre et al., 2012). Moreover, compelling genetic links are now emerging between centrioles/centrosomes and DNA damage repair (DDR) pathways: mutations in certain MCPH genes and in genes encoding other centriole/centrosome proteins can lead to Seckel syndrome and MOPD, pathologies normally associated with defects in DDR (Megraw et al., 2011). Thus, the cellular mechanisms that lead to pathology when centriole/centrosome proteins are mutated in humans remain unclear.
Centrioles are complex structures, but work in several model systems revealed only a small number of conserved proteins to be important for centriole assembly. These include PLK4/SAK, SAS-6, STIL/Ana2, CPAP/CenpJ/SAS-4, Cep152/Asl, and CEP135 (Brito et al., 2012;Gonczy, 2012). Several studies have identified a complex web of putative interactions between these proteins (Cizmecioglu et al., 2010;Dzhindzhev et al., 2010;Hatch et al., 2010;Tang et al., 2011;Vulprecht et al., 2012;Lin et al., 2013). However, an understanding of centriole architecture and its assembly mechanisms will ultimately require high-resolution structures of the key centriolar components and their complexes. The power of combining structural studies with protein biochemistry and functional in vivo experiments has been demonstrated by work on SAS-6. These studies revealed how SAS-6 homo-oligomerises to organise the central cartwheel (Kitagawa et al., 2011b;van Breugel et al., 2011), the earliest structurally defined intermediate in centriole assembly (Brito et al., 2012;Gonczy, 2012), and suggested how SAS-6 might interact with SAS-5, the proposed STIL homologue in worms (Qiao et al., 2012). Additionally, high-resolution structures of Sak/Plk4 fragments have recently been solved (Leung et al., 2002;Slevin et al., 2012). However, equivalent studies with other core centriolar components or especially their complexes are currently missing, and how any of these proteins might be structurally and mechanistically compromised in MCPH is not known.
However, the process of cell division does not have to be symmetric, and the fates of the cells can be very different if cellular contents, including RNAs or proteins, are exclusively retained in the 'mother' or passed to her 'daughter'. Organelles known as centrioles can play an important part in influencing whether cell division is symmetric or asymmetric.
Centrioles contain ordered assemblies of various proteins, and mutations in some of these proteins can cause developmental defects in humans. For example, mutations in the centriolar proteins CPAP and STIL cause a syndrome known as microcephaly, in which the brain is smaller than normal. Although CPAP and STIL are known to bind each other, how they interact on a molecular level to form centrioles, and how this interaction is disrupted in microcephaly, is not well understood.
Cottee et al. have now used structural and biochemical assays to explore how these two proteins bind to each other, and have identified specific amino acid residues that enable this interaction. These residues are highly conserved across many organisms, and a mutation in one of them has previously been associated with microcephaly in humans. Now, Cottee et al. demonstrate that this mutation weakens the interaction between CPAP and STIL in vitro.
To explore these processes in vivo, Cottee et al. studied mutant fruit flies in which the interactions between CPAP and STIL were weaker than normal, and found that these mutations prevented the normal formation of centrioles. Furthermore, there was a striking correlation between the ability to form centrioles in fruit flies and the ability of CPAP and STIL to bind each other, based on the structural model and in vitro binding studies.
Cumulatively, these findings reinforce the importance of CPAP and STIL in centriole formation, and suggest that one reason for the development of microcephaly may be defects in the proper formation of centrioles.
controlling the organisation (Pelletier et al., 2006;Dammermann et al., 2008) and length of the centriolar microtubules (Blachon et al., 2009;Kohlmaier et al., 2009;Schmidt et al., 2009;Tang et al., 2009;Kim et al., 2012). A direct interaction between STIL and CPAP has been observed in yeast-two-hybrid and pull-down experiments (Tang et al., 2011;Vulprecht et al., 2012). Intriguingly, a MCPH mutation (E1235V) in the conserved C-terminal domain of CPAP (the so-called TCP-domain or G-Box) appeared to weaken this yeast-two-hybrid interaction (Tang et al., 2011). Tissue culture experiments suggested that this MCPH mutation might cause a partial loss-of-function of CPAP (Kitagawa et al., 2011a). However, the same study also found that the E1235V mutation results in an enhanced functionality of CPAP when overexpressed in vivo (Kitagawa et al., 2011a). To understand how CPAP and STIL interact and how the MCPH mutation affects CPAP functionality in vitro and in vivo, we undertook a detailed biochemical, structural and functional study of the putative CPAP-STIL complex.

Results
The CPAP TCP domain binds to a conserved proline-rich motif in STIL
Yeast-two-hybrid experiments suggested that a region of human CPAP comprising its conserved C-terminal TCP domain (or G-box) can interact with a ∼400 amino acid (aa) region (residues 231-619) of human STIL (Tang et al., 2011). To try to identify the region of STIL most likely to be involved in an interaction with CPAP, we carried out a sequence alignment with multiple metazoan STIL proteins (Figure 1A, Figure 1-figure supplement 1). This analysis revealed a short (∼40 aa) highly conserved proline-rich region (CR2) (Figure 1A) within this interval. To test whether this region of STIL could bind to the CPAP TCP domain, we recombinantly produced the TCP domain of Danio rerio CPAP and used isothermal titration calorimetry (ITC) to test its ability to bind to a fragment of D. rerio STIL that spanned CR2 (residues 404-448) (Figure 1D, Table 1). The two proteins formed a 1:1 complex with a KD of ∼2 μM. Next, we further split the peptide to test the binding contribution from its N-terminal (residues 411-428) and C-terminal (residues 429-448) regions. The N-terminal region exhibited only slightly weaker binding (KD ∼4 μM) to the TCP domain, whereas the C-terminal region showed very weak binding (KD > 500 μM) (Figure 1D; Table 1). We conclude that the CPAP TCP domain binds to a short conserved motif in STIL (CR2) with a potentially biologically significant affinity, and that the majority of the binding affinity comes from interactions with residues within the first proline-rich region in CR2.
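To give a feel for what these dissociation constants mean in practice, the following minimal sketch solves the standard 1:1 mass-action equilibrium for the concentration of complex. It is not the fitting procedure used for the ITC data; the function names and the 20 μM total concentrations are assumptions chosen purely for illustration, and only the approximate KD values are taken from the text.

```python
import math


def complex_concentration(p_total, l_total, kd):
    """Equilibrium concentration of a 1:1 complex PL.

    Solves PL^2 - (P_t + L_t + K_D) * PL + P_t * L_t = 0 and returns the
    physically meaningful (smaller) root. All arguments share one unit.
    """
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def fraction_bound(p_total, l_total, kd):
    """Fraction of the protein (P) that is in complex at equilibrium."""
    return complex_concentration(p_total, l_total, kd) / p_total


# Hypothetical conditions (an assumption, not from the paper):
# 20 uM TCP domain mixed with 20 uM STIL peptide.
P_TOTAL_UM = 20.0
L_TOTAL_UM = 20.0

frac_cr2 = fraction_bound(P_TOTAL_UM, L_TOTAL_UM, kd=2.0)      # CR2, KD ~2 uM
frac_nterm = fraction_bound(P_TOTAL_UM, L_TOTAL_UM, kd=4.0)    # N-terminal part, KD ~4 uM
# KD > 500 uM is a lower bound, so this value is an upper bound on occupancy.
frac_cterm_max = fraction_bound(P_TOTAL_UM, L_TOTAL_UM, kd=500.0)

print(f"CR2: {frac_cr2:.2f}  N-term: {frac_nterm:.2f}  C-term: <{frac_cterm_max:.2f}")
```

Under these assumed concentrations, most of the TCP domain is occupied by the CR2 or N-terminal peptides, whereas the C-terminal peptide alone would occupy only a few percent of it.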
The CPAP TCP domain adopts a unique extended open β-sheet conformation that packs against a series of conserved prolines in STIL
To understand how CPAP and STIL interact at the molecular level, we obtained the crystal structures of the TCP-domain of D. rerio CPAP 937-1124, both on its own and in a complex with D. rerio STIL 408-428 (Figure 1B,C,E; Table 2, Table 3). In both structures, the TCP domain adopts a nearly identical conformation, suggesting that no significant conformational change occurs in CPAP upon binding to STIL (RMSD = 1.5 Å ± 0.2 Å over 148 ± 4 Cα pairs). The TCP domain folds into a single layer β-sheet comprising ∼20 consecutive antiparallel strands connected by type I β-turns and is stabilised by an extensive hydrogen-bonding network. The resulting sheet shows a twist of approximately 13° (i.e., the angle between the consecutive, hydrogen-bonded strands), slightly lower than the average value of 20° observed for typical β-sheets (Chothia, 1973;Murzin, 1992). Individual β-hairpins correspond to previously noted (Islam et al., 1993;Hung et al., 2000) repeats in the TCP domain sequence; the turns of these hairpins are often constituted by a PDG motif, explaining the high frequency of proline and glycine residues in this domain (Figure 1-figure supplement 2). Crystal packing interactions involve only small protein interfaces, suggesting that the protein is biologically active as a monomer. Indeed, both small-angle X-ray scattering (SAXS) and size-exclusion chromatography-multi angle light scattering (SEC-MALS) experiments demonstrate that the TCP domain is predominantly monomeric in solution (Figure 1-figure supplement 3).
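For readers less familiar with the RMSD values quoted for structure comparisons, the sketch below shows how the quantity is defined for a set of paired Cα positions. It assumes the two coordinate sets have already been optimally superposed (structure-comparison programs perform that step first); the function name and the toy coordinates are illustrative assumptions and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D points (same units as input).

    Assumes the two structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Toy example (hypothetical coordinates in angstroms): the second set is the
# first set displaced by 1 A along x, so every pair deviates by exactly 1 A.
toy_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
toy_b = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
toy_rmsd = rmsd(toy_a, toy_b)
print(f"RMSD over {len(toy_a)} pairs: {toy_rmsd:.2f} A")
```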
The structure of the TCP domain represents an unusual, novel architecture. It is reminiscent of the β-sheet conformation proposed to exist within amyloid fibrils and resembles engineered water-soluble peptide self-assembly mimics (PSAMs) used to study β-rich self-assemblies (Makabe et al., 2006). In contrast to these PSAM structures, whose conformation is maintained by two globular domains capping both ends of the β-sheet, the TCP domain stably exists on its own (Figure 1-figure supplement 4). The TCP domain structure lacks a defined hydrophobic core typical for globular domains, and both sides of its β-sheet are exposed to the solvent and well hydrated.
The structure of the CPAP-STIL complex revealed that the STIL peptide binds in a polyproline II helical conformation along one edge of the TCP domain β-sheet. The STIL peptide binds to CPAP by four main mechanisms (Figure 1C). First, three STIL prolines (P417, P421, and P423) pack against aromatic CPAP residues (F978, Y996, and F1015) in a way that resembles target motif recognition by other described proline-rich motif (PRM) binding domains (Kay et al., 2000). Second, R418 (STIL) makes a cation-π interaction with the phenyl ring of Y994 (CPAP). Third, STIL R418 is further involved in a water-mediated hydrogen bonding network that includes CPAP residues H1003 and T1005. Finally, sidechain-mainchain interactions are formed between CPAP residues Y994, Q1019, and E1021 and the bound STIL peptide. The CPAP and STIL residues involved in this interaction are highly conserved across metazoans (Figure 1C).
Figure 1 (caption, in part). CPAP is a 1124 amino acid (aa) protein with three predicted coiled coil (cc) domains and a C-terminal TCP domain. STIL is a 1263 aa protein with one predicted cc domain and several conserved regions (CR). The proline-rich CR2 domain is enlarged and coloured according to Consurf conservation scores (Glaser et al., 2003) from cyan (variable) to burgundy (conserved). The constructs used in this study are indicated by bars.
Figure 1 (caption, continued). (B) Two views of the TCP domain structure (green) in complex with the STIL peptide (orange), rotated by 180°. Images on the left of each view show a ribbon representation and images on the right show the TCP domain as a molecular surface coloured according to Consurf conservation scores. Note the presence of a conserved patch (dashed circle) along the edge of the TCP domain where the STIL peptide is bound. This patch contains aromatic residues (black sticks) that would be well placed to interact with conserved prolines in the C-terminal part of the STIL CR2 region that we had to omit for crystallisation. ITC experiments (Figure 1D) suggest that these putative additional contacts would only contribute weakly to overall binding. (C) Detailed view of the D. rerio CPAP-STIL interaction interface coloured according to Consurf conservation scores. Interface residues are shown in sticks, and the TCP domain is shown as a semi-transparent molecular surface. Contact residues are labelled in green (CPAP) and orange (STIL). Dotted yellow lines indicate hydrogen-bonds. The dark orange sphere represents a bound water molecule. (D) ITC analysis using the STIL constructs shown in Figure 1A. The excess heat measured on titrating STIL into CPAP at 25°C was fitted to a single set of binding sites model.
Sequence conservation of the TCP domain is not confined to this section of our structure but extends further along the same edge of the sheet (Figure 1B). This additional conserved region contains aromatic residues that are arranged similarly to those that pack against the proline residues of the bound STIL peptide in our crystal structure (Figure 1B,C). Intriguingly, the C-terminal part of STIL's CR2 region (omitted to obtain diffraction grade crystals) contains two highly conserved proline residues (P435 and P438 in D. rerio) that would be well positioned to bind to these aromatic residues in an analogous way (Figure 1A,B).
Thus, we speculate that the entire CR2 region of STIL spanning from residue 417 to residue 438 (D. rerio) may be bound all along the edge of the TCP domain. Although our ITC experiments suggest that these putative additional contacts are insufficient to establish strong binding between STIL and CPAP (Figure 1D), they may contribute cooperatively to the CPAP-STIL interaction once the N-terminal proline-rich region in CR2 has established binding. We conclude that the TCP domain of CPAP adopts a unique extended open β-sheet conformation that recognises a series of conserved prolines in the CR2 region of STIL.
The CPAP E1021V MCPH mutation reduces the binding affinity of the CPAP-STIL interaction
The involvement of CPAP E1021 in the interaction with STIL in zebrafish is potentially significant, as the equivalent residue in human CPAP (E1235) is mutated to valine in some MCPH patients. To test whether this mutation disrupts the organisation of the TCP domain, we obtained the crystal structure of D. rerio CPAP 937-1124 carrying the E1021V mutation (Figure 1F; Table 2). The structures of the wild-type and the mutant TCP domains were virtually identical (RMSD = 0.1 Å over 142 Cα pairs), demonstrating that the TCP domain structure was not compromised. To test whether this mutation perturbed the interaction with STIL, we purified WT and various other mutant forms of D. rerio CPAP 937-1124 in which we substituted with valine those residues that our crystal structure suggested to be important for binding (Figure 2B). Circular dichroism (CD) spectra indicated that the mutant forms of the TCP domain were correctly folded with a predominantly β-type profile (Table 4), while mutation of E1021 decreased the binding strength by approximately eightfold. In contrast, mutation of T986, which is not predicted to be in the interaction interface, did not detectably perturb binding.
We also purified mutant forms of the D. rerio STIL 404-448 peptide and tested their ability to interact with the WT D. rerio TCP domain in ITC experiments (Table 4). Alanine substitution of P417, R418 or P421 decreased the binding strength by ∼10 to 20-fold, and alanine substitution of P423 decreased it by approximately twofold to threefold. In contrast, the mutation of residue N422, which is not predicted to be in the interaction interface, did not compromise binding. Taken together, these results lend strong support to our structural model and indicate that the E1021V MCPH mutation leads to roughly an order of magnitude decrease in affinity of the CPAP-STIL interaction.
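The fold changes in binding strength reported above can be translated into approximate free-energy penalties using the standard relation ΔΔG = RT ln(KD,mutant / KD,WT). The sketch below is an illustration only: it assumes the measurements were made at 25°C (the temperature given for the ITC titrations in Figure 1D), and the function name is a hypothetical helper rather than part of any analysis described here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold, temp_celsius=25.0):
    """Binding free-energy penalty (kcal/mol) for a given fold increase in KD.

    Uses ddG = R * T * ln(KD_mutant / KD_wt); positive values mean weaker binding.
    The 25 C default is an assumption for illustration.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * (temp_celsius + 273.15) * math.log(fold)


# Approximate fold changes quoted in the text.
ddg_e1021v = ddg_from_fold_change(8.0)    # CPAP E1021V, ~eightfold
ddg_pro_low = ddg_from_fold_change(10.0)  # STIL P417A/R418A/P421A, lower bound
ddg_pro_high = ddg_from_fold_change(20.0) # STIL P417A/R418A/P421A, upper bound

print(f"E1021V: {ddg_e1021v:.2f} kcal/mol")
print(f"P417A/R418A/P421A: {ddg_pro_low:.2f} to {ddg_pro_high:.2f} kcal/mol")
```

On this scale, an eightfold loss of affinity corresponds to a penalty of a little over 1 kcal/mol, which is of the order expected for the loss of a single polar contact.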

The CPAP-STIL interaction is highly conserved
The sequence conservation of the CPAP TCP domain (Figure 1-figure supplement 5) and the CR2 region of STIL (Figure 1-figure supplement 1) suggests that this interaction may be conserved. To confirm this, we solved the crystal structure of the TCP domain from Drosophila melanogaster DSas-4 (dCPAP) (residues 700-901) in complex with the region of Ana2 (dSTIL) equivalent to CR2 (residues 1-47) (Table 2, Table 5, Table 6; Figure 2C). The dSTIL-dCPAP interaction interface in this structure was highly similar to the D. rerio complex (inter-species alignments of the structures yielded an average pairwise RMSD of 1.2 ± 0.2 Å across an average of 118 ± 4 Cα pairs). Indeed, all copies of the complex obtained in the structures from both species superimposed well and exhibited the same four major groups of binding interactions as described for the D. rerio structure. This conservation includes the contact made by the E792 residue in dCPAP (the equivalent of the E1235 residue in human CPAP that is mutated in MCPH). Together, these data allow us to determine a consensus CPAP binding motif in metazoan STIL proteins (PRxxPxP, Figure 1-figure supplement 1) and suggest that the described CPAP-STIL interaction constitutes a highly conserved step in centriole biogenesis.
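A consensus such as PRxxPxP, where x stands for any residue, can be searched for in a protein sequence with a simple regular expression. The sketch below is only an illustration of that idea: the function name is hypothetical, and the toy sequence is invented for the example and is not a real STIL, Ana2 or SAS-5 sequence.

```python
import re

# "x" in PRxxPxP means any residue, so it becomes "." in the regex.
# The lookahead wrapper lets overlapping occurrences be reported as well.
PRXXPXP = re.compile(r"(?=(PR..P.P))")


def find_prxxpxp(sequence):
    """Return (1-based start position, matched segment) for each motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PRXXPXP.finditer(seq)]


# Invented toy sequence (an assumption for illustration only).
toy_sequence = "MSGAPRKLPTPEEVAAGPAAPQPL"
hits = find_prxxpxp(toy_sequence)
for start, segment in hits:
    print(f"motif at residue {start}: {segment}")
```

A match of this kind only flags a candidate site; as the binding data above show, residues outside the strict consensus can also influence affinity.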

The CPAP-STIL interaction is essential for centriole assembly in vivo
Since the binding mechanism of CPAP and STIL is conserved between zebrafish and Drosophila, we turned to D. melanogaster as a model system to address the functional relevance of this interaction in vivo. In flies, the lack of dCPAP or dSTIL leads to centriole loss and a consequent severe uncoordinated (unc) phenotype due to the lack of basal bodies, and therefore cilia, in Type I sensory neurons. These flies lack all mechano- and chemo-sensation and, although viable, they usually die shortly after eclosion, as they cannot feed or move in a coordinated fashion (Kernan et al., 1994;Basto et al., 2006;Wang et al., 2011). We examined the ability of various GFP-tagged versions of dCPAP and dSTIL to rescue the centriole loss observed in these mutants and assayed their ability to localise to centrosomes in the presence of endogenous dCPAP or dSTIL (Figure 3).
Figure 3. The interaction between dCPAP and dSTIL is essential for centriole duplication in Drosophila. (A) Schematic view of the complex between dCPAP (green) and dSTIL (magenta) with the residues mutated in MC1 (cyan), MC2 (brown) and MC3 (dark purple) indicated as coloured sticks. The MCPH residue E792 is circled in red. Note that MC1 and MC2 are mapped onto the Drosophila structure (dark-green backbone), while MC3 had to be mapped onto the backbone of the D. rerio structure (light green backbone). Although highly conserved between Drosophila and D. rerio (Figure 1-figure supplement 5), this region was not visible in the electron density map of the Drosophila structure, probably due to its partial unfolding to enable packing interactions within the crystal. (B-M) Panels show representative still images taken from movies of Drosophila embryos expressing the indicated dCPAP-GFP or dSTIL-GFP constructs. Note that all analyses were performed in the presence of endogenous WT dCPAP or dSTIL, and that all images were acquired with the same microscope settings at the same stage of the cell cycle. (B-F) dSTIL-GFP constructs localise to centrosomes at similar levels. (G-M) All mutant dCPAP-GFP constructs localise to centrosomes, but at strongly reduced levels compared to wild-type dCPAP-GFP. (N) Graphs show the percentage of cells with 0, 1, 2, and 3 centrosomes in the genotypes analysed (as indicated). All dSTIL-GFP and dCPAP-GFP constructs were analysed in their respective mutant backgrounds. Note that this experiment was performed blind. (O-Q′′) Panels show third instar larval brain cells of various genotypes in metaphase. Cells were stained for the centriolar protein Asterless (Asl, green), the PCM component Centrosomin (Cnn, red) and DNA (blue). Wild-type metaphase cells have two centrosomes (O), whereas centrosomes are mostly absent in third instar larval brain cells from dCPAP mutants (P). As an example, representative images of dCPAP mutant cells expressing the dCPAP_E792V-GFP construct are shown that were scored with 2 (Q), 1 (Q′) or no (Q′′) centrosomes. Scale bars = 3 μm. DOI: 10.7554/eLife.01071.012
We first expressed a version of dSTIL that lacks the first 45 aa (including the PRxxPxP motif required for the interaction with dCPAP). GFP-tagged wild-type dSTIL (dSTIL_WT-GFP) served as a control. Both proteins were expressed at similar levels and localised strongly to centrosomes in the presence of endogenous dSTIL (Figure 3B,C; Figure 3-figure supplement 1). Only the wild-type version, however, was able to rescue the unc phenotype and the centriole loss phenotype of the dSTIL mutant (Figure 3N; Table 7). To further characterise the dCPAP binding domain in vivo, we mutated the first proline and arginine of the PRxxPxP motif of dSTIL to alanine, both separately and in combination (P11A, R12A, and P11A:R12A, Figure 2C). All three constructs strongly localised to centrosomes in the presence of endogenous dSTIL (Figure 3D-F, Figure 3-figure supplement 1). Both single mutants rescued the unc phenotype of the dSTIL mutation while the double mutant failed to do so (data not shown). The single mutants P11A and R12A were also able to partially rescue the centriole loss phenotype, whereas the double mutant P11A:R12A showed only a poor rescue (Figure 3N; Table 7). These data strongly suggest that the interaction with dCPAP is essential for dSTIL function in centriole assembly.
We next deleted the entire TCP-domain of dCPAP (dCPAP_ΔC), or expressed GFP fusion proteins carrying mutation clusters (MCs) altering 3-4 residues in different regions of the TCP domain (Figure 3A). Mutation clusters were designed that targeted central (dCPAP_MC1) or peripheral (dCPAP_MC2) residues in the dSTIL binding domain, as well as residues that are predicted not to be significantly involved in complex formation (dCPAP_MC3), according to the crystal structure and the ITC data (Figure 3A, Figure 2A, Figure 1D; Table 1). We also analysed dCPAP_E792V-GFP lines, which carried the MCPH equivalent mutation E792V (E1235V in humans and E1021V in zebrafish CPAP). All transgenic dCPAP-GFP proteins were expressed at approximately equivalent levels in vivo, but were moderately overexpressed compared to endogenous dCPAP (Figure 3-figure supplement 1). Wild-type dCPAP-GFP localised strongly to centrosomes and rescued both the unc phenotype and the centriole loss phenotype (Figure 3G,N; Table 7). Strikingly, the rescuing ability of the mutant constructs strongly correlated with the predicted strength of dSTIL binding. dCPAP_ΔC-GFP failed to rescue, dCPAP_MC1 and dCPAP_MC2 rescued poorly, the MCPH mutation E792V showed an intermediate phenotype, while dCPAP_MC3 exhibited a robust rescue (Figure 3N; Table 7). Interestingly, when compared to wild-type dCPAP-GFP, all mutant constructs (including dCPAP_MC3) localised only weakly to centrosomes (Figure 3H-M). Together, these data suggest that the interaction between dCPAP and dSTIL is a key step in centriole assembly and is essential for centriole duplication. Furthermore, they indicate that low total levels of dCPAP at centrosomes might be sufficient for centriole duplication, as long as some interaction with dSTIL is maintained.
The TCP domain of C. elegans SAS-4 is required for its interaction with SAS-5 and for centriole assembly
It has been proposed that SAS-5 is the C. elegans homolog of the STIL proteins in flies and vertebrates, but there is little sequence homology between these proteins (Stevens et al., 2010a). We failed to identify an unambiguous PRxxPxP motif in worm SAS-5, so we tested whether the TCP domain of SAS-4 (the C. elegans CPAP homologue) is functionally important. We used the Mos single-copy insertion system (MosSCI; Frøkjaer-Jensen et al., 2008) to generate transgenic lines with single-copy transgenes under the control of sas-6 regulatory sequences integrated at a specific site on chromosome II (Figure 4A). Transgenes were generated expressing GFP fusions with either WT SAS-4 (SAS-4 WT::GFP) or a form in which the C-terminal TCP domain (aa 557-808) had been deleted (SAS-4 ΔTCP::GFP); both transgenes contained a 497 bp resequenced region in their N-terminal coding region (preserving codon usage) that rendered them resistant to RNAi-mediated depletion (Figure 4A). SAS-4 depletion by RNAi prevents centriole assembly, resulting in a signature phenotype characterised by a normal first mitotic division followed by monopolar spindles during the second division (Figure 4B,C; O'Connell et al., 2001). This phenotype arises because the sperm that fertilise the  Figure 4C). Both the monopolar spindle phenotype and embryonic viability were fully rescued by the WT sas-4::gfp transgene (aa 1-808), but not by the ΔTCP transgene (aa 1-556) (Figure 4B,C). While SAS-4 WT::GFP targeted to centrioles in the absence of the endogenous protein, SAS-4 ΔTCP::GFP did not, and instead exhibited a diffuse accumulation in the pericentriolar material (Figure 4B). Thus, the SAS-4 TCP domain is required for SAS-4 to accumulate at centrioles and become incorporated into the microtubule-containing outer centriole wall.
To determine if the failure of SAS-4 ΔTCP::GFP to become incorporated in the centriole outer wall could be due to an inability to interact with SAS-5, we performed a pull-down assay to determine whether 35S-labelled in vitro translated SAS-4 fragments could interact with the N-terminal or C-terminal regions of SAS-5 bound to beads (Figure 4D). In vitro translated full-length SAS-4 interacted specifically with the N-terminal domain (aa 1-202) of SAS-5. Interestingly, we could not further narrow down the region of SAS-4 required for this interaction. Neither the SAS-4 N-terminal nor C-terminal region (which includes the TCP domain) alone could be pulled down by SAS-5. This result suggests that although the TCP domain is required for SAS-4 to interact with SAS-5, it is not sufficient. Together, these data suggest that a TCP domain-dependent interaction between SAS-4/CPAP and SAS-5/STIL is conserved and essential for centriole duplication in C. elegans, but that the precise interaction interface may have diverged.

Discussion
Only a small set of conserved centriolar proteins is essential for centriole assembly (Brito et al., 2012;Gonczy, 2012) and some of these proteins, like CPAP and STIL, have been linked to microcephaly in humans (Leal et al., 2003;Bond et al., 2005;Gul et al., 2006;Thornton and Woods, 2009;Darvish et al., 2010). However, there is currently little structural understanding of how these proteins interact with one another, how mutations in them cause microcephaly in humans and how these interactions are regulated.
Here we have solved the crystal structures of the CPAP-STIL complex from zebrafish and Drosophila. We showed that the CPAP TCP domain folds into an elongated open-sided β-meander that consists of ∼20 consecutive antiparallel β-strands connected by type I β-turns. β-meanders are frequently found in β-barrels, β-propellers and some α+β proteins. However, what, to our knowledge, makes the TCP domain structure unique amongst naturally occurring proteins is that it solely consists of a freestanding meander β-sheet that entirely lacks a defined hydrophobic core and is not flanked by other globular domains that pack against it. We show that the TCP domain is predominantly monomeric in solution and self-interacts in its crystallised form only through small interfaces that are not conserved. Thus, despite some resemblance to β-sheets observed in amyloid fibrils, it is unlikely that the TCP domain self-associates in a similar manner.
Instead, we demonstrate that the TCP domain of CPAP constitutes a novel proline-rich-motif (PRM) recognition-domain that specifically binds to a short target motif in STIL. Although the overall sequence identity of the CPAP and STIL proteins between Drosophila and zebrafish is relatively low (∼22% and ∼13%, respectively), our structural analysis revealed that the interaction interface is conserved, confirming the previous proposal that fly Ana2 is the functional homologue of vertebrate STIL (Stevens et al., 2010a). Our characterisation of the binding interface also allowed us to define a consensus-binding site (PRxxPxP) for the CPAP TCP domain in STIL that is conserved across metazoa. Our mutational analysis of the interface demonstrates a remarkable correlation between the ability of mutant proteins to bind to one another in vitro and their ability to support centriole assembly in vivo, providing compelling support for our structural model of the metazoan CPAP-STIL complex. These data strongly suggest that the interaction between CPAP and STIL is a conserved, essential step in centriole biogenesis. A schematic model that places this interaction in the context of a possible centriole assembly pathway is shown in Figure 5.
The high degree of sequence divergence between vertebrate STIL, Drosophila Ana2 and C. elegans SAS-5 suggests that STIL homologs are under particularly strong lineage-specific selection. Despite the many sequence changes between Drosophila Ana2 and vertebrate STIL, our work suggests that the interaction interface between Ana2/STIL and dSAS-4/CPAP TCP domain has been retained, highlighting its importance. Even in C. elegans, which is the most divergent of the functionally characterised STIL homologs, our work indicates that the SAS-4 TCP domain is essential for centriole assembly, and that a TCP-domain dependent interaction between SAS-4 and SAS-5 has been conserved. Nevertheless, as the SAS-4 TCP domain is not sufficient for interaction with SAS-5 and we were unable to identify a PRxxPxP interaction motif in worm SAS-5, more work will be needed to understand the SAS-4-SAS-5 interaction in C. elegans and its relationship to the CPAP-STIL interaction in other metazoans.
A surprising aspect of our findings is that the E792V (MCPH) mutant and all three of the mutation clusters (MCs) that we analysed in dCPAP localise poorly to centrosomes. For the E792V, MC1, and MC2 mutations this could be expected, as these are all predicted to perturb the interaction between dCPAP and dSTIL (as is the case with similar mutations in zebrafish CPAP in our in vitro binding assays), and this would be predicted to perturb the recruitment of dCPAP to centrioles. The MC3 cluster, however, is not predicted to lie in a strong interaction interface and, unlike the MC1 and MC2 mutation clusters, it can rescue the centriole duplication defect in dCPAP mutant cells nearly as efficiently as the WT protein. Possibly, an interaction with another protein that plays some part in recruiting dCPAP to centrioles might be perturbed by these mutations. Alternatively, similar to the situation with C. elegans SAS-4 (Dammermann et al., 2008), dCPAP may localise to both centrioles and the PCM. It might therefore be PCM and not centriole recruitment that is affected by the mutation clusters. If this were the case, it would be hard to discern an additional partial loss of centriole recruitment, as this loss would be masked by the PCM pool of dCPAP, especially under conditions of moderate overexpression of dCPAP. Importantly, however, our findings demonstrate that even very reduced amounts of centrosomal dCPAP can support robust centriole duplication as long as this protein can interact efficiently with dSTIL.
Our studies provide the first structural insight into the nature of the link between centrioles and human microcephaly. It is unclear why mutations in genes encoding key centriole or centrosome proteins can lead to such a specific neuro-developmental disorder in humans. It is widely assumed that some aspect of centriole/centrosome function must be particularly important in human neural progenitor cells, and that the failure of these cells to proliferate in an appropriate manner underlies the small brain size in affected individuals (Megraw et al., 2011). One possibility, based on the fact that these neural progenitors seem to divide asymmetrically (Siller and Doe, 2009;Megraw et al., 2011), is that centrioles/centrosomes may play a particularly important role in properly orienting the spindle during asymmetric divisions, and division orientation could in turn be required for the maintenance of neuronal progenitors. This appears to be the case in flies, where mutations in dCPAP/DSas-4 and dSTIL/Ana2 lead to defects in the asymmetric division of the neural stem/progenitor cells (Basto et al., 2006;Wang et al., 2011). However, there are other possible explanations. Human neural progenitor cells form primary cilia, for example, and signalling through the cilium could be perturbed if centriole assembly is perturbed (Han and Alvarez-Buylla, 2010;Megraw et al., 2011). Moreover, several studies have linked centriole and centrosome malfunction to defects in DNA damage repair (DDR) pathways (Megraw et al., 2011), and mutations in MCPH genes can also lead to more severe phenotypes in humans that may be related to DDR pathway malfunction (Al-Dosari et al., 2010;Kalay et al., 2011;Megraw et al., 2011).
Figure 5. A schematic representation of protein interactions within the inner region of the centriole. In this illustration, interactions whose crystal structures have been determined are highlighted by green boxes; all other interactions are inferred from biochemical and genetic studies and so are depicted in cartoon form. The cartwheel central hub comprises SAS-6 (red) (Kitagawa et al., 2011b;van Breugel et al., 2011). The spokes extending outward from the hub consist of a homodimeric SAS-6 coiled-coil, which extends (van Breugel et al., 2011) into a region known as the 'pinhead' (cyan in low magnification view, left), where CEP135 (grey) may act as a linker between SAS-6, CPAP and microtubules (Roque et al., 2012;Lin et al., 2013). CPAP (dark blue) localises more towards the periphery of the centriole (Mennella et al., 2012;Sonnen et al., 2012;Lukinavičius et al., 2013), where its N-terminal part may interact directly with both Asterless/CEP152 (Cizmecioglu et al., 2010;Dzhindzhev et al., 2010) (orange arrow) and microtubules (Hsu et al., 2008) (green arrow). In contrast, STIL (yellow) localises more towards the interior of centrioles (Arquint et al., 2012), and appears to function upstream of CPAP in centriole biogenesis (Tang et al., 2011;Vulprecht et al., 2012). Thus, we propose that the C-terminal TCP domain of CPAP interacts with the conserved region 2 (CR2) of STIL towards the interior of the centriole and that this interaction is crucial for CPAP/STIL function at centrioles. The orientation of STIL in centrioles is unknown. DOI: 10.7554/eLife.01071.015
A previous analysis of the behaviour of various CPAP mutant proteins (modelled on MCPH mutations) in human cells revealed some surprising findings (Kitagawa et al., 2011a). The deletion of the TCP domain or the mutation of E1235 to valine did not affect the localisation of CPAP to the centriole, although centriole duplication was compromised by both mutations. Moreover, overexpression of the E1235V mutant protein was able to promote centriole overgrowth to a greater extent than the WT protein, suggesting that it may have acquired some enhanced functionality. The structures we report here reveal that E1235 is one of the several residues involved in the binding interface with STIL, making an important sidechain-mainchain contact. This structural model explains how the E1235V mutation can compromise complex formation, and we have confirmed that this is the case with zebrafish proteins in vitro. Moreover, the equivalent mutation in flies leads to inefficient centriole assembly, but this process is not abolished. Taken together, our data strongly suggest that it is a partial failure in centriole assembly that is the primary cause of microcephaly in these patients. The challenge now is to understand how inefficient centriole assembly leads to microcephaly in humans.

Materials and methods
Recombinant protein expression and purification
D. rerio CPAP 937-1124 was cloned from D. rerio cDNA. Proteins were expressed in Escherichia coli BL21 (DE3) Rosetta as N-terminally His-tagged constructs, and purified via immobilised metal ion affinity chromatography (NiNTA; Qiagen, Hilden, Germany), proteolytic tag cleavage, followed (optionally) by size-exclusion chromatography and ion-exchange chromatography using standard methods. The selenomethionine derivative protein was expressed in selenomethionine supplemented M9 medium and purified in the same way. Purified constructs contained the sequence GPHM at the N-termini that stem from the cloning and protease cleavage sites.
D. rerio STIL 404-448 was cloned from IMAGE clone 7147918 and expressed in E. coli C41 BL21 (DE3), fused to two His-tagged lipoyl domains from Bacillus stearothermophilus dihydrolipoamide acetyltransferase at both the N- and C-terminus. The peptide was purified via NiNTA chromatography, proteolytic cleavage of the His-lipoyl domains, and ion-exchange chromatography. The purified constructs contained a G (GG for D. rerio STIL 404-448 and its point-mutants) at their N-terminus and the sequence EFGENLYFQ (ENLYFQ for D. rerio STIL  ) at their C-terminus. These extra sequences stem from the cloning and protease cleavage sites.
Mutations of the D. rerio constructs were introduced into the expression vectors by site-directed mutagenesis.
Codon-optimised (GeneArt, Carlsbad, CA) Drosophila dSTIL 1-47 was genetically fused to the N-terminus of Drosophila dCPAP 700-901 via 3-way ligation. The fusion protein was expressed in E. coli B834 (DE3) as an N-terminally His-tagged fusion, and purified via NiNTA chromatography, proteolytic tag cleavage and size exclusion chromatography. The selenomethionine derivative protein was expressed using SelenoMethionine Medium (Molecular Dimensions, Newmarket, UK) and purified in the same way.

Crystallisation
Native D. rerio CPAP 937-1124 was crystallised in sitting drops in 80 mM Tris pH 8.5, 160 mM MgCl 2 , 20% PEG-4000, 18% Glycerol, 1 mM DTT at 19.5°C. The drops were set up using 1 μl of the protein solution and 0.5 μl of the reservoir solution. Crystals were mounted after 3 days and flash-frozen in liquid nitrogen.
SeMet D. rerio CPAP 937-1124 crystals were obtained using the sitting drop method with a reservoir solution of 80 mM Tris pH 8.5, 160 mM MgCl 2 , 26% PEG-4000, 18% glycerol, 1 mM DTT at 19.5°C. Drops were set up using 1 μl protein solution and 1 μl of reservoir solution. Native CPAP 937-1124 crystals were used for streak-seeding into these drops and crystals allowed to grow for 7 days before mounting and flash-freezing them in liquid nitrogen.
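As a rough guide to the drop ratios described above, the following Python sketch estimates the starting concentration of each reservoir component in a freshly mixed drop. It assumes, purely for illustration, that the protein solution contributes none of the reservoir components and that volumes are additive. The function name `initial_drop_concentrations` and the dictionary keys are hypothetical and are not part of the original protocol.

```python
def initial_drop_concentrations(reservoir, protein_ul, reservoir_ul):
    """Scale each reservoir component by the reservoir's share of the drop volume.

    Assumption: the protein solution contains none of the listed components.
    """
    if protein_ul < 0 or reservoir_ul <= 0:
        raise ValueError("protein volume must be >= 0 and reservoir volume > 0")
    fraction = reservoir_ul / (protein_ul + reservoir_ul)
    return {name: conc * fraction for name, conc in reservoir.items()}


# Reservoir compositions as given in the text.
native_reservoir = {
    "Tris pH 8.5 (mM)": 80.0,
    "MgCl2 (mM)": 160.0,
    "PEG-4000 (%)": 20.0,
    "glycerol (%)": 18.0,
    "DTT (mM)": 1.0,
}
# The SeMet reservoir differs only in its PEG-4000 concentration.
semet_reservoir = {**native_reservoir, "PEG-4000 (%)": 26.0}

# Native drops: 1 ul protein + 0.5 ul reservoir; SeMet drops: 1 ul + 1 ul.
native_drop = initial_drop_concentrations(native_reservoir, 1.0, 0.5)
semet_drop = initial_drop_concentrations(semet_reservoir, 1.0, 1.0)

for label, drop in (("native", native_drop), ("SeMet", semet_drop)):
    print(label, {name: round(conc, 2) for name, conc in drop.items()})
```

Under these assumptions the native drops start at one third of the reservoir concentrations and the SeMet drops at one half.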
Native D. melanogaster dSTIL 1-47 -dCPAP 700-901 was crystallised using the sitting drop approach, using the Morpheus screen (Molecular Dimensions). Crystals grew after approximately 3 weeks (Table 5, 'Native'). Crystals were mounted after approximately 4 weeks. SeMet D. melanogaster dSTIL 1-47 -dCPAP 700-901 was initially crystallised using the Morpheus screen (Molecular Dimensions). Crystals typically grew after 3-4 weeks. Some crystals were used for microseeding of further screens including an optimisation screen. Seed stock was generated using a Seed bead kit (Hampton, Aliso Viejo, CA). Details of crystallisation conditions are shown in Table 5.

Data collection and processing
Native data were collected as described in Table 2. All D. rerio datasets were integrated and scaled using MOSFLM (Leslie and Powell, 2007) and Scala (Evans, 2006) respectively. The D. rerio CPAP 937-1124 structure was solved by MAD in CRANK (Ness et al., 2004;Cowtan, 2006), resulting in clear electron density into which an initial model was built using ArpWarp (Langer et al., 2008). Phenix.refine (Afonine et al., 2005) and REFMAC (Murshudov et al., 2011) were used to refine the model against the native dataset with manual building done in Coot (Emsley and Cowtan, 2004). D. rerio CPAP 937-1124 E1021V was solved by molecular replacement in Phaser (McCoy et al., 2007) using a poly-alanine model derived from the WT model. The model was further built and refined as described for the WT structure. The complex of D. rerio CPAP 937-1124 and D. rerio STIL 408-428 was solved by molecular replacement using Phaser (McCoy et al., 2007) with a distorted model of the D. rerio CPAP 937-1124 WT apo-structure. Refinement yielded clear density for the residues of STIL shown here. The model was further built and refined as described for the other D. rerio structures. D. melanogaster dSTIL 1-47 -dCPAP 700-901 data was scaled using Xia2 (Winter, 2010). Phasing was carried out using all SeMet datasets (Table 6) in autoSHARP (Vonrhein et al., 2007), using SHELXC/D (Sheldrick, 2008) for heavy atom finding, SHARP for site refinement/phasing and SOLOMON (Abrahams and Leslie, 1996) for density modification. This resulted in an experimental density map within which a CHAINSAW (Stein, 2008) model based on the D. rerio complex structure could be manually placed, using heavy atom sites as a guide. Experimental density corresponding to the dSTIL peptide could be easily seen. Further refinement cycles allowed the remaining copies of the monomer to be placed and trimmed. 
Refinement and model building were carried out in autoBUSTER (Bricogne et al., 2011) and Coot (Emsley and Cowtan, 2004) respectively.

Isothermal titration calorimetry (ITC) measurements
All ITC measurements were performed using an auto-iTC 200 instrument (GE Healthcare, Little Chalfont, UK) in 50 mM HEPES pH 7.5, 100 mM NaCl at 25°C. Samples were stored by the instrument in 96-well microtiter plates at 5°C prior to loading and performing the titrations. Standard experiments used 19 × 2 μl injections of STIL peptide into CPAP protein preceded by a single 0.5 μl pre-injection. Heat from the pre-injection was not used during fitting. Data were analysed manually in the Origin software package provided by the manufacturer and fit to a single set of binding sites model. All measurements were corrected using control ITC experiments in which the peptide studied was injected into buffer only. The small endothermic heats of injection in these experiments were fitted to a linear function that was subsequently subtracted from the equivalent integrated heats of the peptide-protein binding experiment before fitting. The concentration of CPAP in the cell was typically 40 μM but varied maximally between 20 and 100 μM. The concentration of STIL used in the syringe was typically 700 μM but varied maximally between 600 and 2600 μM depending on the affinity of the peptide interaction being studied.
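To illustrate the injection schedule and the single set of binding sites model described above, the following Python sketch simulates the heat expected per injection for a 1:1 interaction. It is a simplified illustration and not the fitting routine of the Origin package: the cell is treated as a simple non-overflowing volume, and the cell volume (200 μl), the dissociation constant (1 μM) and the binding enthalpy (-10 kcal/mol) are placeholder assumptions rather than measured values. All function names are hypothetical.

```python
import math


def complex_conc(p_total, l_total, kd):
    """Equilibrium concentration of a 1:1 complex (all inputs in the same units)."""
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def injection_volumes(n=19, vol_ul=2.0, pre_ul=0.5):
    """One pre-injection followed by n equal injections, in microlitres."""
    return [pre_ul] + [vol_ul] * n


def simulate_heats(cell_uM, syringe_uM, kd_uM, dh_kcal_per_mol, cell_volume_ul=200.0):
    """Heat per injection in microcalories.

    Simplification (assumption): the injected volume is added to the cell
    volume and nothing overflows.
    """
    heats = []
    injected_ul = 0.0
    bound_pmol_prev = 0.0
    for vol in injection_volumes():
        injected_ul += vol
        total_ul = cell_volume_ul + injected_ul
        p_total = cell_uM * cell_volume_ul / total_ul   # diluted protein, uM
        l_total = syringe_uM * injected_ul / total_ul   # accumulated peptide, uM
        bound_pmol = complex_conc(p_total, l_total, kd_uM) * total_ul  # uM * ul = pmol
        # kcal/mol * pmol = 1e-3 microcalories
        heats.append(dh_kcal_per_mol * (bound_pmol - bound_pmol_prev) * 1e-3)
        bound_pmol_prev = bound_pmol
    return heats


# Typical concentrations from the text; Kd and enthalpy are placeholders.
heats = simulate_heats(cell_uM=40.0, syringe_uM=700.0, kd_uM=1.0, dh_kcal_per_mol=-10.0)
fitted_heats = heats[1:]  # the pre-injection is excluded from fitting, as in the text

print("total injected volume (ul):", sum(injection_volumes()))
print("first and last fitted heats (ucal):", round(fitted_heats[0], 2), round(fitted_heats[-1], 3))
```

In a real analysis the control heats (peptide injected into buffer) would be subtracted from these integrated heats before fitting, as described above.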

Rescue experiments
All constructs were tested for their ability to rescue the uncoordinated phenotype, which is a feature of flies lacking centrioles (Basto et al., 2006). For that purpose, the different versions of dCPAP-GFP and dSTIL-GFP were either crossed into the dCPAP or dSTIL 169 /dSTIL2 719 mutant background, and desired pupae were collected from vials and transferred to filter paper (Whatman, Maidstone, UK) for analysis.

Immunohistochemistry on third instar larval brains and centrosome quantification
Brains were dissected, squashed, and stained as previously described (Stevens et al., 2009). The following antibodies were used to stain centrosomes in third instar larval brain cells: sheep anti-Centrosomin (Cnn, directed against the N-terminus, 1:1000, [Lucas and Raff, 2007] but raised in sheep), guinea pig anti-Asterless (Asl, 1:500, [Conduit et al., 2010] but raised in guinea pig). Secondary antibodies conjugated to either Alexa Fluor 488 or Alexa Fluor 568 (Life Technologies) were used at 1:1000. Hoechst33258 (Life Technologies) was used to visualise DNA (1:5000). Centrosomes were counted on a Zeiss Axioskop 2 microscope (Zeiss, Oberkochen, Germany). Only metaphase brain cells that stained for both Asl and Cnn were scored; DNA morphology was used to identify cells at the desired stage of the cell cycle. Furthermore, the assessment of centriole loss was performed blind. Microsoft Excel was used to analyse the data. Images were acquired in MetaMorph (Molecular Devices) using a CoolSNAP HQ camera (Photometrics, Tucson, AZ) and processed using ImageJ/Fiji (www.fiji.sc/Fiji, [Schindelin et al., 2012]), Gimp (www.gimp.org/) and Inkscape (www.inkscape.org/) for figure assembly.

Live imaging of embryos
Embryos expressing the different GFP-tagged versions of dCPAP and dSTIL were dechorionated manually and mounted in a Glass Bottom Microwell Dish (MatTek, Ashland, MA) using heptane glue. Embryos were covered with Voltalef oil and followed by time-lapse spinning disc microscopy on a Perkin Elmer spinning disc microscope (Perkin Elmer, Waltham, MA). Images were acquired with an Orca ER charge-coupled device camera (Hamamatsu Photonics, Hamamatsu, Japan) using UltraView ERS (Perkin Elmer) and processed and analysed in Volocity (Perkin Elmer).
Double-stranded sas-4 RNA was generated as described (Oegema et al., 2001) using DNA templates prepared by PCR. For experiments to quantify monopolar spindle formation, L4 hermaphrodites were injected with dsRNA and incubated at 20°C for 40 hr prior to dissection for imaging. For lethality assays, worms were maintained at 20°C. L4 hermaphrodites were injected with dsRNA and singled 24 hr post-injection. Adult worms were removed from the plates 48 hr post-injection, and hatched larvae and unhatched embryos were counted 24 hr later.
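The lethality assay above yields two counts per plate: hatched larvae and unhatched embryos. A common way to summarise such counts, assumed here rather than stated in the text, is the percentage of all scored progeny that failed to hatch. The following Python sketch shows that calculation. The function name `embryonic_lethality_percent` and the example counts are hypothetical.

```python
def embryonic_lethality_percent(hatched, unhatched):
    """Percentage of scored progeny that failed to hatch.

    Assumption: lethality = 100 * unhatched / (hatched + unhatched).
    """
    if hatched < 0 or unhatched < 0:
        raise ValueError("counts must be non-negative")
    total = hatched + unhatched
    if total == 0:
        raise ValueError("no progeny were scored")
    return 100.0 * unhatched / total


# Hypothetical counts for one plate, for illustration only.
example = embryonic_lethality_percent(hatched=25, unhatched=75)
print(f"embryonic lethality: {example:.1f}%")
```

Counts from several singled worms could be pooled before applying the same formula, or the percentage could be computed per worm and then averaged; the text does not specify which.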
For light microscopy to identify monopolar or bipolar second division cells, images were acquired using an inverted Zeiss Axio Observer Z1 system with a Yokogawa spinning-disk confocal head (CSU-X1), a 63X 1.4 NA Plan Apochromat objective, and a QuantEM:512SC EMCCD camera (Photometrics). Adult worms were dissected in M9 buffer, and embryos were mounted onto 2% agarose pads for imaging. 11 × 1 μm z-stacks were collected in the GFP channel (100 ms, 20% power, no binning), along with one central DIC section.

SAS-4/SAS-5 pull-down experiments
SAS-4 constructs were cloned into a pET21a vector for in vitro transcription/translation. Proteins were expressed using the T7 TNT Quick Coupled Transcription/Translation System (Promega, Fitchburg, WI) with 35S-Met labelling.
SAS-5 fragments were cloned into a pRSET-A vector with a C-terminal 6xHis tag. Proteins were expressed in E. coli Rosetta2(DE3) cells and purified on Ni-NTA agarose (Qiagen) using standard protocols. For pull-down experiments, proteins were dialysed into 25 mM HEPES, 100 mM NaCl, 20 mM imidazole, 1 mM DTT, 10% sucrose, 0.02% Tween-20, pH 7.4. SAS-5 fragments were pre-incubated with 20 μl Ni-NTA beads for 45 min at 4°C. 10 μl of the SAS-4 IVTT product was added to the beads with 190 μl buffer and incubated at 4°C for 30 min. The beads were washed with 3 × 200 μl buffer and resuspended in 100 μl SDS-PAGE sample buffer. Samples were run on 10% SDS-PAGE gels and either stained with Coomassie or dried and exposed to a phosphor screen overnight. Phosphor screens were analysed on a Personal Molecular Imager System (Bio-Rad, Hercules, CA).