Architecture of the Tuberous Sclerosis Protein Complex


Introduction
Tuberous sclerosis complex (TSC) is an autosomal dominant disease characterised by benign tumours in multiple organs. 1 It is caused by mutations in either of the genes TSC1 or TSC2, which encode the 130 kDa TSC1 and the 200 kDa TSC2 tumour suppressor proteins, respectively. TSC1 contains an N-terminal α-helical 'core' domain and a C-terminal coiled-coil which is required for binding TSC2. 2-4 TSC2 contains a long α-solenoid domain at the N-terminus and a C-terminal GTPase activating protein (GAP) domain, which is the sole catalytically active domain in the complex. Together with a small third subunit, TBC1D7, 5 TSC1 and TSC2 assemble to form the TSC protein complex (TSCC).
TSCC signalling restricts cell growth by negatively regulating mTORC1, the central coordinator of metabolism. 6,7 Directly upstream of mTORC1, Rheb, a small GTPase localized to lysosomes through C-terminal farnesylation, 8 stimulates mTORC1 kinase activity when GTP-bound. 9 The TSCC stimulates Rheb GTPase activity, accumulating the GTPase in the inactive, GDP-bound state to suppress mTORC1. 10 Spatial regulation of TSCC between the cytoplasm and lysosome is known to be pivotal for its function as a Rheb-GAP, with the current understanding being that the TSCC translocates to the lysosome surface to catalytically and sterically inhibit mTORC1 by binding to Rheb and sequestering it. 11,12 This translocation is reversed on TSC2 phosphorylation by AKT, 11 and other kinases, which are thought to regulate localisation through an unknown mechanism involving 14-3-3 binding, 13,14 as well as TSCC breakdown by ubiquitination-targeted TSC2 degradation. 15,16
The architecture of the TSCC remains completely unknown, although small fragments of the complex have been structurally characterised. The core domain of S. pombe TSC1, 17 an N-terminal α-solenoid fragment of C. thermophilum TSC2, 18 and most recently the CtTSC2 GAP domain 19 have been resolved crystallographically. Furthermore, two co-crystal structures of TBC1D7 interacting with C-terminal coiled-coil fragments of TSC1 have been determined. 20,21 In this study, we have used cryogenic electron microscopy (cryo-EM) to examine the molecular architecture of the full-length, human holo-TSCC.

Results
We cloned human TSC1, TSC2, and TBC1D7 for expression in human embryonic kidney cells, and Rheb for expression in Escherichia coli. While both TBC1D7 and TSC1 could be expressed and purified independently, TSC2 could not be purified in the absence of TSC1 (Supp. Figure 1(A)), instead forming inclusions or being degraded in the cell, consistent with the role of TSC1 in preventing TSC2 degradation. 15,16 We retrieved the complete human TSCC from lysate using FLAG-tagging, and purified the TSCC from most remaining contaminants by size-exclusion chromatography (SEC) (Supp. Figure 1(B)). Full-length TSCC yielded a broad peak with an estimated mass of 5200 kDa on SEC, due to oligomerisation through inter-TSCC interactions (Supp. Figure 1(C)). A clone carrying an internal deletion, TSC1(Δ400-600), was generated (Supp. Figure 1(D)) to minimise inter-complex interactions 2,23 for SEC-MALLS, yielding a defined peak with a molecular weight estimated by multi-angle laser light scattering of 700 kDa (Supp. Figure 1(E)), roughly corresponding to a 2:2:1 TSC1:TSC2:TBC1D7 composition. Both full-length and TSC1(Δ400-600) TSCC exhibited physiological GAP activity towards Rheb (Supp. Figure 1(F)).
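As a rough consistency check on the proposed stoichiometry, the expected mass of a candidate assembly can be compared with the 700 kDa light-scattering estimate. The sketch below uses the subunit masses quoted in the Introduction (130 kDa TSC1, 200 kDa TSC2) and the TBC1D7 length given in the Methods (293 residues). The average residue mass of 0.110 kDa, used to estimate TBC1D7 and the mass removed by the 400-600 deletion, is an assumption made for illustration, as are the function and constant names; affinity tags are ignored.

```python
# Rough mass check for candidate TSC1:TSC2:TBC1D7 stoichiometries.
# Assumption (not from the paper): average residue mass of 0.110 kDa.
AVG_RESIDUE_KDA = 0.110

TSC1_KDA = 130.0                                   # full-length TSC1, as quoted in the text
TSC2_KDA = 200.0                                   # full-length TSC2, as quoted in the text
TBC1D7_KDA = 293 * AVG_RESIDUE_KDA                 # 293 residues, estimated mass
DELETION_KDA = (600 - 400 + 1) * AVG_RESIDUE_KDA   # residues 400-600 removed from TSC1


def complex_mass(n_tsc1, n_tsc2, n_tbc1d7, tsc1_deleted=True):
    """Return the expected mass (kDa) of an assembly with the given copy numbers."""
    tsc1 = TSC1_KDA - DELETION_KDA if tsc1_deleted else TSC1_KDA
    return n_tsc1 * tsc1 + n_tsc2 * TSC2_KDA + n_tbc1d7 * TBC1D7_KDA


if __name__ == "__main__":
    observed = 700.0  # kDa, SEC-MALLS estimate reported in the text
    for copies in [(1, 1, 1), (2, 2, 1), (2, 2, 2)]:
        for deleted in (True, False):
            mass = complex_mass(*copies, tsc1_deleted=deleted)
            label = "deletion" if deleted else "full-length"
            print(copies, label, f"{mass:.1f} kDa", f"({mass - observed:+.1f} vs observed)")
```

Under these assumptions a 2:2:1 assembly comes to roughly 650 kDa with the deletion construct and roughly 690 kDa with full-length TSC1, so the calculation is consistent with, but does not by itself prove, the "roughly 2:2:1" assignment.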
We initially investigated the molecular architecture of the TSCC by negative staining (Supp. Figure 2). We observed extremely flexible, independent particles; however, three defined ordered regions could be isolated, and two-dimensional averaging of these regions provided a complete picture (Figure 1(A)). The TSCC was extraordinarily elongated (~400 Å) and exhibited a characteristic "scorpion" shape, with a bulkier central "body", a flexible "pincer"-like appendage at one end, and a barbed "tail" at the other. Cryo-EM of TSCC at low concentrations revealed identical particles (Supp. Figure 3(A) and (B)), whereas at higher concentrations we observed web-like networks which appear to be formed of head-to-tail TSCC particles (Supp. Figure 3(C)). Once again, complete TSCC particles were too flexible to average beyond low resolution. We isolated the same regions; both the "pincer" and "tail" proved to be strongly preferentially oriented and flexible, refining only to low-intermediate resolution (8.1 Å and 8.2 Å respectively) (Figure 1(B), Supp. Figure 4, Table 1). Because of the preferred orientation of TSCC particles on the grid, it was not possible to define a reliable persistence length for the TSCC HEAT repeat; however, it is clear that the tips are substantially more flexible than the more proximal regions, as would be expected. Recentring from the position of the body to the barb entailed a ~7 Å root-mean-squared deviation, while recentring from the body to the pincer regularly placed the resulting centres beyond the radius of convergence (i.e. at least 30 Å error). The body of the TSCC exhibited pseudo-C2 symmetry, and was refined in C2 initially, before symmetry was relaxed for a final C1 structure at 4.6 Å (Figure 1(B), Supp. Figure 4, Table 1).
The resolution of the "body" was high enough to trace chains and identify all secondary structural elements, but too low to definitively assign sequence. The published structure of the TSC2 GAP domain 19 fitted unambiguously into density adjoining a central α-solenoid on each side of the origin of C2 pseudo-symmetry (Supp. Figure 5(A)). At the juncture of the two α-solenoids, we observed a dimerisation interface comprising two back-to-back β-sheets. The topology and function of this dimerisation domain are conserved from the N-terminal domain of the RapGAP fold 22 (Supp. Figure 5(B)), although a domain swap between the two TSC2 molecules cannot be ruled out, as the regions of secondary structure are separated by hundreds of disordered residues (Supp. Figure 6). The interface formed is not conserved, however, as TSC2 dimerises through back-to-back β-sheets instead of the end-on arrangement found in RapGAP. With the exception of the GAP domain, the only remaining β-elements predicted within the sequence of any TSC protein are at the C-terminus of the α-solenoid of TSC2 (Supp. Figure 6), implying that the long disordered regions containing many of the phosphorylation sites regulating the TSCC are insertions within the C-terminal GAP domain. Our results are consistent with the TSC2 α-solenoids running outwards from C-terminus to N-terminus from the dimerisation site, and indeed there is a good fit of the TSC2 N-terminal HEAT repeat structure into the end of each of the "pincer" and "tail" (Figure 1(B), Supp. Figure 5(C) and (D)). The C2 symmetry of the TSC2 dimer is broken by two helices running directly across the top of the RapGAP-like dimerisation domain. This helical density forms a weakly connected "backbone" running over both GAP domains, and along the TSC2 α-solenoid outwards to both the "pincer" and "tail".
We assign this continuous helical coiled-coil as that from the C-terminal regions of TSC1, implying that the two ends are its N- and C-termini respectively. The "pincer" density is uninterpretable; however, the density corresponding to the "barb" lying alongside the α-solenoid of the "tail" is completely separated from the remaining density, allowing it to be interpreted independently, and it unambiguously matches the TSC1-TBC1D7 structure 21 (Supp. Figure 5(D)). Under the reasonable assumption that this corresponds to its known binding site on the TSC1 coiled-coil, we can therefore assign the orientation of the TSC1 dimer, implying that the "pincer" is made up of the TSC1 HEAT domains.
Under the reasonable assumption that the GAP-Rheb interaction will mirror that of the published Rap-RapGAP complex, 22 we have docked Rheb accordingly (Supp. Figure 7(A)). The natural docking yields no clashes with the current structure, and implies a further interaction with the two helices adjoining the GAP domain which are conserved from the RapGAP fold (Figure 2(B)). We note that the Rheb farnesylation sites would both be situated on the same side of the TSCC, consistent with this being the correct orientation for lysosomal binding.

Figure 1. The TSC protein complex is an elongated, flexible, scorpion-like complex with a defined "pincer", "body", and barbed "tail". (A) Electron micrograph of a negatively stained TSCC particle on a carbon support, electron micrograph of a TSCC particle frozen within vitreous ice on a graphene oxide support, and composite 2D average image of the TSCC from the windowed regions of vitrified particles as indicated. The same regions, "pincer" (chartreuse), "body" (cerise), and "tail" (cerulean), are indicated through dashed boxes of the appropriate colour in both the representative particle image and the composite 2D class-average representation. (B) The overall structure of TSCC at low resolution (centre) and the refined densities corresponding to each region of the TSCC (indicated by boxes) are shown. In each case the reconstructed electron scattering density is shown as a transparent isosurface, while the corresponding fitted molecular structures (the TSC2 N-terminal HEAT repeat, 18 the TBC1D7-TSC1 complex, 20,21 the TSC2 GAP domain, 19 and the RapGAP dimerisation domain 22), and secondary structural elements in the case of the body, are shown in cartoon representation where available and practicable. The reconstructions have been rotated by 90° in the second panel as indicated.

Discussion
We show that the TSCC forms an elongated, flexible architecture, comprising two copies of each of TSC1 and TSC2 and one of TBC1D7. The orientation of Rheb implied by the GAP domains (Figure 2(B)) matches the slight curvature of the complex, and the lysosomal membrane will therefore lie on the opposite side of the TSCC from the TSC1 backbone. While we were preparing this manuscript, a study reporting the structure of a glutaraldehyde-crosslinked full-length TSCC was published as a preprint by Yang and colleagues (https://www.researchsquare.com/article/rs-36453/v1). Their results are congruent with our own, although their structure is reported at higher resolution, allowing a full atomic model to be generated. The super-structure observed forming at higher concentrations (Supp. Figure 3(B)) may well play a part in retaining TSCC at the lysosome and reducing the off-rate once Rheb signalling has been suppressed, as previously predicted. 11 Further structural investigation of these inter-TSCC interactions, likely mediated by the TSC1 termini, 2,23 is required to understand the higher-order organisation of TSCC and its role in mTORC1 regulation.
The RapGAP-like domain of TSC2 forms a dimer, as reported by Scrima and colleagues, 22 providing the centre of pseudosymmetry of the TSCC. The C2 symmetry of each of the dimeric TSC proteins is broken by the presence of the other, that of TSC1 by the curvature of its coiled-coil along the TSC2 α-solenoids, and that of TSC2 by the involvement of TSC1 in its dimerisation. We have confirmed once again that while TSC1 can fold independently of TSC2, the reverse does not occur. 15,16,24 Our architecture suggests a structural explanation for this observation: direct TSC1 involvement in the TSC2 dimerisation interface. The previously observed breakdown of TSC2, following ubiquitination in the absence of TSC1, 15,16 would be expected when structural elements cannot fold appropriately in the absence of their partner. This would also constrain the presence of functional TSC2 to subcellular regions containing TSC1 dimers.
In our structure, the N-terminal part of the TSC1 coiled-coil is found to interact with the N-terminal HEAT domain of TSC2. Interestingly, this is the most highly conserved region of TSC1 among TSC1 homologues (Supp. Figure 8).
The Ras superfamily of GTPases comprises five families: Ras, Rho, Ran, Rab, and Arf. The Ras family is further divided into six subfamilies: Ras, Rap, Rheb, Ral, Rad, and Rit. By comparing the sequences of Rheb homologues and other Ras family GTPases, we found three Rheb-specific residues, R15, I69, and K102, which are conserved in Rheb homologues but differ in all other Ras family, or subfamily, GTPases (Supp. Figure 9). Interestingly, these Rheb-unique residues all point toward TSC2-GAP in our model (Figure 2, Supp. Figure 7(B)), and GTPase assay data suggest that these residues are important for the TSC2-GAP interaction (Supp. Figure 7(C)). In complete agreement with the conclusions of the Manning group, 11 the presence of the TSCC will both catalytically and sterically prevent Rheb-mTORC1 interactions during its GAP activity (Figure 2(B)), rotating the Rheb pair in relation to its position when interacting with mTORC1. The TSCC has been proposed to sequester GDP-Rheb after hydrolysis, which would be expected to occur at a different site from the GAP domains, and possibly with the TSCC in a different conformation due to association with the lysosomal membrane, as the catalytic complex will by its nature be transient. We believe that one of the more interesting observations from our results is that TSC2 binds Rheb as such a pair, as does mTORC1. Despite the fact that they are completely different in architecture and approach from different directions, the TSCC GAP domains are poised to bind two copies of Rheb at an almost identical separation to that resolved for Rheb in the structure of activated mTORC1 (Figure 2(B)). While it is possible that this is entirely happenstance, this would also be expected were Rheb bound by each partner as part of a greater, at least dimeric, complex on the lysosomal surface.
Our improved architectural understanding of the TSCC provides a starting point for the investigation of the molecular mechanisms by which TSCC directly regulates Rheb, and poses new questions on the nature of the superstructures formed by TSCC complexes, their partners, and the involvement of such quaternary structures in mTORC1 regulation.

Materials and Methods
Protein expression and purification
pRK7 plasmids subcloned with FLAG-tagged full-length (FL) human TSC1 (1164 amino acids, UniProtKB/Swiss-Prot accession number Q92574-1) and FLAG-tagged FL human TSC2 (1807 amino acids, UniProtKB/Swiss-Prot accession number P49815-1) were purchased from Addgene, and pRK7 was subcloned with FLAG-tagged human TBC1D7 (293 amino acids, GenBank accession number AAH07054). FL TSC1-TSC2-TBC1D7 (TSCC FL) plasmids, or TSC1(Δ400-600)-TSC2-TBC1D7 (TSCC 1D) plasmids, were co-transfected into human embryonic kidney (HEK) Expi293F cells (Thermo Fisher Scientific, Waltham, MA, USA). Two days after transfection, the harvested Expi293F cells were lysed by three cycles of freeze-thaw in lysis buffer (20 mM Tris, pH 8.0, 300 mM NaCl, 2 mM TCEP, 0.5 mM PMSF, 1 µg/ml aprotinin, and 1 µg/ml leupeptin), and TSCC was purified from the cell lysate by M2 anti-FLAG affinity chromatography (Sigma) followed by size exclusion chromatography using a Superose 6 column (GE Healthcare) pre-equilibrated with buffer containing 20 mM Tris, pH 8.0, 300 mM NaCl, and 2 mM TCEP. The identity of each TSCC component was verified by ESI-MS (Mass Spectrometry Facility, University of St. Andrews).

Figure 2. [...] CtTSC2 GAP residues corresponding to human TSC2 residues targeted by tumorigenic mutations in tuberous sclerosis are shown in stick representation and coloured in orange, labelled with human TSC2 residue numbers. The three Rheb residues identified to be conserved in Rheb homologues and Rheb-specific among Ras family GTPases are displayed in stick representation and coloured in red. The second panel has been rotated by 180° as indicated. (B) The docked fit of Rheb against the TSC2 GAP domain, based on the structure of the RapGAP-Rap1 complex, within the "body" of the TSCC, in comparison to its fit in the Rheb-activated structure of mTORC1. The secondary structural elements of the TSCC, and the molecular structure of mTORC1, are shown in cartoon representation.

GTPase activity endpoint assay
The GTPase activity of Rheb was assayed using the QuantiChrom ATPase/GTPase assay kit (BioAssay Systems), in which the amount of released inorganic phosphate was measured through a chromogenic reaction with malachite green. In the assays, 75 nM Rheb, either alone or mixed with 227.5 nM TSCC FL, was added to the reaction buffer (40 mM Tris, pH 8.0, 80 mM NaCl, 8 mM magnesium acetate, 1 mM EDTA and 14 mM GTP) at 28 °C for 40 min. A further 200 µL of assay kit reagent was then added, and the reaction incubated for 20 min, before the absorbance at 620 nm (OD620) was read on a microplate reader. Spontaneous GTP hydrolysis was calculated by measuring background absorbance in the absence of Rheb, and sample values were normalised by subtraction of background. Each experiment was repeated three times.
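The background subtraction and triplicate averaging described above can be sketched as follows. This is a minimal illustration rather than the analysis script used in the study: the function names are hypothetical, and the OD620 readings in the usage example are invented placeholders, not measured data.

```python
from statistics import mean, stdev


def normalise_od620(sample_reads, background_reads):
    """Subtract the mean no-Rheb background from each replicate OD620 reading."""
    background = mean(background_reads)
    return [od - background for od in sample_reads]


def summarise(sample_reads, background_reads):
    """Return (mean, sample standard deviation) of the background-subtracted readings."""
    corrected = normalise_od620(sample_reads, background_reads)
    return mean(corrected), stdev(corrected)


def fold_stimulation(rheb_alone, rheb_plus_tscc, background_reads):
    """Ratio of the mean corrected signal with TSCC to that of Rheb alone."""
    alone, _ = summarise(rheb_alone, background_reads)
    with_tscc, _ = summarise(rheb_plus_tscc, background_reads)
    return with_tscc / alone


if __name__ == "__main__":
    # Placeholder triplicate readings, for illustration only.
    background = [0.10, 0.10, 0.10]
    rheb_alone = [0.20, 0.22, 0.18]
    rheb_plus_tscc = [0.50, 0.52, 0.48]
    print("Rheb alone:", summarise(rheb_alone, background))
    print("Rheb + TSCC:", summarise(rheb_plus_tscc, background))
    print("Fold stimulation:", fold_stimulation(rheb_alone, rheb_plus_tscc, background))
```

Converting corrected absorbance into an amount of phosphate would additionally require a phosphate standard curve, which is not described in the text and is therefore omitted here.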

Size exclusion chromatography-multi-angle laser light scattering (SEC-MALLS)
TSCC 1D was analysed by SEC-MALLS using an Infinity liquid chromatography system (Agilent Technologies), linked to a Dawn Heleos multi-angle light scattering detector (Wyatt Technology) and Optilab T-rEX refractive index detector (Wyatt Technology). The sample was injected onto a Superose 6 10/300 size exclusion column (GE Healthcare) pre-equilibrated overnight with buffer containing 25 mM K-HEPES, pH 7.6, 250 mM KCl, 0.5 mM EDTA, 1 mM TCEP and trace amounts of NaN3, using a 0.2 mL/min flow rate at room temperature. In-line UV absorbance, light scattering and refractive index measurements were analysed using the ASTRA software package (Wyatt Technology) to determine molar mass estimates. The TSC1(Δ400-600) internal deletion construct was used for SEC-MALLS as prior studies have shown that this region contributes to higher-order oligomerisation or aggregation of the complex, 2,23 and therefore would confound attempts to derive a molecular weight estimate for the core complex.

Sample preparation for cryo-EM studies
TSCC FL, after the above purification steps, was loaded onto a Superose 6 10/300 size exclusion chromatography column (GE Healthcare) pre-equilibrated with a preparation buffer containing 25 mM K-HEPES, pH 7.6, 175 mM KCl or 150 mM LiCl, 1 mM TCEP, and 0.5 mM EDTA. TSCC eluted as a single peak with a slight shoulder at lower retention volume. The integrity of the complex was confirmed by SDS-PAGE of both the peak and shoulder fractions. Main peak fractions were combined and concentrated to 0.1-0.2 mg/mL using Amicon 100 kDa molecular weight cut-off (MWCO) centrifugal filters and used for grid preparation.

Generation of an initial TSCC reference density
A sample of concentrated wild-type TSCC FL was applied to a carbon-coated holey carbon grid (R1.2/1.3, Quantifoil) and stained with 2% (w/v) uranyl acetate. A total of 224 micrographs were collected using an FEI Tecnai T12 electron microscope (Thermo Fisher Scientific, Waltham, MA, USA) at a magnification of 81,000-fold, an acceleration voltage of 120 kV, and a total dose of 50 e−/Å² over a defocus range of −0.5 to −2.0 µm. A dataset of 9597 particles was selected manually using BOXER. The parameters of the contrast transfer function were determined using CTFFIND4. Particles were classified in two dimensions into 100 classes using RELION, and sixteen well-defined classes were selected for initial three-dimensional reconstruction. Initial models were created using the initial model functions in EMAN2, refined in three dimensions at low resolution using SPIDER, then filtered to 60 Å and used as an initial reference for automatic refinement in RELION. The resulting initial model, at a resolution of 26 Å at an independent-volume Fourier Shell Correlation (FSC) of 0.143, was used for further refinement.

TSCC cryo-EM sample preparation
Samples of concentrated TSCC protein complexes were adsorbed to a thin film of graphene oxide deposited upon the surface of holey carbon copper grids (R2/1, 300 mesh, Quantifoil). Grids were blotted for 2-3 s before plunge freezing in liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific, Waltham, MA, USA) at 4°C and 100% humidity.

TSCC cryo-EM data collection
Data for TSCC FL were collected on a Titan Krios (Thermo Fisher Scientific, Waltham, MA, USA) at the Electron Bioimaging Centre (eBIC, Diamond Light Source), equipped with a K2 Summit direct electron detector (GATAN, San Diego, USA) and operated at 300 kV, 37,000-fold magnification and with an applied defocus range of −0.75 to −3.25 µm. Frames were recorded automatically using EPU, resulting in 5387 images of 3838 by 3710 pixels with a pixel size of 1.35 Å on the object scale. Images were recorded in two successive datasets (of 1880 and 3507 images, respectively) as either 40 or 60 separate frames in electron counting mode, comprising a total exposure of 52.3 or 80.2 e−/Å², respectively. A separate dataset of TSCC FL, which was not used for the determination of the final structure reported, was collected on a Titan Krios at the London Consortium for Cryo-EM (Francis Crick Institute, London, UK) equipped with a K3 direct electron detector (GATAN, San Diego, USA) and Volta phase plate. The microscope was operated at 300 kV and 81,000-fold magnification, with an applied defocus range of −1 to −3.25 µm. 4973 images were recorded, of 5760 by 4092 pixels, with a pixel size of 1.1 Å on the object scale.
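Several derived quantities follow directly from the acquisition parameters above, and a short calculation makes them explicit. The sketch below computes the dose per frame, the field of view, and the Nyquist limit from the values quoted in the text; the function names are illustrative only.

```python
def dose_per_frame(total_dose, n_frames):
    """Electron dose per movie frame (e-/A^2), assuming it is spread evenly over frames."""
    return total_dose / n_frames


def field_of_view(n_pixels, pixel_size):
    """Image extent along one axis, in Angstrom."""
    return n_pixels * pixel_size


def nyquist_limit(pixel_size):
    """Finest resolvable spacing (Angstrom) for a given pixel size: two pixels."""
    return 2.0 * pixel_size


if __name__ == "__main__":
    # K2 datasets used for the reported structure (values from the text).
    for total, frames in [(52.3, 40), (80.2, 60)]:
        print(f"{frames} frames: {dose_per_frame(total, frames):.3f} e-/A^2 per frame")
    print(f"Field of view: {field_of_view(3838, 1.35):.1f} x {field_of_view(3710, 1.35):.1f} A")
    print(f"Nyquist limit at 1.35 A/pixel: {nyquist_limit(1.35):.2f} A")
    # The two successive datasets should add up to the total image count.
    print("Total images:", 1880 + 3507)
```

At 1.35 Å per pixel each image spans roughly 5200 by 5000 Å, which is more than ten times the ~400 Å length of the complex, and the 2.7 Å Nyquist limit lies well beyond the resolutions reached in this study.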

TSCC cryo-EM data processing
Frames were aligned, summed and weighted by dose according to the method of Grant and Grigorieff using MotionCor2 25 to obtain a final image. Poor-quality micrographs were rejected based on diminished estimated maximum resolution on CTF estimation using CTFFIND4 26 and visually based on irregularity of the observed Thon rings.
Particles were selected using BATCHBOXER, 27 and refinement thereafter performed using RELION3. 28,29 Two-dimensional reference-free alignment was performed on ~1,500,000 initially boxed particles to exclude those that did not yield high-resolution class averages and to identify the principal ordered regions of the TSCC molecule. Of these, 395,622 particles populated classes extending to high resolution and were retained for further refinement.
TSCC proved to be highly preferentially oriented on the grid; however, it was possible to identify 2D classes for each of the "body", "pincer" and "tail" regions. Iterated re-centring, two-dimensional refinement, and re-boxing using the neural network particle picker Topaz 30 were performed from the "body" region outwards in order to recover enough "pincer" and "tail" views to provide a complete description and definitive topology for all three regions.
Particles belonging to the "body" frequently displayed C2 symmetry in 2D class averages, and this particle subset was refined in three dimensions using this symmetry restraint. After several iterations of re-picking particles using Topaz 30 and refinement, the final gold-standard refinement of the "body", including 172,093 particles, reached 4.2 Å at an independent FSC = 0.143. The symmetry was subsequently relaxed to C1 and refined (gold-standard) to 4.5 Å resolution at an independent FSC = 0.143. Reconstructions were also performed of the "pincer" and "tail" regions, from 15,854 and 58,307 particles respectively; however, these suffered from persistent highly preferred orientation and conformational flexibility, with gold-standard refinements reaching 8.1 Å and 8.2 Å resolution, respectively, at an independent FSC = 0.143.
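The resolutions quoted above are read off where the Fourier shell correlation between independently refined half-maps falls below 0.143. As a minimal sketch of that final step only (not of the FSC calculation itself, which RELION performs), the function below takes a precomputed FSC curve and linearly interpolates the first crossing of the threshold. The function name and the toy curve in the usage example are assumptions for illustration.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (Angstrom) at the first drop of the FSC below threshold.

    freqs: spatial frequencies in 1/Angstrom, in ascending order.
    fsc:   FSC value for each shell, same length as freqs.
    Returns None if the curve never falls below the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i] < threshold <= fsc[i - 1]:
            # Linear interpolation between the two shells bracketing the crossing.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


if __name__ == "__main__":
    # Toy curve for illustration only; not data from this study.
    freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
    fsc = [0.99, 0.95, 0.80, 0.40, 0.10, 0.02]
    print(resolution_at_threshold(freqs, fsc))
```

Because the estimate depends on where the curve happens to cross the threshold, small differences such as 8.1 Å versus 8.2 Å between the "pincer" and "tail" should not be over-interpreted.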

Architectural model of TSCC
The hand of the structure could be assigned from the maps based on the handedness of α-helices in the core of the body. We were also able to confirm the handedness of the density by determining the best possible fits of the known homologous structures into the density in either hand and then by local optimisation in Chimera. Maximal real-space CC values were 0.65 versus 0.58 for the GAP, and 0.55 versus 0.48 for the RapGAP dimerisation domain, into the correct hand versus the incorrect hand in each case. A poly-UNK secondary structural model of the TSCC "body" HEAT repeat sections and putative TSC1 coiled-coil helical sections was built using COOT. 31 The C. thermophilum TSC2-GAP structure (PDB ID: 6SSH) 19 was unambiguously aligned with the region of density against each TSC2 HEAT repeat within the "body" using UCSF Chimera. The human RapGAP dimerisation domain crystal structure (PDB ID: 3BRW) 22 was fit into the dimerisation interface between the two TSC2 HEAT repeats. In both the "pincer" and "tail" reconstructions, the C. thermophilum TSC2 N-terminal HEAT repeat crystal structure (PDB ID: 5HIU) 18 could be fitted with the α-solenoid extending from C-terminus at the "body" dimerisation interface to N-terminus at either end of the elongated complex. Additionally, in the "tail" reconstruction the TBC1D7-TSC1 crystal structure (PDB ID: 5EJC) 21 fit into density of the "barb", providing a means to distinguish the "tail" from the "pincer".
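The hand comparison above rests on a real-space correlation coefficient computed for a model fitted into the map and into its mirror image. The sketch below illustrates the idea on a fixed grid using a Pearson correlation; it is a deliberate simplification, since in the study each hand was refitted and locally optimised in Chimera, whose exact correlation definition is not reproduced here. All names are hypothetical and the volumes are toy nested lists.

```python
from math import sqrt


def flatten(volume):
    """Flatten a 3D volume stored as nested lists [z][y][x] into a flat list."""
    return [v for plane in volume for row in plane for v in row]


def mirror_x(volume):
    """Invert the hand of a volume by reversing the x axis."""
    return [[list(reversed(row)) for row in plane] for plane in volume]


def correlation(vol_a, vol_b):
    """Pearson correlation between two volumes sampled on the same grid."""
    a, b = flatten(vol_a), flatten(vol_b)
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den


def preferred_hand(map_volume, model_volume):
    """Compare the model density with the map as given and with its mirror image."""
    cc_as_is = correlation(map_volume, model_volume)
    cc_mirrored = correlation(mirror_x(map_volume), model_volume)
    label = "as-is" if cc_as_is >= cc_mirrored else "mirrored"
    return label, cc_as_is, cc_mirrored


if __name__ == "__main__":
    # Toy 2x2x2 volumes: a single dense voxel stands in for fitted model density.
    model = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    density = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    print(preferred_hand(density, model))
```

The reported differences (0.65 versus 0.58, and 0.55 versus 0.48) are modest, which is why the assignment was also checked independently against the handedness of the α-helices.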

Data and Materials Availability
The cryo-EM density maps corresponding to the "pincer", "body", and "tail" of the HsTSCC complex have been deposited in the EM Databank under accession codes EMD-11816, EMD-11819, and EMD-11817.
CRediT authorship contribution statement