Unusual Armadillo Fold in the Human General Vesicular Transport Factor p115

The golgin family gives identity and structure to the Golgi apparatus and is part of a complex protein network at the Golgi membrane. The golgin p115 is targeted by the GTPase Rab1a and contains a large globular head region and a long coiled-coil region that forms an extended rod-like structure. p115 serves as a vesicle tethering factor and plays an important role at different steps of vesicular transport. Here we present the 2.2 Å-resolution X-ray structure of the globular head region of p115. The structure exhibits an armadillo fold that is decorated by elongated loops and carries a C-terminal non-canonical repeat. This terminal repeat folds into the armadillo superhelical groove and allows homodimeric association, with important implications for p115-mediated multiple protein interactions and tethering.


Introduction
Membrane trafficking in eukaryotic cells is an example of the modular organization of cellular activity. The formation and delivery of transport intermediates to specific cellular locations are complex processes that can be divided into several stages [1]. In this modular organization, the first interaction between a vesicle and its target membrane is termed tethering. It depends on a heterogeneous group of proteins called 'tethers' [2]. They can be divided into multi-subunit tethering complexes and proteins containing an extended coiled-coil region.
The golgin p115, which forms stable homodimers, is recruited to membranes in a nucleotide-dependent manner by the guanosine triphosphatase (GTPase) Rab1a [2,3] and belongs to the family of tethers containing an extended coiled-coil region. p115 is among the best characterized representatives of long coiled-coil tethers. The architecture of p115 comprises a long central coiled-coil region, a large globular N-terminal domain and a C-terminal acidic region. The central region mediates homodimerization and contains the Rab1a binding site. Interaction of Rab1a and p115 is thought to tether coat-protein complex II (COP II) vesicles to each other, thus promoting homotypic vesicle fusion [3]. The C-terminal region of p115 binds to GM130 and giantin, two further coiled-coil tethers localized at the Golgi membrane [4].
p115 binds to a specific set of soluble N-ethylmaleimide-sensitive-factor attachment protein receptors (SNAREs), aiding formation of a cis-SNARE complex that promotes anterograde ER-to-Golgi transport by targeting COP II vesicles to the Golgi apparatus [3]. In addition to its major role in exocytotic transport [5], p115 functions in retrograde Golgi-to-ER transport, intra-Golgi transport and Golgi biogenesis [6], owing to essential interactions with the coat-protein complex I (COP I) subunit β-COP [7] and the conserved-oligomeric-Golgi complex (COG) subunit COG2 [8].
To understand how these different activities are combined in one p115 molecule, we embarked on its structural analysis. We used a construct comprising the globular head region of p115 (p115 GHR, residues Asp54 to Tyr629) for crystallization (Fig. 1). The fragment lacks 53 N-terminal residues that are predicted to be disordered [9] and the C-terminal coiled-coil domain (p115 CC).

Results and Discussion
Structure of the p115 globular head region
p115 GHR consists of a multi-helical β-catenin-like armadillo fold arranged in a regular right-handed superhelix (Fig. 2A). We observe 10 classical armadillo repeats (ARM1-ARM10) [10-12] and one non-canonical repeat, which we termed the USO repeat after the yeast homolog of p115, Uso1p. Each armadillo repeat is composed of three α-helices (H1-H3) and has a distinct hydrophobic core. ARM1 and ARM2 are connected by a highly acidic and flexible loop (residues 92-110), which is not visible in the electron density. Structure-based sequence alignment reveals a number of highly conserved amino acids (Fig. 2B).
The N-terminal region of p115 GHR (Fig. 2C, left) is remarkably similar to other armadillo-fold proteins [11,13] of different subfamilies (β-catenin, p120/catenin δ-1, and karyopherin-α/importin-α), although these proteins show low sequence conservation. The C-terminal region (ARM7-USO) of p115 GHR differs from other members of armadillo-protein subfamilies (Fig. 2C, right). The armadillo repeats exhibit long loops (5 to 13 residues) in ARM5-ARM9. This structural motif of elongated loops culminates in the formation of a short helix, which we named the USO helix, inserted in the H2-H3 loop (37 residues) of the terminal USO repeat. The USO repeat does not follow the rule of classical armadillo repeat proteins that form a right-handed superhelix. Instead, it folds back into the superhelical groove, leading to a globular C-terminus of p115 GHR and covering helices H3 of ARM8-ARM9, while the USO helix points into the center of the groove.
These unique characteristics of the C-terminus allow the protein to be structurally divided into an armadillo helical domain (ARM1-ARM6, residues 54-342) and a Uso1 head domain (ARM7-USO, residues 343-629), which clearly distinguishes the head region of p115 from any other armadillo-fold protein. The Uso1 head domain identifies a group of proteins that are described as general vesicular transport factors, transcytosis-associated proteins (TAP) or vesicle docking proteins [14].
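The domain boundaries given above can be collected into a small lookup table, for example to check on which side of the domain boundary a given residue falls. The following Python sketch is illustrative only: the names DOMAINS and domain_of are hypothetical, and the residue ranges are simply those stated in the text.

```python
from typing import Optional

# Residue ranges as stated in the text; the dictionary and function
# names are hypothetical and serve illustration only.
DOMAINS = {
    "armadillo helical domain (ARM1-ARM6)": (54, 342),
    "Uso1 head domain (ARM7-USO)": (343, 629),
}


def domain_of(residue: int) -> Optional[str]:
    """Return the domain containing `residue`, or None if it lies outside p115 GHR."""
    for name, (start, end) in DOMAINS.items():
        if start <= residue <= end:
            return name
    return None


# The acidic ARM1-ARM2 loop (residues 92-110) lies in the armadillo helical domain.
print(domain_of(100))
# Residues upstream of Asp54 are not part of the crystallized construct.
print(domain_of(20))
```

Residues outside the crystallized fragment return None rather than raising an error, so that positions in the missing N-terminal segment or the coiled-coil region can be queried without special handling.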

Interaction of p115 GHR and the COG complex subunit COG2
Uso1p and p115 share a similar domain structure: a large globular head region, a long coiled-coil domain and an acidic patch at the C-terminus of the protein. Their overall sequence identity is 25%. Two highly conserved homologous regions, HR1 (residues 21-54) and HR2 (residues 200-247), were shown to bind to the appendix domain of the COP I subunit β-COP and the COG complex subunit COG2, respectively (see Fig. 1). HR1 is predicted to be disordered and is missing in our structure. HR2 maps to ARM4 and ARM5 of the N-terminal armadillo-like helical domain. The armadillo fold is found in more than 240 proteins that mostly serve as scaffolds for the assembly of multiprotein complexes. They often mediate complex formation by polar interactions. Interestingly, the armadillo helical domain shows large negatively charged patches (Fig. 3a), and additionally we observe a conserved, highly charged surface patch of ARM4 in HR2 [15,16], which indicates that COG2 binding arises mainly from polar interactions (Fig. 3b).

Dimeric arrangement of the p115 globular head region
p115, like other golgins, is a stable homodimer with an N-terminal globular head domain and a C-terminal coiled-coil domain of 45 nm length, as determined by rotary-shadowing electron microscopy [17]. In gel-filtration experiments, we observed p115 GHR to be monomeric in solution (not shown). In the crystal structure, a dimeric arrangement between p115 GHR molecules results from their packing along a dyad axis (Fig. 4A).
Depending on the orientation, the crystallographic dimer has a single-head or double-lobed globular appearance (Fig. 4B). The extended loops point towards the exposed surface, and the large superhelical groove of one molecule is covered by the groove of the second molecule in the dimeric arrangement. Interestingly, the p115 GHR groove is less charged than the grooves of β-catenin and karyopherin-α (Fig. 5), which serve as binding sites for interaction partners in these proteins. In the dimeric p115 GHR assembly observed in the crystal, the monomers are twisted around each other, keeping the USO helices, which form the interface of the head dimer, in the center. The USO helix and the USO repeat helix H2 are part of the dimer interface, which covers only 635 Å² (~2.6%) of the total 24,000 Å² of solvent-accessible surface (SAS). This contact area is relatively small, indicating that in solution the globular head domains might be connected flexibly, if at all.
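The percentage quoted for the dimer interface follows directly from the two surface areas. The Python sketch below reproduces this arithmetic; the function name buried_fraction is hypothetical, and the calculation assumes that both areas refer to the same reference surface, which the text does not specify further.

```python
def buried_fraction(interface_area: float, total_sas: float) -> float:
    """Fraction of the solvent-accessible surface covered by an interface.

    Both areas must be given in the same units (here: square angstroms).
    """
    if total_sas <= 0:
        raise ValueError("total solvent-accessible surface must be positive")
    return interface_area / total_sas


# Values taken from the text: 635 A^2 interface, 24,000 A^2 total SAS.
fraction = buried_fraction(635.0, 24_000.0)
print(f"{fraction:.1%}")  # prints 2.6%
```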

A model for the full-length p115 protein
Although the observed crystallographic dimer might not exactly reflect the protein structure in the cell, we suggest a model of the overall fold of the full-length p115 (Fig. 6). We note the distinct shape similarity between the dimer arrangement of p115 GHR and EM images of intact dimeric p115 [17] and Uso1p [18]. In agreement with this observation, the C-termini of both p115 GHR monomers are aligned in parallel in the crystal structure, which would allow continuation into the coiled-coil of p115.
The different members of armadillo subfamilies, such as β-catenin, karyopherin-α and p115 GHR, define a conserved architecture and provide a scaffold for the assembly of protein complexes with different functions. Interestingly, the C-terminal region of p115 GHR, in comparison to full-length β-catenin [19], shows how the architecture of an armadillo domain is altered to serve as what we propose to be a hinge linkage between the subunits of the dimeric p115 head domain. Further high-resolution structures of p115 GHR/CC and binding partner complexes, combined with characterization of structure-based mutants in cell-based assays, will be required to understand how p115 carries out its tethering function.

Protein expression and purification
A fragment of the human p115 gene, encoding amino-acid residues 54-628 (p115 GHR), was cloned into the bacterial [...]

Crystallization and data collection
p115 GHR crystals were grown at 4 °C by the sitting-drop method using a semi-automated dispensing system [21]. Crystals for X-ray measurements were obtained in 25% PEG 550 MME, 0.1 M HEPES pH 7.5. The best crystals were flash-cooled at 100 K in mother liquor containing 20% sucrose. Data from a native crystal to 2.2 Å resolution and from a crystal of selenium-labeled p115 GHR to 2.8 Å resolution were collected at 100 K at the Protein Structure Factory beamline 14.2 of the Freie Universität Berlin [21] at BESSY (Berlin, Germany). The same space group, C2, was obtained for the native and selenomethionyl proteins, with one molecule per asymmetric unit. Data were reduced and scaled using HKL2000 [22]. Data collection statistics are listed in Table 1.

Structure determination and refinement
For structure determination of p115 GHR, selenium-peak wavelength data to 2.8 Å resolution were used for single-wavelength anomalous diffraction (SAD) phasing to determine the positions of 15 selenium sites. Initial phases were calculated and improved using PHENIX [23]. The initial model was automatically built with ARP/wARP [24] and manually improved using the program COOT [25]. The model was placed into the unit cell of the higher-resolution native protein and subsequently refined using REFMAC5 [26]. During several rounds of iterative model building and refinement (including TLS), the model was extended to 553 residues per asymmetric unit, and three polyethylene glycol and 123 water molecules were placed into the electron density. The p115 GHR structure has a final Rwork = 21.9% and Rfree = 26.9%, and the quality of the model was excellent as assessed with the program MolProbity [27]. The coordinates and diffraction amplitudes were deposited in the Protein Data Bank with accession code 2w3c. Refinement statistics are summarized in Table 1.
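For readers less familiar with crystallographic residuals, the conventional R-factor is R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs|. The working and free values are obtained from the reflections used in refinement and from a held-out test set, respectively. The Python sketch below illustrates this standard definition only: it does not reproduce the REFMAC5 calculation, it assumes the calculated amplitudes are already on the scale of the observed ones, and the function name r_factor and the toy amplitudes are hypothetical.

```python
from typing import Sequence


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional crystallographic R-factor over a set of reflections.

    Assumes f_calc has already been scaled to f_obs (no scale factor is fitted here).
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes must be non-zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


# Toy amplitudes, invented for illustration; not data from this study.
work_obs, work_calc = [100.0, 50.0, 25.0], [90.0, 55.0, 30.0]
print(f"R = {r_factor(work_obs, work_calc):.3f}")
```

Applying the same function separately to the working and test reflections yields the two residuals. A free value moderately above the working value, as reported here, is the usual pattern, because the test reflections are excluded from refinement.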

Figure production
All pictures were prepared using PyMOL [28] and the APBS tool [29]. The sequence alignment was prepared with ClustalW [30].