Computational Analysis of Cholesterol Binding and Pore-Lining Regions in Alpha-Synuclein: Role in Mitochondrial Function

Copyright: © 2017 Morrill GA, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Abstract
Alpha-synuclein (α-syn) protein is the major component of Lewy bodies, which are characteristic pathological hallmarks of neurodegenerative diseases (e.g. Parkinson's and Alzheimer's diseases). It is primarily expressed in neural tissue, with smaller amounts found in heart, muscle and other tissues. The canonical form found in Homo sapiens (α-syn-1) contains 140 residues and interacts with neuronal mitochondria via an N-terminal 32-residue mitochondrial-targeting signal. All three isoforms have multiple highly conserved lipid-binding (KTKE(Q)G(Q)V) motifs, thought to mediate binding to phospholipid membranes. Two isoforms also contain an EF-hand-like (helix-loop-helix) sequence found in a large family of calcium-binding proteins, as well as three copper binding sites. We investigate protein topology using computational analysis and find that each isoform contains a pore-lining region, two cholesterol-binding (CRAC/CARC) motifs and three or four lipid-binding motifs, with one cholesterol motif overlapping the pore-lining region. Two lipid-binding motifs also overlap the N-terminal mitochondrial-targeting region, consistent with evidence that α-syn inserts into the mitochondrial inner membrane. α-Syn-1 reportedly occurs physiologically as a helically folded tetramer that requires N-terminal acetylation. Thus, each α-syn-1 tetramer could contain 4 mitochondrial targeting regions, up to 4 pore-lining regions, 4 EF-hand domains, 8 bound cholesterol molecules and 16 lipid-binding motifs, with pore-lining regions merging to form a membrane channel. Cholesterol binding to CRAC motifs may in turn facilitate protein folding, Ca²⁺-channel formation, as well as mitochondrial membrane lipid-protein interactions, altering mitochondrial bioenergetics. Disruption of mitochondrial bioenergetics may be involved in the pathogenesis of Alzheimer's disease and Parkinsonism.

Introduction
Alpha-synuclein (α-syn) is a small (14 kDa), acidic protein that is highly conserved in vertebrates, making up as much as 1% of all proteins in the cytosol of brain cells [1]. The α-syn protein is the major component of Lewy bodies, which are characteristic pathological hallmarks of neurodegenerative diseases (e.g. Parkinson's and Alzheimer's diseases; rev. [2]). It is predominantly expressed in the neocortex, hippocampus, substantia nigra, thalamus and cerebellum as a neuronal protein, and can also be found in neuroglial cells. It is also expressed in low concentrations in all tissues examined except liver [see UniProtKB P37840 (SYUA_HUMAN) for gene expression databases]. α-Syn is localized in the presynaptic termini in both free and membrane-bound forms, with about 15% of α-syn estimated to be membrane-bound at any given moment in neurons [3]. α-Syn is also highly expressed in the mitochondria in the olfactory bulb, hippocampus, striatum and thalamus [4,5]. Cerebral cortex and cerebellum are exceptions, with each containing high levels of cytosolic α-syn but very low mitochondrial levels [4]. Devi et al. [6] have shown that α-syn contains a 32-amino-acid mitochondrial targeting signal, and presented evidence that accumulation of α-syn in human dopaminergic neurons reduces mitochondrial complex I activity and increases production of reactive oxygen species. Complex I (NADH:ubiquinone oxidoreductase) is the largest multimeric enzyme complex of the mitochondrial respiratory chain, responsible for electron transport and the generation of a proton gradient across the mitochondrial inner membrane to drive ATP production. A number of proteins have been identified as assembly factors for complex I biogenesis, and numerous mitochondrial diseases have been associated with complex I dysfunction (rev. [7]).
In the normal brain, α-syn is found at the tips of neurons in specialized structures called presynaptic terminals [1]. Within these structures, α-syn interacts with phospholipids [8] and other neuronal proteins [3]. A fraction of α-syn interacts with mitochondria, and increased α-syn can produce mitochondrial fragmentation and impair mitochondrial complex I activity [9]. Other studies suggest that α-syn plays a role in maintaining a supply of synaptic vesicles in presynaptic terminals by synaptic vesicle clustering [10]. α-Syn reportedly occurs physiologically as a helically folded tetramer that resists aggregation [11]. The usual form of the protein, and the one most investigated, is the full-length protein of 140 amino acids. Other isoforms are alpha-synuclein-3, which lacks residues 41-54 due to loss of exon 3; and alpha-synuclein-2 [12], which lacks residues 103-130 due to loss of exon 5 [13].
Here we identify additional protein domains and motifs in the three α-syn isoforms, and analyze the role of protein topology in α-syn-1 physiology. A protein domain is a conserved part of a given protein sequence and (tertiary) structure that can evolve, function, and exist independently of the rest of the protein chain. Each domain forms a compact three-dimensional structure that is often independently stable and folded and is responsible for a particular function or interaction, contributing to the overall role of a protein (e.g. [14]). A super-secondary structure such as a helix-turn-helix, which appears in a variety of proteins, is often defined as a motif and may describe the connectivity between secondary structural elements [15]. As described here, α-syn contains multiple protein domains including a 32-residue N-terminal mitochondrion-targeting sequence, a pore-lining region, multiple cholesterol-binding (CRAC/CARC) motifs and lipid-binding KTKE(Q)G(Q)V motifs, an EF-hand motif and several copper binding sites, with cholesterol-binding motifs overlapping specific domains and super-secondary structures. Based on protein topology analysis, we find that each α-syn tetramer may contain 4 pore-lining regions that form channels and up to 8 bound cholesterol molecules whose binding motifs overlap other motifs, and that these may serve to modulate ion movement and interaction with mitochondrial membranes.

Computational analysis of α-syn isoforms
Secondary structure, transmembrane helix and/or pore-lining region predictions were determined by computational analysis. PSIPRED is a simple and accurate secondary structure prediction method, incorporating two feed-forward neural networks which analyze output obtained from Position-Specific Iterated BLAST (PSI-BLAST). PSIPRED 3.2 achieves an average Q3 score of 81.6% [16] and is available at http://bioinf.cs.ucl.ac.uk/psipred/. Transmembrane helices and pore-lining regions in transmembrane protein sequences in this study were predicted using the method of Nugent and Jones [17], who trained a support vector machine classifier to predict the likelihood of a TM helix being involved in pore formation. This approach has a prediction accuracy of 72%, while a support vector regression model is able to predict the number of subunits participating in the pore with 62% accuracy.
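For reference, a Q3 score is the percentage of residues whose predicted three-state class (helix H, strand E, coil C) matches the reference assignment. The following is a minimal Python sketch of that calculation; the function name `q3_score` and the two short state strings are illustrative assumptions and are not PSIPRED output.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Percentage of positions where the predicted 3-state class (H/E/C)
    equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not predicted:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


# Hypothetical 10-residue example (not real prediction data):
# the two strings differ at one position, so 9 of 10 states agree.
predicted = "CHHHHHCEEC"
observed = "CHHHHCCEEC"
print(q3_score(predicted, observed))
```

An average Q3 such as the 81.6% quoted above would be obtained by averaging this per-protein value over a benchmark set.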

Analysis of cholesterol (CRAC, CARC) motifs
CRAC (Cholesterol Recognition/Interaction Amino acid Consensus sequence) is a short linear amino acid motif that mediates binding to cholesterol [18]. In the N-terminus to C-terminus direction the motif consists of a branched apolar Leu (L), Ile (I) or Val (V) residue, followed by a segment of 1-5 residues of any type, followed by an aromatic Tyr (Y) or Phe (F) residue, a second segment of 1-5 residues of any type, and finally a basic Lys (K) or Arg (R). In one-letter amino acid code the pattern is (I/L/V)-X(1-5)-(Y/F)-X(1-5)-(K/R). A second cholesterol recognition domain similar to the CRAC domain (CARC) has been identified (cit. [18]); it exhibits the opposite orientation along the polypeptide chain ("inverted CRAC"), i.e., (K/R)-X(1-5)-(Y/F)-X(1-5)-(L/V).
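The two patterns above translate directly into regular expressions. The sketch below is one possible way to scan a sequence for candidate motifs; it is not the procedure used in this study. The function name `find_motifs` and the toy test sequences are hypothetical, and a sequence match indicates only a candidate site, not confirmed cholesterol binding.

```python
import re

# Patterns as stated in the text; X(1-5) is one to five residues of any type.
# The lookahead (?=...) lets candidates that start at different positions overlap.
CRAC = re.compile(r"(?=([ILV][A-Z]{1,5}[YF][A-Z]{1,5}[KR]))")
CARC = re.compile(r"(?=([KR][A-Z]{1,5}[YF][A-Z]{1,5}[LV]))")


def find_motifs(sequence, pattern):
    """Return (start, end, segment) tuples with 1-based inclusive coordinates.

    One candidate is reported per start position: the first match found by
    the regex engine, which tries longer spacers before shorter ones.
    """
    hits = []
    for match in pattern.finditer(sequence.upper()):
        segment = match.group(1)
        start = match.start() + 1
        hits.append((start, start + len(segment) - 1, segment))
    return hits


# Toy sequences (hypothetical, chosen only to exercise each pattern)
print(find_motifs("AALAAYAAKAA", CRAC))  # one CRAC candidate expected
print(find_motifs("GGKAAFAALGG", CARC))  # one CARC candidate expected
```

Reporting 1-based inclusive coordinates keeps the output directly comparable with residue numbering such as that used in UniProt entries.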

EF-hand domain analysis
The EF-hand is a helix-loop-helix structural domain or motif found in a large family of calcium-binding proteins [19]. Its topology resembles the spread thumb and forefinger of the human hand: two alpha helices are linked by a short loop region (usually about 12 amino acids), and Ca²⁺ ions are coordinated by ligands within the loop.

11-Residue lipid-binding domains
An amphipathic N-terminal region of α-syn (residues 1-63), dominated by four 11-residue repeats that include the consensus sequence KTKE(Q)G(Q)V, has a structural alpha-helix propensity similar to that of the lipid-binding domains of apolipoproteins (e.g. [20]). Similarities between these repeats and those found in apolipoproteins have suggested a possible role in the association of α-syn with lipids through the formation of an α-helical secondary structure [21]. When mixed with lipid vesicles, including negatively charged phospholipids, the N-terminal region of α-syn adopts an α-helical conformation [22].
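The consensus KTKE(Q)G(Q)V can be scanned for in the same way as other linear motifs. The sketch below assumes the notation means E or Q at the fourth position and G or Q at the fifth, so the pattern would also accept KTKQQV; that reading is an assumption. The function name `find_lipid_repeats` and the synthetic sequence are hypothetical, and the sequence is built only so that the motifs sit 11 residues apart.

```python
import re

# Assumed reading of KTKE(Q)G(Q)V: K-T-K-[E or Q]-[G or Q]-V
LIPID_MOTIF = re.compile(r"KTK[EQ][GQ]V")


def find_lipid_repeats(sequence):
    """Return (1-based start, motif) for every non-overlapping hexamer match."""
    return [(m.start() + 1, m.group()) for m in LIPID_MOTIF.finditer(sequence.upper())]


# Synthetic sequence (not alpha-synuclein): three motif variants, each followed
# by a five-alanine spacer so that consecutive motifs start 11 residues apart.
toy = "AA" + "KTKEGV" + "AAAAA" + "KTKQGV" + "AAAAA" + "KTKEQV" + "AA"

hits = find_lipid_repeats(toy)
starts = [start for start, _ in hits]
spacing = [b - a for a, b in zip(starts, starts[1:])]
print(hits)
print(spacing)
```

The spacing between consecutive motif starts gives a quick check of the 11-residue periodicity described above.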

Analysis of alpha-synuclein (α-syn) protein topology in terms of pore-lining regions, disordered regions and Dompred and DomSSEA boundaries
Nugent and Jones [17] have developed a method (MEMSAT-SVM) to predict pore-lining regions in transmembrane (TM) proteins and to distinguish pore-lining regions from TM helices. In predicting pore stoichiometry, four features were used to train a support vector machine (SVM) model: sequence length, the number of pore-lining residues, topology and the number of pore-lining helices, with the target value set to the number of subunits forming the pore within the membrane region. Pore-lining regions are usually enriched in negatively (e.g. E, D) or positively charged residues (e.g. K, R, H). Using this approach, Nugent et al. [16] were able to identify pore-lining regions, to predict the likelihood that transmembrane helices are involved in pore (channel) formation, and to determine the number of subunits required to form a complete pore or ion channel.
Including the initiator methionine, the α-syn isoforms vary from 140 residues (isoform-1) to 112 residues (isoform-2) and 126 residues (isoform-3). The pore-lining regions, disordered protein binding regions, as well as Dompred and DomSSEA boundaries are compared (Figure 1) in the three α-syn isoforms of Homo sapiens produced by alternative splicing (P37840-1, top; P37840-2, middle; P37840-3, bottom). P37840-1 has been chosen as the 'canonical' sequence (also known as NACP140). The solid blue squares indicate pore-lining regions, whereas the extracellular region is indicated as orange-filled squares and the predicted intracellular region is shown as colorless squares (Figure 1). As indicated by the key annotations (lower part of each plot), squares with dark red and green rims represent disordered and disordered protein binding regions, respectively. As shown, each isoform contains a 16-residue transmembrane pore-lining region which should facilitate insertion of α-syn into mitochondrial membranes. The pore-lining region of the canonical form (A76-A91) is located within domain 2 (residues 61-95) of the three domains (1-60; 61-95; 96-140) identified in α-syn. The three isoforms are similar in that all C-terminal regions are predicted to be extramitochondrial (or "extracellular") but differ in their disordered regions. Using LALIGN, a comparison of isoforms 1 and 2 gives a Waterman-Eggert score of 616, with 80.0% identity (80.0% similar) over a 140-amino-acid overlap (1-140:1-112). Residues 103 to 130 are deleted in isoform 2. In comparison, isoforms 1 and 3 exhibit a score of 752, with 90.0% identity (90.0% similar).
Figure 2 compares the pore-lining regions (sequences are bold underlined), cholesterol-binding (CRAC/CARC) domains (highlighted in red), KTKE(Q)G(Q)V motifs (double underlined) and EF-hand-like regions (green highlighted) in the α-syn isoforms. The differential copper binding regions are highlighted in blue (cf. [23]).
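The isoform lengths and identity percentages quoted above are consistent with simple arithmetic on the deleted residue ranges given in the Introduction (41-54 for isoform 3, 103-130 for isoform 2). The Python sketch below reproduces that arithmetic; it assumes that the alignment spans all 140 canonical positions and that every retained residue aligns identically, which is an assumption and not LALIGN output. The names used are illustrative.

```python
CANONICAL_LENGTH = 140  # isoform 1 (P37840-1), including the initiator methionine

# Deleted residue ranges (1-based, inclusive) as stated in the text
DELETIONS = {
    "isoform-2": (103, 130),
    "isoform-3": (41, 54),
}


def isoform_length(canonical_length, deletion):
    """Length remaining after removing an inclusive residue range."""
    first, last = deletion
    return canonical_length - (last - first + 1)


def expected_identity(canonical_length, deletion):
    """Percent identity if all retained residues match across the full-length
    alignment (assumption: deleted positions count as non-identical)."""
    retained = isoform_length(canonical_length, deletion)
    return round(100.0 * retained / canonical_length, 1)


for name, deletion in DELETIONS.items():
    print(name,
          isoform_length(CANONICAL_LENGTH, deletion),
          expected_identity(CANONICAL_LENGTH, deletion))
```

Under these assumptions the sketch returns 112 residues and 80.0% for isoform 2, and 126 residues and 90.0% for isoform 3, matching the values reported in the text.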
Each α-syn isoform contains two cholesterol-binding (CRAC or CARC) motifs, one of which overlaps a pore-lining region and/or a KTKE(Q)G(Q)V repeat motif in all 3 isoforms. Recent studies have defined new roles for copper metabolism in cell proliferation, signaling and disease, including roles for SCO1 and SCO2, which are essential for the assembly of the catalytic core of cytochrome c oxidase (rev. [24]). Miotto et al. [25] have shown that N-terminal acetylation facilitates Cu⁺ binding to the naturally occurring form of α-syn at Met-1 and Met-5 and, to a lesser extent, to Met-116, Met-127 and His-50. More recent studies by Ranjan et al. [23] report that, in the canonical form of α-syn, Cu²⁺ first binds with higher affinity to residues in the stretch of amino acids 48-53, followed by binding at residues 3-11 and residues 115-123, respectively. Isoforms 1 and 3 (but not isoform 2) contain an EF-hand-like (helix-loop-helix) C-terminal sequence found in a large family of calcium-binding proteins. As predicted by the MEMSAT-SVM algorithm (Figure 1), the EF-hand domain is extramitochondrial (or extracellular), in a position to modulate Ca²⁺ binding to α-syn at the mitochondrial surface.
The CRAC sequence has been suggested to relate to the propensity of membrane proteins to be incorporated into cholesterol-rich lipid domains as well as to promote segregation of cholesterol into lipid domains [18]. It is important to note the possibility of "false positive" results for CRAC/CARC motifs: many proteins contain multiple CRAC/CARC motifs that have not been confirmed as actual cholesterol-binding domains. Modeling studies by Fantini et al. [26] indicate that a cholesterol molecule can bind to the CRAC domain with an energy of interaction of -44.4 kJ mol⁻¹, whereas the CARC domain binds with a significantly stronger (more negative) energy of interaction of -62.4 kJ mol⁻¹. The presence of a CARC or CRAC motif in a protein usually indicates its intrinsic capability to interact with cholesterol, provided that the motifs are accessible to cholesterol molecules. Based on energy considerations, protein folding can be affected by cholesterol binding and may differ in the presence and absence of cholesterol. It may be possible to characterize cholesterol binding using ¹H/¹³C-NMR with purified preparations of α-syn. It should also be noted that methionine is the N-terminal amino acid in each of the α-syn isoforms (Figure 2). In a biological system, protein synthesis is initiated universally with the amino acid methionine (cit. [27]), which is subsequently removed by enzymes in the cell. The initiator methionine is therefore listed in protein data files, although it may not be present physiologically. A contribution from the N-terminal methionine to protein topology/interactions has not been ruled out.

Distribution of mitochondrial-targeting sequences and C-terminal EF-hand-like domains
The intracellular sorting of newly synthesized precursor proteins (preproteins) to mitochondria depends on the "mitochondria-targeting sequence" (MTS), which is located at the N-termini of the preproteins (rev. [28]). The MTS is required not only for targeting newly synthesized preproteins to mitochondria, but also for several steps along the mitochondrial protein import pathway. The MTS is described as a multirole sorting sequence that specifically interacts with various components along the mitochondrial protein import pathway. Targeting of proteins to different cellular locations is often mediated by N-terminal "topogenic sequences" (rev. [29]). Most mitochondrial proteins are encoded by the nuclear genome and thus have to be imported into mitochondria from the cytosol (rev. [30]). In the case of α-syn, the N-terminal 32 amino acids have been identified as an MTS [6]. As can be seen in Figure 2, the 32 amino acids are identical in all three isoforms, with all three containing a copper binding site (VFMKGLSKA) and a lipid-binding KTKE(Q)G(Q)V motif.
As also shown in Figure 2, the last 20 residues of α-syn represent the EF-hand domain. Based on protein topology, this may exist as a random coil structure due to its low hydrophobicity and high net negative charge. Trexler and Rhoades [31] have demonstrated that N-terminal acetylation is critical for forming α-helical oligomers of α-syn. Further evidence for the tetrameric conformation has been obtained by Bartels et al. [32], Pochapsky [33] and Dettmer et al. [34]. As noted by Trexler and Rhoades [31], acetylation removes the N-terminal charge, making this region more hydrophobic. Increased hydrophobicity would favor protein-protein interactions of α-syn involving hydrophobic packing, either in folding to a native tetramer or in the aggregates found in Parkinson's disease. It should also be emphasized that studies using an E. coli expression system to prepare large amounts of α-syn from human cDNA (e.g. [35]) would generate non-acetylated protein monomers. Acetylated α-syn can be produced by co-transfecting E. coli with both α-syn and N-terminal acetyltransferase B complex plasmids [11]. The pattern emerging from numerous studies suggests that the N-acetylated, tetrameric, helix-rich form is important to α-syn physiology.

Amphipathic α-helical domain binding to phospholipid membranes: Role of conserved KTKEGV heteromeric motifs and formation of helically folded tetramers
Amphipathic α-helical domains in the N-terminus of α-syn reportedly mediate binding to phospholipid membranes [36]. This domain is structurally similar to the class A2 lipid-binding motif described for the exchangeable apolipoproteins. As proposed by George and Yang [36], α-syn is unstructured in aqueous solution, but shifts to a unique α11/3-helical conformation in the presence of phospholipid vesicles or SDS. In addition, α-syn binds fatty acids, particularly those with long polyunsaturated acyl chains (PUFAs), and these interactions can induce irreversible multimerization of the protein. The N-terminal domain (residues 1-70) of the canonical α-syn-1 contains a positively charged region that includes the 11-amino acid heteromeric repeats. Each 11-amino acid repeat contains a highly conserved KTKE(Q)G(Q)V hexameric motif that is also present in the α-helical domain of apolipoproteins (rev. [36]). The core region of α-syn (residues 61-95, also known as NAC) is involved in fibril formation and, as shown in Figure 2, includes the pore-lining region and a CRAC motif. The C-terminal region of α-syn (residues 96-140) is defined as an "acidic tail" of 43-amino acid residues, containing 10 Glu and 5 Asp residues, and contains the EF-hand motif.

Conclusion
We find that current protein computational methods indicate that α-syn contains previously unidentified protein domains and motifs. These include a pore-lining region (Figure 1) as well as two cholesterol-binding (CRAC/CARC) motifs per α-syn molecule (Figure 2). The α-syn protein domains and motifs in isoform 1 identified to date are summarized in Table 1. All α-syn isoforms contain an N-terminal 32-residue mitochondrial-targeting signal; coupled with the finding that α-syn is highly expressed in the mitochondria in the olfactory bulb, hippocampus, striatum and thalamus [4], this indicates that α-syn is an important mitochondrial protein. Other evidence in the literature indicates that, under physiological conditions, the α-syn molecule is acetylated in the N-terminal region and forms tetramers (e.g. [11]).
The mitochondrial-targeting sequence in the N-terminal region indicates that α-syn may selectively insert in the inner mitochondrial membranes of neural cells [6,9,37]. Thus, the pore-lining regions within tetramers may merge to form a mitochondrial membrane ion channel and alter mitochondrial bioenergetics. All α-syn isoforms also contain multiple highly conserved phospholipid-binding motifs (KTKE(Q)G(Q)V), thought to mediate binding to phospholipid membranes (e.g. [34]), which in the tetramer would involve up to 16 such motifs. Cholesterol-containing pore-lining regions are known to form Ca²⁺ ion channels [18,26], which together with the EF-hand in the C-terminal region of the molecule would regulate Ca²⁺ movement across the mitochondrial membrane. It has been proposed that memory reconsolidation and its maintenance depend both on voltage-dependent Ca²⁺ channels and on calcium/calmodulin-dependent protein kinase in the hippocampus [38]. Ion channel formation in plasma/mitochondrial membranes can be facilitated by exercise and may account for the increase in brain cognition and the beneficial effects of physical activity on selective aspects of brain function (cit. [39]). An excess of α-syn (e.g. in Parkinson's disease) may produce mitochondrial calcium overload, resulting in mitochondrial dysfunction/destruction and neuronal cell death [40].
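The per-tetramer totals quoted in this paper follow from multiplying the per-monomer counts for the canonical isoform by the four subunits of the tetramer. The short Python sketch below makes that bookkeeping explicit; the dictionary keys are illustrative labels, and the lipid-binding count uses the upper value (four) of the "three or four" motifs reported per isoform.

```python
SUBUNITS_PER_TETRAMER = 4

# Per-monomer counts for the canonical isoform, taken from the text
# (lipid-binding motifs: upper bound of the reported "three or four").
PER_MONOMER = {
    "mitochondrial-targeting regions": 1,
    "pore-lining regions": 1,
    "EF-hand domains": 1,
    "cholesterol-binding (CRAC/CARC) motifs": 2,
    "lipid-binding KTKE(Q)G(Q)V motifs": 4,
}

# Upper-bound totals for one tetramer
per_tetramer = {name: count * SUBUNITS_PER_TETRAMER for name, count in PER_MONOMER.items()}

for name, total in per_tetramer.items():
    print(f"{name}: up to {total}")
```

These are upper bounds on motif occupancy; whether every motif is accessible and occupied in an assembled tetramer is not established by the sequence analysis alone.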