The Route from the Folded to the Amyloid State: Exploring the Potential Energy Surface of a Drug‐Like Miniprotein

Abstract The amyloid formation of the folded segment of a variant of Exenatide (a marketed drug for type‐2 diabetes mellitus) was studied by electronic circular dichroism (ECD) and NMR spectroscopy. We found that the optimum temperature for E5 amyloidosis coincides with body temperature and that the process requires a salt concentration well below the physiological level. Decomposition of the ECD spectra and their barycentric representation on the folded‐unfolded‐amyloid potential energy surface allowed us to monitor the full range of molecular transformations of amyloidogenesis. We identified points of no return (e.g., T=37 °C, pH 4.1, c(E5)=250 μM, c(NaCl)=50 mM, t>4–6 h) from which the system will inevitably gravitate into the amyloid state. The strong B‐type far ultraviolet (FUV)‐ECD spectra and an unexpectedly strong near ultraviolet (NUV)‐ECD signal (Θ at ≈275–285 nm) indicate that the amyloid phase of E5 is built from monomers of quasi‐elongated backbone structure (φ≈−145°, ψ≈+145°) with strong interstrand Tyr↔Trp interaction. Misfolded intermediates and the buildup of “toxic” early‐stage oligomers leading to self‐association were identified and monitored as a function of time. The results indicate that the amyloid transition is triggered by subtle misfolding of the α‐helix, exposing aromatic and hydrophobic side chains that may provide the first centers for an intermolecular reorganization. These initial clusters provide the spatial closeness and sufficient time for a transition to the β‐structured amyloid nucleus; thus the process follows a nucleated growth mechanism.


Introduction
Aggregation of proteins and peptide segments into amyloid fibrils has been studied intensively over the past decades since the process was shown to be associated with, or even trigger, [1-3] such illnesses as Alzheimer's disease, type-2 diabetes mellitus, rheumatoid arthritis, or haemodialysis-associated amyloidosis. [4] Since the pioneering work on lysozyme [5] and Aβ, [6-8] the amyloid state of several misfolded proteins (e.g., β2-microglobulin, [9] crystallin, [10] tau protein, [11-13] the glucagon peptide hormone, [14] and insulin, [15] among others) has been characterized. The general topology of such aggregates consists of protein segments adopting an extended backbone, interacting through β-edges. The association between the β-sheets thus formed is compact and specific; in most cases it excludes water molecules, leading to the formation of tightly stacked, "dry-zipper" nanostructures. [16-18] State-of-the-art TEM, SAXS, cryo-EM and ssNMR techniques now allow full characterization of the aggregated end-state; [19,20] however, much less is known about the specific molecular species that evolve during the process, especially in the early stages, which concern the formation of still soluble but oligomeric assemblies that are the most toxic [1-3, 21, 22] and also represent the stage at which amyloidosis can still be reversed. [23] The progress of self-association can often be followed by ThT fluorescence and DLS (best in combination [24]), reporting the accumulation of cross-β backbone (above a minimum size of ca. 10 nm or 4-6 aligned strands [25]) and the size distribution of the species present in the solution, respectively. However, to gain atomistic detail, molecular spectroscopy such as CD, IR or NMR spectroscopy needs to be applied. [26,27] In fact, CD spectroscopy can be used to monitor the full range of molecular transformations accompanying amyloidogenesis if the secondary structure contents of the folded, intermediate, and amyloid states are distinct. There are notable examples that satisfy this description, namely, helical peptide hormones such as amylin or glucagon and a considerable number of peptide therapeutics; [28] and since over 60% of all protein-protein interfaces (typical targets of drug design) also contain helices, [29] their number will most likely increase.
Here we present the amyloid formation pathway of a variant of Exenatide, a marketed drug for type-2 diabetes mellitus, [30] that also contains a well-folded α-helix. We have discovered that this 25-residue-long segment (E5: EEEAVRLYIQWLKEGGPSSGRPPPS) (Figure 1), comprising the entire interface needed for GLP-1 receptor binding, [31] can be turned into amyloid in a controlled, fully reproducible and tunable manner within a large range of protein concentrations (80 μM < c(protein) < 800 μM) at physiologically relevant temperatures. Therefore, understanding the molecular details of the amyloidosis of E5 and mapping its conditions is highly relevant to any optimization efforts targeting Exenatide. In addition, E5 is an ideal model to study the amyloid transition of folded proteins and helix-containing peptides. Besides its helical stretch, E5 contains a β-turn, a polyproline-II helix and a hydrophobic center with a buried Trp; it thus has a protein-like build-up and also folds quite similarly to a typical globular protein. [32] As E5 is small, both its chemical synthesis on a resin and its bacterial expression in a fusion system are straightforward. [33] Given that its folded state is partially helical, its transition toward the amyloid phase results in a significant change in secondary structure content, which is easy to monitor by electronic circular dichroism (ECD) spectroscopy. Furthermore, E5 has an interacting Trp/Tyr residue pair within its hydrophobic core, enabling folding and refolding of the protein to be tracked by near ultraviolet (NUV)- in addition to far ultraviolet (FUV)-ECD. We also found the process to be quenchable: the amyloid formation can be suspended at any time by dropping the temperature, and restarted by a subsequent temperature rise. Furthermore, the moderate size of E5 allows detailed structure characterization both by NMR and modeling techniques.
We probed various regions of the f(T, pH, c(protein), c(ion), t) potential energy hypersurface of E5 by acquiring quantitative NUV- and FUV-ECD chiroptical information and NMR data complemented by MD simulations to pinpoint the reaction path that leads from the fully folded to the amyloid state. Based on these results, we were able to propose a mechanism that resembles that of well-folded proteins but relies on special features of the miniprotein. The methodology presented here gains significance because the amyloid state of E5 is ThT silent; it thus presents an approach for dealing with such cases as well.

Results and Discussion
We have shown previously [34] that high concentrations (c ≈ 10-30 mM) of E5 trigger self-association, accompanied by an α-helix to antiparallel β-sheet structural transition. Here, we set out to identify optimum conditions of aggregation, structurally characterizing key states along this route at a physiologically more relevant concentration range, using NMR, NUV- and FUV-ECD data to pinpoint misfolded structures of the reaction path, and electron microscopy to confirm the emerging amyloid fibrils.
To locate, roughly but efficiently, the conditions that enable amyloid formation, the HANABI system [35] was applied. The 96-well plates were set up with the following boundaries: 2 < pH < 7, 80 μM < c(E5) < 800 μM and 0 mM < c(NaCl) < 100 mM at T = 37 °C. ECD measurements were carried out after 90 h, with the sonication cycle turned "on" for 1 min and repeated every 10 min (power level: 700 W and frequency: 25 kHz). FUV-ECD measurements showed that amyloid formation is more effective in the presence than in the absence of salt (c(NaCl) = 100 mM), with the preferred protein concentration ranging from 80 to 160 μM, while the optimal pH is near 4.0. The amyloid thus produced showed a typical twisted fibril structure (Figure 2).
The formation of these fibrils could not be followed by ThT fluorescence and thus ECD spectra were recorded for each of the 96 wells. Based on the results, additional experiments were designed to identify parameters of amyloid formation separately, to pinpoint their significance and specific molecular consequences.
Figure 1. a) Primary sequence of E5 with residues colored by their charges at pH 7.0: negatively charged, positively charged, and neutral amino acids are highlighted in red, blue, and black, respectively. b) Overall charge (z) of E5 as a function of the pH calculated by Prot pi (http://www.protpi.ch). c) The solubility of E5 as a function of the pH (at 25 °C) shows that near the isoelectric point reversible precipitation occurs. d) Folded structure of E5 (4 °C & pH 7.0) determined by NMR analysis, with its four basic groups (highlighted blue) and five acidic groups (highlighted red) with the following pKa values: Ser C-terminal: 2.2, Glu side chain: 4.3, Glu N-terminal: 9.1, Lys side chain: 10.8, and Arg side chain: 12.5. e) The major microstates with their labeling are depicted schematically at four different pH values of interest, with side chains colored by charge: neutral (black), negative (red) and positive (blue).
Optimum conditions located for E5 amyloid formation: the temperature-scan NMR (2D homonuclear) measurements were carried out to determine the structure of E5 while varying the temperature between 4 and 48 °C (at pH 7, c(protein) ≈ 0.8 mM, c(ion) ≈ 0 mM), leading to the primary conclusion that as temperature increases, partial unfolding occurs without amyloid formation. We found that although NMR frequencies shift with rising T, line broadening only takes place above 37 °C, indicating that considerable unfolding occurs only at 48 °C (Figure S1 in the Supporting Information). The unsynchronized local backbone fluctuation of the folded F-state is enhanced as T increases. Furthermore, we have determined the most T-sensitive residues of E5, for which the presence of hidden intermediate(s) (I-state(s)) [36,37] was revealed. The thermal unfolding of the protein backbone shows a nonlinear T dependence for Leu7, Ile9, and Lys13 and, to a lesser extent, for Ala4, Val5, and Tyr8, indicating an enhanced presence of transient conformers at higher temperatures for the inner helix of E5 (Figure 1, Figure S2). 3D structure elucidations were also completed by acquiring a large number of NOE distance restraints (Figure 3 and Table S1). Although the total number of restraints drops as T increases, 666 (4 °C) → 221 (48 °C), the latter number of NOEs is still sufficient to establish the overall 3D fold of E5 even at 48 °C (especially since 16 out of 221 are key long-range restraints) (Figure 3e and Table S1). Inter-residue NOEs associated with Y8 and W11 (Figure S3) enabled us to determine not only the overall molecular scaffold but also the relative orientation of the two aromatic side chains. At 37 °C, 458 distance restraints in total, among which 40 long-range ones (i → (i + 5 <)), were assigned, affording a single time-averaged and compact 3D structure for E5.
While the central α-helix is tightly folded at this temperature (RMSD of the backbone heavy atoms is 0.64 Å within the 50 best-fit structures) (Table S1), as T increases the unfolding of the -P17SSGRP22- segment was detected. This segment is the least structured area within the folded scaffold even at 4 °C, where only sequential NOEs could be measured (Figure 3c). Nevertheless, the large number of NOEs associated with the Y8 and W11 residues and the -P22PP24- segment ensures their concerted motion, signaling that the hydrophobic core remains well-structured at 37 °C and below (Figure S3).
The background color indicates the fold compactness: green: unfolded, gray: folded, red: misfolded leading to amyloid formation. Aromatic side chains of the key Y8 and W11 are highlighted in magenta. The basic (N-terminal; R6, K13, R21) moieties are protonated, the C-terminal (S25) deprotonated, whereas the protonation microstates of the Glu residues are shown at the right side at each pH. All-atom backbone RMSD (Å) as a function of the primary sequence of E5 at b) different temperatures (°C) and c) different pH values are plotted. The total number of NOEs assigned for E5 are reported as d) the temperature was scanned at pH 6.9 and e) as pH was scanned at T = 15 °C.
Due to exchange phenomena, NMR spectroscopy cannot be used to provide precise structural information at high T; therefore, complementary FUV-ECD measurements were carried out to follow the transition of the backbone fold up to 85 °C. The recorded spectra were deconvoluted using the CCA+ protocol as mixtures of folded (F-) and unfolded (U-) forms. [38-40] At lower temperatures, in line with the NMR data, the folded fraction is dominant (F content goes from 100 to 80% as T increases from 5 to 35 °C), a 50:50% mixture is reached near 60 °C, while at 85 °C the spectrum indicates a 70% unfolded content (Figure S4).
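The idea of describing a measured spectrum as a mixture of pure-component spectra can be illustrated with a minimal sketch. It is not the CCA+ protocol: it assumes that the pure-component (basis) spectra are already known and fits their weights by ordinary least squares, clipping negative weights to zero. The basis spectra F_TYPE and U_TYPE, the five-point wavelength grid, and the function names are hypothetical and serve only as illustration.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def decompose(spectrum, basis):
    """Least-squares weights of the basis spectra, clipped at 0 and normalized to 1."""
    k = len(basis)
    gram = [[sum(p * q for p, q in zip(basis[i], basis[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(p * y for p, y in zip(basis[i], spectrum)) for i in range(k)]
    weights = [max(w, 0.0) for w in solve_linear(gram, rhs)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical pure-component spectra on a five-point wavelength grid.
F_TYPE = [-10.0, -30.0, -28.0, -32.0, -5.0]
U_TYPE = [-35.0, -12.0, -3.0, -1.0, 0.0]

# Synthetic "measured" spectrum built as 80% folded and 20% unfolded.
mixed = [0.8 * f + 0.2 * u for f, u in zip(F_TYPE, U_TYPE)]
fractions = decompose(mixed, [F_TYPE, U_TYPE])
print([round(x, 3) for x in fractions])
```

Applied to the synthetic spectrum, the fit recovers the weights used to build it. The same function accepts three basis spectra, so a third, β-stranded component can be added without changes.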
The pH-scan 2D-NOESY driven 3D structure elucidation was completed (c(E5) ≈ 500-1500 μM, c(NaCl) < 10 mM) at various pH values (6.9, 5.9, 5.0, 4.1, 2.0) at T = 15 °C (Figure 3b). (The poorer signal-to-noise ratio meant that a longer measurement time was required at pH 5.0 and 4.1.) The solubility of E5 drops close to its isoelectric point (pH 4.8, Figure 1), where unspecific and reversible precipitation was observed. As the side chain protonation pattern varies with pH, H-bonds and other weak interactions also change. E5 contains a protonated N-terminal (pKa = 9.1) plus three basic residues, Arg6 (pKa = 12.5), Lys13 (pKa = 10.8) and Arg21 (pKa = 12.5), with an acidic C-terminal (pKa = 2.2) and four acidic glutamate residues (Glu1, Glu2, Glu3 and Glu14) (pKa ≈ 4.25) (the listed pKa values are nominal values that strongly depend on backbone conformation). The net charge of E5 is predicted to be positive at pH values smaller than 5 (Figure 1). At pH 7.0, two salt bridges may contribute to the stabilization of the 3D fold, those of Glu1(−)↔Arg6(+) and Glu14(−)↔Arg21(+). But as pH decreases, the Glu residues get partly (or completely) protonated; thus the salt bridges weaken and the 3D fold becomes less compact. Accordingly, at pH 6.9, 5.9, and 5.0, the measured 3D structures of E5 are similar (Figure 3), although conformational heterogeneity increases considerably (demonstrated by the reduction of the total number of assigned restraints from 666 to 319; Figure 3f and Table S1). Thus, the Trp-cage fold holds, though the backbone heavy atom RMSD of the 50 best structures increases significantly: RMSD (pH 6.9, T = 15 °C) = 0.73 Å → RMSD (pH 5.0, T = 15 °C) = 2.56 Å (Table S1). We found that the total number of NOESY cross-peaks between the R21 and W11 residues is a reliable measure of the Trp-cage fold compactness (Figure S3): 11 and 10 such peaks were assigned at pH 6.9 and 2.0, but only 3 at 5 > pH > 4 (T = 15 °C).
Moreover, RMSD changes show that the α-helix becomes partly unfolded as the pH gets closer to 4.1 (Figure 3d). The structure-loosening effect of the pH drop is most pronounced between residues 13 and 21: the 3₁₀-helix (-G15GPSSG20-) tends to unwind, exposing both the Y8 and W11 core residues to external water molecules (Figure 3). Interestingly, moving beyond the isoelectric point, ordering of the ensembles takes place and the original F-state reappears at pH 2.0. At pH 6.9, 77 long-range NOE i → (i + 5 <) restraints were assigned, while at pH 2.0 in total 84 NOEs i → (i + 5 <) were found. This is rather unexpected since the overall charge, as well as the local charge distribution of E5, is indeed different at the above two pH values. Meanwhile, in between, at pH 4.9 only 22 NOEs of this kind were recorded (Figure 3b). In conclusion, the pH-scan shows that the basic topological features of the Trp-cage fold of E5 (and other analogues [41]) are preserved at pH 7.0 and 2.0, but weakened near the isoelectric point, where I-states of full refolding potential or of amyloidogenic misfolding capacity could be present simultaneously (Figure S5).
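The pH dependence of the net charge discussed in the pH-scan can be estimated with a minimal Henderson-Hasselbalch sketch built from the nominal pKa values quoted in the text. This is not the Prot pi calculation: it assumes independent titratable groups with fixed pKa values, although the text notes that these values depend strongly on backbone conformation. The function names are hypothetical.

```python
# Nominal pKa values quoted in the text; real values depend on conformation.
POSITIVE_PKA = [9.1, 12.5, 10.8, 12.5]        # N-terminus, Arg6, Lys13, Arg21
NEGATIVE_PKA = [2.2, 4.25, 4.25, 4.25, 4.25]  # C-terminus, Glu1, Glu2, Glu3, Glu14


def net_charge(ph):
    """Henderson-Hasselbalch estimate of the mean net charge at a given pH."""
    positive = sum(1.0 / (1.0 + 10.0 ** (ph - pka)) for pka in POSITIVE_PKA)
    negative = sum(1.0 / (1.0 + 10.0 ** (pka - ph)) for pka in NEGATIVE_PKA)
    return positive - negative


def isoelectric_point(low=0.0, high=14.0, tol=1e-4):
    """Bisection for the pH at which the estimated net charge crosses zero."""
    while high - low > tol:
        mid = 0.5 * (low + high)
        if net_charge(mid) > 0.0:
            low = mid   # still positive: the zero crossing lies at higher pH
        else:
            high = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    for ph in (2.0, 4.1, 5.0, 7.0):
        print(f"pH {ph:.1f}: net charge {net_charge(ph):+.2f}")
    print(f"estimated pI: {isoelectric_point():.2f}")
```

With these nominal values the estimate places the isoelectric point near pH 4.7 and gives a net charge of roughly +1.4 at pH 4.1, in qualitative agreement with the values quoted in the text (pI 4.8, positive net charge below pH 5). The differences reflect the simplified, conformation-independent pKa set.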

The effect of stirring
At pH 4.1 and T = 37 °C, in the absence of stirring or sonication, the gradual decay of the far-UV ECD spectrum intensities of E5 was detected, which is indicative of self-association leading to the loss of the monomeric form or of weakly bound low-molecular-weight associates in the solution (Figure S6). Stirring, however, greatly speeds up both the nucleation and fibril maturation stages of amyloidogenesis. It affects aggregation by increasing fragmentation, and hence the number of free ends to support further fibril growth, and possibly by increasing the number of collisions occurring between monomers and/or small oligomeric clusters. [42,43] Stirring the solution of E5 for 51 h (pH 4.1) resulted in a strong B-type FUV-ECD spectrum, indicating a predominantly β-pleated backbone structure. [44,45] When setting the pH to 4.9, 4.4, and 3.0, "mixed" ECD spectra were recorded (B/C-type) even after 51 h (Figure 4), signaling that, in solution, both helical and β-stranded structures are present and amyloid formation is less complete. At pH 7 and pH 2, pure C-type ECD spectra were measured (even when stirring), indicating that only α-helical backbone structures are in solution and thus these pH values prevent amyloid formation. To quantify the extent of the transition from the folded toward the amyloid-like phase, deconvolution of a large collection of ECD spectra was carried out (including T-dependent ECD curves of E5 and of E0 as an unfolded model miniprotein [37]). In this way, we obtained three pure component spectra: a C-type for α-helical conformation, a U-type corresponding to the unstructured and a B-type signaling β-stranded backbone conformation. More interestingly, a similar NUV-ECD spectrum analysis (see below) shows that this transition is more complex than a simple α- to β-backbone conformational shift. Complemented with dynamic light scattering measurement data, we conclude that these NUV-ECD spectral changes are associated with a gradual amyloid formation.
The overall path from the F- to the amyloid state is reported using a barycentric coordinate system (Figure 5) in which the gradual maturation of the amyloid as a function of time is visualized. At any point along the route, the ratio of the folded, unfolded, and amyloid states can be calculated from the deconvoluted spectral properties. As an example, the "mixed state" of [p(F)(t = 5 h) = 0.26, p(U)(t = 5 h) = 0.50, p(Amy)(t = 5 h) = 0.24], or simply (0.26, 0.50, 0.24), is shown in Figure 5 (see the "yellow dot" in Figure 5d).
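How such a triple of fractions maps onto a point of a triangular (barycentric) plot can be sketched as follows. The placement of the folded, unfolded, and amyloid corners on an equilateral triangle is an assumption made for illustration and need not match the layout of Figure 5; the function name is hypothetical.

```python
import math

# Assumed corner positions of the triangle (not taken from Figure 5).
VERTICES = {
    "F": (0.0, 0.0),
    "U": (1.0, 0.0),
    "Amy": (0.5, math.sqrt(3.0) / 2.0),
}


def barycentric_to_xy(p_f, p_u, p_amy):
    """Map (folded, unfolded, amyloid) fractions to a point inside the triangle."""
    total = p_f + p_u + p_amy
    if total <= 0.0:
        raise ValueError("fractions must sum to a positive number")
    # Normalize so that small fitting errors do not push the point off the plot.
    weights = {"F": p_f / total, "U": p_u / total, "Amy": p_amy / total}
    x = sum(weights[name] * VERTICES[name][0] for name in VERTICES)
    y = sum(weights[name] * VERTICES[name][1] for name in VERTICES)
    return x, y


if __name__ == "__main__":
    # The "mixed state" at t = 5 h quoted in the text.
    print(barycentric_to_xy(0.26, 0.50, 0.24))
```

Each pure state sits on its own corner, and a mixture lies at the weighted average of the three corners, so a time series of fitted fractions traces a path inside the triangle.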
By using this mapping technique, three phases of the amyloid formation of E5 could be differentiated: i) Initially, the path runs parallel to the folded→unfolded axis, with no or marginal contribution of the amyloid state, corresponding to a misfolding phase with the gradual accumulation of the unfolded/misfolded forms. ii) During the second phase the route runs parallel to the folded→amyloid axis, corresponding to the nucleation phase. During this phase a critical concentration of misfolded structures is reached, while the misfolded content remains nearly constant; the F-state diminishes while Amy-states start to accumulate. iii) The third phase runs parallel to the unfolded→amyloid axis, where no further reversible unfolding takes place: the misfolded/unfolded structures are trapped by the growing amyloids, which is the elongation phase of the E5 amyloidosis.
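The three phases can be told apart by asking which fraction stays nearly constant along a step of the path: a step parallel to the folded→unfolded axis leaves the amyloid fraction unchanged, and so on. The sketch below applies this rule to a hypothetical trajectory; the tolerance, the trajectory values, and the function name are assumptions for illustration, not data or procedures from this work.

```python
# Which fraction stays constant identifies the axis the path runs parallel to.
PHASE_BY_CONSTANT = {"Amy": "misfolding", "U": "nucleation", "F": "elongation"}


def classify_step(start, end, tolerance=0.03):
    """Name the phase of one step between two (p_F, p_U, p_Amy) points."""
    deltas = dict(zip(("F", "U", "Amy"), (e - s for s, e in zip(start, end))))
    if all(abs(d) <= tolerance for d in deltas.values()):
        return "stationary"
    constant = min(deltas, key=lambda name: abs(deltas[name]))
    if abs(deltas[constant]) > tolerance:
        return "mixed"  # all three fractions change: not parallel to any axis
    return PHASE_BY_CONSTANT[constant]


# Hypothetical trajectory of (folded, unfolded, amyloid) fractions.
trajectory = [
    (1.00, 0.00, 0.00),
    (0.55, 0.45, 0.00),
    (0.25, 0.45, 0.30),
    (0.25, 0.15, 0.60),
]

phases = [classify_step(a, b) for a, b in zip(trajectory, trajectory[1:])]
print(phases)
```

For this idealized trajectory the three steps are labeled misfolding, nucleation, and elongation in turn. Real data would be noisier, so the tolerance would need to be chosen with the fitting uncertainty in mind.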
The rate of amyloid formation was also probed at a fixed protein concentration (e.g., c(E5) = 400 μM) with increasing salt concentration (c(NaCl): 0 mM, 12.5 mM, 25 mM, 50 mM) (Figure 6b, Figure S10). We found that both misfolding and nucleation occur at a similar rate. However, during the elongation phase clear differences were detected as a function of c(NaCl). Upon stirring for three days (t = 76 h) the amyloid ratio was found to be higher if the salt concentration was lower: [Amy](c(NaCl) = 0 mM) = 0.85, [Amy](c(NaCl) = 12.5 mM) = 0.78, [Amy](c(NaCl) = 25 mM) = 0.63, [Amy](c(NaCl) = 50 mM) = 0.52. This finding, at first glance, could suggest that the easy way to avoid amyloid formation, at least for E5, is to use a high salt concentration. However, as we only detect amyloids of limited size by ECD (those that remain part of the solution), it is more likely that the larger salt concentration speeds up amyloid maturation and thus eliminates shorter amyloid fragments from the solution, an assumption more in line with the literature data. Moreover, at the salt concentrations used here (< 100 mM), besides the nature of the anion and cation, specific ion binding to the polypeptide chain was also shown to contribute to the rate of amyloidosis. [46-48] This could well be the case for E5 also: during MD simulations of the monomeric protein (both at pH 7 and pH 4.1 in 0.15 M NaCl solution) ca. 10% of all ion-protein interactions involved the charged residues of the Glu14↔Arg21 salt bridge, which might contribute to the loosening of the hydrophobic core of E5 (Figure S11), influencing the ratio of the misfolded structures present.

Temperature-dependence revisited
The effect of temperature on the kinetics of amyloid formation was revisited by using the fine-tuned conditions of amyloid formation (c(E5) = 250 μM, c(NaCl) = 50 mM and pH 4.1) (Figure 6c, Figure S12). We found that the process is slower and incomplete both at 23 °C and 47 °C, compared to that at the physiological temperature (37 °C). It seems that at a high temperature the increased thermal motion disfavors self-association:

Verifying amyloidosis: NUV-ECD measurements
Amyloid transformation was monitored from the viewpoint of the Y8↔W11 aromatic interaction by acquiring NUV-ECD data. At pH 6 the characteristic positive band at 283 nm (over the entire tested protein concentration range) is indicative of a shifted face-to-face π-π interaction between the two aromatic rings of the hydrophobic core of E5 (Figure 7a). However, at […] (Figure 7b) and assigned to an edge-to-face π-π interaction, signaling that misfolding of E5 concerns the relative reorientation of the two aromatic rings of the core. However, as amyloid transformation proceeds, these negative bands are gradually reverted and positive bands similar to those measured for the folded state appear at a slightly shifted position. As amyloid formation progresses, these bands intensify: after 26 h, band intensities are about 10 times higher than those of the initial F-state (Θ at ≈283 nm: 4000 → 40 000) (Figure 7a and c). This side chain restructuring coincides with the backbone changing from an α- to a β-state. Thus, we propose that the enhanced positive NUV-ECD bands between 270 < λ < 290 nm no longer arise from a pairwise, intramolecular shifted face-to-face π-π interaction, but rather from the interstrand interaction of Tyr and Trp side chains packed tighter in the supramolecular assembly of the amyloid phase.

The nature of the misfolded states
Amyloid formation proceeds at physiological temperature, but it can be quenched if the sample is cooled to 4 °C. Thus, a series of heteronuclear correlation spectra (1H-15N HSQC) were recorded at 4 °C using samples retrieved at regular time intervals (every 30 min) during amyloid formation (Figure S13).
The 1H,15N chemical shifts of all residues were calculated by using Equation (1) as a function of time (0 < t < 24 h). Residues showing larger-than-average chemical shift changes (t(end) − t(0) > 0.047 ppm) are those of the α-helix (E2, E3, V5, L7, I9, Q10, K13), signaling that amyloid formation affects this region. Residues of the helix were found to be the most temperature sensitive as well (Figure S2), indicating that this region is assailable as soon as the shielding efficacy of the Trp-cage fold against water gets reduced. This region (segment -R6LYIQWL12-) was also predicted to be the most amyloidogenic of the sequence by CamSol. [49] Therefore, we propose that the α-helix itself is the seed of amyloidogenic nucleation. As its misfolding and transient unwinding allow the reorientation of the aromatic side chains, di- and oligomers of the misfolded monomers can interact via their exposed hydrophobic cores.
Figure 7. NUV-ECD (240-325 nm) spectra reveal how conformers and interaction modes of the Y8↔W11 aromatic residues change as a function of the pH and time. At pH 4.1 b) the π-π interaction profile changes as protein concentration increases (84 < c(E5) < 730 μM at 25 °C, c(NaCl) ≈ 0 mM), whereas it remains unchanged a) at pH 6.0. c) NUV-ECD spectral shift signals indicating the changing interaction mode of Y8 & W11 as a function of time (0 < t < 26 h) during amyloid formation (c(E5) = 800 μM, c(NaCl) = 12.5 mM, pH 4.1, stirring at T = 37 °C).
Chem. Eur. J. 2020, 26, 1968-1978, www.chemeurj.org. © 2019 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Amyloid formation is optimal at pH 4.1, where the overall charge of E5 is near +2 (Figures 1, 2, and 3) and thus, on average, three out of the four Glu side chains are protonated. To identify the most likely protonation state to trigger misfolding, MD simulations were carried out. At pH 7 only one possible protonation motif, that of E5[¹E⁻¹, ²E⁻¹, ³E⁻¹, ¹⁴ […] ¹⁴E⁰] was significantly different from any of the above when descriptors i-ix were evaluated; this state shows strong resemblance to the loosened NMR structural ensemble measured at pH 4.1. We found that in the case of E5 pH 4.1 [¹E⁻¹, ²E⁰, ³E⁰, ¹⁴E⁰] both the d(11W↔17P) and d(11W↔23P) distances are lengthened, by 0.4 and 0.3 Å, respectively, and in some conformers the d(11W↔21R) distance shifted from 2.85 to 5.85 Å, indicating the appearance of less compact protein folds. Furthermore, as the distribution of μ gets wider, the -PPP24- segment twists more often with respect to the main axis of the α-helix. In parallel, d(8Y↔11W) increases by 0.9 Å. The relative orientation of the aromatic side chains changes as well: the measured θ shows a more diverse distribution relative to the other microstates, while the values of α and φ increase slightly. Summing up, we might say that the E5 pH 4.1 [¹E⁰, ²E⁰, ³E⁰, ¹⁴E⁻¹] protonation microstate has an enhanced backbone conformational freedom and a significantly rearranged hydrophobic core with respect to all the others. The Trp-cage gets occasionally unfolded, giving rise to unshielded aromatics and exposed backbone amide groups, ready for self-association and subsequent amyloid formation (Figure S14), though these events are transient and misfolding is temporary. However, it should be noted that the MD simulations carried out here consider isolated molecules. The exposed aromatic and hydrophobic side chains create a "sticky" interaction center for E5, much like the free β-edges that appear transiently on the surface of locally misfolded large, globular proteins and become initiators of aggregation. Thus, when collisions among similarly loosened conformers of E5 are also considered, the described changes might become sufficient to create the first di- and oligomer nuclei of aggregates (Figure 9).
These findings also explain why stirring is necessary for amyloid formation of E5: given that only one of the four possible protonation patterns at pH 4.1 produces a "misfolded enough" conformer, a great number of collisions are required for successful nucleation.
From the three probed offsets, YW turned out to be both the most stable at pH 4.1 and the most sensitive to pH shift (since both at pH 2 and 7 the folded conformation is most stable and practically no amyloid formation could be detected, we expected the most realistic model to be significantly more stable at pH 4.1) (Figure S17). Considering that in the -PPP24- segment φ(Pro) must be ca. −60° and thus this part of the sequence is unlikely to form an extended β-sheet structure (with φ ≈ −145°, ψ ≈ +145°), in the case of the YW offset 18 pairs of interstrand H-bonds could be formed between residues 1-18 in total. In YW pH 4.1 on average 10 out of 18 H-bonds are present during the MD simulation between the two middle chains, typically involving residue pairs such as V5-E14, L7-L12, I9-Q10, W11-Y8, and K13-R6. These interstrand H-bonds confirm that an aggregation core can indeed form between residues 5-14, as suggested above. Furthermore, in this arrangement of YW pH 4.1 an aromatic ladder forms between the Y8 and W11 side chains of adjacent β-strands, as they come into close proximity (d(8Y↔11W) < 6 Å as measured between the centers of the rings) (Figure S16 D). The aromatic ladder enables the formation of the shifted face-to-face π-π interaction expected based on the ECD spectroscopic data. Interestingly, the derived orientation and spacing of the Tyr and Trp side chains here is also quite reminiscent of the aromatic clusters created by mutations within the central single-layer β-sheet of Borrelia outer surface protein A (OspA) to probe the structure-ordering capacity of such residues. [53] Using the mid-structures of six different clusters (see Methods) of the MD simulation of (E5 pH 4.1)₄ YW as possible amyloid seeds, we also probed different inter-sheet arrangements by using a protein-protein docking algorithm. The near-parallel arrangements thus obtained could be grouped into two fundamentally different topologies (using the nomenclature introduced by Eisenberg and co-workers [54]): those of class 7 (equifacial, antisymmetric, up-up) with translational symmetry, and class 8 (antiparallel, equifacial, antisymmetric, up-down) with twofold symmetry (Figure 10).

Conclusions
Concerning E5, the Exenatide variant studied here, we found that its amyloid aggregation is most likely triggered by the transiently exposed aromatic and hydrophobic side chains of loosened (misfolded) conformers, which create a center for intermolecular associations. The clusters thus created provide stabilization (spatial closeness for sufficient time) for the much slower process of α-helix to coil and then to extended β-strand transition, which seems to be highly unfavorable for the monomeric forms but eventually leads to the formation of a β-structured amyloid nucleus. The amyloidosis of E5 thus follows the nucleated growth mechanism. [42,43] Studying the early phases of amyloid formation is crucially important both for understanding the initialization of various pathophysiological processes and for aiding the design of nontoxic peptide medications that will not become initiators of such processes themselves. We derived a new technique for monitoring amyloid progression that uses simple ECD measurements and can be applied even if the amyloid in question is ThT silent. Decomposition of the spectra and their barycentric representation on the folded-unfolded-amyloid potential energy surface of the amyloidic transition can be applied to filter out potentially harmful sequences from development. Generally, it can be concluded that the lowest possible salt concentration, low temperatures, and the absence of agitation can prolong the shelf-life of any polypeptide and protein medications, but, somewhat disturbingly, we also found that the optimum temperature for E5 amyloidosis coincides with our body temperature and that the process requires a salt concentration well below the physiological level. This underlines how important it is to test the aggregation propensity of any drug candidate.

Experimental Section
Protein expression and purification: The E5 miniprotein was produced by bacterial expression using the previously published protocol. [33]
Preparation of amyloid form of E5 miniprotein: The lyophilized E5 samples were dissolved in distilled water. First, the pH of the protein solution was adjusted to pH 7 with 0.1 M NaOH solution (Orion Star A211 pH meter (Thermo Scientific)), then decreased to 4.1 or lower (pH-dependent amyloid formation) with 0.1 M HCl solution. Finally, the desired NaCl concentration was adjusted with concentrated NaCl solution. The protein solution was stirred (with a magnetic stirrer) and incubated at the target temperatures. At a given time, a 10-fold diluted sample was measured with ECD spectroscopy. The concentration of the diluted sample was determined with a NanoDrop Lite Spectrophotometer (Thermo Scientific) at 280 nm.
NMR experiments: Datasets were collected with a 16.4 T Bruker Avance III spectrometer equipped with a 5 mm inverse TCI probehead with z-gradient. The spinlock (d9) for 1H-1H TOCSY was 80 ms, while a mixing time (d8) of 150 ms was used for 1H-1H NOESY spectra. NMR structure calculations were performed and refined by cooperative use of CcpNmr Analysis 2.4.1, Aria 2.0 [61] and CNS Solve 1.2. [62]
Molecular dynamics simulations: MD simulations were carried out as implemented in GROMACS, [59] using the AMBER-ff99SBildnp* force field. The systems were solvated with TIP3P water molecules in dodecahedral boxes with a size allowing 10 Å between any protein atom and the box. Trajectories of 600 ns ((E5)₄ models) to 1000 ns (E5 monomers) NPT simulations with a 2 fs time step at 310 K and 328 K and 1 bar were collected (with snapshots at every 4 ps).
Electronic circular dichroism spectroscopy: Far- and near-UV ECD measurements were carried out with a Jasco J810 spectrophotometer. The temperature at the cuvette was controlled with a Peltier-type heating system.