Full atomistic model of prion structure and conversion

Prions are unusual protein assemblies that propagate their conformationally-encoded information in absence of nucleic acids. The first prion identified, the scrapie isoform (PrPSc) of the cellular prion protein (PrPC), caused epidemic and epizootic episodes [1]. Most aggregates of other misfolding-prone proteins are amyloids, often arranged in a Parallel-In-Register-β-Sheet (PIRIBS) [2] or β-solenoid conformations [3]. Similar folding models have also been proposed for PrPSc, although none of these have been confirmed experimentally. Recent cryo-electron microscopy (cryo-EM) and X-ray fiber-diffraction studies provided evidence that PrPSc is structured as a 4-rung β-solenoid (4RβS) [4, 5]. Here, we combined different experimental data and computational techniques to build the first physically-plausible, atomic resolution model of mouse PrPSc, based on the 4RβS architecture. The stability of this new PrPSc model, as assessed by Molecular Dynamics (MD) simulations, was found to be comparable to that of the prion forming domain of Het-s, a naturally-occurring β-solenoid. Importantly, the 4RβS arrangement allowed the first simulation of the sequence of events underlying PrPC conversion into PrPSc. This study provides the most updated, experimentally-driven and physically-coherent model of PrPSc, together with an unprecedented reconstruction of the mechanism underlying the self-catalytic propagation of prions.

Introduction Prion diseases are infectious neurodegenerative disorders characterized by an invariably lethal outcome caused by a proteinaceous infectious agent named "prion" [1]. The central event in these pathologies is the conversion of PrP C , a GPI-anchored protein of unknown function, into a misfolded isoform (PrP Sc ) which accumulates in the central nervous system of affected individuals [6]. While PrP C structure has been widely characterized, and consists of a N-terminal disordered tail and a C-terminal globular domain [7], no high-resolution information is available for PrP Sc due to technical challenges posed by its high insolubility and aggregation propensity [8]. In order to fill this gap, different atomistic models based on low-resolution experimental data have been proposed, including a Left-handed-β-Helix (LβH) structure spanning residues 89 to 170 while retaining the two C-terminal α-helices of PrP C [9], and a Parallel In-Register Beta-Sheet (PIRIBS) architecture, characterized by intermolecular stacking of aligned PrP monomers [10]. Recent cryo-EM data obtained using infectious, anchorless PrP Sc fibrils [4] provided strong evidence indicating that PrP Sc fibrils consist of two independent protofilaments, and of the existence of 2 nm structural units repeating along each protofilament axis, suggestive of a 4-rung β-solenoid (4RβS). This would be fully compatible with the LβH model, and much less so with the PIRIBS model, which posits that PrP Sc fibrils are not made up by two protofilaments, but rather, by a single wider filament that features two subdomains or lobes separated by a cleft (Goveman et al., 2014). Moreover, the PIRIBS model fails to accommodate glycosylated residues in PrP Sc , which would result in the introduction of excessive steric clashes [11]. It should be pointed out, however, that the low resolution of available experimental data still does not allow to definitively discard any option (a detailed comparison of PIRIBS and solenoid-based models can be found in [12]). Of note, while consistent with the mentioned experimental constraints, the proposed LβH model is incoherent with a recent reevaluation of previous FTIR data suggesting that PrP Sc does not contain α-helices [8].

Construction of an experimentally-driven atomistic model of infectious mouse prion
To satisfy current experimental evidence and theoretical structural constraints, we built a new atomistic model of mouse PrP Sc based on the 4RβS conformation, and tested its stability by means of all-atom MD simulations. The construction of the model took into account an array of experimental data, including: (i) cryo-EM [4] and X-ray fiber-diffraction studies [5], which showed that the fold of a mouse glycosylphosphatidylinositol (GPI)-anchorless, infectious PrP Sc is compatible with a 4RβS architecture with L-or T-shaped cross-section [4]; (ii) circular dichroism (CD) and FTIR spectroscopy, which ruled out the presence of α-helices, and suggest that PrP Sc contains approximately 40-50% β-sheet and 50-60% coil/turns [8]; (iii) Mass Spectrometry (MS) analyses indicating the presence of an intact disulphide bond between residues C178 and C213 (mouse sequence) [13], as well as mapping Proteinase K (PK)-sensitive residues, which reflect amino acids likely excluded from the resistant core of the protein [14] [15]; and (iv) the possibility of accommodating complex glycans at positions N180 and N196 [11]. All these constraints were comprehensively included into a 2D threading scheme spanning mouse PrP (moPrP) residues 89-230 (S1 Fig), also considering the structural propensities of different residues: polyglycine tracts and prolines were positioned in loops due to their destabilizing effects on β-strands; charged sidechains were excluded from the inner core of the protein or counterbalanced by salt bridges. This scheme was then modelled onto the 3D arrangement of a naturally-occurring β-solenoid protein (Dickeya dadantii Pectate Lyase; PDB 1AIR). The resulting structure (depicted in Fig 1) features an inner core containing mainly hydrophobic or mildly polar side-chains (T94, T106, L108, V111, Y127, M128, W144, Y149,  V165, Y168, I181, I183, V188, F197, T198 and T200), few polar side-chains involved in hydrogen bonding (N142-HB-Y168, H168-HB-T198, Q216-HB-T200 and Q218-HB-S221), and a salt bridge (R147-SB-D166). Conversely, the majority of the highly-polar residues (N and Q) including the glycosylation sites (N180 and N196) and charged side-chains (E, D, K and R) are exposed to the solvent. The structure also encompasses identified PK cleavage sites localized in loops/turns, or at the edge of the β-strands, and the intact disulphide bond between C178 and C213. Importantly, the final model fitted with a previously described, low-resolution cryo-EM map of infectious PrP Sc (S2 Fig).

Molecular dynamics simulations of the new 4RβS model
To test the physical consistency of the 4RβS model, we challenged its stability by all-atom MD simulations in explicit solvent. First, three independent, 20 ns simulations were performed in the isothermal-isobaric ensemble (T = 300 K, P = 1 Bar) restraining hydrogen bonds distances between atoms involved in β-sheets. This process allowed relaxation of protein loops and side chains of the core (S3 Fig). Next, the imposed restraints were released, and three plain-MD trajectories of 100 ns each were simulated. The stability of the model was then assessed by both Root Mean Squared Deviation (RMSD) of atomic positions relative to the initial frame (t = 0 ns) and secondary structures content. Interestingly, we obtained values in the same range of fluctuation for our 4RβS model and the prion-forming domain of the fungal protein Het-s, a naturally occurring β-solenoid whose structure has been solved by solid-state NMR (ssNMR; PDBs 2KJ3 and 2RNM; Fig 2,and S4 Fig). Conversely, by applying an identical workflow to the previously proposed model of laterally-stacked LβH trimer of PrP Sc , we observed a profound instability of the β-helical domain, which was already evident after few tens of ns of dynamics (Fig 2 and S4 and S5 Figs).
Next, we tried to adapt alternative threading schemes to the 4RβS architecture. We obtained two new schemes carrying small deviations (i.e. one or two residue shifts) from the original model but still fully coherent with the experimental constraints (S6 Fig). Every other attempt to model threading schemes with a larger deviation (>3 residue shifts) resulted in violation of experimental constraints. Interestingly, MD simulations revealed that the two alternative models are also significantly stable (S6 Fig). These data suggest that the new threading schemes may represent putative alternative arrangements of the PrP polypeptide into an RML prion fibril. This conclusion implies an intrinsic flexibility of 4RβS architecture to accommodate slightly different arrangements, which could be related to the ability of prions to occur in different conformations specifying prion strains.
To mimic the fibrillary behavior of PrP Sc , we built a tetrameric 4RβS model by stacking monomers in a head-to-tail fashion. As expected, this assembly showed a comparable MD stability to the monomer (Fig 3). Moreover, the 4RβS tetramer fits with the two main signals obtained by Fourier transform single particle analysis of cryo-EM data (19.1 Å and the~40 Å) [4], which reflect distances between the same residues on two contiguous or alternate monomers, respectively, as well as with the observed volume of the protofilaments (Fig 3). Finally, by introducing complex sugar precursors (GlcNAc2Man3Fuc) at positions N180 and N196 of each monomer, we observed the absence of steric clashes, confirming that the 4RβS model accommodates the presence of glycans (S6 Fig). Collectively, these findings indicate that, in contrast to the LβH model, the 4RβS is a solid arrangement for PrP Sc , built on the most recent experimental data, and showing a conformational stability comparable to that of a natural prion.

All-atom reconstruction of prion conversion
The 4RβS model allowed us to develop an original scheme to perform for the first time a simulation of the conformational transition from PrP C to PrP Sc . In order to bridge the gap between the computationally-accessible and the biologically-relevant time scales, we employed a Upper graphs report the RMSD deviation from initial configuration for the entire structure (blue lines) or the β-solenoid core (orange lines) of the different proteins. Filled curves indicate standard error of the mean. Results show a comparable stability between the Het-s dimer (A) and the 4RβS model (B), with an average RMSD of the hydrophobic core (calculated as the average of the three trajectories over the last 5 ns, ± standard deviation) of 2.4 ± 0.2 Å for the Het-s dimer (similar results for Het-s stability are reported [35]) and 2.1 ± 0.3 Å for the 4RβS. In contrast, the structural deviation of the LβH (C) hydrophobic core is approximately two-fold higher, reaching a value of 4.3 ± 0.6 Å. Lower graphs indicate the α-helical (blue lines) or β-sheet (red lines) content of each protein. The initial and final β-sheet content was calculated as the average of the three trajectories over the last 5 ns of the restrained and unrestrained MD simulations, respectively. The Het-s dimer showed a variation from an initial 41.8 ± 2.3% to a final 33.2 ± 2.8%. Similarly, the 4RβS model deviates from an initial 41.3 ± 2.3% to a final 36.6 ± 5.4%. Instead, the LβH model deviates from a starting 18.3 ± 1.4% to a 7.5 ± 3.1%. These results are illustrated by the structures shown below the graphs, which represent the initial (top) and final (bottom) frames of the MD trajectories. https://doi.org/10.1371/journal.ppat.1007864.g002 Full atomistic model of PrPSc specific kind of biased dynamics called ratchet-and-pawl MD (rMD) [16] in the framework of an all-atom realistic force field [17]. rMD-based methods have been successfully applied to simulate protein folding and other conformational transitions [18,19]. However, this scheme provides a sampling of the transition path ensemble only if the biasing force is applied along a reliable reaction coordinate [20]. Therefore, the first step towards developing our rMD simulation was to build a statistical model to identify the reaction coordinate of the process. Using the Markov State Mode formalism, we demonstrate that among all the possible misfolding Blue bars indicate the distance between two residues in the same position on two consecutive monomers, which corresponds to 19.2 ± 0.4 Å (t = 0 ns) and 19.8 ± 2.2 Å (t = 100 ns). Purple bars indicate the distance between a residue in one monomer and the same residue on the second forthcoming monomer, which corresponds to 38.4 ± 0.5 Å (t = 0 ns) and 39.6 ± 1.8 Å (t = 100 ns). A similar pattern of signals reflecting monomeric and dimeric repeats has previously been observed by cryo-EM studies on Het-s [36]. Both values are in almost perfect agreement with the two main signals obtained by Fourier transform single particle analysis in the cryo-EM experiment (19.1 Å and the~40 Å signals). The average monomeric model volume calculated at beginning and at the end of the 100 ns tetramer simulations are equal to (18.5 ± 0.7)�10 3 Å 3 and (18.5 ± 0.3)�10 3 Å 3 respectively; while the estimated cryo-EM monomeric volume from the protofilament is equal to 18.9�10 3 Å 3 . (B) Upper graph shows the RMSD deviation of the tetramer from initial state for the entire structure (blue lines) or the β-solenoid core (orange lines). Structural deviation over the 100 ns of simulation corresponds to 2.6 ± 0.2 Å. Lower graph report the secondary structures percentage, initial β-strand content is 46.2 ± 1.2%, while the final 38.6 ± 1.7%. Filled curves indicate standard error of the mean. These results indicate a comparable stability between monomeric and tetrameric 4RβS structures. (C) Map of the experimentally observed PK cleavage sites (colored in red) overlapped with the probability of each residue to be in an extended conformation, calculated over the last 5 ns of the MD trajectories. patterns from PrP C to PrP Sc the prominent reaction mechanism is the sequential formation of rungs. A biasing coordinate was then built by coupling this dynamical information with the all atom 4RβS structure. The associated rMD simulations yielded a transition pathway with full atomistic resolution in which the C-terminal rung of the solenoid acts as a primary conversion surface for PrP C unstructured N-terminus (residues 89-124). This first event initiates a cascade of conformational transitions in which each newly formed rung acts as a template for the formation of the following one, ultimately leading to the complete conversion of PrP C into PrP Sc (Fig 4 and S1 Movie). This analysis provides a rigorous description of the active role of protofilament ends in the templated propagation of prions, and it is compatible with previous observations suggesting the presence of a structured intermediate conformer of PrP C in its transition to PrP Sc [21]

Fig 4. Graphs and Frames Extracted from the rMD Simulation of PrP C to PrP Sc Conversion.
Upper graph reports the fraction of native and misfolded contacts of the instantaneous configurations of the PrP C -PrP Sc complex, starting from the initial modelled configuration (depicted in 1) in which PrP C contacts the 4RβS monomer, with respect to the final target configuration (4RβS dimer). Lower graph shows the secondary structure content. Pictures of the evolving complex represent frames extracted from the entire conversion simulation at precise rMD steps, corresponding to the refolding of PrP C as follow: (2) residues 89-115; (3) residues 89-151; (4) residues 89-190; (5) residues 89-230. The process highlights the progressive unfolding and refolding of PrP C onto the 4RβS template, which initially involves the unstructured region, followed by the loss of α-helices and a progressive formation of β-sheets.

Discussion
The elucidation of the structure of PrP Sc at atomic resolution has proven to be a phenomenal experimental challenge, mainly due to its high insolubility and aggregation propensity. Previously generated computational models of PrP Sc have the virtue of providing a plausible 3D structure, but fail to comprehensively accommodate most recent experimental data [8]. Here, we filled this gap by exploiting the information arising from cryo-EM and X-ray fiber-diffraction studies [4,5], which clearly defined the general architecture of PrP Sc as a 4RβS and refined the structure by including experimental constraints obtained by mass spectrometry. The model we have created allowed us to perform the first reconstruction of how the information encoded into the conformation of a protein could be propagated in a directional fashion, a concept directly underlying the infectious nature of prions.

Virtues and limits of the 4RβS model
The search for effective therapies against prion diseases has so far been mostly unsuccessful [22], partly due to the lack of detailed information regarding the structure of PrP Sc . In fact, defining the structure of infectious prion particles at high-resolution would allow the rational design of molecules that directly target PrP Sc and inhibit its replication and accumulation into the nervous system. Unfortunately, this task is complicated by the wide heterogeneity and high propensity to aggregate of prions, aspects that make canonical high-resolution techniques such as NMR and X-ray crystallography inapplicable [8]. In an attempt to fill this gap, low-resolution methods have provided important experimental constraints illuminating the macroscopic architecture of prion fibrils and laying the groundwork for the application of computational techniques such as those used in this study. While these approaches still suffer from the lack of direct high-resolution information, they are undoubtedly valuable for interpreting the available information and inspire new hypotheses and approaches. Indeed, the new model described in this study possesses several important features. First, it displays an intrinsic coherence with state-of-the-art knowledge about infectious PrP Sc fibrils and it appears to be as stable as the naturally-occurring prion Het-s [3]. Importantly, it also represents a unique workbench for interpreting future structural data or available biological evidence. In fact, it will be possible to test the effect of variations in the PRNP gene on the stability of the model by inferring mutations or polymorphisms known to favor or inhibit prion propagation and analyze the resulting structures by MD. As an example, it is known that the presence of an isoleucine at the polymorphic site 109 (M/I) in Bank vole PrP results in a substantially higher susceptibility to prion disease [23]. Assuming that the structure of prions in mouse and Bank vole are similar, as suggested by the difference in just 9 residues between murine and Bank vole PrP, and in light of the known higher propensity of isoleucine to participate in β-strands as compared to methionine [24], our model may explain the effect of the 109 polymorphism on prion replication in Bank vole. Our 4RβS structure could in principle be adapted to model prions from Bank vole, human, deer, ovine and other species, as well as be employed to test how the effect of known polymorphisms and mutations affect prion stability and/or propagation. Similarly, it will be possible to rationally predict novel amino acid substitutions that could influence the stability of the 4RβS architecture and test their effects in surrogate in vitro (e.g. PMCA) or cell-based assays for prion replication. As a result, these strategies will produce new experimental constraints, which in turn will help to further improve the accuracy of the model, leading to an iterative cycle of prediction, testing and refinement. While waiting for a technological innovation that will definitively solve the structure of infectious prions, this strategy appears as the most feasible and promising among the available options.

Using the 4RβS model to address fundamental questions in prion biology
Since the original hypothesis of prions as protein particles responsible for transmissible brain pathologies, later evolved into a much wider concept of protein-based inheritance, tens of laboratories have intensively investigated the molecular features of these infectious agents. However, despite enormous efforts worldwide, many important questions of prion biology remain unanswered. For example, we still don't fully understand the nature of the biological barriers that disfavor or fully impede the propagation of prions among different species. The reconstruction of prion conversion described in this manuscript could provide a unique opportunity to try to address this issue. For example, it will be possible to test the extent by which our templated conversion model tolerates sequence mismatches between the 4RβS template and the PrP substrate. These analyses could be used to interpret available biological evidence, for example the unusual ability of Bank vole PrP molecules to escape species barriers. Interestingly, our model may also help to improve current understanding of the prion strain phenomenon, which until now has been generally conceptualized as the same PrP polypeptide adopting different conformations. We show that the 4RβS scaffold can accommodate slightly different sequence threading schemes while remaining coherent with low-resolution experimental constraints and maintaining structural stability (S6 Fig). This observation suggests surprising flexibility of the 4RβS architecture to the alignment of residues in the β-strands, a feature that may introduce small conformational variability in each PrP Sc monomer and result in large structural effects at the level of fibrils. This conclusion implies that the different biochemical and biological features of prion strains can be encoded in the small differences of PrP sequence alignment along the 4RβS scaffold.

The 4RβS model inspiring therapy
Computer-aided technologies are among the most promising strategies for the discovery of novel therapeutics against a wide variety of human illnesses, including prion diseases. In most cases, these approaches require the availability of high-resolution information about the target protein, but in absence of experimentally solved structures, as is the case for PrP Sc , may also be applied to structural models. From this standpoint, our 4RβS may represent a unique opportunity to rationally design new anti-prion compounds. These molecules may be directed against two main structural elements of a putative 4RβS fibril: the "sticky" end, which exposes unpaired β-strands to the solvent, or the lateral surface. In the former case, the compound could block the insertion of a new PrP polypeptide into the growing fibril, while in the latter it would act as a stabilizer and prevent breakage into smaller, more diffusible elements. Interestingly, a similar mechanism has already been proposed to explain the anti-prion effects of luminescent conjugated polythiophenes (LCPs), molecules presenting consecutive thiophene rings exposing regularly spaced carboxyl side groups which could promote the interaction with lateral surfaces of prion fibrils [25]. It will undoubtedly be interesting to use our 4RβS structure to model the binding architecture of LCPs, as such information could then be exploited to optimize these compounds and/or rationally design immunological mimics and immunogens giving rise to PrP Sc -specific antibodies or prophylactic vaccines against prion diseases.

Template selection & model building
In order to accommodate the moPrP 89-230 sequence, we selected a β-solenoid architecture capable of satisfying the following requirements: (i) Approximate number of residues in extended conformation higher than 12 per rung. Requirement needed to fit the secondary structure content of approximately 40% of β-sheets [8]; (ii) Capacity of accommodating two extended loops per rung. Namely, the possibility of introducing arbitrarily long sequences between consecutive β-strands in order to allow the exclusion of PK cleavage sites, glycine tracts and prolines from the core of the solenoid; (iii) L or T cross-section solenoidal-shape. Architecture coherent with cryo-EM maps; (iv) Capacity of accommodating bulky aromatic residues in the hydrophobic core (e.g. moPrP 89-230 contains 11 Tyr, 3 Phe and 1 Trp). All these requirements are satisfied by the right-handed, L-shaped β-solenoid architecture. In contrast, left-handed-solenoid structures typically display smaller rungs [26], impairing the modelling of the desired number of residues in extended conformation as well as the accommodation of bulky residues in the core. The left-handed structure of the prion forming domain of Het-s (PDB 2KJ3 and 2RNM) is an exception in terms of β-sheet content, however its hydrophobic core is mainly composed of small side-chains and it allows the insertion of only one arbitrarily long loop connecting two consecutive rungs [3]. The threading scheme for the PrP 89-230 sequence (S1 Fig) was obtained by following the L-shaped β-solenoid architecture. In the threading process, PK cleavage sites, prolines and glycine tracts were positioned in the loops (when this was not possible, at the edges of beta strands). Charged side-chains were excluded from the inner core of the solenoid or paired with residues forming salt bridges. The presence of an intact disulphide bond between C178 and C213 and solvent exposure of N180 and N196 sidechains (in order to accommodate glycans) were also considered.
The 3D structure of the monomeric model of PrP Sc was constructed following these steps: (i) 4 Rungs of Dickeya dadantii Pectate Lyase (PDB 1AIR; two repetitions of residues 168-235) were used to obtain the scaffold for the β-solenoid core; (ii) The original loops of the protein were removed; (iii) The residues in the hydrophobic core of the original PDB structure were replaced with PrP residues (as indicated in S1 Fig) using Chimera [27]; (iv) New loops were generated using the MODELLER tool [28] in Chimera. Each loop was selected from a set of 20 proposed conformations. Structures resulting in extended atomic clashes were discarded, and finally the best performing model in term of DOPE score was selected; (v) Side-chain geometry was optimized using the Dunbrack's rotamer library. In particular, for highly polar and charged side-chains in the hydrophobic core, geometries capable of forming hydrogen bonds or salt bridges with nearby residues were selected; (vi) The energy of the system was minimized in vacuum with the steepest descent algorithm in Gromacs 4.6.5 [29]. System topology was generated using Amber99SB-ILDN force field [17]. A restraining potential during energy minimization between the H and O atoms involved in hydrogen bond formation between backbone residues was added, as in Eq 1: Where r ij is the distance between the H and the O atoms involved in hydrogen bond formation; r 0 is the original hydrogen bond distance (2 < r 0 < 2.5) Å and r 1 is an upper limit equal to 4 Å, while k dr is set to 2�10 3 kJ/(mol�nm 2 ).
This strategy was applied to favor the movement of the side-chains and accommodation of loops, while impairing the backbone deviation of the residues involved in the β-solenoid core formation; (vi) Additional optimization of the backbone (in order to remove Ramachandran outliers) and side-chain geometry was performed using Coot [30]; (vii) Absence of steric clashes was verified using Chimera, setting the VDW-overlap threshold as 0.6 Å, subtracting 0.4 Å to account H-Bonding pairs and ignoring contacts of pairs <4 bonds apart. The 3D structure of the tetrameric model of PrP Sc was assembled by stacking four monomers in a head-to-tail fashion using Chimera, maintaining the proposed threading. The strands 198-TETD and 215-TQYQKESQAYY were stacked on the top of 94-THNQ and 105-KTNLKHVAGAA of the forthcoming monomer, respectively. The structure was then energy minimized following the same protocol used for the monomer. Glycans (DManpα 1-6 [DManpα 1-3]DManpβ 1-4DGlcpNAcβ 1-4[LFucpα 1-6]DGlcpNAcα 1-Asn) attached to residues N180 and N196 were added using the doGlycans tool to the 4RβS tetramer, and the structure was energy minimized in vacuum and then explicit solvent using Gromacs 5.1.2.

Molecular dynamics simulations
Molecular Dynamics simulations were performed on the supercomputers Finis Terrae II (CESGA, Santiago de Compostela, Spain) and Marconi (CINECA, Bologna, Italy) using Gromacs 5.1.2. The following protocol was applied for the whole set of simulations performed in this work. Protein topologies were generated using Amber 99SB-ILDN force field. The structures were accommodated in a dodecahedral box with periodic boundary conditions. The minimum wall distance from the protein was set to 12 Å. The box was filled with TIP3P water molecules. The charge of the system was neutralized with the addition of Na + or Clions. The system was energy minimized in explicit solvent using a steepest descent algorithm, retaining the restraining potential (Eq 1). From the minimized structure, three independent equilibrations with position restraints on heavy atoms were launched: in the NVT ensemble (300 K) for 500 ps using the Nose-Hoover thermostat, and then in the NPT ensemble (300 K, 1 Bar) for 500 ps using the Nose-Hoover thermostat and the Parrinello-Rahman barostat. For each equilibrated system, a 20 ns MD simulation with restraining potential (Eq 1) was launched. Finally, restraints were released and each trajectory was extended with 100 ns of plain-MD. This protocol yielded three independent, unbiased 100 ns MD trajectories for each structure of interest. MD simulations were performed using a leap-frog integrator with step equal to 2 fs. LINCS algorithm was applied to handle bonds constraints. Cut-off for short-range Coulomb and Van der Waals interactions was set to 12 Å, while long range electrostatics was treated using Particle Mesh Ewald.

Generation of the propagation model
A key step to perform reliable rMD simulations is the identification of a reasonable reaction coordinate. To this end, we first developed a statistical coarse-grained model which enables us to identify the dominant reaction pathway underlying the conversion of PrP C into a 4RβS PrP Sc . In order to describe the reaction kinetics we developed a simple stochastic model in which the key rate-limiting processes are assumed to be the irreversible formation and docking of the four rungs of the 4RβS. We indicated R 0 as the C-terminal rung of the pre-formed 4RβS, and with R 1 R 2 R 3 R 4 the consecutive rungs of the converting monomer. The instantaneous state of the system can be represented as a 4-dimensional vector of binary entries S = [n 0 , n 1 , n 2 , n 3 ], where n k = 1 in the presence of docking between rung R k and rung R k+1 , and n k = 0 otherwise. We emphasize that this model excludes the presence standalone rungs, which would correspond to an entropically-unfavorable single extended conformation, not stabilized by hydrogen bonds of nearby β-strands. On the other hand, misfolded rungs can be stabilized either upon docking to a pre-existing misfolded region (template mechanism) or through a process in which two rungs simultaneously form and dock. We modeled the transition of an initial state S R = [0,0,0,0] (in which the PrP C monomer is in the native state and none of the rungs are formed) to the fully misfolded state S P = [1,1,1,1] (where the monomer is completely misfolded and incorporated into PrP Sc ). The resulting network is represented in S8A Fig  which contains all the possible combinations of docking events leading S R to S P through a sequence of irreversible transitions. The model can be simplified considering that rate k 2 is expected to be negligible as compared to k 0 . Indeed, while the event associated with k 0 only requires the structuring of a disordered PrP region, the events associated with k 2 require the loss of native content (breakage of hydrogen bonds in the helical regions) together with the simultaneous formation of two rungs (two-fold entropic cost). Thus, it is possible to disregard all the reaction pathways in which the first step of the reaction is a transition occurring at a rate k 2 and consider only events starting from [1,0,0,0]. The network of the resulting simplified model is depicted in S8B Fig. Furthermore, we can set k 3 /k 1 >>1 and k 3 /k 2 >>1, since the docking of two pre-formed rungs occurs at a rate much faster than all processes involving misfolding. The reaction kinetics in this stochastic model was simulated through a Kinetic Monte Carlo algorithm (arbitrarily setting k 3 /k 1 = 10 6 ) and the resulting reaction mechanisms were studied as a function of the k 2 /k 1 ratio. Namely, we enumerated all the reaction pathways and computed the corresponding probability: Step 1 Step 2 Step 3 Step 4 We find that Path 1 (which consists in the consecutive formation of all rungs by templating on previously misfolded structures) dominates over all others as soon as k 1 /k 2 � 4. Using Kramers' theory and assuming comparable pre-factors, we find that the templating mechanism is the most prominent reaction pathway when the activation energies for single and double rung formations obey the relationship: We expect this condition to be always satisfied. Indeed, processes 1 and 2 are characterized by the formation of the same number of hydrogen bonds, leading approximately to the same enthalpic gain. However, the process described by rate k 2 requires the breaking of the double amount of native contacts, together with a two-fold entropy loss (compared to the single rung formation) when forming two de novo rungs. We can therefore conclude that the propagation reaction proceeds through a subsequent formation of individual rugs, using as template preexisting 4RβS free end. We emphasize that our approach is likely to underestimate the dominance of the sequential misfolding mechanism. Indeed, it does not account for the direct cooperativity of hydrogen bonds and the long-range electrostatics favoring β-strand formation in presence of pre-formed β-sheets, as directly supported by previous computational and experimental evidence [31,32].
The coarse-grained information about the reaction mechanism obtained by means of our theoretical model can be exploited to set up a fully atomistic rMD simulation of the PrP C ! PrP Sc transition, using the 4RβS as a target structure (Fig 4 and S1 Movie). The 3D structure of moPrP C (residues 105-230) was obtained by linking moPrP 121-231 (PDB 1XYX) to the adapted N-terminal fragment (residues 105-120) of huPrP C (PDB 5yj5) which was mutated to the moPrP sequence. The initial state for the conversion simulation was generated by modifying a central dimer extracted from the tetrameric 4RβS at the end of 20 ns of restrained molecular dynamics simulation. The initial contact point between PrP Sc and PrP C was generated by leaving residues 89-104 of the C-terminal monomer anchored to the β-solenoid, which was then replaced by moPrP C retaining the original disulfide bond. The rationale behind this modelling scheme derives from multiple previous reports indicating that the same region is a primary contact point between PrP C and PrP Sc [33]. The complex was energy minimized in vacuum and then in explicit solvent in a dodecahedral box (with periodic boundary conditions and minimum wall distance of 20 Å) also containing six Clions to neutralize the charge of the system, using a steepest descent algorithm (protein topology was generated by Amber99S-B-ILDN force field). The system was then equilibrated in the NVT ensemble (350 K) for 200 ps using the Nose-Hoover thermostat, and then in the NPT ensemble (350 K, 1 Bar) for 200 ps using the Nose-Hoover thermostat and the Parrinello-Rahman barostat. The model was then subjected to a modified protocol of rMD simulations, adapting the method to a sequential biasing. This scheme resulted in a progressive rMD simulation in which the structure of PrP C was targeted to the growing solenoid in a rung-by-rung fashion. Target structures included an entire 4RβS solenoid and additional rungs at the growing C-terminus: Step 1, residues 89-115; Step 2, residues 89-151; Step 3, residues 89-190; Step 4, residues 89-230. In this rMD scheme, the equation of motion is modified by an history-dependent biasing force F i rMD (X,t), defined as: where k R is the ratchet spring constant determining the strength of the biasing force applied on the system, in this case 10 −2 kJ/mol. θ function is equal to 1 if its argument is positive, its value its 0 otherwise. Z[X(t)] is a collective coordinate defined as: where C ij [X(t)] is the instantaneous contact map and C ij 0 is the contact map of the target state.
Their entries are chosen to smoothly interpolate between 0 and 1, depending on the relative distance of the atoms i and j. The z m function indicates the smallest value assumed by the reaction coordinate z[X(t)] up to time t and.
The contact map C ij (X) is defined as: Where r ij is the distance between the i-th and the j-th atom; r 0 is a typical contact distance set to 7.5 Å; r c is an upper limit to improve computational efficiency by excluding excessively distant atoms from the calculation (and it is set to 12.3 Å). In this way, no bias force acts on the system when the protein spontaneously proceeds towards the target state, while the external biasing force is only applied when the polypeptide tends to backtrack toward the initial state. We terminated each rMD simulation when the RMSD of the protein relative to the final state is stably lower than 0.8 Å. rMD simulations were performed in explicit solvent using Gromacs 4.6.5 with the plugin Plumed 2.0.2 [34]. Integration of motion was performed using a leap-frog algorithm with step equal to 2 fs. Temperature was maintained at 350 K (approximately PrP melting temperature) and pressure at 1 Bar using Nose-Hoover thermostat and the Parrinello-Rahman barostat. LINCS algorithm was applied to handle bonds constraints. Cut-off for short-range Coulomb and Van der Waals interactions was set to 10 Å, while long range electrostatics was treated using Particle Mesh Ewald.
Secondary structure content and fraction of native/misfolded contacts were obtained using the Timeline and the Trajectory tools in VMD 1.9.2. Fraction of native and misfolded contacts are defined in the following equation SðXÞ � P jiÀ jj>4 yðr 0 À r ij ðXÞÞyðr 0 À r ij ðX S ÞÞ P jiÀ jj>4 yðr 0 À r ij ðX S ÞÞ ðEq 6Þ where r ij is the distance between the i-th and j-th C α atoms, r 0 is the typical distance defining a contact between two residues and it is set at 7.5 Å, X is the instantaneous configuration of the protein during the simulation, X S is the atomic configuration of the protein in the reference state, that is the native PrP C structure for the native contacts calculation, and the PrP Sc structure for the misfolded contacts calculations. This quantity is evaluated only for the converting monomer. Initial and final values of RMSD and secondary structure content were calculated by averaging the values over the last 5ns (of restrained or unrestrained MD) and then calculating the mean and standard deviation for the three trajectories. Inter-monomer distances in the tetramers are calculating using Chimera as an average of the distances between the mid-residue of each β-strand and the same residue in the first or the second next monomer. In particular, for β-strand-1: N96, G130, Y168, T200; for β-strand-2: N107, D143, T192, Y217; for β-strand-3: G113, Y149, V188, A223. Tetramer volumes were calculated using Chimera. Molecular grapics images were produced using the UCSF Chimera package from the Computer Graphics Laboratory, while graphs are plotted using Gnuplot.
Supporting information S1 Fig. 2D threading scheme of moPrP residues 89-230. 2D arrangement based on the general architecture of right-handed β-solenoid proteins with L-shaped cross section. Blue thick arrows indicate β-strands, thin cyan and thin orange arrows indicate the sidechain orientation (toward the solvent or toward the hydrophobic core, respectively). The scheme is used to thread the moPrP 89-230 sequence by considering different constraints. PK cleavage sites identified by mass spectrometry in two different previous reports are colored in red. Glycosylation sites are labelled in green. Prolines are colored in purple. Cystine is indicated in cyan. Upper graphs report the RMSD deviation from initial configurations for the entire structure (blue lines) or the β-solenoid core (orange lines) of the different proteins during the 20 ns of restrained MD simulations. Filled curves represents standard error of the mean. The graphs indicate a minor rearrangement of the β-solenoid core that is almost identical for the Het-s dimer and the 4RβS model, characterized by a RMSD of 2.3 ± 0.2 Å and 2.4 ± 0.2 Å respectively (calculated as the average of the three trajectories over the last 5 ns, ± standard deviation). In contrast, the hydrophobic core of the LβH model displays a higher RMSD deviation Only small deviations from the main model are allowed to agree with experimental data. Images (i) represents the 2D schemes of two putative alternative schemes. Images (ii) highlight the structure of the rung that differs from the main treading scheme, blue residues represent the shifted amino acids in the threading. Graphs (iii) report RMSD and secondary structure analysis of three 100 ns unrestrained MD simulations. Images (iv) report the initial and final frame of the respective MD simulations.