Loop Motion in Triosephosphate Isomerase Is Not a Simple Open and Shut Case

: Conformational changes are crucial for the catalytic action of many enzymes. A prototypical and well-studied example is loop opening and closure in triosephosphate isomerase (TIM), which is thought to determine the rate of catalytic turnover in many circumstances. Speci ﬁ cally, TIM loop 6 “ grips ” the phosphodianion of the substrate and, together with a change in loop 7, sets up the TIM active site for e ﬃ cient catalysis. Crystal structures of TIM typically show an open or a closed conformation of loop 6, with the tip of the loop moving ∼ 7 Å between conformations. Many studies have interpreted this motion as a two-state, rigid-body transition. Here, we use extensive molecular dynamics simulations, with both conventional and enhanced sampling techniques, to analyze loop motion in apo and substrate-bound TIM in detail, using ﬁ ve crystal structures of the dimeric TIM from Saccharomyces cerevisiae . We ﬁ nd that loop 6 is highly ﬂ exible and samples multiple conformational states. Empirical valence bond simulations of the ﬁ rst reaction step show that slight displacements away from the fully closed-loop conformation can be su ﬃ cient to abolish most of the catalytic activity; full closure is required for e ﬃ cient reaction. The conformational change of the loops in TIM is thus not a simple “ open and shut ” case and is crucial for its catalytic action. Our detailed analysis of loop motion in a highly e ﬃ cient enzyme highlights the complexity of loop conformational changes and their role in biological catalysis.


■ INTRODUCTION
Triosephosphate isomerase (TIM) is an enzyme that catalyzes a simple reversible isomerization reaction, namely the isomerization of dihydroxyacetone phosphate (DHAP) and (R)-glyceraldehyde-3-phosphate (GAP, Figure 1A). 1−5 It does so with tremendous catalytic proficiency, such that the turnover of this enzyme has been argued to be limited by diffusion. 2,3 As a result, TIM has often been described as a "catalytically perfect" enzyme 3 and has been the subject of extensive experimental and computational studies, as a model system for understanding the factors that drive enzyme catalysis (see refs 2 and 4−27 as just a few examples).
Structurally, TIM is usually a homodimer and gives its name to the archetypal TIM barrel fold, which consists of eight αhelices and eight parallel β-sheets alternating along the protein backbone ( Figure 1B). We note that exceptions exist: In some organisms, TIM is a tetramer, and the change in oligomerization state can be functionally important. [28][29][30] which are by far one of the most commonly occurring protein folds, 31 provide a versatile 11 and highly evolvable 11,15,24,32 scaffold and have been argued to have facilitated the early evolution of protein-mediated metabolism. 24 In addition, TIM barrel enzymes are excellent model systems for understanding enzymatic thermal adaptation. 33,34 A defining feature of reaction in TIM is the large motion (up to 7 Å) of a phosphate gripper loop, loop 6 (residues Pro166-Ala176 for yeast TIM from Saccharomyces cerevisiae, yTIM, PDB ID: 1NEY), 16,35 which interacts first with loop 5 in the unliganded form of TIM, and subsequently with loop 7 of TIM and with the phosphodianion of the bound substrate ( Figure 1C). This dianion is, in turn, anchored to the enzyme through (up to) four hydrogen bonds with the protein backbone ( Figure 1D). 16 There has been extensive elegant experimental work on activation of TIM by phosphite dianions. 19,21,23,25 The movement of the TIM loop was among the first functional enzyme motions to be investigated by molecular dynamics (MD) simulations. Seminal work by Kollman, 7 Karplus, 8 and co-workers reported MD simulations in vacuo using simple reaction coordinates and argued from the results that the loop closure is essentially a rigid-body type displacement, in which the loop moves like a "lid" attached to the protein by two hinges (Figure 2), the sequences of which are not conserved. 36 This simple picture has been highly influential and has become a widely accepted model for how this loop moves. 4,7,8 Specifically, previous NMR data were interpreted based on a two-state system in which the loop is only open and closed. 4,[7][8][9]13,14,37 However, more recent studies have indicated that there is some degree of independence in the motions of the N-and C-terminal hinges of the loop. 14 Desamero et al. 17 have suggested that substrate binding to TIM is very fast, followed by slow unimolecular loop closure, with a population ratio between two states that favors the closed-loop conformation, but that is substantially less temperature dependent than the subsequent product release step. This is in contrast to previous NMR studies that suggested that ligand-free TIM is mostly open, but with small amplitude motions on the micro-to nanosecond time scale, 14 as well as work by Williams and McDermott 9 who argued that irrespective of whether TIM is substrate free, or bound to substrate GAP, or transition state (2-phosphoglycolate, PGA) analogues, the loop movement occurs at a rate similar to the empty enzyme, with a similar population ratio for the two conformers and with a measured rate for loop movement that is approximately matched to the turnover time. McDermott and colleagues 9,13,14,17 have similarly found that the rate of loop opening and closing is on the 10 4 s −1 time scale for TIM, with both open and closed states being substantially populated.
Tawfik and co-workers have argued, based on theoretical considerations, that k cat is unlikely to be higher than 10 6 −10 7 s −1 for any enzyme, and, additionally, the apparent secondorder rate constant (k cat /K M ) for a diffusion-limited enzymecatalyzed reaction with a single low-molecular mass substrate cannot exceed ∼10 8 −10 9 s −1 M −1 (see ref 38 and references cited therein). In the case of TbbTIM, k cat values of 300 s −1 and 2100 s −1 , and corresponding k cat /K M values of 4.3 × 10 5 M −1 s −1 and 8.4 × 10 6 M −1 s −1 , have been measured for the deprotonation of substrates DHAP and GAP, respectively. 21 Therefore, TIM is only partially diffusion-limited, and, while full loop closure is essential for the catalytic activity of the enzyme, the loop motion itself is only partially rate-limiting. Finally, more (comparatively) recent MD simulations of TIM loop motion surprisingly suggested multiple loop opening and closing events on a 100 ns time scale, 39,40 although this is on a much faster time scale than that derived from experiments, and could plausibly be either a force field artifact or an artifact of how loop opening and closure was measured and/or defined.
Clearly, therefore, despite being a prototype system for understanding functionally (and potentially catalytically) important loop motions, 4 the nature of the motion of these loops in TIM is far from fully understood. Addressing this issue in a satisfactory way would expand fundamental understanding of these regulatory loop motions. It might superficially be thought that loop motions provide a relatively easy case for molecular simulations, but in fact, as we show here, even simple loop motions provide a significant challenge for biomolecular simulation, because modeling loop motions is far from trivial computationally. 45,46 Loops that are regulated by ligand-gated conformational changes are a common feature of not just TIM barrel proteins, 47,48 but also unrelated enzymes such as orotidine 5′-monophosphate decarboxylase 49 (OD-Case) and glycerol 3-phosphate dehydrogenase (GPDH). 25,50 Analyzing how the motion of these loops is regulated and

Journal of the American Chemical Society
Article potentially linked to their catalytic activity is central to understanding the catalytic proficiencies of these enzymes as well as, more generally, the high evolvability of the TIM barrel scaffold. 11,15,24,32−34 The extensive experimental data on TIM loop dynamics (e.g., refs 9, 13, 14, and 17 among others) make TIM an ideal model system. We perform here detailed simulations of the conformational dynamics of TIM loop 6, using a range of different techniques, including long-time scale conventional MD simulations (which are used to construct Markov state models), 51−53 principal component analysis (PCA), 54−56 Hamiltonian replica exchange molecular dynamics (HREX-MD), 57 and bias-exchange metadynamics (BE-METAD) 58,59 simulations. These are complemented by empirical valence bond (EVB) simulations 60,61 of the TIM-catalyzed deprotonation of the substrate DHAP, which test the effects of varying loop conformations on the activation free energy for this process, using the conformational coordinates defined by the motion of loop 6. We illustrate that accurately modeling the loop dynamics of even the apparently simple case of triosephosphate isomerase is challenging. Extensive conformational sampling is required to obtain a reliable model. Further, we demonstrate that loop 6 visits multiple stable conformational substates rather than being an example of a two-state rigid-body motion as widely thought. Also, this conformational dynamics has a clear impact on the catalytic activity of TIM. Therefore, in contrast to the simple two-state model often put forward in the literature, 4,[7][8][9]13,14,37 the dynamics of the TIM active site loops is in fact far from a simple open and shut case.
■ METHODOLOGY System Setup. In order to study the motion of loops 6 and 7 of both substrate-free TIM (apo-TIM) and TIM in complex with DHAP (DHAP-TIM), five crystal structures of the dimeric TIM from S. cerevisiae (yTIM) were chosen as starting structures (PDB IDs: 1NEY, 16,35 1YPI, 35,41 1I45, 13,35 1NF0, 16,35 and 7TIM. 35,62 The three tryptophan mutations (Trp90Tyr, Trp157Phe, and Trp168Ftr, where Ftr = 5′fluorotryptophan) introduced into the structure of 1NEY were  16,35 conformations, respectively, with the spheres in panels (A) and (B) representing the C α atoms of the annotated residues. All C α −C α contacts within a 5 Å cutoff were mapped with blue lines, while the H-bonds are represented with red lines. The same interactions were mapped in (C, D), using Cytoscape 3.6.1, 42 RINalyzer 2.0, 43 and UCSF Chimera 1.13, 44 in order to show the interaction network.

Journal of the American Chemical Society
Article mutated back so that all the five structures have an identical sequence, as in our previous work 26,27,63 (note also that these mutations have been shown not to affect the catalytic activity of the enzyme 16 ). Two (1YPI and 1I45) of the five crystal structures are ligand-free, with loop 6 in its open state, while the other three were crystallized in complex with either a substrate or a transition state analog. Of these three structures, 1NEY and 7TIM are complexed with DHAP and phosphoglycolohydroxamate (PGH), respectively, with loop 6 in its closed state in both chains of the two structures. In 1NF0, DHAP is bound in the active sites of both chains, but, curiously, loop 6 is in its closed state in one chain, but in its open state in the other chain. All water molecules present in the crystal structures were removed, and in the case of the simulations of substrate-free yTIM, the ligands in 1NEY, 7TIM, and 1NF0 were also removed, considering only the protein coordinates as starting conformations for the simulations. In the case of the simulations of yTIM in complex with DHAP, the substrate was placed in the active site using Chimera 44 to perform structural alignment of the different structures onto PDB ID 1NEY, which is a structure of yTIM in complex with DHAP at 1.2 Å resolution. 16,35 We maintained the same histidine protonation patterns (protonating at the δvs ε-positions) as in our previous work, 26,27 as outlined in the Supporting Information of ref 26, and all other ionizable residues were kept in their default protonation states at physiological pH. Each system was placed into an octahedral box filled with TIP3P water molecules 64 with a distance of at least 10 Å from the solutes to the surface of the box. To neutralize the system, 10 Na + counterions were added to the TIM-DHAP complex, while 6 were added to the apo-TIM system. The protein was described using the Amber ff99SB-ILDN force field, 65 while the General AMBER Force Field 2 (GAFF2) 66 was used to obtain parameters to describe the substrate DHAP. The partial charges for DHAP were calculated using the standard restrained electrostatic potential (RESP) approach, 67 using Antechamber 68 as implemented into AMBER16, 69 and based on the electrostatic potential calculated in vacuum at the HF/6-31G(d) level of theory, using Gaussian 09 rev. D.01. 70 The GAFF2 parameters and partial charges of DHAP are listed in Table S1 in the Supporting Information. To keep the substrate DHAP stable in the TIM binding sites, several distance restraints were applied for all the DHAP-TIM simulations, as listed in Figure S1 and Table S2. Monitoring the P−O1−C1−C2 dihedral of DHAP indicates that the distance restraints do not distort the conformation of DHAP ( Figure S2). These distance restraints were applied to all DHAP-TIM simulations except the empirical valence bond simulations, as described below.
Conventional Molecular Dynamics Simulations. The LEaP module of AMBER16 69 was used to generate the topology and coordinate files for the conventional MD simulations, which were carried out using the CUDA version of the PMEMD module 71−73 of the AMBER16 simulation package, 69 with system preparation as described above. The solvated system was first subjected to a 2000 step steepest descent minimization, followed by a 3000 step conjugate gradient minimization with positional restraints on all heavy atoms of the solute, using a 5 kcal mol −1 Å −2 harmonic potential. The minimized system was then heated up to 300 K using the Berendsen thermostat, 74 with a time constant of 1 ps for the coupling, and 5 kcal mol −1 Å −2 positional restraints (again a harmonic potential) applied during the heating process. The positional restraints were then gradually decreased to 1 kcal mol −1 Å −2 over five 500 ps steps of NPT equilibration, using the Berendsen thermostat 74 and barostat 74 to keep the system at 300 K and 1 atm. For the production runs, each system was subjected to either 200 or 400 ns of sampling in an NPT ensemble at constant temperature (300 K) and constant pressure (1 atm), controlled by the Langevin thermostat, 75,76 with a collision frequency of 2.0 ps −1 , and the Berendsen barostat with a coupling constant of 1.0 ps. A 2 fs time step was used for all simulations, and snapshots were saved from the simulation every 10 ps. The SHAKE algorithm 77 was applied to constrain all bonds involving hydrogen atoms. A 10 Å cutoff was applied to all nonbonded interactions, with the electrostatic interactions being treated with the particle mesh Ewald (PME) approach. 78 Table S3.
Hamiltonian Replica Exchange Molecular Dynamics Simulations. HREX-MD 57 simulations, an advanced sampling technique, were performed in order to further explore the conformational space of loops 6 and 7 of TIM. These simulations were based on the crystal structure of yTIM in complex with DHAP (PDB ID: 1NEY), 16,35 and in the case of the substrate-free simulations, the coordinates of the substrates were simply deleted from the PDB structure. The system setup was performed as described above for the conventional MD simulations. As with the conventional MD simulations, the resulting systems were then subjected to 5000 steps of each of steepest descent and conjugate gradient minimizations, after which they were heated up from 0 to 300 K over 500 ps of simulation time in an NVT ensemble, using the velocity scaling algorithm, 79 with a time constant of 1 ps. They were then equilibrated for a further 500 ps in an NPT ensemble at 300 K and 1 atm, controlled by the same thermostat and a Parrinello−Rahman barostat, 80 with a time constant of 2.0 ps. During the equilibration steps, 2.39 kcal mol −1 Å −2 positional restraints were applied to all heavy atoms, in each of the xyz directions. The final configuration of the system after equilibrations was then used as the starting structure for the HREX-MD simulations, which were performed without any restraints on the protein.
All HREX-MD simulations were performed using GROMACS v. 4.6.7, 81,82 interfaced with the PLUMED v. 2.1 plugin. 83 All bonds were constrained with the P-LINCS algorithm 84 in the simulations. The cutoff for the short-range nonbonded interactions was set to 10 Å, and the electrostatic interactions were calculated using the PME 78 method, in combination with periodic boundary conditions. All HREX-MD simulations were performed in an NPT ensemble, which was maintained using the same thermostat and barostat as used in the NPT equilibration. Loops 6 (Pro166-Ala176) and 7 (Tyr208-Ser211) of both chains were chosen as the "hot" region for the HREX-MD simulations. The charges in the "hot" region were scaled by √λ, the Lennard-Jones parameter by λ, and the proper dihedral angles were scaled by either λ or √λ, depending on whether both the first and the fourth of the dihedral, or just one of the two atoms of the dihedral, were in the "hot" region. The simulations were performed using 8 replicas, with the scaling factor, λ, exponentially ranging from 1.0 to 0.6. This corresponds to an effective temperature range from 300 to 500 K. Exchanges between neighboring replicas were attempted every 2 ps, and configurations were saved every 2 ps. A 4 fs integration step was used in the simulations of apo-TIM, as virtual sites were used for hydrogens, while a 2 fs time step was used in simulations of the TIM Michaelis complex with DHAP. Each replica was sampled for 200 ns in the case of the apo-TIM simulations and for 300 ns in the case of the DHAP-TIM simulations. This resulted in an average exchange rate of 20−30% in both simulations. Only the replica without scaling (λ = 1.0) was used for further analysis.
Bias-Exchange Metadynamics Simulations. In the BE-METAD approach, 58,59 a set of collective variables (CVs) that are expected to be relevant to the process under investigation are chosen for the simulations. A number of MD simulations (replicas) are then run in parallel, with each replica being biased by a history dependent Gaussian-type bias 58 acting on one of the selected CVs. During the BE-METAD simulations, the sampling is promoted by attempting swaps of the bias potentials between pairs of replicas at fixed time intervals. The probability of the swap is determined based on eq 1: (1) where X i and X j are the configurations of replica i and j, while V G i and V G j are the metadynamics bias potentials acting on replica i and j, respectively. A native contacts parameter (Q), ranging from 1 for the closed state using 1NEY as reference to 0 for the open state without contacts between loops 6 and 7, was used to define the CVs and was calculated using the following equation: 85

Journal of the American Chemical Society
where X is a conformation along the reaction coordinate, r ij (X) is the distance between atoms i and j in conformation X, r ij 0 is the distance between atoms i and j in the reference conformation, S is the number of all pairs of heavy atoms (i, j), β is set as 5 Å −1 ,and λ is chosen as 1.4 for a tight determination of native contacts. For the 7 CVs corresponding to the 7 biased replicas, the distances between the heavy backbone atoms (N, C α , C, O) of each of the residues Ala169-Ala175 of loop 6 and the backbone atoms of all residues of loop 7, that is, Tyr208-Ser211, were used to train the native contacts, using λ = 1.4 and without the 4.5 Å cutoff proposed in the original paper as some of the distances defined as CVs were greater than 4.5 Å. The reference distances for the BE-METAD simulations were taken from chain A of PDB ID 1NEY. 16,35 The 7 CVs used for these simulations are listed in Table S4. The eighth ("unbiased") replica was not biased by a metadynamics potential.
All BE-METAD simulations were performed using GROMACS v 5.1.4 81,82,86 interfaced with PLUMED v. 2.3.0. 83 A 2 fs integration step was used in the simulations, and all bonds were constrained with the P-LINCS algorithm. 84 The crystal structure of yTIM in complex with DHAP (PDB ID: 1NEY) 16,35 was used as the starting structure for the TIM-DHAP simulations, while the coordinates of the substrates were simply removed from the PDB structure for the apo-TIM simulations. The same protocol as used for the HREX-MD simulations was applied for both the equilibration and production BE-METAD runs but using a time step of 2 fs for both apo-TIM and DHAP-TIM. A Gaussian-shape biasing potential was added to each of the 7 CVs every 2 ps with an initial height of 0.1 kcal mol −1 , which was gradually decreased on the basis of adaptive biasing 87 by a factor of 10. The width of the Gaussians was set at 0.05. Exchanges between replicas were attempted every 2 ps, and each replica was sampled for 500 ns with configurations saved every 2 ps. This resulted in an average exchange rate of ∼40%.
Computation of Markov State Models. For further elucidating the loop motions and kinetics, Markov state models (MSM) 51−53 were constructed from the conventional MD data with the PyEMMA Python library (v. 2.4). 88 The first stage of building an MSM is to discretize the conformational space of the molecule and reduce the trajectory to a sequence of transitions between discrete states. Obtaining good state space discretization is key to obtaining an MSM that is both descriptive and predictive. 51 To this end, we describe the conformation of TIM in terms of the distances between the C α atoms in loops 6 and 7. Next, we apply PCA to reduce the dimensionality of our system characterization from 120 interatomic distances to 2 collective coordinates along which the largest motions take place. PCA is performed taking both the DHAP-TIM and apo-TIM data into consideration, so that both systems can be represented on a common set of collective coordinates and compared with each other. Subsequent operations, however, are applied to the two sets of data individually. K-means clustering is applied to these reduced systems to divide their respective conformation spaces into 100 "microstates". A granular 100-state MSM can then be built for both the apo-and DHAP-TIM systems by counting the transitions between the microstates, constructing a matrix of transition counts at a specified lag-time, and normalizing it by the total number of transitions emanating from each state to obtain the transition probability matrix. A lag time of 100 ns was chosen for our MSM because the time scales of the transition matrix were found to converge sufficiently by this interval. To aid our understanding, we coarse-grained the 100-state MSM into a hidden Markov model, 89 using the PCCA+ algorithm 90 to assign each microstate from the granular MSM a probability of belonging to a macrostate, thereby implementing a fuzzy lumping together of kinetically similar microstates.
Empirical Valence Bond Simulations. We recently performed extensive EVB simulations of the wild-type and mutant TIM-catalyzed deprotonation of substrates DHAP and GAP as well as substrate pieces glycoaldehyde (GA) and glycoaldehyde + phosphite ion (HP i ). 26,27 These studies focused only on modeling the initial rate-limiting deprotonation of the substrate and/or substrate piece, because the subsequent proton transfer steps are expected to be fast. We demonstrated that these models are able to capture the catalytic effect of not just the wild-type enzyme but also a range of key active site mutants with high quantitative accuracy. We have used the same simulation protocol to study the impact of loop 6 dynamics on the energetics of the TIM-catalyzed deprotonation of DHAP in the present work, with the exception of the fact that a shorter 20 ns equilibration was used (rather than 40 ns as in our previous work), 26 due to the much larger number of simulations involved. In brief, we extracted 10 structures at each of Q = 1.00, 0.99, 0.98, 0.97, 0.96, 0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.50, 0.40 from our BE-METAD simulations of DHAP-TIM, to a total of 130 starting structures over all Q values considered. Note that we stopped at Q = 0.40 because this corresponded to a fully open conformation of loop 6. These structures were effectively randomly selected, with the only exception that we took into account the distance between the C δ atom of Glu165 and the reacting carbon atom of the substrate as a selection criterion when extracting these structures, selecting only those in which the distance between these two atoms are <4.0 Å (the distance in the crystal structure, PDB ID: 1NEY, 16,35 is 3.83 Å). This was necessary because in many snapshots Glu165 moved out of the active site (in agreement with crystal structures which indicate a conformational change of this side chain out of the active site in structures of TIM with loop 6 in an open conformation), 4 and clearly those conformations will be noncatalytic. By selecting structures in which Glu165 points into the active site, even as the loop starts to open, we obtain a lower limit for the activation free energy for that loop conformation, assuming an idealized conformation for the Glu165 side chain.
We then performed 20 ns of equilibration on each structure, using the protocol described in refs 26 and 63, followed by 5 EVB simulations performed using 51 EVB mapping frames of 200 ps each, from the end point of each equilibration, with the starting points for each EVB simulation generated by performing an additional 110 ps of equilibration using five different random seeds. This led to a total of 50 individual EVB trajectories for structures extracted at each Q value and thus a total of 650 EVB trajectories over all Q values. We also performed additional EVB simulations on the crystal structure (PDB ID: 1NEY) 16,35 with loop 6 unrestrained, as a control. In the case of the crystal structure, we performed 10 initial equilibrations using 10 different random seeds and spawned five new EVB simulations from each equilibrated structure, by assigning new random seeds, as with the structures extracted from the BE-METAD simulations. Each individual EVB trajectory was sampled from the transition state in either reactant or product directions, as in our previous work, and each individual trajectory was a total of 10.2 ns, leading to a total simulation time of 510 ns over all 50 trajectories per loop conformation. This corresponded to a cumulative total of 2.8 μs equilibration and 7.14 μs of EVB simulation time over all systems.
All the EVB simulations were performed as described in detail in our previous work, the only exception to this being the sphere size used in our simulations, which was extended from a radius of 20 to 24 Å from the simulation center (the reacting carbon of substrate DHAP), in order to fully capture the flexibility of loop 6. The size of the water droplet used in our simulations was also extended from 20 to 24 Å accordingly. As with our previous work, all titratable residues with side chains that fell within the inner 85% of the sphere (i.e., within a distance of 20 Å from the reacting carbon of substrate DHAP) were ionized, and all residues in the outer 15% of the simulation sphere were kept neutral and restrained in place by a 10 kcal mol −1 Å −2 harmonic restraint (in the outer 15% of the simulation sphere) or a 200 kcal mol −1 Å −2 harmonic restraint (fully outside the simulation sphere). Note that neutralizing these titratable residues is a standard practice in such simulations in order to avoid system instabilities by having charged residues outside the simulation sphere. 26,27,91−96 A full list of all ionized residues and the protonation states of all histidine residues in this system can be found in Table S5 Analysis. In order to define the closed and open states of loop 6, we calculated root-mean-square deviations (RMSD) of this loop, using the conformation of this loop in the structure of DHAP-TIM with loop 6 in a closed conformation (PDB ID: 1NEY) 16,35 as the reference conformation. The RMSD of the backbone of loop 6 was calculated after the alignment of all protein backbone atoms (apart from those from loop 6) to the reference conformation. In the BE-METAD simulations, the contacts between loops 6 and 7 were divided into 7 CVs (termed Q1 to Q7) and used for the different biased replicas (Table S4). The native contacts parameter, Q, with values from Q1 to Q7 (see eq 2) was used to distinguish the closed (Q approaching 1) and open (Q < 0.45) states of loop 6 of TIM. As shown in Table S6, the RMSD and Q value of loop 6 in its open conformation (from PDB ID: 1YPI) 35,41 are 4.77 Å and 0.419, respectively. Finally, the pK a s of all ionizable residues in the protein at different Q values obtained from our EVB Michaelis complexes (1000 snapshots extracted from 50 independent trajectories per Q value) were estimated using PROPKA 3.1. 97 The pK a of the catalytic base, Glu165, was also estimated for crystal structures of TIM in complex with DHAP and PGA, as described in the Results and Discussion.

■ RESULTS AND DISCUSSION
Modeling TIM Loop Dynamics Using Conventional Molecular Dynamics Simulations. The classical picture of loop movement in TIM is that of a two-state rigid-body motion, 4,7−9,13,14,37 and this system is in fact often used as a prototypical example of such motion in biologically relevant systems. 98 This makes TIM a particularly important system to understand the role of loop motions in enzyme catalysis and protein function.
As our starting point, we performed a cumulative 48.4 μs of conventional MD simulations, as described in the Methodology section and summarized in Table S3 Williams and McDermott have performed solid-state NMR measurements to study TIM loop movement in both apo-TIM and TIM in complex with substrate and transition state analogs. 9 Based on their experimental data, they predicted an activation free energy of 12 kcal mol −1 for the transition from open to closed conformations of loop 6 of apo-TIM (based on the rate at 10°C), with a free energy difference of 1.8 kcal mol −1 in favor of the open conformation of the TIM loop in the apo form of the enzyme. They estimated that the presence of substrate and transition state analogues would alter this equilibrium slightly to favor the closed conformation of TIM loop 6 by up to 2.8 kcal mol −1 in the presence of the PGA trianion (and a barrier reduction from ∼12 to 9.2 kcal mol −1 for this transition). In light of the (from a simulation perspective) slow rate of change between the states and the small free energy difference between them, it is perhaps unsurprising that unbiased MD simulations struggle to capture the closure of loop 6 on reasonable simulation time scales. The key question then becomes how to capture this loop movement in a meaningful way during atomistic molecular simulations.

Article
Using Enhanced Sampling Approaches To Capture the "TIM-Closed" State. In order to capture TIM loop motion, including the closed state of loop 6, we first performed HREX-MD) simulations, as described in the Methodology section and shown in Figures 3, S7, and S8. However, as with our conventional MD simulations, the HREX-MD simulations fail to capture the closed state of loop 6. That is, while the closed conformation of loop 6 appears to be very briefly sampled in the DHAP-TIM Michaelis complex (Figures 3 and  S8), this is due to the fact that the simulations were initiated from the closed state of loop 6, and, even here, the loop opens within 25 ns of simulation time and does not close again. Again, the latter is not very surprising, given the fact that replica exchange MD simulations were shown to achieve about an order of magnitude of sampling speedup compared to conventional MD, 103 which is still not sufficient to reach the ∼100 μs needed for loop 6 to close, as estimated by experiment. 14 In contrast, however, BE-METAD simulations ( The convergence of the BE-METAD simulations was assessed by following the time evolution of the free energy profiles along each of the 7 CVs. The corresponding onedimensional profiles for the last 200 ns of simulation time using a 50 ns time interval are shown in Figures S12 and S13 for the simulations of apo-and DHAP-TIM, respectively. In addition, as can be seen in Figures S14 and S15, the loops experience diffusive dynamics along the CVs, which is indicative of dynamics on a flattened energy landscape, wherein the biasing potentials compensate for the underlying free energy surface. Based on this, we consider the BE-METAD simulations to be converged for both apo-and DHAP-TIM. Characterizing the Different Conformational States of Loop 6. Up to this point, our focus has been on capturing the closed state of loop 6 in our simulations. However, as it has been argued previously that loop movement in TIM is a twostate rigid-body movement, 4,7−9,13,14,37 it is also interesting to examine the dynamics of the open state of TIM. To achieve this, we returned to our conventional MD simulations and characterized the motion of loop 6 during the simulations using PCA, subsequently projecting the free energies for the enzyme along the most dominant motions, that is, PC1 and PC2.  Table S3). The distribution of distances (Å) between loops 6 and 7, defined as the average distance between the C α atoms belonging to the two loops, along PC1, and PC2 are shown for (C) apo-TIM and (D) DHAP-TIM. The open and closed states of the loops are projected from chains A of crystal structures 1YPI (▼) and 1NEY (▲), respectively.

Article
In order to facilitate direct comparison between the two systems, we performed the PCA on the combined simulation data obtained from conventional MD. We then calculated the free energy from the simulations of apo-TIM and DHAP-TIM separately, but along the same principal components ( Figure  4A,B, respectively). In Figure 4C,D, the interloop distance, defined as the average distance between the C α atoms belonging to the two loops, is projected onto the corresponding free energy landscape.
These plots clearly demonstrate that, in both systems, the main conformational motions of the loop take place along the first PC. On the other hand, the free energy surfaces also reveal considerable loop motion along the second PC, especially in the open state of loop 6. That is, this figure shows that while PC1 is the motion that mainly distinguishes between the open and closed states of loop 6, PC2 indicates that several conformational pathways exist for loop opening and closure.
Inspection of conformations along PC2 indicates a simultaneous inward/outward motion of loops 6 and 7, where loop 6 becomes slightly twisted as one-half of it (residues 169−173) tries to move closer toward loop 7, which simultaneously turns away from loop 6. This is in agreement both with computational studies using coarse-grained simulations that suggest a concerted motion for these loops, 30 and with experimental observations 4,13,16,41,62 (based on structural information) of likely concerted loop motion for loops 6 and 7,

Journal of the American Chemical Society
Article due to a ligand-induced conformational change of the YGGS motif of loop 7 ( Figure 5), which in turn causes the Glu165-Pro166 dipeptide to rotate, resulting in the placement of Glu165 in a catalytically competent conformation in the TIM active site. 4 Specifically, ligand binding causes a 90°rotation of the Gly209-Gly210 peptide plane, which further leads to a 90°r otation of O (Gly209). This creates a steric clash with the Pro166 side chain, which triggers the conformational movement along the Glu165-Pro166 plane that places Glu165 in the active site. These movements are also coupled to a substantial 180°rotation along the Gly210-Ser211 peptide bond. 4 Figure  5 shows a comparison of crystallographic structures and snapshots from our simulation, illustrating that our simulations sample an approximation to these substantial conformational changes, in good agreement with structural information. 16,41 Therefore, our extensive atomistic simulations, in line with previous coarse-grained simulations by Katebi and Jernigan,30 show that motion of loops 6 and 7 is concerted. We now also characterize this concerted motion in terms of both thermodynamics and kinetics. In addition, although these loops do sample multiple conformational states, they are not independently flexible, but rather the conformational space sampled by the loops is interdependent, restricting which combined conformations are available.
As discussed in the previous section, the conformations sampled for the open state in the simulations populate a large basin on the free energy surface. On the other hand, the closed state is not stably sampled in our conventional MD simulations. Consistent with this, simulations that started from the closed state quickly evolved toward the open state, while the reverse was never observed (also true for the HREX-MD simulations). For the DHAP-TIM complex, conformations were sampled leading toward the crystal structure of the closed state (PC1 > 2 and interloop distances ≤10 Å), which were, however, not observed for apo-TIM. In the absence of the substrate DHAP, the minimal interloop distances that were obtained were around 10 Å, which corresponds to a semi-open (or semi-closed) state. This is due to the fact that the closed state is stabilized by H-bonds between the substrate and loops 6 and 7 in DHAP-TIM, which are missing in the ligand-free enzyme.
Finally, in order to further elucidate the dynamics of loop 6 as well as the time scales of these motions, we built MSMs from the extensive conventional MD data and superimposed them onto the calculated free energy surfaces ( Figure 6). The large 7:1 ratio between the first and second slowest time scales for the MSM constructed from simulations of apo-TIM results in only two metastable states. The loop dynamics in the presence of DHAP can be characterized by three different metastable states, of which state 1 represents a semi-closed, yet not very stable state (<1% population, see Figure 6 and Table  S6). State 2 is the same for apo-and DHAP-TIM and is conformationally similar to the open state as determined by crystallography (see, e.g., PDB ID: 1YPI). 35,41 It is also the most stable state, which is populated 90.9% of the time in simulations of apo-TIM and 63.8% of the time in simulations of DHAP-TIM. Both systems can adopt another open state, which is, however, different between the simulations of apoand DHAP-TIM. As these states are not identical, we denote these states as state 3 for apo-TIM and state 4 for DHAP-TIM. The main motion for the interconversion between the two different open states in both systems is observed along PC2. State 3 in apo-TIM, which has a population of 9.1%, is represented by structures in which loops 6 and 7 are closer to the TIM barrel than is the case in state 2. This is also reflected in the somewhat smaller interloop distances of 11−13 Å in state 3 vs 13−15 Å in state 2 ( Figure 6). State 4 is the most open metastable state for DHAP-TIM, with interloop distances of 15−17 Å and a population of about 36.1%. While there is still some loop motion along PC2 in the presence of DHAP, this motion is suppressed compared to the corresponding motion in the ligand-free enzyme, as the differently populated conformational areas belonging to states 3 and 4 reveal ( Figure  6). This plausibly indicates that, in the presence of ligand, the sampling is more confined to the open-closed transition pathway, which in turn may indicate a ligand gated conformational change, as has been suggested in several experimental studies (e.g., refs 13, 17, 22, and 27).
Another difference between DHAP-TIM and apo-TIM is the time scales for the interconversion between the metastable states. As already mentioned above, the opening of loop 6 happens very quickly in our simulations, as evidenced by the, on average, 55 ns that DHAP-TIM needs for the transition from the semi-closed state 1 to the open state 2. The transition back to state 1, however, appears to be a slow process with a predicted time scale of 39.2 μs. This is by a factor of 2 faster than the experimentally determined time scale for this motion, 14 although the experimentally measured values are associated with full closure of the loop, which we do not observe in our simulations. In light of this, the predicted time scale for the closure of loop 6 can be considered to be in good agreement with the experimental value, with a predicted activation barrier of ∼11.5 kcal mol −1 , compared to an experimental estimate of 12 kcal mol −1 . However, loop opening should happen much more slowly than observed in our simulations, as the rate of loop opening is predicted to be only ∼10-fold faster than loop closing in the apoenzyme and up to 100-fold slower than loop closing when TIM is in complex with a substrate or transition state analogue. 13,14 This overestimate of the rate of loop opening is probably due to an underestimation of the stability of the closed state of loop 6. The calculated interconversion here between the two open states of DHAP-TIM is fast with time scales of 168 ns for the transition from state 2 to state 4 and of 95 ns for the transition back to state 2. The 10-fold higher population of state 2 compared to state 3 in apo-TIM is also reflected in the transition rates: the time scale for the transition from state 2 to state 3 is 4.91 μs, whereas it takes only 0.49 μs for the backtransition. Therefore, our simulations show fast switching of loop conformations between different open states of the enzyme, which has also been reported in experiments of the apoenzyme where time scales of 10 5 −10 6 s −1 for small-angle motions of loop 6 were found. 14 In summary, these findings show that TIM loop motion is clearly more complex than a simple rigid-body movement involving two well-defined conformational states and that the "open" conformation of loop 6 samples multiple distinct conformational substates.
Exploring the Link Between Loop 6 Dynamics and the Reaction Catalyzed by TIM. Having extensively studied the conformational dynamics of loop 6, the obvious remaining question is how this affects the catalytic activity of the enzyme and whether the structures sampled along the minimum energy pathway from our BE-METAD simulations represent a transition toward a catalytically competent closed conformation of the enzyme. We have previously modeled the deprotonation of substrates DHAP and GAP by TIM in the Journal of the American Chemical Society Article closed state of the enzyme, 26 using the EVB approach and the same PDB structure (PDB ID: 1NEY) 16,35 that was also used in this work. Both we and others have demonstrated the importance of a very precise network of interactions in the closed state in facilitating catalysis by this enzyme. 26,104,105 In addition, previous work 26,106 has shown that mutations that allow even a single water molecule to enter the active site in the vicinity of the general base, Glu165, negatively impact the thermodynamic barrier to substrate deprotonation and that there is in turn a near-linear correlation between the reaction free energy (thermodynamic barrier, ΔG°) and activation free energy (kinetic barrier, ΔG ⧧ ).
In the present work, we performed EVB simulations of the deprotonation of substrate DHAP, starting from (1) the fully closed state of the enzyme (i.e., the crystal structure) with no restraint on loop 6 (see also ref 26) as well as (2) Figure 7). We extracted snapshots at 0.05 intervals of Q (see Methodology section) as well as additional snapshots in 0.01 increments between Q = 1.0 and Q = 0.95, to capture how quickly loop opening has an impact on the calculated energetics. In the case of calculations initiated from conformations of the loop other than the initial crystal structure, a 1 kcal mol −1 Å −2 restraint was placed on all backbone heavy atoms of loop 6 residues (Val167-Ala176) in order to maintain a Q value close to that extracted from the BE-METAD simulations.
We note that in the crystallographically observed open conformation of loop 6 (see, e.g., PDB ID: 1YPI), 35,41 there is a substantial displacement of Glu165 toward a noncatalytic conformation in which it points out of the active site. 13,41 While we also sampled this conformation in our BE-METAD simulations, we selected structures at different Q values with this side chain pointing into the active site in a catalytically competent conformation (as described in the Methodology section), although this is not likely at smaller Q values, as loops 6 and 7 open substantially. It does, however, provide a lower limit to the activation free energy, assuming ideal positioning of Glu165. We note also that our structures for Q = 1.0 were extracted from the first 300 ps of the unbiased replica from the BE-METAD simulation.
From our data ( Figure 7 and Table S8), it can be seen that the EVB simulations performed for the crystal structure and conformations at Q = 1.0, which were selected early enough in the simulation to be virtually identical to the crystal structure, give excellent agreement with the experimental value for the deprotonation of DHAP (calculated activation free energies of 15.2 ± 0.2 and 15.5 ± 0.3 kcal mol −1 , respectively, compared to the experimental value of 14.1 kcal mol −1 ). 21 Already at Q = 0.99, however, only a small perturbation to the structure, the activation free energy, jumps to 23.2 ± 0.2 kcal mol −1 , which is similar to the experimentally estimated activation free energy for the non-enzymatic reaction (25.2 kcal mol −1 ), 10,107 indicating that loop displacement has created an environment very similar to simply performing the reaction in water. This 8 kcal mol −1 increase upon opening of the loop that we observe computationally is also in qualitative agreement with the effect of an experimental loop deletion mutant, where deletion of residues 170−173 of loop 6 (Ile170, Gly171, Thr172, Gly173) led to a ∼6.4 kcal mol −1 increase in activation free energy. 108 In addition, the overall reaction free energy increases from 6.8 ± 0.4 to 16.9 ± 0.3 kcal mol −1 , which is again similar to the estimate for the non-enzymatic reaction (18.9 kcal mol −1 ), 10 in line with our previously observed linear relationship between kinetic and thermodynamic barriers to the reaction 26 (i.e., Finally, a comparison of the structures at Q = 1.0 and Q = 0.99 shows that even at this point there is a substantial displacement of loop 6, which leads to flooding of the active site with water molecules from ∼8 at Q = 1.0 to as high ∼14 at lower Q values (see Figure 7 and Tables S9 and S10) as well as a displacement in substrate position and key active site side chains, which remains the case with decreasing Q values (see Table S11, although the impact of this substrate displacement on reacting geometries is minimal, as highlighted in Table  S12). It is notable that the largest displacement upon loop opening (apart from displacements of the reacting atoms, substrate DHAP and side chain of Glu165) is observed in the side chain of Ile170 (Figure 7 and Table S11). Ile170 is already subtly displaced at Q = 0.99 and then further displaced at Q = 0.4 ( Figure 7B−D). Both we, 26 and prior to that Richard et al., 106 have shown the importance of the active site residues Ile170 and Leu230 in forming a "hydrophobic clamp" that regulates the basicity of the Glu165 side chain by blocking the access of solvent to the active site. The shift in position of the Ile170 side chain in conjunction with (small) displacements of loop 6 is therefore notable.
In summary, as can be seen from these simulations, achieving the crystallographic closed conformation is extremely challenging. As our metadynamics simulations bias only the backbone, even when the native contacts shown in Figure 7 are as high as Q = 0.99, the structure is not the fully (crystallographically observed) closed state: loop 6 remains partially open, and the protein is in a catalytically incompetent conformation. The negative impact of this on the activation free energy is perhaps unsurprising, as the role of active site architecture and solvent exclusion in catalysis by TIM and related enzymes has been discussed in detail elsewhere. 26 However, these data demonstrate that despite the overall flexibility of the TIM loop, and the fact that it appears to sample many different conformations, it must be fully closed over the active site for efficient catalysis. This is necessary to provide the key interactions for efficient transition-state stabilization, while simultaneously creating a hydrophobic cage that excludes water molecules from the active site, thus contributing to the elevated pK a of the catalytic base, Glu165, which is in turn necessary to promote the deprotonation of the substrate. This pK a has been experimentally estimated to be as high as ∼10 in the presence of the phosphoglycolate (PGA) trianion. 109 We have performed qualitative estimates of the pK a of this residue at different Q values using PROPKA 3.1. 97 We have, in addition, estimated the pK a of this residue in PDB IDs: 1NEY 16,35 and 4TIM, 35,110 which are structures of TIM in complex with DHAP and PGA, respectively (see Figure S18 and Table S14). PROPKA underestimates the pK a of Glu165 in the presence of PGA (calculated pK a of 7.4 compared to ∼10 from experiment), 109 but it should predict trends in its pK a , indicating how the pK a changes with changing loop conformation. These calculations show that the pK a of Glu165 is highest when the loop is fully closed and drops as loop opening increasingly exposes the active site to solvent. No other ionizable residues show changes in pK a that follow a trend with Q values that could lead to changes in protonation states upon loop opening or closure. We note that the importance of creating a hydrophobic cage for the reaction of TIM to occur has been discussed at great length elsewhere in the literature (see, e.g., refs 26, 104, and 106). Closure of the active site to provide such a hydrophobic environment (reducing solvation of the catalytic base) to favor proton abstraction by the active site base is also observed in other enzymes, such as ketosteroid isomerase, 111 and is essential for efficient reaction.
Finally, the conformational differences we observe between apo-TIM and DHAP-TIM highlight the role of ligand binding in helping to attain this catalytically competent conformation. This provides further evidence that the large energetic changes observed when comparing the reaction of the full substrate GAP and that of the substrate piece glycoaldehyde in the absence and presence of phosphite dianion 27 are not due to chemical activation, but rather due to ligand binding shifting the conformational equilibria between the open and closed conformations of loop 6 of this enzyme (see also the discussion in ref 98).

■ CONCLUSIONS
Enzyme conformational changes, in particular of active site loops, are a common feature of the regulation of enzyme activity as well as enzyme functional evolution. 112−115 Triosephosphate isomerase in particular has been a prototype system for understanding the importance of loop conformational changes in enzyme function. In addition, understanding how TIM is regulated is of interest from a biocatalysis perspective, as the TIM barrel is one of the most versatile and evolvable protein scaffolds. 11,15,24,32 The conventional image of loop motion in TIM has been that of a simple two-state rigidbody motion, 4,7−9,13,14,37 which is ligand-gated, with ligand binding shifting the equilibrium of the closed-state of loop 6 from a marginally stable conformation to the preferred, catalytically competent conformation of the enzyme. 22 In the present study, we have performed detailed simulations of the conformational dynamics of loop 6 and the role of this dynamics in the TIM-catalyzed deprotonation of substrate DHAP, using conventional MD, Hamiltonian replica exchange, bias-exchange metadynamics, and empirical valence bond simulations, making this to our knowledge the most comprehensive computational study of TIM conformational dynamics to date. Through comparing these different techniques, we demonstrate that highly sophisticated sampling methods are necessary to obtain a meaningful description of this loop closure using atomistic simulations. Taken together, our simulations demonstrate for the first time that, contrary to the simple two-state rigid-body motion often assumed in the literature, loop 6 is dynamic and flexible and can sample multiple kinetically distinguishable conformational states. Furthermore, the nature of these states is altered by the presence of the substrate. However, despite sampling these multiple conformational states, the enzyme is only fully active when in a structure that approximates the crystallographically observed completely closed state of the enzyme (as shown in our EVB simulations). This is because loop 6 closure is necessary in order to create a hydrophobic cage that elevates the pK a of Glu165 such that it is able to deprotonate the substrate. This insight impacts both design of experiments to analyze this loop motion (for example, additional sites may need to be labeled in NMR studies) as well as the choice of appropriate reaction coordinates in computational studies; simple geometric reaction coordinates such as specific Journal of the American Chemical Society Article pseudodihedrals, as commonly used in many computational studies, 8,102 are unlikely to capture the overall conformational dynamics of the loop. This will also pose challenges for the further refinement of tools for automatic prediction of loop dynamics, such as, for example, ref 116. Our EVB simulations, which complement our earlier studies of this enzyme, 26,27 demonstrate that complete loop closure is absolutely critical for efficient catalysis by TIM, which is, in turn, driven by a combination of a network of highly specific hydrogen bonding and electrostatic interactions, combined with solvent exclusion from the active site. We also demonstrate that even slight displacements of the loop away from its catalytically competent conformation can have substantial negative impact on the calculated activation freeenergies, due to breaking these interactions and allowing solvent to penetrate the active site.
So, is loop motion in TIM a simple open and shut case? Our simulations altogether show that yes it is, in the sense that full closure of the loop is essential for catalytic activity, but they also show that the motion of the loop is complex. Our study shows that a conformational change that was thought to be simple is in fact complex, sampling multiple states along the reaction coordinate. We emphasize again that, while our study has focused on TIM as a model system, such ligand-gated conformational changes are certainly not unique to TIM and are in fact common. It will be important in the future to characterize the motion of functional loops and their effects on activity in other enzymes. The binding of ligands to other enzymes (structurally and functionally unrelated to TIM) such as orotidine 5′-monophosphate decarboxylase 49 (ODCase) and glycerol 3-phosphate dehydrogenase (GPDH) 25,50 suggests an evolutionary role for the regulation of loop dynamics. In contrast, highly promiscuous enzymes frequently possess large and "sloppy" active sites with fewer specific binding interactions, which is thought to allow them to adapt to different substrates for efficient catalysis. [112][113][114]117 From a computational perspective, the results here demonstrate that simulating loop dynamics is challenging, requiring extensive conformational sampling and specialized computational approaches. The complexity of what had been thought to be a simple loop motion is a sobering reminder of the challenges posed by simulations of protein conformational changes and the need for enhanced sampling methods, on time scales of at least several hundred ns, in order to be able to characterize these motions. The results here demonstrate, however, the essential role of atomically detailed simulations in understanding protein conformational behavior, and in informing and complementing experiments and the interpretation of experimental data. 118 Finally, simulations of ligandgated conformational changes and their role in catalysis provide a promising route for understanding the catalytic activity of many enzymes and assisting protein engineering and artificial enzyme design.