Reinforcing Tunnel Network Exploration in Proteins Using Gaussian Accelerated Molecular Dynamics

Tunnels are structural conduits in biomolecules responsible for transporting chemical compounds and solvent molecules from the active site. They have been shown to be present in a wide variety of enzymes across all functional and structural classes. However, the study of such pathways is experimentally challenging, because they are typically transient. Computational methods, such as molecular dynamics (MD) simulations, have been successfully proposed to explore tunnels. Conventional MD (cMD) provides structural details to characterize tunnels but suffers from sampling limitations to capture rare tunnel openings on longer time scales. Therefore, in this study, we explored the potential of Gaussian accelerated MD (GaMD) simulations to improve the exploration of complex tunnel networks in enzymes. We used the haloalkane dehalogenase LinB and its two variants with engineered transport pathways, which are not only well-known for their application potential but have also been extensively studied experimentally and computationally regarding their tunnel networks and their importance in multistep catalytic reactions. Our study demonstrates that GaMD efficiently improves tunnel sampling and allows the identification of all known tunnels for LinB and its two mutants. Furthermore, the improved sampling provided insight into a previously unknown transient side tunnel (ST). The extensive conformational landscape explored by GaMD simulations allowed us to investigate in detail the mechanism of ST opening. We determined variant-specific dynamic properties of ST opening, which were previously inaccessible due to limited sampling of cMD. Our comprehensive analysis supports multiple indicators of the functional relevance of the ST, emphasizing its potential significance beyond structural considerations. In conclusion, our research proves that the GaMD method can overcome the sampling limitations of cMD for the effective study of tunnels in enzymes, providing further means for identifying rare tunnels in enzymes with the potential for drug development, precision medicine, and rational protein engineering.

In result, we observed and mostly can be seen in run1 and run4 that the distance increases between Asp147 (CG) and CoM of side helix also distance increases between Asp147 -Leu177 (both backbone Cα and side chain atoms) and the ST opens.We can observe around 1 µs and after that in run1 and run4 of GaMD, there is increase of red diamond appearance, which indicated that movement of side helix farther from the protein is required for the opening of ST in LinB-Wt.Asp147 and CoM of side helix by black color.To understand the movement of phi and psi angle, the angle between the Trp and Asp were analyzed by pink and magenta color, and presence or opening of ST is shown by red diamond.In result, we can observe that the breakage of hydrogen bond is accompanied by the opening of ST and during such event, the distance increases between the Asp and Trp as well as the distance increases between CoM of side helix and Asp147.Also we can observe the fluctuation in psi and phi angle during the event of hydrogen bond breakage.So, in mutant the ST opens with the breakage of H-bond and movement of side helix farther from the protein.formation despite the residues Asp147 and Trp177 backbone are close enough, in this case the backbone is close enough but the side chain is far investigated by distance between Asp147 (CG) and Trp177 (NE1) in green color, hence no formation of hydrogen bond.

Figure S2 .
Figure S2.RMSD and RMSF of mutant LinB-Closed: (A) RMSD calculated on protein after removal of N-terminal tail (11 amino acids) using cMD and GaMD, (B) RMSF calculated on whole protein residues.

Figure S3 .
Figure S3.RMSD and RMSF of mutant LinB-Open: (A) RMSD calculated on protein after removal of N-terminal tail (11 amino acids) using cMD and GaMD, (B) RMSF calculated on whole protein residues.The high fluctuation in the second replicate's (run2) values in RMSD and RMSF plots of GaMD are caused by residues of loop domain (245GLY -249GLY).This can occur due to boost potential of GaMD, which increased the fluctuations of the loop and the residues in the vicinity.Importantly, the RMSD remains in the range of 2Å and the loop region returns close to its initial, stable conformation after second μs indicating that observed movement represent reversible fluctuation without serious effect on the overall dynamics of protein.

Figure S4 .
Figure S4.Rg and SASA of LinB-Wt: (A) Rg calculated on whole protein using cMD and GaMD, (B) SASA calculated on whole protein using cMD and GaMD.

Figure S5 .
Figure S5.Rg and SASA of LinB-Closed: (A) Rg calculated on whole protein using cMD and GaMD, (B) SASA calculated on whole protein using cMD and GaMD.

Figure S6 .
Figure S6.Rg and SASA of LinB-Open: (A) Rg calculated on whole protein using cMD and GaMD, (B) SASA calculated on whole protein using cMD and GaMD.

Figure S7 .
Figure S7.RMSD of catalytic residues ASN38 in all variants of LinB, to investigate its stability in the cMD and GaMD.

Figure S8 .
Figure S8.RMSD of catalytic residues ASP108 in all variants of LinB, to investigate its stability in the cMD and GaMD.

Figure S9 .
Figure S9.RMSD of catalytic residues TRP109 in all variants of LinB, to investigate its stability in the cMD and GaMD.

Figure S10 .
Figure S10.RMSD of catalytic residues GLU132 in all variants of LinB, to investigate its stability in the cMD and GaMD.

Figure S11 .
Figure S11.RMSD of catalytic residues HIS272 in all variants of LinB, to investigate its stability in the cMD and GaMD.

Figure S12 .Figure S13 .Figure S14 .Figure S15 .
Figure S12.TransportTools results of LinB-Wt, comparison between cMD and reweighted GaMD: bar plots for total number of frames, average bottleneck radius (BR), average length and maximum BR for LinB-Wt obtained from corresponding fractions of 5, 2.5, 1 μs and 500 ns of cMD and GaMD simulations.

Figure S18 .
Figure S18.Deepsite tool prediction of viable druggable protein-ligand-binding sites using machine learning approach.Top two viable binding site represents p1 tunnel pocket as pocket1 and ST pocket as pocket2.

Figure S19 .
Figure S19.PASSer 2.0 tool prediction results for top three pockets as viable allosteric or cryptic site to bind ligand.Here, p1 tunnel pocket is shown in red colored spheres and arrow and ST pocket is shown as orange colored spheres and arrow.

Figure S20 .
Figure S20.FTMove tool prediction result of transient binding sites on conformers of protein.This tool predicts cryptic allosteric site in the protein and probability is depicted by color in the figure.Here we can also see, ST pocket (yellow color -Binding site 002) is predicted as high probable cryptic site along with active site pocket (dark grey color -Binding site 004) (left).The ST connects the two cryptic pockets (Binding site 002 -004) and also we can see main p1b tunnel connecting p1b tunnel pocket (salmon color -Binding site 003) and active site (Binding site 004) pocket (right).

Figure S21 .
Figure S21.The variance and cumulative variance of all fourteen principal components in cMD and GaMD, showing first two PC captures around 80-90% variance and hence sufficient to understand the systems.

Figure S26 .
Figure S26.Mechanism of side tunnel (ST) opening in LinB-Wt using GaMD method.Here, Because of Leu177, it cannot make hydrogen bond with Asp147 and hence the hydrogen bonds presence is at 0 and we will observe the opening of ST in absence of h-bond.The movement of side helix was investigated by distance analysis between atoms (Cα) of Leu, which is present on side helix in respective to Asp which is shown in red color and distance between same residue but different atom Asp (CG) and Leu (NE1) to analyze the side chain behavior is shown by green color.Distance between Asp side chain atom (CG) and center of mass (CoM) of side helix to understand the mechanism of opening of ST is shown in black color and finally opening of ST are shown in red diamonds in the figure.In result, we observed and mostly can be seen in run1 and run4 that the distance increases between Asp147 (CG) and CoM of side helix also distance increases between Asp147 -Leu177 (both backbone Cα and side chain atoms) and the ST opens.We can observe around 1 µs and after that in run1 and run4 of GaMD, there is increase of red diamond appearance, which indicated that movement of side helix farther from the protein is required for the opening of ST in LinB-Wt.
Figure S27.Mechanism of side tunnel opening in LinB-Closed mutant using GaMD method.Here because of the mutation at Leu177 to Trp177, it forms hydrogen bond with Asp147.Analogously to previous plot, the presence and absence of hydrogen bond is shown by 0 and 1 by blue vertical line.To understand the mechanism of ST opening in Closed mutant, we have analyzed distance between Trp177 backbone atom (Cα), side chain (CG) which is on the side helix to Asp147 (Cα) and side chain atom (NE1) by red and green color.Further we have analyzed the movement of side helix with respective of protein by analyzing distance between Distance between Asp147 (Cα) -Trp177 (Cα) Distance between Asp147 (CG) -Trp177 (NE1) Distance between Asp147 (CG) -CoM of side helix Angle between Trp177 (NE1) -Trp177 (H) -Asp147 (OD1) Angle between Trp177 (NE1) -Trp177 (H) -Asp147 (OD2) Presence of hydrogen bond between Asp147 and Trp177 Presence of Side tunnel

Figure S28 .
Figure S28.Mechanism of side tunnel opening in LinB-Open mutant using GaMD method.Analogously to FigureS27.Some interesting cases indicated by dashed lines, in few cases, there is hydrogen bond formation and still opening of ST, we investigated that in these cases the hydrogen bond is formed by the residues but still the movement of side helix can be seen farther away from the protein (black color), and due to the movement of side helix away from the protein, it helps with the opening of the ST.In another case there is no hydrogen bond