Asymmetric dynamics of dimeric SARS-CoV-2 and SARS-CoV main proteases in an apo form: Molecular dynamics study on fluctuations of active site, catalytic dyad, and hydration water

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been widely spread around the world. It is necessary to examine the viral proteins that play a notorious role in the invasion of our body. The main protease (3CLpro) facilitates the maturation of the coronavirus. It is thought that the dimerization of 3CLpro leads to its catalytic activity; the detailed mechanism has, however, not been suggested. Furthermore, the structural differences between the predecessor SARS-CoV 3CLpro and SARS-CoV-2 3CLpro have not been fully understood. Here, we show the structural and dynamical differences between the two main proteases, and demonstrate the relationship between the dimerization and the activity via atomistic molecular dynamics simulations. Simulating monomeric and dimeric 3CLpro systems for each protease, we show that (i) global dynamics between the two different proteases are not conserved, (ii) the dimerization stabilizes the catalytic dyad and hydration water molecules behind the dyad, and (iii) the substrate-binding site (active site) and hydration water molecules in each protomer fluctuate asymmetrically. We then speculate the roles of hydration water molecules in their catalytic activity.


Introduction
Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) was originally identified in 2003. 1 In 2019, the novel SARS coronavirus (SARS-CoV-2) emerged and has been spreading around the world. Both viruses cause severe pneumonia. The viruses encapsulate a singlestranded RNA in its viral envelope, and protrusions (i.e., spike proteins) project from the surface so that the virus can recognize certain receptors on cells.
The main protease (3-chymotrypsin like protease, 3CL pro ) plays an important role in the maturation of the SARS-CoV and SARS-CoV-2 coronaviruses. The protease is a cysteine protease and has a catalytic dyad formed by histidine (H41) and cysteine (C145), which are highly conserved among different coronaviruses ( Figure S1). The protease catalyzes the polyprotein translated from the viral RNA, breaking the polyprotein into functional pieces, which include the protease itself. In the present study, we called the protease "3CL pro " and used the term "active site" as the one consisting of the physiological-ligand binding pocket and the hydrolysis-reaction catalytic site.
Because the SARS-CoV and SARS-CoV-2 3CL pro are a protease, designing binders to the active site is a direct way to deactivate the catalytic activity. On the basis of this manner, many investigations have been attempted to identify novel binders to the active site. [2][3][4][5][6][7] For instance, a Michael acceptor inhibitor (N3) was originally designed to inhibit SARS-CoV 3CL pro . 2 N3 also binds to the active site for SARS-CoV-2 3CL pro (PDBID: 6LU7 and 7BQY). 3 Another way to impair the catalytic activity can be to inhibit the dimerization of the protease. 8 For dimeric SARS-CoV 3CL pro , an X-ray experiment suggested that the substrate pocket S1 in the active site of a protomer was collapsed and the other pocket wasn't, and hence one protomer is active, but the other is inactive. 9 The analysis of MD trajectories for dyad conformation suggested that one protomer in the dimeric SARS-CoV 3CL pro was inactive and a monomeric SARS-CoV 3CL pro was inactive. 10 However, for dimeric SARS-CoV-2 3CL pro , room-temperature X-ray crystallography demonstrated that the collapse of S1 pocket in the dimer did not occur; the authors pointed out that previous crystal structures might have an artefact. 11 The dimer dissociation constant for both SARS-CoV and SARS-CoV-2 3CL pro was estimated; also the catalytic efficiency of SARS-CoV-2 3CL pro was slightly higher than SARS-CoV one. 12 Subsequently, the dissociation constant was revised by native mass spectrometry, the value of which was about 0.14 0.03 . 13 The dimerization and the enzymatic activity was weakened by both the C-and N-terminal truncations of SARS-CoV 3CL pro . 14 On the other hand, the C-terminal truncation, which ranged from 300 to 306, did not cause noticeable effects on the quaternary structure. 15 The stability of several known compounds on the active site was assessed for finding potent binding sites. 16 Their blind docking result implied that the dimer interface would be a potent target.
Although the relationship between the dimerization and the catalytic activity has been investigated so far, 8 few studies demonstrate how they relate and why the dimerization matters for the catalytic activity. Furthermore, it is unclear whether the results of SARS-CoV 3CL pro are reusable even for SARS-CoV-2 3CL pro . It is thus necessary to investigate the relationship and the difference between the two proteases. It should be confirmed whether SARS-CoV-2 3CL pro behaves in the same way as SARS-CoV 3CL pro ; and whether it is inactive in a monomeric form and is active in a dimeric form. No study revealed, from a statistical and structural viewpoint, the reason why the monomer is inactive, but the dimer is active. The present study aims to unveil the relationship between the protease's structure/dynamics and its catalytic activity.
In the present study, we demonstrate that the dimerization of the proteases stabilizes their catalytic dyad and hydration water behind the dyad. We apply atomistic molecular dynamics simulations to calculate the global dynamics, dyad stability, and hydration water dynamics. We find that the global dynamics of SARS-CoV-2 3CL pro differ from those of SARS-CoV 3CL pr . For both dimeric proteases, each active site complies with different structural probability distributions. The catalytic dyad is retained in a dimeric state but not in a monomeric state, and hydration water molecules behind the dyad stay longer in a dimeric state than in a monomeric state.

Preparation of Simulation Systems
All the target proteases in the present study are shown in Table 1. One simulation system for a dimeric system of SARS-CoV-2 3CL pro was constructed by homology modelling based on the structure of PDBID 1UJ1. The template sequence is shown in Table S1. Since the sequence identity is 96.3 %, the homology modelling approach would provide a reliable structural model. Furthermore, another dimeric system of SARS-CoV-2 3CL pro was constructed from the X-ray structure of PDBID 6LU7. The structural data of the homodimer was downloaded from the "Biological Assembly" of the PDB 6LU7. Note that one protomer's conformation in the dimeric system is identical to the other protomer's because the asymmetric unit of the PDBID 6LU7 is monomeric (i.e., the root mean square displacement between the two protomers is zero). On the basis of the chain IDs, each chain was named "protomer A" and "protomer B".
The protonation state of the histidine H41 would be crucial for the catalytic activity of 3CL pro . For the careful assignment, the catalytic site in a known X-ray structure of 3CL pro (PDBID: 1UJ1) was examined. The surroundings of the catalytic dyad (H41-C145) imply that of H41 is protonated and of H41 is not ( Figure S2). Accordingly, of H41 was protonated and was not in No. 1-7 simulations. Nevertheless, because it is also plausible for 3CL pro to form an ionized dyad , 17 a dimeric system with two ionized dyad and a monomeric one with an ionized dyad were also constructed (No. 8-9 in Table 1). The computational results of the ionized systems are given in the Supporting Information.  21,22 The query sequence is shown in Table S1 in the supporting information. f The ligand was removed from 6LU7, and thus the systems from 6LU7 were an apo form. g The dyad was ionized (i.e., H41 + …C145 -).
For each structure prepared, the C-and N-terminus were capped by acetyl and amine groups, respectively. Then, each of the structures was immersed into a tetrahedron water box whose margin was 10 Å from each of the planes of the box to the structure. The salt concentration in the box was set to 0.1 M. Energy minimization was executed with all heavy atoms restrained and then without any restraints. Subsequently, five NVT simulations with different velocities were performed during 100 ps, and then five NPT simulations at 1 bar during 100 ps. From the five equilibrated structures, five independent NPT simulations at 1 bar at 310 K were performed. Only for the dimeric system of PDBID 6LU7, the number of simulations was 10. Production runs were performed for for the dimeric system and for the others (Table 1).
Computational setting are as follows: time step was set to 2 fs, an interaction table for cut-off scheme 23 was used whose list was updated every 20 fs. A leap-frog algorithm was used for an integrator, and LINCS 24 was for hydrogen-bond constraints (lincs_iter=1, lincs_order=4)/ V-rescale 25 was for temperature control for two groups, protein and nonprotein ( , compressibility=4.5 10 6 ). Berendsen coupling 26 was used for the first box equilibration, and Parrinelo-Rahman coupling for all production runs to control pressure 27,28 ( , ). Short-range electrostatic and van der Waals interaction cut-off values were set to 12 Å. Amber ff99SB-ILDN was employed for protein force field 29 and TIP3P model 30 for water molecules. Reaction-field method 31 for long-range interaction calculation was used with the relative dielectric constant set infinity. Note that, in Gromacs notation, the option epsilon-rf=0 means that is infinity. Snapshots were stored for every 100 ps. Gromacs 2019.2 [32][33][34][35][36] was used for all simulations. The supercomputer TSUBAME 3.0 was used for all simulations. For analysis, MDAnalysis 37,38 was used for parsing MD trajectory. All pictures for molecules were drawn by PyMOL 2.3.0 39 . Every MD trajectory from to was analyzed (see Figures S3 and S4).

Principal Component Analysis (PCA)
Cartesian-based PCA (cPCA) was used for description of global protein dynamics. The cPCA provides dynamic modes, i.e., eigenvectors , where is a system name and is the number of coordinates of interest. The eigenvectors were arranged in a descending order of the eigenvalues. Correlations between and of different proteases were calculated (i.e., inner products ). 40 Here, cPCA was applied to trajectories of the SARS-CoV 3CL pro and SARS-CoV-2 3CL pro systems. MD snapshot of a trajectory was expressed as , where is the number of atoms. Structural superimposition for every snapshot was carried out for all the atoms. The variance-covariance matrix was defined as ⟨ ⟩ ⟨ ⟩ ⟨ ⟩ , where ⟨ ⟩ is the ensemble average at temperature . was diagonalized, thereby generating eigenvectors ( ). Inner products were computed. Global modes were defined as the top-3 eigenvectors , , and . The inner products indicate the correlation of global modes between of SARS-CoV-2 3CL pro and SARS-CoV 3CL pro .
Furthermore, distance-based PCA (dPCA) was also utilized to grasp a structural distribution. 41 In terms of a structural distribution, dPCA shows better performance (e.g., convergence of cumulative variances) than cPCA. 1 The dPCA was thus applied for the active site that was represented as a set of distances between atoms (Table S2). The number of the selected atoms was . The active site of MD snapshot was expressed as a set of distances , where was the number of distance pairs, i.e., . The variance-covariance matric was defined as ⟨ ⟩ ⟨ ⟩ ⟨ ⟩ . The matrix was diagonalized and then eigenvectors ( ) were generated. The active site of each snapshot was projected onto every eigenvector by , where is the position of each snapshot on eigenvector, i.e., principal component (PC ).

Statistical Analysis of the Catalytic Dyad
The probability distribution of the catalytic dyad's distance was calculated ( Figure 1). On the basis of the probability distribution, the dyad was judged whether it was broken or not. The following procedure was used for the judgement: (i) The probability distribution of the dyad distance between and was calculated. The distribution was expressed as . (ii) The mode (the value that appears most frequently) of was calculated, which was expressed as . (iii) The deviation of from the reference distance was calculated by where was the distance between and in the X-ray structure of PDBID 6LU7. The following condition was applied for the judgement of the dyad's state: Å Å Using this condition, a confusion matrix was created (Table 3). This matrix was used for the evaluation of the structural difference of the dyad between the monomeric and the dimeric systems. Two-sided Fisher's exact test of independence was employed for the statistical assessment. The null hypothesis was that oligomeric states (i.e., monomeric or dimeric) are independent of the states of the dyad (i.e., broken or retained). Although this test is similar to test of independence, the use of Fisher's exact test is recommended if sample size is less than 1000. 42 Matthew's correlation coefficient (MCC) of the matrix was also calculated for the assessment of the correlation between the two categories: monomeric/dimeric states and broken/retained dyad. MCC is considered a reliable statistical measure. 43 was defined as the distance ( ) between and in an X-ray structure (PDBID: 1UK3A). was set to the most frequent value of .

Residence Time of Water Molecules
Survival probability (SP) 44 of water molecules within the defined sphere ( Figure 2) was calculated by the following procedure: The average coordinates of the triad (H41, H164, and D187) is calculated for each snapshot. The number of water molecules at time within the sphere of radius Å centered at is expressed as . The number of water molecules within the sphere from to is represented as , where is called a lag time. The SP is then defined as: ∑ where is the entire simulation time of interest and . The SP tells us how fast water molecules within the sphere diffuse. Let us consider two extreme cases: if the number of water molecules within the sphere never decreases after time , i.e., the water molecules stay in the sphere for the time ; if , the number of water molecules within the sphere vanishes for the time . Residence time was defined as the time when SP reaches 0.5. A python library, MDAnalysis, was used for the SP calculation. 37,38,44 Figure 2. The definition of the spherical space used in the survival probability calculation. The transparent grey space was defined as the sphere whose center was positioned at the average coordinates of H41, H164, and D187 and whose radius was . This depicted structure was taken from the PDB ID 6WQF. 19 The symbol O in indicates an oxygen atom of a water molecule within the sphere. The water molecule is surrounded by H41, H164, and D187 and corresponds to the crystal water that are considered important for the catalytic activity. 19,45

Results and Discussion
This section presents results of the systems with neutral dyads mainly, whereas results of the systems with ionized dyads are given in the supporting information.

Correlation between Major Dynamic Modes of the Two Proteases
Performing cPCA for the entire protein structure, we produced their dynamic modes (i.e., eigenvectors) for the dimeric SARS-CoV 3CL pro and SARS-CoV-2 3CL pro systems. Since it is unclear how much the two proteases' global motion is conserved, we examined dynamic modes' correlations between the two proteases. Here, the correlation was defined as cosine similarity between a pair of eigenvectors, which were scaled to be unit vectors. Table 2 shows the correlation for each pair of eigenvectors. The highest correlation was 0.37 between and , the second was 0.36 between and , and the third was 0.35 between and . The second mode was distributed into and . These results indicated that the major modes were not highly conserved, while the amino-acid sequence identity between two proteases was 96% that was very high value. The slight mutations on SARS-CoV-2 3CL pro altered the structural dynamics of the previous SARS-CoV 3CL pro . The alternation implied that the global dynamics were not a fundamental aspect for the proteases' function.

Structural Fluctuations of the Active Site
This section deals with structural fluctuations of the active site. For the analysis, we applied distance-based principal component analysis (dPCA) (Table S2 and Figure 3E). Recall that the dimer has one active site in each protomer. We executed dPCA for each active site, obtaining the first principal component (PC1).
For the dimeric systems with neutral dyads, the active site in a protomer fluctuated differently from the other protomer's ( Figure 3). For instance, Traj2 in Figure 3A shows two distinguishable probability distributions of PC1. Although the undistinguished distribution in Traj5 of Figure 3D also occurred, it appeared rarer than the separated distributions. At the beginning of the simulations for 6LU7, the two active sites were structurally identical. This eliminates the possibility that the separated distribution was attributed to the initial structures. The asymmetric fluctuations may imply that substrate specificity of a protomer differ from that of the other. This asymmetric fluctuations of the active site were supported by several crystal structures whose protomer has conformationally distinct active site (e.g., PDBID: 1UK3) 9 and by earlier computational studies. 10,47 An earlier study also pointed out the structural heterogeneity of the active site on the basis of known crystal strucutres. 46 For the monomeric systems with a neutral dyad, we observed multimodal probability distributions along with PC1 ( Figure S5). For example, Traj5 in Figure S5C shows a bimodal distribution, which indicated that the monomeric systems' active site was unstable more than the dimeric systems. Keeping active site's conformation was thus likely to be a difficult task for the monomer; this can be one explanation of why the monomer is inactive. For both proteases, the dynamic modes were altered by the slight mutations (Table 2), but the distributions of the active site showed the same tendency: each active site fluctuated asymmetrically ( Figure 3). atoms of the active site that was used in dPCA. PC1 of each graph was not identical to each other, i.e., we analyzed each Traj i (i=1,2,…,10) individually in the dPCA, because merging multiple trajectories results in the expansion of conformational space, leading to the poor capability of dPCA to discriminate conformations.

Dyad Stability
We investigated the distance fluctuations of the catalytic dyad in order to connect the dyad's fluctuations with the catalytic activity of the 3CL pro . We calculated the distance between and . For each trajectory, we computed a probability distribution of and obtained the mode (i.e., the most frequent value). Using the mode and the reference distance , we judged whether the dyad was retained or broken (Table 3). Further details of this procedure are given in Materials and Methods.
We found that the dimeric systems tended to retain the active dyad's conformation more than the monomeric systems did (Figure 4 and Table 3). To assess the difference between the monomeric and dimeric systems, we conducted Fisher's exact test and calculated Matthew's correlation coefficient (MCC). We have shown the statistical significance (p-value = 0.0025) if we employed the significance level of . Furthermore, MCC was 0.48, showing a positive correlation of broken dyad with the monomeric systems. Thus, from a statistical viewpoint, the dimeric systems are likely to retain the dyad, whereas the monomeric systems are less likely to do so. The dyad instability in the monomer was in agreement with the result that SARS-CoV 3CL pro (not SARS-CoV-2 3CL pro ) at low concentration (corresponding to a monomeric state) did not show its enzymatic activity. 10 Table S3 shows that 20% dyad of one protomer in the dimeric systems tended to be broken. It implied that the dyad in each protomer kept its distance asymmetrically like the asymmetric conformation of the active sites in Figure 3. We thus conjectured that the active site and the dyad in a protomer form conformations different from those in the other protomer and that each protomer has inhomogeneous catalytic activity. Although no study gave the direct evidence to this inhomogeneity, the structural heterogeneity of the active site has been pointed out from crystal structures. 46 It should be noted that the homology modelling structure could not retain the geometry of the dyad even though the template structure 1UJ1 retained its dyad geometry (Table S3). The entire dynamics seemed appropriate, but the probability distribution of the distance for the dyad might have been affected by the modelled initial structure. It suggested that a homology model should be handled carefully by examining not only the overall dynamics but also local structural fluctuations.  Table 3. Confusion matrix for the dependency of monomeric/dimeric states on the dyad's conformation. "Monomeric" (first column) and "Dimeric" (second column) data were derived from the trajectories No. 3, 4, and 7 and No. 1, 2, and 6 (see Table 1).

Consensus Hydration Sites of Water Molecules
To investigate the role of hydration water in the catalytic activity, we identified sites in which water molecules stayed for more than 500 ns ( Figure S6 and S7). From the sites, we selected and studied three sites (M 1 , M 2 , and D 1 , D 2 , and D 3 in Figure S6 and S7; M and D stands for Monomeric and Dimeric, respectively). Since these were found in most trajectories, we called the sites "consensus hydration sites". The consensus hydration sites had one noticeable feature: several water molecules were closely located and were stabilized mainly by the hydrogen bonds to N-H or C=O of the backbone ( Figure 5). For example, backbone's polar atoms in the site D 2 of both protomers interacted with a water molecule ( Figure 5D). The water molecule stabilized the loop in the site D 2 . Interestingly, since some residues in the site D 2 (e.g., F291, E290, and E288) were seen even in the site M 2 , this hydration site was likely to be conserved in both the monomeric and dimeric systems. The hydration site D 2 may play a crucial role in the dimerization: An earlier study suggested that the mutation E290A for SARS-CoV 3CL pro led to a complete loss of the activity and dimerization because E290 in a protomer formed a salt bridge with Arg4 in the other protomer. 48 If the hydration water molecule does not stabilize the site D 2 containing E290, forming an appropriate orientation of E290 towards Arg4 can be difficult. The hydration water was also observed in crystal structures.
The site D 1 in the dimeric systems contained three water molecules ( Figure 5C). This site accommodated the water molecule (dubbed ) found in most crystal structures that is hydrogen-bonded to , , and . 19,45 The three water molecules were positioned behind the catalytic dyad, whereas water molecules in the site D 1 were not observed in the monomeric systems. Instead, the site M 1 appeared and was adjacent to D 1 . M 1 was likely to be transformed into D 1 in the dimeric systems ( Figure S8)).
The site D 1 accommodating the three water molecules corresponds to the one where a single crystal water molecule is bound (e.g., PDB ID 6WQF, 6LU7, 1UK3, and 1UJ1) (Figure 2). The difference between our MD trajectories and the crystal structures was the number of hydration water molecules. The crystal structures show the existence of a single crystal water molecule in the site D 1 , whereas our trajectories indicated several water molecules in the site.
The site D 3 was observed in most of the trajectories, which was located near D 1 ( Figure 5E and the blue circles in Figure S7). Thus, the site D 3 can be said to be the most agreed hydration site among the dimeric systems. Our result indicated that this site was conserved among the two different proteases ( Figure S7). Since the water molecule in the site D 3 stayed near D 1 , it can contribute to the stability of the site D 1 , and thus to the stability of the dyad. In the site D 3 , a crystal water molecule is also seen in crystal structures (e.g., 6WQF, 6LU7, 1UK3, 1UJ1, 6M03); it corresponds to the water molecule in Figure 5E(b). Our trajectories demonstrated that the water molecule in the site D 3 fluctuated between the following hydrogen-bond networks: Y182-G174-F134 and Y161-H172 ( Figure 5E(a) and 5E(b)). We noted that D 3 was not observed in all the trajectories of the monomeric systems ( Figure S6). In (C), the water molecule O in corresponds to the crystal water that are considered important for the catalytic activity. 19,45 The sphere representation indicates the dyad H41 and C145. The small red spheres represent oxygen atoms of water molecules. Green and cyan cartoon representations show the protomer A and B, respectively. The dotted-yellow lines indicate hydrogen bonds of water molecules to amino-acid residues. The interactions were mainly from backbone atoms, except for the electrostatic interactions with the tips of following residues: K5 ( Figure 5B); T175, H164 and D187 ( Figure 5C); Y161 ( Figure 5E(a)).

Residence Time of Water Molecules in the Hydration Site D 1
In the previous section, we discussed the five hydration sites, and the site D 1 contained three water molecules, one of which corresponded to the catalytic water O in . Here, we investigate the site D 1 further, using residence time , which indicates how long water molecules stay in a site on average (see the definition Eqn. 3). We computed survival probability (SP) of water molecules within the spherical space defined in Figure 2 (This space was the definition of the site D 1 ). We then estimated residence time for each system (Figure 6 and S9, and Table S4).
For the monomeric systems, the residence time for 1UK3A and 1UK3B were shorter than the one for 6LU7 (left columns in Table S4). This showed that the ability of the monomeric SARS-CoV-2 3CL pro to keep water molecules in the site D 1 was slightly higher than the monomeric SARS-CoV 3CL pro . Also, since the site D 1 of the monomeric SARS-CoV-2 3CL pro had the two distinctive values of the residence time (e.g., and in Table S4). The site D 1 appeared to form at least two water-bound modes. We referred to the two water-bound modes as "water-weakly-bound mode" (if ) and "water-strongly-bound mode" ( ), respectively. On the other hand, the site D 1 in the monomeric SARS-CoV 3CL pro appeared to form only water-weakly-bound mode, because of the short residence time ( ). For dimeric systems, the site D 1 of one protomer was likely to hold water molecules for longer residence time than that of the other protomer (right columns in Table S4). For both SARS-CoV 3CL pro and SARS-CoV-2 3CL pro , the site D 1 of one protomer was capable of forming the water-strongly-bound mode, whereas the site D 1 of the other protomer formed the water-weakly-bound mode. Both protomers were unable to form the water-stronglybound mode simultaneously (e.g., Figure 6), but both were able to do the water-weaklybound mode at the same time.
The comparison of the monomeric and dimeric systems demonstrated that the dimerization stabilized water molecules in the site D 1 , because the residence time of the monomeric systems was at most , whereas that of the dimeric systems was more than . Indeed, several water molecules settled in the site D 1 for more than 800 ns (not residence time but simulation time). As further analysis, WaterMap method would also be useful to obtain insight into the hydration sites and drug design. 49,50 It would be beneficial to reiterate the two features for the dimeric systems: (i) the longer residence time of water molecules in the site D 1 than the monomeric systems; (ii) more stable dyad than the monomeric systems. The two results showed that the dimerization stabilized the dyad and hydration water in the site D 1 . Unexpectedly, the dyad stability appeared irrelevant to the residence time of the hydration water (Eqn S1 and Table S5), i.e., the hydration water was unlikely to stabilize the dyad. The hydration water molecules may have other roles in the catalytic activity, and we speculated the roles in the subsequent section. Figure 7 summarizes the relationship between the water molecules, the dyad, and the oligomeric states. Figure 7A illustrates the mobility of the hydration water molecules in D 1 . The mobility depended on the oligomeric state of the proteases (Table S4 and Figure S9). In a monomeric state, hydration water molecules did not stay in D 1 , and the dyad tended to be broken ( Figure 7B). The dimerization stabilized the water molecules in D 1 and the dyad; the hydration water in D 2 appeared to contribute the dimerization ( Figure 7C). Figure 6. A representative of survival probability (SP) for water molecules in the site D 1 , which was taken from Traj1 (1UJ1) in Figure S9. The horizontal solid line indicates 0.5 of SP, by which was defined. All of the results are given in Figure S9.

Speculative Roles of Hydration Water
For further consideration, we conjectured three roles of the hydration water molecules. The first possibility is that the water molecules stabilize the dyad, thereby keeping the activity. The water molecules, however, seemed irrelevant to the dyad stability because of the conditional probabilities calculated from Table S5. For the dyad stability, the dimerization did matter (Figure 4 and Table 3).
The second possibility is that some of the water molecules in D 1 may transfer protons to C145 that is covalently bonded to a substrate, allowing the substrate to dissociate. It is thought that the catalytic process involves the hydrolysis of the covalent bond, which is attacked by a water molecule. 51 We speculated that a water molecule in D 1 can donate a proton to the covalent bond easier than bulk water molecules does because of the steric hinderance of the covalently-bonded substrate.
The third possibility is that the hydration water may prevent the active site from being denatured by accepting the heat generated by the catalytic reaction and dissipating it into bulk. The hydration water molecules in the site D 1 might act as a medium that gets and releases the heat into bulk. Earlier studies investigated the heat generated by a catalytic reaction of some enzymes: [52][53][54] The heat increased the enzyme's diffusivity, 52,53 and the energy of the heat was about 5 kcal/mol. 54 The results raise the question of why the resultant heat does not cause severe denaturation of the active site. We speculated that if the heat dissipates into the hydration water molecules in D 1 , the active site may not be intensely perturbed. Since the water molecules in D 1 were mobile ( Figure 6 and S9), they were readily released into the bulk and stored again. In this way, the active site may be prevented from being overheated. For direct verification, future work will take the propagation of the heat into account. A computational method by Stock Gerhard group would be suitable, which applies local temperature jump and then keeps track of the energy propagation. [55][56][57]

Conclusion
The investigation of SARS-CoV 3CL pro and SARS-CoV-2 3CL pro in an apo form provides fundamental insight into how they sustain their catalytic activity. In the present study, we simulated the two proteases. We revealed that major global modes were not conserved ( Table  2). The changes in the modes would originate from the slight mutations on SARS-CoV-2 3CL pro . We observed that each active site in the dimeric 3CL pro obeyed different conformational distribution (Figure 3). This may imply that each protomer has different substrate specificity. We found that the catalytic dyad in the dimeric systems tended to be retained more than that in the monomeric systems ( Figure 4 and Table 3), which indicated that the dimeric state is active.
Furthermore, we identified hydration water around the two proteases, the water which might be functionally relevant ( Figure 5). Hydration water molecules in the site D 1 of a dimeric state stayed longer than of a monomeric state ( Figure 6). Their residence time depended on the oligomeric states of the two proteases. Since it is thought that the dimer is an active form, these hydration water molecules might be functionally relevant. These tendencies above were also observed in the systems with ionized dyads.
Lastly, we point out unsolved problems. Although the dimerization is crucial for the catalytic activity, the dimerization interface has a large interaction surface, which causes difficulty in targeting the surface. For future work, we illustrated important residues on the interface in "Contact Residues on Dimer Interface" in the supporting information. Plus, even though the allostery cause by several mutations was investigated, 60 the knowledge of the effects on the protease's dynamics is still not adequate for application. Future work should describe more (e.g., by pointing out allosteric pathways) to allow us to manipulate the protease's function. SARS-CoV-2 can mutate readily because it is an RNA virus that mutates more quickly than DNA viruses. 61 The understanding of the mutation effects on the activity is indispensable to a fight for the viruses' drug-resistance.

Supporting information
The Supporting Information is available free of charge.