Constructing kinetic models to elucidate structural dynamics of a complete RNA polymerase II elongation cycle

RNA polymerase II elongation is central to eukaryotic transcription. Although multiple intermediates of the elongation complex have been identified, the dynamical mechanisms remain elusive or controversial. Here we build a structure-based kinetic model of a full elongation cycle of polymerase II, taking into account transition rates and conformational changes characterized by both single molecule experimental studies and computational simulations at the atomistic scale. Our model suggests that a force-dependent slow transition detected in the single molecule experiments corresponds to an essential conformational change, the opening of a trigger loop (TL), prior to the polymerase translocation. Analyses of the E1103G mutant and of potential sequence effects on translocation substantiate this proposal. Our model also investigates another slow transition detected in the transcription elongation cycle, which is independent of mechanical force. If this force-independent slow transition happens as the TL gradually closes upon NTP binding, the analyses indicate that the binding affinity of NTP to the polymerase has to be sufficiently high. Otherwise, one infers that the slow transition happens pre-catalytically but after the TL closing. Accordingly, accurate determination of the intrinsic properties of NTP binding is needed for an improved characterization of the polymerase elongation. Overall, the study provides a working model of polymerase II elongation under a generic Brownian ratchet mechanism, with the most essential structural transitions and functional kinetics elucidated.


Introduction
RNA polymerase II (Pol II) is the core enzyme that catalyzes gene transcription in eukaryotic cells [1][2][3][4]. The multi-subunit Pol II consists of 12 subunits and weighs about 550 kDa. It works with various transcription factors to control multiple stages of transcription [5][6][7][8][9][10], from initiation through elongation to termination. High-resolution structures of the Pol II complex have been captured at different conformational states during its elongation cycle [11][12][13][14][15][16][17][18][19][20]. In particular, two prominent structural elements close to the active center of Pol II are well characterized as key to the elongation control: one is a trigger loop (TL) that opens after product release and during polymerase translocation, and closes upon nucleotide binding and through catalysis [15][21][22][23][24][25][26]; the other is a bridge helix (BH) that is located next to the TL and assists translocation or active site rearrangement [27,28]. Pol II is one of the most studied multi-subunit RNA polymerases. Both ensemble and single molecule measurements have been conducted to investigate the elongation properties of Pol II [29][30][31][32][33][34][35][36]. Nevertheless, detailed mechanisms of Pol II elongation are still lacking or remain controversial.
The multi-subunit RNA polymerases have been suggested to work as a Brownian ratchet (BR) along the DNA track, as the translocation takes place spontaneously back and forth between the pre- and post-translocation states [37][38][39]. NTP binding to the active site prevents backward movements of the RNA polymerase, and therefore promotes forward movements. Remarkably, the base-pair stepping motion of a similar multi-subunit RNA polymerase from E. coli was resolved at the single molecule level [37]. Fits to the force-velocity data of the single enzyme elongation supported a BR model, though NTP binding was suggested to happen at either pre- or post-translocation.
Similar features have been derived from a recent experiment using an optical-trapping assay with high spatiotemporal resolution to probe single yeast Pol II elongation [34]. Global fits to the force-velocity data again supported the above BR model with NTP binding at either pre- or post-translocation. This model is hereafter referred to as a branched BR model. It should be noted that the translocation, or the force-dependent transition, is assumed to happen much faster than subsequent kinetic steps in each elongation cycle in this branched BR model [34].
In a more recent single molecule experiment challenging individual yeast Pol II with a nucleosomal barrier, however, the above assumption of fast translocation turns out to be unnecessary [35]. The study sought to achieve a comprehensive kinetic characterization of Pol II elongation without making any assumption about the rate-limiting step of the elongation [35]. In the experiment, an optical tweezers assay was used to follow a single Pol II elongation at varying NTP conditions, and under an assisting or opposing applied force, similar to that in [34]. In addition, different tracks of DNA, bare or with a nucleosomal barrier, were both examined for the Pol II elongation [35]. Based on force measurements similar to those in [34] but an alternative way of data fitting, it has been shown that the forward translocation rate can be very low (∼88 s⁻¹) and comparable to the force-independent slow rate or catalytic rate (∼35 s⁻¹) [35]. The finding suggested that a non-branched generic BR model holds as long as the translocation, or indeed the force-dependent transition prior to NTP binding, is also regarded as a slow step during the transcription elongation.
On the computational side, molecular modeling and simulations based on high-resolution structures of Pol II have provided fruitful insights into atomic level mechanisms of transcription elongation in recent years [23,26][40][41][42][43][44][45][46][47][48]. For example, normal mode analysis applying an atomistic force field to the polymerase structure suggested that productive translocation requires TL opening, while the translocation is further inhibited by the presence of an NTP in the active site [41]. At the same time, molecular dynamics (MD) simulations on atomistic structures of Pol II also suggested that the thermally driven translocation of the polymerase is only achievable with an open TL [42].
Furthermore, extensive all-atom MD simulations employing Markov state model (MSM) techniques have recently captured the fast dynamics of pyrophosphate ion (PPi) release at the microsecond time scale [43,44]. PPi release happens after chemical addition of an inserted NTP; translocation of the polymerase happens after the PPi release. The study suggested that the PPi release could induce a slight tip opening motion of the TL, even though a full TL opening is yet to happen. Remarkably, atomistic MD simulations employing the same techniques while focusing on the translocation of Pol II reported that translocation in the absence of NTP can occur within tens of microseconds [48]. The results appear inconsistent with the slow translocation step (about ten milliseconds) reported by the single molecule experiments [35]. However, it is noted that the pre-translocation structure adopted in the MD simulation already has an open TL configuration.
Combining data and analyses at the single molecule level from both the experimental and computational sides, we propose here a working model of Pol II elongation, in which fast translocation of the RNA polymerase is preceded by a slow process of TL opening. We regard the force-dependent transition detected in the single molecule experiments [35] as a combination of the slow TL opening and the fast translocation. The model resolves the seeming controversies from the above studies. At the same time, we also show that there are at least two scenarios for the force-independent rate-limiting transition that follows the translocation and NTP binding in the elongation. In addition, we discuss how sequence stability variation along DNA affects the elongation rate and explain why it cannot induce the detected force-dependent slowing down. We also estimate the intrinsic NTP dissociation constants of the polymerase so as to infer which scenario is more likely for the force-independent transition.

Model constructs and results
2.1. The generic BR model and single molecule measurements supporting this non-branched model
First we address the elongation kinetics of Pol II in a minimal three-state representation: Pol II follows a generic BR mechanism, and NTP binding serves as a pawl and is only allowed at post-translocation. In the scheme depicted in figure 1(a), states I, II and III are the pre-translocated (pre-trans), post-translocated (post-trans), and NTP-bound (substrate) states, respectively. Translocation happens between states I and II, with both forward and backward transition rates depending on the mechanical force for or against the movement [34,35]. Basically, transition I → II encloses a force-dependent process that happens after the product formation but prior to the NTP binding. The rate of NTP binding (II → III) is proportional to the NTP concentration, as the binding process is presumably dominated by NTP diffusion in the solution. On the other hand, the product formation step (III → I) lumps several events together: a potentially slow waiting process (such as the active site rearranging or tightening [42]), followed by chemical addition of the nucleotide, and then the PPi release. When only one of these events is significantly slow, it is a good approximation to model the full product formation process as one transition. In addition, if the PPi concentration is sufficiently low, the above transition becomes effectively irreversible.
When off-pathway transitions such as transcription pausing or backtracking are not considered, the elongation rate of the above scheme is derived as:

$$ v = \frac{k_{\max}\,[\mathrm{NTP}]}{K_M + [\mathrm{NTP}]}, \qquad k_{\max} \equiv \frac{k_t^+ k_c^+}{k_t^+ + k_c^+}, \qquad K_M \equiv K\,(1+\sigma_t)\,\frac{k_t^+}{k_t^+ + k_c^+}, \qquad (1) $$

with $k_t^+$ and $k_c^+$ the forward rates of the force-dependent (translocation I → II) and the force-independent (catalytic III → I) step, respectively, in the three-state scheme (figure 1(a)), and $K$ an effective NTP dissociation constant. In particular, $\sigma_t \equiv k_t^-/k_t^+ = \sigma_t^0 e^{-F\Delta/k_BT}$ is the force-dependent bias against the forward translocation, where $\sigma_t^0$ is defined at zero force (Δ is 1 nt distance; F > 0 for an assisting force). $\sigma_t^0 \sim 1$ is expected for the Brownian motion. As a result, how fast the elongation rate approaches saturation (measured by $K_M$), as well as how large the saturation rate ($k_{\max}$) is, depends on the implemented force F in the single molecule manipulation. In the single molecule studies by Dangkulwanich et al [35], no particular assumption (such as translocation being very fast) was made about the kinetic rates in the BR scheme (figure 1(a)), which was used to fit the experimental data. Among their results, it was measured that $k_{\max} = 25 \pm 3$ nt s⁻¹ and $K_M = 39 \pm 12$ μM, according to the Michaelis-Menten fitting to the pause-free elongation velocity versus [NTP] data (at an applied force of 6.5 pN) [35]. To determine the respective values of $k_t^+$ and $k_c^+$, additional measurements challenging Pol II with nucleosome barriers were conducted [35]. Comparing the pause-free velocities on nucleosomal DNA and bare DNA showed that both the force-dependent and force-independent rates are low ($k_t^+(F=0) = 88 \pm 23$ s⁻¹ at zero force, and $k_c^+ = 35 \pm 3$ s⁻¹) [35]. Hence, the translocation step seems to be another slow step aside from the catalytic one. Furthermore, pause densities during elongation were measured at different NTP concentrations, giving $k_t^- \cdot K \sim (4.7 \pm 0.5) \times 10^3$ μM⁻¹ s⁻¹. As a result, it was estimated that for the backward translocation [...]
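As a quick numerical illustration of the three-state scheme, the following Python sketch evaluates a Michaelis-Menten form of the pause-free elongation rate. It assumes $k_{\max} = k_t^+k_c^+/(k_t^+ + k_c^+)$ and $K_M = K(1+\sigma_t)k_t^+/(k_t^+ + k_c^+)$, which follow from the steady state of the non-branched cycle with an effective NTP dissociation constant $K$. The function names and the value `K_uM = 7.0` are hypothetical choices for illustration, not fitted values from [35].

```python
# Sketch of the three-state (non-branched) Brownian ratchet kinetics.
# Assumed forms: k_max = kt*kc/(kt+kc), K_M = K*(1+sigma_t)*kt/(kt+kc).


def mm_parameters(k_t_fwd, k_c_fwd, sigma_t, K_uM):
    """Return (k_max in nt/s, K_M in uM) for the three-state cycle."""
    k_max = k_t_fwd * k_c_fwd / (k_t_fwd + k_c_fwd)
    K_M = K_uM * (1.0 + sigma_t) * k_t_fwd / (k_t_fwd + k_c_fwd)
    return k_max, K_M


def elongation_rate(ntp_uM, k_t_fwd, k_c_fwd, sigma_t, K_uM):
    """Pause-free elongation rate (nt/s) at a given [NTP] in uM."""
    k_max, K_M = mm_parameters(k_t_fwd, k_c_fwd, sigma_t, K_uM)
    return k_max * ntp_uM / (K_M + ntp_uM)


if __name__ == "__main__":
    # Slow rates quoted in the text (88 and 35 per second) and a bias of 7.7;
    # K_uM = 7.0 is an illustrative guess, not a measured value.
    k_max, K_M = mm_parameters(88.0, 35.0, 7.7, 7.0)
    print(f"k_max = {k_max:.1f} nt/s, K_M = {K_M:.1f} uM")
    for ntp in (10.0, 100.0, 1000.0):
        rate = elongation_rate(ntp, 88.0, 35.0, 7.7, 7.0)
        print(f"[NTP] = {ntp:6.0f} uM -> v = {rate:.1f} nt/s")
```

With the two slow rates quoted above, the saturating rate comes out close to 25 nt s⁻¹, which illustrates that neither slow step alone sets the maximal velocity in this scheme.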

A branched BR model and single molecule measurements supporting this branched model
For the pause-free elongation rate obtained in the generic BR scheme (figure 1(a)) and equation (1), one can see that by assuming $k_t^+ \gg k_c^+$, $k_{\max}$ is reduced to the catalytic rate $k_c^+$. That is, when the force-dependent transition happens much faster than the force-independent step in this scheme, the saturation rate of the elongation becomes that of the force-independent step. However, in single molecule experiments on Pol II [34,35], a force dependency of the saturation elongation rate has consistently been detected, in particular in the case of the mutant polymerase E1103G [34]. Therefore, if one adopts the assumption $k_t^+ \gg k_c^+$ while using the non-branched BR scheme (figure 1(a)), the model cannot fit the single molecule experimental data well, as shown in [34].
In order to fit the experimental data under the fast translocation assumption, a branched BR model was adopted [34], in which NTP binding is allowed either at pre-translocation or at post-translocation (see figure 1(b)). As $k_t^+ \gg k_c^+$ holds, the pause-free elongation rate of the branched BR model again takes the Michaelis-Menten form,

$$ v = \frac{k_{\max}\,[\mathrm{NTP}]}{K_M + [\mathrm{NTP}]}, \qquad (2) $$

such that $k_{\max} = k_c^+/(1+\sigma_t)$. The saturating elongation rate then becomes force dependent under this branched scheme. Hence, it explains why the experimental data fitted better with the branched BR model than with the non-branched one, under the fast translocation assumption [34]. Correspondingly, three parameters were obtained for the wild-type Pol II from fitting to the branched BR scheme in [34]: the catalytic rate $k_c^+$, the NTP dissociation constant (in μM), and $\sigma_t^0 = 0.2 \pm 0.1$. The data fitting from [34], along with the same fitting to the data measured later [35], is shown in SI figure S1.
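The contrast between the two schemes in the fast-translocation limit can be sketched numerically. The Python snippet below assumes a saturating rate $k_c^+/(1+\sigma_t)$ for the branched scheme and $k_c^+$ for the non-branched one, with a Boltzmann-type force dependence $\sigma_t = \sigma_t^0 e^{-F\Delta/k_BT}$. The constants `KBT_PN_NM` and `STEP_NM`, as well as the function names, are assumptions made for this illustration.

```python
import math

KBT_PN_NM = 4.11   # thermal energy near room temperature, pN*nm (assumed)
STEP_NM = 0.34     # one-nucleotide translocation distance, nm (assumed)


def sigma_t(force_pN, sigma_t0):
    """Translocation bias under force; F > 0 denotes an assisting force."""
    return sigma_t0 * math.exp(-force_pN * STEP_NM / KBT_PN_NM)


def kmax_branched(force_pN, k_c_fwd, sigma_t0):
    """Saturating rate k_c/(1 + sigma_t), assumed for the branched scheme."""
    return k_c_fwd / (1.0 + sigma_t(force_pN, sigma_t0))


def kmax_nonbranched_fast(k_c_fwd):
    """Limit k_t >> k_c of k_t*k_c/(k_t + k_c): independent of force."""
    return k_c_fwd


if __name__ == "__main__":
    k_c, s0 = 35.0, 0.2   # catalytic rate and zero-force bias from the text
    for force in (-6.0, 0.0, 6.5, 20.0):
        print(f"F = {force:5.1f} pN: branched k_max = "
              f"{kmax_branched(force, k_c, s0):.1f} nt/s, "
              f"non-branched (fast limit) = {kmax_nonbranched_fast(k_c):.1f} nt/s")
```

Under these assumptions only the branched form retains a force-dependent plateau, which is the feature the fits in [34] relied on.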
If one compares the fitted parameters from the branched BR model [34] with those from the non-branched BR model [35], one sees that except for an identical catalytic rate $k_c^+$, the other key features are obtained quite differently: (i) in the branched BR model, $\sigma_t^0 \sim 0.2$, so the post-translocation state is more populated than the pre-translocated state. In the non-branched BR model, however, $\sigma_t^0 \sim 7.7$, so it is the pre-translocated state (under the three-state scheme) that is more populated. (ii) In the branched BR model, the intrinsic NTP dissociation constant can take a high value, whereas a much lower value (at the μM scale) is obtained in the non-branched three-state BR model. Hence, one sees that even though the pause-free elongation velocities were measured consistently on Pol II in both experiments [34,35], different choices of the data-fitting models lead to quite different physical interpretations of the system.
Based on structural and dynamical properties of Pol II revealed from computational studies, we present an expanded elongation scheme (shown in figure 1(c)) in this work. The scheme follows the generic non-branched BR model, while five kinetic states instead of three are used to provide a slightly more specific description of the Pol II elongation. Essentially, we propose a force-dependent TL opening process prior to the translocation. We justify this proposal below.

TL opening is a necessary step prior to translocation and can be slow and force dependent
In contrast to the single molecule experiment that suggested a slow translocation step of Pol II at tens of milliseconds, recent atomistic MD simulations implementing the MSM techniques identified the Pol II translocation at tens of microseconds in the absence of incoming NTPs [48]. This study supported the BR model where Pol II can move between the pre- and post-translocation states with nearly identical transition rates. The translocation rates (∼10⁵ s⁻¹) [48], however, are much larger than the experimentally measured force-dependent rate (∼10² s⁻¹) [35]. Though including a full transcription bubble in the simulation may result in a longer translocation time, it is worth pointing out that the experimentally measured force-dependent rate does not necessarily apply to the translocation per se. The single molecule experimental study on the mutant Pol II E1103G [35] (addressed below) strongly indicates that a conformational change related to the TL, rather than the transcription bubble, leads to the slow and force-dependent transition in the generic BR scheme (figure 1(a)). If the conformational change happens as a prerequisite right before the translocation, then a force assisting the translocation can easily accelerate the conformational change (or inhibit the change in the reverse direction); vice versa, a force opposing the translocation may hinder the preceding conformational change (or facilitate it in the reverse direction).
In previous computational studies of Pol II, it had been suggested that TL opening is required for the Pol II translocation to happen [41,42]. Using normal mode analyses, it was found that the reduced flexibility of the clamp domain upon TL closing or NTP binding to the active site translates into reduced mobility of the downstream DNA, thereby effectively inhibiting the translocation [41]. Further, it was shown that MD simulations with open and closed TLs sampled a common state with slight forward translocation relative to the x-ray structures, but more significant forward translocation was only observed in simulations with an open TL [42].
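The time scales being compared here can be made concrete with a small Python sketch. It converts the quoted rates into mean waiting times and, as a simplifying assumption, treats a slow step followed by a fast translocation as two irreversible sequential steps whose mean times add. The function names are hypothetical.

```python
def mean_dwell_time_s(rate_per_s):
    """Mean waiting time (s) of a single exponential step."""
    return 1.0 / rate_per_s


def sequential_rate(*rates_per_s):
    """Effective rate of irreversible steps in series (mean times add)."""
    return 1.0 / sum(mean_dwell_time_s(k) for k in rates_per_s)


if __name__ == "__main__":
    slow, fast = 88.0, 1.0e5   # rates per second quoted in the text
    print(f"slow step: {mean_dwell_time_s(slow) * 1e3:.1f} ms")
    print(f"fast step: {mean_dwell_time_s(fast) * 1e6:.0f} us")
    print(f"combined rate: {sequential_rate(slow, fast):.1f} per second")
```

In this simplified picture the combined transition is dominated almost entirely by the slow step, so a fast translocation hidden behind a slow preceding process would still appear as a single slow, force-dependent transition.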
Another important event that happens after catalysis and prior to translocation is PPi release. MD simulation studies have shown that dynamics of the PPi release is much faster than the complete opening motion of TL. Nevertheless, the PPi release can increase the flexibilities of the tip region of the TL domain [43,44]. On the other hand, in simulating the Pol II translocation, the pre-translocated structure adopted at the beginning of the simulation already has an open TL (modeled from the crystal structures: PDB id: 1I6H, 2NVT and 2E2J) [48]. Hence, it seems that a closed TL puts a 'brake' on the downstream DNA until after the PPi release. The full opening of the TL after the PPi release removes the brake and allows the translocation to proceed. Under an assisting force, the TL opening would be accelerated as the downstream DNA is forced to move backward. That is, the TL opening can be regarded as a slow and force-dependent transition that allows the fast translocation to happen.
Essentially, the single molecule study on the mutant E1103G indicated that the rate of the slow force-dependent transition drops significantly (from 88 s⁻¹ to 44 s⁻¹) upon the mutation [35]. E1103 is located at one end of the TL. The finding strengthens the idea that conformational changes related to the TL, rather than the transcription bubble, cause the force-dependent slowing down. In the TL closed structure, we notice that salt bridges form between the negatively charged E1103 and two positively charged residues, R1100 and K1112, with distances of ∼4.4 Å and 3.5 Å, respectively (see figure 2(a)). The geometry of the salt bridge between E1103 and R1100, however, appears to be suboptimal compared to regular salt bridges (<4 Å). The impact of salt bridge interactions on helix folding and unfolding kinetics has been closely examined in a recent study using temperature-jump transient-infrared spectroscopy and steady-state UV circular dichroism [49]. In that study, the effect of Glu-Arg salt bridges on the kinetics of alpha-helix folding was investigated, showing that suboptimal salt bridges with unfavorable geometry kinetically destabilize the folded structure or promote the helix unfolding. It is then likely that in Pol II, the suboptimal salt bridge (E1103-R1100) formed in the closed configuration of the TL destabilizes the closed/folded TL, or promotes the TL opening/unfolding. In contrast, the mutation (E1103G) that abolishes the destabilizing salt bridge (E1103-R1100) brings a relatively stabilized form of the closed TL, or reduces the TL opening rate compared to the wild type.
Following the above analyses, we use a five-state BR model to describe the elongation kinetics of Pol II. The reasons for choosing five states are: (i) in addition to the three basic steps (translocation, NTP binding, and catalysis), we want to separately model the TL opening process as a slow transition, after the catalytic product formation but prior to the translocation. Hence, at least four kinetic steps/states are needed for an elongation cycle. (ii) In addition, we want to consider and compare two scenarios for the force-independent slow event: the event either happens right upon NTP binding (model A below), or happens at least one kinetic step further down the reaction path (model B). To that end, two kinetic steps are considered between the NTP binding/pre-insertion state and the product state as the simplest case. Hence, five states/steps are now modeled. (iii) Since PPi release happens very fast and is not our concern in this work, we do not model the PPi release separately from the catalysis transition. As such, a five-state kinetic model serves as a minimal representation for the purpose of the current study.

Model A: the five-state BR model with a rate-limiting TL closing/isomerization upon NTP binding
In the five-state scheme (figure 1(c)), states I-V refer to the pre-trans, post-trans, pre-insertion [50], substrate, and product states. Correspondingly, translocation proceeds through I → II, NTP binding through II → III, TL closing (or isomerization) through III → IV, catalysis through IV → V (including pre-catalytic adjustment, phosphoryl transfer, and PPi release), and TL opening through V → I, which allows for the next cycle. We assume in the current scheme that (a) IV → V approaches irreversibility (with forward rate $k_c^+$, and backward rate $k_c^- \to 0$) as the PPi concentration is very low; (b) both V ↔ I (TL opening and closing prior to translocation, at rates $k_{TLo}^{\pm}$) and I ↔ II (translocation forward and backward, at rates $k_t^{\pm}$) are force dependent, with V ↔ I sufficiently slow and I ↔ II very fast; (c) in particular for model A, III → IV is set as the rate-limiting step (the forward rate, i.e. the rate of TL closing after NTP binding, is $k_{TLc}^+$, without force dependence), such that the transition rates $k_{TLc}^{\pm} \ll k_c^+$ (the catalytic rate).
Correspondingly, one can write down a master equation for the probability/population distribution of the five kinetic states, using the population vector $\Pi = (P_{I}, P_{II}, P_{III}, P_{IV}, P_{V})^T$:

$$ \frac{d\Pi}{dt} = M\,\Pi, \qquad (3) $$

where M stands for the transition rate matrix of the five-state scheme (figure 1(c)). At steady state ($d\Pi/dt = 0$), under the model A assumptions (see SI for further details), one obtains the pause-free elongation rate as

$$ v = \frac{k_{\max}\,[\mathrm{NTP}]}{K_M + [\mathrm{NTP}]}, \qquad k_{\max} \equiv \frac{k_{TLc}^+ k_{TLo}^+}{k_{TLc}^+ + k_{TLo}^+}, \qquad K_M \equiv K\,(1+\sigma_t+\sigma_t\sigma_{TLo})\,\frac{k_{TLo}^+}{k_{TLo}^+ + k_{TLc}^+}, \qquad (4) $$

with $K \equiv K_d + k_{TLc}^+/k_b^0$, and in particular, $\sigma_{TLo} \equiv k_{TLo}^-/k_{TLo}^+$ and $\sigma_t \equiv k_t^-/k_t^+$ as the biases against the TL opening (V → I) and translocation (I → II), respectively. The larger the bias, the more the system is stabilized in the initial state (the TL closed state V or the pre-translocated state I). It is easy to see that equation (4) is analogous to equation (1), derived for the non-branched BR model in the three-state representation. Mathematically, fittings to experimental data according to equations (4) and (1) give equivalent results for the dominant or slow events: one obtains $k_{TLc}^+ \sim 35$ s⁻¹ and $k_{TLo}^+ \sim 112$ s⁻¹ (under the assisting force of 6.5 pN) in model A.
Combining with experimental data on pause densities [32], one can estimate quantities such as $K$, but cannot further extract the 'intrinsic' information on NTP binding, $K_d$ and $k_b^0$. One also notices that in equation (4) the elongation velocity is independent of some of the individual kinetic rates, such as $k_{TLc}^-$ and $k_c^+$, as the derivation is made under the specific assumptions of model A. Fitting equation (4) to the elongation velocity data, therefore, cannot reveal those 'hidden' parameters. In practice, one can also fit the experimental data following the exact formulae (e.g., see SI equation S1) when the model assumptions hold approximately. As a result, some parameters can be estimated this way while the rest are assigned likely values. The fitting details of model A to the experimental data are provided in SI figure S2 and table S1.
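The steady state of such a five-state cycle can also be obtained numerically. The pure-Python sketch below builds a transition rate matrix for the cycle I ↔ II ↔ III ↔ IV → V ↔ I, solves for the stationary populations, and compares the resulting flux with a Michaelis-Menten approximation of the model A type. The matrix layout, the dictionary keys, the NTP unbinding rate `k_b_off`, and all numerical values in `RATES` are assumptions chosen for illustration and are not the fitted parameters of SI table S1.

```python
# States: 0 = I (pre-trans), 1 = II (post-trans), 2 = III (pre-insertion),
# 3 = IV (substrate), 4 = V (product). Rates in 1/s, k_b0 in 1/(uM s).
RATES = {
    "k_t_fwd": 5.0e4, "k_t_bwd": 5.0e4,      # fast translocation
    "k_b0": 20.0, "k_b_off": 82.0,           # NTP binding / unbinding
    "k_TLc_fwd": 35.0, "k_TLc_bwd": 1.0,     # slow TL closing (model A)
    "k_c_fwd": 5000.0, "k_c_bwd": 0.0,       # fast, irreversible catalysis
    "k_TLo_fwd": 112.0, "k_TLo_bwd": 1344.0  # slow TL opening
}


def rate_matrix(ntp_uM, r):
    """Build M with M[i][j] = rate from state j to state i (i != j)."""
    M = [[0.0] * 5 for _ in range(5)]
    transitions = [
        (0, 1, r["k_t_fwd"]), (1, 0, r["k_t_bwd"]),
        (1, 2, r["k_b0"] * ntp_uM), (2, 1, r["k_b_off"]),
        (2, 3, r["k_TLc_fwd"]), (3, 2, r["k_TLc_bwd"]),
        (3, 4, r["k_c_fwd"]), (4, 3, r["k_c_bwd"]),
        (4, 0, r["k_TLo_fwd"]), (0, 4, r["k_TLo_bwd"]),
    ]
    for src, dst, k in transitions:
        M[dst][src] += k   # gain of the destination state
        M[src][src] -= k   # loss of the source state
    return M


def steady_state(M):
    """Solve M p = 0 with sum(p) = 1 by Gaussian elimination."""
    n = len(M)
    A = [row[:] for row in M]
    A[n - 1] = [1.0] * n                  # replace one row by normalization
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for c in range(col, n):
                A[row][c] -= factor * A[col][c]
            b[row] -= factor * b[col]
    p = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(A[i][j] * p[j] for j in range(i + 1, n))
        p[i] = (b[i] - tail) / A[i][i]
    return p


def exact_rate(ntp_uM, r):
    """Net cycle flux (nt/s) through the catalysis step IV -> V."""
    p = steady_state(rate_matrix(ntp_uM, r))
    return r["k_c_fwd"] * p[3] - r["k_c_bwd"] * p[4]


def approx_rate(ntp_uM, r):
    """Michaelis-Menten approximation under model A type assumptions."""
    s_t = r["k_t_bwd"] / r["k_t_fwd"]
    s_o = r["k_TLo_bwd"] / r["k_TLo_fwd"]
    K = (r["k_b_off"] + r["k_TLc_fwd"]) / r["k_b0"]
    frac = r["k_TLo_fwd"] / (r["k_TLo_fwd"] + r["k_TLc_fwd"])
    k_max = r["k_TLc_fwd"] * frac
    K_M = K * (1.0 + s_t + s_t * s_o) * frac
    return k_max * ntp_uM / (K_M + ntp_uM)


if __name__ == "__main__":
    for ntp in (10.0, 50.0, 1000.0):
        print(ntp, round(exact_rate(ntp, RATES), 2),
              round(approx_rate(ntp, RATES), 2))
```

For these illustrative rates the approximate form stays within a few percent of the numerical steady state, which is the sense in which the slow TL closing and TL opening steps dominate the cycle time.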
In the most recent MD simulation study that demonstrated the fast translocation of Pol II, it was found that the forward and backward rates are almost identical, at ∼5 × 10⁴ s⁻¹ [48]. Accordingly, we set $\sigma_t^0 = 1$ at zero force. To keep the results consistent with the measured pause densities of Pol II during elongation [35], one obtains the value of $K$ (in μM, see SI table S1) and $k_{TLo}^- \sim 2142$ s⁻¹ at zero force; thus the bias against the TL opening is $\sigma_{TLo}^0 \sim 21$ (see SI table S1). This indicates that prior to translocation, the TL-closed product state (V) is favored over the TL-open pre-translocation state (I) by $k_BT \ln 21 \sim 3\,k_BT$.
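The conversion from a population bias to a free energy difference is a one-line Boltzmann relation, sketched below in Python. The temperature of 298 K and the helper names are assumptions made for the illustration.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def bias_to_free_energy_kT(sigma):
    """Free energy difference (in units of k_B*T) for a population bias."""
    return math.log(sigma)


def kT_to_kcal_per_mol(energy_kT, temperature_K=298.0):
    """Convert an energy in k_B*T units to kcal/mol (T assumed 298 K)."""
    return energy_kT * GAS_CONSTANT_KCAL * temperature_K


if __name__ == "__main__":
    dG_kT = bias_to_free_energy_kT(21.0)
    print(f"bias 21 -> {dG_kT:.2f} kT = {kT_to_kcal_per_mol(dG_kT):.2f} kcal/mol")
```

A bias of about 21 therefore corresponds to roughly 3 $k_BT$, a little under 2 kcal mol⁻¹ at room temperature.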
In estimating the pause densities, the pause-related backtracking is assumed to start from state V, before the TL opening. Since the experimentally fitted backtracking rate is ∼7 s⁻¹, it presumably includes a slow process such as the TL opening. If the backtracking is assumed to start at the pre-translocation state I, one either cannot fit with the pause [...]
Essentially, one sees that the population bias toward the pre-translocation state (∼7.7) presented in the three-state BR model [35] is now 'shifted' to the TL opening process prior to the translocation (∼21), favoring the TL-closed product state (V). Furthermore, one infers that the intrinsic NTP dissociation constant [...] [35].
In the literature, the measured apparent NTP dissociation constant of RNA polymerases (or $K_M$ in the Michaelis-Menten scheme) ranges from 10 to 100 μM [51][52][53]. Conventionally, one assumes that the intrinsic dissociation constant $K_d$ is similar to the apparent value. Model A indicates that these values can be significantly different in the polymerase elongation kinetics. While $K_M$ was measured at ∼39 μM, $K$ < 10 μM is obtained, and $K_d$ is even smaller. For example, in table S1, when we set $k_b^0$ = 20 μM⁻¹ s⁻¹, $K_d$ is ∼4.1 μM. In an alternative model B, however, we show that $K_d$ is not restricted to such a low value.
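The gap between the intrinsic and apparent constants can be illustrated with a short Python sketch. It assumes $K = K_d + k_{TLc}^+/k_b^0$ (from the steady-state balance at state III when the step III → IV is effectively irreversible) and a model A type $K_M = K(1+\sigma_t+\sigma_t\sigma_{TLo})\,k_{TLo}^+/(k_{TLo}^+ + k_{TLc}^+)$. The bias values passed at 6.5 pN and the function names are hypothetical and used only for illustration.

```python
def effective_K_uM(K_d_uM, k_slow_fwd, k_b0):
    """Effective constant K = K_d + k_slow/k_b0 (assumed relation), in uM."""
    return K_d_uM + k_slow_fwd / k_b0


def apparent_K_M_uM(K_uM, sigma_t, sigma_TLo, k_TLo_fwd, k_TLc_fwd):
    """Apparent Michaelis constant under the assumed model A form, in uM."""
    weight = k_TLo_fwd / (k_TLo_fwd + k_TLc_fwd)
    return K_uM * (1.0 + sigma_t + sigma_t * sigma_TLo) * weight


if __name__ == "__main__":
    # K_d = 4.1 uM, k_b0 = 20 per uM per s and 35 per s are quoted in the text;
    # the biases 0.58 and 12.3 are illustrative values under assisting force.
    K = effective_K_uM(4.1, 35.0, 20.0)
    K_M = apparent_K_M_uM(K, 0.58, 12.3, 112.0, 35.0)
    print(f"K = {K:.2f} uM, apparent K_M = {K_M:.1f} uM")
```

Under these assumptions an intrinsic constant of a few μM is compatible with an apparent $K_M$ several times larger, because the biases against TL opening and translocation inflate the apparent value.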

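Model B
In model B, the pause-free elongation rate takes the same form as equation (4), with the force-independent slow rate given by an effective catalytic rate $k_c^* \equiv k_c^+/(1+\sigma_{TLc})$, where $\sigma_{TLc} \equiv k_{TLc}^-/k_{TLc}^+$ is the bias against the TL closing. Since the catalytic rate $k_c^+ = k_c^*(1+\sigma_{TLc})$ is assumed low in model B, while $k_c^*$ needs to fit the slowest rate in the experiment [35], $\sigma_{TLc}$ cannot take a large value.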
According to data fitting, $k_c^* \sim 35$ s⁻¹ and $k_{TLo}^+ \sim 112$ s⁻¹ (at F = 6.5 pN) apply for model B as the two slow rates. The fitting results of model B are shown in figure 3, where the pause-free elongation rate of wild-type Pol II approaches ∼25 nt s⁻¹ (under the force of 6.5 pN) with increasing concentrations of NTP ($K_M$ ∼ 39 μM). Besides, the force dependence of the elongation rate (at a high NTP concentration of 1 mM) is also demonstrated: variation of the force from opposing to assisting (from −6 pN to 20 pN) increases the elongation rate from ∼20 nt s⁻¹ to ∼30 nt s⁻¹. The measurements and data fitting for the mutant E1103G are also shown in figure 3.

Figure 3. [...] [34,35]. The red curve is the model fitting to the experimental data. The parameters and details are provided in SI (table S2). Similar fitting of model A can be found in SI figure S2 and table S1.
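The constraint on the TL closing bias in model B can be sketched as follows. The Python snippet assumes that a fast pre-equilibrium between states III and IV reduces the catalytic rate to an effective value $k_c^* = k_c^+/(1+\sigma_{TLc})$; this relation and the function names are assumptions made for illustration.

```python
def effective_catalytic_rate(k_c_fwd, sigma_TLc):
    """Effective rate k_c* = k_c/(1 + sigma_TLc) for fast III <-> IV exchange."""
    return k_c_fwd / (1.0 + sigma_TLc)


def required_catalytic_rate(k_c_star, sigma_TLc):
    """Intrinsic k_c needed to reproduce a given effective rate k_c*."""
    return k_c_star * (1.0 + sigma_TLc)


if __name__ == "__main__":
    k_c_star = 35.0  # slow force-independent rate quoted in the text, per second
    for sigma in (0.01, 0.1, 1.0, 10.0):
        k_c = required_catalytic_rate(k_c_star, sigma)
        print(f"sigma_TLc = {sigma:5.2f} -> required k_c = {k_c:.1f} per second")
```

A large bias against TL closing would demand a fast intrinsic catalytic rate to reproduce the observed ∼35 s⁻¹, which conflicts with the model B premise of slow catalysis, so only small biases remain consistent.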
By analyzing and comparing models A and B, we see that a generic non-branched BR mechanism explains the experimental data sufficiently well. In both models, the force-dependent slow step can be attributed to the TL opening that precedes the fast translocation; NTP binding is only allowed at post-translocation, acting as a pawl for the BR. Model A assigns the force-independent slow step to the TL closing or isomerization transition (III → IV) right upon NTP binding. In this case, the polymerase waits a long time at state III before transiting to state IV, thus state III is highly populated. As such, one expects a fairly small NTP dissociation constant so that NTP is held tightly at state III to avoid excessive dissociation. In contrast, model B assigns the slow step to the catalytic transition (IV → V, but dominated by pre-catalytic adjustments), while the TL closing isomerization becomes relatively fast (e.g. at a rate >1000 s⁻¹). In the case that the TL closing upon NTP binding is close to irreversible, state III becomes transient or sparsely populated. Accordingly, the NTP dissociation constant can be large, as the polymerase does not lose NTP significantly from the sparsely populated state III.
One can also compare the fitting results of the mutant E1103G with those of the wild type, for both models A and B. According to the measurements [35], one can estimate quantitative features of the mutant polymerase at a fast rate limit (e.g. the force-independent rate-limiting step proceeds as fast as ∼260 s⁻¹, see SI for details). From the experimental fitting, the forward rate of the force-dependent step ($k_{TLo}^+$ for the TL opening prior to translocation) decreases by ∼50% in the mutant, in comparison to the wild type. The bias against the TL opening is found to be ∼20 in the wild type (in both models A and B), while it increases to at least ∼30 in the mutant (with $K \sim 60/(1+\sigma_t)$ μM set for the mutant, see SI). Hence, one is able to attribute the mutant impact on the force-dependent step to stabilization of the TL closed configuration in the product state, and this impact is similar in both models A and B. On the other hand, the rate of the force-independent step increases significantly in the mutant.
With the bias against the force-dependent TL opening obtained above, one finds the free energy of the TL closed product state (V) to be ∼3 $k_BT$ (or ∼2 kcal mol⁻¹) lower than that of the TL open pre-trans state (I). This free energy difference agrees well with that estimated in a recent MD simulation study [26]. The study also estimated that the free energy barrier from the TL open to the closed state is about 2-4 $k_BT$ (1.5-2.5 kcal mol⁻¹) lower than that from the TL closed to the open state, in the presence of NTP. This property seems consistent with model B, in which the TL closed-to-open transition (V → I) happens at ∼100 s⁻¹, while the TL open-to-closed transition (III → IV) happens about tens of times faster (∼1000 s⁻¹ or over). In model A, the TL closing upon NTP binding is even slower than the TL opening prior to translocation. In any case, we note that the barriers reported in [26] were obtained using a combination of targeted MD and Hamiltonian replica exchange MD simulations, in which the application of the external force may introduce artifacts.

Discussion
Pol II is a complex molecular machine that cycles through a number of conformational states upon each nucleotide addition. Although various intermediate states have been characterized by biochemical and structural experiments, the dynamical mechanisms are still lacking. By combining existing biochemical, structural, single-molecule, and MD simulation studies, we develop a transcription elongation model for Pol II in which a slow transition of TL opening happens right before translocation of the polymerase. Overall, the elongation still follows a generic BR mechanism, with NTP binding after the fast Brownian translocation. Essentially, our model suggests that the TL opening, instead of the translocation, gives rise to the force-dependent slow transition detected in the single molecule experiment [35]. The proposal is supported by structure-based computational studies: previous MD simulations and normal mode analyses based on a molecular force field had indicated that the TL in its closed form inhibits the downstream DNA translocation, so that opening of the TL is necessary to remove the inhibition [41,42]. A recent MD simulation study connecting a large number of short trajectories into a network of Markov states demonstrated that the translocation of Pol II happens at ∼20 μs [48], which is much faster than the millisecond time scale of the single molecule measurements. The simulated translocation, however, does not involve the TL opening transition, and only contains a minimal scaffold of the transcription complex [54]. In addition, MD simulation studies had also shown that the TL only slightly opens after a fast PPi release at microseconds [43,44], while the substantial opening of the TL is left to happen thereafter.
The mutant Pol II E1103G behaves consistently with this TL opening model. We notice that E1103 is located at one end of the TL. It forms a suboptimal salt bridge with R1100 that likely destabilizes the local helix folding in the closed/folded TL [49]. The mutation E1103G abolishes the destabilizing effect of the E1103-R1100 salt bridge and thus stabilizes the TL in its folded/closed configuration. Accordingly, the mutant E1103G exhibits a slower rate of TL opening, a stronger bias against the TL opening, and in addition, a faster rate of TL closing or catalysis, compared to the wild type Pol II [35].
On the other hand, previous work combining single molecule experiments with kinetic modeling demonstrated sequence-dependent pausing behaviors of the polymerase during elongation [55]. Due to variation of the sequence stabilities along the double stranded DNA, each step of the polymerase translocation brings a variable free energy change $\Delta G = G_{\mathrm{post\text{-}trans}} - G_{\mathrm{pre\text{-}trans}}$, caused by melting of the downstream base pair (bp), re-annealing of the upstream bp, and similar adjustments in the DNA-RNA hybrid [55,56]. Correspondingly, the translocation bias varies as $\sigma_t \to \sigma_t e^{\Delta G/k_BT}$, so that $K_M \equiv K(1+\sigma_t)k_t^+/(k_t^+ + k_c^+)$ in the pause-free velocity (e.g. equation (1)) incorporates the sequence effect. Indeed, a sequence barrier ΔG > 0 above the thermal fluctuation level can lead to a very large $K_M$, and thus interfere with the overall elongation rate. In SI, we numerically examined whether the force-dependent slowing down detected in the single molecule experiment [35] could indeed be caused by sequence barriers. Our calculation shows that only when large sequence barriers (ΔG ∼ 5 to 6 $k_BT$ or over) are sufficiently frequent (e.g. 10-30% or over) can the sequence barriers slow down the elongation rate significantly. For regular sequences with ΔG varying within ∼3 $k_BT$, the sequence barriers cannot cause as much force-dependent slowing down as detected. As such, it becomes even more convincing that it is the slow conformational transition (i.e., TL opening), susceptible to changes in mechanical force, that causes the force-dependent slowing down in the single molecule detection.
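The qualitative effect of sequence barriers can be sketched with a few lines of Python. The snippet assumes that a barrier ΔG only rescales the translocation bias as $\sigma_t e^{\Delta G/k_BT}$ in a three-state Michaelis-Menten rate, and that the mean velocity over a mixed sequence is the inverse of the average dwell time per site. The default parameters (including `K_uM = 7.0`) and function names are hypothetical, and the sketch is not the calculation reported in SI.

```python
import math


def site_rate(ntp_uM, dG_kT, k_t_fwd=88.0, k_c_fwd=35.0,
              sigma_t0=7.7, K_uM=7.0):
    """Pause-free rate (nt/s) at a site with sequence barrier dG (in kT)."""
    sigma = sigma_t0 * math.exp(dG_kT)   # barrier raises the bias
    weight = k_t_fwd / (k_t_fwd + k_c_fwd)
    k_max = k_c_fwd * weight
    K_M = K_uM * (1.0 + sigma) * weight
    return k_max * ntp_uM / (K_M + ntp_uM)


def mean_rate(ntp_uM, barrier_mix):
    """Mean rate over sites; barrier_mix is a list of (fraction, dG_kT)."""
    mean_dwell = sum(frac / site_rate(ntp_uM, dG) for frac, dG in barrier_mix)
    return 1.0 / mean_dwell


if __name__ == "__main__":
    ntp = 1000.0  # uM
    cases = {
        "no barriers": [(1.0, 0.0)],
        "20% sites at 3 kT": [(0.8, 0.0), (0.2, 3.0)],
        "20% sites at 6 kT": [(0.8, 0.0), (0.2, 6.0)],
    }
    for label, mix in cases.items():
        print(f"{label}: {mean_rate(ntp, mix):.1f} nt/s")
```

With these illustrative numbers, moderate barriers change the mean rate only modestly, whereas frequent barriers of about 6 $k_BT$ reduce it severalfold, in line with the qualitative conclusion stated above.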
To further explore the kinetic specificities in the Pol II elongation, we have analyzed two possible scenarios for the rate-limiting event after NTP binding. The first scenario, presented in model A, assumes that a slow event starts right after NTP binding or pre-insertion. This slow event corresponds to a substantial isomerization or TL closing transition during NTP insertion. Correspondingly, the NTP-bound 'preinsertion' state (III) becomes highly populated as the initial state of the transition. During the long waiting period of the slow transition, the NTP affinity to the binding site has to be sufficiently high to keep NTP from dissociating; that is, the NTP dissociation constant has to be sufficiently small. In the second scenario, presented in model B, the slowest transition happens one step further into the catalytic stage. In this case, the NTP binding affinity can be either high or low, as the isomerization transition upon NTP binding proceeds sufficiently fast to allow a timely NTP insertion (e.g. ∼1000 s$^{-1}$ or over). Since both the phosphoryl transfer reaction and PPi release happen fast, the slow transition is attributed to some pre-catalytic adjustment, such as rearrangement or tightening of the active site [21,42]. The rearrangement helps NTP adopt a proper geometry in the active site, so that an efficient phosphoryl transfer reaction can happen. To determine which scenario is more likely, one can examine the intrinsic NTP dissociation constant $K_d$: according to current analyses, if $K_d > \sim 10$ μM, model A would be ruled out.
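The affinity requirement in the first scenario can be illustrated with a simple kinetic partition. The sketch below is not part of the fitted model. It assumes one-step NTP binding, so that the off-rate is $k_{\text{off}} = K_d\,k_{\text{on}}$, and it assumes that a bound NTP either dissociates or commits to the isomerization with rate $k_{\text{iso}}$. The on-rate and the slow isomerization rate are hypothetical values; only the fast rate of ∼1000 s$^{-1}$ is taken from the text.

```python
def off_rate(kd_uM, k_on_per_uM_s):
    """Dissociation rate (1/s) for one-step binding: k_off = K_d * k_on (assumption)."""
    return kd_uM * k_on_per_uM_s


def commitment_probability(k_iso, k_off):
    """Probability that a bound NTP isomerizes before it dissociates."""
    return k_iso / (k_iso + k_off)


def mean_binding_attempts(k_iso, k_off):
    """Mean number of binding events needed per committed NTP (geometric distribution)."""
    return 1.0 / commitment_probability(k_iso, k_off)


if __name__ == "__main__":
    k_on = 1.0          # hypothetical on-rate, 1/(uM s)
    slow_iso = 35.0     # hypothetical slow isomerization rate (model A like), 1/s
    fast_iso = 1000.0   # fast isomerization (model B like), 1/s, as quoted in the text
    for kd in (1.0, 10.0, 100.0):  # intrinsic K_d in uM
        k_off = off_rate(kd, k_on)
        print(
            f"K_d={kd:6.1f} uM  "
            f"p_commit(slow)={commitment_probability(slow_iso, k_off):.2f}  "
            f"p_commit(fast)={commitment_probability(fast_iso, k_off):.2f}"
        )
```

Under these assumptions, a slow isomerization loses most bound NTP to dissociation once $k_{\text{off}}$ exceeds $k_{\text{iso}}$, whereas a fast isomerization tolerates a much larger $K_d$.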
In common practice, however, it is the apparent value of the NTP dissociation constant, or indeed the Michaelis constant $K_M$, that is measured, rather than the intrinsic value. Even in the simple three-state BR scheme, the derivation based on the master equation approach shows a discrepancy between the apparent and intrinsic values (see equation (1)). [...] [58], then it is likely that the average $K_d$ will be substantially larger than 10 μM in Pol II. In that case, model A would be ruled out, and model B would be supported. Some previous evidence also suggested that the TL closing transition after NTP binding happens fast [59], in support of model B, in which the force-independent rate-limiting step is set at the catalytic stage. Hence, accurate measurements of the intrinsic NTP dissociation constants or NTP binding affinities would help substantially in identifying a more specific elongation scenario. Indeed, the specific details of NTP binding to Pol II remain elusive and controversial. It had been proposed that there is an entry (E-) site beside the active (A-) site for the initial NTP binding [52,60-62]. One previous modeling work on diffusion of NTP into the polymerase active center showed that binding to the E-site adjacent to the A-site can overcome the limitation of the RNA synthesis at low NTP concentration [61]. NTP binding had also been suggested to happen early, in the pre-translocation state [60]. The branched BR model adopts this perspective, allowing NTP binding at either pre- or post-translocation [34,37]. The branched BR model provides a fairly large upper bound to the intrinsic NTP dissociation constant ($K_d < 140$ μM), as an extra site greatly enhances the lifetime of NTP around the active center [61]. Superimposing NTP into the E-site of the pre-translocated structure of Pol II (see figure 2(b)) shows a close proximity between the NTP triphosphate and the backbone phosphate of the 3'-end of the RNA.
Normally, due to the strong electrostatic repulsion, the two negatively charged groups would not be able to approach each other to such a short distance (∼3.0 Å between the two closest oxygen atoms). Nevertheless, our docking study indicates that it is possible for the substrate NTP to bind directly to the E-site of the pre-translocation state of Pol II (figure 2(b)). In particular, several positively charged residues (Lys987, Arg766 and Arg1020), along with a magnesium ion in the active site, help shield the electrostatic repulsion between the negative charges. Therefore, the branched BR model that allows NTP binding at pre-translocation is still feasible. However, the fitting results of the branched BR model do not seem to match well with current computational findings. For example, the translocation bias $\sigma_t^0 \sim 0.2$ suggested in the branched BR model has not been identified in the MD studies [48]. It is also not clear whether TL opening can proceed fast prior to the translocation, or how TL opening and NTP binding might be coordinated at pre-translocation. Hence, we adopt only the non-branched BR mechanism in the current study.
In addition, next to the TL structure around the active center, one can identify an essential BH. According to the simulation study, the BH bends frequently during translocation of Pol II [48]. Close examination shows that the BH bending is likely to interfere with the active center and hence with NTP binding to the A-site: the middle region of the BH bends into the A-site with ∼70% probability at pre-translocation, and with ∼50% probability during the translocation (see SI for details). Only at post-translocation does the BH bend much less, so that it hardly interferes with the active center or with NTP binding to the A-site. Accordingly, one should be aware that even though NTP binding to the E-site is plausible at pre-translocation, binding to the A-site is not fully supported until the post-translocation state is reached. Interestingly, one can also compare the multi-subunit Pol II translocation with that of the single-subunit T7 RNA polymerase [63], in which a highly conserved residue Tyr639 oscillates in and out of the active site, similarly to the BH bending region, while an O-helix opens and closes through the elongation cycle as the TL does in Pol II.

Conclusions
Based mainly on the most recent single molecule experimental measurements and computational studies on Pol II transcription elongation, we present a kinetic model that connects the experimental data analyses with the structure-based modeling and simulation. The model follows a generic BR mechanism, in which the polymerase translocates forward and backward spontaneously and quickly, until NTP binds at post-translocation to stop the backward translocation. Since the single molecule measurements suggested a slow force-dependent transition prior to NTP binding, we show that the slow transition can be a TL opening process that precedes the translocation, while the translocation per se is still fast. Consistently, the mutant polymerase E1103G stabilizes the closed TL configuration and also slows down the TL opening prior to the translocation. In addition, we consider the sequence-dependent translocation and show that the significant force-dependent slowing down of the Pol II elongation, as detected experimentally, cannot be attributed to the regular sequence stability variation along DNA. On the other hand, in the current model, it is still to be determined which step after NTP binding is rate limiting for regular cognate NTP incorporation. According to our analyses, if this rate-limiting step happens right after NTP binding, as during an isomerization for the TL closing, then the NTP binding affinity has to be high to prevent excessive NTP dissociation. Conversely, a low NTP binding affinity, or a high NTP dissociation constant, indicates that the force-independent rate-limiting event happens later, after TL closing, as in the pre-catalytic rearrangements. Though the current study follows a generic non-branched BR mechanism, one cannot rule out the branched BR model, as it seems plausible that NTP binds to the E-site prior to the translocation.
As such, we emphasize that accurate determination of NTP binding properties, either quantitatively, in terms of the NTP affinity or dissociation constant, or qualitatively, in terms of when and where NTP binds to the polymerase elongation complex, would substantially improve our understanding of the polymerase elongation. Notably, the working model developed here not only provides structural and functional insights into Pol II transcriptional elongation mechanisms, but also makes critical connections between the overall elongation kinetics and individual transition steps, which can be characterized relatively easily by existing experimental and computational techniques. Hence, the modeling perspective brings new opportunities for investigating how local structural and dynamical perturbations may affect the overall elongation kinetics, for example, how DNA damage that hinders translocation may ultimately affect the in vitro elongation rate.