Direct detection of molecular intermediates from first-passage times

Model-free method reveals details of underlying energy landscapes from dynamics in experimental systems across a range of scales.


Section S1. Theoretical Background
To start with the simplest case, consider a 1D network of discrete states i = 0, ..., B, where at t = 0 a system starts at some state in the network with i = A. Over time, the system can transition between states by moving one step forward in the network to state i = A + 1 or by moving one step backwards to state i = A − 1. Forward transitions occur at a rate u_A and backward transitions at a rate w_A. We wish to obtain the first-passage probability distribution P_{B,A}(t) that describes the probability that a system starting at state A will reach state B at time t without first hitting state 0.

Supplementary Material
By considering the possible transitions between states, the temporal evolution of the probability distribution can be described by backward master equations as

dP_{B,A}(t)/dt = u_A P_{B,A+1}(t) + w_A P_{B,A−1}(t) − (u_A + w_A) P_{B,A}(t)   (1)

with boundary conditions P_{B,0}(t) = 0 for all t, and P_{B,B}(t) = δ(t). To solve this equation a Laplace transform, defined as

P̃_{B,A}(s) = ∫_0^∞ e^{−st} P_{B,A}(t) dt,

is performed, allowing for Eq. 1 to be written as

(s + u_A + w_A) P̃_{B,A}(s) = u_A P̃_{B,A+1}(s) + w_A P̃_{B,A−1}(s).

From here, precise details of the solution depend upon the nature of the system (24), for example, whether the rates of transition between states are all the same or vary depending upon which state the system is in. However, in all cases, irrespective of these details, an expression whose leading short-time term has the form

P_{B,A}(t) ≃ [u_A u_{A+1} ··· u_{B−1}] t^{B−A−1} / (B − A − 1)!

is obtained. At early times this term dominates, leading to, on a log-log scale, the key result of

ln P_{B,A}(t) ≃ (B − A − 1) ln t + const.

Now consider a general system that involves multiple discrete states. Processes taking place in this system can be viewed as a multi-dimensional network of transitions between these different states. Here, for a process of interest in which a system moves from state A to state B, many pathways may connect the initial and final states. As such, in analyzing events that start at state A and finish at B, events corresponding to all of these pathways will be included. However, the time to travel along each possible pathway in the network of discrete states is not the same. At short times, only events along the shortest pathway, i.e. the one with the fewest intermediate states, will be observed because they will be the fastest. This means that at short times we are effectively probing a one-dimensional part of the system, corresponding to the shortest pathway. While the exact range of times over which these shortest events occur will depend on the transition rates, it can be shown that this short-time regime will always exist (34). As such, the method can also be successfully used for the analysis of complex multi-dimensional systems.
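The short-time scaling for the 1D network can be checked numerically. The sketch below is an illustration, not the authors' calculation: it assumes hypothetical rates u_i and w_i for a network with B = 5, evaluates the first-passage density as u_{B−1} [exp(Qt)]_{A,B−1} using a truncated Taylor series (adequate only when t times the largest rate is small), and estimates the log-log slope between two short times. The function names `fpt_density` and `loglog_slope` are introduced here for illustration only.

```python
import math

def fpt_density(u, w, start, t, terms=40):
    """First-passage density at time t for reaching state B = len(u) from
    `start` without first hitting state 0.

    u[i] and w[i] are the forward and backward rates out of state i
    (index 0 is unused because state 0 is absorbing).
    The density is u[B-1] * [exp(Q t)]_{start, B-1}, with Q the generator
    restricted to states 1..B-1, evaluated by a truncated Taylor series.
    """
    B = len(u)
    # vec[i] is the i-th component of (Q t)^k / k! acting on the unit vector
    # at state B-1; entries 0 and B stay zero (absorbing states).
    vec = [0.0] * (B + 1)
    vec[B - 1] = 1.0
    total = vec[start]
    for k in range(1, terms + 1):
        new = [0.0] * (B + 1)
        for i in range(1, B):
            new[i] = t * (u[i] * vec[i + 1] + w[i] * vec[i - 1]
                          - (u[i] + w[i]) * vec[i]) / k
        vec = new
        total += vec[start]
    return u[B - 1] * total

def loglog_slope(u, w, start, t1, t2):
    """Slope of ln P against ln t between two (short) times t1 < t2."""
    f1 = fpt_density(u, w, start, t1)
    f2 = fpt_density(u, w, start, t2)
    return (math.log(f2) - math.log(f1)) / (math.log(t2) - math.log(t1))

# Hypothetical, state-dependent rates for a network with B = 5
u = [0.0, 2.0, 1.0, 3.0, 1.5]
w = [0.0, 1.0, 0.5, 2.0, 1.0]
for start in (1, 2, 3):
    expected = len(u) - start - 1  # B - A - 1 intermediate states
    print(start, expected, round(loglog_slope(u, w, start, 1e-4, 1e-3), 3))
```

For each starting state the estimated slope should lie close to B − A − 1, the number of intermediate states between A and B, even though the rates differ from state to state.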

Section S2. DNA hairpin sequences
Details of the fabrication of DNA hairpin structures have previously been reported (32,33,36,37), including the three state structure used in this work (32). The four state hairpin is built from two strands. The first has the following sequence: The second strand is the complementary strand to this sequence.
Particle trajectories are acquired from videos using standard image analysis techniques (38).
As datasets are acquired via an automated process using the optical tweezers, large datasets of trajectories are obtained for each potential landscape (500-4000 trajectories, corresponding to more than 10^5 particle positions) (28,39).

Section S3. Analysing particle trajectories and establishing the slope of the linear regime
To calculate the first-passage time distributions in our colloidal system, data is first split into subsets. Each subset corresponds to the particle starting from a different potential minimum, and thus a different position within the channel. The value of m attached to each distribution indicates the lowest number of minima that must be crossed by the particle to exit from the channel, from its initial position (a minimum in the channel) to its final position (the left or right reservoir).

[Figure: typical particle trajectories, as used to calculate the distributions shown in Fig. 2A.]

The scaling behaviour of the short-time regime is determined simply by inspection of the distributions. As a first step, the distribution for a particular imposed potential is compared to that with no imposed potential (corresponding to free diffusion for the colloidal system, as shown in section S4). This allows the change in shape of the distribution towards more linear behaviour, caused by the presence of the potential minima, to be identified. Having thereby identified the linear regime of the distribution, we compare this region of the data on a log-log scale to lines of integer slope to determine the scaling that best describes it.
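The comparison with lines of integer slope is made by inspection in this work. As an illustration only, an automated analogue could fit the intercept for each candidate integer slope and keep the slope with the smallest residual. The function `best_integer_slope` and the synthetic data below are hypothetical and are not part of the original analysis.

```python
import math

def best_integer_slope(times, density, candidates=(1, 2, 3, 4)):
    """Return (slope, error) for the candidate integer slope whose line best
    matches ln(density) against ln(times).

    For each fixed slope n the intercept is set by least squares (the mean of
    ln P - n ln t), and the mean squared residual is compared across candidates.
    """
    x = [math.log(t) for t in times]
    y = [math.log(p) for p in density]
    best, best_err = None, float("inf")
    for n in candidates:
        intercept = sum(yi - n * xi for xi, yi in zip(x, y)) / len(x)
        err = sum((yi - n * xi - intercept) ** 2
                  for xi, yi in zip(x, y)) / len(x)
        if err < best_err:
            best, best_err = n, err
    return best, best_err

# Synthetic short-time data following P ~ t^2 with mild alternating scatter
times = [0.1 * 1.3 ** k for k in range(10)]
density = [5.0 * t ** 2 * (1 + 0.05 * (-1) ** k) for k, t in enumerate(times)]
slope, err = best_integer_slope(times, density)
print(slope, err)
```

Such a fit is only meaningful once the linear regime has been identified, since points outside that regime would bias the residuals.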
Section S4. First-passage time distributions with no imposed potential landscape
To highlight more explicitly the different behaviour seen with and without an imposed potential, Fig. S3A shows a direct comparison of the short-time regime of the first-passage time distributions shown in Fig. S2. Here, black dashed lines indicate the predicted power-law scaling according to Eq. 1, and the difference in the qualitative behaviour of the distributions at short times can clearly be seen. Fig. S3A also allows the more similar behaviour of the distributions at very short times to be considered. To achieve this, data for ∆U ∼ 0 k_B T has been renormalised (leading to a shift downwards) to overlap with the data for ∆U ∼ 3 k_B T. This allows the shape of the distributions in this limit to be more easily compared. Fig. S3A clearly shows that at very short times, i.e. at times for which the data with an imposed potential landscape deviates from the power-law scaling, the shape of the distributions with an imposed potential landscape coincides with that of the distributions for the free diffusion case. This suggests that these very short time events correspond to effectively 'ballistic' motion in the direction of the exit, with particles moving so rapidly across the potential landscape that they do not have enough time to feel the potential minima.

Furthermore, the very short time deviations highlight the approximate nature of modelling our experimental system, which has a continuous potential landscape, as a Markov jump process, as in the theoretical description of movement through a discrete network. This approximation is only valid in the limit in which there is a separation of timescales, such that the time spent within a minimum is much longer than the time spent moving between minima, and thus only for sufficiently deep minima. Indeed, alternative theoretical approaches that more explicitly consider continuous diffusion processes find different scaling behaviour at short times (as discussed in (40-42)), and so we note that our observation of these very short time deviations may reflect the different behaviour associated with a continuous landscape.
In Fig. S3B we show the same data but now over a wider range of times. Here we also indicate as vertical lines two key timescales for the system. The first is the experimental time resolution of ∼ 0.0167 s, set by the frame rate of 60 fps. It is clear from Fig. S3B that the time resolution is around an order of magnitude smaller than the shortest events. This supports our conclusion that the deviations from the linear scaling at short times do not arise from events that are missed due to the finite time resolution.

Indicating the time resolution also clearly shows that there is an initial period of time for which no first-passage events occur. This is because the particle must diffuse a certain distance to the exit before it can escape, and the length of time associated with this diffusion corresponds to the region between the two vertical lines in Fig. S3B. Here, the second vertical line, marked τ_1, shows a theoretical estimate of the shortest time at which it is probable to observe an event. To estimate τ_1, we first obtain the 1D probability distribution of particle displacements, P(∆x, t), for a particle with the diffusion coefficient measured in our experiments. Multiplying this probability distribution by the number of experimental trajectories and then integrating the area under the distribution with ∆x > L/4, where L is the channel length, provides an estimate of the number of particles that would display this displacement after time t. As the value of t increases, the probability of finding a particle at a larger distance from its initial position increases, leading to a broader distribution. The value of τ_1 corresponds to the time required for the distribution to have become sufficiently broad that this integral is approximately equal to 1, i.e. for there to be an appreciable probability that a particle will diffuse far enough to escape. This estimate is in good agreement with the shortest events observed in our experiment.
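The estimate of τ_1 described above can be sketched as follows. This is a minimal illustration under stated assumptions: P(∆x, t) is taken to be the Gaussian for free 1D diffusion with variance 2Dt, the integral is one-sided (∆x > L/4), and the values of the trajectory number and channel length are hypothetical placeholders, since the channel length is not quoted in this text. The diffusion coefficient is the value quoted in section S5.

```python
import math

def expected_escapes(t, D, n_traj, d):
    """Expected number of trajectories displaced by more than d after time t
    for free 1D diffusion: n_traj * (1/2) * erfc(d / sqrt(4 D t))."""
    return n_traj * 0.5 * math.erfc(d / math.sqrt(4.0 * D * t))

def estimate_tau1(D, n_traj, d, t_lo=1e-6, t_hi=1e6, iters=200):
    """Bisect in log time for the time at which expected_escapes equals 1.

    Requires n_traj > 2, since the one-sided tail probability never exceeds 1/2.
    """
    for _ in range(iters):
        t_mid = math.sqrt(t_lo * t_hi)
        if expected_escapes(t_mid, D, n_traj, d) < 1.0:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return math.sqrt(t_lo * t_hi)

D = 0.25        # um^2/s, diffusion coefficient quoted in section S5
n_traj = 2000   # hypothetical, within the quoted range of 500-4000 trajectories
L = 6.0         # um, hypothetical channel length (not given in this text)
tau1 = estimate_tau1(D, n_traj, L / 4)
print(tau1)
```

Because the tail probability grows monotonically with t, bisection is sufficient here, and a larger number of trajectories shifts the estimate to shorter times.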
Section S5. Linking the length of the linear regime to minima depth
In Fig. 2D of the main manuscript we observe a linear increase in ∆t, the length of the power-law scaling regime of P(t_FPT), with exp(∆U/k_B T). Furthermore, the gradient of ∆t against exp(∆U/k_B T) increases with m. Note that we do not include data for m = 3 at ∆U ∼ 5 k_B T, as the very long times associated with this distribution mean we have insufficient statistics to accurately estimate the short-time regime.

To rationalise this we consider the main features of the theoretical model used to derive Eq. 1 of the main manuscript (24). The theoretical model follows Kramers' theory, which assumes that transitions between the states in the network involve only barrier crossing, with barriers that are large compared to k_B T. The short-time regime of the distribution reflects movement of the particle along the single shortest pathway, i.e. directly towards the exit, crossing intermediate barriers in only one direction. The time required to follow this shortest path will therefore be related to the number of barriers crossed multiplied by the time necessary to cross a barrier. The time for barrier crossing is proportional to exp(∆U/k_B T). As such, ∆t should scale with exp(∆U/k_B T). Furthermore, as the number of minima that must be crossed increases, the time to follow the shortest path will also increase, rationalising the increase in gradient in Fig. 2D with increasing value of m.
Having established the dependence of the length of the linear regime on the potential minima depth, this relationship can be used to determine the effective value of ∆U/k_B T in the molecular hopper system. To compare the length of the linear regime in different systems, however, the differing typical time and length scales in each system must be taken into account. To account for this, the plot of ∆t against exp(∆U/k_B T) in Fig. 2D is expressed as

∆t ∼ C_m t_0 exp(∆U/k_B T),

where C_m is a constant, dependent on m, and t_0 is the time to move across a state in the absence of a potential minimum. This can be rearranged as

∆U/k_B T ∼ ln[∆t / (C_m t_0)].   (2)

For the colloidal system, the m = 1 line in Fig. 2D has a gradient C_m t_0 = 0.0658. The typical diffusion coefficient of a colloidal particle in the channel is approximately 0.25 µm^2 s^−1 and the distance between the potential minima is 1.2 µm. As such, the typical timescale for moving between states in the absence of an imposed potential landscape is t_0 ∼ 3 s. This allows the constant C_m = 0.022 for the m = 1 process to be obtained.
For the nanoscale molecular hopper system, the distance between states is approximately 0.68 nm and the typical timescale to move between states in the absence of footholds is approximately 5 µs (43). Furthermore, in the m = 1 distribution for the hopper (main manuscript Fig. 3B), the length of the linear regime is ∆t = 2.5 s. Substituting these values for ∆t, t_0 and C_m into Eq. (2) allows ∆U to be calculated for the molecular hopper as approximately 17 k_B T.
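The arithmetic behind this estimate can be reproduced with the values quoted in this section. In the sketch below the function name `barrier_height` is introduced for illustration, and the comment relating t_0 to d^2/(2D) is an assumption about how the quoted t_0 ∼ 3 s arises, not a statement from the text.

```python
import math

def barrier_height(delta_t, c_m, t0):
    """Return dU / k_B T from the relation delta_t ~ C_m * t0 * exp(dU / k_B T)."""
    return math.log(delta_t / (c_m * t0))

# Colloidal calibration for m = 1
gradient = 0.0658   # C_m * t0, gradient of the m = 1 line in Fig. 2D
t0_colloid = 3.0    # s, as quoted; assumed close to (1.2 um)^2 / (2 * 0.25 um^2/s) = 2.88 s
c_m = gradient / t0_colloid

# Molecular hopper for m = 1: delta_t = 2.5 s, t0 = 5 us
dU = barrier_height(delta_t=2.5, c_m=c_m, t0=5e-6)
print(round(c_m, 3), round(dU, 1))  # prints: 0.022 16.9
```

The result of about 16.9 is consistent with the value of approximately 17 k_B T quoted for the molecular hopper.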