Single-photon smFRET: II. Application to continuous illumination

Here we adapt the Bayesian nonparametrics (BNP) framework presented in the first companion article to analyze kinetics from single-photon, single-molecule Förster resonance energy transfer (smFRET) traces generated under continuous illumination. Using our sampler, BNP-FRET, we learn the escape rates and the number of system states given a photon trace. We benchmark our method by analyzing a range of synthetic and experimental data. Particularly, we apply our method to simultaneously learn the number of system states and the corresponding kinetics for intrinsically disordered proteins using two-color FRET under varying chemical conditions. Moreover, using synthetic data, we show that our method can deduce the number of system states even when kinetics occur at timescales of interphoton intervals.


INTRODUCTION
Single-molecule Förster resonance energy transfer (smFRET) experiments are widely used [1] to study molecular kinetics across timescales on both stationary [2][3][4][5] and freely diffusing molecules [6]. These timescales range from faster events on micro- to millisecond timescales and below, such as domain rotations, configurational kinetics of disordered proteins, protein folding, and protein-protein interactions, all the way to slower events, such as misfolding and refolding, occurring on minute- and even hour-long timescales [7].
In a typical experiment we consider herein, a continuous wave (CW) laser illuminates a sample with a beam of constant intensity and power over a period of time. CW sources are common as they are both cheaper and technically simpler to implement in an experimental setup than their pulsed counterparts [8,9], which we explore in our third companion article [10]. However, compared with pulsed sources, a disadvantage lies in the increased photon flux through the sample that can accelerate photodamage [11].
Although pulsed illumination can significantly reduce sample photobleaching and phototoxicity [12] and more readily reveals excited state lifetimes of fluorophores, in practice it is restricted to analyzing one (time-stamped) photon per interpulse period. This in turn limits the data acquisition rate and sets a bound on the temporal resolution of the kinetics we may deduce from pulsed single-photon arrival.
By contrast, continuous illumination avoids this problem by allowing a larger number of photons to be detected in the time that would otherwise constitute an interpulse period under pulsed illumination [13]. The cost is the loss of direct knowledge of the excited-state lifetime, which can then be decoded from photon-antibunching statistics only with difficulty and high uncertainty, if required [14], as shown in the first companion article [15].
It is common practice to analyze photon arrival data to extract kinetics under continuous illumination by binning the data and subsequently using hidden Markov models (HMMs) [16][17][18][19]. As noise distributions are better characterized in unprocessed data, it remains conceptually preferred, though more computationally costly, to use photon-by-photon methods [13,14,20][21][22][23][24]. Indeed, photon-by-photon methods can be used to learn both photophysical and system transition rates directly from the detected photon colors and interphoton arrival times. Additionally, this avoids the averaging of kinetics that may occur when binning data [17].
Currently available methods to analyze smFRET data in a photon-by-photon manner [13,20] rely on the foundational works of Gopich and Szabo [13,14,25], where the likelihood is taken as the product of as many generator matrix exponentials as there are photons in a FRET trace. Such a generator matrix constitutes transition rates encoding the kinetics of the system-FRET composite [15].
When analyzing smFRET data, of particular interest is the dimensionality of this generator matrix determined by the number of system states. In all existing analyses, the dimensionality is fixed by hand a priori, and the transition rates are then learned as point estimates using maximum likelihood methods.
Yet point estimates can be biased. In fact, limited data, lack of temporal resolution to estimate very fast kinetics [15], and noise all contribute to bias [26] in addition to a flattening of possibly multimodal likelihoods [27,28]. This motivates why we wish to operate in a Bayesian setting to learn distributions over the number of system states and transition rates while incorporating unavoidable noise sources such as detector electronics and background.
For this reason, we developed a complete Bayesian nonparametric (BNP) framework in the first companion article [15]. This framework incorporates many key complexities of a typical smFRET experimental setup, including background emissions, fluorophore photophysics (blinking, photobleaching, and direct acceptor excitation), instrument response function, detector dead time, and cross talk.
Here, we delve deeper into this framework for the case of continuous illumination by exploring its utility in cases where the number of system states is unknown.
We first test the robustness of our nonparametric method and its software implementation BNP-FRET by analyzing synthetically generated data for kinetics varying from very slow to timescales as fast as the interphoton arrival times. We then apply our method to experimental smFRET data capturing interactions between intrinsically disordered protein (IDP) fragments [29,30] relevant to signaling and regulation.
IDPs are of particular interest to nonparametric analyses as their lack of order and stability results in broader spectra of dominant FRET pair distances that are sensitive to the chemical environment. In particular, we study interactions between the nuclear-coactivator binding domain (NCBD) of CBP/p300, a transcription coactivator, and the activation domain of SRC-3 (ACTR) under varying chemical conditions affecting their coupled folding and binding reaction rates [29][30][31]. We use a single FRET pair under continuous illumination to observe the possible physical configurations (system states) of the NCBD-ACTR complex. Further, we report new bound/transient system states for the NCBD P20A mutation, not observed using previous point estimation techniques [30].

FORWARD MODEL AND INFERENCE STRATEGY
For the sake of completeness, we begin with relevant aspects of the methods presented in the first companion article [15], including the likelihood needed in Bayesian inference, and our parametric and nonparametric Markov Chain Monte Carlo (MCMC) samplers.
An smFRET experiment involves at least two single-photon detectors collecting information on stochastic arrival times. We denote these arrival times by $\{T_{start}, T_1, T_2, T_3, \ldots, T_K, T_{end}\}$, recorded in detection channels $\{c_1, c_2, c_3, \ldots, c_K\}$, for a total of $K$ photons. In this representation, $T_{start}$ and $T_{end}$ are the experiment's start and end times, respectively.
Using this data set, we would like to infer the parameters governing a system's kinetics, that is, the number of system states $M_s$ and the associated transition rates $\lambda_{s_i \to s_j}$, as well as the photophysical transition rates $\lambda_{s_i; \psi_l \to \psi_m}$ among the $M_\psi$ photophysical states corresponding to each system state $s_i$. Here, $s_i \in \{s_1, \ldots, s_{M_s}\}$ and $\psi_l \in \{\psi_1, \ldots, \psi_{M_\psi}\}$ are the system states and photophysical states, respectively. These rates populate a generator matrix $G$ of dimension $M_\phi = M_s \times M_\psi$, now representing transitions among composite superstates $\phi_i \equiv (s_j, \psi_k)$, where $i = (j-1)M_\psi + k$ (see the first companion article for details [15] on the structure of such a matrix). This matrix governs the evolution of the system-FRET composite via the master equation

$$\frac{d\rho(t)}{dt} = \rho(t)\, G, \qquad (1)$$

as described in the "Likelihood" section of the first companion article [15]. Here, $\rho(t)$ is a row vector populated by probabilities for finding the composite in a given superstate at time $t$.
In estimating these parameters, we must account for all sources of uncertainty present in the experiment, such as shot noise and detector electronics. Therefore, we naturally work within the Bayesian paradigm, where the parameters are learned by sampling from probability distributions over these parameters, termed posteriors. Such posteriors are proportional to the product of the likelihood, which is the probability of the collected data $w$ given the physical model, and prior distributions over the parameters, as follows:

$$p(G|w) \propto L(w|G)\, p(G), \qquad (2)$$

where $w$ constitutes the set of all observations, including photon arrival times and detection channels.
To construct the posterior, we begin with the likelihood derived in Sec. 2.3 of the first companion article [15], in which $\Pi^{non}_k$ and $G^{rad}_k$ are the nonradiative and radiative propagators, respectively. Furthermore, $\rho_{start}$ is computed by solving the master equation assuming that the system was at steady state immediately preceding the time at which the experiment began. That is, we solve $\rho_{start} G = 0$.
Next, assuming that the transition rates are independent of each other, we can write the associated prior as a product over the individual rates, $p(G) = \prod_{i \neq j} p(\lambda_{\phi_i \to \phi_j})$, where we choose Gamma prior distributions over the individual rates to guarantee positive values. Here, $\phi_i$ represents one of the $M_\phi$ superstates of the system-FRET composite collecting both the system and photophysical states as described in Sec. 2.2. Furthermore, $\alpha$ and $\lambda_{ref}$ are parameters of the Gamma prior.
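The steady-state condition $\rho_{start} G = 0$, together with the normalization of $\rho_{start}$, is a small linear system. The following is a minimal sketch of how it could be solved, not the BNP-FRET implementation; the function name `steady_state` and the three-state matrix `G_example` are hypothetical and chosen only for illustration.

```python
def steady_state(G):
    """Solve rho @ G = 0 with sum(rho) = 1 for a small generator matrix G.

    G is a list of rows; off-diagonal entries are transition rates and each
    diagonal entry is the negative sum of the other entries in its row.
    """
    n = len(G)
    # rho G = 0 is equivalent to G^T rho^T = 0. Replace the last equation
    # with the normalization sum(rho) = 1 to get a nonsingular system.
    A = [[G[j][i] for j in range(n)] for i in range(n)]  # transpose of G
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    M = [row + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]

    # Back substitution.
    rho = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * rho[c] for c in range(i + 1, n))
        rho[i] = (M[i][n] - s) / M[i][i]
    return rho


# Hypothetical three-state generator matrix (rates in arbitrary units).
G_example = [
    [-0.03, 0.02, 0.01],
    [0.01, -0.02, 0.01],
    [0.02, 0.02, -0.04],
]
rho_start = steady_state(G_example)
```

Replacing one of the redundant balance equations with the normalization constraint is one common way to remove the singularity of $G$; for the large composite matrices of the full model, a dedicated linear algebra library would be the more practical choice.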
In what follows, we first assume that the number of system states is known and describe an inverse strategy that uses the posterior above to learn only the transition rates. Next, we generalize our model to a nonparametric case accommodating more practical situations with unknown system state numbers. We do so by assuming an infinite dimensional system state space and making the existence of each system state itself a random variable.

Inference procedure: Parametric sampler
Now, with the posterior defined, we prescribe a sampling scheme to learn distributions over all parameters of interest, namely, the transition rates populating $G$ and the number of system states. However, our posterior in Eq. 2 does not assume a form amenable to analytical calculations. Therefore, we employ MCMC techniques to draw numerical samples.
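Particularly convenient here is the Gibbs algorithm, which sequentially and separately generates samples for individual transition rates in each MCMC iteration. This requires us to first write the posterior in Eq. 2 using the chain rule as follows:

$$p(G|w) = p(\lambda_{\phi_i \to \phi_j} \,|\, G \backslash \lambda_{\phi_i \to \phi_j}, w)\; p(G \backslash \lambda_{\phi_i \to \phi_j} \,|\, w), \qquad (4)$$

where the backslash after $G$ indicates exclusion of the subsequent rate parameter. Furthermore, the first term on the right-hand side is the conditional posterior for the individual rate $\lambda_{\phi_i \to \phi_j}$. The second term in the product is a constant in the corresponding Gibbs step as it is independent of $\lambda_{\phi_i \to \phi_j}$. Similarly, the priors $p(G \backslash \lambda_{\phi_i \to \phi_j})$ for the rest of the rate parameters on the right-hand side of Eq. 2 are also considered constant. Equating the right-hand sides of Eqs. 2 and 4 then allows us to write the following conditional posterior for $\lambda_{\phi_i \to \phi_j}$:

$$p(\lambda_{\phi_i \to \phi_j} \,|\, G \backslash \lambda_{\phi_i \to \phi_j}, w) \propto L(w|G)\; p(\lambda_{\phi_i \to \phi_j}). \qquad (5)$$

Since the conditional posterior above does not take a closed form that allows for direct sampling, we use a Metropolis-Hastings (MH) step [32][33][34], where new samples are drawn from a proposal distribution $q$ and accepted with probability

$$\alpha_{\mathrm{MH}}(\lambda^*_{\phi_i \to \phi_j}, \lambda_{\phi_i \to \phi_j}) = \min\left\{1,\ \frac{p(\lambda^*_{\phi_i \to \phi_j} \,|\, G \backslash \lambda_{\phi_i \to \phi_j}, w)\; q(\lambda_{\phi_i \to \phi_j} \,|\, \lambda^*_{\phi_i \to \phi_j})}{p(\lambda_{\phi_i \to \phi_j} \,|\, G \backslash \lambda_{\phi_i \to \phi_j}, w)\; q(\lambda^*_{\phi_i \to \phi_j} \,|\, \lambda_{\phi_i \to \phi_j})}\right\}, \qquad (6)$$

where the asterisk denotes proposed rate values from the proposal distribution $q$.
Now, to generate an MCMC chain of samples, we first initialize the chains for all transition rates $\lambda_{\phi_i \to \phi_j}$ by randomly drawing values from their corresponding prior distributions. We then successively iterate across each transition rate in each new MCMC step and draw new samples from the corresponding conditional posterior using the MH criterion.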
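In the MH step, a convenient choice for the proposal is a normal distribution, leading to a simpler formula for the acceptance probability in Eq. 6. This is due to its symmetry, resulting in $q(\lambda^*_{\phi_i \to \phi_j} \,|\, \lambda_{\phi_i \to \phi_j}) = q(\lambda_{\phi_i \to \phi_j} \,|\, \lambda^*_{\phi_i \to \phi_j})$. However, a normal proposal distribution would allow forbidden negative transition rates, leading to automatic rejection in the MH step and thus inefficient sampling. Therefore, it is more convenient to propose new samples using a normal distribution in logarithmic space to allow exploration along the full real line as follows:

$$\log(\lambda^*_{\phi_i \to \phi_j}/\kappa) \,\big|\, \log(\lambda_{\phi_i \to \phi_j}/\kappa), \sigma^2 \sim \mathrm{Normal}\left(\log(\lambda_{\phi_i \to \phi_j}/\kappa),\ \sigma^2\right),$$

where $\kappa = 1$ is an auxiliary parameter in the same units as $\lambda_{\phi_i \to \phi_j}$, introduced to obtain a dimensionless quantity within the logarithm. The transformation above requires introduction of Jacobian factors in the acceptance probability as follows:

$$\alpha_{\mathrm{MH}}(\lambda^*_{\phi_i \to \phi_j}, \lambda_{\phi_i \to \phi_j}) = \min\left\{1,\ \frac{p(\lambda^*_{\phi_i \to \phi_j} \,|\, G \backslash \lambda_{\phi_i \to \phi_j}, w)}{p(\lambda_{\phi_i \to \phi_j} \,|\, G \backslash \lambda_{\phi_i \to \phi_j}, w)}\ \frac{\partial \log(\lambda_{\phi_i \to \phi_j}/\kappa) / \partial \lambda_{\phi_i \to \phi_j}}{\partial \log(\lambda^*_{\phi_i \to \phi_j}/\kappa) / \partial \lambda^*_{\phi_i \to \phi_j}}\right\},$$

where the derivatives represent the Jacobian, and the proposal distributions are canceled by virtue of using a normal distribution.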
The acceptance probability above depends on the difference between the current and proposed values for a given transition rate. This difference is determined by the variance $\sigma^2$ of the normal proposal distribution, which needs to be tuned for each rate individually to achieve optimum performance of the BNP-FRET sampler, or equivalently an acceptance rate of approximately one-third for the proposals [35].
In our case, where the smFRET traces analyzed contain about $10^5$ photons, we found it prudent to make the sampler alternate between two sets of variances at every MCMC iteration for the excitation rates, FRET rates, and system transition rates. This ensures that the sampler is quickly able to explore values at different orders of magnitude.
Intuitively, the variances of the proposal distributions above would ideally scale with the relative widths of the conditional posteriors for these parameters (in log-space) if the approximate width could be estimated. Since posterior widths depend on the amount of data used, an increase in the number of photons available in the analysis would require a correspondingly smaller variance.
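The MH update with a normal proposal in logarithmic space can be illustrated with a short sketch. This is a toy example under stated assumptions rather than the BNP-FRET code: the names `mh_log_step` and `run_chain`, the Gamma-shaped toy target standing in for the conditional posterior, and the two step sizes in `sigmas` are all hypothetical.

```python
import math
import random


def mh_log_step(lam, log_target, sigma, rng):
    """One Metropolis-Hastings update of a positive rate `lam`.

    The proposal is normal in log(lam) with standard deviation `sigma`;
    `log_target` returns the log of the unnormalized conditional posterior.
    """
    lam_star = math.exp(math.log(lam) + rng.gauss(0.0, sigma))
    # The normal densities cancel by symmetry in log space; the Jacobian
    # factor (1/lam) / (1/lam_star) = lam_star / lam remains.
    log_alpha = (log_target(lam_star) - log_target(lam)
                 + math.log(lam_star) - math.log(lam))
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) < min(0.0, log_alpha):
        return lam_star, True
    return lam, False


def run_chain(n_steps=20000, seed=0):
    """Sample a toy Gamma(shape=3, scale=2) target, alternating step sizes."""
    rng = random.Random(seed)

    def log_target(lam):
        return 2.0 * math.log(lam) - lam / 2.0

    sigmas = (0.5, 1.5)  # hypothetical values, not those used in BNP-FRET
    lam, samples, n_accept = 1.0, [], 0
    for it in range(n_steps):
        lam, accepted = mh_log_step(lam, log_target, sigmas[it % 2], rng)
        n_accept += accepted
        samples.append(lam)
    return samples, n_accept / n_steps
```

Because proposals are made in log space, every proposed rate is positive and no proposal is rejected merely for being negative. Alternating between a small and a large step size mimics the idea of switching variances at every iteration so that the chain can move across orders of magnitude.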

Inference procedure: Nonparametric BNP-FRET sampler
Here, we first briefly summarize our inference procedure described in the "inverse strategy" section of the first companion article [15] for ease of reference.
In realistic situations, the system state space's dimensionality is usually unknown, as molecules under study may exhibit complex and unexpected behaviors across conditions and timescales. Consequently, the dimensionality M 4 of the generator matrix G is also unknown and must be determined by adopting a BNP framework.
In such a framework, we assume an infinite set of system states and place a binary weight, termed a load, on each system state such that, if warranted by the data, the load takes the value 1. Put differently, we place a Bernoulli prior on each candidate state (of which there are formally an infinite number) [36,37]. In practice, we learn distributions over Bernoulli random variables $b_i$ that activate/deactivate different portions of the full generator matrix (see the "inverse strategy" section of the first companion article [15]), where active loads are set to 1 and inactive loads are set to 0. Furthermore, the entries marked $*$ in that matrix represent negative row sums. Finally, the number of active loads provides an estimate of the number of system states warranted by a given data set.
As we have introduced new variables we wish to learn, we upgrade the posterior of Eq. 2 to incorporate the full set of loads, $b = \{b_1, b_2, \ldots, b_N\}$, as follows:

$$p(b, G \,|\, w) \propto L(w \,|\, b, G)\; p(b)\, p(G),$$

where we assume that all parameters of interest are independent of each other.
As in the parametric sampler presented in the previous subsection, we generate samples from the nonparametric posterior above using Gibbs sampling. That is, we first initialize the MCMC chains for loads and rates by drawing random samples from their priors. Next, to construct the chains, we iteratively draw samples from the posterior in two steps: (1) sequentially sample all rates using the MH procedure, then (2) loads by direct sampling, from their corresponding conditional posteriors (as described in the "Inverse Strategy" section of the first companion article [15]). Since step (1) is similar to the parametric case, we only focus on the second step in what follows.
To generate samples for load $b_i$, the corresponding conditional posterior is given by [38]

$$p(b_i \,|\, b \backslash b_i, G, w) \propto L(w \,|\, b, G)\; p(b_i),$$

where the backslash after $b$ indicates exclusion of the following load and $p(b_i)$ is the Bernoulli prior on the load. We may set the hyperparameters $M^{max}_s$, the maximum allowed number of system states used in computations, and $\gamma$, the expected number of system states, based on simple visual inspection of the smFRET traces. Now, the conditional posterior in the equation above is discrete and describes the probability for the load to be either active or inactive; that is, it is itself a Bernoulli distribution, as follows:

$$p(b_i \,|\, b \backslash b_i, G, w) = \mathrm{Bernoulli}\left(b_i;\ \frac{L(w \,|\, b_i = 1, b \backslash b_i, G)\; p(b_i = 1)}{\sum_{b_i \in \{0,1\}} L(w \,|\, b_i, b \backslash b_i, G)\; p(b_i)}\right).$$

The simple form of this posterior is amenable to direct sampling. In the end, the chain of generated samples can be used for subsequent statistical analysis.
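Direct sampling of a load from its Bernoulli conditional posterior amounts to comparing two weighted likelihood values. The sketch below assumes that the log-likelihoods with the load switched on and off have already been evaluated elsewhere; the function names and the argument `prior_on` (the prior probability of an active load, which must lie strictly between 0 and 1) are hypothetical.

```python
import math
import random


def load_active_probability(loglik_on, loglik_off, prior_on):
    """Posterior probability that a load equals 1.

    loglik_on / loglik_off: log-likelihood of the data with the load set to
    1 / 0 (all other loads and rates held fixed); prior_on: prior Bernoulli
    probability of the load being active.
    """
    log_w_on = loglik_on + math.log(prior_on)
    log_w_off = loglik_off + math.log(1.0 - prior_on)
    # Subtract the maximum before exponentiating to avoid underflow.
    m = max(log_w_on, log_w_off)
    w_on = math.exp(log_w_on - m)
    w_off = math.exp(log_w_off - m)
    return w_on / (w_on + w_off)


def sample_load(loglik_on, loglik_off, prior_on, rng):
    """Draw the load directly from its Bernoulli conditional posterior."""
    p_on = load_active_probability(loglik_on, loglik_off, prior_on)
    return 1 if rng.random() < p_on else 0
```

Working with log-likelihoods matters here because likelihoods built from products over roughly $10^5$ photons are far too small to represent directly in floating point.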

RESULTS
In this section, we first demonstrate the robustness of our BNP-FRET sampler by investigating the effects of excitation rate on the distributions over transition rates and system state numbers. Once we have illustrated the BNP-FRET sampler's performance on synthetic data, we apply it to estimate the number of system states along with associated escape rates from publicly available experimental data for a complex involving intrinsically disordered proteins (ACTR-NCBD). We compare our results with reported literature values [29,30].

Resolution of timescales given excitation rate: Nonparametrics
To demonstrate the performance of our BNP-FRET sampler over a range of timescales given a fixed excitation rate, we follow the same approach as presented in the first companion article (see the "Results" section) [15]. That is, we generate four synthetic smFRET traces containing $K = 2$ million photons each for a biomolecular complex with three system states, $\{s_1, s_2, s_3\}$. The kinetic scheme for this system is a generalization of the example presented in the first companion article [15] (brown boxes) with two system states. Now, to synthesize smFRET traces, we fix the excitation rate to $\lambda_{ex} = 10$ ms$^{-1}$ and the FRET efficiencies $\varepsilon_{FRET}$ to 0.09, 0.5, and 0.9 for the three system states, respectively, motivated by experiments in [30]. The remaining parameters are the system transition rates $\lambda_{s_i \to s_j}$, varied across datasets to test our BNP-FRET sampler over a wide range of timescales, ranging from a thousand times longer than the average interphoton arrival time $(1/\lambda_{ex})$ to as short as the average interphoton arrival time itself (representing an extreme case). We do not probe kinetics any faster because the excitation rate does not provide enough temporal resolution for resolving system transitions in this regime, as demonstrated in the first companion article (see the "Results" section).
We start the analysis by applying our BNP-FRET sampler to learn the number of system states for the case with the slowest escape rates, i.e., the sum of all transition rates out of a given system state. These escape rates are $\lambda_{esc} = 0.01$, $0.02$, and $0.03$ ms$^{-1}$. We show that our BNP-FRET sampler can correctly learn the number of system states and the associated escape rates and FRET efficiencies; see Fig. 1 a and Fig. 2 a.
Next we analyze, one by one, datasets generated using escape rates that are 10 times faster in each subsequent data set. BNP-FRET deduces the correct number of system states in all cases (see Fig. 2 a-c); however, the determination of the rates begins to fail in Fig. 2 d. The failure to estimate escape rates approximating the excitation rate can also be predicted using a "photon budget index" defined in the "Photon budget and excitation rate" section of the first companion article [15] as

$$\text{photon budget index} = \frac{K\, \lambda_{ex}}{M_s\, \lambda_{probe}},$$

where $K$ and $\lambda_{probe}$ are, respectively, the photon counts and the escape rate to be probed. Plugging the parameter values associated with the data set shown in both Figs. 1 d and 2 d with three escape rates, i.e., $K = 2 \times 10^6$, $M_s = 3$, $\lambda_{ex} = 10$ ms$^{-1}$, and $\lambda_{probe} = \lambda_{esc} = 10$ to $30$ ms$^{-1}$, into the above equation, we obtain indices of $2/3 \times 10^6$, $2/6 \times 10^6$, and $2/9 \times 10^6$. The index obtained for $\lambda_{probe} = 10$ ms$^{-1}$ is on par with the threshold of $10^6$ found in the first companion article [15] (Sec. 4.1), where the sampler had sufficient information available to draw an accurate inference. By contrast, moving to the larger escape rates of $\lambda_{esc} = 20$ and $30$ ms$^{-1}$, the photon budget indices obtained are much smaller than the threshold, and the sampler starts failing due to lack of information.
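The arithmetic behind the quoted indices can be checked in a few lines. The sketch assumes that the photon budget index takes the form $K \lambda_{ex} / (M_s \lambda_{probe})$, which reproduces the three values quoted above; the function name `photon_budget_index` is hypothetical.

```python
def photon_budget_index(n_photons, excitation_rate, n_states, probe_rate):
    """Photon budget index, assumed here to be K * lambda_ex / (M_s * lambda_probe).

    Both rates must be given in the same units (here ms^-1).
    """
    return n_photons * excitation_rate / (n_states * probe_rate)


# Fastest synthetic data set: K = 2e6 photons, lambda_ex = 10 ms^-1, M_s = 3.
for rate in (10.0, 20.0, 30.0):
    index = photon_budget_index(2e6, 10.0, 3, rate)
    print(f"lambda_probe = {rate:.0f} ms^-1 -> index = {index:.3g}")
```

All three indices fall below $10^6$, and the index shrinks in proportion as the probed escape rate grows, which matches the observation that the two fastest rates are the hardest to recover.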
To be more precise, our sampler is capable of learning any escape rates, even those larger than the excitation rate, given sufficient photons. As this is counterintuitive, we note that the excitation rate is an average value, and there are often photons detected with interphoton intervals much smaller than $1/\lambda_{ex}$. As such, given long photon traces, there are always enough photons with small interphoton intervals to learn faster escape rates (and indeed to learn excited state lifetimes as we show in the first companion article [15]) that would otherwise evade binned photon analysis methods [39].

Analysis of experimental data: NCBD-ACTR interactions
Here, we apply our BNP-FRET sampler to two datasets probing the interactions between partner IDPs, NCBD and ACTR, under different conditions [29,30]. Precise knowledge of binding and unbinding reactions of such proteins is of fundamental importance toward understanding how they regulate expression of their target genes. Methods that have been used in the past [29,30] to analyze smFRET traces from experiments on NCBD-ACTR interaction assumed a fixed number of system states to obtain maximum likelihood point estimates for transition rates. In addition, these methods bin photons to mitigate computational expense. However, given the inherently unstructured and flexible nature of IDPs, fixing the dimensionality of the model a priori can be limiting and, as we will see, may bias analysis. Therefore, our nonparametric method that places no constraints on the number of system states while incorporating all major noise sources is naturally suited.
In the following subsections, we first analyze data for a system where an immobilized ACTR labeled with a Cy3B donor interacts with an NCBD labeled with a CF680R acceptor in the presence of ethylene glycol (EG), 36% by volume, in order to more closely mimic cellular viscosity [29]. Here, the binding of NCBD to ACTR is monitored in smFRET experiments using a confocal microscope setup. Next, we analyze data for a system in a buffer without EG, and therefore with faster kinetics. Here an immobilized ACTR interacts with a freely diffusing mutated NCBD (P20A) [30].
To acquire both experimental FRET datasets containing about 200,000 photons each, laser powers of 0.5 μW and 0.3 μW were used, leading to excitation rates varying from 3000 to 11,000 s$^{-1}$ in the confocal region depending on where the immobilized sample lies with respect to the center of the excitation laser beam.
Moreover, we are provided a calibrated route correction matrix (RCM) by Zosel et al. [29,30] to account for spectral cross talk, and relative detection efficiencies of donor and acceptor channels. We defined such an RCM in the "detection effects" section of the first companion article [15] and specify it for each data set separately in the following subsections.
Finally, in contrast to the first companion article [15], we ignore the instrument response function. The latter typically acts over a period of hundreds of picoseconds. As such, it is immaterial on the seconds timescale over which system transitions occur.
Moreover, the background values vary for each data set, and they are therefore precalibrated, independently, for each data set in the corresponding sections. Now, with all experimental details at hand, we proceed to analyze the experimental data using our BNP-FRET sampler.
FIGURE 1 MCMC chains generated by the BNP-FRET sampler for the number of system states. The synthetic smFRET datasets used to generate these chains assume a uniform excitation rate of 10 ms$^{-1}$ and FRET efficiencies of 0.09, 0.5, and 0.9 for a three-state system. However, the system's escape rates for all three states become faster by a factor of 10 as we move from (a) to (d). That is, in the slowest case, we use escape rates of 0.01, 0.02, and 0.03 ms$^{-1}$ for the three system states, whereas in the fastest case, kinetics are as fast as the excitation rate itself. Our method converges to the correct number of system states for each data set. As we will see later, the rates become more difficult to estimate for (d), which we consider to be the point at which the method breaks down.

Immobilized ACTR in 36% EG
Binding of NCBD to ACTR leads to the formation of a stable and ordered complex in the presence of EG. In addition, when two fluorescent dyes labeling the IDPs come in close proximity, we expect FRET interactions. Therefore, bound and unbound system states of the NCBD-ACTR complex correspond to high and low FRET efficiency signals, respectively.
For the analysis of the collected smFRET data from such a complex, we must take into account all sources of noise, such as cross talk and background. The cross talk/detection efficiency values are computed from the RCM given by Zosel et al. [29]. These values imply that approximately 18% of the emitted donor photons are detected in the acceptor channel due to cross talk. Furthermore, only 84% of emitted acceptor photons are detected in the acceptor channel, and acceptor photons do not suffer any cross talk.
We must also incorporate precalibrated background rates for the donor and acceptor channels, given as 0.283 s$^{-1}$ and 0.467 s$^{-1}$, respectively [29].
With all such corrections applied, our BNP-FRET sampler now predicts two system states; see Fig. 3. The system state with the lowest FRET efficiency of 0.0 corresponds to the unbound NCBD. The remaining system state with the higher FRET efficiency of $\approx 0.7$ coincides with the bound NCBD-ACTR complex configuration. The associated escape rates we obtain from our method for both of the system states are approximately 2.9 s$^{-1}$ and 4.1 s$^{-1}$ as seen in Fig. 3 b. These results are consistent with results reported in Supplementary Table S1 of Zosel et al. [29] with an average relative difference of $\approx 15\%$.
FIGURE 2 Learned bivariate posterior for the escape rates $\lambda_{esc}$ and FRET efficiencies $\varepsilon_{FRET}$ from the synthetic data also used in Fig. 1. Going from (a) to (d), we speed up the kinetics (escape rates) by a factor of 10 each time, leading to a gradual loss of the temporal resolution needed to identify system transitions. The ground truth is shown with the red dots. The estimates for escape rates and FRET rates in (a) to (c) have less than 10% errors. However, as seen in (d), the excitation rate does not provide enough temporal resolution to resolve system transitions occurring at interphoton arrival timescales, resulting in large errors in the parameter estimates. The estimated escape rates in (d) are $0.8^{+0.1}_{-0.4}$ s$^{-1}$, $1.4^{+1.0}_{-0.2}$ s$^{-1}$, and $2.0^{+1.8}_{-0.3}$ s$^{-1}$ with very large uncertainties (95% confidence intervals). We have smoothed the posterior distributions here using the kernel density estimation (KDE) technique for visualization purposes only.

Immobilized ACTR in buffer
Here, in the absence of EG, the viscosity of the solution is lowered [29], leading to faster system transitions, representing a unique analysis challenge.
FIGURE 3 Results for NCBD-ACTR interactions in the presence of ethylene glycol (EG). In (a), we show the raw photon counts (bin width of 0.01 s) recorded by the two detection channels during the experiment. In (b), we show a probability distribution for the number of system states estimated by the BNP-FRET sampler. The sampler spends a majority of its time in two system states with only a small relative probability ascribed to more states. In the posterior distribution for the escape rates and FRET efficiencies in (c), two distinct FRET efficiencies are evident with values of about $0.003^{+0.020}_{-0.002}$ (unbound) and $0.70^{+0.02}_{-0.02}$ (bound), and corresponding escape rates of about $2.9^{+0.3}_{-0.3}$ s$^{-1}$ and $4.1^{+0.5}_{-0.4}$ s$^{-1}$. The red dots show results reported by Zosel et al. [29] using a maximum likelihood method. We have smoothed the distribution for demonstrative purposes only.
FIGURE 4 Results for NCBD-ACTR interactions in buffer, without EG. In (a), we show the raw photon counts (bin width of 0.01 s) recorded by the two detection channels during the experiment. In (b), we show a probability distribution produced by the BNP-FRET sampler for the number of system states. Models with fewer than four system states are not shown in the histogram as we ascribe to them zero probability. Indeed, the most probable model contains five system states. Next, in (c) depicting the posterior distribution for the escape rates and FRET efficiencies, five distinct FRET efficiencies are evident with values of $0.002^{+0.03}_{-0.001}$, $0.72^{+0.02}_{-0.02}$, $0.03^{+0.02}_{-0.02}$, $0.28^{+0.02}_{-0.02}$, and $0.92^{+0.02}_{-0.01}$, with corresponding escape rates of about $4.3^{+1.9}_{-1.8}$, $25.0^{+2.1}_{-2.9}$, $5.1^{+1.8}_{-1.9}$, $8.9^{+3.5}_{-0.8}$, and $6.6^{+4.0}_{-1.0}$ s$^{-1}$. The two system states with almost vanishing FRET efficiencies may represent the same unbound configuration, with the small splitting likely arising from various sources of noise present in the data set. The red dots show the results reported in Zosel et al. [30] using a maximum likelihood method.
As in the previous subsection, from the RCM provided by Zosel et al. [30] for the current data set, we found cross talk factors of $\phi_{a1} = 0.72$, $\phi_{a2} = 0.0$, $\phi_{d1} = 0.10$, and $\phi_{d2} = 0.90$. After correcting for these cross talk/detection efficiency values and background rates of 0.312 s$^{-1}$ and 1.561 s$^{-1}$ for the donor and acceptor channels, respectively, our BNP-FRET sampler now predicts five system states (see Fig. 4 a and b) with FRET efficiencies of approximately 0.0, 0.72, 0.03, 0.28, and 0.92. Here, the two system states with vanishingly small estimated FRET efficiencies, namely 0.0 and 0.03, most likely represent the same configuration where NCBD is diffusing freely away from the immobilized ACTR, leading to no FRET interactions. Various sources of noise in the data set may have resulted in this splitting of the unbound system state. Furthermore, the system state with the FRET efficiency and escape rate of approximately 0.72 and 25.0 s$^{-1}$, respectively, coincides with the previously predicted bound configuration found using a maximum likelihood method with a fixed number of system states [30]. We have compiled the learned transition rates (median values, in s$^{-1}$ units) in the generator matrix of Eq. 8, where the diagonal elements correspond to the negatives of the escape rates. Furthermore, the steady-state populations/probabilities for these system states can be computed by solving $\rho_{steady} G_s = 0$, resulting in

$$\rho_{steady} = [\,0.55 \quad 0.12 \quad 0.23 \quad 0.05 \quad 0.05\,]. \qquad (9)$$

Here, the two newly observed system states, with FRET efficiencies of 0.28 and 0.92 and corresponding escape rates of approximately 8.87 s$^{-1}$ and 6.6 s$^{-1}$, are bound configurations not previously detected [30] and deserve further attention.
For instance, lower viscosity buffer (as compared with cases in the presence of EG) may allow the system to visit transient system states more readily under observation timescales [40,41]. Additionally, steady-state probabilities for these new transient system states that we recover are indeed expectedly low (0.05 and 0.05) compared with other system states of the NCBD-ACTR complex. Furthermore, IDPs interact in a complex manner with high possibility for residual secondary structures [42]. Competing parametric methods would need to posit a high number of system states a priori in order for their kinetics to be quantifiable. Finally, despite a difference in the estimate of the number of system states, our slower kinetics in the presence of EG are consistent with those of Zosel et al. [29]. Direct comparison of escape rates across system states recovered by BNP-FRET versus Zosel et al. [29], however, is questionable on account of having recovered a different number of system states.
One way to assure ourselves that these system states are not artefactually added by our computational algorithm (overfitting) is to analyze synthetic data generated under the same conditions (excitation rate, cross talk, and background) as the experiment but with a ground truth of two system states. We can then ask whether the noise properties force our method to introduce artefactual states. Thus, we simulate a two system state model with the previously reported escape rates [30].
Another way to assure ourselves is to analyze synthetically generated data for the four most distinct system states (on the basis of FRET efficiency) predicted by the BNP-FRET sampler for the experimental data set. These system states correspond to FRET efficiencies of approximately 0.0, 0.72, 0.28, and 0.92 with associated escape rates of 4.31, 24.97, 8.87, and 6.6 s$^{-1}$ as computed from the matrix in Eq. 8. We tested whether our sampler BNP-FRET underfits or overfits with regard to the estimated number of system states. As shown in Fig. 6, the most probable model predicted by the sampler has four system states, again verifying the robustness of our method.
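To make the idea of such a synthetic test concrete, the sketch below generates a photon trace from a known ground truth using a Gillespie simulation of the system states. It is a deliberately simplified illustration, not the simulation procedure used for the figures: it assumes a constant total photon rate and ignores background, cross talk, and detector effects. The function name `simulate_trace`, the photon rate, and the trace duration are hypothetical; the two FRET efficiencies and escape rates are borrowed from the values quoted in the text purely for illustration.

```python
import numpy as np

def simulate_trace(G, fret_eff, photon_rate, t_end, rng):
    """Simulate system-state jumps and photon colours on [0, t_end].

    Assumptions of this sketch: the trace starts in state 0, photons arrive
    at a constant total rate, and each photon is an acceptor photon with
    probability equal to the FRET efficiency of the current system state.
    """
    G = np.asarray(G, dtype=float)
    fret_eff = np.asarray(fret_eff, dtype=float)
    n = G.shape[0]
    escape = -np.diag(G)

    # Photon arrival times from a homogeneous Poisson process.
    n_photons = rng.poisson(photon_rate * t_end)
    photon_times = np.sort(rng.uniform(0.0, t_end, n_photons))

    # Gillespie simulation of the system-state trajectory.
    t, state = 0.0, 0
    jump_times, states = [0.0], [0]
    while True:
        t += rng.exponential(1.0 / escape[state])
        if t >= t_end:
            break
        probs = G[state].copy()
        probs[state] = 0.0
        probs /= probs.sum()
        state = int(rng.choice(n, p=probs))
        jump_times.append(t)
        states.append(state)

    # System state occupied at each photon arrival time.
    idx = np.searchsorted(np.array(jump_times), photon_times, side="right") - 1
    photon_states = np.array(states)[idx]
    is_acceptor = rng.random(n_photons) < fret_eff[photon_states]
    return photon_times, is_acceptor, photon_states

rng = np.random.default_rng(0)
G = np.array([[-4.31, 4.31],
              [24.97, -24.97]])   # s^-1, two-state ground truth
fret_eff = [0.0, 0.72]
photon_times, is_acceptor, photon_states = simulate_trace(
    G, fret_eff, photon_rate=3000.0, t_end=50.0, rng=rng)
print(photon_times.size, "photons,", is_acceptor.mean(), "acceptor fraction")
```

A trace generated this way can be passed to any inference scheme to check whether the recovered number of system states matches the ground truth.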

DISCUSSION
FRET techniques have been essential in investigating molecular interactions on nanometer scales, for instance, most recently in directly monitoring the interaction of the SARS-CoV-2 spike protein with host receptors [43,44]. Yet, the quantitative interpretation of smFRET data suffers from several issues, including difficulties in estimating the number of system states, dealing with fast transition rates, and providing uncertainties over estimates, particularly uncertainties over the number of system states [16,45] originating from multiple noise sources.
Here, we implemented the general nonparametric smFRET data analysis framework presented in the first companion article [15] to address the issues associated with the analysis of smFRET data acquired under continuous illumination. The framework can learn posterior distributions over the number of system states as well as the corresponding kinetics, ranging from slow kinetics all the way up to kinetics of events occurring on timescales approaching excitation rates. That is, our method propagates uncertainty over not only kinetic parameters but their associated models as well. This is especially significant in avoiding overcommitment to any one model when multiple models are almost equally probable given the data.
We benchmarked our method starting from synthetic data with three system states with a range of different timescales. We challenged our method by simulating data with kinetics as fast as the interphoton arrival times and correctly deduced the number of system states even under such extreme conditions. We further assessed our method using experimental data acquired observing NCBD interacting with ACTR under different EG concentrations that may impact the timescales at which the binding/unbinding reactions occur. In previous point-estimate methods [29,30], two system states were assumed a priori for 0 and 36% EG concentrations. However, our nonparametric method predicts the number of system states and obtains two additional system states in the absence of EG (fast kinetics). This observation may be tied to the inherently unstable nature of the two IDPs under investigation [40].
FIGURE 6 Second robustness test using synthetic data with realistic noise parameters. The synthetic data here is generated under the same conditions (excitation rate, cross talk, background, and photon budget) as the experiment whose results are shown in Fig. 4 with four distinct system states (on the basis of FRET efficiency) as ground truth to see whether our sampler overfits or underfits with regard to the number of system states. These system states correspond to FRET efficiencies of 0.0, 0.72, 0.28, and 0.92 with associated escape rates of 4.31, 24.97, 8.87, and 6.6 s$^{-1}$ as computed from the matrix in Eq. 8. In (a), we show the posterior produced by the BNP-FRET sampler for the number of system states. Fortunately, the most sampled model contains four system states, verifying the robustness of our method. Small probabilities are also ascribed to models with different numbers of system states. In the posterior distribution over the escape rates and FRET efficiencies in (b), four distinct FRET efficiencies are evident with values of $0.002^{+0.03}_{-0.001}$, $0.72^{+0.03}_{-0.03}$, $0.28^{+0.04}_{-0.03}$, and $0.92^{+0.03}_{-0.03}$ and corresponding escape rates of $3.9^{+2.0}_{-1.5}$, $23.8^{+2.2}_{-1.5}$, $7.1^{+1.9}_{-1.5}$, and $5.9^{+1.7}_{-1.5}$ s$^{-1}$. Here, the ground truth is shown with red dots. High noise from background results in the underestimates seen here.
A careful treatment of how experimental noise propagates into uncertainties over the number of system states and rates does come with associated computational cost. Other methods have managed to mitigate these costs by making approximations, including the following: (1) assuming kinetics much slower than fluorophore excitation and relaxation rates [13,14]; (2) assuming fast dye photophysics is completely irrelevant to the system transition rate and that FRET efficiency sufficiently identifies transitions between system states [14]; (3) ignoring detector effects and relegating other noise sources, such as background, to postprocessing steps [13]; and, most popularly, (4) binning data [16,18,46]. For the general case without such approximations, however, the primary computation, the likelihood, remains expensive due to the required evaluation of many matrix exponentials. This cost can be mitigated in a number of ways, for instance, by computing likelihoods for several data traces in parallel. The scaling of the method is provided in the first companion article [15].
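To illustrate why matrix exponentials dominate the cost, the sketch below evaluates a photon-by-photon log-likelihood in which the system-state probabilities are propagated across every interphoton interval with one matrix exponential. This is a simplified form in the spirit of approximations (1) and (2) above, where each system state is summarized by a FRET efficiency. It is not the full likelihood of the companion article, which also treats photophysics, background, cross talk, and detector effects. The function name `log_likelihood` and all numerical inputs are hypothetical, and the constant factor from the total photon rate is dropped.

```python
import numpy as np
from scipy.linalg import expm

def log_likelihood(G, fret_eff, dts, is_acceptor):
    """Simplified photon-by-photon log-likelihood (row-vector convention).

    G           : system generator matrix, rows sum to zero
    fret_eff    : FRET efficiency of each system state
    dts         : interphoton intervals
    is_acceptor : True for acceptor photons, False for donor photons
    """
    G = np.asarray(G, dtype=float)
    E = np.asarray(fret_eff, dtype=float)
    n = G.shape[0]

    # Start from the steady-state distribution (rho @ G = 0, sum(rho) = 1).
    A = np.vstack([G.T, np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p = np.linalg.lstsq(A, b, rcond=None)[0]

    logL = 0.0
    for dt, acc in zip(dts, is_acceptor):
        p = p @ expm(G * dt)               # one matrix exponential per photon
        p = p * (E if acc else 1.0 - E)    # weight by the photon colour
        norm = p.sum()
        logL += np.log(norm)               # accumulate in log space
        p = p / norm                       # renormalize to avoid underflow
    return logL

G = np.array([[-4.31, 4.31],
              [24.97, -24.97]])            # s^-1, illustrative values
dts = np.array([1e-3, 5e-4, 2e-3, 1e-3])   # s, hypothetical intervals
colours = np.array([True, False, False, True])

ll_equal = log_likelihood(G, [0.3, 0.3], dts, colours)
ll_two = log_likelihood(G, [0.0, 0.72], dts, colours)
print(ll_equal, ll_two)
```

Because separate traces are independent, their log-likelihoods add, so the loop over photons can be run for each trace in a separate process and the results summed.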
The method described in this paper was developed for cases with discrete system state spaces. For continuous state spaces, both the likelihood and priors would require major modification in the spirit of Bryan and Pressé and Gopich and Szabo [47,48].
Our framework can accommodate different illumination modalities, such as alternating laser excitation (ALEX) [49], to directly excite both donor and acceptor dyes by assuming nonzero direct excitation rates in the generator matrix. Indeed, direct excitation of the acceptor would further help in the simultaneous determination of cross talk factors, detection efficiencies, and quantum yield of the dyes alongside kinetics.
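As a rough illustration of what a nonzero direct excitation rate means in practice, the sketch below builds a generator matrix over three photophysical states of a donor-acceptor pair (ground, donor excited, acceptor excited). The state ordering, the function name `photophysics_generator`, the parameter names, and the numerical rates are assumptions made for this example and are not taken from the article.

```python
import numpy as np

def photophysics_generator(lam_ex, lam_d, lam_a, lam_fret, lam_direct=0.0):
    """Generator over (ground, donor excited, acceptor excited).

    Rows are the departing state and columns the arriving state, so each
    row sums to zero. lam_direct is the direct acceptor excitation rate.
    """
    rates = np.array([
        [0.0,   lam_ex, lam_direct],  # ground -> donor or acceptor excited
        [lam_d, 0.0,    lam_fret],    # donor excited -> ground, or FRET
        [lam_a, 0.0,    0.0],         # acceptor excited -> ground
    ])
    return rates - np.diag(rates.sum(axis=1))

# Hypothetical rates in ns^-1, for illustration only.
params = dict(lam_ex=0.01, lam_d=0.25, lam_a=0.30, lam_fret=0.25)
G_donor_only_excitation = photophysics_generator(**params)
G_with_direct = photophysics_generator(**params, lam_direct=0.005)

# FRET efficiency implied by the donor relaxation and FRET rates.
fret_efficiency = params["lam_fret"] / (params["lam_fret"] + params["lam_d"])
print(G_with_direct)
print("FRET efficiency:", fret_efficiency)
```

Setting `lam_direct` to zero recovers donor-only excitation, so the same matrix structure covers both illumination schemes.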