Permissive and nonpermissive channel closings in CFTR revealed by a factor graph inference algorithm

The closing of the gated ion channel in the cystic fibrosis transmembrane conductance regulator can be categorized as nonpermissive to reopening, which involves the unbinding of ADP or ATP, or permissive, which does not. Identifying the type of closing is of interest as interactions with nucleotides can be affected in mutants or by introducing agonists. However, all closings are electrically silent and difficult to differentiate. For single-channel patch-clamp traces, we show that the type of the closing can be accurately determined by an inference algorithm implemented on a factor graph, which we demonstrate using both simulated and lab-obtained patch-clamp traces.


INTRODUCTION
Cystic fibrosis (CF) is a life-threatening genetic disease affecting the respiratory and digestive systems that is caused by mutations to the CF transmembrane conductance regulator (CFTR) anion channel (1,2). CFTR is a "broken" member of the ATP-binding cassette transporter class in that CFTR acts as an ATP-gated ion channel rather than an active transporter as is the function of other ATP-binding cassette transporters. CFTR consists of a single polypeptide chain, with two transmembrane domain-nucleotide binding domain (NBD) pairs connected through a region called the R domain (3). Each of the two NBDs contribute to both of the two known binding sites for ATP, although only one of these sites facilitates the hydrolysis of ATP to ADP. The two transmembrane domains form a gated channel that is controlled by the state of the two intracellular NBDs.
Of key interest are permissive and nonpermissive closings of the CFTR ion channel, in the sense of permissive to rapid reopening the channel. Considering the kinetic model in Fig. 1 (see (4)), nonpermissive closings involve the release of ADP in order to enable the binding of ATP (5,6) and include the irreversible C4/ C1a transition. On the other hand, permissive closings do not include this transition, meaning that they do not involve the release of ADP or ATP, leading to faster reopening. However, as these transitions occur on states in the same conductance level, the two types of closing cannot be directly distinguished by the patch clamp. The problem addressed by this report is to distinguish permissive and nonpermissive closings solely by performing signal processing on the patchclamp trace.
Hidden-variable problems of this type have a long history in biophysics, e.g., (7). More recently, methods have been developed to automate idealization of noisy ion channel current recordings, including the expectation-maximization (EM) algorithm (8) and deeplearning (9) approaches. Nonparametric Bayesian approaches have been used to identify the number of kinetically distinct hidden states (10). Finally, many methods have been developed for estimation of the hidden state transition matrix, including maximum likelihood methods (11)(12)(13)(14)(15) and Bayesian approaches (16)(17)(18)(19). Bayesian inference has also been used to estimate hidden kinetic states from synthetic patchclamp measurements in (16), using a Markov chain Monte Carlo method.
In this report, we are interested in factor-graph-based inference algorithms, such as the sum-product algorithm (20,21), which can combine inference with EMbased estimation of unknown parameters (22,23). Algorithms of this type have been used in biomedical hidden-variable problems (24). Distinct from (16), the factor graph approach calculates posterior probabilities efficiently (and exactly if parameters are known). Elsewhere, applications of inference algorithms are found in diverse areas such as bioinformatics (25,26), biophysics (27,28), telecommunications, and, more recently, in machine learning (29).
The main contribution of this report is to show that permissive and nonpermissive closings of CFTR (i.e., the timing of ADP dissociation and ATP binding) can be accurately inferred from patch-clamp traces alone, without any additional equipment, model training, or prior knowledge about parameter values. In doing so, we apply a factor-graph-based inference algorithm; to our knowledge, our report is also the first to apply an inference algorithm of this kind to CFTR.

Receptor model
We use a physical model of CFTR described in Fig. 1 (see (4)). In this seven-state model, states C1a, C1b, C2, C3, and C4 are fully closed, so we assume that ions are completely unable to pass through when CFTR is in these states, and thus, they have the same conductance level. States O1 and O2 are open states in which an ion current can flow through the channel. (The states are labeled so that the first letter indicates whether the channel is closed [C] or open [O].) Beginning with C1a, with a single ATP bound at the first ATP binding site, which is incapable of hydrolysis (30), the reversible transition to C1b occurs when a second ATP binds so that both binding sites are occupied. Because this step involves ATP binding, the rate depends on the concentration of ATP. The reversible transitions from C1b to C2 and from C2 to O1 are conformational changes resulting in an open CFTR, much like any ligand-gated channel such as the acetylcholine receptor (31). The transition from O1 to O2 is the first of two irreversible steps in the cycle, with one NBD-bound ATP undergoing hydrolysis to ADP. CFTR can then undergo reversible conformational changes from O2 to C3 and from C3 to C4, resulting in a closed pore. Finally, the second irreversible step occurs in the transition from C4 to C1a, where the NBD-bound ADP unbinds from CFTR, leaving one apo and one filled ATP binding site. This seven-state model is in agreement with the four-state simplified cyclic gating model of (32) and models distinguishing multiple closed and/or open states (33,34).
As noted in the introduction, we are interested in determining whether the channel closings are permissive or nonpermissive. Nonpermissive closings include the irreversible C4 to C1a transition, in which ADP unbinds and a binding site is available for ATP, while permissive closings do not. Thus, considering The states under the ellipsis ð.Þ can be any valid sequence of closed states from Fig. 1, not necessarily the same state.
The kinetic microstates of CFTR can be modeled using a master equation of the form . States with open and closed ion channels are indicated in blue italics and black, respectively. Closings for which the reopening occurs in the same state are called permissive, depicted with green arrows; closings from O2 where the reopening is O1 are called nonpermissive, depicted with a red arrow. See also reference (4) and the supporting material.
where P is a row vector with length equal to the number of kinetic microstates and R is a square matrix of kinetic rates for each possible state transition. In this formulation, P i is the probability that a receptor is in microstate i, while R ij is the transition rate from state i to state j. The rate matrix R and the full master equation are given in the supporting material.
Patch-clamp signal model Formally, our system contains a set V of observable conductance states, a set S of hidden kinetic microstates, and a mapping m : S/V from microstates to conductance states. For CFTR, we have S ¼ fC1a; C1b; C2; C3; C4; O1; O2g; and (3) The conductance states f0; 1g correspond to the ion channel's current when closed and open, respectively.
The patch clamp observes the channel current through additive noise and samples these observations with sampling time Dt to form discrete-time signals. Let y ¼ ½y 1 ; y 2 ; .; y n ˛R n represent the sequence of observations for a single channel, and let s ¼ ½s 1 ; s 2 ; .; s n ˛S n represent the corresponding microstates. Then, at the k th sample, the patch clamp measures where I sk ¼ Amðs k Þ is the current flowing through the channel in state s k , given by a constant A and the function mðsÞ in (4), and where n k forms a sequence of independent, identically distributed Gaussian random variables with zero mean and variance s 2 . Using the central limit theorem, it is reasonable to model the noise as Gaussian, particularly if a decimation filter is applied to the raw patch-clamp outputs (see the results section below). The transitions of microstates s k À 1 /s k are modeled as a discretetime Markov chain (35), again with a discrete time step Dt. The transition probability matrix for this Markov chain Q ¼ ½Q ij ¼ ½Prðs k ¼ j j s k À 1 ¼ iÞ is given by the solution to (1): Inference, parameter estimation, and simulation We use the sum-product algorithm (21) over the factor graph representing the sequence of states s to obtain the a posteriori distribution pðs k j yÞ. Meanwhile, the transition probability matrix Q and noise variance s 2 (and potentially the current amplitude A) are unknown a priori and must be estimated from the data. The EM algorithm (36) is a standard tool for this kind of simultaneous inference-estimation task. We employ a variant of the EM algorithm, known as the factor graph EM algorithm (22,37), which is intended for use alongside sum-product inference algorithms. The complete details of our algorithm are described in the supporting material. Given pðs k j yÞ, we define a confidence threshold C, 0 % C < 1 and use the following decision rule to obtain the estimate b s k for each s k : b s k ¼ argmax s k˛S pðs k j yÞ; pðs k j yÞ > C; B; otherwise: That is, b s is the maximum a posteriori (MAP) estimate if the estimate exceeds C; otherwise, b s is null ð BÞ. Setting C ¼ 0 obtains the MAP estimate for all states s k , while setting C > 0 reduces the probability of false alarm. We test our inference algorithm via Monte Carlo simulation by generating instances of discrete-time Markov chains with transition probability matrix Q (Eq. 6) and adding noise (Eq. 5). To evaluate the algorithm, we consider probability of false alarm P FA and probability of missed detection P MD , also known, respectively, as type I and type II errors. First, using the ground-truth state sequence s k , we make a list L GT of closings (i.e., any sequence OX/CX.

Patch-clamp measurements
Single CFTR channels were studied in inside-out patches pulled from Xenopus oocytes injected with cRNA encoding the wild-type channel, as previously described in (38). Briefly, to enable removal of the vitelline membrane, oocytes were placed in a bath solution containing (in mM) 200 monopotassium aspartate, 20 KCl, 1 MgCl 2 , 10 EGTA, and 10 HEPES (pH 7.2) adjusted with KOH. Gigaohm seals were formed with patch pipettes pulled from borosilicate glass and filled with solution containing (in mM) 150 N-methyl-D-glutamine chloride, 5 MgCl 2 , and 10 TES buffer (pH 7.5). After excision of the patch,

Code
Code and raw patch-clamp traces used to generate all results in this report are publicly available on Zenodo (see (39)). We do not include simulated patch-clamp traces in this dataset, but simulations can be generated using the code we provide. The code repository also includes a Jupyter notebook and raw data for generating all results in this report.

Analysis using simulated patch-clamp measurements
Here, we give results obtained from Monte Carlo simulations of the CFTR ion channel. To generate simulated   Table 1. patch-clamp results, we use example model parameters for the seven-state CFTR model given in Table 1.
These example values are not provided to the inference algorithm, so performance does not in general depend on their accuracy.

Properties of nonpermissive closings
In Fig. 2, we used our simulator to generate groundtruth state sequences s k ; we then count the number of nonpermissive closings L ðnpÞ GT and the total number of closings jL GT j and divide by time to get rates. In the figure, we see that nonpermissive closings occur at a much lower rate than permissive closings. While the rate of closings depends on ATP concentration due to the cyclical nature of CFTR gating, the ratio of nonpermissive to total closings remains nearly independent of concentration. These observations are consistent with the dynamics of our model: a low ATP concentration will lead to a longer dwell time in state C1a, thus increasing the interval between openings without affecting the rates of transitions at the boundary between open and closed. Moreover, nonpermissive closings are relatively rare, with a ratio of roughly 1 nonpermissive closing per 12 total closings.

Missed detection and false alarm probability
In Fig. 3, we give missed detection and false alarm probabilities, P MD and P FA , using our algorithm (see Eq. 8). Setting the confidence threshold C ¼ 0 gives the MAP estimate of each state (see Eq. 7); we see a low P MD , meaning few nonpermissive closings are missed, but P FA is relatively high, i.e., around 50%.
(Since nonpermissive closings are rare, cf. Fig. 2 b; this is still far better than random guessing.) With a higher confidence threshold of C ¼ 0:8, P FA is reduced at the expense of increased P MD . This is explained by noting that lower-confidence state estimates are discarded, so those closings will be missed by the algorithm. This demonstrates that C can be adjusted to trade off P FA against P MD . The performance of the algorithm is dependent on ATP concentration, with error rates increasing as concentration increases.

Application to CFTR mutants
We modified the rate matrix in Table 1 to represent the K1250A CFTR mutant (see (4)). Complete details and results, similar to Figs. 2 and 3, are provided in the supporting material.
Application to experimentally obtained patch-clamp measurements In Fig. 4, we show the application of our algorithm to lab-obtained CFTR patch-clamp measurements. We show two examples corresponding to two different experiments. For techniques used to obtain these measurements, see the materials and methods section.
As the patch-clamp signal is oversampled compared with the kinetics of CFTR, we decimate the signal by a factor of 50, applying a block-averaging decimation filter (taking the sample average over nonoverlapping blocks of 50 samples). The decimation step is performed to reduce the noise at high frequencies, which contains very little useful information about the signal, while preserving the features of interest at lower frequencies, improving the performance of the algorithm. We show the decimated signal in the middle plots of Fig. 4, overlaid with vertical lines indicating the closings found by our algorithm, both permissive and nonpermissive. In the bottom plots of Fig. 4, we give the state estimates b s k found by the algorithm, with different colors indicating whether or not the initial and final estimated states both exceed a confidence threshold of C ¼ 0:8. From the raw data and preprocessed traces, abrupt transitions from high to low current correspond well with the detected closings.

DISCUSSION
We show that an algorithmic tool reveals the precise microstate kinetics of CFTR. By revealing permissive and nonpermissive closings, we can precisely estimate the timing of each nucleotide unbinding event, a key step in CFTR's kinetic model. Furthermore, our method may be used to study the effect of reagents that are known to affect hidden-state kinetics of CFTR, such as scorpion venom (4); future experiments in this direction might also be used to analyze pharmaceuticals that target CFTR. More generally, beyond permissive and nonpermissive closings, this method gives the designer of an experiment a novel and fine-grained algorithmic tool to discover changes to the behavior of receptor proteins. For example, this method could be used to determine the fine-grained, microstate-by-microstate effects of particular agonists or mutations, in CFTR or other receptors.