Determination of Ligand-Binding Affinity (Kd) Using Transverse Relaxation Rate (R2) in the Ligand-Observed 1H NMR Experiment and Applications to Fragment-Based Drug Discovery

High hit rates from initial ligand-observed NMR screening can make it challenging to prioritize which hits to follow up, especially in cases where there are no available crystal structures of these hits bound to the target proteins or other strategies to provide affinity ranking. Here, we report a reproducible, accurate, and versatile quantitative ligand-observed NMR assay, which can determine Kd values of fragments in the affinity range of low μM to low mM using transverse relaxation rate R2 as the observable parameter. In this study, we examined the theory and proposed a mathematical formulation to obtain Kd values using non-linear regression analysis. We designed an assay format with automated sample preparation and simplified data analysis. Using tool compounds, we explored the assay reproducibility, accuracy, and detection limits. Finally, we used this assay to triage fragment hits, yielded from fragment screening against the CRBN/DDB1 complex.


■ INTRODUCTION
Advances in molecular biology offer novel protein targets for drug discovery, with up to 10,000 candidates potentially suitable for drug intervention. 1 Many of these targets act through novel mechanisms and are currently considered challenging to drug. Over the past 20 years, fragment-based drug discovery (FBDD) has risen as an effective strategy, providing successful tool compounds for novel targets, and resulting in the development of new drugs. 2,3 The weak affinity (high μM to low mM) of initial fragment hits in FBDD has heralded the innovation of a wide range of biophysical assays. 4 Among them, the three most popular technologies are nuclear magnetic resonance (NMR), surface plasmon resonance (SPR), and differential scanning fluorimetry (DSF). Each of these approaches has its own advantages and limitations. SPR can provide binding affinity (Kd) information; however, it requires the protein to be physically immobilized on the chip surface, potentially altering the target conformation and/or interfering with the binding pocket. DSF requires the fragment to increase the thermal stability of the target protein, which is not always achievable for weak binding fragments, especially for proteins with high melting points.
Ligand-observed NMR (LONMR) assays are routinely utilized to screen fragments, yielding hits with high μM to low mM binding affinity. During screening, a single ligand concentration is normally employed to qualitatively distinguish binders from non-binders. While this single-concentration approach enables screening of 1000−2000 fragments in a reasonable time frame (2−4 weeks) with an acceptable level of protein consumption (20−100 mg), it is not useful in ranking the affinity of these hits. 5 NMR-based Kd determination methods have been published, 6 and chemical shift perturbation (CSP) using 15N-labelled protein is considered the gold standard. 7 However, CSP is generally restricted to small protein targets (<30 kDa) and is further limited by significant resource costs. Kd determination by ligand-observed methods such as saturation transfer difference and WaterLOGSY has been reported. 8 A major limitation of these nuclear Overhauser effect-based experiments is re-binding of the ligand; consequently, measured Kd values are affected by assay conditions. 9 A higher protein concentration can potentially result in an artificially higher observed Kd. Line width of NMR signals has also been suggested as an observable parameter to measure Kd. However, in practice, line width can be difficult to quantify accurately unless a singlet is clearly observable. 10 To address these limitations, here we report a quantitative LONMR assay that uses an intrinsic NMR property, the transverse relaxation rate R2, as the observable parameter and determines Kd values of small molecules in the affinity range of low μM to low mM. In this assay, named R2KD, we observe the R2 values of ligands (small molecules) at various concentrations interacting with a single target protein concentration and obtain the Kd values through curve fitting.
R2 values of nuclei depend on how fast molecules tumble in solution: small molecules such as fragments tumble very fast, resulting in small R2 values (e.g., 0.5−2 s−1), whereas large molecules such as proteins tumble slowly, resulting in larger R2 values (e.g., 20−100 s−1). When a fragment interacts with a protein, its R2 value increases, providing a useful observable metric. 11 In the R2KD assay, we use a routine Carr−Purcell−Meiboom−Gill (CPMG) pulse sequence to experimentally measure R2 values.
In this study, we examine the basic theory and propose a new mathematical formulation to obtain Kd using least squares non-linear regression analysis. We design an assay format with automated sample preparation and simplified data analysis. Using tool compounds, we explore the assay accuracy, reproducibility, and detection limits. With examples, we highlight key factors that affect the assay result. Finally, we outline an application of this assay in the triage of fragment hits and in the calculation of Kd values for selected ligands.

■ THEORY
Considering that the aim was to determine the binding affinity of small molecule fragments to target proteins in early fragment hit discovery programs, a simple one site reversible binding process was assumed. The dynamic equilibrium could be described by Scheme 1, where P represents the protein, L represents the small molecule ligand, and PL represents the protein−ligand complex.
At thermodynamic equilibrium, the binding dissociation constant, Kd, can be defined by eq 1. 12 Eq 1 can be expressed by experimentally controllable variables, PT (total protein concentration) and LT (total ligand concentration). The concentrations PT, LT, [L], [P], and [PL] are related by eqs 2 and 3.
[P] is the free protein concentration, [L] is the free ligand concentration, and [PL] is the protein−ligand complex concentration, or protein-bound ligand concentration. Substituting eqs 2 and 3 into eq 1 gives eq 4, which can be rearranged to quadratic eq 5.
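The numbered equations are not reproduced in this text, so the block below is a reconstruction from the definitions above rather than a copy of the published forms. It assumes that eq 2 is the protein mass balance and eq 3 the ligand mass balance; the assignment of those two numbers is a guess, while the algebra itself follows directly from one-site reversible binding (Scheme 1, P + L ⇌ PL).

```latex
\begin{align}
K_d &= \frac{[P][L]}{[PL]} \tag{1}\\
P_T &= [P] + [PL] \tag{2}\\
L_T &= [L] + [PL] \tag{3}\\
K_d &= \frac{\left(P_T - [PL]\right)\left(L_T - [PL]\right)}{[PL]} \tag{4}\\
0 &= [PL]^2 - \left(P_T + L_T + K_d\right)[PL] + P_T L_T \tag{5}
\end{align}
```

Multiplying eq 4 through by [PL] and collecting terms gives the quadratic in [PL] shown as eq 5.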
A solution to this quadratic equation is eq 6, which allows Kd determination by non-linear regression if we vary LT. This equation accommodates a ligand depletion scenario since [L] was not assumed to be LT. Eq 6 is graphed in Figure 1 using conditions commonly seen in fragment-based ligand NMR experiments. In this study, a linear x-axis scale was used as it is more suited to visualize results for fragments with Kd in the range of μM to low mM.
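As a rough numerical illustration of eq 6, the Python sketch below evaluates the smaller root of the quadratic, which is the same expression as the user-defined Prism equation in the Experimental Section without the factor α. It compares that root with the simple hyperbola that would result if [L] were assumed equal to LT. The function names are illustrative and not from the paper; the 5 μM protein concentration and 54 μM Kd are the values quoted for CCT365133 and BCL6, and the ligand concentrations are arbitrary points in the 50−500 μM range.

```python
import math


def bound_complex(l_total, p_total, kd):
    """Eq 6: [PL] for one-site binding, allowing for ligand depletion.

    All three arguments must share the same concentration unit.
    """
    s = l_total + p_total + kd
    # Smaller root of [PL]^2 - s*[PL] + P_T*L_T = 0 (the physical one).
    return 0.5 * (s - math.sqrt(s * s - 4.0 * p_total * l_total))


def bound_complex_no_depletion(l_total, p_total, kd):
    """Simple hyperbola obtained by assuming [L] = L_T (no depletion)."""
    return p_total * l_total / (kd + l_total)


if __name__ == "__main__":
    p_total, kd = 5.0, 54.0  # uM; values quoted for CCT365133 / BCL6
    for l_total in (50.0, 100.0, 200.0, 500.0):  # uM, illustrative points
        exact = bound_complex(l_total, p_total, kd)
        approx = bound_complex_no_depletion(l_total, p_total, kd)
        print(f"L_T={l_total:6.1f} uM  eq 6: {exact:.3f} uM  hyperbola: {approx:.3f} uM")
```

The two expressions converge as LT/PT grows, which is the regime the assay is designed to work in.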
Since in LONMR experiments, bound ligand concentration [PL] cannot be observed directly, a suitable observable parameter was required to substitute [PL]. For a weak binding event, and under high ligand excess condition, we could substitute [PL] with R 2 using the mathematical manipulations outlined below (7).
By designing the binding experiment such that the ligand−protein ratio was much higher than 1, i.e., LT/PT ≫ 1, the free ligand fraction ρF becomes much higher than the bound ligand fraction ρB, i.e., ρF ≫ ρB. Under such conditions, the observed transverse relaxation rate R2,obs could be expressed using the Swift-Connick formula, as shown in eq 7, where R2,obs is the observed transverse relaxation rate, ρF = [L]/LT is the free ligand fraction, ρB = [PL]/LT is the bound ligand fraction, R2F is the transverse relaxation rate of the ligand in the free state, R2B is the transverse relaxation rate of the ligand in the bound state, Kex is the ligand exchange rate between the free and bound states, and ΔΩ = ωB − ωF is the angular precession frequency difference of the ligand between the bound and free states. In practice, ΔΩ = 2π(δB − δF), where δB and δF are the chemical shifts in Hz. For a weak binding event, the ligand exchange rate was in the range 1000 < Kex < 100,000 s−1. As a result, the formula could be further simplified to eq 8 according to Peng et al., who gave a detailed explanation for this simplification in their 2004 paper. 13

Scheme 1. One Site Reversible Binding Process

… can be approximated to zero, thus allowing the protein−ligand complex concentration [PL] to be expressed as shown in eq 10. Combining eq 10 with eq 6, we obtained eq 11, which could be processed in GraphPad Prism to obtain a ligand Kd value from non-linear regression curve fitting to the experimental data. R2,obs was measured experimentally using individual samples containing differing ligand concentrations but identical protein concentrations. R2F was measured using samples containing only ligand. In practice, we first used eq 13 to calculate the value of the quantity y in Microsoft Excel and then substituted the first term in eq 11 with the factor α as shown in eq 14.
Finally, we used Prism to perform non-linear regression with eq 12, which produced fitted values for both Kd and the factor α.
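For reference, the working form of the fit can be written out from quantities named elsewhere in the text. This is a reconstruction and not a reproduction of the published equations: it follows the y-axis quantity described for Figure 3D and the user-defined Prism equation given in the Experimental Section. The factor α is left as a fitted parameter because its definition (eq 14) is not reproduced here.

```latex
\begin{align}
y &= \left(R_{2,\mathrm{obs}} - R_{2F}\right) L_T \tag{13}\\
y &= \frac{\alpha}{2}\left[\left(L_T + K_d + P_T\right)
     - \sqrt{\left(L_T + K_d + P_T\right)^2 - 4\,P_T L_T}\,\right] \tag{12}
\end{align}
```

Written this way, y is simply α multiplied by the [PL] of eq 6, so PT is fixed by the experiment and only Kd and α are fitted.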
■ RESULTS AND DISCUSSION
Assay Format and Sample Preparation. Four aqueous stock solutions were prepared in Eppendorf Safe-Lock tubes (Cat no. 0030120094) by manual pipetting: (1) the ligand aqueous stock solution was prepared by adding 50 mM ligand DMSO-d6 solution to aqueous buffer; (2) the dimethyl sulfoxide (DMSO) aqueous stock solution was prepared by adding the same volume of DMSO-d6 as used for the ligand DMSO-d6 solution to aqueous buffer; (3) the protein aqueous stock solution was prepared by adding protein solution to aqueous buffer; and (4) the aqueous solution was the buffer used in the assay.
Samples were prepared on a Bruker SamplePro-Tube liquid handler by mixing different volumes of the four aqueous stocks in 96-well microplates (Greiner 650201) and then transferred to 3 mm NMR tubes (Figure 2). Ten samples were prepared: samples 1−8 contained increasing ligand concentrations (at constant protein concentration); samples 9 and 10 contained only ligand at two different concentrations (protein absent). These two samples served as control samples to obtain the R2 values of free ligand. The final percentage of DMSO-d6 in all samples was identical to minimize possible impacts of DMSO-d6.

Kd Determination Demonstrated Using the BCL6 BTB Domain Ligand CCT365133. Here, we used compound CCT365133 (Figure 3A) to demonstrate how a Kd value could be determined using the R2KD assay. CCT365133 was discovered in our in-house drug discovery program 14 and determined to interact with the BCL6 BTB domain with a Ki of 50 μM in a TR-FRET binding assay; the assay details have been published elsewhere. 15 BCL6 is a transcriptional repressor and has been reported as a potential target for cancer drug therapy. 16 Figure 3 shows the R2KD assay data used to obtain the Kd curve for CCT365133. These data were obtained from eight samples containing ligand (CCT365133) at concentrations of 50−500 μM and 5 μM BCL6 protein, as well as from two samples containing ligand only at concentrations of 200 and 400 μM (Figure 3C). The R2 values of four aromatic 1H-NMR signals (Figure 3B) were experimentally determined (Figure 3C), and (R2,obs − R2F) × LT values were calculated using Excel and used as y-axis values in Prism, while ligand concentrations were plotted as the x-axis values (Figure 3D).
All four NMR peaks were used to fit a global Kd with non-linear regression using a customized equation (see Experimental Section), while α values were fitted as an individual parameter for each peak (Figure 3E), as α values differ depending on the individual proton's environment. While it was possible to fit Kd values for each peak individually (Figure 3F), we found that using a global fitting algorithm increased the robustness of curve fitting and better accommodated outliers in the data. The goodness of fit was judged by the 95% confidence interval (CI) (profile likelihood) values and the R-squared value. The global Kd obtained here is close to the Ki value (54 vs 50 μM), although our accuracy study showed that the differences can be larger (see section "Assay Accuracy").

Assay Accuracy. To determine the accuracy of Kd measured using the R2KD assay, we tested seven small molecule ligands (Figure 4) against three protein targets (BCL6, CRBN/DDB1 complex, and ERAP1) and compared their Kd values from the R2KD assay with known Ki values (Table 1) obtained in biochemical binding assays 15,17,18 or the Kd value from the SPR binding assay. 19 We found good agreement between the Kd values and the Ki values, with most differing by less than two-fold from those determined by biochemical or biophysical assays. These results from three structurally distinct proteins also demonstrated the versatility of the R2KD assay. We found the assay setup to be simple and positive controls unnecessary, making the approach suitable for new and challenging targets.
Assay Reproducibility. For these seven compounds, the R2KD assay was repeated three times with sample preparation on separate dates to assess the assay's reproducibility under optimized assay conditions. The assay was observed to have good reproducibility: compounds with affinity ranging from high μM to low mM had a standard deviation (SD) of less than 50% of the average Kd value; for compounds in the affinity range between 1 and 20 μM, the error increased to 80% of the average Kd value. As the purpose of this assay is to assess initial, typically weak binding fragment hits, this reproducibility was deemed satisfactory in the relevant affinity range.

Assay Detection Limit. Based on the theory of the R2KD assay, this approach is best suited to detect low μM to low mM affinity, when the ligand exchanges rapidly between bound and free states. In practice, two main factors limit the detection range: NMR instrument sensitivity and ligand aqueous solubility.
For the NMR system used in this study (600 MHz with TCI-CryoProbe), we found that 20 μM ligand concentration was the lowest concentration that generated a sufficient signal-to-noise level in 1 h for R2 measurements. This limited the lowest Kd (i.e., the highest affinity) we can accurately measure. In the R2KD assay, Kd is a fitted parameter value from non-linear regression, so it is possible to obtain Kd even if its value is outside the experimental concentration range, as demonstrated using CCT373101 (Figure 5). Although a lack of data points could compromise the accuracy of the fitted Kd value, giving a wider 95% CI (Figure 5F), results showed that compounds with a Kd of around 10 μM could still be measured with satisfactory accuracy and reproducibility (Table 1). We also observed that specific binding was clearly indicated, as R2 values increased several fold when the ligand concentration was gradually reduced (Figure 5C). In conclusion, 10 μM was deemed to be the lower detection limit for this assay.
For the upper detection limit, as with other assays, small molecule aqueous solubility is a key factor limiting the largest Kd value detectable. In our experience, a high proportion of fragments have aqueous solubility of less than 500 μM, as measured by qHNMR in PBS buffer at pH 7.4, resulting in a near straight line observed within this concentration range for ligands with Kd higher than 1 mM. The lack of data points at higher ligand concentrations reduced the accuracy of curve fitting. In our tests, the largest Kd we could determine accurately was around 1 mM, using compound CCT040036 binding to BCL6 BTB (Figure 6). However, for soluble fragments like acids or carbohydrates, the upper detection limit can be higher.
To conclude, the R2KD assay is most suited to test compounds with affinity in the 10 μM to 1 mM range.
During the process of developing the R2KD assay, we noticed several factors that impacted assay performance. The practical implications of these factors are discussed below.

R2 Measurement. To successfully determine Kd, it is essential to know how precisely the R2 value can be measured, to distinguish changes in R2 due to interaction with protein from changes due to experimental variation. In this study, R2 was determined using non-linear regression with the mono-exponential decay equation I = I0 exp(−tR2) (Figure 7A) using a CPMG pseudo-2D NMR experiment with water suppression. In optimizing the NMR experiment, we aimed to determine R2 with a standard error of less than 10% of the R2 value while keeping the NMR experiment acquisition time within 1 h for the lowest concentration sample.
We considered that the key factors for the R2 NMR acquisition experiment were the number of different spin-echo times, how to distribute these spin-echo times, and the total scan number. Ideally, all the relaxation time points should be evenly distributed on the curve with enough data points to give satisfactory resolution for the mono-exponential curve fitting. For samples with low ligand concentrations (e.g., 50 μM), more scans were acquired to allow the software to integrate the peak area under the NMR signals accurately. After some initial analysis, we set up an R2 experiment using nine different spin-echo times (4, 20, 100, 200, 300, 500, 700, 1000, and 2000 milliseconds) and scan numbers ranging from 8 to 32. The data in Figure 7 were generated using CCT365133 against BCL6 BTB.
We tested if these parameters were appropriate by analyzing results from 30 R2 experiments with different ligand−protein ratios. We found that when R2 values were less than 10 (Figure 7B), standard errors were less than 10% for concentrations greater than 50 μM, which was within our intended range. For samples at a concentration of 50 μM, standard errors were between 10 and 17%, higher than we aimed for but still acceptable for our purpose. It was possible to reduce the standard error by increasing NMR experiment scan numbers further, but this required experimental time beyond what was considered routinely practical in our laboratory. However, when R2 values were greater than 10 (Figure 7C), standard errors were greater than 20% for several concentrations. It may require trial and error to identify assay conditions that avoid R2 values greater than 10. Instead of changing the R2 experimental protocol, we opted to either exclude these values when we performed the non-linear regression or to reduce the protein concentration to decrease R2 values.

Impact of Ligand Aqueous Solubility. As the assay is designed to determine Kd within the range of μM to low mM, the ligand concentration range was required to extend to near-mM concentrations. We observed that for ligands with aqueous solubility below the desired nominal concentration, the R2KD assay results were impacted by the way samples were prepared. To illustrate this issue, an example is shown in Figure 7D. Using the soluble tool compound CCT365133, we prepared two data sets with different concentration series. For data set A, the ligand concentration was increased from 50 to 250 μM (50, 100, 150, 200, and 250 μM), and then the concentration was kept at 300 μM for the last three data points to mimic the scenario in which the compound has reached its aqueous solubility limit. For data set B, the ligand concentration continued to increase to 500 μM over the last three data points, mimicking the scenario in which the compound has no solubility limit.
The resulting Kd value for data set A was around 10-fold larger than that for data set B. Limited solubility therefore makes compounds appear less potent in the R2KD assay. To prevent measured Kd values being impacted by this issue, we opted to prepare samples by making a series of dilutions of a ligand aqueous stock solution. Ligand aqueous solubility impacts all methods for Kd determination; an advantage of the NMR approach is that the actual concentration of ligand in samples can be monitored by extracting the first slice of the pseudo-2D R2 experiment, where the relaxation delay is set to 4 ms. We have also been using quantitative 1H-NMR to measure the ligand concentration in the samples that contain no protein to further gauge the actual concentration of compound. If desired, the curve fitting could use the experimentally measured concentrations instead of nominal concentrations.
Impact of Protein Concentration on Kd Value. Protein concentration was observed to be a key factor for the success of the R2KD assay. In theory, the concentration should be kept as low as possible to comply with the above derivations (i.e., LT/PT ≫ 1) and ideally less than 20% of the lowest ligand concentration. If a tool compound is available, it is recommended that a few concentrations be tested during assay optimization. However, we observed that the Kd value is not dependent on the total protein concentration. Using compound CCT365133, Kd was determined using three different protein concentrations: 5, 25, and 35 μM. The results suggested that the variation was small (RSD 17%) (Figure 7E). At higher protein concentration (35 μM), we noticed that in samples with lower ligand concentration, such as 50 and 100 μM, R2 values had larger error bars between replicates. Two reasons are behind this increase in variability: first, the NMR signals from the protein became visible at 35 μM and, if they overlapped with ligand signals, the R2 values measured were a mixture from both; second, the R2 values of ligands became larger than 10 and could not be measured accurately using the R2 NMR experiment settings in the current method, as discussed earlier in the R2 Measurement section. Protein concentrations higher than 35 μM were not used because the ligand R2 value could not be observed accurately with higher protein concentration for compound CCT365133. As expected, increased R2 values were observed with increased protein concentration due to higher percentages of bound ligand. As a rule of thumb, we suggest starting the assay development with a protein concentration of 5 μM. For large proteins (>50 kDa), the concentration may be further reduced to 2 μM, while for small proteins (<30 kDa), the concentration can be increased to 10 μM.

Journal of Medicinal Chemistry pubs.acs.org/jmc Article

Curve Fitting Parameters. In theory, all NMR signals from the same molecule should share a single Kd value. So when fitting the Kd value using GraphPad Prism, the constraint "Shared value for all data sets" was used for Kd parameter fitting. For each shared parameter, Prism finds one (global) best-fit value that applies to all the data sets. This method also improved the robustness of the fitting. The α values were not shared since they differ, depending on the individual proton's environment. Once the α values were fitted, the bound protein concentration [PL] could be calculated using eqs 10 and 14 for individual NMR signals. This allowed us to simplify the graphical representation of the curve fitting (Figure 7F) and facilitated easier comparison of different compounds with a unified y-axis scale. Using such a visual representation, the graph would plateau at the total protein concentration used in the assay.
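The rescaling onto a common [PL] axis can be sketched in a few lines of Python. The sketch assumes, consistent with the user-defined Prism equation in the Experimental Section, that the fitted quantity is y = α[PL], so that [PL] = y/α for each signal. The function name, the y values, and the α values below are all hypothetical and chosen only to show two signals collapsing onto one curve; they are not data from this study.

```python
def bound_from_y(y_values, alpha):
    """Convert y = (R2,obs - R2F) * L_T into [PL] = y / alpha for one signal."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return [y / alpha for y in y_values]


if __name__ == "__main__":
    p_total = 5.0  # uM, total protein concentration used in the assay
    # Hypothetical y values for two signals of the same ligand. They differ
    # only through their (hypothetical) fitted alpha factors, 100 and 250.
    signals = {
        "peak A": (100.0, [240.0, 320.0, 390.0, 440.0]),
        "peak B": (250.0, [600.0, 800.0, 975.0, 1100.0]),
    }
    for name, (alpha, y_values) in signals.items():
        bound = bound_from_y(y_values, alpha)
        # Fraction of protein occupied; approaches 1 as the curve plateaus.
        occupancy = [pl / p_total for pl in bound]
        print(name, [round(pl, 2) for pl in bound], [round(f, 2) for f in occupancy])
```

After division by α, both signals report the same [PL] series, which is why a single y-axis bounded by the total protein concentration can be used for all signals and compounds.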
Application. Here, we present how we used the R2KD assay to triage fragment hits against the CRBN/DDB1 complex. The CRBN/DDB1 complex is part of the E3 ligase system, with its thalidomide-binding domain (TBD) pocket critical for recruiting its substrate protein. 20 A fragment library, composed of around 1000 compounds, was screened using R2 relaxation edited 1H-NMR experiments, with 19 fragments subsequently identified as competitive hits for the TBD pocket. The R2 values of these fragments were measured at two ligand concentrations (200 and 50 μM) against 2 μM of the CRBN/DDB1 complex. This enabled preliminary ranking of hits: hits with larger R2 differences rank higher. This practice also reveals possible aggregators, which have larger R2 values at higher concentration and will result in a non-saturating dose−response curve. We then measured the Kd values of the top 10 fragments using the R2KD assay and confirmed the ranking order (Table 2). We also determined the IC50 of the top 10 fragments using an FP-based biochemical assay. 17 Most fragments showed good correlation between the two assays, apart from two weaker binders with IC50 > 3000 μM in the FP assay (Table 2). The structures of these fragment hits are shown in Figure 8. Several of the most potent fragments contain similar cores such as uracil and hydantoin, as reported in a previous fragment screen. 21

■ CONCLUSIONS
Here, we present a new, robust, versatile ligand-observed NMR-based approach to determine fragment binding Kd, applicable to a range of targets. Based on the single site reversible binding theory and the Swift-Connick formula, we have devised a new equation to determine the binding dissociation constant (Kd) using the transverse relaxation rate R2. We established an automated biophysical assay, R2KD, using state-of-the-art NMR instrumentation and optimized the approach for accuracy and reproducibility. Our results suggested good agreement of the Kd values from the R2KD assay with other biochemical and biophysical techniques across multiple protein targets with a range of molecular sizes. From our limit of detection study, we concluded that the R2KD assay is most suited to measure weak binding events in the Kd range of 10 μM to 1 mM. This suggests that the assay can be applied to triage hits resulting from a fragment-based drug discovery approach. We successfully demonstrated the use of this protocol to rank fragment hits from our NMR-based fragment screen against the CRBN/DDB1 complex of the CUL4 CRBN E3 ligase.
We envision that the R2KD assay will play a key role in fragment-based drug discovery, especially when allosteric sites are considered, for which other assays may not be readily available. It would be a valuable tool to assess primary hits from many forms of fragment screening techniques, such as crystallography or DSF screening. The R2KD assay can also serve as an orthogonal approach to biochemical assays in the early drug discovery stage when the initial hits are discovered and are in the 10−1000 μM Kd range.
■ EXPERIMENTAL SECTION
Materials. All compounds used in this study were either purchased from Chembridge or synthesized in house. All compounds for which Kd was measured had purity higher than 95% by high-performance liquid chromatography (HPLC). NMR data were collected at 298 K on a Bruker AVANCE NEO 600 MHz spectrometer, equipped with a 5 mm TCI CryoProbe, using Bruker Topspin 4.0. HRMS data were collected using an Agilent 1200 series HPLC instrument and diode array detector coupled to a 6530 time-of-flight mass spectrometer with an ESI-AJS source. The characterization information is included in the Supporting Information. All proteins used in this study were prepared in house, and the relevant information was published previously.
R2 Determination. NMR data were collected at 298 K on a Bruker AVANCE NEO 600 MHz spectrometer, equipped with a 5 mm TCI CryoProbe, using Bruker Topspin 4.0. The T2 relaxation experiment was acquired using the pulse program CPMG with a 3-9-19 pulse sequence with gradients incorporated to suppress the water signal. 22 Extra water suppression was achieved by adding presaturation during d1. The spin-echo period (delay-180°-delay) was set to 1 ms (d20 is 500 μs), and the relaxation delay (d1) was set to 10 s. The pseudo-2D experiment contained nine slices, with the spin-echo period repeated the following numbers of times: 4, 20, 100, 200, 300, 500, 700, 1000, and 2000. The T2 relaxation experiment was processed using MestReNova 14.1, and the Data Analysis Module of MestReNova was used to obtain integrals of individual 1H-NMR signals, which were used to calculate R2 with the equation I = I0 exp(−tR2).
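To make the R2 calculation concrete, the Python sketch below recovers R2 from integrals measured at the nine spin-echo times. Note that it swaps in a simpler technique than the one used here: it fits a straight line to ln(I) versus t by ordinary least squares, whereas the study used non-linear regression on I = I0 exp(−tR2). The log-linear form weights noisy late time points differently and requires strictly positive integrals, so it should be read as an illustration only. The function name and the synthetic intensities (I0 = 100, R2 = 2.5 s−1) are hypothetical.

```python
import math


def fit_r2_loglinear(times_s, intensities):
    """Estimate (I0, R2) from I = I0 * exp(-t * R2) via a line fit to ln(I).

    times_s are total spin-echo times in seconds; intensities must be > 0.
    """
    if len(times_s) != len(intensities) or len(times_s) < 2:
        raise ValueError("need at least two matched (time, intensity) points")
    logs = [math.log(i) for i in intensities]
    n = len(times_s)
    t_mean = sum(times_s) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_s, logs))
    slope = sxy / sxx                      # equals -R2
    intercept = y_mean - slope * t_mean    # equals ln(I0)
    return math.exp(intercept), -slope


if __name__ == "__main__":
    # 1 ms spin-echo period repeated 4 ... 2000 times, expressed in seconds.
    times = [0.004, 0.020, 0.100, 0.200, 0.300, 0.500, 0.700, 1.000, 2.000]
    # Synthetic, noise-free integrals for a hypothetical signal.
    integrals = [100.0 * math.exp(-2.5 * t) for t in times]
    i0, r2 = fit_r2_loglinear(times, integrals)
    print(f"I0 = {i0:.2f}, R2 = {r2:.3f} s^-1")
```

With real data, a weighted or non-linear fit is preferable once the late-time integrals approach the noise level.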
Nonlinear Regression Analysis. Nonlinear regression analysis was carried out using GraphPad Prism 9.2.0. Equation 12 was defined in user-defined equations as: Y = 0.5*α*((X + Kd + P) − sqrt(sqr(X + Kd + P) − 4*P*X)), where Y was calculated from eq 13 and X is the total ligand concentration. The equation contains three parameters (P, α, and Kd), each with its initial value set to 1. In the default constraints setting, P was "constant equal to" the total protein concentration, Kd was "shared value for all data sets", and α was "must be greater than 10". CIs of parameters were calculated at the 95% level using the Asymmetrical (Profile-likelihood) CI option.
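For readers without Prism, the global fit can be approximated in plain Python. The sketch below is not the Prism procedure: it exploits the fact that, for a trial Kd, the best α for each data set has a closed-form least-squares solution, and then searches over the single shared Kd with a coarse logarithmic grid followed by golden-section refinement. It does not enforce the α > 10 constraint and does not compute profile-likelihood CIs. All function names, the concentration series, and the α values in the demonstration are hypothetical; only P = 5 μM and Kd = 54 μM echo values quoted in the text.

```python
import math


def bound_complex(x, p, kd):
    """[PL] for total ligand x, total protein p and Kd (same units)."""
    s = x + kd + p
    return 0.5 * (s - math.sqrt(s * s - 4.0 * p * x))


def profile_sse(kd, x, datasets, p):
    """Residual sum of squares at a trial Kd, using the best alpha per data set."""
    f = [bound_complex(xi, p, kd) for xi in x]
    sff = sum(fi * fi for fi in f)
    sse, alphas = 0.0, []
    for y in datasets:
        alpha = sum(fi * yi for fi, yi in zip(f, y)) / sff  # linear least squares
        alphas.append(alpha)
        sse += sum((yi - alpha * fi) ** 2 for fi, yi in zip(f, y))
    return sse, alphas


def fit_global_kd(x, datasets, p, kd_min=1.0, kd_max=1.0e4, grid=61, iterations=80):
    """Shared Kd and per-data-set alpha; Kd is searched on a log10 scale."""
    def cost(u):
        return profile_sse(10.0 ** u, x, datasets, p)[0]

    lo, hi = math.log10(kd_min), math.log10(kd_max)
    us = [lo + (hi - lo) * i / (grid - 1) for i in range(grid)]
    best = min(range(grid), key=lambda i: cost(us[i]))
    a, b = us[max(best - 1, 0)], us[min(best + 1, grid - 1)]
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = cost(c), cost(d)
    for _ in range(iterations):  # golden-section refinement inside [a, b]
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = cost(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = cost(d)
    kd = 10.0 ** (0.5 * (a + b))
    return kd, profile_sse(kd, x, datasets, p)[1]


if __name__ == "__main__":
    x = [50.0, 100.0, 150.0, 200.0, 250.0, 300.0, 400.0, 500.0]  # uM, hypothetical
    true_alphas = [80.0, 120.0, 150.0, 200.0]                     # hypothetical
    data = [[al * bound_complex(xi, 5.0, 54.0) for xi in x] for al in true_alphas]
    kd, alphas = fit_global_kd(x, data, p=5.0)
    print(f"Kd = {kd:.2f} uM; alphas = {[round(al, 1) for al in alphas]}")
```

Eliminating α analytically leaves a one-dimensional search, which avoids the sensitivity to starting values that a full multi-parameter optimizer can show; the trade-off is that bounds on α must be checked after the fit rather than imposed during it.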
Detailed equation derivation; compound characterization by 1H NMR and LCMS; Kd curve fitting data for the top 10 fragments identified in the fragment screening campaign against the CRBN/DDB1 complex; and correlation plot between pKd and pKi using data presented in Table 1