Ultrahigh molecular recognition specificity of competing DNA oligonucleotide strands in thermal equilibrium

The specificity of molecular recognition is important to molecular self-organization. A prominent example is the biological cell where, within a highly crowded molecular environment, a myriad of different molecular receptor pairs recognize their binding partner with astonishing accuracy. In thermal equilibrium it is usually assumed that the affinity of recognizer pairs only depends on the nature of the two binding molecules. Accordingly, Boltzmann factors of binding energy differences relate the molecular affinities among different target molecules that compete for the same probe. Here, we consider the molecular recognition of short DNA oligonucleotide single strands. We show that a better matching oligonucleotide strand can prevail against a disproportionately more concentrated competitor that exhibits reduced affinity due to a mismatch. The deviation from the simple picture above may reach several orders of magnitude. In our experiments the effective molecular affinity of a given strand remains elevated only as long as the better matching competitor is not present. We interpret our observations based on an energy barrier of entropic origin that occurs if two competing oligonucleotide strands occupy the same probe simultaneously. In this situation the relative binding affinities are reduced asymmetrically, which leads to an expression of the free energy landscape that represents a formal analogue of a Landau description of phase transitions. Our mean field description reproduces the observations in quantitative agreement. The advantage of improved molecular recognition comes at no energetic cost other than the design of the molecular ensemble and the introduction of the competitor. It will be interesting to see if mechanisms along lines similar to those exposed here contribute to the molecular synergy that occurs in biological systems.


Introduction
Many chemical reactions or functions in biochemistry, molecular medicine, or biotechnology rely on the specificity of molecular recognition. Specific binding is crucial to the formation of dedicated macromolecular complexes [1]. They may catalyse chemical reactions or transmit information, among a great variety of other functions. Accordingly, there has been a longstanding interest in molecular binding specificity. The well-known 'Lock and Key' model explains the recognition specificity of enzymes solely based on their molecular shape [2]. The 'Induced Fit' and the 'Conformational Proofreading' models [3] go beyond that. Considering a recognition pair, they show that the deformation of a (flexible) molecule upon binding can lead to an enthalpic barrier that increases binding specificity at the expense of binding affinity. Cooperative binding of a molecule at multiple sites can increase both binding affinity and binding specificity at the same time. DNA hybridization is an example: binding of one complementary base pair increases the binding probability of other complementary bases as well, and this increases recognition specificity. Cooperative binding of several molecules to the same location, as observed in transcription regulation, can increase specificity along the same lines [4]. The 'Pre-existing Equilibrium' or 'Monod, Wyman, and Changeux' model [5] considers allosteric molecules that possess several conformational states with different affinities. The state of increased affinity to a ligand will be stabilised as long as a third, different molecule remains attached to the allosteric site.
In the above examples, specificity solely depends on the design of the molecular recognisers.
A different case is the use of energy to create thermal non-equilibrium and increase specificity. In their seminal papers [6], Hopfield and Ninio discussed how DNA polymerases consume ATP (adenosine triphosphate) as an energy source to reduce errors during the incorporation of single bases in a DNA copying process. In competition and in thermodynamic equilibrium (without an energy source), the laws of thermodynamics fix the error through the law of mass action and the ratio of the individual binding constants of the incorrect and correct bases, $K_{\mathrm{incorr}}/K_{\mathrm{corr}}$ (in the vicinity of $10^{-1}$). By introducing several supplementary, energetically excited binding states, which are selective and proceed one after another, polymerases reduce this error by several orders of magnitude at the expense of both energy and biochemical side reactions. The mechanism has since become the paradigm of 'kinetic proofreading'.
There is no 'kinetic proofreading' process for DNA hybridization, and non-equilibrium states, often metastable, are considered detrimental to the accuracy of the recognition process [7]. It is generally believed that the molecular recognition among competing DNA strands performs best in thermal equilibrium without interactions [7][8][9]. Accordingly, the recognition error remains limited by the ratio of the individual binding constants of the competing strands.
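The equilibrium limit described above can be illustrated with a minimal sketch, assuming ideal competitive Langmuir binding of two non-interacting targets that are both in large excess over the probes (so that free and total target concentrations coincide). The function name and all numerical values are illustrative assumptions, not values taken from this work.

```python
def competitive_occupancy(k_a, c_a, k_b, c_b):
    """Equilibrium probe fractions occupied by two non-interacting targets.

    k_a, k_b: binding constants in 1/M (hypothetical values below).
    c_a, c_b: free target concentrations in M.
    Assumes targets in excess, so depletion of free targets is neglected.
    """
    denom = 1.0 + k_a * c_a + k_b * c_b
    return k_a * c_a / denom, k_b * c_b / denom


# Hypothetical case: B binds 10x weaker than A but is 1000x more concentrated.
theta_a, theta_b = competitive_occupancy(k_a=1e9, c_a=5e-9, k_b=1e8, c_b=5e-6)

# In this picture the occupancy ratio is fixed by (k_b / k_a) * (c_b / c_a).
error_ratio = theta_b / theta_a
print(f"theta_A = {theta_a:.4f}, theta_B = {theta_b:.4f}, B/A = {error_ratio:.1f}")
```

Under these assumptions the weaker but more concentrated competitor occupies most probes, which is the baseline against which the experiments below are compared.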
One of the most challenging aspects of oligonucleotide technology is the recognition of perfectly matching molecules in an environment of many similar competitors [8]. This is particularly difficult if the oligonucleotides differ by as little as a single base, as for instance in the search for single nucleotide polymorphisms that are of medical importance [10].
However, DNA hybridization can perform surprisingly well in a highly competitive environment. An example is in situ hybridization, where a nucleotide strand a few hundred bases long needs to bind exclusively to its complement in the crowded molecular environment of a biological cell [11]. Here our aim is to investigate the molecular recognition specificity of oligonucleotide hybridisation [12] in a competitive situation [13][14][15][16][17][18]. We show that the specificity can increase by several orders of magnitude compared to the generally accepted physical limit described above. We suggest a cooperative mechanism that quantitatively reproduces most of our observations.

Oligodeoxyribonucleotide sequences and buffer solutions
We obtain all oligonucleotides commercially (

Dendrimer coating of the glass slides used in surface based measurements
We use standard microscopy coverslips with a diameter of 20 mm as substrates for DNA immobilization. Coverslips are cleaned with Deconex (Borer Chemie, Switzerland) and rinsed with purified water. Silanization and functionalization with dendrimeric molecules (Cyclotriphosphazene PMMH, Generation 2.5 or 4.5, Aldrich) bearing aldehyde endgroups for immobilization of amino-modified DNA are performed according to [19].

DNA grafting to dendrimer coated slides
For immobilization of DNA probes we follow the protocol in [19] with the following modifications: Immobilization time is reduced to one hour. We limit sodium borohydride treatment to 15 min and add 25 percent (v/v) EtOH to reduce the formation of hydrogen on the substrate and avoid inhomogeneities in grafting density. This modified protocol does not reduce the number of surface bound DNA molecules.

Förster Resonance Energy Transfer (FRET) to study DNA hybridization in bulk
We employ Förster Resonance Energy Transfer (FRET) to assess the hybridization of target and probe in bulk. One of the two competing target molecules is labeled with the acceptor Cy-5 at the 5' end, while the complementary probe molecule is labeled with the respective donor Cy-3 at the 3' end (Fig. 1). The competing target molecule is not labeled. To avoid contact quenching we increase the distance between the fluorescent dyes Cy-3 and Cy-5 by introducing a spacer consisting of three T bases at the 3' end of the probe molecule. The three different oligonucleotides are mixed in a standard reaction tube and incubated at 44°C.
Immediately after mixing, 200 µl of the solution is transferred into a non-absorptive 96 well microplate (Nunc, Germany). The fluorescent signal is immediately determined using a platereader (Polar Star Optima, BMG Labtech, Germany). We use 544 nm as the excitation wavelength of Cy-3 and measure either the emission of Cy-3 at 580 nm (donor channel) or of Cy-5 at 670 nm (acceptor channel).

Total Internal Reflection Fluorescence (TIRF) for time dependent observation of surface based DNA hybridization
TIRF relies on the evanescent field penetrating into the medium of lower refractive index at the point of total reflection by a distance of typically 100 nm (Fig. 2). This field excites fluorescent dyes at the surface. A cover glass with immobilized DNA is fixed as part of the observation chamber mounted on the xy-stage of an inverted microscope (Axiovert 135, Zeiss, Germany). The excitation beam reaches the cover glass through a dove prism and a layer of immersion oil (see supplementary material S1.1 for a scheme of the optical path). To enable simultaneous detection of different fluorescently labeled molecules during competitive experiments, two lasers of different wavelengths (DPSS, 532 nm and HeNe, 633 nm) excite the fluorescent dyes Cy-3 and Cy-5. Before entering the prism, laser beams are chopped, expanded and focused onto the observation chamber. The excitation power is as low as possible to minimize photo bleaching. Fluorescence emission is collected through the microscope objective. A beam splitter directs the emitted light through the emission filters (Cy-3 channel: HQ585/40, Cy-5 channel: HQ680/30, AHF Analysentechnik, Germany), each followed by a photomultiplier (H9305-04, Hamamatsu). Lock-in amplified signals are recorded using a PC. A PID control equipped with a PT100 sensor controls the temperature of the experiment. To avoid a temperature drop towards the observation window, we apply an electrical heating current through its Indium tin oxide (ITO) coating. The observation chamber is of circular cross section (viewed from top) with a diameter of 4 mm and a height of 2.5 mm. It is filled via openings on its sides before the start of an experiment. We reuse the same coverslips with grafted DNA for several hybridization experiments. Removal of hybridized DNA before the start of a new experiment is performed by treatment with 10 mM NaOH for 1 minute at 44°C.
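The penetration distance of the evanescent field quoted above can be estimated with the standard expression for the intensity decay depth, $d = \lambda / \left(4\pi\sqrt{n_1^2 \sin^2\theta - n_2^2}\right)$. The sketch below is only an illustration: the refractive indices (glass 1.52, aqueous buffer 1.33) and the incidence angle of 70° are assumed values, since the text does not state them.

```python
import math


def penetration_depth(wavelength_nm, n_glass, n_medium, angle_deg):
    """1/e intensity decay depth (nm) of the evanescent field in TIRF."""
    theta = math.radians(angle_deg)
    critical = math.asin(n_medium / n_glass)
    if theta <= critical:
        raise ValueError("angle is below the critical angle: no total reflection")
    root = math.sqrt((n_glass * math.sin(theta)) ** 2 - n_medium ** 2)
    return wavelength_nm / (4.0 * math.pi * root)


# Assumed optical parameters (not given in the text).
N_GLASS, N_BUFFER, ANGLE_DEG = 1.52, 1.33, 70.0

depth_532 = penetration_depth(532.0, N_GLASS, N_BUFFER, ANGLE_DEG)  # Cy-3 laser
depth_633 = penetration_depth(633.0, N_GLASS, N_BUFFER, ANGLE_DEG)  # Cy-5 laser
print(f"532 nm: {depth_532:.0f} nm, 633 nm: {depth_633:.0f} nm")
```

With these assumed parameters both depths come out on the order of 100 nm, so essentially only dyes bound at or near the surface are excited.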

DNA microarrays
DNA microarrays are synthesized in situ using a maskless photolithographic technique based on NPPOC phosphoramidites (details in [12] and references therein). After adding fluorescently labeled target molecules to the surface bound probes and waiting for equilibrium, we acquire the fluorescence signals of the hybridized DNA by imaging the microarray surface with the help of an epifluorescence microscope.

Fluorescence correlation spectroscopy (FCS)
The excitation light of a frequency-doubled Nd:YAG laser (532 nm) is first expanded by a telescope and then coupled into an inverted microscope (Axiovert 135, Zeiss, Germany) via the rear side port (see supplementary material S1.2 for a scheme of the optical path). The light beam is focused into the sample through the objective lens (C-Apochromat 40X, numerical aperture 1.2, water immersion, Zeiss, Germany). The emitted light is collected through the same objective. A dichroic mirror reflects the excitation light. Emitted light passes through a Cy-3 filter to reach a 50 µm diameter pinhole. The emitted light is detected by a photon counter (H8259-01, Hamamatsu), equipped with a hardware correlator (Flex 99r480, Correlator.com). We use low excitation power to avoid photobleaching as well as triplet states of the fluorescent dyes. The afterpulsing of the photon counter contributes on a time scale (approximately 3 µs) that is at least one order of magnitude faster than the specific signal of the base pair fluctuations and does not need to be taken into account. The DNA molecules are premixed at defined concentrations in a standard reaction tube. We use a 2-fold excess of targets (20 nM) over probes (10 nM) to occupy almost all probe molecules. The tubes are incubated in boiling water, which is allowed to cool down to room temperature over a time period of 12 hours. Before the start of the measurement we transfer the DNA solution into a measurement chamber. The chamber consists of PDMS, and it is covered by two glass slides at the top and the bottom. The temperature of the chamber is determined by a Pt-100 sensor connected to a controller that regulates the current of a heating foil. Thermal excitations lead to DNA breathing, i.e., local denaturation and reclosing of the double strand structure. For temperatures $T < T_m$, the shape of the correlation function $G(t)$ is mainly dominated by the closing dynamics.  (Table 1) at room temperature, and at 44°C.
To fit the experimentally obtained correlation function we use the equation:

Results and Discussion
In a first set of experiments we consider a DNA strand 16 bases in length. The sequence has been designed at random, under the constraint that the DINAMelt server [21] does not predict any significant secondary structures. We consider three different strands: the target PM (the perfectly matching complement of the probe), and two other targets, MM1 and MM2, that differ from the PM complement in only a single, non-matching base (Table 1). To monitor the amount of hybridized PM in bulk in the presence of a competing strand, we use Förster Resonance Energy Transfer (FRET) (Fig. 1).

TIRF (Total Internal Reflection Fluorescence) detection is employed in the case of surface grafted probes (Fig. 2). Probes are also surface bound in the case of microarrays. However, there the DNA is polymerized 'from' the surface rather than grafted 'to' it, leading to a better control of surface deposition. A further advantage of microarrays is that different strands can be arranged in different locations, which makes it possible to perform experiments in parallel. However, the optical control of the photolithographic process produces a higher number of sequence errors compared to standard nucleotide synthesis [12]. We use an epifluorescence microscope equipped with a cooled CCD camera to determine the fluorescence intensity and estimate the amount of hybridized DNA in the case of microarrays.
We find that in the case of competition among PM and MM2 (Fig. 3a, Here $\Delta\Delta G = \Delta G_A - \Delta G_B$ is the difference between the individual effective binding free energies of the competing strands (Table 1).
Hence, in the case of 'PM vs. MM1' the experimentally observed mismatch discrimination exceeds the predictions of Eq. (2) by several orders of magnitude (Fig. 3d).
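To give a feeling for the size of the standard prediction, the following sketch converts a free energy difference into the expected ratio of duplexes, assuming that the non-interacting picture takes the usual form in which the duplex ratio equals the concentration ratio times the Boltzmann factor $\exp(-\Delta\Delta G / RT)$. The value of $\Delta\Delta G$ used here is hypothetical, and the function names are illustrative only.

```python
import math

R_GAS = 8.314462618  # molar gas constant in J/(mol K)


def affinity_ratio(ddg_kj_per_mol, temperature_c):
    """K_A / K_B = exp(-ddG / RT), with ddG = dG_A - dG_B in kJ/mol."""
    temperature_k = temperature_c + 273.15
    return math.exp(-ddg_kj_per_mol * 1.0e3 / (R_GAS * temperature_k))


def expected_duplex_ratio(ddg_kj_per_mol, temperature_c, c_a, c_b):
    """Duplex ratio D_A / D_B for non-interacting competitors in excess."""
    return affinity_ratio(ddg_kj_per_mol, temperature_c) * c_a / c_b


# Hypothetical example: A binds 6 kJ/mol more strongly than B at 44 degrees C,
# while B is 200x more concentrated (5 nM versus 1000 nM).
k_ratio = affinity_ratio(-6.0, 44.0)
d_ratio = expected_duplex_ratio(-6.0, 44.0, c_a=5e-9, c_b=1000e-9)
print(f"K_A/K_B = {k_ratio:.2f}, expected D_A/D_B = {d_ratio:.3f}")
```

In this picture a modest affinity advantage is easily overwhelmed by a large excess of the competitor, which is why a deviation of several orders of magnitude is notable.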
Competitive hybridisation experiments among other strands related to the PM sequence, including target molecules MM3 and MM4 of Table 1, reveal two characteristic scenarios: either standard specific systems (Fig. 4a), where the presence of a 'Low Affinity Target' (LAT) reduces the equilibrium duplex concentration of a 'High Affinity Target' (HAT) as predicted by Eq. (2), or highly specific systems (Fig. 4b), where the HAT hybridises as if the LAT were not present (see supplementary material S4 for experimental details).
In order to test if our result is sequence related, we randomly choose a second motif in related literature [15]. After shortening to adjust the denaturation temperature for our experimental setup, the investigation of the sequences PM* and MM* (
By measuring the fraction of duplexes as a function of temperature in thermodynamic quasiequilibrium, we extract the changes in entropy and enthalpy of the denaturation process
Put together, we conclude from our observations above that in highly specific situations the competing strands interact physically, though only transiently, at the binding site, and that the effective binding free energies change in such a way that the HAT 'wins' by reducing the affinity of the LAT. This does not contradict thermodynamics, since binding in competition is a different situation that is not necessarily described by the mean energies of individual binding pairs in isolation. It requires high cooperativity to generate the non-linearities needed to produce a strong change in response to a weak perturbation (e.g. the slight shift in position of the mismatched base between MM1 and MM2). It also requires asymmetry, since it is otherwise impossible to tilt the average binding free energy in favour of one of the two competitors. However, given the symmetry in the binding microstates of the competitors, the origin of the asymmetry is not straightforward.
Although the interaction of the different strands is a matter of thermal fluctuations that are difficult to picture in precise detail [23], in the following we show how the experimental observations can easily be understood on theoretical grounds.

Model
Since we chose our sequences at random, we do not expect our observations to depend substantially on the nature of the DNA sequence. This is why we consider a homopolymer in the following, which simplifies the model.
An intermediate configuration has a reduced number of degrees of freedom compared to the fully denatured configuration. We determine the entropy change by treating the conformations of a strand as a self-avoiding walk (SAW) on a lattice [28,29]. For simplicity, we assume that one step on the lattice corresponds to one unbound base of the DNA strand. The number of
The intermediate configuration where all bases are bound corresponds to the largest entropy change. We write the free energy $\Delta G_p^{\vec{v}}$ of a given intermediate configuration

Free energy of coexisting intermediate and helicoidal conformations
With this, the number $K$ of all possible configurations $k$ of a given intermediate configuration
If two strands are simultaneously bound to the same probe, the entropic cost of a helicoidal conformation is increased since the occurrence of a (stiff) helix within such a trifold configuration requires stretching the competitor at the same time (Fig. 5b). This is reflected by the energy penalty $\Delta_p > 0$ in the presence of a competitor in a mean field approximation.
, the configuration is not taken into account, and , # block i in Eq. (7) depends on $\Delta_p$. The enthalpy of a given conformation that includes helix conformations,
In case no base is in the helix conformation, the enthalpy $\Delta H_h^{\vec{v},k} = 0$, and the stability of the duplex is governed by the intermediate configuration only.
The transition from an intermediate configuration to a helix configuration causes an additional entropy change due to the high persistence length of the duplex
For numerical evaluation, we consider the HAT and the LAT as homopolymers of 16 bases in length. The HAT constitutes a perfect match and the LAT possesses a single noncomplementary base in its centre. Their partition functions correspond to the value of their binding constants, where $t$ designates either a HAT or a LAT. The first sum runs over all possible intermediate
(Fig. 4c).
We now consider simultaneous binding of two strands to the same probe in a trifold configuration (Fig. 5b). The numerical assessment confirms that for an effective entropic barrier $\Delta_p$, which corresponds to only very few closed base pairs in terms of free energy, the LAT loses orders of magnitude of its binding affinity while the HAT affinity remains practically unaffected (Figs. 5e, f). This agrees well with the influence of the HAT on the LAT melting curve. A subtler coupling than the proposed simple entropic barrier $\Delta_p$ can lead to even higher values of specificity.
One understands that if some microstates of the interacting competitors are antagonistic, little differences among them can modify the effective binding free energies in the LAT-probe-HAT trifold configuration, which can strongly disfavour binding of only one of the two competing molecules. This also implies, however, that the trifold configuration is only sparsely populated. To compare our theory to the experimental situation we need to consider the total free energy that is gained as a function of probe occupation.

Total Gibbs free energy
We determine the overall Gibbs free energy $G_{\mathrm{total}}$ of $N$ probe molecules and two targets, HAT (High Affinity Target) and LAT (Low Affinity Target), competing for the probes. Each binding site can be empty, occupied by a HAT, occupied by a LAT, or occupied by both molecules at the same time.
where $\alpha\, c_{\mathrm{HAT}}\, c_{\mathrm{LAT}} = c_T$. The formalism turns out to be analogous to a Landau description of phase transitions [30] (Fig. 7a).
In more detail, the effective Gibbs free energy $G_{\mathrm{total}}$ per site is:
The
With this expression, Eq. (15) becomes:
If all probes are occupied by a LAT, there is always a fraction of them that share the probe with a HAT. This is because the HAT can achieve binding states that the LAT cannot reach.
The corresponding probability $\kappa_0$ is a property of the competing molecules, and it depends on
The results of the numerical assessment (Fig. 5) suggest that for an entropy barrier of
2)) to almost exclusively HAT occupied probes.

Discussion
For the interpretation of our observations, we introduced an interaction term between the probe bound strands HAT and LAT, of the form $\alpha\, c_{\mathrm{HAT}}\, c_{\mathrm{LAT}}$, into the free energy expression. Experimentally this is corroborated by the reduction in melting temperature in the presence of the stronger binding competitor, which occurs only in the highly specific case (Fig. 4d). Another experiment that points to the interaction is the change of the LAT fluctuation spectrum in the presence of the HAT (Fig. 4f). Theoretically we find that $\kappa_0$, the probability for the HAT to join a LAT bound probe, needs to exceed $\kappa_0^{\mathrm{crit}}$. This implies that the LAT probe affinity needs to be lower by a certain amount, which depends on the precise microstatistics and interaction of the competing molecules at the probe-binding site. In agreement, our experiments show that a certain difference in melting temperatures between the competitors, here about 10 °C, is required for the highly specific binding to occur (Fig. 4e). The sharpness of the predicted transition at the critical value of $\kappa_0$ consistently explains the experimentally observed occurrence of either standard specificity, following the Boltzmann factor, or the strongly increased specificity. It also explains why the degree of specificity changes in such an abrupt and unforeseen manner with mismatch position (supplementary material S11). The degree of interference of the competitors at the probe-binding site, expressed by $\kappa_0$, is likely to exhibit a complex dependence on sequence, as is the case for secondary structures. A sequence dependence of the kinetics of strand association has been suggested [31]. In analogy to a phase separation, once equilibrium is reached, the HAT-LAT-probe trifold conformation is hardly populated. Although experiments at elevated concentrations of the LAT are difficult to perform, we see that the experiment agrees even quantitatively with the prediction (supplementary material S12).
In principle nothing precludes the mechanism described here from working for much longer strands, although thermodynamic equilibrium may not always be simple to reach in such a case. Some of the observations in [18] may well be due to similar mechanisms.
Following our interpretation, simultaneous binding of the competitors creates an energy barrier in analogy to conformational proofreading [3]. However, in the case considered here, the energy barrier is of entropic origin and is created by the presence of a third molecule, which consequently results in almost no loss in affinity. The energetic cost for the observed, highly increased specificity is hidden in the different thermodynamic situation that comes with the presence of both competitors. The molecular behaviour corresponds to logic 'if' (the stronger binding competitor is not present) 'then' (bind almost as well at the same spot).
Thermodynamically, computing can indeed be performed at the simple cost of the input and output operations [32].
Using the here described phenomenon, very high fidelity single mismatch detection could be

Competing Interests
The authors declare that they have no competing financial interests. Correspondence and requests for materials should be addressed to AO (albrecht.ott@physik.uni-saarland.de).

Fig. 2: TIRF (Total Internal Reflection Fluorescence)
Competing target species A (red) and B (green) bind to surface immobilized probe molecules.

Table 1 for sequences). Blue: highly specific cases,

S9 Thermodynamic equilibrium conditions
S10 Simple occupation of probe molecules by HAT and LAT
S11 Competition in binding as a function of the LAT MM position
S12 Specificity increase in theory and experiment

S1.1 TIRF, optical path

). $\tau_{\mathrm{PM}} < \tau_{\mathrm{MM1}}$. We interpret this result as showing that the PM duplex spends more time in the closed configuration compared to MM1.

S2 Extended Langmuir formalism
In the case of comparable target and probe concentrations, we need to take into account the reduction of molecules in solution during the hybridization reaction. We extend the well-known Langmuir description by taking into account concentration changes of single strands in solution that arise from hybridization (23).
The hybridization reaction follows where $[P]$ and $[T]$ are the concentrations of probe and target molecules, and $[D]$ is the concentration of duplexes. $k_+$ and $k_-$ are the association and dissociation rates of the reaction. Assuming first-order kinetics, this leads to: is the binding affinity of the target molecule T.
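The effect of target depletion can be sketched as follows, assuming the standard bimolecular scheme P + T ⇌ D with $K = k_+/k_- = [D]/([P][T])$ and mass conservation $[P] = [P]_0 - [D]$, $[T] = [T]_0 - [D]$. This is a minimal illustration under those assumptions, not necessarily the exact form of Eq. (S3); the value of $K$ is hypothetical, while the concentrations are the ones quoted in the text.

```python
import math


def duplex_concentration(k, p0, t0):
    """Equilibrium duplex concentration (M) including depletion.

    Solves K = D / ((p0 - D) * (t0 - D)) for the physical root D <= min(p0, t0).
    k: binding affinity in 1/M; p0, t0: initial probe and target concentrations in M.
    """
    s = p0 + t0 + 1.0 / k
    return 0.5 * (s - math.sqrt(s * s - 4.0 * p0 * t0))


def simple_langmuir(k, p0, t0):
    """Duplex concentration when target depletion is neglected."""
    return p0 * k * t0 / (1.0 + k * t0)


K_HYPOTHETICAL = 1e9      # 1/M, illustrative value only
P0, T0 = 10e-9, 20e-9     # probe and target concentrations from the text

d_extended = duplex_concentration(K_HYPOTHETICAL, P0, T0)
d_simple = simple_langmuir(K_HYPOTHETICAL, P0, T0)
print(f"extended: {d_extended * 1e9:.2f} nM, simple: {d_simple * 1e9:.2f} nM")
```

At comparable probe and target concentrations the simple isotherm overestimates the duplex concentration, because it ignores that each duplex removes one target from solution.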
The binding affinity $K$ depends on the binding free energy $\Delta G$ of duplex formation

Fig. S5
Determination of binding affinities for the target molecules PM, MM1, MM2, using FRET in bulk at 44°C (S1.5). The probe molecule (the PM complement) and the respective target constitute the FRET pair. The initial probe concentration is 10 nM in all measurements. The FRET signal is normalized to its maximum value at a target concentration of 100 nM. Fitting the extended Langmuir isotherm (Eq. (S3)) to this data reveals the binding affinity K of the investigated target molecule in bulk. The binding constants are in units of 1/M.
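A fit of this kind can be sketched in a few lines. The example below assumes an isotherm that includes target depletion for the scheme P + T ⇌ D (which may differ in detail from Eq. (S3)), normalizes the signal to its value at 100 nM target as described above, and recovers $K$ by a golden-section search on $\log_{10} K$. The data points are synthetic, generated from a hypothetical affinity, and all function names are illustrative.

```python
import math


def duplex(k, p0, t0):
    """Equilibrium duplex concentration including target depletion."""
    s = p0 + t0 + 1.0 / k
    return 0.5 * (s - math.sqrt(s * s - 4.0 * p0 * t0))


def normalized_signal(k, p0, t0, t_ref=100e-9):
    """Signal normalized to its value at the reference target concentration."""
    return duplex(k, p0, t0) / duplex(k, p0, t_ref)


def fit_log10_k(targets, signals, p0, lo=6.0, hi=12.0, iterations=60):
    """Golden-section search for the log10(K) that minimizes squared residuals."""
    def sse(log_k):
        k = 10.0 ** log_k
        return sum((normalized_signal(k, p0, t) - y) ** 2
                   for t, y in zip(targets, signals))

    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iterations):
        c, d = b - g * (b - a), a + g * (b - a)
        if sse(c) < sse(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


# Synthetic, noise-free titration generated from a hypothetical affinity.
P0 = 10e-9
K_TRUE = 3e8
targets = [1e-9, 2e-9, 5e-9, 10e-9, 20e-9, 50e-9, 100e-9]
signals = [normalized_signal(K_TRUE, P0, t) for t in targets]

log_k_fit = fit_log10_k(targets, signals, P0)
print(f"fitted K = {10.0 ** log_k_fit:.3e} 1/M")
```

Searching on a logarithmic scale is a design choice that keeps the one-parameter fit well conditioned over several decades of $K$; with noisy data a library least-squares routine would be a more usual choice.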

S5 Competitive hybridization experiments with the oligonucleotides PM* and MM*

Table S1: Sequences and binding affinities of PM* and MM*.

Target    Sequence (

The PM target concentration is 5 nM in all cases. The PM molecular recognition remains highly specific, but at lower temperatures the influence of the competitor becomes more and more visible (green arrow). The degree of PM specificity diminishes with decreasing temperatures. However, standard specificity, described by a Boltzmann factor (filled triangle), is not reached in the temperature range under study.

S8 PM and MM1 melting curves
We slowly change the temperature so that each data point is recorded at quasi-equilibrium. This is confirmed by the superposition of the heating and cooling curves (not shown). The recorded temperature dependent TIRF signal consists of two contributions, the amount of hybridized DNA and the temperature dependent efficiency of the dye. To separate the two parts we perform temperature ramps with a fluorescently labeled variant of the probe sequence immobilized to the surface (not shown). In this case the number of dyes excited by the evanescent field does not change with temperature and the observed linear dependence of the TIRF signal on temperature is a property of the dye alone. This contribution determines the linear baseline BL of the denaturation experiment. It is removed following Eq. (S9) to give the fraction of occupied binding sites $\theta$ ($0 \le \theta \le 1$):

S9 Thermodynamic equilibrium conditions
Fig. S12 Verification of thermal equilibrium conditions. a, Surface based competitive hybridization experiment with the target molecules PM and MM1. The graph shows the fluorescent signal from the PM (blue) and MM1 (black) as a function of time. At $t = 0$ we introduce the target solution. The concentration ratio between PM and MM1 is 5:1000. After equilibration we alternately decrease and increase the temperature (vertical lines), and we observe that the initially measured equilibrium values for PM and MM1 are reached again. Note that the varying hybridization signals at varying temperatures consist of two contributions, the amount of hybridized DNA and the emission characteristics of the dye. b, Surface based displacement experiment. After priming the probe molecule surfaces with MM1 targets (

Long-term stability of the competitive hybridization equilibrium between PM and MM1 in bulk. We follow the hybridization signal of both molecular species over a time period of one week in order to rule out relaxation on long time scales. We perform this measurement in bulk using the FRET setup. a, PM and probe constitute a FRET pair, while MM1 is not labeled. b, MM1 and probe constitute the FRET pair, while the PM is not labeled. The fluorescence intensities remain constant over a time period of at least one week, confirming the equilibrium situation.