Selection of allosteric DNAzymes that can sense phenylalanine by expression-SELEX

Abstract Aptamers are ligand-binding RNA or DNA molecules and have been widely examined as biosensors, diagnostic tools, and therapeutic agents. The application of aptamers as biosensors commonly requires an expression platform to produce a signal to report the aptamer-ligand binding event. Traditionally, aptamer selection and expression platform integration are two independent steps and the aptamer selection requires the immobilization of either the aptamer or the ligand. These drawbacks can be easily overcome through the selection of allosteric DNAzymes (aptazymes). Herein, we used the technique of Expression-SELEX developed in our laboratory to select for aptazymes that can be specifically activated by low concentrations of l-phenylalanine. We chose a previous DNA-cleaving DNAzyme known as II-R1 as the expression platform for its low cleavage rate and used stringent selection conditions to drive the selection of high-performance aptazyme candidates. Three aptazymes were chosen for detailed characterization and these DNAzymes were found to exhibit a dissociation constant for l-phenylalanine as low as 4.8 μM, a catalytic rate constant improvement as high as 20 000-fold in the presence of l-phenylalanine, and the ability to discriminate against closely related l-phenylalanine analogs including d-phenylalanine. This work has established the Expression-SELEX as an effective SELEX method to enrich high-quality ligand-responsive aptazymes.

SELEX was invented in 1990 to select aptamers that can bind to T4 DNA polymerase (15) or organic dye molecules (16) or to select ribozymes that can cleave single-stranded DNA (17). Later, various improved SELEX techniques were developed, including negative SELEX (18), counter SELEX (14), capillary electrophoresis SELEX (CE-SELEX) (19), Cell SELEX (20-22), in vivo SELEX (23) and Capture-SELEX (24). Based on these techniques, many aptamers have been generated. However, the application of aptamers commonly requires an expression platform to receive the signal from the aptamer-ligand binding event.
Many aptamers have been combined with an expression platform such as a DNAzyme, a ribozyme, a ribosome-binding site (RBS), a Rho-independent terminator, or a poly(A) signaling sequence to facilitate their applications (8, 12, 13). An ATP aptamer was fused with the hammerhead ribozyme to form the first allosteric ribozyme whose cleavage can be controlled by ATP (2). Moreover, a natural c-di-GMP class I riboswitch aptamer was joined to a hammerhead ribozyme to sense bacterial c-di-GMP (25). Recently, a TPP riboswitch aptamer was fused with a hammerhead ribozyme to detect the blood concentration of thiamine pyrophosphate, a potential biomarker for Alzheimer's disease (26).
The first DNAzyme was isolated by Breaker and Joyce in 1994 by SELEX using a library of ssDNA with a target ribonucleotide in the middle (27). Later, two more classes of self-cleaving DNAs were produced, one of which requires Cu²⁺ and ascorbate as co-effectors, while the other requires Cu²⁺ only (28). Recently, small highly active DNAzymes that require Zn²⁺ have been generated (29). Additionally, an L-RNA-cleaving DNAzyme has been isolated and fused with an ATP-binding aptamer to form an allosteric DNAzyme (30). These DNAzymes can potentially be used as expression platforms for DNA aptamers.
Historically, the selection of an aptamer and the fusion of an expression platform with the aptamer are two separate steps. Recently, we integrated a highly active DNAzyme (I-R3), which was generated by the Breaker laboratory in 2013 (29), as an expression platform into the SELEX cycle (termed Expression-SELEX (31)) to merge these two steps into one. Expression-SELEX does not require immobilizing either the random-sequence library or the ligand. The embedded DNAzyme I-R3 also facilitates the isolation of the cleavage product for the next selection cycle.
However, the previously isolated allosteric DNAzyme (aptazyme) bound to the ligand weakly and induced only slight DNA cleavage.
In this study, we further improved Expression-SELEX by using a new expression platform (the II-R1 DNAzyme), applying an exonuclease to digest the dsDNA to obtain higher-quality ssDNA, and lengthening the negative-selection incubation to eliminate aptazymes that self-cleave in the absence of a ligand. With these improvements, within six rounds of selection, we isolated several allosteric DNAzymes that bind the amino acid L-phenylalanine, a biomarker of phenylketonuria, with high affinity and highly inducible cleavage. The improved Expression-SELEX can be applied to select aptazymes that bind various ligands with high affinity in a few selection cycles.

Expression-SELEX screening process
Step 1: Negative selection. The single-stranded DNA (ssDNA) II-R1 random-sequence library (up to 1 × 10¹⁵ molecules) was dissolved in HEPES buffer (0.05 M HEPES, 0.1 M NaCl, 0.04 M MgCl₂, 1 mM ZnCl₂, pH 7.05) and incubated at 37 °C for 12 h. Full-length ssDNA was separated by 10% urea-denaturing polyacrylamide gel electrophoresis (PAGE) and the corresponding ssDNA band was isolated; the gel pieces were crushed and soaked in elution buffer (10 mM Tris-HCl, pH 7.5 at 23 °C, 200 mM NaCl and 1 mM EDTA), and the DNA was recovered by ethanol precipitation.
Step 2: Positive selection. The recovered full-length ssDNA was dissolved in HEPES buffer containing the ligand L-phenylalanine (different ligand concentrations in different rounds) and incubated at 37 °C for 10-30 min.
Step 3: Selection of cleaved products. The 3′ cleavage product (80 nt) was separated by 10% PAGE and the corresponding ssDNA band was isolated and purified.
Step 4: PCR amplification using Taq DNA polymerase (ABM). Primers II-forward (10 μM) and II-reverse (10 μM, containing a 5′ phosphate group) (Supplementary file No. S1, Supplementary Table S1) were used to amplify the 3′ cleavage product.
Step 5: Exonuclease digestion to obtain single-stranded DNA (Supplementary file No. S1, Supplementary Figure S1). The purified PCR product (5 μg) was incubated with 5 units of lambda exonuclease (New England Biolabs) at 37 °C for 30 min to digest the antisense DNA strand carrying the 5′ phosphate group. The purified full-length ssDNA was used for the next round of screening. In the 19th and 20th rounds, a phenylalanine analog (4-fluoro-DL-phenylalanine) was used for counter-selection.

Next-generation sequencing (NGS) of the SELEX-enriched aptazymes
Sequencing was performed using a previously described method (31). Briefly, the Illumina-platform primers P5-forward (10 μM) and P7-reverse (10 μM) (Supplementary file No. S1, Supplementary Table S1) were appended to the ends of the ssDNA by PCR using Taq DNA polymerase. The PCR product was purified with an agarose gel DNA recovery kit (Sangon Biotech) and sent to Beijing Novogene Company for sequencing.

Analysis of the secondary structures of the allosteric DNAzyme candidates
The analysis method is similar to the one described previously (32). We used Perl scripts to analyze the high-throughput sequencing results for these enriched libraries and to list all unique reads in ranked order (Supplementary files No. S2-S4). To search for sequences and structures homologous to the top-enriched sequences, we first extracted the 30 nt random regions of the top 1000 sequences (Supplementary file No. S5) and then used BLAST to look for higher than 90% similarity between each top sequence and the top 1000 sequences (Supplementary file No. S6). Each top sequence and its homologs therefore form a group of similar sequences. The full-length versions of these sequences were obtained with a Perl script (Supplementary file No. S7) and aligned with Clustal Omega (33) (Supplementary file No. S8). The resulting alignment files were passed to RNAalifold (34) to predict their structures (Supplementary file No. S9) based on minimal free energy and nucleotide covariance (34). The resulting Stockholm files containing structural information were used by R2R to draw the secondary structures (35) (Figure 2 and Supplementary file No. S1, Supplementary Figure S3).

Figure legend (Expression-SELEX selection cycle). Step 1: Negative selection. The library was incubated with buffer containing 1 mM Zn²⁺ for 12 h, and the uncleaved full-length ssDNA was purified by 10% PAGE. Step 2: Positive selection. The purified full-length ssDNA was incubated with L-phenylalanine to induce the cleavage of the embedded DNAzyme II-R1. Step 3: Isolation of the 3′ cleavage product by PAGE. Step 4: Amplification by PCR (using Taq DNA polymerase) with specific primers. Step 5: Digestion of the antisense DNA strand with lambda exonuclease to purify the aptazymes. The purified aptazymes were used for the next selection cycle.
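The read-ranking and homolog-grouping steps of this analysis can be sketched in Python. This is only an illustration and not the authors' Perl scripts: position-wise identity over the 30 nt random region stands in for the BLAST similarity search, and the function names and the `start` offset of the random region are hypothetical.

```python
from collections import Counter


def rank_unique_reads(reads):
    """Return (sequence, count) pairs sorted by decreasing abundance."""
    return Counter(reads).most_common()


def random_region(seq, start, length=30):
    """Slice the random insert out of a full-length read.

    `start` is a hypothetical offset; it depends on the library design.
    """
    return seq[start:start + length]


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences.

    A simplified stand-in for a BLAST similarity score (no gaps allowed).
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def homolog_group(query, candidates, threshold=0.9):
    """Collect candidates whose identity to the query exceeds the threshold."""
    return [c for c in candidates if identity(query, c) > threshold]
```

With a 30 nt region and a threshold of 0.9, a candidate joins the group only if it differs from the query at no more than two positions. Each group would then be aligned and folded with the external tools named in the text.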

DNA aptazyme cleavage assays
Cleavage assays were performed under conditions similar to those described previously (31). Briefly, approximately 100 ng of ssDNA was incubated with or without the ligand L-phenylalanine in cleavage buffer (0.05 M HEPES, 0.1 M NaCl, 0.04 M MgCl₂, 1 mM ZnCl₂, pH 7.05) in a final volume of 20 μL. Cleavage products were separated on a 15% PAGE gel and visualized by staining with SYBR Gold (Invitrogen). The fraction of cleavage was determined as the intensity of the 3′ cleavage DNA product divided by the total ssDNA.

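Dissociation constant (K_D) measurements
The apparent K_D values were determined by a previously described method (31). The K_D was calculated in GraphPad Prism using the "specific binding with Hill slope" function, with the Hill slope h set to 1.0, and the equation $Y = B_{max} X^{h} / (K_D^{h} + X^{h})$, where X is the ligand concentration, Y is the specific binding (here, the fraction of cleavage) and B_max is the maximum specific binding.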
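As an illustration of this kind of curve fit, the following minimal Python sketch fits the one-site specific-binding model Y = B_max·X/(K_D + X) (Hill slope fixed at 1.0) to fraction-of-cleavage data. It is a stand-in for the GraphPad Prism fit and not the authors' procedure: the function names, the log-spaced grid search over K_D and the search bounds are all assumptions.

```python
def one_site(x, bmax, kd):
    """One-site specific binding with the Hill slope fixed at 1.0."""
    return bmax * x / (kd + x)


def fit_one_site(conc, frac, kd_lo=1e-3, kd_hi=1e4, steps=4000):
    """Least-squares fit of (bmax, kd) by a log-spaced grid search over kd.

    For each trial kd the model is linear in bmax, so the best bmax is
    obtained in closed form. Bounds and grid density are arbitrary choices.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_lo * (kd_hi / kd_lo) ** (i / steps)
        f = [x / (kd + x) for x in conc]            # fractional occupancy
        den = sum(fi * fi for fi in f)
        if den == 0:
            continue
        bmax = sum(y * fi for y, fi in zip(frac, f)) / den
        sse = sum((y - bmax * fi) ** 2 for y, fi in zip(frac, f))
        if best is None or sse < best[0]:
            best = (sse, bmax, kd)
    if best is None:
        raise ValueError("at least one non-zero concentration is required")
    return best[1], best[2]
```

Calling `fit_one_site(conc, frac)` with ligand concentrations and the matching cleavage fractions returns the fitted `(bmax, kd)` pair in the units of `conc`. A dedicated nonlinear least-squares routine would normally replace the grid search and would also report confidence intervals.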

Observed rate constant (k_obs) measurements
Measurements of k_obs values were performed as described previously (31). Briefly, we performed cleavage of the ssDNA aptazymes with and without a ligand in the cleavage buffer. The fraction of cleavage at each time point was calculated as described above. The k_obs value for each reaction was determined with GraphPad software using the one-phase decay function and the equation $Y = (Y_0 - \mathrm{Plateau})\, e^{-k_{obs} X} + \mathrm{Plateau}$, where Y_0 is the Y value when X (time) is zero, Plateau is the Y value at infinite times, and k_obs is the rate constant.
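A time course of cleavage fractions can be fitted to a one-phase exponential, Y = (Y0 − Plateau)·exp(−k_obs·t) + Plateau, as in the hedged Python sketch below. It is not the GraphPad routine used in the study: the function names are illustrative, Y0 is held fixed (zero by default, assuming no cleavage at time zero), and k_obs is found by a simple log-spaced grid search with arbitrary bounds.

```python
import math


def one_phase(t, y0, plateau, k):
    """One-phase exponential: starts at y0 and approaches plateau at rate k."""
    return (y0 - plateau) * math.exp(-k * t) + plateau


def fit_k_obs(times, fracs, y0=0.0, k_lo=1e-7, k_hi=1.0, steps=4000):
    """Least-squares estimate of (k_obs, plateau) with y0 held fixed.

    Rearranged, the model reads y - y0 = (plateau - y0) * (1 - exp(-k t)),
    which is linear in the amplitude for each trial k.
    """
    best = None
    for i in range(steps + 1):
        k = k_lo * (k_hi / k_lo) ** (i / steps)
        g = [1.0 - math.exp(-k * t) for t in times]
        den = sum(gi * gi for gi in g)
        if den == 0:
            continue
        amp = sum((y - y0) * gi for y, gi in zip(fracs, g)) / den
        sse = sum((y - y0 - amp * gi) ** 2 for y, gi in zip(fracs, g))
        if best is None or sse < best[0]:
            best = (sse, k, y0 + amp)
    if best is None:
        raise ValueError("at least one non-zero time point is required")
    return best[1], best[2]
```

The returned rate constant has the inverse of the time unit used in `times` (min⁻¹ for time points in minutes).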

Integrating the II-R1 DNA enzyme (DNAzyme) into the random library
We chose the II-R1 DNAzyme as an expression platform for its slow cleavage rate. We expected that ligand binding would enable the DNAzyme to reorganize and stabilize its structure, that the cleavage rate would thereby be improved, and that this would facilitate the isolation of a ligand-responsive DNAzyme. The new DNAzymes to be selected are called allosteric DNAzymes or aptazymes in this work. The initial library consisted of ssDNA sequences in which 30 random nucleotides were integrated into the P2 stem of the II-R1 DNAzyme (the aptazyme in Figure 1), and the new II-R1 aptazymes were expected to perform a self-cleaving reaction upon ligand binding. The cleavage products were then isolated by denaturing PAGE and used for PCR amplification.
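A quick calculation shows how sparsely such a library samples sequence space. The sketch below assumes the 30-nucleotide random region described above and the upper bound of 1 × 10¹⁵ library molecules stated in the screening procedure. The function names are illustrative, and treating every molecule as a distinct sequence makes the result an upper bound on coverage.

```python
def sequence_space(random_positions, alphabet_size=4):
    """Number of distinct sequences for a fully randomized region."""
    return alphabet_size ** random_positions


def max_coverage(random_positions, molecules):
    """Upper bound on the fraction of sequence space present in the library.

    Assumes every molecule carries a different sequence.
    """
    return min(1.0, molecules / sequence_space(random_positions))


space = sequence_space(30)            # 4**30 == 2**60
coverage = max_coverage(30, 1e15)     # fraction of that space, at most
print(f"sequence space: {space:.3e}, max coverage: {coverage:.2e}")
```

Under these assumptions the library contains fewer than one in a thousand of the possible 30-mers, so the selection explores only a small sample of the available sequences.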

Enrichment of aptazymes by expression-SELEX
The double-stranded (ds) PCR amplicons of the cleaved products were digested by lambda exonuclease to obtain full-length ssDNA aptazyme candidates. Our experiment showed that exonuclease digestion produced highly pure ssDNA with a single band on the gel, whereas the traditional method, in which an RNA nucleotide embedded in the dsDNA is degraded by NaOH, did not (Supplementary Figure S1; all supplementary figures are in Supplementary file No. S1). In addition, we extended the negative selection step to 12 h to obtain full-length ssDNA with extremely low background cleavage activity. More detailed steps (Figure 1) are described in the Methods.
In addition, the ligand concentration and incubation time were varied between cycles to obtain ssDNA aptazymes with high ligand-binding affinity and specificity (Supplementary Figure S2). The first round started from a commercially acquired ssDNA library. Compared with the cleavage of the second-round ssDNA pool, the cleavage of the ssDNA pool increased gradually in each of the following rounds. The sixth round reached the first enrichment peak, at which approximately 10% of the total ssDNA responded to ligand induction and self-cleaved. We then reduced the concentration of the ligand from 1 mM to 0.5 mM, and after another four rounds of selection, the fraction of cleavage had increased to approximately 20% in the 10th round. From this point, we reduced the incubation time from 20 min to 10 min and further decreased the ligand concentration to 0.25 mM.

Identification of enriched self-cleaving aptazymes and their family members
High-throughput sequencing was employed to analyze the ssDNA sequences in the enriched libraries from the 6th, 14th and 20th rounds of selection (Supplementary files No. S2-S4). The results showed that the four most abundant sequences in these three libraries were the same (Supplementary file No. S1, Supplementary Table S2). The most abundant sequence (II-R1-1) was enriched from 0.9% to 45.6%, the second most abundant sequence (II-R1-2) from 0.14% to 20.9%, and the third most abundant sequence (II-R1-3) from 0.07% to 7.1% (Supplementary files No. S2-S4). We arbitrarily picked candidates II-R1-1, II-R1-3 and II-R1-7 among the top 10 enriched candidates (Supplementary Figure S3A-C) for cleavage assays under ligand induction. The initial results showed that all of them self-cleaved when the ligand was present, whereas almost no cleavage occurred when the ligand was absent, suggesting that they are all allosteric DNAzymes. Subsequently, we used these candidates as starting sequences to search for their homologs among the top 1000 enriched sequences. First, we used BLAST to search for sequences that are > 90% similar to the starting sequence (over the 30 nt random region). Then, the full-length (93 nt) homologs were aligned with Clustal Omega (33). Finally, the secondary structure was predicted by the RNAalifold algorithm (34) based on minimal free energy and nucleotide covariance. To visualize the secondary structure from the Stockholm file obtained from RNAalifold, the R2R program (35) was used to draw the figure (Figure 2). The II-R1-1 aptazyme family contains a large loop at the top and an internal loop between the P1 and P2 stems (Figure S3C). These top candidates contain complex secondary structures, including several stems, loops, and multiway junctions. These variable structures suggest that the library contains many sequences that can cooperate with the embedded DNAzyme to sense the ligand and induce cleavage.

Binding specificity of the aptazymes
To investigate the binding specificity, several L-phenylalanine analogs, including closely related analogs (such as L-tyrosine, L-tryptophan, 4-fluoro-DL-phenylalanine, Boc-4-amino-L-phenylalanine and D-phenylalanine) and less closely related analogs (such as alanine and L-isoleucine), were used for comparison (Figure 3A). The analogs containing a benzyl group appeared to induce the highest fraction of cleavage for the aptazyme II-R1-3. For example, L-phenylalanine, L-tyrosine, and 4-fluoro-DL-phenylalanine induced a fraction of cleavage of 0.325, 0.307 and 0.235, respectively, whereas alanine and L-isoleucine induced only a smaller fraction of cleavage of 0.098 and 0.125, respectively (Figure 3B and Supplementary Figure S3B). Tests of the different compounds on the induction of the aptazymes II-R1-7 (Supplementary Figures S3C and S4) and II-R1-1 (Supplementary Figures S3A and S5) produced similar results, except that II-R1-1 had a higher fraction of cleavage for isoleucine (Supplementary Figure S5). These results indicate that the benzyl group in these compounds is important for recognition by all three aptazymes. However, in an additional experiment, when D-phenylalanine was incubated with the aptazyme II-R1-3, no cleavage was observed (Figure 3B), suggesting that the aptazyme can distinguish between the D and L enantiomers of phenylalanine.

Binding affinity (K_D) of the aptazymes
To investigate the binding affinity of II-R1-3, we incubated the aptazyme with L-phenylalanine and L-tyrosine at concentrations ranging from 0.1 to 100 μM. The apparent dissociation constant (K_D), measured using a previously described method (31), was the concentration of the ligand required to induce 50% of the maximum cleavage. The K_D value for II-R1-3 was 4.8 ± 1.4 μM (mean ± standard deviation) for L-phenylalanine and 6.8 ± 0.7 μM for L-tyrosine (Figure 4). Similar tests were conducted for II-R1-7 (Supplementary Figure S6) and II-R1-1 (Supplementary Figure S7). The K_D values for II-R1-7 were 19.3 ± 2.4 and 118.9 ± 11.7 μM for L-phenylalanine and L-tyrosine, respectively (Supplementary Figure S6), and those for II-R1-1 were 20.3 ± 1.6 and 71.6 ± 9.8 μM, respectively (Supplementary Figure S7). These results indicate that II-R1-7 and II-R1-1 exhibit a higher binding affinity for L-phenylalanine than for L-tyrosine, while II-R1-3 shows a similar binding affinity for these two ligands.
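The comparison between the two ligands can be made explicit as a selectivity ratio, K_D(L-tyrosine) / K_D(L-phenylalanine), computed from the mean values reported above. The short Python sketch below does only this arithmetic. The function and variable names are illustrative, the values are assumed to share one concentration unit (μM, as in the abstract), and the reported standard deviations are not propagated.

```python
# Mean apparent K_D values from the text: (L-phenylalanine, L-tyrosine).
KD_VALUES = {
    "II-R1-3": (4.8, 6.8),
    "II-R1-7": (19.3, 118.9),
    "II-R1-1": (20.3, 71.6),
}


def selectivity(kd_phe, kd_tyr):
    """Ratio of K_D values; values above 1 mean tighter binding of L-phenylalanine."""
    return kd_tyr / kd_phe


ratios = {name: selectivity(*kd) for name, kd in KD_VALUES.items()}

for name, ratio in ratios.items():
    print(f"{name}: K_D(Tyr)/K_D(Phe) = {ratio:.1f}")
```

The ratios come out at roughly 1.4 for II-R1-3, 6.2 for II-R1-7 and 3.5 for II-R1-1, which matches the statement that II-R1-7 and II-R1-1 prefer L-phenylalanine while II-R1-3 binds the two ligands similarly.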

The allosteric DNAzymes obtained by expression-SELEX enable the direct detection of ligand by the cleavage assay
Through the Expression-SELEX cycles, we obtained several aptazymes that can be induced by L-phenylalanine. These allosteric DNAzymes use ligand binding to trigger their self-cleavage, and the level of self-cleavage correlates with the concentration of the ligand. They can be designed into various sensors because the cleavage reaction itself can function as a reporter. By contrast, conventional aptamers must be attached to both a fluorescent group and a quencher to produce a fluorescent signal (36), conjugated with a G-quadruplex/hemin DNAzyme to generate a chemiluminescent signal (37) or a color (38), or connected to electrochemical equipment to yield an electric signal (39) before they can be applied to measure the ligand concentration.

Improving expression-SELEX to obtain high-quality allosteric DNAzymes for phenylalanine recognition
Previously, we implemented a highly active DNAzyme, I-R3, in the random library in the SELEX cycle to select an aptamer together with an expression platform, a method termed Expression-SELEX (31). The isolated allosteric DNAzyme (aptazyme) IR3-I-DNA could bind L-allo-isoleucine with a dissociation constant (K_D) of 0.57 mM (31). In this study, we made several improvements to obtain aptazymes of higher quality. First, we chose the II-R1 DNAzyme as an expression platform because of its slow cleavage rate, which is probably due to a structural deficiency. If ligand binding enables the aptazyme to reorganize and stabilize its structure, the cleavage rate will be improved and a ligand-responsive allosteric aptazyme can be obtained. Second, we used lambda exonuclease to digest the PCR product to obtain highly pure full-length ssDNA during the selection cycle (Supplementary Figure S1). Third, we extended the incubation time for negative selection to 12 h instead of 2 h to eliminate ssDNA molecules that self-cleave in the absence of a ligand. Fourth, we chose L-phenylalanine as the ligand. L-phenylalanine contains a phenyl group that can easily be recognized and bound by the aptamers in riboswitches (40, 41). Finally, we reduced the ligand incubation time and lowered the ligand concentration during the selection cycles. Through these improvements, we obtained many aptazymes that self-cleave upon ligand binding. We arbitrarily picked three candidates among the top 10 enriched candidates for characterization.
The K_D values of these aptazymes are as low as 4.8 μM for L-phenylalanine, whereas they are as high as 118.9 μM for the analog L-tyrosine. These results imply that these aptazymes have a high binding affinity for L-phenylalanine and that some of them (II-R1-7 and II-R1-1) can discriminate the subtle difference between L-phenylalanine and L-tyrosine. In addition, our experiment showed that L-phenylalanine but not D-phenylalanine could induce significant cleavage of the aptazyme, suggesting that the aptazyme can distinguish between the D and L enantiomers.

Cleavage of the selected aptazymes by expression-SELEX is highly inducible by the ligand
The k_obs values for the cleavage of these three aptazymes in the absence of the ligand are very low (< 0.011 min⁻¹), especially for the aptazyme II-R1-1, which has a k_obs value of around 10⁻⁶ min⁻¹. However, when the ligand L-phenylalanine is present, the values increase to as much as 5.6 × 10⁻², 3.6 × 10⁻² and 3.4 × 10⁻² min⁻¹ for these three aptazymes, and one of them was improved by approximately 20 000-fold. These results suggest that the cleavage of these aptazymes can be induced dramatically by the ligand.

Low no-ligand self-cleavage and high ligand-induced self-cleavage might facilitate the enrichment of these allosteric DNAzymes
The II-R1-1 aptazyme is the most abundant ssDNA candidate in the pool during the selection cycles. In the 6th round, it occupied only 0.92% of the total reads. However, it occupied up to 45.61% in the 14th round and up to 73.85% in the 20th round. The reason for its high enrichment is unknown. We noticed that the self-cleavage of II-R1-1 is the lowest when the ligand is not present. However, the improvement of the cleavage induced by the ligand was the highest among all of the tested aptazymes. Therefore, low self-cleavage and high ligand-induced cleavage might account for its high enrichment.
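The enrichment of II-R1-1 can be expressed as a fold change between sequenced rounds and as an average fold change per round. The Python sketch below uses only the three read percentages quoted above. The function names are illustrative, and the per-round figure assumes a constant geometric rate between sequenced rounds, which is a simplification because a read fraction cannot grow beyond 100%.

```python
# Share of total reads (%) for II-R1-1 in the sequenced rounds, from the text.
READ_SHARE = {6: 0.92, 14: 45.61, 20: 73.85}


def fold_enrichment(start_pct, end_pct):
    """Overall fold change in read share between two rounds."""
    return end_pct / start_pct


def per_round_fold(start_pct, end_pct, rounds):
    """Geometric-mean fold change per round (assumes a constant rate)."""
    return fold_enrichment(start_pct, end_pct) ** (1.0 / rounds)


early = fold_enrichment(READ_SHARE[6], READ_SHARE[14])
early_rate = per_round_fold(READ_SHARE[6], READ_SHARE[14], 14 - 6)
late_rate = per_round_fold(READ_SHARE[14], READ_SHARE[20], 20 - 14)
print(f"rounds 6-14: {early:.1f}-fold overall, {early_rate:.2f}-fold per round")
print(f"rounds 14-20: {late_rate:.2f}-fold per round")
```

Under this simplification most of the enrichment occurs between rounds 6 and 14, with a much slower increase afterwards as the sequence approaches dominance of the pool.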

Optimized approach can select various ligand-binding aptazymes without the need to immobilize either ligands or aptazymes
The SELEX procedure, invented in 1990, commonly requires the immobilization of either the ligand or the aptamer to facilitate the selection (15, 16, 28). However, the chemical modifications of both small ligands and aptamers for immobilization are complicated and may interfere with the binding of the aptamer to its ligand. The Expression-SELEX described in this study takes advantage of the self-cleaving DNAzyme and does not require immobilization. When the ligand binds to the aptazyme, the binding induces cleavage, and the cleavage products can be separated and purified by PAGE. After PCR amplification using Taq DNA polymerase (ABM) and ssDNA separation by exonuclease digestion, the aptazyme candidates can be enriched during each SELEX cycle.