Discovery of DNA aptamers targeting SARS-CoV-2 nucleocapsid protein and protein-binding epitopes for label-free COVID-19 diagnostics

The spread of COVID-19 has affected billions of people across the globe, and the diagnosis of viral infection still needs improvement. Because of high immunogenicity and abundant expression during viral infection, SARS-CoV-2 nucleocapsid (N) protein could be an important diagnostic marker. This study aimed to develop a label-free optical aptasensor fabricated with a novel single-stranded DNA aptamer to detect the N protein. The N-binding aptamers selected using asymmetric-emulsion PCR-SELEX and their binding affinity and cross-reactivity were characterized by biolayer interferometry. The tNSP3 aptamer (44 nt) was identified to bind the N protein of wild type and Delta and Omicron variants with high affinity (KD in the range of 0.6–3.5 nM). Utilizing tNSP3 to detect the N protein spiked in human saliva evinced the potential of this aptamer with a limit of detection of 4.5 nM. Mass spectrometry analysis was performed along with molecular dynamics simulation to obtain an insight into how tNSP3 binds to the N protein. The identified epitope peptides are localized within the RNA-binding domain and C terminus of the N protein. Hence, we confirmed the performance of this aptamer as an analytical tool for COVID-19 diagnosis.


INTRODUCTION
With the rapid spread of a novel coronavirus identified as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the number of COVID-19 cases dramatically increased and later became a global pandemic. 1 Thus far, approximately 539 million cases and more than 6.3 million deaths have been confirmed worldwide. 2 To decrease the mortality rate, it is necessary to develop robust and fast diagnostic tools that reflect the early stages of viral infection. According to various manifestations of COVID-19, including asymptomatic infections, a molecular technique such as RT-PCR is primarily used for COVID-19 diagnosis. [3][4][5] Among four structural proteins, including spike surface glycoprotein, envelope protein, membrane protein, and nucleocapsid (N) protein, 6 the genes encoding spike, envelope, and N have been targeted for PCR testing. [7][8][9] Because of the complex pre-treatment, processing time, and the cost of RT-PCR, 10,11 COVID-19 antibody tests, also known as serological tests, have been the only alternative to nucleic acid testing officially approved by the US Food and Drug Administration. 12 The serological test can rapidly identify infected people and nearby contacts. 13 However, serological-based assays, e.g., ELISA and lateral flow immunoassay, essentially rely on the function of antibodies such as immunoglobulin G (IgG)/IgM implemented on the sensing materials, [14][15][16] which can present a high risk of false-negative results caused by inadequate antibodies in clinical samples and inconsistent immobilization of antigen-coated supporting materials, while the varying range of viral load over the infection period can cause inaccuracy of antigen detection. 17 Like antibodies, aptamers selectively bind and recognize target antigens or specific proteins; nonetheless, aptamers are smaller, chemically synthesized, and more stable with ease of chemical modification, and have less batch-in-batch variation compared with antibodies. 18,19 Aptamers are short nucleic acids and peptides produced in vitro through the systematic evolution of ligands by exponential enrichment (SELEX). 20,21 Remarkably, the applications of aptamers for viral diagnostics and therapeutics have been drastically increasing in the past decade since the emergence of the SARS coronavirus in 2002. 22,23 While most studies have been paying attention to the receptor-binding domain and spike protein of SARS-CoV-2, 24-26 SARS-CoV-2 N, an RNA-binding protein for viral genome encapsulation, is abundantly expressed in infected cells with very low genetic variation over time compared with the spike protein. 27,28 In this regard, the N protein could be used as a potential biomarker for early diagnosis. 29 In the early days of the global pandemic, Zhang and co-workers 30 reported the DNA aptamers targeting SARS-CoV-2 N protein and implemented the selected aptamers with an anti-N antibody for N protein detection. Thereafter, several studies used the reported aptamer sequences to implement various types of biosensors aiming to improve the sensitivity and accuracy in detecting SARS-CoV-2 N. [31][32][33] These studies are still ongoing but carry the disadvantages of requiring multiple steps of sample preparation and a lack of understanding of molecular interaction between the N protein and aptamers. In this study, we selected novel single-stranded DNA (ssDNA) aptamers using asymmetric-emulsion PCR-SELEX to target SARS-CoV-2 N protein strategies and sequences different from those of previous studies. The aptamer sequence, possessing the highest affinity to the N protein, was fabricated on biolayer interferometry (BLI) through the biotin-streptavidin interaction without additional labels. This label-free aptamer-based BLI was successfully able to detect the SARS-CoV-2 N protein of the wild type as well as Delta and Omicron variants. Furthermore, we studied the cross-reactivity of the selected aptamers to other SARS-CoV-2 and MERS proteins. In addition to the discovery of DNA aptamers targeting the N protein, the technology of aptamer-facilitated biomarker discovery (AptaBiD) 34 was amenable to identification of the specific binding epitope on the SARS-CoV-2 N protein using our selected aptamer. The aptamer-binding digested peptides were processed through a proteomics approach, and the interaction of the binding epitope and the aptamer was additionally assessed by molecular dynamics (MD) simulation. This study can be used not only to distinguish between SARS-CoV-2 positive and negative clinical samples but also to identify aptamer-protein binding motifs that may also be applied to therapeutic purposes in the future. Hence, the use of this aptamer-based BLI has highlighted the potential of our discovered aptamer as a promising candidate for COVID-19 diagnostics.

Selection of DNA aptamers to SARS-CoV-2 nucleocapsid protein
The recombinant N protein was used as a target for ssDNA aptamer selection using the SELEX procedure shown in Figure 1A. The DNA library, with a 40-nt random region flanked with 20-nt primer-binding sequences at 5 0 and 3 0 ends, was incubated with the N protein immobilized on nickel-nitrilotriacetic acid (Ni-NTA) HisSorb strips. As thoroughly described in materials and methods, the unbound DNA sequences were washed out while the N-protein-binding DNA pool was further amplified by asymmetric-emulsion PCR and purified with a PCR cleanup kit to generate an ssDNA pool subsequently subjected to the next rounds of selection. To eliminate the non-specific sequences, the counter-selection took place in the first, fourth, and sixth rounds of selection before starting the positive selection. After each round of selection, one additional wash was added to remove the weak-binding ssDNA sequences and increase the stringency of the selection, while in addition the number of PCR cycles was optimized to achieve maximum amplification efficiency.
To assess the presence of the desired ssDNA sequences with 80-nt length in each enriched pool from every round of selection, gel electrophoresis was carried out, whereby the results showed two bands of ssDNA as a major product and double-stranded DNA as a minor product located at 40 bp and 80 bp, respectively. After six selection rounds, the enriched pools were subjected to preliminary binding affinity tests using BLI. The binding and dissociation of the target protein and the analytes on the biosensor tips affect the reflected light's interference pattern, indicating the change in molecular thickness on the tip surface. 35 This phenomenon can be reported as a change in wavelength (nm) over time, as in the BLI sensorgrams shown in Figure S1A. All six enriched pools showed relatively higher binding shifts ( Figure 1B) with a 2-fold increase of the association curves compared with the initial N40 DNA library after 300-s partitioning with N protein on the Ni-NTA biosensor tips ( Figure S1B). Therefore. all enriched pools were sequenced and analyzed using next-generation sequencing (NGS).
The NGS results for all six pools were processed using the FASTAptamer toolkit ( Figure S2). 36 Sequences were counted and clustered into families based on the neighboring distance, which were further subjected to the enrichment tool to calculate the fold enrichment of individual sequences across six pools. The retrieved sequences were chosen on the basis of three criteria, namely (1) the total length of 80-82 nucleotides containing 40-42 nt random region flanked with forward and reverse primers, (2) the identical sequences enriched in more than one pool, and (3) the sequence composition without primer-dimer formation in the random region. While three enriched sequences named NSP1, NSP2, and NSP3 were obtained from the cluster-enrichment tool, the count-enrichment tool resulted in 16 unique sequences (Table S1) based on these criteria, including the NSP1, NSP2, and NSP3 aptamer sequences. These 16 unique sequences were then aligned to analyze the differentiation and similarity using Clustal Omega. 37 Relative to the construction of the N40 DNA library with a 40-nt randomized region, the insertion of two nucleotides was observed in two sequences named NSP6 and NSP10. Phylogenetic tree analysis based on primary sequence homology categorized the 16 aptamer sequences into three groups ( Figure S3). Among the 16 unique sequences, the truncated structures of seven aptamers (tNSP1, tNSP2, tNSP3, tNSP5, tNSP9, tNSP10, and tNSP12) representing each subgroup were chosen for further validation of their affinity binding to the N protein. Moreover, the secondary structures were predicted using an RNAstructure web server for designing the truncated motifs of aptamers. 38 Based on the observation of the predicted structures, some additional nucleotides at the 5 0 and 3 0 primer-binding regions remained to conserve the stem-loop structures. This results in different nucleotide lengths of the selected aptamers in the range of 47-60 nt (Table S2). According to the highly enriched characteristic of the NSP1, NSP2, and NSP3 aptamers, the apparent dissociation constant (K D ) values of the full-length structures of these three aptamers were determined using BLI.

K D determination for the selected aptamers
To study the real-time binding assay on the BLI instrument (named Octet N1), serial dilutions of NSP1, NSP2, NSP3, and their truncated structures were prepared in the assay buffer with five different concentrations of aptamer from 200 nM to 12.5 nM. The reference assay was prepared in the absence of an aptamer. The his-tagged N protein was loaded on an Ni-NTA biosensor, after which the sensor tip was dipped into the aptamer solution as an analyte for association and dissociation steps. The results demonstrated the binding of full-length NSP1, NSP2, and NSP3 aptamers to the N protein with K D values of 7.86 ± 1.17, 4.38 ± 1.21, and 8.60 ± 1.06 nM, respectively. The K D values of truncated structures ( Figure S4) of these three aptamers named tNSP1, tNSP2, and tNSP3 were also determined to study the impact of primer-binding regions on the target recognition, giving K D values of 4.53 ± 0.96, 3.04 ± 1.53, and 5.77 ± 1.17 nM, respectively. These results indicate that the primer-binding regions did not significantly affect the binding affinity. Also, the other four aptamers (tNSP5, tNSP9, tNSP10, and tNSP12) were truncated, and their K D values ranged from 6 nM to 22 nM (Table S2). In addition to the binding assay of the selected aptamers, the initial N40 DNA library and the scrambled DNA with 80 nt were tested as controls. The DNA library showed low binding affinity and a low BLI shift (about 0.5 nm) to the N protein and insignificant binding behavior of the scrambled DNA ( Figure S5). As positive controls, the binding affinity of two previously published aptamers (A48 and A58 with reported K D values of 0.49 ± 0.95 and 0.70 ± 0.06 nM, respectively) 30 were measured on Octet N1, and their K D values were 6.46 ± 1.04 and 3.02 ± 0.95 nM, respectively. According to our study, Mg 2+ and Ca 2+ ions played a crucial role in maintaining the binding affinity between DNA aptamers and the protein compared with the Mg/Ca-omitted buffer condition so that these divalent ions were always added into the binding and assay buffers. Owing to the high affinity of the three truncated aptamers (tNSP1, tNSP2, and tNSP3), they were further used for subsequent studies.

Binding evaluation of aptamers to SARS-CoV-2 N protein
To establish the detection of the N protein using aptamers, affinity studies were also performed by individually immobilizing the biotinylated tNSP1, tNSP2, and tNSP3 aptamers on a streptavidin biosensor, SARS-CoV-2 N protein was spiked in PBS with Tween 20 (PBST) buffer with varied protein concentrations of 7.69, 15.38, 30.76, 61.52, and 123.04 nM, and the control assay was prepared in the absence of the protein. There was no significant difference in K D values obtained from attaching either the truncated aptamers or the N protein on the biosensors. The analysis setup with the aptamer-loaded sensor ( Figure 2A) yielded a better curve fit (R 2 = 0.999) with the use of the default binding model (1:1) of the BLI software, which refers to the higher accuracy of the calculated K D . Kinetic binding of these three aptamers ( Figures 2B-2D) showed a potential capability of N protein detection in PBST buffer with K D values of 4.83 ± 1.09, 4.51 ± 1.50, and 2.91 ± 1.02 nM for tNSP1, tNSP2, and tNSP3, respectively ( Table 1).

Assessment of aptamer selectivity for SARS-CoV-2 nucleocapsid
To verify the selectivity of the tNSP1, tNSP2, and tNSP3 aptamers to the SARS-CoV-2 N protein, their binding affinity to other viral proteins was tested, including SARS-CoV-2 S1 protein and the MERS coronavirus nucleocapsid (MERS N). The purity of each protein was visualized on SDS-PAGE ( Figure S6), and each protein displayed a single band representing high homogeneity. All three aptamers exhibited strong binding to SARS-CoV-2 N protein (Figures 3A-3C) with no significant cross-reactive binding to the SARS-CoV-2 S1. Some binding (about 50%) was observed to the N protein of MERS coronavirus, highlighting the higher affinity of these aptamers to SARS-CoV-2. Notably, these three truncated aptamers did not exhibit binding to hexahistidine peptide used as a tag on the recombinant proteins in this study. Considering the lowest K D and highest selectivity to the N protein, the tNSP3 aptamer was chosen for detecting the N protein in human saliva and identifying aptamer-protein binding epitopes.

Assessment of tNSP3 aptamer binding to SARS-CoV-2 nucleocapsid in various conditions
With different sources of the wild-type N protein production, the binding affinity of the tNSP3 aptamer to our in-house expressed N protein using E. coli was tested under the same conditions, and showed a K D of 7.52 ± 0.50 nM ( Figure S7) compared with the commercial recombinant protein modified with glycosylation and expressed in HEK293 cells. In addition to detecting the N protein obtained from wild-type SARS-CoV-2, an affinity study of our tNSP3 aptamer to the Delta and Omicron variants was also performed. The calculated K D values of the aptamer to the Delta and Omicron variants were 0.63 ± 0.30 and 3.55 ± 0.20 nM, respectively ( Figures 4A and 4B). Taking the safety of the clinical laboratory into consideration, viral inactivation is currently required prior to processing the samples reported by other studies. 37,38 Heat inactivation of SARS-CoV-2 virus has been commonly used to treat the virus-containing serum. [39][40][41] Thus, the binding affinity of the tNSP3 to the N proteins in the BLI assay buffer (PBST) was investigated after heating the N protein of Delta and Omicron variants at 65 C for 15 min and 80 C for 5 min. The results showed that the incubation of the variant N proteins at 65 C had decreased about 20% and 40% the binding signal of tNSP3 aptamer to the Omicron and Delta variants, respectively. Incubation at 80 C showed a similar effect on the binding affinity to the N protein of the Omicron variant. On the other hand, this resulted in a 60% loss of binding signal to detect the N protein of the Delta variant compared with the binding signals of non-heated N proteins (Figures 4C and 4D).
Detection of SARS-CoV-2 nucleocapsid protein in human saliva using tNSP3 aptamer To investigate the aptamer-based detection of COVID-19, the tNSP3 aptamer was used to detect the presence of SARS-CoV-2 N protein in human saliva. Concentrated saliva obtained from unspecified-gender pools was diluted in PBST buffer to 10%. The biotinylated aptamer was loaded on a streptavidin biosensor, and the testing saliva sample containing different concentrations of the N protein was completed with the aptamer. The binding measurement was achieved within 8 min per sample, and the control assay was performed with 10% saliva without N protein. Notably, the tNSP3 aptamer could detect the N protein in the human-pooled saliva with K D of 50.9 ± 1.26 nM ( Figure 5A). The binding of the scrambled DNA to the N protein was not observed. In the controlled experiment, the streptavidin biosensor without tNSP3 aptamer was tested for binding with the highest concentration of N protein (123.04 nM) in the 10% saliva. Accordingly, binding of the N protein to the biosensor was not observed, so the non-specific binding was negligible for this aptamer-based detection. Figure 5B shows the increased binding shifts with concentrations of N protein in the range of 7.79-123.04 nM and the limit of detection (LOD) of 4.5 nM or about 6.77 Â 10 11 molecules in 1.125 pmol of the N protein.

Identification of nucleocapsid binding epitope-targeting aptamer by nLC-MS/MS
To identify the binding sites of tNSP3 aptamer on the SARS-CoV-2 nucleoprotein, restricted proteolytic digestion of the aptamer-protein complex was performed, and streptavidin magnetic beads were used to pull down peptides bound to the biotinylated aptamer while nonbinding peptides were washed out ( Figure 6A). The resulting binding peptides were eluted and subjected to nano liquid chromatographytandem mass spectrometry (nLC-MS/MS) analysis. The intensities of the eluted peptides from the tNSP3 aptamer were compared with a scrambled-DNA control experiment. Abundance differences of individual peptides were compared ( Figure 6B) and visualized along the protein sequence ( Figure 6C). The eluted peptide at location 376-385 (denoted position in SARS-CoV-2 nucleoprotein, UniProt: P0DTC9) revealed the highest abundance, and this peptide with the amino acid sequence ADETQALPQR (named AA10) was identified as a possible binding epitope. Apart from the AA10 peptide, another peptide appeared at the location of 69-89 with the amino acid sequence GQGVPINTNSSPDDQIGYYRR (named AA21); however, its abundance difference showed a 2-fold lower abundance compared with the AA10 peptide.

Molecular modeling of the aptamer-protein binding interaction
The primary sequence of tNSP3 aptamer was used to predict its secondary structure by the mFold web server 42 ( Figure 7A). The corresponding tertiary structure of the tNSP3 was modeled using SimRNA 43 and VMD 44,45 programs. After that, 200-ns molecular dynamics (MD) simulations were performed using GROMACS package 46 to obtain the spatial structure of the tNSP3 aptamer in the solution ( Figure 7B). Owing to the high binding affinity of the tNSP3 aptamer, molecular docking for this aptamer was performed to predict the spatial structure of the protein-ligand complex and to reveal the critical nucleotides providing the binding. There is no known experimental structure of the whole N protein to date. In this study, the N protein tertiary structure was predicted by modeling using Iterative Threading ASSEmbly Refinement (I-TASSER) service. 47 The model with the best score referred to a reported modeling template 48 was selected and subjected to 200-ns MD simulations followed by cluster analysis of MD trajectories. The obtained equilibrated structure of the N protein ( Figure 6C) was used for molecular docking. Considering AA10 and AA21 peptides as the binding sites on the N protein, molecular docking was performed using HDOCK web server. 49 As a result, the HDOCK program yielded ten different aptamer-protein complexes for both AA10 and AA21 binding sites. The top-scoring binding poses were taken to study aptamer-protein interaction. The models of these binding sites were further refined using MD simulations to obtain the best structures in the solution condition ( Figures 6D and 6E). Both considered complexes were stable during 200-ns-long MD simulations and showed no dissociation. Two clusters with a large population in MD simulation were chosen for a more detailed evaluation of residue-nucleotide interaction for each complex. Detailed information on the hydrogen bonds is given in Table S3, and both complexes did not reveal a significant difference in the total number of hydrogen bonds. Different nucleotides are involved in the binding with AA10 and AA21 epitopes. For the AA10-binding epitope, the binding was primarily driven by G15, T17, C18, and C19 nucleotides, whereas the binding of the AA21 epitope was mainly due to T32, A22, and T23 nucleotides. All these nucleotides are located in the loop of the tNSP3 aptamer.

DISCUSSION
During viral infection, the nucleocapsid (N) protein was reported to possess abundant expression and high immunogenicity in the infected human cells, which is amendable to be targeted at the early stage of viral infection for COVID-19 diagnosis. 50 Therefore, the N protein has been used as a target for aptamer selection to develop an aptamer-based label-free detection method. Several studies have implemented the N protein-binding aptamers to improve the sensitivity of aptamer-based detections. 51,52 This work's novelty is selecting DNA aptamers binding to SARS-CoV-2 nucleocapsid via asymmetric emulsion PCR, identifying the aptamer-N protein binding epitopes, and using the aptamers in BLI-based detection of the N protein in human saliva. We discovered the novel sequences of DNA aptamers from a synthetic DNA library using the N protein of wild-type SARS-CoV-2 as a binding target. The selection was stopped at the sixth round because of emerging PCR by-products caused by PCR artifacts, such as a primer-dimer and primer partially hybridized DNA, which might interfere with the binding efficiency between the targeted DNA aptamers and the N protein. 53,54 The most enriched sequences over all rounds were considered to retrieve aptamers with high affinity and specificity.
The truncated structures of these selected aptamers (tNSP1, tNSP2, and tNSP3) evinced a high binding affinity to the N protein with a similar range of K D values. The binding affinity of truncated structures was not significantly affected by trimming out the primerbinding region. While aptamers with their full length may form various unfavorable secondary structures, the smaller size of aptamers renders the higher binding affinity due to less steric hindrance of the aptamer structures. 55,56 These findings were consistent with previously reported studies. 57,58 The three truncated aptamers did not exhibit cross-reactivity to the SARS-CoV-2 S1 subunit and hexahistidine peptide. The aptamers revealed 2-fold decreased binding to the nucleocapsid protein of MERS coronavirus. This can be explained by the high level of similarity of peptide sequences between nucleocapsid of SARS-CoV-2 and MERS, as previously reported. 59,60 To prevent false-positive detection of our aptamer-based BLI for MERS N protein, a cutoff diagnostic threshold may be set up by using the binding signal of the aptamer to the MERS N protein as a background signal. With tNSP3 possessing a K D value below 4 nM to wild type as well as Omicron and a K D value of 0.6 nM to the Delta variant, the epitope of N protein binding to the tNSP3 was identified. Notably, the aptamer-binding epitope was localized in the C-terminal domain with high binding affinity, which can explain the specific interaction of the tNSP3 aptamer with SARS-CoV-2 over MERS coronaviruses due to the dissimilarity of the amino acid sequences of the N protein. In addition, the interaction of the AA10-binding site with the tNSP3 was modeled by MD simulation. Interestingly, there are no identical nucleotideamino acid pairs in the aptamer-protein complexes, assuming that an individual nucleotide may preferably bind to the specific amino acids on the binding sites. Even though the tNSP3 aptamer showed partial binding to the AA21 peptide localized in the RNA-binding domain, this might be a non-specific interaction mainly through electrostatic interactions between positive residues on the N protein and a phosphate backbone of the DNA aptamer. The limitation of the peptide-binding method lies in potentially different folding and conformational dynamics of the full-sized protein and digested peptides. This difference can change binding kinetics and decrease the affinity of aptamers.
In addition, the performance of the tNSP3 aptamer was assessed by detecting the SARS-CoV-2 N protein spiked in human saliva. Owing to the high complexity of saliva samples, including metabolites and proteins, it affected the aptamer-protein interaction and increased K D up to 50 nM. The tNSP3 aptamer implemented on BLI has an LOD in the nanomolar range comparable with other aptamer-based detections previously reported in the literature. [61][62][63] Nevertheless, the detection of SARS-CoV-2 N protein using an aptamer-based biosensor still requires sample processing, which has been commonly used in the standard clinical procedure. 64 Once the viral particles are lysed and inactivated with heat treatment, the N protein is released from the viral RNA, which is necessary for aptasensors to detect this protein. In conclusion, we discovered novel DNA aptamers binding to the SARS-CoV-2 nucleocapsid protein and its variants, Delta and Omicron. The selected aptamer named tNSP3 implemented onto the BLI biosensor platform successfully detected the N protein in human saliva. With the high specificity of the tNSP3 aptamer to the N protein, the cross-reactivity to SARS-CoV-2 S1 was negligible; however, the binding of this aptamer to MERS N protein was observed with a 50% reduction. Future work will aim to apply our aptamer-based approach to clinical samples.

Asymmetric-emulsion PCR-SELEX
The selection of the N protein-binding aptamers was initiated by heating 2 nmol of DNA library: First, a counter-selection was performed by directly adding the heated DNA library mixture to an Ni-NTA HisSorp strip (Qiagen, Germantown, MD) without adding SARS-CoV-2 N protein followed by incubation for 30 min at room temperature in the first round of selection. Additionally, the counter-selections were performed in the fourth and sixth rounds, in which the enriched DNA pools from the previous rounds of selection were incubated with SARS-CoV-2 S1 (ACRObiosystems, Newark, DE) coated on the Ni-NTA strips. After the counter-selection was completed, the positive selection was performed by transferring the unbound DNA mixture from the counter-selection to an Ni-NTA HisSorb strip coated with 20 mg/mL his-tagged N protein (ACRObiosystems) and incubating for 1 h at room temperature. The unbound DNAs were then washed twice with 500 mL of the washing buffer (DPBS with 0.5 mM MgCl 2 , 1 mM CaCl 2, and 0.05% Tween 20). The N-protein-binding DNAs on the strip were eluted by incubating with nuclease-free water at 95 C for 10 min. The stringency of the six-round selection was increased by an additional number of washings and the decrease of incubation time of the DNA-protein mixture for 30 min in the fourth to sixth rounds. Also, the ratio of protein to DNA was decreased by two times for each round. Thereafter, the eluted DNA was mixed with an asymmetric PCR mixture (forward-to-reverse primer ratio of 20:1), and the PCR mixture was added dropwise to the emulsion oil while stirring. The optimized PCR amplification was performed for 25 cycles (98 C for 30 s; 25 cycles at 98 C for 30 s, 56 C for 15 s, 72 C for 15 s; and hold at 4 C), and the amplicons were separated from the oil phase by adding 1 mL of isopropanol and centrifuging at 15,000 Â g for 5 min. The aqueous phase at the bottom layer, containing the amplified ssDNA, was then purified with a PCR cleanup kit (New England BioLabs, Ipswich, MA).

Next-generation sequencing and data analysis
After six selection rounds, each ssDNA pool was reamplified by symmetric PCR and purified before NGS. The fastq raw data were analyzed following the guideline from FASTAptamer, 36 as shown in Figure S2. The data representing each aptamer pool were categorized into its cluster, followed by using the FASTAptamer-enrich tool to retrieve the most enriched sequences. After the aptamer sequences were identified, the multiple sequence alignment tool on Clustal Omega was used to evaluate the similarity and differentiation of individual aptamer sequences. The secondary structures of the selected aptamers were then predicted using the RNAstructure web server. 38

Evaluation of nucleocapsid aptamer binding affinity by biolayer interferometry
The binding affinity of aptamer pools and SARS-CoV-2 nucleocapsid protein (ACRObiosystems) was performed using Ni-NTA biosensors and a BLI instrument named Octet N1 (Sartorius, Bohemia, NY). First, the DNA-enriched pools were prepared in the assay buffer (DPBS with 0.5 mM MgCl 2 and 1 mM CaCl 2 , pH 7.4) at a concentration of 500 nM and heated at 95 C for 5 min, then ice-cooled. The binding assay was performed in 0.5-mL light-blocking tubes at 25 C with a total volume of 250 mL. In the next steps, the his-tagged N protein was loaded on Ni-NTA biosensor tips. During the binding assay, a baseline was established in the binding buffer followed by monitoring of the association between the DNA-enriched pools and the N protein on the biosensors. The association step was performed with six different concentrations (0, 12.5, 25, 50,100, and 200 nM) of NSP1, NSP2, and NSP3 aptamers. The assay was carried on Octet N1 in the DPBS buffer at 25 C with a 30-s baseline, 100 s for association, and 150 s for dissociation, and glycine-HCl (pH 1.7) was used for biosensor regeneration. For the non-specific binding test, the 80-nt-scrambled DNA (Sc80) was used as a negative control. The published aptamers A48 and A58 were included in this experiment as positive controls.
To imitate the COVID-19 diagnosis, the binding assay was performed by immobilizing the aptamers on the biosensor tips to detect the presence of the N protein in the assay buffer. To study the effect of heat inactivation on the aptamer-based detection of N proteins, the biotinylated tNSP3 aptamer was immobilized on the streptavidin biosensor tips, and the binding assay of the aptamer to the N proteins of Delta (123.02 nM) and Omicron (123.58 nM) variants that were heated at two different temperature conditions (65 C for 15 min and 80 C for 5 min) was performed on Octet N1 with the same assay buffer mentioned above.

Cross-reactivity of aptamer binding to N protein
The selectivity of three aptamers (tNSP1, tNSP2, and tNSP3) to the N protein was examined by investigating the binding affinity of the aptamers to other viral proteins, including SARS-CoV-2 S1 protein and   by immobilizing the biotinylated tNSP3 aptamer on a streptavidin biosensor tip. As mentioned in the previous assay, the N protein spiked in 10% saliva was detected on Octet N1, and the 10% saliva assay buffer without adding the N protein was used as the background signal. The apparent K D of the aptamer was calculated with Octet N1 software (version 1.3.0.5) using a 1:1 global binding model, and the LOD was also determined based on the standard deviation (SD) of the binding curves and the slope of binding (nm) versus concentration of the N protein (nM): LOD = 3.3SD/slope. 65 Identification of SARS-CoV-2 nucleocapsid and aptamerbinding epitopes using nLC-MS/MS First, 20 pmol of SARS-CoV-2 nucleocapsid in 25 mM HEPES (pH 7.5) was incubated with 100 pmol biotinylated tNSP3 aptamer heat-folded in the incubation buffer (DPBS with 0.5 mM MgCl 2 , 1 mM CaCl 2 , and 0.2 mg/mL tRNA) for 30 min at 25 C. The DNAprotein mixture was then transferred to a Microcon-30kDa filter (Millipore, Burlington, MA) and centrifuged at 14,000 Â g for 10 min to clean up the unbound DNAs. The solution of aptamer-protein complexes retained on the filter was transferred to a 1.5-mL microcentrifuge tube. The aptamer-bound protein complexes were digested by adding 1.5 mL of 0.2 mg/mL trypsin/Lys-C followed by incubation at 37 C for 30 min. The digested aptamer-protein complexes were added to the pre-washed Dynabeads MyOne streptavidin magnetic beads (Invitrogen, Waltham, MA) according to the manu-facturer's protocol. In brief, the biotinylated tNSP3 aptamer-digested peptide complexes were incubated with the streptavidin magnetic beads on a rotator at room temperature for 15 min, and the unbound biotinylated aptamer was washed out twice with the washing buffer (DPBS with 0.5 mM MgCl 2 and 1 mM CaCl 2 ). Next, 100 mL of 8 M urea was added and incubated for 10 min in order to elute the aptamer-bound peptides at room temperature. The eluted peptides were collected, and 2 mL of formic acid was added to the sample which was stored at À20 C until further processing. The digested peptides were cleaned up by desalting on C18 stage tips (TopTip) before injection into the nLC-MS/MS system. Another control sample was initially performed by incubating the heat-folded scrambled DNA (Sc80) with the N protein for 30 min at 25 C before transferring the aptamer-protein complexes to the Microcon-30kDa filter. The digestion of aptamer-protein complexes was performed using trypsin as described previously. Each sample was prepared in triplicates for statistical analysis.  cysteine carbamidomethylation was set as a fixed modification. A minimum peptide length of 6 amino acids was required. The false discovery rate was set to 0.01 for the protein and peptide levels, determined by searching against a reverse sequence database. Enzyme specificity was set as C-terminal to arginine and lysine with a maximum of two missed cleavages. Peptides were identified with an initial precursor mass deviation of up to 10 ppm and a fragment mass deviation of 0.5 Da. The "match between runs" algorithm in MaxQuant 68 was performed between all samples to increase the peptide identification rate. Proteins and peptides matching the reverse database were discarded. For label-free protein quantitation (LFQ), a minimum ratio count of 2 was required. 69 To identify the aptamer epitope on the N protein, the log 2 fold change and a p value for each peptide were computed from the LFQ intensity between eluted fractions obtained from the tNSP3 aptamer and the scrambled DNA. Peptides significantly more abundant (p < 0.05) in the fraction eluted from the aptamer were considered potential epitope peptides. The sequence position of individual peptides was visualized with R and the pOmics and ggplot2 packages.

Molecular dynamics simulation of peptide-aptamer interaction
The secondary structures of the aptamers were predicted on the basis of their sequences using the MFold web server. 70 Protein folding was performed using the I-TASSER server. 47 A model with the highest C score was chosen for further refinement by MD simulations to improve structure quality. Aptamer-protein complexes were obtained by the HDOCK molecular docking web server. 49 MD simulations of the aptamers, N protein, and aptamer-protein complexes were conducted in the same way as previously described 71 by using GROMACS 2019.8 software 46 with the Amber14sb force field 72 and the TIP3P model 73 for water. The negative charge of the complexes was neutralized with Na + ions. Na + and Clions were then added to the system at the concentration of 0.15 M. After MD simulations, clusters of structures and their centers were computed using the quality threshold algorithm 74 with the VMD program. 44

DATA AVAILABILITY
The authors confirm that the data supporting the findings of this study are available within the article and its supplemental information. Additional raw data that support the findings of this study can be made available from the corresponding author upon reasonable request.