Genotyping of the Major SARS-CoV-2 Clade by Short-Amplicon High-Resolution Melting (SA-HRM) Analysis

The genome of SARS-CoV-2, the causal agent of the COVID-19 pandemic, has diverged through multiple mutations since its emergence as a human pathogen in December 2019. Some of these mutations define SARS-CoV-2 clades that appear to differ in regional distribution and other biological features. Next-generation sequencing (NGS) approaches are used to classify the sequence variants in viruses from individual human patients. However, the cost and relative scarcity of NGS equipment and expertise in developing countries hinder studies aimed at associating specific clades and variants with clinical features and outcomes in such territories. As of March 2021, the GR clade and its derivatives, including the B.1.1.7 and B.1.1.28 variants, predominate worldwide. We implemented post-PCR small-amplicon high-resolution melting analysis to genotype SARS-CoV-2 viruses isolated from the saliva of individual patients. This procedure clearly distinguished two groups of SARS-CoV-2-positive samples predicted, according to their melting profiles, to contain GR and non-GR viruses. This grouping was validated by means of an amplification-refractory mutation system (ARMS) assay as well as Sanger sequencing.


Introduction
SARS-CoV-2 causes coronavirus disease 2019 (COVID-19), the severe respiratory disease behind the ongoing pandemic, which was first reported in China in December 2019 [1]. During its replication, SARS-CoV-2 can undergo mutation, a change in the sequence of its genome. In turn, genomes that differ by one or more mutations are called variants. A clade is a group of variants that share a common ancestor, whereas a strain is a variant with a distinctively different phenotype [2,3]. A dynamic lineage nomenclature has been adopted internationally in order to accommodate emerging variants and, at the same time, constrain the number and depth of hierarchical lineage labels [4]. Analysis of viral genomes isolated from affected subjects in many countries has allowed the identification of eight major clades: L (including the first Chinese cases reported), S, O, V, G and its derivatives GH, GV, GR and GRY [5,6,7]. G and its derivatives became by far the dominant clades worldwide around March 2020. The G clade is defined by the D614G mutation within the gene encoding the Spike (S) protein, which binds its receptor in mammalian cells during infection. The D614G mutant virus displays greater infectivity and is associated with greater viral loads [8]. Furthermore, SARS-CoV-2 D614G also shows enhanced replication ex vivo and earlier transmission in in vivo experiments [9].
The genotyping of clades and related variants can be useful for monitoring the population evolution of the pandemic, validating transmission routes, and determining their association with different clinical variables or outcomes [10]. Different next-generation sequencing (NGS) experimental designs have been implemented to study the genomic diversity of SARS-CoV-2 worldwide [11,12], and they can be broadly classified into three main groups: (i) metagenomics (specifically, sequencing aimed at all the RNA or cDNA species present in a sample) and the subsequent identification of viral sequences using bioinformatic approaches [13,14]; (ii) probe-based target enrichment, a strategy that selectively captures SARS-CoV-2-derived library fragments by means of hybridization to a set of probes [15]; and (iii) amplicon-based enrichment, where SARS-CoV-2 sequences are selectively amplified by PCR from RNA sample-derived cDNAs to generate overlapping fragments [16,17]. However, these approaches require highly specialized equipment and personnel that are not readily available in many highly populated territories [18].
Short-amplicon high-resolution melting analysis (SA-HRM) is a post-PCR, closed-tube method that can be performed using widely available qPCR equipment as well as dedicated instrumentation. SA-HRM allows cost-effective genotyping of single-nucleotide differences, as well as other types of variants, and relies on the differential melting profile of a short double-stranded DNA molecule in the presence of an intercalating agent [19].
Here, we report, for the first time, the genotyping of a SARS-CoV-2 clade (GR) by short-amplicon high-resolution melting analysis.

Subjects and Samples
After informed consent and study approval by the institutional ethics and biosafety committees, saliva samples were obtained as previously reported by us [20] from fourteen patients who tested positive for SARS-CoV-2 by RT-PCR, as well as from one patient who tested negative, according to a standard TaqMan qPCR assay of three amplicons within the RdRp (RNA-dependent RNA polymerase), E (envelope protein) and N (nucleocapsid protein) genes [21]. Total RNA was isolated using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's directions. Spectrophotometric UV readings at 260 and 280 nm were used to estimate the concentration (260 nm) and purity (260/280 ratio) of the RNA samples.
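The spectrophotometric estimates described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the standard conversion of 40 µg/mL of single-stranded RNA per A260 unit at a 1 cm path length, and the function names and absorbance readings are hypothetical.

```python
# Standard conversion for single-stranded RNA at a 1 cm path length (assumption:
# the readings were taken under these conditions).
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Estimate RNA concentration; ug/mL is numerically equal to ng/uL."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """Return the 260/280 absorbance ratio used as a purity indicator."""
    return a260 / a280


def volume_for_mass_ul(mass_ng, concentration_ng_per_ul):
    """Volume of sample needed to deliver a given mass of RNA."""
    return mass_ng / concentration_ng_per_ul


# Hypothetical readings for one sample (not taken from the study).
a260, a280 = 1.2, 0.6
conc = rna_concentration_ng_per_ul(a260)
print("concentration (ng/uL):", conc)
print("260/280 ratio:", purity_ratio(a260, a280))
print("volume for 360 ng (uL):", volume_for_mass_ul(360.0, conc))
```

The last helper shows how the estimated concentration would translate into the volume required for a fixed RNA input, such as the 360 ng used for cDNA synthesis.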

cDNA Synthesis
A total of 360 ng of purified total RNA from each saliva sample was reverse-transcribed using the GoScript Reverse Transcriptase kit (Promega Corporation, Madison, WI, USA) with random hexamers, following the manufacturer's instructions, in 20 µL reactions. The cDNA samples obtained were used for the SA-HRM and ARMS assays as well as for Sanger sequencing.

Small-Amplicon High-Resolution Melting Analysis (SA-HRM)
An amplicon of 54 bp (Table 1) was designed for the genotyping of samples belonging to the GR clade of the SARS-CoV-2 virus, using the GR-SA-HRM-F and GR-SA-HRM-R primers. The 3′ end of the upstream (forward) primer was located 5 bp from the GGG/AAC insertion/deletion, whereas the 3′ end of the downstream (reverse) primer was located 1 bp from the insertion/deletion. This asymmetry was needed to ensure a similar optimal annealing temperature for both primers, in order to achieve robust PCR amplification. Aliquots of 1 µL of the cDNA reactions (corresponding to the RNA samples and a negative control) were amplified by PCR with both primers at a final concentration of 200 nM and an initial denaturation step at 95 °C for 3 min, followed by 29 cycles, each with denaturation at 95 °C for 5 s, annealing at 61 °C for 15 s and extension at 70 °C for 10 s. A premelt was performed with a ramp of 20 °C/s, starting at 95 °C and ending at 37 °C. The melting step (0.3 °C/s) started at 55 °C and ended at 95 °C. The PCR, premelt and melting were carried out using a LightScanner 32 Instrument (Biofire Defense, Salt Lake City, UT, USA), employing the LightScanner Master Mix, which contains the intercalating agent LCGreen (Biofire Defense). Traces belonging to the saliva RNA from the SARS-CoV-2-negative patient, as well as to the negative control, were removed once it was verified that they showed no significant peaks in the -(d/dT) relative fluorescence/temperature plots obtained during the SA-HRM. The remaining traces were normalized in the X-axis interval between 78.04 °C and 85.14 °C, with a sensitivity value of −3.00.
Table 1. Specifications of primers and PCR assays employed for the short-amplicon high-resolution melting analysis (SA-HRM), amplification-refractory mutation system (ARMS) and Sanger experiments. Bases in bold highlight the clade discrimination segments of the ARMS primers.
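The two quantities used in the melting analysis, the -(d/dT) trace and the normalized fluorescence within a temperature window, can be illustrated with a short sketch. It is a simplified stand-in for the instrument software, whose algorithm is not described here: the curves are synthetic sigmoids, the melting temperatures of 81.0 and 81.6 °C are hypothetical, and the normalization is a plain min-max rescaling inside the window.

```python
from math import exp


def synthetic_melt_curve(tm, temps, width=0.4):
    """Idealized sigmoidal melt curve: fluorescence falls from 1 to 0 around tm."""
    return [1.0 / (1.0 + exp((t - tm) / width)) for t in temps]


def negative_derivative(temps, fluor):
    """Central-difference estimate of -dF/dT at every interior point."""
    points = []
    for i in range(1, len(temps) - 1):
        d_f = fluor[i + 1] - fluor[i - 1]
        d_t = temps[i + 1] - temps[i - 1]
        points.append((temps[i], -d_f / d_t))
    return points


def melting_temperature(temps, fluor):
    """Temperature at which -dF/dT reaches its maximum."""
    return max(negative_derivative(temps, fluor), key=lambda p: p[1])[0]


def normalize(temps, fluor, t_low, t_high):
    """Min-max rescale the fluorescence values that fall inside [t_low, t_high]."""
    window = [(t, f) for t, f in zip(temps, fluor) if t_low <= t <= t_high]
    f_min = min(f for _, f in window)
    f_max = max(f for _, f in window)
    return [(t, (f - f_min) / (f_max - f_min)) for t, f in window]


# Synthetic data only: a 0.1 degree grid and two hypothetical amplicons.
temps = [78.0 + 0.1 * i for i in range(72)]
early = synthetic_melt_curve(81.0, temps)
late = synthetic_melt_curve(81.6, temps)

print("Tm (earlier-melting trace):", round(melting_temperature(temps, early), 2))
print("Tm (later-melting trace):", round(melting_temperature(temps, late), 2))
normalized = normalize(temps, early, 78.04, 85.14)
print("normalized endpoints:", normalized[0][1], normalized[-1][1])
```

On real traces, the window limits would be chosen so that they bracket the melting transition, as with the 78.04 to 85.14 °C interval reported above.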

Amplification-Refractory Mutation System (ARMS)
An amplification-refractory mutation system (ARMS) assay [22] was designed (Table 1) to selectively amplify cDNA generated from viruses belonging or not belonging to the GR clade, generating products of 649 bp or 295 bp, respectively. Each PCR reaction was prepared using GoTaq Green Master Mix (Promega, Madison, WI, USA) and contained 1 µL of cDNA as well as the GR-ARMS-F, GR-ARMS-R, non-GR-ARMS-F and non-GR-ARMS-R primers, each at a final concentration of 300 nM. An initial denaturation step at 94 °C for 3 min was followed by 40 cycles, each consisting of a denaturation step at 94 °C for 30 s, an annealing step at 55 °C for 25 s and an extension step at 68 °C for 40 s. A final extension was carried out at 68 °C for 3 min. Aliquots of 7 µL of the PCR products were run in 3% agarose gels. The PCR amplification, the agarose gel electrophoresis and the gel documentation were conducted entirely using the miniPCR DNA Discovery System (MiniPCR, Cambridge, MA, USA).
Optical densitometry data of the gel bands were obtained using ImageJ software (NIH, Bethesda, MD, USA). Optical density values from individual SARS-CoV-2-derived bands were normalized by expressing them as ratios of the optical density value of the 350 bp band included in the molecular weight marker.
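The normalization described above amounts to dividing each band's optical density by that of the 350 bp marker band in the same gel. A minimal sketch, with hypothetical sample labels and density values that are not taken from the study:

```python
def normalize_band_density(band_od, marker_od):
    """Express a band's optical density as a ratio of the marker band's density."""
    if marker_od <= 0:
        raise ValueError("marker optical density must be positive")
    return band_od / marker_od


# Hypothetical raw densities (arbitrary units) for one gel.
marker_350bp_od = 1200.0
raw_band_od = {"sample_A": 1800.0, "sample_B": 600.0}

normalized = {
    name: normalize_band_density(od, marker_350bp_od)
    for name, od in raw_band_od.items()
}
print(normalized)
```

Using a marker band from the same gel as the denominator makes ratios from different gels comparable, provided the same amount of marker is loaded each time.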

Sanger Sequencing
An amplicon of 304 bp was designed (Table 1) for PCR amplification of a segment of cDNA spanning the 203_204delinsKR mutation using the GR-Sanger-F and GR-Sanger-R primers. A volume of 10 µL of the cDNA reaction was PCR-amplified using GoTaq Green Master Mix in 100 µL reactions with both primers at a final concentration of 300 nM, and an initial denaturation step at 94 °C for 3 min, followed by 40 cycles, each consisting of a denaturation step at 94 °C for 30 s, an annealing step at 57 °C for 30 s and an extension step at 68 °C for 20 s. A final extension was carried out at 68 °C for 5 min. The PCR products were purified from excised 2.5% agarose gel bands using the QIAquick Gel Extraction Kit (Qiagen) and quantified with a Nanodrop spectrophotometer (Thermo-Scientific, Waltham, MA, USA). A total of 100 ng of purified PCR product was used to generate 20 µL Sanger sequencing reactions with the BigDye Terminator kit v3.1. All four PCR products were sequenced bidirectionally, employing the amplifying primers (GR-Sanger-F and GR-Sanger-R) for this purpose. Sequencing reactions were, in turn, purified employing Centri-Sep columns (Thermo-Scientific). Capillary electrophoresis was run in a Genetic Analyzer 310 (Thermo-Scientific) following the manufacturer's instructions.

Statistical Analyses
Descriptive statistics consisted of the calculation of the mean and standard deviation (SD). Parametric analyses were performed using Student's t test for unpaired samples, with 95% confidence intervals (CI), and normality of residuals was evaluated using the Shapiro-Wilk test (p > 0.05). Correlation was assessed using Pearson's test. Nonparametric analyses were performed by means of the Mann-Whitney U test. Missing values were excluded from these analyses. All statistical plots and calculations were produced using R version 3.6.2.
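For readers who want to see what these tests compute, the statistics behind them can be sketched in a few lines. This is an illustration in Python rather than the R code used for the study, it uses only the standard library, and it stops at the test statistics: p-values and confidence intervals additionally require the t and U reference distributions, which a statistics package would supply. The melting temperatures are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two unpaired samples."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    m_x, m_y = mean(x), mean(y)
    s_xy = sum((xi - m_x) * (yi - m_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - m_x) ** 2 for xi in x)
    s_yy = sum((yi - m_y) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


def mann_whitney_u(a, b):
    """U statistic for sample a: pairs with a_i > b_j count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


# Hypothetical melting temperatures (degrees C) for two groups of samples.
group_1 = [81.0, 81.1, 80.9, 81.0, 81.2]
group_2 = [81.6, 81.7, 81.5, 81.8]

t_stat, dof = student_t(group_1, group_2)
print("t =", round(t_stat, 3), "df =", dof)
print("U =", mann_whitney_u(group_1, group_2))
```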

Results
The main features of the fourteen SARS-CoV-2 positive RNA samples are summarized in Table 2. The normalized SA-HRM traces corresponding to the SARS-CoV-2 positive samples within the relative fluorescence/temperature plot (Figure 1) formed two clearly distinct groups. According to the shape of the traces, the group showing earlier denaturation (9 samples) was predicted to contain viruses belonging to the GR clade (AAC, fewer hydrogen bonds between the two DNA strands), whereas the group with later denaturation (5 samples) was predicted to contain viruses not belonging to the GR clade (GGG, more hydrogen bonds). A statistically significant difference was observed between the melting temperatures of the two groups (p = 9.67 × 10⁻⁵, 95% CI: −0.75 to −0.51 °C) (Figure 2). The non-normalized SA-HRM -(d/dT)/temperature traces indicated that non-GR samples generated weaker signals during the denaturation process (see Figure S1). Each of the ten samples that yielded a distinguishable product in the ARMS assay was classified according to product size (649 bp for the GR clade and 295 bp for the non-GR samples), and the classification was consistent with the grouping suggested by the SA-HRM assay (Figure 3 and Table 3). Optical densitometry of the bands obtained through the ARMS assay showed that band intensity was significantly correlated with RNA sample concentration and purity, but not with the Ct of the viral TaqMan amplicons (Table 4).
Sanger sequencing of four randomly selected samples (two classified as GR and two classified as non-GR) was also consistent with the SA-HRM data (Figure 4). Legible adjacent segments on either side of the insertion/deletion did not reveal additional variation in comparison with the canonical sequence. Interestingly, we observed that samples belonging to the GR clade had significantly smaller Ct values for the TaqMan amplicons of SARS-CoV-2 (Table 5).
Table 5. Comparison of parameters between the GR samples and non-GR samples according to the SA-HRM assay.

Discussion
We present herein the implementation of two molecular biology methods (SA-HRM and ARMS) for the typification of a SARS-CoV-2 mutation that defines the GR clade.
This SA-HRM approach could be adapted to screen for variants of current interest, such as B.1.351 (first observed in South Africa) [23] or the P.1 variant (identified in Japan in travelers from Brazil) [24]. Specifically, within the receptor binding domain (RBD) of the Spike protein, position 417 is occupied by an asparagine (N) residue in the B.1.351 variant, by a threonine (T) residue in the P.1 variant and by a lysine (K) residue in SARS-CoV-2 viruses without RBD mutations. A modification of our SA-HRM assay, with primers located around codon 417 of the Spike ORF, could thus be employed to differentiate among these three situations.
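To illustrate why the three residues at position 417 might be separable by melting analysis, the sketch below compares the G/C content of the corresponding codons. The codon sequences are an assumption not stated in the text (AAG for lysine in the reference genome, AAT for asparagine in B.1.351 and ACG for threonine in P.1), and G/C count is only a rough proxy for duplex stability, so this should be read as a plausibility check rather than a prediction of actual melting behavior.

```python
# Assumed codons at Spike position 417 (not given in the article).
CODON_417 = {"reference": "AAG", "B.1.351": "AAT", "P.1": "ACG"}

# Only the three codons needed for this illustration.
CODON_TO_RESIDUE = {"AAG": "K", "AAT": "N", "ACG": "T"}


def gc_count(sequence):
    """Number of G or C bases, each forming three hydrogen bonds when paired."""
    return sum(1 for base in sequence.upper() if base in "GC")


for variant, codon in CODON_417.items():
    print(variant, codon, CODON_TO_RESIDUE[codon], "G/C bases:", gc_count(codon))
```

Under these assumptions each of the three codons carries a different number of G/C pairs, which is the kind of difference a short amplicon is expected to translate into distinct melting profiles.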
In terms of our SA-HRM assay, the differential quantity of hydrogen bonds between complementary strands of the PCR products predicted a melting pattern with earlier denaturation of GR sample-derived molecules and later denaturation of non-GR-derived molecules. That pattern was observed, and the sample classification was validated by means of the ARMS assay and Sanger sequencing.
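The hydrogen-bond argument can be made concrete by counting Watson-Crick bonds for the two alternative trinucleotides, using two bonds per A·T pair and three per G·C pair. The function name is illustrative, and the count ignores stacking and nearest-neighbor effects that also influence melting.

```python
# Hydrogen bonds formed by each base with its Watson-Crick partner.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(sequence):
    """Total Watson-Crick hydrogen bonds in a fully paired duplex of this sequence."""
    return sum(H_BONDS[base] for base in sequence.upper())


print("GGG (non-GR):", hydrogen_bonds("GGG"))  # 9
print("AAC (GR):", hydrogen_bonds("AAC"))      # 7
```

The GR sequence therefore has two fewer hydrogen bonds across these three positions, in line with the earlier denaturation expected for GR-derived amplicons.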
Interestingly, we observed that, in our relatively small group of subjects, GR samples showed significantly lower Ct values in the diagnostic qRT-PCR assays in comparison to non-GR samples (Table 5). This suggests that there was a smaller quantity of starting material in the non-GR SA-HRM PCRs, which could result in less robust and therefore more variable amplification. Additionally, the non-normalized SA-HRM -(d/dT)/temperature traces showed that non-GR samples generated weaker signals during melting (see Figure S1). These two factors could explain the more variable SA-HRM traces of the non-GR samples upon normalization (Figure 1).
Although it was possible to classify all of our samples using the SA-HRM assay, three of them did not generate a band in the ARMS assay gel that could allow classification (Figure 3 and Table 3). We observed that, in the ARMS gel, the bands showed a broad range of variation and that the optical density of the bands was positively correlated with RNA concentration and negatively correlated with the ratio of absorbance at 260 and 280 nm. Traditionally, it has been considered that a 260/280 ratio of 2.0 indicates an RNA sample of high purity [25]. The RNA isolation kit used employed carrier RNA and an elution buffer that contained sodium azide. It has been shown that both the carrier RNA and the sodium azide can increase the 260/280 ratio significantly [26]. Additionally, the SA-HRM amplicon was 54 bp long, whereas the ARMS amplicons were 295 bp and 649 bp long. RNA integrity, which determines whether longer cDNAs can be synthesized, could therefore be a factor that renders the SA-HRM assay suitable for a broader range of sample quality in comparison to the ARMS assay.

Conclusions
Due to the proofreading enzymatic activity encoded by the SARS-CoV-2 genome, single nucleotide substitutions occur at a slower rate in comparison to other RNA viruses [27]. However, deletions are not subject to this control mechanism [28]. It has also been shown that recurrent deletions within the viral genomic segment encoding the Spike protein confer resistance to neutralizing antibodies [28], therefore potentially diminishing the effectiveness of vaccines. Given its nature, our SA-HRM assay is particularly suited to genotype deletions, a feature that can be used to screen for strains evading active or passive immunity.
Like the PCR amplification, the agarose gel electrophoreses and gel documentation corresponding to our ARMS assay were conducted using the miniPCR DNA Discovery System (see Materials and Methods), an instrument that currently costs under 800 USD. We consider that this represents an alternative for SARS-CoV-2 variant typing of specific mutations in territories with limited equipment availability.

Institutional Review Board Statement: The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board and the Ethics Committee of Hospital Infantil de México Gómez, under protocol code HIM-2020-031 SSA 1661.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: All data generated by this work are freely available upon request.