Introduction and rapid dissemination of SARS-CoV-2 Gamma Variant of Concern in Venezuela

In less than two years since its emergence, SARS-CoV-2, the coronavirus responsible for COVID-19, has accumulated a great number of mutations. Many of these mutations are located in the Spike protein, and some of them confer on the virus higher transmissibility or partial resistance to antibody-mediated neutralization. Viral variants with such confirmed abilities are designated by WHO as Variants of Concern (VOCs). The aim of this study was to monitor the introduction of variants and VOCs in Venezuela. A small fragment of the viral genome was sequenced to detect the most relevant mutations found in VOCs. This approach allowed the detection of the Gamma VOC, whose presence was confirmed by complete genome sequencing. The Gamma VOC has been detected in Venezuela since January 2021; by March 2021 it was predominant in the Eastern and Central regions of the country, and it represented more than 95% of the cases sequenced countrywide in April–May 2021. In addition to the Gamma VOC, other isolates carrying the mutation E484K were also detected. The frequency of this mutation has been increasing worldwide, as shown in a survey of GISAID sequences carrying E484K, and the mutation was detected in Venezuela in many probable cases of reinfection. Complete genome sequencing of these cases allowed us to identify the E484K mutation in association with the Gamma VOC and other lineages. In conclusion, the strategy adopted in this study is suitable for genomic surveillance of variants in countries lacking robust genome sequencing capacities. In the period studied, the Gamma VOC seems to have rapidly become the dominant variant throughout the country.


Introduction
A year and a half after the detection of the emerging coronavirus SARS-CoV-2, COVID-19 has caused more than 200 million cases and more than 4 million deaths worldwide. This virus belongs to the family Coronaviridae. These viruses encode an exonuclease that confers proofreading capacity on the replication machinery, limiting mutational events. However, the very high replication frequency of this virus, together with its high frequency of recombination and the probable action of host deaminases on the viral genome, allows the emergence of many mutations (Bakhshandeh et al., 2021; Pujol et al., 2020), which leads to the generation of viral variants.
Different SARS-CoV-2 variants have emerged since the end of 2020. Some of these variants have been classified by WHO as Variants of Interest (VOIs) or Variants of Concern (VOCs). VOIs carry mutations that might confer on these viruses a specific phenotypic characteristic, such as higher transmission or immune evasion. VOCs are VOIs for which at least one of such characteristics has been confirmed. At present, 4 of the VOIs have been classified as VOCs by WHO (WHO 2021. Tracking SARS-CoV-2 variants. https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/). The grouping of each variant is defined according to the Pango lineage classification (Rambaut et al., 2020). The Alpha VOC (lineage B.1.1.7) emerged in the UK (Challen et al., 2021), the Beta VOC (lineage B.1.351) in South Africa, the Gamma VOC (lineage B.1.1.28.1 or P.1) in Brazil (Campbell et al., 2021; Faria et al., 2021; Tegally et al., 2021; WHO, 2021), and, more recently, the Delta VOC (lineage B.1.617.2) in India (Callaway, 2021). In addition to their increased transmission rate, the latter three VOCs are also more resistant to the neutralizing activity of antibodies produced during natural infection or vaccination (Gobeil et al., 2021).
Many of the key mutations found in the VOIs and VOCs are located in the Receptor Binding Domain (RBD) of the viral Spike protein (S). Mutation N501Y, present in all VOCs except Delta, confers a higher affinity for the cellular ACE2 (Angiotensin-converting Enzyme 2) viral receptor and may be related to an increased transmissibility of VOCs carrying this mutation (Faria et al., 2021; Tegally et al., 2021). Mutation E484K, present in the Beta and Gamma VOCs, has been frequently associated with reinfection cases and might reduce the neutralizing activity of antibodies produced by vaccination (Altmann et al., 2021; Ferrareze et al., 2021). The Delta VOC harbors several mutations, such as L452R and T478K, the former being associated with an increased transmission potential and reduced susceptibility to protective immunity, at both the humoral and cellular levels (Motozono et al., 2021; Tao et al., 2021).
Genomic surveillance has been recommended by WHO, for monitoring the introduction of SARS-CoV-2 VOCs in each country (WHO, 2021). However, not all countries possess a robust genome sequencing capacity, compared to the one available in many developed countries. The aim of this study was the monitoring of VOIs and VOCs in Venezuela. For this purpose, we developed a rapid screening procedure for the detection of variants and VOCs in Venezuela, for the selection of isolates to be definitively characterized by complete genome sequencing. Special attention was also paid to mutation E484K alone, in order to analyze its eventual association with cases of reinfection.

Identification of VOCs or VOIs by partial genome sequencing
This study was approved by the Human Bioethical Committee of IVIC. Nasopharyngeal or nasal swab samples that tested positive by qRT-PCR during routine COVID-19 diagnosis in Venezuela, from the end of October 2020 to May 2021, were analyzed (Fig. 1). Patient identities were kept anonymous. In the suspected cases of reinfection, informed consent was obtained from the patients.
Nested RT-PCR was carried out using Artic primers 70L and 84L (4597 bp) for the first round and 76L and 78R (1004 bp) for the second round (Artic Network, 2021) (Fig. 1), using the SuperScript III One-Step RT-PCR System with Platinum Taq High Fidelity DNA Polymerase for the first round and Platinum Taq High Fidelity DNA Polymerase for the second one (Thermo Fisher Scientific). The PCR conditions were: incubation at 55 °C for 30 min, followed by 94 °C for 3 min and 40 cycles of 94 °C for 15 s, 55 °C for 30 s, and 68 °C for 60 s per kb, with a final extension at 68 °C for 7 min.
Once the Gamma VOC was detected, its spread throughout the country was monitored by a one-step PCR and sequencing of a smaller fragment of 293 bp, using primers 76.1L (5′-CCAGATGATTTTACAGGCTGCG-3′) and 76.8R (5′-GTTGCTGGTGCATGTAGAAGTTC-3′) and the following PCR conditions: incubation at 55 °C for 30 min, followed by 94 °C for 3 min and 40 cycles of 94 °C for 15 s, 55 °C for 30 s, and 68 °C for 30 s, with a final extension at 68 °C for 7 min (Jaspe et al., 2021a). This fragment allows the analysis of amino acids 434-522 of the S protein (Jaspe et al., 2021a).
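The screening logic applied to this small fragment can be illustrated with a minimal Python sketch. It assumes the sequenced fragment has already been translated and covers S residues 434-522 exactly; the function names and the toy fragment are hypothetical and are not part of the published workflow. Because E484K and N501Y are shared by more than one VOC, a positive result only marks a sample as a putative VOC to be confirmed by complete genome sequencing.

```python
# Illustrative sketch only: names and toy data are assumptions, not the authors' code.
FRAGMENT_START = 434  # first S residue covered by the 293 bp amplicon (from the text)
FRAGMENT_END = 522    # last S residue covered by the amplicon (from the text)


def residue_at(fragment: str, position: int) -> str:
    """Return the amino acid found at a given S protein position."""
    if not FRAGMENT_START <= position <= FRAGMENT_END:
        raise ValueError(f"position {position} is outside the fragment")
    index = position - FRAGMENT_START
    if index >= len(fragment):
        raise ValueError("fragment is shorter than expected")
    return fragment[index]


def screen_fragment(fragment: str) -> dict:
    """Flag the two key substitutions tracked in this study."""
    e484k = residue_at(fragment, 484) == "K"
    n501y = residue_at(fragment, 501) == "Y"
    # Both substitutions together suggest a VOC; confirmation needs the full genome.
    return {"E484K": e484k, "N501Y": n501y, "putative_voc": e484k and n501y}


def toy_fragment(residues: dict) -> str:
    """Build a placeholder fragment ('X' everywhere) with chosen residues set."""
    sequence = ["X"] * (FRAGMENT_END - FRAGMENT_START + 1)
    for position, amino_acid in residues.items():
        sequence[position - FRAGMENT_START] = amino_acid
    return "".join(sequence)


if __name__ == "__main__":
    print(screen_fragment(toy_fragment({484: "K", 501: "Y"})))
    print(screen_fragment(toy_fragment({484: "E", 501: "N"})))
```

A real implementation would start from the Sanger trace, align it to a reference S sequence, and handle ambiguous base calls, none of which is shown here.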
PCR-purified fragments (from the second round of the first strategy or the small fragment of the second strategy) were sent to Macrogen Sequencing Service (Macrogen, Korea) for sequencing.

Complete genome sequencing of selected isolates
RNA of samples selected for complete genome sequencing was extracted using the QIAamp Viral RNA Mini Kit (Hilden, Germany). Nested RT-PCR was carried out using the SuperScript III One-Step RT-PCR Platinum Taq HiFi System (Invitrogen, Thermo Fisher Scientific, USA) with specific Artic primers (Artic Network. SARS-CoV-2. https://artic.network/ncov-2019, accessed April 5th, 2021) to produce 16 overlapping amplicons of approximately 2000 bp. These amplicons cover the entire genome and were detected by electrophoresis on a 2% (w/v) agarose gel. Equal amounts of each amplicon were used to generate a pool, and DNA pool inputs between 250 and 500 ng were used to generate sequencing libraries. Libraries were prepared according to the Illumina DNA Prep Reference Guide (Document # 1000000025416 v09), using the Nextera DNA CD Indexes 24 indexes-24 samples (Illumina, Inc. USA). The libraries were quantified (Qubit DNA HS, Thermo Scientific), checked for quality (Bio-Fragment Analyzer, Qsep1-Lite, BiOptic), and pooled before sequencing. Sequencing was performed with 10% PhiX control v3 and a loading concentration of 200 pM, using an iSeq 100 platform and a 300-cycle V2 kit with paired-end sequencing. Viral genome assembly and variant assignment were performed with the Dragen COVID-19 program. Nucleotide sequences of complete genomes have been deposited in the GISAID database and in GenBank under the accession numbers MZ611956-MZ611975.
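The equal-mass pooling step can be sketched as a small calculation. The example below is an illustration under stated assumptions: the function name, the default target of 400 ng, and the concentrations are hypothetical, and only the 250-500 ng input range comes from the text.

```python
# Illustrative sketch only: the target mass and concentrations are assumptions.
def pooling_volumes(concentrations_ng_per_ul: dict, total_mass_ng: float = 400.0) -> dict:
    """Volume (in microliters) of each amplicon needed for an equal-mass pool."""
    if not 250.0 <= total_mass_ng <= 500.0:
        raise ValueError("pool input should be between 250 and 500 ng")
    if not concentrations_ng_per_ul:
        raise ValueError("at least one amplicon is required")
    # Every amplicon contributes the same mass to the pool.
    mass_per_amplicon = total_mass_ng / len(concentrations_ng_per_ul)
    volumes = {}
    for name, concentration in concentrations_ng_per_ul.items():
        if concentration <= 0:
            raise ValueError(f"amplicon {name} has no measurable DNA")
        volumes[name] = mass_per_amplicon / concentration
    return volumes


if __name__ == "__main__":
    # Sixteen hypothetical amplicons, all measured at 12.5 ng/uL.
    amplicons = {f"amplicon_{i:02d}": 12.5 for i in range(1, 17)}
    for name, volume in pooling_volumes(amplicons).items():
        print(f"{name}: {volume:.2f} uL")
```

With 16 amplicons and a 400 ng target, each amplicon contributes 25 ng, so a weakly amplified product simply requires a larger volume.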

Phylogenetic analysis
Sequence alignment was performed by the MUSCLE algorithm and the phylogenetic analysis by the Maximum Likelihood method (1000 bootstrap replicas) with MEGA7 (Kumar et al., 2018). The evolutionary history was inferred by using the Maximum Likelihood method based on the General Time Reversible model. The tree with the highest log likelihood (−45,645.42) is shown. The percentage of trees in which the associated taxa clustered together (bootstrap) is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with a superior log likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (3 categories (+G, parameter = 0.0500)). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 70 nucleotide sequences. Codon positions included were 1st + 2nd + 3rd + Noncoding. There was a total of 29,725 positions in the final dataset. Reference sequences from different lineages were included in the phylogenetic analysis and lineages assigned with the Pango Lineage Assigner (Rambaut et al., 2020).
Fig. 1. Workflow strategy for genomic surveillance of VOCs in Venezuela. A nested PCR product, amplifying a fragment of the SARS-CoV-2 genome, was sequenced for genomic surveillance of viral variants. Once the first Gamma VOC was identified, the original method was substituted by a more rapid method (one-step PCR of a smaller fragment) focused on detection of the characteristic mutations of Gamma VOC. Some samples from the first (n = 6, 5 of which were Gamma VOCs) and second (n = 13, 11 of which were Gamma VOCs) partial genomic sequencing methodology were selected for complete genome analysis.

Statistical analysis
Statistical differences were evaluated by the Student t-test, the Chi-Square test with Yates correction, or the Fisher Exact test (when a count was under 5). Chi-Square tests were done using the Epi Info program, version 3.5.3 (Centers for Disease Control and Prevention, Atlanta, GA). P values less than 0.05 were considered significant.
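For readers without access to Epi Info, the choice between the two tests for a 2x2 table can be sketched in plain Python. This is an illustration, not the software used in the study: the function names are hypothetical, the rule "Fisher when any observed cell is under 5" is our reading of the criterion stated above, and the two-sided Fisher p value follows the common convention of summing all tables no more probable than the observed one.

```python
import math

# Illustrative sketch only: function names and the cell-count rule are assumptions.


def yates_chi_square(a: int, b: int, c: int, d: int):
    """Chi-square test with Yates correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one observation")
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / denominator
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def probability(k: int) -> float:
        # Hypergeometric probability of seeing k in the top-left cell.
        return math.comb(col1, k) * math.comb(n - col1, row1 - k) / math.comb(n, row1)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    observed = probability(a)
    tail = [probability(k) for k in range(lowest, highest + 1)]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in tail if p <= observed * (1 + 1e-9)))


def compare_proportions(a: int, b: int, c: int, d: int):
    """Pick the test according to the smallest observed cell count."""
    if min(a, b, c, d) < 5:
        return "fisher", fisher_exact_two_sided(a, b, c, d)
    return "chi-square-yates", yates_chi_square(a, b, c, d)[1]


if __name__ == "__main__":
    print(compare_proportions(3, 1, 1, 3))
    print(compare_proportions(10, 10, 10, 10))
```

Many textbooks apply the threshold of 5 to expected rather than observed counts; the sketch can be adapted by changing the condition in `compare_proportions`.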

Results
A partial sequencing methodology was developed to allow the screening of a large number of samples for genomic monitoring of variants (Fig. 1). The first screening was performed by two-round (nested) PCR amplification of a fragment covering amino acids 420 to 752 of the SARS-CoV-2 Spike protein, allowing us to detect key mutations associated with VOCs. With this first strategy, among the first 245 isolates for which a sequence was obtained, 29 carried both mutations E484K and N501Y, and one carried only the mutation E484K.
This nested PCR strategy failed to amplify around 25% of the samples with Ct values between 25 and 30, and also some samples with Ct values below 15, the latter probably because of PCR inhibition or sample integrity problems. In addition, the methodology required two rounds of amplification. Thus, once the putative VOCs were detected, a single-round RT-PCR strategy was adopted to amplify a shorter fragment, allowing the analysis of amino acids 434-522 (Fig. 1). This strategy was more suitable for determining the prevalence of the Gamma VOC and monitoring the circulation of other putative variants. With this one-step PCR method, sequences could be obtained for more than 95% of the samples with Ct below 30.
The presence of VOCs in symptomatic infections was evaluated in a group of 245 individuals for whom a sample sequence was available, to determine the influence of sampling bias on the prevalence of VOCs. A total of 79% of these samples (194/245) were from symptomatic persons, with mild to severe COVID-19. No significant difference was observed in the prevalence of the Gamma VOC between symptomatic and asymptomatic individuals (Table 1). Another possible bias that might alter the frequency of VOCs in the samples tested is that samples with Ct higher than 30 were excluded from the analysis. A significantly lower Ct value was observed for Gamma VOC samples, for one fluorophore. The reduction was of only 1 cycle, suggesting an average twofold increase in viral concentration (Table 2). Since the difference in Ct values observed between Gamma VOC and non-VOC samples was so small, it does not seem to introduce a bias in the frequency of VOCs detected.
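The link between a 1-cycle Ct difference and a twofold change in viral concentration follows from the exponential nature of PCR. The short sketch below illustrates the arithmetic; it assumes ideal amplification (template doubling at every cycle), and the function name and the `efficiency` parameter are hypothetical.

```python
# Illustrative sketch only: assumes ideal doubling unless another efficiency is given.
def fold_change_from_delta_ct(delta_ct: float, efficiency: float = 1.0) -> float:
    """Relative template amount implied by a Ct difference.

    efficiency = 1.0 means the template doubles at every cycle.
    """
    return (1.0 + efficiency) ** delta_ct


if __name__ == "__main__":
    # A sample detected 1 cycle earlier holds about twice as much template.
    print(fold_change_from_delta_ct(1.0))
```

Real assays rarely reach perfect efficiency, so the twofold figure is best read as an approximation.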
Complete genome analysis of selected samples confirmed the presence of Gamma VOC (lineage P.1) circulating in Venezuela (Fig. 2). A total of 16 complete genomes were obtained from this lineage, from samples analyzed with both PCR strategies (Fig. 1). They displayed more than 99.9% identity between them.
The predominance of the Gamma VOC was monitored through time. The first isolate was detected at the end of January 2021. A rapid increase in the frequency of this VOC was observed from January to April-May. In March 2021, more than 95% of the tested isolates belonged to this lineage in Bolivar State. Frequencies of less than 95% for the Gamma VOC were found in some Western states of the country, in agreement with a possible introduction of this VOC through the Eastern Bolivar State (Fig. 3). In April-May, the frequency of the Gamma VOC was 96% in the country, and the VOC was more frequent in states from the Eastern and Central regions (Fig. 4).
Table 3 describes some documented cases of suspected reinfection despite vaccination. A suspected reinfection case was defined as a case with two PCR-positive detections separated by more than 3 months. Unfortunately, the samples from the first episode were not available for sequencing, so the cases were defined as suspected and not confirmed reinfections. In all of them, the infecting virus carried the mutation E484K, either in a Gamma VOC or, in one sample (E3344), alone (Table 3). Only the S genomic region could be sequenced for the E3344 sample, so the sample could not be assigned to a specific lineage. The S sequence from E3344 was closely related to the D1444 isolate (B.1) and to sequences from Brazil available at GISAID (EPI_ISL_2017450, Brazil, P.1; EPI_ISL_2241594, Brazil, P.2; and EPI_ISL_2466142, Brazil, P.2). The E484K mutation without N501Y was found in a total of 8 cases out of the 1736 samples analyzed in this study (0.5%). In the other 7 cases, no evidence of a previous infection was found. Complete genome sequences for 3 of these cases showed that these isolates belonged to lineages B.1, B.1.111 and B.1.526 (Fig. 2).
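The working definition of a suspected reinfection used above can be written as a simple date check. The sketch below is illustrative: the function name is hypothetical, and "more than 3 months" is approximated as more than 90 days, which is our assumption rather than a threshold stated in the study.

```python
from datetime import date

# Assumption: "more than 3 months" is approximated as more than 90 days.
MIN_INTERVAL_DAYS = 90


def is_suspected_reinfection(first_positive: date, second_positive: date,
                             min_days: int = MIN_INTERVAL_DAYS) -> bool:
    """True when two PCR-positive detections are far enough apart in time."""
    return (second_positive - first_positive).days > min_days


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(is_suspected_reinfection(date(2020, 10, 1), date(2021, 2, 1)))
    print(is_suspected_reinfection(date(2021, 1, 1), date(2021, 2, 1)))
```

As in the study, such a case remains "suspected" unless the virus from both episodes is sequenced and shown to differ.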
The presence of the E484K mutation in the virus (from a Gamma VOC or another lineage) infecting some cases of suspected reinfection in our study prompted us to analyze the presence of this mutation over time in the sequences available worldwide in the GISAID database, to evaluate whether the frequency of this mutation could be related to reinfection (Fig. 5). Although the mutation emerged in early 2020 (first report in February 2020), its prevalence only increased significantly in 2021 (Fig. 5A). Since a high number of GISAID database sequences are from VOCs, the prevalence of this mutation was also analyzed in sequences other than the Alpha, Beta, or Gamma VOCs, which are very prevalent in the database. An unexpected increase in the frequency of sequences carrying the E484K mutation was also found among non-VOC isolates (Fig. 5B).
In order to assess the contribution of the E484K mutation to the incidence of reinfections, a review of reinfection cases was conducted until June 1st, 2021, as described in Materials and Methods. A total of 89 reported cases of reinfection were identified worldwide, 41 of which were supported by a preprint or a peer-reviewed article (Fig. 6). For the cases in which severity could be compared between episodes, most second episodes were similar or less severe, except one with a fatal outcome, raising the possibility of partial immune protection, although no information on their immune status was available. Most of them exhibited a different clade or lineage between the initial infection and the reinfection.
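The GISAID survey summarized in Fig. 5 amounts to counting, for each month, the fraction of sequences that carry E484K, with or without the VOC lineages. The sketch below shows one way to do this in Python; the record layout (collection date, lineage, set of S substitutions), the function name, and the toy records are hypothetical and do not reproduce the GISAID export format or the authors' procedure.

```python
from collections import defaultdict

# Illustrative sketch only: record layout and toy data are assumptions.


def monthly_mutation_frequency(records, mutation="E484K", excluded_lineages=()):
    """Fraction of sequences per month (YYYY-MM) carrying a given substitution.

    Each record is a (collection_date, lineage, substitutions) tuple, where
    collection_date is an ISO string such as "2021-03-15".
    """
    totals = defaultdict(int)
    carriers = defaultdict(int)
    for collection_date, lineage, substitutions in records:
        if lineage in excluded_lineages:
            continue  # e.g. leave out VOC lineages, as in Fig. 5B
        month = collection_date[:7]
        totals[month] += 1
        if mutation in substitutions:
            carriers[month] += 1
    return {month: carriers[month] / totals[month] for month in sorted(totals)}


if __name__ == "__main__":
    toy_records = [
        ("2021-01-10", "B.1", {"D614G"}),
        ("2021-01-20", "P.1", {"E484K", "N501Y"}),
        ("2021-02-05", "B.1.526", {"E484K"}),
        ("2021-02-07", "B.1", {"D614G"}),
    ]
    print(monthly_mutation_frequency(toy_records))
    print(monthly_mutation_frequency(toy_records, excluded_lineages=("B.1.1.7", "B.1.351", "P.1")))
```

Passing the Alpha, Beta, and Gamma lineage names in `excluded_lineages` mirrors the non-VOC analysis, so a rise in the second output cannot be attributed to those VOCs alone.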
Reinfection with SARS-CoV-2 genomes harboring E484K occurred only in Brazil, across geographically diverse regions and among patients spanning a broad distribution of ages and baseline health status, whose initial and second infections were, respectively, asymptomatic or mild and mild/severe. The interval between the initial infection and the reinfection ranged from 24 to 224 days. The E484K substitution was found mainly associated with the Gamma VOC and lineage P.2.

Discussion
SARS-CoV-2 variants are emerging and spreading rapidly in several parts of the world (Boehm et al., 2021). Many of the key mutations found in the VOCs are located in or near the RBD. For this reason, we adopted a strategy focused on Sanger sequencing of a small region of the SARS-CoV-2 genome encoding the RBD. This strategy allowed us to process a large number of samples, with a special focus on amino acid positions 484 and 501 in the S protein, but also 452.
The presence of any of these mutations does not necessarily mean that a VOC is circulating in a specific country, nor that the isolate carrying them will gain the phenotypic advantages these mutations provide in VOCs, such as the increased transmissibility attributed to N501Y. Many other mutations present in these VOCs (point mutations and deletions) contribute to their enhanced transmissibility (Altmann et al., 2021; Gobeil et al., 2021). Indeed, these mutations, particularly E484K, have emerged in many other lineages, as previously reported (Ferrareze et al., 2021). However, the specific detection of these two mutations allowed us to rapidly detect the introduction of the Gamma VOC. This introduction was confirmed by complete genome sequencing of 16 samples. All the sequences were closely related, suggesting few introductions of the VOC followed by rapid dissemination throughout the country, despite national mobility restrictions (Lampo et al., 2021). After its detection in January 2021, it took 2 months for the Gamma VOC to predominate in most parts of the country. It cannot be ruled out that the studied sample set has some bias, introduced by the fact that most of the samples were from symptomatic individuals and had Ct values lower than 30. However, an analysis of a group of samples showed that the Gamma VOC does not seem to be more prevalent among symptomatic patients. Only a minimal difference in Ct value was observed between the Gamma VOC and the other circulating isolates. A small reduction in Ct value was also observed in Brazil for this VOC, but this difference was not observed when adjusting for the time of onset of symptoms (Faria et al., 2021). A lower Ct value was, however, found in another report from Brazil.
Rapid emergence and predominance of the Gamma VOC were observed in Manaus some months before Venezuela (Faria et al., 2021), evolving from a local B.1.1.28 clade in late November 2020 and replacing it in less than two months in the Amazon. Wang et al. (2021) also reported that the Gamma VOC is relatively resistant to neutralization by multiple therapeutic monoclonal antibodies, convalescent plasma, and sera from vaccinated individuals. This ability is due to a combination of the amino acid substitutions present in S and a particular conformation with one of the RBDs in an "up" position (Wang et al., 2021). Indeed, infection with the Gamma VOC was found in some vaccinated individuals or cases of reinfection.
Specifically, the E484K mutation is associated with antibody evasion in the Beta and Gamma VOCs (Dejnirattisai et al., 2021), but also in some other lineages (Jangra et al., 2021; Nonaka et al., 2021). In our study, in the 6 cases of suspected reinfection, all the isolates from the last infection carried the mutation E484K, either belonging to the Gamma VOC or associated with another lineage. The frequency of this mutation has been increasing in Brazil and worldwide, not only in VOC samples, as shown in this study and another one (Ferrareze et al., 2021). This mutation has also been found associated with COVID-19 cases despite vaccination (Fabiani et al., 2021). Three sequences carrying the mutation E484K but not classified as Gamma VOC belonged to three different lineages: B.1, B.1.111 and B.1.526. According to the Pangolin database, B.1 is a large European lineage whose origin roughly corresponds to the Northern Italian outbreak early in 2020, and it is widely present throughout the world. The B.1.111 lineage is described as a South/Central American lineage and, in addition to the USA, has been found mostly in Trinidad and Colombia. Lineage B.1.526 has been predominantly circulating in New York but has also been detected in Aruba, Colombia, and Ecuador (Rambaut et al., 2020). Lineage B.1.526 was identified in New York, USA, in November 2020, and its prevalence increased sharply. Approximately one half of the B.1.526 sequences carry the E484K mutation. However, this lineage, with or without the E484K mutation, was not more associated with cases of reinfection when compared to other lineages.
A worldwide increase in the frequency of the E484K mutation in the sequences was observed during the second year of the pandemic, when reinfection cases were expected to be more common (Fig. 5). The effect of this mutation on escape from neutralizing antibodies and its observed increase in frequency led us to suspect that reinfection cases might be associated with the presence of this mutation. Contrary to our expectations, documented cases of reinfection worldwide were not associated with the presence of the E484K mutation, even during the period of March-May 2021, when the frequency of this mutation increased significantly (Fig. 5). The only exception is Brazil, where some reinfection cases were associated with the presence of the E484K mutation (Fig. 6). However, the E484K mutation is very frequent in Brazilian sequences, either in the Gamma VOC or in the P.2 or B.1.1.33 lineages (Ferrareze et al., 2021; Goes et al., 2021). Thus, the presence of the E484K mutation in reinfection cases may only reflect the high frequency of circulation of this mutation in the country.
In Venezuela, however, although only a limited number of suspected cases of reinfection or infection despite vaccination were analyzed, all of them were infected with viruses carrying this mutation, in agreement with the ability of this substitution to partially escape protective immunity. However, as in Brazil, the presence of E484K mutation in the suspected reinfection cases may be only the reflection of the high frequency of circulation of Gamma VOC in the country, although, in one case, the infecting virus was not the Gamma VOC and carried the E484K mutation associated with another lineage.
In conclusion, the present methodology allowed us to detect the introduction and rapid dissemination of the Gamma VOC in Venezuela. The use of the selected methodology for genomic surveillance presents some limitations, since it is mainly based on the detection of key amino acids such as those in positions 452, 478, 484, 490, and 501. Another amino acid position, 417, missing in this fragment, is also important to differentiate the Gamma, Beta, Alpha with E484K, and Mu variants. The methodology described in this study was sufficient to detect the first variant introduced in Venezuela, or at least the first which disseminated and could be detected, i.e. the Gamma VOC. However, since then, other variants have been introduced in Venezuela: Alpha, Delta, Lambda, and Mu have been detected later (Jaspe et al., in preparation). For a more accurate rapid detection of mutations useful for the initial identification of putative variants, we adopted the amplification and sequencing of a slightly larger fragment of 593 nt, which also allows the detection of the amino acid in position 417. However, the smaller fragment used in this study is still useful to us for an even more rapid detection (in one day) of the most frequent variants circulating in Venezuela, by restriction enzyme analysis to detect the E484K (Jaspe et al., 2021a) and L452R (Jaspe et al., 2021b) mutations. Continuous monitoring is under way to detect the eventual introduction of other variants in the country and their temporal dynamics in each region.

Funding
This study was supported by Ministerio del Poder Popular de Ciencia, Tecnología e Innovación of Venezuela.

Credit author statement
Rossana C Jaspe, Esmeralda Vizzi: Investigation, Formal analysis,

Fig. 6. Geographic distribution of the genetically confirmed reinfection cases worldwide. The analysis included 89 reported cases of reinfection worldwide, with SARS-CoV-2 genomes confirmed by sequencing, 41 of which were supported by a preprint or a peer-reviewed article: 13/89 (14.6%) of them harbored the mutation E484K, all of them from Brazil. (*) 5/13 supported by a preprint or a peer-reviewed article.

Declaration of competing interest
The authors declare that they have no conflict of interest.