COVID-19 epidemic in the Brazilian state of Amazonas was driven by long-term persistence of endemic SARS-CoV-2 lineages and the recent emergence of the new Variant of Concern P.1

The Northern Brazilian state of Amazonas is one of the regions of the country most heavily affected by the COVID-19 epidemic and experienced two waves of exponential growth, in early and late 2020. Through a genomic epidemiology study based on 250 SARS-CoV-2 genomes from different Amazonas municipalities sampled between March 2020 and January 2021, we revealed that the first exponential growth phase was driven mostly by the dissemination of lineage B.1.195, which was gradually replaced by lineage B.1.1.28. The second wave coincided with the emergence of the variant of concern (VOC) P.1, which evolved from a local B.1.1.28 clade in late November and rapidly replaced the parental lineage in less than two months. Our findings support that successive lineage replacements in Amazonas were driven by a complex combination of variable levels of social distancing measures and the emergence of a more transmissible VOC P.1 virus. These data provide unique insights into the mechanisms that underlie the COVID-19 epidemic waves and the risk of dissemination of SARS-CoV-2 VOC P.1 in Brazil and potentially worldwide.

COVID- 19 Table 1), with a changing temporal prevalence over time (Fig. 1c). Lineage B.1.195 was the most prevalent variant during the first exponential growth phase. However, its prevalence gradually decreased after the first epidemic peak in early May and was surpassed by that of lineage B.1.1.28. This lineage persisted as the most prevalent one from May to December 2020, when the second lineage replacement took place, coinciding with the second phase of exponential growth. The VOC P.1 was first detected on 4th December 2020 in Manaus and displayed an extremely rapid increase in prevalence up to January 2021.
To better estimate the temporal trajectory of the P.1 emergence in the Amazonas state in late 2020 and early 2021, we designed a real-time PCR assay to detect the deletion at orf1b (NSP6: S106del, G107del, F108del), which is a genetic signature of the VOCs (P.  Table 2). Furthermore, the frequency of MVs observed in samples taken during the early (March-September) and late (October-January) epidemic phases was comparable (Extended Data Fig. 1).
Differences in the epidemic trajectory of major SARS-CoV-2 Amazonian clades. Reconstruction of the spatiotemporal dissemination dynamics using a Bayesian phylogeographic approach supports that the early prevalent local clade 195-AM probably emerged in mid-March 2020 in the city of Manaus (Supplementary Table 3) and rapidly spread to other municipalities of the metropolitan region and also to municipalities located up to 1,100 km from Manaus, at the border with Peru, Colombia and Venezuela (Figs. 4e and 4f). These analyses further traced the most recent common ancestor of lineages P.1 and P.1-like to the city of Manaus in late August (Supplementary Table 3).
We next applied the birth-death skyline (BDSKY) model to estimate the effective reproductive number (Re) of the Amazonian clades with more than 40 genomes. The estimated Re trajectories closely matched the relative prevalence of lineages and the social distancing metrics (Fig. 5). The Re of clade 195-AM was high (2.6, 95% HPD: 1.6-3.8) in March, but displayed a steep decrease to 1.0 (95% HPD: 0.8-1.2) in April, coinciding with an increase of social distancing above 50% in Manaus. Clade 28-AM-I, which was estimated to have emerged in municipalities of the Amazonas countryside, also presented a high Re (2.1, 95% HPD: 1.2-3.4) during its initial spread, which fell to 0.9 (95% HPD: 0.7-1.2) in May, when the social distancing index reached 50% in the interior of Amazonas state. The increasing relative prevalence of clade 28-AM-I over clade 195-AM from April to June agrees with the estimated Re differences during April. From June to August 2020, the Re of clades 195-AM and 28-AM-I remained roughly stable around 1.0, as did their relative prevalence. When the social distancing index decreased to below 40% in September 2020, clade 195-AM apparently became extinct, while the Re of clade 28-AM-I increased to 1.2 (95% HPD: 0.9-1.6) and then remained roughly stable above 1.0 up to the end of 2020, leading to an increasing prevalence of clade 28-AM-I between September and November 2020. The lineage P.1 arose in late November and displayed a high Re of 2.6 (95% HPD: 1.5-4.5) during December 2020, becoming the predominant lineage. With the consequent increase in social distancing after the health system collapse, this VOC's Re was estimated to decrease to 1.2 (95% HPD: 0.9-1.6) in late December and January.

Discussion
The present study is the most comprehensive SARS-CoV-2 genomic investigation performed to date in Amazonas, one of the Brazilian states hit hardest by the COVID-19 pandemic. Our genomic analyses revealed that most Amazonian cases were driven by the successful dissemination of a few local viral clades that together comprise 77% of the 250 SARS-CoV-2 Amazonian genomes sampled here between March 2020 and January 2021. Early major SARS-CoV-2 Amazonian clades arose in Manaus or its metropolitan region between mid-March and late April 2020 and were widely disseminated within the Amazonas state, reaching the most isolated inner localities. By contrast, we found almost no evidence of dissemination of early local SARS-CoV-2 Amazonian lineages outside the state, supporting that Amazonas was not a major hub of viral dissemination within Brazil during 2020. The low land accessibility of major Amazonian cities from other Brazilian states, combined with the considerable reduction in tourism activities and air traffic during 2020, might have significantly reduced the chance of exportation of Amazonian SARS-CoV-2 variants to other Brazilian regions. However, travel during the Christmas and New Year celebrations, combined with the emergence of a potentially more transmissible VOC P.1, might have changed this scenario.
Two SARS-CoV-2 lineage replacements characterized the COVID-19 epidemic in the Amazonas state during early and late 2020. The first lineage replacement started after the first epidemic peak and was a gradual process of nearly five months during which lineage B. A study of blood donors conducted in Manaus estimated that the first wave of SARS-CoV-2 infected 76% (95% CI 67-98) of the city's population by October 2020, suggesting that herd immunity had already been reached 6. Assuming that reinfection is rare and that immunity against infection did not significantly wane by December 2020, a second COVID-19 wave would not be expected so early. Several hypotheses were proposed to explain this unexpected second wave that resulted in the collapse of the health system in Manaus in December 2020 and January 2021 5. Our findings support that non-pharmaceutical interventions (NPIs) could explain a large portion of the epidemic dynamics in Amazonas. Our analysis reconstructed a drastic reduction in the median Re (from 2.1-2.6 to 0.9-1.0) for Amazonian lineages B.1.195 and B.1.1.28 around April-May 2020. This is consistent with the epidemic trajectories estimated from epidemiological modeling 7,8 and further coincides with the timing of implementation of NPIs that effectively increased social distancing in Amazonas. This evidence indicates that the first epidemic wave in Amazonas was brought under relative control by the implementation of NPIs, before herd immunity was reached. Our results also confirm that NPIs were not stringent enough to consistently reduce the Re of SARS-CoV-2 Amazonian lineages to below 1.0 and that a stationary state of endemic community transmission was maintained from May to September 2020 7,8.
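The expectation that a 76% attack rate would preclude an early second wave rests on the classical herd-immunity threshold, 1 - 1/R0, which assumes homogeneous mixing and lasting, fully protective immunity. The following is a minimal sketch of that arithmetic in Python. Using the early median Re estimates reported here (2.1 and 2.6) as stand-ins for R0 is an assumption made only for illustration, and the function name is hypothetical.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Classical threshold 1 - 1/R0 (homogeneous mixing, lasting immunity)."""
    if r0 <= 1.0:
        # With R0 <= 1 the epidemic does not grow, so no threshold applies.
        return 0.0
    return 1.0 - 1.0 / r0


# Assumption: early median Re values from the text used as proxies for R0.
for r0 in (2.1, 2.6):
    print(f"R0 = {r0}: threshold = {herd_immunity_threshold(r0):.1%}")
# R0 = 2.1: threshold = 52.4%
# R0 = 2.6: threshold = 61.5%
```

Under these assumptions both thresholds fall below the 76% point estimate from the blood donor study, which is why a second wave in December 2020 was unexpected.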
From September 2020 onwards, mitigation measures were relaxed, and the Re of clade 28-AM-I returned to above 1.0. Nevertheless, the second epidemic wave only started in December 2020, coinciding with the emergence of the VOC P.1 and the second lineage replacement event. Several complementary lines of evidence support that these events were probably driven by the emergence of a more transmissible VOC P.1 in a context of relaxed social distancing. First, the second lineage replacement event was an abrupt process, as the VOC P.1 evolved from the local clade 28-AM-II around late November 2020 and took less than two months to become the dominant variant. Second, the estimated median Re of the VOC P.1 during December 2020 was 2.2 times higher than that estimated for clade 28-AM-I in the same period, indicating that P.1 could have been about two times more transmissible than the co-circulating B.1.1.28 parental lineage. Third, the level of SARS-CoV-2 RNA (estimated from the median Ct) in the URT samples from P.1 infections, particularly from adults (18-59 years old), was ~10-fold higher than the level detected in non-P.1 infections, suggesting that P.1-infected adult individuals are more infectious than those harboring non-P.1 viruses 9,10. Phylodynamic modeling also indicates that NPIs implemented in Manaus since late December were effective in reducing the median Re of the VOC P.1 by ~50% (from 2.6 to 1.2), but probably failed to bring the epidemic under control (Re < 1.0), allowing the continued spread of this VOC in the Amazonas state.
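The ratios quoted in this paragraph can be checked with a few lines of Python. This is a sketch of the arithmetic only: the Re values (2.6 for P.1 in December, 1.2 for clade 28-AM-I and for P.1 after late December) are the medians reported in the text, while the conversion between a fold difference in RNA level and a Ct difference assumes ideal PCR amplification (a doubling per cycle), which is an assumption and not a property reported for the assay used here. All function names are hypothetical.

```python
import math


def fold_ratio(re_a: float, re_b: float) -> float:
    """How many times larger re_a is than re_b."""
    return re_a / re_b


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


def ct_shift_for_fold(fold: float, efficiency: float = 1.0) -> float:
    """Ct difference corresponding to a given fold difference in template.

    Assumption: each cycle multiplies the template by (1 + efficiency),
    so efficiency = 1.0 means perfect doubling per cycle.
    """
    return math.log(fold) / math.log(1.0 + efficiency)


print(f"{fold_ratio(2.6, 1.2):.1f}")          # 2.2
print(f"{relative_reduction(2.6, 1.2):.1%}")  # 53.8%
print(f"{ct_shift_for_fold(10):.2f}")         # 3.32
```

With perfect doubling, a ~10-fold difference in RNA level corresponds to a median Ct roughly 3.3 cycles lower; a less efficient assay would need a larger Ct gap for the same fold difference.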
Understanding the factors that drive the emergence and expansion of VOCs harboring key mutations in the RBD of the Spike protein is of crucial importance. One hypothesis is that VOCs evolved under the selective pressure of a large number of people having developed antibodies against SARS-CoV-2. Our study revealed no unusual pattern of intrahost viral variability in the Amazonian clades between April and December 2020, showing that the local emergence of VOCs in heavily affected regions is an evolutionary event that is challenging to anticipate from the analysis of parental lineages. Concurrently, we identified a P.1-like virus in Manaus in December 2020 that harbors several of the P.1 lineage-defining mutations and probably shared a most recent common ancestor with lineage P.1 in September 2020. This finding revealed that the diversity of SARS-CoV-2 variants carrying mutations of concern at the Spike protein in Manaus could be larger than initially described and that those variants probably circulated for some time before the expansion of lineage P.1. Although only the lineage P.1 seems to have displayed a rapid dissemination so far, our findings warn of the potential spread of other P.1-related VOCs in the Amazonas state and underline the importance of a continuous molecular surveillance system to track the viral diversity in
SARS-CoV-2 whole-genome consensus sequences and genotyping. FASTQ reads were generated by the Illumina pipeline at BaseSpace (https://basespace.illumina.com). All files were downloaded and imported into Geneious v10.2.6 for trimming and assembling using a customized workflow employing BBDuk and BBMap tools (v37.25) and the NC_045512.2 RefSeq as a template. Using this approach, we generated consensus sequences with mean depth coverage of 2,600X, excluding duplicate reads. Whole-genome SARS-CoV-2 consensus sequences were initially assigned to viral lineages according to the nomenclature proposed by Rambaut et al. (doi.org/10.1038/s41564-020-0770-5), using the Pangolin web application (https://pangolin.cog-uk.io), and later confirmed using phylogenetic analyses as explained below.
Intra-host SARS-CoV-2 genomic variability. Raw sequencing reads and primer sequences were removed with Trimmomatic 0.26 3 using default parameters. Reads that passed quality filtering were then mapped against the Wuhan SARS-CoV-2 reference genome (NC_045512.2) using Bowtie2 software 4. To characterize the viral intra-host population, we identified all minor variants (MVs) found in the samples, that is, nucleotides supported by 10 to 49% of the reads at a given position and therefore not included in the final majority consensus genome. We then replaced the nucleotide supported by the majority of the reads with the MV in the consensus genome to evaluate the impact of the synonymous and nonsynonymous nucleotide variation between the major and minor variants. We performed the synonymous and nonsynonymous analysis using an R pipeline developed for SARS-CoV-2 (10.3389/fmicb.2020.01800).
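The MV definition above (a non-majority nucleotide supported by 10 to 49% of the reads at a position) can be illustrated with a short Python sketch. This is not the R pipeline cited above. The input format (per-position base counts keyed by 1-based position), the minimum depth cut-off, and all function names are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

MV_MIN_FRACTION = 0.10  # lower bound of read support, from the text
MV_MAX_FRACTION = 0.49  # upper bound of read support, from the text
MIN_DEPTH = 100         # hypothetical depth cut-off, not stated in the text

MinorVariant = Tuple[int, str, str, float]  # (position, major, minor, fraction)


def call_minor_variants(
    pileup: Dict[int, Dict[str, int]], min_depth: int = MIN_DEPTH
) -> List[MinorVariant]:
    """Return non-majority bases supported by 10-49% of reads per position."""
    calls: List[MinorVariant] = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to estimate a frequency reliably
        major = max(counts, key=counts.get)
        for base, n in counts.items():
            if base == major:
                continue
            fraction = n / depth
            if MV_MIN_FRACTION <= fraction <= MV_MAX_FRACTION:
                calls.append((pos, major, base, fraction))
    return calls


def apply_minor_variants(consensus: str, calls: List[MinorVariant]) -> str:
    """Swap majority bases for minor variants (positions are 1-based).

    If several minor bases qualify at one position, the last one listed wins.
    """
    seq = list(consensus)
    for pos, _major, minor, _fraction in calls:
        seq[pos - 1] = minor
    return "".join(seq)


# Toy data, invented for illustration only.
toy_consensus = "ATGGCA"
toy_pileup = {
    2: {"T": 700, "C": 300},  # C at 30% of reads: minor variant
    5: {"C": 950, "A": 50},   # A at 5% of reads: below the 10% bound
    6: {"A": 30, "G": 10},    # depth 40: skipped by the depth cut-off
}
toy_calls = call_minor_variants(toy_pileup)
print(toy_calls)
print(apply_minor_variants(toy_consensus, toy_calls))  # ACGGCA
```

The alternative sequence produced by `apply_minor_variants` is what would then be compared with the majority consensus to classify each change as synonymous or nonsynonymous.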