Implementation and Evaluation of the Clear Dx Platform for Sequencing SARS-CoV-2 Genomes in a Public Health Laboratory

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the etiological agent of the ongoing coronavirus disease 2019 (COVID-19) pandemic, which has infected over 750 million people globally, with a mortality rate of 1% (World Health Organization, accessed 8 February 2023) (1). During this enormous human health crisis, whole-genome sequencing (WGS) has enabled public health agencies to identify circulating SARS-CoV-2 variants, understand vaccine breakthroughs and transmission patterns, and support contact tracing investigations (2, 3). WGS involves library preparation, sequencing, and bioinformatics data analysis. However, manual library preparation and data analysis are labor-intensive processes that are likely to increase workflow errors and reduce the consistency of results. These limitations may have caused significant delays in real-time SARS-CoV-2 genomic surveillance and in monitoring variants for public health action. Clear Dx (Clear Labs, San Carlos, CA) is a fully automated platform for SARS-CoV-2 detection and genomic surveillance that goes from extracted RNA to bioinformatic data analysis without any human intervention, thereby reducing analytical errors (4). The City of Milwaukee Health Department Laboratory (MHDL) verified the performance characteristics of the Clear Dx WGS SARS-CoV-2 assay as recommended by the manufacturer. A total of 75 clinical specimens, such as nasopharyngeal and nasal swabs, from the MHDL frozen specimen inventory were included in the verification: 54 SARS-CoV-2-positive specimens and 21 specimens that were positive for other respiratory viral pathogens but negative for SARS-CoV-2 (Table S1 in the supplemental material). Of the 54 SARS-CoV-2-positive specimens, 27 had previously been sequenced on either the MinION (Oxford Nanopore Technologies, ONT) or MiSeq (Illumina Inc.) platform, while the remaining 27 were sequenced first on Clear Dx and subsequently verified using the MiSeq platform.
Two of the samples sequenced on Clear Dx subsequently failed on MiSeq; we therefore continued with the remaining 52 SARS-CoV-2-positive specimens. Based on the verification, an overall accuracy of 95.9% was obtained, with a sensitivity of 94.2% (49/52) (defined as a precisely identified SARS-CoV-2 lineage meeting the quality control [QC] metrics of ≥90% genomic coverage and ≥100× sequencing depth) and a specificity of 100% (21/21) (defined as no amplification or detection of SARS-CoV-2 in specimens previously positive for other respiratory viral pathogens) (Fig. 1A and B; Table S1). The remaining three SARS-CoV-2-positive specimens (3/52) were assigned accurate lineages but had <90% genomic coverage on Clear Dx (Table S1). Genomic coverage ranged from 54.1 to 99.6% (median, 98.7%; mean, 96.8%) for Clear Dx-generated sequences and from 83.9 to 100% (median, 98.9%; mean, 97.3%) for sequences from the other two platforms (Fig. 1B). These results suggest that the coverage discrepancies were possibly caused either by regions of low amplification and sequence dropout due to poor performance of the ARTIC V3 primer set in those regions or by sequencing errors (5).
Editor Anne Piantadosi, Emory University School of Medicine. Copyright © 2023 Ramaiah et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license. Address correspondence to Arunachalam Ramaiah, arama@milwaukee.gov. The authors declare no conflict of interest. Published 28 March 2023.
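The reported sensitivity, specificity, and overall accuracy follow directly from the specimen counts given in the text (49 of 52 positives passing QC, 21 of 21 negatives). The following is a minimal sketch of that arithmetic and of a QC check, not the authors' code. The names `VerificationCounts`, `passes_qc`, and `performance` are hypothetical, and reading the depth threshold as a mean depth of 100× is an assumption, since the text does not say how depth was summarized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerificationCounts:
    true_pos: int   # SARS-CoV-2-positive specimens that passed QC with the correct lineage
    false_neg: int  # SARS-CoV-2-positive specimens that missed the QC thresholds
    true_neg: int   # other-pathogen specimens with no SARS-CoV-2 detected
    false_pos: int  # other-pathogen specimens wrongly called SARS-CoV-2-positive


def passes_qc(coverage_pct, depth, min_coverage_pct=90.0, min_depth=100.0):
    """QC rule from the text: at least 90% genomic coverage and a depth of at least 100.

    Assumption: `depth` is a per-genome summary such as mean depth (in x).
    """
    return coverage_pct >= min_coverage_pct and depth >= min_depth


def performance(counts):
    """Return sensitivity, specificity, and overall accuracy as percentages."""
    positives = counts.true_pos + counts.false_neg
    negatives = counts.true_neg + counts.false_pos
    return {
        "sensitivity": round(100.0 * counts.true_pos / positives, 1),
        "specificity": round(100.0 * counts.true_neg / negatives, 1),
        "accuracy": round(
            100.0 * (counts.true_pos + counts.true_neg) / (positives + negatives), 1
        ),
    }


# Counts taken from the verification described in the text.
counts = VerificationCounts(true_pos=49, false_neg=3, true_neg=21, false_pos=0)
metrics = performance(counts)
print(metrics)
```

With these counts, overall accuracy is (49 + 21)/73, which rounds to the 95.9% given in the text. The three specimens with correct lineages but under 90% coverage are counted as false negatives because sensitivity is defined in terms of passing QC.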


Methods
The Milwaukee Health Department Laboratory (MHDL) has been testing human nasopharyngeal/nasal swab specimens submitted for SARS-CoV-2 testing according to the CDC Interim Guidelines for Collecting and Handling of Clinical Specimens for COVID-19 Testing [1]. Flex platforms (Thermo Fisher Scientific, Waltham, MA) as per manufacturer's instructions. The extracts were stored at −80°C until testing [2,3]. The Clear Labs Dx WGS SARS-CoV-2 Test is an automated whole-genome sequencing platform for SARS-CoV-2. The workflow proceeds from extracted RNA through automated library preparation and sequencing to the generation of compressed FASTQ and FASTA files, with minimal human intervention. Reagent preparation, assay processing, and analysis were performed according to the Clear Labs Dx WGS SARS-CoV-2 manufacturer's instructions and MHDL standard operating procedures. The Clear Labs Dx WGS SARS-CoV-2 Test uses prepackaged kits containing metered reagents, such as primers, enzyme mixes, buffers, solid-phase reversible immobilization (SPRI) beads, and sequencing-specific reagents. All reagents were sealed with pierceable foil that was pierced by pipette tips during the run. The system uses a Hamilton STAR robotic platform to automate liquid handling and includes the required ancillary equipment, such as Hamilton thermal cyclers, a barcode reader, a magnet block, and two MinION nanopore sequencers from Oxford Nanopore Technologies. Total nucleic acids extracted from SARS-CoV-2-positive specimens were amplified using ARTIC V3 (biopipeline BIP-Wv6) or MIDNIGHT (biopipeline BIP-Wv7) primer pools. For each sequencing run, 30 SARS-CoV-2 RNA extracts that met quality control requirements were sequenced as per manufacturer guidelines.
To assess the overall accuracy of the Clear Labs Dx platform, we compared its sequence data with data generated from the same SARS-CoV-2-positive specimens on the MinION or MiSeq sequencing platform [5,6]. The detailed procedures for manual library preparation, sequencing, and data analysis on these two platforms are described elsewhere [7]. In brief, the ARTIC network workflow was used for the MinION platform. The samples were first converted into cDNA, and a PCR tiling protocol was then performed to amplify overlapping 'tiled' sections covering the SARS-CoV-2 genome using V3 primer pools, yielding tiled 400 bp amplicons as per the ARTIC multiplex PCR protocol [8]. After cDNA amplification, end repair was performed using the NEBNext Ultra II End Repair/dA-Tailing Module (New England BioLabs). The amplicons were then barcoded with the Native Barcoding Expansion pack, with a different native barcode ligated to each sample so that the samples could be pooled and sequenced in multiplex in a single run.
Sequencing adapters were then ligated to the pooled barcoded samples to complete the library for nanopore sequencing. Samples were sequenced in multiplex on MinION flow cells using either a MinION Mk1B or MinION Mk1C device. Base calling and demultiplexing of reads were performed using MinKNOW software v4.1.22, following the EPI2ME Labs ARTIC SARS-CoV-2 workflow to produce FASTQ files (https://github.com/epi2me-labs/wf-artic/). Raw reads in FASTQ format were processed using the ARTIC bioinformatics pipeline (https://github.com/artic-network/artic-ncov2019) [9]: the demultiplexed FASTQ files were mapped to the Wuhan-Hu-1 reference genome using minimap2 [10], and variant calls and consensus sequences were generated using Medaka.
For the MiSeq platform, libraries were prepared with the Illumina DNA Prep library kit [6]. The SARS-CoV-2 genome was amplified following the same sample preparation procedures used for MinION, from conversion of RNA into cDNA through primer selection and PCR tiling to amplification of the targeted regions. Sequencing libraries were then prepared by adding specialized adapters to both ends of the amplicons using tagmentation chemistry. These adapters contain complementary sequences that allow the DNA fragments to bind to the flow cell. During adapter ligation, unique index sequences (barcodes) were added to each library. The fragments were subsequently amplified and purified, and the multiplexed libraries were pooled and sequenced. During data analysis, the barcodes were used to distinguish the libraries. The FASTQ files generated on the MiSeq platform were analyzed using Illumina DRAGEN COVID Lineage (v3.5.4 to v3.5.9), in which the reads were aligned to a reference genome to call variants and produce a consensus genome sequence for each specimen.
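The genomic coverage values compared across platforms can be estimated from each consensus sequence. The sketch below assumes coverage is the percentage of reference positions with an unambiguous base call (A, C, G, or T) in the consensus. The text does not state the exact formula used, so this definition is an assumption. The helper names `read_fasta` and `genome_coverage_pct` are hypothetical, and 29,903 nt is the length of the Wuhan-Hu-1 reference (GenBank MN908947.3).

```python
REFERENCE_LENGTH = 29_903  # Wuhan-Hu-1 (GenBank MN908947.3) genome length in nucleotides


def read_fasta(text):
    """Parse FASTA-formatted text into a {record_name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the first token of the header as the name
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def genome_coverage_pct(consensus, reference_length=REFERENCE_LENGTH):
    """Percentage of the reference covered by unambiguous consensus bases.

    Assumption: masked or uncalled positions appear as N (or another
    ambiguity code) and the consensus is in reference coordinates.
    """
    called = sum(1 for base in consensus.upper() if base in "ACGT")
    return 100.0 * called / reference_length


# Toy example with a 20-nt "reference" so the arithmetic is easy to follow.
toy_fasta = ">specimen_01 toy consensus\nACGTNNNNACGT\nACGTACGT\n"
toy_records = read_fasta(toy_fasta)
toy_coverage = genome_coverage_pct(toy_records["specimen_01"], reference_length=20)
print(toy_coverage)  # 16 called bases out of 20 positions
```

A consensus with insertions relative to the reference could exceed the reference length under this definition, so a production implementation would more likely count covered positions from the read alignment itself.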
Clade and lineage assignments for the SARS-CoV-2 genomes were performed uniformly using the Nextclade Pango classifier in Nextclade version web-2.8.1 [11]. For the phylogenetic analysis, all 1,224 SARS-CoV-2 genome sequences, along with the reference Wuhan-Hu-1/2019 genome (GenBank accession no. MN908947), were aligned using MAFFT v.7.505 [12]. The aligned sequences were then used to identify GTR+F+R2 as the best-fit model based on the Bayesian information criterion using ModelFinder [13]. The maximum likelihood (ML) phylogenetic tree of SARS-CoV-2 was constructed with 1,000 bootstrap replicates in IQ-TREE multicore version 2.0.3 [14]. The phylogenetic tree was visualized in Interactive Tree Of Life (iTOL) [15].
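The alignment, model selection, and tree-building steps can be sketched as command lines assembled in Python. This is an illustration under stated assumptions, not the authors' commands. The text does not give the MAFFT strategy (`--auto` is assumed here) or say whether the 1,000 replicates were standard or ultrafast bootstraps (standard nonparametric bootstrap, `-b`, is assumed; ultrafast would be `-B`). File names are placeholders, and the functions only build argument lists without running anything.

```python
def mafft_command(input_fasta):
    """Command to align sequences with MAFFT (alignment is written to stdout).

    Assumption: --auto lets MAFFT choose its alignment strategy.
    """
    return ["mafft", "--auto", input_fasta]


def model_selection_command(alignment):
    """Command to run ModelFinder only (-m MF) in IQ-TREE 2.

    ModelFinder selects the best-fit model by BIC by default.
    """
    return ["iqtree2", "-s", alignment, "-m", "MF"]


def tree_command(alignment, model="GTR+F+R2", bootstrap_replicates=1000):
    """Command to infer an ML tree under a fixed model with bootstrap support.

    Assumption: -b requests standard nonparametric bootstrap replicates.
    """
    return ["iqtree2", "-s", alignment, "-m", model, "-b", str(bootstrap_replicates)]


# Placeholder file names, for illustration only.
commands = [
    mafft_command("genomes_plus_reference.fasta"),
    model_selection_command("aligned.fasta"),
    tree_command("aligned.fasta"),
]
for cmd in commands:
    print(" ".join(cmd))
```

To execute these, the MAFFT output would need to be redirected to the alignment file (for example with `subprocess.run(cmd, stdout=handle, check=True)`), and the two IQ-TREE runs would need distinct output prefixes so they do not overwrite each other's files.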