Enhanced Virus Detection and Metagenomic Sequencing in Patients with Meningitis and Encephalitis

ABSTRACT Meningitis and encephalitis are leading causes of central nervous system (CNS) disease and often result in severe neurological compromise or death. Traditional diagnostic workflows largely rely on pathogen-specific tests, sometimes over days to weeks, whereas metagenomic next-generation sequencing (mNGS) profiles all nucleic acid in a sample. In this single-center, prospective study, 68 hospitalized patients with known (n = 44) or suspected (n = 24) CNS infections underwent mNGS from RNA and DNA to identify potential pathogens and also targeted sequencing of viruses using hybrid capture. Using a computational metagenomic classification pipeline based on KrakenUniq and BLAST, we detected pathogen nucleic acid in cerebrospinal fluid (CSF) from 22 subjects, 3 of whom had no clinical diagnosis by routine workup. Among subjects diagnosed with infection by serology and/or peripheral samples, we demonstrated the utility of mNGS to detect pathogen nucleic acid in CSF, importantly for the Ixodes scapularis tick-borne pathogens Powassan virus, Borrelia burgdorferi, and Anaplasma phagocytophilum. We also evaluated two methods to enhance the detection of viral nucleic acid, hybrid capture and methylated DNA depletion. Hybrid capture nearly universally increased viral read recovery. Although results for methylated DNA depletion were mixed, it allowed the detection of varicella-zoster virus DNA in two samples that were negative by standard mNGS. Overall, mNGS is a promising approach that can test for multiple pathogens simultaneously, with efficacy similar to that of pathogen-specific tests, and can uncover geographically relevant infectious CNS disease, such as tick-borne infections in New England. With further laboratory and computational enhancements, mNGS may become a mainstay of workup for encephalitis and meningitis.

IMPORTANCE Meningitis and encephalitis are leading global causes of central nervous system (CNS) disability and mortality. Current diagnostic workflows remain inefficient, requiring costly pathogen-specific assays and sometimes invasive surgical procedures. Despite intensive diagnostic efforts, 40 to 60% of people with meningitis or encephalitis have no clear cause of CNS disease identified. As diagnostic uncertainty often leads to costly inappropriate therapies, the need for novel pathogen detection methods is paramount. Metagenomic next-generation sequencing (mNGS) offers the unique opportunity to circumvent these challenges using unbiased laboratory and computational methods. Here, we performed comprehensive mNGS from 68 prospectively enrolled patients with known (n = 44) or suspected (n = 24) CNS viral infection from a single center in New England and evaluated enhanced methods to improve the detection of CNS pathogens, including those not traditionally identified in the CNS by nucleic acid detection. Overall, our work helps elucidate how mNGS can become integrated into the diagnostic toolkit for CNS infections.
In this study, we prospectively enrolled 68 patients with known or suspected CNS infection and performed mNGS from both RNA and DNA to identify pathogens. We focused laboratory and analysis methods on viral nucleic acid detection since viruses are the most common type of pathogen detected in CNS infection (4,5,24,25). The goals of this study were to assess the utility of standard mNGS in identifying CNS pathogens and to examine enhanced laboratory techniques for improving analytical sensitivity, including hybrid capture (HC) of viral nucleic acid and methylated DNA depletion (MDD).
Based on clinical testing, 44 of the 68 subjects received a conclusive diagnosis by discharge. Twelve subjects were diagnosed with viral infection by PCR from CSF ("infection, CSF PCR+" group), 25 were diagnosed with infection by serology or PCR from blood ("infection, other" group), and 7 had a noninfectious etiology ("alternative diagnosis" group). The remaining 24 subjects (35%) had no known diagnosis ("unknown" group) (Table 1). Subjects classified as "unknown" underwent exhaustive clinical testing; 50% of them (12/24) had ≥25 infectious disease (ID) tests (Fig. 2A; Fig. S2 and Table S2), and no diagnoses were made during long-term follow-up (Table S3). In contrast, the "infection, CSF PCR+" group had a much lower median number of clinical ID tests performed (12 [IQR, 6, 56] versus 22.5 [IQR, 11, 36] for the "unknown" group). The "infection, CSF PCR+" group also had the shortest length of stay (LOS) (4.5 days [IQR, 2, 51 days]), and across the total cohort, the LOS moderately correlated with the number of ID tests ordered (Spearman's r = 0.65; P < 0.01) (Fig. 2B).
Results from mNGS and enhanced methods. To understand mNGS performance in a real-world context, we sequenced 68 CSF, 3 plasma, 5 serum, and 12 urine samples along with 47 negative controls. We performed mNGS from RNA, DNA, or both, generating an average of 9.6 million reads per subject (Fig. S3; see also Tables S4 to S6 at https://figshare.com/articles/dataset/Tables/13266506). We identified a plausible pathogen in 22 subjects (32.4%): 18 by standard mNGS, an additional 2 with the use of HC, and 2 more with the use of MDD (Fig. 3; Fig. S4; see also Table S7 at https://figshare.com/articles/dataset/Tables/13266506). As expected, we detected viral nucleic acid in most subjects in the "infection, CSF PCR+" group (10 out of 12 [83%]) (Fig. 3), consistent with other mNGS studies (18,23).
mNGS was negative in one subject with herpes simplex virus 2 (HSV-2) infection and another with human immunodeficiency virus type 1 (HIV-1) infection, illustrating that mNGS can be less sensitive than PCR for very low-level infections (Text S1 and Fig. S5). We detected reads from both JC virus and HIV in a subject with HIV and progressive multifocal leukoencephalopathy (PML), illustrating the capacity of this single platform to identify viral coinfections. In assessing our enhanced methods, we found that HC increased the number of viral reads in 8 out of 9 cases positive by routine mNGS, sometimes substantially (Fig. 4). In contrast, MDD led to mixed results, enabling pathogen detection in 4 out of 19 matched libraries (e.g., varicella-zoster virus [VZV] in subjects M049 and M070) and enhanced yields in 5 additional libraries while decreasing the yields in 10 libraries (e.g., Epstein-Barr virus [EBV] in subject M095) ( Fig. 4; Text S1). Twenty-five subjects in the "infection, other" group had infections diagnosed by serology from CSF and/or blood or by PCR from blood (Table 1). Fifteen had an infection for which no clinically approved CSF PCR assay was available; standard mNGS detected pathogen nucleic acid in six, and mNGS plus HC detected pathogen nucleic acid in a seventh, yielding 7/15 positive hits (47%) (Fig. 3). There were several cases of regional interest. For example, three subjects were clinically diagnosed with Powassan encephalitis using a time-consuming send-out serology test, and mNGS identified Powassan virus RNA in two cases. In addition, while our methods were focused on viral detection, we identified atypical bacteria whose genome reads were readily distinguishable from the background, including Borrelia burgdorferi in two out of two subjects diagnosed with Lyme disease by serology and Anaplasma phagocytophilum in a subject diagnosed by PCR from blood (Fig. S6).
In the remaining 10 subjects from the "infection, other" group, clinical CSF PCR was available and negative for the culprit pathogen. While all of these were negative using standard mNGS, the addition of MDD allowed the detection of VZV in two subjects (M049 and M070) (Text S1; see also Fig. S7 at https://figshare.com/articles/figure/Supplemental_Figures/13266488). In both cases, clinical CSF VZV PCR from the same sample was negative, illustrating that mNGS may occasionally be more sensitive than a clinically validated PCR. In contrast, MDD decreased the yields for other herpesviruses, suggesting pathogen-specific effects (Fig. 4; Text S1).
mNGS detects pathogens not tested by clinicians. Among the 24 subjects with no identified clinical diagnosis ("unknown"), standard mNGS identified viruses in 3 subjects, and no additional pathogens were detected using MDD and HC. We detected enterovirus in two subjects with lymphocytic meningitis (subjects M108 and M126), neither of whom had orders for clinical enterovirus PCR. These findings were verified by sequencing a second CSF aliquot and by assembling a complete enterovirus genome for each subject. Phylogenetic analysis for both subjects demonstrated closely related echovirus 30 strains (see Fig. S8 at https://figshare.com/articles/figure/Supplemental_Figures/13266488).
Figure legend (fragment): Colors indicate clinical diagnosis categories. The LOS correlated with the number of total ID tests ordered (Spearman's r = 0.65; P < 0.01). The final clinical diagnosis for viral pathogens is stated for cases whose number of ID tests or LOS was an outlier above the 3rd quartile.
We also detected EBV and assembled a complete genome in one subject (M095) during two serial hospitalizations for recurrent lymphocytic meningitis. While clinical testing for EBV in CSF was not performed, EBV PCR was positive from blood during both admissions. Overall, these results are compatible with EBV meningitis or reactivation in the setting of another, unidentified primary syndrome (26).
mNGS and enhanced methods detect viruses of uncertain significance. In addition to the plausible pathogens described above, we detected DNA viruses of uncertain clinical significance. EBV was present at low levels in CSF from four subjects, three of whom had alternative primary diagnoses: VZV (subject M043), HSV-1 (M026), and HSV-2 (M029). For the fourth subject (M085), no alternative diagnosis was identified; however, EBV reads were detected only after MDD and HC, and a clinical PCR for EBV from CSF was negative. A review of clinical data for these subjects suggested that EBV was unlikely to explain their clinical syndromes, and these findings most likely suggest reactivation in the setting of another acute process. We also detected human herpesvirus 7 (HHV-7) at a low level in a subject (M132) who was diagnosed with HSV-2 by clinical PCR, but HSV-2 was not detected by mNGS. Acute encephalitis due to HHV-7 rarely occurs in immunocompetent adults and has been described in three cases of patients with limbic encephalitis (27), facial cranial palsy, and polymyeloradiculitis (28,29); none of these syndromes were compatible with this subject's presentation. Adenovirus reads were detected in two subjects (M090 and M121) and were not considered vector contaminants due to their distribution across the genome; however, the reads were found in RNA libraries only, and subjects were not known to be immunocompromised or to have features compatible with adenovirus infection.
A known challenge of mNGS is the assessment and interpretation of background contamination. Even after extensive computational depletion of both human reads and sequences found in negative controls, bacteria accounted for ~11% of DNA and ~39% of RNA reads. We also found viral reads from bacteriophages and vectors commonly used in molecular biology, such as adenovirus, cytomegalovirus (CMV), HIV/lentiviruses, and parvoviruses, consistent with previous studies (30). Finally, we found a few reads matching recently discovered picornaviruses from environmental surveys (Text S1).
mNGS is negative for subjects with noninfectious diagnoses. mNGS did not detect pathogen nucleic acid in the seven subjects with noninfectious diseases ("alternative diagnosis" group): autoimmune encephalitis and cerebellitis (n = 3), lymphoma (n = 2), and vasculitis (n = 2). In this category, the median CSF white blood cell (WBC) count was 2 to 6 times lower than those in the two infection groups. The "alternative diagnosis" group had the highest number of ID tests ordered for CSF and blood (median, 26 tests [range, 10, 57]) (see Table S2 at https://figshare.com/articles/dataset/Tables/13266506), which is consistent with provider practice to test for a wide range of pathogens prior to immunomodulatory therapy; subjects were ultimately treated with immunosuppressive agents.

DISCUSSION
Advances in genomic technologies provide translational researchers the unprecedented capacity to identify and study pathogens in patients with meningitis and encephalitis. Here, we performed a prospective study using mNGS, enhanced laboratory and analysis techniques, and detailed clinical phenotyping to assess the use of this technology as a diagnostic tool for hospitalized subjects with inflammatory CSF. We identified a range of CNS pathogens, including regionally important tick-borne organisms not typically detected by CSF nucleic acid testing. In 9 cases, we were able to recover full or partial viral genomes, demonstrating the utility of this technique for virus characterization studies (e.g., molecular epidemiology and identification of neurotropic variants). In our study, subjects with CNS infections diagnosed using CSF PCR underwent fewer ID tests than other clinical groups with inflammatory CSF and had shorter lengths of hospital stay (32); from this, we infer that the judicious application of molecular diagnostic techniques such as mNGS can positively impact patient care and associated costs. Together with recent reports (18), this work highlights the opportunity for mNGS to become integrated into the infectious disease diagnostic toolkit.
Overall, mNGS was highly effective at detecting pathogens identified by clinical PCR. mNGS detected the expected pathogen in 10 of 12 subjects, similar to a recent study detecting viruses in 14 out of 16 subjects diagnosed by CSF PCR (18). Our results also highlight the benefit of enhanced mNGS techniques. Panviral HC consistently improved the sequencing of RNA and DNA viruses and resulted in virus detection in two cases (HSV-1 and HIV) that were negative by standard mNGS. MDD plus mNGS detected VZV DNA in two samples negative by standard mNGS. However, MDD decreased the yields for some viruses, indicating that the role of this technique in mNGS remains unclear. Our mixed results with MDD correlate with previous studies focusing on bacterial metagenomics (33,34). Saponin lysis of host cells may prove a more effective depletion technique for DNA (35,36).

A notable strength of this study was the detection of pathogens not routinely detected by CSF PCR, most notably the tick-borne pathogens Powassan virus (17), Borrelia burgdorferi, and Anaplasma phagocytophilum. These pathogens show increasing rates of human infection (37), particularly in the Northeastern United States, where this study was conducted. For Powassan virus, which is routinely detected by serology, our findings illustrate the potential utility of nucleic acid-based screening. Interestingly, we detected the CSF presence of Anaplasma, which is not commonly considered to be a cause of CNS infection (38), although the related intracellular bacterium Ehrlichia chaffeensis can cause meningoencephalitis (39,40). Overall, the high number of subjects with tick-borne infection highlights the importance of conducting mNGS in diverse geographical regions for both diagnostic purposes and epidemiological studies.
Among the 24 subjects for whom no diagnosis was achieved by routine clinical testing ("unknown"), mNGS detected potential pathogens in 3 (12.5%), a rate similar to that reported previously by Wilson et al. (13/159 [8%]) (18). It is possible that subjects in whom no pathogen nucleic acid was detected had a noninfectious syndrome or an infection with a low pathogen burden or short duration of replication. We reviewed the postdischarge clinical course in this subgroup, and none were identified as having an infectious syndrome, suggesting that mNGS did not miss an actionable result.
Our results highlight a few challenges associated with mNGS, particularly for infections with low titers or parainfectious complications. For example, we report an equivocal mNGS result in a subject with HIV-1 who had a CSF HIV load of 469 copies/ml, a value close to the recently reported CSF limit of detection of 313 copies/ml for HIV-1 using mNGS (23). Additionally, mNGS results were negative in all four subjects with West Nile virus (WNV), three of whom had clinical WNV PCRs from CSF performed, which were also negative. These results support other studies showing that WNV nucleic acid is usually undetectable in CSF by clinical PCR (23,41) or mNGS (18), although it may be observed in immunocompromised subjects (15,41–44). Similarly, CSF mycoplasma nucleic acid was not detected clinically or by mNGS in three subjects despite positive mycoplasma IgM serologies. These patients had clinical and neuroradiographic findings suggestive of encephalitis or encephalomyelitis, including multifocal T2/FLAIR (T2-weighted fluid-attenuated inversion recovery) hyperintensities in the brain and spinal cord (M006), superrefractory epilepsy with T2/FLAIR hyperintensity in the right posterior subinsular/anterior temporal stem (M032), and longitudinally extensive transverse myelitis (M075). This finding supports interpretations that CNS complications of mycoplasma infections likely reflect a parainfectious antibody-mediated response rather than direct infection (45).
While we investigated specific atypical bacteria of interest (Borrelia, Anaplasma, and Mycoplasma spp.), our study focused on viruses for four key reasons: they are the most common pathogens in CNS infection (4,5,24,25); bacteria and fungi often require different laboratory methods for processing and nucleic acid extraction (34); bacterial infections are associated with greater pleocytosis and, therefore, higher levels of host background (23); and the analysis of viruses is more tractable given that mNGS (34) commonly detects bacterial reads (e.g., Pseudomonas aeruginosa and Escherichia coli) as background from skin and reagents (34). As this was not a clinical validation study, we focused on the practical application of mNGS in a defined cohort rather than general diagnostic test performance (21,23). We adhered to strict practices to minimize contamination, but we did not conduct this research study in a Clinical Laboratory Improvement Amendments (CLIA)-certified laboratory (22), allowing us flexibility in iterative testing and refinement of methods. Because this study was conducted primarily using clinical excess samples, many of which had undergone multiple prior freeze-thaw cycles for clinical testing, it is also possible that some infections were missed due to nucleic acid degradation prior to mNGS, which would be solved if clinical processing for mNGS is standardized (34).
Conclusions. Overall, our results highlight several important benefits of mNGS, including opportunities to reduce dependence on test-specific diagnostics, recover pathogen genomic data, and potentially offer shorter turnaround times than serology (17). However, our results among subjects with an unknown etiology of disease suggest that the addition of mNGS to standard clinical testing will lead to relatively few additional diagnoses, underscoring the challenge of identifying an etiology in these devastating clinical syndromes. One potential strategy for incorporating mNGS into clinical diagnostic workflows would be wide implementation early in the diagnostic workup to capitalize on the one-step detection of common pathogens, potentially sparing subjects unnecessary tests and reducing overall costs. An alternative would be to reserve this specialized technique for subjects with a high pretest probability of infection (e.g., immunocompromised). Determining how to best utilize mNGS in clinical practice will require evaluation of these factors as well as the cost and logistics of implementation (46,47). Currently, it is prudent to employ diagnostic mNGS through close communication between clinicians and mNGS experts (18) to evaluate the plausibility of the pathogens identified. This is especially important considering background reads and contamination, the essential limitation that mNGS detects only infections with circulating pathogen nucleic acid, and our still-evolving understanding of mNGS test characteristics. Results from this study will inform ongoing efforts to transition the much-needed and promising technique of mNGS from a research tool to a clinical test used in the routine care of patients with suspected CNS infection.

MATERIALS AND METHODS
Subject enrollment and clinical characterization. The Prospective Encephalitis and Meningitis Study (PEMS) is a prospective cohort study enrolling adults who present to Massachusetts General Hospital (MGH) with confirmed or suspected CNS infection. Adults aged ≥18 years with at least one of the following symptoms were eligible for enrollment: (i) altered level of consciousness, (ii) fever, (iii) seizure, (iv) focal neurological findings, (v) electroencephalographic or neuroimaging findings consistent with encephalitis or meningitis, and (vi) refractory headaches. Additionally, enrollment was offered only to patients who had undergone, or planned to undergo, lumbar puncture (LP). Potential participants were referred to the study team by clinicians concerned for encephalitis or meningitis. In parallel, the study team performed queries of the electronic medical record to screen for eligible participants based on the reason for admission and/or CSF results. Further details of screening and enrollment are provided in Text S1 in the supplemental material.
A total of 136 subjects were prospectively enrolled in the PEMS between April 2016 and December 2017, of whom 122 had available CSF for mNGS. For this study, immunocompetent patients with a CSF white blood cell (WBC) count of <5 cells/μl (n = 40) were excluded as being unlikely to have infectious meningitis or encephalitis; most of these patients had been enrolled prior to undergoing LP (Text S1). Patients with infection due to nosocomial bacteria, or bacteria and fungi that would be challenging to distinguish from common laboratory contamination in mNGS, were also excluded (n = 14) (Table S1). Sixty-eight subjects were ultimately included in the mNGS analysis (Fig. 1). This study was approved by the Partners Institutional Review Board under protocol 2015P001388. Further details are in Text S1.
Nucleic acid isolation and standard mNGS. To minimize environmental contamination from viruses studied in the research laboratory, nucleic acid extraction and library construction were performed in an isolated workspace with limited access, extensive decontamination, and strict oversight of supplies, storage areas, and reagents. As a negative control, water and/or CSF from an uninfected patient (negative CSF) was included with each batch starting from nucleic acid isolation. Nucleic acid was extracted from 140 μl of CSF, urine, or plasma stabilized with linear acrylamide using the QIAamp viral RNA minikit (Qiagen). The eluent was split into two fractions for RNA and DNA sequencing. External RNA Controls Consortium (ERCC) spike-in oligonucleotides were added to each fraction; for the RNA fraction, RNA spike-in oligonucleotides were synthesized according to National Institute of Standards and Technology instructions, and for the DNA fraction, cDNA spike-in oligonucleotides were synthesized from RNA templates using random hexamer primers (48). Samples also underwent cDNA synthesis using random hexamer primers and previously described methods (49,50). Both DNA and RNA libraries underwent tagmentation with the Nextera XT DNA library prep kit (Illumina) and were pooled and sequenced on HiSeq and MiSeq machines using paired-end 100- or 150-bp reads. Methods are outlined in Fig. 1B, and details are in Text S1.
Methods to enhance the detection of pathogen nucleic acid. We attempted two different approaches to enhance sequencing-based detection of pathogen nucleic acid, either separately or together. We first assessed whether enrichment for nonmethylated microbial DNA would improve mNGS yields. We used samples from 12 subjects: 10 with clinically diagnosed DNA virus infections and 2 with clinically diagnosed Lyme disease. Samples underwent methylated DNA depletion (MDD) using the NEBNext microbiome DNA enrichment kit (New England BioLabs), and the enriched fraction was used for DNA library construction as described above (Fig. 4). This kit is designed to remove human host DNA that is methylated and is expected to enrich viral content because no known viral pathogens encode a methyltransferase. We then assessed the efficacy of enrichment for viral nucleic acid by hybrid capture (HC) using a set of probes targeting all viruses known to infect humans (51). We applied HC to 13 RNA and 12 DNA libraries from subjects with clinically diagnosed RNA and DNA virus infections, respectively. Given the observed efficacy of HC, we also applied HC to samples from 20 subjects in the "unknown" group (using the RNA library, DNA library, or both depending on clinical suspicion for a specific pathogen). To perform HC, indexed libraries were pooled into groups of approximately 5 samples per reaction and then underwent hybridization and capture using the SeqCap EZ enrichment kit (Roche), with modifications as described previously (51). HC libraries were pooled and sequenced as described above. Finally, we applied both MDD and HC to a subset of 12 samples from patients with known or suspected DNA virus infection.
Metagenomic analysis pipeline. Illumina sequencing reads were demultiplexed via viral-ngs (57), quality filtered and read trimmed using Trimmomatic (52), and depleted of human reads via a comprehensive KrakenUniq (53) database. The resulting reads were deduplicated and assembled into metagenomic contigs via metaSPAdes (54). Contigs were classified using a cascading BLAST scheme in which unclassified contigs at each stage passed to the next level of more intensive BLAST searches from MegaBLAST and BLASTn to BLASTx (55,56). Contigs and associated hits derived from water and negative-control samples were aggregated into a contaminant database and used to further deplete the human-depleted reads ( Fig. 1C; Fig. S1).
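The cascading classification scheme described above can be illustrated with a minimal sketch. It assumes each stage is a function that returns a taxon label or nothing for a contig; the stage functions shown here (toy_megablast, toy_blastn, toy_blastx) are hypothetical stand-ins using trivial string rules and do not call BLAST. The sketch shows only the control flow in which unclassified contigs pass to the next, more intensive stage; it is not the pipeline used in this study.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A stage classifier maps a contig sequence to a taxon label, or None if no hit.
Classifier = Callable[[str], Optional[str]]


def cascade_classify(
    contigs: Dict[str, str],
    stages: List[Tuple[str, Classifier]],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Run contigs through ordered stages; only unclassified contigs move on."""
    results: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    remaining = dict(contigs)
    for stage_name, classify in stages:
        still_unclassified = {}
        for contig_id, seq in remaining.items():
            taxon = classify(seq)
            if taxon is None:
                still_unclassified[contig_id] = seq
            else:
                results[contig_id] = (stage_name, taxon)
        remaining = still_unclassified
    # Contigs with no hit at any stage are recorded as unclassified.
    for contig_id in remaining:
        results[contig_id] = (None, None)
    return results


# Hypothetical toy classifiers standing in for MegaBLAST, BLASTn, and BLASTx.
def toy_megablast(seq: str) -> Optional[str]:
    return "Enterovirus B" if seq.startswith("AAA") else None


def toy_blastn(seq: str) -> Optional[str]:
    return "Powassan virus" if "CCC" in seq else None


def toy_blastx(seq: str) -> Optional[str]:
    return "Picornaviridae sp." if seq.endswith("GGG") else None


contigs = {"c1": "AAATTT", "c2": "TTCCCT", "c3": "TTTGGG", "c4": "TTTT"}
stages = [("megablast", toy_megablast), ("blastn", toy_blastn), ("blastx", toy_blastx)]
results = cascade_classify(contigs, stages)
```

Ordering the stages from least to most computationally intensive means that the most expensive search is run only on contigs that the cheaper searches could not classify.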
Finally, the human- and contaminant-depleted reads (see Table S4 at https://figshare.com/articles/dataset/Tables/13266506) were classified by KrakenUniq using the same comprehensive database as the one described above. Reads classified as potentially human-pathogenic viruses were validated via BLAST, discarding any reads that were not concordantly classified by both methods. The counts of reads per taxon were normalized to sequencing depth as reads per million (RPM) (see Tables S5 and S6 at https://figshare.com/articles/dataset/Tables/13266506). Kaiju (58) was run on depleted reads to explore divergent taxon hits, while viral-ngs was used to assemble genomes for a subset of viruses. For mNGS interpretation after computational classification, false-positive species were identified based on broad contamination patterns across all sequencing runs. All true-positive determinations were output from the mNGS classification results directly.
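The concordance filter and RPM normalization described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: per-read taxon calls are assumed to be available as dictionaries keyed by read identifier, the function names are hypothetical, and the RPM denominator is assumed to be the total number of reads in the library (the text specifies only normalization to sequencing depth).

```python
from collections import Counter
from typing import Dict


def concordant_reads(
    kraken_calls: Dict[str, str], blast_calls: Dict[str, str]
) -> Dict[str, str]:
    """Keep only reads assigned the same taxon by both classifiers."""
    return {
        read_id: taxon
        for read_id, taxon in kraken_calls.items()
        if blast_calls.get(read_id) == taxon
    }


def reads_per_million(taxon_counts: Dict[str, int], total_reads: int) -> Dict[str, float]:
    """Normalize per-taxon read counts to reads per million (RPM)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {
        taxon: count * 1_000_000 / total_reads
        for taxon, count in taxon_counts.items()
    }


# Toy example with made-up read identifiers and calls.
kraken = {"r1": "VZV", "r2": "VZV", "r3": "EBV", "r4": "HHV-7"}
blast = {"r1": "VZV", "r2": "VZV", "r3": "HSV-1"}  # r3 discordant, r4 has no BLAST hit

kept = concordant_reads(kraken, blast)
counts = Counter(kept.values())
rpm = reads_per_million(counts, total_reads=4_000_000)  # assumed library size
```

In this toy case, only the two reads called VZV by both methods are retained, giving 2 reads in a library of 4 million, or 0.5 RPM.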
Statistical analyses. Analyses were performed using Student's t test and the Mann-Whitney U test for normally and nonnormally distributed continuous variables, respectively, and using the χ² test for categorical variables.
Data availability. Reads after quality control (QC) filtering, trimming, and depletion of human reads via KrakenUniq against a comprehensive database, including the human genome (GRCh38/hg38) and all human sequences from the BLAST NT database, are available in the NCBI Sequence Read Archive (SRA) under accession number PRJNA668392.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. TEXT S1, DOCX file, 0.04 MB.