Identification of Viruses in Cases of Pediatric Acute Encephalitis and Encephalopathy Using Next-Generation Sequencing

Acute encephalitis/encephalopathy is a severe neurological syndrome that is occasionally associated with viral infection. Comprehensive virus detection assays are desirable because viral pathogens have not been identified in many cases. We evaluated the utility of next-generation sequencing (NGS) for detecting viruses in clinical samples of encephalitis/encephalopathy patients. We first determined the sensitivity and quantitative performance of NGS by comparing the NGS-determined number of sequences of human herpesvirus-6 (HHV-6) in clinical serum samples with the HHV-6 load measured using real-time PCR. HHV-6 was measured as it occasionally causes neurologic disorders in children. The sensitivity of NGS for detection of HHV-6 sequences was equivalent to that of real-time PCR, and the number of HHV-6 reads was significantly correlated with HHV-6 load. Next, we investigated the ability of NGS to detect viral sequences in 18 pediatric patients with acute encephalitis/encephalopathy of unknown etiology. A large number of Coxsackievirus A9 and mumps viral sequences were detected in the cerebrospinal fluid of 2 and 1 patients, respectively. In addition, Torque teno virus and Pepper mild mottle viral sequences were detected in the sera of one patient each. These data indicate that NGS is useful for detection of causative viruses in patients with pediatric encephalitis/encephalopathy.


Results
Identification of viral sequences in clinical samples using NGS. To validate the NGS-based approach for detecting virus-derived sequences in clinical samples, serum or CSF samples obtained from patients with a defined diagnosis of viral infection were examined. Representative results from patients with adenovirus fulminant hepatitis (serum), HSV encephalitis (CSF), and HCV infection (serum) are shown in Fig. 1. The patient with adenovirus fulminant hepatitis was a 12-year-old female with acute myeloid leukemia who had undergone bone marrow transplantation. She developed hepatitis on day 103 after transplantation and died of hepatic failure on day 113. The details of this case were reported previously 7 . The patient with HSV encephalitis was a 12-day-old male newborn who presented with fever and seizure. He was diagnosed with neonatal HSV encephalitis by real-time PCR of CSF. The patient with HCV infection was a 14-year-old male who was infected by perinatal transmission. His liver function tests were normal, but a high HCV load was detected.
With DNA sequencing, human sequences were the most abundant, and the vast majority of detected viral sequences corresponded to a single predominant virus, as expected (Fig. 1a,b). Nearly complete adenoviral genomes were detected with high coverage, and sequencing results of the hexon gene indicated that the detected adenovirus was classified into type 2, which was consistent with the results of virus isolation (data not shown). On the other hand, with sequencing of RNA in the serum of the HCV patient, HCV sequences represented 2.1% of total reads (Fig. 1c). Sequencing results of the NS5B region indicated that the detected HCV was classified into genotype 1b (data not shown). However, unclassified sequences derived from Illumina adaptor sequences were the most abundant due to the low amount of input RNA for library preparation and subsequent preferential formation of adaptor dimers. Prior filtering of such non-essential artifacts decreases the heavy computational burden of BLAST search. Importantly, all RNA libraries contained sequences mapping to avian retroviral reference genomes such as avian myeloblastosis-associated virus type 1 and Rous sarcoma virus with high coverage (data not shown). However, they were likely derived from residual avian retroviral genomic material in the ScriptSeq reverse transcriptase enzyme (Illumina, San Diego, CA) 8 . For the viral metagenomics analysis using NGS, it has been shown that the ScriptSeq library preparation method retrieved more virus reads than TruSeq RNA Kit v2 (Illumina), but avian retroviral sequences were detected only in the libraries prepared with ScriptSeq kit 9 .
Fig. 1. Total reads were mapped against the reference genomes using the CLC Genomics Workbench. Light gray, gray, and very light gray colors in the viral genome alignments represent minimal, average, and maximal coverage in the aggregated 100 bp (human adenovirus and hepatitis C virus) or 1 kbp (human herpesvirus 2) region, respectively. (a) Sequencing results of the DNA library prepared from the serum of a patient with adenovirus fulminant hepatitis. Sequencing reads were mapped to the reference genome of human adenovirus 2. (b) Sequencing results of the DNA library prepared from cerebrospinal fluid of a patient with neonatal herpes encephalitis. Sequencing reads were mapped to the reference genome of human herpesvirus 2. (c) Sequencing results of the RNA library prepared from the serum of a patient with hepatitis C virus infection. Sequencing reads were mapped to the reference genome of Hepatitis C virus genotype 1b.
Correlation between results from NGS and real-time PCR. First, we assessed the sensitivity of NGS for detection of viral sequences using sera spiked with HHV-6. Cell-free HHV-6 fluid was prepared by collecting supernatants from HHV-6-infected cord blood mononuclear cells. As shown in Fig. 2a, NGS detected 44 and 2 reads of HHV-6 sequences in serum containing 100 and 10 copies/ml of HHV-6, respectively. Considering that the detection limit of real-time PCR for HHV-6 is around 50 copies/ml, the sensitivity of the NGS-based approach for detection of HHV-6 was equivalent to that of real-time PCR. To confirm the quantitative performance of NGS, the number of HHV-6 sequences detected in sera of patients was compared with the HHV-6 load measured using real-time PCR. Because the sensitivity of NGS depends on sequencing depth, the number of HHV-6 sequences in each sample was normalized to 5,000,000 total reads. As shown in Fig. 2b, the number of HHV-6 reads was significantly correlated with HHV-6 load (r = 0.90, p < 0.01). These results suggest that the sensitivity and quantitative performance of the NGS-based approach for detection of DNA viral sequences were equivalent to those of real-time PCR in cell-free clinical samples at a sequencing depth of 5,000,000 reads.
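The depth normalization and correlation described above can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function names and the sample values are hypothetical, and because the text does not state which correlation coefficient or data transformation was used, a Pearson correlation on log10-transformed values is assumed here.

```python
import math

TARGET_DEPTH = 5_000_000  # normalization depth stated in the text


def normalize_reads(viral_reads, total_reads, target_depth=TARGET_DEPTH):
    """Scale a viral read count to a common sequencing depth."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return viral_reads * target_depth / total_reads


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples (NOT data from the study):
# (HHV-6 reads, total reads in the library, HHV-6 load in copies/ml)
samples = [
    (12, 4_000_000, 1.0e3),
    (150, 6_000_000, 1.0e4),
    (900, 5_000_000, 1.0e5),
    (11_000, 5_500_000, 1.0e6),
]

normalized = [normalize_reads(reads, total) for reads, total, _ in samples]
log_reads = [math.log10(value) for value in normalized]
log_load = [math.log10(load) for _, _, load in samples]
r = pearson_r(log_reads, log_load)
print(f"normalized reads: {normalized}")
print(f"Pearson r (log10 scale): {r:.3f}")
```

Scaling every library to the same depth keeps a deeply sequenced sample from appearing to carry more virus simply because more reads were generated.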

Detection of viral sequences in clinical samples of acute encephalitis/encephalopathy. Next, we investigated the CSF and serum samples of patients with acute encephalitis/encephalopathy of unknown etiology. A summary of the patients and the viruses that were detected is shown in Table 1. In some CSF samples, DNA sequencing could not be performed because the concentration of the extracted DNA was not sufficient to prepare DNA libraries. Sequences that mapped to a specific viral genome (except avian retroviruses) with more than 10 reads were detected in three of 16 CSF samples and in two of 15 serum samples (Table 1). Substantial numbers of Coxsackievirus A9 reads were detected in the CSF of patients 1 and 12 (Fig. 3a,c). These sequences also aligned with other enteroviral genomes such as Coxsackievirus B3 and Echovirus E30, whereas alignment with the VP1 region suggested that they were derived from Coxsackievirus A9 in both patients (data not shown). Furthermore, mumps viral sequences were detected in the CSF sample from patient 9, and these sequences covered the reference mumps genome (Fig. 3b). Alignment with the SH genomic region indicated that the detected mumps virus was genotype G and not a vaccine strain (data not shown). On the other hand, 13 reads of the Torque teno virus (TTV) and 298 reads of the Pepper mild mottle virus (PMMoV) were detected in the serum of patients 2 and 14, respectively. These virus-derived sequences were not detected in CSF. We confirmed that PMMoV-derived sequences covered the reference genome (Fig. 3d). The presence of Coxsackievirus A9 and mumps virus RNA in the CSF, and PMMoV RNA in the serum, was confirmed by RT-PCR (Fig. 4).
In addition to viral sequences, a significant number of sequences derived from environmental bacteria such as Acinetobacteria and Proteobacteria were detected in all CSF and serum samples (data not shown). However, sequences of bacteria that have been recognized as causes of encephalitis, such as Listeria or Coxiella burnetii, were not detected.
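The reporting rule used in this subsection (more than 10 reads mapped to a viral genome, with avian retroviruses excluded as reagent-derived) can be sketched as a simple filter. This is an illustrative sketch only: the function name, the data layout, and the read counts marked as hypothetical are assumptions, not the pipeline used in the study.

```python
# Avian retroviruses named in the text as likely reagent-derived contaminants.
AVIAN_RETROVIRUSES = frozenset({
    "Avian myeloblastosis-associated virus type 1",
    "Rous sarcoma virus",
})
MIN_READS = 10  # a hit must have MORE than this many mapped reads


def reportable_hits(read_counts, min_reads=MIN_READS, excluded=AVIAN_RETROVIRUSES):
    """Return {virus: reads} for viruses above the read threshold,
    skipping viruses on the exclusion list."""
    return {
        virus: reads
        for virus, reads in read_counts.items()
        if reads > min_reads and virus not in excluded
    }


# TTV count (13 reads) is taken from the text; the other two counts are hypothetical.
serum_counts = {
    "Torque teno virus": 13,
    "Rous sarcoma virus": 5000,   # hypothetical count
    "Hypothetical virus X": 4,    # hypothetical low-level hit
}
print(reportable_hits(serum_counts))
```

A fixed read threshold is a blunt criterion; it is only meaningful alongside a comparable sequencing depth across samples.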

Discussion
In this study, we evaluated the utility of NGS for detecting viral sequences in CSF and serum and investigated the causative virus in 18 pediatric patients with acute encephalitis/encephalopathy. Sequences mapping to the Coxsackievirus A9 and mumps virus genomes were found in CSF, and these viruses were considered to be the causative pathogens of encephalitis/encephalopathy. Non-polio enteroviruses are a major cause of encephalitis in children, and Coxsackievirus A9 is one of the most frequent serotypes associated with encephalitis 10 . On the other hand, mumps was one of the most frequent causes of confirmed viral encephalitis in the pre-vaccine era. In Japan, outbreaks of mumps are not uncommon because universal vaccination against mumps has not been introduced. However, the patient with suspected mumps encephalitis in this study had received one dose of the mumps vaccine at the age of 1 year. Sequences detected in the CSF aligned with the wild-type mumps genome rather than vaccine strains, suggesting primary or secondary vaccine failure in this patient. These results suggest that the NGS-based approach for detection of virus-derived sequences in CSF may be a useful and reliable method for identifying the viral pathogen of encephalitis/encephalopathy.
On the other hand, no significant viral reads were detected in 13 of the 16 CSF samples. In general, free RNA from sera or CSF is degraded and obtained in low yield 11,12 . Extraction of RNA from CSF without pleocytosis was performed with carrier RNA to yield sufficient RNA for library preparation. However, no virus pathogens were detected in these CSF samples, suggesting that RNA sequencing of CSF specimens with very low RNA concentrations may not be very useful. Because of low RNA yield in CSF, the detection of virus-specific RNA has lower sensitivity for diagnosis than other methods. For example, virus RNA detection using RT-PCR in CSF of patients with West Nile virus encephalitis is less sensitive than detection of virus specific IgM antibody in CSF 13 . It is possible that some of the virus-specific IgM in CSF is more stable than virus RNA. Furthermore, we were unable to perform DNA sequencing in CSF samples without pleocytosis because DNA yields were not sufficient for DNA library preparation. However, it has been shown that RNA sequencing can be used to detect mRNA reads derived from DNA viruses 9 .
In addition to CSF, we investigated the sera from patients with encephalitis/encephalopathy because some viral pathogens such as HHV-6 are sometimes detected in sera but not in CSF 14 . Some cases of virus-associated encephalopathy may be a consequence of a systemic immune response rather than a result of direct viral invasion of the central nervous system 15,16 . Interestingly, sequences mapping to PMMoV were detected in the serum of one patient. PMMoV is a plant RNA virus of the genus Tobamovirus. Recently, metagenomics studies have identified PMMoV in the stool of healthy subjects 17 . Furthermore, the presence of PMMoV in stool is associated with seropositivity for PMMoV-IgM antibodies and clinical symptoms such as fever and abdominal pain 17 . On the other hand, detection of PMMoV in human peripheral blood has not been reported. Further investigations are required to clarify whether PMMoV is associated with human disease or is just an innocent bystander. A small number of TTV sequences were detected in the serum of another patient. TTV infections are frequent in humans, but a direct association between TTV and specific diseases has not been established because TTV is sometimes detected in apparently healthy individuals 18 .
As with previous studies of NGS-based pathogen analysis, a significant number of sequences derived from different types of bacteria such as Acinetobacteria and Proteobacteria were detected in all CSF and serum samples (data not shown) 19,20 . Although cultures of blood and CSF for bacteria were negative in all patients, some of these bacteria may have been causative agents of encephalitis/encephalopathy. Another possibility is that sequences of normal bacterial flora in humans may be persistently detected in the sera or CSF from patients without bacteremia or meningitis. However, significant levels of bacterial reads, which likely arise from environmental sources, are detected in DNA and RNA sequencing libraries prepared from clinical specimens or tissue culture cells 19 . For that reason, bacterial contamination is a relevant issue that needs to be extensively addressed before widespread use of NGS-based pathogen detection can be implemented. Distinguishing microbes of interest from contaminants, particularly bacteria, remains difficult.
In conclusion, we have shown that NGS is useful for detection of causative viruses in patients with pediatric encephalitis/encephalopathy. Although routine use of NGS may be difficult because it is costly and labor-intensive, an unbiased and highly sensitive NGS-based approach has great potential for analysis of pathogens present in clinical samples, and will likely contribute to clinical and public health practice.

Methods
Ethics Statement. The study design and methods were approved by the Institutional Review Board of Nagoya University Hospital (IRB number: 5069). The methods were carried out in accordance with the approved guidelines. Informed consent was obtained from all patients or their guardians.

Patients and samples.
A total of 18 patients with pediatric acute encephalitis/encephalopathy of unknown etiology were enrolled in this study (Table 1). Acute encephalitis/encephalopathy was defined as a depressed or altered level of consciousness lasting more than 24 hours together with one or more of the following: fever (>38 °C) during the presenting illness; seizure(s) and/or focal neurological findings; CSF pleocytosis; abnormal results of an electroencephalogram; abnormal results of neuroimaging 1 . Cultures of blood and CSF for bacteria, a rapid influenza test, and PCR for HSV, HHV-6, and HHV-7 were negative in all patients. RT-PCR for enterovirus was not performed at enrollment. Serum and CSF samples were obtained in the acute phase (within 2 days of the onset of neurologic involvement) and stored at −30 °C until use. Clinical samples obtained from patients with a defined diagnosis of viral infection such as adenovirus hepatitis, HSV encephalitis, hepatitis C virus (HCV) infection, or HHV-6 encephalopathy were used to validate the NGS-based approach for detecting virus-derived sequences. Real-time PCR of HHV-6 was performed with a QuantiTect multiplex PCR kit (Qiagen, Hilden, Germany) as described previously 21 , and primers and probes are shown in Table 2.
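The case definition given at the start of this subsection combines one mandatory criterion with at least one of five supporting findings. It can be written as a small predicate, shown here as a sketch: the class name, field names, and example values are hypothetical and are not part of the study protocol.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Clinical findings for one patient (hypothetical field names)."""
    altered_consciousness_hours: float
    max_temperature_c: float
    seizure_or_focal_signs: bool
    csf_pleocytosis: bool
    abnormal_eeg: bool
    abnormal_neuroimaging: bool


def meets_case_definition(p: Presentation) -> bool:
    """Altered consciousness for more than 24 hours plus at least one supporting finding."""
    mandatory = p.altered_consciousness_hours > 24
    supporting = (
        p.max_temperature_c > 38.0,
        p.seizure_or_focal_signs,
        p.csf_pleocytosis,
        p.abnormal_eeg,
        p.abnormal_neuroimaging,
    )
    return mandatory and any(supporting)


# Hypothetical patient: 36 hours of altered consciousness with fever only.
example = Presentation(36, 39.2, False, False, False, False)
print(meets_case_definition(example))
```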
Table 2. Primers and probe used for virus detection. *Sequences are shown 5′ to 3′ with standard codes (Y = C/T, R = A/G).
Library preparation and sequencing. Samples were filtered through a 0.45-μm filter (Merck-Millipore, Temecula, CA) to remove blood cells and bacteria, and extraction of total nucleic acids was performed with a QIAamp UCP Pathogen Mini kit (Qiagen) in accordance with the manufacturer's instructions. Extraction of RNA from CSF without pleocytosis was performed with a QIAamp Viral RNA Mini kit (Qiagen) with carrier RNA to yield sufficient RNA for library preparation. Concentrations of extracted DNA and RNA were measured using a Qubit assay kit (Thermo Fisher Scientific, Waltham, MA). Concentrations of DNA from sera and CSF with pleocytosis ranged from 0.1 to 8.2 ng/μl (median, 0.5 ng/μl). On the other hand, concentrations of DNA from CSF without pleocytosis, and of RNA extracted without carrier RNA, were below the detection limit (<0.01 ng/μl). We could not assess the quality of RNA because the amount of extracted RNA was very low. Before preparation of the RNA sequencing library, extracted nucleic acids were treated with Turbo DNase (Ambion, Darmstadt, Germany) for 30 min at 37 °C to digest host DNA. DNA and RNA sequencing libraries were prepared using a Nextera XT DNA Sample preparation kit (Illumina) and ScriptSeq v2 (Illumina), respectively. Library quality was determined using an Agilent 2200 TapeStation (Agilent, Santa Clara, CA). Fragment sizes and concentrations