DNA extraction from primary liquid blood cultures for bloodstream infection diagnosis using whole genome sequencing

Purpose. Speed of bloodstream infection diagnosis is vital to reduce morbidity and mortality. Whole genome sequencing (WGS) performed directly from liquid blood culture could provide single-assay species and antibiotic susceptibility prediction; however, high inhibitor and human cell/DNA concentrations limit pathogen recovery. We develop a method for the preparation of bacterial DNA for WGS-based diagnostics direct from liquid blood culture. Methodology. We evaluate three commercial DNA extraction kits: BiOstic Bacteraemia, Amplex Hyplex and MolYsis Plus. Differential centrifugation, filtration, selective lysis and solid-phase reversible immobilization bead clean-up are tested to improve human cells/DNA and inhibitor removal. Using WGS (Illumina/MinION), we assess human DNA removal, pathogen recovery, and predict species and antibiotic susceptibility in positive blood cultures of 44 Gram-negative and 54 Staphylococcus species. Results/Key findings. BiOstic kit extractions yield the greatest mean DNA concentration, 94–301 ng µl−1, versus 0–2.5 ng µl−1 using Amplex and MolYsis kits. However, we note higher levels of inhibition (260/280 ratio 0.9–2.1) and human DNA (0.0–4.4×10^6 copies) in BiOstic extracts. Differential centrifugation (2000 g, 1 min) prior to BiOstic extraction reduces human DNA by 63–89 %, with selective lysis reducing it by a further 62 %. Post-extraction bead clean-up lowers inhibition. Overall, 67 % of sequenced samples (Illumina MiSeq) contain <10 % human DNA, with >93 % concordance between WGS-based species and susceptibility predictions and clinical diagnosis. If >60 % of sequencing reads are human (7/98 samples), susceptibility prediction becomes compromised. Novel MinION-based WGS (n=9) currently gives rapid species identification but not susceptibility prediction. Conclusion. Our method for DNA preparation allows WGS-based diagnosis direct from blood culture bottles, providing species and antibiotic susceptibility prediction in a single assay.

Species identification and AST methods directly from primary blood culture have the potential to reduce turnaround times, and show benefits in clinical care [4][5][6][7][8][9][10]. For example, MALDI-TOF methods have been adapted for this purpose, with varying degrees of accuracy [14,15]. However, MALDI-TOF cannot provide full drug susceptibility information from primary culture [4,6-8]. Microarray and PCR-based molecular tests target panels of species-specific and drug-resistance markers in primary culture [8,9]. Although molecular methods report rapid and accurate diagnosis [8,9], panel sizes are limited and none encompass the full diversity of BSI-causing bacteria and drug-resistance markers (for examples see [16][17][18][19][20][21]). Sequence divergence in the primer region may reduce the sensitivity of these assays, whereas DNA contamination from human and other bacterial cells may reduce their specificity. Thus, molecular approaches continue to present a challenge [22,23].
Whole genome sequencing (WGS) offers a solution to the limitations of PCR-based methods, providing species identification that is un-restricted by a target panel, with the advantages of antimicrobial susceptibility prediction, lineage and information regarding relatedness to other isolates using the same data. Retrospective investigations for pure cultures of E. coli, Klebsiella pneumoniae and S. aureus demonstrate that WGS diagnostic accuracy is comparable to routine phenotyping methods [24,25]. WGS directly from primary blood culture could reduce turnaround times to a clinically applicable time-frame. However, this is more challenging, reliant on the removal of a diverse range of inhibitors present in the liquid media [7,26], and depletion of human cells and/or DNA to allow recovery of sufficient good-quality pathogen DNA. Initial bacterial density in BSI falls between 1-100 c.f.u. ml−1 [27] and, even post-primary culture, recovery of bacterial DNA for successful WGS is complicated by the presence of large amounts of human DNA.
We report a method to deplete human cells/DNA and isolate bacterial DNA of sufficient quality and quantity for WGS direct from primary liquid blood culture. We demonstrate the ability of the method to provide species identification using short-read Illumina sequencing and the emerging long-read Nanopore MinION system, as well as antimicrobial-resistance prediction using Illumina sequencing data.

Sample collection and processing
Positive blood culture specimens identified as containing either Gram-negative organisms or Staphylococcus sp. by routine clinical Gram-stain were retrieved from the Oxford University Hospitals National Health Service (UK) (NHS) Foundation Trust microbiology laboratory. Blood culture bottles collected included the BD BACTEC Aerobic Plus, Peds Plus, and Lytic Anaerobic (BD, USA). The former two contain resin, and the latter lytic agents for the release of intracellular pathogens. Positive samples from the previous 24 h were retrieved by 10 am each working day. Samples were stored at 37 °C between positivity and collection; 5 ml of culture was immediately removed and retained for processing.
DNA extraction and pre-steps
Three DNA extraction and purification kits were tested: MolYsis Plus kit (Molzym, Germany), BiOstic Bacteraemia kit (MoBio, Qiagen, USA) and Amplex Hyplex Quickprep (Amplex Biosystems, Germany) following the manufacturers' protocol. Chosen kits were readily available in the UK and designed to remove inhibitors to molecular testing. DNA extracted using BiOstic Bacteraemia kit was analysed with and without subsequent DNA purification using AMPure XP solid-phase reversible immobilization (SPRI) beads following the manufacturer's protocol (Beckman Coulter, UK). BiOstic kit extractions were also performed with pre-treatment steps for the depletion of human DNA. These steps were differential centrifugation of primary culture, selective lysis of the primary culture pellet, filtration of primary culture, and a combination of the methods (fully detailed in Fig. S1, available in the online version of this article). DNA was quantified and qualified using the NanoDrop 1000 or Qubit 2.0 fluorometer (double-stranded DNA high sensitivity/broad range kits as required; Thermo Fisher Scientific, USA).

PCR
Quantitative PCR using the probes and primers described in Table S1 was used to quantify S. aureus, E. coli or 16S rRNA, as well as human GAPDH or β-actin DNA (Fig. S1). S. aureus, E. coli and human GAPDH primers and probes were used at a final concentration of 0.32 µM, 16S primers and probe were used at 0.1 µM, human β-actin primers were used at 0.5 µM and the probe at 0.2 µM. All reactions used 1× Brilliant Multiplex qPCR Master Mix (Agilent, USA) with 2 µl DNA and sufficient molecular grade water to bring the reaction volume to 25 µl. Amplification was performed using the MxPro 3005P (Agilent, USA) under the following conditions: 95 °C for 10 min, then 40 cycles of 95 °C for 15 s and 60 °C for 1 min. Extractions showing evidence of inhibition were re-amplified using 2 µl of 1:10 and 1:100 diluted DNA, and following SPRI bead clean-up.
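As a rough check on the cycling programme above, the following sketch totals the programmed hold times and scales a copy number measured in a diluted template back to the neat extract. It ignores ramping between temperatures (so a real run takes longer), the function names are illustrative, and the 1500-copy reading is a hypothetical value rather than a result from this study.

```python
def cycling_minutes(initial_hold_s, cycles, step_durations_s):
    """Programmed hold time in minutes; ramping between temperatures is ignored."""
    return (initial_hold_s + cycles * sum(step_durations_s)) / 60


def undiluted_copies(measured_copies, dilution_factor):
    """Scale a copy number measured in a diluted template back to the neat extract."""
    return measured_copies * dilution_factor


# 95 °C for 10 min, then 40 cycles of 95 °C for 15 s and 60 °C for 1 min.
hold_minutes = cycling_minutes(initial_hold_s=600, cycles=40, step_durations_s=[15, 60])
print(hold_minutes)  # 60.0

# Hypothetical reading of 1500 copies from a 1:10 dilution of an inhibited extract.
print(undiluted_copies(1500, 10))  # 15000
```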

Illumina MiSeq whole genome sequencing
The finalized protocol (Fig. S1) was used to prepare samples for WGS. Bacteria and other debris were pelleted via differential centrifugation of primary culture (2000 g, 1 min). The pellet was retained, resuspended in 1 ml molecular grade water and incubated at room temperature for 5 min to selectively lyse human cells. The suspension was re-pelleted (17 000 g, 3 min) and the pellet taken forwards to DNA extraction, or re-suspended in 1 ml nutrient broth with 10 % glycerol and stored at −20 °C until extraction. DNA extraction was performed using the BiOstic kit followed by SPRI bead clean-up. Sequencing libraries were prepared using the Nextera XT kit (Illumina, USA) following the manufacturer's protocol with manual library normalization. Sequencing was performed using MiSeq v2 2×150 base pair (bp) paired-end read cartridges and MiSeq v3 2×75 or 2×300 bp paired-end read cartridges.

MinION whole genome sequencing
As DNA input requirements for MinION sequencing are higher than for MiSeq, 10 ml blood culture was processed per sample. S. aureus and E. coli DNA extracts (finalized protocol, Fig. S1) were prepared for MinION sequencing using the Rapid Sequencing Kit (SQK-RAD002; RSE_9018_v2_redD_21Nov2016) following the manufacturer's protocol. Sequencing was performed using R9.

Sequence analysis
MiSeq sequencing data was processed via an in-house pipeline. Reads were classified with the metagenomic classifier, Kraken (database built from bacterial, viral and human genomes present in National Center for Biotechnology Information refseq on 14 January 2015; v0.10.6unreleased) [28], and human reads removed. Remaining reads were mapped (stampy v1.0.23) to a reference genome chosen according to the top species hits from Kraken (for S. aureus GenBank BX571856.1 and for E. coli GenBank AE014075.1). For Staphylococcus sp. specimens, the species was also predicted via the publicly available Mykrobe Predictor [29]. Mykrobe also predicts antibiotic susceptibility for S. aureus (Mykrobe v0.3.6-0-g9d196c7), while antibiotic susceptibility for E. coli and Klebsiella sp. was predicted using resistType (https://github.com/hangphan/resistType), an in-house algorithm using a previously published catalogue of resistance conferring mutations/genes [25]. We determined sequencing quality through assessment of overall genome coverage and depth of coverage from both Mykrobe and mapping analysis, as well as the total number of reads available.
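The classification step can be illustrated with a minimal sketch that reads a Kraken report, estimates the human read fraction and picks a mapping reference from the top species hit. This is not the in-house pipeline: it assumes the six-column, tab-separated layout written by `kraken-report`, the function names are hypothetical, and the example rows are invented for illustration. Only the two reference accessions come from the text.

```python
import csv
import io

# Reference accessions quoted in the text; other species have no entry here.
REFERENCES = {
    "Staphylococcus aureus": "BX571856.1",
    "Escherichia coli": "AE014075.1",
}


def parse_report(handle):
    """Yield (clade_reads, rank_code, name) from a kraken-report style table."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        yield int(row[1]), row[3], row[5].strip()


def summarise_report(handle, human_name="Homo sapiens"):
    """Return the human read fraction, top non-human species and its reference."""
    rows = list(parse_report(handle))
    # Classified reads sit under "root"; the remainder are listed as "unclassified".
    total = sum(reads for reads, _, name in rows if name in ("root", "unclassified"))
    human = sum(reads for reads, rank, name in rows if rank == "S" and name == human_name)
    non_human = [(reads, name) for reads, rank, name in rows
                 if rank == "S" and name != human_name]
    top = max(non_human)[1] if non_human else None
    return {
        "human_fraction": human / total if total else 0.0,
        "top_species": top,
        "reference": REFERENCES.get(top),
    }


# Invented example report: percent, clade reads, direct reads, rank, taxid, name.
EXAMPLE = "\n".join("\t".join(fields) for fields in [
    ("  5.00", "50", "50", "U", "0", "unclassified"),
    (" 95.00", "950", "0", "-", "1", "root"),
    ("  4.00", "40", "40", "S", "9606", "        Homo sapiens"),
    (" 90.00", "900", "900", "S", "562", "        Escherichia coli"),
])
summary = summarise_report(io.StringIO(EXAMPLE))
print(summary["top_species"], summary["reference"])
```

In a real pipeline the human-classified reads would be removed from the FASTQ files before mapping; the sketch only reports how large that fraction is.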
Predictions of genotype (species and antimicrobial resistance from Mykrobe for Staphylococcus sp.; species from Kraken and antimicrobial resistance from resistType for E. coli and Klebsiella sp.) were compared to anonymized clinical diagnosis generated using pure culture isolates [species from Bruker microflex MALDI-TOF MS (Bruker, USA), antimicrobial resistance from BD Phoenix microbroth dilution (BD, USA)]. In all cases the clinical diagnosis was taken as the gold standard comparator method. The sensitivity [WGS true positives/(WGS true positives+WGS false negatives)] and specificity [WGS true negatives/(WGS true negatives+WGS false positives)] of WGS-based diagnosis were calculated. When analysing concordance between Kraken data and clinical species identification, we disregarded organisms with <1 % of available reads assigned to them.
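The comparison against the clinical gold standard can be sketched as follows, using the standard definitions of sensitivity (true positives over all clinically resistant results) and specificity (true negatives over all clinically susceptible results), together with the 1 % read threshold applied to Kraken species calls. The function names and all example counts are hypothetical and are not data from this study.

```python
def diagnostic_accuracy(pairs):
    """Compare WGS calls with the clinical reference.

    pairs: iterable of (clinical_resistant, wgs_resistant) booleans,
    one per organism/antibiotic combination.
    """
    tp = fn = tn = fp = 0
    for clinical_resistant, wgs_resistant in pairs:
        if clinical_resistant and wgs_resistant:
            tp += 1
        elif clinical_resistant:
            fn += 1  # phenotype resistant, genotype susceptible
        elif wgs_resistant:
            fp += 1  # phenotype susceptible, genotype resistant
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "concordance": ratio(tp + tn, tp + tn + fp + fn),
    }


def filter_minor_taxa(read_counts, min_fraction=0.01):
    """Drop organisms assigned less than min_fraction of the available reads."""
    total = sum(read_counts.values())
    return {name: reads for name, reads in read_counts.items()
            if total and reads / total >= min_fraction}


# Invented example: 10 resistant and 20 susceptible phenotypes.
pairs = ([(True, True)] * 8 + [(True, False)] * 2 +
         [(False, False)] * 18 + [(False, True)] * 2)
result = diagnostic_accuracy(pairs)
print(result["sensitivity"], result["specificity"])  # 0.8 0.9

kept = filter_minor_taxa({
    "Escherichia coli": 9500,
    "Klebsiella pneumoniae": 450,
    "Staphylococcus epidermidis": 50,
})
print(sorted(kept))  # ['Escherichia coli', 'Klebsiella pneumoniae']
```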
For MinION-generated reads, fastq and timing data were extracted in real time and iteratively updated using fast5-Watcher.py (https://github.com/nick297/fast5_scripts; commit vb88e14a). We conducted metagenomic classification using Kraken with no filtering threshold. We predicted antibiotic susceptibility via Mykrobe for samples identified as S. aureus (v0.4.3-0-gd6c8714); for E. coli and Klebsiella sp., resistance predictors were identified by BLAST against the published mutation/gene catalogue [25]. Mykrobe provided data quality parameters including depth of species and resistance mechanisms for S. aureus. The number of bases recovered, read number and length statistics, and accuracy, were calculated for all runs using nanoStats.py (https://github.com/nick297/fast5_scripts; commit vb88e14a).
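For readers unfamiliar with these run-level summaries, the sketch below computes read number, bases recovered, mean read length and N50 from a FASTQ stream. It is not the nanoStats.py implementation; it assumes four-line FASTQ records, includes N50 as one common length statistic (an assumption, since the text does not list which statistics were reported), and uses three invented reads.

```python
import io


def read_lengths(fastq_handle):
    """Yield sequence lengths from a four-line-per-record FASTQ stream."""
    for index, line in enumerate(fastq_handle):
        if index % 4 == 1:  # the second line of each record holds the sequence
            yield len(line.strip())


def n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


def summarise_lengths(lengths):
    lengths = list(lengths)
    total = sum(lengths)
    return {
        "reads": len(lengths),
        "bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "n50": n50(lengths),
    }


# Three invented reads of 4, 8 and 12 bases.
FASTQ = (
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGTACGT\n+\nIIIIIIII\n"
    "@r3\nACGTACGTACGT\n+\nIIIIIIIIIIII\n"
)
stats = summarise_lengths(read_lengths(io.StringIO(FASTQ)))
print(stats)  # {'reads': 3, 'bases': 24, 'mean_length': 8.0, 'n50': 12}
```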

DNA extraction optimization
Measurement of DNA yields from 23 positive blood cultures (E. coli, n=11, S. aureus, n=12) from 17 individuals (aerobic and anaerobic blood culture processed from 6/17 individuals) followed extraction using the three commercially available kits. According to manufacturers, all kits remove inhibitors and deplete human cells.
Initial yields from six E. coli and six S. aureus positive samples demonstrated that the BiOstic kit provides the most DNA (Qubit fluorometer), with mean values up to 430x greater than MolYsis or Amplex (Table 1). All BiOstic extracts contained detectable DNA, while in 2/6 MolYsis and 5/6 Amplex no S. aureus DNA was detected. On this basis, Amplex was disregarded as a suitable method to extract DNA from blood cultures for WGS purposes. qPCR assessment of the 12 initial extracts was performed using S. aureus, E. coli and human GAPDH targets (Table S1). MolYsis extracts yielded 10^3-10^5 copies [interquartile range (IQR)] for E. coli and S. aureus, alongside 0-10^2 human copies (IQR; Table 2). The six BiOstic extracts initially failed to amplify, but amplification of diluted input DNA with and without SPRI bead clean-up (n=4) yielded 10^3-10^7 copies of S. aureus and E. coli, alongside 0-10^6 human copies (IQR; Table 2). This suggests amplification may be inhibited by contaminants carried over during the extraction process or by excessive DNA.
Overall, yields in E. coli samples exceeded S. aureus from both extraction kits, possibly due to incomplete lysis of S. aureus without a specific lysis enzyme (e.g. lysostaphin). The MolYsis kit was more successful at human cell/DNA depletion, shown by the lower ratio of human to bacterial DNA as compared to BiOstic (Table 2). However, the higher bacterial DNA concentrations and the success of SPRI clean-up of BiOstic extractions indicate the validity of this method for further testing to enhance human cell depletion.
Depletion of human cells/DNA optimization
All human cell/DNA depletion experiments were assessed using qPCR (Fig. S1, Tables S1 and S3). Differential centrifugation was performed for two fresh samples to determine what speed and time effectively depleted human copy number (Fig. S1; pre-step a) from initial values of 248 in sample 1 and 56 in sample 2. In sample 1, 2000 g centrifugation, for both 30 s and 1 min, demonstrated the greatest reduction in human copy number (Table S3), whereas in sample 2 centrifugation at 1000 g for 30 s and 1 min reduced copy number by >90 %. Variation in relative bacterial load was minimal, other than in 3000 g for 1 min centrifugation, which depleted bacterial cells (Table S3).
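The percentage reductions quoted for human copy number follow from a simple ratio, sketched below. The initial values of 248 and 56 are taken from the text; the post-centrifugation value of 62 is hypothetical, chosen only to show the arithmetic, and the function names are illustrative.

```python
def percent_reduction(initial, final):
    """Percentage fall in copy number relative to the starting value."""
    if initial <= 0:
        raise ValueError("initial copy number must be positive")
    return 100 * (initial - final) / initial


def max_final_for_reduction(initial, reduction_percent):
    """Largest remaining copy number compatible with a given percentage reduction."""
    return initial * (1 - reduction_percent / 100)


# Hypothetical outcome for sample 1: 248 copies falling to 62 copies.
print(percent_reduction(248, 62))  # 75.0

# A >90 % reduction in sample 2 (initially 56) leaves fewer than about 5.6 copies.
print(round(max_final_for_reduction(56, 90), 1))  # 5.6
```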
Although 1000 g centrifugation resulted in the largest single reduction in human copy number seen across both samples, centrifugation at 2000 g reduced human DNA more effectively where burden was higher and was consequently incorporated into the final protocol alongside a distilled water wash (Table S3 and finalized protocol in Fig. S1).
Three non-E. coli and non-Klebsiella sp. specimens were discrepant (Table S4; MALDI-TOF Aeromonas sp., Acinetobacter lwoffii, and Brevibacillus sp.; Kraken Enterobacter aerogenes, Acinetobacter baumannii, and unclassifiable). Polymicrobial infections were identified by clinical diagnosis in 12/44 cases (including two non-E. coli and non-Klebsiella sp. specimens); Kraken classified all species in 2/12 of these cases, while in 3/12 cases the un-identified organism(s) were found in an additional, un-sequenced, blood culture bottle (Table S4). For the remaining 7/12 cases, the co-infecting organisms were found at <0.5 % of the total read number (3/7), or were undetectable by Kraken (4/7). Kraken identified additional organisms, not seen in clinical diagnosis, in 4/44 of cases (although at 1-2 % of the total read number). The percentage of human DNA reads across all samples ranged from 0-15.7 % (mean 1.2 %, IQR 0.04-1.1 %; Fig. 1).
In successful sequencing runs, over 85 % of reads were assigned to the infecting organism (Fig. 3). Real-time analysis allowed identification of infecting species within 10 min of sequencing commencing (Fig. S2), even where yield was variable. Approximately 80 % of total yield was obtained in the first 11 h of sequencing.
Depth of coverage was 4-33x for S. aureus and 12-210x for E. coli (Table 4). Drug susceptibility prediction was 97 % concordant in S. aureus (samples 3, 4, 9 in Table 4; data not shown) with one specimen fully sensitive, one penicillin resistant, and one penicillin and fusidic acid resistant. Mykrobe predicted one sample, phenotypically susceptible to trimethoprim-sulfamethoxazole, to be trimethoprim resistant. Susceptibility predictions for E. coli were 86 % concordant (samples 1, 2, 5 in Table 4; data not shown). One isolate was fully sensitive and one concordant for amoxicillin and co-amoxiclav resistance (TEM-30 identified). In the final sample, SHV was detected, leading to a concordant prediction of amoxicillin resistance. However, the SHV variant could not be differentiated, preventing genotypic prediction for ceftriaxone, ceftazidime and co-amoxiclav.
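Mean depth of coverage relates the bases recovered for an organism to its genome length, as in the sketch below. The 5 Mb genome length is a round illustrative figure rather than the exact length of the reference used here, the base counts are invented, and the calculation assumes every base maps to the target genome, which overestimates depth when human or other reads are present.

```python
def mean_depth(total_bases, genome_length):
    """Average number of times each genome position is covered."""
    if genome_length <= 0:
        raise ValueError("genome length must be positive")
    return total_bases / genome_length


def bases_needed(target_depth, genome_length):
    """Bases of on-target sequence required to reach a target mean depth."""
    return target_depth * genome_length


ILLUSTRATIVE_GENOME_LENGTH = 5_000_000  # assumption: round figure for illustration

# Invented yield of 60 Mb of on-target sequence.
print(mean_depth(60_000_000, ILLUSTRATIVE_GENOME_LENGTH))  # 12.0

# On-target bases needed for 30x mean depth of the same illustrative genome.
print(bases_needed(30, ILLUSTRATIVE_GENOME_LENGTH))  # 150000000
```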

DISCUSSION
We demonstrate a human cell/DNA depletion and bacterial extraction method to allow WGS directly from aerobic and anaerobic BACTEC blood culture bottles. The method includes differential centrifugation to remove intact human cells, and a distilled water wash to lyse remaining human cells. Removal of free human DNA occurs following pelleting of bacterial cells. Following these steps, we extract bacterial DNA with a commercial kit (BiOstic Bacteraemia) and clean with SPRI beads prior to sequencing; effectively removing inhibitors common to blood culture media, including sodium polyanetholsulfonate [26]. The results indicate no inhibition of WGS library preparation using this protocol. The method does not require specialist equipment or reagents, so is cost efficient and straightforward to implement in a range of settings.
Avoiding the use of specialist reagents allows the method to be used for most bacteria, exploiting the non-specificity of WGS to allow diagnosis to encompass the 20-25 pathogens causing most BSI, as well as rarer pathogens and known resistance conferring genes/mutations and virulence genes [6]. The use of specific lysis reagents, such as lysostaphin, may improve lysis of some target organisms (for example, Staphylococcus sp.) while kits such as the MolYsis Basic5 kit (Molzym, Germany) may reduce human DNA further. The effect of our and other potential methods on DNA concentrations for different bacterial species, including intracellular pathogens, should be explored in future investigations.
The complexity of WGS laboratory protocols and bioinformatics analysis is often viewed as an impediment to implementation in clinical settings [29,30]. However, kit-based sequencing preparation methods (such as Nextera XT), paired with the Illumina MiSeq and bioinformatics tools designed for clinical usage (such as Mykrobe Predictor), have already enabled the roll-out of WGS-based infectious disease diagnosis for organisms such as Mycobacterium tuberculosis [31]. For Gram-negative bacteria a WGS-resistance prediction tool designed for clinical usage has not yet been made widely available, leading to the use of Kraken and an in-house tool (resistType) in this study. However, as this is an area of rapid development, we anticipate that a suitable tool will be available in the near future [32][33][34].
Using our method, species identification and drug susceptibility prediction could be performed with >93 % concordance to clinical diagnostic results. We note discordance with low sequencing quality, co-infections, and high human DNA content, which reduce the number of reads available for susceptibility prediction and increase the likelihood of low-confidence predictions. For unknown reasons, the number of Gram-negative BSI identified as co-infections was high during this study at 27 % (12/44) versus reported rates of 6-12 % [19]. For 3/12 of these, phenotypic investigations identified the co-infecting organism in the unsequenced bottle. Information regarding the distribution of species in individual culture bottles is unavailable for the remainder. Performing WGS with both aerobic/anaerobic blood culture bottles, or a mixture of the two, would avoid missing additional pathogens in the future. Although WGS may identify co-infections present in the same culture bottle more readily than MALDI-TOF MS [4], further investigations are required to confirm this potential and determine limits of detection.
WGS reports three very major errors (phenotype resistant, genotype sensitive) for co-amoxiclav resistance in Gram-negative bacteria. However, 15 % discordance between co-amoxiclav MICs generated by automated microbroth dilution and gradient diffusion has previously been observed [25], while repetition of automated microbroth dilution generates differing susceptibility predictions in 5 % of samples [25]. There remains a clear requirement for further investigations to explore the genotype-phenotype relationship for co-amoxiclav.
The timeliness of appropriate antimicrobial therapy is crucial to the reduction of BSI-related morbidity and mortality. An alternative approach is to explore sequencing directly from blood or plasma [35]. Sequencing circulating cell-free DNA from plasma permits species and limited antimicrobial-resistance diagnosis based on <1× coverage depth [35]. Although by-passing culture steps provides rapid turnaround, this method does not yet provide robust data for full resistance prediction.
MinION-based WGS with computational support can also reduce turnaround times, predicting species within 4 h of culture positivity (in this study: 3 h DNA extraction; 1 h for sequencing library preparation and species identification) and subsequently generating drug-resistance predictions [36]. The MinION is random access (reducing the requirement for sample batching) and permits real-time sequencing data analysis, minimizing these time-delays. This turnaround time begins to rival MALDI-TOF, even with rapid subculture [7,10], and microarray approaches where <4 h turnaround times are reported [37], with the advantage of being un-restricted by a target panel and capable of generating resistance predictions. Given the rapid development of MinION-based sequencing, a <4 h time from positive blood culture to species identification of any organism, drug susceptibility prediction and phylogenetic placement is becoming increasingly tangible. Application to direct from blood/plasma sequencing would similarly reduce diagnostic time [35], and provide a crucial step towards point-of-care BSI diagnosis. However, in this investigation 3/9 MinION sequencing runs failed to generate sufficient data for analysis, suggesting optimization of methods to improve robustness is required. MinION sequencing may also prove to be prohibitively expensive, with uncertain potential surrounding the ability to multiplex samples and thus reduce costs.
Fig. 3. Percentage of total MinION reads assigned to S. aureus, E. coli, other bacteria, or human genomes by Kraken for samples 1, 2, 3, 4, 5 and 9. Insufficient reads for data analysis seen in samples 6, 7 and 8 (Table 4).
We demonstrate that DNA of sufficient quantity and quality can be extracted from positive blood culture bottles to allow species identification and drug-resistance prediction using MiSeq- and MinION-based WGS. WGS offers the potential for an end-to-end diagnostic solution, replacing the multiple clinical workflows currently used to support species identification and drug susceptibility testing. Further investigations are required to assess the performance of WGS in parallel with routine clinical testing.