Molecular diagnosis of orthopaedic device infection direct from sonication fluid by metagenomic sequencing

Culture of multiple periprosthetic tissue samples is the current gold standard for microbiological diagnosis of prosthetic joint infections (PJI). Additional diagnostic information may be obtained through sonication fluid culture of explants. However, current techniques can have relatively low sensitivity, with prior antimicrobial therapy and infection by fastidious organisms influencing results. We assessed whether metagenomic sequencing of complete bacterial DNA extracts obtained direct from sonication fluid can provide an alternative rapid and sensitive tool for diagnosis of PJI. We compared metagenomic sequencing with standard aerobic and anaerobic culture in 97 sonication fluid samples from prosthetic joint and other orthopaedic device infections. Reads from Illumina MiSeq sequencing were taxonomically classified using Kraken. Using 50 samples (derivation set), we determined optimal thresholds for the number and proportion of bacterial reads required to identify an infection and validated our findings in 47 independent samples. Compared to sonication fluid culture, the species-level sensitivity of metagenomic sequencing was 61/69 (88%, 95% CI 77-94%) (derivation samples 35/38 [92%, 79-98%]; validation 26/31 [84%, 66-95%]), and genus-level sensitivity was 64/69 (93%, 84-98%). Species-level specificity, adjusting for plausible fastidious causes of infection, species found in concurrently obtained tissue samples, and prior antibiotics, was 85/97 (88%, 79-93%) (derivation 43/50 [86%, 73-94%]; validation 42/47 [89%, 77-96%]). High levels of human DNA contamination were seen despite use of laboratory methods to remove it. Rigorous laboratory good practice was required to prevent bacterial DNA contamination. We demonstrate that metagenomic sequencing can provide accurate diagnostic information in PJI. Our findings, combined with the increasing availability of portable, random-access sequencing technology, offer the potential to translate metagenomic sequencing into a rapid diagnostic tool in PJI.


Introduction
Prosthetic joint infections (PJI) are a devastating and difficult-to-treat complication of joint replacement surgery. Although the relative incidence of PJI is low (0.8% knee and 1.2% hip replacements across Europe) (1), given the increasing numbers of arthroplasties performed worldwide it is a significant healthcare burden and cause of expense. For individual patients, PJI often requires multiple surgeries, intensive, long-term antimicrobial therapy, and a prolonged period of rehabilitation. Fast, accurate and reliable diagnosis of PJI is necessary to inform treatment choices, particularly for antibiotic resistant organisms.

Culture of multiple periprosthetic tissue (PPT) samples remains the gold-standard method of microbial detection (2-4). However, culture can be relatively insensitive with only 65% of causative bacteria detected in infections even when multiple PPT samples are collected (2,5). Infections with fastidious organisms or where a patient has received prior antimicrobial treatment are often culture-negative.

Culture of sonication fluid from explanted prostheses may improve microbiological yield in PJI, by disrupting bacterial biofilm. Since sonication was first applied to explanted hip prostheses in 1998 (6) several clinical studies have reported improved sensitivity of [...] pathogens not targeted in the assay design. Other studies identify pathogens by amplification and sequencing of the universal bacterial 16S ribosomal RNA gene (13,14). A drawback of these methods is the potential for generating false-positive results from contaminating bacterial DNA.

The potential of high-throughput sequencing as a diagnostic tool for infectious diseases is widely recognized (15-17). Metagenomic sequencing offers the possibility to detect all DNA in a clinical sample, which can be compared to reference genome databases to identify pathogens.
Additionally, a profile of common laboratory and kit contaminants can be generated from negative controls sequenced concurrently and accounted for (18,19). In addition to diagnostic data, whole-genome sequencing can also simultaneously provide characterization of infection outbreaks (20,21), tracking of transmission (22-24) and prediction of antimicrobial resistance (25-28). An advantage offered by sequencing is the speed at which it can deliver genetic information (29) compared to traditional microbiological culture and antimicrobial susceptibility testing, which can take days to weeks depending on the pathogen. By removing a culture step and sequencing directly from clinical samples the time taken to diagnosis can be reduced further (30) and pathogens not identified by conventional methods can be detected (31-33). Here, we investigated if metagenomic sequencing of complete bacterial DNA extracts obtained direct from sonication fluid can provide an alternative rapid and sensitive tool for diagnosis of PJI, without the need for a culture step.

Materials and Methods
Sample collection and processing. Intra-operative samples from the Nuffield Orthopaedic Centre (NOC) in Oxford University Hospitals (OUH), UK, between June 2013 and January 2017 were investigated. The NOC is a tertiary level specialist musculoskeletal hospital, including a dedicated Bone Infection Unit, undertaking approximately 200 revision arthroplasties annually. A subset of the samples submitted was chosen at random following culture to provide a ratio of approximately 2:1 bacterial culture-positive samples to culture-negative samples.

Prosthetic joint implants and metalwork, received into the OUH microbiology laboratory following revision arthroplasty and operative management of other orthopaedic device related infection, were placed directly into single-use sterile polypropylene containers [...]

Bioinformatics analysis. Raw sequencing reads were adapter trimmed using BBDuk (https://sourceforge.net/projects/bbmap/) with the following parameters: minlength=36, k=19, ktrim=r, hdist=1, mink=12 and the adapter sequence file provided within the bbmap package. Taxonomic classification of trimmed reads was performed using Kraken (35) and a bespoke database constructed from all bacterial genomes deposited in the NCBI RefSeq database as of January 2015 (updated January 2017 for validation set, see below), with default parameters and no k-mer removals. Where no RefSeq genome was available for an organism cultured from a PJI at OUH since June 2013, whole-genome assemblies available in NCBI were also added to the database. Additionally, the human genome GRCh38 was included in the database to allow detection of host DNA. An optimum filtration threshold, using Kraken-filter, that balanced false-positive removal and sensitivity was determined using simulated datasets of reference genomes.
Reference genomes representative of common pathogenic species were used to generate simulated Illumina MiSeq datasets and analysed with Kraken using different filtration thresholds. A threshold value of 0.15 provided optimum read classification sensitivity whilst minimizing spurious results. Kraken output was visualised using Krona (36).
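The trimming, classification and filtering steps described above can be sketched as command lines assembled in Python. This is a minimal illustration, not the authors' actual invocation: the BBDuk parameters and the Kraken-filter threshold of 0.15 come from the text, whereas the paired-end layout, all file names and the database path are assumptions made for the example.

```python
from typing import List, Tuple


def bbduk_command(r1: str, r2: str, out1: str, out2: str, adapters: str) -> List[str]:
    """Adapter-trimming command using the BBDuk parameters quoted in the text."""
    return [
        "bbduk.sh",
        f"in={r1}", f"in2={r2}",        # assumed paired-end input files
        f"out={out1}", f"out2={out2}",  # assumed output file names
        f"ref={adapters}",              # adapter file shipped with bbmap
        "ktrim=r", "k=19", "mink=12", "hdist=1", "minlength=36",
    ]


def kraken_commands(db: str, r1: str, r2: str, raw_out: str,
                    threshold: float = 0.15) -> Tuple[List[str], List[str]]:
    """Classification command, then confidence filtering with kraken-filter.

    kraken-filter writes to standard output, so the caller is expected to
    redirect it to a file (for example via subprocess.run(..., stdout=fh)).
    """
    classify = ["kraken", "--db", db, "--fastq-input", "--paired",
                "--output", raw_out, r1, r2]
    filt = ["kraken-filter", "--db", db, "--threshold", str(threshold), raw_out]
    return classify, filt


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    trim = bbduk_command("s_R1.fq", "s_R2.fq", "t_R1.fq", "t_R2.fq", "adapters.fa")
    classify, filt = kraken_commands("refseq_bact_db", "t_R1.fq", "t_R2.fq", "s.kraken")
    for cmd in (trim, classify, filt):
        print(" ".join(cmd))
```

Building the commands as argument lists keeps them easy to log and to pass to `subprocess.run` without shell quoting problems.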

Statistical analysis.
In order to correct for samples which may contain small numbers of contaminating and non-specific bacterial reads, a threshold was determined to identify the presence of true infection, using the first 50 samples sequenced as a derivation set. Two thresholds (1 and 2), and three parameters (a-c), were used to determine true infection: 1) samples with more reads from a given species than an upper read cut-off (a) were included; 2) samples with more species-specific reads than a lower read cut-off (b) and with the percentage of species-specific reads as a proportion of all bacterial reads present above a percentage cut-off (c) were also included. Parameter values were selected by numerical optimisation, using R version 3.3.2, comparing sequencing results to sonication fluid culture results, and maximising the value of the Youden Index (37) (sensitivity + specificity - 1). Sensitivity was calculated taking each species identified from each culture-positive sonication sample as a separate data point. Specificity was calculated using the total number of sonication samples as the denominator; as such, samples contaminated by more than one species were counted as one false-positive.

To ensure that read cut-off parameters were chosen without a penalty for potentially difficult to culture anaerobic species, the specificity value optimised was adjusted. Potential 'false-positive' sequencing results with plausible fastidious anaerobic causes of infection (including Fusobacterium nucleatum, Propionibacterium acnes and Veillonella parvula) in culture-negative samples were excluded when calculating the specificity value used for parameter optimisation.

Where bacterial reads were detected over the thresholds described above in a negative control, that sample was deemed to be contaminated. In the derivation set, in order to maximise the number of sequences available for analysis, only samples with evidence of the same contaminating organisms were excluded from each contaminated batch, rather than discarding the whole batch.
During the derivation phase of the study, several batches of samples were found to be contaminated with DNA from other studies performed concurrently in the same research laboratory. Six of eight saline negative control extracts displayed contamination with a single or multiple species at read numbers exceeding the determined diagnostic thresholds. All samples within these batches that displayed similar contamination were excluded from subsequent analysis if Kraken classification resulted in >100 reads corresponding to the majority of the contaminating species. A total of 22 samples (in addition to the 50 successfully sequenced) were excluded on this basis (Figure 1). In batches 4 and 5 the negative controls were contaminated with Staphylococcus aureus, Escherichia coli, and P. acnes, and 15 samples were excluded with >100 reads from ≥2/3 species; in batch 6 the negative control was contaminated with Serratia marcescens, Klebsiella pneumoniae, E. coli, and P. acnes, and 2 samples with >100 reads from ≥3/4 species were excluded; in batches 2, 9 and 10 the negative control was contaminated with P. acnes, and 5 samples were excluded with >100 P. acnes reads. To address this issue, prior to the validation phase of the study, all pipettes, laminar flow and PCR hoods and laboratory benches used for DNA extraction and library preparation were deep-cleaned with Virkon disinfectant and RNase AWAY surface decontaminant (Thermo Fisher Scientific, Waltham, MA, USA) in order to remove any possible sources of microbial or DNA contamination. All DNA extraction and library preparation reagents were replaced, and used in pre-prepared per-batch aliquots used exclusively for this study. Sonication fluid samples were handled one at a time in the laminar flow hood, which was cleaned as above between each sample.
Fresh gloves were worn each time a new sample was handled during the DNA extraction phase of the protocol. Having implemented these changes, for the validation phase, a more stringent quality control standard was applied, requiring the negative control to be contamination-free for any of the samples in a batch to be analyzed.
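The threshold selection described in this section can be illustrated with a small Python sketch. The original analysis used numerical optimisation in R; an exhaustive grid search is swapped in here for simplicity, cut-offs are treated as inclusive (an assumption), the adjustment for fastidious anaerobes is omitted, and the four toy samples and their field names are invented purely for illustration.

```python
from itertools import product


def call_species(reads, bacterial_reads, a, b, c):
    """Species called as true infection: reads >= a, or reads >= b with share >= c%."""
    called = set()
    for species, n in reads.items():
        pct = 100.0 * n / bacterial_reads if bacterial_reads else 0.0
        if n >= a or (n >= b and pct >= c):
            called.add(species)
    return called


def youden_index(samples, a, b, c):
    """Sensitivity is per cultured species; specificity is per sample."""
    tp = fn = fp_samples = 0
    for s in samples:
        called = call_species(s["reads"], s["bacterial_reads"], a, b, c)
        tp += len(called & s["cultured"])
        fn += len(s["cultured"] - called)
        if called - s["cultured"]:
            fp_samples += 1  # several false-positive species still count once
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = 1.0 - fp_samples / len(samples)
    return sensitivity + specificity - 1.0


def grid_search(samples, a_grid, b_grid, c_grid):
    """Return (best Youden index, (a, b, c)) over all combinations with b <= a."""
    candidates = ((youden_index(samples, a, b, c), (a, b, c))
                  for a, b, c in product(a_grid, b_grid, c_grid) if b <= a)
    return max(candidates, key=lambda item: item[0])


# Invented toy data, not taken from the study.
toy = [
    {"reads": {"S. aureus": 5000, "P. acnes": 40}, "bacterial_reads": 5100,
     "cultured": {"S. aureus"}},
    {"reads": {"S. epidermidis": 300, "E. coli": 20}, "bacterial_reads": 400,
     "cultured": {"S. epidermidis"}},
    {"reads": {"P. acnes": 90}, "bacterial_reads": 2000, "cultured": set()},
    {"reads": {"E. coli": 600}, "bacterial_reads": 20000, "cultured": set()},
]

if __name__ == "__main__":
    print(grid_search(toy, [100, 1150], [100, 125], [0, 15]))
```

On real data the grids would need to be much finer, and ties between parameter sets with the same Youden Index would need an explicit tie-breaking rule.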

Technical replicates.
To ensure sequencing reproducibility, one DNA sample was sequenced twice and biological replicates (DNA extraction process repeated) were sequenced for six samples (four in duplicate and two in triplicate). Samples extracted and sequenced as replicates showed good reproducibility. In four duplicate and one triplicate culture-positive samples the same species was recovered by WGS on all occasions (samples 164, 171, 182, 183, and 193). A single replicate, 182a, had an additional, likely contaminating, species identified (not found in sonication fluid or PPT culture). A single culture-negative sample (176) was processed in triplicate. One of the three replicates (176a) had an apparent contaminating species identified (also not found in sonication fluid or PPT culture).

Results
A total of 131 sonication fluids from patients undergoing revision arthroplasty or removal of other orthopaedic devices were aerobically and anaerobically cultured and underwent metagenomic sequencing (Figure 1). Additionally, a median (IQR) [range] of 5 (4-5) [1-8] PPT samples were cultured from each patient. S. aureus, isolated from 22% of sonication fluids and 29% of PPTs, and Staphylococcus epidermidis, from 16% of sonication fluids and 25% of PPTs, were the 2 most frequently cultured species (Table 1). [...] Table 2). On culture, 35 (36%) sonication fluid samples had no growth, or <50 CFU of an organism not considered to be highly pathogenic (skin and oral flora), 55 (57%) had a single organism isolated, and 7 (7%) had two organisms isolated. Greater than 10^6 reads were achieved in 91/97 (94%) [...]

Optimal thresholds for determining if samples contain low-level contamination or true infection were determined by numerical optimisation, choosing thresholds that maximised the sensitivity and specificity of sequencing (Figure 2). The final thresholds chosen to determine the presence of true infection were ≥1150 reads from a single species, or ≥125 reads from a single species if ≥15% of the total bacterial reads also belong to that same species. [...]

Figure 3 shows the relationship between the proportion of sequence reads obtained that were classified as bacterial, the sonication fluid culture CFU counts, and the concordance between sonication fluid culture and sequencing. Sequencing false-positive results were more likely when cultures were negative.

Simpler thresholds for determining true infection performed less well. Within the derivation set, using a single cut-off for the proportion of bacterial reads from a given species, irrespective of the absolute numbers of bacterial reads present, the optimal cut-off value was 25%.
Using this threshold, species-level sensitivity was 30/38 (79%) and adjusted specificity 44/50 (88%). Similarly, if only a single absolute read number cut-off is used the optimal value is 410 reads from a single species, sensitivity 30/38 (79%) and adjusted specificity 45/50 (90%).

Sequencing results were also compared to a consensus microbiology diagnosis based on IDSA guidelines (4), considering any species isolated twice or any virulent species isolated as a cause of infection, combining sonication and PPT culture results (Supplementary Table 2). 65/97 (67%) samples showed complete agreement between the consensus species list from culture and sequencing, 15/97 (15%) a partial match with at least one species found on culture also found on sequencing, 15/97 (15%) had none of the species cultured found on sequencing, and 2/97 (2%) had a plausible additional species found on sequencing not found on culture.
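To make the difference between the final combined rule and the two simpler single cut-off rules concrete, the following Python sketch applies each of them to a species read count. The cut-off values (1150, 125, 15%, 25% and 410) are those reported above; the function names, the inclusive treatment of the simpler cut-offs and the example read counts are assumptions made for illustration.

```python
UPPER_READS = 1150   # reads from one species sufficient on their own
LOWER_READS = 125    # lower read cut-off, used together with MIN_PERCENT
MIN_PERCENT = 15.0   # minimum share of all bacterial reads, in percent


def percent_of_bacterial(reads: int, bacterial_reads: int) -> float:
    """Share of all bacterial reads assigned to one species, in percent."""
    return 100.0 * reads / bacterial_reads if bacterial_reads > 0 else 0.0


def combined_rule(reads: int, bacterial_reads: int) -> bool:
    """Final rule: >=1150 reads, or >=125 reads making up >=15% of bacterial reads."""
    pct = percent_of_bacterial(reads, bacterial_reads)
    return reads >= UPPER_READS or (reads >= LOWER_READS and pct >= MIN_PERCENT)


def proportion_only_rule(reads: int, bacterial_reads: int, cutoff: float = 25.0) -> bool:
    """Simpler alternative: a single cut-off on the proportion of bacterial reads."""
    return percent_of_bacterial(reads, bacterial_reads) >= cutoff


def absolute_only_rule(reads: int, cutoff: int = 410) -> bool:
    """Simpler alternative: a single cut-off on the absolute read count."""
    return reads >= cutoff


if __name__ == "__main__":
    # Hypothetical (species reads, total bacterial reads) pairs.
    for reads, total in [(300, 400), (600, 20000), (2000, 20000)]:
        print(reads, total,
              combined_rule(reads, total),
              proportion_only_rule(reads, total),
              absolute_only_rule(reads))
```

In these invented cases, a dominant species with few reads (300 of 400) is missed by the absolute cut-off alone, and an abundant species in a mixed sample (2000 of 20000) is missed by the proportion cut-off alone; the combined rule calls both, which is one way to read the lower sensitivity of the simpler rules.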

Discussion
Diagnosis of PJI by culture of sonication fluid and PPT is not always conclusive, and may take up to 10-14 days for slow-growing organisms. Here we assess, for the first time, the use of metagenomic sequencing of complete bacterial DNA extracts obtained direct from sonication fluid in the diagnosis of PJI. We develop a novel filtering strategy to ensure that low-level contaminating DNA is successfully ignored, while infections are detected accurately. Compared to sonication fluid culture, metagenomic sequencing achieved a species-level sensitivity of 88% and specificity of 88% after adjusting for plausible fastidious causes of infection, species found in concurrently obtained PPT samples, and prior antibiotics. Importantly, we demonstrate similar performance of our method and filtering algorithm in the subset of samples that formed an independent validation set (sensitivity 84%, adjusted specificity 89%).

Sequencing failed to identify an organism cultured from sonication fluid for eight samples. [...] Parvimonas micra, and we were also able to identify a plausible pathogen in two patients who had received prior antibiotics where the routine microbiology was uninformative.

Controlling for contamination during sampling and culture is a major challenge in investigating PJI, and underlies why culture of multiple independent PPT samples remains the gold standard for diagnosis. Contamination is an even greater concern in molecular diagnostics given the additional potential for DNA contamination. There are published examples demonstrating the potential for contamination leading to misinterpretation of sequencing data from clinical specimens (38,39). In our laboratory, samples were handled in laminar flow hoods and extracted in a dedicated pre-PCR extraction laboratory. DNA was handled in a PCR hood and sequencing libraries were manipulated in a dedicated post-PCR sequencing laboratory.
Despite these measures, we still observed contamination in some of our samples. During the derivation phase of our study it is likely that one or more of the reagents used became contaminated with DNA from other sequencing projects in our laboratory. Although we were able to account for this in our analysis, and then validate our findings in a separate set of samples having resolved this issue, it demonstrates that rigorous laboratory practice would be key to deploying our method. There may also be a role for sealed systems that perform DNA extraction and sequencing in a separated environment. Our experience also reinforces the requirement that negative controls are included in each sequencing batch, as is routine in molecular microbiology diagnostic assays, to ensure contamination is detected if it does occur. A limitation of our study is that the saline used for sonication was not PCR-grade, and this could be considered in future work.

Despite addressing the major contamination issues in the derivation phase, contamination with P. acnes remained an issue in one of our validation batches. Overall, false-positive results for P. acnes were found in 7% of samples. Species-specific filtering may be required to address this; our one true-positive sample with P. acnes present on culture had >10^5 P. [...]

This study demonstrates as a proof of principle that metagenomic sequencing can be used in the culture-free diagnosis of PJI directly from sonication fluid. Improvements to the method of human DNA removal from direct samples before sequencing are ongoing, and if these are successful, this is likely to greatly improve the efficiency, and therefore accuracy, of metagenomic sequencing.
Generating greater numbers of bacterial reads direct from clinical specimens may make prediction of antimicrobial susceptibilities direct from samples possible, as has been achieved from whole-genome sequencing of cultured organisms (25-28). If this can be achieved reliably, it is possible that sequencing can offer a complete microbiology diagnosis without the need for culture. The increasing availability of portable, rapid, random-access strand sequencing technology offers the potential for sequencing to become a same-day diagnostic tool in future. Applications of rapid sequencing in PJI might include perioperative microbiological diagnosis to guide local intraoperative antimicrobials, for example in cement or beads. Earlier diagnosis may also ensure post-operative antimicrobials are more focused, improving antimicrobial stewardship, while treating resistant organisms effectively. Earlier diagnosis may also reduce hospital stays and therefore reduce costs. Sequencing is also likely to be helpful in situations where multiple samples containing the same commensal species are identified. Sequencing will be able to determine whether these are clonal, suggesting true infection rather than contamination, instead of having to rely on current proxies such as antibiograms, which only imperfectly distinguish non-clonal isolates. Ultimately, same-day sequencing may significantly improve the precision, efficiency and cost of PJI care. This study provides a foundation for further development towards this goal.
[...] and three parameters (a-c), were used to determine true infection: 1) samples with more reads from a given species than an upper read cut-off (a, x-axis) were included; 2) samples with more species-specific reads than a lower read cut-off (b, panels) and with the percentage of species-specific reads as a proportion of all bacterial reads present above a percentage cut-off (c, y-axis) were also included. Youden Index = (sensitivity + specificity) - 1.

[...] and/or screws from tibia (n=3), femur (n=4), spine (n=2), foot (n=2), humerus (n=1), ankle (n=1) and ulna (n=1).