Abstract
Background
Diagnostic genetic testing programmes based on next-generation DNA sequencing have resulted in the accrual of large datasets of targeted raw sequence data. Most diagnostic laboratories process these data through an automated variant-calling pipeline. Validation of the chosen analytical methods typically depends on confirming the detection of known sequence variants. Despite improvements in short-read alignment methods, current pipelines are known to be comparatively poor at detecting large insertion/deletion mutations.
Methods
We performed clinical validation of a local reassembly tool, ABRA (assembly-based realigner), through retrospective reanalysis of a cohort of more than 2000 hereditary cancer cases.
Results
ABRA enabled detection of a combined 96-bp deletion and 4-bp insertion in PMS2 that had initially been identified using a comparative read-depth approach. We applied an updated pipeline incorporating ABRA to the entire cohort of more than 2000 cases and identified one previously undetected pathogenic variant, a 23-bp duplication in PTEN. By comparing HiSeq2500 (2 × 101-bp) and NextSeq500 (2 × 151-bp) sequence data for a range of variants, we demonstrate the effect of read length on the ability to detect insertion/deletion variants and show that the limitations of shorter read lengths can be mitigated using appropriate informatics tools.
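To give a rough sense of why read length matters for insertion/deletion detection, the following toy model counts how many offsets a read of a given length can take while still containing the whole event plus a minimum anchor of matching sequence on each side. It is an illustrative sketch only: the function name, the 10-bp anchor requirement, and the counting model are assumptions made here, not part of the published pipeline or of ABRA.

```python
def spanning_placements(read_len: int, inserted_bp: int = 0, min_anchor: int = 10) -> int:
    """Count read offsets at which the read contains the whole event
    plus at least `min_anchor` reference-matching bases on each side.

    For a pure deletion the read carries no extra bases (inserted_bp=0);
    only the breakpoint junction has to fall inside the read.
    """
    return max(0, read_len - inserted_bp - 2 * min_anchor + 1)


# Bases a read must carry for the two variants named in the text:
# the PMS2 event inserts 4 bp (its 96 deleted bp are absent from the read),
# and the PTEN duplication adds 23 bp.
variants = {"PMS2 96-bp del / 4-bp ins": 4, "PTEN 23-bp dup": 23}

for name, inserted_bp in variants.items():
    for read_len in (101, 151):  # HiSeq2500 and NextSeq500 read lengths
        n = spanning_placements(read_len, inserted_bp)
        print(f"{name}: {read_len}-bp reads -> {n} informative placements")
```

Under these assumptions a 151-bp read has 50 more informative placements than a 101-bp read for any event it can contain. The model deliberately ignores how an aligner scores gaps or soft-clips reads, which is the stage at which a local reassembly tool operates.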
Conclusions
This work highlights the need for ongoing development of diagnostic pipelines to maximize test sensitivity. We also draw attention to the large differences in computational infrastructure required to perform day-to-day versus large-scale reprocessing tasks.
Ethics declarations
Conflict of interest
CMW, NC, LAC, SC, RLR, JA, RC, AFM, IMC, DTB have no competing interests.
Funding
This work was supported by Grants MR/M009084/1 and MR/L01629X/1 awarded by the UK Medical Research Council.
Ethical approval
Ethical approval was granted by the Leeds East Research Ethics Committee (07/H1306/113).
Informed consent
Written informed consent was obtained from all reported individuals.
Cite this article
Watson, C.M., Camm, N., Crinnion, L.A. et al. Increased Sensitivity of Diagnostic Mutation Detection by Re-analysis Incorporating Local Reassembly of Sequence Reads. Mol Diagn Ther 21, 685–692 (2017). https://doi.org/10.1007/s40291-017-0304-x
DOI: https://doi.org/10.1007/s40291-017-0304-x