Abstract
The accurate detection of mutations from clinical samples using Next Generation Sequencing (NGS) is of great importance in the clinical treatment of cancer patients. Clinical tests use archival pathology slides, which are preserved by Formalin-Fixation Paraffin Embedding (FFPE). The FFPE process introduces spurious C>T mutations, which hinder accurate cancer diagnosis.
FFPE mutational artifacts occur in a well-defined pattern called a mutational signature. By quantifying the abundance of the FFPE mutational signature and applying Bayes’ formula, we developed a method to filter FFPE artifacts. We implemented this method as the excerno package in the R statistical language.
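The Bayes' formula step can be illustrated with a minimal sketch. It is not the excerno implementation (which is an R package); the function name `ffpe_posterior`, the signature label `SBS_x`, the three toy contexts, and the 0.5 cutoff are all hypothetical choices made for illustration. The sketch assumes that each signature is a probability distribution over mutation contexts and that the abundance of each signature acts as the prior.

```python
def ffpe_posterior(context_probs, exposures, artifact="FFPE"):
    """Posterior probability that a mutation in each context is an artifact.

    context_probs: signature name -> {context: P(context | signature)}
    exposures:     signature name -> prior weight P(signature), summing to 1
    """
    posterior = {}
    for ctx in context_probs[artifact]:
        # Evidence: probability of seeing this context under the whole mixture.
        evidence = sum(
            exposures[sig] * context_probs[sig].get(ctx, 0.0) for sig in exposures
        )
        joint = exposures[artifact] * context_probs[artifact][ctx]
        posterior[ctx] = joint / evidence if evidence > 0 else 0.0
    return posterior


# Toy example with three contexts instead of a full mutational spectrum.
signatures = {
    "FFPE": {"A[C>T]G": 0.6, "C[C>T]G": 0.3, "T[T>A]A": 0.1},
    "SBS_x": {"A[C>T]G": 0.1, "C[C>T]G": 0.2, "T[T>A]A": 0.7},  # hypothetical
}
exposures = {"FFPE": 0.5, "SBS_x": 0.5}  # assumed equal abundance

posterior = ffpe_posterior(signatures, exposures)
# Contexts whose posterior exceeds the (assumed) cutoff would be filtered.
flagged = {ctx for ctx, p in posterior.items() if p > 0.5}
```

In this toy mixture the two C>T contexts are flagged because the FFPE signature explains most of their probability mass, while the T>A context is retained.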
We tested our method by generating mutations that follow the FFPE mutational signature and combining them with variants produced by other mutational signatures from the Catalogue of Somatic Mutations in Cancer (COSMIC). First, we mixed an equal number of FFPE variants and mutations from a single COSMIC mutational signature and tested excerno across all 60 COSMIC mutational signatures. Our median sensitivity, specificity, and Area Under the Curve (AUC) were 0.89, 0.99, and 0.96, respectively. Furthermore, our performance decreases as a linear function of the similarity between the COSMIC and the FFPE mutational signatures (R² = 0.90). We also tested our method by mixing different proportions of mutations from the COSMIC and FFPE mutational signatures. As we increased the proportion of FFPE variants, our sensitivity increased while our specificity decreased.
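The trade-off between sensitivity and specificity as the FFPE proportion grows can be reproduced on a toy two-signature mixture. The sketch below is an assumption-laden illustration, not the evaluation described in the paper: it computes expected sensitivity and specificity in closed form rather than by simulating variants, and the function names, the three contexts, the signature probabilities, and the 0.5 cutoff are hypothetical.

```python
def posterior_ffpe(p_ffpe, p_other, ffpe_fraction):
    """Bayes' formula per context for a two-signature mixture."""
    post = {}
    for ctx in p_ffpe:
        joint_ffpe = ffpe_fraction * p_ffpe[ctx]
        joint_other = (1.0 - ffpe_fraction) * p_other[ctx]
        total = joint_ffpe + joint_other
        post[ctx] = joint_ffpe / total if total > 0 else 0.0
    return post


def expected_performance(p_ffpe, p_other, ffpe_fraction, cutoff=0.5):
    """Expected sensitivity and specificity when contexts above cutoff are filtered."""
    post = posterior_ffpe(p_ffpe, p_other, ffpe_fraction)
    flagged = [ctx for ctx, p in post.items() if p > cutoff]
    # Sensitivity: share of true FFPE artifacts that land in flagged contexts.
    sensitivity = sum(p_ffpe[ctx] for ctx in flagged)
    # Specificity: share of real mutations that land outside flagged contexts.
    specificity = 1.0 - sum(p_other[ctx] for ctx in flagged)
    return sensitivity, specificity


# Hypothetical three-context signatures (real spectra have many more contexts).
p_ffpe = {"A[C>T]G": 0.6, "C[C>T]G": 0.3, "T[T>A]A": 0.1}
p_other = {"A[C>T]G": 0.1, "C[C>T]G": 0.2, "T[T>A]A": 0.7}

results = {f: expected_performance(p_ffpe, p_other, f) for f in (0.2, 0.5, 0.9)}
for fraction, (sens, spec) in results.items():
    print(f"FFPE fraction {fraction:.1f}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With only three contexts the effect is exaggerated, but the direction matches the reported behavior: a larger FFPE prior pushes more contexts over the cutoff, so more artifacts are caught and more real mutations are discarded.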
In conclusion, we developed and implemented excerno, an accurate method to filter FFPE artifactual mutations, and characterized its performance using simulated datasets.
A. Mitchell, M. Ruiz and S. Yang: Equal contributions.
Acknowledgements
We acknowledge funding from the Kay Winger Blair Endowment Award in Mathematics, the TRIO McNair Scholars Program, and the Center for Undergraduate Research (CURI) at St Olaf College. We thank Dr Asha Nair for providing datasets that motivated this research.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Mitchell, A., Ruiz, M., Yang, S., Wang, C., Davila, J. (2022). Excerno: Filtering Mutations Caused by the Clinical Archival Process in Sequencing Data. In: Bansal, M.S., et al. Computational Advances in Bio and Medical Sciences. ICCABS 2021. Lecture Notes in Computer Science, vol 13254. Springer, Cham. https://doi.org/10.1007/978-3-031-17531-2_3
DOI: https://doi.org/10.1007/978-3-031-17531-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17530-5
Online ISBN: 978-3-031-17531-2
eBook Packages: Computer Science (R0)