Abstract
This study aimed to quantify the occurrence of contradictory evidence in the functional classification of genetic variants according to the American College of Medical Genetics and Genomics (ACMG) guidelines. We analyzed 140,883 genetic variants in the Human Gene Mutation Database (HGMD). The 2014 release of HGMD, which predates the publication of the ACMG guidelines, was used because it is independent of those guidelines. Evidence for benign classification, BS2 (0.37%), was identified among variants classified as pathogenic. For likely pathogenic variants, BP1 (2.99%) and BS2 (0.37%) were identified. PM1 is commonly observed among variants classified as benign (28.45%), while PM2 and PM1 are commonly identified among variants classified as likely benign (48.91% and 42.95%, respectively). Taken together, these observations will inform better approaches to applying the ACMG guidelines.
Introduction
In recent years, rapid progress in human genomics research has greatly increased our knowledge of the human genome and of the role genetic variants play in human diseases [1]. At the same time, genomic technology has advanced robustly. Together, these advances enable the application of genomic tests in healthcare. As a result, precision medicine based on personal genomic information has become the direction of future medicine [2, 3]. One important theoretical basis of precision medicine is that every human individual is different from others. There are about three billion nucleotides in the human genome, and while less than 0.1% of these vary among individuals, no two people's genomes are the same; even monozygotic twins can have differences [4, 5]. Because of the large number and complexity of the genetic variants in the human genome, a critical but complicated step in applying genetic information in medicine is to interpret the functional effects of these variants. Considering this complexity, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) developed a set of standards and guidelines for the interpretation of genetic variants [6], which has been well received since its publication and is now widely adopted in both genetic research and clinical practice. In the implementation of these guidelines, a frequently seen issue is the coexistence of contradictory evidence in variant classification. In this study, we performed a quantitative assessment of contradictory evidence in variant classification based on the results of an automated algorithm that interprets genetic variants according to the ACMG/AMP guidelines. The results of this analysis may help researchers and clinicians to better understand the contradictory evidence, and may be beneficial for potential revisions and improvements of the current ACMG guidelines.
Methods
The 2014 release of the Human Gene Mutation Database (HGMD) professional dataset [7] was assessed using the automated software InterVar [8], which is based on the ANNOVAR algorithm [9]. The 2015 ACMG guidelines [6] have had a great impact on the field of medical genetics since their publication, so this earlier HGMD release was used in an attempt to assess HGMD as a database relatively independent of that impact. The 2014 HGMD data also predate the Exome Aggregation Consortium (ExAC) database [10] and the Genome Aggregation Database (gnomAD) [11], which provide critical population data for variant classification. Benign evidence from ExAC/gnomAD is commonly seen in retired HGMD disease-causing mutations. Altogether, 140,883 HGMD records were analyzed. Among these records, 1424 variants mapped to more than one reference gene. To avoid contradictory evidence from different genes, these variants were excluded from further analysis. Altogether, 139,207 HGMD variants were included in this study (Table 1).
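The exclusion step described above can be sketched as follows. This is a minimal illustration only: the record layout, the variant identifiers, and the gene symbols are hypothetical and do not reflect the actual HGMD or InterVar file formats.

```python
# Hypothetical records: (variant_id, genes the variant maps to).
records = [
    ("var1", ["GENE_A"]),
    ("var2", ["GENE_A", "GENE_B"]),  # maps to two genes, so it is excluded
    ("var3", ["GENE_C"]),
]


def split_by_gene_count(records):
    """Separate variants mapping to exactly one gene from all others."""
    kept, excluded = [], []
    for variant_id, genes in records:
        # Use a set so that a repeated gene symbol is counted only once.
        if len(set(genes)) == 1:
            kept.append(variant_id)
        else:
            excluded.append(variant_id)
    return kept, excluded


kept, excluded = split_by_gene_count(records)
print("kept:", kept)
print("excluded:", excluded)
```

Excluding such variants up front means that each remaining variant is scored against a single gene context, so any contradictory evidence found later cannot be an artifact of mixing genes.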
Not all 28 ACMG criteria can be scored by automated software such as InterVar; e.g., the benign evidence BS3, BS4, BP2, and BP5 and the pathogenic evidence PS2, PS3, PM3, PM6, PP1, and PP4 cannot be automated. InterVar obtained the PP5 and BP6 data by automated scoring using the ClinVar dataset after a data-cleaning procedure that removed common variants (allele frequency >5%) and variants with conflicting annotations [8]. ClinVar at the National Center for Biotechnology Information (NCBI) is a public archive of the relationships among human genetic variants and phenotypes [12]. ClinVar receives submissions from clinical testing laboratories, research laboratories, expert panels and practice guidelines, and public databases, e.g., OMIM, GeneReviews, and UniProt. For the two reputable-source criteria PP5 and BP6, the definition of “reputable source” is obscure, so these criteria may lack validity. There have been recent concerns that PP5 and BP6 may be commonly misused by laboratories, and discontinuing the use of PP5 and BP6 has been proposed [13]. Like the PP5 and BP6 data, the PM1 data were obtained by InterVar through automated scoring using the ClinVar dataset. PM1 evidence was generated for all ClinVar variants in functional domains that contained only pathogenic or likely pathogenic variants, without benign or common (allele frequency >5%) variants [8]. The functional domains were annotated using the dbNSFP database for functional prediction and annotation of missense variants [14].
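To make the split between automated and non-automated criteria concrete, the sketch below enumerates the 28 criteria codes of the 2015 guidelines and removes the ten criteria named above. Treating those ten as the complete non-automated set is an assumption of this sketch, because the text introduces them with "e.g."; the variable names are likewise illustrative.

```python
# The 28 criteria codes of the 2015 ACMG/AMP guidelines, grouped by prefix.
ALL_CRITERIA = (
    ["PVS1"]
    + [f"PS{i}" for i in range(1, 5)]   # PS1-PS4
    + [f"PM{i}" for i in range(1, 7)]   # PM1-PM6
    + [f"PP{i}" for i in range(1, 6)]   # PP1-PP5
    + ["BA1"]
    + [f"BS{i}" for i in range(1, 5)]   # BS1-BS4
    + [f"BP{i}" for i in range(1, 8)]   # BP1-BP7
)

# Criteria named in the text as not amenable to automation
# (assumed here to be the full list).
NOT_AUTOMATED = {
    "BS3", "BS4", "BP2", "BP5",
    "PS2", "PS3", "PM3", "PM6", "PP1", "PP4",
}

AUTOMATED = [code for code in ALL_CRITERIA if code not in NOT_AUTOMATED]

print(len(ALL_CRITERIA), "criteria in total")
print(len(AUTOMATED), "left for automated scoring:", ", ".join(AUTOMATED))
```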
Results and discussion
Among the 139,207 HGMD variants, 42.7% were classified as pathogenic (14.9%) or likely pathogenic (27.8%), and 5.6% were classified as benign (3.4%) or likely benign (2.2%) (Table 2). Contradictory evidence was identified in both the pathogenic and likely pathogenic groups. At the same time, pathogenic evidence was identified in a high proportion of variants classified as benign (35.3%) or likely benign (78.4%), in keeping with the aim of HGMD to record disease mutations.
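One simple way to flag contradictory evidence of the kind counted here is to compare the assigned class with the prefix of each criterion code, since pathogenic criteria start with "P" (PVS, PS, PM, PP) and benign criteria start with "B" (BA, BS, BP). The following is a minimal sketch under that assumption; the function name and the example variants are hypothetical and are not taken from InterVar.

```python
def contradictory_evidence(classification, criteria):
    """Return the criteria that point away from the assigned class."""
    label = classification.lower()
    if label in ("pathogenic", "likely pathogenic"):
        opposing_prefix = "B"  # BA, BS, and BP codes are benign evidence
    elif label in ("benign", "likely benign"):
        opposing_prefix = "P"  # PVS, PS, PM, and PP codes are pathogenic evidence
    else:
        return []              # uncertain significance: nothing to contradict
    return [c for c in criteria if c.upper().startswith(opposing_prefix)]


# Hypothetical variants with the criteria an automated tool might report.
examples = [
    ("Likely pathogenic", ["PM1", "PM2", "PP3", "PP5", "BP1"]),
    ("Benign", ["BA1", "PM1"]),
    ("Pathogenic", ["PVS1", "PS1"]),
]
for classification, criteria in examples:
    print(classification, "->", contradictory_evidence(classification, criteria))
```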
Altogether, 102 (0.49%) pathogenic variants carry evidence supporting a benign classification, and 1356 (3.50%) likely pathogenic variants carry comparable evidence. As shown in Table 3, five types of benign evidence are frequently observed among variants predicted to be pathogenic or likely pathogenic:
- BS1 (allele frequency is greater than expected for disorder [6]),
- BS2 (observed in a healthy adult individual with full penetrance expected at an early age [6]),
- BP1 (missense variant in a gene for which primarily truncating variants are known to cause disease [6]),
- BP4 (multiple lines of computational evidence [6]), and
- BP6 (reputable source recently reports variant as benign [6]).
BS2 is the most common benign feature identified among pathogenic variants. For likely pathogenic variants, BP1 and BS2 are the top two benign features. The frequencies of some types of benign evidence differ between pathogenic and likely pathogenic variants. BS1 evidence is currently seen in pathogenic variants, but not in likely pathogenic variants (Fisher’s exact test, P = 3.90 × 10^-7). This results from a limitation of the automated algorithm for variant classification. InterVar generated BS1 evidence based on the alternative allele frequency (AAF) in the ExAC Browser [10]. Variants originally assigned as likely pathogenic that had BS1 evidence were reassigned as uncertain significance. Likewise, BP6 was originally more frequent among pathogenic variants than among likely pathogenic variants. Currently, most pathogenic or likely pathogenic variants with BP6 have been reassigned as uncertain significance by InterVar (personal communication). Pathogenic variants may attract more research interest from peers. On the other hand, this might also suggest an overrepresentation of misclassified pathogenic variants. InterVar generated PS1 evidence based on the contents of ClinVar. With OMIM as a major data source, ClinVar assigns a large number of OMIM entries as pathogenic (https://www.ncbi.nlm.nih.gov/clinvar/docs/clinsig/), which may include misclassified benign variants.
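As an illustration of the Fisher's exact test applied to the BS1 comparison, the sketch below computes a two-sided P value for a 2 × 2 table using only the Python standard library. The counts are hypothetical, because the per-criterion counts are given in Table 3 and not in the text, and the function name is an assumption of this sketch.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: rows are variant classes (pathogenic, likely pathogenic),
# columns are BS1 present / BS1 absent.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"P = {p_value:.4f}")
```

An exact test is a natural choice when one cell is zero or very small, as is the case for BS1 among likely pathogenic variants, because the chi-square approximation becomes unreliable for such tables.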
In contrast, BP1 and BP4 are more frequent in likely pathogenic variants than in pathogenic variants (χ² = 632.5, P = 1.40 × 10^-139 and χ² = 9.9, P = 1.64 × 10^-3, respectively). The higher frequency of benign evidence among variants classified as likely pathogenic (χ² = 511.0, P = 3.82 × 10^-113) can be mainly explained by the presence of BP1. In our analysis, BP1 was not seen in variants with PVS1, PS2, PS3, PM3, PM4, PM6, PP1, or PP4. Variants with BP1 cannot accumulate enough evidence to reach the pathogenic class. In contrast, BP1 is commonly seen in variants with PM2, PM1, PP3, or PP5, in order of frequency. BP1 is the most commonly observed contradictory benign evidence, yet pathogenic variants that meet BP1 are commonly reported in the literature. Although initial interest in the analysis of some Mendelian diseases focused on finding truncating variants [15, 16], missense variants are commonly found to be disease causing, e.g., BRCA1 variants causing breast cancer and ovarian cancer [17,18,19], while 50 missense variants in the BRCA1 gene predicted to be likely pathogenic have an automated BP1 score. Therefore, BP1 should not be weighed towards overturning a likely pathogenic prediction unless it has been thoroughly evaluated. For contradictory BP4, combinations of in silico algorithms with high performance characteristics have been suggested for the assessment of genetic variants using the ACMG/AMP guidelines [20]. The numbers in Table 3 provide quantitative information about the degree of contradictory evidence that exists in the functional classification of pathogenic variation.
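The overall comparison of benign evidence between the two classes can be approximately re-derived from figures given in the text. The sketch below runs a Pearson chi-square test (one degree of freedom, no continuity correction) on a 2 × 2 table built from the 102 and 1356 variants with benign evidence. The class sizes are an assumption: they are reconstructed from the rounded percentages 14.9% and 27.8% of 139,207, so the statistic should only be expected to land near the reported value, not to match it exactly.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and P value for [[a, b], [c, d]] (1 df)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


TOTAL = 139_207
# Assumption: class sizes rebuilt from the rounded percentages in the text.
n_pathogenic = round(0.149 * TOTAL)
n_likely_pathogenic = round(0.278 * TOTAL)

chi2, p_value = chi_square_2x2(
    102, n_pathogenic - 102,            # pathogenic: with / without benign evidence
    1356, n_likely_pathogenic - 1356,   # likely pathogenic: with / without
)
print(f"chi-square = {chi2:.1f}, P = {p_value:.2e}")
```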
According to the reports generated from HGMD (Table 1), disease-associated polymorphisms (i.e., DFP, DP, and FP) account for 4.60% of the HGMD dataset [7], which is lower than the frequency of benign and likely benign variants observed in this study (5.6%; χ² = 144.3, P < 2.98 × 10^-33). Therefore, disease-associated polymorphisms may only partially explain the benign and likely benign variants recorded in HGMD. On the other hand, evidence for pathogenic effects is commonly seen among variants classified as benign or likely benign (Table 4), which cannot be explained by disease-associated polymorphisms. Specifically, among benign variants, 1664 (35.31%) have evidence of pathogenicity. Seven types of evidence supporting a pathogenic classification were identified, with the most common being
- PM1 (mutational hot spot and/or critical and well-established functional domain [6]),
- PP3 (multiple lines of computational evidence support a deleterious effect [6]),
- PP5 (reputable source recently reported [6]), and
- PP2 (missense variant in a gene that has a low rate of benign missense variation [6]).
Among variants classified as likely benign, 2429 (78.38%) have evidence of pathogenicity. The most common evidence of pathogenicity is PM2 (absent from controls [6]), followed by the four types of pathogenic evidence commonly seen among variants classified as benign. This observation highlights the fact that PM1 is commonly seen in benign variants, and that PM2 and PM1 are commonly seen among variants classified as likely benign. Currently, caution should be used when applying PM1 and PM2, given how frequently they apply to benign variants, unless they are supported by additional data. To acquire PM1 scores, InterVar removed functional domains with common (allele frequency >5%) ClinVar variants, while benign variants may have an allele frequency <5%. Therefore, in most instances, it is reasonable to downgrade the weight of PM1 and PM2 for the prediction of pathogenicity.
PM1 is automatically scored by InterVar using the ClinVar dataset. To date, potential reporting bias exists in the collection of ClinVar variants, as clinical testing and research laboratories may submit only pathogenic variants, but not benign variants, to ClinVar. Therefore, PM1 evidence derived from ClinVar needs to be further developed using unbiased data sources, e.g., ExAC and gnomAD, which are built from exome and genome sequencing data [10, 11].
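The domain-selection rule described in the Methods, and the way reporting bias can affect it, can be sketched as follows. This is not InterVar's implementation: the data layout, domain names, and field names are hypothetical, and the rule is simplified to "keep a domain only if every known variant is pathogenic or likely pathogenic and none is common (allele frequency >5%)".

```python
COMMON_AF = 0.05
PATHOGENIC_LABELS = {"pathogenic", "likely pathogenic"}


def pm1_domains(domain_variants):
    """Return domains whose known variants are all rare and all P/LP."""
    selected = []
    for domain, variants in domain_variants.items():
        if not variants:
            continue  # no submissions, so no basis for PM1
        all_pathogenic = all(v["clinsig"] in PATHOGENIC_LABELS for v in variants)
        none_common = all(v["af"] <= COMMON_AF for v in variants)
        if all_pathogenic and none_common:
            selected.append(domain)
    return selected


# Hypothetical ClinVar-like submissions grouped by functional domain.
domain_variants = {
    "domain_1": [
        {"clinsig": "pathogenic", "af": 0.0001},
        {"clinsig": "likely pathogenic", "af": 0.0},
    ],
    "domain_2": [
        {"clinsig": "pathogenic", "af": 0.0002},
        {"clinsig": "benign", "af": 0.01},       # rare benign variant
    ],
    "domain_3": [
        {"clinsig": "pathogenic", "af": 0.08},   # common variant
    ],
    # Only a pathogenic submission exists; benign variants were never submitted.
    "domain_4": [
        {"clinsig": "pathogenic", "af": 0.0},
    ],
}

print(pm1_domains(domain_variants))
```

In this toy input, domain_4 qualifies for PM1 only because no benign variant was ever submitted for it, which is the kind of reporting bias the paragraph above describes.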
Interestingly, 21 benign variants (0.45%; Supplementary Table S1) have PVS1 (very strong pathogenic evidence), and 14 of these variants also have other evidence of pathogenicity, most commonly PP3. These 21 variants are from 17 genes: eight variants introduce or abolish a stop codon, seven affect canonical ±1 or 2 splice sites, and six are frameshift variants. Although it is rare for variants with PVS1 to be predicted as benign by InterVar on the basis of sufficient benign evidence meeting the ACMG criteria, these variants should be considered for reclassification as being of uncertain significance in InterVar, and they warrant more intensive investigation. InterVar obtained the PVS1 evidence from ClinVar [8] and the ExAC dataset [10], including 4807 genes that are LOF-intolerant and excluding nonsense variants close to the extreme 3′ end of a gene. Other caveats, including those noted in the ACMG guidelines [6], need to be considered when assessing the validity of PVS1; e.g., alternative splicing [21], a cryptic initiation codon [22], and the presence of multiple transcripts may preserve gene function in the presence of these variants. For solid PVS1 evidence, the genetic variant needs to affect all mRNA transcripts. Considering the current caveats, the ClinGen Sequence Variant Interpretation Working Group recently published recommendations to facilitate consistent and accurate application of PVS1 evidence [23].
In summary, this study presented a quantitative assessment of the automated application of the ACMG guidelines to the functional classification of genetic variants that may cause inherited diseases. The ACMG guidelines suggest a VUS classification for variants with contradictory evidence. Some laboratories allow one piece of contradictory supporting benign evidence for pathogenic/likely pathogenic variants with strong evidence [24]. Considering the limitations of an automated algorithm for variant classification such as InterVar, it would be of great interest to systematically examine variants with contradictory evidence using the full ACMG-based interpretation of an experienced laboratory. Our statistics show that BP1 is relatively common among likely pathogenic variants and should not be given too much weight in overturning a likely pathogenic prediction. On the other hand, PM1 and PM2 are common in benign and likely benign variants and should be used cautiously as evidence for pathogenicity, while PM1 needs to be further developed using unbiased data sources. These observations should be considered in future revisions and improvements of the ACMG guidelines, as well as in improving automated algorithms for the functional classification of genetic variants. Currently, the quantitative Bayesian framework for interpreting the ACMG/AMP guidelines proposed by Tavtigian et al. has been shown to be a useful approach for the functional classification of some genetic variants with conflicting evidence [25].
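To illustrate how a quantitative framework can absorb conflicting evidence, the sketch below follows the general form of the Bayesian model of Tavtigian et al. [25] as we understand it: a prior probability of 0.10, odds of pathogenicity of 350 for very strong evidence, and exponents of 1, 1/2, 1/4, and 1/8 for very strong, strong, moderate, and supporting evidence, with benign evidence entering with a negative sign. The function names and the handling of values that fall exactly on a threshold are assumptions of this sketch, and stand-alone benign evidence (BA1) is not modeled.

```python
PRIOR = 0.10               # prior probability of pathogenicity
ODDS_VERY_STRONG = 350.0   # odds of pathogenicity for very strong evidence


def posterior_probability(pvs=0, ps=0, pm=0, pp=0, bs=0, bp=0):
    """Posterior probability of pathogenicity from counts of evidence."""
    exponent = pvs + ps / 2 + pm / 4 + pp / 8 - bs / 2 - bp / 8
    odds = ODDS_VERY_STRONG ** exponent
    return odds * PRIOR / ((odds - 1) * PRIOR + 1)


def classify(posterior):
    """Map a posterior probability to one of the five ACMG/AMP classes."""
    if posterior > 0.99:
        return "Pathogenic"
    if posterior >= 0.90:
        return "Likely pathogenic"
    if posterior < 0.001:
        return "Benign"
    if posterior < 0.10:
        return "Likely benign"
    return "Uncertain significance"


# One strong and two moderate pathogenic criteria, alone and with benign evidence.
for label, counts in [
    ("no benign evidence", dict(ps=1, pm=2)),
    ("plus one supporting benign (e.g., BP1)", dict(ps=1, pm=2, bp=1)),
    ("plus one strong benign (e.g., BS2)", dict(ps=1, pm=2, bs=1)),
]:
    posterior = posterior_probability(**counts)
    print(f"{label}: {posterior:.3f} -> {classify(posterior)}")
```

Under these assumptions, a single supporting benign criterion lowers the posterior probability without moving this example out of the likely pathogenic range, whereas a strong benign criterion moves it to uncertain significance, which is consistent with giving BP1 limited weight against otherwise solid pathogenic evidence.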
References
1. Frazer KA, Murray SS, Schork NJ, Topol EJ. Human genetic variation and its contribution to complex traits. Nat Rev Genet. 2009;10:241–51.
2. Collins FS, Varmus H. A new initiative on precision medicine. N Engl J Med. 2015;372:793–5.
3. Mirnezami R, Nicholson J, Darzi A. Preparing for precision medicine. N Engl J Med. 2012;366:489–91.
4. Jorde LB, Wooding SP. Genetic variation, classification and ‘race’. Nat Genet. 2004;36:S28–S33.
5. McRae AF, Visscher PM, Montgomery GW, Martin NG. Large autosomal copy-number differences within unselected monozygotic twin pairs are rare. Twin Res Hum Genet. 2015;18:13–18.
6. Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–24.
7. Stenson PD, Ball EV, Mort M, Phillips AD, Shiel JA, Thomas NS, et al. Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat. 2003;21:577–81.
8. Li Q, Wang K. InterVar: clinical interpretation of genetic variants by the 2015 ACMG-AMP guidelines. Am J Hum Genet. 2017;100:267–80.
9. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164–e164.
10. Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536:285.
11. Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, et al. Variation across 141,456 human exomes and genomes reveals the spectrum of loss-of-function intolerance across human protein-coding genes. bioRxiv. 2019:531210.
12. Landrum MJ, Lee JM, Benson M, Brown G, Chao C, Chitipiralla S, et al. ClinVar: public archive of interpretations of clinically relevant variants. Nucleic Acids Res. 2016;44:D862–D868.
13. Biesecker LG, Harrison SM. The ACMG/AMP reputable source criteria for the interpretation of sequence variants. Genet Med. 2018;20:1687–8.
14. Liu X, Wu C, Li C, Boerwinkle E. dbNSFPv3.0: a one-stop database of functional predictions and annotations for human nonsynonymous and splice-site SNVs. Hum Mutat. 2016;37:235–41.
15. Ishioka C, Suzuki T, FitzGerald M, Krainer M, Shimodaira H, Shimada A, et al. Detection of heterozygous truncating mutations in the BRCA1 and APC genes by using a rapid screening assay in yeast. Proc Natl Acad Sci USA. 1997;94:2449–53.
16. Miki Y, Swensen J, Shattuck-Eidens D, Futreal PA, Harshman K, Tavtigian S, et al. A strong candidate for the breast and ovarian cancer susceptibility gene BRCA1. Science. 1994;266:66–71.
17. Malone KE, Daling JR, Thompson JD, O’Brien CA, Francisco LV, Ostrander EA. BRCA1 mutations and breast cancer in the general population: analyses in women before age 35 years and in women before age 45 years with first-degree family history. JAMA. 1998;279:922–9.
18. Katagiri T, Kasumi F, Yoshimoto M, Nomizu T, Asaishi K, Abe R, et al. High proportion of missense mutations of the BRCA1 and BRCA2 genes in Japanese breast cancer families. J Hum Genet. 1998;43:42–48.
19. Sweet K, Senter L, Pilarski R, Wei L, Toland AE. Characterization of BRCA1 ring finger variants of uncertain significance. Breast Cancer Res Treat. 2010;119:737–43.
20. Ghosh R, Oak N, Plon SE. Evaluation of in silico algorithms for use with ACMG/AMP clinical variant interpretation guidelines. Genome Biol. 2017;18:225.
21. Modrek B, Lee C. A genomic view of alternative splicing. Nat Genet. 2002;30:13–19.
22. Tian G, Huang MC, Parvari R, Diaz GA, Cowan NJ. Cryptic out-of-frame translational initiation of TBCE rescues tubulin formation in compound heterozygous HRD. Proc Natl Acad Sci USA. 2006;103:13491–6.
23. Abou Tayoun A, Pesaran T, DiStefano M, Oza A, Rehm H, Biesecker L, et al. Recommendations for interpreting the loss of function PVS1 ACMG/AMP variant criteria. bioRxiv. 2018.
24. Amendola LM, Jarvik GP, Leo MC, McLaughlin HM, Akkari Y, Amaral MD, et al. Performance of ACMG-AMP variant-interpretation guidelines among nine laboratories in the Clinical Sequencing Exploratory Research Consortium. Am J Hum Genet. 2016;98:1067–76.
25. Tavtigian SV, Greenblatt MS, Harrison SM, Nussbaum RL, Prabhu SA, Boucher KM, et al. Modeling the ACMG/AMP variant classification guidelines as a Bayesian classification framework. Genet Med. 2018;20:1054–60.
Acknowledgements
We thank Dr Gangcai Xie for helping with the InterVar software. We appreciate the instructive comments from two anonymous reviewers.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Qu, HQ., Wang, X., Tian, L. et al. Application of ACMG criteria to classify variants in the human gene mutation database. J Hum Genet 64, 1091–1095 (2019). https://doi.org/10.1038/s10038-019-0663-8
DOI: https://doi.org/10.1038/s10038-019-0663-8