Introduction

In recent years, rapid progress in human genomics research has greatly increased our knowledge of the human genome and of the role genetic variants play in human diseases [1]. At the same time, genomic technology has advanced robustly. Together, these advances enable the application of genomic tests in healthcare, and precision medicine based on personal genomic information has become the direction of future medicine [2, 3]. One important theoretical basis of precision medicine is that every individual is different from every other. There are about three billion nucleotides in the human genome, and although less than 0.1% of these vary among individuals, no two people's genomes are the same; even monozygotic twins can differ [4, 5]. Because of the large number and complexity of genetic variants in the human genome, a critical but complicated step in applying genetic information in medicine is to interpret the functional effects of these variants. To address this complexity, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) developed a set of standards and guidelines for variant interpretation [6], which have been well received since their publication and are now widely adopted in both genetic research and clinical practice. A frequently encountered issue in implementing these guidelines is the coexistence of contradictory evidence in variant classification. In this study, we quantitatively assessed contradictory evidence in variant classification, based on the results of an automated algorithm that interprets genetic variants according to the ACMG/AMP guidelines. The results of this analysis may help researchers and clinicians better understand the contradictory evidence, and may benefit potential revisions and improvements of the current ACMG guidelines.

Methods

The 2014 release of the Human Gene Mutation Database (HGMD) Professional dataset [7] was assessed using the automated software InterVar [8], which is based on the ANNOVAR algorithm [9]. The 2015 ACMG guidelines [6] have had a great impact on the field of medical genetics since their publication, so this earlier HGMD release, which predates the guidelines, was chosen in an attempt to assess HGMD as a database relatively independent of their influence. The 2014 HGMD data also predate the Exome Aggregation Consortium (ExAC) database [10] and the Genome Aggregation Database (gnomAD) [11], which provide critical population data for variant classification. Benign evidence from ExAC/gnomAD is commonly seen in retired HGMD disease-causing mutations. Altogether, 140,883 HGMD records were analyzed. Among these records, 1424 variants mapped to more than one reference gene. To avoid contradictory evidence arising from different genes, these variants were excluded from further analysis. Altogether, 139,207 HGMD variants were included in this study (Table 1).
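The exclusion of variants that map to more than one reference gene can be sketched as below. This is only an illustration under stated assumptions: the input layout (pairs of variant identifier and gene symbol), the function name, and the toy identifiers are all hypothetical and are not taken from HGMD, ANNOVAR, or InterVar.

```python
from collections import defaultdict


def split_by_gene_count(records):
    """Separate variants mapped to one gene from those mapped to several.

    `records` is an iterable of (variant_id, gene_symbol) pairs. This layout
    is a hypothetical stand-in for real annotation output.
    """
    genes_by_variant = defaultdict(set)
    for variant_id, gene in records:
        genes_by_variant[variant_id].add(gene)

    # Variants with exactly one annotated gene are kept for analysis.
    kept = {v for v, genes in genes_by_variant.items() if len(genes) == 1}
    # Everything else maps to more than one gene and is excluded.
    excluded = set(genes_by_variant) - kept
    return kept, excluded


# Toy annotations with made-up identifiers.
toy_records = [
    ("var1", "GENE_A"),
    ("var2", "GENE_A"),
    ("var2", "GENE_B"),  # var2 overlaps two genes
    ("var3", "GENE_C"),
]
kept, excluded = split_by_gene_count(toy_records)
```

In this toy input only `var2` is excluded, because it is the only variant annotated with two different genes.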

Table 1 Categorization of HGMD-recorded variants

Not all 28 of the ACMG criteria can be scored by automated software such as InterVar; for example, the benign evidence BS3, BS4, BP2, and BP5 and the pathogenic evidence PS2, PS3, PM3, PM6, PP1, and PP4 cannot be scored automatically. InterVar scored PP5 and BP6 automatically from the ClinVar dataset after a data-cleaning procedure that removed common variants (allele frequency >5%) and variants with conflicting annotations [8]. ClinVar, hosted at the National Center for Biotechnology Information (NCBI), is a public archive of the relationships among human genetic variants and phenotypes [12]. ClinVar receives submissions from clinical testing laboratories, research laboratories, expert panels, practice guidelines, and public databases such as OMIM, GeneReviews, and UniProt. For the two reputable-source criteria, PP5 and BP6, the definition of “reputable source” is obscure, so these criteria may lack validity. There have been recent concerns that PP5 and BP6 are commonly misused by laboratories, and discontinuing their use has been proposed [13]. As with PP5 and BP6, InterVar also scored PM1 automatically from the ClinVar dataset. PM1 evidence was generated for all ClinVar variants in functional domains that contained only pathogenic or likely pathogenic variants, with no benign or common (allele frequency >5%) variants [8]. The functional domains were annotated using the dbNSFP database for functional prediction and annotation of missense variants [14].
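To make the split between automated and non-automated criteria concrete, the short sketch below enumerates the 28 ACMG/AMP criterion codes and removes the ten that the text lists as not automatable. The variable names are illustrative only; the codes themselves follow the ACMG/AMP naming (PVS1, PS1 to PS4, PM1 to PM6, PP1 to PP5, BA1, BS1 to BS4, BP1 to BP7).

```python
# All 28 ACMG/AMP criterion codes, grouped by evidence strength.
ACMG_CRITERIA = (
    ["PVS1"]
    + [f"PS{i}" for i in range(1, 5)]
    + [f"PM{i}" for i in range(1, 7)]
    + [f"PP{i}" for i in range(1, 6)]
    + ["BA1"]
    + [f"BS{i}" for i in range(1, 5)]
    + [f"BP{i}" for i in range(1, 8)]
)

# Criteria named in the text as not amenable to automated scoring.
NOT_AUTOMATED = {
    "BS3", "BS4", "BP2", "BP5",                  # benign evidence
    "PS2", "PS3", "PM3", "PM6", "PP1", "PP4",    # pathogenic evidence
}

# Remaining criteria, in their original order.
AUTOMATED = [code for code in ACMG_CRITERIA if code not in NOT_AUTOMATED]
```

Under this enumeration, 18 of the 28 criteria remain available for automated scoring, including PM1, PP5, and BP6, which the text describes as scored from ClinVar.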

Results and discussion

Among the 139,207 HGMD variants, 42.7% were classified as pathogenic (14.9%) or likely pathogenic (27.8%), and 5.6% were classified as benign (3.4%) or likely benign (2.2%) (Table 2). Contradictory evidence was identified in both the pathogenic and likely pathogenic groups. At the same time, pathogenic evidence was identified in a high proportion of variants classified as benign (35.3%) or likely benign (78.4%), in keeping with the aim of HGMD to record disease mutations.

Table 2 ACMG classification of HGMD-recorded variants by InterVar

Altogether, 102 (0.49%) pathogenic variants have evidence in support of a benign classification, and 1356 (3.50%) likely pathogenic variants have comparable evidence. As shown in Table 3, five types of benign evidence are frequently observed among variants predicted to be pathogenic or likely pathogenic.

  • BS1 (allele frequency is greater than expected for disorder [6]),

  • BS2 (observed in a healthy adult individual with full penetrance expected at an early age [6]),

  • BP1 (missense variant in a gene for which primarily truncating variants are known to cause disease [6]),

  • BP4 (multiple lines of computational evidence suggest no impact on gene or gene product [6]), and

  • BP6 (reputable source recently reports variant as benign [6]).
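A minimal sketch of how contradictory evidence of this kind could be flagged is given below. It assumes that each variant comes with a classification label and a set of triggered ACMG criterion codes; the function name and input layout are hypothetical and do not reproduce InterVar's interface.

```python
BENIGN_PREFIXES = ("BA", "BS", "BP")
PATHOGENIC_PREFIXES = ("PVS", "PS", "PM", "PP")


def contradictory_evidence(classification, criteria):
    """Return the criteria that point against the assigned classification.

    `classification` is a label such as "Likely pathogenic"; `criteria` is
    an iterable of ACMG codes such as {"PM1", "PM2", "BP1"}.
    """
    label = classification.strip().lower()
    if label in ("pathogenic", "likely pathogenic"):
        opposing = BENIGN_PREFIXES
    elif label in ("benign", "likely benign"):
        opposing = PATHOGENIC_PREFIXES
    else:
        # Uncertain significance has no single opposing direction.
        return []
    # str.startswith accepts a tuple of prefixes.
    return sorted(code for code in criteria if code.startswith(opposing))


# Toy examples with made-up criterion combinations.
lp_conflicts = contradictory_evidence("Likely pathogenic", {"PM1", "PM2", "PP3", "BP1"})
benign_conflicts = contradictory_evidence("Benign", {"BA1", "PM1", "PP3"})
```

Counting the non-empty results per classification group would give frequencies of the kind summarized in Tables 3 and 4.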

Table 3 Frequencies of benign evidence in pathogenic and likely pathogenic variants

BS2 is the most common benign evidence identified among pathogenic variants. For likely pathogenic variants, BP1 and BS2 are the two most common types. The frequencies of some types of benign evidence differ between pathogenic and likely pathogenic variants. BS1 evidence is currently seen in pathogenic variants but not in likely pathogenic variants (Fisher’s exact test, P = 3.90 × 10⁻⁷). This reflects a limitation of the automated algorithm for variant classification. InterVar generated BS1 evidence based on the alternative allele frequency (AAF) in the ExAC Browser [10], and variants originally assigned as likely pathogenic that carried BS1 were reassigned as uncertain significance. Likewise, BP6 was originally more common among pathogenic variants than among likely pathogenic variants. Currently, most pathogenic or likely pathogenic variants with BP6 have been reassigned as uncertain significance by InterVar (personal communication). One possible explanation is greater research interest in pathogenic variation among peers. On the other hand, this might also suggest an overrepresentation of misclassified pathogenic variants. InterVar generated PS1 evidence based on the contents of ClinVar. With OMIM as a major data source, ClinVar assigns a large number of OMIM entries as pathogenic (https://www.ncbi.nlm.nih.gov/clinvar/docs/clinsig/), which may include misclassified benign variants.

In contrast, BP1 and BP4 are more frequent among likely pathogenic variants than among pathogenic variants (χ² = 632.5, P = 1.40 × 10⁻¹³⁹ and χ² = 9.9, P = 1.64 × 10⁻³, respectively). The higher frequency of benign evidence among variants classified as likely pathogenic (χ² = 511.0, P = 3.82 × 10⁻¹¹³) can be explained mainly by the presence of BP1. In our analysis, BP1 was not seen in variants with PVS1, PS2, PS3, PM3, PM4, PM6, PP1, or PP4, so variants with BP1 cannot accumulate enough evidence to reach the pathogenic class. In contrast, BP1 is commonly seen in variants with PM2, PM1, PP3, or PP5, in decreasing order of frequency. BP1 is the most commonly observed contradictory benign evidence, yet pathogenic variants that meet BP1 are commonly reported in the literature. Although initial interest in the analysis of some Mendelian diseases focused on finding truncating variants [15, 16], missense variants are commonly found to be disease causing. For example, BRCA1 missense variants cause breast cancer and ovarian cancer [17, 18, 19], and 50 missense variants in the BRCA1 gene that were predicted to be likely pathogenic received an automated BP1 score. Therefore, BP1 should not be weighted towards overturning a likely pathogenic prediction unless it has been thoroughly evaluated. For contradictory BP4, combinations of in silico algorithms with high performance characteristics have been suggested for the assessment of genetic variants using the ACMG/AMP guidelines [20]. The numbers in Table 3 provide quantitative information about the degree of contradictory evidence that exists in the functional classification of pathogenic variation.
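The comparison of benign evidence between the pathogenic and likely pathogenic groups can be illustrated with a small sketch. It assumes a Pearson χ² test on a 2 × 2 table with one degree of freedom and no continuity correction, which the text does not confirm. The counts 102 and 1356 come from the text; the group totals (20,742 and 38,700) are approximations back-calculated here from the reported 14.9% and 27.8% of 139,207 variants, not figures taken from Table 2.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for the
    table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = numerator / denominator
    # For one degree of freedom the upper-tail probability is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts with benign evidence are from the text; totals are ASSUMED
# approximations derived from the reported percentages.
pathogenic_total = 20742         # approx. 14.9% of 139,207 (assumption)
likely_pathogenic_total = 38700  # approx. 27.8% of 139,207 (assumption)
with_benign_p = 102
with_benign_lp = 1356

statistic, p_value = chi_square_2x2(
    with_benign_p, pathogenic_total - with_benign_p,
    with_benign_lp, likely_pathogenic_total - with_benign_lp,
)
```

Under these approximate totals the statistic should come out close to the reported χ² = 511.0; any difference would reflect the rounding of the group sizes rather than a different method.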

According to the reports generated from HGMD (Table 1), disease-associated polymorphisms (i.e., DFP, DP, and FP) account for 4.60% of the HGMD dataset [7], which is significantly lower than the proportion of benign and likely benign variants observed in this study (5.6%; χ² = 144.3, P < 2.98 × 10⁻³³). Therefore, disease-associated polymorphisms may only partially explain the benign and likely benign variants recorded in HGMD. On the other hand, evidence for pathogenic effects is commonly seen among variants classified as benign or likely benign (Table 4), which cannot be explained by disease-associated polymorphisms. More specifically, among benign variants, 1664 (35.31%) have evidence of pathogenicity. Seven types of pathogenic evidence have been identified, the most common being

  • PM1 (mutational hot spot and/or critical and well-established functional domain [6]),

  • PP3 (multiple lines of computational evidence support a deleterious effect [6]),

  • PP5 (reputable source recently reports variant as pathogenic [6]), and

  • PP2 (missense variant in a gene that has a low rate of benign missense variation [6]).

Table 4 Frequencies of pathogenic evidence in benign and likely benign variants

Among variants classified as likely benign, 2429 (78.38%) have evidence of pathogenicity. The most common is PM2 (absent from controls [6]), followed by the four types of pathogenic evidence commonly seen among benign variants. This observation highlights that PM1 is commonly seen in benign variants, and that PM2 and PM1 are commonly seen among variants classified as likely benign. Given how often PM1 and PM2 apply to benign variants, they should currently be applied with caution unless supported by additional data. To assign PM1, InterVar removed functional domains containing common (allele frequency >5%) ClinVar variants, but benign variants may have allele frequencies below 5%. Therefore, in most instances, it is reasonable to downgrade the weight of PM1 and PM2 in the prediction of pathogenicity.

PM1 is scored automatically by InterVar using the ClinVar dataset. To date, potential reporting bias exists in the collection of ClinVar variants, as clinical testing and research laboratories may submit only pathogenic variants, and not benign variants, to ClinVar. Therefore, the ClinVar-based PM1 evidence needs to be further developed using unbiased data sources, e.g., ExAC and gnomAD, which were built from exome and genome sequencing data [10, 11].
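The domain rule described for PM1 (domains containing only pathogenic or likely pathogenic ClinVar variants, with no benign or common variants) can be sketched as follows. The record layout, key names, and function name are hypothetical, and the sketch is not InterVar's actual implementation; only the 5% allele-frequency cutoff is taken from the text.

```python
COMMON_AF_THRESHOLD = 0.05  # the text defines "common" as allele frequency >5%
PATHOGENIC_LABELS = {"pathogenic", "likely pathogenic"}


def pm1_eligible_domains(clinvar_variants):
    """Return domains whose recorded variants are all pathogenic or likely
    pathogenic and none of which is common.

    `clinvar_variants` is an iterable of dicts with the hypothetical keys
    'domain', 'significance', and 'allele_frequency'.
    """
    domain_ok = {}
    for variant in clinvar_variants:
        is_clean = (
            variant["significance"] in PATHOGENIC_LABELS
            and variant["allele_frequency"] <= COMMON_AF_THRESHOLD
        )
        domain = variant["domain"]
        # A single benign or common variant disqualifies the whole domain.
        domain_ok[domain] = domain_ok.get(domain, True) and is_clean
    return {domain for domain, ok in domain_ok.items() if ok}


# Toy records with made-up domain names and frequencies.
toy_variants = [
    {"domain": "domain_A", "significance": "pathogenic", "allele_frequency": 0.0001},
    {"domain": "domain_A", "significance": "likely pathogenic", "allele_frequency": 0.0},
    {"domain": "domain_B", "significance": "pathogenic", "allele_frequency": 0.0002},
    {"domain": "domain_B", "significance": "benign", "allele_frequency": 0.01},
    {"domain": "domain_C", "significance": "pathogenic", "allele_frequency": 0.08},
]
eligible = pm1_eligible_domains(toy_variants)
```

The sketch also shows why submission bias matters: a domain stays eligible only as long as no benign variant in it has been recorded, so benign variants that were never submitted cannot disqualify it.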

Interestingly, 21 (0.45%; Supplementary Table S1) benign variants have PVS1 (very strong pathogenic evidence), and 14 of these variants also have other evidence of pathogenicity, most commonly PP3. These 21 variants are from 17 genes: eight introduce or abolish a stop codon, seven affect canonical ±1 or 2 splice sites, and six are frameshift variants. Although it is rare for variants with PVS1 to be predicted as benign by InterVar on the basis of sufficient benign evidence meeting the ACMG criteria, these variants should be considered for reclassification as being of uncertain significance in InterVar, and they warrant more intensive investigation. InterVar derived the PVS1 evidence from ClinVar [8] and the ExAC dataset [10], including 4807 genes that are loss-of-function (LOF)-intolerant and excluding nonsense variants close to the extreme 3′ end of a gene. Other caveats, including those noted in the ACMG guidelines [6], need to be considered when assessing the validity of PVS1; for example, alternative splicing [21], a cryptic initiation codon [22], and the presence of multiple transcripts may preserve gene function in these variants. For solid PVS1 evidence, the genetic variant needs to affect all mRNA transcripts. In view of these caveats, the ClinGen Sequence Variant Interpretation Working Group recently published recommendations to facilitate consistent and accurate application of PVS1 evidence [23].
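The requirement that a variant affect all mRNA transcripts can be expressed as a simple check, sketched below. The input is assumed to be a mapping from transcript identifier to a Sequence Ontology consequence term; the set of terms treated as loss of function, the function name, and the transcript identifiers are illustrative assumptions, not part of the ACMG guidelines or of InterVar.

```python
# Consequence terms treated here as loss of function. The selection of
# terms is an assumption made for illustration.
LOF_TERMS = {
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
}


def lof_in_all_transcripts(consequence_by_transcript):
    """True only if every annotated transcript carries a LOF consequence.

    `consequence_by_transcript` maps a transcript identifier to a single
    consequence term; an empty mapping returns False.
    """
    consequences = consequence_by_transcript.values()
    return bool(consequence_by_transcript) and all(
        term in LOF_TERMS for term in consequences
    )


# Toy annotations with made-up transcript identifiers.
affects_all = lof_in_all_transcripts({"tx1": "stop_gained", "tx2": "frameshift_variant"})
spares_one = lof_in_all_transcripts({"tx1": "stop_gained", "tx2": "intron_variant"})
```

A variant like the second example, where one transcript escapes the truncating effect, is the kind of case in which PVS1 would deserve closer scrutiny.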

In summary, this study presents a quantitative assessment of the automated application of the ACMG guidelines to the functional classification of genetic variants that may cause inherited diseases. The ACMG guidelines suggest a classification of variant of uncertain significance (VUS) for variants with contradictory evidence. Some laboratories allow one piece of contradictory supporting benign evidence for pathogenic/likely pathogenic variants with strong evidence [24]. Considering the limitations of automated variant classification algorithms such as InterVar, it would be of great interest to examine variants with contradictory evidence systematically using the full ACMG-based interpretation of an experienced laboratory. Our statistics show that BP1 is relatively common among likely pathogenic variants and should not be weighted so heavily as to overturn a likely pathogenic prediction. On the other hand, PM1 and PM2 are common in benign and likely benign variants and should be used cautiously as evidence for pathogenicity, while PM1 needs to be further developed using unbiased data sources. These observations should be considered in future revisions and improvements of the ACMG guidelines, as well as in improving automated algorithms for the functional classification of genetic variants. Currently, the quantitative Bayesian framework for interpreting the ACMG/AMP guidelines proposed by Tavtigian et al. has been shown to be a useful approach for the functional classification of some genetic variants with conflicting evidence [25].
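As a rough illustration of how a Bayesian framework of this kind combines conflicting evidence, the sketch below uses the parameterization commonly attributed to Tavtigian et al.: a prior probability of 0.10, odds of pathogenicity of 350 for very strong evidence, and strong, moderate, and supporting evidence scaled by exponents of 1/2, 1/4, and 1/8. These values and the posterior thresholds are stated here as assumptions and should be checked against reference [25]; the function names are hypothetical, and the stand-alone benign criterion BA1 is not modeled.

```python
PRIOR_PROBABILITY = 0.10   # assumed prior probability of pathogenicity
ODDS_VERY_STRONG = 350.0   # assumed odds for one very strong criterion


def posterior_probability(n_pvs=0, n_ps=0, n_pm=0, n_pp=0, n_bs=0, n_bp=0):
    """Combine counts of pathogenic (PVS, PS, PM, PP) and benign (BS, BP)
    criteria into a posterior probability of pathogenicity."""
    # Benign evidence enters with a negative sign, so it offsets
    # pathogenic evidence of the same strength.
    exponent = n_pvs + n_ps / 2 + n_pm / 4 + n_pp / 8 - n_bs / 2 - n_bp / 8
    odds = ODDS_VERY_STRONG ** exponent
    return odds * PRIOR_PROBABILITY / ((odds - 1) * PRIOR_PROBABILITY + 1)


def classify(posterior):
    """Map a posterior probability to a five-tier label (assumed cutoffs)."""
    if posterior > 0.99:
        return "pathogenic"
    if posterior >= 0.90:
        return "likely pathogenic"
    if posterior >= 0.10:
        return "uncertain significance"
    if posterior >= 0.001:
        return "likely benign"
    return "benign"


# One strong plus two moderate criteria, with and without one supporting
# benign criterion such as BP1.
without_bp = posterior_probability(n_ps=1, n_pm=2)
with_bp = posterior_probability(n_ps=1, n_pm=2, n_bp=1)
```

Under these assumed parameters a single supporting benign criterion lowers the posterior only modestly, which is in line with the observation above that BP1 alone should not overturn a likely pathogenic prediction.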