Identification of Genomic Alterations in Thai Patients With Colorectal Cancer Using Next-Generation Sequencing-Based Multigene Cancer Panel

Introduction: Colorectal cancer (CRC) is one of the leading causes of death and illness in the general population. Although the incidence of CRC is steadily decreasing worldwide, it is increasingly being diagnosed in individuals under 50 years of age. Multiple disease-causing variants have been reported to be involved in the development of CRC. This study aimed to investigate the molecular and clinical characteristics of Thai patients with CRC. Methods: Next-generation sequencing (NGS)-based multigene cancer panel testing was performed on 21 unrelated patients. Target enrichment was performed using a custom-designed Ion AmpliSeq on-demand panel. Thirty-six genes associated with CRC and other cancers were analyzed for variant detection. Results: Sixteen variants (five nonsense, eight missense, two deletions, and one duplication) in nine genes were identified in 12 patients. Eight of these patients (66.7%) harbored disease-causing deleterious variants in APC, ATM, BRCA2, MSH2, and MUTYH. One of the eight patients also carried additional heterozygous variants in ATM, BMPR1A, and MUTYH. In addition, four patients carried variants of uncertain significance in APC, MLH1, MSH2, STK11, and TP53. Among all detected genes, APC was the most frequent causative gene observed in CRC patients, which is consistent with previous reports. Conclusion: This study provides a molecular and clinical characterization of Thai patients with CRC. The findings show the benefit of multigene cancer panel sequencing for detecting pathogenic variants and indicate the prevalence of genetic aberrations in Thai patients with CRC.


Introduction
Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide and the second leading cause of cancer-related mortality after lung cancer [1]. CRC is most commonly diagnosed in individuals over 50 years of age. Although the incidence of CRC has been steadily declining over the past few decades, the detection of early-onset CRC (under the age of 50) is increasing [2]. Data from the population-based cancer registry in Thailand show that CRC ranks as the third most prevalent cancer in males and the second most prevalent cancer in females, with an estimated 8,658 and 7,281 new cases, respectively, and an estimated 4,781 deaths from this disease reported in the 2017 annual report [3,4].
Most early-stage CRCs are asymptomatic. As a result, CRC patients are often not diagnosed until they have reached an advanced stage of disease. Early detection therefore has a significant impact on prognosis and treatment and on reducing mortality. Because of the diversity of genes implicated in this disease, it is important to select a method that can screen many genes at once. Next-generation sequencing (NGS) is a powerful tool for detecting genetic alterations associated with various cancers. Advances in this technology allow the appropriate approach to be selected for cancer genetic testing. One NGS approach is to sequence selected target genes chosen for their cancer relevance, creating a multigene panel that is cheaper and more cost-effective. It also provides molecular profiles that can inform personalized cancer therapies and treatments. In the recent literature, multigene panels that include cancer susceptibility genes have been used extensively to screen patients with CRC [5][6][7][8][9][10][11].
The identification of a germline pathogenic variant in a known hereditary cancer-predisposing gene has important implications for both patients and their family members because it is relevant to gene-specific clinical management and treatment as well as to family planning. In this study, we examined the prevalence of germline alterations in 21 Thai patients with CRC using NGS-based targeted gene panel sequencing of 36 genes, comprising known CRC-predisposing genes and other cancer susceptibility genes.

Peripheral blood samples
For the germline alteration analysis, 18 mL of peripheral blood was collected from each patient after obtaining written consent and sent to the Molecular Genetics Laboratory, Siriraj Genomics.

NGS-based multigene cancer panel testing
Briefly, genomic DNA was extracted using the Qiagen DNeasy DNA Isolation Kit (Hilden, Germany). Genetic analysis was performed using NGS. Target enrichment was performed using a custom-designed Ion AmpliSeq on-demand panel (Ion Torrent S5 XL, Thermo Fisher Scientific, USA) [12]. All variants passing the filter criteria were validated with Sanger sequencing (the primer sequences for each gene used in this study are available upon request), and copy number variants were validated with Multiplex Ligation-dependent Probe Amplification (MLPA) kits from MRC-Holland (Amsterdam, The Netherlands). The variants were interpreted according to the ACMG/AMP 2015 "Standards and Guidelines for the Interpretation of Sequence Variants", which define five classes: pathogenic (P), likely pathogenic (LP), variant of uncertain significance (VUS), likely benign (LB), and benign (B) [13]. LB and B variants were interpreted as negative. In families with identified variants, targeted testing for those variants was performed in additional affected family members when available.
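The reporting rule described above (five ACMG/AMP classes, with LB and B interpreted as negative) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `AcmgClass`, `is_negative`, and `reportable` are hypothetical and are not part of the laboratory pipeline used in this study.

```python
from enum import Enum


class AcmgClass(Enum):
    """Five-tier ACMG/AMP classification, using the abbreviations from the text."""
    P = "P"      # pathogenic
    LP = "LP"    # likely pathogenic
    VUS = "VUS"  # variant of uncertain significance
    LB = "LB"    # likely benign
    B = "B"      # benign


def is_negative(classification: AcmgClass) -> bool:
    """Return True for classes interpreted as negative (LB and B)."""
    return classification in (AcmgClass.LB, AcmgClass.B)


def reportable(classifications):
    """Keep only the classes that are not interpreted as negative."""
    return [c for c in classifications if not is_negative(c)]
```

For example, `reportable([AcmgClass.P, AcmgClass.B, AcmgClass.VUS])` keeps the P and VUS entries and drops the benign one.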

Protein structure and functional prediction
The effect of missense variants on protein structure was predicted using the freely available web service HOPE (Have Your Protein Explained) (https://www3.cmbi.umcn.nl/hope/) [14].

Results
A total of 21 unrelated Thai patients with CRC were recruited in this study, consisting of 12 males and nine females. Multigene cancer panel testing was performed to identify disease-causing variants in these patients. Variant analysis of the 36 genes detected a total of 16 unique variants (five nonsense, eight missense, two deletions, and one duplication) affecting nine of the 36 genes (25%). Deleterious P/LP variants were detected in eight of the 21 patients (38.1%). The clinical characteristics, the variants identified in this study, and the family pedigrees are summarized in Table 1, Table 2, and Figure 1, respectively.
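The proportions quoted in this paragraph can be reproduced with simple arithmetic. The short Python sketch below uses only the counts stated in the text; the helper name `percent` and the variable names are illustrative assumptions.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts taken from the Results text
genes_with_variants = percent(9, 36)  # genes in which a variant was found
patients_with_p_lp = percent(8, 21)   # patients carrying a P/LP variant

# Variant types reported in the text; their sum should equal the 16 unique variants
variant_types = {"nonsense": 5, "missense": 8, "deletion": 2, "duplication": 1}
total_variants = sum(variant_types.values())
```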

Potentially pathogenic variants
Among the eight patients with detectable P/LP variants, four carried variants in APC. P/LP variants in other genes (ATM, BRCA2, MSH2, and MUTYH) were found in the remaining four patients (Table 2).
Deleterious truncating variants in the APC gene (p.Arg554*, p.Arg564*, and p.Leu698*) were observed in three patients (PMCRC16, PMCRC15, and PMCRC4, respectively). These three variants have been previously reported in the ClinVar database as P variants. In patient PMCRC13, a heterozygous alteration (c.(1313_1409)_(1548_1626)del) that causes deletion of exon 12 of the APC gene was detected by MLPA analysis. This variant has not been previously reported in any database. Interestingly, the same variant was also detected in her affected brother (Figure 1). According to the ACMG guidelines, this putative loss-of-function variant is classified as P.
In patient PMCRC19, a heterozygous three-base deletion (c.1786_1788delAAT) was identified in the MSH2 gene. This variant causes an in-frame deletion of a single amino acid (p.Asn596del) in the DNA mismatch repair protein MSH2. It is reported as P in both ClinVar (Variation ID: 1757) and the InSiGHT database (http://insight-database.org/) and has been observed in numerous patients with Lynch syndrome. In the BRCA2 gene, a heterozygous nonsense variant, c.7558C>T (p.Arg2520*), was identified in patient PMCRC7. It has been previously reported in the ClinVar database as a P variant (Variation ID: 52353). In the ATM gene, a heterozygous nonsense variant, c.3687C>G (p.Tyr1229*), was detected in patient PMCRC21. This variant is not present in population databases and was classified as LP according to the ACMG criteria.
In patient PMCRC8, four variants in three genes (MUTYH, ATM, and BMPR1A) were identified. In the MUTYH gene, two heterozygous variants, c.545G>A (p.Arg182His) and c.674C>T (p.Ser225Phe), were detected. The p.Arg182His variant has been reported multiple times as pathogenic (ClinVar Variation ID: 182689) in individuals affected with familial adenomatous polyposis (FAP) and CRC. The other MUTYH variant (p.Ser225Phe) and the variants in the two other genes (ATM and BMPR1A) were categorized as of uncertain significance.

Variants of unknown significance
A total of eight VUS in eight genes (APC, ATM, BMPR1A, MLH1, MSH2, MUTYH, STK11, and TP53) were detected in five patients (Table 2). According to all of the following information, both variants were classified as VUS.
As mentioned earlier, patient PMCRC8 harbored four variants in three genes (MUTYH, ATM, and BMPR1A). One heterozygous MUTYH variant, p.Arg182His, was pathogenic, while the other heterozygous MUTYH variant, p.Ser225Phe, has not been reported in any database. In silico prediction tools suggested that p.Ser225Phe is deleterious. However, the supporting evidence is currently insufficient to ascertain the role of this variant in the disease. Therefore, this variant was classified as VUS per the ACMG criteria. Furthermore, heterozygous missense variants, c.743G>A (p.Arg248Gln) and c.116C>T (p.Ser39Phe), were observed in the ATM and BMPR1A genes, respectively. These two variants are reported in ClinVar as of uncertain significance (Variation ID: 479015 and 230945, respectively).
The MLH1 variant, c.1549G>A (p.Gly517Arg), was identified in patient PMCRC9. In silico prediction tools gave conflicting results regarding the pathogenicity of this variant. This substitution is absent from gnomAD and has a single submission in the ClinVar database (Variation ID: 1404870) as of uncertain significance. Therefore, the p.Gly517Arg variant was classified as VUS according to the evidence.
The STK11 variant, c.1168G>A (p.Val390Met), was detected in patient PMCRC1. Many prediction tools predicted a benign effect of the variant on protein structure and function. This variant is reported in dbSNP (rs374078532) and in the gnomAD exomes East Asian population database (allele frequency = 0.000113). Seven clinical laboratories that submitted clinical significance assessments to ClinVar classified this variant as of uncertain significance (Variation ID: 142283). Based on all supporting data, the p.Val390Met variant was classified as VUS.
The TP53 variant, c.304A>T (p.Thr102Ser), was observed in patient PMCRC5. Computational prediction tools are inconclusive regarding the impact of this missense variant on protein structure and function. Although threonine residue 102 of cellular tumor antigen p53 is not evolutionarily conserved, the HOPE prediction showed that Thr102 is located within a stretch of residues annotated in UniProt as a special region (interaction with CCAR2). The p.Thr102Ser substitution may disturb this region and its function. This variant is absent from the gnomAD cohort. ClinVar contains an entry for this variant as of uncertain significance (Variation ID: 582041) with four submissions. The evidence currently available is not sufficient to determine the role of p.Thr102Ser in disease. Therefore, the clinical significance of this variant is uncertain.
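As a simple sanity check on the single-nucleotide variants listed in the Results, the codon number implied by each coding-DNA position can be compared with the residue number in the protein-level description. The Python sketch below assumes that coding positions are numbered from the first base of the start codon; the names `codon_of`, `is_consistent`, and `PAIRS` are hypothetical, and the check covers only simple substitutions, not the deletions or the duplication.

```python
import re
from typing import Tuple


def codon_of(cdna_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding position to (codon number, position within the codon, 1..3)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


# (coding-DNA description, protein description) pairs copied from the text
PAIRS = [
    ("c.7558C>T", "p.Arg2520*"),   # BRCA2
    ("c.3687C>G", "p.Tyr1229*"),   # ATM
    ("c.545G>A", "p.Arg182His"),   # MUTYH
    ("c.674C>T", "p.Ser225Phe"),   # MUTYH
    ("c.743G>A", "p.Arg248Gln"),   # ATM
    ("c.116C>T", "p.Ser39Phe"),    # BMPR1A
    ("c.1549G>A", "p.Gly517Arg"),  # MLH1
    ("c.1168G>A", "p.Val390Met"),  # STK11
    ("c.304A>T", "p.Thr102Ser"),   # TP53
]


def is_consistent(c_desc: str, p_desc: str) -> bool:
    """Check that the codon implied by the c. position matches the p. residue number."""
    c_pos = int(re.match(r"c\.(\d+)[ACGT]>[ACGT]$", c_desc).group(1))
    p_pos = int(re.match(r"p\.[A-Za-z]{3}(\d+)", p_desc).group(1))
    return codon_of(c_pos)[0] == p_pos
```

For instance, `codon_of(7558)` places c.7558C>T at the first base of codon 2520, which agrees with p.Arg2520*. Since arginine codons beginning with C have the form CGN and TGA is the only stop codon of the form TGN, the affected codon would be CGA changing to TGA.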

Discussion
In this study, cancer-related molecular diagnostic testing of 36 genes by NGS was performed in 21 unrelated subjects with CRC. Most of the patients were diagnosed with rectal cancer (12 cases) (Table 1). Variant analysis revealed clinically significant P or LP variants in eight patients (38.1%), while VUS were detected in a further four patients (19.0%).
Patient PMCRC2 was diagnosed with sigmoid colon cancer with FAP at the age of 29 years and had a strong family history of CRC (Figure 1). This patient carried variants in two genes (a duplication of exons 10-11 in APC and a missense variant in MSH2), both of which were classified as VUS (Table 2). In previous research on the APC gene, two duplications were detected in families with FAP [15,16]. The first was a duplication of exon 4, whereas the second was a duplication of exons 10-11. Both duplications result in a frameshift and truncation of the APC protein. Based on these data, the duplication of exons 10-11 of the APC gene found in patient PMCRC2 is assumed to be deleterious. Protein-truncation testing is required to detect the abnormal protein. For the other variant, p.Ser185Cys in the MSH2 gene, cosegregation analysis and further in-depth functional studies are needed to understand the disease mechanism. The results of such testing would be useful for reclassifying the clinical significance of these two variants in patient PMCRC2, which may change from VUS to P or LB/B.
Patient PMCRC7 developed ascending colon cancer at the age of 33 years and had a family history of CRC, lymphoma, and uterine cancer (Figure 1). This patient harbored a pathogenic BRCA2 variant, p.Arg2520* (ClinVar Variation ID: 52353), which has been observed in several individuals of diverse ethnic origins with hereditary breast-ovarian cancer (HBOC) and with prostate, pancreatic, esophageal, and renal cancer. Although the reported association of CRC with the BRCA2 gene remains controversial [17], deleterious BRCA2 variants have been observed in CRC patients in numerous studies [7,8,18,19]. Therefore, the pathogenic variant p.Arg2520* observed in patient PMCRC7 was classified as a secondary finding. Unfortunately, owing to the scarcity of DNA samples from family members, it is difficult to establish the segregation of p.Arg2520* in this family. The patient is now 44 years old. She underwent a mammogram and ultrasound, which revealed several cysts in the right breast and a few cysts in the left breast. The result was interpreted as negative (BI-RADS 2). Based on the available data, no association with HBOC has been found at the patient's current age. Although the pathogenic variant p.Arg2520* has been associated with cancer in multiple organs, it has never been reported in colorectal cancer. Therefore, this is the first report showing the presence of the p.Arg2520* variant in a CRC patient.
Patient PMCRC8, a female with sporadic disease diagnosed at the age of 32, presented with rectal cancer. Colonoscopy disclosed a few polyps (fewer than 10) in the sigmoid colon. Variant analysis revealed two heterozygous variants in MUTYH (one known pathogenic variant and one unclassified variant) and a single heterozygous VUS in each of ATM and BMPR1A (Table 2). MUTYH-associated polyposis (MAP) is an autosomal recessive inherited disorder. Biallelic germline variants in MUTYH lead to MAP and a tendency to develop CRC. Many different phenotypes, including classic and attenuated polyposis, can be seen in individuals with MAP. A previous study has shown that monoallelic carriers have an elevated risk of CRC compared with the general population [20]. The lack of DNA samples from additional family members makes it impossible to ascertain whether the MUTYH variants are in cis or in trans. However, first-degree relatives should be screened for the same pathogenic variant to assess their risk of developing CRC. In addition, without samples from family members, it was not possible to assess the ATM and BMPR1A variants as possible modifiers or as possible causes of phenotypic variation within the patient's family. Therefore, further cosegregation analysis is essential to evaluate the disease penetrance of the variants in these three genes.
Patient PMCRC21, a 36-year-old male at the time of diagnosis, presented with a malignant neoplasm of the transverse colon and had a strong family history of CRC (Figure 1). Genetic testing revealed a nonsense variant (p.Tyr1229*) in the ATM gene. Deleterious variants in ATM, a moderate-penetrance gene, have been shown to be associated with increased susceptibility to breast cancer [21]. Furthermore, ATM has also been reported to be associated with the risk of CRC [6][7][8][10][11][19].
Our results are consistent with several published reports showing that APC is the most frequently mutated gene detected in early-onset CRC patients [19,22,23]. Further analysis, namely immunohistochemical staining of the four main DNA mismatch-repair (MMR) proteins, microsatellite instability (MSI) testing, or MLH1 hypermethylation analysis, is required to confirm the diagnosis of patient PMCRC9. Individuals with a negative genetic test result may carry causative variants in the tested genes that remain undetected because of the limitations of sequencing, or variants in other cancer susceptibility genes. Additionally, these patients may have sporadic cancer as a result of somatic mosaicism of an APC pathogenic variant [24], somatic hypermethylation of the MLH1 gene [25], germline hypermethylation of MSH2 (via 3' end deletions in the EPCAM gene) [26], or a confluence of genetic and environmental risk factors.