Clinicopathological Characteristics and Mutational Landscape of APC, HOXB13, and KRAS among Rwandan Patients with Colorectal Cancer

Cancer research in Rwanda is estimated to be less than 1% of the total African cancer research output with limited research on colorectal cancer (CRC). Rwandan patients with CRC are young, with more females being affected than males, and most patients present with advanced disease. Considering the paucity of oncological genetic studies in this population, we investigated the mutational status of CRC tissues, focusing on the Adenomatous polyposis coli (APC), Kirsten rat sarcoma (KRAS), and Homeobox B13 (HOXB13) genes. Our aim was to determine whether there were any differences between Rwandan patients and other populations. To do so, we performed Sanger sequencing of the DNA extracted from formalin-fixed paraffin-embedded adenocarcinoma samples from 54 patients (mean age: 60 years). Most tumors were located in the rectum (83.3%), and 92.6% of the tumors were low-grade. Most patients (70.4%) reported never smoking, and 61.1% of patients had consumed alcohol. We identified 27 variants of APC, including 3 novel mutations (c.4310_4319delAAACACCTCC, c.4463_4470delinsA, and c.4506_4507delT). All three novel mutations are classified as deleterious by MutationTaster2021. We found four synonymous variants (c.330C>A, c.366C>T, c.513T>C, and c.735G>A) of HOXB13. For KRAS, we found six variants (Asp173, Gly13Asp, Gly12Ala, Gly12Asp, Gly12Val, and Gln61His), the last four of which are pathogenic. In conclusion, here we contribute new genetic variation data and provide clinicopathological information pertinent to CRC in Rwanda.


Introduction
Cancer in Africa is underinvestigated and underreported in the scientific literature [1][2][3], especially with respect to cancer genetics and genomics [4,5]. Available data highlight current and future increases in the incidence and mortality related to colorectal cancer (CRC) in African countries [3,6,7], and the Rwandan population is no exception [7].
Globally, CRC accounted for nearly 10% of cancer incidence and 9% of cancer deaths in 2020 and was ranked as the third-most deadly cancer [8,9].
[...] consent form prior to participation in this study. The study was conducted in accordance with the Declaration of Helsinki.

Patients
From December 2020 to September 2022, we prospectively recruited consecutive patients attending colonoscopy services at the Department of Internal Medicine of the University Teaching Hospital of Kigali (CHUK, French acronym), Rwanda. Clinical data and self-reported information regarding family history of cancer, alcohol use, and tobacco use were recorded prospectively. Tissue processing and microscopic examination were performed at CHUK. The microscopic diagnosis was reconfirmed by a second pathologist at Hamamatsu University School of Medicine (HUSM) in Japan, and only patients with histologically confirmed CRC were included in this study. Initially, we graded cases according to the extent of glandular differentiation/formation using a three-tier grading system: grade 1 (well-differentiated CRC, glandular formation >95%), grade 2 (moderately differentiated CRC, glandular formation 50-95%), and grade 3 (poorly differentiated CRC, glandular formation <50%) [39]. All CRC cases were then grouped into a two-tier grading system, low grade (grades 1 and 2) and high grade (grade 3), according to the recommendations of the fifth edition of the World Health Organization (WHO) classification of tumors of the digestive system [40]. During the study period, 148 consecutive patients underwent colonoscopic biopsy for suspected CRC. Of these, 129 (87.2%) signed an informed consent form to participate in our study. Of the 129 consenting patients, 58 (45.0%) had biopsy-confirmed CRC, but 4 cases were excluded: one due to poor-quality DNA, one due to a low concentration of extracted DNA, one whose DNA could not be amplified by PCR, and one due to insufficient tissue for DNA extraction. The final number of cases analyzed and presented in this study was 54, and all were naïve to cancer therapy.
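The grading rules and the enrollment arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the handling of the exact boundary values (95% assigned to grade 2, 50% assigned to grade 2) is an assumption based on the ranges quoted in the text.

```python
def three_tier_grade(gland_formation_pct: float) -> int:
    """Return grade 1, 2, or 3 from the percentage of glandular formation."""
    if not 0 <= gland_formation_pct <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if gland_formation_pct > 95:
        return 1  # well differentiated (>95%)
    if gland_formation_pct >= 50:
        return 2  # moderately differentiated (50-95%)
    return 3      # poorly differentiated (<50%)


def two_tier_grade(grade: int) -> str:
    """Collapse the three-tier grade into the two-tier low/high scheme."""
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    return "low" if grade in (1, 2) else "high"


# Enrollment flow, using the counts given in the text.
screened, consented, confirmed_crc, excluded = 148, 129, 58, 4
analyzed = confirmed_crc - excluded                           # 54 cases
consent_rate = round(100 * consented / screened, 1)           # rounded to one decimal
crc_rate = round(100 * confirmed_crc / consented, 1)          # rounded to one decimal
```

For example, `two_tier_grade(three_tier_grade(70))` places a tumor with 70% glandular formation in the low-grade group.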

DNA Extraction
Since our samples were collected prospectively, biopsy specimens were fixed in 10% neutral buffered formalin immediately at the time of biopsy. After automated tissue processing, samples were embedded in paraffin. DNA extraction was performed within six months of tissue collection. For each case, DNA was extracted from formalin-fixed paraffin-embedded (FFPE) tissue blocks containing at least 50% tumor cells, using QIAamp DNA FFPE Advanced UNG Kits (Qiagen GmbH, Hilden, Germany) according to the manufacturer's recommendations. The concentration and quality of the extracted DNA were then measured using a NanoDrop® 1000 spectrophotometer (ThermoFisher Scientific, Wilmington, DE, USA). Only cases with a 260/280 absorbance ratio between 1.7 and 2.3 were included in further analysis.
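The purity criterion above amounts to a simple range filter on the 260/280 ratio. The sketch below assumes that both bounds are inclusive, which the text does not state, and uses made-up sample names and readings purely for illustration.

```python
def passes_purity_qc(ratio_260_280: float, low: float = 1.7, high: float = 2.3) -> bool:
    """True if the 260/280 ratio lies within the accepted window (bounds assumed inclusive)."""
    return low <= ratio_260_280 <= high


# Hypothetical readings, not data from the study.
readings = {"case_A": 1.85, "case_B": 1.62, "case_C": 2.31, "case_D": 2.05}

# Keep only the cases whose DNA purity falls inside the window.
kept = [case for case, ratio in readings.items() if passes_purity_qc(ratio)]
print(kept)
```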
We also sequenced both exons (exons 1 and 2) of the HOXB13 gene [45]. Moreover, because no genetic studies of Rwandans with CRC are available, we sequenced exons 2 to 4 of KRAS to cover a wide range of KRAS mutations in this understudied population [46]. We also added a fragment of exon 5 extending from GRCh38: 25209690 to 25209999 to include the hypervariable region (HVR) of this gene, since mutations in the HVR are not covered in some reports [47]. The boundaries of all sequenced exon segments and the primer sequences used to amplify the three genes are provided in Supplementary Table S1. Prior to sequencing, extracted DNA was amplified by PCR using HotStarTaq DNA Polymerase (Qiagen) in a 20 µL reaction volume. The quality of PCR products was then assessed by electrophoresis on a 2% agarose gel.
After amplification, PCR products were purified and sequenced as described in a previous study [48]. Briefly, PCR products were purified using ExoSAP-IT (ThermoFisher Scientific). The purified products were directly Sanger sequenced in both directions using the BigDye™ Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems™) in 20 µL reaction volumes, according to the manufacturer's recommendations. Base calling was performed using an Applied Biosystems 3130xl or 3500xL Genetic Analyzer (Applied Biosystems, Foster City, CA, USA).

TA Cloning and Plasmid Sequence Analysis for Insertion/Deletion Cases
TA cloning was performed to assess cases with insertion/deletion mutations. Two separate cloning experiments (i.e., two different transformations with two PCR products obtained in separate reactions) were used to rule out polymerase-induced errors.

Ligation Reaction Using pGEM®-T Easy Vector System
First, PCR products were purified using QIAquick® PCR Purification Kits (Qiagen GmbH, Hilden, Germany). Next, we constructed plasmids by ligating the purified PCR products into the pGEM®-T Easy Vector (Promega Corporation, Madison, WI, USA), following the manufacturer's protocol.

Transformation Using Highly Competent DH5α E. coli Cells
Competent high DH5α E. coli cells (Toyobo Co., Ltd., Osaka, Japan) were then transformed with the constructed plasmid. All procedures were performed according to Promega's protocol, except for the use of the Toyobo DH5α cells. Furthermore, we prepared fresh X-Gal/IPTG Luria-Bertani (LB) agar plates containing ampicillin at a final concentration of 100 µg/mL according to the ThermoFisher Scientific protocol: to each 25 mL of LB agar, we added 40 µL of X-Gal solution (20 mg/mL) and 40 µL of IPTG (100 mM), and the plates were allowed to dry under UV light at room temperature before use.
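The nominal X-Gal and IPTG concentrations in each plate follow from the dilution relation C_final = C_stock × V_stock / V_total. The sketch below is a rough check, assuming the two additives are mixed evenly into the 25 mL of agar and that the small added volumes are counted in the total; the function name is hypothetical.

```python
def final_concentration(stock_conc: float, stock_vol_ul: float, total_vol_ul: float) -> float:
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_vol_ul / total_vol_ul


agar_ul = 25 * 1000                  # 25 mL of LB agar, expressed in µL
total_ul = agar_ul + 40 + 40         # agar plus the X-Gal and IPTG additions

xgal_mg_per_ml = final_concentration(20.0, 40.0, total_ul)   # X-Gal stock: 20 mg/mL
iptg_mm = final_concentration(100.0, 40.0, total_ul)         # IPTG stock: 100 mM

print(f"X-Gal: {xgal_mg_per_ml * 1000:.1f} µg/mL, IPTG: {iptg_mm:.2f} mM")
```

Under these assumptions, the plates contain roughly 32 µg/mL X-Gal and 0.16 mM IPTG.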

Screening for Transformants Followed by Sanger Sequencing of Positive Subclones
Transformed cells were grown overnight on the X-Gal/IPTG LB agar plates described above, allowing blue/white colony selection. Eight white colonies (subclones) were then selected [48], and each was cultured in 50 µL of liquid LB medium containing ampicillin at a final concentration of 100 µg/mL for 1 h 30 min at 37 °C. The resulting cultures were centrifuged, and 1 µL of the lower phase was used for PCR in a 20 µL volume using HotStarTaq DNA Polymerase (Qiagen) and the primers pGEM Easy V-2930F (5′-AGG CGA TTA AGT TGG GTA ACG-3′) and pGEM Easy V-134R (5′-CAA GCT ATG CAT CCA ACG C-3′). The resulting PCR products were then purified and sequenced as described above.
Known variants were then annotated using the dbSNP (build 156) database, following the recommendations of the Human Genome Variation Society (HGVS) [50]. Novel variants were analyzed using the in silico tool MutationTaster2021 [51] to predict their consequences for the resulting proteins. Each detected mutation was confirmed by two independent experiments to exclude false mutations. We then consulted major databases (i.e., ALFA, ClinVar, dbSNP, VariSome, 1000 Genomes Project, TOPMed, and COSMIC) before concluding that a mutation was novel. Finally, since we used tumor tissue, we could not determine whether mutations were purely somatic or were also present in the germline.
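Basic properties of the two vector primers quoted above (length and GC content) can be checked with a short script. This is a minimal sketch: the helper names are hypothetical, and the primer sequences are copied from the text with the spacing removed.

```python
PRIMERS = {
    "pGEM Easy V-2930F": "AGG CGA TTA AGT TGG GTA ACG",
    "pGEM Easy V-134R": "CAA GCT ATG CAT CCA ACG C",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def clean(seq: str) -> str:
    """Remove the triplet spacing used in print and normalize to upper case."""
    return seq.replace(" ", "").upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the primer."""
    s = clean(seq)
    return 100 * (s.count("G") + s.count("C")) / len(s)


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(clean(seq)))


for name, seq in PRIMERS.items():
    print(name, len(clean(seq)), "nt", f"{gc_percent(seq):.1f}% GC")
```

The reverse complement helper is included because the reverse primer anneals to the opposite strand, so its binding site on the forward strand is the reverse complement of the primer sequence.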
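The decision rule described above (confirmation in two independent experiments, then absence from every consulted database) can be written as a small predicate. This is a sketch of the logic only: the function name and the `hits` dictionary are hypothetical, and no database is actually queried.

```python
DATABASES = (
    "ALFA", "ClinVar", "dbSNP", "VariSome",
    "1000 Genomes Project", "TOPMed", "COSMIC",
)


def is_reportable_as_novel(hits: dict, confirmed_experiments: int) -> bool:
    """True only if the variant was confirmed twice and is absent from every database.

    `hits` maps each database name to True (variant found) or False (not found);
    the lookups themselves are assumed to have been done by hand or elsewhere.
    """
    missing = [db for db in DATABASES if db not in hits]
    if missing:
        raise ValueError(f"databases not yet checked: {missing}")
    return confirmed_experiments >= 2 and not any(hits[db] for db in DATABASES)
```

Requiring an explicit entry for every database prevents a variant from being called novel simply because one resource was never consulted.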

Clinicopathological Characteristics of Patients
In this study, we analyzed data from 54 patients with a mean age of 60 ± 13 years (range 31-89 years). Our dataset included 34 females (63%) and 20 males (37%). Tumors were mainly located in the rectum (45 cases, 83.3%) or colon (8 cases, 14.8%), with only one anorectal tumor (1.9%). All specimens were biopsies and were histologically confirmed as adenocarcinoma. In total, 50 cases (92.6%) were low-grade and 4 cases (7.4%) were high-grade. Most patients (38 cases, 70.4%) reported that they had never smoked, while 33 patients (61.1%) reported that they had consumed alcohol. Detailed clinical data are presented in Table 1.

Characteristics of Genetic Variants in the KRAS Gene
Next, analysis of the KRAS gene identified six different genetic variants including four pathogenic variants (i.e., Gly12Ala, Gly12Asp, Gly12Val, and Gln61His), one variant with a conflicting interpretation of pathogenicity (Gly13Asp), and one SNP (NM_004985.5: c.519T>C) that leads to a synonymous amino acid (Asp173=). Detailed information regarding these mutations is provided in Table 4.
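KRAS variants are often quoted in one-letter form in the literature (for example G12D for Gly12Asp). The sketch below converts the three-letter protein-level names used in this section into one-letter shorthand. It handles only simple substitutions and the synonymous "=" form, and the function name is hypothetical.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Ter": "*",
}

# Reference residue, position, then either a new residue or "=" (synonymous).
PATTERN = re.compile(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2}|=)")


def to_one_letter(variant: str) -> str:
    """Convert e.g. 'Gly12Asp' to 'G12D' and 'Asp173=' to 'D173='."""
    match = PATTERN.fullmatch(variant)
    if match is None:
        raise ValueError(f"unsupported variant format: {variant}")
    ref, pos, alt = match.groups()
    alt_short = "=" if alt == "=" else THREE_TO_ONE[alt]
    return f"{THREE_TO_ONE[ref]}{pos}{alt_short}"


kras_variants = ["Gly12Ala", "Gly12Asp", "Gly12Val", "Gln61His", "Gly13Asp", "Asp173="]
print([to_one_letter(v) for v in kras_variants])
```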

Discussion
CRC is underinvestigated in the Rwandan population. To our knowledge, only two papers [18,53] and one abstract [19] reporting on CRC in Rwanda have been published in international peer-reviewed journals within the last 10 years. In contrast to global data, in which CRC is reported to be more common in males than in females [9], we found that CRC was more common in females than in males (63.0% versus 37.0%). The proportion of females in our cohort is also higher than the 52.5% reported by Uwamariya et al. [18] and the 52.2% reported by Fadelu et al. [19].
Most tumors were located in the rectum (83.3%), which may be due to the fact that rectal tumors are more symptomatic (i.e., they display tenesmus, pain during defecation, bleeding, and blood in the stool, among other symptoms) than colon tumors. Therefore, patients with rectal tumors are more likely to seek medical care relative to patients with colon tumors. In addition, screening programs for CRC are not available in our settings [20]. As a result, most patients with colon tumors present with advanced stages of cancer and signs of obstruction. They, therefore, immediately undergo resection to relieve the obstruction without undergoing colonoscopy. This speculation may be supported by the fact that Uwamariya et al. [18] found that rectal cancer accounted for 40% of all CRC cases in a study conducted at the same institution as ours that used both biopsy and resection specimens.
We also found that 35.2% (19/54) of our research participants reported that they had no information regarding a family history of cancer. This is different from missing information that was not recorded, which accounted for 3.7% (2/54 cases) in this study. In addition, 61.1% of CRC patients reported that they had "no family history of cancer." Therefore, none of the patients in this study reported having a family history of cancer. We speculate that self-reported data regarding family history of cancer may not necessarily mean that cancers were absent, especially because of limited opportunities to receive a cancer diagnosis, treatment, and cancer registration in the past [20]. Therefore, data on family history of cancer should be interpreted with caution, and prospective studies with good documentation regarding cancer-related information are recommended.

New Mutations in the APC Gene
The APC gene is a key gatekeeper gene involved in the development of CRC [28,30]. Germline mutations in APC are also key factors in familial adenomatous polyposis syndrome. This syndrome is quasi-nonexistent in African populations, with only a few cases having ever been reported in the literature [54][55][56][57][58][59][60][61][62][63]. Mutations in the APC gene have been reported in more than 50% of CRC cases [44]. Moreover, >60% of APC mutations are located in the mutation cluster region (MCR) [44,64,65,66]. In this study, we limited our genetic assessment to a region including the MCR with ±300 bases flanking either side. In total, we detected 27 genetic variants or mutations, 24 of which have been previously documented in the dbSNP, ClinVar, and/or COSMIC databases.
Three mutations, i.e., c.4310_4319delAAACACCTCC: p.Lys1437Asnfs*33, c.4463_4470delinsA: p.Leu1488Tyrfs*17, and c.4506_4507delT: p.Ser1503Hisfs*4, have, to our knowledge, not yet been reported in the literature or in major genetic variation databases. All three are frameshift mutations that are predicted to introduce a premature termination codon and therefore produce a truncated protein. In silico analysis predicts that each of these variants is deleterious and causes nonsense-mediated mRNA decay.
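A deletion or deletion-insertion shifts the reading frame when the net change in length is not a multiple of three. The sketch below applies that rule to the three novel variant names as written in the text. It is a deliberately simplified parser, not a full HGVS implementation: it covers only the `del` and `delins` forms used here, and when a deleted sequence is spelled out it assumes that sequence, rather than the coordinate range, gives the number of deleted bases.

```python
import re

# c.<start>[_<end>]del[<deleted bases>][ins<inserted bases>]
PATTERN = re.compile(r"c\.(\d+)(?:_(\d+))?del([ACGT]*)(?:ins([ACGT]+))?")


def net_length_change(hgvs_c: str) -> int:
    """Inserted bases minus deleted bases for a coding-DNA del/delins variant."""
    match = PATTERN.fullmatch(hgvs_c)
    if match is None:
        raise ValueError(f"unsupported variant format: {hgvs_c}")
    start = int(match.group(1))
    end = int(match.group(2) or match.group(1))
    deleted_seq = match.group(3)
    inserted_seq = match.group(4) or ""
    # Assumption: an explicitly written deleted sequence takes precedence over the range.
    deleted = len(deleted_seq) if deleted_seq else end - start + 1
    return len(inserted_seq) - deleted


def is_frameshift(hgvs_c: str) -> bool:
    """True if the net length change is not divisible by three."""
    return net_length_change(hgvs_c) % 3 != 0


novel_variants = [
    "c.4310_4319delAAACACCTCC",
    "c.4463_4470delinsA",
    "c.4506_4507delT",
]
for variant in novel_variants:
    print(variant, net_length_change(variant), is_frameshift(variant))
```

Under these assumptions, the three variants change the coding sequence by -10, -7, and -1 bases, respectively, so none preserves the reading frame, which is consistent with the frameshift (fs) annotations at the protein level.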

Variants in the HOXB13 Gene
We identified four different genetic variants, rs33993186, rs8556, rs9900627, and rs138675188, all of which are synonymous and classified as benign or likely benign. Silva et al. [34] reported only the rs8556 and rs9900627 variants in CRC samples from patients treated at the Portuguese Institute of Oncology-Porto. In our study, however, we also found cases containing the rs33993186 and rs138675188 variants. The rs33993186 (g.487282264G>T) variant and the rs138675188 (g.48726910C>T) variant appear to be more than twice as common in Africans as in the global population, as currently reported in dbSNP build 156.
The HOXB13 Gly84Glu variant found mostly in Europeans [67,68], the Gly132Glu and Gly135Glu mutations described in Asians [69,70], and the HOXB13 Ter285Lys variant found mostly in West African populations [52] were all absent in our study participants.

KRAS Missense Mutations
Five of the six KRAS variants identified in this study are missense mutations: the pathogenic mutations Gly12Asp, Gly12Ala, Gly12Val, and Gln61His, as well as Gly13Asp, which has conflicting interpretations of pathogenicity. These five missense mutations are the most commonly reported KRAS mutations and were present in 22 of 54 samples (40.7%) analyzed here. By comparison, Ben Salah et al. [71] reported a KRAS mutation rate of 55.7%, while Marbun et al. [72] reported a rate of 64%. In other studies, either Gly12Asp [73,74] or Gly12Val [71,75] was the most common KRAS mutation. In our study, however, Gly13Asp was the most recurrent mutation, followed by Gly12Val. KRAS Gly12Asp, which has been reported to be associated with the best prognosis [73], was present in 2 of our 54 patients, and Gly12Cys, which is associated with a poor prognosis [73], was not identified in any patient in our study.

Smoking, Alcohol Consumption, Diet, and Other Environmental Risk Factors for CRC
A total of 20.4% (11/54) of our study population reported ever having smoked, and 61.1% (33/54) reported ever having consumed alcohol. Both proportions are higher than the 6.9% (smokers) and 14.7% (drinkers) reported by Wismayer et al. [76] in neighboring Uganda. In addition, Wismayer et al. [76] found that former smokers and current and former drinkers had a higher risk of developing CRC.
It has been reported that a high intake of red meat, processed meat, and sweetened beverages, a diet low in fiber, and a low intake of dairy products [77][78][79][80] are associated with an increased risk of developing CRC. Although we did not assess the diet of our study participants, it is worth noting that Rwandans rarely consume meat, fish, fruit, or dairy products, whereas their diet may be rich in starchy foods [81][82][83].

Limitations
Our study has some limitations. We only used tumor tissues. Therefore, we could not confirm whether genetic variations and/or mutations were purely somatic or also present in the germline. By using Sanger sequencing, we were only able to investigate a small number of cases and genes. Furthermore, with respect to the APC gene, our analysis was limited to only a small portion of this gene. In addition, we did not investigate any genes associated with Lynch syndrome, nor did we determine the chromosomal stability status of our study participants. Further studies using next-generation sequencing techniques are recommended to investigate a large number of genes in a larger pool of cases. Our clinical data did not include the cancer stage; therefore, we provided limited pathological characteristics of our patients. Furthermore, information on smoking, alcohol consumption, and family history of cancer was self-reported, and we were not able to make an objective estimate of the amounts consumed. Therefore, we could not estimate an association between these factors and the presence or absence of a particular mutation. Finally, we were not able to perform immunohistochemical staining or functional studies to further investigate the effect of the described mutations on the expression levels of corresponding proteins and/or to confirm the in silico predicted mutational effect.

Conclusions
In this study, we contribute new genetic variation data and clinicopathological information relevant to CRC patients in Rwanda. This genetic information does not yet reflect specific environmental (e.g., dietary) components. Additional studies are therefore required to improve the quality of scientific data characterizing CRC cases in the Rwandan population and to improve evidence-based management of this disease. The genetic data provided in this paper also represent a valuable resource for the study of CRC in understudied populations.
Supplementary Materials: The following supporting information is submitted as supplementary material and can be downloaded at: https://www.mdpi.com/article/10.3390/cimb45050277/s1. Table S1: List of primer pairs for PCR amplification.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The data presented in this study are available on request from the corresponding author.