Gene Variants in Components of the microRNA Processing Pathway in Chronic Myeloid Leukemia

Current therapy in chronic myeloid leukemia (CML) has improved patient life expectancy close to that of healthy individuals. However, molecular alterations other than the BCR::ABL1 fusion gene in CML are poorly understood. MicroRNAs are important regulators of gene expression, and variants in some of the components of microRNA biosynthesis pathways have been associated with genetic susceptibility to different types of cancer. Thus, the aim of this study was to evaluate the association of variants located in genes involved in the biogenesis of microRNAs with susceptibility to CML. Fifteen variants in eight genes involved in the biogenesis of miRNAs were genotyped in 296 individuals with CML and 485 healthy participants using TaqMan probes. The association of gene variants with CML and clinical variables was evaluated by a Chi-square test, and odds ratios and 95% confidence intervals were estimated by logistic regression. The frequency of the variant rs13078 in DICER1 was significantly higher among CML individuals than in healthy participants. In addition, the variants rs7813 and rs2740349 were significantly associated with worse prognosis, according to their Hasford scores, and the rs2740349 variant was also associated with a later age at diagnosis. These findings suggest that variants in components of the microRNA biogenesis pathway could be involved in CML genetic risk.


Introduction
Chronic myeloid leukemia (CML) is a myeloproliferative malignant disease characterized by the presence of an aberrantly short chromosome 22, named the Philadelphia chromosome. This altered chromosome is the product of the reciprocal translocation t(9;22)(q34;q11) and results in the fusion of the BCR gene on chromosome 22 with the ABL1 gene located on chromosome 9 [1]. The BCR::ABL1 fusion gene encodes a chimeric protein with exacerbated tyrosine kinase activity, which is capable of activating different intracellular signaling pathways, including RAS-MAPK, JAK-STAT, and SRC [2].
The characterization of the BCR::ABL1 fusion gene as the main molecular alteration underlying CML development allowed the design of the first successful targeted therapy for malignant diseases. This therapy was based on the use of imatinib, a highly specific tyrosine kinase inhibitor (TKI), which was significantly more effective than interferon, the conventional therapy previously used in the treatment of CML [3]. At present, five other TKIs have been approved by the FDA for CML therapy: nilotinib, dasatinib, and bosutinib (second generation), ponatinib (third generation), and asciminib (fourth generation) [4]. The use of TKIs as first-line therapy in CML has significantly increased 10-year survival rates from 20% to 85% [4]. Although a small group of CML patients (10-15%) show resistance or intolerance to TKIs, most CML patients exhibit life expectancies similar to those of healthy individuals [5].
In contrast to the great advances in the treatment of CML, the worldwide incidence of this disease has remained unchanged, and as a consequence of the success of TKI therapy, its prevalence has significantly increased in the last few years. In this sense, it was estimated that in 2017, about 15 million people were living with CML around the world [6]. Risk factors associated with CML, including genetic factors, are not well known, with high exposure to radiation and benzene as the only recognized factors related to CML development [7]. The identification of the genetic factors associated with increased CML risk could help in the establishment of public health strategies that allow the prevention of this disease, with a consequent reduction in its prevalence.
MicroRNAs (miRNAs) are short, non-coding RNAs, 19-22 nucleotides long, that post-transcriptionally regulate gene expression by directly binding to protein-coding transcripts [8]. MiRNAs interact with their target transcripts through base-pair complementarity between a short sequence located at the miRNA 5′ end, named the seed region, and a canonical sequence situated in the 3′ untranslated region (3′ UTR) of the target transcript. Since the seed regions in miRNAs are very short sequences, each individual miRNA can regulate different transcripts, and likewise, an individual transcript can be regulated by different miRNAs [9,10]. At present, more than 2600 mature miRNA sequences have been deposited in an international repository called miRBase [10], and it has been estimated that about 30% of coding genes might be regulated by miRNAs [11].
MiRNAs regulate several cellular processes, including proliferation, survival, and differentiation [12]. Accordingly, alterations in the expression of miRNAs are a common event in different types of cancers. For instance, the up-regulation of miR-130b-5p and miR-222-3p has been reported in breast cancer [13], whereas the down-regulation of miR-148a and miR-152 has been observed in gastric [14,15] and bladder cancers [16]. In addition, genetic studies have revealed that the aberrant expression of miR-126 plays a critical role in the leukemogenesis and progression of acute myeloid leukemia [17]. In CML in particular, miR-125a-3p and miR-320b are differentially expressed in patients [18]. In an independent study, the down-regulation of miR-342-5p expression was found to be associated with CML progression [19]. Taken together, these findings indicate the importance of miRNA regulation in different types of cancers, including leukemia.
The biogenesis of miRNAs is a highly regulated multi-step process, starting with their transcription by RNA polymerase II as long hairpin structures called primary miRNAs (pri-miRNAs). Since miRNAs can be transcribed as monocistronic (one gene) or polycistronic (multiple genes) transcripts, the sizes of pri-miRNAs are highly diverse. Later, pri-miRNAs are cleaved into precursor miRNAs (pre-miRNAs), ~65-70 nucleotides long, by a type III ribonuclease named DROSHA in complex with the DGCR8 protein. Then, pre-miRNAs are transported into the cytoplasm by the Exportin-5 (XPO5)-RAN complex and further processed into mature duplex RNA molecules of ~22 nucleotides by another type III ribonuclease called DICER1 in complex with the TRBP protein. These mature duplex miRNAs are composed of a passenger and a guide strand. The passenger miRNA strand is usually degraded, whereas the guide miRNA strand is loaded into the RNA-induced silencing complex (RISC) by the proteins AGO1-4 and GEMIN3-4 [20].
Recent studies have demonstrated the importance of gene variants in components of the miRNA biosynthesis process as risk factors for different human diseases. For instance, variants rs7813 in GEMIN4 and rs636832 in AGO1 have been associated with an increased risk of multiple sclerosis [21], whereas variant rs13078 in DICER1 has been associated with a decreased risk of type 2 diabetes mellitus [22]. In addition, the variants rs197388 in GEMIN3 and rs3744741 in GEMIN4 have been associated with an increased risk of depression [23]. In the case of malignant diseases, the variants rs7157322 and rs3742330 in the DICER1 gene have been associated with gastric adenocarcinoma and papillary thyroid cancer, respectively [24,25]. Similarly, variants rs3742330 in DICER1, rs11077 in XPO5, and rs10719 in DROSHA are associated with colorectal cancer risk [26]. Importantly, functional assays have demonstrated that variant rs10719, located in the 3′ untranslated region of DROSHA, decreases the binding of hsa-miR-27b, with a consequent elevation of DROSHA transcripts [27].
Currently, no studies have examined the role of variants in miRNA biogenesis genes in CML development. Since alterations in the expression of miRNAs are found in CML, variants in genes involved in the biogenesis of miRNAs could be associated with this disease. Thus, the aim of the present study was to analyze the association between gene variants in components of the miRNA machinery pathway and the development of CML.

Study Population
Our sample population was composed of a total of 781 unrelated subjects, comprising 296 individuals previously diagnosed with CML and 485 healthy persons. Case and control subjects were recruited from tertiary hospitals in Mexico City. The case group was composed of 170 (57%) males and 126 (43%) females, whereas the control group comprised 271 (56%) males and 214 (44%) females. The mean ages in the case and control groups were 45 ± 14.7 and 48 ± 7.4 years, respectively. CML was initially diagnosed by a hematological test and then confirmed through the detection of translocation t(9;22)(q34;q11) by conventional cytogenetics, or the detection of BCR::ABL1 transcripts by qPCR. All participants with CML were recently diagnosed, and all of them were in the chronic phase of the disease.
This study was conducted in accordance with the Declaration of Helsinki and approved by the Research and Ethics Committees of the National Institute of Genomic Medicine (Registry number 13/2019/1). All participants signed a written informed consent form.

Statistical Analysis
Deviation from the Hardy-Weinberg equilibrium was determined by a Chi-square test. To evaluate the association of variants in components of the miRNA biogenesis process with CML and clinical variables, Chi-square and Fisher's exact tests were performed.
Additionally, odds ratios (ORs) and 95% confidence intervals (CIs) were estimated using a logistic regression model. Linear regression was used to evaluate the effect of gene variants on age at CML diagnosis (β value). Both models were adjusted for age and sex. All statistical analyses were performed with the PLINK software, version 1.9 [28]. A p-value less than 0.05 was considered the threshold of statistical significance.
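As an illustration of the first step above, the following minimal Python sketch computes a Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium from genotype counts at a biallelic variant. It is not the PLINK implementation used in the study; the function name `hwe_chi_square` and the example counts are hypothetical and are not taken from the study data.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Illustrative helper, not the PLINK routine used in the study.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic site) to avoid division by zero
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts (AA, AB, BB), not study data
chi2, p_value = hwe_chi_square(380, 95, 10)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A variant whose p-value falls below the chosen threshold (0.05 in this study) would be flagged as deviating from equilibrium and excluded from the association analysis.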

Characteristics of the Study Population
The demographic and clinical characteristics of CML patients and healthy individuals are shown in Table 1. Of the 296 individuals diagnosed with CML, clinical data for Sokal and Hasford scores were obtained from 118 participants, whereas clinical data for EUTOS scores could only be obtained from 97 participants. Based on Sokal scores, 44% of patients were classified as high risk, 29% as intermediate risk, and 27% as low risk. Regarding Hasford scores, 41% of patients were classified as high risk, 30% as intermediate risk, and 29% as low risk. According to EUTOS scores, 52% were classified as high risk and 48% as low risk (Table 1).

Gene Variants in Components of miRNA Biosynthesis Pathway and CML
After genotyping the 15 studied variants in the case and control groups, the genotype frequencies of rs197414 in GEMIN3 and rs4968104 in GEMIN4 deviated significantly from the Hardy-Weinberg equilibrium (p < 0.05). Thus, both variants were excluded from further analysis. The rest of the variants showed no deviation from the Hardy-Weinberg equilibrium. In the association study, the frequency of the minor allele of rs13078 in DICER1 was significantly higher in individuals affected with CML than in healthy subjects (0.09 vs. 0.06; OR = 1.64, 95% CI [1.105-2.421], p = 0.013; Table 2). Moreover, the proportion of participants carrying the minor homozygous genotype of this variant was significantly higher among the cases than among the controls (0.034 vs. 0.002; OR = 15.06, 95% CI [2.824-373.6]; Table 3). It is worth noting that no significant difference was observed in the proportion of individuals carrying the heterozygous genotype of this variant between the cases and controls, which is consistent with a recessive effect. The allele and genotype frequencies of the rest of the studied variants showed no significant differences between the cases and controls (Tables 2 and 3).
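For readers unfamiliar with how an allelic odds ratio and its confidence interval are obtained, the sketch below computes an unadjusted OR from a 2 × 2 table of allele counts with a Woolf (log-scale) 95% CI. This is a simplification under stated assumptions: the ORs reported above come from logistic regression adjusted for age and sex, so this calculation will not reproduce them exactly. The function name `odds_ratio_ci` and the allele counts are hypothetical and are not taken from Tables 2 or 3.

```python
import math

def odds_ratio_ci(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Unadjusted allelic odds ratio with a Woolf (log-scale) confidence interval.

    Illustrative helper; the study itself used adjusted logistic regression.
    """
    cells = (case_minor, case_major, ctrl_minor, ctrl_major)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of log(OR): square root of the sum of reciprocal cell counts
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts (minor, major) in cases and controls, not study data
or_, lower, upper = odds_ratio_ci(60, 540, 60, 940)
print(f"OR = {or_:.2f}, 95% CI [{lower:.2f}-{upper:.2f}]")
```

An interval that excludes 1 corresponds to a nominally significant association at the 0.05 level. When one cell is very small, as for a rare homozygous genotype, the interval becomes very wide.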

Association of Variants in Components of miRNA Biogenesis with Prognostic Scores and Age at Diagnosis in CML
To analyze the relationship between SNVs in miRNA biogenesis pathways and CML prognosis, the association of the studied variants with the EUTOS, Sokal, and Hasford prognostic scores was evaluated. For comparison purposes, participants classified as low or intermediate risk by the Sokal and Hasford scores were grouped together, mirroring the single low-risk category of the EUTOS score. When high-risk participants were compared against this low- and intermediate-risk group for each score, the frequency of the minor alleles of the variants rs7813 and rs2740349 in GEMIN4 was higher in the high-risk group (Sokal: 0.36 vs. 0.26 and 0.25 vs. 0.17; Hasford: 0.39 vs. 0.24 and 0.27 vs. 0.14; and EUTOS: 0.35 vs. 0.26 and 0.23 vs. 0.17, respectively). However, only in the case of the Hasford score was this difference statistically significant for both variants (rs7813: OR = 1.96, p = 0.019 and rs2740349: OR = 2.23, p = 0.015, respectively). None of the other studied variants showed any association with CML prognostic scores (Table 4). The impact of variants in miRNA biosynthesis on age at diagnosis was also evaluated. In this analysis, CML individuals carrying the variant allele of rs2740349 in GEMIN4 showed a significantly later age at diagnosis than CML participants with the wild-type allele (~5 years, p = 0.003; Table 5). The variant rs7813, also located in GEMIN4, showed a tendency toward association with a later age at diagnosis (~3 years, p = 0.059; Table 5). None of the other variants showed any association with age at diagnosis.
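The β values for age at diagnosis can be read as a difference in years associated with the variant. The sketch below shows one way such a slope could be obtained, by ordinary least squares of age on minor-allele dosage (0, 1, or 2 copies). The additive coding is an assumption, the covariate adjustment used in the study is omitted, and the function name `per_allele_effect` and the data are hypothetical.

```python
def per_allele_effect(dosages, ages):
    """Least-squares slope and intercept of age at diagnosis on minor-allele dosage.

    Illustrative helper assuming additive (0/1/2) coding and no covariates.
    """
    n = len(dosages)
    if n != len(ages) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(ages) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("dosage does not vary across individuals")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, ages))
    beta = sxy / sxx                   # years per additional minor allele
    intercept = mean_y - beta * mean_x # predicted age for non-carriers
    return beta, intercept

# Hypothetical dosages and ages at diagnosis (years), not study data
dosages = [0, 0, 1, 1, 2, 2]
ages = [40.0, 44.0, 44.0, 48.0, 48.0, 52.0]
beta, intercept = per_allele_effect(dosages, ages)
print(f"beta = {beta:.1f} years per allele, intercept = {intercept:.1f} years")
```

A positive slope indicates a later age at diagnosis with each additional copy of the minor allele. A significance test and the covariate adjustment would require a full regression model such as the one PLINK fits.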

Discussion
MiRNAs are important regulators in different cellular processes, including proliferation, apoptosis, and cellular differentiation. Accordingly, alterations in the expression of miRNAs are frequently found in a variety of human illnesses, including cancer, diabetes, and cardiovascular diseases [12-16]. Importantly, recent studies have demonstrated that genetic variants in components of miRNA biogenesis pathways could be involved in the development of several types of cancer and metabolic diseases [21-25].
In this study, the association of 15 SNVs in eight genes involved in the biosynthesis process of miRNAs with CML susceptibility was evaluated. As the main observations, the frequencies of the minor allele and the minor homozygous genotype of the variant rs13078 in DICER1 were significantly higher among CML individuals than in healthy participants. Additionally, the frequency of the minor alleles of GEMIN4 variants rs7813 and rs2740349 was higher in the group of CML patients with high risk than in the intermediate- and low-risk group according to their Hasford, EUTOS, and Sokal scores; however, this difference was significant only in the case of the Hasford score. Considering that all CML participants included in this study were in the chronic phase of the disease, this finding suggests that variants rs7813 and rs2740349 could be associated with a worse prognosis. Finally, the presence of the variant allele of rs2740349 was significantly associated with a later age at diagnosis of CML in comparison to non-carriers, whereas the variant rs7813 showed only a non-significant trend in the same direction.
The DICER1 gene encodes a protein of approximately 218 kDa with type III ribonuclease activity that is involved in the regulation of several small non-coding RNA pathways [29]. Alterations in the expression of DICER1 have been found in different types of cancer such as breast carcinoma, hepatocellular carcinoma, and lung cancer [30-32]. Moreover, germline mutations in the DICER1 coding region, which affect protein activity, have been related to a rare autosomal dominant genetic disorder named DICER1 syndrome [33,34]. This syndrome predisposes to different benign and malignant tumors characterized by DICER1 loss of function [35,36]. Moreover, in papillary thyroid cancer samples, pathogenic somatic variants in the DICER1 gene are associated with a reduction in the abundance of 5p-derived miRNAs [37]. In this sense, common variants in DICER1, such as rs13078, rs105703, and rs3742330, have been associated with a variety of human illnesses, including cancer, diabetes, cardiovascular, autoimmune, and neurodegenerative diseases [22,38-41].
The rs13078 variant is located in the 3′ UTR of the gene, and it has been previously associated with decreased risk of type 2 diabetes [22] and increased risk of gestational hypertension, idiopathic male infertility, hepatocellular carcinoma, and larynx cancer [42-45]. A previous report demonstrated the direct binding of miR-892a to the 3′ UTR of DICER1 transcripts in a human laryngeal-cancer-derived cell line, with a consequent reduction in DICER1 protein levels [46]. In this sense, variant rs13078 could modify the interaction of different miRNAs or RNA-binding proteins with DICER1 transcripts, affecting the abundance of the protein and potentially affecting the global expression of miRNAs.
GEMIN4 is a protein of approximately 120 kDa, which is part of a multiprotein complex involved not only in miRNA biogenesis but also in the assembly and organization of the small nuclear ribonucleoproteins needed for mRNA splicing [47]. Recent studies have reported the association of germline pathogenic variants in GEMIN4 with an inherited neurodevelopmental disorder characterized by microcephaly, cataracts, and renal abnormalities [48,49]. An independent study also demonstrated the association of germline variants with pediatric cataract syndrome [50]. Moreover, common variants in GEMIN4, such as rs7813, rs3744741, rs910924, and rs595961, have been associated with a variety of human diseases, including multiple sclerosis, depression, and cancer in different organs such as the stomach, lungs, and kidneys, among others [21,23,51-53].
The variants rs7813 and rs2740349 are missense variants (Arg1033Cys and Asp929Tyr, respectively). Although the variant rs2740349 has not previously been associated with any illness, previous studies have reported the association of rs7813 with several diseases, including multiple sclerosis, tuberculosis, gastric cancer, and lung cancer, among others [21,51-55]. Similar to our findings, the rs7813 variant was reported to be associated with the prognosis of esophageal squamous cell carcinoma [56]. This is the first study showing the association of variants in components of miRNA biogenesis pathways with CML and with prognostic risk in this disease. An important limitation of this study is the small number of CML individuals with available data for prognostic scores. However, our findings are in line with previous studies indicating the relevance of gene variants in miRNA biogenesis pathways in different types of malignant diseases. Further studies evaluating the impact of variants in miRNA biogenesis pathways on the abundance of miRNAs associated with CML are warranted.

Table 1. Demographic and clinical characteristics of CML patients and healthy individuals.

Table 2. Allelic associations of SNVs in miRNA machinery genes with CML.

Table 3. Genotypic association of SNVs in miRNA machinery genes with CML.

Table 4. Variants in components of miRNA biosynthesis process and CML prognosis.

Table 5. Linear regression estimating age at diagnosis in years.