Analysis of the TSER and G>C variants in the TYMS gene: a high frequency of low expression genotypes predicted in the Mexican population

Abstract
Background: In the TYMS gene promoter, there is a repeat polymorphism (TSER) that affects the expression level of the thymidylate synthetase (TS) enzyme, which is involved in the response to some anticancer drugs. The G>C transversion located in the TSER*3R allele decreases the expression level of the TS enzyme by abolishing the upstream stimulatory factor (USF-1) binding site. Despite the biomedical impact of the G>C SNP, only TSER has been reported in most worldwide populations. Thus, we studied both the TSER and G>C SNP variants in the Mexican population.
Subjects and methods: A population sample (n = 156) was genotyped for the TSER and G>C variants by PCR and PCR-RFLPs, respectively, followed by PAGE and silver staining.
Results: For TSER, the most frequent allele was 2R (52.56%) and the most frequent genotype was 2R/3R (42.3%). Comparison with Latin American, European, and American (USA) populations suggests a heterogeneous worldwide distribution (FST-value = 0.01564; p-value = 0.0000). When the G>C variant was included (2RG, 3RG, and 3RC), a high frequency of low expression genotypes was observed: 2RG/2RG, 2RG/3RC, and 3RC/3RC (84.6%).
Conclusion: The high frequency of genotypes associated with low TS enzyme expression justifies obtaining the TYMS gene variant profile in Mexican patients who are candidates for pharmaceutical treatments such as 5-fluorouracil, methotrexate, and pemetrexed.


Introduction
The thymidylate synthetase (TYMS) gene, located on the short arm of chromosome 18, comprises seven exons encoding a 313-amino-acid protein known as the TS enzyme (Kawakami and Watanabe 2003). The TYMS gene contains one of the 68 very important polymorphisms (https://www.pharmgkb.org/), with large pharmacogenetic potential in cancer, because alterations in TYMS expression allow the prediction of response to specific treatments, mainly with 5-fluorouracil (5-FU) (Miteva-Marcheva et al. 2020).
Several polymorphisms have been described in the TYMS gene, but only three have shown clinical relevance, mainly due to their function as modulators of TYMS gene expression. The first can be detected in the 5'-UTR region of the TYMS mRNA as a variable number of tandem repeats (VNTR) (rs45445694), which involves a 28 base pair repeat in the promoter region known as TSER that ranges from two (TSER*2) to nine (TSER*9) copies (Kawakami and Watanabe 2003). The most common alleles of this polymorphism are TSER*2 (2R) and TSER*3 (3R). The presence of the 3R allele is associated with increased mRNA expression and protein production. Some studies have associated this polymorphism with the toxicity and effectiveness of drugs such as 5-FU, methotrexate, and pemetrexed (Marcuello et al. 2004; Miteva-Marcheva et al. 2020).
The most common alleles for these TYMS gene variants are related to increased levels of the TS enzyme, which produces resistance to drugs such as 5-FU and oral pro-drugs (capecitabine and methotrexate, among others). Different worldwide studies have demonstrated the clinical impact of these TYMS gene polymorphisms (Kawakami and Watanabe 2003; Graziano et al. 2004; Marcuello et al. 2004; Marsh 2005; Acuña et al. 2006; Bolufer et al. 2007; Hammad et al. 2012; Miteva-Marcheva et al. 2020). However, TSER frequencies have been reported in only one study of Mexican patients and control volunteers (Quintero-Ramos et al. 2014), whereas the haplotype distribution based on the TSER-SNP variants has been poorly analysed in worldwide populations.
In this study, we analysed two TYMS gene variants in one Mexican population sample and compared our results with available worldwide populations. Moreover, based on pharmacological treatment results previously associated with the TYMS genotypes, we assessed the potential risk of Mexican patients exposed to the implied drugs.

Population sample
We included healthy volunteers from two Mexican geographic regions: 103 individuals from the western state of Jalisco, described in a previous report (Favela-Mendoza et al. 2018), and 53 individuals from the northern state of Chihuahua, self-defined as healthy in a personal interview. They are reported as one Mexican population sample (n = 156) because they present similar admixture components (Rubi-Castellanos et al. 2009) and similar genetic frequencies in this study (p > 0.05). We estimated this sample size using the formula for prevalence studies with a confidence level of 95%, an expected prevalence of 60%, and 6% precision. Ethical approval was obtained beforehand from the Committee of Ethics and Research of the Centro Universitario de la Ciénega of the Universidad de Guadalajara (No. 129693). The anonymity of the recruited individuals was always preserved.
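For readers unfamiliar with the prevalence formula mentioned above, a minimal sketch of its most common form for a large population, n = Z² p(1 − p) / d², is given below. The function name is hypothetical, and the paper does not state which variant of the formula was applied (for example, whether a finite-population correction was used), so this sketch should not be expected to reproduce the reported n = 156.

```python
import math


def prevalence_sample_size(expected_prevalence, precision, z=1.96):
    """Sample size for estimating a proportion in a large population.

    Assumption: the uncorrected formula n = z^2 * p * (1 - p) / d^2,
    with z = 1.96 for a 95% confidence level. The authors may have
    used a different variant.
    """
    p = expected_prevalence
    return math.ceil(z ** 2 * p * (1 - p) / precision ** 2)


# Inputs quoted in the text: expected prevalence 60%, precision 6%.
n_required = prevalence_sample_size(0.60, 0.06)
print(n_required)
```

The result is sensitive to the chosen precision and to any population-size correction, which is why the exact variant matters when comparing against a reported sample size.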

Genotype data collection
After each volunteer signed the informed consent letter, a blood sample was taken by venipuncture from the arm. Genomic DNA was extracted from peripheral blood samples by the standard phenol-chloroform method. DNA was quantified with a NanoDrop 2000™ instrument (Thermo Scientific, USA) and diluted to 25-30 ng/µL for working samples. The TSER 2R/3R polymorphisms were amplified by PCR using the primers TSER-F (GTG GCT CCT GCG TTT CCC CC) and TSER-R (GCT CCG AGC CGG CCA CAG GCA TGG CGC GG). The PCR was carried out in a total volume of 20 µL including 4 µL of template DNA. The reaction included 10 µL of the Multiplex PCR Master Mix (QIAGEN), 2 µL of oligonucleotides at 4 µM, 0.75 µL of formamide, and 3.25 µL of HPLC water. The amplification conditions were: initial denaturation at 95 °C for 15 min; 37 cycles of denaturation at 94 °C for 30 s, annealing at 57 °C for 90 s, and extension at 72 °C for 90 s; and a final extension at 72 °C for 10 min.
Using the amplified products described above for TSER genotyping, the G>C variation was established by restriction fragment length polymorphisms (RFLPs). We took 9.75 µL of the TSER 2R/3R amplified product, which was mixed with 0.25 µL of the restriction enzyme Hae III and incubated for 1 h at 37 °C. The enzyme was inactivated by thermoblock incubation at 80 °C for 30 min. In both cases, the products were subjected to polyacrylamide gel electrophoresis (6% PAGE; 29:1) at 250 V for 2.5 h followed by silver staining. The TSER genotypes based on the 2R and 3R alleles were established by the presence of amplified fragments of 200 and 230 bp, respectively (Figure 1(a)). Similarly, RFLP products of 66 and 44/47 bp established the presence of the G allele, whereas the 94 bp product defined the C allele (Figure 1(b)), as previously described (Kawakami and Watanabe 2003; Marcuello et al. 2004).
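The TSER calling rule described above (a 200 bp band for 2R, a 230 bp band for 3R) can be sketched as a small lookup. This is only an illustration: the function and dictionary names are hypothetical, band sizes are assumed to have already been assigned to their nominal values, and the G>C RFLP patterns are deliberately left out because their interpretation depends on the TSER background.

```python
# Fragment sizes (bp) reported in the text for the two common TSER alleles.
TSER_ALLELE_BY_FRAGMENT = {200: "2R", 230: "3R"}


def call_tser_genotype(observed_fragments):
    """Return a TSER genotype such as '2R/3R' from the PCR band sizes on the gel.

    Assumption: only the 2R and 3R alleles are considered; any other band
    size is rejected rather than interpreted.
    """
    alleles = set()
    for size in observed_fragments:
        if size not in TSER_ALLELE_BY_FRAGMENT:
            raise ValueError(f"unexpected fragment size: {size} bp")
        alleles.add(TSER_ALLELE_BY_FRAGMENT[size])
    if not alleles:
        raise ValueError("no bands observed")
    ordered = sorted(alleles)
    if len(ordered) == 1:
        ordered = ordered * 2  # a single band is read as a homozygote
    return "/".join(ordered)


for bands in ([200], [200, 230], [230]):
    print(bands, call_tser_genotype(bands))
```

In practice, bands are sized against a ladder with some tolerance, so a real workflow would need a matching step before this lookup.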

Data management and statistical analysis
Descriptive statistics were obtained by estimating the allele and genotype frequencies with the gene counting method. Fisher exact tests were performed to confirm that the genotype distribution by locus agreed with Hardy-Weinberg equilibrium (HWE), and to evaluate the linkage disequilibrium (LD) between pairs of loci. Significance levels of exact tests were empirically determined in 5000 simulations. Similarly, pairwise FST distances and FST p-values were calculated to establish differences regarding previous worldwide studies of the TSER polymorphism. For these purposes, we used the software Arlequin 3.5 (Excoffier et al. 2005).
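The gene counting method mentioned above can be illustrated with a short sketch. The genotype counts used here are an assumption: they were back-calculated from the percentages reported in this paper for n = 156 (31.4% 2R/2R and 42.3% 2R/3R, with the remainder as 3R/3R) and are not copied from Table 1. The function names are hypothetical, and the HWE helper only gives expected counts; it does not replace the exact tests run in Arlequin.

```python
def allele_frequencies(genotype_counts):
    """Gene counting: every individual contributes two alleles."""
    allele_counts = {}
    for (allele_1, allele_2), n in genotype_counts.items():
        allele_counts[allele_1] = allele_counts.get(allele_1, 0) + n
        allele_counts[allele_2] = allele_counts.get(allele_2, 0) + n
    total = sum(allele_counts.values())
    return {allele: count / total for allele, count in allele_counts.items()}


def hwe_expected_counts(p, n_individuals):
    """Expected genotype counts for a biallelic locus under HWE."""
    q = 1 - p
    return {
        "hom_p": p * p * n_individuals,
        "het": 2 * p * q * n_individuals,
        "hom_q": q * q * n_individuals,
    }


# Assumed counts, back-calculated from the reported percentages (n = 156).
tser_counts = {("2R", "2R"): 49, ("2R", "3R"): 66, ("3R", "3R"): 41}

freqs = allele_frequencies(tser_counts)
print({allele: round(f, 4) for allele, f in freqs.items()})
print(hwe_expected_counts(freqs["2R"], sum(tser_counts.values())))
```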

Allele, genotype, and haplotype frequencies
The allele and genotype frequencies were estimated in the Mexican population sample for TSER and its internal G>C SNP in the TYMS gene. Results were compared to previously reported Mexican and worldwide populations (Table 1). Comparison with Latin American, European, and American (USA) populations suggests a heterogeneous worldwide distribution (FST-value = 0.01564; p-value = 0.0000) (Table 2). The modal allele in the TSER polymorphism was 2R (52.6%), whereas the heterozygous 2R/3R was the most frequent genotype (42.3%). Similarly, the 2R allele was the most frequent in the previously studied Mexican cancer control (MxCaCo = 58.6%), Florida (FlorUSA = 54.7%), and Italian control (ItaC = 48.9%) population samples (Table 1). However, in most of the worldwide populations, the modal allele was 3R. On the other hand, for the G>C SNP, the wild-type allele G (63.5%) and its homozygous genotype G/G (44.87%) were the most common. In our Mexican population sample, and in most of the cited worldwide populations, the genotype distribution agreed with HWE expectations (p > 0.05) (Table 1).
When the G>C SNP was included in the TSER polymorphism, three haplotypes were observed: 2RG, 3RG, and 3RC. As expected, the absence of 2RC is explained by the association (LD) between the 3R and C alleles, which is in agreement with previous reports (D' = 0.71; p < 0.001) (Dotor et al. 2006) and was confirmed herein by exact tests (p-value = 0.0000). The 2RG haplotype was the most frequent in the studied Mexican population sample (52.56%), followed by 3RC (36.54%) and 3RG (10.9%). Similarly, 2RG was the most frequent in Spain (39.9%) (Marcuello et al. 2004) and Portugal (44.2%) (Lima et al. 2014), followed by the haplotypes 3RC (range: 34.27 to 35%) and 3RG (range: 20.77 to 25.8%). In addition, we were able to estimate the frequency of six "genotypes" based on these three haplotypes and to compare results indicating the TYMS gene expression level (Marcuello et al. 2004; Lima et al. 2014) (Figure 2). The genotype distribution patterns in the Spanish and Portuguese populations were similar to each other but different from the studied Mexican population sample (p < 0.0001). However, some similarities were observed, such as the same modal genotype, the heterozygote 2RG/3RC, followed by the homozygous 2RG/2RG, among others.
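The six "genotypes" arising from the three observed haplotypes can be enumerated directly. The sketch below does so and flags the three combinations that this paper lists as low expression (2RG/2RG, 2RG/3RC, and 3RC/3RC); the variable names are hypothetical, and the remaining genotypes are labelled only as "not listed as low" because their expression class is not spelled out in the text.

```python
from itertools import combinations_with_replacement

# Haplotypes observed in the studied sample (2RC was absent).
HAPLOTYPES = ("2RG", "3RC", "3RG")

# Genotypes reported in this paper as associated with low TS expression.
LOW_EXPRESSION = {("2RG", "2RG"), ("2RG", "3RC"), ("3RC", "3RC")}

# Unordered pairs of haplotypes, homozygotes included: 3 * 4 / 2 = 6.
genotypes = list(combinations_with_replacement(HAPLOTYPES, 2))

for genotype in genotypes:
    label = "low" if genotype in LOW_EXPRESSION else "not listed as low"
    print("/".join(genotype), "->", label)
```

Note that the three low-expression genotypes are exactly those carrying no 3RG haplotype.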

Population pairwise comparisons
Based on the TSER polymorphism, FST genetic distances and FST p-values from pairwise comparisons were estimated among all the worldwide populations listed in Table 1, and the results are presented in Table 2. We detected two main worldwide population clusters that include different Mexican samples, which is apparent in the MDS plot showing their genetic relationships (Figure 3). The first population cluster included our sample and another Mexican control sample, plus one Italian and one American (Florida, USA) population sample. The second population cluster included Mexican breast cancer, pre-menopausal, and post-menopausal patients, plus Caucasian Americans and most of the Spanish population samples (except AML patients).
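To give an intuition for what a pairwise FST distance measures, the following sketch computes a simplified Wright-style FST, (HT − HS) / HT, for a single biallelic locus and two equally weighted samples. This is a swapped-in simplification, not the estimator implemented in Arlequin 3.5: it ignores sample sizes, treats TSER as strictly biallelic (2R versus 3R), and provides no p-value. The function name is hypothetical; the two input frequencies are the 2R frequencies quoted in this paper for the present sample and for MxCaCo.

```python
def pairwise_fst_biallelic(p1, p2):
    """Simplified FST = (HT - HS) / HT for one biallelic locus.

    p1, p2: frequency of the same allele in two samples, weighted equally.
    """
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)  # heterozygosity of the pooled sample
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-sample value
    if h_total == 0:
        return 0.0  # both samples fixed for the same allele
    return (h_total - h_within) / h_total


# 2R frequencies quoted in the text: this study (52.56%) and MxCaCo (58.6%).
print(pairwise_fst_biallelic(0.5256, 0.586))
```

Values near 0 indicate similar allele frequencies, while a value of 1 would mean the two samples are fixed for different alleles.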

Discussion
We report the distribution of the TSER and G>C variants located in the TYMS gene in a healthy Mexican population sample, including individuals from the previously unanalysed northern region of the country. We compared the estimated frequencies with available European, Latin American, and American (USA) populations, mainly based on the TSER analysed in patients with different types of cancer (Graziano et al. 2004; Marcuello et al. 2004; Marsh 2005; Acuña et al. 2006; Bolufer et al. 2007; Hammad et al. 2012; Lima et al. 2014; Quintero-Ramos et al. 2014). Although we did not include cancer patients, we tried to assess the expected prevalence of treatment complications based on these TYMS variants in the general Mexican population.
The frequency of the TSER 3R variant associated with high production of the TS enzyme (Marsh 2005) was lower than in a previous report including control, breast cancer, premenopausal, and postmenopausal patients (Quintero-Ramos et al. 2014). The high frequencies of the 2R allele (52.6%) and of the low expression homozygous 2R/2R genotype (31.4%) are important from the clinical point of view in the Mexican population, particularly because this genotype has shown a greater chance of responding to methotrexate treatment (Marsh 2005), and longer overall and progression-free survival with pemetrexed administration (Li et al. 2013).
The allele distribution of the TSER polymorphism in the studied Mexican population was similar to the majority of the compared worldwide populations after the Bonferroni correction (p > 0.003125) (Table 2). Although the observed differences with Mexican breast cancer and postmenopausal patients are probably explained by the presence of illness (Quintero-Ramos et al. 2014), variations in the proportion of European and Native American admixture could be involved, as previously described between and within Mexican populations (Lisker et al. 2004; Martínez-Marignac et al. 2007; Rubi-Castellanos et al. 2009).
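The Bonferroni threshold quoted above is consistent with dividing a significance level of 0.05 by 16, which matches the number of populations in Table 2. That divisor is an inference from the reported threshold rather than something stated explicitly, so the sketch below labels it as an assumption; the function name is hypothetical.

```python
ALPHA = 0.05
N_COMPARISONS = 16  # assumption: inferred from the threshold 0.003125 = 0.05 / 16


def significant_after_bonferroni(p_value, alpha=ALPHA, n_comparisons=N_COMPARISONS):
    """True when a pairwise p-value stays below the corrected threshold."""
    return p_value < alpha / n_comparisons


print(ALPHA / N_COMPARISONS)               # 0.003125
print(significant_after_bonferroni(0.01))  # False: below 0.05 but not below the corrected level
```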
In the present study, we included the TSER G>C SNP, which provides greater certainty in predicting the success of some pharmacological treatments (Kawakami and Watanabe 2003). This inclusion generates TSER haplotypes and highlights the elevated frequency of the 2RG haplotype in the studied Mexican population (52.56%), which is relevant because this haplotype has been associated with a lower response to 5-FU (Marcuello et al. 2004; Lima et al. 2014). Moreover, we estimated a high frequency of low TYMS gene expression genotypes based on these TSER haplotypes (84.62%) (Figure 2), which, theoretically, implies a high population probability of toxicity and ineffective treatment with some important drugs (Kawakami and Watanabe 2003; Lima et al. 2014). In brief, these findings support the genotyping of the studied TYMS variants in the Mexican population, specifically before exposure to 5-FU, methotrexate, and pemetrexed.
We note that studies on the influence of TYMS gene polymorphisms on treatment response, survival, and disease progression remain contradictory. Possible explanations for these opposing findings could be: 1) the type of tissue or samples, such as those from patients with a tumour, which have shown greater heterozygosity than those derived from control people (Jakobsen et al. 2005); 2) the drug-administration route used in each study, because intravenous bolus administration shows a greater effect on mRNA synthesis, whereas continuous infusion administered from 3 to 15 min shows an effect on the TS (Jakobsen et al. 2005); 3) the effect of the TYMS gene variant, associated either with the mRNA expression (Kawakami et al. 2001; Ishida et al. 2002) or with the synthesised protein (Edler et al. 2002; Kornmann et al. 2003); and 4) the presence of additional polymorphisms, such as the TYMS 3'UTR or 1494del6, that modulate TYMS gene expression and are associated with poor response to treatment (Danenberg 2004; Dhawan and Padh 2017).

Conclusions
We described a high frequency of genotypes associated with low TYMS gene expression in the Mexican population (84.62%), specifically based on the following TSER and G>C genotypes: 2RG/2RG, 2RG/3RC, and 3RC/3RC. This result predicts a problematic response to 5-FU, methotrexate, and pemetrexed treatment in a large proportion of Mexican cancer patients. Consequently, this study justifies routine pharmacogenetic profiling of these TYMS gene polymorphisms in Mexican patients who are candidates for these pharmacological treatments.

Figure 2. Genotype frequency and expression level based on haplotypes from TSER and G>C TYMS gene polymorphisms in the studied Mexican sample and two European populations.

Table 1. Allele and genotype frequencies for the TSER 2R/3R variants in worldwide populations used for comparison purposes.
a Hardy-Weinberg equilibrium

Table 2. Pairwise comparison between 16 worldwide populations (a) for the TSER variant by means of FST p-values (above diagonal) (b) and FST genetic distances (below diagonal).
a Please check population sample abbreviations and references in Table 1.