Spectrum of CFTR mutations in Chechen cystic fibrosis patients: high frequency of c.1545_1546delTA (p.Tyr515X; 1677delTA) and c.274G>A (p.Glu92Lys, E92K) mutations in North Caucasus

Background Cystic fibrosis (CF; OMIM #219700) is a common autosomal recessive disease caused by pathogenic variants (henceforward mutations) in the cystic fibrosis transmembrane conductance regulator gene (CFTR). The spectrum and frequencies of CFTR mutations vary among different populations. Characterization of the specific distribution of CFTR mutations can be used to optimize genetic counseling, foster reproductive choices, and facilitate the introduction of mutation-specific therapies. Chechens are a distinct Caucasian ethnic group of the Nakh peoples that originated from the North Caucasus. Chechens are one of the oldest ethnic groups in the Caucasus, the sixth largest ethnic group in the Russian Federation (RF), and constitute the majority population of the Chechen Republic (Chechnya). The spectrum of CFTR mutations in a representative cohort of Chechen CF patients and healthy individuals was analyzed. Methods Molecular genetic analysis of 34 CFTR mutations (representing approx. 80–85% of mutations in multiethnic CF populations of the RF) was performed in 32 CF patients from 31 unrelated Chechen families living in Chechnya. One hundred randomly chosen healthy Chechens were analyzed for the 15 most common “Russian” mutations. The clinical symptoms in Chechen CF patients with different CFTR genotypes were investigated. Results High frequencies of c.1545_1546delTA (p.Tyr515X; 1677delTA) (52 out of 64 CFTR alleles tested; 81.3%) and c.274G > A (p.Glu92Lys, E92K) (8/64, 12.5%) mutations were found. Twenty patients were homozygous for the c.1545_1546delTA mutation, and eight were compound heterozygous for the c.1545_1546delTA and c.274G > A mutations. Three carriers of the c.1545_1546delTA mutation were also found in the cohort of 100 apparently healthy Chechens (frequency – 0.015). The c.1545_1546delTA and c.274G > A mutations are linked to the same haplotype (22–7–16–13) of intragenic Short Tandem Repeat markers, i.e., IVS1CA, IVS6aGATT, IVS8CA, and IVS17bCA. Conclusions The distribution of CFTR mutations in the Chechen CF population is unique regarding the high frequency of mutations c.1545_1546delTA and c.274G > A (more than 90% of the mutant alleles). The c.274G > A mutation is associated with a less severe course of CF than that observed in c.1545_1546delTA homozygotes. Testing for these two variants can be proposed as the first step of CF DNA diagnosis in the Chechen population.


(Continued from previous page)
Conclusions: The distribution of CFTR mutations in the Chechen CF population is unique regarding the high frequency of mutations c.1545_1546delTA and c.274G > A (more than 90% of the mutant alleles). The c.274G > A mutation is associated with a less severe course of CF than that observed in c.1545_1546delTA homozygotes. Testing for these two variants can be proposed as the first step of CF DNA diagnosis in the Chechen population.
Keywords: Cystic fibrosis, CFTR mutations, Chechens, Russian Federation, Caucasus, c.1545_1546delTA (p.Tyr515X; 1677delTA) Background Cystic fibrosis (CF; OMIM # 219700) is a common autosomal recessive disease caused by pathogenic variants (henceforward mutations) in the CFTR gene (OMIM #602421). To date, over 2000 pathogenic, likely pathogenic, or benign CFTR variants have been identified [1]. The distribution of CFTR mutations varies widely in different populations [2]. Therefore, identification of the spectrum of the most common CFTR mutations in a given population can be used to optimize genetic counseling, foster reproductive choices, and facilitate implementation of mutation-specific therapies.
In this study, we have focused on the Chechen population that lives within the Chechen Republic of the Russian Federation (RF), located in the North Caucasus (see Fig. 1). Chechens are an ancient Nakh language speaking group that together with a related Ingush population are jointly termed "Vainakhs" [3,4]. Both populations predominantly live in the Chechen and Ingush Republics of the RF (Fig. 1). Small portions of this distinct population also inhabit several neighboring districts of Dagestan and Georgia. Regarding their population size, the Chechens rank fourth in the Caucasus (after Azerbaijanis, Georgians, and Armenians) and sixth within the entire multiethnic RF (after Russians, Tatars, Ukrainians, Bashkirs, and Chuvash). The total number of Chechens around the world ranges between 1.5-2 million, with the majority of them (i.e. 1,344,122 according All-Russia Population Census) living in Chechnya proper [5].
Available historical and linguistic studies describe a single ancestral population that had been living on the northern slopes of the Great Caucasian Range (i.e., between the Caspian and Black sea regions) already several thousand years ago [6]. This presumed historic population had been associated with a Western Asian culture, distinct from the East Caucasus populations. Recent population genetic studies utilizing Y-chromosome haplotypes have demonstrated a robust genetic delineation of the Nakh language speaking people of Chechnya, Dagestan, and Ingushetia from other populations of the North Caucasus [7]. In this regard, Vainakhs are considered one of the oldest autochthonous Northern Caucasus ethnic groups [8]. The struggle for an independent Chechnya on the basis of the Nakh cultural-linguistic uniqueness ended in the 17 th and 18 th centuries CE (Common Era) when the Vainakh peoples became citizens of the former Russian Empire. In 1810, the Ingushs accepted Russian citizenship, followed by Chechens in 1859 [9].
In this study, we analyzed the spectrum of CFTR mutations in a representative cohort of Chechen CF patients and healthy individuals. According to the best of our knowledge, this is the first study of this population, which also has relevance for the strong Chechen diaspora in the RF and beyond.

Methods
Our study included a representative cohort of 32 Chechen CF patients from 31 unrelated families. Except We used a representative group of 100 unrelated apparently healthy Chechens as controls. The majority of them were also drawn from Grozny (67/100 of the entire control group), with the remaining volunteers originating from other regions of Chechnya. Ethnicity up to the third generation had been validated through a structured questionnaire filled under supervision. Healthy individuals, CF patients, or their legal representatives provided written informed consent for the study. This research project received approval from the Ethics Committee of the "Research Centre for Medical Genetics" (Moscow).
The "Wizard Genomic DNA Purification Kit" (Promega, USA) was used for DNA extraction from whole blood samples where EDTA was used as an anticoagulant. Initially, we examined the 34 most common CFTR mutations utilized for diagnosis of CF within the multiethnic RF that account for over 85% of all CF-causing mutations [10]. In-house molecular genetic methods previously described [11], including amplified fragment length (AFLP) and restriction fragment length (RFLP) polymorphism techniques were utilized to detect insertion/deletion variants and nucleotide substitutions, respectively. The panel of tested CF-causing mutations currently includes mutations: c.54-5940_273+10250del21kb (p.Ser18ArgfsX16; , and c.3909C > G (p.Asn1303Lys; N1303K)) [10].
Subsequently, for a case in which one CFTR mutation remained unidentified we carried out direct Sanger DNA sequencing of the entire CFTR coding region, including adjacent splice sites and the 3′-untranslated CFTR region [10]. Positive cases were confirmed in parents to establish their linkage phase. Random controls were analyzed for the 15 most common "Russian" mutations: c.54-5940_273+10250del21kb Four intragenic short tandem repeats (STR) (IVS1CA, IVS6aGATT, IVS8CA, and IVS17bCA) were examined as previously described [12]. STR haplotypes were established by segregation analysis of given CFTR alleles within CF families. STR haplotype frequencies in healthy samples were calculated using ARLEQUIN version 3.5 software [13].
Statistical analysis was performed using the STATIS-TICA 8.0 program. To compare observed categorical variables, the Fisher test was used, while for quantitative tests the Mann-Whitney test was utilized. Results were considered as significant when p ≤ 0.05.

Results
In the Chechen Republic, 33 Chechen CF patients are officially registered. The prevalence of CF was 2.455 per 100,000 Chechens in 2017. Nearly all of the known CF patients residing in Chechnya (32/33) were analyzed.
In the control group of 100 randomly chosen Chechen individuals, 3 carriers of the c.1545_1546delTA mutation were detected (1.5%), while none of the 14 remaining common "Russian" CFTR mutations were detected.
To compare the clinical course of CF in the studied cohort, the patients were divided into the two most prevalent groups: 17 homozygous for c.1545_1546delTA (Group 1), and 8 compound heterozygotes for c.1545_1546delTA and c.274G > A variants (Group 2). We did not find significant differences in the patient's age at last clinical examination, age at diagnosis, sweat Cl − concentrations, or BMI values. None of the most common before-mentioned complications (meconium ileus, liver cirrhosis, diabetes, polyposis) were revealed in either group. Significant differences were observed only in terms of pancreatic insufficiency in that all patients from Group 1 had fecal elastase 1 concentrations below 50 μg/g, while all patients from Group 2 had concentrations over 200 μg/g (p < 0.0001), this indicating lower degree of pancreatic exocrine dysfunction associated with the presence of c.274G > A. Similarly, the proportion of patients with chronic P. aeruginosa lung colonization was significantly higher in Group 1 compared to Group 2 (69.0% vs. 14.0%, respectively; p = 0.024). The differences in the other studied microorganisms were not significant. Overall, the presence of the c.274G > A mutation is associated with less severe course of the disease than in the c.1545_1546delTA homozygotes (Table 2).

Discussion
To the best of our knowledge, this is the first study on the distribution of CFTR mutations in the Chechen population (Fig.1). We have provided evidence that the c.1545_1546delTA and c.274G > A mutations (as validated by the www.cftr2.org database) account for the majority of CFTR mutations in this population. These CF alleles very likely have a common origin, since they are residing on a  common population-specific intra-CFTR STR haplotype that probably increased in frequency due to genetic drift. The c.1545_1546delTA mutation was previously found to be common in populations neighboring or with historic links to the greater Black Sea region [2] (e.g., Bulgaria, Romania, Greece, Cyprus, and Turkey, including Northern Iran and Georgia [14]). It was also identified in other ethnic groups from the Northern Caucasus region (i.e., Ingush, Armenians, Ossetians, Dagestanis, etc.), albeit only in a small number of CF patients examined [10]. The fact that the c.1545_1546delTA mutation is found in autochthonous ethnic groups of the Caucasus region (Chechens, Ingush, Georgians, Armenians) and is linked to a single haplotype may point to a single source of origin (or penetration) of this mutation into the Caucasus region, and differences in frequencies seen in various populations may be due to gene drift.
Interestingly, the c.274G > A mutation was found at the highest frequency in Chechen CF patients (12.5%). This CF allele was previously found in Turkey [2], but also in Chuvash CF patients living in central parts of the RF (55.6%) [15]. According to the Russian Cystic Fibrosis Patient Registry (RCFPR), the prevalence of this allele in patients in the Volga-Ural region is 6.88% (Chuvash Republic -53.19%, Udmurt Republic -6.76%, Tatarstan -2.38%, Bashkortostan -1.37%, Samara region -3.06%, Perm -0.75%, and Orenburg regions -1.96%), including the Khanty-Mansi autonomous region (Yugra) -3.85%, as well as sporadically in many other RF regions [16]. Given its population distribution, the c.274G > A mutation could be associated with the historical resettlements of Turkic-speaking peoples in regions of RF metioned above, including their migrations to the Caucasus and greater Black Sea geographical area.
Although the c.3846G > A mutation was found at a very high frequency in the neighboring Karachay-Cherkessia ( Fig.1) [11], it was only sporadically observed in Chechens, thereby substantiating the strong genetic delineation of Vainakh populations from other ethnic groups residing in the Northern Caucasus. The penetration of the c.3846G > A mutation into the territory of the Northeast Caucasian region could be related to the migration of Jews from Byzantium through the northern Black Sea region or Georgia in the early Middle Ages or from Persia (Iran) in the late Middle Ages [11].
The c.287C > A mutation was previously described in Turkish CF patients, where its frequency is at 2.6% [2]. In the RF this mutation was also found in 2 patients from Dagestan, which is bordering Chechnya (Fig.1).
Finally, the c.1000C > T mutation is in CF populations from the greater Mediterranean region, such as those from southern France (1.2%), Greece (1.1%), Portugal (0.7%), and Spain (1.2%) [2], while in the RF it was sporadically observed in various ethnic groups at the frequency of 0.8% [16].
The comparison of key clinical parameters in the two groups of Chechen patients with different genotypes demonstrated that the allele c.274G > A is associated with higher residual pancreatic function and lower chronic lung colonization with pathognomonic microorganisms, in accordance with the data of the CFTR2 database [1].

Conclusions
The distribution of CFTR mutations in the Chechen CF population is unique in terms of the high frequency of mutations c.1545_1546delTA (p.Tyr515X; 1677delTA) and c.274G > A (p.Glu92Lys, E92K), which account for more than 90% of the mutant alleles in the studied ethnic group. Testing of the two CF-causing mutations is thus recommended for Chechen CF patients, since it allows identification of one or both mutant CFTR alleles in more than 99% of patients suspected of being affected by CF. Furthermore, we have confirmed the genetic delineation of the Chechen population from other ethnic groups of the Northern Caucasus (e.g. by low prevalence of the c.3846G > A mutation, which is dominant in adjacent Karachay-Cherkessia; Fig.1), as well as the role of historic migrations of Turkic-speaking peoples from Central Asia to the Northern Caucasus with the c.274G > A mutation very likely being their "marker" [17]. Analysis of genotype-phenotype correlations in two groups of Chechen CF patients (i.e. c.1545_1546delTA homozygotes versus c.1545_1546delTA/c.274G > A compound heterozygotes) demonstrated that the presence of the c.274G > A mutation is associated with generally less severe course of the disease. Our data will improve genetic counselling and provide a basis for the introduction of mutation-specific therapies in the future.

Acknowledgments
We are grateful to all the Chechen CF families and healthy Chechen volunteers who participated in this study. We thank Dr. Richard H. Lozier for interest in our work and useful assistance.

Funding
This work was supported by the Russian Science Foundation (RSF) grant 17-15-01051 (expeditions, DNA research, writing the manuscript) to RAZ and Czech Ministry of Health grant IP00064203/6003 (general management of the project, reviewing the manuscript, discussion of findings) to MM Jr.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.