Variants in genes related to development of the urinary system are associated with Mayer–Rokitansky–Küster–Hauser syndrome

Mayer–Rokitansky–Küster–Hauser (MRKH) syndrome, also known as Müllerian agenesis, is characterized by uterovaginal aplasia in an otherwise phenotypically normal female with a normal 46,XX karyotype. Previous studies have associated sequence variants of PAX8, TBX6, GEN1, WNT4, WNT9B, BMP4, BMP7, HOXA10, EMX2, LHX1, GREB1L, LAMC1, and other genes with MRKH syndrome. The purpose of this study was to identify the novel genetic causes of MRKH syndrome. Ten patients with MRKH syndrome were recruited at Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing, China. Whole-exome sequencing was performed for each patient. Sanger sequencing confirmed the potential causative genetic variants in each patient. In silico analysis and American College of Medical Genetics and Genomics (ACMG) guidelines helped to classify the pathogenicity of each variant. The Robetta online protein structure prediction tool determined whether the variants affected protein structures. Eleven variants were identified in 90% (9/10) of the patients and were considered a molecular genetic diagnosis of MRKH syndrome. These 11 variants were related to nine genes: TBC1D1, KMT2D, HOXD3, DLG5, GLI3, HIRA, GATA3, LIFR, and CLIP1. Sequence variants of TBC1D1 were found in two unrelated patients. All variants were heterozygous. These changes included one frameshift variant, one stop-codon variant, and nine missense variants. All identified variants were absent or rare in gnomAD East Asian populations. Two of the 11 variants (18.2%) were classified as pathogenic according to the ACMG guidelines, and the remaining nine (81.8%) were classified as variants of uncertain significance. Robetta online protein structure prediction analysis suggested that missense variants in TBC1D1 (p.E357Q), HOXD3 (p.P192R), and GLI3 (p.L299V) proteins caused significant structural changes compared to those in wild-type proteins, which in turn may lead to changes in protein function. 
This study identified many novel genes, especially TBC1D1, related to the pathogenesis of MRKH syndrome. The identification of these variants provides new insights into the etiology of MRKH syndrome and a new molecular genetic reference for the development of the reproductive tract.

Mayer–Rokitansky–Küster–Hauser (MRKH) syndrome is characterized by uterovaginal aplasia in an otherwise phenotypically normal female with a normal 46,XX karyotype. MRKH syndrome affects approximately one in 5000 live female births [2], and has been reported in approximately 16% of patients with primary amenorrhea [3]. When only the reproductive organs (uterus, fallopian tubes, cervix, and the upper part of the vagina) are affected, this condition is classified as MRKH syndrome type I. Some women with MRKH syndrome also have abnormalities in other organs of the body; in these cases, the disease is classified as MRKH syndrome type II [4]. Affected individuals often display renal, skeletal, and heart defects and hearing loss [5].
MRKH syndrome is directly caused by incomplete development of the Müllerian ducts. Genetic and/or environmental factors that control the formation and morphogenesis of Müllerian ducts are closely related to the MRKH syndrome. During human embryonic development, Müllerian ducts form just lateral to the Wolffian duct. Both Müllerian and Wolffian ducts develop from the intermediate mesoderm and on the surface of the mesonephric kidneys. Therefore, MRKH syndrome is usually associated with abnormalities of the renal and axial skeletal systems.
In this study, we aimed to identify novel genetic causes of MRKH syndrome using whole-exome sequencing (WES). We recruited 10 patients with MRKH syndrome and performed WES and family genetic analysis.

Patients
Ten patients diagnosed with MRKH syndrome with a 46,XX karyotype were recruited at Beijing Obstetrics and Gynecology Hospital from January 2019 to May 2021. The clinical conditions and manifestations of the ten patients are presented in Table 1. Five milliliters of peripheral blood were collected from each patient for further genetic analysis.

WES analysis
Genomic DNA from each patient was extracted from the peripheral blood using the QIAamp DNA Blood Kit (Qiagen, Valencia, CA, USA). WES was performed as previously described [17]. The functional effects of the variants (damaging or not) were predicted using the PolyPhen-2, SIFT, MutationTaster, LRT, and FATHMM-MKL algorithms. The desired variants were filtered using two criteria: (i) missense, nonsense, frameshift, or splice site variants; and (ii) variants with minor allele frequency < 1%. The minor allele frequency information was obtained by referring to the Genome Aggregation Database (gnomAD, http://gnomad.broadinstitute.org/), 1000 Genomes Project (1000G, http://browser.1000genomes.org/index.html), NHLBI Exome Sequencing Project (ESP6500), and our in-house database.
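The two filtering criteria can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Variant` record, its field names, and the rule that the highest frequency across all consulted databases must fall below 1% are assumptions made for illustration, and the two `GENE_X`/`GENE_Y` records are hypothetical. Only the TBC1D1 entry and its gnomAD East Asian frequency are taken from the text.

```python
from dataclasses import dataclass, field

# Criterion (i): variant classes retained by the filter.
KEPT_CONSEQUENCES = {"missense", "nonsense", "frameshift", "splice_site"}
# Criterion (ii): minor allele frequency must be below 1%.
MAF_CUTOFF = 0.01


@dataclass
class Variant:
    gene: str
    hgvs_c: str
    consequence: str
    # Database name -> allele frequency; a missing entry means "not observed".
    maf: dict = field(default_factory=dict)


def passes_filter(v: Variant) -> bool:
    """Return True if the variant meets both criteria."""
    if v.consequence not in KEPT_CONSEQUENCES:
        return False
    # Assumption: use the highest frequency reported by any database.
    highest = max(v.maf.values(), default=0.0)
    return highest < MAF_CUTOFF


variants = [
    # Reported in this study (frequency from the gnomAD East Asian population).
    Variant("TBC1D1", "c.1069G>C", "missense", {"gnomAD_EAS": 0.0002}),
    # Hypothetical records, included only to show what gets filtered out.
    Variant("GENE_X", "c.100A>G", "synonymous", {"gnomAD_EAS": 0.0001}),
    Variant("GENE_Y", "c.200C>T", "missense", {"gnomAD_EAS": 0.0003, "1000G": 0.05}),
]

kept = [v.gene for v in variants if passes_filter(v)]
print(kept)
```

Taking the maximum across databases is the conservative choice: a variant that is common in any one reference population is dropped even if it is rare in the others.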

Sanger sequencing analysis
Sanger sequencing was performed to validate the identified variants and determine if each variant was inherited from a parent. The primer pairs for each gene are listed in Additional file 1: Table S1. Forward or reverse primers were used to sequence the PCR products. Sequencing was performed using an ABI 3730 automatic sequencer (Applied Biosystems, Foster City, CA, USA).

Protein structure prediction
The three-dimensional structures of wild-type and mutant proteins were predicted using the Robetta online protein structure prediction server (https://robetta.bakerlab.org/) [18]. This tool can predict the three-dimensional structure of a given amino acid sequence. Protein structure alignment was performed using Visual Molecular Dynamics 1.9.3 software (https://www.ks.uiuc.edu/Research/vmd/).

WES analysis
Of the 10 women with MRKH syndrome, seven had type I MRKH syndrome and the other three had type II MRKH syndrome (Table 1). WES identified 11 variants in 90% (9/10) of the patients; these were considered a molecular genetic diagnosis of MRKH syndrome. These 11 variants involved nine genes: TBC1D1, KMT2D, HOXD3, DLG5, GLI3, HIRA, GATA3, LIFR, and CLIP1 (Table 2). All the variants were heterozygous. These changes included one frameshift variant, one stop-codon variant, and nine missense variants (Table 2). All the identified variants were absent or very rare in the gnomAD East Asian population (Table 2). Two of the 11 variants (18.2%) were classified as pathogenic variants according to the American College of Medical Genetics and Genomics (ACMG) guidelines. The remaining nine variants (81.8%) were classified as variants of uncertain significance (VUS).

Novel candidate genes of MRKH syndrome

TBC1D1
We identified TBC1D1 variants in two unrelated patients, Fc-M-1 and Fc-M-3 (Table 2). Fc-M-1 was diagnosed as type II MRKH syndrome (European Society of Human Reproduction and Embryology [ESHRE] classification: U5bC4V4) with full uterine agenesis, left pelvic kidney dysfunction, congenital anal atresia with vestibular fistula, ventricular septal defect, and accessory auricle (Table 1, Fig. 1A-C). Fc-M-3 was diagnosed as type I MRKH syndrome (ESHRE classification: U5bC4V4) with uterine remnants without a rudimentary cavity (Table 1 and Fig. 1D).

Table 2 In silico analysis of sequence variants found by WES in MRKH patients. The Genome Aggregation Database (gnomAD) is a resource developed by an international coalition of investigators, with the goal of aggregating and harmonizing both exome and genome sequencing data from a wide variety of large-scale sequencing projects. In this study, we referred to the allele frequencies in the East Asian population.
TBC1D1 was highly expressed in the uterus (Fig. 1E). Fc-M-1 harbored a frameshift c.2553delC (p.R854Efs*24) variant inherited from her father (Fig. 1F). Therefore, TBC1D1 variant c.2553delC was associated with MRKH syndrome within this family. The c.2553delC variant of TBC1D1 was predicted to produce a truncated p.R854Efs*24 protein, which would destroy the Rab-GAP TBC domain of the TBC1D1 protein and lead to the loss of C-terminal sequences (Fig. 1G). The c.2553delC variant was absent in the gnomAD East Asian population. It was classified as a pathogenic variant (PVS1 + PM2 + PP3) according to ACMG guidelines.
Fc-M-3 carried a missense c.1069G>C (p.E357Q) variant confirmed by Sanger sequencing (Fig. 1H). Due to the unavailability of samples from the patient's mother and father, we could not determine how this variant was transmitted. The allele frequency of the c.1069G>C variant in the gnomAD East Asian population was 0.0002. The variant was predicted to be a damaging variant by all five algorithms we used and was classified as VUS (PM2 + PP3 + BP1) according to the ACMG guidelines (Table 2). p.E357Q was located in a PTB_TBC1D1_like domain (TBC1 domain family member 1 and related protein phosphotyrosine binding (PTB)), which contains amino acids at positions 164 to 371. Wild-type (WT) and mutant protein structures for amino acids at positions 164 to 371 were predicted by the Robetta fold. The domain structural prediction results revealed significant structural changes in at least two regions between the protein carrying the p.E357Q variant and the WT protein (Fig. 1I), suggesting that the p.E357Q variant might affect the function of TBC1D1.
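To show how evidence codes such as PVS1 + PM2 + PP3 or PM2 + PP3 + BP1 map to a classification, the following Python sketch encodes the combining rules of the 2015 ACMG/AMP guidelines as we understand them. It is an illustration only, not the classification workflow used in this study; the function names `strength` and `classify` are hypothetical, and the sketch does not assess whether any individual criterion actually applies to a variant.

```python
from collections import Counter


def strength(code: str) -> str:
    """Strip the trailing number: 'PVS1' -> 'PVS', 'PM2' -> 'PM', 'BP1' -> 'BP'."""
    return code.rstrip("0123456789")


def classify(codes):
    """Combine ACMG/AMP evidence codes into a five-tier classification."""
    n = Counter(strength(c) for c in codes)
    pvs, ps, pm, pp = n["PVS"], n["PS"], n["PM"], n["PP"]
    ba, bs, bp = n["BA"], n["BS"], n["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    path_call = "Pathogenic" if pathogenic else "Likely pathogenic" if likely_pathogenic else None
    benign_call = "Benign" if benign else "Likely benign" if likely_benign else None

    # Conflicting or insufficient evidence both fall back to uncertain significance.
    if path_call and benign_call:
        return "Uncertain significance"
    return path_call or benign_call or "Uncertain significance"


print(classify(["PVS1", "PM2", "PP3"]))  # codes reported for TBC1D1 c.2553delC
print(classify(["PM2", "PP3", "BP1"]))   # codes reported for TBC1D1 c.1069G>C
```

Under these rules, one very strong criterion combined with one moderate and one supporting criterion reaches the pathogenic tier, whereas one moderate and one supporting pathogenic criterion plus a single supporting benign criterion meets neither a pathogenic nor a benign threshold, consistent with the pathogenic and VUS classifications reported for the two TBC1D1 variants.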

DLG5
We identified a DLG5 variant in patient Fc-M-6 who was diagnosed as having type I MRKH syndrome (ESHRE classification: U5bC4V4) with primary amenorrhea and dyspareunia (Table 1). DLG5 was highly expressed in the human cervix, uterus, and vagina (Fig. 2A). Fc-M-6 harbored the DLG5 stop-codon gained variant c.418C>T (p.Q140*) (Table 2). This variant was confirmed by Sanger sequencing (Fig. 2B) and classified as a pathogenic variant (PVS1 + PM2 + PP3) according to the ACMG guidelines (Table 2). The c.418C>T variant of DLG5 was predicted to produce a truncated p.Q140* protein, which would destroy all functional domains in the DLG5 protein (Fig. 2C).

GLI3
We also identified a GLI3 missense variant in patient Fc-M-8, who was diagnosed with type I MRKH syndrome (ESHRE classification: U5bC4V4) with primary amenorrhea and dyspareunia (Table 1). GLI3 was highly expressed in the human uterus and vagina (Fig. 4A). Fc-M-8 carried a heterozygous GLI3 variant c.895C>G (p.L299V) (Table 2), which was confirmed by Sanger sequencing (Fig. 4B) and classified as a VUS (PM2 + PP3 + BP1) according to the ACMG guidelines (Table 2). The structural prediction results showed that there were significant structural changes between the GLI3-WT protein and the mutant protein (p.L299V) (Fig. 4C). Alignment of the GLI3-WT and GLI3-mutant proteins was difficult (Fig. 4D).

Other genes associated with MRKH syndrome
We also identified variants of KMT2D, LIFR, CLIP1, HIRA, and GATA3. All variants were classified as VUS according to the ACMG guidelines (Table 2). All variants were confirmed using Sanger sequencing (Additional files 3 and 4: Figure S2A and S3A; some data not shown). Some of the mutant proteins that harbored the variants were analyzed by protein structure prediction (Additional files 3 and 4: Figure S2B, C, S3B, and C).
Two variants of KMT2D (c.2992C>G; P998A, and c.1754C>T; P585L) were found in two unrelated patients. The P998A variant was found in patient Fc-M-2 (Table 2), who was diagnosed with type I MRKH syndrome with primary amenorrhea and bilateral uterine remnants, without a rudimentary cavity (Table 1). The P585L variant was carried by patient Fc-M-7 (Table 2), who was diagnosed with type II MRKH syndrome with bilateral uterine remnants without a rudimentary cavity, congenital cleft palate, and bilateral fallopian tubal dysplasia (Table 1).
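The single-nucleotide substitutions reported above pair a coding-DNA change with a protein change, and the two descriptions can be cross-checked against each other. The Python sketch below is an illustration, not part of the study's analysis: it assumes coding positions are numbered from the A of the ATG start codon, uses the standard genetic code, and the helper names `codon_of` and `consistent` are hypothetical. Because the reference transcripts are not reproduced here, it only tests whether some codon for the reference amino acid can yield the reported amino acid through the reported base change at that codon position.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_of(c_pos: int):
    """Map a 1-based coding position to (codon number, 0-based offset in codon)."""
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3


def consistent(c_pos, ref, alt, aa_pos, aa_ref, aa_alt) -> bool:
    """Check that a c. substitution can produce the stated p. change."""
    codon_no, offset = codon_of(c_pos)
    if codon_no != aa_pos:
        return False
    for codon, aa in CODON_TABLE.items():
        if aa == aa_ref and codon[offset] == ref:
            mutated = codon[:offset] + alt + codon[offset + 1:]
            if CODON_TABLE[mutated] == aa_alt:
                return True
    return False


# Substitution variants as reported in the text ('*' denotes a stop codon).
reported = {
    "TBC1D1 c.1069G>C p.E357Q": (1069, "G", "C", 357, "E", "Q"),
    "DLG5 c.418C>T p.Q140*": (418, "C", "T", 140, "Q", "*"),
    "GLI3 c.895C>G p.L299V": (895, "C", "G", 299, "L", "V"),
    "KMT2D c.2992C>G p.P998A": (2992, "C", "G", 998, "P", "A"),
    "KMT2D c.1754C>T p.P585L": (1754, "C", "T", 585, "P", "L"),
}
for name, args in reported.items():
    print(name, consistent(*args))
```

A check of this kind catches transcription errors in variant tables (for example, a codon number that does not match the coding position), but it cannot replace annotation against the actual reference transcript.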

Discussion
Using WES and genetic analysis, this study identified several novel genetic variants that may lead to MRKH syndrome. Next, we discuss these novel genes involved in the pathogenesis of MRKH syndrome.

TBC1D1
TBC1D1 encodes a Rab-GTPase-activating protein and is involved in regulating the trafficking of GLUT4 storage vesicles to the cell surface [19]. Previous studies have found that heterozygous mutation of TBC1D1 is associated with congenital anomalies of the kidneys and urinary tract (CAKUT) [20,21]. The TBC1D1 mutation may promote the pathogenesis of CAKUT through its role in glucose homeostasis [20]. Patient Fc-M-1 harboring the TBC1D1 truncating variant found in this study also had CAKUT; the left pelvic kidney did not function and the right kidney was enlarged as a compensatory response. Type II MRKH syndrome is usually complicated by abnormalities in the urinary system. Therefore, the study findings suggest that attention should be paid to whether patients with type II MRKH syndrome, especially those with urinary system abnormalities, carry genetic variants related to CAKUT. A previous study by our group has also shown that sequence variants related to CAKUT could be associated with another complex reproductive tract malformation, which is related to the Herlyn-Werner-Wunderlich syndrome [17].

DLG5
Dlg5 is required for epithelial tube maintenance in the mouse brain and kidney. Dlg5 gene knockout mice exhibit hydrocephalus and renal cysts [22]. Heterozygous sequence variants of DLG5 are associated with ureteropelvic junction obstruction or renal agenesis [23]. Gene expression data of DLG5 in humans in the present study (Fig. 2A) showed that DLG5 was highly expressed in the tissues of the reproductive tract, including the cervix, uterus, and vagina. Therefore, the DLG5 protein may play an important role in the development of the reproductive tract. In this study, patient Fc-M-6 with MRKH syndrome harbored a rare DLG5 truncating variant, Q140* (Fig. 2C), which is classified as a pathogenic variant according to ACMG guidelines. The Q140* protein lacks almost all functional domains of DLG5, so the variant may lead to loss of function of the protein. As few patients with MRKH syndrome were included in this study, only one DLG5 variant was found. We expect that our follow-up research, or that of other research groups, will identify DLG5 variants in unrelated patients with MRKH syndrome and provide more evidence of the association of mutations in this gene with MRKH syndrome.

KMT2D
Previous studies have reported that KMT2D mutations can lead to Kabuki syndrome [24,25]. KMT2D gene variants are also related to CAKUT and renal agenesis [26-29]. In the present study, patients Fc-M-2 and Fc-M-7 each harbored a KMT2D variant. Both patients also carried another genetic variant. Fc-M-2 carried a variant of the LIFR gene (Table 2), which is related to the pathogenesis of CAKUT [30]. Fc-M-7 harbored a variant in the CLIP1 gene (Table 2), which is also a candidate gene for CAKUT [29]. These findings suggest that digenic inheritance may be involved in Fc-M-2 and Fc-M-7. KMT2D, LIFR, and CLIP1 are associated with CAKUT, which also indicates that perturbations of renal development-related genes may affect the normal development of reproductive tracts, including the uterus and vagina [17,31].

Other genes
We also found several genes, including HOXD3, GLI3, HIRA, and GATA3, which may be associated with MRKH syndrome. HOXD3 is predominantly expressed in the uterus and kidney (https://varsome.com/gene/HOXD3), suggesting its important roles in reproductive and urinary tract development. GLI3 is highly expressed in the uterus (https://varsome.com/gene/GLI3). Variants in GLI3 have also been associated with CAKUT or renal agenesis [23,26,32]. HIRA encodes a histone chaperone and is considered the primary candidate gene in DiGeorge syndrome. Deletion or duplication in chromosomal loci 22q11.21 containing the HIRA gene has been associated with MRKH syndrome [33,34]. GATA3 is expressed in Wolffian ducts at the time of their emergence in the embryonic intermediate mesoderm [35]. GATA3 mutations cause hypoparathyroidism, deafness, renal dysplasia syndrome, and CAKUT [26,36,37].
This study has two limitations. First, the sample size is relatively small. MRKH syndrome is a rare disease with an incidence of approximately 1/5000, few patients present to our hospital for treatment, and even fewer can be recruited. Despite the small sample, we sought to identify the genetic pathogenic factors in each patient. More patients will be recruited in the future, and we will examine whether the candidate genes identified here are also mutated in newly enrolled patients. Second, most of the variants found in this study are VUS, mainly because they have not been rigorously analyzed in functional experiments. We will continue to study the variants of interest to clarify the molecular mechanisms of their pathogenesis.

Conclusion
Genetic variants, especially in the TBC1D1 gene, are related to the pathogenesis of MRKH syndrome. This study provides new insights into the etiology of MRKH syndrome and the data are a new molecular genetic reference for the development of the reproductive tract.
