DUOX2 and DUOXA2 Variants Confer Susceptibility to Thyroid Dysgenesis and Gland-in-situ With Congenital Hypothyroidism

Background: Thyroid dysgenesis (TD), which is caused by gland developmental abnormalities, is the most common cause of congenital hypothyroidism (CH). In addition, advances in diagnostic techniques have facilitated the identification of mild CH patients with a gland-in-situ (GIS) with normal thyroid morphology. Therefore, TD and GIS account for the vast majority of CH cases. Methods: Sixteen known genes to be related to CH were sequenced and screened for variations by next-generation sequencing (NGS) in a cohort of 377 CH cases, including 288 TD cases and 89 GIS cases. Results: In our CH cohort, we found that DUOX2 (21.22%) was the most commonly variant pathogenic gene, while DUOXA2 was prominent in TD (18.75%) and DUOX2 was prominent in GIS (34.83%). Both biallelic and triple variants of DUOX2 were found to be most common in children with TD and children with GIS. The most frequent combination was DUOX2 with DUOXA1 among the 61 patients who carried digenic variants. We also found for the first time that biallelic TG, DUOXA2, and DUOXA1 variants participate in the pathogenesis of TD. In addition, the variant p.Y246X in DUOXA2 was the most common variant hotspot, with 58 novel variants identified in our study. Conclusion: We meticulously described the types and characteristics of variants from sixteen known gene in children with TD and GIS in the Chinese population, suggesting that DUOXA2 and DUOX2 variants may confer susceptibility to TD and GIS via polygenic inheritance and multiple factors, which further expands the genotype-phenotype spectrum of CH in China.


INTRODUCTION
Congenital hypothyroidism (CH), a common childhood disease associated with mental disabilities, manifests as high thyroid-stimulating hormone (TSH) levels and low T4 levels in the serum during neonatal screening due to insufficient production of thyroid hormone. CH can lead to short stature and intellectual disability if not diagnosed and treated at an early stage (1)(2)(3)(4). Based on the Chinese neonatal screening program that started in 1984, CH affected up to 1/2421 neborns (4.13 per 10,000 live births) between 2013 and 2015, compared to 1/3000-1/4000 newborns worldwide; these figures suggest a high incidence in China and prompted us to investigate the pathogenesis of this disease (5)(6)(7).
Thyroid dysgenesis (TD), the most common cause of CH, is caused by the failure of the thyroid gland to develop a normal morphology or position and encompasses the spectrum of thyroid agenesis, thyroid ectopy, and thyroid hypoplasia (8,9). Insufficient hormone production secondary to TD is mainly due to defects in genes that encode enzymes that participate in iodine organification, thyroglobulin synthesis or transport, iodine transport or iodotyrosine deiodination during thyroid hormone synthesis (10). Although gland-in-situ (GIS) with normal thyroid morphology and goiter were once thought to be present in only 20-30% of CH patients, an increase in the frequency of CH with GIS has been reported, with the incidence more than doubled to ∼1 in 1,500 live newborns, which accounted for almost two-thirds of diagnosed cases in Italy at the time (11,12). Increasing evidence has shown that the pathogenesis of CH is complex, and this disease is considered to be influenced by environmental factors and genetic causes; genetic factors play an important role in the pathogenesis of CH (13). Therefore, studies of large cohorts and multiracial groups are required to investigate genetic causality due to the unclear underlying cause and complex etiology of CH.
Thyroid morphogenesis is a multistage process that proceeds from thyroid progenitors to the thyroid gland, and thyroidstimulating hormone receptor (TSHR), thyroid transcription factors, such as FOXE1, NKX2-1 PAX8, and NKX2-5, and transcription regulators, such as GLIS3, are required for the normal development of the thyroid gland; mutations in these factors can lead to the failure of normal thyroid formation (14)(15)(16). Dyshormonogenesis is often due to genetic defects in important enzymes in thyroid hormone synthesis: iodine organification disorder (TPO, DUOX2, DUOXA2, and SLC26A4); thyroglobulin synthesis or transport defect (TG); iodine transport defect (SLC5A5); and iodotyrosine deiodinase deficiency (IYD) (17)(18)(19)(20). In addition, a very small proportion of CH children are severely resistant to thyroid hormones due to variants in THRB (21). Although genetic variants of candidate genes are involved in thyroid hormone synthesis and TSHR and intersect with genes associated with both TD and dyshormonogenesis (19,22), the genetic basis of GIS remains poorly understood.
Investigations of the genetics of TD and dyshormonogenesis have revealed that the genetic diversity of relevant genes far exceeds our expectations. Previous studies have applied nextgeneration sequencing (NGS) as a cost-efficient and powerful approach and confirmed that the conventional goiter-associated candidate genes DUOX2 and TPO may be related to TD, suggesting that the genotype-phenotype relationship of CH is increasingly complex (23)(24)(25)(26). However, most previous studies focused on only a single gene or a small number of genes and thus had significant limitations in elucidating the relationship between genotype and phenotype. Therefore, it is necessary to explore the comprehensive effects and variant rates of these causative genes in sporadic cases as part of an in-depth study of the pathogenesis of CH. In this study, we will explore 16 causative genes associated with CH by conducting NGS in 377 CH cases comprising 288 TD cases and 89 GIS cases in China to reveal the genetic complexity of CH, which will expand the genetic spectrum of this disease and clarify the interactions among multiple genes.

Editorial Policies and Ethical Considerations
Our study was approved by the ethics committee of the Affiliated Hospital of Qingdao University. Written informed consent was provided by the parents of the participants.

Study Samples
A total of 377 children with sporadic CH, including 288 children with TD (121 athyreosis, 93 ectopia, and 74 hypoplasia) and 89 children with GIS were recruited from newborn screening centers in 11 cities in Shandong, China (Qingdao, Zibo, Jining, Linyi, Yantai, Jinan, Liaocheng, Dezhou, Weifang, Zaozhuang and Tai'an) ( Table S1). The clinical, biochemical and ultrasonographic criteria of the patients were as follows: (1) All patients were full-term children aged 1-6 years old and had no other congenital diseases. (2) Heel blood was collected within 72 h after birth and TSH levels were analyzed. The children with TSH > 10 µL/ml were recalled within 2 months after birth. The levels of serum TSH, FT4, and FT3 were further measured. Then, CH was diagnosed according to high TSH (> 4.2 µIU/ml) and low FT4 (< 12 µpmol/L). (3) The morphological classification of thyroid is as follows (shown by ultrasonography or technetium-99 m thyroid scanning): (a) GIS means a child diagnosed with CH, accompanied by normal thyroid anatomy and imaging findings; (b) athyreosis is defined as failure to show any thyroid tissue on ultrasonography and technetium thyroid scans; (c) ectopia refers to the absence of thyroid tissue in normal thyroid anatomical structure, but appeared in other areas; (d) hypoplasia refers to unilateral and/or bilateral thyroid gland not reaching normal size and lacking normal shape (27). Thyroid morphology was classified by thyroid ultrasonography and technetium thyroid scans (28,29). In addition, we excluded children with chromosomal abnormalities, children with central hypothyroidism or associated syndromes and offspring of mothers with Graves' disease. All subjects represented sporadic and non-consanguineous cases.

Targeted NGS
A Qiagen DNA extraction kit (Qiagen, Hilden, Germany) was used to extract DNA samples from blood clots using the column method. The following is the process of library preparation: (1) genomic DNA randomly fragmented to an average size of 180-280 bp; (2) gDNA fragments repaired and phosphorylated; (3) A-tailing and ligation at the 3 ′ ends with paired-end adaptors(Illumina) with a singel "T" base overhang and purification using AMPure SPRI beads from Agencourt. DNA library was hybridized with the probe library labeled with biotin in liquid phase. The probe library was specifically combined with the exons of 16 candidate genes: and SLC26A4 (NM_000441). Then, the prepared library was sequenced on an Illumina Hiseq x for 150 bp paired-end reads.

Data Filtering and Variant Calling
The raw image files obtained from the Hiseq 4,000 were processed with an Illumina pipeline for base calling and stored in FastQ format (raw data). Adapter contamination, uncertain nucleotides >10% and single read >50% low quality were used to filter the data.

Variant Annotation
The variant position, variant type, conservative prediction and other information were annotated by ANNOVAR through a variety of databases, such as DbSNP, 1,000 Genome, ExAC, CADD and HGMD. We defined the variations with the following criteria: (1) quality scores >30, site read depth >10×, variant depth >5×, and mapping quality >50; (2) minor allele frequency (MAF) < 0.01 in the 1,000 Genomes databases and ExAC; (3) at least half of the employed algorithms (PolyPhen-2, SIFT, Mutation Taster and CADD) showed that the single-nucleotide variant (SNV) results in an amino acid change or is not benign (defined as a nonsense or splice site variant). All of these variants were verified by Sanger sequencing.

Distribution of Biallelic or Multiallelic Variants in the Cohort
To characterize the variant rates of biallelic and multiallelic variants (2 or more pathogenic variants on the same allele) of these sixteen genes in the cohort, we identified biallelic variants in 16

Variant of Multiple Genes Was Detected in our 94 Subjects
Because more than one gene can participate in the process of CH in an individual and because CH is a complex disease involving multiple genes (31,32), we analyzed the sequencing information for subjects carrying two or more genes with variants and conducted an analysis of various gene combinations ( Table 2). Among the 61 digenic variants, the most frequent combination was DUOX2 with DUOXA1 (29.51%, 18/61 in all patients; 8/19 in GIS patients and 10/42 in TD patients) (Figure 2); among 26 trigenic variants, the most common combination was DUOX2 and TG combined with DUOXA1 (15.38%, 4/26) or DUOXA2 (11.54%, 3/26), and 7 subjects had joint variants of four genes (Table S1, Patient ID: 11; 91; 106; 107; 119; 166; and 199).

We Detected 58 Novel Variants and Found 10 Possible Variant Hotspots in our CH Cohort
Among the 168 variants, 58 novel variants were identified, as follows: 12 variants in TG; 11 variants in DUOX2; 8 variants in TSHR; 5 variants in DUOXA1; 3 variants each in PAX8, SLC5A5, FIGURE 1 | Distribution of variants in 219 congenital hypothyroidism. The left side was 16 detected genes with variants, and the number on top of each box is the patient ID. Each column represents a patient and each row represents a gene. Blue blocks represent triallelic variants, green blocks represent biallelic variants and yellow blocks represent monoallelic variants. Such as patient 45 carries variants on three genes: biallelic variants in the DUOX2 gene and each monoallelic variant in TG and DUOXA2. A total of 80 subjects bore DUOX2 variants, which was the most commonly pathogenic genes with variants in our study. As for TD (encompassing thyroid, ectopy and hypoplasia), the most frequently was DUOXA2, which held 54 carriers. DUOXA2 and TPO; 2 variants each in GLIS3, NKX2-1 and THRB; and 1 variant each in FOXE1, IYD, NKX2-5, and SLC26A4 (Table S2). We found 10 possible variant hotspots of these sixteen genes that were concentrated significantly in the Chinese CH population (

DISCUSSION
In our study, we conducted NGS for 16 known genes associated with CH in a total of 377 CH-affected subjects and annotated non-synonymous variants of these known genes to predict the cause of thyroid dysfunction or abnormal morphology by the highly rigorous ACMG criteria. We found that DUOX2 was the most significantly gene with variants in our CH cohort, while DUOXA2 was significant in TD patients and DUOX2 was prominent in GIS patients. We found that DUOX2 had a significantly high rate (21.22%) in our study, mainly based on CH, with 288 TD patients (16.67%) and 89 GIS patients (34.83%), which indicated that DUOX2 is an important susceptibility gene in the Chinese CH population. The high prevalence of DUOX2 variant in our work is consistent with previous studies in China (variant rate of DUOX2 from 31.8 to 62.5%), suggesting that DUOX2 is an important causative gene involved in the pathogenesis of CH (31)(32)(33). As DUOX2 has generally been considered a gene involved in disorders of H 2 O 2 generation (11,34,35), a majority of previous studies focused on dyshormonogenesis and found that DUOX2 pathogenic variants were present in 62.5% of the cohort consisting of 154 CH patients, with 118 dyshormonogenesis patients and 36 TD patients (33); Sun et al. (32) revealed that DUOX2 was present in 60% of 110 CH patients, including 21 goiter patients, 51 patients with a normal-sized thyroid and 28 patients with unknown thyroid morphology. Recently, One study has shown DUOX2 mutations associated with CH with ectopic thyroid (25). However, mouse models with biallelic inactivation of DUOX2 or DUOXA2 present a normal developed thyroid gland with only thyroid dyshormonogenesis (36,37). Our study also found that the high rate of DUOX2 in GIS patients was significantly different from that in TD (Pearsonχ 2 = 12.911, p-value= 3.27 × 10 −4 ), indicating that DUOX2 is an important factor associated with GIS in China; this result differed significantly from previous research in western countries that suggested that variants of TG are the most common cause of GIS (19). Therefore, we speculated that DUOX2 confers susceptibility not only to TD but also to GIS.
In our TD cohort, the most significant variant was DUOXA2 (18.75%), which had a slightly higher rate than DUOX2. As a necessary maturation factor for functional DUOX2, DUOXA2 is an accessory protein required for the correct processing of the DUOX2 enzyme and is generally involved in dyshormonogenesis (38). In our study, we first found that DUOXA2 confers susceptibility to TD, which provides new insights for understanding thyroid development. In addition, we found that 23.6% of children with GIS carried DUOXA2 variants. Thus, we speculate that DUOXA2, as a critical maturation factor for functional DUOX2, may play an important role in TD and GIS. We also found 10 subjects in our 288 TD cohort with TPO variants, which is consistent with the fact that TPO was previously believed to be involved in thyroid hormone synthesis, which may also lead to TD (26). Therefore, our findings contribute to the variant spectrum associated with multiple CH phenotypes in China and may explain the multigene interactions and new roles of DUOX2 and DUOXA2 in children with TD and GIS. DUOX2 encodes an enzyme responsible for catalyzing the production of H 2 O 2 from the apical membrane of follicular epithelial cells, which is necessary for the iodization of thyroglobulin tyrosine residues. DUOXA2 proteins migrate to the cell surface with DUOX2 to constitute the functional active complex (39). DUOX2/DUOXA2 complex plays an important role in the transport, maturation and localization of DUOX2. Recently, studies have shown that duox and duoxa are new thyroid differentiation markers in zebrafish, and duoxa is earlier expressed than duox (40). It has also been proved in insects that the main purpose of duox activity is to guarantee tyrosine residues cross-linked, which is helpful for the cuticle formation in Drosophila (41). DUOX2 is one downstream target of FOXE1 in rat thyroid follicular cells (pccl3) (42). This suggests that duox family may play an additional role in thyroid development by interacting with other undetermined proteins.
We found that biallelic variants were present in 5.56% (16 of 288) of children with TD and 16.85% (15 of 89) of children with GIS in our study, and this difference was statistically significant (Pearsonχ 2 = 11.501, p-value=6.96 × 10 −4 ), which indicated that biallelic variants may play a key role in GIS. The most common biallelic variants involved DUOX2 in children with GIS, and DUOX2 or TSHR in children with TD. We showed for the first time that biallelic variants of TG, DUOXA2 and DUOXA1 participated in the pathogenesis of TD. We also identified triple variants in DUOX2 in three cases each of TD and GIS. However, it is uncertain whether the phenotypic severity of CH depends on the number of allelic variants. Further studies in larger families are needed to assess the contributions of different variables to the phenotype of CH.
We detected monogenic variants of 16 known genes in 125 cases (26 athyreosis, 21 ectopia, 18 hypoplasia and 29 GIS) and non-monogenic variants in 94 cases (39 athyreosis, 27 ectopia, 26 hypoplasia and 33 GIS). The most frequently digenic variant was DUOX2 combined with DUOXA1 in both TD and GIS. However, previous studies reported that TG combined with DUOX2 had the most significant variant rate in children with GIS and dyshormonogenesis (19,43). Hulur et al. (44) reported that a CH child with goiter lacked one allele of DUOX2 and DUOXA1, with loss of function of DUOXA2. We demonstrated for the first time that DUOX2 and DUOXA1, as members of the NADPH oxidase family that were once thought to be involved only in thyroid hormone synthesis (35,45), may play a role in thyroid development. Furthermore, we found 26 subjects carrying trigenic variants mostly in DUOX2 and TG accompanied by DUOXA1/DUOXA2, and 7 cases had joint variants of four genes without obvious gene aggregation. These diverse genetic variant combinations indicated that CH is a complex genetic disease.
In this study, we found that there are two or more pathogenic variants on the same allele and a large number of multigenic pathogenic variants. As recently reported (46), mutations of multiple genes cause phenotype aggravation because of the weakening of the attenuated compensation effect of related genes. In fact, environmental factors, such as iodine deficiency disorders, may cause endocrine dysfunction by affecting gene expression. Another article revealed that (47) the frequent combination of rare variations in CH patients was independent of phenotype, implying CH has a oligogenic origin, which can explain the absence of CH heritability and sporadic manifestations of the disease. We support the view that CH was an oligogenetic predisposition disease with multiple defects in different loci or genes in sporadic cases.
We identified a high prevalence of p.Y246X variants in DUOXA2 in our 68 subjects, but this hotspot is quite different from the p.K530X variants in DUOX2 in Guangzhou due to possible differences in population composition between North and South China (33). In addition, we detected p.S268G and p.T168M in DUOXA1 with other hotspots. The variant hotspots in these genes laid the foundation for future genetic screening.
In summary, our research made full use of the superiority of NGS to identify pathogenicity variants in 16 known genes based on CH in a Chinese population. We meticulously described the types and characteristics of gene variants in TD and GIS patients in the Chinese population, suggesting that DUOXA2 and DUOX2 variants may confer susceptibility to TD and GIS via polygenes and multiple factors, which further expands the genotype-phenotype spectrum of CH in China. However, genetic verification in large families and functional research on the identified variants are necessary for further insights into the etiology of CH.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI SRA: PRJNA613190.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committee of the Affiliated Hospital of Qingdao University. Written informed consent was provided by the parents of the participants.

AUTHOR CONTRIBUTIONS
SL and FaW selected topics, and participated in design and coordination. FeW analyzed the data and drafted the manuscript. YZ, ML, and WL collected the sample and analyzed the basic clinical data. YW, HL, and XY interpreted the results and provided critical revisions of the manuscript. All authors approved the final version of the manuscript and agreed to be accountable for all aspects of the work.