Copy number variations exploration of multiple genes in Graves’ disease

Abstract Background: Few previous published papers reported copy number variations of genes could affect the predisposition of Graves’ disease (GD). Herein, the aim of this study was to explore the association between copy number variations (CNV) profile and GD. Methods: The preliminary copy number microarray used to screen copy number variant genes was performed in 6 GD patients. Five CNV candidate genes (CFH, CFHR1, KIAA0125, UGT2B15, and UGT2B17) were then validated in an independent set of samples (50 GD patients and 50 matched healthy ones) by the Accucopy assay method. The CNV of the other 2 genes TRY6 and CCL3L1 was investigated in 144 GD patients and 144 healthy volunteers by the definitive genotyping technique using the Taqman quantitative polymerase-chain-reaction (Taqman qPCR). TRY6 gene-associated single nucleotide polymorphism (SNP), rs13230029, was genotyped by the PCR-ligase detection reaction (LDR) in 675 GD patients and 898 healthy controls. Results: There were no correlation of the gene copy number (GCN) of CFH, CFHR1, KIAA0125, UGT2B15, and UGT2B17 with GD. In comparison with that of controls, the GCN distribution of TRY6 and CCL3L1 in GD patients did not show significantly differ (P > 0.05). Furthermore, TRY6-related polymorphism (rs13230029) showed no difference between GD patients and controls. No correlation was found between CNV or SNP genotype and clinical phenotypes. Generally, there were no link of the copy numbers of several genes, including CFH, CFHR1, KIAA0125, UGT2B15, UGT2B17, TRY6, and CCL3L1 to GD. Conclusion: Our results clearly indicated that the copy number variations of multiple genes, namely CFH, CFHR1, KIAA0125, UGT2B15, UGT2B17, TRY6, and CCL3L1, were not associated with the development of GD.


Introduction
Graves' disease (GD), one clinical subtype of autoimmune thyroid diseases (AITDs), is a kind of complex and thyroidspecific autoimmune disease with up to 5% in incidence. [1] GD is characterized by the presence of hyperthyroidism and circulating thyroid stimulating hormone receptor antibody (TRAb). The disease predominantly affects women of child-bearing age. Likewise, GD has a high degree of familial aggregation and genetic heredity. [2] Given the exact etiology of GD is yet to be defined, a large body of literature supports pivotal roles of the inherent genetic factors, environmental triggers, and their interaction. [3] In addition, although under intense investigations, the genetic basis of human GD is still not well understood. Most genetic studies so far have focused on disease-association of GD with common genetic variation such as single-nucleotide polymorphism (SNP), but it has recently become apparent that copy-number variant (CNV) in immune genes is also a vital source of genomic diversity contributing to susceptibility to disease development as well. [4,5] New high-throughput approaches such as microarray-based comparative genome hybridization (CGH) or CNV microarray, as an emerging and efficient molecular genetics technology, can simultaneously screen multiple CNV genes and areas in the whole genome on one specimen. In the present project, the CNV microarray was utilized to screen the CNV profile in GD. Then, we tried to further confirm the candidate CNV genes including CFH, CFHR1, KIAA0125, UGT2B15, and UGT2B17 in a larger size of samples (50 GD cases and 50 healthy ones). We also chose 2 immune-related genes, chemokine ligand 3-like 1 (CCL3L1), and trypsinogen C (TRY6), for in-depth verification on their copy numbers in GD. CCL3L1 is a variable copy-number gene which encodes a protein binding to several pro-inflammatory cytokine receptors including chemokine receptor 5 (CCR5). There is increasing evidence that CCL3L1 copy-number influences the susceptibility to rheumatoid arthritis (RA), [6] type 1 diabetes (T1D), [6] and systemic lupus erythematosus (SLE). [7] Because the correlations between CNV of this gene (CCL3L1) and other autoimmune diseases have been reported, [6,7] we attempted to grope for the gene copy number of CCL3L1 in GD. TRY6, whose alias is PRSS3P2, is localized to the T cell receptor (TCR) beta locus on chromosome 7. Given TRY6 is a key factor for T-cell-immune and inflammatory responses [8,9] and simultaneously the microarray showed TRY6 was with less copy numbers in GD, we intended to explore whether TRY6 CNV is associated with the pathology of GD.

Subjects
Here, all GD patients were enrolled from the Outpatient Clinic of Endocrinology of Jinshan Hospital, whereas ethnically and geographically matched healthy controls were recruited from the Healthy Check-Up Center of the same hospital. All the patients and controls were given written informed consent. And sample collection was supported by the Ethical Committee of Jinshan Hospital of Fudan University. Each participant, including patients and controls, was ethnic Chinese Han.
All GD patients were newly diagnosed and met the diagnosis criteria, including clinical manifestations and biochemical assessments of hyperthyroidism and the positive circulating TRAb, with or without diffuse goiter of the thyroid and positive antibody against thyroid peroxidase (TPOAb) or thyroglobulin (TgAb), which was clearly described in our published articles previously. [10,11] The subjects without positive antibody against TPO or Tg, personal or family history of thyroid diseases, or any other diseases were recruited as controls. In this study, the immunochemiluminescence method with high specificity and sensitivity was used for detecting TPOAb, TgAb, and TRAb, and all of the kits were provided by the Roche Company, China.
In the first screening stage, 6 GD patients were used for the microarray detection. In the validation stage, another 50 GD patients and 50 healthy subjects were enrolled. Finally, we chose 2 genes for further CNV genotyping, 1 gene chosen from CNV microarray (TRY6), and the other reported by other studies on other autoimmune diseases (CCL3L1). In the next stages, additional 2 independent cohorts were collected. For CNV genotyping, 144 GD patients and 144 matched controls were enrolled from the same hospital. And for SNP genotyping, another cohort of 675 Chinese patients with GD and 898 unrelated healthy controls were recruited for analysis in this phase.

DNA sample collection
Blood samples were collected from all the participants and genomic DNA was extracted from 2 mL of ethylenediamine tetra-acetic acid (EDTA)-treated peripheral venous blood using a commercially available kit of RelaxGene Blood DNA System (Tiangen Biotech, Beijing, China), following the instructions of the kit. The concentration and purity (represented as A260/A280 ratio) of all DNA samples were measured by the Nano Drop 2000 Spectro-photometer (Thermo Scientific Company, USA). The DNA samples with great quality (high purity and concentration) were prepared for next CNV and SNP genotyping.

CNV screening
The Affymetrix CytoScan HD array with 900,000 probes was applied for screening CNV genes and there were normal genes copy numbers on the slides as controls. For microarray detection, the sample DNA was digested by restriction enzymes Nspl, amplified by the PCR method, labeled and further fragmented to obtain biotin labeled DNA using the Affymetrix Cytoscan HD array kit (Cat #901835, Affymetrix, Santa Clara, CA) following the manufacturer's instructions. The array hybridization was performed at 50°C in Hybridization Oven (Cat #00-0331-220V, Affymetrix, Santa Clara, CA). After 16 hours' incubation, array slides were washed in Fluidics Station (Cat #00-0079, Affymetrix, Santa Clara, CA) according to the user manual. Then, slides were scanned with default settings by the GeneChip Scanner 3000 (Cat #00-00212, Affymetrix, Santa Clara, CA) and Command Console Software 3.1 (Affymetrix, Santa Clara, CA). Data were extracted and further analyzed with Affymetrix Chromosome Analysis Suite (Affymetrix, Santa Clara, CA).

CNV genotyping
To detect copy number of each genomic DNA (gDNA) sample, we used 2 methods. First, the Accucopy assay (a multiple competitive real-time PCR, Shanghai Tianhao Company, China) was used for detecting 5 genes, CFH, CFHR1, KIAA0125, UGT2B15, and UGT2B17, chosen from the array detection. Ten target genomic segments within the 5 genes (2 segments for each gene) were chosen for the PCR measurement. The reference genomic sequences were 2p, 10pL, 20q, and 16p. The forward and reverse primers of target genome segments are displayed in Table 1. Then, we used the quantitative polymerase chain reaction (qPCR) by TaqMan Copy Number Assays (Applied Biosystems of Lifetechnology, Foster City, CA) for CCL3L1, (Hs03198166_cn) and TRY6 (Hs03114102_cn), according to the manufacturer's protocol. qPCR was performed using the Table 1 The primers of target genomic segments and reference genome sequences by the Accucopy assay.

SNP genotyping
All the SNP genotyping experiments were done using the PCRligase detection reaction (LDR). The target DNA sequences were amplified using a multiplex PCR method, and the PCR program (20 mL volume) was conducted as follows: 95°C 2 minutes, 94°C 30 s, 40 cycles (53°C 90 s, 65°C 30 s), 65°C 10 minutes. The ligation reaction for each sample was carried out in a final volume of 10 mL, containing 1 mL of 1Â buffer, 4 mL of multi-PCR product, 1 mL of 2 pmol/ml probe mix, and 0.05 mL of 20U Taq DNA ligase (New England Biolabs). The LDR method was performed using 95°C 2 minutes, followed by 40 cycles of: 94°C for 15 seconds and 50°C for 25 seconds. The fluorescent products of LDR were differentiated by the software of Gene Amp PCR system 9600 (Norwalk, CT).
To ensure quality of the genotyping, we made encoded samples numbers, double positive controls (duplication of the same DNA samples) and negative controls (blank samples without DNA) in the process of SNP genotyping. Quality control analysis was also performed so that only SNPs and samples which had passed the 98% quality control threshold were subjected to further statistical analysis. All genotype call rates were manually recorded and conflicting results were liberally re-genotyped by sequencing. The information of primers and probes specific for the SNP (rs13230029) in the region of TRY6 is shown in Table 2.

Statistical analysis
The clinical data are described as mean ± standard deviation (M ± SD). Hardy-Weinberg equilibrium (HWE) analysis was conducted with Haploview software 4.2, using a threshold of 0.05, without correction for multiple testing. Comparison of cases and controls was made with allele frequency and copy number data using Pearson chi-square tests or Fisher's exact test on 2 Â 2 contingency tables. Further, polymorphism genotype association was compared by chi-square tests on 2 Â 3 contingency tables as well. Other statistical analyses were performed using a customized version of SPSS 17.0 (http://www-01.ibm. com/software/analytics/spss/). P-value of <0.05 was considered of statistical significance. Effect sizes, taken as odds ratio (OR), and 95% confidence interval (95%CI) assessed the association between each genotype and GD.

Patients data
Demographic data and clinical characteristics of the 2-cohorts (GD patients and controls) are described in Table 3. In the first CNV microarray screening stage, we collected 6 GD patients (33.333% men and 66.667% females). In the validation study, we enrolled 50 GD patients (34.000% males and 66.000% females) and 50 controls (36.000% males and 64.000% females). Then, in subsequent study phase, our research investigated 144 GD patients (65.278% females and 34.722% males) and 144 controls (66.667% females and 33.333% males). In GD patients, the average onset age was 36.640 ± 14.797 years, 14 (9.722%) had ophthalmopathy and 32 (22.222%) individuals had family history. In the second SNP stage, we enrolled 675 GD patients (69.185% females and 30.815% males) and 898 controls (66.815% female sand 33.185% males). The average onset age of GD patients was 33.920 ± 14.516 years old, 120 (17.778%) had ophthalmopathy and 137 (20.296%) individuals were with family history.

Gene copy number (GCN) array screening
Through CNV microarray detection, we found several CNV genes, which were shared by 6 GD patients, including some genes Table 2 Probe and primer information for TRY6 SNP-genotyping using the PCR-LDR method.
3.3. Gene copy number (GCN) association validation 3.3.1. Accucopy assay validation. DNA copy number calculation in CFH showed no deletion or duplication in our GD cases. Association analysis of the copy number variations in CFHR1, KIAA0125, UGT2B15, and UGT2B17 did not find significant difference between cases and controls.

CNV assay selection.
To identify CNVs in the TRY6 and CCL3L1 genes, we searched the Database of Genomic Variants (DGV; http://projects.tcag.ca/variation) for cataloged CNVs. TRY6 has only one listed designed-probe in DGV and can be detected by an assay specific for this CNV. That is what we chose for detection. However, CCL3L1 has multiple cataloged probes in DGV, which overlap with and encompass the entire gene. Then, we selected one, which was reported by several CNV array previously, for further CNV genotyping.

CNV analysis in the TRY6 gene.
The gene copy number (GCN) of TRY6 was confirmed in our cohort. It varied from 0 to 3. 0 was the most common GCN count. In comparison with that of the controls, the GCN distribution of TRY6 in GD patients did not significantly differ (shown in Table 5).
3.3.4. CNV analysis of the CCL3L1 gene. As shown in Table 5, there was indeed CCL3L1 CNV in GD patients and controls. That of CCL3L1 was from 1 to 5, and 5 copies was the most common. In comparison with that of controls, the CCL3L1 gene in GD patients did not show difference in GCN distribution (P > 0.05).

Correlation of CNV with clinical phenotype in GD patients
We further analyzed the correlation of the CNV distribution with GD patients with or without clinical phenotype (female or male, with or without family history, with or without ophthalmopathy, the age of onset 18 years old or ≥19 years old), but the results were all negative (P > 0.05, the data not shown).

Allelic and genotypic association of SNP
In our cohort, TRY6 CNV-associated SNP locus, rs13230029, was in Hardy-Weinberg equilibrium (P-value larger than 0.05). Allele G of rs13230029 in GD patients was up to 29.778%, higher than that in healthy subjects (29.399%), but without significant difference (P = 0.818). When the genotype distribution was further analyzed, the results displayed that GG genotype of TRY6 gene in the GD group was increased compared with that in the control group, but with no significant difference (displayed in Table 6). Similarly, we compared the allele and genotype distributions in clinical phenotype of GD patients (gender, onset of age, with or without ophthalmopathy, with or without family history), but no positive results were found (data not shown).

Discussion
Hitherto, significant progress has been made in understanding the role of genetic factors in the pathomechanism of GD. [2] Recent data on the genetic susceptibility to GD reveal a novel mechanism. For instance, a large body of evidence show that SNPs of different genes are involved in the development of GD, such as the polymorphisms of IL-21/IL-21R, [12] BANK1, [13] STAT3, [14] and TNFAIP3. [15] Several researches have concluded that the varieties of DNA copy number, another essential subtype of genetic components, may influence the expression of genes and Table 4 CNV profile in GD by microarray screening. Gain genes  KIAA0125, ADAM6, MFHAS1, IL3RA, SLC25A6, DHRSX, HDHD1, DMD, NROB1, FGF13, FAM70A, MID2, BRWD3, ACSL4,  SLC6A14, SLC25A43, ZNF280C, CFH, CFHR1  Loss genes  OR4P4, OR4S2, OR4C6, HEATR4, ACOT1, ANKRD36, UGT2B17, UGT2B15, TRY6, PRSS2, ASMTL, NLGN4X, DMD, RP2,  IL1RAPL1, PCDH11X, KIAA2022, VAMP7,   change the structure of genes, and therefore trigger the contribution to substantial phenotypic variation. Moreover, CNV in genomic regions harboring many dosage-sensitive genes probably cause or predispose to a variety of human genetic diseases. So far, increasing data have suggested that the CNVs of immune-genes may confer risk for various autoimmune diseases, with a manner similar to the well-documented associations between SNPs in immune-related genes and autoimmunity. For example, C4 CNVs were associated with SLE. [16] In CD, HBD-2 showed fewer copy numbers, [17] and CNVs of FCGR3B were linked to RA [18] and SLE. [19] However, few published papers reported whether copy number variations of genes could affect the predisposition of GD. [4,5,20,21] Here, CNV microarray as a genome-wide analytical technique was applied by us to detect genomic CNV profile in GD, and we indeed found some copy number variant genes not only on enchromosomes but also on sex chromosomes. However, 5 genes including CFH, CFHR1, KIAA0125, UGT2B15, and UGT2B17, which were chosen from the CNV assay detection, showed no relation to GD in further validation procedure.

CNV screening in microarray
It was showed TRY6 with gene copy numbers loss in GD patients in our microarray screening and it was considered the pivotal roles of CCL3L1, TRY6 as vital effectors in immuneregulation, so our present study explored whether there is an association between copy number variation of these 2 immunegenes and the pathogenesis of GD. The copy number of CCL3L1 varies among individuals, with most subjects owning 1 to 5 copies in the diploid genome. It was reported that the low CCL3L1 copy number is an important risk factor for HIV infection, whereas high CNV of CCL3L1 is protective against the disease. [22] A more recent publication reported the lower copy number (less than 2) of CCL3L1 increased the risk of SLE. [7] And another study suggested that CCL3L1 CNV was associated with the susceptibility to RA and T1D, probably through enhancing inflammatory responses and increasing the chance of autoimmune diseases. [6] Our present study showed there was indeed Table 5 Comparison of the copy numbers frequencies in CFH, CFHR1, KIAA025, UGT2B15, UGT2B17, TRY6, and CCL3L1 between GD patients and controls.  CCL3L1 CNV in GD patients and controls, from 1 copy to 5 copies. Nevertheless, the CNV distribution of CCL3L1 did not differ between the GD group and the control group. Further, we also investigated the correlation of the clinical subsets with the CNV distribution, and found no statistical significance in all caseonly comparisons (cases with each clinical phenotype versus those without). TRY6 is a member of a highly homologous serine protease family (PRSS). It is also a locus that appears to encode a protein trypsinogen and this gene is related to the T cell receptor (TCR). In recent years, TRY6 has been believed to be involved in immune and pro-inflammatory processes. [8] Moreover, TRY6 mutation is associated with an inflammatory disease, pancreatitis. [23] Rs13230029, a neighboring SNP in 100% LD with the TRY6 gene deletion is associated with breast cancer risk. [24] In our previous microarray detection, there were less gene copies of TRY6 in GD patients. Nevertheless, in our present study, neither TRY6 CNV nor rs13230029 locus was found to be associated with GD. Furthermore, we analyzed the correlation of the clinical subsets with the TRY6 CNV and SNP allele frequencies, and no statistical significance was found in comparison with each clinical phenotype. Therefore, the TRY6 copy number and neighboring SNP both showed no link with GD, suggesting that TRY6 gene probably does not affect the susceptibility to GD.
Obviously, our research is only a preliminary and important exploration into copy number variation of immune-related genes, namely CFH, CFHR1, KIAA0125, UGT2B15, UGT2B17, CCL3L1, and TRY6, in GD. In spite of our negative results, there are investigations reported that copy number variants of C4, [4] TSHR, and TLR7, [5] but not FCGR3B, [20] CD40 and CTLA4, [24] confer risk to GD. Therefore, in the future more genetic, cellular and molecular experiments in depth are warranted to validate our results.

Conclusions
To summarize, our results for the first time indicated the copy number variations of genes, namely CFH, CFHR1, KIAA0125, UGT2B15, UGT2B17, TRY6, and CCL3L1, are not associated with the risk for GD.