Introduction

Autoimmune thyroid diseases (AITD), including Graves' disease (GD) and Hashimoto's thyroiditis, are common autoimmune diseases that develop as a result of environmental triggers in individuals with a genetic predisposition. Although a number of replicated genetic associations are emerging and providing insights into the underlying disease mechanisms, a significant component of the genetic contribution to AITD remains unknown.

In keeping with other common diseases, the identification of novel genetic variants conferring susceptibility has proved problematic, with genome-wide linkage analysis generally proving disappointing in identifying loci for AITD [1, 2]. Recent genome-wide association studies are, however, now beginning to reveal a number of novel genetic variants in many common diseases. The Wellcome Trust Case–Control Consortium (WTCCC) in the United Kingdom has recently completed two large association studies in 11 common diseases and reported a number of novel loci for most diseases [3, 4]. In the smaller of the two studies, the WTCCC investigated 5500 individuals, including 900 cases with GD, using a genome-wide set of 14 500 nonsynonymous coding single-nucleotide polymorphisms (nsSNPs). Although the strongest association signal was unsurprisingly identified in the human leukocyte antigen (HLA) region (P value <10⁻²⁰), association was also confirmed at the previously reported thyroid stimulating hormone receptor gene (TSHR) [5] and Fc receptor-like 3 gene (FCRL3) [4, 6], with a further nine novel regions showing some evidence of association with P value ≤10⁻⁴ [4]. The aim of this study was to replicate the associations identified in the nine novel regions in an independent UK collection of GD subjects and controls.

Materials and methods

Subjects

A total cohort of 2478 unrelated Caucasian GD patients of UK origin was recruited into the AITD UK National Collection as previously described [7]. Control samples totaling 3446 were obtained from the 1958 British Birth cohort (http://www.b58cgene.sgul.ac.uk). The WTCCC had previously genotyped 900 GD patients and 1500 control subjects from these data sets in the 14 500 nsSNP association scan [4]. In total, therefore, a further 1578 GD patients and 1946 controls, not previously included in the WTCCC nsSNP scan, were incorporated into this replication study. All subjects gave informed written consent and the project was approved by the local research ethics committee.

nsSNP genotyping

Nine novel nsSNP associations detected outside the HLA region that met a point-wise significance level of P≤10⁻⁴ in the original WTCCC nsSNP scan were genotyped in an independent collection of 3524 samples (see Table 1 for a list of all SNPs investigated). The FCRL and TSHR regions were not examined, as association with AITD had been previously reported [4, 5, 6]. A tenth SNP, rs3748140, located within the protein phosphatase 1 regulatory inhibitor subunit 3B gene (PPP1R3B), also met the point-wise significance level of P≤10⁻⁴ but was excluded from further analysis as it was non-polymorphic in the replication study.

Table 1 nsSNP genotyping of nine novel possible regions of association with GD in the original WTCCC collection (900 GD; 1500 CO), the independent replication collection (1578 GD; 1946 CO) and the complete collection (2478 GD; 3446 CO)

Tag SNP genotyping

To provide greater coverage of selected candidate genes in which nsSNPs showed some evidence for association in the combined WTCCC and replication data sets, we also typed a number of Tag SNPs. A total of 29 Tag SNPs were selected (excluding the initially typed nsSNPs within each gene region) to take into account all remaining known common variation within HDLBP, TEKT1, JSRP1 and UTX (see Table 2 for details of Tag selection). These Tag SNPs were genotyped in the complete GD collection of 2478 samples. The control group, however, was reduced to 2690 to ensure geographical matching of cases and controls (756 controls originally used as part of the WTCCC were not screened); this set is referred to as the geographically matched complete group. Assays for all of the above SNPs were purchased from Applied Biosystems, Warrington, UK, and genotyped on an ABI 7900HT using TaqMan (Applied Biosystems) genotyping technologies.

Table 2 Tag SNP genotyping of HDLBP, TEKT1, JSRP1 and UTX in a case–control collection of 2478 patients with GD and 2690 control subjects

Statistical analysis

All SNPs in HDLBP, TEKT1 and JSRP1 were in Hardy–Weinberg equilibrium (HWE) in cases and controls except rs10185319 (HDLBP) (control HWE P=0.02). As UTX is located on the X chromosome, HWE was assessed in Haploview Version 3.2 (http://www.broad.mit.edu/mpg/haploview); the rs6611065 SNP deviated strongly from HWE (P=8.64 × 10⁻¹⁵) and was excluded from further analysis. Allelic and genotypic analysis of case–control data was performed using the χ²-test within the MINITAB statistical package (MINITAB Release 15.1.2, © 1972–2007, Minitab Inc., State College, PA, USA). The UTX SNPs were analyzed within Haploview Version 3.2, which can analyze hemizygous males. Odds ratios (OR) with 95% confidence intervals (CI) were calculated by the method of Woolf with Haldane's modification for small numbers where appropriate [8]. Using the ORs and minor allele frequencies (MAF) generated from the WTCCC nsSNP study, power calculations showed that the replication collection (1578 GD and 1946 controls) had between 93 and 100% power to detect the size of effect reported from the original data set (900 GD and 1500 controls) for all loci except TEKT1.
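The tests named in this section can be illustrated with a minimal Python sketch. It is not the MINITAB or Haploview implementation used in the study. The function names are hypothetical, the HWE test shown applies to autosomal SNPs only (not to X-linked UTX), the χ² statistic is computed without a continuity correction, and Haldane's modification is assumed to mean adding 0.5 to every cell whenever any cell is zero.

```python
import math


def chi2_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium (autosomal SNP only)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2.0 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic SNP: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_p_value_1df(chi2)


def allelic_chi2(a, b, c, d):
    """Pearson chi-square for a 2x2 allele-count table.

    a, b = minor and major allele counts in cases;
    c, d = minor and major allele counts in controls.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, chi2_p_value_1df(chi2)


def woolf_odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (logit) confidence interval.

    Assumption: Haldane's modification is applied by adding 0.5 to all
    four cells when any cell is zero.
    """
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Calling `woolf_odds_ratio(a, b, c, d)` with the four allele counts of a case–control table returns the OR and its 95% CI; a CI that excludes 1.0 corresponds to nominal significance at P<0.05.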

Results

nsSNP genotyping results in the independent and complete data sets

None of the nine novel candidate nsSNPs detected within the original WTCCC study were found to be associated with GD in the independent replication collection of 1578 GD cases and 1946 controls (P=0.215–0.841) (Table 1). In the combined collection (consisting of the WTCCC and the independent replication samples, totaling 2478 GD cases and 3446 controls), no significant differences in allele or genotype frequencies of the rs10916769 (VWA5B1), rs1047911 (MRPL53), rs1048101 (ADRA1A) and rs2856966 (ADCYAP1) SNPs were observed between GD cases and controls (see Table 1). However, minor differences in genotype frequencies between GD cases and controls persisted for the rs7578199 (HDLBP), rs2271233 (TEKT1), rs7250822 (JSRP1) and rs2230018 (UTX) SNPs, producing ORs of 1.11 (95% CI=1.02–1.22), 0.84 (95% CI=0.73–0.96), 1.50 (95% CI=1.18–1.90) and 1.21 (95% CI=1.06–1.38), respectively. Differences in allele frequencies were also observed between GD cases and controls for rs7975069 (ZNF268), although no association between genotypes and GD was observed (P=0.060).

On the basis of the weak association found for HDLBP, TEKT1, JSRP1 and UTX in the combined collection, we then subjected the remainder of these gene regions to Tag SNP screening to capture the majority of the common variation and to determine whether further GD associations existed within these regions.

Tag SNP genotyping results

None of the Tag SNPs for JSRP1 or UTX showed evidence of association with GD in the geographically matched combined collection of 5168 samples (2478 GD cases and 2690 controls) (Table 2). Of the 14 Tag SNPs for HDLBP, two (rs6437249 and rs11680329) showed some evidence for association with GD, producing ORs of 0.91 (95% CI=0.84–0.99) and 1.11 (95% CI=1.00–1.22), respectively. Of the six Tag SNPs for TEKT1, two (rs4796561 and rs4796356) also showed some evidence for association with GD, producing ORs of 0.88 (95% CI=0.81–0.96) and 1.12 (95% CI=1.03–1.23), respectively. A third SNP, rs8078571, showed differences in allele frequency but no evidence of genotypic association (P=0.104).

Discussion

In this study, we have failed to replicate, in an adequately powered independent data set, the association of nine novel nsSNPs previously reported to be weakly associated with GD in the WTCCC nsSNP study. Although none of the original nsSNPs were found to be associated with GD in the replication study, nsSNPs in HDLBP, TEKT1, JSRP1 and UTX remained weakly associated (P<0.05) in the complete collection of 5924 UK GD cases and controls. Tag SNPs in HDLBP and TEKT1, not originally typed in the nsSNP study, also provide a weak signal for association in the complete geographically matched collection.

A number of significant replicated associations have previously been reported for GD: the HLA class I and II regions [9, 10], cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) [11], protein tyrosine phosphatase non-receptor type 22 (PTPN22) [12, 13] and TSHR [5], with ORs for the development of disease ranging from 1.50 to 3.00. Weaker replicated effects are also likely to be conferred by the interleukin 2 receptor alpha gene (IL2RA) [14], CD40 [15, 16] and FCRL [4], with lesser ORs of 1.10–1.30. Other loci in Caucasian subjects have been reported, although findings are less consistent (thyroglobulin [17, 18]) or require further replication (including PTPN2 and CD226 [19]). To help identify further susceptibility loci, large-scale genome-wide association studies (comprising >500 000 SNPs) have been conducted and are at last beginning to deliver novel susceptibility loci for a number of common diseases, including the autoimmune diseases. The most comprehensive association study published in GD was conducted by the WTCCC and included 14 500 nsSNPs in 900 GD index cases and 1500 controls [4]. Reassuringly, association with disease was replicated for the previously identified loci within the HLA region, FCRL3/5 and the TSHR. Other previously identified loci, including CTLA-4 and PTPN22, were not detected as these genes were not covered in the nsSNP study, which used a custom-made Infinium array (Illumina, San Diego, CA, USA) based largely on experimentally validated nsSNPs with a MAF>1% in western European samples [4]. In addition to HLA, FCRL3/5 and the TSHR, nine novel regions were also reported to be weakly associated with GD at a significance level of only P≤10⁻⁴. The lack of replication reported in the current independent collection of GD index cases and controls vindicates the caution exercised over the interpretation of the weak associations in the original WTCCC nsSNP study.

The replication collection of UK GD cases and controls used in this study was adequately powered to detect the size of effect observed in the WTCCC nsSNP study for all loci except TEKT1. Moreover, for the majority of SNPs tested we had >80% power in the replication collection to detect an OR of 1.13–1.18 or higher. Clearly, however, we cannot exclude smaller effects at these loci, which may be contributing to the overall genetic architecture of GD. Studies in other common autoimmune diseases have revealed loci conferring ORs for disease risk of <1.20 and have indicated the size of cohort required to detect such effects. In type 1 diabetes, for example, two different genome-wide studies, each using approximately 4000 cases and controls, found loci with small effects, including BACH2 and CTSH, but convincing genome-wide statistical significance was only achieved when these effects were tested in 12 971 cases and controls [20].
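How power depends on effect size, allele frequency and cohort size can be sketched with a normal approximation to a two-sided allelic test. This is an illustration under stated assumptions, not the power calculation performed in the study. The function names are hypothetical, the OR is treated as a per-allele effect, and each subject contributes two independent alleles.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_maf, odds_ratio):
    """Minor allele frequency expected in cases for a given allelic OR."""
    odds = odds_ratio * control_maf / (1.0 - control_maf)
    return odds / (1.0 + odds)


def allelic_power(n_cases, n_controls, control_maf, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    p0 = control_maf
    p1 = case_allele_freq(control_maf, odds_ratio)
    # Assumption: two independent alleles per subject.
    n1, n0 = 2 * n_cases, 2 * n_controls
    se = sqrt(p1 * (1.0 - p1) / n1 + p0 * (1.0 - p0) / n0)
    z_effect = abs(p1 - p0) / se
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    # Probability of exceeding the critical value in either tail.
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)
```

For example, `allelic_power(1578, 1946, 0.30, 1.15)` evaluates the replication collection for a hypothetical SNP with a control MAF of 0.30 and an OR of 1.15; the MAF here is illustrative and not taken from Table 1. Lowering the OR towards 1.0 or the MAF towards 0 shows how quickly power falls for the small effects discussed above.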

The replication and extension data presented in this study, based on 5924 samples and representing the largest GD association study to date, serve to highlight the importance of appropriate levels of statistical significance and careful data interpretation. Our study also highlights the need for collaborative efforts to produce large collections of DNA for genome-wide screening and replication in common diseases such as GD, in which individual susceptibility loci are likely to be exerting small effects.