Fine mapping of genes within the IDDM8 region in rheumatoid arthritis

The IDDM8 region on chromosome 6q27, first identified as a susceptibility locus for type 1 diabetes, has previously been linked and associated with rheumatoid arthritis (RA). The region contains a number of potential candidate genes, including programmed cell death 2 (PDCD2), proteasome subunit beta type 1 (PSMB1), delta-like ligand 1 (DLL-1) and TATA box-binding protein (TBP), amongst others. The aim of this study was to fine map the IDDM8 region on chromosome 6q27, focusing on the genes in the region, to identify polymorphisms that may contribute to susceptibility to RA and potentially to other autoimmune diseases. Validated single nucleotide polymorphisms (SNPs; n = 65) spanning the 330 kb IDDM8 region were selected from public databases and genotyped using Sequenom MassArray genotyping technology in two stages. In the test dataset, comprising 180 RA cases and 180 controls, 50 SNPs were tested for association with RA. SNPs showing significant association were then genotyped in a second dataset of 174 RA cases and 192 controls, and the two datasets were combined before analysis. Association analysis was performed using the chi-square test implemented in Stata software, and linkage disequilibrium and haplotype analyses were performed using HelixTree version 4.1. There was initial weak evidence of association with RA for a number of SNPs around the loc154449 putative gene and within the KIAA1838 gene; however, these associations were not significant in the combined dataset. Our study has failed to detect evidence of association of RA with any of the known genes mapping to the IDDM8 locus.


Introduction
Rheumatoid arthritis (RA; MIM#180300) is a systemic autoimmune disease characterized by chronic inflammation of the joint synovium. In common with other autoimmune diseases, such as type 1 diabetes (T1D; MIM#222100), systemic lupus erythematosus (SLE; MIM#152700) and autoimmune thyroid disease, it is a complex disease caused by both genetic and environmental factors. Various lines of evidence suggest that some of the genetic factors may be common to a number of autoimmune diseases. These include their shared pathophysiology and the co-occurrence of autoimmune diseases in families. In addition, meta-analyses of autoimmune disease whole genome screens show non-random clustering of disease susceptibility loci for a number of human autoimmune diseases and animal models of autoimmunity [1,2]. Recently, convincing proof of this hypothesis has been provided by the association of the missense single nucleotide polymorphism (SNP; rs2476601) in the protein tyrosine phosphatase N22 (PTPN22) gene with at least five autoimmune diseases: RA [3,4], SLE [5], autoimmune thyroid disease [6], T1D [7] and juvenile idiopathic arthritis [4].
We have, therefore, hypothesized that loci identified in one autoimmune disease are strong potential candidates in other related conditions. Of the autoimmune diseases that cluster within the same families as RA, T1D has been most thoroughly investigated for genetic susceptibility loci. The T1D susceptibility locus, denoted IDDM8, a region on chromosome 6q27 (Figure 1), spans approximately 200 kb and contains a number of potential candidate genes, including programmed cell death 2 (PDCD2), proteasome subunit beta type 1 (PSMB1), delta-like ligand 1 (DLL-1) and TATA box-binding protein (TBP), amongst others [8]. Interest in this region, in relation to RA, has stemmed from our previous work that revealed evidence for linkage and association of a microsatellite marker (D6S446) with RA in a dataset comprising RA affected sibling pair families and RA simplex families. An adjacent microsatellite, D6S1590, has also shown evidence of linkage and association with RA in the same families [9].
The aim of this present study was to fine map the IDDM8 region on chromosome 6q27. We have chosen to examine a 330 kb region spanning the IDDM8 region and have focused on the genes in this region to identify variants that may contribute to susceptibility to RA and potentially to other autoimmune diseases.

Materials and methods
Subjects
DNA was available for an initial RA dataset comprising 180 RA cases; these were combined with a further 174 RA cases to give a total RA dataset of 354 RA cases for the second stage analysis. The RA cases were obtained either from the ARC National Repository for families with RA or from clinics within the Greater Manchester area of Northern England. For patients obtained through the National Repository, only one affected case per family was selected at random for investigation. All RA cases had disease that satisfied the 1987 American College of Rheumatology criteria [10] modified for genetic studies [11]. Rheumatoid factor (RF) status was ascertained using a particle agglutination test, and a positive result was classified as a titre of 1 in 40 or greater. Of the RA cases used in this study, 75% were RF positive, 83% had erosive disease and the mean age-at-onset was 44.6 ± 14.6 years. HLA-DRB1 genotypes were determined using a commercially available semi-automated PCR-sequence specific oligonucleotide probe typing technique (INNO-LiPA; Abbott Laboratories, Maidenhead, UK). Of the RA cases, 16% had zero copies of the shared epitope, 47% had one copy and 34% had two copies (3% of cases were not HLA typed).
The initial RA case cohort was compared with a cohort of 180 population control individuals; this was combined with a second cohort of 192 population control individuals to give a total control dataset of 372 controls for the second stage analysis. Population control subjects were recruited from blood donors and from General Practice registers.
All patients and controls were of UK Caucasoid ethnic origin, were recruited with ethical committee approval and provided informed consent.

SNP selection
Over the 330 kb region of IDDM8 on chromosome 6q27, frequency validated SNPs were selected from public databases, including NCBI [12] and HapMap (CEPH population) [13], using a gene-focused approach. The genes in the region are loc154449, loc401289, DLL1, KIAA1838, loc401290, PSMB1, TBP and PDCD2. Information on linkage disequilibrium across the region was obtained from HapMap and, where genes fell within haplotype blocks, haplotype tagging SNPs were selected to reduce the total number of SNPs required for genotyping. All SNPs within coding regions or with any potential function were also prioritized for genotyping. In total, 65 SNPs were selected for genotyping. Polymorphisms were mapped to the UCSC genome browser [14] May 2004 human reference sequence based on NCBI build 35. Details of primer and probe sequences are available on request.

Genotyping protocol
SNPs were genotyped using Sequenom MassArray genotyping technology, according to manufacturer's instructions, whereby the genomic sequence containing the SNP is amplified by PCR [15]. The amplified product is cleaned using shrimp alkaline phosphatase to neutralize any unincorporated dNTPs. This is followed by the homogeneous MassEXTEND process. This process utilizes a primer that anneals to the genomic amplification product immediately adjacent to the SNP site and is extended to generate SNP-specified DNA products of different length with predictable masses that can be resolved easily by mass spectrometry. Following the MassEXTEND reaction, SpectroCLEAN resin is added to the reaction mixture to remove extraneous salts that could interfere with MALDI-TOF mass spectrometry. The reaction mixture is then spotted onto a SpectroCHIP microarray and subjected to the MALDI-TOF mass spectrometry. SpectroTYPER software identifies the SNP-specific peaks according to their expected masses and automatically assigns the genotype calls.
Figure 1. A schematic diagram of the IDDM8 region. The genes are shown in blue boxes, arrows denote position of the microsatellite markers associated in the Myerscough and colleagues study [9] and blue circles denote the single nucleotide polymorphisms.

Statistical analysis
All SNPs were tested for Hardy-Weinberg equilibrium in cases and controls. Association of the IDDM8 SNPs was tested using the chi-squared test implemented in Stata (Stata, College Station, TX, USA).
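The two tests named above can be illustrated with a minimal Python sketch. It is not the Stata code used in the study, which is not given in the source: the function names are hypothetical, any genotype counts passed in are illustrative rather than study data, and the p values rely on the identity that a chi-squared statistic with 1 degree of freedom has upper-tail probability erfc(sqrt(x/2)).

```python
import math


def chi2_sf_1df(x):
    """Upper-tail p value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_test(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test for Hardy-Weinberg proportions.

    Three genotype classes minus one estimated allele frequency minus one
    leaves 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return stat, chi2_sf_1df(stat)


def allelic_test(case_counts, control_counts):
    """2x2 chi-squared test on allele counts.

    Each argument is a tuple of genotype counts (AA, AB, BB). The SNP is
    assumed to be polymorphic, otherwise a table margin is zero.
    """
    def alleles(g):
        return 2 * g[0] + g[1], 2 * g[2] + g[1]

    a, b = alleles(case_counts)      # A and B alleles in cases
    c, d = alleles(control_counts)   # A and B alleles in controls
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, chi2_sf_1df(stat)
```

A call such as `allelic_test((30, 90, 60), (20, 80, 80))` returns the test statistic and its p value for those made-up counts. A genotypic test would instead compare the three genotype classes directly, with 2 degrees of freedom.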
Pairwise linkage disequilibrium (LD) measures of D' and LD correlation coefficient r² were calculated and plotted on a graph. Two- and three-marker haplotypic associations using a moving window approach were investigated using haplotype trend regression implemented in HelixTree™ version 4.1 (Golden Helix Inc., Bozeman, Montana, USA). Haplotypes were inferred using the expectation-maximization algorithm.
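As an illustration of the quantities involved, the sketch below estimates two-SNP haplotype frequencies by expectation-maximization and derives D' and r² from them. It is a textbook two-locus version written for this purpose, with hypothetical function names, and should not be read as the HelixTree implementation, whose internals are not described in the source.

```python
def em_haplotype_freqs(g, n_iter=100):
    """Estimate two-SNP haplotype frequencies by expectation-maximization.

    g[i][j] is the number of subjects carrying i copies of allele A at SNP 1
    and j copies of allele B at SNP 2 (i, j in 0..2).
    Returns the frequencies of haplotypes (AB, Ab, aB, ab).
    """
    n_chrom = 2 * sum(sum(row) for row in g)
    # Haplotype counts from genotypes whose phase is unambiguous.
    base = [
        2 * g[2][2] + g[2][1] + g[1][2],  # AB
        2 * g[2][0] + g[2][1] + g[1][0],  # Ab
        2 * g[0][2] + g[1][2] + g[0][1],  # aB
        2 * g[0][0] + g[1][0] + g[0][1],  # ab
    ]
    dh = g[1][1]  # double heterozygotes: AB/ab or Ab/aB, phase unknown
    f = [0.25, 0.25, 0.25, 0.25]
    for _ in range(n_iter):
        # E-step: expected share of double heterozygotes that are AB/ab.
        cis, trans = f[0] * f[3], f[1] * f[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from the completed counts.
        counts = [base[0] + dh * w, base[1] + dh * (1 - w),
                  base[2] + dh * (1 - w), base[3] + dh * w]
        f = [c / n_chrom for c in counts]
    return f


def ld_measures(f):
    """Return (D', r squared) from haplotype frequencies (AB, Ab, aB, ab)."""
    p_ab, p_a, p_b = f[0], f[0] + f[1], f[0] + f[2]
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2
```

The sketch reports D' as an absolute value and uses a fixed number of iterations from a uniform starting point; a production implementation would normally test for convergence and may try several starting values, since EM can settle on a local optimum.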
The Tagger option in the program Haploview 3.2 [16] was used to determine how many of the SNPs in the specified HapMap region had been successfully tagged by the SNPs that have been genotyped.
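The coverage idea can be sketched as follows. This is not Haploview's Tagger algorithm but a simple pairwise check of which reference SNPs are captured by the genotyped set; the function name, the SNP identifiers in any example call and the r² threshold of 0.8 are assumptions, as the source does not state the threshold used.

```python
def tagged_fraction(r2, genotyped, reference, threshold=0.8):
    """Fraction of reference SNPs captured by at least one genotyped SNP.

    r2 maps frozenset({snp_x, snp_y}) to the pairwise r squared value.
    A reference SNP counts as captured if it was genotyped itself or if its
    r squared with some genotyped SNP is at least `threshold` (an assumed
    cut-off, not one taken from the source).
    """
    captured = []
    for snp in reference:
        if snp in genotyped or any(
            r2.get(frozenset((snp, tag)), 0.0) >= threshold
            for tag in genotyped
        ):
            captured.append(snp)
    return len(captured) / len(reference), captured
```

With a reference list of three hypothetical SNPs, one of them genotyped and one in strong LD with it, the function reports that two of the three are captured.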

Results
Association analysis of IDDM8 SNPs
We excluded 15 SNPs from subsequent analyses because they were non-polymorphic in this study, had a call rate of <80% or showed deviation from Hardy-Weinberg equilibrium expectations in the control population (p < 0.001). For stage 1, the remaining 50 SNPs were analyzed for association with RA (Figure 1).
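The three exclusion criteria can be written as a simple filter. The sketch below is illustrative only: the function and parameter names are hypothetical, and the Hardy-Weinberg p value is taken as an input rather than computed.

```python
def passes_qc(n_called, n_total, minor_allele_count, hwe_p_controls,
              min_call_rate=0.80, hwe_p_threshold=0.001):
    """Return True if a SNP passes the three exclusion criteria in the text.

    n_called / n_total is the genotype call rate, minor_allele_count is the
    number of minor alleles observed, and hwe_p_controls is the
    Hardy-Weinberg p value in the control population.
    """
    if minor_allele_count == 0:             # non-polymorphic in this study
        return False
    if n_called / n_total < min_call_rate:  # call rate below 80%
        return False
    if hwe_p_controls < hwe_p_threshold:    # deviation from HWE in controls
        return False
    return True
```

Applied to the 65 selected SNPs, a filter of this kind would remove the 15 excluded SNPs and leave the 50 analyzed in stage 1.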
Following single marker analysis of the test dataset, one SNP mapping close to loc154449 showed a trend towards allelic association (rs11752069, p = 0.06) with RA and significant genotypic association with RA (p = 0.05). A second SNP mapping within the KIAA1838 gene was significantly associated with RA (rs910424, allelic association p = 0.012), whilst three other SNPs in KIAA1838 showed a trend towards association (p < 0.1) (Table 1).
Data from the HapMap suggest there is variable LD across the IDDM8 region. We therefore carried out two- and three-marker haplotype analysis for SNPs across the genes using HelixTree™ version 4.1 to see if evidence of association was stronger in two- or three-marker haplotypes. In the stage 1 dataset, analysis using haplotype trend regression in HelixTree™ showed evidence of association with RA of a number of two- and three-marker haplotypes within the KIAA1838 gene (Table 2). There was borderline significant association of a two-marker T_T haplotype (SNPs rs910425_rs910424) and a three-marker T_T_A haplotype (SNPs rs910425_rs910424_rs2881062) with RA (p values of 0.07 and 0.09, respectively). We therefore went on to genotype these SNPs, and other SNPs in the gene, in a second set of cases and controls. The stage 2 genotypes were combined with the data from stage 1 for the association analysis. Two SNPs (rs2024694 and rs958998) showed deviation from Hardy-Weinberg equilibrium in the combined controls and were not analyzed any further. In the combined analysis, no SNPs or haplotypes were significant at a p value < 0.05 (Tables 3 and 4).

Evaluation of SNP coverage
Pairwise LD between all markers examined was calculated and plotted on a graph for the total region studied (Figure 2). A block of strong LD can be found at the distal end of the IDDM8 region, spanning approximately 55 kb from SNP rs1474554 to SNP rs734249. The genes PSMB1, TBP and PDCD2 map within this block. Figure 1 shows a schematic diagram of the region studied with the 50 intragenic SNPs plotted across the chromosomal region.

Discussion
Linkage to the IDDM8 region on chromosome 6q27 was originally identified in the first whole genome screen in T1D [17] and the region has also been linked to multiple sclerosis [18] and SLE [19], supporting the hypothesis that it could harbor polymorphisms important in autoimmunity. Linkage disequilibrium mapping of the region in T1D narrowed down the region to the terminal 200 kb of chromosome 6q spanning the PDCD2-TBP-PSMB1 gene complex [8]. Previous investigation of the region in RA found evidence of linkage and association to two microsatellite markers (D6S446 and D6S1590) [9].
In this study we have taken a SNP-based association mapping approach and selected a large number of SNPs spanning the known genes in this region. Despite initial weak evidence of association with RA of a number of SNPs around the loc154449 putative gene and within the KIAA1838 gene, there was no evidence of association with RA in the combined dataset and we conclude that there is no evidence to support association of polymorphisms in these genes with RA.
Information on LD across the region was obtained from the HapMap. Results from the software program Tagger suggest that the SNPs selected within the PDCD2-TBP-PSMB1 gene region and the loc401289 gene region capture all the HapMap SNPs within these regions, suggesting that adequate coverage of the region was achieved with the SNPs analyzed in this study.
The PDCD2-TBP-PSMB1 gene region was initially highlighted in the T1D study as likely to contain the susceptibility gene [8]. A recent study of the IDDM8 region in T1D, however, found no evidence of association, although they could not completely rule out the possibility that the putative IDDM8 locus exists elsewhere in this chromosomal region [20]. Other genes in the IDDM8 region include the KIAA1838 gene and, although the 10 SNPs within this gene captured 33 out of 38 SNPs identified on the HapMap, there is a possibility that variation across the region has not been completely captured and further SNPs would need to be genotyped before this locus can be confidently excluded for modest effect sizes.
Another possible reason for the failure to identify a susceptibility region in this study could be heterogeneity between the dataset used here and the dataset used in the previous analysis [9]. However, half of the cases used in this study were probands from the National Repository of RA cases that had been used in the previous study, and there were no differences in gender (p = 0.06) or severity of disease (as denoted by number of erosions; p = 0.55). There was, however, a significant difference in RF status between the two subgroups (p = 0.03); of the RA probands 86% had RF whilst the dataset used in this study had 76%.
Figure 2. Linkage disequilibrium plot of the IDDM8 region in controls. The plot shows both linkage disequilibrium correlation and D' as measures of linkage disequilibrium across the IDDM8 region.
The total dataset analyzed in this study (354 cases and 372 controls) had the power to detect an odds ratio greater than 1.6; therefore, if the IDDM8 region conferred a risk similar to that of PTPN22 in RA (odds ratio = 1.8), we would have had 80% power to detect it at a significance level of p = 0.05. However, for smaller effect sizes, such as that of CTLA4 in T1D (odds ratio = 1.14), our study would have been underpowered.
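How power depends on the odds ratio can be illustrated with a normal-approximation sketch for a two-sided allelic test. It is not the calculation used in the study, whose method and assumed allele frequency are not reported; the function name and the control allele frequency passed in are assumptions, and the result is sensitive to that frequency.

```python
from statistics import NormalDist


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation).

    control_freq is the risk-allele frequency in controls, an assumed input
    that is not given in the source.
    """
    # Allele frequency in cases implied by the allelic odds ratio.
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    case_freq = odds / (1.0 + odds)
    m1, m0 = 2 * n_cases, 2 * n_controls  # alleles, not subjects
    pooled = (m1 * case_freq + m0 * control_freq) / (m1 + m0)
    se_null = (pooled * (1 - pooled) * (1 / m1 + 1 / m0)) ** 0.5
    se_alt = (case_freq * (1 - case_freq) / m1
              + control_freq * (1 - control_freq) / m0) ** 0.5
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2) * se_null  # critical frequency difference
    diff = case_freq - control_freq
    # Probability of exceeding the critical value in either tail.
    return (1 - z.cdf((crit - diff) / se_alt)) + z.cdf((-crit - diff) / se_alt)
```

Calling `allelic_power(354, 372, f, 1.8)` and `allelic_power(354, 372, f, 1.14)` for a chosen control frequency `f` shows the contrast described above: power is much higher for an effect the size of PTPN22 than for one the size of CTLA4.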
Our study has failed to detect evidence of association with any of the known genes mapping to the IDDM8 locus, a region we had identified as a candidate autoimmune locus common to RA, T1D and SLE. It is possible that the limits of the region defined by earlier T1D studies have, in fact, failed to encompass the RA susceptibility gene that gave rise to evidence of linkage and association to microsatellite markers in our initial study, and future studies would need to focus on genes adjacent to those investigated here.

Conclusion
Our study has failed to detect evidence of association of RA with any of the known genes mapping to the IDDM8 locus.