Introduction

Autosomal recessive congenital ichthyosis (ARCI) is a clinically and genetically heterogeneous group of disorders that affect the keratinisation process of the skin (Williams and Elias 1985). ARCI can be subdivided into at least four groups (I–IV) by electron microscopy (Anton-Lamprecht 1992). So far, eight loci have been associated with ARCI (chromosomes 2q34, 3p21, 5q33, 9q33-34, 14q11, 17p13, 19p12-q12 and 19p13), and a corresponding disease gene has been identified in six of these regions (Russell et al. 1994; Parmentier et al. 1996; Fischer et al. 2000; Virolainen et al. 2000; Krebsova et al. 2001; Jobard et al. 2002; Klar et al. 2004; Lefevre et al. 2004, 2006).

Ichthyosis prematurity syndrome (IPS), also known as ichthyosis congenita type IV (Anton-Lamprecht 1992), is a distinct form of ARCI. IPS is characterised by premature birth of the affected child, usually at gestational weeks 30–32, with a thick desquamating epidermis. Asphyxia is a frequent complication, probably due to aspiration of amnion debris. Soon after the critical neonatal period the child’s health improves spontaneously. No developmental abnormalities are present, and the skin symptoms become minor, but a slightly thickened epidermis and follicular hyperkeratosis remain into adulthood. IPS is very rare, except in a region of Norway and Sweden where the estimated heterozygote carrier frequency is 1 in 50 (Gedde-Dahl et al. 1999). This indicates an ancient founder mutation for IPS in the area. IPS has been reported in three families outside Norway and Sweden, of which two are from Finland and one from Italy (Niemi et al. 1993; Brusasco et al. 1997). We recently reported linkage of IPS to chromosome 9q33.3–34.13 (Klar et al. 2004), and a shared ancestral haplotype was expected among a majority of the affected individuals because of the geographical clustering. In this study we present the refinement of the IPS locus, using a highly dense marker haplotype. From the results we calculated the age of an ancient founder mutation and analysed specific candidate genes within a 76 kb core region (Hastbacka et al. 1994; Engert et al. 2000; Joensuu et al. 2001).

Materials and methods

Participants

Blood samples were collected from 22 families with at least one member affected with IPS. Siblings were available for 19 out of the 22 families, of which 13 had been previously reported with linkage to chromosome 9q34 (Klar et al. 2004). The six new families were added to expand the previous linkage study, and all 22 families were included in the haplotype analysis. Altogether, 28 affected and 22 healthy siblings were available for analysis. A majority of the families originated from a defined region in middle Norway and Sweden (Fig. 1). None of the families was known to be related, and no consanguinity could be ascertained. Affected families were clinically examined (Norwegian families by T.G.-D., Swedish families by A.V. and a Danish family by F.B. and A.B.), and ultrastructure of skin biopsies was determined by electron microscopy. All affected family members presented with a phenotype consistent with IPS, including premature birth. Informed consent was obtained from all participants, and the study was approved by the ethics committee at Uppsala University.

Fig. 1

Map of Norway and Sweden. The shaded portion shows the geographical area from which the majority of the IPS families originated. IPS has the highest prevalence in the city of Trondheim

Genotyping

DNA was extracted from whole blood, and genotyping was performed using fluorescence-labelled microsatellite markers. Genotype data for 15 markers at the IPS locus were available from the previous linkage study, and an additional 22 polymorphic repeats, of which 12 are novel, were used as markers in the linked interval between marker loci D9S250 and D9S63 (Klar et al. 2004). We amplified the markers by polymerase chain reaction (PCR) and incorporated fluorescent labels, as previously described (Klar et al. 2004). Genotypes for four single nucleotide polymorphisms (SNPs) were obtained through sequencing, as described elsewhere (Entesarian et al. 2005). Altogether, 41 marker loci, spaced at a mean distance of 150 kb, were included in the analysis. Additional SNPs were genotyped in families IR71 and SV76 in order to refine the minimal shared haplotype. Primer sequences and allele frequencies for the novel markers are available upon request.

Linkage and haplotype analysis

Two-point LOD score calculations were made using the MLINK program of the LINKAGE software package (version 5.1) (Lathrop and Lalouel 1984). We assumed autosomal recessive inheritance, equal male and female recombination rates and full penetrance. The disease allele frequency was set to 1 per 150,000 because a proportion of the obligate carriers were born in geographical areas with a reduced incidence of IPS.
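As a simplified illustration of what a two-point LOD score measures, the sketch below computes the score for phase-known, fully informative meioses. It is not the MLINK algorithm, which evaluates full pedigree likelihoods under the recessive model described above; the function name and the example counts are hypothetical.

```python
import math


def two_point_lod(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """LOD score for phase-known, fully informative meioses at recombination fraction theta."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("n_recombinants must lie in [0, n_meioses]")
    non_recombinants = n_meioses - n_recombinants
    # Likelihood under linkage: theta^r * (1 - theta)^(n - r)
    likelihood_linked = (theta ** n_recombinants) * ((1.0 - theta) ** non_recombinants)
    if likelihood_linked == 0.0:
        # An observed recombinant excludes theta = 0
        return float("-inf")
    # Likelihood under free recombination: theta = 0.5 for every meiosis
    likelihood_unlinked = 0.5 ** n_meioses
    return math.log10(likelihood_linked / likelihood_unlinked)


# Hypothetical example: 10 informative meioses, no recombinants, theta = 0
lod = two_point_lod(n_meioses=10, n_recombinants=0, theta=0.0)
print(round(lod, 2))
```

Under these simplifying assumptions each non-recombinant meiosis contributes log10(2), or about 0.3, to the score at θ=0, which is why cumulative scores across many small families are needed to reach significance.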

Haplotypes were drawn using Cyrillic (version 2.1.3) (Cherwell Scientific Publishing, Oxford, UK) and “affected” and “unaffected” control haplotypes were constructed for the association studies using sizes of marker alleles in affected children and their parents for genetically linked markers. The aggregation of families in a distinct region suggests a high local carrier frequency, and unaffected parental haplotypes were used as controls. This corresponds to the transmission disequilibrium test. Associations between allele sizes and the disease phenotype were calculated with Fisher’s exact test and P excess (Hastbacka et al. 1994).
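The allelic association test can be sketched as a two-sided Fisher's exact test on a 2×2 table of chromosome counts (affected or control, carrying the allele or not). This is a minimal stand-alone sketch rather than the software used in the study; the function name and the counts in the example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows are affected and control chromosomes,
# columns are chromosomes with and without the associated allele.
p_value = fisher_exact_two_sided(20, 2, 5, 17)
print(p_value)
```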

The age of a major founder mutation for IPS was estimated from the mean length of the disease haplotype identified in affected individuals. The expected length is 2/g morgans, where g is the number of generations since the mutation was introduced (Boehnke 1994). This relationship has previously been shown to be consistent with empirical data (Engert et al. 2000) and can therefore be used to estimate the age of the mutation.
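The relationship between haplotype length and mutation age can be inverted as g = 2/L, with L in morgans. The sketch below applies it to the mean haplotype length (2.56 cM) and the generation time (25 years) given in the Results; the function names are illustrative only.

```python
def generations_from_haplotype_length(mean_length_cm: float) -> float:
    """Invert L = 2/g (L in morgans) to estimate generations since the founder mutation."""
    if mean_length_cm <= 0:
        raise ValueError("haplotype length must be positive")
    length_morgans = mean_length_cm / 100.0  # 1 morgan = 100 cM
    return 2.0 / length_morgans


def age_in_years(generations: float, years_per_generation: float) -> float:
    """Convert a number of generations into calendar years."""
    return generations * years_per_generation


g = generations_from_haplotype_length(2.56)
print(round(g))                     # 78
print(age_in_years(round(g), 25))   # 1950
```

The result of 1,950 years corresponds to the rounded figure of approximately 1,900 years quoted in the text.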

Mutation and expression analysis

Sequencing analysis of all exons and flanking intron regions of TBC1D13, ENDOG, C9Orf114 and CCBL1, located within the critical interval, was performed on three affected individuals (IR61, IR125 and SV76). Expressed sequence tags (ESTs) and mRNA sequences were used to identify new putative exons for the previously known transcripts within the region and to predict new, independent, transcribed sequences. This was done by aligning all human ESTs and mRNA sequences to the genomic assembly (NCBI, build 35.1) using BLAT (Kent 2002). The genomic alignments were converted into GFF files and manually inspected in the Apollo browser (Lewis et al. 2002) for identification of novel expressed transcripts.

The ECR browser was used to identify additional candidate regions of possible functional importance from the degree of conservation between human, dog, mouse and rat (Ovcharenko et al. 2004). Regions with a conservation level >65% over 100 bp were selected. Conserved sequences and regions with homology to expressed ESTs were sequenced as described previously (Entesarian et al. 2005).
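The selection criterion (>65% identity over 100 bp) can be illustrated with a sliding window over a pairwise alignment. This is only a sketch of the threshold, not the ECR browser's own procedure; the function name, the treatment of gaps as mismatches and the toy sequences are assumptions.

```python
def conserved_windows(aln_a: str, aln_b: str, window: int = 100,
                      min_identity: float = 0.65) -> list[int]:
    """Return start positions of alignment windows whose identity exceeds min_identity."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where both sequences carry the same base; gap columns count as mismatches
    matches = [int(x == y and x != "-") for x, y in zip(aln_a.upper(), aln_b.upper())]
    hits: list[int] = []
    if len(matches) < window:
        return hits
    count = sum(matches[:window])
    for start in range(len(matches) - window + 1):
        if start > 0:
            # Slide the window one column to the right
            count += matches[start + window - 1] - matches[start - 1]
        if count / window > min_identity:
            hits.append(start)
    return hits


# Toy alignment: 120 identical columns give 21 qualifying 100-bp windows
human = "ACGT" * 30
mouse = "ACGT" * 30
print(len(conserved_windows(human, mouse)))
```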

Primary keratinocytes from three IPS patients homozygous for the haplotype (IR104, IR122 and IR125, Fig. 3) and three healthy controls were cultured in DMEM/F12 medium with a feeder layer of irradiated 3T3 cells. The cells were maintained at 37°C in a humidified atmosphere with 5% CO2. RNA was prepared using TRIzol (Invitrogen, San Diego, Calif., USA) according to the manufacturer’s protocol. The quality of the total RNA was controlled by capillary electrophoresis using a Bioanalyzer 2100 (Agilent Technologies, Palo Alto, Calif., USA). Only RNA with no sign of degradation was used. cDNA was prepared using 2 μg total RNA, 6 μl MMLV buffer (5×), 0.3 μl BSA (1 μg/μl), 0.25 μl HPRI (31 U/μl), 2 μl dNTP (10 mM), 1 μl random primers (25 μM) and 0.45 μl MMLV (200 U/μl) in a 30 μl reaction. The reaction mix was heated to 50°C without MMLV and cooled to room temperature, at which point MMLV was added. The mix was then incubated at 37°C for 1 h and finally at 80°C for 10 min.

Oligonucleotide primers positioned within the transcripts for TBC1 domain family member 13 (TBC1D13), endonuclease G (ENDOG), chromosome 9 open reading frame 114 (C9Orf114) and cysteine conjugate-beta lyase (CCBL1) were designed (supplementary Fig. 1). Exon splicing was investigated with PCR, spanning all known exon boundaries specific for each of the four transcripts (supplementary Fig. 1) using cDNA from one patient (IR122) and two controls. The Platinum SYBR Green qPCR SuperMix-UDG kit was used for quantitative real time PCR according to the manufacturer’s protocol (Invitrogen). cDNA levels of β-actin (ACTB) and GAPDH were used for normalisation. The PCR and detection were performed with the ABI PRISM 7000 Sequence Detection System (Applied Biosystems). All oligonucleotide primers were designed with the Primer3 program (http://www.frodo.wi.mit.edu/). PCR conditions used are available upon request.
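Normalisation against reference genes can be sketched with the common 2^(−ΔCt) approach. The exact calculation used in this study is not stated, so the method, the assumption of 100% amplification efficiency, the function name and the Ct values below are all illustrative assumptions.

```python
def normalised_expression(ct_target: float, ct_references: list[float]) -> float:
    """Relative quantity of a target transcript, 2^(-dCt), against reference genes."""
    if not ct_references:
        raise ValueError("at least one reference Ct value is required")
    # The arithmetic mean of Ct values equals the geometric mean of the quantities
    ct_reference_mean = sum(ct_references) / len(ct_references)
    delta_ct = ct_target - ct_reference_mean
    # Assumes the template doubles in every cycle (100% efficiency)
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: target gene 25.0, references ACTB 20.0 and GAPDH 22.0
print(normalised_expression(25.0, [20.0, 22.0]))  # 0.0625
```

Averaging the two reference genes reduces the influence of variation in either one, which is the usual motivation for using more than one normaliser.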

Results

Linkage analysis

Linkage analysis confirmed previous linkage of IPS to chromosome 9q33.3–34.13 (Klar et al. 2004). A maximum cumulative LOD score of 5.3 was obtained with marker D9S752 (θ=0) for 19 families. The centromeric recombination event is defined by marker D9S282, which further restricts the IPS locus in this study to a 5.8 Mb interval (approximately 6 cM) between markers D9S282 and D9S63.

Association and haplotype studies

The region between the closest flanking recombinations was further investigated with a total of 37 microsatellite markers, which were analysed with Fisher’s exact test in order to assess association of markers with the disease. This resulted in significant P values in two different regions, spanning 1.1 Mb and 2.0 Mb, respectively. Within these regions, allelic excess was calculated for each marker locus in order to identify a haplotype associated with the disease (Fig. 2). Three adjacent microsatellite markers (CAA13-ATTT10-D9S260) showed strong allelic excess (P excess≥0.9), which defined a possible core haplotype (marker alleles 2-2-1, respectively) in the region (Fig. 3). Three SNP markers (rs6478853, rs6478854 and rs11542344) confirmed this core haplotype. This six-marker haplotype was present on 91% of the disease-associated chromosomes and on 23% of the control chromosomes (Table 1). The distribution resulted in an allelic excess of 0.88. The six markers constituted the core of a common haplotype block (Table 1), spanning, on average, 2.56 cM (1.60 Mb) in the affected individuals (Fig. 3). Based on this average haplotype length in affected individuals, the age of the mutation in our population was estimated at 78 generations. With a mean generation time of 25 years (Genin et al. 2004), this implies that the IPS mutation was introduced into the population approximately 1,900 years ago.
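The reported allelic excess is consistent with the commonly used definition P excess = (P_affected − P_control)/(1 − P_control). That this exact form was applied here is an assumption, since the formula is cited rather than stated; the function name is illustrative.

```python
def p_excess(p_affected: float, p_control: float) -> float:
    """Allelic excess: (P_affected - P_control) / (1 - P_control)."""
    if not (0.0 <= p_affected <= 1.0 and 0.0 <= p_control < 1.0):
        raise ValueError("frequencies must be proportions, with p_control below 1")
    return (p_affected - p_control) / (1.0 - p_control)


# Six-marker haplotype: 91% of disease chromosomes, 23% of control chromosomes
print(round(p_excess(0.91, 0.23), 2))  # 0.88
```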

Fig. 2

Association results for microsatellite markers in the region restricted by recombination events. The Y-axis denotes 1-p for Fisher’s exact test and P excess for allelic excess. The X-axis denotes the physical position in Mb (NCBI, build 35.1). Black bars annotate regions showing allelic association spanning 1.1 Mb and 2.0 Mb, respectively. Allelic excess peaks are achieved for alleles of markers CAA13, ATTT10 and D9S260

Fig. 3

Haplotypes from a 1.8 Mb region on chromosome 9q34 in 22 independent IPS patients. A haplotype (shaded) can be constructed in the affected individuals around the markers rs6478853, rs6478854, CAA13, ATTT10, rs11542344 and D9S260. Positions, in megabases, are from the sequence tagged site map of GenBank (NCBI, build 35.1). Paternal haplotype is the first column of each individual and maternal haplotype in the second

Table 1 Distribution of combined haplotypes for D9S61-rs6478853-rs6478854-CAA13-ATTT10-rs11542344-D9S260-GAGT34

Three patients, IR71, SV76 and IR61, are heterozygous for the six-marker haplotype (Fig. 3). Patients IR71 and SV76 are homozygous for markers rs6478853, rs6478854, CAA13 and ATTT10, which suggests that the IPS mutation is restricted to this region. IR71 and SV76 are both heterozygous for rs10988141, which restricts the minimal shared haplotype towards the telomere (Fig. 3). The shared region is restricted towards the centromere by D9S61 in nine patients. The paternal haplotype of IR61 shares only a single SNP marker allele with the associated haplotype (Fig. 3). The minimal ancestral haplotype in all affected patients but IR61 restricts the IPS candidate region to a maximum of 76 kb. To date, four genes are known to be located in this region: TBC1D13, ENDOG, C9Orf114 and CCBL1.

Mutation and expression analysis

Genomic sequencing of DNA from three patients, IR61, IR125 and SV76, revealed no sequence alterations in the exons of the four genes TBC1D13, ENDOG, C9Orf114 and CCBL1. Twenty base pairs of intronic sequence flanking each exon were found to be normal when compared to public databases. Four evolutionarily conserved regions (ECRs), ranging from 100 bp to 271 bp, were extracted from the ECR database (Table 2) (Ovcharenko et al. 2004). Three new possible 5′UTRs were identified upstream of CCBL1 from the EST data, one of which coincided with an ECR. In addition, a region of approximately 700 bp in intron 1 of ENDOG was supported by several ESTs as an independent transcript (Table 2). These seven regions were all sequenced but showed no variations from sequences in the public databases.

Table 2 ECRs and sequences with homologies to ESTs within the 76 kb associated with IPS. The closest annotated gene is indicated

In search of aberrant splicing due to mutations in introns we performed PCR on cDNA using primers spanning all known exon boundaries in each of the four transcripts. The PCR products showed similar sizes when samples from patient and controls were compared (data not shown). The four genes are expressed in a wide variety of tissues, including keratinocytes, as verified by quantitative PCR on cDNA from cultured keratinocytes. The analysis revealed no significant differences in expression levels between patients and controls (Fig. 4). Student’s t-test for TBC1D13, ENDOG, C9Orf114 and CCBL1 yielded P values of 0.12, 0.48, 0.88 and 0.34, respectively.

Fig. 4

Quantitative real-time PCR. Normalised mRNA levels in cultured keratinocytes are shown for patients (grey, n=3) and controls (black, n=3) on a logarithmic scale. No significant differences in expression level between patients and controls were detected

Discussion

In this report we map the IPS locus by allelic and haplotype association to a 1.6 Mb region on chromosome 9q34. Within this interval we identified a core region, defined by six marker alleles, forming a haplotype shared by 91% of the IPS chromosomes analysed. This indicates a strong founder effect for IPS, and the age of the founder mutation is calculated to be approximately 1,900 years. Three patients show deviating haplotypes in the core region. Two of these patients are homozygous for four of the six markers, which suggests that the core haplotype is restricted to this interval. The third patient carries one deviating haplotype with a single marker allele shared with the core haplotype for IPS. There are several possible explanations for this: the shared segment on the deviating haplotype may be very small and therefore not detected at this marker resolution; mutations at marker loci may have altered the allele variants; or the patient may be compound heterozygous for two IPS mutations.

The size of a shared chromosomal region allowed us to calculate the age of the IPS founder mutation. The size of the shared segment is predominantly determined by the total number of meioses to a common ancestor. With the use of microsatellite markers it is important to consider their mutation rate, which is in the range of 10⁻⁴ to 10⁻² per locus and generation (Lai and Sun 2003; Ellegren 2004). In this example we estimated that an ancestral mutant chromosome has passed through 78 generations, and it is likely that mutations have occurred at some of the marker loci, which reduces their absolute frequency among the affected individuals. This may be one reason for IPS-associated haplotypes interrupted by single microsatellite markers (IR85 and IR107, Fig. 3). On the other hand, this observation could also be the result of recombinations between similar or identical haplotypes. The calculations of haplotype length are affected by the mutation rate of polymorphic alleles, the marker density and the background allele frequencies. As a consequence, this influences the estimated age of the IPS mutation in this population. From our results, the average length of the haplotype identified corresponds to a founder mutation with an age of approximately 1,900 years.
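The effect of marker mutation over 78 generations can be made concrete with a simple calculation. The sketch below assumes a constant mutation rate per generation, independence between loci and a single line of descent; these simplifications and the function names are assumptions made for illustration.

```python
def prob_marker_unchanged(mutation_rate: float, generations: int) -> float:
    """Probability that one marker allele is transmitted unmutated through all generations."""
    return (1.0 - mutation_rate) ** generations


def prob_haplotype_unchanged(mutation_rate: float, generations: int, n_markers: int) -> float:
    """Probability that every one of n independent markers remains unmutated."""
    return prob_marker_unchanged(mutation_rate, generations) ** n_markers


# Bounds of the quoted microsatellite mutation-rate range, over 78 generations
print(round(prob_marker_unchanged(1e-4, 78), 2))  # 0.99
print(round(prob_marker_unchanged(1e-2, 78), 2))  # 0.46

# The three core microsatellites (CAA13, ATTT10 and D9S260) at the upper rate
print(prob_haplotype_unchanged(1e-2, 78, 3))
```

At the upper end of the range, more than half of the descendant chromosomes would be expected to carry a mutated allele at any given microsatellite, so isolated deviating markers within an otherwise shared haplotype are not surprising.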

The core haplotype restricts a region of 76 kb containing four genes (TBC1D13, ENDOG, C9Orf114 and CCBL1), all shown to be expressed in keratinocytes. Endonuclease G is involved in DNA cleavage at a stage in mitochondrial DNA replication. It also has a function in chromosomal degradation during apoptosis (Cote and Ruiz-Carrillo 1993; Li et al. 2001). The cysteine conjugate-beta lyase is also known as kynurenine aminotransferase I and glutamine transaminase K, due to its functions in different reaction pathways (Cooper and Meister 1974; Okuno et al. 1996). Little is known about the functions of the TBC1D13 and C9Orf114 gene products. A majority of the genes found to be mutated in ARCI encode proteins involved in lipid transportation or modification, such as the 12(R)-lipoxygenase pathway (Jobard et al. 2002; Lefevre et al. 2006). No connections could be found between the TBC1D13, ENDOG, CCBL1 or C9Orf114 gene products and pathways previously shown to be involved in ARCI. We found no sequence alterations in the exons of the four genes, or in evolutionarily conserved or EST-supported regions within the 76 kb. No significant differences in transcript levels were found between patients and controls for TBC1D13, ENDOG, CCBL1 and C9Orf114, indicating normal expression of the genes in the IPS patients (Fig. 4). In addition, we found no indications of aberrant splicing, either by the analysis of intron sequences flanking each exon of the four genes or by cDNA PCR spanning all exon boundaries in each of the four transcripts (data not shown).

Possible explanations for the as yet unidentified IPS mutation include a larger rearrangement, such as an inversion, insertion or duplication, which mediates a position effect. The mutation may also affect a regulatory region controlling a gene outside the 76 kb. If conserved, a regulatory region could be smaller than the ECRs analysed in this study. A third possibility is that the mutation is located within a region encoding a small non-coding RNA. The functions of such RNAs include regulation of transcription, RNA processing and translation, and stability of mRNAs (Storz 2002). Small RNAs, such as microRNAs and snoRNAs, are generally well conserved across species, but since they are short (typically 21–25 nt and 60–300 nt, respectively) they may have escaped detection in our ECR and EST analysis (Pang et al. 2006). The 76 kb core region will be further investigated in search of structural changes and short functional motifs.

In this study we restrict the IPS mutation to a 76 kb region containing four candidate genes. Haplotype analysis confirms a strong founder effect for the disorder. Refinement of the IPS haplotype has facilitated prenatal diagnostics and carrier screening in families with affected members, and diagnosis will be further improved by the identification of the IPS gene mutation. Moreover, identification of the mutation may provide clues to mechanisms of normal skin formation and to factors initiating child delivery.