Introduction

Retinitis pigmentosa (RP) is the generic term given to a diverse set of inherited retinal degenerations characterised by loss of visual field, night blindness and the pigmentary deposits in the retina that give the condition its name. This heterogeneous condition can be inherited in autosomal dominant, autosomal recessive or X-linked recessive modes, with multiple loci accounting for each inheritance pattern (RetNet). One dominant form, designated RP9 (MIM:180104), was first identified by analysis of a nine-generation autosomal dominant RP (adRP) pedigree from the South of England,1 with linkage being established to chromosome 7p14.2 In contrast with other adRP loci, no other pedigrees have subsequently been reported linked to this locus. The phenotype in this family exhibits a wide variation in severity, with some obligate affected patients displaying no symptoms at all.3 The RP9 disease is classified as the regional form (or type-2), which refers to the patchy or ‘regional’ loss of both rod and cone function in the early stage of the disease.4 This is distinct from the diffuse form (type-1), in which there is dysfunction of the rod photoreceptors over an extended area of the retina prior to cell death, and which is characteristic of the disease arising from mutations in the rhodopsin gene.5

Subsequent to the original mapping of the disease, the locus was refined genetically to a 3 cM region between the polymorphic markers D7S690 (distal) and D7S484 (proximal),6 an interval which we now know contains 2 924 000 bp of DNA (NCBI genome build 24). New microsatellite markers were developed from genomic sequence generated at the University of Washington Genome Center. These allowed the locus to be further refined, using haplotype analysis, with cross-overs described previously,7 to an interval flanked by the new marker gs234f24ca1 (embl:G42103) at its proximal end and D7S690 distally (data not shown). This interval, which contains 1 994 500 bp of DNA, is located approximately 34 Mb from the 7p telomere (Figure 1).

Figure 1

The genomic structure of the RP9 gene and its chromosomal position relative to the adjacent duplicated gene. Two unrelated genes are located between the RP9 gene and its duplication, namely erythrocyte pyrimidine 5'-nucleotidase (AF151067) and FK506 binding protein (AF089745). A segment of the normal and mutated sequence is shown around each of the mutations and polymorphisms found in patients, with the site of each base change marked by an asterisk.

Relatively few genes are present in this interval, of which none appeared to be good candidates for a retinal disease on the basis of expression pattern or known function. However, one gene represented in DNA databases by an entry derived from a patent application (embl:AX016710) was selected for analysis because its sequence contained a repetitive poly(purine) tract coding for a poly(lysine) segment in the protein (Figure 2a). This selection was based on the hypothesis that expansion of the poly(lysine) domain could, in the manner of dynamic CAG repeat diseases, provide a mechanism for the variable presentation of the disease.8 An expression profile based on EST sequence representation suggested that the gene was transcribed in most tissues (UniGene: Hs.292057), though no EST sequences from retinal cDNA libraries are present in the human division of dbEST and only two in the mouse division. This human gene also had a previously identified mouse orthologue (embl:D78255), which was isolated in a yeast two-hybrid screen for proteins that interact with the Pim-1 oncogene.9 The predicted peptide sequence of the mouse orthologue is highly similar to the human protein, differing at only 15 of 221 residues (93% identity), of which eight are conservative substitutions. Apart from these minor differences, the human sequence is characterised by an eight-residue insertion (relative to the mouse protein) three residues from the amino terminus (Figure 2a).

Figure 2

(a) The predicted protein sequence of the RP9 gene, with the two mutated residues (H137L and D170G) and the K210R polymorphism highlighted with a black background. The boxed residues between 95 and 120 encompass the motif shared by the RP9 protein and the CIR-like proteins (CIR: embl:AF098297). The boxed serine residues near the carboxy terminus indicate the sites (in mouse) of phosphorylation by the PIM1 kinase. Residues 4–11 are underlined, indicating the insertion in the human sequence relative to that of mouse. (b) Multiple alignment of the region encompassing the H137L and D170G mutations. Non-conserved residues are shown with black backgrounds, whereas conserved but non-identical residues are shown with a grey background. Apart from human and mouse, these peptide sequences are derived from ESTs, with the species indicated as follows: ‘r’ rat, ‘m’ mouse, ‘c’ cow, ‘p’ pig, ‘z’ zebrafish and ‘x’ xenopus, alongside their respective accession numbers. (c) Multiple alignment of the region of similarity between RP9 (residues 95 to 120) and CIR motif-containing proteins. Other mammalian species have essentially identical sequences to human CIR (AF098297) and are not shown. Non-conserved residues are shown with black backgrounds, whereas conserved but non-identical residues are shown with a grey background. Residues conserved between all the sequences are indicated by an asterisk.

Methods

This gene consists of 6 exons comprising an open reading frame of 666 bp, spread over 14 138 bp (ATG start to TGA stop) of genomic DNA (Figure 1). To test whether or not mutations in this gene were present in RP patients, individual exons were PCR amplified for mutation detection by single-strand conformation polymorphism (SSCP) analysis using the following primers: exon1f GTTGCCCGAGCGGCGCT, exon1r TGGCCGCGCGCGGACGGCA, exon2f AAATCTCTGATTAAAAATCCTATAGCC, exon2r AAAAGGAGATTTAACATCATGCAA, exon3f CAGGAAAAAGCCAGGCAAG, exon3r GAGGGCTGTGATGAGAACAAG, exon4f TGCTGATTCTTTATCTTGAGTAGGTG, exon4r TGGTGACTTTCTGCTTCACTG, exon5f GGTTTTCATAACATAGGCATTTCA, exon5r TGTTTACTGCACCATTCCTCT, exon6f CATCCTATACTGCTTTTGAATGACA, exon6r TGCATCTTCCTCTGTTCCTTG. The primers were selected carefully, as duplicated copies of the gene exist elsewhere in the genome (see below). The use of DNA from human subjects in this study has been approved by the Leeds Health Authority/St James's University Hospital NHS Trust Clinical Research (Ethics) Committee. Manipulation of the sequences in this project was largely carried out using the programs in the EMBOSS software suite.10
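As a minimal illustration of the kind of basic primer bookkeeping that accompanies such a design, the sketch below tabulates the length and GC content of the twelve primers listed above and provides a reverse-complement helper. The primer sequences are copied from the text; the function names (`gc_fraction`, `reverse_complement`, `summarise`) are hypothetical and are not part of EMBOSS or of the original analysis. Checking specificity against the duplicated gene copies would additionally require the genomic sequences, which are not reproduced here.

```python
# Primer sequences as listed in the Methods; all helper names are illustrative.
PRIMERS = {
    "exon1f": "GTTGCCCGAGCGGCGCT",
    "exon1r": "TGGCCGCGCGCGGACGGCA",
    "exon2f": "AAATCTCTGATTAAAAATCCTATAGCC",
    "exon2r": "AAAAGGAGATTTAACATCATGCAA",
    "exon3f": "CAGGAAAAAGCCAGGCAAG",
    "exon3r": "GAGGGCTGTGATGAGAACAAG",
    "exon4f": "TGCTGATTCTTTATCTTGAGTAGGTG",
    "exon4r": "TGGTGACTTTCTGCTTCACTG",
    "exon5f": "GGTTTTCATAACATAGGCATTTCA",
    "exon5r": "TGTTTACTGCACCATTCCTCT",
    "exon6f": "CATCCTATACTGCTTTTGAATGACA",
    "exon6r": "TGCATCTTCCTCTGTTCCTTG",
}

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return sum(1 for base in seq if base in "GC") / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def summarise(primers):
    """Return (name, length, GC fraction) for each primer, in input order."""
    return [(name, len(seq), gc_fraction(seq)) for name, seq in primers.items()]


if __name__ == "__main__":
    for name, length, gc in summarise(PRIMERS):
        print(f"{name}\t{length} nt\tGC {gc:.0%}")
```

The very GC-rich exon 1 primers are in keeping with the CpG island that encompasses the first exon, as described in the Discussion.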

Results

In the RP9-linked family we identified a missense change, c.410A>T, in exon 5, resulting in the amino acid substitution H137L (Figure 1). This residue is conserved in all orthologous sequences that are present in nucleotide databases, these being from rat, mouse, cow, pig, frog and zebrafish (Figure 2b). The mutation is restricted to the 47 affected members and obligate carriers in the family and was absent from 280 control samples.

Alongside the original RP9 family, a panel of 300 dominant, recessive and genetically undefined RP patients was also screened. In this panel we identified additional sequence changes. One is a missense mutation present in a single RP patient, which results from a change in exon 6, c.509A>G, producing the substitution D170G (Figure 1). Again this residue is highly conserved (Figure 2b) and the c.509A>G alteration was absent from 279 control samples. The phenotype of this patient was consistent with that described previously for RP9.3
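The correspondence between the cDNA positions and the affected residues can be checked with simple arithmetic. The sketch below is an illustration only: it assumes c. numbering that starts at the A of the ATG start codon, and because the reference codons themselves are not given in the text, it enumerates both possible histidine and aspartate codons from the standard genetic code. The function names are hypothetical.

```python
# Subset of the standard genetic code needed for the two missense changes.
CODONS = {
    "CAT": "H", "CAC": "H",  # histidine
    "CTT": "L", "CTC": "L",  # leucine (subset)
    "GAT": "D", "GAC": "D",  # aspartate
    "GGT": "G", "GGC": "G",  # glycine (subset)
}


def codon_of(cdna_pos):
    """Map a 1-based coding position to (codon number, position in codon 1..3)."""
    if cdna_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def substitute(codon, pos_in_codon, new_base):
    """Return codon with the base at pos_in_codon (1..3) replaced by new_base."""
    i = pos_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


def check_missense(cdna_pos, ref_base, alt_base, ref_aa, alt_aa):
    """Return (codon number, True if every candidate reference codon for
    ref_aa carrying ref_base at the mutated position yields alt_aa)."""
    codon_no, pos = codon_of(cdna_pos)
    candidates = [c for c, aa in CODONS.items()
                  if aa == ref_aa and c[pos - 1] == ref_base]
    consistent = bool(candidates) and all(
        CODONS.get(substitute(c, pos, alt_base)) == alt_aa for c in candidates
    )
    return codon_no, consistent


if __name__ == "__main__":
    print(check_missense(410, "A", "T", "H", "L"))  # c.410A>T, H137L
    print(check_missense(509, "A", "G", "D", "G"))  # c.509A>G, D170G
```

Both changes fall on the second base of their codon, which is why the substitution is the same whichever synonymous codon is present in the reference sequence.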

Two polymorphisms were also identified in exon 6, both of which change the protein product. A common polymorphism, c.629A>G, which results in the conservative amino acid substitution K210R, was found to occur at a frequency of approximately 30% on control chromosomes (50 out of 176), showing that it is non-pathogenic. No correlation was observed between the presence of this polymorphism and the severity of disease in the linked family (data not shown). The Sau 3A restriction enzyme site created by this polymorphism simplified this analysis.

The second polymorphism is a deletion of the ‘T’ of the TGA stop codon (c.664delT) resulting in an extended reading frame from 666 bp and 221 amino acids to 750 bp and 249 amino acids, introducing an aberrant carboxy terminus to the protein (Figure 1). This sequence change was present in 11 out of 300 apparently unrelated individuals in our RP screening panel and in 5 of 279 control samples. The apparent difference in the frequency between RP patients and control samples is most likely explained by population stratification between the two groups, though we cannot rule out a pathogenic contribution from this allele. The c.664delT polymorphism also appears to be in linkage disequilibrium with the c.629A>G polymorphism, though analysis of other polymorphic markers near the gene, D7S795, djs40ca2 (embl:G49432), D7S460 and D7S683 did not show any common haplotype (data not shown), suggesting that c.664delT is quite ancient.
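For readers who wish to gauge how large the apparent difference in c.664delT carrier frequency is, the counts above can be placed in a 2×2 table. The sketch below is an illustration only and was not part of the original analysis: it computes the two carrier frequencies and a two-sided Fisher exact p-value using integer arithmetic. The function name is hypothetical, and such a test cannot by itself distinguish a pathogenic contribution from population stratification.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    observed = comb(row1, a) * comb(row2, col1 - a)
    extreme = 0
    for k in range(max(0, col1 - row2), min(row1, col1) + 1):
        weight = comb(row1, k) * comb(row2, col1 - k)
        if weight <= observed:  # exact integer comparison
            extreme += weight
    return extreme / total


# Counts taken from the text: carriers / total individuals in each group.
RP_CARRIERS, RP_TOTAL = 11, 300
CONTROL_CARRIERS, CONTROL_TOTAL = 5, 279

rp_frequency = RP_CARRIERS / RP_TOTAL
control_frequency = CONTROL_CARRIERS / CONTROL_TOTAL
p_value = fisher_exact_two_sided(
    RP_CARRIERS, RP_TOTAL - RP_CARRIERS,
    CONTROL_CARRIERS, CONTROL_TOTAL - CONTROL_CARRIERS,
)

if __name__ == "__main__":
    print(f"RP panel: {rp_frequency:.2%}  controls: {control_frequency:.2%}")
    print(f"two-sided Fisher exact p = {p_value:.3f}")
```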

In contrast to these polymorphisms, the poly(purine) tract, which had originally provoked interest in this gene, was found not to display size or sequence variation in any individual we analysed.

Discussion

These data suggest that some mutations in this gene may cause the RP9 form of retinitis pigmentosa. The human RP9 protein contains 221 amino acids and analysis of the primary sequence suggests that it is a soluble protein. The most prominent feature in the protein is a positively charged domain in the carboxy terminus extending from residue 184 to residue 208. It also contains a previously unidentified motif that lies between residues 95 and 120 (Figure 2c), which is also present in the human protein CIR, a component of a transcriptional control complex.11 By inference, it seems possible that the RP9 protein product is also involved in transcriptional control. Predicted proteins found in the genomic sequences of Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana also contain this motif, but in orthologues of CIR and not RP9, for which no clear orthologue can be identified in non-vertebrate genome sequences. In the publication describing the identification of the mouse RP9 orthologue it was suggested that it interacted with CIR, but no data were shown.9 In that paper the mouse RP9 orthologue was named PAP-1 (Pim-1 Associated Protein). This has led to some confusion in the literature, as several other genes have also been named PAP. Therefore it has been agreed with the HUGO and mouse nomenclature committees that the mouse gene will be known as Rp9h until further information emerges about its function.

The original study describing mouse Rp9h (Pap-1) provides few insights into possible pathogenic mechanisms for retinal degeneration.9 What is known is that the Pim-1 kinase binds to the first half of Rp9h (PAP-1) and phosphorylates one or more of a group of serine residues near the carboxy terminus. The interaction between Pim-1 and Rp9h (PAP-1) was observed to take place in the nucleus. It thus seems likely that the basic domain of the RP9 protein functions as a nuclear localisation signal.12 Viral leukemogenesis in mice is often initiated by proviral activation of the Pim-1 oncogene13 and because of this most work on Pim-1 and its possible targets has been directed to its likely role in haematopoiesis.14 There has also been one report that demonstrated a significant level of Pim-1 expression in the developing avian retina, though this occurred alongside widespread expression in other tissues.15 Other phosphorylation targets of Pim-1 may therefore be candidates for retinal disease. One of these is the EBNA2 coactivator, which is located in the region of the RP10 locus on 7q31.

Within the RP9 interval a regional duplication has resulted in an almost exact copy of the RP9 gene, which is located in a tandem array, approximately 166 kb distal to it (ATG start to ATG start) (Figure 1). The duplicated segment, which is approximately 20 kb in size, extends from 1600 bp upstream of the RP9 gene ATG start to around 4500 bp downstream from the last exon. Two other genes are located between the copies but they are not involved in the duplication. Analysis of EST representation suggests that the duplicated distal gene is not as transcriptionally active as the RP9 gene, with only three ESTs in public databases derived from it, as opposed to over 100 from the RP9 gene itself. Recently a putative mRNA sequence derived from the RP9 duplicate (named PNAS-13; embl:AF274938) has appeared in the DNA databases. This contains a sequence alteration, when compared to genomic sequence, which changes the reading frame relative to the RP9 gene. This seems most likely to be the result of a sequencing error or a problem with the cDNA clone. A CpG island encompasses all of the first exon of the RP9 gene, but the extent of this GC-rich region is reduced in the duplicated copy. The majority of the differences between the copies in the region of the first exon result from C to T changes at CpG dinucleotides, an observation most easily explained by the CpG island of the duplicated gene becoming methylated and then undergoing deamination-mediated mutation. Nevertheless, none of the differences between the genomic sequences appears to inactivate the copy by changing the reading frame, supporting the hypothesis that this copy is still potentially functional.
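The observation about CpG dinucleotides can be made concrete with a small classification routine. The sketch below is an illustration only: it takes two aligned, equal-length sequences, assumes the first represents the unmutated state, and flags differences consistent with deamination of a methylated cytosine (CG to TG on the given strand, or CG to CA when the change occurred on the opposite strand). The function names and the toy sequences are hypothetical and are not taken from the RP9 locus.

```python
def all_differences(original, copy):
    """0-based positions at which two aligned, equal-length sequences differ."""
    if len(original) != len(copy):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i in range(len(original)) if original[i] != copy[i]]


def cpg_transition_sites(original, copy):
    """0-based positions of changes consistent with CpG deamination.

    For each CG in `original`, report the C position if the copy reads TG,
    or the G position if the copy reads CA (C>T on the opposite strand).
    Sites where both bases changed are not counted.
    """
    if len(original) != len(copy):
        raise ValueError("sequences must be aligned to equal length")
    sites = []
    for i in range(len(original) - 1):
        if original[i:i + 2] != "CG":
            continue
        pair = copy[i:i + 2]
        if pair == "TG":
            sites.append(i)
        elif pair == "CA":
            sites.append(i + 1)
    return sites


if __name__ == "__main__":
    # Toy aligned sequences, invented for illustration.
    gene = "ACGTTCGGACGATC"
    duplicate = "ATGTTCAGACGTTC"
    diffs = all_differences(gene, duplicate)
    cpg = cpg_transition_sites(gene, duplicate)
    print(f"{len(cpg)} of {len(diffs)} differences are CpG transitions")
```

Applied to an alignment of the two first-exon regions, the ratio of CpG transitions to all differences would quantify the statement that the majority of differences arise at CpG dinucleotides.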

There is also a region of sequence similarity to the RP9 gene on chromosome 2 (embl:AC007041 and embl:AI972605). In comparison to the RP9 gene, the putative chromosome 2 coding sequence appears to be disrupted. Furthermore, it lies within an intron of, and on the same strand as, another quite distinct gene (pumh2/AF315591), suggesting that this may be an inactive pseudogene. Despite this, its presence has led to the erroneous localisation of the RP9 UniGene cluster to chromosome 2.

The mechanism whereby mutations in this widely expressed gene could lead to retinal degeneration remains unknown. The most likely explanation is that the high metabolic rate and presence of photoactivated molecular species in the neural retina constitute physiological stresses not present in other tissues.16 However, it is also possible that specific domains of a multifunctional protein are only required in a specific physiological context. It has recently been reported, by this laboratory and others, that mutations in pre-mRNA splicing factors are responsible for at least three forms of dominant RP.17,18,19 Whilst there is no evidence to suggest that the RP9 gene is involved in RNA splicing, it would appear from these observations that mutations in ‘housekeeping’ genes with no obvious involvement in visual system development or function can still cause retinal disease.

In conclusion, we have identified two missense mutations in a previously uncharacterised gene that are associated with RP. Whilst this evidence is suggestive of disease causation, further mutations must be identified in RP patients before this becomes unequivocal.