Comparative transcription map of the wobbler critical region on mouse chromosome 11 and the homologous region on human chromosome 2p13-14

Background To support the positional cloning of the mouse mutation wobbler (wr) the corresponding regions on human Chr2p13-14 and mouse Chr11 were analyzed in detail and compared with respect to gene content, order, and orientation. Results The gene content of the investigated regions was highly conserved between the two species: 20 orthologous genes were identified on our BAC/YAC contig comprising 4.5 Mb between REL/Rel and RAB1A/Rab1a. Exceptions were pseudogenes ELP and PX19 whose mouse counterparts were not located within the analyzed region. Two independently isolated genomic clones indicate an inversion between man and mouse with the inverted segment being identical to the wobbler critical interval. We investigated the wobbler critical region by extensive STS/EST mapping and genomic sequencing. Additionally, the full-length cDNA sequences of four newly mapped genes as well as the previously mapped gene Otx1 were established and subjected to mutation analysis. Our data indicate that all genes in the wr critical region have been identified. Conclusion Unexpectedly, neither mutation analysis of cDNAs nor levels of mRNAs indicated which of the candidate genes might be affected by the wr mutation. The possibility arises that there might be hitherto unknown effects of mutations, in addition to structural changes of the mRNA or regulatory abnormalities.


Background
The conservation of genomic structure between two or more species is the basis of comparative genomics. During the process of evolution the distribution of genes on chromosomes is rearranged by illegitimate recombination and translocation. Species sharing a recent common ancestor tend to have fewer chromosomal rearrangements than distantly related species [1]. Orthologous genes serve as landmarks for the definition of related chromosomal regions. Orthologous segments are defined by at least one pair of orthologous genes; conserved synteny relates to at least two pairs of orthologous genes located on the same chromosomal segment in two different species [2].
With increasing sequence data, comparative genomics is no longer restricted to genes, as evolutionarily conserved intergenic sequences [3] can now also be taken into account. In addition, comparison of genomic sequence data of two different species facilitates the identification of novel genes and conserved regulatory elements.
We use comparative genomics for a positional cloning approach in order to identify candidate genes for the mouse mutation wobbler (wr) which causes muscle weakness due to motor neuron degeneration [4] and a spermiogenesis defect [5]. The wobbler mouse is used as an animal model for human spinal muscular atrophies (SMAs) although there is no human SMA known to be located within the wr homologous region.
Here we present a detailed transcript map of the orthologous chromosomal regions on proximal mouse Chr 11 and human Chr 2p13-14 with newly mapped genes and newly established mouse cDNA sequences serving as candidate genes for the wobbler mutation. The genomic regions analyzed are highly conserved in gene content, but two independent genomic clones indicate that the gene order is disrupted by an inversion.

Results
Our physical map of mouse proximal Chr 11 is based on 68 BAC and 11 YAC clones covering 3 Mb. The present, mostly BAC-based contig was established by STS/EST mapping and improves on the previously described YAC contig [8]. Our homologous human YAC contig (twelve clones, [8]) spans approximately 4.5 Mb, exceeding the mouse contig on the distal side.
The human pseudogenes endozepine like peptide (ELP) [16] and px19-like protein (PX19) [17] are localized in the human genome sequence of the interval under study and their positions were confirmed by STS-mapping on our human physical map. However, their counterparts were not detected on the mouse contig. Murine Elp was radiation hybrid mapped but located far distally (at 40 cM) on Chr11 (Fig 1). The product of Elp is a testis specific isoform of the ubiquitous acyl-CoA binding protein (ACBP) which is highly expressed in the haploid stages of male germ cell development [16]. PX19 was first identified in chicken as a cDNA with a LEA (late embryogenesis abundant) motif [17]. By searching the human draft genome sequence we detected more than 5 intronless copies of this gene, and also in the limited data of the mouse genome several copies were present (data not shown).
Human gene content and order on our contig were compared with the annotated draft sequence of the human genome (UCSC genome browser, December 2001). Nearly all genes/cDNAs are present on both, and the gene order is identical with one exception (HCC8 and UGP2), which is due to gaps in the human draft sequence. On the basis of the human draft sequence and our YAC contig it was possible to determine the orientation of most human genes (PELI1 [9], MDH1 [18], HCC8 [19], UGP2 [20], LOC51057 and KIAA0903 [21]). In the mouse, genes were ordered and oriented by our high resolution BAC contig combined with sequence analysis of seven selected mouse BACs (AC091422, AC091423, AC091424, AC091428, AC091419, AC091420, AC091421).
Two murine genomic clones, YAC clone ymWIBR100H6 (500 kb) and BAC clone 135B4 (200 kb), comprised sequences from both Kiaa0903 and Hspc159; this indicated a different gene order in mouse and man in the wobbler critical region (Fig. 1). We did not identify any YAC or BAC clone containing both Murr1 and Peli1, which would span the distal border of the presumed inversion. On the basis of the close neighbourhood of Kiaa0903 and Hspc159, three segments were defined. Within each segment, order and orientation of genes are conserved between man and mouse, but the central one is inverted. The distal segment is defined by Rel and Murr1, the proximal one by Hspc159 and Rab1a and the central one by Kiaa0903 and Peli1.
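The reasoning behind the three-segment model can be illustrated with a minimal Python sketch. It is not the procedure used in this study. It assumes that each species is represented only by the six segment-defining genes named above, that gene symbols are compared in upper case, and that the human order follows from the mouse order by reversing the central segment. The function name `find_inverted_block` is hypothetical.

```python
def find_inverted_block(ref_order, other_order):
    """Return the genes of the single contiguous block that appears reversed
    in other_order relative to ref_order (empty list if the orders agree).

    Raises ValueError if the two orders cannot be explained by one inversion.
    """
    if sorted(ref_order) != sorted(other_order):
        raise ValueError("the two maps do not contain the same genes")

    # Positions at which the two gene orders disagree.
    diff = [i for i in range(len(ref_order)) if ref_order[i] != other_order[i]]
    if not diff:
        return []

    start, end = diff[0], diff[-1]
    block_ref = ref_order[start:end + 1]
    block_other = other_order[start:end + 1]

    # A single inversion means the disagreeing block is an exact reversal.
    if block_ref != block_other[::-1]:
        raise ValueError("gene orders differ by more than one inversion")
    return block_ref


# Simplified, illustrative orders built from the segment-defining genes only.
human = ["REL", "MURR1", "KIAA0903", "PELI1", "HSPC159", "RAB1A"]
mouse = ["REL", "MURR1", "PELI1", "KIAA0903", "HSPC159", "RAB1A"]

print(find_inverted_block(human, mouse))  # ['KIAA0903', 'PELI1']
```

In this toy representation the reversed block is bounded by Kiaa0903 and Peli1, which corresponds to the central segment, while the flanking genes keep their relative positions.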

Figure 1
Region of conserved synteny of human Chr2p13-14 and mouse prox Chr11. Orthologous genes of human and mouse and their positions on the respective physical map are shown except for Kiaa0570 and Kiaa0729, which could not be mapped with radiation hybrids. Mouse orthologs for ELP and PX19 are not located within the investigated region. Gene order of HCC8 and UGP2 is based on our STS/EST mapping and sequence analysis. Gene order between human and mouse is conserved in three sections; the proximal and distal sections show the same orientation whereas the central section is inverted between the two species. The central part of the figure including the inverted section is shown slightly enlarged. For orientation, positions of microsatellites D2S2225 and D2S147 (human) and D11Mit19, D11Mit294 and D11Mit343 (mouse) are indicated. Blue arrows show orientation and relative genomic sizes of genes; the arrowhead points to the 3' end. Yellow bars represent chromosomal regions covered by YAC or BAC contigs. Red overlays show sequenced sections (human: public draft sequence, mouse: BACs sequenced by our group, AC091422, AC091423, AC091424, AC091428, AC091419, AC091420, AC091421). Green bars indicate the genomic clones (YAC ymWIBR100H6, * ; BAC 135B4, § ), which cover Kiaa0903 and Hspc159. Approximate distances are given in kb for human Chr2p13-14, in cM for mouse prox Chr11.

Segregation analysis of the wr candidate interval
The candidate interval of the wobbler mutation was refined in comparison with [8] by using the interspecies backcross Mus musculus C57BL/6J-wr x Mus musculus castaneus.
New polymorphisms between the two strains were established all over the candidate region using BAC-end or intron sequences. Two recombinations were detected between BAC-end 147N22rev and wr; and four recombinations were detected between anonymous cDNA Murr1 and wr. Thus the candidate region of the wobbler mutation was narrowed down to the interval between the two loci 147N22rev (located distally to Kiaa0903) and Murr1.
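The logic of narrowing a candidate interval by recombination can be sketched as follows. This is an illustration only, not the analysis software used here. The recombination counts for 147N22rev and Murr1 are those reported above; the two inner markers (`inner_marker_A`, `inner_marker_B`) and their zero counts are hypothetical placeholders for markers that do not recombine with wr, and the function name `critical_interval` is likewise an assumption.

```python
def critical_interval(markers):
    """Return the pair of markers flanking the non-recombinant block.

    markers: list of (name, recombinants_with_mutation) tuples in map order.
    The critical interval is bounded by the closest marker on each side
    that shows at least one recombination with the mutation.
    """
    zero = [i for i, (_, rec) in enumerate(markers) if rec == 0]
    if not zero:
        raise ValueError("no non-recombinant marker in the data")

    first, last = zero[0], zero[-1]
    if any(rec != 0 for _, rec in markers[first:last + 1]):
        raise ValueError("non-recombinant markers are not contiguous")
    if first == 0 or last == len(markers) - 1:
        raise ValueError("interval is not bounded on both sides")

    return markers[first - 1][0], markers[last + 1][0]


# Counts for the flanking loci are from the text; inner markers are hypothetical.
backcross = [
    ("147N22rev", 2),
    ("inner_marker_A", 0),
    ("inner_marker_B", 0),
    ("Murr1", 4),
]

print(critical_interval(backcross))  # ('147N22rev', 'Murr1')
```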
Interestingly, the apparently inverted region seems to be identical to the wobbler critical interval. It contains the newly mapped genes Hcc8, Ugp2, Kiaa0903, Homoloc13 and the previously mapped genes Otx1 [8], Mor2 [8], and Peli1 [9]. Our high resolution physical maps, together with the human draft sequence, suggest that all orthologous genes have been identified. Because no additional genes were found by BLAST search using genomic sequences or by analysis of mouse sequences with exon prediction programs, we believe that the set of candidate genes for the wobbler mutation is complete.
Full length cDNA sequences for the newly mapped genes Hcc8, Ugp2, Kiaa0903, Homoloc13 and the previously mapped gene Otx1 were established according to Materials and Methods. All genes were subjected to mutation analysis by comparative sequencing of the coding regions and parts of the 3' and 5' UTRs, but no mutation has been detected so far. Using RNAs from testis and spinal cord (the affected organs) of adult wobbler (wr/wr) and wildtype (+/+) mice, transcript levels of the candidate genes were analyzed by Northern blotting and RT-PCR, but no difference between wobbler and wildtype mice was found (data not shown).
A summary of nucleotide and amino acid features for all genes within the wobbler critical region including accession numbers, genomic structure and mouse-man homologies is displayed in Table 1. In Table 1, accession numbers in brackets are previously published results; n.a. = not available.

Discussion
Knowledge of gene order and content is crucial for the identification of disease genes. During the draft phase of genome projects sequence information has to be used with caution since multiple gaps lead to incorrect orientation and order of the sequence fragments. High-resolution maps based on contigs of genomic clones as presented in this work help to overcome this problem. For example, the positions of genes UGP2 and HCC8, whose 5' ends are less than 2 kb apart, were shown to be inverted in the human draft version relative to the correct orientation, and most of the UGP2 gene was not available. The correct gene order was resolved with the high resolution map presented here. Furthermore, genes are missing from the chromosome maps if their transcripts have not been described although the protein is known. For example, OTX1 was not annotated in the draft sequence because its cDNA sequence was not present in the database.
By the identification of 20 orthologous gene pairs we were able to show that the gene content of proximal mouse Chr11 and human Chr2p13-14 is highly conserved. On the other hand, the gene order appears disrupted by an inversion which comprises 1.5 Mb. Although the present mouse draft sequence suggests an orientation as in the human genome, our results indicate an inversion. The exact inversion breakpoints are not known, and despite extensive screens no mouse BACs spanning the distal breakpoint were detected. Low complexity and reiterative sequences may be responsible for both the chromosomal rearrangement between mouse and man and the difficulty of cloning these regions. A comparable inversion of about 1 Mb has been reported for human Chr19p13.3 and mouse Chr10 [22]. Carver and Stubbs [1] suggest that continuous regions of conserved synteny without insertions, deletions and inversions are the exception rather than the rule. Further small-scale rearrangements within regions of conserved synteny are expected to be discovered with high-resolution maps and the availability of sequence data.
The only exceptions to the conservation of gene content are PX19 and ELP, which are probably pseudogenes that integrated on human Chr2p after the divergence of rodents and primates. Although some cDNA clones are listed in the databases, no expression of ELP could be detected and the sequences contain several mutations [23]. We conclude that the human ELP is a pseudogene. Murine Elp probably arose by retroposon-mediated gene duplication and secondarily acquired a testis-specific promoter and thus a limited expression. The Chr2p copy of PX19 is truncated compared to the functional PX19 cDNA, with the 3' UTR partially missing. No human EST corresponding to the Chr2p copy was detected, implying that it is a pseudogene. The polyA tail in the genomic sequence suggests that it may have arisen by the integration of an alternatively polyadenylated PX19 transcript. Although first identified as a human cDNA clone, Homoloc2 neither has an open reading frame nor could a transcript be detected by RT-PCR. The high conservation of its genomic sequence between man and mouse suggests a regulatory function.
Establishing full-length cDNA sequences for all genes within the wobbler critical region was the basis for an extensive comparative sequencing effort aimed at detecting a mutation. So far, no mutation and no difference in transcript levels have been found.

Conclusions
The comparative mouse-man genomics approach has proven a valuable tool to find new candidate genes for disease mutations [24]. However, inversions and insertions have to be taken into account if the human genome is to be used as a framework for the sequence assembly of the mouse. Besides providing new candidates comparative genomics leads to the identification of conserved noncoding sequences, which might have important biological functions. Mutations such as wobbler for which all candidate genes have been excluded might affect a conserved noncoding sequence with a cis-acting effect on genes that might be located outside the critical region as defined by recombination.

BAC and YAC clones
For the mouse BAC and YAC clones and the human YAC clones see Resch et al. [8]. The contigs were established by extensive STS/EST mapping and genomic sequencing.

STS/EST mapping
Standard PCR reactions were carried out in a 12.5 µl reaction volume using the Qiagen (Hilden, Germany) Taq Mastermix Kit with 12.5 pmoles of each primer. PCR was performed on all BACs and YACs using 50 ng of template DNA.
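For orientation, the stated amounts can be converted to final concentrations. The short Python sketch below only restates the arithmetic (1 pmol/µl equals 1 µM); the function names are hypothetical and no further reaction components are assumed.

```python
def primer_concentration_um(primer_pmol, volume_ul):
    """Final primer concentration in µM (1 pmol/µl is equal to 1 µM)."""
    return primer_pmol / volume_ul


def template_concentration_ng_per_ul(template_ng, volume_ul):
    """Final template DNA concentration in ng/µl."""
    return template_ng / volume_ul


# Values taken from the reaction set-up described above.
print(primer_concentration_um(12.5, 12.5))           # 1.0
print(template_concentration_ng_per_ul(50.0, 12.5))  # 4.0
```

Each primer is therefore present at 1 µM and the template at 4 ng/µl in the final reaction.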

Search for orthologous ESTs
Orthologous mouse ESTs for mapped human ESTs were identified via BLAST search. For ESTs with greater than 80% homology to ESTs of other species, PCR primers were generated. Subsequently, physical mapping was carried out by STS and radiation hybrid mapping [8].
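The selection step can be illustrated with a minimal Python sketch. It assumes that BLAST results are available in the standard tab-separated tabular format, in which the first three columns are query ID, subject ID and percent identity, and that "homology" above is read as percent identity. The function name `filter_hits` and the EST identifiers in the example are hypothetical.

```python
def filter_hits(tabular_text, min_identity=80.0):
    """Keep BLAST tabular hits whose percent identity exceeds min_identity.

    Expects tab-separated lines with query ID, subject ID and percent
    identity in the first three columns. Comment lines starting with '#'
    and empty lines are skipped.
    """
    hits = []
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, identity = fields[0], fields[1], float(fields[2])
        if identity > min_identity:
            hits.append((query, subject, identity))
    return hits


# Hypothetical example records (IDs and values are invented for illustration).
example = (
    "# query\tsubject\tpident\tlength\n"
    "humanEST_1\tmouseEST_A\t91.5\t420\n"
    "humanEST_1\tmouseEST_B\t78.2\t310\n"
    "humanEST_2\tmouseEST_C\t80.0\t250\n"
)

for hit in filter_hits(example):
    print(hit)
```

With the strict "greater than" criterion used here, a hit at exactly 80% is not retained, so only the first example record passes.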

Preparation of RNA and RT-PCR
RNA from wildtype and wobbler (wr/wr) C57BL/6J mice was prepared from various tissues by standard procedures [23]. Human RNA was purchased from Clontech (Palo Alto, USA). One µg of total RNA was reverse transcribed using the Omniscript RT-Kit from Qiagen (Hilden, Germany).

Expansion of ESTs to full length cDNAs
In order to expand EST sequences (length usually < 500 bp), overlapping sequences of cDNA clones belonging to a single EST were assembled in silico. To verify the assembled sequences, primers were generated for each end, and the transcript was amplified by RT-PCR, cloned and sequenced.
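The principle of in silico extension by overlap can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the assembly software used in this work: sequences are assumed to be error-free, on the same strand and supplied in 5' to 3' order, and the function names and the minimum overlap are hypothetical choices.

```python
def merge_overlap(left, right, min_overlap=20):
    """Merge two sequences if a suffix of left equals a prefix of right.

    The longest exact overlap of at least min_overlap bases is used.
    Returns the merged sequence, or None if no such overlap exists.
    """
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None


def extend_est(sequences, min_overlap=20):
    """Chain-merge sequences that are given in 5' to 3' order."""
    assembled = sequences[0]
    for seq in sequences[1:]:
        merged = merge_overlap(assembled, seq, min_overlap)
        if merged is None:
            raise ValueError("no sufficient overlap between consecutive clones")
        assembled = merged
    return assembled


# Toy sequences with short overlaps, for illustration only.
clones = ["ACGTACGTTT", "CGTTTGGA", "TGGACCA"]
print(extend_est(clones, min_overlap=4))  # ACGTACGTTTGGACCA
```

Real EST data contain sequencing errors and clones in both orientations, which is one reason why the assembled sequences were verified experimentally by RT-PCR, cloning and sequencing.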

Genomic sequence analysis
BACs were sequenced using a combination of shotgun and directed approaches as described previously [24]. Assembled sequence contigs were analyzed by the automated annotation system RUMMAGE [25].