A high resolution physical and RH map of pig chromosome 6q1.2 and comparative analysis with human chromosome 19q13.1

Background: The generation of BAC/PAC contigs in targeted genome regions is a powerful method to establish high-resolution physical maps. In domestic animal species the generation of such contigs is typically initiated by screening libraries with probes derived from human genes that are expected, on the basis of comparative mapping, to be located in the region of interest. However, in many instances the available gene-derived probes are too far apart to allow the cloning of BAC/PAC contigs larger than a few hundred kb. High-resolution physical mapping makes it possible to estimate the sizes of gaps and to control the orientation of the individual sub-contigs, which helps to avoid errors during the assembly of smaller contigs into final Mb-sized contigs. The recently constructed porcine IMNpRH2 panel allowed us to use this approach for the construction of high-resolution physical maps of SSC 6q1.2.

Results: Two sequence-ready BAC/PAC contigs of the gene-rich region on porcine chromosome 6q1.2 (SSC 6q1.2) containing the RYR1 gene were constructed. The two contigs spanned about 1.2 Mb and 2.0 Mb, respectively. The construction of these contigs was monitored using the results of mapping 15 markers on the IMpRH 7000 rad and 35 markers on the IMNpRH2 12000 rad radiation hybrid panels. Analyses on the IMpRH panel allowed us to globally link and orientate preliminary smaller contigs, whereas analyses on the high-resolution IMNpRH2 panel allowed us to finally identify the order of genes and markers.

Conclusions: A framework map of 523 cR12000 was established covering the whole studied region. The order of markers on the 1000:1 framework RH map was found to be fully consistent with the data deduced from the contig map. The kb/cR ratio was nearly constant over the whole region, with an average value of 6.6 kb/cR. We estimate that the size of the remaining gap between the two contigs is about 300 kb.
The integrated physical and RH map of the investigated region on SSC 6q1.2 was used for a comparative analysis with respect to the syntenic regions on HSA 19q13.1 and MMU 7 and revealed a perfectly conserved gene order across the entire studied interval.


Background
Comparative genome analysis increases the knowledge of genome evolution and is especially important in livestock species, where the currently available sequence information is very limited compared to the vast amount of information available from the human and mouse genomes. Radiation hybrid (RH) mapping is seen as an efficient technique for the generation of high-resolution gene maps in different species, and RH maps can be integrated in comparative mapping approaches to reveal the degree of synteny conservation between species [1].
Two RH panels have been reported for the pig: the 7 000 rad IMpRH panel [2], which provides medium-resolution global mapping information, and the 12 000 rad IMNpRH2 panel [3], which can be used to construct high-resolution local RH maps. Panels developed after a high level of cell irradiation (10 000 to 50 000 rads) are very useful for high-resolution regional mapping studies, but they must be characterized with a very large number of markers to be useful for genome-wide mapping studies [4].
The porcine RYR1 gene region on SSC 6q1.2 is of special interest due to its economic importance. The porcine stress syndrome (PSS), which in pigs is caused by a single RYR1 point mutation, is known to be associated with positive characteristics like increased muscling and increased lean meat content. It is still unclear whether the RYR1 mutation is also responsible for the positive carcass traits in stress-susceptible pigs or whether these complex growth traits are influenced by other closely linked genes on SSC 6q1.2 [5][6][7]. Furthermore, this genomic region is also of special interest as it represents a GC-rich genomic region with a very high gene content. To investigate this genomic region we have previously reported the construction and analysis of a 1.2 Mb BAC/PAC contig [8].
In the present study, we report the construction of high-resolution framework and comprehensive RH maps of the RYR1 gene region on porcine chromosome 6q1.2 using the porcine IMpRH and IMNpRH2 panels, as well as the comparison of the RH maps to an extended clone-based physical map of this region.

Construction of the BAC and PAC contig and analysis of end sequences
We previously reported the construction of a 1.2 Mb BAC/PAC contig on SSC 6q1.2 [8]. To extend the existing contig, the porcine TAIGP714 PAC and RPCI-44 BAC libraries were screened with new probes derived either from end fragments of previously isolated porcine genomic clones or from human HSA 19q13.1 genes. Assembly of all 171 isolated BAC and PAC clones according to STS content, insert sizes and fingerprinting data resulted in the expansion of the existing 1.2 Mb contig [8] to 2.0 Mb and the generation of a new 1.2 Mb contig (Fig. 1). End sequences from all clones of the contig were generated and submitted to the EMBL database under accessions AJ514457-AJ514832. In total, 292 end sequences from SSC 6q1.2 with an average read length of 708 bp, totaling 207 kb of genomic survey sequences, were generated. Thus, the BAC/PAC end sequences cover approximately 6 % of the studied genomic region. The end sequences have an average GC content of 47 %, exceeding the value of 41 % that is generally accepted as the average GC content in mammalian genomes [9]. The GC content analysis further confirms that SSC 6q1.2 is indeed closely related to HSA 19q13.1, which has a GC content of 46 % in the corresponding 4 Mb region. An analysis of repetitive elements revealed that 39.8 % of the end sequences consisted of repetitive DNA. Of the 39.8 % repetitive DNA, 20.5 % were SINE, 13.3 % were LINE, 2.4 % were of retroviral origin (LTRs), and 2.0 % represented DNA transposons. The predominance of SINEs is another typical hallmark of GC-rich and gene-rich genome segments [10]. The analysis of the end sequences also revealed three dinucleotide microsatellites and one tetranucleotide microsatellite (AJ514594, AJ514613, AJ514706, AJ514795).
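The coverage figure quoted above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; taking the studied region as the sum of the two contigs (2.0 Mb + 1.2 Mb) is an assumption made for this illustration, and the variable names are hypothetical.

```python
# Values taken from the text above.
n_end_sequences = 292
mean_read_length_bp = 708

# Assumption for this sketch: the "studied genomic region" is the
# sum of the two cloned contigs, i.e. 2.0 Mb + 1.2 Mb.
contig_sizes_kb = [2000, 1200]

total_sequence_kb = n_end_sequences * mean_read_length_bp / 1000
region_kb = sum(contig_sizes_kb)
coverage_percent = 100 * total_sequence_kb / region_kb

print(f"total end sequence: {total_sequence_kb:.0f} kb")
print(f"coverage of cloned region: {coverage_percent:.1f} %")
```

Under that assumption the total comes to roughly 207 kb and the coverage to between 6 % and 7 %, in line with the approximate values given in the text.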
The availability of the end sequences allowed the continuous verification of the contig assembly by comparative mapping. In BLAST searches against the human draft genome sequence, approximately 15 % of the BAC/PAC end sequences showed significant (E < 10^-5) matches to HSA 19q13.1, which allowed the precise comparative mapping of 27 % of the tested BAC/PAC clones. Of the investigated clones, 73 % had no match in the human genome sequence, 23 % had matches with one end sequence, and 4 % had matches with both end sequences.

Physical mapping and comparative analysis
During the contig construction many gene-specific STSs were used, which allowed the unequivocal assignment of genes to individual clones. Further genes were localized by hybridization of heterologous cDNA probes to the individual BAC/PAC clones and by BLAST analysis of the clone end sequences. Using these approaches, 33 genes in total were localized. Furthermore, the microsatellite SW193 was also localized by STS content analysis, thus anchoring the physical clone-based map to the linkage map of this region [11].
The gene assignments were compared with human and mouse maps and a comparative map for SSC 6q1.2, HSA 19q13.1 and MMU 7 was developed (Fig. 2). The gene order in this region of the pig genome corresponds exactly to the gene order of the NCBI HSA 19 map (http://www.ncbi.nlm.nih.gov, build 31). The gene order of MMU 7 (http://www.ncbi.nlm.nih.gov, MGSCv3) also corresponds exactly to the gene order of SSC 6 and HSA 19 but the orientation is inverted. The perfect synteny conservation between mouse and the two other species can only be observed since the latest update of the mouse maps, as in the previous mouse genome assembly a major rearrangement of the gene order in this genome region was observed [8].

Figure 2. Comparative map of SSC 6q1.2, HSA 19q13.1 and MMU 7. Gene order is perfectly conserved between the three species; however, the gene order is inverted in the mouse with respect to the other two species. In the human map all known genes without hypothetical gene predictions are listed, while in the murine map only those genes are listed that have also been mapped in the pig. In the porcine map the position of the microsatellite SW193 is also indicated.
Whereas the gene order is perfectly conserved between human, mouse and pig, the physical distances between genes vary somewhat between the three species. Within the investigated region the gene-poor stretch between COX7A1 and NEUD4 accounts for the largest part of these size deviations. The cloned region has a very uneven gene density. At the top and at the bottom of the map (Fig. 2) genes are clustered extremely densely with very short intergenic regions, while in the middle of the map, between the COX7A1 and NEUD4 genes, the gene content is very low.

RH mapping
In this study, we were able to build two comprehensive RH maps for SSC 6q1.2. On the 7000 rad IMpRH panel 15 STS markers were genotyped, while on the 12 000 rad IMNpRH2 35 STS markers were analyzed. Retention frequencies of markers ranged from 18.1 % to 32.8 % (average 22.9 %) on the IMpRH panel and from 27.8 % to 44.3 % with an average retention frequency of 37.5 % on the porcine IMNpRH2 panel.
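As an illustration of how a retention frequency is obtained from a single marker's RH vector, the following minimal sketch counts the fraction of typed hybrid clones that retain the marker. The one-character-per-hybrid encoding ('1', '0', '?') and the example vector are assumptions made for this sketch; they are not data or file formats from this study.

```python
def retention_frequency(vector: str) -> float:
    """Return the fraction of typed hybrids that retain the marker.

    `vector` holds one character per hybrid clone:
    '1' = marker present, '0' = marker absent,
    '?' = ambiguous or untyped (excluded from the count).
    This encoding is an assumption made for the example.
    """
    typed = [c for c in vector if c in "01"]
    if not typed:
        raise ValueError("no typed hybrids in vector")
    return typed.count("1") / len(typed)


# Made-up vector for a hypothetical marker, not data from this study.
example_vector = "0110?01000110100?010"
print(f"retention: {100 * retention_frequency(example_vector):.1f} %")
```

Excluding ambiguous scores from the denominator is one possible convention; counting them as absent would give a slightly lower value.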
During the building of these two contigs, we simultaneously analyzed data obtained on both the IMpRH and IMNpRH2 panels using the Carthagene program. Intermediate rough analyses of RH data allowed us to monitor the construction of the contig. In particular, they allowed us to orient a subcontig in the gene-poor region from ITZ002 to ITZ004 as well as to estimate the size of remaining gaps.
When the full RH data set was available for both panels, it appeared that at the scale of 10-100 kb the resolution of the IMpRH panel is not high enough, and furthermore that the order of genes determined on this panel is very sensitive to small genotyping errors. To produce a final reference map we thus computed a 1000:1 framework map using only the 35 vectors produced on the IMNpRH2 panel. The framework status of the map was tested by calculating the likelihood of the maps produced after all local permutations in a sliding window of 6 markers, and after inversions of parts of the map. We confirmed that no alternate order could be identified with a difference of log likelihood of less than 3 compared to the proposed order. The framework map contained 24 of the 35 IMNpRH2 markers. Using this framework map, comprehensive maps were produced on each panel. In order to avoid inflation of the map size, we chose to project additional markers at their most likely location, without altering the multipoint distance between framework markers (Fig. 3).
As shown in Fig. 3, the gene orders on the RH and physical maps are generally in good agreement. This agreement is perfect between the physical map and the 1000:1 framework RH map produced on the IMNpRH2 panel. It demonstrates that at the 50-100 kb scale, fully accurate maps can be produced on this panel provided that 1000:1 framework maps are drawn.
Some minor discrepancies can be found when comprehensive maps are drawn. For instance, the location of SPTBN4 on the IMNpRH2 map seems incorrect. However, a difference of log likelihood of only 1.57 is found between the maps constructed under the most likely order and the expected order. We thus think that our RH data do not sufficiently support the hypothesis of a very small rearrangement of this region. It should be pointed out that even if additional markers are added at their most likely location on this kind of comprehensive map, their mapping does not affect the distance calculated between framework markers.
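The 1000:1 criterion used for the framework map, and the reason the SPTBN4 discrepancy is not considered conclusive, can be made concrete with a small sketch. It assumes that the reported log-likelihood differences are on a log10 scale, so that odds of 1000:1 correspond to a difference of 3; the function name is hypothetical and is not part of Carthagene.

```python
import math

FRAMEWORK_ODDS = 1000                          # the 1000:1 criterion
LOG10_THRESHOLD = math.log10(FRAMEWORK_ODDS)   # equals 3 on a log10 scale


def order_is_supported(delta_log10_likelihood: float,
                       threshold: float = LOG10_THRESHOLD) -> bool:
    """True if the best order beats the alternative by at least the threshold.

    `delta_log10_likelihood` is the log10-likelihood of the best order
    minus that of the competing order (assumed log10 scale).
    """
    return delta_log10_likelihood >= threshold


# Difference reported in the text for the SPTBN4 placement.
print(order_is_supported(1.57))   # below the 1000:1 threshold
```

With a difference of 1.57 the most likely order is favoured by odds of only about 37:1, well short of 1000:1, which matches the cautious interpretation given above.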
We also compared the resolution of both panels on the framework map established between COX7A1 and BLVRP. On IMpRH the distance is 146 cR7000, whereas the same fragment is 438 cR12000 long on IMNpRH2. In this region the ratio between the resolutions is thus 3.01, which is slightly higher than the value of 2.77 observed in the PRKAG3-RN region [3] and of 2.43 observed in a QTL region close to the centromere of SSC 7 [12]. In the gene-rich region between RYR1 and BLVRP, which is precisely mapped on the reported clone contig, a ratio of 6.6 kb/cR12000 (1370 kb / 207 cR12000) is observed on the IMNpRH2 panel.
The RH map allowed us to confirm the close link between the two contigs we produced. The distance between the extremity markers of the contigs (ITZ002 and ITZ014) was estimated at 43.2 cR12000. Considering a ratio of 6.6 kb/cR12000 in this region, we can estimate that the physical distance between both contigs could be around 285 kb, which is roughly similar to the 360 kb distance that would be estimated from the human-pig comparative map.
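The gap estimate follows directly from the figures given above; the short sketch below reproduces the arithmetic. All numbers come from the text, while the variable names are chosen here for illustration.

```python
# Values from the text: RYR1-BLVRP interval on the clone contig and on IMNpRH2.
contig_span_kb = 1370
contig_span_cr = 207      # cR12000
gap_cr = 43.2             # ITZ002 to ITZ014, cR12000

kb_per_cr = contig_span_kb / contig_span_cr

# Gap size using the ratio rounded to one decimal, as quoted in the text,
# and using the unrounded ratio for comparison.
gap_kb_rounded = gap_cr * round(kb_per_cr, 1)
gap_kb_unrounded = gap_cr * kb_per_cr

print(f"ratio: {kb_per_cr:.2f} kb/cR12000")
print(f"gap (rounded ratio): {gap_kb_rounded:.0f} kb")
print(f"gap (unrounded ratio): {gap_kb_unrounded:.0f} kb")
```

Both variants give a gap of roughly 285 kb, so the estimate is not sensitive to the rounding of the kb/cR ratio.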

Conclusion
The IMNpRH2 panel allowed a highly accurate resolution of closely spaced markers and was very useful in evaluating the assembly of a clone contig. In most instances not only the order of markers but also the physical distances between markers could be very accurately estimated from the RH12000 map. During the contig building it helped us to orientate small sub-contigs, which were originally unlinked, and to estimate the size of remaining gaps. Combining analyses on both the IMpRH and IMNpRH2 panels makes it possible both to detect significant linkage between relatively distant markers on the IMpRH panel and to determine the accurate gene order on the higher-resolution IMNpRH2 panel.

DNA library screening and chromosome walking
Library screenings were done as described [8]. Briefly, the TAIGP714 PAC library [13] (http://www.rzpd.de) was screened by PCR of hierarchical DNA pools. The porcine genomic BAC library RPCI-44 was screened by radioactive hybridization according to the RPCI protocols (http://www.bacpac.chori.org).

DNA sequence analysis
End sequences of isolated BAC and PAC DNA were generated with a LICOR 4200L automated sequencer system. Further analyses were performed with the online tools of the European Bioinformatics Institute (http://www.ebi.ac.uk/), BLAST database searches in the GenBank database of the National Center for Biotechnology Information (NCBI) and the RepeatMasker searching tool for repetitive elements (Smit, A.F.A. and Green, P., http://repeatmasker.genome.washington.edu/). Single-copy sequences were used to design primer pairs for the chromosome walking using the programs GeneFisher (http://bibiserv.techfak.uni-bielefeld.de/cgi-bin/gf_submit?mode=START) and Primer3 (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi).

RH mapping
Prior to RH mapping of swine genomic inserts, each designed primer pair was tested for correct localization on SSC 6 on a somatic cell hybrid panel [14].

Statistical analysis of RH results
Vectors obtained on the IMpRH and IMNpRH2 panels were analyzed with the Carthagene software [15]. A framework map was built using the buildfw option, which constructs a 1000:1 framework map by a stepwise locus-adding strategy under the haploid model of fragment retention. The framework map was tested using a flips algorithm, which checks all local permutations in a window of 6 markers, and a greedy algorithm, which tries to improve the map by inversion of parts of the reference map. When the most likely order did not fit the expected order based on the human-pig comparative map, the likelihoods of the two possible orders were calculated to determine the strength of the indication of a possible modification of gene order between both species. The final framework map was recomputed under a diploid model. Additional markers were mapped relative to the framework map at their most likely location, projecting the markers on the map using the following formula (using the diploid model).
where Loc(M) is the location on the framework map of marker M mapped between the n-th and (n+1)-th markers of the framework (respectively named Fwk_n and Fwk_n+1), and Dmltpt(X,Y) and D2pt(X,Y) are the multipoint and two-point distances between markers X and Y.
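The projection formula itself is not reproduced in this text. One form that is consistent with the definitions given here, and with the stated requirement that projected markers must not alter the multipoint distance between framework markers, is a linear interpolation of the two-point distances onto the framework interval. The expression below is a reconstruction under that assumption, not a verbatim copy of the authors' formula; the term Loc(Fwk_n) for the position of the n-th framework marker is introduced here for illustration.

```latex
\mathrm{Loc}(M) = \mathrm{Loc}(\mathrm{Fwk}_n)
  + D_{\mathrm{mltpt}}(\mathrm{Fwk}_n, \mathrm{Fwk}_{n+1})
  \times
  \frac{D_{\mathrm{2pt}}(\mathrm{Fwk}_n, M)}
       {D_{\mathrm{2pt}}(\mathrm{Fwk}_n, M) + D_{\mathrm{2pt}}(M, \mathrm{Fwk}_{n+1})}
```

Under this form a marker whose two-point distances to both flanking framework markers are equal is placed at the midpoint of the interval, and the framework interval itself keeps its multipoint length regardless of how many markers are projected into it.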