A High-Resolution Genetic Map for the Laboratory Rat

An accurate and high-resolution genetic map is critical for mapping complex traits, yet the resolution of the current rat genetic map is far lower than human and mouse, and has not been updated since the original Jensen-Seaman map in 2004. For the first time, we have refined the rat genetic map to sub-centimorgan (cM) resolution (<0.02 cM) by using 95,769 genetic markers and 870 informative meioses from a cohort of 528 heterogeneous stock (HS) rats. Global recombination rates in the revised sex-averaged map (0.66 cM/Mb) did not differ compared to the historical map (0.65 cM/Mb); however, substantial refinement was made to the localization of highly recombinant regions within the revised map. Also for the first time, sex-specific rat genetic maps were generated, which revealed both genomewide and fine-scale variation in recombination rates between male and female rats. Reanalysis of multiple quantitative trait loci (QTL) using the historical and refined rat genetic maps demonstrated marked changes to QTL localization, shape, and effect size. As a resource to the rat research community, we have provided revised centimorgan positions for all physical positions within the rat genome and commonly used genetic markers for trait mapping, including 44,828 SSLP markers and the RATDIV genotyping array. Collectively, this study provides a substantial improvement to the rat genetic map and an unprecedented resource for analysis of complex traits and recombination in the rat.


rat genetic map recombination
Genetic maps are constructed by calculating the frequency of meiotic recombination between genomic markers to define the linear order and relative distance between loci. In addition to refining the physical map of a species (i.e., the genomic sequence), a genetic map is fundamentally important for the analysis of quantitative trait loci (QTL) that link disease phenotypes to genomic intervals containing disease modifiers.
The accuracy of a genetic map influences the correct calculation of a QTL intervals, because the linkage of a trait to a genomic interval is a function of the genetic distance (i.e., the expected recombination frequency) between the two flanking markers (Lander and Botstein 1986). Thus, miscalculating the genetic distance between two flanking markers is expected to alter the QTL interval, which ultimately impacts the list of candidate disease modifier(s) within the QTL. Indeed, a recent refinement to the mouse genetic map revealed dramatic changes to previously characterized QTL (Cox et al. 2009).
One way to improve the resolution of a genetic map is to increase the number of meiotic events and genetic markers that are assessed. This approach has been adopted for constructing high resolution genetic maps of the human (104,246 meioses; 833,754 markers) (Bhérer et al. 2017) and mouse (15,832 meioses; 120,789 markers) (Morgan et al. 2017). However, in stark contrast, the genetic map for the rat has never been updated and is currently based on a considerably smaller dataset (90 meioses; 2,305 markers) (Jensen-Seaman et al. 2004;Steen et al. 1999), with a map resolution of only 1.1 cM, as compared with resolutions of ,0.004 cM (Bhérer et al. 2017) and ,0.01 cM (Morgan et al. 2017) for the human and mouse genetic maps, respectively. As such, the relatively low resolution of the current rat genetic map is a potential limitation for accurately mapping QTL in the rat.
Here, we generated the first high resolution sex-averaged genetic map of the rat using 95,769 informative markers and 528 NIH-Heterogeneous Stock (HS) rats (65 families; 870 meioses), with an average resolution of ,0.02 cM. Additionally, we produced the first sex-specific genetic maps for the rat, which recapitulated the dimorphic recombination rates observed in human (Bhérer et al. 2017) and mouse (Cox et al. 2009). Finally, reanalysis of historical QTL using the refined rat genetic maps revealed changes to QTL localization, shape, and effect size.

Rats
The HS rat colony was initiated by the National Institutes of Health in 1984 using eight inbred strains (ACI/N, BN/SsN,BUF/N,F344/N,M520/N,MR/N,WKY/N,and WN/N) (Hansen and Spuhler 1984).

Genotyping and Data Cleaning
The collection of genotypes from 528 HS rats was described previously (Baud et al. 2014;Rat Genome Sequencing and Mapping Consortium et al. 2013) and the genotype data used for this study can be accessed from (Baud et al. 2014). Briefly, genomic DNA were fragmented with NspI and StyI endonucleases and hybridized to the custom RATDIV array that assays 797,968 genotypes spread evenly across the rat genome, with a median inter-SNP distance of 3,470bp (RGSC 6.0/rn6). High quality genotypes were called using the BRLMM-P component of the Apt-Probeset-Genotype program that is part of the Affymetrix Power Tools suite (apt-1.10.2), as described previously (Rat Genome Sequencing and Mapping Consortium et al. 2013). The genotype data were further cleaned to remove monomorphic SNPs and genotypes with high Mendelian inheritance error rates (.2% error accross the cohort) were identified by PLINK and removed (Purcell et al. 2007). After data cleaning, a total of 412,430 high-quality polymorphic genotypes remained, which far exceeds the number of genotype markers required for accurate estimation of the rat genetic map. Thus, we performed an unbiased selection of the first genotype (MAF . 0.05) per every 10Kb window across the genome, which yielded a final list of 132,521 informative genotypes that were uniformly distributed across the genome.

Construction of the High-Resolution Rat Genetic Map
The rat genetic map was constructed using Lep-MAP3 (LM3) (Rastas 2017), which is a linkage mapping program designed for large genotyping datasets, such as the high-density genotyping data used in our study. The following LM3 functions were used to construct the rat genetic map for each of the 20 autosomes and for the X chromosome: 1) the ParentCall2 function was used to call parental genotypes by taking into account genotype information of grandparents, parents, and offspring; 2) the Filtering2 function was used to remove those markers with segregation distortion or those with missing data using the default setting of LM3, which resulted in the removal of 1.2% genotype markers; and 3) the OrderMarkers2 function was used to compute cM distances (i.e., recombination rates) between all adjacent markers per chromosome. Although LM3 was previously shown to be highly accurate in ordering markers (Rastas 2017), we detected multiple markers with cM positions in obvious disagreement with their reported physical position in the RGSC 6.0/rn6 genome assembly. Errant markers with disagreement between their genetic and physical positions could be due to multiple factors, including errors within the RGSC 6.0/rn6 genome assembly or the inability to fully resolve the genetic position of some markers in the genetic map, due to low allelic frequency and the limited size of the cohort of 528 individuals. A two-stage process was used for map cleaning a given chromosome for each of the sexspecific genetic distances produced by LM3 to build sex-specific and sex-averaged genetic maps. First, a locally weighted regression model (LOESS) (Cleveland and Devlin 1988) was used to assess the relationship between physical (bp) and genetic (cM) positions. Markers with absolute residuals of .1 were identified and removed since they disrupt the expected monotonically increasing behavior between the physical and genetic maps. Second, a cubic splines model was fitted to the plot of genetic vs. physical distances of the remaining markers to estimate local recombination rates and to interpolate the genetic positions of markers on the original SNP array for each gender, based on their positions in RGSC 6.0/rn6. A final set of 95,769 informative markers were analyzed across ten independent replicate LM3 analyses and were used to calculate the sex-averaged and sexspecific rat genetic maps.

Assigning Recombination Rates to Physical Positions and Genotyping Markers
Sex-averaged and sex-specific centimorgan distances were assigned to every kilobase of the physical map (RGSC 6.0/rn6) by interpolation with the revised rat genetic maps. Likewise, sex-averaged and sex-specific centimorgan distances were assigned to each genotype position within the RATDIV array (Rat Genome Sequencing and Mapping Consortium et al. 2013). A total of 44,828 SSLP markers with physical positions were also given sex-specific and sex-averaged centimorgan distances by interpolation with the revised rat genetic maps.

QTL Reanalysis
To assess the impact of the revised rat genetic map on QTL mapping, we re-examined the data from a previously published F2 reciprocal cross using WKY and F344 rats (Solberg et al. 2004). The Animal Care and Use Committee approval of the study and a detailed description of the methods can be found in the original article (Solberg et al. 2004). Data were downloaded directly from the Mouse Phenome Database (MPD; https://phenome.jax.org/centers/QTLA) and QTL were reanalyzed using the R/qtl software package (Broman et al. 2003) (version 1.41-6; http://www.rqtl.org/). The primary phenotypes-of-interest were related to depression-like behavior in rodents (e.g., immobility and climbing). Four F2 rats with missing data on both immobility and climbing (3 male and 1 female) and four SSLP markers with a high rate of missing data (D6Rat24, D6Rat128, D6Rat88, D6Mgh4) were excluded. A total of 482 F2 rats (222 females and 260 males) and 101 autosomal SSLP markers were used in the QTL re-analysis. No other phenotypic outliers or problematic markers were detected, as would be expected from using pre-cleaned data from the MPD. A single-locus genomewide QTL scan was performed and LOD scores were calculated at 2 cM interval across the genome, using the imputation method implemented in R/qtl (Sen and Churchill 2001) and significance was determined on the basis of 1000 permutations of the data (Churchill and Doerge 1994). A LOD score exceeding the 0.05 genomewide adjusted threshold was considered significant, whereas a LOD score exceeding the 0.63 genomewide adjusted threshold was considered to be suggestive (Lander and Kruglyak 1995). The Bayes credible interval function in R/qtl (bayesint) was used to approximate the 95% confidence intervals for the QTL peak location for both the additive and the interactive models, as described in (Solberg et al. 2004). QTL that were detected using the original or repositioned markers were compared for correspondence in peak positioning, shape, and significance level, as described previously (Cox et al. 2009).

Data Availability
The GSA Figshare portal was used to upload supplemental files related to this study. The genotype data used for this study can be accessed from (Baud et al. 2014) and can be directly downloaded from https://www. ebi.ac.uk/arrayexpress/experiments/E-MTAB-2332/. File S1 contains detailed descriptions of all supplemental files. File S2 contains sexaveraged and sex-specific centimorgan distances that were assigned to every kilobase of the physical map (RGSC 6.0/rn6) by interpolation with the revised rat genetic maps. File S3 contains sex-averaged and sexspecific centimorgan distances for each genotype position within the RAT-DIV array. File S4 contains sex-averaged and sex-specific centimorgan distances for 44,828 SSLP markers with assigned physical positions from the Rat Genome Database (http://rgd.mcw.edu/). Supplemental material available at Figshare: https://doi.org/10.25387/g3.6260888.

Construction of the high-resolution rat genetic map
To generate the first high-density genetic map of the laboratory rat, genomewide genotyping data were collected from 528 HS rats using the RATDIV high-density SNP array. Following data cleaning (see Methods section), a final set of 95,769 informative markers were used to calculate the sex-specific centimorgan distances using the LM3 program (Rastas 2017). The average resolution of the revised rat genetic map was ,0.02 cM, which is considerably higher than the 1.1 cM resolution of the previous map (Jensen-Seaman et al. 2004). As shown in Table 1, the sizes of the most recent physical map (RGSC 6.0/rn6) and the revised genetic map were both increased by roughly 10% compared with the circa 2004 assemblies of the rat physical map (Baylor 3.4/rn4) and rat genetic map (Jensen-Seaman et al. 2004), yielding comparable average recombination rates of 0.66 cM/ Mb and 0.65 cM/Mb in the Jensen-Seaman map and the revised rat genetic map, respectively. Compared with males, the female-specific rat genetic map was approximately 10% larger (1,826 cM vs. 1,589 cM), which is similar to the dimorphic recombination reported in mouse (Cox et al. 2009) and human (Bhérer et al. 2017). To our knowledge, a sex-specific genetic map for the rat has not previously been reported, precluding the comparison of sex-specific recombination rates with historical data. Nonetheless, these data suggest that the revised genetic map is of comparable size to the Jensen-Seaman map, when accounting for the larger physical assembly of RGSC 6.0/rn6 compared with the Baylor 3.4/rn4 genome assembly. Moreover, the revised rat genetic map provides novel insight to the disparate recombination rates between males and females.
Updated distribution of recombination rates in the rat genome Although the average recombination rates of the rat genome did not differ greatly from the Jensen-Seaman map, we observed marked shifts in the distribution and amplitude of recombination frequencies in the revised genetic map (Figure 1). We attribute the altered distribution of recombination rates within the revised genetic map to more accurate marker placement, which is likely due to the greater number of informative haplotypes within the HS population, as well as a 10-fold increase in informative meioses and .20-fold higher map resolution compared with the historical Jensen-Seaman map. Another noted advantage of the revised rat genetic map is the characterization of dimorphic recombination rates, which to our knowledge has not previously been reported for the rat. As found in other mammalian genomes (Bhérer et al. 2017;Cox et al. 2009), a substantial portion of the rat genome showed dimorphic recombination rates and higher recombination rates were typically observed in females ( Figure 2). However, despite the genomewide trend of increased recombination in females compared with males, fine-scale variation in dimorphic recombination rates was pervasive and highly recombinant regions in both sexes were detected. Collectively, these data reveal for the first time the dimorphic recombination rates in the rat.
Impact of the revised rat genetic map on QTL identification Accurate marker placement is crucial for correctly localizing QTL. To assess whether the revised rat genetic map affects QTL localization, we n  (Solberg et al. 2004). Identical methodology was used to calculate QTL, except for altering the marker positions based on the Jensen-Seaman map or the revised rat genetic map. Phenotypic data (e.g., immobility and climbing scores) from the F2 generation male and female rats (total n = 486) were retrieved from the MPD and QTL were calculated using historical and revised marker positions. Reanalysis of the QTL using the original sex-averaged marker positions from the Jensen-Seaman map recapitulated all 12 QTL that were previously reported for rat immobility and climbing phenotypes in the F2 generation. In comparison, analysis of the QTL using the revised genetic map revealed multiple QTL with altered shape, localization, and effect size. A comparison of the peak LOD scores and 95% confidence intervals (CI) of QTL calculated with the original or revised maps revealed 9 out of 32 QTL with shifts of greater than 10 cM (Table 2). Collectively, these data demonstrate that the accuracy of marker placement has a dramatic impact on QTL localization, which would subsequently impact identification of the underlying causative gene(s).
Assignment of centimorgan distances to the physical map and all SSLP markers within the rat genome As a resource to the scientific community, we have provided the sexaveraged and sex-specific centimorgan distances per every kilobase of the current physical map (RGSC 6.0/rn6) (File S2). We have also provided the revised centimorgan positions for every genotype position Figure 2 Comparison of the sex-specific rat genetic maps. The recombination rates for all chromosomes are compared between the male (blue) and female (red) rat genetic maps. For reference, the revised sex-averaged map is also plotted (black hashed line).
within the RATDIV genotyping array (Rat Genome Sequencing and Mapping Consortium et al. 2013) (File S3) and for 44,828 rat SSLP markers with unique physical positions that are currently annotated in the Rat Genome Database (rgd.mcw.edu) (File S4). Finally, in the process of curating the known SSLP markers, we identified 8,457 markers in the RGD database that aligned to multiple positions or were assigned no position in the current physical genome build. These unmapped markers potentially lie within poorly resolved or duplicated regions of the RGSC 6.0/rn6 assembly and we therefore do not recommend using these SSLP markers for mapping studies until the locations of the markers have been resolved in future genome builds.

DISCUSSION
In contrast to frequent revisions of high-resolution human and mouse genetic maps (Bhérer et al. 2017;Cox et al. 2009), the rat genetic map has been restricted to a relatively low-resolution (1.1 cM) map for the past two decades (Jensen-Seaman et al. 2004). To address this issue, we constructed a revised rat genetic map from a single large cohort of 528 HS rats, which is to our knowledge the most diverse genetic resource for map building in the rat (capturing haplotypes from 8 parental strains) and the highest-resolution map (,0.02 cM) of any rat genetic map to date. These data were used to assign highly accurate centimorgan positions, for the first time, to all physical positions within the rat genome (RGSC 6.0/rn6), to all genotypes captured by the RATDIV array, and to 44,828 SSLP markers by interpolation with a highresolution, revised rat genetic map. To our knowledge, this is also the first rat genetic map to analyze both sex-averaged and sex-specific recombination rates, which were used to generate sex-specific rat genetic maps for the first time. Using these data, we demonstrate that QTL localization is affected by the accuracy of marker positions. Collectively, the current study provides a new consensus rat genetic map and a common framework for candidate gene mapping in the rat.
Changes to the rat genetic map and recombination rates After accounting for the larger physical size of the RGSC 6.0/rn6 rat genome build (2,619 Mb) compared with the original Baylor 3.4/rn4 rate genome build (2,554 Mb), the increased size of the rat genetic map (1,708 cM) is proportional to the original Jensen-Seaman map (1,542 cM). Thus, although the coordinates of highly recombinant regions in the rat genome were refined in the revised rat genetic map, the sexaveraged genomewide recombination rates did not change (0.66 cM/Mb vs. 0.65 cM/Mb). Although the genomewide recombination rates did n not change, fine-scale localization of highly recombinant regions differed between the Jensen-Seaman map and the revised rat genetic map. One potential reason for the refined localization of highly recombinant regions in the revised rat genetic map is the greater potential of genetic variation due to the possibility of eight informative HS founder haplotypes per genomic position, whereas prior rat genetic maps relied on crosses between two parental strains with less genetic variation (Bihoreau et al. 2001;Jensen-Seaman et al. 2004). In some instances within the Jensen-Seaman map, recombinations likely occurred, but were poorly localized due to sparse marker density and the limited number of informative meioses. Thus, it stands to reason that the increased genetic diversity of our cohort of 528 HS rats, combined with the greater number of informative meioses (870 vs. 90), and the higher density of genetic markers (95,769 vs. 2,305), would enable more precise localization of crossover events in the revised map compared with the Jensen-Seaman map that was based on a much smaller cross (45 rats) of only two strains (SHRSP x BN).
In addition to improving map resolution, we have analyzed the first sex-specific genetic maps for the rat. Similar to human (Bhérer et al. 2017) and mouse (Cox et al. 2009), the female-specific rat genetic map was larger than the male-specific map, indicating that the genomewide average of recombination events are more frequent in female rats compared with males. Nonetheless, fine-scale variability in the ratio of male-to-female recombination is pervasive across the genome and multiple highly recombinant regions that are unique to either gender were detected. Notably, the underlying mechanisms of dimorphic recombination have yet to be examined in the rat, as sex-specific genetic maps were not previously available. Several mechanisms of dimorphic recombination have been proposed in other species though, including the sex-specific crossover interference (Petkov et al. 2007), sex-specific molecular mediators of recombination (Bhérer et al. 2017), and sexspecific haploid selection (Lenormand and Dutheil 2005). The current findings suggest that similar sex-specific mechanisms potentially underlie the dimorphic recombination rates in the rat and could be explored using the HS model.

The impact of the revised genetic map on QTL localization
Reanalyzing QTL using the refined rat genetic map revealed notable shifts in the localization and significance of multiple previously mapped loci ( Table 2). As we did not detect disordered markers in the revised map compared with the original dataset (Solberg et al. 2004), the QTL changes were likely due to the refinement of genetic distance between markers and its impact as a function of interval mapping (Lander and Botstein 1989). Our findings recapitulate similar changes in QTL peaks that were observed following revision of the mouse genetic map (Cox et al. 2009) and demonstrate the importance of a highresolution genetic map for QTL localization. Although the impact of the refined rat genetic on other QLT mapping studies has yet to be investigated, our data suggest that reanalysis of other historical datasets might help to localize the regions of interest.

Conclusions
In summary, this study has provided a substantial refinement to the rat genetic map, which had not been updated since 2004 (Jensen-Seaman et al. 2004). The refined rat genetic map provides sub-centimorgan resolution and sex-specific maps for the first time, which highlighted the dramatic variation in recombination rates between males and females. An added advantage of the revised rat genetic map is its derivation from the eight common inbred founder strains within the HS rat population and therefore the large number of genetic loci that were placed on the map are likely to be applicable to the majority of independent linkage analyses. Importantly, the refined marker positions in the sex-averaged map strongly impacted the localization, shape, and effect size of multiple QTL, indicating that other existing and future QTL mapping studies might be similarly affected. The refined rat genetic map will also serve as a significant resource to the growing community of geneticists utilizing outbred rat models to fine-mapping QTL, such as the HS rat (Keele et al. 2018;Rat Genome Sequencing and Mapping Consortium et al. 2013;Solberg et al. 2004). As such, we have provided for the rat research community, the refined genetic distances for all physical positions in the RGSC 6.0/rn6 rat genome build (1kb bins; File S2), for all genotypes on the RATDIV genotyping array (File S3), and for all SSLP markers with physical positions in the RGSC 6.0/rn6 rat genome build (File S4). These data represent an unprecedented genetic resource for mapping candidates in the rat and are on par with the current genetic map resources that are currently available for the human and mouse.