QTLViewer: An interactive webtool for genetic analysis in the Collaborative Cross and Diversity Outbred mouse populations

The Collaborative Cross (CC) and the Diversity Outbred (DO) mouse populations are related multiparental populations (MPPs), derived from the same eight isogenic founder strains. They carry >50M known genetic variants, which makes them ideal tools for mapping genetic loci that regulate phenotypes, including physiological and molecular traits. Mapping quantitative trait loci (QTLs) requires statistical and computational training, which can present a barrier to access for some researchers. The QTLViewer software is a graphical user interface webtool for CC and DO studies that performs QTL mapping through the R/qtl2 package. Additionally, the QTLViewer website serves as a repository for published CC and DO studies, allowing users to explore QTL mapping results interactively and increasing the accessibility of these genetic resources to the broader scientific community.

for the CC and the DO compared to intercrosses or backcrosses due to additional generations of meiosis, which facilitates the identification of candidate genes (Solberg Woods 2014).
Mouse experiments support the collection of multiple types of data on the same individuals, including physiological traits and molecular assays (e.g., gene expression) across multiple tissues. Integrative approaches like mediation analysis can be used to

trait_i = QTL_ij + covariates_i + kinship_i + error_i

Equation 1

where trait_i is the phenotype value of mouse i, QTL_ij is the effect of locus j on mouse i being tested, covariates_i is the cumulative effect of all covariates on mouse i, kinship_i is a random term that captures noise variation for mouse i due to population structure, and error_i is the independent random noise for mouse i. The structure of the kinship term is encoded in a genetic relationship matrix (K) estimated from the genotypes. We use the "leave one chromosome out" (LOCO) approach, in which the K used for each locus j fit by Equation 1 excludes all markers from the chromosome of locus j, which improves QTL mapping power (Wei and Xu 2016).

For additive QTL, the QTL term represents allele dosages of founder haplotypes at the locus. The QTLViewer will also plot the regression coefficients from the QTL term, i.e., the founder allele effects, when doing haplotype-based analysis. These effects can also be re-estimated as best linear unbiased predictors (BLUPs), which reduces the impact of rare alleles and can make signals clearer.

For interactive QTL, a similar model to Equation 1 is used to test a locus-by-factor interaction effect at loci across the genome:

trait_i = QTL_ij + factor_i + (QTL × factor)_ij + covariates_i + kinship_i + error_i

Equation 2
where factor_i is a covariate of interest for mouse i that may have an interaction effect with genotype at locus j, (QTL × factor)_ij is the QTL-by-factor interaction term at locus j being tested, and all other terms are as defined before. Possible factors include sex and age.

The QTLViewer can also perform association mapping on bi-allelic variants using both additive and interactive models. Rather than encoding genetic effects based on doses of founder haplotypes, the QTL and (QTL × factor) terms in Equations 1 and 2 can be fit to doses of alleles of specific variants, imputed from dense genotypes of the founder strains. Although variant association mapping sacrifices some of the information present in the founder haplotypes, it does enable researchers to potentially identify specific variants of interest and prioritize candidate genes.

Briefly, Equation 1 is re-used (with the kinship term excluded for computational efficiency) and the QTL of interest re-tested, but now conditioning one-by-one on candidate mediators. A strong candidate mediator will localize near the QTL and strongly reduce the QTL LOD score.

box and press Enter. The search algorithm is specialized for -omics data like gene expression, recognizing gene symbols, gene names, and Ensembl identifiers. All elements that match the search criteria will be displayed in a table below, but only elements that are in the dataset will be displayed in blue and clickable (Figure 3B). This feature takes advantage of the Ensimpl database and webservice, which currently hosts

Clicking on a point from the LOD Peaks grid or selecting a trait in the text search will generate a genome-wide LOD plot for the trait (Figure 4A). Information about each LOD score may be accessed by hovering. The Plots select box will switch between additive and factor-interactive QTL models.
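The mediation test described above can be illustrated with a small, self-contained sketch. It is not the QTLViewer or R/qtl2 implementation. It assumes a single allele dosage in place of the eight founder haplotype dosages, omits the kinship term (as the text does for mediation), uses made-up toy values, and computes the LOD score as (n/2) × log10(RSS_null / RSS_alt) from ordinary least squares. The helper names (`solve`, `rss`, `lod`) are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def rss(y, columns):
    """Residual sum of squares of y regressed on an intercept plus the given columns."""
    x = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    p = len(x[0])
    xtx = [[sum(r[j] * r[k] for r in x) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(x, y)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(x, y))

def lod(y, qtl, covariates=()):
    """LOD of the QTL term: null model has covariates only, alternative adds the QTL.

    Assumption: no kinship term, plain least squares.
    """
    cov = list(covariates)
    return len(y) / 2 * math.log10(rss(y, cov) / rss(y, cov + [qtl]))

# Toy data (invented for illustration): the locus acts on the trait through the mediator.
dosage = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]                       # allele dosage at the QTL
mediator = [0.1, -0.1, 1.2, 0.8, 2.1, 1.9, 0.0, 1.1, 1.9, 1.0]  # e.g., expression of a nearby gene
trait = [0.25, -0.1, 2.3, 1.65, 4.15, 3.9, -0.1, 2.2, 3.85, 1.9]

lod_qtl = lod(trait, dosage)                               # QTL tested on its own
lod_mediated = lod(trait, dosage, covariates=[mediator])   # QTL re-tested conditioning on the mediator
print(f"LOD without mediator: {lod_qtl:.2f}")
print(f"LOD with mediator:    {lod_mediated:.2f}")
```

In this toy data the trait depends on the locus only through the mediator, so the LOD score drops sharply once the mediator is included as a covariate, which is the pattern expected of a strong candidate mediator.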
Clicking on a locus of interest (i.e., the peak LOD

The QTLViewer contains multiple R data objects that supply all the necessary inputs for the analyses, including traits, covariates, and genotype probabilities. In addition to exploring the data interactively on the QTLViewer webpage, users can download the corresponding R data objects by clicking on Download Data at the top of the page (Figure 7A). All plots and analyses generated above can be downloaded as figures or data tables by clicking on the top right corner button on each plot (Figure 7A). When clicking on Download Data, users will be redirected to a new page displaying all the downloadable RData files (Figure 7B). There, users will find a "core" RData file containing all the input needed for mapping, such as genotype probabilities and marker information. Users can also download the "dataset" RDS files that contain the trait data, sample and assay metadata annotations, and a summary of the QTL mapping results (Figure 7C). This functionality enables users to quickly gain access to processed data files to run further analyses.

Future directions
The R/qtl2 software at the core of QTLViewer is very general and can accommodate a wide range of cross designs and data from model organisms other than the mouse. We plan to extend the QTLViewer to include mouse backcrosses,