Carol J. Bult, Debra M. Krupke, Dale A. Begley, Joel E. Richardson, Steven B. Neuhauser, John P. Sundberg, Janan T. Eppig, Mouse Tumor Biology (MTB): a database of mouse models for human cancer, Nucleic Acids Research, Volume 43, Issue D1, 28 January 2015, Pages D818–D824, https://doi.org/10.1093/nar/gku987
Abstract
The Mouse Tumor Biology (MTB; http://tumor.informatics.jax.org) database is a unique online compendium of mouse models for human cancer. MTB provides online access to expertly curated information on diverse mouse models for human cancer and interfaces for searching and visualizing data associated with these models. The information in MTB is designed to facilitate the selection of strains for cancer research and is a platform for mining data on tumor development and patterns of metastases. MTB curators acquire data through manual curation of peer-reviewed scientific literature and from direct submissions by researchers. Data in MTB are also obtained from other bioinformatics resources including PathBase, the Gene Expression Omnibus and ArrayExpress. Recent enhancements to MTB improve the association between mouse models and human genes commonly mutated in a variety of cancers as identified in large-scale cancer genomics studies, provide new interfaces for exploring regions of the mouse genome associated with cancer phenotypes and incorporate data and information related to Patient-Derived Xenograft models of human cancers.
INTRODUCTION
The laboratory mouse has long served as a model system for investigations into human biology and disease because its physiology is similar to that of humans, because there is a high degree of conservation in genes and genome organization, and because the mouse genome is amenable to precise manipulation (e.g. transgenesis, targeted mutation, recombineering, CRISPR/Cas9, etc.) (1–4). The genetic uniformity of inbred strains contributes to the experimental power of mice to drive discovery of the contributions of specific genes and modifiers for cancer susceptibility and resistance (5). Inbred lines of mice are ideal for identifying and studying low penetrance cancer genes (5,6) which are particularly important for uncovering the genetic basis of susceptibility to cancers in human populations when there is no evidence of familial segregation (7). Studies of mice with induced and engineered changes in specific genes or transgenes have provided fundamental insights into the underlying molecular genetics of cancer initiation and progression (8). Although the limits of model systems for the faithful recapitulation of all aspects of human disease must be appreciated (9,10), the laboratory mouse is widely recognized as the premier animal model for investigating clinically relevant aspects of tumor biology (11–13).
Understanding the influence of genetic background on phenotype variation is critical for the creation of valid mouse models of human cancer and for the appropriate use of these models in cancer research (14). Failure to account for genetic background can lead to confounded and misleading interpretations of mouse model data. For example, transgenic mice expressing human HRAS on a mixed genetic background of the C57BL/6 and SJL strains were reported to have a mammary gland carcinoma frequency of 45–50% (15). The same transgene allele on an FVB/N strain congenic background resulted in a mammary gland carcinoma frequency of 100% (16). Along similar lines, a recent study reported that both incidence and latency of thymic lymphoma in mouse models varied depending on the genetic background into which the Atm<tm1Awb> allele was introgressed (17). The impact of drug treatments for specific tumor types also needs to be evaluated in the context of what is ‘normal’ for the genetic background of the mice in the experiment. For example, a study by Roberts et al. (18) demonstrated the importance of genetic background in using animal models to investigate responses to pharmacological intervention.
One of the major trends in translational cancer research is the renewed interest in xenograft models using immunodeficient mice. Although xenografts have been used in cancer research for decades, progressive improvements in the transplant-compliant host animal have greatly enhanced the utility of these models in basic and translational cancer research (19). ‘Human-in-mouse’ xenografts (i.e. Patient-Derived Xenografts, PDX) are created by implanting primary human tumor material directly into an immunodeficient host mouse. Several routes for implantation are possible, including subcutaneous, under the renal capsule, tail vein injection and orthotopic. PDX models allow researchers to directly study human cells and tissues in vivo (20,21). These models have an advantage over cell lines and cell line xenografts because the tumors retain a more natural architecture and are more reflective of the heterogeneity and histology seen in primary tumors (10). They have an advantage over tumors that arise in genetically engineered mice in that the xenografts retain (at least for a period of time following the initial implantation) a human-specific microenvironment. Several murine hosts are used routinely in xenograft studies, including nude (Foxn1<nu>) and SCID (Prkdc<scid>). However, much of the current state-of-the-art xenograft research in cancer relies on NOD.Cg-Prkdc<scid> Il2rg<tm1Wjl>/SzJ (NOD scid gamma or NSG) mice (20,22). NSG mice lack mature T cells, B cells and functional natural killer (NK) cells, and they are deficient in cytokine signaling. As with genetic models of human cancer, the genetics of the host strain in xenograft models is important for selecting an appropriate model system. Engraftment success varies among host strains and is an important factor in the experimental design of PDX studies (23). Curating this information from the published cancer xenograft literature is one of the future content enhancements planned for Mouse Tumor Biology (MTB) (see Future Directions, below).
The MTB (http://tumor.informatics.jax.org) database is a freely available community informatics resource designed to support the effective use of the laboratory mouse as a model system for investigating the genetic and genomic basis of human cancer. MTB provides expertly curated, semantically consistent data to guide researchers in the selection of appropriate strains and mutant mice for experimentation. A fundamental biological principle at the heart of MTB's design and functionality is that the genetic background of mice greatly influences the choice, use, interpretation and translational utility of these models in cancer research. MTB is extending curated content and search functionality beyond the past focus solely on genetic models of human cancer to include mouse models and related data that are important for translational and pre-clinical cancer research. In this paper we describe recent changes to user interface and data acquisition strategies for MTB.
MTB CONTENT
The MTB database was first released on the World Wide Web in 1998 (24) with a primary focus on genetically engineered mouse models of human cancer. Data in MTB related to these models include the frequency of mouse tumors in the context of specific genetic backgrounds, histopathology images of mouse tumors and information about specific mutations/allelic variants in mouse tumors. As genome sequencing became a common approach for characterizing human tumors (25), we developed new interfaces for MTB to connect lists of human genes commonly mutated in various cancers to mouse models carrying mutations in those genes. As PDX models have become increasingly important as a platform for basic and translational cancer research (26), we have initiated curation efforts to represent this literature and these models in MTB. A summary of the trends in MTB's data content since the inception of the resource is provided in Table 1.
Table 1. Annotation trends in MTB since the project's inception
Data type | January 1999 | October 2004 | October 2009 | August 2014 |
---|---|---|---|---|
Annotated references | 142 | 1164 | 2594 | 3730 |
Tumor frequency records | 3699 | 23 778 | 35 832 | 61 655 |
Genetically defined strains | 623ᵃ | 3559ᵃ | 3764 | 6057 |
Tumors associated with specific genes | 105 | 800 | 1330 | 2288 |
Images (histopathology, SKY, FISH, etc.) | — | 1318 | 3882 | 5886 |
PDX models | — | — | — | 357 |
ᵃ Male and female mice were stored as separate strain entries until 2004.
Nomenclature and annotation standards are essential for scientific communication and for accurate and complete data retrieval and aggregation (27). The nomenclature and semantic standards implemented in MTB benefit researchers by greatly simplifying what can be a frustrating and time-consuming task of finding and comparing data about mouse models. For all of the gene, allele, mutation and strain data in MTB, curators enforce the nomenclature standards set by the International Committee on Standardized Nomenclature for Mice (http://www.informatics.jax.org/nomen/) and the HUGO Gene Nomenclature Committee (28). The importance of standardized genetic nomenclature is illustrated by alleles of the Fgfr3 gene (fibroblast growth factor receptor 3, MGI:95524). Mouse models carrying mutations in Fgfr3 have been observed to promote skin and lung cancer (29), but finding the relevant model using non-standard nomenclature (e.g. Fgfr3−/− or Fgfr3+/−) is complicated by the fact that more than 25 targeted alleles of Fgfr3 have been published, 11 of them originating from the same laboratory that published the observation of increased tumorigenesis in lung and skin. Only one of the 11 alleles, Fgfr3<tm4Cxd> (MGI:2135675), in combination with a targeted mutation in Ctnnb1 (Ctnnb1<tm1Mmt>) and on a mixed genetic background, is associated with increased lung tumorigenesis.
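To make the nomenclature point concrete, the following is a minimal Python sketch of how a standardized allele symbol can be split into its gene and allele parts. It assumes the common plain-text convention of writing the superscripted allele designation in angle brackets (e.g. Fgfr3<tm4Cxd>). The helper name `split_allele_symbol` and the pattern are illustrative assumptions, not part of MTB or MGI software, and more complex symbols such as transgenes are not handled.

```python
import re
from typing import NamedTuple, Optional


class AlleleSymbol(NamedTuple):
    gene: str               # gene symbol, e.g. "Fgfr3"
    allele: Optional[str]   # allele designation, e.g. "tm4Cxd"; None if absent


# A gene symbol followed by an optional allele designation in angle brackets.
# This pattern is a simplification chosen for illustration.
_ALLELE_RE = re.compile(r"^(?P<gene>[A-Za-z0-9.\-]+?)(?:<(?P<allele>[^<>]+)>)?$")


def split_allele_symbol(symbol: str) -> AlleleSymbol:
    """Split 'Gene<allele>' into gene and allele parts (hypothetical helper)."""
    match = _ALLELE_RE.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a recognizable allele symbol: {symbol!r}")
    return AlleleSymbol(match.group("gene"), match.group("allele"))


if __name__ == "__main__":
    for text in ["Fgfr3<tm4Cxd>", "Ctnnb1<tm1Mmt>", "Fgfr3"]:
        print(text, "->", split_allele_symbol(text))
```

A genotype shorthand such as `Fgfr3-/-` does not match the pattern and raises `ValueError`, which mirrors the problem described above: the shorthand does not say which of the many targeted Fgfr3 alleles is meant.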
For terminologies associated with tumor classification annotations, MTB adopts standards set by pathologist working groups that are convened on a regular basis to evaluate neoplasias of specific anatomical systems and to develop standard diagnoses and terminologies. Examples of standard classifications for mouse tumors to emerge from these workshops include the lymphohematopoietic system (30), lung (31), mammary gland (32), gastrointestinal system (33), nervous system (34), pancreas (35) and prostate (36). We also work with related bioinformatics resources such as PathBase (37) on mapping terms to ensure broad dissemination of standard tumor classification and diagnosis terminologies for mouse models.
DATABASE ENHANCEMENTS
MTB can be searched using web-based search forms and interactive graphical summaries of mouse strain characteristics related to cancer phenotypes (24,38). Four recent enhancements to data content and user interfaces have been implemented in MTB to further advance the use of the mouse for understanding the genetic and genomic basis of cancer. These enhancements are (i) customizable genome maps of Quantitative Trait Loci (QTL) associated with cancer phenotypes mapped in mice, (ii) search tools for finding data sets associated with cancer genomics studies in laboratory mice in public archives, (iii) search tools for finding mouse models using human gene symbols and (iv) access to information and data associated with PDX models of human cancer.
Cancer QTL in mouse
The laboratory mouse is a powerful genetic model for identifying genes associated with cancer susceptibility and resistance (39). MTB's Cancer QTL Viewer (Figure 1) provides a graphical summary of published cancer-related QTL studies in the mouse that are integrated with the rich biological annotations of mouse genes available from the Mouse Genome Informatics (MGI) database (40). One common starting point for the interface is to display all mapped cancer-related QTL on a genome-wide map. The QTL regions are color coded according to the organ system/cell type associated with the mapping study. In cases where authors provide the symbol of the genetic marker associated with the peak marker-phenotype association score in a mapping study, the QTL is represented by the location of that marker. When authors provide information on the genetic markers that define the boundaries of the QTL region, the entire range of the QTL region is displayed on the map. The cancer QTL graphic in MTB includes a table with links to QTL details in the MGI database. The Cancer QTL Viewer allows users to upload their own unpublished annotations in a simple GFF-formatted file (General Feature Format; http://www.sanger.ac.uk/resources/software/gff/) so they can be displayed in relationship to published QTL regions. By highlighting a sub region of a chromosome of interest, users are directed to another workspace where they can filter the genes in a region according to the biological (phenotype and function) annotations of those genes. For example, a researcher could identify a region of the genome where multiple lung cancer QTL have been mapped and then filter the genes in that region to find those previously associated with lung phenotypes, biological processes or molecular functions.
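As an illustration of the upload step, the sketch below writes user-defined regions as nine-column, tab-separated GFF records in Python. The column order (seqname, source, feature, start, end, score, strand, frame, attributes) follows the general GFF layout. The `QtlRegion` class, the `to_gff_line` helper, the example coordinates and the `Name=` attribute style are all assumptions made for illustration; the source does not state which attributes or GFF version the Cancer QTL Viewer expects.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class QtlRegion:
    name: str        # user-chosen label for the region
    chromosome: str  # chromosome name, e.g. "6"
    start: int       # 1-based start coordinate in base pairs
    end: int         # 1-based, inclusive end coordinate in base pairs


def to_gff_line(region: QtlRegion, source: str = "user", feature: str = "QTL") -> str:
    """Render one region as a nine-column, tab-separated GFF record."""
    if region.start < 1 or region.end < region.start:
        raise ValueError(f"invalid coordinates for {region.name}")
    columns = [
        region.chromosome,       # seqname
        source,                  # source
        feature,                 # feature type
        str(region.start),       # start
        str(region.end),         # end
        ".",                     # score: not available
        ".",                     # strand: not applicable
        ".",                     # frame: not applicable
        f"Name={region.name}",   # attributes (style is an assumption)
    ]
    return "\t".join(columns)


def to_gff(regions: Iterable[QtlRegion]) -> str:
    """Join several regions into one GFF-style text block."""
    return "\n".join(to_gff_line(region) for region in regions) + "\n"


if __name__ == "__main__":
    # Made-up regions, for illustration only.
    print(to_gff([QtlRegion("my_lung_qtl", "6", 1000000, 2500000)]), end="")
```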
Cancer genomics
Genomics technologies are now used routinely to characterize the genomes of primary human tumors and of tumors in genetically defined mouse models. Although many of the genome-scale data from these studies are available from public data archives, it can be difficult for researchers to find data sets that include specific mouse strain information or are associated with specific tumor types because of the lack of adherence to standardized gene and strain nomenclature. To support rapid access to mouse cancer genomics data, MTB indexes data sets in public genomic data archives such as Gene Expression Omnibus (GEO) (41) and ArrayExpress (42) using standardized nomenclature. The Gene Expression Data Set Search Form in MTB allows users to search for data sets in these archives by organ, tumor classification, strain name and assay platform. The search results allow users to rapidly access study information and the data for each sample in the relevant public archive and are also linked to relevant mouse model data in MTB.
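The value of indexing archive entries with standardized terms can be sketched in a few lines of Python. The example below filters a small in-memory index by the four search fields named above (organ, tumor classification, strain name and assay platform). The `DatasetRecord` fields, the `search_datasets` function and the records themselves, including the placeholder accessions, are hypothetical; they do not reflect MTB's actual index or any real archive entries.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class DatasetRecord:
    accession: str             # placeholder identifier, not a real accession
    archive: str               # "GEO" or "ArrayExpress"
    organ: str
    tumor_classification: str
    strain: str                # standardized strain name
    platform: str              # assay platform


def search_datasets(
    records: Iterable[DatasetRecord],
    organ: Optional[str] = None,
    tumor_classification: Optional[str] = None,
    strain: Optional[str] = None,
    platform: Optional[str] = None,
) -> List[DatasetRecord]:
    """Return records that match every supplied criterion (case-insensitive)."""
    criteria = {
        "organ": organ,
        "tumor_classification": tumor_classification,
        "strain": strain,
        "platform": platform,
    }
    active = {key: value.casefold() for key, value in criteria.items() if value is not None}
    return [
        record
        for record in records
        if all(getattr(record, key).casefold() == value for key, value in active.items())
    ]


# Made-up index entries, for illustration only.
EXAMPLE_INDEX = [
    DatasetRecord("EXAMPLE-0001", "GEO", "lung", "adenocarcinoma", "C57BL/6J", "microarray"),
    DatasetRecord("EXAMPLE-0002", "ArrayExpress", "mammary gland", "carcinoma", "FVB/N", "microarray"),
]

if __name__ == "__main__":
    for hit in search_datasets(EXAMPLE_INDEX, organ="Lung"):
        print(hit.accession, hit.archive, hit.strain)
```

Exact matching on a controlled vocabulary is only reliable when every record uses the same term for the same strain or tumor type, which is why the standardized nomenclature matters for this kind of search.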
To link the results of large-scale human cancer genomics studies to mouse models, MTB curators have compiled lists of human cancer genes identified from published cancer genome surveys. These lists are provided on the Human Gene Search form in MTB and can be downloaded or used as input to search the MTB for mouse models associated with each of the genes in the list. The search form also allows ad hoc searches of mouse models in MTB using human gene symbols. An example of a search of MTB using the precompiled list of 26 human genes identified as being frequently mutated in a survey of 188 human lung adenocarcinomas (43) is shown in Figure 2.
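The lookup behind a human gene symbol search can be sketched as a two-step mapping: human symbol to mouse ortholog, then mouse gene to associated models. The Python example below is a minimal illustration under stated assumptions. The three human/mouse ortholog pairs are well-known examples chosen for illustration, the model identifiers are placeholders rather than MTB records, and `find_models` is a hypothetical function, not MTB's implementation.

```python
from typing import Dict, Iterable, List

# A small human-to-mouse ortholog table (three pairs, for illustration).
HUMAN_TO_MOUSE: Dict[str, str] = {
    "TP53": "Trp53",
    "KRAS": "Kras",
    "EGFR": "Egfr",
}

# Placeholder model identifiers keyed by mouse gene symbol; not real MTB entries.
MODELS_BY_MOUSE_GENE: Dict[str, List[str]] = {
    "Trp53": ["example-model-1", "example-model-2"],
    "Kras": ["example-model-3"],
}


def find_models(
    human_symbols: Iterable[str],
    orthologs: Dict[str, str],
    models_by_gene: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Map each human gene symbol to the models linked to its mouse ortholog."""
    results: Dict[str, List[str]] = {}
    for symbol in human_symbols:
        # Human gene symbols are conventionally upper case; normalize the input.
        mouse_gene = orthologs.get(symbol.strip().upper())
        if mouse_gene is None:
            results[symbol] = []  # no ortholog known in this table
        else:
            results[symbol] = list(models_by_gene.get(mouse_gene, []))
    return results


if __name__ == "__main__":
    hits = find_models(["TP53", "kras", "EGFR"], HUMAN_TO_MOUSE, MODELS_BY_MOUSE_GENE)
    for symbol, models in hits.items():
        print(symbol, "->", models or "no models in this example table")
```

Returning an empty list for genes without a match, rather than dropping them, keeps the result aligned with the input list, which is useful when a precompiled gene list is used as the query.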
PDX
MTB supports online access to all of the available PDX models from The Jackson Laboratory's PDX Resource. The models in this resource have been comprehensively annotated for clinical information (de-identified) about the patients from whom the primary tumor material was obtained, histopathology and diagnostic marker labeling, whole genome copy number variation (CNV), genome-wide transcriptional profiling and targeted exome sequencing. Subsets of the models have tumor growth data and standard of care drug response data from dosing studies in tumor-bearing mice. When histopathology images of both the primary patient tumor and the engrafted tumors are available, a board certified pathologist reviews the images and provides a summary of the degree to which there is concordance of the morphological features between the primary and engrafted tumors. The user interfaces to the PDX associated data support searches for models by cancer type, patient diagnosis and genomic properties of the engrafted tumor (expression, CNV and/or mutation) (Figure 3). Tabular summaries of variants identified in the engrafted tumors and graphical summaries of gene expression, amplifications and deletions are also provided for each model. Users can request information on the availability of tumor fragments and tumor-bearing mice using a web-based form that is forwarded to the customer services group at The Jackson Laboratory.
FUTURE DIRECTIONS
MTB was initially designed to serve as a centralized resource of information regarding mouse models of human cancer, with an emphasis on how the genetic background of different strains of laboratory mice can influence cancer phenotypes in genetically engineered mice. Mice as genetic models of human cancer will continue to be a focus for the resource and we will integrate two emerging sources of data in this area in the near future. First, we will curate cancer-related phenotype information emerging from large-scale phenotyping initiatives associated with the international Knockout Mouse Project (44,45). A second source of new data in MTB will come from complex trait mapping studies using new high-precision mapping populations such as Diversity Outbred mice (46).
While mouse model representation will continue to be a primary and unique focus for the MTB database project, many of the planned enhancements to the resource will promote the use of mouse models in translational and pre-clinical cancer research. We will accomplish this goal by highlighting data used to validate the human cancer relevance of mouse models (i.e. data associated with model ‘credentialing’) and by adding PDX models and data. Specific planned enhancements to MTB include interfaces that highlight the genomic and histopathology concordance of mouse models and human cancers. We are also working with other informatics groups to develop improved interfaces for navigating conserved syntenic regions of the mouse and human genomes, enabling better integration of complex trait mapping studies in the mouse with genome-wide association study data in humans. The representation of PDX models in MTB is currently centered on The Jackson Laboratory's PDX Resource; however, we will incorporate information and data from other PDX resources and repositories when possible.
INFRASTRUCTURE
The MTB database system has three major software components:
The public web interface runs on version 5.x of the Apache Jakarta Tomcat Web Server and uses the Apache Struts Framework. The web interface uses Java technology conforming to the 2.4/2.0 Java Servlet/JavaServer Pages specifications. We use the ExtJS and Google Visualization JavaScript libraries for data visualization on the public web interface to MTB.
The curatorial interface for MTB is a Java-based desktop application used by the Scientific Curators to enter and curate data. Curator interfaces are built using a rich set of custom Java libraries designed around the Core J2EE Design Patterns.
The database itself is a highly normalized relational database currently housed in MySQL. JDBC is used to access the data repository.
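To illustrate what a normalized relational layout buys for this kind of data, the sketch below stores strain backgrounds and tumor frequency records in separate tables and joins them at query time. It is written in Python with the built-in sqlite3 module so that it is self-contained; it does not use MySQL or JDBC, and the table and column names are assumptions rather than MTB's actual schema. The two records are the HRAS transgene frequencies quoted in the Introduction.

```python
import sqlite3

# Hypothetical two-table schema: each strain background is stored once and
# referenced by any number of tumor frequency records.
SCHEMA = """
CREATE TABLE strain (
    strain_id  INTEGER PRIMARY KEY,
    background TEXT NOT NULL UNIQUE
);
CREATE TABLE tumor_frequency (
    record_id   INTEGER PRIMARY KEY,
    strain_id   INTEGER NOT NULL REFERENCES strain(strain_id),
    organ       TEXT NOT NULL,
    tumor_type  TEXT NOT NULL,
    min_percent REAL NOT NULL,
    max_percent REAL NOT NULL
);
"""


def build_example_db() -> sqlite3.Connection:
    """Create an in-memory database holding the two HRAS transgene records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO strain (strain_id, background) VALUES (?, ?)",
        [(1, "mixed C57BL/6 and SJL"), (2, "FVB/N congenic")],
    )
    conn.executemany(
        "INSERT INTO tumor_frequency "
        "(strain_id, organ, tumor_type, min_percent, max_percent) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, "mammary gland", "carcinoma", 45.0, 50.0),
            (2, "mammary gland", "carcinoma", 100.0, 100.0),
        ],
    )
    conn.commit()
    return conn


def frequencies_by_background(conn: sqlite3.Connection, organ: str):
    """Return (background, min_percent, max_percent) rows for one organ."""
    cursor = conn.execute(
        "SELECT s.background, f.min_percent, f.max_percent "
        "FROM tumor_frequency AS f "
        "JOIN strain AS s ON s.strain_id = f.strain_id "
        "WHERE f.organ = ? "
        "ORDER BY f.max_percent",
        (organ,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for row in frequencies_by_background(build_example_db(), "mammary gland"):
        print(row)
```

Keeping the background in its own table means a strain name is corrected in one place, which fits the emphasis on consistent strain nomenclature elsewhere in this paper.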
The MTB Application Programming Interface (API) facilitates the exchange of data with users and with other database systems. In keeping with our software design goals of ‘write once, run anywhere’, Web Services via SOAP is the primary access to the MTB API suite. This design supports access to MTB data in a platform and language independent manner that suits the differing needs, standards and tools of the community.
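For readers unfamiliar with SOAP, the sketch below shows the general shape of a request message using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one. Everything service-specific is an assumption: the `urn:example:mtb` namespace, the `searchStrains` operation and its `organ` parameter are invented for illustration and are not taken from the MTB API, whose operations are not described here.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "urn:example:mtb"  # hypothetical namespace, not the real MTB service


def build_soap_request(operation: str, params: dict) -> bytes:
    """Build a SOAP 1.1 envelope whose body holds one operation element."""
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # "searchStrains" and "organ" are made-up names for illustration.
    print(build_soap_request("searchStrains", {"organ": "lung"}).decode("utf-8"))
```

Because the message is plain XML sent over HTTP, a client in any language can construct it, which is the platform and language independence the paragraph above refers to.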
USER SUPPORT
The MTB Database can be accessed at the MTB Home Page (http://tumor.informatics.jax.org), which is part of the MGI group web pages (http://www.informatics.jax.org). Announcements about new MTB features are released via the MTB web site as well as the MGI Facebook page. User support for MTB is available in the form of online documentation, email, fax and phone:
Web: http://www.informatics.jax.org/mgihome/support/support.shtml
Email: mgi-help@jax.org
Tel: +1 207.288.6445
Fax: +1 207.288.6830
Facebook: https://www.facebook.com/mgi.informatics
DATA USE AND RESOURCE CITATION
MTB data and software are provided freely to the scientific community for promoting education and research activities. Any reproduction or use for commercial purpose is prohibited without the prior express written permission of The Jackson Laboratory.
Users of MTB are encouraged to cite this paper when referring to MTB in a publication. The following format is suggested when referring to specific data obtained from MTB: Mouse Tumor Biology Database (MTB), Mouse Genome Informatics Group, The Jackson Laboratory, Bar Harbor, Maine, USA. World Wide Web (http://tumor.informatics.jax.org). [Include the date (month/year) when the data were retrieved.]
ACKNOWLEDGEMENTS
We thank Drs Judith Blake and Joel Wagner for helpful comments on earlier versions of this manuscript.
FUNDING
The MTB database is supported by the National Institutes of Health [CA089713]. Research reported in this publication was partially supported by the National Cancer Institute [P30CA034196]. Funding for open access charge: NCI grant CA089713.
Conflict of interest statement. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.