Introduction

A fundamental aim of biology is not only to identify the normal function of genes but also to understand their role in human disease when mutated or otherwise abnormally affected. To this end, the mouse continues to be the model organism of choice in many cases because of extensive comparative analysis of its completed genome, the availability of an increasing number of genetic manipulation techniques, and the ability to perform physiologic and behavioral tests that can be extrapolated directly to human phenotypic traits. In addition to individual research groups examining single genes, a commitment has been made in recent years to the systematic generation of mouse mutants on a large scale using various forward genetics strategies in both whole organisms and embryonic stem (ES) cells. Despite these considerable efforts, over 70% of mammalian genes still do not have a corresponding mutant line; however, this so-called “phenotype gap” is beginning to close as these projects continue, establishing a large and valuable catalog of inherited traits, their causative mutations, and importantly, multiple alleles representing each one for comparative analysis.

Gene knockouts and the KOMP project

One way to study gene function at the level of the whole organism is by examining the consequences of its inactivation. This may be achieved either by “knocking out” the relevant promoter or coding sequences, or by “knocking in” an inactivating point mutation, deletion, or truncation to disrupt the activity of the corresponding product. With a number of strategies available, including gene targeting by homologous, site-specific, or transpositional recombination, gene trapping, and RNA-mediated interference, knockout technology constitutes the most widely used approach to create loss-of-function alleles in model organisms (Chen and Soriano 2003; Kuznetsov 2003; Sangiuolo and Novelli 2004). The exploitation of inducible promoters and tissue-specific recombinase enzymes also allows the deletion of a gene of interest in a particular organ, cell type, and/or stage of development; such conditional and tissue-specific knockouts provide more accurate and finely tuned systems to study gene function than those generated by conventional constitutive technology (Porret et al. 2006). A combination of these approaches, in addition to the generation of multiple gene knockouts, has revolutionized the study of many fields of fundamental research, most significantly impacting developmental biology with major insights into the physiology of the hematopoietic, immune, skeletal, cardiovascular, and nervous systems (Shastry 1998; Sheahan et al. 2004; Ning et al. 2006).

Equally important, as technical advances in mapping and mutation screening have facilitated the rapid identification of genetic defects related to human disorders, targeted mutagenesis in the mouse has become an invaluable tool to model and gain mechanistic insights into the pathology and progression of these conditions. For example, disruption of the insulin receptor substrate-2 gene has produced a good biochemical model for type-2 diabetes (LeRoith and Gavrilova 2006). In addition, constitutive heterozygous and conditional knockouts deficient for a variety of tumor suppressor genes faithfully recapitulate many of the clinical symptoms present in the corresponding human cancer predisposition syndromes, including Brca1+/− and Brca2+/− mice for familial breast cancer or Ptc1+/− mice for basal cell nevus syndrome (Ghebranious and Donehower 1998; Hakem and Mak 2001; Pazzaglia 2006). Other successes of disease modeling in the mouse from which programs of gene therapy have been initiated notably include Cftr-deficient mice for cystic fibrosis (Rosenecker et al. 2006; Snouwaert et al. 1992). These and many other knockouts have provided ideal preclinical models for diagnostic development, drug discovery, and targeted therapy testing (Walke et al. 2001; Zambrowicz and Sands 2003).

Although existing knockouts account for almost half (12,000) of the known genes in the mouse genome, only 20% have been described in the literature and/or reported in public databases such as the Mouse Knockout & Mutation Database (http://www.research.bmn.com/mkmd) or the Mouse Genome Database (http://www.informatics.jax.org). With the advent of fully annotated mouse and human genome sequences and only about 15,000 genes remaining to be disrupted, the National Institutes of Health (NIH) has recently launched the Knockout Mouse Project (KOMP), a $52 million cooperative program over five years to generate a comprehensive public resource of systematic knockout mutations of the mouse genome in ES cells by gene targeting (http://www.nih.gov/science/models/mouse/knockout) (Austin et al. 2004). Each year approximately 500 new ES cell lines will be selected by a peer-review process for production of the corresponding knockout mice, reporter tissue expression analysis, and basic phenotyping. Depending on the findings, a number of those will undergo further characterization, including more detailed and specialized phenotyping and tissue profiling. Complementary high-throughput approaches have been applied to ES cell insertional mutagenesis, or gene trapping, and recent technological advances have extended the scope and value of this method to the generation of conditional knockout alleles in addition to targeted mutagenesis (Branda and Dymecki 2004; Cobellis et al. 2005). Recently, two major programs have been established, EUCOMM (http://www.eucomm.org) and norCOMM (http://www.norcomm.phenogenomics.ca/index.htm), with the aim of generating over 30,000 new conditional gene-trap ES cell lines for analysis. In addition, these projects aim to collate and distribute compatible tissue-specific Cre recombinase lines in parallel and therefore represent a powerful resource for the selection of a desired gene knockout. 
By centralizing the rapid and efficient production of mouse knockouts and gene-trap lines and making them readily available to the entire scientific community, such large-scale programs will not only save considerable time and money but also will provide the basis for normalization of comparative phenotypic studies.

The seemingly perfect mouse knockout technology, however, also comes with its limitations and pitfalls. In many instances null embryos are not viable due to developmental defects, precluding the functional study of these genes at later stages of development and in the adult. This particularly applies to knockout mice of tumor suppressors which, with few exceptions, all show embryonic lethality with a distinctive pattern of organ malformations (Ghebranious and Donehower 1998). Although these mice have provided good models for the study of individual genes in embryonic development and the regulation of differentiation, apoptosis, and cell cycle control during organogenesis (Shastry 1998), they have not necessarily been useful for the characterization of gene function in tumorigenesis. The development of conditional and tissue-specific gene-deficiency technologies mentioned above, however, has now overcome this restriction.

While mouse knockouts generally do provide very valuable information about the function of a gene in vivo, a number of reports have raised legitimate concerns as to their true value as models of human disease (Hochgeschwender and Brennan 1995; Routtenberg 1995); knockout mice often fail to recapitulate the expected clinical symptoms, sometimes producing totally unexpected, conflicting, subtle, or absent phenotypes (Elsea and Lucas 2002). Moreover, although most mouse genes do perform functions identical to their human homologs, the physiologic differences between these species may greatly influence the phenotypic outcome. This may notably account for differences in tissue specificity; prominent examples include p53- and Rb-deficient mice whose tumor spectrum differs considerably from that seen in patients with Li-Fraumeni syndrome and inherited childhood retinoblastoma, respectively (Ghebranious and Donehower 1998). Another common pitfall of gene targeting is the potential disruption of transcriptional control elements that govern the expression of neighboring genes by the introduction of a selection marker from the targeting vector. This likely explains why very different phenotypes have been obtained from the disruption of the same gene using different vectors (Gingrich and Hen 2000; Olson et al. 1996). A number of reports have also highlighted the fact that the strategies used for gene inactivation are not equivalent. To study the role of phosphoinositide 3-kinase γ (PI3Kγ) in cardiac function, deficient mice were generated either by using a traditional knockout strategy or by knocking in a targeted mutation that causes loss of kinase function. Surprisingly, these mice displayed different phenotypes, with PI3Kγ knockout mice showing reduced inflammatory responses and increased cardiac contractility, while the mutant PI3Kγ knockin mouse retained only the immunologic defect. Molecular studies later revealed that PI3Kγ also functions as a scaffolding protein, thus transducing both kinase activity-dependent and -independent signaling pathways (Patrucco et al. 2004). This demonstrates that different functions of a gene may be revealed depending on the region that is inactivated.

Finally, regardless of the engineering strategy or model organism used, one major risk associated with the complete loss of function of a given gene is the establishment of compensatory mechanisms that may, at least partially, mask the effect of its inactivation. For example, despite the essential role of myoglobin in oxygen transport from erythrocytes to mitochondria, myoglobin knockout mice show normal cardiac function (Garry et al. 1998). Further studies revealed that more than half of null embryos die in utero and that those surviving have developed adaptive mechanisms to compensate for the defect in oxygen transfer (Meeson et al. 2001). In that respect, conditional knockout models offer another advantage over their constitutive counterparts as adaptive responses are unlikely to take place shortly after the knockout event. The lack of overt phenotype is especially frequent when the gene of interest belongs to a family of related proteins with some functional redundancy, and in many cases it may be necessary to create animals carrying null alleles of two or more members to obtain informative phenotypes (Kono et al. 2004).

Knockout technology has become an invaluable experimental tool for assigning gene function and modeling genetic disorders in vivo. However, it must be used with caution and some of the examples mentioned above highlight the fundamental problem posed by using genetic deficiency regarding the validity and phenotypic interpretation of knockout models; the studies are often limited to examining the compensatory effects of gene ablation as opposed to changes in the function of the gene of interest. Therefore, to address complex questions regarding gene function and regulation, complementary approaches that alter gene structure in a more subtle way must be used in conjunction with knockout technology, such as the analysis of single-point mutations and their phenotypic consequences. Such strategies may therefore be limited to the modeling of single-gene disorders as opposed to complex traits or mitochondrial or chromosomal disorders; however, approximately 10% of all human genes are currently implicated in monogenic Mendelian diseases, and this number continues to rise (Antonarakis and Beckmann 2006; Hamosh et al. 2005). Moreover, at the time of writing over two thirds of the 2300 disease genes in the Human Gene Mutation Database contain point mutations (http://www.hgmd.cf.ac.uk/ac/index.php) (Stenson et al. 2003). In addition, the fact that 75% of disorders in man are inherited in a dominant manner (McKusick 1998) indicates that both dominant and recessive mouse point mutants are vital to complement targeted knockouts and provide the most clinically relevant disease models.

Phenotype- and genotype-driven approaches to generating allelic mouse mutants

Before the advent of molecular biology, many important discoveries were derived from mutations that arose spontaneously in inbred mouse colonies. However, to generate significant numbers of mutant mice more efficiently, a number of large-scale N-ethyl-N-nitrosourea (ENU) mutagenesis programs were instigated ten years ago (Brown and Peters 1996; Hrabe de Angelis et al. 2000; Nolan et al. 2000). Their success led to the establishment of over a dozen independent centers worldwide, each with particular skills and interests and with the aim to standardize procedures and share resources (summarized in Table 1 in Cordes 2005). Details of these strategies and variations in the breeding schemes have been reviewed extensively elsewhere (Brown and Balling 2001; Hrabe de Angelis and Balling 1998; Justice 2000). Initially, these centers concentrated on a phenotype-driven approach, in which mutant progeny are screened for abnormalities using a simple yet quantitative assessment of physiology and behavior in combination with more focused phenotyping methods for a specific trait or organ of interest (Keays and Nolan 2003; Rogers et al. 1997; Thaung et al. 2002). Once inheritance of the phenotype is confirmed, mutant lines of interest are then analyzed in more detail and the causative mutation is identified by positional cloning.

Table 1 Examples of characterized allelic series (>3 members) generated by ENU mutagenesis

One advantage of this approach over targeted mutagenesis is that it can create a range of mutations: hypomorphic (reduced amount of gene product), hypermorphic (increased amount of gene product), and neomorphic (altered function) alleles, in addition to those that are null (loss of function), facilitating the identification of novel functions for known genes. Such a phenomenon was recently illustrated by our own studies where the identification of a stabilizing gain-of-function point mutation in the mixed-lineage leukemia fusion partner Af4, the cause for the neurodegeneration in the ataxic mouse mutant robotic, revealed a new role for this gene in the central nervous system otherwise unpredictable from the phenotypic presentation of the Af4 knockout mouse (Bitoun and Davies 2005; Bitoun et al. 2007; Isnard et al. 2000; Isaacs et al. 2003; Oliver et al. 2004).

The random nature of ENU also means that multiple mutations in the same gene, an allelic series, may occur in independent lines. A combination of such mutants is therefore more likely to provide information related to gene dosage, or the identification of functionally important protein interacting domains, than a classical gene knockout. This may be particularly applicable to the pharmaceutical industry, because typically the mode of action of drugs is to alter the activity of proteins rather than eliminate their function, or to target specific residues of an active site, for example (Russ et al. 2002). Alternatively, a knockin of the desired mutation might provide a suitable genetic model; however, the time and cost constraints of generating one or more mutant lines using this method are often prohibitive. An engineered mutant transgene might provide a more rapid solution, although studies of transgenic lines are often confounded by factors such as epitope tags, multiple insertions, incorrect promoter artifacts, or position effects. None of these factors influence ENU point mutations, which have the advantage of always occurring at the endogenous genomic position. Consequently, an ENU mutation that is not known to be causative in human disease is still likely to provide valuable insights into gene function.

However, the practicalities of large-scale phenotype-driven mouse mutagenesis, such as the limitation and bias of the phenotyping methods, mean that many potentially interesting or subtle phenotypes may simply not be detected in a first-pass screen; consequently, mutagenesis centers routinely cryopreserve tissues from each new mutant line for future rederivation and genotyping regardless of the phenotypic data obtained (Glenister and Thornton 2000). These resources, in combination with recent advances in the rapid detection of mutations by denaturing high-performance liquid chromatography (DHPLC) (Dobson-Stone et al. 2000) and temperature gradient capillary electrophoresis (TGCE) (Culiat et al. 2005; Sakuraba et al. 2005), have made it practical to use a gene-driven approach to mouse mutant detection. Here, a gene of interest can be efficiently screened for mutations by PCR from thousands of individual DNA samples followed by rederivation of the selected lines for further study (Coghill et al. 2002; Michaud et al. 2005). This technique is also applicable to ES cells that are amenable to random chemical mutagenesis and PCR screening (Chen et al. 2000; Munroe et al. 2004). Although there is no guarantee of any measurable phenotype in the resulting mutant, it has been calculated from pilot studies that 5000 DNA samples are sufficient to identify at least two alleles with 90% confidence (Coghill et al. 2002; Quwailid et al. 2004). Such values are based on estimating the proportion of the genome that is protein coding; therefore, assuming no positional bias in the mutagenic action of ENU, the larger the gene, the greater the likelihood of identifying a new mutation (Concepcion et al. 2004). With the size of these archives increasing and other academic centers such as RIKEN generating similar resources (Sakuraba et al. 2005), gene-driven screening will play an increasingly important role in the identification of multiple mutant alleles. Another, sometimes overlooked, advantage of such strategies is that all the resulting mutants from a particular ENU screen will be derived from the same genetic background, a vital feature for comparative assessment. It is well known that inbred lines differ considerably in a large number of physiologic and behavioral parameters (Contet et al. 2001; Kaku et al. 1988; Solberg et al. 2006), which may confound attempts to accurately compare spontaneous mutant or knockout lines on different backgrounds (Lalouette et al. 1998; Runkel et al. 2004).
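The dependence of screening success on archive size and gene size can be illustrated with a simple Poisson model. The sketch below is not the calculation used in the cited pilot studies. The function name, the per-base mutation rate, and the screened lengths are assumptions chosen only for illustration, and the model assumes, as the text does, that ENU acts without positional bias.

```python
import math


def prob_at_least(k, n_samples, screened_bp, rate_per_bp):
    """Probability of recovering at least k mutant alleles from an archive.

    The number of mutations found is modeled as Poisson with mean
    n_samples * screened_bp * rate_per_bp (no positional bias assumed).
    """
    lam = n_samples * screened_bp * rate_per_bp
    # P(X >= k) = 1 - sum over i < k of exp(-lam) * lam**i / i!
    below_k = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below_k


# ASSUMPTION: an illustrative rate of one induced change per megabase
# per archived sample; real archives should be calibrated empirically.
ASSUMED_RATE_PER_BP = 1e-6

# Hypothetical amounts of sequence screened per gene, in base pairs.
for screened_bp in (500, 1000, 2000):
    p = prob_at_least(2, 5000, screened_bp, ASSUMED_RATE_PER_BP)
    print(f"{screened_bp} bp screened: P(at least 2 alleles) = {p:.3f}")
```

Under these assumptions the probability rises with the amount of sequence screened, which is the sense in which larger genes are more likely to yield new alleles. Not every recovered change will be functional, so the figures from such a model are upper bounds on the number of informative alleles.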

Allelic series in mouse: genotype-phenotype correlations

Historically, the first large collections of allelic series were derived from early genetic studies that analyzed the offspring of mutagenized male mice with females homozygous for a mutation that causes a visible phenotype such as coat color. These specific locus tests were not only used to titrate the effectiveness of particular mutagens accurately, but they also generated dozens of alleles at these particular loci, such as dilute and short-ear (Davis and Justice 1998a, 1998b). There are also numerous examples of allelic spontaneous mutants, although these are, as expected, biased toward visible phenotypes such as gait dysfunction (Lalouette et al. 1998; Letts et al. 2003). There is now an increasing number of multiple mutant alleles representing individual genes; a range of the more recent examples that have used large-scale mutagenesis are outlined in Table 1. Although this is not an exhaustive list, some of those that have highlighted the various experimental features of this method and have provided the most interesting functional insights are described below.

Pmp22: insights into human peripheral neuropathies

The random nature of ENU as a mutagen provides a sound basis for the discovery of novel disease models; however, in a phenotype-driven screen there will naturally be a bias toward those defects that are early onset, easy to recognize, and nonlethal. Some of the earliest examples of an allelic series identified from a large-scale ENU screen are the three Trembler mutants Trm1H, Trm2H, and Trm3H. These dominant mutants displayed a range of resting tremor and hind-limb grasping behavior, with Trm1H the most severely affected and Trm3H the least (Isaacs et al. 2000, 2002). Quantitative histopathologic analysis revealed that there was a direct correlation between the severity of the phenotype and the level of hypomyelination in the peripheral nerve of the mutant mice; Trm1H displayed the fewest and most narrowed axonal profiles and the greatest increase in endoneurial connective tissue of all the lines. The causative mutations were identified in peripheral myelin protein 22 (Pmp22), a highly conserved structural component of the axonal myelin sheath already implicated in peripheral neuropathies in humans (Isaacs et al. 2000, 2002). The Trm1H mutation (H12R) is at the same position as an amino acid substitution described in a patient suffering from Dejerine-Sottas syndrome (DSS), a severe demyelinating neuropathy (Valentijn et al. 1995). In addition, the Ser72 altered in the relatively mild Trm3H line is a putative hotspot for PMP22 mutations that relate to the same disorder (Marques et al. 1998, 2004). This serine-to-threonine change is far more conservative than the four human mutations documented at the same residue in DSS patients, suggesting a strong association between genotype and phenotype. Interestingly, this can also be applied to two spontaneous Pmp22 mutants; the Pmp22-Tr mutation causes a severe tremor in mice and occurs in another conserved position altered in DSS patients (Ionasescu et al. 1997; Suter et al. 1992), whereas the less severe Charcot-Marie-Tooth disease type 1A (CMT1A) can be caused by identical mutations found in the milder Tr-J mutant (Valentijn et al. 1992). The Pmp22 knockout also displays resting tremor and hypomyelination, although the phenotype is less severe than Tr or Trm1H, suggesting that the disease progression is not simply due to a loss-of-function effect (Adlkofer et al. 1995). Indeed, in vitro modeling experiments proposed that expression of mutant Pmp22 caused retention of wild-type protein away from the membrane (Colby et al. 2000). Importantly, similar experiments with the three ENU lines showed there was a direct relationship between the severity of the phenotype and the propensity of Pmp22 to form oligomers in vitro (Isaacs et al. 2002). The allelic series of Pmp22 mutants has therefore provided valuable insight into the genotype-phenotype correlation of peripheral neuropathies in humans.

Gk: modeling multiple human mutations in new diabetes models

Inactivating mutations in glucokinase (GK), a critical glycolysis enzyme and “glucose sensor” in insulin secreting β cells, are known to cause dominant maturity-onset diabetes of the young type 2 (MODY2) and recessive permanent neonatal diabetes mellitus (PNDM) (Njolstad et al. 2001; Vionnet et al. 1992). Mouse Gk knockout models have provided some insight into disease-causing mechanisms, although they have not revealed details of the structure-function relationships of this important metabolic protein (Postic et al. 1999). Moreover, over 190 missense, nonsense, and splicing mutations in GK have been described in MODY2 alone, suggesting that similar genetic lesions in mice would provide the most clinically relevant information (Gloyn 2003). Two independent groups have combined large-scale ENU mutagenesis with a biochemical screening approach to identify new mouse models of diabetes and have identified a total of 12 Gk mutations with distinct enzyme properties and hyperglycemic phenotypes (Inoue et al. 2004; Toye et al. 2004). For example, Inoue et al. demonstrated that three of their missense mutations (M-272, M-341, and M-392) caused a marked reduction in GK activity, hyperglycemia, and glucose intolerance, although only two of these (M-272 and M-341) showed a corresponding reduction in protein level, suggesting that Gk mutations can cause protein instability in addition to directly affecting the function of the enzyme (M-392). Another point mutation (M-210) occurred at the splice donor site of the β-cell-specific exon 1, causing a similar hyperglycemic phenotype and confirming that disruption of this isoform is sufficient to impair glucose regulation as observed in β-cell-specific Gk knockout mice (Postic et al. 1999). This allelic series also demonstrated that the degree of severity is significantly influenced by the position of the Gk mutation if homozygous. Severe hyperglycemia in M-210 homozygous mutant pups was first detectable at postnatal day 2 (P2), in addition to marked growth retardation, liver abnormalities, and eventual death by P7. By contrast, the M-392 point mutation produced a milder hyperglycemic phenotype when homozygous, with no overt growth retardation or liver defects (Inoue et al. 2004). The differing phenotypes are related to the relative position of the mutations; M-210 homozygous mice have a phenotype similar to Gk knockouts, whereas the M-392 mutation occurs in the glucose-binding region as opposed to the more functionally significant catalytic domain (Gidh-Jain et al. 1993). Importantly, four of the newly identified Gk mutations, including the β-cell-specific substitution, are identical to those found in humans (Gloyn 2003); the missense mutation (M-236) at Thr288 is not only identical to mutations found in both MODY2 and PNDM patients, but it is situated in the active site of GK, correlating directly with the reported significant reduction in kinase activity of the enzyme (Davis et al. 1999; Gidh-Jain et al. 1993). These new mutants have provided a detailed structure-function study of GK and the relationship between varying levels of its inactivation to hyperglycemic phenotypes that are directly relevant to both dominant and recessive forms of diabetes.
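The reasoning applied to the three missense alleles M-272, M-341, and M-392 can be written as a small decision rule. The sketch below is an illustration only: the class and function names are hypothetical, and the readouts are the qualitative observations summarized above rather than measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleReadout:
    """Qualitative readouts for one allele (hypothetical record type)."""
    name: str
    activity_reduced: bool  # marked reduction in GK enzyme activity
    protein_reduced: bool   # corresponding reduction in GK protein level


def infer_mechanism(readout):
    """Suggest the likely basis of reduced enzyme activity for an allele."""
    if not readout.activity_reduced:
        return "no loss of activity detected"
    if readout.protein_reduced:
        # Less protein is present, so instability is a plausible cause.
        return "reduced protein level, consistent with protein instability"
    # Protein is present at normal levels but does not work properly.
    return "normal protein level, consistent with a direct functional defect"


# Qualitative observations as summarized in the text for Inoue et al. (2004).
READOUTS = [
    AlleleReadout("M-272", activity_reduced=True, protein_reduced=True),
    AlleleReadout("M-341", activity_reduced=True, protein_reduced=True),
    AlleleReadout("M-392", activity_reduced=True, protein_reduced=False),
]

for readout in READOUTS:
    print(f"{readout.name}: {infer_mechanism(readout)}")
```

The rule separates alleles only by whether the loss of activity is accompanied by loss of protein. A reduced protein level is compatible with instability but does not prove it, so the output is a hypothesis to test rather than a conclusion.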

Quaking: functional analysis of protein isoforms

The RNA binding protein encoded by the quaking loci mediates the stability and splicing of oligodendrocyte RNA transcripts such as myelin basic protein (MBP) and exists in three distinct isoforms: QKI-5, -6, and -7 (Wu et al. 2002). A total of five ENU-induced mutants have since been identified by their failure to complement the original spontaneous qkv/qkv line: four that are homozygous embryonic lethal (qkkt1m, qkk2, qkkt3/4, and qkl-1) and a single viable allele (qke5) that causes seizures in the second postnatal week in addition to adult-onset progressive ataxia (Justice and Bode 1988; Noveroske et al. 2005). In addition to demonstrating an unexpected developmental role for this gene, subtle yet significant differences in the embryonic phenotypes of the four lethal lines have revealed new information regarding the functional domains of the QKI isoforms. For example, the qkk2 mutation causes death at embryonic day (E) 8.5-11.5 due to a disorganization of the anterior-posterior (A-P) axis, in addition to cranial and heart defects (Justice and Bode 1988). The causative mutation appears to be inherited in a semidominant manner and occurs at a highly conserved amino acid in the RNA-binding (KH) domain common to all three QKI isoforms (Cox et al. 1999). In contrast, a splice site is disrupted in the recessive qkl-1 mutant, leading to a loss of the QKI-5 protein and cessation of development between E8.0 and E9.0, suggesting that disruption of the nuclear isoform is sufficient to cause embryonic lethality (Cox et al. 1999; Justice and Bode 1988). Despite the viability of qke5/qke5 mice, severe dysmyelination of the brain is evident, in addition to the presence of axonal swellings in the cerebellum that are likely to account for the observed ataxia. Expression studies revealed that the QKI-6 and -7 isoforms are not present in mutant postnatal oligodendrocytes and levels of QKI-5 are considerably reduced, although all three isoforms appear to be expressed at normal levels in astrocytes (Noveroske et al. 2005). Because no coding mutations in the quaking loci were identified, this evidence points to disruption of an as-yet unidentified tissue-specific regulatory region. Further examination showed that markers of mature oligodendrocytes were disrupted in qke5/qke5 mice, suggesting that the severe phenotype is due to defective myelination by QKI-regulated pathways rather than to aberrant development of these cells during proliferation. The biological importance of the qke5 mutant is also emphasized by the fact that the original viable qkv line contains a large deletion that disrupts two additional genes that are likely to influence the demyelination phenotype (Lorenzetti et al. 2004). This ENU-derived allelic series has not only revealed a new role for quaking in embryonic vasculogenesis, but it has also demonstrated that regulation of the gene is critical for normal CNS myelination; these mutants are a useful resource for modeling seizure and neural tube defects in addition to psychiatric disorders such as schizophrenia, with which human QKI has recently been associated (Haroutunian et al. 2006).

Pde6b: new models of retinal degeneration

From a phenotype-driven screen of over 6000 ENU mutants for vision defects detectable by either microscopy or visual tracking response, 25 inherited phenotypes were isolated, with 7 of these caused by mutations in Pde6b, a gene previously implicated in eye pathophysiology (Thaung et al. 2002). Mutations in the β subunit of the rod cGMP-phosphodiesterase (PDE6B) gene have been shown to account for approximately 2% of cases of retinitis pigmentosa (RP), causing degeneration of the retina and blindness (McLaughlin et al. 1995). Mice homozygous for the null Pde6b allele (Pde6brd1) provide a model for autosomal recessive (ar) RP, but suffer from almost complete photoreceptor apoptotic degeneration by three weeks of age due to persistent opening of cGMP-gated cation channels (Chang et al. 1993). Consequently, a Pde6b allelic series containing mutants with slower disease onset would be more practical as a tool for analyzing potential therapeutic strategies. Four of the novel recessive mutants identified (Pde6brd1-1-4H) displayed essentially identical phenotypes to the endogenous Pde6brd1 alleles of the C3H mouse strain; however, the remaining lines showed varying degrees of post-weaning-onset atypical retinal degeneration (atrd), named Pde6batrd1-3 (Hart et al. 2005; Thaung et al. 2002). As expected from their related pathology to Pde6brd1 mice, the four Pde6b1-4H mutants contained mutations that are predicted to cause loss of function: three generated premature stop codons and the fourth altered a highly conserved splice donor site. Interestingly, the latter mutation is identical to one found in a patient with arRP, once again correlating ENU mutants with aberrant human alleles (McLaughlin et al. 1995). Of the three less severe alleles, two mutations (Pde6batrd1 and Pde6batrd3) were in the catalytic domain of Pde6b and the third (Pde6batrd2) occurred in another splice donor site. Molecular analysis of Pde6batrd2 transcripts showed that the predominant product was truncated and presumably nonfunctional, although 17% of the transcripts analyzed were correctly spliced and expressed, presumably accounting for the milder retinal degeneration seen in this mutant line compared to rd1 mice (Hart et al. 2005). Combined visual acuity, fundus, and histopathologic analysis of the three atrd lines showed that Pde6batrd1 displays the least severe phenotype; this may reflect the fact that in related phosphodiesterases the mutated histidine is replaced by a tyrosine, indicating that this residue is not necessarily functionally essential. By contrast, the asparagine mutated in Pde6batrd3 mice is conserved in all known mammalian phosphodiesterases, and this is reflected in a reduced visual acuity performance compared with those homozygous for the Pde6batrd1 allele (Hart et al. 2005). The strong genotype-phenotype correlation in this study has shed new light on the functional domains of Pde6b as well as providing valuable new models for RP and related disorders.

Smad4: ES cell-derived gene-driven screening

Whereas gene-driven screens from whole-animal archives have proved to be successful, the initial cost and space involved in generating thousands of lines may be prohibitive (Williams et al. 2003). Consequently, a number of groups have established mutagenized ES cell libraries for PCR screening and subsequent recovery (Chen et al. 2000; Munroe et al. 2004). The validity of this approach has been demonstrated by the identification of 29 Smad2 and Smad4 mutations in a screen of over 2000 ENU-mutagenized ES cell clones (Vivian et al. 2002). Both genes, transducers of transforming growth factor β (TGF-β) superfamily signaling, have been implicated in the development of human cancers (Miyaki and Kuroki 2003), although the perigastrulation lethality of knockouts has limited research into their role in late embryonic stages or adult life (Weinstein et al. 2000). A series of Smad4 mutants was recovered and, unexpectedly, three of these lines, Smad4m1Mag, Smad4m2Mag, and Smad4m3Mag, were viable in the homozygous state and displayed no detectable phenotype in adults despite missense substitutions in three conserved residues spread throughout the protein. The Smad4m4Mag mutation, however, occurred in a splice donor site, causing the deletion of an exon and the generation of a truncated protein with 19 additional out-of-frame amino acids. Homozygous Smad4m4Mag mutants, unsurprisingly, did not survive embryogenesis and failed to initiate gastrulation. Interestingly, ES cells heterozygous for the mutation showed steady-state Smad4 protein levels considerably lower than expected, suggesting that the mutant protein may confer instability to wild-type Smad4 (Vivian et al. 2002). This was confirmed by further biochemical analysis that demonstrated that the truncated protein is not only targeted for proteasomal degradation but can form a complex with wild-type Smad4, targeting it for proteolysis (Chen et al. 2006). This pilot study illustrates that although an ES-cell gene-driven approach can be used to generate allelic series, because of the random nature of the mutagen not all new lines may have a functional effect on the gene of interest.

The MFCS1 element: mutation analysis of a regulatory region

While the vast majority of gene-driven screens have concentrated on exonic and splice site regions, the method can equally be applied to any genomic sequence. A recent study has screened a short, well-characterized cis-acting element, Mammal-fish-conserved-sequence 1 (MFCS1), which regulates the expression of the polarity-signaling protein Sonic hedgehog (Shh) (Masuya et al. 2007). In humans there are a number of cases of preaxial polydactyly (PPD) caused by point mutations in MFCS1 (Lettice et al. 2003; Maas and Fallon 2005); however, no molecular motifs have been discovered in this region that explain the mechanisms involved in Shh regulation by an element situated over 1 Mb upstream of the coding region (Lettice et al. 2002). From over 3500 mutant mice screened, three new mutations were discovered, M101116, M101117, and M101192, flanking the pre-existing PPD ENU mutant M100081 (Sagai et al. 2004). M101117 and M101192 showed no limb dysmorphology, whereas the remaining lines had a semidominant PPD phenotype, with M100081 also showing tibial hemimelia when in the homozygous state. Ectopic expression of Shh was detected in the anterior region of the hind-limb bud of M100081 and M101116 homozygotes at E10.5, which was confirmed by transgenic analysis of embryos generated with the LacZ reporter gene under the control of mutated MFCS1. As expected, the level of ectopic Shh expression correlated with the relative severity of the PPD phenotype, but also with the evolutionary conservation of the mutated nucleotide; the M100081 and M101116 substitutions occur at positions conserved in fugu, whereas the nucleotide positions altered in the less severe M101117 and M101192 lines are conserved only as far as chicken. The molecular mechanisms controlling these observations remain unclear and may relate to aberrant binding of transcription factors.
However, this study highlights the power of this strategy to identify novel features of regulatory elements, a natural progression from the current mutation analysis of protein-coding domains.

Conclusions and future prospects

The cost of generating a new allelic series is continually decreasing with advances in DNA analysis and the expansion of mutant mouse archives. For example, combining TGCE with over 17,000 mutant genomic DNA samples, Ingenium Pharmaceuticals state they are able to generate at least five new alleles for any given gene and provide live adult mutant mice within three to four months (Augustin et al. 2005). With this number of novel strains available, preselection of the most biologically relevant mutations will become routine and necessary in many cases (Grosse et al. 2006). Improvements in ES cell technology have also enhanced the efficiency of gene-driven screens from mutagenized stem cell archives. Recovery of these lines has traditionally relied on introducing ES cells into blastocysts, followed by two rounds of breeding to generate mice homozygous for the desired mutation. Hybrid ES cell lines and tetraploid host embryos can facilitate the identification of phenotypes in the first generation (F0), although this method suffers from a number of confounding technical inefficiencies that preclude its use as a high-throughput strategy. The latest advance uses laser-assisted injection of ES cells into eight-cell-stage embryos, generating viable F0 mice of both sexes that are nearly 100% chimeric. A pilot study demonstrated that the phenotype of previously characterized mutations could be successfully recapitulated in mice recovered from ES cells using this method (Poueymirou et al. 2007). It has even been possible to identify and successfully rederive new splice variants from a highly pooled archive of 40,000 mutagenized ES cell clones using nested exon-skipping PCR primers (Greber et al. 2005). Because the time and cost of these high-throughput technologies are decreasing, the availability of an “off-the-shelf” series of mutations for a given gene is slowly becoming a reality.

It must not be overlooked, however, that each new ENU mutant line harbors many other mutations in addition to the one that may have been identified by gene-driven screening (Hitotsumachi et al. 1985). The estimate that one functional amino acid change is obtained every 1.82 Mb of coding DNA was determined from a gene-based screen of the Harwell ENU archive (Quwailid et al. 2004). Consequently it was calculated that there is still a 7% chance that a second confounding mutation is linked to the originally identified locus after ten generations of backcrossing to a wild-type strain (Keays et al. 2006). Marker-assisted selection (MAS), in which offspring with the smallest amount of donor chromosome linked to the mutation of interest are used for breeding, can be used to circumvent this problem (Visscher et al. 1996). Although this may detract from the apparent speed and convenience of a gene-driven approach, additional experimental evidence such as a BAC rescue (Keays et al. 2007) or even a second allelic mutation (Hafezparast et al. 2003) can provide sufficient supporting evidence that a novel genetic lesion is causative.
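The reasoning behind such an estimate can be illustrated with a short calculation. The Python sketch below is not the derivation used by Keays et al. (2006); it simply assumes that functional ENU mutations are distributed as a Poisson process along coding DNA at the quoted rate of one change per 1.82 Mb, and that each backcross halves the expected number of unlinked mutations. The function names and the amount of coding sequence assumed to remain linked to the locus are hypothetical and chosen only for illustration.

```python
import math

# Rate quoted in the text (Quwailid et al. 2004): one functional amino acid
# change per 1.82 Mb of coding DNA.
MB_CODING_PER_FUNCTIONAL_CHANGE = 1.82


def prob_linked_confounder(linked_coding_mb: float,
                           mb_per_change: float = MB_CODING_PER_FUNCTIONAL_CHANGE) -> float:
    """Probability of at least one extra functional mutation in the linked interval.

    Assumption (not from the source): mutations follow a Poisson process, so
    P(at least one) = 1 - exp(-expected number of mutations).
    """
    expected_mutations = linked_coding_mb / mb_per_change
    return 1.0 - math.exp(-expected_mutations)


def unlinked_fraction_remaining(backcrosses: int) -> float:
    """Expected fraction of unlinked founder mutations still carried.

    Each backcross to wild type transmits an unlinked heterozygous mutation
    with probability 1/2.
    """
    return 0.5 ** backcrosses


if __name__ == "__main__":
    # Hypothetical input: 0.13 Mb of coding DNA still linked to the locus.
    print(f"linked confounder: {prob_linked_confounder(0.13):.3f}")
    print(f"unlinked fraction after 10 backcrosses: {unlinked_fraction_remaining(10):.5f}")
```

Under these assumptions, about 0.13 Mb of coding sequence remaining linked to the mutation would correspond to a probability close to 7%. That input is back-calculated for illustration only. The sketch also shows why backcrossing alone removes unlinked mutations efficiently but cannot exclude a linked one, which is the case MAS is intended to address.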

As the studies above have illustrated, multiple mutant alleles frequently provide valuable insight into gene function, including unexpected and serendipitous findings, as a consequence of the random nature of ENU mutagenesis. The power of this technique relies on evolutionary conservation of DNA sequence and physiologic parameters to extrapolate conclusions to human disease states; consequently, the mouse will continue to provide important and clinically relevant phenotypes. Moreover, there is likely to be a move toward mutation screening of non-protein-coding regions such as promoter elements, introns, and noncoding RNAs as more is learned about their role in biology. For example, the modeling of human point mutations in the noncoding RMRP RNA has provided new insight into the role of this component of the ribonucleoprotein complex in cartilage-hair hypoplasia (Hermanns et al. 2005). There is no doubt, therefore, that large-scale chemical mutagenesis, in combination with other genetic tools such as conventional and conditional knockouts, knockins, and transgenics, will continue to play a vital role in the generation of new therapeutic targets.