Comparative genomics and host resistance against infectious diseases.

The large size and complexity of the human genome have limited the identification and functional characterization of components of the innate immune system that play a critical role in front-line defense against invading microorganisms. However, advances in genome analysis (including the development of comprehensive sets of informative genetic markers, improved physical mapping methods, and novel techniques for transcript identification) have reduced the obstacles to discovery of novel host resistance genes. Study of the genomic organization and content of widely divergent vertebrate species has shown a remarkable degree of evolutionary conservation and enables meaningful cross-species comparison and analysis of newly discovered genes. Application of comparative genomics to host resistance will rapidly expand our understanding of human immune defense by facilitating the translation of knowledge acquired through the study of model organisms. We review the rationale and resources for comparative genomic analysis and describe three examples of host resistance genes successfully identified by this approach.

Two major elements underlie a thorough understanding of the pathogenesis of virtually any infectious disease: identification and characterization of the virulence factors and in vivo survival mechanisms of the invading microorganism (e.g., surface attachment factors, exotoxins, or enzymes that disrupt cellular homeostasis [1]) and understanding of the components of the host response that lead to elimination of the invading pathogen and resolution of disease. (These include both nonspecific [or innate] immune defense mechanisms, such as the complement cascade, and adaptive elements, such as clonally derived lymphocytes capable of eliminating specific targets [2].) The traditional approach to human infectious diseases has been to focus research on the study of important pathogens. Investigation of relevant bacteria, viruses, fungi, and parasites has led to the production of protective vaccines, antimicrobial agents, and effective strategies for control and elimination of disease outbreaks. A principal advantage of microbiologic research is the relative ease with which the organisms may be obtained, manipulated, and analyzed in the laboratory. Because microbial genomes are smaller, complete cloning and DNA sequencing of several microorganisms have been achieved and have paved the way for comprehensive study of gene expression and genome organization (3,4). In contrast, advances in our understanding of the molecular basis of host defense have been relatively limited. The study of host immune defense in humans is inherently complex; obstacles to greater understanding include limited opportunities for controlled observation and experimental manipulation, a large genome, and until recently, a lack of molecular techniques capable of facilitating genome-wide analysis.

Genetic Analysis
One of the principal aims of the study of host response to infectious diseases is to uncover novel components of the host immune system critical to robust host defense. Identification of these components at a molecular level is the first step in understanding how the host deals with an infectious challenge and lays a foundation upon which rational therapies that augment host resistance may someday be designed. Despite this promise, the interaction between host and pathogen that leads to infection is multidimensional, dynamic, and exceedingly complex. From a genomic perspective, a thorough understanding of the pathogenesis of a given infection would include a complete inventory of the spatial and temporal expression of the genes by both the host and pathogen from the time of exposure to the final resolution of the infection. Given the potentially large number of factors that contribute to host defense, precise gene identification is a formidable challenge. Nevertheless, researchers have recently made progress in dissecting and identifying the most important individual genetic elements that govern the host response to important pathogens, largely through the use of animal models of human disease (5). Of the model organisms amenable to genetic analysis, the mouse is by far the most well-developed and physiologically relevant system for study of human host defense (6,7). Identification of commercially available inbred strains of mice that show a differential response to a well-defined infectious challenge is the first requirement for study of genetically regulated host resistance factors. Once distinct phenotypes are identified, controlled breeding is carried out to determine the mode of inheritance of the phenotype (simple or complex). Correlation of the inheritance of susceptibility or resistance to a specific infectious challenge with one or more chromosomal regions is then performed by using linkage analysis. Finally, known genes within the genetic interval must be evaluated and novel genes must be positionally cloned to elucidate the underlying molecular basis of immune defense. Comparative genomic analysis is a logical extension of these principles (8). Knowledge of the genomic organization of human and mouse, for example, facilitates direct localization and identification of the human orthologues of susceptibility genes identified through experimental challenge. These genes can then be tested as candidates for human disease susceptibility through mutation analysis.
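
The linkage step described above can be illustrated with a minimal sketch. It assumes a fully informative backcross in which every animal can be scored as recombinant or nonrecombinant between a marker and the resistance phenotype, and it computes the standard two-point LOD score (the log10 ratio of the likelihood at a recombination fraction theta to the likelihood under free recombination, theta = 0.5). The function names and the progeny counts are hypothetical and are not taken from any study cited here.

```python
import math


def best_theta(recombinants: int, total: int) -> float:
    """Maximum-likelihood recombination fraction, capped at 0.5 (no linkage)."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return min(recombinants / total, 0.5)


def lod_score(recombinants: int, total: int, theta: float) -> float:
    """Two-point LOD score for a fully informative backcross.

    LOD = log10( theta^k * (1 - theta)^(n - k) / 0.5^n )
    where k is the number of recombinants and n the number of progeny.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    nonrecombinants = total - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + nonrecombinants * math.log10(1.0 - theta))
    log_l_null = total * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical example: 5 recombinants among 100 backcross progeny.
theta_hat = best_theta(5, 100)
print(theta_hat, round(lod_score(5, 100, theta_hat), 2))
```

By convention, a LOD score of 3 or more is usually taken as evidence of linkage. The sketch does not handle the case of zero recombinants (theta = 0), which would need a separate likelihood expression.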

Genetic Linkage Maps
Genetic linkage maps provide an organizational framework for genes and phenotypes in the genome (9). Maps, by establishing the location, order, and relative distance of genes, anonymous DNA markers, and biologically important traits along a species' chromosomes, are critical tools in analyzing the genetic contribution to a given disease state. Genetic maps can help precisely localize chromosomal region(s) linked to host resistance phenotypes and provide the starting point for identification of the causative gene(s). During the past decade, comprehensive genetic maps spanning the genomes of mouse and human have been created, largely through the initiative of the Human Genome Project (10).

Mapping the Human Genome
A great deal of effort has resulted in the creation of a whole genome human linkage map, consisting of 5,624 microsatellite markers located to 2,335 positions (11). The DNA markers in this map are highly informative and are densely distributed, with an average interval between markers of 1.6 centimorgans (cM) (1 cM = a 1% rate of recombination during meiosis, or approximately 1 million bp). Other comprehensive maps have been assembled on the basis of a collection of more than 16,000 distinct transcribed sequences (including known genes and gene fragments) or expressed sequence tags, which are estimated to represent at least 50% of all genes in the human genome (12). This human transcription map has been integrated with selected microsatellite markers from the Généthon collection, thus allowing the position of gene-based markers to be resolved to specific intervals measured in centimorgans. The map is available electronically (13). Work is also under way to generate comprehensive physical maps of the human genome in which the relative location of markers is defined by the actual length along the chromosome, rather than by recombination events (14,15).
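
The figures quoted above can be combined in a rough back-of-the-envelope calculation. The sketch below multiplies the number of map positions by the average interval to get an implied total genetic length, and converts a genetic distance to an approximate physical distance with the 1 cM ≈ 1 million bp rule of thumb given in the text. Both results are order-of-magnitude estimates only; the function names are hypothetical, and the calculation ignores the fact that each chromosome has one fewer interval than it has positions.

```python
MARKER_POSITIONS = 2335    # distinct map positions reported for the human map
MEAN_INTERVAL_CM = 1.6     # average interval between markers, in centimorgans
BP_PER_CM = 1_000_000      # rule of thumb from the text; varies along the genome


def implied_map_length_cm(positions: int, mean_interval_cm: float) -> float:
    """Approximate total genetic length implied by marker count and spacing."""
    return positions * mean_interval_cm


def approx_physical_distance_bp(distance_cm: float,
                                bp_per_cm: int = BP_PER_CM) -> float:
    """Convert a genetic distance to a very rough physical distance."""
    return distance_cm * bp_per_cm


print(implied_map_length_cm(MARKER_POSITIONS, MEAN_INTERVAL_CM))  # total, in cM
print(approx_physical_distance_bp(MEAN_INTERVAL_CM))              # spacing, in bp
```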

Mapping the Mouse Genome
Among model organisms, genetic mapping is most well established in the mouse, having begun in 1915 with the discovery of the first linkage group (16). Controlled crosses of common laboratory strains segregating a small number of visible phenotypes such as coat color then became the mainstay of genetic mapping. In the past decade, two major breakthroughs have revolutionized the technique of mouse genetic mapping and paved the way for generation of high-resolution whole genome maps. The first was the development of the interspecific cross, involving a laboratory strain (Mus musculus) and a distantly related species, Mus spretus (17), allowing literally thousands of genes to be mapped within the same cross. The second advance was the development of abundant genetic markers rapidly typable by polymerase chain reaction (PCR) (termed microsatellites), which amplified polymorphisms in simple sequence length repeats such as [CA]n (Figure) (18). Several comprehensive genetic maps of the mouse (based on genes or microsatellites) have been developed, and in some cases, these are being integrated. At least three are publicly available, while the others are available for mapping in a collaborative arrangement (19). As of January 1997, more than 17,000 markers had been mapped in the mouse (one locus approximately every 200 kb), including more than 5,000 genes and more than 10,000 (mostly microsatellite) DNA markers.

Figure. Schematic representation of microsatellite marker analysis in mice. A) Flanking forward (F) and reverse (R) oligonucleotides are designed to specifically amplify a simple sequence repeat by polymerase chain reaction (PCR) (in this case a CA dinucleotide). The length of the dinucleotide (N) varies among inbred mouse strains. B) Gel electrophoresis of a PCR-amplified microsatellite in homozygous parental strains A and B and heterozygous F1 progeny. The larger microsatellite from strain A migrates more slowly than that of strain B. Inheritance of both parental alleles is shown in the F1.
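
The genotype scoring shown schematically in the Figure can be expressed as a short sketch. It assumes that each parental strain carries one known allele size for the microsatellite and that a sample is scored from the PCR fragment sizes read off the gel. The function name, the size tolerance, and the fragment sizes in the example are hypothetical.

```python
def call_genotype(fragment_sizes, size_a, size_b, tolerance=1):
    """Classify a sample as A/A, B/B, or A/B from its PCR fragment sizes (bp).

    size_a and size_b are the expected allele sizes in parental strains A and B.
    A band counts as a match if it lies within `tolerance` bp of the expected size.
    """
    if abs(size_a - size_b) <= 2 * tolerance:
        raise ValueError("parental alleles are too close in size to distinguish")
    has_a = any(abs(s - size_a) <= tolerance for s in fragment_sizes)
    has_b = any(abs(s - size_b) <= tolerance for s in fragment_sizes)
    if has_a and has_b:
        return "A/B"   # heterozygote, as in F1 progeny
    if has_a:
        return "A/A"
    if has_b:
        return "B/B"
    return "unknown"   # no band matching either parental allele


# Hypothetical allele sizes: 150 bp in strain A, 138 bp in strain B.
for bands in ([150], [138], [150, 138]):
    print(bands, call_genotype(bands, size_a=150, size_b=138))
```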

Mapping in Other Species
Genetic mapping has been widely embraced by the scientific community; more than 30 vertebrate species are the subject of genetic mapping projects, and high-resolution maps of microsatellite markers have been developed for humans, mice, rats, cows, sheep, pigs, fish, and chickens (19). Two invertebrates, Drosophila melanogaster (a dipteran fly) and Caenorhabditis elegans (a nematode), also have complete genetic and physical maps; the complete nucleotide sequence of the latter is expected in the near future. The status of individual genetic mapping projects and resources has been summarized, along with a compilation of databases for species-specific or comparative mapping reference (19,20). Integrating the data from these species-specific projects in a form that allows relevant information from diverse organisms to be assembled is a major challenge to biologic information systems. The most extensive coverage of mammalian species homologies is the Mouse Genome Database of The Jackson Laboratory (21). Initially developed for the mouse, comparative mapping data for more than 55 species may be searched online, with links to related genomic resources, such as the Human Genome Database, Ratmap, SheepBase, and PigBase.

Comparative Genetic Mapping
Because of the density of genetic markers positioned along the chromosomes of both organisms, the comparative map of the mouse and human genomes is the most well developed of all species. In a comprehensive summary of mouse/human homology published in 1996, 1,416 loci were placed on both maps by using human physical mapping data and mouse genetic maps (22). This comparison defined 181 conserved linkage groups, which together account for approximately 90% of the mouse genome. Further comparative mapping with newly discovered genes and expressed sequence tags will refine the chromosomal relationships between mouse and human.
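
The idea of a conserved segment can be illustrated with a simplified sketch. It assumes a table of loci mapped in both species, orders them along each mouse chromosome, and reports runs of two or more adjacent loci whose human orthologues lie on the same human chromosome. This tests shared chromosomal assignment only, not conserved gene order, so it is a looser criterion than the conserved linkage groups described above. The locus names and positions are hypothetical.

```python
from itertools import groupby


def conserved_segments(loci):
    """Group adjacent mouse loci that share a human chromosome.

    loci: iterable of (name, mouse_chr, mouse_position_cm, human_chr).
    Returns a list of (mouse_chr, human_chr, [locus names]) for every run
    of at least two consecutive loci on the mouse map.
    """
    ordered = sorted(loci, key=lambda locus: (locus[1], locus[2]))
    segments = []
    for (mouse_chr, human_chr), run in groupby(
            ordered, key=lambda locus: (locus[1], locus[3])):
        names = [locus[0] for locus in run]
        if len(names) >= 2:          # a single locus does not define a segment
            segments.append((mouse_chr, human_chr, names))
    return segments


# Hypothetical loci on mouse chromosome 1.
example = [
    ("g1", "1", 10.0, "2"),
    ("g2", "1", 12.0, "2"),
    ("g3", "1", 20.0, "1"),
    ("g4", "1", 30.0, "2"),
    ("g5", "1", 31.0, "2"),
]
for segment in conserved_segments(example):
    print(segment)
```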

Integrating Maps and Aligning Genomes
The integration of existing genetic maps of different species is a formidable challenge. Accurate, comprehensive comparisons of gene arrangements across different species will rapidly advance our understanding of all aspects of biology by allowing rapid information exchange across different model organisms and experimental systems. Several approaches have been used in developing universal mapping probes for diverse genomes (23-25). Of the two classes of loci used to construct gene maps, coding gene sequences (Type I markers), which show conservation among distantly related mammalian species, are most useful as landmarks for comparing linkage and syntenic association. Highly polymorphic sequences (Type II markers), such as microsatellites, are more abundant and are invaluable for mapping within a pedigree but are less useful for comparative purposes because they do not show adequate sequence conservation to recognize locus homology between mammalian orders. In 1993, a list of anchored reference loci for comparative genome mapping in mammals was proposed; it consists of 321 Type I markers equivalently spaced throughout the mammalian genome (26). This approach allowed the position of homologous loci in the maps of four species (human, mouse, cattle, and cat), which represent different mammalian orders, to be established. Interspecies comparison of conserved exon sequences of homologous genes has generated a new overlapping set of anchor loci called comparative anchor tagged sequences (25). Large-scale mapping of these sequences in several species may be an efficient way of developing high-resolution comparative maps with essentially complete genome coverage.

Alternative Techniques for Comparative Genomic Analysis
Mammalian genomes may be compared at several levels by using a variety of tools and strategies tailored to individual objectives. Although direct sequence comparison of whole genomes will provide the highest resolution for comparative study, this sophisticated form of analysis is at least several years away from being realized. At a cytologic level, species may be compared by fluorescence in situ hybridization (FISH) with single or multiple probes (single or multicolor Zoo-FISH), producing rapid, high-resolution chromosomal localization detectable by microscopy. Alternatively, libraries from microdissected or individual flow-sorted chromosomes may be constructed and used as fluorescence-labeled chromosome paints to probe the chromosomes of other species and identify homologous regions (27,28). The main advantage of chromosome painting is its rapid overall evaluation of the extent and character of genomic conservation among distantly related species, such as pig and cattle. In contrast to FISH, chromosome painting does not allow determination of gene order or high-resolution demarcation of chromosomal breakpoints. Radiation hybrid panels, another method for physical assignment of homologous loci (29,30), are generated by irradiation and subsequent fusion of a cell line containing a chromosome from one species, such as human, on another background, such as hamster. The donor DNA is fragmented at random, resulting in a series of lines retaining only fragments of the original chromosome. Conserved genes from other species may be mapped to the homologous region of the human genome by comparing the PCR pattern for each cell line to reference loci with well-established map positions.
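
The pattern comparison used with radiation hybrid panels can be sketched as follows. Each marker is represented by a string with one character per hybrid cell line ("1" if the PCR product is present, "0" if absent), and a query gene is assigned to the reference locus whose retention pattern it matches most often. This simple concordance count is an illustrative stand-in; actual radiation hybrid analysis typically uses statistical models of breakage and retention. The locus names and patterns are hypothetical.

```python
def concordance(pattern_a: str, pattern_b: str) -> float:
    """Fraction of hybrid lines in which two markers are both present or both absent."""
    if not pattern_a or len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(pattern_a, pattern_b))
    return matches / len(pattern_a)


def closest_reference(query: str, references: dict) -> tuple:
    """Return (name, concordance) for the reference locus most similar to the query."""
    scored = ((name, concordance(query, pattern))
              for name, pattern in references.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical panel of 10 hybrid lines and three reference loci.
reference_loci = {
    "refA": "1100110010",
    "refB": "0011001101",
    "refC": "1100100010",
}
query_gene = "1100110011"
print(closest_reference(query_gene, reference_loci))
```

Markers that lie close together on the donor chromosome tend to be retained or lost together, so a high concordance suggests physical proximity.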

Models of Human Disease
Identifying genetically regulated host immune responses might significantly advance our understanding of the molecular targets and immunologic mechanisms critical to robust defense against pathogenic microbes. To date the number of host defense genes that have been cloned remains small; comparative genomics has the potential to accelerate gene discovery by allowing available data for model organisms to be rapidly applied to the study of human disease. We summarize three examples of human host resistance genes in the following section; in each example, genetic analysis of mouse models of the human disease phenotype played a crucial role in the initial discovery of the human homologue or served as a means of validating the identity of the proposed human candidate disease gene.

Nramp1 and NRAMP1

The Mouse Nramp1 Gene
In classic inbred strains of mice, natural resistance to infection with Mycobacterium bovis (BCG), M. lepraemurium, Salmonella Typhimurium, and Leishmania donovani is controlled by the Bcg locus, also known as Ity and Lsh (31-33). The major effect of the Bcg gene is to modulate the growth rate of these diverse pathogens in cells of the reticuloendothelial system of the mouse during the preimmune phase of the infection (33). Resistant and susceptible strains are distinguished by the kinetics of infection shown by pathogen counts (CFUs or Leishmania-forming units) in liver and spleen after infection. The susceptible phenotype is characterized by a higher net growth rate of BCG, Salmonella, or Leishmania in the reticuloendothelial system during the early phase of infection, followed by specific immune responses in BCG- and L. donovani-infected mice or by a rapidly lethal infection with the virulent pathogen S. Typhimurium. Bcg is inherited as a simple autosomal dominant Mendelian trait in crosses between classical strains of laboratory mice; it was localized to mouse chromosome 1 by linkage analysis (34). Using a positional cloning strategy, Vidal et al. (35) isolated the Nramp1 (natural resistance-associated macrophage protein 1) gene as a strong candidate for the Bcg mutation based on its map location, its macrophage-restricted expression pattern, and a nonconservative Gly169Asp substitution in the protein of all susceptible strains. Creation of a null allele at Nramp1 then provided formal proof that a mutation within Nramp1 is the cause of the mouse susceptibility to infection with M. bovis, S. Typhimurium, and L. donovani (36).
Nramp1, an integral membrane phosphoglycoprotein located in the late endosome/lysosome compartment of resting macrophages, is recruited to the maturing phagosomal membrane (37), consistent with its potential function in controlling the replication of intracellular parasites by altering the intravacuolar environment in which they reside. Nramp1 is part of an ancient family of proteins with highly conserved members in mammals (including humans, cows, rats, sheep), birds, invertebrates (C. elegans, D. melanogaster), plants (Oryza sativa, Arabidopsis thaliana), fungi (Saccharomyces cerevisiae), and even bacteria (M. leprae and Escherichia coli) (38,39). This family is characterized by a highly conserved hydrophobic core consisting of 10 transmembrane (TM) domains with a structural organization typical of families of ion transporters and channels. In addition, the most highly conserved segments of the Nramp family (TM8-TM9 intracellular loop) show impressive similarity with the highly conserved region of mammalian voltage-gated K+ channels of the shaker type (40).
Several issues concerning the biochemical function of Nramp1 with respect to intracellular survival of taxonomically unrelated pathogens remain unresolved. Studies of the function of Nramp1-related sequences (Nramp2 and Smf1 in model organisms) provide insight into how Nramp1 confers resistance to microbial agents. Nramp2 has been isolated in mouse and human and shows a high degree of similarity to Nramp1 (77% overall similarity), with identical hydropathy profiles and predicted secondary structures (41,42). Mouse and human Nramp2 mRNA are both widely expressed, in contrast with the tissue-specific expression of Nramp1 (41,42). Recently, Nramp2 was shown to be a metal ion transporter with broad divalent cation specificity (including Fe2+, Zn2+, Mn2+, Co2+, Cd2+, Cu2+, Ni2+, and Pb2+), driven by the proton electrochemical gradient in Xenopus laevis oocytes (43). Studies using the yeast double mutant SMF1/SMF2 provided additional support concerning the function of Nramp2 as a divalent cation transporter. Inactivation of SMF1 and SMF2, two yeast Nramp homologues encoding divalent cation transporters (44), is specifically complemented by Nramp2 (45). In vivo, Nramp2 plays an important role in normal iron transport. Mutation within Nramp2 causes microcytic anemia in mk mutant mice because of severe defects in intestinal iron uptake (46). Interestingly, the missense mutations in mutant Nramp1 and Nramp2 alleles introduce a charged amino acid in two adjacent positions of TM4, confirming the importance of this region of both proteins for normal function. It has been suggested that Nramp1 may also be a divalent cation transporter; its role in reticuloendothelial cells remains unexplored (40,44).

The Chicken NRAMP1 Gene
The discovery of Nramp1 allowed the study of its role in susceptibility to related infections in other species. Salmonellosis, one of the most common causes of food poisoning in humans, is frequently caused by ingestion of contaminated poultry products; efforts to identify Salmonella resistance genes in poultry could lead to more efficient poultry control strategies, thereby reducing secondary human morbidity. Genetic regulation of chicken host resistance exists, as inbred poultry lines differ in their susceptibility to infection with several strains of Salmonella. Segregation analysis with a combination of Salmonella-resistant and Salmonella-susceptible lines has shown that resistance to infection is fully dominant and is not sex-linked or associated with the major histocompatibility complex (47). The chicken Nramp1 homologue was tested as a candidate for the differential resistance of inbred chicken lines to infection with S. Typhimurium by using sequencing and linkage analyses (48). Through the use of a mouse cDNA, the chicken Nramp1 homologue has been cloned and shown to share 68% identity with the mouse gene (49). As demonstrated in mice, the macrophage is a major site of NRAMP1 mRNA expression in chickens (49). NRAMP1 mRNA transcripts from S. Typhimurium-resistant or -susceptible chickens were analyzed to identify amino acid sequence variants that could be associated with the disease phenotype. Eleven sequence variants in Nramp1 mRNA were obtained from three Salmonella-resistant and three Salmonella-susceptible chicken lines; ten resulted in silent mutations or conservative changes (to amino acids with similar physical properties) that were detected in both resistant and susceptible chicken lines, while only one sequence variant resulted in a nonconservative substitution of a positively charged residue (Arg223) by a polar residue (Gln223). This allelic variant was specific to the susceptible line C and was clearly associated with survival to infection (a resistance allele at NRAMP1 improved survival rate from 13% to 27%) (48). Taken together, these data strongly suggest a direct role of NRAMP1 in susceptibility to infection in chickens.

The Human NRAMP1 Gene
Work in inbred strains of mice has established unambiguously that Nramp1 has an important role in determining resistance to mycobacterial infections and has encouraged several research groups to test the association of NRAMP1 with corresponding human infections. Host genetic factors play a major role in determining the outcome of mycobacterial infections in humans, as shown by racial variation in susceptibility to infection and higher concordance of tuberculosis and leprosy among monozygotic twins compared with dizygotic twins and siblings (50,51). Segregation analysis in a population from Desirade Island (French West Indies) has demonstrated that susceptibility to leprosy (regardless of the clinically defined subtype) is controlled by a major gene not linked to the major histocompatibility complex (52). Through use of a candidate gene approach, population association studies, and linkage analysis, several genes (HLA-linked genes, tumor necrosis factor, collectin, vitamin D receptor, interferon gamma receptor) have each been associated with susceptibility to mycobacterial infections (53,54).
The chromosomal region surrounding Nramp1 on mouse chromosome 1 has been conserved on the telomeric end of human chromosome 2q35 and contains the human NRAMP1 orthologue (55). Sequence comparison of the mouse and human Nramp1/NRAMP1 proteins showed a high degree of conservation between the two species (85% identity, 92% similarity); the most conserved region was the intracellular loop containing the consensus sequence transport motif (56). In humans, the sites of highest NRAMP1 expression are peripheral blood leukocytes and lungs (56). The high degree of sequence homology between mouse and human NRAMP1, the presence of similar regulatory elements within the promoter regions of the genes, and similar tissue expression patterns support the notion that the NRAMP1 protein plays similar roles in vivo in both mice and humans.
A number of polymorphic variants have been used to study the association of NRAMP1 and susceptibility to leprosy and tuberculosis (57-60). One study based on the segregation analysis of certain NRAMP1 haplotypes in 20 multiplex families involving 168 individuals from South Vietnam clearly showed that NRAMP1 was involved in predisposition to leprosy (61). Another large study measuring the association of NRAMP1 with clinical tuberculosis in a population of Gambia (West Africa) demonstrated that polymorphic variations within the human NRAMP1 gene affect susceptibility to the disease (62). Nevertheless, susceptibility to either leprosy or tuberculosis appears to be genetically heterogeneous since the role of NRAMP1 was observed only in certain ethnic groups (63,64).
Identification of Nramp1 illustrates the value of comparative genomics for identification and characterization of the biologic basis for differences between susceptible and resistant hosts. Genetic dissection of the mouse model of M. bovis infection was crucial to the identification of similar mechanisms governing the human response to medically important diseases such as tuberculosis and leprosy. Comparative genomics was also important in accelerating the identification of an important host resistance gene for salmonellosis in the chicken (a species of significant agricultural importance), where the available genetic tools are modest relative to mice and humans.

Chediak-Higashi Syndrome (CHS)
CHS is a rare autosomal recessive disorder characterized by partial ocular and cutaneous albinism, a mild bleeding diathesis, and peripheral sensorimotor neuropathy. The most serious phenotype among CHS patients, however, is a marked increase in susceptibility to bacterial infection that may lead to death during the first 2 decades of life. These clinical features are attributable to dysfunctional granule-containing cells including melanocytes, platelets, Schwann cells, neurons, and granulocytes (65,66). On the basis of phenotypic similarity, the beige (bg) mutation in mice has long been regarded as a model for CHS (67). Several components of the immune system are affected in Beige/CHS. Neutrophils exhibit defective chemotaxis and reduced intracellular killing for up to 90 minutes after bacterial phagocytosis, and their granules lack the serine proteases cathepsin G and elastase because of a failure of normal protein sorting (68,69). Natural killer cell activity is defective, causing impaired cytolysis of tumors and virally infected cells; cytotoxic T-cell responses against allogeneic tumor cells are also abnormal (70,71). Mice with the bg mutation have increased susceptibility to a variety of pathogens, including cytomegalovirus, Leishmania donovani, Candida albicans, and a variety of pathogenic bacteria (E. coli, Klebsiella pneumoniae, Staphylococcus aureus, Streptococcus pneumoniae) (72-74).
To identify the genetic basis of this host resistance defect, the bg gene was localized to a 0.24 cM interval of proximal mouse chromosome 13 by genetic mapping of three mouse backcrosses segregating this phenotype (75). A DNA contig of this region spanning 2,400 kb was constructed from large-capacity yeast artificial chromosomes and P1 bacteriophage clones (76).
Using yeast artificial chromosome complementation and direct cDNA selection, two groups subsequently identified portions of a candidate gene for bg, named Lyst (lysosomal trafficking regulator) (77,78). Lyst, ubiquitously expressed in the mouse, has a maximum transcript size of approximately 12 kb and possible complex alternative splicing. Several mutations predicted to severely truncate the Lyst polypeptide were identified within each transcript. Through the use of partial sequence data for mouse Lyst, 27 cDNAs corresponding to the human gene were identified and assembled into a complete human gene sequence of 13,499 bp, with an open reading frame of 11,403 bp (79). Comparison of the partial 3' mouse cDNA to the human sequence demonstrated 77.2% nucleotide identity and 87.9% amino acid identity, indicating that the human and mouse genes are highly homologous. Sequence analysis of three CHS patients identified pathologic mutations in all.
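
Identity figures such as those quoted above are computed from a pairwise alignment. The sketch below shows one common way of doing so, assuming the two sequences have already been aligned to equal length with "-" marking gaps and counting only columns in which neither sequence has a gap. Conventions differ (some methods include gapped columns in the denominator), and this is not necessarily the method used in the cited study; the sequences in the example are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue                      # skip columns containing a gap
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 4 matches in 5 ungapped columns.
print(percent_identity("ACGT-A", "ACCTTA"))
```

The same function applies to aligned protein sequences, which is how nucleotide and amino acid identity can be reported side by side for one gene.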
Comparative genetic mapping between the region of mouse chromosome 13 with the bg mutation and the human genome indicates homology with distal chromosome 1q. Consistent with this alignment, genetic mapping of the human CHS locus in affected families localized it to 1q42-1q44 as part of a conserved linkage group shared with mouse chromosome 13 (80,81). Radiation hybrid mapping also assigned the human CHS candidate gene to 1q43, confirming that the bg phenotype in mouse and human CHS are both caused by mutations in orthologous genes (79). Database searches with the complete nucleotide sequence of the CHS gene showed significant homology to open reading frames from S. cerevisiae and C. elegans, as well as a human cell division control protein-4 (CDC4L) (82). The modular architecture of the CHS protein is similar to Vps15, a yeast serine/threonine protein kinase thought to be part of a membrane-associated signal transduction complex regulating intracellular protein trafficking (83). To date, the function of the CHS gene remains unknown, although it may be similar to Vps15 and may be part of a novel gene family.

X-Linked Agammaglobulinemia (XLA)
XLA, one of the first primary immunodeficiency disorders described in humans, is the prototypic example of the protective role of humoral immunity against common bacterial pathogens (84). XLA is characterized by a profound deficiency of B-lymphocyte development at two sequential stages of maturation within the bone marrow (85). This defect results in marked reductions in the serum levels of all three major classes of immunoglobulins and a profound decrease in the number of B lymphocytes in the peripheral blood as well as in the lymphoid follicles and germinal centers of lymph nodes. The clinical manifestations generally begin by the end of the first year of life, once the level of maternally derived antibodies has declined. Bacterial infections with organisms such as S. pneumoniae, Haemophilus influenzae, S. aureus, and Pseudomonas species are most common, with the respiratory tract being most frequently affected. Gastrointestinal infections with Salmonella or Campylobacter have also been reported, as have urogenital infections with Mycoplasma or Chlamydia. XLA patients also have defective host resistance to enteroviruses, since neutralizing antibody is important in controlling these pathogens during their passage through the bloodstream. Resistance to other infections for which intact T-lymphocyte function is required (e.g., tuberculosis or histoplasmosis) remains intact.
Recognition of the familial occurrence of this rare disorder and pedigree analysis demonstrated an X-linked recessive inheritance pattern of the trait (86). Carrier females could not be detected because they are phenotypically normal, with normal serum levels of immunoglobulin. Linkage studies of over 500 individuals from 60 families mapped the gene for XLA to the midportion (Xq22) of the X chromosome, cosegregating with the polymorphic genetic marker DXS178 (87,88). By using complementary strategies of positional cloning and low-stringency cDNA library screening, two groups identified a novel src-like cytoplasmic tyrosine kinase, named Btk (Bruton agammaglobulinemia tyrosine kinase), as a strong candidate gene for XLA (89,90). Btk was mapped to the XLA locus by FISH and somatic cell hybrid analysis and was expressed in cell lines representing all stages of B-cell development, myelomonocytic cell lines, and a macrophage cell line; it was not detectable in T-lineage cell lines (90). In transformed B-cell lines from individuals affected with XLA, the expression level of Btk mRNA and protein, and consequently its kinase activity, was reduced or absent. Southern blot analysis of DNA from pedigrees with XLA cases showed restriction fragment length alterations that segregated in an X-linked recessive pattern; detailed analysis disclosed either genomic DNA deletions in the region encompassing Btk or missense point mutations resulting in nonconservative amino acid substitutions at important residues in the putative protein-tyrosine kinase domain (89). These findings provide strong evidence that the failure of normal B-cell growth and differentiation in XLA is caused by abnormal function of an intracellular protein tyrosine kinase.
The X-linked immunodeficiency (xid) of the CBA/N inbred mouse strain has been regarded as an experimental model for human XLA since it was first described in 1972 (91). B lymphocytes from these mice exhibit pleiotropic defects in development and function. Normal numbers of pro-B, pre-B, and surface immunoglobulin-positive B cells exist in the bone marrow, while peripheral B-cell numbers are significantly reduced (30% of normal). The B lymphocytes that are present have an abnormal surface marker phenotype, and B-cell proliferation triggered through the surface immunoglobulin M (IgM) receptor or surface immunoglobulin crosslinking is impaired, as are responses to a number of other mitogenic stimuli including lipopolysaccharide, interleukins IL-5 and IL-10, CD38 receptors, and CD40 ligands. Consistent with these defects, CBA/N mice have reduced serum IgM and IgG3 antibody levels and cannot make antibody responses when challenged with type-2 thymus-independent antigens (e.g., polysaccharides and hapten-polysaccharide conjugates). As with human XLA, impaired humoral immunity can result in increased susceptibility to bacterial pathogens, including S. Typhimurium (92). Inheritance of the susceptibility trait was linked to the xid locus by using back-cross and F2 progeny derived from crosses of CBA/N and DBA/2N parental strains.
To determine whether XLA and xid were caused by mutations in homologous genes, two groups performed genetic mapping of xid and Btk. The Btk gene was closely linked to the xid locus in the distal region of the mouse X chromosome by using an interspecific backcross mapping panel (93), and precise colocalization of Btk and xid was observed in 1,114 segregating back-cross progeny (94). Normal and mutant mouse strains did not differ in Btk expression or in vitro kinase activity. Sequence analysis of the mouse Btk transcript in CBA/N and several immunocompetent mouse strains (including the CBA/CaHN progenitor) demonstrated a point mutation within the first coding exon that is predicted to convert a highly conserved arginine residue to cysteine. This amino acid substitution occurs within the pleckstrin homology domain in the amino-terminal region of the protein and is presumed to alter normal B-cell signaling by disrupting protein-protein interactions. To unequivocally confirm that mutations in Btk were responsible for the xid phenotype, targeted gene disruption (a gene knockout experiment) was performed in embryonic stem cells (95,96). Complete elimination of Btk protein production identically reproduced the xid phenotype, indicating that the naturally occurring point mutation produces a complete loss-of-function phenotype or results in a protein with dominant negative properties (presence of a single mutant allele is sufficient to block normal gene function). The severe early B-lymphocyte developmental arrest of human XLA was not observed, which suggests that Btk function in mice is accompanied by a compensatory mechanism operating during early B-cell development to rescue B-cell maturation.
On the basis of comparative mapping and sequence analysis, human XLA and the mouse xid phenotype are clearly homologous disorders caused by mutations in orthologous genes. Nevertheless, although the underlying genetic alteration in both species was successfully identified, a number of issues remain unresolved. First, the phenotypes observed in these two disorders are not identical; the more severe block of early lymphocyte development in XLA results in a greater deficiency of peripheral B cells relative to the CBA/N mouse strain, suggesting that the requirement for Btk in early murine B-cell development is less stringent than that for humans. Second, the range of pathogens to which humans are highly susceptible appears more diverse than the range for mice. Finally, the exact role of Btk in normal B-cell physiology remains to be demonstrated. Thus far, identification of BTK has led to carrier detection and prenatal counselling; additional characterization of a mouse model with great similarity to the human condition could advance our understanding of the fundamental processes underlying B-lymphocyte development and function.

Conclusions
Complete understanding of infectious disease pathogenesis requires identification and characterization of host genes that regulate the response to virulent microorganisms. Through evolutionary selection, a series of innate immune defense mechanisms has evolved to protect the host against the constant threat of microbial injury and to direct the development of specific adaptive immune responses. Genetic analysis of naturally occurring variation in the host response among model organisms has successfully identified novel genes such as Nramp1, Lyst, and Btk, thus providing new insights into the molecular nature of host resistance. Rapid advances are now being made in the creation and integration of dense genetic maps of model organisms and humans. Comparative genomics will play an increasingly important role in facilitating the transfer of new knowledge from experimental models to a more complete understanding of human host resistance.