Isolation and Characterization of a Rat Amylase Gene Family*

Portions of at least nine distinct rat amylase genes or pseudogenes have been isolated. Cloned rat genomic DNA fragments containing complete or major portions of seven of these have been examined by heteroduplex analysis and fall within two separate groups based on their degree of homology. Four gene sequences comprising one of these groups are closely related to pancreatic amylase mRNA. The other group shows significant nonhomology to both pancreatic and parotid amylase cDNAs and may represent an additional gene type(s). All of the cloned amylase gene sequences are found in rat genomic DNA. Additional amylase sequences which have not yet been cloned are also detected. Comparison of DNA from individual Sprague-Dawley rats by Southern blotting techniques indicates allelic variation at multiple amylase loci.

The α-amylases are a multigene family of widespread occurrence (1) in eucaryotic organisms. In the rat, analysis of the α-amylase enzymes and the mRNAs encoding them has indicated that the rat genome contains multiple related amylase genes which are expressed selectively and to varying extents in the pancreas, parotid, and liver. Pancreatic and parotid amylases can be differentiated immunologically and by their unique peptide maps (2,3). Further, the two mRNAs differ in size and sequence (4). The cDNAs derived from both pancreatic and parotid amylase mRNAs have been cloned and the sequence of each has been determined (5). Amylase mRNA from liver also differs slightly in sequence from the pancreatic RNA (4) and is larger than either the pancreatic or parotid mRNAs. In the mouse, the parotid and liver amylases are encoded by a single gene with overlapping regions (6).
We report here the cloning and analysis of a set of rat amylase genes. We have isolated 39 genomic DNA fragments from a λ Charon 4A library derived from a single animal. Portions of at least nine distinct amylase genes can be recognized. Seven of the clones contain complete or major portions of amylase gene sequence. We present an analysis of the structural relationship between these seven gene sequences. A preliminary characterization of four of these genes has been presented previously (5,7,8).

MATERIALS AND METHODS
Screening of a Rat Genomic DNA-λ Charon 4A DNA Library for Amylase Gene Sequences-A Sprague-Dawley rat genomic DNA-λ Charon 4A library (9) was screened for amylase gene sequences as described by Benton and Davis (10). Pancreatic amylase cDNA (5,7,8), 32P-labeled by nick translation (11) to a specific activity of approximately 10^8 cpm/µg, was used as probe. Plaque-purified recombinant phage were used to prepare DNA for further analysis.
Restriction Endonuclease and Southern Blot Analysis of Cloned Genomic DNA Fragments-Restriction endonuclease sites were ordered by double digestions with reference to known sites in the Charon 4A arms. Cloned genomic DNA fragments were digested with restriction endonucleases, subjected to electrophoresis on 1% agarose gels, and transferred to nitrocellulose filters (BA-85, Schleicher & Schuell) as described by Southern (12). To orient amylase gene sequences, hybridizations were performed with either 5' or 3' portions of the pancreatic amylase cDNA (1-5 × 10^6 cpm/filter). These two probes, designated pcXP100 (5') and pcXP38 (3'), together extend from the 3' end of the pancreatic amylase mRNA to 29 nucleotides from the 5' end and overlap for about 200 bases. The isolation and nucleotide sequence of both probes have been reported (5).
Southern Blot Analysis of Rat Liver DNA-High molecular weight nuclear DNA from the liver of individual Sprague-Dawley rats, Rattus norvegicus, was isolated as previously described (13). Southern blotting methods and hybridization with 32P-labeled (1-5 × 10^6 cpm/filter) pancreatic amylase cDNA probes were as described above.
Heteroduplex Analysis-Mapping of homologous regions between the cloned amylase genomic DNA fragments and the cDNA plasmid, pcXP101 (5), was performed as described by Fergusson and Davis (14) with the following modifications. Equimolar amounts of cloned amylase genomic fragments and wild type Charon 4A DNA were added to a final concentration of 0.5-2 ng/µl. pcXP101, constructed by splicing pcXP38 and pcXP100 through a common unique restriction site (5), was added at a 20-fold molar excess relative to the genomic DNA. To determine the extent of homology between amylase gene sequences, heteroduplex molecules were formed by annealing equimolar amounts of the genomic DNA fragments. DNA concentrations were the same as used for cDNA mapping. In both cases the renatured DNA mixtures were diluted 4- to 10-fold for spreading (15). Molecular structures were visualized with a Philips 300 electron microscope and length measurements were made from electron micrographs using a Hewlett-Packard 9864A digitizer coupled to a HP9821A calculator. pcXP101 DNA and λ Charon 4A DNA were used as molecular length standards.
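The conversion of digitized contour lengths to bases by means of a length standard can be illustrated with the following minimal sketch. It is not the procedure run on the original calculator. The function names, the arbitrary digitizer units, and all numbers in the usage note are hypothetical, and the actual sizes of the standards are not stated here.

```python
from statistics import mean, stdev


def bases_per_unit(standard_lengths_units, standard_size_bases):
    """Scale factor (bases per digitizer unit) from repeated tracings of a standard.

    standard_lengths_units: contour lengths of the standard molecule, in
        arbitrary digitizer units (hypothetical input).
    standard_size_bases: known size of the standard in bases (supplied by the user).
    """
    return standard_size_bases / mean(standard_lengths_units)


def summarize_region(region_lengths_units, scale):
    """Return (mean length in bases, standard deviation as % of the mean)."""
    if len(region_lengths_units) < 2:
        raise ValueError("at least two measurements are needed for a standard deviation")
    sizes = [length * scale for length in region_lengths_units]
    average = mean(sizes)
    return average, 100.0 * stdev(sizes) / average
```

For example, with a hypothetical 5000-base standard traced at 10.0 units, `bases_per_unit([10.0, 10.0], 5000)` gives 500 bases per unit, and a region traced at 0.4, 0.5, and 0.6 units is then summarized by `summarize_region([0.4, 0.5, 0.6], 500.0)`.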
Melting Profile Measurements-The degree of homology between the cloned rat amylase genes Amy 1-7 and pancreatic and parotid cDNA was examined by determining the melting profiles of hybrids formed between these molecules. Pancreatic and parotid hybridization probes were prepared from the cDNA inserts of the plasmids pcXP101 and pcRP16, respectively, by nick translation (11).
The resultant hybrids were melted by incubating filters for 5 min in 50% formamide, 10 mM Tris-HCl, pH 7.5, 1.0 mM EDTA, 0.1% SDS, 10 µg/ml of sonicated salmon sperm DNA at increasing temperatures over the range 21-70 °C. The amount of 32P-label released from the filter at each temperature was measured by liquid scintillation. Background was determined as the amount of 32P released from filters containing λ Charon 4A DNA.
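A melting temperature can be read from such a stepwise release profile as sketched below. The sketch assumes that Tm is taken as the temperature at which half of the background-corrected label has been released, with linear interpolation between temperature steps. That definition, the function name, and the example inputs are assumptions made for illustration, not details given in the text.

```python
def melting_temperature(temps_c, released_cpm, background_cpm):
    """Estimate Tm (deg C) as the temperature of 50% cumulative label release.

    temps_c: increasing incubation temperatures.
    released_cpm: 32P counts released from the hybrid filter at each step.
    background_cpm: counts released from the control filter at each step.
    """
    if not (len(temps_c) == len(released_cpm) == len(background_cpm)):
        raise ValueError("all inputs must have the same length")
    # Subtract background at each step; negative values are treated as zero.
    net = [max(r - b, 0.0) for r, b in zip(released_cpm, background_cpm)]
    total = sum(net)
    if total <= 0:
        raise ValueError("no label released above background")

    cumulative = 0.0
    prev_temp, prev_frac = None, 0.0
    for temp, counts in zip(temps_c, net):
        cumulative += counts
        frac = cumulative / total
        if frac >= 0.5:
            if prev_temp is None or frac == prev_frac:
                return float(temp)
            # Linear interpolation between the two steps bracketing 50% release.
            return prev_temp + (0.5 - prev_frac) * (temp - prev_temp) / (frac - prev_frac)
        prev_temp, prev_frac = temp, frac
    return float(temps_c[-1])
```

A call such as `melting_temperature([21, 28, 35, 42, 49, 56, 63, 70], counts, background)` returns a single interpolated value. Finer temperature steps near the transition would reduce the interpolation error.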

RESULTS
Isolation of Rat Amylase Genes-A rat genomic DNA-λ Charon 4A library (kindly provided by T. Sargent, B. Wallace, and J. Bonner (9)) was screened with a cloned pancreatic amylase cDNA probe (5,7,8). Out of 2 × 10^6 plaques examined (>0.999 probability of isolating a gene sequence if present in the library), 39 gave positive hybridization signals. These clones were plaque purified and DNA was prepared from each for further analysis.
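The quoted probability can be checked with the standard library-coverage estimate P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one clone and N is the number of clones screened. The sketch below assumes a haploid genome of 3 × 10^9 bp and a mean insert of 15 kb. Both figures are assumptions chosen for illustration and are not stated in the text, and the function names are hypothetical.

```python
import math


def recovery_probability(n_clones, insert_bp, genome_bp):
    """Probability that a given single-copy sequence is present among n_clones."""
    fraction = insert_bp / genome_bp
    # 1 - (1 - f)^N, written with log1p/expm1 for numerical stability at small f.
    return -math.expm1(n_clones * math.log1p(-fraction))


def clones_needed(probability, insert_bp, genome_bp):
    """Number of clones required to reach the requested recovery probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log1p(-probability) / math.log1p(-fraction))


# Assumed values (not from the paper): 15-kb inserts, 3e9-bp haploid genome.
p = recovery_probability(2_000_000, 15_000, 3_000_000_000)
```

Under these assumptions the probability exceeds 0.999, in line with the figure quoted for 2 × 10^6 plaques. Smaller inserts or a larger genome would lower it.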
Restriction Endonuclease and Southern Blot Analysis of Cloned Rat Amylase Gene Sequences-The cloned genomic DNA fragments were analyzed by EcoRI digestion and by their selective hybridization with either the 5' or 3' pancreatic amylase cDNA (5) probe according to the method of Southern (12). By this analysis, these genomic sequences appear to fall within 12 nonoverlapping fragments (Fig. 1). Seven of these contain major portions of amylase gene sequences and are designated Amy 1 through Amy 7 (no relationship to specific genetic loci is intended). The other five contain smaller portions of amylase gene sequences and extend in either the 5' or 3' direction. A more detailed restriction endonuclease analysis was carried out with Amy 1 through 7 (Fig. 1).
The degree of similarity of restriction endonuclease sites within corresponding regions suggests that some of these gene sequences are closely related. Amy 1 through 6 can be arranged in two groups on this basis, one group comprising Amy 1 through 4, the other Amy 5 and 6. The minor differences that occur between the fragments within these groups appear predominantly as small insertions/deletions of DNA as determined by a change in the length separation of common restriction endonuclease sites, e.g. a 1.5-kb EcoRI fragment of Amy 2 is 0.2 kb larger in Amy 3 and 1, and an extra segment of DNA approximately 0.6 kb in length is present in the 8.4-kb EcoRI fragment of Amy 6 but not in the corresponding fragment of Amy 5. The latter DNA segment can be located by a unique XhoI site which lies within it. Gene sequence differences are also detected as occasional gain/loss of restriction endonuclease sites suggesting single base pair changes, e.g. the XbaI site present in Amy 4 is missing in Amy 1, 2, and 3. Based on its restriction endonuclease map, Amy 7 is the most divergent of the seven gene fragments. Amy 1 and 3 apparently have been isolated in their entirety within a single genomic DNA fragment; the remaining five appear to be missing portions of the gene at the 5' (Amy 2, 5, 6, and 7) or 3' (Amy 4) ends. One of the three fragments containing a small 3' portion of amylase gene sequence may represent the missing 3' end of Amy 4 (see below). We have thus isolated a minimum of nine distinct amylase gene sequences, comprising Amy 1 through 7, plus the two remaining fragments containing 3' ends of genes. Two other fragments containing 5' ends have not as yet been assigned to specific gene sequences.

Restriction Maps of Rat Amylase Genomic Clones
Cloned Amylase Gene Sequences in Rat Genomic DNA-Fig. 2 shows a Southern blot of EcoRI-digested liver DNA from 10 individual Sprague-Dawley rats hybridized with either the 5' (Fig. 2A) or 3' (Fig. 2B) cDNA probe. Table I lists the molecular lengths of the EcoRI fragments hybridizing to the two probes and indicates their incidence in individual rats. All of the EcoRI fragments from the cloned amylase genes (Fig. 1) are observed except the 0.2-kb fragment from Amy 5. We believe this results from a low level hybridization signal. The companion 7.8-kb fragment is observed. In Southern blots of cloned Amy 5 DNA the 0.2-kb fragment is detected but exhibits a low intensity hybridization band relative to the 0.8-kb fragment. In lane 5 both probes hybridize to EcoRI fragments of approximately 6 kb. In contrast, the two cloned 6.0-kb EcoRI fragments hybridize exclusively with the 5' probe (Fig. 1), suggesting the existence of an additional 6.0-kb fragment.
Location of cDNA Regions within Amylase Genomic Sequences by Heteroduplex Analysis-The organization of cDNA sequences within Amy 1 through 7 was determined from an electron microscopic examination of heteroduplexes formed by annealing the Amy genomic DNA fragments with pancreatic amylase cDNA. An electron micrograph and interpretive drawing of the results are presented for Amy 3 in Fig. 3.
cDNA sequences present in the genomic DNA were oriented after heteroduplex formation by the pBR322 tails present in the heteroduplex structure. These tails were produced prior to heteroduplex formation by digestion of the probe with SalI, which results in the formation of a long and a short tail of pBR322 sequences situated at the 3' and 5' ends of the cDNA, respectively (5). λ Charon 4A DNA was also included in the heteroduplex mixture to demarcate the left and right Charon 4A DNA arms that flank the genomic DNA fragments. This allows a determination of the position of the amylase gene sequence within the genomic DNA fragment independent of the restriction endonuclease analysis.
The overall gene sequence organization is similar to that reported previously for Amy 1 (4); seven intervening sequences are detectable. A comparison of the estimated lengths of corresponding cDNA regions and intervening sequences for the seven genes is presented in Table II. The cDNA regions of the genes range in length from approximately 130 to 300 bases and the intervening sequences from approximately 300 to 2000 bases. The similarity in the estimated lengths of corresponding cDNA regions indicates that these regions are interrupted in similar positions. Thus at this level of analysis members of the rat amylase gene family appear to have common organization. cDNA region 8 of Amy 7 is an exception to the close similarities in length exhibited by corresponding cDNA regions. This region is approximately 100 bases shorter than the corresponding cDNA regions in other cloned gene sequences and does not anneal either with the cDNA probe in the Southern blot analysis under the high stringency conditions used or with the cDNA region 8 of Amy 5 (see Fig. 5D). These results suggest that there is reduced sequence homology in this region.
Inverted repeat sequences, visualized as intrastrand duplexes, were present in all seven genomic DNA fragments (shown by the arrows in Fig. 5). These range in size from approximately 100 to approximately 750 bases and are found in both flanking regions and intervening sequences.
Determination of Amylase Gene Sequence Homology by Heteroduplex Analysis-In order to examine the degree of relatedness of the amylase gene sequences, heteroduplexes were formed between the seven genomic DNA fragments and examined by electron microscopy. Electron micrographs and interpretive drawings of heteroduplexes are presented in Fig. 4, A-D. Schematic representations are presented in Fig. 5.

A comparison of the length (bases) of corresponding cDNA regions and intervening sequences in Amy 1 through 7
Molecular length estimates were determined as averages from measurements of at least 5 heteroduplexes and standard deviations range from ±5% for the largest measurements to ±50% for the smallest measurements. ND, not determined.

In C a composite of Amy 2 and 4 is shown since these gene sequences considered separately lack cDNA regions 2 and 8, respectively. Hybrid molecules were not made between Amy 5 and Amy 1 or 3 since they are in opposite orientations within the λ Charon 4A vector. In C-E only corresponding regions within the two hybridized fragments are shown.
Heteroduplexes in C and D have two different conformations with equal frequency in the 3' flanking region as indicated. This suggests the presence of a 1-kb direct repeat in the 3' flanking region of Amy 5 as indicated by the arrows. There is also a partial repeat of the sequence which is interrupted by a 1.5-kb segment of DNA in the 3' flanking region of Amy 7 as indicated by the arrows. For clarity, genomic regions which hybridize to amylase cDNA (1-8) although not detected by this analysis are represented here as filled boxes. Inverted repeats are indicated as arrows in opposite orientations.
The closely related gene sequences Amy 1 through 4 hybridize along their entire regions of overlap (Figs. 4, A and B, and 5A). Since single-stranded regions are not observed within these hybrid regions, deletions/insertions of DNA present within these fragments can be no greater than 50-100 bp in length, the limit of detection of single-stranded regions by this technique.
The second group of closely related sequences, Amy 5 and 6, also hybridize extensively with each other along the entire length of their overlap except for a looped out region in intervening sequence 5-6 (Fig. 5B). This corresponds to the extra 600 bp in the 8.4-kb EcoRI fragment of Amy 6.
In contrast to the close homology observed within these two groups of genes, less homology is evident between groups in the intervening and flanking sequences. For the most part, intervening sequences from these two groups do not cross-hybridize; however, there are minor regions of close homology up to 150 bases in length within some of the intervening sequences. In some instances these homologous regions are located in the central portions of the intervening sequences as well as at the junctions. The 3' flanking regions of both groups contain a region of homology about 1000 bases in length. This region is directly repeated in one of the genomic fragments as evidenced by two distinct populations of hybridizing structures.
The results of these experiments suggest that Amy 7 is more closely related to Amy 5 and 6 (Fig. 5D) than to the first group (Fig. 5E). Nonhomologous regions up to 300 bases in length are detected in heteroduplexes of Amy 7 with Amy 5, whereas such regions are more extensive in heteroduplexes of Amy 7 with Amy 1-4. Interestingly, Amy 7 shares the homologous 1-kb segment in the 3' flanking region with the other genes.
Identification of Pancreatic Amylase Genes-The genes have been characterized with respect to their homology with pancreatic and parotid coding sequences by determination of melting profiles for hybrids formed between each of the cloned genes and pancreatic and parotid amylase cDNA probes (Fig. 6). In agreement with previous reports (4,19) we find a 7 °C depression in Tm for the heterologous hybrids, 32P-pancreatic cDNA/parotid cDNA plasmid and 32P-parotid cDNA/pancreatic cDNA plasmid, relative to the Tm (41 °C) observed for the homologous hybrids, 32P-pancreatic cDNA/pancreatic cDNA plasmid and 32P-parotid cDNA/parotid cDNA plasmid. The melting temperatures for the corresponding hybrids with Amy 1-7 are given in Table III. Amy 1-4 DNA hybridizes with the pancreatic cDNA with a Tm of about 38 °C. In contrast, the Tm of these Amy fragments with the parotid probe is about 33 °C. These results suggest that Amy 1-4 are pancreatic-type amylase genes. The 3 °C depression of Tm observed for these genes relative to the Tm observed for the pancreatic cDNA probe hybridized to itself may be due to the presence of intervening sequences in the genes. Partial sequence analysis supports the identity of the pancreatic type genes and indicates that the mRNA capping site is contiguous with cDNA region 1 in these genes. Amy 5-7 display significant nonhomology (Tm approximately 33 °C) to both cDNA probes.
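The reasoning applied to these melting temperatures can be sketched as a simple rule: compute the Tm depression of each hybrid relative to the 41 °C homologous value, and assign a gene to a type only when one probe shows a small depression that is also smaller than that of the other probe. The 4 °C cutoff and the function names below are assumptions introduced for illustration. The text itself states no numerical criterion.

```python
HOMOLOGOUS_TM_C = 41.0  # Tm of the homologous cDNA/cDNA hybrids quoted in the text


def tm_depression(tm_c, reference_tm_c=HOMOLOGOUS_TM_C):
    """Depression of Tm (deg C) relative to the homologous hybrid."""
    return reference_tm_c - tm_c


def classify_gene(tm_pancreatic_c, tm_parotid_c, max_depression_c=4.0):
    """Assign a gene type from its Tm with each probe.

    max_depression_c is a hypothetical cutoff, not a value from the paper.
    """
    d_pancreatic = tm_depression(tm_pancreatic_c)
    d_parotid = tm_depression(tm_parotid_c)
    if d_pancreatic <= max_depression_c and d_pancreatic < d_parotid:
        return "pancreatic-type"
    if d_parotid <= max_depression_c and d_parotid < d_pancreatic:
        return "parotid-type"
    return "unassigned"


# Approximate values quoted in the text.
amy_1_to_4 = classify_gene(38.0, 33.0)  # pancreatic probe 38, parotid probe 33
amy_5_to_7 = classify_gene(33.0, 33.0)  # about 33 with both probes
```

With the approximate values quoted above, Amy 1-4 fall in the pancreatic-type class and Amy 5-7 remain unassigned, which matches the interpretation given in the text.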

DISCUSSION
We have analyzed 39 genomic DNA fragments containing amylase gene sequences. Restriction endonuclease maps and Southern blot analyses indicate at least nine distinct amylase genes or pseudogenes. Digestion of total rat genomic DNA with EcoRI generates a complex pattern of amylase gene fragments (Fig. 2 and Table I) including the cloned (Fig. 1) fragments. At least seven fragments remain to be isolated.
The complex EcoRI pattern of amylase gene sequences in genomic DNA is further complicated by the fact that the Sprague-Dawley rat is not an isogenic strain and might, therefore, show allelic variation from one rat to another. This is consistent with other evidence for allelic variation in rodent amylases (20-24, 26, 27). Similar variation is not found in isogenic mouse strains (6,22). Amy genes 1-7 can be distinguished in blots of genomic DNA by diagnostic restriction fragments. Amy 1, 3, 4, 6, and 7 vary in the 10 rats analyzed (Table I). Amy 4 can be detected by a distinctive 0.4-kb fragment which anneals with the 5' probe; it is present in only 2 of the 10 rats. Amy 1 and 3 can be detected by a diagnostic 1.7-kb fragment hybridizing with the 3' probe. This fragment is present in 2 of 9 rats. Amy 6 and 7 can be detected by 8.4-kb and 3.1-kb fragments, respectively, that anneal with both probes. These are present in 6 and 5 rats out of 10, respectively. Amy 2, detected as a 1.5-kb fragment annealing with the 3' probe, and Amy 5, detected as a 7.8-kb fragment annealing with both probes, appear to be present in all 10 animals.
If we assume amylase genes segregate as intact entities then it should be feasible to assign fragments to specific loci by determining the distribution of genes in the population. For example, the 1.25-kb fragment which anneals with the 3' probe is present only in those rats which contain the 0.4-kb fragment annealing with the 5' probe. This latter fragment is diagnostic of Amy 4, suggesting that the 1.25-kb fragment is the 3' end of Amy 4. The cloned 6.0-kb fragments annealing with the 5' probe do not correlate with any of Amy 1-7 fragments and thus must represent 5' ends of other gene(s).
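The assignment logic described here amounts to comparing the sets of animals in which two fragments occur. A minimal sketch follows. The rat numbers in the presence table are invented for illustration and are not the data of Table I, and the function names are hypothetical.

```python
def cooccurrence(presence, frag_a, frag_b):
    """Tabulate the animals carrying both fragments or only one of them."""
    a, b = presence[frag_a], presence[frag_b]
    return {
        "both": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


def may_share_gene(presence, candidate, diagnostic):
    """True if the candidate fragment occurs only in animals that also carry
    the diagnostic fragment (the candidate set is a non-empty subset)."""
    carriers = presence[candidate]
    return bool(carriers) and carriers <= presence[diagnostic]


# Hypothetical incidence table: fragment -> set of rat numbers (not from Table I).
presence = {
    "0.4-kb (5' probe)": {3, 8},
    "1.25-kb (3' probe)": {3, 8},
    "8.4-kb (both probes)": {1, 2, 3, 5, 7, 9},
}
```

In this invented table, `may_share_gene(presence, "1.25-kb (3' probe)", "0.4-kb (5' probe)")` is true, whereas the 8.4-kb fragment also occurs in animals lacking the 0.4-kb fragment and so fails the test. A subset relation among only 10 animals is suggestive rather than conclusive.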
This simple analysis is compromised by the finding that Amy 1, 2, 4, 6, and 7 do not coexist in all animals. Since these cloned genes were isolated from a single rat, this lack of coexistence must reflect either a nonlinkage or a scrambling of gene sequences. Close linkage between pancreatic and salivary genes has been demonstrated in other mammals (1,20,21), suggesting that an unlinked arrangement of amylase genes is unlikely. However, classical genetic analysis would not reveal pseudogenes, which may be unlinked (28). One also cannot eliminate the possibility of large allelic variation.
Heteroduplex analysis demonstrates homologous relationships between Amy 1-4 and between Amy 5 and 6. Amy 7 does not fall cleanly into either of the two internally homologous groups, Amy 5 and 6 or Amy 1-4, suggesting that it represents a distinct locus.
The functional relationships of the genes may be inferred from the melting profiles of hybrids formed with either specific pancreatic or parotid amylase cDNAs. The selective hybridization of Amy 1-4 with the pancreatic probe suggests that one or all of these genes encode pancreatic amylase. This suggestion is supported by partial DNA sequence analysis. The lower melting points of both probes with Amy 5-7 suggest these gene sequences do not contribute substantially to mRNA populations in either of these two tissues. Instead these sequences might be expressed in other yet to be determined tissues or cells, or they could be pseudogenes. The unassigned fragments containing small portions of 3' and 5' ends could be derived from the parotid amylase gene(s). We presume that a single gene encodes both parotid and liver amylase (6).
Since Amy 1-4 were isolated from a single animal, pancreatic amylase could be the product of more than one locus. Variations in pancreatic amylase levels, therefore, could be the result of regulation of individual genes or coordinate regulation of a pancreatic gene family. We have previously demonstrated dramatic changes of amylase enzyme and mRNA content during development (4, 29-37). In addition, the level of pancreatic amylase but not salivary amylase is dramatically altered by insulin and glucocorticoids (33-36). A further analysis of the regulation of expression of individual genes may be possible by using distinctive hybridization probes.
The similarity in the basic structure of the amylase genes is most easily interpreted by assuming these genes originated through a series of duplications from one ancestral gene which contained the fundamental cDNA coding sequences interrupted by intervening sequences. This basic evolutionary mechanism is now apparent in a number of gene families (37, 38).
The strong correspondence of structure within the closely related sets indicates either recent duplication of genes, restricted variation imposed by the requirements of a particular biological function, or gene conversion (39-41). The regions of homology within the pancreatic set extend several kilobases in both 3' and 5' directions beyond the immediate domain of the gene itself. Thus the mechanism for structure maintenance must operate over an extensive genetic region.