The Chicken Myosin Heavy Chain Family

We have used a myosin heavy chain gene fragment to probe two chicken genomic libraries. The fragment, derived from a gene expressed in the fast-white fibers of adult chickens, contains 1400 bases upstream from the translational start site and 1600 bases downstream from the initiation codon. Thirty-one unique nonoverlapping clones were isolated. All of the genes showed homologies in the nucleotide sequences which code for the globular head portion of the myosin protein, while no extensive homologies were detected in the 5'-flanking sequences. The relationships between the genes were studied using oligomeric sequences as probes. The hybridization patterns showed that seven of the genes fall within a well defined subgroup. The exon containing a domain of the nucleotide (ATP) binding site was sequenced for all seven of these genes and shown to be completely conserved, except for two nucleotides in one of the genes. The lack of sequence conservation in the 5'-nontranslated portions of the genes was exploited in the preparation of transcript-specific probes. We have used these probes to show that two of the isoforms are expressed in a tissue-specific and developmental stage-specific manner in the chicken.

globular head region of the myosin heavy chain (MHC) (9), and in the different isoforms this portion of the molecule may be of primary importance in determining the ATPase activities which are thought to underlie the functional differences of the different types of muscle fibers (10-12).
Another contribution to the diversity of the MHC isoforms is that the myosins present in the different fiber types may undergo developmental transitions (13-16). Thus, a large number of myosin isoforms have been detected. Not surprisingly, this diversity was found to be reflected at the gene level, and it is now known that the myosin heavy chain genes in the chicken (17,18) and rat (19-22) are present as multigene families.
Presumably, within the families are the different sequences which code for the distinct isoforms, although differential transcription and processing as well as post-translational modification have not been ruled out in the generation of some of the different MHC protein isoforms. Other muscle-specific proteins such as the troponins and myosin light chains (23,24) have been shown to be generated in such a manner, the appearance of a particular isoform being dependent upon both differential transcription of the gene and differential processing of the primary transcript, which results in different exon-exon junctions being formed (24, 25). In Drosophila, whose genome contains a single myosin heavy chain gene, three myosin RNAs appear during development, differing primarily in the sequences present at the 3'-ends (26,27).
In the vertebrates, the number and nature of the myosin heavy chain genes and pseudogenes, as well as their structural relationships to the different protein isoforms, have remained obscure. Recently we have characterized in detail two members of the chicken MHC gene family (42). Both code for fast-white isoforms which are expressed in fast (white) glycolytic muscle, one being expressed in the embryonic (N118) and the other in the adult pectoralis (N116). We noted, in comparing various regions of the genes to myosin sequences from other organisms, that the nucleotides in the 5'-regions which code for the head portion of the MHC appeared to be highly conserved, more so than the nucleotides at the 3'-end of the gene which encode the rod portion of the protein. For example, while the 3'-ends of the nematode and chicken myosin genes are unable to cross-hybridize even under nonstringent conditions, there is sufficient homology between the nucleotides at the 5'-ends of both genes that cross-hybridization can occur. We reasoned that a fragment which contained this portion of a myosin gene might serve as a generalized probe for most or all of the MHC genes, if used to screen a library under relatively permissive conditions. We have used a cloned fragment of the 5'-end of the adult gene to exhaustively screen two chicken genomic libraries and describe below the isolation of thirty-one unique nonoverlapping clones, all of which show striking homologies to elements of the probe. Furthermore, the patterns of hybridization of the isolates allowed us to identify seven clones which, by virtue of their homologies, appear to fall within a defined subgroup.

MATERIALS AND METHODS
Isolation of the Clones-Two genomic libraries (17,41), both made in Charon 4A using chicken genomic DNA, were screened. Usually, 500,000 bacteriophage were screened at a time. A total of 5 genomic equivalents of each library was used. After transfer of the bacteriophage to nitrocellulose, the filters were baked at 80 °C for 3 h, washed (29), and subsequently hybridized in 3 × SSC, 0.1% sodium dodecyl sulfate, and 5% dextran sulfate at 62 °C. The 3000-bp fragment used to probe the libraries was labeled to a specific activity of 5 × 10' cpm/µg by nick translation (30) and added to the hybridization buffer at a final concentration of 1 × 10⁶ cpm/ml.
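As a rough check on how exhaustive a screen of this depth can be, the chance that any one genomic sequence is represented may be estimated from library coverage alone. The sketch below uses the Poisson approximation P = 1 - exp(-c) for a coverage of c genome equivalents. It assumes random, unbiased cloning, which real libraries only approximate, and the function name is illustrative rather than part of the original methods.

```python
import math

def prob_represented(genome_equivalents: float) -> float:
    """Poisson estimate that a given sequence occurs at least once
    in a library screened to the stated coverage (random cloning assumed)."""
    return 1.0 - math.exp(-genome_equivalents)

# One library screened to 5 genome equivalents, as stated in the text.
p_one_library = prob_represented(5)

# Both libraries together, treated as independent samples of the same genome
# (an assumption): coverage adds, giving 10 genome equivalents.
p_both_libraries = prob_represented(10)

print(f"one library: {p_one_library:.4f}")
print(f"both libraries: {p_both_libraries:.6f}")
```

On these assumptions a single library already gives better than a 99% chance of containing a given sequence, so sequences missed by the screen are more likely to reflect weak homology to the probe or cloning bias than insufficient depth.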
Duplicates and overlaps between the clones (a total of 92 were isolated) were determined by restriction endonuclease digestion. A combination of single, double, and triple digestions using the restriction endonucleases EcoRI, AccI, BglI, HincII, HinfI, PstI, and SphI proved to be sufficient in determining which clones shared common fragments. Partial maps of these clones are described elsewhere (31,32).
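The reduction of 92 isolates to a set of unique nonoverlapping clones amounts to grouping clones that share restriction fragments. A minimal sketch of that bookkeeping is given below. The clone names, fragment sizes, and the threshold of two shared fragments are all hypothetical and chosen only for illustration; real gel data would also need a size tolerance rather than exact matching.

```python
from itertools import combinations

def group_overlapping(clones, min_shared=2):
    """Group clones that share at least `min_shared` fragment sizes.
    `clones` maps a clone name to a set of fragment sizes in bp.
    Groups are merged transitively with a small union-find."""
    parent = {name: name for name in clones}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(clones, 2):
        if len(clones[a] & clones[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for name in clones:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())

# Hypothetical digest results (bp); not data from the paper.
toy_digests = {
    "cloneA": {3000, 1200, 800},
    "cloneB": {3000, 1200, 450},   # shares two fragments with cloneA
    "cloneC": {5100, 2300},        # shares nothing with the others
}

groups = group_overlapping(toy_digests)
print(sorted(sorted(g) for g in groups))
```

Each resulting group would then be represented by a single clone in the final collection.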
Hybridization Conditions Using the Oligonucleotide-Hybridization of the immobilized DNA to the oligonucleotide was carried out at 33 °C in 5 × SSC, 0.1% sodium dodecyl sulfate, and 5 × Denhardt's solution (0.1% Ficoll, 0.1% polyvinylpyrrolidone, 0.1% bovine serum albumin) for 20 h. The oligonucleotide was made radioactive using carrier-free [γ-32P]ATP and T4 polynucleotide kinase (28). The final concentration of the probe in the hybridization solution was 500,000 cpm/ml.
RNA Isolations-Total RNA and poly(A+) RNA were isolated from the various muscle sources as previously described (17).
S1 Nuclease Analyses-5 µg of total mRNA along with 0.1 ng of kinased oligomer was dissolved in 10 µl of 10 mM Tris-HCl (pH 8.0), 100 mM NaCl, and 1 mM Na2EDTA. The solution was heated to 90 °C for 10 min and the nucleic acids annealed at the desired temperature for >2 h. After annealing, 300 µl of S1 buffer (280 mM NaCl, 30 mM Na acetate (pH 4.4), and 4.5 mM Zn acetate) along with 20 units of S1 nuclease was added. The digestion was incubated at 20 °C for 2 h. The reaction was stopped by adding 6.2 µl of 500 mM Na2EDTA. The DNA was precipitated by adding 5 µg of carrier tRNA and 800 µl of absolute alcohol. The precipitate was collected by centrifugation, and the products were subsequently analyzed on 8% polyacrylamide gels (38).
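The volumes and concentrations given for the S1 nuclease reaction fix the ionic conditions of the digest and of the stopped reaction. The arithmetic can be sketched as follows. The calculation ignores the small volume contributed by the enzyme itself, and the helper name is illustrative.

```python
# Volumes in µl and concentrations in mM, as given in the S1 nuclease protocol.
ANNEAL_VOL, S1_BUFFER_VOL, STOP_VOL = 10.0, 300.0, 6.2

def final_conc(parts, total_vol):
    """Concentration (mM) after mixing; `parts` lists (volume_ul, conc_mM)
    for every solution that contributes the solute."""
    return sum(vol * conc for vol, conc in parts) / total_vol

digest_vol = ANNEAL_VOL + S1_BUFFER_VOL     # volume during digestion
stopped_vol = digest_vol + STOP_VOL         # volume after the EDTA stop

nacl_digest = final_conc([(ANNEAL_VOL, 100.0), (S1_BUFFER_VOL, 280.0)], digest_vol)
zn_digest = final_conc([(S1_BUFFER_VOL, 4.5)], digest_vol)
edta_stopped = final_conc([(ANNEAL_VOL, 1.0), (STOP_VOL, 500.0)], stopped_vol)
zn_stopped = final_conc([(S1_BUFFER_VOL, 4.5)], stopped_vol)

print(f"NaCl during digestion: {nacl_digest:.1f} mM")
print(f"Zn during digestion:   {zn_digest:.2f} mM")
print(f"EDTA after stop:       {edta_stopped:.2f} mM")
print(f"EDTA:Zn molar ratio:   {edta_stopped / zn_stopped:.2f}")
```

The stop solution therefore supplies EDTA in roughly twofold molar excess over the zinc that S1 nuclease requires, which is consistent with its use to halt the digestion.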

RESULTS
Screening the Chicken Genomic Libraries Using the 5'-End of a Myosin Gene-We reported previously the characteristics of a myosin heavy chain gene which codes for an adult isoform. A 3000-bp fragment derived from the 5'-end of the gene by EcoRI digestion was subcloned into pBR325 and the sequence of approximately 2000 bp of the fragment determined (42). Fig. 1 shows the structural organization of the fragment. The 1400 bp upstream from the ATG at which translation begins contain the 5'-flanking sequences and promoter elements. The 5'-nontranslated region is generated by removal of a 769-bp intron. The remaining 1600 bp, located downstream from the ATG at which translation is initiated (the NcoI site), contain at least three exons. Three exons (207, 154, and 188 bp) have been sequenced (18) and are separated by 89- and 121-bp introns. The exon-intron sequences 3' to the Sau3AI site have not yet been determined.
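The exon and intron sizes quoted above can be laid out as a simple coordinate map. The sketch below assumes that the three sequenced exons and the two introns are contiguous and listed in 5' to 3' order, which the text implies but does not state outright. Position 1 is simply the first base of the first listed exon and is not necessarily the ATG.

```python
# Sizes in bp from the text; the exon-intron-exon-intron-exon order is assumed.
SEGMENTS = [("exon", 207), ("intron", 89), ("exon", 154),
            ("intron", 121), ("exon", 188)]

def layout(segments, start=1):
    """Return (kind, first_base, last_base) for each segment, 1-based inclusive."""
    coords, pos = [], start
    for kind, length in segments:
        coords.append((kind, pos, pos + length - 1))
        pos += length
    return coords

coords = layout(SEGMENTS)
sequenced_bp = sum(length for _, length in SEGMENTS)
exon_bp = sum(length for kind, length in SEGMENTS if kind == "exon")

for kind, first, last in coords:
    print(f"{kind:6s} {first:4d}-{last:4d}")
print("sequenced span:", sequenced_bp, "bp; exonic:", exon_bp, "bp")
```

Under these assumptions the sequenced exons and introns account for 759 bp of the 1600 bp that lie downstream from the ATG.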
The 3000-bp EcoRI fragment was purified by electrophoresis in low melting point agarose, labeled with [32P]dATP by nick translation (30), and used to screen the genomic libraries as described under "Materials and Methods." Ninety-two plaques showed varying intensities of hybridization to the probe. Duplicates and overlaps between the isolates were identified by restriction endonuclease digestions of the purified DNAs. The analyses showed that 31 unique nonoverlapping genes were present within the collection. The sizes of the EcoRI fragments are shown in Table I. A Southern transfer (33) of the EcoRI-digested DNAs, which was subsequently hybridized to the 3000-bp fragment, is shown in Fig. 2.
FIG. 1. Structural organization of the 5'-end of an adult myosin heavy chain gene. The EcoRI sites in the gene are marked. A partial sequence of pN116-3.05 is given in Ref. 18. NcoI cleaves the gene in the methionine codon at which translation is initiated.
The DNAs are grouped in order to emphasize the different intensities of the signals obtained using the 5'-end of the fast-white gene as a probe. Not surprisingly, the different clones show varying degrees of homology with the probe. The DNA from which the 3000-bp fragment was derived (N116) is present in lane 1. Lanes 2 and 5 contain duplicate samples of the same DNA (derived from N29) in order to monitor the reproducibility of signal intensity (see figure legend). The autoradiogram shows that the DNAs in lanes 1-8 bind more of the probe than do the other 24 DNAs. Six other isolates, corresponding to clones N39, N311, N324, N316, N309, and N315 (lanes 9-14), generated signals of moderate intensity, while the DNAs present in lanes 15-32 showed widely varying intensities ranging from moderate signals (N310, N109, N320; lanes 18, 21, and 25) to ones that were barely visible in this exposure (N50, N321, N328; lanes 26, 31, and 32). It should be noted that subsequent restriction endonuclease mapping of these clones indicated that the varying signal intensities were not due to partial deletions of the homologous sequences in the inserts of the clones (with the possible exceptions of N50, N321, and N319). We conclude from these data that a large number of myosin or "myosin-like" genes (or pseudogenes) exists in the chicken genome.
Relationship between the Clones; Identification of a Subset-We noted in shorter exposures of the autoradiogram shown in Fig. 2 that the DNAs in lanes 1-8 hybridized to the probe with approximately equal intensities. In order to delineate further the relationship between these clones and the rest of the isolates, the probe used in the original screening was cleaved at the NcoI site to generate 1400- and 1600-bp fragments, which correspond to the 5'-flanking sequences and structural sequences of the myosin gene, respectively (Fig. 1). Hybridization of each of these probes to Southern transfers similar to the one shown in Fig. 2 was carried out; the data are shown in Fig. 3.
Fig. 3B shows that if the blots are washed under stringent post-hybridization conditions (0.1 × SSC at 62 °C), hybridization occurs predominantly to the DNA from which the probe is derived, N116, although faint signals can be observed in lanes 3 and 7 in the overexposed autoradiogram. We conclude from these data that a subset of these genes (N29, N118, N127, N125, N101, and N124) shares some homologies at their 5'-ends with N116 (N29 is placed within this group on the basis of sequencing data of the structural regions of the genes (18)). However, these homologies are imperfect and quite limited, since only the parental DNA continues to generate a strong signal under stringent post-hybridization wash conditions.
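The difference between the low and high stringency washes can be put on a rough quantitative footing. The sketch below takes the standard composition of 1 × SSC (0.15 M NaCl, 0.015 M trisodium citrate) and the widely used empirical salt term 16.6 log10[Na+] for duplex stability. Neither appears in the original text, and the salt term is usually considered reliable only at moderate sodium concentrations, so the result is a guide rather than a measurement.

```python
import math

NACL_1X = 0.150      # M NaCl in 1 x SSC (standard recipe, an outside assumption)
CITRATE_1X = 0.015   # M trisodium citrate in 1 x SSC

def sodium_molar(ssc_fold: float) -> float:
    """Total Na+ (M) in SSC of the given strength; citrate carries three Na+."""
    return ssc_fold * (NACL_1X + 3 * CITRATE_1X)

def salt_tm_shift(ssc_from: float, ssc_to: float) -> float:
    """Change in duplex Tm (°C) predicted by the 16.6*log10[Na+] term alone."""
    return 16.6 * math.log10(sodium_molar(ssc_to) / sodium_molar(ssc_from))

print(f"6 x SSC:   {sodium_molar(6):.3f} M Na+")
print(f"0.1 x SSC: {sodium_molar(0.1):.4f} M Na+")
print(f"Tm shift, 6 x to 0.1 x SSC: {salt_tm_shift(6, 0.1):.1f} °C")
```

By this estimate, moving from the 6 × SSC wash to the 0.1 × SSC wash at the same temperature lowers duplex stability by roughly 30 °C, which is why only closely matched sequences remain hybridized.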
The above results contrast sharply with those obtained with the 1600-bp probe derived from the structural sequences (Fig. 3, C and D). The probe hybridizes well with all of the isolates, although the varying intensities observed with the 3000-bp probe are still present. However, the overall pattern of hybridization is not affected by a low (Fig. 3C) versus high stringency wash (Fig. 3D). The same pattern resulted when sequences containing only the second and third exons from the 1600-bp fragment were used to probe the isolates (data not shown). Thus, the hybridization observed cannot be due to repetitive sequences present in the introns. We conclude that the internal 5'-structural sequences are highly conserved in all of these isolates, although the homologies are more pronounced within the clones present in lanes 1-8.
A Domain of the ATP-binding Site Is Conserved in Seven of the Isolates-It is interesting that the structural sequences present at the 5'-end of the gene appear to be so highly conserved among the isolates. As noted above, this region of the gene encompasses the sequences which code for the globular head region of the protein, the region in which the ATPase is localized. The head region has been further subdivided on the basis of limited trypsin digestion (34). Three proteolytic fragments of 25, 50, and 20 kDa are generated, the 25-kDa fragment corresponding to the amino terminus of the protein. Experiments by Yount and co-workers (35,36) using photoaffinity labels show that the ATP binds in proximity to both the 25- and 50-kDa fragments. Chemical analysis of the 25-kDa fragment shows that the ATP analog is bound to a tryptophan residue (35).
Previously, sequence analysis of two fast-white genes had shown that the domain of the nucleotide-binding site on the 25-kDa fragment appeared to be very highly conserved; the site is present in the third exon of the adult fast-white MHC gene at lysine 131 and tryptophan 132 (18). An oligomer which corresponds to the nucleotide sequence at and around this site was prepared. The 18-mer, 5'-GGC AGC CAC TTG TAG GGG-3', contains the third base in the codon for aspar-
FIG. 3. Southern transfers of the EcoRI-digested DNAs were carried out as described in the legend to Fig. 2. Both the 1400- and 1600-bp EcoRI-NcoI fragments were labeled with [32P]dATP to a specific activity of 5 × 10' cpm/µg. Hybridizations were carried out in 6 × SSC, 5 × Denhardt's solution at 62 °C for 12 h in the presence of 500,000 cpm/ml of the 1400-bp fragment (panels A and B) or the 1600-bp fragment (panels C and D). Post-hybridization washes were carried out in 6 × SSC at 62 °C (panels A and C) or in 0.1 × SSC at 62 °C (panels B and D). The order of the DNAs on the autoradiogram is identical to that in
Fig. 4 and demonstrate that the genes in lanes 1-8 hybridize to the oligomer with approximately equal intensities. Furthermore, during sequential washes of increasing stringency the hybridizable sequences appeared to be identical (data not shown). In order to confirm this, previous sequence data obtained from the myosin genes (42, 43) were extended, and the nucleotides present in the third exon of each of the seven genes were determined. The results (Fig. 5) show that, with the exception of two nucleotides in N124, all of the nucleotide positions in each of the genes are conserved.
These results and the results shown in Fig. 3A delineate the close relationship among the seven genes in lanes 1-8. Clearly, these isolates constitute a subgroup within the larger myosin gene family, the subgroup being defined by the limited homologies present in the 5'-flanking sequences, and the conservation of the third exon.
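The relationship between the 18-mer and the third exon can be checked directly from the sequences printed in this paper. The sketch below takes the N116 exon 3 segment from the Fig. 5 residue and the oligomer sequence from the text, and tests whether the reverse complement of the oligomer occurs in the exon. One codon of the N116 row is garbled in the available text and is read here as TAC, an assumption that agrees with the tyrosine shown at that position. The codon table is the standard genetic code.

```python
# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq: str) -> str:
    """Translate complete codons from the first base (one-letter code)."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# N116 exon 3 segment; the 13th codon is read as TAC (assumption, see above).
N116_EXON3 = ("ACC TAC TCG GGC CTC TTC TGT GTC ACT GTC AAC CCC TAC AAG "
              "TGG CTG CCG GTG TAC AAC CCG GAG GTG").replace(" ", "")
OLIGO_18MER = "GGCAGCCACTTGTAGGGG"   # written 5'->3', from the text

target = reverse_complement(OLIGO_18MER)   # strand the oligomer pairs with
offset = N116_EXON3.find(target)           # 0-based position, -1 if absent
protein = translate(N116_EXON3)

print("target on coding strand:", target)
print("0-based offset in exon segment:", offset)
print("translation:", protein)
```

The match begins at the third base of the asparagine codon, as the text states, and the 18 bases span the adjacent lysine and tryptophan codons of the nucleotide-binding domain.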
Preparation of Transcript-specific Probes-We wished to explore the transcriptional patterns of the genes contained within this subset. However, the detection of a specific myosin transcript is complicated because, as previous data (17) and the data shown above (e.g. in Figs. 2 and 5) illustrate, the different myosin genes are closely related and are able to cross-hybridize with one another, even under stringent conditions. We have successfully prepared isoform-specific probes, e.g. probes that hybridize specifically to fast (white) isoforms. However, these probes do cross-hybridize, at least to a limited extent, with members of the same isoform family (data not shown). To understand the transcriptional patterns of the subgroup in the different tissues during development, it is necessary to be able to distinguish the different isoform transcripts from one another. The data presented in Fig. 3, A and B, led us to suspect that transcript-specific probes could be prepared from the 5'-flanking region of the genes which encode the untranslated portion of the mRNA. Previously, we have used short probes prepared from these regions to identify the transcriptional start sites contained within two of the genes, N116 and N118 (42). The oligomer derived from N116 was used to test the general utility of the approach and was hybridized to DNAs derived from the 31 isolates, which had been "dot-blotted" onto nitrocellulose. The data are shown in Fig. 6; the oligomer binds only to the DNA from which it had been derived. Similar results have been obtained with oligomers derived from 5 of the other 6 isolates (data not shown). Assuming that most or all of the sarcomeric MHC genes are contained within these 31 isolates, these data indicate that oligomers derived from the nonconserved regions encoding the untranslated portions of the MHC RNAs are suitable as transcript-specific probes.
FIG. 5. EXON 3 (position 120).
      thr tyr ser gly leu phe cys val thr val asn pro tyr lys trp leu pro val tyr asn pro glu val
N116  ACC TAC TCG GGC CTC TTC TGT GTC ACT GTC AAC CCC TAC AAG TGG CTG CCG GTG TAC AAC CCG GAG GTG
FIG. 6. Hybridization to a "dot-blot." An oligomer derived from the 5'-region of N116 was labeled with [γ-32P]ATP as described under "Materials and Methods" and hybridized to DNAs which had been applied onto nitrocellulose using the Bio-Rad dot-blot apparatus. After hybridization the filter was washed in 5 × SSC at 37 °C and exposed for 3 h at -70 °C. The oligomer, 3'-GACACACTCGCGTCG-5', terminates one base 5' to the ATG site.
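The low temperatures used with the oligomer probes follow from how short they are. As a rough illustration, the sketch below applies the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) in °C, a common approximation for short oligomers in high salt that is not taken from the original methods. The two sequences are those printed in this paper; the 15-mer is given there in the 3' to 5' direction and is reversed here.

```python
def wallace_tm(oligo: str) -> int:
    """Wallace-rule estimate of Tm (°C) for a short oligomer:
    2 °C per A or T plus 4 °C per G or C."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

PROBE_18MER = "GGCAGCCACTTGTAGGGG"      # exon 3 oligomer, 5'->3'
PROBE_15MER = "GACACACTCGCGTCG"[::-1]   # Fig. 6 oligomer, reversed to 5'->3'

tm_18 = wallace_tm(PROBE_18MER)
tm_15 = wallace_tm(PROBE_15MER)
print("18-mer:", len(PROBE_18MER), "nt, estimated Tm", tm_18, "°C")
print("15-mer:", len(PROBE_15MER), "nt, estimated Tm", tm_15, "°C")
```

On this estimate the 37 °C wash in 5 × SSC lies about 13 °C below the Tm of a perfectly matched 15-mer duplex, so a few mismatches in a related gene would be expected to abolish the signal.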

Members of the Family Are Expressed in a Tissue-specific and Developmental Stage-specific Manner-We wished to examine whether the expression of the isoforms encoded by this subgroup was tightly controlled during development in different muscle types. Oligomers derived from N116 and N127 were prepared as described above, radiolabeled with [γ-32P]ATP, and hybridized to RNAs derived from different muscle types at various stages of development. Subsequently, the hybridization mixture was treated with S1 nuclease, and the resulting products were electrophoresed in 8% polyacrylamide DNA sequencing gels (38).
The results are shown in Fig. 7. In panel A, the data obtained using the oligomer derived from N116 are consistent with previous data showing that this gene codes for a transcript which is expressed predominantly in the fast fiber musculature of the adult, as homologous transcripts are localized mainly in the deep and superficial pectoralis of the adult birds. N127 (panel B) apparently encodes transcripts which are expressed differently from those of N116; expression is restricted entirely to the adult posterior latissimus dorsi (a muscle enriched in fast (white) fibers) and the adult gastrocnemius, which contains a heterogeneous mixture of fiber types.

DISCUSSION
The data presented in Fig. 7 indicate that at least some of the genes are expressed in a tissue-specific manner, and only at certain times during development, as we were unable to detect complementary transcripts in most of the muscles studied. There remains the formal possibility that a primary MHC transcript is subject to differential processing at the 5'-end in the other tissues. If this were the case, these transcripts would not be detected in our assay as the nucleotides complementary to the oligomer would be removed. However, there are no apparent splice junctions that might account for the selective removal of these sequences, unless one hypothesizes the removal of the entire first exon (18).
We consider the deletion of the 68 amino acids at the NH2 terminus to be very unlikely in the generation of any myosin isoform, especially in view of the homologies observed in this region when sequences from nematode, rabbit, rat embryonic, and cardiac myosins are compared with the chicken fast (white) isoforms. Rather, a more likely explanation of these data is that there is tissue-specific expression of the MHC gene family, and this expression is regulated primarily at the transcriptional level.

FIG. 7. Determination of transcriptional patterns using S1 nuclease. Oligomers derived from N116 (panel A) or N127 (panel B) were labeled with [γ-32P]ATP, hybridized to RNAs derived from the different muscles, and subsequently treated with S1 nuclease as described under "Materials and Methods." The products were electrophoresed on 8% polyacrylamide DNA sequencing gels. The RNAs derived from the following muscles were used: lanes 1-3 (9 days in ovo), pectoralis, leg, cardiac; lanes 4-6 (18 days in ovo), pectoralis, leg, cardiac; lanes 7-12 (10 days post-hatch (neonatal)), superficial pectoralis, deep pectoralis, gastrocnemius, anterior latissimus dorsi, cardiac, gizzard; lanes 13-19 (6 months post-hatch (adult)), posterior latissimus dorsi, deep pectoralis, gastrocnemius, anterior latissimus dorsi, cardiac, gizzard, superficial pectoralis. Lane 20 contains the products protected in a hybridization to yeast tRNA; lane 21, the intact oligomer.

The large number of isolates able to hybridize to the structural sequences of the myosin probe, even under hybridization conditions of high stringency, is rather surprising and indicates that our previous estimate (7-11 genes) of the number of MHC genes present in the chick multigene family was in error. It is possible that a number of the isolates do not correspond to MHC genes, but rather code for "myosin-like" proteins. However, preliminary data obtained from S1 analyses similar to those shown above indicate that the isoforms encoded by at least five of the seven members of the subgroup are localized largely in muscle types which contain significant amounts of fast-white (glycolytic) fiber types; no homologous transcripts have been found in the slow-fiber or cardiac muscles. In addition, as was the case for N127 and N116, the different genes show differential patterns of expression, indicating that it is unlikely that the majority of the subgroup is made up simply of alleles or polymorphic variants. It therefore seems reasonable to assume that, since we have not yet identified any of the genes encoding the cardiac, slow red, and smooth muscle myosin isoforms, they may be found within the remaining 24 isolates.
The physiological role played by a large number of myosin genes is difficult to assess at this time, especially since we do not have a good estimate of the number of pseudogenes present in the isolate population. It is known that the chicken undergoes at least three developmental transitions and that distinct embryonic, neonatal, and adult isoforms exist (13). It is also apparent that at each stage, in the different fiber types, different myosins are expressed and that multiple MHCs may be present in the same fiber type. The roles, if any, that these multiple isoforms play in modulating the overall ATPase activities of the various fiber types remain unknown.
The high degree of homology between the seven genes emphasizes the difficulties involved in resolving the differential patterns of expression for each of the MHC isoforms. Both polyclonal and monoclonal antibodies to the MHC isoforms have been prepared (5, 39, 40), and although some success has been reported in differentiating embryonic from adult isoforms and slow-red from fast-white and cardiac proteins, the cross-reactivity of the antibodies to the different myosins present in different tissues and at different developmental stages has hampered the analyses of isoform transitions and the tissue specificity of expression.
Analyses using transcript-specific probes, as above, show promise in resolving these problems. The ability to quantitate the level of a particular MHC transcript in the mRNAs isolated from the different muscle types during development will lead to a comprehensive picture of the expression of the gene family.