Generation of Troponin T Isoforms by Alternative RNA Splicing in Avian Skeletal Muscle CONSERVED AND DIVERGENT FEATURES IN BIRDS AND MAMMALS*

We describe the isolation and sequence analysis of quail muscle cDNA clones encoding two closely related isoforms of the striated muscle contractile protein, troponin T. The cDNAs represent two troponin T mRNAs that exhibit an unusual sequence relationship. The two mRNAs have identical sequences over hundreds of nucleotides including 3' untranslated regions, but they differ dramatically in a discrete, internally located block of 38 nucleotides. The two alternative sequences of this 38-nucleotide block encode two different but related versions of amino acid residues 230-242, near the C terminus of the protein. These results are consistent with a novel mechanism of troponin T isoform generation by alternative mRNA splicing pathways from a single gene containing two different exons corresponding to amino acids 229-242, as recently proposed by Medford et al. (Medford, R. M., Nguyen, H. T., Destree, A. T., Summers, E., and Nadal-Ginard, B. (1984) Cell 38, 409-421). This proposal was based on analysis of a rat troponin T genomic DNA clone and a cDNA clone corresponding to one of the two alternatively spliced mRNAs. Our analysis of quail troponin T cDNA clones, apparently corresponding to two alternatively spliced mRNA species, provides important new evidence for this novel mechanism of troponin T isoform generation and reveals the differential splicing mechanism to be of great antiquity, antedating the bird-mammal divergence.


(Received for publication, February 22, 1985)
Kenneth E. M. Hastings, Elizabeth A. Bucher, and Charles P. Emerson, Jr.
One of the quail alternative isoform sequences clearly corresponds to one of the rat sequences, but the other quail alternative sequence does not correspond to either of the rat sequences. This result suggests a greater complexity of troponin T gene structure or a greater diversity of troponin T isoform genes than is currently known, and also has implications for the functional significance of the troponin T protein isoform heterogeneity.
Comparison of quail and mammal alternative isoform sequences also reveals strongly conserved features which suggest that all the isoform alternative amino acid sequences are variations on a common structural theme.
*This work was supported by a National Institutes of Health Grant (to C. P. E., Jr.) and a Muscular Dystrophy Association Postdoctoral Fellowship (to K. E. M. H.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
To whom correspondence should be addressed.
Troponin T is one of three dissimilar subunits of the key muscle protein troponin. Troponin, in association with tropomyosin, forms the Ca2+-sensitive molecular switch that controls the contractile interaction of the thick and thin myofilaments in the sarcomeres of striated muscle cells (see Ref. 1 for a review). The subunits of troponin, troponin T, troponin I, and troponin C, exhibit a complex array of interactions with each other, with tropomyosin, with actin, and with Ca2+ (2, 3). Considering the complexity of these interactions, and the possible importance of the specific molecular properties of the Ca2+ switch in determining the characteristics of contractile activity, it is of great interest that troponin subunits occur in distinct isoforms which are specifically expressed in different muscle cell types. In vertebrates each of the troponin subunits exists in antigenically distinct isoforms in fast skeletal, slow skeletal, and cardiac muscle cells (4, 5). At least one isoform of each troponin subunit has been completely sequenced by protein chemical methods (6-8).
Further study of the structure of troponin subunit isoforms and of the molecular and cellular mechanisms responsible for their differential expression will contribute information relevant to understanding both the molecular mechanics of the Ca2+ switch and the developmental and evolutionary processes involved in the generation of distinct striated muscle cell types. Investigation of the molecular genetic basis of muscle protein isoform heterogeneity has demonstrated the existence of multigene families in which specific protein isoforms are produced from distinct genes. This has been documented for actin (9, 10) and myosin heavy chain (11, 12), and it supports the notion that gene duplication is the fundamental evolutionary mechanism for generating protein isoforms. However, recent work has indicated other possibilities. In the case of myosin heavy chain (13, 14) and myosin light chains (15-18) it has been found that a single gene can give rise to several distinct protein isoforms via different transcriptional modes and/or different transcript processing pathways that result in the production of more than one mRNA species. Recently, Medford et al. (19) presented evidence that different isoforms of troponin T were produced by a novel alternative RNA splicing mechanism from a single gene in the rat. Here we report new evidence from an independent line of research concerning the generation of troponin T isoforms by differential RNA splicing.
We have been studying, by cDNA cloning, that set of mRNAs that become newly abundant during the differentiation of quail embryo skeletal muscle myoblasts. We have isolated 28 cDNA clones representing about 17 such mRNAs. Six of these approximately 17 mRNA species have been identified as mRNAs encoding the muscle contractile proteins α-actin, myosin heavy chain, myosin light chain 2, α-tropomyosin, troponin C, and troponin I (20, 21). Among those cDNAs not identified in our previous studies we now report the identification of 3 cDNA clones encoding troponin T protein sequences. These cDNA clones represent two distinct quail troponin T mRNAs that exhibit an unusual sequence relationship featuring long stretches of sequence identity and a short, sharply defined internal region of dramatic sequence divergence. The general features, and precise details, of this sequence relationship indicate that the two quail troponin T mRNAs are alternatively spliced products of a single gene, along the lines indicated by Medford et al. (19) for the rat. Because we identify cDNA clones corresponding to two different troponin T mRNAs (whereas only one of the two rat mRNAs has been identified by cloning (19)), our results constitute important new evidence that directly demonstrates the occurrence of differential troponin T RNA splicing. Our results further indicate that the troponin T differential RNA splicing mechanism was developed prior to the bird-mammal divergence and provide new information on isoform-specific troponin T amino acid sequences. These results are discussed in relation to the origin and operation of this complex gene and its novel RNA splicing mechanism, and the role of alternative amino acid sequences in the structure and function of the troponin T protein.

MATERIALS AND METHODS
cDNA Cloning and Screening-The quail (Coturnix coturnix) muscle culture cDNA clone library was produced by a standard method: oligo(dT)-primed reverse transcription of whole cell poly(A)+ RNA, second strand synthesis with DNA polymerase I, S1 nuclease trimming, and insertion via G + C tailing into the PstI site of pBR322.
Construction of this particular cDNA clone library, and the isolation from it of 28 cDNA clones representing mRNAs whose abundances increase substantially during muscle cell differentiation, have been described (20, 21). Analysis of these 28 developmentally regulated sequences by DNA dot hybridization (22) showed that the three cDNA clones, cC113, cC119, and cC122, were cross-hybridizing sequences.
gC106 is a recombinant phage containing a fragment of quail genomic DNA that hybridizes with the rat troponin T cDNA clone pTnT-15 (23).¹ The above-mentioned collection of 28 developmentally regulated cloned quail cDNA sequences was screened by DNA dot hybridization with gC106. cC113, cC119, and cC122 showed strong hybridization signals.
DNA Sequencing-DNA sequences of the cDNA inserts of cC113, cC119, and cC122 were determined by the method of Maxam and Gilbert (24) by 5' labeling with polynucleotide kinase and [γ-32P]ATP or 3' labeling with terminal deoxynucleotidyltransferase and α-32P-labeled cordycepin 5'-triphosphate at the restriction sites indicated by overlining in Fig. 1B. Other strategies included 3' labeling at the PstI sites at the ends of the G + C tails and at the BalI and PvuI sites in pBR322 approximately 120 bp² from the ends of the cDNA inserts. The entire 359-bp overlap of cC113 and cC119 was independently sequenced in both clones and most of it, including the Arg 224 codon and the difference region, was determined on both strands in both clones.

RESULTS
Troponin T cDNA clones were isolated in two steps from a library representing the poly(A)+ RNA population of differentiated muscle cell cultures derived from quail (C. coturnix) embryos. The first step identified 28 cDNA clones corresponding to mRNAs whose abundances increase dramatically during muscle cell differentiation (20, 21). In the second step these 28 clones were screened by hybridization with a cloned quail genomic DNA fragment, gC106, containing troponin T sequences (identified by hybridization with a rat troponin T cDNA clone (23)). Three quail troponin T cDNA clones, cC113, cC119, and cC122, were thus identified. Their DNA sequences were determined and are presented in Fig. 1, along with a map showing their overall physical relationships to each other and to the troponin T protein.
¹ E. A. Bucher and C. P. Emerson, Jr., manuscript in preparation.
² The abbreviation used is: bp, base pair.
The cDNA insert of cC113 is a 549-bp sequence (excluding the G + C tails introduced during cloning) that includes both protein coding and 3' untranslated mRNA sequences. The protein coding sequence corresponds to troponin T from residue 194 to the C terminus (residue 257) of the protein.
The quail protein sequence is similar to the rabbit protein sequence (7) on which the amino acid numbering system used in Fig. 1 is based. In addition, cC113 contains the translation stop codon TAA and an apparently complete 3' untranslated sequence of 343 bp that exhibits the poly(A) addition signal AATAAA (25) and is followed 20 bp downstream of this by 3'-terminal poly(A).

FIG. 1. [Figure not reproduced; the opening of the legend is lost. Surviving labels: Sau96I; cC119 ends. Surviving sequence fragments: GCAGGAGCCAAAGGCAAGGTTGGCGGGCGCTGGAAGTAAACCCTGCTGGTCTCC, with the translation AlaGlyAlaLysGlyLysValGlyGlyArgTrpLys preceding the stop codon TAA, and TTCTTCTTCCTTCACAAACTACTTGTGTTCCTGTGCCTCTGCTGCTGCTT.] ... PstI and HpaII sites that are characteristic of the difference regions of cC119 and cC113, respectively. B, quail troponin T DNA and protein sequences. DNA sequences of cC113, cC119, and cC122 were determined. On the accumulated DNA sequence shown, the first and last base of each of the three cDNA inserts is indicated. Not shown are the G + C tails (~20 residues) introduced during cloning. X indicates an unknown base and R a purine. In cases where these uncertainties result in amino acid ambiguities the residue shown in the protein sequence corresponds to that in the rabbit protein (7), on which the amino acid numbering system is based. The Ile 229 codon is included in the difference region sequences because preliminary analysis of a cloned quail troponin T gene indicates that this codon is part of the difference exons, as is the case in the rat (19).
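As a consistency check on the coding fragment that survives from Fig. 1B, the following minimal Python sketch translates it with the standard genetic code. The fragment is copied from the figure residue; the function and variable names are illustrative assumptions and are not part of the original paper.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate in reading frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon reached
            break
        protein.append(aa)
    return "".join(protein)


# End of the coding region plus the stop codon, as it survives in Fig. 1B.
C_TERMINAL_FRAGMENT = "GCAGGAGCCAAAGGCAAGGTTGGCGGGCGCTGGAAGTAA"

print(translate(C_TERMINAL_FRAGMENT))  # AGAKGKVGGRWK
```

The one-letter output reads Ala-Gly-Ala-Lys-Gly-Lys-Val-Gly-Gly-Arg-Trp-Lys, followed by the TAA stop codon, which agrees with the three-letter translation printed under the DNA sequence in the figure.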
The cDNA insert of cC119 is a 554-bp sequence (excluding G + C tails) that overlaps with cC113 as indicated in Fig. 1.
The cC113/cC119 overlap amounts to 359 bp and includes DNA encoding the last 63 amino acids of the protein, the stop codon TAA, and the first 167 bp of 3' untranslated mRNA sequence. With two exceptions the cC113 and cC119 sequences are identical throughout the region of overlap. The exceptions are, first, that the codon for Arg 224 is AGG in cC113, but is AGA in cC119. Second, there is a discrete block of 38 bp corresponding to amino acid residues 230-242 in which the DNA sequences are very different.
In this block of 38 bp the cC113 and cC119 DNA sequences differ at 21 positions. These nucleotide differences translate into 8 differences in the corresponding 13 amino acids. Thus, cC113 and cC119 represent two different mRNA species encoding distinct isoforms of troponin T differing in a discrete internal block of sequence, residues 230-242. For later reference, we call this region of the protein, or the mRNA, or the gene, the difference region.
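The kind of comparison described above, locating a discrete block of mismatches between two otherwise identical aligned sequences and counting the differing positions within it, can be sketched as follows. The two sequences are hypothetical toy strings, not the cC113 and cC119 sequences (which appear only in Fig. 1), and the function name is an assumption made for illustration.

```python
def difference_block(seq_a: str, seq_b: str):
    """Return (start, end, mismatches) for two aligned, equal-length sequences.

    start and end delimit the half-open span from the first to the last
    mismatching position; mismatches is the number of differing positions.
    Returns None when the sequences are identical.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    positions = [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]
    if not positions:
        return None
    return positions[0], positions[-1] + 1, len(positions)


# Hypothetical toy sequences: identical flanks around a divergent core.
toy_a = "ATGGCC" + "GATTACAG" + "TTTAAA"
toy_b = "ATGGCC" + "GCTAACCG" + "TTTAAA"

print(difference_block(toy_a, toy_b))  # (7, 13, 3)
```

Note that a block found this way is bounded by the outermost mismatches, so it can be shorter than the underlying alternative segment if the two versions happen to share bases at its edges.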
As indicated in Fig. 1 the 335-bp DNA sequence of cC122 encodes residues 113-223 of troponin T. Although cC122 extends further in the N-terminal direction than cC113 and cC119, it does not extend sufficiently in the C-terminal direction to provide additional information regarding the difference region, or the codon for Arg 224.

Differential RNA Splicing-cC113 and cC119 show striking sequence identity over hundreds of base pairs with the exception of a short, sharply defined internal region, the difference region. Although several genetic mechanisms could account for such a sequence relationship between two mRNA species, the most satisfactory involves the novel model of alternative mRNA splicing recently proposed by Medford et al. (19) for rat troponin T. In a rat troponin T gene they found two possible exons corresponding to amino acids 229-242, and presented evidence that both exons were used as mutually exclusive alternatives. Alternative splicing would result in two troponin T mRNA species having exactly the relationship we have found for cC113 and cC119 RNAs. Each of the mRNA species would have one of the two alternative difference region exons encoding residues 229-242, but the two mRNAs would be identical in sequence both upstream and downstream of this region as well as share identical 3' untranslated sequences (see Fig. 2). Medford et al. identified a cDNA clone corresponding to one of the two predicted rat mRNA species, but the second was known only indirectly through S1 nuclease protection experiments (19). Thus our results with cC113 and cC119, which apparently derive from two alternatively spliced troponin T mRNAs, represent direct evidence in support of the differential splicing model.
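The mutually exclusive exon model can be illustrated with a small sketch: common exons are joined around exactly one of two alternative exons, giving two mRNAs that are identical upstream and downstream but differ internally. All exon sequences and names below are hypothetical placeholders chosen for illustration; they are not troponin T sequences and the sketch says nothing about how the exon choice is made.

```python
# Hypothetical toy exons, for illustration only (NOT troponin T sequences).
UPSTREAM_EXONS = ["ATGGCTAAG", "GAGGAGCTG"]
ALTERNATIVE_EXONS = {"exon_a": "ATCGCCAAGGAT", "exon_b": "ATTGGGCGTGAA"}
DOWNSTREAM_EXONS = ["CTGAAGTAA", "ACCCTGCTG"]


def splice(choice: str) -> str:
    """Join the common exons around exactly one of the alternative exons."""
    return "".join(UPSTREAM_EXONS + [ALTERNATIVE_EXONS[choice]] + DOWNSTREAM_EXONS)


mrna_a = splice("exon_a")
mrna_b = splice("exon_b")

upstream = "".join(UPSTREAM_EXONS)
downstream = "".join(DOWNSTREAM_EXONS)

# Both products share everything outside the alternative exon...
print(mrna_a.startswith(upstream) and mrna_b.startswith(upstream))  # True
print(mrna_a.endswith(downstream) and mrna_b.endswith(downstream))  # True
# ...but they are not the same molecule.
print(mrna_a == mrna_b)  # False
```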
The only possible inconsistency in our cDNA data with the differential splicing mechanism concerns the codon for Arg 224, which is AGG in cC113 but is AGA in cC119. Since Arg 224 is outside the difference region in a common exon presumably shared by the two alternatively spliced mRNAs, the sequence here should be identical in the two cDNA clones. In principle this discrepancy might be used to argue that cC113 and cC119 in fact represent two different genes. However, a single base difference cannot carry much weight as evidence against the single gene/differential splicing model because trivial explanations are also possible. The one-base difference could represent allelic variation in a single DNA sequence or a misincorporation error of reverse transcriptase or DNA polymerase during cDNA cloning. These explanations for this single base difference are particularly reasonable considering that the 3' untranslated sequences of cC113 and cC119 are identical. This 3' homology is compelling evidence that these two troponin T mRNAs are products of the same gene since the 3' untranslated sequences of very closely related genes are divergent (29, 30).
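The two codons reported for residue 224 can be checked directly against the standard genetic code. The short sketch below confirms that AGG and AGA both specify arginine and that they differ only at the third base; the helper name is an illustrative assumption.

```python
# The six arginine codons of the standard genetic code.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def differing_positions(codon_1: str, codon_2: str) -> list:
    """Return the 1-based codon positions at which two codons differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(codon_1, codon_2)) if a != b]


codon_cc113 = "AGG"  # Arg 224 codon reported for cC113
codon_cc119 = "AGA"  # Arg 224 codon reported for cC119

print(codon_cc113 in ARG_CODONS and codon_cc119 in ARG_CODONS)  # True
print(differing_positions(codon_cc113, codon_cc119))  # [3]
```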
There is precedent for the generation of muscle protein isoforms by production of different mRNAs from a single gene, although the exon choice mechanism of troponin T may be unique. Vertebrate myosin light chains 1 and 3, which differ at their N termini, are produced from a single gene (15-17). The production of mRNA encoding either myosin light chain 1 or 3 apparently depends on the choice between two alternative promoters present in the gene. In the case of troponin T it is unlikely that there is such a simple relationship between exon choice and promoter activity since the difference region is far removed from the N terminus of the protein. The Drosophila muscle myosin heavy chain gene (13, 14, 26), tropomyosin gene (27), myosin alkali light chain gene (18), and the rat tropomyosin gene (28) give rise to multiple mRNAs, in these cases differing at their 3' ends by poly(A) addition site choice. This results in isoform heterogeneity at the C termini of these proteins. Again, troponin T may not be exactly similar because the C termini of the protein isoforms (and the 3' untranslated mRNA sequences) are identical, implying that these troponin T isoforms are generated at least in part by splicing choices internal to troponin T transcripts. Thus, a variety of distinct differential RNA splicing mechanisms may be used in different muscle protein genes to generate multiple protein isoforms. Clearly, processes other than simple gene duplication have been important in the evolution of muscle protein isoform diversity.
None of the 3 cDNA clones extend out to the N terminus of troponin T. Thus, we cannot yet correlate the near-C-terminal heterogeneity we describe here with the N-terminal troponin T sequence heterogeneity revealed through biochemical studies of the protein (31). The patchwork nature of the N-terminal heterogeneity has led Wilkinson et al. (31) to propose that multiple RNA splicing patterns may be responsible for generating different N termini in various troponin T isoforms. These observations, coupled with ours and those of Medford et al. (19), suggest that the troponin T gene is very complex in terms of structure and expression and gives rise to a variety of troponin T isoforms through RNA splice choices affecting various parts of the protein molecule. This complexity has important implications for muscle physiology and muscle gene regulatory mechanisms.
We do not know any molecular details concerning the operation of the differential splicing mechanism that produces troponin T mRNAs having alternative difference region sequences. However, certain general conclusions can be drawn from the manner in which the quail cDNA clones cC113, cC119, and cC122 were isolated. Since these were among a set of cDNAs representing mRNAs whose abundances increase dramatically during myoblast differentiation, this is evidence that troponin T gene expression, like that of many other contractile protein genes (20, 23), is quantitatively regulated by mechanisms that determine mRNA abundance. Nuclei transcription studies show that the troponin T gene goes from a transcriptionally inactive state to a transcriptionally active state during myoblast differentiation. The presence of both cC113 and cC119 mRNAs in newly fused embryonic muscle cells in culture shows that alternative RNA splicing of difference region sequences occurs at the onset of myoblast differentiation.
This indicates that the operation of the alternative splicing reactions in muscle cells does not depend on external features of development and maturation such as innervation or hormonal changes occurring in the embryo; these processes do not occur in muscle cell cultures. We can further conclude that, if any gene- or cell-specific factors or mechanisms are required for alternative difference region splicing of troponin T mRNAs, then these are either already expressed in proliferating myoblasts (and continue to be expressed following differentiation), or else they themselves are activated during myoblast differentiation, along with troponin T, and other muscle protein, gene expression. However, according to the present results there is no need to postulate any gene- or cell-specific splicing factors at all. A simpler hypothesis would be that the structure of troponin T gene transcripts directly dictates the production, via the general RNA splicing machinery used in all cells, of alternatively spliced mRNAs. According to this idea, the production of alternatively spliced troponin T mRNAs in muscle cells requires only that the troponin T gene be actively expressed. This could be tested experimentally by introducing a transcriptionally active troponin T gene into nonmuscle cell types.
Although the above considerations raise the possibility that no cell-specific RNA splicing factors may be required for alternative troponin T RNA splicing, the relative accumulation of the alternatively spliced mRNAs varies from muscle to muscle (19).¹ This suggests that the mechanism(s) governing the production of both RNA species may be subject to some kind of regulation related to muscle function and muscle fiber type.
Comparative Analysis-A comparative analysis provides insight into structural aspects of the troponin T difference region heterogeneity. Fig. 3 shows difference region amino acid sequences from quail and rat, as well as relevant data for the rabbit (7, 32). Some features are conserved in all these difference region sequences (Fig. 3). The existence of common features can be taken as evidence that the various difference region sequences are descended from a common ancestral sequence. Presumably one of the events that occurred during the evolution of troponin T genes was an internal, partial gene duplication in which the difference region exon (probably along with parts of the adjacent introns), originally present as a single sequence, came to be present in two adjacent copies. It is unclear whether or how a mechanism of alterna- [...] Thus the mechanism has been operating during a long period of vertebrate evolution.
The conservation of amino acid sequence similarities also implies that the various isoforms of the difference region are relatively modest variations on a common protein structural theme, as opposed to being completely different structural forms. In this regard, it is noteworthy that the difference region of troponin T corresponds to a distinct structural element, a β-sheet domain, in a predicted structure of the rabbit protein (7). Whatever the structural role of the difference region in the protein, it seems likely that it is fundamentally similar in all troponin T isoforms, and that the conserved sequence features in Fig. 3 are involved.
In the face of the overall structural similarities, a key question remains: are the alternative difference region protein sequences functionally equivalent in terms of troponin T activity, or does each isoform have a distinct function? Distinct patterns of expression of the two difference exons both in the rat (19) and in the quail² argue in favor of a distinct function for each isoform. If distinct isoform-specific functions do exist then we might expect them to be associated with distinct amino acid sequence features. These features could be identified by virtue of their being shared by, and being distinctive of, functionally equivalent quail and rat sequences. Making the quail/rat correspondences indicated in Fig. 3, i.e. quail cC113 ↔ rat β, and quail cC119 ↔ rat α, we note that residues 239 and 242 are distinctive in the above sense. Thus these residues are of particular interest in terms of isoform-specific function.
How Many Genes; How Many Difference Exons?-The cC113 ↔ β, cC119 ↔ α correspondence has a puzzling feature. On the one hand, the similarity of cC113 and β is very high (only 2/14 differences), as might be expected of corresponding structures. In contrast, the similarity of cC119 and α is much lower (8/14 differences). In fact, the cC119/α difference is comparable to the level of difference between the two quail sequences (8/14 differences) or the two rat sequences (10/14 differences), and this calls into question the whole idea of a cC119 ↔ α correspondence. An additional point of contrast between quail cC119 and rat α is that the cC119 difference exon is expressed both in adult¹ and in embryonic quail muscle cells (as demonstrated by the isolation of both cC113 and cC119 from embryonic muscle), whereas the α difference exon is not expressed in embryonic rat muscle cells, but only in adult muscle (19).
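The difference counts quoted above are easier to compare as percent identities. The short sketch below converts them, assuming that the denominator of 14 corresponds to the 14 residues (229-242) encoded by the difference exons; the dictionary layout and function name are illustrative assumptions.

```python
from fractions import Fraction

LENGTH = 14  # assumed: residues 229-242 of the difference region

# Difference counts as quoted in the text.
DIFFERENCES = {
    ("quail cC113", "rat beta"): 2,
    ("quail cC119", "rat alpha"): 8,
    ("quail cC113", "quail cC119"): 8,
    ("rat alpha", "rat beta"): 10,
}


def percent_identity(differences: int, length: int = LENGTH) -> float:
    """Percentage of positions that are identical in an aligned pair."""
    return float(100 * Fraction(length - differences, length))


for (first, second), diffs in DIFFERENCES.items():
    print(f"{first} vs {second}: {diffs}/{LENGTH} differences, "
          f"{percent_identity(diffs):.1f}% identity")
```

On these numbers the cC113/β pair is about 86% identical, whereas the cC119/α pair (about 43%) is no more similar than the two quail sequences are to each other.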
The great sequence difference between quail cC119 and rat α and their different patterns of embryonic expression raise the possibility that they may be evolutionarily paralogous (related, but not exactly corresponding), as opposed to orthologous (exactly corresponding) structures. Two possible paralogous relationships deserve special consideration.
First, it is possible that the rat gene studied by Medford et al. (19) and the quail gene we are studying here represent two distinct members of a vertebrate troponin T multigene family.
That vertebrates, or at least birds, contain more than one troponin T gene is suggested by the recent results of Cooper and Ordahl (33). They isolated a chicken cDNA clone encoding a cardiac isoform of troponin T. The encoded protein differs significantly from the quail skeletal troponin T we describe here, not just in the difference region, but throughout its length, and clearly is the product of a distinct cardiac gene. Thus birds appear to carry a minimum of two troponin T genes: at least one encoding cardiac isoform(s), and at least one encoding skeletal isoforms. It remains to be determined whether birds or mammals contain additional skeletal troponin T genes. If so, then it is possible that we and Medford et al. have studied noncorresponding quail and rat genes and this could perhaps account for the very great cC119/α sequence difference.
Second, it may be that the quail and rat genes do correspond exactly but the cC119 and α difference exons do not correspond. This situation could possibly arise if either the quail gene or the rat gene (or their common ancestral gene) contained not two, but three alternative difference exons. These would correspond to the α sequence, the cC119 sequence, and the cC113/β sequence. A computer search of the rat troponin T gene sequence did not reveal a third exon similar to the cC119 difference exon. However, it remains to be determined whether the quail troponin T gene has three alternative difference exons.
If further work should reveal that the quail cC119 and rat α difference regions are orthologous structures, then their great sequence divergence would imply either that birds and mammals have different functional requirements for this particular alternative troponin T protein segment, or that it has, compared to the highly conserved cC113/β protein segment, relatively little isoform-specific function. Further comparative investigation of the structure and expression of avian and mammalian troponin T genes will help reveal the functional significance of troponin T isoforms and the mechanisms regulating alternative troponin T RNA splicing choices.