Alternative Splicing Generates Variants in Important Functional Domains of Human Slow Skeletal Troponin T*

We provide the first nucleotide sequence information for the slow isoform of troponin T (TnT). Sequence and hybridization analyses revealed that a single slow TnT gene present in the human genome gives rise to at least two different slow TnT variants by alternative splicing. The observed variations in slow TnT splicing generated major structural differences between the two corresponding slow TnT proteins in a domain that is likely to be involved in critical interactions with troponin C, troponin I, and tropomyosin in the thin filament. Corresponding variations have not been found for fast or for cardiac TnT. The comparison of splicing patterns for fast, cardiac, and slow TnT reveals that the splicing pattern for each isoform is unique. These features raise important questions of why and how all the individual members of the closely related TnT gene family developed such complex but different schemes of alternative splicing to create sets of variant proteins. This unusual familial trait is not known in any other muscle or nonmuscle multigene family.

Troponin T (TnT)¹ is the tropomyosin-binding component of the troponin complex that regulates the contraction of muscle in response to Ca2+ efflux from the sarcoplasmic reticulum (1). Adult cardiac muscle, slow skeletal muscle, and fast skeletal muscle contain different isoforms of TnT (2). The domains of TnT involved in critical interactions with troponin C and troponin I in the troponin complex and with tropomyosin have been characterized in considerable detail for the fast isoform of TnT by peptide binding studies (3-6).
Recently it has been demonstrated that more than 40 and potentially 64 distinct variants of the fast TnT isoform are derived from a single gene by a complex scheme of alternative splicing (7-9). Two different cardiac TnT proteins (10), which may have functional differences (11), are derived from a single gene by alternative RNA splicing (12). We describe here the first sequence information for the slow isoform of TnT and propose that different slow TnT variants feature major differences in an important functional domain of the proteins.

* This work was supported in part by grants from the National Institutes of Health, the Muscular Dystrophy Association, and the Veterans Administration (to L. K.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

to the GenBankTM/EMBL Data Bank with accession number(s)

¹ The abbreviation used is: TnT, troponin T.

EXPERIMENTAL PROCEDURES
Cloning and characterization of the two cDNAs described in this study were performed essentially according to methods described by Maniatis et al. (13). For DNA sequencing analyses the method of Sanger et al. (14) was employed. Specific details for individual experiments are provided in the figure legends.

RESULTS AND DISCUSSION
The two TnT cDNA clones described in this paper, H22h and M1, were isolated from a human adult skeletal muscle cDNA library (15). Clone H22h was obtained as an abundant message clone by using a strategy described elsewhere and represents an essentially full length cDNA clone (16). Clone M1 was selected by using a probe containing the repetitive human DNA sequence Hut2 which is described by Hoffman-Liebermann et al. (17). Comparison of the locations of restriction endonuclease cleavage sites in the cDNAs revealed that the two clones are related (Fig. 1a), and nucleotide sequencing (Fig. 1b) verified these similarities. The major differences are that clone H22h contains two short insertions relative to clone M1. The clones contain 5'-untranslated regions of 57 or 58 nucleotides (see legend to Fig. 1b) and are likely to be full length (16). Both clones also carry 3'-untranslated regions (86 and 83 nucleotides long in clone H22h and clone M1, respectively) that end in a polyadenylic acid tail. To identify these clones, the amino acid sequences were derived from both cDNAs and compared to sequences of known proteins contained in the Protein Identification Resource (NBRF) data base. The search revealed similarities to all represented TnT proteins. The similarity to both fast and cardiac TnT amino acid sequences is about 65% in the conserved carboxyl-terminal segment. This result suggests that H22h and M1 represent TnT cDNA clones, the first isolated from human sources. In addition, the degree of dissimilarity of the polypeptide encoded by these clones to cardiac and fast TnT proteins (35%), which is similar to the degree of dissimilarity between fast and cardiac TnT proteins, suggested that the isolated human clones corresponded neither to fast nor to cardiac TnT but represented a third and distinct class of TnT isoforms. No direct reference data for slow TnT are available since the protein has not been sequenced, but it seemed likely that both H22h and M1 are the first examples of slow TnT isoform cDNA clones.

FIG. 1b. The first base of the sequence represents the first base in the 5'-untranslated region of clone H22h next to the flanking poly dG·dC tail. An additional base (A) is present at the beginning of clone M1. The amino acid sequence deduced from the coding sequence of clone H22h is written beneath the nucleotide sequence. The three differences between the coding regions of cDNA clones H22h and M1 are underlined; base 117 is G (instead of C) in clone M1, leading to a change from aspartic acid to glutamic acid in the corresponding protein. Two stretches of 33 and 48 nucleotides, from about base pair 131 to 163 and from about base pair 670 to 717, respectively, are not present in the sequence of clone M1. The exact sequences and location of the inserted exons relative to M1 are not deducible from the comparison of both cDNA sequences alone. One out of three possible adjacent locations is indicated for the amino-terminal "insert"; one out of four adjacent possible locations is indicated for the carboxyl-terminal "insert." The triplet ACC in front of the poly dA·dT tail (in parentheses) is present in the sequence of H22h, but not in clone M1. The polyadenylation signal "AATAAA" which is common to both sequences is underlined.
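The coordinates quoted in the legend to Fig. 1b can be checked with a few lines of arithmetic. The sketch below is an illustration only and not part of the original analysis. It assumes that the sequence is numbered from the first base of clone H22h, that the H22h 5'-untranslated region is 57 nucleotides long (so the initiation codon would start at base 58), and that the approximate insert coordinates are inclusive. All function and variable names are hypothetical.

```python
# Illustrative check of the cDNA coordinates quoted in the text.
# Assumptions (not stated explicitly in the source): numbering follows
# clone H22h, its 5'-untranslated region is 57 nt long, and the insert
# coordinates are inclusive.

FIVE_PRIME_UTR = 57                    # assumed length for clone H22h
CDS_START = FIVE_PRIME_UTR + 1         # first base of the initiation codon

INSERTS = {
    "amino-terminal": (131, 163),
    "carboxyl-terminal": (670, 717),
}


def insert_length(first: int, last: int) -> int:
    """Length in nucleotides of an inclusive base range."""
    return last - first + 1


def codon_position(base: int, cds_start: int = CDS_START) -> tuple[int, int]:
    """Return (codon number, position within codon), both 1-based."""
    offset = base - cds_start
    return offset // 3 + 1, offset % 3 + 1


for name, (first, last) in INSERTS.items():
    n = insert_length(first, last)
    # A length divisible by 3 leaves the downstream reading frame intact.
    print(f"{name}: {n} nt, {n // 3} codons, in frame: {n % 3 == 0}")

codon, pos = codon_position(117)
print(f"base 117: codon {codon}, position {pos}")

# Asp is encoded by GAC/GAT and Glu by GAA/GAG, so a single C-to-G change
# that converts Asp to Glu must be GAC -> GAG at the third codon position.
CODONS = {"GAC": "Asp", "GAT": "Asp", "GAA": "Glu", "GAG": "Glu"}
h22h_codon = "GAC"
m1_codon = h22h_codon[:2] + "G"
print(f"{h22h_codon} ({CODONS[h22h_codon]}) -> {m1_codon} ({CODONS[m1_codon]})")
```

Under these assumptions both inserts are whole multiples of three nucleotides (11 and 16 codons), so neither shifts the reading frame, and base 117 falls on the third position of a codon, which is consistent with a GAC to GAG change.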
To establish the identity of clones H22h and M1 unambiguously, we performed a blotting analysis of RNA isolated from adult rabbit soleus, psoas, and heart muscle, which contain predominantly or exclusively slow, fast, and cardiac TnT, respectively (2). The probe used in this experiment, comprising about 500 base pairs of the coding sequence of H22h, is depicted in Fig. 1a. TnT mRNA could be detected easily in soleus muscle RNA. On the other hand, heart ventricular RNA gave no signal and psoas RNA gave a very faint signal only after prolonged exposure on the autoradiogram (Fig. 2a). This result demonstrates that both clones represent slow isoforms of TnT. H22h and M1 are thus the first isolates of clones for slow TnT from any source.
Comparison of the sequences of the two clones revealed that two small stretches of 33 and 48 nucleotides in length are present in clone H22h, but not in clone M1 (Fig. 1, a and b). In addition, the presence of a single nucleotide difference (C versus G) at base 117 results in the conservative change from aspartic acid to glutamic acid in the corresponding TnT proteins. There are two possible explanations for these sequence differences. First, the two clones may have been derived from two different but very similar slow TnT genes. Alternatively, they may have been derived by differential splicing of a common precursor transcribed from a single gene. A Southern blot analysis of human genomic DNA demonstrates that the two mRNAs are the product of a single copy gene. Each of five aliquots of a human genomic DNA sample was digested with a different restriction endonuclease. The DNA fragments were size-fractionated on an agarose gel and transferred to a nitrocellulose filter. We used a 32P-labeled PvuII DNA fragment (Fig. 1a) common to both cDNAs as a hybridization probe. The autoradiogram from this experiment is shown in Fig. 2b. Single bands were detected with the probe in all restriction endonuclease-digested DNA samples, and we conclude that there is a single slow TnT gene in the genome. Accordingly, the two cDNAs must be derived from a single genomic transcript which undergoes alternative splicing, although the splicing pattern of slow TnT transcripts could be more complex than the comparison of the two cDNA clones alone would suggest. The other difference between the cDNAs at base 117 (C versus G) presumably represents a polymorphism at the human slow TnT locus. If so, H22h and M1 were derived from different alleles of this gene. However, a cloning artifact cannot be excluded as the explanation for this single base difference. The three terminal bases ACC of the 3'-untranslated region of H22h are missing in M1 (Fig. 1b). Heterogeneity of the length of 3'-untranslated sequences seems to be rare (18) but has been described before for bovine prolactin mRNAs (19) and for mRNA of the mouse ribosomal protein L30 (20). The precise locations of the junctions of the two optional inserts in the common sequence of slow TnT cannot be assigned with certainty. Three adjacent locations for the amino-terminal insert and four adjacent locations for the carboxyl-terminal insert are possible. Only one of the alternative locations for each insert is designated in Fig. 1b. The more precise localization of the alternative splice sites must await cloning and analysis of the slow troponin T gene.
Alternative splicing of a precursor nuclear RNA is well documented for cardiac TnT (12) and fast TnT (8, 9, 21). The most surprising aspect of alternative splicing in the TnT family is that the variants of all three isoforms are derived from their respective precursors by different schemes of alternative splicing as presented in Fig. 3a. The only common feature is that the insertion sites of the amino-terminal optional exon block of slow TnT and of the set of five alternatively spliced amino-terminal exons of fast TnT are at an equivalent position in the coding sequence (Fig. 3). With this exception, different regions of the TnT proteins are affected in each case.

FIG. 2. a, demonstration by Northern blotting that transcripts corresponding to human TnT cDNA clones are specifically present in RNA from slow-twitch muscles of rabbits. 10 µg of total cellular RNAs from rabbit soleus, psoas, and heart ventricle were separated on an agarose-formaldehyde gel and transferred onto a Biodyne membrane as described by Thomas (33). RNA on the filter was hybridized to a nick-translated 32P-labeled probe (34) of an internal RsaI fragment of clone H22h (Fig. 1a). The region contained in this fragment is highly conserved between different TnT isoforms. The filter was washed in 0.5 × SSC, 0.1% sodium dodecyl sulfate at 55 °C and exposed for 8 h at -70 °C. Only one band of hybridization is visible in soleus RNA, but the two different mRNAs would not be distinguished by this assay. A control experiment (not shown) demonstrated that the three RNA preparations used were of comparable quality. b, Southern blotting demonstration that the human TnT gene is single copy. 10 µg of HeLa DNA were digested with the restriction endonucleases EcoRI, HindIII, BamHI, BglII, or XbaI. DNA fragments were size-fractionated on an agarose gel and transferred onto a nitrocellulose filter (35). The blotted DNA fragments were hybridized with the 32P-labeled nick-translated PvuII fragment indicated in Fig. 1a; clone H22h and clone M1 are essentially identical in this region. The filter was washed at 65 °C in 0.5 × SSC, 0.1% sodium dodecyl sulfate and exposed overnight at -70 °C. HindIII fragments end labeled with 32P were used as size markers.

FIG. 3a. The majority of the possible exon combinations of the five optional amino-terminal exons of fast TnT seems actually to have been identified in fast skeletal muscle (7). The two carboxyl-terminal exons (α and β) of rat fast TnT are used alternatively and are, therefore, marked differently by hatched boxes. The corresponding amino acid region in cardiac TnT, for which only one exon exists, and in slow TnT, for which only one exon has been found so far, are marked by hatched boxes as well. These regions in slow and cardiac TnT are 64% similar and are slightly more similar to the β-exon of rat (8) (50 and 43%) than to the corresponding α-exon (29 and 36%). The splicing pattern shown for human slow TnT is based on the two cDNA clones described in this report, but the pattern could be even more complex than the figure suggests. b, protein sequence alignment for evolutionarily conserved parts of slow, fast, and cardiac troponin T proteins (symbolized by block bars in a). The evolutionarily conserved parts of the three troponin T isoforms are compared: 1, fast TnT (rat) with α-exon (underlined), amino acids 44-259 according to Breitbart et al. (8); 2, fast TnT (rat) with β-exon (underlined), amino acids 44-259 according to Breitbart et al. (8); 3, cardiac TnT (chicken), amino acids 88-302 according to Cooper and Ordahl (12); 4, slow TnT (human, H22h), amino acids 50-278; 5, slow TnT (human, M1), amino acids 50-262.

Fig. 3b (alignment fragment): KKPLDIDYHGEEQLRARSAWLPPSQPSCPAREKAQELSDWIHQLESEKFDLHAKLKPQKYEINVLYNRIS WSKKAGATAKGKVGGRWK
These considerations raise two important questions. Why are there three isoforms of TnT and so many variants thereof? And how were two strategies, gene duplication and alternative splicing, combined during evolution to generate this unique collection of protein variants?
One attractive explanation for the existence of many variant TnT proteins emphasizes the possibility that different variants of TnT might be necessary during muscle differentiation (8, 9, 12, 22).
The levels of individual variants of chicken cardiac TnT and rat fast TnT mRNAs change during embryogenesis (8, 9, 12, 22). But a developmental role cannot explain the surprisingly high number of more than 40 fast TnT variants present in the fast leg muscles of the adult chicken (7). Similarly, differential expression during differentiation cannot be the only reason for the existence of the two slow TnT isoform variants discussed in this report, since both were recovered from an adult human skeletal muscle cDNA library.
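The figure of potentially 64 fast TnT variants can be reproduced by simple counting from the exon arrangement described for fast TnT. The sketch below assumes that each of the five optional amino-terminal exons is included or skipped independently and that exactly one of the two carboxyl-terminal exons (α or β) is used. The exon labels are placeholders, not the published exon numbers.

```python
from itertools import product

# Placeholder labels for the five optional amino-terminal exons of fast TnT.
OPTIONAL_N_TERMINAL_EXONS = ["opt1", "opt2", "opt3", "opt4", "opt5"]
# The two carboxyl-terminal exons are used alternatively (one or the other).
C_TERMINAL_CHOICES = ["alpha", "beta"]


def enumerate_variants():
    """Yield (included amino-terminal exons, carboxyl-terminal exon) pairs."""
    for mask in product([False, True], repeat=len(OPTIONAL_N_TERMINAL_EXONS)):
        included = tuple(
            exon for exon, keep in zip(OPTIONAL_N_TERMINAL_EXONS, mask) if keep
        )
        for c_exon in C_TERMINAL_CHOICES:
            yield included, c_exon


variants = list(enumerate_variants())
print(len(variants))  # 2**5 * 2 = 64
```

On this counting, the more than 40 variants reported in adult chicken fast muscle would already represent most of the theoretical maximum.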
The carboxyl-terminal 70-80% of TnT has been conserved during evolution between all three isoforms (about 65%, comparison not shown) and within the same isoform in different species. For example, comparison of the rat fast TnT gene (8) and quail fast TnT cDNAs (21) reveals a very similar coding sequence and perfect conservation of the splicing pattern in this segment of the gene. Peptide binding studies suggest that interactions of fast TnT with tropomyosin, troponin C, and troponin I occur in this conserved carboxyl-terminal domain.
Tropomyosin binds to a rabbit fast TnT peptide comprising amino acids 71-151 (3, 4), and both troponin C and troponin I bind to peptides containing amino acids 152-259 (5, 6) in a Ca2+-sensitive manner (23). Altogether these data suggest that this conserved carboxyl-terminal region contains important functional domains in all the TnT isoforms. The initial descriptions of alternative splicing of TnT isoforms (8, 9, 12, 21) did not reveal significant variation in this carboxyl-terminal region and were consistent with this hypothesis. Cardiac TnT, for example, reveals no variation in this region (12). In addition, although fast TnT uses one out of two available exons encoding amino acids 229-242 in the carboxyl-terminal region, these exons are quite similar in amino acid sequence (9, 21). In contrast, the slow TnT gene described here generates radically different variant proteins by its optional use of exons in the very heart of the evolutionarily conserved carboxyl-terminal region as well as by use of optional exons in the amino-terminal region (Fig. 3a).
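The similarity values quoted in the legend to Fig. 3 for the alternatively spliced carboxyl-terminal exon region are consistent with simple residue counts. The sketch below assumes, as an illustration rather than a statement of the authors' method, that similarity was scored as identical residues over the 14 positions corresponding to amino acids 229-242 and rounded to the nearest percent. The legend does not state which of each pair of values belongs to slow and which to cardiac TnT, so the labels are left neutral.

```python
# Amino acids 229-242 inclusive give 14 aligned positions (assumed window).
REGION_LENGTH = 242 - 229 + 1

# Percent similarities quoted in the legend to Fig. 3.
REPORTED = {
    "slow vs cardiac": 64,
    "beta-exon, first value": 50,
    "beta-exon, second value": 43,
    "alpha-exon, first value": 29,
    "alpha-exon, second value": 36,
}


def matching_residues(percent: int, length: int = REGION_LENGTH) -> list[int]:
    """Return every match count k whose rounded percentage equals `percent`."""
    return [k for k in range(length + 1) if round(100 * k / length) == percent]


for label, percent in REPORTED.items():
    counts = matching_residues(percent)
    print(f"{label}: {percent}% <- {counts} of {REGION_LENGTH} residues")
```

Each quoted percentage maps to a single whole number of matching residues out of 14, so differences such as 43% versus 36% amount to one residue and should be read accordingly.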
A sequence comparison for the evolutionarily conserved segments of the three TnT isoforms is shown in Fig. 3b. The carboxyl-terminal "insertion site" of the alternative exon in slow TnT corresponds to amino acids 198/199 in the reference rabbit fast TnT protein (24) and also represents a known splice site (corresponding to exons 14 and 15) in fast and cardiac TnT (8, 12). Four proline residues and one cysteine residue are found among the 16 amino acids inserted in the carboxyl terminus of the protein (Fig. 1b). Proline is normally very rare ( d % ) in the conserved carboxyl-terminal domain of any TnT (8, 21, 22, 24). The only cysteine residues detected in TnT, so far, are found in quail (21) and chicken fast TnT (25), located in alternative exons, and close to the amino terminus of bovine cardiac TnT (10). Thus, all the known cysteine residues in the conserved part of TnT proteins appear to be encoded by alternative exons, raising the possibility that introduction of the cysteine-bearing peptide imparts important functional differences to the variant proteins in the troponin complex. A computer analysis using the Chou and Fasman algorithm (26, 27) predicts that the "insert" of 16 amino acids in human slow TnT would severely disrupt its conformation by introducing a turn into an α-helical domain of the protein (data not shown). The capacity of troponin I (6) and, in particular, of troponin C (5) to bind to this region of slow TnT is likely to be altered.
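The composition argument can be illustrated on the sequence fragment that survives from Fig. 3b in this text. The sketch below scans that fragment for 16-residue windows containing four prolines and one cysteine, as described for the carboxyl-terminal insert. It is an illustration only: the fragment is assumed to come from the slow TnT (H22h) line of the alignment, and the scan does not determine the true insert boundaries, which cannot be assigned from the cDNA comparison alone.

```python
# Fragment as it appears in the Fig. 3b residue (assumed to be slow TnT, H22h).
FRAGMENT = (
    "KKPLDIDYHGEEQLRARSAWLPPSQPSCPAREKAQELSDW"
    "IHQLESEKFDLHAKLKPQKYEINVLYNRIS"
)
INSERT_LENGTH = 16  # amino acids encoded by the 48-nucleotide insert


def candidate_windows(seq, size=INSERT_LENGTH, prolines=4, cysteines=1):
    """Return (start, window) pairs with the stated Pro and Cys counts."""
    hits = []
    for start in range(len(seq) - size + 1):
        window = seq[start:start + size]
        if window.count("P") == prolines and window.count("C") == cysteines:
            hits.append((start, window))
    return hits


windows = candidate_windows(FRAGMENT)
for start, window in windows:
    print(start + 1, window)  # 1-based position within the fragment

# Four prolines in a 16-residue insert.
proline_fraction = 4 / INSERT_LENGTH
print(f"proline content of such a window: {proline_fraction:.0%}")
```

Several overlapping windows satisfy the stated composition, which fits the observation that more than one adjacent placement of the insert is possible. A proline content of 25% in any such window contrasts sharply with the rarity of proline elsewhere in the conserved domain.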
Alternative splicing is widely found in animal viruses (28), and in mammalian species it is especially common among muscle-specific transcripts (29-32). This efficient strategy of evolution provides an alternative to gene duplication as a means of creating variant proteins while preserving a gene for an essential function. The carboxyl-terminal alternative exon of slow TnT provides an example of such a splicing scheme that appears to have allowed maintenance of conserved function while allowing a variant to arise. This evolutionary mechanism does not explain the origin of alternative splicing in the highly variable amino-terminal segments of the three TnT isoforms. In any case, evolution appears to have selected for or accepted an extraordinary degree of hypervariability in this region.
When in evolution was alternative splicing adopted and used to generate TnT variant proteins? Was alternative splicing already used by the putative single ancestral TnT gene that was duplicated and subsequently modified? Or did the individual genes for slow, fast, and cardiac TnT adopt differential splicing independently after their derivation from an ancestral gene that did not create variant proteins? Unfortunately the available data cannot distinguish these possibilities. Both the evolutionary and functional aspects of the TnT splicing question are similarly puzzling. Why the TnT gene family as a group developed such a unique collection of mechanisms for production of variant proteins remains a key issue in understanding the function of TnT in muscle.