The Deduced Sequence of the Novel Protransglutaminase E (TGase3) of Human and Mouse*


At least three transglutaminases are involved in terminal differentiation events in the epidermis and its derivatives, such as the hair follicle, presumably in cross-linking structural proteins and in the formation of the cornified cell envelope. Of these, only the transglutaminase 3 is a proenzyme, requiring activation by proteolytic cleavage, and is the least understood. Using oligonucleotides designed from the amino acid sequences of peptides of the guinea pig enzyme, we amplified mRNA and deduced the complete amino acid sequences of the mouse and human protransglutaminase 3 enzymes. Both proteins contain 692 amino acids of molecular mass about 77 kDa. Following expression in yeast, the proenzymes encoded by the full-length cDNA clones are active enzymes and can be further activated 15-fold on treatment with dispase. Although these proteins share 38-53% identity to other members of the transglutaminase family, surprisingly, the mouse, human, and guinea pig enzymes have not been highly conserved and show only 50-75% identity to each other. Much of the sequence variation occurs in the vicinity of the proteolytic activation site which lies at the most flexible and hydrophilic region of the molecule and is flanked by a sequence of 12 residues that are absent from other transglutaminases. We suggest that cleavage of this exposed flexible hinge region promotes a conformational change in the protein to a more compact form, resulting in activation of the enzyme. Expression of mouse and human protransglutaminase 3 mRNAs is regulated by calcium, as for other late differentiation products of the epidermis, suggesting that this enzyme is responsible for the later stages of cell envelope formation in the epidermis and hair follicle.
Transglutaminases (TGases)¹ are calcium- and thiol-dependent enzymes that modify proteins by catalyzing the formation of an isodipeptide cross-link between the ε-NH2 of a lysine and the γ-amide of a glutamine residue (1-4). In mammals, five distinct TGases are known to exist: a membrane-associated activity of about 92 kDa, termed TGase1, which was first discovered in keratinocytes (5-7) and is now known to be widely expressed (8-13); a ubiquitous "soluble" or "tissue" activity of about 80 kDa, termed TGase2 (14-18); a soluble proenzyme activity of about 77 kDa, known as the "epidermal" or "hair follicle" TGase3 (14, 19-21); an inactive TGase-like protein of about 75 kDa, band 4.2, which is a ubiquitous constituent of the subplasma membrane of most eukaryotic cells (22, 23); and the catalytic a subunit of the blood clotting factor XIII of about 77 kDa (24-26). Curiously, all but the last member of this family are expressed in terminally differentiating epidermis (27). Although the TGase2 enzyme is implicated in apoptosis and cell adhesion (4, 28, 29), the TGase1 and TGase3 enzymes are thought to be involved in the formation and assembly of the cornified cell envelope (CE) of the epidermis, hair follicle, and perhaps other stratified squamous epithelia (1-4, 12, 30, 31), by cross-linking of the several known or putative CE protein constituents with isodipeptide bonds (30-33). However, while data on the protein (27, 34-36) and gene (13, 37-39) structures and expression characteristics (40, 41) of the TGase1 system are available, rather less is known about the TGase3 system. Moreover, the substrate specificities, if any, of these enzymes during CE assembly are not yet known.

¹ The abbreviations used are: TGase, transglutaminase; TGase3, suggested new nomenclature, replacing transglutaminase E; CE, cornified cell envelope; bp, base pair; kbp, kilobase pair; kb, kilobase; PCR, polymerase chain reaction; HPLC, high performance liquid chromatography.
The TGase3 activity was the first epidermal enzyme to be isolated and characterized (14, 15, 19, 20). Several early studies reported a soluble protein of about 50 kDa from both epidermal and hair follicle tissues (14, 15, 19, 20, 42, 43), but more rigorous biochemical and cell biological analyses revealed that it is in fact a proenzyme of molecular mass about 77 kDa that becomes highly active upon proteolytic cleavage into a 50-kDa (amino-terminal) and a 27-kDa species (20, 21). Although newer work showed that these fragments are not normally separated upon activation (21), the fact that the isolated 50-kDa fragment can retain catalytic activity was the source of confusion in earlier studies. Furthermore, despite earlier work (42), it is now generally agreed that the epidermal and hair follicle pro-enzyme species are the same (12).
Biochemical analyses have revealed that purified guinea pig TGase3 is chemically and perhaps structurally different from the other members of the TGase family (21). Guinea pig TGase3 possesses an elongated shape that undergoes a conformational rearrangement to a more compact form upon proteolytic activation. Fractionation by ion-exchange chromatography showed that the mouse and guinea pig TGase3 enzymes are near neutral in net charge, whereas the TGase1 and TGase2 enzymes are acidic (21). Activated TGase3 accounts for >75% of total TGase activity in mammalian epidermal and hair follicle tissues, although chromatographic experiments show that the amount of TGase3 protein is far less than for TGase1 or TGase2 (21).
We have undertaken a systematic analysis of the family of TGases involved in CE formation in the epidermis (13, 27).
In this paper we now describe the full-length cDNA and deduced amino acid sequences of both the mouse and human TGase3 enzymes and describe their unusual properties and structures.

TABLE I
50-kDa fragment: 1 CLGVRSR
27-kDa fragment: 7 AQRSPGREQAPSISGRFKVNGVLAVGQE
By comparisons with other TGase sequences, these two peptides constitute the active site (order: 3-1); sequence alignments show that they are unique to and thus diagnostic for the TGase3 system.
This is the amino terminus of the 27-kDa fragment (21) and thus represents the cleavage activation site of guinea pig TGase3.

MATERIALS AND METHODS
Determination of the Amino Acid Sequences of Selected Peptides of Guinea Pig TGase3-The 50-kDa amino-terminal and 27-kDa carboxyl-terminal fragments of guinea pig TGase3, derived by dispase treatment, were fractionated and purified as described previously (21). Each portion was cleaved with trypsin (Boehringer sequencing grade) at 10 mg/ml in 0.1 M NH4HCO3 with a final enzyme to protein ratio of 1:50 and digested for a total of 4 h at 37 °C. Following drying, the peptides were redissolved in 0.1% aqueous trifluoroacetate and fractionated by HPLC, and well resolved peaks with absorbance at both 210 and 350 nm were selected for sequence analysis. Absorbance at 350 nm was taken as an indication of cysteine residues alkylated with 5-N-[(iodoacetamidoethyl)amino]naphthalene-1-sulfonic acid, possibly corresponding to active site peptides (21). Sequence analysis of selected peptides was then performed on an Applied Biosystems 470A protein sequenator using the automated Edman degradation method (21, 33, 44).
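As an illustration of what a complete tryptic digest is expected to yield, the following minimal sketch cuts a sequence after every lysine or arginine. The rule that trypsin does not cut before proline is a common simplification assumed here, not something stated in this paper, and the function name is hypothetical. The input is the 28-residue amino terminus of the guinea pig 27-kDa fragment reported in Table I.

```python
def tryptic_digest(sequence):
    """Return peptides from an idealized complete trypsin digest.

    Assumed rule: cleave on the carboxyl side of K or R,
    except when the next residue is P.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        # Residue following the candidate cleavage site ("" at the chain end).
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep the carboxyl-terminal peptide if it does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


# Amino terminus of the guinea pig 27-kDa fragment (Table I).
fragment = "AQRSPGREQAPSISGRFKVNGVLAVGQE"
print(tryptic_digest(fragment))
```

Real digests also contain missed cleavages, so the peptides actually recovered by HPLC need not match this idealized list.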
Anchored PCR Cloning Strategies-Initially, we constructed a series of degenerate oligonucleotide primers based on the available guinea pig TGase3 peptide sequences (see Table II for lists of primers) and used these to amplify DNA obtained from a random-primed cDNA library prepared from mouse epidermal mRNA (45). PCR was done with a commercial DNA amplification reagent kit (Perkin-Elmer Cetus) by following the manufacturer's specifications, using 25 pmol of primers and with conditions of 95 °C (5 min) and 35 cycles of denaturation at 94 °C (0.5 min), annealing at 42 °C (0.5 min), and elongation at 72 °C (1.5 min). The products were fractionated through low melting agarose, excised, and purified through Chroma spin 100 columns (Clontech). The ends of the amplified DNA were filled in with Klenow DNA polymerase (46), and the products were subcloned into the pGEM-3z vector (Promega) and then sequenced on both strands by the dideoxy chain termination method with Sequenase 2.0 (United States Biochemical Corp.). Although most sets of degenerate primers did not work, apparently because of the substantial nucleotide sequence differences between guinea pig and mouse TGase3 mRNAs (see Table V), four were found sufficiently useful to proceed. Subsequently, RNA-mediated anchored PCR was used to "walk" in both directions along the mouse TGase3 mRNA using specific mouse TGase3 nucleotide sequences as primers and additional degenerate primers. In this case, aliquots of 200 ng of DNase I-treated total newborn mouse epidermal RNA (45) were reverse transcribed at 42 °C. Following removal of the dNTPs through Chroma spin columns, the cDNAs so produced were tailed in the presence of 200 μM dGTP with 25 units of terminal deoxynucleotidyltransferase (GIBCO/BRL) for 1 h at 37 °C (46). PCR was then done in two steps. The conditions for the first round were exactly as described above, with 25 pmol of the primer used as the minus primer and either a degenerate primer, a specific mouse TGase3 primer, or oligo(dC) as the plus primer. A portion was diluted 1:1000 with buffer, and 1 μl was reamplified in a second round of PCR using the more stringent conditions of denaturation at 94 °C (0.5 min), annealing at 55 °C (0.5 min), and elongation at 72 °C (1.5 min), and using primers on one end that were nested inside those used in the first PCR reaction (see Table II, Fig. 1). The further subcloning and sequencing procedures were performed as above. In this way, it was possible to walk along the entire length of the mouse TGase3 mRNA in both directions in six steps. The human TGase3 cDNA sequence was generated in essentially the same way in three steps, using the nested primers listed in Table II and the adduced mouse sequence data.

TABLE II
Sequences of oligonucleotide primers used for anchored PCR experiments
"P" oligonucleotides were derived from numbered peptide sequences (see Table I). "+" means plus (left) primer; "-" means minus (right) primer. The "a" primers were used in the first PCR reaction. The "b" primers were nested inside the a primers and used for the second round of PCR on a diluted sample of the first reaction. I = inosine; N = all four nucleotides. Note that the primers 6a+, 6b+, 7b-, and 7a- are from corresponding mouse TGase sequences.

FIG. 1. ... (Table II) and extent of sequence information obtained with each PCR step.
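The cost of designing primers from peptide sequences can be made concrete by counting how many distinct oligonucleotides a degenerate primer represents. The sketch below is illustrative only: the primer strings are hypothetical (the actual Table II sequences are not reproduced here), the function name is an assumption, and inosine is treated as a single species on the assumption that it pairs with any base. N is counted as all four nucleotides, as in the Table II legend.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        if base == "I":
            # Inosine: one synthesized species (assumed to pair with any base).
            continue
        count *= len(IUPAC[base])
    return count


# Hypothetical primers, for illustration only.
with_n = "TGYTTYGGNCARAT"
with_inosine = "TGYTTYGGICARAT"
print(degeneracy(with_n), degeneracy(with_inosine))
```

Substituting inosine for N at fourfold-degenerate positions lowers the number of species in the primer pool, which is one reason such substitutions are commonly used.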
Expression of Full-length Human Coding Clone of TGase3 in Yeast-The human TGase3 sequence was initially obtained in three sections (see Fig. 1). The longer 1400-bp clone possessed unique NaeI and KpnI sites toward its 5'- and 3'-ends, respectively, that afforded simple joining of the 500-bp 5'-clone and the 900-bp 3'-clone into a full-length single clone in pGEM-3z. This was then subcloned into the pYES 2.0 yeast expression vector system (Invitrogen, San Diego, CA), and yeast transformation was performed with the INVSc1 (MATa, his3-Δ1 leu2 trp-289 ura3-52) strain (47). The selected yeast colonies were grown in 1 liter of minimal medium (0.7% yeast nitrogen base and 0.1 mg/ml each of His, Leu, Trp) containing 2% glucose at 30 °C until an A600 = 0.2 had been achieved (2-3 days). The expression of TGase3 protein was induced by resuspension of yeast cells in fresh minimal medium containing 2% galactose and further incubation for 24 h. Following homogenization with glass beads, TGase activity was assayed in the yeast lysate using established methods (48). Dispase or ethanol activation was performed as described before (21, 49).
Northern Blotting Procedures-Total cellular RNA was prepared from human foreskin epidermis (50), newborn BALB/c mouse epidermis (45), and human and mouse keratinocytes grown to confluence in the presence of low (0.1 mM) or high (0.6 mM) Ca2+ (51, 52). Dr. Ulrike Lichti kindly provided a preparation of hair follicles from 5-day-old mice, from which RNA was isolated. Northern gels using denaturing conditions were loaded with 25 μg of total cellular RNA, performed as described (39), and calibrated with standard RNA size markers (GIBCO/BRL). Northern slot blots were prepared as described (46). In this case, the blots were calibrated with 10-, 1.0-, 0.1-, and 0.01-fmol amounts of probes encoding the full-length TGase1 (27), a 1.0-kbp PCR fragment of 3'-noncoding region for TGase3 adduced here (see Fig. 1), and a 0.7-kbp 3'-noncoding region of the published sequences of TGase2 (18) (see Table II for the two primers used). Aliquots of 10 μg of the several RNA samples were tested separately with the three TGase-specific probes. All Northern filters were washed with a final stringency of 0.5 × SSC at 65 °C for 30 min. The resulting x-ray blots were exposed for varying amounts of time in order to facilitate quantitation of the abundance of the specific mRNAs with a PhosphorImager (Molecular Dynamics Corp.).

Computer Analyses of Sequences-Nucleic acid and protein sequence homologies were performed using the University of Wisconsin software packages compiled by the Wisconsin Genetics Computer Group (53), the IBI Pustell sequence software (version 3.5, International Biotechnologies Inc.), and Geneworks sequence software (Intelligenetics Inc.), based on published algorithms (54-56).

FIG. 2. Nucleotide and deduced amino acid sequence information of mouse and human TGase3 enzymes. The initiation, termination and polyadenylation signal sequences are underlined. Nucleotide sequences are numbered following the initiation codon. The amino acid sequences are shown using the single letter code. In mouse, only variations from human are shown.

RESULTS AND DISCUSSION
Our initial attempts to locate clones for either mouse or human TGase3 in available λgt11 libraries using low stringency hybridizations with TGase1 or TGase2 probes or active site probes (27) were unsuccessful. Accordingly, we made TGase3-specific degenerate oligonucleotide probes derived from the amino acid sequences of tryptic peptides of TGase3 isolated from guinea pig epidermis. The implicit assumption was that the guinea pig, mouse, and human TGase3 proteins would share high degrees of sequence homology, as found for the TGase1 (13, 34) and TGase2 (18) systems.
Amino Acid Sequences of Guinea Pig TGase3 Tryptic Peptides-Although the 27-kDa fragment resulting from dispase treatment yielded a clean amino acid sequence for 28 cycles, corresponding to its amino terminus and the activation site of proteolytic cleavage, no useful sequence information could be obtained from the larger catalytic 50-kDa portion (21). Accordingly, using larger quantities, both portions were cleaved to completion with trypsin, and selected well resolved peptides, especially those containing cysteine residues, were chosen for sequencing. In this way, sequences from a total of 12 tryptic peptides (six from each of the 50- and 27-kDa portions) and the amino terminus of the 27-kDa portion were obtained (Table I). These represented 180 sequenced residues or about 25% of the total protein. Peptides 1 and 3 (order 3-1) are recognizable as constituting the active site region, based on comparisons with the known sequences of the TGase family members (4, 27). The minor amino acid substitutions in this active site region in relation to the other family members are diagnostic for the TGase3 system (see Table V).
Cloning by Anchored PCR and Deduced Amino Acid Sequences of Mouse and Human TGase3 Proteins-Degenerate oligonucleotide probes based on the above amino acid sequences of guinea pig TGase3 failed to identify positive clones ... (27) were found. Therefore, we used the oligonucleotide primers to amplify by PCR the DNA extended by primer P3- (Table II). One pair of primers (P1+, P3-; Fig. 1, Table II) yielded a product of 292 bp, which was subcloned into pGEM-3z. About 5% of such clones contained TGase-like sequences, including the active site region, which were identical to peptides 3+1 of the available guinea pig tryptic peptides (Table I). This finding afforded confidence that we were indeed amplifying the mouse TGase3 mRNA system. However, when this 292-bp probe was used to screen our λgt11 epidermal foreskin cDNA library (27), no clones were found, possibly due to the very low abundance of its mRNA (and see below). Accordingly, we used the exact sequence data of the 292-bp probe to extend the mouse TGase3 sequence by use of RNA-mediated anchored PCR as described under "Materials and Methods." First, we used one set of specific nested primers and another degenerate primer from the guinea pig peptide information (1a+/P10- and 1b+/P10-; P5+/2a- and P5+/2b-) (Fig. 1, Table II).

FIG. 4. ... (Table II), a 1-kbp cDNA probe encoding 3'-noncoding sequences of human TGase3 (lanes 4-6) (see Fig. 1). The individual strips were exposed for 14 days (lane 1), 2 days (lane 2), 4 days (lane 3), and 23 days (lanes 4-6). Positions of migration of RNA size markers are shown. B, Northern slot blots. Aliquots of 10 μg of RNA from the sources shown were probed with the above TGase-specific probes. X-ray films were exposed for several different times; this figure shows one exposure (for 6 days) only.

TABLE III (footnote). ... Lichti). These values were calculated from the Northern slot blots (Fig. 4B), based on calibrations of 10, 1, 0.1, and 0.01 fmol of known cloned probes.
The remainder of the 5'-end up to the cap site was recovered by primer extension, tailing with dG, and PCR amplification in two steps with nested primers (oligo(dC)/3a-; then oligo(dC)/3b-) (Fig. 1). The 3'-end sequence information was recovered in two steps by use of primer extension with a random hexamer, followed by tailing with dG. The cDNA products were amplified by PCR in two steps with two sets of nested primers (4a+/oligo(dC); then 4b+/oligo(dC)) and (5a+/oligo(dC); then 5b+/oligo(dC)) (Fig. 1, Table II).
The human TGase3 sequence was generated in essentially the same manner in three steps (Fig. 1), except that an oligo(dT) primer was used to generate the full-length 3'-noncoding information.
A series of further RNA-mediated anchored PCR experiments was performed using primers that crossed over those shown in Fig. 1 and Table II, in order to confirm the sequences and check them for PCR-induced mutations (lists of primers used are not shown). The identities of seven ambiguous nucleotides were resolved in additional PCR experiments.
The available nucleotide sequence information consists of 2297 nucleotides for mouse, including the entire 5'-noncoding information, but incomplete 3'-noncoding sequences (Fig. 2). The human data extend for 2645 nucleotides and are assumed to be near full-length because of the inclusion of the polyadenylation signal sequence (Fig. 2); thus its estimated mRNA size is about 2.8 kb. In both cases, there is an open reading frame of 2079 bp, so that both proteins contain 692 amino acids of calculated molecular mass 77.1 kDa (mouse) and 76.6 kDa (human), which are very close to the values adduced for guinea pig TGase3 by analytical ultracentrifugation and SDS-polyacrylamide gel electrophoresis experiments (20, 21). Interestingly, mouse TGase3 is near neutral in charge (pI 6.5) compared with human TGase3 (pI 5.6), findings that are also consistent with earlier chromatographic observations (21).

Recombinant Clones of TGase3 in Yeast Produce a Highly Active Enzyme-We performed experiments to confirm that the cDNA sequences adduced by the RNA-mediated anchored PCR methods described above in fact encode a functional TGase3 enzyme system. An intact full-length cDNA for the human TGase3 was assembled into a pYES yeast expression vector and transformed into yeast. Following induction, cells were lysed and assayed for TGase activity (Fig. 3). Yeast clones with an empty expression vector did not express detectable levels of TGase activity (<40 cpm), consistent with the view that yeast do not contain TGases (1). However, clones containing the full-length human TGase3 cDNA had significant TGase activity, comparable with that which can be extracted from human trunk epidermis (21) (Fig. 3). Moreover, we found that the activity of the yeast expression product could be activated 7.5-fold by ethanol and 15-fold by dispase (Fig. 3), values which are also very comparable with those found in the activation of the native epidermal TGase3 proenzyme (Fig. 3; Ref. 21).
Taken together, these data establish that the sequences determined in the present work encode a functional human pro-TGase3 enzyme system with expected properties similar to those of the native epidermal protein.
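The relation reported above between the 2079-bp open reading frame and the 692-residue proteins can be checked with simple arithmetic. The sketch below assumes that the 2079 bp include the termination codon (2079 / 3 = 693 codons) and uses an average residue mass of 110 Da, a textbook rule of thumb rather than a value taken from this work; the function name is hypothetical.

```python
def orf_summary(orf_bp, avg_residue_mass_da=110.0):
    """Residue count and rough mass (kDa) for an ORF that includes its stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("open reading frame length must be a multiple of 3")
    codons = orf_bp // 3
    residues = codons - 1  # assumption: the final codon is the termination codon
    approx_mass_kda = residues * avg_residue_mass_da / 1000.0
    return residues, approx_mass_kda


residues, mass_kda = orf_summary(2079)
print(residues, round(mass_kda, 1))
```

The rough estimate falls close to the calculated masses of 77.1 kDa (mouse) and 76.6 kDa (human); the exact values depend on the actual amino acid composition, which this shortcut ignores.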

Abundance and Expression of Mouse and Human TGase3 mRNAs-A series of cDNA probes containing specific 3'-noncoding sequence information for human TGase1 (27), TGase2 (generated by PCR; see Table II; Ref. 18), and TGase3 (generated with PCR primers 6a+/6b+, Fig. 1) were used to separately test human foreskin RNA on Northern blots (Fig. 4A). Four distinct bands are seen with a degenerate oligonucleotide probe for active site sequences (Ref. 27; Fig. 4, lane 1), which correspond to the four known TGase-like activities expressed in the epidermis. The TGase3 probe identified only the central mRNA species of about 2.9 kb (Fig. 4, lane 4). This is consistent with the size of the TGase3 mRNA adduced from the above sequencing data. Furthermore, it is now known that the mRNA encoding TGase2 is the largest (about 3.4 kb, Fig. 4, lane 3; Ref. 18) and that encoding TGase1 is smaller (about 2.7 kb, Fig. 4, lane 2; Ref. 27); the fourth and smallest band of about 2.4 kb corresponds to the mRNA for band 4.2 (22). Mouse epidermis and hair follicles also express a TGase3 mRNA species of the same size as for human (Fig. 4A, lanes 5 and 6), consistent with the biochemical data, which suggest that the epidermal and hair follicle TGase3 pro-enzymes are in fact the same gene product (12, 21). These highly specific probes displayed almost no cross-hybridization. The data therefore confirm the identity of our TGase3 probes.
Using slot blotting techniques, we also examined the expression characteristics of these mRNA species (Fig. 4B). By use of specific cloned probes as calibration standards for each TGase species, to account for variations in hybridization and labeling efficiencies, we could estimate the amounts of each species expressed in intact epidermis, hair follicles, or cultured cells (Table III). Whereas the TGase1 and TGase2 mRNAs are up-regulated in submerged liquid cultures, TGase3 mRNA is greatly diminished and essentially absent in low Ca2+ medium conditions. Furthermore, TGase3 expression is modestly up-regulated in media containing near-optimal levels of Ca2+, whereas the former two species are down-regulated. Thus the TGase3 system is regulated differently from the TGase1 and TGase2 enzymes. Rather, these data establish that the TGase3 system is regulated in the same general way as the other late epidermal differentiation products such as loricrin and profilaggrin (52) and keratins 1 and 10 (51). More importantly, these data support the view that the TGase3 enzyme is involved in a later stage of CE formation or assembly than the TGase1 enzyme (3, 32).
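One way such calibration standards could be used is sketched below: a straight line is fitted to log(signal) against log(amount) for the 10-, 1-, 0.1-, and 0.01-fmol standards and then inverted for an unknown sample. This is an assumed procedure for illustration; the fitting method actually used is not described here, the signal values are invented, and the function names are hypothetical.

```python
import math


def fit_loglog(amounts_fmol, signals):
    """Least-squares line through (log10 amount, log10 signal); returns (slope, intercept)."""
    xs = [math.log10(a) for a in amounts_fmol]
    ys = [math.log10(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_fmol(signal, slope, intercept):
    """Invert the calibration line to estimate the amount of target in a sample."""
    return 10 ** ((math.log10(signal) - intercept) / slope)


# Calibration amounts from the text; signal values are hypothetical.
standards_fmol = [10.0, 1.0, 0.1, 0.01]
standard_signals = [50000.0, 5000.0, 500.0, 50.0]

slope, intercept = fit_loglog(standards_fmol, standard_signals)
print(estimate_fmol(2500.0, slope, intercept))
```

Because each probe has its own standards, comparisons between TGase species do not depend on the probes having equal labeling or hybridization efficiencies.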
The data of Table III also show that in intact epidermis, the level of TGase1 mRNA is about five to seven times greater than that of TGase3. Although little information is currently available on the turnover rates or rates of translation of these mRNAs (compare Ref. 40 with Ref. 41 for TGase1), our present data imply that TGase1 is a more abundant enzyme in epidermis than TGase3. Nevertheless, activated TGase3 constitutes about 75% of total epidermal TGase enzymic activity (21). Therefore, it seems possible that the specific activity of the TGase3 enzyme is higher than that of TGase1. In order to resolve this question and to explore substrate specificities and preferences, we are currently expressing the TGase1 and TGase3 cDNAs for use with known CE substrate proteins.

Amino Acid Sequences of the Human, Mouse, and Guinea Pig TGase3 Proteins Are Not Highly Conserved-In Table IV are listed the several tryptic peptides generated for guinea pig TGase3 that were found in the mouse and human TGase3 sequences. The comparisons further extend confidence for the correct identity of these sequences. In addition, the availability of the amino-terminal information of the 27-kDa fragment formed on proteolytic cleavage activation enabled identification of the activation region in the mouse and human TGase3 proteins as well (Table IV).

Previous studies have shown that the sequences of human and mouse TGase1 (13, 34) and TGase2 (18) enzymes have been very highly conserved; sequences show identities of about 93% and homologies of about 97%. In contrast, the data of Fig. 2 reveal that mouse and human TGase3 sequences have deviated more widely (Table V). Overall, the sequences show 75% identity and 84% homology, with the 27-kDa fragment generated following proteolytic activation somewhat less conserved (71% identity and 81% homology). Interestingly, the amino acid sequences of the available tryptic peptides of the guinea pig TGase3 show far more variation from mouse and human (Table V), such that the 27-kDa fragment displays as little as 45% sequence identity in available comparable sequences. Most of the variations have occurred in the vicinity of the proteolytic activation site (Table IV), which may mean that the different species have evolved alternate mechanisms for proteolytic activation of the TGase3 pro-enzyme. These sequence variations can account for the difficulties we initially encountered in generating mouse and human sequence information using the guinea pig data.
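The distinction drawn above between identity and homology scores can be illustrated with a short sketch that scores two pre-aligned sequences. Several choices here are assumptions rather than the procedure of the software cited under "Materials and Methods": homology is taken to mean identical residues plus conservative substitutions, the conservative groups are one common grouping, gap columns are skipped, and the two aligned strings are invented toy data.

```python
# Assumed grouping of residues regarded as conservative substitutions.
CONSERVATIVE_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_and_homology(aligned_a, aligned_b):
    """Percent identity and percent homology over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = homologous = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # assumption: columns containing a gap are not scored
        compared += 1
        if x == y:
            identical += 1
            homologous += 1
        elif any(x in group and y in group for group in CONSERVATIVE_GROUPS):
            homologous += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared, 100.0 * homologous / compared


# Toy aligned sequences, for illustration only.
identity, homology = identity_and_homology("ACDEFGHIKL", "ACEEFGHLK-")
print(round(identity, 1), round(homology, 1))
```

Scores computed this way depend on the alignment, the gap convention, and the substitution groups chosen, so they are comparable only when the same conventions are applied to every pair of sequences.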

Comparisons Show That Human TGase3 Is Distantly Related ... proteolytic activation. Overall homology and identity scores between the five TGases are shown in Table V. We have chosen to analyze only those sequences bounded by the conserved intron locations identified previously (13), which presumably delineate the conserved structural regions of the TGases. Each TGase chain deviates widely at its termini in both sequence and length, which thus does not admit meaningful comparisons (Fig. 5). The human TGase3 protein is most closely related to TGase1 and TGase2, and more similar to band 4.2 than factor XIIIa, although band 4.2 is least related to the other TGases. The evolutionary significance of these findings is not yet clear; however, analysis of gene structures has suggested to other investigators that the band 4.2 protein either has been least conserved during evolution, perhaps because it is not a functional enzyme (22, 23), or evolved before TGase1, TGase2, and factor XIIIa (38). Analyses of the gene structure and chromosomal location of human TGase3 are in progress and may clarify these evolutionary questions.

FIG. 5. Updated alignment of amino acid sequences of human TGase-like proteins. The sequences were aligned to maximize homologies according to the protocol of Pearson and Lipman (56). Alignments of the amino-terminal sequences are arbitrary. The arrowhead marks the presumed site of proteolytic cleavage required for activation of TGase3. Homology and identity scores (see Table V ...

The TGase3 Proteins Consist of Two Globular Domains Separated by a Flexible Hinge at the Site of Activation-Secondary structural analyses (53-56) of the human and mouse TGase3 proteins reveal multiple interspersed regions of turns, sheet structures, and α-helix, in both the 50-kDa amino-terminal and 27-kDa carboxyl-terminal portions (Fig. 6). In general, these features suggest a folded compact configuration (13, 27). However, the 12-residue insertion immediately following the cleavage site required for activation of the pro-enzyme describes a prominent protein turn that is surrounded by sequences that are the most hydrophilic and flexible in the entire protein (Fig. 6). Thus this sequence describes a flexible hinge region and is likely to be located near the surface of the molecule. From these observations, we can infer that the intact TGase3 molecules adopt an elongated shape consisting of two globular domains, a larger amino-terminal and a smaller carboxyl-terminal, that are separated by a flexible hinge corresponding to the activation site. This is flanked by highly polar residues, predicted to lie near the surface of the protein, that may be accessible to and involved in recognition by the activating protease(s). We predict that following cleavage, the hinge region collapses, promoting a more compact configuration that greatly enhances catalytic activity of the TGase3 molecule. These predictions are entirely consistent with the hydrodynamic observations that the guinea pig TGase3 pro-enzyme is an elongated molecule (21). In contrast, no other members of the TGase family possess a flexible hinge region (Fig. 5), and all are predicted to adopt a compact globular form (13, 27). Further structural studies on expressed forms of human TGase3 are in progress to explore the implications of these predictions on its activation, enzymic function, and substrate specificity.
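The kind of hydrophilicity scan that underlies statements such as "the most hydrophilic ... in the entire protein" can be sketched with a sliding-window average. The Kyte-Doolittle hydropathy scale is used here as a stand-in, since the specific scale and window used for Fig. 6 are not given in the text; the window of 7 residues and the function names are likewise assumptions. The input is the guinea pig sequence at the activation site from Table I, because the full human and mouse sequences are not reproduced here.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Start index, residues, and mean score of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Amino terminus of the guinea pig 27-kDa fragment (Table I).
activation_site = "AQRSPGREQAPSISGRFKVNGVLAVGQE"
print(most_hydrophilic_window(activation_site))
```

A scan over a full-length sequence would be needed to reproduce the comparison across the entire protein; this fragment only shows how a local minimum in the profile is located.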