The Nucleotide and Partial Amino Acid Sequence of Toxic Shock Syndrome Toxin-1*

The nucleotide sequence of toxic shock syndrome toxin-1 (TSST-1) has been determined. In addition, one-third of the predicted amino acid sequence was confirmed by amino acid sequence analysis of cyanogen bromide-generated TSST-1 protein fragments. The DNA sequencing results identified a 708-base pair open reading frame starting with an ATG, 7 base pairs downstream from a Shine-Dalgarno sequence, and terminating at a UAA stop codon. Amino acid analysis of the intact protein defined the NH2 terminus of the mature protein and located the cleavage point for the signal peptide (Ala/Ser). The signal peptide contained the first 40 amino acids and had characteristic structural similarities with other bacterial signal peptides. The coding sequence of the mature protein was 585 base pairs (194 amino acids) in length, and the molecular weight of the predicted protein was 22,049. This is in good agreement with the previously reported molecular weight of TSST-1 (22,000), as determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. NH2-terminal amino acid sequence analysis performed on isolated TSST-1 CNBr fragments determined the position of the peptides in the TSST-1 sequence and verified the predicted amino acid sequence in those positions. Computer analyses of the amino acid sequence showed that TSST-1 has little or no sequence homology with biologically related toxins, streptococcal pyrogenic exotoxin A, and staphylococcal enterotoxins B and C.
Toxic shock syndrome (TSS) is a multisystem illness characterized by the acute onset of high fever, hypotension or dizziness, rash, desquamation of skin upon recovery, and variable multisystem involvement (1-4). Staphylococcus aureus producing TSST-1 has been isolated from nearly 100% of menstrual-associated TSS patients (5, 6). Since bacteremia rarely occurs in TSS, it was proposed that a staphylococcal exotoxin was causing the widespread systemic effects. TSST-1 has been cited by several investigators as a major toxin most likely responsible for the symptoms of TSS (5-8).
The biological activities of TSST-1 have been examined extensively to build a relationship between the toxin's effects and the symptoms of TSS. The biological activities include: the capacity to induce fever, enhancement of host susceptibility to lethal shock by endotoxin, nonspecific T lymphocyte mitogenicity, suppression of immunoglobulin M synthesis against sheep erythrocytes, enhancement of delayed-type hypersensitivity reactions, and induction of immunological tolerance in certain rabbits (9). The TSS model proposed by Schlievert (10) correlates these biological functions with the pathology seen in TSS patients.
To date, TSST-1 has been purified and biochemically characterized (11). In addition, the TSST-1 structural gene has been cloned from the bacterial chromosome (12). The toxin was purified to homogeneity by differential precipitation with ethanol and resolubilization in water followed by successive electrofocusing in pH gradients of 3-10 and 6-8 (11). TSST-1, thus isolated, migrated as a single band in SDS-polyacrylamide gels at a molecular weight of 22,000. The toxin gene was localized within a 10.6-kb chromosomal insert contained in plasmid pRN6100 and was previously shown to produce TSST-1. Analysis of subclones of the 10.6-kb insert in pBR322 assigned the structural gene to the leftmost 3.2-kb ClaI fragment (12).
Here we report the nucleotide and partial amino acid sequence analysis of TSST-1. These experiments were conducted to determine the complete amino acid sequence of TSST-1 as an approach to understanding the multiple structure-function relationships of this toxin molecule.
* This work was supported by National Institutes of Health Research Grant AI22159. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

MATERIALS AND METHODS
M13 and DNA Sequencing-DNA sequencing was performed by the dideoxynucleotide chain termination method of Sanger et al. (13), following the subcloning of isolated DNA fragments from pRN6101 into M13 vectors mp10 and mp11 (12). A universal M13 sequencing primer (17 nucleotides, Boehringer Mannheim) was used in the annealing reaction.
Preparation of TSST-1 CNBr Fragments-TSST-1 was purified as previously described (11). The CNBr reactions were performed as reported by Gross (14) using a 2000:1 molar ratio of CNBr:TSST-1. The formic acid and excess CNBr reagent were removed by lyophilization. As a control, toxin was incubated with formic acid alone and then analyzed by gel electrophoresis. No significant protein degradation occurred due to the acidic reaction conditions.
Separation of CNBr Fragments by Gel Filtration-Lyophilized TSST-1 fragments (5-10 mg) were solubilized in 30% acetic acid and then loaded onto a Sephadex G-75 column (1.7 × 120 cm) equilibrated with 10% acetic acid. Fractions (2.0 ml each) were eluted with 10% acetic acid at a flow rate of 12 ml/h. The acetic acid was removed from each fraction by lyophilization. After lyophilization, column fractions containing a visible amount of material were suspended in 0.2 ml of water. The protein content of each fraction was then analyzed on a 10-18% linear gradient SDS gel (1-5 µg/lane) to ascertain which toxin fragments were present in each sample. Protein concentrations were estimated by use of the Bio-Rad assay.
Urea-SDS-Gradient Polyacrylamide Gel Electrophoresis-The approximate molecular weights of the CNBr-generated TSST-1 fragments were determined using a 10-18% linear gradient SDS gel containing 7 M urea and a cross-linking ratio of 20:1 (15, 16). The gel bands were visualized by silver staining (17). Estimates of fragment sizes were obtained by comparison with standards: ovalbumin (43,000), α-chymotrypsinogen (25,700), β-lactoglobulin (18,400), lysozyme (14,300), bovine trypsin inhibitor (6,200), and insulin chain ...
Amino Acid Sequencing-Amino acid sequences were determined by automated Edman degradation in a gas-phase sequenator (Applied Biosystems Model 470A, under program 03RPTH) as described by Hewick et al. (18). The purity of each CNBr fragment and intact TSST-1 analyzed was assessed by SDS-gel electrophoresis. Purified peptide and intact toxin samples were immobilized in the reaction cartridge by means of Polybrene-impregnated glass fiber filters. The amount of peptide loaded in each run was in the range of 80-400 pmol. An aliquot (1/2) of the phenylthiohydantoins at each cleavage was injected directly into a high performance liquid chromatograph (Model 120A, Applied Biosystems, Foster City, CA). Resolution of the phenylthiohydantoins was accomplished on a reverse-phase column (Brownlee "Spheri-5" C18, 0.21 × 22 cm) with a complex gradient composed of 114 mM acetate, pH 4.0, in 5% tetrahydrofuran ...
Computer Analysis-The amino acid sequences of TSST-1, staphylococcal enterotoxins B and C (19, 20), and streptococcal pyrogenic exotoxin type A (21, 22) were compared using the computer program Fast Protein Data Base (FASTP) written by Lipman and Pearson (23; sequence analysis programs distributed by the National Institutes of Health). This program is based on modifications of the algorithm of Wilbur and Lipman (23). To determine the statistical significance of sequence similarities, we employed Monte Carlo analysis using another algorithm written by Lipman and Pearson.
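The idea behind a Monte Carlo significance test for sequence similarity can be sketched as follows. This is a minimal illustration only and is not the Lipman and Pearson program used in this work: the similarity score (best ungapped count of identical residues), the function names, the number of shuffles, and the second test peptide are all assumptions chosen for brevity. One sequence is shuffled repeatedly, each shuffled copy is rescored, and the observed score is expressed as a z value relative to the shuffled distribution.

```python
import random
import statistics


def best_diagonal_identities(a, b):
    """Largest number of identical residues over all ungapped offsets of b against a."""
    best = 0
    for offset in range(-(len(b) - 1), len(a)):
        matches = 0
        for j, residue in enumerate(b):
            i = offset + j
            if 0 <= i < len(a) and a[i] == residue:
                matches += 1
        best = max(best, matches)
    return best


def shuffle_z_score(a, b, trials=200, seed=0):
    """Compare the observed score with scores for shuffled copies of b.

    Shuffling preserves the composition of b but destroys its order, so the
    shuffled scores estimate what chance alone would produce.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = best_diagonal_identities(a, b)
    residues = list(b)
    shuffled_scores = []
    for _ in range(trials):
        rng.shuffle(residues)
        shuffled_scores.append(best_diagonal_identities(a, "".join(residues)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        z = 0.0 if observed == mean else float("inf")
    else:
        z = (observed - mean) / sd
    return observed, mean, sd, z


# NH2-terminal residues of mature TSST-1 as reported in this paper.
tsst1_n_terminus = "STNDNIKDLLDWYSSGSDTFTN"
# Hypothetical unrelated peptide, invented purely for illustration.
unrelated_peptide = "GAVKRPELHQWCMFYGAVKRPE"

self_result = shuffle_z_score(tsst1_n_terminus, tsst1_n_terminus)
unrelated_result = shuffle_z_score(tsst1_n_terminus, unrelated_peptide)
print("self comparison z:", round(self_result[3], 1))
print("unrelated comparison z:", round(unrelated_result[3], 1))
```

A large z value indicates a similarity unlikely to arise from composition alone; a value near zero, as expected for the unrelated peptide, indicates that any matches are within the range produced by chance.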

RESULTS AND DISCUSSION
Nucleotide Sequence Analysis of the TSST-1 Gene-Previously, subclones of pRN6101 were isolated which failed to express the TSST-1 protein (12). These results indicated that a 300-bp BamHI-HincII fragment was required for expression and established the sequencing strategy (Fig. 1) since it could be assumed that this region was within or very close to the TSST-1 structural gene. Using the dideoxy chain termination method of Sanger (13), the overlapping 1.1-kb HincII and the 1.0-kb BamHI fragments, which bracket the gene-specific region, were separately cloned into coliphage M13 and sequenced. The completion of the entire nucleotide sequence of the tst gene and its controlling regions required additional cloning and sequencing of subfragments depicted in Fig. 1. The final percentage of the sequence determined from both strands was 70%. The toxin sequence, presented in the 5' to 3' orientation, starts in the AluI fragment and extends to the 3' end in the HincII fragment (Fig. 2). The sequencing data identified a 705-bp open reading frame starting with an ATG at nucleotide position 478 and terminating at a TAA at nucleotide number 1180. A good consensus Shine-Dalgarno site lies 7 bp upstream of the ATG start position (dotted line in Fig. 2). Amino acid sequence analysis of intact TSST-1 identified the NH2 terminus of the mature TSST-1 protein as: Ser-Thr-Asn-Asp-Asn-Ile-Lys-Asp-Leu, thus confirming previous studies of Igarashi et al. (24). In addition, these data defined the cleavage point for the signal peptide at an Ala/Ser sequence and determined the length of the signal peptide to be 40 amino acids. The signal peptide is indicated by a solid line in Fig. 2.
The TSST-1 signal peptide contained the characteristic structural homologies found in other bacterial signal peptides: (a) 1-3 basic amino acids at the amino terminus, (b) a hydrophobic region of approximately 15 residues, (c) a Pro or Gly in the hydrophobic core, (d) a Ser or Thr near the carboxyl terminus of the core, and (e) an Ala or Gly at the cleavage site (25).
The coding sequence of the mature protein was 585 bp in length (194 amino acid residues), and the calculated molecular weight was 22,049, which is in complete agreement with the molecular weight of TSST-1 determined by SDS-polyacrylamide gel electrophoresis, 22,000 (11).
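The coordinates and lengths reported in this section can be cross-checked with simple arithmetic. The sketch below rests on two assumptions that the text does not state explicitly: that nucleotide 1180 is the first base of the TAA codon, and that the 705-bp and 585-bp figures each include the 3-bp stop codon. Under those assumptions the numbers are mutually consistent.

```python
# Values reported in the text.
ATG_START = 478          # first base of the ATG start codon
STOP_START = 1180        # assumed to be the first base of the TAA stop codon
SIGNAL_PEPTIDE_AA = 40   # residues in the signal peptide
MATURE_AA = 194          # residues in the mature protein
STOP_CODON_BP = 3

# Bases from the ATG up to, but not including, the stop codon.
coding_bp = STOP_START - ATG_START
orf_bp_with_stop = coding_bp + STOP_CODON_BP
precursor_aa = coding_bp // 3
mature_bp_with_stop = MATURE_AA * 3 + STOP_CODON_BP

print("coding bases:", coding_bp)                       # 702
print("ORF including stop codon:", orf_bp_with_stop)    # 705
print("precursor residues:", precursor_aa)              # 234
print("mature coding + stop:", mature_bp_with_stop)     # 585
print("signal + mature residues:", SIGNAL_PEPTIDE_AA + MATURE_AA)  # 234
```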
The mixture of CNBr-generated toxin fragments was then applied to a Sephadex G-75 gel filtration column. The elution profile of the column fractions showed one broad protein peak. Therefore, each column fraction containing a visible  amount of lyophilized material was analyzed on a 10-18% linear gradient SDS gel (gel not shown). The SDS gel analysis showed that fraction numbers 39 and 61 contained separated, homogeneous bands of CN3 (14 kDa) and CN4 (6-8 kDa), respectively. Fraction number 35 contained both CN1 (18 kDa) and CN2 (17 kDa) fragments; none of the column fractions contained an isolated preparation of CN1 and CN2. CN5 was not recovered from the column. Fraction numbers 35, 39, and 61, as well as intact TSST-1, were used in the amino acid sequence analysis.

Fig. 2 (excerpt, NH2 terminus of the mature protein):
TCT ACA AAC GAT AAT ATA AAG GAT TTG CTA GAC TGG TAT AGT AGT GGG TCT GAC ACT TTT ACA AAT
Ser Thr Asn Asp Asn Ile Lys Asp Leu Leu Asp Trp Tyr Ser Ser Gly Ser Asp Thr Phe Thr Asn
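The correspondence between the codons and residues in the excerpt above can be checked by translating the nucleotides with the standard genetic code. The sketch below assumes that the codons damaged in reproduction read AAC, AAG, AGT, GAC, TTT, and AAT, which are the readings consistent with the amino acids printed beneath them; the function and variable names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) codon by codon, stopping at a stop codon."""
    dna = dna.replace(" ", "").upper()
    residues = []
    for position in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[position:position + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# Codons from the figure excerpt, with the damaged codons read as assumed above.
mature_start = (
    "TCT ACA AAC GAT AAT ATA AAG GAT TTG CTA GAC "
    "TGG TAT AGT AGT GGG TCT GAC ACT TTT ACA AAT"
)

peptide = translate(mature_start)
print(peptide)  # STNDNIKDLLDWYSSGSDTFTN
```

The first nine residues, STNDNIKDL, match the NH2-terminal sequence Ser-Thr-Asn-Asp-Asn-Ile-Lys-Asp-Leu obtained from the intact protein.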
Amino Acid Sequence Analysis of the CNBr Peptides-Purified protein fragments were analyzed by automated Edman degradation to determine the amino acid sequence of each fragment. These data were used to determine the position of each CNBr fragment in the TSST-1 molecule and to verify the inferred amino acid sequence in those positions. Furthermore, defining the position of each fragment is critical for structure-function studies; localizing a specific biological activity to a mapped fragment will ultimately determine the functional domains of the intact protein.
Amino acid sequence analysis of purified CN3 and CN4 mapped their positions on the TSST-1 protein as shown in Fig. 4. A sample containing CN1 and CN2 generated two unambiguous peaks at each amino acid position (24 cycles, repetitive yield 91%). These two signals were easily resolved by examining the inferred amino acid sequence; one signal depicted a peptide originating at the NH2 terminus, the second signal described a fragment starting at methionine residue 33. The positions of CN1 and CN2 were determined by comparing the SDS gel estimated molecular weights and the molecular weights calculated from the predicted amino acid sequence (Table I). Peptide CN5 was not retrieved from the gel filtration column and, therefore, was not sequenced (indicated by the completely open box in Fig. 4). The position of CN5 is proposed based on the SDS gel estimated molecular weight and the calculated molecular weight of the predicted sequence.
All of the SDS gel estimated molecular weights are in good agreement with the predicted amino acid sequence values except for CN4 (Table I). CN4 is an acidic fragment, and it is known that SDS-polyacrylamide gel electrophoresis overestimates the molecular weight of several other acidic proteins (26, 27). Each of the samples described above was analyzed once, but analysis of the CN1, CN2 preparation verified the earlier CN3 results, and intact toxin analysis confirmed and extended the CN2 assignments.
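Because CNBr cleaves a polypeptide on the carboxyl side of methionine residues, the expected fragments can be enumerated directly from a predicted sequence. The following sketch shows that bookkeeping for complete cleavage. The function name and the toy peptide are hypothetical and are not taken from the TSST-1 sequence; incomplete cleavage, which would yield larger fragments spanning adjacent pieces, is not modeled.

```python
def cnbr_fragments(protein):
    """Return (start, end, sequence) for complete cleavage after every Met.

    Positions are 1-based and inclusive. A Met at the very end of the chain
    produces no additional fragment.
    """
    fragments = []
    start = 0
    for index, residue in enumerate(protein):
        if residue == "M" and index < len(protein) - 1:
            fragments.append((start + 1, index + 1, protein[start:index + 1]))
            start = index + 1
    fragments.append((start + 1, len(protein), protein[start:]))
    return fragments


# Hypothetical toy peptide with two internal methionines.
toy_peptide = "STNDMKDLLMWYSG"

for first, last, sequence in cnbr_fragments(toy_peptide):
    print(first, last, sequence)
# 1 5 STNDM
# 6 10 KDLLM
# 11 14 WYSG
```

Comparing the calculated sizes of such fragments with the sizes estimated on SDS gels is the same kind of reasoning used here to place fragments that could not be sequenced directly.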
The boxed sequences in Fig. 2 identify the inferred amino acid sequence which was confirmed by amino acid sequence analysis of intact TSST-1 and CNBr-generated fragments. Approximately one-third of the nucleotide sequence was confirmed. Note that amino acid residues 25-33 were determined by sequence analysis of intact TSST-1 (38 cycles, repetitive yield 97%).
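The repetitive yields quoted for the sequencer runs indicate how quickly the signal decays over successive Edman cycles. As a rough illustration, assuming a simple geometric model in which each cycle retains a constant fraction of the previous cycle's signal (a simplification, and the function name is illustrative), the fraction remaining at the last cycle of each run can be estimated:

```python
def remaining_fraction(repetitive_yield, cycle):
    """Signal at a given cycle relative to cycle 1, assuming a constant per-cycle yield."""
    return repetitive_yield ** (cycle - 1)


# Runs reported in the text: intact TSST-1 and the CN1, CN2 preparation.
intact_toxin = remaining_fraction(0.97, 38)
cn1_cn2_mixture = remaining_fraction(0.91, 24)

print(round(intact_toxin, 2))     # roughly 0.32
print(round(cn1_cn2_mixture, 2))  # roughly 0.11
```

Under this model a 97% repetitive yield leaves about a third of the initial signal after 38 cycles, whereas a 91% yield leaves about a tenth after 24 cycles, which is consistent with the longer read obtained from the intact toxin.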
Analysis of TSST-1 Amino Acid Sequence-Previously, we reported the amino acid composition of TSST-1 predicted from the nucleotide sequence (11). The subsequent CNBr fragment amino acid sequence analysis showed a few changes in the predicted amino acid sequence. The changes occurred in those areas where there was some ambiguity in the reading of the gel sequences. The amino acid sequence analysis of the CN1, CN2 preparation and of intact TSST-1 identified three excess nucleotides between amino acid numbers 31-32, 39-40, and 43-44. The corrected predicted amino acid composition is listed in Table II. As previously noted, this composition correlates closely with other TSST-1 protein compositions (5, 24), excluding the differences in cysteine residues reported to be present by Reiser et al. (28).
The most interesting features of the toxin's amino acid sequence are the abundance of hydrophobic residues, the clusters of proline residues, and the two predicted β-turns. Approximately 25% of the total amino acids in TSST-1 are hydrophobic residues. Also, the TSST-1 amino acid sequence had four different areas containing clusters of proline residues: amino acids 48-56, 95-101, 112-117, and 179-180. Evaluation of the secondary structure of TSST-1 by the Chou-Fasman method suggested the presence of two β-turns at residues 35-39 and 47-50 (29).
TSST-1 Amino Acid Sequence Homology with Related Toxins-Previously, studies have shown that staphylococcal enterotoxins B and C1 and streptococcal pyrogenic exotoxin type A have highly significant protein sequence homology (19-22). Since TSST-1 belongs to the same general family, based upon shared biological activities, analyses were performed to compare the toxin protein sequences. No homology was observed between TSST-1 and enterotoxins B and C1. Minimal homology was seen between TSST-1 and streptococcal pyrogenic exotoxin type A. However, Monte Carlo analysis indicated that this minimal homology was not significant. Furthermore, streptococcal pyrogenic exotoxin A and staphylococcal enterotoxins B and C1 showed some serological cross-reactivity, as detected by Western blot analysis using polyclonal antisera against each toxin (data not shown). In support of the sequence homology results, none of the related toxins showed any cross-reactivity with TSST-1 antiserum, nor did the TSST-1 band cross-react with antisera against the other toxins.
In summary, knowing the complete amino acid sequence for TSST-1 and its CNBr fragments now allows for careful examination of the structure-function relationships of this immunoregulatory toxin. We plan to localize the biological activities retained in the CNBr fragments and to use DNA technology to further specify the particular amino acids involved.