Cuttlefish Spermatid-specific Protein T
MOLECULAR CHARACTERIZATION OF TWO VARIANTS T1 AND T2, PUTATIVE PRECURSORS OF SPERM PROTAMINE VARIANTS Sp1 AND Sp2*

In cuttlefish, as in selachians and mammals, spermiogenesis is characterized by the double nuclear protein transition histones → intermediate protein (protein T) → protamine (protein Sp).
The cuttlefish protein T, which consists of two structural variants phosphorylated to different degrees, is the first invertebrate spermatid-specific protein to be fully characterized and sequenced. The primary structures of these two variants were established from sequence analysis and mass spectrometric data of the proteins and their fragments. T1 and T2 are two highly related proteins of 78 and 77 residues, respectively, which differ only by four conservative substitutions, two inversions Ser ↔ Arg, and the deletion of 1 residue of arginine in variant T2. The asymmetrical distribution of the hydrophobic and basic residues determines two well-defined domains: an amino-terminal domain (residues 1-21) devoid of arginine and aromatic residues and containing all the aliphatic hydrophobic residues, and a highly basic carboxyl-terminal domain (residues 22-77 or 78) that contains 77% of arginine, all the tyrosine residues, and most of the phosphorylated serine residues present in the protein.
The complete structural identity of the basic carboxyl-terminal domain of spermatidal proteins T1 and T2 with the protamine variants Sp1 and Sp2 isolated from cuttlefish spermatozoa strongly suggests that T1 and T2 could be precursors of Sp1 and Sp2, respectively.
Cuttlefish spermiogenesis is characterized by a double nuclear basic protein transition (1). A spermatid-specific protein called protein T appears in round spermatids, where it transiently replaces somatic-type and/or testis-specific histones. It then disappears from elongated spermatids, where it is replaced by a typical protamine named protein Sp (2). This protamine is the major basic protein associated with DNA in the mature spermatozoa of cuttlefish. It consists of two structural variants, Sp1 and Sp2, which differ only by the position of 2 residues of serine and by an additional residue of arginine in

* This work was supported by grants from the Centre National de la Recherche Scientifique, from the Université de Lille II, and from the Fondation à la Recherche Médicale. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Among vertebrates, only mammals (7-10) and a selachian, the dogfish (11), were found to have a double nuclear protein transition during spermiogenesis, in which several spermatid-specific proteins and protamines are generally involved. Moreover, some of the mammalian protamines appear to be synthesized as precursor molecules. Thus, in mouse, a cDNA coding for a putative precursor of protamine mp2 has been identified (12). In man, intermediate basic nuclear proteins HPI1, HPS1, and HPS2 are structurally related to protamines HP2 and HP3, and it has been suggested that they could be precursors of these protamines (13-15). In contrast, the two spermatidal proteins S1 and S2 of the dogfish do not exhibit any structural relationship with any of the four scylliorhinines and cannot be considered as protamine precursors (16).
This paper deals with the characterization and elucidation of the amino acid sequence of the two variants T1 and T2 of the cuttlefish spermatid-specific protein T. The close structural relationship observed between these variants and the protamine variants Sp1 and Sp2 strongly suggests that the spermatidal proteins T1 and T2 could be the precursors of Sp1 and Sp2, respectively.

EXPERIMENTAL PROCEDURES¹

Evidence for the Existence of Structural Variants of Protein T
Spermatid-specific protein T was obtained in both 0.4 M HCl and 5 M guanidinium chloride extracts from cuttlefish testis chromatin, where it represents 28% of the total nuclear basic proteins (1). Protein T can be separated from protamine Sp by fractionation of the guanidinium chloride extract on a C18 μBondapak column (1) or from somatic-type histones by fractionation of the acid extract on a C8 Ultrapore column (Fig. S1A). In this case, protein T is eluted first (fraction 1), before histone H1 (fraction 2) and core histones (fraction 3) (Fig. S1B). Protein T migrates as several bands on urea/acetic acid polyacrylamide gel (Fig. S1B). We have shown previously that this apparent heterogeneity is related to different levels of phosphorylation of the protein (1).

¹ Portions of this paper (including "Experimental Procedures," Figs. S1-S6, and Tables SI-SV) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.

The first data obtained from automated Edman degradation of whole protein T revealed three microheterogeneities in the amino-terminal sequence of the protein (residues 1 to 50) at positions 7 (Ser/Thr), 12 (Ala/Val), and 16 (Glu/Asp) and indicated that this protein was indeed a mixture of at least two polypeptides, one of them lacking a residue of methionine at its amino-terminal end.
These preliminary results established clearly the asymmetry of the protein since all the hydrophobic residues are accumulated in the amino-terminal part of the molecule (residues 1-21) and all the arginine residues are accumulated in the carboxyl-terminal part (residue 22 to the carboxyl terminus). The 4 lysine residues are regularly distributed within the amino-terminal region, at positions 2, 9, 14, and 19.
In the early stages of our work, the cleavage of dephosphorylated protein T with endoproteinase Lys-C was intended to obtain the carboxyl-terminal basic fragment K-4 (residues 20-77 or 78) necessary to establish the structural relationship between spermatidal protein T and protamine Sp. Indeed, the amino acid composition of K-4 only differs from that of protamine Sp by 2 additional residues of glycine (Table I). Furthermore, the limited chymotryptic digests of K-4 and protamine Sp have almost identical electrophoretic patterns (Fig. S2).

Separation of Protein T Variants
The separation of the two variants was achieved on a C18 μBondapak column, using a stepwise gradient of acetonitrile (Fig. S3). Under these conditions, variants T1 and T2 were obtained in pure form only in fractions 4 and 6, respectively. These variants of similar electrophoretic mobilities have markedly different amino acid compositions (Table I).
The molecular mass of variant T1 was determined through the use of electrospray mass spectrometry. Three subfractions of decreasing abundance were identified in T1 with the following masses: 10,872 ± 2 Da, 10,953 ± 2 Da, and 10,788 ± 4 Da (Fig. 1). These data correspond to the calculated mass for variant T1 (10,632.5 Da), with the addition of three, four, or two phosphate groups, respectively. The triphosphorylated form of variant T1 was found to be predominant, whereas the tetra- and diphosphorylated forms are minor.
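The assignment of phosphate groups from these masses can be checked with simple arithmetic. The sketch below is illustrative only and is not part of the original analysis: it assumes an average mass increment of 79.98 Da per phosphate group (HPO3), and the function and variable names are ours.

```python
# Assumption: each phosphate group adds about 79.98 Da (average mass of HPO3).
PHOSPHATE_DA = 79.98

# Values quoted in the text for variant T1.
CALCULATED_T1_DA = 10632.5
MEASURED_DA = [10872.0, 10953.0, 10788.0]


def phosphate_count(measured: float, calculated: float) -> int:
    """Nearest whole number of phosphate groups explaining the mass excess."""
    return round((measured - calculated) / PHOSPHATE_DA)


counts = [phosphate_count(m, CALCULATED_T1_DA) for m in MEASURED_DA]

for measured, n in zip(MEASURED_DA, counts):
    expected = CALCULATED_T1_DA + n * PHOSPHATE_DA
    # The residual shows how far the measurement is from the n-phosphate mass.
    print(f"{measured:8.1f} Da: {n} phosphate(s), residual {measured - expected:+.1f} Da")
```

Under this assumption the three measured masses map to three, four, and two phosphate groups, in the same order as given in the text.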

Automated Edman Degradation
Each protein T variant was submitted to automated Edman degradation. The data obtained up to cycle 42 (Table SII) corroborate the preliminary results obtained from the whole protein T and allow us to identify unambiguously the amino acids at positions 7, 12, 16, 18, 34, and 35 in each variant. Moreover, each of the two variants T1 and T2 was proved to be itself a mixture of two molecules, T1a and T1b or T2a and T2b, only differing by the presence of a methionine residue at the amino-terminal end of the major molecular species (T1a and T2a). Thus, cuttlefish spermatid-specific protein T consists, in fact, of a mixture of four structural variants of similar electrophoretic mobilities and phosphorylated to different degrees. From the yields of the peptides obtained by cleavage of the whole protein T with endoproteinase Lys-C and the yields of phenylthiohydantoin-Met and phenylthiohydantoin-Lys at the first cycle of Edman degradation of T1 and T2, the relative amount of each variant was calculated as follows: T1a, 60.5%; T1b, 6.2%; T2a, 32.1%; and T2b, 1.2%.
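As a quick consistency check on these figures, the four percentages can be pooled by variant and by the presence of the amino-terminal methionine. This is only an illustrative calculation on the values quoted above; the variable names are ours.

```python
# Relative amounts (%) of the four molecular species, as quoted in the text.
percent = {"T1a": 60.5, "T1b": 6.2, "T2a": 32.1, "T2b": 1.2}

# Rounded to one decimal to avoid floating-point noise in the sums.
total = round(sum(percent.values()), 1)

# Pool the "a" and "b" forms of each variant.
t1_total = round(percent["T1a"] + percent["T1b"], 1)
t2_total = round(percent["T2a"] + percent["T2b"], 1)

# The "a" forms are the species that retain the amino-terminal methionine.
with_met = round(percent["T1a"] + percent["T2a"], 1)

print(f"total {total}%, T1 {t1_total}%, T2 {t2_total}%, Met-retaining {with_met}%")
```

On these numbers, T1 accounts for about two-thirds of protein T and T2 for about one-third, and the methionine-retaining forms make up a little over 90% of the total.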

Enzymatic Hydrolyses
The carboxyl-terminal sequences of T1 and T2 were deduced from structural data provided by peptides generated by enzymatic cleavages of each variant using chymotrypsin at pH 5.0 and Astacus fluviatilis proteinase. The elution patterns of the enzymatic digests of cuttlefish proteins T1 and T2 are presented in Figs. S4-S6. The amino acid compositions of the peptides useful for the elucidation of the complete sequence are reported in Tables SIII and SIV.
Cleavage with Chymotrypsin-The chymotryptic peptides generated from each variant cover the entire sequence of the protein. All the tyrosyl bonds were cleaved as expected from the specificity of chymotrypsin at pH 5.0 (21). Nevertheless, in the amino-terminal region of the protein, two nonspecific cleavages were observed: the bonds Leu15-Glu16 and Met18-Lys19 were partially hydrolyzed. Moreover, the incomplete cleavage of a Tyr-Arg bond gave rise to peptide C-6 (residues 70-78 in variant T1; residues 69-77 in variant T2).
The molecular masses of the peptides C-4, established as 2456 Da for T1 C-4 and 2298.6 Da for T2 C-4 through the use of 252Cf plasma desorption mass spectrometry, confirmed the results of automated Edman degradation (Table SV). They correspond to the calculated masses for the peptides C-4. It can therefore be deduced that the residues of serine at positions 58 in variant T1 and 56 in variant T2 are not phosphorylated.
Cleavage with A. fluviatilis Proteinase-Among the peptides obtained by digestion of variants T1 and T2 with A. fluviatilis proteinase, only the carboxyl-terminal peptides T1 A-1 (residues 58-78) and T2 A-1 (residues 56-77) were useful in aligning the chymotryptic peptides C-4, C-5, and C-6 and in subsequently establishing the complete sequences of the spermatid-specific protein variants T1 and T2. The amino acid composition and the automated Edman degradation of these two homologous peptides showed that they differ only by 1 residue of arginine. These results were substantiated by mass spectrometric analysis of these peptides in electrospray mode, which showed molecular masses of 3192.2 Da and 3348.8 Da for T1 A-1 and T2 A-1, respectively. The difference of 80 Da observed between the calculated masses and the measured masses (Tables SIII and SIV) corresponds to the presence of 1 phosphate residue in each peptide.
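The statement that the two A-1 peptides differ by a single arginine can be compared with the measured masses. The check below is a rough sketch, not part of the original analysis: it assumes an average residue mass of 156.19 Da for arginine and an arbitrary tolerance of 1 Da, and the names are ours.

```python
# Assumption: average mass of one arginine residue within a peptide chain.
ARG_RESIDUE_DA = 156.19

# Electrospray masses quoted in the text for the two A-1 peptides.
T1_A1_DA = 3192.2
T2_A1_DA = 3348.8

# Assumption: a 1 Da tolerance, chosen only for this illustration.
TOLERANCE_DA = 1.0

mass_difference = round(abs(T2_A1_DA - T1_A1_DA), 1)
matches_one_arginine = abs(mass_difference - ARG_RESIDUE_DA) <= TOLERANCE_DA

print(f"difference {mass_difference} Da, one arginine: {matches_one_arginine}")
```

Because both peptides carry one phosphate group, the phosphate contribution cancels in the difference, which leaves a value close to the mass of one arginine residue.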

Complete Sequences
The amino acid sequences of the variants T1 and T2 are presented in Figs. 2 and 3. The asymmetrical distribution of the hydrophobic and basic residues in these molecules determines two well-defined domains: an amino-terminal domain (residues 1-21) devoid of arginine and aromatic residues and containing all the aliphatic hydrophobic residues, and a highly basic domain (residues 22-77 or 78) which contains 77% of arginine and all the tyrosine residues present in the protein.
Predictive methods (22-27) indicate a high probability of α-helical structure for the amino-terminal domain and the presence of three β turns in the carboxyl-terminal domain.
The hinge region between these two domains (residues 18-22) has a high probability of β turn structure.
To identify the phosphorylation sites of protein T, each variant and its peptides C-2 and C-5 were treated according to the procedure described by Meyer et al. (19). Serine residues at positions 7, 8, 28, 35, 39, and 68 in variant T1 and at positions 8, 28, 34, 39, and 67 in variant T2 were found to be partially phosphorylated. All the phosphorylation sites except the serine residues at positions 7 and 8 are located in an Arg-X-Ser sequence specifically recognized by the cAMP-dependent protein kinase, where X is any amino acid except proline (28, 29). Most of the phosphorylated serine residues are located in the amino-terminal half of spermatidal protein T. Only 1 phosphorylated serine, at position 67 or 68 according to the variant, is present in the carboxyl-terminal half. Moreover, it must be emphasized that the nonphosphorylated form of protein T has not been found in testis chromatin from sexually mature cuttlefish.
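The Arg-X-Ser rule (X being any residue except proline) is easy to express as a pattern search over a one-letter sequence. The sketch below is illustrative only: the sequence `TOY_SEQUENCE` is a made-up string, not the sequence of T1 or T2, and the function name is ours.

```python
import re

# Arg-X-Ser with X != Pro. The lookahead lets overlapping motifs be reported.
ARG_X_SER = re.compile(r"(?=R[^P]S)")


def candidate_serines(sequence: str) -> list[int]:
    """Return 1-based positions of Ser residues lying in an Arg-X-Ser motif."""
    # m.start() is the 0-based index of Arg; the Ser is two residues further on.
    return [m.start() + 3 for m in ARG_X_SER.finditer(sequence)]


# Hypothetical toy sequence, used only to exercise the pattern.
TOY_SEQUENCE = "MKRASGRPSRRSY"

# Arg-Ala-Ser and Arg-Arg-Ser match; Arg-Pro-Ser is excluded by the rule.
print(candidate_serines(TOY_SEQUENCE))
```

A match only marks a serine as a candidate site for the kinase; whether it is actually phosphorylated has to be established experimentally, as was done here.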
These results are consistent with the mass spectrometric data, which indicate the presence of three major molecular species in variant T1, corresponding to diphosphorylated, triphosphorylated, and tetraphosphorylated forms of the protein (Fig. 1). Among these, the triphosphorylated form was found to be predominant.
The most striking feature of cuttlefish spermatidal protein variants T1 and T2 is the complete structural identity of their carboxyl-terminal basic domain (residues 22-78 or 77) with the protamine variants Sp1 and Sp2 isolated from cuttlefish spermatozoa (3) (Fig. 4). This strongly suggests that T1 and T2 could be the precursors of Sp1 and Sp2, respectively. A similar situation has also been observed in mouse and man, where P2-type protamines are synthesized as precursors (12-15).
The mechanism of the transition spermatid-specific protein T → protamine Sp remains unknown. Several hypotheses have to be considered. First, the protamine Sp would derive from protein T by a specific proteolysis of the Gly21-Arg22 bond in the sequence Met/Leu18-Lys-Gly-Gly-Arg-Arg23. This region, which has a high probability of β turn conformation, constitutes the hinge between the α-helical amino-terminal domain and the basic carboxyl-terminal domain. Analogous sequences with highly structured flanking regions are the sites of proteolytic processing of polypeptide hormone precursors (38) or of biologically active proteins of the adenovirus 2 (39). It must be pointed out that the occurrence in elongated spermatids of protamine Sp phosphorylated at the same sites as protein T would support this hypothesis. Second, the successive emergence of spermatid-specific protein T and of protamine Sp could arise from a mechanism regulating gene expression. These proteins could be encoded by two different genes or by a single gene. In the latter case, there would be either several mRNAs directly derived from this gene or a primary transcript which, after processing, would give rise to different mRNAs.

EXPERIMENTAL PROCEDURES