Identification and Characterization of SET, a Nuclear Phosphoprotein Encoded by the Translocation Break Point in Acute Undifferentiated Leukemia*

The translocation (6;9) in acute nonlymphocytic leukemia results in the formation of a dek-can fusion gene. In a case of acute undifferentiated leukemia, the oncogene can is fused to a different gene, named set, instead of dek and is assumed to be activated. Transcripts of set encode a putative SET protein with a predicted molecular mass of 32 kDa. We identified SET as a 39-kDa protein by immunoprecipitation with rabbit antiserum against each of three synthetic peptides predicted from the open reading frame of the set gene. We confirmed this identification of SET by protein sequencing. We also observed that SET is expressed ubiquitously in various human cell lines. SET is phosphorylated on serine residue(s) in cultured cells and is localized predominantly in nuclei. Although the function(s) of SET and SET-CAN is not known, we propose that SET plays a key role in the mechanism of leukemogenesis in acute undifferentiated leukemia, perhaps by activating CAN in nuclei and stimulating the transformation potential of SET-CAN. This proposed role would therefore be similar to the roles observed for BCR and DEK of the chimeric oncoproteins BCR-ABL and DEK-CAN in acute myeloid leukemia and acute nonlymphocytic leukemia, respectively.

The occurrence of defined chromosomal translocations in specific subtypes of leukemia strongly suggests that these translocations play an important role in the process of leukemogenesis. As a result of translocation, nearby oncogenes and cellular genes involved in the control of proliferation or differentiation can be activated through alterations in regulatory DNA sequences that leave the encoded protein intact (e.g. myc) (1) or through formation of fusion genes, which encode chimeric proteins (e.g. bcr-abl, E2A-pbx, and pml-RARα) (2-7).
The translocation (6;9)(p23;q34) in acute nonlymphocytic leukemia results in the formation of a highly consistent dek-can fusion gene (8). Translocation break points invariably take place in single introns of dek and can, named icb-6 and icb-9, respectively. In the case of acute undifferentiated leukemia (AUL), a break point was detected in icb-9 of can, whereas no break point was detected in dek. Genomic and cDNA cloning showed that instead of dek, a different gene was fused to can, which was named set (9). The set gene is located on chromosome 9q34, centromeric of c-abl. The set gene encodes transcripts of 2.0 and 2.7 kilobase pairs that result from the use of alternative polyadenylation sites. Both transcripts contain the open reading frame for a putative SET protein with a predicted molecular mass of 32 kDa (9). The set-can fusion gene in AUL encodes a 5-kilobase pair transcript that contains a single open reading frame predicting a 155-kDa chimeric SET-CAN protein. The SET sequence shows homology to the yeast nucleosome assembly protein NAP-1 and shares the common sequence motif with DEK, B-23, HMG-1, and HMG-2 proteins (8,10-12) at the COOH-terminal acidic region. SET has a long acidic tail, a large part of which is present in the predicted SET-CAN fusion protein. Despite rapid progress in molecular research on the set and set-can genes (9), little is known about their gene products, SET and SET-CAN. Here, we report the identification of SET, a 39-kDa protein encoded by the set gene associated with the translocation break point of can in AUL. We show that (i) expression of SET is ubiquitous in various human cell lines; (ii) SET is phosphorylated in vivo, mainly on serine residues; and (iii) SET is located predominantly in nuclei. Our findings should be helpful in studying the function of SET and chimeric SET-CAN proteins in the mechanism of leukemogenesis in AUL.
* This work was supported by the National Cancer Institute, Department of Health and Human Services, under Contract N01-CO-74101 with ABL. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
§ To whom reprint requests should be addressed.

EXPERIMENTAL PROCEDURES
in methionine- or phosphate-free medium supplemented with 10% fetal calf serum dialyzed against Tris-buffered saline (20 mM Tris-HCl, 0.15 M NaCl, pH 7.5) at 37 °C for 8 or 2 h, respectively (16,21). The concentration of isotope added to the medium was 7.4 MBq/ml for [35S]methionine or 18.5 MBq/ml for [32P]orthophosphate.
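The isotope concentrations quoted for metabolic labeling (7.4 MBq/ml for [35S]methionine and 18.5 MBq/ml for [32P]orthophosphate) are given in SI units. As a convenience for readers who work in curies, the following minimal sketch converts them using the exact definition 1 Ci = 3.7 × 10^10 Bq (so 1 mCi = 37 MBq). The function name `mbq_to_mci` and the table layout are illustrative assumptions, not part of the original protocol.

```python
# 1 Ci is defined as exactly 3.7e10 Bq, hence 1 mCi = 37 MBq.
MBQ_PER_MCI = 37.0


def mbq_to_mci(mbq: float) -> float:
    """Convert an activity (or activity concentration) from MBq to mCi."""
    return mbq / MBQ_PER_MCI


# Labeling conditions as quoted in the text:
# (isotope, concentration in MBq/ml, labeling time in hours)
labeling = [
    ("[35S]methionine", 7.4, 8),
    ("[32P]orthophosphate", 18.5, 2),
]

for isotope, mbq_per_ml, hours in labeling:
    mci_per_ml = mbq_to_mci(mbq_per_ml)
    print(f"{isotope}: {mbq_per_ml} MBq/ml = {mci_per_ml:.2f} mCi/ml, {hours} h")
```

Under this conversion the two concentrations correspond to 0.2 and 0.5 mCi/ml, respectively.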
Peptide Synthesis and Antibodies-Oligopeptides were synthesized with an automated peptide synthesizer (Applied Biosystems Model 430A) using t-butoxycarbonyl amino acids and p-methylbenzhydrylamine resins (16,19,22). Peptides were conjugated to keyhole limpet hemocyanin. Antiserum against each conjugated peptide was raised in rabbits as described previously (22).
Phosphoamino Acid Analysis-32P-Labeled SET protein was transferred from acrylamide gel onto Immobilon polyvinylidene difluoride membrane (Millipore) in 10 mM CAPS/NaOH, 10% methanol, pH 11.0, and hydrolyzed in 6 N HCl at 110 °C for 2 h. Resulting phosphoamino acid samples were separated by high-voltage electrophoresis and identified by autoradiography as previously described (16,21).
Immunoblot Analysis-Proteins (60 μg) from each cell extract were subjected to SDS-PAGE. Separated proteins were transferred onto a nitrocellulose membrane and incubated with anti-SET serum. The immunocomplexes were detected by binding with horseradish peroxidase-conjugated protein A as previously described (13,16).
Indirect Immunofluorescence-Cells in culture chambers were fixed with 3.7% formaldehyde in phosphate-buffered saline (10 mM phosphate, 0.14 M NaCl, pH 7.5) and permeabilized with 0.1% Nonidet P-40 in phosphate-buffered saline. Cells were incubated with primary antibodies in phosphate-buffered saline containing 1% bovine serum albumin. Detection was performed by further binding with fluorescein isothiocyanate-conjugated secondary antibodies (24).

RESULTS
Identification of SET Protein-The nucleotide sequence of the set cDNA contains an open reading frame encoding a protein of 277 amino acids with a predicted molecular mass of 32 kDa, as shown in Fig. 1 (9). To detect the SET protein, we prepared three rabbit antisera, each directed against a synthetic peptide from a different portion of the putative SET protein. Human erythroleukemia K-562 cells were metabolically labeled with [35S]methionine for immunoprecipitation. A predominant 39-kDa protein was immunoprecipitated from these cells with antiserum directed against the SP-1 (residues 3-16), SP-2 (residues 44-56), and SP-3 (residues 169-181) peptides, respectively (Fig. 2, lanes 2, 5, and 8), but not with preimmune serum (lanes 1, 4, and 7). A 66-kDa protein was also detected by immunoprecipitation with anti-SP-1 and anti-SP-2 sera, but the intensity of this 66-kDa protein was less than one-twentieth of that of the major 39-kDa protein. The immunoprecipitation of these proteins could be completely prevented by preabsorption with the same peptides used for immunization (lanes 3, 6, and 9), indicating specific antigen-antibody reactions. Thus, the major 39-kDa protein contained three sequences of the putative SET protein (residues 3-16, 44-56, and 169-181).
To obtain more information for the identification of the 39-kDa protein, especially for protein sequence, we isolated the 39-kDa protein from K-562 cells by large-scale immunoprecipitation using anti-SP-1 serum and preparative SDS-PAGE (Fig.
Expression of SET in Various Human Cell Lines-To test whether SET protein is expressed ubiquitously in other cells, we subjected cell extracts (60 μg of protein) from various human cell lines (HUT-102, MT-2, TL-Su, HUT-78, H-9, Jurkat, Raji, Daudi, K-562, HL-60, HeLa, HOS, and BIM) to immunoblot analysis using anti-SP-1 serum. As shown in Fig. 4, all of the cell lines that we employed expressed SET at approximately the same level. The apparent molecular mass of SET in the immunoblots was the same, 39 kDa, in every cell line. Thus, the expression of SET at the protein level was ubiquitous.
In Vivo Phosphorylation of SET-Since the amino acid sequence of SET contains the apparent consensus sequences of the sites phosphorylated by protein kinases (Fig. 1) (25,26), we attempted to determine whether in vivo phosphorylation of SET occurred in cells. K-562 cells were metabolically labeled with [32P]orthophosphate and subjected to immunoprecipitation. Radioactive SET was then immunoprecipitated with anti-SP-1 serum (Fig. 5A, lane 2). Because the radioactivity associated with SET was sensitive to 50 μg/ml alkaline phosphatase treatment at 37 °C for 2 h (lane 3), the incorporation of radioactivity was due to protein phosphorylation, but not ADP-ribosylation. To determine the phosphoamino acid content of SET, 32P-labeled SET from K-562 cells (shown in Fig. 5A, lane 2) was eluted from the gel and hydrolyzed in 6 N HCl at 110 °C for 2 h. Phosphoamino acids were separated by electrophoresis and autoradiographed. As shown in Fig. 5B, SET protein was phosphorylated mainly on serine residue(s) in vivo, whereas phosphothreonine and phosphotyrosine were not detected.
Subcellular Localization of SET-We investigated the subcellular localization of SET in HeLa and HOS cells by indirect immunofluorescent staining with anti-SP-1 serum. Fig. 6A shows typical HeLa cells stained using anti-SP-1 serum. The immunofluorescent signal was observed mainly in nuclei. The outer layer of the nuclear membrane was also stained weakly. In contrast, preimmune serum and antiserum preabsorbed with the SP-1 peptide did not stain the nuclei significantly (data not shown). The same nuclear staining of SET was observed in HOS cells (Fig. 6B). Of the 100 stained cells viewed in each cell line, >95 cells showed an identical pattern of intracellular distribution of SET (Fig. 6, E and F). In a few cells, granular staining in the nucleus was evident as well as staining in the nucleolus and cytoplasm.
Moreover, the same nuclear localization of SET was also observed in HeLa cells by using anti-SP-2 and anti-SP-3 sera (data not shown). These results indicate that most of SET is localized in nuclei.

DISCUSSION
In this report, we identified and characterized a 39-kDa protein as SET protein, which was predicted from the open reading frame of set transcripts (9). The criteria used to identify this protein are as follows. (i) The same 39-kDa protein could be immunoprecipitated with antisera to each of three peptides corresponding to different regions of the predicted amino acid sequence of SET (residues 3-16, 44-56, and 169-181) (Fig. 2), thus greatly decreasing the possibility that the immunoprecipitation was due to chance sequence homology between the synthetic peptides and an unrelated protein. (ii) Immunoprecipitation of the 39-kDa protein could be completely inhibited by prior incubation of the antiserum with the original synthetic peptides, and preimmune serum did not recognize the protein (Fig. 2), indicating specific antigen-antibody reactions. (iii) Microsequencing of the immunoprecipitated and enzymatically cleaved 39-kDa protein yielded two internal peptide sequences that were identical to residues 75-80 and 191-210 of the predicted SET protein (Fig. 3B). Taken together, these findings strongly support our conclusion that the 39-kDa protein is identical to the set-encoded protein, SET.
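The strength of these criteria rests partly on the fact that the three immunizing peptides and the two microsequenced peptides map to distinct, non-overlapping stretches of the predicted 277-residue sequence, so the sequencing data are independent of the epitopes used for immunoprecipitation. The sketch below simply tabulates the residue ranges quoted in the text and checks for overlaps; the helper names (`span_length`, `spans_overlap`) and the dictionary labels are illustrative assumptions.

```python
from itertools import combinations

SET_LENGTH = 277  # residues in the predicted SET protein, as stated in the text

# Residue ranges (1-based, inclusive) quoted in the text.
regions = {
    "SP-1 (immunizing peptide)": (3, 16),
    "SP-2 (immunizing peptide)": (44, 56),
    "SP-3 (immunizing peptide)": (169, 181),
    "microsequenced peptide 1": (75, 80),
    "microsequenced peptide 2": (191, 210),
}


def span_length(span):
    """Number of residues in an inclusive (start, end) range."""
    start, end = span
    return end - start + 1


def spans_overlap(a, b):
    """True if two inclusive ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


for name, span in regions.items():
    # Every range must lie within the predicted protein.
    assert 1 <= span[0] <= span[1] <= SET_LENGTH
    print(f"{name}: residues {span[0]}-{span[1]} ({span_length(span)} aa)")

overlapping = [
    (name_a, name_b)
    for (name_a, a), (name_b, b) in combinations(regions.items(), 2)
    if spans_overlap(a, b)
]
print("overlapping pairs:", overlapping)
```

With the ranges given in the text, the list of overlapping pairs is empty.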
The SET protein consists of 277 amino acids with a predicted molecular mass of 32 kDa (9). The discrepancy in the molecular size of the observed 39-kDa SET and the predicted 32-kDa SET may be due to protein phosphorylation (Fig. 5), the blocked NH2 terminus, protein glycosylation, and/or the high content of acidic residues in the amino acid sequence (Fig. 1) (9) since phosphoproteins with blocked NH2-terminal residues and a high content of acidic residues (e.g. nucleolar shuttle protein B-23, HMG-1, and HMG-2) have been shown to migrate more slowly than expected on SDS-PAGE (10-12,19,27). SET is widely distributed in various human cell lines, as shown in Fig. 4. The molecular mass of SET observed on the immunoblot was the same (39 kDa) in every case. These results are consistent with previous observations of set gene expression at the mRNA level in mouse tissues and early embryos (9), suggesting that SET has a rather general function(s) in the cell. The data described in this study clarify that SET is phosphorylated in vivo (Fig. 5). The major phosphoamino acid of phosphorylated SET was phosphoserine, indicating that SET is a substrate for one of the cellular serine/threonine kinases. Tyrosine kinases are not involved as phosphotyrosine was not detected. The function(s) of SET is not yet known; however, the in vivo phosphorylation of SET may be involved in the regulation of SET and SET-CAN functions or in the next event in the signal transduction pathway in response to physiological stimuli or the cell cycle.
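To put the size discrepancy in numbers, the following rough sketch compares the observed and predicted masses and, as a sanity check, estimates the mass from the residue count alone. The average residue mass of about 110 Da is a common rule of thumb and an assumption here; the 32-kDa value in the text comes from the actual sequence, which is rich in comparatively heavy acidic residues. All names in the code are illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein (assumption).
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Crude molecular mass estimate (kDa) from residue count alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Values quoted in the text.
n_residues = 277      # length of the predicted SET protein
predicted_kda = 32.0  # mass predicted from the open reading frame
observed_kda = 39.0   # apparent mass on SDS-PAGE

rough_kda = estimate_mass_kda(n_residues)
excess_kda = observed_kda - predicted_kda
excess_percent = 100.0 * excess_kda / predicted_kda

print(f"rule-of-thumb estimate: {rough_kda:.1f} kDa")
print(f"apparent excess on SDS-PAGE: {excess_kda:.0f} kDa ({excess_percent:.0f}% above predicted)")
```

The apparent mass is thus roughly one-fifth higher than predicted, a shift of the kind the text attributes to phosphorylation, a blocked NH2 terminus, glycosylation, and/or high acidic residue content.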
SET was found predominantly in nuclei, as shown in Fig. 6. The nuclear localization of SET seems consistent with its sequence, as it contains an extremely high percentage of acidic residues, 32% (98 amino acids), half of which (43 amino acids) are present at the COOH terminus, forming a long acidic tail. Many proteins containing acidic regions are located in the nucleus and have different functions (28). Analogous to acidic domains in NAP-1, HMG-1, HMG-2, nucleolin, GAL4, and VP16 (12,27,29-33), the acidic motif of SET might serve as a nucleosome/chromatin assembly domain or a transcription activation domain.
Biological and functional assays are needed to determine whether the acidic domain of SET is essential for the putative transformation potential of the SET-CAN fusion protein in AUL. The nuclear localization of SET and its in vivo phosphorylation are interesting since fusion of CAN to DEK and of ABL to BCR results in a nuclear localization of the fusion proteins, whereas CAN and ABL themselves are present mainly in the cytoplasm (3,8,9,34). Hence, fusion of SET to CAN may have the same effect, resulting in a nuclear localization of the SET-CAN fusion protein. A nuclear translocation of CAN may be essential for the putative leukemogenic effect of the fusion protein in AUL.