aKMT Catalyzes Extensive Protein Lysine Methylation in the Hyperthermophilic Archaeon Sulfolobus islandicus but is Dispensable for the Growth of the Organism *

Protein methylation is believed to occur extensively in crenarchaea. Recently, aKMT, a highly conserved crenarchaeal protein lysine methyltransferase, was identified and shown to exhibit broad substrate specificity in vitro. Here, we have constructed an aKMT deletion mutant of the hyperthermophilic crenarchaeon Sulfolobus islandicus. The mutant was viable but showed a moderately slower growth rate than the parental strain under non-optimal growth conditions. Consistent with the moderate effect of the lack of aKMT on the growth of the cell, expression of a small number of genes, which encode putative functions in substrate transportation, energy metabolism, transcriptional regulation, stress response, etc., was differentially regulated by more than twofold in the mutant strain, as compared with that in the parental strain. Analysis of the methylation of total cellular protein by mass spectrometry revealed that methylated proteins accounted for ∼2/3 (1,158/1,751) and ∼1/3 (591/1,757) of the identified proteins in the parental and the mutant strains, respectively, indicating that there is extensive protein methylation in S. islandicus and that aKMT is a major protein methyltransferase in this organism. No significant sequence preference was detected at the sites of methylation by aKMT. Methylated lysine residues, when visible in the structure, are all located on the surface of the proteins. The crystal structure of aKMT in complex with S-adenosyl-L-methionine (SAM) or S-adenosyl homocysteine (SAH) reveals that the protein consists of four α helices and a seven-stranded β sheet, lacking a substrate recognition domain found in PrmA, a bacterial homolog of aKMT, in agreement with the broad substrate specificity of aKMT. Our results suggest that aKMT may serve a role in maintaining the methylation status of cellular proteins required for the efficient growth of the organism under certain non-optimal conditions.

In Archaea, especially in thermophilic crenarchaea, an increasing number of proteins have been shown to undergo post-translational methylation. Early examples include ferredoxin from Sulfolobus acidocaldarius, and glutamate dehydrogenase, aspartate aminotransferase, β-glycosidase and ribosomal proteins from S. solfataricus (26-30). Methylation of lysine residues in S. solfataricus glutamate dehydrogenase and β-glycosidase appears to enhance the thermal stability of the enzymes or reduce their susceptibility to denaturation and aggregation (26,28). The crenarchaeal chromatin proteins Sul7d and Cren7 have also been shown to be methylated, and the level of lysine methylation of Sul7d increases during heat shock (31-33). More recently, Botting et al. identified 21 methyllysine residues in nine subunits of S. solfataricus RNA polymerase and 52 methyllysine residues in 30 different proteins from Thermoproteus tenax, suggesting that lysine methylation may occur more extensively in crenarchaea than previously thought (6). The widespread protein methylation in crenarchaea has been speculated to represent an adaptation of these organisms to growth in hyperthermal environments (6).
For a long time, little was known about the enzymes responsible for protein lysine methylation in Archaea. In the first report on protein methyltransferases in Archaea, a Su(var), Enhancer of zeste, Trithorax (SET) domain protein capable of methylating a lysine residue in the chromatin protein MC1-α was identified in the euryarchaeon Methanosarcina mazei (34). Homologues of this methyltransferase are also found in several other methanogens. However, no SET domain proteins have been found in the sequenced genomes of crenarchaea, where lysine methylation is prevalent. Recently, we identified the first crenarchaeal protein lysine methyltransferase, designated aKMT, from S. islandicus (35). This protein resembles methyltransferases of the eukaryotic Dot1 family (36). Notably, aKMT is capable of methylating several tested recombinant Sulfolobus proteins overproduced in Escherichia coli, exhibiting broad substrate specificity in vitro (35).
To gain more insight into the function of aKMT, we have now constructed an aKMT deletion mutant of S. islandicus. The mutant strain is viable but exhibits a moderate growth defect under certain conditions. Expression of a small number of genes is significantly altered as a result of the deletion of the aKMT gene. We show that most of the cellular proteins from S. islandicus are methylated, and that aKMT is responsible for the methylation at the majority of the target lysine residues in the organism. We have also determined the crystal structures of the aKMT-S-adenosyl-L-methionine (SAM)¹ and aKMT-S-adenosyl homocysteine (SAH) complexes, providing the structural basis for the broad substrate specificity of the protein. Our results suggest that extensive protein methylation is not required for the adaptation of S. islandicus to growth at high temperature and may serve to enhance the ability of the organism to achieve efficient growth under certain non-optimal conditions.

Construction of an aKMT Deletion Mutant and a Complementary Strain-The aKMT gene (denoted kmtA, Tg-arm) and its upstream and downstream sequences (In-arm and Out-arm, respectively) were amplified by PCR from the genomic DNA of S. islandicus Rey15A (for primer sequences, see supplemental Table S1 in the Supplemental Materials). The Tg-arm, In-arm and Out-arm fragments were digested with SalI/MluI, XhoI/SphI and NcoI/XhoI, respectively, and inserted into plasmid pMID (40). The resulting vector (pMID-aKMT) was transformed into E. coli strain DH5α. After growth, the plasmid was extracted from the cells and the sequences of the inserts were verified by DNA sequencing through primer walking. Plasmid pMID-aKMT was then linearized and introduced into S. islandicus E233S by electroporation, as described previously (38). Transformed cells were plated onto SCVy medium solidified with 0.8% (w/v) Gelrite (Sigma, St. Louis, MO) and incubated at 75°C. After 7-10 days, colonies were stained with 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (X-Gal, 2 mg/ml). Blue colonies were picked and grown on SCVy medium containing 5-fluoroorotic acid (50 μg/ml). Colony purification was repeated twice.
A strain that complemented the deletion of the genomic copy of the aKMT gene was prepared by amplifying the kmtA sequence from the S. islandicus DNA by PCR (supplemental Table S1). The PCR product was cleaved with NdeI/SalI, and the resulting fragment was inserted into plasmid pSeSD (41), yielding the aKMT expression plasmid pSeSD-aKMT, which is capable of replicating in both E. coli and S. islandicus. After propagation in E. coli DH5α, pSeSD-aKMT was transformed into the aKMT mutant strain. The transformed cells containing pSeSD-aKMT were purified by isolating single colonies on SCVy plates. Colony purification was repeated twice.
Thermal Stability of Cellular Proteins-Parental and mutant S. islandicus cells were grown to the exponential growth phase in SCVy medium at 75°C, harvested by centrifugation, resuspended to the same cell density in 50 mM sodium phosphate buffer, pH 7.0, and sonicated. The thermal stability of cellular proteins in the cell-free extract was determined by monitoring the change in absorbance of the sample at 600 nm as the temperature was increased from 50 to 95°C at a rate of 0.2°C/min on a Shimadzu UV-2550 spectrophotometer (26). The thermal stability of the cellular proteins was also measured by incubating the extract for the indicated lengths of time at various temperatures. Aggregation of the proteins following heat treatment was monitored by 90° light scattering at 488 nm on a Shimadzu RF5301PC spectrofluorimeter.
Identification of Methylated Lysine Residues-A sample (~300 μg) of total cellular proteins from the parental or the mutant S. islandicus strain grown to an OD600 of ~1.0 in SCVy medium at 75°C was subjected to 15% SDS-PAGE. The gel was stained with Coomassie brilliant blue R-250 and destained with 20% (v/v) ethanol/10% (v/v) acetic acid, instead of methanol, to avoid chemical methylation of the proteins. The protein-containing portion of the gel was horizontally cut into slices, and each slice was then cut into small pieces. Proteins in the gel pieces were subjected to in-gel digestion as described previously (42,43). Briefly, the gel pieces were destained, treated with dithiothreitol and iodoacetamide, and vacuum-dried. The proteins in the gel pieces were digested with 10 μl of trypsin (12.5 ng/μl; Roche, Switzerland) in 25 mM ammonium bicarbonate, pH 8.0, at 37°C overnight. The peptides were extracted from the gel pieces and vacuum-dried. The peptides were first separated on an EASY-nLCII integrated nano-HPLC system (Proxeon, Odense, Denmark) and then analyzed on an LTQ-Orbitrap Velos mass spectrometer (Thermo, Waltham, MA). Mobile phase A was 0.1% (v/v) formic acid and mobile phase B consisted of 100% (v/v) acetonitrile and 0.1% (v/v) formic acid.
Peptide separation was performed for 105 min on a home-made fused silica capillary C18 column (3 μm, 75 μm × 150 mm; Upchurch, Middleboro, MA) at a flow rate of 300 nL/min using the following gradients successively: 2% to 6% B in 10 min, 6% to 25% B in 65 min, 25% to 45% B in 20 min, and 45% to 100% B in 10 min. MS and MS/MS data acquisition was performed using Xcalibur in the data-dependent acquisition mode. We searched the data with the SEQUEST search engine in Proteome Discoverer 1.4 software against an S. islandicus Rey15A database downloaded from NCBI (44). Percolator was used to calculate the FDR of each peptide. The decoy database was constructed by reversing all protein sequences in the original database. Search parameters were set to allow 20 ppm for MS tolerance and 0.8 Da for MS/MS tolerance. Mono-, di-, and trimethylation of lysine were set as variable modifications on Lys, oxidation as a variable modification on Met, and Cys carbamidomethylation as a fixed modification. Two missed cleavages were allowed for trypsin digestion. The false discovery rate (FDR) of the peptide was set to 0.01. The search results were filtered by high confidence, and at least one matched peptide was required for each identified protein.
¹ The abbreviations used are: SAM, S-adenosyl-L-methionine; SAH, S-adenosyl homocysteine; Se-Met, Selenomethionine; IMG, Integrated Microbial Genomes; FDR, False discovery rate; qRT-PCR, Quantitative reverse transcription-PCR; PTM, Post-translational modifications; ORFs, Open reading frames; SAD, Single-wavelength anomalous dispersion; RMSD, Root-mean-square deviation; SET, (Su(var), Enhancer of zeste, Trithorax) domain; X-gal, 5-Bromo-4-chloro-3-indolyl β-D-galactopyranoside; PCNA, proliferating cell nuclear antigen; Lig1, DNA ligase I; MCM, Minichromosome maintenance protein complex; PriL, DNA primase noncatalytic subunit; PriS, DNA primase catalytic subunit; RFC, Replication factor C.
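Two of the search settings described above, the reversed-sequence decoy database and the 20 ppm precursor tolerance, can be illustrated with a minimal Python sketch. This is not the Proteome Discoverer implementation; the function names, the "REV_" prefix, and the example sequence and m/z values are hypothetical and chosen only for illustration.

```python
def make_decoys(proteins: dict[str, str], prefix: str = "REV_") -> dict[str, str]:
    """Build decoy entries by reversing every protein sequence.

    The "REV_" prefix is an assumed naming convention, not one from the paper.
    """
    return {prefix + name: seq[::-1] for name, seq in proteins.items()}


def within_ppm(observed_mz: float, theoretical_mz: float, tol_ppm: float = 20.0) -> bool:
    """Return True if the mass error is within the given ppm tolerance."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


# Hypothetical target entry and m/z values, for illustration only.
targets = {"protein_A": "MKTAYK"}
decoys = make_decoys(targets)
print(decoys)                       # reversed copy of each target sequence
print(within_ppm(500.005, 500.0))   # about 10 ppm error, inside the window
print(within_ppm(500.020, 500.0))   # about 40 ppm error, outside the window
```

Matches against the reversed entries are what allow a false discovery rate to be estimated for the target matches.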
For the identified methylated peptides, a manual check of randomly selected methylated peptides was carried out to ensure high spectrum quality. The raw data were also processed using Proteome Discoverer 2.1 with a ptmRS node, which provided a confidence measure for the assignment of lysine methylation in peptide sequences. All peptides containing a methylated lysine residue at the C terminus were manually checked. The "Best Site Probabilities" values were obtained for the assignment of methylation in these peptides. To estimate the abundance of the methylated proteins, the raw data were further searched with Mascot 2.5 to obtain emPAI values for the proteins (45). Raw spectra have been submitted to the PRIDE database (http://www.ebi.ac.uk/pride/) via ProteomeXchange with the data set identifier PXD003424 (46).
Global Analysis of Protein Lysine Methylation-The genomic information of S. islandicus REY15A was downloaded from the Integrated Microbial Genomes (IMG, http://img.jgi.doe.gov) (47). Two databases, termed parent and ΔaKMT, were constructed with methylated proteins identified in the parental and the mutant S. islandicus strains, respectively. Excel plotting was used in arCOG analysis. The position of a peptide in the protein from which it was derived was determined by BLAST. A 15-residue sequence window containing seven amino acid residues on each side of a methylated site was determined. For cases where methylated sites were near the C or N terminus, "-" was used to complete the sequence window. The characteristics of the secondary structure of each 15-residue amino acid sequence were predicted by using Garnier from the EMBOSS software package. The frequency of residues flanking a methylated lysine residue was analyzed by WebLogo (WebLogo 3.0).
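The construction of the 15-residue window, including the "-" padding near either terminus, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' script; the function name and the example sequence are hypothetical, and sites are assumed to be given as 1-based positions.

```python
def methylation_window(protein_seq: str, site: int, flank: int = 7, pad: str = "-") -> str:
    """Return the (2 * flank + 1)-residue window centered on a 1-based site.

    Positions that fall beyond the N or C terminus are filled with `pad`.
    """
    if not 1 <= site <= len(protein_seq):
        raise ValueError("site is outside the protein sequence")
    idx = site - 1  # convert to a 0-based index
    left = protein_seq[max(0, idx - flank):idx]
    right = protein_seq[idx + 1:idx + 1 + flank]
    # rjust/ljust add padding only when a flank is shorter than `flank`.
    return left.rjust(flank, pad) + protein_seq[idx] + right.ljust(flank, pad)


# Hypothetical 20-residue sequence with lysines near both termini.
example = "MSKAGTPVEKLLNQWRKEFD"
print(methylation_window(example, 3))   # window padded on the N-terminal side
print(methylation_window(example, 17))  # window padded on the C-terminal side
```

Windows of equal length produced this way can be stacked directly as input for a sequence logo.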
Experimental Design and Statistical Rationale-To compare the patterns of protein methylation in the parental and the aKMT deletion mutant strains, a sample (~10 μg) from each strain, grown to an OD600 of ~1.0 in SCVy medium at 75°C, was loaded onto an SDS-PAGE gel for in-gel trypsin digestion and MS analysis. The experiment was then repeated once with a larger amount of proteins (~300 μg) from a separately grown culture of each strain under the same conditions. The quality of each data set was ensured by setting the false discovery rate (FDR) to 0.01. Since the majority of the proteins (97.5% and 96.5% for the parental and the mutant strains, respectively) identified in the smaller sample were also found in the larger one and the two data sets showed similar patterns of protein methylation, the larger data set is presented in this work.
Immunoblotting-Cells from the parental, the mutant or the complementary strain grown exponentially at 75°C were harvested and resuspended in a calculated volume of the sample buffer for SDS-PAGE to the same cell density. Equal aliquots of each sample were loaded, in parallel, onto two 15% SDS-polyacrylamide gels. After electrophoresis, one of the gels was stained with Coomassie brilliant blue R-250. The other gel was processed for immunoblotting. Proteins were transferred electrophoretically to a PVDF membrane (Merck Millipore, Darmstadt, Germany). The membrane was incubated with rabbit anti-mono/dimethyllysine antibodies (Jingjie PTM Biolab, Hangzhou, China), which recognize mono- and dimethylated lysine residues but not a trimethylated, acetylated or unmodified lysine residue in a protein. An anti-rabbit IgG-HRP conjugate (Promega) was used as the secondary antibody and was detected by the chemiluminescent method.
To prepare selenomethionine (Se-Met) aKMT, the above aKMT expression strain was grown in M9 medium containing 0.2% glucose, 1 mM MgSO4, and 100 μg/ml ampicillin at 37°C until the OD600 of the culture reached ~0.8. Se-Met was added to the culture to a final concentration of 50 μg/ml. Se-Met aKMT was purified as described for the wild-type aKMT protein. Protein concentrations were determined by the Lowry method (48). In order to obtain the aKMT-SAH complex, aKMT copurified with SAH was incubated in 6 M guanidine-HCl, followed by incubation with SAH under renaturation conditions (49).
Protein Crystallization, Data Collection, and Structure Determination-Crystallization screening was performed using the sitting drop vapor diffusion method in 96-well plates with commercial screening kits from Hampton Research (Aliso Viejo, CA), Molecular Dimensions (Suffolk, UK) and Emerald BioSystems (Bainbridge Island, WA). A sample (0.3 μl, 20 mg/ml) of a Se-Met aKMT protein stock solution was mixed with 0.3 μl of reservoir solution using a Mosquito robot (TTP Labtech, Melbourn, UK) and equilibrated against 40 μl of reservoir solution at 16°C. Initial hits were followed up by mixing 1 μl of the protein mixture with 1 μl of reservoir solution in hanging drops (10 mM magnesium chloride hexahydrate, 0.1 M HEPES-NaOH, pH 7.0, 15% (w/v) polyethylene glycol 3,350, 5 mM nickel chloride hexahydrate) at 16°C. The crystallization conditions for aKMT-SAH (20 mg/ml), which crystallized in space group P2, were 10 mM nickel chloride, 0.1 M Tris-HCl, pH 8.5, 20% (w/v) PEG 2,000 MME. The crystals were mounted on a nylon loop and cooled immediately in liquid nitrogen without any antifreeze. All data sets were indexed, integrated, and scaled using the HKL2000 software package (50). The initial phase was determined using the X 2 DF structure determination pipeline (51,52). The model was manually improved in Coot (53). The aKMT-SAH structure was solved by molecular replacement (54) using our determined Se-Met aKMT structure as the search model. Refinement was carried out using REFMAC (55) and PHENIX (56) alternately. The quality of the final model was validated with MolProbity (57).

RNA Preparation and cDNA Synthesis for RNA-seq-The parental and the mutant S. islandicus strains were grown at 75°C with shaking in SCVy medium and harvested at an OD600 of ~0.5. Total RNAs were extracted from the cells by using the TRIzol reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. A sample (10 μg) of total RNA was treated with DNaseI (5 U; Takara, Dalian, China) at 37°C for 30 min, and purified by using the RNeasy MinElute Cleanup Kit (Qiagen, Hilden, Germany). DNA-free RNA samples were fragmented by heating at 95°C. A cDNA library was constructed from 100 ng of total RNA by using the RNA-Seq Library Preparation Kit for Whole Transcriptome Discovery (Gnomegen). The quality of the library was determined by agarose gel electrophoresis, and by using a NanoPhotometer® spectrophotometer (IMPLEN, München, Germany) and an Agilent High Sensitivity DNA Kit (Agilent Technologies, Santa Clara, CA). Cluster generation of cDNA (10 ng) was performed on a cBot Cluster Generation System using the TruSeq PE Cluster Kit (Illumina, CA) according to the manufacturer's protocol, and the library was subjected to sequencing in both directions on an Illumina HiSeq™ 2500 sequencer.
Analysis of RNA-seq data-Raw data in FASTQ format were first processed by using FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). Reads containing sequencing adapters or of low quality (> 5 Ns, where N represents any of the unidentified bases) were removed. The number of reads mapped to each transcript was counted by HTSeq v0.5.3 (http://www-huber.embl.de/users/anders/HTSeq). To calculate the difference between the parental strain and ΔaKMT in the expression of each gene, MARS (a MA-plot-based method with a Random sampling model) from the DEGseq program package was employed (58). The difference was considered to be significant when the FDR was no greater than 0.001. For each COG category, enrichment of differentially expressed transcripts compared with the entire genomic set of genes was determined by using hypergeometric distribution statistics. Enrichment analysis of the KEGG pathways was performed in the same manner.
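A hypergeometric enrichment test of the kind described above can be sketched in Python using only the standard library. This is an illustrative upper-tail calculation, not the authors' pipeline; the function name is hypothetical, and the category size and gene counts in the usage example are invented placeholders (only the genome size of 2535 predicted proteins comes from this report).

```python
from math import comb


def hypergeom_enrichment_p(genome_total: int, genome_in_cat: int,
                           de_total: int, de_in_cat: int) -> float:
    """Upper-tail hypergeometric P value.

    Probability of seeing at least `de_in_cat` genes of a category among
    `de_total` differentially expressed genes drawn without replacement from
    a genome of `genome_total` genes, `genome_in_cat` of which belong to the
    category.
    """
    upper = min(genome_in_cat, de_total)
    denom = comb(genome_total, de_total)
    tail = sum(
        comb(genome_in_cat, k) * comb(genome_total - genome_in_cat, de_total - k)
        for k in range(de_in_cat, upper + 1)
    )
    return tail / denom


# Placeholder numbers: a category of 100 genes, 50 differentially expressed
# genes in total, 8 of which fall in the category.
p_value = hypergeom_enrichment_p(2535, 100, 50, 8)
print(f"P(X >= 8) = {p_value:.3g}")
```

In practice the P values for all categories would normally be corrected for multiple testing; the text does not state which correction, if any, was applied.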
Quantitative reverse transcription-PCR (qRT-PCR)-Total RNA (500 ng) was reverse transcribed into cDNA using M-MLV Reverse Transcriptase (Promega, Fitchburg, WI) according to the manufacturer's instructions. Gene-specific primers were designed using Primer Premier 5.0 (supplemental Table S1). Quantitative PCR (qPCR) reaction mixtures contained 2× KAPA SYBR® FAST qPCR Master Mix Universal (10 μl; KAPA Biosystems, Wilmington, DE), 50-fold diluted cDNA (1 μl), and 200 nM primers in a final volume of 20 μl. qPCR reactions were conducted on a LightCycler 480 II PCR machine (Roche, Basel, Switzerland) according to the manufacturer's protocol. Relative mRNA expression was calculated using the comparative threshold cycle (Ct) method (59). The level of 16S rRNA was used as a reference to normalize the expression data for target genes.
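The comparative Ct calculation is commonly written as 2^(-ΔΔCt), with the reference gene (here 16S rRNA) used for normalization. The text does not spell out the formula, so the Python sketch below assumes this standard form and an amplification efficiency close to 100%; the function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_mutant: float, ct_ref_mutant: float,
                        ct_target_parent: float, ct_ref_parent: float) -> float:
    """Fold change of a target gene in the mutant relative to the parent.

    Assumes the standard 2^(-ddCt) form of the comparative Ct method.
    """
    delta_mutant = ct_target_mutant - ct_ref_mutant   # normalize to reference
    delta_parent = ct_target_parent - ct_ref_parent
    delta_delta = delta_mutant - delta_parent
    return 2.0 ** -delta_delta


# Hypothetical Ct values: the target crosses threshold two cycles later in
# the mutant, while the 16S rRNA reference is unchanged.
fold = relative_expression(24.0, 10.0, 22.0, 10.0)
print(f"relative expression = {fold:.2f}")  # a value below 1 means lower expression
```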

S. islandicus Defective in aKMT is Viable but Shows a Moderately Altered Growth Phenotype-To gain insights into the physiological role of aKMT, we deleted the gene encoding the protein in S. islandicus by introducing a deletion construct (pMID-aKMT) into S. islandicus strain E233S, referred to as the parental strain. A successful double crossover event would lead to the generation of an aKMT deletion mutant strain, denoted ΔaKMT. The deletion of the aKMT gene in the mutant strain was verified by PCR, Southern hybridization and immunoblotting with anti-aKMT antibodies (supplemental Fig. S1). Surprisingly, the antibodies, which were generated in rabbit with recombinant aKMT as the antigen and affinity purified with recombinant aKMT, recognized a number of S. islandicus proteins, in addition to aKMT, in the parental strain as well as the complementary strain, which was constructed by introducing an aKMT expression plasmid (pSeSD-aKMT) into ΔaKMT (supplemental Fig. S1B). Since the cross reaction was identical in both the parental and the complementary strains and was substantially reduced in ΔaKMT, we speculate that the proteins recognized by the antibodies shared antigenic determinants, possibly including a methylated lysine in which the addition of the methyl group was catalyzed by aKMT. A similar observation on a glycoprotein from a rumen bacterium was reported previously (57). Our results indicate that aKMT is dispensable for the growth of the cell.
In order to examine the growth phenotype of the aKMT mutant, we grew the parental strain, the deletion mutant and the complementary strain at 75°C, the optimal growth temperature for S. islandicus, and 65°C in either CVY-rich or SCVy medium. The two media were similar except that the former contained 40-fold more yeast extract than the latter. The cells of the three strains grew more slowly in SCVy medium than in CVY-rich medium, but reached similar maximum cell densities in the two media. As shown in Fig. 1, ΔaKMT showed a slower growth rate than the parental strain in both media, and the difference was most significant when they were grown at 65°C in SCVy medium. The growth phenotype was partially restored in the complementary strain. This appears to be consistent with the finding that the intracellular level of aKMT in the complementary strain was lower than that in the parental strain (supplemental Fig. S1B). It is also possible that the His6 tag attached to the C terminus of aKMT synthesized in the complementary strain reduced the methyltransferase activity of the enzyme. Our data suggest that the function of aKMT is required more for the growth of the organism under certain restrictive conditions than for that under optimal conditions. To test further whether the parental and the mutant cells would differ in sensitivity to treatment at temperatures higher than that optimal for growth, we first grew both strains at 75°C in SCVy medium to the mid-exponential phase and subsequently incubated the two strains for two hours at 87, 90, or 95°C. The fractions of the cells that survived the heat treatment were determined by plating. The parental and the mutant strains appeared to survive heat treatment similarly well at 87 and 90°C, with survival rates of ~100 and ~90%, respectively (data not shown). However, both strains were unable to survive incubation for two hours at 95°C. Therefore, the deletion of the aKMT gene does not appear to affect the thermal tolerance of the organism.
aKMT is Responsible for Methylation of the Bulk of Proteins in the Cell-In our previous study, we found that aKMT was capable of methylating in vitro several S. solfataricus proteins overproduced in E. coli (35). It was shown earlier that S. solfataricus RNA polymerase was highly methylated at lysine residues, suggesting that extensive protein methylation may occur in the organism (6). These observations prompted the suggestion that aKMT may play a major role in protein methylation in the cell. The availability of the aKMT deletion mutant permitted an analysis of the extent and the pattern of protein methylation catalyzed by the enzyme in vivo. In our preliminary experiments, anti-mono/dimethylated lysine antibodies (Jingjie PTM Biolab, China), which recognize proteins with mono- and dimethylated lysine residues, were employed to detect methylated proteins in the parental, the mutant and the complementary strains by immunoblotting. As shown in Fig. 2, the number of proteins recognized by the antibodies in the parental and the complementary strains far exceeded that in ΔaKMT, supporting the notion that aKMT is a major protein lysine methyltransferase in S. islandicus.
To learn more about protein methylation catalyzed by aKMT in vivo, we set out to compare the methylated proteins in the parental and the mutant strains. In a control experiment, the pattern of methylated proteins in the parental strain, as revealed by immunoblotting with anti-mono/dimethylated lysine antibodies, remained largely unchanged during the entire growth phase of the organism (supplemental Fig. S2). This finding agrees with the observation that the intracellular level of aKMT, as determined by immunoblotting using anti-aKMT antibodies, was nearly constant throughout the growth phase (data not shown; (35,36)) as well as at various temperatures within a tested range. Therefore, we subjected samples containing the same number of either the parental or the mutant cells (OD600 = ~1.0) to electrophoresis on an SDS-PAGE gel, followed by in-gel trypsin digestion and mass spectrometry. This experiment was repeated once. Similar observations were made in both experiments. The results of one of the experiments are shown in this report.
Among a total of 2535 proteins predicted to be encoded by the S. islandicus genome, 1751 and 1757 proteins were identified in the parental and the mutant strains, respectively, by mass spectrometry (Table I and supplemental Table S2). A total of 1643 proteins were found in both strains. In the parental strain, 1158 proteins (66.1% of the identified proteins) were found to be methylated, whereas 591 proteins (33.6%) were methylated in ΔaKMT. These results confirm the suggestion that there is extensive protein methylation in Sulfolobus (6). The observation that 591 methylated proteins remained in the mutant strain but were not readily detected by anti-aKMT antibodies raised the possibility that these proteins were of low abundance. We therefore estimated the abundance of methylated proteins in the parental and the mutant strains by quantitative spectral counts (supplemental Table S2B). Our results suggest that the signal intensity on the immunoblot did not correlate with the amount of the proteins, and might primarily depend on the levels of methylation of the proteins. Given that the vast majority of the proteins identified in the parental and the mutant strains were the same, it may be inferred that the profile of the synthesis of cellular proteins was not drastically affected by protein methylation. Further analysis showed that 497 proteins were methylated in both strains, whereas 661 proteins were methylated only in the parental strain. In addition, 94 proteins were methylated only in ΔaKMT (supplemental Table S3), and 66 of these proteins were detected in an unmethylated form in the parental strain, suggesting that the deletion of the aKMT gene altered the pattern of protein methylation by the remaining methyltransferase(s) in the cell.
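The protein counts quoted above are internally consistent, as the following short Python check shows. It uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts taken directly from the text.
identified = {"parental": 1751, "mutant": 1757}
methylated = {"parental": 1158, "mutant": 591}
methylated_both = 497

# Proteins methylated in one strain only.
only_parental = methylated["parental"] - methylated_both
only_mutant = methylated["mutant"] - methylated_both

# Fraction of identified proteins that were methylated in each strain.
fractions = {strain: methylated[strain] / identified[strain] for strain in identified}

print(only_parental, only_mutant)
for strain, fraction in fractions.items():
    print(f"{strain}: {fraction:.1%} of identified proteins methylated")
```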
Since aKMT was initially identified in a search for enzyme(s) responsible for the methylation of Cren7 (SiRe_1111), an abundant chromatin protein isolated from S. islandicus (35), we were interested in comparing the methylation state of Cren7 from the parental strain with that from the mutant strain. As reported previously, Cren7 from the parental strain was methylated at multiple lysine residues (32,35). Monomethylated Lys11 and Lys16, dimethylated Lys24 and Lys42, and trimethylated Lys16 were detected. By comparison, Cren7 from ΔaKMT appeared to be unmethylated, suggesting that the protein was methylated by aKMT in vivo.
A survey of methylated lysine residues revealed a far more drastic difference than that of methylated proteins between the parental and the mutant strains. A total of 3718 and 1074 methylated residues were identified in the parental strain and ΔaKMT, respectively (Table I). Furthermore, there were 3003 monomethylated, 745 dimethylated, and 540 trimethylated lysine residues in the parental strain, and 274 monomethylated, 566 dimethylated, and 351 trimethylated lysine residues in ΔaKMT. Since there were over 10-fold more monomethylated residues but only slightly more dimethylated or trimethylated residues in the parental strain than in ΔaKMT, aKMT is probably responsible primarily for the monomethylation of lysine residues. In addition, most of the proteins were methylated at multiple sites in the parental strain. The thermosome (SiRe_1214) was the most highly methylated polypeptide detected, with 24 methylated lysine residues (Fig. 3). In comparison, most methylated proteins contained only one methylated lysine residue in ΔaKMT. Based on these data, we conclude that aKMT is a major protein methyltransferase in S. islandicus.
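The fold differences behind this comparison can be reproduced from the quoted counts with a few lines of Python. Note that the per-state counts need not sum to the total number of methylated residues, because a single residue can be detected in more than one methylation state; the variable names below are illustrative.

```python
# Residue counts per methylation state, taken from the text.
parental = {"mono": 3003, "di": 745, "tri": 540}
mutant = {"mono": 274, "di": 566, "tri": 351}

# Ratio of parental to mutant counts for each methylation state.
fold = {state: parental[state] / mutant[state] for state in parental}

for state, ratio in fold.items():
    print(f"{state}methylated: {ratio:.1f}-fold more in the parental strain")
```

The monomethylated class stands out with a ratio above 10, whereas the ratios for the di- and trimethylated classes stay below 2.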
To determine if a lysine residue could be methylated to various extents, we investigated the methylation state of each lysine residue in the methylated proteins. The proteins identified in the parental strain and ΔaKMT contained 17,784 and 16,941 lysine residues, respectively. Approximately 20.9% (3,718) of the identified lysine residues in the parental strain and 6.3% (1,074) of those in ΔaKMT were methylated (Table I). Only 28.7% (1,067) and 20.5% (220) of the methylated residues were detected in a single methylation state in the parental strain and ΔaKMT, respectively (Fig. 4). The majority of the methylated residues were found in more than one methylation form. For example, 40 residues in the parental strain and 12 residues in ΔaKMT existed in all four forms, i.e. the monomethylated, dimethylated, trimethylated, and unmethylated forms. Differential methylation of the lysine residues may play a regulatory role in the cell, as in the case of eukaryotic histone methylation (60). However, it is worth noting that the S. islandicus cells used in this study were not synchronized and, therefore, different methylation states for a given lysine residue may be related to cells in different phases of the cell cycle.
To investigate if protein lysine methylation by aKMT shows sequence or structural specificity, amino acid sequences flanking the sites of methylation were analyzed (Fig. 5A). The secondary structures of the amino acid sequences were also predicted (Fig. 5B). No apparent bias in the flanking sequences was detected for the sites of methylation. The lack of sequence specificity is consistent with the ability of aKMT to catalyze methylation of a large number of proteins, often at multiple sites, in the cell. On the other hand, lysine methylation occurred preferentially in the helix regions or regions immediately adjacent to a helix on the C-terminal side in substrate proteins. To shed light on the location of methylated lysine residues in the three-dimensional structure of a protein, we conducted a survey on several S. solfataricus chromatin and replication proteins with a known or partially known structure, i.e. PCNA, Lig1, MCM, PriS/PriL, RFC, Cren7, and Sul7. These proteins are highly similar to their S. islandicus homologues, whose structures are not yet available. Our data showed that these proteins were all methylated at multiple sites in the parental strain but were not methylated in ΔaKMT. We found that methylated residues, when visible in the structure, were all located on the surface of these proteins and were thus solvent exposed, as exemplified by Lig1, a monomeric protein, or PCNA, a trimeric protein complex (supplemental Fig. S9). However, not all surface-located lysine residues were detectably methylated. Therefore, our results indicate that lysine residues on the surface of a protein were targeted by aKMT for methylation using an unknown recognition mechanism in vivo.
Table footnotes: a, Proteins and lysine residues were methylated in the parental strain and not in ΔaKMT. b, Proteins and lysine residues were methylated in ΔaKMT and not in the parental strain.
FIG. 4. Schematic representation of lysine residues that were methylated to various extents. The extents of methylation on lysine residues, as identified by mass spectrometry in parent (A) or ΔaKMT (B), are depicted in a Venn diagram. Unmethylated, monomethylated, dimethylated or trimethylated lysine residues are indicated by none, methyl, dimethyl, or trimethyl, respectively.
We also performed an arCOG analysis on proteins that were methylated in the parental strain but not in ΔaKMT and were, therefore, probably methylated by aKMT (supplemental Table S4). These proteins were distributed without apparent bias across functional categories (22 of the 26 arCOG categories), suggesting that protein methylation catalyzed by aKMT is not restricted to specific cellular functions.
The Structural Basis of the Broad Substrate Specificity of aKMT-In order to understand the substrate recognition and the catalytic mechanism of aKMT, we sought to determine the crystal structure of the protein. Se-Met aKMT alone, aKMT alone and aKMT with added SAM were used in crystallization trials. Crystals were obtained for all three samples under the same crystallization conditions, and these structures were similar (supplemental Table S5). In addition, all three structures contained a SAM molecule in each protomer, indicating that an endogenous SAM molecule was bound by recombinant aKMT overproduced in E. coli.
As predicted from the sequence alignment, aKMT contains a conserved methyltransferase domain of the IPR025714-type but lacks the substrate recognition domain, as compared with the homologous L11 methyltransferase PrmA from the hyperthermophilic bacterium Thermus thermophilus (supplemental Fig. S3, (35)). The complex consists of four α helices (α1-α4) and a central seven-stranded β sheet (β1-β7), a structure typical of a number of SAM-dependent methyltransferases (61). In the β sheet, β1 through β6 strands are parallel to one another, whereas β7 is antiparallel to the others and inserted into the sheet between β5 and β6 (supplemental Fig. S4). A SAM molecule is bound in the SAM binding pocket within the aKMT molecule. The SAM binding pocket is negatively charged but contains a hydrophobic adenine binding region. The pocket has a wide opening and narrows down toward the center of the protein. The methionyl moiety of SAM inserts into the pocket, with the adenine group pointing outside of the pocket (Figs. 6A and 6B). Amino acid residues interacting with SAM are shown in Fig. 6C. Six of them, i.e. Thr12, Asp34, Gly36, Glu59, Asn87, and Phe88, interact with SAM via hydrogen bonding, and the remaining ten residues form hydrophobic interactions with SAM (Fig. 6C). Because of their structural roles, many of the residues (e.g. Asp34, Gly36, Glu59, Thr12, and Asn87) are conserved (35). The importance of these residues to the activity of aKMT is further verified by a site-directed mutagenesis experiment in which a single alanine substitution for Thr12, Asp34, Gly36, Glu59, or Asn87 resulted in over 90% loss of the methyltransferase activity of the protein (supplemental Fig. S5).
Because aKMT appears to be responsible primarily for the monomethylation, rather than di- or trimethylation, of proteins, we were interested in examining the cycle of methyl transfer by the enzyme. We solved the crystal structure of aKMT containing the reaction product S-adenosyl homocysteine (SAH) in space group P2 at 1.84 Å (Figs. 6D and 6E). The structures of aKMT-SAH and aKMT-SAM were highly superimposable, with a root-mean-square deviation (RMSD) of 0.432 Å, and identical residues were found to interact with both SAM and SAH. However, the N-terminal portion (residues 1-8) of the protein adopts very different positions in the two structures. In the aKMT-SAM complex, the N terminus of the protein extends outwards, creating a large opening for the pocket. Therefore, a substrate lysine residue may be readily inserted into the pocket and form a close contact with the methyl group of the SAM molecule. In comparison, the N-terminal seven residues (Ser2-Pro8) in the aKMT-SAH complex cover the opening of the pocket as the result of a 90° turn of the main chain between Pro8 and Tyr9, drastically reducing the size of the entrance of the pocket. The SAH molecule is thus totally buried in the pocket of aKMT. It appears that aKMT may exist in two conformations, i.e. the open conformation (aKMT-SAM) and the closed conformation (aKMT-SAH). We speculate that the N terminus serves as a switch between the two conformations in regulating the reaction process of the enzyme. Given its mode of action, along with its lack of the substrate recognition domain and, thus, its presumably low binding affinity for substrates, aKMT may catalyze methyl transfer in a distributive fashion.
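The RMSD quoted for the aKMT-SAH and aKMT-SAM structures summarizes the distances between equivalent atoms in the two models. The sketch below shows only the final arithmetic, assuming that the two coordinate sets have already been optimally superimposed and that atoms are paired in the same order; the superposition step itself is omitted, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units, e.g. Å) between paired atoms.

    coords_a, coords_b: equal-length lists of (x, y, z) tuples that are assumed
    to be superimposed already and listed in matching atom order.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy coordinates, not atoms from the aKMT structures.
model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = rmsd(model_1, model_2)
```

Because the deviation is averaged over all paired atoms, a small global RMSD is compatible with a large local rearrangement of a few residues, such as the N-terminal segment discussed above.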

Cellular Proteins from the Parental and the aKMT Deletion Mutant Strains Show Similar Thermal Stability-Methylation has been suggested to increase the stability of proteins from Eukarya and Archaea (12,26,62,63). Because proteins are extensively methylated in S. islandicus, and aKMT is responsible for much of the protein lysine methylation in this organism, we were interested in learning if cellular proteins from the mutant differed from those from the parental strain in thermal stability. We first followed the change in optical density at 600 nm of cell-free extracts from the parental strain and ΔaKMT grown in the exponential growth phase with increasing temperature by using a spectrometric assay (26). As shown in Fig. 7A, cellular proteins from both the parental strain and ΔaKMT started to denature at ∼85°C. We then compared the thermal stability of the cellular proteins from the parental strain with that from the mutant using a light scattering-based protein aggregation assay. Proteins from the two strains were similar in sensitivity to thermal denaturation (Fig. 7B). Taken together, these results reveal no significant differences in the thermal stability between cellular proteins from the parental strain and those from ΔaKMT.
FIG. 6. A, The structure is shown as a ribbon diagram with the α helices and the β sheets colored in red and yellow, respectively. The SAM molecule is shown as gray sticks. The purple sphere represents a magnesium ion. The N and C termini are labeled with the respective letters. B, The solvent-accessible surface of aKMT in complex with SAM, colored according to electrostatic potential. Blue, positively charged; red, negatively charged; white, neutral. Electron density of a 2Fo-Fc simulated annealing (SA) omit map for SAM bound in the catalytic pocket contoured at 1.0 σ is shown. The SAM molecule is shown as gray sticks. C, Schematic diagram summarizing the interactions between aKMT and SAM in the aKMT-SAM structure generated by LIGPLOT (67). Interacting atoms are connected by green dashed lines with bonding lengths indicated (in Å). Nonligand residues involved in direct hydrophobic contacts with SAM are shown as red semicircles with radiating spokes. D, The solvent-accessible surface of aKMT in complex with SAH, colored according to electrostatic potential. Blue, positively charged; red, negatively charged; white, neutral. The SAH molecule is shown as green sticks. E, Comparison of the structures of aKMT-SAH (cyan) and aKMT-SAM (green).

Transcriptomic [...] SCVy medium and harvested during the exponential growth phase, conditions where the difference between the two strains in growth was minimal. RNA-seq was employed to determine the levels of gene expression in the parental strain and ΔaKMT. 10,040,707 and 10,500,000 sequencing reads were obtained from the parental and the mutant samples, respectively (supplemental Table S6). Each transcript was covered by an average of 70 reads, and the transcripts of about 89% of the genes had a coverage of 90-100%. Reads mapped to open reading frames (ORFs), rRNAs and intergenic regions in the S. islandicus genome accounted for 14.86, 82.34, and 0.22% of the total reads, respectively (supplemental Fig. S7). Using a cutoff of a twofold difference in gene expression, the transcription levels of 43 or 42 genes increased or decreased, respectively, in ΔaKMT, as compared with those in the parental strain (Fig. 8, Table II and supplemental Table S7). The number of these genes accounted for ∼3.3% of the total number of genes (∼2600) detected in this study. The expression of 4 and 8 genes was up- and down-regulated, respectively, by more than fourfold. The changes in gene expression revealed by RNA-seq were confirmed by qRT-PCR performed on ten selected genes (supplemental Fig. S8).
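The twofold cutoff used above can be sketched as a simple classification on the ratio of mutant to parental expression. This is a minimal illustration and not the analysis pipeline used in the study: the function names and the normalized expression values are hypothetical, and only the gene counts (43, 42, and ∼2600) come from the text.

```python
def classify_fold_change(parent_level, mutant_level, cutoff=2.0):
    """Label a gene as 'up', 'down', or 'unchanged' in the mutant.

    Levels are assumed to be positive, normalized expression values.
    """
    if parent_level <= 0 or mutant_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = mutant_level / parent_level
    if ratio > cutoff:
        return "up"
    if ratio < 1.0 / cutoff:
        return "down"
    return "unchanged"


def percent_differential(n_up, n_down, n_total):
    """Percentage of detected genes that pass the cutoff in either direction."""
    return 100.0 * (n_up + n_down) / n_total


# Hypothetical expression values (parent, mutant) for three illustrative genes.
examples = {"gene_A": (10.0, 25.0), "gene_B": (10.0, 4.0), "gene_C": (10.0, 15.0)}
labels = {name: classify_fold_change(p, m) for name, (p, m) in examples.items()}

# Counts reported in the text: 43 up, 42 down, about 2600 genes detected.
share = round(percent_differential(43, 42, 2600), 1)
```

With the reported counts, `share` evaluates to 3.3, matching the ∼3.3% stated above.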
The down-regulated genes included those involved in membrane transport and in energy production and conversion (Table II and supplemental Table S7). Expression of genes encoding the ATP-binding protein of an ABC transporter (SiRe_0588), a protein of the major facilitator superfamily (SiRe_2482), a general substrate transporter (SiRe_1708) and an ABC-type oligopeptide/nickel transport system (SiRe_2283) decreased by 5.0-, 2.5-, 2.2-, and 2.1-fold, respectively, in the mutant strain, as compared with that in the parental strain. In addition, several genes encoding proteins with putative functions in redox reaction and electron transfer, e.g. 4Fe-4S ferredoxin (SiRe_2422, SiRe_2424), molybdopterin oxidoreductase (SiRe_2425), and polysulfide reductase NrfD (SiRe_2427), were down-regulated in ΔaKMT. On the other hand, genes encoding archaellum proteins FlaB and FlaF were up-regulated in ΔaKMT (Table II). However, the mutant did not show altered motility (data not shown). Also among the up-regulated genes in the mutant strain were those encoding transcription factors, e.g. CopG-like regulator (SiRe_0131) and TetR protein (SiRe_0301), and stress response proteins, e.g. Dps family protein (SiRe_0453) and VapB-type protein (SiRe_0374). Notably, genes encoding two enzymes involved in aromatic catabolism (SiRe_0706 and SiRe_0707) were up-regulated by 5-7 fold in ΔaKMT. In general, however, the expression of the vast majority of the genes was not significantly affected by aKMT-catalyzed protein methylation in S. islandicus.

DISCUSSION

Although it has been suggested that proteins are extensively methylated in crenarchaea (6), the present study shows for the first time the extent and pattern of protein lysine methylation in an archaeal proteome. Strikingly, methylated proteins accounted for 66% of the cellular proteins, and the methyllysines were about 21% of the lysine residues identified in S. islandicus.
FIG. 7. Thermal stability of total cellular proteins from the parental and the mutant strains. Cells from an exponentially grown culture of the parental strain or ΔaKMT were harvested, resuspended in 50 mM sodium phosphate buffer, pH 7.0, and sonicated. A, The clarified cell-free extracts were heated from 50 to 95°C at a rate of 0.2°C/min, and the increase in absorbance at 600 nm was recorded on a Shimadzu UV-2550 spectrophotometer. B, The cell-free extracts were incubated at 65, 75, 85, or 95°C for 15, 30, 60, and 120 min.

To the best of our knowledge, this organism harbors the most extensively methylated proteome that has ever been reported. The availability of an S. islandicus mutant defective in aKMT permits an analysis of potential roles of protein methylation in general, and aKMT in particular, in the cell. The proportions of the methylated proteins and lysine residues over the total proteins and lysine residues identified in the mutant were reduced to 33.4 and 6.3%, respectively. These results show that aKMT is a major protein methyltransferase responsible for the methylation of the majority of methylated cellular proteins in S. islandicus. The presence of the remaining methylated proteins in the mutant strain indicates that one or more additional protein methyltransferase(s) must exist. The unknown protein methyltransferase(s) appear to differ from aKMT in target site selection but also share some sites with it, as revealed by the comparison of patterns of protein methylation in the parental and the mutant strains. No genes encoding other protein lysine methyltransferases have been identified by genome sequence analysis so far. Notably, most of the methylated residues (2,651/3,718) were methylated to various extents from an unmethylated to a trimethylated form in the parental strain.
This may indicate the presence of either a dynamic balance between methylation and demethylation or a continuing methylation process with trimethylation as the end point at the target residues. Protein demethylases required for the former process have yet to be identified but are likely present in Archaea. In any case, the presence of multiple methylation states in proteins may serve a regulatory role that remains to be understood. No apparent sequence preference by aKMT was detected through the analysis of sequences flanking methylated lysine residues, in agreement with the previous finding that the enzyme exhibits broad substrate specificity (35,36). Given the observation that ∼21% of the total lysine residues in cellular proteins were methylated and the report that lysine residues undergoing methylation were often found on the surface of a protein (6), most, if not all, lysine residues accessible to methyltransferases are presumably potential targets of posttranslational modification by protein methyltransferases in the cell. This is consistent with the observation that methylated lysine residues are located on the surface of the selected proteins. The structural basis for the broad substrate specificity of aKMT was explored by the crystallographic analysis of the aKMT-SAM and aKMT-SAH complexes. As revealed by sequence alignment (35), aKMT is structurally homologous to the C-terminal catalytic domain of bacterial ribosomal protein L11 methyltransferase (PrmA), but lacks the N-terminal substrate recognition domain of the bacterial protein. Therefore, unlike PrmA, which specifically methylates L11, aKMT is able to catalyze methylation of a large number of proteins. By superimposing the structure of aKMT with that of bacterial PrmA in complex with its substrate L11, we identified a putative active site in the archaeal protein.
Like its bacterial homolog, aKMT appears to possess an active site that provides a hydrophobic environment but contains no identifiable residue that could serve as a general base to facilitate the deprotonation of the substrate (24). The pKa of the substrate lysine residue would conceivably be lowered at the hydrophobic active site (64), permitting solvent-mediated deprotonation of the amino group during the methylation reaction.
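The effect of a lowered pKa on deprotonation can be illustrated with the Henderson-Hasselbalch relationship, under which the deprotonated fraction of an amine is 1/(1 + 10^(pKa - pH)). The sketch below is for illustration only: 10.5 is a typical textbook value for the side chain of free lysine, whereas the lowered value of 8.5 and the pH of 7.0 are hypothetical and were not measured in this study.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of an amino group in the neutral (deprotonated) form.

    Derived from the Henderson-Hasselbalch equation:
    [B] / ([B] + [BH+]) = 1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PH = 7.0             # hypothetical pH, chosen for illustration
PKA_SOLUTION = 10.5  # typical textbook value for a free lysine side chain
PKA_LOWERED = 8.5    # hypothetical value for a hydrophobic active site

in_solution = fraction_deprotonated(PKA_SOLUTION, PH)
in_active_site = fraction_deprotonated(PKA_LOWERED, PH)
enrichment = in_active_site / in_solution  # fold increase in the reactive form
```

Under these assumed values, a drop of two pKa units raises the deprotonated, nucleophilic fraction by roughly two orders of magnitude, which is the kind of effect that would make solvent-mediated deprotonation plausible without a general base.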
Intriguingly, the N-terminal portion (the second to the eighth residues) of aKMT appears to adopt two distinct conformations [...]

Given the extensive protein methylation in S. islandicus and possibly in crenarchaea in general, it would be of great interest to understand the potential physiological role of the posttranslational modification. Protein methylation in bacteria and eukaryotes appears to be more restricted to subsets of proteins (such as ribosomal and flagellar proteins in bacteria and histone proteins in eukaryotes) and catalyzed by methyltransferases in a more specific manner than that in archaea. By comparison, proteins in various COG categories appear to be equally well methylated in S. islandicus. Therefore, it appears that, although protein methylation may play a regulatory role in specific processes in bacteria and eukaryotes, this posttranslational modification affects primarily the overall biochemical properties of cellular proteins in archaea. As more methylated Sulfolobus proteins (e.g. glutamate dehydrogenase, aspartate aminotransferase) were reported over the years, it has been speculated that methylation enhances the thermal stability of a protein and represents an adaptation of the hyperthermophilic organism to growth in hot environments (26,28). Examples in support of the contention include the observation that native (methylated) and recombinant (unmethylated) chromatin protein Sac7d, a member of the Sul7d family from S. acidocaldarius, differ by ∼6°C in melting temperature (Tm) (65). However, no significant differences were detected in thermal stability between the methylated and unmethylated forms of Sso7d, a close homolog of Sac7d from S. solfataricus (66). Taking advantage of the availability of the aKMT deletion mutant, in which the level of protein methylation was substantially lower than that in the parental strain, we compared the thermal stability of the cellular proteins from the two strains.
No significant differences were detected, suggesting the lack of contribution of methylation to the overall thermal stability of cellular proteins in S. islandicus. However, these results do not rule out the possibility that the thermal stability of some proteins may be affected by methylation. It remains to be investigated how methylation would affect cellular proteins in archaea, but a number of possibilities exist. For instance, methylation of the ε-amino group of a lysine residue in a protein would lower the pKa of the residue, increasing the hydrophobicity of the protein (26). This effect is probably substantial because a large proportion of the surface-accessible lysine residues in Sulfolobus proteins are potential targets for methylation. Therefore, protein lysine methylation may serve to modulate protein-protein and protein-nucleic acid interactions in Sulfolobus.
In agreement with the lack of a significant effect of methylation on the thermal stability of cellular proteins, the mutant strain was able to grow nearly as well as the parental strain at 75°C in the nutrient-rich medium. Growth of the mutant was more significantly affected at 65°C than at 75°C, as compared with that of the parental strain, when they were both grown in the nutrient-poor medium. Therefore, the processes affected by the defect in protein methylation appeared to become growth-limiting under conditions where optimal growth was hindered. As expected from the growth phenotypes of the parental and the mutant strains, the transcriptional profiles of the two strains are in general similar, with differentially expressed genes (>twofold difference in expression) accounting for only a small fraction of the total number of genes determined (85 out of ∼2600). Presumably, transcription of these differentially expressed genes is altered as a direct or indirect result of changes in the methylation state of upstream regulatory factors. Many of these differentially expressed genes are clustered in operons, and the expression of genes in an operon is often changed in the same direction, supporting the above contention. The transcriptomic comparison of the parental and the mutant strains fails to provide a convincing interpretation of the differences between the two strains in growth, but it appears to yield some interesting clues. For example, several genes with putative functions in substrate transportation and energy metabolism were significantly down-regulated, whereas some genes encoding transcriptional regulators and stress response proteins were up-regulated in the mutant cells. However, it remains to be determined if these changes are related to the slower growth phenotype of the mutant strain.
Our study suggests that there is a distinct difference between bacteria/eukarya and archaea in the pattern and, possibly, the function of protein lysine methylation. As a well-studied eukaryotic example, methylation of histones plays a key role in the epigenetic regulation of the chromatin structure and function. On the other hand, the finding in the present study supports a role for aKMT-mediated protein methylation in the growth of the organism under nutrient-poor conditions. A better understanding of the physiological role of protein methylation in archaea clearly awaits further investigation.

§ § These authors contributed equally to this work.