Archaeal Ubiquitin-like SAMP3 is Isopeptide-linked to Proteins via a UbaA-dependent Mechanism

SAMP1 and SAMP2 are ubiquitin-like proteins that function as protein modifiers and are required for the production of sulfur-containing biomolecules in the archaeon Haloferax volcanii. Here we report a novel small archaeal modifier protein (named SAMP3) with a β-grasp fold and C-terminal diglycine motif characteristic of ubiquitin that is functional in protein conjugation in Hfx. volcanii. SAMP3 conjugates were dependent on the ubiquitin-activating E1 enzyme homolog of archaea (UbaA) for synthesis and were cleaved by the JAMM/MPN+ domain metalloprotease HvJAMM1. Twenty-three proteins (28 lysine residues) were found to be isopeptide-linked to the C-terminal carboxylate of SAMP3, and 331 proteins were reproducibly found associated with SAMP3 in a UbaA-dependent manner based on tandem mass spectrometry (MS/MS) analysis. The molybdopterin (MPT) synthase large subunit homolog MoaE, found samp3ylated at conserved active site lysine residues in MS/MS analysis, was also shown to be covalently bound to SAMP3 by immunoprecipitation and tandem affinity purifications. HvJAMM1 was demonstrated to catalyze the cleavage of SAMP3 from MoaE, suggesting a mechanism of controlling MPT synthase activity. The levels of samp3ylated proteins and samp3 transcripts were found to be increased by the addition of dimethyl sulfoxide to aerobically growing cells. Thus, we propose a model in which samp3ylation is covalent and reversible and controls the activity of enzymes such as MPT synthase. Sampylation of MPT synthase may govern the levels of molybdenum cofactor available and thus facilitate the scavenging of oxygen prior to the transition to respiration with molybdenum-cofactor-containing terminal reductases that use alternative electron acceptors such as dimethyl sulfoxide. Overall, our study of SAMP3 provides new insight into the diversity of functional ubiquitin-like protein modifiers and the network of ubiquitin-like protein targets in Archaea.

transferred to its respective substrate to generate sulfur-containing biomolecules (e.g. the thiazole moiety of thiamine and molybdopterin (MPT) of MoCo).
Several Ubl proteins have been demonstrated to function in both protein modification and sulfur transfer. For example, the eukaryotic Ub-related modifier 1, originally shown to function as a Ubl protein modifier during oxidative stress (13,14), acts as a sulfur carrier in the formation of 2-thiouridine modified tRNA (15-18). Similarly, TtuB, a Ubl protein of the thermophilic bacterium Thermus thermophilus, is required for sulfur transfer in the thiolation of tRNA (19,20) and covalently modifies proteins (21).
Here we extend our knowledge of post-translational mechanisms in archaea beyond SAMP1/2 conjugation by demonstrating that SAMP3 (recently shown by NMR structural studies to form a classic β-grasp fold in a salt-independent manner) (30) can act as a Ubl protein modifier in Hfx. volcanii. Samp3ylation was found to be dependent on the C-terminal diglycine motif of SAMP3 and the E1-like enzyme UbaA. SAMP3 was also shown to modify the lysine residues of various proteins through a covalent isopeptide bond, with cleavage of these conjugates mediated by HvJAMM1, suggesting that samp3ylation is reversible. In particular, SAMP3 was found to be linked to the conserved active site residues of the molybdopterin synthase large subunit homolog of Hfx. volcanii (MoaE). These results reveal that SAMP3 is a small protein modifier and suggest that samp3ylation regulates a variety of cellular functions including MoCo biosynthesis.

EXPERIMENTAL PROCEDURES
Materials-Biochemicals and analytical-grade inorganic chemicals were purchased from Fisher Scientific (Atlanta, GA), Bio-Rad (Hercules, CA), and Sigma-Aldrich (St. Louis, MO). Desalted oligonucleotides were from Integrated DNA Technologies (Coralville, IA). DNA polymerases and modifying enzymes were from New England Biolabs (Ipswich, MA). Hi-Lo DNA standards were from Minnesota Molecular, Inc. (Minneapolis, MN).
Strains, Media, and Plasmids-Strains, primers, and plasmids used in this study are summarized in supplemental Tables S1 and S2. All liquid cultures were grown with rotary shaking at 200 rpm. E. coli strains were grown at 37°C in Luria-Bertani medium. Hfx. volcanii strains were grown at 42°C in either rich or minimal media as previously described (25,31). Rich media included yeast-peptone-casamino acids (YPC) and ATCC 974 complex medium. In minimal media, carbon sources were glycerol (G), glucose (Glu), and lactate (L) at 20 mM final concentration as indicated. Minimal media were also supplemented with 25 mM alanine (and ammonium chloride was excluded), 1x YPC nutrients (0.5% (w/v) yeast extract, 0.1% (w/v) peptone, and 0.1% (w/v) casamino acids), 15-100 mM dimethyl sulfoxide (DMSO), 15 mM dimethyl formamide (DMF), 15 mM trimethylamine N-oxide (TMAO), and 15-75 mM potassium nitrate (KNO3). Ampicillin (0.1 mg·ml⁻¹), novobiocin (0.1 μg·ml⁻¹), and agar (1.5% (w/v)) were included as needed. Uracil was dissolved to 50 mg·ml⁻¹ in 100% (v/v) DMSO or 500 mM NaOH and supplemented to a final concentration of 50 μg·ml⁻¹ in growth medium. Uracil solutions, TMAO, DMF, and DMSO were sterilized by passage through a 0.2-μm nylon filter (Fisher) prior to their addition to sterile medium. Potassium nitrate was prepared as a 1 M stock and autoclaved prior to addition to sterile medium. Minimal medium components were added to YPC medium at their respective final concentrations where indicated. To monitor SAMP3 conjugation in cell lysate, cells were freshly inoculated from −80°C glycerol stocks onto agar plates with ATCC 974 complex medium and novobiocin. Cells from freshly inoculated plates were inoculated into 13 × 100 mm culture tubes containing 4 ml of medium and grown to stationary phase (A600 of 1.5 to 3.5).
For immunoprecipitation experiments, cells from freshly streaked plates were either directly inoculated into 500-ml Erlenmeyer flasks containing 100 to 200 ml of medium or inoculated into 13 × 100 mm culture tubes (4 ml), grown overnight, and subcultured into 100 to 200 ml of medium in 500-ml Erlenmeyer flasks using 1% (v/v) inoculum. To monitor the effect of velcade (bortezomib) (Fisher) on SAMP conjugation, isolated colonies of the Hfx. volcanii strains were inoculated into 4 ml of YPC medium (13 × 100 mm culture tubes) and grown to log phase. Cells were subcultured into 4 ml of fresh YPC medium and again grown to log phase (A600 0.5 to 0.7). Velcade (7.7 μl of 52 mM velcade stock dissolved in ≥99.8% (w/v) DMF) and DMF solvent alone (7.7 μl DMF at 99.8% (w/v)) were added to log-phase cells (final concentrations at 100 μM velcade and 25 mM DMF). Cultures were incubated with shaking (200 rpm) at 42°C for 24 h prior to harvest for immunoblot analysis.
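The final concentrations quoted for the velcade treatment follow from simple dilution arithmetic, which can be sketched as below. This is an illustrative check only: the function name is hypothetical, the solvent is treated as neat DMF, and the DMF density (0.944 g·ml⁻¹) and molar mass (73.09 g·mol⁻¹) are assumed handbook values that are not stated in the text.

```python
def final_concentration_mM(stock_mM, added_ul, culture_ml):
    """Apply C1*V1 = C2*V2, counting the added volume in the final volume."""
    total_ul = culture_ml * 1000.0 + added_ul
    return stock_mM * added_ul / total_ul


# Velcade: 7.7 ul of a 52 mM stock added to a 4 ml culture (values from the text).
velcade_uM = final_concentration_mM(52.0, 7.7, 4.0) * 1000.0

# Neat DMF molarity from ASSUMED density and molar mass (not given in the text).
DMF_DENSITY_G_PER_ML = 0.944
DMF_MOLAR_MASS_G_PER_MOL = 73.09
dmf_stock_mM = DMF_DENSITY_G_PER_ML / DMF_MOLAR_MASS_G_PER_MOL * 1e6

# DMF solvent control: 7.7 ul of neat DMF added to a 4 ml culture.
dmf_final_mM = final_concentration_mM(dmf_stock_mM, 7.7, 4.0)

print(f"velcade ~ {velcade_uM:.1f} uM, DMF ~ {dmf_final_mM:.1f} mM")
```

Under these assumptions the two values come out close to the stated 100 μM velcade and 25 mM DMF.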
Pairwise Comparison-Protein sequences related to Hfx. volcanii gi 302595883 (SAMP1), gi 292654382 (SAMP2), and gi 292656305 (HVO_2177, Met22-Gly113 correspond to SAMP3) were retrieved via Basic Local Alignment Search Tool (BLAST) (blastp and tblastn) query (32) of the nonredundant protein sequences (nr) and nucleotide collection (nr/nt) databases (May 2013). Protein sequences were aligned by ClustalW (33), and alignments were visualized in the graphic view of BioEdit v7.2.0 (34). Protein sequences with unique N- and C-terminal tails were trimmed and their evolutionary history was inferred using the Neighbor-Joining method (35). The evolutionary distances were computed using the p-distance method (36) in units of the number of amino acid differences per site. Cluster analysis involved 56 amino acid sequences. All ambiguous positions were removed for each sequence pair. There were a total of 136 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 (37). Species origins of protein sequences included in Fig. 1 are listed in supplemental Table S3.
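The p-distance with removal of ambiguous positions per sequence pair can be illustrated with a short sketch. This is not the MEGA5 implementation; the function name, the set of characters treated as ambiguous, and the toy sequences are assumptions made for illustration.

```python
def p_distance(seq_a, seq_b, ambiguous="-X?"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence carries a gap or ambiguity code are dropped
    for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ambiguous or b in ambiguous:
            continue  # skip sites that are not comparable for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


# Toy aligned fragments (made up): one mismatch over five comparable sites.
toy_distance = p_distance("MKV-GG", "MRVAGG")
print(toy_distance)
```

A matrix of such pairwise values is the kind of input a Neighbor-Joining tree is built from.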
Coding Potential Scoring-Coding potential scoring was performed with the high-scoring-segment method, based on cumulative scores assigned to codon-like trinucleotides, and the G-test method, based on the cumulation of associations between nucleotide types and codon base positions.
Fold Recognition and Three-dimensional Structural Modeling-The Protein Homology/Analogy Recognition Engine 2 (Phyre2) Web-based server (38,39) was used for fold-recognition and model building. In brief, the primary amino acid sequences of HVO_2178 and HVO_2179 (40) were submitted to the Phyre2 threading server using intensive mode, combining HHsearch for remote homology detection based on pairwise comparison of hidden Markov models with ab initio and multiple-template modeling. The library of known protein structures for comparison by Phyre2 was from the Protein Data Bank and Structural Classification of Proteins databases. Chimera 1.7 (41) was used as an interface for interactive visualization and analysis of three-dimensional structures modeled at >90% accuracy.
Total RNA Isolation-For mapping of the 5′ and 3′ ends of the samp3 transcript, RNA was isolated from Hfx. volcanii H26 cells grown aerobically in ATCC 974 complex medium with and without 100 mM DMSO supplementation to early log phase using RNeasy RNA purification columns (Qiagen, Hilden, Germany) with the following modification: 3 U of amplification grade DNase I (Sigma-Aldrich) was added per 1 μg of RNA, and the mixture was incubated for 45 min at room temperature. In addition, buffer RW1 was omitted during the purification process, similar to the procedure of Babski et al. (42). For Northern blot analysis, total RNA was isolated from Hfx. volcanii H26 log-phase cells (37.5 ml of 100 ml culture in a 500-ml flask; 42°C) grown in ATCC 974 complex medium with and without supplementation with 100 mM DMSO as previously described with a typical yield of 100 to 150 μg of RNA (24). All RNA concentrations were determined by A260 using a Bio-Rad SmartSpec 3000 instrument. RNA integrity was confirmed by ethidium bromide stain after separation by 0.8% (w/v) agarose gel electrophoresis in buffer (40 mM Tris, 20 mM acetic acid, and 1 mM EDTA) at pH 8.0.
Northern Blot Analysis-Total RNA (15 μg per lane; isolated as described in the section "Total RNA Isolation") was denatured and fractionated via electrophoresis (4 h, 50 V) on a 1.2% (w/v) agarose gel supplemented with 0.8% (w/v) formaldehyde in 1× MOPS running buffer (20 mM MOPS (pH 7.0), 5 mM sodium acetate, 1 mM EDTA) according to standard procedures (43). RNA molecular mass standards labeled with digoxigenin (DIG-11-dUTP) (0.3- to 6.9-kb RNA ladder; Roche Molecular Biochemicals, Indianapolis, IN) were included. After three rinses with diethylpyrocarbonate-treated water, the formaldehyde gel was incubated (45 min) in 10X saline sodium citrate (SSC) (where 20X SSC is 3 M NaCl plus 0.3 M sodium citrate (pH 7.0)). RNA was transferred to a BrightStar-Plus nylon membrane (Ambion, Austin, TX) by means of upward overnight capillary transfer in 20X SSC, crosslinked to the membrane using a UV Stratalinker 2400 (Stratagene, La Jolla, CA), and hybridized overnight at 50°C with a digoxigenin (DIG)-labeled double-stranded DNA probe of 264 bp specific for samp3. The samp3-specific probe was generated via PCR using primers 5′-ACGCCTCCGCGTCCTCGCC-3′ and 5′-GTGGACCACCTCGCGCCCGT-3′ and plasmid pJAM1112 as a template. Taq DNA polymerase (Bioline, Taunton, MA) was used according to the supplier's recommendations with the following modifications: 3% (v/v) DMSO was included, and the 1× DIG deoxyribonucleoside triphosphate mixture (catalog no. 1277065; Roche) was supplemented with mixed deoxynucleotides (New England Biolabs) to 0.1 mM. For hybridization, membranes with the cross-linked RNA samples were equilibrated in DIG Easy Hyb solution (catalog no. 11603558001; Roche) and then incubated with 100 to 150 ng of labeled probe in 10 ml of DIG Easy Hyb (16 h, 50°C). Membranes were washed with 2× SSC with 0.1% (w/v) SDS (twice, for 5 min each) and with 0.1× SSC with 0.1% (w/v) SDS (twice, for 15 min each, at 50°C).
Hybridization products were detected in a chemiluminescent (CSPD*) DIG immunoassay (Roche). Chemiluminescent signals were recorded on x-ray film with exposure times ranging from 60 to 180 min.
Mapping 5′ and 3′ Ends of samp3 Transcript-Methods similar to those described by Brenneis et al. (44) were used to map the 5′ and 3′ ends of samp3 (hvo_2177) transcript. In brief, total RNA (7 μg; isolated as described in the "Total RNA Isolation" section) was incubated with 10 U of tobacco acid pyrophosphatase (Fisher Scientific) and 30 U of RNase inhibitor (Promega, Madison, WI) at 37°C for 1 h. The RNA sample was purified via extraction with an equal volume of an acidic-phenol (pH 5.0):chloroform:isoamyl alcohol (25:24:1) mixture followed by chloroform:isoamyl alcohol (24:1). RNA was precipitated in 0.25 M sodium acetate (pH 5.0) with two volumes of 95% (v/v) ethanol (−70°C, 15 min) and washed with 70% (v/v) ethanol. The RNA pellet was resuspended in diethylpyrocarbonate-treated water and denatured (10 min, 65°C). Denatured RNA was self-ligated via incubation (1 h, 37°C) with 40 U of T4 RNA ligase (New England Biolabs) in 25 μl reaction volume with 10 U of RNase inhibitor (Promega) and 1X T4 ligase buffer. Self-ligated RNA was purified via phenol/chloroform extraction as described earlier and then denatured (10 min, 65°C) and hybridized with 0.5 pmol of a samp3 gene-specific primer (SAMP3 GSP). First-strand cDNA synthesis was followed with primer pairs (P1/P2) to amplify 5′-3′-ligated mRNA ends in the presence of the M-MLV reverse transcriptase, RNase H Minus, Point Mutant (M-MLV RT (H-) Point Mutant, Promega) according to the manufacturer's protocol. A second PCR reaction with nested PCR primers (N1/N2) was used with the first PCR reaction as a template. Taq DNA polymerase was used according to the supplier's protocol with the following modification: 3% (v/v) DMSO was included. The PCR reaction product was gel-extracted (Qiagen) and TOPO cloned into pCRII-TOPO vector according to the manufacturer's protocol (Invitrogen). This was followed by Sanger sequencing and comparison to the Hfx. volcanii genome.
Protein Concentration Assay-For immunoprecipitation experiments, the protein concentration of clarified cell lysate was estimated via Bradford protein assay (Bio-Rad) according to the supplier's instructions. All other protein concentrations were determined via bicinchoninic acid protein assay (Thermo Scientific, Rockville, IL). Bovine serum albumin was used as the protein standard.
Tandem Affinity Purification of SAMP3-MoaE-SAMP3-MoaE conjugates were purified from HM1096-pJAM1316 via tandem affinity chromatography in which clarified cell lysate was first passed through a Strep-Tactin column (Qiagen). Proteins eluted from the Strep-Tactin column were subsequently passed through an α-Flag column and eluted with 100 μg·ml⁻¹ 1x Flag peptide. Eluted SAMP3-MoaE conjugates were dialyzed into 20 mM HEPES pH 7.5 with 2 M NaCl and stored at 4°C prior to use in the desampylation assay.
Desampylation Assay-Desampylation activity assays were performed as previously described (26). Briefly, HvJAMM1 was expressed with an N-terminal His-tag in recombinant E. coli Rosetta (DE3)-pJAM991 and purified via nickel chromatography using a HisTrap HP column (GE Healthcare). HvJAMM1 (5 and 10 μM) was incubated with samp3ylated substrate (5 μg Flag-SAMP3 conjugate enrichments and 12 μM SAMP3-MoaE conjugates, respectively) in 20 mM HEPES buffer at pH 7.5 with 2 M NaCl and 50 μM ZnCl2. Reactions (10 μl total) were incubated for 2 to 3 h at 50°C. Controls included substrate incubated in the absence of HvJAMM1 and in the presence of HvJAMM1 with 50 μM EDTA.
Purification of SAMP3 Conjugates via α-Flag Chromatography for Mass Spectrometry Analysis-Proteins were purified via α-Flag chromatography for mass spectrometry analysis as previously described (25), with some modifications. In brief, Hfx. volcanii strains were grown to stationary phase in ATCC 974 complex medium (200 ml cultures). Strains included H26-pJAM202c, H26-pJAM977, and HM1052-pJAM977 for AspN/trypsin digests and H26-pJAM202c, H26-pJAM1198, and HM1052-pJAM1198 for trypsin digests (see supplemental Table S2 for strain details). Cells were harvested via centrifugation (6000 × g, 20 min, 4°C). Cell pellets were washed and lysed as described above for immunoprecipitation experiments. Cell lysate was clarified via centrifugation (14,000 × g, 10 min, 25°C) and filtered through a 0.2-μm surfactant-free cellulose acetate filter (Nalgene Nunc, Rochester, NY). A polystyrene column (0.7 × 1.2 cm) was packed with anti-Flag M2 agarose (Sigma) to a final bed volume of 0.5 ml and equilibrated with 10 column volumes of TBS (50 mM Tris-Cl buffer at pH 7.4 with 150 mM NaCl). Clarified cell lysate was applied to the equilibrated α-Flag column (4x batches), and unbound protein was removed by washing the column with 20 column volumes of TBS. Protein conjugates were eluted with five column volumes of TBS supplemented with 100 μg·ml⁻¹ 1x Flag peptide (Sigma) and collected in nine fractions (~300 μl per fraction). Separate columns were packed and used for each sample type. Wild-type and ΔubaA mutant strains expressing Flag-SAMP3 A90K were purified in biological triplicate and duplicate, respectively.
LTQ-Orbitrap Velos Mass Spectrometry Identification of SAMP3 Modified and Associated Proteins-Proteins purified via α-Flag chromatography (see above) were separated via 15% nonreducing SDS-PAGE. The gel lanes were cut into 10 gel pieces each and tryptic digested as described (25). Peptide fragments were subjected to reversed-phase column chromatography (self-packed C18 column, 100-μm inner diameter × 200 mm) operated on an Easy-nLC II (Thermo Fisher Scientific, Waltham, MA). Elution was performed with a binary gradient of buffers A (0.1% (v/v) acetic acid) and B (99.9% (v/v) acetonitrile, 0.1% (v/v) acetic acid) over a period of 100 min with a flow rate of 300 nl·min⁻¹.
MS and MS/MS data were acquired with an LTQ-Orbitrap-Velos mass spectrometer (Thermo Fisher Scientific) equipped with a nanoelectrospray ion source. The Orbitrap Velos was operated in data-dependent MS/MS mode using the lock-mass option for real-time recalibration. After a survey scan in the Orbitrap (r = 30,000), MS/MS data were recorded for the 20 most intensive precursor ions in the linear ion trap. Singly charged ions were not taken into account for MS/MS analysis. Proteins with samp3ylation sites were identified by searching all MS/MS spectra in "dta" format against an Hfx. volcanii (strain ATCC 29605/DSM 3757/JCM 8879/NBRC 14742/NCIMB 2012/VKM B-1768/DS2) target-decoy protein sequence database (9874 entries) extracted from UniProtKB (release 2013_05) using Sorcerer™-SEQUEST® (Sequest v. 2.7 rev. 11, Thermo Electron including Scaffold_4_0_4, Proteome Software Inc., Portland, OR). The target-decoy database includes the complete proteome set of Hfx. volcanii that was extracted from UniProtKB release 2013_05 (45) and an appended set of reversed sequences and 42 sequences of common laboratory contaminants created by BioWorksBrowser v. 3.2 (Thermo Electron Corp.) according to Elias and Gygi (46).
The Sequest search was carried out considering a parent ion mass tolerance of 5 ppm and a fragment ion mass tolerance of 1.00 Da. Up to two tryptic miscleavages were allowed. Methionine oxidation (+15.994915 Da), cysteine carbamidomethylation (+57.021464 Da), and N6-glycylglycyl-L-lysine modifications (K + 114.042927 Da) were set as variable modifications. Peptide identifications were accepted if they could be established at greater than 95.0% probability by the Peptide Prophet algorithm (47) with Scaffold delta-mass correction. Protein identifications were accepted if they could be established at greater than 95.0% probability and contained at least two identified peptides. Protein probabilities were assigned by the Protein Prophet algorithm (48). Proteins sharing significant peptide evidence were grouped into clusters. Proteins were also identified by at least two peptides with the application of a stringent SEQUEST filter. Sequest identifications required ΔCn scores of greater than 0.10 and XCorr scores of greater than 2.2, 3.3, and 3.75 for doubly, triply, and quadruply charged peptides, respectively. The false discovery rate was 1% to 1.8% for peptide identifications with no false positive proteins detected and was calculated using the Protein and Peptide Prophet algorithms according to Scaffold_4_0_4.
The complete collision-induced dissociation MS/MS spectra of the diglycine-modified lysine-containing peptides and the corresponding b and y fragment ion series are presented in detail in supplemental Fig. S1. The quantification of proteins enriched in the SAMP3 A90K conjugate samples relative to the ΔubaA control was performed using spectral counts by Scaffold software, and the results are shown in Table S4. Samp3ylated proteins and proteins associated with SAMP3 with a spectral count average of >3-fold relative to the ΔubaA control were classified into the Clusters of Orthologous Groups (COGs) updated for Archaea by Wolf and colleagues (46) (supplemental Table S5). The complete peptide and protein summary reports for all identified peptides and proteins in the samp3-A90K-conjugates and in the ΔubaA control are listed in supplemental Tables S6-S9. The mass spectrometry proteomics data have been deposited in the ProteomeXchange Consortium (proteomecentral.proteomexchange.org) via the PRIDE partner repository (30) with the dataset identifier PXD000202.
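The charge-dependent score thresholds and the precursor mass tolerance described above can be expressed as a small filter. This is a minimal sketch rather than the Sorcerer-SEQUEST or Scaffold logic: the function and constant names are hypothetical, and rejecting charge states outside 2+ to 4+ is an assumption, since the text gives thresholds only for those three states.

```python
# XCorr minimums per precursor charge state, as quoted in the text.
XCORR_MIN_BY_CHARGE = {2: 2.2, 3: 3.3, 4: 3.75}
DELTA_CN_MIN = 0.10
PRECURSOR_TOL_PPM = 5.0


def passes_sequest_filter(charge, xcorr, delta_cn):
    """Return True if a peptide-spectrum match clears both score cutoffs."""
    xcorr_min = XCORR_MIN_BY_CHARGE.get(charge)
    if xcorr_min is None:
        # ASSUMPTION: charge states without a stated threshold are rejected.
        return False
    return xcorr > xcorr_min and delta_cn > DELTA_CN_MIN


def within_precursor_tolerance(observed_mass, theoretical_mass,
                               tol_ppm=PRECURSOR_TOL_PPM):
    """Return True if the mass error, in parts per million, is within tol_ppm."""
    error_ppm = abs(observed_mass - theoretical_mass) / theoretical_mass * 1e6
    return error_ppm <= tol_ppm


# Made-up example matches: (charge, XCorr, deltaCn).
examples = [(2, 2.5, 0.20), (3, 3.0, 0.20), (4, 4.1, 0.05)]
accepted = [m for m in examples if passes_sequest_filter(*m)]
print(accepted)
```

Only the first made-up match clears both cutoffs; the second fails on XCorr and the third on ΔCn.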
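The >3-fold spectral count criterion can be sketched as follows. This is not the Scaffold calculation: the function names are hypothetical, the counts are made up, and the pseudocount used to avoid division by zero when a protein is absent from the control is an assumption that the text does not describe.

```python
from statistics import mean


def fold_enrichment(sample_counts, control_counts, pseudocount=1.0):
    """Ratio of mean spectral counts (sample over control).

    ASSUMPTION: a pseudocount is added to both means so that proteins with
    zero control counts still yield a finite ratio.
    """
    return (mean(sample_counts) + pseudocount) / (mean(control_counts) + pseudocount)


def is_enriched(sample_counts, control_counts, threshold=3.0):
    """Apply the >3-fold cutoff to the averaged spectral counts."""
    return fold_enrichment(sample_counts, control_counts) > threshold


# Made-up counts: three wild-type replicates versus two control replicates.
ratio = fold_enrichment([12, 9, 15], [2, 4])
print(ratio, is_enriched([12, 9, 15], [2, 4]))
```

The replicate numbers mirror the design in the text (biological triplicate for wild type, duplicate for the ΔubaA control); the counts themselves are illustrative.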

RESULTS
HVO_2177 was previously identified as a Ubl protein homolog based on its predicted β-grasp fold structure and conserved C-terminal diglycine motif (25). However, when the N-terminal Met1 of HVO_2177 was fused to a Flag tag to facilitate its detection via α-Flag immunoblot, the encoded protein did not appear to form Ubl conjugates based on analysis of the lysate of Hfx. volcanii cells grown under a variety of culture conditions (25). These early findings suggested that HVO_2177 did not form protein conjugates and that this open reading frame (ORF) perhaps had an alternative role in the cell; this was the rationale for this study.
FIG. 1. Haloferax volcanii HVO_2177 (SAMP3) analyzed via pairwise comparison of deduced amino acid sequence to protein homologs. A, Hfx. volcanii HVO_2177 (SAMP3) is compared with close archaeal protein homologs (upper panel) and SAMP1/2 (lower panel) via multiple amino acid sequence alignment. Identical and functionally related amino acid residues are highlighted in black and gray, respectively. The N-terminal extension (amino acid residues 1-21) of the original HVO_2177 annotation (gi:292656305) that is relatively hydrophobic and not conserved in other genera based on blastp and tblastn query of the nonredundant protein sequences (nr) and nucleotide collection (nr/nt) databases is indicated. The C-terminal diglycine motif characteristic of Ub/Ubl proteins is highlighted by asterisks below the alignments. For details, see "Experimental Procedures." B, Hfx. volcanii SAMP1/2/3 and close homologs group to three distinct clades based on cluster analysis. The optimal tree with the sum of branch length = 7.02878898 is shown. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. For details, see "Experimental Procedures."

HVO_2177 ORF-To further understand HVO_2177, we analyzed the nucleotide and deduced protein sequences of the current ORF annotation (40) (GenBank NC_013967.1) by means of pairwise comparison and coding potential scoring. Pairwise comparisons were performed via blastp and tblastn search of the nonredundant protein sequence (nr) and nucleotide collection (nr/nt) databases using the 113 amino acid deduced protein and 342 bp nucleotide sequences as the query, respectively. With these comparisons, an N-terminal tail of 21 amino acids corresponding to residues Met1 to Gly21 of HVO_2177 was identified as unique and not conserved in close protein homologs (e.g. Fig. 1A). By contrast, residues Met22 to Gly113 of HVO_2177 were found to be closely related to uncharacterized Ubl proteins from archaea and select bacteria forming a clade distinct from SAMP1 and SAMP2 based on amino acid sequence alignment (Fig. 1A) and cluster analysis (Fig. 1B). The coding potential of the HVO_2177 annotated gene sequence was then tested by high-scoring-segment and G-test methods that rely on contrasting global compositional properties in the three possible codon positions (e.g. GC-rich genomes, such as Hfx. volcanii, typically have ORFs with a greater GC content in the third than in the second codon position). The coding potential of HVO_2177 was found to be extremely low in the 5′ end of the current annotation that predicts a GTG start codon (supplemental Fig. S2). Based on these results, we hypothesized that the ATG codon that corresponds to Met22 of the current HVO_2177 annotation is the biological start codon and encodes a Ubl protein of 92 amino acids (named SAMP3) that is conserved in other organisms.
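The compositional signal that such coding potential methods exploit, GC content differing by codon position, can be illustrated with a short sketch. This is not the high-scoring-segment or G-test method itself; the function name and the toy sequence are assumptions made for illustration.

```python
def gc_by_codon_position(orf):
    """Return the GC fraction at codon positions 1, 2, and 3 of an in-frame ORF."""
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    fractions = []
    for position in range(3):
        bases = seq[position:usable:3]  # every third base from this offset
        gc_count = sum(base in "GC" for base in bases)
        fractions.append(gc_count / len(bases))
    return tuple(fractions)


# Toy in-frame sequence (made up): codons ATG GCC GAC GTC.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCGACGTC")
print(gc1, gc2, gc3)
```

In a GC-rich genome, a genuine reading frame would be expected to show a third-position GC fraction well above the second-position value, whereas a wrongly annotated stretch would tend not to.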
HVO_2177 (samp3) Transcript Mapping-To determine whether Hfx. volcanii synthesized transcripts that included a coding sequence for the unique Met1 to Gly21 N-terminal tail of HVO_2177, the 5′ and 3′ ends of the associated transcripts were mapped using a T4 ligase-based approach as described by Soppa and colleagues (44). In brief, total RNA was isolated from wild-type cells and circularized using T4 RNA ligase. The resulting product was converted to cDNA using reverse transcriptase with a gene-specific primer (Fig. 2A). The cDNA was then amplified in two consecutive nested PCR reactions with four gene-specific primers (P1/P2 followed by N1/N2; Fig. 2A), yielding a PCR product comprising a portion of the ORF and the 5′ and 3′ untranslated regions (UTRs). PCR products were cloned into a TOPO vector, and the DNA sequences were compared with the genomic DNA sequence (supplemental Fig. S3A), allowing us to determine the 5′ and 3′ ends of the samp3 transcript (Fig. 2B). With this approach, we detected samp3 transcripts of 498 nt with an extensive 3′ UTR (219 nt) predicted to fold into a series of stem-loop structures that spanned the majority of the downstream ORF, HVO_2178 (1-218 out of 258 nt total) (supplemental Figs. S3A and S3B). By contrast, the 5′ end of the samp3 transcript was leaderless (no 5′ UTR), with the extreme 5′ nt corresponding to the A of the ATG Met22 codon of the HVO_2177 annotation. Leaderless transcripts are relatively common in some archaea, including Hfx. volcanii, based on in silico analysis (49,50) and 5′-end mapping (44,51). Of the 40 haloarchaeal transcripts previously mapped at their 5′ and 3′ ends (44), about two-thirds were categorized as leaderless, and all had 3′ UTRs that varied in length from 13 to 154 nt with an average length of 57 nt (median 49 nt) and included a pentaU-motif at the extreme 3′ end preceded by a stem-loop structure predicted to be involved in transcription termination.
Thus, the samp3 transcript is typical in being leaderless (no 5′ UTR) but unusual in its 3′ UTR, which is long (219 nt) and lacks an apparent polyU-motif. This lack of a polyU sequence suggests that the samp3 transcript is cleaved at a TC motif within the coding sequence of HVO_2178 after transcript synthesis. Consistent with this possibility, the ORFs downstream of samp3 (HVO_2178 and HVO_2179) overlap by four nt with a stretch of four U's 7 nt downstream of HVO_2179 that may serve as a site of transcription termination. A 10-bp palindrome overlapping the 3′ end of the samp3 transcript may serve as a binding site for RNase-mediated cleavage or an alternative type of transcription terminator (Fig. 2B). Together these results reveal that the samp3 transcript does not encode the unique 21-amino-acid N-terminal extension of the current annotation of HVO_2177 and suggest that samp3 is co-transcribed with its downstream gene neighbors (i.e. HVO_2178 and 2179) with the transcript subject to posttranscriptional 3′-end processing.
Genomic Map Comparisons of samp3 and Homologs-Based on our finding that the samp3 transcript includes 84% of the HVO_2178 coding sequence, we performed genomic map comparisons to determine whether this linkage was conserved across phyla and thus was an indicator of biological relationship. Gene homologs of samp3 and HVO_2178 were found physically linked on the genomes of diverse archaea and bacteria with coding sequences often overlapping or having only a few intergenic base pairs (supplemental Fig. S4). Of particular note was the Thermus thermophilus TTHA0151 (169 amino acids) of unknown function, which harbors an N-terminal domain of the Ub-fold superfamily and a C-terminal domain of the DUF1952 superfamily related in three-dimensional structure and primary amino acid sequence to SAMP3 (37.1% amino acid identity) and HVO_2178 (30.1% amino acid identity), respectively (Fig. 2C). The N- to C-terminal configuration of TTHA0151 appears common among hyperthermophilic bacteria (Thermus and Oceanithermus sp.) and includes two triglycine motifs, with one C-terminal to the β-grasp fold and the other C-terminal to the DUF1952 domain (analogous to the diglycine motifs of SAMP3 and HVO_2178, respectively). Although HVO_2178 does not appear to form isopeptide bonds based on previous in vivo analysis (25) or have a Ubl β-grasp fold structure based on three-dimensional homology modeling (Fig. 2C), the extended 3′ UTR of samp3 transcript and conservation of genomic neighborhoods suggest a close functional relationship between SAMP3 and HVO_2178.
SAMP3 Conjugate Formation-To test whether samp3 encoded a functional Ubl protein, a Flag-tag fusion to the newly identified translational start site (Met22 of the original annotation of HVO_2177) was expressed in Hfx. volcanii wild-type and ΔubaA mutant cells. The ΔubaA mutation was introduced to determine whether Ubl conjugate formation was dependent on the ubiquitin-activating E1 enzyme homolog UbaA. To enrich for potential Ubl conjugates, the Flag-tagged proteins were immunoprecipitated from cell lysate, separated via reducing SDS-PAGE, and analyzed via α-Flag immunoblot (IB). With this approach, protein bands of high molecular mass
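The transcript dimensions reported for samp3 are internally consistent, as the following arithmetic sketch shows. It assumes that the stop codon is counted as part of the coding sequence; the function and variable names are hypothetical.

```python
def orf_length_nt(n_residues, include_stop=True):
    """Length in nucleotides of an ORF encoding n_residues amino acids."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


SAMP3_RESIDUES = 92   # Met22 to Gly113 of the original HVO_2177 annotation
UTR_5_NT = 0          # leaderless transcript
UTR_3_NT = 219        # mapped 3' UTR

# ASSUMPTION: the stop codon is counted with the coding sequence.
transcript_nt = UTR_5_NT + orf_length_nt(SAMP3_RESIDUES) + UTR_3_NT

# Portion of the HVO_2178 coding sequence covered by the samp3 3' UTR.
covered_fraction = 218 / 258

print(transcript_nt, round(covered_fraction * 100))
```

Under that assumption the sum reproduces the 498-nt transcript, and the 218 of 258 nt of HVO_2178 covered by the 3′ UTR corresponds to roughly 84% of that coding sequence.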

(>20 kDa) that were indicative of Ubl conjugation by SAMP3 were detected when Flag-SAMP3 was expressed in wild-type cells (Fig. 3A, lanes 1 and 2). The banding pattern of the putative SAMP3 modified proteins was influenced by the growth medium, suggesting regulation (Fig. 3A, lanes 1 and 2). The high-molecular-mass protein bands corresponding to the potential SAMP3 modified proteins were not detected

Table S2). Down arrows indicate 5′ and 3′ ends of samp3 specific transcript mapped using a T4 ligase-based PCR approach. azlC and azlD, homologs of Bacillus subtilis multipass transmembrane spanning AzlC and AzlD associated with branched-chain amino acid transport (70). sph3, SMC-like Sph3 homolog. aor4, aldehyde:ferredoxin oxidoreductase homolog. B, detailed map of the 5′ and 3′ ends of samp3 specific transcript correlated with genomic DNA sequence. The DNA sequence displayed corresponds to GenBank NC_013967.1 positions 2,049,617 to 2,050,266. The 5′ and 3′ ends of samp3 transcript are indicated by down arrows with a 10-bp palindrome overlapping the 3′ end underlined. Numbering correlates with the transcript start site. The originally annotated GTG start codon and the newly identified ATG start codon of samp3 are indicated. Archaeal promoter elements (BRE and TATA box consensus sequences) are highlighted in blue. Open reading frames for samp3 (pink) and HVO_2178 (brown with four-nucleotide overlap with HVO_2179 in green) are also highlighted. C, SAMP3 and HVO_2178 are related to Thermus thermophilus TTHA0151 fusion protein with N- and C-terminal domains clustering to MoaD/Ub and DUF1952 superfamilies, respectively. Top: represented in ribbon diagram using Chimera 1.7 (41) is a three-dimensional structural comparison of TTHA0151 (green) (1.6-Å crystal structure, PDB: 1V8C), SAMP3 (blue) (NMR solution structure, PDB: 2M19), and HVO_2178 (purple) (three-dimensional homology model by Phyre2 (38) with 62 residues (73%) modeled at >90% accuracy with high (96.7%) confidence in alignment to 1V8C). Ct, C-terminal residue. Ct_mod, C-terminal residue of structure corresponds to R166 (glycine residues were not resolved in TTHA0151 crystal structure). Bottom: multiple amino acid sequence alignment of TTHA0151 (1-169 amino acids, GI:55980120), SAMP3 (22-113 amino acids, GI: 292656305), and HVO_2178 (1-85 amino acids, GI: 292656306). Identical (black), functionally similar (gray), and considered diglycine motif (red) residues are highlighted.

this 65-kDa protein was not determined, but it may be relevant to MoaE samp1/3ylation, as discussed in later sections. To further analyze SAMP3 conjugate formation, the C-terminal diglycine motif of SAMP3 was deleted, and the resulting ΔGG variant was expressed in wild-type cells. Deletion of the diglycine motif was found to abolish the conjugation of SAMP3 to protein targets (Fig. 3A, lane 3). Thus, SAMP3 appeared to form Ubl protein conjugates that were dependent on the Ubl C-terminal diglycine motif and the E1-like enzyme UbaA, similar to what was observed for SAMP1/2 (24).
In contrast to SAMP3, when Hfx. volcanii cells expressed an N-terminal Flag-tag fusion to the Met1 GTG start codon of the originally annotated HVO_2177 (Flag-HVO_2177), the SDS-PAGE banding pattern for the Flag-tagged proteins was independent of genotype (wild-type versus ΔubaA mutant strains) and culture condition and included a predominance of Flag-tagged proteins at ≤15 kDa, with only a few proteins more than 20 kDa (Fig. 3C). As UbaA is the single ubiquitin-activating E1 enzyme homolog predicted for Hfx. volcanii, these results reveal that the few Flag-specific proteins that were detected at >20 kDa in the Flag-HVO_2177 expressing strains were formed independent of a ubiquitin-activating E1 enzyme homolog. The N-terminal extension of HVO_2177 (Met1 to Gly21 of the original annotation) was found to be relatively hydrophobic based on hydrophobicity plots (supplemental Fig. S5). Thus, we propose that the UbaA-independent SDS-PAGE banding pattern that was observed for Flag proteins produced in the Flag-HVO_2177-expressing strains was due to a mixture of Flag-HVO_2177 in denatured, cleaved, and SDS-resistant forms, with the latter observed for Pyrococcus furiosus Pfp1 (52). Alternatively, it is possible that a UbaA-independent mechanism is used to conjugate this nonbiological form of HVO_2177 (based on the 5′-end mapping of samp3 transcript). However, we consider this latter possibility unlikely.
Mass Spectrometry Analysis of SAMP3 Conjugates-In order to determine whether SAMP3 forms covalent isopeptide bonds with protein targets, an MS-based proteomic approach was used. SAMP3 conjugates were purified from wild-type and ΔubaA mutant strains expressing Flag-SAMP3 via α-Flag affinity chromatography and subjected to double in-gel digestion with AspN and trypsin prior to the detection of peptide fragments in biological duplicates via Orbitrap LC-MS/MS analysis. The AspN/trypsin double digest was performed to reduce the size of the peptides and of the lysine-linked SAMP3 tag (shortening the footprint from -EVVHLDGMATALDDGDAVSVFPPVAGG with trypsin alone to -AVSVFPPVAGG with AspN/trypsin). The ΔubaA mutant served as a control to filter out identified proteins that did not require UbaA for association with and/or covalent modification by SAMP3. However, samp3ylation sites were not successfully mapped using this approach and were probably lost during the AspN/trypsin double digestion, which took two days.
To improve the MS-based mapping of samp3ylation sites, the alanine residue (Ala90) immediately N-terminal to the diglycine motif of SAMP3 was modified to a lysine residue (A90K). In theory, expression of the SAMP3 A90K variant with an N-terminal Flag-tag in Hfx. volcanii would enable us to map the sites of samp3ylation by scanning for -GG footprints on tryptic peptides with mass increases of +114 Da. The introduction of similar site-directed modifications in the C terminus of small Ubl modifier has been used to successfully map sites of SUMOylation in yeast (53). To determine whether SAMP3 A90K formed protein conjugates, the cell lysates of Hfx. volcanii wild-type and ΔubaA mutant strains expressing Flag-SAMP3 A90K were compared with Flag-SAMP3 via α-Flag IB (Fig. 4A). Although subtle differences in the SDS-PAGE banding patterns of the Flag-specific proteins were detected when SAMP3 A90K was expressed in Hfx. volcanii compared with SAMP3, SAMP3 A90K still formed protein conjugates in a UbaA-dependent manner at levels similar to those of SAMP3 (Fig. 4). In particular, the banding patterns of SAMP3 A90K and its associated conjugates migrated somewhat faster than those of SAMP3 (Fig. 4). The A90K modification would generate a SAMP3 protein that is less hydrophobic and more basic, which could account for the slight shifts in the migration of proteins observed in SDS-PAGE. This type of result is not uncommon for haloarchaeal proteins, which are relatively acidic (peak pI ~4.2) (40). For example, the D94N variant that generates a less acidic form of HvJAMM1 migrates significantly faster in SDS-PAGE than the wild type, yet it has an observed average mass (as determined via MALDI-TOF and electrospray ionization MS) that is consistent with its theoretical average mass (26).
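The +114 Da mass increase expected for the -GG footprint can be checked with a short calculation. The sketch below is illustrative only and is not part of the analysis pipeline used in this study: it sums standard monoisotopic residue masses for a remnant that stays isopeptide-linked to a lysine side chain, and the names `RESIDUE_MASS` and `remnant_mass` are hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the amino acids that occur
# in the SAMP3 C-terminal footprints discussed in the text.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "F": 147.06841,
}


def remnant_mass(remnant: str) -> float:
    """Mass added to a lysine by an isopeptide-linked remnant.

    Forming the isopeptide bond releases one water, so the net addition
    equals the sum of the residue masses of the remnant.
    """
    return sum(RESIDUE_MASS[aa] for aa in remnant)


# Footprint left by trypsin on targets of the SAMP3 A90K variant.
gg_footprint = remnant_mass("GG")
# Footprint left by the AspN/trypsin double digest of unmodified SAMP3.
aspn_trypsin_footprint = remnant_mass("AVSVFPPVAGG")

print(f"GG footprint: {gg_footprint:.4f} Da")
print(f"AVSVFPPVAGG footprint: {aspn_trypsin_footprint:.4f} Da")
```

The comparison illustrates why the A90K substitution simplifies site mapping: the tryptic remnant shrinks from an eleven-residue tag to a two-residue tag of about 114 Da.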
Thus, the SAMP3 A90K variant was used to map sites of samp3ylation and to identify proteins that were associated with SAMP3 in a UbaA-dependent manner. Proteins were purified from wild-type and ΔubaA strains expressing the SAMP3 A90K variant via α-Flag affinity chromatography. Proteins were separated via SDS-PAGE (Fig. 4B), excised in gel slices, tryptic in-gel digested, and subjected to Orbitrap Velos LC-MS/MS analysis, with quantitative values based on normalized spectral counts of biological replicates, as described under "Experimental Procedures" and reported in supplemental Table S4. With this approach, samp3ylation sites were mapped onto protein targets (Table I; see details below). Proteins were also identified that were highly enriched in the fractions associated with Flag-SAMP3 A90K of wild-type cells relative to the ΔubaA mutant (supplemental Table S4). Reproducibility was defined as the MS-based detection of two or more peptides per protein in at least two of the three biological replicates. Providing further support for the specificity of proteins identified via LC-MS/MS analysis of samples purified from wild-type cells expressing SAMP3 A90K, no proteins or sites of sampylation were detected at significant levels in samples purified from wild-type cells expressing the empty vector control (data not shown). To verify that the Flag-SAMP3 found in conjugates was not processed at the N or C terminus, the sequence coverage of Flag-SAMP3 digested by AspN/trypsin is included in supplemental Fig. S1 with all peptides detected in the Scaffold search results. Flag-SAMP3 was identified as the major protein hit in the Scaffold search results, with 93% sequence coverage. Furthermore, the first possible AspN/tryptic peptide of SAMP3 started as expected after Arg6, and the last GG peptide was also detected in the Scaffold search results (supplemental Fig. S1). This result clearly shows that the N and C termini are not cleaved in Flag-SAMP3. Searching all of our MS data against the originally annotated HVO_2177 did not reveal peptides unique to the originally annotated N terminus (data not shown).
Mapping Sites of SAMP3 Conjugation-Samp3ylation was found to be dependent on UbaA and to form covalent isopeptide bonds between the C-terminal carboxylate of SAMP3 and the ε-amino group of lysine residues of protein targets (supplemental Table S4 and supplemental Fig. S1). No common motif in primary amino acid sequence or secondary structure of the samp3ylation sites was detected (data not shown). However, samp3ylation sites were readily mapped to a total of 28 lysine residues on 23 proteins, with the majority of these sites detected in at least two of the three biological replicates of wild-type cells expressing SAMP3 A90K (as summarized in Table I). All of the samp3ylation sites were unique to wild-type cells and were not detected in the ΔubaA mutant. Furthermore, the majority of the samp3ylated proteins (17 out of 23 proteins) were highly enriched (4- to 788-fold) in wild-type cells relative to the ΔubaA mutant as calculated using the Scaffold software by normalized spectral counts of the identified proteins. Of the remaining samp3ylated proteins, three were detected only in a single biological replicate (Table I), and three (although reproducibly detected in samp3ylated form only in wild-type cells) had normalized spectral counts of total peptides mapping to each protein (including the unmodified form) at relatively similar quantitative levels (0.7- to 1.9-fold differences) (Table I, HVO_1081, HVO_1577, and SAMP3). As evidence to support these latter findings, Flag-SAMP3 A90K was estimated via α-Flag IB to be purified at relatively similar levels from wild-type and ΔubaA cells (Fig. 4B).
The conjugation sites mapped in our approach were assigned to SAMP3 based on the following evidence. Firstly, we did not detect tryptic peptides derived from SAMP2 in any of the biological replicates of this study. SAMP2 tryptic peptides are readily detected in complex protein samples by LC-MS/MS analysis (25). Secondly, although SAMP1 peptides were detected in our Flag-SAMP3 A90K samples, tryptic peptides of samp1ylated proteins do not generate a GG footprint and instead generate a NGEAAALGEATAAGDELALFPPVSGG footprint on lysine residues of modified proteins. Thus, the GG footprints can be attributed to SAMP3, not SAMP1. Finally, GG footprints were not detected in LC-MS/MS analysis of tryptic digests of wild-type cells with an empty vector control (pJAM202c) or AspN/trypsin digests of Flag-SAMP3 expressed in wild-type cells with a functional UbaA enzyme, providing evidence that the GG footprints detected were dependent upon the enrichment of SAMP3 A90K versus SAMP3 fractions. Based on these findings, we conclude that the sampylation sites identified in this study via LC-MS/MS analysis co-purify with SAMP3 and are modified by SAMP3 and not SAMP1 or SAMP2. Notably, our ability to detect SAMP1 in the SAMP3-associated fractions was UbaA dependent (supplemental Table S4, normalized spectral counts of 10 for wild-type and undetected in the ΔubaA mutant) and most likely due to an overlap in target proteins (Table I). In particular, the K240 and K247 residues of the MoaE-MobB domain protein HVO_1864 (named MoaE) that were demonstrated here to be samp3ylated are also known to be samp1ylated (26) (note that a SAMP1 S85K was required to map the samp1ylation sites of MoaE in this earlier study). The translation elongation factor EF-1α homolog HVO_0359, which we found here to be samp3ylated at K99, co-purifies with SAMP1 (25), although a site of covalent modification by SAMP1 is not known.

Our findings suggest polySAMP3 chains form in the cell. In particular, the C-terminal carboxylate of SAMP3 was isopeptide linked to at least three (K18, K55, and K62) of its three lysine residues in all biological replicates (Table I, supplemental Fig. S1). PolySAMP3 chains are consistent with the banding pattern observed from 25-37 kDa for the fractions purified via α-Flag chromatography from wild-type cells expressing SAMP3 A90K (Fig. 4B). Mixed chains of polySAMP3/1, although not detected, are also possible based on our identification of SAMP1 peptides via analysis of MS/MS spectra derived from α-Flag-SAMP3 samples purified from wild-type cells. SAMP2, however, was not detected in any of the samples in this study. Whether polySAMP3 chains are attached to protein targets or are free chains in the cell remains to be determined. Ubiquitin forms isopeptide bonds at all seven of its lysine residues with the chains attached to target proteins (54). Likewise, ubiquitin can form mixed chains with ubiquitin-like proteins (55), and the C-terminal Gly of SAMP2 is found isopeptide bonded to its K58 residue (25).
Categorizing SAMP3-associated Proteins-Numerous Hfx. volcanii proteins (331 in total) were reproducibly enriched (at least 3-fold) by α-Flag chromatography from wild-type (relative to ΔubaA) strains expressing Flag-SAMP3 A90K (supplemental Table S4). Fifty-two (16%) of these proteins were not represented in our previous MS-based shotgun analysis of the Hfx. volcanii proteome (1296 proteins) (56). Minor constituents of highly complex mixtures of tryptic peptides (such as the trypsinized "shotgun" proteome of Hfx. volcanii) are often not readily identified by MS-based technologies (57). Thus, we suggest that at least a portion of the proteome associated with SAMP3 in a UbaA-dependent manner is of low abundance in the cell.
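The two selection criteria used here (detection of two or more peptides in at least two of three biological replicates, and at least 3-fold enrichment in wild-type relative to ΔubaA samples) can be expressed as a simple filter. The following is a minimal sketch under stated assumptions, not the Scaffold-based procedure used in this study: all function names are hypothetical, enrichment is taken as a ratio of mean normalized spectral counts, and a protein with zero counts in the mutant is treated as infinitely enriched.

```python
from statistics import mean


def is_reproducible(peptides_per_replicate, min_peptides=2, min_replicates=2):
    """True if at least min_peptides peptides were detected in at least
    min_replicates of the biological replicates."""
    hits = sum(1 for n in peptides_per_replicate if n >= min_peptides)
    return hits >= min_replicates


def fold_enrichment(wt_counts, mutant_counts):
    """Ratio of mean normalized spectral counts (wild type / mutant).

    Assumption: a protein absent from the mutant is reported as infinitely
    enriched; how zero counts were handled in the source analysis is not
    specified in this section.
    """
    wt, mut = mean(wt_counts), mean(mutant_counts)
    if mut == 0:
        return float("inf") if wt > 0 else 0.0
    return wt / mut


def is_uba_dependent(peptides_per_replicate, wt_counts, mutant_counts, min_fold=3.0):
    """Apply both criteria: reproducible detection and >= min_fold enrichment."""
    return (
        is_reproducible(peptides_per_replicate)
        and fold_enrichment(wt_counts, mutant_counts) >= min_fold
    )


# Hypothetical protein: peptides per wild-type replicate, then spectral counts.
print(is_uba_dependent([3, 2, 0], wt_counts=[12, 9, 0], mutant_counts=[1, 2, 0]))
```

Keeping the reproducibility and enrichment tests as separate functions makes it straightforward to vary either threshold independently.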
To obtain a better understanding of the biological pathways that associate with SAMP3 in the presence of UbaA, we classified the MS-identified proteins into the COGs recently updated for Archaea by Wolf et al. (58) (supplemental Table S5). Most (330) of the 331 proteins could be classified into COGs including proteins associated with metabolism (50%), information storage and processing (23%), and cellular processes and signaling (13%) (supplemental Table S5). In particular, proteins clustering to COGs [H] coenzyme transport and metabolism (2.3-fold, p < 0.001), [I] lipid transport and metabolism (3.2-fold, p < 0.0001), and [J] translation and ribosomal structure and biogenesis (2.4-fold, p < 0.0001) were significantly overrepresented in the SAMP3 samples relative to proteins deduced from the Hfx. volcanii genome (supplemental Table S5). By contrast, proteins of COGs [T] signal transduction mechanisms (4.6-fold, p < 0.001) and [S] unknown function (3.1-fold, p < 0.0001) were significantly underrepresented in the SAMP3 samples relative to the deduced proteome (supplemental Table S5). The remaining COGs had differences in their relative representation of SAMP3 that were of low to no significance (p values of 0.0028 to 0.6985). Based on the COG grouping distribution of SAMP3-associated proteins and our previous finding that SAMP1 and -2 have roles in protein modification and sulfur mobilization for the biosynthesis of MoCo (SAMP1) and thiolated tRNA (SAMP2) (24), we speculate that SAMP3 is involved not only in protein modification but also in mediating and/or regulating the biosynthesis of sulfur-containing coenzymes, tRNA, and/or lipids (associated with COG groupings [H], [J], and [I], respectively). The finding that the SAMP3 fractions purified from wild-type cells are correlated with proteins of translation (COG [J]) is also consistent with our ability to map samp3ylation sites on ribosomal S24e protein (HVO_1896), alanine-tRNA ligase (HVO_0206), and translation elongation factor EF-1α (HVO_0359) homologs, the latter two of which were highly abundant in our UbaA-dependent SAMP3-associated samples relative to control samples based on normalized spectral counts.
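One common way to assess whether a COG category is overrepresented in a protein set relative to the deduced proteome is a hypergeometric (one-sided Fisher-type) test. The sketch below is illustrative only: the statistical test actually used for supplemental Table S5 is not stated in this section, the function names are hypothetical, and the counts in the example are invented placeholders rather than values from this study.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` proteins without replacement from a
    proteome of size `population` that contains `successes` members of the
    COG category of interest."""
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(population, draws)


def fold_representation(k, population, successes, draws):
    """Fraction of the sample in the category divided by the fraction of
    the proteome in the category."""
    return (k / draws) / (successes / population)


# Placeholder counts (NOT from the study): proteome size, category size in
# the proteome, sample size, and category members observed in the sample.
proteome, in_category, sample, observed = 4000, 150, 330, 30
print(fold_representation(observed, proteome, in_category, sample))
print(hypergeom_upper_tail(observed, proteome, in_category, sample))
```

Underrepresentation, as reported for COGs [T] and [S], would be assessed with the corresponding lower tail, P(X <= k).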
SAMP3 Conjugate Levels and Proteasome Function-Ubl protein conjugation is often linked to proteasome function (59). In our previous work in Hfx. volcanii, SAMP1/2 conjugate levels were found to be altered by growth conditions and proteasomal gene deletion (25). In particular, SAMP1/2 conjugate levels are low when cells are grown in complex medium, with SAMP1 forming a predominant 65-kDa conjugate that is not observed for SAMP2 (25). Furthermore, the conjugate levels of SAMP1 are increased and SAMP2 conjugate levels are decreased in a proteasomal mutant strain (ΔpsmA ΔpanA double mutant, where psmA encodes the 20S proteasome α1 subunit and panA encodes the AAA ATPase proteasome activating nucleotidase PAN-A) (25). Thus, samp1ylation is thought to target proteins for degradation by proteasomes, whereas the association of SAMP2 with proteasome function is less clear.
To better understand the biological function of the SAMPs, the levels of SAMP3 conjugates were compared with those of SAMP1 and SAMP2 in Hfx. volcanii strains with diminished proteasome activity. Strains included proteasomal mutants (ΔpsmA, ΔpanA, and ΔpanB single mutants and ΔpsmA ΔpanA double mutant, where psmA and panA are as defined above and panB encodes the AAA ATPase proteasome activating nucleotidase PAN-B) and strains treated with the reversible 20S proteasome inhibitor velcade (bortezomib). In contrast to SAMP1/2 (25), no drastic changes were observed relative to the wild type in the levels of SAMP3 conjugates expressed as N-terminal Flag-fusions in the proteasomal mutant strains (Fig. 5A). Chemical perturbation of 20S proteasome function by the addition of velcade to cells grown on rich medium (YPC) was found to generate a significant increase in the levels and alter the banding pattern of proteins conjugated to SAMP1/2 but not SAMP3 (Fig. 5B). The velcade-dependent increase in proteins conjugated to SAMP2 was greater than for SAMP1 (Fig. 5B, lanes 3 and 4), which differs from the response observed previously when SAMP1/2 conjugates were assayed in the proteasomal mutant strain ΔpsmA ΔpanA (25). Together these results suggest that samp3ylation serves as a nonproteolytic signal in the cell, whereas SAMP1/2 may target proteins for proteasome-mediated proteolysis.
SAMP3 Conjugate Levels Are Increased by Dimethyl Sulfoxide-Based on immunoprecipitation (IP) of Flag-SAMP3 from cell lysate, we found significant differences in the banding patterns of SAMP3 conjugates from cells grown under different culture conditions (e.g. media with complex nutrients compared with glycerol minimal medium with alanine (GMMAla) as the nitrogen source), suggesting carbon and/or nitrogen sources altered samp3ylation (Fig. 3A). To determine the media component responsible for these differences, we examined SAMP3 conjugates in the lysate of cells grown under culture conditions with different nitrogen and carbon sources (supplemental Fig. S6). We found that SAMP3 conjugates were highly abundant in cells grown on glycerol minimal medium, whereas the SAMP3 conjugates of cells grown in complex medium were not readily observed via α-Flag IB of cell lysate (supplemental Fig. S6) and instead required enrichment via α-Flag IP prior to IB for detection (Fig. 3A). In minimal media, SAMP3 conjugates were found to remain relatively constant and robust when alanine was replaced with ammonium chloride as a nitrogen source and when glycerol was replaced with either lactate or glucose as a carbon source (supplemental Fig. S6). Surprisingly, even the addition of the YPC component of rich medium to glycerol minimal medium did not reduce the SAMP3 conjugates to levels comparable to those in complex media alone (supplemental Fig. S6). Thus, differences in nitrogen and carbon sources did not appear to alter the levels of the SAMP3 conjugates detected via α-Flag IB of cell lysate.
In order to determine the chemical signal that was responsible for the high levels of samp3ylated proteins in cell lysate, the formation of SAMP3 conjugates was monitored in cells grown on media with minimal medium components systematically added to the nutrient-rich YPC medium. With this approach, SAMP3 conjugates were found to accumulate upon the addition of DMSO to rich or minimal media (supplemental Figs. S7 and S8) in a dose-dependent manner (Fig. 6A). As the concentration of DMSO increased in the growth medium, so did the levels of SAMP3 conjugates. Supplementation of YPC medium with 15 mM DMSO alone was sufficient to allow the detection of SAMP3 conjugates at high levels in cell lysate (Fig. 6A).

Fig. 6. SAMP3 conjugate (A, B) and transcript (C) levels are elevated by the addition of dimethyl sulfoxide to the growth medium in a dose-dependent manner. Immunoblot analysis of Hfx. volcanii wild-type (wt) cells expressing Flag-SAMP3 (A) in trans (H26-pJAM977) and (B) Hfx. volcanii cells expressing Flag-SAMP3 from the native samp3 gene locus (HM1126 Flag-SAMP3 integrant). Cells were grown to stationary phase in YPC medium supplemented with DMSO at 0 to 100 mM as indicated. Cells were harvested via centrifugation, resuspended in SDS-PAGE loading buffer, and boiled for 30 min. Proteins were separated via 10% reducing SDS-PAGE. SAMP3 and its protein conjugates were detected via α-Flag immunoblot (IB). Protein loading was monitored by A600 of cell culture (0.065 units per lane). Equivalent protein loading was confirmed by staining parallel gel with Coomassie Brilliant Blue R-250 (lower panel). The experiment was performed in biological duplicate with a representative image shown. Molecular mass standards are indicated on the right. C, Northern blot analysis of samp3 transcripts from wild-type cells (H26, wt) grown in the presence and absence of DMSO supplementation in ATCC974 complex medium as indicated. DIG-labeled samp3-specific probe used for Northern blot is indicated in Fig. 2A. Chemiluminescent signals were recorded on x-ray film with exposure times of 180, 120, and 60 min (left to right blots). RNA integrity was confirmed by ethidium bromide staining. See "Experimental Procedures" for details.
Although our data revealed that DMSO supplementation increases the levels of SAMP3 conjugates in cells, it was unclear whether this effect was specific to DMSO or a general response to terminal electron acceptors and/or solvents. Hfx. volcanii strains respire on nitrate and DMSO (24,60,61), and the related Halobacterium salinarum uses TMAO in addition to DMSO as a terminal electron acceptor (60,62). Although DMF is not a terminal electron acceptor, this common laboratory compound has many properties that are similar to those of DMSO, including its classification as a polar aprotic solvent. For these reasons, DMF, TMAO, and nitrate were compared with DMSO in terms of their ability to stimulate the levels of SAMP3 conjugates in Hfx. volcanii cells expressing Flag-SAMP3. When cells were grown on YPC medium, the levels of SAMP3 conjugates were enhanced by supplementation with DMSO but not by DMF, TMAO, or nitrate (supplemental Fig. S9). This increase was not observed in a ΔubaA mutant (supplemental Fig. S9), indicating that the DMSO-mediated induction of SAMP3 conjugates was dependent on the UbaA enzyme. The pool of free SAMP3 was also found to be depleted in a UbaA-dependent manner under a number of culture conditions, including those where the conjugates of SAMP3 were not readily detected in cell lysate (e.g. supplemental Fig. S9), suggesting that SAMP3 pools are regulated by UbaA.
DMSO-induced SAMP3 Conjugates Are Formed in Cells Expressing SAMP3 from Its Native Locus-To determine whether SAMP3 forms protein conjugates when expressed at wild-type levels from its native gene locus and whether these conjugate levels are modulated by DMSO, the coding sequence for an N-terminal Flag-tag was integrated onto the genomic copy of the Hfx. volcanii samp3 gene (see supplemental Table S1 for strain details). Cells were grown in YPC medium with and without DMSO supplementation and analyzed for SAMP3 and its associated conjugates via IB against its N-terminal Flag-tag. With this approach, SAMP3 was found to form conjugates, and the levels of these conjugates were increased by the addition of DMSO to the growth medium (Fig. 6B). Although the level of DMSO required for the detection of SAMP3 conjugates appeared somewhat higher when synthesized in the integrant strain (100 mM DMSO) than in the in trans expression strain (5 to 15 mM DMSO), the migration of the SAMP3 conjugates at high molecular mass in reducing SDS-PAGE appeared similar irrespective of the type of promoter (native or plasmid-based) used for expression (Figs. 6A and 6B). Interestingly, little if any of the free form of SAMP3 was observed in the HM1126 integrant strain (Fig. 6B), suggesting the majority of SAMP3 is in conjugated form in wild-type cells grown (micro)aerobically in the presence of DMSO.
SAMP3 Transcript Levels Are Increased by Dimethyl Sulfoxide-To determine whether DMSO influences SAMP3 at the transcript level, we performed Northern blot analysis in H26 wild-type cells. Similar to the trend observed for SAMP3 at the protein conjugate level, the levels of samp3 transcripts were stimulated in cells grown in the presence of DMSO relative to those grown in rich medium alone (Fig. 6C). The predominant samp3 transcript was found to migrate in denaturing agarose gels at an estimated size of 400 to 500 nt, consistent with the 5′ and 3′ ends of the samp3 transcript mapped by the T4 DNA-ligase-based approach (described earlier). Thus, although samp3 transcripts were readily detected via 5′- and 3′-end mapping of RNA isolated from cells grown aerobically with or without DMSO, the levels of samp3 transcripts were significantly increased by DMSO based on use of the less sensitive Northern blotting with a non-radioactively labeled probe.
Samp1/3ylation of MoaE-MoaE (HVO_1864, MobB-MoaE domain fusion) is an ideal target for studying sampylation based on its close similarity to the well-studied large subunit of MPT synthase and its presumed requirement for MoCo biosynthesis (24). Unlike samp2/3, the moaE and samp1 genes (along with ubaA) are required for activity of the MoCo-containing DMSO reductase in Hfx. volcanii (24). Thus, MoaE is presumed to associate with SAMP1 (analogous to the E. coli MoaE-MoaD, human MOCS2B-MOCS2A, and plant Cnx7-Cnx6 large-small subunit associations) and form the MPT synthase needed for MoCo biosynthesis. Although the amino acid sequence of SAMP3 shares 38% identity and 56% similarity with SAMP1, SAMP3 is not needed for DMSO respiration, suggesting that it is not required for MoCo biosynthesis (24). Thus, our discovery that MoaE is modified by SAMP1 and SAMP3 is of biological interest, as these modifications may serve autoregulatory (SAMP1) and regulatory (SAMP3) roles in MoCo biosynthesis.
Similar to SAMP1 (26), SAMP3 was found reproducibly conjugated to MoaE at lysine residues K240 and K247 (Table I, supplemental Fig. S1). These lysine residues are analogous to the K119 and K126 active site residues of MoaE required for MPT biosynthesis in E. coli (7), suggesting that samp1/3ylation of MoaE reduces MoCo biosynthesis in Hfx. volcanii. Although an isopeptide bond between the large (MoaE K119) and small (MoaD G81) subunits has been detected in the MPT synthase of E. coli, this covalent bond is observed only in purified preparations of the enzyme incubated for six months at 4°C, suggesting that the linkage is not physiologically relevant (7). In contrast, the samp1/3ylated forms of MoaE are detected in samples freshly prepared from wild-type (versus ΔubaA mutant) strains of Hfx. volcanii, providing evidence that samp1/3ylation of MoaE is dependent upon UbaA and is physiologically relevant.
To further analyze MoaE as a biological target of samp3ylation (beyond LC-MS/MS of the complex mixture of samp3ylated proteins), a C-terminal StrepII tagged variant of MoaE was co-expressed with Flag-SAMP3 and purified from Hfx. volcanii via two methods: (i) immunoprecipitation with anti-StrepII antibody conjugated to Dynabeads (α-Strep IP) and (ii) tandem affinity purification (TAP) chromatography using Strep-Tactin and α-Flag columns. To facilitate the detection of samp3ylation and to prevent samp1ylation of MoaE, purifications were performed in Δsamp1 mutant strains. With this approach, MoaE-StrepII was detected in what appeared to be "free" (unconjugated) and Flag-SAMP3 modified forms when α-Strep IP fractions were probed by IB with anti-StrepII and anti-Flag antibodies (Fig. 7A). Protein bands specific for MoaE-StrepII and not Flag-SAMP3 were also detected in the α-Strep IP fractions that were of greater molecular mass than Flag-SAMP3-MoaE-StrepII alone (Fig. 7A). Flag-SAMP3 was also found covalently conjugated to MoaE-StrepII in TAP fractions (Fig. 7B). However, the large-molecular-mass protein bands specific for MoaE-StrepII were not detected in the samples purified via TAP. Strain genotypes differed between the two purification strategies, with MoaE-StrepII fractions purified from a Δsamp1-3 strain in TAP and from a samp3+ (ΔmoaE Δsamp1) strain in α-Strep IP. Thus, the large-molecular-mass protein bands could be MoaE-StrepII conjugated to multiple moieties of SAMP3 encoded from the genome. Overall, the ability to purify Flag-SAMP3-MoaE-StrepII conjugates via two different strategies reaffirms our MS/MS finding that MoaE is samp3ylated in Hfx. volcanii.

[...] HvJAMM1 (B, C). A, Hfx. volcanii wild-type (H26) strain expressing pJAM202c vector control and Δsamp1 ΔmoaE mutant co-expressing Flag-SAMP3 and MoaE-StrepII were grown to stationary phase in ATCC974 complex medium (200 ml). MoaE-StrepII was immunoprecipitated (IP) using α-StrepII antibody coupled beads, proteins were separated via 10% reducing SDS-PAGE, and MoaE-StrepII and Flag-SAMP3 were detected via α-StrepII and α-Flag immunoblot (IB). Nonspecific protein bands (*) detected in vector control and MoaE-StrepII protein bands (**) that were not modified by Flag-SAMP3 but migrated in reducing SDS-PAGE slower than expected for MoaE-StrepII alone are indicated. B, SAMP3-modified MoaE-StrepII was purified via tandem affinity purification (TAP) and assayed for desampylation by HvJAMM1 as monitored by α-StrepII IB after separation of proteins via 12% reducing SDS-PAGE. C, HvJAMM1 cleaves Flag-SAMP3 conjugates enriched from Hfx. volcanii cells expressing Flag-SAMP3. Flag-SAMP3 conjugates were enriched from Hfx. volcanii cells expressing Flag-SAMP3 by clarifying boiled cell lysate through centrifugation. Desampylation was monitored via α-Flag IB after proteins were separated by means of 12% reducing SDS-PAGE. Equal protein loading and absence of nonspecific protease activity were assessed by staining a parallel gel with Sypro Ruby. For all panels, experiments were performed at least in biological duplicate with a representative image shown. For details, see "Experimental Procedures." Molecular mass standards are indicated on the left.

SAMP3 Conjugates Are Cleaved by HvJAMM1-HvJAMM1 is a zinc-dependent metalloprotease that cleaves SAMP1/2 from linear and isopeptide linkages to protein targets (26). Thus, we examined whether HvJAMM1 could cleave SAMP3 conjugates including SAMP3-MoaE purified via TAP as well as a mixed population of SAMP3 conjugates in cell lysate. The metal chelator EDTA inactivates HvJAMM1 and thus was included as a negative control. With this approach, HvJAMM1 was found to cleave SAMP3-modified MoaE, resulting in an increase in the level of MoaE in its unmodified form (Fig. 7B).
HvJAMM1 also eliminated the SAMP3 conjugates detected via α-Flag IB of cell lysate while having little if any effect on the profile of proteins in cell lysate detected by means of Sypro Ruby staining (Fig. 7C). Although the sample loading was similar for the Sypro Ruby stain, free SAMP3 did not show any apparent increase when SAMP3 conjugates were cleaved by HvJAMM1 (EDTA minus, Fig. 7C, lane 2). The reason for this relatively steady level of free SAMP3 is unclear; the level of SAMP3 was anticipated to increase when released from its protein target. Free SAMP3 was not cleaved by HvJAMM1 (based on in vitro assay with Flag-His-SAMP3 purified from recombinant E. coli) (data not shown). However, we cannot rule out the possibility that the SAMP3 released from the protein conjugates during the in vitro assay was susceptible to cleavage. We note, based on our previous work (26), that HvJAMM1 is unable to cleave proteins that are not covalently attached to SAMPs (e.g. β-amylase and carbonic anhydrase). Thus, we conclude that HvJAMM1 is a metalloprotease with relatively broad substrate specificity, able to cleave a wide variety of proteins conjugated to SAMP3 as well as SAMP1/2.
Working Model on the Biological Function of Samp1/3ylation of MoaE-Based on our finding that the conserved active site residues of MoaE are primary targets of samp1/3ylation and that these modifications are reversed by HvJAMM1, we speculated that MoaE is inactivated by sampylation under conditions that require little if any MoCo production. In particular, aerobic growth in nutrient-rich conditions would minimize the need for MoCo-containing homologs of the DMSO terminal reductase (e.g. DmsA, GI:291369330; HVO_B0363) and xanthine oxidoreductase (GI:292494242; HVO_B0309) families, the latter of which may be involved in the catabolism of purines to uric acid. Consistent with this model, when cells are grown aerobically in medium minus DMSO, samp3ylated MoaE is detected via LC-MS/MS, and low levels of samp3 transcripts are produced. The significant increase in the levels of SAMP3 conjugates and samp3 transcripts observed after the addition of DMSO to aerobic cells has led us to speculate that DMSO stimulates samp3ylation to further inactivate MoCo biosynthesis in the presence of oxygen. In particular, the covalent modification of MoaE under conditions with DMSO and oxygen may provide a hierarchy that enables cells to scavenge for the optimal terminal electron acceptor, oxygen; the midpoint potential (E′m) of the O2/H2O couple is +0.82 V, much higher than the E′m of +0.16 V for DMSO/dimethyl sulfide. The accumulation of sampylated MoaE under aerobic conditions would ready the cells for a rapid shift to anaerobic respiration on DMSO through the action of HvJAMM1-mediated cleavage of the protein conjugates.
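To put the difference in midpoint potentials in energetic terms, the gap between the two acceptor couples can be converted to a free-energy difference with the standard relation ΔG = -nFΔE. The sketch below is illustrative only and rests on assumptions not stated in the text: a two-electron basis (n = 2, i.e. per 1/2 O2 or per DMSO reduced), the same electron donor in both cases, and standard biochemical conditions. The function and variable names are hypothetical.

```python
# Faraday constant in coulombs per mole of electrons.
FARADAY_C_PER_MOL = 96485.332


def delta_g_kj_per_mol(e_acceptor_v, e_reference_v, n_electrons=2):
    """Free-energy difference (kJ/mol) between two acceptor couples.

    Applies dG = -n * F * dE, where dE is the acceptor potential minus the
    reference potential. n_electrons=2 is an assumption for this sketch.
    """
    delta_e = e_acceptor_v - e_reference_v
    return -n_electrons * FARADAY_C_PER_MOL * delta_e / 1000.0


# Midpoint potentials quoted in the text (volts).
E_O2_H2O = 0.82
E_DMSO_DMS = 0.16

# Additional free energy available with oxygen instead of DMSO as acceptor.
extra = delta_g_kj_per_mol(E_O2_H2O, E_DMSO_DMS)
print(f"{extra:.1f} kJ/mol per two electrons")
```

Under these assumptions, the 0.66 V difference corresponds to roughly 127 kJ/mol more free energy released per electron pair with oxygen than with DMSO, consistent with oxygen being the preferred terminal acceptor.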
Because the complexity of SAMP3 conjugates in aerobic cells is high in the presence of DMSO and immunoprecipitation is required in order to detect SAMP3 conjugates in the absence of DMSO, we used SAMP1 as a model to further understand whether MoaE is a major target of sampylation. We hypothesized that the predominant ~65-kDa band detected as a SAMP1 conjugate in aerobic cells grown on complex medium is isopeptide-linked SAMP1-MoaE (Fig. 8, lane 1). Thus, we monitored the formation of SAMP1 conjugates in wild-type and ΔmoaE mutant strains grown with oxygen (minus DMSO). With this approach, we found that deletion of the moaE gene disrupted the formation of the major ~65-kDa SAMP1 conjugate (Fig. 8, lanes 1 and 2). We found that SAMP1 conjugates, similar to those of SAMP3, were more abundant and were shifted to a greater diversity of bands that migrated as 40- to 200-kDa proteins when cells were aerobically grown on medium supplemented with DMSO (Fig. 8, lanes 3 and 4). This increase was independent of MoaE, with differences in the ~65-kDa band difficult to discern because of the complexity of banding patterns observed in this region of the immunoblot. However, based on these results, we can conclude that MoaE is a major target of samp1ylation in Hfx. volcanii cells grown aerobically in nutrient-rich conditions devoid of alternative electron acceptors such as DMSO. This form of MoaE is proposed to be reactivated by HvJAMM1 to function with free SAMP1 in MPT synthase sulfur transfer activity as conditions become limiting in nutrients and oxygen. Thus, samp1ylation would serve an autoregulatory role. In contrast, SAMP3, although not functional in sulfur transfer to form MPT, is proposed to add an additional layer in regulating MPT synthase activity. The samp1/3 genes are encoded on separate regions of the Hfx. volcanii chromosome and thus are anticipated to be differentially controlled and provide an added layer of regulation to the system.
We propose that covalent modification of MoaE by Ubl proteins is a general mechanism for regulating MoCo biosynthesis in diverse archaea and bacteria. Many prokaryotes encode linear fusions of an N-terminal Ubl domain and C-terminal MoaE domain (e.g. Ref. 63). This type of precursor polypeptide is not anticipated to be active in the synthesis of MPT and instead is thought to require cleavage to expose the C-terminal diglycine motif of the Ubl domain for sulfur transfer. HvJAMM1 cleaves linear and isopeptide-linked forms of Ubl-MoaE fusions (26), suggesting that this enzyme and its widespread JAMM/MPN+ domain homologs are crucial for the reactivation of MPT synthase in diverse organisms.
Summary-Here we report a new Ubl protein (named SAMP3) that functions in protein modification in the halophilic archaeon Hfx. volcanii. We provide strong evidence that the start codon of SAMP3 is the ATG codon corresponding to Met22 of open reading frame HVO_2177 in the original genome sequence annotation (40). Through LC-MS/MS analysis, we have shown that protein conjugation can occur through isopeptide bonds between the C-terminal glycine of SAMP3 and the ε-amino group of lysine residues on protein targets (with 28 conjugation sites mapped to 23 proteins). Similar to that of SAMP1/2 conjugates, the formation of SAMP3 conjugates was found to be dependent on the E1-like enzyme UbaA and the C-terminal diglycine motif of SAMP3. Thus, the ubiquitin-activating E1 enzyme homolog UbaA appears to activate a wide variety of Ubl proteins for protein conjugation. The specificity of the HvJAMM1 metalloprotease now appears broader than we previously reported (26) based on its capacity to cleave not only SAMP1/2 but also SAMP3 from protein conjugates. Thus, sampylation appears regulated and reversible. Unlike what has been observed for SAMP1/2, the abundance of proteins conjugated to SAMP3 was not substantially altered by perturbation of proteasome function, suggesting that the SAMPs are diversified to serve proteolytic and nonproteolytic roles in archaeal cells. Whether distinct boundaries can be drawn is not yet clear, as even small Ubl modifier and Ub, which were once thought to clearly demarcate proteins for respective nonproteolytic and proteolytic fates, are now found to have converging roles in the cell (64).
We also provide evidence that SAMP3 is involved in the regulation of MoCo biosynthesis. In particular, we have shown, by means of MS/MS, immunoprecipitation, and tandem affinity purification, that SAMP3 is covalently linked by isopeptide bonds to conserved active site lysine residues of MoaE, the Hfx. volcanii homolog of the large subunit of MPT synthase, when cells are grown in the presence of oxygen. SAMP3-MoaE conjugates are cleaved by HvJAMM1, suggesting that the inactivation of MoaE is reversible. Thus, SAMP3 may regulate MoCo biosynthesis by inhibiting the activity of MPT synthase under aerobic conditions, providing a hierarchy of oxygen use prior to that of alternative electron acceptors such as DMSO. DMSO-mediated induction of SAMP3 conjugate and transcript levels is consistent with this model of regulation.