A Transmembrane Crenarchaeal Mannosyltransferase Is Involved in N-Glycan Biosynthesis and Displays an Unexpected Minimal Cellulose-Synthase-like Fold

Protein glycosylation constitutes a critical post-translational modification that supports a vast number of biological functions in living organisms across all domains of life. A seemingly boundless number of enzymes, glycosyltransferases, are involved in the biosynthesis of these protein-linked glycans. Few glycan-biosynthetic glycosyltransferases have been characterized in vitro, mainly because the majority are integral membrane proteins and relevant acceptor substrates are scarce. The crenarchaeote Pyrobaculum calidifontis belongs to the TACK superphylum of archaea (Thaumarchaeota, Aigarchaeota, Crenarchaeota, Korarchaeota) that has been proposed as a eukaryotic ancestor. In archaea, N-glycans are mainly found on cell envelope surface-layer proteins, archaeal flagellins and pili. Archaeal N-glycans are distinct from those of eukaryotes, but one noteworthy exception is the high-mannose N-glycan produced by P. calidifontis, which is similar in sugar composition to the eukaryotic counterpart. Here, we present the characterization and crystal structure of the first crenarchaeal membrane glycosyltransferase to be characterized, PcManGT. We show that the enzyme is a GDP-, dolichylphosphate-, and manganese-dependent mannosyltransferase. The membrane domain of PcManGT includes three transmembrane helices that topologically coincide with “half” of the six-transmembrane helix cellulose-binding tunnel in Rhodobacter sphaeroides cellulose synthase BcsA. Conceivably, this “half tunnel” would be suitable for binding the dolichylphosphate-linked acceptor substrate. The PcManGT gene (Pcal_0472) is located in a large gene cluster comprising 14 genes, of which 6 code for glycosyltransferases, and we hypothesize that this cluster may constitute a crenarchaeal N-glycosylation (PNG) gene cluster.


Introduction
Across all three domains of life, protein glycosylation constitutes a critical post-translational modification that supports a vast number of biological functions in living organisms, such as protein stability, sorting, cell-cell recognition and signaling, transport, and dynamic adaptation to changing environments [1][2][3][4]. About two-thirds of all eukaryotic proteins are predicted to be glycosylated, and the majority of these carry asparagine-linked (N-)glycans [5]. While all three domains of life perform protein glycosylation, the archaeal glycosylation machineries appear more closely related to the eukaryotic systems [6], albeit highly blended with distinct traits of both bacteria and eukarya [7]. N-glycosylation in particular is more common in eukarya and archaea than in bacteria [6].
All N-glycans are assembled on a membrane-bound lipid carrier. Eukaryotes use dolichyldiphosphate (Dol-PP) as lipid carrier, archaea use either Dol-PP or Dol-P, and bacteria use undecaprenol monophosphate. In eukaryotes, the Dol-PP-linked heptasaccharide core (GlcNAc)2-(Man)5 of the lipid-linked oligosaccharide (LLO) is assembled at the cytoplasmic face of the ER membrane. The LLO is then translocated to the lumenal side, where the glycan is further mannosylated and glucosylated to give the LLO tetradecasaccharide, (GlcNAc)2-(Man)9-(Glc)3, which is transferred "en bloc" by the oligosaccharyltransferase (OST) to the appropriate protein substrate [8]. The enzymes responsible for completing the heptasaccharide core are Alg7, Alg13/14, Alg1, Alg2 and Alg11, where the latter three are mannosyltransferases that transfer a mannosyl residue from GDP-α-Man to an intermediate built on a chitobiose-charged Dol-PP-carrier lipid (Figure S1). After translocation of the heptasaccharide to the ER lumen, Alg3, Alg9 and Alg12 add four additional mannosyl residues, and Alg6, Alg8 and Alg10 each add one glucosyl unit to complete the LLO tetradecasaccharide. The GTs resident on the ER-lumenal face use Dol-P-linked mannose (Dol-P-Man) and glucose (Dol-P-Glc) as activated glycosyl donors for the transfer reactions instead of nucleotide sugars.
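The sugar bookkeeping of the eukaryotic LLO pathway described above can be checked with a short tally. This is only an illustrative sketch: the dictionary names and the helper `n_sugars` are hypothetical, and the per-face residue counts are taken directly from the preceding paragraph.

```python
from collections import Counter

# Cytoplasmic face: heptasaccharide core assembled on Dol-PP (from the text).
core = Counter({"GlcNAc": 2, "Man": 5})

# Lumenal face: Alg3, Alg9 and Alg12 together add four mannosyl residues;
# Alg6, Alg8 and Alg10 each add one glucosyl unit (from the text).
lumenal_additions = Counter({"Man": 4, "Glc": 3})

# Completed lipid-linked oligosaccharide.
full_llo = core + lumenal_additions


def n_sugars(composition):
    """Total number of sugar residues in a composition."""
    return sum(composition.values())


print("core:", dict(core), "->", n_sugars(core), "residues")
print("full LLO:", dict(full_llo), "->", n_sugars(full_llo), "residues")
```

The totals reproduce the hepta- and tetradecasaccharide named in the text, (GlcNAc)2-(Man)5 and (GlcNAc)2-(Man)9-(Glc)3.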
Archaea produce a wider variety of glycans compared with eukaryotes, which is manifested as an equally impressive diversity of the glycan-biosynthetic enzymes [6,7]. As of May 2020, more than 10,000 gene sequences coding for archaeal glycosyltransferases (GTs), originating from 364 archaeal genomes, have been classified into 31 distinct GT families in the carbohydrate-active enzymes database (CAZy; www.cazy.org) [9]. In archaea, glycan assembly occurs mainly on the cytoplasmic face of the plasma membrane, but in some cases the glycan is further modified after the lipid-linked glycan has been translocated to the extracellular side for subsequent transfer to surface proteins. Examples are Haloferax volcanii AglS, which transfers the terminal mannose from Dol-P-Man to the lipid-linked tetrasaccharide at the extracellular face of the membrane [10], and the post-transfer modification observed for the Sulfolobus solfataricus LLO, which becomes extended by one hexose on the extracellular side [11]. The main substrates for protein N-glycosylation are typically cell envelope surface-layer proteins (S-layer proteins) and archaellins (archaeal flagellins) [3,12,13].
Of the few archaeal N-glycan structures that have been characterized to date, one stands out as noteworthy from an evolutionary perspective. The biantennary high-mannose N-glycan (HMG) produced by the crenarchaeon Pyrobaculum calidifontis bears, unlike other characterized archaeal N-glycans, distinct resemblance with respect to sugar composition to its eukaryotic HMG counterpart [14]. The P. calidifontis N-glycan is an HMG built on a chitobiose-type core consisting of two β-1,4-linked modified GlcNAc residues that are di-N-acetylated at the C2 and C3 positions (Glc(NAc)2), and with an additional carboxylate group at C6 of the second unit (GlcA(NAc)2) (Figure S1) [14]. The eukarya-lookalike HMG of P. calidifontis is particularly interesting since this crenarchaeote belongs to the archaeal TACK superphylum (comprising Thaumarchaeota, Aigarchaeota, Crenarchaeota and Korarchaeota), which has been proposed as an origin of eukaryotes [15,16].
Figure 1. Overall fold of PcManGT. The overall fold of PcManGT shown in two views. Coloring scheme: GT-A fold in orange, IF helices in green, substratum IF helix in violet, and the TM helices in light blue. The model of the PcManGT·Mn2+·GDP complex is shown with GDP drawn as a stick object and the Mn2+ ion as a gray sphere. The β-strands are drawn as arrows and α-helices as cylinders. The position of the disordered acceptor loop, which is protruding from IFH1, is indicated as red coil.
Despite the evolutionary diversity of archaeal glycans, the underlying biosynthetic processes remain poorly understood, and few archaeal glycans and GTs have been characterized biochemically and functionally. Of the known archaeal GT genes, three-dimensional structural information for the gene products exists for only six unique enzyme activities, all of which are of euryarchaeal origin. Furthermore, of these few characterized enzymes, merely two represent integral membrane proteins, namely, the dolichylphosphate mannose synthase from the euryarchaeon Pyrococcus furiosus (PfDPMS) belonging to family GT2 [17], and the GT66-family member OST AglB from the euryarchaeon Archaeoglobus fulgidus [18]. Specifically, there are no biochemical or structural data available for any GT member belonging to the TACK superphylum.
Here, we present the first crystal structure and biochemical characterization of a glycosyltransferase of TACK origin, namely, the membrane glycosyltransferase PcManGT from P. calidifontis. We report the structures of PcManGT in the unliganded state, in complex with GDP·Mn2+ and with GDP-α-Man·Mn2+, as well as structural comparisons with related GTs, and discuss the possible function of PcManGT in N-glycan biosynthetic pathways of P. calidifontis.

Structure determination and overall fold
The diffraction limit of PcManGT crystals could be improved from 4 to 2.6 Å resolution by crystallization in meso using bicelles (i.e., bilayer micelles), which are open disk-like structures that offer an environment more akin to native membranes. Structure determination was initially hampered by insufficient isomorphous or anomalous signal in the large number of data sets screened for heavy-atom-exposed crystals, but experimental starting phases could eventually be estimated by the use of a sodium tungstate cluster (Table S1). The overall fold of PcManGT features a catalytic extramembrane domain with the GT-A fold that is canonical for family-2 GTs (www.cazy.org) (Figures 1 and S2). As in other membrane GTs [17], the catalytic domain is connected to two membrane-interface helices, IFH1 and IFH2. The transmembrane (TM) domain is composed of three antiparallel TM helices (TMH1-TMH3). TMH3 is short and is disrupted by a 90° kink midway across the membrane, after which it continues roughly parallel with, but below, the membrane surface where it forms a substratum for IFH1 and IFH2 (Figures 1 and S2). Hence, we refer to this helix as the substratum interface helix (sub-IFH3).

Donor specificity and donor-crystal complexes
Isothermal titration calorimetry (ITC) was used to test the affinity of PcManGT for a relevant selection of diphosphonucleotides, diphosphonucleotide sugars and divalent metal cations. Of the four diphosphonucleotides (ADP, CDP, UDP and GDP) tested in the presence of the metal cofactor Mn2+, dissociation constants could only be obtained for GDP (Figure 2 and Table S2). ITC analysis of GDP-α-mannose (GDP-α-Man) and GDP-α-glucose (GDP-α-Glc), both of which are candidate donors for PcManGT, revealed a 2-fold lower Kd value for GDP-α-Man in the presence of Mn2+ (Figure 3 and Table S2). ITC analysis of GDP and GDP-α-Man in the presence of Mn2+, Mg2+ or Ca2+ indicated Mn2+ as the preferred divalent metal cation. Guided by the ITC results regarding the donor preference, PcManGT·Mn2+ crystals were soaked with either GDP, GDP-α-Man or GDP-α-Glc. Crystals of PcManGT soaked with GDP-α-Glc did not produce ordered electron density for the donor substrate; however, crystal complexes with GDP and GDP-α-Man were determined at 2.7 Å resolution.
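To put the 2-fold difference in Kd into energetic terms, the standard relation ΔG° = RT ln(Kd / 1 M) can be applied. The sketch below is illustrative only: the temperature of 298.15 K is an assumption (the ITC temperature is not stated in this section), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_binding(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in M.

    Uses dG = RT ln(Kd / 1 M); the default temperature is an assumption.
    """
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_ratio, temperature_k=298.15):
    """Free-energy difference (kcal/mol) implied by a ratio of two Kd values."""
    return R_KCAL * temperature_k * math.log(kd_ratio)


# A 2-fold lower Kd (ratio 0.5) for GDP-alpha-Man relative to GDP-alpha-Glc.
ddg = delta_delta_g(0.5)
print(f"ddG for a 2-fold lower Kd: {ddg:.2f} kcal/mol")
```

Under these assumptions, a 2-fold lower Kd corresponds to roughly 0.4 kcal/mol of additional binding free energy, which is a modest preference on its own.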
The guanine ring in both the GDP (Figure 4(a)) and GDP-α-Man (Figure 4(b)) complexes forms hydrogen bonds via its N1 and N2 groups to Ser82, and the O6 oxygen is within interaction range of Arg112 Nε. The two ribose hydroxyl groups (O2′ and O3′) are involved in hydrogen bonding with the backbone amide of Tyr52, the backbone amide nitrogen of Pro50, and the side-chain of Glu54. Additional stacking occurs between the ring of Pro50 and the GDP ribose ring. The DXD motif in PcManGT is defined by Asp135 and Asp137. Asp137 coordinates the diphosphate group indirectly via the Mn2+ ion, while Asp135 stacks with the ribose C4′-C5′ moiety of GDP. The importance of the divalent metal cation is confirmed by the ITC data, where the presence of Mn2+ significantly increases the affinity of PcManGT for the substrate (Table S2).
In the GDP complex (Figure 4(a)), the β-phosphate oxygen atoms are stabilized by Arg261 Nη2 and Nε, and the Trp257 Nε1 nitrogen. Arg260 Nη2 is suitably positioned to interact with the bridging oxygen between the α- and β-phosphate groups, but the Arg260 side-chain has less well-defined electron density, making the assignment of these interactions more uncertain. In the GDP-α-Man complex (Figure 4(b)), the phosphate oxygen atoms interact only with the manganese ion, while Arg261 interacts with the mannosyl group.
The overall electron density for the mannosyl group in the PcManGT·Mn2+·GDP-α-Man complex is somewhat featureless (Figure S3), probably due to local motion. Nonetheless, possible interactions with the protein can be predicted: Asp135 Oδ2 is positioned within hydrogen-bonding distance of the mannosyl O5 oxygen; His264 Nδ1 can interact with the O4 hydroxyl group; and the Gly220 backbone carbonyl oxygen can reach within hydrogen-bonding distance of the mannosyl O3 hydroxyl group (Figure 4(b)). Arg260 is still mainly disordered, but residual side-chain density indicates that it no longer interacts with the donor's phosphate groups. Compared with the two liganded PcManGT structures, we observe only minor positional shifts of the active-site side-chains in the unliganded structure.

Thermal stabilization by lipids
The effect of phospholipids and dolichol compounds on the stabilization to thermal unfolding of PcManGT was tested (Figure 5(a) and Table S3). The anionic 55- and 95-carbon dolichylmonophosphates (Dol55-P and Dol95-P) provided stabilization corresponding to an increased melting temperature (ΔTm) of 30°C and 22°C, respectively. Similarly, the anionic phospholipids phosphatidylserine (PS) and phosphatidylglycerol (PG) provided relative thermal stabilization of PcManGT by ΔTm values of 21°C (PS) and 24°C (PG). While the neutral lipid phosphatidylcholine (PC) was marginally stabilizing, phosphatidylethanolamine (PE) did not show a stabilizing effect. PfDPMS, which like PcManGT depends on dolichyl lipids for function, shows a similar, but weaker, trend for preferential stabilization by anionic lipids (Figure 5(b)). Dolichylphosphates do not play a functional role for the prokaryotic MFS transporter used as membrane protein control, and while there is a certain amount of stabilization, no distinct preference for the lipid head group or an isoprenoid-type chain is observed for this transporter (Figure 5(c)). As expected for the lipid-independent soluble control protein EcArnA, the addition of lipids does not provide any significant protection against thermal denaturation (Figure 5(d)).
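The relative stabilization quoted above is simply the difference between the melting temperature with and without added lipid. A minimal sketch follows, using only the ΔTm values for PcManGT stated in the text; the function and variable names are hypothetical, and no absolute melting temperatures are assumed.

```python
def delta_tm(tm_with_lipid, tm_without_lipid):
    """Change in melting temperature (deg C) upon lipid addition."""
    return tm_with_lipid - tm_without_lipid


# Delta-Tm values (deg C) for PcManGT as reported in the text.
pcmangt_delta_tm = {"Dol55-P": 30, "Dol95-P": 22, "PG": 24, "PS": 21}


def rank_by_stabilization(values):
    """Lipid names ordered from most to least stabilizing."""
    return sorted(values, key=values.get, reverse=True)


for lipid in rank_by_stabilization(pcmangt_delta_tm):
    print(f"{lipid}: +{pcmangt_delta_tm[lipid]} deg C")
```

Ranked this way, Dol55-P is the most stabilizing of the four anionic lipids, while Dol95-P falls within the range of the anionic phospholipids.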

Characterization of enzyme-dependent donor hydrolysis
Both retaining GTs [19,20] and inverting GTs [21] have been reported to be able to catalyze the destructive hydrolysis of their donor substrates in the absence of a natural acceptor. Based on the very strong relative stabilization of PcManGT by dolichylphosphate, we expect the natural acceptor to be an archaeal dolichyldiphosphate-linked sugar needed for synthesis of an LLO intermediate; however, Dol-P-linked sugars are not easily available. In the absence of suitable acceptors, we noted a distinct signal in reactions with GDP-α-Man as sole donor substrate. We initially considered this phenomenon to be attributable to true PcManGT-dependent GDP-α-Man hydrolysis, to enzyme-independent spontaneous hydrolysis of GDP-α-Man, or to both. To investigate this further, GDP-α-Man and GDP-α-Glc were used as substrates in reactions using the GDP-GLO™ assay, which is highly sensitive and, importantly, does not produce false positive results due to contaminating phosphate in the reagents used. GDP-α-Man hydrolysis was monitored using wild-type PcManGT and a mutant, D135A/D137A, in which the metal-coordinating aspartate residues had been replaced by alanine. We tested the enzymes in the absence and presence of PS, Dol55-P, Dol95-P and bicelles.
Figure 5. Thermal stabilization assay. The effect of lipids on the thermal stability of (a) PcManGT, (b) PfDPMS, (c) a prokaryotic transporter belonging to the MFS superfamily, and (d) EcArnA, a soluble protein from E. coli. The relative stabilization to thermal unfolding by the addition of lipids is expressed as the change in melting temperature (ΔTm). The lipids used were: anionic dolichylmonophosphates (Dol55-P, Dol95-P); anionic phospholipids (PG, PS); and neutral phospholipids (PC, PE). The Dol55-P and Dol95-P data for PfDPMS are from Gandini et al. [17]. All experiments were performed at least twice (n ≥ 2).
The reason for testing both dolichylphosphates was to investigate whether the higher stabilization of PcManGT by Dol55-P ( Figure 5(a)) is also manifested in higher catalytic activity. The reason for testing bicelles was to evaluate whether a bilayer-like environment provides an advantage for activity compared with only adding free phospholipids (PS).
The results show that spontaneous enzyme-independent hydrolysis is detectable but comparatively low for GDP-α-Man, and close to zero for GDP-α-Glc (Figure 6, bars 1 and 11), which highlights an inherent instability of GDP-α-Man compared with GDP-α-Glc under the reaction conditions used (for details, see Materials and Methods). PcManGT reactions with or without the addition of a natural anionic phospholipid (PS) show a 2-fold enzyme-dependent increase in GDP-α-Man hydrolysis relative to solvent-mediated hydrolysis (Figure 6, bars 3 and 4). A further increase in enzyme-dependent hydrolysis is observed when Dol55-P or Dol95-P is added, irrespective of the presence of PS (Figure 6, bars 5-6 and 8-9). The precise length of the dolichyl chain does not seem to influence the activity significantly when the enzyme is present only in a detergent solution.
The most noticeable results were obtained when performing the hydrolysis reactions in the presence of Dol-P and PcManGT incorporated in bicelles (Figure 6, bars 7 and 10). A remarkable increase in hydrolysis of the GDP-α-Man donor is observed when PcManGT is incorporated into bicelles in the presence of Dol55-P (Figure 6, bar 7), which corresponds to the natural length of the lipid carrier. Moreover, the results show that (i) the DXD mutant lacks detectable activity beyond spontaneous solvent-mediated donor hydrolysis (Figure 6, bar 2), which supports the importance of Asp135 and Asp137 for coordinating the donor substrate and metal-ion cofactor, respectively; and (ii) GDP-α-Glc is not a substrate for PcManGT (Figure 6, bars 12-13). In addition, we made a preliminary attempt to identify possible acceptor substrates using soluble intermediates of the eukaryotic LLO-biosynthetic pathway (2α-mannobiose, 3α-mannobiose, 6α-mannobiose, 3α,6α-mannotriose, and chitobiose; Figure S4). Possible product formation was monitored using HPAEC-PAD analysis (Figure S5); however, no products were generated in the presence of soluble sugar intermediates, and only GDP-α-Man hydrolysis was observed.

Comparison with related enzymes
In a search for the experimental 3-D structure most similar to that of full-length PcManGT, the DALI server (http://ekhidna2.biocenter.helsinki.fi/dali/) returned the catalytic BcsA subunit of Rhodobacter sphaeroides cellulose synthase (PDB 4HG6) [22] as the closest match, with an r.m.s.d. value of 3.2 Å for 325 aligned residues and a resulting sequence identity of 19%. A DALI search using only the catalytic extramembrane domain of PcManGT gave the best match for chondroitin synthase (PDB 2Z86), resulting in an r.m.s.d. of 2.7 Å for 204 aligned residues, corresponding to a sequence identity of 20%. These global measures of similarity highlight that PcManGT is distinct from GTs with known structure.
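For readers unfamiliar with the two similarity measures quoted above, the following sketch shows how an r.m.s.d. over paired atoms and a percent identity over aligned residues are commonly computed. It is a generic illustration under stated assumptions, not a reproduction of the DALI procedure: the coordinates are assumed to be already superimposed, the function names are hypothetical, and the toy inputs are invented for the example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the two structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over aligned (non-gap) positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no aligned positions")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


# Toy inputs (hypothetical): two atoms displaced by 1 A along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(toy_a, toy_b))
print(percent_identity("ACD-F", "ACE-F"))
```

Because both measures are normalized by the number of aligned positions, values such as 3.2 Å over 325 residues and 2.7 Å over 204 residues are not directly comparable without considering the alignment length.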
A structure-based alignment of PcManGT with the few available transmembrane GT2 members with known 3-D structure emphasizes the similarities and differences (Figures S6 and S2). The most similar protein, the synthase subunit RsBcsA (PDB 4HG6) [22], can be divided into three principal parts: the N-terminal region (residues 1-60), which folds as two TM helices (TMH1, TMH2) that coil with the C-terminal TM helix of the BcsB subunit to form a bundle; a cellulose synthase region with an extramembrane catalytic GT-A domain connected to a TM domain with six TM helices (TMH3-TMH8) that form a narrow cellulose channel (residues 63-123, 404-467, 522-572); and a PilZ domain with a β-barrel fold (residues 583-675) linked to an extensive C-terminal α-helical region that wraps around the GT-A domain like a collar (residues 676-759). Since PcManGT does not require either a BcsB or PilZ domain, only the cellulose synthase region of BcsA (residues 64-572) is relevant for a structural comparison.
The PcManGT structure maps to the region in BcsA defined by residues 96-496 (Figures S6 and S2). PcManGT and BcsA share the same triangular arrangement of IFH1, IFH2 and the substratum IF helix (sub-IFH3, residues 474-496). With respect to the arrangement of TM helices, the three TM helices in the TM domain of PcManGT can be regarded as one half of the six-TMH domain in BcsA that forms the cellulose-binding channel (i.e., TMH3-TMH8) (Figure S7). By this comparison, TMH1, TMH2 and TMH3 in PcManGT correspond to TMH4, TMH5 and TMH6 in BcsA, respectively. However, a positional shift of PcManGT-TMH3, and the shorter loop preceding it, causes PcManGT-TMH3 to superimpose with BcsA-TMH8 rather than BcsA-TMH6. Furthermore, PcManGT lacks the sequence pattern [D,D,D,Q(Q/R)XRW] that is typical for processive GTs [23], emphasizing that PcManGT does not act as a sugar polymerase. The presumed lack of processivity and the topologically simpler "half-channel" TM-domain motif in PcManGT (Figure S7A) would be in good agreement with an enzyme that catalyzes a single glycosyl-transfer reaction to a Dol-PP-linked acceptor sugar during N-glycan biosynthesis in P. calidifontis.
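The contiguous part of the processivity signature, Q(Q/R)XRW, lends itself to a simple pattern search. The sketch below is a hedged illustration only: it checks just this five-residue segment and not the three separately positioned Asp residues of the full [D,D,D,Q(Q/R)XRW] pattern, the function name is hypothetical, and the test strings are invented toy sequences rather than real protein sequences.

```python
import re

# Q, then Q or R, then any residue, then R and W.
QXXRW = re.compile(r"Q[QR].RW")


def find_qxxrw(sequence):
    """Return 0-based start positions of Q(Q/R)XRW matches in a sequence."""
    return [m.start() for m in QXXRW.finditer(sequence.upper())]


# Toy sequences (hypothetical, for illustration only).
with_motif = "MKTAQRGRWLL"
without_motif = "MKTAQAGRWLL"

print(find_qxxrw(with_motif))
print(find_qxxrw(without_motif))
```

An empty result for a real sequence would be consistent with, but would not by itself prove, a non-processive enzyme; the structural context described above remains the stronger argument.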
In PfDPMS, the A-loop is defined as an extensive connection between the last β-strand of the GT-A domain (β7) and IFH2 (Figure S2C), which displays a particularly high sensitivity to the absence or presence of acceptor substrate [17]. Typically, this loop is ordered in the absence of acceptor, and once the glycolipid product is bound, the IF helices are separated and pull the A-loop into an open and disordered state. The A-loop in DPMS is directly tethered to the metal center via Arg202, which provides a rationale for the release of the A-loop in the post-catalysis state when the metal has departed.
In PcManGT, the β7/IFH2 loop appears well ordered, but instead, the loop β5/IFH1 (residues 168-177) shows disorder. PcManGT lacks a counterpart to Arg202 in DPMS and thus does not seem to couple IFH2 to the presence of metal ion, acceptor substrate or product. Interestingly, BcsA has a corresponding loop (residues 278-297) that closes as a lid over the triangular opening between IFH1, IFH2 and the sub-IFH3. This opening leads directly to the position of a presumed third glucosyl-binding subsite at Phe301 in IFH1 and Trp417 in TMH5 (corresponding in PcManGT to Tyr181 in IFH1 and His293 in TMH2) (Figure S8A). A similar opening exists in PcManGT (Figure S8B), and we anticipate that the A-loop, once ordered, presumably in the presence of acceptor, would fold over this opening, shielding the transfer site in a way similar to that seen in BcsA. Regardless of whether the A-loop is located at β7/IFH2 as in PfDPMS and GtrB, or at β5/IFH1 as in PcManGT and RsBcsA, the common theme is that the loop connects to either of the two IF helices, which emphasizes the functional role of IF helices in regulating lipid-linked acceptor association/dissociation in transmembrane GTs.

Hypothetical N-glycosylation gene cluster in P. calidifontis
Biosynthesis of the eukaryotic LLO heptasaccharide core requires three mannosyltransferases, Alg1 (GT33), Alg2 (GT4) and Alg11 (GT4), whereas the complete LLO requires three additional non-Leloir mannosyltransferases (Alg3, Alg9 and Alg12). We analyzed the P. calidifontis genome for Leloir-mannosyltransferase genes, specifically GT counterparts for alg1, alg2, and alg11. The P. calidifontis genome contains 21 GT genes: 9 GT2, 6 GT4, 1 GT5, 1 GT20, 1 GT35, 1 GT66 and 2 GTNC, where all but GT2 and GT66 are retaining. Most of the GT genes are scattered throughout the genome, with one exception: a 14-gene cluster of 13,404 bp containing the genes Pcal_0470 to Pcal_0483 (http://csbl.bmb.uga.edu/DOOR). This gene cluster is the largest predicted cluster in the P. calidifontis genome (Table S4) and contains six GT genes: three GT2 (Pcal_0472, Pcal_0478, Pcal_0481) and three GT4 (Pcal_0473, Pcal_0475, Pcal_0483). Thus, half of all GT4 genes are present in this cluster. Considering that several of the mannosyltransferases in the eukaryotic HMG-biosynthetic pathway are GT4 enzymes, the presence of three GT4 genes in this cluster is interesting and raises the possibility that it may constitute a protein N-glycosylation gene (PNG) cluster.
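The gene counts quoted above can be cross-checked with a short tally. This sketch only restates numbers and locus tags given in the text; the variable names are hypothetical, and the cluster size calculation assumes that the locus tags Pcal_0470 to Pcal_0483 are numbered consecutively.

```python
# GT genes per CAZy family in the P. calidifontis genome (from the text).
gt_family_counts = {
    "GT2": 9, "GT4": 6, "GT5": 1, "GT20": 1, "GT35": 1, "GT66": 1, "GTNC": 2,
}

# GT genes located in the proposed cluster (from the text).
cluster_gt_genes = {
    "GT2": ["Pcal_0472", "Pcal_0478", "Pcal_0481"],
    "GT4": ["Pcal_0473", "Pcal_0475", "Pcal_0483"],
}

total_gt_genes = sum(gt_family_counts.values())
cluster_gt_total = sum(len(genes) for genes in cluster_gt_genes.values())
gt4_fraction_in_cluster = len(cluster_gt_genes["GT4"]) / gt_family_counts["GT4"]

# Assumption: locus tags in the cluster are consecutive integers.
cluster_size = 483 - 470 + 1

print(total_gt_genes, cluster_gt_total, gt4_fraction_in_cluster, cluster_size)
```

The tally agrees with the stated totals: 21 GT genes genome-wide, six in the cluster, half of all GT4 genes in the cluster, and 14 genes spanning Pcal_0470 to Pcal_0483.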
Alg1 adds the first mannosyl unit to the chitobiose core of the LLO and belongs to family GT33 of inverting mannosyltransferases. There is no GT33 gene in the P. calidifontis genome, which means that the enzyme adding the α-linked mannose to the chitobiose-like core belongs to a different GT family. The gene coding for PcManGT (Pcal_0472) could be a possible candidate for Alg1. Since PcManGT and Alg1 belong to different GT families, it is not meaningful to compare their sequences; however, both display a relatively simple transmembrane architecture: PcManGT has three TM helices, and Alg1, whose structure is unknown, is predicted to have 1-3 TMHs depending on the prediction software used (TOPCONS; http://topcons.cbr.su.se) [24]. Two GT4 genes (Pcal_0473 and Pcal_0483) in this cluster could be possible candidates for the eukaryotic alg2 gene (Table S4). The gene Pcal_2037 has been proposed to be an alg11 ortholog [14] and is 23.8% identical in amino-acid sequence to human Alg11. While Pcal_2037 is not present in the proposed PNG cluster, Pcal_0483 shows 23.1% sequence identity to human Alg11 and could be an alternative. The GT responsible for transferring the N-glycan onto the protein asparagine substrate is referred to as AglB in archaea (OST in eukaryotes) and belongs to GT family 66. The only GT66 gene in the P. calidifontis genome, Pcal_0997, is the likely candidate for this activity. Hyperthermophilic and crenarchaeal GT genes responsible for N-glycan biosynthesis do not typically display aglB-based N-glycosylation-gene clustering as in euryarchaeal genomes [25,26], and in agreement with this observation, Pcal_0997 is distantly located some 470,000 bp downstream from the proposed PNG gene cluster.
Mannosyltransferases capable of catalyzing the formation of glycosidic bonds in α-configuration are either retaining GTs in the cytoplasm that use GDP-α-Man as donor, or non-cytoplasmic inverting GTs that use Dol-P-β-linked mannose as donor substrate. The amino-acid sequence of PcManGT has been classified in the CAZy database [9] as belonging to the GT2 family of inverting glycosyltransferases. This classification implies that the enzyme acts by inversion of configuration at the anomeric center, meaning that transfer of mannose from the donor GDP-α-Man will result in a mannosylated product with the mannose attached by a β-configured glycosidic bond. Our results show that GDP-α-Man is the natural donor substrate for PcManGT, and that the enzyme catalyzes the hydrolytic side-reaction in the absence of acceptor. The ability of PcManGT to readily accept water as acceptor in enzyme-catalyzed hydrolysis of the donor substrate is a feature normally associated with a retaining reaction mechanism. Whether PcManGT is indeed a bona fide GT2 member, or possibly a retaining GT2-like enzyme, is a relevant question for further studies.
Like eukaryotes, P. calidifontis assembles its nascent N-glycans on the membrane-bound lipid carrier Dol-PP. The Dol-PP produced by P. calidifontis has an average isoprenoid chain length of 50-55 carbons [11], while the isoprenoid chains of eukaryotic Dol-PP are longer and less saturated. In contrast to crenarchaea and eukaryotes, the LLOs of euryarchaea are assembled on single-phosphorylated dolichol carriers, Dol-P. The dependency of PcManGT on the isoprenoid chain length for structural stability and catalytic activity is evidenced by the ability of Dol55-P to stabilize the enzyme against unfolding, with a notable increase in melting temperature of 30°C (Figure 5(a)). Furthermore, Dol55-P added to a membrane-mimicking environment (bicelles) increases donor hydrolysis by almost 5-fold compared with an environment containing only detergent, or detergent supplemented with the phospholipid PS (Figure 6, bars 3, 4 and 7).
Several membrane proteins have been reported to depend on anionic lipids for function and stability [27,28]. Our results add important new knowledge regarding the function of protein-lipid interactions by establishing that, beyond the general stabilization by anionic lipids, dolichylphosphate-dependent membrane proteins (e.g., PcManGT and PfDPMS) are also specifically stabilized by dolichylphosphates. This behavior was not observed for a dolichylphosphate-independent membrane protein (MFS transporter), or for soluble proteins. Considering the importance of the identity of the lipid carrier, the lack of transfer products from the soluble acceptors tested here is not surprising. For example, in the case of PfDPMS, we showed that precise positioning of the phosphate moiety of Dol-P is critical for the mannosyltransfer reaction [17].
Taken together, these observations convincingly point to a Dol-PP-linked sugar intermediate as the natural acceptor substrate for PcManGT. Furthermore, we have shown by sequence and structure comparisons, ITC, crystal-structure analysis and donor-hydrolysis assays that PcManGT is a Leloir glycosyltransferase that uses GDP-α-Man as donor substrate, in contrast to non-Leloir GTs that use other types of activated glycosyl donors. This further implies that the transfer reaction catalyzed by PcManGT would need to take place on the cytoplasmic face of the P. calidifontis plasma membrane.
That GDP-α-Man serves as the natural donor substrate and that the enzyme requires a Dol-PP-linked sugar acceptor establish that PcManGT acts as a Dol-PP-dependent mannosyltransferase, and thereby assign the enzyme to the P. calidifontis N-glycan-biosynthesis pathway. Thus, the function would be to add a mannosyl residue to one of the mannosyl positions in the P. calidifontis HMG intermediate. The cryo-electron microscopy structure of Saccharomyces cerevisiae Alg6 was reported recently [29]. Alg6 is an inverting non-Leloir GT that adds the first α-1,3-linked glucose moiety to the long antenna of yeast N-glycans (Figure S1). Rather than using a nucleotide sugar, Alg6 uses Dol-P-Glc as the activated glycosyl donor, from which glucose is transferred to Dol-PP-(GlcNAc)2-(Man)9 on the lumenal side of the ER. Alg6 has a GT-C fold that is unrelated to the GT-A fold of PcManGT but shares the functional similarity of depending on a dolichyl-linked substrate.
Docking of a modeled Dol-PP acceptor molecule to the TM domain in PcManGT suggests that a dolichyl isoprenoid chain is likely to adopt a kinked conformation similar to that observed for Dol-P-Glc in Alg6. Furthermore, adding sugar units to the docked Dol-PP molecule indicates that two sugar moieties corresponding to the chitobiose-type core can be accommodated (Figure S9). If this is the case, a mannosyl unit would be transferred from GDP-α-Man to Dol-PP-Glc(NAc)2-GlcA(NAc)2, which would assign a role to PcManGT equivalent to that of the eukaryotic chitobiosyldiphosphodolichol mannosyltransferase Alg1. However, this hypothesis presupposes that PcManGT is a retaining GT, or that the assignment of the chitobiose-mannosyl bond should be in β-configuration rather than α.
The assignment of PcManGT to the N-glycosylation pathway is further strengthened by the observation that the gene Pcal_0472 is present in a large 14-gene cluster in the P. calidifontis genome that includes six GT genes (Table S4). Of the GT genes, four are GT4 enzymes, including possible candidates for the α-1,3/1,6-mannosyltransferase Alg2 (Pcal_0473, Pcal_0483) and Alg7 that adds the first GlcNAc unit to the Dol-PP carrier (Pcal_0478). Thus, it is likely that the gene cluster constitutes a crenarchaeal N-glycosylation (PNG) gene cluster.

Cloning and mutagenesis
The gene coding for P. calidifontis JCM 11548 glycosyltransferase 2 (PcManGT; ABO07903.1; UniProt A3MTD6) was synthesized and codon-optimized by MR.GENE for host expression in Escherichia coli. The gene insert was amplified using the following primers: PcManGT_fwd
The PCR protocol, using 70 ng plasmid DNA and Phusion High-Fidelity DNA polymerase (Thermo Scientific), comprised 98°C for 30 s, followed by 30 cycles of 98°C for 10 s, 65°C for 30 s and 72°C for 90 s, and a final incubation at 72°C for 10 min. Following amplification, the PCR products were purified using the GeneJET PCR Purification Kit (Thermo Scientific).
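As a rough planning aid, the cycling program above can be written down as data and its total hold time summed. The sketch below is only illustrative: the variable and function names are hypothetical, and ramping between temperatures is ignored, so the true run time on any instrument will be longer.

```python
# Thermocycling program as stated in the text; hold times in seconds.
INITIAL_DENATURATION_S = 30   # 98 °C for 30 s
CYCLES = 30
CYCLE_STEPS_S = {
    "denature_98C": 10,
    "anneal_65C": 30,
    "extend_72C": 90,
}
FINAL_EXTENSION_S = 10 * 60   # 72 °C for 10 min


def total_hold_time_s():
    """Sum of all programmed hold times, excluding temperature ramping."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


total = total_hold_time_s()
print(f"Programmed hold time: {total} s ({total / 60:.1f} min)")
```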
The amplified gene was cloned using restriction enzymes into the pWaldo-GFPd vector [30,31], which was digested with NdeI and BamHI before being ligated with the insert using the In-Fusion II cloning kit (Clontech Laboratories). The recombinant plasmid carrying the fusion-gene construct PcManGT-TEV-GFP-His8 was transformed into chemically competent E. coli strain BL21(DE3) cells. Five colonies were picked for screening by colony PCR with the insert primers. Site-directed mutagenesis to generate a catalytically deficient PcManGT mutant (D135A/D137A) was performed using the following forward and reverse primers: D135A/D137A_fwd (5′-CATCACTGCTGCCGCTG CCCTGTGGCCGG-3′).
Mutagenesis was performed using the QuikChange II site-directed mutagenesis kit (Agilent Technologies), as described by the manufacturer. The sequence of the mutated gene was confirmed by automated DNA sequencing, and the plasmid carrying the desired mutations was transformed into E. coli BL21(DE3) for protein production.

Gene expression and solubilization of membrane proteins
Transformed cells were grown on a Lysogeny Broth (LB; Sigma Aldrich) agar plate and incubated at 37°C. Positive transformants were selected with 50 μg/ml kanamycin. An overnight LB seed culture (50 ml) was used to inoculate Terrific Broth (TB; Sigma Aldrich) medium (5 L) containing 50 μg/ml kanamycin. At an OD600 value of 1.0, overexpression of wild-type or D135A/D137A variant PcManGT-TEV-GFP-His8 was induced by adding IPTG to a concentration of 0.2 mM and continued for 15-17 h at 18°C. The cells were harvested in phosphate-buffered saline supplemented with one tablet of cOmplete™ Protease Inhibitor Cocktail (Merck). The bacterial cells were lysed using an Avestin EmulsiFlex C3 homogenizer, followed by sonication to fragment nucleic acids. The cell lysate was centrifuged at 10,000 r.p.m. (12,096 g, Avanti J-20XP centrifuge) for 10 min at 4°C. The supernatant was centrifuged at 35,000 r.p.m. (125,749 g, 45Ti rotor, Optima LE-80K centrifuge, Beckman Coulter) at 4°C for 1 h to separate the membrane fraction.
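The paired r.p.m. and g values quoted for the two centrifugation steps are related through the rotor radius by RCF = ω²r/g0. The following minimal sketch, with hypothetical function names, converts between the two and back-calculates the radius implied by each quoted pair. The radii it prints are derived from the numbers in the text, not taken from rotor specifications.

```python
import math

G0 = 9.80665  # standard gravity in m/s^2


def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    omega = 2 * math.pi * rpm / 60          # angular velocity in rad/s
    return omega ** 2 * (radius_cm / 100) / G0


def implied_radius_cm(rpm, rcf_value):
    """Radius (cm) at which the given speed produces the given RCF."""
    omega = 2 * math.pi * rpm / 60
    return rcf_value * G0 / omega ** 2 * 100


# (r.p.m., x g) pairs as quoted in the text
for rpm, g in [(10_000, 12_096), (35_000, 125_749)]:
    r = implied_radius_cm(rpm, g)
    print(f"{rpm} r.p.m. and {g} g imply a radius of about {r:.1f} cm")
```

Such a conversion is useful when reproducing the protocol with a different rotor, since the g value rather than the speed is the transferable quantity.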
To obtain crystals of PcManGT in complex with GDP•Mn2+ and GDP-α-Man•Mn2+, crystals were soaked for 5 min in the presence of 10 mM GDP or GDP-α-Man, respectively. For experimental phasing, a PcManGT crystal was soaked for 5 min with 10 mM Na2WO4.
Intensity data for all crystals, derivatized and nonderivatized, were recorded using synchrotron radiation (Table S1), followed by merging and scaling using the XDS package [33]. The protein crystallized in space group P212121 with two molecules in the asymmetric unit and approximately 50% solvent content. Heavy-atom substructure solution and phasing of amplitudes (Table S1) were performed using a crystal derivatized with Na2WO4 using single isomorphous replacement with anomalous scattering (SIRAS) as implemented in autoSHARP [34] included in the SHARP package [35]. The electron density was improved and phase-extended to 2.7 Å by solvent flattening using the program SOLOMON [36] implemented in SHARP with an optimized solvent fraction of 0.502. Noncrystallographic symmetry (NCS) operators were deduced by manual inspection of the improved electron density in O [37], and used for 2-fold NCS averaging, solvent modification and extension of the modified SHARP phases with the program DM as implemented in the CCP4 package [38]. Additional density modification of the NCS-averaged phases was performed using AutoBuild in PHENIX [39], and a preliminary partial model was built using both the improved SIRAS and PHENIX-modified electron densities. An initial test refinement was performed using BUSTER [40]. The model was completed by alternating manual model building and rebuilding using the programs O [37] and COOT [41], and refinement with PHENIX guided by likelihood-weighted electron-density maps. The Fourier amplitudes for the crystal complexes were phased using Fourier synthesis where model phases were calculated from the refined unliganded PcManGT model. Refinement with PHENIX included real-space refinement, NCS restraints, secondary-structure restraints for α-helices, refinement of individual atomic displacement parameters and TLS refinement. Refinement statistics are given in Table S1.
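The solvent fraction used for density modification can be related to the Matthews coefficient V_M through the commonly used approximation solvent fraction ≈ 1 − 1.23/V_M. The sketch below applies this to the optimized solvent fraction of 0.502 quoted above. The factor 1.23 Å³/Da assumes a typical protein partial specific volume of about 0.74 cm³/g, and the function names are hypothetical; this is not how SHARP or SOLOMON optimizes the value.

```python
# Approximation: solvent_fraction = 1 - 1.23 / V_M, with V_M in Å^3/Da.
# The factor 1.23 assumes a protein partial specific volume of ~0.74 cm^3/g.
PROTEIN_FACTOR = 1.23


def solvent_fraction(vm):
    """Estimated solvent fraction for a given Matthews coefficient."""
    return 1.0 - PROTEIN_FACTOR / vm


def matthews_coefficient(solvent):
    """Matthews coefficient (Å^3/Da) implied by a given solvent fraction."""
    return PROTEIN_FACTOR / (1.0 - solvent)


vm = matthews_coefficient(0.502)  # optimized solvent fraction from the text
print(f"Implied Matthews coefficient: {vm:.2f} Å^3/Da")
```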

Isothermal titration calorimetry
ITC was used to identify a possible donor substrate for PcManGT. The strategy employed was to first analyze the interaction with nucleotide diphosphates (adenosine diphosphate, ADP; cytidine diphosphate, CDP; uridine diphosphate, UDP; guanosine diphosphate, GDP), and second, to evaluate possible nucleotide diphosphate sugars guided by the nucleotide preference obtained. Based on the resulting binding data, GDP-α-Man and GDP-α-Glc were selected for ITC analysis.
To avoid buffer mismatch, the enzyme was dialyzed against 50 mM Hepes (pH 7.5), 150 mM NaCl, 5% glycerol (v/v), 0.1% DM (w/v), and the same dialysis buffer was used for dissolving the substrates. Substrate binding was analyzed in the absence and presence of the divalent metal cation Mn2+, Mg2+ or Ca2+, with MnCl2, MgCl2 or CaCl2 added at a final concentration of 1 mM to enzyme and substrate solutions. The measurements were performed using 50 μM of PcManGT and 500 μM of substrates (corresponding to a 10-fold molar excess). Titrations were performed by 25 injections using a MicroCal iTC200 system (GE Healthcare), and the raw data were fitted with the software Origin 7.0.
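The 10-fold excess refers to the stock concentrations; the substrate-to-enzyme molar ratio reached in the cell depends on the cell and injection volumes, which are not stated here. The sketch below illustrates the arithmetic under loudly hypothetical assumptions: enzyme in the cell and substrate in the syringe, a 200 μl cell, 1.5 μl per injection, and no correction for volume displaced from the cell.

```python
ENZYME_UM = 50.0        # PcManGT concentration from the text (assumed in cell)
SUBSTRATE_UM = 500.0    # substrate concentration from the text (assumed in syringe)
CELL_VOLUME_UL = 200.0  # ASSUMPTION: not given in the text
INJECTION_UL = 1.5      # ASSUMPTION: not given in the text
N_INJECTIONS = 25       # from the text


def molar_ratio(n_injections):
    """Cumulative substrate:enzyme molar ratio after n injections.

    Simplified: ignores dilution and displacement of cell contents.
    """
    substrate_pmol = n_injections * INJECTION_UL * SUBSTRATE_UM
    enzyme_pmol = CELL_VOLUME_UL * ENZYME_UM
    return substrate_pmol / enzyme_pmol


for n in (1, 10, N_INJECTIONS):
    print(f"after {n:2d} injections: ratio {molar_ratio(n):.3f}")
```

Under these assumed volumes the titration ends near a twofold molar ratio, which is the quantity plotted on the x-axis of a binding isotherm.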
Following heat treatment, precipitated protein was removed from the samples by centrifugation at 14,500 r.p.m. (19,780 g, F-45-24-11 rotor, Eppendorf 5424 microcentrifuge, Eppendorf) for 10 min at 4°C. The supernatants, containing the protein remaining in solution, were analyzed by Coomassie-stained SDS-PAGE (Coomassie Brilliant Blue R-250). Densitometric analysis of the gel bands was performed with ImageJ (Rasband, W.S., ImageJ, http://imagej.nih.gov/ij/, 1997-2016). Prism 7 (GraphPad Software, San Diego, California, USA) was used to fit the data in the unfolding curves, and to determine the Tm ± SD and the ΔTm ± SEM values.
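To make the link between band intensities and an apparent melting temperature concrete, the sketch below normalizes densitometric intensities to the lowest-temperature sample and locates the temperature at which half of the protein remains soluble. It uses simple linear interpolation rather than the nonlinear curve fit performed in Prism, and the temperatures and intensities are invented for illustration only.

```python
def tm_by_interpolation(temps_c, intensities):
    """Temperature at which the soluble fraction crosses 0.5.

    Intensities are normalized to the first (lowest-temperature) sample,
    and the crossing point is found by linear interpolation.
    """
    reference = intensities[0]
    fractions = [value / reference for value in intensities]
    points = list(zip(temps_c, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 > f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("soluble fraction never crosses 0.5")


# Hypothetical densitometry data (arbitrary units), not from the study
temps = [40, 50, 60, 70, 80, 90]
bands = [1000, 980, 900, 600, 200, 50]
print(f"Apparent Tm: {tm_by_interpolation(temps, bands):.1f} °C")
```

A ΔTm would then be the difference between the values obtained with and without a ligand or lipid.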
Three proteins were used as references: two membrane proteins and one soluble protein.
PfDPMS [17] was used as a reference for a dolichylphosphate-dependent membrane protein; a transporter belonging to the major facilitator superfamily (MFS) was used as a reference for a dolichylphosphate-independent membrane protein; and the soluble protein was used as a reference for a lipid-independent protein. The stability assays were performed as described for PcManGT.
The reaction samples were incubated at 60°C for 3 h. Following incubation, precipitated protein was removed from the samples by centrifugation at 13,000 r.p.m. (15,900 g, F-45-32-5-PCR rotor, Eppendorf 5424 microcentrifuge, Eppendorf). To quantify the hydrolytic activity, we used a protocol based on the GDP-Glo™ Glycosyltransferase Assay (Promega Biotech AB). From each sample, 25 μl was transferred to a 96-well plate (Sarstedt96, solid white), and a 25 μl aliquot of GDP detection reagent (prepared according to the supplier's instructions, GDP-Glo™ Glycosyltransferase Assay, Promega Biotech AB) was added to each sample. The dye-containing samples were incubated at room temperature for 60 min, followed by recording the luminescence at 555 nm using a microplate reader (Omega Fluostar; BMG Labtech). All samples were run in triplicate and from at least two different protein batches. The data were analyzed with Prism 8 (GraphPad Software, San Diego, California, USA) and displayed as mean values ± SEM.
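For readers reproducing the summary statistics outside Prism, the mean ± SEM of a set of replicate readings can be computed as in the following minimal sketch. The luminescence values are hypothetical, and the SEM is taken as the sample standard deviation divided by the square root of the number of replicates.

```python
import math
import statistics


def mean_and_sem(readings):
    """Return (mean, standard error of the mean) for replicate readings."""
    mean = statistics.mean(readings)
    sd = statistics.stdev(readings)   # sample standard deviation (n - 1)
    return mean, sd / math.sqrt(len(readings))


# Hypothetical triplicate luminescence readings (relative light units)
triplicate = [1200.0, 1260.0, 1230.0]
mean, sem = mean_and_sem(triplicate)
print(f"{mean:.0f} ± {sem:.1f} RLU (mean ± SEM, n = {len(triplicate)})")
```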

HPLC analysis of products
Donor substrate hydrolysis was analyzed by high-performance anion-exchange chromatography with pulsed amperometric detection (HPAEC-PAD), using a Dionex ICS 5000 system (Dionex, Sunnyvale, CA) with an autosampler, equipped with a PA-1 column and guard cartridge. The enzymatic reactions were performed as described above for the activity assays with GDP-α-Man as donor substrate, Dol95-P as stabilizing lipid (used instead of Dol55-P since the longer dolichol is more easily available from Larodan), and with the addition of different possible soluble acceptor substrates (Figure S4