Exploring the N-glycosylation Pathway in Chlamydomonas reinhardtii Unravels Novel Complex Structures

Chlamydomonas reinhardtii is a green unicellular eukaryotic model organism for studying relevant biological and biotechnological questions. The availability of genomic resources and the growing interest in C. reinhardtii as an emerging cell factory for the industrial production of biopharmaceuticals require an in-depth analysis of protein N-glycosylation in this organism. Accordingly, we used a comprehensive approach including genomic, glycomic, and glycoproteomic techniques to unravel the N-glycosylation pathway of C. reinhardtii. Using mass-spectrometry-based approaches, we found that both endogenous soluble and membrane-bound proteins carry predominantly oligomannosides ranging from Man-2 to Man-5. In addition, minor complex N-linked glycans were identified as being composed of partially 6-O-methylated Man-3 to Man-5 carrying one or two xylose residues. These findings were supported by results from a glycoproteomic approach that led to the identification of 86 glycoproteins. Here, a combination of in-source collision-induced dissociation (CID) for glycan fragmentation followed by mass tag-triggered CID for peptide sequencing and PNGase F treatment of glycopeptides in the presence of 18O-labeled water in conjunction with CID mass spectrometric analyses was employed. In conclusion, our data support the notion that the biosynthesis and maturation of N-linked glycans in the endoplasmic reticulum and Golgi apparatus occur via a GnT I-independent pathway yielding novel complex N-linked glycans that mature differently from their counterparts in land plants.

Chlamydomonas reinhardtii is a green alga that is used as a model organism for studying a number of biological processes such as photosynthesis, flagellar assembly and function, organelle biosynthesis, phototaxis, and circadian rhythms (1). Studies on glycosylation pathways in C. reinhardtii have mostly focused on O-glycosylation processing, as the cell wall of this organism consists of a vast framework of O-glycosylated hydroxyproline-rich glycoproteins (2,3). More recently, Bollig et al. demonstrated that O-glycans from C. reinhardtii cell wall glycoproteins contain arabinose and galactose, the latter being in the furanose form (4). In contrast, the N-glycosylation pathway, although a major post-translational modification step in the maturation of secreted proteins in eukaryotes, has received very little attention so far. In N-glycan processing, a Man5GlcNAc2-PP-dolichol oligosaccharide intermediate is first assembled onto a dolichol pyrophosphate on the cytosolic face of the endoplasmic reticulum (ER). After translocation of this intermediate by a flippase, the biosynthesis continues in the lumen of the ER until a Glc3Man9GlcNAc2-PP-dolichol N-glycan precursor is completed (5). This precursor is then transferred by the oligosaccharyltransferase (OST) multisubunit complex onto the asparagine residues of the consensus Asn-X-Ser/Thr sequences of a protein (5). The precursor is then deglucosylated/reglucosylated to ensure the quality control of the neosynthesized protein through the interaction with ER-resident chaperones such as calnexin and calreticulin. These ER events are crucial for the proper folding of secreted proteins (6), are conserved in the eukaryotes investigated so far, and involve a limited number of oligomannoside N-glycans. In contrast, the evolutionary adaptation of N-glycan processing in the Golgi apparatus gives rise to a large variety of organism-specific complex structures (7). Type I mannosidases located in this compartment first degrade the oligosaccharide precursor into oligomannoside N-glycans ranging from Man9GlcNAc2 (Man-9) to Man5GlcNAc2 (Man-5). N-acetylglucosaminyltransferase I (GnT I) then transfers a first GlcNAc residue onto the α(1,3)-mannose arm of Man-5 to initiate the synthesis of polyantennary complex-type N-glycans (7).
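The Asn-X-Ser/Thr consensus mentioned above can be located in a protein sequence with a short scan. The following is a minimal sketch, not part of the methods used in this study; the function name is hypothetical, and the exclusion of proline at position X is a common convention that is assumed here rather than stated in the text.

```python
import re

# Asn-X-Ser/Thr sequon. Excluding Pro at position X is an assumption
# (a widely used convention), not something stated in the paper.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues lying in an N-X-S/T motif."""
    # The lookahead keeps matches zero-width after the Asn, so overlapping
    # sequons (e.g. "NNST") are all reported.
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    print(find_sequons("MNGTANPSKNAS"))
```

A sequon is a necessary but not sufficient condition for glycosylation, so positions returned by such a scan are only candidate sites.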
To date, a few studies carried out in Chlorophyceae using on-blot affinodetection or a combination of exoglycosidase digestions and two-dimensional HPLC separation have suggested that proteins secreted by these microalgae harbor mainly oligomannosides or mature N-glycans having a core xylose residue (8-10). Deeper insight into the structure of glycans N-linked to proteins secreted by two algal species, Porphyridium sp. and Phaeodactylum tricornutum, has been recently reported. A cell wall glycoprotein from the red microalga Porphyridium sp. was found to carry Man-8 and Man-9 oligomannosides containing 6-O-methyl mannose and substituted by one or two xylose residues (11). In contrast, glycans N-linked to proteins secreted by the diatom P. tricornutum can be processed through a GnT I-dependent pathway into paucimannosidic oligosaccharides (12).
In contrast to glycomic analysis, which focuses on the structure of N-linked oligosaccharides irrespective of the carrier proteins, glycoproteomics is used to characterize and determine the cell localization of individual proteins carrying these carbohydrate post-translational modifications. Whereas mammalian N-glycoproteomes have been studied extensively down to tissue- and cell-type-specific levels (13-17), less information is available regarding the N-glycoproteomes of plants and green algae (18,19). The use of glycoproteomic approaches could help unravel the identity of endogenous glycoproteins from C. reinhardtii. As this green alga possesses many animal-like features (20), glycoproteomic analyses will help provide information concerning similarities and differences relative to not only mammalian but also vascular plant N-glycosylation pathways and glycoprotein trafficking.
Recently, microalgae have emerged as an alternative system for the production of biopharmaceuticals, which represents a multibillion-dollar industry worldwide (21). The high expense and the complicating factor of potential virus contamination encountered with commonly used expression systems have driven scientists to seek alternatives such as C. reinhardtii cells. Microalgae are cheap, easy to grow, safe, and scalable for the production of large amounts of protein, making them ideal hosts for industrial production (22). Several studies have already demonstrated that the green alga C. reinhardtii is a convenient platform for producing recombinant proteins, including those of human origin (23). For example, a large single-chain antibody directed against glycoprotein D of the herpes simplex virus (24) and full-length IgG1 monoclonal antibodies directed against anthrax protective antigen 83 (25) have been successfully expressed in the chloroplast of transgenic C. reinhardtii cells. The production of secreted therapeutic proteins such as erythropoietin has also been evaluated (26). In contrast to the expression of proteins in the chloroplast, protein post-translational modifications such as N-glycosylation acquired by the secreted recombinant protein are a major concern for biopharmaceuticals, as more than half of the approved ones are glycosylated (27). Moreover, glycosylation is a critical quality attribute for biopharmaceuticals, because the presence and structures of the N-glycans are required for their biological activity, stability, and half-life (28,29). However, given that unsuitable N-glycan structures can induce immune responses in humans (30-32) and generate adverse reactions, as reported for the α(1,3)-Gal epitope on therapeutic drugs like cetuximab (33), it is essential to take into account the N-glycosylation capacity of a candidate expression system. Therefore, a suitable expression system should allow the production of glycomolecules harboring N-glycans and/or O-glycans compatible with human therapeutic applications and better efficacy of the therapeutic drug (34).
In this study, we used a comprehensive approach including genomic, glycomic, and glycoproteomic analyses to investigate the N-glycosylation pathway occurring in C. reinhardtii. Our results revealed that the biosynthesis and maturation of N-glycans occur in the ER and Golgi apparatus through a GnT I-independent pathway and yield novel complex structures in addition to oligomannoside N-glycans.

EXPERIMENTAL PROCEDURES
Strains and Growth Conditions-CC-503 cw92 and CC-1036 pf18 strains were obtained from the Chlamydomonas Culture Collection at Duke University (Durham, NC) and grown in batch cultures at 26°C, illuminated with a photosynthetic photon flux density of 150 μmol m⁻² s⁻¹ supplied from cool, white fluorescent lamps (TDL 150 W, Philips, Eindhoven, The Netherlands), using minimal medium (35) and aeration with air enriched with 5% CO2. Cells were harvested via centrifugation, frozen in liquid nitrogen, and stored at −80°C until use. Cultivation of C. reinhardtii CC-400-cw15 under iron-sufficient and iron-deficient conditions was carried out in TAP medium as described elsewhere (36).
In Silico Genome Analysis-Annotation of genes involved in the N-glycosylation pathway in the C. reinhardtii genome (Chlamydomonas reinhardtii v4.3, available online at the Phytozome website) was carried out via TBLASTN analysis using protein sequences from Homo sapiens, Mus musculus, Arabidopsis thaliana, Drosophila melanogaster, Saccharomyces cerevisiae, and Physcomitrella patens as queries. Sequence alignments were done with ClustalW 1.8 (37) from the BioEdit 7.0.9.0 package. Signal peptides and cell localization/targeting of mature proteins were predicted using SignalP 4.0 and TargetP 1.0. Searches for the presence of predicted transmembrane domain(s) and for specific Pfam domains were done, respectively, by TMHMM and Pfam (Wellcome Trust, Sanger Institute, Cambridge, UK).
Soluble and Membrane-bound Protein Preparation-Ten liters of cells were harvested via 5 min of centrifugation at 2,500g (SORVALL® RC5C Plus). The cells were washed with 20 mM potassium phosphate buffer (pH 7.4) and underwent 5 min of centrifugation at 2,500g after washing. The cell pellet was packed in 20 ml of 20 mM potassium phosphate buffer (pH 7.4) plus 1 ml PIC 25x (Protease inhibitor mixture, Roche, Meylan, France) and broken with a French press (SLM Amincor, SLM Instruments, Inc., Urbana, IL) at 1,300 Pa. Cell suspensions were centrifuged at 300g for 3 min to remove intact cells and debris. The supernatant was centrifuged again at 20,000g for 30 min and then ultracentrifuged at 100,000g for 1 h (Centrikon T-2070, Kontron Instruments Montigny Le Bretonneux, France) to pellet the microsomal fraction, corresponding to the membrane-bound proteins. Concentration to 1 ml of the supernatant containing the soluble proteins was done using Amicon® Ultra centrifugal filters ULTRACEL® 10K (Millipore, Billerica, MA). All the steps in the protein preparation were carried out at 4°C.
Isolation of N-glycans from C. reinhardtii Proteins-In-gel Trypsin Digestion-Five milligrams of protein extract were loaded on an SDS-PAGE gel (4-12%, Bis-Tris, Invitrogen) run at 150 V for more than 1 h in MES buffer 1X (Invitrogen). After fixation in 20% ethanol/10% acetic acid, the gel was cut into small pieces and washed with a 50 mM ammonium bicarbonate/acetonitrile (1:1, v/v) solution for 15 min. This washing step was repeated once. Reduction and alkylation were then performed by means of incubation with, respectively, 25 mM dithiothreitol (DTT) for 45 min at 56°C and 55 mM iodoacetamide at room temperature in the dark for 20 min; both DTT and iodoacetamide were dissolved in 50 mM ammonium bicarbonate buffer. Then, alkylated proteins were in-gel digested overnight by trypsin treated with L-1-tosylamido-2-phenylethyl chloromethyl ketone (TPCK) (Sigma) at a ratio of 20:1 (protein:trypsin) at 37°C under agitation. The resulting peptides and glycopeptides were extracted from the gel via a succession of washes with 100% acetonitrile, 100 mM ammonium bicarbonate, 100% acetonitrile, and 5% formic acid. The peptide and glycopeptide mixtures were separated using a C18 cartridge (Waters, Milford, MA). The column was conditioned with 5 ml of ethanol followed by 5 ml of water. The sample was then loaded onto the resin prior to a washing step using 5 ml of water. The glycopeptides of interest were eluted with 4 ml of 50% acetonitrile.
Peptide N-glycosidase A Digestion-The eluate containing the glycopeptides was then dried down and reconstituted in 100 mM sodium acetate, pH 5.5, prior to peptide N-glycosidase A digestion (Roche). 1.5 mU of enzyme was added for every 5 mg of protein and incubated at 37°C overnight under agitation. After the digestion, isolation of the released N-glycans was performed using a C18 cartridge (Waters) conditioned as previously described. After the sample had been loaded, the free N-glycans were recovered by 5 ml of water. This fraction was dried down prior to purification by a Hypersep Hypercarb cartridge following the manufacturer's instructions (Thermo Scientific).
Peptide N-glycosidase F Digestion-The peptide N-glycosidase F (Roche) digestion was performed on C. reinhardtii proteins according to the procedure outlined in Ref. 12. The released N-glycans were then cleaned up and labeled with 2-aminobenzamide (2-AB) prior to MALDI-TOF-MS analysis with or without α-mannosidase treatment.
2-AB Labeling of N-glycans-The 2-AB labeling was done according to Ref. 38 in order to increase the sensitivity and the ionization efficiency of the MALDI-TOF analysis, as previously described for plant N-glycan analysis. The excess reagent was then removed using a cartridge S from Prozyme, Hayward, CA, USA following the manufacturer's instructions. The 2-AB-labeled N-glycans were analyzed via MALDI-TOF MS1 and MS2 prior to eventual further permethylation. The 2-AB labeling reaction was carried out to completion, and there were no leftover unlabeled oligosaccharides; this was checked carefully during the experiment through MALDI-TOF-MS analysis prior to and after the labeling procedure.
Permethylation-The 2-AB-labeled N-glycan preparation was permethylated using the sodium hydroxide procedure (39,40). The permethylated N-glycans were cleaned up using Sep-Pak C18 cartridges (Waters) according to the procedure described in Ref. 41.
MALDI-TOF-MS Analysis-The permethylated glycans were dried down and reconstituted in 10 μl of 90% methanol-0.1% trifluoroacetic acid (TFA), and 2-AB-labeled N-glycans were reconstituted in 10 μl of water containing 0.1% TFA, before 0.5 μl of sample were spotted on a MALDI target plate at a ratio of 1:1 with 2,5-dihydroxybenzoic acid matrix at 10 mg ml⁻¹ (Waters, Milford, MA) dissolved in 80% (v/v) methanol in water. The analysis was then performed on a MALDI-TOF-TOF 5800 (AB Sciex, Framingham, MA). The MS acquisition was done in reflector positive mode with the laser intensity fixed at 63% and a pulse rate of 400 Hz. The detector's voltage was about 1.95 kV. The MS2 experiments were performed at a voltage of 2 kV combined with activation of collision-induced dissociation (CID) by argon gas at a pressure of 5 psi. 10,000 laser shots were accumulated for each spectrum of MS1 and MS2. The 4,700 calibration standard kit Cal Mix (AB Sciex) was used for external calibration. Spectra were analyzed using Data Explorer® software (AB Sciex). The parameters used to reject or exclude outliers were a signal-to-noise ratio threshold of 3%, a centroid of 50, a noise window width of 250, and a threshold (m/z) after signal-to-noise ratio recalculation of 10. Relative quantification of the different N-glycan species was based on the MALDI-TOF-MS spectra of permethylated N-glycans as previously described (42). For this purpose, the height of the ions of interest was used to calculate their relative intensity as compared with that of all the glycan structures identified. The values presented in Table I correspond to the mean of data obtained from five independent N-glycan preparations and MALDI-TOF analyses. The standard deviation has been calculated and indicated. The biological reliability of all measurements was validated using at least three independent experiments for each of the three biological replicates.
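The relative quantification described above (peak height of each ion divided by the summed heights of all identified glycan ions, then mean and standard deviation across preparations) can be sketched as follows. This is an illustrative sketch only; the function names and the example peak heights are hypothetical, and it assumes that every preparation reports the same set of glycan species.

```python
from statistics import mean, stdev


def relative_intensities(peak_heights: dict[str, float]) -> dict[str, float]:
    """Express each peak height as a percentage of the summed heights."""
    total = sum(peak_heights.values())
    if total <= 0:
        raise ValueError("summed peak height must be positive")
    return {name: 100.0 * h / total for name, h in peak_heights.items()}


def summarize(preparations: list[dict[str, float]]) -> dict[str, tuple[float, float]]:
    """Return (mean, sample standard deviation) of relative intensity per glycan.

    Assumes at least two preparations sharing the same glycan names.
    """
    per_prep = [relative_intensities(p) for p in preparations]
    return {
        name: (
            mean([p[name] for p in per_prep]),
            stdev([p[name] for p in per_prep]),
        )
        for name in per_prep[0]
    }


if __name__ == "__main__":
    # Hypothetical peak heights, for illustration only.
    preps = [{"Man-5": 50.0, "Man-4": 50.0}, {"Man-5": 70.0, "Man-4": 30.0}]
    print(summarize(preps))
```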
α-Mannosidase Treatment-1.5 μl of 2-AB-labeled N-glycans were submitted to 215 mU of proteomic-grade α-mannosidase from Canavalia ensiformis (Sigma-Aldrich, St Louis, MO) in commercial buffer diluted five times (Sigma-Aldrich) for 24 h at 37°C under agitation. Then, the digest was directly analyzed via MALDI-TOF-MS.
HPAEC-PAD Analysis of N-glycans-Oligosaccharides, especially Man-5 (MC0731) from Dextra Laboratories, Reading, UK, were used as standards. These were prepared by dissolving 20 μg of each standard in 1 ml of water. The Dionex ICS-5000 system with integrated amperometry was used for HPAEC analysis. A CarboPac PA200 analytical column (3 × 250 mm) with a PA200 guard column (3 × 50 mm) from Dionex (Sunnyvale, CA) and three eluents were used for the separation. The eluents were 500 mM sodium acetate, 500 mM NaOH, and deionized water. The gradient program for the elution of both neutral and charged oligosaccharides began with isocratic mode with 20% NaOH and 80% water for 10 min, followed by a ramp gradient for sodium acetate to 34% while maintaining NaOH at 20% until 78 min had passed. After that, a re-equilibration period of 25 min with 20% NaOH and 80% water was allowed for the next run. The waveform used was E1 = +0.05 V, t1 = 400 ms; E2 = +0.75 V, t2 = 200 ms; E3 = −0.15 V, t3 = 400 ms. The flow rate was kept constant at 0.3 ml min⁻¹, and the volume of sample/standard injected was 30 μl.
Monosaccharide Composition of 2-AB-labeled N-glycans by GC-EIMS-The monosaccharide composition of 2-AB-labeled N-glycans was determined via GC-EIMS. Samples were hydrolyzed with 2 M TFA, reduced with NaBD4, and peracetylated. The resulting alditol acetates were separated via GC (Hewlett-Packard 6890 series gas chromatographic system) on an HP-5MS capillary column (0.25 mm inner diameter × 30 m, 0.25-μm film thickness; Hewlett-Packard, Palo Alto, CA) and analyzed via electron ionization using an Autospec mass spectrometer of EBE geometry (Micromass, Manchester, UK) equipped with an Opus 3.1 data system. Helium was the carrier gas, and the flow rate was 0.8 ml min⁻¹. The oven temperature was as follows: 100°C for 1 min, 100°C to 160°C at 10°C/min, 160°C to 200°C at 2°C/min, 200°C to 300°C at 15°C/min, and 300°C for 1 min. The temperature of the injector, the interface, and the transfer lines was 250°C. Injections of 0.5 or 1 μl were performed with a split ratio of 10 or in splitless mode. The mass spectra were recorded using an ionizing electron energy of 70 eV and a trap current of 200 μA, and the pressure and temperature of the ion source were 2 × 10⁻⁶ mbar and 250°C, respectively. The acceleration voltage was 8 kV, the resolution was 1,000 (10% valley definition), and the magnet scan rate was 1 s/decade over the m/z range 600-38. The assignment to monomers was carried out using standards of monosaccharides, as well as on the basis of their electron ionization fragmentation patterns.
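As a quick arithmetic check, the duration of the oven temperature program given above can be computed from its holds and ramps. The sketch below is illustrative only; the function and constant names are hypothetical, and the program values are transcribed from the text.

```python
# Oven program as stated in the text: two 1-min holds and three linear ramps.
HOLDS_MIN = [1.0, 1.0]  # at 100 °C and at 300 °C
RAMPS = [  # (start °C, end °C, rate in °C per min)
    (100.0, 160.0, 10.0),
    (160.0, 200.0, 2.0),
    (200.0, 300.0, 15.0),
]


def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time needed to go from start_c to end_c at a constant rate."""
    return (end_c - start_c) / rate_c_per_min


def total_run_minutes(holds=HOLDS_MIN, ramps=RAMPS) -> float:
    """Sum of all hold times and ramp times, in minutes."""
    return sum(holds) + sum(ramp_minutes(*r) for r in ramps)


if __name__ == "__main__":
    print(f"{total_run_minutes():.1f} min")
```

Under these values the program lasts a little under 35 min, most of it spent in the slow 2°C/min ramp between 160°C and 200°C.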
Sialic Acid Release and HPAEC-PAD Analysis-Sialic acids bound to soluble and membrane proteins were released from 6 to 15 mg of CC503 and CC1036 protein extracts by means of acetic acid hydrolysis and recovered through a 5-kDa Vivaspin filter (Sartorius Stedim Biotech, Aubagne, France) following the procedure described in Ref. 43. Then the released sialic acids were analyzed via HPAEC-PAD. The experiment was run on a Dionex ICS-5000 system (Dionex, Sunnyvale, CA) equipped with an electrochemical detector. A CarboPac PA20 column (3 × 150 mm, Dionex) with a guard (3 × 30 mm, Dionex) was used for the analysis using a flow rate of 0.5 ml min⁻¹ and the gradient conditions described in Ref. 44.
Affino- and Immunoblotting Analyses-C. reinhardtii total cell protein extracts were separated via SDS-PAGE and electrotransferred onto nitrocellulose membrane (membrane blotting, Pall Corporation, Port Washington, NY) for immunoblot or affinoblot analysis. Affinodetections with concanavalin A (Con A) (Sigma-Aldrich, St. Louis, MO) and biotinylated Sambucus nigra lectin (Vectorlabs, Burlingame, CA), and immunodetection with β(1,2)-xylose-specific antibodies (Agrisera, Vännäs, Sweden) were performed according to the procedure described in Ref. 45. Total protein extracts from different organisms were used as controls: Drosophila melanogaster, which does not contain any core β(1,2)-xylose; Arabidopsis thaliana wild type; A. thaliana double mutant fut11/fut12, which does not contain any core α(1,3)-fucose in complex-type N-glycans; and the GnT I mutant of A. thaliana (cgl1), which is unable to synthesize complex-type N-glycans because of the lack of GnT I activity in this organism and produces only oligomannoside-type N-glycans.
Identification of Glycopeptides Using a Proteomic Approach Combined with Liquid Chromatography-Electrospray Ionization-Fourier Transform Mass Spectrometry-Culture Conditions and Protein Isolation-A schematic overview of the various protein isolation and cell fractionation procedures is given in supplemental File S1. Cells required for the preparation of total cell extracts (TCEs) were harvested via centrifugation (3 min at 2,000g), washed in a small volume of fresh culture medium, and centrifuged again (10 min at 10,000g). The supernatants (SNs) containing secreted glycoproteins were combined and concentrated using 15-ml centrifugal filter devices (Amicon Ultra, 30-kDa molecular-weight cutoff). Both cell pellets and SNs were stored at −80°C until further use. Isolation of chloroplasts and plasma membranes was performed according to established protocols (46,47). Protein concentrations of all samples were determined using the Pierce BCA assay kit (Thermo Fisher Scientific) according to the manufacturer's instructions.
For the solubilization of plasma membranes, TCEs, chloroplasts, and proteins and the reduction of cysteine residues, lysis buffer (2% SDS/0.1 M DTT in 0.1 M Tris-HCl, pH 7.6) was added to the samples, and the samples were then incubated at 95°C for 3 min. Samples were centrifuged at 16,000g for 10 min, and SNs containing solubilized proteins underwent either glycoprotein enrichment or N-glycoproteomic analysis via the filter-assisted sample-preparation method (N-glyco-FASP) (see below). The lysis step was omitted for secreted proteins; instead, a volume of SN concentrate corresponding to 300 μg of protein was transferred to centrifugal filters and concentrated further to a volume of 40 μl. Then 350 μl of 8 M urea/0.1 M DTT in 0.1 M Tris-HCl, pH 8.5, was added, and denaturation/reduction was carried out at room temperature for 45 min. After that, SN samples were immediately used for glycoprotein enrichment or N-glyco-FASP.
Glycopeptide Enrichment and PNGase F Treatment-Carbamidomethylation of cysteines, tryptic digestion, glycopeptide enrichment, and PNGase F-mediated glycan hydrolysis in 18O-labeled water were performed in centrifugal filter devices (Amicon Ultra, 0.5-ml capacity, 30-kDa molecular-weight cutoff) according to the N-glyco-FASP protocol (16) with the following modifications: 300 μg of protein were used per sample, and glycopeptide enrichment was carried out using 150 μl of agarose-bound Con A (50% slurry, Vector Laboratories Inc., Burlingame, CA). PNGase F (catalog no. P0704S) was obtained from New England Biolabs, Ipswich, MA. After the elution of 18O-labeled peptides, samples were dried in a vacuum centrifuge and stored at −20°C.
Glycoprotein Enrichment for In-source Collision-induced Dissociation Analyses-300 μg of protein were transferred to centrifugal filters (Amicon Ultra, 0.5-ml capacity, 30-kDa molecular-weight cutoff) and concentrated via centrifugation at 14,000g for 15 min at room temperature to a volume of 40 μl. 100 μl of UA buffer (8 M urea in 10 mM HEPES, pH 6.5) was added, and samples were centrifuged as described above (this step was repeated twice). Subsequently, samples were incubated with 100 μl of 50 mM iodoacetamide in UA buffer for 20 min in the dark and then centrifuged as before. Filters were washed twice with 100 μl of UA and twice with 200 μl of lectin binding buffer (500 mM NaCl, 1 mM CaCl2, 1 mM MnCl2 in 20 mM Tris-HCl, pH 7.6). Afterward, protein samples were transferred to new collection tubes via the addition of 100 μl of binding buffer and centrifugation of the inverted filter unit at 14,000g for 5 min. The transfer step was repeated once. Then, 200 μl of agarose-bound Con A were washed and equilibrated three times with 300 μl of binding buffer in a 1.5-ml reaction tube by means of mixing, centrifugation at 10,000g for 5 min, and finally removal of the SN. Protein samples were added to the preconditioned Con A and incubated at room temperature overnight in a thermomixer while being shaken (1,000 rpm). Unbound proteins were removed by three washes with 300 μl of NaCl-free binding buffer, with centrifugation at 10,000g for 5 min between washes. Glycoproteins were eluted via the addition of 150 μl of 0.5 M α-methyl D-mannopyranoside in NaCl-free binding buffer and incubation for 20 min at room temperature with shaking (1,000 rpm). After centrifugation (10,000g for 5 min), the elution was repeated once, and the pooled eluates were transferred to a centrifugal filter unit (30-kDa molecular-weight cutoff). The samples were centrifuged at 14,000g for 15 min, and α-methyl D-mannopyranoside was removed by three successive washes with 200 μl of 50 mM ammonium bicarbonate. Afterward, glycoproteins were digested by the addition of 2 μg of trypsin (sequencing-grade modified, Promega, Madison, WI) in 40 μl of ammonium bicarbonate and overnight incubation at 37°C. Peptides were eluted via centrifugation (14,000g for 10 min). Elution was repeated twice with 50 μl of ammonium bicarbonate. Finally, peptides were dried down in a vacuum centrifuge and stored at −20°C.
The mass spectrometer was operated in positive ion mode. MS full scans (m/z 400-1800) were acquired via Fourier transform MS in the Orbitrap at a resolution of 60,000 (full width at half-maximum). The five most abundant ions of each full scan were fragmented in the linear ion trap via CID (35% normalized collision energy) using three microscans per spectrum. Dynamic exclusion was enabled with a repeat duration of 30 s, exclusion duration of 90 s, repeat count of 1, list size of 500, and exclusion mass width of ±25 ppm. Unassigned charge states and charge states of 1 were rejected.
LC-MS Analysis of Intact Glycopeptides-Instrumentation, the composition of mobile phases, and the sample loading procedure were the same as described for the analysis of 18O-labeled peptides. HPLC gradient conditions were as follows: 0% to 30% B (70 min), 30% to 100% B (5 min), and 100% B (10 min).
The mass spectrometer was operated in positive ion mode with in-source (IS)-CID enabled during MS full scans and MS2 for the partial removal of glycans. The default IS voltage was 90 V. In addition, some samples were analyzed multiple times using 60 V and 80 V. MS full scans (m/z 400-2000) were acquired via Fourier transform MS in the Orbitrap at a resolution of 60,000 (full width at half-maximum). The "mass tags" option of XCalibur was enabled for the online identification of ion pairs differing by 203.0794 and 406.1587 Da, corresponding to the neutral loss of one and two HexNAc residues, respectively (48). These ion pairs potentially represented peptides whose glycans had been trimmed down to the chitobiose core by IS-CID, with the high-mass partner bearing one HexNAc residue more than the low-mass partner. The three most intense ion pairs of each full scan were fragmented in the linear ion trap via CID (35% normalized collision energy) with multistage activation (MSA) of the neutral loss of HexNAc (−203.1 Da, −101.5 Da, and −67.7 Da for z = 1, 2, and 3). Each ion pair partner was isolated and fragmented individually. Consequently, a maximum of six fragmentation events were triggered per MS1 scan. Dynamic exclusion was enabled with a repeat and exclusion duration of 10 s, a repeat count of 1, a list size of 500, and an exclusion mass width of ±10 ppm. Unassigned charge states and charge states of 1 were rejected.
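The mass-tag logic above amounts to two small calculations: the neutral loss of HexNAc expressed in m/z units for each charge state, and a search for pairs of species whose masses differ by one or two HexNAc residues. The sketch below is illustrative and is not the XCalibur implementation; the function names are hypothetical, the pairing is done on neutral masses rather than on raw m/z values, and the 10 ppm tolerance is an assumption chosen for the example.

```python
HEXNAC_DA = 203.0794      # mass difference for one HexNAc (from the text)
CHITOBIOSE_DA = 406.1587  # mass difference for two HexNAc (from the text)


def neutral_loss_mz(charge: int) -> float:
    """m/z shift caused by the loss of one HexNAc from an ion of given charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return HEXNAC_DA / charge


def find_mass_tag_pairs(neutral_masses, tol_ppm: float = 10.0):
    """Return (low, high, delta) tuples for masses differing by a mass tag.

    tol_ppm is an assumed tolerance, applied relative to the heavier partner.
    """
    ordered = sorted(neutral_masses)
    pairs = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            for delta in (HEXNAC_DA, CHITOBIOSE_DA):
                if abs((high - low) - delta) <= high * tol_ppm * 1e-6:
                    pairs.append((low, high, delta))
    return pairs


if __name__ == "__main__":
    for z in (1, 2, 3):
        print(z, round(neutral_loss_mz(z), 1))
    print(find_mass_tag_pairs([1000.0, 1203.0794, 1406.1587, 1500.0]))
```

Dividing 203.0794 by the charge state reproduces the three neutral-loss values used for multistage activation.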
Glycopeptide Identification-For peptide identification, X!Tandem CYCLONE (49) incorporated into the proteomics data-processing pipeline Proteomatic (50) was used. MSA-CID spectra were matched against a target-decoy database composed of JGI4.3 Augustus 10.2 gene models merged with mitochondrial and chloroplast protein sequences from NCBI databases BK000554.2 and NC_001638.1, respectively. This database was supplemented with the protein sequences of jack bean Con A (UniProtKB: CVJB), Flavobacterium meningosepticum PNGase F (UniProtKB: P21163), and sequences of contaminant proteins from the Common Repository of Adventitious Proteins (version 1.0, released January 1, 2012). The total number of protein entries in the composite database was 16,940. Decoy protein sequences were generated by randomly shuffling tryptic peptides while retaining the redundancy of non-proteotypic peptides. The maximum number of missed cleavages allowed was two. The mass accuracy was set at 5 ppm for MS1 precursor ions and 0.5 Da for product ions. X!Tandem analyses were performed several times on spectra files, each time with a slightly modified set of glycosylation-related variable modifications (see below). The following modifications were used for all X!Tandem analyses: carbamidomethylation of cysteine (static), oxidation of methionine (variable), and deamidation of asparagine (variable). Peptide identifications were statistically validated using Qvality (version 2.02 (51)), with a q-value threshold of 0.01.
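One way to generate shuffled decoys while retaining the redundancy of shared peptides is to shuffle each distinct tryptic peptide once and reuse the result wherever that peptide recurs. The sketch below illustrates this idea only; it is not the Proteomatic implementation. The function names are hypothetical, and two details are assumptions: cleavage after Lys/Arg is suppressed before Pro, and the C-terminal residue of each peptide is kept in place so that decoy peptides remain tryptic.

```python
import random
import re


def tryptic_peptides(protein: str) -> list[str]:
    """In-silico digest: cleave after K or R, except before P (assumed rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def make_decoy(protein: str, cache: dict[str, str], rng: random.Random) -> str:
    """Build a decoy protein by shuffling each tryptic peptide.

    `cache` maps a target peptide to its shuffled form, so a peptide shared by
    several proteins always yields the same decoy peptide (redundancy is kept).
    """
    decoy_parts = []
    for pep in tryptic_peptides(protein):
        if pep not in cache:
            body, last = list(pep[:-1]), pep[-1]
            rng.shuffle(body)  # C-terminal residue stays fixed (assumption)
            cache[pep] = "".join(body) + last
        decoy_parts.append(cache[pep])
    return "".join(decoy_parts)


if __name__ == "__main__":
    shared: dict[str, str] = {}
    rng = random.Random(0)  # fixed seed for reproducibility
    print(make_decoy("MKAARPLLK", shared, rng))
    print(make_decoy("GGKAARPLLK", shared, rng))
```

Sharing one cache across the whole database is what keeps the target and decoy halves comparable in size at the peptide level.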
Additional Variable Modifications Used for the Identification of 18O-labeled Peptides-Preliminary analyses led to the identification of numerous peptides derived from Con A and PNGase F, indicating high residual tryptic activity during the glycopeptide enrichment step of N-glyco-FASP. In order to take account of trypsin-mediated incorporation of 18O into the C termini of peptides potentially resulting in false glycopeptide identifications, spectra files were analyzed four times by X!Tandem, each time using a different set of variable modifications (52) 18O at the peptide C terminus (+4.0085 Da). All results were combined, and conflicting peptide-spectrum matches were filtered on the basis of e-values. If e-values differed by 2 orders of magnitude or more, the peptide-spectrum match with the lower score was retained. Otherwise, peptide-spectrum matches were regarded as ambiguous and all corresponding identifications were discarded.
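The rule for resolving conflicting peptide-spectrum matches can be written as a short decision function. This is a sketch under one stated assumption: "the lower score" is read here as the lower (better) e-value. The function name and the tuple layout are hypothetical.

```python
import math


def resolve_conflict(psm_a, psm_b, min_log_gap: float = 2.0):
    """Choose between two conflicting matches for the same spectrum.

    Each PSM is a (peptide, e_value) tuple. If the e-values differ by at least
    `min_log_gap` orders of magnitude, the PSM with the lower e-value is
    returned (assumed reading of "lower score"); otherwise the pair is
    ambiguous and None is returned, meaning both identifications are dropped.
    """
    e_a, e_b = psm_a[1], psm_b[1]
    if e_a <= 0 or e_b <= 0:
        raise ValueError("e-values must be positive")
    gap = abs(math.log10(e_a) - math.log10(e_b))
    if gap >= min_log_gap:
        return psm_a if e_a < e_b else psm_b
    return None


if __name__ == "__main__":
    # Hypothetical peptides and e-values, for illustration only.
    print(resolve_conflict(("PEPTIDEA", 1e-6), ("PEPTIDEB", 1e-3)))
    print(resolve_conflict(("PEPTIDEA", 1e-4), ("PEPTIDEB", 1e-3)))
```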
Additional Variable Modifications Used for the Identification of Intact Glycopeptides via IS-CID-X!Tandem searches were performed twice, each time with different sets of variable modifications: (1) modification of asparagine, serine, and threonine by HexNAc (+203.0794 Da); and (2) modification of asparagine by chitobiose (+406.1587 Da). Conflicting peptide-spectrum matches were filtered on the basis of e-values as described for the identification of 18O-labeled peptides. However, peptide glycosylations considered as ambiguous were not discarded automatically and were validated through manual inspection of the fragmentation spectra.
All the raw MS data have been placed in a public database repository at PeptideAtlas.

RESULTS
To comprehensively identify the N-glycosylation pathway of C. reinhardtii, different approaches including genomic, glycomic, and glycoproteomic techniques were employed.
Soluble and Membrane N-glycoproteins from C. reinhardtii Bear Mainly Oligomannoside N-glycans-The characterization of N-glycans from C. reinhardtii was carried out on protein extracts from three different strains: CC-503 cw92, CC-1036 pf18, and CC-400 cw15 (hereafter referred to as CC-503, CC-1036, and CC-400, respectively) (1). CC-400 and CC-503 are cell-wall-deficient strains used as references (20), whereas CC-1036, the motility of which is completely impaired, possesses a cell wall (53). Both soluble and membrane-bound proteins were isolated and either separated on SDS-PAGE and trypsinized prior to N-glycan release using PNGase A, or directly deglycosylated using PNGase F. The resulting N-glycans were then labeled with a fluorescent tag (2-AB) before their analysis using MALDI-TOF-MS (Fig. 1). In both CC-503 and CC-1036 strains, N-glycans released from soluble and membrane-bound proteins showed identical profiles regardless of whether PNGase A or PNGase F was used. As presented in Fig. 1, the N-glycan processing does not depend on the strain or on the final destination of the secreted proteins. Moreover, the absence of a cell wall does not influence the N-glycan processing.
Based on the m/z values of [M+Na]+ ions, major ions were assigned to 2-AB derivatives of hexose 2-5 N-acetylglucosamine 2 (Hex 2-5 GlcNAc 2 ) (Fig. 1; supplemental Table S1). Traces of larger oligomers up to Hex 9 GlcNAc 2 were also detected (Table I, supplemental Table S1). The pool of 2-AB-labeled N-glycans was subjected to exo-glycosidase digestion with jack bean α-mannosidase. Consistent with the presence of α-linked mannose residues, ions corresponding to Hex 2-5 GlcNAc 2 were converted into a single species corresponding to a HexGlcNAc 2 composed of a β-mannose linked to the chitobiose unit (supplemental File S2). These results are in agreement with the affinodetection with Con A, a lectin specific for oligomannoside N-glycans (supplemental File S3A). All these data allow us to assign these ions to oligomannoside N-glycans ranging from Man 2 GlcNAc 2 to Man 5 GlcNAc 2 , as previously reported for other eukaryotes (54). For confirmation, HPAEC-PAD of the N-glycan pool showed that Man-5 from C. reinhardtii had the same elution time as a commercially available standard of Man-5 (not shown).
Complex N-glycans in C. reinhardtii Carry Xylose Residues and Are Partially Methylated-Remaining minor ions (indicated by lowercase letters in Fig. 1 and Table I) were assigned to 2-AB-labeled complex N-glycans. Most of these ions were resistant to jack bean α-mannosidase digestion (supplemental File S2), suggesting the absence of free terminal α-mannose residues in these oligosaccharides. In order to determine the monosaccharide composition of these complex-type N-linked glycans, 2-AB-labeled N-glycans were hydrolyzed and monomers were converted into alditol acetates prior to their analysis via GC-EIMS. Mannose was identified as the main monosaccharide, along with low amounts of xylose. A 6-O-methyl hexose was also detected on the basis of its fragmentation pattern in EIMS. A search for specific monosaccharides such as sialic acid was also carried out. As illustrated in supplemental File S4, neither Neu5Ac nor Neu5Gc was detected with HPAEC-PAD. Complementary Western blotting using a sialic acid-specific lectin such as Sambucus nigra lectin did not reveal any sialic acid-modified glycoproteins (not shown). Altogether, these results indicate that C. reinhardtii proteins do not carry any detectable sialic acid.

(Figure legend, continued.) Symbols are those adopted by the Consortium for Functional Glycomics (101). ■, N-acetylglucosamine; ●, mannose. The alphanumeric code indicates complex-type N-glycan structures (letters) and the number of methyl groups present on the structure (digits). The asterisks indicate ions that have been identified but not annotated in the spectra.

TABLE I Relative quantification of the N-glycans found on CC-503 soluble proteins
Relative percentages are the mean of five independent N-glycan preparations and analyses. The quantification was run on permethylated 2-AB-labeled N-glycans. Oligomannoside N-glycans accounted for almost 70% of the N-glycan population, whereas the complex-type N-glycans substituted by one or two pentose residues represented 14.1% and 16.6%, respectively. The symbols used are the ones adopted by the Consortium for Functional Glycomics (101). ■, N-acetylglucosamine; ●, mannose; ★, xylose; ▲, fucose.
Among the 2-AB-labeled complex N-glycans (Figs. 1 and 2), most of the species exhibited differences of 132 or 264 Da, which could correspond to the presence of one or two pentose residues on the oligosaccharides (Fig. 2). Traces of a fucosylated N-glycan (Table I) were also detected, but due to its very low amount, this glycan could not be investigated further. The pentose residue is likely to be xylose, based on the monosaccharide composition. Moreover, some complex N-glycans exhibited mass differences of 14 Da (Figs. 2A and 2B). This shift may have resulted either from the substitution of xylose by a deoxyhexose residue or from the methylation of the N-glycan. To discriminate between these two possibilities, the pool of 2-AB-labeled N-glycans isolated from CC-503 soluble proteins was permethylated and analyzed via MALDI-TOF-MS. The MS1 profile comparison of native and permethylated 2-AB-labeled N-glycans is shown in Fig. 2. N-glycans that had previously displayed a mass difference of 14 Da were converted into unique derivatives corresponding to permethylated 2-AB-labeled Man 4 Xyl 2 GlcNAc 2 and Man 5 Xyl 2 GlcNAc 2 , which indicated that the 14-Da mass shifts in the native oligosaccharides resulted from the partial methylation of the N-glycans (Figs. 2C and 2D). The single exception was the ion at m/z 1331, which was converted into derivatives at m/z 1653 and 1667, corresponding to the dixylosylated glycan E and to Man 3 GlcNAc 2 bearing both fucose and xylose residues, respectively.
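The ambiguity of the 14-Da shift can be illustrated with monoisotopic residue masses. The following is a minimal sketch, not part of the original workflow: the element masses are standard values, and the dictionary and function names are hypothetical. It shows that exchanging a pentose for a deoxyhexose and adding one O-methyl group both amount to a net gain of CH2, so the two cases are isobaric and mass alone cannot separate them, which is why permethylation was used.

```python
# Monoisotopic element masses in Da (standard values).
ELEMENT_MASS = {"C": 12.0, "H": 1.0078250, "N": 14.0030740, "O": 15.9949146}

# Elemental compositions of glycosidic residues (monosaccharide minus H2O).
# These names are illustrative labels, not identifiers from the paper.
RESIDUES = {
    "Pent": {"C": 5, "H": 8, "O": 4},    # pentose, e.g. xylose
    "DeHex": {"C": 6, "H": 10, "O": 4},  # deoxyhexose, e.g. fucose
    "Me": {"C": 1, "H": 2},              # net gain of one O-methylation (CH2)
}


def mass(composition):
    """Sum the monoisotopic masses of all atoms in a composition."""
    return sum(ELEMENT_MASS[element] * n for element, n in composition.items())


pent = mass(RESIDUES["Pent"])
swap_shift = mass(RESIDUES["DeHex"]) - pent   # pentose replaced by deoxyhexose
methyl_shift = mass(RESIDUES["Me"])           # one additional methyl group

print(f"one pentose:          {pent:.4f} Da")
print(f"two pentoses:         {2 * pent:.4f} Da")
print(f"Pent -> DeHex shift:  {swap_shift:.4f} Da")
print(f"methylation shift:    {methyl_shift:.4f} Da")
```

Because both shifts correspond to the same elemental difference, even high mass accuracy would not resolve them on native glycans; after permethylation, partially methylated species collapse onto a single derivative whereas a deoxyhexose-containing glycan does not.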
Xylose Residues Are Located on a Terminal Mannose Residue and on the Core β-mannose of the Complex N-glycans-In order to precisely determine the position of the xylose residues on C. reinhardtii complex N-glycans, tandem mass spectrometry was carried out on permethylated 2-AB-labeled N-glycans from CC-503 soluble proteins (Fig. 3). For example, MS2 fragmentation of the precursor ion at m/z 2062.0, which corresponded to the sodiated adduct of the permethylated derivative of Man 5 Xyl 2 GlcNAc 2 , yielded product ions at m/z 462.4 and 707.5, which resulted from the fragmentation of the 2-AB-labeled chitobiose unit (Fig. 3A). No product ion was found that could support the notion that the xylose residue was linked to the proximal GlcNAc of the chitobiose unit. In contrast, the presence of characteristic product ions at m/z 735.5 and 939.6 indicated the location of one xylose on the core β-mannose (Fig. 3A). Indeed, these two ions were shown to result from the core β-mannose's cross-ring fragmentations 1,5 X 2 (m/z 735.5) and 2,5 X 2 (m/z 939.6) by Domon and Costello (55). The presence of these two ions can be explained by the substitution of the C2 of the core β-Man with a xylose residue. Because C. reinhardtii proteins were also immunodetected with specific core β(1,2)-xylose antibodies (supplemental File S3B), we concluded that this xylose is β(1,2)-linked to the core β-Man, as has been demonstrated in land plants (56).
Moreover, the presence of the ion at m/z 401.3 (B 2 ) indicated the location of the other xylose on a terminal mannose residue (Fig. 3A). Additionally, the presence of the ion at m/z 285.3 showed that the C4 of a terminal mannose was substituted. Based on this MS2 information and the resistance to the α-mannosidase treatment, the second xylose is proposed to be located on a terminal mannose residue. The same conclusions could be drawn from the MS2 analysis of the permethylated Man 4 Xyl 2 GlcNAc 2 and Man 3 Xyl 2 GlcNAc 2 .
Terminal Mannose Residues Are Methylated in Complex N-glycans-The position of the methyl groups on C. reinhardtii complex N-glycans was further investigated by MS2 fragmentation on native 2-AB-labeled N-glycans. MS2 analysis of precursor ions at m/z 1683.6, corresponding to the Man 5 Xyl 2 GlcNAc 2 modified by three methyl groups (Figs. 3B and 3C), revealed product ions at m/z 364.2 and 567.2, which corresponded, respectively, to one and two N-acetylglucosamine residues linked to the 2-AB. The presence of these product ions suggested that the methylation did not occur on the chitobiose unit. Furthermore, the presence of the specific product ions at m/z 1199.5 and 1507.6 indicated the presence of two methyl groups on two different outer mannose residues. The product ion at m/z 1023.4 was attributed to the triple fragmentation Y 3β /Y 4α /Y 4β (55) of the N-glycan according to its low intensity, signifying that the two inner mannose residues were not methylated. Consequently, the three methyl groups were assigned to the three outer mannose residues and, based on the GC-EIMS data, are presumed to be linked to the C6 of each residue. These results, together with the presence of the ion at m/z 1551.5, confirmed that the methylation occurred on the mannose residues rather than on the xylose (Figs. 3B and 3C).
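The m/z values quoted above can be checked against theoretical masses. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it assumes singly sodiated ions, treats the 2-AB label as a net gain of C7H8N2 (reductive amination), and uses hypothetical helper names. It reproduces the precursor at m/z 1683.6 and the two reducing-end fragments at m/z 364.2 and 567.2 to within 0.1.

```python
# Monoisotopic element masses in Da (standard values).
ELEMENT_MASS = {"C": 12.0, "H": 1.0078250, "N": 14.0030740,
                "O": 15.9949146, "Na": 22.9897693}
ELECTRON_MASS = 0.0005486


def formula_mass(**atoms):
    """Monoisotopic mass of an elemental composition, e.g. formula_mass(H=2, O=1)."""
    return sum(ELEMENT_MASS[element] * n for element, n in atoms.items())


# Residue masses (monosaccharide minus H2O); "Me" is the CH2 gained per methyl group.
RESIDUE = {
    "Hex": formula_mass(C=6, H=10, O=5),
    "HexNAc": formula_mass(C=8, H=13, N=1, O=5),
    "Pent": formula_mass(C=5, H=8, O=4),
    "Me": formula_mass(C=1, H=2),
}
WATER = formula_mass(H=2, O=1)
TWO_AB_GAIN = formula_mass(C=7, H=8, N=2)   # assumption: net gain of 2-AB labeling
SODIUM_ION = ELEMENT_MASS["Na"] - ELECTRON_MASS


def sodiated_2ab_mz(**counts):
    """m/z of a singly sodiated, 2-AB-labeled glycan or reducing-end fragment."""
    glycan = sum(RESIDUE[name] * n for name, n in counts.items())
    return glycan + WATER + TWO_AB_GAIN + SODIUM_ION


precursor = sodiated_2ab_mz(Hex=5, Pent=2, HexNAc=2, Me=3)  # Man5Xyl2GlcNAc2 + 3 Me
one_glcnac = sodiated_2ab_mz(HexNAc=1)                      # GlcNAc-2-AB
chitobiose = sodiated_2ab_mz(HexNAc=2)                      # GlcNAc2-2-AB

for label, mz in [("precursor", precursor),
                  ("GlcNAc-2-AB", one_glcnac),
                  ("GlcNAc2-2-AB", chitobiose)]:
    print(f"{label:>13}: m/z {mz:.2f}")
```

That the two chitobiose-derived fragments match the unmethylated theoretical values is the basis for excluding methylation of the GlcNAc residues.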
Relative quantitation based on the intensity of the ions corresponding to the 2-AB-labeled permethylated N-glycans revealed that oligomannosidic N-glycans represented almost 70% of the total N-glycan population (Table I). Although these analyses gave rise to thorough insights into N-glycan structures, no information about the proteins that carry these posttranslational modifications can be inferred. In order to shed light on N-glycoprotein identities, glycoproteomic studies were conducted.
Glycoproteomic Analyses Led to the Identification of 134 Glycopeptides and Confirmed Hexose Methylation-In order to obtain information regarding the identity and cellular distribution of glycoproteins in C. reinhardtii, a glycoproteomic approach was performed using proteins from TCEs, plasma membranes, chloroplasts, and the culture SN expressed under iron-sufficient and -deficient conditions. A detailed workflow scheme depicting protein isolation and glycoprotein/glycopeptide enrichment strategies is presented in supplemental File S1.
Two complementary methods were used to identify glycoproteins and glycosylation sites. For the first approach, mannose-rich glycopeptides were enriched using Con A and then samples underwent PNGase F-mediated deglycosylation in the presence of H 2 18 O. N-glycosylation site occupancy was then detected by a mass increase of +2.9883 Da of precursor and fragment ions induced by the deamidation of modified asparagine residues upon glycan hydrolysis (N-glyco-FASP) (16,57). The second approach, unbiased by the mode of core fucosylation, focused on the analysis of intact glycopeptides. Here, IS-CID was applied for the partial removal of glycan structures. IS-CID leads to the fragmentation of glycosidic bonds while preserving the integrity of the peptide backbone. Therefore, IS-CID in combination with high-resolution spectrum acquisition allows for the detection of glycopeptides on the MS1 level through the detection of ion pairs that differ in mass by a single carbohydrate residue. These ions were determined "on the fly" through activation of the mass tag option of XCalibur with a mass setting of one N-acetylglucosamine (HexNAc) unit (203.0794 Da). Subsequently, all ions potentially differing by one HexNAc residue were sequentially isolated and fragmented via MSA-CID in order to obtain peptide sequence information. Because IS-CID affects all glycans irrespective of their linkage types, measures were taken to rule out that peptides modified by O-glycans were falsely identified as N-glycosylated. Firstly, peptide identification by X!Tandem was performed using the modification of serine and threonine by HexNAc as an additional variable parameter, thereby creating a competitive environment in terms of peptide-spectrum matching and scoring. Secondly, all glycopeptide identifications and glycosylation sites were validated through manual inspection of MS2 fragmentation spectra.
However, IS-CID-MS1 spectra do not provide information regarding the isomeric nature of carbohydrates, nor do they allow for the determination of linkage types. Although the identification of oligomannoside N-glycans via IS-CID is facile, ion signals of branched and multiply N-glycosylated peptides often are highly ambiguous, and caution must be exercised with respect to spectrum interpretation.
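The two mass signatures exploited by these approaches can be sketched in a few lines. This is a hypothetical illustration, not the XCalibur mass tag implementation: the function names, the peak-list format, and the 0.005 m/z tolerance are assumptions. The first function derives the 18 O deamidation shift from atomic masses; the second searches a peak list for ions of equal charge spaced by one HexNAc.

```python
HEXNAC = 203.0794  # Da, the mass tag setting quoted in the text


def o18_deamidation_shift():
    """Mass gain when Asn becomes Asp with 18O incorporation (NH2 replaced by 18OH)."""
    nitrogen, hydrogen, oxygen_18 = 14.0030740, 1.0078250, 17.9991610
    return oxygen_18 - nitrogen - hydrogen


def find_hexnac_pairs(peaks, tol=0.005):
    """Return (lower m/z, higher m/z, charge) for ion pairs one HexNAc apart.

    peaks: iterable of (mz, charge) tuples; tol is an assumed m/z tolerance.
    """
    ordered = sorted(peaks)
    pairs = []
    for i, (mz_low, z_low) in enumerate(ordered):
        for mz_high, z_high in ordered[i + 1:]:
            same_charge = z_low == z_high
            spacing_ok = abs((mz_high - mz_low) - HEXNAC / z_low) <= tol
            if same_charge and spacing_ok:
                pairs.append((mz_low, mz_high, z_low))
    return pairs


# The first two peaks are the doubly charged precursors reported for
# TITVANN*GTHSTATILK carrying one and two HexNAc residues; the third is a
# made-up decoy peak added for illustration.
example_peaks = [(973.0197, 2), (1074.5597, 2), (1100.0000, 2)]
pairs = find_hexnac_pairs(example_peaks)

print(f"18O deamidation shift: {o18_deamidation_shift():.4f} Da")
print("HexNAc-spaced ion pairs:", pairs)
```

A pair flagged in this way is only a candidate glycopeptide; as noted above, the spacing says nothing about linkage type, so sequence-level confirmation by MSA-CID remains necessary.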
A total of 134 distinct glycopeptides corresponding to 137 glycosylation sites from 86 proteins were identified (Table II). Using the PNGase F/ 18 O-method and IS-CID, we identified 124 and 31 glycopeptides, respectively, with an overlap of 21 glycopeptides. Through the 18 O-method, six additional glycosylation sites were found that did not match the consensus motif N[X!P][S/T], suggesting spontaneous deamidation during incubation in the presence of H 2 18 O. The IS-CID method yielded a considerably lower number of glycopeptide identifications than the PNGase F/ 18 O-approach. This might have been due to the generally low ionization efficiency of glycopeptides and the weak retention of glycopeptides on the reversed-phase trap column. Moreover, when intact glycopeptides are being analyzed, the presence of glycoforms of distinct glycopeptides exhibiting slightly shifted retention times may lead to peak spreading and ultimately to signal intensities too low for detection (58).
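The consensus motif N[X!P][S/T] used to judge whether a deamidated asparagine is a plausible N-glycosylation site can be expressed as a regular expression. The following is a minimal sketch with hypothetical function names, applied to two peptides quoted in this section with the site-marking asterisk removed.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr. The lookahead keeps
# the match one residue wide so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide):
    """Return 1-based positions of Asn residues that lie in an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide)]


def is_consensus_site(peptide, position):
    """True if the residue at the given 1-based position is a sequon Asn."""
    return position in sequon_positions(peptide)


pkhd1_peptide = "TITVANNGTHSTATILK"
hsp70g_peptide = "IIEVPVNETDTATGAEGAGADADTKAEK"

print(sequon_positions(pkhd1_peptide))
print(sequon_positions(hsp70g_peptide))
print(is_consensus_site(pkhd1_peptide, 6))
```

In the PKHD1-1 peptide only the second of the two adjacent asparagines lies in a sequon, in line with the position of the asterisk in TITVANN*GTHSTATILK; a deamidated asparagine outside such a motif would be a candidate for spontaneous deamidation.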
The number of detected glycoproteins differed considerably among the cell fractions analyzed (supplemental File S5). The majority of N-glycosylated proteins (62) were identified in the culture medium (SN). Only a few of these secreted proteins are functionally annotated in the JGI 4.3 Augustus 10.2 gene model database, but conserved domain searches indicated that most proteins are related to proteolysis, cell wall degradation, and carbohydrate binding.
Among secreted N-glycosylated proteins of C. reinhardtii cultivated under iron-deficient conditions, FEA1 and FEA2 were identified. These are two related proteins that have been proposed as members of the iron uptake pathway (59). All glycosylation sites of FEA1/2 were detected independently with both the PNGase F method and IS-CID. In the same fraction, another N-glycosylated protein (MnSOD3) that was previously proposed to play an important role in iron-deficiency responses was also identified (60).
The highest numbers of glycosylation sites were determined for the flagellar proteins PKHD1-1 and FMG1-1/FMG1-2. PKHD1-1 is a close homolog of human polycystic kidney and hepatic disease 1-like 1 protein (PKHD1L1) (BLASTp E-value = 0.0, 28% identity). An alignment of the peptide sequences of algal and human proteins revealed that four glycosylation sites were highly conserved (supplemental File S6). The IS-CID-MS1 spectrum of the PKHD1-1 glycopeptide TITVANN*GTHSTATILK showed considerable clustering of glycopeptide ions, which can be attributed to extensive glycan heterogeneity (Fig. 4; supplemental File S7). Within individual clusters and as observed in the total N-glycan profile (Fig. 1), several ions exhibited a difference of 14 Da, suggesting either partial glycan methylation or the presence of xylose and fucose. In fact, the IS-CID-MS1 spectrum indicated that both possibilities applied. However, MALDI-TOF analyses of released N-glycans clearly suggested that clustering was predominantly caused by the presence of methylated hexoses. Moreover, no ions exhibiting a mass difference of 291.0954 Da corresponding to sialic acid were present in the IS-CID-MS1 spectra of PKHD1-1. This was also true for all other MS1 spectra containing glycopeptide ions with identities confirmed by means of MSA-CID. Figs. 4B and 4C show MSA-CID spectra of the peptide TITVANN*GTHSTATILK modified by one and two HexNAc residues, respectively. N-glycosylation sites of PKHD1-1 are not evenly distributed but are grouped in three clusters. Cluster 1 (342-519) contains four N-glycosylation sites, whereas cluster 2 (1944-1967) and cluster 3 (2671-2716) contain two N-glycosylation sites each.
Six distinct glycopeptides of HSP70G corresponding to seven glycosylation sites were identified in the SN, TCE, and chloroplast fractions. Of these, only one peptide (IIEVPVN*ETDTATGAEGAGADADTKAEK) was detected with IS-CID (Fig. 5). The IS-CID-MS1 spectrum (Fig. 5A) showed signals of the HSP70G peptide modified by up to two HexNAc and five hexose residues. No peak clustering was observable; thus the glycan could be unambiguously classified as oligomannoside. Targeted fragmentation of m/z 1488.7174 via MSA-CID provided the information required for the determination of peptide sequence and glycosylation site (Fig. 5B). BLASTp revealed a high similarity of HSP70G to human HYOU1 (E-value: 1e-111; Table III). However, none of the N-glycosylation sites of HSP70G aligned with those determined for HYOU1 (supplemental File S8).

FIG. 4 (legend, continued.) ... from m/z 1389 to 1419, please refer to the supplemental material. Possible glycan compositions: X 1 : Hex+MeHex 2 +Pent; X 2 : Hex 2 +MeHex 2 +Pent; X 3 : Hex 2 +MeHex 2 +DeHex 2 or Hex+MeHex 3 +DeHex+Pent or MeHex 4 +Pent 2 ; X 4 : Hex 2 +MeHex 2 +DeHex 2 +Pent or Hex+MeHex 3 +DeHex+Pent 2 or MeHex 4 +Pent 3 ; X 5 : Hex 3 +MeHex 2 +DeHex 2 +Pent or Hex 2 +MeHex 3 +DeHex+Pent 2 or Hex+MeHex 4 +Pent 3 ; X 6 : Hex 2 +MeHex 3 +DeHex 2 +Pent or Hex+MeHex 4 +DeHex+Pent 2 or MeHex 5 +Pent 3 ; X 7 : Hex 2 +MeHex 4 +DeHex 2 +Pent or Hex+MeHex 5 +DeHex+Pent 2 or MeHex 6 +Pent 3 ; X 8 : Hex 3 +MeHex 4 +DeHex 2 +Pent or Hex 2 +MeHex 5 +DeHex+Pent 2 or Hex+MeHex 6 +Pent 3 . B, multistage activation (MSA)-CID spectrum of TITVANN*GTHSTATILK (precursor m/z 973.0197 (MH 2+ )) modified by one HexNAc residue. C, MSA-CID spectrum of the same peptide as in B differing by one additional core HexNAc residue (precursor m/z 1074.5597 (MH 2+ )). The majority of ions consistent with those in B are not annotated. ■, HexNAc; *, loss of water/ammonia.

Among the glycoproteins of the TCE fraction, we found three candidates of the N-glycan pathway: CrSTT3A and CrSTT3B, two subunits of the OST complex, and CrUGGT, a UDP-glucose:glycoprotein glucosyltransferase involved in the ER quality control of neosynthesized glycoproteins. As shown in supplemental File S9, the glycosylation sites of CrSTT3A and CrSTT3B are highly conserved among eukaryotic organisms. BLASTp analyses revealed that 14 N-glycosylated proteins from C. reinhardtii exhibited high sequence similarity to human proteins (E-value cutoff: 1e-30; Tables III and IV). Based on these results, peptide sequence alignments were repeated using ClustalW2, which led to the identification of five human proteins with either potential (PKHD1L1, CPVL) or confirmed (STT3A, STT3B, TM9SF3) N-glycosylation sites matching those determined for C. reinhardtii glycoproteins.
In Silico Analysis of the Chlamydomonas reinhardtii Genome-In eukaryotes, the N-glycan pathway starts with the biosynthesis of the dolichol pyrophosphate-linked oligosaccharide donor Glc 3 Man 9 GlcNAc 2 -PP-dolichol and its transfer by the OST onto asparagine residues of proteins in the lumen of the rough ER. Then, this precursor is deglucosylated by glucosidases I and II and reglucosylated by UGGT to ensure its interaction with chaperones responsible for protein folding. Taking advantage of the sequenced C. reinhardtii genome (20) and based on sequence similarity to genes encoding enzymes involved in these ER steps, we identified most of the enzymes involved in the biosynthesis of dolichol pyrophosphate-linked oligosaccharide (Table IV). Some of these predicted enzymes show strong homologies with the corresponding asparagine-linked glycosylation (ALG) orthologs described in other eukaryotes (61). No candidate gene was found to correspond to ALG3, ALG9, ALG10, or ALG12. Putative transferases able to catalyze the formation of dolichol-activated mannose and glucose (CrDPM1 and CrALG5, respectively) required for the biosynthetic steps arising in the ER lumen were also predicted. In addition, genes whose translation products display high percentages of identity with the flippase involved in the translocation of the dolichol pyrophosphate-associated intermediate and with subunits of the OST complex (STT3A, STT3B, DLG1, DAD1, ribophorin I and II, and OST3) were also identified (Table IV).
A search for putative proteins involved in the quality control of the proteins in the ER led to the identification of sequences encoding a glucosidase I, as well as the α and β subunits of glucosidase II. The β subunit contains the DMNE sequence (62) and a lectin domain involved in the binding of mannose residues (63). Glucosidase II is responsible for the cleavage of two α(1,3)-linked glucose residues from the precursor N-glycan. The trimming of terminal glucose residues allows the binding of monoglucosylated glycoproteins to, and their release from, calnexin and calreticulin, two ER-resident lectin-like chaperones that are involved in the retention of misfolded or incompletely folded proteins (64). A sequence encoding a UGGT, involved in the entry of incompletely folded proteins into cycles of calnexin/calreticulin-assisted folding (65), was also identified in the C. reinhardtii genome.
After their ER processing, the glycoproteins move to the Golgi apparatus, where the oligomannoside N-glycans are stepwise maturated into complex-type N-glycans. Three types of mannosidases are predicted in the C. reinhardtii genome (Table IV). An endo-mannosidase belonging to the CAZy family GH99, CrEMAN, has been identified exhibiting 36.5% identity with the human homolog. This mannosidase, identified in animals but not in plants, is able to release a Man-8 oligomannoside by cleaving the glucosylated precursors internally (66). Endo-mannosidases are usually located in the cis Golgi and provide an alternative pathway for the processing of the ER N-glycan precursor (67). In addition to this endo-mannosidase, one putative type-I mannosidase is predicted in the genome (Table IV). Although this glycosidase, CrMANI, does not display the typical topology of Golgi enzymes, the sequence exhibits 26% to 28% identity with human and plant α-MANI, as well as the conserved amino acids of the catalytic domain involved in Ca 2+ and oligomannoside binding and Cys residues involved in its folding (68,69). In addition, a sequence encoding α-mannosidase II belonging to the CAZy family GH38 was predicted (Table IV). This putative mannosidase displays the greatest similarity to the human cytosolic type-II mannosidase C (MANIIC, NP_06706.2), which has been shown to be involved in the turnover of free oligosaccharides (70,71). However, a putative function as a Golgi mannosidase involved in the N-glycan trimming cannot be definitively ruled out.

(Table footnote.) Peptide sequence alignments of candidates with conserved glycosylation sites are provided as supplemental material.
Usually, the synthesis of complex-type N-glycans starts with the transfer of a GlcNAc residue on the α(1,3)-mannose arm of Man 5 GlcNAc 2 by the action of a GnT I. However, no putative GnT I or GnT II sequence was identified in C. reinhardtii, suggesting the absence of a GnT I-dependent pathway in this green microalga. A search for a putative xylosyltransferase revealed the presence of one sequence (CrXYLT) exhibiting about 16.5% identity with β(1,2)-xylosyltransferase from Arabidopsis thaliana, in which this enzyme is responsible for the transfer of a β-xylose onto the β-mannose of the core N-glycan (72). However, considering the lack of information regarding conserved peptide domains required for β(1,2)-xylosyltransferase activity on N-linked glycans, the assignment of such a sequence remains highly speculative. A putative fucosyltransferase, CrFUT1, exhibiting 20% and 21% identity with α(1,3)-fucosyltransferases from A. thaliana AtFUT11 and AtFUT12, respectively, was also predicted in the genome (Table IV). This protein sequence exhibited the expected type-II membrane protein topology and motifs required for α(1,3)-fucosyltransferase activity (73-75), as well as conserved Cys residues and a CXXC motif located at the C-terminal end that is involved in the formation of disulfide bonds in plant α(1,3)-fucosyltransferases (76).

DISCUSSION
Here, we developed an integrated genomic, glycomic, and glycoproteomic approach to unravel the N-glycosylation pathway of C. reinhardtii and shed light on N-glycan structures and N-glycosylated proteins. Based on sequence similarities, we identified in the genome of C. reinhardtii a set of putative sequences encoding proteins involved in the synthesis in the ER of the dolichol pyrophosphate-linked oligosaccharide donor Glc 3 Man 9 GlcNAc 2 -PP-dolichol, its transfer by OST onto asparagine residues of proteins, and the deglucosylation/reglucosylation of the precursor N-glycan allowing its interaction with chaperones involved in the quality control of secreted proteins (Fig. 6, Table III and Table IV). Some of these proteins (STT3A/STT3B, UGGT) were identified in the proteome analysis of C. reinhardtii. In addition, the biochemical investigation of the N-glycan structures showed that both secreted and membrane-bound C. reinhardtii glycoproteins bear mainly Man 2 GlcNAc 2 to Man 5 GlcNAc 2 structures representing almost 70% of the total N-glycan population. Although some ALGs were not clearly identified in the C. reinhardtii genome, the identification of large oligomannosides up to Man 9 GlcNAc 2 (Man-9) (Table I) suggested that the biosynthesis of C. reinhardtii N-glycan in the ER is similar to that described in other eukaryotes. In addition, the structure of Man 5 GlcNAc 2 (Man-5) detected in C. reinhardtii N-glycan pools is identical to the one usually observed on eukaryote N-linked proteins.
Complex N-glycans were also identified on secreted and membrane-bound proteins isolated from C. reinhardtii. These N-glycans are partially O-methylated Man 3 GlcNAc 2 to Man 5 GlcNAc 2 bearing one or two xylose residues. Based on MS2 fragmentation and immunoblotting data, we demonstrated that one of these xylose residues is linked in β(1,2) to the core β-Man as previously reported in plants (56), whereas the second one is linked in C4 on one outer terminal mannose. Although a putative fucosyltransferase, CrFUT1, is predicted in the C. reinhardtii genome, only traces of fucosylated glycans were detected in the N-glycan profiles. These complex N-linked glycans were also observed on an individual peptide, the PKHD1-1 peptide TITVANN*GTHSTATILK, exhibiting extensive glycan heterogeneity (Fig. 4, supplemental File S7). These results contrast with those obtained in Porphyridium sp., in which a cell wall glycoprotein was found to carry Man 8 GlcNAc 2 and Man 9 GlcNAc 2 containing 6-O-methyl mannose and substituted by one or two xylose residues, with one xylose located on the chitobiose unit (11).
Based on both in silico and biochemical analyses, we postulate, as illustrated in Fig. 6, that after their synthesis in the ER, oligomannoside N-glycans are processed into Man 5 GlcNAc 2 in the Golgi apparatus by Golgi-residing mannosidases such as the putative type-I mannosidase CrMANI (Table IV). The formation of complex-type N-glycans then occurs via additional maturation steps such as xylosylation and methylation of mannoses (Fig. 6). Although functional characterization of putative Golgi transferases is required in order to definitively establish the precise order of Golgi events, the absence in N-glycan profiles of methylated Man 2 GlcNAc 2 and Man 1 GlcNAc 2 suggests that O-methylation of mannose residues likely occurs after the xylosylation of oligomannosides.
In most eukaryotic organisms, GnT I transfers a GlcNAc residue on the α(1,3)-mannose arm of Man 5 GlcNAc 2 to initiate the synthesis of complex-type N-glycans. However, because no gene encoding a putative GnT I could be identified in the C. reinhardtii genome and neither MALDI-TOF analyses of N-glycans nor IS-CID experiments indicated any GnT I-dependent activities, we conclude that the maturation of complex N-glycans occurs through a GnT I-independent pathway. N-glycan processing in a GnT I-independent pathway has already been demonstrated to occur in GnT I mutants (77,78) or in organisms devoid of GnT I activity such as mushrooms (79). In contrast, N-glycans are processed in a GnT I-dependent manner in the diatom Phaeodactylum tricornutum (12), implying the existence of distinct N-glycosylation pathways in microalgae depending on the phyla they belong to.
From the proteomic data, it is clear that C. reinhardtii possesses numerous functionally interesting N-glycosylated proteins. The highest number of distinct N-glycoproteins was detected in the culture medium of C. reinhardtii, which is not surprising, because glycosylation is a common characteristic of secreted proteins (19,80). Moreover, glycoproteomic analyses were carried out on the C. reinhardtii strain CC-400, which easily releases periplasmic proteins into the growth medium because of its cell wall deficiency (81). To compensate for the "loss" of extracellular proteins, the expression of secreted proteins may be up-regulated in this strain. Most of the identified proteins lack functional annotation, yet many of them feature conserved domains that suggest proteolytic and/or carbohydrate-binding activities. Correspondingly, they may be involved in processes such as nutrient acquisition, cell-cell recognition, or cell wall degradation. The latter function was confirmed for the matrix metalloprotease MMP1 (G-lysin), which is induced during gametogenesis (82-85). However, it remains unknown whether the seven uncharacterized glycoproteins containing gametolysin domains serve a similar function.
Through BLASTp searches, 14 human proteins were identified that showed high sequence similarity to glycoproteins from C. reinhardtii (Table III). Among these, five proteins showed sequence conservation even with respect to the localization of the NXT/NXS motif. For example, two N-glycosylation sites were identified in each of the predicted oligosaccharyltransferases CrSTT3A and CrSTT3B. The peptide sequence alignments showed that these sites are located within a region that is highly conserved in eukaryotic organisms and is proposed to harbor the catalytic site (86). N-glycosylation sites corresponding to those of C. reinhardtii have already been reported for STT3 from Saccharomyces cerevisiae and human STT3A/STT3B (13, 86-89) (supplemental File S9). In yeast, glycosylation of N539 (corresponding to N-glycosylated N595/N986 in CrSTT3A/CrSTT3B) was shown to be essential for the enzymatic function of STT3. N591 and N582 of CrSTT3A and CrSTT3B, respectively, were not found to be glycosylated, although they were located within consensus motifs of N-glycosylation. The same observation was made for the corresponding residue (N535) of yeast STT3. Moreover, mutational studies led to the conclusion that nonglycosylated N535 is essential for proper enzyme function (86). In humans, however, this residue is indeed N-glycosylated, in both STT3A and STT3B (13,89). The example of STT3 proteins demonstrates that N-glycosylation sites are highly conserved across distantly related organisms when N-glycans are essential for enzyme activity. The subtle differences in the glycosylation patterns of human, yeast, and C. reinhardtii STT3 proteins may provide fundamental information regarding the principles of N-glycosylation in eukaryotes.

FIG. 6. Proposed N-glycosylation biosynthesis pathway in C. reinhardtii. The proposed pathway is based on the major N-glycan structures found according to the in silico analysis. N-glycan structures have been drawn using the symbolic nomenclature adopted by the Consortium for Functional Glycomics (101). ■, N-acetylglucosamine; ●, mannose; ★, xylose.
Polypeptide sequence alignments showed that four out of eight N-glycosylation sites determined for PKHD1-1 (fibrocystin-like protein) aligned perfectly with NXT/NXS motifs of the human homolog PKHD1L1 (synonym: PKHDL1; supplemental Files S6 and S7). No glycoproteomic data are available indicating N-glycosylation of PKHD1L1. However, its paralog fibrocystin (polycystic kidney and hepatic disease 1 (PKHD1)) is known to be highly N-glycosylated (90). Fibrocystin-like proteins are proposed to be evolutionary ancestors of fibrocystin and thus exhibit many structural similarities (91-93). In fact, although it is more similar to PKHD1L1 on the peptide-sequence level, PKHD1-1 is actually suggested to be the functional homolog of fibrocystin, because PKHD1-1 and PKHD1 are localized to flagella and primary cilia/basal bodies, respectively (93,94). In contrast, PKHD1L1 might play a role in cellular immunity, as was concluded from the widespread PKHD1L1 expression in human and murine blood-derived cell lines and its up-regulation in T lymphocytes (91). The function of fibrocystin remains unknown, but mutations in the PKHD1 gene are linked to autosomal-recessive polycystic kidney disease (92,95). The structural similarity of fibrocystin and fibrocystin-like proteins, in addition to the presence of conserved (potential) N-glycosylation sites, underlines that C. reinhardtii may be a suitable model system for studying human ciliary dysfunctions.
Glycopeptides of the heat shock protein HSP70G were identified in several cell fractions. Whether the widespread distribution was caused by cross-contaminations or HSP70G is indeed localized to several cellular compartments is currently being elucidated. HSP70G was initially presumed to be an ER resident protein as predicted by TargetP and inferred from its sequence similarity to the human ER-localized HYOU1 protein (96). However, recent studies have shown that HSP70G is also chloroplast localized in C. reinhardtii (97). Interestingly, HYOU1 localization was found not to be restricted to the ER. In rats, glycosylated HYOU1 was detected in mitochondria, and an N-terminally truncated form was found in the cytoplasm (98,99). Accordingly, we assume that HSP70G and its homologs may be localized to several cellular compartments as part of a multiple targeting strategy. The same conclusion may be drawn for MnSOD3, as it was found recently to be a chloroplast-located superoxide dismutase (60). Because MnSOD3 was found to be N-glycosylated and was identified in the culture medium, it is evident that the protein must take a route through the secretory pathway with subsequent distribution to multiple destinations. A corresponding targeting mechanism of proteins to the chloroplast via the ER and Golgi apparatus was already described in Arabidopsis for N-glycosylated α-carbonic anhydrase (100).
The exploration of the C. reinhardtii N-glycan pathway as done in the present study represents an important first step toward the remodeling of the alga by genetic engineering to produce Chlamydomonas-derived biopharmaceuticals carrying N-linked glycans compatible with human therapeutic applications. Notably, our comprehensive analyses also revealed N-glycosylated proteins in the chloroplast, as well as in the extracellular space, thereby providing information for future targeting experiments for the expression of glycoproteins of biotechnological interest.