The Global Phosphoproteome of Chlamydomonas reinhardtii Reveals Complex Organellar Phosphorylation in the Flagella and Thylakoid Membrane

Chlamydomonas reinhardtii is the most intensively studied and well-developed model for investigation of a wide range of microalgal processes, from basic development through triacylglycerol production. Although proteomic technologies permit interrogation of these processes at the protein level and efforts to date indicate that phosphorylation-based regulation of proteins in C. reinhardtii is essential for its underlying biology, characterization of the C. reinhardtii phosphoproteome has been limited. Herein, we report the richest exploration of the C. reinhardtii proteome to date. Complementary enrichment strategies were used to detect 4588 phosphoproteins distributed among every cellular component in C. reinhardtii. Additionally, we report 18,160 unique phosphopeptides at <1% false discovery rate, which comprise 15,862 unique phosphosites, 98% of which are novel. Given that an estimated 30% of proteins in a eukaryotic cell are subject to phosphorylation, the phosphoproteins reported here (23% of C. reinhardtii proteins) represent the majority of the phosphoproteome. Proteins in key biological pathways were phosphorylated, including photosynthesis, pigment production, carbon assimilation, glycolysis, and protein and carbohydrate metabolism, and it is noteworthy that hyperphosphorylation was observed in flagellar proteins. This rich data set is available via ProteomeXchange (ID: PXD000783), will significantly enhance understanding of a range of regulatory mechanisms controlling a variety of cellular processes, and will serve as a critical resource for the microalgal community.

associated with the eyespot apparatus, including several kinases and phosphatases implicated in phosphorylation-based signaling in the eyespot. In a study of C. reinhardtii flagella, Pan et al. (10) observed 1296 spectral counts of phosphopeptides corresponding to 224 phosphoproteins involved with motility and assembly. In a similar study, Boesger et al. (11) observed 141 phosphopeptides corresponding to 32 proteins. Using whole cells, Wagner et al. (12) observed 360 phosphopeptides corresponding to 328 proteins, including several flagellar kinases, which indicates the importance of phosphorylation-based signaling for motility and assembly.
Despite the importance of phosphorylation-based signaling underlying C. reinhardtii biological processes, characterization of the cellular pool of phosphopeptides has been limited. Although additional dimensions of separation that are orthogonal to online reversed-phase chromatography are routinely used to probe low-abundance phosphopeptide species, this approach has not been applied to the C. reinhardtii phosphoproteome. Hydrophilic-interaction liquid chromatography improves phosphopeptide separation and detection (13) and is more orthogonal to online reversed-phase chromatography than strong-cation exchange is (14). Additionally, to complement the increased resolution of phosphopeptides afforded by a first-dimension separation, enrichment strategies based on the affinity of a phosphate group for a metal ion or metal oxide can further increase coverage. Currently, a single immobilized metal affinity chromatography (IMAC) scheme is the most popular choice for phosphopeptide studies using C. reinhardtii. However, conventional insoluble TiO2 beads recover more phosphopeptides than traditional IMAC (15). Additionally, PolyMAC (polymer-based metal ion affinity capture) is an improved polymer-based analog of IMAC that uses TiO2-functionalized soluble nanopolymers to chelate phosphopeptides in a homogeneous aqueous environment (15). Thus, use of complementary enrichment schemes based on TiO2 and PolyMAC can yield more comprehensive results than a single strategy.
In this study, complementary approaches using TiO2/PolyMAC enrichment and hydrophilic-interaction liquid chromatography (HILIC) were employed to explore the C. reinhardtii phosphoproteome in significant depth. We report the detection of 4588 nonredundant phosphoproteins from 18,160 unique phosphopeptides at <1% false discovery rate. Among these peptides, we report 15,862 unique phosphosites identified with ≥95% localization probability. Nearly all reported sites are novel. Our data show that many key biological pathways, including photosynthesis, chlorophyll biosynthesis, carbon assimilation, protein metabolism, and flagella assembly and motility, comprise multiple phosphoproteins. These data provide a framework for garnering novel mechanistic insights into a variety of cellular and signaling processes.

EXPERIMENTAL PROCEDURES
Strains and Cultures-C. reinhardtii strain CC-400 cw15 mt− was obtained from Ursula Goodenough. Cells were cultivated at 21°C under 100 μmol m⁻² s⁻¹ light with a 14 h-dark/10 h-light cycle using a Fluora lamp (Osram, Munich, Germany) in 250 ml flasks on an INFORS HT Multitron orbital shaker at 110 rpm (INFORS HT, Bottmingen, Switzerland) using Tris acetate phosphate (TAP) medium (16). Freshly inoculated cultures were grown for 4-5 days under air without additional supply of CO2. Samples were taken at log phase with cell concentrations of 3-4 × 10⁶ cells/ml. Three biological replicates were prepared.
Protein Extraction-cw15 cells from 500 ml were harvested and immediately frozen using liquid N2. Protein was extracted using the phenol extraction method as described previously (17) with the following modifications. In order to preserve protein phosphorylation, protease inhibitor mixture and phosphatase inhibitor mixture tablets (Roche, Indianapolis, IN, USA) were added to 0.1 M Tris pH 8.8 buffer according to the manufacturer's instructions. After the final 100 mM ammonium acetate precipitation step, proteins were resuspended in 6 M urea, 2 M thiourea, and 100 mM ammonium bicarbonate buffer supplemented with the protease and phosphatase inhibitor tablets. Protein concentrations were determined using the Pierce 660 nm protein assay reagent (Thermo Scientific, Rockford, IL, USA) according to the manufacturer's instructions.
Protein Digestion-Protein digestion was performed as described previously with modifications (17). Samples were diluted 2-fold using 100 mM ammonium bicarbonate, reduced using 5 mM DTT for 45 min at room temperature, and alkylated using 10 mM iodoacetamide for 45 min at room temperature in the dark. The samples were then diluted to 1 M urea using 100 mM ammonium bicarbonate, and modified trypsin (Promega, Madison, WI) was added to a final enzyme:substrate w:w ratio of 1:30. At this step the protein concentration was 1-1.5 mg/ml. The trypsin digest was incubated at 37°C for 12 h. After digestion, the peptide mixture was frozen at −20°C to halt protease activity. Samples not immediately analyzed were stored at −80°C. Each biological replicate was split equally: 8 mg of sample was used for TiO2-HILIC enrichment, and 8 mg for HILIC-PolyMAC enrichment. Three biological replicates, that is, a total of six samples, were analyzed.
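The dilution and enzyme amounts implied by this protocol can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, the starting urea concentration of 6 M is taken from the resuspension buffer described under Protein Extraction, and digesting the full 16 mg of each replicate in one batch is an assumption made for the example.

```python
def dilution_factor(c_start: float, c_end: float) -> float:
    """Fold dilution needed to go from c_start to c_end (same units)."""
    if c_end <= 0 or c_end > c_start:
        raise ValueError("target must be positive and not exceed the start")
    return c_start / c_end


def trypsin_mass_mg(protein_mg: float, substrate_per_enzyme: float = 30.0) -> float:
    """Trypsin mass (mg) for a 1:substrate_per_enzyme enzyme:substrate (w:w) ratio."""
    return protein_mg / substrate_per_enzyme


# 6 M urea buffer diluted 2-fold before reduction and alkylation.
urea_after_first_dilution = 6.0 / 2.0
# Further dilution needed to reach 1 M urea before adding trypsin.
second_fold = dilution_factor(urea_after_first_dilution, 1.0)
total_fold = 2.0 * second_fold

# Assumption: one 16 mg replicate (8 mg + 8 mg) digested in a single batch.
trypsin_for_replicate = trypsin_mass_mg(16.0)

print(f"second dilution: {second_fold:.0f}-fold, total: {total_fold:.0f}-fold")
print(f"trypsin for 16 mg protein at 1:30: {trypsin_for_replicate:.3f} mg")
```

Under these assumptions the sample is diluted 6-fold overall relative to the resuspension buffer before trypsin is added.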
TiO2 Enrichment-TiO2 enrichment was performed using the centrifugation method described previously (18) with the following modifications. TiO2 beads (Titansphere TiO2 5 μm, GL Science, Inc. Japan) were equilibrated once using acetonitrile and once using loading buffer (80% ACN/5% TFA/saturated phthalic acid) for 5 min in each step. Solvent was removed at each step by a 2 min centrifugation at 20,000 × g. Trypsin-proteolyzed peptides were lyophilized in a vacuum centrifuge and resuspended in loading buffer to a final concentration of 1-1.5 mg/ml. The resultant solution was equilibrated with TiO2 beads in a w:w ratio of 1:5 and incubated for 1 h at room temperature using a head-to-head rotator at 30 rpm. The beads were washed once using 1.5 ml loading buffer and three times using 1.5 ml washing buffer (2% TFA/80% ACN), by incubating for 20 min. Phosphopeptides were eluted twice using 500 μl 0.5% ammonium hydroxide with shaking at 1400 rpm for 10 min; elutions were subsequently combined. Phosphopeptides were lyophilized in a vacuum centrifuge and stored at −80°C until further processing.
PolyMAC Enrichment-PolyMAC (Tymora Analytical Operations, West Lafayette, IN) enrichment was performed according to the manufacturer's instructions with two modifications. The binding time was increased from 5 min to 1 h and the PolyMAC capture time was increased from 10 min to 30 min.
Solid Phase Extraction (SPE) Desalting-For the HILIC-PolyMAC workflow, digested peptides were desalted prior to HILIC fractionation. Desalting was performed using Waters Oasis HLB 6 cc Vac Cartridge (500 mg sorbent per cartridge, 60 μm) according to the manufacturer's instructions. Briefly, each cartridge was conditioned and equilibrated using 6 ml ACN and 6 ml 0.1% TFA, twice each, sequentially. Then, 4 mg dried peptide mixture was dissolved in 4 ml 0.1% TFA and loaded onto the cartridge at a flow rate of 1 drop/sec. The flow through was collected and reloaded onto the column as described previously. Then the cartridge was washed three times using 6 ml 0.1% TFA and once using 6 ml H2O. Peptides were eluted four times using 500 μl 70% ACN containing 0.1% TFA. Eluates were combined, lyophilized using a vacuum centrifuge, and stored at −80°C until further processing.
HILIC Fractionation-For fractionation of TiO2-enriched samples, HILIC separations were performed on a Beckman System Gold HPLC (Beckman Coulter, Fullerton, CA) using a 2.0 mm i.d. × 15 cm, 3 μm TSKgel Amide-80 column (Tosoh Biosciences, King of Prussia, PA). For tryptic digests prior to PolyMAC enrichment, a 4.6 mm i.d. × 25 cm, 5 μm TSKgel Amide-80 HR column was used with the same HPLC. Both workflows used 200 μl injections. The rate of solvent flow was 0.2 ml/min and 1 ml/min for the 2.0 mm i.d. and 4.6 mm i.d. columns, respectively. Fractionation methodology followed (13) and included solvent A (98% water with 0.1% TFA) and solvent B (98% ACN with 0.1% TFA). For TiO2-HILIC, samples were loaded in 80% B, and a modified gradient was applied consisting of 80% B held for 5 min, 80% B to 60% B over 40 min, and finally 60% B to 0% B over 5 min. Thirty-six 1 ml fractions from 15 to 50 min were collected and lyophilized in a vacuum centrifuge prior to LC-MS/MS analyses. For HILIC-PolyMAC, samples were loaded in 90% B, and a modified gradient was applied consisting of 90% B to 85% B over 5 min, 85% B to 70% B over 40 min, and finally 70% B to 0% B over 5 min. Twenty-four 1 ml fractions were collected from 15 to 38 min and lyophilized in a vacuum centrifuge for subsequent PolyMAC enrichment.
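The two gradient programs can be written as piecewise-linear functions of time, which makes it easy to see the solvent composition at which each fraction was collected. This is a minimal sketch rather than instrument method code: it assumes the segments run back to back from t = 0 and are linear within each segment, and it assumes that fraction collection times are counted inclusively by the minute, which reproduces the stated 36 and 24 fractions. All names are hypothetical.

```python
from bisect import bisect_right

# (time in min, percent solvent B) breakpoints, assuming segments start at t = 0.
TIO2_HILIC = [(0, 80), (5, 80), (45, 60), (50, 0)]
HILIC_POLYMAC = [(0, 90), (5, 85), (45, 70), (50, 0)]


def percent_b(gradient, t):
    """Linearly interpolate percent B at time t (min) along the gradient."""
    times = [point[0] for point in gradient]
    if t <= times[0]:
        return float(gradient[0][1])
    if t >= times[-1]:
        return float(gradient[-1][1])
    i = bisect_right(times, t) - 1
    (t0, b0), (t1, b1) = gradient[i], gradient[i + 1]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


def fraction_minutes(first, last):
    """Collection minutes, counted inclusively (an assumption)."""
    return list(range(first, last + 1))


tio2_fractions = fraction_minutes(15, 50)
polymac_fractions = fraction_minutes(15, 38)

for name, gradient, fractions in (
    ("TiO2-HILIC", TIO2_HILIC, tio2_fractions),
    ("HILIC-PolyMAC", HILIC_POLYMAC, polymac_fractions),
):
    start, end = fractions[0], fractions[-1]
    print(f"{name}: {len(fractions)} fractions, "
          f"{percent_b(gradient, start):.2f}% B at {start} min, "
          f"{percent_b(gradient, end):.2f}% B at {end} min")
```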
NanoLC-MS/MS-The first replicate sample of each workflow was analyzed using an LTQ-Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Rockford, USA) coupled with a nanoLC Ultra (Eksigent, Dublin, USA). Each sample (5 μl) was loaded onto a trap column (C18 PepMap100, 300 μm × 1 mm, 5 μm, 100 Å, Dionex, Sunnyvale, USA) at a flow rate of 4 μl/min for 5 min. Peptide separation was carried out on a C18 column (Acclaim PepMap C18, 15 cm × 75 μm × 3 μm, 100 Å, Dionex) at a flow rate of 0.26 μl/min. Peptides were separated using an 85 min linear gradient ranging from 5% to 40% B. The mass spectrometer was operated in positive ionization mode. The MS survey scan was performed in the FT cell from a mass range of 300 to 1700 m/z. The resolution was set to 60,000 at 400 m/z, and the automatic gain control (AGC) was set to 500,000 ions. CID fragmentation was used for MS/MS, and the 20 most intense signals in the survey scan were fragmented. An isolation window of 1.5 m/z and a target value of 10,000 ions were used. Fragmentation was performed with normalized collision energies of 35% and activation times of 30 ms. Dynamic exclusion was performed with a repeat count of 1 and an exclusion duration of 75 s, and the minimum MS signal for triggering MS/MS was set to 5000 counts.
All subsequent replicates (four of six total samples) were analyzed using a TripleTOF 5600 mass spectrometer (AB Sciex, Concord, Canada) coupled to a nanoLC 2D (Eksigent, Dublin, OH). Five microliters of each sample was loaded onto a trap column (ChromXp C18, 350 μm × 0.5 mm, 5 μm, 120 Å, Eksigent, Dublin, OH, USA) at a flow rate of 3 μl/min for 7 min. Peptide separation was carried out on a C18 column (ChromXp C18, 15 cm × 75 μm × 3 μm, 120 Å, Eksigent) at a flow rate of 0.3 μl/min. Peptides were separated using a 120 min linear gradient from 5% to 50% B (mobile phase A, 1% formic acid; mobile phase B, 1% formic acid in ACN). The mass spectrometer was operated in positive ionization and high sensitivity mode. The MS survey spectrum was accumulated from a mass range of 380 to 1250 m/z in 250 ms. For IDA (information dependent acquisition) MS/MS experiments, the first 20 features above a 150 counts threshold and having a charge state of +2 to +5 were fragmented using rolling collision energy ± 5%, with 100 ms spectra accumulation/experiment. Each MS/MS experiment put the precursor m/z on a 75 s dynamic exclusion list. Auto calibration was performed every five samples (9 h) to assure high mass accuracy in both MS and MS/MS acquisition.
Protein Inference-Orbitrap-Velos raw data were converted to Mascot generic format (.mgf) files by Mascot Daemon 2.4 (Matrix Science, London, UK). TripleTOF 5600 raw data were processed using MS Data Converter (ABSciex) to generate .mgf files. For each replicate, all fractions were merged prior to database search using Mascot Daemon 2.4. All Mascot searches were performed against a combined database (19,603 sequences total) containing the full Phytozome v9.0 database (http://www.phytozome.net/, Dec. 2012) and the NCBI chloroplast and mitochondrion databases using the following settings: trypsin with up to two missed cleavages; carbamidomethylation fixed at Cys; variable oxidation at Met; variable protein N-terminal acetylation; variable deamidation at Asn and Gln; variable phosphorylation at Ser, Thr, and Tyr. The mass error tolerance for precursor ions was set to 10 ppm, 0.08 Da for TripleTOF 5600 fragment ions, and 0.8 Da for Orbitrap-Velos fragment ions. A decoy reverse database search was also performed to estimate the false positive rate of protein identification. Identified proteins were further analyzed with Scaffold (v3.1, Proteome Software Inc., Portland, OR) and Scaffold PTM (v3.1; Proteome Software, Portland, OR) to confirm the identifications and the phosphorylation site localization probabilities. First, Scaffold was used to filter proteins and peptides based on the Protein Prophet (19) and Peptide Prophet (20) algorithms to obtain 99% protein and peptide probability and <1% protein and peptide FDR with at least one peptide matched. Next, Scaffold PTM was used to obtain an estimate of the phosphosite localization probability by assigning Ascores (21) to each PTM call. A Scaffold PTM spectrum report for each search was exported, listing all MS/MS queries with their Ascores for every protein having a 99% or better probability.
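For readers unfamiliar with decoy searching, the idea behind a reversed-database false positive estimate can be sketched as follows. This is not the filtering used in this study, which relied on the Protein Prophet and Peptide Prophet models in Scaffold; it is a generic illustration of the simple decoy/target ratio, with hypothetical function names and toy scores.

```python
def decoy_fdr(psms, threshold):
    """Estimate FDR as decoy hits / target hits among matches scoring >= threshold.

    psms is a list of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_at_fdr(psms, max_fdr=0.01):
    """Return the lowest observed score whose estimated FDR is <= max_fdr, or None."""
    best = None
    # Scan every observed score; the estimate is not guaranteed to be monotone.
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if decoy_fdr(psms, threshold) <= max_fdr:
            best = threshold
    return best


# Toy data only: (score, matched the reversed database?)
toy_psms = [(90, False), (80, False), (70, False), (60, True), (50, False)]
print(lowest_threshold_at_fdr(toy_psms, max_fdr=0.01))
```

With real data the list would hold one entry per peptide-spectrum match from the combined target and reversed searches.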
Phosphosite Compilation-An Excel Visual Basic for Applications (VBA) (Microsoft, Redmond, WA) program was written to compile the spectrum reports and generate the peptide, protein, and phosphosite tables for all nonredundant proteins. Phosphosites were compiled as follows. (1) All MS/MS queries for each peptide of a protein were counted and the highest Ascore query tabulated; in case of a tie the highest Mascot score query was tabulated. (2) All phosphosites for the protein were tabulated. (3) For each phosphosite, the highest Ascore was sought from all peptides containing the site, including missed cleavage peptides and multiply-modified peptides. (4) Each phosphosite was flagged if its Ascore was 13 or better, connoting a 95% site localization confidence (21-23). (5) Each phosphosite was flagged if its Ascore was 18 or better and its neighboring residues were not Ser, Thr, or Tyr. Phosphosites with neighboring Ser, Thr, or Tyr residues and Ascores of 40 or better were also flagged. The estimated site localization confidence for this more stringent filter is 99% (24).
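Steps (3) to (5) amount to a small set of threshold rules, sketched below in Python rather than the VBA used in this study. The function names are hypothetical, and "neighboring residues" is interpreted here as the residues immediately before and after the site, which is an assumption.

```python
PHOSPHO_ACCEPTORS = frozenset("STY")


def best_ascore_per_site(observations):
    """Step (3): keep the highest Ascore seen for each site.

    observations is an iterable of (site_id, ascore) pairs, one per spectrum match.
    """
    best = {}
    for site_id, ascore in observations:
        if site_id not in best or ascore > best[site_id]:
            best[site_id] = ascore
    return best


def localization_flags(ascore, prev_residue, next_residue):
    """Steps (4) and (5): return (flag_95, flag_99) for one site.

    prev_residue / next_residue are one-letter codes, or None at a terminus.
    Assumption: only the immediately adjacent residues count as neighbors.
    """
    flag_95 = ascore >= 13
    has_acceptor_neighbor = (
        prev_residue in PHOSPHO_ACCEPTORS or next_residue in PHOSPHO_ACCEPTORS
    )
    flag_99 = ascore >= (40 if has_acceptor_neighbor else 18)
    return flag_95, flag_99


# Toy observations for two hypothetical sites.
best = best_ascore_per_site([("S10", 5.0), ("S10", 21.3), ("T44", 13.0)])
print(best)
print(localization_flags(best["S10"], "A", "P"))
print(localization_flags(best["S10"], "S", "P"))
```

The second call shows the effect of the stricter rule: the same Ascore passes the 99% filter next to Ala but not next to Ser.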
The same VBA program determined the subset of nonredundant phosphosites among all proteins recursively as follows. First, for each site the largest contiguous sequence spanning the site was constructed using only peptides observed as modified at the site with high confidence. With all observed site sequences thus tabulated, each was compared with the rest to cull redundant entries. This processing was based solely on observed peptides and thus was independent of putative protein association.
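One way to picture this culling step is to align two site sequences at the phosphosite and drop the shorter one when it is fully contained in the longer one. The sketch below is a greedy Python approximation under that assumption, not the recursive VBA procedure itself; the names and toy sequences are hypothetical.

```python
def is_contained(a, b):
    """True if entry a, aligned at its phosphosite, lies entirely within entry b.

    Each entry is (sequence, site_index), where site_index is the 0-based
    position of the phosphorylated residue in sequence.
    """
    (seq_a, i_a), (seq_b, i_b) = a, b
    return (
        seq_a[i_a] == seq_b[i_b]
        and seq_b[:i_b].endswith(seq_a[:i_a])
        and seq_b[i_b + 1:].startswith(seq_a[i_a + 1:])
    )


def cull_redundant(entries):
    """Keep the longest site sequences; drop any entry contained in a kept one."""
    kept = []
    for entry in sorted(entries, key=lambda e: len(e[0]), reverse=True):
        if not any(is_contained(entry, k) for k in kept):
            kept.append(entry)
    return kept


# Toy sequences only; the phosphosite is the Ser at the given index.
toy_entries = [
    ("GAAKSPEELRT", 4),   # longest observation of the site
    ("AKSPEE", 2),        # same site, shorter context: redundant
    ("GAAKSPEELRT", 4),   # exact duplicate: redundant
    ("GAAKSPDDLR", 4),    # differs downstream of the site: kept
]
nonredundant = cull_redundant(toy_entries)
print(nonredundant)
```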
Phosphosite Motif Analysis-For motif analysis only nonredundant 99% localization confidence sites were considered. A list of 13-residue sequences centered on each site was constructed from its observed largest contiguous sequence. For any site within six residues of the N- or C-terminus of its largest contiguous sequence, the VBA program used the inferred protein sequence to extend the site neighborhood to give a phosphosite-centered 13-mer sequence. Sequences were categorized as basic, acidic, proline-directed, tyrosine, or "other" according to the rules of Huttlin and colleagues (23), well-articulated by Lasonder and colleagues (25). Motifs were determined for each category using Motif-X (http://motif-x.med.harvard.edu/motif-x.html) (26,27). The background was the IPI Arabidopsis thaliana proteome, the width was set to 13, occurrences set to 20, significance set to 1 × 10⁻⁷, and pT sites replaced at position 7 with pS. For tyrosine motifs the thresholds were relaxed: occurrences set to 5 and significance set to 1 × 10⁻⁶.
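Building a phosphosite-centered 13-mer from a protein sequence is a simple windowing operation, sketched below. The function name and toy sequence are hypothetical, and padding with "-" where the protein itself ends within six residues of the site is an assumption, since the handling of such cases is not described above.

```python
def centered_13mer(protein_seq, site_index, merge_thr_into_ser=True, pad="-"):
    """Return the 13-residue window centered on a phosphosite.

    site_index is the 0-based position of the phosphorylated residue.
    If merge_thr_into_ser is True, a central Thr is written as Ser,
    mirroring the replacement of pT by pS at position 7.
    Assumption: windows running past a protein terminus are padded.
    """
    flank = 6
    left = protein_seq[max(0, site_index - flank):site_index]
    right = protein_seq[site_index + 1:site_index + 1 + flank]
    center = protein_seq[site_index]
    if merge_thr_into_ser and center == "T":
        center = "S"
    return left.rjust(flank, pad) + center + right.ljust(flank, pad)


toy_protein = "MKRLSSPTEDGAYLLQ"  # toy sequence, not from the data set
window_near_terminus = centered_13mer(toy_protein, 4)
window_thr = centered_13mer(toy_protein, 7)
print(window_near_terminus)
print(window_thr)
```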
Gene Ontology (GO) Annotation-Functional annotation of the set of phosphoproteins having at least one site with a 95% localization confidence was completed using Blast2GO Pro v2.6.4 (http://www.Blast2GO.com/b2ghome, BioBam, Valencia, Spain). BlastP was used against the NCBInr database with an ExpectValue of 1 × 10⁻⁶. The annotation step was configured with an E-Value-Hit-Filter of 1 × 10⁻⁶, annotation Cutoff of 55, and GO Weight of 5. Fisher's exact test enrichment analysis was completed using Blast2GO and a background set of the Blast2GO-annotated Phytozome v9.0 C. reinhardtii proteome.
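The enrichment test for a single GO term can be illustrated with a one-sided Fisher's exact test computed from the hypergeometric distribution. This is a minimal sketch, not Blast2GO's implementation: whether Blast2GO's test is one- or two-sided and how it corrects for multiple testing are not stated above, so a one-sided over-representation test is assumed, and the counts in the example are toy numbers.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test p-value for over-representation.

    k: data set proteins annotated with the term
    n: data set size
    K: background proteins annotated with the term
    N: background size (the data set is assumed to be a subset of it)
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 3 of 5 sampled proteins carry a term held by 4 of 20 background proteins.
p_toy = fisher_enrichment_p(k=3, n=5, K=4, N=20)
print(f"p = {p_toy:.4f}")
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for proteome-scale backgrounds a statistics library would be the more practical choice.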

RESULTS
Phosphoproteome Profiling in C. reinhardtii-In order to obtain an in-depth phosphoproteome profile of C. reinhardtii, two complementary phosphopeptide enrichment approaches, PolyMAC and TiO2, were implemented. High-resolution fractionation was accomplished by HILIC in both enrichment workflows and high-performance MS was used to identify phosphoproteins. Three biological replicates, each split for the HILIC-PolyMAC and TiO2-HILIC workflows yielding six total samples, were processed via the schematic workflow depicted in Fig. 1A. A total of 180 LC-MS/MS data sets were acquired after fractionation using HILIC.
A total of 4588 proteins were identified as phosphorylated at least once with 95% phosphosite localization confidence (supplemental Table S1), from a total of 6136 proteins identified with a <1% false discovery rate (supplemental Table S2) from 33,957 peptides (supplemental Table S3). These phosphoproteins were inferred from a set of 23,030 phosphopeptides, with 18,160 detected at 95% localization confidence (supplemental Table S4). These counts were based on an examination of unique peptides, where the peptide identity used for comparison was determined by its sequence and the presence and location of phosphorylation(s) and acetylation (for protein N-terminal peptides). Thus defined, the unique p-peptide subset was filtered on the maximum phosphorylation Ascore for each, using a threshold Ascore of 13.
FIG. 1. The workflow schematic and summary of identified phosphoproteins, phosphopeptides, and phosphosites. A, the workflow schematic for three biological C. reinhardtii cw15 replicates split among two enrichment-fractionation workflows. HILIC introduces 36 or 24 fractions for each arrow. B, 31.3% of the known C. reinhardtii proteins were identified at 99% probability with at least one unique peptide. Of these, 75% have at least one phosphosite identified at a 95% localization confidence. C, 77% of the peptides from which proteins were inferred were phosphorylated. Of these, 70% had one or more MS/MS spectra that enabled 95% confidence in the assignment of their phosphosites. D, over half of the phosphoproteins were multiply phosphorylated. E, 89% of peptides with confidently assigned phosphosites were singly phosphorylated. F, the majority of the 15,862 confidently assigned nonredundant phosphosites were pSer.
Phosphosites shared by homologous proteins in the data set may not have peptidyl coverage in the neighborhood of the site to discern which homolog is modified at the site (see Phosphosite compilation under "Experimental Procedures"). Counting such sites only once gives 15,862 nonredundant and confidently-localized phosphorylation sites (supplemental Table S5). This is the lowest estimate of the true number of detected phosphosites, and is thus the set of unique phosphosites. For example, the STT7 homolog STN7 protein has two splice variants, Cre02.g120250.t1.1 PACid:27574692 ("variant 1") and Cre02.g120250.t2.1 PACid:27574693 ("variant 2") (supplemental Fig. S1). Fourteen peptides shared by each were detected, comprising five phosphosites. The presence of both proteins is inferred because detected unmodified peptide E 561 AMSESDPYGAAPSAMQVGSAINNAR 616 belongs only to variant 1, whereas detected phosphopeptide E 561 AMSEpSVGSAINNAR 605 belongs only to variant 2. It is possible that these proteins are identically phosphorylated except at S566, contributing in total 11 phosphosites to the phosphopeptidyl signal. By virtue of the peptidyl evidence, only the minimum number of sites is listed (six in this case) in the set of unique phosphosites, for the purposes of motif and statistical analyses.
The success of the overall phosphopeptide enrichment strategy is made clear by the overview of inferred proteins, phosphoproteins, and corresponding identified peptides and phosphopeptides (Fig. 1B, C, supplemental Tables S1-S4). Although 61% of the 4588 phosphoproteins are identified as having more than one phosphosite (Fig. 1D), almost 90% of the observed phosphopeptides are singly phosphorylated (Fig. 1E). The frequencies of phosphorylated Ser, Thr, and Tyr are 13,644, 2168, and 50, respectively, among all such sites (Fig. 1F). Considering the set of 23,030 phosphosites tabulated in supplemental Table S4, 332 have previously been observed; these are marked in the protein supplemental Tables S1, S2, S9, and S10. This is based on 1405 C. reinhardtii phosphopeptides published in references (6-12, 28), containing 2337 phosphosites, some of which are redundant calls (i.e. the published phosphopeptides may originate from more than one homologous protein). To date, our study contains the most comprehensive proteomics (29-33) and phosphoproteomics (6-9, 11, 12) data published for C. reinhardtii. Mining this rich data set facilitates enhanced understanding of a range of regulatory mechanisms controlling a variety of cellular processes, a sampling of which is expounded upon below.
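As a quick consistency check, the residue frequencies quoted above can be converted to shares of the 15,862 unique phosphosites, and the phosphoprotein count can be set against the 19,603-sequence search database. The sketch uses only numbers stated in the text; treating the search database size as the proteome size is an assumption made for this calculation.

```python
site_counts = {"pSer": 13644, "pThr": 2168, "pTyr": 50}
total_sites = sum(site_counts.values())

# Share of each phosphorylated residue among the unique phosphosites.
site_shares = {
    residue: round(100 * count / total_sites, 1)
    for residue, count in site_counts.items()
}

# Assumption: the 19,603-sequence search database approximates the proteome.
phosphoprotein_share = round(100 * 4588 / 19603, 1)

print(total_sites, site_shares, phosphoprotein_share)
```

The three residue counts sum to the reported 15,862 sites, with Ser accounting for roughly 86% of them.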
Cilia and Flagella-Cilia and flagella have been highly conserved throughout evolution and are among the most ancient cellular organelles, providing motility for primitive eukaryotic cells living in aqueous environments. For many years, C. reinhardtii has served as a model for the examination of the structure and function of flagella. Phosphorylation/dephosphorylation controls flagellar motility (34), signaling (35), length, and assembly (36), and the flagellar phosphoproteome is thus one of the most thoroughly studied organellar phosphoproteomes in C. reinhardtii. Previous analyses of whole cell and flagellar phosphoproteomes of C. reinhardtii collectively identified 258 phosphoproteins, demonstrating phosphorylation/dephosphorylation as a key regulatory mechanism of eukaryotic cilia (10-12, 37). In total, this includes 726 phosphosites, but some of these sites are redundant by the peptide evidence, as discussed above, and their localization probabilities are uncertain because no localization probability assessment was performed (10-12).
In the present study, 182 phosphoproteins were identified with 1013 phosphosites in total, 730 of which were localized with high confidence (supplemental Table S6), including 38 key flagellar structure and regulator proteins depicted in Fig. 2. Of the 258 phosphoproteins reported previously, 43 were also detected as phosphorylated in this study. In this 43-protein subset, 239 phosphosites have been identified (170 confidently) compared with 158 previously. The remaining 139 cilia and flagella-annotated proteins are novel phosphoproteins, comprising 560 confidently-localized phosphosites. It is noteworthy that the overlap of our phosphoproteins with those published is low, which may be because of differing experimental approaches and/or sampling conditions and furthermore reflects the dynamic nature of phosphorylation.
The 44 flagellar associated proteins (FAPs) comprise the largest subset of identified cilia and flagella-annotated proteins in this study; almost all have unknown function. Other major subsets identified here include the kinesin and kinesin-like proteins and outer and inner dynein arm components. In addition, seven intraflagellar transport proteins (IFTs) and six microtubule-associated proteins were identified. Similar to the previous studies, 23 kinases and three phosphatases were identified, including one dual specificity protein phosphatase CDC14 (PTEN 3).
Compared with known flagellar proteins previously reported (38), 42 of 101 flagella proteins (not to be confused with FAPs) were identified as phosphorylated in the present study, including two tubulins, 12 outer and inner dynein arm subunits, five radial spoke components, two central pair proteins, three nexin-dynein regulatory complex (N-DRC) proteins, 11 IFTs, and 10 miscellaneous proteins (Fig. 2, supplemental Table S6). In total, 124 phosphosites were identified in these proteins, 102 with confident localization. Among these proteins, known phosphoproteins include the outer dynein arm (ODA) heavy chain alpha (ODA11) (39), central pair kinesin KLP1 (40), and glycogen synthase kinase 3 (36). All three proteins are multiply phosphorylated, with three, five, and seven phosphosites, respectively.
Approximately 250 flagellar proteins in C. reinhardtii are known based on two-dimensional gels (41) and 600 based on MS analysis (38). We have identified over 30% of these as phosphorylated in the present study. This enrichment of the cilia/flagella phosphoproteome indicates it is significantly more complex than previously known. The results presented herein provide new insights into the important regulatory role of protein phosphorylation in C. reinhardtii flagella, as elaborated in the Discussion.
Thylakoid Membrane-Phosphorylation of thylakoid membrane proteins and its role in adjusting photosystem excitation pressure by "state transitions" was among the first reports on the function of protein phosphorylation in plants (42,43). With changes in light conditions, the state transition balances the excitation energy between photosystems I and II (PSI and PSII) with phospho-mediated shuttling of the light-harvesting complex II (LHC II) between their cores. C. reinhardtii thylakoid phosphoproteome studies have mainly focused on state transitions and environmental stress; to date, 19 thylakoid phosphoproteins corresponding to 28 unique phosphorylation sites have been reported (6,8). In the present study, 126 thylakoid phosphoproteins were identified, corresponding to 474 phosphosites (Fig. 3, supplemental Table S7). Of these, 362 were confidently localized; their best-supporting peptide spectrum Ascore and Mascot score assignments are provided in supplemental Table S2. The 126 thylakoid/plastid phosphoproteins include key proteins related to photosynthesis and carbon assimilation: 13 PSII and six PSII antenna proteins, 10 PSI and three PSI antenna proteins, four cytochrome b6f complex components, six ATP synthase proteins, four state transition kinase/phosphatase proteins, 10 pigment production proteins, five unique Calvin cycle proteins, and other proteins related to redox regulation (Fig. 3).
Given the evolving nature of C. reinhardtii gene and protein names, published peptides (6-9, 11, 12, 28) were compared with the sequences of proteins inferred in this study to determine whether these proteins or their phosphosites had previously been observed. All of the previously identified phospho-thylakoid proteins except lhcbm1 have also been identified as phosphorylated here, and 35 (3 with Ascore < 13) of the 44 phosphosites previously reported likewise have been observed. In total, 330 new phosphosites among thylakoid proteins were identified in the present study at high localization confidence.
Motif Analysis-To facilitate an accurate analysis of phosphosite-centered sequence motifs, the number of unique phosphosites was culled to 12,165 (supplemental Table S8) using an estimated false localization rate threshold of 1%, based on an Ascore threshold of 18 for most phosphosites and 40 for phosphosites with neighboring Ser, Thr, or Tyr residues (24). Most sites had observed sequences extending beyond six residues in either primary sequence direction. For those that did not, the highest scoring protein containing the phosphosite was used to infer the missing residues up to six positions away in the N-terminal and/or C-terminal direction. A phosphosite data set was thus constructed with each sequence entry 13 residues long centered on the phosphosite, and searched for motifs using Motif-X (http://motif-x.med.harvard.edu/motif-x.html) (26,27). We identified at high confidence 46 acidic, 36 basic, 70 proline-directed, 1 tyrosine, and 60 "other" motifs; a subset of these are depicted in Fig. 4 and all are tabulated in supplemental Table S9. To our knowledge, 67 of these are novel, whereas the remaining 146 were almost trivially matched to 11 two-residue motifs for known kinases using the resources at http://www.hprd.org/PhosphoMotif_finder (44) and http://phosida.de/ (45). In the former case, a human phosphoproteome motif database was interrogated, whereas the latter mined a phosphosite motif database of nine species. Most of the phosphomotifs in supplemental Table S9 are more specifically defined than their matched kinase motifs, and are thus evidence for novel C. reinhardtii kinase substrates.
FIG. 2. Flagella phosphoproteins identified in this study and their localization in a schematic diagram of a cross section of a flagellum from C. reinhardtii. The detailed information of each protein, including protein name, accession number, high confident phosphosites etc. can be found in supplemental Table S6. IFT, intraflagellar transport; ODA, outer dynein arm; IDA, inner dynein arm; RSP, radial spoke protein; N-DRC, nexin-dynein regulatory complex.
FIG. 4. ... (26,27). The background was the IPI A. thaliana proteome, the width was set to 13, occurrences set to 20, significance set to 1 × 10⁻⁷, and pT sites replaced at position 7 with pS. For tyrosine motifs the thresholds were relaxed: occurrences set to 5 and significance set to 1 × 10⁻⁶. Known motifs, listed in parentheses, were searched using the resources at http://www.hprd.org/PhosphoMotif_finder (44) and http://phosida.de/ (45). The comma-separated number is the number of occurrences of the colored motif.
Annotation of the C. reinhardtii Phosphoproteome-The phosphoproteome data set simultaneously samples all of the organelle compartments of C. reinhardtii and many central biological processes, including cellular protein modification process, biosynthetic process, metabolism, response to stress, signal transduction, and transport (supplemental Fig. S2A, S2B). Many phosphoproteins are involved with biomolecular binding and/or are themselves kinases (supplemental Fig. S2C, supplemental Table S10). These annotation statistics stem from a standardized GO (46,47) term analysis of the 4588 phosphoproteins identified with at least one confidently-localized phosphorylation site. Using Blast2GO Pro v2.6.4 (48), 3121 data set phosphoproteins were functionally annotated (supplemental Table S10). Blast2GO first searches the data set protein sequences against the full NCBInr protein database using BLAST and then relates the associated GO terms for the top hits, thereby providing a species-independent analysis of the identifications in concert. That 1467 of the 4588 phosphoproteins lack standard annotations is consistent with their Phytozome database descriptions: 1247 of these proteins are described as having unknown function.
Compared with the background C. reinhardtii Blast2GO-annotated proteome, 28 cellular component, 57 molecular function, and 32 biological process GO-slim terms were found to be enriched; the top 20 of each category are depicted in supplemental Fig. S3 (supplemental Table S11A-C). This enrichment is based on a Fisher's exact test comparison of the frequency of each term in the data set with its frequency in the C. reinhardtii reference set (49). In particular, the many proteins GO-annotated as thylakoid-associated, actin binding, motor activity, or cilium related suggest a rich contribution to phospho-mediated photosynthesis and flagellar motility in C. reinhardtii, and the preponderance of phosphorylated enzymes in glycolysis and the Calvin cycle may be indicative of phospho-regulation of basic metabolism. Although these salient phosphoprotein collections are further explored here, the richness of this phosphoproteomics data set is primed for mining to reveal insights into additional key biological processes (supplemental Tables S1, S2, and S10). The phosphoprotein, underlying phosphopeptide, and unique and total phosphosite tables have been assembled in Excel so that filtering and other tools can be used for facile perusal. These tables' entries are derived from their best MS/MS spectra matches, the statistics for which can be unhidden in the protein tables and are in plain view in peptide supplemental Table S3. In addition, the mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://www.proteomexchange.org) via the PRIDE partner repository (50) with the data set identifier PXD000783 and DOI 10.6019/PXD000783. Further, confidently localized phosphosites will be submitted to the plant P3DB database (5).
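The GO-term enrichment described above rests on Fisher's exact test. The sketch below shows how a one-sided enrichment p-value can be computed from the hypergeometric distribution using only the Python standard library. It is an illustration under stated assumptions, not the Blast2GO implementation: the function name, the choice of a one-sided test, the treatment of the data set as a subset of the reference set, and the example counts are all hypothetical.

```python
from math import comb


def fisher_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact (hypergeometric) enrichment p-value.

    N: proteins in the reference set
    K: reference proteins annotated with the GO term
    n: proteins in the data set (assumed here to be a subset of the reference)
    k: data set proteins annotated with the GO term
    Returns P(X >= k), the chance of seeing at least k annotated proteins
    when n proteins are drawn at random from the reference set.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts, for illustration only (not taken from the study).
p_value = fisher_enrichment_p(k=12, n=100, K=50, N=1000)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice such p-values are computed for every GO term and then corrected for multiple testing; that step is omitted here.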
TiO2-HILIC versus HILIC-PolyMAC-Because of the low abundance of phosphoproteins and the low stoichiometry of phosphorylation, robustness and high reproducibility are two key factors in conducting successful qualitative and quantitative phosphoproteomics studies. For both the TiO2-HILIC and HILIC-PolyMAC workflows used in this work, three biological replicates were analyzed. LC-MS/MS analysis proceeded via an LTQ-Orbitrap-Velos instrument for the first replicate in each workflow and a TripleTOF 5600 was used for the other two replicates (Fig. 1A). At the protein level, both TiO2-HILIC and HILIC-PolyMAC workflows showed high reproducibility regardless of acquisition platform (Fig. 5A-B). As expected, more variance in the biological replicates is seen at the peptide level (Fig. 5C, 5D), and comprehensive shotgun proteomic studies often rely on technical replicates to improve coverage (51).
The phosphopeptide enrichment afforded by both insoluble TiO2 microspheres and PolyMAC soluble dendrimer-particles was compared in this study (Fig. 6). Because PolyMAC enrichment is best employed for small peptide mixture quantities, HILIC fractionation was first performed. Conversely, TiO2 enrichment affords high selectivity and efficiency with large amounts of peptides, so HILIC fractionation was performed as a second step. Compared with the HILIC-PolyMAC workflow, the TiO2-HILIC workflow has the advantage of reduced enrichment time. As the order effect cannot be delineated from affinity interaction, workflow comparison is inclusive of the order and implementation of HILIC fractionation in the two methods (Fig. 1A). Although the two workflows are largely redundant at the protein level (Fig. 6A, 6B), the peptide data (Fig. 6C-6E) show the methods to be complementary. At the protein level, the striking overlap of the two methods (Fig. 6A) is likely due in part to the large number of LC-MS/MS acquisitions committed to each workflow. Consistent with the fact that enrichment workflows are "bottom-up" proteomics methodologies operating on peptides and not intact proteins, a bias is not observed between methods for detecting multiply phosphorylated proteins (Fig. 6B). At the peptide level, a difference between workflows arises (Fig. 6C-6E): TiO2-HILIC is biased toward singly phosphorylated peptides. Considering only peptides found by one or the other workflow (Fig. 6E), the difference is much more pronounced. This corroborates the findings that TiO2 enriches more singly phosphorylated peptides than IMAC (52), and that at the molecular level, PolyMAC enrichment is an improved analog of IMAC (15).

DISCUSSION

In this study, we report the most comprehensive exploration of the phosphoproteome of C. reinhardtii. We observed 4588 phosphoproteins distributed among every cellular component (supplemental Fig. S2A). Among these phosphoproteins, 4256 are novel.
Additionally, we identified 16,086 phosphosites with high-localization confidence, of which 15,862 are unique. In comparison to the published C. reinhardtii phosphopeptides, 98% of the 18,160 unique phosphopeptides identified in this study are novel. Given that 30% of proteins are subject to phosphorylation in a eukaryotic cell (53), we report the characterization of the majority of phosphoproteins (23%) in C. reinhardtii.
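The headline proportions quoted here and in the Fig. 6 caption follow from simple arithmetic on the reported counts, which the short Python sketch below reproduces. The variable names are introduced for illustration, and the shared-peptide count is derived on the assumption that every one of the 18,160 phosphopeptides was detected by at least one of the two workflows.

```python
# Counts reported in the text and in the Fig. 6 caption.
phosphoproteins = 4588
novel_phosphoproteins = 4256
unique_phosphopeptides = 18160
only_tio2_hilic = 5031
only_hilic_polymac = 5228

# Fraction of phosphoproteins that are novel.
novel_fraction = novel_phosphoproteins / phosphoproteins

# Peptides seen by both workflows = total minus the two exclusive sets
# (assumes each peptide was found by at least one workflow).
shared_peptides = unique_phosphopeptides - only_tio2_hilic - only_hilic_polymac
shared_fraction = shared_peptides / unique_phosphopeptides

# Share of the expected phosphoproteome covered, if 30% of all proteins
# are phosphorylated and 23% of all proteins were observed as phosphorylated.
coverage_of_expected = 0.23 / 0.30

print(f"novel phosphoproteins: {novel_fraction:.1%}")
print(f"peptides found by both workflows: {shared_peptides} ({shared_fraction:.1%})")
print(f"coverage of expected phosphoproteome: {coverage_of_expected:.1%}")
```

The derived shared fraction rounds to the 44% given in the Fig. 6 caption, and 23% out of an expected 30% corresponds to roughly three quarters of the predicted phosphoproteome, which is the sense in which the majority is reported.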
Our data reveal the global importance of phosphorylation in C. reinhardtii and highlight the importance of phosphorylation in cytoskeletal processes and energy production. The top frequency analysis (supplemental Fig. S2) and statistical enrichment analysis (supplemental Fig. S3) of GO terms best serve to highlight the global reach of the experiment, but because this was not a differential study, these analyses do not provide an overarching signpost to the most interesting biological processes involving protein phosphorylation. For example, 180 cytoskeleton and cilium proteins were phosphorylated, which signifies phospho-mediated action in flagella/cilium assembly and function (supplemental Table S6). Of particular note, more than 100 thylakoid proteins were phosphorylated, which ranked tenth among all the Cellular Component categories and underscores the important role phosphorylation plays in the photosynthetic process. In addition, more than 150 phosphoproteins were identified as associated with the mitochondrion, ninth among all categories. Like chloroplasts, mitochondria are of prokaryotic origin and have evolved by endosymbiosis. In C. reinhardtii, mitochondria occupy about 1%-3% of the cell volume of photosynthetically grown algae, whereas the chloroplast occupies about 40% (16). Given the small volume of the mitochondria, the proportion of phosphoproteins in this organelle is high and indicates that phosphorylation has an important role in mitochondrial biological processes. Overall, the widespread detection of phosphoproteins in C. reinhardtii reveals that phosphorylation is ubiquitous and critical to the C. reinhardtii regulatory machinery.

FIG. 6. The complementarity of enrichment methods for phosphoprotein and phosphopeptide detection. A, both workflows found 77% of the 4588 phosphoproteins inferred in this study in at least one biological replicate. B, there is no significant difference for either method in detecting multiply phosphorylated proteins. C, 44% of the 18,160 phosphopeptides with 95% site localization confidence were detected by both workflows in at least one biological replicate. D, among all 18,160 such peptides, TiO2-HILIC is more biased toward singly phosphorylated peptide enrichment. E, this trend is more pronounced when comparing the 5031 phosphopeptides only found in TiO2-HILIC samples and 5228 phosphopeptides only found in HILIC-PolyMAC samples.
Cilia and Flagella Protein Phosphorylation-Cilia and flagella are highly conserved organelles with sensory and motile functions. These organelles contain an inner core called the axoneme, which serves as a cytoskeleton-like structure. The axoneme contains nine microtubule (MT) doublets that surround a central pair (CP) of singlet MTs. Neighboring doublets are interconnected circumferentially by nexin links and are attached to the CP through radial spokes (RSs). Ciliary motility is generated by the outer dynein arms (ODAs) and inner dynein arms (IDAs). Phosphoproteins were identified in all flagellar structural elements, as well as among signaling proteins and proteins of unknown function (Fig. 2); the following subsections elaborate on these by localization and category.
Central Pair-Three central pair proteins were phosphorylated in the present study: KLP1, PF20, and PF6. KLP1 is essential for normal flagellar motility, modulating the spoke interaction of a neighboring central pair protein (54). Five KLP1 phosphosites were reported previously (10); all were detected in the present study, as well as three new phosphosites. Two phosphosites were identified in PF20. Of the 19 phosphosites detected here for PF6, 16 were identified with high localization confidence, and six were previously reported (10,11). The phosphorylation of these central pair components indicates that protein phosphorylation is a key regulator of flagellar motility (55).
Dyneins-The outer and inner dynein arms generate the forces driving ciliary and flagellar movements. The outer dynein arm of C. reinhardtii comprises three catalytic heavy chains (HCs), two intermediate chains (ICs), and 11 light chains (LCs) (56). The inner dynein arms, of which there are at least seven isoforms (56), are heterogeneous in composition and structural arrangement on the axoneme. Regulatory control of dynein motor function is thought to involve the phosphorylation of various components as well as a series of light chain proteins that are directly associated with the heavy chains (57). In the present study, 14 dynein components were identified as phosphorylated, including ODA6, ODA7, ODA9, ODA11, DHC3, DHC4, DHC6, DHC7, DHC8, DHC11, DLC1, and three predicted dynein heavy chains (CHLREDRAFT_174803, CHLREDRAFT_189792, and CHLREDRAFT_206178). Among these proteins, 41 phosphosites were detected, with 29 confidently localized; five of the sites were also previously reported (10,11). Multiple phosphorylation of the α dynein heavy chain (αDHC, ODA11) was reported 20 years ago without exact knowledge of the phosphosites (39). Three pSer were identified in the present study with high localization confidence, two of which were also reported by Pan and colleagues (10).
The outer arm dynein light chain DLC1 interacts directly with tubulin to form a γHC-LC1-microtubule ternary complex and functions as a conformational switch to control outer arm activity (58). The homolog of the outer dynein arm DLC1 gene of Paramecium tetraurelia (ODAL1) is essential for controlling the ciliary response by cAMP-dependent phosphorylation (59). One pSer was identified in DLC1 (Cre02.g092850.t1.2 PACid:27574591) in the present study (S56).
RSPs-The radial spoke is a ubiquitous component of 9 + 2 cilia and flagella, controlling dynein arm activity by relaying signals from the central pair of microtubules to the arms. The C. reinhardtii radial spoke contains at least 23 proteins, and 18 of them have been identified at the molecular level (54). Five of these proteins are located in the spoke head and the rest are in the spoke stalk. In the present study, five radial spoke proteins (RSPs) were identified as phosphorylated, including spoke stalk proteins RSP3, RSP7, RSP8, and RSP11 and spoke head protein RSP4, with 3, 3, 1, 1, and 1 phosphosites identified, respectively. In earlier studies, five spoke stalk proteins (RSP2, RSP3, RSP5, RSP13, and RSP17) were revealed to be phosphorylated by radiolabeling, together with more than 80 axonemal components (60). Our results are consistent with the observation that more spoke stalk proteins are phosphorylated than spoke head proteins. With the exception of RSP3, the phosphoproteins differ from those reported before, which may be caused by differences in culture/sampling conditions and strain. The protein kinase A anchor proteins (AKAPs) are a diverse group of polypeptides responsible for localizing protein kinase A (PKA) by binding to its regulatory subunit. RSP3 is one such anchor protein (AKAP97). Located at the base of the radial spoke, it attaches the radial spoke to the outer doublet, near the inner dynein arm (61). Its phosphorylation may play a role in localizing PKA to the inner dynein arm, thus mediating the phosphorylation of the arm components DHC3, DHC4, DHC6, DHC7, DHC8, etc. RSP11 is associated with RSP3 at the basal end of the spoke stalk, which is essential for normal flagellar beating and possibly for regulation (54). In the radial spoke structure model proposed by Yang and colleagues (54), RSP7 and RSP8 are close to RSP3 and RSP11.
RSP7 contains five predicted EF-hands that exactly match the consensus for Ca2+ binding and is predicted to be a calcium-dependent protein kinase (CDPK19). RSP7 might have a key role in Ca2+ control of flagellar motility. RSP8 has armadillo repeats, which are believed to function in protein-protein interactions. Thus, RSP3, together with RSP7, RSP8, and RSP11, appears to anchor PKA at the base of the radial spoke in a signaling network designed to directly or indirectly control the phosphorylation state of the inner arms. Apart from RSP3, none of these proteins has been identified as phosphorylated in previous studies. The phosphorylation of these RSPs implies that the radial spoke transmits signals from the central pair of microtubules to the dynein arms through phosphorylation.
Kinesin-Kinesin and kinesin-like proteins constitute the second largest group of phosphoproteins in flagella in our study, with 27 proteins and 245 phosphosites, of which 165 are confidently localized and only five were previously reported (10). Cilia contain multiple kinesins in addition to heterotrimeric kinesin-2 (FLA10), which is proposed to drive the anterograde movement of intraflagellar transport particles along the axoneme. Accessory motors such as the kinesins -2, -3, -9, -13, -14, -16, and -17 cooperate with FLA10 to confer functionally distinct cilia by controlling axoneme assembly, dynamics, and length by either inserting and moving membrane proteins along sensory cilia or by modulating the beating of motile cilia (62). Activation of a cell to generate new flagella induces rapid phosphorylation of CrKinesin-13, which indicates that phosphorylation of CrKinesin-13 may play a role in this process (63). In the present study, together with three sites of the above-mentioned kinesin-II motor protein (FLA10, Cre17.g730950.t1.2 PACid:27571860), 153 phosphosites were identified in kinesin proteins. Among these proteins, one armadillo repeat kinesin 3 (Cre01.g049650.t1.3 PACid:27579073) contains 24 unique high-confidence phosphosites, including two pThr and 22 pSer. Another kinesin motor family protein (Cre05.g235500.t1.3 PACid:27572513) contains 15 unique phosphosites. Regulation of the intraflagellar transport motor is poorly understood, but these data suggest that kinesin phosphorylation may play a critical role in its regulation.
IFT-Intraflagellar transport, the bidirectional movement of particles within flagella, is required for flagellar assembly. IFT particles are composed of at least 16 proteins, which are organized into complexes A and B (64). Eleven IFTs were identified in the current study, including complex A components IFT43, IFT140, and IFT122 and complex B components IFT46, IFT57, IFT72/74, and IFT80. We also identified D1BLIC. The phosphorylation of IFT43, IFT46, IFT57, and IFT72/74 has been reported previously, with 3, 4, 1, and 1 phosphosites, respectively (10,11). IFT72/74 has been shown to be the complex B core, together with IFT81 (65). IFT46 has been reported to be necessary for complex B stability and is specifically required to transport outer dynein arm complexes into the flagella. The insertional mutant null for IFT46 in C. reinhardtii has short, paralyzed flagella lacking dynein arms and has central pair defects (64). In the present study, three high-confidence phosphosites were identified in IFT46, with two identified previously (43); further study is needed to determine whether and how these phosphorylations are involved in this transport function. FLA10 kinesin-II is responsible for powering anterograde IFT; this transport process is required for the assembly and maintenance of the flagellum (66). Three phosphosites were identified in FLA10 kinesin-II. The motor for retrograde intraflagellar transport in Chlamydomonas is cytoplasmic dynein 1b, which contains the dynein heavy chain (DHC) DHC1b and the light intermediate chain (LIC) D1BLIC. D1BLIC is important for stabilizing dynein 1b and is required for retrograde intraflagellar transport (67). One pSer (S412) was identified in D1BLIC (g9904.t1 PACid:27570458).
Nexin-dynein Regulatory Complex (N-DRC)-Normal beating of cilia and flagella requires the control and coordination of thousands of dynein motors, and the N-DRC has been identified as an important regulatory node for orchestrating dynein activity. Previously, 12 N-DRC associated proteins were identified and eight were shown to be phosphorylated by ProQ Diamond staining (37), including DRC7 (FAP50) and FAP252. In the present study, N-DRC proteins DRC4 (PF2), DRC7 (FAP50), and FAP252 were identified with one, three, and six phosphosites, respectively. DRC4 (PF2) was not detected by staining in the study by Lin and colleagues (37), but other studies have demonstrated that mutations of the DRC4 gene (PF2) cause defects in C. reinhardtii N-DRC assembly (68). Mutations in the DRC4 homolog trypanin disrupt the cilia-dependent step of the Trypanosoma life cycle (69), whereas mutations in the homolog GAS8 cause defects typical of ciliary diseases in zebrafish (70). FAP252 is a calcium-dependent protein kinase that may play an important role in dynein regulation.
CDPKs-Calcium has been reported to be intimately involved in a variety of flagellar-related activities, and phosphorylation may be one mechanism by which these divergent flagellar activities are regulated (71). CDPKs are a group of kinases, present in plants but not in animals, harboring both protein kinase and calmodulin-like domains in a single protein. The presence of CDPKs in flagella of C. reinhardtii implies a role in flagellar-related activities. In total, 16 CDPK sequences were identified in the present study as phosphorylated, with 115 total and 80 confidently localized phosphosites. These include RSP7 and FAP252, which are binned in other colored categories in supplemental Table S6. In silico analysis of the C. reinhardtii genome identified 14 CDPKs (72); three were identified in the flagellar proteome, including CrCDPK1, 3, and 11 (12). In the present study's data set, 12 of the 14 in silico-identified CDPKs have been detected. CrCDPK3 is directly involved in flagellar biogenesis (72); here two of its phosphosites have been identified. In addition, three calcium-binding EF-hand family proteins GO-annotated as flagella associated proteins were identified as phosphorylated. Whether through autophosphorylation or as substrates of other kinases, CDPKs are very likely elements of flagellar activity.
Signaling Phosphoproteins-The flagellar proteome contains over 90 putative signal transduction proteins, including 21 protein kinases and 11 protein phosphatases (38). Phosphorylation of kinases and phosphatases has also been reported in C. reinhardtii flagella (10,11). Protein kinases are themselves regulated by different mechanisms, including autophosphorylation. In addition to the CDPKs mentioned above, we identified the following as phosphorylated: five MAPK kinases; cyclin-dependent kinase B1 (CDKB1); glycogen synthase kinase 3 (GSK3); dual specificity protein phosphatase (CDC14); two protein phosphatase 2A (PP2A) isoforms (PP2A3 and g4366.t1 PACid:27576130); and blue light receptor phototropin 2 (PHOT). In total, 47 phosphosites were assigned to these proteins, 38 with confident localization. It was reported that GSK3 can regulate the assembly and length of flagella in a tyrosine phosphorylation-dependent manner (36). In the present study, seven novel phosphosites were identified with confident localization in GSK3. Phototropin 2 is a signaling protein localized to flagella in C. reinhardtii (73). Ten of its phosphosites were characterized in this study, including two pSer first identified by Pan and colleagues (10). PP2A is localized to outer doublet microtubules and may regulate dynein activity (74); two PP2A isoforms were identified with one and two phosphosites here.
Flagellar Associated Proteins (FAPs)-Consistent with previous flagella phosphoproteome studies (10,11), we identified a large group of FAPs of unknown function (40), including three calcium-binding EF-hand family proteins and three ankyrin repeat proteins (ANK10, 17, 24). Further investigation is required to reveal the biological function and post-translational regulation of FAPs.
Thylakoid Membrane Protein Phosphorylation-In photosynthetic organisms, the conversion of light energy into electrochemical energy is driven by two photosystems, photosystems I (PSI) and II (PSII). These photosystems possess peripheral antenna systems that are termed light-harvesting complexes I (LHCI) and II (LHCII), respectively. CP29 (LHCB4H) is one of three minor chlorophyll a/b binding proteins of LHCII in plants and algae. CP29 is critical to efficient light harvesting, energy conduction, PSII macro-organization, and photoprotection in higher plants, and its reversible phosphorylation is mainly involved in photoinhibition recovery (75) and state transitions (76). It has been demonstrated that there are at least seven phosphorylation sites in C. reinhardtii CP29 and that the hyperphosphorylation of CP29 is induced under high light conditions, which in turn promotes dissociation of phospho-CP29 from the phospho-D1 protein. This may increase accessibility of light-damaged D1 to phosphatases and proteases and, consequently, influence the turnover of this important functional subunit of PSII under high light stress (6). In the present study, 10 phosphosites were identified in CP29 (Cre17.g720250.t1.2 PACid:27572055), including six of the previously reported sites (5,7). Importantly, the high proportion of doubly phosphorylated peptides (17 of all 38 unique CP29 phosphopeptides) indicates that the protein populates a highly phosphorylated state under our experimental conditions.
Orthologous protein kinases Stt7 and STN7 are required for LHCII phosphorylation and for state transitions in Chlamydomonas and Arabidopsis, respectively (77,78). Moreover, Stt7 is required for phosphorylation of the thylakoid protein kinase Stl1 under state 2 conditions, suggesting the existence of a thylakoid protein kinase cascade. It has been observed that Stt7 itself is phosphorylated at S533 in state 2, but analysis of S533A/D mutants indicated that this phosphorylation is not required for state transitions (8). In this study, pS533 is also observed, along with five new phosphorylation sites among two Stt7 isoforms discussed by example above (supplemental Fig. S1). First reported by Lemeille and colleagues, thylakoid protein kinase Stl1 (Cre12.g483650.t1.2 PACid:27581912) was also observed here as modified at T126, though no other high-localization-confidence phosphosites were identified.
Fast phospho-signaling in state transitions of A. thaliana directly involves PPH1/TAP38 phosphatase (79,80). The Chlamydomonas ortholog to this phosphatase has not been empirically identified. Among the thylakoid-annotated phosphoproteins identified in this study, thylakoid-associated phosphatase 38 (g4755.t1 PACid:27580688, supplemental Table S7) is found to be phosphorylated at T138 and possibly T171. Blastp analysis of this protein with PPH1/TAP38 shows 37% sequence identity with an E-value less than 10⁻⁶⁸, thus substantiating its Phytozome description.
The proteins highlighted here and in supplemental Table S10 are central to the light reactions of photosynthesis and have an average of 3.8 detected phosphosites. This corroborates the well-understood role of phospho-mediated signaling in state transitions. Significantly, nine thylakoid-associated proteins each exhibit at least one phosphosite consistent with the putative Stt7 kinase motif (Table I). These sites have at least two basic residues among positions -2, -1, +2, and +3 relative to a pThr. Two such sites, in proteins CP29 (LHCB4) and LHCB5, have previously been reported as substrate targets of Stt7. Of these proteins, CP29 is the most phosphorylated and has two potential Stt7-active sites. Counter to the overall distribution of Ser, Thr, and Tyr phosphosites (Fig. 1F), 9 of 11 sites in CP29 are pThr. This coincidence suggests CP29 may be exquisitely sensitive to modification by Stt7 at multiple sites, further underscoring its reported role in the Stt7-mediated dissociation of LHCII from PSII in the state 1 to state 2 transition. Despite this, the 474 thylakoid protein phosphosites are not likely to all be substrates of only Stt7 or Stl1. The signaling network for state transitions is but one mechanism by which the chloroplast may respond to stimuli (light quality) in a timescale amenable to phospho-PTM. These 474 sites are very probably integral to other putative networks.
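The putative Stt7 motif criterion stated above (at least two basic residues among positions -2, -1, +2, and +3 relative to a pThr) can be expressed as a short Python check. This is a minimal sketch rather than the procedure used to build Table I: the function name and the test windows are hypothetical, and counting only Lys and Arg as basic (not His) is an assumption.

```python
BASIC_RESIDUES = {"K", "R"}            # assumption: His is not counted as basic
MOTIF_OFFSETS = (-2, -1, 2, 3)         # positions relative to the phospho-Thr


def matches_putative_stt7_motif(window: str) -> bool:
    """Check a phosphosite-centred window against the putative Stt7 motif.

    `window` must have odd length (at least 7) with the phosphorylated
    residue at its centre. Returns True when the centre is Thr and at least
    two of positions -2, -1, +2, +3 are basic.
    """
    if len(window) < 7 or len(window) % 2 == 0:
        raise ValueError("window must be odd-length and at least 7 residues")
    centre = len(window) // 2
    if window[centre] != "T":
        return False
    n_basic = sum(window[centre + off] in BASIC_RESIDUES for off in MOTIF_OFFSETS)
    return n_basic >= 2


# Hypothetical 13-residue windows, for illustration only.
for candidate in ("AAAAKKTAAAAAA", "AAAAAKTARAAAA", "AAAAAKTAAAAAA", "AAAAKKSAAAAAA"):
    print(candidate, matches_putative_stt7_motif(candidate))
```

Because the criterion is deliberately loose, a match marks a site as a candidate Stt7 substrate only; it does not establish that Stt7 acts on it.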
Phosphoproteins in the Glycolytic Pathway-Flagellar beating requires large amounts of ATP. The glycolytic pathway is present in Chlamydomonas flagella as well as mammalian sperm flagella (38). The localization of glycolytic enzymes to cilia may be a way to maintain a constant ATP/ADP ratio along the length of the cilium. In the present study, seven of the 10 enzymes of the late glycolytic pathway required for the conversion of fructose 1,6-bisphosphate to pyruvate were phosphorylated. In total, 199 phosphosites, of which 145 were confidently localized, were identified in these proteins (supplemental Table S12). Although key enzymes in the glycolytic pathway were reported to be under protein phosphorylation control (81), glycolysis enzyme phosphorylation has not been widely reported. The high phosphorylation of glycolytic pathway enzymes led us to connect this pathway with flagella. A disproportionate share of these sites was identified on pyruvate kinase (PYK), with 64 phosphosites confidently localized among seven isoforms. Despite the high number of identified isoforms, only four of the 64 phosphosites are shared. The remaining 60 are unique, in that each peptidyl observation matches only one protein sequence (supplemental Table S4). Given that cytosolic glycolysis is largely regulated by phosphofructokinase (PFK) allosteric activation/deactivation by ATP release/binding, high phosphorylation of PYK suggests another regulatory function. Mitchell and colleagues (82) demonstrated that three enzymes of the lower half of the glycolytic pathway allow ATP production in C. reinhardtii flagella from the glycolytic intermediate 3-phosphoglycerate. There is a large negative free energy change in the final glycolysis reaction generating ATP by PYK-catalyzed phosphoenolpyruvate conversion to pyruvate. Phosphorylation control at this step, then, may be especially effective in regulating ATP generation in flagella.
Earlier in the glycolysis pathway, PFK may also be directly or indirectly phosphoregulated; PFK1 is phosphorylated at three sites. Indirect control may be mediated by the phosphorylation of fructose-2,6-bisphosphatase, with five sites confidently localized (supplemental Table S2) that potentially alter its allosteric binding with PFK.
Motif Analysis-In addition to hyperphosphorylation of proteins involved in sensing, motility, and energy homeostasis, we observed an enrichment of motifs that suggests that the basal phosphoproteome of C. reinhardtii is regulated by specific kinases (supplemental Table S9). Of these, the most predominant are extracellular-signal regulated kinase (ERK), protein kinase C (PKC), casein kinase II (CKII), and G protein coupled receptor (GPCR) kinase motifs, with 3357, 2093, 1363, and 609 (supplemental Table S9) phosphorylation sites matched, respectively. These sites are observed in a broad range of proteins (e.g. 609 GPCR sites are distributed among 876 proteins), and supplemental Table S13 lists the phosphomotifs observed for each protein found in supplemental Table S9 derived from the high-confidence list of unique phosphosites compiled in supplemental Table S8.
These motifs, which are derived from what is known in systems such as bacteria, yeast, and mammals, represent the likely kinase cascades and signaling pathways active in C. reinhardtii. However, defining these pathways is currently difficult in plant/algal systems; compared with other organisms, resources and knowledge of kinase-substrate relationships in Chlamydomonas are underdeveloped. For example, in some cases, downstream or upstream sequence motifs dozens or hundreds of residues removed from the phosphosite dictate target protein recognition and subsequent substrate phosphorylation. Such is the case for the mitogen-activated protein kinases (MAPKs), such that binding-domain distal phosphorylation only requires the simplest proline-directed phosphosite motif. In another example, although heterotrimeric G protein signaling has recently been discovered in the alga Chara braunii, it is not known to be present in Chlamydomonas. This, in addition to the fact that many plant proteins have been "rounded up" as GPCR-like proteins despite the dearth of functional confirmation (83), calls into question the role of GPCR-like motif containing proteins in Chlamydomonas and plants. Large-scale efforts such as this study, combined in the future with targeted biochemical probing of kinase-substrate relationships in plant systems (only recently explored in humans (84)), set the foundation for understanding how kinases mediate phosphorylation events in plant and algal systems.
Despite the difficulty in describing kinase-substrate relationships, particularly in algal/plant systems, our data set can be used to generate hypotheses given current tools and knowledge. Here, we present one striking example.
DNA ligase 1 (LIG1) is a replication/repair enzyme ubiquitous in living organisms. Here, we detected seven phosphopeptides (supplemental Table S3, protein accession number g7530.t1 PACid:27564004, rows 22650-22656), and seven phosphosites: S5, S47, T129, T149, S196, S203, and S237 (supplemental Table S2). Sites T129, S196, and S203 share the simple substrate motif of CDKs, [pS/pT]Px[K/R] (85). Site S237 nearly fits this motif, with one additional residue between Pro and Lys immediately C-terminal of pS237. Each of these four sites bins into several more-defined motifs; among them, respectively (composed from supplemental Table S3 and supplemental Table S13), are xxAAxG[pS/pT]Pxxxxx, xxxxAA[pS/pT]PxxxxA, xxxxxA[pS/pT]Pxxxxx, and xxAAxA[pS/pT]PAxxxx. Though it may be unlikely that these four motifs are associated with four distinct kinases, what is striking from the Motif-X pattern analysis is the compositional similarity of each of these 13-residue sites. This suggests that the CDK acting upon LIG1 requires not only a proline-directed motif with a downstream basic site but also a sequence context rich in Gly and Ala.
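The distinction drawn above between sites that fit the simple CDK substrate motif [pS/pT]Px[K/R] and a site such as S237 that fits it with one additional residue before the basic position can be made concrete with regular expressions. The Python sketch below is illustrative only: the function name, the category labels, and the example windows are hypothetical, and the windows are not the LIG1 sequences themselves.

```python
import re

# [pS/pT]Px[K/R]: phospho-acceptor, Pro, any residue, basic residue.
CDK_FULL = re.compile(r"[ST]P.[KR]")
# Same motif with one additional residue between Pro and the basic residue.
CDK_GAPPED = re.compile(r"[ST]P..[KR]")


def classify_cdk_site(window: str) -> str:
    """Classify a phosphosite-centred window (odd length, site at centre)."""
    centre = len(window) // 2
    from_site = window[centre:]          # phosphosite and C-terminal residues
    if CDK_FULL.match(from_site):
        return "full"
    if CDK_GAPPED.match(from_site):
        return "gapped"
    if from_site[:1] in ("S", "T") and from_site[1:2] == "P":
        return "proline-directed"
    return "none"


# Hypothetical 13-residue windows, for illustration only.
for candidate in ("AAGAAGSPAKAAA", "AAAAAASPAAKAA", "AAAAAATPAAAAA", "AAAAAASAAAAAA"):
    print(candidate, classify_cdk_site(candidate))
```

Matching is anchored at the central residue so that a motif occurring elsewhere in the window is not mistaken for one at the phosphosite.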
Frequency of Multi-phosphorylated Proteins in C. reinhardtii Compared with Higher Plants and Mammals-In this study, the relative distribution of pSer, pThr, and pTyr was 86.0%, 13.7%, and 0.3%, respectively. In contrast to these findings, the relative abundance of pTyr is typically observed to range from 1.5% to 2.4% in a variety of organisms (86-89). This result likely stems from differences in methodology and organism used; for example, in HeLa cells, observed pTyr abundances range from 0.7% to 1.8% (86,90).
Relative to the overall number of identified phosphoproteins, more multiple-phosphosite proteins were identified in C. reinhardtii than in higher plants (i.e. rice and Arabidopsis), but fewer than in mammalian organisms. Multiple modifications do not necessarily occur simultaneously on individual protein molecules. If they did, multiple phosphorylations could reflect concerted regulation of a single protein's function via one or multiple pathways. Conversely, multiple phosphorylations could arise from independent kinase regulation at distinct sites on different molecules of a protein ensemble. Sixty percent of phosphoproteins contained multiple sites in this study, whereas 30% were phosphorylated on four or more residues and 5% carried more than 12 sites (Fig. 1D). In contrast, Huttlin and colleagues (23) observed higher percentages of phosphoproteins with multiple phosphorylation sites in mouse tissues, with 80% of phosphoproteins containing multiple sites, 50% phosphorylated on four or more residues, and 10% carrying more than 14 sites. Again, differences between these two data sets may be because of organism of interest and/or experimental approach. Though a multi-phosphorylation analysis was not provided in recent Oryza sativa japonica (rice) and Arabidopsis phosphoproteome studies (89,91), a reasonable inference from the data sets' summaries is that fewer multiple-site phosphoproteins were identified in rice than in Arabidopsis: the average numbers of phosphosites per protein are 1.6 and 1.8 for rice and Arabidopsis, respectively. Based on the present study, the average number of phosphosites per protein in C. reinhardtii is 3.5, almost twice that of rice and Arabidopsis.
In large-scale mammalian phosphoproteome studies, this quotient (average phosphosites per protein) was found to be 5.7 and 4.3 for mouse and rat, respectively (23,92). Given the close phylogenetic relationship between mouse and rat, the difference between these two studies likely stems mainly from the different phospho-enrichment methods used (IMAC and TiO2). This indirectly suggests that IMAC tends to enrich more multi-site phosphopeptides than TiO2 does. Multiple factors clearly contribute to these differences, including organismal biology and features of the experimental setup, especially the choice of enrichment approach(es).
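The phosphosites-per-protein quotient compared above is simple arithmetic, and a minimal Python sketch can make the comparison explicit. The C. reinhardtii totals (15,862 phosphosites and 4588 phosphoproteins) are those reported in this study, and the other quotients are the values quoted from the cited studies. The function and variable names are illustrative assumptions and are not part of any published analysis pipeline.

```python
# Totals reported in this study for C. reinhardtii.
CHLAMY_PHOSPHOSITES = 15862
CHLAMY_PHOSPHOPROTEINS = 4588

# Quotients quoted in the text for other organisms (refs. 23, 89, 91, 92).
REPORTED_QUOTIENTS = {
    "rice": 1.6,
    "Arabidopsis": 1.8,
    "rat": 4.3,
    "mouse": 5.7,
}


def sites_per_protein(n_sites: int, n_proteins: int) -> float:
    """Average number of phosphosites per phosphoprotein."""
    if n_proteins <= 0:
        raise ValueError("n_proteins must be positive")
    return n_sites / n_proteins


def fold_relative_to(reference: float, quotients: dict) -> dict:
    """Express the reference quotient as a multiple of each other quotient."""
    return {name: reference / q for name, q in quotients.items()}


chlamy_quotient = sites_per_protein(CHLAMY_PHOSPHOSITES, CHLAMY_PHOSPHOPROTEINS)
folds = fold_relative_to(chlamy_quotient, REPORTED_QUOTIENTS)

if __name__ == "__main__":
    print(f"C. reinhardtii: {chlamy_quotient:.1f} sites per protein")
    for name, fold in folds.items():
        print(f"  {fold:.1f}x the {name} quotient")
```

With these inputs the C. reinhardtii quotient rounds to 3.5, which is just under twice the Arabidopsis value, slightly more than twice the rice value, and below both mammalian values.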
PolyMAC Enriches More Multiply Phosphorylated Peptides than TiO2 Heterogeneous Enrichment-IMAC enriches more multiply-charged phosphopeptides compared with other enrichment strategies. In Drosophila melanogaster Kc167 cells, 16% of the phosphopeptides enriched by IMAC were found to be multiply phosphorylated, compared with only 7% of the phosphopeptides enriched by TiO2 particles (93). We find a similar bias between PolyMAC and TiO2: 14.5% of the C. reinhardtii phosphopeptides enriched by PolyMAC contained more than one phosphosite, compared with only 2.5% of those enriched by TiO2.
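The percentages above can be derived from a list of phosphopeptide identifications by counting, per enrichment method, how many peptides carry more than one phosphosite. The following Python sketch illustrates that calculation under stated assumptions: the input layout (one `(method, n_phosphosites)` pair per unique phosphopeptide), the function name, and the toy identifications are all hypothetical and are not data from this study. Only the `reported` percentages are taken from the text.

```python
from collections import defaultdict


def multiply_phosphorylated_percent(peptides):
    """Percent of phosphopeptides per method carrying more than one phosphosite.

    `peptides` is an iterable of (method, n_phosphosites) pairs, one per
    unique phosphopeptide identification (an assumed input layout).
    """
    totals = defaultdict(int)
    multi = defaultdict(int)
    for method, n_sites in peptides:
        if n_sites < 1:
            continue  # not a phosphopeptide; skip it
        totals[method] += 1
        if n_sites > 1:
            multi[method] += 1
    return {m: 100.0 * multi[m] / totals[m] for m in totals}


# Hypothetical toy identifications, NOT data from this study.
toy = [
    ("PolyMAC", 1), ("PolyMAC", 2), ("PolyMAC", 1), ("PolyMAC", 3),
    ("TiO2", 1), ("TiO2", 1), ("TiO2", 1), ("TiO2", 2),
]
toy_percent = multiply_phosphorylated_percent(toy)

# Percentages quoted in the text: (affinity-based method, TiO2).
reported = {
    "Drosophila Kc167, IMAC vs TiO2 (ref. 93)": (16.0, 7.0),
    "C. reinhardtii, PolyMAC vs TiO2 (this study)": (14.5, 2.5),
}
# Express each bias as a ratio of the two percentages.
bias = {label: a / b for label, (a, b) in reported.items()}

if __name__ == "__main__":
    print(toy_percent)
    for label, ratio in bias.items():
        print(f"{label}: {ratio:.1f}-fold")
```

Expressed as ratios, the reported percentages correspond to roughly a 2.3-fold bias for IMAC over TiO2 in the Drosophila data and a 5.8-fold bias for PolyMAC over TiO2 here. These ratios compare different organisms and workflows, so they are descriptive only.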
The enrichment of multiply phosphorylated peptides by HILIC-PolyMAC is advantageous because it provides evidence that phosphorylations co-occur on a protein in time and in cellular location. This ameliorates a general problem of the bottom-up proteomics strategy: reported PTMs normally inform only what the most modified form in a protein's ensemble of modification states might look like; that is, we see a projection of this ensemble as a (possibly fictitious) highly modified state. Observing a multiply phosphorylated peptide unambiguously defines one such state. This may be vitally important for understanding the role of phosphorylation in signal transduction. Sequential phosphorylation events have been observed in Arabidopsis. Vainonen and colleagues (94) showed that T4 phosphorylation in PsbH was dependent on phosphorylation of T2. They proposed that phosphorylation of other residues in PsbH depends on the phosphorylation of T2 and that this second phosphorylation requires STN8 kinase. These sequential phosphorylation events raise the question of cross-talk between thylakoid protein kinases at the substrate level (94).
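The distinction between the projected "most modified" state and a directly observed co-occurrence can be illustrated with a small Python sketch. It assumes each identified phosphopeptide is represented as a set of site labels. The function names and the two toy observation lists are hypothetical and do not represent data from this study or from reference 94.

```python
def projected_sites(peptides):
    """Union of all phosphosites seen on any peptide.

    This is the 'most modified' projection: it lists every site observed,
    without showing that the sites were ever present on one molecule.
    """
    sites = set()
    for observed in peptides:
        sites |= set(observed)
    return sites


def cooccurring_site_sets(peptides):
    """Site combinations directly observed together on a single peptide."""
    return {frozenset(p) for p in peptides if len(p) > 1}


# Hypothetical observations for PsbH; each set is one identified peptide.
singly_only = [{"T2"}, {"T4"}]        # only singly phosphorylated peptides
with_doubly = [{"T2"}, {"T2", "T4"}]  # includes a doubly phosphorylated peptide

if __name__ == "__main__":
    for label, obs in (("singly only", singly_only), ("with doubly", with_doubly)):
        print(label, sorted(projected_sites(obs)), cooccurring_site_sets(obs))
```

Both toy cases project to the same pair of sites, but only the second contains a peptide showing T2 and T4 phosphorylated on the same molecule. This representation can only link sites that fall within a single peptide.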
Summary-Given the critical role of phosphorylation in the underlying biology of C. reinhardtii, proteomic approaches that focus on phosphopeptides help uncover how complex post-translational modification of proteins is translated into a coherent biological response. The present study provides a rich exploration of the phosphoproteome of C. reinhardtii, with observation of 4588 phosphoproteins distributed among every cellular component. The enrichment of phosphorylated proteins involved in cytoskeletal processes and energy maintenance is particularly interesting, but the overall distribution indicates the global reach of our experiment. Thus, continued widespread and targeted probing of this data set will significantly enhance understanding of a range of regulatory mechanisms controlling a variety of C. reinhardtii cellular processes and will serve as a critical resource for the microalgal community.