Coexpressed subunits of dual genetic origin define a conserved supercomplex mediating essential protein import into chloroplasts

Significance. Chloroplasts are of vital importance in photosynthetic eukaryotic organisms. Like mitochondria, they contain their own genomes. Nevertheless, most chloroplast proteins are encoded by nuclear genes, translated in the cytosol, and must cross the chloroplast envelope membranes to reach their proper destinations inside the organelle. Despite the fundamental role of chloroplast protein import, our knowledge of the machinery that catalyzes it is incomplete and controversial. Here, we address the evolutionary conservation, composition, and function of the chloroplast protein import machinery using the green alga Chlamydomonas reinhardtii as a model system. Our findings help clarify the current debate regarding the composition of the chloroplast protein import machinery, provide evidence for cross-compartmental coordination of its biogenesis, and open promising avenues for its structural characterization.

FPKMs; 2) quantile normalization with the R package preprocessCore; 3) normalization by the mean of each gene across all samples, resulting in a scale-free dataset. For A. thaliana, we collected microarray data from AtGenExpress and processed them as described above. Gene lists were compiled from literature searches and BLAST results using the A. thaliana or P. sativum genes. These lists are provided in SI Appendix, Tables S1 and S2. Coexpression was visualized with the R package corrplot, with the order of genes within each set (proteasome and chloroplast translocon) determined by hierarchical clustering using a combination of the "hclust" and "FPC" methods in corrplot. The distribution of the Pearson correlation coefficient (PCC) values for each gene set was plotted in R using the density function. To identify genes that are coexpressed with C. reinhardtii TIC20, we calculated the mutual ranks (MR) associated with all gene pairs (1, 2) and then converted MR values into network edge weights (an edge being the distance between two genes, or nodes). For the highest stringency, we opted for the formula with the fastest rate of decay, $\mathrm{edge} = e^{-(MR-1)/5}$, and deemed a gene to be coexpressed with TIC20 only if its network edge weight with TIC20 was greater than 0.01.
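The conversion from mutual ranks to edge weights can be illustrated with a short Python sketch. It is not the authors' code (the analysis above was done in R). It assumes the commonly used definition of mutual rank as the geometric mean of the two directional PCC ranks, and the function name `mutual_rank_edges` and the toy expression matrix are hypothetical.

```python
import numpy as np

def mutual_rank_edges(expr, decay=5.0):
    """Return (mr, edge) matrices for a genes x samples expression array.

    Assumption: MR(a, b) = sqrt(rank of b in a's list * rank of a in b's list),
    where rank 1 is the partner with the highest Pearson correlation.
    """
    pcc = np.corrcoef(expr)            # gene-by-gene Pearson correlations
    n = pcc.shape[0]
    np.fill_diagonal(pcc, -np.inf)     # a gene is never its own best partner
    order = np.argsort(-pcc, axis=1)   # partners sorted by decreasing PCC
    rank = np.empty_like(order)
    rows = np.arange(n)[:, None]
    rank[rows, order] = np.arange(1, n + 1)  # rank[i, j]: rank of j for gene i
    mr = np.sqrt(rank * rank.T)        # geometric mean of the two ranks
    edge = np.exp(-(mr - 1.0) / decay) # edge = e^{-(MR-1)/5} when decay = 5
    np.fill_diagonal(edge, 0.0)        # no self-edges
    return mr, edge

# Toy data: 4 hypothetical genes measured in 4 samples.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.0, 3.2, 3.9],
    [4.0, 1.0, 3.0, 2.0],
    [2.0, 4.0, 1.0, 3.0],
])
mr, edge = mutual_rank_edges(expr)
query = 0                              # index standing in for TIC20
coexpressed = np.flatnonzero(edge[query] > 0.01)
```

With a decay constant of 5, an edge weight above 0.01 corresponds to MR < 1 + 5 ln(100), that is, an MR below roughly 24.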

Construction of plasmids. Construction of pRAM73.19. A chloroplast integration plasmid carrying the psbD 5'UTR fused to the tic214 (orf1995) coding sequence was generated in two sub-cloning steps, as described below. First, 210 bp of the psbD promoter and 5'UTR and about 2.4 kbp of the tic214 coding sequence were amplified from chloroplast genomic DNA using primer pairs SR212/SR247 and SR246/SR231, respectively. These two PCR products were gel-purified, mixed in equimolar amounts, and used as a template for an overlap extension PCR with primers SR212/SR231. The resulting PCR product was gel-purified, digested with ClaI and EcoRI, and cloned into the chloroplast integration vector pUCatpXaadA digested with the same restriction enzymes. The resulting plasmid was verified by digestion analysis and named pRAM72. Next, about 2.7 kbp of the region upstream of the tic214 5'UTR was amplified from chloroplast genomic DNA using primer pair SR228/SR229. This PCR product was gel-purified, digested with SacI and XbaI, and cloned into pRAM72 digested with the same restriction enzymes.

Construction of pED3 (to produce the recombinant protein used to raise the polyclonal antibody against Tic214). A DNA fragment of tic214 was amplified from genomic DNA using the primer pair ED5/ED6. This PCR product was gel-purified, digested with NcoI and XhoI, and cloned into the bacterial expression vector pET28a digested with the same restriction enzymes. The resulting plasmid was verified by digestion analysis and DNA sequencing and named pED3. The sequences of the primers used for cloning are reported below:

Construction of the Y14 and NY6 strains. To generate the Y14 strain, the A31 strain (3) was transformed with the chloroplast integration plasmid pRAM73.19, in which the aadA cassette (used as a selective marker) is located just upstream of the psbD 5'UTR fused to the tic214 gene. To generate the NY6 strain, a WT strain was transformed with a chloroplast integration plasmid named Orf1995:HA (NdeI) (gift of E. Boudreau, with the HA-11 epitope inserted in the NdeI site located at position 1202-1206 of the tic214 coding sequence). In this vector, tic214 is under the control of its endogenous 5'UTR, and the aadA cassette is adjacent to it. Chloroplast transformation was performed as previously described (3).
Chloroplast isolation. The Chlamydomonas cell wall-deficient strains CC-400 (cw-15 mt+; a kind gift from H. Fukuzawa, Kyoto University), CC-4533, and CS1_FC1D12 (CC-4533 transformed with a TIC20-YFP-FLAG3x nuclear transgene, available at the Chlamydomonas Center) were grown until mid-log phase in TAP medium with shaking at 100 rpm at 25˚C in constant white light (54 µmol m⁻² s⁻¹, provided by fluorescent bulbs). Intact chloroplasts were isolated as previously described (4) with slight modifications as follows: cells were harvested at 3,000 g for 4 min at 4˚C and washed with 50 mM Hepes-KOH, pH 7.8. After centrifugation at 3,000 g for 5 min at 4˚C, the cells were suspended in isolation buffer [50 mM Hepes-KOH, pH 7.8, 0.3 M sorbitol, 2 mM EDTA, 1 mM MgCl2, 0.1% (w/v) BSA, 0.5% (w/v) sodium ascorbate].
Just before cell breakage, cell suspensions were diluted to 0.5-3 mg chlorophyll/ml with isolation buffer and transferred to a 10-ml Luer-lock syringe. The cells were broken by two passages through a 27-gauge needle at a flow rate of 0.1 ml/s. The suspensions were overlaid onto 45%/80% Percoll step gradients [45% (v/v) or 80% (v/v) Percoll, 50 mM Hepes-KOH, pH 6.8, 0.33 M sorbitol, 1 mM Na4P2O7, 2 mM EDTA, 1 mM MgCl2, 1 mM MnCl2, 0.3% (w/v) sodium ascorbate] and centrifuged in a swinging-bucket rotor at 4,200 g for 15 min at 4˚C. Intact chloroplasts were collected from the 45%/80% interface, diluted with 5 volumes of HS buffer (50 mM Hepes-KOH, pH 7.8, 0.3 M sorbitol), centrifuged at 1,000 g for 3 min at 4˚C, and resuspended in HS buffer. After measuring the chlorophyll concentration, chloroplasts were centrifuged at 1,600 g for 2 min at 4˚C and stored either on ice or at −80˚C.
The concentrations of pre-proteins were adjusted to 10-20 µM with 8 M urea buffer (8 M urea, 250 mM NaCl, pH 7.5) and stored at −80˚C until use.
In vitro protein import experiments. In vitro import experiments with isolated intact chloroplasts prepared from Chlamydomonas strain CC-400 and purified pre-proteins were performed as described for A. thaliana (5) with the following minor modifications. Briefly, the pre-proteins pre-RbcS2 and pre-Fdx1, each carrying a C-terminal FLAG3x-TEV-Protein A-HIS6x tag, were microfuged for 10-20 min at 25˚C to remove insoluble materials and denatured again with an equal volume of 8 M urea dilution buffer (8 M urea, 20 mM DTT, 10 mM Hepes-KOH, pH 7.8) immediately before use.
Intact chloroplasts (1-2 mg chlorophyll) were incubated with denatured pre-proteins (100-200 nM) in 3-4 ml of HS buffer (containing 0, 0.3, or 3 mM Mg-ATP, 5 mM MgCl2, 5 mM DTT) for 15 min at 25˚C in the dark. After centrifugation at 1,600 g for 2 min at 4˚C, chloroplasts were washed twice with HS buffer. Chloroplasts were resuspended in HS buffer with or without (for thermolysin treatment) 0.1% protease inhibitor cocktail (for plant cells, Sigma, P-9959) and transferred to a new tube. Thermolysin treatment was carried out as previously described (5). For purification of translocation intermediates, chloroplasts were pelleted by centrifugation, snap-frozen in liquid nitrogen, and stored at −80˚C until use. To obtain soluble fractions containing stromal proteins, chloroplasts suspended in HS buffer containing 0.1% protease inhibitor cocktail were subjected to two freeze-thaw cycles, and the supernatant fraction was obtained after centrifugation at 21,500 g for 5 min at 4˚C.

Purification of translocation intermediates. Translocation intermediates obtained from in vitro import experiments with model pre-proteins were purified as previously described for A. thaliana (5) with slight modifications. Stored chloroplasts were suspended in solubilization buffer (1% water-soluble digitonin, 50 mM Tris-HCl, pH 7.5, 10% [w/v] glycerol, 250 mM NaCl, 5 mM EDTA, 5 mM DTT, 0.5% protease inhibitor cocktail) to a final concentration of 2 mg chlorophyll/ml and incubated for 20 min with gentle rotation at 4˚C. To remove insoluble materials, the chloroplast suspension was centrifuged at 21,500 g for 2 min at 4˚C, and the supernatant was then ultracentrifuged in a Hitachi S100 AT5 angle rotor at 100,000 g for 5 min at 4˚C. The resulting supernatant (~1 ml) was incubated with 20 µl of dimethyl pimelimidate (DMP) cross-linked IgG Sepharose 6 Fast Flow (GE Healthcare) resin for 2 h with gentle rotation at 4˚C. After four washes with 0.7 ml of 0.2% digitonin-containing TGS buffer (50 mM Tris-HCl, pH 7.5, 10% [w/v] glycerol, 250 mM NaCl, 1 mM DTT), the resin was further washed with the same buffer for 5 min with rotation at 4˚C to remove non-specifically bound proteins. The resin was transferred to a siliconized 0.5-ml tube and washed twice with 0.5 ml of 0.2% digitonin-containing TGS buffer. Bound translocation intermediates were eluted by cleavage with TEV protease in 100-120 µl of 0.2% digitonin-containing TGS buffer for 1 h at 25˚C. To capture the His-tagged TEV protease, 12 µl of cOmplete His-Tag Purification Resin (Roche) was added and incubated for 15 min at 25˚C. After centrifugation at 8,700 g for 1 min at 4˚C, the supernatant was applied to a Micro Bio-Spin column (Bio-Rad) to remove remaining resin, and the resulting flow-through was collected. For SDS-PAGE, the eluates were immediately denatured with sample buffer containing 16.7 mM Tris, pH 6.8, 50 mM tris(2-carboxyethyl)phosphine hydrochloride (TCEP-HCl) (Sigma, 66547), and 0.1% protease inhibitor cocktail at 37˚C for 30 min. For mass-spec analysis, the eluates were stored on ice until use.

Production of antisera. For expression of the Chlamydomonas Tic56 protein fragment in E. coli, the Cre17.g727100 coding sequence corresponding to amino acid residues 114-144 of Chlamydomonas Tic56 was synthesized: 5'-GGTGAACTGCGTCCGGTTCCGCGTAAAATTGTTCTGAGCCCGTATCAGTATGAGATGATTAACTATCAGCGTATGCTGATGCGCAAAAACATTTGGTATTATCGCGATCGTATGAATGTTCCGCGTGGTCCGTGTCCGCTGCATGTTGTTAAAGAAGCATGGGTTAGCGGTATTGTGGATGAAAATACCCTGTTTTGGGGTCATGGTCTGTATGATTGGCTGCCTGCAAAAAACATTAAACTGCTGCTGCCGATGGTTCGTACACCGGAAGTTCGTTTTGCAACCTGGATTAAACGTACCTTTAGCCTGAAACCGAGCCTGAATCGTATTCGTGAACAGCGTAAAGAACATCGTGATCCGCAAGAAGCAAGCCTGCAGGTTGAACTGATGCGT-3'. The synthetic DNA fragment was cloned into the expression vector pGEM-EX1 (Promega) with a C-terminal HIS6x tag. Polyclonal antiserum against Chlamydomonas Tic56 was produced by immunizing a guinea pig with the purified Chlamydomonas Tic56 fragment as the antigen. To produce the polyclonal antiserum against Chlamydomonas Tic214, a rabbit was immunized with a recombinant protein fragment corresponding to amino acid residues 628-737 of Chlamydomonas Tic214. This antigen was purified under denaturing conditions from BL21 E. coli cells transformed with pED3 (for details, see the section entitled "Construction of plasmids"). Polyclonal antisera against Chlamydomonas Tic20, Tic100, and Ctap2 were generated in collaboration with Yenzym, South San Francisco, by immunizing rabbits with the following peptide antigens: CRAEDAEKQDWKFGRNEG (Tic20), CRFGAYYREDEKGRVR (Tic100), and CQPATETVVEEGEKQE (Ctap2).
Protein extraction and immunoblot analysis. Unless stated otherwise, total protein extraction and immunoblot analysis were performed as previously described (3). When a TCA protein extraction protocol was employed, the cell pellet was resuspended in 10% TCA in acetone plus 0.07% β-mercaptoethanol (β-ME) (Bio-Rad #1610710). To allow protein precipitation, the lysate was incubated at -20°C for at least 45 min and then centrifuged at 20,000 g at 4°C for 15 min. The pellet was washed at least twice with cold acetone containing 0.07% β-ME. Finally, the pellet was dried in a speed vac and resuspended in a denaturing protein buffer containing 50 mM Tris-HCl pH 6.8, 300 mM NaCl, 2% SDS, and 10 mM EDTA. Prior to immunoblot analysis, the protein content of each sample was measured by BCA assay to ensure equal loading. Transfer to PVDF membranes was carried out with (for Tic214 and others) or without (for Tic56) 0.01% SDS in transfer buffer at 60 V for 2 h or 20 V overnight on ice. Proteins were detected with the ECL Prime Western Blotting System (GE Healthcare) or the Clarity ECL Western Substrate System (Bio-Rad) and exposed to X-ray films (Super RX, Fujifilm) or imaged with an Odyssey Fc Dual-Mode Imaging System (LI-COR Biosciences).

LC-MS/MS analysis of translocation intermediates.
After alkylation with iodoacetamide, purified proteins were digested with trypsin in solution. LC-MS/MS analysis was performed on an UltiMate 3000 Nano LC system coupled to a Q-Exactive hybrid quadrupole-Orbitrap mass spectrometer (Thermo Fisher Scientific). Peptides and proteins were identified with Mascot v2.3 (Matrix Science, London) by searching against the UniProt Chlamydomonas dataset and the Creinhardtii_281_v5.5.protein dataset.

LC-MS/MS analysis of total protein extracts upon Tic214 depletion and proteasome inhibition. Cell pellet preparation: at the beginning of the experiment, 20-ml aliquots of cell culture in the late exponential phase (at a cell density of about 7 × 10^6 cells/ml) were inoculated and diluted ten times in either regular TAP or TAP freshly supplied with 200 µM thiamine and 20 µg/ml vitamin B12. Thereafter, to keep cell growth in the exponential phase, all cultures were diluted every 24 h to a final cell concentration of about 7 × 10^5 cells/ml. At each harvesting time point (2, 4, and 6 days), about 4 × 10^8 cells were pelleted at 1,000 g for 5 min at RT. After being resuspended in 25 ml of the same type of growth medium (i.e., −/+ Vit), they were incubated for 3 h in the presence of 30 µM MG132 (Sigma Aldrich #M7449). Then, cells were pelleted again, frozen in liquid nitrogen, and stored at -80°C until use. Cell pellet processing: each frozen cell pellet was weighed and resuspended in 5 volumes of lysis buffer (e.g., 100 mg frozen pellet = 500 µl lysis buffer) containing 100 mM Tris-HCl pH 8, 600 mM NaCl, 4% SDS, and 20 mM EDTA and freshly supplemented with MS-SAFE Protease and Phosphatase Inhibitor Cocktail (Sigma Aldrich #MSSAFE). Cells were disrupted by constant agitation in this buffer for 30 min at 4°C. Then, the protein mixture was further denatured for 30 min at RT and centrifuged at 21,000 g for 30 min at 4°C to remove cellular debris. The supernatant (i.e., total protein) was transferred to a clean Eppendorf tube, and a 5-µl aliquot of this clear lysate was used to determine protein concentration by BCA assay (Perlaza et al., 2019). Sample preparation for mass-spec analysis: for each time point, 60 µg of total protein extract was mixed with Laemmli sample buffer (Bio-Rad #1610747) freshly supplemented with β-ME, heated for 30 min at 37°C, and loaded onto a polyacrylamide gel (Any kD™ precast protein gel, Bio-Rad #4569034). To avoid cross-contamination, one empty well was left between the different protein samples.
Proteins were allowed to electrophorese into the gel for 15 min at 100 V, visualized by Colloidal Coomassie staining, excised as a single gel band of 0.5 × 1.5 cm, transferred to a clean Eppendorf tube containing a 1% acetic acid solution, and sent to the Stanford University Mass Spec Core. Mass-spec analysis: the identification of peptides and proteins by tandem mass spectrometry was carried out by the Stanford University Mass Spec Core using the Byonic software package. Output data were organized in two different types of Excel spreadsheets, one for proteins and one for peptide-spectrum matches (PSMs), and were summarized in a heatmap (Dataset S8). Data mining: only proteins for which at least 10 MS-peptides could be identified in at least one of the six conditions were taken into consideration for further analysis. Their localization was predicted using the Predalgo software (6). In 80% of the cases (345/427), the chloroplast localization was confirmed by another prediction software, ChloroP (7). Only proteins for which sequences derived from their predicted cTP could be detected upon Tic214 depletion were annotated as potential TIC clients. This information is available in Dataset S9. The PhytoMine interface (https://phytozome.jgi.doe.gov/phytomine) was used to identify potential A. thaliana orthologs of these proteins, which are listed in SI Appendix, Table S6. To assess whether Tic214 knockdown affects chloroplast stress-responsive proteins encoded by nuclear genes whose upregulation is impaired in the absence of MARS1 (i.e., MARS1-dependent), we added +1 to all spectral counts and determined the protein fold change for each timepoint as log2(spectral counts Tic214 OFF / spectral counts Tic214 ON). A protein was considered affected by Tic214 knockdown when its fold change was ≥ 2 in at least one timepoint (768 proteins). This information is available in Dataset S10. Lists of chloroplast stress-responsive genes and MARS1-dependent genes were extracted from (8).
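The spectral-count filtering and fold-change arithmetic described above can be sketched in a few lines of Python. This is only an illustration (the reported calculations were done in R). The function names and the example counts are hypothetical, and the sketch assumes that the threshold of 2 applies to the log2 value, as the sentence is written.

```python
import math

def passes_peptide_filter(counts_by_condition, min_peptides=10):
    """Keep a protein if any condition reaches the minimum peptide count."""
    return max(counts_by_condition) >= min_peptides

def log2_fold_changes(off_counts, on_counts, pseudocount=1):
    """Per-timepoint log2((OFF + 1) / (ON + 1)) from spectral counts."""
    return [
        math.log2((off + pseudocount) / (on + pseudocount))
        for off, on in zip(off_counts, on_counts)
    ]

def is_affected(off_counts, on_counts, threshold=2.0):
    """True if the log2 fold change reaches the threshold at any timepoint.

    Assumption: the cutoff is applied to the log2 value itself.
    """
    return any(fc >= threshold
               for fc in log2_fold_changes(off_counts, on_counts))

# Hypothetical spectral counts at days 2, 4, and 6.
tic214_off = [3, 12, 30]
tic214_on = [0, 2, 4]
fold_changes = log2_fold_changes(tic214_off, tic214_on)
affected = is_affected(tic214_off, tic214_on)
```

The pseudocount keeps the ratio defined when a protein has zero spectral counts in one of the two conditions.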
All calculations were done in R (R project v3.5.1) (www.R-project.org) using a combination of the packages stringr (https://CRAN.R-project.org/package=stringr), dplyr (https://CRAN.R-project.org/package=dplyr), gplots (https://CRAN.R-project.org/package=gplots), and custom scripts.

(C) Coimmunoprecipitation of total protein from NY6 containing HA-tagged Tic214. Cellular extracts from NY6 and wild type (WT) were incubated with an affinity matrix containing HA antibodies. After extensive washing of the matrix, bound proteins were eluted with SDS buffer and immunoblotted with HA antiserum. The black arrows highlight the position of the 110 kDa protein band detected by the Tic214 antibody in the immunoblot. (D) The entire tic214 mRNA is translated as a protein of 232 kDa. After immunoprecipitation with anti-HA antibodies, proteins were digested with trypsin and analyzed by mass spectrometry. The distribution of the identified peptides of Tic214 (indicated in blue) over the entire Tic214 sequence (between D15 and K1979) is drawn to scale.

Fig. S4. 2D blue native/SDS-PAGE separation of a mock purified sample.
(A) A mock purified sample with untagged Tic20 was analyzed as in Fig. 4A. (B) A mock purified sample with untagged Tic20 was analyzed as in Fig. 4B.

Fig. S5. 2D blue native/SDS-PAGE analysis of purified translocation intermediates.
(A) Pre-RbcS2 was used for in vitro import experiments with Chlamydomonas chloroplasts in the presence or absence of ATP. Translocation intermediates were purified and analyzed by 2D blue native/SDS-PAGE separation followed by silver staining. Mock-purified samples prepared from the same amounts of chloroplasts without the addition of pre-proteins were also analyzed. Bands containing proteins identified by mass spectrometry are labeled. (B) ATP-dependent association of Tic214 with translocating pre-proteins is shown as measured by total ion current detected for Tic214-derived peptides. The numbers of the starting amino acids of the identified peptides in full-length Tic214 are shown below the horizontal axis. In the negative control, no pre-protein was added to the chloroplasts.

(D) Y strains were obtained by transformation of A31 with the pRAM73.19 plasmid containing the psbD 5' leader fused to the CDS of tic214 and the aadA spectinomycin resistance cassette. The WY1 strain was obtained in the same way except that the wild-type cell line was used for transformation. The authentic tic214 locus was examined by PCR using the primers shown in panel A, while the chimeric psbD5'UTR:tic214 was analyzed by PCR using the primers shown in panel B. (E) The same PCR reactions shown in panel D were repeated for A31 and Y14. In this case, a 1/100 dilution of the genomic DNA was also tested to make sure that at least one gene copy per chloroplast is detectable, as there are ~80 copies of chloroplast DNA molecules per chloroplast in Chlamydomonas. Primers shown in panel C (spanning a region of the tic214 CDS) were used as loading control.

(A) Venn diagram highlighting the number of chloroplast stress-responsive proteins that could be detected by mass spectrometry (n = 91). About 80% (n = 78) were also differentially expressed upon tic214 depletion (Protein IDs are listed in Dataset S10). Chloroplast stress-responsive proteins are defined as proteins encoded by nuclear genes differentially expressed upon downregulation of the chloroplast Clp protease and in response to excessive light in Chlamydomonas cells (8). (B) Venn diagram highlighting the number of proteins encoded by MARS1-dependent genes and detected by mass spectrometry (n = 46). Over 95% (n = 45) were differentially accumulated upon tic214 depletion (Protein IDs are listed in Dataset S10). MARS1 encodes a critical component of the chloroplast unfolded protein response (8). (C) Heatmap showing the proteins encoded by chloroplast stress-responsive genes and differentially accumulated upon tic214 depletion. The red squares on the side indicate those proteins encoded by MARS1-dependent genes.

Tables

Table S1. Arabidopsis components of the TOC complex and their putative Chlamydomonas orthologs from BLAST analysis. (Supplemental data for Fig. 1)
Table S2. Proposed Arabidopsis components of the TIC complex and their putative Chlamydomonas orthologs from BLAST analysis. (Supplemental data for Fig. 1 and Fig. 2)
Table S3. Spectral counts for proteins coimmunoprecipitated with Tic20 and Tic214. (Supplemental data for Fig. 2)
Table S4. Genes coexpressed with Chlamydomonas TIC20. (Supplemental data for Fig. 2)
Table S5. Spectral counts for proteins associated with pre-Fdx1 translocation intermediates. (Supplemental data for Fig. 5)
Table S6. Putative Arabidopsis orthologs of Chlamydomonas chloroplast proteins whose import is affected upon depletion of Tic214. (Supplemental data for Fig. 7)

Table S1
Arabidopsis components of the TOC complex and their putative Chlamydomonas orthologs from BLAST analysis. In the case of multiple hits, the putative Chlamydomonas ortholog with the smallest e-value is highlighted with an asterisk. Chlamydomonas orthologs for Toc33, Toc132, and Toc159 could not be identified, based on e-values and predicted protein length.

a Gene is part of the plastid translocon based on our BLAST analysis (SI Appendix, Tables S1 and S2).
b Encoded protein was identified during the coimmunoprecipitation studies presented in this manuscript.

Table S5
Spectral counts for proteins associated with pre-Fdx1 translocation intermediates. The three asterisks indicate genes coexpressed with TIC20.

Table S6
Putative Arabidopsis orthologs of chloroplast protein precursors detected upon depletion of Tic214 in Chlamydomonas.

Columns: Chlamydomonas Gene ID; Chlamydomonas Gene Name; Arabidopsis Ortholog Gene ID; Arabidopsis Ortholog Gene Name