Assigning mitochondrial localization of dual localized proteins using a yeast Bi-Genomic Mitochondrial-Split-GFP

A single nuclear gene can be translated into a dual localized protein that distributes between the cytosol and mitochondria. Accumulating evidence shows that mitoproteomes contain many of these dual localized proteins, termed echoforms. Unraveling the existence of mitochondrial echoforms using current GFP (Green Fluorescent Protein) fusion microscopy approaches is extremely difficult because the GFP signal of the cytosolic echoform will almost inevitably mask that of the mitochondrial echoform. We therefore engineered a yeast strain expressing a new type of Split-GFP that we termed Bi-Genomic Mitochondrial-Split-GFP (BiG Mito-Split-GFP). Because one moiety of the GFP is translated by the mitochondrial machinery while the other is fused to the nuclear-encoded protein of interest translated in the cytosol, the self-reassembly of this Bi-Genomic-encoded Split-GFP is confined to mitochondria. With this strain, we could authenticate the mitochondrial importability of proteins and echoforms from yeast, but also from other organisms, such as the human Argonaute 2 mitochondrial echoform.


Introduction
Mitochondria provide aerobic eukaryotes with adenosine triphosphate (ATP), which involves carbohydrate and fatty acid oxidation (Saraste, 1999), as well as numerous other vital functions such as lipid and sterol synthesis (Horvath and Daum, 2013) and formation of iron-sulfur clusters (Lill et al., 2012). Mitochondria possess their own genome, a remnant of an ancestral prokaryotic genome (Gray, 2017; Margulis, 1975) that has been considerably reduced in size due to a massive transfer of genes during eukaryotic evolution (Thorsness and Weber, 1996). As a result, most of the proteins required for mitochondrial structure and functions are expressed from the nuclear genome (>99%) and synthesized as precursors targeted to the mitochondria by mitochondrial targeting signals (MTS), which in some cases are cleaved upon import (Chacinska et al., 2009). In the yeast S. cerevisiae, about a third of the mitochondrial proteins (mitoproteome) have been suggested to be dual localized (Ben-Menachem et al., 2011; Dinur-Mills et al., 2008; Kisslov et al., 2014), and have been named echoforms (or echoproteins) to accentuate the fact that two identical or nearly identical forms of a protein can reside in the mitochondria and another compartment (Ben-Menachem and Pines, 2017). Due to these two coexisting forms and the difficulty of obtaining pure mitochondria, determination of a complete mitoproteome remains challenging and has given rise to conflicting results (Kumar et al., 2002; Morgenstern et al., 2017; Reinders et al., 2006; Sickmann et al., 2003).
Among all possible methods used to identify the subcellular destination of a protein, engineering green fluorescent protein (GFP) fusions has the major advantage that these fusions can be visualized in living cells using epifluorescence microscopy. This method is suitable to discriminate the cytosolic and mitochondrial pools of dual localized proteins when the cytosolic fraction has a lower concentration than the mitochondrial one (Weill et al., 2018). However, when the cytosolic echoform is more abundant than the mitochondrial one, this will inevitably eclipse the mitochondrial fluorescence signal. To bypass this drawback, we designed a yeast strain containing a new type of Split-GFP system termed Bi-Genomic Mitochondrial-Split-GFP (BiG Mito-Split-GFP) because one moiety of the GFP is encoded by the mitochondrial genome, while the other one is fused to the nuclear-encoded protein to be tested. By doing so, both Split-GFP fragments are translated in separate compartments and only mitochondrial proteins or echoforms of dual localized proteins trigger GFP reconstitution and can be visualized by fluorescence microscopy of living cells.
We herein first validated this system with proteins exclusively localized in the mitochondria and with the dual localized glutamyl-tRNA synthetase (cERS) that resides and functions in both the cytosol and mitochondria, as we have shown previously (Frechin et al., 2009; Frechin et al., 2014). We next applied our Split-GFP strategy to the near-complete set of all known yeast cytosolic aminoacyl-tRNA synthetases. Interestingly, we discovered that two of them, cytosolic phenylalanyl-tRNA synthetase 2 (cFRS2) and cytosolic histidinyl-tRNA synthetase (cHRS), have a dual localization. We also confirmed the recently reported dual cellular location of cytosolic cysteinyl-tRNA synthetase (cCRS) (Nishimura et al., 2019). We further demonstrate that our yeast BiG Mito-Split-GFP strain can be used to better define non-conventional mitochondrial targeting sequences and to probe the mitochondrial importability of proteins from other eukaryotic species (human, mouse and plants). For instance, we show that the mammalian Argonaute 2 protein heterologously expressed in yeast localizes inside mitochondria.

Results
Construction of the BiG Mito-Split-GFP strain encoding the GFPβ1-10 fragment in the mitochondrial genome
We used the scaffold of the self-assembling Superfolder Split-GFP fragments designed by Cabantous and coworkers (Cabantous et al., 2005b; Pédelacq et al., 2006), where the 11 beta strands forming active Superfolder GFP are separated into a fragment encompassing the first 10 beta strands (GFPβ1-10) and a smaller one consisting of the remaining beta strand (GFPβ11). Seven amino acid (aa) residues of GFPβ1-10 and three of GFPβ11 were replaced in order to increase the stability and the self-assembly of both fragments (Figure 1-figure supplement 1). To increase the fluorescent signal and facilitate observation of low-abundance proteins, we concatenated and fused three β11 strands (GFPβ11-chaplet; β11ch) linked by GTGGGSGGGSTS spacers (see Materials and methods for DNA sequence, Figure 1-figure supplement 1, as in Kamiyama et al., 2016; Figure 1A).
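To make the layout of the chaplet concrete, the following minimal Python sketch assembles three copies of an eleventh-beta-strand peptide joined by the GTGGGSGGGSTS spacer quoted above. It is an illustration only: the strand sequence is a placeholder string of arbitrary length (the optimized sequence is given in Figure 1-figure supplement 1 and is not reproduced here), and placing a spacer only between consecutive strands is an assumption of this sketch, not a statement about the actual construct.

```python
# Spacer sequence quoted in the text.
SPACER = "GTGGGSGGGSTS"

# Hypothetical placeholder for the eleventh beta strand; NOT the real sequence.
PLACEHOLDER_B11 = "X" * 16


def build_chaplet(strand: str, copies: int = 3, spacer: str = SPACER) -> str:
    """Concatenate `copies` strands, inserting `spacer` between neighbours.

    Assumption of this sketch: spacers sit only between consecutive strands,
    so a chaplet of n strands contains n - 1 spacers.
    """
    if copies < 1:
        raise ValueError("a chaplet needs at least one strand")
    return spacer.join([strand] * copies)


chaplet = build_chaplet(PLACEHOLDER_B11)
# Three placeholder strands plus two spacers: 3 * 16 + 2 * 12 residues.
print(len(chaplet))
```

Replacing the placeholder with the real strand sequence would give the tag that is fused to each protein of interest, under the spacer-placement assumption stated above.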
Our objective was to integrate the gene encoding the GFPβ1-10 fragment into the mtDNA so that it will only be translated inside the mitochondrial matrix, while the GFPβ11ch fragment is fused to the nuclear-encoded protein of interest and thus translated by cytosolic ribosomes (Figure 1A). To achieve this, we constructed a strain (RKY112) in which the coding sequence of the ATP6 gene has been replaced by ARG8m (atp6::ARG8m), and where ATP6 is integrated at the mitochondrial COX2 locus under the control of the 5' and 3' UTRs of the COX2 gene (Supplementary file 1; Table 1; Figure 1-figure supplement 2A-C; see Materials and methods section for details). The RKY112 strain grew as well as wild-type yeast (MR6) on a respiratory carbon source (Figure 1B), produced ATP effectively (Figure 1C), and expressed Atp6 and all the other mitochondria-encoded proteins normally (Figure 1D). We next integrated the sequence encoding GFPβ1-10 at the atp6::ARG8m locus of the RKY112 mtDNA (Figure 1A; Figure 1-figure supplement 2). To this end, we first introduced into the ρ0 mitochondria (i.e. totally lacking mtDNA) of the DFS160 strain a plasmid carrying the GFPβ1-10 sequence flanked by 5' and 3' UTR sequences of the native ATP6 locus (pRK67, see Materials and methods for DNA sequence), yielding the RKY172 strain (bearing a non-functional synthetic ρ-S mtDNA, Figure 1-figure supplement 2C). This strain was crossed to RKY112 to enable replacement of ARG8m with GFPβ1-10. The desired recombinant clones, called RKY176, were identified by virtue of their incapacity to grow in media lacking arginine due to the loss of ARG8m and their capacity to grow in respiratory media (Figure 1B). Integration of GFPβ1-10 in mtDNA was confirmed by PCR (Figure 1-figure supplement 2E, Supplementary file 2) and Western blot with anti-GFP antibodies (Figure 2C). Finally, the BiG Mito-Split-GFP strain (Table 1) was obtained by restoring the nuclear ADE2 locus in order to eliminate interfering fluorescence emission of the vacuole due to accumulation of a pink adenine precursor (Fisher, 1969; Kim et al., 2002).
Figure 1. Engineering of the BiG Mito-Split-GFP system in S. cerevisiae. (A) Principle of the Split-GFP system. When present in the same subcellular compartment, two fragments of GFP, namely GFPβ1-10 and GFPβ11ch, can auto-assemble to form a fluorescent BiG Mito-Split-GFP chaplet (three reconstituted GFPs). The GFPβ1-10 sequence encoding the first ten beta strands of GFP has been integrated into the mitochondrial genome under the control of the ATP6 promoter. GFPβ11ch consists of a tandemly fused form of the eleventh beta strand of GFP and is expressed from a plasmid under the control of a strong GPD promoter (pGPD). The molecular weight of the tag is indicated. (B) Growth assay on permissive SC Glu plates, respiratory plates (SC Gly), and restrictive media lacking arginine (SC Glu -Arg) of the different strains used in the study (N = 2). All generated strains are derivatives of MR6. (C) ATP synthesis rates of the MR6 and RKY112 strains presented as the percent of the wild type control strain (N = 2). P-value was 0.7456 (not significant). 95% confidence interval was −273.4 to 229.9, R squared = 0.064. (D) Mitochondrial translation products in the MR6 and RKY112 strains (N = 2). Cells were grown in rich galactose medium. Pulse-chase of radiolabeled [35S]methionine + [35S]cysteine was performed by a 20 min incubation in the presence of cycloheximide. Total cellular extracts were separated by SDS-PAGE in two different polyacrylamide gels prepared with a 30:0.8 ratio of acrylamide and bis-acrylamide. Upper gel: 12% polyacrylamide gel containing 4 M urea and 25% glycerol. Lower gel: 17.5% polyacrylamide gel. Gels were dried and exposed to X-ray film. Representative gels are shown. The online version of this article includes the following source data and figure supplement(s) for figure 1: Source data 1. Respiratory competency and translation of mtDNA-encoded respiratory subunits of the strains used in this study. Source data 2. Statistics of the comparison of ATP synthesis rates between RKY112 and MR6 strains (related to Figure 1C). Figure supplement 1. Optimized sequence and secondary structure of the GFPβ1-10 and GFPβ11ch that were used in this study (related to Figure 1). Figure supplement 2. Engineering of the strains and verification of the correct integration of ATP6 under the control of COX2 gene UTRs or GFPβ1-10 under the control of ATP6 gene UTRs (related to Figure 1).

The BiG Mito-Split-GFP system restricts fluorescence emission to mitochondrially-localized proteins
The BiG Mito-Split-GFP system was first tested with Pam16, which localizes in the matrix at the periphery of the mitochondrial inner membrane, and Atp4, an integral membrane protein with domains exposed to the matrix (Kozany et al., 2004; Velours et al., 1988; Figure 2A). The BiG Mito-Split-GFP host strain was transformed with centromeric plasmids expressing either Pam16β11ch [...]. These observations confirmed that the GFPβ1-10 polypeptide is well expressed from the mtDNA, stably and correctly folded, allowing reconstitution of an active GFP upon association with the mitochondrial GFPβ11ch-tagged protein. So far, the positive controls we used for the proof of concept of the BiG Mito-Split-GFP approach are proteins of differing abundance: Atp4 (30000-40000 copies/cell) and Pam16 (3000 copies/cell) (Morgenstern et al., 2017; Vögtle et al., 2017). We will soon report, in bioRxiv, tests with other proteins with a known mitochondrial location and varying abundance to better estimate the sensitivity of the BiG Mito-Split-GFP system, including the GatF subunit of the GatFAB tRNA-dependent amidotransferase chromosomally expressed from its own promoter. This is a mitochondrial protein that has been reported to be present at only 40-80 copies (Vögtle et al., 2017).
We next tested the BiG Mito-Split-GFP system with a GFPβ11ch-tagged version of Pgk1, which is commonly used as a negative cytosolic marker protein to probe the purity of mitochondrial preparations. Pgk1β11ch and endogenous Pgk1 were well detected by Western blot of total protein extracts probed with anti-Pgk1 antibodies (Figure 2C). No GFP fluorescence was observed with Pgk1 [...] Figure 2C). This is an interesting observation considering that Pgk1 localizes at the external surface of mitochondria (Cobine et al., 2004; Kritsiligkou et al., 2017; Levchenko et al., 2016). It demonstrates that the BiG Mito-Split-GFP system does not yield any nonspecific fluorescence with cytosolic proteins, even when they are externally associated with the organelle (see also Source data 4). Another negative control (His3) that further confirms the absence of false-positive signal will be provided soon in bioRxiv. In conclusion, these data show that any GFPβ11ch-tagged protein that localizes inside the mitochondrial matrix or at the matrix-side periphery of the inner membrane triggers GFP reconstitution and fluorescence emission, making this emission a robust in vivo readout for the mitochondrial importability of proteins of nuclear genetic origin.
We next tested whether the BiG Mito-Split-GFP system also allows visualization of the mitochondrial echoform of a protein located in both the cytosol and the organelle. We chose the cytosolic glutamyl-tRNA synthetase (cERS) encoded by the GUS1 gene as a proof of concept. As we have shown, cERS is an essential and abundant protein of the cytosolic translation machinery, and a small fraction (15%) is located in mitochondria where it is required for mitochondrial protein synthesis and ATP synthase biogenesis (Frechin et al., 2009; Frechin et al., 2014). After transformation of the BiG Mito-Split-GFP strain with plasmids expressing a GFPβ11ch-tagged version of cERS under the control of either the GPD promoter (pGPD) or its own promoter (pGUS1), a GFP signal was observed only in [...]. These observations demonstrate that the BiG Mito-Split-GFP system enables a specific detection in vivo of the mitochondrial pool of cERS (mte cERS), without any interference by the cytosolic echoform, which is not possible when cERS is tagged with regular GFP (Frechin et al., 2009). We also expressed Pam16β11ch and Pgk1β11ch under the dependence of the GPD promoter at the TRP1 locus. Again, as shown with the plasmid-borne strategy, Pam16β11ch expression resulted in a specific mitochondrial fluorescence, while Pgk1β11ch gave no fluorescence (Figure 2-figure supplement 1B).
Figure 2. [...] β11ch was either expressed under the dependence of the GPD (pGPD) or its own promoter (pGUS1) from a centromeric plasmid. GFP reconstitution upon mitochondrial import was followed by epifluorescence microscopy (N = 3). (C) Immunodetection of the GFPβ1-10, cERSβ11ch and Pgk1β11ch fusion proteins in whole cell extract from the transformed BiG Mito-Split-GFP strain using anti-GFP and anti-Pgk1 antibodies, confirming expression of Pgk1β11ch. Loading control: stain-free. Representative gels are shown. (D) The strains described in the legend of panel (B) were used for three-dimensional reconstitution of the yeast mitochondrial network (N = 1). Z-stack images from Pam16β11ch, Atp4β11ch, cERSβ11ch and Pgk1β11ch were taken using an Airyscan microscope. Scale bar: 1 μm. (E) Flow cytometry measurements of total GFP fluorescence of the BiG Mito-Split-GFP strain stably expressing Pgk1β11ch or Pam16β11ch (N = 3). (F) The mitochondrial GatF protein was fused to the GFPβ1-10 fragment (mtGatFβ1-10), thereby targeting the first ten GFP beta strands to mitochondria after being transcribed in the nucleus and translated in the cytoplasm. This construct was coexpressed with either cERSβ11ch or Pgk1β11ch. The GFP reconstitution was monitored by epifluorescence microscopy. Mitochondria were stained with MitoTracker Red CMXRos. Scale bar: 5 μm. Representative fields are shown. The online version of this article includes the following source data and figure supplement(s) for figure 2: Source data 1. Micrographs of the BiG Mito-Split-GFP expressing Pgk1β11ch, cERSβ11ch, Pam16β11ch (related to Figure 2B). Source data 2. Confirmation of the expression of the GFPβ1-10, cERSβ11ch and Pgk1β11ch fusion proteins in whole cell extract from the transformed BiG Mito-Split-GFP strains (related to Figure 2C). Source data 3. Flow cytometry measurements of total GFP fluorescence of the three biological replicates of the BiG Mito-Split-GFP strain stably expressing Pgk1β11ch or Pam16β11ch (related to Figure 2F).
Using high-resolution Airyscan confocal microscopy, a typical 3D mitochondrial network was reconstituted from the fluorescence induced by the expression of Pam16β11ch, Atp4β11ch and cERSβ11ch in the BiG Mito-Split-GFP strain whereas, as expected, no fluorescence at all was detected with Pgk1β11ch (Figure 2D), which further illustrates the mitochondrial detection specificity of this system. These data were corroborated by flow cytometry analyses of the BiG Mito-Split-GFP strain stably expressing Pam16β11ch and Pgk1β11ch (Figure 2E). They will soon be complemented (in bioRxiv) with flow cytometry experiments testing whether the BiG Mito-Split-GFP system could be used in systematic screens for proteins with a mitochondrial localization.
We next evaluated whether the BiG Mito-Split-GFP approach represents a significant technical advance compared to the existing MTS-based Split-GFP methods that are currently used. To this end, we constructed cells (with a wild type mitochondrial genome) that co-express in the cytosol the mitochondrial protein GatF (with its own MTS) fused at its C-terminus with GFPβ1-10 (mtGatFβ1-10) and either cERSβ11ch (dual localized, positive control) or Pgk1β11ch (cytosolic, negative control) (Figure 2F, left panel). As expected, a strong and specific mitochondrial fluorescent signal was obtained with cERSβ11ch (Figure 2F, right panel). However, Pgk1β11ch resulted in a mitochondrial signal of similar intensity. This is presumably due to the location at the external surface of mitochondria of a small fraction of the Pgk1 pool that could interact with mtGatFβ1-10 prior to its import into the organelle. These results show that, due to the high affinity of both self-assembling Split-GFP fragments, the MTS-based strategy can generate a mitochondrial fluorescence without mitochondrial protein internalization (Figure 2F, right panel). These experiments suggest that compartment-restricted expression of the GFPβ1-10 fragment and GFPβ11ch-tagged proteins increases the reliability of identifying mitochondrial echoforms of dual-localized proteins.

Screening for mitochondrial relocation of cytosolic aminoacyl-tRNA synthetases
Originally, screening cytosolic aminoacyl-tRNA synthetases (caaRSs) that can additionally relocate to mitochondria was motivated by several inconsistencies concerning this family of enzymes. The first and most documented example concerns cERS (Frechin et al., 2009;Frechin et al., 2014). We showed that the fraction of cERS which is imported ( mte cERS) into mitochondria is essential for the production of mitochondrial Gln-tRNA Gln by the so-called transamidation pathway (Frechin et al., 2009;Frechin et al., 2014). In the latter, mte cERS aminoacylates the mitochondrial tRNA Gln with Glu thereby producing the Glu-tRNA Gln that is then converted into Gln-tRNA Gln by the GatFAB amidotransferase (AdT) (Frechin et al., 2009;Frechin et al., 2014). These results argued against the proposal that mitochondrial import of cQRS compensates for the absence of nuclear-encoded mtQRS in yeast (Rinehart et al., 2005). This being said, nothing excludes that cQRS can be imported into mitochondria to fulfill additional tasks beyond translation.
We therefore applied the BiG Mito-Split-GFP strategy to the S. cerevisiae caaRSs (see Supplementary file 4), aiming to discover new mitochondrial echoforms of caaRSs. We successfully expressed in the BiG Mito-Split-GFP strain the full length GFP [...] For unknown reasons, we failed to obtain the full-length GFPβ11ch-tagged versions of cCRS and cPRS despite repeated attempts, but successfully cloned the first hundred N-terminal aa residues of cCRS (N100cCRS) (Figure 3C). An unambiguous mitochondrial fluorescent signal was observed with cFRS2β11ch (the α-subunit of the α2β2 cFRS), cyte cHRSβ11ch and N100cCRS [...] (Koerner et al., 1987), it is possible that supernumerary mte cFRS2 we identified is not necessary for charging mitochondrial tRNA Phe but exerts some non-canonical functions, in addition to its role in cytosolic protein synthesis. The mitochondrial fluorescence triggered by expression of N100cCRSβ11ch suggests that this part of cCRS harbors a MTS, which has recently been proposed (Nishimura et al., 2019, see Discussion). The mitochondrial fluorescence triggered by cyte cHRSβ11ch is more intriguing. The most plausible hypothesis is that the MTS of the mte cHRS is longer than the one originally characterized. The other possibility is that there is indeed a second mitochondrial echoform of cHRS imported inside mitochondria through a cryptic MTS that has yet to be identified and, like for cFRS2, this new mte cHRS would then most probably exert a non-canonical function.
The Saccharomyces Genome Database standard gene names are used. The amino acid (aa) one-letter code is used for the aminoacyl-tRNA synthetase aa specificity and (-) means that the gene encoding the corresponding aaRS is missing. Two genes encode the cytosolic phenylalanyl-tRNA synthetase (cFRS) since the enzyme is an α2β2 heterotetramer. For echoforms, the position of the alternative initiation start codon is indicated and corresponds to the nomenclature described in Figure 3; briefly, (-number) means that the start codon of the mte aaRS is located (number) aa upstream of the one that starts translation of the corresponding cyte aaRS, while (Δnumber) means that the start codon of the cyte aaRS is located (number) aa downstream of the one that starts translation of the corresponding mte aaRS.
Figure 3. [...] Table S3). Genes encoding 18 out of the 20 yeast caaRS, including those encoding the α- and β-subunits of the cytosolic α2β2 FRS (cFRS2), and the cGRS2 pseudogene, as well as the four encoding the cytosolic echoforms of cGRS1 (cyte cGRS1), cARS (cyte cARS), cHRS (cyte cHRS) and cVRS (cyte cVRS) were cloned in the pAG414pGPDβ11ch [...]
As already mentioned, cARS, cGRS1, cHRS and cVRS genes are known to produce both cytosolic and mitochondrial forms of these proteins (Figure 3D). When mte cARSβ11ch, mte cGRS1β11ch, mte cHRSβ11ch and mte cVRSβ11ch (echoforms that start with the most upstream methionine initiator codon, Figure 3D) were expressed in the BiG Mito-Split-GFP strain, a mitochondrial GFP staining was, as expected, observed with these four mte caaRSs (Figure 3D). Conversely, cyte cARSβ11ch, cyte cGRS1β11ch and cyte cVRSβ11ch (versions without their MTS) did not produce any detectable GFP signal, confirming that mitochondrial localization of these echoforms depends on their MTS (Figure 3D; Figure 3-figure supplement 1A). The mitochondrial fluorescence produced by cyte cHRSβ11ch has already been discussed above.

Investigating non-conventional mitochondrial targeting signals in dual localized proteins
Unlike proteins with a MTS that is cleaved upon import into mitochondria, mte cERS does not undergo any processing (Frechin et al., 2009). Presumably, the mitochondrial targeting residues are located in the N-terminal (N-ter) region of cERS, as in precursors of mitochondrial proteins destined to the matrix. To identify them, we tagged with GFPβ11ch three N-ter domains of cERS of varying length that correspond to the first 30 (cERSβ11ch-N1), 70 (cERSβ11ch-N2) and 200 (cERSβ11ch-N3) residues of cERS (Supplementary files 3 and 4; Figure 4A) and we tested their ability to be imported in the mitochondria of the BiG Mito-Split-GFP strain (Figure 4B). All three peptides produced a GFP fluorescence signal that matched the labeling of mitochondria with MitoTracker Red CMXRos (Figure 4B). Consistently, no GFP fluorescence was detected with cERSβ11ch lacking the residues 1-30 or 1-200 (cERSβ11ch-ΔN1 and cERSβ11ch-ΔN2, respectively) (Figure 4B) despite detection of these truncated proteins in cells by Western blot (WB) (Figure 4C). For unknown reasons, the cERSβ11ch-N1 and cERSβ11ch-N2 constructs were not detected by Western blot but gave a proper mitochondrial fluorescence staining (Figure 4B and C). These data narrow down the MTS of cERS to the first 30 aa residues of its N-ter domain; this segment is made of a short β-strand and a 13 aa long α-helix (Simader et al., 2006) likely harboring the import signal. This further illustrates the strength of our technique for the identification of unconventional MTSs in dual localized proteins.
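The deletion-mapping reasoning of this paragraph can be restated as a short Python sketch. It records, for each construct, which cERS residues are retained and whether mitochondrial fluorescence was reported in the text, then asks which N-terminal window is sufficient (shortest fluorescent N-terminal fragment) and whether removing that window abolishes the signal. The construct shorthand (`N1`, `dN1`, ...) and the function names are illustrative choices for this sketch and do not come from the source.

```python
from typing import List, NamedTuple, Optional


class Construct(NamedTuple):
    name: str
    first: int            # first cERS residue retained (1-based)
    last: Optional[int]   # last residue retained; None = native C-terminus
    fluorescent: bool     # mitochondrial GFP signal, as reported in the text


# Results restated from the paragraph above.
CONSTRUCTS = [
    Construct("N1", 1, 30, True),
    Construct("N2", 1, 70, True),
    Construct("N3", 1, 200, True),
    Construct("dN1", 31, None, False),   # lacks residues 1-30
    Construct("dN2", 201, None, False),  # lacks residues 1-200
]


def shortest_sufficient_fragment(constructs: List[Construct]) -> Optional[int]:
    """Length of the shortest N-terminal fragment that still gives a signal."""
    lengths = [c.last for c in constructs
               if c.first == 1 and c.last is not None and c.fluorescent]
    return min(lengths) if lengths else None


def window_is_necessary(constructs: List[Construct], window: int) -> bool:
    """True if at least one construct lacks residues 1..window and none of
    the constructs lacking that window is fluorescent."""
    lacking = [c for c in constructs if c.first > window]
    return bool(lacking) and not any(c.fluorescent for c in lacking)


window = shortest_sufficient_fragment(CONSTRUCTS)
print(window, window_is_necessary(CONSTRUCTS, window))  # prints: 30 True
```

With the five constructs described here, the sketch returns the same conclusion as the text: the first 30 residues are sufficient, and constructs lacking them give no signal.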
Testing mitochondrial importability of plant and mammalian proteins using the BiG Mito-Split-GFP system
The BiG Mito-Split-GFP system is based on modifications in the mitochondrial genome for expressing the GFPβ1-10 fragment inside the organelle. Modifying the mitochondrial genome is thus far only possible in S. cerevisiae and Chlamydomonas reinhardtii (Remacle et al., 2006). Owing to the high degree of conservation of mitochondrial protein import systems (Lithgow and Schneider, 2010), we used the yeast BiG Mito-Split-GFP strain to test the mitochondrial importability of proteins from various eukaryotic origins. We first tested two glutamyl-tRNA synthetases from Arabidopsis thaliana, AthcERS and Athmt/chlERS. According to independent MTS prediction tools, AthcERS would be a cytosolic protein with a putative chloroplastic targeting signal (TargetP1.1), whereas Athmt/chlERS is strongly predicted to be located in mitochondria and chloroplast (Figure 5A). cDNAs encoding the AthcERS and Athmt/chlERS proteins were fused to GFPβ11ch (Supplementary files 3 and 4) and the resulting plasmids were transformed into the BiG Mito-Split-GFP strain. Expression of these proteins was confirmed by Western blot (Figure 5C). AthcERSβ11ch did not produce any GFP signal, whereas, consistent with its predicted localization, Athmt/chlERSβ11ch resulted in a specific mitochondrial fluorescence staining (Figure 5B). These data show that the yeast BiG Mito-Split-GFP system can be used to analyze mitochondrial localization of plant proteins.
Figure 3 continued. [...] Fig. S4A. (C) Fluorescence microscopy analysis of the BiG Mito-Split-GFP strain expressing the first 100 amino acids of the N-ter region of the cCRS fused to GFPβ11ch (N = 2). (D) Fluorescence microscopy analyses of the BiG Mito-Split-GFP strain transformed with pAG414pGPDβ11ch expressing the mitochondrial echoforms mte cGRS1, mte cARS, mte cHRS and mte cVRS. Schematics of cARS, cGRS1, cHRS and cVRS echoform expression in yeast. Expression can be initiated upstream of the initiator ATG +1 (mte cARS at ACG -75 and mte cGRS1 at TTG -69) but the synthesis of this echoform can also be initiated at the ATG +1. In this case, the expression of the cytosolic echoform is initiated downstream (cyte cHRS at ATG +60 and cyte cVRS at ATG +148). Mitochondria were stained with MitoTracker Red CMXRos. Scale bar: 5 μm. Representative fields are shown. The online version of this article includes the following source data and figure supplement(s) for figure 3: Source data 1. Confirmation, by WB, of the expression of the 18 full-length aaRSβ11ch and N100cCRSβ11ch in whole cell extracts from the transformed BiG Mito-Split-GFP strains (related to Figure 3). Figure supplement 1. Screening of caaRSs and expression level of each GFPβ11ch-tagged protein (related to Figure 3).
Figure 4. [...] Mitochondria were stained with MitoTracker Red CMXRos; scale bar: 5 μm. The secondary structure (according to Simader et al., 2006) of the smallest peptide that still contains the non-conventional MTS of cERS is described together with the amino acid sequence of each helix. Positively and negatively charged amino acids are shown in orange and blue, respectively. (C) Immunodetection of the cERS variants in BiG Mito-Split-GFP whole cell extracts using anti-GFP antibodies. Quantity of proteins loaded in each lane was estimated using anti-Pgk1 antibodies or by the stain-free procedure. The bands corresponding to the mutants N1 and N2 could not be detected. Representative fields or gels are shown. The online version of this article includes the following source data and figure supplement(s) for figure 4: Source data 1. Immunodetection of the cERS variants in BiG Mito-Split-GFP whole cell extracts using anti-GFP antibodies (related to Figure 4C).
We also used the BiG Mito-Split-GFP system to address a yet-unresolved question regarding the presence of mammalian Argonaute protein 2 (Ago2) in mitochondria. This protein mainly localizes to the nucleoplasm and cell junctions where it is required for RNA-mediated gene silencing (RNAi) by the RNA-induced silencing complex (RISC) (Hammond et al., 2000). In some studies, Ago2 was suggested to be associated with mitochondria, but it remains unclear whether it localizes at the external surface or inside the organelle (Barrey et al., 2011; Shepherd et al., 2017). Four different algorithms failed to predict a potential MTS in Ago2 proteins from human, mouse, Bos taurus, Danio rerio and Drosophila melanogaster, casting doubt on the mitochondrial import of Ago2 (Figure 5D). To help resolve this question, the BiG Mito-Split-GFP yeast strain was transformed with plasmids expressing mouse and human Ago2β11ch proteins (MmuAgo2β11ch and HsaAgo2β11ch, respectively, Supplementary files 3 and 4). Expression of each of these GFPβ11ch-tagged constructs was confirmed by WB, and both generated a robust and specific GFP fluorescence restricted to mitochondria (Figure 5E and F). These observations provide strong evidence that, in addition to a cytosolic and nuclear location, Ago2 is also transported into mitochondria and is a multi-localized protein with a mitochondrial echoform.

Discussion
Initially designed to study protein-protein interactions and solubility, the Split-GFP technology was almost immediately hijacked to track protein localization in various cell types and compartments (Hyun et al., 2015; Kaddoum et al., 2010; Kamiyama et al., 2016; Külzer et al., 2013; Pinaud and Dahan, 2011; Van Engelenburg and Palmer, 2010). It has also been used to study the mitochondrial localization of PARK7 upon nutrient starvation (Calì et al., 2015), and to detect remodeling of MERCs (mitochondria-ER contact sites) in mammalian cells (Yang et al., 2018). Recently, Kakimoto and coworkers developed in yeast and mammalian cells a Split-GFP-based system to analyze inter-organelle contact sites (Kakimoto et al., 2018). However, in these approaches both GFPβ1-10 and GFPβ11 were anchored to proteins either translated in the cytosol or following the secretory pathway. Although the latter may avoid nonspecific interaction or reconstitution of the two GFP parts, we provide evidence here that the simultaneous synthesis of both fragments in the cytosol, coupled to their high affinity for self-assembly, may induce false-positive GFP emission (Figure 2F).
To bypass this issue, we describe herein a new and robust Split-GFP system in which the first 10 beta strands of the GFP beta barrel (GFPβ1-10) are expressed from the mitochondrial genome and translated inside the organelle without interfering with mitochondrial function (Figure 1C and D). The remaining beta strand is concatenated (GFPβ11ch), fused to the protein of interest and translated by cytosolic ribosomes. As a result, any detected GFP fluorescence necessarily originates from the organelle, thereby demonstrating a mitochondrial localization for the tested proteins (Figure 6A-B).
This system was first successfully tested with two mitochondrial proteins (Atp4 and Pam16), and a cytosolic one (Pgk1) as a negative control. Moreover, the mitochondrial echoform of the cytosolic glutamyl-tRNA synthetase (mte cERS) encoded by the GUS1 nuclear gene was also detected with the BiG Mito-Split-GFP system (Figures 2, 3, 4 and 6A). As we already showed, synchronous release of cERS and cMRS from the cytosolic anchor protein Arc1 is required for a coordinated expression of mitochondrial and nuclear ATP synthase genes (Frechin et al., 2009; Frechin et al., 2014). Mitochondrial relocation of cERS is consistent with the functional plasticity of caaRSs with multiple locations in cells. Using GFPβ11ch-tagged N-ter segments of cERS, we localized its cryptic MTS within the first 30 aa residues. This region lacks amphiphilic residues (residues 15-28) and folds into a β-strand-loop-α-helix motif different from regular MTSs (Roise et al., 1988; Simader et al., 2006; Figure 4). These findings demonstrate that the BiG Mito-Split-GFP system allows not only visualization in living cells of the mitochondrial pool of proteins with multiple cellular locations, but also deciphering of their non-conventional MTSs.
Recent efforts made to identify mitochondrial proteins and assign their submitochondrial localization revealed an exquisite precision (Morgenstern et al., 2017). However, resolving mitochondrial proteomes is challenging due to the difficulty of obtaining pure mitochondria and because many proteins transiently localize in mitochondria and are found elsewhere in cells. Up to 10-20% of the yeast mitoproteome was suggested to be composed of proteins with another location in cells (i.e., the cytosol, the nucleus, the ER, etc.) (Ben-Menachem and Pines, 2017; Morgenstern et al., 2017). Our BiG Mito-Split-GFP system will be especially helpful to resolve these proteome complexities. This system was applied here to proteins involved in tRNA aminoacylation, some of which are well known to relocate to different compartments to fulfill a wide range of cellular activities (Han et al., 2012; Ko et al., 2000; Yakobov et al., 2018). In this way, we provide strong evidence that cFRS2 and cyte cHRS are dual localized, as was observed for cERS, which suggests that these proteins may have additional roles beyond translation (Figure 6A). Because cFRS2 is dually localized in the cytosol and mitochondria, and since there is no mte cFRS1, it can be inferred that the catalytic α-subunit (cFRS2) is not always in complex with the β-subunit within the α2β2 heterotetrameric form of cFRS.
It will be interesting to test whether these findings in yeast extend to heterotetrameric cFRS from other eukaryotes, including humans. A bona fide mtFRS (encoded by the MSF1 gene) that was shown to function as a monomer is essential to generate mitochondrial Phe-tRNA Phe (Fmt tRNA F) in mitochondria (Sanni et al., 1991). This further supports the hypothesis that mte cFRS2 is not required to produce Fmt tRNA F but more likely has a non-canonical, yet-to-be-discovered function. Our failure to detect a mitochondrial echoform for cQRS is consistent with our previous findings (Frechin et al., 2009) that the only source of Qmt tRNA Q in mitochondria is provided by the relocation of mte cERS into the organelle (Figure 3 and Figure 3-figure supplement 1), acting in concert with the tRNA-dependent GatFAB AdT (Frechin et al., 2014). This casts serious doubt on the previous proposal of the existence of a cQRS mitochondrial echoform (Rinehart et al., 2005). In agreement with our results (Figure 3C), mitochondrial echoforms of cCRS were also detected in a recent study and shown to result from alternative transcription and translation starts (Nishimura et al., 2019), thereby unraveling how mtCRS is expressed from the CRS1 gene and rationalizing how mitochondrial Cys-tRNA Cys is produced.
Having identified new mitochondrial echoforms of caaRSs, we wondered whether their N-terminal regions carry common sequence or structural features that could drive mitochondrial import. No specific motif was found using MAST/MEME analysis (Bailey et al., 2009), and there was no significant sequence similarity (as tested with BLAST) (Figure 4-figure supplement 1). All but mte cARS show at least one α-helix within their first 50 aa residues, and most (except cERS) are enriched in positively versus negatively charged aa residues, as in classical mitochondrial targeting sequences. Due to the lack of 3D structures, we cannot rule out that these N-termini adopt some specific tertiary structure that is important for mitochondrial localization. As we have shown, most of the cytosolic form of cERS interacts with Arc1 in fermenting yeast, but during the diauxic shift, Arc1 expression is repressed, allowing the generation of a free pool of cERS able to relocate into mitochondria. Thus, in the case of this caaRS, interactions of its N-terminal domain seem to be important to distribute it between the cytosol and mitochondria. Future work is required to determine whether such a mechanism also operates for the other dually localized caaRSs.
Our BiG Mito-Split-GFP system requires modifications of the mitochondrial genome, which can be achieved in only a limited number of organisms (S. cerevisiae, Fox, 2001; and C. reinhardtii, Remacle et al., 2006). However, due to the good evolutionary conservation of mitochondrial protein import, we reasoned that the system we developed in yeast could be used to test proteins of various eukaryotic origins, and we present evidence that this is indeed the case (Figure 5; Figure 6C). For instance, we showed that the mammalian Ago2 protein (Figure 5) heterologously expressed in yeast localizes inside mitochondria. This protein was suggested to be exclusively located at the external surface of mitochondria in human cells, where it would help the transport of pre- and miRNAs into the organelle, as do numerous nuclear-encoded pre- and miRNAs (Bandiera et al., 2011; Barrey et al., 2011; Kren et al., 2009). Several studies have suggested that mitochondrial miRNAs, also termed mitomiRs, play a role in apoptosis (Kren et al., 2009), mitochondrial functions (Das et al., 2012), and translation (Bandiera et al., 2011; Jagannathan et al., 2015; Li et al., 2016; Zhang et al., 2014), and this would require the mitochondrial import of Ago2 (Bandiera et al., 2011; Das et al., 2012; Jagannathan et al., 2015; Li et al., 2016; Zhang et al., 2014). However, the import of mitomiRs is still poorly understood and several possible import mechanisms have been evoked (Barrey et al., 2011; Shepherd et al., 2017). Our unambiguous detection of Ago2 inside mitochondria of yeast cells expressing this protein sheds new light on its potential role in miRNA import.
Figure 6. Schematic of the BiG Mito-Split-GFP system and its applications. (A) Using our engineered strain, we could show the dual localization of echoforms in the aaRS family of proteins and foster its power by studying localization of heterologous proteins originating from plants, mice and human. (B) The BiG Mito-Split-GFP strain was generated by integrating the sequence encoding the first 10 beta barrel segments into yeast mitochondrial DNA, and by either expressing any protein of interest fused to the 11th GFP segment from a plasmid or by integration in yeast nuclear DNA. As opposed to regular GFP-tagging, where visualizing an echoform ultimately results in a GFP signal diffusing in the entire cell, our BiG Mito-Split-GFP system abolishes the fluorescence originating from the cytosolic echoform to only display a specific mitochondrial signal. Further applications range from high-throughput experiments to identify relocating proteins involved in mitochondria homeostasis or metabolism, to identifying nonconventional MTSs or seeking mitochondrial localization of heterologous proteins.
The yeast BiG Mito-Split-GFP system we describe here is designed to reveal mitochondrial echoforms. It is robust, inexpensive and can be used to test proteins from various organisms. This new approach certainly has many potential applications and opens new avenues in the study of mitochondria and their communication with other compartments of the cell.
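The comparison of N-terminal regions discussed above (positively versus negatively charged residues within the first 50 aa) is straightforward to reproduce for any candidate protein. The following is a minimal sketch, assuming that Lys and Arg are counted as positive and Asp and Glu as negative (His is ignored). The function name and the toy sequence are hypothetical and do not correspond to any protein analyzed in this study.

```python
POSITIVE = set("KR")   # Lys, Arg (assumption: His is not counted)
NEGATIVE = set("DE")   # Asp, Glu


def nterm_charge_counts(sequence, window=50):
    """Count charged residues in the first `window` residues.

    Returns (positive, negative, net) where net = positive - negative.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    nterm = sequence.upper()[:window]
    positive = sum(1 for aa in nterm if aa in POSITIVE)
    negative = sum(1 for aa in nterm if aa in NEGATIVE)
    return positive, negative, positive - negative


# Toy sequence, made up for illustration only.
toy_sequence = "MALRKTSLRAAFDRSEKL"
counts = nterm_charge_counts(toy_sequence)
print(counts)
```

A positive net value indicates an excess of basic residues in the window, which is the feature described above for classical targeting sequences; it is only a crude indicator and says nothing about helix formation or amphiphilicity.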

Construction of plasmids
The ATP6 gene flanked by 75 bp of the 5'UTR and 118 bp of the 3'UTR of COX2 was synthesized by GenScript and cloned at the EcoRI site of the pPT24 plasmid bearing the sequence of the COX2 gene along with its UTRs (Thorsness and Fox, 1993), giving pRK49-2. The GFPβ1-10 sequence (optimized for mitochondrial codon usage), encoding the first ten β-strands of GFP and flanked by the regulatory sequences of the ATP6 gene and BamHI/EcoRI sites, was synthesized by GenScript. The BamHI-EcoRI DNA fragment was cloned into the pPT24 plasmid, giving pRK67-2. The sequences of the inserts were verified by sequencing.
The GFPβ11ch coding sequence, synthesized by GenScript, was subcloned into the pAG414 pGPD-ccdB vector to generate pAG414pGPD-ccdB β11ch. All genes encoding cytosolic or mitochondrial proteins were amplified from genomic DNA using the PrimeSTAR Max polymerase according to the manufacturer's instructions (Takara), purified by PCR clean-up (Macherey-Nagel) and subcloned either by Gateway (Thermo Fisher) (Katzen, 2007) or Gibson assembly (NEB) (Gibson et al., 2010; Gibson et al., 2009) according to the manufacturer's instructions (see Table S2).

Construction of the BiG Mito-Split-GFP strain
The genotypes of the strains used in this study are listed in Table 1. ρ+ indicates the wild-type complete mtDNA (when followed by a deletion/insertion mutation, it means the complete mtDNA carrying that mutation). The ρ− synthetic genome (ρ−S) was obtained by biolistic introduction of the plasmids (pRK49-2 or pRK67-2) bearing the indicated genes into mitochondria of the ρ0 DFS160 strain (devoid of mitochondrial DNA). The integration of the ATP6 gene into the mtDNA under the control of the regulatory sequences of COX2 was done using a previously described procedure (Steele et al., 1996). The pRK49-2 plasmid was introduced into mitochondria of the DFS160 ρ0 strain by biolistic transformation using the PDS-1000/He Particle Delivery System (Bio-Rad) as described (Bonnefoy and Fox, 2001), giving the ρ−S strain RKY89. For the integration of the ATP6 gene at the COX2 locus, we first constructed a ρ+ strain (RKY83, Fig. S2A) with a complete deletion of the coding sequence of ATP6 (atp6::ARG8m) and a partial deletion in COX2, cox2-62 (Table 1), by crossing YTMT2 (Mata derivative of strain NB40-3C carrying the cox2-62 mutation; Steele et al., 1996) with MR10 (atp6::ARG8m) (Rak et al., 2007). After crossing, cells were allowed to divide for 20-40 generations to allow mtDNA recombination and mitotic segregation of the double mutation. The double atp6::ARG8m cox2-62 mutant colonies were identified by crossing with the ρ−S strain SDC30 (Duvezin-Caubet et al., 2003), which carries ATP6 and COX2 and restored respiratory competence, and by crossing with the YTMT2 strain (ρ+ cox2-62), which did not restore the respiratory competence of the double mutant. Next, the ρ−S strain RKY89 was crossed with strain RKY83. This cross yielded respiratory-competent progeny, named RKY112, that grew on minimal medium without arginine (Table 1, Figure 1B and S2B).
The ectopic integration of the ATP6 gene into the COX2 locus was verified by PCR using oligonucleotides oAtp6-2, oAtp6-4, o5'UTR2 and o5'UTR1 (Table S1, Fig. S2D). To integrate GFPβ1-10 into the ATP6 locus, the ρ−S strain RKY172 was obtained by biolistic transformation of DFS160 ρ0 with pRK67-2, as described above. RKY172 was crossed with RKY112, and heterokaryons were allowed to divide for 20-40 generations to allow mtDNA recombination and mitotic segregation (Fig. S2C). The RKY176 progeny were selected by their respiratory competence and inability to grow on arginine-depleted plates. The correct integration of the GFPβ1-10 gene into the ATP6 locus was verified by PCR using oligonucleotides oAtp6-1, oAtp6-10, oXFP-pr and oXFP-lw (Table S1, Fig. S2E). Finally, the ADE2 WT sequence was amplified from the genomic DNA of a BY strain using oligonucleotides ADE2 Fw and ADE2 Rv (Table S2) and transformed into the RKY176 strain. Red/white colonies were then screened on adenine-depleted plates to select the ADE2-bearing RKY176 strain.

Pulse-labelling of mitochondrially-synthesized proteins and ATP synthesis
Labeling of mitochondrial translation products was performed using the protocol described by Barrientos et al., 2002. Yeast cells were grown to early exponential phase (10^7 cells/mL) in 10 mL of liquid YP Gal medium. Cells were harvested by centrifugation, washed twice with LSM medium, then suspended in the same medium and starved of cysteine and methionine for 2 hr at 28°C with shaking. Cells were suspended in 500 µL of LSM medium, and 1 mM cycloheximide was added. After a 5 min incubation at 28°C, 0.5 mCi of [35S]methionine and [35S]cysteine (Amersham Biosciences) was added and the cell suspension was further incubated for 20 min at 28°C. Total proteins were isolated by alkaline lysis and suspended in 50 µL of Laemmli buffer. Samples with the same level of incorporated radioactivity were separated by SDS-PAGE in 17.5% (w/v) acrylamide gels (to separate Atp8 and Atp9) or 12% (w/v) acrylamide gels containing 4 M urea and 25% (v/v) glycerol (to separate Atp6, Cox3, Cox2 and cytochrome b). After migration, the gels were dried and [35S]-radiolabeled proteins were visualized by autoradiography with a PhosphorImager after a one-week exposure.
To measure ATP synthase activities in the RKY112 strain, mitochondria were prepared by the enzymatic method as described in Guérin et al., 1979. For the rate of ATP synthesis, the mitochondria (0.15 mg/mL) were placed in a 1 mL thermostatically controlled chamber at 28°C in respiration buffer (0.65 M mannitol, 0.36 mM EGTA, 5 mM Tris-phosphate, 10 mM Tris-maleate pH 6.8) (Rigoulet and Guerin, 1979). The reaction was started by adding 4 mM NADH and 750 µM ADP; 100 µL aliquots were taken every 15 s and the reaction was stopped by adding 3.5% (v/v) perchloric acid and 12.5 mM EDTA. Samples were neutralized to pH 6.5 with KOH and 0.3 M MOPS. ATP was quantified using the Kinase-Glo Max Luminescence Kinase Assay (Promega) and a Beckman Coulter Paradigm plate reader.
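As an illustration of how a rate can be derived from the timed aliquots described above, the following minimal sketch fits a least-squares line to ATP concentration versus time and normalizes the slope to the mitochondrial protein concentration (0.15 mg/mL). The function names and the ATP readings are hypothetical. The luminescence signal is assumed to have already been converted to nmol ATP per mL of reaction, and dilution by the stop and neutralization solutions is ignored.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired measurements")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


def atp_synthesis_rate(times_s, atp_nmol_per_ml, protein_mg_per_ml=0.15):
    """Rate in nmol ATP per min per mg of mitochondrial protein."""
    slope_per_s = least_squares_slope(times_s, atp_nmol_per_ml)
    return slope_per_s * 60.0 / protein_mg_per_ml


# Hypothetical readings: aliquots taken every 15 s (values are made up).
times_s = [0, 15, 30, 45, 60]
atp_nmol_per_ml = [0.0, 2.5, 5.0, 7.5, 10.0]
rate = atp_synthesis_rate(times_s, atp_nmol_per_ml)
print(f"{rate:.1f} nmol ATP/min/mg")
```

Fitting all time points rather than using only the first and last aliquot makes the estimate less sensitive to a single noisy reading, provided the reaction is still in its linear phase.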

Flow cytometry analysis
5 mL of cells from the strains stably expressing Pam16β11ch and Pgk1β11ch (see Table 1), grown in YPD to confluence, were diluted in 4 mL of SC Gal and grown overnight to reach mid-log phase. They were then diluted again in SC Gal and grown for 6 hr. Cells were then centrifuged, resuspended in water and analyzed for GFP fluorescence on a BD FACSCanto II cytometer. Data analysis was performed using FlowJo.
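The text states only that the data were analyzed with FlowJo. As a rough illustration of one common way to quantify such data, the sketch below sets a fluorescence threshold from the negative control (here the Pgk1 strain) and reports the fraction of events above it in the test sample. The 99th-percentile gate, the function names and the intensity values are assumptions made for this example and are not the gating strategy used in this study.

```python
import math


def percentile_nearest_rank(values, q):
    """Nearest-rank percentile (0 < q <= 100) of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def fraction_gfp_positive(sample, negative_control, q=99):
    """Fraction of sample events brighter than the control's q-th percentile."""
    if not sample:
        raise ValueError("sample must not be empty")
    threshold = percentile_nearest_rank(negative_control, q)
    positives = sum(1 for value in sample if value > threshold)
    return positives / len(sample)


# Made-up GFP intensities (arbitrary units), for illustration only.
negative_control = list(range(1, 101))  # stands in for the Pgk1 control
sample = [50, 99, 100, 150, 200]        # stands in for the Pam16 strain
fraction = fraction_gfp_positive(sample, negative_control)
print(fraction)
```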

Proteins extraction and western blots
10 mL of cells grown to mid-log phase were harvested by centrifugation for 5 min at 2000 × g at room temperature (RT). Cells were suspended in 500 µL of deionized water, 50 µL of 1.85 M NaOH was added and the mixture was incubated for 10 min on ice. After addition of 50 µL of 100% TCA and 10 min of incubation on ice, the total precipitate was pelleted by centrifugation for 15 min at 13000 × g at 4°C. After removing the supernatant, pellets were suspended in 200 µL of Laemmli buffer (1×) supplemented with 20 µL of 1 M Tris base pH 8.

Image acquisition and staining
Cells were incubated overnight in the appropriate media, diluted to an OD600nm of 0.3 prior to microscopy studies and stained after 6 hr of growth at 30°C. For mitochondria staining, cells were centrifuged for 1 min at 1500 × g at room temperature, suspended in 1 mL of SC Gal supplemented with MitoTracker Red CMXRos (Molecular Probes) at a final concentration of 100 nM, and incubated for 15 min with rotational shaking at 30°C. Cells were washed three times in one volume of deionized water, and suspended in 100 µL of deionized water for microscopic studies. Epifluorescence images were taken with an AXIO Observer d1 (Carl Zeiss) epifluorescence microscope using a 100× plan apochromatic objective (Carl Zeiss) and processed with the ImageJ software. Images for 3D reconstruction were taken using a confocal LSM 780 high-resolution Airyscan module with a 63× 1.4 NA plan apochromatic objective (Carl Zeiss) controlled by the Zen Black 2.3 software (Carl Zeiss). Z-stack reconstruction was performed with the IMARIS 9.1.2 (Bitplane AG) software.
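The dilutions in this protocol (culture to an OD600nm of 0.3, dye to 100 nM) follow the usual C1 x V1 = C2 x V2 relation. The following is a minimal sketch. The overnight OD600nm of 4.5, the 10 mL final culture volume and the 100 µM dye stock are hypothetical values chosen for illustration and are not given in the text.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock volume, diluent volume) from C1*V1 = C2*V2.

    Both concentrations must share one unit; the returned volumes are
    expressed in the unit of final_volume.
    """
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("all arguments must be positive")
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Culture: hypothetical overnight OD of 4.5 diluted to OD 0.3 in 10 mL.
culture_ml, medium_ml = dilution_volumes(4.5, 0.3, 10.0)

# Dye: hypothetical 100 µM (100,000 nM) stock diluted to 100 nM in 1000 µL.
dye_ul, sc_gal_ul = dilution_volumes(100_000, 100, 1000.0)

print(f"{culture_ml:.2f} mL culture + {medium_ml:.2f} mL medium")
print(f"{dye_ul:.1f} µL dye stock + {sc_gal_ul:.1f} µL SC Gal")
```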