An Extended Proteome Map of the Lysosomal Membrane Reveals Novel Potential Transporters

Lysosomes are membrane-bound endocytic organelles that play a major role in degrading cell macromolecules and recycling their building blocks. A comprehensive knowledge of the lysosome function requires an extensive description of its content, an issue partially addressed by previous proteomic analyses. However, the proteins underlying many lysosomal membrane functions, including numerous membrane transporters, remain unidentified. We performed a comparative, semi-quantitative proteomic analysis of rat liver lysosome-enriched and lysosome-nonenriched membranes and used spectral counts to evaluate the relative abundance of proteins. Among a total of 2,385 identified proteins, 734 proteins were significantly enriched in the lysosomal fraction, including 207 proteins already known or predicted as endo-lysosomal and 94 proteins without any known or predicted subcellular localization. The remaining 433 proteins had been previously assigned to other subcellular compartments but may in fact reside on lysosomes either predominantly or as a secondary location. Many membrane-associated complexes implicated in diverse processes such as degradation, membrane trafficking, lysosome biogenesis, lysosome acidification, signaling, and nutrient sensing were enriched in the lysosomal fraction. They were identified to an unprecedented extent as most, if not all, of their subunits were found and retained by our screen. Numerous transporters were also identified, including 46 novel potentially lysosomal proteins. We expressed 12 candidates in HeLa cells and observed that most of them colocalized with the lysosomal marker LAMP1, thus confirming their lysosomal residency. This list of candidate lysosomal proteins substantially increases our knowledge of the lysosomal membrane and provides a basis for further characterization of lysosomal functions.

Lysosomes are membrane-bound intracellular organelles that are key players in the degradation and recycling of biological material. Their crucial role in cell physiology is underlined by the existence of ~50 lysosomal storage diseases caused by genetic defects in lysosomal proteins or proteins involved in lysosome biogenesis (1). The degradative function is carried out in the lysosomal lumen by the concerted action of over 60 hydrolases and accessory proteins (2). Although these soluble lysosomal proteins have been extensively studied, knowledge about membrane proteins remains rather limited, despite the multiple and crucial functions fulfilled by the membrane. It is indeed responsible for establishing and maintaining pH and ionic gradients, transporting degradation substrates and products from/into the cytosol, and maintaining lysosome integrity. Additionally, the lysosomal membrane is subjected to multiple fusion and fission events with other endocytic or biosynthetic compartments. Substrates for degradation are conveyed to lysosomes from the extracellular milieu, the plasma membrane, or the cytoplasm through the endocytic, phagocytic, and autophagic routes. Delivery of newly synthesized material to lysosomes requires exchanges between endocytic or biosynthetic organelles on the one hand and lysosomes on the other hand. These numerous trafficking events are supported by molecular machineries that associate with the lysosomal membrane (3).
In the last decade, large scale mass spectrometry-based approaches have been exploited to study the lysosome protein composition. The soluble content was first analyzed using an affinity purification protocol based on the mannose 6-phosphate modification (4-11) that is characteristic of soluble lysosomal proteins (12). This resulted in the identification of about 60 known luminal lysosomal proteins, as well as of many mannose 6-phosphate-containing proteins that were not previously thought to carry out a lysosomal function (13). To gain insight into the membrane composition, several groups have used preparative subcellular fractionation to recover samples enriched in lysosomes (14-18). Although these fractionation methods cannot completely separate organelles, the use of comparative strategies and statistical tools (14,16,17) allowed the identification of novel putative resident lysosomal membrane proteins, including a few potential transporters, such as SLC12A4, SLC44A2, C19ORF28 (MFSD12), SIDT2, and MFSD1 (14,16). More recently, the coupling of selective lysosome density shift and MS quantification was shown to allow simultaneous identification and validation of lysosomal candidates (19). The efficiency of these various approaches in identifying candidates was highlighted by the demonstration of the effective lysosomal residency of several selected proteins (16, 20-29). Concerning membrane proteins, these studies have led to a list of about 45 integral membrane lysosomal proteins for which evidence of the lysosomal localization has been obtained at least by overexpression of epitope-tagged fusion proteins (30).
However, despite the expanded knowledge provided by these recent studies, many lysosomal actors are still missing. For instance, although more than 20 lysosomal transport activities have been biochemically described (31,32), many of these transport functions remain orphans because the underlying proteins have not been identified yet (33). The aim of the present proteomic study was to gain deeper insight into the characterization of the lysosomal membrane and its associated proteins, with a particular interest in novel potential lysosomal transporters, given their major role in lysosomal physiology. Transporters are integral membrane proteins (IMPs) displaying multiple transmembrane domains, and such IMPs are usually difficult to identify by mass spectrometry because of their high hydrophobicity and low abundance (34,35). Therefore, to extend our protein identification capacities, we used a combination of subcellular and biochemical fractionations prior to MS analysis. We first established an overall list of 2,385 gene products from lysosome-enriched and lysosome-nonenriched fractions. Then, a comparative proteomics analysis based on spectral counts led to the selection of 734 candidate proteins. They included on the one hand 94 novel potentially lysosomal proteins and, on the other hand, 46 established or putative transporters for which lysosomal residency is suggested by this study. The lysosomal localization has been validated for nine candidates, including five transporters. Moreover, we recently showed elsewhere that another candidate identified during this proteomic study, PQLC2, is a novel lysosomal amino acid transporter (36).

EXPERIMENTAL PROCEDURES
Subcellular Fractionation-All experiments involving rats were conducted in compliance with approved Institutional Animal Use Committee protocols. Livers were obtained from male Wistar rats. Each preparation was performed on four rat livers essentially as described previously (37). Briefly, fractionation of subcellular organelles by differential centrifugation produced nucleus and heavy mitochondrial (NM), light mitochondrial (L), and microsomal and soluble (PS) fractions. The L fraction was subjected to isopycnic centrifugation on a discontinuous Nycodenz density gradient. Conditions of the gradient were essentially the same as described in the original publication (37), except that Nycodenz® was used instead of metrizamide. The following density layers (in g/ml) were successively loaded on top of the L fraction: 1.16 (7 ml), 1.145 (6 ml), 1.135 (7 ml), and 1.10 (7 ml). Centrifugation was performed at 83,000 × g for 2 h 30 min in an SW28 Beckman rotor. Fraction 2 (the interface between the layers of respective densities, 1.10 and 1.135 g/ml) was recovered as the L+ ("lysosome-enriched") fraction. Fractions 1, 3, and 4 (upper and lower fractions) were pooled as the L− ("lysosome-nonenriched") fraction. Organelles from both L+ and L− fractions were separately diluted in 0.25 M sucrose, pelleted by ultracentrifugation (100,000 × g, 4°C, 1 h), and subjected to a hypoosmotic shock in buffer A (10 mM Hepes, pH 7.8, supplemented with protease inhibitors (Complete, Roche Applied Science)). Membranes (MbL+ and MbL−) were recovered by ultracentrifugation (100,000 × g, 4°C, 1 h), extensively washed in buffer A, and resuspended in 200 µl of buffer A before storage at −80°C.
Recovery of lysosomes in the fractions resulting from differential centrifugation and from the Nycodenz gradient was followed by β-galactosidase activity measurement (38). These data, along with the protein amounts recovered in each fraction, allowed calculation of purification factors as compared with the initial homogenate. Protein concentration was evaluated using a Micro BCA™ protein assay kit (Thermo Scientific).
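A purification factor of this kind is commonly computed as a ratio of specific activities (marker activity per unit of protein) between a fraction and the homogenate. The following is a minimal sketch under that assumption; the function name and the numbers are illustrative and are not taken from this study.

```python
def purification_factor(activity_fraction, protein_fraction_mg,
                        activity_homogenate, protein_homogenate_mg):
    """Fold enrichment of a marker enzyme relative to the homogenate.

    Specific activity = enzyme activity / protein amount. The purification
    factor is the specific activity of the fraction divided by the
    specific activity of the homogenate (assumed definition).
    """
    if protein_fraction_mg <= 0 or protein_homogenate_mg <= 0:
        raise ValueError("protein amounts must be positive")
    if activity_homogenate <= 0:
        raise ValueError("homogenate activity must be positive")
    specific_fraction = activity_fraction / protein_fraction_mg
    specific_homogenate = activity_homogenate / protein_homogenate_mg
    return specific_fraction / specific_homogenate


# Hypothetical example: a fraction holding 7% of the marker activity
# in 0.1% of the total protein.
example = purification_factor(7.0, 1.0, 100.0, 1000.0)
```

Any activity unit can be used as long as it is the same for the fraction and the homogenate, because the units cancel in the ratio.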
Chloroform/Methanol Extraction-Chloroform/methanol (CM) fractionation of proteins was performed according to Salvi et al. (39). Briefly, 250 µg of organelle membranes (1-10 mg/ml) in buffer A were sonicated, left for 15 min on ice, and ultracentrifuged for 40 min at 100,000 × g and at 4°C. Membrane pellets were then gently resuspended in 100 µl of buffer A and slowly diluted in 900 µl of cold CM (5:4, v/v) on ice. The mixture was left 15 min on ice, with periodic agitation, and then centrifuged (15 min, 15,000 × g, 4°C) to produce a pellet (the CM-insoluble fraction, CMI) and a supernatant (the CM-soluble fraction, CMS, containing the most hydrophobic proteins). Solvent from the CMS fraction was evaporated under nitrogen, down to 100 µl, and proteins were acetone-precipitated. Proteins from the CMI and CMS pellets were dissolved in Laemmli buffer for SDS-PAGE separation.
Triton X-114 Phase Separation-Triton X-114 phase separation was performed according to Donoghue et al. (40). Briefly, 250 µg of organelle membranes (1-10 mg/ml) in buffer A were sonicated, left for 15 min on ice, and ultracentrifuged for 40 min at 100,000 × g and at 4°C. Membrane pellets were then gently resuspended in 800 µl of cold PBS, and 200 µl of 10% Triton X-114 (Sigma-Aldrich) was added. The mixture was gently agitated on a rotating wheel overnight at 4°C and cleared by centrifugation (30 min, 20,000 × g, 4°C). The supernatant was warmed at 37°C for 30 min and centrifuged (30 min, 5,000 × g, 25°C) for phase separation. The upper aqueous phase (AQ) and lower detergent phase (DT) were mixed, respectively, with 200 µl of 10% Triton X-114 and 800 µl of cold PBS, incubated for 15 min at 37°C, and centrifuged as described previously. This step was repeated three times, before recovering the final AQ and DT phases as well as a pellet present in the DT fraction (DTP). AQ and DT proteins were acetone-precipitated, and all samples were finally dissolved in Laemmli buffer for SDS-PAGE separation. AQ samples from the first replicate were not kept.
For MS analysis, SDS-PAGE separation of the reduced proteins was performed on 4-12% gradient acrylamide gels (NuPAGE, Invitrogen). Proteins were stained by Bio-Safe Coomassie stain or Coomassie Brilliant Blue R-250 (Bio-Rad). The amount of loaded proteins and the migration length were adapted to the protein sample complexity.
MS Sample Preparation-For protein digestion, each SDS-polyacrylamide gel lane was systematically cut into 1-mm bands that were washed several times by successive incubations in 25 mM NH4HCO3 for 15 min and in 50% (v/v) acetonitrile, 25 mM NH4HCO3 for 15 min. Gel pieces were dehydrated by 100% acetonitrile and then incubated with 7% H2O2 for 15 min before being washed again with the destaining solutions described above. Modified trypsin (0.15 µg; Promega, sequencing grade) in 25 mM NH4HCO3 was added to the dehydrated gel pieces for an overnight incubation at 37°C. Peptides were extracted from gel pieces in three 15-min sequential extraction steps in 30 µl of 50% acetonitrile, 30 µl of 5% formic acid, and 30 µl of 100% acetonitrile. The pooled supernatants were finally dried under vacuum.
NanoLC-MS/MS Analysis-The dried extracted peptides were resuspended in 30 µl of 4% acetonitrile, 0.5% trifluoroacetic acid and analyzed by on-line nanoLC-MS/MS (Ultimate 3000 and LTQ-Orbitrap, Thermo Fisher Scientific). The nanoLC method consisted of a 40-min gradient ranging from 5 to 55% acetonitrile in 0.1% formic acid at a flow rate of 300 nl/min. Peptides were sampled on a 300-µm × 5-mm PepMap C18 precolumn and separated on a 75-µm × 150-mm C18 column (Gemini C18, Phenomenex). MS and MS/MS data were acquired using Xcalibur (Thermo Fisher Scientific) and processed automatically using Mascot Daemon software (version 2.1, Matrix Science).
Database Searching and Criteria for Protein Identification-Consecutive searches against the IPI_rat_decoy_database (based on the IPI-Rat version 3.48 database; 80,082 entries including the reverse ones) were performed for each sample using Mascot 2.1 (Matrix Science, London, UK). ESI-TRAP was chosen as the instrument and trypsin as the enzyme, and two missed cleavages were allowed. Precursor and fragment mass error tolerances were set at 15 ppm and 1 Da, respectively. Peptide variable modifications allowed during the search were: acetyl (N-terminal), oxidation (M), dioxidation (M), and trioxidation (C). Proteins identified with a minimum of one unique peptide and with a score higher than the query threshold (for a peptide p value < 0.01) were automatically validated using IRMa (42). The filtered results were downloaded into an MS identification database, in which the peptide false discovery rate (FDR) was 2.38% (FDR = 2 × reverse/(reverse + forward)). A homemade tool was used for the compilation, the grouping of proteins identified by the same set or subset of peptides (according to the principle of parsimony), and the final comparison. Peptides shorter than hexamers were rejected at the grouping step. A last filtering step retained only protein groups identified by at least two unique peptides. All keratin isoforms and trypsin were deleted from the results. All MS data are available on the Pride database site (43) as Pride project 22847.
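The FDR formula and the two final filters stated above (peptides of at least six residues, at least two unique peptides per protein group) can be sketched as follows. The function names and the example counts are hypothetical; only the formula and the two thresholds come from the text.

```python
def peptide_fdr(n_reverse, n_forward):
    """Decoy-based FDR as stated in the text: 2 * reverse / (reverse + forward)."""
    total = n_reverse + n_forward
    if total == 0:
        raise ValueError("no peptide matches")
    return 2.0 * n_reverse / total


def passes_final_filter(peptides, min_len=6, min_unique=2):
    """Keep a protein group only if at least `min_unique` distinct peptides
    of at least `min_len` residues support it."""
    kept = {p for p in peptides if len(p) >= min_len}
    return len(kept) >= min_unique


# Hypothetical counts: 10 reverse (decoy) and 990 forward matches.
example_fdr = peptide_fdr(10, 990)
```

With these illustrative counts the estimated FDR is 2%, because each decoy match is assumed to stand for one undetected false match among the forward hits.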
Protein Annotation-Gene names were retrieved from the IPI-Rat or Uniprot databases. Uncharacterized proteins (IPI sequence set) were searched with Blastp against the mammalian section of the Uniprot database (released February, 2011). Top protein hits with an E-value of 10e-05 or lower and a query coverage greater than 50% were kept and manually checked for relevance. The query coverage represents the percentage of the query length that is included in the aligned segments and is calculated over all segments. When several protein groups corresponded to the same gene name, they were all kept.
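The query coverage criterion can be illustrated with a short sketch. It assumes that "calculated over all segments" means the union of the aligned query intervals, so that overlapping segments are not counted twice, and it reads the E-value threshold as 1e-5; both points are assumptions, and the function names are hypothetical.

```python
def query_coverage(segments, query_length):
    """Percent of query residues covered by the union of aligned segments.

    segments: iterable of (start, end) query positions, 1-based, inclusive.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or adjacent segment: extend the current interval.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    covered = sum(end - start + 1 for start, end in merged)
    return 100.0 * covered / query_length


def keep_hit(evalue, segments, query_length,
             max_evalue=1e-5, min_coverage=50.0):
    """Apply both thresholds; the default max_evalue is an assumed reading."""
    return (evalue <= max_evalue
            and query_coverage(segments, query_length) > min_coverage)
```

For example, two segments covering residues 1-50 and 40-80 of a 100-residue query give a coverage of 80%, not 91%, because the overlap is merged first.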
The TMHMM version 2.0 server (Center for Biological sequence analysis, Lyngby, DK) was used for predictions of membrane-spanning regions (i.e. transmembrane domains). Protein functional annotation and subcellular localization information, either experimental or predicted, were collected from the IPI, Uniprot, or QuickGO sites and from the bibliography.
Spectral Counting and Semi-quantitative Analysis-For each identified protein p, the spectral count values (sc_{p,s} = number of spectra assigned per protein in a given sample s) were determined with the homemade hEIDI software (supplemental Tables S1 and S2). All spectra pointing to a given protein after the filtering steps were considered. Spectra matching the protein isoforms were counted once for each protein group containing one or several of the isoforms. These spectral count values were then normalized to the equivalent amount (in micrograms) of total membrane protein prior to CM or Triton X-114 extraction (Fig. 1), which had been injected in the spectrometer. The normalized spectral count (Nsc) thus corresponds to a number of spectra per microgram of total MbL+ or MbL− proteins. For each identified protein p, the Nsc value was first calculated for each sample s (Nsc_{p,s}), then for each fraction f (MbL+ or MbL−; Nsc_{p,f}) in each replicate, as the sum of the Nsc_{p,s} in the CMS, CMI, AQ, DTP, and DT samples, and finally for each of the MbL+ or MbL− fractions by summing the Nsc_{p,f} obtained for the three replicates. Evaluation of the relative abundance of a protein in a given sample or fraction was based on the label-free spectral counting method (44) and performed by dividing Nsc_{p,s} or Nsc_{p,f} by the total Nsc of the considered sample or fraction.
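The normalization described above can be sketched in a few lines. The data layout (dictionaries keyed by protein, replicate, and sample) and the function names are assumptions made for illustration; this is not the hEIDI implementation.

```python
from collections import defaultdict


def normalized_counts(spectral_counts, injected_ug):
    """Sum normalized spectral counts (Nsc) per protein for one fraction.

    spectral_counts: {(protein, replicate, sample): spectral count}
    injected_ug: {(replicate, sample): equivalent µg of total membrane
                  protein injected for that sample}
    Returns {protein: Nsc summed over all samples and replicates}.
    """
    nsc = defaultdict(float)
    for (protein, replicate, sample), count in spectral_counts.items():
        nsc[protein] += count / injected_ug[(replicate, sample)]
    return dict(nsc)


def relative_abundance(nsc):
    """Divide each protein's Nsc by the total Nsc of the fraction."""
    total = sum(nsc.values())
    return {protein: value / total for protein, value in nsc.items()}


# Toy data (hypothetical): two proteins, one replicate, two samples.
toy_counts = {("A", 1, "CMS"): 10, ("A", 1, "CMI"): 20, ("B", 1, "CMS"): 5}
toy_injected = {(1, "CMS"): 5.0, (1, "CMI"): 10.0}
toy_nsc = normalized_counts(toy_counts, toy_injected)
toy_rel = relative_abundance(toy_nsc)
```

In the toy data, protein A receives 10/5 + 20/10 = 4 spectra per microgram and protein B receives 1, so their relative abundances are 0.8 and 0.2.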
For each protein, a spectral index (SpI) comprising both relative protein abundance and number of samples containing this protein was then calculated as indicated in Fu et al. (45) to allow comparison between MbL− and MbL+ samples. Confidence intervals were established through permutation analysis (45) and used for determination of proteins significantly enriched in MbL+ (lysosomal protein candidates).
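A sketch of the spectral index and of a permutation-based cutoff is given here. The formula used, SpI = (mean_plus/(mean_plus + mean_minus)) × detection_rate_plus − (mean_minus/(mean_plus + mean_minus)) × detection_rate_minus, is one reading of the definition in ref. 45 and should be checked against that reference. The label-shuffling scheme, the function names, and the data layout are likewise assumptions.

```python
import random


def spectral_index(abund_plus, abund_minus):
    """SpI for one protein from its relative abundances in the replicates
    of MbL+ (abund_plus) and MbL- (abund_minus). Ranges from -1 to +1."""
    mean_plus = sum(abund_plus) / len(abund_plus)
    mean_minus = sum(abund_minus) / len(abund_minus)
    total = mean_plus + mean_minus
    if total == 0:
        return 0.0
    # Fraction of replicates in which the protein was detected.
    detect_plus = sum(1 for a in abund_plus if a > 0) / len(abund_plus)
    detect_minus = sum(1 for a in abund_minus if a > 0) / len(abund_minus)
    return ((mean_plus / total) * detect_plus
            - (mean_minus / total) * detect_minus)


def permutation_cutoff(table, n_plus, n_perm=1000, percentile=95.0, seed=0):
    """Estimate an SpI cutoff from a null distribution built by shuffling
    replicate labels. table: {protein: [abundances, MbL+ columns first]}."""
    rng = random.Random(seed)
    n_cols = len(next(iter(table.values())))
    null = []
    for _ in range(n_perm):
        order = list(range(n_cols))
        rng.shuffle(order)
        plus_idx, minus_idx = order[:n_plus], order[n_plus:]
        for values in table.values():
            null.append(spectral_index([values[i] for i in plus_idx],
                                       [values[i] for i in minus_idx]))
    null.sort()
    rank = min(len(null) - 1, int(percentile / 100.0 * len(null)))
    return null[rank]
```

A protein detected at equal abundance in every replicate of both fractions scores 0, whereas a protein found only in MbL+ scores its detection rate there, so a protein seen in one of three MbL+ replicates scores 1/3 rather than 1.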
Molecular Cloning-IMAGE or ORFEOME clones coding for the following proteins were obtained from Source Bioscience: LOH12CR1, MFSD1, PTTG1IP, SLC37A2, SLC38A7, SLC46A3, SLCO2B1, STARD10, TMEM104, TMEM175, TTYH2, and TTYH3. Inserts were amplified by PCR using the Phusion polymerase (New England Biolabs) and the commercial plasmids as template and cloned for heterologous expression of GFP or YFP fusion proteins. Original plasmids, DNA accession numbers, primers, and expression vectors are given in supplemental Table S3.
Cell Culture and Fluorescence Studies-HeLa cells were from the American Type Culture Collection (ATCC) and were grown in DMEM/GlutaMAX-I supplemented with 10% FBS. Media and serum were from PAA Laboratories and Invitrogen, respectively. Cells were transiently transfected using electroporation or lipofection with Lipofectamine 2000 and processed for epifluorescence 2 days after transfection. Cells were fixed at room temperature in 4% paraformaldehyde. Antibodies were used at the following dilutions: mouse monoclonal anti-human LAMP1, 1:2,000 (H4A3, Developmental Studies Hybridoma Bank); Cy3-conjugated donkey anti-mouse, 1:1,000 (The Jackson Laboratory). Fluorescence was examined using a Nikon TE2000 epifluorescence microscope. Images were deconvoluted after acquisition with the PSF-based Iterative 3D Deconvolution module of Metamorph software (Universal Imaging Corp.).

RESULTS
To extend our knowledge of the lysosomal protein content, with a particular focus on membrane proteins and especially transporters, we performed a semi-quantitative and comparative proteomics analysis of membranes from rat liver fractions enriched and nonenriched in lysosomes. As novel proteins remaining to be discovered have a low abundance, we maximized protein resolution and coverage by analyzing several biological replicates and by reducing sample complexity using membrane protein subfractionation and SDS-PAGE. The label-free spectral counting method, based on the number of redundant peptides that identify a protein (44,46), was used to evaluate the relative abundance of each protein.
Further selection of lysosomal protein candidates resulted from a statistical comparison between lysosome-enriched and -nonenriched fractions (45). Finally, novel candidate lysosomal transporters were identified among the multipass transmembrane proteins.
Preparation of Samples from Lysosome-enriched and -nonenriched Fractions-We essentially followed the well established protocol of Wattiaux et al. (37) for preparation of lysosomal fractions (Fig. 1). Rat liver homogenates were first fractionated by differential centrifugation, and the primary lysosome-enriched fraction (fraction L) was further separated on a discontinuous Nycodenz density gradient (47), resulting in secondary L+ and L− fractions. Nycodenz is a gradient medium in which the various organelles band at densities very similar to those observed in the originally described metrizamide medium (48). Three independent preparations (biological replicates) were made. Our protocol aimed at improving the identification of IMPs, because their hydrophobicity and usually low abundance hinder their MS detection and identification in highly complex protein samples (34,35). However, we also attempted to retain peripheral membrane-associated proteins. These membrane-associated proteins indeed include various trafficking machineries and cytoskeleton-associated proteins, which are crucial for the biogenesis and function of endolysosomes. We thus avoided harsh treatments that would have removed membrane-associated proteins, such as alkaline washes, to prepare L+ and L− membranes (MbL+ and MbL− fractions, respectively). We then subfractionated both fractions according to protein hydrophobicity by two independent treatments, chloroform/methanol extraction (39) and Triton X-114 phase separation (Fig. 1) (40,49). All resulting samples were finally separated by 1D electrophoresis, before being processed for MS analysis.
The β-galactosidase activity was measured to follow the recovery of lysosomes along the fractionation process, for the three replicates. These measurements indicated that the L, L+, and L− fractions were enriched 9-13-, 65-75-, and 7-9.5-fold, respectively, in lysosomes relative to the initial liver homogenate. These values, consistent with published data (37,47), demonstrated the enrichment and nonenrichment of L+ and L−, respectively, as compared with L, and the much higher concentration (~7-9-fold) of lysosomes in L+ as compared with L−. These fractions are thus described as lysosome-enriched and lysosome-nonenriched, respectively, and evaluation of "enrichment" will hereafter always be based on the comparison between L+ and L− fractions.
We then analyzed the enrichment of several subcellular compartments by immunodetection of organelle markers in the NM, L, and PS fractions resulting from differential centrifugation, as well as in the membranes of the L, L+, and L− fractions (Fig. 2). The lysosomal protein LAMP2 was the only protein that was strongly enriched in both L and MbL+ fractions. Rab5 (early endosome), the α1 subunit of the sodium potassium ATPase (plasma membrane), TGN38 (TGN), and FTCD (Golgi) were depleted from L and enriched in PS to different extents. Although they all display some enrichment in MbL+, TGN38 is the only one for which this enrichment is comparable with that of LAMP2. Both the mitochondrial ATP synthase OSCP subunit and the endoplasmic reticulum BiP were depleted in MbL+.

FIG. 1. Workflow of the sample preparation for MS analysis.
Differential centrifugation of rat liver homogenates (H) produced a light mitochondrial fraction L, which was submitted to Nycodenz gradient centrifugation. This step allowed separation of a lysosome-enriched fraction (L+) from the rest of the gradient (L−). Organelles from L+ and L− were broken by hypoosmotic shock, and membranes were recovered by ultracentrifugation. Membrane pellets (MbL+ and MbL−) were split in two equal parts that were separately fractionated by independent methods based on protein hydrophobicity (chloroform/methanol extraction or Triton X-114 (TX114) phase separation). All resulting samples were subsequently separated by SDS-PAGE prior to LC-MS/MS analysis of in-gel digested samples.
Protein Identification-From the three biological replicates, we generated 959 MS analyses. After first pass filters, this resulted in 1,398,920 spectra, 368,147 of which could be assigned to 4,097 nonredundant rat gene products from the IPI-Rat database. According to the principle of parsimony, protein isoforms that could not be segregated by the identified peptides were counted as one unique gene product. The 4,097 gene products corresponded to 24,316 nonredundant peptide sequences. All corresponding protein and peptide information is available in the Pride database under project number 22847 and in supplemental Table S2. Further filtering excluding trypsin and keratins as contaminants and retaining proteins identified by at least two unique peptides led to a list of 2,385 nonredundant gene products, hereafter named the MbL2385 list. In this list, 528 proteins were present in the MbL+ fraction only, 157 in the MbL− fraction only, and 1,700 in both fractions (supplemental Tables S4a and S5a). Thus, most of the proteins were common to both MbL+ and MbL− fractions, in agreement with the limited resolution power of subcellular fractionation and the high sensitivity of mass spectrometers.
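The split of the MbL2385 list into three classes is a simple set partition, sketched here with hypothetical identifiers. Only the three counts in the final line come from the text.

```python
def partition_by_fraction(plus_ids, minus_ids):
    """Split identified proteins into MbL+-only, MbL--only, and shared sets."""
    plus, minus = set(plus_ids), set(minus_ids)
    return {
        "plus_only": plus - minus,
        "minus_only": minus - plus,
        "both": plus & minus,
    }


# Hypothetical identifiers, for illustration only.
toy = partition_by_fraction({"P1", "P2", "P3"}, {"P3", "P4"})

# Counts reported in the text: the three classes add up to the MbL2385 list.
reported_total = 528 + 157 + 1700
```

Because the three sets are disjoint by construction, their sizes always sum to the number of distinct proteins identified, which is how the reported counts add up to 2,385.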
To evaluate the content in IMPs identified in our samples, transmembrane domains were predicted by use of the TMHMM 2.0 server (supplemental Table S4a). This led to the identification of 762 IMPs (32%), including 361 polytopic proteins (proteins with at least two transmembrane domains, 15.1%).
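The classification by predicted transmembrane domains can be sketched as below. The text defines polytopic proteins as those with at least two transmembrane domains; treating any protein with at least one predicted helix as an IMP is an assumption, as are the function name and the input format (this is not the TMHMM output format).

```python
def classify_by_tm(tm_counts):
    """Count IMPs and polytopic proteins from predicted TM helix numbers.

    tm_counts: {protein: number of predicted transmembrane helices}
    """
    n_total = len(tm_counts)
    n_imp = sum(1 for n in tm_counts.values() if n >= 1)        # assumed rule
    n_polytopic = sum(1 for n in tm_counts.values() if n >= 2)  # from the text
    return {
        "total": n_total,
        "imp": n_imp,
        "polytopic": n_polytopic,
        "imp_pct": 100.0 * n_imp / n_total,
        "polytopic_pct": 100.0 * n_polytopic / n_total,
    }


# Hypothetical predictions for four proteins.
toy_summary = classify_by_tm({"A": 0, "B": 1, "C": 7, "D": 12})

# Percentages implied by the counts reported in the text.
imp_pct_reported = round(100.0 * 762 / 2385, 1)
polytopic_pct_reported = round(100.0 * 361 / 2385, 1)
```

The reported counts correspond to 31.9% and 15.1% of the MbL2385 list, consistent with the rounded figures of 32% and 15.1% given above.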
Extraction of Semi-quantitative Data-Despite the high enrichment factor obtained by the well established Nycodenz gradient method used in this study, cofractionation of other organelles, such as mitochondria, challenges the identification of true lysosomal residents, including proteins with dual or multiple localization. We thus compared the protein sets from lysosome-enriched and -nonenriched fractions to identify the subset associated with lysosomes. Because most proteins were common to both MbL+ and MbL− fractions, comparing their number was less informative and relevant than comparing their abundance (compare Fig. 3 with supplemental Fig. S1). Abundance information was extracted from spectral count data (supplemental Table S1), according to the spectral counting semi-quantitative approach (44,46). The relative abundance for any given protein was derived from the Nsc calculated as indicated under "Experimental Procedures" using merged data issued from all MbL+ or MbL− samples (supplemental Table S1).
MbL+ and MbL− Fractions Display Different Organellar Profiles-We then analyzed the known or predicted subcellular localization of proteins from the MbL2385 list by manually collecting this information in protein databases (IPI, UniProtKB, and QuickGO) and bibliography. For IPI entries without any attributed gene name, homologs were first searched in a mammalian subset of the Uniprot database using Blastp. This allowed comparison of protein abundances in MbL+ and MbL− according to the following subcellular categories: endo-lysosomes (EL), plasma membrane (PM), mitochondria, peroxisomes and nucleus (MPN), endoplasmic reticulum (ER), Golgi (G), cytoplasm (C), cytoskeleton (CS), secreted (S), miscellaneous (Misc; vesicles, granules, and multiple localizations) and Unknown. The comparison of protein abundances in MbL+ and MbL− according to the subcellular distribution showed striking differences (Fig. 3, left panel; supplemental Table S5c); EL and PM proteins were clearly enriched in MbL+, as they altogether accounted for 34% of the material, as compared with only 4.7% in MbL−. By contrast, proteins from the ER and MPN compartments were depleted from MbL+ relative to MbL− (34.2% of the material in MbL+ versus 73.7% in MbL−). The similar behaviors of EL and PM proteins on the one hand and ER and MPN proteins on the other hand were systematically observed in subsequent analyses. Despite its slight enrichment in the MbL+ fraction (0.59% of the abundance in MbL+ versus 0.15% in MbL−), the small set of Golgi proteins (n = 30) has been ranked as "Contaminants," along with ER and MPN proteins, in subsequent quantitative analyses (see below). Except for the cytoplasm, the other subcellular constituents (CS, S, and Misc) were slightly enriched in the MbL+ fraction (15.6 versus 7.1%). Proteins of unknown localization represented 11.2 and 10.0% in number but 6.3 and 3.8% in abundance in MbL+ and MbL−, respectively, indicating that the average relative abundance of such proteins is low (Fig. 3, left panel, and supplemental Fig. S1 and supplemental Table S5c).
Thus, the subcellular distribution features that stem from our spectral count data were consistent with qualitative expectations based on a restricted set of organelle markers (Fig. 2) (37). The substantial presence of contaminant organelles in MbL+ was expected as a known characteristic of subcellular fractions. Our results therefore validate the use of the spectral count-based semi-quantitative method to describe and analyze these fractions.
Assignment of Proteins to Lysosomes-The next step in our study was to identify which proteins identified in MbL+ were indeed novel potential lysosomal proteins. We thus aimed at identifying those significantly enriched in MbL+ relative to MbL−, similarly, for instance, to the observed enrichment of the typical lysosomal marker β-galactosidase in L+ relative to L−. Proteins from the MbL+ fraction were either exclusively detected in MbL+ or common to both fractions (supplemental Table S4a). Among the 528 proteins exclusively present in MbL+, we chose to consider as potentially lysosomal only those present in at least two out of the three biological replicates and identified by at least five spectra (356 proteins; supplemental Table S4b). Among the proteins common to MbL+ and MbL−, lysosomal candidates were selected according to their SpI (45), a parameter that takes into account both the relative protein abundance (estimated by normalized spectral counts) and the number of replicates in which the protein has been found (supplemental Tables S1 and S4a). SpI values range from −1 to +1, the lower and upper extreme values corresponding to proteins almost exclusively detected in MbL− and MbL+ fractions, respectively. These values displayed a roughly bimodal distribution in the MbL2385 list, with a massive peak covering negative values and a second subset rising toward an SpI of +1 (Fig. 4A). The SpI analysis highlighted the different distributions between MbL+ and MbL− of proteins from various annotated subcellular categories (Fig. 4B). Indeed, proteins from contaminants (essentially mitochondrial, ER and peroxisomal proteins) were the main contributors to the massive peak of negative SpI, whereas EL and PM proteins demonstrated a strong tendency to score high SpI values, with respective median values of 0.77 and 0.60. Confidence intervals were established through permutation analysis (45). Proteins were considered as significantly enriched in MbL+ when their SpI was higher than the 95th percentile cutoff value (SpI ≥ 0.594), a level reached by 378 proteins out of 1,700 (supplemental Table S4b).
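The two selection rules can be combined in a short sketch. The thresholds (two of three replicates, five spectra, SpI ≥ 0.594) come from the text; the record layout, the field names, and the function name are hypothetical.

```python
def select_candidates(proteins, spi_cutoff=0.594,
                      min_replicates=2, min_spectra=5):
    """Return the ids of proteins retained as lysosomal candidates.

    Each protein is a dict with the (hypothetical) keys:
      id, in_plus, in_minus, replicates_plus, spectra_plus, spi
    """
    selected = []
    for p in proteins:
        if not p["in_plus"]:
            continue  # detected in MbL- only: never a candidate
        if not p["in_minus"]:
            # MbL+-only proteins: reproducibility and spectral count rule.
            if (p["replicates_plus"] >= min_replicates
                    and p["spectra_plus"] >= min_spectra):
                selected.append(p["id"])
        elif p["spi"] >= spi_cutoff:
            # Proteins shared by both fractions: SpI cutoff rule.
            selected.append(p["id"])
    return selected


# Hypothetical records, for illustration only.
toy_proteins = [
    {"id": "A", "in_plus": True, "in_minus": False,
     "replicates_plus": 3, "spectra_plus": 12, "spi": 1.0},
    {"id": "B", "in_plus": True, "in_minus": False,
     "replicates_plus": 1, "spectra_plus": 9, "spi": 0.33},
    {"id": "C", "in_plus": True, "in_minus": True,
     "replicates_plus": 3, "spectra_plus": 40, "spi": 0.70},
    {"id": "D", "in_plus": True, "in_minus": True,
     "replicates_plus": 3, "spectra_plus": 40, "spi": 0.10},
    {"id": "E", "in_plus": False, "in_minus": True,
     "replicates_plus": 0, "spectra_plus": 0, "spi": -1.0},
]
toy_selected = select_candidates(toy_proteins)

# Counts reported in the text for the two rules.
reported_selected = 356 + 378
```

Applied to the reported counts, the two rules retain 356 + 378 = 734 proteins in total.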
Altogether, our selection criteria for significant enrichment in MbL+ led us to sort 734 proteins (Lys-734 list; supplemental Table S4b) out of 2,385. This selection included 79.3% (n = 207) of the EL-annotated proteins, 56% (n = 132) of the PM-annotated ones, and only 3.2% (n = 28) of the contaminant proteins (Fig. 5A and supplemental Table S5d). The C, CS, S, and Misc categories were represented by a total of 273 proteins. To our knowledge, 38 of the 94 proteins without any subcellular localization annotation (Table I) were completely novel lysosomal candidates, because they have not been identified in previous proteomic studies of lysosomes (14-17, 19).
As it was recently shown that most known lysosomal genes exhibit a coordinated transcriptional behavior regulated by the transcription factor TFEB, we compared our Lys-734 list to the list of 291 genes up-regulated following TFEB overexpression in HeLa cells (50). This comparison pointed to 38 common proteins, among which 30 were EL-annotated proteins and one, the product of the Wdr81 gene, was of unknown annotated localization.
Extraction from protein databases or bibliography, and analysis of known or predicted functional annotation showed that all defined functional processes were represented in the Lys-734 list, although only two "polypeptide transport" annotated proteins remained (Fig. 5B). Transporters, channels, and pumps of ions and small molecules represented the most abundant functional class, despite its third position by protein number. Metabolism-associated proteins were the most numerous but ranked second in abundance (supplemental Table S5e). As for the 94 proteins without subcellular annotation, one-third had no functional annotation either; more than one-quarter were various metabolic enzymes; and the remaining were distributed between the "miscellaneous," "transporters, channels, and ion pumps," and "receptors and signaling" classes with rather similar abundances (Fig. 5A).
Identification of Novel Putative Lysosomal Transporters-In addition to extending the current list of known lysosomal proteins, our interest was focused on the discovery of potential novel lysosomal transporters. As transporters display multiple membrane spanning domains (35), we filtered the Lys-734 list for polytopic proteins. Among the 136 MbL+-enriched polytopic proteins, 10% (n = 11) had no attributed function and more than half (n = 72; 67.5% of the Lys-734 IMPs abundance) belonged to the transporters, channels, and ion pumps class. This protein set contains numerous subunits of ATPases (v-ATPase (n = 6); P-ATPases (n = 5)), ATP-binding cassette (ABC) transporters (n = 9), channels (n = 10), and secondary active transporters (n = 42). The latter include the recently discovered potential or effective lysosomal transporters C2ORF18 (21, 51), DIRC2 (27), LMBD1 (52), and MFSD8 (53) (Fig. 6). During the revision of our manuscript, the lysosomal localization of the ABC transporter ABCD4 was established (29), and we showed in a separate study the lysosomal localization and transport function of the PQLC2 protein (36).
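The polytopic filter described above can be sketched as follows. The record layout, the candidate names, and the threshold of two transmembrane domains (taken from the figure and table legends of this article) are assumptions made for illustration; the subclass counts are those quoted in the text.

```python
from collections import Counter

TRANSPORT_CLASS = "transporters, channels, and ion pumps"


def polytopic(proteins, min_tm=2):
    """Keep proteins with at least `min_tm` predicted transmembrane domains."""
    return [p for p in proteins if p["tm"] >= min_tm]


# Hypothetical records: name, predicted TM domain count, functional class.
example = [
    {"name": "candidate_1", "tm": 12, "function": TRANSPORT_CLASS},
    {"name": "candidate_2", "tm": 1, "function": TRANSPORT_CLASS},
    {"name": "candidate_3", "tm": 7, "function": None},  # no attributed function
    {"name": "candidate_4", "tm": 0, "function": "metabolism"},
]

kept = polytopic(example)
by_function = Counter(p["function"] for p in kept)

# Subclass counts quoted in the text for the transporter class.
reported_subclasses = {
    "V-ATPase subunits": 6,
    "P-ATPases": 5,
    "ABC transporters": 9,
    "channels": 10,
    "secondary active transporters": 42,
}
transporter_total = sum(reported_subclasses.values())

print([p["name"] for p in kept], dict(by_function), transporter_total)
```

The five subclass counts sum to the 72 proteins assigned to the transporters, channels, and ion pumps class.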
Removal of the transporters already annotated as endolysosomal led to a set of 46 novel potentially lysosomal transporters that notably included 27 plasma membrane proteins and 12 proteins of unknown localization (Table II). To our knowledge, 9 out of these 46 proteins (ABCA6, C7ORF23, C9ORF91, CACFD1, SLC26A1, SLC38A7, SLC40A1, SLC46A3, and TMEM50b) have not been identified in previous proteomic analyses of mammalian lysosomes, phagosomes, or lysosome-related organelles (14-17, 19, 54-61).

FIG. 6. Lysosomal transportome. All known and potential transporters or channels retained as selectively enriched in MbL+ are represented. These proteins display at least two transmembrane domains; they either belong to the functional class "transporters, channels, and ion pumps" or have no functional annotation. They are classified according to their functional and subcellular annotations. Different categories of transporters are depicted by different colors (ABC transporters, green; MFS transporters, pink; SLC family members, purple; ATPases, deep blue; V-ATPase subunits, light blue; channels, brown; miscellaneous, black). Validated candidates are in boldface.

TABLE II List of potential novel lysosomal transporters
Candidates with two TM or more, of unknown function or with an attributed transport function, are shown. The localizations are as follows: ER, endoplasmic reticulum, Golgi; EL, endo-lysosomes; Misc, miscellaneous; PM, plasma membrane; U, unknown. The functions are as follows: T, transporters, channels, and ion pumps; U, unknown; TM, number of transmembrane domains; SpI, spectral index; NA, not applicable (proteins identified in MbL+ exclusively). The lysosomal localization of MFSD1 has been shown during the course of this study (66).

Candidates-Twelve candidates, LOH12CR1, STARD10, PTTG1IP, MFSD1, SLC37A2, SLC38A7, SLC46A3, SLCO2B1, TMEM104, TMEM175, TTYH2, and TTYH3, were chosen to validate independently the proteomic data. Peptides allowing their identification are given in supplemental Table S6. LOH12CR1 and STARD10 are putative cytosolic proteins. PTTG1IP is predicted to be an integral membrane protein with a role in cellular trafficking. Its subcellular localization is unclear as it has been observed in cytosol and nucleus by some authors (62) and in late endosomes by others (63). All other candidates are multispanning transmembrane proteins. TMEM175 has no homology with [...] (65). MFSD1, which has previously been identified in proteomics analyses of lysosomes and phagosomes (14, 16, 19, 54, 59, 60), is responsive to the transcription factor TFEB, and it was considered as a promising lysosomal protein candidate (16, 50). Its lysosomal localization has been confirmed independently during the course of our study (66). SLC37A2 mediates sugar-phosphate/phosphate and phosphate/phosphate exchange in proteoliposomes (67). SLC38A7 (SoLute Carrier family 38 member 7) belongs to the amino acid/polyamine/organocation superfamily (68). It has been reported to transport neutral and cationic amino acids at the plasma membrane during the course of this study (69), but signal-to-noise ratios were intriguingly low, suggesting that the actual role of SLC38A7 deserves further investigation. SLCO2B1/SLC21A9/OATP2B1 is an organic anion transporter that is stimulated at acidic pH (70). TMEM104 is an orphan member of the amino acid and auxin permease transporter family.
These candidates were transiently expressed as GFP or YFP fusion proteins in HeLa cells, and their intracellular distribution was compared with the lysosomal/late endosomal marker LAMP1. Interestingly, only three candidates did not overlap with LAMP1 but localized instead at the plasma membrane (STARD10 and SLCO2B1) or in LAMP1-negative puncta (LOH12CR1; data not shown). By contrast, the distribution of the nine other candidates extensively overlapped with LAMP1 (Fig. 7), thus confirming that they reside in lysosomes and validating the predictive value of the Lys-734 list.

DISCUSSION

The main concern in lysosome-oriented proteomic studies based on subcellular fractionation is the identification of true lysosomal residents, because of cofractionation of other organelles (37, 71). Thus, identification of lysosomal candidates requires comparison of lysosome-enriched and -nonenriched fractions. A pioneering study performed by Callaghan and coworkers (15) aimed at identifying lysosomal membrane proteins from Triton WR1339 density-shifted lysosomes, also referred to as tritosomes. However, the actual lysosomal residency of several proteins identified in this study could not be established, because of the lack of a comparative approach. Later on, a study of placental lysosomal proteins took advantage of the comparison between successive steps of the preparation and used a semi-quantitative label-free spectral counting method to select 86 lysosomal candidates (16). More recently, Lobel and coworkers (19) demonstrated the potential of coupling the selective lysosome density shift induced by Triton WR-1339 injection in rats with MS quantification by isobaric peptide labeling, for simultaneous identification and validation of lysosomal candidates.
In this work, we compared lysosome-enriched and -nonenriched fractions obtained from rat liver by differential centrifugation and isopycnic density gradient centrifugation, followed by detergent or organic solvent extraction steps to reduce sample complexity prior to MS analysis. Our spectral count-based analysis provided us with an extensive list of 2,385 proteins (MbL2385 list), including 32% of IMPs. Among these proteins, 734 were selected as significantly enriched in the lysosomal fraction (Lys-734 list).
To our knowledge, the MbL2385 list is the most extensive published to date for lysosomes (15-17, 19, 72), phagosomes (54-56, 59, 60, 73, 74), or lysosome-related organelles (57, 58, 75-78). Its IMP content (32%) is much higher than that commonly obtained if no specific subfractionation treatment is performed (5-15% IMPs (34)), but it is very similar to that obtained in a study of placental lysosomal membranes that also used an organic solvent treatment (16). Because of our preparation protocol, we identified not only IMPs and membrane-associated proteins but also soluble proteins, such as luminal lysosomal hydrolases. Indeed, centrifugation of the lysosomal membranes leads to sedimentation of aggregated inclusions from the lysosomal matrix and thus induces the presence of soluble lysosomal enzymes and of proteins being degraded (30). Moreover, soluble proteins might also be retained as entrapped in membrane fragments generated upon hypotonic lysis and subsequent resealing of the organelles. The Lys-734 list is also longer than those pre- [...]

As ~80% of the EL-annotated proteins but only 3.2% of contaminant proteins were recovered in the Lys-734 list, our semi-quantitative approach was able to strongly discriminate endo-lysosomal proteins from those of recognized contaminating organelles, such as mitochondria or endoplasmic reticulum. Nevertheless, the presence of proteins annotated as non-endo-lysosomal questions the significance of their selection, besides the possibility of false-positive retention. Additionally, among the EL proteins themselves, lysosomal proteins are not distinguished from proteins from other endocytic compartments (early or late endosomes).
Proteins annotated to compartments other than lysosomes may be true lysosomal residents with multiple subcellular locations, the lysosomal residency being either predominant or secondary. Indeed, as our data were restricted to fractions obtained from the isopycnic density gradient, we do not know if the "lysosome-like" behavior observed for a given protein is representative of the whole cellular pool of that protein or restricted to a small, specific subset.
For instance, the TGN marker TGN38 is depleted from the L fraction and mainly recovered in the PS fraction after differential centrifugation (Fig. 2). However, the minority of TGN38 proteins that cosegregated with lysosomes during differential centrifugation was concentrated in the MbL+ fraction after the subsequent centrifugation on a Nycodenz gradient (Fig. 2). A surprisingly high number of PM proteins (56%) was retained in the Lys-734 list. The presence of PM in the lysosome-enriched fraction has been discussed previously (37); it was shown that the small amount of PM proteins recovered in the L fraction (~5%) behaves like lysosomes on a metrizamide gradient, either as true PM residents or as lysosomal proteins. Migration of PM proteins between PM and lysosomes is conceivable. Indeed, the endocytic pathway constitutes a link between these two compartments, as numerous fusion/fission events occur between the various entities of the pathway (PM, endocytic vesicles, early and late endosomes, and lysosomes). Moreover, lysosomes are known to directly fuse with the PM in given circumstances (3). Such a dual localization has been suggested by observations of 5′-nucleotidase reactivity on the cytoplasmic face of lysosomes (37). This protein, which is considered as a PM marker, is notably present in the Lys-734 list. The non-EL annotation of a candidate may also be too restrictive. For example, numerous proteins annotated as cytoplasmic or belonging to the cytoskeleton might in fact be associated with endo-lysosomes as constituents of membrane trafficking machineries that allow membrane exchange between lysosomes and other organelles or as belonging to the microtubules along which endo-lysosomes move inside the cell (79, 80). Finally, if not true lysosomal residents, the candidate proteins may be targeted to lysosomes for degradation through endocytosis or autophagy.
For instance, many PM tyrosine kinase receptors, such as the EGF receptor, are down-regulated by this process (81). Only a few receptors of this type, including the EGF receptor, were, however, identified in our work.
Besides its major role as a cytosolic proteolytic machine, the proteasome is also required for endocytic transport and sorting of receptors toward inner membranes of the multivesicular bodies (82, 83), through a specific interaction between Rab7 and the proteasome α-subunit PSMA7 (84). Accordingly, numerous subunits (n = 28) of the proteasome were present in the Lys-734 list. In a previous study, 24 proteasome subunits had been found in placental lysosome membranes, although not considered as lysosomal candidates (16). Proteasome subunits had also been identified, although to a lesser extent, in proteomic studies of phagosomes (54), lysosome-related organelles (57), or Arabidopsis thaliana vacuoles.³

FIG. 8. Schematic representation of the identified actors of chosen lysosomally associated processes. A schematic endo-lysosome is drawn with the names of identified proteins implicated in chosen lysosome-associated processes. Transmembrane transport is not considered here. Well established complexes are represented on a gray background. All their described components are indicated, whether identified in this study or not. The Rab7 protein is indicated near the diverse complexes requiring Rab7 interaction for their endo-lysosome membrane association. Black, selected proteins; black italics, proteins identified with at least five spectra but not selected; gray italics, proteins neither identified nor selected.

Biogenesis of the lysosomes and delivery of endocytic cargo to these organelles involve numerous and highly dynamic membrane fusion and fission events between compartments of the endocytic pathway and with the secretory pathway, thus allowing protein import to, or retrieval from, lysosomes (3, 85). All complexes involved in these processes were present in the Lys-734 list (Fig. 8). Whereas numerous components of the ESCRT-III (Endosomal Sorting Complex Required for Transport-III) complex, which mediates the abscission of the newly forming intraluminal vesicles (86), were selected, none of the components of the ESCRT-0, -I, or -II complexes was identified. This was already the case in our recent proteomic analysis of the endocytic pathway of Dictyostelium discoideum (87) or in studies performed on the vacuolar membrane of Arabidopsis thaliana.³ The origin of this apparently "tighter" association of ESCRT-III with endo-lysosomal membranes (in contrast, ESCRT-0 and -I were detected in phagosomes (54)) deserves further investigation. As for the process of homo- or heterotypic fusion between late endosomes and lysosomes, it implies an initial tethering step mediated by the HOPS (homotypic fusion and vacuole protein sorting) complex (88). Very similarly, tethering in the homotypic fusion of early endosomes is performed by the CORVET (class C core vacuole-endosome transport) complex, which shares four subunits with HOPS (89). Interestingly, all HOPS components were present in the Lys-734 list, although none of the specific CORVET subunits could be detected. This is similar to what was observed in a proteomic study of the yeast vacuolar membrane (90).
Similarly to endosomes (91), lysosomes are now emerging as signaling platforms with the capability to detect modifications of the cell environment, such as energy, growth factors, and nutrient levels (92). Accordingly, many actors of signaling processes were enriched in the MbL+ fraction, such as receptors, α subunits of the heterotrimeric G proteins, protein kinases, and a few Ras-related proteins. As half of these signaling proteins were PM-annotated, their additional endocytic localization might have been ignored until now. A key signaling pathway in nutrient sensing involves the master cell growth regulator mTOR that controls autophagy in response to a wide range of signals, including amino acid availability (93). Recent studies showed that the lysosome acts as an assembly site for a sensing device, the "nutrisome," which is composed of the RagA/B-RagC/D heterodimer, the Ragulator and mTORC1 complexes, the Rheb GTPase, and the V-ATPase (92, 94). Most proteins from this pathway were present in the Lys-734 list (Fig. 8).
Conclusions and Perspectives-Almost a hundred proteins, whose subcellular localization had never been described or predicted, were sorted out as novel putative lysosomal proteins in this study. Concerning molecular transporters, 46 candidates were selected, most of which were either devoid of subcellular annotation or annotated as plasma membrane proteins, suggesting a dual localization for the latter. The lysosomal subcellular localization was validated for nine candidates, including five secondary transporters, further supporting the relevance of our list of candidate lysosomal proteins. The numerous novel candidates revealed by this work should promote new research and help with understanding the cell biology, physiology, and pathophysiology of this important organelle.