Classification of Subcellular Location by Comparative Proteomic Analysis of Native and Density-shifted Lysosomes*

One approach to the functional characterization of the lysosome lies in the use of proteomic methods to identify proteins in subcellular fractions enriched for this organelle. However, distinguishing between true lysosomal residents and proteins from other cofractionating organelles is challenging. To this end, we implemented a quantitative mass spectrometry approach based on the selective decrease in the buoyant density of liver lysosomes that occurs when animals are treated with Triton-WR1339. Liver lysosome-enriched preparations from control and treated rats were fractionated by isopycnic sucrose density gradient centrifugation. Tryptic peptides derived from gradient fractions were reacted with isobaric tag for relative and absolute quantitation eight-plex labeling reagents and analyzed by two-dimensional liquid chromatography matrix-assisted laser desorption ionization time-of-flight MS. Reporter ion intensities were used to generate relative protein distribution profiles across both types of gradients. A distribution index was calculated for each identified protein and used to determine a probability of lysosomal residence by quadratic discriminant analysis. This analysis suggests that several proteins assigned to the lysosome in other proteomics studies are not true lysosomal residents. Conversely, results support lysosomal residency for other proteins that are either not or only tentatively assigned to this location. The density shift for two proteins, Cu/Zn superoxide dismutase and ATP-binding cassette subfamily B (MDR/TAP) member 6, was corroborated by quantitative Western blotting. Additional balance sheet analyses on differential centrifugation fractions revealed that Cu/Zn superoxide dismutase is predominantly cytosolic with a secondary lysosomal localization whereas ATP-binding cassette subfamily B (MDR/TAP) member 6 is predominantly lysosomal. 
These results establish a quantitative mass spectrometric/subcellular fractionation approach for identification of lysosomal proteins and underscore the necessity of balance sheet analysis for localization studies.

Lysosomes are membrane-delimited organelles that are responsible for the degradation of macromolecules delivered via various cellular pathways including endocytosis, phagocytosis, and autophagy. The catabolic function of the lysosome reflects the concerted action of numerous hydrolases that break down large molecules into simpler components that can be reutilized by the cell. The biomedical importance of this organelle is underscored by the devastating effects of deficiencies in its components, with more than 40 different lysosomal storage diseases that result from mutations in genes encoding lysosomal proteins (1). Alterations in lysosomal function have also been linked with more widespread human disorders, including cancer (2) and neurodegenerative disorders such as Alzheimer and Parkinson diseases (3)(4)(5), thus there is a growing interest in understanding the function of this organelle. One approach to this lies in characterizing its proteome. To date, more than 60 different soluble lumenal proteins and numerous membrane proteins have been associated with the lysosomal compartment. However, new lysosomal proteins continue to be discovered and it is likely that more remain to be classified as such.
In recent years, two general protein identification approaches have been used to investigate the lysosomal proteome (reviewed in (1, 6-8)). One approach focuses on soluble proteins, using affinity purification to isolate proteins targeted to the lysosome via the mannose 6-phosphate (Man6-P) pathway. The other approach uses subcellular fractionation to isolate membranes enriched in lysosomal proteins. Coupled with MS-based methods for protein identification, both approaches have been extremely useful in identifying many candidate lysosomal proteins. However, validation of lysosomal localization remains a critical step and is conducted on a protein-by-protein basis (reviewed in (1, 7, 8)).
A general framework for assignment of proteins to the lysosome using analytical subcellular fractionation was developed over 50 years ago (9) and these principles remain highly applicable today. In the first step, a tissue homogenate is fractionated using differential centrifugation and balance sheets are used to monitor the distribution of a protein of interest among all fractions and to compare this with the distribution of known organelle marker proteins. In subsequent steps, a differential centrifugation fraction enriched in lysosomes is subjected to further fractionation methods, typically isopycnic density gradient centrifugation, and the distribution of the protein of interest again is compared with that of known markers. If the protein cofractionates with bona fide lysosomal markers, this is consistent with a lysosomal residence, and the more independent fractionation methods used that demonstrate this behavior, the greater the confidence of this assignment. Conversely, if a protein does not cofractionate with lysosomal markers, this demonstrates that it is not a lysosomal resident. In some cases, proteins or activities have composite distributions with reference to known markers and these can be quantitatively assigned to multiple locations (10).
In many fractionation schemes, the sedimentation coefficients and/or buoyant densities of lysosomes overlap or are close to those of other organelles, in particular mitochondria and peroxisomes. Fortunately, there are methods to specifically change the biophysical properties of the lysosome to alter its migration in a density gradient, thus providing an additional level of resolution. In particular, treatment of rodents with Triton WR-1339 results in a marked decrease in the buoyant density of liver lysosomes (converting them to "tritosomes") but not of other organelles (11). This density shift is a specific hallmark of lysosomal proteins that has been used in the case-by-case verification of lysosomal candidates identified in proteomics experiments (12)(13)(14)(15)(16)(17).
In this study, we have combined the selectivity of the lysosomal density shift with the protein identification and quantification capabilities of tandem mass-spectrometry on isobaric labeled samples as a step toward concomitant discovery and validation of lysosomal candidates. We assign a lysosomal localization to several proteins that have not previously been localized or whose location is controversial. Extensions of this approach have significant potential toward expanding the understanding of lysosome complexity and function in biology and medicine.

EXPERIMENTAL PROCEDURES
Subcellular Fractionation-All experiments and procedures involving live animals were conducted in compliance with approved Institutional Animal Care and Use Committee protocols. Liver homogenates from control or Triton WR-1339-injected adult male Wistar rats (250-300 g) were prepared as described (18). All subsequent operations were conducted at 0 to 4°C. Fractionation of subcellular organelles by differential centrifugation to obtain nuclear (N), heavy mitochondrial (M), light mitochondrial (L), microsomal (P) and cytosolic (S) fractions was as described (10). Isopycnic centrifugation (19) was performed with the following modifications. For density gradients of control and triton-treated rat liver, an appropriate amount of an ML fraction (combined heavy and light mitochondrial) or an L fraction was layered onto the top of linear sucrose gradients (density limits ~1.06-1.26 g/cm³) and centrifuged for 150 min at 39,000 rpm in either a SW55 Ti or SW50 rotor. Twelve or 13 fractions were collected following centrifugation using a tube slicer and densities were measured by refractometry. Samples were stored at −80°C prior to further analysis.
Enzyme and Protein Assays-β-galactosidase and β-glucuronidase were measured using 4-methylumbelliferyl substrates (20). Cytochrome C oxidase was measured with a kinetic method (18, 21) at room temperature using 22 μM cytochrome C reduced with sodium dithionite as a substrate in 30 mM sodium phosphate pH 7.4 containing 1 mM EDTA. The decrease in absorbance at 550 nm was monitored during the first 2 min and is proportional to the activity of cytochrome C oxidase. Catalase was measured as described (22). Protein concentrations were measured using Advanced Protein Assay reagent (Cytoskeleton Inc., Denver, CO) using bovine serum albumin standards.
Protein Microchemistry for Mass Spectrometry-Equivalent proportions of individual fractions were pooled as shown in Fig. 1 to yield a total of 45 μg protein per pool and adjusted to the same sucrose concentration and final volume (typically 400 μl). The exception was that, because of sample limitations, pool C₁ in Experiment II contained 27 μg protein. Two purified histidine-tagged bacterial proteins (Northeast Structural Genomics Consortium target identification numbers DrR57 and GmR40, kindly provided by Dr. Guy Montelione) were added as internal standards (0.3 μg each protein). For solution digests, samples were then adjusted to contain 0.1% octyl-β-glucoside, 6 M guanidine hydrochloride, and 10 mM dithiothreitol in a final volume of 800 μl, incubated for 30 min at 60°C, cooled to 20°C, then carbamidomethylated with 20 mM iodoacetamide for 1 h. Samples were buffer exchanged to 50 mM ammonium bicarbonate, pH 7.8, by ultrafiltration using Vivaspin 500 10,000 MWCO concentrators (Vivaproducts, Littleton, MA). Modified porcine trypsin (Promega, Madison, WI) was added to retentates at a 1:50 w/w trypsin:substrate ratio and samples were incubated at 37°C for 16 h. Peptides were then recovered in the filtrate by centrifugal ultrafiltration at 14,000 × g followed by two cycles of addition of 150 μl 50 mM ammonium bicarbonate and ultrafiltration. These filtrates were pooled, dried using a vacuum centrifuge, and residual ammonium bicarbonate was removed by three cycles of resuspension in 200 μl 50% methanol and drying. Dried samples were stored at −80°C. For in-gel tryptic digestion, samples were dissolved with 1× lithium dodecyl sulfate gel loading buffer (Invitrogen) containing 10 mM dithiothreitol, heated for 10 min at 70°C, and loaded into a 2 mm thick polyacrylamide gel with large (14 mm wide × 20 mm deep) wells. The gel consisted of a 1 cm segment of 10% acrylamide cast on top of a 20% acrylamide spacer that served to retain proteins in the 10% acrylamide segment.
Electrophoresis was conducted until prestained standards in a separate well completely entered the gel. Gels were fixed, stained with colloidal Coomassie blue and the 10% acrylamide segment containing each sample excised for subsequent reduction, carbamidomethylation and in-gel tryptic digestion (23). Peptides were extracted from the gel pieces using 60% acetonitrile (ACN)/5% formic acid, dried using a vacuum concentrator, and ammonium bicarbonate removed as described above.
Isobaric Tag for Relative and Absolute Quantitation (iTRAQ) Labeling-iTRAQ 8-plex labeling (24) was performed according to manufacturer's protocols (Applied Biosystems, Carlsbad, CA). Briefly, dried tryptic digests were dissolved in 20 to 25 μl of 0.5 M triethylammonium, and iTRAQ 8-plex reagents dissolved in 50 μl of isopropanol were added to the samples, which were incubated for 2 h at 20°C. Labels used for different samples are given in supplemental Fig. S1. Labeling was verified prior to pooling either by LC-MS/MS analysis using an LTQ (Thermo Fisher) or by matrix-assisted laser desorption ionization tandem MS (MALDI-MS/MS) using an ABI 4800 (Applied Biosystems). Most (>96%) of the peptides contained the iTRAQ modification, indicating that the labeling reactions were essentially complete. Samples labeled with different iTRAQ reagents were dried using a vacuum centrifuge, dissolved in 50% methanol, combined, vacuum-dried and stored at −80°C until use.
Two-Dimensional Peptide Separation-Tryptic digests were dissolved in 100 μl 0.1% trifluoroacetic acid (TFA) and loaded onto a pipette tip column containing C18 resin (SPEC, Varian, Lake Forest, CA) equilibrated with 0.1% TFA. The column was washed with 600 μl 0.1% TFA and peptides were sequentially eluted with 100 μl of 30% ACN/0.1% TFA, 100 μl of 50% ACN/0.1% TFA, and 100 μl of 80% ACN/0.1% TFA. The procedure was repeated with the column flow through as above. The eluates were pooled, vacuum-dried, and used for strong cation exchange (SCX) chromatography. Briefly, peptides were dissolved in 180 μl of Buffer A (5 mM KH₂PO₄ and 25% ACN, pH 3.0) and applied to a 1 mm × 150 mm, 5 μm, 300 Å polysulfoethyl A column (PolyLC Inc, Columbia, MD). Chromatography was conducted at a flow rate of 40 μl/min using an Ultimate LC system (Dionex, Sunnyvale, CA) equipped with a UV absorbance monitor. The column was washed with Buffer A for 5 min and developed with a gradient of 0-100% Buffer B (5 mM KH₂PO₄, 25% ACN, and 400 mM KCl, pH 3.0) in 30 min followed by 100% Buffer B for 5 min. Fractions were collected at 2 min intervals and vacuum-dried. SCX fractions were dissolved in 0.1% TFA and combined based on absorbance measurements to yield seven SCX pooled fractions that contained roughly equivalent amounts of peptides. Each of these was further fractionated by reverse phase high performance liquid chromatography and the eluate deposited onto a MALDI plate. Briefly, samples were loaded onto a 300 μm × 5 mm C18 trap column (Dionex, Sunnyvale, CA) and washed with 0.1% TFA for 5 min at a flow rate of 25 μl/min. The flow was reversed and the trap column brought in line with a 75 μm × 12 cm column packed in-house with Magic C18AQ (3 μm, 200 Å; Michrom BioResources Inc., Auburn, CA). The columns were developed using a linear gradient of 2 to 50% ACN in 0.1% TFA at a flow rate of 250 nl/min for 120 min.
The outflow was mixed with a 2 μl/min stream of 50% ACN/0.1% TFA solution using a low dead volume tee and fractions were deposited onto a 384-well MALDI plate using a Probot plate spotter (Dionex) (15 s/fraction). Matrix (2.5 mg/ml recrystallized α-cyano-4-hydroxy cinnamic acid in 50% ACN/0.1% TFA) was manually added to each spot before MALDI-MS/MS analysis.
Tandem Mass Spectrometry-Tandem MALDI mass spectrometry of iTRAQ labeled samples was conducted using an Applied Biosystems 4800 MALDI-TOF/TOF. MS spectra were acquired in a window of m/z 800-4000 in positive ion reflectron mode. The 10 most abundant precursor ions with a signal to noise ratio greater than 50 were selected for top-down MS/MS scans, excluding identical precursor ions present in adjacent spots from a given LC-MALDI run. Precursor ions were selected at a relative resolution of 200 full width at half maximum and MS/MS was conducted using a collision energy of 1 kV (Experiments Ia and Ib) or 2 kV (Experiment II) in positive ion mode, accumulating 2000 laser shots per spectrum. Peaks with a minimum signal to noise ratio of 10 and a mass range of 60 Da to 20 Da below the precursor ion mass were exported from the Applied Biosystems 4000 Series database into mgf files using the TS2 Mascot utility.
Database Searching-Searches were conducted using a local implementation of the Global Proteome Machine (GPM) XE Manager version 2.2.1 (Beavis Informatics Ltd., Winnipeg, Canada), which uses X!Tandem Tornado 2010.01.01 to assign spectral data (25). All mgf files from the MALDI-TOF/TOF experiments were merged and used to search a combined database consisting of the ENSEMBL rat and mouse proteomes (Rattus_norvegicus.RGSC3.4.58.pep.all and Mus_musculus.NCBIM37.58.pep.all), the trypsin and dust/contaminant proteins abstracted from the GPM cRAP database and the two recombinant bacterial fusion proteins used as internal standards (see below). Search parameters specified a precursor ion mass error of 100 ppm and a fragment mass error of 0.4 Da with a minimum number of 5 MS/MS fragments. Cysteine carbamidomethylation and iTRAQ labeling were constant modifications, oxidation of methionine was a variable modification, and one missed cleavage was allowed during the preliminary model development. The threshold used for model refinement was a peptide expectation score of 0.01. During refinement, deamidation at asparagine and glutamine, and oxidation at methionine residues were allowed. Proteins were mapped back to corresponding genes and only those with at least two independent peptides and GPM expectation scores of 10⁻⁵ or better were accepted as valid identifications.
Data Analysis-Intensities associated with reporter ions (monoisotopic mass ± 0.06 Da for each reporter) were extracted from mgf files using a custom PERL script to generate a list of spectra and associated information. Matrix functions in Microsoft Excel were used to adjust iTRAQ reporter ion peak areas for crossover using correction factors supplied by the vendor. This data set was merged with an Excel spreadsheet containing a list of all spectra identified in database searching (see above). Spectra were used for quantification only if they met all of the following criteria: (1) the mean iTRAQ reporter ion intensity was ≥300 for both the control and the Triton WR-1339 treated gradient fractions (see "Results"); (2) the assigned peptide contained a maximum of one missed tryptic cleavage site; and (3) the assigned peptide was either fully tryptic or semitryptic. Gene products required at least three quantifiable spectra for inclusion in the analysis. Internal standards from each experiment were used to generate correction factors that were applied to the ion intensity data to adjust for differential losses and labeling efficiencies (see "Results").
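The spectrum-level and gene-level inclusion criteria above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' PERL/Excel pipeline: the `Spectrum` record, its field names, and the encoding of tryptic specificity as a count of tryptic termini are hypothetical, and criterion (1) is read here as requiring the mean of the four control reporters and the mean of the four Triton reporters each to be at least 300.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

MIN_MEAN_INTENSITY = 300      # threshold stated in the text
MAX_MISSED_CLEAVAGES = 1      # criterion (2)
MIN_SPECTRA_PER_GENE = 3      # gene-level requirement


@dataclass
class Spectrum:
    """Hypothetical record for one identified MS/MS spectrum."""
    gene: str
    control: List[float]      # four reporter intensities, control gradient
    triton: List[float]       # four reporter intensities, Triton gradient
    missed_cleavages: int
    tryptic_termini: int      # 2 = fully tryptic, 1 = semitryptic, 0 = neither


def is_quantifiable(s: Spectrum) -> bool:
    """Apply the three spectrum-level criteria."""
    return (mean(s.control) >= MIN_MEAN_INTENSITY
            and mean(s.triton) >= MIN_MEAN_INTENSITY
            and s.missed_cleavages <= MAX_MISSED_CLEAVAGES
            and s.tryptic_termini >= 1)


def quantifiable_by_gene(spectra: List[Spectrum]) -> Dict[str, List[Spectrum]]:
    """Keep gene products with at least three quantifiable spectra."""
    by_gene = defaultdict(list)
    for s in spectra:
        if is_quantifiable(s):
            by_gene[s.gene].append(s)
    return {g: v for g, v in by_gene.items() if len(v) >= MIN_SPECTRA_PER_GENE}
```

Filtering spectra first and counting per gene afterwards matters: a gene product with many weak spectra but fewer than three quantifiable ones is excluded.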
Calculation of Distribution Profiles and Statistical Analysis-Each MS/MS spectrum included in the analysis contains eight reporter ion intensities that correspond to the four different samples from each of the two different sucrose density gradients analyzed in a given experiment. Each spectrum was used to create two sets of normalized values, where the reporter ion associated with a given gradient fraction was divided by the sum of the four reporter ions for that gradient. Normalized ion intensities for all spectra assigned to a given protein were averaged and used for plotting distribution profiles and for quadratic discriminant analysis (QDA) (26). To account for underlying data quality, we applied a parametric bootstrap analysis to refine the QDA (see "Results") using R version 2.11.0, with contour lines that separate the different classes generated using the "klaR" library. A set of rodent genes encoding established lysosomal matrix and membrane proteins or strong candidates compiled from Tables I-III Table S1), was used as a starting point for predicting new lysosomal candidates.
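As a minimal sketch of the normalization just described, the following Python divides each reporter intensity by the sum of the four reporters from the same gradient and then averages the normalized values over all spectra assigned to one protein. The ordering of the eight reporters (control fractions first, Triton fractions last) is an assumption made for illustration; the actual label assignments are those given in supplemental Fig. S1.

```python
from typing import List, Sequence, Tuple

Profile = Tuple[List[float], List[float]]


def normalize_spectrum(reporters: Sequence[float]) -> Profile:
    """Normalize eight reporter intensities within each gradient.

    Assumes (hypothetically) that entries 0-3 are control fractions 1-4
    and entries 4-7 are Triton fractions 1-4.
    """
    control, triton = list(reporters[:4]), list(reporters[4:])
    return ([v / sum(control) for v in control],
            [v / sum(triton) for v in triton])


def mean_profile(spectra: Sequence[Sequence[float]]) -> Profile:
    """Average the normalized profiles of all spectra for one protein."""
    normalized = [normalize_spectrum(s) for s in spectra]
    n = len(normalized)
    control = [sum(c[i] for c, _ in normalized) / n for i in range(4)]
    triton = [sum(t[i] for _, t in normalized) / n for i in range(4)]
    return control, triton
```

Because each spectrum is normalized before averaging, intense and weak spectra contribute equally to the mean profile.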
Western Blotting-Samples were fractionated by SDS-PAGE on precast NuPAGE Novex 10% bis-Tris or 3-8% Tris-acetate midigels (Invitrogen Corporation, Carlsbad, CA) under reducing conditions and transferred to nitrocellulose. For quantitative analysis, different load volumes were used to ensure that the signal was in the linear range. Membranes were probed with either a sheep antiserum against superoxide dismutase, Cu/Zn enzyme (# 574597, lot D00047669, Calbiochem, San Diego, CA) or a rabbit antiserum against ABCB6 (# LS-C19004, lot 8007, LifeSpan BioSciences, Inc., Seattle, WA) and appropriate iodinated secondary antibodies. Signal was visualized and quantified by phosphorimager analysis using a Typhoon 9400 scanner and ImageQuant 5.2 software (GE Healthcare Bio-Sciences).

RESULTS
Experimental Workflow-The aim of this study was to combine classical approaches to subcellular fractionation with quantitative mass spectrometry to identify lysosomal proteins with high confidence. The first step was to prepare light mitochondrial (L) differential centrifugation fractions (which are ~4-10-fold enriched in lysosomes compared with the homogenate as determined by β-galactosidase assay) from individual control and Triton WR-1339 treated rats and further fractionate these using sucrose density gradient centrifugation. Two independent pairs of control and treated animals were analyzed. As expected, gradients from the treated rats ("Triton gradients") contained a peak of the lysosomal marker activity β-galactosidase in the lower density fractions (Fig. 1, Top Panels), whereas in gradients from control animals ("control gradients"), this marker was in the higher density fractions (Fig. 1, Bottom Panels). The distributions of the mitochondrial marker cytochrome oxidase, the peroxisomal marker catalase, and total protein were essentially unchanged by the treatment (Fig. 1). Fig. 2 shows the overall workflow for the mass spectrometry experiments that were used to analyze each pair of control and treated samples and additional details are provided in supplemental Fig. S1. Although each gradient typically is divided into 12-13 fractions for marker enzyme analysis, this number of samples is impractical for iTRAQ analysis. Thus, each gradient was divided into four fractions (C₁ to C₄ and T₁ to T₄ for the control and Triton gradients, respectively: see Fig. 1), resulting in a set of eight fractions. To each of these fractions, recombinant bacterial fusion proteins were added as internal standards, then they were digested with trypsin and each labeled with a different iTRAQ 8-plex reagent. The eight iTRAQ-labeled samples were pooled and the peptides further fractionated by two-dimensional liquid chromatography and analyzed using a MALDI-TOF/TOF mass spectrometer.
The first set of samples was used for two experiments. In Experiment Ia, samples were reduced, alkylated, and digested with trypsin in solution, whereas in Experiment Ib, these steps were conducted on proteins run into a polyacrylamide gel. Based on the results from this analysis (see below), the second set of samples was processed in-gel (Experiment II). In addition to representing samples derived from different animals, Experiments I and II differed in which iTRAQ 8-plex reagent was used to label each gradient fraction and in the collision energies used for MS/MS (supplemental Fig. S1).
FIG. 1. Distribution of organelle markers and protein in sucrose density gradient fractions. Differential centrifugation L fractions prepared from livers of control or Triton WR-1339 treated rats were fractionated by isopycnic sucrose density gradient centrifugation. Each gradient represents an individual animal. Indicated fractions were combined following marker analysis as shown to create four pools per gradient.

FIG. 2. Workflow for sample processing and iTRAQ labeling. Liver homogenates were processed and used for quantitative mass spectrometry as described in text.

Evaluation of Sample Workflows and Data Quality Using Internal Standards-The internal standards allowed evaluation of potential unequal losses and labeling efficiencies among the eight individual fractions before the iTRAQ labeled samples were combined. To determine whether either source of error was significant, for each spectrum, iTRAQ reporter ion intensity was normalized to the sum of all eight reporter ion intensities. If all eight samples had exactly the same labeling efficiency and losses, then the normalized intensity for all eight reporters would be exactly 0.125. Conversely, differences in normalized intensities for the internal standards would indicate differential losses or unequal labeling efficiencies. Comparison of Experiments Ia and Ib indicated slightly less variability for the in-gel digested samples compared with the solution digests (supplemental Fig. S2). Digestion format per se was not necessarily related to variability in the data but given that both were acceptable, we proceeded with in-gel digests for Experiment II. This analysis also allowed for the generation of correction factors that we applied to each reporter ion in each experiment to account for variability associated with sample processing. Following correction, data from Experiments Ia and Ib were pooled and used for subsequent analysis.
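The idea behind the internal-standard check can be illustrated with a short Python sketch. The text does not give the formula used for the correction factors, so the definition below is an assumption made for illustration: each channel's factor is the expected normalized intensity (0.125) divided by the mean normalized intensity observed for the internal-standard spectra in that channel, and reporter intensities are multiplied by the factor for their channel. Function names are hypothetical.

```python
from typing import List, Sequence

N_CHANNELS = 8
EXPECTED_FRACTION = 1 / N_CHANNELS  # 0.125 when losses and labeling are equal


def normalized_fractions(reporters: Sequence[float]) -> List[float]:
    """Normalize each reporter intensity to the sum of all eight."""
    total = sum(reporters)
    return [v / total for v in reporters]


def correction_factors(standard_spectra: Sequence[Sequence[float]]) -> List[float]:
    """Per-channel factors from internal-standard spectra (assumed formula)."""
    fractions = [normalized_fractions(s) for s in standard_spectra]
    n = len(fractions)
    channel_means = [sum(f[k] for f in fractions) / n for k in range(N_CHANNELS)]
    return [EXPECTED_FRACTION / m for m in channel_means]


def apply_correction(reporters: Sequence[float], factors: Sequence[float]) -> List[float]:
    """Scale each reporter intensity by the factor for its channel."""
    return [v * f for v, f in zip(reporters, factors)]
```

Under this assumed definition, a channel that recovered only half as much internal standard as the others receives a factor about twice as large, so corrected internal-standard spectra return to 0.125 per channel.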
We also examined the effect of reporter ion intensity on signal variance (supplemental Fig. S3). As expected, the coefficient of variation for each spectrum associated with the internal standards tends to be inversely proportional to the average iTRAQ intensity. Based upon these results, we only used spectra where the average reporter ion intensity was ≥300.
Protein Identification and Quantitation-The first pass database search using the GPM resulted in assignment of 8913 spectra to 1273 rodent gene products (excluding internal standards and contaminants) with a peptide false discovery rate of 0.77% (supplemental Tables S2 and S3). These assignments included 56 curated lysosomal proteins (see Methods). Filtering data to include only proteins with a ≤10⁻⁵ chance of a stochastic match that also had at least two unique peptides reduced this to 657 gene products, 41 being curated lysosomal. Further filtering to include gene products with at least three quantifiable spectra (see Methods) reduced this to 7590 quantifiable spectra (5588 in Experiment I, 2002 in Experiment II) assigned to 545 different gene products, 38 being curated lysosomal. For the latter, 292 spectra in Experiment I were assigned to 37 proteins whereas 184 spectra in Experiment II were assigned to 38 proteins.
For each experiment, reporter ion intensities from each spectrum were first corrected for internal standard recovery, then normalized and used to create distribution profiles for the control and triton gradients. Profiles from all quantifiable spectra assigned to a given gene product were averaged to create mean distribution profiles. Fig. 3 shows profiles of relatively abundant proteins that have been assigned to lysosomes, mitochondria, peroxisomes, and endoplasmic reticulum as well as other miscellaneous proteins.
The mean distribution profiles for the lysosomal protease cathepsin D (CTSD), classical lysosomal acid phosphatase (ACP2), and the lysosomal membrane protein LAMP2 are similar to each other and to that of the relative specific enzymatic activity of lysosomal β-galactosidase (β-gal). The control gradient fractions have similar relative intensities or specific activities (compare fractions C₁ to C₄). In contrast, there is evident enrichment in the lower density fractions of the triton gradients (compare fractions T₁ and T₂ with T₃ and T₄).
Thus, the iTRAQ labeling strategy can clearly detect the density shift characteristic of lysosomal proteins.
FIG. 3. Distribution of different classes of proteins in sucrose density gradient pooled fractions. Relative intensities were calculated as described in Methods. Error bars show the 95% confidence intervals calculated using Prism 5.03 (GraphPad Software, Inc). The profile for β-galactosidase is based on enzyme activity and protein measurements conducted after fractions were pooled. Note that the specific reporter ion intensities or activity are normalized to protein levels. Thus, even though most of the lysosomal proteins sediment in the denser fractions in the control gradients as shown in Fig. 1, these fractions also contain the bulk of the protein, resulting in similar relative specific activities among the pooled fractions.

Analysis of other types of proteins indicated that their distributions in the density gradients were largely unaffected by the Triton WR-1339 treatment. In some cases, different proteins assigned to a given organelle exhibit differences in their behavior in a given gradient. For instance, whereas the distributions of the peroxisomal membrane protein ABCD3 and the luminal protein urate oxidase (UOX) are similar, they differ from that of another peroxisomal protein, catalase (CAT), with a greater proportion of the latter sedimenting in the less dense region of the gradient. The differences in distribution profiles of UOX and CAT were previously noted, leading to the hypothesis that a portion of CAT is released from peroxisomes during centrifugation through the sucrose gradient (19). Also, when considering the distribution of proteins generally thought to be associated with ER, the distribution of cytochrome P450 CYP2D2, which is associated with the ER membrane with the majority being located on the cytosolic surface, differs from that of the luminal heat shock proteins HSPA5 and TRA1. Although these differences may be of interest, for this study, the important finding is that when comparing the control and triton gradients, the distribution of the lysosomal proteins can be distinguished from that of other types of proteins.
The mean distribution profiles for all curated lysosomal proteins are shown in Fig. 4 (Top Panels), with individual spectra plotted for select proteins (Fig. 4, remaining Panels). Note that there is some scatter in the profiles for individual spectra, and this variation needs to be accounted for when classifying proteins in the data set (see below). Nonetheless, the mean distribution profiles for all of the curated lysosomal proteins in a given experiment are remarkably similar, with only one outlier, ACP5, being found in Experiment I (dashed line, Fig. 4 Top Panel). The mean distribution profiles are more tightly clustered in Experiment I than in Experiment II, possibly reflecting the number of quantifiable spectra analyzed in each experiment. Also, there is some animal-to-animal variation in the degree of density shift induced by Triton WR-1339 treatment, and a more pronounced shift was obtained in Experiment I. Thus, we chose to use Experiment I for primary classification and Experiment II for corroboration.
Protein Classification-We developed a two-dimensional index to facilitate data visualization and analysis. This entailed calculating the proportion of reporter ion intensity found in fractions 1 and 2 of the control and triton gradients (pC₁,₂ = [C₁ + C₂]/[C₁ + C₂ + C₃ + C₄] and pT₁,₂ = [T₁ + T₂]/[T₁ + T₂ + T₃ + T₄], respectively) for each spectrum and then taking the mean of all spectra assigned to a given protein to provide a point estimate for its relative enrichment in the lower density fractions. Plotting pT₁,₂ versus pC₁,₂ provides a simple way to visualize the degree of the triton shift. Most proteins have similar values for pT₁,₂ and pC₁,₂, and these "nonshifted" proteins lie on a diagonal in the plots shown in Fig. 5. In contrast, several proteins have a distinct distribution and show the triton shift, with pT₁,₂ > pC₁,₂. Importantly, the latter type of pattern includes most of the curated lysosomal proteins but not the others (black circles and red squares, respectively in Fig. 5). We used QDA to estimate the posterior probability that any given protein was associated with the lysosome as judged by the triton shift criterion. For the initial classification set for lysosomal proteins, we used the curated set of lysosomal proteins. For the initial classification set of nonlysosomal proteins, we used all other proteins (which may contain some lysosomal proteins). Following QDA, each point is assigned a probability of lysosomal localization (supplemental Table S4, Basic.posterior_prob). The black curved dotted line in Fig. 5 represents a boundary between the lysosomal and nonlysosomal distributions (posterior probability = 0.5) for this analysis stage, thus any point falling on this line would be equally likely to belong to either set.
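To make the index and the classification step concrete, the following Python sketch computes the two coordinates for one profile and evaluates a two-class QDA posterior from first principles. It is not the authors' R code: it fits one bivariate normal per class from the class sample mean and sample covariance, and it assumes priors proportional to class size, which the text does not specify. Function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def distribution_index(control: Sequence[float], triton: Sequence[float]) -> Point:
    """Return (pC, pT): share of reporter intensity in fractions 1 and 2."""
    return (sum(control[:2]) / sum(control), sum(triton[:2]) / sum(triton))


def fit_class(points: Sequence[Point]):
    """Sample mean and 2x2 sample covariance (sxx, sxy, syy) of one class."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def log_density(point: Point, mean: Point, cov) -> float:
    """Log of the bivariate normal density, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    mahalanobis = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (mahalanobis + math.log(det)) - math.log(2 * math.pi)


def posterior_lysosomal(point: Point, lysosomal: Sequence[Point],
                        other: Sequence[Point]) -> float:
    """QDA posterior that `point` is lysosomal (priors = class proportions, assumed)."""
    n_l, n_o = len(lysosomal), len(other)
    log_l = math.log(n_l / (n_l + n_o)) + log_density(point, *fit_class(lysosomal))
    log_o = math.log(n_o / (n_l + n_o)) + log_density(point, *fit_class(other))
    top = max(log_l, log_o)  # subtract the maximum for numerical stability
    e_l, e_o = math.exp(log_l - top), math.exp(log_o - top)
    return e_l / (e_l + e_o)
```

Because each class has its own covariance matrix, the set of points where the posterior equals 0.5 is a quadratic curve rather than a straight line, which is why the boundaries in Fig. 5 are curved.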
This initial analysis stage ignores the variability in the data used to calculate the coordinates pC₁,₂ and pT₁,₂. We used a parametric bootstrap procedure, with the parameters derived from the estimated mean and covariance of the measurements (see Legend, Fig. 5), to generate 10,000 coordinates for each protein to simulate its sampling distribution (Fig. 5, gray and pink dots for curated lysosomal and other, respectively). All these coordinates were then used for QDA as above, which resulted in a new boundary (blue curved dashed line, Fig. 5). Each of the 10,000 points associated with a given protein was then assigned a posterior probability and, following sorting, the values of the 250th and 9750th points were used as the 95% confidence interval limits (Stage 0 analysis, supplemental Table S4).
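A minimal Python sketch of this bootstrap step follows. It draws bivariate normal coordinates around a protein's mean (pC, pT) using the covariance of the mean (sample covariance divided by the number of spectra, as in the Fig. 5 legend) and takes the 250th and 9750th of 10,000 sorted values as the interval limits. The Cholesky-based sampler and the function names are illustrative choices rather than the authors' R implementation, and a nonzero variance in pC is assumed.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def mean_and_precision(points: Sequence[Point]):
    """Sample means of (pC, pT) and the covariance of those means (s^2 / n)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx / n, sxy / n, syy / n)


def bootstrap_points(mean: Point, cov, size: int = 10_000,
                     rng: Optional[random.Random] = None) -> List[Point]:
    """Draw bivariate normal points via a 2x2 Cholesky factor of `cov`."""
    rng = rng or random.Random()
    vxx, vxy, vyy = cov
    a = math.sqrt(vxx)                       # assumes vxx > 0
    b = vxy / a
    c = math.sqrt(max(vyy - b * b, 0.0))     # guard against rounding below zero
    points = []
    for _ in range(size):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        points.append((mean[0] + a * z1, mean[1] + b * z1 + c * z2))
    return points


def percentile_interval(values: Sequence[float]) -> Tuple[float, float]:
    """95% limits: the 250th and 9750th of 10,000 sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[max(round(0.025 * n) - 1, 0)]
    upper = ordered[max(round(0.975 * n) - 1, 0)]
    return lower, upper
```

In use, each bootstrap point for a protein would be passed through the fitted classifier and `percentile_interval` applied to the resulting posterior probabilities.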
FIG. 5. Classification of proteins using quadratic discriminant analysis (QDA). Each protein identified in a given experiment is represented by a separate symbol: black circles, curated lysosomal proteins; red squares, others. The gray and red dots depict the bootstrap points of the curated lysosomal and other data set, respectively. Curves represent the boundary lines where the posterior probability for assignment as lysosomal or not lysosomal is equal for different classification routines (see text for details). Statistical analysis was conducted as follows: For each protein we generated 10,000 points using a bivariate normal parametric bootstrap procedure using the two sample means of $pC_{1,2}$ and $pT_{1,2}$ (calculated from individual spectra associated with each protein) and their variances and the covariance. Formally, denote by $p_C$ and $p_T$ the sample means of the $n$ reporter ion intensity measurements for a particular protein and let $s_C^2$, $s_T^2$, and $s_{CT}$ denote, respectively, the variances of the $n$ spectra and their covariance. Then the precision of the estimates of $p_C$ and $p_T$ is given by $s_C^2/n$, $s_T^2/n$, and $s_{CT}/n$. These three parameters, which reflect the precision of the estimates, are used to generate the aforementioned bivariate normal random variables. (For those proteins with fewer than three spectra, we used estimates of $s_C^2$, $s_T^2$, and $s_{CT}$ derived from the average values of these parameters across all proteins.) This process was carried out for each of the proteins (resulting in 5,440,000 and 4,500,000 points for Experiments I and II, respectively), and these points were used to carry out the discriminant analysis. This procedure effectively places greater weight on proteins that are more accurately characterized, i.e. those with smaller variances and covariances.
It is possible that proteins in the initial curated lysosomal set were not true lysosomal residents and that some proteins not on the curated list are actually lysosomal residents and were initially classified as nonlysosomal. To address this, proteins were reassigned to the lysosomal classification set if the lower 95% confidence limit was ≥ 0.5, reassigned to the nonlysosomal classification set if the upper 95% confidence limit was ≤ 0.5, or reassigned to a third "ambiguous" set if they did not meet either criterion. We then used the 10,000 points assigned to each protein in the first two sets (excluding ambiguous proteins) in the QDA procedure to recalculate posterior probability estimates and confidence intervals for each protein as above (Stage 1 analysis, supplemental Table S4). We again reassigned proteins as before to either the lysosomal, nonlysosomal, or ambiguous sets and repeated the QDA procedure (Stage 2 analysis, supplemental Table S4). All assignments stabilized following the second and first iteration for Experiments I and II, respectively, and the final boundary is shown as a solid green line in Fig. 5. Fig. 6 shows the distribution of proteins identified in Experiment I based on their Stage 2 classifications, with corresponding data for Experiment II in supplemental Fig. S4. These classifications should be considered tentative (see Discussion), but provide a useful starting point for further investigation. Of the proteins in the curated lysosomal set, all but one (ACP5, tartrate-resistant acid phosphatase) were classified as having either lysosomal or ambiguous distributions (Fig. 6, Top Left Panel). Of the proteins not in the curated set, five were classified as lysosomal, 15 as ambiguous, and the remaining 487 as nonlysosomal (Fig. 6, Top Right Panel). Our data set overlapped with that of two prior lysosomal proteomics studies. One study reported identification of 215 proteins in a rat liver tritosome integral membrane preparation (27). We identified 88 of these and classified 11 as lysosomal, 9 as ambiguous, and 68 as nonlysosomal (Fig. 6, Middle Left Panel).
Another study classified 145 human placental proteins as being associated with the lysosomal membrane (28). We identified 24 rat orthologs of these and classified 7 as lysosomal, 7 as ambiguous, and 10 as nonlysosomal (Fig. 6, Middle Right Panel). We identified 84 proteins that were listed in the MitoCarta database of mitochondrial proteins (29) and all were classified here as nonlysosomal (Fig. 6, Bottom Left Panel), whereas of the 30 proteins identified that were also listed in the Peroxisome Database (30), one was classified as lysosomal and the remaining as nonlysosomal (Fig. 6, Bottom Right Panel).
Subcellular Localization of Selected Candidates-Fig. 7 shows the confidence intervals for the Stage II distributions of all 37 curated lysosomal proteins identified in Experiment I as well as an equal number of proteins in the curated other category that had the highest posterior probability scores. Corresponding data for Experiment II are in supplemental Fig. S5. We chose to investigate the distribution of two proteins for which good antibodies were available for quantitative Western blotting: Cu/Zn superoxide dismutase 1 (SOD1), which had a clear lysosomal distribution in both experiments, and ATP-binding cassette subfamily B (MDR/TAP) member 6 (ABCB6), which fell into the ambiguous category.
Classical differential centrifugation analysis was used to compare the distribution of SOD1 and ABCB6 to that of established markers for lysosomes (β-galactosidase), peroxisomes (catalase), mitochondria (cytochrome oxidase), and endoplasmic reticulum (glucose 6-phosphatase) (Fig. 8). ABCB6 exhibited a pattern consistent with either a lysosomal or peroxisomal distribution, with the greatest enrichment in the L fraction and the majority of the total present in the M and L fractions together. The bulk of cellular SOD1 was detected in the S fraction, which is consistent with the established residence of this protein in the cytoplasm. However, there was significant enrichment of SOD1 within the L fraction, consistent with a secondary association with lysosomes or peroxisomes. Fig. 9 shows sucrose density gradients of the combined ML fraction, which contains the majority of the mitochondrial, peroxisomal, and lysosomal markers. The distribution of ABCB6 followed that of the lysosomal marker β-galactosidase in both the control and triton gradients. SOD1 had a more complex distribution. For the control rat liver sample, a portion of the SOD1 remained at the top of the gradient, as might be expected for a cytosolic protein contaminant of a differential centrifugation fraction, whereas the remainder had a distribution similar to that of the lysosomal marker. SOD1 was clearly shifted in the Triton WR-1339-treated sample, again consistent with a portion of the protein being associated with lysosomes.
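The enrichment values compared in the differential centrifugation analysis are relative specific activities, defined in the legend to Fig. 8 as the percentage of total recovered activity (or signal) in a fraction divided by the percentage of total recovered protein in that fraction. A minimal sketch of that calculation follows; the function name and the numbers in the usage example are hypothetical and are not data from this study.

```python
def relative_specific_activity(activity, protein):
    """Relative specific activity per fraction.

    activity, protein: dicts keyed by fraction name (e.g. N, M, L, P, S)
    holding the recovered marker activity (or Western blot signal) and the
    recovered protein. A value above 1 means the marker is enriched in that
    fraction relative to the homogenate as a whole.
    """
    total_activity = sum(activity.values())
    total_protein = sum(protein.values())
    return {
        fraction: (activity[fraction] / total_activity)
        / (protein[fraction] / total_protein)
        for fraction in activity
    }

# Hypothetical example values, for illustration only.
example_activity = {"N": 5, "M": 20, "L": 40, "P": 15, "S": 20}
example_protein = {"N": 15, "M": 25, "L": 5, "P": 20, "S": 35}
example_rsa = relative_specific_activity(example_activity, example_protein)
```

Plotting each value against the cumulative protein fraction gives blocks whose areas sum to 1, which is why the area in each plot is proportional to the share of total signal.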

DISCUSSION
Several previous studies have combined subcellular fractionation with mass spectrometry and protein sequencing in the proteomic analysis of lysosomes and related organelles (27, 28, 31-34). These have provided valuable insights into the membrane and lumenal composition of the lysosome, but interpretation is complicated by the presence of other organelles in the preparations. In one of these studies, mass spectrometry was conducted at different stages in the purification of lysosomal membranes and spectral counting was used to help distinguish likely lysosomal proteins, which are increasingly enriched during purification, from contaminants, which are increasingly depleted (28). In that study, a lysosome-enriched preparation was subjected to an in vitro treatment that preferentially ruptures lysosomes, and the final purification step involved enrichment of the lysed membranes. This approach represents a significant step forward in distinguishing lysosomal proteins from contaminants, but it is limited to lysosomal membrane proteins.
FIG. 8. Distribution of lysosomal candidates in control and triton-treated rat liver differential centrifugation fractions. For each plot, area is proportional to total signal. Left Panels, Controls; Right Panels, Triton WR-1339 treated. Ordinate, relative specific activity (percentage of total recovered activity or signal normalized to percentage of total recovered protein). Abscissa, relative protein content of fraction (cumulative from left to right). Fractions are: N, nuclear; M, heavy mitochondrial; L, light mitochondrial; P, microsomal; and S, high-speed supernatant. Markers were measured by activity assays whereas ABCB6 and SOD1 were measured by quantitative Western blotting analyzing equal amounts of protein (2 and 4 μg) for each fraction.
In this study, we have taken an alternative approach that can be used to classify both luminal and membrane proteins as lysosomal or nonlysosomal. Here, we use an isobaric labeling strategy to quantitate the shift in density of the lysosome that is induced by treatment with Triton WR-1339 and use a set of curated lysosomal proteins to establish criteria to guide the assignment of other proteins to this organelle. It is worth considering inherent assumptions and limitations in the current experimental design. First, in terms of mass spectrometry, there are several caveats in protein quantification using isobaric labels, including potential artifacts arising from analysis of mixed spectra (35). We have attempted to account for variable data quality by averaging spectra and including error estimates, but this remains a concern, especially for proteins with sparse coverage. Second, in terms of bioinformatics, it should be noted that there are conflicting reports regarding cellular localization of some proteins, and the composition of the curated lysosomal set is somewhat subjective. However, the iterative method used to obtain the Stage II classifications should tolerate a certain level of misclassification.
Finally, in terms of subcellular fractionation, it should be noted that our initial analysis was conducted on L differential centrifugation fractions that only represent a portion of the lysosomes (15-20% based on marker enzyme analysis) and protein (~2-2.5%) present in the liver homogenate. Based on the postulate of biochemical homogeneity, lysosomes in the L fraction are likely to have a composition that is representative of the entire sample (9). However, one cannot assume that the postulate of single location, which is obeyed to a first approximation by well-established organelle markers, holds true for lysosomal candidates. It is also possible that some peripheral proteins associated with the cytoplasmic leaflet of the lysosomal membrane dissociate during the fractionation procedure, resulting in a false negative classification. Thus, our classifications should be considered as a guide for further investigation using balance sheet and other types of analyses. Select proteins of interest that were not in the initial curated lysosomal data set and had posterior probability scores higher than that of ABCB6 in Experiment I are discussed below in order of their score.
Phospholipase B Domain Containing 1 (PLBD1)-PLBD1 (also known as LAMA-like protein 1 and FLJ22662) was classified as lysosomal in Experiments I and II. This protein is a paralog of a newly discovered lysosomal protein, phospholipase B domain containing 2 (also named P76, LAMA-like protein 2, and LOC196463) (15,17). PLBD1 was previously found in preparations of Man6-P glycoproteins from various sources (1,23) and was directly shown to contain Man6-P residues (36). These findings and the results presented here are together highly suggestive that PLBD1 is a bona fide lysosomal protein.
Cu/Zn Superoxide Dismutase 1-SOD1 catalyzes the conversion of superoxide anions into oxygen and hydrogen peroxide. In our initial analysis, SOD1 was classified as lysosomal in both Experiments I and II. We subsequently performed Western blotting analysis of differential centrifugation fractions and ML sucrose density gradient fractions from control and triton-treated rats (Figs. 8 and 9). This indicated that SOD1 is associated with the lysosome, but that this organelle is not its primary residence. Although SOD1 is generally regarded as a cytosolic protein, some reports have shown that it can be found in the mitochondrial intermembrane space (37,38). A recent report has identified SOD1 in a peroxisome-enriched subcellular fraction and, based on immunofluorescence microscopy studies on cells overexpressing both SOD1 and "copper chaperone of SOD1," concluded that copper chaperone of SOD1 mediates import of SOD1 into peroxisomes (39).
FIG. 9. Distribution of ABCB6 and SOD1 in sucrose density gradients of control and triton-treated rat liver ML fractions. Red and black symbols represent analyses of samples from Triton-WR1339-treated and control animals, respectively. The distribution of ABCB6 and SOD1 was determined by quantitative Western blotting analyzing equivalent volume proportions (corresponding to 0.5 and/or 1 mg wet weight liver) for each fraction. Western blotting for SOD1 revealed a single band that migrated with electrophoretic mobility between the 11 and 31 kDa size markers (data not shown), which is consistent with the known mass of the rat protein (~20 kDa). Western blotting for ABCB6 revealed several bands, but the one shown, which is shifted in response to Triton treatment, was the only protein migrating between the 59 and 110 kDa size standards (data not shown), consistent with the observed size of rat ABCB6 (~80 kDa).
Although SOD1 clearly has a complex distribution, it is important to determine whether the minor amount present in membrane fractions under physiological conditions is associated with mitochondria, peroxisomes, or lysosomes. A rigorous earlier study reported that small amounts of SOD1 (ranging from 2 to 8% of the total, higher under starvation conditions) are associated with the lysosome, likely reflecting autophagy of cytosolic protein (40). This was corroborated by a study employing cryosection immunoelectron microscopy, which also showed that SOD1 is relatively resistant to lysosomal proteolysis (41). The physiological rationale for the presence of SOD1 within the lysosome remains to be elucidated. It is possible that it is targeted to the lysosome for degradation but is relatively stable within this environment and thus hydrolyzed slowly. Alternatively, it may be relevant that the activity of SOD1 is relatively independent of pH (42), which may allow it to function within the lysosome.
Glutamyl Aminopeptidase-Glutamyl aminopeptidase was classified as lysosomal in Experiment I and ambiguous in Experiment II. It appears to play a role in the control of blood pressure via degradation of angiotensin II (43) and our results suggest a lysosomal localization for this protein. Glutamyl aminopeptidase has also been found to be significantly enriched in lysosomal membranes isolated from human placenta (28).
Vacuolar ATPase V0 Domain Subunit d1 and Other Components of the Vacuolar ATPases-Although vacuolar ATPases are responsible for lysosomal acidification, these proton pumps are localized to a variety of membranes and are composed of multiple subunits and isoforms (44). The subunits also undergo dynamic association and dissociation. Because of this complexity, we did not place any of the subunits in our initial list of curated lysosomal proteins. Nonetheless, a number were identified, with the following classifications: ATP6V0D1, lysosomal in Experiments I and II; TCIRG1 (ATP6V0A3) and ATP6AP1, ambiguous in both experiments; ATP6V0A1, ambiguous in Experiment I and not found in Experiment II; ATP6V1A1, ambiguous in Experiment II, nonlysosomal in Experiment I; and ATP6V1B2, nonlysosomal in Experiments I and II. Although the nonlysosomal classification of some of the subunits of the V1 domain may reflect subcellular localization, these are peripheral proteins, and it is possible that they may dissociate from membranes during the fractionation process.
ADP-ribosylation Factor-like Protein 8B-ARL8B was classified as lysosomal in Experiment I and ambiguous in Experiment II. It belongs to the Arf-like family of small GTPases and was reported to participate in chromosome segregation and to localize with microtubules on the mitotic spindle (45). ARL8B was identified in previous lysosomal proteomics studies (27,28) and a lysosomal localization is consistent with visualization of fusion protein chimeras and effects of overexpression on lysosome distribution within cells (46,47).
Amyloid P-component, Serum (APCS)-APCS was classified as ambiguous in Experiment I and lysosomal in Experiment II. This protein is present in plasma and cerebrospinal fluid and may be involved in the pathogenesis of Alzheimer disease as it appears to prevent proteolysis of amyloid fibrils (48,49). It was previously found in preparations of Man6-P glycoproteins from human and mouse plasma (50,51) as well as a wide range of rat tissues (23) although the presence of Man6-P has not been directly demonstrated. Interestingly, there is evidence that APCS can bind carbohydrates and proteins containing Man6-P (52) which raises the possibility that this protein may not be a Man6-P glycoprotein per se but may be transported to the lysosome by MPR-mediated endocytosis while in association with other proteins containing Man6-P.
Ferric-chelate Reductase 1-Ferric-chelate reductase 1 (also called stromal cell-derived receptor 2) was classified as ambiguous in Experiment I and lysosomal in Experiment II. This protein belongs to the cytochrome b561 family and is a ferric reductase (53). LCYTB, another member of the cytochrome b561 family with ferric reductase activity, has been reported to localize to the late endosome/lysosomal membrane (54).
ATP-binding Cassette Subfamily B (MDR/TAP) Member 6-ABCB6 had an ambiguous localization as determined in both Experiments I and II, with several peptides clearly exhibiting the triton shift. This protein is an ATP-binding cassette transporter, part of a large family of proteins that play important roles in the transport of a variety of substrates across membranes (55). There is considerable disagreement regarding the intracellular distribution of ABCB6. Initially described as a mitochondrial protein involved in iron homeostasis (56), ABCB6 subsequently was reported to reside in the mitochondrial outer membrane (57), in both the mitochondrial outer membrane and the plasma membrane (58), in endoplasmic reticulum-derived compartments consisting mainly of Golgi (59), and in late endosomes/lysosomes (60). In previous proteomics studies, ABCB6 was identified in rat tritosomes (27) and was enriched in lysosomal membranes from human placenta (28). A variety of morphological and/or subcellular fractionation approaches were used in these localization studies, but it is worth noting that in some of them, the presence of lysosomes in preparations of "purified" mitochondria was not fully appreciated. Our balance sheet analysis provides strong support for the study reporting a lysosomal residence for ABCB6 (60).
Other-It is also possible that additional proteins classified as either ambiguous or nonlysosomal in this analysis contribute to lysosomal function, and thus represent false negatives. For instance, of the curated lysosomal proteins, tartrate-resistant acid phosphatase (ACP5) was classified as nonlysosomal in both Experiments I and II. This may reflect experimental error associated with measurements on relatively few spectra (three in Experiment I, one in Experiment II). Alternatively, it is possible that the data reflect underlying biology. ACP5 is expressed at high levels in osteoclasts, where it is involved in bone resorption (61), but is also present in numerous other cell types, where it functions in dephosphorylation of Man6-P-containing lysosomal proteins (62). Interestingly, early studies suggested a complex compartmentalization of the dephosphorylation activity (63). Additional investigation will be needed to resolve this issue.