Datasets from label-free quantitative proteomic analysis of human glomeruli with sclerotic lesions

Human glomeruli with intermediate (i-GS) and advanced (GS) sclerotic lesions, as well as normal control glomeruli (Nor), were captured by laser microdissection, digested with trypsin, and subjected to shotgun LC-MS/MS analysis (LTQ-Orbitrap XL). Label-free quantification was performed using the Normalized Spectral Index (SI_N) to assess the relative molar concentration of each protein identified in a sample. All the experimental data are shown in this article. The data are associated with the research article submitted to Journal of Proteomics [1].


Value of the data
In-depth proteomic profiles provide a comprehensive view of the protein composition of human sclerotic glomeruli.
SI_N-based label-free quantitative datasets are useful for exploring the key biological events that may be critically involved in the progression of human glomerulosclerosis.
Detailed information on peptide-spectrum matches and ion intensities for spectral queries enables further bioinformatic analysis.

Data, experimental design, materials and methods
We aimed to characterize human glomeruli with intermediate and advanced sclerotic lesions using a label-free quantitative proteomic approach in combination with laser microdissection. Macroscopically normal kidney tissues were obtained from patients who underwent nephrectomy for urological cancers. Sclerotic glomeruli, for which specific renal diseases had been excluded and which were assumed to be aging-related, were divided into two groups, intermediate (i-GS) and advanced (GS) sclerosis. Normal glomeruli served as the control (Nor). Glomerular sections (10 μm thick, fixed with methyl-Carnoy) were collected by laser microdissection (50 sections/sample, 3 samples/group from 3 patients). Detailed information on patients and specimens is given in the associated article submitted to Journal of Proteomics [1].
Each glomerular sample was directly digested with 15 μl of activated trypsin solution (20 ng/μl) at 37 °C overnight. After digestion, 1 μl of 50% trifluoroacetic acid (TFA) was added to the peptide mixture to quench the trypsin activity. Peptides were purified using StageTips C18 (Thermo Scientific) according to the instruction manual [2]. Briefly, the C18 tips were first activated with solvent A (80% acetonitrile, 5% formic acid) and re-equilibrated with solvent B (5% formic acid) before sample loading. The peptides were then eluted twice with 20 μl of solvent A. Finally, the peptide eluate was dried in a SpeedVac and stored at −30 °C until LC-MS/MS analysis. We estimated by BCA assay that around 1 μg of peptides could be extracted from one sample. Because the peptide yield was very limited, we did not perform the BCA assay on the samples that were analyzed by LC-MS/MS. All peptides extracted from each sample were directly analyzed in triplicate.
Each peptide sample was solubilized in the sample solution (2% acetonitrile, 0.1% formic acid) and measured in triplicate on an LTQ-Orbitrap XL (Thermo Scientific) coupled to nanoscale C18 reversed-phase liquid chromatography (DiNa-A, KYA Technologies). The peptides were separated on a C18 column (75 μm × 100 mm, particle size 3 μm, pore size 120 Å) and eluted with a 120 min mobile-phase gradient at a flow rate of 300 nl/min. MS survey scans (m/z 350-1600, resolution 60,000) were acquired in the Orbitrap, and the five most intense precursor ions were fragmented in the linear ion trap. The dynamic exclusion time was set to 60 s. The MS/MS spectral data obtained from the triplicate measurements of each sample were merged by Mascot Daemon software (Matrix Science) and then searched against the UniProtKB/Swiss-Prot human database (release 2014_04) using the Mascot search engine (version 2.3.01). The parameters used for protein identification were as follows: peptide tolerance, 10 ppm; MS/MS tolerance, 0.8 Da; fixed modification, none; variable modification, oxidation of methionine (M), histidine (H), and tryptophan (W); number of missed trypsin cleavages, 2; significance threshold, p < 0.01; protein scoring, MudPIT (multidimensional protein identification technology). The peptide FDR was controlled at < 1%. Protein hits with two matched peptides were considered confident identifications. To eliminate protein redundancy, proteins with different accession numbers but the same gene name were grouped, and the protein with the highest score was selected to generate the final protein identification list. The reproducibility of peptide and protein identifications among the three samples in each group is shown by Venn diagrams (Fig. 1).
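The redundancy-elimination step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `ProteinHit` record, its field names, the accession and gene labels, and the scores are hypothetical, and the peptide filter is assumed here to mean "at least two matched peptides".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinHit:
    """Hypothetical record for one protein hit in a search result."""
    accession: str
    gene: str
    score: float
    matched_peptides: int


def collapse_by_gene(hits, min_peptides=2):
    """Keep one representative hit per gene name.

    Hits with fewer than `min_peptides` matched peptides are dropped
    (assumption: the filter means "at least two"). Among the remaining
    hits sharing a gene name, the highest-scoring one is kept.
    """
    best = {}
    for hit in hits:
        if hit.matched_peptides < min_peptides:
            continue
        current = best.get(hit.gene)
        if current is None or hit.score > current.score:
            best[hit.gene] = hit
    return best


# Illustrative values only; these are not data from the study.
hits = [
    ProteinHit("ACC_A", "GENE1", 850.0, 12),
    ProteinHit("ACC_B", "GENE1", 420.0, 5),   # same gene, lower score
    ProteinHit("ACC_C", "GENE2", 95.0, 1),    # fails the peptide filter
    ProteinHit("ACC_D", "GENE3", 310.0, 3),
]

representatives = collapse_by_gene(hits)
for gene, hit in sorted(representatives.items()):
    print(gene, hit.accession, hit.score)
```

In this toy input, GENE1 is represented by ACC_A, GENE2 is removed by the peptide filter, and GENE3 is kept unchanged.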
The label-free quantitative proteomic analysis was performed using the Normalized Spectral Index (SI_N) as described previously [3,4]. This label-free quantitative value, SI_N, combines multiple MS abundance features for a given protein hit: peptide count, spectral count for all assigned peptides, total ion intensities of all matched MS/MS spectra, and protein length (number of amino acids). First, the abundance of a protein is calculated as the cumulative ion intensity of all MS/MS spectra assigned to the protein hit. Second, this intensity is normalized by dividing it by the total intensity of all proteins identified in the dataset. Third, the normalized intensity is divided by the protein length to obtain the relative molar concentration of the protein in the sample. In this study, we grouped the identified proteins according to their gene names, filtered out duplicate spectra occurring in the same group, and finally calculated the SI_N value for each protein group using the following formula:

$$(\mathrm{SI}_N)_j = \frac{\mathrm{SI}_j}{\left(\sum_{i=1}^{n} \mathrm{SI}_i\right) L_j}$$

where $(\mathrm{SI}_N)_j$ is the normalized spectral index of the protein group with gene name $j$; $\mathrm{SI}_j$ is the total ion intensity of the MS/MS spectra assigned to the protein group with gene name $j$; $\sum_{i=1}^{n} \mathrm{SI}_i$ is the total ion intensity of all $n$ protein groups identified in a given sample; and $L_j$ is the length of the protein having the highest protein score in the group with gene name $j$.
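The three calculation steps described above (summing intensities, normalizing by the sample total, dividing by protein length) can be sketched in Python as follows. This is an illustrative reimplementation under stated assumptions, not the authors' script. The input layout (gene name mapped to a list of `(query_id, intensity)` pairs plus a protein length), the function names, and the numbers are hypothetical. Duplicate spectra are assumed to be recognizable by a shared query identifier.

```python
def spectral_index(spectra):
    """Sum MS/MS ion intensities, counting each spectral query once.

    `spectra` is a list of (query_id, intensity) pairs. Entries that
    repeat a query_id within the group are treated as duplicates
    (assumption) and contribute only once.
    """
    unique = {}
    for query_id, intensity in spectra:
        unique[query_id] = intensity
    return sum(unique.values())


def normalized_spectral_index(groups):
    """Compute SI_N for each protein group in one sample.

    `groups` maps gene name -> (spectra, length), where `length` is the
    amino-acid length of the highest-scoring protein in the group.
    """
    si = {gene: spectral_index(spectra) for gene, (spectra, _) in groups.items()}
    total = sum(si.values())
    if total <= 0:
        raise ValueError("total ion intensity must be positive")
    return {gene: si[gene] / total / groups[gene][1] for gene in groups}


# Illustrative values only; these are not data from the study.
sample = {
    "GENE1": ([(1, 600.0), (2, 200.0), (2, 200.0)], 400),  # query 2 is duplicated
    "GENE2": ([(3, 200.0)], 100),
}

sin = normalized_spectral_index(sample)
for gene, value in sorted(sin.items()):
    print(gene, value)
```

In this toy sample, GENE1 carries four times the intensity of GENE2 but is also four times longer, so both groups receive approximately the same value (about 0.002). This shows how the length term turns a share of total intensity into a relative molar measure.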
The SI_N calculation was performed using an in-house Excel VBA script. Mascot search results with detailed peptide/protein identification for each sample (e.g., peptide-spectrum matches, protein hits, and ion intensities of MS/MS spectral queries) are shown in Supplementary Table 1. In this table, the proteomic data necessary for the SI_N calculation are highlighted in blue. The SI_N values of the identified proteins for the three samples in each group and the results of statistical analysis are shown in Supplementary Table 2.

Figure legend (abbreviations): COL12A1, collagen, type XII, alpha 1; COL4A1, collagen, type IV, alpha 1; COL4A2, collagen, type IV, alpha 2; COL6A1, collagen, type VI, alpha 1; COL6A2, collagen, type VI, alpha 2; COL6A3, collagen, type VI, alpha 3; COL7A1, collagen, type VII, alpha 1; LAMA5, laminin, alpha 5; LAMB2, laminin, beta 2 (laminin S); LAMC1, laminin, gamma 1; BGN, biglycan; LUM, lumican; VTN, vitronectin; POSTN, periostin; ACTN4, actinin, alpha 4; EZR, ezrin; PODXL, podocalyxin; NPHS2, podocin; SYNPO, synaptopodin; CLIC4, chloride intracellular channel protein 4; TJP1, tight junction protein 1; CTTN, cortactin; NPHS1, nephrin. Results are shown as mean ± SD. *p < 0.05; **p < 0.01; ***p < 0.001.
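The mean ± SD summary used in the figure legend can be reproduced from per-sample SI_N values with a short Python sketch. The numbers below are hypothetical. The use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not state which form was used. The statistical test behind the p-values is not specified in this article and is therefore not sketched.

```python
from statistics import mean, stdev


def summarize(values_by_group):
    """Return (mean, sample SD) for each group of per-sample values."""
    return {
        group: (mean(values), stdev(values))  # stdev uses the n - 1 denominator
        for group, values in values_by_group.items()
    }


# Hypothetical SI_N values for one protein, three samples per group.
sin_by_group = {
    "Nor": [1.0, 2.0, 3.0],
    "i-GS": [2.0, 4.0, 6.0],
    "GS": [3.0, 3.0, 3.0],
}

summary = summarize(sin_by_group)
for group, (m, sd) in summary.items():
    print(f"{group}: {m:.2f} ± {sd:.2f}")
```

With only three samples per group, the choice between the sample and population standard deviation changes the reported spread noticeably, so it is worth stating explicitly when reusing the data.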