Data for proteome analysis of Bacillus lehensis G1 in starch-containing medium

Bacillus lehensis G1 is a cyclodextrin glucanotransferase (CGTase) producer that can degrade starch into cyclodextrin. Here, we present the proteomics data of B. lehensis cultured in starch-containing medium, which are related to the article “Proteome-based identification of signal peptides for improved secretion of recombinant cyclomaltodextrin glucanotransferase in Escherichia coli” (Ling et al., in press). This dataset was generated to better understand the secretion of proteins involved in starch utilization for sustained bacterial growth. Proteins were separated by two-dimensional gel electrophoresis (2-DE), digested with trypsin, and detected using MALDI-TOF/TOF. Proteins were classified into functional groups using the information available in the SubtiList web server (http://genolist.pasteur.fr/SubtiList/).


Value of the data
This dataset will be of value to the scientific community working on Bacillus species because it documents the proteins secreted by a Bacillus sp. in response to starch.
These data extend the information available on proteome/secretome changes in B. lehensis G1 and can be used as a reference for comparative experiments with different carbon sources.
Further analysis of the data should allow new insights into the mechanisms by which B. lehensis proteins are released into the extracellular space.

Data
Extracellular proteins of B. lehensis were subjected to 2-DE analysis, producing an extracellular proteome map [1]. The 87 proteins identified on the 2-DE gel are listed in Table 1. Fig. 1 groups the identified proteins by functional category; most are implicated in the metabolism of carbohydrates and related molecules (20%), the cell wall (12%), or the metabolism of nucleotides and nucleic acids (11%), or are proteins of unknown function (12%). The supplementary information table lists all peptide sequences assigned by MALDI-TOF/TOF analysis for the 87 putative secreted proteins.

Preparation of extracellular proteins for proteome analysis
B. lehensis G1 extracellular proteins were collected at mid-log phase as previously described [2], with slight modifications. Cells were removed from the growth medium by centrifugation at 10,414 × g and 4°C for 15 min. Proteins in the supernatant were precipitated with 10% (w/v) pre-chilled trichloroacetic acid for 30 min and collected by centrifugation at 10,414 × g for 15 min. The protein pellet was washed twice with pre-chilled acetone; the supernatant was then removed, and the pellet was air-dried for 5 min. Finally, the pellet was resolubilized in rehydration buffer (8 M urea, 40 mM dithiothreitol, 2% CHAPS, 0.5% (v/v) carrier ampholytes, 1 mM protease inhibitor cocktail, 0.002% bromophenol blue). The protein concentration of the extracellular protein sample was determined using a 2-D Quant Kit (GE Healthcare, United Kingdom) according to the manufacturer's protocol.
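As an illustration of the rehydration buffer composition given above, the following minimal sketch computes the amount of each component for a chosen final volume. It is not part of the published protocol. The molar masses are standard reference values rather than values from the article, the function name is hypothetical, and the 2% CHAPS and 0.002% bromophenol blue are assumed to be weight/volume percentages because the text does not specify. The protease inhibitor cocktail is omitted because its stock concentration is not stated.

```python
# Molar masses in g/mol (standard reference values, not taken from the article).
UREA_G_PER_MOL = 60.06
DTT_G_PER_MOL = 154.25


def rehydration_buffer_amounts(volume_ml: float) -> dict[str, float]:
    """Return the amount of each component needed for `volume_ml` of buffer.

    Solids are in grams, the carrier ampholytes in millilitres.
    """
    volume_l = volume_ml / 1000.0
    return {
        # 8 M urea: mol/L x L x g/mol
        "urea_g": 8.0 * volume_l * UREA_G_PER_MOL,
        # 40 mM dithiothreitol
        "dtt_g": 0.040 * volume_l * DTT_G_PER_MOL,
        # 2% CHAPS, ASSUMED w/v (2 g per 100 mL)
        "chaps_g": 0.02 * volume_ml,
        # 0.5% (v/v) carrier ampholytes
        "carrier_ampholytes_ml": 0.005 * volume_ml,
        # 0.002% bromophenol blue, ASSUMED w/v
        "bromophenol_blue_g": 0.00002 * volume_ml,
    }


if __name__ == "__main__":
    for name, amount in rehydration_buffer_amounts(10.0).items():
        print(f"{name}: {amount:.4f}")
```

Because 8 M urea contributes substantially to the solution volume, the solids would normally be dissolved in less liquid than the target volume and then brought up to the final volume.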

Two-dimensional gel electrophoresis (2-DE), gel analysis, and protein identification
First-dimension isoelectric focusing was carried out using an IEF 100 (Hoefer, United States), and second-dimension sodium dodecyl sulfate-polyacrylamide gel electrophoresis (Bio-Rad, United States) was conducted using a VS20 WAVE Maxi (Cleaver Scientific Ltd, United Kingdom). The protocols were carried out according to the manufacturers' recommendations. Protein spots were in-gel digested using a trypsin digestion kit (Thermo Scientific, United States). The digested peptides were purified and concentrated using ZipTip C18 (Merck Millipore, United States) before spotting onto a target plate (AnchorChip Standard, 800 µm; Bruker, United States). An UltraFlex MALDI-TOF/TOF mass spectrometer (Bruker) was used to analyze the digested peptides. Mass spectrometry spectra were gathered with 3000 laser shots per spectrum, and tandem mass spectrometry spectra were acquired with 4000 laser shots per fragmentation spectrum. The peptide mass fingerprinting peaks with the highest intensities (a maximum of the 20 strongest peaks) were selected as precursor ions to acquire MS/MS fragmentation data. Bruker Daltonics BioTools 3.2 SR3 was used for spectra analyses and the generation of peak list files. The signal-to-noise threshold was set at 7. The peak list files were used to search an in-house B. lehensis G1 database (4017 sequences; 1,166,855 residues) using MASCOT version 2.4 (Matrix Science). The search parameters were as follows: trypsin as the proteolytic enzyme, a maximum of one missed cleavage, oxidation of methionine as a variable modification, carbamidomethylation of cysteine residues as a fixed modification, a peptide mass tolerance of 100 ppm for monoisotopic data, and a fragment mass tolerance of 0.4 Da.

Table footnotes: a ... Figure S1 [1]. b The AIC gene numbering is according to the NCBI taxonomy database for B. lehensis G1. c The annotation was primarily based on the genome annotation of B. lehensis G1. d PMF represents peptide mass fingerprinting using MALDI-TOF MS, and PFF represents peptide fragment fingerprinting using MALDI-TOF/TOF MS.
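The peak selection and mass tolerances described in this section can be illustrated with a short sketch. It is not the BioTools or MASCOT implementation. The `Peak` type and all function names are hypothetical, and two points are assumptions: that the signal-to-noise threshold is applied before the strongest peaks are chosen, and that it is inclusive (a ratio of exactly 7 passes).

```python
from typing import NamedTuple


class Peak(NamedTuple):
    mz: float         # mass-to-charge ratio
    intensity: float  # arbitrary units
    snr: float        # signal-to-noise ratio


def select_precursors(peaks, min_snr=7.0, max_precursors=20):
    """Keep peaks with S/N >= min_snr, then return the most intense ones.

    At most `max_precursors` peaks are returned, strongest first.
    """
    kept = [p for p in peaks if p.snr >= min_snr]
    kept.sort(key=lambda p: p.intensity, reverse=True)
    return kept[:max_precursors]


def within_ppm(observed, theoretical, tol_ppm=100.0):
    """True if the relative mass error is within tol_ppm parts per million."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def within_da(observed, theoretical, tol_da=0.4):
    """True if the absolute mass error is within tol_da daltons."""
    return abs(observed - theoretical) <= tol_da


if __name__ == "__main__":
    # Made-up peaks for illustration only; these are not values from the dataset.
    spectrum = [
        Peak(1000.5, 50.0, 10.0),
        Peak(1200.6, 80.0, 5.0),   # below the S/N threshold, so it is dropped
        Peak(1500.7, 30.0, 8.0),
    ]
    for peak in select_precursors(spectrum):
        print(peak)
    print(within_ppm(1000.05, 1000.0))  # a 50 ppm error against a 100 ppm limit
    print(within_da(500.3, 500.0))      # a 0.3 Da error against a 0.4 Da limit
```

The peptide tolerance is relative, so the permitted absolute error grows with mass (100 ppm is 0.1 Da at 1000 Da), whereas the fragment tolerance is a fixed 0.4 Da window.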

In silico analysis
Identified proteins were classified into functional groups using the information available in the SubtiList web server (http://genolist.pasteur.fr/SubtiList/).
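Once each protein has been assigned a functional category, the percentage share of each category (as summarized in Fig. 1) can be tallied with a few lines of code. This is a minimal sketch. The function name and the protein identifiers in the example are hypothetical placeholders, not entries from Table 1, and the sketch does not query SubtiList itself.

```python
from collections import Counter


def category_shares(assignments: dict[str, str]) -> dict[str, float]:
    """Map each functional category to its percentage of all assigned proteins.

    `assignments` maps a protein identifier to its functional category.
    Categories are returned from most to least frequent.
    """
    counts = Counter(assignments.values())
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.most_common()}


if __name__ == "__main__":
    # Placeholder identifiers and assignments, for illustration only.
    example = {
        "protein_1": "metabolism of carbohydrates and related molecules",
        "protein_2": "metabolism of carbohydrates and related molecules",
        "protein_3": "cell wall",
        "protein_4": "unknown function",
    }
    for category, share in category_shares(example).items():
        print(f"{category}: {share:.1f}%")
```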