Electrophoretic Extraction and Proteomic Characterization of Proteins Buried in Marine Sediments

Proteins are the largest defined molecular component of marine organic nitrogen, and hydrolysable amino acids, the building blocks of proteins, are important components of particulate nitrogen in marine sediments. In oceanic systems, the largest contributors are phytoplankton proteins, which have been tracked from newly produced bloom material through the water column to surface sediments in the Bering Sea, but it is not known if proteins buried deeper in sediment systems can be identified with confidence. Electrophoretic gel protein extraction methods followed by proteomic mass spectrometry and database searching were used to identify buried phytoplankton proteins in sediments from the 8–10 cm section of a Bering Sea sediment core. More peptides and proteins were identified using an SDS-PAGE tube gel than a standard 1D flat gel or digesting the sediment directly with trypsin. The majority of proteins identified correlated to the marine diatom, Thalassiosira pseudonana, rather than bacterial protein sequences, indicating that an algal source dominates not only the input, but also the preserved protein fraction. Abundant RuBisCO and fucoxanthin chlorophyll a/c binding proteins were identified, supporting algal sources of these proteins and reinforcing the proposed mechanisms that might protect proteins for long time periods. Some preserved peptides were identified in unexpected gel molecular weight ranges, indicating that some structural changes or charge alteration influenced the mobility of these products during electrophoresis isolation. Identifying buried photosystem proteins suggests that algal particulate matter is a significant fraction of the preserved organic carbon and nitrogen pools in marine sediments.
Chromatography 2014, 1 (open access)


Introduction
Much of the ocean is influenced by nitrogen limitation [1], and thus, understanding marine protein cycling is important for tracking the global organic nitrogen cycle. Solid state NMR has provided evidence that the majority of organic nitrogen in dissolved and particulate marine organic matter contains amide bonds as found in proteins [2][3][4]. Protein building blocks, such as total hydrolysable amino acids (THAAs, total amino acids that can be extracted using 6 N HCl), are found to account for up to 30%-40% of particulate nitrogen in marine sediments [5][6][7][8]. In addition to proteins representing the largest fraction of organic nitrogen, within the unique amino acid sequence, proteins also can provide functional and phylogenetic information on the organisms from which they were produced, potentially making peptides unique to taxonomic groups useful biomarkers. Identifying intact proteins and peptides buried in marine sediments would give valuable information towards understanding marine biogeochemical cycles and reconstructing algal/microbial populations [9,10]. The complex matrix effects of sediments [11][12][13], however, have made extracting buried proteins for further analysis a challenge [14][15][16]. New bioseparation methods are needed to successfully extract proteins from deep sediments for subsequent analyses.
Gel electrophoresis is the classic method for protein separation and visualization, including sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and related approaches that separate proteins based primarily on molecular weight [17]. The SDS-PAGE method has been a standard technique for protein analysis in the fields of biochemistry, cell biology and medical sciences for decades [18][19][20][21][22][23][24]. Modified SDS-PAGE methods have been used to purify proteins from cell cultures before mass spectrometry analysis [25,26] and to separate proteins from soil trichloroacetic acid (TCA) extracts [27,28]. Recent studies by Moore et al. [16,29] found that intact algal proteins and peptides could be tracked and identified from the marine water column and extracted from surface sediments in the Bering Sea using modified SDS-PAGE tube gels where sediment-buffer mixtures were loaded directly to the gels for separation. In-gel protein extractions were then performed, and peptides were identified using proteomic mass spectrometry and database searching. It is unknown if these methods could be applied to identify algal proteins that have potentially been buried in marine sediments for extensive periods of time.
The goal of this study was to test sediment gel-based extraction methods with tube and 1D flat gels on field-collected deep marine sediments of the Bering Sea to determine if algal proteins identified previously in surface sediments could be identified after long-term burial. These methods were compared with direct sediment trypsin digest as a control.

Sediment Collection
The Bering Sea is known to be one of the world's most productive ecosystems [30][31][32], resulting, in part, in the high export of primary production material to sediments after bloom termination [33][34][35]. These factors, along with the identification of algal proteins in Bering Sea surface sediments, make down core sediments from this region ideal for identifying buried proteins. Bering Sea sediment was collected from the outer shelf during the Bering Sea Ecosystem Study (BEST) spring 2009 cruise on April 9, 2009, using a multi-corer (see [36]). Sediment was collected before the spring phytoplankton bloom to ensure that primary production exported to the sediment was from the previous year. The location of the sediment core was Latitude 59.9004 N, Longitude 171.5952 W, depth 101 m, temperature −0.15 °C. The sediment core was sliced into multiple sections, and the 8 to 10 cm section was used for the analysis of buried proteins. Sediments were frozen and stored at −70 °C until analysis.

Gel Electrophoresis Protein Extraction
To extract buried proteins from sediment, 1.5 g wet weight of 8 to 10 cm sediment was treated with 500 μL of extraction buffer in a 2 mL Eppendorf tube and pulse sonicated for 1 min on ice (Bronson sonicating microprobe, 20 kHz). The extraction buffer consisted of 7 M urea, 2 M thiourea, 0.01 M Tris-HCl, 1 mM EDTA, 10% v/v glycerol, 2% CHAPS, 0.2% w/v ampholytes (Fluka BioChemika, high resolution pH 3-10, 40% in water) and 2 mM tributylphosphine [37]. The entire sediment + extraction buffer mixture was then loaded onto gel-prep cell tubes for isolation and molecular weight separation. The gel composition was 17% acrylamide/Bis, 0.125 Tris-HCl. The amount of protein material loaded onto the gel was determined by measuring the concentration of THAAs in the sediment buffer mixture as a proxy for total protein (buffer extraction efficiency was 12%). Gels were covered with running buffer (0.25 M Tris, 0.192 M glycine, 0.1% SDS, pH 8.3) and run at 180 V until the ion front had traveled 7 cm or 2 cm from the top in separate gels. After electrophoresis, the 7 cm gel section was cut into five molecular weight ranges for digestion and analysis (<10, 10-25, 25-50, 50-100 and >100 kDa) based on external MW standards (Bio-Rad Kaleidoscope) in identical gels. In addition, 1-dimensional pre-cast 12% Bis-Tris flat gels (Invitrogen NuPAGE Novex) were loaded with 45 µL of sediment + buffer (the same ratio of sediment and extraction buffer as the above tube gels). Gel lanes were covered with agarose overlay and running buffer and run at 180 V until the ion front had traveled 2 cm down the 1D gel. The top 2 cm of each ion front gel was excised for digestion and analysis. All gels were washed three times with nanopure water to remove excess SDS prior to digestion.
A control gel was run with multiple lanes loaded with sonicated diatom cells. Approximately 0.75 g (wet weight) of cultured Thalassiosira weissflogii cells were sonicated in 1 mL of 44 mM ammonium bicarbonate for 1 min on ice (Bronson sonicating microprobe, 20 kHz). Total hydrolysable amino acid (THAA) analysis was performed on the sonicated T. weissflogii/ammonium bicarbonate mixture as a proxy for protein material for gel loading. Six lanes of a 1-dimensional pre-cast 12% Bis-Tris flat gel (Invitrogen NuPAGE Novex) were loaded with sonicated cellular material containing 40 µg of THAAs per lane. Gel lanes were covered with agarose overlay and running buffer and run at 180 V until the ion front traveled 10 cm down the gel. The gel was washed three times with DI water, and each gel lane was cut separately into molecular weight sections (<10, 10-25, 25-50, 50-100 and >100 kDa) based on internal molecular weight standards run in a separate gel lane. Three gel lanes were taken for trypsin digestions and proteomic analysis, and three gel lanes were taken for electroelution and THAA analysis. The electroelution molecular weight sections were separately placed in the electroelution cell (Bio-Rad, Hercules, CA, USA), covered with running buffer and run at 100 V for 1 h to elute protein material out of the gel section. The eluted protein was collected in an elution chamber with a 1 kDa molecular weight membrane, so that all material above 1 kDa was collected.
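The five molecular weight sections used for both the sediment and control gels can be expressed as a simple lookup, which is also handy later when checking whether a protein's known intact mass falls inside the section in which its peptides were found. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, and the treatment of a protein sitting exactly on a section boundary (assigned here to the higher section) is an assumption.

```python
from bisect import bisect_right

# Section edges (kDa) as given in the text; boundary handling is an assumption.
SECTION_EDGES_KDA = [10.0, 25.0, 50.0, 100.0]
SECTION_LABELS = ["<10", "10-25", "25-50", "50-100", ">100"]


def gel_section(mw_kda: float) -> str:
    """Return the label of the gel section expected for an intact protein mass."""
    if mw_kda <= 0:
        raise ValueError("molecular weight must be positive")
    return SECTION_LABELS[bisect_right(SECTION_EDGES_KDA, mw_kda)]


def distance_outside(mw_kda: float, section_label: str) -> float:
    """Return how far (kDa) a mass lies outside a section; 0.0 if it is inside."""
    idx = SECTION_LABELS.index(section_label)
    lower = 0.0 if idx == 0 else SECTION_EDGES_KDA[idx - 1]
    upper = float("inf") if idx == len(SECTION_EDGES_KDA) else SECTION_EDGES_KDA[idx]
    if mw_kda < lower:
        return lower - mw_kda
    if mw_kda > upper:
        return mw_kda - upper
    return 0.0


if __name__ == "__main__":
    # Illustrative mass only, not a value taken from the study's tables.
    print(gel_section(54.0), distance_outside(54.0, ">100"))
```

A protein whose peptides appear in a section for which `distance_outside` is greater than zero is one of the "unexpected section" cases discussed in the results.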

In Gel Protein Digestion
Gel molecular weight sections and 2 cm gels were cut into 2 × 2 mm slices to increase the surface area for enzyme and chemical access. Pieces were covered together with 100 mM ammonium bicarbonate and rinsed for 15 min to hydrate the gel sections and then rinsed for 15 min in acetonitrile to dehydrate the gel sections and remove detergents and other chemical interferences. The rinse cycle was repeated five times, and the gel sections were then dried by speed-vac for 45 min. Reduction, alkylation and digestion followed the standard procedure of Shevchenko et al. [38]. Digests were dried, and volumes were adjusted to give a final protein concentration of 1 µg protein/10 µL based on THAA concentration recoveries.

Direct Digestion of Sediment
As a control, 8-10 cm sediment was digested directly with trypsin by combining 100 mg of sediment with 300 µL of 6 M urea and 50 mM ammonium bicarbonate. The sediment was then sonicated (Bronson sonicating probe, 20 kHz for 60 s on ice). The pH was raised by adding 18 µL 1.5 M Tris-HCl (pH 8.8). To reduce disulfide bonds in proteins, 7.5 µL TCEP (3,3',3''-phosphanetriyltripropanoic acid) was added to the sediment, which was then vortexed and incubated for 1 h (37 °C). Proteins were then alkylated by adding 60 µL of 200 mM iodoacetamide (IAM) and incubating in the dark for 1 h. After the addition of 60 µL of dithiothreitol and a further incubation (1 h, room temperature), the urea was diluted with the addition of 2.4 mL 25 mM ammonium bicarbonate, 600 µL HPLC-grade methanol and 1 µg of sequencing grade trypsin. The trypsin incubation was completed overnight at room temperature. Samples were centrifuged (14,000× g, 20 min), and the digest with buffer was removed. The sediments were then washed 3 times with 1 mL 25 mM ammonium bicarbonate and centrifuged, and the extracts were combined. The volume was reduced to ~10 µL, and 200 µL of 5% acetonitrile and 0.1% trifluoroacetic acid were added before desalting the peptides using C18 micro prep desalting centrifuge columns (NEST Group, Southboro, MA, USA). Sample pH was adjusted to <2 using small additions of 10% TFA, and the sample was desalted using the protocol provided by the manufacturer.
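The dilution step exists to bring the urea concentration down before trypsin is added. As a rough check, the volumes listed above can be summed to estimate the final urea concentration. This sketch assumes that the 300 µL is a single solution containing 6 M urea, that sediment pore water and the (unstated) trypsin volume are negligible, and that volumes are additive; the names are illustrative.

```python
# Liquid additions in the direct digest, in µL, as listed in the text.
ADDITIONS_UL = {
    "urea / ammonium bicarbonate": 300.0,
    "Tris-HCl": 18.0,
    "TCEP": 7.5,
    "alkylating reagent": 60.0,
    "dithiothreitol": 60.0,
    "25 mM ammonium bicarbonate": 2400.0,
    "methanol": 600.0,
}

UREA_STOCK_M = 6.0          # concentration of the initial urea solution
UREA_STOCK_VOLUME_UL = 300.0


def diluted_concentration(stock_molar: float, stock_ul: float, total_ul: float) -> float:
    """C1 * V1 = C2 * V2, solved for C2."""
    return stock_molar * stock_ul / total_ul


total_volume_ul = sum(ADDITIONS_UL.values())
final_urea_m = diluted_concentration(UREA_STOCK_M, UREA_STOCK_VOLUME_UL, total_volume_ul)

if __name__ == "__main__":
    print(f"total volume: {total_volume_ul:.1f} µL")
    print(f"estimated urea concentration during digestion: {final_urea_m:.2f} M")
```

Under these assumptions the urea is diluted by more than a factor of ten, to roughly 0.5 M, before the overnight trypsin incubation.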

Mass Spectrometry and Database Searching
Protein identification was accomplished via shotgun proteomics with samples introduced into the ion trap (LTQ Velos) mass spectrometer (Thermo Fisher, Waltham, MA, USA) by NanoAcquity high performance liquid chromatography (HPLC, Waters, Milford, MA, USA). New analytical and trapping columns were packed in-house prior to batch analyses of Bering Sea samples to prevent sample carryover. Analytical columns were made using 11 cm, 75 µm i.d. fused silica capillaries packed with C18 particles (Magic C18AQ, 100 Å, 5 µm; Michrom Bioresources, Billerica, MA, USA) preceded by a 2 cm, 100 µm i.d. trapping column (Magic C18AQ, 200 Å, 5 µm; Michrom). Samples were loaded onto the trapping column with a flow rate of 4 µL min⁻¹ for 7 min and then entered the analytical column at a flow rate of 250 nL min⁻¹ (total run time: 100 min). Peptides were eluted using an acidified (formic acid, 0.1% v/v) water-acetonitrile linear gradient (5%-35% acetonitrile in 60 min) and ionized at atmospheric pressure before entering the mass spectrometer. Ions that entered the ion trap were surveyed (MS1), and the fourteen most intense ions from scans having either +2, +3, +4 or +5 charge states were selected for collision-induced dissociation (CID) and tandem mass spectral (MS2) detection. Sample digests were analyzed using full scan (m/z 350-2,000), followed by gas phase fractionation with repeat analyses over multiple narrow, but overlapping, mass to charge ranges (e.g., m/z 350-444, 444-583, 583-825, 825-1600; [39]).
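Gas phase fractionation amounts to re-injecting the same digest while restricting the survey scan to one m/z window at a time. The sketch below shows how a precursor m/z maps onto the example windows quoted above. It is only an illustration: the windows are the example values from the text, treating them as half-open intervals that share edges is an assumption (the actual acquisition windows are described as overlapping), and the function names are hypothetical.

```python
from typing import Optional, Tuple

# Example gas phase fractionation windows (m/z) quoted in the text.
GPF_WINDOWS = [(350.0, 444.0), (444.0, 583.0), (583.0, 825.0), (825.0, 1600.0)]


def gpf_window(mz: float) -> Optional[Tuple[float, float]]:
    """Return the window containing a precursor m/z, or None if it is outside all windows."""
    for low, high in GPF_WINDOWS:
        if low <= mz < high:  # half-open interval: an assumption for shared edges
            return (low, high)
    return None


def windows_are_contiguous() -> bool:
    """Check that each window starts where the previous one ends (no gaps)."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(GPF_WINDOWS, GPF_WINDOWS[1:]))


if __name__ == "__main__":
    for mz in (400.2, 519.27, 900.0, 1800.0):
        print(mz, gpf_window(mz))
```

An ion above m/z 1600 returns `None` here, meaning it would only be seen in the full scan (m/z 350-2,000) analysis.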
Mass spectra were interpreted and searched using an in-house copy of SEQUEST on a Beowulf-style computer cluster with 800 dedicated processing cores and 22 terabytes of storage [40,41]. All data searches were performed with no assumption of proteolytic enzyme (i.e., unconstrained search), specifically to allow for the identification of the maximal number of protein degradation products. Fixed modifications were set for 57 Da on cysteine, resulting from the IAM alkylation step, and 16 Da on methionine via oxidation. Each tandem mass spectrum was then searched against a protein sequence database to correlate predicted peptide fragmentation patterns with observed sample ions. To objectively validate peptide and protein identifications, two statistical evaluations using PeptideProphet and ProteinProphet were used to provide probability-based scores [42,43]. Probability thresholds for positive identifications of proteins and peptides were strictly set at 90% confidence on ProteinProphet and PeptideProphet for SEQUEST search results. Based on the databases used, the false discovery rate was calculated to be 0.5% for database searches. Mass spectra from all samples were searched against a database containing the proteomes of Thalassiosira pseudonana (marine diatom), Prochlorococcus marinus (marine cyanobacterium) and Pelagibacter ubique (marine bacterium belonging to the SAR11 clade). Identified sedimentary peptides possessed amino acid sequences that matched peptide sequences within the proteomes of the above mentioned species.
This database was chosen after extensive comparison revealed that larger databases, including the NCBI non-redundant database containing over 11.9 million protein sequences and the Global Ocean Survey Combined Assembly Protein database [44,45], did not enhance the number of proteins identified, added limited species diversity to identified proteins and had 95% functional agreement with proteins identified from Bering Sea sediment with the smaller database [16].
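To make the modification settings concrete, the sketch below computes the theoretical monoisotopic mass and precursor m/z of a peptide with the two mass shifts applied to every cysteine and methionine, as described for the searches above. It is not SEQUEST code. The residue masses are standard monoisotopic values, the nominal 57 Da and 16 Da shifts are entered as the usual exact values for carbamidomethylation (57.02146) and oxidation (15.99491), and the peptide "MCK" is a made-up example rather than a sequence from this study.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water per peptide (free N- and C-termini)
PROTON = 1.007276  # mass of the charge carrier

# Mass shifts applied to every occurrence of the residue, following the text.
FIXED_MODS = {"C": 57.02146, "M": 15.99491}


def peptide_mass(sequence: str) -> float:
    """Monoisotopic neutral mass of a peptide including the fixed modifications."""
    total = WATER
    for residue in sequence:
        total += RESIDUE_MASS[residue] + FIXED_MODS.get(residue, 0.0)
    return total


def precursor_mz(sequence: str, charge: int) -> float:
    """Theoretical m/z of the protonated peptide at the given charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    print(round(peptide_mass("MCK"), 4), round(precursor_mz("MCK", 2), 4))
```

Because the search was unconstrained with respect to enzyme, every substring of every database protein is a candidate peptide, and a calculation of this kind is what ties a candidate sequence to an observed precursor ion.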

Total Hydrolysable Amino Acid Analysis
Individual amino acids were identified and quantified by gas chromatography (GC) and GC mass spectrometry (GC/MS) using the EZ:faast method (Phenomenex, Torrance, CA, USA), which uses derivatization of amino acids with propyl chloroformate and propanol for sensitive detection (see [46] for a comparison of the methods). Briefly, sediment samples, sediment buffer mixtures, T. weissflogii buffer mixtures and gel section electroelutions were hydrolyzed for 4 h at 110 °C [47,48] with 6 M analytical-grade HCl and L-methylleucine as the recovery standard. Following hydrolysis and derivatization, amino acids were quantified using an Agilent 6890 capillary GC with samples injected at 250 °C and separated via a DB-5MS (0.25 mm ID, 30 m) column with H2 as the carrier gas. The oven was ramped from an initial temperature of 110 °C to 280 °C at 10 °C per minute, followed by a 5 min hold. For amino acid identification, the GC was coupled to an Agilent 5973N mass spectrometer run under the same conditions with helium as the carrier gas and acquisition of spectra over the 50-600 Da range. Bovine serum albumin (BSA) was analyzed in parallel to correct for responses among individual amino acids and for calculation of molar ratios.
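The oven program described above implies a fixed run length per injection, which can be worked out directly. This small sketch assumes a single linear ramp with no initial hold and ignores cool-down and equilibration time between injections; the function name is illustrative.

```python
def oven_program_minutes(start_c: float, end_c: float,
                         ramp_c_per_min: float, final_hold_min: float) -> float:
    """Length of a single-ramp GC oven program: ramp time plus the final hold."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return (end_c - start_c) / ramp_c_per_min + final_hold_min


# Values from the text: 110 °C to 280 °C at 10 °C per minute, then a 5 min hold.
run_minutes = oven_program_minutes(110.0, 280.0, 10.0, 5.0)

if __name__ == "__main__":
    print(f"oven program length: {run_minutes:.0f} min")
```

With the stated settings, the ramp takes 17 min and the full program 22 min.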

Results and Discussion
Using the tube gel method run for 7 cm, 21 peptides were identified combined from all of the gel molecular weight range sections, correlating to 11 proteins (Table 1). From the 2 cm tube gel, 2 cm flat gel and direct digest methods, 3, 1 and 1 peptides were identified correlating to 3, 1 and 1 proteins, respectively. The majority of identified peptides and proteins correlated to T. pseudonana chloroplast proteins, including the RuBisCO large subunit and three fucoxanthin chlorophyll a/c binding proteins (FCPs). The proteins with the most identified peptides were the RuBisCO large subunit and the histone H4 protein.
The 7 cm tube gel molecular weight section with the largest number of peptides and proteins identified was the >100 kDa section (Table 2). The 7 cm tube gel >100 kDa section was also the only section to contain peptides and proteins from four cellular locations (chloroplast, nucleus, cytoplasm and mitochondria). All other 7 cm tube gel sections contained only chloroplast and nucleus peptides and proteins or no identifications in the case of the <10 kDa section. The majority of identified peptides in the 7 cm tube gel correlated to proteins that had a molecular weight that was outside the molecular weight range of the gel section in which the peptide was identified. Of these peptides identified in unexpected gel molecular weight sections, five correlated to proteins that were less than 6.5 kDa different from their identified gel molecular weight section. Five proteins were identified in multiple 7 cm tube gel molecular weight sections, and from these proteins, the same peptides were often identified in multiple gel sections.
Table 2. Proteins (in bold) and associated peptides identified from each 7 cm tube gel molecular weight section method. Correlated species, molecular weight (MW), cellular location (CL) and percent sequence coverage (%) are given for each protein. Molecular weights in bold represent proteins that are outside the expected gel molecular weight section; molecular weights in bold and italic represent proteins that are less than 6.5 kDa outside the expected gel molecular weight section. For cellular location: C = chloroplast; M = mitochondria; N = nucleus; S = secretory.
The concentration of THAAs in the 8 to 10 cm sediments, 0.74 mg/g sediment (dry weight), was approximately 23% less than surface sediments of the same sediment core collected before the spring phytoplankton bloom and 36% less than surface sediments collected two months later after the spring phytoplankton bloom (both surface sediment samples analyzed by Moore et al. [29]). The number of proteins identified in 8 to 10 cm sediments was 56% less than surface sediments from the same core and 79% less than surface sediment collected two months later after the spring phytoplankton bloom. The THAA distribution was very similar between the three sediment samples (Figure 1). The amount of THAAs varied between the molecular weight sections of the T. weissflogii gel, with relatively large amounts in the >100 kDa section and small amounts in the 50-100 kDa section (Figure 2).
Trypsin digestions were found to be more effective in all of the gel methods than the direct digest method. In each of the gel methods, trypsin peptides were identified as the most abundant HPLC/MS peaks in the sample, particularly in the 7 cm gel >100 kDa section (Figure 3A). The most abundant HPLC/MS peaks in the direct digest were m/z 519.27 at multiple retention times, possibly representing sample matrix material (Figure 3B). The trypsin peaks were much lower in abundance in the direct digest, with many fewer peptides identified than the gel methods, as well.
The identification of algal peptides and proteins in the deep marine sediment core section demonstrates that gel electrophoresis can be an effective method for isolating proteins present in complex matrices, but further improvements can be made in the total recovery of material. The results obtained by loading the sediment buffer mixture directly onto the gel interface support previous findings by Moore et al. [16] that an electric field applied directly to sediment particles enhances protein extraction. The number of proteins identified was increased by running the gel for a longer period of time, allowing the ion front to move further down the gel (7 cm vs. 2 cm) and the electric field to mobilize proteins out of the sediment. Some brown colored sedimentary material traveled into the gel, as well, but the separation of protein from matrix was sufficient for peptides to be digested out of the gel sections with trypsin and identified via highly sensitive proteomic mass spectrometry and database searching (Figure 4, [49]). Protein function and amino acid sequence can be conserved across many species and taxonomic groups. A peptide sequence that correlated with T. pseudonana therefore does not necessarily show that the peptide was produced by T. pseudonana itself; the sequence was likely conserved among species and the peptide produced by some other algal source.
The difficult identification of peptides and proteins using the direct digest method indicates that the complex sedimentary matrix includes a suite of interferences, including mineral surfaces and organic materials, which impact protein digestion, extraction and identification. The flat gels loaded with sediment buffer mixture could not be reproducibly run over time frames long enough for the ion front to travel more than 2 cm down the gel. This is likely due to the fact that the electrophoresis current mobilized charged matrix particles, clays and other minerals, which were too large to move into the gel itself. The mobilized matrix particles were then either pushed down the edge of the gel or the gel was torn by the material pushing against it. The sediment material pushed against the tube gel, as well, but the thicker tube gel was sturdy enough to consistently withstand the pressure of the charged matrix material and allow proteins to be mobilized into the gel and matrix particles to be held at the top.
The identification of peptides correlating to proteins that occurred in unexpected gel molecular weight sections (based on their known intact molecular weights) suggests that some change to protein structure or perhaps the charge has taken place. Peptides identified in smaller than expected molecular weight ranges could be due to partial hydrolysis [50]. Covalent modifications [13,51,52], hydrophobic interactions [12,13] and sequestration in potential energy fields [53] have been proposed as mechanisms for enhanced protein preservation, which could influence the gel migration of proteins to be in larger than expected molecular weight sections. The denaturing conditions of the SDS-PAGE gels used in this study indicate that charge alteration could also play a role in protein mobility. Given the physical effect of mobilized sample matrix on the gels themselves, it is certainly possible that the complex mixture of charged matrix material could also impact protein migration in the gels.
Proteins in lower core sediments have likely been exposed to more degradation processes than in surface sediments. These identified proteins represent material that may have been preserved after burial [12][54][55][56][57] or mixed from the surface into deeper sediments by bioturbators [58][59][60]. Thus, it is expected that fewer proteins, with potentially lower sequence coverage, would be identified in the Bering Sea 8 to 10 cm core section than in surface sediments observed by Moore et al. [16,29]. Age estimation of these sediments can be uncertain, because the surface sedimentary mixed layer can extend from 0 to 16.5 cm on the outer Bering Sea shelf [61]. However, given a sedimentation rate of 30 to 40 cm/1,000 y for the Bering Sea outer shelf and slope [62], one could estimate that the 8 to 10 cm sediment is conservatively 200 to 330 years old.
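The age range quoted above follows from dividing the section depths by the sedimentation rates. The sketch below reproduces that arithmetic, pairing the shallowest depth with the fastest rate for the youngest bound and the deepest depth with the slowest rate for the oldest bound. It assumes constant sedimentation and no mixing or compaction, which the text itself notes is uncertain; the function name is illustrative.

```python
from typing import Tuple


def burial_age_range_years(depth_top_cm: float, depth_bottom_cm: float,
                           rate_min_cm_per_ky: float,
                           rate_max_cm_per_ky: float) -> Tuple[float, float]:
    """Return (youngest, oldest) age in years for a core section.

    Rates are in cm per 1,000 years; constant, undisturbed deposition is assumed.
    """
    if min(rate_min_cm_per_ky, rate_max_cm_per_ky) <= 0:
        raise ValueError("sedimentation rates must be positive")
    youngest = depth_top_cm * 1000.0 / rate_max_cm_per_ky
    oldest = depth_bottom_cm * 1000.0 / rate_min_cm_per_ky
    return youngest, oldest


# Values from the text: 8 to 10 cm section, 30 to 40 cm per 1,000 years.
youngest_y, oldest_y = burial_age_range_years(8.0, 10.0, 30.0, 40.0)

if __name__ == "__main__":
    print(f"estimated age: {youngest_y:.0f} to {oldest_y:.0f} years")
```

The upper bound evaluates to about 333 years, consistent with the rounded figure of 330 years given in the text.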
The proteins that were identified in the deeper sediments were also found in the water column particulate phase and surface sediments before and after the spring bloom [29]. The RuBisCO large subunit, histone H4 and fucoxanthin chlorophyll a/c binding proteins were all identified after 53 days of shipboard degradation experiments, indicating that these proteins can indeed survive water column recycling long enough to be buried in sediments and are present in sufficient quantities to be detected using tandem mass spectrometry [63]. Furthermore, eight out of the twelve total proteins identified correlating to T. pseudonana were found to be high abundance proteins in this species based on proteomic analysis [39]. This indicates that the identified algal proteins in 8 to 10 cm sediments signify buried primary production and exported organic nitrogen and carbon from the water column.
It has been estimated that only approximately 1% of the originally produced organic matter from the marine upper water column is transferred to the deep biosphere [64][65][66]. The identification of buried algal proteins (RuBisCO, FCPs, photosystem proteins, etc.) in marine sediment suggests that primary production proteins could be preserved over long periods of time. FCPs are important light harvesting complex proteins in diatoms and other marine algae [39,67,68], which along with other chloroplast proteins, could not have been produced in the sediments by bacteria and, thus, were transported and preserved from the water column. This would be expected, since bacteria make up a small fraction of the sedimentary buried carbon pool [69]. The greater decrease in protein identifications than THAAs between 8 to 10 cm sediments and surface sediments [29] suggests that intact proteins are present, but below detection limits. Since proteins are the functional biomacromolecules in all organisms, they could be useful biomarker molecules for reconstructing past phytoplankton communities and biogeochemistry if peptides unique to specific taxonomic groups are detected and tracked.

Conclusions
The sediment gel method successfully extracted buried intact proteins and/or peptides from 8-10 cm Bering Sea marine sediment for identification by proteomic mass spectrometry. More peptides were identified using the sediment tube gel method than the 1D flat gel or direct digest of the sediment. The majority of identified peptides and proteins correlated with diatom protein sequences, indicating that algal proteins survived water column recycling and burial in sediment. Identification of peptides in unexpected gel molecular weight sections suggests that modifications may have changed the protein's structure or gel mobility. Identifying buried proteins in marine sediments suggests that these functional biomacromolecules could be preserved over long time periods and be useful biomarkers for general algal growth. This unique method allows extraction of proteins from complex environmental matrices using conventional electrophoresis in a modified approach and may also be applied to other complex environmental samples, such as terrestrial sediments and soils, giving greater understanding of protein biogeochemistry.