Sediment biomarker, bacterial community characterization of high arsenic aquifers in Jianghan Plain, China

Representative biomarkers (e.g., n-alkanes) and the diversity and composition of the microbial community were analyzed at different sediment depths (0–30 m) in aquifers contaminated by high concentrations of arsenic (As) in Jianghan Plain, Hubei, China, to investigate the potential mechanism of As enrichment in groundwater. As was abundant in top soil and sand, but not in clay. The distribution of n-alkanes, the CPI values, and the wax to total n-alkane ratio (WaxCn(%)) indicated that organic matter (OM) from fresh terrestrial plants was abundant in the shallow sediment. However, n-alkanes have undergone significant biodegradation between 16 m and 30 m depth. The deposition of fresh, terrestrially derived organic matter may facilitate the release of As from sediment to groundwater at 0–16 m, whereas petroleum-derived organic matter may favor the release of As in the deeper section of the borehole (16 m to 30 m). Identification of 16S rRNA gene sequences indicated that Acidobacteria, Actinomycetes and Hydrogenophaga are abundant in the sediments with high arsenic. Therefore, microbes and organic matter from different sources may play important roles in arsenic mobilization in the aquifers of the study area.


Methods
Regional hydrogeology. Jianghan Plain is the first large-scale river basin receiving sediment from the Yangtze River downstream of the Three Gorges. Situated in the central and southern regions of Hubei Province, it features Quaternary lacustrine sediments in the middle reaches of the Yangtze River. It has a warm and humid subtropical monsoonal climate, with an annual mean temperature of 15-17 °C, an altitude of about 50 m, and an annual rainfall of about 1160 mm. The typical characteristics of the geology, hydrogeology, and land use have been discussed in previous reports 25, 28. In brief, the typical topographical structure in this area is a semi-closed basin with a higher elevation in the north, a lower elevation in the south, and a low alluvial plain in the middle region. The hilly areas primarily consist of aquitards, while the center consists of unconsolidated water-bearing sediment layers. The unconfined aquifer in the study area is composed of clayey silt, sandy silt, sandy clay and interlaced clay lenses, which are the main lithologies of the unconsolidated sediments at depths of 10-35 m. The depth to the groundwater table is about 0.5-2.0 m. The unconfined aquifer is recharged by precipitation during the rainy period, as well as by surface waters where these are close by. Evaporation, drainage to the rivers and leakage to underlying formations are mainly responsible for the discharge of the unconfined aquifer. The confined aquifer consists of Pleistocene sand and sandy gravel. It is recharged during the flood season by infiltration from rivers and by leakage from the phreatic water zone where the impermeable clay layer is thin enough. Discharge of the groundwater occurs through regional flow. The aquifers are not exploited for human activities (industry and agriculture) heavily enough to affect the natural regime of the groundwater (Hubei Hydrogeology and Geology engineering station, 1985 cited in ref. 29).
Sampling site description and sample collection. Three sediment cores with total depths of 20-30 m were obtained by rotary drilling in March 2013 (S-01, 30°08′ N/113°38′ E; S-02, 30°10′ N/113°41′ E; S-03, 30°08′ N/113°43′ E, Fig. 1). The land in the sampling area is mainly used for paddy and wheat fields. The three sampling sites are located in the interior of the low alluvial plain; they are surrounded by rivers and by other abundant surface water bodies such as ponds, irrigation channels, and wetlands. Strong surface water-groundwater interactions are observed here 11. Samples from the S-01 core were designated as the control samples because the As concentration in the groundwater at this site is 5 μg/L, which meets the standard criteria specified by China. The groundwater As concentrations at S-02 and S-03 were determined to be 1560 and 1340 μg/L, respectively. The representative chemical composition of groundwater in this study area is discussed in the following section. Cores were sectioned every 2 to 4 m on site. Sectioned cores were wrapped in pre-cleaned and verified aluminum foil, capped immediately with PVC pipe, and kept under an N2 atmosphere at −20 °C to minimize the exposure of the sediments to atmospheric oxygen and the microbial degradation of organic matter.
Arsenic analysis. Sediment samples were first homogenized with a mortar and pestle, and passed through a 200 μm sieve to remove plant roots and miscellaneous debris. A small amount of sample (0.1 g), mixed with 1 mol/L phosphoric acid and 0.5 mol/L ascorbic acid, was microwave digested by a method adopted from ref. 30 to extract arsenite and arsenate. The total arsenic in the digested extracts was analyzed by hydride generation atomic fluorescence spectrometry (HG-AFS) (AFS-820, Titan). The relative standard deviation for arsenic measurement was less than 10% in the present study.
... 12), was Soxhlet extracted with dichloromethane (DCM) for 24 h. Extracts were concentrated and solvent-exchanged into hexane. Approximately 2 g of activated copper was added for desulphurization overnight. The extracts were then quantitatively transferred onto a 3-g silica gel chromatography column topped with about 1 cm of anhydrous granular sodium sulfate for cleanup and fractionation. Hexane (12 mL) and 50% DCM in hexane (v/v, 15 mL) were used to elute the saturated and aromatic hydrocarbons, respectively. The hexane fraction was used for the analysis of n-alkanes. The two fractions were concentrated under a gentle stream of nitrogen to appropriate volumes, spiked with an appropriate internal standard (IS), e.g., 5α-androstane for n-alkane analysis, and then adjusted to an accurate pre-injection volume of 1.00 mL for gas chromatography/mass spectrometry (GC/MS) analyses.
The hydrocarbons were then analyzed by GC/MS (Agilent 6890N/5975 MS) equipped with an autosampler. An HP-5 MS capillary column (30 m × 0.25 mm × 0.25 μm) was used. The operating conditions were as follows: the oven temperature was increased from 80 to 290 °C at 5 °C/min and finally held at 290 °C for 30 min. Samples were injected in splitless mode (injector temperature 290 °C) with helium as the carrier gas. The MSD was operated in selected ion monitoring (SIM) mode. Agilent Enhanced MSD ChemStation software was used for system control and data acquisition. The identification of compounds was based on authentic standards, published literature and the NIST chemical data library. Quantification was made by comparing each individual peak area with that of a known concentration of the internal standard, 5α-androstane. Sodium sulfate blank control samples were also analyzed following the same procedures as the sediment samples.
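The internal-standard quantification described above can be illustrated with a minimal sketch. It assumes a relative response factor of 1 between each n-alkane and 5α-androstane and reports results per gram of dry sediment; the function name, the parameter names and the numbers in the usage line are hypothetical and are not taken from the study.

```python
def quantify_by_internal_standard(peak_area, is_peak_area, is_mass_ng,
                                  sample_mass_g, response_factor=1.0):
    """Estimate an analyte concentration (ng/g) from GC/MS peak areas.

    peak_area       : integrated peak area of the analyte (e.g., one n-alkane)
    is_peak_area    : integrated peak area of the internal standard
    is_mass_ng      : mass of internal standard spiked into the extract (ng)
    sample_mass_g   : mass of sediment extracted (g)
    response_factor : analyte response relative to the internal standard;
                      1.0 is an assumption, not a value from the study
    """
    if is_peak_area <= 0 or sample_mass_g <= 0 or response_factor <= 0:
        raise ValueError("peak area, sample mass and response factor must be positive")
    # Analyte mass in the extract, scaled by the area ratio to the internal standard.
    analyte_mass_ng = (peak_area / is_peak_area) * is_mass_ng / response_factor
    # Normalize to the amount of sediment extracted.
    return analyte_mass_ng / sample_mass_g


# Hypothetical usage: a peak twice as large as a 500 ng spike, 10 g of sediment.
example_ng_per_g = quantify_by_internal_standard(2000.0, 1000.0, 500.0, 10.0)
```

In practice the response factor would be taken from a calibration with authentic standards rather than assumed to be 1.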
Diagnostic ratios of n-alkanes. Diagnostic ratios (e.g., the carbon preference index (CPI)) have been developed and applied to identify the origin of n-alkanes 31. CPI1 is usually used to discriminate between petrogenic and biogenic sources of n-alkanes, and is defined as the ratio between the sum of all n-alkanes from n-C9 to n-C40 with an odd carbon number and the sum of those with an even carbon number (equation 1). Generally, a CPI1 around 1 suggests petrogenic input; naturally generated hydrocarbons linked to higher plants exhibit CPI1 values > 1, usually 5-10 31. CPI2 is used to distinguish the fraction of n-alkanes from higher plant wax, and is calculated as the ratio between the sum of all n-alkanes with an odd carbon number from n-C27 to n-C36 and the sum of those with an even carbon number (equation 2). A high CPI2 value is indicative of a high plant wax input 32. Similarly, plant wax alkanes (WaxCn(%), equation 3) have been used to estimate the contributions from allochthonous and autochthonous organic matter inputs. WaxCn(%) is close to zero for petroleum or crude oil residue, whereas it approaches 100 for higher terrestrial plants or marine plants 33,34.
Scientific Reports | 7:42037 | DOI: 10.1038/srep42037
... to get DNA extracts. PCR: The 16S rRNA gene was amplified from the DNA extracts using the universal primers 27F (5′-AGRGTTTGATCMTGGCTCAG-3′) and 1492R (5′-GGTTACCTTGTTACGACTT-3′). The purity of the amplified product was determined by electrophoresis of 10 μL of sample in a 1.0% agarose tris-borate-EDTA (TBE) gel. DNA was stained with ethidium bromide and viewed under short-wave ultraviolet (UV) light.
Amplicon pools from the four environments were subjected to cloning as follows. Amplicons were cleaned up and ligated into the pGEM-T Easy vector using the manufacturer's protocol (Promega, Fitchburg, WI). The recombinant plasmids were used to transform competent Escherichia coli JM109 cells, which were plated on Luria-Bertani plates containing 100 μg/mL of ampicillin, 80 μg/mL of 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal), and 0.5 mmol/L of isopropyl-β-D-thiogalactopyranoside (IPTG), and incubated overnight at 37 °C. Randomly selected white clones were screened by colony PCR amplification of the 16S rRNA gene inserts using M13 primers.
Nucleotide sequence accession numbers. Positive clones were randomly picked and sequenced by TsingKe Biological Technology. The raw sequences were trimmed using DNAman 6.0. The resulting sequences were classified with RDP online (http://rdp.cme.msu.edu/). Phylogenetic and statistical analyses were performed with Bioedit 7.0.9 and MEGA 5.02. Sequences with more than 97% identity were defined as one operational taxonomic unit (OTU), and the resulting 16S rDNA sequences (about 850 bp) in each OTU were compared with those in the GenBank database using BLAST. The most similar 16S rRNA sequences from GenBank for each OTU were chosen to construct phylogenetic trees. Rarefaction analysis with EstimateS 9.1.0 was used to evaluate the saturation of the sampled clones.
The coverage (C) value was calculated to evaluate how well the analyzed clones represent the species diversity in the samples, according to the formula C = (1 − n/N) × 100%, where n is the number of 16S rDNA types appearing only once in the library and N is the total number of positive clones detected.
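As a minimal sketch of the coverage formula above, the function below takes the number of clones assigned to each 16S rDNA type (OTU) in one library and returns C in percent. The function name and the example counts are hypothetical and serve only to illustrate the arithmetic.

```python
def library_coverage(clones_per_type):
    """Coverage C = (1 - n/N) * 100 for a clone library.

    clones_per_type : iterable with the number of clones in each 16S rDNA type
    n               : number of types represented by exactly one clone
    N               : total number of clones in the library
    """
    counts = list(clones_per_type)
    total_clones = sum(counts)                            # N
    if total_clones <= 0:
        raise ValueError("the library must contain at least one clone")
    singletons = sum(1 for c in counts if c == 1)         # n
    # Algebraically identical to (1 - n/N) * 100.
    return 100.0 * (total_clones - singletons) / total_clones


# Hypothetical library: 20 clones in 5 types, two of which are singletons.
example_coverage = library_coverage([10, 5, 3, 1, 1])
```

A library made up entirely of singletons gives C = 0%, while a library without singletons gives C = 100%.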

Results and Discussion
Physicochemical characteristics of the groundwater in the study area. The physicochemical parameters, including As, Cl⁻, SO₄²⁻, NO₃⁻, HCO₃⁻, Mn, Fe, dissolved organic carbon (DOC), pH and Eh, of groundwater samples from an average depth of 10-45 m were analyzed and reported in ref. 27. In brief, the groundwater in the study area is mainly of the HCO₃-Ca-Mg type, with circum-neutral pH and moderate to high electrical conductivity. Negative Eh and high concentrations of DOC indicate reducing conditions with abundant organic matter in the aquifers. High As concentrations (up to 2330 μg/L) were detected in groundwater sampled in the study area, and 87% of the samples exceeded the WHO recommended value of 10 μg/L. High concentrations of dissolved Fe, Mn, and P were also observed in groundwater, with 89% and 98% of the samples exceeding the WHO guidelines for Fe and Mn, respectively. Similarly, 60 groundwater samples from wells within the depth range of 20-30 m were collected and analyzed for their physicochemical characteristics in 2011 by our group. The concentrations of As ranged from non-detectable to 1560 μg/L, with 63% of the samples exceeding the WHO guideline limit and 25% of them > 50 μg/L. The pH values ranged from 6.1 to 7.2, with an average value of 6.6. The DOC contents ranged from 2.4 to 5.3 mg/L, with an average value of 3.3 mg/L. Positive correlations were observed between the levels of As and those of Cl⁻, HCO₃⁻, Mn and Fe, while the levels of SO₄²⁻ and NO₃⁻ were low in the high-As groundwater.
Lithology of sediment cores. The sedimentary sequences, which alternate from clays to sand, follow very similar patterns in boreholes S-02 and S-03 (Fig. 2). The top layer in the two boreholes consists of a soil cap (0-10 m), where the soil is brown at the surface (0-2 m) with abundant organic matter, then changes to red brown and light brown down to 10 m depth. This phenomenon indicates the limit of O2 ingress from the surface 14. The remaining soil mixes with sand from 10 to 14 m depth, with a grey to brown color. Grey clay is then dominant from 14 to 20 m. Below this layer, from 20 to 30 m, the sediments predominantly consist of mixed silt and sands with a color ranging from dark grey to dark. These silt and sediments are suggestive of deposition in a fluvial environment, which is consistent with the environment in this study area. Sediment samples from the control borehole S-01 also consist of top soil, mixed soil and sand, and clay, but the color varies from yellow to grey from the surface to the bottom, which may indicate that less organic matter is present in the top soil layer than in the brown top soil of S-02 and S-03. Note that borehole S-01 is only 20 m deep, while the others are 30 m deep. In conclusion, all three boreholes show similar sediment lithology, although S-01 is only 20 m long and the color of its top soil is lighter than that of the other two.
Vertical variation of total arsenic with sediment depth and lithology. The total arsenic concentrations in S-01, S-02 and S-03 range from 3.5 to 17.3 μg/g, 1.5 to 15.0 μg/g, and 1.5 to 16.0 μg/g, respectively. The vertical distributions of As in the three boreholes show some similarities, although differences are also present. The As contents vary with deposition depth in the three boreholes (Fig. 2). Specifically, the As concentrations in borehole S-01 range from 3.5 to 9.1 μg/g from 20 m to 6 m depth, with a slightly increasing trend at 12 m. It can be seen that all these values are lower than the WHO limit of 10 μg/L. The concentration increases to 17.3 μg/g in the top soil layer (from 4 m to the surface). In borehole S-02, As is low at 30 to 28 m (the mixed silt and sand layer), increases to 10 μg/g in the mixed silt and sand layer from 26 to 24 m, and then decreases to around 3 μg/g in the clay and the mixed soil and sand layer (20 m to 12 m), followed by an increasing trend in the top soil layer (10 m to the surface). In borehole S-03, As fluctuates from 4.6 to 13.2 μg/g between 30 and 20 m, which is characterized as mixed silt and sands; the lowest As (1.5 μg/g) is observed at 14 m (clay layer); then an increasing trend (from 6.1 to 16.0 μg/g) is present in the soil and sand mixture layer and the top soil layer (from 12 m to the surface). It is clear that the arsenic level is closely associated with the lithologic structure of the sediment core. Generally, the arsenic level is high in the mixed silt and sands and in the top soil layers, but low in the clay layer located in the middle of the cores. The comparison of the three boreholes indicates that borehole S-01 has the lowest arsenic contamination, while S-02 and S-03 are similarly affected by As.
This finding is consistent with the As concentrations detected in the corresponding groundwater samples. As mentioned in the previous sections, groundwater from the well close to S-01 has the lowest As, while groundwater from the sites of S-02 and S-03 has similar and relatively high As contamination. This consistency also suggests that As in sediments is the main source of As in the groundwater of the Jianghan Plain 27. Figure 2 also depicts the variation of total n-alkanes with sediment depth. The contents of total n-alkanes range from 209 to 1020 ng/g, 234 to 3872 ng/g and 236 to 4012 ng/g in the sediment cores of S-01, S-02 and S-03, respectively. The total n-alkanes in S-01 are generally lower than those in S-02 and S-03 at the same depth, although data from 22 to 30 m are not available for S-01. The highest concentrations of n-alkanes were detected at the surface at all three sites. The vertical trends of total n-alkanes in S-02 and S-03 are similar to each other, but vary with sediment depth. In detail, n-alkane contents remain almost constant from 30 to 10 m, in spite of some variation from 26 to 20 m, but they increase significantly in the surface soil (0-10 m). Comparing S-01 with S-02 and S-03, a similar vertical distribution profile of n-alkanes was identified in S-01 from 20 m to 10 m; S-02 and S-03 show an increase from 8-10 m to 6-8 m, while S-01 does not change significantly over the same depth range. A significant increase is present in S-01 for sediments from 4 m to the surface. The distribution of n-alkanes with depth therefore appears to be positively related to arsenic and to soil characteristics. For example, both n-alkanes and arsenic are abundant in the top soil layer. Some differences are also present between n-alkanes and arsenic.
For example, n-alkanes below the top soil layer are generally lower than in the top soil layer, while a significant amount of arsenic was detected in the mixed silt and sand layer. For n-alkanes this is reasonable, because top soil in the study area is rich in OM left over in the paddy and wheat fields each year. The degradation of the plant remains increases the organic carbon, with the typical character of terrestrial input, in the top soil (see the source identification of n-alkanes in the following section). Accordingly, abundant dissolved organic carbon (DOC) in shallow groundwater has been reported, which may also facilitate the release of As from sediments into shallow wells 11. Higher n-alkanes in the top soil and lower n-alkanes in the deep sediment suggest that the OM has undergone significant microbial degradation or originated from different sources.

Vertical variation of total n-alkanes in the sediment cores.
The box plots of individual n-alkanes in Fig. 3 depict their full range of variation at the different depths of the three cores. Analysis of n-alkanes reveals the typical double-peak shape, with a general dominance of high molecular weight congeners (C25-C33) and an obvious odd-over-even preference in the high carbon number range. The three cores have similar distribution profiles, while their absolute concentrations show some differences. Specifically, S-01 has the lowest n-alkane levels, while S-02 and S-03 have similar levels. It is clear that the three cores have abundant organic matter input from terrestrial plants, especially S-02 and S-03. The detailed source identification of n-alkanes in the following section further identifies their major source. Based on the compound distributions of n-alkanes and the diagnostic ratio analysis, it can be concluded that the surface sediments are characterized by higher amounts of n-alkanes (Fig. 2) and higher CPI and WaxCn(%) than the bottom ones (Fig. 4), which suggests a relatively higher input of terrestrial plant derived organic matter than in the bottom sediments 14. The sediments from 30 m to 20 m depth (S-02 and S-03) have substantially lower n-alkanes with lower CPI (close to 1) and WaxCn(%), which suggests a substantial contribution of petroleum-derived OM 31. Therefore, the OM in the studied cores (S-02 and S-03) can apparently be divided into two types. The first is immature, terrestrially derived OM, which is likely present in the surface section of the two boreholes, characterized by a muddy layer, and extends toward the deeper sediments. The second is thermally mature OM, which is dominant in the sediment core deeper than 16 m. The analysis of the lithology indicates that a clay layer is located at 14-20 m depth. The CPI and WaxCn(%) do not show a depth-dependent trend from 0 to 16 m.
This may indicate that the young allochthonous OM has been delivered to the deeper coarse sands in the form of dissolved organic carbon (DOC), or by reworking and secondary sedimentation of ancient, eroded OM during more recent times. However, this process would not change the CPI of n-alkanes, which means that no substantial biodegradation occurred during transport 14. Massive groundwater irrigation may lead to surface-derived OM being drawn down into the aquifer systems 36. In the study area, people rely on groundwater for living and drinking. However, the sampling sites have been used as farmland for growing cotton, wheat, and rice. Groundwater may be hydraulically connected with surface water bodies such as ponds, irrigation channels, and wetlands. Precipitation and evaporation, together with irrigation using surface water, induce a seasonal variation of the groundwater level, which was high in July-October and low in March-May 11. Therefore, the transport of fresh terrestrial OM to the deeper aquifer system is possible. This process may have been hindered at 16 m depth by the impermeable character of the clay layer located at 14-20 m. This clay layer acts as the transition between terrestrial and petroleum-derived OM.
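The diagnostic ratios used in this section can be sketched in a few lines. CPI1 and CPI2 follow the verbal definitions given in the Methods (odd over even sums for n-C9 to n-C40 and n-C27 to n-C36, respectively). The expression for WaxCn(%) is not reproduced in this text, so the sketch assumes a commonly used form, WaxCn = [Cn] − 0.5([Cn−1] + [Cn+1]) with negative values set to zero, summed and divided by the total n-alkanes; this form, the function names and the example concentrations are assumptions for illustration, not values from the study.

```python
def carbon_preference_index(conc, c_min, c_max):
    """Ratio of odd to even carbon-number n-alkanes between c_min and c_max.

    conc : dict mapping carbon number (int) to concentration (e.g., ng/g)
    """
    odd = sum(v for n, v in conc.items() if c_min <= n <= c_max and n % 2 == 1)
    even = sum(v for n, v in conc.items() if c_min <= n <= c_max and n % 2 == 0)
    if even == 0:
        raise ValueError("no even carbon-number n-alkanes in the requested range")
    return odd / even


def wax_alkane_percent(conc):
    """WaxCn(%) under the assumed form described in the lead-in."""
    total = sum(conc.values())
    if total <= 0:
        raise ValueError("total n-alkane concentration must be positive")
    wax = 0.0
    for n, c in conc.items():
        # Excess of Cn over the mean of its two neighbours; missing neighbours count as 0.
        excess = c - 0.5 * (conc.get(n - 1, 0.0) + conc.get(n + 1, 0.0))
        wax += max(excess, 0.0)   # negative values are set to zero
    return 100.0 * wax / total


# Hypothetical plant-wax-like profile with a strong odd-over-even preference.
profile = {26: 10.0, 27: 50.0, 28: 10.0, 29: 60.0, 30: 10.0, 31: 40.0, 32: 10.0}
cpi1 = carbon_preference_index(profile, 9, 40)    # odd/even over n-C9 to n-C40
cpi2 = carbon_preference_index(profile, 27, 36)   # odd/even over n-C27 to n-C36
wax_pct = wax_alkane_percent(profile)
```

For this invented profile both CPI values are well above 1, the pattern the text associates with higher plant input; a profile with no odd-over-even preference would give CPI values close to 1.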
Bacterial characterization. Four samples (TX1, TX2, TX3 and TX4) with As concentrations of 1.5, 6.8, 8.8 and 15.0 μg/g, taken at depths of 14, 20 and 22 m in S-03 and 10 m in S-02, respectively, were selected for community analysis based on 16S rRNA gene clone libraries. These samples represent low, medium and high As contamination. A total of 200 bacterial 16S rRNA gene clone sequences were obtained and then subjected to a BLAST search in NCBI GenBank and to phylogenetic analysis. The coverage values of these bacterial clone libraries are 94-98% (Fig. 5). Analysis of the 200 cloned sequences from the sediment microbial communities allowed us to identify 184 OTUs at a 98% cutoff in samples TX1-TX4 (Table 1).
Five bacterial groups are present in the higher As sediment sample (TX3, Table 1). In detail, 82% of the identified clones are associated with the Betaproteobacteria (4.6%), Epsilonproteobacteria (16%) and Gammaproteobacteria (61%). This is similar to the findings of a previous study 37. Members of Actinomycetes (6.8%), Acidobacteria (2.3%), Firmicutes (6.8%), and Chloroflexi (2.2%) were also identified. In the sediment with the highest arsenic contamination (TX4), 94% of the clones obtained are affiliated with the Alphaproteobacteria (2.1%), Epsilonproteobacteria (11%), and Gammaproteobacteria (81%). Members of Acidobacteria (2.1%), Firmicutes (2.1%), and Chloroflexi (2.1%) were identified too. In the class Gammaproteobacteria, 62% of the identified clones are 100% similar to Thiobacillus thioparus; 7.6% of the clones were assigned to the genus Hydrogenophaga, which has been reported as a type of arsenite-oxidizing bacteria isolated from groundwater containing high concentrations of As 6,38. Another 6.8% of the clones belong to the genus Sulfuricella, a type of chemolithoautotrophic bacteria that grow by oxidizing sulfur-containing compounds 39. In the class Alphaproteobacteria, 60% of the clones are 100% similar to Brevundimonas, which was identified in a freshwater swamp adjoining Lake Washington 40. In the class Epsilonproteobacteria, 80% of the clones are 100% similar to Sulfurimonas autotrophica, a chemolithoautotrophic bacterium that uses carbon dioxide as its carbon source and sulfur-containing inorganic compounds as the energy source for its metabolic processes 41.
Bacterial community characterization. Figure 6 depicts the neighbor-joining trees used to classify the bacterial community in the present study.
The obtained bacterial 16S rRNA gene clone sequences could be grouped into twelve bacterial phyla: Alpha-, Beta-, Delta-, Epsilon-, and Gammaproteobacteria, Chlorobi, Bacteroidetes, Firmicutes, Chloroflexi, Actinobacteria, and Deinococcus-Thermus. The bacterial 16S rRNA gene clone libraries are mainly composed of proteobacterial sequences, which account for 59%, 61%, and 81% in samples TX2, TX3, and TX4, respectively. The proteobacterial clone sequences are mainly affiliated with Alpha-, Beta- and Gammaproteobacteria (Table 1). At the phylum level, the relative abundances of the major groups vary among samples. At the genus level, Acinetobacter and Pseudomonas are abundant in all these samples.
The major groups in the TX2 sample include Acinetobacter, Pseudomonas, Brevundimonas, Aquabacterium, Psychrobacter, and Geobacter, with abundant Pseudomonas and Acinetobacter (10% and 59%, respectively). The TX3 sample consists of seven major genera, including Acinetobacter, Pseudomonas, Brevundimonas, Massilia, Dietzia, Sphingomonas, and Planococcus, where the former four are dominant (5%, 7%, 7% and 16%, respectively). The TX1 sample is composed of four major genera, including Acinetobacter, Pseudomonas, Aquabacterium, and Arthrobacter, where the former two are dominant (7% and 23%, respectively). Previous studies have shown that some of the genera identified above could be involved in As cycling. For example, some Acinetobacter strains are more resistant to arsenic than other species, and some of them can even oxidize or reduce arsenic 42. In addition, some Brevundimonas spp., Massilia spp., Dietzia spp. and Planococcus spp. are capable of reducing arsenate and/or resisting As. Dietzia in particular was reported as the type species in the extreme environment of hyper-alkaline and hyper-saline soil and water contaminated by heavy metals 43. The 16S rRNA clone library search indicates that Acidobacteria, Actinomycetes and Hydrogenophaga, which are similar to those As-resistant bacteria 42, are dominant in the sediments containing high concentrations of As in the present study. The release of dissolved As and Fe from sediment to groundwater might be due to the reductive dissolution of Fe-oxyhydroxides by the microbial populations found within the sediment in the reducing environment, because the reduced species (e.g., As(III) and Fe(II)) are more soluble in water and have less affinity for sediment than their oxidized counterparts.