Application of Molecular Microbiology Techniques in Bioremediation of Hydrocarbons and Other Pollutants

Molecular microbiology techniques have revolutionized microbial ecology by paving the way for rapid, high-throughput methods for culture-independent assessment and exploitation of microbial communities present in complex ecosystems such as crude oil/hydrocarbon-polluted soil. The soil microbial community is highly diverse, particularly in its prokaryotic component. This soil species pool represents a gold mine for genes involved in the biodegradation of different classes of pollutants. Currently, less than 1% of this diversity is culturable by traditional cultivation techniques. The application of molecular microbiology techniques to study microbial populations in polluted sites without the need for culturing has led to the discovery of novel and previously unrecognized microorganisms, and the complex microbial diversity and dynamics in contaminated soil therefore offer a substantial opportunity for bioremediation strategies. PCR amplification of metagenomic DNA, microbial community profiling techniques and identification of catabolic genes can be combined to elucidate the composition, functions and interactions of microbial communities during bioremediation. This review gives an overview of the different applications of molecular methods in bioremediation of hydrocarbons and other pollutants in environmental matrices and outlines recent advances in this fast-developing field.
Review Article. British Biotechnology Journal, 3(1): 90-115, 2013


INTRODUCTION
Modern molecular techniques provide an exciting opportunity to overcome the requirement for culturing and have greatly increased our understanding of bacterial diversity and functionality during bioremediation of oil-polluted soils. Unfortunately, only a fraction of the microorganisms involved in the biodegradation of pollutants in different ecosystems can currently be cultured using standard laboratory agars and conditions [1][2][3][4][5][6][7][8][9][10]. Comparisons between classical culture-dependent and molecular/metagenomic methods have revealed that only about 1% of the total microorganisms are amenable to culture [11]. In a few cases, understanding the phylogenetic diversity has led to improved culturing methods, such as extended incubation periods and better-suited media and growth conditions (pH, temperature and atmosphere) [10,12]. However, estimating the total microbial diversity and dynamics in any environment remains a challenge, especially during bioremediation to clean up hydrocarbon contaminants [13][14][15]. Cultivation-based techniques favour fast-growing strains that are best adapted to the particular culture conditions; these outgrow less well adapted strains, so the resulting isolates do not accurately represent the actual microbial community composition during aerobic biodegradation of pollutants like crude oil [16][17][18][19][20][21][22][23][24][25][26][27]. To elucidate how microbial communities change in response to environmental disturbances like pollution, investigations need to rely on rapid methods that can characterize cellular constituents such as nucleic acids, proteins (enzymes) and other taxa-specific compounds [28][29][30][31]. These molecules can be extracted directly from the soil without the need for culturing and can thus be used to elucidate the microbial community composition in such polluted environments during bioremediation [32].
This review evaluates current molecular microbiology applications in the assessment of microbial diversity and dynamics in polluted ecosystems during bioremediation in order to identify predominant microbial communities or degradative genes.

POLYMERASE CHAIN REACTION (PCR)
The polymerase chain reaction (PCR) was first performed with a non-heat-stable polymerase, so fresh polymerase had to be added after each heating cycle. The later introduction of the thermostable Taq polymerase removed this step and led to the overwhelming success of PCR [5].
In 1988, a thermally stable DNA polymerase from a thermophilic bacterium was used to perform PCR. This extremely sensitive technique allows the amplification of millions of copies of a portion of a desired gene, an entire gene or gene clusters with high fidelity within 3 to 4 h, facilitating their cloning and study. It is the most widely used method for the amplification of 16S rRNA, or its gene, prior to fingerprinting studies. The 16S rRNA gene is essential as it encodes the small subunit of the prokaryotic ribosome and is therefore present in all prokaryotes. In bacteria, the rRNA genes are transcribed from the ribosomal operon as 30S rRNA precursor molecules and then cleaved by RNase III into 16S, 23S and 5S rRNA molecules [5,33]. The ribosomal operon size, nucleotide sequences, and secondary structures of the three rRNA genes are conserved within a bacterial species [12]. Since 16S rRNA is the most conserved of these three rRNAs, it has been proposed as an "evolutionary clock", which has led to the reconstruction of the tree of life [34]. For the past two decades, microbiologists have primarily relied on 16S rRNA gene sequences (hereafter, 16S sequences) for the identification and classification of bacteria. The 16S sequence analysis is used in two major applications: (i) identification and classification of isolated pure cultures and (ii) estimation of bacterial diversity in environmental samples without culturing through metagenomic approaches. Several criteria that make the 16S rRNAs and their genes the most widely studied phylogenetic molecular markers include:
- The function of ribosomes has not changed for about 3.8 billion years;
- The rRNA genes are present in all living organisms;
- The size of 1,540 nucleotides makes them easy to analyze;
- There are extensive databases available (e.g. the GenBank) for pairwise and/or multiple comparative analysis;
- The genes act as a molecular chronometer;
- The genes allow for culture-independent analysis of unknown microbial communities;
- The primary structure is an alternating sequence of invariant, more or less conserved, and highly variable regions; and
- Lateral gene transfer is either totally absent or exceedingly rare [2].
PCR-based methods have also been used in the detection and quantification of microorganisms found in soil and water [35]. The technique can also be applied to the analysis of catabolic genes involved in the biodegradation of organic contaminants [36]. With PCR, phylogenetic relationships can be assessed by pairwise and/or multiple alignments (for example, by submitting unknown sequences to the GenBank 16S rRNA sequence database on the National Center for Biotechnology Information [NCBI] website, http://www.ncbi.nlm.nih.gov/blast/). A similarity of 100% between a pair of 16S rRNA sequences, as reported by comparative tools such as BLAST (basic local alignment search tool), indicates very close relatedness, if not identity, of the investigated organisms. The lower the value, the more distantly related the compared organisms [2,12]. For the detection of organisms or genes from contaminated environments, simple and/or multiplex PCR techniques can be used. Simple PCR uses a pair of primers in a single amplification reaction, while multiplex PCR uses multiple primer pairs simultaneously to amplify several genes in a single reaction [37][38][39][40]. PCR amplification depends on the extraction and purification of nucleic acids of sufficient yield and quality from environmental samples. Insufficient lysis of cells could result in the preferential extraction of DNA from Gram-negative bacteria, while excessively harsh treatments may result in the shearing of DNA from readily lysed cells [41]. In addition, PCR amplification efficiency can be severely hampered by inhibitory substances that are co-extracted with nucleic acids, including humic acids, organic matter and clay particles [42].
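The similarity value described above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (gaps written as "-") and computes percent identity over the aligned columns, counting a gap opposite a base as a mismatch. The function name and the two short sequences are hypothetical and chosen only for illustration; real 16S comparisons would normally rely on the identity reported by BLAST itself.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Columns where both sequences have a gap are skipped; a gap facing
    a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # uninformative column
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments (not real 16S sequences).
query = "ACGTTGCA-GGCTA"
subject = "ACGTTGCATGGTTA"
identity = percent_identity(query, subject)
print(f"{identity:.1f}% identity")
```

Here 12 of the 14 aligned columns agree, so the value falls well below 100% and the two hypothetical sequences would be read as related but not identical.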
The methods employed for sample collection, transportation and storage prior to nucleic acid extraction are important because they may introduce bias into subsequent microbial analysis of native communities [43]. Of particular concern are the temperature at which samples are stored and their exposure to oxygen [44]. With extended sample storage, both of these factors may alter the microbial composition, hence the need to extract DNA/RNA promptly after sample collection [41].
Margesin et al. [45] studied the prevalence of alkane and PAH-degradative genes in hydrocarbon-contaminated and pristine Alpine soils using PCR amplification and hybridization analysis of different alkane monooxygenases and aromatic dioxygenases. They found a statistically significant correlation between the level of contamination and the prevalence of these genotypes in the Alpine soils. Alquati et al. [46] used the 16S rRNA and naphthalene dioxygenase genes amplified by PCR to identify naphthalene-degrading bacteria from a petroleum-contaminated soil. Of the thirteen isolates they obtained, belonging to the Rhodococcus, Arthrobacter, Nocardia and Pseudomonas genera, nine showed PCR fragment amplification with naphthalene dioxygenase primers specific for Rhodococcus spp., while four were detected with naphthalene dioxygenase primers specific for Pseudomonas spp. Other independent studies have also used PCR amplification of different hydrocarbon-degradative enzymes with specific primers to study the diversity and composition of bacterial populations in hydrocarbon-contaminated environments [47][48][49][50][51][52][53][54][55][56][71]. Knaebel and Crawford [57] applied multiplex PCR to detect the petroleum-degrading microbial population in a petroleum-contaminated soil. Baldwin et al. [58] also employed a multiplex PCR technique for the detection of naphthalene dioxygenase, biphenyl dioxygenase, toluene dioxygenase, xylene monooxygenase, phenol monooxygenase and ring-hydroxylating toluene monooxygenase genes in a single PCR reaction. Although multiplex PCR can save time and resources in the detection of microorganisms or genes involved in biodegradation, successful application depends on the combination of several primer pairs being able to perform reliably in a single reaction. Primer dimer formation between the various primers is more likely to occur, and this may lead to poor sensitivity and preferential amplification of certain targets [59].
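The primer-dimer risk mentioned above can be screened crudely before a multiplex set is ordered. The sketch below flags any ordered pair of primers in which the last few 3′ bases of one primer are perfectly complementary to a stretch of another. This is a simplifying assumption, not a thermodynamic model, and the primer names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_dimer_risk(primer_a: str, primer_b: str, tail: int = 5) -> bool:
    """True if the 3' tail of primer_a can pair perfectly with part of primer_b.

    Antiparallel pairing means the reverse complement of the tail must
    occur in primer_b when both are read 5'->3'.
    """
    tail_seq = primer_a.upper()[-tail:]
    return reverse_complement(tail_seq) in primer_b.upper()


def screen_primers(primers: dict, tail: int = 5) -> list:
    """Return (a, b) name pairs where the 3' end of a may anneal to b."""
    names = sorted(primers)
    return [
        (a, b)
        for a in names
        for b in names
        if three_prime_dimer_risk(primers[a], primers[b], tail)
    ]


# Hypothetical primers, for illustration only.
primers = {
    "primer_1": "AGCTTGACCTGAAGC",
    "primer_2": "TTGCTTCAGGATCCA",
    "primer_3": "CCATGGTACGTTAGC",
}
flagged = screen_primers(primers)
print(flagged)
```

In this toy set only the 3′ end of primer_1 finds a complementary stretch (in primer_2), so that pair would be a candidate for redesign or for separation into different reactions.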
Another variant of the PCR technique, real-time PCR, detects and quantifies the amplified product while the reaction is occurring. This approach enables the detection and quantification of PCR amplicons during the early exponential phase of the reaction [60,61]. It involves the use of fluorescent markers, and the amount of fluorescence measured at the end of each cycle is directly related to the amount of product in the PCR reaction. Initially, fluorescence remains at background levels, and increases in fluorescence are not detectable even though product accumulates exponentially. Eventually, enough amplified product accumulates to yield a detectable fluorescent signal. The cycle number at which this occurs is called the threshold cycle, or C_T. Since the C_T value is measured in the exponential phase, when reagents are not limiting, real-time PCR can be used to reliably and accurately calculate the initial amount of template present in the reaction. The C_T of a reaction is determined mainly by the amount of template present at the start of the amplification reaction. If a large amount of template is present at the start of the reaction, relatively few amplification cycles are required to accumulate enough product to give a fluorescent signal above background, and the reaction has a low C_T. In contrast, if a small amount of template is present at the start of the reaction, more amplification cycles are required for the fluorescent signal to rise above background, and the reaction has a high C_T. This relationship forms the basis for the quantitative aspect of real-time PCR [62,63].
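The inverse relationship between starting template and C_T can be made concrete with a standard curve. The sketch below assumes the commonly used linear model C_T = slope × log10(N0) + intercept, fitted by ordinary least squares to a dilution series of known copy numbers, with amplification efficiency estimated as 10^(-1/slope) - 1. The model, the function names and the dilution data are assumptions for illustration and are not taken from the studies cited here.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of C_T against log10(starting copies)."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency per cycle; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


def estimate_copies(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies from a C_T."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series of a quantified standard.
standards = [1e3, 1e4, 1e5, 1e6, 1e7]
cts = [30.0, 26.68, 23.36, 20.04, 16.72]

slope, intercept = fit_standard_curve(standards, cts)
efficiency = amplification_efficiency(slope)
unknown = estimate_copies(23.36, slope, intercept)
print(f"slope={slope:.2f}, efficiency={efficiency:.1%}, copies={unknown:.3g}")
```

A sample with a lower C_T maps to more starting copies and one with a higher C_T to fewer, which is the quantitative basis described in the paragraph above.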
Real-time PCR has been used in several environmental studies, such as the monitoring of carbazole 1,9a-dioxygenase gene (carAa) numbers in soil slurry microcosms [64], the measurement of the alpha-subunit of the benzylsuccinate synthase gene (bssA) and the atrazine catabolic gene (atz) [65,66], and the identification and quantification of the arsenate reductase gene (arsC) in soil and of aromatic oxygenase genes [67,68]. Real-time PCR targeting the 16S rRNA genes and the Dehalococcoides reductive dehalogenase (RDase) gene was used in the monitoring of Dehalococcoides strains [69]. It has also been used to quantify the proportion of microorganisms containing alkane monooxygenase and to assess subsequent microbial community changes in hydrocarbon-contaminated Antarctic soil [70]. The advantages that real-time PCR offers include speed, sensitivity, accuracy and the possibility of robotic automation [70]. Although real-time PCR can measure gene quantity, the results obtained do not link gene expression with a specific measurable microbial activity or population. RNA extracted from soil and water samples is low in yield and often does not represent the soil microbial population [63]. Also, specific probes used in the amplification reactions may fail to capture the sequence diversity that is present within environmental samples [65]. PCR molecular techniques have completely revolutionized the detection of DNA/RNA, especially in microbial ecological studies. However, differential amplification of target genes such as 16S rRNA can bias PCR-based diversity studies [41]. For example, sequences with lower guanine plus cytosine content are thought to separate more efficiently in the denaturing step of PCR and hence could be preferentially amplified [41]. In addition, products seen on gels or in real-time assays may result from artefacts or chimeric PCR products [72]. PCR is a very sensitive technique and in some cases may produce false-positive signals due to contamination [73].
Quantitative reverse transcription PCR (qRT-PCR) is used for monitoring transcription of a specific gene. This technique first uses the enzyme reverse transcriptase to produce a complementary DNA (cDNA). The primers used to prime the reverse transcriptase reaction are random hexamers. The cDNA is then amplified by PCR, using specific primers that target the cDNA corresponding to the RNA being quantified. It differs from conventional reverse transcriptase-PCR in that an attempt is made to estimate the amount of cDNA template, and hence the amount of original RNA, by measuring the rate of accumulation of PCR product rather than the amount of end product, which is what standard PCR amplifications measure [5]. Alonso-Gutierrez et al. [74] used RT-PCR to analyze the alkane-degrading properties of Dietzia sp., a key player in the Prestige oil spill biodegradation. According to them, three putative alkane hydroxylase genes (one alkB homologue and two CYP153 gene homologues of the cytochrome P450 family) were PCR-amplified from the bacterium, and expression of alkB and CYP153 was observed to be constitutive in this alkane degrader. In an earlier investigation [75], bacterial communities from shoreline environments (Costa da Morte, Northwestern Spain) affected by the Prestige oil spill were examined thoroughly by a triple-approach method based on different cultivation strategies and culture-independent techniques such as PCR, DGGE and screening of 16S rRNA gene clone libraries. The results revealed that members of the α-proteobacteria and Actinobacteria were the prevailing groups of bacteria detected among the alkane and polyaromatic hydrocarbon degraders. For PCR analysis to be successful, critical analysis of the primer sets to be used is necessary to ensure that the target gene is efficiently amplified [76,77].

DENATURING GRADIENT GEL ELECTROPHORESIS (DGGE)/TEMPERATURE GRADIENT GEL ELECTROPHORESIS (TGGE)
Denaturing gradient gel electrophoresis (DGGE) was introduced in 1993 [78] as a fingerprinting technique for analysing complex microbial communities. DGGE was not new to science, having been developed in the 1980s for the detection of point mutations in defined genes [79]. DGGE [80,81] and TGGE [82,78] separate PCR-amplified 16S rRNA fragments of the same length but with different base pair compositions. The separation of bands in both DGGE and TGGE depends on the decreased electrophoretic mobility of partially melted double-stranded DNA molecules in polyacrylamide gels containing a linear gradient of DNA denaturants or a linear temperature gradient [78,83,84]. A GC-rich sequence (known as a GC clamp) is usually attached to the 5′-end of either the forward or reverse primer to prevent complete melting of the DNA fragments [85]. The PCR-amplified DNA fragments are generally limited in size to 500 bp and are separated on the basis of sequence differences, not variation in length. The number of bands produced during DGGE or TGGE is proportional to the number of dominant species in the sample. The final result of the electrophoresis of the mixed amplicons from a complex community through denaturing gradients is a fingerprint consisting of bands at different migration distances in the gel. DGGE/TGGE is the method of choice when the desired information does not have to be as phylogenetically exhaustive as that provided by 16S rRNA gene clone libraries but must still be precise enough to determine the dominant members of microbial communities with average phylogenetic resolution [86]. For environmental or contaminated source samples where microbial diversity is largely unknown [87], the DGGE/TGGE technique provides the opportunity to identify the microbial population through the excision of selected bands followed by their reamplification, cloning and sequencing, which can lead to the phylogenetic affiliation of the ribotypes [88,89,79].
DGGE in particular has been widely used for the assessment of microbial community structure in contaminated soil and water in a number of studies [90][91][92][93][94][95][96][97][98][99]. Apart from microbial community profiling, the DGGE technique has also been used to examine gene clusters such as dissimilatory sulphite reductase beta-subunit (dsrB) genes in sulphate-reducing bacterial communities [100] and benzene, toluene, ethylbenzene and xylene (BTEX) monooxygenase genes from bacterial strains obtained from hydrocarbon-polluted aquifers [101]. Coulon et al. [104] used DGGE analysis of reverse-transcribed bacterial 16S rRNA from the upper 1.5 cm of a hydrocarbon-polluted sediment in coastal mudflats to study the central role of dynamic tidal biofilms, dominated by aerobic hydrocarbonoclastic bacteria and diatoms, in the biodegradation of hydrocarbons. Their investigation revealed the presence of phylotypes associated with the aerobic degradation of straight-chain and polycyclic hydrocarbons, such as Cycloclasticus, Alcanivorax, Oleibacter and Oceanospirillales strain ME113.
The main advantages of DGGE/TGGE are that it enables monitoring of spatial and temporal changes in microbial community structure and provides a simple view of the dominant microbial species within a sample. Its limitations in microbial community studies include the fact that sequence information derived from microbial populations is limited to 500 bp fragments of 16S rRNA sequences, which may lack the specificity required for the phylogenetic identification of some organisms [59]. Because an organism can carry multiple copies of the rRNA gene, multiple bands for a single species may occur [102,103]. In addition, different 16S rRNA sequences may have identical mobilities. Another shortcoming is that band intensity may not truly reflect the abundance of a microbial population (a strong band may simply mean more copies of a particular gene from the same organism), and the perceived community diversity may be underestimated.
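Monitoring temporal change with DGGE fingerprints usually comes down to comparing which bands are present in each lane. A minimal sketch of one such comparison follows: band positions, expressed as relative migration distances, are grouped into bins, and two lanes are compared with the Jaccard index (shared bands divided by all bands seen in either lane). The bin width, the choice of Jaccard similarity and the band positions are assumptions for illustration; gel analysis software typically offers several alternative coefficients.

```python
def bin_bands(relative_positions, bin_width=0.05):
    """Map relative migration distances (0-1) to integer band classes."""
    return {round(pos / bin_width) for pos in relative_positions}


def jaccard_similarity(lane_a, lane_b):
    """Shared band classes divided by all band classes in either lane."""
    union = lane_a | lane_b
    if not union:
        raise ValueError("both lanes are empty")
    return len(lane_a & lane_b) / len(union)


# Hypothetical band positions for one site before and after treatment.
day_0 = bin_bands([0.12, 0.31, 0.48, 0.66])
day_30 = bin_bands([0.12, 0.31, 0.55, 0.66, 0.80])

richness_0 = len(day_0)      # number of detectable (dominant) bands
richness_30 = len(day_30)
similarity = jaccard_similarity(day_0, day_30)
print(richness_0, richness_30, similarity)
```

Because the comparison uses presence or absence only, it sidesteps the band-intensity problem noted above, at the cost of ignoring relative abundance altogether.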

AMPLIFIED RIBOSOMAL DNA RESTRICTION ANALYSIS (ARDRA)
In amplified ribosomal DNA restriction analysis, PCR-amplified 16S rRNA fragments are cut at specific sites with restriction enzymes and the resulting digest is separated by gel electrophoresis. Different DNA sequences are cut at different locations, producing a fingerprint unique to the community being analyzed [79,11]. Divergence of the community rRNA restriction pattern on a gel is highly influenced by the type of restriction enzyme used [105]. Banding patterns in ARDRA can be used to screen clones or to measure bacterial community structure [42]. ARDRA is simple, rapid and cost-effective, and as a result has been used in microbial identification [106][107][108] and microbial community studies [109,105,110,111]. Microbial community composition and succession in an aquifer exposed to phenol, toluene and chlorinated aliphatic hydrocarbons were assessed by ARDRA with the aim of identifying the dominant microbial community involved in the biodegradation of trichloroethene (TCE) following biostimulation [112]. In another study [105], ARDRA was used to examine the microbial differences in activated sludge from treatment plants fed on domestic or industrial wastewater. The bacterial communities in activated sludge differed between industrial and domestic wastewater treatment plants. Hohnstock-Ashe et al. [113], using ARDRA as a fingerprinting technique, also observed that the microbial community composition in well waters contaminated with TCE had shifted toward a highly diverse community dominated by Dehalococcoides ethenogenes-like microorganisms. Babalola et al. [111] used ARDRA to study the phylogenetic relationships of actinobacterial populations associated with Antarctic valley mineral soils. Further sequencing of the amplicons restricted singly with the endonucleases RsaI, BsuRI or AluI revealed that the phylotypes were most closely related to uncultured Pseudonocardia and Nocardioides spp., whereas complementary culture-dependent studies isolated more Streptomyces spp., which were detected at a low frequency in the metagenomic analysis. ARDRA is useful for detecting structural changes in microbial communities but cannot measure microbial diversity or detect specific phylogenetic groups within a community fingerprinting profile [114]. Optimization with restriction enzymes is required and is often difficult if sequences are unknown. As a result, further optimization may be required to produce fingerprinting patterns characteristic of the microbial community [106,73]. In addition, banding patterns in diverse communities become too complex to analyze using ARDRA [42]. In recent studies, ARDRA has been combined with other molecular techniques such as T-RFLP and DGGE to characterize microbial communities from contaminated sources [115,116,11]. A major challenge in using ARDRA lies in the interpretation of the fingerprints obtained from complex microbial communities.
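The dependence of an ARDRA pattern on the enzyme chosen can be illustrated with an in silico digest. The sketch below cuts one sequence with the three endonucleases named above and reports the fragment lengths that would appear as bands. The recognition sites and cut positions used (AluI AG^CT, RsaI GT^AC, BsuRI GG^CC) are the commonly listed ones and should be checked against a current enzyme database; the short amplicon is hypothetical and far shorter than a real 16S amplicon.

```python
import re

# Enzyme name -> (recognition site, cut offset within the site).
# All three sites are palindromic, so searching one strand is sufficient.
ENZYMES = {
    "AluI": ("AGCT", 2),
    "RsaI": ("GTAC", 2),
    "BsuRI": ("GGCC", 2),
}


def digest(sequence, site, cut_offset):
    """Fragment lengths, in 5'->3' order, after complete digestion."""
    seq = sequence.upper()
    # Lookahead so that overlapping sites are all found.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 28 bp amplicon, for illustration only.
amplicon = "ATGCAGCTTTGGCCAAGTACCGAGCTAA"

patterns = {
    name: digest(amplicon, site, offset)
    for name, (site, offset) in ENZYMES.items()
}
for name, fragments in patterns.items():
    print(name, fragments)
```

Each enzyme yields a different set of fragment lengths from the same sequence, which is why enzyme choice strongly shapes the resulting community fingerprint.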

RIBOSOMAL INTERGENIC SPACER ANALYSIS (RISA)
The RISA method makes use of the length and sequence heterogeneities present in the intergenic spacer (IGS) region between the small subunit (SSU) and large subunit (LSU) rRNA genes in the rRNA operon [79]. It is a PCR-based technique that amplifies the region between the 16S and 23S rRNA genes [5]. The IGS region, depending on the species, varies in both sequence and length (50-1500 bp) [117], and this feature facilitates taxonomic identification of organisms [5,73]. RISA has been used to distinguish between different strains and closely related species of Staphylococcus [118,119], Bacillus [120,121], Vibrio [122,123], and other medically important microorganisms. In environmental studies, RISA has been used to detect microbial populations involved in the degradation of PAH at low temperature under aerobic and nitrate-reducing enriched soil conditions [124]. RISA has also been used to define microbial diversity and community composition in freshwater environments [125]. RISA is a very rapid and simple fingerprinting method, but its application in microbial community analysis from contaminated sources is limited, partly because the database for ribosomal intergenic spacer sequences is not as large or as comprehensive as the 16S sequence database [73]. As a result, community analysis using RISA could have reduced utility for the identification of unknown or unculturable microbial species from contaminated sources. Furthermore, RISA sequence variability may be too great for environmental applications. Its level of taxonomic resolution is greater than that of 16S rRNA and hence may lead to very complex community profiles [73]. A modification of RISA is ARISA (automated ribosomal intergenic spacer analysis), which detects the abundance and size of PCR amplicons by measuring the fluorescence emission of labelled primers. Fragment length and abundance can be measured by comparison with labelled internal standards [79]. Kostka et al. [126] used ARISA to determine the diversity of hydrocarbon-degrading bacteria and the bacterial community response in beach sands impacted by the Deepwater Horizon oil spill in the Gulf of Mexico. Their findings indicated that oil contamination from the Deepwater Horizon had a profound impact on the abundance and community composition of autochthonous bacteria in the beach sands. Members of the γ-proteobacteria (Alcanivorax, Marinobacter) and α-proteobacteria (Rhodobacteraceae) were identified as the key players in crude oil degradation.

TERMINAL-RESTRICTION FRAGMENT LENGTH POLYMORPHISM (T-RFLP)
Terminal-restriction fragment length polymorphism is a major modification and improvement of the ARDRA method. The main advance of T-RFLP over ARDRA is that, for each organism, only the terminal restriction fragments (T-RFs) are detected. The PCR primers used in T-RFLP analysis are fluorescently labelled at the 5′-terminus and the resultant PCR products are visualised and quantified [114,2,127]. T-RFLP relies on variations in the positions of restriction sites among sequences and on the determination of the length of fluorescently labelled terminal restriction fragments by high-resolution gel electrophoresis on an automated DNA sequencer. The electropherogram represents the profile of a microbial community as a series of peaks varying in migration distance [79]. The use of fluorescently tagged primers limits the analysis to only the terminal fragments of the digestion [128]. This simplifies the banding pattern, enabling the analysis of complex communities as well as providing information on diversity, as each visible band represents a single operational taxonomic unit or ribotype [129]. Dojka et al. [130] monitored microbial diversity in a hydrocarbon- and chlorinated solvent-contaminated aquifer undergoing intrinsic bioremediation with T-RFLP and found sequence types characteristic of Syntrophus spp. and Methanosaeta spp. They hypothesized from their findings that the terminal step of hydrocarbon degradation in the methanogenic zone of the aquifer was aceticlastic methanogenesis, with these organisms existing in a syntrophic relationship. Bordenave et al. [131] studied bacterial community changes in microbial mats following exposure to crude oil using T-RFLP. Their results demonstrated clear succession of different bacterial populations, with operational taxonomic units related to the Chloroflexus, Burkholderia, Desulfovibrio and Cytophaga genera. In another related study, Bordenave et al. [132] used T-RFLP to assess the impact of the Erika oil spill on microbial communities on the northern French Atlantic coast. Cluster analysis of T-RFLP fingerprints of eubacterial communities indicated that contaminated and uncontaminated communities evolved differently, suggesting that Erika oil had an impact on the evolution of bacterial community structure. Kaplan and Kitts [21] investigated bacterial succession in a petroleum land treatment unit with T-RFLP analysis. The T-RFLP patterns they obtained separated into five PCA clusters that reflected total petroleum hydrocarbon (TPH) degradation phases and trends in aerobic heterotrophic bacterial counts. Further analysis revealed that Flavobacterium, Pseudomonas and Azoarcus phylotypes were the most abundant bacteria during the fast degradation phase. Restriction fragment length polymorphism (RFLP) analysis of 16S rRNA gene libraries was used by Allen et al. [133] to investigate the interdependence between geoelectrical signatures at underground petroleum plumes and the structures of subsurface microbial communities. The 16S rRNA gene inserts were screened for their RFLP patterns using the restriction endonuclease MspI. Clones with the same RFLP patterns, as well as those with sequence similarities greater than 97%, were considered to represent the same phylotypes. Their results revealed that the zone contaminated with residual hydrocarbons above the free-phase petroleum contained aromatic hydrocarbon degraders such as Sphingomonas aromaticivorans and Brachymonas petroleovorans, and large populations of methylotrophs and methanotrophs. The microbial community composition of lake sediments contaminated with copper as a consequence of mine milling disposal over a 100-year period was studied using T-RFLP [134].
T-RFLP has also been used to characterize microbial communities recovered from surrogate minerals incubated in an acidic uranium-contaminated aquifer [135] and dechlorinating bacteria from a basalt aquifer [136]. Fahy et al. [137], using T-RFLP, observed that the chronic presence of benzene in groundwater reduced bacterial diversity and altered community composition compared with clean groundwater sources. In addition, the reliability of T-RFLP for monitoring microbial populations characterized by low diversity and high relative abundances of a few dominant groups was assessed in a hydrocarbon-polluted marine environment [138]. In contaminated soils, T-RFLP has also been used successfully to describe bacterial communities of polychlorinated biphenyl (PCB)-contaminated soils [139] and microbial communities that reductively dechlorinate TCE to ethene [140]. Coulon et al. [104] investigated the dominance and dynamics of aerobic hydrocarbonoclastic bacteria in the biodegradation of hydrocarbons in oil-polluted coastal mudflats using T-RFLP, following the methods of Fahy et al. [137] but with a Genescan 500 TAMRA (5-carboxytetramethylrhodamine) internal standard. The use of automated detection systems and capillary electrophoresis in T-RFLP analysis allows high throughput and more accurate quantitative analysis of microbial community samples than any of the other genetic fingerprinting methods. Despite its high resolution and sensitivity, T-RFLP is highly dependent on PCR amplification of 16S rRNA, which is affected by the DNA extraction method, PCR biases and the choice of universal primers [42]. Different enzymes produce different community fingerprints, and incomplete digestion by the restriction enzymes may lead to an overestimation of diversity [141,142].
It is therefore important to use at least two to four restriction enzymes [129] as T-RFLP profiles generated by a single restriction enzyme in a complex microbial community may lead to erroneous conclusions about the abundance of a particular strain or species [143]. A major drawback of T-RFLP in comparison with DGGE/TGGE is the impossibility of retrieving suitable phylogenetic information from the T-RFs generated since fragments are difficult to isolate and are usually too short to be sequenced properly and would not yield enough sequence information for phylogenetic analysis [79].
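The reason a single enzyme can mislead is easiest to see by computing terminal fragments directly. In the sketch below, the labelled terminal fragment of each amplicon is taken as the stretch from its 5′ end to the first cut site, here for MspI (commonly listed as C^CGG), and the community profile is the count of sequences per fragment length. The amplicon names and sequences are hypothetical, and real T-RF sizing also depends on the internal standard and on the sequencer.

```python
from collections import Counter


def terminal_fragment_length(amplicon, site="CCGG", cut_offset=1):
    """Length of the 5'-labelled terminal fragment after digestion.

    If the site is absent, the whole amplicon stays uncut and is
    reported at full length.
    """
    position = amplicon.upper().find(site)
    return len(amplicon) if position == -1 else position + cut_offset


def trf_profile(amplicons, site="CCGG", cut_offset=1):
    """Count amplicons per terminal fragment length (one 'peak' each)."""
    return Counter(
        terminal_fragment_length(seq, site, cut_offset)
        for seq in amplicons.values()
    )


# Hypothetical amplicons from four different organisms.
community = {
    "phylotype_A": "ATGCCGGTTAACC",
    "phylotype_B": "ATGATCATCCGGAA",
    "phylotype_C": "ATGCCGGAAAAAA",
    "phylotype_D": "ATGAAATTTGGG",
}

profile = trf_profile(community)
print(sorted(profile.items()))
```

Phylotypes A and C differ in sequence yet share the same terminal fragment, so one enzyme shows three peaks for four organisms; digesting with additional enzymes is what separates such cases.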

FLUORESCENT IN SITU HYBRIDIZATION (FISH)
An important step towards determining the diversity of microbes in environmental samples is to harness the information obtained from the direct sequencing of rRNA genes extracted from such samples. The full-cycle rRNA approach basically uses the sequence information of cloned, rRNA-encoding genes from environmental habitats to develop phylogenetic oligonucleotide probes that allow specific hybridization to the target region of the ribosomal RNA in fixed permeabilized cells. This technique is called fluorescent in situ hybridization (FISH) [79].
FISH is used to quantify the presence and relative abundance of microbial populations in a community sample. Microbial cells are treated with fixatives, hybridized with specific probes (usually fluorescently labelled oligonucleotide probes of 15-25 bp) on a glass slide, then visualised by either epifluorescence or confocal laser microscopy [79,11]. Hybridization with rRNA-targeted probes enhances the characterization of uncultured microorganisms and also facilitates the description of complex microbial communities [144]. FISH is a taxonomic method that is mostly used to examine whether members of a specific phylogenetic affiliation are present; it provides direct visualisation of uncultured microorganisms and also facilitates the quantification of specific microbial groups [86]. FISH alone does not provide any insight into the metabolic function of microorganisms. However, it can be coupled with other techniques such as microautoradiography to describe the functional properties of microorganisms in their natural environment [145]. Two types of FISH probes, based on conserved or unique regions of 16S rRNA genes, can be developed: domain- or group-specific probes and strain-specific probes. Domain- or group-specific probes discriminate or identify members of larger phylogenetic groups, while strain-specific probes quantify or assess the abundance of a specific species or strain within a microbial community [146]. Wagner et al. [147] and Wagner et al. [148] used both group- and species-specific rRNA-targeted oligonucleotide probes to define Acinetobacter from activated sludge. Richardson et al. [140] combined group-specific FISH and T-RFLP in the characterization of microbial communities engaged in TCE biodegradation. From the FISH analysis, the authors observed that organisms such as Cytophaga, Flavobacterium and Bacteroides were more abundant than the TCE degrader Dehalococcoides ethenogenes in the microbial consortium. 
However, the lack of functional gene analysis in the study meant that the relative abundance of these organisms and their ecological importance for TCE biodegradation could not be established. FISH techniques are often used in conjunction with other genetic fingerprinting methods such as DGGE [149][150][151][152] and T-RFLP [140,153,154,152] for the enumeration and characterization of microbial populations from contaminated sources. The drawback of FISH is that only a limited number of probes can be used in a single hybridization experiment, and background fluorescence can be problematic in some samples [146,86]. Prior knowledge of the sample and of the microorganisms most likely to be detected (i.e. their rRNA sequences) is necessary for the design of specific probes.
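The dependence of probe design on known rRNA sequences can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: both sequences are hypothetical toy strings written as DNA in the sense orientation of the rRNA, the probe is an 18-mer chosen arbitrarily from the target, and specificity is judged only by counting mismatches, whereas real probe design also considers melting temperature, target-site accessibility and hybridization stringency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def min_mismatches(probe, rrna):
    """Fewest mismatches between the probe's binding site and any window of rrna."""
    site = reverse_complement(probe)  # sequence the probe pairs with
    n = len(site)
    return min(
        sum(a != b for a, b in zip(site, rrna[i:i + n]))
        for i in range(len(rrna) - n + 1)
    )


# Hypothetical rRNA fragments (sense strand, written as DNA).
target = "GGATTAGCTAGTAGGTGGGGTAAC"
non_target = "GGATTAGCTAGAAGGTGCGGTAAC"  # differs from target at two positions

# An 18-nucleotide probe complementary to positions 4-21 of the target.
probe = reverse_complement(target[4:22])

print(len(probe), min_mismatches(probe, target), min_mismatches(probe, non_target))
```

A probe that matches its target perfectly but has mismatches against every non-target sequence in the reference set is a candidate for a strain- or group-specific probe; a probe placed in a conserved region would instead match both sequences.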
A major obstacle of the standard FISH technique is its limited sensitivity: bacterial cells with reduced ribosome contents, which often occur in oligotrophic environments like most soil habitats, are not satisfactorily stained for microscopic analysis. Two modifications of FISH used to increase sensitivity and reliability are catalyzed reporter deposition-FISH (CARD-FISH) and microautoradiography-FISH (MAR-FISH). In CARD-FISH, signal intensities of hybridized cells are increased by enzymatic signal amplification using horseradish peroxidase (HRP)-labeled oligonucleotide probes in combination with tyramide signal amplification (TSA). TSA is based upon the patented catalyzed reporter deposition (CARD) technique using derivatized tyramide. In the presence of small amounts of hydrogen peroxide, immobilized HRP converts the labeled substrate (tyramide) into a short-lived, extremely reactive intermediate. The activated substrate molecules then very rapidly react with and covalently bind to electron-rich regions of adjacent proteins. This binding of the activated tyramide molecules occurs only immediately adjacent to the sites at which the activating HRP enzyme is bound. Multiple deposition of the labeled tyramide occurs in a very short time (generally within 3-10 minutes). Subsequent detection of the label yields an effectively large amplification of signal [79]. The combination of FISH with microautoradiography allows for the identification of bacteria and concomitantly gives an indication of their specific in situ activity using suitable isotope-labeled substrates (especially β-emitters such as 14C and 3H) [79]. The substrate uptake patterns in FISH-labeled bacteria can be investigated in situ in mixed natural communities at a single-cell level even if the bacteria are as yet unculturable.

DNA MICROARRAY TECHNOLOGIES
Microarrays ('chips') containing nucleic acids as probes represent a major advancement in molecular detection technology. They are ideal for the high-throughput study of the sequence diversity of 16S rRNA genes as well as of functional genes in environmental samples [79,11].
DNA microarray technology is a very powerful taxonomic and functional tool that is widely used to study biological processes, including mixed microbial communities involved in pollutant degradation. This technique is similar to FISH, but provides a means for simultaneous analysis of many genes [155]. A DNA microarray is a miniaturized array of complementary DNA probes (∼500-5000 nucleotides in length) or oligonucleotides (15-70 bp) attached directly to a solid support, which permits simultaneous hybridization of a large set of probes to their corresponding DNA/RNA targets in a sample [156]. Microarray technology has been used successfully in the analysis of global gene expression in pure culture studies [157,158], but its use with environmental samples is complicated by challenges of specificity, sensitivity and quantification [159]. Despite these challenges, three major environmental microarray formats, namely functional gene arrays (FGA) [160,161,162], community genome arrays (CGA) [163,164] and phylogenetic oligonucleotide arrays (POA) [165,166,167], have been developed for microbial community analyses of environmental samples. Functional gene arrays (FGA) identify or measure genes encoding key enzymes in a metabolic process [168,161]. Such an approach provides vital information about the presence of important genes as well as the expression of those genes in the environment by measuring mRNA [169]. Many studies have used the FGA approach to investigate microbial involvement in environmental processes such as nitrogen fixation and nitrification [170,171,169]. For biodegradation of contaminants, FGA techniques have been developed for the detection of specific aromatic oxygenase genes in a soil community degrading PCB [172] and of the presence and expression of naphthalene-degrading genes in soil contaminated with PAH [161]. 
Community genome array (CGA) is similar in concept to reverse sample genome probing (RSGP) [173], except that CGA uses nonporous hybridization surfaces and fluorescence-based detection systems for high-throughput analysis but shows decreased sensitivity [163]. Wu et al. [163] initiated the development and testing of CGA as a tool to detect specific microorganisms within a natural microbial community. CGA has been shown to achieve species-to-strain level differentiation depending on hybridization temperature and has the added potential of determining the genomic relatedness of isolated bacteria [169]. The major disadvantage of CGA is that culturable organisms are needed to prepare the array, which makes field application of CGA, as opposed to laboratory studies, almost impossible. POA rely on the use of 16S rRNA for the identification of microorganisms in the environment. Owing to the high-throughput capacity of microarrays and the availability of extensive rRNA sequence databases, POA provide a very convenient means of simultaneously identifying many microorganisms in a sample. Several studies have employed POA in environmental investigations of microbial populations in water [174,175], soil [176,177] and activated sludge [178]. The application of microarrays in environmental microbiology, specifically in the examination of microbial populations engaged in biodegradation, has the potential for organism identification as well as for defining their ecological role [160,161]. However, more rigorous and systematic assessment and development are needed to realize the full potential of microarrays for microbial detection and community analysis [168]. Microarrays detect only the dominant populations in many environments [161]. In addition, probes designed to be specific to known sequences can cross-hybridize to similar or unknown sequences and may produce misleading signals [169]. 
Moreover, soil, water and sediments often contain humic acids and other organic materials which may inhibit DNA hybridization onto microarrays [179,79]. Finally, limitations in quality RNA extraction from many environmental samples imply that advances in RNA extraction and purification and amplification methods are needed to make microarray gene expression analysis possible for a broader range of samples [169,11].

APPLICATION OF METAGENOMIC DNA IN STUDYING MICROBIAL DIVERSITY AND DYNAMICS
The use of metagenomic DNA, obtained by direct isolation of total community DNA from the environment, has been a good tool for evaluating microbial diversity and dynamics. Commercially available DNA extraction kits circumvent most of the challenges associated with DNA extraction using readily available and cheap reagents. Microbial population changes during bioremediation of an experimental oil spill were investigated by Macnaughton et al. [90] using metagenomic 16S rDNA PCR-denaturing gradient gel electrophoresis (DGGE) to identify the bacterial community members responsible for the decontamination of the site. Prominent DGGE bands excised for sequence analysis indicated that oil treatment encouraged the growth of Gram-negative microorganisms within the α-Proteobacteria and the Flexibacter-Cytophaga-Bacteroides phylum. Notably, α-Proteobacteria were never detected in unpolluted controls. Hamamura et al. [180] used metagenomic DNA to examine soil bacterial population dynamics in several crude-oil-contaminated soils, to identify those organisms associated with alkane degradation and to assess patterns in microbial response across disparate soils. The DNA sequences of prominent DGGE bands corresponding to bacterial populations selected during crude-oil degradation revealed phylogenetically diverse populations related to β- and γ-Proteobacteria, Actinobacteria and candidate division TM7 across the set of contaminated soils. Surridge [181] characterized microbial communities in PAH- and PCB-contaminated soils using DGGE of metagenomic soil DNA. This investigation revealed the involvement of novel bacterial genera such as Burkholderia, Sphingomonas, Pseudomonas, Staphylococcus, Bacillus, Providencia, Methylobacterium, Klebsiella and Rhodococcus, and of many uncultured bacterial clones, in contaminant degradation. 
Bacterial communities from shoreline environments affected by the Prestige oil spill in Spain were thoroughly examined by Alonso-Gutierrez et al. [75] using PCR-DGGE and the screening of 16S rRNA gene clone libraries. Their findings revealed that members of the classes α-Proteobacteria and Actinobacteria were the prevailing bacterial groups associated with hydrocarbon degradation of which Dietzia sp. HOB [74] was later identified as an excellent alkane degrader expressing alkane hydroxylase genes such as alkB and CYP153 constitutively.
Chikere [182] used the soil metagenome from crude-oil-polluted soil to study bacterial diversity and dynamics during bioremediation. After PCR amplification of the 16S rRNA gene, denaturing gradient gel electrophoresis (DGGE) and sequencing yielded distinct and dominant ribotypes that corresponded to known hydrocarbon-degrading, Gram-positive bacteria. Coulon et al. [104] used a metagenomic approach comprising DGGE of PCR-amplified cDNA from reverse-transcribed 16S rRNA, T-RFLP, cloning and 454 pyrosequencing to investigate aerobic hydrocarbon degraders in hydrocarbon-polluted mudflat mesocosms. Over the course of the experiment they found that phylotypes associated with degradation of alkanes and PAHs, such as Cycloclasticus, Alcanivorax, Oleibacter and Oceanospirillales ME113, increased significantly in the oil-polluted mesocosms. A recent review by Chikere et al. [13] extensively discussed both culture-dependent and -independent techniques employed in monitoring microbial hydrocarbon remediation in soil. This review presented the chromosomal or plasmid locations of some degradative enzymes and genes, such as dioxygenases and alkane hydroxylases, in known hydrocarbon-degrading bacterial genera. Such nucleic acid-based analyses can only be done using molecular microbiology techniques such as PCR, DGGE and DNA sequencing.

METAGENOMIC ANALYSIS
Metagenomic approaches have enabled us to understand the genomic potential of the entire microbial community in an ecosystem by cloning and analyzing microbial community DNA directly extracted from environmental samples [33]. The construction of metagenomic libraries and other DNA-based metagenomic projects begins with the isolation of high-quality DNA that is suitable for cloning and that covers the microbial diversity present in the original sample [183,32]. The large amount of cloning and sequencing required for such a task was prohibitive during the early development of metagenomics. More recently, researchers have transitioned to new direct and high-throughput sequencing technologies (e.g. 454 pyrosequencing), often bypassing cloning steps that were previously essential. Direct sequencing increases the depth of microbial community DNA sequences analyzed, probing deeper into the metagenome of an ecosystem, and to date has been applied to a variety of environments ranging from the termite gut (471 Mbp of sequence), to the human intestinal tract (478 Mbp), to marine environments [46 Gbp (billion base pairs)] and hydrocarbon-contaminated sites [180,181,33]. This direct, deep-sequencing metagenomic approach has been widely accepted and has even been termed 'megagenomics' in acknowledgement of the amount of effort required to achieve comprehensive coverage.
Functional screening of metagenomic libraries has led to the discovery of novel genes encoding polyphenol oxidase [184] and ester and glycosyl hydrolases [185], and has also given indications of the diversity of extradiol dioxygenases in coke plant waste water [186]. Using metagenomics, Brennerova et al. [187] revealed the diversity and abundance of meta-cleavage pathways in microbial communities from soil highly contaminated with jet fuel (aromatic and aliphatic hydrocarbons) under air-sparging bioremediation. Moreover, the extradiol dioxygenase diversity was assessed by functional screening of a fosmid library in Escherichia coli with catechol as substrate. The 235 positive clones from inserts of DNA extracted from contaminated soil were equivalent to one extradiol dioxygenase-encoding gene per 3.  [188,189] used the E. coli library from the study conducted by Suenaga et al. [186] to screen for EDO activity using catechol as a substrate and 38 clones were subjected to sequence analysis. As a result, various gene subsets were identified that were not similar to those involved in previously reported degradation pathways. The distribution of these genes among the different genome segments was reported in some isolated sphingomonads [190]. Iwai et al. [191] applied gene-targeted metagenomics and pyrosequencing to aromatic dioxygenase genes to obtain greater sequence depth than is possible by other methods. A PCR primer set designed to target a 524 bp region that confers substrate specificity of biphenyl dioxygenases yielded 2000 and 604 sequences from the 5′ and 3′ ends of the PCR products, respectively, that passed the set validity criteria. Sequence alignment showed three known conserved residues as well as another seven conserved residues not previously reported. Ninety-five and 41% of the valid sequences were assigned to 22 and 3 novel clusters, respectively; these clusters were novel in that they did not include any previously reported sequences at 0.6 distance by complete linkage clustering for the sequenced regions. 
Although they designed their primers using only toluene/biphenyl dioxygenase, the deeper sequencing technique interestingly enabled them to obtain a much broader range of apparent dioxygenase genes. For example, clusters F24 and R5 contained all well-known toluene/biphenyl dioxygenase genes and clusters F35 and R7 contained all well-known naphthalene dioxygenase genes. This approach is likely to be most useful for genes directly responsible for important ecosystem functions or ecological processes such as biogeochemical cycles and biodegradation. Korenblum et al. [71] determined the bacterial communities in crude oil samples from two Brazilian offshore petroleum platforms by PCR amplification of 16S rRNA gene sequences using the primers U968 and L1401, which amplify the region corresponding to the V6-V8 variable regions of the Escherichia coli small subunit rRNA gene. The PCR products were then cloned, and a total of 156 valid 16S rRNA gene sequences were analyzed for taxonomic affiliation by the Ribosomal Database Project (RDP) classifier. Among the clones, 122 were γ-proteobacteria and most were affiliated with the family Pseudomonadaceae. The remaining clones were affiliated with β-proteobacteria (representative families were Comamonadaceae and Burkholderiaceae), Moraxellaceae, Enterobacteriaceae, Alteromonadaceae and Xanthomonadaceae. A recent review by Suenaga et al. [192] focused on 'targeted metagenomics' studies, which combine metagenomic library screening and subsequent sequencing analysis. This approach is a more effective means of understanding the content and composition of genes for key ecological processes in microbial communities, especially those involved in the degradation of pollutants such as hydrocarbons. Metagenomics nevertheless has some biases that make the technique less than fool-proof. 
Despite vast sequencing efforts, the complete coverage of a metagenome based on multiple-fold redundancy or the complete assembly of individual genomes within a community remains largely unattained. Such difficulties are expected, given the typical microbial community complexity and unevenness (i.e. a few numerically predominant populations and vast numbers of low abundance ones).
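The effect of community unevenness on coverage can be made concrete with a back-of-the-envelope calculation. The Python sketch below is an illustration under simplifying assumptions, not a result from the cited studies: reads are assumed to be sampled uniformly at random, each population's share of the reads is assumed to equal its share of the community DNA, and the expected fraction of a genome sequenced at least once is approximated by the standard Lander-Waterman expression 1 - e^(-c), where c is the fold coverage. The sequencing effort, genome sizes and abundances are hypothetical.

```python
import math


def fold_coverage(total_bp, dna_fraction, genome_size_bp):
    """Average fold coverage of one population's genome.

    total_bp: total bases sequenced from the community
    dna_fraction: share of community DNA belonging to this population
    genome_size_bp: genome size of this population
    """
    return total_bp * dna_fraction / genome_size_bp


def expected_fraction_covered(coverage):
    """Expected fraction of a genome sequenced at least once (Lander-Waterman)."""
    return 1.0 - math.exp(-coverage)


# Hypothetical sequencing effort and community: (label, DNA fraction, genome size in bp).
TOTAL_BP = 500e6
community = [
    ("dominant population", 0.5, 5e6),
    ("moderately abundant population", 0.01, 5e6),
    ("rare population", 0.0001, 5e6),
]

for label, fraction, genome_size in community:
    c = fold_coverage(TOTAL_BP, fraction, genome_size)
    print(f"{label}: {c:.2f}-fold coverage, "
          f"{expected_fraction_covered(c):.1%} of genome expected to be sampled")
```

With these assumed numbers, the dominant population is covered about 50-fold while the rare one receives about 0.01-fold coverage, which is why a few abundant genomes can be assembled long before the low-abundance majority is even sampled.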

CONCLUSION
Cultivation-independent molecular microbiology techniques have in recent times unravelled the complexities of microbial populations, functions and interactions in hydrocarbon-polluted sites. These DNA-based techniques have been applied to monitor the progress of bioremediation of hydrocarbons and allied pollutants, and their application has greatly increased the acceptance of biological treatment of these pollutants [193]. In recent years, studies of gene expression and protein biosynthesis have emerged to complement DNA-based microbial community analysis [194]. Metagenomics therefore offers significant promise for advancing the prediction of in situ microbial responses, activities and dynamics during hydrocarbon/pollutant bioremediation [195][196][197][198][199].

ACKNOWLEDGEMENT
This work was funded by the Third World Organization for Women in Science (TWOWS) postgraduate fellowship given to the author in 2005.