Denaturing High Performance Liquid Chromatography and Bioinformatics - Two Modern Tools for Extracellular Superoxide Dismutase (SOD3) Gene Promoter Analysis

Denaturing High Performance Liquid Chromatography (DHPLC) is a relatively new method for screening DNA sequences, characterized by a high capacity to detect mutations/polymorphisms. This study focuses on the Transgenomic WAVE™ DNA Fragment Analysis (based on the DHPLC separation method) of a 485 bp fragment of the human EC-SOD gene promoter, in order to detect single nucleotide polymorphisms (SNPs) associated with atherosclerosis and risk factors of cardiovascular disease. The fragment of interest was amplified by PCR and analyzed by DHPLC in 100 healthy subjects and 70 patients characterized by atheroma. No different melting profiles were detected for the analyzed DNA samples. A combination of computational methods was used to predict putative transcription factor binding sites in the fragment of interest. Several putative transcription factor binding sites were identified in the evolutionarily conserved regions: from the Ets-1 oncogene family, ETS member Elk-1, polyomavirus enhancer activator-3 (PEA3), protein C-Ets-1 (Ets-1), GA binding protein (GABP), and the Spi-1 and Spi-B/PU.1 related transcription factors; from the Krueppel-like family, gut-enriched Krueppel-like factor (GKLF), erythroid Krueppel-like factor (EKLF) and basic Krueppel-like factor (BKLF); as well as the GC box and myeloid zinc finger protein MZF-1. The bioinformatics results need to be investigated further in other studies by experimental approaches.


Keywords: DHPLC method, SOD3 promoter, SOD3 polymorphisms, transcription factor binding sites, atherosclerosis

Extracellular SOD isoform (EC-SOD or SOD3) is the most recently discovered and least understood member of the SOD family, which converts the superoxide anion to hydrogen peroxide and molecular oxygen. EC-SOD was first detected in human plasma, lymph, ascites and cerebrospinal fluid [1,2]. The human arterial wall contains a large amount of EC-SOD [3,4]. The strategic localization of EC-SOD between the endothelium and vascular muscle [4-7] and the high levels of the protein suggest that EC-SOD plays an important protective role in the vascular wall against the superoxide radical, which is involved in the development of hypertension, diabetes and atherosclerosis.
EC-SOD is a 135 kDa tetrameric glycoprotein whose overall structure has not been determined [8]. The EC-SOD gene (SOD3), comprising 3 exons, is located on chromosome 4 (4p15.3-15.1) [9]. The corresponding cDNA predicts 240 amino acids, including a signal sequence of 18 amino acids. This signal sequence is responsible for targeting the protein to the ER/Golgi system [10]; the N-terminal domain is responsible for the stabilization of the EC-SOD tetramer [11,12], whereas the C-terminal domain is responsible for the distribution of EC-SOD in the extracellular space according to its affinity for heparin homologues [13,14,5].
At present, little is known about the in vivo regulation of this enzyme. Several studies have shown that the enzymatic activity of EC-SOD could be regulated in vivo by the existence of disulfide-dependent folding variants or by different localization of the protein in the extracellular space. Less attention has been paid to the in vivo regulation of SOD3 expression by different transcription factors. Screening a fragment of the EC-SOD promoter could reveal polymorphic sites affecting putative transcription factor binding sites that are potentially functional in the context of atherosclerosis and its risk factors. As a screening method, the Transgenomic WAVE™ DNA Fragment Analysis is a relatively recently developed system that uses Denaturing High Performance Liquid Chromatography (DHPLC) to separate DNA heteroduplexes from DNA homoduplexes [16,17]; it is therefore an appropriate method for screening a SOD3 promoter fragment to detect polymorphisms/mutations associated with atherosclerosis and other risk factors.
In this study, bioinformatics analysis of a 500 bp fragment from the 5'-flanking region of SOD3 (between -500 bp and +1) was used to identify the transcription factors with the highest potential to regulate SOD3 expression in vivo, and DHPLC screening of the SOD3 promoter region between -469 bp and +16 bp was performed to indicate possible functional sites for transcription factors in conditions of atherosclerosis or predisposing factors.

DNA samples
A total of 170 genomic DNA samples were extracted using the Qiagen QIAamp DNA Kit (Qiagen) from fasting venous blood collected from Caucasian subjects who had undergone coronary angiography. The subjects were enrolled in the study between 2004 and 2005 at the Cardiology and Vascular Recovery Clinic, Timisoara. All participants gave written informed consent before participation, and the study was approved by the local research Ethics Committee. Of the analyzed subjects, 100 were patients with no significant atheroma and 70 were patients with significant atheroma (obstruction >50% of one or more arteries).

Primers
Specific primers were designed using Primer 3 (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi) to amplify specific genomic regions based on the published nucleotide sequence of EC-SOD from NCBI GenBank (accession no. NM_003102.1). The forward primer F: 5'-ATA CCA GAA CGG AGA CTG G-3' and the reverse primer R: 5'-CTA AAT CTC TGG ATC CCA CC-3' were used to amplify a 485 bp EC-SOD promoter fragment.
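
As a quick sanity check on the primer pair above, the following minimal Python sketch computes the length, GC content and a rough melting temperature of each primer, together with the amplicon length implied by the promoter coordinates. The Wallace rule (2°C per A/T, 4°C per G/C) is a swapped-in rule of thumb and not the model used by Primer 3; the helper names are hypothetical, and the coordinate arithmetic assumes the usual convention that there is no position 0 between -1 and +1.

```python
# Illustrative sketch only: helper names are hypothetical, and the Wallace rule
# is a rough approximation, not the thermodynamic model used by Primer 3.

FORWARD = "ATACCAGAACGGAGACTGG"   # forward primer from the text, spaces removed
REVERSE = "CTAAATCTCTGGATCCCACC"  # reverse primer from the text, spaces removed


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Wallace rule: 2 degrees C per A/T plus 4 degrees C per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def span_length(start: int, end: int) -> int:
    """Number of bases from start to end inclusive, assuming no position 0."""
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # skip the non-existent position 0
    return length


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), "nt",
          f"GC={gc_fraction(primer):.2f}", f"Tm~{wallace_tm(primer)} C")

print("amplicon length:", span_length(-469, 16), "bp")
```

Under these assumptions the coordinates -469 to +16 reproduce the stated 485 bp amplicon size.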

PCR Reaction
Between 10 and 100 ng of genomic DNA was used in 25 µL PCR reaction mixtures comprising 0.6 units of Thermoprime Plus DNA Polymerase (ABgene), 0.2 mM dNTPs, 1.5 mM MgCl2, 1x Thermoprime buffer and 1 pmol/µL each of the forward and reverse primers. 5% DMSO and 1 M betaine were also added. PCR was carried out in a PTC 200 programmable thermocycler under the following conditions: initial denaturation at 94°C for 4 min 30 s; 30 cycles of 94°C for 30 s, 56°C for 1 min and 72°C for 30 s; and a final extension at 72°C for 5 min. The PCR products were verified by 1% agarose gel electrophoresis with ethidium bromide staining and UV visualization.
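
The cycling program above can be summarized with a little arithmetic. The sketch below, written in Python for illustration, adds up the programmed hold times and the amount of each primer per reaction. It assumes that ramp times between temperatures are ignored (they depend on the instrument), and the constant and function names are hypothetical.

```python
# Sketch of the cycling program described above; ramp times between
# temperatures are instrument-dependent and deliberately ignored (assumption).

INITIAL_DENATURATION_S = 4 * 60 + 30          # 94 C for 4 min 30 s
CYCLE_STEPS_S = {"denature_94C": 30, "anneal_56C": 60, "extend_72C": 30}
N_CYCLES = 30
FINAL_EXTENSION_S = 5 * 60                    # 72 C for 5 min

REACTION_VOLUME_UL = 25
PRIMER_CONC_PMOL_PER_UL = 1                   # each primer


def total_hold_seconds() -> int:
    """Sum of the programmed hold times, without ramping."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + N_CYCLES * per_cycle + FINAL_EXTENSION_S


def primer_amount_pmol() -> int:
    """Amount of each primer in one 25 uL reaction."""
    return PRIMER_CONC_PMOL_PER_UL * REACTION_VOLUME_UL


seconds = total_hold_seconds()
print(f"programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
print(f"each primer per reaction: {primer_amount_pmol()} pmol")
```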

Denaturing HPLC analysis
Each sample was heated at 95°C for 5 min and slowly cooled to 22°C by decreasing the temperature by 0.2°C every 4 s to allow heteroduplex formation. 5-10 µL of each sample was run on a Transgenomic WAVE™ DNA Fragment Analysis System (Transgenomic, Omaha) using a DNASep® column, and UV absorbance at 260 nm was monitored during the run.
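
To make the reannealing step concrete, the short Python sketch below works out how many 0.2°C decrements the ramp from 95°C to 22°C contains and how long it takes. It uses only the values quoted in the text; the function names are hypothetical, and exact fractions are an implementation choice to avoid floating-point rounding when dividing by 0.2.

```python
from fractions import Fraction

# Ramp parameters taken from the text; exact fractions avoid
# floating-point surprises when dividing by 0.2.
START_C = Fraction(95)
END_C = Fraction(22)
STEP_C = Fraction(2, 10)   # 0.2 C per step
STEP_SECONDS = 4


def ramp_steps() -> int:
    """Number of 0.2 C decrements needed to go from 95 C to 22 C."""
    steps = (START_C - END_C) / STEP_C
    if steps.denominator != 1:
        raise ValueError("temperature range is not a whole number of steps")
    return int(steps)


def ramp_seconds() -> int:
    """Total duration of the cooling ramp in seconds."""
    return ramp_steps() * STEP_SECONDS


print(f"{ramp_steps()} steps, {ramp_seconds()} s "
      f"(~{ramp_seconds() / 60:.1f} min)")
```

The slow ramp (0.05°C per second) is what gives mismatched strands time to pair into heteroduplexes rather than snapping back into the original homoduplexes.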
The analysis requires 4 buffers: buffer A (0.1 M triethylammonium acetate, TEAA), buffer B (0.1 M TEAA and 25% acetonitrile), buffer C (8% acetonitrile, for syringe wash) and buffer D (75% acetonitrile, for column wash), pumped into the system individually or as a mixture. The samples are introduced into the system via the autosampler, and the mobile phase, initially composed of buffer A, is used to bind the dsDNA fragments to the column matrix. The samples are eluted from the DNASep® column by linear gradient changes of buffers A and B (mobile phase) at a specific oven temperature chosen according to the fragment size and the time of denaturation. Because of their mismatched base pairs, heteroduplexes denature more readily than homoduplexes with increasing temperature and increasing acetonitrile (ACN) concentration.
An incremental temperature scan over the range 50-75°C was used together with a software-predicted melting profile to determine the optimum partially denaturing temperatures for the analyzed fragment and the optimum ACN gradient for polymorphism detection. Because of the heterogeneous melting temperature distribution, which results from the heterogeneous distribution of GC-rich regions, two different temperatures were chosen for the analyzed fragment (55°C and 59°C). The partially denatured amplicons were eluted with a specific linear gradient of acetonitrile (59-67% A:B at 55°C and 55-63% A:B at 59°C) at a flow rate of 0.9 mL/min. The elution profiles of all samples were compared with each other to differentiate between heterozygous and homozygous variants.
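
The gradient settings can be translated into approximate acetonitrile concentrations. The Python sketch below is a minimal illustration under two assumptions that are not stated in the text: the gradient figures are read as percent buffer B in the A/B mixture, and the gradient duration is a hypothetical parameter chosen only for the example. Function and constant names are likewise hypothetical.

```python
# Assumptions (not stated in the text): the gradient values are read as
# percent buffer B, and the gradient duration is a hypothetical parameter.

ACN_IN_B_PERCENT = 25.0   # buffer B: 0.1 M TEAA with 25% acetonitrile
FLOW_ML_PER_MIN = 0.9

GRADIENTS = {             # oven temperature (C) -> (start %B, end %B)
    55: (59.0, 67.0),
    59: (55.0, 63.0),
}


def percent_b(temp_c: int, t_min: float, duration_min: float) -> float:
    """Linear interpolation of %B at time t within the gradient."""
    start, end = GRADIENTS[temp_c]
    frac = min(max(t_min / duration_min, 0.0), 1.0)
    return start + (end - start) * frac


def acn_percent(b_percent: float) -> float:
    """Acetonitrile content of the A/B mixture (buffer A contains none)."""
    return b_percent * ACN_IN_B_PERCENT / 100.0


DURATION_MIN = 4.0  # hypothetical value, for illustration only
for temp in GRADIENTS:
    low = acn_percent(percent_b(temp, 0.0, DURATION_MIN))
    high = acn_percent(percent_b(temp, DURATION_MIN, DURATION_MIN))
    print(f"{temp} C: {low:.2f}% to {high:.2f}% ACN, "
          f"{FLOW_ML_PER_MIN * DURATION_MIN:.1f} mL mobile phase")
```

Read this way, the higher oven temperature is paired with a slightly lower acetonitrile window, which is consistent with temperature and ACN both promoting denaturation.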

Bioinformatics analysis
The bioinformatics tools used for the analyses comprised databases such as GenBank (http://www.ncbi.nlm.nih.gov/sites/entrez) to find mRNA, DNA and protein sequences corresponding to SOD3 from a variety of organisms, each with a given accession number. The RefSeq and SNP databases from GenBank were accessed through NCBI LocusLink or NCBI SNP to locate previously identified SNPs within the given fragment that could be of interest in the bioinformatics analysis. To compare nucleotide sequences, homology searching was performed using Clustal W (www.ebi.ac.uk) for multiple sequence alignment, PIPMaker (http://bio.cse.psu.edu) to identify conserved regions of two DNA sequences from two related species, and FootPrinter (http://bio.cs.washington.edu/software) to identify short motifs (10 bp) conserved within 2 or more DNA sequences from related species. The MatInspector program from Genomatix was used to search for transcription factor sites evolutionarily conserved within the analyzed DNA sequences.

Statistical analysis
The statistical software package SPSS V12 was used for all analyses. Statistical significance was set at a two-sided p<0.05. Categorical variables were compared using Pearson's χ2 test; Student's t-test was used to compare the means of two groups when the data were normally distributed, and the Mann-Whitney test was used for non-normally distributed data.
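
For readers who want to see what the categorical comparison involves, the following Python sketch computes Pearson's χ2 statistic and its two-sided p-value for a 2x2 table using only the standard library. It is an illustration, not the SPSS procedure used in the study; the function name is hypothetical, and the cell counts in the example are invented (only the group sizes of 70 and 100 come from the text).

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (chi2, p) where p is the upper tail of the chi-square
    distribution with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = 0.0
    for observed, row, col in ((a, row1, col1), (b, row1, col2),
                               (c, row2, col1), (d, row2, col2)):
        expected = row * col / n
        chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# HYPOTHETICAL counts, for illustration only (not data from this study):
# rows = patients (n=70) and controls (n=100); columns = factor present/absent.
chi2, p = pearson_chi2_2x2(40, 30, 30, 70)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```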

Results and discussions
Statistical analyses showed that subjects with significant atheroma had significantly higher mean age, cholesterol and triglyceride levels than subjects without significant atheroma (table 1). As expected, there was a higher proportion of men among patients than among controls. There was also a significantly higher incidence of hypertension, diabetes, dyslipidemia and family history in patients. Of the 70 patients, 63 (90.0%) had coronary artery stenosis, affecting one vessel (18 cases, 25.7%), two vessels (26 cases, 37.1%) or three vessels (19 cases, 27.1%); 9 had carotid stenosis and 12 had peripheral vessel involvement.
Bioinformatics analysis of the 500 bp human SOD3 promoter fragment located in the 5'-flanking region of the SOD3 gene (between -500 bp and +1) was performed using the reference sequence NM_003102.1 (GenBank). This reference sequence corresponds to a transcript with 3 exons (563 bp, 84 bp and 1337 bp). The third exon comprises the entire coding region of the protein; the other 2 exons comprise regulatory elements with a putative role in the regulation of transcription, such as CAAT, AP-1, AP-4, CREB and ARE, previously identified by Folz & Crapo [17]. The MatInspector analysis [18-20] searches sequences for matches with a consensus matrix description; the matrix similarity is calculated following a published algorithm [18], and higher scores indicate a higher probability that the analyzed sequence matches the consensus matrix. For the analyzed fragment of the SOD3 promoter, MatInspector predicted transcription factor sites corresponding to Krueppel-like elements (BKLF, EKLF, GKLF) as well as to ETS family members (Ets-1a, PEA3, Elk-1, GABPα, Spi-B and Spi-1/PU.1). Additionally, the analyzed fragment comprises two sites corresponding to the signal transducer and activator of transcription factor STAT1 (5'-TTCYNRGAA-3' [24]) at positions -126 and -48, and two STAT5 sites (5'-TTNYNNNAA-3' [25]) at positions -128 and -51, both on the negative strand; these sites may respond to cytokine and growth factor signals in a specific biological context. The analysis also revealed sites corresponding to transcription factors involved in the cell cycle, such as GATA-1 (5'-WGATAR-3' [26,27]) at -108 on the negative strand and -
The Clustal W alignments [21,23] of the human, rat and mouse DNA sequences of the 5'-flanking region of the SOD3 gene (based on reference numbers NM_003102.1 for human, NM_011435.3 for mouse and NM_012880.1 for rat) (fig. 1), together with the FootPrinter [22] analysis results (fig. 2), revealed low homology between the human and rodent promoters, showing evolutionary changes in the human promoter sequence. Several structural motifs conserved in all three species were identified, but the sites conformed only partially to previously described sequences.
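
The consensus strings quoted above use IUPAC degenerate codes (Y = C or T, R = A or G, W = A or T, N = any base). The Python sketch below shows how such a consensus can be expanded into a pattern and searched on both strands. It is a deliberate simplification: MatInspector scores matches against weight matrices, whereas this sketch only finds exact consensus matches. The function names are hypothetical and the demo sequence is made up; it is not the SOD3 promoter.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

CONSENSUS = {              # consensus strings quoted in the text
    "STAT1": "TTCYNRGAA",
    "STAT5": "TTNYNNNAA",
    "GATA-1": "WGATAR",
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq: str, consensus: str):
    """Return sorted (0-based start, strand, matched text) on both strands."""
    # A lookahead with a capture group also reports overlapping matches.
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in consensus) + "))")
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    for m in pattern.finditer(rc):
        # Convert the start on the reverse strand to forward-strand coordinates.
        start = len(seq) - (m.start() + len(consensus))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Made-up test sequence (NOT the SOD3 promoter) with a STAT-like
# and a GATA-like site.
demo = "GGTTCCAGGAACCAGATAGCC"
for factor, consensus in CONSENSUS.items():
    print(factor, find_sites(demo, consensus))
```

Exact matching of this kind tends to both over-report short degenerate motifs and miss sites that deviate slightly from the consensus, which is one reason matrix-based scoring is preferred in practice.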
In order to identify new mutations and/or polymorphisms in the analyzed fragment, genomic DNA from all subjects was used to amplify the 485 bp EC-SOD promoter fragment of interest (the region between -469 bp and +16 bp). The PCR products were verified on a 1% agarose gel using ethidium bromide for UV visualization.
DHPLC analysis at both temperatures chosen for the screening (55°C and 59°C) did not show different melting profiles for the analyzed samples, as shown in figure 3, indicating that there are no polymorphisms in the fragment of interest amplified from the 170 samples. Screening of the NCBI database for previously reported mutations/polymorphisms showed several mutations in the human SOD3 promoter region of interest. The rare SOD3 promoter variants described in the database have low rare-allele frequencies (<5%); no individuals homozygous for the rare alleles have been reported. The polymorphic sites are shown in the Clustal W report in fig. 4, in order to analyze their positions relative to the TF binding sites. MatInspector analysis of the human SOD3 promoter fragment containing these polymorphic sites identified four mutations in which the polymorphic site interferes with TF binding sites (see table 1). The rarer allele A of the rs17885921 G/A EC-SOD promoter polymorphism creates a new weak binding site for COMP1 [45], which is superposed on a SOX5 [39] binding site. COMP1 is a transcription factor that cooperates with myogenic proteins, and SOX5 is a testis-specific transcription factor that regulates embryonic development but is also expressed in liver and lymph nodes. Two new binding sites, corresponding to DBP [43] and HOX1.3 [44], are created by the rarer allele T. The rarer allele A of the rs3471066 G/A EC-SOD promoter polymorphism destroys an EKLF/GABP site [47,48] that is conserved in the analyzed human, rat and mouse EC-SOD promoter fragments. The rarer allele A also creates a bipartite site for HOX/PBS [46], increasing the possibility of binding both HOX and PBS proteins and, consequently, the probability that the site is functional in a given biological context.

Conclusions
The regulation of EC-SOD expression and activity is of major importance for modulating the function and toxicity of oxidant radicals in the extracellular space. Taking into account the high level of EC-SOD in the vascular wall, it is becoming clear that the regulation of EC-SOD activity, as well as the in vivo regulation of SOD3 expression by different transcription factors, is of major importance for antioxidant defense. Two modern methods were used to investigate the region between -500 bp and +1 bp of the EC-SOD (SOD3) promoter in order to identify putative functional transcription factor binding sites in the presence of atheroma or other risk factors.
The DHPLC method is a modern tool able to differentiate heteroduplexes from homoduplexes in a very short time (minutes), which makes it a valuable tool for screening DNA fragments for point mutations. Although direct sequencing is the standard method for point mutation screening, time and cost are important considerations, and the DHPLC method is advantageous in both respects. Different studies have shown that DHPLC has been successfully applied to mutation detection in the PTEN [49] and INK4A [50] genes. For PTEN gene screening, the sensitivity of DHPLC detection was estimated to be greater than 95%, similar or superior to that obtained by the SSCP method [49]. Under specific conditions, DHPLC detected 100% of the mutations revealed by sequencing of the INK4A gene [50]. Considering these observations, the results of our study, in which no different profiles were detected by DHPLC analysis of 170 samples at two different temperatures (55°C and 59°C), suggest that there are no polymorphisms/mutations in the analyzed fragments. Screening of the NCBI SNP database showed that several polymorphisms have been reported in this SOD3 fragment, for which no homozygous carriers of the rarer allele exist and the frequency of heterozygotes is under 5%. Consequently, it is possible that in the Caucasian population of our study the polymorphisms/mutations described in the database do not exist, or that the frequency of the rarer allele is lower, so that a larger study would be required to identify these polymorphic variants of the EC-SOD promoter.
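
The argument that a larger study may be needed can be made quantitative with a simple binomial calculation. The Python sketch below estimates the probability that 170 subjects include at least one heterozygous carrier, and the sample size needed to reach a chosen probability. It assumes independently sampled subjects and that every carrier would be detected by DHPLC; the carrier frequencies are illustrative values, not estimates from this study, and the function names are hypothetical.

```python
import math

N_SUBJECTS = 170   # samples screened in the study


def prob_at_least_one_carrier(carrier_freq: float, n: int = N_SUBJECTS) -> float:
    """P(at least one carrier among n independently sampled subjects)."""
    return 1.0 - (1.0 - carrier_freq) ** n


def subjects_needed(carrier_freq: float, power: float = 0.95) -> int:
    """Smallest n giving at least `power` probability of sampling one carrier."""
    return math.ceil(math.log(1.0 - power) / math.log(1.0 - carrier_freq))


for freq in (0.05, 0.01, 0.005):   # illustrative carrier frequencies
    print(f"carrier frequency {freq:.3f}: "
          f"P(at least one in {N_SUBJECTS}) = "
          f"{prob_at_least_one_carrier(freq):.3f}, "
          f"n for 95% = {subjects_needed(freq)}")
```

Under these assumptions, a variant carried by about 5% of subjects would almost certainly appear in 170 samples, whereas a variant carried by well under 1% could easily be missed, which is in line with the call for a larger study.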
The bioinformatics analysis of the EC-SOD promoter fragment showed several TF binding sites from the ETS (Ets-1, Elk-1, PEA3, GABP, Spi-B, Spi-1/PU.1), MZF-1, GC box and Krueppel-like (GKLF, BKLF) transcription factor families in the conserved regions of the human, rat and mouse sequences. The human promoter sequence showed an evolutionarily different pattern of transcription factor clusters compared with the rodent sequences.
The computational analysis of the human EC-SOD promoter fragment containing the six polymorphic sites described in the NCBI SNP database showed that the rarer allele creates or destroys a transcription factor binding site in four polymorphic variants, increasing or decreasing the probability that the site binds its specific transcription factor.
Given the large number of false positive results generated by the computational methods used to predict transcription factor binding sites and to find conserved motifs in the analyzed DNA region, the limitations of bioinformatics methods are evident. The results of the present study are only preliminary and presumptive. To increase confidence in these results, the computational analysis should be accompanied by experimental in vivo and in vitro methods, such as nested-deletion promoter assays, which give information about the enhancer or repressor regions of the promoter fragment, and EMSA (electrophoretic mobility shift assay), which confirms the specific binding of nuclear proteins to a specific DNA sequence and can thereby validate the findings of the bioinformatics analysis.

Fig. 1. The Clustal W alignments for rat, mouse and human DNA sequences; the conserved transcription factor sites are highlighted in grey

Fig. 4. Polymorphic sites identified in the human DNA sequence by screening the NCBI database are underlined and highlighted for all three sequences

CORINA SAMOILA 1, ALFA XENIA LUPEA 2*, ANDREI ANGHEI 1, MARILENA MOTOC 1, GABRIELA OTIMAN 1, ANDREI MOTOC 1, LIVIU TAMAS 1
1 Victor Babes University of Medicine and Pharmacy, 2 P-ta Eftimie Murgu, 300041, Timisoara, Romania
2 Polytechnics University of Timisoara, Faculty of Industrial Chemistry and Environmental Engineering, 2 P-ta Victoriei, 300006, Timisoara, Romania

Table 1
THE INFORMATION PROVIDED BY BIOINFORMATICS ANALYSIS OF POLYMORPHIC VARIANTS OF HUMAN SOD3 PROMOTER DNA SEQUENCE