Data on novel DNA methylation changes induced by valproic acid in human hepatocytes

Valproic acid (VPA) is a widely prescribed antiepileptic drug in the world. Despite its pharmacological importance, it may cause liver toxicity and steatosis. However the exact mechanism of the steatosis formation is unknown. The data presented in this DIB publication is used to further investigate the VPA-induced mechanisms of steatosis by analyzing changes in patterns of methylation. Therefore, primary human hepatocytes (PHHs) were exposed to VPA at a concentration which was shown to cause steatosis without inducing overt cytotoxicity. VPA was administered for 5 days daily to PHHs. Furthermore, after 5 days VPA-treatment parts of the PHHs were followed for a 3 days washout. Differentially methylated DNA regions (DMRs) were identified by using the ‘Methylated DNA Immuno-Precipitation - sequencing’ (MeDIP-seq) method. The data presented in this DIB demonstrate induced steatosis pathways by all DMRs during VPA-treatment, covering interesting drug-induced steatosis genes (persistent DMRs upon terminating VPA treatment and the EP300 network). This was illustrated in our associated article (Wolters et al., 2017) [1]. MeDIP-seq raw data are available on ArrayExpress (accession number: E-MTAB-4437).


a b s t r a c t
Valproic acid (VPA) is a widely prescribed antiepileptic drug in the world. Despite its pharmacological importance, it may cause liver toxicity and steatosis. However the exact mechanism of the steatosis formation is unknown. The data presented in this DIB publication is used to further investigate the VPA-induced mechanisms of steatosis by analyzing changes in patterns of methylation. Therefore, primary human hepatocytes (PHHs) were exposed to VPA at a concentration which was shown to cause steatosis without inducing overt cytotoxicity. VPA was administered for 5 days daily to PHHs. Furthermore, after 5 days VPA-treatment parts of the PHHs were followed for a 3 days washout. Differentially methylated DNA regions (DMRs) were identified by using the 'Methylated DNA Immuno-Precipitation -sequencing' (MeDIP-seq) method. The data presented in this DIB demonstrate induced steatosis pathways by all DMRs during VPA-treatment, covering interesting drug-induced steatosis genes (persistent DMRs upon terminating VPA treatment and the EP300 network). This was illustrated in our associated article (Wolters et al., 2017) [1]. MeDIP-seq raw data are available on ArrayExpress (accession number: E-MTAB-4437 Primary human hepatocytes (PHHs) were treated by valproic acid (VPA) at an incubation concentration of 15 mM for 5 days daily followed by a washout of 3 days

Experimental features
The treated samples were corrected for time-matched vehicle controls. The persistent changes were identified by determining DNA methylation similarities between samples of 5 days daily VPA-treatment and samples of 3 days washout upon the 5 days daily VPA-treatment Data source location

Value of the Data
The data derived from primary human hepatocytes (PHHs) treated with valproic acid (VPA) as well as the data analysis approaches in this publication can serve as a benchmark to investigate the epigenetics effects of other hepatotoxic compounds, since the data show that Methylated DNA Immuno-Precipitationsequencing (MeDIP-seq) analysis is highly informative in disclosing novel mechanisms of VPA-induced toxicity in PHHs.
The investigation of persistent methylation changes in PHHs provides a new perspective for other studies related to the drug-induced steatosis or other forms of toxicity.
The listed gene EP300 together with the neighbors, of the network analysis, can be used for the development of biomarker screening tools for the early detection of drug-induced steatosis or other forms of toxicity, also by using other cell types.

Data
Methylated DNA Immuno-Precipitationsequencing (MeDIP-seq) analysis showed that the methylation of more than 6000 genes significantly changed after 5 days daily valproic acid (VPA)treatment (3006 hypermethylated differentially methylated DNA regions (DMRs) and 3077 hypomethylated DMRs). 31 DMRs were persistently methylated after taking the compound away (11 hypomethylated DMRs and 20 hypermethylated DMRs). The names and functions of those persistent DMRs are shown in Table 1. Furthermore, the 3006 hypermethylated and 3077 hypomethylated DMRs were classified into 119 significantly enriched pathways ( Table 2). The unique genes of all those 119 significantly enriched pathways, which have shown significant methylation changes in our data after 5 days daily VPA-treatment, formed a complex network module (Fig. 1A-B). The gene EP300 has 33 neighbors ( Fig. 1B-C) and the gene names, gene symbols, and fold changes (FCs) of those neighbors were shown in Table 3. A more detailed description of those findings can be found in Wolters et al. [1].    The steatosis related pathways were shown in red.  Table 3. green ¼ hypermethylation; red ¼ hypomethylation; yellow ¼ neighbors of the gene 2033 (EP300) of the large molecular interaction network. PHHs, primary human hepatocytes; VPA, valproic acid; FCs, fold changes.

Cell culture and treatment
Cryopreserved primary human hepatocytes (PHHs, Invitrogen) of 3 individuals (Hu8084, Hu4197 and Hu4227) were thawed for 1 minute at 37°C in a water bath. Next, these PHHs were pooled in order to bypass inter-individual variability in susceptibility to toxicants and cultured in 6-well plates in a collagen sandwich [2], according to the supplier's protocol (Invitrogen). After 3 days, the PHHs were daily exposed to 15 mM VPA or 1% EtOH (vehicle control) in culture medium for 5 days. Culture medium was changed daily thereby administering a new incubation concentration of VPA or EtOH to the cells. After the exposure period of 5 days, PHHs were lysed for DNA isolation. Another well of PHHs was maintained in culture for 3 consecutive days without VPA-treatment (called washout); the culture medium was again changed every day. All experiments were performed in biological triplicates.

DNA isolation
PHHs were collected in 500 μL of digestion buffer (1 mM EDTA; 50 mM Tris-HCl, pH 8.0; 5% SDS) and proteinase K (1 mg/ml) (Ambion). After incubation for 1 hour at 55°C, the proteinase K was inactivated at 80°C. RNAse A (400 μg/ml) (Qiagen) and 1% collagenase (Sigma) treatment was performed for 1 h at 37°C. An equal amount of phenol-chloroform-isoamylalcohol (PCI; 25:24:1) (Sigma) was added and shaken manually for 5 minutes. After centrifugation, the upper phase was again treated with PCI until protein was no longer visible at the interphase. The upper phase was precipitated using 50 µL of 3 M NaAc pH 5.6 and 1250 µL of cold 100% ethanol. The DNA pellet was washed using cold 70% EtOH, dissolved in 50 µL of nuclease free water and quantified spectrophotometrically using the NanoDrop 1000 (Thermo Scientific, Waltham, MA). The total amount of DNA obtained was at least 10 µg of DNA, the 260/280 ratio laid between 1.7-1.9, and the 260/230 ratio was higher than 1.6.

MeDIP-seq protocol
MeDIP-seq was performed, with all the biological triplicates after DNA isolation, according to the protocol of Taiwo et al. [3], with minor adjustments.

DNA fragmentation to a size of~200 bp
For DNA fragmentation, 300 ng of isolated DNA were sonicated on the bioruptor (Diagenode) by using instrument settings of 15 cycles, each consisting of 30 seconds on/off periods. After fragmentation, the genomic DNA size range was assessed using an Agilent 2100 Bioanalyzer and high-sensitivity DNA chips (Agilent Technologies), according to the manufacturer's instructions.

Library preparation and size selection
Libraries were prepared using 300 ng of fragmented DNA (~200 bp) and the NEBNext Ultra DNA Library Prep Kit for Illumina (NEB), according to the manufacturer's protocol.

MeDIP analysis
The purified adaptor-ligated DNAs were used for Methylated DNA Immuno-Precipitation (MeDIP), according to the manufacturer's instructions of the MagMeDIP kit (Diagenode) and IPure kit (Diagenode).

Quality control
Quantitative PCR (qPCR) was used for controlling DNA methylation enrichment. qPCR was performed by measuring the Ct-values of 1 µL of purified DNA sample and 24 µL of qPCR mixture (1 µL of provided primer pair (reverse and forward), 12.5 µL of SYBR Green PCR master mix and 10.5 µL water) using the temperature profile: 95°C for 7 min, 40 cycles of 95°C for 15 s. and 60°C for 1 minute, followed by 1 minute 95°C. The efficiency of MeDIP was calculated by performing qPCR and using the following formula: %(meDNA-IP/Total input) ¼ 2^[(Ct(10%input)-3.32) -Ct(meDNA-IP)] × 100%. The efficiency for methylated DNA fragments was good ( 450%) for all samples. More interestingly, the efficiency for non-methylated DNA fragments was overall lower than 1.0%.

PCR amplification and size selection
PCR was used to amplify the MeDIP adaptor-ligated DNA fragments. In brief, 25 µL NEBNext High Fidelity 2x PCR Master mix (NEB), 1 µL of Index primer (NEB) that was used as a barcode for each sample, and 1 µL of Universal PCR primer (NEB) were added to 23 µL of the MeDIP adaptor ligated DNA fragments. PCR was performed by using the temperature profile: 98°C for 30 s, 15 cycles of 98°C for 10 s, 65°C for 30 sec., and 72°C for 30 s, followed by 5 minutes at 72°C and hold on 4°C as described before [3].
Thereafter, PCR-amplified DNAs (libraries) were cleaned using Cleanup of PCR Amplification in the NEBNext Ultra DNA Library Prep Kit for Illumina (NEB). Fragmented DNA size and quality were checked using the Agilent 2100 Bioanalyzer and high-sensitivity DNA chips (Agilent Technologies). In addition, generated libraries were size-selected on a 2% TAE low melting point (LMP) agarose gel; fragments of 250-350 bp were excised and the MinElute Gel DNA extraction kit (Qiagen) was used to extract and purify the DNA libraries. Libraries were quantified on a Qubit fluorimeter (Invitrogen) by using the Qubit dsDNA HS Assay kit (Invitrogen). All kits and chips were used according to the manufacturer's protocol.

Sequencing
The 12 amplified libraries, each sample having its own index primer, were pooled at an equimolar concentration of 2 nM, based on Qubit measurements. Ten, 15, and 20 pM of the 2 nM stock solution were then loaded onto three separated channels of a 1.4 mm flow cell (Illumina) and cluster amplification was performed on a cBot (Illumina). Clusters were generated on cBot (Illumina) using the TruSeq® PE Cluster Kit V3, according to the manufacturer's instructions (Illumina), and the paired-end libraries were sequenced using 2 × 100 cycles TruSeq™ SBS Kit v3 paired-end by sequencing by synthesis (SBS) on the Illumina HiSeq. 2000. Base calling was performed by using Casava 1.8.2 (Illumina) and de-multiplexing by using bcl2fastq 1.8.4 (Illumina). Sequence reads were aligned against the human reference genome called UCSC hg19. This alignment produces FASTQ files for each barcoded library. MeDIP-seq raw data are available on ArrayExpress (accession number: E-MTAB-4437).

MeDIP-Seq analysis
FastQC was applied to check the quality of the 100 bp reads pairs of the 12 sequenced samples. Pairedend sequencing reads were aligned against hg19 using Bowtie2 software. The MEDIPS package (version 1.16.0, Bioconductor) was used for the analysis of the MeDIP-seq data [4][5][6]. The default parameters described in the MEDIPS guideline (version 1.16.0) [7] were applied to all data from individual chromosomes, including mitochondrial DNA (chrM). The dataset was divided into four different groups of triplicates: (1) Control MeDIP samples includes the sequencing data of PHHs daily exposed during 5 days to the control vehicle; (2) VPA-treated MeDIP samples includes the sequencing data of PHHs exposed for 5 days daily to VPA; (3) Control washout MeDIP samples contains the sample exposed for 5 days daily with the vehicle control followed by a washout-period of 3 days; and (4) VPA-treated followed by washout MeDIP samples includes the sequencing data of PHHs exposed treated daily by VPA for 5 days followed by a washout-period of 3 days. Statistical analysis was performed applying the default parameters of MEDIPS, using the edgeR module, an empirical analysis of digital gene expression data in R that uses Bayes estimation and exact tests based on the negative binomial distribution [8]. Notably, raw count data was normalized using the weighted trimmed mean of M-values (TMM-normalization). Regions were considered significantly methylated if the edgeR.p-value was below 0.01 and if the number of reads, of a specific region, in one of the samples was higher than the mean of reads of all regions (the whole genome), which is the background correction. This p-value was derived from other studies performing MeDIP-seq analysis [9][10][11].
Annotation of DMRs into different genomic locations was achieved by using the HOMER software Regions were merged if (1) the start of a region was consecutive to the end of the previous region and (2) if the HOMER annotations of these consecutive DMRs were the same. Significant selected DMRs lists and unique gene lists were uploaded onto VENNY [12]. In this paper, the names and functions of the persistent genes are available in Table 1.

Pathway analysis
ConsensusPathDB [13] was used to identify and visualize the involvement of the unique genes in biological processes that have been derived from affected pathways, by selecting significant pathways with a p-value o 0.01 from a gene enrichment analysis. In this paper, the significant pathways are available in Table 2.

Network visualization
Methylated genes were then uploaded onto Cytoscape. The circular layout was selected and the network was analyzed as undirected. FCs were added and nodes were colored (green ¼ hypermethylation (positive FCs) and red ¼ hypomethylation (negative FCs)).
The first neighbors of methylated hub genes were selected by using the tool first neighbors of selected nodes in Cytoscape. Then, a sub-molecular induced epigenome network with its first neighbors was prepared in Cytoscape. In this publication, molecular interaction networks and a submolecular interaction network of the gene EP300 is available in Fig. 1.
Furthermore, names, FCs, and presence of the gene in one or more steatosis related pathways of the 33 neighbors of EntrezGeneID 2033 (EP300) are available in Table 3.