Datasets describing the growth and molecular features of hepatocellular carcinoma patient-derived xenograft cells grown in a three-dimensional macroporous hydrogel

This data article presents datasets associated with the research article entitled “Generation of matched patient-derived xenograft in vitro–in vivo models using 3D macroporous hydrogels for the study of liver cancer” (Fong et al., 2018) [1]. A three-dimensional macroporous sponge system was used to generate in vitro counterparts to various hepatocellular carcinoma patient-derived xenograft (HCC-PDX) lines. This article describes the viability, proliferative capacity and molecular features (genomic and transcriptomic profiles) of the cultured HCC-PDX cells. The sequencing datasets are made publicly available to enable critical or further analyses.

Cells from various HCC-PDX in vivo models were cultured in a 3D sponge scaffold to determine whether these cells could be grown in vitro.

Experimental features
The degree of genomic and transcriptomic correlation between paired in vitro and in vivo HCC-PDX models was established for all the lines.

Data source location
Singapore

Data accessibility
GEO database (GSE109955)

Value of the data
The data present the molecular features (genomic and transcriptomic profiles) of various in vivo and corresponding in vitro HCC-PDX models and could be used by other researchers to study HCC.
These data allow other researchers to extend the correlative analyses between the established in vivo and corresponding in vitro HCC-PDX models.
The whole exome sequencing (WES) and RNA-sequencing (RNA-seq) data may also enable comparisons between the HCC-PDX models and other HCC models.

Data
The dataset of this article provides information on the characteristics of the HCC-PDX cells when grown in the sponge system. Fig. 1 describes the physical characterization of the sponge system used to culture the HCC-PDX cells. Figs. 2-4 describe the viability, proliferative capacity and growth profile of 8 different HCC-PDX lines grown in the sponge (data for 6 other lines can be found in [1]). These are followed by data comparing the SNP and INDEL overlap (Table 2) and the mutational signatures (Fig. 6) of HCC-PDX cells grown in vitro versus those grown in vivo, as well as the correlation analysis of gene expression levels (based on a known set of dysregulated HCC genes as reported by Ho et al. [2]) between individual HCC-PDX in vivo-in vitro pairs (Fig. 8).

Experimental design, materials and methods
For each HCC-PDX line, cells were harvested from the tumors, dissociated and seeded onto the macroporous sponge. After 7 days in culture, DNA and RNA were extracted for WES and RNA-seq, respectively. HCC-PDX cells grown in the sponge will be referred to as HCC3D-PDX, while their corresponding in vivo counterparts will be referred to as HCC-PDX.

Material characterization of macroporous sponge
Hydroxypropylcellulose (HPC) was used to fabricate the macroporous sponge. In brief, HPC was first grafted with methacrylic (MA) groups, which renders the polymer photo-crosslinkable. NMR spectroscopy was performed to confirm the successful grafting of MA groups onto HPC (Fig. 1A). Subsequently, MA-HPC was allowed to undergo thermal-induced phase separation and was crosslinked with gamma irradiation. The pore size distribution of the resulting sponge was then quantified with ImageJ software from a collection of top-view images of the sponge obtained using scanning electron microscopy (SEM) (Fig. 1B). Top views of the sponge surface morphology were captured using SEM (JEOL JSM-5600, Japan) at 5 kV. Prior to imaging, the dried sponge was sputter-coated with platinum for 60 s. The sponge was also conjugated with galactose moieties as previously described [3]. The successful conjugation of galactose onto the MA-HPC backbone was confirmed by X-ray photoelectron spectroscopy (Fig. 1C). Measurements were made on a VG ESCALAB Mk II spectrometer with a Mg Kα X-ray source (1253.6 eV photons) at a constant retard ratio of 40.

Viability, growth and proliferative capacity of HCC-PDX cells in sponge
The ability of 14 different HCC-PDX lines to grow in this macroporous sponge is reported in [1]. Fig. 2 shows the morphology and viability of 2 different HCC-PDX lines grown in the sponge at Days 2 and 20. To assess viability, samples were stained with calcein-AM (2 µM) and propidium iodide (25 µg/mL), which label live cells green and dead cells red, respectively. Following a 30 min incubation with the two dyes, samples were immediately imaged using an Olympus Fluoview FV1000 or Zeiss LSM 710 confocal microscope. Growth was assessed using CellTiter-Glo (Promega) as described by the manufacturer. While the viability, proliferative capacity and growth profile of 6 HCC-PDX lines were reported in [1], this article presents those of the other 8 HCC-PDX lines grown in the sponge (Figs. 3 and 4).

Transcriptomic correlation between HCC-PDX and HCC3D-PDX using RNA-seq

RNA-seq library preparation
RNA quality was assessed by analysis of rRNA band integrity using an Agilent RNA 6000 Nano kit (Agilent Technologies, CA). Before cDNA library construction, 1 µg of total RNA was enriched for poly(A) mRNA using magnetic beads with Oligo (dT). The purified mRNAs were then disrupted into short fragments, and double-stranded cDNAs were immediately synthesized. The cDNAs were subjected to end repair, poly(A) addition, and ligation of sequencing adapters using the TruSeq RNA sample prep Kit (Illumina, CA). Fragments of suitable size, purified automatically on a BluePippin 2% agarose gel cassette (Sage Science, MA), were selected as templates for PCR amplification. The final library sizes and qualities were evaluated electrophoretically with an Agilent High Sensitivity DNA kit (Agilent Technologies, CA), and the fragment sizes were found to be between 350 and 450 bp. Subsequently, the library was sequenced using an Illumina HiSeq 2500 sequencer (Illumina, CA; Table 1).

RNA-seq processing
After quality check with FastQC (Fig. 5), short reads were aligned to human genome assembly hg38 using STAR [4]. Transcript expression levels were measured as Fragments Per Kilobase of transcript per Million mapped reads (FPKM) using the analyzeRepeats.pl script from the HOMER package [5].
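As an illustration of the expression measure named above, the following minimal sketch computes FPKM from its standard definition. It is not the HOMER implementation, whose normalization details may differ; the function name `fpkm` and the example numbers are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped reads.

    Standard definition (an assumption about the general formula, not
    HOMER's exact code):
        FPKM = fragments * 1e9 / (length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example_value = fpkm(500, 2000, 20_000_000)
```

Dividing by both transcript length and library size is what makes values comparable across transcripts of different lengths and across samples sequenced to different depths.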

Transcriptomic correlation between the matched PDX and 3DPDX models
To investigate whether HCC-PDX and HCC3D-PDX share similar gene expression profiles, we focused on 219 up-regulated and 514 down-regulated genes known to be dysregulated in HCC [2] (EPR1, although reported by Ho et al., has been discontinued since 2011 [6]). Comparative analysis was performed using Pearson correlation on FPKM values that had undergone inverse hyperbolic sine transformation. Fig. 8 shows the degree of correlation between paired in vivo and in vitro models for the 14 HCC-PDX lines. RNA-seq data for the 14 HCC-PDX lines (both in vivo and corresponding in vitro models) are publicly available in the GEO datasets (GSE109903).
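The comparison described above can be sketched as follows. This is a minimal illustration, assuming two equal-length lists of FPKM values for the same genes in the same order; the function name `asinh_pearson` and the example values are hypothetical and do not reproduce the authors' scripts.

```python
import math


def asinh_pearson(fpkm_a, fpkm_b):
    """Pearson correlation of two FPKM vectors after asinh transformation."""
    if len(fpkm_a) != len(fpkm_b) or len(fpkm_a) < 2:
        raise ValueError("need two equal-length vectors with at least 2 genes")
    # Inverse hyperbolic sine behaves like a log for large values
    # but is defined at zero, so unexpressed genes need no pseudocount.
    xs = [math.asinh(v) for v in fpkm_a]
    ys = [math.asinh(v) for v in fpkm_b]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical FPKM values for four genes in an in vivo / in vitro pair.
in_vivo = [0.0, 1.2, 15.0, 240.0]
in_vitro = [0.1, 0.9, 22.0, 180.0]
r = asinh_pearson(in_vivo, in_vitro)
```

In practice the two vectors would be restricted to the selected dysregulated HCC genes before the correlation is computed for each in vivo-in vitro pair.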

WES library preparation
The quality and quantity of purified DNA were assessed by fluorometry (Qubit, Invitrogen) and gel electrophoresis. Briefly, 500 ng of genomic DNA from each sample was fragmented by acoustic shearing on a Covaris S2 instrument. Fragments of 150-300 bp were ligated to Illumina's adapters and PCR-amplified. The samples were concentrated to 300 ng in 3.4 μl DW using a Speedvac machine (Thermo Scientific) and hybridized with RNA probes (SureSelect XT Human All Exon V5 Capture library) for 16-24 h at 65°C. After hybridization, the biotinylated probe/target hybrids were pulled down using streptavidin-coated magnetic beads (Dynabeads MyOne Streptavidin T1; Life Technologies Ltd.) and buffers. The selected regions were then PCR-amplified using Illumina PCR primers. Libraries were quantified using the Agilent 2100 Bioanalyzer (Agilent Technologies) and KAPA Library Quantification Kit (KK4824, Kapa Biosystems). The resulting purified libraries were applied to an Illumina flow cell for cluster generation and sequenced using 150 bp paired-end reads on an Illumina HiSeq 2500 sequencer following the manufacturer's protocols (Table 1). Image analysis was performed using the HiSeq Control Software version 1.8.4.

WES processing
To remove mouse reads from the PDX samples, BBMap [7] was applied to the FASTQ files using the hg19 and Ensembl Release 77 reference genomes for human and mouse, respectively; only reads classified as human were analyzed further. After a quality check with FastQC (Fig. 7), high-quality reads were aligned to the human reference genome hg19 using the Burrows-Wheeler Aligner (BWA) [8], and duplicate reads were removed using Picard. Alignment refinement and genetic variant calling were completed using the Genome Analysis Toolkit (GATK) [9].

Genomic profiling of the matched PDX and 3DPDX models
The overlap of SNPs and INDELs between HCC-PDX and HCC3D-PDX was analyzed using VCFtools [10]. The SNPs and INDELs common to HCC-PDX and HCC3D-PDX are shown in Table 2. To profile substitution patterns for mutational signature analysis [11] in HCC-PDX and HCC3D-PDX, we extracted the 6 main types of substitutions, namely C>A, C>G, C>T, T>A, T>C and T>G. For each main nucleotide substitution type, there are 16 different trinucleotide combinations (4 possible 5' bases × 4 possible 3' bases), and the occurrence frequency of each trinucleotide-based substitution subtype was calculated. The mutational signatures for paired in vivo-in vitro HCC-PDX models are shown in Fig. 6. Whole exome sequencing data for 11 HCC-PDX lines (both in vivo and corresponding in vitro models) are publicly available in the GEO datasets (GSE109954).
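The trinucleotide-based counting described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes each single-nucleotide variant is supplied as a 3-base reference context centred on the mutated base plus the alternate base, and it follows the common convention of reporting every substitution relative to the pyrimidine (C or T) of the base pair. All function names and example variants are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def substitution_channel(context, alt):
    """Map a SNV to one of the 96 channels, e.g. 'A[C>T]G'.

    context: 3-base reference sequence centred on the mutated base.
    alt: the alternate (mutated) base.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or len(alt) != 1:
        raise ValueError("context must be 3 bases and alt must be 1 base")
    if any(base not in COMPLEMENT for base in context + alt):
        raise ValueError("only A, C, G and T are supported")
    if context[1] == alt:
        raise ValueError("alt must differ from the reference base")
    # Purine reference (A or G): switch to the opposite strand so that
    # the substitution is expressed as one of the 6 pyrimidine classes.
    if context[1] in "AG":
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def channel_frequencies(variants):
    """Relative frequency of each observed channel among the variants."""
    counts = Counter(substitution_channel(ctx, alt) for ctx, alt in variants)
    total = sum(counts.values())
    return {channel: n / total for channel, n in counts.items()}


# Hypothetical variants given as (trinucleotide context, alternate base).
freqs = channel_frequencies([("ACG", "T"), ("CGT", "A"), ("TTA", "C")])
```

Collapsing the two strands is what reduces the 12 possible base changes to 6 classes, giving 6 × 16 = 96 subtypes whose frequencies can then be compared between an in vivo sample and its in vitro counterpart.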