Datasets for next-generation sequencing of DNA and RNA from urine and plasma of patients with prostate cancer

Current prostate cancer (PCa) diagnostic tests suffer from insufficient sensitivity and specificity. Novel biomarkers that can be detected by minimally invasive methods are of particular value. Here we provide two datasets. The first comprises whole-transcriptome profiling by RNA-seq of urine and plasma obtained from patients with PCa and benign prostatic hyperplasia (BPH). The second comprises targeted sequencing of DNA from urine and plasma of patients with PCa and BPH. Both datasets are available at the NCBI Sequence Read Archive under accession Nos. SRP093707 and SRP093842, respectively.

Keywords: Transcriptome; RNA-seq; Targeted DNA sequencing; Prostate cancer; Benign prostatic hyperplasia

Value of the data
The data on protein-coding and non-coding RNAs detectable in urine and plasma of patients with PCa are valuable for screening novel biomarkers for the early diagnosis of PCa.
Targeted DNA sequencing data can be used to develop methods of revealing resistance to commonly used anti-androgen therapy at the early stages of PCa progression.
Both datasets include samples obtained from patients with BPH, which can be used as a control group to ensure high specificity of the identified markers for PCa (Nikitina et al., 2016) [1].
Access to the raw sequencing data allows researchers to perform further bioinformatic analysis based on their own computational algorithms.

Sample collection, RNA and DNA extraction
Urine and plasma samples were taken from 14 patients with PCa and 3 patients with BPH at Moscow City Clinical Hospital No. 50. Urine samples were obtained after prostate massage. Each urine and plasma sample was collected either before or after the operation, which was radical prostatectomy or transurethral resection of the prostate (TURP), depending on the patient's diagnosis. RNA and DNA were extracted from urine using the RNA/DNA Purification Kit (Norgen Biotek) and from plasma using the QIAamp Circulating Nucleic Acid Kit (Qiagen). Overall, 19 RNA samples (12 from urine and 7 from plasma) and 41 DNA samples (14 from urine and 27 from plasma) were obtained. Table 1 provides information about the patients and the samples obtained from them. Each sample name corresponds to a single library and to a single FASTQ record in the NCBI SRA database.
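The sample counts reported above can be cross-checked with a short tally. The following is a minimal Python sketch: the dictionary layout and variable names are illustrative assumptions, and only the numbers are taken from the text.

```python
# Sample counts as stated in the text; the data structure itself is illustrative.
samples = {
    "RNA": {"urine": 12, "plasma": 7},
    "DNA": {"urine": 14, "plasma": 27},
}

# Total number of samples per analyte, summed over both biofluids.
totals = {analyte: sum(by_fluid.values()) for analyte, by_fluid in samples.items()}

for analyte, total in totals.items():
    print(f"{analyte}: {total} samples")

# Each sample corresponds to one library, so this is also the library count.
total_libraries = sum(totals.values())
print(f"Total libraries: {total_libraries}")
```

The per-analyte totals agree with the 19 RNA and 41 DNA samples reported, which gives 60 libraries across both datasets.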

Transcriptome library preparation
Contaminating DNA was removed from the extracted total RNA by DNase I (Fermentas) treatment following the manufacturer's recommendations. RNA concentration was determined on a Qubit 2.0 fluorometer using the Qubit RNA HS Assay Kit (Thermo Fisher). Transcriptomic libraries were constructed using the Ion Total RNA-Seq Kit v2 (Life Technologies) with the following modifications of the protocol. For RNA fragmentation, 1 µl of 10x RNase III buffer (Life Technologies) was added to 9 µl of RNA solution and the mixture was heated for 10 min at 95°C, followed by immediate snap-cooling on ice. After that, 1 µl of 10 mM ATP and 1 µl of polynucleotide kinase (Fermentas) were added to the solution from the previous step and the whole mix was incubated at 37°C for 30 min. Fragmented RNA was cleaned up using Micro Bio-Spin Chromatography Columns (Bio-Rad). Further steps of library preparation, including adapter ligation, first-strand cDNA synthesis, and amplification, were carried out in accordance with the manufacturer's instructions. Fragments of cDNA corresponding to rRNA were depleted using duplex-specific nuclease (DSN; Evrogen) as described in [2] with the following modifications: only one round of DSN normalisation was performed, and the re-annealing of ds-cDNA at 68°C was carried out overnight. The prepared library was purified with Agencourt AMPure XP magnetic beads (Beckman Coulter Inc.) and its quality was assessed on a 2100 Bioanalyzer (Agilent Genomics) using the Agilent High Sensitivity DNA Kit (Agilent Genomics).
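The dilutions implied by the fragmentation and kinase steps above can be worked out explicitly. The sketch below is only an illustration under stated assumptions: volumes are treated as additive, they are expressed as relative parts (so the unit cancels out), and any ATP or buffer carried in the enzyme storage solution is ignored. The function name is hypothetical.

```python
def final_concentration(stock, stock_volume, total_volume):
    """Concentration after diluting stock_volume of stock into total_volume."""
    return stock * stock_volume / total_volume

# Fragmentation mix: 1 part 10x RNase III buffer + 9 parts RNA solution.
buffer_volume, rna_volume = 1.0, 9.0
fragmentation_volume = buffer_volume + rna_volume
buffer_strength = final_concentration(10.0, buffer_volume, fragmentation_volume)

# Kinase step: 1 part 10 mM ATP and 1 part polynucleotide kinase are added
# to the fragmentation mix (assumption: volumes simply add up).
atp_volume, pnk_volume = 1.0, 1.0
kinase_volume = fragmentation_volume + atp_volume + pnk_volume
atp_mM = final_concentration(10.0, atp_volume, kinase_volume)

print(f"Buffer during fragmentation: {buffer_strength:.1f}x")
print(f"ATP during kinase step: {atp_mM:.2f} mM in {kinase_volume:.0f} volume parts")
```

Under these assumptions the RNA is fragmented in 1x buffer, and the ATP is diluted roughly twelvefold in the kinase reaction.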

Targeted DNA library preparation
The GeneRead DNAseq Targeted Human Prostate Cancer Panel (Qiagen) was used for targeted enrichment of the extracted DNA. Considering the quality of circulating cell-free DNA, the number of cycles in this amplification step was raised to 20 for DNA extracted from urine and to 22 for DNA extracted from plasma. Subsequent library construction was performed using the GeneRead Library Prep workflow (Qiagen) following the manufacturer's recommendations.

High-throughput sequencing
The constructed libraries were sequenced on the Ion Proton platform using the Ion PI Hi-Q Sequencing 200 Kit and the Ion PI Chip Kit v2 (Thermo Fisher Scientific) following the manufacturer's recommendations. Base calling was performed with Torrent Suite 5.0, fastqCreator v3.4.56313.
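As a starting point for working with the raw reads, the following minimal Python sketch counts reads and computes the mean read length of a FASTQ stream. It assumes the common four-lines-per-record FASTQ layout; the function names and the demo reads are hypothetical and are not taken from the deposited data.

```python
import gzip
import io

def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")

def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of stream
        if not header.strip():
            continue  # tolerate blank lines between records or at the end
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        yield header[1:].strip(), sequence, quality

def summarize(handle):
    """Return (number of reads, mean read length) for a FASTQ stream."""
    n_reads = 0
    total_bases = 0
    for _name, sequence, _quality in read_fastq(handle):
        n_reads += 1
        total_bases += len(sequence)
    mean_length = total_bases / n_reads if n_reads else 0.0
    return n_reads, mean_length

# Demo on two made-up reads; real data would be opened with open_fastq(path).
demo = io.StringIO("@read1\nACGTAC\n+\nIIIIII\n@read2\nACGT\n+\nIIII\n")
n_reads, mean_length = summarize(demo)
print(f"{n_reads} reads, mean length {mean_length:.1f}")
```

Reading records as a stream keeps memory use flat, which matters for files of the size typically produced by a sequencing run.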