RNA-seq data of invasive ductal carcinoma and adjacent normal tissues from a Korean patient with breast cancer

Invasive ductal carcinoma is the most common type of breast cancer. Here, we provide a whole transcriptome shotgun sequencing (RNA-seq) dataset generated from ten samples of invasive ductal carcinoma tissue and three samples of adjacent normal tissue from a single Korean breast cancer patient (luminal B subtype). Differentially expressed genes (DEGs) were identified with a false discovery rate (FDR)-adjusted p-value cutoff of 0.05. Gene ontology analysis identified several key pathways, including lymphocyte activation. A list of differentially expressed genes is provided. The raw data were uploaded to the Sequence Read Archive (SRA) database under BioProject ID PRJNA432903.



Value of the data
This RNA-seq dataset provides deep sequencing of ten samples of invasive ductal carcinoma tissue and three samples of adjacent normal tissue from a Korean breast cancer patient (luminal B subtype). The heterogeneous expression data from spatially distinct tumor samples can be used for various evaluation purposes.
Gene ontology analysis revealed that lymphocyte activation and the PPAR signaling pathway are significantly up- and down-regulated pathways, respectively, in breast cancer tissue compared to adjacent normal tissue.

Data
Total RNA was extracted from ten samples of cancer tissue (invasive ductal carcinoma; luminal B subtype) and three samples of adjacent normal tissue from a Korean patient with breast cancer. RNA-seq was performed to profile the transcriptomes of the breast cancer and normal samples. Differentially expressed genes were identified with an FDR-adjusted p-value cutoff of 0.05. Gene ontology analysis indicated that several pathways are associated with the onset or progression of breast cancer.
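As an illustration of the FDR adjustment referred to above, the following minimal Python sketch adjusts a list of raw p-values. It assumes the Benjamini-Hochberg procedure is the correction in use (the text states only that p-values were FDR-adjusted), and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: Benjamini-Hochberg is the FDR procedure; the source text
    says only "FDR-adjusted".
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A gene would then be called differentially expressed when its adjusted value falls below the 0.05 cutoff.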

RNA-seq
One tissue sample of invasive ductal carcinoma (luminal B subtype) from breast tissue and a corresponding adjacent normal tissue sample were biopsied from a Korean woman with informed consent. This study was approved by the institutional review board of Catholic Medical Center (approval no. UC17TISI0015). The tumor and adjacent normal tissues were divided into ten and three samples, respectively. Poly(A) RNA was purified from 1 µg of total RNA from each sample, and cDNA was synthesized using SuperScript II (Invitrogen). Sequencing libraries were prepared using the TruSeq RNA Library preparation kit (Illumina) and sequenced on a HiSeq 2500 (Illumina).

Identification of differentially expressed genes
Differentially expressed genes (DEGs) between cancer and normal samples were identified using the Cuffdiff function of Cufflinks (version 2.2.1) [5]. DEGs were defined as genes with FDR-adjusted p-values < 0.05. A total of 2456 up-regulated and 2601 down-regulated genes were identified in cancer samples compared to adjacent normal samples (Supplementary Table 2). When the low-quality RNA-seq sample (C3) was excluded from the DEG analysis, a total of 3199 up-regulated and 3422 down-regulated genes were identified as DEGs (Fig. 1 and Supplementary Table 3).
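The counting step described above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used to produce the supplementary tables. It assumes gene-level Cuffdiff output in the tab-delimited `gene_exp.diff` layout (columns `status`, `log2(fold_change)`, and `q_value`), and it assumes the normal condition is `sample_1` and the cancer condition is `sample_2`, so that a positive log2 fold change means up-regulation in cancer. The function name `count_degs` is hypothetical.

```python
import csv
import io


def count_degs(diff_text, q_cutoff=0.05):
    """Count up- and down-regulated genes in gene_exp.diff-style text.

    Assumptions: sample_1 is normal and sample_2 is cancer, so a positive
    log2(fold_change) is read as up-regulated in cancer.
    """
    up = down = 0
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    for row in reader:
        # Skip genes that were not successfully tested.
        if row["status"] != "OK":
            continue
        # Keep only genes below the FDR-adjusted p-value cutoff.
        if float(row["q_value"]) >= q_cutoff:
            continue
        log2fc = float(row["log2(fold_change)"])  # "inf" and "-inf" parse too
        if log2fc > 0:
            up += 1
        elif log2fc < 0:
            down += 1
    return up, down
```

Given the contents of a `gene_exp.diff` file read into a string, `count_degs(text)` returns a pair of counts (up, down). Rerunning the comparison without the C3 sample would require a separate Cuffdiff run, which this sketch does not cover.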

Gene ontology analysis
Gene ontology (GO) analysis was performed to identify key pathways associated with the DEGs that were identified without the C3 sample. The top 100 up-regulated (or down-regulated) DEGs that were highly expressed (average FPKM > 10) were analyzed using Metascape (http://metascape.org) [6]. The GO analysis revealed that the majority of up-regulated genes were significantly associated with lymphocyte activation and that some down-regulated genes were involved in the PPAR signaling pathway (Fig. 2).
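The gene selection that precedes the Metascape submission can be sketched in Python as follows. Two points are assumptions, because the text does not specify them: genes are ranked by the magnitude of their log2 fold change, and "average FPKM" is taken as the mean of the normal and cancer FPKM values. The function name `top_degs` and the dictionary keys are hypothetical.

```python
def top_degs(genes, n=100, min_mean_fpkm=10.0, q_cutoff=0.05):
    """Pick the top-n up- and down-regulated, highly expressed DEGs.

    `genes` is a list of dicts with keys: gene, fpkm_normal, fpkm_cancer,
    log2fc, q_value. Ranking by log2 fold change is an assumption; the
    source does not state the ranking criterion.
    """
    # Keep significant genes whose mean FPKM exceeds the expression cutoff.
    kept = [
        g for g in genes
        if g["q_value"] < q_cutoff
        and (g["fpkm_normal"] + g["fpkm_cancer"]) / 2 > min_mean_fpkm
    ]
    # Largest positive fold changes first.
    up = sorted((g for g in kept if g["log2fc"] > 0),
                key=lambda g: g["log2fc"], reverse=True)[:n]
    # Most negative fold changes first.
    down = sorted((g for g in kept if g["log2fc"] < 0),
                  key=lambda g: g["log2fc"])[:n]
    return [g["gene"] for g in up], [g["gene"] for g in down]
```

The two returned gene lists could then be pasted into Metascape as separate queries, one for up-regulated and one for down-regulated genes.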