Transcriptomic data of BT549 triple negative breast cancer cells treated with 20 µM NU7441, a DNA-dependent kinase inhibitor

The DNA-dependent protein kinase catalytic subunit (DNA-PKcs) is a multifunctional serine-threonine protein kinase that plays a role in the non-homologous end joining pathway of DNA repair. NU7441 is a specific DNA-PKcs inhibitor. We investigated the effects of NU7441 on the transcriptome of BT549 triple negative breast cancer cells. Total RNA extracted from NU7441-treated or control BT549 cells was used to prepare sequencing libraries. Read quality was assessed using the fastqc tool. Low-quality reads were trimmed and filtered using fastp. Reads were aligned with hisat2, and SAM files were converted to BAM files using Samtools. Differential gene expression analysis, Gene Ontology (GO) analysis and KEGG pathway analysis were then performed. After NU7441 treatment, a total of 2045 differential genes were selected according to |log2(FoldChange)| >= 1 & padj <= 0.05, of which 1365 were down-regulated and 680 were up-regulated. The differentially expressed genes in pattern recognition receptor (PRR) immune response signaling, including the NOD-like receptor signaling, Toll-like receptor signaling, RIG-I-like receptor signaling and cytosolic DNA-sensing pathways, are reported in this paper.


Value of the Data
• These data support a range of downstream analyses, including annotation, differential expression and pathway investigation, comparing BT549 breast cancer cells treated with NU7441 with untreated controls. They are valuable for understanding the effects of DNA-PK inhibition on global gene expression and pathways.
• This dataset reports the differentially expressed genes in pattern recognition receptor (PRR) immune response signaling, including the NOD-like receptor signaling, Toll-like receptor signaling, RIG-I-like receptor signaling and cytosolic DNA-sensing pathways, in BT549 cells treated with NU7441 compared with controls. The relationship between the DNA damage response and PRR immune signaling has not been well studied. As DNA-PK is a vital sensor in the repair of DNA double-strand breaks, this dataset allows in-depth exploration of the relationship between the DNA damage response and cytosolic nucleic acid-sensing immune signaling.
• Because DNA-PK is a vital factor in non-homologous end joining (NHEJ) repair, this dataset will be valuable for investigating NHEJ repair and the generation of cytosolic DNA, as well as the relationship between DNA damage repair and the innate immunity stimulated by cytosolic DNA.
• These transcriptome sequences will serve as an essential reference and a valuable resource for investigating DNA-PK inhibition in innate immunity, the tumor immune microenvironment and breast cancer immunotherapy.

Background
In the previously published original research article, the DNA-PK inhibitor NU7441 promoted inflammation in the breast cancer microenvironment. The primary objective of this study was to analyse the effects of NU7441 on global gene expression in breast cancer cells and to identify the cellular functions and pathways that it regulates. To achieve this goal, differentially expressed genes were selected and KEGG and GO analyses were performed. The genes in pattern recognition receptor (PRR) immune response signaling, including the NOD-like receptor signaling, Toll-like receptor signaling, RIG-I-like receptor signaling and cytosolic DNA-sensing pathways, were noted. This study aims to identify differentially expressed genes and to elucidate the distinct immune response signals and genes triggered by NU7441.

Data Description
This dataset consists of the differential gene expression in BT549 cells treated with NU7441 compared with control cells (Fig. 1A). Differential genes were selected according to |log2(FoldChange)| >= 1 & padj <= 0.05 (Fig. 1B, C). Gene ontology analyses based on molecular function, biological process and cellular component were performed (Tables 1-3). The differentially expressed genes in the NOD-like receptor signaling pathway are listed in Table 4, those in the Toll-like receptor signaling pathway in Table 5, those in the RIG-I-like receptor signaling pathway in Table 6, and those in the cytosolic DNA-sensing pathway in Table 7. The raw RNA-seq data were deposited in, and are available from, the NCBI database (accession number PRJNA872854). The RNA-seq results were checked by RT-qPCR and western blot, which were replicated and reported in the related publication in Genes and Diseases, Volume 10, Issue 5, September 2023, Pages 1809-1811. The expression of cGAS, STING, RIG-I, MAVS, NF-κB, interferon and interferon-stimulated genes was tested in BT549 cells, MDA-MB-231 cells and CH12F3 cells, and all results were consistent with the RNA-seq data. Details of the confirmation and of the replications in the different cell lines can be found in the same publication.

Cell culture
The human breast cancer cell line BT549 was obtained from the American Type Culture Collection (ATCC). BT549 cells were cultured in RPMI 1640 medium containing 10% FBS and 1% penicillin and streptomycin, at 37 °C in 5% CO2.

NU7441 Treatment
BT549 cells were cultured on cell culture plates. When the cells reached a density of 80%, the DNA-PKcs inhibitor NU7441 was added at a concentration of 20 μM (below the IC50 value) for 48 h. Control cells were set up at the same time.

RNA Extraction and Processing
BT549 cells were collected and total RNA was extracted using Trizol reagent [2]. RNA quality was determined using an Agilent 5400, the RNA was quantified using a NanoDrop, and the RNA samples were used to construct sequencing libraries. The first strand of cDNA was synthesised in an M-MuLV reverse transcriptase system using fragmented mRNA as the template and random oligonucleotides as primers. The RNA strand was then degraded by RNase H, and the second strand of cDNA was synthesised from dNTPs by DNA polymerase I. The purified double-stranded cDNA was assessed using the Agilent 5400 and quantified by NanoDrop. It was then end-repaired, A-tailed and ligated to sequencing adapters. cDNA of 370-420 bp was selected with AMPure XP beads and amplified by PCR, and the PCR products were purified again with AMPure XP beads to obtain the final library. After construction, the libraries were initially quantified using a Qubit 2.0 Fluorometer and diluted to 1.5 ng/µL. The insert size of the libraries was then checked using an Agilent 2100 Bioanalyzer. Once the insert size met expectations, qRT-PCR was used to accurately quantify the effective concentration of the libraries (which was higher than 2 nM) to ensure library quality. After passing inspection, the libraries were pooled according to their effective concentrations and the target downstream data volume and sequenced on an Illumina NovaSeq 6000, generating 150 bp paired-end reads. The image data measured by the high-throughput sequencer were converted into sequence data (reads) by CASAVA base calling.
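As a rough cross-check of the dilution step, a mass concentration of double-stranded DNA can be converted to molarity using the common approximation of 660 g/mol per base pair. The sketch below is illustrative only: the function name and the 395 bp mean fragment length (the midpoint of the 370-420 bp selection window, ignoring adapter length) are assumptions, not values reported by the authors. The qPCR-based effective concentration counts only amplifiable molecules, so it need not match this mass-based estimate.

```python
def library_molarity_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to an approximate molarity (nM).

    Uses the usual approximation of 660 g/mol per base pair of dsDNA.
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment length > 0")
    grams_per_mol = 660.0 * mean_fragment_bp
    # 1 ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L,
    # and multiplying by 1e9 gives nM, hence the overall factor of 1e6.
    return conc_ng_per_ul * 1e6 / grams_per_mol


if __name__ == "__main__":
    # 395 bp is an assumed mean fragment length (hypothetical value).
    estimate = library_molarity_nM(1.5, 395)
    print(f"Approximate molarity: {estimate:.2f} nM")
```

Under these assumptions, a 1.5 ng/µL library corresponds to roughly 5.75 nM; a longer mean fragment length (for example, once adapters are included) would lower the estimate.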

Gene expression data analysis
The raw reads were in FASTQ format. Read quality was assessed using the fastqc tool. Adapters and low-quality reads were filtered out of the FASTQ files using the fastp tool. The fastqc tool was used to re-assess the filtered reads prior to mapping. The FASTQ files obtained after quality trimming and assessment were used for mapping [3].
The Ensembl Homo sapiens GRCh38 genome was used as the reference genome for mapping the clipped reads (https://asia.ensembl.org/Homo_sapiens/Info/Index). Prior to mapping, the reference genome was indexed using the HISAT2 indexing scheme. Subsequently, clean reads were mapped against the index using the HISAT2 tool [3]. The mapped output files (SAM files) were converted into binary files (BAM files) using Samtools [4].
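The quality control, trimming, indexing, mapping and conversion steps described above can be sketched as a sequence of command lines. The following is a minimal illustration, not the authors' actual invocation: the sample name, file names and directory layout are hypothetical, and only the basic input/output options of each tool are used because the parameters applied in this study are not stated. The function only builds the commands; it does not run them.

```python
import shlex
from typing import List


def build_pipeline_commands(sample: str, r1: str, r2: str,
                            genome_fasta: str, index_prefix: str,
                            out_dir: str) -> List[List[str]]:
    """Return the command lines for QC, trimming, indexing, mapping and SAM-to-BAM."""
    # Hypothetical output file names derived from the sample name.
    clean_r1 = f"{out_dir}/{sample}_R1.clean.fastq.gz"
    clean_r2 = f"{out_dir}/{sample}_R2.clean.fastq.gz"
    sam = f"{out_dir}/{sample}.sam"
    bam = f"{out_dir}/{sample}.bam"
    return [
        # 1. Assess the quality of the raw paired-end reads.
        ["fastqc", r1, r2, "-o", out_dir],
        # 2. Trim adapters and filter low-quality reads.
        ["fastp", "-i", r1, "-I", r2, "-o", clean_r1, "-O", clean_r2],
        # 3. Re-assess the filtered reads before mapping.
        ["fastqc", clean_r1, clean_r2, "-o", out_dir],
        # 4. Index the reference genome (needed once per genome).
        ["hisat2-build", genome_fasta, index_prefix],
        # 5. Map the clean reads against the index, writing SAM.
        ["hisat2", "-x", index_prefix, "-1", clean_r1, "-2", clean_r2, "-S", sam],
        # 6. Convert SAM to BAM.
        ["samtools", "view", "-b", "-o", bam, sam],
    ]


if __name__ == "__main__":
    # All paths below are placeholders for illustration.
    for cmd in build_pipeline_commands("control", "control_1.fq.gz", "control_2.fq.gz",
                                       "GRCh38.fa", "index/grch38", "results"):
        print(shlex.join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed; keeping command construction separate from execution makes the steps easy to inspect and log.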
The featureCounts tool was used to quantify mapped reads [5]. Mapped reads were counted at the feature (gene) level using the Homo sapiens GRCh38 annotation file (GTF).
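For downstream analysis, the gene-level count table has to be read into memory. featureCounts writes a tab-delimited file whose first line is a comment beginning with "#", followed by a header with the columns Geneid, Chr, Start, End, Strand and Length and then one count column per BAM file. The sketch below parses that layout with the standard library only; the function name and the example rows are hypothetical and are not taken from this dataset.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple

ANNOTATION_COLUMNS = 6  # Geneid, Chr, Start, End, Strand, Length


def read_featurecounts(handle: TextIO) -> Tuple[List[str], Dict[str, List[int]]]:
    """Parse a featureCounts table into (sample names, {gene id: counts per sample})."""
    # Skip the leading comment line(s) that record the program and command.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t")
    header = next(reader)
    if not header or header[0] != "Geneid":
        raise ValueError("unexpected header: first column should be 'Geneid'")
    samples = header[ANNOTATION_COLUMNS:]
    counts: Dict[str, List[int]] = {}
    for fields in reader:
        if not fields:
            continue  # tolerate blank lines
        counts[fields[0]] = [int(value) for value in fields[ANNOTATION_COLUMNS:]]
    return samples, counts


if __name__ == "__main__":
    # Made-up two-gene table used only to demonstrate the parser.
    example = (
        "# Program:featureCounts (example comment line)\n"
        "Geneid\tChr\tStart\tEnd\tStrand\tLength\tcontrol.bam\tnu7441.bam\n"
        "GENE_A\t1\t100\t900\t+\t801\t120\t45\n"
        "GENE_B\t2\t500\t1500\t-\t1001\t30\t88\n"
    )
    sample_names, table = read_featurecounts(io.StringIO(example))
    print(sample_names, table)
```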
Differentially expressed genes (DEGs) were screened using edgeR, and normalization and base-2 logarithm conversion of the matrix data of each GEO dataset were performed using the limma package in R. Genes with |logFC| > 1, P-value < 0.05 and adjusted P-value < 0.05 were considered statistically significant DEGs [6]. Furthermore, the DEGs were subjected to gene ID conversion, GO functional annotation and enrichment analysis, and KEGG functional annotation and enrichment analysis using clusterProfiler (v4.10.0) in RStudio.
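The selection of up- and down-regulated genes by fold change and adjusted P-value can be illustrated with a short filter. This is a sketch under stated assumptions rather than the authors' code: it applies the inclusive thresholds given in the abstract and the Fig. 1 caption (|log2(FoldChange)| >= 1 and padj <= 0.05), and the type and function names as well as the input rows are hypothetical. Genes with a missing adjusted P-value are skipped.

```python
import math
from typing import Dict, Iterable, List, NamedTuple


class GeneResult(NamedTuple):
    gene: str
    log2_fold_change: float
    padj: float


def classify_degs(results: Iterable[GeneResult],
                  lfc_cutoff: float = 1.0,
                  padj_cutoff: float = 0.05) -> Dict[str, List[str]]:
    """Split genes into up- and down-regulated sets using inclusive cutoffs."""
    up: List[str] = []
    down: List[str] = []
    for res in results:
        # A missing (NaN) adjusted P-value cannot pass the significance filter.
        if math.isnan(res.padj) or math.isnan(res.log2_fold_change):
            continue
        if res.padj > padj_cutoff or abs(res.log2_fold_change) < lfc_cutoff:
            continue
        # The sign of the fold change (treated vs control) gives the direction.
        (up if res.log2_fold_change > 0 else down).append(res.gene)
    return {"up": up, "down": down}


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    demo = [
        GeneResult("GENE_A", 2.3, 0.001),
        GeneResult("GENE_B", -1.0, 0.05),
        GeneResult("GENE_C", 0.4, 0.0001),
    ]
    print(classify_degs(demo))
```

Applied to a full result table, the lengths of the two lists give the numbers of up- and down-regulated genes. Switching to strict inequalities only affects genes that fall exactly on a cutoff.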

Limitations
None.

Fig. 1. (A) Venn plot of gene co-expression in NU7441-treated and control cells; the overlapping area shows the number of genes co-expressed in the two samples. Control represents BT549 cells without drug treatment and inhibition represents BT549 cells after NU7441 treatment. (B) Histogram of genes differentially expressed in the NU7441-treated group compared with the control group. The differential genes were selected according to |log2(FoldChange)| >= 1 & padj <= 0.05. (C) Volcano plot of differential genes in the NU7441-treated group compared with the control group. According to |log2(FoldChange)| >= 1 & padj <= 0.05, 680 genes were up-regulated and 1365 genes were down-regulated in BT549 cells treated with NU7441.

Table 1
Gene ontology analysis based on molecular functions.

Table 2
Gene ontology analysis based on biological process.

Table 3
Gene ontology analysis based on cellular component.

Table 4
List of genes enriched in NOD-like receptor signaling pathway.

Table 5
List of genes enriched in Toll-like receptor signaling pathway.

Table 6
List of genes enriched in RIG-I-like receptor signaling pathway.