Error-free and error-prone DNA repair gene expression data through reprogramming and passage in human iPS cells

We recently found that reprogramming alters the expression of DNA repair-related genes: expression increases for genes in pathways that accurately convey genomic information, such as homologous recombination (HR) and mismatch repair (MMR), and decreases for error-prone translesion synthesis (TLS) polymerases. Here, we confirmed this change in expression in another cell line and found that the alteration was maintained through repeated passaging, as was the expression of OCT3/4 and NANOG. Our findings suggest that the reprogramming-associated changes in the expression of DNA repair-related genes, and their maintenance, can serve as novel indicators for quality control of pluripotent cells.


Data description
The mean RNA expression values of fibroblasts and human iPS cells (hiPSCs; p31, p32) were calculated for DNA repair- and replication-related genes, as in our previous analysis [1]. Compared with the progenitor cells, all the hiPSC lines showed a stable, approximately three-fold elevation of expression through reprogramming for RAD51 and BLM in HR, MSH2 and MSH6 in MMR, and PARP1 and PARP2 in base excision repair (BER), all of which belong to error-free repair. RAD50, NBN and MRE11 are involved in both HR and non-homologous end-joining (NHEJ). MRE11 showed a slight elevation of expression, but not an increase comparable to that of RAD51 or BLM, similar to our previous findings. RAD50 and NBN showed a minimal decrease in expression, consistent with the previous data [1] (Table 1).
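The group-mean comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the FPKM numbers are invented for the example, and the helper name `fold_change` is hypothetical.

```python
from statistics import mean

# Hypothetical FPKM values, invented purely for illustration
# (they are NOT taken from the dataset described in this article).
fpkm = {
    "RAD51": {"fibroblast": [4.0, 6.0], "hiPSC": [14.0, 16.0]},
    "XRCC4": {"fibroblast": [10.0, 10.0], "hiPSC": [4.0, 6.0]},
}


def fold_change(gene_values, reference="fibroblast", target="hiPSC"):
    """Return mean(target) / mean(reference) for one gene."""
    ref_mean = mean(gene_values[reference])
    if ref_mean == 0:
        raise ValueError("reference mean is zero; fold change is undefined")
    return mean(gene_values[target]) / ref_mean


fold_changes = {gene: fold_change(values) for gene, values in fpkm.items()}

for gene, fc in fold_changes.items():
    direction = "up" if fc > 1 else "down" if fc < 1 else "unchanged"
    print(f"{gene}: fold change {fc:.2f} ({direction})")
```

With these made-up inputs, a gene whose mean rises from 5 to 15 FPKM is reported as three-fold elevated, which is the kind of change described for the HR, MMR and BER genes.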
Expression was slightly upregulated for XRCC5 and XRCC6 but downregulated for XRCC4; all three genes are involved in NHEJ, an error-prone repair pathway. In addition, REV3L and POLH, representative polymerases thought to perform error-prone post-replicative repair, showed reduced expression. All these alterations in expression matched those previously shown by microarray for a series of DNA repair-related genes, using completely separate fibroblast and third molar cells [1].
Principal component analysis (PCA) showed that the progenitor fibroblasts and the hiPSCs were clearly separated by PC1 and PC2, and that within the hiPSCs the two passage groups, (p31, p32) and (p50, p51, p53), were also separated (Fig. 1). Gene expression also differed between these two groups. We performed gene ontology (GO) analysis of the 761 genes that showed a fold change of at least 2 in either direction with p-values of <0.05 (p31, p32 vs p50, p51, p53). The top five GO terms were regulation of cell differentiation, positive regulation of developmental process, epithelium development, regulation of multicellular organismal development and epithelial cell differentiation (Table 2).
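The gene selection step that precedes the GO analysis can be sketched as a simple filter. This is only an illustration under stated assumptions: the threshold is read as an absolute fold change of at least 2 (up or down) with p < 0.05, the gene names and values are placeholders, and `is_differential` is a hypothetical helper rather than an iDEP function.

```python
import math

# Hypothetical per-gene results for late vs. early passage hiPSCs.
# Gene names, fold changes and p-values are placeholders for illustration.
results = [
    {"gene": "GENE_A", "fold_change": 4.0, "p_value": 0.01},
    {"gene": "GENE_B", "fold_change": 0.25, "p_value": 0.03},
    {"gene": "GENE_C", "fold_change": 1.5, "p_value": 0.001},
    {"gene": "GENE_D", "fold_change": 8.0, "p_value": 0.20},
]


def is_differential(row, min_fold=2.0, max_p=0.05):
    """Keep genes changed at least min_fold in either direction with p < max_p."""
    # Working on the log2 scale treats up- and downregulation symmetrically:
    # a fold change of 0.5 is as large a change as a fold change of 2.
    log2_fc = math.log2(row["fold_change"])
    return abs(log2_fc) >= math.log2(min_fold) and row["p_value"] < max_p


selected = [row["gene"] for row in results if is_differential(row)]
print(selected)
```

Here GENE_C is dropped because its change is below two-fold, and GENE_D is dropped because its p-value is not below 0.05.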
The mean FPKM values of OCT3/4 and NANOG, used as indices of pluripotency, were calculated for each of the two groups. No difference was found between the groups, indicating that pluripotency was maintained in the (p50, p51, p53) group as in the (p31, p32) group (Table 1).
Specifications Table
Subject: Biology
Specific subject area: NGS, Transcriptomics, Stem cell biology
Type of data:

Value of the Data
This work gives a deeper understanding of the basic characteristics of DNA repair-related genes in pluripotent cells through reprogramming and repeated passage.
The data in this article show that PCA clearly separated the hiPSC passage group (p31, p32) from the hiPSC passage group (p50, p51, p53). The genes that differed between the two groups were associated with GO terms related to cell differentiation.

RNA extraction and library preparation
Total RNA was extracted from cells with an RNeasy Plus Micro Kit (Qiagen). Library preparation was performed using a TruSeq stranded mRNA sample prep kit (Illumina, San Diego, CA) according to the manufacturer's instructions.

RNA sequencing
Whole transcriptome sequencing was performed on the RNA samples using the Illumina HiSeq 2500 and 3000 platforms in 75-base single-end mode. Illumina Casava ver. 1.8.2 software was used for base calling. The sequenced reads were mapped to the human reference genome sequence (hg19) using TopHat ver. 2.0.13 in combination with Bowtie2 ver. 2.2.3 and SAMtools ver. 0.1.19. The number of fragments per kilobase of exon per million mapped fragments (FPKM) was calculated using Cufflinks ver. 2.2.1. The FPKM values were calculated from the respective sequence data, and the analyses were performed using iDEP85 (http://bioinformatics.sdstate.edu/idep/).
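For readers unfamiliar with the unit, the textbook definition of FPKM can be written as a short function. This sketch shows only the arithmetic behind the name; Cufflinks estimates abundances with a statistical model, so its reported values can differ from this simple ratio. The function name `fpkm` and the input numbers are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments.

    FPKM = fragments / (exon_length_bp / 1e3) / (total_mapped_fragments / 1e6)
         = fragments * 1e9 / (exon_length_bp * total_mapped_fragments)
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and total mapped fragments must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a gene with 2,000 bp of exon,
# in a library with 20 million mapped fragments.
value = fpkm(fragments=500, exon_length_bp=2000, total_mapped_fragments=20_000_000)
print(f"FPKM = {value:.2f}")
```

Normalizing by exon length and by library size is what makes values comparable across genes of different lengths and across samples sequenced to different depths.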