Quantitative analysis by next generation sequencing of hematopoietic stem and progenitor cells (LSK) and of splenic B cells transcriptomes from wild-type and Usp3-knockout mice

The data described here provide genome-wide expression profiles of murine primitive hematopoietic stem and progenitor cells (LSK) and of B cell populations, obtained by high-throughput sequencing. Cells are derived from wild-type mice and from mice deficient for the ubiquitin-specific protease 3 (USP3; Usp3Δ/Δ). Modification of histone proteins by ubiquitin plays a crucial role in the cellular response to DNA damage (DDR) (Jackson and Durocher, 2013) [1]. USP3 is a histone H2A deubiquitinating enzyme (DUB) that regulates ubiquitin-dependent DDR in response to DNA double-strand breaks (Nicassio et al., 2007; Doil et al., 2008) [2,3]. Deletion of USP3 in mice increases the incidence of spontaneous tumors and affects hematopoiesis [4]. In particular, Usp3-knockout mice show progressive loss of B and T cells and decreased functional potential of hematopoietic stem cells (HSCs) during aging. USP3-deficient cells, including HSCs, display enhanced histone ubiquitination, accumulate spontaneous DNA damage and are hypersensitive to ionizing radiation (Lancini et al., 2014) [4]. To address whether USP3 loss leads to deregulation of specific molecular pathways relevant to HSC homeostasis and/or B cell development, we used RNA sequencing to investigate transcriptional differences between wild-type and Usp3Δ/Δ LSK cells, naïve B cells and in vitro activated B cells. The data relate to the research article "Tight regulation of ubiquitin-mediated DNA damage response by USP3 preserves the functional integrity of hematopoietic stem cells" (Lancini et al., 2014) [4]. The RNA-sequencing and analysis data sets have been deposited in NCBI's Gene Expression Omnibus (Edgar et al., 2002) [5] and are accessible through GEO Series accession number GSE58495 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE58495). With this article, we present validation of the RNA-seq data set through quantitative real-time PCR and comparative analysis.


Genome-wide expression profiling of mouse hematopoietic stem and progenitor cells (LSK cells), naïve splenic B cells and in vitro lipopolysaccharide (LPS)-stimulated B cells (activated B cells). qRT-PCR on LSK cells by SYBR Green assays.

Data source location
Amsterdam, The Netherlands

Data accessibility
The raw data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus [5] and are accessible through GEO Series accession no. GSE58495. Validation of the RNA-seq data sets is available with this article.

Value of the data
One of the largest RNA-seq datasets of gene expression profiles available to date for hematopoietic stem and progenitor cells (LSK) and B cells from mice deficient for a deubiquitinating enzyme, and the first dataset for Usp3 deletion.
The data can be used to link transcriptional expression profiles to functional alterations in ubiquitin-regulated pathways and DNA damage response pathways.
The data can be compared to available transcriptional data for further insights into regulatory networks of hematopoiesis.
The data can be used in the development of further experiments aimed at addressing how the ubiquitin-dependent DNA damage response pathway impacts hematopoietic stem cell biology.

Data
mRNA profiles of murine populations of lineage-negative, cKit+, Sca1+ (LSK) hematopoietic progenitors, and of naïve or activated B cell populations, from 8-week-old wild-type (WT) and Usp3-deleted (Usp3Δ/Δ) mice were generated by deep sequencing on an Illumina HiSeq 2000. Here we present validation of the datasets by comparative analysis and qRT-PCR (Fig. 1). qRT-PCR validation of the RNA-seq on LSK cells was performed using SYBR Green assays.

Experimental design
The cells used as a source for RNA-seq were: the lineage-negative, cKit+, Sca1+ (LSK) hematopoietic stem and progenitor cell population, purified by fluorescence-activated cell sorting (FACS) from freshly isolated bone marrow; FACS-sorted naïve B cells from spleens (CD19+); and activated B cells, harvested and FACS sorted after 4 days of stimulation with lipopolysaccharide (LPS) in culture. Two biological replicates were analyzed. For each experiment, n = 4 WT and n = 4 Usp3Δ/Δ mice were used. FACS-sorted cells from individual animals were pooled and subjected to deep sequencing. We assigned about 8-16 million reads per sample uniquely to a gene of the mouse reference genome (mm9). We identified 23,429 genes in the LSKs, naïve B cells and activated B cells of WT and Usp3Δ/Δ mice using TopHat in combination with HTSeq-count. The raw data files that were used in the validation/analysis presented here and in the analysis and interpretation in [4] have been deposited in NCBI's Gene Expression Omnibus [5] database with the GEO Series accession no. GSE58495 (Fig. 1).

B cell isolation
B cells were extracted from 8-week-old WT and Usp3Δ/Δ mice. Naïve splenic B cells were obtained by CD43 depletion using biotinylated anti-CD43 (Clone S7, BD Biosciences) and the IMag system (BD Biosciences), as described by the manufacturer. Naïve B cells were either directly FACS sorted with a fluorochrome-conjugated antibody specific for CD19 (APC), or cultured in vitro for four days in IMDM + 8% FBS and 50 μg/ml Escherichia coli LPS (055:B5, Sigma) to obtain activated B cells, followed by sorting. Two independent experiments were performed. Cell sorting was performed on a FACSAria (BD Biosciences).

RNA-seq gene expression analysis
For gene expression analysis, LSKs (FACS sorted from freshly isolated BM), naïve splenic B cells (freshly FACS sorted) or FACS-sorted LPS-activated B cells were used, from n = 4 Usp3Δ/Δ and n = 4 WT littermates (8 weeks old). Cells from individual animals were pooled and total RNA was extracted. Samples were prepared using TruSeq protocols and standard Illumina sample preparation protocols, and RNA-seq was performed on an Illumina HiSeq 2000 machine at the NKI Genomics Core Facility. The sequence reads that passed quality filters were mapped to mm9 with TopHat version 2.0.3 and gene expression was calculated using HTSeq-count. Expression levels were normalized to 10 million reads per sample (GSE58495_norm_gene_exp_10mil.txt.gz file). Differential expression analysis was performed using the R package DEGseq (GSE58495_diffexp_LSK.txt.gz). For the differential expression analysis, 1 was added to all values. Genes that had no expression in both samples were removed.

[Figure caption fragment: ... (Supplementary Table S1). Pearson coefficient r = 0.9443; R² coefficient = 0.891. Single qRT-PCR results for a subset of HSC-specific genes, Mpl2, Eng, Tek and Fdzl3 [8].]
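The count preprocessing described in this section (scaling to 10 million reads per sample, adding 1 to all values, dropping genes with no expression in both samples) can be illustrated with a minimal Python sketch. This is not the authors' pipeline and it does not reproduce DEGseq. The function names, the toy gene labels and the choice to filter unexpressed genes before adding the pseudocount are assumptions made for illustration only.

```python
import math


def normalize_to_depth(counts, target=10_000_000):
    """Scale raw per-gene read counts so that the sample total equals `target`."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no assigned reads")
    return {gene: count * target / total for gene, count in counts.items()}


def prepare_pair(wt_counts, ko_counts, pseudocount=1.0):
    """Normalize two samples, drop genes unexpressed in both, add a pseudocount.

    Returns {gene: (wt_value, ko_value)}. Filtering before adding the
    pseudocount is an assumption; the text does not state the order.
    """
    wt = normalize_to_depth(wt_counts)
    ko = normalize_to_depth(ko_counts)
    prepared = {}
    for gene in wt.keys() & ko.keys():
        if wt[gene] == 0 and ko[gene] == 0:
            continue  # no expression in both samples: removed
        prepared[gene] = (wt[gene] + pseudocount, ko[gene] + pseudocount)
    return prepared


def log2_fold_change(prepared):
    """log2(knockout / wild-type); the pseudocount keeps the ratio finite."""
    return {gene: math.log2(ko / wt) for gene, (wt, ko) in prepared.items()}


# Toy counts with hypothetical gene labels, not values from GSE58495.
wt_raw = {"GeneA": 50, "GeneB": 0, "GeneC": 50}
ko_raw = {"GeneA": 100, "GeneB": 0, "GeneC": 0}
pair = prepare_pair(wt_raw, ko_raw)
lfc = log2_fold_change(pair)
```

Adding 1 before taking ratios avoids division by zero for genes detected in only one of the two samples, which is the usual motivation for such a pseudocount.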

Quantitative real time-(qRT-)PCR
Total RNA was extracted using Trizol reagent (Life Technologies) and cDNA was prepared using Superscript II RT and oligo(dT)n primers (Life Technologies). qRT-PCR was performed on a StepOnePlus Real-Time PCR system (Applied Biosystems) using SYBR Green PCR master mix (Applied Biosystems). The amount of target, normalized to an endogenous reference (TBP or beta-actin), was calculated as 2^(-ΔΔCT).
Primer sequences used in validation of the RNA-seq analysis (Supplementary Table S1) are available upon request.
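As an illustration of the 2^(-ΔΔCT) calculation, the following Python sketch computes the relative amount of a target transcript in a sample versus a control, each normalized to an endogenous reference gene. The function name, argument names and CT values are hypothetical, and the sketch assumes roughly 100% amplification efficiency for both target and reference primers.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative target amount by the 2^(-ddCT) method.

    dCT  = CT(target) - CT(reference), computed per condition
    ddCT = dCT(sample) - dCT(control)
    Assumes a doubling of product per cycle for all primer pairs.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Made-up CT values: the target crosses threshold one cycle earlier in the
# sample than in the control, with an unchanged reference gene.
fold = relative_expression(24.0, 18.0, 25.0, 18.0)
unchanged = relative_expression(25.0, 18.0, 25.0, 18.0)
```

A value above 1 indicates higher relative expression in the sample than in the control, and a value below 1 indicates lower expression.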

Statistics
Statistical analysis was performed by Student's t test or Pearson correlation analysis in Prism 6.
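For readers who want to repeat the correlation step outside Prism, a minimal standard-library Python sketch of the Pearson coefficient is given below. It is an illustration only: the function name and the input vectors are hypothetical, and it is not the software used for the analysis reported here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired fold changes (e.g. RNA-seq vs qRT-PCR), not real data.
rnaseq_fc = [1.0, 2.0, 3.0, 4.0]
qpcr_fc = [1.1, 1.9, 3.2, 3.8]
r = pearson_r(rnaseq_fc, qpcr_fc)
r_squared = r ** 2  # equals R² of a simple linear regression
```

For a simple linear fit, R² is the square of r, so the reported r = 0.9443 corresponds to an R² of about 0.89.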