Data describing the experimental design and quality control of RNA-Seq of human adipose-derived stem cells undergoing early adipogenesis and osteogenesis

An important tool for studying the regulation of gene expression is the sequencing and analysis of different RNA fractions: total, ribosome-free, monosomal and polysomal. By comparing these populations, it is possible to identify which genes are differentially expressed and to learn how transcriptional and translational regulation modulates cellular function. We therefore used this strategy to analyze the regulation of gene expression in human adipose-derived stem cells during the triggering of adipogenic and osteogenic differentiation. Here, we focus on the differential expression of mRNAs during early adipogenic and osteogenic differentiation and present detailed data on the experimental design, the RNA-Seq quality, the raw data obtained and the RT-qPCR validation. This information is important for confirming the accuracy of the data with a view to its future reuse. Moreover, this study may serve as groundwork for future characterization of transcriptome and translatome regulation in different cell types.



Data
In this report we show data from a large-scale RNA-Seq analysis of total, polysomal, monosomal and ribosome-free RNA fractions isolated from human adipose-derived stem cells (hASCs) after 24 hours of adipogenic or osteogenic induction. Here we focus on the RNA-Seq workflow, the sequencing quality check, the acquisition of raw data and validation by RT-qPCR. The complete experimental workflow is represented in Fig. 1.
After isolation and characterization, the hASCs were treated with control, adipogenic or osteogenic medium for 24 hours. The polysome profiles of hASCs (from the 3 donors) treated with the different induction media, obtained by sucrose density gradient, are represented in Fig. 2. Using this approach, the fractions corresponding to ribosome-free, monosome-associated and polysome-associated RNAs could be separated for subsequent RNA purification. Total RNA was also extracted. The quality and concentration of the isolated RNA were analyzed using the Agilent RNA 6000 Nano Kit and the Agilent 2100 Bioanalyzer instrument to determine their suitability for RNA sequencing. Fig. 3A demonstrates RNA quality based on the presence of 18S and 28S ribosomal RNA in all samples except the ribosome-free RNA samples (samples 1, 5 and 9). This result is also shown in Fig. 3B with examples of electropherograms.
All samples were prepared for sequencing according to the TruSeq Stranded Total RNA manufacturer's manual. For the fragmentation step, the incubation time was adjusted according to the RNA integrity number (RIN) (Fig. 3A). High-quality sequencing data were obtained, as shown in Fig. 3C, which presents one example sample from each RNA fraction. The quality distribution per read position is shown, revealing that most positions were of high quality along the entire read (more than half of the read length had quality values higher than 35, and almost the entire read length had quality values above 29). The same pattern was observed in all samples. Table 1 shows the number of reads obtained in each run (an average of ~19,800,000 reads) and the numbers and percentages of mapped reads (roughly 82% on average). Raw RNA-Seq data as generated by the Illumina HiSeq 2500 can be downloaded from the ArrayExpress repository under the ID E-MTAB-6298. This site serves as a landing page for the data and includes a description of the project, metadata and raw sequencing files (https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6298/ [2018]). The log2 fold change (FC) values were calculated by comparing ADI vs. CT and OST vs. CT in ribosome-free, monosomal, polysomal and total RNA. The differential expression analyses were published as supplementary material in previous articles [1,2].
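As a minimal illustration of the two quantities reported above (percentage of mapped reads and log2 fold change), the arithmetic can be sketched as follows. This is not the pipeline used in this work, where fold changes were estimated with edgeR; the function names, the pseudocount and the example numbers are assumptions introduced only for illustration and are not taken from Table 1.

```python
import math


def mapped_percentage(mapped_reads: int, total_reads: int) -> float:
    """Percentage of sequenced reads that mapped to the genome."""
    return 100.0 * mapped_reads / total_reads


def log2_fold_change(induced: float, control: float, pseudocount: float = 1.0) -> float:
    """log2(induced / control); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((induced + pseudocount) / (control + pseudocount))


# Hypothetical values, not taken from Table 1
print(round(mapped_percentage(16_400_000, 20_000_000), 1))  # 82.0
print(log2_fold_change(7.0, 1.0))   # log2(8 / 2) = 2.0  (up in induced cells)
print(log2_fold_change(1.0, 7.0))   # log2(2 / 8) = -2.0 (down in induced cells)
```

A positive value indicates higher expression in the induced condition (ADI or OST) than in the control (CT), and a negative value indicates the opposite.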
Multidimensional scaling (MDS) plots were generated to assess the reproducibility of our data. For each differentiation (adipogenesis or osteogenesis), one MDS plot was generated in which pairs of RNA fractions were visualized together, e.g., the polysomal and total RNA fractions for adipogenic (Fig. 3D) and osteogenic (Fig. 3E) differentiation. Four homogeneous sample groups were observed: stem cell total RNA (orange), stem cell polysomal fraction (blue), induced total RNA (green) and induced polysomal fraction (red).
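The idea behind such an MDS plot can be sketched with classical (Torgerson) MDS on log2-transformed counts. This is a simplified illustration, not the procedure used to produce Fig. 3D and 3E: the choice of classical MDS, the pseudocount of 1 before the log2 transformation and the toy count matrix are all assumptions.

```python
import numpy as np


def classical_mds(log_counts: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Classical MDS of samples. Rows are genes, columns are samples.

    Returns an array of shape (n_samples, n_components).
    """
    samples = log_counts.T
    # Pairwise squared Euclidean distances between samples
    diff = samples[:, None, :] - samples[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    # Double centering turns squared distances into a Gram matrix
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # Coordinates come from the leading eigenvectors of the Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))


# Toy matrix (hypothetical): 4 genes x 4 samples (2 control-like, 2 induced-like)
counts = np.array([
    [100, 110, 10, 12],
    [5, 6, 80, 75],
    [0, 0, 0, 0],      # no information: removed before the analysis
    [50, 55, 52, 49],
], dtype=float)

keep = counts.sum(axis=1) > 0          # drop rows with 0 counts in all samples
log_counts = np.log2(counts[keep] + 1)  # pseudocount of 1 is an assumption
coords = classical_mds(log_counts)
print(coords.shape)  # (4, 2)
```

In the resulting coordinates, replicates of the same condition are expected to lie close to each other and far from the other condition, which is the pattern used here as a reproducibility check.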

Fig. 2. Polysome profile obtained by sucrose density gradient fractionation. hASCs isolated from 3 donors were treated with control (CT), adipogenic (ADI) or osteogenic (OST) induction medium for 24 h and then subjected to sucrose density gradient fractionation. The polysome profile of each sample was recorded (absorbance at 254 nm). This is the full set of polysome profiles described in our related works [1,2]. Polysome profiles from TL10 CT and ADI [2], and TL2 CT and OST [1] were previously published.

Value of the Data
The data provide information about total, ribosome-free, monosomal and polysomal RNA from hASCs during early adipogenesis and osteogenesis. This dataset may be explored to understand the mechanisms of gene expression regulation involved in the balance between stemness and differentiation. The information provided here may be used in future studies on the triggering of hASC differentiation. The dataset can be used for direct comparison between adipogenesis and osteogenesis, for studies focused on the translational dynamics of individual transcripts, and for studies of non-coding RNA expression and its interaction with the translational machinery.

Data validation
Validation of RNA-Seq was performed by RT-qPCR. First, we selected six genes previously identified as differentially expressed in the polysomal fraction during adipogenic [2] and osteogenic [1] differentiation. Expression values were converted to fold changes for comparison. Although these two techniques are usually not directly comparable because they rely on different procedures, a high correlation was detected in our analysis. The similar expression levels of these genes showed that our RNA-Seq data were consistent (Fig. 4).
We also compared our data with previously published studies [3–5] and, for example, observed that differentially expressed genes (DEGs) found by Ambele et al. [5] were also identified in our total and polysomal RNA-Seq fractions, with a similar profile (up- or downregulated) in both experiments (Table 2). Notably, the methodologies used here (RNA-Seq) and by Ambele et al. (microarray) [5] provided information about overall population profiles during differentiation. Still, since hASCs are heterogeneous, other studies have focused on the analysis of intrapopulation differences in gene expression using single-cell RNA-seq technology [6]. Further investigation using this technology in hASCs treated with induction media might identify different intrapopulation responses to these stimuli.
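The agreement between RNA-Seq and RT-qPCR fold changes discussed in this section can be quantified, for instance, with a Pearson correlation coefficient. The sketch below assumes that both techniques are summarized as log2 fold changes per gene; the choice of Pearson correlation and the six values per technique are hypothetical and are not the data shown in Fig. 4.

```python
import numpy as np


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equally sized vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


# Hypothetical log2 fold changes for six genes (not the values in Fig. 4)
rnaseq_log2fc = [2.1, -1.3, 0.8, 3.0, -0.5, 1.6]
rtqpcr_log2fc = [1.8, -1.0, 1.1, 2.7, -0.9, 1.2]

r = pearson_r(rnaseq_log2fc, rtqpcr_log2fc)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 indicates that genes up- or downregulated in RNA-Seq show changes of similar direction and magnitude in RT-qPCR.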

Experimental design, materials, and methods
Isolation, characterization, culture and differentiation of hASCs
hASCs were isolated from adipose tissue obtained from three healthy female donors (Table 3) who underwent liposuction surgery. This study was performed in accordance with the guidelines for research involving human subjects and with approval from the Ethics Committee of Fundação Oswaldo Cruz, Brazil (CAAE: 48374715.8.0000.5248). Informed consent was obtained from each donor. hASCs were isolated, characterized and cultivated as previously described [1,2]. These methods are expanded versions of descriptions in our previous work [1,2].
First, 200 mL of adipose tissue was washed with 1 L of sterile phosphate-buffered saline (PBS) (Gibco Invitrogen®, Carlsbad, CA, USA) and digested with 1 mg/mL collagenase type I (Gibco Invitrogen®, Carlsbad, CA, USA) diluted in PBS for 30 min at 37 °C and 5% CO2 under constant shaking. After the incubation period, the shaking was halted, and the cell suspension was allowed to stand for 5 minutes to separate the lipid-enriched (upper) phase. The bottom phase was collected and filtered through a 100-µm mesh filter (BD Bioscience). The cell suspension obtained was centrifuged (10 min, 950 × g, 8 °C), and the supernatant was discarded. The cell pellet was resuspended and treated with hemolysis buffer (0.83% ammonium chloride, 0.1% sodium bicarbonate and 0.04% EDTA) for 10 min to remove erythrocytes. After centrifugation (150 × g, 10 min, 8 °C), the supernatant was discarded, and the cell pellet was resuspended in PBS and filtered through a 40-µm mesh filter (BD Bioscience). After centrifugation (350 × g, 10 min, 8 °C), the supernatant was discarded, and the cells were plated at a density of 1 × 10⁵ cells/cm² in T75 culture flasks in DMEM supplemented with 10% fetal bovine serum (FBS), penicillin (100 units/ml) and streptomycin (100 µg/ml). The flasks were incubated in a humidified incubator at 37 °C and 5% CO2. The culture medium was changed twice a week until the hASC cultures were 80–90% confluent, at which point the cells were trypsinized and expanded. All tests were performed with cells passaged 4 to 6 times.
Cell characterization was performed according to the minimal criteria for defining mesenchymal stem cells established by the International Society for Cellular Therapy. First, cells were detached using trypsin-EDTA and incubated in blocking solution (1% bovine serum albumin, BSA, diluted in PBS) at 4 °C for 1 h. The cells were then incubated for one more hour at 4 °C in the dark with the following antibodies (diluted in blocking solution): FITC-conjugated anti-human CD90 (Thy1), CD34, CD31 and CD19; APC-conjugated anti-human CD73; and PE-conjugated anti-human CD45, HLA-DR, CD117 and CD11b. Mouse IgG antibodies (FITC, APC, PE) were used as negative controls. After incubation, the cells were washed once with PBS, and the data were acquired on a FACSCanto II instrument (Becton Dickinson). For each sample, at least 10,000 events were collected and analyzed with FlowJo® v.10 software (FlowJo, LLC).

Table 1. Reads obtained for each sample, number of reads mapped onto the genome and percentage of mapped reads. This information is the expanded version of brief descriptions presented in our previous works [1,2]. Note that sample "TL01_ADI_Monosome" has a lower number of mapped reads due to an overrepresentation of ribosomal genes.
For adipogenic and osteogenic differentiation, hASCs were treated with hMSC Adipogenic Differentiation Medium (hMSC Adipogenic BulletKit, Lonza) or hMSC Osteogenic Differentiation Medium (hMSC Osteogenic BulletKit, Lonza), respectively, according to the manufacturer's instructions. For the RNA-Seq analysis, cells were treated with maintenance (control, CT), adipogenic or osteogenic induction medium for 24 h. To assess the differentiation potential of the isolated hASCs, the cells were subjected to a longer induction treatment. Adipogenic differentiation was induced by cycles of 3 days of treatment with induction medium followed by 3–4 days of maintenance, over a total of 28 days. The induction medium consisted of basal medium plus the adipogenic inducers indomethacin, insulin, dexamethasone and IBMX. Osteogenic differentiation was induced with medium containing β-glycerophosphate, ascorbic acid and dexamethasone over 21 days. The medium was replaced every 3–4 days. The efficiencies of adipogenic and osteogenic differentiation were determined by assessing the cytoplasmic accumulation of triglycerides with AdipoRed™ Assay Reagent (Lonza) or the mineralized extracellular matrix with the OsteoImage™ Mineralization Assay (Lonza), respectively.

Sucrose density gradient separation and RNA purification
Polysomal fractions were prepared as previously described [1,2]. These methods are expanded versions of descriptions in our related work [1,2]. First, hASC cultures at 60–70% confluence were either induced to osteogenic (OST) or adipogenic (ADI) differentiation or kept in maintenance medium (control, CT) for 24 h. The cells were then treated with 0.1 mg/ml cycloheximide (Sigma-Aldrich, St. Louis, MO, USA) diluted in culture medium for 10 min at 37 °C. Cells were detached with trypsin and then centrifuged (700 × g, 5 min, 8 °C), and the resulting cell pellets were washed twice in 0.1 mg/ml cycloheximide diluted in PBS. After centrifugation (700 × g, 5 min, 8 °C), the cell pellet was resuspended in lysis buffer (15 mM Tris-HCl (pH 7.4), 15 mM MgCl2, 300 mM NaCl, 100 µg/mL cycloheximide, 1% Triton X-100) and incubated for 10 min on ice. The cell lysates were centrifuged (12,000 × g, 10 min, 4 °C), and the supernatants were carefully isolated and loaded onto 10–50% sucrose gradients (BioComp Model 108 Gradient Master ver. 5.3). Gradients were subjected to ultracentrifugation (150,000 × g, SW40 rotor, HIMAC CP80WX HITACHI, 160 min, 4 °C) and then fractionated with the ISCO gradient fractionation system (ISCO Model 160 Gradient Former Foxy Jr. Fraction Collector) connected to a UV detector to monitor absorbance at 254 nm. The polysome profile was recorded.
The ribosome-free, monosome-associated and polysome-associated RNA fractions as well as the total RNA were extracted using the Direct-zol™ RNA MiniPrep kit (Zymo Research) according to the manufacturer's instructions.

cDNA library construction
In total, 1 µg of RNA from three independent biological sample replicates of each condition (ribosome-free, monosome-associated, polysome-associated and total RNA) was used for cDNA library construction and RNA-Seq (Table 4). The cDNA libraries were prepared with the TruSeq Stranded Total RNA Sample Preparation kit (Illumina, Inc.) following the manufacturer's instructions. The library size was verified using the Agilent 2100 Bioanalyzer (Agilent), and the library concentration was confirmed by qPCR using the Illumina Library Quantification Kit Universal qPCR mix (Kapa Biosystems).

Large-scale sequencing
The samples were prepared for sequencing on the Illumina platform using the TruSeq Stranded Total RNA LT Kit. For clustering and sequencing, the TruSeq SR Cluster Kit v3-cBot-HS and the TruSeq SBS Kit v3-HS (100 cycles) were used. Samples were sequenced on the Illumina HiSeq 2500 System. The raw data were deposited in ArrayExpress under the accession number E-MTAB-6298.

Bioinformatic analyses
Sequence data were mapped and counted by comparison against the latest version of the GRCh38 human genome with the Rsubread package. The mapping of reads was done with default parameters (unique mapping of reads), and counting was performed using the Ensembl annotation (GRCh38).
For quality evaluation, we performed multidimensional scaling (MDS), a dimension-reduction method applied to the count matrix, to explore associations between variables. The log2-transformed raw counts were used for this analysis, and rows with no information (0 counts in all samples) were eliminated. Samples of the same condition should cluster together, which indicates consistency and replicability of the data. For comparisons of gene expression between samples, RPKM values (reads per kilobase per million mapped reads) were determined. Differential expression analysis was performed using the Bioconductor R package edgeR [7]. Separate comparisons were considered for adipogenesis and osteogenesis. For each RNA fraction (polysomal, monosomal, total and ribosome-free RNA), the induced condition (ADI or OST) was compared with the stem cell state (CT, control). This analysis included genes with at least one count per million in at least three samples. After dispersion estimation with the three recommended functions (estimateGLMCommonDisp, estimateGLMTrendedDisp, estimateGLMTagwiseDisp), differential expression analyses of all comparisons were performed using generalized linear models (glmFit and glmLRT). Correction for multiple testing was performed with the FDR (false discovery rate). The data from some of these analyses with the parameters mentioned above are shown in Robert
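The expression filter, the RPKM calculation and the multiple-testing correction described in this section can be sketched in a few lines. The analysis in this work was carried out in R with edgeR; the Python code below only illustrates the underlying arithmetic. The function names, the use of raw column sums as library sizes, the Benjamini-Hochberg procedure as the FDR method and the toy numbers are assumptions.

```python
import numpy as np


def cpm(counts):
    """Counts per million: scale each sample (column) by its library size."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=0) * 1e6


def filter_by_cpm(counts, min_cpm=1.0, min_samples=3):
    """Mask of genes with CPM >= min_cpm in at least min_samples samples."""
    return (cpm(counts) >= min_cpm).sum(axis=1) >= min_samples


def rpkm(counts, gene_lengths_bp):
    """Reads per kilobase of gene per million mapped reads."""
    lengths_kb = np.asarray(gene_lengths_bp, dtype=float)[:, None] / 1e3
    return cpm(counts) / lengths_kb


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.minimum(adjusted, 1.0)
    return out


# Toy data (hypothetical): 3 genes x 4 samples, each library has 1,000,000 reads
counts = np.array([
    [500_000, 400_000, 450_000, 480_000],
    [1, 0, 0, 2],                          # >= 1 CPM in only 2 samples
    [499_999, 600_000, 550_000, 519_998],
])
mask = filter_by_cpm(counts)
print(mask)                                       # [ True False  True]
print(rpkm(counts, [2000, 1000, 1000])[0, 0])     # 250000.0
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))  # [0.02 0.04 0.04 0.02]
```

The second gene is removed because it reaches one count per million in only two of the four samples, below the threshold of three samples used here.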

Quantitative RT-PCR
Total RNA was extracted using the RNeasy Kit (Qiagen), and polysomal RNA was extracted with the Direct-zol™ RNA Kit (Zymo Research) according to the manufacturer's instructions. For complementary DNA (cDNA) synthesis, oligo-dT primers and the IMPROM II Reverse Transcriptase Kit (Promega) were used according to the manufacturer's instructions. Quantitative RT-PCR (RT-qPCR) was performed using a SYBR green PCR premixture (Applied Biosystems, Foster City, CA, USA). Normalization was performed using the internal control GAPDH (glyceraldehyde phosphate dehydrogenase), and all reactions were performed in technical triplicate. The primers used for RT-qPCR are listed in Table 5.

Table 5. RT-qPCR primer sequences. Oligonucleotide primers used to analyze the differential expression of genes after 24 h of adipogenic and osteogenic differentiation.
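One common way to turn GAPDH-normalized RT-qPCR measurements into fold changes is the 2^-ΔΔCt method, sketched below. The text above states only that GAPDH was the internal control and that reactions were run in technical triplicate; the use of the ΔΔCt calculation, the assumption of roughly 100% amplification efficiency, the function name and the Ct values are assumptions made for illustration.

```python
from statistics import mean


def fold_change_ddct(ct_target_induced, ct_ref_induced,
                     ct_target_control, ct_ref_control) -> float:
    """Fold change (induced vs. control) by the 2^-ddCt method.

    Each argument is a list of Ct values from technical replicates.
    Assumes ~100% amplification efficiency for target and reference.
    """
    delta_induced = mean(ct_target_induced) - mean(ct_ref_induced)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    return 2.0 ** -(delta_induced - delta_control)


# Hypothetical Ct values (technical triplicates); GAPDH is the reference gene
fc = fold_change_ddct(
    ct_target_induced=[22.0, 22.1, 21.9],
    ct_ref_induced=[18.0, 18.0, 18.0],
    ct_target_control=[24.0, 24.1, 23.9],
    ct_ref_control=[18.0, 18.0, 18.0],
)
print(round(fc, 2))  # 4.0
```

In this toy case the target reaches threshold two cycles earlier in the induced cells after GAPDH normalization, which corresponds to an approximately fourfold increase (log2 fold change of 2).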