Stable Reference Genes for qPCR Analysis in BM-MSCs Undergoing Osteogenic Differentiation within 3D Hyaluronan-Based Hydrogels

Reverse transcription quantitative polymerase chain reaction (RT-qPCR) enables the monitoring of changes in cell phenotype via the high-throughput screening of numerous genes. RT-qPCR is a fundamental approach in numerous research fields, including biomaterials, yet little attention has been given to the potential impact of 3D versus monolayer (2D) cell culture and to the requirement for a constant validation of the multiple steps of gene expression analysis. The aim of this study is to use high-quality RNA to identify the most suitable reference genes for RT-qPCR analysis during the osteogenic differentiation of human bone marrow mesenchymal stem/stromal cells (BM-MSCs). BM-MSCs are cultured under osteogenic conditions for 28 days in 2D or within hyaluronic acid hydrogels (3D). RNA is subject to quality controls and is then used to identify the most stable reference genes using geNorm, NormFinder, and the ∆Cq method. The effect of the reverse transcriptase is investigated, as well as the expression of osteogenic-related markers. This study shows marked differences in the stability of reference genes between 2D (RPLP0/GAPDH) and 3D (OAZ1/PPIA) culture, suggesting that it is critical to choose appropriate reference genes for 3D osteogenic cell cultures. Thus, a thorough validation under specific experimental settings is essential to obtain meaningful gene expression results.


Introduction
Bone marrow-derived mesenchymal stem/stromal cells (BM-MSCs) are certainly among the most studied cell type in the musculoskeletal research field [1]. Their multilineage potential is the continuous object of various investigations, as it suggests the possibility of cell-based therapies to regenerate tissues such as bone and cartilage.
The osteogenic differentiation of BM-MSCs was described for the first time in monolayer culture by Jaiswal and colleagues in the late 1990s [2]. In this study, von Kossa and alkaline phosphatase staining are the main outcome measurements, and the expression of osteogenic marker mRNA is demonstrated by Northern blot. Subsequently, the field of tissue engineering emerged to combine engineering with life science, and BM-MSCs were soon in the spotlight of the tissue engineering triad (cell-material-growth factors) targeting bone regeneration. Monolayer cell cultures are a simple method to gain a first perception of the mechanisms underlying cell differentiation, but they lack the three-dimensional characteristic of tissues. Tissue engineering makes use of natural and synthetic materials to more faithfully recreate the 3D microenvironment and therefore increases the complexity of in vitro culture systems.
A major paradigm shift in life science research occurs with the discovery of the polymerase chain reaction (PCR) [3,4]. The PCR can not only be used to sequence genomic DNA, but is also the most accurate technique to detect changes in gene expression at the mRNA level [5]. Throughout the years, improvements in PCR technology have resulted in the established use of reverse transcription quantitative real-time PCR (RT-qPCR) [6] and labelled/quenched probes such as the TaqMan™ technology to increase detection specificity [7]. Today, gene expression analysis by RT-qPCR is considered the technique par excellence in molecular biology, allowing the high-throughput quantification of a wide range of target genes with a high specificity and sensitivity.
Despite its widespread usage, the successful implementation of RT-qPCR to assess experimental outcomes faces numerous technical challenges that impact assay performance. Sample preparation, RNA quality, RT efficiency, and differences in experimental conditions can introduce significant variations that can wrongly be attributed to a biological effect. To correct for these variables, various papers [8][9][10] highlight the need for the optimization of all these aspects and the necessity of providing more accurate experimental details in publications [8]. A common method with which to analyze qPCR data is the ∆∆Cq method, based on the use of a reference gene as a normalizer. To be used as a normalizer, a reference gene must be stable throughout the duration of the experiment and across groups and should be expressed at a similar level to the target gene(s) of interest. Unfortunately, inappropriate normalization is one of the common pitfalls of qPCR, and often too few details are reported in gene expression studies. The choice of suitable reference genes requires a thorough validation of the different and widely ranging genes involved in various cellular pathways. This is a critical step prior to gene expression analysis, since many reference genes are unstably expressed under varying experimental conditions. However, reference gene stability remains poorly investigated in the presence of drastic differences among experimental groups [9], and the use of just one reference gene is a common but incorrect practice in many studies.
All these concepts have profound implications in biomaterial-assisted cell cultures, where the application of RT-qPCR is increasing as a tool to investigate changes in cell phenotype associated with both cell-cell and cell-material interactions [11]. Three-dimensional cultures fundamentally differ from conventional monolayer, but often the same methods are applied for gene expression studies. For instance, the extraction of high-quality RNA from complex matrixes can be particularly challenging [12,13], especially with increasing culture duration, during which time increasing amounts of extracellular matrix are produced. This is also the case of the hyaluronic acid (HA)-based biomaterials commonly used in tissue engineering studies relevant to the musculoskeletal field [14][15][16]. The three dimensionality of a biomaterial-based culture, together with the composition of HA itself, likely influence the expression of reference gene candidates over a prolonged culture period. We believe that the optimal gene expression analysis strategy for HA-based 3D cultures cannot be simply extrapolated from prior assessments made in conventional monolayer cultures.
This study aims to use high-quality RNA from cells in long-term 3D cultures to identify the most suitable normalizer for gene expression analysis during the osteogenic differentiation of hBM-MSCs within HA-based hydrogels. The identification of suitable reference genes, aside from the traditional ones, highlights the importance of performing a thorough validation under the specific experimental settings to obtain a meaningful gene expression analysis.

High-Quality RNA Can Be Extracted from BM-MSCs within MeHA Using Double Phase Separation
The first step in a gene expression study is the isolation of pure and intact RNA, as both the purity [17] and integrity [18] of RNA affects the performance of downstream applications. Given the 3D nature of the methacrylated HA (MeHA) hydrogels, and the long-term culture period under osteogenic culture conditions (28 days), we modify the standard Tri-Reagent-based protocol to extract RNA from cells in monolayer ( Figure 1A) by adding additional steps generally required when working with mammalian tissues. To obtain the complete disruption of the sample, the hydrogels are snap frozen in liquid nitrogen and pulverized prior to lysis. To reduce the viscosity of the lysate, a homogenization step is also added before a double phase separation is performed ( Figure 1B). snap frozen in liquid nitrogen and pulverized prior to lysis. To reduce the viscosity of the lysate, a homogenization step is also added before a double phase separation is performed ( Figure 1B). For the 2D culture, the cell lysis in the TRI Reagent is followed by phase separation and RNA precipitation using isopropanol alcohol (IPA). (B) For the 3D culture, the samples are snap frozen and pulverized before the lysis step. The lysate is homogenized and then subject to two rounds of phase separation before proceeding with the RNA precipitation step. In both cases, RNA is washed three times using 70% ethanol (EtOH).
To calculate the amount of isolated RNA from the 3D cultures and assess its purity, we perform absorbance measurements using a NanoDrop ® spectrophotometer. In common laboratory practice, RNA samples with A260/A280 and A260/A230 ratios ≥1.8 are considered clean from contaminants absorbing at 280 and 230 nm, with the absorbance at 230 nm known to be negatively affected by low concentrations of RNA. The RNA of all the samples has a A260/A280 ratio above 1.80. The A260/A230 ratio is affected by the reduction in total RNA at day 28, with values above 1.75 only for the samples at day 0. Indeed, there is a consistent ~three-fold reduction in the amount of recovered RNA between day 0 and day 28 from all four BM-MSC donors (Table 1). Although spectrophotometric assessment provides an indication about possible contamination, a limitation of this method is the inability to distinguish between intact and degraded RNA. Since degraded RNA does not perform well in downstream applications, we perform an additional quality check to determine RNA integrity before proceeding with RT-qPCR. The RNA integrity number (RIN) is assigned using the Agilent TapeStation system, based on a scale from 1 to 10, with 10 being the highest quality. A RIN of above six results in lower Cq values [19], while degraded RNA results in later gene expression [20] and also affects gene stability [21]. All the RNA samples under investigation have a RIN above nine and compare well with the RNA from a 2D culture ( Figure 2). For the 2D culture, the cell lysis in the TRI Reagent is followed by phase separation and RNA precipitation using isopropanol alcohol (IPA). (B) For the 3D culture, the samples are snap frozen and pulverized before the lysis step. The lysate is homogenized and then subject to two rounds of phase separation before proceeding with the RNA precipitation step. In both cases, RNA is washed three times using 70% ethanol (EtOH).
To calculate the amount of isolated RNA from the 3D cultures and assess its purity, we perform absorbance measurements using a NanoDrop ® spectrophotometer. In common laboratory practice, RNA samples with A 260 /A 280 and A 260 /A 230 ratios ≥1.8 are considered clean from contaminants absorbing at 280 and 230 nm, with the absorbance at 230 nm known to be negatively affected by low concentrations of RNA. The RNA of all the samples has a A 260 /A 280 ratio above 1.80. The A 260 /A 230 ratio is affected by the reduction in total RNA at day 28, with values above 1.75 only for the samples at day 0. Indeed, there is a consistent~three-fold reduction in the amount of recovered RNA between day 0 and day 28 from all four BM-MSC donors (Table 1). Although spectrophotometric assessment provides an indication about possible contamination, a limitation of this method is the inability to distinguish between intact and degraded RNA. Since degraded RNA does not perform well in downstream applications, we perform an additional quality check to determine RNA integrity before proceeding with RT-qPCR. The RNA integrity number (RIN) is assigned using the Agilent TapeStation system, based on a scale from 1 to 10, with 10 being the highest quality. A RIN of above six results in lower Cq values [19], while degraded RNA results in later gene expression [20] and also affects gene stability [21]. All the RNA samples under investigation have a RIN above nine and compare well with the RNA from a 2D culture ( Figure 2). The utilized protocol allows the isolation of high-quality RNA from MeHA hydrogels maintained under osteogenic culture conditions for 28 days. Both the purity and integrity of the RNA are comparable to those of the RNA obtained from 2D cultures serving as a reference standard, and therefore the material can be used in downstream applications such as RT-qPCR.

The Type of Reverse Transcriptase Affects the ∆Cq Values During Real Time PCR
Once high-quality RNA is obtained from both 2D and 3D cultures, complementary DNA (cDNA) is synthesized using a reverse transcriptase chosen from a variety of commercially available reverse transcriptase enzymes. To determine the effect of the reverse transcriptase on the cycle quantification (Cq) value detected during qPCR, we produce cDNA samples using two different reverse transcriptases, MultiScribe™ and SuperScript™ VILO™, and then amplify the RPLP0 cDNA using the TaqMan™ Universal Master Mix.
In 2D, no statistically significant differences are seen between day 0 (23.39 ± 0.47) and day 28 (23.73 ± 0.33) when using MultiScribe™ ( Figure 3). The same is observed for the Cq values obtained when using SuperScript™ VILO™, although the values are lower both at day 0 (19.16 ± 0.39) and at day 28 (19.40 ± 0.40). In 3D, we detect a statistically significant difference between the Cq values at day 0 (23.18 ± 0.31) and day 28 (24.03 ± 0.15) when using MultiScribe™ to obtain cDNA (p = 0.000427). The use of SuperScript™ VILO™ reduces the difference in Cq between time points from ΔCq = 0.86 to ΔCq = 0.39, and no statistically significant difference is calculated. As for the 2D, also in 3D the Cq values of RPLP0 are lower (overall mean 20.23 ± 0.49) when using SuperScript™ VILO™ compared to MultiScribe™ (23.61 ± 0.50). The utilized protocol allows the isolation of high-quality RNA from MeHA hydrogels maintained under osteogenic culture conditions for 28 days. Both the purity and integrity of the RNA are comparable to those of the RNA obtained from 2D cultures serving as a reference standard, and therefore the material can be used in downstream applications such as RT-qPCR.

The Type of Reverse Transcriptase Affects the ∆Cq Values During Real Time PCR
Once high-quality RNA is obtained from both 2D and 3D cultures, complementary DNA (cDNA) is synthesized using a reverse transcriptase chosen from a variety of commercially available reverse transcriptase enzymes. To determine the effect of the reverse transcriptase on the cycle quantification (Cq) value detected during qPCR, we produce cDNA samples using two different reverse transcriptases, MultiScribe™ and SuperScript™ VILO™, and then amplify the RPLP0 cDNA using the TaqMan™ Universal Master Mix.
In 2D, no statistically significant differences are seen between day 0 (23.39 ± 0.47) and day 28 (23.73 ± 0.33) when using MultiScribe™ ( Figure 3). The same is observed for the Cq values obtained when using SuperScript™ VILO™, although the values are lower both at day 0 (19.16 ± 0.39) and at day 28 (19.40 ± 0.40). In 3D, we detect a statistically significant difference between the Cq values at day 0 (23.18 ± 0.31) and day 28 (24.03 ± 0.15) when using MultiScribe™ to obtain cDNA (p = 0.000427). The use of SuperScript™ VILO™ reduces the difference in Cq between time points from ∆Cq = 0.86 to ∆Cq = 0.39, and no statistically significant difference is calculated. As for the 2D, also in 3D the Cq values of RPLP0 are lower (overall mean 20.23 ± 0.49) when using SuperScript™ VILO™ compared to MultiScribe™ (23.61 ± 0.50).
Overall, we show that the type of reverse transcriptase significantly influences observed differences in the Cq values between time points. In both the 2D and 3D samples, the difference between the Cq values of the two time points is not significantly different when using cDNA synthesized by SuperScript™ VILO™. Therefore, all the upcoming experiments are carried out with cDNA obtained using only SuperScript™ VILO™.  Overall, we show that the type of reverse transcriptase significantly influences observed differences in the Cq values between time points. In both the 2D and 3D samples, the difference between the Cq values of the two time points is not significantly different when using cDNA synthesized by SuperScript™ VILO™. Therefore, all the upcoming experiments are carried out with cDNA obtained using only SuperScript™ VILO™.

The Culture Type (2D vs. 3D) Influences the Performance of a Reference Gene
A comparative gene expression analysis using the ∆∆Cq method is based on a reference sample and a reference gene to be used as a calibrator and a normalizer, respectively. While the choice of the calibrator is dictated by the experimental groups, the identification of suitable reference genes as a normalizer requires a more careful validation. Changes in culture conditions such as the transition from 2D to 3D and the presence of MeHA can affect the stability of a reference gene. Based on a screening of the literature related to lineage differentiation of hBM-MSCs, we select eight reference genes and amplify them by real-time PCR.
The Cq values of the candidate reference genes range from a minimum Cq value of 7.43 and 7.03 obtained with 18S to a maximum of 30.46 and 30.00 for YWHAZ, in 2D and 3D culture, respectively ( Figure 4). In 2D, PPIA and TBP are the only two reference genes that show a statistically significant difference between day 0 and day 28 of culture (p ≤ 0.0008 and p ≤ 0.022, respectively) ( Figure 4A). In 3D, TBP has different Cq values between the two time points as well as GAPDH, GUSB, and YWHAZ. On the contrary, PPIA is stable between time points together with 18S and OAZ1 ( Figure 4B).

The Culture Type (2D vs. 3D) Influences the Performance of a Reference Gene
A comparative gene expression analysis using the ∆∆Cq method is based on a reference sample and a reference gene to be used as a calibrator and a normalizer, respectively. While the choice of the calibrator is dictated by the experimental groups, the identification of suitable reference genes as a normalizer requires a more careful validation. Changes in culture conditions such as the transition from 2D to 3D and the presence of MeHA can affect the stability of a reference gene. Based on a screening of the literature related to lineage differentiation of hBM-MSCs, we select eight reference genes and amplify them by real-time PCR.
The Cq values of the candidate reference genes range from a minimum Cq value of 7.43 and 7.03 obtained with 18S to a maximum of 30.46 and 30.00 for YWHAZ, in 2D and 3D culture, respectively ( Figure 4). In 2D, PPIA and TBP are the only two reference genes that show a statistically significant difference between day 0 and day 28 of culture (p ≤ 0.0008 and p ≤ 0.022, respectively) ( Figure 4A). In 3D, TBP has different Cq values between the two time points as well as GAPDH, GUSB, and YWHAZ. On the contrary, PPIA is stable between time points together with 18S and OAZ1 ( Figure 4B). Overall, we show that the type of reverse transcriptase significantly influences observed differences in the Cq values between time points. In both the 2D and 3D samples, the difference between the Cq values of the two time points is not significantly different when using cDNA synthesized by SuperScript™ VILO™. Therefore, all the upcoming experiments are carried out with cDNA obtained using only SuperScript™ VILO™.

The Culture Type (2D vs. 3D) Influences the Performance of a Reference Gene
A comparative gene expression analysis using the ∆∆Cq method is based on a reference sample and a reference gene to be used as a calibrator and a normalizer, respectively. While the choice of the calibrator is dictated by the experimental groups, the identification of suitable reference genes as a normalizer requires a more careful validation. Changes in culture conditions such as the transition from 2D to 3D and the presence of MeHA can affect the stability of a reference gene. Based on a screening of the literature related to lineage differentiation of hBM-MSCs, we select eight reference genes and amplify them by real-time PCR.
The Cq values of the candidate reference genes range from a minimum Cq value of 7.43 and 7.03 obtained with 18S to a maximum of 30.46 and 30.00 for YWHAZ, in 2D and 3D culture, respectively ( Figure 4). In 2D, PPIA and TBP are the only two reference genes that show a statistically significant difference between day 0 and day 28 of culture (p ≤ 0.0008 and p ≤ 0.022, respectively) ( Figure 4A). In 3D, TBP has different Cq values between the two time points as well as GAPDH, GUSB, and YWHAZ. On the contrary, PPIA is stable between time points together with 18S and OAZ1 ( Figure 4B).  The difference in Cq value between time points can be calculated as ∆Cq (Table 2), while the variation in relation to the mean is expressed by the coefficient of variation (CV).

The Stability of a Reference Gene Differs between 2D and 3D Culture
To effectively evaluate the stability of reference genes, several statistical methods and algorithms have been developed and are freely available. In this work, we make use of three different methods to calculate the average expression stability (geNorm [22]), stability value (NormFinder [23]), and mean SD of ∆Cq (∆Cq method [24]) for both 2D and 3D culture ( Figure 5). The difference in Cq value between time points can be calculated as ∆Cq (Table 2), while the variation in relation to the mean is expressed by the coefficient of variation (CV).

The Stability of a Reference Gene Differs between 2D and 3D Culture
To effectively evaluate the stability of reference genes, several statistical methods and algorithms have been developed and are freely available. In this work, we make use of three different methods to calculate the average expression stability (geNorm [22]), stability value (NormFinder [23]), and mean SD of ∆Cq (∆Cq method [24]) for both 2D and 3D culture ( Figure 5). According to geNorm, PPIA and GAPDH are the most stable reference genes for 2D osteogenic cultures of hBM-MSCs, while YWHAZ is the least stable gene ( Figure 5A). However, in 3D culture PPIA and OAZ1 are the best-performing reference genes. Additionally, in contrast to the 2D data, According to geNorm, PPIA and GAPDH are the most stable reference genes for 2D osteogenic cultures of hBM-MSCs, while YWHAZ is the least stable gene ( Figure 5A). However, in 3D culture PPIA and OAZ1 are the best-performing reference genes. Additionally, in contrast to the 2D data, GAPDH has the highest M value, thereby being the least stable reference gene for the osteogenic culture of hBM-MSCs in MeHA ( Figure 5D). NormFinder and the ∆Cq method confirm YWHAZ as the least stable gene in the 2D data, while they identify RPLP0 as the best-performing gene together with GAPDH ( Figure 5B,C and Figure A1). For the 3D samples, NormFinder and the ∆Cq method assign the lowest values to PPIA and OAZ1 and the highest value to GAPDH ( Figure 5E,F and Figure A1), therefore confirming PPIA/OAZ1 and GAPDH as the best-and worst-performing genes, respectively, in osteogenic cultures within MeHA hydrogels.
A ranking of the reference genes is made according to the stability/SD values to ease the comparison across methods and, to gain an overall ranking, the geometric mean of the values obtained with the three different methods is also calculated ( Table 3). A further value (V) is calculated, based on a pairwise comparison in geNorm (Figure 6), to obtain the optimal number of reference genes required for a robust normalization. When the V of the specific pairwise variation is ≥0.15, the addition of a further reference gene has a significant effect. While, for the 2D culture, a minimum of two reference genes is sufficient for a reliable gene expression analysis ( Figure 6A), for the 3D culture the pairwise comparison suggests that three reference genes are required, since the V2/3 comparison has a V= 0.151 ( Figure 6B). The validation of the eight reference gene candidates highlights the larger spread in stability across the reference genes in samples from 3D cultures compared to the one from 2D cultures. The results obtained using three different methods are comparable. They identify RPLP0 and GAPDH as the most suitable reference genes for the 2D osteogenic culture of BM-MSCs, and PPIA, OAZ1, and GUSB as the top three reference genes for the gene expression analysis of osteogenically differentiated BM-MSCs within MeHA hydrogels.

The Reference Gene of Choice has an Impact on the Target Gene Expression
To stress the importance of choosing the appropriate reference gene, the expression of two osteogenic target genes is calculated using the samples at day 0 as a calibrator and the different reference genes under validation as a normalizer.
For both culture types, the expression of both Col1A1 and IBSP is affected by the chosen reference gene (Figure 7), in particular for the 3D culture, where the fold change of Col1A1 has opposite trends and IBSP has a fold change varying between 52.9 and 345.87 depending on the choice of reference gene. Based on the results of the comprehensive ranking, we use the geometric mean of the two (GAPDH/RPLP0, for 2D) or three (PPIA/OAZ1/GUSB, for 3D) reference genes with the lowest mean of all stability values for the analysis of the target genes [22]. When compared to data obtained using the worst-performing reference gene (YWHAZ in 2D and GAPDH in 3D), a statistically significant difference is detected only for the 3D culture (p < 0.0001 for Col1A1 and p = 0.0001 for IBSP) ( Figure  7B,D).
The differential impact that the choice of reference genes has on the target gene expression in 2D and in 3D confirms the results of the validation, where the range of stability is higher in the 3D culture compared to the 2D setting ( Figure 5). The validation of the eight reference gene candidates highlights the larger spread in stability across the reference genes in samples from 3D cultures compared to the one from 2D cultures. The results obtained using three different methods are comparable. They identify RPLP0 and GAPDH as the most suitable reference genes for the 2D osteogenic culture of BM-MSCs, and PPIA, OAZ1, and GUSB as the top three reference genes for the gene expression analysis of osteogenically differentiated BM-MSCs within MeHA hydrogels.

The Reference Gene of Choice Has an Impact on the Target Gene Expression
To stress the importance of choosing the appropriate reference gene, the expression of two osteogenic target genes is calculated using the samples at day 0 as a calibrator and the different reference genes under validation as a normalizer.
For both culture types, the expression of both Col1A1 and IBSP is affected by the chosen reference gene (Figure 7), in particular for the 3D culture, where the fold change of Col1A1 has opposite trends and IBSP has a fold change varying between 52.9 and 345.87 depending on the choice of reference gene. Based on the results of the comprehensive ranking, we use the geometric mean of the two (GAPDH/RPLP0, for 2D) or three (PPIA/OAZ1/GUSB, for 3D) reference genes with the lowest mean of all stability values for the analysis of the target genes [22]. When compared to data obtained using the worst-performing reference gene (YWHAZ in 2D and GAPDH in 3D), a statistically significant difference is detected only for the 3D culture (p < 0.0001 for Col1A1 and p = 0.0001 for IBSP) (Figure 7B,D).
The differential impact that the choice of reference genes has on the target gene expression in 2D and in 3D confirms the results of the validation, where the range of stability is higher in the 3D culture compared to the 2D setting ( Figure 5).

Osteogenic Differentiation of hBM-MSCs Within MeHA Hydrogels
To conclude our study, we use the identified reference genes to perform a gene expression analysis on a selected number of target genes commonly used to assess the osteogenic differentiation of hBM-MSCs.
Consistent with cells in a monolayer, we observe an increased expression of the late-stage osteogenic marker IBSP at day 28, confirming the osteogenic commitment of the cells in 3D cultures of MeHA hydrogels (Figure 8). The expression at the mRNA level of all the genes under investigation is comparable between the 2D and the 3D culture. No statistical significance can be identified between the 2D and the 3D culture in the fold changes of the target genes. However, when compared to osteogenic 2D culture, there is a clear trend for cells within MeHA to express lower levels of Sox9 and Co10A1 and increased levels of PPARγ and IBSP. Analysis is performed according to the 2 −∆∆Cq method using each of the eight candidate genes as a reference gene and the day 0 reference sample. The geometric mean of the top 2 and 3 genes according to the geNorm algorithm is used as a comparison for the 2D and the 3D culture, respectively. Data are shown as mean ± SD of independent experiments using cells from four different donors.

Osteogenic Differentiation of hBM-MSCs within MeHA Hydrogels
To conclude our study, we use the identified reference genes to perform a gene expression analysis on a selected number of target genes commonly used to assess the osteogenic differentiation of hBM-MSCs.
Consistent with cells in a monolayer, we observe an increased expression of the late-stage osteogenic marker IBSP at day 28, confirming the osteogenic commitment of the cells in 3D cultures of MeHA hydrogels (Figure 8). The expression at the mRNA level of all the genes under investigation is comparable between the 2D and the 3D culture. No statistical significance can be identified between the 2D and the 3D culture in the fold changes of the target genes. However, when compared to osteogenic 2D culture, there is a clear trend for cells within MeHA to express lower levels of Sox9 and Co10A1 and increased levels of PPARγ and IBSP. osteogenic markers is analysed in 2D and 3D culture at day 28 of osteogenic differentiation. Analysis is performed according to the 2 −∆∆Cq method, with day 0 used as a reference sample and the geometric mean of RPLP0 and GAPDH or OAZ1, PPIA, and GUSB used as an endogenous control for the 2D and the 3D data, respectively. Data are shown as the median, 5 to 95 percentiles (boxes), and ranges (whiskers) of independent experiments using cells from four different donors. There is no statistically significant difference in the fold change between the 2D and the 3D culture.

Discussion
Tissue engineering and RT-qPCR have both revolutionized life science research by facilitating the transition from 2D to 3D culture systems and allowing the high-throughput screening of mRNA expression, respectively. In particular, the tantalizing regenerative properties of BM-MSCs for cellbased musculoskeletal regenerative approaches has seen intensive research interest in recent years. However, despite the large number of promising in vitro and preclinical approaches, MSC-based therapies have yet to make a significant impact in the clinical setting. This phenomenon is partly due to difficulties in translating methodologies and experimental findings, including gene expression results, from conventional 2D cultures to 3D biomaterial-and organ-based cultures.
With this in mind, we have utilized 2D and 3D osteogenic cultures of BM-MSCs to validate all the steps of a typical gene expression study, from RNA extraction to the analysis of osteogenic-related markers. We have demonstrated that the 3D osteogenic culture of human BM-MSCs within MeHA hydrogels requires an improved RNA extraction method and the use of different sets of reference genes compared to conventional monolayer/2D osteogenic cultures. The results of this study provide a direct comparison between 2D and 3D culture conditions that highlights the importance of appropriate validation steps to obtain biologically meaningful data representative of the cell type, the culture conditions used, and the type of biomaterial under investigation. osteogenic markers is analysed in 2D and 3D culture at day 28 of osteogenic differentiation. Analysis is performed according to the 2 −∆∆Cq method, with day 0 used as a reference sample and the geometric mean of RPLP0 and GAPDH or OAZ1, PPIA, and GUSB used as an endogenous control for the 2D and the 3D data, respectively. Data are shown as the median, 5 to 95 percentiles (boxes), and ranges (whiskers) of independent experiments using cells from four different donors. There is no statistically significant difference in the fold change between the 2D and the 3D culture.

Discussion
Tissue engineering and RT-qPCR have both revolutionized life science research by facilitating the transition from 2D to 3D culture systems and allowing the high-throughput screening of mRNA expression, respectively. In particular, the tantalizing regenerative properties of BM-MSCs for cell-based musculoskeletal regenerative approaches has seen intensive research interest in recent years. However, despite the large number of promising in vitro and preclinical approaches, MSC-based therapies have yet to make a significant impact in the clinical setting. This phenomenon is partly due to difficulties in translating methodologies and experimental findings, including gene expression results, from conventional 2D cultures to 3D biomaterial-and organ-based cultures.
With this in mind, we have utilized 2D and 3D osteogenic cultures of BM-MSCs to validate all the steps of a typical gene expression study, from RNA extraction to the analysis of osteogenic-related markers. We have demonstrated that the 3D osteogenic culture of human BM-MSCs within MeHA hydrogels requires an improved RNA extraction method and the use of different sets of reference genes compared to conventional monolayer/2D osteogenic cultures. The results of this study provide a direct comparison between 2D and 3D culture conditions that highlights the importance of appropriate validation steps to obtain biologically meaningful data representative of the cell type, the culture conditions used, and the type of biomaterial under investigation.
Accurate gene expression analysis can only be performed following the validation of a broad range of reference genes involved in different aspects of cellular activities to minimize the effect of variation induced by the experimental conditions [10]. In our specific experimental context, we use one cell type and ensure consistency in osteogenic differentiation conditions and experimental duration to specifically assess the impact of 2D vs. 3D culture conditions on reference gene expression. Previous work from Cagnan and colleagues [25] identifies RPLP0 and GAPDH as suitable reference genes during the differentiation of bone marrow-derived MSCs (both osteogenic and adipogenic). The stability of RPLP0 has also been confirmed in various studies of monolayer cultures across different MSC tissue sources and differentiation potentials [26][27][28], which is confirmed in our experimental settings. In a study screening various tissue sources, GAPDH expression stability displays high variability in expression between different types of tissue but expression levels within related tissues are comparable [29]. Consistent with this, our study demonstrates that GAPDH is a stable reference gene, but in a context-dependent manner. Indeed, while GAPDH stability is high in 2D, this is not the case in 3D HA-based cultures. This demonstrates that the sensitivity of GAPDH is influenced by the 3D environment, as already seen in human BM-MSCs cultured within peracetic acid-treated human cancellous bone cubes [30]. In addition, the observations of Quiroz et al. have shown that GAPDH expression is also subject to osteogenic regulation during monolayer differentiation [31]. In our study, no such upregulation is evident in monolayer cultures, although the high ∆Cq value observed for GAPDH confirms its upregulation during osteogenic differentiation at day 28 for the 3D culture.
In 3D HA-based studies, the geometric mean of OAZ1, PPIA, and GUSB serves as the most suitable combination of reference genes for use as a normalizer. PPIA has previously been reported as a stable reference gene during the osteogenic differentiation of human BM-MSCs both in 2D [26] and in 3D cultures in cancellous bone explants [30]. Interestingly, in a 3D chondrogenic culture of synovium-derived porcine MSCs, PPIA is ranked as highly stable within alginate but very unstable in agarose [32]. This comparison suggests that stability is not only influenced by the three dimensionality of the material but also by its biophysical properties, since alginate is negatively charged while agarose has a neutral charge. This finding could be transferred to our results, where PPIA behaves as highly stable in a negatively charged polysaccharide culture. A screening of more than 13,000 genes from diverse tissues and experimental conditions has established OAZ1 as a top reference gene [33], thus supporting our findings of OAZ1 as being a suitable endogenous control for 3D cultures. OAZ1 has been identified as a suitable reference gene in studies involving a high protein turnover [34] and during the development of thymic epithelial cells [35], but the literature on its efficacy as a reference gene in MSCs is scarce. GUSB has also been shown to be among the best performing reference genes for human MSC tri-lineage differentiation [26]. Utilizing the same validation methods as in our study, Brinkdorf et al. has ranked GUSB as one of the least stable genes for immortalized human MSCs in 2D and 3D. Conversely, in the same study, BestKeeper identifies GUSB as the most stable reference gene, demonstrating that stability can differ among methods and an overall ranking is required [36]. Various studies have also suggested that YWHAZ may be a suitable reference gene in monolayer cultures [26,37], however we observe that YWHAZ is not an appropriate choice of reference gene in 2D and 3D HA-based osteogenic cultures, which supports the findings of Brinkdorf et al. [36] that YWHAZ is not suited for 3D culture. A further point concerns the commonly used reference gene 18S, which we consider as an unsuitable choice as a reference gene, despite its apparent stability in 3D cultures. The extremely high expression of 18S, very far from the Cq of most of the genes of interest, and its ribosomal nature (and thus its absence in highly purified mRNA samples), indicates that 18S should not be considered as an appropriate choice of reference gene in future studies [9,31].
One could argue that the choice of 3D culture system used in this study does not specifically permit the determination of the contribution of the 3D setting per se from the intrinsic properties of HA for influencing the stability of reference gene expression. To further investigate this, cells could be seeded onto the surface of the MeHA hydrogel to determine how HA-mediated signaling influences reference gene expression. However, the known anti-adhesive properties of HA-based hydrogels [38] would require a further modification of the biomaterial to allow reproducible seeding on its surface. As such, the 2D monolayer culture was chosen for our study, given its established use in BM-MSC studies.
Despite the anti-adhesive nature of HA-based hydrogels, MSC laden HA-based hydrogels are promising approaches for tissue engineering strategies. HA is synthesized at the site of the future joint [39] and it is present within the cartilaginous callus during endochondral fracture repair [40]. In support of the osteogenic properties of our HA-based material for bone tissue engineering, a key determinant of early fate decision of MSCs is the transcription factor Sox9, which acts as an inhibitor of Runx2 expression during chondrogenesis [41,42]. The downregulation of Sox9 is therefore required for the onset of osteogenesis, and its expression trend in BM-MSCs observed within our MeHA hydrogel confirms the osteogenic phenotype of the cells. In addition, we observed a consistently increased expression of the late-stage osteogenic marker IBSP at day 28 in 3D cultures of our MeHA hydrogels. This is consistent with the findings of Zou and colleagues, who demonstrated that HA possesses good intrinsic osteogenic properties, with a predominant positive role in early and late markers of bone formation [43]. Previous work from Rauh et al. [30] also reported that IBSP is upregulated in 3D cultures of peracetic acid-treated human cancellous bone explants after 14 days. The marked upregulation of IBSP mRNA in our study would suggest that this is an effect induced by the 3D culture environment rather than the HA itself. The expression of Col10A1, an extracellular matrix protein expressed by hypertrophic chondrocytes [44], also appears to be affected by the 3D setting. In a recent study, where HA is supplemented to the chondrogenic culture medium of human BM-MSCs in fibrin/polyurethane scaffolds, the expression of Col10A1 is reduced compared to untreated scaffolds [45]. In the light of these findings, the expression of Col10A1 could be driven by the presence of HA rather than the 3D culture itself.
It could be considered that certain limitations are evident in this study, such as the use of only one cell type and the investigation of only one HA-based biomaterial under osteogenic culture conditions. However, given the widespread use of MSCs and HA-based hydrogels in tissue reparative approaches [46,47] and the known inter-individual heterogeneity of MSCs, the robust nature of our findings suggest that the use of the identified reference genes may serve to improve RT-qPCR-based assessments of phenotypic changes in osteogenic cultures. Further studies to determine the most appropriate reference genes for other MSC-relevant differentiation protocols (such as chondrogenesis), and when utilizing other widely used hydrogels (such as fibrin or collagen-and gelatin-based materials), should also be performed as a matter of urgency in an attempt to advance the field of MSC-based musculoskeletal regenerative approaches.
In conclusion, we have identified a robust set of reference genes for both 2D and 3D investigations concerning the osteogenic differentiation of BM-MSCs in HA-based materials. In addition, our study also provides a method to obtain high-quality RNA from challenging HA-based materials and stresses the importance of a thorough validation of reference genes to be conducted prior to the assessment of gene expression changes in 3D biomaterial cultures.

Materials and Methods
The bone marrow aspirates used in this study were obtained upon the informed consent of the donors with full approval from the Ethics Committee of the University of Freiburg Medical Centre (EK-Freiburg: 135/14) and the ethical commission of Graubünden (KEK-ZH-NR: 2016-00141).
All the reagents were purchased from Sigma-Aldrich (St. Luis, MO, USA) unless otherwise stated.

Methacrylated Hyaluronic Acid Polymer Synthesis
MeHA was synthesized according to [48,49] and stored at −20 • C until use. The degree of substitution was measured by nuclear magnetic resonance.

Cell Laden Methacrylated Hyaluronic Acid Hydrogel Preparation
Irgacure 0.3% (w/v) in phosphate buffered saline (PBS) was sterile filtered and used as a photoinitiator for the MeHA polymers (47% degree of functionalisation). MeHA, 2% (w/v), was dissolved in Irgacure solution and kept at 37 • C until use. Under sterile conditions, hBM-MSCs were added to the MeHA hydrogel precursor at a final density of 20 × 10 6 cells/mL. Cylindrical cell-laden MeHA hydrogels (100 µL volume; containing 2 × 10 6 cells) were produced in 4% (w/v) agarose molds (Lonza, Basel, Switzerland), created using a 6 mm diameter biopsy punch (Kai Europe GmbH, Solingen, Germany. MeHA hydrogels were subsequently crosslinked in a BioLink ® BLX 365 irradiation system (Witec AG, Sursee, Switzerland) for 1.5 min at 1 J. To reduce cell attachment to plastic during culture, the crosslinked hydrogels were transferred to a silicone mold within a 12-well plate and covered by 1 mL of culture medium.

Osteogenic Differentiation of 2D and 3D Culture
The osteogenic differentiation of the monolayer culture was performed as previously described in Hatt et al. [51]. Briefly, BM-MSCs were cultured in osteocontrol (OC) medium for the first 24 h (Dulbecco's Modified Eagle Medium (DMEM 1 g/L glucose, Gibco), 10% FBS (Gibco), and 100 U/mL plus 100 µg/mL penicillin and streptomycin, respectively). The OC medium was then supplemented with an osteogenic (OC) cocktail including 10 nM dexamethasone, 50 µg/mL ascorbic acid 2-phosphate, and 10 mM β-glycerophosphate. The first media change is defined as day 0 and the osteogenic culture was maintained for 28 days, with a media change three times per week. All the cultures were performed with four independent donors, with two replicates for each experimental group. Cells were incubated under standard cell culture conditions (37 • C and 5% CO 2 in a humidified atmosphere).
The cell-laden hydrogels are pre-incubated in OC medium for an initial 24 h period. After this period (defined as experimental day 0), the OC medium was supplemented for a further 28 days with the OG cocktail. As described for the monolayer culture, medium change was performed three times per week, and cultures were repeated with four independent donors and two replicates for each experimental group.

RNA Isolation from 2D and 3D Culture
Cells in monolayer culture were harvested at day 0 and 28 and RNA is isolated according to the TriReagent-based procedure described in Hatt et al. [51].
Cell-laden hydrogels were harvested at days 0 and 28. Each hydrogel was quickly rinsed in PBS, transferred to an Eppendorf tube (Eppendorf, Hamburg, Germany), snap frozen in liquid nitrogen, and stored at −80 • C until use. On the day of the RNA extraction, each hydrogel was pulverized using a custom-made pestle and mortar, with further pulverization using a pellet pestle to grind the gel while keeping the samples frozen. The pulverized samples were lysed in TRI-Reagent ® (1 mL/sample, supplemented with 5 µL/mL of Polyacryl Carrier; both Molecular Research Center Inc., Cincinnati, OH, USA) with an incubation of 10 min at 4 • C. Sample lysate was then added to QIAShredder columns (Qiagen, Venlo, The Netherlands) and centrifuged at 20,000 g for 5 min. For the phase separation, the homogenate was supplemented with 100 µL 1-bromo-3-chloropropane (BCP), mixed for 15 s, incubated for 15 min at room temperature, then finally centrifuged at 12,000 g for 15 min at 4 • C. Following centrifugation, 80% of the aqueous phase was transferred into a fresh Eppendorf tube. All the steps of the phase separation were repeated a second time and the new aqueous phase was transferred into a fresh Eppendorf tube. An equal volume of isopropanol was then added to precipitate the RNA and the samples were placed on an orbital shaker for 10 min before centrifugation at 12,000 g for further 10 min at 4 • C. The RNA pellet was washed three times with 75% ethanol. The ethanol is removed, and the pellet left to air dry for approximately 5 min. The RNA was then dissolved in 20 µL of DEPC-treated water before quantitative and qualitative measurements, then stored at −80 • C until use.

RNA Assessment
To quantify the amount and purity of the recovered RNA, absorbance measurements were carried out with a NanoDrop TM 1000 spectrophotometer (Thermo Scientific, Waltham, MA, USA). The obtained A 260 /A 280 and A 260 /A 230 ratios were used to assess the purity of the RNA samples.
The RNA integrity was calculated as the RIN equivalent (RIN e ) using an RNA ScreenTape kit (Agilent Technologies, Inc, Santa Clara, CA, USA) according to the manufacturer's instructions. Measurements were performed and visualized using a 2200 TapeStation system (Agilent Technologies) equipped with the Agilent Software Package.

qRT-PCR
Gene expression analysis was performed using two-step qRT-PCR. Total RNA (250 ng in 20 µL, or 500 ng in 40 µL) was retrotranscribed using random hexamers and a MultiScribe TM Reverse Transcriptase (Applied Biosystems, Waltham, MA, USA) or SuperScript™ VILO™ (Invitrogen, Waltham, MA, USA), according to the manufacturer's instructions. cDNA was diluted 1 to 5 using 1X Tris-EDTA buffer and stored at −20 • C until further use. For simplicity, the products of the two cDNA syntheses are named after the reverse transcriptase used: MultiScribe and VILO.
Real time qPCR was performed using the TaqMan™ Universal Master Mix with 5 ng of cDNA in a 10 µL reaction volume. To exclude contamination and primer-dimer formation, a non-template control was used for each gene under investigation. Each PCR reaction was run in technical triplicates for 40 cycles using a QuantStudio TM 6 Pro Real-Time PCR System (Applied Biosystems TM , Foster City, CA, USA). Primers used for the reference genes are listed in Table 4, and the primers of osteogenic target genes are listed in Table 5. A standard deviation (SD) of ≤0.25 is set as threshold for the technical triplicates. Data analysis of osteogenic-related gene expression was performed according to the ∆∆Ct method using the geometric mean of RPLP0 and GAPDH for the monolayer culture, and OAZ1, PPIA, and GUSB for the 3D cell-laden hydrogel culture, as a normalizer and day 0 samples as a calibrator.

Reference Gene Validation
Reference genes (Table 4) were selected based on previous literature [26,33,52] and 18S and GAPDH were added as frequently used reference genes. To minimize the possible regulation of the reference gene during in vitro osteogenic differentiation and/or possible co-regulation within the reference gene, the selected reference genes cover a variety of cellular activities.

geNorm
The Cq values of candidate reference genes are transformed by applying the 2 −∆Cq method, where the lowest Cq value is used as a reference. Obtained values are loaded into geNorm v. 3.5 Visual Basic Application for Microsoft Excel. A stability value (M) was computed for all candidate reference genes through a pairwise comparison.
Step by step, the highest stability value was excluded, and the average expression stability was recalculated, thereby enabling the ranking of reference genes from the lowest stability value to the most stable combination of reference genes. Furthermore, geNorm suggests the optimal number of reference genes based on the calculation of a new normalization factor (NF n+1 ) on each occasion when the next stable reference gene is added. Between the sequential normalization factors, according to the descending stability values the pairwise variation V n/n+1 was then calculated. If the pairwise variation is ≥0.15 when including the (n + 1) gene, the addition of that reference gene has a significant effect [22].

NormFinder
NormFinder v0.953 is an Excel based Add-In and, in contrast to geNorm, is a model-based approach. The inclusion of at least eight samples per group and 5-10 candidate reference genes is recommended. Cq values are first calculated as described for geNorm. For the analysis, no grouping is applied and NormFinder is then used to calculate the stability values for all reference genes, with the ranking calculated according to the lowest value [23].

∆Cq Method
Another approach to identify the most stable reference gene is developed by Silver et al. [24]. With this method, the Cq value of pairs of reference genes within a sample is compared by first calculating the mean ∆Cq value of each possible combination of pairs of genes and then the SD of the ∆Cq (Table A1). The candidate reference genes are then ranked according to the calculated mean SD.

Statistics
Statistical significance between two groups was examined by applying student's t-test using the Holm-Sidak method, with p ≤ 0.05 considered statistically significant. For more than two different groups, the statistical significance was examined by applying a one-way ANOVA using Sidak's multiple comparison test. Statistical analysis was performed using GraphPad Prism 8.1.0 (GraphPad Software Inc., San Diego, CA, USA). Funding: This research is funded by AO Foundation, AO CMF and ETH Zürich Foundation.

Acknowledgments:
The authors would like to acknowledge Flavio Linardi for the MeHA synthesis, Patrick Westermann (SIAF Davos) for his technical support with the Agilent TapeStation system and Keith Thompson for the insightful discussions.

Conflicts of Interest:
The authors declare no conflict of interest.