ALCAM (CD166) as a gene expression marker for human mesenchymal stromal cell characterisation

Background Human mesenchymal stromal cells (MSCs) share positive expression of the International Society for Cell and Gene Therapy (ISCT) markers CD73, CD90 and CD105 with fibroblasts. Fibroblasts are often co-isolated as an unwanted by-product from biopsies and can rapidly overgrow the MSCs in culture. Many other surface markers have been proposed, though no unique MSC specific marker has been identified yet. Quantitative PCR (qPCR) is a precise, efficient and rapid method for gene expression analysis and was therefore used to identify a marker suitable for accurate MSC characterisation.

Methods and results Two commercially obtained bone marrow (BM) derived MSCs and an hTERT immortalised BM-MSC line (MSC-TERT) were cultured for different numbers of days and at different oxygen levels before RNA extraction. Together with RNA samples previously extracted from umbilical cord derived MSCs and MSC-TERT cells cultured in 2D or 3D, this heterogeneous sample set was quantitatively analysed for the expression levels of 18 candidate MSC marker genes. The expression levels in MSCs were compared with those in fibroblasts to verify the ability of these genes to discriminate between MSCs and fibroblasts. None of the ISCT markers could differentiate between fibroblasts and MSCs. A total of six other genes (ALCAM, CLIC1, EDIL3, EPHA2, NECTIN2, and TMEM47) were identified as possible biomarkers for accurate identification of MSCs.

Conclusion Based on considerations of expression level, reliability and specificity, Activated-Leukocyte Cell Adhesion Molecule (ALCAM) was the best candidate for improving the biomarker set for MSC identification.


Introduction
Mesenchymal stromal cells (MSCs) are a valuable cell type in regenerative medicine because of their ease of isolation and multipotency. They can be isolated from virtually every organ or tissue in the postnatal body (da Silva Meirelles, 2006) and be differentiated in vitro into several cell types (Caplan, 2017). MSCs are traditionally defined by: 1) the ability to adhere to plastic, 2) tri-lineage differentiation potential, and 3) a surface marker profile that is CD105+, CD90+ and CD73+, and CD45−, CD34−, CD14− or CD11b−, CD79α− or CD19− and HLA-DR−. Since the publication of these minimal criteria to define MSCs in 2006 (Dominici et al., 2006), both the acronym and the hMSC criteria have come under debate (Boregowda et al., 2018;Caplan, 2017;Robey, 2017). This discussion is partially based on inconsistent or even contradictory research results (Zhang et al., 2015), probably due to a lack of uniformity in nomenclature, the absence of a reference cell type and/or the lack of information on the process of generating MSCs (Reger and Prockop, 2014;Viswanathan et al., 2014). Furthermore, fibroblasts, a mature mesenchymal cell type particularly abundant in connective tissues, share phenotypic expression of CD90 (Stern, 1973;Walsh and Ritter, 1981), CD73 and CD105 (Alt et al., 2011) with MSCs (Halfon et al., 2011). These fibroblasts are frequently co-isolated when establishing primary cell cultures and can overgrow a cell culture rapidly (Linge et al., 1989). Therefore, confirming that a culture consists of genuine MSCs and not fibroblasts is a prerequisite. Instead of using phenotypic analysis, gene expression profiling of cells could be a better approach to characterise the cells under investigation and confirm their MSC identity. Phenotypical evaluation of MSCs is mostly performed by FACS analysis, which could therefore be considered the gold standard. Nevertheless, FACS results only identify the number of cells in a sample that express a phenotypic marker in a fairly binary way.
Additionally, gene expression can be reliably measured by quantitative PCR (qPCR). Cells with the same phenotypic profile could therefore be distinguished by their transcriptomic profile. In a previous study (Zhang et al., 2019), we studied bone marrow derived MSCs (BM-MSCs) cultured on surfaces with varying topography (flat versus fibrous) and chemistry (aminated versus pristine). RNA-Seq data from these cultures were used to generate their transcriptomic profiles and identify the effect of topography and chemistry on the expression of 177 previously reported MSC markers (several are reviewed in (Lv et al., 2014;Uder et al., 2018)). The gene expressions of these markers were processed through network analysis to determine the optimal cluster distribution, being organized into 4 clusters to achieve the optimal network integrity (Zhang et al., 2019). From these clusters we selected several genes to identify MSC specific gene expression biomarkers. In another publication (Brinkhof et al., 2018), we identified several reference genes suitable for gene expression normalisation after umbilical cord derived MSCs (UC-MSCs) and BM-MSCs were cultured in 2D on tissue culture plate (TCP) or 3D on scaffolds. A selection of these previously isolated RNA samples (Brinkhof et al., 2018) has been used to further identify genes stably expressed in MSCs depending on topography. In addition to these samples, RNA has been extracted from two commercially obtained primary BM-MSCs, hTERT immortalised MSCs (MSC-hTERT) and fibroblasts, cultured at different oxygen levels. Together with the previously extracted samples (Brinkhof et al., 2018), these newly isolated samples have been screened for the expression levels of selected marker genes to reliably characterise hMSCs and enable differentiation from fibroblasts. Amongst these tested genes, ALCAM was identified as upregulated in all MSC sample groups compared to fibroblasts. 
ALCAM expression levels correlated positively with those of ENG (CD105), though ALCAM was more specific for MSCs.

Cell culture
Two primary bone marrow derived hMSC lines and one hTERT-immortalised bone marrow derived hMSC line were acquired from Lonza (referred to as MSC-L, PT-2501, Slough, UK), PromoCell (referred to as MSC-P, C-12974, Heidelberg, Germany), and a collaborating laboratory, where the line was prepared using the method described in (Mihara et al., 2003) (referred to as MSC-T), respectively. The establishment of human umbilical cord derived MSCs (referred to as MSC-U) has been described in detail before (Brinkhof et al., 2018). Human dermal fibroblasts (hDF) were acquired from ThermoFisher (C0135C, Hemel Hempstead, UK). Cells were cultured in incubators maintained at 5% CO2 in air. Cells under hypoxia exposure (O2 levels are indicated in Supplemental Table 1) were cultured in a Hypoxystation-H35 (Don Whitley Scientific, Bingley, UK) supplemented with 5% CO2. Cells were transferred into the Hypoxystation immediately after seeding. Cells were cultured in a serum-free, xeno-free medium, MSCs NutriStem (Biological Industries, Cromwell, USA), which was changed every 3 days. Cells were cultured in 6-well cell culture plates (Costar, ThermoFisher) at a seeding density of 5000 cells/cm2. Passages 4-6 were used for the two sources of primary hMSCs, passages 6-10 for the immortalised hMSCs and passages 10-12 for hDF. Further sample details can be found in Supplemental Table 1.

RNA extraction and cDNA conversion
RNA extraction from the MSC-U samples and some of the MSC-T samples cultured in 2D (tissue culture plate) and 3D (fibrinogen scaffolds or polycaprolactone-poly[N-isopropylacrylamide] beads) used in this study has been described previously (Brinkhof et al., 2018) (Supplemental Table 1). The hMSCs and hDFs cultured under different oxygen levels (Supplemental Table 1) were harvested with trypsin solution (59418C, Sigma-Aldrich, Dorset, UK) and collected into a pellet. The cell pellet was resuspended in Trizol (11596-018, ThermoFisher). Samples that were not processed immediately were stored at −80°C for future extraction. For RNA extraction, 1-bromo-3-chloropropane (B9673, Sigma-Aldrich) was added to the mix, which was incubated and centrifuged, and the upper layer, containing the RNA, was transferred to a new tube. To purify the RNA, a mixture of phenol-chloroform-isoamyl alcohol (77619, Sigma-Aldrich) was added to the solution, which was incubated and centrifuged, and the upper phase was transferred to a new tube. To precipitate the RNA, 2-propanol (I9516, Sigma-Aldrich) was added, the sample was incubated and centrifuged, and the supernatant was discarded. RNA pellets were washed in 75% ethanol (ThermoFisher) and air-dried before resuspending in H2O (ThermoFisher). RNA concentration was measured using the Nanodrop One (ThermoFisher) and quality was assessed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). A detailed procedure for the RNA extraction can be found in (Brinkhof et al., 2006). RNA was stored at −80°C or used immediately for reverse transcriptase reactions. To generate cDNA, 1 μg of RNA was used for initial elimination of genomic DNA (QuantiTect Reverse Transcription Kit, Qiagen, Manchester, UK) in a 14 μl reaction volume. The genomic DNA elimination reaction was performed at 42°C for 5 min. Subsequently, a mixture containing Reverse Transcriptase (RT), a mix of oligo-dT and random primers, and RT buffer was added to a final volume of 20 μl. The reverse-transcription reaction was performed at 42°C for 30 min followed by an inactivation step at 95°C for 5 min. All procedures were performed per the manufacturer's protocol (Qiagen) in a Rotor-Gene 6000 (Corbett Research, Mortlake, Australia). All cDNA samples were stored at −20°C until further use.

Quantitative PCR and data analysis
All cDNA samples were measured in duplicate in a 96 well plate covered with adhesive seals. To fit all samples, two plates (A and B) were used per gene and both plates contained standards. These standards were generated by diluting an MSC sample 5-fold until S7. Plate A contained 40 samples and plate B the remaining samples. For all measurements 1 μl cDNA template per 20 μl final reaction volume was used on an Applied Biosystems StepOnePlus Real-time PCR system (Applied Biosystems, Warrington, UK) based on the SyGreen intercalating dye and a passive reference ROX (PCR Biosystems, London, UK). All primers had a final concentration of 400 nM each. Reactions started with 3 min at 95°C, followed by 40 cycles of 15 s at 95°C and 30 s at Tm. This reaction was followed by a melting curve, stepwise increasing temperature each 15 s by 0.5°C, ranging from 65°C to 95°C. Recommended Tm was used for previously published primers or optimal gene specific Tm was determined using a temperature gradient for newly developed primer sets using the same standards as for the actual measurements (Table 1). LinRegPCR (Ramakers et al., 2003) version 2016.1 was used for baseline correction (Ruijter et al., 2009) and quantification cycle (Cq) values were loaded into qBase Plus (Hellemans et al., 2007) version 3.2 for relative quantity and correlation (Pearson and Spearman) analysis. After amplification efficiency determination the five primer sets selected for gene expression normalisation (PPIA, PUM1, TBP, TFRC, and YWHAZ) were analysed for their suitability as reference gene using geNorm (Vandesompele et al., 2002).
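The amplification efficiency of a primer set can be estimated from a serial dilution such as the 5-fold standards described above. The following is a minimal Python sketch, assuming a conventional standard-curve fit (Cq against log10 of relative input) rather than the exact routines implemented in LinRegPCR or qBase Plus. The function name and the Cq values are hypothetical and are not data from this study.

```python
import math


def amplification_efficiency(cq_values, dilution_factor=5.0):
    """Estimate amplification efficiency from a serial dilution series.

    cq_values: Cq of standards S1..Sn, each diluted `dilution_factor`-fold
    relative to the previous one (assumed layout, S1 = undiluted).
    Returns (efficiency_base, r_squared); a base of 2.0 means perfect
    doubling per cycle, i.e. 100% efficiency.
    """
    # log10 of the relative input quantity: S1 = 0, S2 = -log10(5), ...
    xs = [-i * math.log10(dilution_factor) for i in range(len(cq_values))]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in cq_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx                      # least-squares slope of Cq vs log10(input)
    r_squared = sxy ** 2 / (sxx * syy)     # squared Pearson correlation of the fit
    efficiency_base = 10 ** (-1.0 / slope)
    return efficiency_base, r_squared


if __name__ == "__main__":
    # Hypothetical ideal series: every 5-fold dilution adds log2(5) cycles.
    ideal = [20.0 + i * math.log2(5.0) for i in range(7)]
    base, r2 = amplification_efficiency(ideal)
    print(f"efficiency base = {base:.3f} ({(base - 1) * 100:.1f}%), r2 = {r2:.3f}")
```

Under this convention an efficiency base of 2.0 corresponds to 100%, so a base of 1.927 corresponds to 92.7%.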

Statistical analysis
All statistical calculations are described in detail in the cited manuscripts (Hellemans et al., 2007;Ramakers et al., 2003;Ruijter et al., 2009;Vandesompele et al., 2002). Relative quantities per sample as calculated in qBase Plus were exported to Excel for further statistical analysis. The geometric mean and the standard error of the mean (SEM) were calculated per cell type (hDF, MSC-L, MSC-P, MSC-T, and MSC-U). Student's t-test was used to identify statistically different gene expression levels (p < 0.05) between the cell types. For the relative expression figures in the manuscript all data have been normalised to fibroblasts. The original relative expression data with their SEM indicated are shown in Supplemental Fig. 1.
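The per-group summary and comparison described above can be sketched in Python as follows. This is an illustration only: the text does not state whether the SEM and the t-test were computed on raw or log-transformed quantities, so the choice to compute the SEM on raw values and the t statistic on log2 values is an assumption, and all function names and numbers are hypothetical.

```python
import math
import statistics


def fold_change_vs_reference(group, reference):
    """Geometric mean of `group` normalised to the geometric mean of `reference`."""
    return statistics.geometric_mean(group) / statistics.geometric_mean(reference)


def standard_error(values):
    """Standard error of the mean on the raw scale (assumed scale)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def student_t_statistic(a, b):
    """Pooled-variance two-sample t statistic on log2 values (assumed scale).

    Returns (t, degrees_of_freedom); the p-value would be looked up from the
    t distribution with a statistics package.
    """
    la = [math.log2(v) for v in a]
    lb = [math.log2(v) for v in b]
    na, nb = len(la), len(lb)
    pooled_var = ((na - 1) * statistics.variance(la)
                  + (nb - 1) * statistics.variance(lb)) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(la) - statistics.mean(lb)) / std_err
    return t, na + nb - 2


if __name__ == "__main__":
    hdf = [1.0, 2.0, 4.0]    # hypothetical relative quantities, fibroblasts
    msc = [4.0, 8.0, 16.0]   # hypothetical relative quantities, one MSC group
    print("fold change vs hDF:", fold_change_vs_reference(msc, hdf))
    print("SEM (MSC group):", standard_error(msc))
    print("t, df:", student_t_statistic(msc, hdf))
```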

Gene expression qualification
MSCs cultured under various conditions such as topography (2D vs. 3D) (Brinkhof et al., 2018) and oxygen levels were analysed for their gene expression levels. When establishing and culturing MSCs, fibroblasts are the most frequent contaminating cell type. Therefore, it is important to distinguish genuine MSCs from fibroblasts. After baseline determination (Ruijter et al., 2009), Cq-values were exported to Excel for further analysis using qBase Plus (Hellemans et al., 2007). Amplification efficiencies were determined for 25 genes using Cq-values from the standards run on each plate. All efficiencies were between 92.7% and 107.0% (1.927-2.070) with regression r² ≥ 0.99 (Table 2), indicating good sample quality and qPCR reaction.
A selection of five genes (YWHAZ, TFRC, TBP, PUM1, and PPIA) has been made from a panel of 12 previously-validated candidate reference genes (Brinkhof et al., 2018) to analyse for their suitability as gene expression normalisers in this sample set. GeNorm analysis indicated the optimal number of reference genes was three (V-value, Fig. 1A) and the advised gene targets for normalisation were TBP, YWHAZ, and PPIA with a medium reference gene stability (0.5 < M-value < 1.0) (Fig. 1B). Since the sample set consisted of several MSC cell lines and fibroblasts this stability was expected from such a heterogeneous sample set (Hellemans et al., 2007). Further analysis was performed selecting the advised genes (TBP, YWHAZ, and PPIA) as normalisers.
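The reference gene stability measure and the subsequent normalisation can be illustrated with a short Python sketch. It follows the published definitions as commonly stated (M as the mean standard deviation of pairwise log2 expression ratios, and the normalised relative quantity as the efficiency-corrected quantity of the target divided by the geometric mean of the reference gene quantities). It is a simplified sketch, not the geNorm or qBase Plus implementation: it omits the stepwise exclusion of the least stable gene and the V-value, and all names and numbers are hypothetical.

```python
import math
import statistics


def genorm_m_values(quantities):
    """Stability measure M per candidate reference gene.

    quantities: dict mapping gene name -> list of relative quantities,
    with the same sample order for every gene.
    """
    m_values = {}
    for gene, values in quantities.items():
        pairwise_sd = []
        for other, other_values in quantities.items():
            if other == gene:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(values, other_values)]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[gene] = statistics.mean(pairwise_sd)  # lower M = more stable
    return m_values


def normalised_relative_quantity(target, references):
    """Efficiency-corrected quantity of a target, normalised to reference genes.

    target and each entry of references: (efficiency_base, cq_calibrator, cq_sample).
    """
    def relative_quantity(efficiency_base, cq_calibrator, cq_sample):
        return efficiency_base ** (cq_calibrator - cq_sample)

    reference_rqs = [relative_quantity(*ref) for ref in references]
    return relative_quantity(*target) / statistics.geometric_mean(reference_rqs)


if __name__ == "__main__":
    # Hypothetical quantities for three candidate genes in four samples.
    candidates = {
        "REF_A": [1.0, 2.0, 4.0, 8.0],
        "REF_B": [2.0, 4.0, 8.0, 16.0],   # constant ratio to REF_A
        "REF_C": [1.0, 1.0, 1.0, 1.0],    # varies relative to both
    }
    print(genorm_m_values(candidates))
    print(normalised_relative_quantity(
        (2.0, 22.0, 20.0), [(2.0, 19.0, 18.0), (2.0, 26.0, 25.0)]))
```

In this toy example REF_A and REF_B receive the lowest M because their ratio is constant across samples, which is the property the stability measure rewards.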

ISCT MSC gene expression analysis
Statistical analysis of the common MSC markers 5′-Nucleotidase Ecto NT5E (CD73), Cell Surface Antigen THY1 (CD90), and Endoglin ENG (CD105) indicated that the expression of NT5E was significantly higher in BM-MSCs (MSC-L, MSC-P, and MSC-T) than in fibroblasts (hDF), whereas the MSC-U showed no significant difference in expression level (Fig. 2A). For THY1, the MSC-U expression level was almost 4.5-fold higher than in fibroblasts, whereas the expression in BM-MSCs was more than 2-fold lower (Fig. 2B). ENG gene expression levels in MSC-P and MSC-U were similar to those in fibroblasts, whereas the expression levels in MSC-L and MSC-T were increased by 5.7-fold and 2.2-fold, respectively (Fig. 2C). Together, these results indicate that these ISCT markers (Dominici et al., 2006) are not very suitable as general MSC specific markers when using gene expression analysis, in particular when comparing with fibroblasts (Alt et al., 2011;Halfon et al., 2011). Another 18 genes, including TFRC as it has also been suggested as an MSC selection marker (Jeong et al., 2007;Zuk et al., 2002), have been analysed for their expression levels in the MSCs and fibroblasts (Supplemental Fig. 1).

ALCAM expression and its feasibility as a biomarker in MSCs
Activated leukocyte cell adhesion molecule (ALCAM, a.k.a. CD166), a type-I transmembrane protein belonging to the immunoglobulin superfamily (Bowen et al., 1995), was the only tested gene to be expressed at higher levels in both BM-MSCs and UC-MSCs than in fibroblasts (Fig. 3A). The ALCAM protein has been identified as a possible human MSC surface marker (Bruder et al., 1998;Mareschi et al., 2009;Mildmay-White and Khan, 2017), although its role in MSCs seems to be undetermined (Moraes et al., 2016). Cells expressing CD166 on their membrane are reported to have favourable chondrogenic differentiation capacity (Jonitz et al., 2011). ALCAM has also been implicated in various pathologies such as multiple sclerosis (Wagner et al., 2014), heart disease (Iolyeva et al., 2013) and cancer (von Lersner et al., 2019;Yavuz et al., 2018). The molecule is also present in hematopoietic stem cells (Jeannet et al., 2013), cancer stem cells (Manhas et al., 2016) and intestinal stem cells (Wang et al., 2013). On an mRNA level, ALCAM has only been identified as a human MSC marker in a few papers. In these papers, human BM-MSCs were described as being positive for ALCAM gene expression (Rallapalli et al., 2009). Its expression was analysed in MSC-like progenitors (MPC) derived from mild and severe osteoarthritic tibial plateaus without significant differential expression (Mazor et al., 2017). In a report comparing ALCAM expression in UC and dental pulp (DP) derived MSCs, qPCR indicated an 8-fold higher expression of the gene in UC-MSCs (Kang et al., 2016). A comparison of BM-MSCs and adipose derived MSCs (AD-MSCs) showed no significant differential ALCAM expression (Winkler et al., 2016). Microarray gene expression analysis revealed similar ALCAM levels in BM-MSC and osteosarcoma (OS) derived MSCs (Brune et al., 2011) and placenta (PL) derived MSCs (Brooke et al., 2008). When single MSCs from unexpanded bone biopsies from healthy donors and multiple myeloma (MM) patients were compared, ALCAM was expressed at similar levels, though in only 80% of the cells (Mehdi et al., 2019). Our data, together with these previously published results, indicate stable and consistent ALCAM expression amongst MSCs regardless of the tissue from which they were originally derived. In other species ALCAM is also used as a gene expression marker for MSCs (Calloni et al., 2014;Kovac et al., 2017). In horses, ALCAM expression levels were similar to ENG and were not differentially expressed between BM and AD derived MSCs (Ranera et al., 2011). Cultures of porcine amniotic membrane derived MSCs showed a reduction in ALCAM expression after passage 3 (Lange-Consiglio et al., 2015).
Rabbit (rb) amniotic fluid (AF) derived MSCs (Kovac et al., 2017) were also positive for ALCAM gene expression whereas rbBM-MSCs were negative (Jin et al., 2014). The heterogeneous sample set used in this study represents different time points, culture dimensions and oxygen levels during culture (Supplemental Table 1). The stable and high expression of ALCAM across this set indicates that this gene could serve as a robust marker for MSCs in gene expression analysis. Further analysis indicated a strong correlation with several other genes (Fig. 3B, Supplemental Table 2). Amongst these was ENG, suggesting ALCAM could replace ENG in a panel of genes for the accurate identification and characterisation of MSCs in general, in particular since ALCAM is expressed at higher levels in all tested MSCs than in the fibroblasts (Fig. 3), which is not the case for ENG (Fig. 2C). In previous work, it has already been suggested that other genes would be more specific in representing the mesenchymal signature than THY1 (Roson-Burgo et al., 2016). Future studies need to be designed to provide further evidence on whether ALCAM is preferable as a replacement for THY1 and ENG.
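The Pearson and Spearman correlation analysis mentioned in the methods and used for the gene-gene correlations above can be sketched in plain Python. This is a generic illustration of the two coefficients, not the qBase Plus implementation; the function names and the example values are hypothetical and are not expression data from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical relative quantities
    gene_b = [1.0, 4.0, 9.0, 16.0, 25.0]    # monotonic but non-linear in gene_a
    print("Pearson:", pearson(gene_a, gene_b))
    print("Spearman:", spearman(gene_a, gene_b))
```

Reporting both coefficients is useful because Spearman captures any monotonic relationship between two genes, whereas Pearson is sensitive to how linear that relationship is.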

CLIC1 as a BM-MSC biomarker
Chloride intracellular channel 1 (CLIC1) was originally identified as NCC27 (Valenzuela et al., 1997). Its expression varies depending on cell type, with a distribution ranging from intracellular vesicular to intranuclear (Ashley, 2003;Liao and Chang, 2012). MSCs differentiated into osteoblasts showed increased CLIC1 expression, whereas adipogenic differentiation abolished CLIC1 expression (Yang et al., 2009). In our study, CLIC1 was expressed at higher levels in BM-MSCs than in MSC-U or fibroblasts (Fig. 4A). Relative expression levels for CLIC1 were very similar to those for NT5E (Fig. 2A), which confirms our previous finding that these genes belong to the same genetic cluster (Zhang et al., 2019). Expression differences between fibroblasts and BM-MSC samples were greater for NT5E than for CLIC1, indicating a preferential use of NT5E for BM-MSC identification. Its paralog, CLIC4, correlated with ALCAM (Fig. 3B, Supplemental Table 2), though it was not significantly different from fibroblasts for all MSC groups (Supplemental Fig. 1).

[Figure legend: For each graph, different letters indicate significant (p < 0.05) differences.]
B. Brinkhof, et al. Gene: X 5 (2020) 100031

Gene expression of EDIL3 and TMEM47 could aid in hMSC identification
Two other genes correlated to ALCAM were EDIL3 and TMEM47 (Fig. 3B). The integrin ligand EGF Like Repeats And Discoidin Domains 3, encoded by EDIL3 (a.k.a. DEL1), is a protein that plays an important role in mediating angiogenesis and is regulated upon hypoxia or vascular injury (Ho et al., 2004;Penta et al., 1999). It promotes adhesion of endothelial cells through interaction with the alpha-v/beta-3 integrin receptor (Hidai et al., 1998;Penta et al., 1999). Dermal MSCs from psoriatic skin lesions contained higher levels of EDIL3 mRNA as well as protein compared to their healthy counterparts (Niu et al., 2016). The Transmembrane Protein 47 gene, TMEM47, encodes a member of the PMP22/EMP/claudin protein family. The protein, localized to the ER and plasma membrane, regulates cell junction organization in epithelial cells (Christophe-Hobertus et al., 2001). Using microarray studies, the gene has been detected in MSCs and fibroblasts (Jaager et al., 2012;Roson-Burgo et al., 2016). Although expressed at higher levels in all MSC samples than in fibroblasts, neither EDIL3 nor TMEM47 was significantly different from fibroblasts in the MSC-P samples (Fig. 4B and C). Expression of EDIL3 in MSC-P was high at the beginning (day 0, day 1) and end (day 7) of culture and fell even below the fibroblast levels after 3 and 5 days in culture.

ITGA1 as a potential negative biomarker for BM-MSCs
Integrin alpha 1 (ITGA1), a.k.a. the very late activation protein VLA1 or CD49a, associates with the beta-1 chain (ITGB1) to form a heterodimer that functions as a dual laminin/collagen receptor in neural cells and hematopoietic cells (Briesewitz et al., 1993). Surface properties have been suggested to play a role in osteogenic differentiation of MSCs, increasing ITGA1 expression (Olivares-Navarrete et al., 2011). ITGA1 has been used to isolate BM-MSCs efficiently in previous studies (Deschaseaux and Charbord, 2000;Rider et al., 2007;Stewart et al., 2003), though fibroblasts do express ITGA1 as well (Gardner, 2014). In our study, BM-MSCs showed significantly lower expression levels of ITGA1 than fibroblasts or UC-MSCs (Fig. 5A), so ITGA1 could possibly be used as a negative marker in BM-MSC gene expression studies. It could also aid in the distinction between BM-MSCs and fibroblasts. Apart from not being significant for MSC-U, the relative expression profile of ITGA1 for all sample types was very similar to that of THY1 (compare Fig. 5A with Fig. 2B).

Positive biomarkers for UC-MSCs
The UC-MSC samples expressed Ephrin type-A receptor 2 (EPHA2) and the Nectin cell adhesion molecule NECTIN2 at higher levels than any of the other cell lines ( Fig. 5B and C et al., 2015), AD-MSCs and BM-MSCs (Holley et al., 2015). NECTIN2 is a single-pass type I membrane glycoprotein with two Ig-like C2-type domains and an Ig-like V-type domain. Also known as HveB or PRR2, the protein is mainly associated with virus entry into cells and is expressed on certain fibroblasts (Eberlé et al., 1995;Lopez et al., 2000;Warner et al., 1998). NECTIN2 has been reported to be expressed on MSCs (Spaggiari et al., 2006) involved in the activation of Natural Killer (NK) cells (Poggi and Giuliani, 2016) and subsequent lysis of the MSCs (Crop et al., 2011). A similar expression level profile was found for EPHA2 and NECTIN2, although, in contrast to ITGA1 and THY1, BM-MSCs were not significantly different from fibroblasts. Even TFRC, a candidate reference gene in MSCs (Brinkhof et al., 2018;Su et al., 2016) as well as a selection marker for MSCs (Jeong et al., 2007;Zuk et al., 2002), showed similar expression levels for the MSCs and fibroblasts as EPHA2 and NECTIN2 (Supplemental Fig. 1) though no consistent significant expression differences could be identified.

Conclusion
Mesenchymal stromal cells are a valuable cell type for regenerative medicine (RM) and tissue engineering (TE). MSCs can be easily extracted from several tissues in the body, e.g. extra-embryonic tissue, fat or bone marrow. Unfortunately, the majority of the extracted cells are often fibroblasts and the MSCs are only a small portion. To obtain sufficient cells for RM or TE, these MSCs need to be expanded in vitro. As fibroblasts can overgrow a cell culture rapidly, it is of utmost importance that these fibroblasts are eliminated from culture. The ISCT has proposed three minimal phenotypical markers to identify MSCs (Dominici et al., 2006). These markers are also expressed on fibroblasts. A transcriptomics approach could identify differences in expression for these ISCT markers or other genes previously identified as surface markers for MSCs. However, whole transcriptome analysis has not been feasible for routine practice in tissue culture laboratories. Instead, qPCR has remained one of the commonly used techniques in MSC studies. Here, we identified ALCAM as a candidate gene for the identification of genuine MSCs in contrast to fibroblasts. We also confirmed our previous finding of a positive correlation between ALCAM and ENG (CD105) expression (Zhang et al., 2019), indicating only one of these genes needs to be tested to confirm MSC identity. Its superior specificity, sensitivity and reliability favour the use of ALCAM over ENG. Additional genes that could be used are EDIL3 and TMEM47. Both CLIC1 and NT5E were more specific for BM-MSCs than UC-MSCs, and EPHA2 or NECTIN2 were more specific for UC-MSCs. These genes, in particular ALCAM, could aid in the characterisation of MSCs and distinguish them from fibroblasts in cell culture and, therefore, improve their application in tissue engineering and regenerative medicine.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.