Single-cell RNA sequencing of human femoral head in vivo

The homeostasis of bone metabolism depends on the coupling and precise regulation of various types of cells in bone tissue. However, the communication and interaction between bone tissue cells at the single-cell level remains poorly understood. Thus, we performed single-cell RNA sequencing (scRNA-seq) on the primary human femoral head tissue cells (FHTCs). Nine cell types were identified in 26,574 primary human FHTCs, including granulocytes, T cells, monocytes, B cells, red blood cells, osteoblastic lineage cells, endothelial cells, endothelial progenitor cells (EPCs) and plasmacytoid dendritic cells. We identified serine protease 23 (PRSS23) and matrix remodeling associated protein 8 (MXRA8) as novel bone metabolism-related genes. Additionally, we found that several subtypes of monocytes, T cells and B cells were related to bone metabolism. Cell-cell communication analysis showed that collagen, chemokine, transforming growth factor and their ligands have significant roles in the crosstalks between FHTCs. In particular, EPCs communicated with osteoblastic lineage cells closely via the "COL2A1-ITGB1" interaction pair. Collectively, this study provided an initial characterization of the cellular composition of the human FHTCs and the complex crosstalks between them at the single-cell level. It is a unique starting resource for in-depth insights into bone metabolism.

differentiation by mesenchymal stem/stromal cells [5]. Resting T cells have a protective role of bone [6], while activated T cells increase the production of receptor activator of NF-kappaB ligand (RANKL) and tumor necrosis factor alpha (TNF-α) to promote osteoclast formation and subsequent bone loss under inflammatory conditions [7]. B cells can regulate osteoclastogenesis by expressing osteoclast differentiation factor (ODF)/RANKL [8]. However, current strategies for bone study are based on whole cell population of bone by bulk sequencing of al the cells for bone tissue [9,10]. The approach ignores the heterogeneity between individual cells and lack the accuracy and resolution to characterize regulation and crosstalks between bone tissue cells.
Single-cell RNA sequencing (scRNA-seq) provides an opportunity to explore the heterogeneity of complex tissues and cell-to-cell interactions at high resolution [11,12]. Although flow cytometry is a prominent technique for categorizing cells, which can identify the single cell through the expression of both cell surface and (or) intracellular proteins, it has been limited to probing a few selected proteins [13,14]. Similarly, magnetic activated cell sorting (MACS) and immunohistochemistry (IHC) also have this limitation. And in situ hybridization (ISH) has been limited to probing a few selected RNAs. These single-cell approaches can only focus on information of the selected RNAs or proteins [13], while scRNA-seq can provide a broad characterization of the transcriptome profile. Besides, compared with the traditional bulk-RNA sequencing, scRNA-seq provides information of cellular biology at higher resolution and with more accuracy [15]. scRNA-seq has been successfully applied to reveal the transcriptional diversity of murine bone marrow-derived mesenchymal stem cells (BM-MSCs) [16], and to identify differential expression genes (DEGs) between human Wharton's jelly stem cells and human BM-MSCs [17]. However, the cellular composition of bone tissue cells and the crosstalks between them at single-cell resolution remains unknown.
Here, we applied scRNA-seq technology to characterize cellular heterogeneity at single-cell level in freshly isolated bone tissue cells from human femoral head. We identified serine protease 23 (PRSS23) and matrix remodeling associated protein 8 (MXRA8) as novel bone metabolism-related genes. Moreover, we defined distinct subtypes of monocytes, T and B cells in bone microenvironment. We further discussed their relationship with bone metabolism and re-constructed the communication networks of cells in human femoral head. We believe that the global single-cell profile of how different types of human femoral head tissue cells work together would promote our comprehensive understanding of bone metabolism, and provide some novel insights into the prevention and treatment of skeletal diseases, such as osteoporosis and osteoarthritis.

scRNA-seq analysis reveals distinct cell types in human femoral head
We performed scRNA-seq analyses on femoral head tissue cells from four human subjects ( Figure 1A). The gene expression profiles between samples have a strong correlation, suggesting that there is no obvious batch effect between samples (R > 0.96; Supplementary  Figure 1). After merging of the four datasets and QC, we obtained a cell-gene matrix of 26,574 cells, with an average of 1035 genes detected per cell ( Figure 1B). Then we clustered cells into 16 distinct clusters ( Figure 1C), and identified the cluster-specific markers ( Figure 1D).

Functional analyses and hub genes identification for DEGs of osteoblastic lineage cells
The osteoblastic lineage cells were a complex cell population which contained BM-MSCs, osteoblasts, osteocytes and chondrocytes, and we showed the expression of cell-specific markers by the violin plot (Supplementary Figure 2A).
To further study the biological functions of osteoblastic lineage cells, we performed GO and KEGG enrichment analyses based on the DEGs of osteoblastic lineage cells (Supplementary Tables 2, 3). GO enrichment analysis identified abundant terms related to bone metabolism, such as "extracellular structure organization", "extracellular matrix organization", "establishment of protein localization to organelle", "skeletal system development", and "ossification" (Supplementary Figure 2B). Several signal pathways related to bone AGING metabolism were revealed by KEGG enrichment analysis (Supplementary Figure 2B), such as "PI3K-Akt signaling pathway", "Rap1 signaling pathway", and "TGF-beta signaling pathway".
To identify the hub genes, which are genes with a high degree of connectivity, in the DEGs of osteoblastic lineage cells, a PPI network of DEGs was constructed (Supplementary Figure 2C), and the top 20 hub genes with a high degree of connectivity were detected ( Figure 2A). These top 20 hub genes were enriched in the process related to bone metabolism (Supplementary  Table 4 and Supplementary Figure 2D), such as "extracellular structure organization", "extracellular matrix organization", "ossification", "skeletal system development" and "osteoblast differentiation". We also detect seven significant modules in the PPI network (Figure 2B and Supplementary Figure 2E,  Supplementary Table 5). We used genes in the most significant module, module 1 (score = 19.097, with 32 nodes and 296 edges) for a GO enrichment analysis (Supplementary Figure 2D and Supplementary Table 6), and found that genes in module 1 were significantly related to extracellular structure, extracellular matrix, collagen fibril, ossification, skeletal system, etc. The biological process analysis of the top 16 genes in module 1 was shown in Supplementary Figure 2F.
Among the hub genes in the DEGs network and the hub genes in the most significant module (module 1) of the PPI network, we found that most of these genes were known to be related to bone metabolism (Supplementary Table 7). However, two genes, PRSS23 and MXRA8, were rarely reported to regulate bone metabolism. In addition, PRSS23 was highly expressed in osteoblast precursors and early osteoblasts (preosteoblast and undetermined osteoblast) ( Figure 2C), and MXRA8 was highly expressed in both BM-MSCs and osteoblasts ( Figure 2D). During the process of osteogenic differentiation by BM-MSCs in vitro, the expression levels of PRSS23 and MXRA8 were significantly increased ( Figure 2E). Therefore, we speculate that PRSS23 and MXRA8 may play important roles in bone metabolism.

Complex inter-cellular communication networks in human femoral head
We identified ligand-receptor pairs and molecular interactions among bone tissue cells (except granulocytes) ( Figure 8A). Cognate receptors with broadcast ligands were detected, demonstrating extensive communication between osteoblastic lineage cells and other types of cells ( Figure 8A, 8B). Our results suggested that chemokine, transforming growth factor and collagen had significant roles in inter-cellular communications ( Figure 8C). The "CXCL12-CXCR4" interaction pair played important role in the crosstalks between bone tissue cells. Previous studies have reported that the CXCL12/CXCR4 signaling was involved in the regulation of bone homeostasis [19][20][21]. Notably, the "COL2A1-ITGB1" interaction pair was significant in the crosstalk between EPCs and osteoblastic lineage cells ( Figure 8C). Additionally, compared with other cells, monocytes could act through COL1A1-/COL1A2-CD44 interaction pairs to perform closer cell communication (measured with interaction score) with osteoblastic lineage cells ( Figure 8C).

DISCUSSION
Bone is a complex tissue and undergoes modeling/ remodeling constantly during life [22]. Various types of cells are involved in the regulation of bone homeostasis, such as bone cells, stromal cell, immune cells, endothelial cells, which also communicate with each other closely [4]. Therefore, it is fundamental to study the cellular composition of the bone tissue cells and the crosstalks between them. In this study, we applied scRNA-seq analyses on freshly isolated bone tissue cells from human femoral head. We identified two novel bone metabolism-related genes, PRSS23 and MXRA8. We discovered several subtypes of immune cells (monocytes, T cells and B cells) that may be involved in the regulation of bone metabolism. Finally, the cell-cell communication analysis suggested complex inter-cellular communication networks among human femoral head tissue cells, and the close crosstalk between EPCs and osteoblastic lineage cells via the "COL2A1-ITGB1" interaction pair. Our results provided an initial systematic dissection of human femoral head tissue at single-cell resolution and a global single-cell profile of how different cells work together in human femoral head.
To avoid potential alternation of transcriptome profiles caused by in vitro operations (e.g. culturing) [23], we performed scRNA-seq on the freshly isolated primary femoral head tissue cells. In addition, we identified two novel bone metabolism-related genes, PRSS23 and MXRA8, by analyzing the PPI network constructed from DEGs in osteoblastic lineage cells, and showed that the expression of these two genes were significantly increased during in vitro osteogenic differentiation. Based on our recent scRNA-seq data of BM-MSCs [24] and osteoblasts [25], PRSS23 was highly expressed in the osteoblast precursors and early osteoblasts (preosteoblast and undetermined osteoblast). This result suggested that PRSS23 may promote the differentiation of BM-MSCs into osteoblasts. Previous studies reported that, in breast cancer cells, PRSS23 was co-expressed with estrogen receptor α (ERα), and PRSS23 knockdown may suppress estrogen-driven cell proliferation of breast cancer cells [26]. Since estrogens were highly significant for bone metabolism and maintaining bone mineral density (BMD) [27], we speculated that PRSS23 may regulate bone metabolism through affecting ERα gene expression. Additionally, PRSS23 have been reported to interact with TGFB signaling pathways [28], and TGFB signaling pathway was significant for bone metabolism [29]. Therefore, we speculated that PRSS23 may also regulate bone metabolism through mediating the TGFB signaling pathway. MXRA8 was highly expressed in both BM-MSCs and osteoblasts ( Figure 2D), suggesting that it may play a critical role in maintaining the activity and function of BM-MSCs and osteoblasts. Recent studies showed that MXRA8 was a lipid metabolism-related gene [30] and also related to the proliferation of growth plate chondrocytes [31]. Interestingly, MXRA8 is a cell adhesion molecule, as an entry mediator for arthritogenic alphaviruses [32], and arthritogenic alphaviruses would cause chronic musculoskeletal disease [33]. Taken together, PRSS23 and MXRA8 were likely related to bone metabolism in humans.
To explore inter-cellular interaction in human femoral head tissue cells, we constructed the inter-cellular communication networks in femoral head based on known ligand-receptor interactions. In the network, EPCs closely communicated with osteoblastic lineage cells via "COL2A1-ITGB1" interaction. Since COL2A1 is a known chondrogenic marker [41], we suspected that EPCs may regulate chondrogenesis of osteoblastic lineage cells through "COL2A1-ITGB1" interaction. For novel bone metabolism-related genes, PRSS23 and MXRA8, we did not find any significant crosstalks in the cell-cell communication analysis, which is probably due to the very limited knowledge about the functions of these two genes.
Despite interesting and novel findings in this initial comprehensive characterization of cells and their interactions in vivo in human femoral head at single cell level, our study may have some limitations. First, due to limited amount of data (only 691 osteoblastic lineage cells detected in our dataset), we were unable to further dissect subpopulations of osteoblastic lineage cells. This is mainly because we performed scRNA-seq on the primary femoral head tissue cells without any purification/enrichment procedures specifically for osteoblastic lineage cells. Also, the samples of this study were from subjects with osteoporosis or   osteopenia because appropriate bone samples can be obtained from these subjects during hip replacement therapy, and this might incur some bias in the cell subpopulation identification and proportion estimation compared with healthy individuals. Previous studies showed that the number and activity of osteoblastic lineage cells were significantly decreased in people with osteoporosis or osteopenia [42]. Therefore, we need to specifically isolate/enrich osteoblastic lineage cells from study subject with different health conditions in future studies. As the osteoblastic lineage cells are a heterogeneous cells population, a negative sorting approach can be adopted [43]. In addition, we were unable to detect osteoclasts in this study. This is as expected because the size of osteoclasts (150-200 µm in diameter) exceeds the upper limit of cell size (40 µm in diameter) compatible with the current 10x Genomics system, and thus osteoclasts were filtered out before the scRNA-seq library construction. In addition, osteoclasts are multinucleated cells with heterogeneous nuclei inside mature osteoclasts, and thus the current single-nucleus RNA-seq is not suitable for osteoclasts either [44]. In future studies, spatial transcriptomics may hopefully provide an opportunity to explore the cellular heterogeneity of osteoclasts and the relationship with bone metabolism [45]. In summary, our study characterized the cellular composition of the human femoral head tissue cells, and identified PRSS23 and MXRA8 as novel bone metabolism-related genes. The complex inter-cellular communication networks in human femoral head suggest that various types of cells are involved in the regulation of bone metabolism, and EPCs communicate with osteoblastic lineage cells closely via the "COL2A1-ITGB1" interaction pair. Our study provides a systematic dissection of human femoral head at the single-cell level, and shows the global single-cell profile of how different cells work together in human femoral head on the single-cell resolution, which is a unique resource for in-depth insights into bone metabolism.
In future studies, more subjects should be included to further dissect subpopulation of osteoblastic lineage cells, and to explore how different health states affect the bone metabolism and vice versa. Besides, by combining scRNA-seq with spatial transcriptomics [45] and scATAC-seq (a powerful tool to evaluate chromatin accessibility at the single-cell level) [46], we will aim to unveil the complicated crosstalk between bone tissue cells, and the gene regulatory network within/between them. In the meantime, deconvolution of the cellular heterogeneity of bone tissue cells in vivo in humans represents an important and necessary advancement step towards improving our understanding of bone physiological processes.

Study subjects
The study was approved by the Medical Ethics Committee of Xiangya Hospital of Central South University and written informed consent was obtained from all participants. The study subjects consisted of four Chinese subjects of Han ethnicity (detailed information of the study subjects provided in Supplementary Table 1), who underwent hip replacement surgery at Xiangya Hospital of Central South University. All the subjects were screened with a detailed questionnaire, medical history, physical examination, and measured for bone mineral density (BMD) before surgery. Subjects were excluded from the study if they had chronic diseases that may affect bone metabolism, including but not limited to renal failure, liver failure, diabetes mellitus, hematologic diseases, malabsorption syndrome, disorders of the thyroid/parathyroid, malignant tumors, ankylosing spondylitis, hyperprolactinemia, oophorectomy, or previous pathological fractures [47]. The femur head was collected from the patient during hip replacement surgery. The specimens were immediately stored at 4° C temporarily and transferred to the wet laboratory within 2 hours, where they were processed within 24 hours after delivery.

BMD measurement
BMD (g/cm 2 ) was measured by the duel energy x-ray absorptiometry (DXA) fan-beam bone densitometer (Hologic QDR 4500A, Hologic, Inc., Bedford, MA, USA) at the right hip (femoral neck and trochanter) and the lumbar spine (L1-L4). According to the World Health Organization (WHO) definition [48] and the BMD reference established for Chinese [49], subject with T-score ≤ -2.5 is clinically diagnosed as osteoporosis, while T-score between -2.5 and -1 as osteopenia, and T-score > -1.0 are considered healthy.

Isolation of bone tissue cells
Bone tissue cells were extracted from the femoral head specimens based on widely used dissociation protocols with a few adjustments [50,51]. First, femoral heads were washed three times with αMEM (Cat: SH30265.01, HyClone, USA) and dissected into small pieces of approximately 1-2 mm in diameter. Bone pieces (10 g wet weight) were placed into a 50 ml conical tube with 20 ml of 2 mg/ml collagenase type II (Cat: A004174-0001, Sangon Biotech, China) dissolved in αMEM with 100 U/ml Penicillin and 100μg/ml Streptomycin (Cat: 15140-122, Gibco, USA) and digested with gentle agitation for 25 minutes at 37° C. After that, the collagenase solution was aseptically removed and bone pieces were rinsed in 10 ml PBS for 3 times. Briefly, after five rounds of digestion, we combined the collagenase solutions from the last two rounds of digestion and filtered the solution through a 40 μm filter. Finally, we incubated the collected cells with red blood cell (RBC) lysis buffer (Cat: R1010, Solarbio, China) for 5 minutes and then washed it twice with PBS.

scRNA-seq library preparation and sequencing
scRNA-seq libraries were prepared using Single Cell 3' Library Gel Bead Kit V3 following the manufacturer's guidelines (https://support.10xgenomi cs.com/single-cell-gene-expression/library-prep/doc/ user-guide-chromium-single-cell-3-reagent-kits-userguide-v3-chemistry). Single cell 3' Libraries contain the P5 and P7 primers used in Illumina bridge amplification PCR. The 10x Barcode and Read 1 (primer site for sequencing read 1) were added to the molecules during the GEM-RT incubation. The P5 primer, Read 2 (primer site for sequencing read 2), Sample Index and P7 primer were added during library construction. The protocol was designed to support library construction from a wide range of cDNA amplification yields spanning from 2 ng to > 2 μg without modification. Finally, scRNA-seq libraries were sequenced on the Illumina Novaseq6000 platform with a sequencing depth of at least 100,000 reads per cell for a 150bp paired end (PE150) run.

Pre-processing of scRNA-seq data
The FASTQ files were mapped to the human transcriptome (GRCh38/hg38) using Cell Ranger 3.0 (https://support.10xgenomics.com/single-cell-geneexpression/software/pipelines/latest/what-is-cell-ranger ). To create Cell Ranger-compatible reference genomes, the references were rebuilt according to instructions from 10x Genomics (https://www.10xgenomics.com), which performs alignment, filtering, barcode counting and UMI counting. Finally, the digital gene expression matrix was generated. For quality control (QC), we used the R (version 3.6.1, https://www.r-project.org/) and Seurat R package (version 3.1, https://satijalab. org/seurat/) [52,53] to calculate the distribution of genes detected per cell and remove the cells in the top or the bottom 2% quantile. We also excluded cells in which more than 10% of the transcripts were attributed to mitochondrial genes.

Dimension reduction and cluster identification
To visualize and cluster the data, we selected top 2,000 most variable genes for principal-component analysis (PCA), and then, we used the first 20 principal-components (PCs) for visualization by t-Distributed Stochastic Neighbor Embedding (t-SNE) [54]. Next, we performed an unbiased graph-based method for clustering analysis using the first 20 PCs [55]. To identify differentially expressed genes (DEGs) between clusters, Wilcoxon rank-sum test was used to identify genes showing significantly higher levels of expression (false discovery rate (FDR) < 0.05) in a specific cluster compared to the other clusters.

Pathway enrichment analysis
To investigate the biological processes and signal pathways associated with cell type, we performed gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses for the genes that were identified as important DEGs for clusters (adjusted p value < 0.05), by using the clusterProfiler R package [56].

Protein-protein interaction (PPI) network, hub genes and module analysis
To identify the most significant gene among the DEGs in the context of functioning in gene networks, a PPI network of DEGs (selected with average log(Fold change) > 1.0, adjusted p value < 0.05) was constructed using the Search Tool for the Retrieval of Interacting Genes (STRING, http://string.embl.de/) [57]. The Cytoscape software (version 3.7.2) was applied to visualize and analyze the molecular interaction networks [58]. The hub genes of the PPI network were identified by cytoHubba (Cytoscape) [59]. The modules of the PPI network were selected by Molecular Complex Detection (MCODE) (Cytoscape) [60]. The biological analyses of hub genes were constructed using BiNGO (Cytoscape) [61].

Cell-cell communication analysis
To explore the potential interactions between cells in human femoral head, we used iTALK [62] to perform cell-cell communication analysis, which is an R toolkit for visualizing ligand-receptor-mediated inter-cellular interaction in scRNA-seq data. The product of average receptor expression and average ligand expression was calculated in each cell cluster to score the enriched receptor-ligand interactions.

Public datasets
We recently generated scRNA-seq datasets of human BM-MSCs and human osteoblasts [24,25], which can be accessed from GEO database (https://www.ncbi. nlm.nih.gov/geo/) [63] under the accession numbers of GSE147287 and GSE147390, respectively. In this study, we processed these datasets using the same parameters as described in our previous studies [24,25]. The gene expression profile of osteogenic differentiation by BM-MSCs in vitro was obtained from the GEO database with accession numbers GSE37558 [64]. And the data were log2 transformed and normalized using the quantile-normalization approach.
The data analyses of this study were divided into three parts, and illustrated in Figure 9.

Ethics approval
The study was approved by the Medical Ethics Committee of Xiangya Hospital of Central South University, and the IRB approval number is No. 201912315.

Consent to participate
Written informed consent was obtained from all participants.

Consent for publication
All authors gave their consent for publication.

Availability of data and material
The scRNA-seq data of human femoral head tissue cells from four human samples is available in the GEO database with accession numbers GSE169396. The scRNA-seq data of human BM-MSCs and human osteoblasts are available in the GEO database with accession numbers GSE147287 and GSE108891. The data of osteogenic differentiation by BM-MSCs in vitro was obtained from the GEO database with accession numbers GSE37558.  Tables   Please browse Full Text version to see the data of Supplementary Tables 2-4