Single-cell transcriptomic profiling and characterization of endothelial progenitor cells: new approach for finding novel markers

Endothelial progenitor cells (EPCs) are promising candidates for the cellular therapy of peripheral arterial and cardiovascular diseases. However, no specific marker or set of markers yet defines EPCs precisely. Herein, we propose a new in silico approach for finding novel EPC markers. We assembled five groups of chosen EPC-related genes/factors using PubMed literature and Gene Ontology databases. This short list of EPC factors was applied to a publicly available transcriptome matrix to compare their expression between endothelial colony-forming cells (ECFCs), HUVECs, and two adult endothelial cell types (ECs) from the skin and adipose tissue. Further, the list was used for functional enrichment on the Mouse Phenotype database and for protein-protein interaction network analyses. Moreover, we built a digital matrix of healthy donors’ PBMCs (33 thousand single-cell transcriptomes) and analyzed the expression of these EPC factors. Transcriptome analyses showed that BMP2, BMP4, and ephrinB2 were exclusively highly expressed in EPCs; the expression of neuropilin-1 and VEGF-C was significantly higher in EPCs and HUVECs compared with other ECs; Notch 1 was highly expressed in EPCs and skin-ECs; MIR21 was highly expressed in skin-ECs; PECAM-1 was significantly higher in EPCs and adipose ECs. Moreover, functional enrichment of EPC-related genes on the Mouse Phenotype and STRING protein databases revealed significant relations between the chosen EPC factors and endothelial and vascular functions, development, and morphogenesis, where ephrinB2, BMP2, and BMP4 were highly expressed in EPCs and were connected to abnormal vascular functions.
Single-cell RNA-sequencing analyses have revealed that among the EPC-regulated markers in transcriptome analyses, (i) ICAM1 and Endoglin were weakly expressed in the monocyte compartment of the peripheral blood; (ii) CD163 and CD36 were highly expressed in the CD14+ monocyte compartment whereas CSF1R was highly expressed in the CD16+ monocyte compartment; (iii) L-selectin and IL6R were globally expressed in the lymphoid/myeloid compartments; and (iv) interestingly, PLAUR/UPAR and NOTCH2 were highly expressed in both CD14+ and CD16+ monocytic compartments. The current study has identified novel EPC markers that could be used for better characterization of the EPC subpopulation in adult peripheral blood and subsequent use of EPCs for various cell therapy and regenerative medicine applications.


Background
Endothelial progenitor cells (EPCs) are a heterogeneous population of mononuclear cells (MNCs) that originate and reside in the bone marrow (BM); they circulate in (are mobilized to) adult peripheral blood (PB) or umbilical cord blood (UCB) [1]. EPCs were discovered by Asahara and coworkers in 1997 [2]. They express endothelial antigens like CD31, von Willebrand factor (vWF), endothelial nitric oxide synthase (eNOS), VE-cadherin, and VEGFR2 [3,4]. EPCs constitute 1-5% of the total BM cells and > 0.0001-0.01% of PB circulating MNCs [5]. They are implicated in homeostasis, neovascularization, vascular repair, endothelial regeneration, and angiogenesis processes [6]. There are two distinct subpopulations of EPCs. Early EPCs give rise to heterogeneous colonies that appear in culture after 3-5 days; they are obtained by negative selection on fibronectin; morphologically, they are round cells surrounded by spindle-shaped cells; they proliferate slowly, and their in vitro growth peak is reached after 2-3 weeks [7][8][9][10]. Moreover, early EPCs do not form vascular tubes in vitro, but they have a strong paracrine activity (they secrete a plethora of angiogenic factors) that contributes effectively to neovascularization [11,12]; they have high expression of both hematopoietic and endothelial markers (VEGFR-2, CD31, vWF, ability to take up acLDL and bind UEA-1) [13,14]; and they are most likely derived from hematopoietic stem cells and resemble myeloid progenitors [15], hence they are also named "hematopoietic EPCs" [16]. Early EPCs generate the endothelial cell colony-forming units (CFU-ECs) in vitro [8,17]. Interestingly, early EPCs [18] are also termed circulating angiogenic cells (CACs) [19].
On the other hand, the other subtype of EPCs is termed "late EPCs" [18]; they form more homogeneous colonies that appear after 2-4 weeks in culture, they are isolated by positive selection on collagen I, they are elongated cells that form a cobblestone-morphology monolayer in vitro which is characteristic of endothelial cells, they can be maintained in culture for ~12 weeks (up to 15 passages), and they have higher proliferative and clonogenic potential compared with early EPCs [12,17,20]. Moreover, late EPCs can easily form tubular/capillary-like structures in vitro, they possess high vasculogenic and angiogenic potential, and in vivo they can incorporate into the existing endothelium, where they form stable vessels and continue to differentiate into mature endothelial cells [17,21,22]. Notably, late EPCs are phenotypically similar to mature endothelium and are present/circulate in both PB and UCB; importantly, they are closer to endothelium not only phenotypically but also in exhibiting no expression of hematopoietic (CD45) or monocyte markers (CD14 and CD115), in contrast to early EPCs, whereas they express many endothelial cell (EC) antigens (CD31, VEGFR-2, CD105, CD144, CD146, vWF, CD34, higher eNOS, Tie-2, VE-cadherin, ability to take up acLDL and bind UEA-1) [22,23]. Collectively, late EPCs are termed "nonhematopoietic EPCs" [16,24], and thus they are considered the "EPCs" subtype that complies the most with the original endothelial phenotype and functions, that is, the legitimate endothelial progenitor cells bearing almost all of the endothelial cell characteristics [15]. Further, late EPCs generate in vitro "endothelial colony-forming cells or ECFCs" [25] and they are also called "outgrowth endothelial cells or OECs" [20,26].
The vast variation in the surface antigens reported for EPCs is possibly attributable to the identification of different EPC subpopulations at various maturation/differentiation phases. The term "EPCs" has been haphazardly used to refer to both circulating (late EPCs) and cultured cells (ECFCs). In addition, the accumulating literature has provided neither one consolidated definition of EPCs, nor a specific EPC phenotype, nor a unified protocol for their isolation and culture. Accordingly, the different isolation techniques and culturing methods applied have resulted in EPCs with various phenotypes [28]. Therefore, we aimed herein to use in silico data to identify a possible novel EPC marker or a combination of markers that could specifically characterize EPCs.
In the current manuscript, we are adding to the already ongoing efforts for the characterization analyses of EPCs by presenting a new approach for finding novel marker(s) of EPCs in peripheral blood.
The up-to-date "-omics" approach of "gene-expression profiling" or "transcriptomics" is currently the most widely used tool for the characterization and functional analysis of cells; moreover, transcriptomics has provided a better understanding of EPC characterization in an unbiased manner [28].
Large genomic datasets from large tissue sample collections are difficult to analyze; however, using individual transcriptomic data at the tissue-representing or "single-cell" level makes mass analysis of single-cell data fast and non-tedious [29,30] and thus introduces new insights into the ontogeny of new and rare cell types and the relationships between various cell lineages [31]. Collectively, single-cell transcriptomics would help herein to improve our knowledge of the identification and characterization of EPCs in peripheral blood.
Using Gene Ontology and literature survey, we assembled five groups of EPCs' molecules/factors/markers that have been specifically chosen for being of special interest and importance to the EPC biology.
Herein, our main objective is to search for novel markers of EPCs in peripheral blood. Thus, we created a short list, divided into five groups, of EPC factors/molecules using PubMed literature, Gene Ontology, and other sources. This list was used for both the transcriptomic and single-cell analyses. In the transcriptome analyses, the list was used to compare the relative expression of the EPC genes on this list between ECFCs, HUVECs, and two adult ECs from the skin and adipose tissue. Moreover, the chosen EPC genes were used for functional enrichment on the Mouse Phenotype database and the STRING protein-protein interaction network database to decipher the involvement of these factors in endothelial and vascular development and morphogenesis. Additionally, we built a digital matrix of healthy donors' PBMCs (33 thousand transcriptomes) and analyzed the expression of the short list of EPC factors, and more specifically of the EPC molecules that were shown to be significantly regulated between ECFCs and the other three adult ECs in the transcriptome analyses.
The current study has identified novel markers, which include secreted factors, miRNAs, and growth factors. Among these markers we have analyzed, some of them could be used for better cytometric analyses and an optimized characterization of EPC subpopulation in peripheral blood.

Semantic search for chosen factors implicated in recent endothelial progenitor cell biology field
A vast array of published research and literature on EPC physiology/pathophysiology was searched using the Gene Ontology and PubMed databases. This was followed by the selection and categorization of different factors (affecting various signaling cascades, molecular functions, and biological processes of EPCs) into five main groups of molecules/factors using a combination of keywords in the field of EPC biology. The five molecular sets are described in Table 1 with their related keywords. We chose sixty-one factors distributed as follows: group 1 (purple; 14 molecules), group 2 (green; 4 molecules), group 3 (red; 19 molecules), group 4 (blue; 9 molecules), and group 5 (brown; 15 molecules), as shown in Table 1.

Public datasets
ECFCs and mature ECs have already been studied by whole-transcriptome analysis in the Gene Expression Omnibus (GEO) dataset series GSE55695 [58]. In these experiments, ECFCs of the peripheral blood (ECFC-PB) were compared to different kinds of endothelial cells: adipose tissue-derived endothelial cells (EC-ADIPO), dermal microvascular endothelial cells (EC-skin), and human umbilical vein endothelial cells (HUVECs). The expression matrix, normalized by the quantile normalization method, was downloaded from the following web address: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE55695. In a second step, the normalized matrix was annotated with the corresponding GEO platform GPL10558 for the microarray technology used: Illumina HumanHT-12 V4.0 expression beadchip.
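For readers unfamiliar with quantile normalization, the idea can be sketched in a few lines of plain Python. This is only an illustration of the general technique on a toy matrix, not the procedure used to produce the GSE55695 matrix; the function name and the tie-handling rule (ties broken by row order) are assumptions made for this example.

```python
from statistics import mean


def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    matrix: list of rows, each row a list of floats (one value per sample).
    Ties within a sample are broken by row order (a simplification).
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    # Extract each sample (column) and sort it independently.
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean across samples at each rank.
    reference = [mean(sc[r] for sc in sorted_cols) for r in range(n_rows)]
    # Replace every value by the reference value at its within-sample rank.
    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized


# Toy example: 4 genes x 3 samples (values are arbitrary, not real data).
toy = [[5.0, 4.0, 3.0], [2.0, 1.0, 4.0], [3.0, 4.0, 6.0], [4.0, 2.0, 8.0]]
normalized_toy = quantile_normalize(toy)
```

After normalization every sample shares the same set of values, so differences between samples reflect rank changes of individual genes rather than global intensity shifts.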

Transcriptome analyses
Bioinformatics analyses were performed in the R software environment, version 3.4.1. Unsupervised principal component analysis was performed with the FactoMineR R package [59]. Molecule names from the semantic search described above on endothelial cells/EPCs (see Table 1) were converted to official human gene symbols with the HUGO database from the HUGO Gene Nomenclature Committee (HGNC consortium) [60]. The expression heatmap was generated with the R package made4 by using unsupervised classification with Euclidean distances [61]. The most variable genes between the transcriptomes of the four experimental groups (ECFC-PB, EC-ADIPO, EC-skin, and HUVECs) were defined by performing Fisher one-way analysis of variance (ANOVA) with 500 permutations in order to apply multiple-testing correction to the p values with the false discovery rate method in the genomic suite MeV version 4.9.0 [62]. Functional enrichment on the Mouse Phenotype database was performed with the ToppGene software suite [63]. The functional enrichment network was built with the Cytoscape standalone software, version 3.6.0 [64].
Table 1 Semantic determination of molecule sets related to EPC/EC biology. Sixty-one factors distributed as follows: group 1 (purple; 14 molecules), group 2 (green; 4 molecules), group 3 (red; 19 molecules), group 4 (blue; 9 molecules), and group 5 (brown; 15 molecules). The keywords used for each group of molecules differ slightly between the groups depending on the biological functions in which the various molecules/factors are involved. It has to be noted that VEGFR1, 2, and 3 are repeated in groups 1 and 5 as they are differently involved in the general molecular functions of each group.
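The gene-selection step (one-way ANOVA per gene, permutation-based p values, false discovery rate correction) can be illustrated with the following plain-Python sketch. It is a generic illustration under stated assumptions, not a reproduction of the MeV implementation: the function names are hypothetical, the permutation scheme simply shuffles sample labels, and the Benjamini-Hochberg procedure is assumed as the false discovery rate method.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of floats)."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    k, n = len(groups), len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_permutations=500, seed=0):
    """Permutation p value: share of label shuffles with F >= observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, `permutation_p_value` would be called once per gene with that gene's expression values split by cell type, and the resulting list of p values passed to `benjamini_hochberg`.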

Single-cell RNA-sequencing analyses
Transcriptomes of 33,000 peripheral blood mononuclear cells (PBMCs) from healthy donors, which are publicly available (10x Genomics, https://www.10xgenomics.com/solutions/single-cell/), were analyzed to assess the expression of the chosen EPC-related markers in peripheral blood as shown in Table 2. Sequencing reads were processed with the Cell Ranger demultiplexing solution, version 1.1.0. The Seurat algorithm version 2.3.0 [65] was used in the R software environment, version 3.4.3, to build a digital matrix of the transcriptomes and to perform subsequent clustering by combining principal component analysis and tSNE (t-distributed stochastic neighbor embedding) dimensionality reductions in order to project the quantification of the studied endothelial markers.
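Once cells have been assigned to clusters, the expression of a marker can be summarized as the fraction of positive cells per cluster. The sketch below shows this summary in plain Python; it is not the Seurat workflow, and the function name, the cluster labels, the counts, and the positivity threshold (count above zero) are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def fraction_positive_by_cluster(cluster_labels, counts, threshold=0):
    """Fraction of cells per cluster whose marker count exceeds `threshold`.

    cluster_labels: one cluster label per cell.
    counts: one marker count per cell, in the same order.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for label, count in zip(cluster_labels, counts):
        totals[label] += 1
        if count > threshold:
            positives[label] += 1
    return {label: positives[label] / totals[label] for label in totals}


# Toy input (invented values, not taken from the PBMC dataset).
labels = ["CD14_mono", "CD14_mono", "CD16_mono", "B", "B", "B"]
marker_counts = [3, 0, 1, 0, 0, 2]
summary = fraction_positive_by_cluster(labels, marker_counts)
```

A table of such fractions, one row per marker and one column per cluster, gives a compact numerical counterpart to the positive/negative coloring used on a tSNE map.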

Protein-protein interaction network
Molecular identifiers of the selected EPC markers were used to build a protein-protein interaction network with the STRING proteomic database [66]. A high-confidence interaction score above 800 was set to select interactions that were validated experimentally. The Network Analyst web tool [67] was used to perform functional inference with the biological process Gene Ontology database.
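The score-based filtering of interactions can be sketched as follows. This minimal example assumes an edge list of (protein A, protein B, combined score) tuples with scores on the 0-1000 scale used in STRING downloads; the function name and the toy edges are hypothetical, and the additional restriction to experimentally validated interactions is not modeled here.

```python
def filter_interactions(edges, min_score=800):
    """Keep undirected interactions whose score is above `min_score`.

    edges: iterable of (protein_a, protein_b, score) tuples.
    Returns (nodes, kept_edges); duplicate and self interactions are dropped.
    """
    kept = set()
    for a, b, score in edges:
        if a != b and score > min_score:
            # Store each undirected edge once, with a canonical ordering.
            kept.add((min(a, b), max(a, b)))
    nodes = {protein for edge in kept for protein in edge}
    return nodes, kept


# Toy edge list with placeholder protein names and invented scores.
toy_edges = [
    ("A", "B", 950),
    ("B", "C", 700),
    ("C", "A", 820),
    ("B", "A", 950),  # duplicate of A-B in the other direction
    ("D", "D", 999),  # self interaction, ignored
]
toy_nodes, toy_kept = filter_interactions(toy_edges)
```

Counting `len(toy_nodes)` and `len(toy_kept)` gives the node and edge totals of the filtered network, the same kind of summary reported for an interaction network.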

Statistical analysis
Statistical analysis was performed in the R software environment, version 3.4.1. Differences between groups were tested by Fisher one-way analysis of variance followed by the Tukey post hoc test. A significance threshold of p < 0.05 (alpha error) was applied in these analyses.
An overview of the experimental workflow undertaken in the current work is depicted in Fig. 5.
Table 2 Most significant EPC-related genes found by ANOVA between ECFCs and the other three types of endothelial cells: most variable EPC-related genes found to be significant by ANOVA between ECFCs (in peripheral blood) and three distinct groups of endothelial cells (HUVECs, adipose, and skin) from transcriptome dataset GSE55695. The table shows gene symbols with their respective Illumina identifiers, the ratio obtained from the Fisher statistics, and the corresponding p value corrected for multiple testing.
(Table 2). Unsupervised principal component analysis performed with the expression of these EPC-related genes significantly discriminated samples across the different experimental conditions (group discrimination based on the principal component map, p value = 0.000107, Fig. 1a).
Unsupervised classification (clusters of samples with Euclidean distances and complete method, Fig. 1b) was performed with these significant EPC-related genes confirming the stratification of the samples by their experimental conditions.
Significantly high levels of expression of BMP2, BMP4, and EFNB2 were found for ECFC-PB compared with the other three ECs (Fig. 1b). Moreover, significantly high levels of expression of MIR34A, NOTCH4, and SEMA3F were found for EC-ADIPO compared with other groups (Fig. 1b). Further, significantly high levels of expression of PDGFA and SEMA3A were found for EC-skin compared with other groups (Fig. 1b). The most significant gene found between the four types of cells was VEGF-C (vascular endothelial growth factor C; p = 0.001, Table 2), and VEGF-C was found to have a high level of expression specifically in HUVECs (Fig. 1b).
Functional enrichment of EPC-related genes on the Mouse Phenotype database revealed significant relations between these EPC-related genes and endothelial functions (Table 3). These relations were used to build a functional enrichment network (Fig. 1c): EFNB2, BMP2, and BMP4 were found to have a significantly high level of expression exclusively in ECFCs (Fig. 1b) and after functional enrichment were also found to be connected to several enriched endothelial phenotypes, which include abnormal arterial morphology, abnormal angiogenesis, and abnormal vascular development (Fig. 1c and Table 3).
Some EPC-related genes were also found to have a high level of expression shared between ECFCs and other types of endothelial cells. NRP1 (neuropilin 1) was found to share a high level of expression between ECFCs and HUVECs compared with other groups (ANOVA; p value = 0.0125, Fig. 2a) and especially compared with EC-skin (ANOVA; p value = 0.0104, Fig. 2a). Moreover, VEGF-C was found to share a high level of expression between ECFCs and HUVECs compared with other groups (ANOVA; p value = 0.00364, Fig. 2a) and especially compared with EC-skin. Further, some EPC-related genes also shared a high level of expression between ECFCs and EC-skin (Fig. 2b), which may contribute to clustering ECFCs and EC-skin as near neighbors on the expression heat map (Fig. 1b). NOTCH1 shared a significantly high level of expression in ECFCs and EC-skin (p value = 0.0073, Fig. 2b), more particularly compared with EC-ADIPO (p value = 0.0103, Fig. 2b) and also compared with HUVECs (p value = 0.0347, Fig. 2b).
MIR21 was also found to have a significantly high level of expression in EC-skin and ECFCs compared with other groups (p value = 0.0302, Fig. 2b) and more particularly compared with EC-ADIPO (p value = 0.0254, Fig. 2b). One molecule, PECAM1 (platelet and endothelial cell adhesion molecule 1), was found to share a significantly high level of expression between ECFCs and EC-ADIPO (p value = 0.00454, Fig. 2c), more particularly compared with HUVECs (p value = 0.00345, Fig. 2c). These results suggest that the chosen EPC molecules that we highlighted during the transcriptomic analyses between different types of endothelial cells are implicated in vascular development and could have an impact on the human endothelial phenotype because they are upregulated in these cells.

Peripheral blood mononuclear cells from healthy donors expressed EPC markers in different sub-compartments characterized by single-cell RNA sequencing
One of the current challenges in improving the isolation protocols and the yield of EPCs isolated from peripheral blood (PB) is upgrading the characterization of EPCs using specific new markers. In this regard, in order to improve the choice of markers for the EPC subpopulation, using publicly available single-cell RNA-sequencing experiments, we built a digital matrix of healthy donors' PBMCs (33,000 single-cell transcriptomes) and analyzed the expression of EPC markers curated from the literature (Table 1) and more particularly EPC markers/genes shown to be highly regulated between EPCs/ECFCs and other ECs from different tissues (Table 2). The Seurat algorithm identified five major cell populations after tSNE mathematical reduction (Fig. 3a): CD19+ cells (B lymphocytes), CD3E+ cells (general T lymphoid marker), Granzyme B cells (natural killer cells and cytotoxic T lymphocytes), CD16+ monocytes, and CD14+ monocytes. In peripheral blood, we assessed the molecular expression of endothelial markers like ICAM1 and ENG, which were at low levels in the monocyte compartment and more particularly in the CD14+ compartment for the ICAM1 expression (Fig. 3b). Other less endothelial-specific markers curated from the literature confirmed the involvement of the PB monocyte compartment as the source of ECs/EPCs, principally by the expression of CD163 and CD36 in the CD14+ monocyte compartment and also the expression of CSF1R (CD115) in the CD16+ monocyte compartment (Fig. 3c). These results suggest the potential implication of the EPC subpopulation in the monocyte sub-compartment; thus, with the help of the assessed markers, a better understanding of EPC heterogeneities could be achieved.
Fig. 1 ECFCs compared to other endothelial cells harbored a distinct expression profile implicated in abnormal vascular development. a An unsupervised principal component analysis was performed with regulated endothelial-related genes on dataset GSE55695 comparing ECFC_PB (ECFCs in peripheral blood) to distinct groups of endothelial cells (EC_ADIPO, EC_skin and HUVECs; the p value of group discrimination was calculated on the first principal axis). b Expression heatmap of endothelial-related genes performed on transcriptome samples from dataset GSE55695 (unsupervised classification was performed with Euclidean distances and the complete method). c Functional enrichment network performed with regulated endothelial-related genes in dataset GSE55695 after enrichment on the Mouse Phenotype database: circles represent genes; octagons represent enriched functions; blue edges represent link(s) between functions and enriched genes; the fill color, on a scale ranging from blue to red, is relative to the negative logarithm (base 10) of the p values obtained during the enrichment.
Some EPC genes curated from the literature harbored a mixed lympho/myeloid expression in PBMCs; this is the case for SELL (CD62L, selectin L) and IL6R, which have a high expression in the lympho/myeloid compartment (Fig. 3d). These two markers with elevated expression in the lympho/myeloid compartment, especially CD62L, could be of interest for better EPC characterization, where they could be used as pre-gating endothelial markers on the total population of PBMCs.
Interestingly, among the EPC markers that appeared in the transcriptomic analyses (Fig. 1 and Table 2), two were found to have a positive expression in PBMCs: PLAUR and NOTCH2 in the monocyte compartment (Fig. 3e), in both the CD14+ and CD16+ compartments, with a higher expression of PLAUR. Thus, PLAUR could also be used as an EPC marker.
All these results of single-cell RNA-sequencing obtained for EPC-related markers expressed in PBMCs would be useful to design multi-parametric flow cytometric analyses for optimal and better characterization of EPC subpopulation in the peripheral blood.

EPC markers inferred a molecular network which is implicated in morphogenesis and vascular development
Among the sixty-one EPC markers selected for the study (Table 1), forty-two were retained as seeds of the network (red nodes on the network, Fig. 4) by the STRING protein database with stringent parameters (interaction score over 800 and interactions validated experimentally). Building a protein-protein interaction network around these 42 seeds revealed a network comprising a total of 550 nodes with 1086 edges (Fig. 4). Functional inference on this interaction network with the Biological Process (Gene Ontology) database revealed an important involvement of these molecular partners in morphogenesis (Fig. 4, network and barplot) and also their implication in vascular development (blue nodes on the network and blue bar in the barplot, Fig. 4). These results confirmed that the EPC-related markers that we selected for this study could influence morphogenesis and vascular development processes.

Discussion
Since the discovery of endothelial progenitor cells (EPCs) more than two decades ago, there has been no definitive or globally agreed upon marker or group of markers for the specific molecular characterization of EPCs. Thus, in the current work, we propose a novel in silico approach for finding novel markers of EPCs. We investigated the importance of sixty-one EPC-affecting molecules/factors in EPCs and vascular biology; we conducted a semantic search for the chosen molecules/factors curated from the literature by querying the Gene Ontology and PubMed databases with different keywords (Fig. 5). Merging this list of EPC markers with a publicly available, annotated, normalized transcriptome matrix to compare the expression of the chosen EPC genes between ECFCs, HUVECs, and two adult ECs from the skin and adipose tissue revealed that BMP2, BMP4, and EFNB2 (Ephrin B2) have significantly higher expression in ECFCs compared with the other groups. Ligands of the erythropoietin-producing human hepatocellular carcinoma (Eph) receptors, such as Ephrin B2, are expressed by ECs [68] and EPCs [69], and they are important for embryonic angiogenesis, cellular adhesion, and migration [70]. Moreover, preconditioning EPCs with Ephrin B2 increases their angiogenic capacity in the hind limb model [71] and in wound healing [72].
Fig. 2 Regulated EPC-related genes sharing an elevated level of expression between ECFCs and the other three groups of endothelial cells. a Genes with a high level of expression shared between ECFCs and HUVECs. b Genes with a high level of expression shared between ECFCs and skin endothelial cells. c Genes with a high level of expression shared between ECFCs and adipose tissue endothelial cells. The p values were obtained with one-way ANOVA followed by the Tukey post hoc test for multiple comparisons.
Our transcriptomic analysis showed that both BMP2 and BMP4 are also upregulated in ECFCs. It has been demonstrated that both BMP2 and BMP4 were exclusively expressed by late EPCs (ECFCs) and that they are essential for the angiogenic potential of ECFCs [73]. Moreover, BMP4 is implicated in endothelial lineage differentiation of embryonic pluripotent cells [74,75].
Further, BMP2 could enhance the vasculogenic differentiation of ECFCs co-encapsulated with mesenchymal stromal cells in a synthetic scaffold [76]. Interestingly, the same three EPC molecules were the most significantly regulated genes in the mouse functional enrichment network. Collectively, this means that EFNB2, BMP2, and BMP4 are crucial for ECFC commitment to the endothelial lineage and that they are involved in the angiogenic capacity of ECFCs.
Fig. 3 Expression of selected and highlighted EPC-regulated markers in healthy donors' PBMCs by single-cell RNA sequencing. a Cluster identification inside the circulating population of 33,000 PBMCs from healthy donors analyzed by single-cell sequencing with Seurat software. b-e Quantification by single-cell RNA sequencing of molecular markers in healthy donor PBMCs: the background of cells with negative expression is colored in gold, and cells positive for the markers appear in dark blue. b Expression of endothelial-related markers selected by literature curation. c Expression of markers selected by literature curation that were found to have lympho/myeloid expression. d Expression of markers selected by literature curation that were found to be expressed in monocytes, either in the CD16+ subpopulation or in the CD14+ subpopulation. e Expression of highlighted markers that were previously found to be regulated between endothelial populations of different tissues.
Some molecules showed a high level of expression shared between ECFCs and HUVECs; NRP1 shared a high level of expression between ECFCs and HUVECs compared with other groups. NRP1 has been shown to orchestrate the committed differentiation of endothelial precursors for both human and murine embryonic stem cells [77]. Moreover, it regulates the differentiation of murine pluripotent stem cells to vascular progenitor cells [78], and it is generally important for angiogenesis and homeostasis [79]. VEGF-C was also upregulated in both ECFCs and HUVECs; it is the most regulated gene, with a high level of expression in both HUVECs and ECFCs, and it is known to promote lymphatic endothelial cells from human pluripotent stem cells [80]. Moreover, VEGF-C induced the differentiation of lymphatic endothelial progenitor cells (LEPCs) into lymphatic ECs, and it also boosted their incorporation into the cardiac lymphatic system; thus, VEGF-C stimulated cardiac lymphangiogenesis in a rat model of myocardial infarction [81].
Fig. 4 Protein-protein interaction network of EPC selected molecules: protein-protein interaction network built with 42 seeds (red nodes) on the STRING database with stringent parameters (interactions used were experimentally validated); blue nodes represent functional inference of vasculature development found with the Gene Ontology biological process database.
The expression of other molecules, including NOTCH1 and MIR21, was elevated in both ECFCs and skin endothelial cells. NOTCH1, via downstream action on HES1, influences the switch between hematopoietic and endothelial fate specification [82]. Further, NOTCH1 regulates the differentiation of mouse embryonic stem cells into arterial ECs and increases their angiogenic potential [83]. MIR21 induces EPC proliferation [84], and it also modulates their senescence [85]. Additionally, MIR21 is known to have a protective effect on vascular ECs [86].
On the other hand, PECAM1 showed a shared high level of expression between ECFCs and adipose tissue endothelial cells. PECAM1 is a classical marker of adult ECs, so it is not surprising that it is upregulated in adipose-derived ECs, and it has also been reported to be a marker of ECFCs [17,27]. Thus, it can be concluded that there was a high level of expression of the chosen factors in ECFCs as compared to other endothelial cells.
The functional enrichment of our chosen sixty-one EPC-related factors on the Mouse Phenotype database has shown the significant involvement of the chosen EPC factors, specifically EFNB2, BMP2, and BMP4, which have the most significant upregulation in ECFCs compared with other groups in the transcriptomic analyses, in mouse endothelial phenotypes like abnormal blood vessel morphology (with the highest number of EPC-related genes involved), followed by abnormal vascular development, abnormal artery morphology, and also decreased angiogenesis (Table 3). Interestingly, the mouse functional enrichment analyses were consistent with the STRING analysis of functional protein-protein interaction networks, which revealed the involvement of 42 of the chosen molecules as seeds of the network and showed that they were crucial for vascular morphogenesis and vascular development (Fig. 4). Collectively, these results clearly prove the prominence of our chosen EPC-related factors and that they are crucial for endothelial and vascular physiology and pathophysiology.
There are two major blood sources for the isolation of EPCs, namely umbilical cord blood (UCB) and peripheral blood (PB). Although PB is the most available source, the number of EPCs and the probability of obtaining EPC colonies from PB are much lower compared with UCB [5,87]. Herein, our single-cell transcriptomic analyses derived from 33,000 single-cell transcriptomes of healthy donor PBMCs have revealed that EC markers like ICAM1/CD54 (activated EPC marker) and ENG (Endoglin/CD105) were still expressed at low levels in the monocytic compartments of PB, although these markers are well-established markers of both ECs and EPCs [17,27].
Further, other EPC markers like CD163, CD36, and CD115 have been shown to be expressed in the monocytic compartment of PB, namely CD163 and CD36 in the CD14+ monocyte compartment and CSF1R (CD115) in the CD16+ monocyte compartment (Fig. 3c). It is noteworthy that both CD163 [27] and CD115 [17] are considered markers of early EPCs, whereas CD36 [27] is regarded as a late EPC marker. Hence, this proves the existence/involvement of EPCs as a subpopulation of the monocytic PB sub-compartment. Collectively, these EPC markers could improve the study of EPC ontogeny and heterogeneities in PB and will also aid (when used with other conventional markers of EPCs) in better characterization, isolation, and higher yield of EPC colonies from PB.
Other, less curated EPC markers from the literature have demonstrated high mixed lympho/myeloid expression in PBMCs, which is the case for SELL (CD62L, selectin L); it has been demonstrated that CD62L is expressed by EPCs, and it is even used as a marker for the isolation and characterization of EPCs in combination with CD34 [27].
The same holds true for IL6R, which is expressed at lower levels than CD62L in the lympho/myeloid compartments of PBMCs. In fact, IL6R/CD126/gp80 is an indirect marker of activated ECs/EPCs, as IL6R is not expressed by ECs but is expressed by neutrophils and monocytes. IL6R is proteolytically cleaved and forms a complex with IL6; this complex binds the gp130 receptor, which is ubiquitously expressed on ECs, and the ECs are thereby activated and start expressing ICAM1, VCAM1, and IL6 [88]. We therefore suggest that these two markers with high expression in the lympho/myeloid compartment, especially CD62L, could be used as EPC markers for better characterization and isolation of EPCs from the PBMC population.
Interestingly, the two EPC-related gene markers PLAUR and NOTCH2, which were differentially regulated between EPCs and ECs from different tissues (Fig. 1 and Table 2), were also highly expressed in the PBMC monocyte sub-compartment in our single-cell RNA-sequencing analyses (Fig. 3e), in both the CD14+ and CD16+ sub-compartments, with PLAUR showing much higher expression. UPAR/PLAUR/CD87 is the receptor of UPA, and together with uPARAP they form the UPA/UPAR/uPARAP system. This system is involved in the migration, proliferation, and adhesion of cells. Moreover, it is a key orchestrator of angiogenesis, in addition to other cellular processes that include receptor shedding and internalization, protein expression, phenotype modulation and tissue remodeling, cancer progression, and metastasis [47,51,53–55]. For angiogenesis to occur, EPCs have to be released from the basement membrane and then migrate to distant regions of injury or neovascularization. UPA binds to UPAR on the EC/EPC surface, resulting in the conversion of plasminogen to plasmin. Plasmin activates matrix metalloproteinases (MMPs) like MMP-3 and MMP-12, which in turn cleave the basement membrane, freeing EPCs to migrate and be recruited to sites of neovascularization, where they progressively differentiate into mature ECs. MMPs also release growth factors like VEGF, FGF2, and HGF, which activate the proliferation of EPCs [89]. Additionally, EPCs have been shown to have higher uPAR levels and uPA activity than mature ECs [90]. Furthermore, UPAR is a crucial proangiogenic regulator for ECFCs and also induces VEGF activity [91]. The UPAR-CD36 interaction has also been shown to be important for the pathogenesis of atherosclerosis [92]. Collectively, UPAR/PLAUR has been shown to be a key player in angiogenesis, vasculogenesis, and EPC function and physiology.
To summarize, in the current study we introduce a novel set of EPC markers, which includes secreted factors, miRNAs, and growth factors. We propose combining conventional EC/EPC markers (like CD31, VEGFR2 (KDR), and vWF) with novel EPC markers emerging from the current study, like UPAR/PLAUR and CD36, as a plausible panel for pre-gating EPCs on the total PBMC population in multi-parametric flow cytometric analyses. Such a panel would aid in improved characterization, isolation, and higher yield of EPC colonies from peripheral blood.

Conclusions
In conclusion, we report a new single-cell transcriptomic in silico approach for delineating a characterization panel of novel EPC markers. This panel would help in designing multi-parametric cytometric analyses for better characterization of the EPC subpopulation in peripheral blood, thus improving the isolation and yield of EPCs from peripheral blood for their subsequent use in cell therapy and regenerative medicine applications.