Diabetes Induces a Transcriptional Signature in Bone Marrow–Derived CD34+ Hematopoietic Stem Cells Predictive of Their Progeny Dysfunction

Hematopoietic stem/progenitor cells (HSPCs) participate in cardiovascular (CV) homeostasis and generate different types of blood cells including lymphoid and myeloid cells. Diabetes mellitus (DM) is characterized by chronic increase of pro-inflammatory mediators, which play an important role in the development of CV disease, and increased susceptibility to infections. Here, we aimed to evaluate the impact of DM on the transcriptional profile of HSPCs derived from bone marrow (BM). Total RNA of BM-derived CD34+ stem cells purified from sternal biopsies of patients undergoing coronary bypass surgery with or without DM (CAD and CAD-DM patients) was sequenced. The results evidenced 10566 expressed genes whose 79% were protein-coding genes, and 21% non-coding RNA. We identified 139 differentially expressed genes (p-value < 0.05 and |log2 FC| > 0.5) between the two comparing groups of CAD and CAD-DM patients. Gene Set Enrichment Analysis (GSEA), based on Gene Ontology biological processes (GO-BP) terms, led to the identification of fourteen overrepresented biological categories in CAD-DM samples. Most of the biological processes were related to lymphocyte activation, chemotaxis, peptidase activity, and innate immune response. Specifically, HSPCs from CAD-DM patients displayed reduced expression of genes coding for proteins regulating antibacterial and antivirus host defense as well as macrophage differentiation and lymphocyte emigration, proliferation, and differentiation. However, within the same biological processes, a consistent number of inflammatory genes coding for chemokines and cytokines were up-regulated. Our findings suggest that DM induces transcriptional alterations in HSPCs, which are potentially responsible of progeny dysfunction.


Introduction
Diabetes mellitus (DM) encompasses a group of metabolic disorders characterized by hyperglycemia resulting from defects in insulin secretion, insulin action, or both. The chronic hyperglycemia of DM is associated with long-term micro-and macrovascular complications, finally resulting in end organ dysfunction and failure [1,2].

Experimental Design
We hypothesized that DM might induce persistent transcriptional alternations in HSPCs at BM level, potentially responsible for innate immune cell dysfunction of the patients. To verify this hypothesis, we performed genome wide transcriptional comparison of CD34 + HSPCs isolated from the sternal BM biopsies of both normoglycemic and T2DM coronary artery disease patients (CAD and CAD-DM, respectively) undergoing coronary artery bypass graft (CABG) A total of 23 patients, 11 CAD (controls) and 12 CAD-DM (cases) matched for gender and with similar age, were enrolled (Table 1).
Within this cohort, on the basis of mRNA availability, BM-CD34 + stem cell samples from 6 CAD and 8 CAD-DM patients were selected for genome-wide expression analysis ( Table 2).  Nanopore-based sequencing, as outlined in Figure 1, was used to this aim. The entire cohort of 23 patients was subsequently included in the validation phase.

Gene Expression Biotype in HSPCs of CAD and CAD-DM Patients
We performed long-reads cDNA sequencing of samples obtained from 8 CAD-DM and 6 CAD matched patients ( Table 2). The results evidenced 10,566 expressed genes, of which 79% were mRNA, while the remaining 21% were non-coding RNAs as shown in Figure 2. Notably, out of the 21% non-coding RNAs, the 76% were pseudogenes (16%) and 5% novel RNA isoforms or putative novel genes (1%), namely, loci expressed in intergenic or intronic regions not yet annotated. The complete list of expressed genes, including gene symbols, locus, and relative gene expression is gathered in Table S1.

Gene Expression Biotype in HSPCs of CAD and CAD-DM Patients
We performed long-reads cDNA sequencing of samples obtained from 8 CAD-DM and 6 CAD matched patients ( Table 2). The results evidenced 10,566 expressed genes, of which 79% were mRNA, while the remaining 21% were non-coding RNAs as shown in Figure 2.
Notably, out of the 21% non-coding RNAs, the 76% were pseudogenes (16%) and ~ 5% novel RNA isoforms or putative novel genes (1%), namely, loci expressed in intergenic or intronic regions not yet annotated. The complete list of expressed genes, including gene symbols, locus, and relative gene expression is gathered in Table S1. Out of 10,566 expressed genes, 79% (8388) was represented by protein coding genes. The remaining 21% was constituted by non-coding RNAs (16% pseudogenes, 3% and 1% long and small noncoding, respectively, and 1% novel genes).

DM Induces a Distinct Transcriptome Profile in BM-derived HSPCs
To gain insights into global gene expression modifications induced by the diabetic BM microenvironment, a differential expression analysis was performed. We identified 139 differentially expressed genes (p-value < 0.05 and |log2 FC| > 0.5) between the two groups of CAD and CAD-DM patients. Of these, 55 were down-regulated (blue dots), and 84 upregulated (red dots) in CAD-DM vs CAD, as shown in the Volcano plot of Figure 3A.

Gene Expression Biotype in HSPCs of CAD and CAD-DM Patients
We performed long-reads cDNA sequencing of samples obtained from 8 CAD-DM and 6 CAD matched patients ( Table 2). The results evidenced 10,566 expressed genes, of which 79% were mRNA, while the remaining 21% were non-coding RNAs as shown in Figure 2.
Notably, out of the 21% non-coding RNAs, the 76% were pseudogenes (16%) and ~ 5% novel RNA isoforms or putative novel genes (1%), namely, loci expressed in intergenic or intronic regions not yet annotated. The complete list of expressed genes, including gene symbols, locus, and relative gene expression is gathered in Table S1. Out of 10,566 expressed genes, 79% (8388) was represented by protein coding genes. The remaining 21% was constituted by non-coding RNAs (16% pseudogenes, 3% and 1% long and small noncoding, respectively, and 1% novel genes).

DM Induces a Distinct Transcriptome Profile in BM-derived HSPCs
To gain insights into global gene expression modifications induced by the diabetic BM microenvironment, a differential expression analysis was performed. We identified 139 differentially expressed genes (p-value < 0.05 and |log2 FC| > 0.5) between the two groups of CAD and CAD-DM patients. Of these, 55 were down-regulated (blue dots), and 84 upregulated (red dots) in CAD-DM vs CAD, as shown in the Volcano plot of Figure 3A. Out of 10,566 expressed genes, 79% (8388) was represented by protein coding genes. The remaining 21% was constituted by non-coding RNAs (16% pseudogenes, 3% and 1% long and small non-coding, respectively, and 1% novel genes).

DM Induces a Distinct Transcriptome Profile in BM-Derived HSPCs
To gain insights into global gene expression modifications induced by the diabetic BM microenvironment, a differential expression analysis was performed. We identified 139 differentially expressed genes (p-value < 0.05 and |log2 FC| > 0.5) between the two groups of CAD and CAD-DM patients. Of these, 55 were down-regulated (blue dots), and 84 up-regulated (red dots) in CAD-DM vs. CAD, as shown in the Volcano plot of Figure  3A. Moreover, unsupervised hierarchical clustering based on differentially expressed genes revealed distinctive expression patterns within each cell type, ensuring an utter discrimination between CAD and CAD-DM samples ( Figure 3B). The complete list of differentially expressed genes, including gene symbols, locus, and relative gene expression, is gathered in Table S2.
Moreover, unsupervised hierarchical clustering based on differentially expressed genes revealed distinctive expression patterns within each cell type, ensuring an utter discrimination between CAD and CAD-DM samples ( Figure 3B). The complete list of differentially expressed genes, including gene symbols, locus, and relative gene expression, is gathered in Table S2. The heatmap shows that clustering based on the differentially expressed genes allowed complete separation between CAD-DM (orange squares) and CAD (green squares) samples. Gene expression levels were standardized and displayed as gradient colors from higher (dark orange) to lower (dark blue).

Functional Genomics Analysis Reveals a Disease-specific Genomic Signature in HSPCs of CAD-DM Patients
Gene Set Enrichment Analysis (GSEA), based on Gene Ontology biological processes (GO-BP) terms, was implemented to infer the functional roles of overrepresented genes in CAD-DM samples, allowing for the identification of disease-specific genomic signatures. Gene Ontology annotation led to the identification of fourteen overrepresented biological categories in CAD-DM samples. Remarkably, eight out of fourteen biological processes were related to lymphocyte activation, chemotaxis, peptidase activity, and innate immune response (e.g., 'response to bacterium', 'toll-like receptor signaling', 'response to interferon gamma', cytokine production) as shown in Figure 4. Notably, all these categories included a large proportion of downregulated genes involved in migration (i.e., CXCR4, CKLF, ITGAX, PPBP), antivirus, antifungal, and antibacterial host defense (i.e., TRIM22, TRIM5, AZU1, MPO, RNASE3, FPR1, FPR2, PPBP, DEFAs) as well as macrophage differentiation, lymphocyte emigration, proliferation, and differentiation (i.e., CSF1R, TNFSF13B, IGLL1, ADAM28, HLA-DQA1, HLA-DQB2). Importantly, within the same biological processes, a consistent number of inflammatory genes such as CCL2, IL1β, IL18RAP, CCL1, IL7, CCL3L1, CCL13, CCL4, NFKB1, and ADAM17 resulted in being upregulated. Interestingly, besides immune system categories, biological processes related to cell division, cell cycle, catabolic process, RNA processing and WNT signaling pathways also resulted as overrepresented in BM-CD34 + HSPCs of CAD-DM patients, suggesting a direct impact of DM on HSPC proliferation, metabolism, and differentiation. The The heatmap shows that clustering based on the differentially expressed genes allowed complete separation between CAD-DM (orange squares) and CAD (green squares) samples. Gene expression levels were standardized and displayed as gradient colors from higher (dark orange) to lower (dark blue).

Functional Genomics Analysis Reveals a Disease-Specific Genomic Signature in HSPCs of CAD-DM Patients
Gene Set Enrichment Analysis (GSEA), based on Gene Ontology biological processes (GO-BP) terms, was implemented to infer the functional roles of overrepresented genes in CAD-DM samples, allowing for the identification of disease-specific genomic signatures. Gene Ontology annotation led to the identification of fourteen overrepresented biological categories in CAD-DM samples. Remarkably, eight out of fourteen biological processes were related to lymphocyte activation, chemotaxis, peptidase activity, and innate immune response (e.g., 'response to bacterium', 'toll-like receptor signaling', 'response to interferon gamma', cytokine production) as shown in Figure 4. Notably, all these categories included a large proportion of downregulated genes involved in migration (i.e., CXCR4, CKLF, ITGAX, PPBP), antivirus, antifungal, and antibacterial host defense (i.e., TRIM22, TRIM5, AZU1, MPO, RNASE3, FPR1, FPR2, PPBP, DEFAs) as well as macrophage differentiation, lymphocyte emigration, proliferation, and differentiation (i.e., CSF1R, TNFSF13B, IGLL1, ADAM28, HLA-DQA1, HLA-DQB2). Importantly, within the same biological processes, a consistent number of inflammatory genes such as CCL2, IL1β, IL18RAP, CCL1, IL7, CCL3L1, CCL13, CCL4, NFKB1, and ADAM17 resulted in being upregulated. Interestingly, besides immune system categories, biological processes related to cell division, cell cycle, catabolic process, RNA processing and WNT signaling pathways also resulted as overrepresented in BM-CD34 + HSPCs of CAD-DM patients, suggesting a direct impact of DM on HSPC proliferation, metabolism, and differentiation. The lists of the genes representative of each biological category displayed in the Figure are gathered in Table S3. Moreover, all the processes identified by GSEA related to the downregulated and the upregulated genes are reported in Table S4 and Table S5, respectively.

Validation of Sequencing Data with qPCR
To validate results obtained with sequencing analyses, we selected 6 genes involved in chemotaxis, immune response, and host defense to further assess their expression levels by qPCR. According to RNA sequencing, we chose three top differentially expressed genes (i.e., FPR2, CSFR1, and DEFA3), and three that showed the tendency to be up-and down modulated (i.e., CCL2, MS4A3, and CXCR4) to strength the appropriateness of our global transcription analysis. Then, we compared the expression levels obtained by qPCR and RNA-Seq for each selected gene. The analysis displayed a strong and significant correlation for all tested genes ( Figure 5).  Table S3. Moreover, all the processes identified by GSEA related to the downregulated and the upregulated genes are reported in Table S4 and Table S5, respectively. To visually interpreting biological data from GSEA analysis, a network of the most significant Gene Ontology biological processes (GO-BP) terms (p < 0.05) was drawn. Blue and red nodes represent pathways mainly associated with CAD and CAD-DM conditions, respectively. The node gradient color is proportional to node significance, from lower (light node; p = 0.05) to higher (dark node; p <<0.0001); node size is proportional to the gene-set size. Edge thickness is proportional to the similarity between two gene-sets.

Validation of Sequencing Data with qPCR
To validate results obtained with sequencing analyses, we selected 6 genes involved in chemotaxis, immune response, and host defense to further assess their expression levels by qPCR. According to RNA sequencing, we chose three top differentially expressed genes (i.e., FPR2, CSFR1, and DEFA3), and three that showed the tendency to be up-and down modulated (i.e., CCL2, MS4A3, and CXCR4) to strength the appropriateness of our global transcription analysis. Then, we compared the expression levels obtained by qPCR and RNA-Seq for each selected gene. The analysis displayed a strong and significant correlation for all tested genes ( Figure 5). To visually interpreting biological data from GSEA analysis, a network of the most significant Gene Ontology biological processes (GO-BP) terms (p < 0.05) was drawn. Blue and red nodes represent pathways mainly associated with CAD and CAD-DM conditions, respectively. The node gradient color is proportional to node significance, from lower (light node; p = 0.05) to higher (dark node; p << 0.0001); node size is proportional to the gene-set size. Edge thickness is proportional to the similarity between two gene-sets. Notably, we further corroborated our results on mRNA of BM-CD34 + stem cell samples isolated from patients who were not included in the sequencing cohort but Notably, we further corroborated our results on mRNA of BM-CD34 + stem cell samples isolated from patients who were not included in the sequencing cohort but originally enrolled (Table 1). In line with sequencing data, qPCR analysis confirmed the same expression trend that was statistically significant for all selected genes ( Figure 6, panel A). Importantly, the analysis by flow-cytometry of CXCR4 protein expression, a gene that by sequencing displayed a down-regulation trend, also demonstrated a significant reduction of the receptor at membrane level ( Figure 6, panel B).
of the trendline (purple line) is depicted in light purple. Data are plotted as -dct (y-axis) versu log-normalized value (x-axis) for FPR2 (panel A), CSFR1 (panel B), DEFA3 (panel C), CCL2 ( D), MS4A3 (panel E), and CXCR4 (panel F). The R coefficient as well as the corresponding pues for each gene are shown in the relative panel.
Notably, we further corroborated our results on mRNA of BM-CD34 + stem cell ples isolated from patients who were not included in the sequencing cohort but orig enrolled (Table 1). In line with sequencing data, qPCR analysis confirmed the sam pression trend that was statistically significant for all selected genes ( Figure 6, pan Importantly, the analysis by flow-cytometry of CXCR4 protein expression, a gene th sequencing displayed a down-regulation trend, also demonstrated a significant redu of the receptor at membrane level ( Figure 6, panel B).

Discussion
There is a close link between DM and atherosclerotic CV disease, which remains the most prevalent cause of morbidity and mortality in diabetic patients. The exact pathogenic mechanisms linking accelerated atherosclerotic cardiovascular complications, resulting in long-term damage and failure of various organ systems, and diabetes, are rather complex [23]. However, it is widely reported that in all stages of atherosclerosis, both in the absence and presence of diabetes, monocytes and macrophages are key players [24]. Preclinical and clinical studies support the role of inflammation in the initiation and progression of atherosclerosis that despite representing the intrinsic capacity of the body to react to tissue injury or pathogens can have detrimental effects on tissue and organs, if chronically activated [25,26]. To this regard, a chronic low-grade inflammation and immune activation have been described both in pre-diabetic and diabetic states [27][28][29][30]. A consisting body of literature suggests that the impact of hyperglycemia on HSPCs in the BM niche may be the primary factor contributing to CV disease development [31,32]. Indeed, numerous preclinical data show the ability of BM diabetic environment to boost HSPC differentiation toward myeloid lineage as well as to promote abnormal generation and accumulation of immune cell subpopulations (e.g., monocytes and macrophages) with more aggressive phenotypes [33][34][35]. Paradoxically, this state of chronic inflammation, characterized by increased myelopoiesis and circulating inflammatory cytokines, which account for increased risk of atherosclerosis, coexists with a dysfunctional immune response that renders diabetic patients more susceptible to infections [36].
Interestingly, though the molecular mechanisms underpinning the paradox of immune system function in DM is not fully understood, preclinical data suggest that epigenetic modifications in HSPCs triggered by diabetic BM microenvironment can, once passed to their progeny, be responsible for the gene expression and phenotypical alteration of terminally differentiated immune cells [22]. Herein, in order to explore if a transcriptional signature predictive of immune cell dysfunction was already present in HSPCs of DM patients, we compared the transcripts of CD34 + HSPCs isolated from sternal BM biopsies of CAD patients with or without T2DM by genome-wide expression analyses. The study involved the use of nanopore-based sequencing, a new NGS technique that allowed us to obtain a complete picture of RNA transcripts, specifically 79% mRNA and 21% noncoding RNA. Succeeding analysis of the data by the Gene GO-BP-based GSEA, revealed fourteen overrepresented biological categories in CAD-DM vs. CAD samples. Remarkably, most of the relevant biological processes were related to immune system function, namely 'chemotaxis', 'response to bacterium', 'toll-like receptor signaling', 'response to interferon gamma', cytokine production, and response to bacterium. In addition, narrowing our analysis down within the gene lists of biological processes, we found a down-regulation of numerous genes coding for cytotoxic peptides related to innate immune system with bactericidal, antifungal, and antiviral properties. These included AZU1, a neutrophilmonocyte-derived antibacterial and chemotactic glycoprotein with cytotoxic action against Gram-negative bacteria [37], defensins family members (i.e., DEFA1, DEFA1B, DEFA3, and DEFA4), abundant in the granules of neutrophils, as well as MPO and RNASE3, additional genes encoding for proteins with microbicidal activity against a wide range of organisms [38][39][40]. Notably, CD34 + HSPCs from CAD-DM patients also displayed a defective expression of genes involved in the regulation of macrophage differentiation (CSFR1), phagocytic cell recruitment and activation (i.e., PPBP, CXCR4, and FPR2), as well as antigen presentation (i.e., HLA-DQA1 and HLA-DQB2) [41]. Consistently in DM, despite increased myelopoiesis, there is a defense mechanism impairment of granulocytes and monocytes including migration to the site of inflammation, phagocytosis, ROS production, and germicidal activity. The functional failure of these cells, which are the first cells recruited to the sites of injury, plays a crucial role in inflammatory response against microbial infections and it is associated with an increased incidence of infections in DM patients [42].
However, such a gene expression pattern in CD34 + HSPCs, predictive of innate immune system impairment, was associated with upregulation of numerous genes encoding for pro-inflammatory interleukins, chemokine, and receptors, critical not only in the recruitment, accumulation, and activation of immune cells to the site of injury, but also in all inflammatory stages of atherosclerosis [42,43]. Among these, it is noteworthy mentioning CCL13, a chemotactic factor able to identify with a very high degree of accuracy subjects with clinically significant atherosclerotic heart disease [44]. The upregulation of pro-inflammatory genes in HSPCs of CAD-DM patients strongly suggests the acquisition of senescent-associated secretory phenotype (SASP). Similar to aging, diabetes is known to induce cell senescence with mechanisms involving, together with others, mitochondrial dysfunction and increased reactive oxygen species [45]. These do not spare immune cells and stem/progenitor cells that, as all other cells, are susceptible to developing premature senescence [45,46]. SASP profile, characterized by sustained release of pro-inflammatory cytokines, chemokines, and growth factors, may in turn, with a vicious positive feedback cycle, contribute to establishing a chronic low-grade pro-inflammatory status [47], and exacerbate immune dysfunction [45,48].
Despite the limitation of the study related to the particularity of biological samples that constrains the number of subjects, to our knowledge this is the first study revealing the gene expression profile of BM-derived CD34 + HSPCs from CAD patients who are distinguished by the only presence of T2DM. In line with recent evidence demonstrating the ability of DM to induce stable epigenetic modifications responsible for gene expression alteration in BM progenitors [22,49], our study suggests that diabetic BM microenvironment, regardless of the combination of comorbid CAD, establishes a transcriptional signature in HSPCs that is predictive of unrestrained inflammatory and dysfunctional phenotype of their daughter cells in the peripheral tissues. Epigenetic aberrant modifications, which are cell-typespecific and can be mitotically heritable contributing to long-term gene expression changes, are associated with many human diseases, including diabetes. Numerous studies already provide evidence that DNA methylation and post-translational modifications of histone proteins are involved in the development of diabetes and its vascular complications [50]. Differences in activator (e.g., H3K9ac, H3K4me) and repressive (e.g., H3K36me3, H3K9me3) chromatin marks have been found at the level of pro-fibrotic and inflammatory genes in numerous tissues [51,52], including immune cell components (i.e., monocytes and lymphocytes) of diabetic patients [53,54]. While it is becoming evident that diabetes, by glucose metabolism alteration and oxidative stress (ROS), can affect the function of epigenetic machinery [55,56], big gaps still exist in knowledge about the interplay between cell signaling and epigenetic machinery. However, as epigenetic enzymes can regulate cell signaling pathways (e.g., NF-κB, RAS/RAF/MEK/MAPK, PI3K/Akt, Wnt/β-catenin, p53, and ERα) [57], it is very likely that signal transduction pathways and transcription factors abnormally activated by diabetic conditions or hyperglycemia can cooperate, in a still not fully elucidated way, with epigenetic factors to promote sustained expression of pathological genes [58]. To this regard, we are currently studying the molecular and epigenetic mechanisms involved in glucose-induced dysfunction of HSPCs.
Overall, our results may open the way to further studies aimed at unravelling the profound mechanisms underlying the paradox of innate immune training in diabetes with the finally scope of discovering novel therapeutic targets to be exploitable for limiting and/or preventing infectious diseases and devastating complications in DM patients.

Study Participants
Twenty-three patients, 11 CAD and 12 CAD-DM, were enrolled in the study. All procedures performed on subjects were in accordance with the Helsinki Declaration of 1975. The investigation was approved by Centro Cardiologico Monzino Research Ethics Committee (No. CCM 205-RE1973/1) and each participant provided written informed consent. The inclusion criteria for CAD-DM patients were age above 35 years, diagnosis of T2DM as defined by the American Diabetes Association [59], with at least one year of disease duration at the time of the screening visit. The exclusion criteria were T1D diagnosis, inflammatory/infective/autoimmune disorders, and/or history of cancer. At admission, CAD-DM patients were treated with any combination of oral anti-diabetic therapies with/without insulin.

Sternal Bone Marrow Biopsy and CD34 + Stem Cell Isolation
During surgical procedure, two or three milliliters of sternal BM were withdrawn by bone biopsy needle (15G × 25/90mm; MDL). Blood aspirate was suspended in saline buffer, and mononuclear cell (MNC) fraction was isolated by density gradient centrifugation using Lymphoprep TM (Sentinel Diagnostic Spa; Milan, Italy). CD34 + stem cells were then magnetically sorted by MiniMACS system (CD34 Microbead Kit; Miltenyi Biotec GmbH; Bologna, Italy). Purity of isolated cells was assessed during experimental settings by flow cytometry (Beckman-Coulter Gallios, Life Science; Milan, Italy). The test was carried out three times on samples deriving, randomly, from both cohort of patients. Isolated cells were single stained for the hematopoietic stem cell markers CD34 and for CD14, CD3, CD80, and CD86 to determine the level of cell lineage contamination. The 90% ± 4.08 (SD) of the cells resulted positive for CD34 and negative for all other considered markers ( Figure S1). Isolated CD34 + cells were then cryopreserved for subsequent RNA isolation and analysis.

Total RNA Extraction
Total RNA from CAD and CAD-DM CD34 + cells was isolated using the Direct-zol RNA Kit (Zymo Research; EuroClone S.p.A., Milan, Italy) following the manufacturer's instructions. At the end of the procedure RNA concentration and quality were assessed, respectively, by microvolume spectrophotometry using a ND-1000 NanoDrop (Thermo Fisher Scientific; Milan, Italy) and by microfluidics-based automated electrophoresis, using the RNA 6000 Nano Assay Kit on 2100 BioAnalyzer system (Agilent Technologies S.p.A.; Milan, Italy).

MinION Nanopore Sequencing
RNA sequencing was performed on 6 CAD and 8 CAD-DM samples selected on the basis of RNA amount availability using a MinION Mk1B sequencer (Oxford Nanopore Technologies; Oxford, UK) on a R9.4.1 (FLO-MIN106) flow cell (Oxford Nanopore Technologies; Oxford, UK). Clinical characteristics of the sequenced patients are shown in Table 2. The libraries were prepared using the SQK-LSK108 Sequencing kit (Oxford Nanopore Technologies; Oxford, UK) following the 1D Strand switching cDNA-by-ligation protocol. Briefly, reverse transcription was performed starting from 500 ng of RNA using the PCR-VN-RT primer and the Strand Switch Oligo PCR_SW_mod_3G, followed by First Strand cDNA PCR amplification for full-length transcripts (18 cycles, 3 min extension). After purification by Agentcourt AMPure XP beads (Beckman Coulter), end-repairing and dA tailing steps were performed on 1 µg of DNA/sample before the ligation of specific adapters to 0.2 pmols of end-prepped DNA. A single library was loaded on each flowcell following protocol instructions. After reaching at least 5 GB of acquired data, the sequencing was stopped and the flowcell washed in order to allow the loading of a new sample. Then the flow cell underwent QC analysis to verify the presence of enough pores to run the sequencing. In case of insufficient pores availability, a new flow cell was loaded. Albacore software v2.3.1 (Oxford Nanopore Technologies; Oxford, UK) was used for basecalling.

Bioinformatics and Statistical Analysis
Reads were mapped to the human genome reference GRCh38.96 by the 'minimap2 aligner, properly set for minION long reads (custom options: -x map-ont -Y) [60]. Genes were considered as "expressed" when showing a minimum of 10 counts in at least 50% of samples. Gene expression quantification was performed by 'featureCount' (custom option: -L) [61], while gene filtering and variance-stabilizing normalization (VSN) were carried out by the 'DaMiRseq' R package [62]. The statistical analysis was accomplished by the 'limma' R package [63]. The Benjamini-Hochberg procedure was used to control for the false discovery rate (FDR). Differences were deemed significant if the p-value was < 0.05 and the |log2 FC| > 0.5. Clustering analysis was performed by the 'pHeatmap' R package. Demographic and clinical profile of enrolled patients are expressed as mean ± standard deviation, median [25-75% confidence interval], or percentage, as appropriate. Shapiro-Wilk normality test has been used to verify Gaussian distribution of the considered covariates. For comparison between two groups, Student's T or Mann-Whitney U test have been performed, as appropriate. Fisher's exact test has been used to compare categorical covariates. Dot-plots of qPCR validation were drawn by GraphPad Prism (v.5, GraphPad Software,) and given as mean ± SEM. All data expressed as fold-change were log2-transformed before analysis and tested for the normality by using the Shapiro-Wilk normality test. Differences between data were evaluated by unpaired Student's t test (2-group comparisons).

Functional Analysis
The Gene-Set Enrichment Analysis (GSEA v.4.0.0) with pre-ranked mode was performed to infer biological function, associated with CAD and CAD-DM phenotypes [64]; the ranking metrics was based on the t-statistics values of differential expression analysis (only protein coding genes) and only the gene ontology biological process (GO-BP) were taken into account. Pathways significantly associated with phenotypes (p < 0.05) were visually represented by the Enrichment Map (v.3.0.0) Cytoscape (v3.7.1) plug-in [65,66].

qPCR Validation
The gene expression levels were technically validated by qPCR on 6 genes (i.e., FPR2, CSFR1, DEFA3, CCL2, MS4A3, and CXCR4). The validation was performed on samples obtained from the entire cohort of 23 CAD and CAD-DM patients (Table 1). Total RNA (400 ng) was reverse-transcribed using 5x-All-in-One RT Master Mix (Applied Biological Materials; Gentaur, Bergamo, Italy) and the resulting cDNA diluted to 5 ng/µL with RNASe free water. Single assay Real Time qPCR was performed using specific Taqman Gene Expression Assays (Thermo Fisher Scientific; Milan, Italy) for each target, according to manufacturer's instructions, and run on Viia 7 RT-qPCR system. Ct values were normalized (Delta Ct), using beta globulin as reference gene. Data, expressed as fold-changes (FC; 2 −∆∆CT ) over CAD after normalization to each housekeeping gene, were log2-trasformed before analysis. The Pearson's correlation indexes and the corresponding p-values of sequenced samples were calculated by the native R base functions (v.3.6.1) [67] and represented by the 'ggplot2 R package [68].

Flow Cytometric Assay
The purity of the cells and protein expression of non-significantly downregulated gene (CXCR4) was validated by flow-cytometry. CD34 + stem cells were single stained for 30 min with antihuman CD34, CD14, CD3, CD80, CD86, and CXCR4 monoclonal antibodies (BD Biosciences). After 30 min, cells were washed and analyzed. The Beckman-Coulter Gallios platform (Beckman-Coulter Life Science; Milan, Italy) and Kaluza analysis software (v2.1.1) were used to analyze samples by use of appropriate physical gating. At least 10 4 events in the indicated gates were acquired.