CD49b identifies functionally and epigenetically distinct subsets of lineage-biased hematopoietic stem cells

SUMMARY
Hematopoiesis is maintained by functionally diverse lineage-biased hematopoietic stem cells (HSCs). The functional significance of HSC heterogeneity and the regulatory mechanisms underlying lineage bias are not well understood. However, absolute purification of HSC subtypes with a pre-determined behavior remains challenging, highlighting the importance of continued efforts toward prospective isolation of homogeneous HSC subsets. In this study, we demonstrate that CD49b subdivides the most primitive HSC compartment into functionally distinct subtypes: CD49b− HSCs are highly enriched for myeloid-biased and the most durable cells, while CD49b+ HSCs are enriched for multipotent cells with lymphoid bias and reduced self-renewal ability. We further demonstrate considerable transcriptional similarities between CD49b− and CD49b+ HSCs but distinct differences in chromatin accessibility. Our studies highlight the diversity of HSC functional behaviors and provide insights into the molecular regulation of HSC heterogeneity through transcriptional and epigenetic mechanisms.


INTRODUCTION
The maintenance and replenishment of the hematopoietic system and its cells rely on rare bone marrow (BM)-resident hematopoietic stem cells (HSCs). Blood cell development is traditionally described as a hierarchical tree, with HSCs differentiating through progenitor stages to ultimately form the terminally differentiated hematopoietic cells (Wilkinson et al., 2020). In this model, HSCs are assumed to be multipotent and equipotent, thus producing all mature blood cells with no lineage preference (Eaves, 2015). However, HSCs exhibit distinct functional behaviors in vivo, with only a subset of them showing a lineage-balanced output consistent with traditionally described HSCs. The functionally heterogeneous HSCs have been classified according to the ratio of mature myeloid to lymphoid cells within the leukocyte fraction (Eaves, 2015). Later studies, which included the analysis of platelet and erythroid cells, led to the discovery of additional biased and restricted HSCs (Wilkinson et al., 2020). To date, HSCs with myeloid-biased, platelet-biased, lineage-balanced, and lymphoid-biased repopulating patterns have been demonstrated (Carrelha et al., 2018; Challen et al., 2010; Dykstra et al., 2007; Morita et al., 2010; Müller-Sieburg et al., 2002; Muller-Sieburg et al., 2004; Sanjuan-Pla et al., 2013; Yamamoto et al., 2013, 2018). The behavior of distinct HSC subsets differs not only in lineage bias but also in self-renewal ability (Eaves, 2015; Wilkinson et al., 2020). Myeloid-biased, platelet-biased, and lineage-balanced HSCs are recognized as durable HSCs that can sustain long-term (LT) hematopoiesis (Carrelha et al., 2018; Dykstra et al., 2007; Eaves, 2015; Morita et al., 2010; Sanjuan-Pla et al., 2013).
Lymphoid-biased HSCs generally have finite self-renewal ability and likely overlap with intermediate-term (IT) HSCs; thus, they are not considered LT HSCs but are still distinct from transiently self-renewing short-term (ST) HSCs and multipotent progenitors (MPPs) (Benveniste et al., 2010; Challen et al., 2010; Dykstra et al., 2007; Eaves, 2015; Kent et al., 2009; Morita et al., 2010). Altogether, these studies suggest that the traditional HSC compartment contains a large diversity of cells that differ in lineage bias and show graded differences in self-renewal potential and, thereby, durability.
Stem and progenitor cells are contained within the Lineage−Sca-1+c-Kit+ (LSK) compartment, but only a minor fraction of LSK cells are LT reconstituting HSCs. The combination of additional cell-surface markers has greatly improved the isolation of functional HSCs. However, HSCs are ultimately defined by their functional ability and cannot be identified by immunophenotype alone (Wilkinson et al., 2020). Lineage bias is, at least partly, thought to be intrinsically programmed. Further study of the mechanisms underlying HSC diversity is dependent upon the ability to connect HSC immunophenotype with functional behavior, highlighting the importance of prospective isolation of homogeneous HSC subsets (Eaves, 2015; Haas et al., 2018).
In this study, we have reassessed the phenotypic HSC compartment using cell-surface markers reported to identify HSCs, to explore whether different combinations of immunophenotypes can isolate functionally diverse HSC subsets. We found heterogeneous expression of CD49b in the phenotypic HSC fraction harboring the most primitive and durable cells (LSKCD34−CD48−CD150hi) (Morita et al., 2010). Phenotypically separated CD49b fractions were functionally distinct: CD49b− cells were highly enriched for myeloid-biased HSCs and were the most durable cells, and CD49b+ cells were enriched for multipotent HSCs with lymphoid bias and less durability. Transcriptional profiling of CD49b− and CD49b+ HSCs revealed high concordance, whereas chromatin accessibility analysis showed diverse profiles. Our studies demonstrate that CD49b can distinguish between functionally and epigenetically distinct multipotent HSCs with myeloid and lymphoid bias in the primitive HSC compartment.
(CD150int) cells, enriched for lineage-balanced HSCs, and the LSKCD34−CD48−CD150− (CD150−) cells, with a lymphoid-biased phenotype, were included for comparison (Figure 1A) (Kent et al., 2009; Morita et al., 2010). Most markers had uniform expression patterns in the CD150hi fraction except for CD229, CD41, and CD49b, which showed bimodal expression profiles (Figure S1A). Interestingly, CD49b has been suggested to mark ST HSCs, IT HSCs, and primed HSCs (Benveniste et al., 2010; Wagers and Weissman, 2006; Zhao et al., 2019). We found that previously identified CD49b− and CD49b+ populations were heterogeneous for CD150 cell-surface expression, which could partly explain why CD49b marks cells with both transient and LT self-renewal ability (Figures S1B and S1C). We therefore hypothesized that CD49b might be a candidate marker to further enhance the isolation of functionally distinct HSCs within the CD150hi compartment. Notably, the combination of CD229, CD41, and CD49b revealed further phenotypic subfractions within CD150hi cells (Figure S1D). However, cell-cycle analysis by Ki-67 staining and cell-proliferation analysis by the 5-bromo-2′-deoxyuridine (BrdU) incorporation assay showed significant differences only between the CD49b− and the CD49b+ subsets (Figures S1E-S1H). These data suggested that subfractionation with CD49b alone could be sufficient to isolate functionally distinct cells.
CD49b− and CD49b+ subsets have different cell-cycle and cell-proliferation kinetics
Despite the lack of functional differences between CD49b− and CD49b+ cells in vitro, we observed a higher, but not statistically significant, cloning frequency from the CD49b+ subset in the B/M assay, compatible with cell-proliferation differences (Figures 1B and S1H). Cell-cycle analysis of CD49b− and CD49b+ cells showed that both subsets were highly quiescent, but a larger fraction of CD49b− cells was in G0 (95%) and a smaller fraction in G1 (4%) compared with CD49b+ cells (87% and 10%, respectively; Figures 2A and 2B). Consistent with the higher in vitro cloning frequency (Figure 1B), more CD49b+ cells (30%) incorporated BrdU compared with CD49b− cells (11%) (Figures 2C and 2D). To further understand cell-proliferation kinetics, we tracked the cell divisions of CD49b− and CD49b+ single cells (Figure 2E). One day after culture, most cells had still not divided (CD49b−, 96%; CD49b+, 89%), consistent with the quiescent nature of both subsets (Figure 2B). However, after 2 days, 60% of CD49b+ cells had already undergone ≥1 cell division, compared with only 36% of CD49b− cells, indicating that CD49b+ cells entered the cell cycle faster. By 4 days, the numbers of cell divisions were comparable between the subsets.
Collectively, these findings suggested that, although both CD49b− and CD49b+ subsets were highly quiescent, CD49b+ cells had a higher proliferation rate.
CD49b− and CD49b+ HSCs are the most durable subsets
To evaluate the functional significance of the CD49b subfractions, we performed a competitive transplantation assay, in which five cells of the CD49b−, CD49b+, CD150int, or CD150− subsets were transplanted. Donor HSC subsets were purified from the Gata-1 eGFP mouse strain to detect platelets and erythrocytes (Figure S2A) (Drissen et al., 2016). Although the transplanted HSC subsets differed in total leukocyte contribution, they repopulated all blood cells (Figures 3A, 3B, and S2B). While the CD49b−, CD49b+, and CD150int subsets could stably generate all blood lineages for 6 months, CD150−-transplanted mice exhibited transient myeloid repopulation but sustained high B and T cell repopulation over time, consistent with previous studies (Kent et al., 2009; Morita et al., 2010). Furthermore, donor-derived phenotypic HSCs were detected in most mice from the CD49b− (92%) and CD49b+ (90%) groups, in 42% of mice from the CD150int group, but in only 7% of mice from the CD150− group (Figures 3C and S2C). Mice from the CD49b− and CD49b+ groups generated all types of mature BM cells, progenitor cells, and phenotypic HSCs at 5-6 months post-transplantation. Collectively, these findings indicated that CD49b− and CD49b+ subsets had the highest self-renewal potential.
Figure 2 (partial legend): (D) Frequency of BrdU+ CD49b− and CD49b+ HSCs (n = 9 mice, three independent experiments). (E) Cell divisions from cultured single cells on days 1-4 (n CD49b− = 347 cells, n CD49b+ = 370 cells, five replicates, three independent experiments). Mean ± SD is shown in (B), (D), and (E).
CD49b− and CD49b+ HSCs reconstitute all blood lineages, but at different ratios
To evaluate the HSC behavior at the clonal level, we performed single-cell transplantation experiments. A total of 139 mice were transplanted with single CD49b− (61 mice) or CD49b+ (78 mice) cells, and 48% of the CD49b− and 28% of the CD49b+ cells reconstituted the recipients (Figure S3). Both subsets were able to repopulate all leukocyte, platelet, and erythrocyte lineages for 6 months (Figures 4A and 4B). The repopulation profiles of individual mice revealed five major patterns from CD49b− cells where the dominant groups were characterized by high platelet (P), erythroid (E), and myeloid contributions (M > L) and multilineage contributions (M-L) (Figures S3 and S4A). Notably, mice with only P or PE repopulation patterns were found only in CD49b− cells (Figures S3E, S3F, and S4A). Conversely, CD49b+ cells could be grouped into four patterns with predominantly higher B lymphoid reconstitution (L > M) or ST/transient repopulation patterns (Figure S4B). Thus, we assessed the lineage bias by analyzing the proportional contribution to B, T, natural killer (NK), and myeloid cells in donor leukocytes (Figures 4C and 4D). Single-cell-transplanted CD49b− mice showed a high myeloid contribution, and CD49b+ mice exhibited predominantly lymphoid contribution and some lymphoid-restricted patterns. Based on blood profiles from unmanipulated mice, a lineage-balanced pattern can be categorized by the ratio of lymphoid to myeloid cell contribution (L/M) of 6.0 ± 2.0 (Figure S4C) (Müller-Sieburg et al., 2002).
Thus, we categorized the repopulation patterns from the CD49b− and CD49b+ single-cell-transplanted mice as myeloid-biased (L/M < 4), lymphoid-biased (L/M > 8), or lineage-balanced (L/M ≥ 4 and ≤ 8). CD49b−-transplanted mice were predominantly classified as myeloid-biased (M-bi; 78%), whereas the most common classification in CD49b+-transplanted mice was lymphoid-biased (L-bi; 46%, Figure 4E). Furthermore, CD49b− generated more myeloid cells in the BM compared with CD49b+, consistent with the M-bi repopulation pattern in peripheral blood (PB), while the lymphoid cell contributions in CD49b− and CD49b+ cells were comparable (Figure S4D). These findings suggested that both CD49b− and CD49b+ subsets were multipotent but with different lineage biases.
Eighty-nine percent of reconstituted mice transplanted with single CD49b− cells exhibited LT repopulating ability, of which 82% were M-bi. In contrast, 61% of CD49b+-reconstituted mice showed LT activity, of which 36% were L-bi (Figures 5A and 5B). As expected, no CD49b+ cells with lymphoid-restricted patterns and LT activity were found (Figures 4D and 5A). Furthermore, both CD49b− and CD49b+ subsets generated phenotypic HSCs and downstream BM progenitor populations, but less efficiently in CD49b+ compared with CD49b− cells (Figures 5C, S4E, and S4F). These data showed that CD49b− cells were more durable than CD49b+ cells.
To assess whether LT repopulating activity decreased over time, we followed a group of single-cell-transplanted mice to 9 months. Both the repopulation level and the pattern were similar to the results at 5-6 months post-transplantation (Figures 4B-4D, 5D, and 5E), suggesting no decline of self-renewal potential in CD49b− and CD49b+ subsets over time.
The repopulation efficiency of CD49b−-transplanted mice was high, with all reconstituted mice with LT activity (Figure 5A) showing LT repopulation in PB and BM (25/25). Most of these mice further repopulated secondary recipients in PB and BM (18/21). In contrast, the repopulation efficiency of CD49b+ cells in both primary (9/10) and secondary (5/7) transplantation was lower, but they were still capable of multilineage repopulation (Figures 5F, 5G, S5A, and S5B). An M-bi classification was highly correlated with positive and robust repopulation. In contrast, an L-bi classification was correlated with declining repopulation efficiency from primary to secondary transplantation. Although these findings indicated that L-bi cells were capable of propagating through serial transplantation with multilineage repopulation (Figures 5F, 5G, S5A, and S5B), our results nonetheless suggested that such cells were infrequent.
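The fractions quoted in this paragraph are easier to compare as percentages. The short sketch below only converts the counts reported in the text; the dictionary layout and the function names are illustrative assumptions, not part of the study's analysis.

```python
# Counts as reported in the text: (mice with LT repopulation, mice analyzed).
# The keys and structure are illustrative, not taken from the original analysis.
REPORTED = {
    ("CD49b-", "primary"): (25, 25),
    ("CD49b-", "secondary"): (18, 21),
    ("CD49b+", "primary"): (9, 10),
    ("CD49b+", "secondary"): (5, 7),
}


def percent(positive: int, total: int) -> float:
    """Return positive/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be a positive number of mice")
    return 100.0 * positive / total


def summarize(counts: dict) -> dict:
    """Map each (subset, transplantation) key to a percentage rounded to one decimal."""
    return {key: round(percent(*value), 1) for key, value in counts.items()}


if __name__ == "__main__":
    for (subset, stage), pct in summarize(REPORTED).items():
        print(f"{subset} {stage}: {pct}%")
```

Expressed this way, both subsets drop from primary to secondary transplantation, with CD49b+ lower than CD49b− at each stage; the small group sizes mean the percentages should be read with caution.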
Given the difference in durability and lineage bias, we evaluated the hierarchical relationship between the CD49b subsets by assessing their ability to form phenotypic HSC subsets (CD49b−, CD49b+, CD150int, and CD150−). While CD49b− cells were potent in generating all phenotypically defined HSC subsets, CD49b+ cells were less efficient, but nevertheless able to produce the same subsets (Figures 5H and S5C). Secondary transplantation confirmed that most of the phenotypic donor-derived CD49b− and CD49b+ cells were functional HSCs (Figures S5D and S5E).
Collectively, these results showed that both CD49b− and CD49b+ cells could sustain LT multilineage hematopoiesis with preserved lineage bias but at different efficiencies.
To identify epigenetic changes associated with gene expression differences, we analyzed open chromatin using assay for transposase-accessible chromatin sequencing (ATAC-seq; Table S2; Figures S6E and S6F) (Corces et al., 2017). Chromatin accessibility at population-specific genes followed the expected patterns (Figures S6G and S6H). PCA revealed distinct LMPP and GMP clusters, whereas CD150+ and CD150− populations clustered closely together, but, in contrast to RNA-seq data, formed distinguishable groups (Figure 6D), indicating that transcriptionally similar CD49b− and CD49b+ subsets differ on the chromatin level.
whereas regions in the CD150+ group (clusters 1-2) were associated with transforming growth factor β signaling, involved in regulation of HSC quiescence (Figure 6F) (Wang et al., 2018). These findings suggested that generation of downstream progenitors from CD150+ HSCs is associated with extensive chromatin remodeling.

Figure 7 (partial legend; panel text recovered from the figure, panel label "CD49b-high")
(B) Genomic feature distribution of regions with differential accessibility (p adj < 0.01) and all regions as a reference ("all peaks").
(D) GO enrichment analysis of regions with differential accessibility between CD49b− and CD49b+ subsets. The top 10 significantly enriched terms from the mouse phenotype and GO biological process are shown.
Panel (D) terms, mouse phenotype: increased monocyte cell number; abnormal mononuclear phagocyte morphology; abnormal phagocyte morphology; abnormal monocyte morphology; abnormal monocyte cell number; abnormal thymus morphology; extramedullary hematopoiesis; abnormal macrophage cell number; abnormal hematopoietic stem cell morphology; abnormal NK cell number; decreased hematopoietic stem cell number; improved glucose tolerance; abnormal glucose tolerance; abnormal fat cell morphology; abnormal heart valve morphology; maternal imprinting; genetic imprinting; abnormal epigenetic regulation of gene expression; abnormal heart ventricle outflow tract morphology.
Panel (D) terms, GO biological process: regulation of GTPase activity; positive regulation of GTPase activity; regulation of phagocytosis; negative regulation of immune system process; myelin maintenance; inflammatory response; regulation of granulocyte differentiation; positive regulation of hemopoiesis; myeloid progenitor cell differentiation; myeloid leukocyte activation.
(F) Volcano plots of differential TF binding. Transcription factors with differential binding activity (differential binding score > 0.1, p < 1 × 10⁻¹⁰⁰) are colored and selectively annotated.
(G) Aggregated footprint plots for TFs with differential binding. See also Figure S7.
involved in genetic imprinting, which regulates HSC quiescence (Qian et al., 2016). In contrast, processes regulating hematopoietic cell numbers, differentiation, activation, and GTPase activity (Mulloy et al., 2010) were associated with regions with higher accessibility in CD49b+. These findings are consistent with the cell-cycle active and proliferative nature of CD49b+ cells compared with the more quiescent CD49b− cells.
To identify potential transcription factors (TFs) responsible for chromatin accessibility differences between the CD49b subsets, we performed motif enrichment analysis (Figure 7E). TF binding sites (TFBSs) [...] (Figure 7E). Moreover, we performed a genome-wide analysis of TF occupancy by footprinting analysis (Figures 7F and 7G). Differential TF binding plots showed large differences in TF binding between CD49b− and GMPs, which included the myeloid-associated C/EBP-family and SPI1 (PU.1) TFs (Tenen et al., 1997). In contrast, CD49b− and CD49b+ had few TF binding differences. Consistent with the motif enrichment analysis, RUNX [...] (Figures 7F and 7G). Altogether, our findings indicated that differential functions of CD49b− and CD49b+ cells may to a large extent be regulated by the same set of TFs.

DISCUSSION
It is well recognized that the HSC population is functionally diverse, with subtypes differing in propensity of blood cell differentiation and in self-renewal ability and lifespan (Eaves, 2015; Wilkinson et al., 2020). Although HSC heterogeneity is recognized, insight into the molecular mechanisms underlying HSC diversity is lacking due to limitations in purifying homogeneous HSC subtypes. Here, we have reassessed the expression of cell-surface markers suggested to define HSCs to explore functional heterogeneity. We identified CD49b as a candidate marker to subfractionate the phenotypic LSKCD34−CD48−CD150hi compartment, enriched for M-bi and functional HSCs (Morita et al., 2010; Wilkinson et al., 2020). The CD49b− subset greatly improved the purity of M-bi cells and cells with the highest self-renewal activity. However, the L-bi phenotype was most common in CD49b+ cells, with a subset exhibiting multilineage LT HSC activity. Notably, the CD49b+ L-bi HSCs in this study are distinct from the previously described CD150− L-bi HSCs and γ and δ cells, which all showed a lymphoid-dominant repopulation pattern and limited self-renewal ability consistent with loss of LT HSC activity (Dykstra et al., 2007; Kent et al., 2009; Morita et al., 2010). Within the CD49b+ fraction, such lymphoid-dominant cells were categorized as ST/transient cells. Furthermore, CD49b+ cells described here were potent in generating platelets, erythrocytes, and lymphoid cells, but had low myeloid contribution, resulting in a lymphoid bias in the leukocyte compartment. Of note, the decline in platelet and erythrocyte reconstitution was associated with ST activity in lymphoid-dominant cells. There are several studies with findings compatible with the existence of multipotent LT L-bi HSCs (Challen et al., 2010; Dykstra et al., 2011; Oguro et al., 2013; Yamamoto et al., 2013).
Our results, however, suggested that LT L-bi HSCs are infrequent in young adult mice, and although CD49b+ cells could sustain LT repopulation, they are less durable than CD49b− cells.
CD49b has previously been used to identify and characterize ST (Wagers and Weissman, 2006) and IT HSCs (Benveniste et al., 2010) with finite self-renewal ability, indicating that CD49b expression is associated with reduced durability. Paradoxically, it has also been used to distinguish between reserve (CD49b−) and primed (CD49b+) HSCs (Zhao et al., 2019). While the previous studies showed opposing results in self-renewal ability of CD49b+ cells, lineage bias was not investigated. We show that the contradictory findings could partly be explained by the lack of CD150 in the immunophenotyping strategies, which greatly enriches for LT HSCs (Kiel et al., 2005). Although there is a degree of overlap, the CD49b+ cells in this study, which were identified from the primitive LSKCD34−CD48−CD150hi compartment, exhibit distinct differences compared with the CD49b+ cells in previous studies (Benveniste et al., 2010; Zhao et al., 2019), particularly in lineage bias. Our findings show that CD49b+ expression mainly defines L-bi cells and reconcile previous studies by showing that CD49b marks both HSCs and ST/transient cells.
The CD49b− subset was able to efficiently generate all other HSC subsets. The CD49b+ subset was less effective in generating both stem and progenitor cells, but was nevertheless still capable of giving rise to all phenotypically defined HSC subsets. This suggested that CD49b− cells are hierarchically superior to CD49b+ cells, but also that a degree of interconversion may occur, which remains to be confirmed through functional analysis.
Our data indicated that a degree of functional heterogeneity remains, especially within the CD49b+ subset. Although we were unable to detect any significant functional differences in vitro with CD41 and CD229 subfractionation, it remains to be determined whether these subfractions can resolve the residual functional heterogeneity in vivo.
Insights into the molecular mechanisms underlying HSC heterogeneity are largely lacking. Genome-wide expression analysis of the HSC population has previously shown heterogeneity among phenotypic HSCs, suggesting that transcriptional profiling may distinguish functionally diverse HSC subsets (Challen et al., 2010; Haas et al., 2018; Wilson et al., 2015). Surprisingly, we observed high transcriptional overlap on both the bulk and the single-cell level between functionally different CD49b− and CD49b+ cells. These findings suggested that functional differences between the HSC subsets may be determined by only a few genes or that RNA-seq could not reveal combinatorial consequences of small gene expression changes. To investigate the epigenetic changes associated with the subtle gene expression differences, we surveyed the genome-wide chromatin accessibility landscape of CD49b− and CD49b+ cells. We observed distinct profiles, which, in agreement with previous studies, differed predominantly in promoter distal regions (Martin et al., 2021; Yu et al., 2017). We found a general increase in open chromatin associated with processes of an activated and proliferative cellular state in CD49b+ cells. Conversely, in CD49b− cells, open chromatin regions were associated with processes involved in quiescence and dormancy. These results implied that CD49b− and CD49b+ cells may have different epigenetic configurations priming them for their distinct in vivo functional behavior. These findings highlight the need to uncover specific regulators and epigenetic mechanisms that directly affect HSC function and diversity.
Collectively, we have shown that CD49b can be used to further enrich LT M-bi HSCs to high purity, by segregating a subset of multipotent CD49b+ cells with lymphoid bias. Although L-bi cells were commonly associated with finite self-renewal, a small number of them could sustain LT repopulation, which correlated with the persistence of platelet and erythroid repopulation. Despite diverse functional characteristics, CD49b− and CD49b+ HSCs were transcriptionally similar but epigenetically different. Overall, our studies highlight the different facets of the complex structure of the HSC compartment, composed of diverse HSCs with distinct functional behaviors that are likely regulated through epigenetic mechanisms as they sustain life-long hematopoiesis.

EXPERIMENTAL PROCEDURES
See supplemental information for details.

Animals
Female and male C57BL/6J mice (8-17 weeks) were used. Gata-1 eGFP (Drissen et al., 2016) mice were backcrossed more than eight generations to a C57BL/6J background. All experiments were approved by the regional ethics committee.

Hematopoietic cell preparation
BM cell suspensions were prepared by bone crushing. Cells were Fc-blocked and stained with antibodies against cell-surface antigens (Table S3). For HSC detection, BM cells were enriched by CD117 immunomagnetic separation (Miltenyi Biotec) before antibody staining. Platelets, erythrocytes, and leukocytes were isolated from PB samples followed by antibody staining as described previously (Carrelha et al., 2018; Luc et al., 2016). See Table S4 for immunophenotypes.

In vitro assays
Myeloid and B cell potential was evaluated with OP9 co-culture assay (Luc et al., 2012). Megakaryocyte potential was evaluated by manually plating one cell/well into 60-well plates and assessed after 13 days. Erythroid potential was evaluated by plating 30 HSCs in complete methylcellulose (GF M3434; STEMCELL Technologies) and evaluated after 12 days with 2,7-diaminofluorene staining (Merck) (Luc et al., 2012). Cell division kinetics was assessed by tracking cell divisions of single cells on days 1-4 post-sorting (Luc et al., 2016). See Table S5 for culture conditions.

Cell-cycle and proliferation assays
Ki-67 staining was done with a Cytofix/Cytoperm Kit (BD Biosciences). One dose of BrdU was given by intraperitoneal injection (50 μg/g body weight, BD), followed by oral administration (800 μg/mL, Merck) for 3 days. BrdU visualization was performed with a BrdU Flow Kit (BD Biosciences).

(B) UCSC browser tracks of ATAC-seq signal for selected regions with differential accessibility between CD49b− and CD49b+ cells (left) and RNA expression of the genes proximal to the differential regions (right). RNA expression in individual samples is shown as dots and boxplots show the distribution in each population.

Experimental animals
Animals were bred and maintained at the Preclinical Laboratory, Karolinska University Hospital, and all experiments were approved by the regional ethical committee, Linköping ethical committee (ethical number 882). Female and male mice aged 8-17 weeks on a C57BL/6J background were used. B6.SJL-Ptprca Pepcb/BoyCrl and B6.SJL-Ptprca Pepcb/BoyJ mice (CD45.1) were used as primary and secondary recipients in transplantation experiments. Gata-1 eGFP¹ mice (CD45.2) were backcrossed >8 generations to a C57BL/6J background and were used as donor mice in transplantation experiments.

Hematopoietic cell preparation and staining
Bone marrow (BM) cell suspensions were prepared by crushing forelimbs, hindlimbs and hip bones into Phosphate-Buffered Saline (PBS, Gibco) with 5% FCS (Gibco) and 2 mM Ethylenediaminetetraacetic acid (EDTA, Merck). Unfractionated BM cells were either Fc-blocked with purified CD16/32 (BD Biosciences) or prestained with fluorophore-conjugated CD16/32 antibody and subsequently stained with antibodies against cell surface marker antigens (Table S3). For detection of hematopoietic stem cells (HSCs), unfractionated BM cells were enriched using CD117 MicroBeads (Miltenyi Biotec) followed by immunomagnetic separation of the cells and subsequently stained with antibodies against cell-surface markers. Immunophenotype definitions of hematopoietic populations are described in Table S4.
Peripheral blood (PB) was collected from the tail vein into lithium heparin-coated microvette tubes (Sarstedt), followed by platelet and erythrocyte isolation as performed in Carrelha et al.² Leukocytes were subsequently isolated from PB samples and stained with antibodies against cell-surface antigens as described previously³.

Flow cytometry experiments
Cell sorting experiments were performed using FACSAria™ Fusion or BD FACSAria™ III cell sorters (BD Biosciences) with a mean cell sorting purity of 95.3%±2.3%. Flow cytometry analysis was performed on LSR Fortessa™ or FACSymphony™ A5. Fluorescence minus one (FMO) controls were included in all flow cytometry experiments. Gates were set using FMO controls, backgating of the populations of interest or using internal controls (known negative and positive populations for the markers). Post-acquisition data analyses were done using the FlowJo software version 10 (BD Biosciences).
In single cell transplantation and single cell in vitro differentiation experiments, verification of single cell deposition into 96-well or 60-well plates was performed by sorting fluorescent beads.

Transplantation experiments
In all transplantation experiments, single cells or 5 cells were intravenously injected into lethally irradiated CD45.1 mice. The full irradiation dose was given as two split doses (2 × 600 cGy). Donor and recipient mice were sex matched. Single-cell transplantation experiments were performed as described in Carrelha et al.²
In primary transplantations, single or five HSCs were transplanted with 200,000-250,000 BM support cells (CD45.1). Secondary transplantations were performed with reconstituted primary recipient mice using 10 × 10⁶ unfractionated BM cells or 30-100 sorted CD49b− or CD49b+ HSCs with 200,000 BM support cells. Cells from one primary donor mouse were transplanted into 1-5 lethally irradiated secondary recipients. Peripheral blood analyses from transplanted mice were periodically performed between 2 and 6 months post-transplantation and up to 9 months in extended long-term experiments.

Reconstitution threshold levels and lineage bias
The total donor reconstitution level was calculated based on the frequency of CD45.2+ events in total white blood cells (CD45.1+ and CD45.2+ cells). The transplanted mice were considered reconstituted when the total donor contribution in white blood cells (CD45.2+) or in platelets (Gata-1 eGFP+) in the peripheral blood was ≥0.1% and represented by ≥10 events in the donor gate. The single-cell transplantation efficiency was calculated as the number of reconstituted mice at 2 months post-transplantation divided by the total number of transplanted mice. Mice that were found dead or sacrificed for animal welfare reasons before 2 months post-transplantation were excluded from the total number of transplanted mice.
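The reconstitution criterion and the transplantation efficiency described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `GateCounts` container, the function names, and the tuple layout for mice are hypothetical and are not taken from the study's analysis pipeline; only the thresholds (≥0.1% and ≥10 events) and the exclusion rule come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GateCounts:
    """Event counts for one blood compartment (hypothetical container)."""
    donor_events: int   # events in the donor gate (CD45.2+ or eGFP+)
    total_events: int   # all events in the parent population


def passes_threshold(gate: GateCounts, min_percent: float = 0.1,
                     min_events: int = 10) -> bool:
    """True when donor contribution is >= min_percent and >= min_events events."""
    if gate.total_events <= 0:
        return False
    percent = 100.0 * gate.donor_events / gate.total_events
    return percent >= min_percent and gate.donor_events >= min_events


def is_reconstituted(white_blood_cells: GateCounts, platelets: GateCounts) -> bool:
    """A mouse counts as reconstituted if either compartment passes the threshold."""
    return passes_threshold(white_blood_cells) or passes_threshold(platelets)


def transplantation_efficiency(mice: List[Tuple[bool, bool]]) -> Optional[float]:
    """Fraction of reconstituted mice among evaluable mice.

    Each entry is (reconstituted_at_2_months, excluded_before_2_months);
    excluded mice are removed from the denominator, as described in the text.
    """
    evaluable = [reconstituted for reconstituted, excluded in mice if not excluded]
    if not evaluable:
        return None
    return sum(evaluable) / len(evaluable)
```

Requiring both a minimum percentage and a minimum event count guards against calling a mouse positive on the basis of a handful of events in a small sample.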
The reconstitution of B, T, NK, and myeloid cells was calculated based on the frequency of CD45.2+ events within each blood cell lineage. The reconstitution of platelets and erythrocytes was calculated based on the frequency of eGFP+ cells within the blood lineages (Figure S2). Mice were considered positive for specific blood cell lineages when donor lineage reconstitution was ≥0.01% and represented by ≥10 events in the donor gate. Long-term (LT; ≥5-6 months) or short-term (ST; <5-6 months) repopulating activity was based on the presence or absence of myeloid or platelet and erythrocyte repopulation in the peripheral blood (≥0.01% CD45.2+CD11b+ or eGFP+CD41+CD150+ and eGFP+Ter-119+), 5-6 months following transplantation.
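The LT/ST call described above can be illustrated with a small sketch. It assumes one reading of the criterion, namely that a mouse has LT activity when it is positive for myeloid cells, or for both platelets and erythrocytes, at 5-6 months; the text does not spell out this grouping, so it is labeled as an assumption in the code. Function and parameter names are hypothetical.

```python
def lineage_positive(donor_events: int, lineage_events: int,
                     min_percent: float = 0.01, min_events: int = 10) -> bool:
    """True when donor cells make up >= min_percent of a lineage with >= min_events events."""
    if lineage_events <= 0:
        return False
    percent = 100.0 * donor_events / lineage_events
    return percent >= min_percent and donor_events >= min_events


def repopulating_activity(myeloid_positive: bool, platelet_positive: bool,
                          erythrocyte_positive: bool) -> str:
    """Classify a mouse as 'LT' or 'ST' from lineage positivity at 5-6 months.

    Assumption: 'myeloid or platelet and erythrocyte' is read as
    myeloid OR (platelet AND erythrocyte).
    """
    if myeloid_positive or (platelet_positive and erythrocyte_positive):
        return "LT"
    return "ST"


if __name__ == "__main__":
    # Hypothetical event counts for one mouse at 5-6 months post-transplantation.
    myeloid = lineage_positive(donor_events=120, lineage_events=50_000)
    platelets = lineage_positive(donor_events=4, lineage_events=200_000)
    erythrocytes = lineage_positive(donor_events=0, lineage_events=500_000)
    print(repopulating_activity(myeloid, platelets, erythrocytes))
```

If the intended grouping were instead (myeloid OR platelet) AND erythrocyte, only the boolean expression in `repopulating_activity` would need to change.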
Relative donor reconstitution levels were calculated based on the frequency of B, T, NK and myeloid cells within CD45.2+ cells. Lineage bias from single cell transplanted mice was categorized according to the ratio of lymphoid (L; including B, T and NK cells) to myeloid (M) cell contribution (L/M) in the peripheral blood at 5-6 months post-transplantation. As previously described 4, a lineage-balanced pattern has an L/M ratio of 6.0 ± 2.0. Lineage bias was therefore classified as myeloid-biased (L/M <4), lymphoid-biased (L/M >8) or lineage-balanced (L/M ≥4 and ≤8).
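The L/M classification can be written as a short function. The sketch below is illustrative only: the function names are hypothetical, the balanced window is read as 4 to 8 inclusive (consistent with 6.0 ± 2.0), and the handling of a zero myeloid contribution is an assumption that the text does not address.

```python
import math


def lm_ratio(b: float, t: float, nk: float, myeloid: float) -> float:
    """Ratio of lymphoid (B + T + NK) to myeloid contribution within donor cells."""
    lymphoid = b + t + nk
    if lymphoid == 0 and myeloid == 0:
        raise ValueError("no donor lymphoid or myeloid contribution")
    if myeloid == 0:
        # Assumption: no myeloid output is treated as an infinite L/M ratio.
        return math.inf
    return lymphoid / myeloid


def classify_lineage_bias(ratio: float) -> str:
    """Apply the L/M cut-offs: <4 myeloid-biased, >8 lymphoid-biased, else balanced."""
    if ratio < 4:
        return "myeloid-biased"
    if ratio > 8:
        return "lymphoid-biased"
    return "lineage-balanced"


# Made-up relative contributions (% of CD45.2+ cells), for illustration only.
balanced = classify_lineage_bias(lm_ratio(60, 20, 5, 15))  # L/M = 85/15
myeloid = classify_lineage_bias(lm_ratio(30, 10, 5, 55))   # L/M = 45/55
lymphoid = classify_lineage_bias(lm_ratio(70, 20, 5, 5))   # L/M = 95/5
```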
In HSC repopulation analyses, on average 2 million events were recorded per sample, and ≥10 events were used to determine whether mice were positively reconstituted for HSCs (Figures S2C and S5B).

In vitro assays
To assess myeloid and B cell lineage potentials of HSC subsets in vitro, OP9 co-culture experiments were performed with the OP9 cell line as described previously 5, and assessed after 3 weeks of culture.
Megakaryocyte and erythrocyte potentials of HSC subsets were evaluated as previously described 5. Briefly, megakaryocyte potential was evaluated by manually plating 1 cell/well into 60-well MicroWell MiniTrays (Nunc, ThermoScientific); after 13 days of culture, wells were scored under an inverted microscope for the presence or absence of megakaryocytic cells. Erythroid potential was evaluated by plating 30 HSCs in complete methylcellulose (GF M3434; StemCell Technologies). Cultures were evaluated for erythroid colonies after 12 days with 2,7-diaminofluorene staining (Merck) as previously described 5.
Cell division kinetics of single cell sorted HSC subsets were analyzed as previously described 3. Briefly, single cells were sorted directly into 60-well MicroWell MiniTrays. The number of cells and their cell divisions in the wells were scored using an inverted microscope at 24, 48, 72 and 96 hours post-sorting.
Cell culture conditions for different in vitro assays are described in Table S5.

Cell cycle and cell proliferation assays
Cell cycle analysis by Ki-67 staining was performed with the BD Cytofix/Cytoperm Kit (BD Biosciences). Cell proliferation analysis was performed by the 5-Bromo-2'-deoxyuridine (BrdU) incorporation assay 3, with one intraperitoneal injection of BrdU (50 µg/g bodyweight) followed by administration of BrdU (Merck) in the drinking water (800 µg/ml) for three days. BrdU visualization was performed with the BD BrdU Flow Kit (BD Biosciences) according to the manufacturer's instructions.

RNA-sequencing
For RNA-sequencing, 250 or 500 cells from CD45.1 mice were FACS-sorted into 5 or 10 µl of Single cell lysis solution containing DNAse I (Single cell lysis kit, Invitrogen/ThermoFisher Scientific), and after 15 minutes of incubation the reaction was stopped by adding stop solution according to the manufacturer's protocol. cDNA synthesis for strand-specific RNA-sequencing libraries was done on RNA from 250-500 cells using the NEBNext Ultra II RNA First Strand Synthesis (New England BioLabs) and NEBNext Ultra II Directional RNA Second Strand Synthesis (New England BioLabs) modules, in combination with the QIAseq FastSelect rRNA HMR kit (Qiagen) to block rRNA. Custom-made Tn5 (transposase) and replacement index primers were used for library preparation. Libraries were pooled and paired-end sequenced (2 × 41 cycles) using the NextSeq 500 system (Illumina, San Diego, CA). Paired-end reads were mapped to the mm10 reference genome using STAR (v.2.5.2b) 6. Data from technical replicates, derived from the same biological sample, were merged when available. Quantification of reads in exons was done using HOMER 7. Data were normalized for sequencing depth by converting to transcripts per million (TPM) in R. Genes with ≥1 TPM in ≥3 samples were considered expressed and used for analysis. For PCA analysis and visualization of gene expression in heatmaps and boxplots, log2(TPM+1) values were used. Correlation between samples was calculated using log10-transformed and quantile-normalized data. Differential