T-ALL leukemia stem cell 'stemness' is epigenetically controlled by the master regulator SPI1

Leukemia stem cells (LSCs) are regarded as the origins and key therapeutic targets of leukemia, but limited knowledge is available on the key determinants of LSC ‘stemness’. Using single-cell RNA-seq analysis, we identify a master regulator, SPI1, the LSC-specific expression of which determines the molecular signature and activity of LSCs in the murine Pten-null T-ALL model. Although initiated by PTEN-controlled β-catenin activation, Spi1 expression and LSC ‘stemness’ are maintained by a β-catenin-SPI1-HAVCR2 regulatory circuit independent of the leukemogenic driver mutation. Perturbing any component of this circuit either genetically or pharmacologically can prevent LSC formation or eliminate existing LSCs. LSCs lose their ‘stemness’ when Spi1 expression is silenced by DNA methylation, but Spi1 expression can be reactivated by 5-AZ treatment. Importantly, similar regulatory mechanisms may also be present in human T-ALL.


Introduction
Acute T cell lymphoblastic leukemia (T-ALL) is an aggressive hematological malignancy caused by the accumulation of genetic mutations and altered signaling pathways that affect normal T cell development (Belver and Ferrando, 2016; Ferrando and López-Otín, 2017; Girardi et al., 2017). Current treatment for T-ALL includes high-intensity combination chemotherapies. However, such treatment may cause short- and long-term side effects, and up to 20% of pediatric and 40% of adult T-ALL patients relapse (Atak et al., 2013; Girardi et al., 2017). Leukemia stem cells (LSCs) are considered to be one of the main causes of drug resistance and therapeutic relapse (Batlle and Clevers, 2017; Blackburn et al., 2014; Chiu et al., 2010; Visvader, 2011). Like hematopoietic stem cells, LSCs can self-renew and differentiate into leukemic blast cells (Bonnet and Dick, 1997; Reya et al., 2001), which makes them ideal candidates for high-efficiency and low-toxicity targeted therapies. However, many questions related to the control mechanisms of LSCs and cancer stem cells (CSCs) in general remain unanswered.
One question is how CSCs maintain 'stemness'. Although many driver mutations and dysregulated pathways have been identified in cancers, these are unlikely to be the only mechanisms that maintain CSC 'stemness', since the same driver mutations or dysregulated pathways are also present in most cancer cells. One good example is the Pten-null T-ALL model that we generated by the conditional deletion of the Pten tumor suppressor gene in fetal liver hematopoietic stem cells (Guo et al., 2008). In this model, LSCs are enriched in the Lin⁻CD3⁺KIT mid cell subpopulation; these cells are self-renewable and responsible for T-ALL initiation and drug resistance (Guo et al., 2008; Guo et al., 2011; Schubbert et al., 2014). However, since both LSC-enriched and leukemic blast subpopulations share similar genetic alterations, including Pten loss and Tcra/d-Myc translocation (Guo et al., 2008), these driver mutations are unlikely to determine LSC 'stemness'. Furthermore, treating the Pten-null T-ALL model with PI3K inhibitors is effective only before the onset of leukemia, not after leukemia is already underway (Guo et al., 2011; Schubbert et al., 2014), suggesting that this driver mutation is not responsible for the maintenance of LSC 'stemness' once it has been generated.
A related question is how CSCs lose 'stemness' and whether this process is unidirectional or reversible. Such plasticity or reversibility may contribute to some of the conflicting results in the literature regarding the nature and frequency of CSCs (Batlle and Clevers, 2017). As small-molecule inhibitors of epigenetic modifiers have been developed and applied to cancer treatments (Topper et al., 2017), understanding the nature of CSC maintenance may bear important clinical implications.
Using the Pten-null T-ALL model, we identify a master regulator, SPI1, and a β-catenin-SPI1-HAVCR2 regulatory circuit that are responsible for LSC 'stemness' maintenance. This 'stemness' maintenance circuit is initiated by the leukemogenic driver mutations, that is, PTEN loss and PI3K-mediated β-catenin activation, but after it is formed, it becomes independent of the driver mutation and the associated PI3K pathway. Furthermore, SPI1's LSC-specific expression is silenced by DNA methylation, resulting in the loss of LSC 'stemness'. Our study also provides the fate mapping of leukemia development from LSCs to leukemic blasts at single-cell resolution and identifies potential novel targets for LSC-mediated therapies.

Redefine heterogeneous LSCs at single-cell resolution
We reported previously that the LSC-enriched Lin⁻CD3⁺KIT mid subpopulation in the Pten-null T-ALL model contains heterogeneous cells, of which 30% are MYC low, rapamycin- and JQ1 (a BRD4 inhibitor)-resistant, and relatively quiescent in terms of cell cycle (Guo et al., 2008; Schubbert et al., 2014) (Figure 1-figure supplement 1A). To further define this heterogeneous subpopulation, we isolated LSC-enriched and blast subpopulations for RNA-seq analysis (Figure 1-figure supplement 1B, upper panel) and identified one module with an LSC high–Blast 0 expression pattern by Weighted Gene Co-expression Network Analysis (WGCNA) (Zhang and Horvath, 2005) (Figure 1A, yellow module). Approximately 45% of the genes in this module encode membrane proteins such as Havcr2 (HAVCR2) and Itgax (ITGAX) (Figure 1B-C). Although Havcr2 and Itgax are only expressed in the LSC-enriched subpopulation, the expression levels of these genes vary among different isolates (Figure 1C), which may reflect the heterogeneity of the LSC-enriched subpopulation. The cell surface expression levels of HAVCR2 and ITGAX, as measured by FACS analysis, are highly correlated and can further separate the previously identified Lin⁻CD3⁺KIT mid LSC-enriched subpopulation into several subgroups (Figure 1D, upper panel), among which the HAVCR2 high or HAVCR2 high ITGAX high subgroups are most abundant in the thymus, the critical organ for T cell development and T-ALL initiation (Guo et al., 2008; Guo et al., 2011) (Figure 1D, lower panel).
To determine whether these heterogeneous groups are organized hierarchically from LSCs to blasts during T-ALL development, we conducted single-cell RNA-seq analysis and identified four subgroups (Figure 1E; Figure 1-figure supplement 1B, lower panel; Figure 1-figure supplement 2). Pseudotime analysis (Trapnell et al., 2014) further indicates that LSCs follow a continuous developmental path towards blasts, progressing from HAVCR2 high through HAVCR2 mid and HAVCR2 low to blasts (Figure 1F), which can also be visualized by pseudotime analysis of Havcr2 and Itgax expression (Figure 1G). Collectively, these results confirm the heterogeneity of the previously identified LSC-enriched subpopulation and provide fate mapping of LSC differentiation into blasts at single-cell resolution.
Figure 1 continued. [...] T-ALL mice; (D) Upper panel: FACS plots are overlaid to show the differential expression of HAVCR2 and ITGAX in the LSC and blast subpopulations. The previously defined Lin⁻CD3⁺KIT mid LSC-enriched subpopulation (in the red box in the left panel) can be further separated into several subgroups based on the expression of the cell-surface markers HAVCR2 and ITGAX. The Lin⁻CD3⁺KIT⁻ leukemic blast subpopulation (in the blue box in the left panel) does not express HAVCR2 or ITGAX. Lower panel: Quantitative measurement of the HAVCR2 high, HAVCR2 mid and HAVCR2 low subgroups in different hematopoietic organs from Pten-null T-ALL mice (n = 5; *, p<0.05). The HAVCR2 high subgroup is enriched in the thymus; (E) PCA analysis of the single-cell transcriptome shows four subgroups, labeled in different colors. Cells from two independent mice are indicated by different shapes; (F) Pseudotime analysis shows the expression profiles of T-ALL cells in 2-D component space. The solid black line shows the main differentiation path from HAVCR2 high (purple) to blasts (dark green); (G) Pseudotemporal ordering of single cells based on Havcr2 or Itgax expression. BM: bone marrow. DOI: https://doi.org/10.7554/eLife.38314.002. Figure supplements are available for Figure 1.

The HAVCR2 high subgroup contains the vast majority of LSC activity
Single-cell transcriptome analysis indicates that the HAVCR2 high subgroup is enriched in the hematopoietic stem cell/late progenitor pathways and is relatively quiescent, while the blast subpopulation is enriched in Myc lymphoma pathways and active in the cell cycle (Figure 2A). Consistent with this observation, the HAVCR2 high subgroup also has the lowest c-MYC level among the four subgroups (Figure 2B), suggesting that the HAVCR2 high subgroup may contain the MYC low cells within the previously defined Lin⁻CD3⁺KIT mid LSC-enriched subpopulation (Guo et al., 2008; Schubbert et al., 2014).
To determine whether HAVCR2 high cells are true LSCs, we performed limiting dilution and bone marrow transplantation analyses using 10 to 1000 bone marrow cells from the Lin⁻CD3⁺KIT mid, Lin⁻CD3⁺KIT mid HAVCR2 high, and Lin⁻CD3⁺KIT mid HAVCR2 low subgroups (Figure 3A). Cells from the HAVCR2 high subgroup have the highest leukemia-initiating capacity: nearly every Lin⁻CD3⁺KIT mid HAVCR2 high cell is capable of inducing T-ALL development, compared to 1/14 of the cells in the Lin⁻CD3⁺KIT mid subgroup and 1/28 of the cells in the Lin⁻CD3⁺KIT mid HAVCR2 low subgroup (Figure 3B). Consistent with these findings, cells from the HAVCR2 high subgroup can also induce T-ALL lethality much earlier than cells from the other two subgroups (Figure 3C). Thus, HAVCR2 is a novel surface marker for the isolation of pure LSCs, and the HAVCR2 high subgroup represents the true LSC population in the Pten-null T-ALL model (Table 1).
Figure 3. The HAVCR2 high subgroup contains the vast majority of LSC activity. (A) Schematic illustrating the cell isolation, limiting dilution and transplantation procedures used for testing LSC activity, as described in Guo et al. (2008); (B) LSC frequencies were calculated for each subgroup according to Hu and Smyth (2009); (C) Survival curves showing LSC activity in each of the sorted subgroups upon transplantation (n = 4). Student's t-test was used to calculate the p-value. DOI: https://doi.org/10.7554/eLife.38314.007
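The LSC frequencies discussed above are derived from limiting dilution transplantation. As a rough illustration of how such a frequency can be estimated, the Python sketch below fits the single-hit Poisson model, in which a recipient that receives d cells stays leukemia-free with probability exp(-f·d), by maximum likelihood. It is a minimal sketch under that assumption and is not the ELDA implementation of Hu and Smyth (2009); the function name, the search bounds and any counts passed to it are hypothetical and are not data from this study.

```python
import math


def estimate_lsc_frequency(doses, tested, positive, lo=1e-9, hi=1.0):
    """Maximum-likelihood LSC frequency under a single-hit Poisson model.

    doses    -- cells injected per recipient in each dose group
    tested   -- recipients transplanted in each dose group
    positive -- recipients that developed leukemia in each dose group
    Returns the estimated fraction of leukemia-initiating cells,
    clamped to the (hypothetical) search interval [lo, hi].
    """

    def score(f):
        # Derivative of the log-likelihood with respect to f.
        total = 0.0
        for d, n, r in zip(doses, tested, positive):
            p_pos = -math.expm1(-f * d)   # P(recipient develops leukemia)
            p_neg = 1.0 - p_pos           # P(recipient stays leukemia-free)
            total += r * d * p_neg / p_pos - (n - r) * d
        return total

    if score(lo) <= 0:   # no evidence for LSCs above the lower bound
        return lo
    if score(hi) >= 0:   # data compatible with every cell being an LSC
        return hi
    # The score decreases monotonically in f, so bisect on a log scale.
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

Calling, for example, `estimate_lsc_frequency([10, 100, 1000], [4, 4, 4], [1, 3, 4])` with made-up counts returns a fraction f, and 1/f gives the familiar "1 in N cells" figure. A full analysis would also report confidence intervals and test the single-hit assumption, which this sketch omits.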

SPI1 is the master regulator of LSC signature genes
The identification of HAVCR2 high cells as the true LSC population allows us to define the key determinant for LSC 'stemness'. We used network component analysis (Tran et al., 2012), in which the activity of transcription factors can be deduced based on the expression levels of their target genes. Among the predicted transcription factors (Liberzon et al., 2011) that may control the expression of HAVCR2 high LSC signature genes, SPI1 scores the highest (data not shown). Importantly, approximately 70% of the HAVCR2 high LSC signature genes overlap with SPI1 target genes identified during T cell development (Zhang et al., 2012) (Figure 4A). Therefore, we decided to focus our subsequent analysis on SPI1.
Since the HAVCR2 high MYC low phenotype signifies LSCs, we first examined the correlation of Spi1, Havcr2 and Myc expression in HAVCR2 high and blast cells. The pseudotemporal ordering of the single-cell RNA-seq data and the FACS analyses demonstrate that Spi1 expression is highest in the HAVCR2 high subgroup, which is opposite to the differential expression of Myc (Figure 4B-C).
We further investigated whether SPI1 could transcriptionally regulate Havcr2 and Myc expression by conducting SPI1 ChIP-qPCR analysis on Spi1-Egfp stably transformed blasts, using Egfp-transfected blasts as a control (Figure 4-figure supplement 1). SPI1 binds strongly to Havcr2 promoter region 2 (Zhu et al., 2015) (Figure 4D) and the Tcra enhancer (EA) region in the translocated Tcra/d-Myc allele (Figure 4E), as well as to the E2 region of the WT allele (Shi et al., 2013) (Figure 4F), suggesting that it may have regulatory effects on both genes. The overexpression of Spi1 in T-ALL blast cells significantly increases the expression of Havcr2 and other known SPI1 target genes, such as Itgax and Lmo2 (Champhekar et al., 2015; Turkistany and DeKoter, 2011; Yashiro et al., 2017), but downregulates Myc mRNA and protein levels (Figure 4G-H). In contrast, SPI1 knockdown in a human T-ALL cell line downregulates the expression of SPI1 target genes but upregulates MYC expression (Figure 4I). Importantly, the positive correlation between SPI1 and the expression of HAVCR2 as well as that of SPI1 target genes such as ITGAX and LMO2 can be found in human T-ALL datasets (Van Vlierberghe et al., 2011) (Figure 5), suggesting that the regulation of HAVCR2 expression by SPI1 could play an important role in human T-ALLs.

SPI1 is essential for LSC 'stemness' and T-ALL development
SPI1 is an ETS domain-containing transcription factor critical for early T cell progenitor function (Zhang et al., 2012), and its overexpression or translocation induces T progenitor cell proliferation and blocks differentiation (Anderson et al., 2002; Seki et al., 2017), similar to the effects we [...] and T-ALL development, we conditionally deleted Spi1 in the Pten-null T-ALL model. Kaplan-Meier survival analysis shows that the lethality caused by T-ALL is delayed in proportion to the number of Spi1 alleles that are deleted (Figure 6A). The tissue architectures of the thymus and spleen appear normal, and no infiltrating leukemia cells can be detected in the liver of the mutant mice (dKO) (Figure 6B). FACS analyses also show the absence of HAVCR2 high LSCs and CD3⁺ blasts in the thymus, spleen and bone marrow (BM) of the dKO mice (Figure 6C-D). Spi1 deletion can also restore spleen weight and organ morphology (Figure 6B; Figure 6E). Notably, the lethality seen in the compound homozygotes after 3 months is at least partially due to myeloid abnormalities, a known phenotype associated with SPI1 loss in the myeloid lineage (Dakic et al., 2007; Rosenbauer et al., 2004; Steidl et al., 2006) (data not shown).
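The survival comparison above relies on the Kaplan-Meier estimator. For readers unfamiliar with it, the following Python sketch shows the product-limit calculation in its simplest form; the function name and any survival times passed to it are hypothetical and do not reproduce the data behind Figure 6A.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  -- follow-up time of each animal (any consistent unit)
    events -- 1 if the animal died at that time, 0 if it was censored
    Returns a list of (time, survival_probability) pairs, one per
    distinct time at which at least one death occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all animals sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored animals both leave the risk set.
        n_at_risk -= leaving
    return curve
```

Each genotype would be passed through the function separately to obtain its own curve; comparing curves statistically (for example with a log-rank test) is a separate step that this sketch does not cover.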
To confirm that the absence of T-ALL in dKO mice is not due to a block in T cell development in the Pten; Spi1-null T progenitor cells (Champhekar et al., 2015; Spain et al., 1999), we first quantified CD3⁺ T cells in the WT, Pten-null and dKO mice and found relatively normal numbers of CD3⁺ cells in the dKO thymus (Figure 6F). We then crossed dKO mice with mice of the Rosa26 loxp-stop-loxp-LacZ reporter line so that LacZ expression could be used to trace the behavior of cells with Cre-mediated deletion of Pten and Spi1 (Guo et al., 2008; Guo et al., 2011). Our FACS-Gal analysis shows that, like LacZ⁻ WT cells (blue), LacZ⁺ dKO cells (red) in the same animals can undergo differentiation to become CD4⁺CD8⁺ double-positive T cells (Figure 6G). These results suggest that PI3K activation can rescue the T cell developmental block in Spi1-null T cell progenitors (Champhekar et al., 2015; Spain et al., 1999), similar to the findings in our previous report on Pten; Rag-null mice (Guo et al., 2011). Therefore, SPI1 is essential for Pten-null LSC 'stemness' and T-ALL development.
Figure 6 continued. [...] WT and Pten/Spi1 double knockout mice were 3 months old, and Pten-null T-ALL mice were 2 months old. n = 3; (G) FACS-Gal analysis of T cell development in the thymus of Pten/Spi1 double knockout mice. LacZ⁺ cells (red dots) and LacZ⁻ cells (blue dots) from the same sample are overlaid. C-D: the data are presented as the means ± S.D.; *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001. The bars in the HE images and inserts represent 1000 µm and 50 µm, respectively. DOI: https://doi.org/10.7554/eLife.38314.011

Spi1 is upregulated at the ETP/DN1 stage during T cell development
The essential role of SPI1 in regulating LSC signature genes and 'stemness' prompted us to investigate how Spi1 is regulated in the Pten-null T-ALL model. During T cell development, Spi1, with other T progenitor cell factors and growth factor receptors such as Bcl11a, Lmo2, Flt3 and Kit, is highly expressed at the early T progenitor (ETP) and double-negative 1 (DN1) stage and is then immediately downregulated during T cell commitment (Zhang et al., 2012) (Figure 7A, upper panel). Interestingly, our pseudotemporal ordering of the single-cell RNA-seq data indicates that the expression patterns of Spi1 and these factors and receptors are largely unchanged in the Pten-null T-ALL model compared to normal T cell development (Figure 7A, lower panels). Furthermore, these factors and receptors are highly expressed in the HAVCR2 high subgroup and downregulated in the HAVCR2 mid and HAVCR2 low subgroups, suggesting that HAVCR2 high SPI1 high LSCs may be generated at the ETP/DN1 stage (Figure 7A, lower panels). Indeed, when we crossed Spi1-GFP reporter mice to Pten-null T-ALL model mice, we found that Spi1-GFP expression is significantly upregulated at the ETP/DN1 stage (Figure 7B).

A β-catenin-SPI1-HAVCR2 regulatory circuit is required for Spi1 upregulation and LSC 'stemness'
β-Catenin is an important transcription factor regulating Spi1 expression in the T cell lineage. Previous work by us and others suggests that β-catenin is critical for LSC self-renewal (Guo et al., 2008) and RAG-dependent aberrant TCR rearrangement (Dose et al., 2014; Guo et al., 2011), a mechanism underlying the recurrent Tcra/d-Myc translocation caused by PTEN loss or β-catenin activation observed in T-ALLs (Guo et al., 2008; Kaveri et al., 2013). Indeed, the overexpression of β-catenin in a human T-ALL cell line leads to the significantly increased expression of SPI1 from its endogenous promoter and subsequently promotes the expression of its target gene HAVCR2 but downregulates MYC expression (Figure 8A).
PTEN loss or PI3K/AKT activation is known to activate β-catenin by phosphorylating GSK-3β and preventing GSK-3β-mediated β-catenin degradation (Dan et al., 2008; Kikushige et al., 2015; Persad et al., 2001). Although the HAVCR2 high, HAVCR2 low and blast subgroups have similar levels of P-GSK-3β due to PTEN loss (Figure 8B, upper and lower panels), the HAVCR2 high subgroup has a much higher level of non-phospho-β-catenin (the active form of β-catenin) and SPI1 than the HAVCR2 low and blast subgroups in vivo (Figure 8C and E, upper and lower panels), suggesting that SPI1-mediated LSC formation may depend on mechanisms other than the oncogenic driver mutation PTEN loss.
HAVCR2 signaling can activate NFκB and β-catenin and promote AML LSC formation and self-renewal (Kikushige et al., 2015). Since we identified HAVCR2 as a SPI1 target gene, we hypothesized that HAVCR2 signaling may in turn activate Spi1 expression and promote T-ALL LSC formation. Intracellular FACS analyses show that among the four subgroups, the HAVCR2 high subgroup, which has the highest Spi1 expression, also has the highest level of both phospho-p65 and non-phospho-β-catenin (Figure 8D-E, upper and lower panels), suggesting that HAVCR2 signaling may play an important role in the hyperactivation of NFκB and β-catenin. Consistent with this hypothesis, the genetic deletion of Spi1 can prevent both HAVCR2 high LSC formation at the ETP/DN1 stage and T-ALL development (Figure 8F). The pharmacological inhibition of β-catenin activation by the novel tankyrase inhibitor BAY6060, but not the inhibition of PI3K activity by BAY1082439 alone (Hill et al., 2017), can also significantly reduce the number of HAVCR2 high LSCs in vivo in late-stage T-ALL (Figure 8G). Together, these results suggest that although Spi1 upregulation is initiated by PTEN loss, SPI1-mediated LSC formation and 'stemness' are maintained by the β-catenin-SPI1-HAVCR2 regulatory circuit.

LSCs lose their 'stemness' when Spi1 expression is silenced by DNA methylation
How cancer stem cells lose 'stemness' and whether this process is unidirectional or reversible are currently unknown. Since Spi1 expression is drastically reduced from the HAVCR2 high stage to the HAVCR2 low stage (Figure 7A, lower panel), we hypothesized that a Spi1 silencing mechanism may explain the loss of LSC 'stemness' during differentiation. DNA methylation is one of the major epigenetic mechanisms regulating gene expression during normal development. Although the global methylation patterns across the LSC signature and blast signature genes are similar (Figure 9A-B), the Spi1 promoter is significantly hypomethylated in LSCs compared to blasts and normal T cell controls (Figure 9C). Consistently, the 4 CpG islands on the Spi1 promoter (Fernández-Nestosa et al., 2013) are not methylated in the HAVCR2 high subgroup but gradually become methylated in the HAVCR2 mid and HAVCR2 low subgroups and are completely methylated in blasts (Figure 9D), which may explain the trend in Spi1 expression and Spi1-controlled Havcr2 and Itgax expression (Figure 4B, upper panel; Figure 1G). Conversely, treating leukemic blasts with the DNMT inhibitor 5-AZ can increase the expression of Spi1 and its regulated LSC signature genes in vitro (Figure 9E) and induce the SPI1⁺ and MYC low subgroups in vivo (Figure 9F), demonstrating that Spi1 expression is reversibly regulated by DNA methylation, which in turn regulates LSC signature gene expression.
To test the relevance of our findings to human T-ALL, we used two human T-ALL cell lines, KE-37 and CEM (Burger et al., 1999; Tatetsu et al., 2007). KE-37 expresses SPI1 and HAVCR2, while CEM does not (Figure 10A), consistent with the methylation status of the SPI1 promoter (Figure 10B). 5-AZ treatment can upregulate the expression of SPI1 and its target HAVCR2 but downregulate c-MYC expression in CEM cells, similar to the effects of our blast treatment, while no change can be detected in KE-37 cells (Figure 10C), demonstrating that SPI1 expression is also regulated by DNA methylation in human T-ALL. To test whether the leukemogenic activity could be modulated by SPI1 expression in human T-ALL cell lines, we injected placebo- or 5-AZ-treated CEM cells and monitored the T-ALL development induced by these cells in vivo. 5-AZ treatment significantly accelerated T-ALL development (Figure 10D). However, cell lines are not the best model system for studying LSC activity, and the essential role of SPI1 in regulating LSC activity in human T-ALL needs to be determined using patient samples and PDX models.
Figure 8. [...] The data are normalized to that of empty plasmid controls (blue bars); (B-E) Upper panels: quantitative intracellular FACS analyses of P-GSK-3β, non-phospho-β-catenin, P-p65 and SPI1 levels in the HAVCR2 high, HAVCR2 low and blast subgroups; lower panels: representative intracellular FACS analysis of P-GSK-3β, non-phospho-β-catenin, P-p65 and [...]

Cotargeting oncogenic driver mutations and the LSC 'stemness' maintenance circuit
We previously reported that treating Pten-null T-ALL model mice with a PI3K inhibitor is effective only at the preleukemia stage, not after leukemia has developed (Guo et al., 2011; Schubbert et al., 2014), suggesting the importance of cotargeting the LSC 'stemness' maintenance pathway once LSCs have been generated. Since SPI1 is essential for LSC formation and SPI1 expression is regulated and maintained by the β-catenin-SPI1-HAVCR2 regulatory circuit, we hypothesized that targeting any component of this circuit in combination with a PI3K inhibitor may effectively eliminate existing T-ALL cells.
To test this hypothesis, we first treated age-matched leukemia-stage Pten-null T-ALL mice with DB1976 (Figure 11-figure supplement 1A-B), a compound known to specifically disrupt the interactions between SPI1 and its targets (Antony-Debré et al., 2017; Munde et al., 2014; Stephens et al., 2016). DB1976 can significantly inhibit the expression of Havcr2 and other SPI1 target genes in vitro (Figure 11A) and reduce the number of HAVCR2 high LSCs in vivo (Figure 11B, left panel), confirming that SPI1 is important not only for LSC formation but also for LSC maintenance. However, only when combined with a debulking anti-PI3K agent such as rapamycin (Guo et al., 2008) could DB1976 significantly reduce the leukemia burden, as demonstrated by the nearly complete absence of leukemic blasts in the hematopoietic organs (Figure 11B, right panel). Consequently, combination treatment can markedly prolong the animal lifespan (Figure 11C), restore the spleen weight and morphology, and eliminate infiltrating leukemic cells in the lung, kidney and liver without a significant change in animal body weight (Figure 11D-E; Figure 11-figure supplement 1C). Similar results were obtained when we replaced DB1976 and rapamycin with BAY6060 and BAY1082439, respectively (Figure 11D-E; Figure 11-figure supplement 1D). BAY1082439 can inhibit PI3Kδ, which is essential for Pten-null leukemia (Subramaniam et al., 2012), at nanomolar concentrations (Hill et al., 2017). The inhibition of tankyrase by BAY6060 can significantly reduce β-catenin activity and consequently decrease Spi1 expression and the number of HAVCR2 high LSCs in vivo (Figure 12A). In combination, BAY6060 and BAY1082439 could significantly prolong the animal lifespan and almost completely eliminate LSCs and blasts (Figure 12B, Figure 11E and Figure 8G).
Compared with β-catenin and SPI1, HAVCR2 may be a better therapeutic target as it is normally not expressed in hematopoietic stem and progenitor cells (Kikushige et al., 2010), and inhibition of HAVCR2 would therefore be less toxic. An anti-HAVCR2 antibody has been used clinically in immunotherapy and in targeting AML LSCs (Kikushige et al., 2010; Koyama et al., 2016). When combined with rapamycin, the anti-HAVCR2 antibody showed a therapeutic effect similar to that seen for DB1976/rapamycin and BAY6060/BAY1082439 combinations (Figure 11D-E and Figure 12C; Figure 11-figure supplement 1E). Together, these results suggest that inhibiting any component in the β-catenin-SPI1-HAVCR2 regulatory circuit will inhibit LSC 'stemness' maintenance and lead to the effective elimination of HAVCR2-positive T-ALL cells in the presence of an effective debulking agent targeting the PI3K pathway, such as rapamycin or BAY1082439.

Discussion
Our study suggests that two layers of control mechanisms may play essential roles in leukemogenesis (Figure 13). The first layer is driven by the loss of the PTEN tumor suppressor or the activation of the PI3K pathway, which leads to β-catenin activation, Tcra/d-Myc translocation and T-ALL development. The second layer is controlled by the master regulator SPI1, which determines LSC signature gene expression and maintains LSC 'stemness' (Figures 4-6). SPI1 upregulation is initiated by PI3K-controlled β-catenin activation, while the LSC-specific expression of SPI1 is reinforced by the β-catenin-SPI1-HAVCR2 regulatory circuit (Figure 8). Once formed, LSCs are very sensitive to any perturbation of this regulatory circuit but are less dependent on the PI3K pathway, as inhibiting the PI3K pathway at the leukemia stage has little effect on the LSC number (Figures 11-12) (Guo et al., 2008; Schubbert et al., 2014). SPI1 is silenced by DNA methylation, which leads to the downregulated expression of LSC signature genes, the loss of LSC 'stemness' and leukemic differentiation (Figures 9-10). Although the PTEN loss and Tcra/d-Myc translocation in the first layer of the leukemogenesis mechanism are hardwired and present in both LSCs and leukemia blasts, the SPI1 expression and maintenance in the second layer of the LSC 'stemness' mechanism are reversible and present [...] (Figure 13). Similar two-layer control mechanisms may also be present in other types of cancer in which CSCs are known to play essential roles.
This two-layer model may have important implications for LSC-targeted therapies. First, targeting driver mutations or dysregulated pathways in the first layer may be sufficient for debulking the leukemia mass but not for eliminating LSCs unless the mechanism for maintaining LSC 'stemness' is simultaneously inhibited (Figures 11-12). Second, since the expression of the LSC master regulator SPI1 can be reversibly regulated by epigenetic mechanisms (Figures 9-10), this model would predict poorer outcomes if leukemia controlled by such a mechanism was treated with 5-AZ or similar agents and would suggest that the reactivation of SPI1 expression could be a potential mechanism for LSC-mediated therapeutic resistance.
A broad spectrum of epigenetic and genetic alterations has been found in virtually all cancer types. In certain cases, mutations within the epigenetic control machinery can influence global gene expression and cause subsequent cancer heterogeneity and clonal diversity; in other cases, epigenetic mechanisms may act on a specific transcription factor. Although we did not detect significant global methylation differences between LSC signature genes and blast signature genes, SPI1, the master regulator found in this study, is specifically methylated during differentiation from the HAVCR2 high to the HAVCR2 low phenotype (Figure 9), resulting in the downregulation of LSC signature gene expression. The mechanism that controls the specific methylation of SPI1 is currently unknown, but we predict that a similar mechanism may also regulate SPI1 silencing during T cell commitment (Zhang et al., 2012). The alteration of this silencing mechanism may lead to a block of T cell development and contribute to early progenitor types of T-ALL, such as ETP-T-ALL.
The identification of specific markers expressed only in LSCs is essential for isolating pure LSCs and studying their control mechanisms. Using an advanced single-cell sequencing technique, we identified HAVCR2 as an LSC-specific biomarker that can be used to isolate 'pure' LSCs, as determined by our limiting dilution and transplantation experiments (Figures 1-3). Most of the cell surface markers currently used to isolate LSCs or CSCs are irrelevant to the function of LSCs or CSCs. HAVCR2 is not just another biomarker but is an important regulator of the function of LSCs in Pten-null T-ALL. HAVCR2 is directly regulated by SPI1 and serves as an important component of the β-catenin-SPI1-HAVCR2 regulatory circuit, which is essential for maintaining the LSC-specific expression of SPI1 and LSC 'stemness' (Figures 4-5 and 7-8). HAVCR2 can also serve as an LSC-specific target (Figures 11-12); this finding is similar to those in recent AML publications (Kikushige and Akashi, 2012; Kikushige et al., 2015).
PTEN and the PI3K/AKT/mTOR pathway controlled by PTEN are critical for the etiology of human T-ALL (Gutierrez et al., 2009; Larson Gedman et al., 2009; Liu et al., 2017; Maser et al., 2007; Palomero et al., 2007), and our study may illuminate the understanding and treatment of T-ALLs associated with PTEN loss or PI3K activation. We demonstrate that SPI1 expression is upregulated by β-catenin and silenced by DNA methylation in human T-ALL cell lines, similar to the findings in the Pten-null T-ALL model (Figures 9-10). SPI1 also controls the expression of HAVCR2 and other LSC signature genes in human T-ALL cell lines and clinical samples (Figure 5). However, whether the β-catenin-SPI1-HAVCR2 regulatory circuit is also present in human T-ALLs, especially the ETP T-ALL subtype, and determines LSC activity needs follow-up study using human T-ALL samples and PDX models. Such information may be used for the molecular classification of human T-ALLs, identifying human T-ALL LSCs and designing targeted treatment, as we showed in the mouse model. As HAVCR2 high LSCs can be detected in the peripheral blood of leukemic mice (our unpublished data), further investigation is worthwhile to explore the potential use of this approach as a noninvasive strategy for stratifying T-ALL and monitoring the treatment response.
Figure 11 continued. [...] spleen weights of 2-month-old WT mice, untreated Cdh5-Cre⁺;Pten L/L mice, and combination-treated mice upon euthanasia; (E) HE-stained images of spleen, lung, kidney and liver tissue from 2-month-old WT, untreated and combination-treated mice. A, B and D: the data are presented as the means ± S.D.; ***p ≤ 0.001; the bars in the HE images and inserts represent 1000 µm and 50 µm, respectively. DOI: https://doi.org/10.7554/eLife.38314.016. A figure supplement is available for Figure 11.

Mice
The Cdh5-Cre + ;Pten loxP/loxP ;Rosa26 floxedSTOP-LacZ + line was described previously (Guo et al., 2008; Guo et al., 2011; Schubbert et al., 2014). The Spi1 loxP/loxP and Spi1-GFP mouse lines were kindly provided by Dr. Stephen L. Nutt. Mouse genotypes were determined by genomic PCR analyses with the primer sets listed in Supplementary File 1. Animal housing, breeding, and surgical procedures were approved by the Ethics Committee under ID LSC-WuH-1 and conducted in accordance with the regulations of the Division of Laboratory Animal Medicine at Peking University.

Cell lines
The KE-37 human T-ALL cell line was purchased from DSMZ; the CEM and Jurkat cell lines were generously provided by Drs. C. Radu and G. Cheng at UCLA, respectively. All of the human T-ALL cell lines were maintained in RPMI 1640 (Life Technologies) supplemented with 10% FBS, penicillin, and streptomycin. The Pten-null T-ALL cell line (HE001) was generated as previously reported (Schubbert et al., 2014) and cultured in DMEM (Life Technologies) supplemented with 20% FBS (Omega Scientific), 10 ng/mL IL-2 and 10 ng/mL IL-7 (both Invitrogen), 10 mmol/L HEPES, nonessential amino acids, sodium pyruvate, glutamine, penicillin, and streptomycin (Life Technologies), and 2-mercaptoethanol (β-ME; Sigma). All cell lines were maintained according to the manufacturers' recommendations or previous publications. CEM, Jurkat, HEK293, and KE-37 cells were authenticated by the providers and independently authenticated in the lab (via Hi-C, WES and RNA-seq analyses of genome-wide alterations, mutation signatures and gene expression profiles). All lines tested negative for mycoplasma.

Fluorescence-activated cell sorting (FACS) analyses
FACS analyses were performed on a BD LSRFortessa or an Influx system (both BD Biosciences). The numbers of leukemia blasts, LSC-enriched subpopulations, and HAVCR2/ITGAX subgroups, as well as intracellular protein levels, were analyzed as described previously (Guo et al., 2008; Guo et al., 2011; Schubbert et al., 2014).

Bulk RNA-seq analysis
For bulk RNA-seq analysis, total RNA was extracted from FACS-sorted cells using an RNeasy Micro Kit (Qiagen, 74004). Strand-specific libraries were generated using an NEBNext Ultra RNA Library Prep Kit (NEB, E7530) following the manufacturer's protocol. Libraries of 350±20 bp were obtained, and their quality was determined using a Fragment Analyzer system (Advanced Analytical). Barcoded libraries were subjected to 150 bp paired-end sequencing on an Illumina HiSeq 2500, and the paired-end reads were aligned to the mouse reference genome (UCSC mm9) using TopHat (v2.0.13) (Trapnell et al., 2009). Expression values were calculated as fragments per kilobase of transcript per million mapped reads (FPKM) using Cufflinks (v2.2.1) (Trapnell et al., 2012).
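As an illustration of the FPKM unit only, the following minimal sketch applies the textbook definition to a single transcript. It is not the Cufflinks procedure, which estimates abundances with a statistical model; the function name and the numbers in the example are hypothetical.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Naive FPKM: fragments per kilobase of transcript per million mapped fragments.

    This is the textbook definition, not the model-based estimate from Cufflinks.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragment_count / (kilobases * millions)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example_fpkm = fpkm(500, 2_000, 20_000_000)
print(example_fpkm)
```

Normalizing by both transcript length and library size is what allows values to be compared across genes and across samples.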

Single-cell RNA-seq analysis
For single-cell RNA-seq analysis, we essentially followed a published protocol. Raw reads were processed as previously reported (Trapnell et al., 2009) to generate expression values. Low-quality cells with fewer than 10,000 reads or fewer than 3000 covered genes were filtered out. Genes with a mean expression (TPM) value of less than one were discarded, leaving 276 cells and 12,972 genes for further analysis. The unique gene set was then used for PCA, t-SNE, and pseudotime analyses (Qiu et al., 2017a; Qiu et al., 2017b; Trapnell et al., 2014). Differentially expressed genes were identified by SCDE (Fan et al., 2016; Kharchenko et al., 2014), and genes with Z > 4 were selected. Gene Ontology analysis was performed with clusterProfiler (Yu et al., 2012), followed by Gene Set Enrichment Analysis (GSEA) (Subramanian et al., 2005) to identify gene sets that show significant differences between the blast and HAVCR2 high subgroups.
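The cell and gene filters described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the data layout, the function name, the definition of a 'covered' gene as one with TPM > 0, and the choice to compute mean TPM over the retained cells are all assumptions.

```python
def filter_cells_and_genes(tpm, read_counts,
                           min_reads=10_000, min_genes=3_000, min_mean_tpm=1.0):
    """Apply the read-depth, gene-coverage, and mean-expression filters.

    tpm: dict mapping cell -> dict mapping gene -> TPM value
    read_counts: dict mapping cell -> number of reads
    Assumption: a gene counts as 'covered' in a cell when its TPM is > 0.
    """
    # Keep cells with enough reads and enough covered genes.
    kept_cells = [
        cell for cell, profile in tpm.items()
        if read_counts[cell] >= min_reads
        and sum(1 for value in profile.values() if value > 0) >= min_genes
    ]
    if not kept_cells:
        return [], []
    # Keep genes whose mean TPM across the retained cells reaches the cutoff.
    all_genes = set()
    for cell in kept_cells:
        all_genes.update(tpm[cell])
    kept_genes = sorted(
        gene for gene in all_genes
        if sum(tpm[cell].get(gene, 0.0) for cell in kept_cells) / len(kept_cells)
        >= min_mean_tpm
    )
    return kept_cells, kept_genes


# Toy data with a lowered gene-coverage cutoff so the example stays small.
toy_tpm = {
    "c1": {"g1": 5.0, "g2": 0.0, "g3": 2.0},
    "c2": {"g1": 3.0, "g2": 0.5, "g3": 0.0},
    "c3": {"g1": 9.0, "g2": 0.0, "g3": 0.0},
}
toy_reads = {"c1": 20_000, "c2": 15_000, "c3": 500}
toy_cells, toy_genes = filter_cells_and_genes(toy_tpm, toy_reads, min_genes=2)
print(toy_cells, toy_genes)
```

In the toy data, cell c3 is dropped for low read depth and gene g2 is dropped because its mean TPM over the two retained cells is below one.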

Transplantation assay
Pten-null T-ALL cells harvested from primary Pten-null T-ALL mice were FACS-sorted and diluted before transplantation, as described previously (Guo et al., 2008). Leukemia development was monitored daily by physical appearance and weekly by peripheral blood smear and FACS analysis.
T-ALL was confirmed if the bone marrow or peripheral blood contained 20% leukemic blasts (Guo et al., 2008).
For human T-ALL cell transplantation, CEM cells were treated with 5 μM 5-AZ or PBS for 6 days in vitro, and equal numbers of untreated and treated cells were then transplanted by tail vein injection into NSG recipients.

Real-time PCR
Total RNA was isolated using the RNeasy Micro Kit (Qiagen, 74004) and was reverse transcribed into cDNA using a HiScript II Q RT SuperMix for qPCR Kit (Vazyme, R223-01). Gene expression levels were measured by quantitative real-time PCR using a HiScript II One Step RT-PCR Kit (Vazyme, P611-01) and a CFX Real-Time PCR detection system (Bio-Rad). All expression data were normalized to β-actin expression, and the relative expression levels were derived from the delta-delta Ct values using CFX software (Bio-Rad). The primer sequences are listed in Supplementary File 2.
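For readers unfamiliar with the delta-delta Ct calculation, a minimal sketch of the standard 2^(-ΔΔCt) formula is given below. It assumes 100% amplification efficiency for both the target and the reference gene; the function name and Ct values are hypothetical, and the CFX software may apply additional corrections.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-delta-delta Ct) method.

    Assumes 100% amplification efficiency for target and reference genes.
    """
    # Normalize the target gene to the reference gene (e.g. beta-actin).
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized sample value with the normalized control value.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# sample than in the control, with an unchanged reference gene.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)
```

A fold change above one indicates higher normalized expression in the sample than in the control.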

Plasmid construction
The full-length Spi1 sequence was PCR-amplified from cDNAs generated from HAVCR2 high cells (primers: EcoRI-SPI1-Forward 5'-GAATTCATGTTACAGGCGTGCAAAATGGAAG-3' and XhoI-SPI1-Reverse 5'-CTCGAGTCAGTGGGGCGGGAGGCG-3'). The PCR products were purified and cloned into the MSCV-IRES-EGFP vector, generously provided by Dr. Owen Witte of UCLA, and the sequence was confirmed. The pLL3.7-shSPI1 and control constructs were kindly provided by Dr. Junwu Zhang of the Chinese Academy of Medical Sciences and Peking Union Medical College. The pLVX-IRES-RFP and pLVX-active-β-catenin (S33A, S37A, S45A) plasmids were kindly provided by Dr. Wei Guo of Tsinghua University.

Western blot analysis
To quantify the protein levels of MYC and SPI1, Western blotting was performed as described previously (Schubbert et al., 2014). The membranes were probed with antibodies against MYC (5605s) and SPI1 (2258s) from Cell Signaling Technology and against HAVCR2 (ab185703) from Abcam; β-actin (7210, Santa Cruz) was used as a loading control.

RRBS library preparation
Blast- and LSC-enriched subpopulations were collected by FACS sorting, and genomic DNA was extracted using a DNA micro kit or a DNA mini kit (Qiagen). The RRBS library was prepared according to a previous publication (Smallwood and Kelsey, 2012). Genomic DNA was digested with MspI (Fermentas), followed by end repair, adapter ligation and bisulfite modification (Qiagen, #59104). The converted DNA library was sequenced on a HiSeq 4000 (Illumina) after two rounds of PCR amplification and size selection.

DNA methylation analysis
BS-seq reads were aligned to the reference genome (mm9) using BS-Seeker2 (Guo et al., 2013). The lollipop plot and region-specific distribution profiles were generated with CGmapTools (Guo et al., 2018). The methylation status of murine and human SPI1 promoter CpG islands was determined as described by Fernández-Nestosa et al. (2013).
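As a general illustration of how bisulfite sequencing data are summarized, the sketch below computes a per-CpG methylation level as the fraction of reads supporting a methylated cytosine and averages it over a region such as a promoter CpG island. It is not the BS-Seeker2 or CGmapTools implementation; the function names, the coverage cutoff, and the read counts are hypothetical.

```python
def cpg_methylation_level(methylated_reads, total_reads):
    """Fraction of reads at one CpG that support a methylated cytosine."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return methylated_reads / total_reads


def region_mean_methylation(sites, min_coverage=5):
    """Mean methylation over the CpG sites of a region.

    sites: list of (methylated_reads, total_reads) tuples, one per CpG.
    min_coverage is a hypothetical cutoff; sites below it are ignored.
    Returns None when no site passes the coverage cutoff.
    """
    levels = [
        cpg_methylation_level(methylated, total)
        for methylated, total in sites
        if total >= min_coverage
    ]
    if not levels:
        return None
    return sum(levels) / len(levels)


# Hypothetical read counts for three CpG sites; the third has low coverage.
toy_sites = [(8, 10), (3, 6), (1, 2)]
toy_mean = region_mean_methylation(toy_sites)
print(toy_mean)
```

Excluding low-coverage sites avoids letting a CpG with one or two reads dominate the regional average.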

Data availability
All bulk RNA-seq, single-cell RNA-seq, and bisulfite-seq data for this study have been deposited in the NCBI Gene Expression Omnibus under accession number GSE115356.