Surveying brain tumor heterogeneity by single-cell RNA-sequencing of multi-sector biopsies

Abstract Brain tumors are among the most challenging human tumors for which the mechanisms driving progression and heterogeneity remain poorly understood. We combined single-cell RNA-seq with multi-sector biopsies to sample and analyze single-cell expression profiles of gliomas from 13 Chinese patients. After classifying individual cells, we generated a spatial and temporal landscape of glioma that revealed the patterns of invasion between the different sub-regions of gliomas. We also used single-cell inferred copy number variations and pseudotime trajectories to inform on the crucial branches that dominate tumor progression. The dynamic cell components of the multi-region biopsy analysis allowed us to spatially deconvolute with unprecedented accuracy the transcriptomic features of the core and those of the periphery of glioma at single-cell level. Through this rich and geographically detailed dataset, we were also able to characterize and construct the chemokine and chemokine receptor interactions that exist among different tumor and non-tumor cells. This study provides the first spatial-level analysis of the cellular states that characterize human gliomas. It also presents an initial molecular map of the cross-talks between glioma cells and the surrounding microenvironment with single-cell resolution.


INTRODUCTION
Gliomas are among the most lethal forms of human tumors as they are characterized by aggressive behaviors and resistance to multiple therapies. The development of genetic mutations in malignant cells and the complex interactions between tumor and non-tumor cells in the glioma microenvironment foster intratumoral heterogeneity, thus contributing to therapeutic failures and the generally poor prognosis of gliomas. A major unmet challenge in neurooncology is our ability to understand glioma heterogeneity and progression in gliomas and how they influence therapeutic resistance [1].
Several studies have reported that malignant gliomas are characterized by a formidable degree of intratumoral heterogeneity. For example, mosaic amplification of receptor tyrosine kinase genes (EGFR, MET, PDGFRA) is known to represent a classical hallmark of genetic heterogeneity affecting neighboring tumor cells within bulk glioma samples [2]. Furthermore, single cell-derived clones of glioma cells have been identified and shown to exhibit divergent proliferation and differentiation abilities [3]. Finally, the multi-region genetic analysis of gliomas with single nucleotide polymorphism (SNP) array or whole exome sequencing has revealed that divergent glioma subtypes can be recovered from different geographical regions, which together give rise to a branched pattern of progression [4,5]. As single-cell RNA-sequencing became a feasible approach to investigate human tumors, glioma heterogeneity has started to be explored with singlecell resolution [6][7][8][9][10][11]. However, most of the studies that have previously reported single-cell RNAsequencing of gliomas did not include analysis of C The Author(s) 2020. Published by Oxford University Press on behalf of China Science Publishing & Media Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. either tumor cells and the tumor microenvironment (TME) from multiple spatially annotated regions of gliomas, thus limiting our understanding of patterns of spatial evolution and brain infiltration, the latter being one of the most critical hallmarks of aggressiveness and progression of malignant glioma.
To delineate the glioma single-cell heterogeneity in both spatial and temporal resolution, we performed a glioma single-cell analysis from multisector biopsies informed by precision navigation surgery. Cell type components of each tumor fragment and temporal relationship of cells in each individual patient were unbiasedly identified. Our analysis did not use approaches aimed at selecting specific tumor or non-tumor cell populations. Therefore, we report the first single-cell-based comprehensive spatial analysis of the geographical structure of glioma and the dynamic progression of the interactions of tumor cells with individual non-tumor cells from multiple tumor locations.

Precision navigation-based multi-sector biopsies and single-cell RNA-seq of glioma cells
Tumor sections with potential representative divergent properties were marked in a presurgical 3D enhanced magnetic resonance imaging (MRI) model and tumor tissues were precisely collected during surgery by navigation sampling. Samples were quickly dissociated and subjected to single-cell RNA-sequencing (scRNA-seq) library preparation [12,13] (Fig. 1A). Overall, 7928 single-cell transcriptomes were generated, and 6148 passed stringent quality filtering steps after alignment and reads counting ( Fig. 1B and Supplementary  Fig. S1A,B). These cells were collected from 73 regions in 13 patients with glioma covering the most frequent subtypes (3 WHO grade II, 1 WHO grade III, 8 WHO grade IV and 1 gliosarcoma). As a control, we also included one brain metastasis from a patient with lung squamous cell carcinoma (Fig. 1C, Supplementary Fig. S2, Supplementary Tables 1 and 2).
As regional gene expression status can be affected by copy number variation (CNV) dose effect, we adopted previously reported methods to predict large fragment copy number status with a single-cell gene expression matrix [6,8]. The generated copy number matrix clustered into five CNV subgroups (Fig. 1D). Malignant cells were identified based on this classification. Subgroups G1 and G2 included glioblastoma multiforme (GBM) cells sharing a chr7 amp /chr10 del -driven transcriptome, whereas a unique CNV pattern was apparent in GBM patient GS3 (subgroup G5). The G4 subgroup included low-grade glioma (LGG) cells with chr1p/chr19q codeletion-driven signatures. The last subgroup (G3) was composed of non-malignant cells without obvious CNVs.

Optimized t-SNE map and clustering identifies 25 cell type clusters
As reported by previous single-cell RNA-seq studies of human glioma, the malignant cells from different patients showed a fragmented relationship in clustering analysis, mainly because of the dose effect of gene expression caused by diverse CNVs [6,8]. Therefore, we argued that removal of CNV variances would optimize the principal component analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE) analysis.
To obtain the CNV status of each tumor sample, we performed low-depth whole genome sequencing (WGS) on bulk biopsies. Based on the WGS data, genes with interpatient large copy number changes were removed from the clustering analysis ( Fig. 2A, Supplementary Figs S3B and S4). The integrated analysis of tumor cells from our entire dataset using PCA and t-SNE resulted in an optimized map of 25 cell clusters (Fig. 2C, D and Supplementary Fig. S5). Compared to the original t-SNE map ( Fig. 2B and Supplementary Fig. S6A, B and D), the fragmented cell distribution was resolved to a map with more concentrated points and clearer edges (Fig. 2D). This distribution was confirmed by independent binary regulon activity clustering with SCENIC ( Fig. 2E and Supplementary Figs S6E and S7) [14]. A combination of these different strategies could help minimize analyze artifacts in analysis.
Considering the cell source and CNV status with a global t-SNE map, the non-tumor cells and the glioma cells lacking extensive genomic rearrangements (such as those derived from LGG) showed a converged distribution, indicating that our data were not influenced by batch effects. They also indicated that the LGG-derived tumor cells were more uniform among different patients ( Fig. 2C and Supplementary Fig. S6C). Cell cluster-specific genes coalesced the 25 clusters into four major groups (Supplementary Table 2): CNV − MOG + normal glial cells [15], KRT5 + lung cancer (LC) metastasis cells [16], CD45 (PTPRC) + immune cells and CNV + malignant tumor cells. In malignant glioma cells, OLIG2/DLL3 and AQP4/CLU distinguish tumor cells exhibiting transcriptomic features of oligodendrocytes and astrocytes, respectively [17][18][19]. Cells positive to the general proliferative marker  MKI67 + were present among both glioma cell phenotypes. Moreover, PTPRZ1 and SOX2 were significantly overexpressed in glioma cells and could therefore be considered as useful markers to estimate the tumor cell purity in bulk glioma tissues. Interestingly, clusters 10 and 12 specifically overexpressed cilium-related markers (HYDIN, FOXJ1, etc.) [20,21]. The tumor cells included in these clusters exhibited astrocyte-like features and expressed low levels of PTPRZ1 (Fig. 2F, G and Supplementary Fig. S8). Clusters 10 and 12 mostly derived from patient GS13, indicated that this patient developed a rare form of ciliated glioma for which single-cell studies have not previously been reported.  Optimized global cluster   chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9  chr10  chr11  chr12  chr13  chr14  chr15  chr16  chr17  chr18  chr19  chr20  chr21

Annotation of the glioma microenvironment and TCGA-based classification of single glioma cells
Normal diploid cells in the tumor samples were mainly cells within the tumor TME. To conduct an unbiased investigation of the TME in glioma, we did not include flow cytometry-based selection in our sampling procedure. Normal oligodendrocytes, microglia and macrophages accounted for approximately half of these cells, with relative homogeneity among patients. Such a conclusion could not have emerged from CD45 selected samples, as this is the prevailing sampling method used in previous scRNA-seq studies. We also captured cells showing specialized activities, such as interferon-induced oligodendrocytes and polarized microglia/macrophages (Fig. 3A). Next, we classified malignant cells according to the Cancer Genome Atlas (TCGA) GBM classification model [19]. Grade II tumor cells with chromosome 1p/19q codeletion had a strong proneural (PN) signature. Conversely, the individual tumor cells recovered from GBM were distributed among each of the three transcriptomic subgroups (PN, classical (CL) or mesenchymal (MES)). We did not identify glioma cells that could be assigned to the neural (NL) subtype but we detected cells expressing cilium-related signatures that were distinct from those associated with known TCGA subgroups ( Fig. 3B and C).

CNV accumulation and tumor progression in patient GS1
As CNVs usually accumulate during tumor progression, the CNV profile of individual tumor cells has emerged as an accurate inference to trace tumor progression [2][3][4]8,22,23]. We used arm-level scRNAseq-derived CNVs to trace tumor cell clones, which were confirmed by bulk WGS (Fig. 1D, Supplementary Fig. S3A and B) to avoid possible mistakes by unravelling single-cell CNV from RNA data with bioinformatics tools alone. Multiple cell subpopulations with accumulated CNVs were found in patients GS1, GS13 and GS5 ( Supplementary  Figs S3C and S9).
From patient GS1, a female with an IDH-wild type GBM, single cells were collected from five tumor core locations (P1/4/5/6/7) and two peritumoral locations (P2/3). A few cells (P8) were also collected from an adjacent brain region, which, from the imaging analysis, lacked evidence of tumor infiltration. Ring plots illustrate the tumor and nontumor cell components at each site (Fig. 4A). In this patient, CL glioma cells accounted for most of the malignant components at all tumor sites, with a small fraction of PN tumor cells. Interestingly, we found that macrophages composed the larger fraction of non-tumor cells infiltrating the tumor core (P1/4/5/6/7) but they were replaced by microglia in the peritumoral invading front (P2/3). The peritumoral biopsies contained a higher fraction of nontumor cells than glioma cells.
As scRNA-seq can be used to infer the underlying CNV, we interrogated our dataset to uncover the spatial dynamics of CNV changes in glioma cells to reconstruct the trajectory of tumor initiation and progression. Five CNV-based clusters (CN-1 to CN-5) were found in patient GS1 (Fig. 4B). CN-1 and CN-2 contained diploid non-tumor cells, but were divided into two clusters because their expression profiles differed from those of normal glial and immune cells. CN-3 to CN-5 clusters contained a series of aneuploid subclones that shared the chr7/10/19/20 CNVs that are recurrent genomic alterations in GBM. As they progressed, these subclones first accumulated CNVs on chr2 and later on chr5 as they transitioned from CN-3 to CN-4 and from CN-4 to CN-5, respectively (Fig. 4C), therefore highlighting the temporal progression of this tumor. The combination of spatial and temporal information indicated that the CN-3 population was distributed only in P5/P6, whereas the CN-4 and CN-5 populations were present in all other tumoral and peritumoral sites (Fig. 4D). These findings clearly indicate that P5/P6 were the initial sites of the tumor that progressed towards the locations of P1/P4/P7 and P3 (Fig. 4A). In the global t-SNE analysis, as clone CN-3 developed into CN-4 and CN-5, glioma cells also transformed from cluster 9 to 20 (Fig. 4E). Our findings established that the number of somatic CNVs increased across the different glioma regions, thus defining a pattern of tumor progression. They are also consistent with the notion that an increased definition of the multi-sector sampling of glioma might reveal finer and more accurate genetic trajectories of glioma evolution, as shown in recent hepato-carcinoma studies [24].
To unravel the phenotypic changes that mark spatial evolution of glioma, we performed  Table 2). Considering the functional enrichment of differentially expressed genes, our findings converge on a model whereby the genetic alterations such as CNVs that accumulate during turn or invasion lead to loss of differentiated astrocyte properties and gain of known features driving tumor aggressiveness, angiogenesis, dedifferentiation and oncogenic PI3K/AKT signaling (Fig. 4H), ultimately resulting in the promotion of glioma progression towards more aggressive and invasive phenotypes.

Trajectory of tumor cell states reveals branched progression in patient GS13
Patient GS13 was a male with an IDH-wild type GBM characterized by high expression of genes associated with motile cilium activities (e.g. FOXJ1, FAM183A, HYDIN, DNALI1, etc.) (Supplementary Figs S8 and S10). We noted that only about 5% of TCGA GBM patients exhibited high expression of cilium-related genes. Therefore, the particular type of GBM analyzed from patient GS13 belongs to a rare type of glioma that was not previously investigated at single-cell level. However, cilium-specific gene expression was not associated with a specific pattern of survival ( Supplementary  Fig. S11). From this patient, we acquired 978 cells from three core and two peritumoral sites, and found that the cells at core locations consisted of PN and cilium-positive cells (Fig. 5A). CNVs were also evident in this case. The clonal CNV reconstruction revealed two CNV clones (CN-2 and CN-3) and the transition from CN-2 to CN-3 was marked by   accumulation of chr1 amp and chr19 del . Furthermore, as cells transitioned to CN-3, the expression profile changed from global cluster 6 (PN) to global clusters 10 and 12 (cilium) (Fig. 5B-E). We also applied hierarchical clustering of CN-2 and CN-3 malignant cells and identified differentially expressed genes between these two groups. Interestingly, after cells transitioned to CN-3, they formed two branches with distinct gene expression profiles ( Fig. 5F and H, Supplementary Table 2). The functional geneset enrichment analysis showed that the transition from CN-2 to CN-3 was associated with decreased expression of group 1 genes, which were implicated in glial cell differentiation and adhesion. Conversely, the expression of genes in groups 2 and 3 (implicated in cell cycle/DNA replication and cilium regulation, respectively) increased ( Fig. 5G and Supplementary  Fig. S12).
To define the pattern of progression of this branched trail, we analyzed the single-cell trajectory of these cells with Monocle2 [25]. In a pseudotime model, this trail started from global cluster 6, which overexpressed oligodendrocyte progenitor cell marker OLIG1. Then, the trail branched into two directions when CN-2 became CN-3, and branches 1 and 2 correspond to global clusters 10 and 12, respectively. Branch 1 exclusively expressed cilium markers (e.g. HYDIN, FOXJ1, DNALI1, etc.) ( Fig. 5H and I), while branch 2 maintained the PN nature but showed high proliferative ability. Overexpression of TP73 and HYDIN was validated by immunohistochemistry (IHC) staining experiments ( Supplementary Fig. S13). Moreover, expression of the TP73 gene at chr1p36 increased in chr1 amp cells (Fig. 5H). From previous studies, a protumorigenic activity of TP73 has recently emerged, especially in the context of the N-terminal truncated TP73 isoform [26]. Furthermore, TP73 overexpression has been reported in several tumor types, including breast cancer, melanoma, prostate cancer and neuroblastoma, and was shown to induce metastasis, chemo-resistance and other hallmarks of tumor progression that confer poor clinical outcome [27]. By RT-qPCR detection, both full length and N-TP73 existed in GS13 tumor (Supplementary Fig. S14).
In conclusion, the expression of a motile cilium signature is not an unusual event in GBM, but little is known about this particular GBM phenotype. The branched model revealed that glioma cells may go through different spatial destinations despite sharing similar CNV profiles.

Characterization of the TME in glioma
In the glioma TME, tumor-associated macrophages (TAM) communicate through ligand/receptor cross-talks with tumor and non-tumor cells to promote tumor aggressiveness [28,29]. As our single-cell platform exhibits high sensitivity with a higher number of average detectable genes expressed in single cells compared to other glioma datasets (4470 genes/cell versus <2000 genes/ cell), we sought to build a ligand/receptor interaction map for reconstruction of the most important chemoattractant relationships that exist between glioma tumor cells and TAM in the glioma TME ( Fig. 6A and Supplementary Fig. S15). Overall, we detected 16 chemokine ligands and nine receptors in 13 patients.
Microglia and M2a/c macrophages, which expressed the CX3CR1 receptor, coexisted with glioma cells that expressed the CX3CL1 ligand. Lymphocyte infiltrates expressed CXCR3/6 and CCR6, but the CXCR3 ligands CXCL9/10/11 were rarely detected in glioma samples with the exception of interferon-activated microglia. These ligandreceptor pairs were previously reported to recruit tumor-infiltrating lymphocytes and inhibit tumor growth [30]. CXCR6 ligand CXCL16, which exists in both transmembrane and soluble form [31], was highly expressed in TAM cells, and mildly expressed in the malignant cells. Lymphocytes also expressed CXCR4 like TAM cells, whereas the ligands were expressed in microglia cells. Their binding was reported to mediate glioma chemotaxis and regulate cell survival through activating AKT-related pathways [32].
CCL5/CCL8 and CCR5 was another chemokine axis between lymphocytes and TAM. Besides inflammatory chemoattractant functions, this can also mediate NK cell activation, which promotes tumor genesis and metastasis [33]. Another receptor of CCL8 is CCR2, which is expressed in monocytes, and highly expressed in TAM cells and MES cells.
A strong impact from the CXCL family and related receptors was found in M2b macrophages and neutrophils (Supplementary Fig. S16A and B).  CXCL1/2/3/5/6/8 were overexpressed in M2b cells, generating a chemoattractant environment that might recruit CXCR1 + /CXCR2 + neutrophils. In glioma, M2b polarization and recruiting neutrophils have been connected with pro-tumoral functions [34][35][36]. Based on their common gene expression signals, we calculated an enrichment score from 38 genes ( Fig. 6B and Supplementary  Fig. S16C) and used this signature to deconvolute the presence of these cells from gene expression profiles of bulk tissues and predict clinical outcome. This method had been validated with TCGA glioma RNA-seq datasets (Supplementary Fig. S16D and E). In our spatial cell distribution data, macrophages were the most abundant non-tumor cell in core biopsies, but were replaced by larger fractions of microglial cells in biopsies from the tumor margins (Fig. 6C). Thus, a switch from macrophage to microglia infiltration from the glioma core to the periphery was a general event that likely marked the microenvironmental changes. The microenvironment changes are likely to be dictated by different requirements of glioma cells as they migrate from core tumor regions to the invading front at the tumor periphery. This pattern was also recapitulated in the IVY GAP dataset [37] ( Supplementary Fig. S17).

RESEARCH ARTICLE
Finally, a lineage trajectory was built with TAM cells (Fig. 6D-F), showing the gradual change of three transitional states. The pattern of TAM development in glioma started first with a microglia phenotype (P2RY12 + /TMEM119 + ), then it turned to polarized macrophage (CD163 + ), and finally converged into M2b macrophages (IL1RN + ) with activated expression of strong angiogenesis signaling molecules (VEGFA). In the middle of this trajectory, we detected a small branch of cells that expressed high levels of the bone-marrow-derived macrophage (BMDM) TAM marker ITGA4. These findings suggest a model whereby TAM cell polarization in glioma is the result of two independent cell sources: resident microglial cells and BMDM cells (Fig. 6G-N). As we have also been able to compare the cell fates of TAM between the different grades of glioma, we found that they exhibited lower M2 polarization in LGG samples (GS8/9/14) when compared to GBM (Supplementary Fig. S18).

DISCUSSION
As malignant gliomas are characterized by high degree of intratumoral heterogeneity, single-cell genomic technologies have rapidly emerged as a crucial approach to disentangle glioma heterogeneity. However, most of the previous single-cell RNAseq glioma studies relied on the analysis of a single biopsy from each tumor specimen, thus lacking information on the special heterogeneity of the analyzed tumor lesions [6][7][8][9]. Although one study reported the multi-regional analysis of glioma with single-cell sequencing, the number of cells analyzed in that study was very limited and could not provide a comprehensive picture of the geographical structure of glioma at the single-cell level [10].
Here, we presented a comprehensive single-cell landscape of multiple subtypes of gliomas, each of which was analyzed by multi-region samplings, and provided the first spatial-level analysis of the cellular states that characterize human gliomas. We designed multi-sector biopsies with a 3D-enhanced MRI model, and collected them during surgery by navigation sampling. For each biopsy, we generated and functionally annotated transcriptomes of hundreds of single tumor and non-tumor cells collected from multiple core and periphery tumor locations. Together, they provide a coherent map of the dynamic states and interactions between the different cell types that integrate the key features of glioma homeostasis at each tumor location. We found that both the number and the transcriptomic subtypes assigned to individual glioma cells frequently change dramatically between biopsies collected from different locations, even when they originated from neighboring glioma regions. We also made the unexpected observation that whereas core biopsies contained a high number of macrophages, this configuration of the core TME was replaced by a comparatively higher number of resident microglia at the glioma periphery, which represents the invading front of the tumor towards the normal brain.
As malignant glioma cells share high proliferation capacity, they readily accumulate multiple types of genetic alterations that trigger an increasing degree of aneuploidy with constant adaptation to the demands created by the growing tumor mass in relation to the TME [38]. A glioma cell CNV-driven  progression trajectory we uncovered that was especially highlighted by the dynamic changes in the tumor from patient GS1 was marked by progressive loss of the astrocyte-like hallmarks of glioma cells with gain of multiple tumor cell phenotypes (loss of differentiation, competence to migrate and invade through the extracellular matrix, etc.), which together drive glioma progression and invasion of the normal brain. Another progression trajectory we uncovered was well represented in the tumor of patient GS13, in which the pseudotime trajectory produced by Monocle2 identified a drastic switch in the major tumor cell population with gain of an intriguing ciliated phenotype that likely contributes to glioma aggressiveness. These transformation models indicate that the constant rearrangement of the genome of glioma tumor cells leads to continuous gain of new capacities, all of which converge towards the acquisition of more aggressive tumor phenotypes for always more deregulated proliferation, anaplasia and invasion of the normal brain. An important, novel finding contributed by our work is the deconvolution of the cross-talks between tumor and non-tumor cells in the glioma TME. We found that the active communications between the different cell types are primarily implemented by multiple combinations of chemokine ligands with their corresponding receptors that we have characterized within different regions of individual tumors and among the different types of glioma we studied [39]. In particular, we found that the communication between non-tumor cells was dominated by the prominent role of the CXCL family of chemokines and related receptors, which was especially apparent in M2b macrophages and neutrophils. We followed up on this finding and determined the enrichment score of M2b/neutrophil cells in bulk gliomas to evaluate the consequences of the infiltration of these cell types for clinical outcome of glioma patients. The analysis, which was performed with TCGAderived glioma RNA-seq datasets, was able to distinguish patients with divergent clinical outcome based on the predicted level of infiltration of the two cell types.
In conclusion, we used the single-cell RNA-seq technology to generate an extensive complete map of the geographical molecular structure of gliomas. The trajectories of reciprocal genomic and functional changes that accompany glioma cells as they move within the tridimensional space of the tumor mass, combined with the deconvolution of the crosstalks between different cells in the glioma TME, paint an unprecedented scenario that elucidates the intratumoral heterogeneity of this lethal tumor type.

METHODS AND MATERIALS
The detailed descriptions of methods are available as Supplementary Materials at NSR online.

DATA AVAILABILITY
Both the single-cell RNA-seq data and bulk DNA-seq data we used in this study were submitted to Genome Sequence Archive (GSA) in the Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, under accession number HRA000179 that are publicly accessible at https://bigd.big.ac.cn/gsa-human. Barcode sequences and prepared gene expression count matrix were also submitted to the Gene Expression Omnibus (GEO) under accession number GSE117891. Other information related to this study is available from the corresponding authors upon reasonable request.