Spatial patterns of tumour growth impact clonal diversi cation: computational modelling and evidence in the TRACERx Renal study


 Intra-tumour genetic heterogeneity (ITH) fuels cancer evolution. The role of clonal diversity and genetic complexity in the progression of clear-cell renal cell carcinomas (ccRCCs) has been characterised, but the ability to predict clinically relevant evolutionary trajectories remains limited. Here, towards enhancing this ability, we investigated spatial features of clonal diversification through a combined computational modelling and experimental analysis in the TRACERx Renal study. We observe through modelling that spatial patterns of tumour growth impact the extent and trajectory of subclonal diversification. Moreover, subpopulations with high clonal diversity, and parallel evolution events, are frequently observed near the tumour margin. In-silico time-course studies further showed that budding structures on the tumour surface could indicate future steps of subclonal evolution. Such structures were evident radiologically in 15 early-stage ccRCCs, raising the possibility that spatially resolved sampling of these regions, when combined with sequencing, may enable identification of evolutionary potential in early-stage tumours.

Abstract: Intra-tumour genetic heterogeneity (ITH) fuels cancer evolution. The role of clonal diversity 3 and genetic complexity in the progression of clear-cell renal cell carcinomas (ccRCCs) has 4 been characterised, but the ability to predict clinically relevant evolutionary trajectories 5 remains limited. Here, towards enhancing this ability, we investigated spatial features of clonal 6 diversification through a combined computational modelling and experimental analysis in the 7 TRACERx Renal study. We observe through modelling that spatial patterns of tumour growth 8 impact the extent and trajectory of subclonal diversification. Moreover, subpopulations with 9 high clonal diversity, and parallel evolution events, are frequently observed near the tumour 10 margin. In-silico time-course studies further showed that budding structures on the tumour 11 surface could indicate future steps of subclonal evolution. Such structures were evident 12 radiologically in 15 early-stage ccRCCs, raising the possibility that spatially resolved sampling 13 of these regions, when combined with sequencing, may enable identification of evolutionary 14 potential in early-stage tumours. 15 16 17 Introduction: 1 The development of cancer can be viewed as an evolutionary process (Merlo et al. 2006, Zahir 2 et al. 2020). Acquisition of genomic alterations including mutations and somatic copy-number 3 alterations (SCNAs) drives the emergence of genetically heterogeneous subpopulations of 4 cancer cells or subclones (McGranahan and Swanton 2017), resulting in intra-tumour 5 heterogeneity (ITH). A small subset of genomic alterations (drivers) endows subclones with 6 increased fitness that manifests as growth and survival. Subclones compete for resources, 7 including physical space, and undergo expansion or extinction according to their fitness under 8 given selective pressures imposed by the tumour microenvironment (TME) or therapeutic 9 intervention. With recent advances in next-generation sequencing, clonal architecture and 10 evolutionary features have been elucidated in a variety of tumour types (Gerlinger et al 2012, 11 Yates Therefore, we further evaluated the effects of different spatial patterns of tumour growth 47 (hereafter also referred to as tumour growth mode) on spatial features of clonal diversity in our 48 computational model. While distinct growth patterns may predominate in different parts of a 49 tumour, or at different stages of the same tumour growth, we restricted our investigation in the 1 current study to two simple growth models: one model with uniform growth throughout the 2 tumour volume as a "null model" assuming that all tumour mass can proliferate (referred to as 3 "Volume Growth Model") and the other model with active proliferation restricted to the tumour 4 surface (referred to as "Surface Growth Model") (see Methods, Figure 1c). 5 6 In the context of the TRACERx Renal study, we reported a workflow to evaluate both the 7 genomic profiles and spatial coordinates of 756 regions (patient tumour regions are referred to 8 as "PT regions" hereafter) from 66 tumours (Zhao et al. 2020). Here, we focus on spatial 9 analyses of microdiversity and parallel evolution in the same dataset. We found in the model 10 setting that the spatial patterns of tumour growth influenced the extent and trajectory of 11 subclonal diversification. Furthermore, subpopulations with high microdiversity, as well as 12 parallel mutational events with limited clonal expansion, were frequently detected near the 13 tumour edge, which was corroborated in the TRACERx Renal dataset. Finally, time-course 14 studies on simulated tumours suggested that evolutionary steps could be predicted at an early 15 stage and seeded by clones in the budding structures on the tumour surface in a subset of 16 tumours. These budding structures were evident in 15 early-stage ccRCC tumours, raising the 17 intriguing possibility that spatially resolved sampling, when combined with sequencing, may 18 enable identification of evolutionary potential in early-stage tumours and predict the disease 19 course. 20 21 Results: 1   Generation of an agent-based model recapitulating ccRCC evolution   2   We developed a coarse-grained cellular automaton model to explore the evolutionary dynamics  3 of ccRCCs, a brief outline for which is presented here (see Methods for detailed description). 4 The model includes 12 driver genes and 14 SCNAs (Supplemental Figure 1) that were 5 highlighted as canonical driver events in ccRCCs in the TRACERx Renal study (Turajlic et al. 6 2018). Each model unit, referred to as a "tumour voxel", represents a tumour volume of 1 ## ! . 7 Tumour voxels stochastically undergo growth, death, and, upon growth, processes of driver 8 acquisition (i.e., mutations in driver genes or driver SCNAs). To keep the model minimal, 9 several assumptions were made. In the model, time has arbitrary units, so the baseline growth 10 probability $ "#$%&' was arbitrarily defined. Three levels of growth probability, $ "#$%&' , were introduced for the 26 drivers, to broadly reflect their association with 12 the Ki67 score in tumour regions in the experimental data (Supplemental Figure 1). As one 13 specific implementation, growth probabilities were set at $ "#$%&' and $ "#$%&' , where 0 ≤ -≤ 1 determines this advantage 15 relative to the baseline growth probability. For simplicity, mutations in all driver genes were 16 assigned with the baseline growth probability, while some SCNAs were assumed to confer 17 greater growth probabilities. Four SCNAs with strong association with Ki67 score, 7q gain, 18 20q gain, 4q loss, and 8p loss, are the most "advantageous" drivers with $ "#$%&' . A tumour 19 voxel grows with $ "#$%&' ()) conferred by the most advantageous driver harboured. Note that an 20 alternative implementation for fitness advantage was also explored (Supplemental Figure 2, 21 Supplemental Note 1, see Methods). Mutations in driver genes were assumed to be acquired 22 with a greater probability ($ ,#-./# ) than SCNAs. Based on functional evidence linking them to 23 chromosome instability (Varela et al. 2011, Peng et al. 2015, as well as their association with 24 a high weighted genome instability index (wGII) in the experimental data from the TRACERx 25 Renal study (Turajlic et al. 2018), mutations in PBRM1 or BAP1, were assumed to enhance the 26 probability of SCNA acquisition. Finally, a second mutation in the same gene was assumed to 27 never occur in the same tumour voxel. 28 29 Each simulation starts from a single tumour voxel carrying VHL and 3p loss and is terminated 30 when the size of the simulated tumour exceeds 1 million tumour voxels. Simulated tumours 31 were analysed at three spatial scales: three-dimensional (3D) whole tumour volume, two-32 dimensional (2D) tumour slice and tumour regions (Figure 1d). Exploratory simulations (50 33 per condition) were performed at 6 levels ofand 11 levels of $ ,#-./# . This permits the 34 understanding of the collective impact of fitness advantage and the rate of driver acquisition in 35 a spatial context. 36 37 Characterisation of clonal diversity in the whole tumour  43  model was smooth overall, the tumour under Surface Growth formed bulging structures on the  1  surface, which reflected the outgrowth of driver subclones.  2  3 Given the disparity in the quantity and spatial distribution of driver subclones in the 4 representative cases under Volume Growth and Surface Growth (Figure 1e), we then asked 5 how the mode of tumour growth might influence the number of subclones in the whole tumour. 6 We first counted the number of clones (including parental clone and subclones) in the whole 7 tumour (Figure 2a-b). In the Volume Growth model, subclones were only observed in tumours 8 with greaterand greater $ ,#-./# (panel (i) in Figure 2b). By contrast, in the Surface Growth 9 model, tumours with small to moderate $ ,#-./# harboured more subclones, for a wide range of 10 -(panel (ii) in Figure 2b). These findings suggest that the mode of tumour growth could 11 impact outgrowth of subclones and their ultimate prevalence. 12 13 Noticeably, tumours also displayed distinct proportions of subclonal populations, independent 14 of the growth mode and evolutionary conditions (i.e., identicaland $ ,#-./# ) (Figure 2c). In 15 the Volume Growth model, most tumours with a small $ ,#-./# displayed a monoclonal 16 structure with limited evidence of clonal evolution (panel (i) in Figure 2c). With a large 17 $ ,#-./# , a greater extent of clonal evolution was evident in a higher proportion of tumours 18 (panel (ii) in Figure 2c). In a fraction of these cases, one dominant subclone was observed 19 with the parental clone present as a minority, suggesting early fixation of a highly fit clone and 20 a (near) clonal sweep (i.e., the entire tumour was taken up by the dominant subclone). In 21 comparison, in the Surface Growth model, extensive subclonal diversification was evident in 22 nearly all cases, even with a small $ ,#-./# (panel (iii) in Figure 2d). These features observed 23 in individual tumours were reflected by the whole-tumour cancer cell fraction (CCF) of the 24 parental clone (Figure 2e) and Shannon diversity index (Figure 2f). Across a wide range of 25 driver acquisition probabilities, the Surface Growth model, in comparison to the Volume 26 Growth model, consistently displayed a relatively lower whole-tumour CCF of the parental 27 clone and a greater Shannon diversity index, indicating a greater extent of subclonal 28 diversification. The greater extent of diversification in the Surface Growth model was also 29 noted for conditions with still smaller $ ,#-./# (Supplemental Figure 3) or smaller -30 (Supplemental Figure 4), where clonal evolution was generally limited. For the interest of 31 characterising patterns of subclonal diversification and contrasting two growth modes, we 32 limited our parameter analysis, for the subsequent investigation, to -= 1 and a range of $ ,#-./# 33 from 2 × 10 01 to 1 × 10 0! . 34 35 In general, Surface Growth appears to enable more extensive subclonal diversification, leading 36 to a highly branched tumour evolution, while Volume Growth often resulted in tumours with 37 limited clonal evolution and occurrence of punctuated evolution. Interestingly, these were the 38 precise modes of evolution that underlined diverse evolutionary subtypes identified in ccRCCs, 39 ranging from "VHL mono-driver" tumours with limited evidence of clonal evolution, to 40 tumours with extensive subclonal diversification characterised by a range of drivers, and those 41 with early fixation of a highly fit clone resulting in rapid clonal sweep (Turajlic et al. 2018). 42 43 Having established a quantitative understanding of clonal diversity in the whole tumour and 44 the impact of different growth modes, we next strived for a spatial understanding of clonal 45 diversity. 46 47 Evaluation of genomic divergence between tumour regions 1 As a first step towards understanding spatial features of clonal diversity, we asked whether in-2 silico tumours were able to recapitulate a phenomenon of spatial clonal evolution noted in 3 colorectal and liver cancers by others (Cross, et al. 2017;Zhai, et al. 2017) and, more recently, 4 in ccRCCs by our group (Zhao et al. 2020), whereby subclones are patterned within a tumour 5 such that regions farther apart exhibit more divergent genomic makeups. 6 7 To this end, we sampled uniformly spaced regions within a 2D tumour slice and measured the 8 spatial and genomic distances between each pair of regions (Figure 3a, see Methods). 9 Consistent with previous experimental evidence (Cross, et al. 2017;Zhai, et al. 2017; Zhao, et 10 al. 2020), between-region spatial and genomic distances were positively correlated (hereafter 11 referred to as "spatial-genomic correlation") in representative tumours under Surface Growth 12 (Figure 3b-c) or under Volume Growth (Figure 3d-e) and also in repeat simulations across 13 multiple values of $ ,#-./# (Figure 3f). Moreover, different growth modes resulted in distinct 14 trends of variation in the spatial-genomic correlation with the change of $ ,#-./# . In the Volume 15 Growth model, the spatial-genomic correlation was weaker in tumours with smaller $ ,#-./# . 16 By contrast, in the Surface Growth model, the spatial-genomic correlation was weaker in 17 tumours with greater $ ,#-./# , as the extensive diversification likely caused large genomic 18 divergence even in regions separated by a short distance (Figure 3f). As a consequence of these 19 divergent trends, the spatial-genomic correlation was stronger in tumours under Surface 20 Growth than those under Volume Growth if $ ,#-./# was small, and weaker if $ ,#-./# was large. 21 Experimentally, the cohort of ccRCCs in the TRACERx Renal study displayed a spatial-22 genomic correlation coefficient of 0.31 (Zhao, et al. 2020). 23 24 Overall, these data demonstrate that our model is sufficient to recapitulate the spatial-genomic 25 correlations observed across experimental data, a pattern emergent from spatial evolution of 26 subclones in tumours. We next asked how clonal diversity itself could vary spatially within a 27 tumour slice and whether the mode of tumour growth had an impact on the spatial patterns.

29
Spatial distribution of clonal diversity within tumour sections 30 Characterisation of clonal diversity in the whole tumour and the between-region genomic 31 divergence provided a quantitative view of the extent of subclonal diversification. We next 32 sought to examine whether clonal diversity was spatially uniform or variable within a tumour. 33 To this end, a 3##-by-3## sliding window was applied throughout the 2D tumour slice and 34 the number of distinct subclones (defined as "microdiversity") was counted in that 9-## + area 35 (Figure 4a). Subsequently, the spatial profile of microdiversity was analysed to identify 36 hotspots with microdiversity equal to or greater than 5 (defined as "microdiversity hotspots") 37 (Figure 4a).

39
Consistent with the previous analysis on the whole-tumour CCF of subclones (Figure 2a), 40 tumours under Surface Growth developed multiple subclones that occupied distinct spatially 41 contiguous areas, reflective of a branched clonal structure (panel (i) in Figure 4b).

42
Microdiversity hotspots were frequently observed near the tumour edge within outgrowing 43 advantageous subclones but also at the boundaries of multiple adjacent subclones (panel (ii) 44 in Figure 4b). By contrast, tumours under volume growth commonly developed a single 45 dominant subclone (panel (i) in Figure 4c). In these tumours microdiversity hotspots were 46 observed within the area spanned by the dominant subclone with a more uniform distribution 1 along the tumour radius (panel (ii) in Figure 4c). 2 To corroborate these spatial features of clonal diversity with experimental data, we examined 3 66 tumours in the TRACERx Renal study. Individual PT regions that contain at least two 4 subclones were treated as a proxy for microdiversity hotspots. In total, 606 PT regions from 54 5 tumours were included in this analysis. Different spatial distributions of microdiversity 6 hotspots were observed, as highlighted in representative examples: predominantly near the 7 tumour margin (as in "G_K234"), while more uniformly distributed throughout the tumour 8 section (as in "G_K446") (Figure 4d). 9 10 In order to quantify the observed spatial patterns, we measured the distance to the tumour centre 11 and the distance to the closest point at the tumour contour, for every microdiversity hotspot 12 over all repeat simulations. These two distances were then combined into a single variable 13 termed the "normalised distance to tumour centre", denoted as 0. Interestingly, the cumulative 14 probability distribution with respect to 0 depicted power law scaling (Figure 4e), suggesting 15 that the probability of observing spots with high clonal diversity along the radius of a tumour 16 could be estimated using a simple mathematical formula (i.e., 1(2 ≤ 0)~0 2 , where 4 is the 17 power law exponent to be fitted). Growth and Surface Growth models (Figure 4g). A greater error in matching data and fitted 24 power law was noted at regions approaching the margin, likely attributed to under-sampling of 25 biopsies at the tumour periphery (Supplemental Figure 5).

27
To summarise, regions with high clonal diversity within a tumour slice were increasingly 28 frequent towards the tumour margin, a spatial feature particularly enhanced in the Surface 29 Growth model and corroborated experimentally using PT regions with at least two subclones. 30 These data suggested that subpopulations with high clonal diversity were abundant near the 31 tumour margin. 32 33 Frequency and spatial features of parallel evolution

34
As subclonal diversification could involve acquisition of, and be facilitated by, distinct 35 mutations in the same gene at spatially separate locations, we next evaluated the frequency of 36 parallel evolution events and their spatial features. 37 38 Distinct mutational instances in the same driver gene were recorded as different driver events 39 in the model (Figure 5a). Guided by an exploratory whole-tumour analysis, which showed 40 parallel mutational events other than PBRM1 and BAP1 to be rare at relatively low resolution 41 (Supplemental Note 2, Supplemental Figure 6a-b), we specifically focused on these two 42 mutations. As might be expected, as a reflection of the extent of subclonal diversification in 43 the whole tumour shown above (Figure 2b To further elucidate the spatial patterns of parallel mutations in PBRM1 or BAP1, we next 48 labelled tumour regions harbouring these events (Figure 5a). Within a tumour section, 49 subpopulations harbouring parallel mutational events expanded to occupy a variable number 1 of regions (Figure 5c-d). When mutational instances in PBRM1 or BAP1 occurred late, they 2 led to limited clonal expansion (up to two tumour regions), despite the role in promoting 3 acquisition of SCNAs (Supplemental Figure 6c-f). Those instances were commonly observed 4 at a short distance to the tumour edge (Figure 5e). Furthermore, for a range of $ ,#-./# , the 5 Surface Growth model resulted in a shorter distance of the parallel evolution events to the 6 tumour edge, compared to the Volume Growth model, likely due to the birth of these subclones 7 occurring at the tumour surface. 8 9 In the TRACERx Renal study, parallel evolution was observed in 28 tumours, with parallel 10 mutations in the same gene spanning distinct sets of regions (Supplemental Note 3). As 11 highlighted previously (Turajlic, et al. 2018) and similar to the in-silico tumours, distinct 12 parallel mutational events could span a variable number of PT regions (Figure 5f).

13
Consistent with the in-silico analyses, parallel mutational events in driver genes with limited 14 clonal expansion (spanning up to two PT regions) were predominantly located near the 15 tumour edge (Figure 5g). This spatial distribution was also found for mutational events 16 without evidence of parallel evolution and with limited expansion (Figure 5g), suggesting 17 that subclones lacking significant expansion were commonly emerging near the tumour edge. 18 The lack of expansion of these subclones could be attributed to either their narrow fitness 19 advantage and/or late occurrence. 20 21 Together, these data suggest that parallel evolution in BAP1 or PBRM1 was more prevalent in 22 the Surface Growth model, which underlined extensive subclonal diversification. For both in-23 silico modelling and experimental analysis of ccRCC tumours, parallel mutations with limited 24 clonal expansion were located near the tumour edge, implying that a spatially resolved 25 sampling could target the tumour margin to identify ongoing parallel evolution.

27
Predictive feature of future evolutionary trajectories

28
Having so far focused on quantitative and spatial characterisation of microdiversity and 29 parallel evolution, we next asked how features of clonal diversity evolved temporally and 30 whether some features could enable the prediction of future evolutionary trajectories. The 31 ability to predict likely evolutionary trajectories could aid in the clinical management of 32 ccRCC tumours. However, the challenges of obtaining serial biopsies of the same tumour 33 limit this opportunity in clinical practice. Therefore, we examined if our model could be used 34 to indicate the temporal features of evolution of clonal diversity. 35 36 To this end, the number of subclones within a tumour slice was tracked over time. distinct growth modes appeared indistinguishable at early stage according to this measurement. 42 We then asked whether some spatial features could indicate early the divergent patterns of 43 subsequent subclonal diversification in the two models. 44 45 To shed light on this question, we evaluated the Surface Growth model for the presence of 46 early indicators of subclonal diversification. In fact, one characteristic spatial feature noticed 47 in tumours under Surface Growth was a budding structure on the tumour surface, which 48 indicated the beginning of outgrowth of an advantageous subclone (Figure 1e, Figure 6c). 49 Quantitatively, this budding structure reflected the rapid increase in the CCF of an 1 advantageous subclone, as it spatially outcompeted surrounding subpopulations (Figure 6d). 2 Depending on the driver acquisition probabilities employed in the model, the median tumour 3 size for the detection of such budding structures was around 7 cm or smaller. Exploratory 4 simulations attempting at "replaying" evolution (i.e., re-simulating clonal evolution from a 5 historical tumour state with established clonal structure as a starting point) starting from 6 different tumour sizes suggested that evolution was more repeatable if starting from a historical 7 tumour state with budding structures emerging (Supplemental Figure 8, Supplemental Note 8 4). 9 10 With respect to the above findings, we then turned to analysing 46 tumours with a size of up 11 to 7cm in the TRACERx Renal study. By qualitative examination of radiological images of 12 these tumours, 15 tumours displayed apparent budding structures at their surface. In one 13 representative case ("G_K523"), budding structures were evident both radiologically ( Figure  14 6e) and in the tumour contour image with clonal diversity mapped (Figure 6f). Interestingly, 15 trailing the budding structures were regions with high clonal diversity, consistent with our in-16 silico tumour simulations under Surface Growth. These findings imply ongoing subclonal 17 diversification in these regions that may delineate future evolutionary steps. 18 19 Lastly, in the TRACERx Renal study, the number of subclones depicted a non-linear 20 relationship with tumour size (Figure 6g). This broadly reflected different modes of evolution, 21 ranging from limited evolution (bottom-left part in Figure 6g), to punctuated evolution 22 (bottom-right part in Figure 6g) and branched evolution (top part in Figure 6g). Furthermore, 23 a subset of tumours (enclosed only by blue contour in Figure 6g) were better recapitulated by 24 the Volume Growth model, while another subset by Surface Growth model (enclosed only by 25 red contour in Figure 6g). A more intriguing question still is whether the future steps of clonal 26 evolution in the early-stage tumours could be predicted. When 15 early-stage tumours with 27 apparent budding structures were highlighted, a subset of these already displayed a greater 28 extent of subclonal diversification, raising the possibility that spatially resolved sampling 29 combined with sequencing in these regions may enable identification of evolutionary potential 30 in early-stage tumours. 31 32

Discussion:
1 Intra-tumour genetic heterogeneity arises when different parts of the tumour acquire distinct 2 genomic alterations, endowing subclones with a variety of fitness advantages. Despite our 3 understanding of how dominant subclones sculpt evolutionary trajectories and histories, 4 gleaned primarily from multi-region sampling and deep sequencing, under-represented 5 subclones that reflect ongoing and clinically relevant evolution could remain undetected. 6 Therefore, our focus on microdiversity and parallel evolution is central to the elucidation of 7 the evolutionary potential of under-represented subclones and the emergent spatial patterns 8 they form. 9 10 An important finding, via computational modelling, is that different spatial patterns of tumour 11 growth impact the extent of subclonal diversification and shape divergent modes of evolution. Overall, our computational model was able to correlate with several clinical observations, 47 despite the simplification of its primary components (probabilistic birth, death, and driver 48 acquisition). Nevertheless, we are aware that our model has limitations. Clearly, many other 49 factors, such as vascularisation, necrosis, and immune predation, are all likely to (re)shape the 1 in-silico patterns of clonal diversity we observed. The simple implementations of fitness 2 advantage endowed by ccRCC drivers could be improved in the future with wet-lab 3 experiments aimed at measuring and comparing growth kinetics of tumours with different 4 genetic backgrounds. Future time-lapse experiments could also provide evidence for growth 5 modes of ccRCCs and elucidate whether different modes may predominate in different regions 6 or at different stages of the same tumour. Moreover, while we made an assumption that only 7 mutations in BAP1 or PBRM1 promoted acquisition of SCNAs for simplicity, we acknowledge 8 that other ccRCC driver mutations (e.g., SETD2) also link to chromosome instability. Lastly, 9 alternative mechanisms could explain the formation of budding structures and other protrusive 10 morphologies, such as mechanical properties (Fiore et al. 2020) and cell migration (Anderson 11 et al. 2006). In addition to these limitations from a modelling perspective, differences in size, 12 stage, and the number of samples among tumours in the TRACERx Renal study could 13 confound the patterns to which the computational outputs are compared. The current 14 sequencing data is panel-based with a small number of clonal markers and may have low 15 sensitivity to detect microdiversity. Future examination of whole-exome or whole-genome data 16 could lead to a more comprehensive characterisation of ongoing clonal evolution. 17 18 In conclusion, our study supports the importance of understanding spatial patterns of tumour 19 growth in deconstructing tumour evolution. It provides evidence, and has implications for, 20 focused spatial sampling of the tumour margin in order to garner richer information on ongoing 21 and clinically relevant evolution.

23
Methods: selection, suggesting that spatial growth of a structured population interplays with 8 evolutionary forces (driver acquisition, selection, and genetic drift) to shape the spatial 9 patterning of subclones. 10 11 In the present study, a coarse-grained cellular automaton model has been constructed to 12 simulate tumour growth and the evolution of ccRCC drivers. A basic model unit reflects a 13 tumour volume of 1 × 1 × 1 ## ! , referred to as a "tumour voxel". and $ "#$%&' , where 0 ≤ -≤ 1 determines this advantage relative to the baseline growth 5 probability. The baseline growth probability $ "#$%&' ()) was arbitrarily defined, as time has 6 arbitrary units in the model. For a given tumour voxel, its growth probability is defined by the 7 most advantageous driver harboured (i.e., "saturated" model of fitness advantage). For 8 simplicity, VHL mutation and 3p loss are considered as truncal events in the founder tumour 9 voxel, although in a clinical setting VHL mutation are not present universally (Turajlic et al. 10 2018). The subpopulation of tumour voxels that only harbour these two events is referred to 11 parental clone. For an exploratory purpose, an alternative implementation of growth advantage 12 endowed by drivers was also evaluated (Supplemental Figure 2). In this implementation, a 13 driver adds a certain amount of growth probability to the tumour voxel that acquires the driver 14 (i.e., "additive" model of fitness advantage) (Supplemental Figure 2a). The growth 15 probability of a tumour voxel depends on all the drivers it harbours, namely, $ "#$%&' = 16 $ "#$%&' , where $ "#$%&'_2 reflects the amount of growth probability added by 17 driver 4. $ "#$%&' is set to one if the calculated probability exceeds one. The amount 18 $ "#$%&'_2 varies between drivers, reflecting different strengths of their association with Ki67 19 score (Supplemental Figure 1a). Three different scenarios were explored to reflect different 20 amounts of growth probability endowed by drivers on average, as determined by -2 of the 21 weakest driver, namely, min(-2 ), and the difference in -2 between consecutive two drivers in 22 their advantages, namely, ∆-2 . (Supplemental Figure 2b). we assume that the acquisition of SCNAs in a tumour voxel becomes equally likely as 32 mutations in driver genes, if the tumour voxel harbours mutations in BAP1 or PBRM1. A range 33 of driver acquisition probabilities have been studied to explore its impact on patterns we 34 investigate. Lastly, only one mutation in the same driver gene is permitted in the same tumour 35 voxel, but multiple independent, distinct mutations may be acquired in parallel within a 36 simulated tumour in different tumour voxels. 37 38 Simulation 39 Each simulation starts from a single tumour voxel placed at the centre of the lattice, (: 5 , < 5 , = 5 ), 40 and runs until the tumour grows to at least 1 million tumour voxels after the last simulation 41 step. In each simulation step, growth and death are evaluated for every tumour voxel, in random 42 order. For those tumour voxels selected to grow, driver acquisition is evaluated for each of the 43 26 drivers. The computer code is written in CUDA C++. 44 45

Model analyses
1

Levels of analysis 2
Analyses have been conducted at three different levels ( Figure 1D): (1) whole tumour level,  3 which takes into account all tumour voxels in the 3D volume; (2) tumour slice level, which 4 takes into account all tumour voxels within a 2D plane (= = = 5 ); (3) regional biopsy level, 5 which takes into account tumour voxels within regional biopsies. A regional biopsy is defined 6 as all tumour voxels within a region in the 2D slice. Spatially uniform sampling is performed 7 in this study. This process is carried out by locating the centres of candidate regional biopsies 8 in the 200## × 200## 2D lattice with a spacing of 20 ## and collecting all voxels within 9 a distance of 5 ## from each biopsy centre. 10 11 Spatial maps of driver events 12 In representative cases, tumour voxels that harbour selected driver events are mapped, within 13 a 3D tumour surface and within a 2D tumour slice. In spatial maps of driver events, tumour 14 voxels that don't harbour drivers of interest are in grey; tumour voxels that harbour mutations 15 in driver genes are in blue; tumour voxels that harbour gain of chromosome arm events are in 16 green; tumour voxels that harbour loss of chromosome arm events are in red.

18
Spatial maps of subclones 19 A subclone is defined as a group of tumour voxels that harbour the same set of driver events. 20 In spatial maps of subclones, tumour voxels that belong to the parental clone are in grey, while 21 other subclones are visualised in randomly generated (R, G, B) colours.

23
Cancer cell fraction (CCF) of driver events 24 CCF of a driver event is calculated as the number of tumour voxels that contain the driver event 25 divided by the total number of tumour voxels in the domain of interest, depending on the level 26 of analyses. A driver event is considered detectable if the CCF is greater than 0.01. 27 28 Cancer cell fraction of subclones 29 CCF of a subclone is calculated as the number of tumour voxels that belong to a subclone 30 divided by the total number tumour voxels in the domain of interest, depending on the level of 31 analyses. A subclone is identified by a set of driver events, shared by a subpopulation of tumour 32 voxels, which are accumulated within the subclone-initiating tumour voxel. A subclone-33 initiating tumour voxel is defined as a tumour voxel that acquires a new driver event upon birth. 34 A subclone is considered detectable if the CCF is greater than 0.01. 35 36 Shannon diversity index 37 As a measure of clone diversity, Shannon diversity index is defined as > = ∑ −@ -ln @ --, where 38 @ -is the CCF of the subclone C. All subclones are taken into account in this calculation. 39 40 Spatial and genomic distances 1 Spatial distance between two regional biopsies is calculated as the Euclidean distance between 2 the positions of the centres of two biopsies. Genomic distance between two regional biopsies 3 is calculated as 0 "/6 = D∑ (E -2 − E 7 2 ) + +8 2 , where E -2 is the state of 4th driver in the Cth region 4 and has a value of 1 if the 4th driver is present, otherwise a value of 0. 5 6 Microdiversity 7 Microdiversity is defined as the number of subclones contained in a 3-mm-by-3-mm region 8 within the tumour slice. In representative cases, microdiversity is spatially mapped within a 9 tumour slice, by sliding a 3-mm-by-3-mm spatial window throughout the tumour slice. 10 Microdiversity hotpots are defined as a subset of these small regions with 5 or more subclones. 11 The distance from a microdiversity hotspot to the centre of a tumour slice is referred to as the 12 distance to tumour centre (0 ) ). The distance from a microdiversity hotspot to the nearest point 13 along the tumour contour is referred to as the distance to tumour margin (0 + ). The normalised 14 distance to tumour centre is defined as 0 = 0 ) (0 ) + 0 + ) ⁄ . Cumulative probability distribution 15 of 0 is generated by combining microdiversity hotspots from repeat simulations. Power law 16 exponent is obtained by bootstrapping 100 samples of 400 hotspots per sample and fitting a 17 power law function to cumulative probability distribution of G in each sample. 18 19 Parallel evolution 20 Parallel evolution refers to the independent, distinct instances of mutations in the same driver 21 gene that are acquired at different tumour locations. At the whole tumour level, the extent of 22 parallel evolution is assessed by counting the instances of independent, distinct instances of 23 mutations in each driver gene at a given resolution of detection. The resolution of detection is 24 represented using a minimum CCF that has to be reached by a parallel mutational event to be 25 detectable. At the regional biopsy level, the extent of parallel evolution is assessed by counting 26 the number of regions that contain distinct instances of mutations for each driver gene. The 27 distance from any region containing a parallel instance to the tumour margin is measured, to 28 study the spatial distribution of parallel mutational instances with limited clonal expansion. 29 30 Temporal analysis 31 Kernel density estimation with a Gaussian kernel is performed with respect to the number of 36 subclones and the diameter of the tumour slice, based on all simulations, to produce a 37 continuous density estimate, using seaborn.jointplot(kind = 'kde')in Python. 38 In a representative case, tumour voxels that harbour parallel mutational events in BAP1 are 39 mapped within historical tumour slices. In all simulations, CCFs of subclones that harbouring 40 mutations in BAP1 are measured within historical tumour slices over time.

42
Evolutionary replay 1 To perform evolutionary replay, historical tumour state (i.e., locations and clonal identities of 2 tumour voxels) at a certain time point is saved and employed as a common starting state for re-3 growth 50 new simulated tumours. At the end of these simulations, Shannon diversity index is 4 calculated to indicate the divergence in their evolutionary outcomes. Microdiversity 20 Spatial maps of regional clone diversity are created for two representative tumour sections. In 21 these maps, regions are colour-coded based on the number of subclones. Regions that harbour 22 at least one subclone are treated as a proxy for microdiversity hotspots defined in the model 23 analysis. In total, there are 606 regions from 54 tumours that satisfy this criterion. For these 24 regions, the normalised distance to tumour centre is measured as described above in "Model 25 analysis -Microdiversity". 26 27 Parallel evolution 28 Spatial maps of parallel mutational events in PBRM1 are created for one representative tumour 29 section. In these maps, regions are coloured differently according to different parallel 30 mutational events. Regions that harbour more than one event are indicated with multiple 31 colours. To study the spatial distribution of mutational events with limited clonal expansion, 32 the maximum distance from an event spanning up to two regions to the tumour margin is 33 measured.

35
Statistical analysis 36 Two-sided Wilcox's rank test is performed to compare several measurements between different 37 model conditions. Statistical significance is annotated within box plots using 38 stat_compare_means(method = "wilcox.test", label = "p.signif")in 39 R. 40 Pearson correlation coefficients are calculated to assess the correlation between spatial and 1 genomic distances between tumour regions, using cor.test(variable1, 2 variable2, method = "pearson") in R. 3 Two-sample Kolmogrov-Smirnov test is performed to compare two cumulative probability 4 distributions, using scipy.stats.ks_2samp(sample1, sample2)in Python. 5 Bootstrapping is performed to generate 100 random samples of 400 microdiversity hotspots 6 per sample with replacement, using random.choice() in Python. The power law 7 exponent is then determined by fitting a power law function to the cumulative probability 8 distribution from each sample, using scipy.optimize.curve_fit()in Python. 9 Quantile-Quantile (Q-Q) plot is employed to compare actual distribution of microdiversity 10 hotspots with a power law distribution with exponent being the median of fitted values in 11 bootstrapping, using statsmodels.graphics.gofplots.qqplot()in Python. 12 Kernel density estimation is performed for simulations with respect to the size of tumour 13 slice and the number of subclones, using seaborn.jointplot(kind="kde")in 14 Python  (d-e) The same analysis as described in (b-c) for a representative in silico tumour with Volume 43 Growth. 44 (f) Pearson correlation between spatial distance and genomic distance in in silico tumours 45 under Volume Growth and Surface Growth with varying driver acquisition probabilities. N = 46 100 for each condition. 47 Statistical annotations in (f) reflect two-sided Wilcoxon tests: "****" indicates 1 ≤ 0.0001 1 and "ns" indicates no statistical significance. 2 3 Figure 4. Spatial features of clonal diversity. 4 (a) Schematic figure and procedure for the analysis within a 2D tumour. 5 (b) Spatial maps of subclones (i) and microdiversity (ii) in a representative in-silico tumour 6 under Surface Growth. 7 (c) The same analysis as described in (b) for a representative in-silico tumour under Volume 8 Growth. 9 (d) Maps of regional biopsies with the number of subclones within a biopsy colour coded in 10 two cases (G_K234 and G_K446) in the TRACERx Renal study. Hues from red to purple to 11 blue reflect decreasing number of subclones. "Low" reflect zero subclones, while "High" 12 reflects the maximum number of subclones found in any region (i.e., 4 subclones in G_K234, 13 3 subclones in G_K446). 14 (e) Cumulative probability distribution, 1(2 ≤ 0), of the normalised distance to tumour centre 15 in     Genomic distance Spatial distance (mm) n 3 6 9 12 (i) Sample regions from tumour slice.
(ii) Collect driver information in each region.
(iii) Calculate spatial and genomic distances between each pair of regions. a R1 R2 (iv) Evaluate correlation between spatial and genomic distances.