Phenotypic heterogeneity and evolution of melanoma cells associated with targeted therapy resistance

Phenotypic plasticity is associated with non-genetic drug tolerance in several cancers. Such plasticity can arise from chromatin remodeling, transcriptomic reprogramming, and/or protein signaling rewiring, and is characterized as a cell state transition in response to molecular or physical perturbations. This, in turn, can confound interpretations of drug responses and resistance development. Using BRAF-mutant melanoma cell lines as the prototype, we report on a joint theoretical and experimental investigation of the cell-state transition dynamics associated with BRAF inhibitor drug tolerance. Thermodynamically motivated surprisal analysis of transcriptome data was used to treat the cell population as an entropy maximizing system under the influence of time-dependent constraints. This permits the extraction of an epigenetic potential landscape for drug-induced phenotypic evolution. Single-cell flow cytometry data of the same system were modeled with a modified Fokker-Planck-type kinetic model. The two approaches yield a consistent picture that accounts for the phenotypic heterogeneity observed over the course of drug tolerance development. The results reveal that, in certain plastic cancers, the population heterogeneity and evolution of cell phenotypes may be understood by accounting for the competing interactions of the epigenetic potential landscape and state-dependent cell proliferation. Accounting for such competition permits accurate, experimentally verifiable predictions that can potentially guide the design of effective treatment strategies.


Introduction
The phenotypic plasticity of many tumors can confound the identification of effective therapeutic strategies [1][2][3][4]. For such tumors, even if the cancer cells are isogenic, the cellular composition can be a heterogeneous mix of different cell states (phenotypes) that exhibit the capacity for dynamic interconversion. Each phenotype can have a characteristic gene expression profile, drug susceptibility, proliferation rate, and metastatic potential [5]. When this heterogeneous population is challenged with a physical or molecular perturbation, the cell states can rapidly evolve [6] to form a new population distribution better suited to survive the challenge. This adaption may proceed without genetic changes [7][8][9]. Removal of the challenge can lead to recovery of the original population distribution [5,10,11]. This behavior bears similarities to that of 'phenotypic equilibria' [1,12,13]. In those systems, if a subset of this population of microstates is physically separated from a stable, heterogeneous population and allowed to expand in culture, the phenotypic heterogeneity of the original culture will recover. This facile adaptability makes plastic tumors challenging to drug-target, and it highlights the importance of quantitative models that can provide predictive and mechanistic insights into the underlying driving force controlling such behaviors.
Similarities between steady states in nonequilibrium biological systems and perturbation/ relaxation scenarios in classical thermodynamics equilibria have prompted investigations into applying physicochemical models for describing phenotype dynamics within an epigenetic landscape [13][14][15][16]. Qualitative descriptive models have been explored for many years, but quantitative and predictive models have only been recently explored [14][15][16][17][18][19][20]. In one class of studies, epigenetic landscape models are explored, wherein stable cell states are described as local minima (attractors) within a metaphoric energy (or potential) cell-state landscape. In such models, the driving forces that influence the cellular composition and population dynamics are the gradients on that surface. As a result, cells tend to gravitate and remain in the local minima of such landscapes. However, in many other cases, this potential landscape does not predict the observed phenotypic heterogeneity [16], implicating other important factors that can influence the population dynamics are at play.
To address this puzzle, we studied highly plastic patient-derived BRAF V600E mutant melanoma cell lines as models of cancer cell phenotypic plasticity. The high rate of both response [21] and resistance development [10] of BRAF-mutant melanoma patients to BRAF inhibitor (BRAFi) treatment has made such cell lines important models for understanding challenges associated with targeted inhibitors [9,10]. BRAFi can trigger a series of nongenetic cell state changes along the melanocytic lineage towards drug-tolerant and eventually drug resistant states through epigenetic reprogramming. These include the transition of drug-sensitive melanocytic cancer cells into a drug-tolerant neural crest-like phenotype, which, under continued BRAF inhibition, can eventually transition into a fully drug-resistant, invasive mesenchymallike phenotype [5,[9][10][11]. The cell biology of this BRAFi-induced phenotypic evolution has been extensively characterized [5,22], and shown to correlate with what is observed in patient biopsies [9][10][11]22]. However, a quantitative biophysical understanding of this type of epigenetic plasticity has not been fully explored.
To this end, we carried out two sets of experiments, integrated with two theoretic approaches, on phenotypically plastic BRAF V600E mutant melanoma cell lines. At the macroscopic level, we measured a kinetic series of bulk transcriptomes over a 2.5-month course of low dose BRAF inhibition, during which time the cells evolve from a mostly melanocytic, drug-sensitive phenotype to a mesenchymal, drug-tolerant phenotype. This data set provides input into an information-theoretic surprisal analysis [23], which is used to identify the relative free energy-like potential over the entire course of cell state transition from drug response to drug tolerance. We also utilized microscopic inputs from flow cytometry to profile, at the single-cell level, the phenotypic evolution of the same system. These phenotypic evolution dynamic data cannot be described with conventional Fokker-Planck equation but can be well recapitulated using a modified Fokker-Planck-type (FP-type) kinetic model [17,18,24] which considered cell-state dependent proliferation differences. The model resolves relative cell state potential and cell-state proliferation differences were quantitatively validated through experiment. We further show that both approaches provide a self-consistent picture in which the combined effects from the relative stability of cellular phenotypes, together with the phenotype-specific net-proliferative rate, act as the drivers to predictably influence the cell population dynamics of drug-induced phenotypic evolution over time. The results provide conceptual guidance for considering effective therapy combinations [5].

Surprisal analysis of bulk transcriptome data resolves steady state and time-dependent constraints in the melanocytic to mesenchymal transition
We used two patient-derived BRAF V600E mutant cell lines (M397 and M229) with a prominent melanocytic to mesenchymal phenotypic evolution induced upon BRAF inhibition ( Fig 1A  and S1 Fig) [5]. We characterized this process by both a bulk transcriptome profiling (S1 Table) and a flow cytometry phenotyping using two protein markers (MART-1 and NGFR) that are established cell phenotype markers for this system [5,9,25]. The transcriptome was measured at Day 0 (D0), which served as an untreated control, and at a set of time points following BRAFi (vemurafenib) treatment (S2 Fig). Following drug exposure, the relative location of the binning of cell populations expressing different levels of the two markers followed a counterclockwise transition trajectory around the flow cytometry plots (S1(A) Fig), moving from the melanocytic phenotype (MART-1 pos ) towards a transiently enriched slow-cycling neural crest (MART1 neg /NGFR high ) population around day15 to day20 (D15-D20), and eventually terminating at a mesenchymal (MART1 neg /NGFR neg ) phenotype at around day62 (D62) with elevated expression of the mesenchymal marker N-cadherin ( Fig 1A). This drug-resistant population stably persisted with extended drug treatment beyond D62 (S1(A) Fig). A similar transition trajectory was also observed for M229 (S1(B) Fig). These drug-induced phenotypic transitions agree with previous literature [5,22].
To assess the overall stability and transcriptomic eigenpatterns associated with the cell population distributions at various time points across the drug-induced phenotypic evolution, we first applied surprisal analysis (Eq 1) to the transcriptome time series. Surprisal analysis extends the principles of maximum entropy and was initially formulated to understand the dynamics of nonequilibrium systems [26]. Using the method of Lagrange multipliers, it seeks the maximum entropy of molecules and identifies the global steady state with minimal free energy, as well as a series of time-dependent constraints that prevent the nonequilibrium system from reaching the global steady state [16,23,26,27]. Surprisal analysis has been extended to characterize biological processes in living cells, where it assesses the maximum entropy of the biomolecules within the cell ensemble through using a simplified approximation of quantum state distributions of the molecular species [23]. Consequently, for a system with kinetic transcriptome data as input, it can extract the time-independent gene expression baseline (the global steady state), as well as a series of gene expression modules (constraints) that evolve with time [16,23,26,28]. A full derivation and thermodynamic interpretation of Eq 1 is provided within the supplementary materials of previous reports [23,28].  In Eq 1, X i (t) is the measured level of transcript i at time t. This is considered to be the global steady-state level of transcript i (lnðX 0 i ðtÞÞ ¼ À l 0 G i0 ), modified by the sum of the contributions arising from the constrained processes. The global steady state resolved by surprisal analysis is the cellular state with maximum entropy. If there were no constraints acting upon the cells, then Eq 1 predicts that the cells would be in the global steady state. However, there are non-zero constraints (with amplitudes given by the λ j values), which are biological processes that move the system away from the global steady state. Transcripts associated with a constraint are identified through Eq 1 as lowering the entropy of the system, presumably to maintain one or more biological functions. Data mining the set of transcripts associated with a given constraint can provide insight into those biological functions. Although we do not impose the condition that λ 0 is time-independent, we neither expect nor find time-dependence (the λ 0 variation is <0.7%) (Fig 1B).
To capture the time evolution of the drug-treated cells, each constrained process is represented by a time-dependent amplitude λ j (t) and constraint-specific contributions from each transcript G ij . In principle, analysis of the transcriptomic data across the time series from D0 to D73 could resolve several constraints, but we resolve only three for M397 (S3 and S4 Figs). This is illustrated in Fig 1C, where we represent the whole transcriptome data as a self-organized map (SOM) [29]. The map structure is determined using all data sets. Each tile represents a minicluster of genes with similar expression kinetics. Gene clusters with related expression kinetics are placed close together, while clusters exhibiting very different kinetics are placed far apart. The tile color encodes the average expression level of the genes in that minicluster at a given time point. For SOMs representing a specific constraint, that average gene expression level is also weighted by the participation of the genes in the constraint, as determined from Eq 1. The gene expression profile for the global steady state remains unchanged throughout the transition, while the differentially expressed genes (termed eigengenes elsewhere [30]) specific to constraints λ 1 , λ 2 and λ 3 vary with time. Summing the global steady state and the three constraints reproduces the map of the measured transcriptome, indicating that, within the noise level of the data, the three major constraints are sufficient to accurately recapitulate gene expression levels globally across the transition. (Fig 1C and S5 Fig).
The major biological processes involved in each constraint, at a given time point, can be inferred by enrichment on the gene lists ranked by the constraint-specific contributions from each gene G ij (Fig 2A, S4 Fig and S2 Table), and by the time-dependent amplitude λ j (t) of that constraint. The first constraint shows monotonically increased amplitude (λ 1 ) along the course of the transition (Fig 1B), with up-regulated mesenchymal signatures, migration, invasiveness and metastasis features, as well as NFκB signaling (G 1 positive processes). It also reflects reduced glucose uptake and metabolism, MITF activity, and oxidative phosphorylation (G 1 negative processes) (Fig 2A). Constraint 2 contains similar transcriptional signatures, but its amplitude (λ 2 ) drops after 3 days of BRAFi exposure and slowly increases at later times ( Fig  1B). It points to an elevated MITF activity (G 2 negative process) and reduced cellular proliferation (G 2 positive process) at day-3. This is consistent with previous observations that a brief BRAFi exposure can induce melanocytic differentiation and increased BRAFi sensitivity [31,32]. The third constraint mainly involves oxidative phosphorylation and the TCA cycle, and has a near zero amplitude except for day 3 (Fig 1B), implying that initial BRAFi exposure leads to a sharply altered metabolic program. The three major constraints associated with M229 displayed similar dynamics and are functionally similar to those in M397 (S4 Fig), confirming the robustness of the BRAFi induced melanocytic to mesenchymal transition. To get a comprehensive view of the enriched transcriptional program, we plotted the enrichment maps of the GSEA results with respect to relevant gene function categories and highlighted representative gene sets in these categories ( Fig 2B and S6 Fig). Overall, these transcriptional signatures are wholly consistent with previous reports [5,[9][10][11]33], testifying the validity of our cell line model for recapitulating the known biology of the transition and confirming the power of surprisal analysis for dissecting the underlying biology of the transition.

Fokker-Planck modeling of phenotypic evolution with single-cell flow cytometry phenotyping failed in recapitulating the evolution dynamics
The same biological system was further characterized at the single-cell level using flow cytometry analysis of the established cell-state markers: NGFR and MART-1. The temporal transcriptomic signatures resolved by surprisal analysis result from the dynamics of the BRAFi-induced phenotypic evolution that can be characterized by MART-1 and NGFR marker proteins [5,9,25]. As shown in our previous report, these two marker proteins can yield the identical phenotypic classification to that of the whole transcriptome data [5]. Therefore, they can be used as robust phenotype markers during the course of the drug-induced transition ( Fig 1A and  S1 Fig).
To model the single cell data, we conceptualize cell population distributions as single cells moving on a configuration space delineated by the marker proteins. In this space, cell states correspond to stable or metastable attractors of a hypothetical potential landscape [34]. The dynamics of the protein markers for a single cell can be described by the Langevin type equation dz / dt = μ(z) + z, where z is the concentration vector of the protein markers (z 1 , . . .,z N ), μ(z) is a drift vector in concentration space that describes all of the deterministic (non-random) dynamics and can be determined by the gradient of the potential landscape. The term z is the white noise term from random fluctuations in protein expression: hz(t)z (t 0 )i = 2D δ(t-t 0 ) where D is the diffusivity tensor measuring the amplitude of those fluctuations [18]. The potential landscape of a cellular system is context-specific. We hypothesized that drug treatment altered the original drug naïve landscape into a new landscape, which in turn yielded relaxation dynamics as each cell adjusts to this new drift field, potentially with motions towards new attractor states.
Analyzing the dynamics arising from a multi-dimensional drift field is, in general, an intractable problem. However, the flow cytometry trajectory (Fig 1A and S1 Fig) upon BRAF inhibition suggested the simplification that cell populations may be considered to evolve along a linear chain of a limited number of cell states. Therefore, for computational convenience, we projected the protein concentration vectors of the two dimensional (2D) flow cytometry data into a one-dimensional (1D) representation where the cell populations were constrained to move along in this characteristic 1D trajectory ( Fig 3A). The distance along the trajectory x = x(z) serves as an effective reaction coordinate of the phenotypic evolution (see Methods).
The flow cytometry data do not track an individual cell stochastic trajectory, but rather give statistical snapshots of marker protein expression levels across single cells. Thus, it is natural to transform the single-cell Langevin equation into the Fokker-Planck equation for resolving the time-dependent probability distribution p(x,t) along the reaction coordinate [35]: Here, drift term μ(x) implies that motion along x is influenced by a potential landscape. D is a diffusivity that is assumed, for simplicity, to be a constant independent of x or drug treatment. Even in cases where the diffusivity depends on the reaction coordinate x, a Fokker-Planck (FP) equation with constant diffusivity can be obtained by a simple coordinate transformation as shown in Ref. [17].
Because the dynamics under consideration are 1D, the drift μ(x) can always be presented as the derivative of a scalar potential UðxÞ ¼ À mðyÞdy. This, in turn, is exactly related to the steady state solution of Eq 2 through a Gibbs relation as lim t!1 pðx; tÞ ¼ pðxÞ ¼ CexpðÀ 2UðxÞ=DÞ where C is a normalization constant. Therefore, one can determine (up to proportionality to D) the potential U from measurement of the steady state dis- � � ln pðxÞ. This FP approach has been successfully applied to understanding the population heterogeneity of model biological systems [17,18]. Here, we used a variation of this method to measure the diffusivity D = 0.

Modified Fokker-Planck-type kinetic model that incorporates cell-statedependent proliferation recapitulates the phenotypic evolution and predicts cell-state proliferation rates
We hypothesized that the disagreement with experiments arose because the drug would influence not only the cell phenotypic evolution but also the cell autonomous proliferation and survival. In other words, the cells have drug susceptibilities-as reflected by the net effect of cell proliferation and cell killing-that vary along the reaction coordinate. These factors can also influence the phenotypic compositions, but are neglected in Eq 2. Thus, we modified Eq 2 to include a self-sourcing term: Here the net growth rate α(x) (the net effect of cell proliferation and cell killing under drug treatment) was introduced to account for cell state-dependent drug susceptibility. As an additional ansatz, we considered α(x) as a double step function taking different values for the intermediate neural crest-like phenotype and late-stage mesenchymal phenotype relative to the early stage melanocytic phenotype. It is worthwhile to note that, with Eq 3, we were no longer working with a probability distribution p(x,t), but instead a non-normalized population P(x,t). Both the differential drift and self-sourcing term act together to induce the cell number changes that are proportional to the population size of a specific cell state. For direct comparison between the model P(x,t) and experimentally accessible p(x,t) from flow cytometry data, we simply factored out the norm (pðx; tÞ ¼ Pðx; tÞ=NðtÞ ¼ Pðx; tÞ= In this model, due to the addition of the self-sourcing term, the Gibbs relation between the drug-induced steady state p 1 (x) and the potential U(x) used in our analysis of the original FP equation no longer holds. To determine the parameters for this modified model, we therefore resorted to an unbiased numerical search for U(x) and α(x) that best fit the experimental data. The model prediction was obtained by numerically simulating Eq 3 with the same experimentally measured diffusivity D and the initial distribution p(x,0) as before, together with all possible U(x) and α(x) values in the unbiased search. We determine goodness of fit using an unweighted sum-of-square difference between all the predicted and measured cell population distributions p(x,t). In both cell lines, we were able to find one set of U(x) and α(x) for the modified FP-type kinetic model that produced the best prediction of population distributions over time. When compared to original FP model, the modified model predictions are in much better agreement to experiments (red lines in Fig 3B and S8 Fig). The agreement appears to confirm the validity of the self-sourcing term in Eq 3, but the value of that term can be put to an independent experimental test.
We treated the state-dependent net growth rate α(x) as a concrete prediction of the model, and found it to be in good agreement with experimentally measured cell growth rates: cell populations containing a higher fraction of the mesenchymal phenotype (day21-78) grow faster than those with a lower fraction (day0-21) (Fig 4, See Materials and methods). The agreements between model-predicted growth rates and experiments (Fig 4) further confirmed the validity of Eq 3 and show that differences in state-dependent growth rates are important in determining the drug-induced phenotypic evolution of the melanoma cells.
In addition to predicting proliferation rates, Eq 3 also yielded relative values of the epigenetic potential along the reaction coordinate U(x) (Fig 5A and S9

Further confirmation of the concordance of the epigenetic potential landscapes calculated from macroscopic and microscopic inputs
As demonstrated in previous work, surprisal analysis of the bulk RNA-seq data can also define a free energy-like potential corresponding to the drug-induced phenotypic evolution [16,28]. This potential, for the entire transcriptome of a cell state at time t, is relative to the global steady state, and is given by FðtÞ ¼ (See Materials and methods for details). It has a direct relationship to the entropy of the transcripts and thus evaluates, at a transcriptional level, the relative stability of a cell state (see Ref. [23] for theoretic details). Here we adopted the same definition to calculate the potential landscape over druginduced phenotypic evolution in melanoma cells. For M397, this potential landscape calculated from surprisal analysis, similar to the landscape calculated by the modified Fokker-Planck-type (Eq 3) model, indicates that the cells at D11 and D21, with mostly neural-crest like phenotypes are more stable than cells at earlier times (melanocytic phenotypes) or D73 (predominantly mesenchymal phenotype) (Fig 5B). For M229, cells at D21 with mostly the neural-crest like phenotype are also more stable than the cells at D90 (predominantly mesenchymal phenotype) (S10 Fig). Thus, the epigenetic potentials calculated from either surprisal analysis of bulk data or the Fokker-Planck kinetic model from single-cell data yield a consistent picture.
Both analyses indicate that neural-crest like cells are more stable than the mesenchymal phenotype. This prediction was experimentally validated by sorting the mesenchymal (NGFR neg /MART1 neg ) subpopulation from the M397 D73 distribution (S11 Fig). We carried out surprisal analysis of transcriptome data from both the segregated mesenchymal subpopulation and the unsorted day-73 population (a mixture of mesenchymal phenotype and neural-crest phenotype). Free energy-like potentials were calculated and found to be consistent with the scalar potentials of both sorted and unsorted populations determined by the modified FP-type kinetic model. The pure mesenchymal phenotype displayed higher potentials than the unsorted cells (Fig 5C). Hence, cell sorting and RNA-seq experiments confirmed the consistence between the two theoretic models, and indicated that the drug-resistant mesenchymal cells are epigenetically unstable relative to the neural crest phenotype.

Discussion
Heterogeneous cancer cell populations can often exhibit a phenotypic equilibrium and evolution behaviors, meaning that a specific composition comprised of relative abundances of distinct cancer cell phenotypes can be a characteristic of the system, and in the meantime, this characteristic composition will evolve or recover following the application or release of molecular or physical perturbations designed to alter it [2,5,[12][13][14]. This can, of course, confound the interpretation of responses to drug treatment, but it also provides a compelling biophysical puzzle. Here we investigated two statistical physics models to help build a predictive picture of such phenotypic equilibria. The models respectively utilize macroscopic and microscopic inputs, and we applied them towards understanding the population dynamics of phenotypically plastic patient-derived BRAF-mutant melanoma cancer cells following BRAFi treatment. During a few months period of drug treatment, the cells evolve from drug naïve, drug-sensitive melanocytic-dominated composition to a fully drug-resistant mesenchymal-dominated cell population. In an interesting parallel with state transitions in physical systems, the associated cell state transitions are fully reversible: upon drug removal, the mesenchymal cells revert back to a melanocytic state that is, for all intents and purposes, identical to the initial drug naïve state [5].
The first theoretical model, surprisal analysis, utilizes a bulk transcriptome kinetic series across the drug treatment course to provide a description of the global steady state (the state of maximum entropy) and to identify specific, time-dependent constraints that keep the system from reaching that steady state. The weights of the constrained processes can be utilized to generate a free energy-like potential of the cell-state space sampled during drug treatment [16,28]. It is worth noting that cells are open systems far from equilibrium. While a significant body of work has demonstrated the apparent parallel between equilibrium and nonequilibrium thermodynamics [36][37][38], the potential landscape across the cell state evolution in our study is still a metaphor of the real free energy landscape in an equilibrium system. However, the maximum entropy methods can infer the most probable distribution of a probabilistic system regardless of whether or not it is in equilibrium [39]. Surprisal analysis further extends the principles of maximum entropy to understand particularly small systems that are not in thermodynamic equilibrium [23,26,40]. Therefore, in analogy to entropy in equilibrium thermodynamics, the entropy (and free energy-like potential) of the cellular transcriptome calculated from surprisal analysis can be used to evaluate the overall stability of a cell state [28,41].
The second theoretic approach consists of a modified Fokker-Planck-type kinetic model, which takes a kinetic series of single cell flow cytometry data as input. This model considers the Langevin dynamics of self-sourcing single cells moving within a configuration space. That motion is influenced by both (random) diffusion and drift along a potential gradient, thus permitting a potential surface of the traversed cell-state space to also be extracted.
There are two primary considerations that allow results from these two theories to be directly compared. First, the flow cytometry data and the bulk transcriptome data sets capture the same essential biology. This is obviously not always true. However, for this particular case, the cell phenotype markers NGFR and MART-1 used in the single cell assays are known surrogates for drug-induced changes across the whole transcriptome [5]. It also implies that a more selective subset of the transcriptome might equally well recapitulate the underlying biology, which may be assessed by the contribution scores (G ij values) within each respective constraint. Second, the phenotypic evolution the melanoma cells proceeds stepwise from melanocytes ! neural crest ! mesenchymal phenotypes. This permits the cell response to BRAF inhibition to be considered as time-dependent motion along a linear reaction coordinate, and provides an equivalence between the Fokker-Planck reaction coordinate and the surprisal analysis time coordinate (Fig 5A and 5B).
We do not directly compare the y-axes of the two landscapes (Fig 5A and 5B), but only the slopes of the curves. The FP scalar potential and the surprisal analysis free-energy like potential have very different origins. The free energy-like potential is derived by comparing transcriptional profiles at each time point with that of the time-independent global steady state. The FP potential is derived from the drift term of Eq 3, and is, in fact, the only term in that equation that needs to be fitted, since both cell proliferation rate and diffusion along the FP reaction coordinate can be experimentally determined. However, both theories predict that the most stable cellular population is a largely neural crest phenotype. Surprisingly, that is not the cell population that is ultimately induced by the long-term drug exposure. That population is dominated by a mesenchymal phenotype with a minor neural crest component, and is arrived at through competing interactions. On the one hand, the neural crest phenotype serves as an attractor, but those cells only slowly proliferate. The higher potential mesenchymal cells are more proliferative and that is the dominating factor. This highlights a major difference between open biological systems and equilibrium thermodynamic systems [42].
The analyses presented here for the BRAF-mutant melanoma cells might suggest that identifying drug susceptibilities in each of the cancer cell phenotypes might lead to a more effective therapy. However, such highly plastic cancer cells might eventually switch into cell states that are resistant to even broad combination therapies. A more fruitful approach might be to target those biological mechanisms that underlie the plastic nature of the cells [5,43].

Ethics statement
Patient-derived melanoma M397 and M229 cell lines were generated from de-identified patient samples with written consent under UCLA IRB approved protocol # 11-003254.

Patient-derived melanoma tumor models and drug treatment conditions
Cells were cultured at 37˚C with 5% CO 2 in RPMI 1640 with L-glutamine (Mediatech, Inc, Manassas, VA), 10% fetal bovine serum (Omega Scientific Tarzana, CA), and 1% penicillin, streptomycin and fungizone (Omega Scientific Tarzana, CA). Cells were maintained and tested for mycoplasma as previously described [44,45]. Cell lines were periodically authenticated to their early passages using GenePrint 10 System (Promega, Madison, WI). Presence of mutations in the genes of interest was checked by OncoMap 3 or Iontrone, and was confirmed by PCR and Sanger sequencing as previously described [44,45].
Vemurafenib (NC0621949, Selleck Chemicals LLC) was dissolved in DMSO at designated concentrations before applying to cell culture media. All cell lines were plated in 10cm dish at 60% confluency and treated with vemurafenib for the specified numbers of days at twice the 50% inhibition concentration (IC50) of each cell line as reported before [5]. At different time points after drug treatment, cells were harvest for RNA-seq and flow cytometry. Cell number was also counted for determining the growth rate. Cell growth rate was fitted as the parameter α in the exponential growth curve equation N(t) = N 0 � 2 (α�t) , where N 0 is the cell number at the starting time point, and N(t) is the cell number at time t. Cell numbers counted at day 0, 7 and 21 were used to fit for the proliferation rate at day 0-21 time period, and cell numbers at day 30, 43, 66 and 78 were used to fit for the one at day 21-78 time period.

Flow cytometry analysis of cell phenotype
At different time points, cells were trypsinized from the dish, spun down and washed with PBS. Cell suspensions were stained for flow cytometry with PE-conjugated NGFR antibody from Biolegend (San Diego, CA). All cells were fixed with Fix-Perm buffer from BD Bioscience (San Jose, CA). Cells were then stained for intracellular Melan-A using FITC conjugated antibody from Santa Cruz (Dallas, TX). Isotypes for mouse IgG1k and mouse IgG1 respectively were used to enable correct gating and to confirm antibody specificity. Blue live-dead staining from Life technologies (Waltham, MA) was used to gate live cell events. 10000 alive events were collected for each sample. Flow cytometry analysis was conducted using LSR-II from BD Biosciences (San Jose, CA), and the data were analyzed using FlowJo software (Tree Star, Inc., San Carlos, California, USA).
Fluorescent images were acquired with a Nikon C2plus confocal microscope (Ti) using Plan Apo λ 20× objective (Nikon Inc., Melville, NY). The microscope was controlled by NIS elements AR software (4.51.00) with the following settings: 30 μm pin hole, 12-bit acquisition, 0.62 μm pixel size, 60 gain, and laser power of 5% (405 nm), 0.3% (561 nm), or 0.6% (640 nm). Images were background and contrast adjusted using their respective control wells with no primary antibody staining.

RNA-seq and transcriptomic data analysis
Cells treated under specified conditions and time periods were trypsinized to harvest for cell pellets. RNA extraction was performed at cell pellets using AllPrep DNA/RNA Mini kit from Qiagen. Bioanalyzer confirmed correct integrity, the library was constructed and Illumina 50 bps single-end RNA-seq data was collected for the samples described. RNA sequencing was performed using 50 bps single end sequencing on the Illumina HiSeq 2500 platform. Libraries were prepared using the IlluminaTruSeq RNA sample preparation kit per the manufacturer's instructions. Reads were mapped and aligned to the Homo sapiens NCBI build 37.2 reference genome using TopHat2 v2.0.9 [46]. Expression values in fragments per kilobase of exon per million fragments mapped (FPKM) were generated using Cufflinks v2.2.1 program and Cuffnorm to quantify and normalize aligned reads using the geometric library size normalization method [47].
Heatmap and clustering analysis of transcriptomic datasets was performed via MATLAB. Genes are pre-filtered by RPKM value with criteria of average value greater than 0.5 and coefficient of variance greater than 0.15. Filtered gene expression values were standardized across each row (normalized for each individual gene) and represented by a redblue colormap. Hierarchical clustering was performed with average linkage and Euclidean distance metric. Whole transcriptomic dataset and fractions of contributions from each constraints are visualized using self-organized mosaic maps with respect to its control via Gene Expression Dynamics Inspector (GEDI) [29]. Gene Set Enrichment Analysis (GSEA) [48] was performed using GSEA v2.2.3 software with 1000 permutations and weighted enrichment statistics. GSEA enriched gene sets were visualized as interaction networks with Cytoscape [49] and Enrichment Map [50].

Surprisal analysis and free energy-like potential
Surprisal analysis was applied as described previously [23,28]. The measured expression level of mRNA i at time t, ln X i (t), was expressed as a sum of a steady state term ln X 0 i ðtÞ and several constraints λ j (t)G ij representing deviations from the steady state. Each deviation term was a product of a time-dependent weight of the constraint λ j (t), and the time-independent contribution of the transcript to that constraint G ij .
To implement surprisal analysis, we computed the singular value decomposition (SVD) of the matrix ln X i (t). As well described previously [23], the SVD factored this matrix in a way that determined the two sets of parameters that are needed in surprisal analysis: the Lagrange multipliers (λ j ) for all constraints at a given time point, and for all times and the G ij (time-independent) transcription patterns for all transcripts i at each constraint j.
The free energy-like potential calculation based on the surprisal analysis result was implemented as in Ref. [33]. Briefly, The steady-state expression level of transcript i at time t can be linked to its actual expression level by as X 0 i ðtÞ ¼ X i ðtÞexpðÀ X j l j ðtÞG ij Þ. Therefore, as shown in Ref. [33], surprisal analysis defines the free energy-like potential of a transcript i relative to the global steady state at time t as f i ðtÞ ¼ X j l j ðtÞG ij . Taking all the transcripts into account, the free energy-like potential of the entire transcriptome of a cell state at time t relative to the global steady state is given by FðtÞ ¼ [16].
Natural log transformed transcriptomic dataset and fractions of contributions from each constraints (λ j (t)G ij ) calculated from surprisal analysis are visualized using self-organized maps (SOM). Self-organized map visualization of high-dimensional dataset in a form appropriate for human pattern recognition without discarding the global, higher-order information. Here, they present individual samples as a single 2-dimensional heatmap and, at the same time, display high-resolution patterns. Thousands of input genes are assigned to 625 rectangular "tiles" (SOM nodes), each of which represents a mini-cluster of genes, arranged so as to form a pattern within a 2-dimensional mosaic map on the SOM grid. Tile represent most similar clusters will be placed adjacent to each other in the mosaic. Gene Expression Dynamics Inspector (GEDI) package is utilized to implement the SOM visualization [29].

Modified Fokker-Planck-type kinetic model
The dynamics of the protein markers for a single cell can be described by the Langevin type equation dz / dt = μ(z) + z, where z is the concentration vector of the protein markers (z 1 , . . .,z N ), μ(z) is a drift vector in concentration space that describes all of the deterministic (non-random) dynamics and can be determined by the gradient of the potential landscape. The term z is the white noise term from random fluctuations in protein expression: hz(t)z(t 0 )i = 2D δ(t-t 0 ) where D is the diffusivity tensor measuring the amplitude of those fluctuations [18].
In the case of melanocytic to mesenchymal transition, for computational convenience, we projected the protein concentration vectors of the flow cytometry data into a one-dimensional (1D) representation where the cell populations were constrained to move along in this characteristic trajectory. This converted each snap-shot of cell population distribution from the 2D flow cytometry plot onto a one-dimensional distribution along the linear trajectory. More specifically, we reduced the dimensionality of the flow cytometry data by calculating the principle curve of the full set of measurements using the R package princurve. The data points were projected onto the curve, and the distances of these projected points along the curve were used as the one-dimensional data for the two Fokker-Planck models. These data points were converted into probability density functions (PDF) using kernel density estimation.
Consider the fact that flow cytometry data do not track an individual cell stochastic trajectory but rather give statistical snapshots of marker expression levels across many single cells. Thus, it is natural to transform the single-cell Langevin equation into the Fokker-Planck equation for resolving the probability distribution of the protein markers. The 1D coordinate ( Fig  3A) is defined as a reaction coordinate x(z) such that the Fokker-Planck (FP) equation for the probability distribution p(x,t) has the following form: Here, x is the 1D flow cytometry (FC) coordinate, D is a diffusion constant of the cells along x, and drift term μ(x) implies that motion along x is influenced by a potential landscape. In this model, the equilibrium distribution p 1 (x) = lim t!1 p(x,t) and the potential U(x) = − R μ (x)dx are connected through the Gibbs relation For the unmodified Fokker-Planck equation, this Gibbs relation was applied to the longterm drug treated cell population distribution data (day78 for M397 and day60 for M229) to infer a potential, whose gradient acted as the drift term driving the dynamic changes of the population distribution. This inferred potential and respective drift term, when coupled with diffusion constant D and the initial (day0) population distribution, generated the prediction results in Fig 3B. With regards to calculating diffusion constant D from cell sorting and relaxation experiments, the diffusion coefficient D was assumed to be a constant value independent of trajectory position x and drug treatment condition for simplicity. Based on this assumption, when calculating the diffusion constant, we used time-series flow cytometry data of cell sorting and relaxation experiments. In these experiments, we sorted out the untreated cells into NGFR pos and NGFR neg subpopulations. Both sorted subpopulations were cultured without drug treatment. At different days after sorting and culturing, the cells were harvested to quantify its abundance of NGFR and MART-1 using flow cytometry as shown S6 Fig. Consider the fact that variations of proliferation rates are small in the untreated cells, the Fokker-Planck model (Eq M1) was considered valid and this data was used as input to fit the diffusion constant D. Varying D as a free parameter, the (drug-naïve) potential and hence drift were calculated with Eq M2, using the original, untreated distribution as p 1 (x). The Fokker-Planck equation with these parameters was simulated, with the initial condition p(x,0) set by the sorted population distribution. The simulated data were compared to the measured timeseries distributions with an unweighted sum-of-squares measure. This measure was then minimized as a function of D, and yielded the best-fit diffusion constant D to be 0.35 q 2 /day where q represents the unit length on the flow cytometry coordinate.
For the modified Fokker-Planck-type kinetic model described in Eq (3) where the same reaction coordinate (x) as the unmodified equation is applied, the state-dependent proliferation rate α(x) was modeled as piecewise-constant with different values for the melanocytic, neural crest, and mesenchymal cell types. The cutoff locations in terms of the reaction coordinate x were chosen as the two local minima in an observed PDF with the coexistence of all three subpopulations. One can show that the time evolution of the PDF does not depend on an overall constant shift α(x)+c, so we set the proliferation rate of the starting melanocytic state to 0 for convenience, as the melanocytic cells were observed to be cytostatic without significant proliferation or cell death upon drug treatment. This then left the proliferation rates of the neural crest (α 1 ) and mesenchymal cells (α 2 ) as two free parameters.
Because the Gibbs relationship between the long-time density p 1 (x) and the potential U(x) no longer held with this nontrivial proliferation rate, we resorted to fitting a cubic spline interpolation for the drift μ(x) = −@U(x) / @x. Twenty spline points were used, with x values uniformly spaced along the curve and μ values as free parameters.
Starting with an estimate of α 1 (x) = α 2 (x) = 0 and μ~x, we calculated the prediction of this model using FiPy to numerically simulate the forward evolution with initial condition p(x,0) set by the experimentally measured distribution on day-0. To compare with the experimental data, we used the L 2 norm on the difference between the predicted and experimental probabil-   Table. Kinetic RNA-seq data for M397 and M229 cells.