Generating patient-specific virtual tumor populations with reaction-diffusion models and molecular imaging data

The use of mathematical tumor growth models coupled to noisy imaging data has been suggested as a possible component in the push towards precision medicine. We discuss the generation of population-level and patient-specific virtual populations in this context, providing in silico experiments that demonstrate how intra- and inter-patient heterogeneity can be estimated by applying rigorous statistical procedures to noisy molecular imaging data, and how the noise properties of such data can be analyzed to estimate uncertainties in predicted patient outcomes.


Introduction
Broadly speaking, mathematical oncology seeks to develop a collection of in silico models that accurately predict the growth and progression of malignant tumors and their response to treatment [1]. While the basic scientific goal is to develop a rigorous quantitative science of cancer [2], a more readily attainable goal is to use existing models and data to inform clinical practice in some substantive way [3]. One approach to clinical mathematical oncology, which we call mathematical model-based precision medicine, is to use mathematical models to make patient-specific prognostic and therapeutic predictions which are subsequently employed to direct treatment choices and predict patient outcomes. In our view, such predictions must be conditioned on patient data in order for the strategy to qualify as precision medicine [4-6].
Two essential features of oncology are heterogeneity and uncertainty. Heterogeneity can manifest as interpatient (between-patient), intrapatient (within-patient) and intratumor (within-tumor); see [7] for a more complete discussion of the clinical impact of heterogeneity in cancer and [8] for a review of the mathematical strategies to address heterogeneity. From a modeling perspective, heterogeneity implies a statistical approach where model parameters are treated as spatially distributed random variables with both population and patient-specific probability distributions [5]. Heterogeneity also highlights the issue of uncertainty: For a given patient, most if not all model parameters are unknown prior to the acquisition of patient-specific data, and even with such data, uncertainties may remain, since model parameters may only be partially identifiable and the data acquisition system may introduce additional irreducible noise. Image data provides a clear example of this issue, as imaging introduces information loss due to detector noise, low resolution and the possibility of null functions [9]. This uncertainty must be accounted for if model-based predictions are to be used reliably in the clinical context.
Once a relevant mathematical model has been selected, prediction takes place in silico, whereby a simulated patient is produced and Quantities-of-Interest (QoIs) are calculated. In this context, we say that the parameter values and resulting simulations correspond to a virtual patient. Owing to the heterogeneity and uncertainty issues discussed above, in both population and patient-specific applications of mathematical oncology it is desirable to generate suitably randomized Virtual Patient Populations (VPPs). If the VPP represents an untreated group, it is called a Virtual Control Population (VCP) [4,10], while if the VPP represents a treatment group, it is called a Virtual Treatment Population (VTP). With appropriate VPPs, an in silico Virtual Clinical Trial (VCT) can be performed to assess treatment choices, again in either the population or patient-specific case. Population VPPs and VCTs can assist in making general prognostic and treatment recommendations or in simulating a regulatory trial [11], while a patient-specific VPP and/or VCT can be used to assess uncertainty in a particular patient's disease progression or treatment response [6,12]. While population VCTs have found use in device design and regulatory contexts [13], patient-specific VPPs and VCTs have seen limited use.
In previous work [5,6], we have highlighted the application of spatiotemporal random field modeling to address heterogeneity and uncertainty. In this work, we discuss specific VPP-based strategies towards the practical implementation of the framework presented in [5,6]. Specifically, we describe a method for generating VCPs by pairing a spatiotemporal reaction-diffusion equation (RDE) model for avascular, pre-metastatic tumor growth with spatially inhomogeneous random field coefficients. We discuss the RDE model in sections 2.1 and 2.2 and the random field models in section 2.3. Together, these models provide a solution of the 'forward problem' for VPP generation. Two important inverse problems then arise. In the first, we consider in sections 4.1 and 4.2 the problem of estimating the RDE coefficient fields for a specific patient from noisy molecular imaging data; proof-of-concept simulation results are given in sections 5.1 and 5.2. For the second inverse problem, we must estimate the statistical parameters that define the random field coefficient models from a collection of many patients' data, hence solving the statistical calibration problem for virtual populations; we discuss this problem in section 4.3. Both maximum likelihood and Bayesian techniques are presented, and both require an accurate likelihood model connecting model parameters to available noisy data. To this end, in section 3 we discuss statistical models for molecular Emission Computed Tomography (ECT) data. We make this emphasis both because well-validated likelihood models for ECT are available [9], and because it is possible to model a direct connection between physiological processes, such as tumor cell density and spatial growth rate, and certain emission imaging techniques and tracers.
Note that the results given in sections 5.1 and 5.2 are generated in a pure simulation setting, where both ground-truth tumors and image data are simulated. This allows for methodological comparisons with known ground truth and tight control of all nuisance parameters that influence the data. A summary of future work that extends these results is provided in section 6.

Tumor growth modeling with heterogeneity and uncertainty
In this section, we introduce the tumor growth model that we employ for the remainder of the paper. We emphasize that the models we present are simplifications of real tumor biology, which is complex and not yet fully understood. These models are intended to be more 'clinically relevant' [3] than biologically accurate, due to their relative simplicity and amenability to patient-specific inference from spatiotemporal imaging data. In section 2.2 we discuss the parameter-to-solution map, in section 2.3 we discuss random field models for the tumor growth model coefficients, and in section 2.4 we discuss the notions of virtual populations and virtual clinical trials.

Reaction-diffusion tumor growth models
A wide variety of Partial Differential Equation (PDE) models have been employed to model spatiotemporal tumor growth, and the derivation and mathematical analysis of such models is adequately described elsewhere [14-21]. Here, we are interested in computational techniques for generating virtual patient populations, and thus consider a relatively simple PDE model (2.1) that has gained traction in the context of GlioBlastoma Multiforme (GBM). The randomization and statistical calibration techniques we present are independent of the tumor growth model, however, so the choice of Eq (2.1) is largely for illustration purposes.
We denote a spatiotemporal density of tumor cells by n = n(x, t), with x ∈ V being a d-dimensional position vector in the spatial domain V and 0 ≤ t ≤ T, where T is the maximum number of days simulated. The units of n are cells per unit volume, and we take (x, t) to have units of cm and days, respectively. For in vivo tumors d = 3, though for simulation purposes and modeling of in vitro and certain in vivo experimental setups we also consider spatial domains in d = 2 as a reasonable approximation. The general reaction-diffusion equation for n is

∂n/∂t = ∇ ⋅ (D(x)∇n) + G(n), (2.1)

where the spatially varying scalar diffusion coefficient D = D(x) describes the rate of apparent cell diffusion, and the growth function G(n) describes the growth and competition between the tumor and its environment. The boundary condition is taken to be the no-flux (Neumann) condition ν ⋅ ∇n = 0, where ν is the outward boundary normal of the domain V. Note that a treatment function T(n) can be added to Eq (2.1) to model an intervention such as chemotherapy, radiotherapy or surgical resection, but we postpone discussion of such models to a future work.
While in the scalar tumor growth context the Gompertz function G(n) = −ρn ln(n/κ) has gained plausible mechanistic support via the notion of quiescence [22], to our knowledge the spatial RDE growth function is largely a phenomenological choice. The Fisher-KPP growth function is frequently selected in the GBM context, and is the one we will use in this paper:

G(n) = ρ(x) n (1 − n/κ(x)). (2.2)

The growth rate ρ(x) and carrying capacity κ(x) functions, which have respective units of day⁻¹ and cells per unit volume, will be discussed further below.

Patient-specific parameters and the parameter-to-solution map
Note that in Eqs (2.1) and (2.2), several additional parameter functions are required to fully specify the solution to the PDE. These include the diffusion coefficient D(x), the initial cell density n_0(x), the growth rate ρ(x) and the carrying capacity κ(x). These parameter functions are certainly patient-specific, since they correspond to local physiology and nutrient concentrations. For notational simplicity, we will use the symbol β to denote the collection of all parameters necessary to specify a solution to Eq (2.1). In the context of this paper, a virtual patient then corresponds to a particular choice of β, which we will indicate by saying that patient j has parameter β_j. We will assume for the remainder of this article that the initial condition is given by a Gaussian,

n_0(x) = A exp(−|x − x_0|² / (2σ²)), (2.3)

with x_0 = [0.5, 0.5]^T, A = 5/(2πσ²) and σ² = 1e−4; this represents an initial collection of 5 tumor cells, well-localized around the center of our computational domain. The collection of all remaining unknown parameter fields is thus β = (D, ρ, κ). To avoid pathological situations, we also assume the technical condition that β ∈ B, where B is a set of parameter functions selected to guarantee that a positive, bounded solution n(x, t) of Eq (2.1) exists for all β ∈ B and for 0 ≤ t ≤ T. Selecting such a B requires careful technical analysis of Eq (2.1); see [23].
While we assume that each patient corresponds to a unique set of parameters β = β_j, it is likely that β_j is effectively unknown or uncertain for a given patient. To address this, we treat β as a function space-valued random vector in both the population and patient-specific cases, and provide corresponding random field models in section 2.3 below. In a simulation setting, we must choose a computational representation for β_j, which we indicate with a superscript (v) for 'virtual'. Thus, β_j^(v) is a virtual (i.e., computational) sample corresponding to a realization of β.
Assuming Eq (2.1) is classically well-posed for every (or almost every) β ∈ B, we can construct the deterministic parameter-to-solution map

n_j^(model) = ℱ(β_j), (2.4)

where n_j^(model) = n_j^(model)(x, t) solves Eq (2.1) with the parameter vector β = β_j, corresponding to patient j. In a simulation setting, the forward map (2.4) is implemented via a numerical differential equation solver. We have employed a finite difference method-of-lines approach with a five-point spatial stencil and a Runge-Kutta time stepper, as discussed in [24]. The result is an approximate solution to Eq (2.1),

n_j^(v) = ℱ^(v)(β_j^(v)), (2.5)

where the virtual coefficient sample β_j^(v) stands in for β_j. The choice of discretization specifies the format for the virtual cell density n_j^(v); with our finite difference scheme, we can assume that n_j^(v) takes the form of an N_x × N_y × N_t voxel array (for d = 2). The specification of β_j^(v) is discussed below; we will assume a voxel-free form (Eq (2.8) below) that can be evaluated on an arbitrary grid when needed. Note that the virtual cell density will differ both from a patient's true cell density and from a measured cell density by discretization, model and measurement errors; see [8] for a discussion of these errors. A demonstration of our scheme implementing (2.5) in d = 2 is shown in Figure 1.
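To make the discrete forward map (2.5) concrete, the following is a minimal Python sketch of a five-point method-of-lines discretization of Eqs (2.1)-(2.2) with no-flux boundaries and a Runge-Kutta integrator. The paper's implementation is in Matlab, following [24]; the grid spacing, integrator choice and all names here are our own illustrative assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

def forward_map(D, rho, kappa, n0, T=100.0, h=1.0/127):
    """Discrete parameter-to-solution map (2.5): solve
    dn/dt = div(D grad n) + rho*n*(1 - n/kappa) with no-flux boundaries,
    using a five-point method-of-lines discretization and RK45 in time."""
    ny, nx = n0.shape

    def rhs(t, y):
        n = y.reshape(ny, nx)
        # Edge padding creates ghost cells equal to the boundary values,
        # so the flux across the boundary vanishes (Neumann condition).
        npd = np.pad(n, 1, mode='edge')
        Dpd = np.pad(D, 1, mode='edge')
        # Arithmetic-mean diffusivities on the four cell faces.
        De = 0.5 * (Dpd[1:-1, 2:] + D); Dw = 0.5 * (Dpd[1:-1, :-2] + D)
        Dn = 0.5 * (Dpd[2:, 1:-1] + D); Ds = 0.5 * (Dpd[:-2, 1:-1] + D)
        diff = (De * (npd[1:-1, 2:] - n) - Dw * (n - npd[1:-1, :-2])
                + Dn * (npd[2:, 1:-1] - n) - Ds * (n - npd[:-2, 1:-1])) / h**2
        growth = rho * n * (1.0 - n / kappa)   # Fisher-KPP reaction (2.2)
        return (diff + growth).ravel()

    sol = solve_ivp(rhs, (0.0, T), n0.ravel(), method='RK45', t_eval=[T])
    return sol.y[:, -1].reshape(ny, nx)
```

The tumor burden defined in Eq (2.13) below can then be approximated from the returned voxel array by a Riemann sum, e.g., N(T) ≈ h² · n.sum().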
In the next section, we will introduce random field models that allow Eqs (2.4) and (2.5) to be used to generate virtual tumor populations. Note briefly that we have made assumptions that allow us to analyze and simulate Eq (2.1) as a Random PDE (RPDE) model, that is as a classically well-posed PDE whose coefficients are randomized. Note that this is in contrast to a Stochastic PDE (SPDE) model, which requires more advanced analysis and simulation techniques. See e.g., [5] for discussion of a technique based on characteristic functionals (introduced briefly below), and [25][26][27] for further discussion of the distinction between SPDE and RPDE.

Random field models for RDE coefficients
As discussed in the introduction, any mathematical model that is employed in a clinical precision medicine context must account for the possibility of heterogeneity and uncertainty by assuming that model parameters are sampled from a probability distribution. Since the model parameters that specify Eqs (2.1) and (2.2) are spatial functions, we employ the theory of random fields. We will avoid technical discussions, sufficing to say that one can treat β as a function space-valued random vector, written formally as β : Ω → B, where (Ω, ℱ, ℙ) is a probability space and B is the set of functions mentioned previously, selected to guarantee the existence of the forward map (2.4). We assume that B ⊂ X, where X is a Hilbert space with inner product (β, φ)_X, taken to be the L²(V) inner product for vector-valued functions. In the most general setting, the complete statistics of a second-order random process are fully specified by its marginal distributions (through the Kolmogorov construction), its abstract probability distribution ℙ_β, or its characteristic functional [5,9,28,29]:

Ψ_β(φ) = E[exp(i(β, φ)_X)]. (2.6)

Frequently, a process is completely specified by its first few moments, e.g., its mean function β̄(x) = E[β(x)] and its covariance function k(x, x′) = E[(β(x) − β̄(x))(β(x′) − β̄(x′))^T]. While its use requires functional calculus, the characteristic functional (2.6) is typically a compact and efficient way to describe the complete statistics of a random field, and its use in the biological context is a topic of current investigation [5]. Refer to the recent books [27] and [30] for a complete discussion of the theory of Hilbert space-valued random vectors.
In this work, we will simulate random fields by first assuming a convenient functional form for the realizations of the process, then randomizing the corresponding (finitely many) parameters to achieve the desired statistics. This method allows us to guarantee that realizations of β satisfy conditions for Eq (2.1) to be well-posed, and allows for straightforward simulations using standard code packages. In particular, the random field models we use will take the form of a randomized synthesis map, which for a scalar-valued function φ_j(x) takes the form

φ_j^(v) = Φ(θ_j), (2.7)

where Φ(⋅) : ℝ^N → X is a synthesis map that maps a (finite-dimensional) patient-specific parameter θ_j ∈ ℝ^N to the function φ_j^(v)(x). By selecting a probability distribution ℙ(θ_j | θ_p) that depends on the population parameter θ_p ∈ Θ ⊂ ℝ^p, the synthesis map (2.7) produces a random field. In sections 4.1 and 4.2, we discuss ways to estimate a particular θ_j from image data, while in section 4.3 we discuss ways to estimate the population parameter θ_p from a collection of images.
Because the coefficients in the RDE (2.1) must be non-negative and bounded to be physiologically realistic, it is advantageous to begin with a random field model that always produces non-negative, bounded realizations. For our simulations, we employ 'lumpy-type' random field models [31,32], a realization of which has the functional form

φ^(v)(x) = Σ_{l=1}^{L_max} b_l ℓ(x − x_l; γ). (2.8)

If the lump function ℓ(x; γ) and the lump amplitudes b_l are non-negative and bounded, and the number of lumps is finite, then Eq (2.8) is guaranteed to produce non-negative and bounded realizations. For simplicity, we use isotropic unnormalized Gaussian lumps:

ℓ(x; γ) = exp(−|x|² / (2σ²)), γ = σ². (2.9)

Random fields with this choice of lump function were proposed to model soft tissues with smoothly varying characteristics [31], while more sophisticated choices of ℓ(x), such as the clustered lump technique [32], can model more complex tissues such as breast or brain tissue. Ultimately, the choice of ℓ(x) and its shape parameter γ requires a statistical calibration step such as that discussed in section 4.3.
We randomize Eq (2.8) as follows. We first select a lump shape parameter γ = σ² = σ_0², a constant lump amplitude b_0, and a mean number of lumps L̄ ≪ L_max. We then draw L ∼ Poi(L̄), set b_l = b_0 for 1 ≤ l ≤ L and b_l = 0 otherwise, and draw the lump centers x_1, …, x_L i.i.d. uniform in the domain V to form the sum (2.8). Note that the resulting random field realization can be written in terms of a nonlinear synthesis map (2.7) with N = (d + 1)L_max.
We use the notation LB(L̄, b_0, σ_0²) to indicate a lumpy-type field with mean number of lumps L̄, amplitude b_0 and isotropic lump variance σ_0². A demonstration of the behavior of lumpy-type random fields of the form given in Eq (2.8) is shown in Figure 1, top row.
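A minimal Python sketch of this sampling procedure follows; the parameter values mirror the ρ field used in section 5, while the function name, grid and seed are our own illustrative choices.

```python
import numpy as np

def sample_lumpy(Lbar, b0, sig2, X, Y, rng):
    """Draw one realization of LB(Lbar, b0, sig2): Eq (2.8) with
    isotropic Gaussian lumps (2.9), a Poisson-distributed lump count
    and i.i.d. uniform lump centers on the unit square."""
    L = rng.poisson(Lbar)                          # L ~ Poi(Lbar)
    centers = rng.uniform(0.0, 1.0, size=(L, 2))   # x_l ~ Unif(V)
    field = np.zeros_like(X)
    for cx, cy in centers:
        field += b0 * np.exp(-((X - cx)**2 + (Y - cy)**2) / (2.0 * sig2))
    return field

rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 128)
X, Y = np.meshgrid(xs, xs)
rho = sample_lumpy(200, 0.25, 0.002, X, Y, rng)    # rho ~ LB(200, 0.25, 0.002)
```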
We reiterate that a single realization of the random field specified in Eq (2.8) is fully specified by the parameters σ_0², b_0 and the lump center list [x_l], while the complete statistics of the random field are fully specified by L̄, σ_0² and b_0. This distinction is important when we discuss parameter estimation problems in sections 4.1-4.3.

Virtual populations and quantitative biomarkers
Suppose that q ∈ ℝ^n is a low-dimensional vector quantifying patient status or prognosis, for instance (informally) q = [metric of tumor burden, metric of normal tissue function]. In the model-based precision medicine context, such a vector is generally called a quantitative biomarker [6,33,34]. The purpose of a virtual patient population, and more generally a virtual clinical trial, is to predict the distribution of q for either a control or treatment group. We will assume for convenience that q ∈ ℝ is a scalar functional of β and n:

q = ℳ(β, n). (2.10)

Given a virtual population of simulated parameters and tumors, (β_1^(v), n_1^(v)), …, (β_J^(v), n_J^(v)), we estimate the biomarker q for each virtual patient using a method ℳ^(v), resulting in a vector

Q^(v) = [q_1^(v), …, q_J^(v)], q_j^(v) = ℳ^(v)(β_j^(v), n_j^(v)). (2.11)

If Q^(v) is generated from a population distribution, that is, the random vector β models interpatient heterogeneity, then we say that Q^(v) is a population virtual biomarker sample. If β models patient-specific uncertainties, for instance if β is sampled from the conditional random vector β_j|g_j where g_j is some patient-specific data, then we say that Q^(v) is a patient-specific virtual biomarker sample. Statistical properties of q such as its mean, standard deviation and distribution can be estimated from Q^(v). By generating virtual control and treatment groups, Eq (2.11) can be used to simulate a clinical trial. If the VPP employed in Eq (2.11) represents a treatment group, decisions regarding the intervention can be made based on these statistics. For instance, a rigorous decision-theoretic approach to patient-specific treatment selection would define an additional function U(q) which measures the utility of achieving the biomarker q(τ) via the intervention τ. Since q is a random vector, the typical approach is to maximize the expected utility, that is, we seek the intervention τ* which solves

τ* = argmax_τ E[U(q(τ))]. (2.12)

While numerous patient-specific therapy optimization schemes have been suggested [35], to our knowledge a scheme based on Eq (2.12) has not seen practical implementation in this context. The VPP methodologies presented here may prove applicable to a future application of Eq (2.12), since the expected value in Eq (2.12) can be approximated via Eq (2.11).
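As a toy illustration of the decision rule (2.12), the sketch below selects among candidate interventions by maximizing a Monte Carlo estimate of expected utility over virtual biomarker samples of the form (2.11); the utility function, the sample distributions and the treatment labels are all invented for illustration.

```python
import numpy as np

def expected_utility(Q_v, U):
    """Approximate E[U(q(tau))] in Eq (2.12) by the sample mean of the
    utility over a virtual biomarker sample Q^(v), Eq (2.11)."""
    return np.mean([U(q) for q in Q_v])

rng = np.random.default_rng(1)
U = lambda q: -np.log10(1.0 + q)   # hypothetical utility: penalize tumor burden
# Hypothetical virtual biomarker samples, one VPP per candidate intervention.
Q_by_treatment = {'tau_A': rng.lognormal(18.0, 1.0, 64),
                  'tau_B': rng.lognormal(17.0, 2.0, 64)}
tau_star = max(Q_by_treatment,
               key=lambda tau: expected_utility(Q_by_treatment[tau], U))
```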
An example of a quantity-of-interest that we will use to illustrate our methods is the tumor burden, which is the total number of tumor cells as calculated from n(x, t):

N(t) = ∫_V n(x, t) dx. (2.13)

If the goal is to predict N(t) for some t in the future, a VPP allows one to assess the uncertainty in this value as a result of uncertainty in the patient-specific parameter β_j being 'propagated' through the models ℱ and ℳ [25,30]. An illustration of uncertainty propagation is given in Figure 2, where several choices of the population parameter θ_p are made and the resulting (estimated) probability distribution of N(100) is shown.
Further examples are given in section 5 below. Another possible QoI, the integrated log-kill, is similar to Eq (2.13) but with the integrand n(x, t) replaced by ln(n(x, t)/n_0(x)). This metric is derived from a mass-action drug effect and is applicable in chemotherapy evaluation [5]. Note that we do not discuss the important topic of exactly which scalar QoIs have real clinical impact in terms of patient outcomes; refer to e.g., [33,34].

Molecular emission imaging data
In a pure simulation setting, a virtual biomarker sample of the form shown in Eq (2.11) can be computed without reference to any particular patient-specific data, so long as the corresponding probability distributions are well-calibrated. However, the model-based precision medicine context requires that patient-specific data be used to create individualized VPPs, hence making uncertainty-informed decisions about patient j possible. We will focus on the case of molecular Emission Computed Tomography (ECT) data, for reasons discussed below.

Preclinical and clinical imaging modalities
In the clinic, image data typically takes the form of reconstructed tomographic imaging studies such as MRI, CT, or 18F-FDG Positron Emission Tomography (PET). Note that while it can be difficult to precisely relate these standard tomographic data to either n_j or β_j, it is nonetheless possible to postulate such relationships and thereby extract useful information from these images. An example of such an approach is the usage of Apparent Diffusion Coefficient (ADC) maps, generated using Diffusion-Weighted MRI (DWMRI), to estimate n_j through the application of a model which relates tissue cellularity to ADC [3,36,37]. The basic difficulty in such efforts is that malignancy is not directly measured by any of these standard techniques: for example, ADC maps estimate the apparent diffusion coefficient of water molecules and CT measures X-ray attenuation, neither of which is uniquely altered by the presence of a malignant tumor. While increased 18F-FDG uptake implies increased cellular metabolism along the glycolytic pathway, a byproduct of malignant growth, this effect is not unique to malignant cells, since benign tumors can also display an increased metabolic rate [38].
In this article we consider a wide class of imaging techniques known as Emission Computed Tomography (ECT), broadly defined as any imaging modality (such as PET) which detects a molecular tracer distribution via optical or nuclear detection techniques. We choose to focus on ECT data because the wide variety of available tracers allows for more direct modeling relating these data to the tumor cell density n_j and the RDE coefficient fields β_j. As an example, genetically engineered cancer cells can be made to express Green Fluorescent Protein (GFP), which acts as a photon activity distribution when excited [39], and thus ECT, typically in the form of intravital microscopy (e.g., window chambers [40]), can be used to image n_j directly in vivo, at least in the small animal setting. Meanwhile, the growth factor ρ_j can potentially be measured using the PET tracer 18F-FLT, which is a radioactive version of the molecule thymidine used in DNA synthesis [41,42]. Ex vivo, the Ki-67 immunohistochemistry stain would typically be employed to measure proliferation, but this requires either a biopsy or sacrifice of the animal. Ongoing work in small animal imaging aims to produce in vivo imaging agents for common immunohistochemical targets, but to our knowledge an in vivo Ki-67 analog is not available. The other advantages of ECT are the possibility to perform polyscopic imaging (that is, measurement of several distinct physiological processes simultaneously) [43], and the wealth of literature on the statistics of raw and reconstructed ECT data [9].
To employ the statistical estimation strategies discussed in section 4, we require statistical models for ECT data. We emphasize the case of raw, un-reconstructed ECT data because its statistics are better understood: most reconstructed images produced in the clinic are generated by a proprietary 'black box' method provided by the imaging system manufacturer. Our framework can still be applied to such data by carefully modifying the ℋ operator defined in Eq (3.3) to account for the reconstruction step; however, it can be difficult to calibrate an appropriate noise model for likelihood-based estimation. If the MLEM algorithm is known to have been employed for reconstruction, see [44] for a discussion of the resulting noise characteristics.

Statistics of ECT data
In an ECT system, a chemical tracer is engineered to emit some measurable particle, such as photons, charged particles or nuclear decay products [45]. A tracer corresponds to a particle activity distribution f j (x, t), having units of particles emitted per unit volume per unit time.
For the case of GFP imaging discussed above, a reasonable model is

f_j(x, t) = A n_j(x, t),

that is, the photon activity is proportional to the actual tumor cell density, with the constant of proportionality A related to the photon yield per cell (i.e., the number of photons detected, on average, per cell). Similarly, we can assume f_j(18F-FLT) ∝ ρ_j, the growth factor. The energy produced by f_j(x, t) propagates through the tissue medium and is detected by specialized ECT imaging hardware, which can range from optical CCD and CMOS detectors to scintillation-type detectors for high-energy particles. The statistics of raw ECT data are discussed at length elsewhere [9,46], so we summarize by noting that two data formats are commonly available, known as binned-mode data and particle-processed data. Because the presentation of particle processing requires a more technical discussion of imaging detectors, we only consider binned-mode data in this work; see e.g., [45-47] for a discussion of particle processing. In a future work we will address the usage of particle-processing data in mathematical oncology.
Binned-mode ECT data arises when detected particles are sorted into spatial bins corresponding to a discretization of the detector. These spatial bins need not correspond to physical detector pixels, which many ECT imaging detectors do not have. The physics-based statistics of particle counting nearly always lead to a Poisson data model for particle count data, which can be written in terms of a noisy linear functional of the activity f j as follows.
Suppose that the imaging system has M detector bins, with data for patient j labeled g_j1, …, g_jM. The statistical model for binned-mode ECT data is that g_j is a Poisson random vector with mean

ḡ_jm = ∫_0^T ∫_V h_m(x, t) f_j(x, t) dx dt, m = 1, …, M, (3.1)

where h_m(x, t) is called the m-th sensitivity function of the imaging system (assumed to be fixed for all patients), f_j(x, t) is the particle activity distribution corresponding to the tracer administered to patient j, and T is the total imaging time. For tomographic systems, h_m(x, t) must account both for the propagation within the patient, typically following a Radiative Transport Equation (RTE), and for the geometry and blur characteristics of the imaging detector [6,9]. For this work, we will consider a simplified imaging geometry, intended to model intravital microscopy techniques such as mouse window chambers, where optical and/or gamma-ray images are collected with a planar detector above a pseudo-2D activity distribution. To a reasonable approximation, the sensitivity functions are then given by a Gaussian blur,

h_m(x) ∝ exp(−|x − r_m|² / (2σ_blur²)), (3.2)

where r_m is the center of the m-th detector bin. This blur accounts both for imaging elements such as lenses and collimators and for position estimation blur [9,46]. More sophisticated imaging system models, for instance 3D tomographic systems, will be discussed in a future work (see section 6).
Expressing Eq (3.1) as a linear continuous-to-discrete operator ℋ : X → ℝ^M allows us to write a binned-mode system model succinctly as

g_j = ḡ_j + η_j = ℋf_j + η_j, (3.3)

where ḡ_j = [ḡ_j1, …, ḡ_jM]^T = ℋf_j ∈ ℝ^M is defined via Eq (3.1) and η_j = g_j − ḡ_j is the (zero-mean) Poisson noise. Considering f_j as a parameter, the probability distribution for the count vector g_j is thus

P(g_j | f_j) = ∏_{m=1}^M (ḡ_jm)^{g_jm} exp(−ḡ_jm) / g_jm!. (3.4)

When f_j is parameterized by a synthesis map, f_j = Φ(θ_j), we write the resulting mean data as

ḡ_j = H(θ_j) = ℋΦ(θ_j), (3.5)

with the corresponding likelihood

ℒ(θ_j | g_j) = P(g_j | Φ(θ_j)). (3.6)

We will make use of Eqs (3.4)-(3.6) for likelihood-based estimation methods in section 4 below.
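For the simplified planar geometry above, a minimal Python sketch of the binned-mode data model (3.1)-(3.4) is given below; treating ℋ as a pixelwise Gaussian blur (scipy's gaussian_filter) with an overall exposure scale is our own illustrative simplification, and the toy activity is invented.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def binned_ect_data(f, sigma_blur_pix, exposure, rng):
    """Simulate binned-mode ECT data: the mean image g_bar = Hf is a
    Gaussian-blurred, exposure-scaled copy of the activity f (Eqs (3.1)-(3.2)
    for the planar geometry), and the counts are Poisson, Eq (3.4)."""
    g_bar = exposure * gaussian_filter(f, sigma_blur_pix)  # g_bar = Hf
    g = rng.poisson(g_bar)                                 # g ~ Poi(g_bar)
    return g, g_bar

rng = np.random.default_rng(2)
f = np.zeros((128, 128)); f[60:68, 60:68] = 1e3   # toy activity distribution
g, g_bar = binned_ect_data(f, sigma_blur_pix=2.7, exposure=1.0, rng=rng)
```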

Statistical estimation and inversion techniques
Assume that for patient j, ECT data g_j ∈ ℝ^M has been collected, where g_j can be either monoscopic (a single imaging study) or polyscopic (consisting of multiple imaging studies, corresponding to multiple tracers). In the previous section, we provided statistical models for g_j that relate it to the activity distribution f_j via Eq (3.3). As discussed there, ECT tracers allow f_j to be directly related to n_j and ρ_j. In this section, we discuss statistical techniques for estimating f_j (hence n_j and ρ_j) from g_j. In section 4.1, we discuss point estimation techniques for an individual patient, where a single estimate of f_j is produced; in section 4.2 we discuss patient-specific Bayesian strategies, where the solution consists of samples from a posterior distribution. In section 4.3, we discuss methods for estimating a population parameter θ_p from a database G = [g_1, …, g_J] of images collected from J patients.

Patient-specific maximum likelihood parameter estimation
Given data g_j ∈ ℝ^M for patient j, we consider the estimation of the underlying activity distribution f_j using the imaging system model g_j = ℋf_j + η_j, given in Eq (3.3). To approach this infinite-dimensional inverse problem, we select a finite-dimensional parameterization f_j = Φ(θ_j), where θ_j ∈ ℝ^N, and consider the problem of estimating θ_j from g_j, that is, solving the inverse problem g_j = H(θ_j) + η_j. Note that moving from f_j to θ_j might introduce an approximation error, owing to the fact that f_j need not be expressible in the same form as the synthesis map used for the reconstruction; in ECT imaging, f_j is typically quite smooth, so this approximation error is typically insignificant compared to the influence of detector resolution and noise. While typical image reconstruction algorithms assume a voxel-based parameterization of f_j, we elect to use a lumpy-type synthesis of the form given in Eq (2.8).
As discussed, the activity distribution can correspond to the cell density n_j or the growth factor ρ_j(x). We denote a generic estimate of θ_j produced from g_j by a hat, i.e., θ̂_j is an estimate of θ_j produced from the data g_j. While a wide array of strategies exist to solve this inverse problem, we consider only likelihood-based methodologies.
Recall from section 3.2 that the probability distribution for binned-mode ECT data is Poisson. For an activity distribution f_j parameterized using the synthesis map Φ, i.e., f_j = f_j(θ_j), we thus have a likelihood function

ℒ(θ_j | g_j) = P(g_j | θ_j) = P(g_j | f_j(θ_j)) = P(g_j | Φ(θ_j)) = ∏_{m=1}^M (H_m(θ_j))^{g_jm} exp(−H_m(θ_j)) / g_jm!, (4.1)

where P(g_j | f_j) is the Poisson distribution given in Eq (3.4), and H_m(θ_j) are the components of the function defined in Eq (3.5). Note the abuse of notation in Eq (4.1): the two functions P(g_j | θ_j) and P(g_j | f_j) have different parameter types, but we take Eq (4.1) to define the former in terms of the latter. Taking its logarithm (and ignoring the θ_j-independent constant) results in the log-likelihood function

ℓ(θ_j | g_j) = ln(ℒ(θ_j | g_j)) = Σ_{m=1}^M [g_jm ln H_m(θ_j) − H_m(θ_j)]. (4.2)

The Maximum Likelihood Estimate (MLE) is then defined as

θ̂_j^(ML) = argmax_{θ_j ∈ Θ} ℓ(θ_j | g_j). (4.3)

For linear imaging models, where H(θ_j) = Hθ_j for a matrix H, Eq (4.3) can be computed with the MLEM fixed-point iteration

θ_jn^(k+1) = (θ_jn^(k) / Σ_m H_mn) Σ_m H_mn g_jm / (Hθ_j^(k))_m, n = 1, …, N. (4.4)

We will employ the method (4.4) to compute some MLEs for linear imaging models in section 5, while for nonlinear models we use the Matlab algorithm fmincon, which is based on an interior point method with BFGS quasi-Newton steps [50]. A demonstration of algorithm (4.4) applied to a reconstruction of n_j(x, t) from image data of the form (3.1) is shown in Figure 3. Note that the reconstruction displays visual artifacts due to the function H(θ) having a nontrivial null space. We have found that these artifacts have minimal impact on the task at hand, which is to predict the progression of n(x, t) and N(t). Future work will analyze in more detail the question of task-based image quality, as defined in [9,51], in the mathematical oncology context.
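A compact Python sketch of the MLEM iteration (4.4) for a generic nonnegative system matrix follows; the dense matrix storage of ℋ and the fixed iteration count are illustrative simplifications of our own.

```python
import numpy as np

def mlem(H, g, n_iter=1000, eps=1e-12):
    """MLEM fixed-point iteration (4.4) for the linear Poisson model
    g ~ Poi(H f), with H an (M x N) nonnegative system matrix. Each update
    preserves nonnegativity and does not decrease the likelihood (4.2)."""
    M, N = H.shape
    s = H.sum(axis=0)                               # sensitivities s_n = sum_m H_mn
    f = np.full(N, g.sum() / max(s.sum(), eps))     # flat, count-preserving start
    for _ in range(n_iter):
        g_bar = H @ f                               # current mean data, Eq (3.5)
        f *= (H.T @ (g / np.maximum(g_bar, eps))) / np.maximum(s, eps)
    return f
```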
We note briefly that under certain technical conditions, an MLE is asymptotically consistent and efficient, meaning that for well-specified likelihood models, the MLE converges to the true parameter and attains the Cramér-Rao lower bound. Asymptotically, the distribution of the estimate θ̂_j^(ML) (as a random variable depending on the data g_j) converges to a normal with covariance given by the inverse of the Fisher information [9]:

√M (θ̂_j^(ML) − θ_j) → N(0, F^{−1}), F_{nn′} = −E_{g_j}[∂²ℓ(θ_j | g_j) / ∂θ_jn ∂θ_jn′]. (4.5)

The Fisher information matrix F = F(θ_j) can be used to design more efficient measurement systems by 'tuning' the design parameters of the system to modify the likelihood in a desirable manner [52]. Note also that the Fisher information matrix can be evaluated 'offline', by computing the expected values in Eq (4.5), which allows one to analyze the uncertainty in the MLE in a frequentist setting. While F depends on the true θ_j, for many ECT systems this dependence is minimal (i.e., the Fisher information matrix is nearly constant with respect to θ_j), which allows one to compute a single (system-dependent) F to be used for all MLEs derived from the system. One can also estimate confidence intervals for scalar quantities of interest derived from estimates of θ_j using transformation rules for the Fisher information matrix [53].
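For the Poisson log-likelihood (4.2), the expectation in Eq (4.5) can be evaluated in closed form; the short derivation below is standard for ECT systems [9] and is included for completeness.

```latex
% With \bar{g}_m(\theta) = H_m(\theta) and
% \ell(\theta \mid g) = \sum_m \left[ g_m \ln \bar{g}_m(\theta) - \bar{g}_m(\theta) \right],
% differentiating twice and using E[g_m] = \bar{g}_m(\theta) gives
F_{nn'}(\theta)
  = -\mathbb{E}_{g}\!\left[
      \frac{\partial^2 \ell(\theta \mid g)}{\partial \theta_n \, \partial \theta_{n'}}
    \right]
  = \sum_{m=1}^{M} \frac{1}{\bar{g}_m(\theta)}
      \frac{\partial \bar{g}_m}{\partial \theta_n}
      \frac{\partial \bar{g}_m}{\partial \theta_{n'}} .
```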
To use the MLE procedure to generate a patient-specific virtual population, we assume that g_j = ℋf_j + η_j, where either f_j = n_j(x, t_0), f_j = ρ_j, or f_j = [n_j(x, t_0), ρ_j(x)]^T; that is, we image either the cell density at a single time point, or the growth factor, or both. From this data, we take the following steps:

i. Use MLE to obtain θ̂_j^(n) (and θ̂_j^(ρ), if measured), from which we synthesize n̂_j (and ρ̂_j, if measured).

ii. Sample the unmeasured coefficient fields from their population random field models, producing D_j^(1), …, D_j^(J) and κ_j^(1), …, κ_j^(J) (and ρ_j^(1), …, ρ_j^(J), if ρ was not measured).

iii. Generate a patient-specific virtual population by using n̂_j as the initial condition, either ρ̂_j or ρ_j^(1), …, ρ_j^(J) for the growth factor, and D_j^(1), …, D_j^(J) and κ_j^(1), …, κ_j^(J) for the diffusion coefficient and carrying capacity, and solving the forward problem (2.5) for each sample.

iv. With the resulting VPP, compute any desired quantitative biomarkers as in Eq (2.11).
Note that in the above procedure, we make no attempt to estimate D and κ by performing the nonlinear parameter estimation task implied by Eq (2.4), nor do we attempt to estimate n(x, t) for any t < t_0. From single-time-point data such as this, an identifiability (i.e., uniqueness) issue arises that prevents such estimation from succeeding. This procedure is used in sections 5.1 and 5.2 to produce virtual populations.
Note that one issue not addressed in the above technique is the uncertainty inherent in the noisy data g_j: since g_j is random, the MLE is also random. While Fisher information analysis can be employed to analyze this uncertainty in an offline frequentist setting, it does not provide a clear method for producing alternatives to θ̂_j, since one only has access to a single realization of g_j. The Bayesian methodology discussed below provides an avenue towards producing many plausible estimates from a single realization of g_j.

Patient-specific Bayesian solution
In a patient-specific Bayesian solution to the inverse problem g_j = ℋf_j + η_j, we assume that f_j was sampled from a prior distribution ℙ_0(f). Note that ℙ_0(f) need not correspond to a well-calibrated model of the true population in order to proceed with the inference, but the method is typically much more reliable if it does. Then, since g_j is statistically coupled to f_j through Eq (3.3), we can consider the conditional random vector g_j|f_j, whose statistics are described by the likelihood function ℒ(f_j | g_j) = ℙ(g_j | f_j) as discussed previously. A Bayesian inverse solution to the inference problem is to consider the posterior random vector f_j|g_j instead of a single point estimate f̂_j. Samples from the posterior can be used to generate a virtual population as described in the procedure above, where now n̂_j and ρ̂_j are replaced by samples from their corresponding posteriors to account for uncertainty in the data g_j.
Bayesian personalization of tumor growth has been considered in [21,54], for example. The former considers an RDE similar to Eq (2.1) but requires two segmented MRI images to perform the personalization. The latter generates ensemble members (i.e., virtual patients) by using fixed RDE coefficients (β in our notation) and imposing an ad hoc additive Gaussian noise model on the system evolution, in order to leverage the ensemble Kalman filter technique.
The distribution of the posterior f_j|g_j is given via Bayes' rule, which states informally that ℙ_post(f_j | g_j) ∝ ℙ_0(f_j)ℒ(f_j | g_j). More rigorously [55], we would say that the posterior is absolutely continuous with respect to the prior, with Radon-Nikodym derivative

dℙ_post/dℙ_0 ∝ ℒ(f_j | g_j).

Practically speaking, in this paper we consider (via Eqs (2.7) and (2.8)) finite-dimensional parameterizations of f, so that when coupled with binned-mode ECT data, the result is a finite-dimensional Bayesian inverse problem [56]. Taking as before a lumpy-type parameterization f_j = Φ(θ_j), and with the mean data described by the nonlinear function ḡ_j = H(θ_j) = ℋΦ(θ_j) defined in section 3.2, we now assign a prior on the vector θ_j, which we assume has PDF p_0(θ_j). The Poisson likelihood ℒ(θ_j | g_j) is given in Eq (3.6), leading to a description of the posterior as

p(θ_j | g_j) ∝ p_0(θ_j)ℒ(θ_j | g_j). (4.6)

If either the prior is infinite-dimensional or the ECT data is assumed to be of particle-processing form, a fully infinite-dimensional Bayesian approach must be considered [55].
To draw samples from the posterior (4.6), we apply a Markov Chain Monte Carlo (MCMC) algorithm based on the one presented in [57]. First, we assume for each tracer f_j an expansion of the form Eq (2.8), so that f_j = Φ(θ_j) with θ_j = [b_1, …, b_{L_max}, x_1, …, x_{L_max}, γ_1, …, γ_{L_max}]. Note that in some cases, we reduce the computational burden by fixing the x_l and/or γ_l and only sampling the remaining parameters. The MCMC procedure is then as follows:

i. Given the current sample θ_j^(n), we propose a new sample θ_j′ ∼ π(θ_j′ | θ_j^(n)) by randomly selecting L_mcmc lumps and perturbing their amplitudes b_l, locations x_l and shapes γ_l by a Gaussian with diagonal covariance Σ. This proposal is symmetric.

ii. From θ_j′, the activity f_j′ is synthesized using Eq (2.8) and the proposed mean image ḡ′ = ℋf_j′ is computed using Eq (3.5).

iii. The proposal is accepted, θ_j^(n+1) = θ_j′, with probability min{1, p(θ_j′ | g_j)/p(θ_j^(n) | g_j)}; otherwise θ_j^(n+1) = θ_j^(n).

By construction of the Metropolis-Hastings algorithm, the Markov chain samples θ_j^(1), θ_j^(2), … have stationary distribution given by the posterior p(θ_j | g_j), and can thus be used to synthesize virtual patients for the calculation of any quantitative biomarkers as discussed in section 2.4. An example of this approach is discussed in section 5.1. Note that the standard Metropolis-Hastings algorithm performs best when θ has relatively small dimension; various modifications for high-dimensional problems have been proposed [25,30,55].
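A minimal Python sketch of the accept/reject loop follows. Unlike the procedure above, it perturbs all components of θ at each step rather than a random subset of L_mcmc lumps, and log_post stands for the log of the unnormalized posterior (4.6); both simplifications, and all names, are our own.

```python
import numpy as np

def metropolis(log_post, theta0, step, n_samples, rng):
    """Random-walk Metropolis sampler for the posterior (4.6).
    The Gaussian proposal is symmetric, so the acceptance probability
    reduces to min(1, posterior ratio), computed in log space."""
    theta, lp = theta0.copy(), log_post(theta0)
    chain, n_accept = [], 0
    for _ in range(n_samples):
        prop = theta + step * rng.standard_normal(theta.shape)  # step i
        lp_prop = log_post(prop)                                # step ii
        if np.log(rng.uniform()) < lp_prop - lp:                # step iii
            theta, lp, n_accept = prop, lp_prop, n_accept + 1
        chain.append(theta.copy())
    return np.array(chain), n_accept / n_samples
```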

Calibration of population distributions
As illustrated in Figure 2, the population parameter θ_p which specifies the random field models used to generate the virtual population has a substantial influence on the statistics of any biomarkers calculated using Eq (2.11). The question of how to estimate θ_p from data thus arises. Such calibration problems in the context of VCTs have been discussed in [58,59], but their methodology accounts only for ODE models arising in systems pharmacology and does not directly apply to our context of PDE tumor growth models, random fields and imaging data. In this section, we discuss briefly a technique to estimate the population parameters which specify a lumpy-type model (2.8) from a database of ECT images G = [g_1, …, g_J].
Suppose we assume a lumpy-type model for each of the RDE coefficient fields β = [D, ρ, κ], say each of the form given in Eq (2.8) with ℓ(x) given by a Gaussian (2.9). While the lump function ℓ(x) can itself be the target of calibration (see [60], for instance), we assume Eq (2.9) for simplicity. The statistics of the random field LB(L̄, b_0, σ²) are thus fully specified by the three scalars L̄, b_0 and σ², which we write as θ_p = [L̄, b_0, σ²] ∈ Θ = (0, ∞)³. The problem of calibrating a population distribution such as this is to take a database of noisy image data G = [g_1, …, g_J] for a population of J patients and produce an estimate θ̂_p = [L̂, b̂, σ̂²]. We can again apply either the MLE or the Bayesian approach discussed in sections 4.1 and 4.2. To develop a likelihood model for θ_p, we consider first the generic case where g_j consists of direct images of some activity f_j, for instance f_j ∝ ρ_j as discussed in section 5.1. This case corresponds to that considered in [61]. Assuming inter-subject independence, we have

ℒ(θ_p | G) = p(G | θ_p) = ∏_{j=1}^J P(g_j | θ_p). (4.7)

Using the law of total probability and the fact that the distribution of the image data given the object is independent of the object statistics, we can write

P(g_j | θ_p) = E_{f_j | θ_p}[P(g_j | f_j)]. (4.8)

In (binned-mode) ECT imaging, the probability distribution P(g_j | f_j) is given by the Poisson (3.4), which depends on the imaging operator ℋ. The expression (4.8) states that the probability of observing a given image g_j if the population parameter is θ_p is the probability of observing g_j given the object f_j, averaged over all possible random field realizations f_j with statistics specified by the parameter θ_p. The expectation in Eq (4.8) is ostensibly infinite-dimensional (since the f_j are functions), though it can be approximated using Monte Carlo:

P(g_j | θ_p) ≈ (1/K) Σ_{k=1}^K P(g_j | f^(k)), f^(k) ∼ ℙ(f | θ_p) i.i.d. (4.9)

Equation (4.9) specifies a computational method for approximating the likelihood (4.7): for fixed θ_p, compute Eq (4.9) for each image in the database, then compute the product (or its logarithm) in Eq (4.7). The MLE of θ_p is then defined as before in Eq (4.3), with θ_p ∈ Θ_p, where Θ_p is the set of admissible population parameters. Note that the estimate (4.9) can be computed in parallel using graphics processors, which drastically reduces the time required for each likelihood calculation, previously a bottleneck in the application of a traditional MLE. Despite this, the MLE optimization problem (4.3) applied to the likelihood (4.7) poses a challenge for standard numerical optimization routines due to scaling and non-convexity issues. See section 6 for a brief discussion of future work aimed at addressing these issues.
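A Python sketch of the Monte Carlo estimate (4.9), written for a single image g and returning its log-likelihood contribution to (4.7), is given below; the two callables (one lumpy-field sampler, one imaging operator) are hypothetical stand-ins, and the log-mean-exp stabilization is our own implementation choice.

```python
import numpy as np
from scipy.special import gammaln

def mc_log_likelihood(g, theta_p, sample_field, apply_H, K, rng, eps=1e-12):
    """Monte Carlo approximation (4.9) of ln P(g | theta_p): average the
    Poisson data likelihood (3.4) over K realizations f^(k) ~ P(f | theta_p).
    sample_field and apply_H are hypothetical callables returning one lumpy
    realization and the mean data g_bar = Hf, respectively."""
    log_terms = np.empty(K)
    for k in range(K):
        g_bar = np.maximum(apply_H(sample_field(theta_p, rng)), eps)
        # log of the Poisson PMF (3.4), summed over detector bins
        log_terms[k] = np.sum(g * np.log(g_bar) - g_bar - gammaln(g + 1.0))
    m = log_terms.max()                     # log-mean-exp for stability
    return m + np.log(np.mean(np.exp(log_terms - m)))
```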

Results
In this section, we discuss two in silico experiments that illustrate the methodologies discussed above. For both, a common 'ground truth' cell density path n(x, t) is generated by sampling a random field for each coefficient and simulating until t = T = 365 days. In the first experiment, we use (simulated) ECT data for f_j(x) = n(x, 100) to compute both an MLE and a collection of posterior samples for the tumor cell density at t = t_0 = 100 days. Then, a virtual population is generated by using the resulting n̂_j (or posterior samples) as the initial condition in Eq (2.1), randomizing the remaining coefficients and simulating until t = T. The resulting populations are then compared to the 'true' n(x, t), using the tumor burden N(t) defined in Eq (2.13) as the quantity-of-interest. In the second experiment, we use ECT data and MLE for both the growth factor ρ(x) and the cell density n(x, t_0), again generating a VPP by using the estimated cell density as the initial condition n_0(x), setting ρ_j = ρ̂_j, and then randomizing both D and κ.
For both experiments, we use the following random fields, both to generate the initial 'true' tumor path n(x, t) and to randomize unmeasured coefficients. We assume D ~ LB(20, 1e−7, 0.04), ρ ~ LB(200, 0.25, 0.002), and κ ~ LB(100, 5e7, 0.1). These parameters were selected to achieve an average tumor size of approximately 10^8 cells after the first year; obviously a calibration procedure such as the one outlined in section 4.3 would be required to estimate population parameters for a real population of tumors. The exact sample used is shown in Figure 4.

Experiment 1: Maximum likelihood estimation and posterior sampling of cell density n
In this experiment, we suppose that g_j^(n) = ℋ^(n)n_j + η_j^(n) consists of ECT measurements of the cell density n_j(x, t_0), with t_0 = 100 days, and that no other measurements are available. We assume that the experimental setup is confined to a thin slab so that the 2D imaging geometry described in section 3.2 is appropriate, and that the imaging system corresponds to Eq (3.1) with blur kernel given by Eq (3.2). The blur is taken to be σ_blur = 0.0212, which corresponds to an imaging resolution (Full-Width-Half-Max, FWHM) of 500 μm, and we assume that the photon yield is A = 1e−3 (detected photons per cell). Once the mean image data ḡ_j = ℋf_j is computed using Eq (3.1), a randomized image is produced using Matlab's poissrnd.
To perform the image reconstruction for n_j(x, t_0), we apply the linear MLEM algorithm (4.4) to obtain θ̂_j^(n), from which we synthesize the estimate n̂_j(x, t_0) using Eq (2.8). The results of this reconstruction are shown in Figure 5.
To demonstrate the usage of this estimate to generate a virtual tumor population as discussed in section 4.1, we use the resulting n j (x, t 0 ) as the initial condition n 0 (x) in Eq (2.1), and randomize the remaining coefficient fields using the lumpy-type random field models described in the introduction to this section to produce a virtual population of size J = 64. In Figure 6, the quantity-of-interest N(t) is shown for the resulting virtual population, while in Figure 7, a collection of 10 virtual tumors is shown for t = 150 (i.e., 50 days post-imaging).
To demonstrate the usage of the Bayesian methods discussed in section 4.2, we generated samples from the posterior n_j|g_j, denoted n_j^(1), n_j^(2), …, to use as part of the virtual population generation procedure. Note that the subscript remains j, as these samples correspond to patient j, while the superscript indicates that these are posterior samples, generated using MCMC. Starting at the maximum likelihood estimate (which for our prior corresponds to the MAP estimate) reduces the need for an expensive burn-in period, in which the first J_0 samples are thrown out to ensure that the chain is sampling from the posterior.
The acceptance rate for our chain (number of proposals accepted divided by total number of samples) is 0.495. Using each posterior sample as an initial condition, we again generate a virtual population by solving Eq (2.1) with the remaining coefficients D, ρ, κ randomized according to the lumpy-type field models specified previously. The results of this virtual population are shown in Figure 8.
For both the maximum likelihood and Bayesian solutions, the resulting virtual populations underestimate, on average, the true tumor burden over both the short and long term, which indicates that additional data may be needed to estimate other model coefficients to obtain a more accurate patient-specific prediction; we discuss one possibility in section 5.2 below.
While the Bayesian technique accounts for uncertainty inherent in the image data g j (n) , this uncertainty appears relatively small as it relates to the prediction of N(t); the variance of N(t) for the virtual population generated using the Bayesian technique is only slightly larger than the variance for the MLE population.

Experiment 2: Maximum likelihood estimation of both n and ρ
In Experiment 1, we saw that the virtual population underestimated the true tumor path in both the MLE and Bayesian examples, suggesting that additional data may be required. In this experiment, we assume that the image data g_j consists of (simulated) polyscopic data of the following form. We assume that we have access first to cell density measurements of the form g_j^(n) = ℋ^(n)n_j + η_j^(n), where ℋ^(n) is the same as in Experiment 1. The second dataset consists of growth factor measurements g_j^(ρ) = ℋ^(ρ)ρ_j + η_j^(ρ), where now σ_blur = 0.0425 (1 mm FWHM). As in the first experiment, we assume that imaging is performed at t = t_0 = 100. The image reconstruction for n_j is performed using MLEM as in Experiment 1 (Figure 5). For ρ_j(x), we use MLE with a nonlinear synthesis map as follows. The lumpy-type expansion (2.8) with Gaussian lump (2.9) is assumed, where now θ_j = [x_1, …, x_{L_max}, b, σ²]; that is, we fit the lump positions and a common lump amplitude and variance. The result is a reconstruction of the form

ρ̂_j(x) = b̂ Σ_{l=1}^{L_max} exp(−|x − x̂_l|² / (2σ̂²)),

where we have chosen L_max = 60 for this experiment. We assume further the constraints x_l ∈ [0, 1]², b ∈ (0, ∞), and σ² ∈ (0, 0.1], and use the Matlab algorithm fmincon to perform the MLE optimization (4.3). The results of this reconstruction are shown in Figure 9.
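A rough Python analogue of this nonlinear MLE step is sketched below, with scipy's L-BFGS-B standing in for Matlab's interior-point fmincon; the grid size, blur width in pixels and random initialization are invented for illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.optimize import minimize

def fit_lumpy_mle(g, Lmax=60, npix=64, sigma_blur_pix=2.7, seed=3):
    """Nonlinear MLE (4.3) for theta = [x_1, ..., x_Lmax, b, sig2]:
    maximize the Poisson log-likelihood (4.2) under box constraints,
    fitting lump positions plus a common amplitude and variance."""
    rng = np.random.default_rng(seed)
    xs = np.linspace(0.0, 1.0, npix)
    X, Y = np.meshgrid(xs, xs)

    def synth(theta):                       # lumpy expansion (2.8)-(2.9)
        centers = theta[:2 * Lmax].reshape(Lmax, 2)
        b, sig2 = theta[-2], theta[-1]
        f = np.zeros_like(X)
        for cx, cy in centers:
            f += np.exp(-((X - cx)**2 + (Y - cy)**2) / (2.0 * sig2))
        return b * f

    def neg_loglik(theta):                  # negative of Eq (4.2)
        g_bar = np.maximum(gaussian_filter(synth(theta), sigma_blur_pix), 1e-12)
        return float(-(g * np.log(g_bar) - g_bar).sum())

    theta0 = np.concatenate([rng.uniform(0, 1, 2 * Lmax), [0.1, 0.01]])
    bounds = [(0.0, 1.0)] * (2 * Lmax) + [(1e-6, None), (1e-6, 0.1)]
    res = minimize(neg_loglik, theta0, method='L-BFGS-B', bounds=bounds)
    return res.x
```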
In Figure 10, we compare the tumor burden predicted using the virtual population to the 'true' tumor path. Over the first 30-day post-imaging period, the ensemble mean N̄(t) for the virtual population tracks the true tumor path closely, indicating that when both n and ρ are estimated, the result appears unbiased over this short time period. This is also demonstrated by the individual virtual population samples in Figure 11: visual inspection reveals a closer overall resemblance to the 'true' tumor than was apparent in Figure 7. Over a longer time period, the ensemble still trends towards a lower overall growth rate than the 'true' tumor path, as seen previously in Figure 6.

Discussion
In this article, we have presented a rigorous, end-to-end statistical framework for addressing both inter-and intra-patient heterogeneity for the commonly used reaction-diffusion tumor growth model (2.1), emphasizing the usage of raw molecular emission imaging data to estimate unknown model parameters. We demonstrated these methods in two in silico experiments, using a simulated ground truth tumor path and simulated imaging data to illustrate patient-specific Virtual Patient Population (VPP) generation.
In section 2.3, we presented a family of random field models that are easy to simulate and can exhibit a wide variety of spatial heterogeneity; in section 4.3, we discussed how the parameters defining the statistics of these random field models can be estimated from a collection of molecular images, and discussed some of the computational challenges inherent in this task. Future work will discuss algorithms for solving this MLE problem, as well as additional estimation strategies to calibrate more general random field models from noisy data.
In section 3, we discussed molecular Emission Computed Tomography (ECT) imaging systems and their associated statistical models. For simplicity, we emphasized the case of performing direct, possibly polyscopic, 2D imaging of a thin sample. While the 2D approximation is very limiting in general, it is applicable to the experimental setting of mouse window chambers, a topic of future work in our group. In another future work, we will address fully 3D tomographic ECT systems with particle-processing data and the associated computational and image quality questions that arise. Understanding image data quality in a task performance setting can help design better imaging systems [51], and to our knowledge, minimal effort has so far been devoted to formally evaluating the quality of image data in the mathematical oncology context.
In section 4, we presented general likelihood-based statistical procedures for estimating both patient-specific and population parameters from noisy molecular imaging data, and in section 5, we presented two experiments to illustrate these procedures. In the first experiment, measurements of n_j(x, 100) are used to construct a maximum likelihood estimate and posterior samples, which are subsequently used to generate a patient-specific VPP. In the second experiment, we add an additional imaging study, using only maximum likelihood to estimate both n_j(x, 100) and ρ_j(x). The resulting virtual population demonstrated a more accurate prediction of the true N_j(t) over a 30-day post-imaging prediction period. In a future work, we will investigate the question of optimizing imaging acquisition schemes for the task of predicting N(t). We will also address the addition of treatment models and evaluation techniques discussed in [5] to our virtual population methodology, working towards providing a general strategy for solving the optimization problem (2.12).

Figure 3. Illustration of the MLEM image reconstruction algorithm (4.4) for k = 10², 10³ and 10⁴ iterations. The true n_j(x, t) is shown on the right. The visible artifacts are due to the fact that ℋ has a nontrivial null space which is not accounted for in the MLE (4.3). Regularization strategies can alleviate this, though we have found that for the task of predicting tumor burden N(t), these artifacts have minimal impact.

Figure 4. The 'true' tumor cell density, generated using Eq (2.1) with β = (D, ρ, κ) sampled from the random field models specified in the section 5 introduction above. The simulated imaging studies are performed at t = 100 days (second panel).

Figure 7. Example VPP samples for t = 150, i.e., 50 days post-imaging, generated using the MLEM estimate of n_j(x, 100) as the initial condition and randomizing the other coefficient fields as described in the main text. All plots are shown on [0.3, 0.7]² and with a common colorbar. It is apparent that the tumor morphology is largely consistent across the population, though some virtual tumors demonstrate small 'islands' of growth and the overall rate of growth is variable, as seen in Figure 6.

Figure 11. Example VCT samples for t = 150, i.e., 50 days post-imaging, generated using the MLEM estimate of n_j(x, 100) as the initial condition, the MLE of ρ_j(x) for the growth factor, and randomizing D(x) and κ(x) as described in the main text. All plots are shown on [0.3, 0.7]² and with a common colorbar. In contrast to Figure 7, the morphology apparent in the virtual population is nearly identical; heterogeneity in the overall growth is apparent in Figure 10.