MR-guided dynamic PET reconstruction with the kernel method and spectral temporal basis functions

Recent advances in dynamic positron emission tomography (PET) reconstruction have demonstrated that it is possible to achieve markedly improved end-point kinetic parameter maps by incorporating a temporal model of the radiotracer directly into the reconstruction algorithm. In this work we have developed a highly constrained, fully dynamic PET reconstruction algorithm incorporating both spectral analysis temporal basis functions and spatial basis functions derived from the kernel method applied to a co-registered T1-weighted magnetic resonance (MR) image. The dynamic PET image is modelled as a linear combination of spatial and temporal basis functions, and a maximum likelihood estimate for the coefficients can be found using the expectation-maximization (EM) algorithm. Following reconstruction, kinetic fitting using any temporal model of interest can be applied. Based on a BrainWeb T1-weighted MR phantom, we performed a realistic dynamic [18F]FDG simulation study with two noise levels, and investigated the quantitative performance of the proposed reconstruction algorithm, comparing it with reconstructions incorporating either spectral analysis temporal basis functions alone or kernel spatial basis functions alone, as well as with conventional frame-independent reconstruction. Compared to the other reconstruction algorithms, the proposed algorithm achieved superior performance, offering a decrease in spatially averaged pixel-level root-mean-square-error on post-reconstruction kinetic parametric maps in the grey/white matter, as well as in the tumours when they were present on the co-registered MR image. When the tumours were not visible in the MR image, reconstruction with the proposed algorithm performed similarly to reconstruction with spectral temporal basis functions and was superior to both conventional frame-independent reconstruction and frame-independent reconstruction with kernel spatial basis functions. Furthermore, we demonstrate that a joint spectral/kernel model can also be used for effective post-reconstruction denoising, through the use of an EM-like image-space algorithm. Finally, we applied the proposed algorithm to reconstruction of real high-resolution dynamic [11C]SCH23390 data, showing promising results.

Recent advances in dynamic positron emission tomography (PET) reconstruction have demonstrated that it is possible to achieve markedly improved end-point kinetic parameter maps by incorporating a temporal model of the radiotracer directly into the reconstruction algorithm. In this work we have developed a highly constrained, fully dynamic PET reconstruction algorithm incorporating both spectral analysis temporal basis functions and spatial basis functions derived from the kernel method applied to a co-registered T1-weighted magnetic resonance (MR) image. The dynamic PET image is modelled as a linear combination of spatial and temporal basis functions, and a maximum likelihood estimate for the coefficients can be found using the expectation-maximization (EM) algorithm. Following reconstruction, kinetic fitting using any temporal model of interest can be applied. Based on a BrainWeb T1-weighted MR phantom, we performed a realistic dynamic [ 18 F] FDG simulation study with two noise levels, and investigated the quantitative performance of the proposed reconstruction algorithm, comparing it with reconstructions incorporating either spectral analysis temporal basis functions alone or kernel spatial basis functions alone, as well as with conventional frame-independent reconstruction. Compared to the other reconstruction algorithms, the proposed algorithm achieved superior performance, offering a decrease in spatially averaged pixel-level root-mean-square-error on postreconstruction kinetic parametric maps in the grey/white matter, as well as in

Introduction
Positron emission tomography (PET) is a molecular imaging technique that monitors trace amounts of targeted radio-labelled compounds in vivo. Dynamic PET imaging, which monitors the temporal in addition to the spatial distribution of the radiotracer, can provide richer information compared to static PET and has the ability to estimate kinetic parameters by modelling the spatiotemporal radioactivity distribution. In contrast to static PET, where the measured data is integrated over the scan duration and a single image is reconstructed, each image in a dynamic PET sequence is conventionally independently reconstructed from a much smaller time-frame of measured data, resulting in images with poor statistical quality due to low measured counts. Following independent-frame reconstruction, post-reconstruction temporal modelling (or 'kinetic fitting') is done to estimate kinetic parameters for either individual pixels or for whole regions of interest (ROIs). The noise in the individual time-frame images can lead to unnecessarily high noise levels in the final fitted parameters, and sometimes the kinetic fitting can even fail. Ideally, the post-reconstruction kinetic fitting should accurately model the noise distribution in the PET images, but this is very difficult in practice because the noise is spatially correlated and object dependent.
In response to these concerns, fully dynamic reconstruction techniques have been developed that seek to address the high noise problem by removing the assumption of time-frame independence within the image reconstruction itself. By incorporating a model of the temporal behaviour of the tracer directly into the reconstruction algorithm, fully dynamic reconstruction simultaneously benefits from (i) implicating the whole of the measured data into the reconstruction task for each time-frame, (ii) imposing constraints on the reconstructed timeactivity curves (TACs), and (iii) accurately modelling noise by estimating kinetic parameters directly from the raw data, which are well modelled as a space-time dependent Poisson process. Whether the chosen temporal model directly relates to the kinetic parameters of interest (so-called 'direct kinetic parameter estimation') or not (in which case fully dynamic reconstruction is used as a powerful means of noise reduction, and then post-reconstruction kinetic fitting with the temporal model of interest is performed), substantial reductions in variance and even bias can be obtained compared to the conventional method of frame-independent reconstruction followed by kinetic fitting (Reader and Verhaeghe 2014).
In static PET reconstruction, the low count problem is often addressed by instead removing the conventional assumption of spatial independence between pixels. In this context, anatomy-guided image reconstruction, where information contained in co-registered higher-resolution computed tomography (CT) or magnetic resonance (MR) images is incorporated into the reconstruction, has shown promise not only in reducing image noise but also in reducing partial volume effects (Vunckx et al 2012, Bai et al 2013. Importantly, algorithms incorporating anatomical information into the reconstruction task have also been shown to outperform anatomy-based post-reconstruction correction techniques in terms of image noise versus bias and detection performance (Nuyts et al 2005). In general, MR images are favoured over CT images due to excellent soft-tissue contrast. Common techniques for incorporating MR images into reconstruction include the so-called 'Bowsher prior' (Bowsher et al 2004), which acts locally by encouraging smoothness over an anatomy-dependent neighbourhood, and the joint entropy prior (Nuyts 2007, Tang and Rahmim 2009, 2015, which acts globally using an informationtheoretic based similarity measure between features of the emission and co-registered anatomical images. More recently, anatomical images have been used to generate spatial basis functions for image reconstruction using the kernel method from machine learning (Hutchcroft et al 2014, Wang andQi 2015).
With the advent of integrated PET-MR scanners, providing simultaneously obtained anatomical and functional data with the potential for minimal co-registration errors, there are increasing opportunities for merging these developments to achieve increased regularization in dynamic PET reconstruction. However, the potential for combining these developments in fully dynamic PET reconstruction on the one hand, and in anatomy-guided static PET reconstruction on the other, has not been fully realized. A notable exception is the work of Tang et al (2010), which developed a direct kinetic parameter reconstruction algorithm based on a linear Patlak temporal model that incorporated prior information from a co-registered MR image using a joint entropy penalty. While the authors showed significant pixel-level variance reduction compared to fully dynamic reconstruction alone, their proposed algorithm is only applicable for Patlak analysis (for radiotracers exhibiting near-irreversible binding), and the addition of the joint entropy term renders the objective function non-convex and difficult to tune (Vunckx et al 2012).
The present work, in contrast, investigates the use of a relatively simple approach for combining temporal constraints as well as anatomy-guided spatial constraints into fully dynamic PET reconstruction. Building on previous work in fully dynamic PET reconstruction and anatomyguided static PET reconstruction, we propose to jointly parameterize the dynamic PET image as a weighted superposition of spectral temporal basis functions (Reader et al 2007) and spatial basis functions, derived using the kernel method (Wang and Qi 2015) from a high-resolution co-registered T1-weighted MR anatomical image displaying excellent soft tissue contrast. The proposed reconstruction problem can be optimized using a conventional or nested expectation-maximization (EM) algorithm which guarantees convergence to a maximum likelihood solution. Using a realistic simulated dynamic [ 18 F]FDG imaging study, and a real high-resolution dynamic [ 11 C]SCH23390 dataset, we quantitatively evaluated the performance of the proposed algorithm. In particular, we compared the proposed reconstruction algorithm to both frame-independent reconstruction using kernel spatial basis functions alone, and fully dynamic reconstruction with spectral analysis temporal basis functions alone, in terms of post-reconstruction kinetic parameter maps.

Proposed reconstruction algorithm
Let × S J L be a matrix which holds a spatial basis function (sampled at all J pixels) in each column, × T M K be a matrix holding a temporal basis function (sampled at all M time points) in each column, and θ × LK 1 be the coefficient vector. The model of the dynamic emission image × x JM 1 (sampled at each of J pixels and M time points) can be linearly parameterized as a weighted superposition of K temporal basis functions and L spatial basis functions as where s jl is the ( j, l)th element of the matrix S, t mk is the (m, k)th element of the matrix T, and θ lk is the (l, k)th element of the coefficient vector θ. The above linear model can be conveniently expressed in matrix-vector form as where ⊗ is the Kronecker product, and ( ) ⊗ × S T JM LK is a block matrix 3 . The model of the mean ȳ of the measured sinogram data × y IM 1 (a vector containing the expected detected events at each of I detector pairs and M time points) can be related to x through an affine transform: where × A IM JM is the system matrix and × b IM 1 is a vector containing the modelled mean background components (scattered and random events). The Poisson log-likelihood function is given by which defines the maximum likelihood estimate that we seek, θ : The EM algorithm (Shepp and Vardi 1982) can be used to find the solution by the following iterative update equation for the estimate of the coefficient image θ: where × 1 V T 1 is a vector with each element equal to one. After estimation of the coefficient vector, the reconstructed dynamic image is given by equation (2).
The convergence of the above iterative algorithm can be very slow when the basis function matrix is dense. In this work, the nested EM algorithm (Wang et al 2008) is used to decouple the emission image reconstruction and the linear modelling to accelerate convergence. At each iteration, the current estimate of the coefficients ( ) θ n is used to form the current dynamic emission image ( ) ( ) ( ) θ = ⊗ x S T n n , and a provisional tomographic EM update is obtained: Second, an EM-like image-space update is used to obtain a new estimate of the coefficients: 3 Rather than forming the block matrix ( ⊗ S T) explicitly, the product ( ) θ = ⊗ x S T can be calculated by first reshaping the vector θ into a matrix Θ × K L , then calculating the matrix Θ = X T S T . The vector x is obtained by reshaping the matrix × X M J into a column vector.
where p is the sub-iteration number and ( ) ( ) θ θ = + + n n P 1 , 1 . Each iteration of the nested EM algorithm consists of one iteration of (7) and multiple sub-iterations of (8).

Post-reconstruction denoising
As an alternative, the same spatiotemporal model can also be applied as a post-reconstruction denoising method using an EM-like update as follows: where x EM is a time-series of images given by conventional frame-independent EM reconstruction. After estimation of the coefficients, the denoised image is given by equation (2). There are two iteration numbers associated with this method: the number of reconstruction iterations used to obtain the initial image to be denoised (x EM ), and the number of image-space fitting iterations (equation (9)).

Basis function selection
It is worth noting that common dynamic reconstruction algorithms, using a linear temporal model for time-activity curves, model the dynamic image as a special case of equation (2) with the spatial basis matrix S set to the × J J identity matrix. Common temporal models include spectral analysis (Reader et al 2007), splines (Nichols et al 1999, Reutter et al 2000 and the Patlak model (Tsoumpas et al 2008, Wang et al 2008. In this work we adopt the spectral analysis model because it is consistent with compartmental models, and unlike the Patlak model, does not make any assumptions about reversible or irreversible tracer uptake. Additionally, the estimated spectral coefficients can be related to kinetic macroparameters of interest if the blood input function is known (Cunningham and Jones 1993). If the blood input function is not known, the spectral analysis model can be used as a powerful means of reparameterizing the reconstruction problem to achieve noise reduction, and then post-reconstruction kinetic fitting can be done using any temporal model of interest. The spectral analysis method (Reader et al 2007) models each time-activity curve as a non-negative superposition of K different exponential temporal basis functions, each convolved with a generating function. The kth spectral basis function b k (t), (sampled at each of M time points and placed into the kth column of the matrix T) is given by convolving an exponentially decaying function with the generating function G(t): where φ k is the kth rate constant and is the convolution operator (see figure 1). The generating function is either the blood input function (if known), or it can be estimated from the head curve (the curve of the total number of detected sinogram counts as a function of time) using the EM algorithm (Verhaeghe and Reader 2013).
To spatially parameterize the dynamic image we use the kernel method for image representation, in which the PET image is modelled as a linear combination of spatially extensive basis functions derived from a prior information image. In contrast to other spatial regularization methods for PET reconstruction, which typically introduce an additional penalty term in the objective function, regularization is here achieved implicitly through a linear model of the PET image, allowing simple optimization with the EM algorithm. Similar to dynamic reconstruction incorporating a linear temporal model into the reconstruction algorithm, the kernel method achieves noise reduction by (i) estimating coefficients for smooth, overlapping and spatially extensive basis functions, each drawing from more measured sinogram data compared to single pixels, and (ii) constraining the reconstructed spatial distribution of radiotracer to a weighted superposition of such basis functions. In contrast to previous work using the kernel method (Hutchcroft et al 2014, Wang andQi 2015) where a prior information image is used to regularize static reconstructions, in this work we incorporate kernel spatial basis functions (from a co-registered T1-weighted MR anatomical image) into the reconstruction task in conjunction with the aforementioned spectral temporal basis functions. Adapting the description from Wang and Qi (2015), the ( j, l)th element of the × J J kernel spatial basis matrix S is found by evaluating a kernel function ( ) κ f f , j l , where the function depends on the prior information image at the jth and lth pixels (each pixel j of the prior image is considered to have a feature vector f j containing one or more elements): Note that L in equation (1), the number of spatial basis functions contained in S, is here taken to be equal to J, the number of pixels in the image. In this work, we use the common radial basis kernel function, though other kernel functions could be considered: where σ is a free kernel parameter. To simplify the choice of the kernel function parameter σ, the feature vectors were normalized by their standard deviation prior to calculating the kernel spatial basis matrix: where f SD q ( ) is the standard deviation of the qth element of f j over all pixels. Thus (11) and (12) deliver spatial basis functions (columns of the spatial basis matrix S) where each basis function is a 'similarity map': the j th basis function will highlight pixels with similar features to f j (i.e. equation (12) is closer to 1), while pixels with dissimilar features will be suppressed (i.e. equation (12) is closer to 0). To incorporate prior information from the T1-weighted MR image, the prior image is here taken to be the co-registered anatomical image (i.e. f j is the Here 50 exponential functions are used, with rate constants k φ log-uniformly distributed between 0.001 min −1 and 10 min −1 (left plot). The special case of k φ = ∞ was also included. The generating function (right plot, dashed line) was estimated from the noisy headcurve of a simulated sinogram in the 2D simulation study and convolved with the exponential functions, giving the spectral basis functions (right plot, solid lines). intensity of the co-registered anatomical image at pixel j, effectively a scalar in this case). Since equations (11) and (12) can result in basis functions extending across the entire image, the reconstruction of a particular region may suffer if it is not present on the prior image. To avoid introducing long-range correlations, we here assume that the structure of the co-registered anatomical image is similar to the structure of the emission image only in local neighbourhoods, and we assign non-zero elements of the spatial basis matrix only within local neighbourhoods: where α j defines the local neighbourhood of pixel j. In this work we consider square (for 2D) or cubic (for 3D) neighbourhoods centered on the pixel j. As a visual aid, figure 2 presents a sample of spatial basis functions (each obtained by reshaping the non-zero elements of a different column of the spatial basis matrix S) derived from the co-registered MR image used in the 2D simulation study, to be described next. There are J such overlapping spatial basis functions, each centered on a different pixel j. The spatial basis functions are 'local similarity maps': for a spatial basis function centered on pixel j, neighbourhood pixels with similar anatomical intensity to f j are given larger values, while neighourhood pixels with different anatomical intensity are given smaller values. As can be seen in figure 2, the kernel function parameter σ controls the smoothness and sparsity of the spatial basis functions by controlling the extent to which differences in feature vectors (here, intensities of the anatomical image) are emphasized. The effect of the kernel function parameter σ and the neighbourhood parameter α will be examined in further detail in subsequent sections.

Experiments and evaluation
3.1. Simulation study 3.1.1. Dynamic phantom. We performed a dynamic [ 18 F]FDG PET simulation study using a BrainWeb database phantom (Collins et al 1998), which consists of both an anatomical model and a corresponding simulated T1-weighted MR image (pixel size 1 mm 3 and dimensions × × 256 256 181 for both images). The anatomical model is a set of probabilistic labels, one volume per tissue class (including grey matter, white matter and cerebrospinal fluid) Figure 2. Spatial basis functions from the kernel method. In (a), the local neighbourhood of the simulated BrainWeb T1-weighted MR image is shown for randomly chosen pixels. In (b)-(d) the spatial basis functions centered on the same pixels are shown, each obtained by re-shaping the non-zero elements of the respective columns of the spatial basis matrix, for different choices of the kernel function parameter 0.05 σ = (b), 0.2 (c) and 0.8 (d). The spatial basis matrix consists of one such patch for each pixel. Here the neighbourhood α is 7 7 × pixels.
where the pixel value in each volume specifies the proportion of that tissue in the pixel. The simulated T1-weighted MR image is derived from first-principles modelling and includes realistic noise and partial volume effects. Both images were re-sampled (using tri-linear interpolation) to have isotropic pixels (pixel side length 1.21875 mm) and image dimensions of × × 256 256 207, consistent with the standard reconstruction specifications for the Siemens high resolution research tomograph (HRRT) (Wienhard et al 1994). A single transaxial slice (image dimensions × 256 256) from both the resampled images was selected for our twodimensional dynamic simulation study.
Three small to intermediate sized tumours (diameters of 8, 12 and 16 mm respectively) were inserted into the white matter of the PET phantom, and two cases were considered in our simulation study: the first in which the tumours are absent from the T1-weighted MR image, and the second in which the tumours are present on the MR image. For the second case, simulated tumours (with identical size and location as in the PET phantom) were inserted into the MR image (see figure 3).
Regional TACs were generated using a two-tissue compartmental model (K k k , , 1 2 3 and k 4 ) with an analytical blood input function (Feng et al 1997). The kinetic parameters used are summarized in table 1. A simulated dynamic image was created by assigning the generated TACs (24 time-frames: × 4 20 s, × 4 40 s, × 4 60 s, × 4 180 s and × 8 300 s) to the corresp onding labels of the anatomical model. An attenuation image for 511 keV photons was created using the anatomical model by assigning a value of 0 cm −1 in air, 0.146 cm −1 in bone and 0.096 cm −1 in other tissues. The dynamic activity images were forward projected (288 projection angles with 256 projection bins per angle) to create noise-free sinograms, and Poisson noise was introduced. A spatially uniform but temporally dependent background sinogram consisting of 20% of the total counts was included to simulate scatter and random events, and attenuation was also modelled. These effects were also modelled within reconstruction.    We considered two different total mean count levels (intermediate and high noise): × 2 10 8 and × 5 10 7 . Thirty noisy realizations were simulated for each count level to compare the reconstruction algorithms.
3.1.2. Comparisons. Dynamic PET images were reconstructed from the 30 noisy realizations using conventional frame-independent EM, EM with spectral analysis temporal basis functions, frame-independent EM with kernel spatial basis functions, and the proposed algorithm combining the kernel spatial basis functions and spectral analysis temporal basis functions. For reconstructions with spectral temporal basis functions, 50 exponential functions were used with rate constants log-uniformly distributed between 0.001 min −1 and 10 min −1 , and the special case of φ = ∞ k was included. The generating function G(t) was iteratively estimated from the head curve of the noisy sinogram. A total of 200 iterations were performed for all reconstruction algorithms. For all reconstruction algorithms except conventional frameindependent EM, the nested EM algorithm was used with 20 sub-iterations.
Post-reconstruction kinetic fitting was performed on all the resulting dynamic images using a two-tissue compartmental model (K 1 , k 2 , k 3 and k 4 ), with the ground truth analytical input function, and was optimized using 100 iterations of the Levenberg-Marquardt algorithm, with weighting factors equal to the square of the frame duration divided by the total counts of the frame. The net influx rate which is the major parameter of interest for dynamic [ 18 F]FDG PET studies, was calculated from the estimated parameters for comparing the different reconstruction algorithms.

Figures of merit.
For quantitative comparison of the different reconstruction algorithms, we studied three ROIs, namely the grey matter, white matter, and tumours. For white and grey matter, the regions were defined to include pixels that belong at least 80% to the specified tissue class, based on the probabilistic BrainWeb label images. The following figures of merit were calculated at the pixel level across realizations. The spatially averaged pixel-level coefficient of variation (COV) is defined for each ROI as: where x j r is the K i parametric value at pixel j at noise realization r, R is the total number of noise realizations (here R = 30), N ROI is the number of pixels in the ROI, and x j is the ensemble mean value of the pixel j (i.e. ¯) = ∑ = x x j R r R j r 1 1 . The spatially averaged pixel-level normalized bias is defined for each ROI as:∑ Finally, the spatially averaged pixel-level root-mean-square error (RMSE) is defined for each ROI as: The above metrics were calculated across the K i parametric images (after every 5 iterations of reconstruction) obtained by the different methods.

Real data
We applied the proposed reconstruction algorithm to 3D PET/MR data from a healthy human subject. The dynamic PET data was acquired on the HRRT using a 370 MBq injection of [ 11 C] SCH23390, a selective dopamine D 1 antagonist (26 time frames: × 6 30 s, × 7 60 s, × 5 120 s and × 8 300 s). The T1-weighted MR data (isotropic pixels with side length 1 mm) was acquired on a Siemens Sonata 1.5 T system, using a gradient echo pulse sequence (repetition time 22 ms, echo time 9.2 ms, flip angle 30° and image dimensions × × 176 256 256). Both the PET and MR data were obtained, with permission, from a previous imaging study (Cox et al 2015) at our centre, which was performed in accordance with the Declaration of Helinski and approved by the Research Ethics Board of the Montreal Neurological Institute. All participants provided written, informed consent.
As a pre-processing step, the subject's T1-weighted MR image was corrected for nonuniformity (Sled et al 1998). Before reconstruction with the proposed algorithm, the MR image was aligned and resampled to an initial reconstruction of the time-summed dynamic PET image (using 10 iterations of the ordinary Poisson ordered subset expectation maximization (OP-OSEM) algorithm (Comtat et al 2004, Hong et al 2007 with 16 subsets) by a rigid (6 parameter) mutual-information based registration. During the tomographic update stage of reconstruction (equation (7)), normalization, attenuation and resolution modelling (Reader et al 2003) effects were accounted for, as well as randoms and scatter. No motion correction was performed. For the nested EM reconstructions, each image-space update consisted of 10 sub-iterations, with 160 iterations performed in total (this is the number used in practice at our imaging centre).
As for the simulation studies, for the spectral temporal basis functions, we used 50 exponential functions with rate-constants log-uniformly distributed between 0.001 min −1 and 10 min −1 , with the generating function estimated from the head curve of the measured sinogram. For the kernel spatial basis functions, we used a kernel parameter σ = 0.2 and a neighbourhood α = × × 7 7 7 pixels ( × × 8.5 mm 8.5 mm 8.5 mm). To reduce computational demand, only the 150 nearest neighbours (using the kNN algorithm (Friedman et al 1977) with the Euclidean distance) of each f j , within each cubic neighbourhood, were assigned nonzero values when building the spatial basis matrix.
Post-reconstruction kinetic analysis was performed on the reconstructed dynamic image using the simplified reference tissue model with the basis function method (SRTM-BFM) (Lammertsma andHume 1996, Gunn et al 1997) with 100 basis functions (basis function coefficients logarithmically sampled from θ = 0.001 3 min s −1 to θ = 0.01 3 max s −1 ), and with the cerebellum as the reference region, to estimate pixel-wise non-displaceable binding potential (BP ND ). Weighting factors were equal to the square of the frame duration divided by the total counts of the frame. The CIVET pipeline (Collins et al 1994, Sled et al 1998, Smith 2002, Ad-Dab'bagh et al 2006 was used to segment the subject's MR image into the cerebellum and two conventional regions of interest for [ 11 C]SCH23390 (Farde et al 1987): the caudate nucleus and putamen. Since the true kinetic parameters are not known, we plotted the regional COV (regional standard deviation divided by mean regional value) versus mean regional BP ND within the specified ROIs, with increasing iteration number.

Simulation study
4.1.1. Effect of tunable parameters. First, the effects of the tunable parameters associated with the kernel spatial basis functions (the neighbourhood α, and the kernel function parameter σ) were studied (with the high noise simulation: × 5 10 7 total mean counts) for various parameter choices. Since the results for the grey and white matter are similar, as well as the results for the tumours of different sizes, we report results for combined white and grey matter regions, as well as for the three tumours combined.
The effect of varying the neighbourhood α (with fixed kernel parameter σ = 0.2) on postreconstruction K i parametric maps is shown in figure 4. Figure 5 presents an example K i parametric image, and the unnormalized RMSE (no denominator term in equation (18)) image over the 30 noise realizations, resulting from kinetic fitting of images reconstructed with different parameters. The unnormalized RMSE was used because the RMSE as described in equation (18) is undefined in regions where K i = 0. In the grey and white matter regions, increasing the neighbourhood size from × 3 3 pixels ( × 3.7 mm 3.7 mm) to × 7 7 pixels ( × 8.5 mm 8.5 mm) results in reduced pixel-level COV at matched pixel-level bias. When the tumours are absent from the co-registered MR image, the spatial basis functions in the white matter (in which the tumours are embedded) are roughly uniform patches. Increasing the neighbourhood parameter α increases the size of the patches, limiting the ability to reconstruct the sharp edges of the tumour, and causing increased pixel-level bias due to excess blurring in the tumours. When the tumours are present on the MR image, the spatial basis functions are adapted to the shape of the tumour, and no blurring is observed. In this latter case, similar to in the grey/white matter, increasing the neighbourhood size results in reduced pixel-level COV at matched pixel-level bias. These effects are evident in figure 5. The effect of varying the kernel function parameter σ (with fixed neighbourhood parameter α = × 5 5 pixels) on post-reconstruction K i parametric maps is shown in figure 6. Figure 7 presents an example K i parametric image, and the unnormalized RMSE image over the 30 noise realizations, resulting from kinetic fitting of images reconstructed with different parameters. In the grey and white matter regions, increasing the kernel function parameter from 0.05 to 0.2 results in reduced pixel-level COV at matched pixel-level bias, while increasing it further to 0.8 increases pixel-level bias. If σ is too small, the kernel function overemphasizes differences between features, leading to spatial basis functions which are sparse and sensitive to noise in the anatomical image (figure 2(b)) and offering only slight noise reduction in the reconstructed images. As the parameter is increased, the spatial basis functions become smooth and less sparse (figure 2(c)), further reducing noise in the reconstructed images. If the parameter is increased excessively, the spatial basis functions become increasingly uniform (figure 2(d)) as the kernel function fails to separate pixels with sufficiently different intensities in the anatomical image, leading to reconstructions which are oversmoothed and biased (figure 7). In the tumours, the kernel parameter σ has little effect on pixel-level RMSE when the tumours are not present on the MR image; while increasing the parameter reduces pixellevel variance, bias is increased due to increased blurring. When the tumours are present on the MR image, pixel-level bias-COV tradeoff curves are similar to those observed in the grey/ white matter, with σ = 0.2 giving the best results.

Comparisons with other reconstruction methods.
The quantitative performance of the proposed reconstruction algorithm was compared to three other reconstruction algorithms: conventional frame-independent EM, as well as two other reparameterized reconstruction algorithms: EM with spectral analysis temporal basis functions, and frame-independent EM with kernel spatial basis functions. For reconstructions with kernel spatial basis functions, we used a neighbourhood α = × 5 5 pixels ( × 6.1 mm 6.1 mm) and kernel function parameter σ = 0.2. The top row of figure 8 compares the spatially averaged pixel-level bias-COV tradeoff curves on post-reconstruction K i parametric maps obtained with the four different reconstruction algorithms, for the intermediate noise case (total mean counts × 2 10 8 ). The bottom row compares the different algorithms in terms of the spatially averaged pixel-level RMSE as a function of iteration. Figure 9 is similar but for the high noise level (total mean counts × 5 10 7 ). Example post-reconstruction K i parametric images, as well as unnormalized RMSE images over the 30 noise realizations, are shown in figures 10 and 11.
These data are summarized in table 2. For a fair overall comparison, for each reconstruction algorithm, we calculated the total RMSE (obtained by summing the RMSE for the grey/white matter and tumour regions) at each iteration and chose the iteration that delivered the lowest value; this iteration number was used to compare the different algorithms in the following analysis. In the grey and white matter, each reparameterized algorithm (EM with spectral analysis temporal basis functions, frame-independent EM with kernel spatial basis functions, and the proposed algorithm combining both) performed better than conventional frameindependent EM (table 2). In terms of spatially averaged pixel-level RMSE, the the proposed algorithm outperformed EM with spectral analysis temporal basis functions, offering a reduction in RMSE of 6.5 and 8.0 percentage points (a relative decrease of 39% and 37%) at the intermediate and high noise levels, respectively. Compared to frame-independent EM with kernel spatial basis functions, the reduction in RMSE was 11.1 and 14.1 percentage points (a relative decrease of 53% and 51%), respectively. Compared to conventional frameindependent EM, these numbers increased further to 15.7 and 20.6 percentage points (a relative decrease of 62% and 60%). In the tumours, results are similar to the grey/white matter when they are present on the MR image, with the reparameterized methods outperforming conventional frame-independent reconstruction, and the proposed algorithm offering lower RMSE than the other reparameterized methods at both noise levels (relative decrease in RMSE greater than or equal to 51%). When the tumours are absent from the MR image, the proposed reconstruction algorithm performed similarly to EM with spectral analysis temporal basis functions, and better than both frame-independent EM with kernel spatial basis functions and conventional frame-independent EM. These quantitative results agree with the visual examples of the images shown in figures 10 and 11. In particular, the improved performance of the proposed reconstruction algorithm is most evident at the high noise level.  (table 2). Note. For each reconstruction algorithm, the optimal iteration number, which achieved the lowest total RMSE (obtained by summing the RMSE for the grey/white matter and tumour regions) was chosen for comparison.

Post-reconstruction denoising experiment
Compared to conventional frame-independent reconstruction, the improvements offered by the reparameterized reconstruction methods (each of which embed a different linear model into the iterative reconstruction algorithm) are evident. However, we note that the respective linear models can also be applied as a post-reconstruction denoising step in order to increase the robustness of the final non-linear fitting step (the latter being highly sensitive to noise). We performed an extensive post-reconstruction denoising experiment in order to assess the effectiveness of such a strategy. Dynamic images (reconstructed with the conventional frame-independent algorithm at the high noise level) were denoised by image-space fitting to each of three different linear models: the spectral analysis model, the kernel model, and the proposed joint model. For image-space fitting, we used the update equation (9), which is equivalent to modelling with spectral analysis basis functions alone when the spatial basis matrix S is the × J J identity matrix, and which is Figure 11. Similar to figure 10, but for the high noise level (total mean counts 5 10 7 × ). Note. For both post-and within-reconstruction strategies, the iteration settings which achieved the lowest total RMSE (obtained by summing the RMSE for the grey/white matter and tumour regions) was chosen for comparison. For the within-reconstruction method, the optimal iteration is the reconstruction iteration number. For the post-reconstruction method, the optimal iteration numbers are the reconstruction iteration number yielding the initial image to denoise, as well as the number of image-space iterations, respectively. equivalent to modelling with frame-independent kernel spatial basis functions alone when the temporal basis matrix T is the × M M identity matrix. Following post-reconstruction denoising, the images were fitted to the two-tissue compartmental model to obtain K i parametric maps (just as was done for the within-reconstruction method above). For each linear model, we applied up to 200 iterations of the image-space fitting algorithm to images reconstructed with either 25, 50, 100 or 200 iterations of conventional frame-independent reconstruction. The combination of iterations that achieved the lowest total RMSE (obtained by summing the RMSE for the grey/white matter and tumour regions) were chosen for compariso n. Table 3 summarizes these results, and allows direct comparison between the within-and postreconstruction methods. For a visual comparison, example K i parametric maps obtained by the different within-and post-reconstruction methods are shown in figure 12.
The top row of figure 13 compares the bias-COV tradeoff curves on K i parametric maps for the within-and post-reconstruction approaches incorporating the joint spectral/kernel model, in the case where the tumours are present on the MR image. The bottom row shows the spatially-averaged pixel-level RMSE as a function of iteration. As shown in table 3, the within-reconstruction method outperformed the post-reconstruction method in both regions, offering a decrease in spatially-averaged pixel-level RMSE of 1.7 and 0.6 percentage points in the grey/white matter and tumour regions, respectively, when considering an MR image without tumours. When the tumours are present on the MR image, these numbers increased to 2.4 and 1.8 respectively. From figure 13, we can see that these gains in performance are primarily due to lower bias. As the within-reconstruction methods properly model the raw sinogram data as a space-time dependent Poisson process, they are expected to outperform the post-reconstruction methods. However, for the spectral analysis or kernel models alone, we note that the within-and post-reconstruction methods perform comparably, with the withinreconstruction methods tending to offer better performance in the tumours and worse performance in the grey/white matter.
We hypothesize that this observation (not seen for the more advanced joint spectral/kernel basis function model) is due to the fact that the within-reconstruction method uses nested EM as a means of acceleration. Reconstruction with either just the spectral analysis basis  functions alone, or just the kernel basis functions alone, converges more quickly (after fewer tomographic update iterations) than reconstruction with the denser (less sparse) basis function matrix associated with the joint spectral/kernel model. We expect the performance of the within-reconstruction methods, regardless of the chosen linear model, to improve as the number of the nested EM sub-iterations decreases (i.e. allowing more tomographic update iterations to occur before convergence, increasing the opportunity of attaining a lower RMSE). Of course, the required computation time for reconstruction would increase in this case. In future work, it would be interesting to investigate the impact of the choice of sub-iteration number on the performance of reconstruction methods incorporating the different linear models.

Real data
For the real [ 11 C]SCH23390 dataset, figure 14 shows the regional COV (regional standard deviation divided by regional mean value) as a function of the mean regional BP ND , with increasing iteration, of the parametric BP ND maps obtained by post-reconstruction kinetic fitting of images from the different reconstruction algorithms. For a visual comparison, figure 15 shows examples slices through the post-reconstruction parametric BP ND maps. In the real [ 11 C] SCH23390 data study, similar to the results from the [ 18 F]FDG simulation study, the proposed algorithm demonstrates improvement over the others, achieving comparable or reduced noise (measured here by regional COV) at matched mean regional BP ND in both regions. Figure 13. Comparison of the within-and post-reconstruction methods with the joint spectral/kernel model. Top: spatially averaged pixel-level bias versus spatially averaged pixel-level COV on K i parametric maps (iteration number increasing (5, 10, 20, 40, 80, 160 and 200), corresponding to increasing COV in each case. For the withinreconstruction approach, the curve varies as a function of reconstruction iteration. For the post-reconstruction approach, each curve varies as a function of image-space iteration, for different initial images (given by either 25, 50, 100 or 200 iterations of conventional frame-independent reconstruction). Bottom: spatially averaged pixel-level RMSE as a function of reconstruction iteration (for the within-reconstruction approach) or image-space iteration (for the post-reconstruction approach). All results are shown for the high noise level (total mean counts 5 10 7 × ).

Discussion
The choice of parameters related to the kernel spatial basis functions (kernel function parameter σ and neighbourhood size α) is critical in determining the performance of the proposed method. The optimal choice for σ should result in smooth spatial basis functions while still discriminating between pixels belonging to different tissues in the local neighbourhoods of the co-registered MR image (figure 2). In our studies using both simulated and real T1-weighted MR data, σ = 0.2 resulted in satisfactory performance. However, if noisier images are used, this parameter may need to be increased in order to retain smooth basis functions (alternatively, a denoising algorithm could be applied to the MR image before building the spatial basis matrix). If less noisy images are used, the parameter can be decreased in order to increase the discrimination between different tissues while retaining smoothness. Additionally, we note that the current study only considers T1-weighted MR images, and that the choice of kernel function parameter will likely need to be modified for MR images obtained from sufficiently different (e.g. T1 versus T2-weighted) acquisition protocols with different tissue contrasts. A possible avenue for future work is to investigate the use of available segmentation algorithms for deriving spatial basis functions with the kernel method. In this case, the feature vector f j would contain the tissue probabilities for the various tissue classes for pixel j, rather than the original anatomical image intensity as used in this work. This could be advantageous because the spatial basis functions would be insensitive to the tissue contrast of the original image, possibly more robust to noise on the anatomical image, and an optimal kernel function parameter σ could generalize well across anatomical images acquired under different protocols.
The neighbourhood parameter α determines the trade-off between noise reduction, and increased blurring in structures of the emission image where there is no match in the anatomical image (figures 4 and 5). In our simulation study with small to intermediate sized tumours, α = × 5 5 pixels ( × 6.1 mm 6.1 mm) resulted in good performance, achieving comparable or better performance compared to the other algorithms in the tumours when they were absent from the co-registered MR image, and better performance (relative decrease in spatially averaged pixel-level RMSE ⩾ 50%) when the tumours were present on the co-registered MR image. In cases where there is confidence that the structures of emission and anatomical images match, this parameter can be increased to achieve even better performance.
We expect that the results on real data can be further improved. In particular, our study did not account for inter-frame motion, which could result in mismatches between the anatomical Figure 14. Real [ 11 C]SCH23390 data. The regional COV (regional standard deviation divided by regional mean value) is plotted as a function of mean regional BP ND in two striatal regions (with increasing iteration number (5, 10, 20, 40, 80 and 160) from left to right). and emission images if not corrected, leading to worse performance. Similarly, misregistration between the images will result in worse performance. However, we emphasize that in our tests with real data, the proposed reconstruction method delivered better results compared to the other algorithms, without explicitly accounting for either motion or misregistration. Indeed, with the advent of simultaneous PET-MR scanners, the potential for minimal misregistration and inter-frame motion (after correction) are now possible (Catana et al 2011), and we expect that the proposed algorithm will prove beneficial without modification in such cases. Nonetheless, future work will explore the impact of misregistration/inter-frame motion on the proposed algorithm.
While the proposed algorithm combines spectral analysis temporal basis functions with spatial basis functions derived from the kernel method, it would be straightforward to substitute other temporal basis functions into the linear model of equation (2). For example, Patlak temporal basis functions could be used instead of spectral analysis functions, in which case the Patlak coefficients could be recovered from the coefficients estimated directly from the measured sinogram data, removing the need to perform post-reconstruction modelling. Alternatively, the spatial basis functions used in this work could be combined with a non-linear temporal model using a method similar to that presented by Matthews et al (2010) or Wang and Qi (2012), in which case, at each iteration, frame-independent EM image reconstructions with kernel spatial basis functions could be fitted to a non-linear model with a modified leastsquares algorithm.

Summary
We have proposed a reconstruction algorithm incorporating a jointly parameterized model of the dynamic PET image data in terms of spectral analysis temporal basis functions, and spatial  basis functions derived from a co-registered MR anatomical image using the kernel method. With a realistic dynamic [ 18 F]FDG simulation study, we have shown that including a temporal and spatial model of the image into the reconstruction task significantly outperforms reconstruction methods incorporating either model alone, in terms of post-reconstruction kinetic parameter maps, when a co-registered MR image with the same structure as the emission image is available. In regions in which there is no shared structure, the proposed algorithm was found to perform similarly to reconstruction with spectral analysis temporal basis functions, and superior to conventional frame-independent reconstruction and frame-independent reconstruction with kernel spatial basis functions. We further demonstrated that the joint spectral/kernel model can also be effectively applied post-reconstruction, using an EM-like imagespace algorithm, to increase the robustness of non-linear kinetic modelling. Application to real dynamic [ 11 C]SCH23390 data also showed promising results. The proposed algorithm can be applied to reconstruction of any dynamic PET data for which a co-registered T1-weighted MR image is available, after which post-reconstruction kinetic fitting using any temporal model of interest can be applied.