Hybrid PET-MR list-mode kernelized expectation maximization reconstruction

The recently introduced kernelized expectation maximization (KEM) method has shown promise across varied applications. These studies have demonstrated the benefits and drawbacks of the technique when the kernel matrix is estimated from separate anatomical information, for example from magnetic resonance (MR), or from a preliminary PET reconstruction. The contribution of this work is to propose and investigate a list-mode-hybrid KEM (LM-HKEM) reconstruction algorithm with the aim of maintaining the benefits of the anatomically-guided methods and overcome their limitations by incorporating synergistic information iteratively. The HKEM is designed to reduce negative bias associated with low-counts, the problem of PET unique feature suppression reported in the previously mentioned studies using only the MR-based kernel, and to improve contrast of lesions at different count levels. The proposed algorithm is validated using a simulation study, a phantom dataset and two clinical datasets. For each of the real datasets high and low count-levels were investigated. The reconstructed images are assessed and compared with different LM algorithms implemented in STIR. The findings obtained using simulated and real datasets show that anatomically-guided techniques provide reduced partial volume effect and higher contrast compared to standard techniques, and HKEM provides even higher contrast and reduced bias in almost all the cases. This work, therefore argues that using synergistic information, via the kernel method, increases the accuracy of the PET clinical diagnostic examination. The promising quantitative features of the HKEM method give the opportunity to explore many possible clinical applications, such as cancer and inflammation.

and compared with different LM algorithms implemented in STIR. The findings obtained using simulated and real datasets show that anatomicallyguided techniques provide reduced partial volume effect and higher contrast compared to standard techniques, and HKEM provides even higher contrast and reduced bias in almost all the cases. This work, therefore argues that using synergistic information, via the kernel method, increases the accuracy of the PET clinical diagnostic examination. The promising quantitative features of the HKEM method give the opportunity to explore many possible clinical applications, such as cancer and inflammation.
Keywords: hybrid kernel, PET image reconstruction, multumodality, iterative reconstruction, anatomically-guided, synergistic PET-MR S Supplementary material for this article is available online (Some figures may appear in colour only in the online journal)

Introduction
Image reconstruction, using positron emission tomography (PET), is now mostly performed with iterative techniques. Some of these methods use prior information to take into account the prior belief about 'smoothness' of the tracer concentration. There are different ways to introduce prior information into the PET image reconstruction problem [26,53]; most of them use a Bayesian approach where instead of maximizing a Poisson log-likelihood, as in [16,39,59], the log-posterior is maximized [1,23,47].
Recently, Ahn et al [1] demonstrated that reconstruction using the relative difference penalty (RDP) can lead to better lesion detectability than ordered subsets expectation maximization (OSEM). The advent of multi-modality imaging scanners made the exploitation of anatomical information with Bayesian techniques simpler and more practical, for example, using magnetic resonance (MR) information. These techniques have shown promising results in terms of image quality and quantification [3,9,17,32,38,40,60,61,63,65,69,73]. To exploit the fact that PET-MR scanners allow the acquisition of PET and MR data simultaneously, synergistic reconstruction of these two has been investigated in order to improve the quality of both PET and MR images [18,44,45].
More recent studies have proposed a different approach to introduce prior information. Such techniques benefit from the kernel method which is commonly used in machine learning [24]. Hutchcroft et al [27,28] and Wang and Qi [71] introduced this technique in PET image reconstruction using one prior information image, MR and PET respectively, to regularize reconstruction. Novosad and Reader [46] used the kernel method combined with temporal basis functions in order to perform full dynamic PET reconstruction. Ellis and Reader [21] proposed the use of kernelized expectation maximization (KEM) in the context of dual-dataset longitudinal PET studies, where a baseline scan reconstruction was used to define basis functions for a follow-up scan reconstruction. Gong et al [22] used a hybrid kernel method to perform direct reconstruction of Patlak plot parameters from dynamic PET using MR and PET information where the latter was obtained by combining different frames. Bland et al [8] studied the effect of KEM on simulated dose-reduced datasets, showing improved contrast to noise ratio, but at the cost of possible over-smoothing of features unique to the PET data. To overcome this issue [7] proposed an MR resolution spatially constrained kernel method in order to maintain the noise reduction properties of the conventional kernel method, whilst better retaining the features unique to the PET data. These previous studies were carried out using sinogram-based reconstruction.
The work presented in this manuscript explores the performance of a hybrid kernel method for list-mode reconstruction (LM-HKEM) of static images exploiting the PET information, iteration after iteration. In particular, the hybrid kernel matrix contains two components: MR and PET. The PET part is obtained from the iterative update using a similar approach to the one-step-late (OSL) reconstruction method. Such a procedure avoids the need for a preliminary reconstruction from PET data, as in [71]. The MR component is a constant part of the kernel which has to be chosen according to the type of study one wants to perform. The proposed method was designed to improve quantification and to minimise PET unique feature suppression, while keeping good resolution and image quality at different count-levels. Both LM-KEM and LM-HKEM in this study use a voxel-wise and spatially restricted kernel rather than a patch-wise one. We studied the method performances on four datasets: simulated torso, Jaszczak phantom and two patient studies showing plaques in the arteries. The method was compared with different algorithms such as the Bayesian OSEM with different priors.
The paper is structured as follows: section 2 describes the mathematical aspects of the hybrid kernelized reconstruction algorithm. Section 3 describes the datasets used to study image reconstruction, LM sub-sampling and the experimental methodology. Section 4 presents results and a comparison of the different standard algorithms with the proposed one. The results are discussed in section 5 and conclusions are drawn in section 6.

Theory
The kernel method is a technique commonly used in machine learning [24]. It aims to learn details about the data by taking advantage of what is called a training sample where known information can be identified. One of the most general ways to represent data is to specify a similarity between pairs of objects. Suppose we have empirical data where V is the domain of the inputs, v j , and T is the domain of the outputs, t j . The idea is to be able to generalize for unseen data points. That is to say, given some new input v ∈ V , we want to predict the corresponding output, t ∈ T . In our case, the task is to predict the PET image of a subject, using PET data as well as information from an image of the same subject in a different modality (MR or computed tomography (CT)). The output, t j , can be written as a function, F, of the input, which we assume is a vector, v j F(v j ) is a high dimensional and non-linear function but it can be described linearly in a trans- where N is the number of input and outputs, Φ is a vector mapping function and µ is a weight vector also sitting in the transformed space with where α l is an element of the coefficient vector, α. At this point, it is clear how the outputs can be described as a linear function in a dot product space, called feature space, as The dot product is a similarity measure in the space V and it defines the kernel, k satisfying, for all v l , v j ∈ R N , the following identity The vector v is called the feature vector. The advantage of using a kernel as a similarity measure is that it allows construction of algorithms in dot product spaces without explicitly defining Φ. The kernel approach can be applied to the LM-OSEM reconstruction algorithm [4,37,49,50,55]. For simplicity, we show the mathematical formulation of the algorithm for 1 ordered subset. The LM-OSEM iterative update for a voxel, j , and sub-iteration, n + 1 is given by j is the estimated j th voxel value at the nth iteration and p ij is the ijth element of the system matrix. This represents the probability that an event occurring in voxel j produces a coincidence in the ith pair of detectors, s i is the additive sinogram containing scatter and random events, I j is the number of events that contribute to the value in voxel j , and J i the number of voxels whose projection i contributes.
Each voxel value of the image, λ, can be represented as a linear combination using the kernel method. So, λ j , can be described using the kernel matrix (9) where k fj is the fjth kernel element of the matrix, K, where N j is the number of feature vectors related to the voxel j . Different kernel functions are proposed in the literature [24], with the most used in medical imaging being the Gaussian kernel

Kernel matrix construction
In contrast to previous work, [28,71], where the kernel was created using either MR or PET images, and [46], where the kernel method is used in conjunction with spectral temporal basis functions for dynamic PET reconstruction, here we propose a LM hybrid kernel method that uses information from both MR images and PET update images. The PET image, used to construct the hybrid kernel, is helpful in tackling the mismatch problem between PET and MR. The dependency on the iterative process helps avoiding the need for a preliminary reconstruction to obtain the kernel. Taking into account the fact that the kernel is iteration dependent, the iterative step becomes the following where is the kernel coming from the MR image and is the part coming from the PET iterative update. Here, the Gaussian kernel functions have been modulated by the distance between voxels in the image space, as in [21]. The quantity x j is the coordinate of the j th voxel, z (n) j is the feature vector that is calculated from the nth PET update image, and σ m , σ p , σ dm and σ dp are scaling parameters for the distances in (13) and (14). For each voxel of the PET image the corresponding feature vectors, z (n) j and v j , are extracted from the local neighborhood of the voxel from the PET update image and MR image, respectively. To keep computation time short, we construct a sparse kernel matrix. A cubic neighborood, N j , was used and the k (n) jf element of the kernel was defined by Following previous studies, to make it easier to choose the kernel parameters (such as σ m and σ p ), the feature vector, v j , is normalized as where SD m is the standard deviation of the mean voxel value over the whole MR image v j . For the PET contribution, k p (z j ), this normalization is slightly different. As we said previously, our image can be written as a linear combination like in (9), and each element in this linear combination contains the difference between the feature vector, z (n) j , associated with the voxel j , with the feature vector z (n) f , associated with any neighboring voxel f . Here we normalise these differences for α In this way, for every f in the neighborhood the normalization is the same. Local mean voxel values were not used because the operation would need to be repeated for every voxel of the image and every iteration, making the method more computationally demanding.
The SD is not used because the first PET image used for the kernel is uniform and the standard deviation is then zero. Moreover the SD of a PET image grows iteration by iteration.
The value of α (n) j represents the central voxel value in the feature vector j . As mentioned in section 1, the kernel is voxel-wise. This means that the feature vector contains only one nonzero element, which is α (n) j , and equation (17) becomes:

Simulation study
A realistic simulation study was carried out to validate the proposed method and investigate its performance under controlled conditions. The simulated data were produced by a Monte Carlo numerical simulation based on GATE [30] which uses accurate physical modelling. The specific simulation is described in detail in [68] but in brief it uses the Philips Gemini TF scanner [64] having cylindrical geometry with 70 cm diameter, 18 cm length and consists of detector blocks with 44 × 23 crystals along 28 detector blocks. Each crystal size is 22 × 4 × 4 mm 3 . The synthetic data represents an anthropomorphic torso showing uniform contrast in clustered regions: lungs, myocardium, liver and three different spherical lesions between lungs and liver as shown in figure 2(a). The three lesions have their centers in different axial positions. To speed up the simulations they did not include effects such as patient motion and positron range which would need to be treated with appropriate modelling within the reconstruction system matrix [51]. The total number of simulated events is 6.5 × 10 7 with 18 F-fludeoxyglucose ([ 18 F]FDG). In this study we focused on the uptake value of lesions and we reproduced four cases: L1 is small (6 mm diameter) and L2 is big (12 mm diameter) both appear only in the PET images; L3 (12 mm diameter) is a lesion appearing both in the PET and the MR data (note that the MR image was augmented with a synthetic lesion equal to L3); and L4 is the part of the soft tissue which appears only in the MR. The simulated uptake for these ROIs is 7, 7, 4 and 1 respectively for L1, L2, L3 and L4. The reconstructed image were obtained using 23 subsets on a 128 × 128 × 87 grid of 4 × 4 × 2 mm 3 voxel size.

In vivo evaluation
The LM-HKEM method was also applied to dynamic data for a patient with suspected atherosclerotic plaques in the carotid arteries.  [20]. This image sequence is particularly suitable for carotid PET/MR studies, because it provides high contrast between the carotid arteries and the surrounding tissue.

Reconstruction setup
All datasets were reconstructed with 21 subsets and 10 full iterations using LM-HKEM and a selection of values for the parameters as discussed in section 2, that is: N, σ m , σ p , σ dm and σ dp . The values of the parameters used in this work are reported in table 1. The size of the cubic neighborhood, N, was chosen to be 3 × 3 × 3 voxels, although 5 × 5 × 5 and 7 × 7 × 7 were also studied. From a preliminary study, also reported in the results, we noticed that for LM-HKEM the RMSE does not show big improvement but actually becomes a little higher. For KEM the lesions L1 and L2, which are the most interesting, are over-smoothed when using a bigger neighbourhood. In addition, a bigger neighbourhood makes the computation slower by a factor equal to or bigger than 2. For comparison, the same datasets have been reconstructed also with 21 subsets and up to 10 full iteration cycles of: OSEM, as this is the algorithm used in clinical routine; the maximum a posteriori one step late with median root prior (OSMAPOSL-MRP) and the parallel level sets (to be referred to as MRP and PLS), and the KEM using the MR image [28]. We implemented the PLS prior in STIR as described in detail by Ehrhardt et al [17]. This inclusion was motivated by the fact that the PLS prior depends on the gradient of the PET image and the gradient of the MR image and as a consequence, its hybrid nature makes it particularly relevant in the comparison. Although this study is focused on quantification, which is performed for all 10 iterations, the images are shown at the 3rd iteration which is typically the recommended for OSEM in most clinical PET imaging protocols [1,2,42,43]. The iteration could be chosen as the one which optimises a specific figure of merit, such as CNR, however this is dependent on the chosen ROI and varies between datasets. Therefore, using a fixed iteration number to show the reconstructed images allows better consistency among the different cases. Note that all algorithms in this study use LM reconstruction; for this reason, we refer to the proposed algorithm as HKEM, instead of LM-HKEM, from now on.
Scatter correction was performed as developed by Tsoumpas et al [67] and discussed in more detail by Polycarpou et al [52]. Randoms were estimated from singles, which were calculated from delayed events [29]. The procedures for these evaluations, including attenuation and normalization corrections [25], make use of the open source Software for Tomographic Image Reconstruction (STIR) [66] version 3.0. All real datasets were reconstructed using sinogram axial compression 11 [5].

Image analysis
The comparison was carried out in terms of contrast-to-noise ratio (CNR) for the full dataset including all events, while bias and contrast recovery coefficient (CRC) were considered for the low-count cases. The rationale behind the choice of different image metrics for different count levels is that for the low-count case it is possible to use the long acquisition with OSEM reconstructed image, with 10 iterations, as the reference image for the bias and the reference contrast for CRC. On the other hand, only CNR is a meaningful measure of contrast for the long acquisition as no reference image is available for the long acquisition. The 10th iteration was chosen because it is assumed to be closer to convergence than the 3rd. Region of interest (ROI) analysis was performed in the clinical data using two separate regions: (a) the target ROI located in the atherosclerotic plaque of the right carotid bifurcation, which was segmented from the MR image, and (b) the background ROI drawn in the surrounding tissue of the carotid. From figures 1 and 2, we can see the inconsistencies between PET and MR: in (a) L1 and L2 appear only in the PET image, L3 in the PET and MR images and L4 shows details only in the MR image; in (b) for the [ 18 F]FDG scan and (c) for the [ 18 F] NaF scan, the MR image shows high contrast between the carotid arteries and the surrounding tissues while the PET image shows small lesions inside the carotid arteries which do not necessarily follow the same shape as the carotid arteries.
Quantitative comparison was performed using different figures of merit: where t and b are the mean values over target (hot region) and background ROIs, σ t and σ b are the standard deviations related to target and background ROI respectively. Note that CNR was used only for the single long acquisition image. For the low-count images, a mean CRC, and the bias, were calculated over the 10 sub-samples: p is the sample index, t p is the mean value of the chosen ROI, P is the number of sub-samples, b p , is the mean value of the target background ROI and C T is the true contrast calculated on the 304.349 s acquisition image as the difference between the values of target and background ROIs, and CoV p is the coefficient of variation (CoV) of the voxels inside the region of interest for the specific sample or frame p . Due to the fact that there are several sources of variability over time for a real human body, such as motion, tracer kinetics and others. We did not average the bias values over the 10 frames, but we calculated the bias of the sum. In particular, we summed up all the 10 frame images and calculated the difference of the sum with a reconstructed image, using OSEM, that had the same acquisition time as the sum of all the frames (304.349 s). Note that the duration of each frame takes into account the decay process, which is the reason why the sum is not exactly 300 s. The bias of the sum is calculated as follows: where S is the value of the sum over the frames in the selected ROI, M T is the value in the same ROI for the 304.349 s image obtained with the 10th iteration of OSEM. Figure 3 shows the kernel parameters (σ p , σ m , σ dp and N ) optimization in terms of Bias. Here we show the optimization for both KEM and HKEM and for all the ROIs. Note that σ dm = σ dp in this study because the voxel size of the MR image is the same as of the PET image. Figure 4 presents the bias-CoV plot in all ROIs, and over 10 iterations. Also, it shows a comparison between OSEM, MRP, PLS, KEM and HKEM. The image quality comparison is reported in figure 5. Note also that the effect of σ p is not shown for KEM because this method only has the MR related parameters, σ m and σ dm , to tune.  region) and the surrounding tissue as the background. In figures 7 and 11, the CRC metric, is assessed for the [ 18 F]FDG and [ 18 F]NaF datasets using the ten short frame datasets. Bias was evaluated as described in section 3.4 using equation (22). Figures 8 and 12 show the compariso n among the different methods using the [ 18 F]FDG and [ 18 F]NaF data. The plot shows the bias of the sum as a function of the CoV related to each iteration. The MR image used as the source of the kernel is shown in figures 1(b) and (c). To give an idea of the computational time, in table 2 the reconstruction time required for 10 iterations of a 5 s frame for each algorithm is reported and they refer to the University of Leeds high performance computer: each reconstruction used one of the 10 cores Intel E5-2660v3 (2.6 GHz) processor, and the available memory is 256 GB. The compiler that was used was GCC 4.4.7. The computational time in table 2 includes also the sensitivity calculation.

Discussion
We have proposed and developed a LM hybrid kernelized reconstruction algorithm which includes information from both MR and PET. Although the MR image is needed before iterations start, the PET contribution naturally comes from the iterative process. In this way, our method avoids preliminary PET reconstructions aimed at obtaining a PET contribution for the kernel matrix. Particular attention is focused on the improvement in quantification at different count-levels. Note that the PLS prior, also included in the comparison, has hybrid PET-MR characteristics.
Preliminary studies on the effect of different datasets were carried out with different ROIs. However, the results do not show significant differences between different subsets (see sup. figure 1). From figure 3 it is possible to see that the RMSE did not change substantially for L2, L3 and L4. However, it changed drastically for lesion L1. It is generally safe to use the setting in table 1, which has led to the detection of all lesions in our study. If the study aim    is, however, a high level of quantitative accuracy for small lesions, we recommend using either σ dp = 0.5 or σ p = 0.3. Both these values will have the effect of increasing the noise but making the quantification, even for small regions, very accurate. Although large values of N have been shown to slightly improve noise suppression, they also make bias higher and the computation slower. In addition, the same noise suppression is achievable by using higher sigma values.
The σ m and σ p parameters reflect the relative contribution of MR and PET. When the data are very noisy, it is better to use higher the σ p PET contribution to obtain a smoother image, however this will gradually reduce the quantification accuracy of the proposed method. Figures 4 and 5 show the comparison between the different algorithms and for all the ROIs in terms of quantification and image quality. Also, figure 4 shows that the HKEM method outperforms all the other techniques for the lesion L2 in terms of bias while achieves similar CoV as  OSEM+G, and that when the MR features are similar to the PET distribution, all MR-based methods perform well, while they can be at least 5% worse than the HKEM when MR has not related contrast on the corresponding area. Such result describes the reliability of the proposed method even when no information about the lesion is included in the MR image.
The L1 and L4 regions are informative showing that for quantification of lesions smaller than the voxel neighbourhood a smaller σ p is required, and that quantification in a uniform region is not affected by the presence of MR tissue-borders, as the bias is similar for all the algorithms.
The reason behind the fact that HKEM shows significant improvement for L1 compared to KEM, but only minor improvement for the other ROIs, lays in the size of the lesion. In fact, KEM tends to over-smooth PET unique features and the effect is stronger if the lesion size is smaller than the neighbourhood. Whereas L2 is big enough not to be completely suppressed but only slightly over-smoothed. For L3, KEM achieved similar bias but smaller CoV. This is related to the fact the the MR image contains this information and the detailed are preserved. Finally, L4 does not show particular difference because that region is uniform in the PET image.
OSEM shows smaller bias and CoV for the lesion L1. Although it can be seen that the OSEM image is noisier than the HKEM image, the CoV for the lesion is smaller. This is due to the fact that HKEM slightly smooths L1, therefore the outer voxels of the lesion have lower values, making the mean decrease and CoV increase. On the other hand, defining the ROI on the image reconstructed with OSEM would be very difficult without the reference image.
In the supplementary material the investigation with the Jaszczak phantom is reported to show the comparison when the PET and the MR images share the same contrast information (stacks.iop.org/IP/35/044001/mmedia). The phantom investigation shows that for the 3600 s acquisition all the MR-driven methods show higher CNR with the HKEM having the highest CNR; The study with ten 5 s sub-samples was carried out in terms of bias and CRC using the 3600 s image as a reference. Sup. figure 4 shows that even though the number of events is reduced by a factor of about 600, around 70% CRC was obtained with the proposed method and around 69% for KEM and PLS while lower values were achieved with the other methods. It is important to notice that after few iterations HKEM has also higher CoV making the measure noisier than KEM at late iterations.
Looking at the top row of sup. figure 6, it is possible to see that the kernel methods represents the best compromise between edge preservation and noise suppression when PET and MR images provide the same information. From the bottom row it is possible to appreciate the good performance of the kernel methods in reducing the noise while keeping a good definition of the cold rods.
In table 2 we can see that both kernelized methods are slower than OSEM. This is due to the fact that more operations are needed voxel by voxel and iteration after iteration.
Patient investigations were carried out to give two different examples of applications for the proposed method [33,41,56]. Regarding the [ 18 F]FDG patient experiment, for the 5400 s acquisition, CNR was calculated on the defined ROIs and compared all the studied techniques. Figure 6 shows higher CNR for HKEM, followed by the KEM and PLS. The low-count case consisted of 10 consecutive frames of around 30 s starting from 600 s post injection. Looking at the plots iteration after iteration, it can be seen that figure 7 shows higher CRC except for the first iteration for HKEM and figure 8 shows an improved bias-less negative than all the other algorithms, and less negative after the second iteration compared to KEM. On the other hand, KEM and OSEM+G show better noise suppression, although the latter shows bias values up to 30%. The top row of figure 9 shows the image quality for the 5400 s acquisition with [ 18 F]FDG. It is worth noting the improved carotid local resolution and contrast around the area of the carotid for PLS, KEM and HKEM. The kernel methods show good noise suppression, and good definition of the PET unique features. OSEM+G shows slightly better noise suppression than HKEM but at the cost of contrast. From the bottom row of figure 9 it is possible to see how the kernel method is able to suppress noise while keeping a satisfactory level of contrast in the location of the suspected carotid lesions even at low-count data.
Although several parameter settings were investigated for PLS, we were not able to obtain a better image than the one shown in the bottom row of figures 9 and 13. This is probably due to the fact that this technique is not based on a local neighborhood, which is more efficient at tackling noise.
The results from the [ 18 F]NaF patient study in figures 10-13, are consistent with the findings for the [ 18 F]FDG case and the phantom, showing that HKEM provides higher or similar values of CNR, CRC and bias compared to KEM and higher than the other algorithms, at least at the early iteration. It can be notice that there is a difference between the simulation, the long acquisition and the short acquisition. The difference between the parameter settings in the simulation and the real datasets might be due to the size of the lesion and the voxel size. In fact, the voxel size in the simulation is almost double the voxel size of the real data and the neighbourhood is about eight times bigger making it easier to over-smooth small lesion such as L1 and L2. The difference between long acquisition and short acquisition instead is related to the different noise level.
Of particular interest is the investigation of the bias for the low-count datasets. In fact, lowcount imaging is required in all those application were the estimation of the image-derived input function (IDIF) is needed [48,54,57,58,72]. The accuracy of the IDIF is crucial for an accurate kinetic analysis. Another context where low-count condition can affect imaging is the injection of lower amounts of radioactivity. This is of interest because it would have the potential to allow the examination to more patients [10,19,35]. It has previously been shown that OSEM and penalized algorithms show negative bias under low-count circumstances [1,12,13,36,70]. Positive bias is associated with the frequently observed positivity constraint in iterative reconstruction algorithms. This is because the average value without negative values is shifted to a higher value, therefore the bias is more positive. When the number of events is very small there can be trapping of iterative estimates at zero, making the average values lower thus introducing negative bias in those regions [31]. The results of our investigation show consistent results with the investigations cited above. The negative bias here is due to the the fact that each low-count frame is negatively biased making also the sum negatively biased compared to the reference image. The results of this study show that the kernel method, and also Bayesian anatomical priors, can reduce this effect in all datasets. Furthermore, when the MR information is combined with the available PET information, this reduction is even stronger, making it feasible to produce less biased images even at lower injected doses. This is due to the fact that each voxel is correlated with its neighbors and although one can have zero value, some of the neighbors will have non-zero values, and this will result in an increased voxel value. For the PLS algorithm no neighborhood is used, however, the gradients are calculated between pairs of neighboring voxels in all directions. When the MR features over-smooth the unique PET features, the PET kernel preserves the signal making our proposed method more reliable while providing quality image which is as good as the KEM.
Nevertheless, depending on the noise level of the data it can be seen that in some cases the CoV increases iteration after iteration for HKEM whilst it increases less in KEM. This effect is inherited from OSEM as the kernel depends on the PET image iteration after iteration. For this reason, the results show a better performance at early iterations for the hybrid method thus making it suitable for clinical investigations where low numbers of iterations are preferred [1,6,33,62].
Because of the normalisation of the feature vector from equations (16) and (17), which avoids the dependence from the image scale, HKEM and KEM provide stability in terms of parameter optimisation, as the same reconstruction settings demonstrated similar performance across three different applications and datasets at similar count levels. Although a limited degree of tuning may improve the results, it would be better for reproducibility to use σ p between 1 and 2 (according to how much noise the data contains) and σ dp = σ dm between 1 and 5. All the other parameters should be set as suggested in table 1. The effect of misalignment between PET and MR [14], and the application of the method to the calculation of the arterial image-derived input function were recently investigated [11,15]. Also, a possible future invest igation is to study how different MR sequences can affect the reconstruction, and in particular the effect of using more than one sequences or even different modalities images to derive the kernel.

Conclusion
In this work, a novel LM hybrid kernelised reconstruction method which takes into account the PET features from the iterative process was proposed. The aim was to improve accuracy in those cases where MR and PET local information are only partially equivalent or completely different. In addition, special emphasis was placed to the reduction of negative bias on lowcount circumstances. The proposed reconstruction method offers stable results across varied datasets. The performance for low-count data, obtained with short acquisition times, makes this approach particularly useful in dynamic PET-MR studies, as very short frame are usually used. The proposed algorithm outperformed the other techniques in terms of CNR, CRC and bias, at fixed iteration number, in regions of high focal uptake, such as suspected carotid plaque lesions.