Accelerated image reconstruction by a combined dual-matrix dual-voxel approach

Today, versatile emission computed tomography (VECTor) technology using dedicated high-energy collimation enables simultaneous positron emission tomography (PET) and single photon emission computed tomography (SPECT) down to 0.6 mm and 0.4 mm resolution in mice, respectively. We recently showed that for optimal resolution and quantitative accuracy of PET images the long tails of the 511 keV point spread functions (PSFs) need to be fully modelled during image reconstruction. This, however, leads to very time consuming reconstructions and thus significant acceleration in reconstruction speed is highly desirable. To this end we propose and validate a combined dual-matrix dual-voxel (DM-DV) approach: for the forward projection the slowly varying PSF tails are modelled on a three times rougher voxel grid than the central parts of the PSFs, while in the backprojection only parts of the PSF tails are included. DM-DV reconstruction is implemented in pixel-based ordered subsets expectation maximization (POSEM) and in a recently proposed accelerated pixel-based similarity-regulated ordered subsets expectation maximization (SROSEM). Both a visual assessment and a quantitative contrast-noise analysis confirm that images of a hot-rod phantom are practically identical when reconstructed with standard POSEM, DM-DV-POSEM or DM-DV-SROSEM. However, compared to POSEM, DM-DV-POSEM can reach the same contrast 5.0 times faster, while with DM-DV-SROSEM this acceleration factor increases to 11.5. Furthermore, mouse cardiac and bone images reconstructed with DM-DV-SROSEM are visually almost indistinguishable from POSEM reconstructed images but typically need an order of magnitude less reconstruction time.


Introduction
Today, small animal positron emission tomography (PET) and single photon emission computed tomography (SPECT) are key for discovering disease mechanisms, developing pharmaceuticals, radionuclide therapies and novel diagnostic radiotracers (Weissleder et al 2010). Preclinical SPECT commonly uses multi-pinhole collimation to image the distribution of single gamma emitting tracers and reaches uniform resolutions of 0.25 mm in vivo in the entire mouse (Ivashchenko et al 2015) and 0.12 mm ex vivo using a tissue sample collimator (Nguyen et al 2019). In contrast, the majority of PET scanners are based on electronically collimated coincidence imaging of 511 keV annihilation photon pairs. At the moment the best commercial coincidence PET scanners have a resolution down to 0.9 mm (Lecomte et al 1994, Cherry et al 1997, Ziemons et al 2005, Surti et al 2005, Visser et al 2009, Zhang et al 2011, Szanda et al 2011, Sanchez et al 2012, Herrmann et al 2013, Nagy et al 2013, Sato et al 2016, Krishnamoorthy et al 2018, while a prototype coincidence PET scanner with a central resolution of 0.6 mm has been reported (Yang et al 2016). We have taken a different approach by developing versatile emission computed tomography (VECTor) (MILabs BV, element for such neighboring voxels may still enable to calculate an accurate projection. This boils down to using larger voxels for that part of the system matrix that contains slowly varying components while for quickly varying components a smaller voxel size is used. In this paper we will combine this DV approach with a dual-matrix (DM) approach (Zeng and Gullberg 1992, 2000, Welch and Gullberg 1997, Kamphuis et al 1998, Kadrmas et al 1998, Beekman et al 2002 in which a much smaller system matrix is used for the backprojection step of image reconstruction than for forward projection. We integrated this DM-DV approach in a pixel-based ordered subsets expectation maximization (POSEM) (Branderhorst et al 2010) algorithm as well as in a pixel-based similarity-regulated OSEM (SROSEM) algorithm. The latter was recently developed in our group to accelerate image reconstruction by safely allowing the use of more subsets than with standard POSEM (Vaissier et al 2016). We apply DM-DV reconstruction to phantom images and animal scans to investigate how much image reconstruction can be accelerated without loss of image quality.

VECTor PET-SPECT system
The mouse VECTor collimator (figure 1) used here contains 162 clustered pinholes, each 0.75 mm in diameter. The system used was a VECTor/CT system (MILabs B.V. Utrecht, The Netherlands) that applies a triangular detector set-up with three PMT-based large-area NaI(Tl) gamma detectors. The maximum count yield of a clustered pinhole collimator is in the central field of view (FOV) and therefore a special graphical user interface is used that allows to focus on parts of the body (Branderhorst et al 2011). Larger regions up to whole body scans are done by stepping the animal through the central focus volume (Vastenhouw and Beekman 2007). In this paper a spiral bed sequence was used for data acquisition (Vaissier et al 2012). More details on collimator and scanner design can be found in (Goorden et al 2013).

System matrix calculation
The system matrix for 511 keV gamma photons is calculated as described in . In this procedure a scanning 99m Tc point source measurement is used for calibration. From the point source measurement a number of geometrical parameters for VECTor's geometry are estimated; these parameters describe a translation and rotation of collimator tube and of the three gamma detectors with respect to the designed positions and orientations. After the geometry has thus been determined, a voxelized raytracing simulation is employed to calculate system matrix elements. The voxelized raytracer was developed in-house and is described in detail in (Wang et al 2017). It uses a voxelized model of collimator and detector (0.1 mm cubic voxels in this case). To each of the voxels an attenuation coefficient is assigned and photon transport through the voxelized volume is calculated. In this simulation, gamma photon attenuation in the collimator is included, but scatter is ignored. Variable depth-of-interaction in the gamma detector is also included by modelling the varying interaction position in the scintillation crystal of the gamma detector using Beer's law. Additional detector non-idealities are incorporated by applying Gaussian detector blurring with 3.5 mm full-width-at-half-maximum (FWHM). Note that this system matrix calculation is different from the method described in (van der Have et al 2008) to obtain the SPECT system matrix, although the same point source measurement is used.
Storing system matrix elements for all possible voxel-detector pixel combinations is not feasible as this would result in about 10 12 system matrix elements for VECTor for a voxel size of 0.4 mm (the typical choice for high-resolution PET images). Therefore, we implemented a cut-off C in the raytracing code with a value between 0 and 1. Only gamma photon paths with a probability < C · 100% to pass through the collimator are incorporated into the system matrix. In  it was shown that the above-described method for calculating the system matrix yields accurate high-resolution images if a cut off as low as 0.01 is used. Therefore, throughout the main part of the paper a value of C of 0.01 was applied (while in the appendix some results for C of 0.05 are provided as well).

Image reconstruction
In this work different image reconstruction algorithms are tested and compared in terms of speed and quality of reconstructed images. In all cases, the positron range was only included in the forward projection (which effectively results in a DM approach Gullberg 1992, Kamphuis et al 1998)). Photo peak windows were set to a width of 20% of the photo peak value. Scatter correction was performed using a triple energy window correction (King et al 1997) with side windows adjacent to the photo peak having a width of 80 keV. Reconstructions were performed both at 0.4 mm and 0.8 mm voxel size, as both sizes are used in practice. While for both voxel sizes reconstruction times are compared, all displayed images in this paper have 0.4 mm voxel size. As the gold standard reconstruction algorithm, we use POSEM (Hudson andLarkin 1994, Branderhorst et al 2010) with 32 subsets as this is the reconstruction method that was tested and widely applied in our previous works (Goorden et al 2013. POSEM is an ordered subset algorithm with pixel-based subset choice which leads to a better subset balance than the projection-view-based choice in standard OSEM and therefore allows for much larger acceleration factors over the maximum likelihood estimation method (MLEM) (Branderhorst et al 2010). Secondly, pixel-based SROSEM (Vaissier et al 2016) was tested. SROSEM is an accelerated image reconstruction algorithm that we recently proposed and that automatically and locally adapts the number of subsets used based on the similarity of update terms for different subsets. This way pixel-based SROSEM can safely use a much larger number of subsets than POSEM for high-count regions and therefore further accelerate image reconstruction. In (Vaissier et al 2016) SROSEM was validated for multi-pinhole SPECT, here we apply the algorithm to PET tracer imaging with VECTor. We set a maximum number of subsets of 128 and a similarity threshold value of 40%.
To further accelerate image reconstruction, we implemented a DV approach to accelerate the forward projection step while a DM approach was used to accelerate the backprojection. The DV approach is illustrated in figure 2(a). In this figure a PSF is shown with its maximum value denoted by PSF max . It can be seen in this image that the central part of the PSF is changing more rapidly than the tails of the PSF which are more slowly varying. The idea behind the DV approach we use is that in order to calculate an accurate forward projection, it may not be necessary to represent the tails of the PSF on the same fine voxel grid as the central part. To this end, a tail cut-off parameter C f (0 < C f < 1) is defined to divide the PSF into two parts; a 'central PSF' and a 'tails PSF' . Based on the fact that the central part contains the highest spatial frequencies while the tails are more slowly varying, we set a three times coarser voxel size for the tails PSF. We thus assume that the 'tails PSF' corresponding to neighboring voxels is quite similar and that the simulated forward projection is not very different if one uses the 'tails PSF' of a nearby voxel in the forward projection. During image reconstruction the forward projection is then assumed to be the sum of (i) the forward projection obtained by letting the central PSF act on the fine voxel activity distribution and, (ii) the forward projection of the tails PSF which acts on a activity distribution binned on the larger voxel size.
Secondly, for the backprojection step we use a matrix with a higher tail cut off C b than in the forward projection. Such a DM approach in which full photon transport is not taken into account in the backprojection step has been used previously and it has been shown to often lead to the same quality images as a standard reconstruction Gullberg 1992, Kamphuis et al 1998). Thus in the backprojection step of our algorithm, tails were not included.
We integrated the DM-DV approach both in the POSEM and the SROSEM algorithm with settings as described above. Furthermore, in this work we fixed parameters C f and C b to 0.05. This value of 0.05 is based on preliminary try-outs in our lab but not on a rigorous optimization study. Note that these cut off parameters are significantly larger than the value of the cutoff parameter C = 0.01 of the whole PSF (section 2.2).
Matrices with higher cut offs were obtained from the matrix with lowest cut off in the following way. First each matrix element in the lower cut off matrix (C = 0.01) was divided by the detector efficiency and by a geometrical factor that quantifies the probability of the gamma photon reaching that particular detector pixel if no collimator would be present. This division is done as both these factors were incorporated in the system matrix elements (see  for more details). The remaining value after division is a measure of the probability of transmission through the collimator and if it exceeds the higher cutoff value (i.e. 0.05) the matrix element is saved to the higher cut off matrix.
To obtain the tails and central PSF matrices we follow the following procedure. Let the original matrix element (including detector efficiency and geometrical factor) have a value of M while the remaining matrix element after division is m. If m > C f , then a matrix element of value · M is saved to the central PSF matrix, while the tails PSF matrix is considered to have a value of C f m · M (as also illustrated in figure 2(a)). On the other hand, if m < C f , the matrix element M is solely saved to the tails PSF matrix. However, an element is only saved to the tails PSF matrix if the fine voxel corresponding to this matrix element is the central voxel on the rougher voxel grid of the tails PSF matrix (so only for 1 out of 27 fine voxels do we save a matrix element to the tails matrix).
Note that the procedure is the same for every matrix element and the exact shape of the PSF is not taken into account, i.e. there is no special procedure for PSFs near the edge of the FOV which may have a very asymmetrical appearance. In the appendix of this paper (figure A1) we provide a few sample PSFs for pairs of neighboring voxels, one pair near the centre of the pinhole's FOV and another pair near its edge to get an idea of their appearance.
With this procedure we found that while matrices with a cut off of 1% contained on average 20 · 10 9 matrix elements, the 5% cut off matrix used in backprojection contained on average 4.2 · 10 9 elements and for the tail matrix this number decreased to 0.75·10 9 elements. The average is over different VECTor systems which may have slightly different system matrices.

Phantom studies 2.4.1. Hot-rod Derenzo phantom.
A hot-rod Derenzo resolution phantom was filled with a 54 MBq 18 F solution and scanned for 4 h. The phantom had six sectors each containing a set of equally sized capillaries with diameters of 0.7, 0.8, 0.9, 1.0, 1.2 and 1.5 mm. The distance between capillary centres was equal to twice the capillary diameter. For visualization a 0.5 mm FWHM 3D Gaussian filter was applied to reconstructed images.

Uniformly filled syringe.
A 12 ml syringe (15.9 mm inner diameter, 6 cm length) was filled with 65 MBq 18 F solution and scanned for 3.5 h. For visualization, images were filtered with a 0.8 mm FWHM 3D Gaussian.

Animal studies
Animal studies were carried out in accordance with the Dutch Law on Animal Experimentation and conducted according to protocols approved by the Animal Research Committee of the University Medical Center Utrecht. In order to judge better if even small deviations between applied algorithms appear, high count scans with relatively long acquisition times were selected for the present paper. These studies were previously used in our paper describing the POSEM reconstruction software .

Mouse cardiac scan.
A 29 g C57Bl6 mouse was anesthetized with isoflurane and subsequently injected intravenously with 24 MBq 18 F-deoxyglucose ( 18 F-FDG). A 30 min acquisition began 10 min post injection. For visualization, images were post-filtered with a 0.8 mm FWHM 3D Gaussian.

Mouse bone scan.
A 33 g C57Bl6 mouse was anesthetized with isoflurane and injected intravenously with 49 MBq 18 F-fluoride and 141 MBq 99m Tc-MDP (not shown here). A 60 min whole-body acquisition was performed, starting 30 min post injection. Reconstructed images were post-filtered with a 0.8 mm FWHM 3D Gaussian.

Analysis
Time required for image reconstruction does not solely depend on the time per iteration of the algorithm, but also on the number of iterations needed to converge to the desired image. We quantify the convergence speed in the phantom experiment by calculating small detail contrast which is closely related to resolution. As the 0.8 mm rods are the smallest details still clearly visible in the resolution phantom for the collimator used, we use these to determine speed of convergence. The details of how contrast is calculated are explained below. To compare reconstruction times of different algorithms, we compare the times the different algorithms take to achieve equal contrast. As a measure of image quality, we perform a visual assessment of differently reconstructed images at equal contrast and we compare their noise levels. Details about noise level calculations are also provided below.

Hot-rod Derenzo phantom analysis
On (unfiltered) resolution phantom images reconstructed on 0.4 mm voxels, a contrast-noise analysis similar to the one presented in (Walker et al 2014) was done; the images were resampled to a fine grid and regions-of-interest (ROIs) with a diameter of 0.9 times the rods sizes were then placed on and in-between the rods (see figure 3). ROI placement was repeated on subsequent 0.4 mm thick slices for a total number of 10 planes. For a certain sector, ifh is the mean activity in all ROIs placed on the rods andc the mean activity in the ROIs placed in between (averaged over all slices), then contrast is defined to be Furthermore, to represent variability in ROI mean values we calculated for each sector the standard deviations σ h and σ c of the activity in ROIs on and in between rods, respectively. The noise in that sector is then calculated as .

Uniformly filled syringe analysis
As was found in , in case the tails were not modelled properly in reconstruction, part of the activity was reconstructed outside the syringe. To quantify if this is also the case for our proposed DM-DV approach, we calculated the activity in a cylindrical annulus with 19 mm inner diameter and 25 mm outer diameter around the 15.9 mm diameter phantom (in the same way as in the previous paper). Activity reconstructed in this region outside the phantom was reported as percentage of total reconstructed activity (for unfiltered images).

Timing
Reconstruction times provided in this paper were all obtained on our 24-core computer (Intel Xeon Gold 6126 Processors, 2.60 GHz, 19.25 M Cache). For the resolution phantom, average time per iteration is the average over the first 10 iterations. For all other scans, total reconstruction times were provided. We provide these reconstruction times both for 0.8 mm and 0.4 mm voxel size.

Results
In figure 4, images of the hot-rod Derenzo phantom are shown, reconstructed with POSEM or SROSEM, either combined with a DM approach (DM-POSEM and DM-SROSEM, respectively) or with a combined DM-DV approach (DM-DV-POSEM and DM-DV-SROSEM, respectively). We show three rows of reconstructed images, with the first image in each row being POSEM with 30 (figure 4(a)), 60 (figure 4(b)) and 120 iterations ( figure 4(c)), respectively. The other images in the same row were selected to have equal contrast in the 0.8 mm rod sector as the contrast of the POSEM reconstruction. The number of iterations that were needed to obtain this contrast are shown underneath each image. Figure 4 shows that the number of iterations required to converge to a certain contrast depends on the algorithm used. Here two effects play a role. First of all, convergence of reconstructions is somewhat faster when the DM approach is used instead of a standard reconstruction that uses the same matrix in forward and backprojection. Such improved convergence properties for DM reconstruction were also reported in previous works, e.g. (Kamphuis et al 1998, Zeng andGullberg 2000). Secondly, the use of SROSEM instead of POSEM decreases the number of required iterations further. The DV approach does not affect convergence speed. This is further quantified by the plots in figure 5 which show contrast vs. iteration number in the resolution phantom for all rod sectors. For the 0.8 mm rod sector, we performed a least squares fit to these curves to determine an equivalent iteration number for all tested algorithms. From this fit, we found that DM-POSEM and DM-DV-POSEM reach the same contrast in n iterations as POSEM does in 1.3 · n iterations. Furthermore, contrast of SROSEM in n iterations is equal to that of 4.0 · (n − 1) POSEM iterations while DM-SROSEM and DM-DV-SROSEM have an equivalent iteration number of 5.0 · (n − 1). Note that the (n − 1) term in the equivalent iteration number of the SROSEM algorithms is due to the fact that the first iteration in SROSEM is a single MLEM iteration which leads to negligible contrast recovery compared to POSEM with 32 subsets but which is necessary to determine the correct update scheme (Vaissier et al 2016). Furthermore, as the maximum number of subsets in SROSEM is 128, a speed up of SROSEM of 4 compared to POSEM is as expected at least in high-count regions where the maximum number of subsets is automatically used. Thus in case the number of iterations is ≫ 1, SROSEM needs 4.0 times less iterations than POSEM while DM-DV-SROSEM requires 5.0 times less iterations. While these equivalent iteration numbers were obtained by fitting contrast curves for the 0.8 mm rod sector, figure 6 shows that they are very similar over different rod sectors; in this figure the same contrast curves as in figure 5 are plotted, but now against equivalent iteration number as defined by the formulas above. As a result, all curves fall on top of each other, indicating that convergence is accelerated similarly for different rod sizes.
Besides requiring less iterations to reach equal contrast, the DM-DV approach results in extra speed up as the time per iteration is reduced, both in the backprojection step (due to the DM approach) as in the forward projection step (because of the DV approach). The time required for one iteration is listed in table 1 for each of the algorithms. While in POSEM, a single iteration at 0.4 mm voxel size takes 617 s for the hot-rod phantom, with DM-DV-POSEM or DM-DV-SROSEM this time reduces to 161 s and 268 s, a 3.8 and 2.3 times reduction, respectively. The maximum acceleration factors over POSEM of all algorithms are provided in the final columns of table 1. These numbers are valid for the hot-rod phantom and for large numbers of iterations (n ≫ 1) for which the first SROSEM iteration is negligible time-wise. From the table, it can be inferred that a large speed-up over POSEM can be attained; for 0.4 mm voxel size with DM-DV-POSEM, the  Table 1. Overview of algorithms tested. For each algorithm, the equivalent iteration number compared to POSEM is provided (i.e. the number of POSEM iterations that would be required to reach the same contrast as reached after n iterations of the tested algorithm). Furthermore, the time per iteration (for the hot-rod phantom) is provided as well as the acceleration (in case n ≫ 1), both for 0.4 mm and 0.8 mm voxel size.

Time per iteration
Acceleration ( same contrast can be reached 5.0 times faster than with POSEM while with DM-DV-SROSEM the speed-up increases to 11.5. For the larger 0.8 mm voxel size, acceleration factors are somewhat reduced, probably because the overhead time is relatively more important for shorter reconstruction times. After having established that the proposed algorithms accelerate image reconstruction, it is essential to test if reconstructed images with accelerated algorithms have the same quality as with standard POSEM. First of all, a visual assessment of reconstructed images in figure 4 does not reveal any visible difference for the hot-rod phantom images on the same row, meaning that images with equal contrast are visually indistinguishable. This is further confirmed by the profiles through the 0.8 mm and 1.0 mm rod sectors shown in figure 7. These profiles were taken from all images in figure 4(c) (i.e. for images with iteration number equivalent to 120 POSEM iterations). The figure shows that profiles are indistinguishable for different algorithms. To quantify similarity of images further, contrast as function of noise was plotted in figure 8, where the solid black line is the curve representing POSEM, while the symbols denote the other  tested algorithms. These figures show that images at equal contrast are very similar in terms of noise, at least once images are close to convergence (there are deviations in the earlier iterations).
Finally, figure 9 shows reconstructions of the uniformly filled syringe as well as several mouse studies reconstructed with POSEM and DM-DV-SROSEM (the fasted algorithm). Profiles through the images are provided as well for a more quantitative comparison. Images obtained with different algorithms have equivalent iteration number (determined by the equations in table 1). The total reconstruction time is provided below each image (both for 0.4 mm and 0.8 mm voxel size). These images as well as the image profiles reveal only small differences between different images; these small differences are sometimes visible between profiles and in the cardiac images the amount of activity in the apex is slightly different between OSEM and DM-DV-SROSEM. For the uniform phantom, we found that activity reconstructed outside the As accelerated algorithms all use a cut off value of 0.05 (to distinguish tails from central part of the PSF), one may wonder if simply using a cut off of 0.05 would provide sufficient image quality. In  we already investigated different cut offs and concluded that image quality was significantly improved by using a 0.01 cut off instead of 0.05 or higher. Therefore C = 0.01 was used as the gold standard here. This is also confirmed by the images we show in the appendix of this paper in which we compared results from this paper to POSEM and SROSEM reconstructed images with 0.05 cut off (denoted by POSEM(5%) and SROSEM(5%)). For the hot-rod phantom, using C = 0.05 leads to differences in profiles and negatively impacts contrast-noise characteristics as shown in figures A3 and A4 which are equivalent to figures 7 and 8 but with POSEM(5%) and SROSEM(5%) added. Visual differences are less clear in hot-rod images themselves (figure A2). Furthermore, in accordance with our previous work, using POSEM(5%) or SROSEM(5%) leads to increased background levels, especially for the uniform phantom and whole body mouse scan ( figure A5). This is also confirmed quantitatively; for POSEM(5%) and SROSEM(5%), the respective amounts of activity reconstructed outside the phantom were 8.3% and 8.4%, approximately two times higher than the values reported for POSEM with a cut off value of 0.01 or for DM-DV-SROSEM.

Discussion
We presented a new approach for accelerated image reconstruction for VECTor-PET in which we combined an accelerated algorithm (pixel-based SROSEM) and a DM approach with a new DV method, which is particularly well-suited for long-tail PSFs as encountered in VECTor. With pixel-based DM-DV-SROSEM, we typically attained an order of magnitude shorter reconstruction times than with standard POSEM without any visible loss in image quality. Two factors play a role in this acceleration; first of all, both the use of SROSEM instead of POSEM as well as the use of a DM method accelerate image reconstruction as less iterations are required to reach the same level of contrast (by approximately a factor 5.0 for large numbers of iterations). Secondly, the time per iteration is reduced; for the backprojection step this reduction is accomplished by using a smaller matrix with higher tail cut off, while for the forward projection step a rougher voxel size is used for those parts of the system matrix where probabilities are more slowly varying. Together this lead to an acceleration factor of 11.5 for the hot-rod phantom at 0.4 mm voxel size. For the animal images in some cases (e.g. the cardiac scan) somewhat lower acceleration factors were attained, partly due to the lower number of iterations required meaning that the first SROSEM iteration starts to play a larger role in total reconstruction time. Acceleration factors were also reduced for the smaller 0.8 mm voxel size, probably because in this case overhead time in the reconstruction software started to play a relatively larger role as the time spent on the projection and backprojection step quickly reduces as voxel size increases.
Comparison of images was done based on visual assessment, comparison of image profiles and a more quantitative contrast-noise analysis for the phantoms. Note that images were post-filtered with a Gaussian and image profiles were extracted from the filtered images. The amount of filtering was based on a visual assessment. Quantitative comparison was based on unfiltered images.
The DM approach that was applied here uses a backprojector with higher tail cut off than the forward projector, thus projector and backprojector in the reconstruction are unmatched. Such unmatched projector/backprojector pairs in which not all physics (e.g. scatter, attenuation, full system response) is included in the backprojector is usually applied to speed up reconstruction. In (Zeng and Gullberg 2000) a criterion for a valid pair was developed in terms of convergence, but authors also noted that for practical image reconstruction in which noise is present the use of a valid backprojector may not be critical. Here we take a heuristic approach in which we validate our DM approach by comparing several reconstructed images.
Reconstruction times reported in this paper were all obtained on the same 24-core computer that was available to us for an extended time (in order to perform all timings on the same work station). Of course, total reconstruction times could be further improved by using faster computers or computers with more cores. We already observed an almost twofold decrease in reconstruction time on other newer computers in our group, but could not perform our timings on these faster computers as they were heavily used by other researchers.
The amount of acceleration that is achieved depends on the choice of cut off parameters C f and C b . As mentioned, we chose these parameters based on preliminary try-outs, but did not perform any rigorous optimization. It may therefore be possible that further acceleration could be achieved if these parameters would further be optimized and higher values of these cut offs would be used. Another parameter that plays a role is the voxel size on which the tails are stored, which is at the moment three times rougher than the voxel size of the main part of the PSF. Note however that at the moment the 'tail matrix' has already more than five times fewer elements than the other matrices used in the DM-DV-SROSEM reconstructions (see section 2.3). Therefore, using an even rougher voxel size for the tails would probably not lead to much more acceleration.
Although in this paper we focused on image reconstruction of PET tracer distributions (all studies used 18 F), we believe that this method is also very well suited to accelerate reconstruction for SPECT isotopes that emit high-energy gammas. It has already been shown that the clustered pinhole technique in VECTor is well-suited to image these isotopes, (e.g. de Swart et al 2016 for 213 Bi, van der Have et al 2016 for 131 I) and for their reconstruction long PSF tails need to be included as well. As the use of isotopes that also emit high-energy gammas is emerging (Herrmann et al 2017), mainly for theranostic applications, accurate and fast reconstruction methods can have many applications.

Conclusion
Compared to POSEM, DM-DV-SROSEM accelerates reconstruction by an order of magnitude while obtaining similar images. DM-DV-SROSEM thus enables accurate VECTor-PET image reconstruction by including photon transport in the PSF tails without inhibitive long reconstruction times.