Resolution modeling in projection space using a factorized multi-block detector response function for PET image reconstruction

Positron emission tomography (PET) images usually suffer from limited resolution and statistical uncertainties. However, a technique known as resolution modeling (RM) can be used to improve image quality by accurately modeling the system’s detection process within the iterative reconstruction. In this study, we present an accurate RM method in projection space based on a simulated multi-block detector response function (DRF) and evaluate it on the Siemens hybrid MR-BrainPET system. The DRF is obtained using GATE simulations that consider nearly all the possible annihilation photons from the field-of-view (FOV). Intrinsically, the multi-block DRF allows the block crosstalk to be modeled. The RM blurring kernel is further generated by factorizing the blurring matrix of one line-of-response (LOR) into two independent detector responses, which can then be addressed with the DRF. Such a kernel is shift-variant in 4D projection space without any distance or angle compression, and is integrated into the image reconstruction for the BrainPET insert with single instruction multiple data (SIMD) and multi-thread support. Evaluation of simulations and measured data demonstrate that the reconstruction with RM yields significantly improved resolutions and reduced mean squared error (MSE) values at different locations of the FOV, compared with reconstruction without RM. Furthermore, the shift-variant RM kernel models the varying blurring intensity for different LORs due to the depth-of-interaction (DOI) dependencies, thus avoiding severe edge artifacts in the images. Additionally, compared to RM in single-block mode, the multi-block mode shows significantly improved resolution and edge recovery at locations beyond 10 cm from the center of BrainPET insert in the transverse plane. However, the differences have been observed to be low for patient data between single-block and multi-block mode RM, due to the brain size and location as well as the geometry of the BrainPET insert. In conclusion, the RM method proposed in this study can yield better reconstructed images in terms of resolution and MSE value, compared to conventional reconstruction without RM.


H Xu et al
uncertainties compared with other imaging modalities, e.g. magnetic resonance imaging (MRI) and computed tomography (CT).
Although spatial resolution can be improved by reducing the crystal pitch of the scanner detector, this increases the complexity and cost of the system. Additionally, the raise in the number of detector crystals reduces the detected coincidences per detector pair, escalating the statistical uncertainties and making the reconstruction problem more ill-conditioned. Image quality can be improved by extending the scanning time, but there are limitations for dynamic PET on short time scales. Moreover, longer scanning time leads to patient discomfort and increases the potential for patient movement. As an alternative, post-smoothing is usually applied to reduce the noise. However, this reduces the resolution performance and image contrast accordingly.
It has been demonstrated that accurate modeling of the system's detection process within the iterative reconstruction framework could improve the image resolution and suppress the noise (Sureau et al 2008, Rahmim et al 2013. The so-called system matrix in the iterative reconstruction allows the phenomena degrading the resolution to be considered. A full system matrix, modeled using Monte Carlo (MC) simulation, has been established (Rafecas et al 2004, Rannou and Chatziioannou 2004, Leroux et al 2007, Aguiar et al 2010, Zhang et al 2010. However, MC simulations are still very challenging in terms of the computing resources required to achieve reasonable high counting statistics for clinical scanners. Therefore, this method is predominantly used for animal scanners, which usually have a limited number of lines-of-response (LOR).
Alternatively, it is more efficient to factorize the whole system matrix into several different components and model each component separately (Mumcuoglu et al 1997, Qi et al 1998, Rahmim et al 2004. A profound factorization of system matrix consists of at least three submatrices-one matrix defining the geometrical relationship between the image space and projection space; one matrix modeling all the possible physical effects corre sponding to detection processes in projection space, e.g. attenuation, normalization, detector blurring effects related to inter-crystal scattering and penetration; and one matrix modeling the blurring effects in the image space, e.g. the positron range. The modeling of the blurring phenomena in image and projection space is often referred as resolution modeling (RM) or point-spread-function (PSF) modeling (Rahmim et al 2013, Ashrafinia et al 2017. With respect to RM simplification, it is reported that exclusive modeling in image space can effectively mimic part of blurring effects in the projection space, resulting in improved resolution and noise performance (Reader et al 2003, Sureau et al 2008. This method requires less computation resources and it is easier to implement compared with modeling in projection space, as the number of LORs is usually much larger than the number of image voxels for 3D reconstruction. Thus, this modeling is most commonly used for PET. However, RM in image space cannot compensate all the resolution loss due to the blurring effects in the projection space, e.g. the angular-dependent inter-crystal blurring effects (Stute et al 2011). Furthermore, for isotopes with a small positron range, e.g. 18 F, the blurring effects in image space are not as pronounced as in projection space (Bai et al 2004). In this situation, the image blurring matrix can be ignored and focus is given to the more precise RM in projection space. In this paper, we investigate the modeling of the inter-crystal scattering and penetration which are the dominant physical effects in projection space.
Regardless of whether the RM is carried out in projection space or image space, there are three major methods to generate the blurring submatrices, i.e. (1) analytical calculations (Mumcuoglu et al 1997, Yamaya et al 2005, Rahmim et al 2008, (2) empirically measurements (Panin et al 2006, Gong et al 2017 and (3) MC simulations (Qi et al 1998, Alessio et al 2006, Stute et al 2011, Cecchetti et al 2013, Iriarte et al 2016. Full analytical modeling is rather difficult for 3D reconstruction due to the complexity of various physical processes. Empirical measurements normally employ point sources or line sources to measure the system response. Therefore, it is reported to be capable of considering nearly all the physical effects both in image and projection space. However, it is impossible to measure the responses at all the positions on reasonable time scale (Panin et al 2006) and sampling strategies with additional interpolations have to be used. This introduces systematic bias into the RM model and increases the implementation efforts. Moreover, extra devices are required to locate the radiation source at different positions inside the field-of-view (FOV), reducing the flexibility of this method. Recent developments of PET simulation frameworks, e.g. GEANT (Agostinelli et al 2003) or GATE (Jan et al 2004), allow simulation of the detector system in a realistic way by considering all relevant physical processes. Meanwhile, the improvements of computation technology make the MC simulations of high counting statistics available within reasonable time. Therefore, it is the preference of our study to use simulation approaches.
Direct simulations of the RM blurring kernel for each LOR requires sophisticated radiation source configurations and vast calculation resources to achieve sufficient counting statistics. Therefore, some publications have reported methods of factorizing the LOR responses into two detector block responses (Stute et al 2011, Cecchetti et al 2013. However, only a single-block model was investigated. To date, no publication has taken inter-block scatter or penetration into consideration for RM using detector response factorization. In this work, we have developed a multi-block detector response function (DRF) using GATE simulations and evaluated it on the clinical Siemens hybrid MR-BrainPET scanner. Furthermore, the RM blurring kernel in projection space is factorized into two independent detector responses from the multi-block model, taking the inter-block scatter and penetration into account. Specific evaluations of the RM using the multi-block or single-block DRF for both simulation data and patient data were also conducted.

BrainPET configuration
The Siemens hybrid MR-BrainPET system , Caldeira et al 2018 has an inner ring diameter of 36 cm with a transaxial FOV of 32 cm and an axial FOV of 19.25 cm. The system consists of 32 modules and each module has six detector blocks. Every detector block has 12 × 12 crystals with a pixel pitch of (2.5 × 2.5) mm 2 and a length of 20 mm. It can be seen in figures 1(a) and (b), the system holds an axial block shift invariance and 32-fold block rotation symmetry. (Jan et al 2011) was used as the simulation framework for this study. All the simulations were configured to include the physical processes of photoelectric effect, Compton scattering, Rayleigh scattering, Bremsstrahlung, electron ionization and positron annihilation. The creation and propagation of scintillation photons was not included. Only the deposited energy of the high-energetic photons was studied. Therefore, effects like optical crosstalk or electronic noise were not included.

GATE simulation for multi-block DRF
After simulating a cylindrical source covering the FOV with no phantom material, we observed that the γ-ray detections predominantly occur inside every 3 × 3 block matrices (figure 1(b)) for the BrainPET insert, reaching an average ratio of 99.97%. Therefore, a DRF using a 9-block model, shown in figures 1(b) and (c), is enough to cover nearly all possible cases of single γ-ray detections. More specifically, the 9-block model consists of three modules with each module consisting of three blocks along the axial direction.
In general, if the cylindrical PET system has the properties of the axial block shift invariance and block rotation symmetry, the applied 9-block model can be regarded as an elementary structure. The detection of any single γ-ray photon (annihilation photon) from the FOV can be described by such an elementary unit. In addition, if the detector responses of all the neighboring blocks are ignored, the multi-block model works as a single-block model.
In order to obtain the multi-block DRF, a large plane with single γ-ray sources (figure 1(c) green surface) was simulated to illuminate the 9-block detector model. The orientations and positions of all the incident γ-rays, as well as the photon detection processes, were recorded in GATE using Tracker Detector mode with the same energy window as the measurements (400-650 keV).

Multi-block DRF design and generation
For the BrainPET insert, each block was configured with its own local BCS. The origin of the BCS was the center of each block's front surface. The X axis of BCS was aligned with the Z axis of the PET system, while the Z axis of the BCS was always perpendicular to the block front surface, as shown in figure 1(d). With this configuration, each γ-ray inside the PET system can be transformed into the BCS of any block.
In order to address all cases of possible γ-rays, any incident ray was uniquely defined with four parameters (x, y, θ, φ) in the BCS of the center block. As shown in figure 1(e), (x, y) denotes the intersection point with the block's front surface. θ is the polar angle and φ is the azimuth angle with respect to the Z axis of the BCS. Subsequently, the detection responses of the γ-rays were extracted from the GATE simulations and sorted into a lookup table (LUT) according to each γ-ray parameter (x, y, θ, φ). A sketch of one γ-ray and its detector response is shown in figure 1(h).
An accurate DRF LUT requires proper sampling of these four parameters. Each sampling bin represents a bunch of γ-rays. The detector response for every sampled bin (x i , y i , θ i , φ i ) comprises a list of fired crystal IDs and their detection probabilities. In this work, parameters x and y were sampled equidistantly in the range of [−20, 20] mm with 80 bins separately, where the block's front surface has a size of 30 × 30 mm 2 , as shown in figure 1(e). Therefore, this configuration enables consideration of the incident rays from the gaps in the multi-block model.
Parameter θ was sampled within the range of [0, 3π 7 ] with 135 bins. However, φ could not be sampled at a fixed number because (θ, φ) represents a direction vector which forms a sphere surface. Fixed sampling bins for φ will lead to oversampling in the polar point of the sphere. Therefore, φ was sampled equidistantly for each θ to ensure approximately the same arc length, as shown in figure 1(f). If there are N samples for the equator of the sphere and N θi samples for θ i , then the requirement of same arc length would result in equation (1).
In this application, the equator had 133 sampling bins and hence the polar point had only one. Moreover, the 9-block model had X symmetry and Y symmetry. However, the XY symmetry 10 is broken due to the oblique orientation of the blocks in BCS. Therefore, the responses from the symmetrical γ-rays were merged to accumulate four times more counting statistics.

RM blurring kernel
2.2.1. RM blurring kernel generation RM in projection space requires a dedicated blurring kernel which is a convolution kernel for the projection space. It describes how the detection probability of each LOR is spread over the neighboring LORs. In this study, the kernel in projection space was developed by factorizing the LOR response into two detector responses addressed from the multi-block DRF LUT.
The RM kernel generation and its application in reconstruction were implemented within the in-house reconstruction toolkit, PRESTO (Scheins et al 2015). For projection data, PRESTO utilizes a simple data structure, known as the generic cylinder model (GCM) (Scheins et al 2011), instead of a classical sinogram. The GCM has a cylindrical shape with a defined number of virtual detectors without gaps. Any virtual detector in the GCM is addressed by two parameters (ring index, detector index), as shown in figure 2 for an exemplary geometry. Two virtual detectors define one LOR and can be indexed by four parameters (ring1, detector1, ring2, detector2), leading to a 4D projection space.
The GCM applied for the BrainPET insert has 77 rings with a radius of 194.96 mm and 480 virtual detectors for each ring. The transaxial and axial FOV of this GCM are configured to be 280 mm and 192.5 mm respectively. With this configuration, the LOR defined by two averaged physical interaction points inside the crystal pair (d) PET coordinate system is shown in blue and the block coordinate system (BCS) for one block is shown in black. The gray area represents a cylinder phantom positioned in the PET system iso-center. The X axis of the BCS is aligned with the Z axis of the PET system, while the Z axis of the BCS is always perpendicular to the block's front surface. (e) γ-ray definition in the BCS with four parameters (x, y, θ, φ). (x, y) denotes the intersection point with the block's front surface. θ is the polar angle and φ is the azimuth angle; (f) Angle parameter (θ, φ) sampling sketch, θ i and θ j correspond to different rings on the sphere and each φ corresponds to an arc on the ring. The sampling strategy in this study requires that each sampled arc has the approximately same length for every ring, which means the two red arcs for φ 1 and φ 2 will have the similar length. (g) Sketch of the detector block with 12 × 12 crystals for BrainPET. (h) Sketch of one γ-ray example and its schematic diagram of detection response map in a single block.
(7 mm off the crystal inner surface) can have a one-to-one mapping with the LOR defined by two virtual detectors. This avoids any projection data compression (no span or mashing), and therefore, shares the same merits of LOR reconstruction (Kadrmas 2004). The total number of LORs involved in the reconstruction was around 3.4 × 10 8 for GCM, while the number of physical LORs was 1.9 × 10 8 . Such a GCM configuration to achieve oneto-one crystal mapping is not strictly required for RM generation, but it is recommended for obtaining better reconstruction performance.
The sketch of the RM blurring kernel generation for the GCM is shown in figure 3. Any virtual LOR according to the GCM was represented by a single LOR line. The line was then traced in the 3D system geometry and two blocks (m and n) with the first geometrical intersection were identified. Subsequently, the line was transformed into the BCS of the two blocks and four parameters (x α , y α , θ α , φ α ) of the incident ray were determined independently, denoted as α m and α n . Two detector responses, r αm and r αm , relating to α m and α n were addressed in the DRF LUT and were mapped back to physical detector crystals afterwards. In this way, the LOR response of the system is factorized into two detector responses from the multi-block DRF. The system response regarding the traced LOR line is obtained by equation (2).
Here r αm and r αm are vectors with the dimension of (1 × V ), where V is the number of physical detectors. Accordingly, B mn is the final blurring matrix with a dimension of (V × V ) for one given LOR, representing how the detection probability of one virtual LOR spreads over the neighboring LORs. These physical LORs were then mapped to virtual LORs in the GCM. Obviously kernel matrix B is rather sparse and only the non-zero values were stored in the LUT.
During the calculation process, it was necessary to cut off tiny values with a threshold for kernel matrix B, due to their limited contribution to the blurring. Furthermore, each virtual detector pair could be traced with several independent rays and their responses were averaged to increase the kernel accuracy. RM blurring kernel compression was achieved by applying the 32-fold rotation symmetry of the BrainPET insert, and no kernel was calculated for LORs outside the FOV. A four-ring GCM with 52 virtual detectors per ring is shown as an example. Each virtual detector can be specified with one ring index and one detector index in the ring. Each LOR in the GCM is addressed by four parameters (ring1, detector1, ring2, detector2), forming a 4D projection space for image reconstruction.
As seen in the RM kernel generation pipeline (figure 3), the GCM also treats the physical gaps in the scanner as virtual detectors, allowing the gaps to be taken into consideration for the RM blurring kernel. The reason for modeling the gap in the RM is that the physical detector crystals or blocks have a certain length in the radial direction (20 mm for the BrainPET insert), and the virtual gap LORs, which have no corresponding partner of a physical LOR according to the gaps of the real system, might have intersections with the real detector blocks, especially for the off-center cases. This indicates that virtual gap LORs in the GCM would also contribute to the projection blurring operation during the forward projection.

RM blurring kernel profiling
The RM blurring kernel is shift-variant in projection space, because virtual LORs with different incident angles or positions from the GCM will be mapped with different detector responses from the DRF, as seen in figure 3.
For each virtual LOR and its blurring kernel matrix, the blurring intensity can be defined by equation (3).
Where P primary is the weight in the kernel matrix for the investigated LOR, and P total is the sum of all the weights in the kernel matrix, which also corresponds to the total detection probability of this LOR. In this way, B intensity ranges from 0 to 1. B intensity = 0 corresponds to no blurring, while B intensity = 1 demonstrates the maximum blurring. Exceptionally, P total = 0 indicates that the detection probability of this virtual LOR is 0, which corresponds to some gap LORs of the GCM. In this case, the blurring intensity was specifically set to 1.

Image reconstruction with RM
The PRESTO toolkit was used for image reconstruction, and the blurring kernel for RM was developed within this framework. A volume-of-intersection (VOI) system matrix was applied in PRESTO for MLEM reconstruction without using subsets. The reconstruction with RM follows the procedure reported in the literature by . RM blurring kernel generation process. One virtual LOR line according to GCM is generated (red line). The line is then traced in the 3D system geometry and the two detector blocks (m and n) with the first geometrical intersection are identified. Subsequently, the line is transformed into the BCS of the two blocks and the four parameters (x, y, θ, φ) of the incident ray are determined independently, denoted as α m and α n . Two detector responses r αm and r αn regarding α m and α n can be addressed from the multi-block DRF LUT (only single-block responses are visualized in the plot due to limited space), and are then mapped back to the scanner geometry in order to create the physical LOR responses of the traced line (shown in yellow crystals and blue lines). These physical LORs are subsequently mapped to virtual LORs in the GCM, which forms the one blurring kernel matrix for the investigated virtual LOR. Rapisarda et al (2012) with matching projectors. This requires two blurring operations for each iteration. Both the geometrical projection and blurring operations were performed using the single instruction multiple data (SIMD) and multi-threading (20 threads) for accelerations. The time consumed by the RM kernel operations on 3.4 × 10 8 LORs was around 1.04 s per iteration on an Intel Xeon(R) CPU E5-2680 v2 with 20 cores. The blurring time cost was approximately 8% of a single iteration. The reconstructed image size was 256 × 256 × 154, with a 1.25 mm cubic voxel. Post-filtering was not used in any reconstructed images in this study.
All the simulation data used for evaluation were obtained with GATE using the same configuration described below. The source distribution was configured to be back-to-back γ-rays (i.e. ignoring photon non-collinearity and positron range), while no phantom material was simulated. From the simulation results, only the true coincidences were extracted for image reconstruction. Thus, the reconstruction is simplified to avoid scatter, attenuation or random correction and we can evaluate the effects of RM in a more precise way. The normalization file is derived from a large cylinder source simulation for MLEM reconstruction without RM, and it is further adapted to RM blurring kernel based on appendix.

DRF LUT statistic uncertainties
The DRF LUT in this work was calculated using MC simulation, which are always influenced by statistical uncertainties. In order to quantify the influences of the DRF LUT's statistical uncertainties on the reconstruction with RM, an anatomical brain phantom from the BrainWeb database (Aubert-Broche et al 2006) was simulated for reconstruction and evaluation.
The discrete brain phantom consists of 11 tissue types. For simulation, the ratios of activity assigned to gray matter, blood vessels, white matter, muscle/skin were 10:10:2:1, as shown in figure 4(a). Other materials and background had no activity. To minimize the noise of the reconstructed images as much as possible, the measured true coincidence counts were relatively high (1.1 × 10 10 ).
Concurrently, four DRF LUTs were calculated with different counting statistics. The ratios of the simulated counting statistics for DRF LUT were around 1:10:100:1000. The LUT with the lowest counting statistics was denoted as 1-unit-statistics LUT, and the LUT with the highest amount of data was denoted as 1000-unit-statistics LUT, which was also the most precise one in this study.
Each DRF LUT was used to create the RM blurring kernel separately, with which the BrainWeb phantom simulation data were reconstructed. 500 iterations were conducted and the images of every 10 iterations were saved. Conventional MLEM without RM was also conducted. The mean squared error (MSE) value between the reconstructed image and ground truth image for each saved iteration was calculated, as shown in equation (4).
(4) Where I recon is the reconstructed image, I groundtruth is the ground truth image used for GATE simulations, N is the number of image voxels. In this way, the statistical uncertainties of DRF LUT are propagating into the bias of the final reconstructed images and influencing the MSE curve. Theoretically, if the DRF LUT has smaller statistical uncertainties, the RM kernel is more accurate and the MSE value will be smaller accordingly.

Single-block mode and multi-block mode RM
If the response in the neighboring eight blocks of the multi-block DRF is discarded in the RM blurring kernel generation pipeline, as shown in figure 3, it is equivalent to a single-block DRF. We refer to this as single-block mode RM. In contrast, it is referred as multi-block mode RM, if the neighbor responses are considered.
Towards quantifying the difference between the single-block and multi-block mode RM more specifically, a large cylindrical phantom covering the whole FOV was forward projected with a VOI geometrical matrix and blurred with the two RM blurring kernels separately. Afterwards, the projection data obtained with multi-block mode was subtracted by that from the single-block mode. The differences in the projection data were visualized with different ring combinations in GCM.

Sphere phantom simulation and reconstruction
In order to quantify the resolution and reconstruction deviation at different radial distances off the center of the FOV, a sphere phantom with a 2 cm radius was simulated using GATE. The sphere phantom was placed at different positions in the FOV. Three locations and the corresponding detected true coincidences are shown in table 1.
For each sphere simulation, the data were reconstructed with conventional MLEM without RM, MLEM with RM in single-block mode and multi-block mode separately. 100 iterations were recorded and the image for each iteration was used to calculate the resolution and MSE value with respect to the ground truth image. The simulation and reconstruction were repeated 20 times independently with different random seeds.
The resolution was calculated by fitting an error function to the reconstructed sphere profiles which was sampled in the sphere coordinate system relative to the sphere center (Hofheinz et al 2010). In total, nine resolution versus MSE curves were obtained.

Off-center BrainWeb phantom simulation and reconstruction
The BrainWeb phantom used in section 2.3.1 was shifted off-center to (0, 4, 0) cm to check the reconstruction differences between different methods. In order to limit the BrainWeb phantom within the transverse FOV (14 cm radius), only blood vessels, gray matter and white matter were assigned with activity at ratios of 5:5:1. The detected true coincidence counts were 1.4 × 10 8 . MLEM without RM, MLEM with RM in single-block mode and multi-block mode were evaluated. 200 iterations were recorded and the MSE value was calculated for each iteration between the reconstructed and ground truth image, regarding both the full image region and the ROI shown in figure 4(b). The image differences between the single-block and multi-block mode were calculated after 100 iterations.

Patient measurement data and reconstruction
A patient measurement from the Siemens hybrid MR-BrainPET system was used for reconstruction evaluation. The patient was injected with 18 F-FDG (221 MBq for 89 kg) and scanned for 1 h after intravenous injection. Images were reconstructed for a single time frame of 1000 s starting from 300 s and the detected number of prompts were 4.1 × 10 8 . All relevant corrections were applied, i.e. component-based normalization (Badawi and Marsden 1999) (adapted to RM method based on appendix), templated-based attenuation (Kops et al 2015), scatter correction based on single scatter simulation (Watson 2000), variance-reduction random correction using delayed window technique (Hoffman et al 1981, Byars et al 2005, dead-time/pile-up correction (Weirich et al 2012), system calibration. All the data were converted to GCM for subsequent Ordinary Poisson-MLEM The ratios of source activity assigned to gray matter, blood vessels, white matter, muscle/skin are 10:10:2:1. (b) Shifted BrainWeb phantom. The ratios of source activity assigned to gray matter, blood vessels, white matter are 5:5:1. The light purple line indicates the region-of-interest (ROI) used to calculate the MSE curve. Table 1. Simulation information of the three sphere locations.

Position (cm)
Detected coincidences Sphere 1 (0, 0, 0) 8.6 × 10 6 Sphere 2 (5.5, 0, 0) 8.8 × 10 6 Sphere 3 (11, 0, 0) 9.7 × 10 6 reconstruction in PRESTO. Both MLEM without RM, MLEM with RM in single-block mode and multi-block mode were evaluated for the patient data. The patient images are obtained at 100 iterations. Moreover, in order to quantify the noise suppression, the list-mode data of the same time frame used above were split into two independent data sets which share the same PET dynamics information. Both data sets were converted into GCM projection data and reconstructed independently with MLEM without RM, MLEM with single-block mode RM and multi-block mode RM for 150 iterations. The standard deviation of image difference (denoted as dsd) between the two independent data sets for each iteration was calculated for all three methods under the terms of Equation (5) provides a measure of image noise (Lodge et al 2010), where d i is the image difference for voxel i between the reconstructed image of the two data sets at the same iteration; N is the total number of voxels.

DRF evaluation
Three typical incident γ-ray beams were explored inside the BCS and their responses from DRF LUT are shown in figure 5.
Case 1 is a beam of oblique γ-rays at the center of the block with a parameter of (0, 0, 0.25π, 1.25π). The response will mainly involve the diagonal crystals following the incident γ-ray's azimuth angle in the block, as shown in figure 5(a). Case 2 is a beam of oblique γ-rays intersecting with the peripheral crystal in the block with a parameter of (9.75, 1.25, 0.25π, 0), which demonstrates the inter-block crosstalk in the axial direction. Figure 5(b) shows that the neighboring block has a certain amount of detecting probability besides the response in the center block. Case 3 demonstrates a similar situation in the transaxial plane, where a beam of oblique γ-rays intersects with the peripheral crystals in the block with a parameter of (1.25, 12.25, 0.25π, 0.5π). The response is shown in figure 5(c), where we could observe the inter-block crosstalk between different modules. These results clearly show that the detector response is strongly depending on the incident angle and position in the BCS, and that inter-block crosstalk in both axial and transaxial direction can be modeled by the multi-block DRF LUT.

RM blurring kernel profiling
The blurring intensities for a fan of LORs were calculated according to equation (3). Two ring combinations from the GCM were explored here, as shown in figures 6(a) and (b). One combination (ring 19-19) covers the LORs in the direct plane parallel to transaxial view, shown in figure 6(a). The other ring combination (ring 19-58) covers the oblique LORs, shown in figure 6(b). Both ring No. 19 and ring No. 58 are located in the middle of the involved block. To obtain the blurring intensity profile, the blurring intensity of each LOR is calculated and plotted with respect to the LOR's distance to the center of the transverse FOV. These data points are connected to form the blurring intensity profile curves, which are shown in figures 6(c) and (d).
The non-continuity in the curves originates from the gap structure. Regardless of the different ring combinations, the LORs in the center of the FOV have smaller B intensity values, indicating weaker blurring effects. (b) Case 2 with incident γ-ray parameter (9.75, 1.25, 0.25π, 0) to demonstrate axial inter-block crosstalk. (c) Case 3 with incident γ-ray parameter (1.25, 12.25, 0.25π, 0.5π) to demonstrate transaxial inter-block crosstalk.
Furthermore, due to the additional axial blurring effects, the oblique LORs from different ring combinations have stronger blurring effects compared with LORs from direct planes.

DRF LUT statistic uncertainties
As stated in section 2.3.1, our DRF LUTs were generated from different statistical samples. For each bin in the most accurate LUT, i.e. 1000-unit-statistics LUT, the average number of γ-rays evaluated was around 2.0 × 10 4 .
The equivalent time needed to generate the 1000-unit-statistics LUT on a cluster node was around 2000 h. The computer node has 2 Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50 GHz, a total of 24 cores with 48 threads. The full simulations were parallelized on the Juelich Super Computer system with 16 computation nodes and the time needed was around 5 d. The other 3 DRF LUTs have a linear relation with the 1000-unit-statistics LUT, regarding the simulation time and average evaluated γ-rays per LUT entry.
The MSE curves for the MLEM and RM blurring kernels generated from 4 DRF LUTs are shown in figure 7. The MSE value decreases when the DRF LUT is generated with higher statistics, and more importantly, the 100-unit-statistics DRF LUT obtains a similar MSE performance to that of the 1000-unit-statistics LUT. This indicates that it is accurate enough to use the 100-unit-statistics LUT for the final reconstruction. The computation time needed to generate the 100-unit-statistics LUT was around 200 hours per node, which is acceptable to obtain without the super computer support.

Projection differences between single-block and multi-block mode RM
In figure 8 the differences between the forward projection data blurred with RM in single-block mode and multi-block mode are shown for two ring combinations. Ring 20-20 indicates the same direct planes and the significant differences only appears beyond around 10 cm from the center. Ring 20-66 is a very oblique plane for the BrainPET insert and ring 66 is near the gap of one detector block along Z axis. Ignoring the inter-block crosstalk in Z axis leads to increased projection data differences between single-block and multi-block mode.

Sphere phantom simulation and reconstruction
The resolution versus MSE curves for the sphere phantom reconstruction are shown in figure 9. All the curves start from the top-right corner when the bias from the first iterations is still large (the starting points are not visualized in these figures). Compared with curves from MLEM without RM, the resolution and MSE value of RM methods are always improved for the sphere at the three radial positions. The more significant improvement Figure 6. (a) The diagram of LORs explored for ring combination of (ring 19-19); (b) the diagram of LORs explored for ring combination of (ring 19-58); (c) blurring intensity profile for ring combination of (ring 19-19). (d) Blurring intensity profile for ring combination of (ring 19-58).
is acquired when the sphere is placed at a larger distance from the center, as shown in figures 9(b) and (c). Singleblock and multi-block mode RM only show a difference in figure 9(c) for sphere 3 (located at 11 cm from the center), regarding both resolution and the MSE value.
The reconstructed images and profiles for the sphere at three positions are shown in figure 10. These images are obtained at the iteration with the minimum MSE value. It is worth noting that the chosen iteration between the RM method and the no-RM method are quite different for sphere 3, compared with spheres 1 and 2.
For off-center spheres, especially sphere 3, the edge at the greatest distance from the center of the system looks more blurred from MLEM without RM, compared with the image pertaining to the central sphere. However, this could be further compensated by reconstruction with RM. In terms of images and profiles, no difference is seen   (a) Sphere 1 at (0, 0, 0) cm; (b) sphere 2 at (5.5, 0, 0) cm; (c) sphere 3 at (11, 0, 0) cm. No-RM represents MLEM without RM. RM-sblock represents MLEM with single-block mode RM and RM-mblock represents MLEM with multi-block mode RM. The maximum coefficients of variation (division between standard deviation and mean) for MSE values in 3 figures are 3.58%, 0.90% and 0.90%, respectively, while those for resolution are 0.91%, 0.63% and 0.61%, respectively. between the RM single-block and multi-block modes for spheres 1 and 2. However, the edge profile for the multiblock mode near 130 mm for sphere 3 is recovered more accurately compared with that in single-block mode. These recovery differences actually lead to smaller MSE values for the multi-block mode, as shown in figure 9(c).
Slight edge artifacts can be observed in nearly all the reconstructed images, except the one with no RM for sphere 3, as it is obtained at an early iteration. The strongest artifacts can be seen in sphere 2, but in general, the intensities of the overshoot for all the images were not very severe.

Off-center BrainWeb phantom simulation and reconstruction
It is clearly observed that the RM method reaches a smaller MSE value compared with MLEM without RM, as seen in figures 11(a) and (b). Moreover, the multi-block mode shows a smaller MSE value compared with the single-block mode. However, the difference between single-block and multi-block mode RM in figure 11(a) is not as large as in figure 11(b), because the MSE value is averaged over all the voxels in the images.
In figures 11(c)-(e), the reconstructed images and differences between single-block and multi-block mode at 90 iterations are shown for the off-center BrainWeb phantom. The major difference can be detected at the offcenter region, in accordance to the observations using the sphere phantom ( figure 9). This region is also chosen as the ROI to calculate the MSE curve for all the iterations, as shown in figure 11(b). Compared with single-block mode, the reduced MSE value of multi-block indicates that the difference in ROI can help to reduce the MSE value for the brain phantom, thus leading to better reconstruction performance.

Patient measurement data and reconstruction
The reconstructed images of the patient data are shown in figure 12. All the three reconstructed images were thresholded and normalized to the same intensity. Compared with MLEM without RM, reconstructed images with RM can better resolve the gray matter visually with suppressed image noise. The image differences between multi-block mode and single-block mode are shown in figure 12(d). The major differences can be seen from the peripheral skin or muscle, while the brain does not demonstrate prominent differences ( 4%).
The curves for noise quantification using dsd values (equation (5)) are shown in figure 12(e). It can be seen that the MLEM reconstruction with RM provides lower dsd values compared to MLEM without RM. Thus, RM reduces image noise more efficiently. Meanwhile, there is no siginifcant difference for curves between singleblock and multi-block mode.

Discussion
This work presents an accurate RM method based on a simulated multi-block DRF LUT in 4D projection space which can enhance resolution, and simultaneously reduce the MSE value (sum of variance and square bias) (Wackerly and Scheaffer 2008). The RM blurring kernel and the image reconstruction are implemented using a projection space without any angle or distance compression. This results in resolution enhancement with only slight edge artifacts. Moreover, the RM method and the multi-block DRF can be applied to most commercial PET scanners with geometrical block rotation symmetry and block shift-invariance.
The multi-block DRF model takes into account all possible incident γ-rays emitted from the scanner's FOV. The proposed sampling strategy of the four parameters (x, y, θ, φ) in the BCS enables a substantially uniform sampling for different incident γ-rays, which covers the gap in the multi-block model as well. Neighboring blocks in this model allow inter-block crosstalk to be considered in both the axial and transaxial direction. Theoretically, the multi-block DRF LUT can not only be used to calculate the RM blurring kernel, but also make it possible to accelerate the PET Monte-Carlo simulation, especially for ultra-high counting statistics. This is similar to the work done in SPECT, which also applies the known detection probability of DRF into simulation instead of detailed ray tracking (Song et al 2005).
With this dedicated DRF LUT, a factorization method for generating the RM blurring kernel has been established. The kernel profiling proves that the blurring kernel is shift-variant in 4D projection space as expected. The fluctuation of the curves, as seen in figure 6, might come from the gap structure and the geometrical sensitivity of each crystal location inside the block (Rahmim et al 2008, Cecchetti et al 2013. It has been proven that the inter-block crosstalk has an impact on the measured or simulated projection data (Zeraatkar et al 2011, Levin et al 2016. In theory, modeling of this effect could improve the quality of reconstructed images. The multi-block mode RM in this study includes the inter-block crosstalk, while the single-block mode does not. The comparison shows that the reconstructed images have significant differences beyond 10 cm radial distance from the center of the transverse plane for the BrainPET insert, where the depth-of-interaction (DOI) effects become significant. Moreover, these differences were shown to lower the MSE value of the shifted BrainWeb phantom (figures 11(a) and (b)) and sphere phantom (figure 9) using the multi-block mode RM. Shifted BrainWeb phantom also demonstrates significantly differences (reaching 10.8%) at off-center region. Additionally, the resolution is also enhanced for the multi-block mode, as indicated for the sphere phantom at 11 cm from the center, compared with single-block mode. These results prove that including the inter-block crosstalk can markedly improve the image quality for BrainPET at certain areas of the transverse FOV where the DOI blurring and block-crosstalk effects are quite severe.
The inter-block crosstalk at off-center locations beyond 10 cm in the transverse plane is further confirmed by the projection differences between single-block and multi-block mode for the direct plane in figure 8. All these results are strongly related to the geometrical structure of the BrainPET insert, where the transverse gaps are quite large, reaching the size of around three crystals (Caldeira et al 2012). Therefore, the inter-block crosstalk in transverse plane is less pronounced compared with other scanners with smaller gaps, for example GE SIGNA PET/MR System (Levin et al 2016), Siemens Biograph mMR system (Delso et al 2011), etc. Theoretically, scanners with smaller transverse gaps would benefit more from the multi-block mode RM.
It is indicated in figure 8 that there are also block-crosstalk effects along axial direction between single-block and multi-block mode for the ring 20-66. However, in contrast to transverse inter-block crosstalk, it appears that the axial crosstalk does not influence the reconstruction significantly, since the center sphere phantom does not show observable differences between the single-block and multi-block mode. This is predominantly due to the fact that there are only five gaps along Z axis compared with the 77 virtual detector rings. Furthermore, the maximum axial oblique angle is around 23 • for BrainPET reconstruction, which limits the axial blurring effects (Gong et al 2017). In general, if the PET system has more gaps along the Z axis with a larger axial oblique angle, e.g. animal scanner (Gong et al 2017) and the total body EXPLORER scanner (Zhang et al 2017), then multi-block mode could demonstrate better reconstruction performances along axial direction compared with single-block mode.
These geometrical properties of the BrainPET insert also explain why the differences between the singleblock and multi-block mode are rather low ( 4%). In general, the size of the brain is usually less than 20 cm in transverse plane and it is positioned in the scanner center. This indicates that modeling the single-block effect is nearly sufficient to achieve the same performance as multi-block mode for the BrainPET insert.
Regardless of single-block or multi-block mode, the shift-variant RM kernel describes the different blurring effects at different locations of the FOV. The accurate modeling of blurring effects leads to the reconstructed images with only slight edge artifacts, as seen in figure 10 for sphere phantoms. Even the highest edge overshoot for the sphere at (5.5, 0, 0) cm is still typical and not very severe. Moreover, the edge overshoot is also observed for MLEM without RM, depending on the cross-section of the tube for the VOI projector. The VOI projector has been shown to achieve a similar performance as RM in projection space. If a larger cross-section of the tube is used for the projector, the LOR would intersect with more voxels, which would eventually result in some blurring effects in projection space (Lougovski et al 2014). When we combine the RM blurring kernel with a VOI projector, the edge overshoot is slightly enhanced for the sphere at (5.5 ,0, 0) cm.
These edge artifacts can still be mitigated by some band filtering methods in the reconstructed images (Tong et al 2011) or by adding a regularization term in the image reconstruction (Fessler and Hero 1995, Qi et al 1998, Rapisarda et al 2012, Mikhno et al 2013. However, thorough elimination of edge artifacts for RM methods is currently not possible. Eventually, even for our accurate RM blurring kernel, a compromise still needs to be made between resolution enhancement and edge artifacts.
For real scanners, the DRF in this work only considers the dominant physical effects, i.e. inter-crystal scattering, inter-crystal penetration. Other blurring effects due to optical crosstalk, electronic systems, crystal efficiency and positioning algorithm are not included in the DRF. These effects are partly compensated in normalization to obtain artifact-free images. Consequently, our DRF could be further improved at this point. Since some other blurring effects are ignored, it can be claimed that the DRF in this study slightly underestimates the full blurring effects of the scanner data. Therefore, the reconstructions for patient data are not likely to obtain severe edge artifacts, which might lead to false-positive diagnosis in clinic applications or biased quantitative values for dynamic studies.

Conclusions and future work
In this study, a multi-block DRF has been developed using MC simulations for the first time, allowing interblock crosstalk to be taken into consideration appropriately. The simulation and reconstruction results indicate that such a DRF model for the BrainPET insert can also be applied on a normal cluster node within reasonable time (200 h) with sufficient counting statistics. Furthermore, other scanners with geometrical block rotation symmetry and block shift-invariance can also adopt this approach easily.
We applied this DRF model to build up a RM blurring kernel in 4D projection space without any distance or angle compression of the LOR data, which is equivalent to LOR reconstruction. The RM kernel is eventually integrated into PET reconstruction with SIMD and multi-threading support. The time cost due to RM blurring operations is approximately 8% of a single iteration.
Both patient and simulation data results have shown that the RM in 4D projection space can yield improved resolutions and reduced MSE values (the sum of variance and bias square), compared with MLEM without RM. Moreover, the shift-variant RM kernel can model the blurring intensity at different locations in the FOV, thus avoiding severe edge artifacts in the reconstructed images of patient or simulation data.
RM in single-block and multi-block mode were also investigated for the BrainPET insert. Improved spatial resolution and edge recovery of multi-block mode can be seen in the transverse plane at locations beyond 10 cm from the system center. Patient data demonstrated rather low differences between the single-block and multiblock mode RM, due to the brain size ( 20 cm) and scanner geometry. However, the multi-block mode can play a more important role for scanners with smaller transaxial gaps or with more axial gaps and longer axial FOV, as the DOI and block-crosstalk effects in transaxial or axial direction would be more pronounced. Our future work will focus on the validation of this framework on other scanners which have stronger block-crosstalk effects compared to the BrainPET insert.
In order to match the models with and without RM, the regenerated normalization N for RM should satisfŷ N(RGI) = N GI. (A.5) N and N are both diagonal matrices, and thus the above formula can be written as  (A.6) N ij , N ij , R ij , G ij , I i1 denote the element of matrix or vector N , N , R, G, I , respectively. Since matrix R, G and N are already known, each diagonal element of matrix N ii can be easily determined with a defined image. In practice, we use a cylinder image for I covering the whole FOV, i.e. illuminating all relevant LORs.