Improving depth-of-interaction resolution in pixellated PET detectors using neural networks

Parallax error is a common issue in high-resolution preclinical positron emission tomography (PET) scanners as well as in clinical scanners that have a long axial field of view (FOV), which increases estimation uncertainty of the annihilation position and therefore degrades the spatial resolution. A way to address this issue is depth-of-interaction (DOI) estimation. In this work we propose two machine learning-based algorithms, a dense and a convolutional neural network (NN), as well as a multiple linear regression (MLR)-based method to estimate DOI in depolished PET detector arrays with single-sided readout. The algorithms were tested on an 8× 8 array of 1.53× 1.53× 15 mm3 crystals and a 4× 4 array of 3.1× 3.1× 15 mm3 crystals, both made of Ce:LYSO scintillators and coupled to a 4× 4 array of 3× 3 mm3 silicon photomultipliers (SiPMs). Using the conventional linear DOI estimation method resulted in an average DOI resolution of 3.76 mm and 3.51 mm FWHM for the 8× 8 and the 4× 4 arrays, respectively. Application of MLR outperformed the conventional method with average DOI resolutions of 3.25 mm and 3.33 mm FWHM, respectively. Using the machine learning approaches further improved the DOI resolution, to an average DOI resolution of 2.99 mm and 3.14 mm FWHM, respectively, and additionally improved the uniformity of the DOI resolution in both arrays. Lastly, preliminary results obtained by using only a section of the crystal array for training showed that the NN-based methods could be used to reduce the number of calibration steps required for each detector array.


Introduction
Image quality in PET is directly affected by the number of detected photons. All commercial PET scanners and most research scanners today use scintillation crystals for gamma detection. The interaction probability of an annihilation photon within the scintillator depends on multiple factors, one of which is the travelling distance within the material. Therefore, increasing the length of the crystals is a simple way to increase the detection probability and thus enhance the scanner sensitivity. However, besides the increased cost, this approach has two main disadvantages. First, longer crystals have a degraded timing resolution due to the effect of light transport in the crystal . Second, the uncertainty in estimation of the interaction point inside the crystal leads to a resolution-degrading effect known as the parallax error. Both of these effects can be mitigated by depth-of-interaction (DOI) estimation. Figure 1. Schematic drawing of the electronic collimation setup used for DOI tagging of the crystal arrays, depicted for the 8× 8 array  arrays were finally wrapped with PTFE Teflon tape (3 M, USA) for mechanical stability. Furthermore, a 1.5× 1.5× 15 mm 3 Ce:LYSO crystal (Crystal Photonics Inc. USA), polished on all sides and coupled to a single 3× 3 mm 2 SiPM (S13 360-3050-PE, Hamamatsu, Japan) was used as a reference crystal in both DOI measurement setups. The crystal was coupled to the SiPM by means of Meltmount glue and wrapped in Teflon.

Experimental setup
The electronic collimation setup for side-irradiation of the crystal arrays, described in detail by Pizzichemi et al (2016), was used throughout this work. A schematic drawing of the setup is included in figure 1. In contrast to the conventional PET-like setup, where the source is usually located opposite to the front side of the crystal array, the array is rotated by 90 • , so that irradiating the crystal array from its lateral side generates a DOI tagging by selecting a small portion of interactions that occur at a certain depth within the crystal. The reference crystal was placed at a 150 mm distance from the crystal array, and a 2 MBq 22 Na point source with an active sphere diameter of 2 mm (Eckert and Ziegler, Germany) was placed at a 40 mm distance from the crystal array. With the 8× 8 array, the DOI-tagged data were acquired for each of the 8 crystal rows at 10 vertical positions, spaced with a 1.4 mm pitch. The acquisition rate was lower with the 4× 4 array; therefore, only 5 vertical positions with 2.8 mm pitch were acquired at each crystal row to reduce the total acquisition time. The acquisition lengths were adjusted to a minimum of 30 min per position to get sufficient count statistics. The signal from the reference detector was used as the acquisition trigger with a fixed threshold. For every interaction exceeding the threshold on the reference crystal, the integrated charges collected by the reference SiPM and each of the 16 SiPMs in the array were recorded and post-processed.

Linear approaches for DOI estimation
To evaluate the performance of the NN-based algorithms, two linear methods were included for comparison. The original linear method proposed by Pizzichemi et al (2016) and later slightly updated by Pizzichemi et al (2019) was included as the first reference. Second, an MLR-based method that includes additional linear coefficients from individual SiPM pixels was also implemented. The originally proposed method assumed a linear relationship between the ratio of the light emitted from both ends of the crystal and the DOI. In summary, if we denote the charges produced by each of N single SiPMs of the array as p i and the charge produced in the SiPM pixel directly coupled to the crystal where the interaction occurred as p max , then the ratio will be proportional to the DOI in the corresponding crystal, with coefficients m and q: The sum P in equation (1) is essentially equivalent to the amount of light emitted from the side of the crystal opposite to the SiPM and collected by the neighbour SiPM pixels. This method can be modified by considering all the charges p i produced within the array with their own separate coefficients m i in a multiple linear regression approach instead of P max alone. This could also account for probable differences in detector properties or the optical coupling between the crystals and the SiPMs. This way, the DOI can be estimated by: (3)

Neural network-based algorithms
To account for nonlinearities in light distribution response to DOI, we have investigated two machine learning approaches for DOI estimation, a dense NN (DNN) and a convolutional NN (CNN).

Datasets and input features
As shown by Pizzichemi et al (2016) and Stringhini et al (2016), the initial DOI estimate, w (as defined in equation 1), can be utilised as a third dimension, to generate 3D floodmaps, in which Anger logic is used to calculate the first two dimensions. This results in a more efficient crystal separation, in which background and scatter events are more effectively discarded. For all datasets used in the study, the 3D Anger-logic-based floodmaps were created and the clustering algorithm described by Stringhini et al (2016) was used to generate separate datasets per crystal to be used for training and testing. For each recorded interaction, the 16 values of integrated charges produced by the SiPMs were used as the inputs of the NNs and the position of the tagging setup in mm served as the ground truth for training and testing. The first dataset was acquired with the 8× 8 crystal array. Due to the limited measurement statistics per DOI tagging position, the datasets from 6 out of 64 crystals that had significant inaccuracies in the 3D clustering were not included in the study. These crystals are indicated, along with the 58 crystals that were studied, in figure 2(a). Due to the same reason, the data from one or two DOI tagging positions were discarded in a number of crystals among the 58 studied cases. This was done to minimise the effects of inaccuracies of the 3D clustering algorithm caused by limited statistics on the evaluation of the DOI estimation methods. This issue can be avoided by increasing the acquisition time per DOI tagging position in future.
The second dataset included the measurement data from the 4× 4 crystal array, split into 16 separate datasets corresponding to the individual crystals, as shown in figure 2(b). All 16 crystals were included in the study. Besides the 16 integrated charges from the SiPM pixels, the timestamps recorded by the 16 SiPM pixels were used additionally as NN inputs in a separate comparison.

Neural network architectures
The architectures of the studied NNs are shown in figures 3 and 4. Due to the limited number of input features, it was decided to use a relatively small number of hidden layers and hidden units in both NNs to avoid overfitting and speed up the training process. Several NN architectures were considered. DNNs with one, two, and three hidden layers were tested with various numbers of units in each layer. For the CNN, the number of filters in the convolutional layer was varied from 8 to 40 and the number of FC layers was varied from 1 to 3 with different number of units in each layer. The CNN was then tested with and without the maximum pooling layer. The input of the CNN, which can be considered as a 4× 4 image, was very small compared to images that are normally processed with CNNs. Hence, a smaller filter size (2× 2 instead of conventional 3× 3) was used in the convolutional layer. After the convolution operation, zero padding was applied. A learning rate of 0.000 7 and a minibatch size of 128 were chosen for both NNs after grid searching among a wide range of values. The selected architectures, learning rate, and minibatch size provided the   smallest values of the corresponding mean squared error cost functions. The weights of the NNs were updated using the Adam optimisation algorithm. When the timestamps were added to the input features, the input layer of the DNN was modified and contained 32 units in total, corresponding to the 16 integrated charge values plus the 16 timestamps, in case of the 4× 4 array that they were available. The training process consisted of 1000 epochs.
The output layer of both NNs was represented by the identity function and therefore produced a single number that was the predicted value of DOI in mm. To assess the DOI estimation error of the NNs on a test set, the predicted values were subtracted from the corresponding ground truth values, and the standard deviation of the differences was calculated to obtain the FWHM of the estimated error as a measure of DOI resolution.

Performance Evaluation
To facilitate the comparison, all DOI estimation methods were implemented in the TensorFlow machine learning framework. As shown in figure 5, the linear algorithms were represented as NNs without hidden layers. The original approach is depicted as a network receiving a single input w = pmax First, all four DOI estimation methods were trained on the 58 datasets from the 8× 8 array and on the 16 datasets from the 4× 4 array, i.e. four separate NNs corresponding to the four studied algorithms were trained for each single crystal. The sizes of the individual datasets were different, varying from 4000 to 13000 samples. 80% of each dataset was used for training and 20% for testing. Secondly, the crystals of the 8× 8 array were split into 10 groups, in which the crystals were located symmetrically with respect to the centre of the 8× 8 array (figure 6). For a given DOI, crystals within a symmetry group should ideally produce identical but differently oriented light distribution patterns. That means that for a symmetry group, a NN can be trained using only a part of the crystals of this symmetry group, provided that all the datasets are brought into the same orientation before the training. Such an approach imitates testing an algorithm on an unseen detector array and might help to reduce the number of crystals, for which DOI-tagged data have to be acquired, and therefore to reduce the total acquisition time required to calibrate all the detectors of a PET scanner. Each symmetry group was divided into training and test crystals. For each symmetry group, training and test sets consisted of combined samples from all training and test crystals, respectively, within that symmetry group. The symmetry groups and their corresponding training and test sets are shown in figure 6. Figure 7 compares the DOI estimation error (FWHM) for the four investigated methods tested on the 8× 8 array. All methods showed better performance on the central crystals of the array compared to the edge crystals, as expected. It can be observed that the two NN approaches performed better than the two linear algorithms and the largest improvement in DOI estimation error was achieved in the crystals located on the edges of the array. The fact that the DOI estimation precision in these crystals was less degraded compared to the linear methods led to more homogeneous performance throughout the array. This can be better observed in histogram plots shown in figure 8. The average errors in DOI estimation (FWHM) for the DNN and the CNN were 2.99 mm and 3.01 mm, respectively, while for the original method and MLR these values were 3.76 mm and 3.25 mm, respectively. The use of the DNN resulted in slightly improved average DOI resolution compared to the CNN, while the CNN demostrated slightly better uniformity throughout the array.

Training and testing on the same crystals of the 4× 4 array
The DOI estimation error (FWHM) obtained with the 4× 4 array is compared for the four estimation methods in figure 9, when only the integrated charges of the 16 SiPM pixels were used as inputs. The histograms of the results are also shown in figure 10 for better comparison of the uniformity within the array. Similar to the results obtained with the 8× 8 array, all three proposed algorithms improved the DOI resolution compared to the original linear method. However, the absolute difference in DOI estimation error between the NNs and the linear methods is less significant compared to the 8× 8 array. The CNN and DNN approaches show very close performance, with slightly better results obtained with the DNN. The average errors in DOI estimation (FWHM) for the original method was 3.51 mm and it was improved to 3.33 mm, 3.14 mm, and 3.17 mm when using the MLR, DNN, and CNN, respectively. Figure 11 shows the DOI resolution results obtained with the DNN on the 4× 4 array, when timestamps of events are also included as inputs to the NN. Including the timing information has slightly reduced the   average error in DOI estimation for the 4× 4 array, from 3.14 mm to 3.09 mm. Subtracting the two results for all crystals shows that the DOI resolution has improved more for the top right crystals.    Figure 12 shows the results of DOI estimation error obtained for the 16 test crystals of the 8× 8 array, when training was performed on the symmetrically located crystals of the array. Comparing these results to figure  7 shows that the DOI estimation error has increased for all the methods. The average error increased for the NNs more (by 0.57 mm for both DNN and CNN) than for the linear methods (by 0.39 mm for the original method and by 0.52 mm for the MLR). Nevertheless, on average the NN-based algorithms still produced better results than the original linear approach. The average DOI resolution in the 16 test crystals was 3.98 mm, using the original linear approach trained and tested on the same crystals. Whereas, when trained and tested on symmetrical crystals, the DOI resolution was improved to 3.58 mm and 3.56 mm using the CNN and DNN methods, respectively.

Discussion
The presented work has shown that neural networks can be successfully used to improve the DOI resolution in depolished crystal arrays using a single-sided readout. Using a DNN architecture for DOI estimation resulted in an average DOI resolution of 2.99 mm (FWHM) throughout an 8× 8 array of 1.53× 1.53× 15 mm 3 LYSO crystals, whereas the original linear approach  yielded a DOI resolution of 3.76 mm. The improvement in DOI resolution was larger for the edge crystals than the central crystals of the array. The average DOI resolution in the central 16 crystals improved from 3.15 mm to 2.7 mm, whereas for the edge crystals the improvement was from 3.99 mm to 3.12 mm. The NN-based approaches also offered significantly more uniform DOI resolution throughout the array. However, to enable a more robust comparison between different crystals in the array, the same number of samples should be used for each crystal in both training and testing in future studies. Longer acquisition times might be required to obtain around 10 000 samples for each crystal.
The results obtained with a 4× 4 array of 3.1× 3.1× 15 mm 3 LYSO crystals also showed improved performance using the NN-based approaches. However, the improvement was less significant in this case, where the average DOI resolution was improved from 3.51 mm with the original linear method to 3.14 mm with the DNN. This could be explained by the reduced light sharing present in this array using the one-to-one coupling approach. Including the timestamps as inputs for DOI estimation in this array also provided only slight improvement (0.05 mm in average). However, the improvement was not uniform throughout the array. Investigating this approach with more crystal arrays is required to understand the results better.
Although the performance of the NN-based approaches degraded when only a number of crystals in the 8× 8 array were used for training, compared to training on all crystals of the array, still the NNs outperformed the original approach. Even though the NNs were not tested on another 8× 8 array of the same architecture within this study, this result suggests that such reuse would result in acceptable DOI resolution of the other array, since the test sets for this part of the study included samples only from the crystals that were completely unseen by the algorithms. However, DOI estimations that are based on training data from different crystals do not include the effects from different coupling or physical differences in detector elements. Therefore, they would require calibration factors to account for these differences. Instead of performing the complete DOI tagging measurements per crystal, the calibration factors could be estimated from only a few tagging point measurements for each crystal array and they can be applied to the predicted DOI estimations of the NN. This would be especially useful when such detectors are being considered for use in a PET scanner, as this would reduce the number of detector calibration steps required for the total system. The use of the CNN in this work was intended to mainly target the 8× 8 array; however, comparing the results obtained with the CNN and DNN showed very similar performance obtained with the two NN architectures for both of the arrays. Thus, it is reasonable to use the DNN in future even with large arrays, especially since its training is less computationally expensive. However, the CNN might show better results when trained and tested on a dataset with more input features, e.g. a dataset acquired from a larger SiPM array. It was demonstrated that the MLR also produces improved results compared to the original method and therefore can be used as an alternative to the NNs due to its simplicity. Nevertheless, the DNN slightly improved both the mean (from 3.25 to 2.99 mm and from 3.33 to 3.14 mm for the 8× 8 and 4× 4 array, respectively) and the variance (from 0.39 to 0.33 mm and from 0.32 to 0.30 mm for the 8× 8 and 4× 4 array, respectively) compared to the MLR, which shows that the NNs offer not only improved mean DOI resolution, but also improved uniformity within the array. Therefore, considering the fact that the training of the NNs presented in this work is a fast process with modern computers, the use of the DNN is still preferable over the MLR.
To our knowledge, the DOI resolution obtained in this work is currently among the best achieved DOI resolutions published with single-sided readout crystals of similar size. For instance, (Lee et al 2017), whose DOI-encoding approach is based on the use of triangular-shaped reflectors, reported average DOI resolution of 4.26 mm FWHM for an 18× 18 array of 1.47× 1.47× 15 mm 3 LYSO crystals. Berg et al (2016) reported 3-3.5 mm DOI resolution in phosphor-coated crystal arrays of 3.94× 3.94× 20 mm 3 size. The best DOI resolution result with single-sided readout, to our knowledge, was 2.65 mm reported by Kuang et al (2018) using unpolished 3× 3× 20 mm 3 LYSO crystals. However, the coincidence timing resolution was in the order of 400 ps for both of these two preceding works, which is significantly worse than the reported 157 ps by Pizzichemi et al (2019). The obtained DOI resolution in this work is slightly better than the results reported by Stringhini et al (2017) for the same detector, using an analytic approach for event localisation.
The DOI resolution results obtained in this work can be further improved by increasing the accuracy of the DOI calibration measurements used for NN training. Furthermore, optimisation of the crystal localisation algorithm and light-sharing scheme should also be investigated. Monte Carlo simulations of the setup have shown that the vertical size of the collimated area on the crystals was on average 1.10 mm , which would be a rough estimation of the theoretical limit achievable with the currently used tagging setup if other factors, such as alignment inaccuracies, Compton scattering, and the number of calibration points, are not considered. This limit, however, could be reduced. The size of the collimated area on the crystals is defined by the combination of the source diameter, the distance between the source and the crystal array, and the distance between the source and the reference crystal. Reducing the source-crystal array distance, as well as increasing the source-reference crystal distance, is limited by the total number of coincidences and the total calibration measurement time. However, using a 22 Na point source with an active diameter of 0.25 mm, available on the market, instead of the currently used source with the diameter of 2 mm could substantially reduce the diameter of the irradiated area. This would also allow reducing the DOI tagging step size and include more calibration points in the training.
It is also possible that more scattered photons are detected in the crystals located farther from the source due to forward scattering in crystals located closer to the source. This effect was partially mitigated by using the 3D clustering algorithm, since the majority of interactions from scattered photons are not included in the clusters. Monte Carlo simulations could potentially be an additional source of the ground truth, and using them for training the NN can be implemented and tested in future studies.
The proposed DNN-based method of DOI estimation can be potentially suitable for online DOI calculation. An example of a successful application of DNN processing hardware integrated in PET data processing chain was shown by Geoffroy et al (2015), who were able to implement a DNN for real-time analysis of inter-crystal scattering triple events using layer-sequential processing on a field-programmable gate array (FPGA). This approach enabled online processing of more than a million triple events per second using less than 6000 FPGA slices. Similarly to our study, Geoffroy et al used a four-layer DNN; however, the number of units in their DNN was smaller ([6 10 5 1] units (Michaud et al 2014) compared to [16 16 8 1] units in our study). Therefore, the DNN module within an FPGA will require more multiply accumulate modules. Even more importantly, the DNN for DOI estimation will have not one but multiple sets of weights and biases, with the least number being 10 for a scanner with 8× 8 arrays, provided that the symmetries of the crystal array are employed and the same weights and biases are reused for all the arrays within the scanner. The event rate will have to be assessed and considered as well.
Finally, the presented NN-based approaches can potentially be used with any other DOI detectors with light sharing principle to improve the DOI accuracy. Further investigations on individual detector designs are required to study the potential benefits for each design.

Conclusion
In this work, two machine learning approaches were investigated to improve the average DOI resolution of single-sided readout depolished scintillation crystals in two LYSO crystal arrays. Using a dense neural network for DOI estimation improved the average DOI resolution from 3.76 mm to 2.99 mm (FWHM) for an 8× 8 array of 1.53× 1.53× 15-mm 3 LYSO crystals and from 3.51 mm to 3.14 mm (FWHM) for a 4× 4 array of 3.1× 3.1× 15-mm 3 crystals. The improvement was especially prominent for the edge crystals of the 8× 8 array, resulting in significantly better uniformity of DOI estimation throughout the crystal array. The dense neural network and the convolutional neural network architectures demonstrated similar performance for both crystal arrays. Furthermore, using only a section of the crystal array for training showed that this method could be effectively used to reduce the number of calibration steps required for each detector array. Further improvements could be expected by increasing the accuracy of the training dataset.