Energy discrimination for positron emission tomography using the time information of the first detected photons

The advantages of Time-of-Flight positron emission tomography (TOF-PET) have pushed the development of detectors with better time resolution. In particular, Silicon Photomultipliers (SiPM) have evolved tremendously in the past decade and arrays with a fully digital readout are the next logical step (dSiPM). New multi-timestamp methods use the precise time information of multiple photons to estimate the time of a PET event with greater accuracy, resulting in excellent time resolution. We propose a method which uses the same timestamps as the time estimator to perform energy discrimination, thus using data obtained within 5 ns of the beginning of the event. Having collected all the necessary information, the dSiPM could then be disabled for the remaining scintillation while dedicated electronics process the collected data. This would reduce afterpulsing as the SPAD would be turned off for several hundred nanoseconds, emptying the majority of traps. The proposed method uses a strategy based on subtraction and minimal electronics to reject energy below a selected threshold. This method achieves an error rate of less than 3% for photopeak discrimination (threshold at 400 keV) for dark count rates up to 100 cps/μm2, time-to-digital converter resolution up to 50 ps and a photon detection efficiency ranging from 10 to 70%.

We propose a method which uses the same timestamps as the time estimator to perform energy discrimination, thus using data obtained within 5 ns of the beginning of the event. Having collected all the necessary information, the dSiPM could then be disabled for the remaining scintillation while dedicated electronics process the collected data. This would reduce afterpulsing as the SPAD would be turned off for several hundred nanoseconds, emptying the majority of traps.
The proposed method uses a strategy based on subtraction and minimal electronics to reject energy below a selected threshold. This method achieves an error rate of less than 3 % for photopeak discrimination (threshold at 400 keV) for dark count rates up to 100 cps/µm 2 , time-to-digital converter resolution up to 50 ps and a photon detection efficiency ranging from 10 to 70 %.

K
: Data acquisition concepts; Gamma camera, SPECT, PET PET/CT, coronary CT angiography (CTA); Detector modelling and simulations I (interaction of radiation with matter, interaction of photons with matter, interaction of hadrons with matter, etc); Optical detector readout concepts 1Corresponding author. c 2018 CERN. Published by IOP Publishing Ltd on behalf of Sissa Medialab. Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Introduction
Over the last few years, there have been many avenues explored to improve the time resolution of positron emission tomography (PET) detectors to achieve time-of-flight PET (TOF-PET). TOF-PET has several advantages over conventional (or non-TOF) PET, including better contrast facilitating detection of small lesions and increased performance with lower statistics, leading to shorter acquisition times and lower radiation doses [1]. Improving the time resolution to 10 ps FWHM, equivalent to a spatial resolution of 1.5 mm FWHM on the line of response, would also enable direct reconstruction, changing the PET image reconstruction paradigm [2]. To achieve such timing performance, all the elements of the detector chain must be optimized, including the scintillator [3], photodetector [4,5] and electronic readout [6][7][8][9][10].
In particular, the development of new photodetectors based on Single Photon Avalanche Diodes (SPAD) called Silicon Photomultipliers (SiPMs) or Multipixel Photon Counters (MPPCs) opened the road to sub-100 ps coincidence time resolutions [11]. However, the SPAD being a boolean detector, it benefits from digitalizing the signal as early as possible, resulting in a device called a digital SiPM (dSiPM). In this scheme, instead of passively quenching each SPAD and summing their currents, each SPAD has its own quenching circuit (passive or active) and readout circuits. The complexity of the SPAD's ancillary electronics varies with different implementations, however they can range from simple individual on-off control to an individual Time-to-Digital Converter (TDC) per array of SPAD or even per SPAD. These latter dSiPMs with multiple TDCs are the focus of current studies using multiple photoelectron timestamps and knowledge of photostatistics to elaborate new time of arrival estimators such as Maximum Likelihood Time Estimation (MLTE) [12][13][14].
The current study aims to demonstrate that we can use the same timestamps used to obtain the time of arrival to also estimate energy and reject Compton scattered events. Obtaining the energy information from the same photoelectrons would negate the need to collect all the scintillation photons of the event. After obtaining a sufficient number of photoelectrons, all SPAD can be turned off and remain off for the duration of the scintillation with a significant improvement of afterpulsing -1 -since the traps would have ample time to empty without triggering supplementary avalanches. Moreover, the energy dissipated by the SPAD array and its quenching circuits is proportional to the number of SPAD triggered [15]. Thus turning the SPAD array off early in the scintillation event may reduce local heating of the silicon substrate, which would in turn reduce the number of thermally generated carriers [16]. Reducing thermally generated dark counts facilitates the work of the triggering electronic whose role is to differentiate spurious counts from true events and improves the time resolution of the detector.
Thus, obtaining both timing and energy information using the first photoelectron timestamps presents a significant advantage in reducing the dSiPM noise.

Materials -scintillator
The scintillator simulation was done using the GEANT4 toolkit. The scintillator is a 1 × 1 × 10 mm 3 LYSO crystal with ESR reflectors on five surfaces and an optical grease and epoxy coupling to the photodetector. The surface model parameters represent depolished surfaces with the Gaussian distribution of the microfacets having a standard deviation of σ α = 0.35. The rise time and decay time of the scintillation light are t r = 0.07 ns and t d = 40 ns, respectively. The amount of scintillation photons emitted fluctuates according to the resolutionscale parameter which was set to 4. Selected values are based on average measurements of different samples found in the literature [17][18][19][20]. Unless specified, all the following data analysis was done on the same GEANT4 simulation dataset with a 511 keV source.

Materials -photodetector
The photodetector simulation was done using the Digital SPAD Array Simulator which is a custom simulator developed to study SPAD-based detector systems [21]. This time variant simulator provides an accurate representation of dynamic non-linearities incurred by the boolean behavior of SPAD and the necessary quenching and recharge times by continuously tracking the status of each SPAD cell. This means that the same cell cannot give two distinct timestamps within the SPAD dead time, which is the behavior observed in a physical SPAD array. Furthermore, dark counts may add a variable number of counts during an event, as well as between distinct events. Consequently, dark counts may blur the timing of the beginning of the events by occasionally causing an early misfire, particularly at high dark count rates.
The simulated SPAD array, or dSiPM, is based on a 3D vertical stacking of a SPAD array on a layer of electronics like the one described in [22]. This dSiPM measures 1 × 1 mm 2 with a 50 µm SPAD pitch, for a total of 20 × 20 = 400 SPAD cells. Each SPAD cell has a 35 ps FWHM time jitter. This value was chosen to include both the SPAD avalanche growth and the cell-to-cell variation across the array [23,24]. The fill factor of the array is 52 % and the Photon Detection Efficiency (PDE) of the array is 30 %@420 nm unless otherwise specified. This is based on a conservative estimate for a layout with through silicon vias connecting square SPAD to the quenching circuits in the underlying electronics [22]. To isolate the impact of different parameters, the Dark Count Rate (DCR), crosstalk and afterpulsing were not added to the simulations unless -2 -

JINST 13 P01012
otherwise specified. Parametric studies of the PDE and DCR were conducted and are detailed in section 3. No uncertainties were added to the physical and electrical input parameters.
The beginning of a scintillation event is determined by an algorithm equivalent to the designed electronic trigger behavior. In the absence of a sufficient number of triggered SPAD in a selected time window, the TDC are periodically reset such that they remain available in an idle state most of the time. Only when a sufficient number of SPAD trigger within this time window, signalling the start of a scintillation event, can the TDC complete the time conversion. The system collects the time codes for further analysis, including sorting of the timestamps [25]. This step also provides the time of the first photoelectron of the event, the photoelectron of rank k = 1 which serves as a reference point for the new method implemented.
The post-processing also includes a rudimentary model of the TDC in which a jitter is applied before associating the timestamp with its correct time bin. To reduce the number of variables studied, the bin width and the time jitter of the TDC (σ TDC ) were set to be the same value. A parametric study of TDC resolution and jitter is detailed in section 3. For all other simulations, the raw timestamps were used.

Methods -energy discrimination
The idea of using time information to evaluate the energy of a PET event has been explored before. Indeed, the time-over-threshold schemes use the time a signal spends over a given threshold as a proxy for energy [8,[26][27][28]. Another method based on detailed scintillator photostatistics uses multiple timestamps from a dSiPM to estimate energy [29]. This method requires up to 200 timestamps and several algebraic operations. We propose a simpler method which requires minimal additional circuitry, using fewer timestamps, one subtraction and a configurable numerical threshold.
All methods aiming to estimate energy using time information are based on the same principle: as the energy deposited in the scintillator increases, so does the number of photons emitted. Since the rise time and decay time of the light emission process are fixed, the amplitude of the signal increases according to the number of detected photons. Consequently, more photons are detected in a fixed period of time or, conversely, ordered photons in a high energy event have closer timestamps than those in a low energy event.
Thus as energy increases, the time difference between a detected photoelectron of rank k and the first detected photoelectron (k = 1) will diminish. This time difference ∆t is what this method aims to exploit as a proxy for energy. By setting a threshold T which represents the maximum time between the timestamp of the first detected photoelectron of the event and the timestamp of the detected photoelectron of rank k, we can discriminate events with sufficient energy such that if ∆t > T the event is rejected and if ∆t ≤ T the event is conserved.
Of course the detected photoelectrons for different ranks will have different ∆t k statistical distributions, as shown on figure 1. As the photoelectron rank k increases, the distributions show more separation between distinct energies, but also wider spread, which is expected [11,14]. The detector has a lower limit on the time interval it can measure, mainly caused by the SPAD jitter as well as the TDC jitter and resolution. Since the time interval between scintillation photons can be very short, distributions at lower rank overlap appreciably. Consequently, selecting rank k too low will make the selection of a threshold T difficult and cause a significant amount of discrimination -3 - errors. By using a higher rank, the greater time interval can be more easily differentiated. However, selecting a rank too high will defeat the purpose of using the same photoelectron timestamps as those used for the MLTE algorithms, which generally need a few dozen timestamps at the most [12,30].
The MLTE algorithm itself requires sorted timestamps, however the random nature of the incident photons and the mechanism of the vernier-based TDC will cause the timestamps to be generated out of order. Thus, a hardware sorting module has already been implemented for the MLTE, in our case using an odd-even sorting module [25]. The energy discrimination method aims to use the same timestamps as those measured and sorted for the MLTE. While the MLTE usually only requires a few timestamps, the number of timestamps available can reach up to the number of available TDC or the depth of the memory buffer, whichever is smallest. In the case of these simulations, the 1:1 SPAD to TDC coupling will lead to a maximum of 400 valid timestamps since we have 400 SPAD and memory is not limited. However, depending on the spatial distribution of light at the scintillator/photodetector interface and the time at which the SPAD array is shut off, the number of valid timestamps will actually be lower.
Due to the non-linear relationship between the energy deposited in the scintillator and the slope of the rising light signal, the energy spectrum for the described method is highly non-linear. As the energy increases in fixed increments, this slope increases in smaller and smaller increments. Thus the high-energy events have ∆t which are very close while the lower-energy events tend to have ∆t which are spread out.
Consequently, the energy resolution of the photopeak will be degraded compared to the integration method, which consists in the sum of all avalanches for the duration of the event. However, in PET, the energy information usually serves only to qualify or reject an event and is not used further for the image reconstruction. In lieu of energy resolution, we will directly use the capacity of -4 -the method to discriminate low-energy events. We compare the validity of the events obtained with the proposed method to the validity indicated by the integration method and evaluate the number of times the methods disagree. The lower the disagreement, called error rate, the better the new method performs. This comparison is realized using the following steps: 1. The detected photoelectron energy spectrum of the training dataset is obtained using the integration method, then linearized and converted to a keV scale.
2. The delay between the first detected photoelectron and photoelectron of rank k is calculated on a training dataset.
3. The energy spectrum is mapped to the delay information to obtain the time threshold corresponding to the selected energy threshold for the selected photoelectron rank k. The time threshold is then refined using an iterative method (described below). This time threshold will be used for the testing dataset.
4. The detected photoelectron energy spectrum of the testing dataset is obtained using the integration method, then linearized and converted to a keV scale.

5.
The events above the selected energy threshold are labeled as "Valid" and the others are labeled as "Rejected". This is the reference to which we will compare the method previously described.
6. The delay between the first detected photoelectron and the photoelectron of rank k is calculated for the testing dataset. If the delay is below the time threshold determined using the training dataset, the event is labeled as "Valid". If the delay is greater than the time threshold the event is labeled as "Rejected".
7. The methods are compared using a confusion matrix. If both methods agree that the event is "Valid", the event is classified as a "True positive". If both methods agree that the event is "Rejected", the event is classified as a "True negative". If, however, the methods disagree and an event is "Valid" according to the delay method but "Rejected" according to the reference method, the event is classed as a "False positive" (F p ). The inverse situation is a "False negative"(F n ).
8. The error rate is calculated using: where n is the total number of events in the dataset under study.
The time threshold for rejecting low energy events (step 3 of the comparison protocol) is actually determined in two steps. First, we obtain the linearized energy spectrum using the integration method (shown on figure 2) and we calculate the delay between the first detected photoelectron and photoelectron of rank k. At this point, both the energy and the delay are known for each event. We can then associate the energy to a corresponding delay using a second degree polynomial regression. Using this relationship, we obtain a first estimation of our time threshold T. Second, this -5 - time threshold is further refined using the confusion matrix obtained through the steps enumerated above on the training dataset. The time threshold is adjusted such that F p ≈ Fn. An example of the end result of this process for k = 51 is shown in figure 3 with its corresponding confusion matrix (table 1).  Table 1. Confusion matrix corresponding to figure 3. The threshold is set so that F p ≈ F n . Since GEANT4 was used to simulate the scintillator, the ratio between Compton and photoelectric events corresponds to the one expected for a real detector.

True False
Positive 20 660 (40.1 %) 967 (1.9 %) Negative 29 021 (56.2 %) 950 (1.8 %) Using the method described above to obtain the thresholds, we use the error rate to compare the success of the method under different conditions. However, to limit the number of parameters to verify, we selected to do the studies on a single photoelectron rank. The specific photoelectron rank was selected by calculating the error rate for ranks 10 to 150, as shown on figure 4. As expected according to the distributions of figure 1, the error rate decreases with increasing rank. However, since the timing algorithms make use of the earliest photoelectrons and we wish to shut off the array early, we have to select a compromise. For the following studies, rank k = 51 was selected. As shown on figure 4, earlier ranks have rapidly degrading performance. However, by demonstrating the technique works for k = 51, the ranks k > 51 will also show good performance. According to figure 1, the photon of rank k = 51 arrives well within the first 5 ns. By this time the detector has obtained enough information to calculate the time and discriminate the energy of the event. The SPAD array can be shut off while the electronics complete the operations over the next few hundred nanoseconds, including sorting the photoelectron timestamps, calculating the final event timestamp and determining whether the event has sufficient energy to be conserved and transmitted to the coincidence engine.

Results and discussion
To test the robustness of the method, DCR, TDC time resolution and PDE were studied. All other parameters not under study were set to the default parameters detailed in section 2. All simulations used at least 50 000 events to establish the thresholds. The thresholds were then used on a second dataset of at least 100 000 events distinct from the first to obtain the results in the next graphs. For all parametric studies, four thresholds of 250, 300, 350 and 400 keV were used.  Figure 5 shows the error rate with increasing DCR for k = 51. Below a DCR of 10 cps/µm 2 , the error rate remains unaffected. At superior DCR, the performance degrades because the first photoelectron selector has more difficulty isolating the actual first detected photon of the event. Consequently, the difference between the time of photoelectron k = 51 and photoelectron k = 1 isn't correctly calculated and the discrimination falters at high DCR counts.
The 400 keV threshold performs well, being within the Compton valley, closely followed by the 350 keV which is situated at the Compton edge. It is interesting to note that the 250 keV threshold performs better than the 300 keV. This behavior holds for all the parametric studies done in this work. This behavior is actually caused by the strong non-linearity of the method, illustrated by figure 3. Events with lower energy show a larger time difference between photoelectron k = 51 and photoelectron k = 1. Consequently, the fixed time jitter of the SPAD as well as the uncertainties of the TDC translate to a smaller energy estimation error, improving the error rates relative to the 300 keV threshold. The smaller number of events near the 250 keV threshold also contribute to reducing the number of incorrectly classified events.
As a point of comparison, current commercial SiPM demonstrate a DCR in the order of 0.1 cps/µm 2 [31][32][33]. As shown by the DCR sweep, the levels of DCR necessary to impact the energy discrimination performance are far above the levels currently attained.
-8 - Figure 6 shows the error rate for various TDC resolutions with k = 51. In this case, the TDC jitter was also increased with resolution such that σ TDC is equal to the TDC resolution.
TDC resolution impacts the energy discrimination by quantizing the delay between k = 51 and k = 1. This in turn restricts the valid values for the discrimination threshold and can displace it from its target value. As the width of the TDC bins increases, this effect become more significant and, combined with the increasing TDC jitter and wider bins, leads to more events being incorrectly classified.
In addition, wide TDC bins skews the algorithm used to find the optimal threshold, which has difficulty converging towards an optimal value. This is particularly obvious with the threshold at 350 keV and 400 keV which are affected more strongly by the non-linearity of the method, making one timing bin width equivalent to a larger part of the energy spectrum. Thus, the threshold selection algorithm has more difficulty balancing the error types and is forced to relax its constraints to obtain a valid result. Consequently, the error rate varies irregularly with increasing binning width.
Current TDC design reaches resolution in the tens of picoseconds [8,34]. Using TDC designs with insufficient resolution (>50 ps) defeats the purpose of using multi-timestamp algorithms to obtain the best possible time resolution on the event. At the expected TDC resolutions (<50 ps), the performance of the energy discrimination method is very slightly affected, degrading by a fraction of a percentage point. Figure 7 shows the error rate with increasing array PDE for k = 51. As expected, lower PDE decreases performance, as we sample less of the initial photons emitted by the scintillator. Since less photons are detected, the resolution on timing of the detected photon of rank k = 51 also degrades and results in a larger uncertainty on the energy extracted and a greater error rate around the threshold.
On the other hand, increasing PDE beyond 30 % results in a slightly better discrimination for 250 keV and 300 keV, but a slight worse discrimination for 350 keV and 400 keV. This can be  explained by the non-linearity of the method. Higher energies have a smaller time interval between the detected photon of rank k = 51 and photon of rank k = 1, which is even smaller if the PDE, and thus the sampling fraction, increases. Hence the relative uncertainty of the energy near the threshold increases and affects the error rate.
In this case the rank selection was done using the 30 % PDE configuration. It is possible that higher PDE requires the use of a higher rank to offset the effect of the non-linearity, however this optimization is outside the scope of this work.
The PDE of current SPAD arrays vary widely, ranging from <15 % to near 70 % in the 400-420 nm band [35,36]. As seen on figure 7, the energy discrimination method may result in significantly degraded performance with SPAD arrays showing a PDE below 20 %. However, higher PDE arrays are necessary for applications using time of arrival estimators and this range shows good energy discrimination.
The energy discrimination method shows stable performance for the expected values of DCR (0.1 cps/µm 2 ) and PDE (30-70 %) for current SPAD arrays. In addition, it also performs well for the TDC performance achieved by current TDC designs in the tens of picoseconds. In conclusion, the method is robust and can be used to discriminate energy above a variable threshold.

Conclusion
We propose an energy discrimination method which used the same timestamps as used for MLTE methods. The simulations indicate that this method can discriminate and reject low energy events in a PET application with a low error rate, reaching less than 3 % for photopeak discrimination and less than 7 % for Compton thresholds (250-300 keV). This method also shows excellent robustness to DCR and can work with the TDC resolutions and PDE necessary for high precision timing applications.
-10 -Since we collect the photoelectron information within the first few nanoseconds and use the same information for both time estimation and energy discrimination, we can disable the SPAD array for the majority of the scintillation event. This reduces afterpulsing by giving ample time for the traps to empty. In addition, this could be a viable avenue to reduce the power consumption of the SPAD array and local heating, which increases dark counts. As circuit density and the use of vertically integrated electronics increases, controlling the power consumption and heat generation are crucial to the performance of the SPAD.
The next objective is to test this method using a real detector to confirm its validity. To do so, a calibration procedure will need to be developed using only data accessible in a real detector, which can be done using the digital SPAD array simulator. Ultimately, the method and its calibration procedure will need to be tested in an ASIC designed to use a multi-timestamp algorithm for time estimation.