A deep learning method for the trajectory reconstruction of cosmic rays with the DAMPE mission

A deep learning method for the particle trajectory reconstruction with the DAMPE experiment is presented. The developed algorithms constitute the first fully machine-learned track reconstruction pipeline for space astroparticle missions. Significant performance improvements over the standard hand-engineered algorithms are demonstrated. Thanks to the better accuracy, the developed algorithms facilitate the identification of the particle absolute charge with the tracker in the entire energy range, opening a door to the measurements of cosmic-ray proton and helium spectra at extreme energies, towards the PeV scale, hardly achievable with the standard track reconstruction methods. In addition, the developed approach demonstrates an unprecedented accuracy in the particle direction reconstruction with the calorimeter at high deposited energies, above several hundred GeV for hadronic showers and above a few tens of GeV for electromagnetic showers.


Introduction
The DArk Matter Particle Explorer (DAMPE) mission was launched on December 17, 2015, from the Gobi desert in China. It operates on a 500 km sun-synchronous orbit in sky survey mode, accumulating about two billion cosmic-ray events per year [1]. The broad scientific program of DAMPE includes the measurement of the cosmic electron and positron (e − +e + ) spectrum in the energy range between a few GeV and about 10 TeV, the measurement of the cosmic-ray proton and ion spectra in the particle kinetic energy range between 10 GeV and hundreds of TeV, and gamma-ray physics. The DAMPE instrument is composed of four main subdetectors: a bismuth germanium oxide (BGO) imaging calorimeter, segmented in 14 layers of 22 bars in a hodoscope arrangement, with a total thickness of about 32 radiation lengths for precise energy measurements and the electron/hadron separation [2][3][4]; a Silicon Tungsten tracKer-converter (STK) with 12 layers of single-side silicon trip detectors, 6 in the x direction and 6 in the y direction, for a precise cosmic-ray trajectory reconstruction, the absolute charge (Z) measurement, and the identification of the gammaray direction through the photon conversion into an electron-positron pair [5][6][7]; a Plastic Scintillator Detector (PSD) consisting of two double-layers of scintillator bars for the cosmicray absolute charge measurement and also serving as a veto for the gamma-ray detection [8,9]; a NeUtron Detector (NUD) consisting of four boron-doped scintillator tiles enhancing the electron/hadron discrimination power [10]. Thanks to the fine-segmented thick calorimeter and relatively large acceptance, DAMPE is capable of detecting cosmic rays with very good energy resolution, ∼1.5% for electrons and gamma rays [11], and ∼30% for protons and ions [12].
Recent DAMPE results [12,13] provide the most precise measurements of cosmicray proton and helium spectra at the highest particle kinetic energies reached by directdetection experiments, with unprecedented statistical accuracy and energy resolution. They confirm the previously established hardening of both proton [14][15][16][17][18][19][20][21][22] and helium [16,17,[20][21][22][23][24][25]] spectra at about a few hundred GeV per nucleon, and reveal a new spectral feature, a softening at about 14 TeV and 34 TeV particle kinetic energy for proton and helium, respectively. The measurements are compatible with the hypothesis of a particle charge dependence of the softening, indicating the presence of a nearby cosmic-ray source, like a supernova remnant (SNR), with the acceleration cutoff corresponding to the softening energy. The complex spectral structures challenge the long-standing paradigm of the SNR origin of galactic cosmic rays [26,27]. In particular, it is not clear whether SNRs can accelerate cosmic rays to PeV energy or whether other sources are needed [28][29][30][31]. Future cosmic-ray measurements towards the PeV scale with the existing particle detectors in space (DAMPE, CALET [32], ISS-CREAM [33]) and the next generation calorimetric space observatory (HERD [34]) are therefore of paramount importance in order to shed light on the century-long puzzle of cosmic-ray origin. However, while calorimetric experiments provide unique opportunities to explore the TeV-PeV domain, the systematic uncertainties related to the conventional data reconstruction techniques hinder such measurements.
One of the key challenges in direct cosmic-ray detection comes from the limited precision of the absolute charge identification. It is directly linked with the accuracy of the particle trajectory reconstruction. In particular, considering the cosmic-ray proton spectrum measured by DAMPE [12], the systematic uncertainties of the analysis grow rapidly with energy rendering the adopted cosmic-ray reconstruction and identification techniques insufficient for future measurement at a few hundred TeV and higher energies. This problem is critical for any other cosmic-ray analysis, including the recently published helium spectrum [13] and future measurements of heavier nuclei [35,36].
The goal of this work is to develop a new method for the reconstruction of cosmicray tracks, in order to enable a reliable and accurate absolute charge identification in the entire energy range, in particular at the TeV-PeV scale. Our approach deviates from the conventional combinatorial pattern recognition adopted by the DAMPE Collaboration [6,37] and, in similar ways, in other major calorimetric and spectrometer experiments in space [38][39][40][41][42], toward the rapidly developing deep learning domain [43,44]. The paper is structured as follows. In Section 2 we describe the data sample and the Monte-Carlo simulation used in this work. In Section 3 we briefly introduce the existing DAMPE data reconstruction pipeline and identify the key challenges of the cosmic-ray track reconstruction and absolute charge identification. In the subsequent parts, we describe the constituent blocks of the deep learning machinery developed for the particle trajectory reconstruction, which includes the shower axis direction finding in the calorimeter in Section 4, and the particle direction reconstruction in the tracker in Section 5. Next, in Section 6 we introduce the neural network algorithm developed to reject particles inelastically interacting before the tracker, as a prerequisite for the reconstruction of the particle charge. Finally, in Section 7 we apply the developed methods on the proton and helium flight data samples and we compare the absolute charge identification performance based on the new approach with the one based on the standard DAMPE track reconstruction. The results and implications are discussed in Section 8.

Data and simulation
The data sample used for this analysis was collected by the DAMPE instrument during the period between December 2015 and December 2021. The raw data transferred from the satellite are processed offline with the standard software pipeline, including the energy and shower axis direction reconstruction with the calorimeter [45], the track pattern recognition based on the Kalman filter approach [37], the reconstruction and correction of the signals in PSD [46,47], and the internal alignment of STK [6].
The full-scale Monte-Carlo (MC) simulation of the DAMPE detector is based on the GEANT4 toolkit version 4.10.5 [48]. It is used to perform the training of deep learning models, to optimize the event selection and to estimate the performance of the developed algorithms. The proton and helium cosmic-ray spectra are generated in the particle kinetic energy range between 10 GeV and 1 PeV following a power-law distribution with the spectral index of -1. For the simulation up to 100 TeV, we use the FTFP_BERT hadronic model [49][50][51], while above 100 TeV we employ the EPOS-LHC [52] model from the CRMC package 1 linked to GEANT4 using a previously developed interface [53]. The electron MC sample is generated in the energy range between 1 GeV and 50 TeV. To ensure a good match with the real data, the detector geometry is implemented in the simulation from Computer-Aided-Design (CAD) drawings, using a software conversion toolkit 2 [37]. For a fair comparison, the simulated data are processed with the same reconstruction and analysis chain as the flight data.
In addition to GEANT4, an alternative DAMPE simulation based on the FLUKA version 2011.2X.7 [54] is also used at 100 TeV and higher energies. It incorporates the DPMJET3 [55] model for nucleus-nucleus interaction above 5 GeV/n. We profit from the FLUKA samples as they allow us to test the stability of the deep learning models and improve the model performance by extending the statistics of the training sample.
The list of all MC samples is given in Table 1. It is worth noting that less than 2% of generated MC events correspond to particles that geometrically pass through all the DAMPE subdetectors. These events are analyzed and processed with the deep learning model training. This feature is explained by the fact that particles are generated on the surface of a sphere surrounding the DAMPE satellite in order to mimic the isotropic flux of cosmic rays [12,13].

Cosmic-ray reconstruction and identification
Passages of cosmic-ray particles through the DAMPE detector are illustrated in Figure 1. The standard procedure for the trajectory reconstruction and particle identification can be grouped into the following steps: • Reconstruction of the shower axis direction in the BGO calorimeter; • Track reconstruction in the STK using the BGO shower axis direction as a seed; • Selection of the best STK track from the ensemble of candidate tracks; • Projection of the STK track onto PSD, calculation of the path length; • Measurement of the particle absolute charge with PSD using the STK track projection.
Normally, an additional selection criterion is applied requiring consistency between signals in different PSD bars along the path of the particle to ensure the correct absolute charge identification, which could otherwise be altered by the inelastic interactions of cosmic rays inside PSD [12,13]. The particle track finding starts with the reconstruction of the shower direction in BGO, which is obtained from the fit of the energy-weighted "cluster" positions in different calorimeter layers [1]. A somewhat similar approach is adopted by other calorimetric experiments to date, including Fermi-LAT [39], CALET [56], and CREAM [42]. The direction of the shower axis reconstructed in the calorimeter provides the region of interest to look for candidate hits in STK, as well as the seed direction to be fed as a "best guess" to the Kalman filter algorithm. A further track finding is done through the combinatorial search and simultaneous (Kalman) fitting of the potential track candidates in STK. Poor-quality and ghost tracks are removed, keeping only those that pass certain quality criteria. It is the task of the further analysis to identify which track best corresponds to the direction of the impinging cosmic ray. In other words, two conditions must be fulfilled in order to identify the real particle track: (a) it has to be reconstructed and present in the pool of the candidate tracks; (b) it has to be correctly selected on the analysis level. Once the track is identified, it is projected onto the PSD subdetector, which can provide the absolute charge measurement with high resolution in a broad dynamic range, up to nickel (Z=28) [46,47]. Moreover, it is worth mentioning that the width of a PSD bar is 2.8 cm, which is much higher than the pointing resolution of the STK, 50-100 µm [6]. Hence, the PSD measurement is not so vulnerable to potential errors in the STK track identification. In other words, given that the selected STK track candidate is relatively close to the real trajectory of the particle, even if it is wrongly identified, the PSD measurement is likely correct. With the argumentation above, the PSD rightfully serves as a major tool for absolute charge identification in DAMPE. However, the advantage of a relatively large PSD bar size turns into a weakness at high energies, especially in the context of proton and light ion identification, as described below. Figure 2 demonstrates the ultimate charge identification capacity of PSD and STK with respect to protons and helium nuclei, in different energy bins, up to 1 PeV. The distributions are obtained from simulation, using the true particle direction. In addition, a selection is applied on the MC truth level requiring no inelastic interaction inside PSD. The absolute charge measured with the PSD is defined as the minimum signal among PSD bars crossed by the particle 3 . For the STK charge calculation, hits closest to the true direction in each layer are selected, with a maximum of 12 hits. If no hit within 0.3 mm is found in an STK layer, it is skipped 4 . The STK measurement is obtained as the average of the signals of associated hits, excluding layers with abnormally high signals coming from particles that pre-shower inside the STK or due to the tail of the Landau distribution [57].
At high energies, secondary particles originating in the calorimeter showers severely contaminate the PSD, deteriorating the signal distributions and making the measurement towards PeV scales extremely hard (Figure 2-left). The tracker, on the other hand, can provide very precise measurements, nearly independent of particle energy (Figure 2-right), assuming, however, that the particle track can be reliably identified. In particular, given a 95% containment of the STK proton distribution, the helium cross-contamination is less than 0.5% at all energies; in turn, given a 95% containment of the STK helium distribution, the proton cross-contamination is ∼1% up to 100 TeV and ∼2% at PeV energies. Due to the relatively small silicon strip pitch and a large number of tracking layers, providing up to 12 signal points, the STK measurement is much less affected by the secondary particles that originate in the calorimeter shower and scatter back into the tracker. In particular, it can be seen that STK provides very clean proton and helium peaks. These are the most abundant cosmic-ray species, naturally making them a target for the first direct cosmicray measurements at PeV energies. Notably, such measurements are also among the major physics goals of the High Energy Cosmic Radiation Detection Facility (HERD) -the largest cosmic-ray instrument to be launched into space in the foreseeable future [34,58,59]. The HERD mission shares its design philosophy with DAMPE, consisting of a thick calorimeter surrounded by a fine-segmented tracker. Therefore, the task of precise tracking is crucial not only for DAMPE, but also for the future success of cosmic-ray direct detection experiments. Figure 3 shows the charge identification capability of the STK using the standard track reconstruction algorithm with two different track identification methods. The first one is referred to as ideal identification, the "perfect" algorithm that selects the reconstructed track best matching the true particle. The second one is the standard identification -an algorithm optimized for the helium analysis which selects a track matching certain quality criteria and having the highest signal compared to other candidate tracks [13]. As can be seen from the distributions in Figure 3, the reconstruction and identification of the track are particularly difficult at high energies. Due to the back-scattering and pre-showers in the tracker, a vast multiplicity of secondary hits arise, dramatically obscuring the signal of the primary impinging cosmic ray, as illustrated in Figure 1-bottom. Moreover, as the number of hits increases, the combinatorial pattern recognition turns computationally expensive -the search for the primary track becomes a "needle in a haystack" problem. While the conventional track reconstruction algorithm of DAMPE operates adequately up to about 100 TeV, the identification of the correct track remains an open task even in this case. At 3 We also tested other algorithms, including mean and truncated mean, and found that the "minimum" algorithm shows optimal proton-helium separation in PSD. 4 For STK we also tested a cut of 0.6 mm and found no significant difference in the results.   and STK (right) for MC events, obtained using the true particle direction. All distributions are normalized to unity.
higher energies, however, not only the track identification is a challenge, but the standard algorithm also fails to reconstruct the primary track in a large fraction of events. The exact figures on the tracking efficiency with the standard approach and the one developed in this work will be provided further in the paper. Below, we develop a novel deep learning regression approach for predicting the particle trajectory, using the calorimeter (BGO) and tracker (STK) data as an input. This approach enables to perform an accurate absolute charge measurement and allows to overcome one of the major difficulties for the first direct measurement of the cosmic proton and helium energy spectra at the PeV frontier.

Neural network calorimeter direction prediction
Up to now, conventional techniques based on the analytical fitting of the shower axis have been used for the calorimeter shower direction reconstruction in all major space experiments [1,39,42,56,60,61]. These techniques have a typical pointing resolution comparable to the granularity of the calorimeter, which is about ∼1 cm in the case of DAMPE. In this work, we propose a method based on Convolutional Neural Networks (CNNs), treating calorimeter data as images. Profiting from low-level data without a bias of human pre-processing, CNNs can learn "hidden" features in the data which conventional "analytic" algorithms may not be sensitive to, potentially providing a better particle direction prediction [43]. Deep learning techniques including CNNs have already demonstrated their first successful applications in high-energy physics [62][63][64][65][66][67][68], astroparticle and gamma-ray physics [69][70][71][72], neutrino [73,74], and extensive air shower detection experiments [75,76]. The bulk of applications belong to classification type problems [62-64, 69-71, 73], yet the first examples of regression tasks have also emerged [65,72,[74][75][76]. Interesting applications of neural networks and adversarial training in generative models are also attracting a growing attention in the community [66][67][68]. Beyond this work, in DAMPE, the neural network paradigm is also actively being explored for electron/hadron particle discrimination [69] and calorimeter energy reconstruction [72].
The chosen network architecture together with an example input calorimeter image is shown in Figure 4. The image is constructed as a mixture of two projections of the BGO calorimeter with a total dimension of 14×22 pixels. Vertical-wise, even and odd layers of an image correspond to xz and yz projections, respectively. There are 7 layers per projection, according to their hodoscopic geometrical arrangement. We have also considered an alternative architecture, where the xz and yz projections are connected to separate CNNs, the outputs of which are then combined in a fully-connected neural network. We learned that the mixed image architecture has a better prediction accuracy compared to the one with the disconnected images. This result is expected since the two projections are not independent. The mixed image accounts for cross-correlations throughout the calorimeter in the vertical direction, as particles are traversing sequentially the 14 layers. The value in each pixel of an image is taken as the signal of the corresponding BGO bar divided by the signal of the maximum-energy bar in the event. In this way, the values are limited to the [0;1] range. We use 8-bit precision to decode signals in each pixel. We also tested 16-bit  [37]. The candidate track is selected using either the ideal identification (left) or the standard identification (right) method [13]. All distributions are normalized to unity.   Figure 4. Convolutional neural network for the particle direction prediction in the BGO calorimeter. The output of the convolution layers is a set of 100 variables augmented with two additional variables: the total deposited energy and maximum-bar energy (both in units of TeV). It is followed by a fully-connected layer of 50 neurons, in turn, fully connected to the 4 output variables. Activation in all layers is done with the Rectified Linear Unit (ReLU) function, except for the last layer where the activation is linear. An additional fully-connected layer of 4 outputs with linear activation is added to perform the data/MC correction (alignment).
precision and found no significant difference in the network performance. The output of the network,x, is a vector of 4 variables, which correspond to particle coordinates (x and y) in the first plane and the last plane of the STK. The choice of the output variables is motivated by the fact that the BGO direction prediction serves as a first approximation for the particle trajectory finding in the tracker. As a target for training we use the mean squared error: wherex tru is the corresponding vector of true particle coordinates in the first and the last planes of the STK and N is the number of events in the batch. The training of the network has been done with MC data consisting of simulated proton, helium, and electron particles passing through the DAMPE detector. The CNNs were trained separately for the low-, middle-and high-energy ranges, corresponding to particle kinetic energy between 10 GeV and 10 TeV, between 1 TeV and 1 PeV, and between 10 TeV and 1 PeV, respectively. This yields a better accuracy compared to the case when a single model is trained on the entire energy range. We intentionally choose highly overlapping energy intervals for the three models in order to facilitate smooth transitions between the models. The output accuracy of the low-and middle-energy range models overlap in the region in which the deposited energy is about a few hundreds of GeV, while the accuracy of middle-and high-energy models overlap in the region of few tens of TeV. Hence, the transition thresholds were chosen at 300 GeV and 20 TeV, respectively 5 . The fact that the multiple energy range training shows a better performance compared to a single energy range model is due to few factors. Firstly, at energies below ∼100 GeV, considerably fewer pixels are fired in the calorimeter, therefore the amount of information is significantly lower than at ∼TeV and higher energies. As we will show further in the paper, the accuracy of the calorimeter CNNs model at low energies is significantly worse than at high energies. Therefore it bottlenecks the training, biasing it towards the optimisation for low-energy events. Secondly, at the highest energies, the effect of the BGO readout saturation starts taking place, changing qualitatively the typical picture of showers in the calorimeter [72,77]. The latter cannot be inferred from the lower (or middle-range) energy data. Therefore it requires a dedicated model with a particular attention to the saturated events. Finally, we choose not to split models into more than 3 intervals, as we did not observe any significant gain for alternative configurations. In particular, we have tried splitting the low-energy interval into subintervals, but no further improvement in accuracy was seen, at least in the energy range where we perform cosmic-ray measurements with DAMPE 6 .
All the neural networks described in this paper are implemented in TensorFlow 2 7 and trained on Nvidia 2080 series GPUs. The training is done in batches of size 32. For the calorimeter CNNs, the fitting time is about 36 minutes/epoch for 10 6 batches. Training sample size ranges between O(10 4 ) and O(10 6 ) batches depending on the available MC statistics in different energy ranges (see Table 1). The Adam stochastic gradient descent algorithm [78] in its default configuration is employed. The learning rate is initially set to 10 −4 and controlled during the training by the TensorFlow's Reduce-on-Plateau method. During the data processing the model inference is done on conventional CPUs. Inference time amounts to 0.24 s/event, evaluated on AMD Opteron 6274 processor 8 .
About 25×10 6 , 3×10 6 and 1×10 6 events were used for the training in the low-, middleand high-energy range, respectively. The MC samples were divided into training, validation, and test samples in an approximate proportions of 80%-10%-10%. As shown in Figure 5, the convergence of the gradient descent algorithm in the network optimization is achieved in about 100 epochs. Note that a gap corresponding to some lack of generalization can be observed, which is mostly due to the limited MC statistics at the highest energies. We will quantify this effect in the further analysis as a systematic uncertainty. As we will show later, its impact is nearly negligible. We have also tried reducing the complexity of the CNNs model as well as adding dropout layers to it, which resulted in a significantly worse performance of the network. The vertical dimension of the convolutional filters is set such that it spans two consequent layers of the calorimeter, both in xz and yz projections. We have also tested a network with convolutional filters of dimension 3×5 instead of 4×4, which resulted in a significantly lower accuracy. Finally, the majority of simulated data are generated with GEANT4. We have also added the FLUKA samples in the training, which marginally improved the performance at the highest energies thanks to the increased training statistics.
Since the shower-shape characteristics differ only marginally between different ions, the performance of the trained network on particles beyond helium (e.g. lithium, beryllium, boron, carbon, oxygen, iron) was found to be similar to that of proton and helium. We Mean absolute error validation (mm) Figure 5. Evolution of the mean squared error and mean absolute error in training for the calorimeter neural network model in the middle-energy range (particle kinetic energy larger than 1 TeV). The steps correspond to the reduction of the learning rate by a factor of two.
found no significant improvement if ions heavier than helium were added to the training. At the same time, it is well known that the shower shape characteristics for electromagnetic and hadronic interactions are fundamentally different. For this reason, we added simulated electrons corresponding to about 25% of the training sample, to ensure high prediction accuracy of the network for all particle species. It is worth noting that adding electrons did not degrade the performance of the CNNs with respect to the hadronic showers. We have also tried training a dedicated "electromagnetic" model by increasing the fraction of electrons to 80% and found no significant performance improvement on electromagnetic showers compared to the baseline model. On the contrary, we have also tested the model with a relatively low electron content, about 3% and found it to have significantly worse performance. Hence, we conclude that the training is not particularly dependent on the exact relative content of hadronic and electromagnetic showers, as long as they are of a comparable scale. The performance of the developed algorithm is illustrated in Figure 6. For the sake of clarity, we convert the output variables of the network into conventional azimuth angles and intercept coordinates of a particle in two orthogonal projections of the DAMPE coordinate system. The distributions are derived from the test MC samples.
To combine the BGO direction prediction with the further trajectory finding in the tracker, precise alignment between the BGO and STK has to be performed. While in simulation the alignment is perfect by definition, the trained CNNs model to be applied to the real data has to be corrected for possible misalignments between BGO and STK. To perform this task, we add another fully-connected layer of 4 neurons at the output of the network, while keeping the other adaptive parameters frozen, as illustrated in Figure 4,  Figure 6. Residual distributions of the azimuthal angle, θ x (left), and intercept, x 0 (right) for MC events. The residual is obtained as the difference between the prediction of the CNNs model and the true particle direction. Similar distributions obtained with the standard BGO direction reconstruction [1] are shown for comparison. While the results for the xz plane of DAMPE are shown, the yz distributions share the same behavior. Intercepts are calculated at the z = 0 plane, which corresponds to the border between the BGO and STK subdetectors, see Figure 1. and separately train this layer directly on the data. The selection, in this case, is performed to ensure the presence of exactly one clearly defined track in the STK, obtained with the standard reconstruction algorithm, which we consider as a "true" particle direction in the corresponding training. Hereafter we refer to it as a clean selection. Furthermore, in Figure 7 we use the clean selection to evaluate the CNNs model direction reconstruction performance on the data. Mutually exclusive data samples are used for the BGO-STK alignment in the CNNs model training and the evaluation of the residuals, each corresponding to about 10 4 events. For illustration purposes, we also show the CNNs model prediction on the data if no BGO-STK alignment is applied. It is worth noting that the BGO-STK alignment does not depend on the particle energy. In other words, the additional layer (the right-most layer in Figure 4), being trained in one energy region of the data, works equally well at other energies, which is expected since the (mis-)alignment is a purely geometrical effect. Figure 8 shows the 68% and 95% containment radius using the developed CNNs particle direction prediction. The direction of the primary particle to be compared with the CNNs prediction is either the MC truth direction or the standard reconstructed STK track, obtained after applying the clean event selection. The latter is done in order to allow for the data/MC comparisons, as no true particle direction is known in the real data. The effect of the CNNs generalization gap is quantified as a systematic uncertainty. It is estimated as the difference between the results obtained on the training MC sample and the test MC sample (which was excluded from the CNNs training). The impact of the generalization  Figure 7. Residual distributions of the azimuthal angle (left) and intercept (right), in both DAMPE projections, for MC events and flight data with and without the additional layer in the CNNs model responsible for the BGO-STK alignment. The clean event selection is applied, requiring the presence of exactly one well-defined STK track, reconstructed with the standard procedure [1]. This track is considered as a reference particle direction.
gap is observed only for hadronic showers with a deposited energy larger than ∼10 TeV.
As shown in Figure 8, a significant improvement over the standard shower-axis algorithms can be observed. In particular, the 68% angular (position) containment for hadronic showers is lower than 0.4 • (1.7 mm) at 100 TeV of deposited energy, which is more than 5 (7) times better than with the standard algorithm (Figure 8-a). For the electromagnetic showers, the 68% angular (position) containment reaches about 0.35 • (1.4 mm) at 5 TeV, with the corresponding improvement of about 6 (8) times with respect to the standard DAMPE algorithm. Some relatively small discrepancy between data and MC can be ob- . 68% and 95% containment of particle direction, in terms of azimuthal angle, ∆Θ 2 x + ∆Θ 2 y , and intercept, ∆x 2 0 + ∆y 2 0 , obtained with the calorimeter CNNs model, as a function of the total deposited energy in BGO: (a) proton and helium, (b) electrons. Either the true particle direction (tru) or the clean STK track (trk) is taken as a reference. A combined proton plus helium (trk reference) MC is shown in (a) for comparison with the data. The systematic errors due to the CNNs generalization gap are shown with shaded bands. The results for the standard DAMPE calorimeter reconstruction are overlaid for comparison. served at the highest energies, which is likely attributed to the possible imperfections in the BGO simulation. It is interesting to note that the angular resolution at TeV and higher energies (0.35-0.4 • ), while being obtained purely from the calorimeter, is not much worse than the typical angular resolution of dedicated tracking subdetectors (0.05-0.5 • ) of cosmicand gamma-ray missions, including DAMPE itself. Moreover, it is better than the typical gamma-ray pointing accuracy of DAMPE, Fermi-LAT, and CALET at the GeV scale, where the bulk of gamma-ray sources is observed [1,79,80]. At the same time, we also note that the advantage of the CNNs algorithm is marginal at low energies. The latter is expected since at low energies the shower is concentrated in very few pixels of the calorimeter image. In turn, the CNNs advantage manifests clearly at higher energies, as the amount of information per particle image increases.
Finally, the developed CNNs approach represents a significant improvement over the standard methods. We hypothesise that the architecture of the model can be improved even further, in particular by exploiting more efficiently the correlations between signals in alternating BGO layers, with more complex CNNs or, alternatively, Graph Neural Networks. We leave this subject for future studies.

Neural network tracker direction prediction
The above prediction of the calorimeter CNNs algorithm serves as a seed for the particle direction reconstruction with the tracker. Even if simply combined with the conventional Kalman algorithm [6,37] it is expected to improve the accuracy compared to the standard DAMPE track reconstruction. However, the problem of the correct track identification from the ensemble of Kalman track candidates would remain open. Instead, our goal is to develop an algorithm which provides a single particle trajectory as close as possible to the real one. The developed algorithm is also based on CNNs, as illustrated in Figure 9. First, the direction prediction from the calorimeter CNNs model is projected onto the tracker, selecting the hits within a certain window, hereafter called the Region of Interest (RoI) 9  is the corresponding prediction of the calorimeter CNNs model. Similar definitions holds for δ x,bot , δ y,top and δ y,bot . The image represents a 20×20 mm window, with a pixel resolution of 50 µm 10 . In the event of the BGO CNNs model prediction being perfectly correct, the true position of a particle would be placed directly in the center of the image.
It is worth noting that, contrary to the BGO, we do not use raw (non-transformed) images of the STK for the track reconstruction. The reason is that the nature of a particle's passage through the tracker is fundamentally different from the one in the calorimeter. 9 We use a window of ±10 mm, corresponding to ≥ 99% containment of a true particle direction. 10 Pixel resolution is chosen to match the position resolution of the STK [6].   Figure 9. Hough image of a typical Helium event and the architecture of the tracker convolutional neural network. Big (black) and small (red) circles represent the true and the reconstructed trajectory of a primary particle, respectively. Similar to the calorimeter CNNs model, the ReLU activation function is used in all layers except for the last one, which has a linear activation. The 400×400 dimension is determined by the chosen RoI size and pixel resolution.
In particular, the raw tracker image does not necessarily show a pre-shower profile correlated with the primary particle direction. The true particle track may be hidden among secondary-particle hits with higher signals. In fact, we have also tried to develop a CNNs model which uses raw STK images but no satisfying solution was found 11 . Another possible alternative could potentially lie in the Graph Neural Networks domain [82], which we do not cover in this paper. We opt for the Hough transform as a simple but powerful enough way of structuring topologically the hits. As we will show later in the paper, the Hough transform combined with the CNNs allows to achieve an excellent particle trajectory reconstruction in DAMPE.
Similar to the calorimeter network, 8-bit precision is used to store information in each pixel. Two tracker projections are mapped on separate images. Moreover, each projection is split further into two images, consisting of hits with STK signal (charge) either below or above a threshold Z thr , respectively 12 . In other words, for each projection, there is one image with the STK hits more likely corresponding to protons and the other one with the hits which potentially correspond to helium or heavier ions. In this way we partially encode the STK signal information into the image. As a result, the input image has a dimension 400×400×4. The internal STK alignment is applied to correct for the hit positions in the 11 The raw-image CNNs in turn prove to be useful for a classification type problem, as will be shown in Section 6. 12 The performance of the algorithm is not sensitive to the exact choice of the Z thr value. We have tested values in the range from 1 to 2, found no significant difference, and chosen Z thr = √ 2. At the same time the algorithm performance is better than in the case of no splitting. tracker [5,6]. Note that in order to avoid potential problems due to a (mis-)modeling of the readout saturation for heavy ions, we do not add more detailed STK signal information into the Hough image. Moreover, even for low-charge particles, the STK signal in the MC simulations may be slightly different from the flight data [6]. The described image design allows us to mitigate the effect of the simulation inaccuracy, such that the trained model can adapt correctly to the real data.
The image is provided as an input to the dedicated regression network, whose goal is to predict the true "position" of a particle on the image. The building blocks of the CNNs model are depicted in Figure 9 right. The fitting time of the network is about 8 hours/epoch for 10 6 batches. The training is also done in three energy ranges, as described in Section 4. Similar to the BGO calorimeter, the STK tracker starts saturating at the highest energies due to the increased density of secondary particles and the limitations of the data reduction algorithms in the STK readout electronics [83]. Hence, the same argumentation holds for having different models focused on specific energy ranges, for both the BGO calorimeter and the STK tracker. The CPUs inference time of the tracker CNNs model is about 0.35 s/event 13 , comparable to that of the calorimeter model. For reference, the average time consumed by conventional track pattern recognition in the reconstruction of DAMPE flight data is 3.4 s/event.
Once the particle direction is obtained from the STK CNNs model, the STK hits are assigned to it using a simple distance matching, as described in Section 3, forming the final track. Note that unlike the conventional Kalman approach, we do not perform fitting of the hits. There is no need for a further track selection since only one track is provided per event.
In summary, we note that the particle direction prediction is done with two CNNs algorithms chained together. The first one infers the particle direction from the calorimeter, while the second one yields a more precise prediction from the tracker, using the prediction provided by the calorimeter CNNs model as RoI. At energies below ∼1 TeV, however, the precision of the calorimeter CNNs model is not sufficient to provide an accurate RoI for the tracker CNNs model 14 . Hence we add an additional intermediate step to account for the lower accuracy of the calorimeter inference. In this step we use a coarse Hough CNNs model, identical to the one described above but with a lower pixel resolution, 400 µm instead of 50 µm, and a wider window, ±80 mm instead of ±10 mm. As a result, at low energies the particle direction prediction happens in three steps: the calorimeter CNNs model prediction, the coarse tracker inference (400 µm), and finally the precise tracker inference (50 µm) 15 .
As a figure of merit of the tracking algorithm, we use the combined reconstruction and selection efficiency ( ) of the track as follows. First, we perform the event selection 13 We expect the inference time to decrease dramatically if the inference is done in batches of events, using GPUs instead of CPUs. In the current DAMPE software framework, the data processing is done on an event-by-event basis.
14 For events with deposited energies below ∼1 TeV, the probability that the true particle direction falls into the ±10 mm RoI does not meet the requirement of 99%. 15 The coarse tracker CNNs model is added to the reconstruction chain if the deposited particle energy in BGO is lower that 1 TeV, which approximately corresponds to a true particle energy of about ∼3 TeV [12,13].
as described in Section 3, using the true particle direction as the "track". Next, we make distributions of the STK signal ( Figure 2) in different energy bins and define the regions of 85% (95%) signal containment for proton and helium peaks in each bin. Then we repeat the procedure using instead of the true direction of the particle, the direction given by the track reconstruction and identification algorithm under test. In this procedure, the number of signal events are again counted in the same 85% (95%) containment interval derived from the true track direction. We consider three algorithms for the comparison: (1) standard reconstruction with the ideal identification; (2) standard reconstruction with the standard identification; (3) the developed CNNs algorithm. Finally, the efficiency is defined as the ratio of the number of events in the 85% (95%) window obtained with one of the three algorithms, to the number of events (in the same window) obtained with the true track. The results are shown in Figure 10 and Table 2. The standard track reconstruction with the ideal identification has an efficiency in the 96%-98% range up to 100 TeV and reduces to about 70-75% and 50-55% at PeV for proton and helium, respectively. The drop of the reconstruction efficiency at few hundred TeV is explained by the decreased accuracy of the standard BGO direction reconstruction at highest energies, caused by the saturation of the BGO readout [72,77]. The combined standard track reconstruction and identification efficiency is on average about 80% and 90% for proton and helium respectively, below 100 TeV, and drops sharply towards PeV energies (below 35% and 40% respectively), as it becomes increasingly hard to identify the correct track with the classical selection methods [12,13]. For the developed CNNs algorithm, the tracking efficiency is higher than 98% at energies up to a few hundred TeV, where it reduces slightly to about 96-97% at PeV. The estimated uncertainty related to the CNNs generalization gap (both calorimeter and the tracker CNNs models) is negligible below 500 TeV. At higher energies, it does not exceed 1% and 2% for helium and proton, respectively. In order to quantify the performance of the developed calorimeter and tracker CNNs separately, we evaluate the tracking efficiency using the following two options: • (a) calorimeter CNNs model combined with the the standard track pattern recognition based on the Kalman filter approach; • (b) conventional calorimeter shower axis direction reconstruction combined with the tracker CNNs model 16 .
The corresponding results are shown in Figure 11 and Table 3. For case (a), the track reconstruction efficiency assuming the ideal selection algorithm is equivalent to that of the combined calorimeter and tracker CNNs approach. This result confirms that the calorimeter CNNs algorithm by itself enables a near ∼99% track reconstruction efficiency even if combined with the standard Kalman pattern recognition. In this case, however, the problem of the identification of the correct track remains open, especially for protons. In particular, a significant drop of the proton track selection efficiency is observed at the highest energies,  Figure 10. Efficiency of the track reconstruction and identification derived from MC as a function of the particle kinetic energy: (circles) the developed CNNs algorithm; (squares) standard track reconstruction with the ideal identification; (triangles) standard track reconstruction with the standard identification. Proton (top) and helium (bottom) cases are shown. The gray shaded area indicates the region where the two-step tracker CNNs prediction is used (1 TeV deposited energy roughly corresponds to ∼3 TeV particle kinetic energy [12,13]). CNNs STK-only (85%) CNNs STK-only (95%) Figure 11. Efficiency of the track reconstruction and identification derived from MC as a function of the particle kinetic energy: (circles) the combined chain of tracker and calorimeter CNNs (the same as in Figure 10); (squares and triangles) standard track reconstruction combined with the calorimeter CNNs model, using either the ideal (circles) or standard (triangles) identification; (diamonds) tracker CNNs model combined with the standard calorimeter shower axis direction reconstruction. Proton (top) and helium (bottom) cases are shown. The gray shaded area indicates the region where the two-step tracker CNNs are used for the combined CNNs reconstruction. For the tracker-only CNNs model, the two-step prediction is used in the entire energy range.  Table 2. Efficiency of the track reconstruction and selection derived from MC using different techniques: standard track reconstruction with the standard identification (left column); standard track reconstruction with the ideal identification (middle column); the developed CNNs algorithm (right column). Proton (top) and helium (bottom) cases are shown. Errors for the first two algorithms are statistical only. Errors for the CNNs algorithm also include the uncertainties related to the CNNs generalization gap. For the sake of brevity, selected points from Figure 10 corresponding to different energy decades are shown. For the CNNs reconstruction algorithm, a two-step tracker CNNs prediction is used in the first two energy bins.
which is related to the increased multiplicity of hits in the tracker when approaching the PeV regime. On the other hand, the combined calorimeter and tracker CNNs approach provides similar track reconstruction performance and at the same time solves the track identification problem. Moreover, as noted above, even with the conventional (CPUs) hardware, the time required for a single inference of the tracker CNNs model is one order of magnitude smaller than that required by the standard track pattern recognition in DAMPE.
For case (b), the efficiency is identical to the case when both calorimeter and tracker CNNs are used, but only until 100 TeV. At higher energies, a sharp decrease of the efficiency is observed. The latter is related to the resolution degradation of the conventional shower axis direction reconstruction, caused by the saturation of the BGO calorimeter [72,77].
With the above argumentation, the two CNNs algorithms can be considered complementary: the tracker CNNs model provides a relatively fast and reliable way of reconstructing the primary particle track, without the need of further track identification, while the calorimeter CNNs model recovers the drop of the tracking efficiency at the highest energies. An open question remains whether it is possible to perform a purely tracker-based CNNs reconstruction of particle trajectory without relying on the calorimeter or, at least, not requiring a dedicated calorimeter CNNs model. This could be potentially achieved by introducing at the highest energies an additional (third) step to the tracker CNNs re-  Table 3. Efficiency of the track reconstruction and selection derived from MC using different combinations of the developed CNNs algorithms and standard techniques: (left and middle columns) standard track reconstruction combined with the calorimeter CNNs model, using either the standard (left column) or ideal (middle column) identification; tracker CNNs model combined with the standard calorimeter shower axis direction reconstruction (right column). Proton (top) and helium (bottom) cases are shown. Errors include statistical uncertainties and the uncertainties related to the CNNs generalization gap, summed in quadratures. For the sake of brevity, selected points from Figure 11 corresponding to different energy decades are shown. For the tracker-only CNNs model, the two-step prediction is used in all energy bins. construction, with a large pixel size. In other words, a "very coarse" Hough model could be added to the CNNs chain prior to the coarse (400 µm) and the precise (50 µm) ones. Our estimates show that, if the conventional shower axis direction reconstruction is used at 100 TeV-PeV, a 99%-containment RoI would correspond to a window of ±160 mm, twice bigger than that of the current coarse tracker model. While not strictly needed for achieving the goals of this paper, the purely tracker-based particle trajectory reconstruction at 100 TeV and higher energies could be an interesting subject for future research, in particular for the HERD detector. At this point, we can conclude that the combination of the calorimeter and tracker CNNs efficiently solves the problem of track reconstruction and identification with DAMPE.
Finally, we have also tested the developed approach with particles heavier than helium. For intermediate mass ions like boron, carbon, nitrogen and oxygen, the tracking efficiency of the CNNs algorithm is in the 96%-98% range. We found that including these particles in the tracker CNNs model training only marginally (less than 1%) improves their tracking efficiency. For iron, the tracking efficiency at the highest energies is about 92% if no iron is included in the training, and more than 96% if it is included. At low energies, the iron tracking efficiency drops dramatically to 50%-60% if no iron MC is used in the training, and restores to about 95% if it is included. Hence, we conclude that the tracker CNNs models are weakly sensitive to the differences between cosmic-ray particle species. The only strong dependence on particle type is observed for iron at low energies, which is likely attributed to the coarse CNNs model. Further optimisation of CNNs models for cosmic-ray analyses with elements heavier than helium is beyond the scope of this paper and will be considered in the future research.
Validations of the complete CNNs reconstruction chain with the proton and helium data are shown in Section 7.

Neural network inelastic classifier
Note that for the analysis in Section 3, in particular in Figures 2 and 3, we have applied the selection criterion requiring no inelastic interaction in PSD, imposed on the MC truth level. Obviously, such a cut cannot be applied to real data. Therefore, to perform the rejection of events that interact inelastically before the tracker, we have developed a dedicated classifier based on the CNNs approach. Hereafter it is referred to as the inelastic classifier.
The classifier consists of two independent CNNs models, as shown in Figure 12. For the first model, we use a network which takes as input a raw image of the STK. Similar to the BGO image in Figure 4, the raw STK image is constructed as a mixture of two projections. Since there are 12 layers (6 layers per projection) and each layer is read out by two groups of electronic boards (3072 readout channels per group), the STK image has a total dimension of 24×3072 pixels. For the second model, we recycle the network architecture from Figure 9, replacing the last "Dense 4" with the "Dense 1" layer. The pixel resolution of the input Hough image is 400 µm. Finally, we multiply the outputs of the two networks. Each of the two models demonstrate comparable performance if used standalone. At the same time, a simple multiplication of the outputs of two models yields better performance than any of them used separately. We deliberately choose not to mix the Hough and raw STK images in a single network, in order to maintain the architecture modularity and its interoperability for future applications. The performance of the total inelastic classifier is shown in Figure 13, benchmarked against a typical PSD charge consistency cut.
The training of the two models is performed using the same training and validation MC sets as for the CNNs in the previous sections. We do not split the training in different energy ranges, since no significant gain in accuracy is observed in this case. The binary cross-entropy is used as a loss function. The models are fitted separately on the same input data. Notably, the two classifier models tend to a better generalization if some dropout is added. We tested 5%, 10% and 20% for both networks. The former two options demonstrated comparable accuracy, while the latter one degraded the performance. Hence we have chosen a 10% dropout for the training. The performance of the total classifier evaluated on the test and training samples did not show any significant deviation from one another, indicating the absence of overfitting. The Receiver Operating Characteristic (ROC) curves in Figure 13 are evaluated on the entire sample.  Figure 12. A raw STK image of a typical proton interacting inelastically inside PSD and the architecture of the inelastic classifier CNNs. The blue box on the right corresponds to the network in Figure 9 with the last layer replaced with "Dense 1". The ReLU activation function is used in all layers except for the last one, which is activated by a sigmoid function. The total classifier is the product of the outputs of the two networks.  Figure 13. Sample Receiver Operating Characteristic (ROC) curve of the inelastic CNNs classifier for selecting events that do not interact inelastically before the tracker (i.e. inside PSD or the passive support structures before the STK). For comparison, another ROC curve is shown for the typical PSD charge consistency cut, imposed on the difference between the highest and the lowest signal in the different PSD bars crossed by the particle. Two particle kinetic energy ranges are considered.
It is worth noting that unlike the regression problem of the particle trajectory reconstruction, the classification of the inelastically interacting events can be successfully done with the raw tracker image. Notably, a pattern of an interacting particle can be spotted even by eye in Figure 12. Therefore the CNNs used with a raw image is the right tool for classification in this case.
A peculiar alternating structure in the number of convolutional filters ( Figure 12) is introduced in order to gain maximum performance, while maintaining an optimal model complexity. In particular, if 32 or more filters are used in all convolutional layers, including the last one, the model exhibits less generalization, which degrades the accuracy by about 1-2%. Alternatively, a lower but fixed number of filters also yields inferior performance. We have noticed that the model is particularly sensitive to the number of filters in the very first layer. We have also tried increasing the horizontal span of the convolutional filters. The latter, however, did not result in any further improvement, hinting that inelastic interactions manifest more clearly through the localised fine patterns in the raw STK image, rather than large-scale correlations between the alternating STK projections.
In addition to the presented model, we have also tested if a neural network which takes PSD data as an input can serve as an inelastic classifier. In this case, we used a network architecture inspired by the calorimeter CNNs model in Section 4. However, the performance of such network was never close to the tracker CNNs model. This result is expected since the majority of the information about the particle passage through PSD is concentrated in at most 4 PSD bars crossed by the particle (see Figure 1). Moreover, as described in Section 7, at high energies the PSD signal gets severely contaminated, obscuring the footprint of the particle absolute charge and its inelastic interaction, if present.
The training time of the Hough-image and raw-image models for 10 6 batches is 12 hours/epoch and 6 hours/epoch respectively. An increased time for the Hough model compared to the regression case in Section 5 is explained by the addition of the dropout between all the layers. The inference (CPUs) time of two models is 0.35 s/event and 0.28 s/event respectively. The use of the inelastic classifier with data is demonstrated in the next section.

Application case: proton and helium analyses
In this section, we use the developed CNNs algorithms to perform the absolute particle charge identification in the STK. That is, we repeat the event selection described in Section 3, replacing the standard track reconstruction with the one developed in this work. Also, we add a selection criterion on the inelastic classifier to require that no inelastic interaction occurs in the PSD. We choose a cut value to maintain the true positive rate of selecting non-interacting events higher than 95% in the entire energy range. The results are shown in Figure 14. The distributions at a deposited energy of 1 TeV and above are shown -the distributions at lower energies have similar behavior -a good agreement between flight data and simulation is observed. To account for minor differences between flight data and simulation 17 smearing corrections of peak positions/widths in proton and helium MC samples have been performed. The systematic uncertainty of the proton and helium charge identification efficiencies, related to the smearing correction, is conservatively estimated to be within 1%. Thanks to the clean proton and helium peak identification with the CNNs approach, the relative background cross-contamination for either proton or helium analyses at all energies does not exceed 1% (2%), at 90% (95%) STK signal selection efficiency. Note that in the case of the standard selection, the corresponding backgrounds at the highest energies reach 3% (5%) for proton and 10% (13%) for helium, respectively. The effect of applying the inelastic classifier can be clearly observed in Figure 14, as it helps eliminate the middle and far "tails" around the proton and helium peaks 18 .
Next, we perform validations of the CNNs track reconstruction on flight data. While there is no way of knowing the true particle direction outside the simulation, we can still validate the CNNs algorithm using the standard track reconstruction algorithm as a reference. Namely, we evaluate the ratio r of the number of reconstructed events for the developed CNNs algorithm and the standard approach. The resulting comparison is shown in Figure 15. This ratio is equivalent to the ratio of "CNNs reco" and "reco & selection" efficiencies from Figure 10: Note that the two approaches are completely independent, hence specific problematic behaviors (if any) of the CNNs approach that can potentially arise, for example, due to some imperfections in the simulation accuracy, would likely manifest in such ratio. It can be seen that the improvement of the developed CNNs track reconstruction over the standard approach is fully consistent with the MC prediction. The error bands are mostly attributed to the standard track reconstruction, selection, and particle identification, namely to the track selection uncertainty at the lowest energies and background estimation uncertainty at the highest energies. From this comparison, we conservatively estimate the systematic uncertainty of the CNNs tracking approach to not exceed 1-2%. Finally, we summarise the characteristic execution time of the developed algorithms implemented in the DAMPE data processing pipeline. Up to five CNNs inferences are performed per event: one for the calorimeter, two for the tracker trajectory reconstruction, and two for the inelastic classifier. The total CPUs time consumption for the five model inferences is about 1.6 s/event. Notably, the standard track reconstruction algorithm of DAMPE takes on average 3.4 s/event.

Summary and discussion
Particle trajectory reconstruction represents one of the key challenges on the way of extending the direct cosmic-ray measurements towards the PeV landmark. To tackle this problem, we have developed and implemented a new tracking paradigm for the calorimetric cosmic-ray detectors in space. Our approach is based on the deep learning methods, in particular Convolutional Neural Networks (CNNs). The reconstruction of the cosmic ray   Figure 15. The ratio of the number of reconstructed events with the developed CNNs algorithm and the standard track reconstruction and selection algorithm, as a function of deposited energy in the detector. It is obtained for the 90% STK signal containment intervals for both proton and helium distributions ( Figure 14). The ratios for 85% and 95% containment have similar behavior. Red dashed lines indicate the region of ±1% data/MC deviation, to guide the eye.
trajectory is performed in multiple steps: first, a "rough" prediction of the impinging particle direction is inferred by the CNNs from the image of the calorimeter subdetector; the calorimeter prediction defines the RoI in the tracker, where the further reconstruction is done; then, the Hough-transformed image of the tracker RoI is used for inferring the precise trajectory, using another dedicated CNNs model. Finally, specific signal hits in the tracker are assigned to the predicted trajectory based on a simple distance matching, forming the reconstructed track of the impinging cosmic ray. In addition, to enhance the analysis we have also developed a CNNs-based classifier to reject events with early pre-showers, for which the cosmic ray identity (absolute charge) cannot be determined with the tracker. Unlike the conventional approach, where multiple candidate tracks are constructed and then the best track is selected on the analysis level, our algorithm yields one track from the beginning -a selection in the further analysis is not needed. We have benchmarked our algorithm for the case of proton and helium selection. The developed algorithm demonstrates excellent tracking efficiency, about 98% in the entire energy region of the DAMPE data. At the same time, the absolute charge mis-identification is very low, in the 1-2% range. While implemented for DAMPE, the designed approach is equally relevant for next-generation experiments, like HERD [34,59]. Moreover, HERD is targeting cosmic ray measurements at energies higher than DAMPE. Hence the problem of track reconstruction will manifest itself on a bigger scale, and the outlined CNNs approach appears as a credible potential solution.
The two key elements of the approach, the tracker and the calorimeter CNNs are complementing each other. The former allows to perform a precise particle trajectory reconstruction in the tracker, given that the RoI is provided by the calorimeter, while the latter one enables to recover the loss of track reconstruction efficiency at the highest energies, caused by the calorimeter saturation. It is worth noting that the set of the tracker CNNs models is sufficient by itself for reaching the target performance up until ∼100 TeV, even if combined with the classical calorimeter shower axis direction reconstruction. We choose not to mix the tracker and the calorimeter data in a single CNNs model in order to facilitate: the cross-calibration of the algorithms based on different subdetectors; the interoperability of models for other applications, beyond the proton and helium analyses; the validation of the models with the data; the optimal integration of the developed algorithms in the data processing pipeline.
Aside from the tracker CNNs, the obtained results for the calorimeter CNNs direction prediction look interesting per se. The developed algorithm at the highest energies outranks the conventional approach of analytical fitting of the shower axis, by at least a factor of 5. In particular, the 68% angular containment of electromagnetic showers at TeV and higher energies is better than 0.4 • . Even more intriguing, the CNNs approach applied to a finer and thicker calorimeter having a 3-D granularity, like that in HERD, is naturally expected to yield even better results. A possible question at this point could be whether a purely calorimeter-based direction reconstruction of electromagnetic showers could replace a dedicated tracker, at least at the highest energies. An answer to this question would be important in the design phase of new instruments, in particular with respect to their gamma-ray detection capability and the need for dedicated photon converters. This remains a subject for future research.

Acknowledgment
The DAMPE mission was funded by the strategic priority science and technology projects in space science of the Chinese Academy of Sciences. In Europe, the experiment and data analysis is supported by the Swiss National Science Foundation and the National Institute for Nuclear Physics (INFN), Italy. Development of the deep learning techniques and related data analysis is funded by the European Union's Horizon 2020 research and innovation programme (grant agreement No 851103).
Ukrainian authors, M. Deliyergiyev, A. Kotenko, and A. Tykhonov are indebted to the resilience and courage of the Armed Forces of Ukraine, for keeping their loved ones safe during the course of this work, without which the presented results would have been impossible to achieve.