Real-Time Event Reconstruction and Analysis in CBM and STAR Experiments

Within the FAIR Phase 0 program, the algorithms of the FLES (First-Level Event Selection) package developed for the CBM experiment (FAIR/GSI, Germany) are adapted for the STAR experiment (BNL, USA). The use of the same algorithms creates a bridge between online and offline, which makes it possible to combine online and offline resources for data processing. In this way, an express data production chain was created on the basis of the STAR HLT farm that extends the functionality of the HLT in real time up to physics analysis. It is important that the express analysis chain does not interfere with the standard analysis chain. A particular advantage of express analysis is that it allows the calibration, production and analysis of the data to begin immediately after they are collected. The use of express analysis is therefore beneficial for the BES-II data production and helps to speed up scientific discovery by making it possible to obtain results within one year after the end of data acquisition. The specific features of express data production are presented and discussed, as well as the results of online production and analysis, such as the real-time reconstruction of short-lived particles in the BES-II STAR environment.


Introduction
Within the framework of the Facility for Antiproton and Ion Research (FAIR) project, a large international centre is being constructed to study the structure and fundamental properties of matter. It will be a new generation accelerator complex that will provide unique opportunities for detailed investigations in the most interesting areas of modern science: nuclear, hadron and particle physics, atomic and anti-matter physics, high density plasma physics, and applications in condensed matter physics, biology and bio-medical sciences [1].
In the Compressed Baryonic Matter (CBM) [2] experiment with heavy ions, the highest baryon densities will be created, and the properties of super-dense nuclear matter will be investigated in various extreme states that are similar, for example, to the conditions of matter in the center of neutron stars, where matter is at the final stage of evolution before the transition to a black hole. The CBM experiment will thus complement the experimental heavy-ion programs at other facilities. The scientific program of the CBM experiment includes:
• exploring the properties of super-dense nuclear matter;
• searching for in-medium modifications of hadrons;
• searching for the transition from dense hadronic matter to quark-gluon matter;
• searching for the critical endpoint in the phase diagram of strongly interacting matter;
• investigating the structure of neutron stars and the dynamics of core-collapse supernovae.
The experiment will measure rare and penetrating probes, such as dilepton pairs from light vector mesons and charmonium, open charm and multi-strange hyperons, together with collective hadron flow and fluctuations in heavy-ion collisions at rates of up to 10^7 collisions per second. CBM is characterized by high collision rates, a large number of produced particles, non-homogeneous magnetic fields and a very complex detector system. Event reconstruction is the most complicated and time-consuming task of data analysis in modern high-energy physics experiments. It is a key ingredient of success in the CBM experiment, with up to a thousand particles per central collision (Fig. 1). An additional complication in CBM is its continuous data stream, represented in the form of time slices. This makes the reconstruction of such 4-dimensional data with time stamps and the search for interesting physics extremely difficult. All of the above makes it necessary to develop fast and efficient algorithms for data analysis and to optimize them for running on a modern high-performance computer cluster [3].

First Level Event Selection
The First Level Event Selection (FLES) package [4,5] of the CBM experiment is intended to reconstruct online the full event topology, including tracks of charged particles and short-lived particles. The FLES package consists of several modules: the Cellular Automaton (CA) track finder, the Kalman Filter (KF) track fitter, the KF Particle Finder and the physics selection. In addition, a quality-check module is implemented, which makes it possible to monitor and control the reconstruction process at all stages. The FLES package is platform and operating system independent. The package is portable to different many-core CPU architectures, vectorized using SIMD (Single Instruction, Multiple Data) instructions and parallelized between CPU cores. All algorithms are optimized with respect to memory usage and speed.

Cellular Automaton (CA) track finder
The 4-dimensional (4D, space and time) Cellular Automaton (CA) track finder [5,6] takes as input hit measurements from the tracking detector in the form of a time slice, which includes time as well as spatial measurements. The track-finding procedure starts with combining the hits into triplets, i.e. combinations of three hits on adjacent stations. The triplet structure was chosen because it allows one to estimate the momentum of the particle that could have produced it. Triplets with two common hits are combined into track candidates. A track candidate must survive a dedicated selection, based on the track length and the calculated χ2 value, to be accepted as a reconstructed track.
The input time information is used in the algorithm to the same extent and in a similar manner as the spatial coordinates. The same logic is applied while constructing triplets: the hits in a triplet should belong to the same particle, and therefore they should correlate not only in space but also in time. The resulting track reconstruction efficiencies for event-by-event analysis (so-called 3D analysis) and for the 4D case (time measurement included together with the 3-dimensional spatial information) while reconstructing time slices are similar; thus, there is no efficiency degradation in the much more complicated case of time slices. The same holds for the speed of the 4D CA track finder with respect to the 3D case.
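The triplet-building step with a combined space-time compatibility requirement can be sketched as follows. This is an illustrative toy, not the actual FLES code: the hit structure, the single transverse coordinate, the straight-line "momentum" cut and the tolerance values are all invented for the example.

```python
# Toy sketch of CA-style triplet building: hits on three adjacent stations
# are combined only if they are compatible both in space and in time.
# Hit fields, layout and tolerances here are hypothetical.
from dataclasses import dataclass
from itertools import product

@dataclass
class Hit:
    station: int
    x: float   # transverse coordinate [cm]
    t: float   # hit time [ns]

def make_triplets(hits, slope_tol=0.1, time_tol=5.0):
    """Combine hits on stations (s, s+1, s+2) into track-candidate triplets."""
    by_station = {}
    for h in hits:
        by_station.setdefault(h.station, []).append(h)
    triplets = []
    for s in sorted(by_station):
        a_list, b_list, c_list = (by_station.get(s + i, []) for i in range(3))
        for a, b, c in product(a_list, b_list, c_list):
            # crude curvature/momentum estimate: deviation from a straight line
            straight = abs((b.x - a.x) - (c.x - b.x)) < slope_tol
            # 4D requirement: the three hits must also be close in time
            in_time = max(a.t, b.t, c.t) - min(a.t, b.t, c.t) < time_tol
            if straight and in_time:
                triplets.append((a, b, c))
    return triplets

hits = [Hit(0, 1.0, 10.0), Hit(1, 2.0, 11.0), Hit(2, 3.0, 12.0),  # one particle
        Hit(2, 3.05, 80.0)]                                       # out-of-time hit
print(len(make_triplets(hits)))  # 1 triplet; the out-of-time hit is rejected
```

The out-of-time hit on station 2 fits spatially but fails the time correlation, mimicking how the 4D algorithm suppresses combinatorics inside a time slice.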

Kalman Filter (KF) track fit
High precision of the parameters of particle trajectories (tracks) and of their covariance matrices is a prerequisite for finding rare signal events among hundreds of thousands of background events. Such high precision is usually obtained by using estimation algorithms based on the Kalman filter (KF) method. High speed of the reconstruction algorithms on modern many-core computer architectures can be achieved by optimizing with respect to computer memory (in particular, declaring all variables in single precision), vectorizing in order to use the SIMD instruction set, and parallelizing between cores within a compute node. After all tracks are found and their parameters are reconstructed, the tracks are grouped into events. This is done by clustering tracks based on their time parameters in the region of the target. The left distribution of Fig. 2 shows hits within a time slice for a 10^7 collisions per second interaction rate. One can see that the traditional grouping of hits into events is impossible at this stage. The track distribution in the middle, shown against the same hits, displays the grouping of tracks belonging to the same event. The right distribution shows, in different colors, the different clusters of tracks that are close in time in the target region. One can see that already at this stage it is possible to build events with an efficiency of more than 85%. The task of event building is finalized at the stage of searching for the primary vertex, where it is possible to additionally use the proximity of the tracks in space, as well as the more accurate time measurements of the TOF detector.
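The idea of grouping tracks into events by their time at the target can be illustrated with a minimal clustering sketch. The gap threshold and the track times below are invented for the example; the real procedure works on fitted track time parameters with their uncertainties.

```python
# Toy illustration of event building: tracks whose target-region times are
# closer than a gap threshold are assigned to the same event cluster.
def build_events(track_times, max_gap=20.0):
    """Group track times [ns] into clusters separated by more than max_gap."""
    events = []
    for t in sorted(track_times):
        if events and t - events[-1][-1] <= max_gap:
            events[-1].append(t)   # close in time: same event
        else:
            events.append([t])     # large gap: start a new event
    return events

# Three collisions at roughly 100, 400 and 1000 ns inside one time slice:
times = [101.0, 98.5, 103.2, 402.0, 399.1, 1001.5, 997.0, 1003.0]
events = build_events(times)
print(len(events))             # 3 reconstructed events
print([len(e) for e in events])
```

Single-pass clustering of sorted times keeps the procedure local and fast, in the spirit of the text; in practice the event boundaries are refined later at the primary-vertex search using spatial proximity and TOF timing.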

KF Particle Finder: a package for reconstruction of short-lived particles
Today the most interesting physics is hidden in the properties of short-lived particles, which are not registered directly, but can be reconstructed only from their decay products. A fast and efficient package for the reconstruction and selection of short-lived particles, KF Particle Finder [4,7], based on the Kalman filter (hence KF) method, has been developed to solve this task. The search for more than 100 decay channels is currently implemented (Fig. 3).


KF Particle Finder for Physics Analysis and Selection
In the package, all registered particle trajectories are divided into groups of secondary and primary tracks for further processing. Primary tracks are those produced directly at the collision point. Tracks from decays of resonances (strange, multi-strange and charmed resonances, light vector mesons, charmonium) are also considered primary, since they are produced directly at the point of the primary collision. Secondary tracks are produced by short-lived particles that decay away from the point of the primary collision and can be clearly separated. These particles include strange particles (K0s and Λ), multi-strange hyperons (Ξ and Ω) and charmed particles (D0, D±, D±s and Λc). After that, the tracks are combined according to the block diagram in Fig. 3. The package estimates the particle parameters, such as the decay point, momentum, energy, mass, decay length and lifetime, together with their errors. The package has rich functionality, including particle transport, calculation of the distance to a point or to another particle, calculation of the deviation from a point or from another particle, and constraints on the mass, decay length and production point. All particles produced in the collision are reconstructed at once, which makes the algorithm local with respect to the data and therefore extremely fast.
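The core of combining daughter tracks into a mother candidate is the invariant-mass calculation, which can be sketched as below. The daughter momenta are invented for illustration, and this toy omits what makes KF Particle Finder powerful in practice: the Kalman-filter propagation of full covariance matrices and the topological constraints.

```python
# Sketch of short-lived particle reconstruction: add the 4-momenta of two
# daughter tracks (here a proton and a pion, as in Lambda -> p pi-) and
# compute the invariant mass of the mother candidate. Masses in GeV/c^2.
import math

M_PROTON, M_PION = 0.93827, 0.13957

def four_momentum(p3, mass):
    """Build (E, px, py, pz) from a 3-momentum and a mass hypothesis."""
    px, py, pz = p3
    e = math.sqrt(px*px + py*py + pz*pz + mass*mass)
    return (e, px, py, pz)

def invariant_mass(p4a, p4b):
    """Invariant mass of the combined system sqrt(E^2 - |P|^2)."""
    e = p4a[0] + p4b[0]
    px, py, pz = (p4a[i] + p4b[i] for i in (1, 2, 3))
    return math.sqrt(max(e*e - px*px - py*py - pz*pz, 0.0))

proton = four_momentum((0.4, 0.1, 1.2), M_PROTON)   # hypothetical momenta
pion = four_momentum((-0.05, 0.02, 0.3), M_PION)
m = invariant_mass(proton, pion)
print(f"{m:.3f} GeV/c^2")  # candidate mass, to be compared with the Lambda mass
```

A candidate is kept if its mass falls near the expected mother mass; the real package additionally cuts on the χ2 of the vertex fit, the decay length and the pointing to the primary vertex.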
In addition, the KF Particle Finder reconstructs different decay channels of the same particle simultaneously, including decays with a neutral particle in the final state. The use of the Kalman filter at all stages of particle reconstruction makes it possible in many cases to get rid of the combinatorial background almost completely and to obtain clean sets of particles, which can serve as probes of various stages of the collision (Fig. 4).

Deep learning for quark-gluon plasma detection
In addition to the macroscopic inverse approach [8], we investigate the microscopic inverse approach by using artificial neural networks to classify processes in heavy-ion collisions. We have created two types of neural networks: fully-connected (FC) and deep convolutional (CNN) neural networks. These networks were then used to identify quark-gluon plasma simulated within the Parton-Hadron-String Dynamics (PHSD) microscopic off-shell transport approach for central Au+Au collisions at a fixed energy.
For the FC networks we use a 64-neuron fully-connected hidden layer with batch normalization, Leaky Rectified Linear Unit (LReLU) activation and dropout. The number of neurons is chosen empirically and is fixed to allow comparison of FC neural networks with one, two and three layers. Batch normalization and dropout are used to reduce overfitting and thereby improve overall performance. LReLU is used because it performs similarly to the most commonly used Rectified Linear Unit (ReLU) activation function but avoids the dead-neuron issue.
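The difference between the two activations mentioned above can be shown in a few lines. The leak slope alpha below is a common default, not necessarily the value used in this work.

```python
# ReLU zeroes negative inputs entirely, so a neuron stuck in the negative
# regime stops learning ("dead neuron"). Leaky ReLU keeps a small slope
# alpha for negative inputs, so some gradient always flows.
def relu(x):
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    return x if x > 0 else alpha * x

print(relu(-2.0), leaky_relu(-2.0))   # 0.0 vs -0.02: LReLU keeps a signal
print(relu(3.0), leaky_relu(3.0))     # identical for positive inputs
```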
The CNN consists of two three-dimensional convolutional layers, each followed by a max pooling layer, and two sequential fully-connected layers.
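The shape bookkeeping for such an architecture can be sketched as follows. The input grid size, kernel size and pooling factor are assumptions for illustration only; the paper does not specify them.

```python
# Back-of-the-envelope output shapes for two 3D convolution layers, each
# followed by max pooling, as in the CNN described above. Hypothetical
# configuration: 32^3 input grid, 3x3x3 kernels, no padding, 2x2x2 pooling.
def conv3d_out(size, kernel, stride=1, pad=0):
    return tuple((s + 2 * pad - kernel) // stride + 1 for s in size)

def pool3d_out(size, kernel=2):
    return tuple(s // kernel for s in size)

size = (32, 32, 32)                            # hypothetical 3D input grid
size = pool3d_out(conv3d_out(size, kernel=3))  # conv1 + maxpool
size = pool3d_out(conv3d_out(size, kernel=3))  # conv2 + maxpool
print(size)  # spatial size fed to the fully-connected layers
```

The flattened output of the last pooling layer is what the two sequential fully-connected layers operate on.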

Conclusion
The results obtained in our work suggest that the raw data contain hidden patterns that allow the neural-network classifiers to distinguish events simulated using the transport model with and without the quark-gluon plasma formation. Out of the four architectures, which included several fully-connected networks as well as a convolutional neural network, the latter showed the best performance.

How to classify an event?
The goal is to determine the physical properties of QCD matter in real time. Fig. 5 shows that the accuracy on the validation set rapidly increases for all four network architectures up to the fifth epoch, when the rise slows down and the curves level off. At the same time, the accuracy on the training set continues to go up until it reaches 100%, which suggests that overfitting occurs after the fifth epoch. Nevertheless, the fully-connected networks reach 80% accuracy, while the convolutional neural network attains the best performance of more than 90% accuracy.

Express reconstruction and analysis in STAR
The STAR (Solenoidal Tracker At RHIC) experiment [9] at the RHIC (Relativistic Heavy Ion Collider) facility of the Brookhaven National Laboratory (BNL, USA) is designed to study nuclear matter under extreme conditions of relativistic heavy ion collisions, including hadron production and search for signs of quark-gluon plasma formation and its properties.
Very important for RHIC is the possibility to collide ions over a wide range of energies (the so-called Beam Energy Scan, BES), covering the range of baryon chemical potential (µB) from 20 to 420 MeV. The BES results once again confirmed the evidence of the QGP discovery in the upper RHIC energy range. There are three main improvements to the STAR detector for the BES-II phase: the inner TPC (iTPC), the end-cap Time-Of-Flight (eTOF) and the event plane detector (EPD). These new detector systems cover pseudo-rapidity in the range 1 < |η| < 5.4. While the iTPC and EPD cover both sides of STAR, the eTOF is installed on one side only. The iTPC and eTOF upgrades allow better identification and acceptance of particles. The EPD provides event-plane reconstruction and centrality determination. All three detector systems significantly improve the statistical accuracy of the STAR measurements in the low-energy range, which is close to the CBM energy range.
The FLES package of the CBM experiment has been ported to the STAR experiment. Since 2013 (online) and 2016 (offline), the CA track finder has been the standard STAR track finder for data production. Use of the CA track finder provides 25% more D0 and 20% more Ω, and the KF Particle Finder provides a factor of 2 more signal particles than the standard approach in STAR.
Figure 6. The HLT express and the standard data production and analysis workflows.
The use of the CA track finder and the KF Particle Finder online significantly extends the functionality of the HLT (Fig. 6). The standard calibration, production and analysis remain unchanged. The HLT starts the calibration procedure as soon as the data become available. The express chain makes physics analysis of the data possible as soon as the calibration is reasonable. It unifies the approaches of the extended HLT (xHLT) and the online RCF (oRCF) to speed up the express workflow, and combines the high competence of the xHLT and oRCF experts involved in online operation. In addition, it provides the physics working groups with instant and uncomplicated access to the data, such as picoDSTs.
With the express calibration and alignment one can reconstruct hyperons with high significance and a low level of background, as shown in Fig. 7. Hyperons are clearly seen at all BES-II energies: 3, 3.2, 3.9, 7.7, 9.1, 14.5, 19.6 and 27 GeV. In addition, the high significance allows the extraction of spectra.

Conclusion
The CBM experiment, with a 10^7 collisions per second input rate, will require full event reconstruction and physics analysis of the experimental data online. As the same HPC farm will be used for offline and online processing of the experimental data, the main reconstruction and analysis algorithms will work both offline and online. Errors or insufficient accuracy in online data processing, physics analysis or selection of interesting collisions by the reconstruction algorithms would lead to a complete loss of the experimental data, since only the incorrectly selected data would be stored in that case. Therefore, only an immediate comparison of the results of the online analysis with the predictions of theoretical models, using ANNs, can guarantee the proper operation of the whole experiment. It has been demonstrated that the core algorithms of the FLES package, the Cellular Automaton for searching for particle trajectories (100 µs/core/track) and the Kalman Filter for estimating their parameters (0.5 µs/core/track), have a very high level of intrinsic parallelism, enabling their fast execution on modern many-core architectures.