INTO THE FAST TOMOGRAPHIC POSTPROCESSING IN TOKAMAKS

The collaboration of the authors has led to the implementation of advanced and fast systems for diagnostics of plasma content in tokamaks. During the development of these systems it is planned to add new functionalities, in particular algorithms of tomographic reconstruction that provide information on the three-dimensional distribution of plasma impurities. This article introduces the idea of tomographic reconstruction and presents the issues of performance and adequate hardware selection.


Introduction
The development of mechanisms for sustaining a controlled thermonuclear fusion reaction is currently pursued intensively in the largest international enterprise in Europe, under the supervision of the EUROfusion consortium. The research is focused on the ITER (International Thermonuclear Experimental Reactor) tokamak, which is currently under construction. The experience gained with this reactor will be of great importance for the development of the DEMO tokamak (DEMOnstration Power Station) [6]. Maintaining the conditions under which energy can be generated (the so-called Lawson criterion) is currently the most important issue to overcome. Solving it would lead to the construction of reactors capable of providing cheap and safe energy for hundreds of thousands of years.
There are two reasons why this issue is crucial. Plasma impurities (mostly tungsten from the tokamak cladding) strongly reduce the efficiency of the reaction. Moreover, their accumulation can damage the tokamak. The impurities therefore have to be detected in real time and eliminated as soon as possible. Fast detection in turn requires fast methods of plasma analysis that provide feedback to the plasma control mechanisms and make it possible to manipulate the shape of the plasma [2,14].
It should be noted that the time for which the reaction can be sustained is so far the critical factor, and attempts are being made to lengthen it. This would allow a reactor to produce more power than it consumes.
In order to maintain a controlled thermonuclear fusion reaction, comprehensive and fast mechanisms of plasma diagnostics and control are needed to sustain the purity and temperature of the plasma. Therefore, fast detectors and electronic devices capable of quick plasma analysis are in demand. They would provide diagnostic mechanisms for the tungsten divertor currently being implemented in tokamaks.
One detection mechanism is to measure the soft X-ray radiation emitted by the plasma. The emission of photons intensifies when impurities occur. Such a measurement provides information about the plasma temperature, its shape and its magnetic axis [14].
As a result, technologies for controlling the thermonuclear fusion reaction, in particular the plasma content and impurities, are developing rapidly. The ITER tokamak (expected to be the first reactor to generate more power than it consumes) is currently being built, and intensive studies are performed in the JET and WEST reactors. Mechanisms for controlling the plasma in tokamaks and for maintaining the conditions in which the reaction occurs are now under scrutiny.

The GEM detector based acquisition systems
To meet these requirements, a system has been developed by the Photonics and Electronics Research Group at Warsaw University of Technology and the Institute of Plasma Physics and Laser Microsynthesis. The device is based on a GEM (Gas Electron Multiplier) detector. Its purpose is to measure the soft X-ray radiation emitted by hot plasma, thereby obtaining information about the plasma structure. The detector has several advantages: it has high temporal and spatial resolution, it is resilient to the conditions in the tokamak, and it is relatively compact and cheap [2]. The principle of operation of the detector can be found in [14]. The detector converts radiation into an electron cloud, which is subsequently registered by the A/D converters. Detectors of this kind require specialized electronic devices to process the data at high rates. For this purpose, the GEM detector based acquisition system was implemented. The first revision of the system is fully operational and a new version is under development. The recent achievements in this field can be found in [1] and [3].

First revision of system
The first revision of the system is fully functional and is used intensively in tokamaks such as JET and MAST. A picture of the working first revision of the system is presented in figure 1.
To provide high spatial and temporal resolution, comprehensive hardware and algorithmic solutions are needed. The FPGA-based hardware and firmware allow an initial analysis of the samples of each A/D converter channel [4,16] and sorting of the data [9] in the circuits. This step is called pre-processing. In this preliminary analysis, the raw data from the converters is analyzed and the events occurring in a channel are detected. Then the data from the different channels of each circuit is sorted on a single board. Subsequently, the pre-processed data from the different boards is collected on a PC via the PCIe interface and can be further analyzed offline on the PC in MATLAB [14]. The following parameters of the system are of particular importance [14]:
• up to 256 channels,
• each channel acquires data from the detector at a rate of 77.78 MHz,
• a 10-bit A/D converter per channel,
• overall input data throughput of up to 200 Gbit/s.
The overview of the first revision of the system is presented in figure 2. The details concerning the modules are presented in [4], [14] and [15]. The most important parts are the detector and analog modules, the FPGA circuits and the PC with post-processing in MATLAB.
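The throughput quoted among the parameters above can be checked with simple arithmetic. The following minimal sketch assumes that every channel streams every sample at full rate with no protocol overhead; the function name is hypothetical and is not part of the described system.

```python
def input_throughput_bits_per_s(channels: int, sample_rate_hz: float,
                                bits_per_sample: int) -> float:
    """Raw A/D input throughput, assuming all channels stream every sample."""
    return channels * sample_rate_hz * bits_per_sample


# First-revision parameters as listed in the text.
first_revision = input_throughput_bits_per_s(
    channels=256, sample_rate_hz=77.78e6, bits_per_sample=10
)
print(f"{first_revision / 1e9:.1f} Gbit/s")  # prints "199.1 Gbit/s"
```

The result is consistent with the stated limit of up to 200 Gbit/s.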

Fig. 2. Overview of the first revision of the fast GEM detector based acquisition system [9]
The first revision of the system made it possible to diagnose changes in plasma content after the experiment. This phase is called post-processing and involves further analysis of the collected and stored data using the PC and MATLAB.
The system provides methods to obtain precise information on spectral, temporal, spatial and energy distribution of radiation in the detector.
However, the demand for a more advanced system led to the implementation of a new revision with new functionalities.

The new revision of system
The advancement in electronics and the need for new functionalities led to the implementation of a new version of the system. The second revision has the following parameters:
• up to 2048 channels,
• each channel acquiring data at 125 MHz,
• 12-bit resolution of each A/D converter,
• overall input throughput above 3 Tbit/s.
The improvement in parameters is significant. The system is planned to be deployed in the WEST tokamak at CEA Cadarache, France, this fall. In the longer term, the system is planned to be used in ITER. Apart from the increased throughput, the post-processing is planned to be significantly improved and a new set of functionalities is being implemented.
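As a rough check of the stated throughput, the raw input rate can be written as the product of the number of channels, the sampling rate and the converter resolution. The symbols $R_1$, $R_2$, $N_{\mathrm{ch}}$, $f_s$ and $b$ are introduced here only for illustration, and the estimate assumes that all channels stream continuously at full rate:

```latex
\begin{align*}
R_2 &= N_{\mathrm{ch}} \, f_s \, b
     = 2048 \times 125\ \mathrm{MHz} \times 12\ \mathrm{bit}
     = 3.072\ \mathrm{Tbit/s}, \\
\frac{R_2}{R_1} &= \frac{3.072\ \mathrm{Tbit/s}}
     {256 \times 77.78\ \mathrm{MHz} \times 10\ \mathrm{bit}}
     \approx \frac{3072}{199.1} \approx 15.4 .
\end{align*}
```

Under these assumptions the second revision has to handle roughly fifteen times the raw input rate of the first one.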

Postprocessing algorithms
In the post-processing, the data collected via PCIe is analyzed. This involves grouping data from different boards, sorting and merging, as well as a set of data analyses: histogramming and tomography [4,9]. The post-processing algorithms perform the following operations:
• The data collected from different boards is sorted, merged and prepared for further calculations. The absolute time and spatial position are calculated for each channel.
• The channel number is mapped to a spatial position in the detector.
• Event detection (i.e. finding groups of data in time proximity) and in-event cluster detection (i.e. finding data in spatial proximity) are performed.
• Based on the detected clusters, the temporal, spatial and energy histograms are calculated and data statistics are collected.
An example of this processing is presented in figure 3.
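The event and cluster detection steps described above can be illustrated with a minimal sketch. It assumes a simplified one-dimensional detector and a hypothetical `Hit` record; the thresholds `max_gap_ns` and `max_distance` are illustrative parameters, and the actual data format produced by the FPGA is not reproduced here.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    time_ns: float   # absolute time of the hit (hypothetical unit)
    position: int    # detector position index after channel mapping
    charge: float    # collected charge, used as a proxy for energy


def detect_events(hits: List[Hit], max_gap_ns: float) -> List[List[Hit]]:
    """Group hits whose consecutive time stamps differ by at most max_gap_ns."""
    events: List[List[Hit]] = []
    for hit in sorted(hits, key=lambda h: h.time_ns):
        if events and hit.time_ns - events[-1][-1].time_ns <= max_gap_ns:
            events[-1].append(hit)
        else:
            events.append([hit])
    return events


def detect_clusters(event: List[Hit], max_distance: int = 1) -> List[List[Hit]]:
    """Split one event into clusters of hits at neighbouring positions."""
    clusters: List[List[Hit]] = []
    for hit in sorted(event, key=lambda h: h.position):
        if clusters and hit.position - clusters[-1][-1].position <= max_distance:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


hits = [
    Hit(100.0, 4, 1.0), Hit(102.0, 5, 2.5), Hit(101.0, 9, 0.5),
    Hit(500.0, 2, 1.25), Hit(503.0, 3, 0.75),
]
events = detect_events(hits, max_gap_ns=10.0)
# Total charge of every cluster, grouped per event.
cluster_charges = [
    [sum(h.charge for h in cluster) for cluster in detect_clusters(event)]
    for event in events
]
print(cluster_charges)  # prints [[3.5, 0.5], [2.0]]
```

The summed cluster charges are the kind of quantity from which an energy histogram can then be built.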

Fig. 3. The procedure of converting and mapping channels from the detector topology (upper left) into a matrix (upper right) and the subsequent spatial (bottom left) and energy (bottom right) histograms calculated in this manner [3].
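The mapping and histogramming procedure shown in the figure can be sketched as follows. The lookup table `CHANNEL_TO_POSITION` and the 2 x 3 layout are purely hypothetical; a real table follows the wiring of the detector, which is not given in this article.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical lookup table: channel number -> (row, column) in the matrix.
CHANNEL_TO_POSITION: Dict[int, Tuple[int, int]] = {
    0: (0, 0), 1: (0, 1), 2: (0, 2),
    3: (1, 2), 4: (1, 1), 5: (1, 0),   # second row wired in reverse order
}


def spatial_histogram(channels: List[int], rows: int, cols: int) -> List[List[int]]:
    """Count hits per matrix cell after mapping channel numbers to positions."""
    matrix = [[0] * cols for _ in range(rows)]
    for channel in channels:
        row, col = CHANNEL_TO_POSITION[channel]
        matrix[row][col] += 1
    return matrix


def energy_histogram(energies: List[float], bin_width: float) -> Dict[int, int]:
    """Count energies per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    return dict(Counter(int(energy // bin_width) for energy in energies))


hit_channels = [0, 4, 4, 5, 2, 3]
hit_energies = [1.0, 2.5, 2.75, 5.5, 0.5, 2.25]   # arbitrary units

print(spatial_histogram(hit_channels, rows=2, cols=3))  # [[1, 0, 1], [1, 2, 1]]
print(energy_histogram(hit_energies, bin_width=1.0))
```

A lookup table keeps the mapping independent of the histogramming code, so a different detector topology only requires a different table.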
The data format is determined by the output of the FPGA, which imposes strict limitations [8].
So far, these operations have been performed offline in MATLAB. However, the development of specialized computation devices and CPUs has made it possible to decrease the computation time significantly. Therefore, real-time analysis, so far unachievable due to hardware limitations, can be implemented, with further speedups possible as hardware advances.
The MATLAB scripts have been reimplemented in C using the MATLAB MEX library in order to investigate the achievable speedups and the bottlenecks of the algorithms. Significant speedups have been achieved on the PC. The results can be found in [10,11]. Testing the software led to several observations on the factors that affect the scalability, optimization and performance of the implementation:
• memory access pattern (e.g. data reuse while the data resides in cache),
• computation to memory access ratio and the overhead of creating parallel threads,
• possibility of vectorizing the computations,
• irregularities in the data (for instance, variable data length) and irregular data access patterns, which increase the use of slow memory instead of cache,
• the hardware architecture used, in particular core interconnects and memory organization per core,
• memory size (cache, registers) and the presence of other units in the hardware architecture, such as TLBs and branch predictors,
• connecting interfaces, such as PCIe, SRIO etc.

These factors are critical when selecting hardware for an optimized post-processing system.
Research is currently under way in which stand-alone, optimized programs will be prepared and deployed in Cadarache. Appropriate performance, latency and throughput analyses will be performed in order to assess the capability of the present hardware.
As part of the development of the system, several new functionalities are planned, one of which is fast tomographic reconstruction. These advanced methods of analysis will provide detailed data on the plasma content, in particular the source of impurities, their level and their influence on plasma discharges [2]. It is crucial that these data are precise and calculated as quickly as possible, because they could be used as a feedback signal to the plasma control mechanisms in the reactor. Therefore, a responsive system is in high demand.

Tomography
To obtain detailed information on the plasma content (in particular, the three-dimensional distribution of radiation, from which the plasma content can be deduced), a fast tomographic reconstruction algorithm is currently being implemented.
By using several detectors (in this particular case two), one can reconstruct the 3-D distribution of radiation. The placement of the detectors that allows the reconstruction is presented in figure 4; this configuration will be used in WEST [12]. Algorithms for reconstructing the distribution of radiation in a tokamak have been developed by CEA and are discussed in [12]; the results can be found in [7]. The reconstruction methods are commonly known, and the Minimum Fisher regularization technique is planned to be used. Exemplary results of the reconstruction, obtained by CEA, are presented in figure 5.
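For orientation, the Minimum Fisher regularization mentioned above is commonly formulated as a regularized least-squares problem that is solved iteratively. The following is the generic textbook form and not necessarily the exact variant developed by CEA; the symbols are assumptions introduced here: $\mathbf{f}$ is the vector of line-integrated detector measurements, $\mathbf{T}$ the geometry matrix of the lines of sight, $\mathbf{g}$ the unknown emissivity on the reconstruction grid, $\mathbf{D}$ a discrete gradient operator, $\lambda$ the regularization weight and $g_{\min}$ a small positive floor.

```latex
\begin{align*}
\mathbf{f} &= \mathbf{T}\,\mathbf{g} + \boldsymbol{\varepsilon}, \\
\mathbf{g}^{*} &= \arg\min_{\mathbf{g}} \;
  \lVert \mathbf{T}\mathbf{g} - \mathbf{f} \rVert^{2} + \lambda \, I_F(g),
\qquad
I_F(g) = \int \frac{\lvert \nabla g \rvert^{2}}{g} \, \mathrm{d}V, \\
\left( \mathbf{T}^{\mathsf{T}}\mathbf{T}
  + \lambda \, \mathbf{D}^{\mathsf{T}} \mathbf{W}^{(n)} \mathbf{D} \right)
  \mathbf{g}^{(n+1)} &= \mathbf{T}^{\mathsf{T}} \mathbf{f},
\qquad
W^{(n)}_{ii} = \frac{1}{\max\!\left( g^{(n)}_{i},\, g_{\min} \right)} .
\end{align*}
```

Each iteration is a linear solve with weights taken from the previous estimate, so the cost is dominated by dense linear algebra, which is relevant for the performance considerations discussed in this article.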

Fig. 4. Placement of detectors for tomographic reconstruction in tokamak
However, in order to provide feedback to the reactor, a fast version of the algorithm is needed. The main issue is to implement the known algorithms in an optimized manner.
Therefore, a feasibility study is being performed to assess the achievable bandwidth and latencies and to optimize the algorithms.

The problem of latencies and throughput
The tomography is one step in the entire post-processing sequence, which consists of grouping the data, finding data patterns and performing intensive computations with analysis of the obtained results. In order to minimize latencies, performance tests should be conducted for each step of the computation. The implementation is intended to minimize the latency of the whole post-processing sequence. Research on the throughput for various experimental cases is currently being conducted and will influence the selection of adequate computing hardware. The solution is a compromise between changes to the existing system (for example, using PCIe to transfer data), the achievable throughput, the required throughput and the costs.
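Measuring the latency of each step separately, as suggested above, can be sketched with a small timing harness. The stage names and the placeholder stage functions below are hypothetical stand-ins for grouping, pattern finding and computation; they are not the algorithms of the described system.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def time_pipeline(stages: List[Stage], data: Any) -> Tuple[Any, Dict[str, float]]:
    """Run the stages in order and record the wall-clock time of each in seconds."""
    timings: Dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        data = stage(data)
        timings[name] = time.perf_counter() - start
    return data, timings


# Placeholder stages standing in for the real post-processing steps.
stages: List[Stage] = [
    ("group", lambda samples: sorted(samples)),
    ("find_patterns", lambda samples: [s for s in samples if s % 2 == 0]),
    ("compute", lambda samples: sum(s * s for s in samples)),
]

result, timings = time_pipeline(stages, list(range(100_000, 0, -1)))
for name, seconds in timings.items():
    print(f"{name:>14}: {seconds * 1e3:8.3f} ms")
```

Per-stage timings of this kind show which step dominates the total latency and therefore where optimization or a different device would pay off most.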

Hardware selection
Appropriate hardware has to be selected for the fast tomography algorithms. First, a computation-oriented device capable of performing intensive floating-point calculations is needed. Second, the latency introduced by the data transfer should be as low as possible. Finally, some constraints concerning data organization have to be met. At present, the Intel Xeon CPU, the Intel Xeon Phi MIC and NVIDIA GPGPUs are being examined and the corresponding implementations are being written. These devices are the most popular for solutions of this kind, and examples of successful applications can be found in [13].

Other branches of exploiting the systems
Although initially designed for plasma diagnostics in tokamaks, the implemented system can be used in other branches of industry and science. The advantage of the solution over its contemporaries is its high temporal resolution (milliseconds when on-line) and a spatial resolution of around 1 cm, considering the requirements of the system in [6]. This makes it possible to analyze dynamically changing objects. Opportunities to use the system in industry are currently being sought.
Potential areas of application include:
• analysis of liquids,
• analysis of fuels in engines,
• control of rapid chemical reactions, where fast and comprehensive methods of control are needed,
• metallurgy, for example advanced control of annealing.
It should also be mentioned that two modes of operation of the devices are possible. First, data can be collected and processed after the experiment. Second, attempts are being made to implement on-line processing that would analyze the data on-line and provide feedback to control systems.
These features and parameters can make the implemented systems suitable for novel methods of analysis in different fields of science and industry.