Data archiving system implementation in ITER's CODAC Core System

https://doi.org/10.1016/j.fusengdes.2015.06.076

Highlights

  • Implementation of ITER's data archiving solution.

  • Integration of the solution into CODAC Core System.

  • Data archiving structure.

  • Highly efficient data transmission in fast plant system controllers.

  • Fast control and data acquisition in Linux.

Abstract

The aim of this work is to present the implementation of data archiving in ITER's CODAC Core System software. This first approach provides a client-side API and server-side software allowing the creation of a simplified version of the ITERDB data archiving software, and implements all elements required to complete the data archiving flow, from data acquisition to persistent storage. The client side includes all components that run on devices that acquire or produce data, distributing and streaming them to configured remote archiving servers. The server side comprises an archiving service that stores all received data into HDF5 files. The archiving solution aims at storing data coming from the data acquisition systems, from the conventional control system, and also processed/simulated data.

Introduction

CODAC Core System (CCS) [1] is a software kit distribution for plant systems I&C. It is based on the widely used open-source software EPICS (Experimental Physics and Industrial Control System) and CSS (Control System Studio). During the last year, the consortium formed by INDRA, CIEMAT, SGENIA and UPM has been developing data archiving technologies (commonly referred to as DAN) to be included in ITER CODAC Core System. Since CCS version 4.3, a simplified implementation of data archiving [2] has been included.

This new development has been driven by the ITER operation and diagnostics requirements:

  • Steady-state data archiving: the ITER pulse length exceeds thirty minutes, and all the data gathered by acquisition or data processing systems must be archived while they are being generated during the experiment.

  • Concurrent reading while writing: because of the length of ITER experiments, it is not possible to wait until each experiment finishes to read the archived data; it is therefore mandatory that any archiving solution support concurrency between at least one writer and multiple readers.

  • Remote archiving: data archiving services will be located in the CODAC system, and data-generating systems will archive remotely.

The DAN system in CCS is formed by client and server subsystems. In this work both will be presented.

HDF5 [3] is a scientifically oriented file format with a complete set of libraries and management tools (supported by a wide scientific community), whose main objective is to make data platform-independent and portable. There is a list of important advantages that support the use of HDF5 files for data archiving. Here are the most important ones:

  • Self-describing data management: any client can fully determine and manage the format of the data stored in a file. This feature also helps to maintain version compatibility.

  • Huge file size support: there are no size limits apart from those derived from the operating system or file system.

  • Optimized performance for data-segment access: HDF5 fits the data access requirements of scientific data archiving in long-pulse experiments.

  • Data appending: it is possible to append new data to a file without copies or redefinitions.

Although HDF5 does not support concurrent write and read access by default, which is one of the most important ITER requirements, this feature is currently supported in a version of the library with the SWMR (Single-Writer/Multiple-Reader) extension, and will be officially supported by HDF5 in the short term.
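
As a rough illustration of the appending and SWMR capabilities discussed above, the following sketch uses the Python h5py bindings. The file name, dataset layout and block size are invented for the example and do not reflect ITER conventions.

```python
import h5py
import numpy as np

# --- Writer process: create an extendable dataset, then switch to SWMR mode ---
f = h5py.File("dan_stream.h5", "w", libver="latest")  # SWMR requires the latest file format
dset = f.create_dataset("signal", shape=(0,), maxshape=(None,),
                        chunks=(1024,), dtype="f4")   # chunked => appendable
f.swmr_mode = True                                    # readers may now attach concurrently

for _ in range(4):                                    # illustrative acquisition loop
    block = np.random.rand(1024).astype("f4")
    dset.resize(dset.shape[0] + block.size, axis=0)   # append without rewriting old data
    dset[-block.size:] = block
    dset.flush()                                      # publish the new extent to readers
f.close()

# --- Reader (normally a separate process, attaching while the writer runs) ---
r = h5py.File("dan_stream.h5", "r", libver="latest", swmr=True)
sig = r["signal"]
sig.refresh()                                         # pick up the latest flushed size
print(sig.shape)
r.close()
```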

Section snippets

DAN system in CCS

The general data archiving architecture is formed by two main elements, shown in Fig. 1. On the one hand, the client systems (DAN clients), which generate the data to be archived from a set of data sources (either acquisition or processing sources); on the other hand, the data archiving servers (DAN servers), which receive data from DAN clients and are responsible for storing them in HDF5 files using the DAN writer services. DAN clients can be distributed and transmit data to the DAN archiving servers…

DAN API

The DAN API is a library that implements an internal data transmission engine between processing components. It is included in the CCS “dan-daq” package together with the other libraries and management tools that help developers incorporate the DAN functionality into their systems. Its present design and implementation make it ready for multi-process operation, and it will be extended to be thread-compatible in the near future.

The DAN API is designed with the aim of minimizing data…
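
The dan-daq interface itself is not reproduced in this excerpt; purely to illustrate the client-side pattern it describes (an acquisition loop handing blocks to an internal transmission engine), here is a minimal Python sketch in which every name is hypothetical and does not correspond to the real DAN API:

```python
import numpy as np

# Hypothetical client-side wrapper: names and semantics are invented for
# illustration and do NOT correspond to the real dan-daq C API.
class DanStream:
    """Illustrative stand-in for a DAN data-stream handle."""

    def __init__(self, source_name, sample_dtype, server):
        self.source_name = source_name
        self.dtype = np.dtype(sample_dtype)
        self.server = server       # (host, port) of a DAN archiving server
        self.queue = []            # stands in for the internal transmission engine

    def put_block(self, timestamp_ns, samples):
        # The engine decouples the acquisition loop from network I/O,
        # so the producer never blocks on the remote archiving servers.
        self.queue.append((timestamp_ns, np.ascontiguousarray(samples, self.dtype)))

# Usage: an acquisition loop pushing fixed-size blocks to one stream.
stream = DanStream("diag-adc-ch0", "f4", ("dan-server.example", 5005))
for i in range(3):
    stream.put_block(i * 1_000_000, np.zeros(1024, dtype="f4"))
print(f"{len(stream.queue)} blocks queued for transmission")
```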

Data archiving server

The DAN archiving server is the software that responds to the requests of DAN clients, configured on the basis of the streaming data structure. It receives stream data from the clients and stores them into HDF5 files. A DAN archiving server listens on the dedicated port of the DAN network and provides connections with the various DAN clients in order to transmit data from the DAQ devices to the central data storage. A DAN stream is the elementary communication channel between a DAN client and…
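
To make that flow concrete, the sketch below accepts one client connection, reads framed sample blocks from the socket, and appends them to a SWMR-enabled HDF5 dataset. The framing, port number and dataset name are invented for illustration; the actual DAN stream protocol is not described in this excerpt.

```python
import socket
import struct
import h5py
import numpy as np

PORT = 5005                       # illustrative; the real DAN port is configuration-specific
HEADER = struct.Struct("<QI")     # (timestamp_ns, n_samples) -- invented framing

def serve_one_stream(path="archive.h5"):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    with h5py.File(path, "w", libver="latest") as f:
        dset = f.create_dataset("stream0", shape=(0,), maxshape=(None,),
                                chunks=(4096,), dtype="f4")
        f.swmr_mode = True                        # readers may attach mid-pulse
        while True:
            hdr = conn.recv(HEADER.size, socket.MSG_WAITALL)
            if len(hdr) < HEADER.size:
                break                             # client closed the stream
            _ts, n = HEADER.unpack(hdr)
            payload = conn.recv(n * 4, socket.MSG_WAITALL)
            samples = np.frombuffer(payload, dtype="<f4")
            dset.resize(dset.shape[0] + n, axis=0)
            dset[-n:] = samples                   # append; earlier data is untouched
            dset.flush()                          # publish the new extent to SWMR readers
    conn.close()
    srv.close()
```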

Conclusions

The DAN software for client systems supports data from acquisition systems and data from simulation/processing elements. Thanks to the DAN API library, these data can be efficiently accessed not only for data archiving purposes but also for additional uses such as data processing or monitoring. Additionally, within the DAN archiving services, data are stored in HDF5 files with the special single-writer/multiple-reader extension. This type of file is widely used by the scientific community and…

References (3)

  • ITER Organization, CODAC Core System....

Cited by (18)

  • Design for the distributed data locator service for multi-site data repositories

    2021, Fusion Engineering and Design
    Citation Excerpt:

    Under the Broader Approach (BA) activities, the ITER Remote Experimentation Centre (REC) has been preparing to make a full replication of ITER data at Rokkasho, Japan [13]. The ITER data system and its access method are named ITERDB and Unified Data Access (UDA), respectively [14–16]. ITER UDA adopts the broker-type facilitator model [9], in which all client/server communications will be carried through the UDA server.

  • Data model implementation in ITER data archiving system

    2019, Fusion Engineering and Design
    Citation Excerpt:

    ITER data archiving technology has to archive different types of data coming from its systems. ITER data can be grouped into three main groups: DAN, SDN and PON [1,2], according to the network used for their distribution. DAN (Data Archiving Network) data come from acquisition systems, fast control and diagnostics.

  • J-TEXT distributed data storage and management system

    2018, Fusion Engineering and Design
    Citation Excerpt:

    This means that applications based on MDSplus and HDF5 can seamlessly migrate onto these types of storage [9,10]. The experiment data analysis tooling and workflow are well established on MDSplus and HDF5, which are very popular in the fusion community [3,11]. But MDSplus and HDF5 are not designed for distributed file systems; they are optimized for block devices like hard drives or RAIDs (Redundant Arrays of Independent Disks).
