Dataset of anomalies and malicious acts in a cyber-physical subsystem

This article presents a dataset produced to investigate how data and information quality estimations enable to detect aNomalies and malicious acts in cyber-physical systems. Data were acquired making use of a cyber-physical subsystem consisting of liquid containers for fuel or water, along with its automated control and data acquisition infrastructure. Described data consist of temporal series representing five operational scenarios – Normal, aNomalies, breakdown, sabotages, and cyber-attacks – corresponding to 15 different real situations. The dataset is publicly available in the .zip file published with the article, to investigate and compare faulty operation detection and characterization methods for cyber-physical systems.


a b s t r a c t
This article presents a dataset produced to investigate how data and information quality estimations enable to detect aNomalies and malicious acts in cyber-physical systems. Data were acquired making use of a cyber-physical subsystem consisting of liquid containers for fuel or water, along with its automated control and data acquisition infrastructure. Described data consist of temporal series representing five operational scenarios -Normal, aNomalies, breakdown, sabotages, and cyber-attackscorresponding to 15 different real situations. The dataset is publicly available in the . zip file published with the article, to investigate and compare faulty operation detection and characterization methods for cyberphysical systems. &

Value of the data
The dataset represents realistic sensors signals of a cyber-physical subsystem impacted by actual risks like aNomalies, sabotages, system breakdown, and cyber-attacks.
The dataset can be used to validate detection and characterization algorithms for operational surveillance and security applications in cyber-physical systems.
Included aNomalies and malicious acts can be studied to compare detection and characterization approaches for decision support.
The dataset can be used to examine algorithms that assess data alteration and service degradation.

Data
The dataset contains 15 files of temporal series that represent 15 different situations related to 5 operational scenarios. Files' duration varies depending on the situation and dysfunctional component. Accordingly, affected components are two types of depth sensor, the underlying network, or the whole subsystem. These situations can be wrongly understood by a decision maker, or only identified for instance after the malicious act was accomplished. Since wrongly managed situations might have significant adverse operational costs, it is critical to detect and analyze in real time such events. Datasets covering such situations are currently rare, because of the complexity to acquire data from cyber-physical systems. In our case, the principle of reusable experimental platform [1] was applied, to collect diverse datasets for monitoring [2] and categorization of aNomalies [3].

Experimental design, materials and methods
Two tanks of different volumes that function as storage and distribution device for water or fuel, one ultrasound depth sensor, four discrete sensors, and two pumps, were used to acquire the dataset (Fig. 1). A computer controlled the system with a PLC connected to a monitoring network. The ultrasound depth sensor on the main tank (volume of 7 L) was calibrated relating the tank dimensions to 10,000 equidistant depth steps (0 corresponds to the full tank and 10,000 to the empty tank). Fig. 2 shows the tracked filling and emptying of the main tank. The four floating discrete sensors in the second tank (volume of 9 L), measured levels of liquid corresponding to four volumes: 1.25 L, 3.35 L, 8 L, and 9 L.
All signalsultrasound depth sensor, pump 1, pump 2, and the four discrete level sensorswere acquired synchroNously for every situation described in Table 1   Additionally, two of the discrete sensors (1 and 2) were disrupted by keeping each one at a blocked position, i.e. up when the liquid has Not reached that level yet (No. 8) and pushing randomly down once liquid overflowed it (No. 9), leaving the tank almost empty or filling up to the security aperture, respectively. Network intrusions were carried out making use of the Modbus Penetration Testing Framework, Smod, 1 to execute a denial of service attack (No. 10) and a spoofing attack (No. 11). Finally, aNomalies can also be the result of unintentional human errors as a wrong system connection (No. 12) and more generally incorrect maintenance. Technical data sheets of the ultrasound sensor and the PLC, the network schema, the transmitted information between components, a script written in Python to read and display files, and additional details are provided with the dataset.