Published September 9, 2021 | Version 1
Dataset Open

Supplementary data: "Secondary control activation analysed and predicted with explainable AI"

  • 1. Forschungszentrum Jülich GmbH, Institute for Energy and Climate Research - Systems Analysis and Technology Evaluation (IEK-STE), 52428 Jülich, Germany
  • 2. Faculty of Science and Technology, Norwegian University of Life Sciences, 1432 Ås, Norway

Description

This repository contains processed data and result files for the paper "Secondary control activation analysed and predicted with explainable AI". The code for producing the processed data and the results is available on GitHub.

Data

The data folder contains the feature and target data used to train the ML model. The data for Germany comprises the following folders and files:

  • raw_input_data.h5 : The aggregated external features without additional engineered features.
  • inputs_<model_type>.h5 : The input features for each model type used in the paper, including the engineered features. Depending on the model type, the file also contains the International Grid Control Cooperation (IGCC) features.
  • outputs.h5 : The activated automatic Frequency Restoration Reserve (aFRR) volumes in Germany.
  • version_2021-08-20: Folder containing the training and test sets used for the results.
  • documentation_of_data_download: Information files concerning the ENTSO-E raw data and its aggregation.

In addition to the German time series, the data folder contains the raw input data for the remaining IGCC states. Note that the results contain more model types than are discussed in the paper.
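A minimal sketch of how the feature and target files might be read and aligned. It assumes the HDF5 files were written with pandas, that each file holds a single table, and that features and target share a common index; none of this is confirmed by the description above. The helper names `load_table` and `align_on_index` are hypothetical, and `pandas.read_hdf` additionally requires the PyTables package.

```python
import pandas as pd


def load_table(path):
    """Read one HDF5 file, assuming it stores a single pandas object."""
    # When a file holds exactly one pandas object, read_hdf needs no key.
    return pd.read_hdf(path)


def align_on_index(features, target):
    """Keep only the rows whose index value occurs in both tables."""
    common = features.index.intersection(target.index)
    return features.loc[common], target.loc[common]


# Synthetic stand-ins for load_table("raw_input_data.h5") and
# load_table("outputs.h5"); column names and values are made up.
index = pd.date_range("2021-01-01", periods=4, freq="15min")
demo_features = pd.DataFrame({"feature_a": [1.0, 2.0, 3.0, 4.0]}, index=index)
demo_target = pd.Series([10.0, 20.0, 30.0], index=index[1:], name="target")

X, y = align_on_index(demo_features, demo_target)
print(X.shape, y.shape)
```

Aligning on the index before training avoids silently pairing feature rows with targets from a different time step when the two files do not cover exactly the same period.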

Data sources

The data for the input features (raw_input_data.h5 and inputs_<model_type>.h5) is derived from ENTSO-E Transparency Platform data [1]. The target data (outputs.h5) is based on publicly available data from the German Transmission System Operators (TSOs) [2].

Results

The result folder contains the results of the hyper-parameter optimization, the model predictions and the model interpretation via SHAP. Results are provided for different model types, training loss functions and data sets used for prediction and interpretation.

  • cv_results_<model_type>_<loss_function>.csv : Performance results for each combination in the hyper-parameter grid search.
  • cv_best_params_<model_type>_<loss_function>.csv : Hyper-parameters used in the final (optimized) model.
  • shap_values_<data_set>_<model_type>_<loss_function>.npy : First-order SHAP values calculated on different data sets: the training set, the randomized test set and the continuous test set.
  • y_pred_<data_set>.h5 : Predictions of the daily profile predictor and the machine learning models.
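The SHAP files listed above can be condensed into a global feature ranking by averaging the absolute SHAP values over all samples. The sketch below assumes each .npy file holds a two-dimensional array of shape (n_samples, n_features) whose columns follow the order of the matching input file; both points are assumptions. The helpers `shap_file_name` and `mean_abs_shap` are hypothetical, and the array is synthetic because the actual placeholder values are not listed here.

```python
import numpy as np


def shap_file_name(data_set, model_type, loss_function):
    """Build a file name following the naming pattern of the result folder."""
    return f"shap_values_{data_set}_{model_type}_{loss_function}.npy"


def mean_abs_shap(shap_values):
    """Return the mean absolute SHAP value of each feature."""
    values = np.asarray(shap_values, dtype=float)
    if values.ndim != 2:
        raise ValueError("expected an array of shape (n_samples, n_features)")
    return np.abs(values).mean(axis=0)


# Synthetic stand-in for np.load(shap_file_name(...)):
# two samples (rows) and three features (columns).
demo = np.array([[0.5, -1.0, 0.0],
                 [-1.5, 2.0, 0.0]])

importance = mean_abs_shap(demo)
ranking = np.argsort(importance)[::-1]  # feature positions, most important first
print(importance, ranking)
```

Taking the absolute value before averaging keeps positive and negative contributions of the same feature from cancelling each other out.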

Disclaimer

The data might be subject to copyright or related rights. Please consult the primary data owner.

Files

data.zip

Files (678.2 MB)

  • md5:a9bae345e95f47e909304d4540eb4c93 (301.5 MB)
  • md5:d40e12c801ee2a618d56bc5e0f4208f0 (376.7 MB)
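The md5 checksums listed above can be used to check a download for corruption. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, and since the listing does not say which checksum belongs to which archive, the expected value has to be chosen by the user.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading keeps memory use low for archives of several 100 MB.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5("data.zip", "md5:...")` with one of the listed values returns `True` only if the local file is byte-for-byte identical to the file that checksum was computed from.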

Additional details

References