Data Management

Data management comprises the development and use of architectures, guidelines, practices and procedures for managing data accurately over the entire data lifecycle of an institutional unit or a research project. Data are defined as units of information, such as numbers, alphabetic characters and symbols, that are formatted in a specific way and can be processed by a computer. The data in the project are provided by various actors, which can be GeomInt partners, their legal representatives, employees, and external partners.

Collaborative work requires data management structures and guidelines. Therefore, the first step was to set up a document comprising a user agreement and a data management plan, which forms the basis for data management in the project.

User Agreement and Data Management Plan
The GeomInt project partners agreed to set up a user agreement that specifies data structures including metadata, data formats, access authorization for data, the possible publication of data, and the handling of the data after the end of the project and outside the project. A first version of this user agreement was created six months after the start of the project.
The user agreement includes guidelines and definitions for the following aspects:
- Which data will be generated in the project and have to be managed?
- How will data be provided and exchanged?
- What are the rights of use for the partners and for third parties?
- How are data to be cited?
- How is compliance with the user agreement supervised?
As part of the user agreement, a data management plan was developed. This is a formal document that describes how project data are managed during the research period and after completion of the project. The goal of a data management plan is to consider the aspects of data management (metadata creation, data preservation and analysis) before the start of the project. The following points are discussed in the GeomInt data management plan:
1. Generation and management methods (data infrastructure, external data, data integration, data formats, quality control, user groups, data processing stages, versioning, documentation and metadata, geocoding)
2. Data legal management
3. Data exchange and provision, citation rules
4. Short-term storage and data management (storage systems, data transfer, backup, security)
5. Long-term storage (characteristics, metadata and documentation, responsibility)
6. Resources (organizational roles and responsibilities for data management)

GeomInt Data
The project results include specific data from laboratory and in-situ experiments, software components and data sets from numerical simulations (i.e. model and result files). The volume of data generated in GeomInt could not be estimated before the project. Therefore, the data management concept had to be flexible. This uncertainty was mainly due to the fact that the evaluation of test and calculation results may change the test and calculation planning and may even lead to additional experiments or simulations.
The experimental and numerical data generated in the project, including existing metadata, are made available in an internal area of the GeomInt homepage. The UFZ is responsible for the project data and has many years of experience in data management regarding the cooperative development of open source software (OpenGeoSys) as well as the acquisition, storage and processing of data from experiments on different scales, exploration and monitoring campaigns, numerical simulations and scientific 3D visualizations.
The UFZ has sufficient capacity and modern data management systems for data storage, which are available as a central data infrastructure for the research network. Specifically, data sets are managed by means of an ORACLE database. Access is via a web portal, where each data record must be provided with metadata before uploading. The metadata standard used is compatible with the INSPIRE Directive 2007/2/EC and also regulates the rights for access, use and transfer of the data. A tape system is also available for the long-term storage of very large amounts of data. For the provision of exploration and monitoring data, geo-services mentioned in the GDI-DE are used as far as possible. Since such services do not yet exist for complex modelling and simulation data, these data are provided via a data research portal, where they can be found by means of stored metadata.
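The rule that every data record must carry metadata before upload can be pictured as a simple pre-upload check. The following is a minimal sketch and not the portal's actual implementation: the field names in `REQUIRED_FIELDS` and both function names are hypothetical, chosen only for illustration, since the metadata schema of the portal is not given here.

```python
# Hypothetical set of mandatory fields; the real portal schema is not specified in the text.
REQUIRED_FIELDS = ("title", "creator", "description", "date", "format", "rights")


def missing_metadata(record: dict) -> list:
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def ready_for_upload(record: dict) -> bool:
    """A record counts as uploadable only if no required field is missing."""
    return not missing_metadata(record)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    draft = {"title": "Three-point bending test input files", "creator": "UFZ"}
    print("missing:", missing_metadata(draft))
    print("ready:", ready_for_upload(draft))
```

A check of this kind would run on the client or the portal side before the data files are accepted, so that incomplete records never reach the database.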
As the software components are part of non-commercial, scientific program platforms and are open source products (e.g. OpenGeoSys), they are hosted by the responsible partner via established source code hosting services (e.g. GitHub) and are publicly available. Possible public access to project data beyond the status quo described in technical publications, as well as the handling of the data after the end of the project, is regulated in the cooperation agreement or in the cooperation contract between the project partners.
The handling of data obtained from the in-situ experiments in the underground laboratories through synergies with other projects is also regulated separately (access authorisation for these data, storage location, publication, handling of the data after the end of the project). Such an approach is necessary because specific parts of these data can be used for the scientific purposes of GeomInt, but they are generated in other projects, partly with other partners.

GeomInt DMP
In this section, exemplary data sets of every project partner are described. A table of these data sets, including descriptions and links, is available to project partners only on the website (Fig. 5.2). Some data sets can be found on the UFZ data investigation portal https://www.ufz.de/drp/. These data sets are uploaded to the data management portal at UFZ (DMP@UFZ).
The GeomInt data management system (DMS) is organised in three sections (Fig.
Table 5.1 summarizes the MEX-related data concerning experiments and simulations. A selection is described in the following sections.
The following codes (and related input files) are used (see Chap. 7 for detailed code introductions).

CAU (LEM)
The required LEM code and the input variables of the three-point fracture toughness test on the Rockville Granite samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.unikiel.de/index.php/s/pRmBPJ9gK5Se6ci. The uploaded protected MATLAB file in *.p format requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. The input variables are prepared in a single file for the simulation of the fracture toughness in Rockville Granite. Figure 5.4 shows the relation between load and CMOD as described in Sect. 4.1.

UFZ (FEM-VPF)
The source code can be found in the OpenGeoSys project on GitHub, and the input files for the three-point bending test performed on the Rockville Granite samples have been uploaded. The files include the unstructured finite element mesh files in vtu format and an OGS input file in xml format. As homogeneous properties such as Young's modulus are assigned in the computational domain, the spatially constant material properties are specified in the OGS input file rather than in the mesh file. The load and crack mouth opening displacement computed from the simulations are shown in Fig. 5.5 as described in Sect. 4.1. MEX 0-1a (UFZ) will also be provided as an OGS benchmark case at: https://www.opengeosys.org/docs/benchmarks/phase-field/pf_tpb/.
Meta Data Overview (According to Dublin Core)
See Tables 5.3, 5.4 and 5.5.
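For orientation, a metadata record following Dublin Core can be serialised with the 15 elements of the Dublin Core Metadata Element Set. The sketch below is only an illustration under stated assumptions: the function name `to_dublin_core_xml`, the wrapper element `metadata` and the example values are hypothetical and do not reproduce the content of the tables referenced above.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # namespace of the Dublin Core element set

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def to_dublin_core_xml(record: dict) -> str:
    """Serialise a flat metadata dict into a simple Dublin Core XML string."""
    unknown = set(record) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name in DC_ELEMENTS:  # fixed order keeps the output reproducible
        if name in record:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = str(record[name])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {
        "title": "Three-point bending test, Rockville Granite (input files)",
        "creator": "UFZ",
        "format": "vtu, xml",
    }
    print(to_dublin_core_xml(example))
```

Rejecting unknown keys early keeps records comparable across partners, which matters when data sets are later searched by their stored metadata.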

CAU Kiel
The experimental results of the three-point fracture toughness test on the Opalinus Clay samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/pJxp2eNEJb6PfiS. The data set, which includes the time, the applied force (N) and the displacement of the sample at the loading point (mm), is provided in a *.txt file. The crack mouth opening displacement (CMOD), which is determined by the image processing technique (Sect. 2.2.2), is given in a *.xlsx file. This file includes the time and the calculated CMOD (mm). The required LEM code and the input variables of the three-point fracture toughness test on the Opalinus Clay samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.unikiel.de/index.php/s/ZBFN2rSZ99kPY9M.
The uploaded protected MATLAB file in *.p format requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. The input variables are prepared in two different files for parallel and perpendicular embedded layer orientations. Figure 5.6 shows the comparison between the experimental and numerical data as described in Sect. 4.2.
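A text file with time, applied force and load-point displacement, as described above, could be read along the following lines. This is a sketch under assumptions: the actual column order, delimiter and header layout of the uploaded *.txt file are not specified here, so whitespace-separated columns in the order time, force (N), displacement (mm) are assumed, and the names `Sample`, `parse_records` and `peak_load` are hypothetical.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    time: float             # time stamp, unit as recorded in the file
    force_n: float          # applied force in N
    displacement_mm: float  # displacement at the loading point in mm


def parse_records(text: str) -> List[Sample]:
    """Parse whitespace-separated rows; skip blank lines and non-numeric lines such as headers."""
    samples = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            t, f, d = (float(p) for p in parts[:3])
        except ValueError:
            continue  # header or comment line
        samples.append(Sample(t, f, d))
    return samples


def peak_load(samples: List[Sample]) -> Sample:
    """Return the sample at which the applied force is largest."""
    return max(samples, key=lambda s: s.force_n)


if __name__ == "__main__":
    # Made-up numbers for illustration only, not measured data.
    demo = "time force disp\n0 0 0\n1 10.5 0.01\n2 8.0 0.02\n"
    print(peak_load(parse_records(demo)))
```

The peak load found this way is the kind of quantity that would be compared with the simulated load versus CMOD response.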

UFZ
The input files for OGS, which were used to simulate the three-point bending test performed on the orthogonal and parallel laminations of Opalinus Clay samples, have been uploaded. The files include the unstructured finite element mesh files in vtu format and OGS input files in xml format. In addition, the material properties are defined per element in the mesh files. In particular, the orthogonal and parallel laminations in the samples are represented through a contrast in the fracture toughness. MEX 0-1b (UFZ) will also be provided as an OGS benchmark case at: https://www.opengeosys.org/docs/benchmarks/phase-field/pf_tpb_ani/.
Meta Data Overview (According to Dublin Core)
See (Tables 5.7 and 5.8).

CAU Kiel
The required LEM code and the input variables for simulating the swelling process of the salt clay are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/JmZseQqrsbgWNqC. The uploaded protected MATLAB file in *.p format requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. Fig. 5.8 shows the change of hydraulic conductivity with applied linear strains as described in Sect. 4.4.

Meta Data Overview (According to Dublin Core)
See (Table 5.10).

CAU Kiel
The experimental results of the drying and wetting paths of the sandy Opalinus Clay are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/q6g25nWyWJKqzNB.
The experimental data (*.xlsx) of the drying and wetting paths include the reading number, the time (day), the strain values in perpendicular and parallel orientations, the weight of the sample and the measured water content values. Fig. 5.9 shows the change of the strains under the applied suction values.
The required LEM code and the input variables for simulating the drying and wetting paths of the sandy Opalinus Clay are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.unikiel.de/index.php/s/fDNoPoXpXMqeAsK.
The uploaded protected MATLAB file in *.p format requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. Fig. 5.10 shows the change of hydraulic conductivity with applied linear strains as described in Sect. 4.5.
Meta Data Overview (According to Dublin Core)
See (Table 5.12).

UFZ
Link to the data set at UFZ data investigation portal (Download only for project members): https://www.ufz.de/record/dmp/archive/7706/.
The link contains two input decks for OGS-6 in which pressure-driven percolation as described in MEX2 is simulated under different configurations of boundary loading. The first case applies boundary loads of 12 MPa, 21 MPa and 8 MPa in the x-, y- and z-direction, respectively. It is called "case 1" and the corresponding OGS-6 input file is "me2_insitu_case1.prj". The second case is loaded with 4 MPa, 15 MPa and 19 MPa in the x-, y- and z-direction, respectively, and the input file is named "me2_insitu_case2.prj". The remaining files are vtu files that describe the computational domain and the boundaries as shown in Fig. 4.58. MEX 2-1a (UFZ) will also be provided as an OGS benchmark case at: https://www.opengeosys.org/docs/benchmarks/phase-field/pf_perc/.
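The two loading configurations can be written down compactly for bookkeeping or for quick comparisons. The sketch below uses only the file names and boundary loads stated above; the dictionary layout and the helper names `least_loaded_direction` and `mean_load` are hypothetical and are not part of the OGS-6 input format.

```python
# Boundary loads in MPa, keyed by the OGS-6 input file named in the text.
LOAD_CASES = {
    "me2_insitu_case1.prj": {"x": 12.0, "y": 21.0, "z": 8.0},
    "me2_insitu_case2.prj": {"x": 4.0, "y": 15.0, "z": 19.0},
}


def least_loaded_direction(loads: dict) -> str:
    """Return the axis that carries the smallest boundary load."""
    return min(loads, key=loads.get)


def mean_load(loads: dict) -> float:
    """Arithmetic mean of the three boundary loads in MPa."""
    return sum(loads.values()) / len(loads)


if __name__ == "__main__":
    for prj, loads in LOAD_CASES.items():
        print(prj, least_loaded_direction(loads), round(mean_load(loads), 2))
```

The comparison shows that the least-loaded axis differs between the two cases (z in case 1, x in case 2), which is the main difference between the two input decks.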

CAU Kiel
The required LEM code and the input variables of the percolation test on saltstone samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/9JZZcpS4S3JJT9S.
The uploaded protected MATLAB file in *.p format requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. The input variables are prepared in two files for two different stress configurations. Fig. 5.11 shows the fracture surfaces under the percolation test as described in Sect. 4.8.

Meta Data Overview (According to Dublin Core)
See (Tables 5.16 and 5.17).

CAU (LEM)
The experimental results of the pressure-driven percolation test on the cubic Opalinus Clay samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/EMdNkdF4PRKWCqa. The experimental data (*.ASCII) of two different stress configurations (Sect. 2.4.2) are uploaded to the server. The data includes the time (T = 1 s), the pump volume (mL), the given oil pressure (bar) and the actual oil pressure in the system (bar).
Meta Data Overview (According to Dublin Core)
See (Tables 5.19 and 5.20).
The measured gas flow is converted into permeabilities, which are stored as time series in an Excel file. For each of the three experiments there are two columns.
The CNL data set contains four text files. One text file contains the rock properties of the granite used (see Table 2.1). Two files contain the scan data of the two surfaces; one point cloud can be seen in Fig. 5.17. The last file contains the laboratory data. Fig. 5.18 shows the results for the four shear stress levels.
Meta Data Overview (According to Dublin Core)
See (Table 5.28).
The uploaded data set contains two compiled executables for simulations that fit experimental data from the Reiche Zeche fracture characterization tests and reproduce the non-linear flow response during harmonic testing of a single fracture.

Fig. 5.1 Network diagram to illustrate synergies and dependencies in the GeomInt network in connection with the infrastructure elements and numerical methods in the project

Fig. 5.3 GeomInt DMS portal: Data areas for experimental, simulation and URL related information

Fig. 5.6 The load versus crack mouth opening displacement (CMOD) response of the Opalinus Clay

Fig. 5.7 The load versus crack mouth opening displacement (CMOD) response simulations for orthogonal and parallel lamination by VPF

Fig. 5.8 The change of hydraulic conductivity with applied linear strains

Fig. 5.12 The recorded results depicting the evolution of the borehole pressure versus flow volume for the 1st stress configuration

Fig. 5.12 illustrates an example of the plotted borehole pressure versus flow volume for the 1st stress configuration discussed in Sect. 4.9. The required LEM code and the input variables of the percolation test on Opalinus Clay samples are uploaded to the IfG (Kiel) NextCloud server. The data is accessible through the following link: https://nextcloud.ifg.uni-kiel.de/index.php/s/tFKKjxnSpgNG25b. The uploaded protected MATLAB file in *.p format requires a MATLAB version with built-in Voronoi tessellation and Delaunay triangulation functions. The input variables are prepared in two files for two different stress configurations. Fig. 5.13 illustrates an example of the evolved fracture surfaces for the 2nd stress configuration discussed in Sect. 4.9.

Fig. 5.17 Point cloud representing the surface of a granite sample from Saxony. The size is 65 mm by 170 mm and the cloud contains approx. 98,000 points

.1), two files with the scan data of the two surfaces. One point cloud can be seen in Fig. 5.19. The results of the laboratory tests are available as ASCII files, and the shear curves and the dilatation are visualized in Fig. 5.20. Additionally, three photos of the basalt surface before, after the first and after the fourth shear test are included.
Meta Data Overview (According to Dublin Core)
See (Table 5.30).

Fig. 5.19 Point cloud representing the surface of a basalt sample from Thuringia. The size is 196 mm by 149 mm and the cloud contains approx. 252,000 points

Fig. 5.21 Visualisation of data obtained from the pre-compiled executable used for inverse analysis computations of pumping tests performed at Reiche Zeche at a depth of 40.6 m