Cone penetration test dataset Premstaller Geotechnik

The dataset contains 1339 cone penetration tests (CPT, CPTu, SCPT, SCPTu) executed in Austria and Germany by the company Premstaller Geotechnik ZT GmbH. As a first processing step, core drillings located within a maximum distance of approximately 50 m from the insitu tests were assigned to these cone penetration tests, which allows an interpretation of the insitu measurements based on the grain size distribution. In a second step, the software Geologismiki was used to calculate various normalized measures, which can be used, for example, as input parameters for soil behaviour type charts. Researchers can use the data, for example, to develop new approaches to soil classification based on cone penetration tests. Furthermore, the dataset provides a framework for combining insitu measurements (qc, fs, Rf, u2, Vs), normalized measures (e.g. Qt, Bq, U2) and soil classifications.


Specifications
Subject: Geotechnical Engineering and Engineering Geology
Specific subject area: Soil classification, cone penetration test
Type of data: Table; Chart; csv (supplementary data)
How data were acquired: The cone penetration tests (CPT, CPTu, SCPT, SCPTu) were executed by Premstaller Geotechnik ZT GmbH. Further processing of the insitu measurements was performed using the software Geologismiki [1]. The soil classifications (i.e. core logs mapped by geotechnical engineers next to the cone penetration tests) were assigned to the data manually.
Data format: Raw; Analyzed
Parameters for data collection: The insitu tests and core drillings within the dataset were anonymized and therefore cannot be related to individual projects.
Description of data collection: The execution of the insitu tests complies with the following standards:
• ISSMGE: IRTP 1999/2001 [2]
• ASTM D:5778-95, 1996 [3]
• EN ISO 22476-1 [4]
The soil classifications (of core drillings) were homogenized according to EN ISO 14688-1 [5].

Value of the Data
• The dataset includes 1339 CPT, CPTu, SCPT and SCPTu executed in soils covering a wide range of grain size distributions within Austria and Germany. All cone penetration tests were performed by Premstaller Geotechnik ZT GmbH. Furthermore, the soil classification of core drillings was assigned to 490 insitu tests, which allows an interpretation of the insitu measurements based on the grain size distribution.
• Researchers can use the data, for example, to develop new approaches to soil classification or to the identification of soil layers based on CPT results.
• The data provide a framework for combining insitu measurements (qc, fs, Rf, u2, Vs), normalized measures (e.g. Qt, Bq, U2) and soil classifications (of core drillings) to achieve an improved characterization of soils.
• The dataset addresses the lack of publicly available datasets that can be used for benchmark tests in geotechnics (e.g. for machine learning applications). It is intended to serve especially as a basis for supervised machine learning techniques for CPT data processing.

Raw data
Cone penetration tests (CPT) allow continuous, rapid and cost-effective measurements over depth. Therefore, they are becoming increasingly popular in geotechnical engineering. During the test procedure, a probe is pushed at a constant rate (2 cm/s) into the subsurface and measures the tip resistance qc as well as the sleeve friction fs. In addition, the generated pore water pressure can be measured at position u2 (above the cone) when performing CPTu tests. The shear wave velocity Vs of soils can be determined in intervals of 50 cm by means of seismic CPT / CPTu (denoted as SCPT, SCPTu). The different types of cone penetration tests are summarized in Table 1.

Table 1 Overview of insitu measurements and number of tests for different test types.
Test type  Measurements    Total number of tests  Number of tests with soil classification
CPT        qc, fs          931                    336
CPTu       qc, fs, u2      312                    106
SCPT       qc, fs, Vs      46                     23
SCPTu      qc, fs, u2, Vs  50                     25

The dataset presents a collection of 1339 insitu tests executed by Premstaller Geotechnik in basins and valleys of various Alpine regions and forelands (Austria, Germany). The basins were formed during the last glacial period, remained as lakes after the melting of the ice masses and were often filled with fine-grained sediments over thousands of years [6]. Consequently, the sediment properties can vary strongly within a basin, and the sediments are in many cases additionally overlain by coarse-grained top layers. On the other hand, today's Alpine valley fills are often characterized by a succession of (from bottom to top): deposits from the glaciation (e.g. basal till), deposits from the period of glacial retreat (e.g. fine-grained lake deposits) and a cover of recent (Holocene) coarse-grained fluviatile deposits. Therefore, valley fills are usually characterized by a coarser grain size distribution and more heterogeneous subsoil properties than basins.
The respective basins or valleys in which the insitu tests are located are named within the supplementary data. Additional soil classifications based on core drillings located next to the insitu tests are included and assigned to these tests (see Section 1.3). The total numbers of insitu tests are listed in Table 1 for the different test types. Furthermore, the numbers of insitu tests with an additional soil classification (based on core drillings) are listed in column four.
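As a quick consistency check, the counts in Table 1 can be tallied against the totals quoted in the text (1339 tests, 490 of them with a soil classification). The short sketch below only re-adds the numbers from Table 1; the variable names are illustrative and not part of the dataset.

```python
# Values copied from Table 1:
# test type -> (total number of tests, number of tests with soil classification)
TABLE_1 = {
    "CPT": (931, 336),
    "CPTu": (312, 106),
    "SCPT": (46, 23),
    "SCPTu": (50, 25),
}

total_tests = sum(total for total, _ in TABLE_1.values())
classified_tests = sum(classified for _, classified in TABLE_1.values())
share_classified = classified_tests / total_tests

print(f"total tests: {total_tests}")
print(f"tests with soil classification: {classified_tests}")
print(f"share with soil classification: {share_classified:.1%}")
```

Roughly one third of the tests therefore carry a soil classification label, which is the subset relevant for supervised learning.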

Normalized parameters
In practical engineering, normalized parameters, which are calculated from the insitu measurements, are often used for soil characterization by means of soil behaviour type (SBT) charts. Nowadays, the SBT charts according to Robertson [7-10] (see Fig. 2) are especially widely used. Different SBT charts found in the literature are summarized in Table 5. For detailed information, reference should be made to Section 2.2.
All normalized parameters used within the dataset are described in Table 2. These normalized parameters can also be utilized to determine stiffness, strength and other properties of soils based on correlations [11].

Core drillings
Core drillings executed within a maximum distance of approximately 50 m from the insitu tests were interpreted according to EN ISO 14688-1. The soil types of a core drilling are also related to the different measurements of the cone penetration test, as shown in Fig. 1 (see also Section 2.4).
To enable a holistic interpretation including the grain size distribution, Oberhollenzer additionally defined seven groups. As shown in Table 3, each group covers a defined range of grain sizes. Soils that could not be directly assigned to a specific group were ignored and are marked with 0 within the dataset. An overview of the individual soil classifications is given in Table 3.
Table 2 Normalized parameters used within the dataset.
Parameters: Rf (%), friction ratio; Qt, normalized cone resistance; Qtn (-), updated normalized cone resistance; Fr, normalized friction ratio; Bq (-), pore pressure parameter; U2 (-), normalized excess pore pressure; Ic (-), soil behaviour type index.
Applications: various soil behaviour type charts; determination of normalized parameters (e.g. Qt, Fr); calculation of the updated normalized cone resistance Qtn (n ≤ 1.0); approximation of SBTn boundaries according to Robertson (1991).
Notes: σv = insitu total vertical stress; σ'v = insitu effective vertical stress; pa = atmospheric pressure in the same units.
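For orientation, the normalized parameters named in Table 2 can be sketched with the definitions commonly quoted for Robertson-type SBT charts. These formulas are an assumption taken from the general CPT literature; they are not confirmed as the exact Geologismiki implementation, and the function and argument names are hypothetical. Note that Rf is sometimes defined with qt instead of qc.

```python
import math


def normalized_parameters(qc, fs, u2, u0, sigma_v, sigma_v_eff, a=0.8):
    """Commonly quoted CPTu normalizations (assumed, not taken from the dataset).

    All stresses and pressures must share one unit (e.g. kPa).
    a is the net area ratio of the cone (0.75 to 0.85 for the tests described).
    Requires qt > sigma_v and fs > 0 so that the logarithms are defined.
    """
    qt = qc + u2 * (1.0 - a)           # corrected cone resistance
    qn = qt - sigma_v                  # net cone resistance
    Rf = 100.0 * fs / qc               # friction ratio in percent
    Qt = qn / sigma_v_eff              # normalized cone resistance
    Fr = 100.0 * fs / qn               # normalized friction ratio in percent
    Bq = (u2 - u0) / qn                # pore pressure parameter
    U2 = (u2 - u0) / sigma_v_eff       # normalized excess pore pressure
    # Soil behaviour type index based on Qt and Fr
    Ic = math.sqrt((3.47 - math.log10(Qt)) ** 2 + (math.log10(Fr) + 1.22) ** 2)
    return {"qt": qt, "Rf": Rf, "Qt": Qt, "Fr": Fr, "Bq": Bq, "U2": U2, "Ic": Ic}


# Illustrative input values in kPa (made up for this example)
params = normalized_parameters(
    qc=2000.0, fs=40.0, u2=300.0, u0=100.0, sigma_v=190.0, sigma_v_eff=100.0
)
for name, value in params.items():
    print(f"{name} = {value:.3f}")
```

The updated normalized cone resistance Qtn is omitted here because it requires an iteration over the stress exponent n, which depends on Ic.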

Structure of dataset (supplementary data)
The supplementary data (csv file) contain the insitu measurements as well as the normalized parameters for all insitu tests. For each test, a unique ID number was defined and the parameters are listed in 1 cm intervals over depth, except for the shear wave velocity, which was determined approximately every 50 cm. Parameters that were not measured insitu (e.g. u2 when performing a CPT), or whose values are smaller than -100 or larger than 10,000, are left blank. An extract of the data is given in Table 4, where exemplary parameters of CPT, CPTu, SCPT and SCPTu are shown. Besides ID number (column 1), test type (column 2), location (column 3), depth (column 4), insitu measurements (columns 5-8) and normalized parameters (columns 9-26), a soil classification (columns 27-28) based on the nearby core drillings is given. As explained in Sections 1.1 and 1.3, core drillings were assigned only to insitu tests executed next to the respective drilling. The soil classifications were homogenized according to EN ISO 14688-1 (column 27) and grouped according to their grain size distribution by Oberhollenzer (column 28).
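Because unmeasured or out-of-range parameters are left blank, a reader of the csv file has to treat empty cells as missing values rather than zeros. The following minimal sketch shows one way to do this with the Python standard library. The column names, the delimiter and the sample rows are hypothetical placeholders; the actual header of the supplementary file should be checked before use.

```python
import csv
import io

# Hypothetical names for a few numeric columns (not the real file header)
NUMERIC_COLUMNS = ("depth", "qc", "fs", "u2", "Vs")


def parse_cell(cell):
    """Return a float, or None for a blank (not measured / filtered) cell."""
    cell = cell.strip()
    return float(cell) if cell else None


def read_cpt_rows(stream, delimiter=","):
    """Read depth-wise records and convert the numeric columns."""
    rows = []
    for record in csv.DictReader(stream, delimiter=delimiter):
        for name in NUMERIC_COLUMNS:
            record[name] = parse_cell(record[name])
        rows.append(record)
    return rows


# Made-up sample standing in for an extract of the csv file
sample = io.StringIO(
    "ID,test_type,depth,qc,fs,u2,Vs\n"
    "1,CPT,0.01,1.2,0.010,,\n"
    "1,CPT,0.02,1.3,0.011,,\n"
    "2,SCPTu,0.50,0.8,0.020,0.05,150\n"
)
rows = read_cpt_rows(sample)

# Keep only records that carry a pore pressure measurement
rows_with_u2 = [row for row in rows if row["u2"] is not None]
print(len(rows), "records read,", len(rows_with_u2), "with u2")
```

For the real file, `read_cpt_rows` would be called on an opened file handle instead of the in-memory sample.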

General
In practical engineering, soils are often characterized based on their grain size distribution. The latter can be determined using laboratory tests (e.g. hydrometer and sieve analyses) or estimated based on subjective experience. On the other hand, the insitu behaviour of soils is strongly related to the stress history, density and degree of consolidation as well as to other physical and chemical processes [10].
Therefore, alternative classification procedures based on CPT or CPTu measurements have been developed by various authors. Based on so-called "soil behaviour type charts", the soil can be characterized using insitu measurements (e.g. qc, fs, u2) or normalized parameters (e.g. Qt, Bq, U2). This procedure requires neither soil sampling nor time-consuming laboratory tests. However, information regarding the grain size cannot be evaluated directly and has to be estimated based on experience. The dataset enables a direct comparison of cone penetration measurements as well as normalized parameters with soil classifications according to EN ISO 14688 (based on core drillings).

In-situ measurements and normalized parameters
The present dataset contains insitu tests executed by Premstaller Geotechnik ZT GmbH according to the current standards (ISSMGE: IRTP 1999/2001 [2], ASTM D:5778-95, 1996 [3]). For all tests, CPT trucks or CPT rigs were used to push standard probes (cross-sectional area of 15 cm²) at a constant rate of approximately 2 cm/s into the soil. In a second step, all insitu measurements (GEF format) were imported into the software "Geologismiki" [1] for post-processing and calculation of the normalized parameters (see Section 1.2). For this purpose, several assumptions were required:
• The net area ratio a depends on the geometry of the cone and is determined from calibration measurements in the laboratory. The value varies between 0.75 and 0.85 for the tests executed.
• A hydrostatic insitu pore water pressure distribution was assumed for all tests. The groundwater table was estimated based on u2 measurements or neighbouring core drillings.
• The insitu total and effective stresses were calculated based on a wet unit weight, saturated unit weight and buoyant unit weight equal to 19 kN/m³, 19 kN/m³ and 9 kN/m³, respectively. These values were defined based on personal experience.
• All normalized parameters (calculated based on CPTu and SCPTu) consider pore water pressure measurements at position u2. For CPT and SCPT, where no pore water pressure was determined, the u2 column was left blank.
• Normalized tip resistances Qtn > 1000 cannot be calculated with the software package "Geologismiki". Therefore, those values are set equal to 1001 within the dataset.
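The stress assumptions listed above can be illustrated with a short sketch. It assumes a single homogeneous profile, and it infers a unit weight of water of 10 kN/m³ from the difference between the saturated and the buoyant unit weight; both points are assumptions of this example, as are the function names. The last helper mirrors the convention that Qtn values above 1000 are stored as 1001.

```python
GAMMA_WET = 19.0       # kN/m³, unit weight above the groundwater table
GAMMA_SAT = 19.0       # kN/m³, saturated unit weight below the groundwater table
GAMMA_BUOYANT = 9.0    # kN/m³, buoyant unit weight
GAMMA_WATER = GAMMA_SAT - GAMMA_BUOYANT  # 10 kN/m³, inferred (assumption)


def insitu_stresses(depth, groundwater_depth):
    """Return (sigma_v, u0, sigma_v_eff) in kPa for depths given in metres.

    Assumes a hydrostatic pore water pressure distribution below the
    groundwater table and no capillary effects above it.
    """
    above = min(depth, groundwater_depth)         # thickness above groundwater
    below = max(depth - groundwater_depth, 0.0)   # thickness below groundwater
    sigma_v = GAMMA_WET * above + GAMMA_SAT * below
    u0 = GAMMA_WATER * below
    return sigma_v, u0, sigma_v - u0


def cap_qtn(qtn, limit=1000.0, flag=1001.0):
    """Replace values above the limit by the flag value used in the dataset."""
    return flag if qtn > limit else qtn


sigma_v, u0, sigma_v_eff = insitu_stresses(depth=10.0, groundwater_depth=2.0)
print(sigma_v, u0, sigma_v_eff)
print(cap_qtn(1500.0), cap_qtn(250.0))
```

The effective stress obtained this way equals the wet unit weight times the thickness above the groundwater table plus the buoyant unit weight times the thickness below it (19 · 2 + 9 · 8 kPa in the example).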

Soil behaviour type charts
Results of cone penetration tests executed at the standard rate (2 cm/s) can be interpreted by means of soil behaviour type (SBT) charts, which are based either on insitu measurements (raw data) or on normalized parameters. Nowadays, the charts developed by Robertson [8, 9, 10] are especially widely applied in practical engineering. These SBT charts are based on normalized parameters (Qt, Qtn, Fr, Bq; see Fig. 2). Consequently, u2 measurements are required for their application, and using CPT or SCPT data could lead to misleading classifications. Defined boundaries within the charts enable an efficient characterization of soils for different depth levels and a wide range of material behaviour. However, it should be kept in mind that these classification systems do not consider the grain size distribution. Table 5 summarizes the input variables as well as the required types of cone penetration test for different SBT charts found in the literature.

Table 5 Overview of developed soil behaviour type charts.
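As an illustration of how chart boundaries translate into a classification rule, the sketch below maps the soil behaviour type index Ic to the zones of the normalized Robertson chart. The boundary values are those commonly quoted in the CPT literature and are an assumption of this example; they are not taken from the dataset. Zones 1, 8 and 9 are not defined through Ic and are therefore not covered.

```python
from bisect import bisect_right

# Commonly quoted Ic boundaries between SBTn zones (assumption)
IC_BOUNDARIES = [1.31, 2.05, 2.60, 2.95, 3.60]

# Zones ordered by increasing Ic, i.e. from coarse to fine soil behaviour
ZONES = [
    (7, "Gravelly sand to dense sand"),
    (6, "Sands"),
    (5, "Sand mixtures"),
    (4, "Silt mixtures"),
    (3, "Clays"),
    (2, "Organic soils"),
]


def sbt_zone(ic):
    """Return (zone number, description) for a soil behaviour type index Ic."""
    return ZONES[bisect_right(IC_BOUNDARIES, ic)]


for ic in (1.0, 2.3, 2.69, 3.8):
    zone, description = sbt_zone(ic)
    print(f"Ic = {ic:.2f} -> zone {zone}: {description}")
```

Such a rule classifies by behaviour only; comparing its output with columns 27-28 of the dataset is one way to study how behaviour type and grain size distribution relate.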

Core drillings
Core drillings executed within a maximum distance of approximately 50 m from the insitu tests were interpreted according to EN ISO 14688-1. In the next step, the different soil layers within one core drilling were assigned manually to one or more insitu tests (depending on the distance). Thereby, the insitu tests and core drillings with the smallest distance to each other were combined. Furthermore, it was ensured that the lithologies (within the core drillings) agree with the general trend of the respective insitu measurements (e.g. a small grain size leads to small qc and fs). To improve the agreement between core drillings and insitu measurements, the elevation of layer changes was sometimes adjusted slightly (as shown in Fig. 1).
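The assignment described above was done manually. Purely as an illustration of its logic, the sketch below selects the closest core drilling within 50 m of an insitu test and looks up the soil layer at a given depth. The coordinates, drilling names and layer boundaries are invented for this example, and the plausibility check against qc and fs as well as the manual adjustment of layer elevations are not reproduced.

```python
import math


def nearest_drilling(test_xy, drillings, max_distance=50.0):
    """Return the ID of the closest drilling within max_distance (m), else None."""
    best_id, best_distance = None, max_distance
    for drilling_id, xy in drillings.items():
        distance = math.dist(test_xy, xy)
        if distance <= best_distance:
            best_id, best_distance = drilling_id, distance
    return best_id


def soil_at(depth, layers):
    """Return the soil label of the layer containing depth, else None.

    layers is a list of (top, bottom, label) tuples with depths in metres.
    """
    for top, bottom, label in layers:
        if top <= depth < bottom:
            return label
    return None


# Hypothetical plan coordinates in metres
drillings = {"B1": (30.0, 40.0), "B2": (3.0, 4.0), "B3": (100.0, 0.0)}

# Hypothetical core log with principal fractions named after EN ISO 14688-1
layers_b2 = [(0.0, 1.5, "Gr"), (1.5, 6.0, "Si"), (6.0, 9.0, "Sa")]

print(nearest_drilling((0.0, 0.0), drillings))
print(soil_at(2.0, layers_b2), soil_at(10.0, layers_b2))
```

Depths outside the logged interval return `None`, which corresponds to measurements without an assigned soil classification.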
The classification of the different core drillings was carried out by various engineers and/or geologists. Consequently, the (subjective) assessment can differ between the parties involved. It must be kept in mind that, especially for fine-grained soils (e.g. silts), the soil classification becomes difficult without additional laboratory tests.
It should be noted that the chosen methodology according to Oberhollenzer (see Table 3 and Table 4, column 28) for the categorization of soils is only one of many possibilities to investigate the insitu measurements in combination with the grain size distribution.

Ethics Statement
The work does not involve the use of human subjects or animal experiments.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.