Soils


Introduction
The demand for geospatial information has grown exponentially and, given the multiplicity of geotechnologies on the market, the production and distribution of geospatial data become more agile every day. To ensure quality, interoperability, and data sharing between producers and users of geospatial data and information, a standard for storing geospatial geotechnical data is needed.
A large amount of geotechnical data has been produced to support the elaboration of geotechnical maps, in view of the obligation established by the Federal Government. When combined with data from the public and private sectors, this results in a large volume of unstandardized, restricted-access geotechnical data. The situation is aggravated because the National Spatial Data Bank does not have a standard for geospatial geotechnical data.
There are data formats that provide a standard way to transfer geotechnical or geoenvironmental data between the contributing parties of a project, such as the Association of Geotechnical & Geoenvironmental Specialists (AGS) format and Geography Markup Language (GML), but they do not guarantee the consistency of the information from which the standard-format file was generated. When standardization occurs at the data modeling level, all stored information follows the rules defined in the model, and any added information is automatically checked against all integrity constraints.
The use of databases in geotechnics has been discussed by several authors since the 2000s (Priya & Dodagoudar, 2018), who point to several advantages of this practice, such as enabling information quality control, making information available in a single place, lowering the risk of information loss, and providing structured information to support the most diverse analyses.
In general, all databases are built on a data model, even if implicit, but few authors address the conceptual model, and they commonly develop solutions that meet only a specific need, as can be observed in (Bozio & Reginato, 2020; Priya & Dodagoudar, 2018; Moura et al., 2017; Ribeiro et al., 2016; Santos et al., 2018). Therefore, this work presents a proposal for a conceptual model of a geographic database for the geotechnics theme, using the Object Modeling Technique for Geographic Applications (OMT-G), and its implementation in a free database management system (DBMS), PostgreSQL / PostGIS.

Object Modeling Technique for Geographic Applications (OMT-G)
A data model is a set of concepts that can be used to describe data, its relationships, and constraints (Silberschatz et al., 2020). Among the existing models, the OMT-G expands the Unified Modeling Language (UML) by introducing bidimensional geographical primitives like points, polygons, and lines, which increase its semantic representation capacity.
The structure of the OMT-G model is based on three main concepts: classes, which are responsible for data representation in geographic applications; relationships, which explain how a class is related to other classes; and constraints, which are the rules that must be followed in the database to ensure data integrity. The model allows space to be modeled and represented as non-spatial, continuous, and discrete data, the last with different types of geometry and topological relations (Borges et al., 2001, 2005), providing an integrated view of the modeled space (Figure 1).
As for the relationships between classes, the OMT-G model allows simple associations, represented by continuous lines; spatial associations, represented by dashed lines; generalization, which defines more generic classes (superclasses) from classes with similar characteristics (subclasses); and specialization, which is the reverse process (Figure 1). Relationships are characterized by their cardinality, which represents the number of instances of a class that can be associated with instances of the other class. More information about OMT-G can be found in Borges et al. (2001, 2005), Davis (2000), Queiroz & Ferreira (2006), and SPUGeo (2021).

Materials and methods
Conceptual modeling was performed using the open-source StarUML 5.0 (Lee et al., 2005), which has an OMT-G module for the visualization and modeling of class and transformation diagrams. The logical schema was built with pgModeler 0.93 (Silva, 2021), an open-source database modeling program. The physical implementation was done via pgAdmin version 5.4 (PostgreSQL Global Development Group, 2021) in the chosen database, PostgreSQL version 11.17 (PostgreSQL Global Development Group, 2021), with the spatial extension PostGIS version 3.0 (POSTGIS, 2021), which supports 2D and 3D geographic objects and queries. 3D data visualization and analysis were performed using QGIS version 3.16 (QGIS, 2021), which allows direct connection to the database. All manipulation and visualization of the data inserted in the database is done through a geographic information system, which can display the information in two or three dimensions using its own tools and can import and export various data formats, such as vector, raster, tabular, text, and image formats. Other thematic information, such as topography, geology, pedology, and geomorphology, can be inserted in the database or in the GIS itself and analyzed together with the geotechnical data.

Compiling pre-existing data
Approximately 4,850 quantitative and qualitative geotechnical data records from mappings, laboratory tests, and field investigations conducted in academic research and by agencies of the Federal District Government (GDF) were compiled; their spatial distribution is presented in Figure 2. All compiled data were submitted to a pre-processing routine consisting of georeferencing and data quality analysis.

Requirements gathering
The process begins by choosing the objects that will be represented by conventional or georeferenced classes during the abstraction process used to elaborate the conceptual model, which will later be implemented in the database. In this step, the requirements of the information itself and the spatial representation of the chosen objects are defined.
The requirements consist of the definitions of the different laboratory tests and geotechnical field investigations that delimit the scope of the proposed data model, together with the semantic constraints, also known as business rules, that are inherent to geotechnical data. For example, the spatial constraint that two laboratory tests represented as three-dimensional points, both of which alter the arrangement of the soil particles, cannot overlap spatially is linked to a semantic constraint: when we want to know a geotechnical property of a soil in its in-situ condition, both tests cannot be performed on the same sample. However, a particle size test with sedimentation can be executed on the material resulting from a previous test, and in this case the spatial overlap is allowed.

Conceptual modeling
In this stage, the class diagram containing the conventional and georeferenced classes was elaborated with the spatial representation defined during the requirements gathering step, followed by the definition of the relationships between classes, whether simple, topological, semantic, or user-defined, and finally the cardinality of those relationships. The proposed model was based on previous experiences with geological and geotechnical databases and data models (Gao, 2007; OGC, 2017; Ribeiro et al., 2016; Santos et al., 2018; Silva, 2005; Silva, 2007; Tegtmeier et al., 2014; Zand, 2011). In the above-mentioned works, the implementations and models are restricted to only one type of field investigation or laboratory test, especially standard penetration tests.
Since OMT-G does not specify 3D geometric primitives, routines for the construction of 3D geometries were defined in the transformation diagram, based on computational geometry transformations such as buffer construction, extrusion, and expansion, using the locational information of the data and the dimensions of the three-dimensional object to be constructed by the database. This article does not include the presentation diagram, since the focus is on storing information within the database, although an example of the graphical representation of three-dimensional geotechnical data is presented.

Logical schema and physical implementation
In the logical schema, the data were organized as they will be stored in the database; this is where the primary and foreign keys, normalization, and referential integrity are defined. All information contained in the logical schema is recorded in the data dictionaries and metadata, which follow the Geospatial Metadata Profile of Brazil specification (IBGE, 2021).
The database is composed of tables, views, and spatial and non-spatial indexes implemented in Structured Query Language (SQL) code, which is being compiled into a PostgreSQL extension that will be available at https://github.com/bro-geo/geotechnical_database.

Compiled requirements
The main objects of interest of the model are field investigations, laboratory tests, samples, and geotechnical units. The main types of laboratory tests and field investigations conducted in academic research and by government agencies of the GDF were chosen to compose the proposed conceptual model, while preserving the possibility of expanding the model if necessary.
Field investigation is a method of obtaining information in the field, on the surface or subsurface, in which the researcher may or may not have contact with the sampled material to obtain its physical properties (Marrano et al., 2018).
Field investigations are represented by point or volume geo-objects, which can overlap spatially if they are not executed in the same period. This rule does not apply to the field point subclass, which can overlap other subclasses in any period, or to the piezometer subclass, which cannot be spatially overlapped by other types of field investigations except field points. The geometry of the investigations superclass must be constructed using latitude, longitude, point elevation, and any information related to the shape of the investigation, such as diameter, and it is essential that all investigations share the same horizontal and vertical reference system. This superclass does not need to be related to the sample class or the laboratory tests superclass.
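The overlap rules for field investigations stated above can be illustrated with a minimal Python sketch. It is not part of the implemented database: the `Investigation` record, its field names, and the subclass labels are hypothetical, and the piezometer rule is read here as forbidding any other investigation type (except field points) that does not finish before the piezometer is installed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Investigation:
    """Hypothetical record; field names are illustrative, not the schema's."""
    subclass: str  # e.g. "field_point", "piezometer", "drilling"
    start: date    # first day of execution (or installation)
    end: date      # last day of execution


def periods_intersect(a, b):
    """True when the two execution periods share at least one day."""
    return a.start <= b.end and b.start <= a.end


def may_overlap(a, b):
    """Decide whether two spatially overlapping investigations are acceptable."""
    # Field points may overlap any subclass in any period.
    if "field_point" in (a.subclass, b.subclass):
        return True
    # A piezometer tolerates another investigation type only if that
    # investigation ended before the piezometer was installed (assumed reading).
    for piezo, other in ((a, b), (b, a)):
        if piezo.subclass == "piezometer" and other.subclass != "piezometer":
            return other.end < piezo.start
    # General rule: overlap is allowed only in different periods.
    return not periods_intersect(a, b)
```

Under these assumptions, a drilling that ends before a piezometer is installed at the same location is accepted, whereas a drilling started afterwards is rejected.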
Laboratory tests consist of tests carried out within a laboratory on soil or rock samples, to obtain the physical, mineralogical, mechanical, and hydraulic properties of the materials and/or categorize the tested materials by their geotechnical properties (Head, 2006).
The laboratory tests are represented by point or volume geo-objects, which cannot overlap spatially regardless of the execution date, except under the rules presented below. Two laboratory test geometries can overlap spatially when we are not interested in the in-situ condition of any geotechnical property. The geometries of the subclasses compaction and California bearing ratio (CBR) can overlap each other because the CBR test is executed on a compacted soil cylinder. Two geometries of laboratory tests executed on deformed samples can overlap each other. A laboratory test executed on a deformed sample can overlap a laboratory test executed on an undeformed sample if it was executed after the undeformed sample test.
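As an illustration only, the exceptions listed above can be written as a single decision function. The `LabTest` record, its field names, the subclass labels, and the `in_situ_condition` flag are hypothetical and do not come from the implemented schema; the first exception is assumed to apply when neither test needs to reflect the in-situ condition.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LabTest:
    """Hypothetical record; field names are illustrative, not the schema's."""
    subclass: str                   # e.g. "triaxial", "cbr", "compaction"
    deformed_sample: bool           # True if executed on a deformed sample
    executed_on: date
    in_situ_condition: bool = True  # result must reflect the in-situ state


def lab_tests_may_overlap(a, b):
    """Decide whether two spatially overlapping laboratory tests are acceptable."""
    # Exception 1: no in-situ property is of interest (assumed: for neither test).
    if not a.in_situ_condition and not b.in_situ_condition:
        return True
    # Exception 2: CBR is executed on a compacted soil cylinder.
    if {a.subclass, b.subclass} == {"compaction", "cbr"}:
        return True
    # Exception 3: both tests were executed on deformed samples.
    if a.deformed_sample and b.deformed_sample:
        return True
    # Exception 4: a deformed-sample test executed after the undeformed one.
    if a.deformed_sample != b.deformed_sample:
        deformed, undeformed = (a, b) if a.deformed_sample else (b, a)
        return deformed.executed_on > undeformed.executed_on
    # Default: two tests on undeformed samples never overlap.
    return False
```

In a database, a rule of this kind would more likely live in a trigger or check routine; the function form is used here only to make the order of the exceptions explicit.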
The geometry of the laboratory tests superclass must be constructed using the latitude and longitude coordinates, the elevation of the point, and any shape-related information, and it is essential that all laboratory tests share the same horizontal and vertical reference system. The laboratory tests superclass needs to be spatially related to the sample class and the field investigation superclass. Table 1 presents the subclasses of the field investigations and their respective descriptions and spatial and semantic constraints, which were based on Marrano et al. (2018), ABNT (2018), ABNT (2016) and ABNT (2020).
Tables 2 and 3 present the subclasses of laboratory tests and their respective descriptions and spatial, semantic, and user-defined constraints, which were based on ABNT (2018), Head (2006), and Head & Epps (2011). As every test comes from a sample, it is necessary to compile the requirements of this class. A sample can be defined as a material, rock or soil, collected through field investigations that can be used for performing laboratory tests. For soils, the sample is said to be disturbed when its natural structure has been modified by breaking the structure of the soil without variation of its moisture content. A deformed sample is one that does not maintain all the characteristics that occur in situ, whereas an undeformed sample is obtained in a way that preserves the soil characteristics that occur in situ (ABNT, 1995).
Samples are represented by point-type geo-objects on small scales and by polygon and/or volume geo-objects on large scales, and cannot overlap one another regardless of period. A sample should always be related to the investigation in which the collection was performed but does not necessarily need to be related to laboratory tests, because it may not have been subjected to any type of test; that is, the geometries of this class must be contained in the field investigations superclass and must contain the laboratory tests that have the same identifier code.

Table 1. Subclasses of the superclass investigations and their respective descriptions and constraints.

Subclass of investigations: Field Points
Description: Any point on the earth's surface or subsoil that contains information relevant to an engineering project.
Spatial and semantic constraints: Geometries of this subclass can overlap one another regardless of period and can overlap geometries from other subclasses of the investigations superclass regardless of period.

Subclass of investigations: Concentric rings
Description: Field test to determine the speed of water infiltration into the soil.
Spatial and semantic constraints: Geometries of this subclass can overlap geometries of other subclasses of the investigations superclass regardless of period, with the exception of the piezometer, and can overlap geometries of this subclass at different periods. Geometries of this subclass cannot be related to the sample and laboratory test classes. Related to conventional tables holding the measurement values.

Subclass of investigations: Piezometer
Description: Field test that allows the water pressures and the position of the groundwater level in the rock mass to be checked.
Spatial and semantic constraints: Geometries of this subclass cannot be overlapped by geometries of other subclasses of the investigations superclass with a period after the piezometer is installed, except for field points.

Subclass of investigations: Drilling
Description: Field test to collect deformed soil or rock samples, depending on the drilling type. The borehole is commonly used to perform soil infiltration tests or pressure water loss tests in rocks.
Spatial and semantic constraints: Geometries of this subclass can overlap geometries from other subclasses of the investigations superclass at different times, with the exception of the piezometer.

Subclass of investigations: Inspection trenches and pits
Description: Vertical excavation (circular, square or rectangular section) that gives a researcher access for visual inspection of the walls and bottom and for the removal of representative samples (undeformed and/or deformed).
Spatial and semantic constraints: Polygon and/or volume geometry on large scales and point geometry on small scales. The investigations superclass must contain a geometry with the same geociu originating from this class.

Subclass of investigations: All subclasses
Spatial and semantic constraints: Inherit the constraints, geometry, and relationships of the investigations superclass unless otherwise specified.

The geotechnical units are defined by lithological, pedological, hydrogeological, and geomorphological conditions that present a homologous geotechnical behavior and are represented by geo-objects of the polygon or volume type. This article proposes a third form of abstraction of geotechnical units, which consists of subdividing the volume of the geographical feature into a regular mesh of representative elementary volumes, represented by their respective centroids. Geotechnical units related to the same geotechnical map cannot overlap, cannot have gaps, and must be contained within the boundary of the geotechnical map.
Related to the rock unit, we have the rock mass class, which is composed of the intact rock, discontinuities, water, and the stress state. The rock mass is represented by point-type geo-objects on small scales and by polygon or volume geo-objects on large scales. This separation of the rock mass is important because, depending on the scale at which the problem is addressed, the same rock unit can present distinct geomechanical behaviors.
The geotechnical unit also relates topologically to the geotechnical sections and the boundary of geotechnical maps. Geotechnical sections can be represented by a line-type geo-object, which corresponds to the alignment of the section, and by polygon-type geo-objects corresponding either to a 2D section, which is a simplified two-dimensional representation of the 3D geotechnical reality, or to the three-dimensional section itself. Table 4 presents the subclasses of the geotechnical units and their descriptions and spatial and semantic constraints.
The boundary of geotechnical maps consists of the polygons of the areas in which geotechnical maps have been drawn; it must be represented by a polygon-type geo-object, and overlaps and gaps between geometries are allowed. All topological relationships between the geo-objects of the model, regardless of the geometric primitive, are specified in the class diagram, and the spatio-temporal relationship is based on the execution date of a field investigation or laboratory test.
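The regular-mesh abstraction of geotechnical units proposed in this section can be sketched as follows. This is only an illustration under simplifying assumptions: the unit is taken to be an axis-aligned box, the elementary volumes are cubes, and the function name `rev_centroids` and its parameters are hypothetical. A real unit would additionally be clipped to its polygon or volume geometry.

```python
def rev_centroids(xmin, ymin, zmin, xmax, ymax, zmax, size):
    """Centroids of the cubic cells of edge `size` that fit inside the box.

    Assumption: the geotechnical unit is approximated by its bounding box;
    cells that would extend past the box are discarded.
    """
    nx = int((xmax - xmin) // size)
    ny = int((ymax - ymin) // size)
    nz = int((zmax - zmin) // size)
    return [
        (xmin + (i + 0.5) * size,
         ymin + (j + 0.5) * size,
         zmin + (k + 0.5) * size)
        for i in range(nx)
        for j in range(ny)
        for k in range(nz)
    ]


# A 10 m x 10 m x 5 m unit meshed with 5 m cells yields four centroids.
centroids = rev_centroids(0, 0, 0, 10, 10, 5, 5)
```

Each centroid could then be stored as a 3D point carrying the attributes of the elementary volume it represents.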

[Laboratory test subclass table, surviving row]
Description: Laboratory test to measure the support capacity of the sub-base and subgrade.
Constraints: All constraints established for other subclasses. Geometries related to this subclass can overlap geometries from the compaction subclass as long as they have the same identifier code.

Table 4. Subclasses of the geotechnical unit superclass and their respective descriptions and constraints.

Subclass of the geotechnical unit: Soil Unit
Description: Mapping units defined by pedological conditions presenting homologous geotechnical behavior or soils categorized by a soil classification system for engineering purposes.
Spatial and semantic constraints: Inherits constraints, representation, and geometry unless otherwise specified. Units related to the same geotechnical map cannot overlap, cannot have gaps, and must be contained within the boundary of the geotechnical map. Gaps and inner rings are allowed because rock units or other soil units can occur between or within the unit.

Subclass of the geotechnical unit: Rock Unit
Description: Mapping units defined by lithological conditions that present homologous geotechnical behavior.
Spatial and semantic constraints: All constraints established for soil units. For rock units, there is an intersection relationship with geological structures of the alignment type and with structures of the plane type.

Proposed conceptual model
Based on the analyses performed regarding the topological and semantic characteristics of the classes, a conceptual model that represents the geotechnical field investigations and laboratory tests in an objective and coherent manner is proposed (Figure 3). In the diagram classes, the term "ge" denotes large scales and "pe" denotes small scales. The subclasses of field investigations and laboratory tests inherit the common information from their respective superclasses, while the sample class is responsible for relating investigations and laboratory tests spatially and through the unique identifier code. The field investigations superclass is responsible for generating the unique identifier code of the geotechnical data, here called geociu, whose construction is based on the articulation of the systematic mapping charts of the Federal District and a point on the surface of the geometry of the field investigation.
From the investigations superclass, any number of subclasses can be derived, provided they meet its specification. In the proposed model, the subclasses Field Points, Concentric Rings, Piezometer, Trench, and Drilling are obtained from the field investigation superclass through specialization on the type-of-investigation attribute. Each of these subclasses is responsible for storing the information related to geotechnical investigations of its type.
In the relationship between the investigations superclass and its subclasses, the partial overlapping specialization was adopted: overlapping because two investigations can be conducted in the same place, and partial because the subclasses presented do not cover all the possibilities of field investigations. This relationship validates situations such as a drilling followed by the installation of a piezometer, but it does not prevent the overlap of two drillings, which would be a problem because the soil would no longer be in in-situ conditions. Time could be the variable that makes the second case feasible, if the tests were not executed in the same period.
In the case of the laboratory test superclass, any number of subclasses may be derived, provided they meet the definition of the superclass. In the proposed model, the following subclasses are derived from the laboratory test superclass through specialization on the type-of-test attribute: moisture, Atterberg, physical indexes, particle size test, permeability test with variable or constant load permeameter, California bearing ratio, direct shear, simple compression, consolidation, MCT characterization tests, and triaxial.
In the relationship between the laboratory test superclass and its subclasses, the partial disjoint specialization was adopted: disjoint because two tests cannot be performed on the same undeformed sample, and partial because the subclasses presented do not cover all the possibilities of geotechnical tests. Although the partial disjoint specialization was adopted, there are some exceptions to this rule, as defined in the requirements section, that need to be implemented in the database. The other conventional tables are intended to store the results of the test measurements.
The transformations that involve the classes mentioned above are presented in Figure 4. For the generation of 3D geometries in the database, three routines for transforming two-dimensional data into three-dimensional data were proposed. For investigations and tests that collect or use cylindrical samples, a buffer is created with the radius of the sample, followed by extrusion by its height. In the case of a square or rectangular sample, an expansion is made on the X and Y axes, followed by extrusion by the height of the sample. For classes that have a polygon or multi-polygon geometry, as is the case of samples and trenches, it is only necessary to extrude by the depth.
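The three routines can be illustrated with a small, library-free Python sketch. It is not the database implementation: the function names are hypothetical, the circular buffer is approximated by a regular polygon, and extrusion is assumed to run downward from the elevation of the reference point. In PostGIS, the analogous operations would presumably be `ST_Buffer`, `ST_Expand`, and `ST_Extrude`, although the text does not name the functions used.

```python
import math


def buffer_point(x, y, radius, segments=32):
    """Approximate a circular buffer around (x, y) as a regular polygon."""
    return [
        (x + radius * math.cos(2 * math.pi * i / segments),
         y + radius * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]


def expand_point(x, y, length, width):
    """Rectangle centred on (x, y): `length` along X, `width` along Y."""
    hx, hy = length / 2, width / 2
    return [(x - hx, y - hy), (x + hx, y - hy), (x + hx, y + hy), (x - hx, y + hy)]


def extrude(footprint, z_top, height):
    """Extrude a 2D footprint downward from z_top (assumed direction).

    Returns the bottom and top rings of the resulting prism.
    """
    z_bottom = z_top - height
    bottom = [(px, py, z_bottom) for px, py in footprint]
    top = [(px, py, z_top) for px, py in footprint]
    return bottom, top


def polygon_area(ring):
    """Shoelace formula for a simple polygon given as (x, y) vertices."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def prism_volume(footprint, height):
    """Volume of the prism obtained by extruding the footprint by `height`."""
    return polygon_area(footprint) * height


# Cylindrical sample: buffer by the radius, then extrude by the height.
cylinder = extrude(buffer_point(0.0, 0.0, 0.05), z_top=1000.0, height=0.10)
# Rectangular sample: expand on X and Y, then extrude by the height.
block = extrude(expand_point(0.0, 0.0, 0.30, 0.30), z_top=1000.0, height=0.30)
```

For polygon geometries such as trenches, only the `extrude` step applies, with the depth used as the height.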

Logical schema and physical implementation of database
During the elaboration of the logical schema and the physical implementation of the database, it was necessary to consider the storage of historical series, measurement results, and information related to the execution of tests. Conventional tables were created during the implementation to store the results of measurements taken during the execution of tests such as the concentric rings test, the information of laboratory test measurements, and historical series, for example, the water level in a piezometer. Because the database user should not have to open multiple tables to obtain information, selections were created as materialized views to facilitate access to the data.
The Field Points and Concentric Rings classes inherit the geometry of the superclass and were treated as conventional classes during physical implementation. The field points class was split into two tables during implementation, one to store soil profiles and the other for rock outcrops.
Rotary, percussion, and auger drilling, PANDA penetrometer, cone penetrometer, Guelph permeameter, and vane test are generated from a query over the field investigations superclass, the drilling subclass, which is responsible for storing the volumes of the subclasses, and the conventional tables "rotary", "percussion", "auger", "panda_penetrometer", "cone_penetrometer", "guelph", and "vane_test", respectively. In the DBMS, this selection was made by creating materialized views.
Considering that a point would not represent the area investigated by trenches and pits at larger scales, and that generating a centroid inside the polygon that originates it is a simpler procedure than generating a polygon from a point, it was decided to represent them by polygon-type geo-objects whose centroid should be included in the field investigation superclass.
The target of the generated 3D geometries varies with the class. The investigations superclass stores the 3D point data, while the sample class stores the 2D sample projection as a polygon. The laboratory test superclass uses the point registered in this class as a reference and creates the two-dimensional projection based on the radius, or on the length and width, depending on the sample shape of the test; this projection is stored in the sample class. Based on this projection and the height of the sample, it generates the tested volume, which is stored in the test class. The dimensions of the samples evaluated, in 2D or 3D, are represented in the "tests_geom3d" table, which has the same purpose as the drilling table, namely to store the volumes of the subclasses.
Figure 5 presents the geometry of SPT-type field investigations and a geotechnical section projected in two dimensions, together with the original geometries of the section and investigations in three dimensions. QGIS allows any information in the database to be queried using PL/pgSQL expressions and the data to be visualized using the 3D viewer. Figure 5 is just one example of the many possible applications that can be built on the data inserted into the database. The use of pre-existing geotechnical data in the elaboration of geotechnical maps, for example, facilitates their elaboration, and the time previously invested in compiling and harmonizing information can be reallocated to other activities. Besides geotechnical cartography, a geotechnical database is also important as a preliminary source of information for engineering projects such as foundations and excavations.
The sections were prepared based on the number of blows and the geological origin of the soils available from boreholes. To include more information in the section, or any other information together with the geotechnical data, it is only necessary to perform a spatial analysis between the section and the other information available in the database, or to insert the data in the GIS together with the section in the same project, since all the information shares the same reference system. Figure 6 shows in detail the sections shown in Figure 5. Other types of information are not displayed in the section because there is no information available in the same place. As this information is compiled from several sources, there are no cases with overlaps of different types of field investigations and laboratory tests.

Conclusion
Considering the reality of the Federal District, in which the more than four thousand geotechnical investigations compiled to make up this database were restricted to their respective sources, there is a real need to build a geospatial database that is compatible with the Spatial Data Infrastructure of the Federal District and the National Spatial Data Bank in order to disseminate this information.
The implementation of the proposed model allows the systematic and periodic organization of data produced by various agencies or companies, improves the quality of stored data, and facilitates the interoperability of geotechnical data between producers and consumers of geoinformation, in addition to optimizing investigation plans, improving the planning of investments for geotechnical studies, and optimizing the execution of future construction projects.
The OMT-G model proved appropriate for obtaining adequate representations of laboratory tests and field investigations. Relationships such as specialization can define more specific classes from generic classes by adding new properties in the form of attributes, as with field investigations and laboratory tests and their respective subclasses. This type of relationship also makes it possible to specify that two field tests can be done in the same location, such as a drilling followed by the installation of a piezometer, but that two tests cannot be done on the same sample, such as a triaxial test followed by a simple compression test.
Although the OMT-G model was not designed to model data with a time property, the characteristics of geotechnical data mean that queries on the date of execution or registration of an investigation are easily constructed and are sufficient to retrieve information related to the temporal issue.
The OMT-G was also satisfactory in modeling three-dimensional data when the class and transformation diagrams are associated. All the operations required for the construction of the three-dimensional geometries are available in the chosen DBMS, and the available topological relationships meet those specified in the class diagram.
The PostgreSQL DBMS proved to be robust and stable as the basis for the three-dimensional geotechnical database, and its integration with a GIS such as QGIS makes it possible to use all of the GIS functionalities to analyze the data in question. All the structures and relationships mentioned during conceptual modeling and the structures presented in the logical schema have been successfully implemented in the PostgreSQL database and are being compiled into the PostgreSQL extension that will be available at https://github.com/bro-geo/geotechnical_database. Finally, this model will serve as the basis for the development of an application for geotechnical database management in QGIS. Later, the model will be expanded to include more objects of interest of the geotechnics theme and greater interoperability with other databases, such as the Multifinalitary Technical Cadastre of the Federal District.