Comparison of Ordinary and Universal Kriging interpolation techniques on a depth variable (a case of linear spatial trend), case study of the Šandrovac Field

Until now, Universal Kriging has not been used for the mapping of geological data in Croatia. However, it is one of the most frequently used methods of Kriging, probably the most adequate in cases when the input data is marked by a common trend. That exact feature is often an attribute of deep geological data, and thereby that of structural maps. Mapped surfaces in a row of examples have a structural trend towards one cardinal direction, or a sequence of geological structures, like anticlinorium, is a part of a structural unit of a higher order such as regional monocline. An example is given of geographical trend recognition in e-log Z’ surface spread in Šandrovac Field as well as successful mapping of that marker depth variable by using Universal Kriging.


Introduction
Geostatistics is widely used in geology as an interpolation approach for mapping different geological variables like depth, thickness, porosity or permeability. One of the most used estimators for the interpolation of spatial data are Kriging techniques. They are considered an interpolation method for the estimation of a regionalized variable at selected grid points that predict values from interpolation without bias, and with minimum variance. There are different types of Kriging techniques, such as Ordinary Kriging (abbr. OK), Universal Kriging (abbr. UK), Indicator Kriging, Co-kriging and others. The most commonly used method is OK, but the choice of which kriging to use depends on the characteristics of the data. Geostatistical interpolation and variogram analysis has been the standard analytical tool in Croatian geology over the last 10-15 years. Even so, UK has not been used yet for the mapping of geological data in Croatia. It is a method used for the estimation of spatial means when data has a strong trend and the trend can be modelled by simple functions. UK should also be used when the dependent variable does not meet the criterion of second order stationarity necessary for kriging. Second order stationarity means that the mean and variance are the same on the entire area and that the correlation between any two observations depends only on their relative position in space. Even so, UK also has its limitations, like a surface area. If we analyze a large surface, like a big country or a continent, it is difficult to follow a trend along the direction of spreading, so in that case it is not advisable to use a UK. However, when describing a structural local phenomenon, this technique should be applied and the results compared to other kriging techniques in order to decide whether to use it or not.
The aim of this paper is to describe the UK method and introduce it as another statistical tool for mapping variables in Croatian geology. Šandrovac Field was selected as a case study. In this area, the OK and UK methods will be compared and described by mapping a depth variable, and their advantages and disadvantages will be discussed.

Study area
The entire Šandrovac Field covers a small surface area of about 38 square km, located 18 km east from the city of Bjelovar. It represents a classical uplifted structure which is geographically placed in the Bjelovar Subdepression (see Figure 1), a separate regional geotectonic unit in the Croatian part of the Pannonian Basin (abbr. CPBS).  Ma), salinity decreased due to a reduced sea area, and brackish environments were formed. The 2 nd transtension lasted from the Late Pannonian (9.3-7.1 Ma) to the Early Pontian (7.1-6.3 Ma). It was marked by a deposition of monotonous series of sandstones and marlstones. The 2 nd transpression started in the Late Pontian (6.3-5.6 Ma) and was characterized by different weakly consolidated and unconsolidated clastics with sporadical peat and coal (e.g. Malvić 2011, Mesić Kiš and Malvić, 2014). The Bjelovar Subdepression was not situated on the main direction of transposal of materials, particularly in the transpressional periods for which the sediment income was significantly lower. Therefore, the thickness of the Neogene-Quaternary clastics above the Mesozoic carbonate basement or Paleozoic igneous/metamorphic basement, rarely exceeds 3000 m in contrast to 7000 m in the central Drava Depression zone. The area of the Bjelovar Subdepression is considered a mature petroleum province. Accordingly, Šandrovac is an oilgas field that was put into production since 1967. Its exploitation was in accordance with the strategic plan of INA d.d. (Zornjak, 2009).

Methods
Dataset was provided from the doctoral thesis and an article published by Malvić (2003Malvić ( , 2011. Figure 2 shows a palaeostructural map of marker Z' from which 18 well data coordinates were taken. E-log marker Z' was selected as a mapping horizon due to its chronostratigraphically importance in the Drava Depression. It is situated approximately in the middle of the 2 nd transtension and represents a border between the Pannonian and Pontian (and between the formations of Ivanić Grad and Kloštar Ivanić).  Table 1 shows well coordinates and values of the "depth" variable that were used for variogram analysis and as input data for Kriging techniques. The dataset was analyzed using OK and UK. Since a variogram analysis is necessary for map interpolation using Kriging methods, it is also briefly described. The emphasis will be on UK, an interpolation method that has not been used for the mapping of geological data in Croatia.  1000  6420400  5088400  880  6420400  5087000  1120  6420400  5086600  1200  6420800  5086000  1300  6422000  5088200  1040  6421800  5087000  1320  6421900  5086400  1430  6421900  5086000  1500  6422000  5085800  1520  6423000  5085000  1700  6421000 5085600 1480

Variogram analysis
A variogram (2γ) is one of the basic geostatistical tools that is used to determine spatial dependence. It is often referred to as a semivariogram (γ), which has exactly the same characteristics, except that in the denominator of the equation, the number 2 is eliminated. A variogram is mathematically expressed by Equation 1: Where: N(h) -number of data pairs at distance "h" (inside searching neighbourhood area), z(n) -value at location n, z(n+h) -value at location n+h.
Calculation of an experimental variogram is necessary input for different geostatistical interpolation techniques, like Kriging. An experimental (semivariogram) curve is shown in Figure 3. On such a curve, the following parameters can be read: nugget -C0, sill (variance) -C, rangea, and distanceh.
A given experimental variogram is then approximated by the theoretical model (see Figure 4). In the interpretation of geological variables, commonly used theoretical models are spherical, exponential and a Gaussian model.

Figure 4: Theoretical models
The following equations determine the behavior of a certain variable that was approximated by a selected theoretical model (e.g. Isaaks i Srivastava, 1989):

Ordinary Kriging
OK is one of the most commonly used Kriging techniques, a geostatistical interpolation technique that is described by the acronym BLUE -"best linear unbiased estimator". It is the "best" because it aims at minimising the variance of the errors, "linear" because its estimates are weighted linear combinations of the available data, and "unbiased" since it tries to have the mean residual or error equal to 0. The principle of Kriging is to estimate values of a regionalized variable at a selected location (Zk), based on the surrounding existing values (Zi). Selected locations are assigned a relevant weighting coefficient (λi) which represents the influence ofparticular data on the value of the final estimation at the selected grid node. Variogram values express the relationship between the existing (hard) data and the estimation point, or by covariance in case of second order stationarity (e.g. Malvić and Balić, 2009). OK premise is a constant unknown mean in the local neighborhood of each estimation point. Local variance of the data within the search ellipsoid is used for estimation, which is useful in the case of a small number of input data (15 or 20). Then, the global variance often does not reflect local changes, so deviations of the mean and estimation can be large. In the OK technique, the amount of kriging variance is minimized using a linear external parameter called the Lagrange factor (μ). The limiting factor minimizes error and assessment becomes impartial. The condition when assessing the OK technique is that the sum of all weights is equal to 1 (e.g. Malvić et al., 2008). The OK equation in matrix form is:

Universal Kriging
The aim of the UK method is to predict Z(x) at an unsampled area as well. It splits the random function into a linear combination of deterministic functions, the smoothly varying and nonstationary trend, that is also called a drift μ(x) ϵ ℝ, and a random component Y(x):= Z(x) -μ(x) representing the residual random function (Wackernagel, 2003). OK assumes a stationary, i.e. constant mean of the underlying real-valued random function Z(x). But in reality, the mean value varies, it is often not constant across the entire study area and the variable is said to be nonstationary. A nonstationary regionalized variable can be considered as having two components (Davis, 1973); drift (average or expected value of the regionalized variable) and a residual (difference between the actual measurements and the drift). The method of UK assumes that the mean m(x) has a functional dependence on the spatial location and can be approximated by a model with the equation (e.g. Kumar, 2007): Where: al -lth coefficient to be estimated from the data, fl -lth basic function of spatial coordinates that describes the drift, k -the number of functions used in modelling the drift.
UK is also known as kriging with a trend or kriging in the presence of a drift. Spatial trend or a drift represents any detectable tendency for the values to change as a function of the coordinate variables. The mean can be a function of the coordinates in linear, quadratic or higher form. The UK considering linear trend was used for depth spatial interpolation. The mathematical model with two variables is (e.g. According to Journel and Huijbregts (1978), the minimisation of Equation (10) that is subject to the constraint of Equation (9), using the Lagrange multiplier (μ), results in the following UK system: Where: γ(si, sj) -semivariogram between two points si and sj, μl -Lagrange multiplier associated with the lth unbiased condition.
The optimum weights λi can be obtained by solving these equations simultaneously. For the calculation of λ-s, a variogram which is a measure of spatial continuity of the data, is required (Kastelec and Košmelj, 2002). The variance of this estimation is given in Equation 12 (e.g. Kumar, 2007): The process of using UK can be described through the following steps: 1) it is necessary to understand why the trend exists based on the nature of our data, 2) it is required to use a simple form of the trend if possible and avoid extrapolation beyond available data, 3) after the trend selection, it is subtracted from the observed data to obtain the residuals, 4) those residuals are used for variogram computation and spatial estimation, 5) and finally, kriged residuals are added back to the trend (e.g. Isaaks and Srivastava, 1989).

Results
Results include maps of variogram surface for variables "depth" and "residuals", experimental and theoretical variogram models, maps interpolated using OK and UK that were also compared using cross-validation. Variogram surface maps, as well as experimental and theoretical variogram models were obtained using Variowin 2.0, a software for spatial data analysis in 2D. Maps were interpolated using the licensed version of the Surfer 8.0 program.

Ordinary Kriging
A prerequisite for using a kriging method is the construction of experimental variograms. A variogram surface map for the variable "depth" was created (see Figure 5) with the following parameters: X: lag spacing: 1000 m number of lags: 6 Y: lag spacing: 1000 m number of lags: 6

Figure 5: Variogram surface map for a "depth" variable
A variogram surface map calculated for a "depth" variable indicates 2 axes: a primary axis that has a NW-SE trend (135º-315º) and a secondary axis with a NE-SW trend (45º-225º). Primary and secondary axes were defined according to the structural map (see Figure 2). Figure 6 shows an experimental variogram and a Gaussian theoretical variogram model of Šandrovac Field for a "depth" variable (primary axis), with the following parameters: nugget 0 range 2280 m sill 45000 According to the structure of the Bjelovar Subdepression, the ratio of primary and secondary axis is 2:1. Using the variogram data, a map was interpolated by OK using Surfer 8.0 TM (see Figure 8). By the method of cross-validation MSE was calculated for a "depth" variable, which is 29269. The root mean square error of the variable "depth" is 171. The most underestimated data is at X = 6423000, Y = 5085000, Z (depth) = 1700 where the estimated value of depth was 1419. The most overestimated data is at X = 6422000, Y = 5088200, Z (depth) = 1040 where the estimated value of depth was 1552.

Universal Kriging
Spatial correlation between variables can been shown in a scatter plot, if there is such. That is also the starting point in the regression analysis. Figure 9 shows a positive linear trend of depth with longitude and a negative linear trend with latitude. In this case, Gauss-Kruger coordinates are independent variables and depth is a dependent variable. As mentioned before, a variogram for interpolation using UK is constructed using different variables -"residuals". Since there are two independent variables, multiple regression analysis is used for trend removal. An estimated model has the following Equation 13: Estimation of β coefficients is given by resolving normal equations of least square method: which means that the estimated regression equation is: If real values of independent variables x are included in an estimated regression equation for every i = 1, 2, ..., 18; regression values ŷ of dependent variable y are given. Accordingly, the first regression value ŷ is given by including x1 (longitude) and x2 (latitude) variables from Table 1 into Equation 14: Residuals are given by subtracting the real value of dependent variables from the estimated. Specifically, the first residual value is: Variogram surface map for the variable "residuals" was created (see Figure 10) with following parameters: X: lag spacing: 800 m number of lags: 5 Y: lag spacing: 800 m number of lags: 5 Figure 10: Variogram surface map for a "residual" variable A variogram surface map calculated for the "residual" variable indicates 2 axes: a primary axis that has a NW-SE trend (135º-315º) and a secondary axis with a NE-SW trend (45º-225º), same as for the "depth" values. Figure 11 shows an experimental variogram and spherical theoretical variogram model of Šandrovac Field for a "residual" variable (primary axis), with the following parameters: nugget 0 range 1638 m sill 2450 Using the variogram data, a map was interpolated by UK using Surfer 8.0 TM (see Figure 13).  Figure 7 shows that there is no sill intersecting in the experimental and theoretical Gaussian variogram model of the secondary axis for a "depth" variable, which means that we have an extremely large spatial dependence. Increasing the number of lags also did not lead to sill intersecting, so the range value of 4000 m was used. According to a regional structural map of e-log marker Z' (see Figure 2), the primary axis has a direction of 135-315º. Adirectional variogram of a primary axis for depth variable was used for OK interpolation, and the ratio was 2:1 since the structure is 2 times longer than wide. The resulting OK map (see Figure 8) with a Gaussian variogram model does not respect the rule (even geological mapping axiom) that lines of some values should not pass between two points (as the minimum and maximum), where there isn't some intermediate value, through which the line can pass. That is considered as a basic rule of simple interpolation. The second rule of interpolation is that isolines with greater or less values than the maximum and minimum of the measured points, cannot occur on the map. Both rules are not respected on this map. Therefore, this map cannot be considered good, which can be confirmed with a bigger MSE value. The Šandrovac Field was already analysed by the OK method in a previous study (Mesić Kiš and Malvić, 2014) where a linear variogram model was used because it was proposed as a "default" variogram model in SAGA GIS software, and it was compared with Thiessen polygon method. Such a model is usually considered "handy" because it describes regression dependence with a line and does not use any complex variogram "mathematics". Therefore, such krigings are usually considered better than inverse distance or other simple methods because simple regression can be much more easily applied in plain interpolation algorithms. In fact, such kriging implies that there is a general trend in data that is described by simple regression with one dependent variable. Figure 9 shows that we have two independent variables (x and y coordinates) and one dependent variable (depth). There is a positive linear trend of depth with longitude and a negative linear trend with latitude. Since UK assumes a general polynomial trend model such as a linear trend model, it was used for this study. The problem with 2 independent variables is solved using multiple linear regression which leads to a regressional equation from which you can calculate the value of the regression variables and residuals. A variogram was calculated for a "residuals" variable. Since we had a linear trend, the option "linear drift" in Surfer was chosen to implement the UK. The resulting UK map (see Figure  13) shows a homocline, where the entire Šandrovac Field sinks towards the SE as well as the structure of the entire subdepression.

Discussion
The reason why Figures 8 and 13 look so different lies in the fact that there is a linear trend in the data. The UK method identified that trend and made a better interpolation. Previous work (Mesić Kiš and Malvić, 2014) shows a similar result of the OK method with the UK method used in this work due to a "default" linear variogram model option in SAGA GIS software.

Conclusion
This paper describes which kriging technique is more suitable for depth interpolation in Šandrovac Field with e-log marker Z' structural map as input data. Directional variograms of a depth variable was used for the OK interpolation, and the UK interpolation applied the same method except that residuals were used as variables.
Structural maps usually show a trend in input data since mapped surfaces often have a structural trend towards one cardinal direction, or a sequence of geological structures, like in our case a field structure is part of a structural unit of a regional monocline. Therefore, with such a dataset that is characterised by a linear spatial trend or a drift, it is advisable to use UK in relation to the OK because this technique was developed precisely in order to identify and calculate trends in the data, i.e. to describe the regression and give better mapping results. That can be seen by comparing the maps that were obtained by OK and UK. Consequently, it means that UK could be used for the interpolation of depths on all Croatian hydrocarbon fields in the CPBS, where trends such as those described in this paper are observed.