Geostatistics and its potential in Agriculture 4.0 Geoestatística e suas potencialidades na Agricultura 4.0

- In order to meet future food demands and ensure sustainability new technologies have been incorporated into agriculture. Some researchers believe that we are living in the fourth agricultural revolution or Agriculture 4.0. Among the many technologies involved in the Agriculture 4.0, it is necessary to highlight the importance of geostatistics in the implementation of those technologies. Geostatistics is a class of statistics used to analyze and predict the values associated with spatial or spatiotemporal phenomena, and it is very important to understand the spatial distribution of agricultural variables. Therefore, the objective of this review is to show the potential of geostatistics in Agriculture 4.0. The article presents an exhaustive literature review of geostatistics and its potential in agriculture, by showing a brief of geostatistical approaches, some practical use of geostatistics in agriculture, and a description of multivariate geostatistics for multi-source data fusion using some case studies. This review showed that geostatistics has been used for agricultural purposes and has been producing exciting results. In addition, more advanced analysis such as multivariate geostatistics in fuse heterogeneous data can be easily adapted to any experimental conditions and type of sensor data and/or sampling data to increase estimation accuracy.


INTRODUCTION
The population of the world is expected to reach 9.2 billion people by 2050 (FAO, 2013) and food demand which will inevitably lead to increased plant production for food and feed, is expected to increase by 70%.Meeting this demand, while avoiding further deterioration of agricultural systems and natural resources, will require a radical change in the way farmers manage their farms.It is then necessary to abandon practices that do not sustain the environment and open up to new methods and technologies that are based on rational use of resources.
In trying to meet the recent expectation from the agricultural sector, some researchers have stated that we are living the 'fourth agricultural revolution', or 'Agriculture 4.0' (KLERKX; ROSE, 2020).According to Rose and Chilvers (2018) this "new agriculture" is closely related to high technologies such as the internet of things (NYÉKI et al., 2020), cloud computing (MEKALA; VISWANATHAN, 2017), robotics (MARINOUDI et al., 2019), artifi cial intelligence (PATRÍCIO; RIEDER, 2018), sensors (OJHA et al., 2015), satellites (MURUGAN et al., 2017), etc.Consequently, Precision agriculture (PA) attempts to respond to the necessity for innovative actions in the fi eld of primary production.
The current defi nition of PA by International Society of Precision Agriculture (ISPA) states that precision agriculture is a management strategy that gathers, processes, and analyses temporal, spatial and individual data and combines it with other information to support management decisions according to estimated variability for improved resource use effi ciency, productivity, quality, profi tability and sustainability of agricultural production.
Geostatistics is one of the most important tools in PA.This statistical approach, which was initially created for mining, is now used in many fi elds of study including agriculture.Its main function in PA is to understand spatial distribution patterns and obtain more accurate maps.
Therefore, the objective of this review is to show the potential of geostatistics as an important tool in Agriculture 4.0.The study presented a brief history of geostatistics and an introduction to its analysis.In addition, an exhaustive review of the use of geostatistics in agriculture is presented.Finally, the use of geostatistics in data fusion with some case studies is described.

Brief history and concepts of geostatistics analysis
Environmental variables usually present spatial continuity (LI; HEAP, 2011) and classical statistics are not effi cient in analyzing this kind of data, because classical statistics do not take spatial distribution into account.In a classical approach, the samples are considered independent and the error is random.Although the existence of spatial variability was undeniable, even for developers of classical statistics, spatial variation was regarded as a little consequence (OLIVER, 2010).However, Daniel Krige, a South African Mining engineer, noticed that the absence of a spatial structure in predicting ore grades could be a huge problem.Therefore, in 1963 his empirical approach was mathematically formalized by Matheron (1963).
Matheron developed the regionalized variable theory.A regionalized variable is an actual function that, takes a defi nite value in each point of space (MATHERON, 1963).To assess whether data are spatially correlated and to what extent, the variogram or semivariogram was developed.The variogram or semivariogram is a curve representing the degree of continuity of some variables (MATHERON, 1963).
The equation of a semivariogram is defi ned as (Equation 1): (1) where N(h) is the number of h-distant point pairs (x i , x i + h).The result of the point estimates is a set of semivariogram values for various distances h between points.However, to compile Kriging equation is necessary to know the semivariogram values for any distance h.Therefore, when quantifying spatial dependence, the obtained estimates are approximated with a theoretical curve.The most common semivariogram models used in agricultural variables are spherical, exponential, and Gaussian model (ZŮVALA et al., 2016).
From the fi ttings of the mathematical models to experimental semivariograms, parameters were defi ned as a spatial dependence range (a) which defi nes the distance, where the semivariogram value is constant.Sample locations separated by a distance shorter than the range are spatially autocorrelated; locations farther apart than the range are uncorrelated.Sill (C 0 + C) is the semivariogram value in the range.The nugget effect (C 0 ) is defi ned as the limit of the semivariogram for h decrease o zero.The nugget effect can be caused by measurement errors or by the fact that the process includes spatial variability at distances smaller than the sampling interval (ZŮVALA et al., 2016).
The semivariogram is an important tool to determining whether a variable presents spatial dependence and quantifi es this dependence.However, it is necessary to introduce another geostatistical tool for mapping: the kriging interpolation method, which name was in honor of Daniel Krige.Kriging is a generic Geostatistics and its potential in Agriculture 4.0 term for a range of least-squares methods to provide the best linear unbiased predictions, best in the sense of minimum variance (OLIVER; WEBSTER, 2014).In the beginning, kriging was used to map ore in South African minings; however, it is now used for many kinds of data, including variables related to agriculture, such as soil properties, weed, crop yield, etc (CHAVES et al., 2018;METWALLY et al., 2019;SAN MARTÍN et al., 2018).
Kriging usually performs better than other interpolation methods.This is because kriging takes into account the manner in which a property varies in space through the variogram.In addition, kriging provides not only predictions but also kriging variances or errors (OLIVER, 2010).The ordinary kriging system can be explained by: Letting z(x i ),i = 1,2,… ,N, be the observed values of variable z at points x 1 ,x 2 ,… ,x N , in two dimensions x i ≡ {x i1 ,x i2 } T .For any new point x 0 we predict Z by (Equation 2): (2) The λ i , i = 1,2,…,N, are the weights chosen to minimize the prediction error variance by the following set of equations (Equations 3 and 4): for all j (3) (4) Here, γ(x i − x j ) is the semivariance between data points i and j, γ(x j − x 0 ) is the semivariance between data point j and the target point x 0 , and ψ(x 0 ) is a Lagrange multiplier introduced for the minimization of error variance.
A detailed description of semivariogram and kriging method was provided by Oliver (2010) and Oliver and Webster (2014).

Use of geostatistics in agriculture
Geostatistics have been shown to be very effi cient defi ning the spatial variability of agricultural variables.It has been used for many different variables such as crop yield, soil properties, weed, etc.In this review, we focused on the soil and weed application.In addition, other applications using crop yield and sensor data were approached in the data fusion topic.

Geostatistics applied to soil sampling
The soil samples must accurately reproduce the condition of an agricultural fi eld, using a minimum number to satisfactorily estimate the mean value of the soil attributes, thus the management of irrigation and soil fertility in crop fi elds will be refi ned (BUDAK, 2018).This is because has been estimated that 80-85% of the total error in the results used in the recommendation of fertilizers and correctives can be attributed to sampling in the fi eld and 15-20% may be due to laboratory analyses (CARVALHO et al., 2002).
Based on classical statistics, the traditional study carried out by Catani et al. (1954) defi nes as representative sampling 20 individual soil samples to obtain a sample composed in areas of up to 10 hectares considered homogeneous by farmers (same soil type, crop, slope, etc) are commonly applied in Brazilian agricultural fi elds ( S I LVA et al., 2020).Random sampling, based on classical statistics, is also widely used in other countries, such as the United States (LAWRENCE et al., 2020).However, for classical statistics to be properly applied, one of its premises is that the randomness of errors is met, that is, the samples are spatially independent (BUDAK et al., 2018).
However, the presence of spatial dependence has been verifi ed in most agricultural variables, such as soil attributes (ARMANTO, 2019;GAO et al., 2019;LI et al., 2017;MIRZAEE et al., 2016;ROSEMARY et al., 2017;USOWICZ;LIPIEC, 2017;YANG et al., 2020), thus limiting the use of classical statistics.According to Welsch et al. (2019), soil variability can affect sampling plan, and may result in inadequate sampling intensities and high levels of uncertainty in evaluated attributes and, consequently, inappropriate management if the spatial distribution is not taken into account.
Therefore, geostatistics has been shown to be a viable alternative for the optimization of a number of individual soil samples, considering the spatial dependence (MONTANARI et al., 2012), where semivariogram is the main tool used for the description of spatial autocorrelation (SEIDEL; OLIVEIRA, 2016).Based on the knowledge of semivariogram parameters, Carvalho et al. (2002) state that it is possible to defi ne a sampling radius with the its range value because distances greater than its value guarantee the independence of the sampling points.
Scaled semivariogram, which is obtained from the division of the semivariance by the statistical variance of experimental semivariograms, can be used in defi nition samples, and it is useful because it allows several attributes to be inserted in the same graph, obtaining a single range value (PEREIRA et al., 2013).Some studies on crops of agronomic importance such as sugar cane -Saccharum offi cinarum L .- (MONTANARI et al., 2012), beans -Phaseolus vulgaris L. - (CASTIONE et al., 2019), and mango -Mangifera indica L. - (SILVA et al., 2020) have shown the effi ciency of the use of geostatistics compare to classical statistics in determining the ideal number of individual soil samples.
Soil chemical attributes of two areas cultivated with sugar cane, located in Jaboticabal county, São Paulo state, Brazil, were assessed by Montanari et al. (2012) to optimize the procedure of soil sampling plan, and they found that the use of geostatistics, using the range of experimental semivariogram reduced the number of individual samples needed to represent the area.In addition, the number estimated by classical statistics would make collection unviable in practical terms.
In a study to defi ne the minimum sampling density of soil physical attributes in a central pivot area cultivated with beans, in Cristalia county, Goiás state, Brazil, Castioni et al. (2019) identifi ed that geostatistics, through scaled semivariogram, must be utilized as a strategy for characterizing spatial variability and determining the minimum scale of points in a sampling area.
Soil sampling approaches and point allocation sampling methods in irrigated mango orchards in the Brazilian semiarid region, Silva et al. (2020) observed the effi ciency of geostatistics compare to classical statistics since the method of 20 individual soil samples proved to be ineffi cient.They noticed that the addition of spatial component, using the range of the scaled semivariogram was appropriate in obtaining the minimum soil sample density in each of the studied areas.In addition, the allocation of sample points demonstrated that, in the sampling method, the collection site is as important as the number of individual soil samples to be collected.
Based on the above considerations, it is evident that careful sampling, which accurately represents the sampled area, aims at cost reduction and resources optimization, guaranteeing agricultural viability.It is essential to study sampling projects for each area, depending on the intrinsic characteristics of each one, since variability could be affected by several factors such as topography, management, vegetation, and soil formation factors (WELSCH et al., 2019).
Hence, geostatistics is the most appropriate tool for defining sample density when compared to classical statistics, because geoestatistics takes into account the spatial dependence found in most crop fields.Therefore, the range parameter to define the number of individual soil samples may be a satisfactory method to define soil sampling.

Geostatistics applied to weed science
The presence of undesirable plants in agricultural fi elds, also known as weeds, can be defi ned as higher plants that interfere with the interests of humans and environment (PITELLI, 2015).The establishment of these plants to form plant clusters with different populations (weed community) causes environmental disturbances and various damages, especially when it comes to agricultural production systems.
In addition, to the factors inherent to the species and its biotype, climatic, physiographic factors, and biotic factors determine the occurrence and permanence of weeds in any given environment and period (FRIED et al., 2019), and interferes with the way populations are distributed: generalized or punctual.
The ability to describe and map the weed spatial distribution is the fi rst step in the study of its spatial variability, guaranteeing effi cient management the concepts of precision agriculture.In-plant health, the spatial pattern of unwanted populations can be studied through statistical indexes based on mean and variance, which gives the degree of aggregation of pests and diseases.This degree can vary between highly aggregate, aggregate and uniform, and many studies have already shown that the degree of weed distribution in agricultural fi elds, generally shows an aggregate population pattern (LOPES et al., 2020;MARTÍN et al., 2018;SILVA et al., 2017).
However, Schaffrath et al. (2007) affi rm that the traditional measurements of tendency and dispersion are of little use and can induce overestimation since the position of individuals in space is not considered.Therefore, the use of geostatistics in studies of the spatial distribution of weed has technological and environmental potential (SCHAFFRATH et al., 2007), enabling more accurate mapping techniques, such as Kriging (KRAHMER et al., 2020).
Several studies have shown that the distribution of weeds in agricultural fi elds is not a random process but follows a spatial dependence.In the literature, there are reports of moderate to strong spatial dependence for weed variables (AVILA et al., 2019;CHIBA et al., 2010;IZQUIERDO et al., 2020;KALIVAS et al., 2012;LÓPEZ-GRANADOS et al., 2016;METCALFE et al., 2016, SIQUEIRA et al., 2016).
Geoestatistical studies assists in the planning of sustainable control, carrying out site-specifi c management, that is, where it is strictly necessary, bringing benefi ts such as greater economic return and less environmental impact (MARTÍN et al., 2018;ROCHA et al., 2015).For example, there is the possibility of using a precision agriculture technique called variable rate application, that is, when an input, (in this case, the pesticides), is applied based on the spatial distribution of the target to be controlled.This technique can maximize the effectiveness of herbicides, and studies showed savings of approximately 25% (GUNDY et al., 2017;KEMPENAAR, 2017).
In addition, an important characteristic of this type of study is the possibility of correlational analysis with agro-environmental variables with the same coordinate as weed variable, in assisting the understanding of the occurrence of weeds.These variables include metrics or qualitative data on soil properties, types of management, and crop characteristics (AVILA et al., 2019;COMAS et al., 2016).According to Pätzold et al. (2020), weed population maps integrated with soil maps can potentially be used in weed planning and management to estimate herbicide requirements or meet application limitations.
In a study of weed community of a corn crop fi eld in Arganda del Rey, Madrid, Spain, Martin et al. (2018) revealed that the variance for S. halepense and A. theofhasti along the direction of crop rows was larger at lags exceeding 5 m.This indicates the existence of large patches at this scale and that there might be an interaction between the effect of cultivation or harvester dragging seeds with variable soil properties.In correlating the weed seed bank in a soybean fi eld with the soil properties, Avila et al. (2019) found a signifi cant negative correlation between the grass seed seminal bank and the variation in the sand content.Regarding chemical management and reporting, the authors recommended the use of their result in fi eld management, because that lower doses of pre-emergents could be applied in sandy areas, warning against the risk of fi xed-rate applications.
Therefore, weed mapping, favors the understanding of how the weed community is distributed in the fi eld and the reasons for its distribution, whether they are of ecological or anthropic origin, enabling, not only a more effi cient plant health management but also a deep understanding of its ecophysiology (BOTTEGA et al., 2018;KRAHMER et al., 2020;MARTÍN et al., 2018).Krahmer et al. (2020) also pointed out that comparative tests for different locations, improvement of existing maps for more frequent species, standardization, expansion of herbicide resistance maps, and mapping of rare weeds as potentials and objectives for infestation mapping works.
Based on the information collected, it can be considered that geostatistical analyses, and mapping techniques are integrated with the so-called Agriculture 4.0 model, enabling digital and technological tools for local diagnosis and guiding modern agriculture for weed management and more sustainable, economical, and effi cient crop production.

Multivariate geostatistics for multi-source data fusion
To achieve a suffi cient degree of agronomic and fi nancial effectiveness, it is necessary to improve the accuracy of estimation of relevant variables, which could be achieved by intensifying sampling.Given the costs associated with both sample collection and laboratory analyses, an advantage in terms of accuracy of estimation could come from the integration of sparse sample data with variables recorded using remote and/or proximal sensors on a variety of spatial and temporal scales.
In particular, there are different types of realtime on-the-go sensors, that are, already widely used and capable of recording both soil and plant attributes at a very fi ne spatial resolution (CASTRIGNANÒ et al., 2017;2018;VISCARRA ROSSEL et al., 2011).However, the relationship between the property of interest and output of sensor is not direct because the latter depends on several factors (CASTRIGNANÒ et al., 2017;2018;DE BENEDETTO et al., 2012).Therefore, more sensors are needed, based on different measurement principles, to separate the different effects and consequently improve the interpretation of the different causes of variability.A new technique, called sensor data fusion (CASTRIGNANÒ et al., 2017(CASTRIGNANÒ et al., , 2018(CASTRIGNANÒ et al., , 2019;;MASARIK et al., 2016;WANG et al., 2013), has been developed, it combines several sensing techniques and is more informative about reality than the one based on a single sensor.
However, these multi-source data are collected for a variety of spatial scales more often than the one of interest and most environmental variables display spatial structures, such as gradients, patches, trends, and complex spatial autocorrelations, which may exist at many scales.Moreover, the assessment of spatial variations strongly depends on the size of the sampling unit or measurement unit of the sensor.Therefore, the size of the sampling/measurement unit is quite important in the process of investigation because it can critically infl uence our perception of environmental phenomena (BELLEHUMEUR et al., 1997).
Multivariate Geostatistics offers a set of linear unbiased estimators (cokriging) with minimum error variance, and allows one to combine a virtually infi nite number of variables with different characteristics and with different degrees of uncertainty.In particular block cokriging, in its more general formulation, considers both upscaling and downscaling (CASTRIGNANÒ; BUTTAFUOCO, 2020).However, the generation of cokriging estimates may require the inversion of very large matrices, due to the often massive character of proximal/remote sensing data.This operation with big data may take a lot of computer time or even be impossible to perform.Multicollocated cokriging (RIVOIRARD, 2001) is a simplifi ed version of full cokriging, and makes use of secondary variable(s) only at the point being estimated and at all points where the primary variable is known (CASTRIGNANÒ et al., 2009).This approach requires the secondary variable(s) to be known at all points where estimation of the primary variable is desired, therefore, it is well suited for remote/proximal sensing data, usually in grid format.Moreover, it effectively combines the differences among sensors and enhances the integration between remote/proximal sensing observations and ground-based truth data (CASTRIGNANÒ et al., 2012).
Another geostatistical data fusion technique, widely used especially in PA, is factor cokriging, which allows to synthesis of a wide array of variables in spatial indices that are scale-dependent.These can be used to partition an agricultural fi eld into more homogeneous units (Management Zones), with respect to a large variety of soil physical, chemical, and hydraulic properties and crop attributes, which can be used for differential management (CASA; CASTRIGNANÒ, 2008).Moreover, some factors affecting crop response are expected to act over a short-range action, however, others act over longer distances, and such fi eld delineation is expected to be scale-dependent.Using this geostatistical technique, it is possible to perform withinfi eld delineation at different spatial scales, differently compared to traditional clustering, thus increasing the effi cacy of site-specifi c management.
The objective of this contribution is to describe a method of geostatistics-based data fusion that is widely applied, as evidenced by numerous bibliographic references, and suffi ciently fl exible to be able to jointly analyse a virtually infi nite number of multi-source data with applications particularly in precision agriculture.
The various steps, which make up the proposed approach for the delineation of homogeneous areas in PA, are indicated below.

Exploratory data analysis
A dataset with selected variables for analysis generally requires some preliminary adjustment.If some variables are gridded, they have to be migrated to the nearest sample location to create the co-regionalised dataset.Exploratory data analysis is then performed by calculating basic statistics (mean, median, minimum and maximum values, standard deviation, skewness, and kurtosis).

Point Gaussian anamorphosis
Skewed or erratic data can often be made more suitable for geostatistical modeling by appropriate transformation.
Therefore, normalization and standardization may be required before kriging (ARMSTRONG, 1998).One way is to perform all geostatistical procedures in the Gaussian domain and at the end to back-transform the estimates to the original units.Each variable is transformed into a normally distributed variable beforehand if it shows a large departure from Gaussian distribution.This is performed by fitting a mathematical function, called Gaussian Anamorphosis, expressed as an expansion of Hermit polynomials (CHILES; DELFINER, 2012;WACKERNAGEL, 2003).An advantage of the Gaussian distribution is that spatial variability is easier to model because it reduces the effects of extreme values providing more stable semivariograms (ARMSTRONG, 1998).

Block/support correction
If the estimates are produced on a block instead of a point support, a coeffi cients is needed to obtain an anamorphosis on a block support (CHILES; DELFINER, 2012), which will be later used to back-transform block Gaussian estimates into raw data.A support correction coeffi cient r is determined from the variance of blocks, and the punctual variance calculated as the sill of the variogram assumed stationary.The variogram based on a point support is calculated on the smallest support, whereas a variogram on a given block support requires a process of regularization, consisting of discretizing the blocks into equal cells after which a pseudo-experimental variogram is calculated in the fi ctitious cell centers, and then the point variograms are averaged over the block (MANZIONE; CASTRIGNANÒ, 2019).

Spatial modelling of multivariate dataset
Multivariate spatial correlations are modeled under the scope of the linear model of co-regionalization (LMC) (JOURNEL;HUIJBREGTS, 1978), which considers all the studied variables to be generated by the same independent physical processes acting at N s different spatial scales (WACKERNAGEL, 2003).Therefore, all (both direct and cross) variograms are modeled as linear combinations of the same basic spatial structures Geostatistics and its potential in Agriculture 4.0 (authorized mathematical functions) corresponding to the N S spatial scales.

Geostatistical solutions for support correction
Geostatistical techniques are used to model the effects of re-scaling data due to the change of support when treating heterogeneous spatial data.Block cokriging is the traditional geostatistical interpolation method used in solving a change of support problem (ARMSTRONG, 1998;CHILÈS;DELFINER, 2012;JOURNEL;HUIJBREGTS, 1978) when observations onpoint support are used to predict the average values of a multivariate process on a larger scale.The main difference between block (co)kriging and point (co)kriging is in the calculation of the point-to-block covariance through the regularization process as described above.
The display of cokriging estimates after backtransformation allows the production of thematic maps of individual attributes and may be an effective support for decision making.

Multi-collocated block cokriging
Multivariate spatial prediction, or cokriging, is used to treat the multivariate data set.It is a quite popular geostatistical estimator because it enables the coestimation of poorly sampled variables from secondary more exhaustive information.Details of the cokriging system can be found in reports of Wackernagel ( 2003), and Chilès and Delfi ner (2012).The equations are valid regardless of the support of data; however, taking into account the different supports in the calculation of autocovariance and cross-covariance functions is crucial to obtain valid inference (CASTRIGNANÒ et al., 2018).Once a consistent model for the point covariance functions is estimated, using LMC, block cokriging is applied by replacing the point cross-covariances in point cokriging by their block-averaged counterparts through the abovementioned variogram regularisation (MYERS, 1984).
The variant multi-collocated block cokriging is used when fusing sparse sample data with gridded variables as the ones from remote or proximal sensing.
Here it is suffi cient to point out that its use is indispensable in the treatment of big data, whereby the inversion of a huge matrix for the resolution of the cokriging system could be otherwise impossible.Indeed, bock cokriging, in its more general formulation, can also be used in downscaling, when the estimates are on a smaller support than the one of observations.In this case it is necessary a process of deregulation and deconvolution to transform variograms on areal support into variograms on point support.This is achieved through an iterative fi tting process (CASTRIGNANÒ; BUTTAFUOCO, 2020), but unfortunately, at present, it is implemented only on very few commercial or free softwares.

Factor block cokriging
This geostatistical procedure is similar to the traditional Principal Component Analysis (PCA) instead of applying it to the whole variance-covariance matrix, it applied to its individual spatial components (co-regionalization matrices).This decomposition is accomplished through the determination of the linear model of co-regionalization.PCA decomposes each coregionalization matrix into two additional matrices: the eigenvalues and egeinvector matrices (WACKERNAGEL, 2003).The transformation (loading) coeffi cients correspond to the covariances between the variables and principal component, which are called regionalized factors, relative to a given spatial scale.They express the infl uence of each variable on the factor at a given spatial scale and are then used to assign a physical meaning to the factor.Mapping the scale-dependent regionalized factors is a very powerful and synthetic way to display the spatial relationships among several variables at different spatial scales.Therefore, it is commonly used in PA to produce a partition of the fi eld in homogeneous zones, according to the attributes represented by the variables chosen for the analysis.In addition, if exhaustive information is available, multi-collocated factor block cokriging is preferred to the full version.Figure 1 provides a complete overview of the proposed method.

Some applications
The approach previously described is now widely applied in various experimental contexts with spacetime data of different nature, spatial support, temporal frequency, and different degrees of uncertainty for a variety of objectives.The most frequent application is the subdivision of study areas into relatively homogeneous zones, according to the attributes selected and measured using various sensors, submitting them to site-specifi c management in PA.The wide fl exibility of the method, which is applicable to a practically infi nite number of sensors of any type, is demonstrated by the numerous bibliographical references.Interested readers should refer to them for more details and clarifi cations (AGGELOPOOULOU et al., 2013;BUTTAFUOCO et al., 2015BUTTAFUOCO et al., , 2017BUTTAFUOCO et al., , 2019;;CASTRIGNANÒ et al., 2009CASTRIGNANÒ et al., , 2012CASTRIGNANÒ et al., , 2017CASTRIGNANÒ et al., , 2018;;CASTRIGNANÒ;BUTTAFUOCO, 2020;DE BENEDETTO et al., 2013, 2013a;LANDRUM et al., 2015;MANZIONE et al., 2019;SHADDAD et al., 2019).
Three case studies are briefl y presented here as examples to illustrate the potential of the approach.The fi rst one (BUTTAFUOCO et al., 2017) is located in Foggia, southeastern Italy in a fi eld of 12 ha cultivated with rainfed durum wheat (Triticum durum Desf.).One hundred soil samples, arranged according to a regular grid, were picked up at two depths (0.20 and 0.40 m) and analyzed in the laboratory for different physical, chemical, and hydraulic attributes.In addition, the grain yield was measured with a John Deere combine over three growing seasons (2005)(2006)(2007).The regionalized factor with an effective range of approximately 285 m was used for partitioning the fi eld.The organic matter content, total nitrogen, and fi eld capacity weighed were mainly and positively correlated, whereas clay negatively correlated because the soil was heavy and moderately drained.Therefore, this factor could be interpreted as an indicator of soil chemical fertility.The fi eld could be roughly divided into three macro-areas of equal extension with a central area, which was less fertile due to the reduced values of organic matter and N.
For each macro-area, the expected value and standard deviation of grain yield were also estimated over the three crop seasons (Figure 2) using polygon kriging (CASTRIGNANÒ; BUTTAFUOCO, 2020), a sort of block cokriging but on an irregular block (polygon).As can be seen in Figure 2, the spatial variation in yield differed It is now known that the simple traditional sampling is not capable of assessing the variability of agricultural system at a space-time scale that is necessary for effectively PA management.Therefore, it is then necessary to integrate it with proximal and/or remote sensing of both soil and crop at fi ner scales.
The second case study shows an example of data fusion including both soil variables measured in the laboratory and the ones obtained from online visible spectroscopy (SHADDAD et al., 2016).The study was conducted in an 18-ha fi eld located in Wilstead, Bedfordshire, UK.A total of 183 soil samples on a 30 m x 30 m cell grid were collected and submitted for different physical and chemical analyses in the laboratory.The fi eld was also surveyed with a mobile fi bre-type Vis-NIR spectrophotometer along parallel transects of 700-m length and 15 m apart for four variables: pH, P, K, and water content.The regionalized factor over a scale of approximately 181 m was used to delineate homogeneous zones.It was mainly and positively correlated with total carbon, pH, total N, and cation exchange capacity, therefore it could be interpreted as a soil fertility indicator.The fi eld was then partitioned into three macro-zones of equal size with the central and northern parts characterised by lower fertility (Figure 3).
In order to evaluate the productive capacity of such delineation, the map as shown in Figure 3 was compared with a barley yield map of the same fi eld (Figure 4) and the yield was calculated for each macro-area.As shown  2005-2006, 2006-2007 and 2007-2008 growing seasons Geostatistics and its potential in Agriculture 4.0 in Table 1, there is a direct relationship between soil fertility, as expressed by the regionalized factor, and yield.However, the degree of spatial association between the two maps was less than 40%.This means that even if soil variation affects yield variation, it does not fully explain yield variation, which seems more related to dynamic factors.For effective site-specifi c crop management, it The third case study is taken from Precision Viticulture (ANASTASIOU et al., 2019), which requires a very ne space-time monitoring scale to accurately assess the variation in a vineyard.The study was conducted at a commercial table grape vineyard located in Corinth, southern Greece, planted with Vitis vinifera L. cv.Thompson seedless.Two different types of proximal sensors were used to assess and map the spatial-temporal variation of the vineyard: the one on the plant was a crop circle canopy sensor (ACS-470, Holland Scientifi c Inc., Lincoln, NE, USA), used to scan the side vine canopy and collect georeferenced radiometric data at three bands: 670 nm (red), 730 nm (red-edge) and 760 nm (NIR) at three crop stages: start of veraison, mid of veraison and technological maturity/harvest.For soil monitoring, a single electromagnetic induction (EMI) soil survey was conducted using an EM38 sensor (Geonics Ltd, Ontario, Canada).The two sensors had different supports: the one of Crop Circle was a spot of 0.40-m diameter, whereas the one of EM38 was of approximately 1-m horizontal length.The application of the proposed approach then required a change of support.While EMI map showed some spatial dependence structures, due to differences in soil texture, the radiometric ones appeared noisier.This was refl ected in the map of the regionalized factor with 32-m scale (Figure 5), obtained from the joint analysis of 10 variables, 9 radiometric variables and one variable relative to the soil that was the most infl uential.Actually, the examination of Figure 5 shows that it is diffi cult to subdivide the vineyard into macro areas to be subjected to differential agronomic management.In fact, the fi eld is characterized by high micro variability, due to the particular vine management, for which it would be better to apply variable rate technology at a very fi ne spatial scale.
From the analyses of the three case studies may be verifi ed that the method is quite fl exible and can easily adapt to any experimental conditions and type of sensor

Figure 1 -
Figure 1 -Flowchart of the proposed methodology based on geostatistical data fusion techniques

Figure 3 -
Figure 3 -Map of three management zones (MZ) according to the estimates of the factor (F1) with scale of 181 m

Table 1 -
Mean yield of each management zone delineated by F1