Rapid characterisation of landslide heterogeneity using unsupervised classification of electrical resistivity and seismic refraction surveys

The characterisation of the subsurface of a landslide is a critical step in developing ground models that inform planned mitigation measures, remediation works or future early-warning of instability. When a landslide failure may be imminent, the time pressures on producing such models may be great. Geoelectrical and seismic geophysical surveys are able to rapidly acquire volumetric data across large areas of the subsurface at the slope-scale. However, analysis of the individual model derived from each survey is typically undertaken in isolation, and a robust, accurate interpretation is highly dependent on the experience and skills of the operator. We demonstrate a machine learning process for constructing a rapid reconnaissance ground model, by integrating several sources of geophysical data in to a single ground model in a rapid and objective manner. Firstly, we use topographic data acquired by a UAV survey to co-locate three geophysical surveys of the Hollin Hill Landslide Observatory in the UK. The data are inverted using a joint 2D mesh, resulting in a set of co-located models of resistivity, P-wave velocity and S-wave velocity. Secondly, we analyse the relationships and trends present be- tween the variables for each point in the mesh (resistivity, P-wave velocity, S-wave velocity , depth) to identify correlations. Thirdly, we use a Gaussian Mixture Model (GMM), a form of unsupervised machine learning, to classify the geophysical data into cluster groups with similar ranges and trends in measurements. The resulting model created from probabilistically assigning each subsurface point to a cluster group characterises the het- erogeneity of landslide materials based on their geophysical properties, identifying the major subsurface discontinuities at the site. Finally, we compare the results of the cluster groups to intrusive borehole data, which show good agreement with the spatial variations in lithology. We demonstrate the applicability of integrated geophysical surveys coupled with simple unsupervised machine learning for producing rapid reconnaissance ground models in time-critical situations with minimal prior knowledge about the subsurface.


Introduction
Growing populations and concomitant land use change are increasing the exposure of people and infrastructure to landslide hazards (Froude and Petley, 2018). Characterising the subsurface of a landslide is the first step toward understanding the future causes of instability and mechanisms of failure, which in turn forms the basis of assessing and mitigating risk through monitoring, modelling, and early-warning (Pecoraro et al., 2019;Intrieri et al., 2013). Geophysical measurements play an increasingly important role in characterising and monitoring landslide systems at the slope-scale (i.e., covering the entire area of a landslide, in contrast to regional-scale studies) due to their greater spatial coverage and acquisition rates compared to detailed intrusive observations (see reviews by Hack, 2000, Jongmans and Garambois, 2007, Schrott and Sass, 2008, Van Dam, 2012, Perrone et al., 2014, Pazzi et al., 2019, Whiteley et al., 2019. As such, geophysical surveys are well-suited for rapid reconnaissance activities and are able to provide significant volumes of subsurface data at the slope-scale, which can inform initial ground model development in the absence of further intrusive information. Of the wide range of geophysical methods used in landslide investigations, geoelectrical and seismic methods are the most common (Baroň and Supper, 2013;Jaboyedoff et al., 2019;Whiteley et al., 2019). In particular, electrical resistivity tomography (ERT) and seismic refraction tomography (SRT) are two complementary methods that are able to produce high spatial resolution models of the subsurface, and are sensitive to different hydro-mechanical properties of the rocks and soil that make up the landslide material. Generally, in landslide investigation (and in other types of ground engineering investigations), interpretation of geophysical measurements relies on a qualitative, heuristic approach based on visual analysis of an inverted model. In rapid reconnaissance surveys where only a single geophysical method is used, there can be high uncertainties when determining the source of spatial variation in the observed measurements. For example, resistivity decreases in areas of higher moisture content, but also in materials with increased clay content. Identifying the underlying cause of ERT anomalies can be difficult without a priori information or more detailed follow-up information, thereby significantly affecting the reliability and speed of the interpretation.
Landslide systems move toward critical failure when the restraining forces that give materials shear strength (τ f ) are overcome by destabilising forces that act to reduce shear strength. Therefore, understanding the landslide system in terms of the state of τ f is crucial for understanding risk of failure. In a simple infinite slope model, τ f is defined as where c is cohesion, σ is total normal stress, u is pore water pressure, and ϕ' cv is the angle of shear resistance at a critical state. ERT measurements are sensitive to variations in moisture content, porosity and clay content, whilst seismic methods are sensitive to elastic properties controlled by material strength, density, porosity and saturation, particularly the distinction between saturated and partially saturated ground in the case of P-wave velocity . In a rapid reconnaissance setting, it may not be possible to quantify absolute values for the parameters of τ f from geophysical measurements (such an approach requires the determination of petrophysical relationships in a laboratory setting; for examples see Merritt et al. (2016) and Uhlemann et al. (2017)). However, relative states of the parameters of τ f can be estimated from geophysical models. For example, ERT measurements can provide information on the moisture and clay content of materials in order to distinguish lithological formations (e.g., Chambers et al., 2011), which can provide indirect information on the likely state of u and c respectively (e.g., Merritt et al., 2013). SRT measurements can provide information on elastic moduli, which relate to the relative state of σ in the landslide system (e.g., Uhlemann et al., 2016).
To improve the interpretation of geophysical data, exploit the different sensitivities of different methods and to move toward repeatable and more objective ground model development, multi-method geophysical surveys combined with increasingly automated data integration and processing approaches are needed. We use the data from an integrated near-surface geophysical survey to provide insights into soil and rock properties and to map subsurface heterogeneity in terms of the major subsurface discontinuities at an active landslide in North Yorkshire, UK. We produce a rapid reconnaissance ground model using unsupervised machine learning techniques applied to ERT and P-and Swave seismic refraction tomography (SRT) data. Here, we use the term 'rapid' to refer not just to the speed of data acquisition and processing, but to how rapid this approach is when compared to the time taken to achieve the same coverage, detail and reduced uncertainty in interpretation obtained from other previous studies of the landslide (see studies by Chambers et al., 2011. Integrated surveys combining these data sources are abundant in the literature (e.g., Merritt et al., 2013;Francioni et al., 2019;Kannaujiya et al., 2019;Cody et al., 2020). In our approach, we first map the surface of the landslide using high resolution imagery acquired by unmanned aerial vehicle (UAV). The UAV topographic data are then used in the construction of a joint 2D mesh incorporating topographic points from both surveys, allowing individual inversion of the near-surface ERT and SRT data on a joint mesh. To produce a more objective ground model requiring minimal user interpretation, we apply a simple unsupervised clustering algorithm to the modelled geophysical data cross-plotted against each other to identify areas of similar and contiguous properties. This approach requires minimal a priori information about the subsurface. Finally, we validate the results of this approach by comparison with pre-existing ground models of the site and geotechnical observations acquired from a borehole.

Site description
Landslide hazards are common in the UK, although they generally pose a low risk to human life (Gibson et al., 2013). They are most often triggered by periods of intense or prolonged rainfall, and their locations are strongly linked to the underlying geology. The rocks of the Lias Group are responsible for several types of geohazards across the UK, including cambering, valley bulging and landsliding, outcropping in a broad zone extending from the south-west to north-east coasts, along with other smaller outlying outcrops (Fig. 1a). In particular, weak claybearing horizons within the Whitby Mudstone Formation (WMF) in the Lower Jurassic Upper Lias Group experience some of the highest densities of landslides in the UK at approximately 42 landslides per 100 km 2 of outcrop (Hobbs et al., 2012). To study the processes that destabilise the formations of the Lias Group, the British Geological Survey established the Hollin Hill Landslide Observatory (HHLO) over a decade ago. The HHLO, located in North Yorkshire, UK, is a complex, slow moving earth slide-flow, situated in the Lias Group mudrocks. It serves as a test bed for novel geophysical, geodetic and geotechnical landslide characterisation and monitoring technologies (Chambers et al., 2011). Developments in ERT optimisation (Loke et al., 2010;Uhlemann et al., 2015;Wilkinson et al., 2012), electrode displacement monitoring (Boyle et al., 2017;Loke et al., 2018;Wilkinson et al., 2010;Wilkinson et al., 2016), geophysical-geotechnical petrophysical relationships (Merritt et al., 2016) and landslide monitoring using geophysics (Merritt et al., 2018;Uhlemann et al., 2017;Whiteley et al., 2020;Whiteley et al., 2021) have all taken place at the HHLO. The following site description is summarised from these works.
The HHLO is located on a south facing ~12 • slope on the north eastern edge of the Sherriff Hutton Carr embayment, formed by the draining of Lake Mowthorpe after the Devensian glaciation (Fig. 1b). The site comprises agricultural land used for seasonal grazing and associated agroforestry. The slope comprises a series of interbedded shallow marine Lias Group mudstones and sandstone, comprising (from the base of the slope in ascending order) the Redcar Mudstone Formation (RMF), Staithes Sandstone Formation (SSF) and Whitby Mudstone Formation (WMF). The escarpment is capped by the Dogger Formation (DF), a limestone and sandstone unit forming the base of the Ravenscar Group, which also acts as a minor aquifer in the area. These units are conformably deposited, and gently dipping to the north at an angle of 1-2 • . A thin layer (<0.3 m) of topsoil is present across the site, beneath which the underlying rocks are highly weathered and present at residual strengths, often showing properties similar to soil . A working conceptual model of the site (Fig. 1c) has been developed as a result of several studies which have used combinations of geophysical, geotechnical and geodetic investigations to investigate the subsurface structure of the landslide (see studies by Chambers et al., 2011. The main landslide processes occur in the highly weathered nearsurface zone of the WMF. The WMF is a 25 m thick layer of grey to dark grey mudstone and siltstone showing bands of calcareous and sideritic concretions with high to very high plasticity. Elevated moisture contents cause the clay-rich WMF to approach or exceed the liquid limit, causing mid-slope plastic deformation and translational movement of weathered WMF downslope (Hobbs et al., 2012). Material at the base of this translational zone accumulates at a break of slope indicating the surface outcrop of the underlying SSF. The SSF formation is approximately 20 m thick, and comprises ferruginous, micaceous siltstone, with fine grained sandstones and thin mudstone layers. The SSF is a comparatively porous sandstone unit that permits drainage of the overlying mobilised WMF, arresting the movement of the plastically deformed material from the mid-slope. The RMF is present at the site, underlying the SSF, although the exact boundary is uncertain due to the location of a persistent water table at a similar horizon, as evidenced by spring lines further down the slope. WMF material accumulated at the surface of the SSF is subject to sub-aerial weathering processes, including seasonal temperature cycling, inducing shrinkage and subsequent cracking. Water ingress into these shrinkage cracks can induce mud flows, further transporting disturbed WMF material downslope and leading to the formation of flow lobes. Above the mid-slope zone of plastically deforming WMF, support for overlying WMF material is removed, and rotational slumps form, creating large (3-4 m vertical displacement) backscarps close to the contact between the WMF and overlying DF. In a similar manner to the disturbed WMF material accumulated on the SSF lower down the slope, the WMF material upslope that is disturbed by rotational slumping also forms shrinkage cracks, providing pathways for future water ingress into the landslide system.

Survey activities
The remotely sensed UAV photogrammetry, near-surface geophysical and geotechnical data were obtained over four days in July 2019 (Fig. 2), with the seismic and ERT survey acquired on two consecutive days. Due to equipment and personnel restrictions, it was not possible to acquire all of the geophysical data in a single day. Table 1 shows the attributes of the various surveys, comparing resolution, acquisition rates and dimensions of each survey type. The individual activities for each survey are described in the following sections.
For the ERT survey, electrodes were deployed at 2 m intervals from the top of the landslide, and a total of 96 electrodes were deployed to give a nominal survey profile length of 190 m. A dipole-dipole array with a = 4 and n = 8 (where a is the maximum separation between current electrodes and n is the ratio of distance between the first current and potential electrodes to current dipole spacing), was used to acquire measurements, and reciprocal measurements were acquired for each reading.
For the SRT surveys, hammer shots were acquired from the first geophone (0 m) of the survey profile at 4 m intervals along a series of geophones deployed at 2 m intervals (i.e., hammer shots at every other geophone). A flat horizontal plate source was used for the P-wave survey, and an inclined source used to generate horizontally polarised Swaves. A total of 72 geophone locations were measured (comprising two overlapping 48 geophone spreads acquired separately), giving a nominal survey length of 142 m.

UAV data acquisition and processing
A UAV survey was conducted using a DJI Inspire 1 Raw rotary drone, equipped with a Zenmuse X5S 16 megapixel camera. 210 vertical airphotos were acquired with 70% forward and sideways overlaps at an altitude of 90 m. Seven ground control points were georeferenced using RTK-GPS. Structure from motion processing was undertaken using Pix4Dmapper to produce a point cloud (11.9 million points, with an average density of 207 points per m 3 ), digital elevation model (DEM) with a horizontal resolution of 0.1 m, and orthophotograph. The raised vegetation (scrub and trees) was automatically removed from the DEM to produce a bare earth digital terrain model (DTM) as input to the geophysical processing.

Topographic pre-processing and joint 2D mesh generation
The first electrode and geophone in the survey arrays were positioned at the top of the slope adjacent to a permanent marker peg. The remainder of the electrodes and geophones were deployed downslope at 2 m intervals relative to this position using a tape measure. However, the surveys were not conducted simultaneously, and the tape measure was inadvertently moved between deployments for each survey. Consequently, later analysis of the RTK-GPS measurements of the electrode and geophone positions shows some positional discrepancies between the positions of the geophones and electrodes, meaning that these locations are not truly 'co-located' (Fig. 3).
Using the same subsurface mesh to invert multiple-methods allows for interpretation of the model in the same spatial dimensions and at comparable resolution. If the same mesh can be used, then every cell in that mesh will have a value of P-and S-wave velocity and resistivity, allowing for easier statistical comparison of the values using approaches such as cross-plotting. Although it is possible to invert ERT data acquired in 2D using a 3D inversion scheme, we invert the seismic first-arrivals using a 2D inversion, and so aligning data positions is necessary to emulate spatial co-location of the data. We construct a 2D mesh from 3D positions, firstly by translating points from British National Grid coordinates (as acquired from the RTK-GPS survey) to a local coordinate system, where the x position is perpendicular to the profile orientation, and the y position is parallel to the profile orientation. There are discrepancies in the positions of the electrodes (red dots) and geophones (blue dots) as deployed in the field (Fig. 3a). To be able to create a joint mesh, we interpolate a local x position between each electrode and geophone (black triangle), averaging the difference between the true electrode and geophone location. At each of these interpolated sensor positions, a new elevation (z) is extracted from the DEM at that position. The elevation of the interpolated positions shows little variation relative to the true elevation of the electrodes and geophones as deployed in the field (Fig. 3b). During this process, the position of the electrodes and geophones in the y orientation (i.e., along the survey line) are preserved. It is worth noting that although this procedure was implemented to correct discrepancies from the field that could have been avoided, nonetheless this approach to minimising the differences between closely positioned datasets can prove useful for future surveys where unavoidable spatial discrepancies exist e.g., when comparing repeat measurements from monitoring campaigns.
After pre-processing, we generate a joint 2D mesh on which both ERT and SRT data are inverted, acknowledging that small errors are introduced into both datasets by this process. However, the introduction of small errors associated with using an interpolated position is more favourable than preserving the original positions of one dataset at the expense of introducing significantly larger errors to the other; in other words, averaging the error across all data is preferable to prioritising one dataset over the other. Despite this, the interpolated positions used to create the 2D mesh all have deviations in the x direction of less than 0.3 m, which is 15% or less than the 2 m electrode and geophone spacing used (Fig. 3a). This level of error is not unusual when collecting geophysical data in a steeply dipping, rough terrain environment with an undulating surface . The variation in elevation by extracting a new z position from the DEM is shown to be very low, with a maximum variation of <0.08 m (Fig. 3b), although the majority of points show a much lower difference than this. We create the joint mesh using the mesh generation module in pyGIMLi (Rücker et al., 2017). The result is a joint 2D mesh with a large number of small cells in the near-surface, due to the very small electrode and geophone separations (i.e., distance in the y orientation) found at some locations. Only the surface nodes corresponding to the electrode or geophone locations for the appropriate data type inverted are used by PyGIMLi as sensor locations for the inversion, with the remainder providing inter-sensor topographic refinement.

ERT inversion
We filter the raw ERT data to remove measurements with reciprocal errors >20%. The reciprocal error (e) of a measurement is given as in which m n is the measurement recorded with a source at position A, and receiver at position B, and m r is the measurement recorded with a source at position B and a receiver at position A. We produce a reciprocal error model by binning the reciprocal error and resistance (r) data in logarithmically increasing bin sizes and fitting a line to the distribution of these bins (Mwakanyamale et al., 2012). We fit an error model with   reciprocal and resistance data. We filter the data again to remove reciprocal errors >10%, and invert the ERT data using the ERT manager module of PyGIMLi with the joint 2D mesh constructed from the UAV topography. The resulting inverted model converges at χ 2 = 1.23 and relative root-mean-square error (RRMSE) of 3.82% after five iterations.

SRT inversion
We pick the first-arrivals of the P-and S-wave datasets, and analyse the dataset to estimate inherent errors. Unlike ERT, reciprocal data are only available for a subset of each SRT dataset, as hammer shots were not undertaken at every geophone; consequently, for the first-arrivals detected by geophones where no hammer shots were undertaken, there is no means of directly assessing the quality of the data, other than following basic seismic refraction principles (Leung, 2003). Similarly, where environmental conditions (e.g., wind, rain) are different between the recording of the forward and reverse shots of a single reciprocal measurement, then the signal-to-noise ratio may be worse at one location, preventing identification of a clear first-arrival to the same offset of that measured from the reciprocal shot location. As such, reciprocal measurements are not always available for all picked first-arrivals even at shot locations where reciprocal data are available. For the P-wave dataset, only 373 of the 2124 picked first-arrivals (17.5%) have reciprocal measurements. In the S-wave dataset, 393 of the 1751 picked firstarrivals (22.4%) have reciprocal measurements. Reliable inversion of the data relies on the creation of an error model from the available reciprocal data, and so the SRT data are filtered on modelled errors, unlike the pre-filtering of reciprocal errors undertaken on the ERT data before the creation of an error model.
We create error models for the P-and S-wave SRT datasets using the same procedure; for all reciprocal measurements, the reciprocal errors and travel-time (t) of the datasets are divided into logarithmically increasing bin sizes. We fit error models with an R 2 value of 0.76 (a + tb, where a = 0.014, b = 0.827) to the P-wave data, and with an R 2 value of 0.91 (a + tb, where a = 0.0094, b = 1.778) to the S-wave data. We filter points with modelled error of >5%, resulting in the removal of 8.7% of the P-wave data and 5.1% of the S-wave data. We invert the P-and Swave data using the SRT manager module in pyGIMLi on the joint 2D mesh constructed from UAV topography. The P-wave model converges with a χ 2 of 1.6 and RRMSE error of 2.38% after nine iterations. The Swave model converges with a χ 2 of 1.14 and an RRMSE error of 2.27% after 11 iterations.

Unsupervised learning for ground model development
In order to produce an objective ground model that provides a robust interrogation of the modelled data with minimal a priori information or operator interpretation, we use an unsupervised clustering algorithm to group areas of similar geophysical properties in the landslide subsurface. Unsupervised learning algorithms are a basic form of machine learning, which are more commonly used at the regional scale in landslide studies for the identification of slopes at risk of future failure across large geographical areas based on pre-conditioning factors (Merghadi et al., 2020). In near-surface geophysics, they are increasingly used for identifying data patterns in processing (Sun and Zhang, 2020; Xia et al., 2018), predicting petrophysical relationships between geophysical measurements and material states (Ihamouten et al., 2016;Moghadas and Badorreck, 2019), identifying time-lapse variations in monitoring scenarios (Audebert et al., 2014;Delforge et al., 2021;Xu et al., 2017) and identifying zones of common geophysical properties for characterising the subsurface (Doetsch et al., 2010;Ward et al., 2014).
For this exercise in rapid reconnaissance, a balance is required between a user-friendly algorithm (i.e., an algorithm with minimal hyperparameters to tune) and one which can appropriately identify trends and relationships in the geophysical data. An assumption in this approach is that the ERT and SRT measurements made within contiguous subsurface units of similar subsurface properties will display relationships unique to the hydrogeological unit under investigation, and therefore an unsupervised method that can identify relationships in an unsorted group of data is required, as no training data are available in these circumstances. One such approach is the Gaussian Mixture Model (GMM), a probabilistic unsupervised learning algorithm that classifies data as belonging to one of an expected number of Gaussian distributions present within a dataset. We use a GMM as i) it is unsupervised (i. e., requires no training data), ii) it assigns each point in the dataset to one of a specified number of clusters in a probabilistic manner, identifying potential areas of uncertainty (Doetsch et al., 2010), iii) it allows a range of covariance types in the data, allowing for identification of anisotropic, convex clusters in the parameter space (Delforge et al., 2021), and iv) it is easily implemented using the scikit-learn package, a collection of machine learning tools for predictive data analysis that is part of the Python based SciPy library (Pedregosa et al., 2011).
A GMM implements the Expectation-Maximization (EM) algorithm, an iterative process that initially uses random components in the data and then switches between calculating the probability of a data point being generated by the Gaussian components of the model (i.e., expectation), and then maximizing the likelihood of the data given the classification (i.e., maximization). The result is a dataset that produces a series of classification labels identifying data points as belonging to one of a specified number of expected Gaussian distributions (i.e., cluster groups) present in the dataset.
Four variables are available for passing to the GMM algorithm, three geophysical variables (resistivity, P-wave velocity (V p ) and S-wave velocity (V s )) and one spatial variable (depth in z from the ground surface). We therefore consider two possible inputs to the GMM; one using only the three geophysical inputs (i.e., a GMM without depth constraint) and one using the four inputs combining the geophysical and depth inputs (i. e., a GMM with depth constraint). An important hyperparameter to tune in a GMM is the number of expected cluster groups in the data. Given that we are assuming only basic a priori information for the site that might be obtained from a simple desk study, we determine the number of expected cluster groups using statistical analysis of the model inputs to the GMM and comparing this result with the a priori information available from a geological map.
Firstly, we can assess the optimal number of cluster groups by considering the Bayesian Information Criterion (BIC) for a range of cluster group quantities (Fig. 4). The objective of considering the BIC is to identify the minimum number of cluster groups that can describe the heterogeneity of the model (Delforge et al., 2021;Schwarz, 1978). The lowest BIC value, or threshold at which increasing the number of cluster groups does not change the gradient between BIC values, can be considered the optimum number of cluster groups to use in the GMM algorithm. In the case of the GMM without depth constraint (Fig. 4a), the absolute BIC values decrease significantly above two cluster groups, and continue to decrease marginally with increasing cluster numbers, showing no obvious absolute BIC minima. However, the BIC gradient becomes asymptotic to 0 at four cluster groups. In the GMM with depth constraint (Fig. 4b), a sharp decrease in the BIC is observed at four cluster groups, above which the BIC increases again. Although the BIC marginally decreases again at seven cluster groups, given the objective of considering the BIC is to minimise the number of cluster groups to describe the data, four is identified as the optimal number of cluster groups for both models.
Secondly, we confirm the use of four cluster groups by considering basic a priori information about the site from maps and on site observations. The publicly available geological map of the site (Fox-Strangways and Howell, 1983) shows four stratigraphic units beneath the slope; the Dogger Formation (DF), Whitby Mudstone Formation (WMF), Staithes Sandstone Formation (SSF), Redcar Mudstone Formation (RMF). Within these four units, two distinct lithologies are present; mudstone (WMF and RMF) and sandstone (DF and SSF). The satellite images (Fig. 1b) show surface materials (derived from these parent lithological units) that are in the process of being reworked by creeping landslide processes, for example, as reworked WMF material forming the flow lobes at the site. We also observe areas of very little deformation, where the surface materials are assumed to be stable. Therefore, our a priori information confirms the use of four cluster groups in either GMM model, representing i) sandstones, ii) mudstones, iii) actively deforming (disturbed) surface material likely derived from the displaced WMF, and iv) relict (undisturbed) surface material likely derived from the SSF. In addition to the deciding the number of cluster groups, some additional hyperparameters require tuning. For each cluster group in both GMM models, we set the covariance type to full to allow for the identification of anisotropic clusters. In addition, we pass a random state to the algorithm to allow for reproducibility between separate runs of the algorithm.

Individual model results
The inverted geophysical models are shown in Fig. 5. Fig. 5a, b and c show the data inverted using the individual meshes derived from the RTK-GPS positions of electrodes and geophones in the field. Fig. 5d, e and f show the results of inverting the data using the joint mesh. The methods have varying depths of penetration, with the P-wave model reaching deeper parts of the landslide, and the ERT model imaging shallower areas. The inverted ERT model shows a small zone of intermediate (20-50 Ωm) resistivity at the top of the slope, above a large unit of low resistivity (<20 Ωm) extending from the base of the section to the ground surface, which is identified at 20-60 m horizontal distance. Downslope of this, the low resistivity layer is present as a shallow surface layer, extending to ~130 m horizontal distance along the profile. A   Fig. 4. The Bayesian Information Criterion (BIC) and gradient of the BIC for a range of a number of cluster groups for a) inputs without depth constraint including resistivity, P-and S-wave velocity and b) inputs with depth constraint including resistivity, Pand S-wave velocity and depth. The lowest number of cluster groups above which adding clusters does not change the BIC or gradient of the BIC indicates the optimal number to use, in each case four cluster groups. © BGS and © University of Bristol. zone of increased resistivity underlies this surface layer, containing some thin layers of high resistivity (>100 Ωm), which are identified at the surface from 120 to 170 m horizontal distance. At the base of the slope, an area of intermediate resistivity values underlies this higher resistivity area between 110 and 180 m horizontal distance, which also shows localised zones of lower resistivity.
The P-wave model shows uniformly low velocity surface layer of 300-500 m/s, thickest in the central part of the slope (60-100 m horizontal distance) and thinning toward the surface both from 0 to 60 m horizontal distance and from 100 to 150 m horizontal distance. Units of increased velocity are present at depths of >15 m bgl at the top of the slope (between 10 and 50 m horizontal distance) and shallowing from ~20 m bgl to ~10 m bgl between 80 and 150 m horizontal distance. This latter high velocity unit at the base of the slope has the highest P-wave velocities, in excess of 2000 m/s, while the former high velocity unit toward the top of the slope has reduced P-wave velocities in the range of 1000-1500 m/s. P-wave velocity should be sensitive to changes in bulk modulus, and will increase significantly as materials move from a highly-to fully-saturated state.
The S-wave model shows a smaller range of velocities, with a very uniform low velocity layer of 50-100 m/s from the surface to about 2 m bgl across the slope. An increased velocity boundary is identified between 30 and 70 m horizontal distance, rising from ~20 m bgl to the near-surface across this area. Similarly, a broadly horizontal increase in S-wave velocity is identified from 90 to 150 m horizontal distance, at ~60 m elevation asl. S-wave velocity is sensitive to variations in shear modulus, generally decreasing with increasing saturation, but with no rapid rise at the highly-to fully-saturated state (as with P-waves velocity).

Geophysical model relationships
After inversion of the individual geophysical datasets on the joint 2D mesh, we extract a model with a co-located resistivity, P-wave and Swave velocity values. The points are extracted from the edges of cells that have overlapping coverage from all of the geophysical surveys, limiting the data to a shallow layer approximately 20 m thick (the maximum depth extent of the ERT survey) and extending from 0 to 144 m horizontal distance along the survey profile (the maximum horizontal extent of the SRT survey). The co-located model comprises 1937 data points with values of resistivity and P-and S-wave velocity (Fig. 6). In each of the individual models, these points have full coverage in each survey; reducing this threshold has the potential to produce a co-located model with a greater number of points, at the cost of increasing the uncertainty associated with the non-uniqueness of inverted measurements in the model.
We use cross-plots and simple statistics of the different data variables (resistivity, P-wave and S-wave velocity, and depth below ground level) to explore potential relationships in the co-located models. Combining these approaches, we produce a 'cross-plot correlation matrix', in which each variable in the co-located model is plotted against each other, showing the different relationships present in the data (Fig. 7). A Spearman rank correlation test, measuring the statistical dependence between two variables described by a monotonic function, provides a simple means of identifying positive or negative correlations in multivariate data. In the cross-plot correlation matrix, the background of each plot is coloured according to the Spearman rank correlation coefficient (ρ), with dark red indicating strong positive correlations in the data, and dark blue indicating strong negative correlations.
For some cross-plots in the correlation matrix, ρ is expectedly high. Swave velocity is sensitive to increases in stiffness, which is increased by consolidation, and therefore a very strong positive approximately linear correlation between S-wave velocity and depth (ρ = 0.93) is identified. Similarly, P-wave velocity increases with bulk modulus, which is increased with consolidation, and therefore also has a strong approximately linear correlation with depth (ρ = 0.76). Consequently, the correlation between P-wave and S-wave velocity is also strongly positive and approximately linear (ρ = 0.71). The approximately linear nature of the relationships between seismic velocity and depth indicate a simple relationship in which increasing depth results in increased seismic velocities; this is expected as greater compaction is experienced with the increasing volumes of material overlying deeper points.
However, some variable pairs show no significant statistical correlation, even though the human eye can see patterns in the data that suggest non-random relationships. For example, resistivity and P-wave velocity (ρ = − 0.02) show no statistically significant correlation, but overlapping non-linear relationships between certain sets of variables are observed. This also appears to be true for the very weakly positively correlated resistivity and S-wave velocity (ρ = 0.34) and resistivity and depth (ρ = 0.35) variables. In an attempt to separate and classify these sub-relationships within the geophysical data, the unsupervised GMM algorithm is applied to the co-located model to objectively determine classifications of data based on these relationships.

Clustered ground model results
We assess two GMM outputs for producing ground models of the subsurface; one without depth constraint, in which resistivity, P-and Swave velocity are passed to the algorithm, and another including depth constraint. The resulting classifications of the model values identify clusters of points with similar trends and ranges of measurements ( Fig. 8a and c), rather than classifications based on generic reference values. Cluster group 1 (blue) is characterised by lower resistivity, lower P-wave velocities and lower S-wave velocities, while cluster group 2 (green) shows lower to medium resistivity values, with medium to higher P-and S-wave velocity ranges. Cluster group 3 (orange) is Fig. 6. The 1937 data points (black dots) that have a co-located measurements of Vp, Vs and resistivity, and their position on the joint 2D mesh created from the separate topography of the ERT and SRT surveys (grey background). © BGS and © University of Bristol. characterised by lower to higher resistivity, lower to medium P-wave velocity and medium to higher S-wave velocities, while cluster group 4 (brown) show lower to higher resistivity with low P-and S-wave velocities (see Table 2 for summary).
When the cluster groups are plotted back to their locations in the landslide subsurface, the groups form highly contiguous regions Fig. 7. A cross-plot correlation matrix, plotting each variable from the co-located model (resistivity, P-wave velocity, S-wave velocity, and depth below ground level). The shading behind each panel corresponds to the Spearman rank correlation, printed above each panel, measuring statistical correlation between the variables. © BGS and © University of Bristol. Fig. 8. The results of the two GMM models; a) the data distribution and b) cluster ground model for the GMM with no depth constraint, and c) data distribution and d) cluster ground model for the GMM with depth constraint. © BGS and © University of Bristol. (Fig. 8b), with cluster groups 1 and 2 each forming two separated units at depth, likely to indicate separate lithological units, and cluster groups 3 and 4 forming adjacent surface regions, indicating different states of surficial materials. The two GMM models show a similar pattern of cluster distribution; the main regions described above are present in both models, but with some areas of cluster group 3 assigned to cluster group 4 in the lower slope in the GMM model with no depth constraint. As the GMM algorithm assigns clusters based on probability, the probability of assignment for each point in a cluster group can be used to assess the result of the model output (Fig. 9). Using depth constraint in the GMM increases the probability of cluster groups 3 and 4 being assigned to the correct cluster, while the probabilities of cluster group 2 remain broadly the same. There is a marginal reduction in the probability of cluster group 1 being correctly assigned, but this reduction is much less than the gains made in cluster groups 3 and 4. Consequently, the GMM using depth constraint can be considered to be the more reliable of the two GMM outputs. A cross-plot correlation matrix showing the relationships between the various geophysical data variables with the cluster group assignments shows the different properties of the cluster groups (Fig. 10). The cluster groups within the cross-plot correlation matrix highlight the relationships between the different geophysical properties of the various rocks and soils present at the site, and these distributions can be used for determining potential lithology based on the ranges of values present.

Discussion
Acquiring multi-method, co-located geophysical measurements can reduce uncertainty in ground model development, as different geophysical techniques are sensitive to different properties of the subsurface. Although increasing the number of geophysical techniques can reduce uncertainty, nonetheless, a qualitative, heuristic form of comparative interpretation can be open to operator bias (Niccoli, 2014), and requires both prior knowledge of the subsurface and of the sensitivity of the geophysical method to other similar subsurface conditions to produce a robust interpretation. In some instances, ranges of reference values may be used to aid the interpretation of geophysical data (e. g., Bichler et al., 2004). With this approach, large assumptions are made regarding the validity of measurements acquired in one setting and used to interpret geophysical models acquired from another. Additionally, the typical ranges that are presented by such reference values are usually very large and tend to overlap between material types, therefore still relying on the judgement of an operator to use effectively. Overconfidence in interpretation may produce a detailed ground model with many (unknown or undisclosed) uncertainties, whilst underconfidence in interpretation may result in a conservative ground model that does not capture the heterogeneity of the subsurface. In either case, the resulting ground model does not reflect the true heterogeneity of the subsurface and may fail to map major subsurface discontinuities. Furthermore, the process of arriving at the final ground model is unlikely to be repeatable between operators.
Therefore, rapid geophysical investigation and processing methods which can produce robust and objective results for informing ground model development are needed. Using co-located models, we show that there exist relationships between geophysical variables that are more complex than can be described by simple statistics that consider the dataset as a whole. A GMM is able to identify overlapping relationships and trends in the co-located dataset, and classify data according to these distributions. The resulting ground model, created by plotting the cluster assignments to their spatial location has identified the major discontinuities present in the subsurface, which had been identified in prior ground models of the HHLO produced from joint geotechnical and geophysical studies.
In previous studies of the HHLO, arbitrary values of geophysical measurements have been used to identify the boundary between lithological units. Chambers et al. (2011) used a resistivity value of 30 Ω.m to distinguish between the clay-rich WMF and clay-deficient SSF at the site, based on visual inspection of inverted ERT models. Uhlemann et al. (2016) used rapidly increasing seismic velocity gradients to locate potential lithological boundaries between the WMF and SSF. This boundary was then later used in the processing of 3D ERT surveys from the site, and was used as a boundary within the inversion process and to determine different zones of the subsurface for applying petrophysical relationships by Uhlemann et al. (2017). Similarly in the study by Whiteley et al. (2020) considering time-lapse SRT models from the HHLO, an arbitrary value from the Vp/Vs model (Vp/Vs >5) based on the judgement of the operator was used to identify the sliding layer at the HHLO. In all of these previous studies, different persons considering Table 2 The properties of the cluster group assignments output from the GMM. © BGS and © University of Bristol.  Fig. 9. The probability of each variable in the two GMM models being assigned the correct cluster group. Using depth constraint reduces uncertainty in cluster groups 3 and 4, at the cost of introducing marginally higher uncertainties in to the assignment of cluster group 1. © BGS and © University of Bristol.
the inverted models may have produced different opinions or interpretations of the location of these discontinuities. That is not to say that these results are necessarily incorrect or inaccurate; the crosssection in Fig. 1c is the result of these many studies, including information from other intrusive  and remote-sensing observations , and consequently uncertainties in interpretation are reduced with each additional data source incorporated. The discontinuities from this ground model have been transposed  to the rapid reconnaissance ground model for comparison (Fig. 11). The approach presented in this study provides a means of identifying these discontinuities by interrogating the relationships present in the parameter space (see Fig. 10), rather than based on judgement alone or by incorporation of many different geophysical, geotechnical and geodetic datasets. Crucially, this approach will yield repeatable results independent of the skill and experience of the operator. The geophysical properties of each cluster group are summarised in Table 2, and these geophysical properties characterize the materials of the landslide. The surface layers, cluster group 1 and 4, which are located toward the mid to upper slope and lower slope respectively, share broadly similar seismic velocities (low to medium velocities) but have different resistivity ranges. The low seismic properties of both these groups indicate disturbed and unconsolidated material in this surface layer, consistent with other seismic investigations conducted by Uhlemann et al. (2016) and Whiteley et al. (2020). In cluster group 1, which extends to ~2 m bgl from 0 m to 90 m horizontal distance before thinning out at the surface by 120 m horizontal distance, the low range of resistivity (<25 Ωm) points to a high clay content in the surface layer, similar to those observed by Chambers et al. (2011) andUhlemann et al. (2017). High levels of saturation in this surface layer are unlikely due to the antecedent dry conditions that preceded the survey in this study. The underlying Whitby Mudstone Formation (WMF) located in the upper slope has a high clay content and is the primary failing unit at the HHLO, and is also the parent unit for this disturbed surface layer identified by cluster group 1. Cluster group 4, underlying cluster group 3 from 65 m horizontal distance and outcropping at the surface from 90 m, has a much higher range of resistivity (25 to >100 Ωm). This indicates a material with lower clay content, likely derived from the relatively sandrich Staithes Sandstone Formation (SSF) underlying this part of the landslide.
Beneath these surface layers, areas of cluster group 3 are observed (located in a small area at the top of the landslide slope from 0 to 10 m horizontal distance, and again from 45 m to the end of the survey line) and cluster group 2 is intercalated in cluster group 3. The seismic properties of each unit are broadly similar, showing medium to high Pand S-wave velocities, and indicating more consolidated deposits. The high range of resistivity of cluster group 3 indicates a material with variable clay content and saturation regime, whereas the limited, low resistivity of cluster group 2 indicates a clay-rich and saturated material. Therefore, the large zone of cluster group 3 located below 2 m bgl between 45 m and the end of the survey line is likely to be the SSF, which is a relatively porous and a sand-rich unit. The small area of cluster group 3 located at the top of the landslide is likely a small section of DF, the sandstone and limestone unit capping the escarpment. The two zones of cluster group 2 represent units of different materials but with similar properties. The upper area of cluster group 2 is the WMF, as indicated by low resistivity and medium to high seismic velocities, and the lower area is most likely the RMF, a unit of clay-rich material situated within a location more prone to increased saturation at the base of the landslide, giving it similar properties to the clay-rich WMF in the upper slope.
The results of the GMM clustering can be compared with intrusive borehole data from the site, acquired at the same time as the surveys (Fig. 12). A borehole drilled at 52.85 m horizontal distance (and offset by 12 m to the east of the survey line) recovered a core including disturbed surface deposits, and underlying WMF and SSF materials to a depth of ~10 m bgl. The lithological transitions are identified by the changes in colour from the grey WMF to brown SSF, although the transition between these units is not well reflected by individual variations in the geophysical logs extracted from the borehole position. However, changes in the cluster group profile reflect the changes from disturbed surface material (0-2 m bgl; cluster group 1) to WMF (2-7.5 m bgl; cluster group 2) and SSF (>7.5 m bgl; cluster group 3), highlighting the use of the GMM in identifying subtle variations that may not be easily identified by individual visual inspection and comparison alone.
It is important to note that the machine learning approach presented in this study does not remove uncertainties associated with the nonuniqueness of inverted geophysical data. For this, data integration at the inversion stage is required, which requires a different and more technically demanding approach, such as the use of petrophysical joint- inversions (e.g., Mollaret et al., 2020;Wagner et al., 2019). Neither does the approach presented in this study advocate the omission of individual evaluation of inverted geophysical models; although skills and experience may vary between persons, this does not preclude expert analysis of inverted models as has been the approach taken for many decades with great success in the field of near-surface geophysics. Rather, this approach unifies the interpretation of three individually inverted geophysical models into a single model, which is easier to understand, rapidly identifies major subsurface discontinuities, and is compiled in a more rapid and objective manner than individual comparison between models. Because of the level of detail regarding major subsurface features acquired in comparison to the many more detailed studies preceding this one, we term this single unified model a 'rapid reconnaissance ground model'. Additionally, as the machine learning approach exploits inherent relationships within the co-located measurements, the rapid reconnaissance ground model contains lower interpretation uncertainties than those that may result from using, for example, comparison to generic reference values (e.g., Bichler et al., 2004).

Conclusions
We produced three sets of geophysical models from electrical resistivity tomography and P-and S-wave seismic refraction tomography at a slow-moving clay-rich landslide. After data pre-processing to colocate the surveys using topography acquired from a UAV, we inverted the geophysical data on a joint 2D mesh, producing co-located geophysical models. We cross-plotted the variables of resistivity, Pwave velocity, S-wave velocity, (Fig. 7) revealing complex trends and relationships in the data, indicating the presence of multiple contiguous geophysical zones in the subsurface.
Using an unsupervised GMM algorithm, we classified cells in the geophysical models (resistivity, P-wave and S-wave velocity) as belonging to one of four cluster groups, with the number of groups being chosen based on statistical measurements of the data, alongside the expected lithological variations present at the site. The GMM algorithm identified four distinct areas of the landslide; a surface layer of low resistivity and low seismic velocity, thought to represent the failing Whitby Mudstone surface layer, and a second surface layer showing higher resistivity, believed to be a sand-rich surface material derived from both the underlying Staithes Sandstone Formations. Beneath these surface layers, the GMM algorithm, identified four discrete domains comprising two cluster groups, indicating areas of mudstones and sandstones with similar geophysical properties in the landslide. We interpret a cluster group of low resistivity and increased seismic velocities as the underlying Whitby Mudstone formation in the mid to upper slope, and the Redcar Mudstone Formation at the bottom of the slope. Another cluster group with a high resistivity range and slightly decreased seismic velocities is interpreted as the Dogger Formation at the very top of the slope, and the unsaturated Staithes Sandstone Formation in the mid-slope area.
The ground model produced by plotting these cluster groups back to their location within the landslide subsurface maps the major discontinuities identified in previous ground models of the HHLO constructed by qualitative and heuristic interpretations of geotechnical and geophysical data (e.g., Uhlemann et al., 2016;Whiteley et al., 2020). Additionally, when compared to a borehole log from near the geophysical survey line, the extracted geophysical and cluster group logs show good agreement with the main lithological changes identified in the subsurface.
These types of ground models, acquired rapidly over large areas at the slope-scale using multi-geophysical survey approaches, and processed and interpreted using objective and automated methods, are a valuable tool in the rapid reconnaissance of landslide systems. Using unsupervised machine learning algorithms for these purposes is an emerging field in engineering geophysics, and one which shows much promise in overcoming the pitfalls associated with operator bias, overdependence on operator skill and interpretation, and conveying results objectively to other end users with clarity. This approach to producing rapid reconnaissance ground models gives broad stroke information about the subsurface with little prior knowledge, and can be crucial in disaster risk reduction scenarios and time-critical ground investigations of unstable landslide structures and their associated disaster cascades (Fan et al., 2021), or for providing early stage design information for the establishment of slope-scale landslide early warning systems.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.