Continental-scale geochemical surveys and mineral prospectivity: Comparison of a trivariate and a multivariate approach

Abstract
The National Geochemical Survey of Australia (NGSA) provides an internally consistent, state-of-the-art, continental-scale geochemical dataset that can be used to assess which areas of Australia are more elevated in commodity metals and/or pathfinder elements than others. But do regions elevated in such elements correspond to known mineralized provinces, and what is the best method for detecting and thus potentially predicting those? Here, using base metal associations as an example, I compare a trivariate rank-based index and a multivariate method based on Principal Component Analysis. The analysis suggests that the simpler rank-based index better discriminates catchments endowed with known base metal mineralization from barren ones and could be used as a first-pass prospectivity tool.


Introduction
Modern exploration for mineral resources has relied on geochemical methods since the early days of 'mineral prospecting' in the first half of the 20th century (e.g., Hawkes and Webb, 1962; Levinson, 1974; Beus and Grigorian, 1977; Rose et al., 1979). Geologists from the former Soviet Union and Scandinavia led the way in developing sampling and analytical methods for the first systematic geochemical surveys. Over recent decades, a large number of geochemical surveys have been carried out at low (~1 site/1000 km²) to ultra-low (~1 site/10,000 km²) sampling densities, revealing increasingly large geochemical features (Xie and Yin, 1993; Garrett et al., 2008). These have been shown to faithfully represent geochemical patterns revealed in greater detail by higher density surveys of the same areas (e.g., Smith and Reimann, 2008; Birke et al., 2015).
As regions of low and elevated element content are revealed by such surveys, it is relevant to investigate statistically if and how these relate to mineral systems (Wyborn et al., 1994; Reimann et al., 2016). Statistical and visualization methods ranging from univariate analysis to complex machine learning algorithms and geographic information systems have been used to analyze geochemical data, including for prospectivity analysis purposes (e.g., Grunsky, 2010; Zuo et al., 2016; Cracknell and Caritat, 2017). The purpose of such studies is to identify which regions could be prioritized for further strategic investigation and investment. Here I use low-density geochemical data from the National Geochemical Survey of Australia (NGSA; Caritat and Cooper, 2011, 2016) to investigate whether base metal mineral endowment can be recognized at the continental scale. I compare a relatively simple statistical analysis with a more complex multivariate approach.

The National Geochemical Survey of Australia
The NGSA project aimed at providing pre-competitive data and knowledge to support exploration for energy resources in Australia (www.ga.gov.au/ngsa). In particular, it improved existing knowledge of the concentrations and distributions of energy-related elements such as U and Th at the national scale.
The project was underpinned by a series of pilot geochemical surveys carried out in previous years by Geoscience Australia and the Cooperative Research Centre for Landscape Environments and Mineral Exploration (CRC LEME). These developed and tested robust and cost-effective protocols for sample collection, preparation and analysis (see below). Selected results from these pilot projects were summarized in Caritat et al. (2008a).
The NGSA project was conducted in collaboration with all the State and Northern Territory (NT) geoscience agencies. It was initiated because a geochemical coverage for Australia was identified as a gap, which when filled would complement the national-scale geological and geophysical datasets (Caritat et al., 2008b).

Material and methods
Catchment outlet sediments were collected from 1186 catchments (or 1315 sites, including field duplicates), which together cover over 6.174 million km² or ~81% of Australia at an average density of 1 site/5200 km². Approximately 200 catchments in South Australia and Western Australia could not be sampled during this project due to access limitations. Collaboration with State and NT geoscience agencies was critical for the completion of the project, particularly regarding the sampling phase.
Sampling procedures, sample preparation, and sample analysis protocols, as well as data quality assessment have been presented in detail in a series of reports (see Caritat and Cooper, 2016, for a full bibliography) and are thus only briefly described below. The geochemical atlas (Caritat and Cooper, 2011) presented 529 maps illustrating the geographical distribution of the concentration of chemical elements and properties as acquired by the NGSA project.
In brief, the relevant points here are that catchment outlet sediments (similar to overbank/floodplain sediments) were collected from two depths (0-10 cm for Top Outlet Sediment or TOS, and ~60-80 cm for Bottom Outlet Sediment or BOS) near the lowest point of those large catchments. After air drying, a coarse (< 2 mm) and a fine (< 75 μm) fraction were separated, yielding four sample types (TOS coarse or Tc, TOS fine, BOS coarse and BOS fine). Quality assurance measures included: a field manual (Lech et al., 2007) describing in detail all standard operating procedures to ensure homogeneous practice; the use of the same field equipment and consumables provided centrally to all field parties to avoid random contamination due to variable quality of tools and storage bags; the use of gloves during sample collection to minimize contamination; the double labelling of all samples to minimize sample mix-up; the collection of field duplicates at 10% of the sites to assess sample collection (including natural heterogeneity such as the nugget effect), preparation and analysis uncertainty; the randomization of sample numbers including the field duplicates to avoid spurious spatial anomalies due to instrument drift or memory effects; the insertion of blind laboratory duplicates to assess sample preparation and analysis uncertainty; and the insertion of blind internal project standards, exchanged project standards (e.g., Reimann et al., 2012), and Certified Reference Materials (CRMs) at regular intervals in samples submitted to the laboratory to assess accuracy and instrument drift. The elements used herein were all found to be of fit-for-purpose quality.

Results and discussion
In this contribution, I focus on base metal mineral systems, and therefore on silver (Ag), lead (Pb) and zinc (Zn) concentrations. Table 1 summarizes the statistical parameters for those variables in the four different sample types. Their distributions are asymmetric and contain outliers, as is common for geochemical data. Figs. 1, 2 and 3 show the (raw) distributions of Ag, Pb and Zn (mg/kg) in Australia as obtained for the BOS fine samples at the points of sampling, overlain on a geospatial interpolation raster background obtained by ordinary kriging in ESRI's ArcMap software. All mineral deposits from the OZMIN database (Ewers et al., 2002) mentioning Ag, Pb or Zn in their commodity makeup (regardless of importance) are shown as crosses (n = 358), and only those listing Ag, Pb or Zn as the first (most important) commodity are shown as stars (n = 163).
The maps for the individual elements Ag, Pb and Zn show quite different patterns, e.g., the Yilgarn craton in southwestern Australia is elevated in Pb but not in Ag or Zn. Overall, as shown by Figs. 1-3, the patterns on these maps do not match particularly well the distribution of all base metal deposits, which are typically polymetallic. One must thus investigate options for integrating more than one chemical element into the analysis and test how well the results map the distribution of known mineralization. Here I will compare a trivariate method to a multivariate method, as an example of a simple approach compared to a state-of-the-art approach, to test if 'more complex' necessarily implies 'better' in terms of prospectivity analysis.

P.d. Caritat, Journal of Geochemical Exploration 188 (2018) 87-94

Trivariate method
Because the concentration values differ widely between the elements of interest, e.g., Ag from < 0.002 to 5.42 mg/kg vs Zn from < 0.1 to 8910 mg/kg, concentrations were converted to quantile rank (Rnk) values, which uniformly range from 0 to 1 for each variable and are dimensionless. Despite having been shown before to be useful for treating and interpreting geochemical data (Mäkinen, 1991), rank statistics have seldom been used in exploration geochemistry applications. They eliminate or alleviate compositional data problems of closure, skewness and relative scale (Aitchison, 1999). Next the 'Average Base Metal Rank', defined as ABMR = [Rnk(Ag) + Rnk(Pb) + Rnk(Zn)]/3, was calculated for each site, mapped, and kriged. The results for the BOS fine sample type are shown in Fig. 4.
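The ABMR calculation described above can be illustrated with a minimal Python sketch. It is not the implementation used for the NGSA data: the scaling of ranks to the interval [0, 1], the use of average ranks for ties (which is how values at a common detection limit would be treated here), the function names and the four-site dataset are all assumptions made for illustration.

```python
def quantile_rank(values):
    """Scale average ranks to [0, 1]; tied values share the same rank."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to rank")
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2  # 0-based average rank of the tied run
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank / (n - 1)
        start = end + 1
    return ranks


def abmr(ag, pb, zn):
    """Average Base Metal Rank per site: mean of the three quantile ranks."""
    r_ag, r_pb, r_zn = quantile_rank(ag), quantile_rank(pb), quantile_rank(zn)
    return [(a + b + c) / 3 for a, b, c in zip(r_ag, r_pb, r_zn)]


# Hypothetical concentrations (mg/kg) at four sites, for illustration only.
ag = [0.01, 0.05, 0.02, 0.30]
pb = [5.0, 20.0, 10.0, 80.0]
zn = [30.0, 15.0, 60.0, 900.0]
scores = abmr(ag, pb, zn)
```

Because each element is ranked separately before averaging, the very different concentration ranges of Ag and Zn contribute equally to the index; a site ranks highly only if it is elevated in all three elements relative to the other sites.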

Multivariate method
Next, a multivariate approach was taken to compare with the above trivariate rank-based method. Here, to avoid compositional data limitations (e.g., closure) and focus on metallogenic processes, a sub-composition of the 24 most relevant metals and pathfinders (Ag, As, Au, Ba, Bi, Cd, Co, Cr, Cu, Fe, Hg, Mn, Mo, Ni, Pb, Sb, Se, Sn, Te, Tl, U, V, W, Zn) was first centred log-ratio (clr) transformed (e.g., Aitchison, 1999) then subjected to a Principal Component Analysis, following the approach of Caritat and Grunsky (2013). It was found that Principal Component 5 (PC5) has its most negative loadings for Pb (−0.58), Sb, Ag and Zn (−0.16), accounts for 5.9% of the total variance and has an eigenvalue of 1.4. This combination of elements is typical of base metal sulfide deposits. In Fig. 5, PC5 was mapped, interpolated using kriging, and overlain with base metal deposits.
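The clr transformation followed by a Principal Component Analysis can be sketched as below, assuming NumPy is available. This is an illustration under stated assumptions rather than the procedure of Caritat and Grunsky (2013): the data are synthetic, the six-part composition is hypothetical, and the decomposition is performed on the covariance matrix of the clr values, a choice that is not specified in the text above.

```python
import numpy as np


def clr(x):
    """Centred log-ratio transform of an (n_samples, n_parts) positive array."""
    logx = np.log(x)
    # Subtracting the row mean of the logs divides each part by the
    # geometric mean of its sample.
    return logx - logx.mean(axis=1, keepdims=True)


def pca(z):
    """PCA by eigen-decomposition of the covariance matrix of z."""
    centred = z - z.mean(axis=0)
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # returned in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centred @ eigvecs               # component scores per sample
    explained = eigvals / eigvals.sum()      # fraction of total variance
    return eigvals, eigvecs, scores, explained


# Synthetic, strictly positive "concentrations" for 50 samples and 6 parts
# (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=(50, 6))

z = clr(x)
eigvals, loadings, pc_scores, explained = pca(z)
```

Two properties of this sketch are worth keeping in mind when reading loadings such as those quoted for PC5: each clr-transformed sample sums to zero, so the last eigenvalue is (numerically) zero, and the sign of every eigenvector is arbitrary, so only the relative signs of the loadings within a component are meaningful.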

Comparison of methods
To determine which of the above two methods of estimating mineral potential from surface geochemistry works best, the catchments containing at least one deposit listing Ag, Pb or Zn as the first commodity were deemed 'mineralized' (M), the others 'non-mineralized' (NM). In total, 82 catchments were found to be mineralized and contained the 163 deposits discussed above (stars in Figs. 1-5). The differences between the two subsets are shown statistically in Table 2.
In order to determine if these differences are statistically significant, the non-parametric Kolmogorov-Smirnov two-sample distribution comparison test (Kolmogorov, 1933; Smirnov, 1948) was used. The null hypothesis (H0) for this test is that the two samples are drawn from the same distribution. Table 2 shows that H0 cannot be rejected for the multivariate method (PC5) because p > .05, but is rejected for the trivariate (ABMR) method because p < .05. The maximum distance between the two distributions is 0.21 (21% of range) for the ABMR method and 0.101 (only 1.2% of range) for the PC5 method. Thus only for the ABMR method is the distribution of mineralized catchments significantly different from that of the non-mineralized ones. This conclusion is visually corroborated by the distribution plots of Fig. 6. It is speculated that the trivariate method is more successful in this case because it focusses only on the metals of direct interest, whereas the multivariate technique takes into account all the variables, perhaps diluting the information that is sought. Several catchments with elevated ABMR and no known Ag-Pb-Zn deposits are visible in Fig. 4 and could benefit from more detailed follow-up exploration. Future research should investigate simple oligovariate rank-based statistical methods, such as the trivariate method developed here, applied to other mineral systems and focussing on different metals or elements of interest.
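The two-sample comparison described above can be sketched in plain Python as follows. The statistic D is the largest vertical gap between the two empirical cumulative distribution functions, and the p-value uses the large-sample (asymptotic) Kolmogorov series, which is an assumption of this sketch and may differ from the software used for Table 2. The function name and the two short lists of index values are hypothetical.

```python
import math


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov test: returns (D, asymptotic p-value)."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every value <= x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.3:
        # The series below converges slowly here; the true value is ~1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Hypothetical index values for mineralized (M) and non-mineralized (NM)
# catchments, for illustration only.
mineralized = [0.62, 0.71, 0.55, 0.80, 0.93, 0.67]
non_mineralized = [0.20, 0.35, 0.48, 0.51, 0.30, 0.66, 0.44, 0.58]
d_stat, p_value = ks_two_sample(mineralized, non_mineralized)
reject_h0 = p_value < 0.05  # H0: both samples come from the same distribution
```

With samples as small as these hypothetical lists the asymptotic p-value is only approximate; for subsets of the size considered here (82 mineralized catchments against the remainder) the approximation is more appropriate.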

Conclusions
Low- or ultra-low-density geochemical survey data can be used to identify broad regions elevated in commodity and/or pathfinder elements. Results presented here support recent findings reported elsewhere that these regions can relate to areas of known mineralization. Two measures of 'prospectivity' for base metal deposits are presented, one using the average rank of the three elements of interest Ag, Pb and Zn (ABMR), the other using a compositionally compliant multivariate approach (PC5). Statistical analysis of the results indicates that the trivariate ABMR method performs better than the PC5 method: mineralized catchments (i.e., those containing at least one known mineral deposit listing Ag, Pb or Zn as the most important commodity) define a distribution that is statistically distinct from that of the other catchments. For the PC5 method, the two distributions overlap considerably and are not significantly different based on a Kolmogorov-Smirnov test. A number of NGSA samples with elevated ABMR values that are not from catchments known to be mineralized are also identified, and these could be the focus of further investigation.

Acknowledgments
The NGSA project was part of the Australian Government's Onshore Energy Security Program 2006-2011, from which funding support is gratefully acknowledged. NGSA was led and managed by Geoscience Australia and carried out in collaboration with the geological surveys of every State and the Northern Territory under National Geoscience Agreements. The author acknowledges and thanks all landowners for granting access to the sampling sites and all those who took part in sample collection. The sample preparation and analysis team at Geoscience Australia, Canberra, and the analytical staff at the Actlabs Perth laboratories are thanked for their contributions. Constructive reviews by Karol Czarnota (Geoscience Australia), Eric Grunsky (University of Waterloo), two anonymous journal referees, and JGE Editor-in-Chief Stefano Albanese all contributed to significantly improving the original manuscript. Published with permission from the Chief Executive Officer, Geoscience Australia.

Fig. 6. Normal score distributions of the trivariate ABMR (a), and multivariate PC5 (b) variables for both mineralized (red dots; see text) and non-mineralized catchments (gray dots). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)