Fuzzy inference systems for mineral prospectivity modeling-optimized using Monte Carlo simulations

Highlights
• Mineral prospectivity modeling at the target scale for identification of drilling targets.
• Monte Carlo simulations for optimization of fuzzy inference systems.
• Quantification of uncertainties related to fuzzy inference system-based mineral prospectivity modeling.
• Identification of exploration targets at varying confidence levels for informed decision making.


Prospectivity modeling using a fuzzy inference system (FIS)
A fuzzy inference system (FIS) is a knowledge-driven expert system based on the concepts of fuzzy sets and fuzzy logic [9,15]. The main components of a FIS are the fuzzy membership functions and the fuzzy 'If-Then' rules. The output of a FIS depends primarily on the fuzzy membership values generated by the membership functions. Traditionally, the parameters of the membership functions are chosen from the domain knowledge of the geoscientist. Model-based uncertainties therefore affect FIS-based prospectivity models most significantly [7,11,13,1,8]. This model uncertainty stems from the uncertainty in choosing the type of membership function and in assigning the parameters used to derive fuzzy membership values from the input evidence layers. To optimize a FIS, this study therefore first uses the descriptive statistics of the evidence-layer values at mineralized and non-mineralized drill core locations to define the parameters of the membership functions. This is followed by Monte Carlo simulations (MCS) of the parameters, and hence of the fuzzy membership values, drawn from probability distributions defined from these data statistics.

Proposed method: FIS optimization using Monte Carlo simulations (MCS)
The implementation of Mamdani-type FIS-based mineral prospectivity modeling involves the following steps [12]: (1) fuzzification of the input numeric values into degrees of membership in the linguistic fuzzy sets using mathematical functions; (2) combination of the linguistic values of the input variables using fuzzy operators by constructing fuzzy 'If-Then' rules; (3) aggregation of the outputs across all the 'If-Then' rules; and (4) defuzzification of the aggregated output to calculate the output prospectivity values and generate the prospectivity map. In the method proposed in this paper, we modify the fuzzification step (1) by implementing the Monte Carlo simulations (MCS) technique (Heuvelink 1998, 1999; [8]). The MCS method is used to assess and quantify uncertainty in the fuzzy membership values [7,1,8]. This method was first applied to fuzzy logic overlay by Lisitsin et al. [8].
Here we extend it to rule-based FISs: the proposed method simulates the fuzzy membership values based on simulations of the parameters defining the membership functions. The optimized parameters are then used to design the FISs that map the potential of the targeted mineral systems.
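The four Mamdani steps listed above can be illustrated with a minimal sketch. The two input variables, the Gaussian membership parameters, the two rules, and the function name `mamdani_infer` are all hypothetical and chosen only for illustration; they are not the FISs designed in this study.

```python
import numpy as np

def gaussian_mf(x, mu, sigma):
    """Gaussian membership function with center mu and spread sigma."""
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def mamdani_infer(x1, x2, n_points=501):
    """Toy two-input Mamdani FIS returning a value in [0, 1].

    All membership parameters and rules below are illustrative assumptions.
    """
    y = np.linspace(0.0, 1.0, n_points)  # output universe (prospectivity)

    # (1) Fuzzification: degree of membership of each input to 'low'/'high'.
    x1_low, x1_high = gaussian_mf(x1, 0.0, 0.25), gaussian_mf(x1, 1.0, 0.25)
    x2_low, x2_high = gaussian_mf(x2, 0.0, 0.25), gaussian_mf(x2, 1.0, 0.25)

    # Output fuzzy sets defined over the output universe.
    out_low = gaussian_mf(y, 0.0, 0.2)
    out_high = gaussian_mf(y, 1.0, 0.2)

    # (2) Rules: AND is taken as min, OR as max.
    #     IF x1 is high AND x2 is high THEN prospectivity is high.
    #     IF x1 is low  OR  x2 is low  THEN prospectivity is low.
    w_high = min(x1_high, x2_high)
    w_low = max(x1_low, x2_low)

    # (3) Implication by clipping (min), aggregation across rules by max.
    aggregated = np.maximum(np.minimum(w_high, out_high),
                            np.minimum(w_low, out_low))

    # (4) Defuzzification by the centroid of the aggregated output set.
    return float(np.sum(y * aggregated) / np.sum(aggregated))

print(mamdani_infer(0.9, 0.9), mamdani_infer(0.1, 0.1))
```

In the proposed method only step (1) changes: the membership function parameters are taken from the simulated distributions rather than fixed by hand.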

Detailed steps
First, the initial model parameters were assigned so that the shape of the membership functions conforms to the training data statistics (Table 1). These values were considered the 'expected' values of the membership function parameters. Next, assuming that the parameters of the fuzzy membership functions follow a beta-PERT distribution [7,8], 'minimum' and 'maximum' values were assigned to the parameters, defining the corresponding distribution curve for each parameter of the functions (Fig. 1a and b). The beta-PERT is a type of beta distribution defined by its minimum, maximum and mode values [4,6,10]. With these parameters, a continuous distribution with a wide variety of shapes, such as L, J, U, or a nearly normal bell-shaped curve, can be defined. Beta distributions are often used when there is no training data and the only information available is expert knowledge about the optimistic, most likely, and pessimistic values. However, because of its flexibility, the distribution can also be defined to model random variables so that it approximates the variations identified in the data [10]. Hence it is applicable to modeling procedures for simulation and uncertainty assessment in various fields. Examples include Johnson et al. [5], who assessed the natural gas potential of the Horn River Basin in Canada; Damgaard and Irvine [3], who explored spatial and temporal patterns and modeled plant abundance from plant cover data; and Xu and Chowdhury [14], who carried out probabilistic analysis of structured rock and soil slopes.
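As a minimal sketch of how values can be drawn from a beta-PERT distribution, the snippet below rescales a standard beta distribution to the interval between the 'minimum' and 'maximum' values. The shape weight `lam = 4` is the conventional PERT choice and is an assumption here, as are the function name `sample_beta_pert` and the example bounds; the source does not state them.

```python
import numpy as np

def sample_beta_pert(minimum, mode, maximum, size, rng, lam=4.0):
    """Draw samples from a beta-PERT distribution.

    lam = 4 is the conventional PERT shape weight (an assumption here).
    """
    if not (minimum <= mode <= maximum) or minimum == maximum:
        raise ValueError("require minimum <= mode <= maximum and minimum < maximum")
    span = maximum - minimum
    # Shape parameters of the underlying beta distribution on [0, 1].
    alpha = 1.0 + lam * (mode - minimum) / span
    beta = 1.0 + lam * (maximum - mode) / span
    # Rescale the beta draws from [0, 1] to [minimum, maximum].
    return minimum + span * rng.beta(alpha, beta, size=size)

rng = np.random.default_rng(0)
# Hypothetical bounds for one membership function parameter
# (e.g. Q1, expected value, Q3 of a data group).
draws = sample_beta_pert(0.18, 0.22, 0.25, size=1000, rng=rng)
print(draws.min(), draws.mean(), draws.max())
```

With `lam = 4` the mean of the distribution is (minimum + 4 × mode + maximum) / 6, so the 'expected' value dominates while the bounds control the spread.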
For the current study, the bounding parameters of the beta-PERT distribution, i.e., the 'minimum' and 'maximum' values, were derived from the descriptive statistics of different data groups in the drill core data, for instance using the quartile 1 and quartile 3 values, respectively (Table 1). The values of each parameter were then randomly drawn 1000 times from the beta-PERT distribution defined by the 'minimum', 'expected' and 'maximum' values of the membership function parameters, and the corresponding output fuzzy membership values were estimated. This created a probability distribution of the output fuzzy membership values (Fig. 1c). From this distribution, we can extract the range of 'probable' values at varying certainty levels. For instance, the 10% certainty level gives a narrow range of output fuzzy membership values, whereas the range at 90% certainty is wide (Fig. 1c). Hence, at 90% certainty we can provide the least, the most probable and the highest membership values. Consequently, we use the 10-, 50- and 90-percentile membership values from the 90% certainty envelope to provide the least, the most probable and the highest membership values (Fig. 1c). The 10-percentile value is among the lowest, i.e., the most conservative, estimates of the output, and hence a 90% confidence level is assigned to it. The 90-percentile value gives an estimate on the higher side of the density distribution. It is therefore the most optimistic estimate of the fuzzy membership value, and hence a 10% confidence level is assigned to the 90-percentile values. The 50-percentile value is the most probable estimate and hence is the 'expected' prediction of prospectivity values, supplemented by the 10% and 90% confidence level values. Fig. 1 demonstrates the above-described method of parameter estimation applied to the input variable NE-SW magnetic anomalies. The input variable representing NE-SW magnetic anomalies has three linguistic values.
However, here we describe the parameter estimation only for the membership function representing the linguistic value 'Mineralized anomalies'. Based on the descriptive statistics in Table 1, Fig. 1a and b present the beta-PERT distributions defined for the parameters σ and μ, respectively, of the 'Mineralized anomalies' membership function. The parameters of the membership function were randomly sampled from these distributions by running MCS. The 10- and 90-percentile output membership values then give the membership functions of the 'Mineralized anomalies' linguistic value at the 90% and 10% confidence levels, respectively (Fig. 2). This procedure was implemented for each linguistic value of each input variable (there were a total of 17 linguistic values in 7 evidence layers; Table 2). Subsequently, the parameters corresponding to the 50-, 10-, and 90-percentile output fuzzy membership values for all the variables were extracted and used to design the expected FIS and the supplementary 90% and 10% confidence level FISs, respectively (Figs. 3 and 4). The simulated results were combined to generate the host/chemical traps and structural settings FISs at varying confidence levels (Figs. 3 and 4). Finally, the fuzzy prospectivity values from each of the FISs were mapped to produce the corresponding prospectivity maps (Fig. 5).
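The simulation step described above can be sketched as follows: draw the membership function parameters 1000 times from their beta-PERT distributions, evaluate the membership value for each draw, and read off the 10-, 50- and 90-percentile values. A Gaussian membership function is assumed, and the parameter bounds, the input value and the names `simulate_membership`, `mu_bounds` and `sigma_bounds` are hypothetical; they do not reproduce Table 1.

```python
import numpy as np

def sample_beta_pert(minimum, mode, maximum, size, rng, lam=4.0):
    """Beta-PERT sampler; lam = 4 is the conventional weight (assumed)."""
    span = maximum - minimum
    alpha = 1.0 + lam * (mode - minimum) / span
    beta = 1.0 + lam * (maximum - mode) / span
    return minimum + span * rng.beta(alpha, beta, size=size)

def gaussian_mf(x, mu, sigma):
    """Gaussian membership function with center mu and spread sigma."""
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def simulate_membership(x, mu_bounds, sigma_bounds, n_draws=1000, seed=0):
    """Monte Carlo simulation of the membership value of input x.

    mu_bounds and sigma_bounds are (minimum, expected, maximum) triples.
    Returns the membership values at the three confidence levels.
    """
    rng = np.random.default_rng(seed)
    mu = sample_beta_pert(*mu_bounds, n_draws, rng)
    sigma = sample_beta_pert(*sigma_bounds, n_draws, rng)
    values = gaussian_mf(x, mu, sigma)  # one membership value per draw
    p10, p50, p90 = np.percentile(values, [10, 50, 90])
    # 10th percentile: conservative estimate, 90% confidence level.
    # 90th percentile: optimistic estimate, 10% confidence level.
    return {"conf90": float(p10), "expected": float(p50), "conf10": float(p90)}

# Hypothetical (minimum, expected, maximum) bounds for mu and sigma.
result = simulate_membership(0.25, (0.20, 0.218, 0.24), (0.02, 0.03, 0.05))
print(result)
```

Repeating this for a grid of input values traces out one membership curve per confidence level, which is the kind of comparison Fig. 2 presents.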

Results and method validation
The FISs at varying confidence levels were derived from the 90-, 50-, and 10-percentile values of the parameters extracted from the 90% certainty envelope of the probability distribution of the simulations. The consequent changes in the fuzzy membership values can be visualized, for instance, in Fig. 2, which shows the fuzzy membership function expected from data statistics and the supplementary membership functions at the 10% and 90% confidence levels for the fuzzy set representing 'mineralized anomalies' in the evidence layer of NE-SW trending magnetic response. An input value attains three fuzzy membership values, one from the function at each confidence level. For instance, an input value of 0.218 (around the peak of all three membership functions) is mapped to a nearly maximum membership value (i.e., ∼1) at the 10%-, expected, and 90%-confidence levels. However, the differences in the membership values increase towards the tails of the functions, indicating a progressive increase in uncertainty. Additionally, for the given example, these differences are not symmetric on the two sides of the curves. For an input value of 0.25, which plots on the right side of the curves, the membership values at the 10%-, expected, and 90%-confidence levels are 0.825, 0.658, and 0.525, respectively. The right side of the curves thus shows a smaller difference between the expected and 90%-confidence level membership values. Similarly, for an input value of 0.18, which plots on the left side of the curves, the membership values at the 10%-, expected and 90%-confidence levels are 0.725, 0.658, and 0.458, respectively. The left side of the curves shows a relatively smaller difference between the expected and 10%-confidence level membership values, particularly for values above the inflection points. The same input value is therefore mapped to different parts of the output space at varying confidence levels.
In Figs. 3 and 4, an input feature vector with a value of 0.5 in all input variables has a host potential of 0.1, 0.07 and 0.04 and a favorable structural settings potential of 0.163, 0.153 and 0.122 in the 10%-, expected and 90%-confidence level FISs, respectively. Hence the final prospectivity values attained by such an input feature vector range from 0.016 to 0.005 at 90% certainty, while the most probable prospectivity value is 0.010. The overall standard deviation of the prospectivity values for this input feature vector is 0.0055. The consequent changes in the final prospectivity values when all the variables are integrated in the 10%-, expected and 90%-confidence level FISs are also evident in the prospectivity results (Fig. 5). As the confidence level of the FIS results increases from 10% to 90%, the area mapped as highly prospective shrinks and the high prospectivity zones become well-defined and localized (Fig. 5). For instance, a random location marked by 'o' in the black box in Fig. 5 has prospectivity values ranging from 0.806 at the 10%-confidence level to 0.167 at the 90%-confidence level (standard deviation of 0.26, implying high uncertainty). In the Hirvimaa area, or at the well-endowed Palokas and Raja prospects, the prospectivity is enhanced and the spatial extent of the prospective zones is reduced as the confidence level increases from 10% to 90%. Such variations in the prospectivity values therefore facilitate identification of exploration targets with a confidence factor attached to the results. Hence, we were able to quantify the model uncertainties by assigning confidence levels to the reported results.

Table 2 (continued on next page). Rationale* for the membership functions (MF):
'Non-anomalous-low' fuzzy set: The degree of membership was constrained to less than the Q1 value of mineralized drill core sections, i.e., below 0.36; the parameters of the MF were adjusted accordingly.
Anomalous: The high degree of membership to the fuzzy set is centered between the median and just below Q3 of the mineralized drill core sections; the SD of the MF corresponds to that of the mineralized drill core data.
Non-anomalous: High. Sigmoid (a, c) (iv); a = −7.605, c = 0.432. A gently sloping MF extending to the Q3 value of non-mineralized drill core sections.

* Subjective reasoning formed the basis for generating the initial FIS, which was then optimised using parameters obtained from MCS. Such reasoning varies from person to person and therefore lacks precision and accuracy. Nevertheless, it is important to document the reasoning and make it available to end-users and relevant stakeholders.

(i) Gaussian MF: It is defined by the parameters center (μ) and spread (σ) as follows:
$$f(x; \sigma, \mu) = \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
Composite Gaussian MF: It comprises two Gaussian functions, i.e., the left and the right curves. The composite Gaussian function is defined by the center (μ) and spread (σ) parameters of the left and right curves as follows:
$$f(x; \sigma_L, \mu_L, \sigma_R, \mu_R) = f_L(x)\, f_R(x), \quad f_L(x) = \begin{cases} \exp\left(-\frac{(x - \mu_L)^2}{2\sigma_L^2}\right), & x < \mu_L \\ 1, & x \ge \mu_L \end{cases} \quad f_R(x) = \begin{cases} 1, & x \le \mu_R \\ \exp\left(-\frac{(x - \mu_R)^2}{2\sigma_R^2}\right), & x > \mu_R \end{cases}$$
In the composite function, for μ_L ≤ μ_R the membership values reach the maximum of 1 over the range [μ_L, μ_R]. For μ_L > μ_R, the maximum value is less than 1. The composite Gaussian function applied to fuzzy sets such as 'Low' or 'Non-anomalous' linguistic labels comprises the left curve only, and when applied to fuzzy sets such as 'High' or 'Anomalous' linguistic labels it comprises the right curve only. The main utility of the composite Gaussian function is that the left and right curves need not be symmetric. This facilitates shaping the function with different slopes on either side.
(ii) Generalized bell function: This function is defined by the parameters a, b, and c as follows:
$$f(x; a, b, c) = \frac{1}{1 + \left|\frac{x - c}{a}\right|^{2b}}$$
(iv) Sigmoid function: This function is defined by the parameters a and c as follows:
$$f(x; a, c) = \frac{1}{1 + e^{-a(x - c)}}$$
The parameters a and c control the width and center of the transition area, respectively. c is the inflection point where f(x) = 0.5, and a controls the slope at c.
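The membership function types named in the Table 2 notes (Gaussian, composite Gaussian, generalized bell and sigmoid) can be sketched in Python as below. The functional forms are the standard textbook ones and are assumed here rather than taken from the source; the function names are hypothetical. Only the sigmoid parameters a = −7.605 and c = 0.432 come from Table 2; all other values are illustrative.

```python
import numpy as np

def gaussian_mf(x, mu, sigma):
    """Gaussian MF with center mu and spread sigma."""
    x = np.asarray(x, dtype=float)
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def composite_gaussian_mf(x, mu_l, sigma_l, mu_r, sigma_r):
    """Product of a left Gaussian shoulder and a right Gaussian shoulder."""
    x = np.asarray(x, dtype=float)
    left = np.where(x < mu_l, gaussian_mf(x, mu_l, sigma_l), 1.0)
    right = np.where(x > mu_r, gaussian_mf(x, mu_r, sigma_r), 1.0)
    return left * right

def generalized_bell_mf(x, a, b, c):
    """Generalized bell MF: width a, shape b, center c."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))

def sigmoid_mf(x, a, c):
    """Sigmoid MF: inflection point c (value 0.5), steepness a."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-a * (x - c)))

grid = np.linspace(0.0, 1.0, 101)
# Asymmetric composite Gaussian: plateau of 1 between the two centers.
plateau = composite_gaussian_mf(grid, 0.4, 0.05, 0.6, 0.15)
# 'Non-anomalous: High' sigmoid with the Table 2 parameters; a < 0, so
# the membership decreases as the input value increases.
non_anomalous_high = sigmoid_mf(grid, -7.605, 0.432)
print(plateau.max(), float(sigmoid_mf(0.432, -7.605, 0.432)))
```

A negative a makes the sigmoid fall from 1 towards 0, which matches the gently sloping function described for the 'Non-anomalous: High' row.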
Additionally, all the results are validated using receiver operating characteristic (ROC) curves and the corresponding area under the curve (AUC) values (Fig. 6). The prospectivity map from the optimized FIS has an AUC value of 0.839, and the supplementary results also have high AUC values of 0.710 and 0.811 for the 10%- and 90%-confidence level results, respectively. The capture efficiency, tabulated in Table 3, further validates the efficiency of the method proposed in this study. The FIS model captures 80% of the mineralized drill core sections in the highly prospective tracts that occupy 24% of the study area. The ground exploration targets identified by the prospectivity maps together comprise about 1.5 km² of the ∼20 km² study area (i.e., 7.5% of the study area) [2]. This indicates a considerable reduction of the search space by the modeling results for follow-up ground exploration with high confidence.
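The two validation measures can be sketched as follows. The AUC is computed with the rank-based (Mann-Whitney) formula, and capture efficiency is taken as the share of mineralized samples falling within the top-scoring fraction of cells. Both function names, the capture-efficiency definition and the toy scores and labels are assumptions for illustration; they are not the study data or necessarily the authors' exact procedure.

```python
import numpy as np

def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formula."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n = scores.size
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(n, dtype=float)
    i = 0
    while i < n:
        # Tied scores share the average of the ranks they span.
        j = i
        while j + 1 < n and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0
        i = j + 1
    n_pos = int(labels.sum())
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))

def capture_efficiency(scores, labels, area_fraction):
    """Share of positives inside the top `area_fraction` of scored cells."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    threshold = np.quantile(scores, 1.0 - area_fraction)
    selected = scores >= threshold
    return float(labels[selected].sum() / labels.sum())

# Toy prospectivity scores and mineralized (1) / non-mineralized (0) labels.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(roc_auc(scores, labels))

# Search-space reduction reported in the text: 1.5 km² of ∼20 km².
print(1.5 / 20.0 * 100.0)
```

An AUC of 0.5 corresponds to a random ranking and 1.0 to a perfect separation of mineralized from non-mineralized samples, which is the scale on which the reported values of 0.839, 0.710 and 0.811 are read.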

Declaration of Competing Interest
No competing interests to declare.