A statistical approach to study the spatial heavy metal distribution in soils in the Kushk Mine, Iran

The present study was conducted for the spatial distribution and concentration evaluation of heavy metals, including Cu, Cd, Mn, Fe, Zn, and Pb, within 102 soil samples collected from Kushk Mine in Bafgh, Iran. This work employed hierarchical clustering analysis (HCA), principal component analysis (PCA), and spatial distribution patterns, to perform element distribution evaluation within the area. The distributions of heavy metals in the entire area were exhibited in the form of maps. The average concentrations of Cd, Cu, Mn, Fe, Zn, and Pb were found to be 0.39, 0.26, 5.3, 4.1, 51.9, and 40.9 mg.kg−1, respectively. Based on the PCA and HCA findings, the heavy metals were divided into two groups. The first group included Pb, Cd, Zn, and Cu. In the first group, altered threshold‐surpassing anthropogenic and lithogenic pollution was found to be the main factor accounting for Pb and Zn. The second group involved Fe and Mn, which could be impacted by either anthropogenic and lithogenic factors. Furthermore, the geo‐statistical results demonstrated higher contents of the heavy metals in the south of the mine and in the vicinity of the mine tailings. It may be concluded from the results that the heavy metal contents of the area are impacted by anthropogenic and lithogenic factors.


| INTRODUCTION
On account of essential geological processes, the centre of Iran hosts many mines that are mostly either utilized or under exploration phases. Such activities have contributed to the spread of contaminants arising from desertification and mineral discoveries, processes, and transportation. As a rich and potential region, Yazd Province has many abandoned and active mines (Hoseini et al., 2009). The Iran Statistics Center (2019) reported that Yazd Province had 138 operating mines. Based on value-added and mineral production, Yazd is the second largest mineral province. Soil is an environmental component in geochemical pollutant storage. The ecosystem is impacted by toxic metal accumulation. It may influence people via the food chain. It is essential to assess the presence and occurrence possibility of organic and inorganic contaminants within the soil (Sapcanin et al., 2017). Also, the soil may pollute both groundwater and surface water resources, sediments, living organisms, and oceans (Facchinelli et al., 2001;Kumari & Mishra, 2021). Heavy metals are highly nonbiodegradable. The cycling of the food chain may enrich such metals within living organisms (Deng et al., 2017;Li et al., 2019). As the most prominent environmental contaminants, a large heavy metal concentration would considerably contribute to particular negative environmental impacts (Khosravi, Doulati Ardejani, et al., 2018a). Soil is of the highest sensitivity and criticality among earth components. Not only is soil involved in the maintenance of the processes of life but also it has majorly contributed to lifecycle evolution and even its emergence. Heavy metals may contaminate the soil through natural processes, for example, industrial human activities, mineral activities, and weathering, the consumption of fossil energy resources, agriculture, vehicles, metallurgical procedures, and the disposal of waste (Ayari et al., 2021(Ayari et al., , 2022Li et al., 2018). The heavy metal contamination of soils stems from growing industrial development. The disposal of waste and human activities have substantially enhanced heavy metal contaminants in the soil in recent decades (Ghemari et al., 2017;Xiang et al., 2021).
The heavy metal content of the soil can reach the biomass, atmosphere, and hydrosphere (Kabata-pendias & Pendias, 2001). Industrial development plans that do not consider environmental aspects, the absence of clearly specified criteria, and low-efficient legislation for environmental protection actions have resulted in the penetration of chemical contaminants into the air, soil, and water in Iran (Khosravi, Doulati Ardejani, et al., 2018a). Currently, industrial pollution is viewed to be the main explanation for the degradation of the environment. Research has shown air, soil, and water contain contaminants in the vicinity of industrial areas (Hoange et al., 2021). Hence, industrial activities may impose health impacts on humans near the area via the soil, water, and respiratory air, leading to disease and complications (Briffa et al., 2020;Skaldina et al., 2018). In the literature on heavy metal-contaminated soils, Li et al. (2015) performed the identification of high contamination risk areas through the spatial analysis of unusual local concentrations. Several approaches are today adopted for the study of heavy metal contaminants in the soil. The combination of spatial and geo-statistical techniques and geographic information systems (GIS) is extensively employed for the spatial variation demonstration of soil parameters, for example, soil pollution, at a lower cost (Ilie et al., 2016;Khosravi, Ali, et al., 2018b;Zheng et al., 2008). Studies should take into account not only the sampling and analysis of the soil but also the costs. To avoid such issues, the spatial distribution maps of contaminants are employed using spatial interpolation techniques (Santos-Francés et al., 2017). Today, inverse distance weighting (IDW) is employed for the spatial variability description and prediction of soil parameters (Lee et al., 2006;Santos-Francés et al., 2017;Tang et al., 2013;Ungureanu et al., 2017). Also, the Kriging technique is exploited for concentration predictions in unsampled areas (Lark, 2012). The principal component analysis (PCA) and hierarchical cluster analysis (HCA) have recently been employed in numerous soil studies, including heavy metal contamination (Santos-Francés et al., 2017;Ungureanu et al., 2017). Morales Ruano et al. (2019) stated that the proposed method is based on a combination of simple, quick, and reliable chemical and bioassays with chemometric tools such as HCA and PCA which provide a viable approach to carrying out a preliminary estimation of toxicity levels associated with the soil sample in a studied area. PCA reveals the connections between heavy metals and performs their importance prioritization (Oliver & Webster, 2014). In addition, due to its simplicity HCA and PCA can be defined as a quick, economical, and reliable tool for use in the first stages of environmental risk characterization in mining areas (Morales Ruano et al., 2019).
Kushk Mine, Iran has been under manual operation since 1940. It was mechanized in 1964. The mine is located near a processing and condensing plant. Mining waste deposits on the mine margin. Water and wind could spread the waste downstream. A number of studies analysed contaminant contamination in the soil adjacent to Pb and Zn mines. Despite the many years of mining activities, so far few studies have been conducted in the field of soil pollution assessment in Iranian mines, especially in the Kushk mine, which in recent years has attracted the attention of some researchers in this field. Environmental impacts of mining can be created based on local, regional, and global capacities through direct and indirect mining practices. Mines are one of the most polluting sources of the environment due to the production of acidic and toxic wastes.
Paying attention to the environmental effects of mining in the planning of mining production will reduce the harmful effects of this industry on the environment. Mining and its various stages are one of the important factors in spreading pollution in natural environments. Contaminations may affect soil and water resources and actually enter the food chain due to absorption by plants. The objectives of this study include (i) the determination of heavy metals and the detection of contaminated sites, (ii) the execution of PCA and HCA, (iii) the spatial distribution characterization of heavy metals by IDW, and (iv) soil contamination source detection in the vicinity of Kushk Mine.

| Study area
Kushk Mine is located at an altitude of 31-43°N and latitude of 55-40°E. It is 165 km east of Yazd and 45 km northeast of Bafgh, as shown in Figure 1. Soil type of the Yazd plateau and mountainous regions is Solonchaks (Dewan & Famouri, 1964). The mine has an average elevation of nearly 2070 m. The area has a dry climate according to the De Martonne climate classification, whereas it has a temperate climate with dry summers on the basis of the Köppen climate classification. Also, the area has an annual precipitation of nearly 50 mm (raining events mostly occur in winter) and potential evapotranspiration (ET) of approximately 3,500 mm. The wind most commonly blows in a northwest-southeast direction.
The closest villages to the mine include the villages of Kushk with a population of 167 people and Seyedabad with a population of 61 people, which are located about 2 km above the mine. The occupations of the inhabitants of the region are mainly farming and animal husbandry, and some of them work in the mines (Mokhtari et al., 2018).

| Geological of the study area
Bafgh Metallogenical Zone is located in the eastern part of the Central Iran zone and is called the Bafgh F I G U R E 1 The location of the Kushk Pb-Zn mine of Bafq in Yazd Province of Iran. metallogenic zone. This metallogenic region ends from the southeast to Kuhbanan fault and from the northeast with Naeen fault and Kalmard fault is separated from Tabas block and from the north to Chapdoni fault and Saghand fault Posht Badam and from the west. It leads to the Kuhbanan fault.
The Bafgh block is one of the oldest areas of Iran, where very large deposits of iron, lead, zinc, and phosphate are formed during the Late Precambrian-Early Cambrian age.
Neogene and alluvial quaternary deposits account for nearly one-fourth of the northwestern area. Also, rocky outcrops and a low amount of quaternary sediments cover the remaining area of the region.
There are Upper Precambrian-Lower Cambrian sedimentary rocks with a few meters of sandstone (Єs) consisting of two lower and upper parts, for example, lime, dolomite, sandstone (Єdsh), shale, and acidic lava (Єr [and acidic tuffs] Єrl) on older sediments with angular anomalies. The sediments include dolomitic, calcareous dolomite, and Soltanieh Formation (Єdl). The lower part of the fossil vermiporella manchuri of Manchuria can be identified in the dolomitic calcareous strata, which suggests that the sedimentary rocks belong to the Lower Cambrian. Associated with the Middle-Upper Cambrian, the Milma Formation (Єm) involves dolomite, siltstone, lime, and green and purple sandstones. It has hyolithid, stromatolite, and fossil contents as well as salt pseudomorph within a number of calcareous horizons, as illustrated in Figure 2.

| Soil samples
A total of 102 samples were extracted from the soil of the area of Kush Mine in November 2018. The Latin hypercube sampling (LHS) method was employed to determine the locations of samples (Minasny & McBratney, 2006). The entire samples had a specific weight of 1.5-2.0 kg. m −3 . To obtain particular geological properties, the samples were extracted at a depth of up to 5 cm (topsoil). Steel shovels and individual plastic bags were employed to collect the samples so that they would not be polluted. In addition to the above samples, a sample as a control was needed to compare the levels of heavy metals in the basin samples. For this purpose, in the adjacent basin with a geology similar to the studied basin a sample was prepared.

| Latin hypercube sampling
Latin hypercube sampling adopts an efficient sample location strategy to obtain a complete environmental covariate distribution (Minasny & McBratney, 2006). LHS was introduced for use in the Monte-Carlo simulation to make efficient selections of computerized model input variables (Iman & Conover, 1980;McKay et al., 1979). Environmental and soil studies have utilized LSH for uncertainty evaluation purposes within prediction models (Minasny & McBratney, 2006).
In the LHS procedure with k variables, each variable X is divided into n intervals of equal probability (strata). Next, a sample is randomly collected for each of the variables in each of the intervals (stratum). Then, the n values of each variable undergo pairing either randomly or on the basis of rules. Eventually, there are n samples covering n intervals for the entire variables. Hence, this procedure does not need more samples for more dimensions (i.e., variables). LHS ensures the entire variables are represented in a completely stratified fashion Minasny and McBratney (2006) provided a two-dimensional example.

| Analytical procedure
The entire samples underwent air drying and sieving into 2-mm fractions for the removal of stones and coarse grains. The diethylenetriaminepentaacetic acid (DTPA) micronutrient extraction method was employed to extract Cd, Cu, Fe, Mn, Pb, and Zn (Lindsay & Norvell, 1978). Also, analytical atomic absorption spectroscopy was employed to measure heavy metal concentrations. A Germany-made Jena 330 atomic spectrometer was utilized. To avoid inter-sample emissions, all the devices were sterilized (Ungureanu et al., 2017). To check the accuracy of analytical atomic absorption device data, 25 duplicate samples were randomly selected and numbered similar to the original samples and added to the samples (Mokhtari et al., 2018).

| Statistical method and geo-statistics
IBM SPSS Statistics 22 was employed to calculate the means, modes, medians, standard errors, standard divisions, skewness, kurtosis, and minimum and maximum values, as shown in Table 1. Logarithmic or second-root transforms into a normal distribution were performed. ArcGis v10.3 was exploited to perform GIS analysis, spatial analysis extensions, and geo-statistical analysis to obtain the spatial variability degree of heavy metals via IDW for interpolation determination.

| Principal component analysis and hierarchical clusters analysis
The relationships between the quantitative data were assessed using principal component analysis (PCA) and HCA. Also, heavy metal groups were identified as anthropic or natural source tracers, contaminant source description was supported using PCA and HCA (Ungureanu et al., 2017).
Principal component analysis was adopted to find how heavy metals were related and to identify natural and anthropogenic factors. A varimax rotation was used to obtain a simplified structure and better interpret the results. Cluster analysis is aimed at identifying a standard to classify variables on the basis of inter-group differences and intra-group similarity. The data underwent normalization for cluster analysis. The present study employed Ward's method and the Euclidean distance (Johnson, 1998). Table 1 shows the obtained concentrations of heavy metals in the surrounding of the lead and zinc mine. As can be seen, the concentrations of Cd, Cu, Mn, Fe, Zn, and Pb were found to be 0.03-7.98, 0.06-2.53, 0.07-123.92, 0.76-9.09, 28.27-1,079.12, and 13.79-394.4 mg. kg −1 , respectively. Also, the mean concentrations of Cd, Cu, Mn, Fe, Zn, and Pb were calculated to be 0. 39, 0.26, 5.36, 4.16, 51.92, and 40.92 mg.kg −1 , respectively. As can be seen, the mean concentrations reduced in order of Zn > Pb > Mn > Fe > Cd > Cu.

| Heavy metal concentrations in the case study
The coefficient of variations (CV) is significantly large for the anthropogenically-controlled heavy metals, whereas it is almost small for naturally controlled ones (Yongming et al., 2006). Hence, heavy metals could be F I G U R E 2 Geological map of the study area. classified into two groups on the basis of their CV values. The CV of Fe was smaller than 0.4, whereas those of Cd, Cu, Mn, Zn, and Pb, were above 1.0. Fe is released from natural sources. However, Cd, Cu, Mn, Zn, and Pb stem from anthropogenic sources.
The skewness of Cd, Cu, Mn, Zn, and Pb were found to be above zero. This suggests a distribution with positive skewness and a number of samples with rather large quantities. However, the skewness of Fe was calculated to be closer to zero. This implies a normal distribution (Chen et al., 2010;Ungureanu et al., 2017). Figure 3 illustrates the histograms of the results. As can be seen, the heavy metals had high concentrations and normal distributions.
According to Table 1, the contaminant concentrations of the soil samples the entire heavy metals exhibited values equal to or greater than the Global, European, and Iranian quantities. The global and Iranian Cd concentrations are 0.35 and 3.9 mg.kg −1 , respectively, while that of the soil samples was observed to be 0.03-7.98 mg.kg −1 . The Cd concentration of the samples exceeded the alert threshold. The largest Cd concentration was identified around the mine, which could have arisen from mining activities. Khosravi, Ali, et al. (2018b) assessed heavy metal spatial distribution patterns in Iran. They reported an average Cd concentration of 5.79 mg.kg −1 and attributed to anthropogenic sources. The penetration of Cd in the soil can have different explanations, including sewage sludge, phosphate fertilizers, and agricultural reform (Maas et al., 2010).
For the case study, the Cu concentration varied from 0.06 to 2.53 mg.kg −1 , while the mean concentration was obtained to be 0.39 mg.kg −1 . The Cu contamination of soils typically stems from industrial atmospheric deposits or agrochemicals (Kabata-pendias & Pendias, 2001;Maas et al., 2010). Based on the threshold (Table 1), the Cu concentration is generally at a nontoxic level.
The Mn concentration was measured to vary between 0.07 and 123.92 mg.kg −1 in the soil samples. The mean Mn concentration was calculated to be 5.36 mg.kg −1 . Lu et al. (2016) studied the potential toxic elements sources and distribution in Guangzhou, China. They found that the Mn concentration varied from 21.2 to 1,286 mg.kg −1 . Zhaoyong et al. (2015) suggested that mining and other industrial activities that consume water in their operations are most commonly established in the vicinity of water resources. In the process of manufacturing, the heavy metal content of wastewater could readily penetrate groundwater, streams, and rivers, streams through rainfall. Thus, wastewater can induce the secondary contamination of surface water and soil. Additionally, the Fe concentration was derived to vary from 0.76 to 9.09 mg.kg −1 , with a mean concentration of 4.15 mg.kg −1 in the case study. As can be seen, the Fe concentration of no sample was T A B L E 1 Descriptive of the heavy metal content (mg.kg −1 ) in topsoil samples The Pb concentration of the samples was measured to be 13.79-394.4 mg.kg −1 . Also, the mean Pb concentration was obtained to be 40.92 mg.kg −1 . The Pb-contaminated samples were collected from the vicinity of the roadside transport and mineral deposits. A large number of studies on Pb concentration reported similar findings (Khosravi, Ali, et al., 2018b;Santos-Francés et al., 2017;Ungureanu et al., 2017). The concentration of soil is directly associated with the parent material, particularly Al and Fe hydroxides, Mn oxides, and clay minerals, organic material, anthropogenic traffic-associated Pb sources, metal mining, and sewage (Salminen et al., 2005).
Finally, in the case study, the Zn concentration was obtained to be 28.27-1,079.12 mg.kg −1 . Also, the mean Zn concentration was calculated to be 51.92 mg.kg −1 . The largest Zn concentrations were similar to the highest Pb values. Given that the Zn contents of a number of the samples were higher than the alert threshold, the case study can be said to be a Zn-contaminated area. Khosravi, Ali, et al. (2018b) investigated heavy metal spatial distribution patterns. They reported the mean Zn content of the soil as 684.11 mg.kg −1 . Santos-Francés et al. (2017) studied heavy metal spatial distributions in Spain. They found the mean Zn content of the soil to be approximately 35.31 mg. kg −1 . The Zn content of the soil stems from the organic material, parent material, soil texture, and anthropogenic sources such as coal industries, steel processing, and mining (Salminen et al., 2005).

| Hierarchical clustering analysis
To divide the heavy metal sources into anthropogenic and natural groups, exploratory HCA was carried out on the heavy metals for the case study. Figure 4 demonstrates the hierarchical dendrogram of the results. A lower distance cluster value represents a dependence of higher significance (Lee et al., 2006). Figure 4 suggests two clusters, one containing Cu, Cd, Zn, and Pb, while the other consists of Fe and Mn; the former represented anthropogenic sources, while the latter stands for natural sources (i.e., parent material and soil characteristics; Atapour, 2015). Similar findings were reported by previous studies (Atapour, 2015;Chen et al., 2010;Ungureanu et al., 2017). Ungureanu et al. (2017) classified Cu, Cd, Zn, and Pb factors into two groups. They suggested that the two groups shared the source. The Pb and Zn cluster could represent human-induced pollution while the Cu and Cd cluster was attributed to both parent material and human factors. The results are consistent with Zhaoyong et al. (2015), who placed Pb, Zn, and Cu, in the same group and classified As, Cr, Ni, and Co into another group. Also, Hg, Cd, and Mn formed the third group. The contaminants of the third group were argued to arise from artificial sources (Lu et al., 2016). The next section procures the cluster analysis results based on PCA.

| Principal component analysis
The present study utilized multivariate-grouped PCA. The Kaiser-Meyer-Olkin value was found to be 0.710. Therefore, the data were concluded to suit factor analysis. In addition, the results of Bartlett's test for sphericity were observed to be significant. This verified the opposite assumption, that is, the variables were significantly correlated. It is crucial to perform normalization in PCA (Miller & Miller, 2005;Möller et al., 2005). Also, varimax rotation was employed to represent the principal components, as shown in Table 2.
According to the eigenvalues (eigenvalue > 1), 81.15% of the total variance was explained by the two main factors; the first and second components explained 60.001% and 21.15% of the total variance, respectively. The first factor had a positive, significant relationship with Cu, Cd, Zn, and Pb. This indicates the impacts of lithologic and anthropogenic (e.g., traffic) factors on the contamination of the samples (Khosravi, Ali, et al., 2018b;Ungureanu et al., 2017). Ungureanu et al. (2017) demonstrated that Cd and Cr had lithogenic and anthropogenic sources, even when their concentrations were below the alert threshold. Furthermore, the second factor demonstrated positive burdening on Fe and Mn concentrations. The Mn contamination was mostly impacted by parent material, as shown in Figure 5 (Johnson, 1998).
The Pearson correlation matrix was employed to explore how the heavy metals were related in the case study area, as shown in Table 3. As can be seen in Table 3, significant, positive correlations were found between Cd and Cu, Cd and Pb, Cd and Zn, Cu and Pb, Cu and Zn, and Pb and Zn, while Mn and Zn were observed to be negatively correlated (r = −0.010). The PCA and HCA result well agreed with each other, verifying the data interpretation.
Based on the results, the heavy metals were classified into two groups; the first group contained Cd, Cu, Zn, and Pb; Zn and Pb are significantly affected by anthropogenic (e.g., traffic) and lithogenic sources. Both anthropogenic and lithogenic sources could induce Mn and Fe contamination. Also, the fertilizer contents of the samples could impact Mn contamination.

| Spatial distribution pattern of the heavy metals and soil properties
It is important to identify geographical patterns to evaluate soil behaviour (Tang et al., 2013). Thus, ArcGIS and IDW were employed to perform mapping, as illustrated in Figure 6, to assess the relationships between the heavy metals, spatial characteristics, and soil properties.
Cu, Cd, Zn, and Pb had similar spatial distributions within the soil, probably on account of similar sources and anthropological inputs (Lee et al., 2006). Nonetheless, the maximum Cd content was found in the southern part of the mine and in the vicinity of mining waste. Ravankhah et al. (2015) demonstrated that industrial activities play a more significant role than other possible factors in the enhancement of Cd concentration. The F I G U R E 4 Cluster analysis by Ward linkage method and Euclidean distance.
greatest Cu concentrations were identified in the vicinity of mining waste and in the eastern and middle parts of the case study. Fe had the lowest dispersion among the heavy metals. According to Figure 6, the distribution of Mn in the eastern part of the area arose from agricultural factors. However, the concentrations of Fe, Mn, and Cu were below the threshold and caused no contamination. Thus, they probably were the sources of such parent material elements. Also, high Pb concentrations were found down the road, on the southern edge and in the western part of the mine, and in the northwest. Higher Pb distributions were found in the mine and on the roadside, probably on account of geological factors and traffic. Zn was observed to be distributed merely in the southern part of the mine. Galena is the ore of Pb and Zn, and Pb remains to be nontoxic until it is released. Hence, the total Pb content of the soil may be an indicator of a geological contamination source. One of the causes of the spread of lead and zinc pollution is the surface runoff in the area. Among the waterways, the main channel of the region plays the most important role; This is because this waterway passes through the margin of the waste dam and part of the mine wastewater that overflows from the waste dam is discharged into this channel. The rock contains lead and zinc with a high volumetric weight, the sediments do not travel a long way by water and at a distance of several kilometres from the mine are deposited on the floor and the edge of the T A B L E 2 Varimax-rotation factor PCA for metals in the study area  F I G U R E 6 Spatial distribution of the content of the different heavy metals in the soil of the study area by IDW method.

| CONCLUSIONS
The present study integrated PCA and HCA in spatial distribution patterns to find and classify heavy metal sources. PCA and HCA yielded the same results and classified the heavy metals into two groups. The findings demonstrated that human activities and parent material played the most significant roles in Cd, Pb, and Zn contamination. Also, geo-genic sources and parent material mainly controlled Mn, Cu, and Fe contents in the soil. Merely, the concentrations of Cd, Pb, and Zn exceeded the alert threshold. It can be concluded from the results that the heavy metal contents of the case study area are impacted by anthropogenic and lithogenic factors. Mines are one of the most polluted sources of the environment due to the production of acidic and toxic waste. Paying attention to the environmental effects of mining in mineral production planning reduces the harmful effects of this industry on the environment. Considering the importance of soil and water contamination in the food chain, this mine and other mines should be investigated from the point of view of contamination.