Multivariate statistical analysis of urban soil contamination by heavy metals at selected industrial locations in the Greater Toronto Area, Canada

A good understanding of metal contamination in urban soils, and of the pollution sources associated with industrialization and urbanization, is important for addressing environmental problems. Urban soil samples collected near industrial locations in the Greater Toronto Area (GTA), Ontario, Canada were analyzed for metal (Cr, Mn, Fe, Ni, Cu, Zn, Cd, and Pb) contamination. Multivariate geo-statistical analysis (correlation matrix, cluster analysis, principal component analysis) was used to estimate the variability of the soil chemical content. The correlation matrix exhibits a negative correlation with Cr. The principal component analysis (PCA) displays two components. The first component explains the major part of the total variance and is loaded heavily with Cr, Mn, Fe, Zn, and Pb; its sources are industrial activities and traffic flows. The second component is loaded with Ni and Cd, and its sources could be lithology and traffic flow. The cluster analysis demonstrates three major clusters: 1) Mn-Zn, 2) Pb-Cd-Cu-Cr, and 3) Fe-Ni. The geo-accumulation index (I_geo) and the pollution load index (PLI) were determined. The main I_geo values are in the range of 0-1.67, indicating that the studied soil samples are slightly to moderately contaminated with Cr, Fe, Cu, Zn, and Cd, and moderately contaminated with Pb, while Ni and Mn fall into class 0. Regarding the PLI, the lowest values are observed at stations 6, 7, 9, 10, 11, 12, 25, 27 and 28, while the highest values are recorded for stations 1, 5, 6, 13, 14, 16, 17, 18, 20, 22 and 24, and very high PLI readings are seen for stations 5, 13, 16, 17, 18, 22 and 24. These data confirm that, in addition to heavy traffic flows, chemical and metallurgical industries are the major sources of soil pollution in the GTA.


INTRODUCTION
In recent years, much concern has been expressed over the problem of urban soil contamination with heavy metals due to rapid industrialization and urbanization (SUN et al., 2008).
Many studies indicate that urban soils are contaminated by heavy metals, and this phenomenon has been attributed mainly to modern industries, traffic and mining activities in urban areas (NAZZAL et al., 2012, 2014; XIAOYU LI et al., 2013; DE KIMPE & MORAL, 2000; GALLAGHER et al., [...] sources of some heavy metals in top soils, based on both visual inspection of concentration maps and quantitative analyses over varied spatial scales of the spatial variability of the elements and their relationships. They used multivariate methods to relate concentrations of heavy metals to the local geology and land use, by comparing the results of principal component analyses carried out on concentration data with the experimental indicator variogram for selected categorical information. Multivariate geostatistics uses information arising from relationships among variables to improve estimations for variables and to identify the different causes of variations over different spatial scales (CASTRIGNANO et al., 2009). Some factors affecting soil variations exhibit short-range effects while others are important over greater distances. Hence, soil variables are anticipated to correlate in a scale-dependent manner, and this correlation structure of some soil variables, provided it reflects different variability sources, is important in environmental studies. Such activities require a statistical approach that combines classic principal component analysis, to describe the correlation structure of multivariate data sets, with geostatistics, to account for the coregionalized nature of the variables (SOLLITTO et al., 2010).
Here, we apply this method to the GTA, with the objective of improving understanding of the environmental impacts in that region and their causes. As part of the investigation, the authors undertook field sampling to identify sources, site-specific pollution and profiles of pollutant concentrations in the form of heavy metals and their geochemical associations.

Study Area
The GTA is the third largest industrial area in North America, and its industrial entities are major contributors to the regional economy and employment. The varied industries operating in this area affect air, water and soil quality. In addition, the region has dense urban traffic, and its major highways (Highways 401, 400 and 404, and the Don Valley Parkway) carry daily traffic flows of about 920,000 vehicles (NAZZAL et al., 2012).
Toronto is located in southern Ontario on the northwestern shore of Lake Ontario (Fig. 1). With over 5 million residents, it is the fifth most populous city in North America. Its metropolitan area with over 5 million residents is the seventh largest urban region in North America. Toronto is at the heart of the Greater Toronto Area (GTA), and is part of a densely populated region in Southern Ontario known as the Golden Horseshoe, with over 8.1 million residents, representing approximately 25% of Canada's population (POPULATION OF CENSUS METROPOLITAN AREA, 2006).
As Table 1 shows, the GTA has industries of many types (e.g., agriculture, food, textiles, metallurgical, chemical, electronic and machinery). These industries release large amounts of solid, liquid and gaseous wastes. To identify possible sources of soil pollution in the study area, dusts and particle [...] 2008). The concentrations of heavy metals and toxic elements in roadside soils and dust can help indicate pollution levels in urban and industrial areas, since such concentrations usually reflect the extent of the emissions of these substances from anthropogenic sources (FERGUSSON, 1990; HARRISON et al., 1981). Lead is a particularly widespread pollutant, and its presence in soils has been linked to the use of alkyl-lead compounds as antiknock additives in gasoline (GRATANI et al., 1992; NAZZAL et al., 2014).
There are generally two main sources of heavy metals in soils (LI et al., 2009b): I) natural background, representing heavy metal concentrations derived from parent rocks; and II) anthropogenic, caused by agrochemical activity, organic amendments, animal manure, mineral fertilizer, sewage sludge and industrial waste and dusts. Over the last few decades, the natural input to soils of several heavy metals due to pedogenesis has been surpassed, at local through to regional and global levels, by anthropogenic inputs (FACCHINELLI et al., 2001; NRIAGU & PACYNA, 1988; XIAOYU LI et al., 2013).
Recent studies of pollution along highways in the Greater Toronto Area (GTA), in Ontario, Canada (NAZZAL et al., 2012, 2014) have identified some significant environmental situations, with increased heavy metal concentrations resulting from various processes acting at different spatial scales. These variations in metal concentrations in the roadside dust and soils have both natural and anthropogenic origins.
The Greater Toronto Area is the third-largest industrial market in North America, and factories and other industrial properties are major contributors to employment and the economy. Manufacturing in the GTA increased at a rate of 4.6% per annum in the fourth quarter of 2013. Toronto is consistently ranked near the top in terms of global competitiveness, innovation and quality of life, and possesses multisector strength, depth of talent and a driving economic and financial engine. As the business and financial capital of Canada, the GTA economy contributes $286 billion (approximately half of which comes directly from Toronto) annually to the Canadian economy. The GTA has a wide range of industries and services, supporting a range of business needs. Between 1990 and 2010, the region's economy grew from $140 billion to $240 billion (in 2002 Canadian dollars), while Ontario's economy grew from $310 billion to $490 billion. WEBSTER et al. (1994) have applied multivariate geostatistics to provide a more objective assessment of the [...] emissions are assessed. This information, with data on the geology, soils, land-use patterns, topographic environment of the area and other physical and technical conditions, allows soil pollution sources in the study area to be investigated.

Geology, Terrain, Soils, and Land Use
The bedrock geology of Ontario is variable in lithology, structure and age, although approximately 61 percent of the province is underlain by Precambrian rocks of the Canadian Shield (THURSTON, 1991). In the Phanerozoic, sedimentary rocks developed in marine basins along the northern border of the Shield, forming the Hudson Bay lowlands, and in the Great Lakes Basins in the south. The Shield can be divided into three major geological and physiographic regions, with the oldest in the northwest and the youngest in the southeast. The northwestern region, known as the Superior Province and lying north and west of Sudbury, is more than 2.5 billion years old. This region is composed mainly of felsic intrusive rocks forming the rocky Severn and Abitibi uplands (BOSTOCK, 1970). The central region, known as the Grenville Province, at 1.0-1.6 billion years old and lying to the south of Sudbury, is dominated by metasedimentary rocks that form the Laurentian Highlands. The Penokean hills, a fold belt, and the Cobalt plain, an embayment, comprise the Southern Province, which is a narrow region approximately 1.8-2.4 billion years old extending from Sault Ste. Marie in the west approximately to Kirkland Lake in the east.
To the north of the Shield, in the Hudson Bay lowlands, the bedrock is composed of carbonate sedimentary formations. These formations date primarily from the Silurian period, but there are also significant parts representing the Ordovician and Devonian periods. Other sedimentary rocks exist near Ottawa, in an area referred to as the Ottawa embayment, as well as throughout areas north of Lakes Erie and Ontario (DYKE et al., 1989). The clastic and marine carbonate bedrock of southern Ontario is interrupted by the Frontenac Axis, a southern extension of the Shield, which intersects the St. Lawrence Seaway east of Kingston. The Frontenac Axis has different forest cover and land-use patterns compared to areas to the west or east, due to its uneven terrain and shallow acidic soils, both of which are characteristic of the Canadian Shield.
The Canadian System of Soil Classification (AGRICULTURE CANADA, 1987) is a standard series of orders and component groups by which soils can be identified and described. Six of the soil orders in this classification predominate in Ontario: organic and related organic cryosolic soils in northern parts of the province, brunisols in the northwest part of the Shield and south of the Shield, podzols over much of the central and southern Shield, luvisols in the Claybelt and over much of southern Ontario, and gleysols in poorly drained areas and in the Claybelt lacustrine deposits. Regosolic soils are dominant only in a thin band along the southwest shore of Hudson Bay (AGRICULTURE AND AGRI-FOOD CANADA, 1996). From the early 1990s to the early 2000s, the total area of settlement and developed land in the GTA increased by 513 km², while the areas of agricultural and naturally vegetated land decreased by 114 and 423 km², respectively.

Collection of Samples
A total of 30 soil samples were collected from industrial locations in the Greater Toronto Area in August 2013. At every field station, samples were collected with a manually operated stainless steel auger from the top 10 cm of the soil, mainly from grassland. The raw weight of each soil sample was about 1 kg.

Geochemical Analyses
The soil samples were transported to the laboratory in polyethylene bags and dried in an oven at 60 °C for 24 h. The time between sampling and delivery to the laboratory was 24 hours. Different grain size fractions were selected for the analysis of metals: <2 [...], 1997). In the present study, contamination of soils with particle size fractions below 2 µm was investigated using the pipette method (GEE & BAUDER, 1986), in which a sample is pipetted at different times and from various depths of the sample suspension in a measuring cylinder. The pipetted suspension is condensed and dried, and the mass ratio of the pipetted fraction is determined by weighing. Then, 0.5 g of the pipetted fraction was digested using 4 ml of HNO3 (65%), 2 ml of HF (40%), and 4 ml of HClO4 (70%). The resulting solution was analyzed with an Atomic Absorption Spectrometer (PYE UNICAM SP9) for lead (Pb), zinc (Zn), cadmium (Cd), nickel (Ni), chromium (Cr), copper (Cu), manganese (Mn), and iron (Fe). For quality control, all soil samples were analyzed in triplicate and mean values were calculated to estimate their precision. In addition, analytical blanks were run in the same way as the samples, and concentrations were determined using standard solutions prepared with the same acid matrix, to monitor the possibility of sample contamination during digestion and subsequent analysis. The absorption wavelengths and detection limits, respectively, were as follows: 228.8 nm and 0.0006 ppm for Cd; 240.7 nm and 0.007 ppm for Co; 324.7 nm and 0.003 ppm for Cu; 248.3 nm and 0.005 ppm for Fe; 279.5 nm and 0.003 ppm for Mn; 232.0 nm and 0.008 ppm for Ni; 217.0 nm and 0.02 ppm for Pb; and 213.9 nm and 0.002 ppm for Zn.
The accuracy of the Atomic Absorption Spectrometer measurements was assessed by analyzing the standard reference material NIST SRM 1646. The statistical parameters were calculated using the SPSS (Statistical Package for the Social Sciences) software package.

General Trends
The average concentrations of the metals considered in the sampling, in relation to the natural background values, are listed in Table 2. A wide range of values is observed for each metal. Spatial distribution maps of the metal concentrations in the analyzed samples are presented in Fig. 2. A comparison of the data shows that the average concentrations of the investigated heavy metals in the analyzed soil samples are, in many locations, higher than the corresponding values for average world soils (Table 3).

Concentrations of Metals in Soil Samples
Minimum and maximum concentrations, mean values and standard deviations are presented in Table 2 for each of the analyzed metals in the samples collected from the industrial areas in the GTA, together with the corresponding background values. The mean concentrations for the metals are as follows (in parts per million): Cd (0.41), Cr (392.95), Mn (1047.35), Fe (65012), Ni (0.22), Cu (82.9), Zn (174.3), and Pb (55.37). In general, the concentrations of the metals vary widely among the studied locations. For some metals the mean concentrations for the soil samples collected in industrial locations are higher than their background values, suggesting that the presence of these metals in the collected samples around the GTA is influenced by anthropogenic sources, such as industrial activity and traffic flows. Kurtosis values (Table 2) for some of the studied metals are higher than zero, indicating that the peak in the distribution is sharper than that of a normal distribution (CHEN et al., 2012). The skewness values (Table 2) of some metals (Fe, Ni, Cu) are larger than unity, demonstrating that these metals are positively skewed, with most values lying towards lower concentrations (LU et al., 2009; XIA et al., 2011). This observation is also confirmed by the fact that the arithmetic mean concentrations are greater than the corresponding median concentrations. As a result, the geometric means for these metals provide more useful data than the arithmetic means. Particular elements are discussed in detail below:
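The link between skewness, kurtosis and the choice of mean described above can be illustrated with a short sketch. The sample values are hypothetical and are not the study's data, and the function name `describe` is introduced here only for illustration. The sketch uses population moments without small-sample corrections, so its results may differ slightly from those reported by a package such as SPSS.

```python
import math

def describe(values):
    """Return mean, median, geometric mean, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    ordered = sorted(values)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    # Geometric mean via logarithms; requires strictly positive values.
    geo_mean = math.exp(sum(math.log(v) for v in values) / n)
    # Population central moments (no small-sample bias correction).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": median,
        "geo_mean": geo_mean,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # zero for a normal distribution
    }

# Hypothetical concentrations (ppm) with one high outlier; NOT the study's data.
sample = [20, 22, 25, 27, 30, 31, 35, 40, 60, 150]
stats = describe(sample)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

For a positively skewed sample like this one, the arithmetic mean is pulled upward by the outlier, while the median and the geometric mean stay closer to the bulk of the values.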

Lead
Lead (Pb), a naturally occurring bluish-gray metal, is found in small quantities in the Earth's crust. Much of the lead in the environment is derived from human activities including fossil fuel combustion, mining and manufacturing (NAZZAL et al., 2012). Due to health concerns, the lead content in gasoline, paint, ceramic products and pipe solder has been dramatically reduced in recent years. When released into the air, lead can travel long distances before settling to the ground, and it usually sticks to soil particles once it falls onto soils (HOWARI et al., 2004). In the study area, the Pb concentration ranges from 18 to 114 ppm. The critical concentration of lead in soils is between 100 and 400 ppm, and the global measured lead concentration in surface soils is estimated as 25 ppm (ALLOWAY, 1990) (Table 3). Locations 5, 14, 17, 21 and 23 exhibit concentrations of more than 100 ppm, and these values depend on such factors as industry type and size, traffic flows, prevailing winds and humidity (NAZZAL et al., 2014).

Zinc
Zinc, a heavy metal essential for life, acts biologically as a catalytic or structural component of numerous enzymes involved in energy metabolism and in transcription and translation (ALLOWAY, 1990; FORSTNER & WITTMAN, 1983). Zinc enters air, water and soils partly as a result of natural processes but mostly due to human activities, e.g., mining, zinc purification, steel production, and burning of coal and wastes (NAZZAL et al., 2012). The zinc concentration in the investigated soil samples ranges from 45 to 435 ppm (Table 2). Most of the locations (13, 14, 15, 16, 17, 18, 20, 22, 24, and 29) [...] (1982), zinc may be derived from mechanical abrasion and oil leaks from vehicles, so high concentrations in the study area are likely to be related to high traffic flows into and out of the industrial units.

Cadmium
Cadmium has no biological functions and is highly toxic. Environmental pollution from cadmium has increased rapidly in recent decades due to the rising use of cadmium by industry (ALLOWAY, 1990). The average natural abundance of cadmium in the Earth's crust is normally reported to be 0.1-0.5 ppm, but much higher and much lower values have also been cited, depending on various factors (HOWARI et al., 2004).

Multivariate Geostatistics
Geostatistical methods are useful when data are normally distributed around a stationary mean and variances do not vary significantly in space (SOUMYA et al., 2013). Significant deviations from normality and stationary means can cause problems. Plotting a histogram of the data first facilitates checking normality, and plotting the positions of the data values in space helps to reveal any significant trends. Data matrices were evaluated through Principal Component Analysis (PCA), allowing the summarized data to be further analyzed and plotted. Generally, three approaches are used to decide how many components to retain: the Cattell scree test (CATTELL, 1966), the Kaiser criterion (KAISER & RICE, 1974), and the variance explained criterion (percentage of variance).
We use the scree plot method to assess the number of principal components (PCs) to be obtained (Fig. 3). Plotting the eigenvalues (y axis) against the PC numbers (x axis) provides insight into the maximum number of components to be extracted. Based on their eigenvalues of 5.490 and 1.527 (Table 4), two principal components, PC1 and PC2, are selected (Table 5); they explain 68.6% and 19.1% of the total variance, respectively (Table 4). The application of the rotation matrix method leads to an increase in PC1 and a reduction in PC2 (Table 5).
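The percentages quoted above follow directly from the eigenvalues, as the sketch below illustrates. It assumes that the PCA was run on the correlation matrix of the eight analyzed metals, so that the total variance equals 8; the variable names are introduced here for illustration only. Retaining components with eigenvalues above 1 is the usual reading of the Kaiser criterion.

```python
# Eigenvalues of the first two components, as reported in the text.
eigenvalues = [5.490, 1.527]

# Assumption: eight standardized variables (the eight metals), so the
# eigenvalues of the full correlation matrix sum to 8.
n_variables = 8

# Percentage of total variance explained by each component.
explained = [100.0 * ev / n_variables for ev in eigenvalues]

# Kaiser criterion: keep components whose eigenvalue exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

cumulative = sum(explained)

for i, pct in enumerate(explained, start=1):
    print(f"PC{i}: {pct:.3f} % of total variance")
print(f"Cumulative: {cumulative:.3f} %")
print(f"Components retained by the Kaiser criterion: {len(retained)}")
```

Under this assumption the two components come out close to the reported 68.6% and 19.1%, and together account for nearly 88% of the total variance.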
The Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity give values of 0.82 and 315.9, respectively (Table 6), and the significance level is <0.001. These values demonstrate that factor analysis is suitable for this study. Factor 1 is attributed to industrial activity and traffic flow, which is consistent with the presence of various metal processing industries in the area (i.e., the GTA has many metallurgical and chemical industries, as shown in Table 1). The source of factor 2 is both anthropogenic and natural, including weathering and erosion.
The correlation matrix for heavy metals in the analyzed soil samples from the GTA is provided in Table 6. A correlation is considered significant when p ≤ 0.05 (95% confidence level) and highly significant when p ≤ 0.01 (99% confidence level).
A positive correlation between the metals is shown in Table 7. Cr is highly positively correlated with Mn, Fe, Cu, Zn, Cd and Pb. Mn is highly positively correlated with Fe, Cu, Zn, Cd and Pb. Fe is highly positively correlated with Cu, Zn and Pb. Cu is positively correlated with Zn, while Zn is positively correlated with Cd and Pb. This highly positive correlation among metals may suggest a common origin, such as traffic flow or industrial activity in the study area. PCA is applied to assist in identifying sources of pollutants. By extracting the eigenvalues from the correlation matrix, the number of significant factors and the percentage of variance explained by each of them are calculated using the software package SPSS 15.
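The pairwise correlations and significance levels discussed above can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the function names are introduced here, the paired values are hypothetical, and the critical t values are the two-tailed values from standard Student's t tables for 28 degrees of freedom, which assumes all 30 samples enter each correlation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r against zero, n - 2 degrees of freedom (needs |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Two-tailed critical t values for 28 degrees of freedom (n = 30 only).
T_CRIT_05 = 2.048
T_CRIT_01 = 2.763

def significance(r, n=30):
    """Classify a correlation; the critical values above are valid for n = 30 only."""
    t = abs(t_statistic(r, n))
    if t >= T_CRIT_01:
        return "highly significant (p < 0.01)"
    if t >= T_CRIT_05:
        return "significant (p < 0.05)"
    return "not significant"

# Hypothetical paired concentrations (ppm); NOT the study's data.
metal_a = [40, 55, 70, 90, 120]
metal_b = [100, 130, 160, 210, 260]
print(f"r = {pearson_r(metal_a, metal_b):.3f}")

for r in (0.3, 0.4, 0.5):
    print(f"r = {r}: {significance(r)}")
```

With 30 samples, a coefficient of 0.4 clears the 95% threshold but not the 99% threshold, which is the distinction the text draws between significant and highly significant correlations.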
Table 4 displays two components. The first explains the majority of the total variance and is loaded heavily with Cr, Mn, Fe, Zn and Pb. The source of this component may be industrial activities and traffic flows. This observation is also supported by the presence of various metal processing industries in the study area, in addition to traffic flow. Component 2 is loaded with Ni and Cd, the source of which could be lithology and traffic flow.
Before performing a cluster analysis, the variables are standardized by means of Z-scores; then Euclidean distances for similarities in the variables are calculated. Finally, hierarchical clustering is determined by applying Ward's method to the standardized data set. The results of the cluster analyses for the variables are shown in Fig. 4 as a dendrogram. This figure displays three major clusters, Mn-Zn, Pb-Cd-Cu-Cr and Fe-Ni, in agreement with the PCA results.
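The first two steps of the clustering procedure above, Z-score standardization and Euclidean distances between variables, can be sketched as follows. The concentrations are hypothetical and the helper names are introduced here for illustration. Ward's linkage itself is not implemented in this sketch; the resulting distances would be passed to a hierarchical clustering routine.

```python
import math

def z_scores(values):
    """Standardize values to zero mean and unit (sample) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical concentrations (ppm) at five stations; NOT the study's data.
data = {
    "Mn": [900, 1100, 1000, 1200, 950],
    "Zn": [150, 190, 172, 210, 160],
    "Ni": [0.30, 0.10, 0.20, 0.15, 0.35],
}

# Standardize each metal so that differences in units and magnitude
# do not dominate the distances.
standardized = {metal: z_scores(values) for metal, values in data.items()}

# Pairwise distances between metals (variables), not between stations.
metals = list(standardized)
distances = {
    (a, b): euclidean(standardized[a], standardized[b])
    for i, a in enumerate(metals)
    for b in metals[i + 1:]
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

In this made-up example Mn and Zn vary together across stations, so their standardized profiles are close and they would merge first in a dendrogram, while Ni would join at a much larger distance.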

Geo-accumulation Index
The geo-accumulation index (I_geo), introduced by MULLER (1979), is used to assess the extent of pollution in the analyzed soil samples from industrial locations in the GTA, and can be calculated as follows:

$$I_{geo} = \log_2\left(\frac{C_n}{1.5\,B_n}\right)$$

Here, C_n is the measured concentration of metal n in the soils, 1.5 is used as a factor to minimize possible variations in the background values, and B_n is the geochemical background value of metal n. Geochemical background values for the studied metals in soil media proposed by ALLOWAY (1990) were used as B_n.
Mean values of I_geo for each element are shown in Table 8. Based on the I_geo data and Muller's geo-accumulation index classes listed in Table 9, the main I_geo values are in the range of 0 to 1.67. The I_geo values indicate that the soil samples for the studied industrial locations in the GTA are slightly to moderately contaminated with Cr, Fe, Cu, Zn and Cd, and moderately contaminated with Pb. For Ni and Mn, I_geo falls in class 0, indicating that no pollution from these metals was recognized in the studied samples. These results indicate that the soil samples for the studied industrial locations in the GTA are enriched with potentially toxic metals from anthropogenic sources such as industry and traffic flow.
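The index and its classification can be sketched in a few lines. The function names and the input values are hypothetical and are not taken from the study. The class boundaries are the commonly cited Muller scheme (class 0 for values of zero or below, then one class per unit up to class 6 for values above 5), which is assumed here to match Table 9; the handling of values that fall exactly on a boundary varies between authors.

```python
import math

def geoaccumulation_index(concentration, background):
    """I_geo = log2(C_n / (1.5 * B_n)), with both values in the same units."""
    return math.log2(concentration / (1.5 * background))

def muller_class(i_geo):
    """Map an I_geo value to a Muller class from 0 to 6 (assumed boundaries)."""
    if i_geo <= 0:
        return 0
    return min(6, math.ceil(i_geo))

# Hypothetical measured and background concentrations (ppm); NOT the study's data.
examples = {
    "metal A": (75.0, 25.0),   # 75 / (1.5 * 25) = 2, so I_geo = 1
    "metal B": (30.0, 25.0),   # ratio below 1, so I_geo is negative
}

for name, (measured, background) in examples.items():
    i_geo = geoaccumulation_index(measured, background)
    print(f"{name}: I_geo = {i_geo:.2f}, class {muller_class(i_geo)}")
```

Because of the factor 1.5, a measured concentration must exceed 1.5 times the background value before the index becomes positive and the sample leaves class 0.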

Pollution Load Index
The extent of contamination by metals in soils is assessed using the Pollution Load Index (PLI) (TOMLINSON et al., 1980; SELVAM et al., 2012) with heavy metal data and world shale average values (TUREKIAN & WEDEPOHL, 1961) of the metals (BADR et al., 2009; RAY et al., 2006). The PLI provides a summative indication of the overall level of heavy metal pollution in a sample. The PLI was calculated for the investigation area for eight metals, considering the lowest toxicity of the most abundant metals. PLI values are calculated using the equation

$$PLI = \left(CF_1 \times CF_2 \times \cdots \times CF_n\right)^{1/n}$$

Here, CF denotes the contamination factor and n is the number of metals. The contamination factor is calculated as follows:

$$CF = \frac{C_{metal}}{C_{background}}$$

where C_metal is the metal concentration in the sample and C_background is the shale (background) value of the metal. The PLI values for all sites are found to range from 0.0 to 3.6 (Fig. 5). The lowest PLI values occurred at stations 6, 7, 8, 9, 10, 11, 12, 25, 26, 27 and 28, and the highest at stations 1, 5, 6, 13, 14, 16, 17, 18, 20, 22 and 24, while stations 5, 13, 16, 17, 18, 22 and 24 exhibit very high PLI values. This confirms, as discussed earlier, the importance of the type of industries (especially metallurgical and chemical ones) in the area, besides high traffic flows. Most of the industrial locations considered are close to the busiest highway in North America (Highway 401).
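As a minimal sketch of the calculation described above, the PLI of one station is the geometric mean of its contamination factors. The station and background values below are hypothetical and are not taken from the study or from any published shale table, and the function names are introduced here for illustration. A PLI above 1 is commonly read as indicating pollution relative to the background.

```python
import math

def contamination_factor(concentration, background):
    """CF = metal concentration in the sample / background value of the metal."""
    return concentration / background

def pollution_load_index(concentrations, backgrounds):
    """PLI = n-th root of the product of the n contamination factors."""
    factors = [
        contamination_factor(concentrations[metal], backgrounds[metal])
        for metal in concentrations
    ]
    return math.prod(factors) ** (1.0 / len(factors))

# Hypothetical values (ppm) for one station; NOT the study's data.
station = {"Pb": 20.0, "Zn": 100.0, "Cu": 50.0}
background = {"Pb": 10.0, "Zn": 50.0, "Cu": 25.0}

pli = pollution_load_index(station, background)
print(f"PLI = {pli:.2f}")
```

Because the index is a geometric mean, a single strongly enriched metal raises the PLI less than it would raise an arithmetic mean of the contamination factors, so high PLI values generally reflect enrichment in several metals at once.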

CONCLUSION
Heavy metal soil concentrations for industrial locations in the Greater Toronto Area (GTA), Ontario, Canada have been investigated. The mean concentrations (ppm) observed for the metals assessed are Cd (0.41), Cr (392.95), Mn (1047.35), Fe (65012), Ni (0.22), Cu (82.9), Zn (174.3) and Pb (55.37). The use of a correlation matrix, cluster analysis and principal component analysis to estimate the variability of the chemical content of the soil reveals the impact of industrial and anthropogenic activities on heavy metal concentrations in the soils of the study area. The geo-accumulation index and pollution load index results show that the main geo-accumulation index values are 0-1.67, suggesting that soil samples for the studied industrial locations in the GTA are slightly to moderately contaminated with Cr, Fe, Cu, Zn and Cd, and moderately contaminated with Pb, while Ni and Mn fall in class 0. The pollution load index values are lowest at stations 6, 7, 9, 10, 11, 12, 25, 27 and 28, and highest at stations 1, 5, 6, 13, 14, 16, 17, 18, 20, 22 and 24, while stations 5, 13, 16, 17, 18, 22 and 24 exhibit very high PLI values. It is concluded that the type of industry in the study area, especially metallurgical and chemical activity, together with traffic flows, is the main source of the ensuing soil pollution.

Figure 1: A map of the sampling and study location in the Greater Toronto Area. Values on horizontal axis denote latitude, and on vertical axis denote longitude.

Figure 2: Spatial distribution maps of metal concentrations (in ppm) for samples collected from different industrial locations in Greater Toronto. a) Cr, b) Cu, c) Cd, d) Fe, e) Mn, f) Ni, g) Pb, & h) Zn.

Table 2: Statistical summary of metal levels (in ppm) for the collected soil samples.

Table 3: Average crustal abundances and average world soil concentrations.

Table 4: Principal component analysis results.

Table 9: Geo-accumulation index (I_geo) and contamination levels (in parentheses) for selected metals in Greater Toronto Area soil samples.