Evaluating the capability of Worldview-2 imagery for mapping alien tree species in a heterogeneous urban environment

Abstract: Street trees have a long history in urban planning as providers of an amicable environment for urban dwellers. Nevertheless, street trees are not without challenges: their ecosystem disservices include, inter alia, cracked pavements and foundations caused by wandering roots that damage concrete or asphalt surfaces. Thus, effective mapping of street trees assists in planning a suitable urban environment to improve city life. The traditional method for urban tree mapping is costly, time-consuming and labour intensive. However, commercially operated multispectral sensors, such as WorldView (WV), provide a more viable way to map trees at the species level. This study investigates the use of WV-2 imagery in the classification and mapping of five common alien street trees in a complex urban environment. It also examines the feasibility of Random Forest (RF) and Support Vector Machines (SVM) classifiers in mapping street trees in a heterogeneous urban environment. The classifiers produced an overall accuracy of 84.2% for RF and 81.2% for SVM. This study provides a detailed understanding of urban tree species to the municipality of Johannesburg and offers environmental managers insight into classification methods for mapping trees using satellite imagery to comprehend their spatial distribution.

ABOUT THE AUTHORS
Simbarashe Jombo is a PhD candidate at the University of the Witwatersrand, Johannesburg. His research interests are in the use of GIS and remote sensing techniques in environmental management. He has knowledge in advanced satellite image processing, mapping, geospatial analysis and modelling, land use and land cover mapping and related accuracy assessment.
Dr Elhadi Adam is a senior lecturer at the University of the Witwatersrand, Johannesburg, and his expertise lies in applications of remote sensing and GIS in applied environmental management.
Professor Marcus J. Byrne works at the School of Animal, Plant and Environmental Sciences, University of the Witwatersrand. He won the 2013 Ig Nobel Prize for Biology/Astronomy on dung beetles. He does research in Entomology and Zoology.
Dr Solomon Newete is a Senior Researcher at the Agricultural Research Council (ARC) with research interests in invasive alien plants, environmental pollution and phytotechnology, agriculture and food security, and remote sensing technology.

PUBLIC INTEREST STATEMENT
This study was conducted to evaluate the strength of the Worldview-2 satellite imagery in mapping urban alien tree species in the city of Johannesburg. The study provides an understanding of mapping common alien tree species showing their spatial distribution in the study area. The results of this study are of paramount importance in the development of a tree inventory, planning and management of urban tree species. The output map could be of great help to urban forest and municipal managers as well as various stakeholders involved in the management of urban tree species. This study provides information on the type of data and classification methods that are useful in the mapping of tree species in a heterogeneous urban environment.

Introduction
The mapping of tree species in an urban environment is vital for planning because it strengthens our knowledge of their ecological functions in these environments (Liu et al., 2017). Urban trees provide ecosystem services such as mitigating urban heat island effects (Yan et al., 2018), reducing air and noise pollution and potential flooding, and supporting biodiversity (Seiferling et al., 2017). Urban trees also offer essential social, economic and psychological amenities to human beings, such as increased property values and improved physical and mental health (Tyrväinen et al., 2005). Even though urban trees offer various advantages, there is a debate about the use of alien tree species in cities.
The major debate over the usefulness of alien vs indigenous tree species necessitates precise identification and mapping of trees, which is essential for city or metropolis managers to develop and sustain urban planning strategies (Li et al., 2015). Alien trees were introduced into South Africa for recreation parks, soil stabilisation, erosion control, timber use, shelter, beekeeping, ornamental use, firewood and commercial purposes, amongst others (Henderson, 2001; Zengeya et al., 2017). The alien trees in South Africa are native to Europe, Australia, Central America, and north-west Argentina (Henderson, 2001). Accurate spatial data on urban trees are required to assess the extent and distribution of tree species and to estimate their ecosystem services and disservices (Nowak et al., 2008).
Urban tree inventories based on spatial data have long been established using field-based surveys and visual interpretation of aerial photography (Myeong et al., 2006). These techniques work well on local and small-scale levels (Alonzo et al., 2014). However, the process is tedious, labour intensive and costly, especially at large scales and in heterogeneous environments (Alonzo et al., 2016). Remote sensing technologies offer a solution to these problems (Fassnacht et al., 2016; Liu et al., 2017) because they are efficient, robust, repeatable and rapid in monitoring urban tree dynamics (Myeong et al., 2006). The use of hyperspectral imagery has overcome time and labour challenges in mapping urban tree species (Liu et al., 2017). These challenges arise from the complexity of features and the land cover heterogeneity of urban areas (Li et al., 2015), the diversity of the species and variations in their spatial characteristics (Alonzo et al., 2014). On the other hand, hyperspectral data usage is constrained by availability, the high dimensionality of data, computer processing time and cost.
Commercially operated sensors such as RapidEye and WorldView provide multispectral and panchromatic imagery globally, at a high spatial resolution for analysing tree species (Ke & Quackenbush, 2011; Richardson & Moskal, 2014). Their efficiency in classifying trees in a complex urban setting justifies their continued use (Novack et al., 2011). WorldView-2 (WV-2) satellite imagery with good radiometric, spectral and geometric resolution (up to eight bands) has produced satisfactory results in urban tree species mapping. The recent studies by Li et al. (2015), Yan et al. (2018), and Shojanoori et al. (2016) showed high levels of accuracy in mapping urban tree species using WV-2 satellite imagery. The performance of WV-2 satellite data depends on the selected remote sensing data characteristics and landscape complexity, as well as on the methods used for processing and classifying the images to provide precise thematic maps and high accuracy values (Lu & Weng, 2007).
Image appropriateness and the classification method used are essential requirements to obtain optimal accuracies (Lu & Weng, 2007). Previous studies utilized different machine learning algorithms such as random forest (RF), decision trees, support vector machines (SVM) and k-Nearest Neighbors (kNN) to classify satellite imagery (Thanh Noi & Kappas, 2017). Machine learning algorithms, RF in particular, have been used in many studies (Li et al., 2015; Puissant et al., 2014) and performed well in classifying tree species in urban environments. SVM have also increasingly been utilized in natural vegetation classification (Krahwinkler & Rossman, 2011; Li et al., 2015; Niska et al., 2010; Pal, 2005; Thanh Noi & Kappas, 2017). Furthermore, RF and SVM classification methods have been utilized mostly to classify images using the pixel-based approach, but a few studies used both algorithms in an object-based image analysis (OBIA) and achieved satisfactory results (Li et al., 2015; Mustafa et al., 2015; Yan et al., 2018).
The City of Johannesburg (CoJ) is considered the largest man-made forest in the world (Schäffler & Swilling, 2013). The city is predominantly planted with alien trees that appeared with the arrival of the Europeans in the 19th century (Schäffler & Swilling, 2013). Improved classification accuracy can provide us with a better understanding of the health of all trees planted in the CoJ. The alien tree species are subject to many stresses, which include soil, water and air pollution, soil sealing and compaction (Paap et al., 2017; Pautasso et al., 2015). These stresses make trees susceptible to pathogens and insects such as the polyphagous shot hole borer (Euwallacea fornicatus), an ambrosia beetle which is affecting urban alien trees in the CoJ, including Platanus spp. (Paap et al., 2018), Jacaranda spp., Quercus spp. and Eucalyptus spp. (Forestry & Agricultural Biotechnology Institute, 2019). Data captured via WV-2 satellite imagery can be used to identify such affected areas, which could prove invaluable for urban planners and other research groups.
The CoJ is believed to have over 10 million trees, with over 4.5 million in private gardens and about 2.5 million trees in conservation areas, parks, nature reserves, city's pavements and cemeteries (Johannesburg City Parks & Zoo, 2018). Nevertheless, management of these trees depends on anecdotal information on the urban tree species in the CoJ. To address this issue, this study assessed the ability of remote sensing to map and classify urban alien tree species in the Johannesburg area using high-resolution multispectral data. The specific objectives of the study are: i) to examine the effectiveness of RF and SVM in mapping urban alien tree species and other land use and land cover (LULC) classes using the WV-2 imagery in Johannesburg suburb; ii) test the utility of the spectral bands of WV-2 in mapping urban alien tree species and LULC classes, and iii) compare the performance of RF and SVM classification methods in mapping urban alien tree species and LULC classes using high-resolution WV-2 image.

Study area
This study was conducted in the Randburg municipal area (167.98 km²), which is in Region B of the CoJ (Figure 1). The annual rainfall in the CoJ is approximately 750 mm, with potential evaporation of around 1600 mm per annum (Tyson & Wilcocks, 1971). The rain mostly falls during the summer season, which runs from October to February, and average temperatures are around 15 ºC in winter and 20 ºC in summer (Naicker et al., 2003). The mean altitude of the CoJ is 1512 m, with altitudes ranging between 1081 m and 1899 m (Grobler et al., 2002). The soil in this area is composed of Glenrosa and Mispah soil forms (Grobler et al., 2002), and the rocks are crystalline of Archean age, which underlie the whole of Johannesburg (Abiye et al., 2018). The rocks in the area are mainly granitic gneiss, meta-volcanic and metasedimentary rocks (Abiye et al., 2018).
The most common and fast-growing alien trees in the study area are black wattle, jacaranda, eucalyptus, pepper trees, oaks and London planes which were introduced in the 19th century under the tree planting project (Turton et al., 2006). There are indigenous trees in the CoJ, but primarily non-indigenous tree species were planted during the colonial-era resulting in the spatial diversity of trees in the city (Schäffler & Swilling, 2013).

Remote sensing data acquisition
This study is based on remote sensing and field data obtained in spring 2017. A total of 36 scenes of WV-2 satellite imagery, captured on 15 September 2017, and field data collected in November of the same year were used. The scenes were supplied free of charge by the CoJ Corporate Geo-Informatics, Development Planning and Urban Management (DP&UM) department, with radiometric and geometric corrections already done. The scenes were mosaicked into a single WV-2 image using ArcMap 10.5 software. The WV-2 image provides high spatial resolution data, with a swath width of 16.4 km at nadir, eight multispectral bands, and a 2 m spatial resolution. The eight multispectral bands cover the NIR1, NIR2, red edge, red, yellow, green, blue and coastal wavelength regions (Table 1). Furthermore, the panchromatic band has a spatial resolution of 0.46 m and lies within a spectral range of 450-800 nm. The WV-2 image was provided in Hartebeesthoek WGS84 coordinates (central meridian 29, Transverse Mercator projection).

Field data collection
A handheld outdoor Global Positioning System (GPS) receiver (Garmin eTrex 20X) was used to collect field data and determine the coordinates for the dominant alien tree species. Due to the dearth of precise data on the spatial distribution of urban trees in South Africa, stratified purposive sampling was used to visit areas dominated by five dominant alien tree species (Eucalyptus spp., Jacaranda spp., Platanus spp., Quercus spp. and Pinus spp.). The study area was divided into zone segments and samples were taken, ensuring that every block had an equal chance of being selected. Stratification permits accuracy by providing a combination of the results from random samples in the chosen zone segments (Jaenson et al., 1992). In addition, this method captures major differences rather than identifying a common species, even though the latter may appear in the analysis (Palinkas et al., 2015). The zigzag sampling method was used to provide a uniform distribution of the sampling sites (Ryan et al., 2007), and it identified the urban alien tree species, which were planted mostly along the streets in the study area. Additional ground truth points were also collected for the common land use/cover classes (Table 2). The LULC classes were added due to their importance to urban planners in understanding the spatial distribution of urban tree species in relation to the city plan and structure.
The sample size for the five urban alien tree species and additional LULC classes was 1369 (Table 2). The sample was divided into a training set (70%) of 964 and a test set (30%) of 405.
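The 70/30 division described above can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the study: whether the split was stratified by class is not stated, so the per-class splitting here is an assumption, and the function name `stratified_split` and the class counts are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # random order within the class
        cut = round(train_fraction * len(members))
        train_idx.extend(members[:cut])
        test_idx.extend(members[cut:])
    return train_idx, test_idx

# Hypothetical class sizes for illustration only (not the Table 2 counts).
labels = ["Eucalyptus"] * 50 + ["Jacaranda"] * 30 + ["built-up"] * 20
train_idx, test_idx = stratified_split(labels)
```

Splitting within each class keeps rare classes represented in both sets, which matters when some species have few reference points.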
The spectral characteristics of the five dominant urban alien tree species considered in this study are shown in Figure 2. The average spectrum was extracted from WV-2 imagery pixels, with n = 40 for each dominant alien tree species.

Data analysis and classification
The ability of RF and SVM machine learning algorithms to map urban alien trees was assessed in this study. The pixel-based classification techniques were used to map the urban alien trees using the machine learning algorithms. Both classifiers (RF and SVM) were trained with the training sample set, while the test set was used to measure the accuracy of the classification maps. The training and test data obtained from the WV-2 image in the study area are shown in Table 2. The RF and SVM classifiers' parameters were analysed with the open-source statistical software R for classification of the WV-2 imagery.

Random Forest (RF) classifier
The RF machine learning algorithm for statistical data analysis was put forward by Breiman (2001) as an improvement on classification and regression trees (CART). RF builds numerous unpruned trees (ntree), each on a bootstrap sample of the original data. The default ntree value in the "randomForest" package in R statistical software is 500. The user defines the ntree value, and the RF algorithm creates trees with low bias and high diversity (Breiman, 2001). About two-thirds of the original data, drawn as a bootstrap sample and known as "in-bag" samples, are used to train each tree, and the remaining third forms the "out-of-the-bag" samples (Belgiu & Drăguţ, 2016). The "out-of-the-bag" samples estimate how well the RF classifier performs in a cross-validation approach, and the resulting estimate is known as the out-of-bag (OOB) error (Belgiu & Drăguţ, 2016).
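The in-bag and out-of-bag division can be illustrated with a short Python sketch. It only demonstrates bootstrap sampling for a single tree and is not the R code used in the study; the function name `bootstrap_split` is hypothetical. Sampling n items with replacement leaves each item out with probability close to 1/e, so roughly 63% of the samples end up in-bag and 37% out-of-bag, which matches the "two-thirds/one-third" description.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag) sets."""
    in_bag = {rng.randrange(n_samples) for _ in range(n_samples)}
    out_of_bag = set(range(n_samples)) - in_bag
    return in_bag, out_of_bag

rng = random.Random(42)
n = 964  # size of the training set reported in this study
in_bag, oob = bootstrap_split(n, rng)
oob_fraction = len(oob) / n  # close to 1/e for large n
```

In a full forest, every tree draws its own bootstrap sample, and each training sample is evaluated only by the trees for which it was out-of-bag.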
Furthermore, in the RF algorithm, a random subset of variables is considered for splitting at each node of a tree. The size of this subset is referred to as mtry, and its default value is √P, where P is the total number of predictor variables (Breiman, 2001). Among the mtry variables chosen, the variable yielding the greatest decrease in impurity is selected for the splitting of samples at each node (Breiman, 2001). Both the ntree and mtry parameters need to be set to grow the forest, and the optimization of these parameters can improve the classification accuracy values (Belgiu & Drăguţ, 2016; Mutanga et al., 2012).
In this study, a 10-fold cross-validation technique dependent on the OOB error was used to identify the best ntree and mtry parameters. The ntree values tested in this study varied from 500 to 10 000 at intervals of 1000, while the mtry values ranged from 1 to 8. A low OOB error shows that the RF classification method performed well, whereas high OOB error values show poor performance.
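The search over ntree and mtry can be sketched as a simple grid in Python. This is an illustration under stated assumptions: the ntree sequence is read as starting at 500 and increasing in steps of 1000, the OOB error values are placeholders rather than results, and `best_parameters` is a hypothetical helper.

```python
from itertools import product

ntree_values = list(range(500, 10001, 1000))  # 500, 1500, ..., 9500 (assumed reading)
mtry_values = list(range(1, 9))               # 1 to 8, one per possible number of bands
grid = list(product(mtry_values, ntree_values))

def best_parameters(oob_errors):
    """Return the (mtry, ntree) pair with the lowest OOB error."""
    return min(oob_errors, key=oob_errors.get)

# Placeholder errors: hypothetical values, only to show the selection step.
oob_errors = {pair: 0.25 for pair in grid}
oob_errors[(3, 2500)] = 0.20
best = best_parameters(oob_errors)
```

In practice each entry of `oob_errors` would come from training a forest with that parameter pair and recording its OOB error.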
In addition, the RF algorithm provides a variable importance (VI) measurement, which can be calculated in different ways, including the Mean Decrease in Accuracy (MDA) or the Mean Decrease in Gini (MDG) (Breiman, 2001). In the RF model establishment, the computing time required is calculated using Eq. 1 below:
$$T\sqrt{M}\,N\log(N) \tag{1}$$
where M is the number of variables applied in each split, N represents the total number of training samples, and T stands for the total number of trees (Belgiu & Drăguţ, 2016; Breiman, 2001).
The RF classifier has been used in other studies to successfully map urban buildings (Belgiu & Drăguţ, 2016), boreal forest habitats (Räsänen et al., 2013), tree, biomass and crown cover (Karlson et al., 2015) as well as urban tree species (Li et al., 2015; Liu et al., 2017; Puissant et al., 2014). This illustrates the usefulness of the RF classifier for mapping urban alien tree species in the current study. In this study, the RF classifier in R was utilized in mapping urban alien tree species in the study area.

Support Vector Machines (SVM) classifier
The Support Vector Machines (SVM) classifier does not require an assumption concerning data distribution and does not overfit the new or test data samples by applying viable principles (Abdel-Rahman et al., 2014). The SVM machine-learning algorithm, initially proposed by Vapnik (1995), focuses on the training sites which are closest to the optimal boundary between classes in the attribute space (Maxwell et al., 2018). In addition, SVM is a supervised, non-parametric and non-linear classification approach that is broadly employed in remote sensing research because of its capacity to derive accurate results even with a small number of training samples or sites (Mountrakis et al., 2011). The data points that lie on the supporting hyperplanes are known as support vectors, and the optimal hyperplane is positioned in the middle of the margin. Additionally, the algorithm uses kernel functions to map the data from the original input space into a high-dimensional feature space (Mountrakis et al., 2011).
In recent years, SVM's four most commonly utilized kernel functions are the polynomial, sigmoid, linear and radial basis function (RBF) kernels (Qian et al., 2014). The current study makes use of the radial basis function (RBF) kernel since it lends itself to successfully classify urban land cover (Qian et al., 2014) and tree species (Raczko & Zagajewski, 2017). The RBF kernel consists of two significant parameters, which are the "gamma" (γ), and "cost" (C), which play an essential role in determining the overall classification accuracy value (Qian et al., 2014).
We utilized all eight WV-2 bands and the RBF kernel to find an optimal hyperplane that could distinguish all the LULC classes, including the five target tree species of this study. The γ and C parameters of the RBF function were optimized using the LIBSVM library to obtain the best parameter values. The SVM classifier was run in R statistical software version 3.4.4, and the "e1071" library was used in the optimization of SVM parameters.
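For reference, the RBF kernel has the standard form K(x, y) = exp(−γ‖x − y‖²), which is the form implemented in LIBSVM. The Python sketch below evaluates it for two pixels described by eight band values. The reflectance values are made up for illustration, and the function name `rbf_kernel` is hypothetical; the study itself used R.

```python
import math

def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Two hypothetical 8-band pixel spectra (made-up reflectance values).
pixel_a = [0.05, 0.06, 0.08, 0.07, 0.06, 0.20, 0.45, 0.48]
pixel_b = [0.05, 0.07, 0.09, 0.08, 0.07, 0.18, 0.40, 0.42]
similarity = rbf_kernel(pixel_a, pixel_b, gamma=1.0)
```

A larger γ makes the kernel value fall off faster with spectral distance, so the decision boundary follows the training samples more closely, while C controls the penalty for misclassified training samples.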

Accuracy assessment and validation
The accuracy of any classification algorithm in remote sensing is judged by the prediction performance testing results (Odindi et al., 2016). Thus, the RF and SVM machine-learning algorithms were utilized to assess their efficiency using the test dataset (30%) in confusion matrices, where the kappa coefficient, user's, overall and producer's accuracy values were derived on the classified maps to ascertain the level of reliability and accuracy (Jombo et al., 2017).
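The accuracy measures named above follow directly from a confusion matrix. The Python sketch below assumes rows hold the reference classes and columns the predicted classes (the orientation is an assumption), and uses a toy two-class matrix rather than the values of this study; `accuracy_metrics` is a hypothetical helper.

```python
def accuracy_metrics(matrix):
    """matrix[i][j] = number of samples of reference class i predicted as class j."""
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(n_classes))
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n_classes)) for j in range(n_classes)]
    overall = diagonal / total
    # Producer's accuracy: correct samples divided by reference totals.
    producers = [matrix[i][i] / row_totals[i] for i in range(n_classes)]
    # User's accuracy: correct samples divided by predicted totals.
    users = [matrix[j][j] / col_totals[j] for j in range(n_classes)]
    # Kappa compares observed agreement with agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    return overall, producers, users, kappa

toy = [[8, 2], [1, 9]]  # toy counts for illustration only
overall, producers, users, kappa = accuracy_metrics(toy)
```

For the toy matrix, the overall accuracy is 0.85 and kappa is 0.70; the function assumes every class has at least one reference and one predicted sample.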
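Moreover, the significance of the difference between the results from the SVM and RF classifiers was tested in this study using McNemar's test. The test is non-parametric, based on a normal test statistic, and calculated from the error matrices of the two classification methods. It is measured as shown below in Eq. 2:
$$z = \frac{f_{12} - f_{21}}{\sqrt{f_{12} + f_{21}}} \tag{2}$$
where $f_{12}$ is the number of test samples correctly classified by the first classifier but misclassified by the second, and $f_{21}$ is the number misclassified by the first classifier but correctly classified by the second.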
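A minimal Python sketch of the z form of McNemar's test for two classifiers evaluated on the same test samples is given here. The counts f12 and f21 are the samples on which exactly one of the two classifiers is correct; the label lists are toy data, and `mcnemar_z` is a hypothetical helper, not the code used in the study.

```python
import math

def mcnemar_z(reference, predicted_1, predicted_2):
    """z = (f12 - f21) / sqrt(f12 + f21) for two classifiers on the same samples."""
    f12 = f21 = 0
    for ref, p1, p2 in zip(reference, predicted_1, predicted_2):
        if p1 == ref and p2 != ref:
            f12 += 1  # classifier 1 correct, classifier 2 wrong
        elif p1 != ref and p2 == ref:
            f21 += 1  # classifier 1 wrong, classifier 2 correct
    if f12 + f21 == 0:
        return 0.0  # the classifiers never disagree in correctness
    return (f12 - f21) / math.sqrt(f12 + f21)

reference = ["a", "a", "b", "b", "a", "b"]
predicted_1 = ["b", "a", "b", "a", "a", "b"]
predicted_2 = ["a", "b", "b", "a", "b", "a"]
z = mcnemar_z(reference, predicted_1, predicted_2)
significant = abs(z) > 1.96  # two-sided test at the 5% level
```

Samples that both classifiers get right or both get wrong do not enter the statistic, so the test only reflects where the two maps disagree.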

RF parameter tuning
The best input parameters for the classification of the five urban tree species and LULC classes were determined through optimization during training of the RF machine-learning algorithm. The 10-fold cross-validation sampling method was utilized in this study, and the best combination of the mtry and ntree parameters had values of 2 and 6500, respectively (Figure 3). This combination produced the lowest OOB error rate of 19.2%, whereas the highest OOB error rate of 21.2% was obtained with an mtry of 8 and an ntree of 3500.

SVM parameters tuning
The best parameters of the γ and C RBF kernel parameters were found through optimization for classification using the SVM method. Therefore, the best γ and C parameters were obtained with the use of the 10-fold cross-validation sampling technique. Figure 4 illustrates that the best parameters for γ and C were 1 and 100, respectively, using the radial SVM-kernel and 521 support vectors.

RF and SVM in urban tree species mapping
The maps showing the spatial distribution of the urban alien tree species and other LULC classes were produced for both the RF and SVM classifiers (Figure 5). The classified maps showed that the built-up class covers the largest area of land as compared to other classes. The other woody vegetation class covers mostly the southern part of the study area (Figure 5). The Platanus spp. covered the largest area as compared to all the other urban alien tree classes, with values of 8.38% and 8.11% using the RF and SVM classifiers, respectively (Table 3). The Platanus spp. were mainly found in Pine Park and Linden suburbs, situated in the southern side of the study area (Figure 5).
Furthermore, most of the Eucalyptus spp. are situated in the northern part of the study area in suburbs such as Bloubosrand and Fleurhof, while most of the Jacaranda spp. were in the southern part of the study area in suburbs like Randpark Ridge, Fairland and Windsor East. The Pinus spp. and Quercus spp. were mostly located in the Greenside and Parkview suburbs (southern part). All the LULC classes (other woody vegetation, built-up, grassland, shadow and bare land) were spread across all regions of the study area.
The two classifiers (RF and SVM) showed some minor inconsistencies in the size of each class. The differences in the proportion of area covered by the five alien tree species classified using the two models were less than 1% (Table 3). However, the area of coverage (ha) of individual species showed some notable differences between the classifiers. For example, the Eucalyptus spp. covered an area of 737.96 ha for SVM and 576.54 ha for the RF classifier (Table 3). This gives a difference of 161.42 ha of total area coverage between the two classifiers for the Eucalyptus spp. The Quercus spp. had a difference of 164.08 ha since the SVM showed a total coverage area of 409.80 ha and RF, 245.71 ha (Table 3). The values for the SVM classifier were generally higher than the ones for the RF classifier. There are also differences in the area covered by other LULC classes. For example, the built-up area showed a difference of 769.19 ha, as RF showed total coverage of 5489.70 ha, whilst SVM showed 4720.51 ha for the same class. The RF classifier showed that the built-up and other woody vegetation were the dominant classes in the study area (Figure 5), whereas Quercus spp. and Eucalyptus spp. were the less dominant classes (Table 3). Shadow and Quercus spp. were the least dominant classes in the map classified using the SVM classifier (Table 3).
The RF classifier provided the variable importance (Figure 6), indicating the part each band played in the classification procedure. The most important bands are the ones with the highest Mean Decrease in Accuracy (MDA) values. The utility of every single WV-2 spectral band was evaluated in mapping all the classes in the study (Figure 7). As shown by the high MDA values in Figure 7, the NIR and red edge bands were the most important in the classification of the five urban alien tree species (Eucalyptus spp., Jacaranda spp., Platanus spp., Quercus spp. and Pinus spp.). Similarly, areas with trees or vegetation predominantly fell on the red edge and NIR bands, which are valuable in the discrimination of plant species in classification, biomass and vegetation analysis. As illustrated in Figure 7, the green, red and yellow bands were the most important contributors in mapping the bare land areas. The red, yellow and coastal bands were more significant in the classification of the built-up class, whereas the red edge and NIR bands were important in classifying the grasslands (Figure 7). The blue and two red bands were the most important bands for the other woody vegetation, whilst the NIR and red edge bands were the most important bands for detecting the shadow class.

Accuracy assessment and validation
The accuracy levels for both RF and SVM classifiers were obtained from the confusion matrices of the two classifiers, using the 30% test set on the WV-2 imagery (Table 4). The user's accuracy values for RF ranged between 61.11% (Platanus spp.) and 100% (grassland) (Table 4), while the producer's accuracy ranged from 55% (Platanus spp.) to 100% (Quercus spp.). The RF values for the user's and producer's accuracy were therefore both lowest for the Platanus spp., at 61.11% and 55%, respectively (Table 4). The user's accuracy values for the SVM classifier ranged from 50% (shadow) to 100% (grassland), whilst the producer's accuracy ranged from 42.86% (Quercus spp.) to 98.81% (built-up) (Table 4).
The overall accuracy assessment based on the independent test dataset was 84.2% for RF, with a kappa coefficient of 0.82 (Table 4), while that of the SVM classifier was 81.2%, with a kappa coefficient of 0.78 (Table 4).
The results from McNemar's test showed that, at a 5% significance level, the confusion matrices of RF and SVM show no significant difference (Table 5). The z value obtained was −1.64 (Table 5), whose absolute value is less than the critical value of 1.96. Some of the areas classified using the RF classifier were almost similar to those obtained using the SVM classifier (Figure 5).

Discussion
Worldview-2 satellite imagery was used to map five common alien urban trees using RF and SVM classification algorithms. The most influential bands for mapping alien tree species and LULC classes were the Red and the NIR bands ( Figure 6). Thus, areas with vegetation cover predominantly fall in the NIR and red regions of the WV-2 imagery. Mean decreasing accuracy (MDA) for each variable was obtained during the OOB calculation (Liu et al., 2017). Although for many years remote sensing data has been used in studying ecological and environmental phenomena, beneficial information has not successfully given fine spatial extent as well as temporal measures (Anderson & Gaston, 2013). Nevertheless, the traditional photointerpretation methods used in the past were time and labour-intensive, and costly, especially for data acquisition over large geographical areas and more importantly such aerial photographs and pixel-based image classification approaches are not well suited for LULC mapping in a heterogeneous urbanized landscape (Cleve et al., 2008). However, Zhang et al. (2015) indicated that a problem of mixing various impervious surface areas and vegetation in one pixel for medium spatial resolution images might exist, for example, Landsat TM imagery. However, very high-resolution WV-2 images have resolved this problem.
This study has shown that Eucalyptus spp. were mostly found in the southern part of the study area in places such as Robertville, where the mining sites such as Rand Leases and Main Reef are located. These trees were planted to stabilize the soil and reduce soil erosion. Eucalyptus spp. in other areas, were also planted for timber, pulp, poles (Van Wilgen & Richardson, 2014), mine props and excavation (Turton et al., 2006). Jacaranda spp., Platanus spp. and Quercus spp. were found in areas such as Randpark ridge, Fairland, Pine Park, Linden and Windsor East, situated in the southern part of the study area. In addition, it is significant to bear that these areas were among the affluent suburbs of Johannesburg inhabited by the white community during the apartheid era before 1994, many of which were planted in these neighbourhood streets for beautification and settling dust (Turton et al., 2006). The Pinus spp. were situated in areas such as Parkview and Greenside, especially in and around open spaces such as Parkview Golf Club and Johannesburg Botanical Gardens and were introduced in these areas for ornamental or recreational values.
Our results indicate that the multispectral WV-2 sensor is suitable in urban tree species mapping using RF and SVM classifiers with overall accuracies of 84.2% and 81.2%, respectively. The RF user accuracy values for the Eucalyptus, Jacaranda, Pinus and Quercus species were higher than those in the SVM classifier (Table 4). The RF user accuracy value for Platanus spp. (61.11%) was less than the one for SVM (62.50%). This was mainly due to the misclassifications of Platanus spp. as other woody vegetation, Jacaranda spp. or Quercus spp. classes (Table 4). The misclassifications of the Platanus spp. were maybe due to errors in the reference dataset as it was challenging to find these tree species of concern in the area of study as they were not usually found as pure stands. This resulted in differences in the area of coverage, with the SVM classifier having Platanus spp. covering 4.68% of the total area while RF had 4.98% (Table 3). The misclassifications due to errors in the reference dataset were one of the reasons leading to the dissimilarities in the area of coverage for all classes classified by the RF and SVM. The RF producer's accuracy values for four of the five urban tree species (Eucalyptus spp., Platanus spp., Pinus spp. and Quercus spp.) were higher than those for SVM. For example, SVM's producer's accuracy value for Jacaranda spp. (85.19%) was higher than that for RF (62.96%). Moreover, the low producer's and user's accuracy values for Platanus spp. and Quercus spp. for both algorithms (RF and SVM) could be due to a small number of reference polygons as illustrated in Table 2. It has been conclusively shown that a small number of reference polygons make it challenging in characterizing spectral variability of individual tree species due to their unique phenological ages or stages all along the slope (Le Louarn et al., 2017). 
In addition, misclassifications were also due to the diversity of tree species in this urbanized study area, which leads to a "mixed pixel problem" where some pixels are not entirely occupied by one homogeneous class (Salih et al., 2017).
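The user's, producer's and overall accuracy values discussed above are all derived from a confusion matrix. The following minimal sketch illustrates the arithmetic. The three-class matrix is hypothetical and is not taken from Table 4, and the row/column convention (rows as classified labels, columns as reference labels) is an assumption made for this illustration.

```python
# Hypothetical confusion matrix for illustration only (NOT the study's Table 4).
# Assumed convention: rows = classified (map) labels, columns = reference labels.
classes = ["Jacaranda", "Platanus", "Quercus"]
matrix = [
    [20, 4, 1],
    [2, 12, 2],
    [3, 4, 12],
]


def accuracy_metrics(matrix):
    """Return (user's accuracies, producer's accuracies, overall accuracy)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # User's accuracy: correctly classified / all samples classified as that class
    # (the complement of the commission error).
    users = [matrix[i][i] / row_totals[i] if row_totals[i] else 0.0 for i in range(n)]
    # Producer's accuracy: correctly classified / all reference samples of that class
    # (the complement of the omission error).
    producers = [matrix[j][j] / col_totals[j] if col_totals[j] else 0.0 for j in range(n)]
    overall = correct / total
    return users, producers, overall


users, producers, overall = accuracy_metrics(matrix)
for name, ua, pa in zip(classes, users, producers):
    print(f"{name}: user's = {ua:.2%}, producer's = {pa:.2%}")
print(f"Overall accuracy = {overall:.2%}")
```

A class can therefore have a high producer's accuracy and a low user's accuracy at the same time, which is why the two classifiers can rank differently on the two measures for the same species.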
The RF user's accuracies for the LULC classes were between 77.36% and 100%, while those of SVM were between 50% and 100% (Table 4). RF had producer's accuracy values ranging from 75% to 96.88% for all LULC classes, whereas those of the SVM classifier for the same classes were between 53.85% and 98.81%. The low SVM producer's accuracy value of 53.85% (shadow) was due to misclassifications between this class and the Quercus spp., Pinus spp. and other woody vegetation classes (Table 4). These misclassifications may be due to errors in the reference dataset, where some points for the shadow class were assigned to some of the urban tree species. The user's and producer's accuracy values might have been higher had tree species diversity in the study area been lower. The identification, classification and mapping of individual tree species in urban areas have largely been restricted to airborne LiDAR and hyperspectral data, as these produce high accuracy values due to their higher spectral and spatial resolutions compared to multispectral data (Fassnacht et al., 2016; Liu et al., 2017; Raczko & Zagajewski, 2017). However, hyperspectral data are costly, time-consuming to process, require large storage space and suffer from multi-collinearity (Odindi et al., 2016). Recently, multispectral remote sensing imagery characterised by finer spectral and spatial resolutions has improved classification results (Odindi et al., 2016). Thus, the advent of new-generation WV-2 satellite imagery, which costs relatively less, offers new perspectives due to its increased radiometric, geometric and spectral (8 bands) resolutions and can map trees at the species level (Waser et al., 2014). The performance of satellite data and classification models used in mapping urban tree species varies significantly across regional environments (Le Louarn et al., 2017).
Therefore, such models cannot be directly transferred across different urban environments and vegetation phenological stages. Further comparison of different satellite data for mapping urban tree species is needed to achieve better accuracies at the lowest cost.
Our study showed that RF marginally outperformed the SVM classifier, by 3 percentage points in overall accuracy. This result agrees well with previous studies (Le Louarn et al., 2017; Puissant et al., 2014) in which RF outperformed SVM in mapping urban tree species. The high accuracy values achieved by the RF and SVM classifiers may be attributed to the use of WV-2 satellite imagery captured in spring, a period when buds begin to open and the urban trees' new leaves start to expand. The leaves helped to reveal spectral differences among the species, enhancing their separability in this study. The development of tree leaves between the spring and summer seasons causes an increase in leaf chlorophyll content (Le Louarn et al., 2017). The MDA values were high for the NIR bands and low for the blue and red bands for the five urban tree species (Figure 7). The high accuracy values for the urban tree species may be due to the ability of the RF classifier to handle unbalanced datasets, its computational efficiency and its lack of distributional assumptions about the input data (Odindi et al., 2016). The SVM classifier also produced high accuracy values, as a high "cost" (C) value of 100 was used; this penalizes training error heavily, forcing the algorithm to reduce the number of misclassified points during training (Abdel-Rahman et al., 2014). It is noteworthy that a non-linear RBF kernel was used for the SVM classifier in this study, as it addresses the inseparability problems that may arise in vegetation species mapping (Mountrakis et al., 2011). It remains unclear, however, whether the high accuracy values in this study were due to the RF and SVM classifiers used, WV-2's high spatial resolution or the absence of complications in the landscape.
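For reference, the roles of the cost parameter C and the RBF kernel can be seen in the standard soft-margin SVM formulation for a two-class problem. This is the textbook form and is not an expression taken from this study. The symbols w, b, ξ_i, φ and γ are introduced here for illustration only, and the value of γ used in the study is not stated in this section.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{subject to}\quad & y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0,\quad i = 1,\dots,n, \\
\text{with kernel}\quad & K(x_i, x_j) = \phi(x_i)^{\top}\phi(x_j)
  = \exp\!\bigl(-\gamma\,\lVert x_i - x_j\rVert^{2}\bigr),\qquad \gamma > 0 .
\end{aligned}
```

Here x_i is the vector of band values for training sample i, y_i ∈ {−1, +1} is its class label and ξ_i is its slack, which is positive when the sample violates the margin. A large C, such as the value of 100 used here, makes each unit of slack expensive relative to the margin term, so the optimizer favours fitting the training samples closely. Multi-class problems such as this one are usually handled by combining several two-class SVMs.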

Conclusions
The ability of high-resolution multispectral WV-2 satellite imagery, combined with RF and SVM classification algorithms, to map urban tree species was assessed in the Randburg municipality in the CoJ. The results demonstrate the considerable ability of WV-2 satellite imagery and machine learning classifiers (RF and SVM) to detect and map urban tree species in a complex urban environment. This study provides insight to the municipality's urban forest and environmental managers for monitoring urban tree species. Such spatial information will assist urban forest managers with the current urban tree inventory and urban greening program.
We caution, however, that the results contain misclassification errors, which might be due to factors such as image registration error, spectral and spatial resolution limitations and spectral signature overlap. Future studies should consider testing different classification approaches, such as object-based image analysis, and excluding other LULC classes to improve classification accuracy and minimize misclassification errors.
Overall, based on the relatively high accuracy levels achieved, the low cost and the ease with which the data are analysed, we believe that WV-2 satellite data are worth considering for urban tree species classification.