Impact of the Accuracy of Land Cover Data sets on the Accuracy of Land Cover Change Scenarios in the Mono River Basin, Togo, West Africa

Knowledge about land use and land cover (LULC) dynamics is of high importance for a number of environmental studies, including the development of water resources, land degradation, and food security. Often, available global or regional data sets are used for impact studies although they have not been validated for the area of interest. Validation is especially required if data are used to set up a land change model predicting future changes for management purposes. Therefore, three different LULC maps of the Mono River Basin in Togo were evaluated in this study. The analyzed maps were obtained from three sources: CILSS (2 km resolution), ESA (300 m), and Globeland30 (30 m). Validation was performed using 1,000 reference points in the watershed derived from satellite images. The results reveal CILSS as the most accurate data set, with a Kappa coefficient of 68% and an overall accuracy of 83%. CILSS data show a decrease of savanna and forest and an increase of cropland over the period 1975 to 2013. The 30.97% increase of cropland area from 1975 to 2013 can be related to the increase in population and its food demand, while the losses of forest area and the decrease of savanna are further amplified by the use of wood as an energy source and the lack of forest management. The three data sets were used to simulate future LULC changes using the TerrSet Land Change Modeler. The validation of the model for 2013 showed a quality of 50.94% using CILSS data, but only 40.04% for ESA and 20.13% for Globeland30. Because of its satisfactory performance, CILSS data were used to simulate the LULC distribution for the years 2020 and 2027. The results show that a high spatial resolution is not a guarantee of high quality. The results of this study can be used for impact studies and to develop management strategies for mitigating negative effects of land use and land cover change.

demand for food, energy, and water is also increasing (Lambin et al., 2003), which causes land use and land cover changes (LULCC). West Africa is a region facing severe LULCC, particularly in the Republic of Togo (TG) and the Republic of Benin (BN), which are experiencing an environmental and social decline resulting in increasing subsistence farming. This causes an acceleration of the degradation of the natural resources and the increase of agricultural area due to rapid population and economic growth (Koglo et al., 2018).
Land use refers to "man's activities on land which are directly related to the land," while land cover is "the vegetation and artificial constructions covering the land surface" (Anderson et al., 1976). LULCC in West African countries are driven by natural and anthropogenic factors. The anthropogenic factors are mainly related to demographic growth (Brink and Eva, 2009), while the natural factors are linked to climate variability and climate change (Koubodana, 2015; Oguntunde et al., 2006). LULCC influence hydrological processes, as agricultural intensification results in increased surface runoff, reduced groundwater recharge, and transfer of pollutants (Veldkamp and Lambin, 2001). Knowledge about LULC dynamics at the watershed scale is indispensable for water and land resource management (Eisfelder et al., 2012; Wisser et al., 2010).
LULC products from remote sensing are often the input for environmental modeling and analysis. This is the case in hydrologic modeling and trend analysis (Wisser et al., 2010), biomass and energy modeling (Eisfelder et al., 2012), population density modeling (Sutton, 1997), as well as risk and hazard analysis (Herbst et al., 2006; Mishra et al., 2014).
In many studies, LULC assessment has been performed with data available from the U.S. Geological Survey (USGS). These products are developed on a large, often global scale, and applying them at the local scale without any validation can significantly affect the model results and future scenario development (Pontius and Neeti, 2010; Sun and Robinson, 2018). In the present study, the impact of the accuracy of LULC data sets on future scenarios in the Mono River Basin (MRB) was investigated.
For LULCC analysis and future scenario prediction, a number of models have been developed, such as GEOMOD, Cellular Automata (CA), and STCHOICE (Arsanjani et al., 2013), and applied in a number of studies (Herbst et al., 2006; Mishra et al., 2014). Sun and Robinson (2018) compared four statistical approaches used in these models (Markov chain, logistic regression, generalized additive models, and survival analysis) to determine their ability to quantify LULC changes and to perform predictions. The results show that the generalized additive model performs better in terms of overall accuracy and is best suited for LULC validation and modeling. Pontius and Neeti (2010) and Pontius and Spencer (2005) analyzed the uncertainty of future LULC scenarios and discussed techniques to quantify the meaningful differences between future scenarios using the GEOMOD model. However, each land cover modeling approach was developed with different strengths, weaknesses, and applications (Mas et al., 2014). A number of studies on LULCC computed transition potentials, spatial trends of change, and land cover change predictions using the Land Change Modeler (LCM), a tool in the TerrSet Geospatial Monitoring and Modeling System integrated in the IDRISI software (Du et al., 2012; Eastman, 2006). The LCM software provides a robust set of tools for change analysis and spatial trend analysis, using different variables as drivers for the computation of future scenarios (Eastman, 2006; Mishra and Singh, 2010). Generally, LULC data are required for the analysis of the past, but also for developing LULC scenarios (Rounsevell et al., 2006). Thus, validated data are used to analyze the drivers of change in the past and to project them into the future (Pontius et al., 2001).
The methodology described by Olofsson et al. (2013) and Pontius and Malanson (2014) to detect or compare LULCC, often used in the generation of LULC maps, can also be applied to evaluate the results of different scenarios. Peixoto et al. (2006) and Stehman (2009) described spatial accuracy assessment based on a sampling approach and proposed this method as appropriate for land cover accuracy assessment. Pontius and Millones (2011) discussed the limitations of comparing two maps using the Kappa coefficient and proposed a new methodology for comparison based on "quantity disagreement and allocation disagreement". Nevertheless, the Kappa coefficient is still considered a vital tool for accuracy assessment in a number of studies (Biondini and Kandus, 2006; Milad et al., 2017; Ren et al., 2018; Sitthi et al., 2016).
This study analyzes LULCC in the MRB from 1975 to 2013 using different data sources and simulates potential future LULC distributions. The main objective of this study is therefore to assess the accuracy of past and future land cover changes in the MRB using different data sets. The specific objectives are: (i) to analyze the past LULCC in the MRB using three different data products, (ii) to project future LULC considering population growth as the main driver, and (iii) to determine how the choice of the data set influences the accuracy of projected future LULC.

Study area
The study area is the Mono River Basin (MRB) in West Africa. It was selected for LULCC analysis because of severe environmental problems such as flooding downstream of the Nangbéto dam, soil erosion and dam siltation caused by agricultural intensification, and cutting of trees, all exacerbated by the absence of any cooperative communal structure and by reduced livelihood opportunities (SAWES, 2011). The Mono River is the second largest river in Togo and is shared with the Republic of Benin. The basin is located between 06°16' and 09°20' North latitude and 0°42' and 1°40' East longitude (Figure 1). At the outlet at Athiémé, the basin covers an area of 22,014 km² with 88% of its area in Togo and 12% in Benin (PCCP, 2008). The Mono River is 309 km long, has its source in the Alédjo Mountains (Amoussou, 2010) in the north of Benin, and drains into the Atlantic Ocean via "la bouche du roi". The elevation of the basin ranges from 12 to 948 meters (http://srtm.csi.cgiar.org/). The biggest dam on the river is at Nangbéto and produces 20% of the total hydroelectricity used by Togo and Benin.
The watershed area encompasses two climate zones. In the south, from 6° to 8°N, two rainy seasons and two dry seasons exist with rainfall between 1200 and 1500 mm/year in the mountainous area of the southwest and 800 to 1000 mm/year in the coastal zone.
The natural vegetation is mainly savanna, composed of bush and tree savanna, gallery forests, and grassland. The relief is generally flat, except for the mountainous regions of the West and the Northwest. In the lower part of the basin, there are very narrow coastal sedimentary islands, often covered by alluvial deposits.
In 2011, the MRB was populated by about 5.1 million inhabitants (FAO, 2012; PCCP, 2008; SAWES, 2011). The main socioeconomic activities are agriculture, trade, fisheries, and livestock husbandry (Amoussou, 2010). According to FAO (http://worldpopulationreview.com/countries/togo-population), the population in Togo has tripled since 1975 and is still increasing (Table 1). In Table 1, K (%) is the growth rate estimated from reported population data assuming exponential growth, as given in Eq. (1):

$$P = P_0 \, e^{K \Delta t / 100} \qquad (1)$$

where $P$ is the population value at time $t_0 + \Delta t$ and $P_0$ is the initial population value at time $t_0$.
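As an illustration of the exponential growth assumption, the following minimal Python sketch estimates an annual growth rate from two population values. It assumes the form P = P0 · exp(K · Δt / 100) with K in percent per year; the function names are hypothetical and the two population values (4.90 and 7.40 million for 2000 and 2015) are the ones reported for Togo in this paper.

```python
import math


def growth_rate_percent(p0, p, dt_years):
    """Annual growth rate K (%) assuming p = p0 * exp(K / 100 * dt_years)."""
    if p0 <= 0 or p <= 0 or dt_years <= 0:
        raise ValueError("populations and time span must be positive")
    return 100.0 * math.log(p / p0) / dt_years


def project_population(p0, k_percent, dt_years):
    """Population after dt_years under the same exponential assumption."""
    return p0 * math.exp(k_percent / 100.0 * dt_years)


# Togo, 2000 to 2015 (values in millions, as reported in the text)
k_togo = growth_rate_percent(4.90, 7.40, 15)
print(f"K = {k_togo:.2f} % per year")

# Consistency check: projecting forward with K recovers the 2015 value
p_2015 = project_population(4.90, k_togo, 15)
```

Under these assumptions the estimated rate is roughly 2.7% per year; other growth models (for example geometric or logistic) would give different values.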

Data Description
Land use and land cover: In this study, we used three available sources of LULC data for the MRB with different temporal and spatial resolutions: the data set of the Permanent Interstate Committee for Drought Control in the Sahel (CILSS) (CILSS, 2016), developed for West Africa at 2-km spatial resolution; a global map at 300-m resolution offered by the European Space Agency (ESA) in the frame of the Climate Change Initiative (CCI) project (Gessner et al., 2012); and the global Globeland30 land cover map developed by the National Geomatics Center of China (NGCC) with a resolution of 30 m (Eastman, 2006; Mishra and Singh, 2010). The details of the data sets are provided in Table 2.

Pre-analysis and Harmonization of Land Use and Land Cover Type
After extracting the Mono watershed in CILSS, ESA, and Globeland30 datasets, a maximum of ten LULC types are represented (Table 3). The pre-analysis consists first of reclassifying the ten land cover types into six major LULC types using ArcGIS 10.5 tools. Second, the spatial resolution of the CILSS and ESA maps was resampled to the 30-m resolution of Globeland30 (Thibaut et al., 2011) to be able to superimpose the maps for comparison (Bárdossy and Schmidt, 2009). The six LULC classes in Table 3 are similar to those proposed by Penman et al. (2003) in the IPCC Guidelines according to the Kyoto Protocol of 2001 and the Good Practices Guidelines for Land Use, Land Use Change and Forestry (GPG-LULUCF).
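The two harmonization steps (reclassification into six classes, then resampling to a common grid) can be sketched in plain Python as follows. This is not the ArcGIS workflow used in the study: the class codes in `RECLASS_TABLE` are hypothetical (the actual correspondence is given in Table 3), and the nearest-neighbour upsampling only handles integer factors such as 300 m to 30 m. A 2-km grid would need a proper GIS resampling tool.

```python
# Hypothetical source codes (1-10) mapped to the six harmonized classes;
# the real correspondence is the one listed in Table 3.
RECLASS_TABLE = {
    1: "forest", 2: "forest",
    3: "savanna", 4: "savanna", 5: "savanna",
    6: "cropland", 7: "cropland",
    8: "wetland", 9: "water", 10: "settlements",
}


def reclassify(grid, table):
    """Replace every source class code in a 2-D grid by its harmonized class."""
    return [[table[code] for code in row] for row in grid]


def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling: each cell becomes a factor x factor block."""
    out = []
    for row in grid:
        wide = [cell for cell in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


coarse = [[1, 4], [6, 9]]                # toy 2 x 2 map with source codes
harmonized = reclassify(coarse, RECLASS_TABLE)
fine = upsample_nearest(harmonized, 10)  # e.g. 300 m cells -> 30 m cells
```

Nearest-neighbour resampling is the usual choice for categorical maps because it never creates class values that were absent from the input.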

Accuracy assessment, land use/cover area, and change analysis
According to Sitthi et al. (2016), a LULC accuracy assessment is required in any study using remote sensing data. LULC map accuracy is quantified by creating an error matrix (or confusion matrix), which compares the classified map with a reference classification or a true map. These matrices can be used as a measure of agreement between model algorithm predictions and the reference points (Congalton, 1991). Following the guidelines of the Food and Agriculture Organization (FAO), tables of accuracy estimates were produced for each of the three data sets. This was followed by the computation of confidence intervals for area estimates and a comparison of area estimates derived from map data with those derived from reference data (FAO-ONU, 2016). Many past studies have estimated the accuracy of an observed LULC map against a modeled one using the Kappa coefficient and overall accuracy (Chen et al., 2015; Franklin and Wulder, 2002; Lunetta et al., 2006; Ren et al., 2018). For accuracy assessment, 1,000 reference points were randomly taken from high-resolution (meter-scale) satellite images for the years 2010 and 2013 provided by Google Earth Pro (version 7.3). These reference points were distributed proportionally to the size of the six LULC types inside the MRB and compared with a classified map of 30-m spatial resolution (Table 4).
The accuracy assessment and an error matrix for each data set were generated by following the guidelines of Congalton (1991) and Huth et al. (2012) and the method proposed and described by Olofsson et al. (2013). According to this method, an error matrix can be computed by counting the number of pixels per LULC class. From this error matrix (Pij), statistics such as user's and producer's accuracies are generated for each individual LULC category, and then the overall accuracy and the Kappa coefficient are computed for each data set.
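A proportional allocation of reference points to classes can be sketched as below. The class areas are purely hypothetical placeholders (only their total, 22,014 km², comes from the basin description; the actual allocation is the one in Table 4), and the largest-remainder rounding is an assumption made here so that the counts sum exactly to the requested total.

```python
def allocate_points(area_by_class, total_points):
    """Distribute total_points across classes proportionally to class area.

    Uses largest-remainder rounding so the counts add up to total_points.
    """
    total_area = sum(area_by_class.values())
    exact = {c: total_points * a / total_area for c, a in area_by_class.items()}
    alloc = {c: int(x) for c, x in exact.items()}  # floor (values are >= 0)
    leftover = total_points - sum(alloc.values())
    # Give the remaining points to the classes with the largest fractional part
    by_remainder = sorted(exact, key=lambda c: exact[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


# Hypothetical class areas in km^2 (illustrative only)
areas_km2 = {
    "forest": 3000, "savanna": 11000, "cropland": 7000,
    "wetland": 300, "water": 200, "settlements": 514,
}
points = allocate_points(areas_km2, 1000)
```

With strictly proportional allocation, small classes such as water or wetland receive very few points, which is one reason their accuracy estimates tend to be less stable.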
User's accuracy ($\hat{U}_i$) of class i is the ratio of the correctly mapped pixels of class i ($P_{ii}$) to the row total of class i ($P_{i+}$), Eq. (2):

$$\hat{U}_i = \frac{P_{ii}}{P_{i+}} \qquad (2)$$

Producer's accuracy ($\hat{P}_j$) of class j is the ratio of the number of correctly classified pixels of class j ($P_{jj}$) to the column total of class j ($P_{+j}$), Eq. (3):

$$\hat{P}_j = \frac{P_{jj}}{P_{+j}} \qquad (3)$$

The overall accuracy ($\hat{O}$) indicates the overall proportion of area correctly classified. It is the sum of all pixels on the major diagonal of the adjusted error matrix ($P_{ii}$) over the total number of pixels in the error matrix ($N$), Eq. (4), where $q$ is the number of LULC classes:

$$\hat{O} = \frac{1}{N} \sum_{i=1}^{q} P_{ii} \qquad (4)$$

The Kappa coefficient (K) is computed from the error matrix and indicates the consistency of the data classification. It is used to evaluate the accuracy of remote sensing data following Eq. (5) (Amler et al., 2015; Ren et al., 2018; Sitthi et al., 2016):

$$K = \frac{N \sum_{i=1}^{q} P_{ii} - \sum_{i=1}^{q} P_{i+} P_{+i}}{N^{2} - \sum_{i=1}^{q} P_{i+} P_{+i}} \qquad (5)$$

According to Fitzgerald and Lees (1994), K is considered to be statistically significant at p < 0.001 and is interpreted using a set of interval values.
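The four accuracy measures can be computed from an error matrix as in the minimal sketch below. It assumes rows are map classes and columns are reference classes, uses the standard count-based forms (diagonal over row total, diagonal over column total, diagonal sum over N, and Cohen's Kappa), and runs on a made-up 3 x 3 matrix rather than on the matrices of this study.

```python
def accuracy_metrics(matrix):
    """Accuracy measures from a square error matrix.

    matrix[i][j] = number of reference points mapped as class i (row)
    whose reference class is j (column).
    """
    q = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(q)) for j in range(q)]
    diag = [matrix[i][i] for i in range(q)]

    nan = float("nan")
    users = [diag[i] / row_tot[i] if row_tot[i] else nan for i in range(q)]
    producers = [diag[j] / col_tot[j] if col_tot[j] else nan for j in range(q)]
    overall = sum(diag) / n
    # Sum of products of row and column totals (chance agreement term)
    chance = sum(row_tot[i] * col_tot[i] for i in range(q))
    kappa = (n * sum(diag) - chance) / (n * n - chance)
    return {"users": users, "producers": producers,
            "overall": overall, "kappa": kappa}


# Made-up example with three classes and 100 reference points
example = [
    [50, 5, 5],
    [10, 20, 0],
    [0, 0, 10],
]
metrics = accuracy_metrics(example)
```

For this toy matrix the overall accuracy is 0.80 while Kappa is about 0.64, which illustrates why the two measures can rank the same map differently.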

Land use and land cover change scenarios
Developing future LULC scenarios consists of two steps. In the first step, the rate of change has to be estimated, and in the second step the probability for a change into a certain LULC class to take place has to be computed (Verburg and Veldkamp, 2002). The flowchart of Mono land use and land cover modeling is shown in Figure 2.
The spatial trend change analysis was performed for CILSS and for the periods 2000-2010 and 2000. Spatial trends per LULC category were computed as 9th-order polynomial surfaces and show areas of positive, no, and negative trend of change (Eastman, 2006; Václavík and Rogan, 2009).
The results are used to compute spatial transition probabilities for every LULC category. In this study, population growth, elevation, and distance to roads were used as drivers for calculating the transitions from forest to savanna, from forest to cropland, from savanna to cropland, and from savanna to forest. Road network and elevation are static drivers, while population is a dynamic driver (Eastman, 2006). The parameters that act as driving forces of change are assumed to remain the same (Eastman, 2006). Many studies have shown that the multi-layer perceptron (MLP) is a useful tool for prediction, function approximation, and classification (Gardner and Dorling, 1998). We adopted a Markov chain prediction process and a transition probability matrix to model the future LULC scenarios (Eastman, 2006). The transition probability file is a matrix that records the probability that each LULC category will change to any other category. The quality of the prediction can be evaluated using an observed map that was not used for calculating the transition potentials (Eastman, 2006). Computing the rate of change between 1975 and 2000 and comparing the projected LULC of 2013 with the observed one allows validation of the CILSS data set. Afterwards, LULC scenarios were predicted for CILSS and ESA at a time step of seven years from 2013 to 2027, and for Globeland30 at a time step of ten years from 2010 to 2020.
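The quantity side of a Markov chain projection can be illustrated with the short sketch below. It is not the LCM implementation: it only shows how a transition probability matrix can be estimated from a cross-tabulation of two maps and applied to class areas, under the assumption that one projection step equals the interval between the two maps. The pixel counts are invented for illustration.

```python
def transition_matrix(cross_tab):
    """Row-normalize a cross-tabulation into transition probabilities.

    cross_tab[i][j] = pixels of class i at time t1 that are class j at time t2.
    """
    n = len(cross_tab)
    probs = []
    for i, row in enumerate(cross_tab):
        total = sum(row)
        if total == 0:
            # Class absent at t1: assume it stays unchanged (identity row)
            probs.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            probs.append([c / total for c in row])
    return probs


def project_areas(areas, probs, steps=1):
    """Apply the transition matrix to class areas for a number of steps."""
    n = len(areas)
    for _ in range(steps):
        areas = [sum(areas[i] * probs[i][j] for i in range(n)) for j in range(n)]
    return areas


# Invented cross-tabulation for classes (forest, savanna, cropland)
cross_tab = [
    [80, 15, 5],    # forest at t1 -> forest / savanna / cropland at t2
    [0, 70, 30],    # savanna at t1
    [0, 0, 100],    # cropland at t1
]
probs = transition_matrix(cross_tab)
areas_t2 = [sum(row[j] for row in cross_tab) for j in range(3)]  # [80, 85, 135]
areas_t3 = project_areas(areas_t2, probs, steps=1)
```

A Markov chain of this kind only projects how much of each class is expected; where the change is placed on the map depends on the transition potentials derived from the drivers.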

Accuracy assessment of land use and land cover
The accuracy of the LULC maps was assessed using the latest available maps, for the years 2010 (Globeland30) and 2013 (CILSS and ESA). The overall accuracy (the percentage of reference points classified correctly) and the Kappa coefficient were 83% and 68% for the CILSS 2013 product, 69% and 36% for the ESA 2013 data set, and 57% and 34% for the Globeland30 data set, respectively. The Kappa coefficient is considered good for CILSS but poor for the ESA and Globeland30 data sets (Chen et al., 2004; Fitzgerald and Lees, 1994). The overall accuracy is excellent for CILSS and good for the ESA and Globeland30 data sets. Detailed producer's and user's accuracies are shown in Table 5. According to Table 5, the CILSS data set shows acceptable user's and producer's accuracies, higher than 60%. User's and producer's accuracies are very poor for forest and wetland in the ESA data set, and for wetland and settlements in the Globeland30 data set. For the LULC categories of forest, savanna, cropland, and water in particular, the accuracies are acceptable with the ESA and Globeland30 data sets. In all three data sets, user's and producer's accuracies for savanna and water are acceptable, while those for forest are good in the CILSS and Globeland30 data sets. We conclude that, overall, the classification consistency in the MRB decreases from CILSS to ESA to Globeland30, although for some individual LULC types the best user's and producer's accuracies are obtained with a different data set.

Land cover area and change area estimation
The analysis of the area per LULC type in the CILSS data set reveals savanna, cropland, and forest as the dominant land cover types in the basin (Figure 3). This means there is an increase of savanna and a reduction of cropland over time. From our knowledge in the field, deforestation is still occurring, resulting in increasing cropland area in the MRB. For the ESA data set, Table 6 shows a forest decrease of 0.51% and an increase of savanna from 2000 to 2013. The major LULC types in the ESA map are also savanna, cropland, and forest, and the change between 2000 and 2013 is positive for savanna, negative for forest, and positive for cropland. Considering the user's accuracy and the overall accuracy by type (Table 5), it is clear that forest and wetland are not well classified in the ESA data set (Figure 3 and Table 6).

Land cover spatial trend of change
The spatial trend of change computed for the CILSS, ESA and Globeland30 data sets is given in  (Table 6).
There are some similarities of the spatial trend of the transition forest to savanna between 2000 and 2013 using CILSS and ESA data sets and between 2000-2013 and 2000-2010 using the three data sets for the transition of savanna to cropland (Appendix A).

Quantifications, locations of land use/cover change and driving forces
Land use and land cover modeling requires knowledge about how much change occurs in the land, where it happens, and why. Quantifying historical LULCC therefore reveals the past state of LULC. Additionally, the drivers of change are useful for projecting future land cover. [...] and forest to cropland (54 km²) using the CILSS data set. The LULCC using ESA and Globeland30 is underestimated compared to CILSS. The LULCC can be explained by population growth in the region: from 2000 to 2015, the population of Togo increased from 4.90 to 7.40 million (Table 1).
IJARSG - An Open Access Journal (ISSN 2320-0243), International Journal of Advanced Remote Sensing and GIS
The hot spot of the change from forest to savanna is located in the southwest of the basin, while the change from forest to cropland is also important in the northeast. Changes from savanna to cropland occur over the entire basin but are concentrated in its center and along the south-north axis. The change from forest to savanna in the CILSS data set is located in the south and west of the basin, where the rural population likely has access to wood for its domestic needs.

Land use and land cover validation and change predictions
Because ESA and Globeland30 data are not available for the year 1975, validation was performed only for the CILSS data set. To this end, after assessing LULCC between 1975 and 2000, a LULC map was generated for the year 2013 using the LCM. The estimated map was compared with the observed LULC map. The results of the validation were ranked as acceptable, with an accuracy rate higher than 50% (Appendix B).
After analyzing LULC, future LULC was predicted for all data sets by supposing population growth as the main driver.

Figure 4: Projected land use and land cover scenarios and areal changes for 2020 and 2027 with CILSS dataset (axes: land use and land cover area [%]; change)
The predicted LULC scenarios for 2020 and 2027 using the CILSS data set are shown in Figure 4 together with the related statistics. According to this projection, forest and savanna decrease, with a change rate of 0.45% for forest and 5.64% for savanna. By contrast, cropland is constantly increasing at a rate of 5.26%, and settlements are increasing at 0.83% between 2020 and 2027. Wetland and water bodies in the area did not change significantly.
Because of the weak representation of LULC by ESA and Globeland30, confirmed by prediction accuracies of less than 50% (Appendix B), the projection was performed only for the year 2020. Figure 5 shows the LULC scenarios of 2020 for CILSS, ESA, and Globeland30. For ESA and Globeland30, the projected LULC map of 2020 is almost identical to the earlier LULC maps from 2013 and 2010, respectively (Figure 2). These similarities can be explained by the low prediction accuracies. The predicted LULC maps for 2020 depend strongly on the accuracy of each LULC source. The results show that the temporal change of LULC in the basin is best reproduced by CILSS. After the CILSS data set, Globeland30 performs better in the spatial representation of some LULC types, such as forest, savanna, cropland, and water.
Savanna, cropland, and forest are the dominant LULC types in the region. From 1975 to 2027, there is a decrease of forest and savanna followed by an increase of cropland and settlements in the MRB.

Accuracy assessment and past land cover change
Although the spatial resolution of the ESA and the Globeland30 data is high, the two data sets do not accurately map some LULC types in the study area. This may be explained by the fact that, for CILSS, local information is used during the automated and semi-automated classification (CILSS, 2016; Cotillon, 2017). It may also be due to the number and spatial distribution of the reference points used for the accuracy assessment. Indeed, the size of the random sample selected from Google Earth imagery can affect its spatial distribution depending on the map resolution of 2 km, 300 m, or 30 m (Congalton, 1991; Stehman, 2009). The visual identification of land use or land cover classes is easy when the resolution is high (Huang and Siegert, 2006; Stuckens et al., 2000). Table 5 shows the differences in user's and producer's accuracies between the CILSS, ESA, and Globeland30 data sets. The CILSS data set reveals acceptable accuracies for each LULC category. This difference can be explained by the spatial resolution of the data sets and by the reference points.
The finding that savanna and agriculture are the dominant LULC classes in the study area during the study period is in accordance with other studies. For example, Badjana (2015) analyzed LULCC in the Kara River basin and showed that savanna was dominant. This was also observed by Diwediga et al. (2015) in the Mo River basin, a small tributary of the Oti River in the central region of Togo. Koglo et al. (2018) also concluded that savanna and forest are the most important LULC types being converted to cropland in Kloto, a small district in the south of the MRB.
The results of the CILSS LULCC analysis in the MRB confirm many analyses performed in Togo and Benin, in which LULCC is mainly characterized by deforestation, cropland expansion, and losses of savanna (Akinyemi et al., 2017; Kleemann et al., 2017). The results of Badjana et al. (2017) and Koglo et al. (2018) revealed that forest and savanna were converted to cropland and settlements in the south and north of Togo. In Fazao-Malfacassa National Park, in the northern part of the MRB, Atsri et al. (2018) found that forest and savanna are degraded, which could be explained by agricultural expansion, bush fires, and timber extraction, all linked to population growth. By assessing the land use change process in the Kéran protected area in northern Togo, Polo-Akpisso et al. (2019) confirmed that savanna and forest have decreased annually at a rate of more than 2%, whereas cropland and settlements have increased in the region.
The results of the current study show that deforestation is increasing over the whole period of analysis. According to Kokou et al. (2005) more than 80% of the rural communities in Togo are using wood for cooking, causing significant losses of forest. Therefore, decision makers need to take measures to reduce forest degradation, sensitizing the local communities concerning the advantages of reforestation, and the negative impacts on the climate due to losses of forests. Measures must be also taken concerning demographic policies.
The increase of the water bodies between 1975 and 2000 can be explained by the building of the Nangbéto dam in 1987 and by rainfall variability in this region (Badjana, 2015). As a consequence of climate change and climate variability, reduced precipitation caused a decrease of the water body of the reservoir from 2000 to 2013, which had consequences for hydroelectricity production, as mentioned by Houessou (2016). Climate variability, especially the droughts of the 1970s and 1980s, negatively affected grassland due to overgrazing. The increase of settlements is also realistic and can be explained by demography in Togo and Benin (see Table 1).

Land use and land cover scenarios accuracy and assessment
The direction and location of the LULC spatial trends approximately coincide with the main cities of the basin; therefore, the LULC spatial trends can be explained by population activities and growth, as mentioned by Koglo et al. (2018) for the southern district of the MRB. Because of its fine resolution, the Nangbéto dam area and some protected forests such as Fazao-Malfacassa (Amoussou et al., 2017; Atsri et al., 2018) are well delimited. The excellent classification of these land cover types is due to the low and high albedo of water and forest, which plays a role during data collection by satellite sensors. The CILSS LULC scenarios show a positive area change of cropland and settlements; the negative area change of forest and savanna can be explained by the same factors cited above.
The differences between the future LULC scenarios of the data sets are due to the differing Kappa coefficients obtained, which proves the importance of LULC validation. Therefore, validated LULC data or LULC maps based on supervised classification are preferable as input in LULCC scenario studies (Foody, 2002).
The accuracy of the LULC scenarios is strongly affected by the accuracy and the historical LULCC of the CILSS, ESA, and Globeland30 data sets. Furthermore, the low accuracies obtained from the modeling can also be explained by the fact that we were not able to take into account all the drivers, both anthropogenic and natural, during the modeling. Other reasons are weaknesses of the LCM software or user manipulation errors (Camacho Olmedo et al., 2015; Mas et al., 2014). The simulation helps in understanding, forecasting, and anticipating the future evolution of land cover. Nevertheless, it is important to check the validity of LCM outputs against local expertise (Zoungrana et al., 2015).

Conclusion
This work focused on the assessment of land use and land cover changes and future scenarios in the Mono River Basin (MRB) over the period 1975 to 2027 using three different data sets. The results show that the CILSS data set is the most reliable for the MRB, with acceptable accuracy assessment efficiencies (higher than 75%). In the MRB, savanna, cropland, and forest are the major land use and land cover classes, with decreasing (forest and savanna) and increasing (cropland, settlements) trends. The expansion of agriculture due to population growth occurs at the expense of savanna. In the tropical zone of West Africa, people use wood as an energy source, which is another cause of deforestation. LULCC must be taken seriously by the authorities and by the population themselves. It is very important to know the evolution of LULCC in order to develop strategies for planning integrated water resources management (IWRM) in general.
The study assessed scenarios of future LULC by mapping and analyzing the situation for two time steps (2020 and 2027). The maps obtained from this analysis can be used as inputs in hydrological modeling for assessing the impacts of LULC and climate changes on water yield and surface runoff in the MRB.
Future scenarios of LULC depend significantly on the source of the underlying data set. The high spatial resolutions of Globeland (30 m) and ESA (300 m) are attractive, but the quality is limited to specific land use or land cover categories. The resolution of CILSS is rather coarse and therefore, users often prefer other data sets. Nevertheless, because CILSS data were produced with local knowledge, the quality is convincing and outperforms the others. Using the data sets for scenario analysis results in completely diverging futures; this may significantly affect management strategies. This study shows the importance of validating land cover data sets before scenario analysis.