Fuzzy model-based reconstruction of paleovegetation in Ethiopia

ABSTRACT We introduce a new method to compute plant distribution in Ethiopia under paleoclimatic conditions using fuzzy logic. Using a published map of the potential vegetation of Ethiopia, we derive the boundary conditions for the main vegetation units shown, which reflect modern temperature and precipitation conditions in this region. Applying fuzzy logic to these climatic values on a GIS platform, we then compute a map of the potential vegetation. Comparison with the original map shows a general correspondence of about 90%. By changing the underlying climate parameters, we then use this model to simulate the vegetational response to hypothetical paleoclimatic conditions. Finally, vegetational response maps for Ethiopia are presented for two scenarios: (i) colder and drier conditions than today (such as the Last Glacial Maximum) and (ii) warmer and wetter conditions than today (such as the Last Interglacial).


Introduction
The objective of the procedure described below is to model the potential vegetation of Ethiopia in terms of precipitation and temperature in order to compute the distribution of certain plant associations during the geologic past. The method is based on the hypothesis that a given type of vegetation cover in a specific region is mainly a reflection of the ecological and climatological conditions in that region, including human influence. Following the concept developed by Tüxen (1956), botanists on all continents have created potential vegetation maps through time-consuming field surveys of plant associations, assuming that human interference ceases. Under these conditions, climate parameters such as temperature and precipitation are the main influences on plant distribution. A good example of such a map is the one created by Friis et al. (2010) for Ethiopia, which was improved by van Breugel et al. (2015) and which we use here as our 'base map'. This map relies heavily on field observations covering a broad area of the country. Its results are valid only for current climatic conditions. If we want to model vegetation patterns under past climatic conditions, we have to rely on climate reconstructions based on, for example, lake sediments (e.g. Schaebitz et al., 2021). These give an accurate description of the surrounding environment, from which we can derive the relevant climatic conditions in terms of precipitation and temperature, but only for the restricted area in which the lake sediments occur. Our model uses these factors to extrapolate climate conditions to a wider region and thus to derive a description of the potential vegetation under a given climatic condition.
Our model is based on the principles of fuzzy logic (see Kainz, 2007 for fuzzy logic and GIS). Fuzzy logic used for vegetation distribution provides a way of distinguishing smoothly between the two extremes of 'totally suitable situation for a certain group of plants' and 'totally unsuitable'. The idea is to identify for each plant association a certain span of tolerance in precipitation and temperature ranges in which this particular type of vegetation can exist (Figure 1).
Following the general assumption that those plants which are best adapted to the dominating conditions prevail, the model chooses the plant association that has the strongest overall membership value for a particular location. Oldeland et al. (2010) use fuzzy logic for vegetation mapping in Namibia, recognizing its 'great potential to map and identify continuous natural vegetation' (ibid, 1156). They use the fuzzy method to compute a membership value for each vegetation unit based on hyperspectral remote sensing data. Based on the highest membership value for each pixel, the prevailing vegetation unit was mapped in a hard classification image. Triepke (2017, p. 64) states that fuzzy sets correspond more adequately to the 'fuzzy nature of natural landscapes'. Applying fuzzy logic helps to reduce the arbitrariness of human-made classifications. Räsänen et al. (2019, p. 1024) use the fuzzy method for forest mapping and conclude that this kind of method is to be preferred when it comes to capturing 'continuity in species distribution'. Feilhauer et al. (2021, p. 330) do not see advantages of hard classification over the fuzzy method when it comes to vegetation mapping. They conclude that fuzzy classification is 'a better representation of reality' compared to hard classification and offers several other advantages, such as accuracy assessment and mapping performance. Finally, fuzzy maps can always be converted into hard classifications but not vice versa.

Methods
We used mean annual precipitation data obtained from the WorldClim database (Hijmans et al., 2005), which provides a resolution of approximately 1 × 1 km for Ethiopia. The mean annual temperature was approximated using SRTM elevation data downloaded from USGS EarthExplorer. Following the example of Friis et al. (2010), we used the SRTM data, with its higher resolution, as a substitute for temperature data, since lower altitudes correspond to higher temperatures. We assume a fairly linear change of −6°C per 1000 m of elevation gain as the general lapse rate. The resolution of the SRTM raster close to the equator is about 90 m. The data were downloaded in tiles; more than 100 tiles were needed to cover the entire area of Ethiopia.
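The elevation-to-temperature conversion implied by this lapse rate can be sketched as follows. Only the rate of −6°C per 1000 m comes from the text; the function names and the reference temperature and reference elevation are hypothetical, introduced purely for illustration.

```python
LAPSE_RATE_C_PER_M = -6.0 / 1000.0  # -6 °C per 1000 m, as stated in the text


def temperature_from_elevation(elevation_m, reference_temp_c, reference_elev_m=0.0):
    """Estimate mean annual temperature (°C) at a given elevation.

    reference_temp_c and reference_elev_m are hypothetical anchor values;
    the text does not state which baseline was used.
    """
    return reference_temp_c + LAPSE_RATE_C_PER_M * (elevation_m - reference_elev_m)


def elevation_equivalent_of_temperature_change(delta_t_c):
    """Elevation difference (m) that produces the given temperature change.

    A negative delta_t_c (cooling) maps to a positive elevation difference,
    i.e. the same temperature is now found that many metres lower.
    """
    return delta_t_c / LAPSE_RATE_C_PER_M
```

Under this linear assumption, a uniform cooling of 6°C is equivalent to moving every cell about 1000 m uphill, which is why elevation can stand in for temperature in the model.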
The term 'fuzziness' refers to a certain vagueness, in this particular case related to the borders between plant associations. A fuzzy membership function expresses to what degree something belongs to a class. Every entity is given a value between 0 and 1, where 0 indicates no membership and 1 definite membership of a certain class (Figure 1). Members in the center of the set usually receive a value of 1, with the value declining towards the edges of the set. A membership value of 0.5 (the crossover point) marks where the crisp boundary of a non-fuzzy set would have been drawn. This means that entities need not simply belong or not belong to the set, but can be something in between. This is useful in the case of vegetation distribution because natural vegetation has no clear boundaries but rather fuzzy transitions between plant associations. In our case, the membership functions were data driven, using a training area in southern Ethiopia based on the atlas of the recent potential vegetation (Friis et al., 2010). To extract the characteristics of the plant associations we used geographical raster data sets of mean annual precipitation and elevation (which we converted into temperature). First, we overlaid the existing map with the precipitation and elevation raster files. A statistical description of the data layers for each period calculated with the proposed method in the current study is given in Table 1. Then we performed a simple cell count on each raster within the vegetation zones of the map. We then calculated the mean, maximum and minimum values, and the standard deviation of the raster data sets in each vegetation zone. This yields a unique profile for each vegetation zone, indicating the corresponding plant association's favorable conditions in terms of precipitation and elevation (as a proxy for temperature; Figure 2). The membership functions are constructed around these values.
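The per-zone statistics described above can be illustrated with a small standard-library sketch. It is not the ArcGIS workflow used in the study: the rasters are represented as flat Python sequences, the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was applied.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zonal_profile(zone_cells, value_cells):
    """Summarise a raster per vegetation zone.

    zone_cells and value_cells are flat, equally long sequences holding, for
    each raster cell, the zone label and the raster value (for example
    precipitation in mm). Cells whose value is None are treated as NoData.
    Returns {zone: {"mean": ..., "min": ..., "max": ..., "std": ...}}.
    """
    if len(zone_cells) != len(value_cells):
        raise ValueError("zone and value rasters must have the same cell count")
    grouped = defaultdict(list)
    for zone, value in zip(zone_cells, value_cells):
        if value is None:
            continue
        grouped[zone].append(value)
    return {
        zone: {
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
            # Population standard deviation (assumption, see lead-in).
            "std": pstdev(values),
        }
        for zone, values in grouped.items()
    }
```

Calling `zonal_profile(["F", "F", "BG", "BG", "F"], [1200, 1400, 600, 800, 1600])` gives one profile for "F" and one for "BG"; the numbers here are invented and only show the shape of the result.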
The mean value was used to establish the center of each set's core. An interval of ± one standard deviation around the mean (Figure 1) is declared as positive membership within the group, corresponding to a membership value of 1. In other words, this range of cell values is assumed to be safely associated with definite membership of the plant association. The maximum and minimum values are used to define the limits of the set, outside of which membership values are 0.
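A trapezoidal membership function built from these four statistics might look like the following sketch. The function names are hypothetical, the linear flanks follow the trapezoid shape named in the text, and clamping the core to the observed minimum and maximum is an added assumption for zones where mean ± standard deviation would overshoot the limits.

```python
def trapezoid_membership(x, lower, core_low, core_high, upper):
    """Trapezoidal fuzzy membership of x.

    Membership is 1 inside [core_low, core_high], 0 at or outside the limits
    [lower, upper], and changes linearly on the two flanks in between.
    """
    if core_low <= x <= core_high:
        return 1.0
    if x <= lower or x >= upper:
        return 0.0
    if x < core_low:
        return (x - lower) / (core_low - lower)
    return (upper - x) / (upper - core_high)


def membership_from_stats(x, mean, std, minimum, maximum):
    """Build the trapezoid from zone statistics: core = mean ± std, limits = min/max."""
    core_low = max(mean - std, minimum)   # clamp the core to the observed range
    core_high = min(mean + std, maximum)
    return trapezoid_membership(x, minimum, core_low, core_high, maximum)
```

With invented statistics of mean 1000, standard deviation 200, minimum 400 and maximum 1800, a cell value of 600 lies halfway up the rising flank, so `membership_from_stats(600, 1000, 200, 400, 1800)` returns 0.5, the crossover point.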
We applied this procedure to a simplified version of the map of potential natural vegetation by Friis et al. (2010) and van Breugel et al. (2015), using only six vegetation units instead of nine (Map 1). We merged all the original forest types (wet evergreen mountain forests = MAF, transition forests = TRF and Afromontane forests = DAF) into a single class (F). Furthermore, we joined all bush and savanna types (Acacia-Commiphora 'forests' = ACB, Acacia-dominated grassland = ACB/RV, and Combretum-Terminalia 'forests' and forested grasslands = CTW) into another class named Bush and Grassland (BG). We therefore differentiate between Desert (DSS), Bush and Grassland (BG), Ericaceous Belt (EB), Forest (F), Afroalpine Belt (AA), and a currently non-existent Periglacial and Glacial Belt (PGG), which might have existed in times of low temperatures. The result was two trapezoidal membership functions for each plant association, except for DSS and PGG, whose functions were L- and S-shaped respectively and defined by elevation/temperature and precipitation only (Figure 3). We used the L-shaped function (for an explanation of L- and S-shaped functions see Figure 1) to indicate that membership values do not decrease below a certain threshold: the membership value for DSS does not decrease once precipitation falls below a certain amount. Similarly, glacial conditions do not dwindle as temperatures get lower. The respective membership function assumes a distinct S-shape because the membership value does not decrease beyond a certain temperature value. The PGG membership function thus expresses the assumption that below a certain temperature (or above a certain elevation) no plants can exist. This defines a definitive limit to any plant association, no matter how low temperatures get or how high precipitation is. There should be a similar limit for low precipitation, demarcating the desert climate that mostly borders the BG plant association.
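The two open-ended shapes can be sketched as piecewise-linear stand-ins. This is an illustration under stated assumptions: the function names and all breakpoints are hypothetical, the actual curves are those of Figures 1 and 3, and a true S-shaped function would be smooth rather than linear. Following the description above, the L-shape is read here as a function of precipitation (DSS) and the S-shape as a function of elevation (PGG).

```python
def l_shaped(x, full_until, zero_from):
    """L-shaped membership: 1 up to full_until, falling linearly to 0 at zero_from.

    For DSS, x would be precipitation: below full_until the membership stays
    at 1 however dry it gets.
    """
    if x <= full_until:
        return 1.0
    if x >= zero_from:
        return 0.0
    return (zero_from - x) / (zero_from - full_until)


def s_shaped(x, zero_until, full_from):
    """S-shaped membership (linear stand-in): 0 up to zero_until, 1 from full_from on.

    For PGG, x would be elevation: above full_from the membership stays at 1
    however high (and cold) the cell is.
    """
    if x <= zero_until:
        return 0.0
    if x >= full_from:
        return 1.0
    return (x - zero_until) / (full_from - zero_until)
```

Unlike the trapezoid, each of these functions has only one flank, so membership never drops again once the plateau is reached.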
For boundary conditions of each plant association used in this model, see Figure 3.
The membership functions were translated into the Python scripting language. In ArcGIS, Python can be used as an interface to access tools for processing raster data sets in geographically meaningful ways. Using the Spatial Analyst library of the arcpy module provided by ESRI, we can perform a wide range of calculations on a given raster data set. We established the fuzzy membership functions using conditional statements. See the commented scripts for details.
All six membership functions are applied to the precipitation and the elevation data sets, producing five new data sets each. For each cell in the raster, these data sets describe the fuzzy membership values of the respective plant association, separately for precipitation and elevation (temperature). To aggregate the two membership values of each vegetation class, their average is calculated for each cell. This results in five layers representing the membership values of every raster cell for each plant association in terms of precipitation and elevation (temperature) combined.
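The aggregation step is a cell-wise arithmetic mean of the two membership grids of one vegetation class. A minimal sketch, with rasters represented as nested Python lists and a hypothetical function name:

```python
def combine_memberships(precip_membership, elev_membership):
    """Cell-wise average of two membership grids of identical shape.

    Each grid is a list of rows; each row is a list of values in [0, 1].
    """
    if len(precip_membership) != len(elev_membership):
        raise ValueError("grids must have the same number of rows")
    combined = []
    for p_row, e_row in zip(precip_membership, elev_membership):
        if len(p_row) != len(e_row):
            raise ValueError("grids must have the same number of columns")
        combined.append([(p + e) / 2.0 for p, e in zip(p_row, e_row)])
    return combined
```

Note that averaging lets a cell keep a non-zero combined membership even when one of the two factors is entirely unsuitable, whereas a fuzzy AND (the minimum of the two values) would set it to 0. The text specifies the average, so that is what the sketch uses.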
To display the data in a single map, we had to defuzzify our results. Following Friis et al. (2010) and van Breugel et al. (2015), we use the spatial resolution of our temperature proxy data, the SRTM data sets, as the basic cell size for our calculations. This corresponds to natural conditions, in which temperature changes follow changes in relief more closely than precipitation patterns do. The basic cell size for our defuzzification is therefore approximately 90 m × 90 m (the SRTM spatial resolution) and thus fairly small. We assume that the plant association with the most favorable conditions, as expressed by the fuzzy membership values, prevails over the others. Using the stacking algorithm of the ArcGIS desktop toolbox, the values were integrated into a single raster output by selecting the strongest membership value in the stack and assigning each cell the class of the prevailing plant association (Figure 4).
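Defuzzification by strongest membership amounts to a per-cell argmax over the stacked class layers. The sketch below uses plain Python lists rather than the ArcGIS tool used in the study; the function name is hypothetical, and resolving ties in favour of the class listed first is an assumption the text does not address.

```python
def defuzzify(layers):
    """Assign each cell the class with the highest membership value.

    layers maps a class label (e.g. "DSS", "BG", "F") to a membership grid
    given as a list of rows. All grids must share the same shape.
    Ties go to the class that appears first in `layers` (assumption).
    """
    names = list(layers)
    if not names:
        raise ValueError("at least one membership layer is required")
    first = layers[names[0]]
    n_rows, n_cols = len(first), len(first[0])
    classified = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # max() returns the first maximal element, which implements the tie rule.
            row.append(max(names, key=lambda name: layers[name][r][c]))
        classified.append(row)
    return classified
```

The output is a hard classification, so the information about how close the runner-up class was is lost at this step.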

Results
The resulting map of the fuzzy-based potential vegetation for Ethiopia is shown in Map 2. The map shows that the lowest elevations are predominantly covered by desert vegetation. This changes to 'Bush and Grassland' in the slightly more elevated regions. For the more mountainous areas, our model indicates predominantly forest vegetation, which yields to the Ericaceous belt. The highest regions are characterized by Afroalpine conditions. Considering the whole of Ethiopia, the plant distribution areas differ by only 10.2% (see Map 3) between our computational fuzzy-based map and the modified original map. Comparing our fuzzy logic map (Map 2) with the field-based map (Map 1), certain deviations of our model are visible in the extremely dry areas of NE and eastern Ethiopia (Map 3). Here the 'Bush and Grassland' vegetation covers less terrain in our model than actually prevails under recent climate conditions; our model calculates more desertic areas.
Moreover, some significant differences appear on the west-facing slopes of the forest in the SW of the country. Here the forest vegetation covers less terrain than the 'Bush and Grassland'. Friis et al. (2010) assign these areas to the transitional rain forests of the moist evergreen montane forest ecosystem (p. 252). Once again, our model overestimates the water stress for the plants. This phenomenon is also visible in general at the borderline of the forest all over the country. In sum, the differences marked in red in Map 3 amount to about 10.2% of the total area of Ethiopia; in these areas our fuzzy model does not accurately reproduce the potential vegetation map.
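A disagreement figure of this kind can be obtained by comparing two classified rasters cell by cell. The following sketch shows the arithmetic only: the function name and the NoData handling are assumptions, and it counts cells rather than true ground area, which is a simplification if cells are not of equal size.

```python
def percent_disagreement(map_a, map_b, nodata=None):
    """Share of cells (in %) whose class differs between two classified grids.

    Both grids are lists of rows of class labels. Cells equal to `nodata`
    in either grid are excluded from the comparison.
    """
    total = 0
    differing = 0
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            if a == nodata or b == nodata:
                continue
            total += 1
            if a != b:
                differing += 1
    if total == 0:
        raise ValueError("no valid cells to compare")
    return 100.0 * differing / total
```

The complement of this value (100 minus the result) is the correspondence between the two maps.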
A second step was to use this fuzzy model for the past by changing the temperature and precipitation values for two hypothetical scenarios: the Last Glacial Maximum (LGM) and the Last Interglacial (Maps 4 and 5). First, we present the modeled situation for the LGM (MIS 2, about 18-22 ka BP; Map 4), which was derived using a general temperature change of −6°C, as discussed for the tropics by Hostetler et al. (2006), and a mean reduction of precipitation of about 25%, as proposed by Fischer et al. (2020) for southern Ethiopia (the Chew Bahir region). Our model calculates a spread of the desert vegetation into higher altitudes, whereas the best conditions for forest reach down to lower elevations. As a result, 'Bush and Grassland' retreats under pressure from lower and higher elevations. The Ericaceous belt and Afroalpine conditions extended to lower elevations. In the highest elevations, even periglacial and glacial conditions were maintained.
The second example shows the modeled situation based on the recent version of our fuzzy model for the Last Interglacial (MIS 5e, about 120-130 ka BP; Map 5). For this time interval we applied a general temperature rise of +2°C and a 25% increase in annual precipitation in comparison to modern conditions (i.e. Asrat et al., 2018; Fischer et al., 2020). Here our calculations suggest a fairly wide spread of 'Bush and Grassland' and growth of the Ericaceous belt. The forests slightly retreat to higher altitudes.
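The two scenarios differ from the modern run only in their climate inputs, which can be expressed as a uniform temperature offset and a precipitation factor. The offsets and factors below are the ones quoted in the text; the data layout, dictionary keys and function name are hypothetical, and applying the change uniformly to every cell is the simplification the text itself describes.

```python
SCENARIOS = {
    # Values as quoted in the text: -6 °C / -25% and +2 °C / +25%.
    "LGM": {"delta_t_c": -6.0, "precip_factor": 0.75},
    "Last Interglacial": {"delta_t_c": 2.0, "precip_factor": 1.25},
}


def apply_scenario(temperature_c, precipitation_mm, scenario):
    """Return (temperature, precipitation) grids adjusted for a scenario.

    Both inputs are lists of rows holding modern values; the same offset and
    factor are applied to every cell.
    """
    params = SCENARIOS[scenario]
    new_temperature = [[t + params["delta_t_c"] for t in row] for row in temperature_c]
    new_precipitation = [[p * params["precip_factor"] for p in row] for row in precipitation_mm]
    return new_temperature, new_precipitation
```

The adjusted grids would then be passed through the same membership functions and defuzzification as the modern data, so any change in the output map is attributable to the changed climate inputs alone.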
In comparison to the modern conditions calculated by our model (Map 2), the desert vegetation in the NE and east of the country covers more terrain during the LGM (Map 4), which is expected when precipitation values are decreased countrywide. Moreover, the forests prevailing in lower altitudinal positions of the mountains directly border the desert vegetation in some areas in the east. In general, the forest appears to have been able to grow on lower slopes because the lower temperatures reduced water stress, which might have overcompensated for the lower annual precipitation. For the higher mountain areas, mainly in the southwest, our simulation for the LGM shows widespread Ericaceous and Afroalpine belts. Additionally, small spots of Periglacial and Glacial (PGG) environments appear on the highest peaks (i.e. Simien and Bale Mountains). A broad corridor of open vegetation ('Bush and Grassland') stretching in N-S direction through the highlands north of the rift (where forest currently dominates) dominated during the LGM, while in the Afar region and eastwards to Somalia desertic environments predominated.
The simulation of conditions during the Last Interglacial led to more widespread 'Bush and Grassland', with broad corridors in the rift and shrinking areas of desert. Moreover, the forest shrank at its eastern flanks but spread in western parts of the country which today border Kenya and southern Sudan. The smaller forest terrain in the east is surprising but could be due to the warmer conditions raising the water stress for trees and thereby overcompensating for the higher precipitation. In the higher elevations, the Ericaceous belt tended to spread into the terrain of the Afroalpine belt, while periglacial and glacial zones at very high elevations disappeared. Open vegetation appears to have predominated in the rift during the Last Interglacial, offering broader corridors for the dispersal of our ancestors. During the LGM, by contrast, under drier and colder climate conditions, the model shows more desertic conditions in the lowlands, which might have triggered an upward movement into the higher mountains for people living during these times (Schaebitz et al., 2021).

Discussion
In general, the fuzzy model proved to be reliable in predicting potential vegetation in terms of precipitation and elevation (temperature) compared to the modified map by Friis et al. (2010) and van Breugel et al. (2015). Using more accurate precipitation and elevation (temperature) data with a higher spatial resolution, or data describing seasonal changes, might yield even better results. One main limiting factor is the spatial resolution, especially of the precipitation data. Using more exact precipitation regimes and their patterns across Ethiopia in future model runs would probably result in more accurate localization of plant associations. The membership functions could be re-modeled to represent the real-world behavior of plant associations more precisely. Possibly a linear decline does not properly describe membership values toward the edges of a set. Functions with smooth transitions between membership values, like sinusoidal (see Figure 1) or Gaussian functions, might be more appropriate to describe vegetation in fuzzy terms. Moreover, the core of the set could be better defined in a data-driven approach using a less arbitrary indicator than the aforementioned spans of suitability for plant growth. Alternatively, instead of a data-driven approach, field or laboratory observations of the tolerance ranges of the given plant associations could be used to describe the behavior of plant associations more accurately.
Given the reliable functionality of the fuzzy model, it can be easily enhanced by adding more membership functions besides precipitation and elevation (temperature). Functions describing the affiliation of plant associations to soil, wind patterns, exposure to sunlight, etc. might yield even more reliable results. Finally, the training areas could be improved. They were chosen for pragmatic reasons, but could also be determined by field observations in possibly undisturbed environments.

Conclusion
Our Potential Natural Vegetation Map of Ethiopia based on fuzzy logic is about 90% similar to the field-based potential vegetation map of Ethiopia (Friis et al., 2010; van Breugel et al., 2015) and can therefore be used as the first version of a new tool to describe the general vegetation cover in the country. The fuzzy-modeled map can be improved with higher-resolution and more accurate precipitation and temperature data. Nevertheless, even with the available coarse climate parameter data, our model has shown that it can be useful for characterizing late Quaternary vegetation regimes. During the drier and colder LGM, desert vegetation spread in the east and NE of Ethiopia, while 'Bush and Grassland' was predominant in the middle-high elevated regions north of the central rift and the forest appeared in more westerly positions. The higher mountains showed more Ericaceous and Afroalpine vegetation, topped by periglacial and glacial environments on the highest peaks. During the much wetter and slightly warmer Last Interglacial, these last two features disappeared from the highest peaks, and forest vegetation shrank while the Ericaceous belt spread. 'Bush and Grassland' appeared as the dominant vegetation form over much of the country, covering parts of the areas in the east where forest dominates today, broadly covering the rift, and spreading into some of today's desert regions in the east and NE, offering corridors with open vegetation to the Red Sea and the Horn of Africa.
The resulting maps can be used in a wide range of disciplines, such as paleoecology, paleogeography, and paleoanthropology, because they reveal insights into the potential spread of vegetation, which influences the livelihoods and activities of human beings in these areas.

Software
For modeling the potential vegetation with the fuzzy method, we used the Python interface of ArcGIS Desktop 10.5. We did the map design, layout, typography, and additional graphics in Adobe Illustrator CS6.