Datasets for mapping pastoralist movement patterns and risk zones of Rift Valley fever occurrence

Rift Valley fever (RVF) is a zoonotic disease affecting humans and animals. It is caused by the RVF virus, which is transmitted primarily by Aedes mosquitoes. The data presented in this article provide environmental layers suitable for mapping RVF vector habitat zones and livestock migratory routes. Using species distribution modelling, we used RVF vector occurrence data sampled along livestock migratory routes to identify suitable vector habitats within the study region, which is located in central and north-eastern Kenya. Eleven herds monitored with GPS collars were used to estimate cattle utilization distribution patterns. We used a kernel density estimator to produce utilization contours, where the 0.5 (50%) contour represents core grazing areas and the 0.99 (99%) contour represents the entire home range. The home ranges were overlaid on the vector suitability map to identify risk zones for possible RVF exposure. Integrating livestock movement and vector distribution datasets of high spatial and temporal resolution generates new knowledge of RVF epidemiology and yields spatially explicit risk maps. The results can be used to guide vector control and vaccination strategies for better disease control.



Data accessibility
Provided in this article

Value of the data
Vegetation seasonality, topography, soil types and climatic data can be used to understand ecological characteristics of mosquito habitats as a factor for RVF propagation.
Livestock movement patterns can be used to explore the role of animal movement in RVF propagation.
The datasets can be integrated and used to identify risk zones for RVF and hence improve the effectiveness of intervention strategies against the disease.

Data
This article presents datasets used to map the exposure of pastoralists to RVF vectors along their migratory routes. Fig. 1 shows habitat suitability for RVF vectors overlaid with livestock grazing areas. Table 1 summarizes the datasets used in the study. Fig. 2b shows cattle movement data obtained from 2012 to 2016 from 11 collared herds in Garissa, Tana River and Isiolo counties, herein referred to as the Garissa, Tana River and Isiolo herds, respectively. We collared six Garissa herds between September 2012 and June 2014, while two Tana River and three Isiolo herds were collared from August 2013 to December 2016. The collars transmitted a location every hour during the day, i.e. twelve GPS locations per herd between 6 am and 6 pm. However, the collars failed to transmit on several occasions, either because the animals were out of range of the satellites or because the battery had run down.
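Because the collars were expected to deliver twelve locations per herd per day, transmission failures can be located by counting the fixes received on each day. The following is a minimal sketch of that check, not the procedure used in the study; the function name, the input format (a list of timestamps for one herd) and the example hours are assumptions made for illustration.

```python
from collections import Counter
from datetime import date, datetime, timedelta

# Twelve fixes per herd per day, as stated in the text.
EXPECTED_FIXES_PER_DAY = 12


def incomplete_days(timestamps, start, end):
    """Map each day in [start, end] with missing fixes to the number missing.

    `timestamps` is a list of datetime objects for one herd (hypothetical
    input format). Days with no fixes at all are reported as fully missing.
    """
    counts = Counter(ts.date() for ts in timestamps)
    missing = {}
    day = start
    while day <= end:
        received = counts.get(day, 0)
        if received < EXPECTED_FIXES_PER_DAY:
            missing[day] = EXPECTED_FIXES_PER_DAY - received
        day += timedelta(days=1)
    return missing


# Illustrative data only: a complete day, a partial day, and a silent day.
fixes = [datetime(2013, 8, 1, hour) for hour in range(6, 18)]
fixes += [datetime(2013, 8, 2, hour) for hour in range(6, 11)]
missing = incomplete_days(fixes, date(2013, 8, 1), date(2013, 8, 3))
```

Days flagged in this way can be excluded or down-weighted before estimating utilization distributions, depending on how sensitive the analysis is to uneven sampling.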

Mosquito sampling
The mosquito sampling procedure is described in [5] and [6]. In both studies, more than 100,000 mosquitoes were sampled, belonging to six genera: Aedes, Anopheles, Mansonia, Culex, Aedeomyia and Coquillettidia. Sampling was done during the long (March, April, May) and short (October, November, December) rains, and each sampling site was considered an occurrence point for species distribution modelling, as shown in Fig. 2a.

Environmental layers
We downloaded pre-processed 16-day NDVI and monthly MOD16 Evapotranspiration (ET) time series data for 2001-2015 from the University of Natural Resources and Life Sciences, Vienna portal [7] and the USGS data portal [1], respectively. Fig. 5b shows the soil type map obtained from the Kenya Soil Survey dataset, while elevation data from the 90 m Digital Elevation Model (DEM) of the Shuttle Radar Topography Mission (SRTM) were obtained from the USGS data portal. We also downloaded current climatic conditions from the 1 km AfriClim datasets from the University of York portal [3], as shown in Fig. 6.

Methods
The data variables and methods are summarized in Fig. 7. The vegetation seasonality parameters shown in Fig. 3 were extracted from the NDVI time series using TIMESAT [4]; Jönsson and Eklundh [4] describe the meaning of each seasonality parameter. We conducted a principal component analysis on the ET time series to obtain the data shown in Fig. 4. This reduced data dimensionality while retaining as much of the variability over the observation period as possible by extracting the underlying data structure [8,9]. We extracted the topographic wetness index (TWI) from the 90 m DEM data using SAGA GIS, as shown in Fig. 5a, to identify the steadiness of wetness of the study area [10,11]. The steadiness of wetness of an area is defined by how much the slope and the upstream region contribute to its capacity to retain water at any particular time [12]. We aggregated the seasonality parameters, ET components, TWI, soil type and AfriClim data, herein referred to as environmental layers (Fig. 7), and tested them for multicollinearity using Variance Inflation Factors (VIF) before using them in species distribution modelling.
We used species distribution modelling to map vector habitat suitability. This technique associates the occurrence data with the environmental layers (Fig. 7), so that areas with environmental characteristics similar to those at the sampled sites are identified and projected over the study area [13]. We performed this extrapolation using the MAXENT algorithm with the 68 occurrence points shown in Fig. 2a and the environmental layers shown in Figs. 3-6. We used 70% of the occurrence data to train the model and 30% for model evaluation. Fig. 1 shows the resulting vector habitat suitability map, which had an accuracy of 0.75 measured as the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve.
Fig. 1 also shows the home ranges for the collared herds. These were obtained by generating utilization distributions from the telemetry data shown in Fig. 2b using a Kernel Density Estimator (KDE) [14].
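The 70/30 partition of the occurrence points and the AUC used for model evaluation can be illustrated with a short sketch. It does not reproduce MAXENT itself. It assumes that the fitted model returns a suitability score for each held-out presence point and for each background point, which is a common evaluation setup for presence-only models but is not stated in the text. The function names, the random seed and the scores are hypothetical.

```python
import random


def split_occurrences(points, train_fraction=0.7, seed=0):
    """Randomly split occurrence points into training and evaluation sets."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def auc(presence_scores, background_scores):
    """AUC as the probability that a presence point outscores a background point.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# 68 occurrence points, as in the text; the identifiers are placeholders.
train, test = split_occurrences(list(range(68)))

# Hypothetical suitability scores, for illustration only.
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
```

With 68 points, this split yields 48 training points and 20 evaluation points. An AUC of 0.5 corresponds to a model that ranks presence and background points no better than chance, and 1.0 to perfect ranking.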
The home range is defined as the area traversed by an animal as part of its normal activities, such as food gathering, mating, and caring for the young [15]. Within a given home range (Fig. 1), there are core areas that the animals use more frequently than other areas [16]. The utilization distribution map describes this intensity of use within the home ranges using contour boundaries that define the percentage of space use, where the 50% contour describes the 'core area' and the 99% contour describes the entire home range [17]. The home range map was overlaid on the vector habitat suitability map in a GIS environment to identify risk zones.
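The steps from telemetry points to utilization contours and risk zones can be sketched on a small grid. This is a simplified illustration, not the software or settings used in the study. It assumes a Gaussian kernel with a fixed, user-supplied bandwidth, planar coordinates, and a suitability threshold for the overlay; the text specifies none of these. All function names and values are hypothetical.

```python
import math


def kde_grid(points, xs, ys, bandwidth):
    """Gaussian kernel density on a grid, normalised so that cells sum to 1."""
    density = {}
    for x in xs:
        for y in ys:
            total = 0.0
            for px, py in points:
                sq_dist = (x - px) ** 2 + (y - py) ** 2
                total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
            density[(x, y)] = total
    grand_total = sum(density.values())
    return {cell: value / grand_total for cell, value in density.items()}


def isopleth_cells(ud, level):
    """Smallest set of highest-density cells whose summed use reaches `level`."""
    selected = set()
    cumulative = 0.0
    for cell, value in sorted(ud.items(), key=lambda kv: kv[1], reverse=True):
        if cumulative >= level:
            break
        selected.add(cell)
        cumulative += value
    return selected


def risk_cells(home_range, suitability, threshold):
    """Home-range cells whose vector suitability is at or above `threshold`."""
    return {c for c in home_range if suitability.get(c, 0.0) >= threshold}


# Illustrative GPS fixes clustered around (5, 5) on an 11 x 11 grid.
fixes = [(5, 5), (5, 6), (4, 5), (6, 5), (5, 4)]
ud = kde_grid(fixes, range(11), range(11), bandwidth=1.0)
core_area = isopleth_cells(ud, 0.50)   # 50% contour: core grazing area
home_range = isopleth_cells(ud, 0.99)  # 99% contour: entire home range

# Hypothetical suitability values for two cells; the threshold is an assumption.
suitability = {(5, 5): 0.8, (0, 0): 0.9}
risk = risk_cells(home_range, suitability, threshold=0.5)
```

In this example the cell at (0, 0) is highly suitable for vectors but lies outside the home range, so it is not flagged as a risk zone, whereas (5, 5) is both suitable and heavily used.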