Drivers of regional crop variability in Chad



Introduction
Agricultural systems across the Sudano-Sahelian zone of Africa often display considerable interannual variability in output, which is often ascribed to unstable environmental conditions, usually rainfall (FAO, 2016a; Leroux et al., 2016). The relationships between rainfed crop output and environmental conditions in this zone have been explored extensively on different spatial scales, usually by focusing on statistical correlations between crop yields and remote sensing atmospheric and soil variables, but these studies often leave a large part of the crop variability unexplained (Kamali et al., 2018; McNally et al., 2015; Traoré et al., 2011). Adding to the difficulty of capturing the complex relationships between water availability, nutrients, and crop growth under changing agro-ecological conditions, the limited explanatory capacity is commonly attributed to a lack of data on omitted factors that are relevant for crop variability (Louise et al., 2015). Examples of such factors are pests, technological innovations, fertilizers, farm incomes, labour and land availability, development projects, political programs, conflicts, population dynamics, and market prices (Hazell and Wood, 2008; Mertz et al., 2010; Nelson et al., 2016; Ouédraogo et al., 2017; van Vliet et al., 2013). One reason for the omission of these factors is the dominance of climate change agendas and climate models in setting agricultural research priorities (Whitfield et al., 2015), with its allure of providing results over extended spatiotemporal scales. Such results, however, add little of value for agricultural management in the Sudano-Sahelian zone in the near term, where decision makers are concerned with tangible productivity improvements, and where combinations of socio-economic, political, and climate variability factors are the prime drivers of change.
Moreover, the influences of these factors are strongly connected to the spatial scale of the analysis, and the socio-economic complexity that goes with it (Jahel et al., 2016; Whitfield et al., 2015). This emphasizes a further need for including socio-economic factors when analysing crop variability on spatial scales beyond the farm level.
Crop output on a sub-national (henceforth referred to as regional) scale is crucial for a range of food security metrics and operational tools employed by AGRHYMET, the Food and Agriculture Organization of the United Nations (FAO), the Famine Early Warning Systems Network (FEWS NET), the World Food Programme (WFP), and others (FEWS NET, 2011; IPC Global Partners, 2012; Jones et al., 2013). Understanding the drivers of regional crop variability, however, requires socio-economic data of considerable detail, which is generally lacking across the Sudano-Sahelian zone. The continuous food security monitoring and data collection from these organizations over recent years provide valuable information on the ongoing processes in agricultural systems on regional scales, but their qualitative nature renders comparisons over extended time periods difficult. Coupling socio-economic data from these and other sources with quantitative environmental data in studies of regional crop variability thus faces several methodological issues. Partly due to this, studies on both the farm and national scales have received more attention than regional ones (Hoffman et al., 2018; Jahel et al., 2016; Whitfield et al., 2015; Yengoh, 2013), based on either detailed data from field studies in specific areas, or nationally aggregated datasets.
This paper presents a method under which a broad range of datasets of varied resolution and type can be combined to improve the explanatory and predictive capacity for regional crop production variability, and applies it to an empirical analysis in Chad. By combining an extensive set of agricultural, environmental, and livelihood data spanning 1983-2016, this paper is the most extensive and detailed study of crop production variability in Chad to date. It further hopes to spur similar research on coupled socio-economic and agricultural dynamics on this level of analysis across the Sudano-Sahelian zone.
https://doi.org/10.1016/j.jaridenv.2019.104081 Received 10 April 2019; Received in revised form 14 August 2019; Accepted 30 November 2019

Agricultural systems in Chad
Regional crop dynamics in Chad are known to be highly fluctuating, which is usually ascribed to the high ratio of rainfed and low-input farming, as well as the high variability in atmospheric conditions (World Bank, 2015). Despite its recognized importance for both local livelihoods and national economic development, and continuous efforts to increase and stabilize agricultural productivity (Ministry of Agriculture and Irrigation, 2013), knowledge on both the patterns and drivers of crop variability is sparse, with seasonal production forecasts based foremost on cumulative precipitation and NDVI estimates, and with unevaluated predictive capacities (AGRHYMET, 2017; FEWS NET, 2016). The livelihoods in Chad are furthermore subject to transformations related to population increase, urbanisation, natural gas exploration, economic development, and conflicts both within the country and in the neighboring countries (World Bank, 2015). FEWS NET's livelihood profiles provide a thorough overview of the varying agroecological conditions, population dynamics, economic opportunities, coping strategies, and food security components for the different regions in the country (FEWS NET, 2011). With the currently low agricultural productivity levels, combined with the ongoing rural transformations, and potentially disruptive events such as pests, conflicts, and altered economic opportunities, crop production variability is likely to hold more complexity than what cumulative precipitation and NDVI estimates capture. Furthermore, with widespread differences in agro-ecological conditions and livelihoods within the country, both the patterns and the drivers of regional crop production variability differ accordingly.
Only two previous studies have explored the quantitative correlations between environmental variables and crop production variability in Chad, and only by including hydro-climatic and soil variables: one for national millet yields for 2001-2010 (McNally et al., 2015), and another for sub-regional harvested area and yield in the Lac region around Lake Chad for 1988-2012 (Nilsson et al., 2016). Reports on agricultural performance and food security conditions are however plentiful, both for specific areas and points in time (INSEED, 2012, 1993; Republic of Chad, 2009; WFP, 2013, 2009, 2005), as well as in the continuous publications of FEWS NET's food security updates (e.g. FEWS NET, 2010). Moreover, regional crop statistics and market prices form the basis of much of the operative food security work, but have yet to be combined over extended time series to evaluate patterns of covariation and quantified causalities, which this study addresses.

Methods & data
The methodology was developed to analyze drivers of interannual variability in the agricultural systems on a regional level of analysis. Data were collected from crop statistics, key atmospheric and soil variables, market prices, livelihood conditions, and food security classifications. Due to the varying resolutions and types of data used, as well as the small sample sizes, both qualitative and quantitative methods were applied. The quantitative methods formed the basis of the analysis, while the qualitative methods were used to validate the quantitative analysis, add additional detail, and to infer on explanations for deviating observations and patterns.

Crop data
Annual harvested area and production of the three main rainfed crops (maize, millet, and sorghum) for 1983-2016 on regional levels were acquired from the Direction de la Production et des Statistiques Agricoles in Chad (DPSA), which has consistently been collecting agricultural data through its regional and sub-regional offices (DPSA, 2017a; INSEED, 2006). The yield was derived by dividing the production by the harvested area, which distinguishes it from a yield related to the planted area. While the harvested area, the yield, and the production all influence the dynamics of economic development, the production is arguably of most importance for food security and was the main focus of the analysis. The harvested area and yield were included to the extent that they could advance the understanding of the dynamics in the production data. As the administrative regions in Chad have changed over this time period, the regional division with the lowest shared detail was used (Fig. 1), which is the one used by DPSA for the period 1998-2009 (DPSA, 2017b). With up to three rainfed crops analyzed in each of the 13 regions, this resulted in 37 crop production variables, as no maize production was reported for Batha and Biltine. Errors and inconsistencies in the crop statistics are expected to be prevalent (FAO, 2013; INSEED, 2006; Ronelyambaye, 2015; World Bank, 2017), which justifies aiming for generalized conclusions. At the same time, their continuous use in operational food security work improves their reliability (see e.g. FEWS NET, 2000). The locations of the cultivated areas of maize, millet, and sorghum were taken from the Earthstat dataset (Fig. 1; Monfreda et al., 2008), which gives the average fraction of hectares under cultivation between 1997 and 2003 for each crop and grid cell (at 0.1°).
Fig. 1. Distribution of maize, millet, and sorghum crops as given in the Earthstat dataset (Monfreda et al., 2008), together with the regional divisions used in this study. Note that this is not the current administrative regional division in Chad.
These fractions were used to ascribe weights to each grid cell, to calculate weighted averages of the atmospheric and soil water variables for each crop and region. Due to ongoing transformations in the agricultural systems across Chad, the extent and location of cultivated areas have changed over the studied time period. However, as such expansions are likely to occur adjacent to current areas, the atmospheric conditions are likely to remain similar. The start of the growing season for each crop and region was estimated based on crop calendars in FEWS NET's livelihood profiles for 2005 and 2011 (FEWS NET, 2011, 2005). To include effects of changing atmospheric growing conditions over the studied period, precipitation thresholds were identified based on the given crop calendars at the year of the livelihood profiles for each crop and region, which were used to define the start of the growing seasons for all other years.
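The crop-fraction weighting described above can be illustrated with a minimal sketch. The function name, the handling of missing cells, and the sample numbers are assumptions made for illustration only; they are not taken from the study.

```python
def weighted_regional_mean(values, crop_fractions):
    """Average gridded values over one region, weighting each grid cell
    by the fraction of the cell under the crop.

    Cells with a missing value (None) or zero crop fraction are skipped
    (an assumption of this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for value, weight in zip(values, crop_fractions):
        if value is None or weight <= 0:
            continue
        numerator += weight * value
        denominator += weight
    if denominator == 0:
        raise ValueError("no cultivated grid cells with data in this region")
    return numerator / denominator


# Hypothetical seasonal precipitation (mm) in four 0.1 degree cells of one region
precip = [300.0, 420.0, None, 510.0]
# Hypothetical millet fractions for the same cells
millet_fraction = [0.10, 0.30, 0.25, 0.00]
print(weighted_regional_mean(precip, millet_fraction))
```

Cells without the crop contribute nothing to the regional value, so the average reflects conditions where the crop is actually grown rather than over the whole region.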

Detrending and filtering of crop statistics
The crop data were detrended to remove the influence of factors that were not included in the analysis, both in the mean values and the variability. Examples of causes of trends in the mean values are population increase, farm inputs and market incentives, while trends in the variability could be due to increased specialization, expansion into marginal lands, and altered seasonal water availability. As detrending risks removing the influence of the drivers one wishes to evaluate, the drivers were also detrended, which further focuses the analysis on the correlated patterns of variability between the drivers and response variables. Due to the uncertainty involved in detrending under conditions of multifaceted and dynamic drivers, and as previous research has shown that abrupt and structural breaks are common in the crop statistics in Chad (Nilsson and Uvo, 2018), two different detrending methods were used. One solely used moving averages of 5 years as the basis of the detrending, while the other combined this with a breakpoint analysis, where moving averages were applied within the subsamples between the breakpoints. The breakpoint methodology used the Wald test (see e.g. Andrews, 1993), based on linear robust regressions with Huber's maximum likelihood estimator (see e.g. Huber and Ronchetti, 2009; Stuart, 2011), and is presented in detail in Nilsson and Uvo (2018). With two trend dimensions per variable, i.e. mean and variability, this resulted in four different detrended data series per crop variable: Moving Average (MA), Moving Average with variability (MA + var.), Breakpoint (BP), and Breakpoint with variability (BP + var.). A data error filter was further applied to remove unreasonable values, which were defined as values exceeding two standard deviations within moving windows of five data points, in the detrended datasets.
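A minimal sketch of the moving-average detrending and the error filter follows. The text specifies only the 5-year window and the two-standard-deviation rule, so several details here are assumptions: the window is centered and shrinks at the series edges, the trend is subtracted rather than divided out, and the filter compares each value with the mean and standard deviation of its four neighbours (a value included in its own 5-point window can never deviate by more than two population standard deviations). Breakpoints and the variability trend are not covered.

```python
import statistics


def moving_average(series, window=5):
    """Centered moving average; the window shrinks at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def detrend_mean(series, window=5):
    """Remove the trend in the mean by subtracting the moving average."""
    trend = moving_average(series, window)
    return [x - t for x, t in zip(series, trend)]


def flag_outliers(residuals, window=5, n_std=2.0):
    """Flag values that deviate from their window neighbours' mean by more
    than n_std neighbour standard deviations. Values without a complete
    window around them are left unflagged (assumption)."""
    half = window // 2
    flags = []
    for i, x in enumerate(residuals):
        lo, hi = i - half, i + half + 1
        if lo < 0 or hi > len(residuals):
            flags.append(False)
            continue
        neighbours = residuals[lo:i] + residuals[i + 1:hi]
        mu = statistics.mean(neighbours)
        sd = statistics.pstdev(neighbours)
        flags.append(sd > 0 and abs(x - mu) > n_std * sd)
    return flags


# Hypothetical detrended production anomalies with one suspicious value
anomalies = [0.1, -0.1, 0.0, 5.0, 0.1, -0.1, 0.0]
print(flag_outliers(anomalies))
```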

Environmental data for crop-water relationships
The environmental impact on crop production variability was assessed based on precipitation, crop evapotranspiration, and crop water deficits over each growing season. With considerable uncertainties in the datasets and methods, as well as lack of information on specific crop and growing conditions, a range of potentially influential environmental variables were evaluated to identify the one with the highest overall correlative performance against the crop production. Crop evapotranspiration, crop water deficits, and subsequent effects on the crops were estimated based on FAO's methods for yield response to water (Allen et al., 1998; Doorenbos and Kassam, 1979; Steduto et al., 2012). Crop-specific information needed for these calculations was taken for generic crop varieties for arid conditions as presented in key FAO references (Table 1). Crop characteristics will however vary according to the specific crop varieties used in the different regions in Chad, which generally are composite varieties for maize, open pollinated varieties for millet, and a mixture of pure lineage and improved population varieties for sorghum (CEDEAO et al., 2016; FAO, 2012). However, due to a lack of information on the spatiotemporal distribution of the respective varieties, and on the crop characteristics required to complete the calculations, the generic crop characteristics in Table 1 were used for all regions and years.
Seasonal precipitation, crop evapotranspiration and crop water deficits per region were calculated as weighted averages based on a 0.1° spatial resolution and relative weights according to the previously described crop maps. A combination of remote sensing, modelled, and reanalysis datasets was used for the environmental data needed for these calculations (Table 2).
For datasets with coarser spatial resolution than 0.1°, the grid cell center closest to the 0.1° grid cell was selected, thus assuming homogeneity within each grid cell irrespective of resolution. Two data sources were used to drive the crop water availability calculations: precipitation from ARC2 (Novella and Thiaw, 2013), and satellite based soil moisture estimates from ESA CCI SM v03.2 (EODC, 2017; Liu et al., 2012, 2011; Wagner et al., 2012). The ARC2 dataset was developed specifically for the operational use of FEWS NET in Africa, while its high resolution and extensive coverage period also make it suitable for research activities (Novella and Thiaw, 2013). The ESA CCI SM v03.2 dataset has shown promising validation results (Liu et al., 2012), and improvements over atmospherically driven soil water estimates related to crop yields in the Sahel (McNally et al., 2015). A potential advantage of remote sensing soil moisture data is that they include all sources of water input to the soil, e.g. runoff and irrigation, which are usually neglected in the atmospherically driven estimates. Shortcomings, on top of measurement uncertainties, mostly stem from the fact that only the soil moisture in the topsoil is measured (EODC, 2017). Due to uncertainties in the input data and few validation options, three commonly applied methods were used to estimate the seasonal crop water availability in the root zone (Fig. 2). First, the daily water availability was estimated with a one-size soil bucket model with ARC2 precipitation as input, evapotranspiration demand estimated from FAO's crop soil-water model (Allen et al., 1998; Steduto et al., 2012), and gridded root zone soil water holding capacity from the Global Soil Hydraulic Properties dataset (Hengl et al., 2014; Montzka et al., 2017; Schaap et al., 2001).
Secondly, a combination of the ESA estimated topsoil water content and the ARC2 model was employed, where the daily root zone water variability was taken from the variability of the ESA data and unified to the seasonal mean and standard deviation of the ARC2 model. The assumptions underlying this were that the variability in the ESA data reflects the variability in the full root zone, while actual levels and amplitudes of the root zone availability were more accurately described by the ARC2 model. Thirdly, a layered bucket model with water transfer between the layers was developed by using the ESA estimated topsoil water content combined with layered soil parameters from the Global Soil Hydraulic dataset. Water transfers at each time step were calculated according to the layers' water content, unsaturated hydraulic conductivity, and pressure differences, where the topsoil water content was set to the ESA estimate. As sequences of missing data were common for the remote sensing data, daily sequences of missing data in grid cells in the ARC2 and ESA data of up to five days were replaced with linearly interpolated values, while grid cells with missing sequences longer than this were discarded from the analysis.
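To make the first of the three approaches concrete, the sketch below runs a daily single-store soil water balance for one grid cell. It is a simplification under stated assumptions, not the model used in the study: water above the holding capacity is lost immediately, actual evapotranspiration equals the crop demand whenever enough water is stored, and no stress reduction is applied before the store is empty. All names and numbers are hypothetical.

```python
def bucket_model(precip, et_demand, capacity, initial=0.0):
    """Daily single-store soil water balance; all quantities in mm.

    precip    -- daily precipitation
    et_demand -- daily crop evapotranspiration demand
    capacity  -- root zone water holding capacity
    Returns (storage, actual_et) as lists with one value per day.
    """
    storage, actual_et = [], []
    stored = initial
    for rain, demand in zip(precip, et_demand):
        stored = min(stored + rain, capacity)  # excess is lost (assumption)
        supplied = min(demand, stored)         # ET limited by stored water
        stored -= supplied
        storage.append(stored)
        actual_et.append(supplied)
    return storage, actual_et


# Hypothetical three-day sequence: one rain event followed by dry days
rain = [10.0, 0.0, 0.0]
demand = [4.0, 4.0, 4.0]
storage, actual_et = bucket_model(rain, demand, capacity=50.0)

# Relative seasonal evapotranspiration deficit, the quantity that a
# yield response calculation would take as input
deficit = 1.0 - sum(actual_et) / sum(demand)
print(storage, actual_et, deficit)
```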
The effect of crop water deficits in specific growth stages on the yield was estimated by yield response factors and yield response relationships (Doorenbos and Kassam, 1979). The estimates of these factors as given by Doorenbos and Kassam (1979) are based on experimental results with high performing crop varieties, and are set to be valid for daily evapotranspiration deficits of up to 50%. Yield response factors and relationships are furthermore known to vary widely between crop varieties and growing conditions, and several studies have argued for applying site specific relationships (Greaves and Wang, 2017). Due to lack of more precise information on the yield responses in the studied regions, both the additive and multiplicative yield reduction equations were applied (Doorenbos and Kassam, 1979; Garg and Dadhich, 2014; Jensen, 1968). To address uncertainties in the yield response factors under conditions of evapotranspiration deficits above the set limit of 50%, a new set of adjusted yield response factors for each region and crop was calculated based on constrained regressions to the respective crop variable within a range of ± 50% of the original yield response factors. An issue with this approach is that the resulting adjusted yield response factors are overfitted to the specific crop variables, while the original yield response factors might not be applicable to the specific growing conditions. Due to the uncertainties involved in the yield reduction estimates, total seasonal precipitation and total seasonal evapotranspiration were included as less detailed, but possibly more reliable, water availability determinants. In total, this resulted in 16 water availability determinants (Table 3).
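The difference between the additive and multiplicative forms can be sketched as follows. The stage-wise formulations shown are one common way of writing the Doorenbos and Kassam (1979) relationship, 1 - Ya/Ym = Ky (1 - ETa/ETm); the exact equations applied in the study are not reproduced here, and the yield response factors and evapotranspiration values are hypothetical.

```python
def relative_yield_additive(ky, eta, etm):
    """Relative yield Ya/Ym with stage-wise yield losses summed.

    ky  -- yield response factor per growth stage
    eta -- actual evapotranspiration per stage
    etm -- maximum (unstressed) evapotranspiration per stage
    """
    loss = sum(k * (1.0 - a / m) for k, a, m in zip(ky, eta, etm))
    return max(0.0, 1.0 - loss)


def relative_yield_multiplicative(ky, eta, etm):
    """Relative yield Ya/Ym with stage-wise yield fractions multiplied."""
    relative_yield = 1.0
    for k, a, m in zip(ky, eta, etm):
        relative_yield *= max(0.0, 1.0 - k * (1.0 - a / m))
    return relative_yield


# Hypothetical two-stage season: mild deficit early, severe deficit late
ky = [0.5, 1.0]
eta = [80.0, 50.0]
etm = [100.0, 100.0]
print(relative_yield_additive(ky, eta, etm))
print(relative_yield_multiplicative(ky, eta, etm))
```

With deficits in more than one stage the additive form gives the lower relative yield, since the multiplicative form applies each stage's loss only to the yield remaining after the previous stages.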
The precipitation, relative humidity and temperature datasets were verified against monthly observations from stations in Chad operated by the Direction des Ressources en Eau et de la Météorologie (DREM) for the period 1998-2013 (DREM, 2014), with 8 stations for precipitation and 3 stations for temperature and relative humidity. The average Spearman correlation coefficient to the nearest four grid cells of each observation station was 0.88 for the ARC2 data, 0.89 for ERA Interim Temperature, and 0.95 for ERA Interim Relative Humidity, which were considered acceptable validations of the remote sensing atmospheric data. Due to data limitations, additional soil characteristics such as soil salinity, soil nutrients, and fertilizer applications were not included in the analysis. Data on irrigation were also lacking from the analysis, and while it is known to be rare in the Sahelian parts of Chad, it is applied extensively in some parts of the Soudanian zone (FEWS NET, 2011). Irrigation in the Soudanian zones is however usually confined to certain crops, mainly rice, and might thus be of limited importance for the analysis of variability in the rainfed crops considered in this study.
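The station comparison above can be sketched as a rank correlation averaged over the four nearest grid cells. This is an illustrative implementation only: the tie handling (average ranks) and all sample values are assumptions, and the station and grid series are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical monthly station precipitation and four nearby grid cell series
station = [0.0, 5.0, 40.0, 120.0, 210.0, 90.0]
cells = [
    [1.0, 3.0, 35.0, 110.0, 190.0, 95.0],
    [0.0, 8.0, 50.0, 100.0, 230.0, 70.0],
    [2.0, 1.0, 30.0, 130.0, 200.0, 85.0],
    [0.5, 6.0, 45.0, 140.0, 180.0, 60.0],
]
mean_rho = sum(spearman(station, cell) for cell in cells) / len(cells)
print(mean_rho)
```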

Food security reports and market prices
To assess the role of socio-economic factors in the crop production variability, data provided by food security reports from FEWS NET were included. These reports are available on a monthly basis from 2000 (FEWS NET, 2018), and cover events and conditions relevant to the operational activities of food security actors in Chad. Aspects covered by these reports generally include: food security conditions, market prices, conflicts both in Chad and in neighboring countries, population movements, trade flows, employment opportunities, pests, and political developments. The extent, detail, and temporal consistency in these reports provide a wealth of information about livelihood changes in Chad. The reporting is mostly qualitative in nature, with few quantifiable parameters, and differs noticeably across the studied time period, both in coverage and in detail. Initially, data taken from the food security reports every 3 months for 2000-2016 were sorted according to the region, month, and livelihood factor they concerned. The reporting is done on different spatial scales, and to accurately relate the data to the concerned regions, all data within each livelihood zone, as defined by FEWS NET (2011), were assigned to the respective regions used in this study (Fig. 1), while data on broader spatial scales than this were excluded. This initial sorting and categorization was done in MAXQDA, and later imported as a database to MATLAB for further analysis. From here on, three strands of approaches were developed to explore the explanatory capacity of this database to the crop output.
First, already quantified categories (food security classifications and market prices) and categories that were consistently reported on in the food security reports were set as "Livelihood Inclusion Determinants" (Table 4). Pests, floods, and conflicts were quantified for each growing season by ascribing intensity scores from 1 to 5 according to their qualitative descriptions. All information in these data categories from up to 9 months preceding the estimated harvest date for each region and crop was averaged and assigned to each growing season.
Quantifying a large set of qualitative data with relative scores has the advantage of being applicable to statistical analysis, while a noteworthy drawback is the uncertainty in the estimated scores, especially as the descriptions and detail may vary considerably across time series. To reduce the uncertainty in these estimates, categorical values (present/not present) were also assigned to these categories. Monthly market prices for the included crops were collected from reports and online databases from FAO, FEWS NET, and INSEED for the time period 1990-2016, and deflated with the World Bank Consumer Price Index (FAO, 2016b; FEWS NET, 2002, 2001, 1997; INSEED, 1999, 1994; World Bank, 2017b). Market prices from these sources were available for eight main cereal markets: Abéchè, Bol, Mao, Mongo, Moundou, Moussoro, N'Djamena, and Sarh. Trade routes and market connections for each region were taken from FEWS NET's livelihood profiles (FEWS NET, 2011), which were used to connect each region to its primary market from this list. Average market prices at the primary market over 6 months preceding the estimated planting dates were included as a determinant in the statistical analysis. A general assumption underlying this approach is that market prices preceding the planting date of each season influence farmers' agricultural strategies, by focusing on certain crops, altering the planted area, investing in farm inputs, or diversifying into other livelihoods. However, market prices can also be connected to altered access to food and farm goods, which further complicates the potential relationships between market prices and crop production.
Table 1 notes: a Allen et al. (1998). b Doorenbos and Pruitt (1977). c Doorenbos and Kassam (1979). d FAO, 2009. e Describes the relationship between crop evapotranspiration to reference evapotranspiration when water is not a limitation. Coefficients for stage 2 are interpolated on a daily basis.
E. Nilsson, et al. Journal of Arid Environments 175 (2020) 104081
Secondly, certain events and interventions can have substantial and unique effects on the crop production, such as development projects and conflicts, which poses issues for statistical evaluation. If such effects are apparent and consistent over at least a small number of years, they could partly be detected and evaluated through a trend and breakpoint analysis, or through inclusion as categorical variables. If any such effect on the other hand is transient and inconsistent, it would not be captured by either of these methods, and left unaddressed, it could distort the regression analysis for the whole sample. Another strand of analysis based on the food security reports was thus added to evaluate the effect of excluding years with potentially distortive events. The groups of events listed in Table 5, if preceding the estimated harvest date by less than 9 months, were included as "Livelihood Exclusion Determinants" to this end. A drawback with this approach is that any established explanatory capacity with excluded years will only be valid for a subgroup of the whole sample, but this could still be a considerable improvement over a lower explanatory capacity for the full sample.
Thirdly, the created database with categorized and sorted qualitative data from the food security reports was used to complement the statistical analysis by providing qualitative descriptions of the livelihood conditions for selected regions and years. As the information concerning each region and growing season was taken from food security reports spanning up to two years, simply categorizing and sorting all this information into a searchable database provided valuable analytical advantages compared to the original report structure.
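The market price determinant described earlier in this section (monthly prices deflated with a consumer price index, then averaged over the 6 months before planting) can be sketched as below. The function names, the index base of 100, and the price series are assumptions for illustration only.

```python
def deflate(nominal_prices, cpi, base_cpi=100.0):
    """Convert nominal prices to real prices at the CPI base level."""
    return [price * base_cpi / index for price, index in zip(nominal_prices, cpi)]


def preplanting_mean(real_prices, planting_index, months=6):
    """Mean real price over the `months` months before the planting month.

    planting_index -- position of the planting month in the monthly series;
                      that month itself is not included.
    """
    window = real_prices[max(0, planting_index - months):planting_index]
    if not window:
        raise ValueError("no price observations before the planting month")
    return sum(window) / len(window)


# Hypothetical monthly millet prices at a region's primary market and a
# matching consumer price index
nominal = [200.0, 210.0, 220.0, 230.0, 240.0, 250.0, 260.0, 270.0]
cpi = [100.0, 100.0, 105.0, 105.0, 110.0, 110.0, 115.0, 115.0]
real = deflate(nominal, cpi)

# Planting assumed to take place in the month at index 7
print(preplanting_mean(real, planting_index=7))
```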

Statistical analysis
The statistical analysis was based on regression analyses and sought to evaluate the explanatory capacity of combinations of the water and the livelihood determinants (Tables 3-5) to the crop production (i.e. response variables). It aimed to identify the combination of determinants and detrending method that provided the highest explanatory capacity to the variability in the full crop production dataset, consisting of 37 crop production variables. The explanatory capacities of the determinants to the harvested area and yield were evaluated as an alternative to the production, where the respective explanatory capacities were weighted and combined according to their relative correlation to the production. For regions and crops where the determinants had low explanatory capacity to the production, this combined analysis of the harvested area and the yield could add an additional, if fragmented, understanding of the dynamics in the production. From the set of determinants, regression models were created and fitted to the response variables through constrained multivariate least-squares linear regressions. The water determinants formed the basis of the regression models, and were evaluated both separately and jointly with all combinations of the livelihood determinants. As the effects of the evaluated livelihood factors were expected to differ considerably between the regions and crops, contrary to the effect of water availability which should be more constant, the methodology focused on finding the water determinant that in combination with a group of livelihood determinants provided the highest overall explanatory capacity to the crop production dataset. As the livelihood data were only available from 2000, only water determinants were evaluated for the 1983-2016 period, to compare their performance over different time periods.
With a large set of combinations of determinants explored for each response variable, low degrees of freedom, and uncertainties in the data, the statistical analysis needed to be conducted with constraints, and performance evaluations adapted to these conditions. Positive or negative constraint limits for the regression coefficients were thus set to comply with the expected relationships for the water and Livelihood Inclusion determinants to the response variables (Table 6). A minimum of eight degrees of freedom was deemed to be sufficient to identify reliable correlations, while acknowledging the sparse temporal extent of several of the datasets. Regression performances were evaluated by leave-one-out cross validated root mean squared errors (RMSE) (see e.g. Taylor et al., 1984). Significance levels were estimated based on coefficients of determination (R²) from 1000 pair-wise randomized bootstrap iterations (see e.g. Fox, 2016). Only regression models significant at the 0.05 level were selected for the results summary. Due to the low sample size and expected uncertainties in the data sets, only linear relationships were evaluated, and no interaction terms were included in the regression models.
Table 2. Environmental data for seasonal water availability estimates.
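The evaluation steps in this section can be illustrated with a short sketch using NumPy. It is a simplified stand-in rather than the procedure used in the study: the regression is ordinary least squares without the sign constraints of Table 6, and the significance test shuffles the response relative to the predictors, which is one possible reading of "pair-wise randomized" and should be treated as an assumption. The data are synthetic.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y = b0 + X @ b (intercept first)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


def loo_rmse(X, y):
    """Leave-one-out cross validated root mean squared error."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_ols(X[keep], y[keep])
        errors.append(y[i] - predict(coef, X[i:i + 1])[0])
    return float(np.sqrt(np.mean(np.square(errors))))


def r_squared(X, y):
    residuals = y - predict(fit_ols(X, y), X)
    return float(1.0 - residuals @ residuals / np.sum((y - y.mean()) ** 2))


def randomization_p_value(X, y, n_iter=1000, seed=0):
    """Share of randomized pairings whose R^2 reaches the observed R^2."""
    rng = np.random.default_rng(seed)
    observed = r_squared(X, y)
    null = np.array([r_squared(X, rng.permutation(y)) for _ in range(n_iter)])
    return float(np.mean(null >= observed))


# Synthetic example: 12 seasons, one water determinant, noisy linear response
rng = np.random.default_rng(1)
water = np.linspace(0.2, 1.0, 12).reshape(-1, 1)
production = 3.0 * water[:, 0] + rng.normal(0.0, 0.1, size=12)

print(loo_rmse(water, production))
print(randomization_p_value(water, production))
```

With only a handful of seasons per variable, the leave-one-out error is a more conservative measure than the in-sample fit, which is why the two are computed separately here.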

Results
The results section initially presents an evaluation of the explanatory capacities of the combinations of determinants and detrending method to the full crop production dataset. The performance of the selected determinant combination and detrending method for a set of subgroups is further evaluated, as is the selection of livelihood determinants. Finally, a combined livelihood data and water determinant analysis is provided for the crop production variables where the cross-validated R² is above 0.5.

Summarized determinant combination performances
The summed significant cross-validated R² to all of the 37 crop production variables is presented for each detrending method and determinant category in Fig. 3. For the determinant categories with livelihood data (yellow, purple, and green lines), the livelihood determinants that in combination with each water determinant give the highest cross-validated R² are selected. Contributions from the analysis of the harvested area and the yield are included, as previously described. As the rate of significant variables is low for all of the determinant combinations, the R² values presented here are low, and should only be interpreted in relation to each other. The improved explanatory capacity with addition of the livelihood data in the determinant categories (moving to the right in the legend) is clear from Fig. 3, with a mean relative improvement in summed cross-validated R² of 286% between only using water determinants (red lines) and combining water determinants with both Livelihood Inclusion and Exclusion determinants (green lines). Addition of the Livelihood Inclusion determinants (yellow lines) generally outperforms addition of the Livelihood Exclusion determinants (purple lines). By combining both of them, the explanatory capacity improves considerably, foremost for the "BP + var." detrending method (including breakpoints for both the mean and variability). As the regression models evaluate a large set of determinants for each of the livelihood determinant categories, improvements in the cross-validated R² values are expected on purely statistical grounds. A larger number of potential determinant combinations can also explain why the Livelihood Inclusion determinants generally outperform the Exclusion determinants, and why their joint improvement over only using water determinants is larger than the sum of their respective improvements.
For a reliable evaluation of the explanatory capacity of the livelihood determinants to the crop production variability, livelihood determinants with high selection rates in the determinant combinations must be identified, which is done in Section 3.2.1. By only including determinants from the livelihood determinant categories (not presented in Fig. 3), i.e. excluding water determinants altogether, the results are similar to those for the water determinants for 2000-2016, with an average relative increase of 12% in mean summed cross-validated R². The low and comparable separate performances of these determinant categories, and their noteworthy combined improvements, show that they are all correlated to the crop production, and that their joint effects need to be acknowledged. The two periods of water determinants, 1983-2016 and 2000-2016, show similar results with only minor improvements seen for the later period, which indicates that there is no pronounced difference in data quality over the two time periods. The four detrending methods show similar results for all but the joint Livelihood Inclusion and Exclusion determinant category, where on average the two variability-based detrending methods have the highest performances. Here, the "BP + var." detrending method increases the mean summed cross-validated R² by 18, 33, and 54% relative to the three other detrending methods. These differences are in line with the increased analytical detail involved in adding a detrending of the variability, and applying a breakpoint methodology to a dataset known to have structural breaks (Nilsson and Uvo, 2018). It further shows that there are trends in the variability of the crops, and breakpoints in both the mean and the variability.
That only slight differences are seen between the detrending methods for the rest of the determinant categories can be explained by their overall low performance, as additional detrending detail is not adding any considerable improvements in the summed cross-validated R² values. The differences in detrending performances are on the other hand most apparent for the highest performing determinant and determinant category, identified as the ARC2 driven Yield Reduction Additive Adjusted determinant with the Livelihood Inclusion and Exclusion determinant category. Compared to the basic "MA" detrending method, both the "MA + var." and the "BP" have relative improvements of 30% for this determinant and determinant category, while the "BP + var." clearly outperforms the others with a relative improvement of 65%.
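To make the difference between detrending the mean only and detrending both the mean and the variability more concrete, a rough sketch follows. It rests on assumptions not stated in this section: that "MA" denotes a centered moving-average detrending and that "+ var." denotes a rescaling of the residuals by their local spread. The window length of five years and all function names are placeholders, and the breakpoint ("BP") variants are not covered.

```python
import math


def moving_average(series, window=5):
    """Centered moving average; the window shrinks at the series ends."""
    half = window // 2
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - half):min(len(series), i + half + 1)]
        out.append(sum(chunk) / len(chunk))
    return out


def detrend_ma(series, window=5):
    """Assumed 'MA' variant: subtract the moving-average trend of the mean."""
    trend = moving_average(series, window)
    return [value - t for value, t in zip(series, trend)]


def detrend_ma_var(series, window=5):
    """Assumed 'MA + var.' variant: also divide by the local residual spread."""
    resid = detrend_ma(series, window)
    half = window // 2
    out = []
    for i in range(len(resid)):
        chunk = resid[max(0, i - half):min(len(resid), i + half + 1)]
        spread = math.sqrt(sum(r * r for r in chunk) / len(chunk))
        out.append(resid[i] / spread if spread > 0 else 0.0)
    return out
```

With a series whose year-to-year swings grow over time, the second function yields residuals of comparable magnitude across the record, which is the kind of trend in variability that the variability-based detrending is meant to remove.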

Selected determinant combination performance
The ARC2 driven Yield Reduction Additive Adjusted determinant, with the "BP + var." detrending and the joint livelihood determinant category, has a total significant cross-validated R² of 0.193 to the full crop production dataset of 37 variables. Its performance is more clearly understood when reviewing its summed significant cross-validated R² to all the crop categories in the Sahelian and Soudanian zones (Table 7). The highest explanatory capacity is generally found in the Sahelian zone, and primarily to the production variables. Noteworthy results are especially the pronounced difference between the harvested area and the production variables in the Sahelian zone, as well as the lack of any significant regression models for the yield variables in the Soudanian zone. As the production and the harvested area are what is assessed in the fields, with the yield being derived from the two, the higher performance of the production variables points to uncertainties and possible inconsistencies in the data on harvested area and subsequently the yield.
The rates of significant variables in Table 7 show that only a minority of the included crop variables have significant regression models established, with an average of 35% for the production variables, which explains the low values for the summed cross-validated R² as seen against the whole production dataset (0.20 and 0.12). However, the mean cross-validated R² values are much higher within each subgroup's set of significant variables, which is in line with the few other studies conducted on environmentally driven regression models against crop output in Chad (McNally et al., 2015; Nilsson et al., 2016). This further affirms that the best performance is within the production variables, with an average cross-validated R² of 0.50 for 9 production variables in the Sahelian zone, and 0.44 for 4 production variables in the Soudanian zone. The spatial distribution of the explanatory capacity to the production, including contributions from the harvested area and the yield where no significant regression models were established for the production, is presented in Fig. 4. The significant regression models for maize are generally found in the Soudanian zone, while for millet and sorghum they are generally found in the Sahelian zone. These spatial differences are best understood in terms of accuracy of the crop statistics, and regional variations in rainfed farming, as the Sahelian zone is rainfed to a higher degree than the Soudanian zone, which in turn has more extensive irrigation practices (FEWS NET, 2011). Moreover, for the maize variables in Guéra and Moyen-Chari, both with cross-validated R² ≥ 0.50, the regression models excluded the water determinant altogether and only included livelihood determinants, which further points to the decoupling of these crops' variability from the atmospheric water conditions.
The higher proportion of millet production found in the Sahelian zone (DPSA, 2017a; FEWS NET, 2011) could also explain these patterns, as measures for data collection and the responses to varying water conditions both might be more consistent than for the other crops.

Evaluation of livelihood determinants
To get a reliable evaluation of the explanatory capacity of the livelihood determinants, their selection rates in the highest performing regression models are presented in Table 8. As this table only shows which livelihood determinants improved the explanatory capacity of the regression models the most, it does not exclude the possibility that other livelihood determinants had significant correlations, but should serve as an indication of which livelihood determinants are most potent to this end. Seen for the 21 production variables with significant regression models in both zones, the Livelihood Inclusion determinants are selected in 57% of the cases, while the corresponding rate for the Exclusion determinants is 71%. The most frequently selected Inclusion determinants are Market Prices and Food Security Classifications and, for the Exclusion determinants, Conflicts and Agricultural Support. Market Prices and Food Security Classifications were the only Inclusion determinants that were quantitative in their original form, while the rest of the Inclusion determinants were quantified based on the qualitative information in the food security reports. Their high selection frequency points to their relevance, and to the uncertainties involved in quantifying the livelihood data. Moreover, for the group of quantified livelihood determinants, only categorical determinants were selected, and no intensity determinants, which further confirms the uncertainties in the quantification processes.

Combined analysis of variables with high explanatory capacity
Even though the explanatory capacity is considerably improved when involving livelihood determinants, it is still low as seen for the whole dataset (Table 7). For the selected water determinant and determinant category, 16 of the 37 production variables have no significant regression models established, and the mean cross-validated R² for all the significant production variables is 0.36. For specific production variables, as seen in Fig. 4, the cross-validated R² values are however much higher, which can serve as examples of the usefulness of this methodology. With an increasing rate of explained variability for a crop variable, there is also an increasing potential to attribute the unexplained variability to qualitative descriptions of the livelihood conditions for specific years, which can provide a more comprehensive understanding of the drivers of production variability in these systems. Table 9 lists all the production variables with cross-validated R² values above 0.5 together with the selected livelihood determinants, indicating where the evaluated methods perform best.
The utility of this kind of combined analysis can be exemplified with two of the highest performing variables, millet production in Batha (Fig. 5) and Biltine (Fig. 6). Here, high prediction deviations, set to 0.8 standard deviations, are given potential explanations based on the information in FEWS NET's food security reports for the respective growing season.
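The screening of years for qualitative follow-up can be sketched as below. This is an illustration under stated assumptions: the text does not specify which standard deviation the 0.8 threshold refers to, so the sketch uses the population standard deviation of the detrended production series, and the function name `flag_deviations` and its arguments are hypothetical.

```python
import statistics


def flag_deviations(years, observed, predicted, threshold=0.8):
    """Return (year, standardized deviation) for years exceeding the threshold.

    Assumption: deviations are scaled by the standard deviation of the
    detrended observations; the study may have used a different basis.
    """
    spread = statistics.pstdev(observed)
    flagged = []
    for year, obs, pred in zip(years, observed, predicted):
        deviation = (obs - pred) / spread
        if abs(deviation) > threshold:
            flagged.append((year, round(deviation, 2)))
    return flagged
```

The flagged years would then be looked up in the food security reports for the corresponding growing season, with negative values marking production below the regression prediction.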
The effect of including a Livelihood Exclusion determinant in the regression predictions can be seen in Fig. 6. The effect of excluding agricultural support years in Biltine (Fig. 6), for 2007 and 2008, although correctly identifying deviating prediction performances, also shows the uncertainty involved in this methodology, as the resulting deviations go in different directions for the two years. The effects of agricultural support on the crop production on regional scales will depend on a range of factors which cannot be assessed further without additional sample cases and more detailed data. On this level of analysis, as agricultural support is usually triggered by low food security and conflicts, growing seasons with agricultural support recorded could as well be linked to underperformance of the water availability conditions. For operational purposes, the main usage of identifying categorical variables that correlate with prediction deviations, like agricultural support in this case, is first and foremost to give an early indication of deviations from normal production patterns, and secondly to determine their effects. For the additional production variables with high performing regression models established (Table 9), the years with high prediction deviations are given potential explanations in Table 10. The qualitative attributions in this analysis show that negative deviations are generally more robustly established than positive ones (Figs. 5 and 6, and Table 10), which stems from the negative bias in the food security reports, which are more focused on monitoring and averting crises than on optimizing the production systems. Several of the explanations given here were also evaluated but not selected as Inclusion or Exclusion determinants for the respective variables in the statistical analyses, such as Agricultural Support, Floods, and Pests.
The uncertainty involved in the effects of these factors, stemming from their broad descriptions in the food security reports as well as their interaction with other factors, limits their potential to establish any statistically reliable causalities. Using them as potential explanations in qualitative terms holds less explanatory and predictive applicability, but is still able to point to potential relationships with more precision than previous studies.

Discussion
By adding livelihood data to the commonly assessed water availability, the explanatory capacity to crop variability was considerably improved. This improvement was mostly realized through the 57% selection rate of the Livelihood Inclusion determinants, while the 71% selection rate of the Livelihood Exclusion determinants holds less analytical clout due to their unspecified effects and low occurrence rates, but can still be useful in identifying disruptions to normal crop production patterns. The Livelihood Inclusion determinants with the highest selection rates in the regression models, which were Market Prices and Food Security Classifications, were both quantitative variables originally, while quantifying the qualitative information in the food security reports was of less use. Given the broad descriptions used in these reports and the complexity involved in regional agricultural systems, this comes as no surprise, and goes to show that this information foremost lends itself to updates on the food security conditions, rather than for predictive crop variability purposes. The examples of high-performing production variables given in this study also show how qualitative and quantitative datasets can be combined to identify and explain deviating production patterns. Building on these methods and results to further attribute prediction deviations to potential causes using food security reports and other datasets will be able to advance the understanding of the crop production dynamics in Chad. Added detail in the detrending methods was able to further improve the explanatory capacities, where a breakpoint based detrending of both the mean and the variability showed large improvements over the more basic methods. 
The potency of breakpoint methodologies in the trend analysis confirms findings by previous studies that the progressions in these datasets have non-linear elements with abrupt and structural breaks, which need to be accounted for in studies of both long-term trends and drivers of variability (Nilsson and Uvo, 2018). Understanding the patterns and causes of such progressions could improve the precision of the detrending, as well as provide valuable information about ongoing transformations in the agricultural systems.
Despite applying a broad range of datasets and analytical combinations, the majority of the regional crop production variability is left unexplained, with only 21 of the 37 production variables having significant regression models established for the best performing determinant combination. The role of water availability has arguably been explored more exhaustively than the livelihood factors, and builds on established methodologies from a long tradition of crop-water studies. The detailed adjustments of such methodologies for the specific conditions in Chad have however not been established, as the water variables were only crudely validated, and as all of the crop specific factors behind the yield reduction estimates were set according to generic assumptions about crop type and agro-ecological conditions. The set of evaluated water determinants in this study has addressed some of these issues, and found that the precipitation driven Yield Reduction Additive Adjusted determinant had the best overall performance, which outperformed determinants driven by a satellite measured topsoil moisture dataset, the ESA CCI SM v03.2. The relative performance of these two groups of water determinants in the different regions was however not explored, which could be of interest to future studies, as increasing rates of irrigation might improve the performance of satellite measured soil moisture products over precipitation as predictors of crop water uptake. Further selecting and adjusting water availability estimates and crop specific factors can be advanced by categorizing information from governmental institutions and development organizations on various spatial scales, and increasing its accessibility for research projects. Increased validation and calibration potential of such datasets could come through agricultural field trials, but with added costs.
The lack of higher explanatory capacities can further be explained by data quality issues in the crop statistics, with potentially inconsistent data collection methods and coverage. Although the explanatory capacity was similar for the time periods 1983-2016 and 2000-2016, indicating that there are no consistent changes to data quality in any direction, year-to-year changes in data collection could limit the potential of the regression models to capture the crop variability. Besides the already ongoing initiatives to improve data quality and coverage in Chad (see e.g. World Bank, 2017a), accessing and evaluating the subregional crop data that constitute the regional data used here could improve the identification of erroneous data points.
With improved crop statistics and soil water estimates, together with increased detail in the food security reports and similar assessments, new valuable research opportunities would open up. Benefits of such research would come in terms of improved food security assessments, evaluations of rural development projects, and identifications of investment opportunities in the agricultural sector. As information channels are already established for these ends by governmental institutions and development organizations, increasing the quality and quantitative applicability of the collected data, and its accessibility, might be a cost-efficient strategy for food security and rural development purposes.

Conclusion
This study has shown that the explanatory capacities towards crop production variability can be considerably improved by adding livelihood data to the commonly applied environmental datasets, as well as by increasing the detail of the detrending methods. By combining qualitative and quantitative methods, it has shown how a more comprehensive understanding can be achieved in studies of crop variability on regional scales. Several shortcomings and further improvements centered on data coverage and quality have been identified, where collaborations between government institutions, development organizations, and research bodies, in both Chad and other countries in similar development contexts, are set to be fruitful. For the regions in Chad where the developed methodology has the highest performance, the established relationships are of sufficient precision to inform food security assessments and outlooks.

Fig. 5. Detrended millet production in Batha and prediction from the ARC2 driven Yield Reduction Additive Adjusted determinant. Prediction deviations above 0.8 standard deviations are given potential explanations from FEWS NET's food security reports for the respective growing season.

E. Nilsson, et al. Journal of Arid Environments 175 (2020) 104081

Declaration of competing interest
The authors declare that they have no conflict of interest.