From plot to scale: ex-ante assessment of conservation agriculture in Zambia

This study combined bottom-up and top-down approaches to assess the ex-ante effects of conservation agriculture (CA)-based systems in Zambia considering both biophysical and economic factors and prevailing farm systems characteristics. For continuous maize cropping we compared a CA-based system of no-tillage with crop residue retention to a control system of conventional tillage with crop residue removal. First, we simulated yield effects that were calibrated and evaluated against multiple datasets, including on-farm agronomic trials from two seasons and six sites. Next, we extrapolated our simulations to all maize-growing areas in Zambia using gridded climate and soil datasets. Then simulated yields (in kg ha−1) were combined with economic data from a nationally-representative household survey to construct economic indicators including benefit-cost ratios (based on gross benefits and variable costs both in $ ha−1) that captured the implicit value of crop residues and labor demands. The field scale (per ha) indicators were scaled out using harvested areas as an expansion factor. All indicators were calculated over 3-, 10-, and 20-year simulation periods using an interpolated sequence of historical climate data. Finally, we conducted a spatial farm typology analysis to help understand the spatial variation in our field-scale indicators and provide insights into trade-offs and the suitability of CA-based systems for farmers. Average changes in yield from using CA-based systems (compared with the control) at the district scale ranged from −37% to 70% (average 33%), with a similar range of changes in benefit-cost ratios once economic factors were included, in addition to intra-district yield variability. Combining the changes in benefit-cost ratios with maize harvested area resulted in an average annual change in district-scale net benefit ranging from US $−3.9 to US $9.9 million (with an average of US $1.1 million).
The heterogeneity in biophysical and economic factors meant that provinces ranked differently under biophysical and economic indicators, reinforcing the importance of coupling biophysical and economic approaches. The spatial farm typology analysis highlighted the specific contexts of farmers relevant to the suitability of CA, such as their mineral fertilizer application rates, ownership of livestock, and prevailing soil texture and rainfall.


Introduction
A challenge for society is how to increase agricultural production while simultaneously reducing agriculture's environmental harm (Godfray et al., 2010). The adoption and ongoing use of sustainable agricultural technologies is one component in the quest to meet this challenge. Several stressors have intensified this quest in recent decades, including a rising demand for food (Godfray et al., 2010) and the detrimental effects of some technologies on the environment, such as soil fertility decline, land degradation, and loss of biodiversity (Tilman et al., 2002; Pretty, 2008). Many agricultural technologies currently used by farmers, although yield enhancing, contribute to these detrimental effects (Foley et al., 2011). Therefore, there is a need to consider sustainable agricultural technologies that can contribute to meeting this challenge. The adoption and use of these technologies is connected with both the biophysical and socio-economic conditions in which farmers operate. 1 An ongoing challenge for researchers, and hence for targeting technologies, is how to scale out assessments of technologies that consider both the biophysical and economic conditions of farmers (Dumanski et al., 1998), and that are also relevant to the design of policies that may encourage adoption. This assessment should encompass developing a deeper understanding of the spatial variation in the biophysical and economic effects of technologies and the conditions that may catalyze their scaling out. Scaling out assessments of technologies may help inform policy debates, with scaling out often referring to the spatial diffusion of technologies within the same stakeholder group (Wigboldus et al., 2016). 2 The objective of our study was to present an approach to scaling out technology assessments and illustrate this approach for conservation agriculture (CA)-based systems in Zambia.
Specifically, we examined the ex-ante biophysical and economic effects of CA-based systems for continuous maize (Zea mays L.) cropping in maize-growing areas of Zambia. 3 Our study combined simulation modelling with bottom-up approaches (using data from on-farm agronomic field trials, hereafter trials, in Eastern Province and from household surveys that interviewed farmers) and top-down approaches (using gridded climate and soil datasets). For CA systems, the general factors influencing their suitability for farmers include CA's technical performance related to field-scale yields and economic returns, trade-offs, and social dynamics at the farm and village scale, as well as the different contexts in which farm systems operate. To capture some of these trade-offs and contexts, we extended our assessment from the field scale to the farm scale by conducting a spatial farm typology analysis. The purpose of the typology was to provide insights at a scale beyond the field into context-specific factors underlying the spatial variation in our biophysical and economic indicators, and to highlight trade-offs associated with CA-based systems and their suitability to farmers.
According to Hobbs (2007), CA is based on three principles: 1) minimum or no mechanical soil disturbance; 2) permanent organic soil cover (consisting of a growing crop or a dead mulch of crop residues, hereafter residues); and 3) diversified crop rotations. 4 We used the term "CA-based" system because we only studied the first two principles of CA that relate to no-tillage and soil cover. 5 Evidence from Southern Africa suggests that CA systems generally reduce soil erosion (Thierfelder and Wall, 2009) and increase yields (Thierfelder et al., 2015a; Mafongoya et al., 2016; Ngoma et al., 2015) compared with non-CA systems, with time dynamics influencing the magnitude of yield changes and their cumulative differences. Yet, CA should not be seen as a panacea for all farmer challenges (Giller et al., 2009), as it needs tailoring to the specific contexts of farm systems and environments (Rusinamhodzi et al., 2011).
Ultimately, a mixture of biophysical and economic factors, consistent with farmer needs, constraints, and circumstances, influences CA adoption (Pannell et al., 2014; Corbeels et al., 2014). For example, its adoption could have important implications for labor demands and trade-offs for residues (Valbuena et al., 2015). As a result, there is a need for the development and application of frameworks for scaling out assessments of technologies (Rattalino Edreira et al., 2018). Grassini et al. (2017) proposed a biophysical spatial framework to assess sustainable crop intensification and highlighted the demand for complementing their proposed framework with spatially-explicit economic assessments. They documented the benefits and costs of bottom-up and top-down approaches to scaling out assessments of sustainable crop intensification. The need to couple biophysical factors with economic factors in scaling out assessments is becoming evident, and examples of this coupling are emerging in the literature on yield gaps (van Dijk et al., 2017). The "technology extrapolation domain" framework proposed by Rattalino Edreira et al. (2018) focused on biophysical factors and acknowledged the need to complement the biophysical framework with economic factors that may shape technology adoption.
Different approaches for improving food security, such as climate-smart agriculture and sustainable agriculture intensification, embed technologies (sometimes related to CA) at their core, hence suggesting the need for scaling out assessments of technology potential. Literature also highlights the policy dimensions of technologies such as CA in Zambia (Andersson and D'Souza, 2014). Understanding the spatial variation in the effect of technologies is crucial for guiding location-specific policy debates about technology suitability (Pachico and Fujisaka, 2004; Pretty, 1998).
Our study complements the agronomic literature by illustrating an approach to scaling out technology assessments that combines bottom-up and top-down approaches considering both biophysical and economic factors. We illustrated our approach using CA-based systems in Zambia where we added an economic approach and a spatial farm typology analysis to the already proposed biophysical approaches (Grassini et al., 2017; Rattalino Edreira et al., 2018).
CA practices have been promoted in Zambia since 1996 (Haggblade and Tembo, 2003), and several policy documents suggest that the government supports CA (Andersson and D'Souza, 2014; Whitfield et al., 2015). In terms of uptake of CA, Zulu-Mbata et al. (2016) report, using the 2015 Rural Agricultural Livelihoods Survey (RALS), that 8.8% of smallholder rural households practiced CA in the 2013/14 season, and Arslan et al. (2014) used the Rural Incomes and Livelihoods Surveys to report that 16% of households practiced no-tillage. A range of biophysical and economic conditions exist in Zambia, which influence the suitability of CA for farmers. For example, most of Eastern Province lies in agro-ecological zone (AEZ) 2A. 6 AEZ 2A includes some of Zambia's most fertile soils, and receives, on average, annual rainfall ranging from 800 to 1000 mm, with a growing season between 100 and 140 days (Chikowo, 2016).

Overview
We first collected data from on-farm trials in Zambia's Eastern Province where the CA-based system of no-tillage with residue retention was practiced over multiple maize cropping seasons (hereafter seasons) and compared with a control system of conventional tillage (ridge and furrow tillage or moldboard ploughing) with residue removal. 7 We used a mixture of site-specific trial data and gridded data to parameterize the CERES-Maize model in the Decision Support System for Agrotechnology Transfer (DSSAT) version 4.5 (Jones et al., 2003), a process-based cropping system model. The DSSAT model was calibrated using site-specific grain yields from the trials and was evaluated by comparing province-scale simulated yields with yields from the Central Statistical Office of Zambia (CSO, 2019). Then, we used DSSAT to simulate grain and biomass yields for a CA-based system and a control system in all 5 arc-minute grid cells where records of maize harvests existed (You et al., 2017). 8 Finally, we collated economic data to combine with the simulated yields to calculate economic indicators.
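As a quick check on the grid resolution used for the simulations, the ground distance spanned by a 5 arc-minute cell can be worked out from the Earth's circumference. The sketch below is illustrative only: it assumes a spherical Earth with the WGS84 equatorial radius, and the function name `cell_side_km` is hypothetical rather than part of any tool used in the study.

```python
import math

# WGS84 equatorial radius in km (an assumed reference value, not stated in the text).
EQUATORIAL_RADIUS_KM = 6378.137


def cell_side_km(arc_minutes: float, latitude_deg: float = 0.0) -> float:
    """East-west width of a grid cell of the given angular size at a latitude.

    Uses a spherical approximation: the width shrinks with cos(latitude).
    """
    km_per_degree = 2 * math.pi * EQUATORIAL_RADIUS_KM / 360
    return km_per_degree * (arc_minutes / 60) * math.cos(math.radians(latitude_deg))


equator_side = cell_side_km(5)
print(f"5 arc-minutes at the equator: {equator_side:.2f} km")
```

At the equator this gives roughly 9.28 km, in line with the 9.276 km quoted in the footnote on grid resolution. Under the same approximation, cells are slightly narrower in the east-west direction at Zambia's latitudes.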
Our assessment combined bottom-up and top-down approaches summarized in Fig. 1. The terms "bottom-up" and "top-down" have different interpretations for different researchers. Regarding these terms, Grassini et al. (2017) write "a top-down approach largely based on gridded spatial framework for data on climate, soils, and crop production" and the "bottom-up approach is based on a relatively small number of sites that represent major crop producing regions". In contrast, You et al. (2009) write that top-down approaches "decompose information from coarser scale into its constituents at finer scale" and bottom-up approaches use "information at a finer scale to derive information at a coarser scale". The usage in Grassini et al. (2017) connects to our assessment, which combined the bottom-up and top-down approaches primarily through data and simulation models.

2 Conversely, scaling up often refers to an institutional expansion from grassroots organizations to policymakers, donors, and development institutions to build an enabling environment for change (Douthwaite et al., 2003).
3 The phrase "maize-growing areas" refers to maize harvested areas reported in the Spatial Production Allocation Model (You et al., 2017).
4 The three principles of CA are broadly consistent with the climate-smart agriculture approach.
5 Existing evidence suggests that farmers often use partial adoption (i.e., two principles not three) as an entry point to full adoption (all three principles). Partial adoption is especially relevant if resources and enabling conditions for adoption are limited (Brown et al., 2017).
6 AEZs in Zambia are primarily based on rainfall and soil profiles (De Groote et al., 2014; Thierfelder et al., 2013; Chikowo, 2016). Supplementary Fig. 1 describes Zambia's AEZs.
7 "Conventional" tillage refers, in a manual labor system, to ridge and furrow tillage with ridges formed before the beginning of each season using hand hoes.
8 At sea level on the equator a 5 arc-minute resolution is 9.276 km × 9.276 km.

A.M. Komarek, et al. Agricultural Systems 173 (2019) 504-518

Data
Data for our study came from several sources. For the DSSAT simulations, data primarily came from trials in Eastern Province for the consecutive seasons of 2014/15 and 2015/16 (Section 2.3). The parameterization of DSSAT required data on crop management, climate, and soil profiles. Crop management data sourced from the trials covered cultivars planted, planting dates and densities, harvest dates, fertilization rates and dates, and tillage and residue management. Data on grain yields and biomass were also recorded in the trials and used in the calibration of DSSAT. 9 The DSSAT simulations also relied on climate and soil profile data. Daily rainfall data were recorded at each trial site in both seasons. We obtained other historical climate data for DSSAT parameterization from the Prediction of Worldwide Energy Resource (NASA, 2018) using the GPS coordinates of each site, and this included daily minimum and maximum temperatures and solar radiation. These data were also used in the calibration. To extrapolate beyond the trial sites, we simulated yields in each grid cell using a 30-year sequence of historical interpolated climate data. The sequence was derived from the Pattern Scaling with MarkSim Weather Generator based on historical climate sequences (Jones and Thornton, 2013). MarkSim-generated climate data have been used to examine the effect of crop management on yields in DSSAT or similar models (Pasuquin et al., 2014; Rigolot et al., 2017). We obtained site-specific soil profile data for the parameterization and calibration of DSSAT from SoilGrids, which is a system for automated soil mapping based on soil profile and covariate data (Hengl et al., 2014). For extrapolation across Zambia, we retrieved soil profile data from the Harmonized World Soil Database (FAO et al., 2012), which are at a 30 arc-minute spatial resolution.
Maize grain yield data reported at the province scale by the Central Statistical Office of Zambia from 2010 to 2017 were used to evaluate simulated grain yields. The year 2010 was selected as the start year as this broadly represents the start of Zambia's Farmer Input Support Programme (FISP), the successor of Zambia's Fertilizer Support Programme. Muchinga Province was excluded from the evaluation because only three years of observed yields were available after administrative boundary changes occurred.
Crop management data for the simulations (i.e., for the extrapolation beyond the trial sites) came from the trials and household surveys, including the 2015 RALS (IAPRI, 2015). The RALS is a nationally-representative survey of households drawn using a stratified two-stage sample design, with the first stage involving the selection of Standard Enumeration Areas (SEAs) and the second a systematic and stratified sampling of 20 households per selected SEA. The RALS included 7934 households covering all provinces and 476 SEAs. 10 The RALS agricultural data used here refer to the 2013/14 season.

Fig. 1. Schematic representation of study approaches.

9 In our study, parameterization refers to the process of populating the DSSAT model with parameters, so the model can run. Calibration refers to parameter adjustments of the parameterized model to minimize differences between observed (trial) and simulated (DSSAT) yields. Evaluation refers to comparing simulated yields against yields from a dataset not used in the calibration (statistical-agency data in our case). The word "simulations" refers to extrapolation of the model outside the trial site locations and years. Section 2.4 provides additional details.
10 The SEA is the smallest geographic area with well-defined boundaries identified on census sketch maps.
Data for computing economic indicators were sourced from the RALS. These data included district-scale maize grain price, maize seed price, price and quantity of mineral fertilizer (hereafter fertilizer), and wages. Additional data on labor demands, the implicit value of residues, and herbicide costs mainly came from farmers involved in the trials. Prices and costs were converted from the Zambian Kwacha to US dollars ($) using the official 2013 exchange rate from the Reserve Bank of Zambia that was $ 1 = 5.39 Zambian Kwacha. 11 Grain and residue yields were taken from the DSSAT simulations. Fertilizer quantities in the simulations were specific to individual districts.
Data on herbicide costs and labor demand for maize were recorded in the trial sites using the standardized protocols developed by CIMMYT (1988), and implemented by the government and NGO resident extension officers (sample size details in Section 2.3). These data were further corroborated during monitoring surveys conducted at the end of each season using stratified random sampling (trial host farmer, adopters, dis-adopters, and non-adopters) (IITA et al., 2017). Data on the implicit value of residues were also collected during the monitoring surveys and corroborated through key informant interviews.
We obtained herbicide unit costs for glyphosate (Roundup) using farmer-reported data from Eastern Province and from agro-dealers in Chipata in Eastern Province, Kasama in Northern Province, and Monze in Southern Province. The herbicide cost was $ 13.91 L−1 in Eastern Province, $ 12.06 L−1 in Northern Province, and $ 10.20 L−1 in Southern Province. We allocated these costs to the other study provinces based on similarities in AEZ. Central, Lusaka, and Southern provinces all share similar agro-ecological conditions with Eastern Province. Copperbelt, Luapula, Muchinga, and North-Western provinces all share similar agro-ecological conditions with Northern Province. The farmer-reported data were based on the protocol described in CIMMYT (1988).
Because residue management is one part of CA's second principle, we also used farmer-reported data on the implicit value of residues based on the protocol described in CIMMYT (1988). Farmer-reported residue values were $ 0.04 kg−1 in both Eastern and Northern provinces (CIMMYT, 2018). The value for Western Province was $ 0.05 kg−1, a value that was also reported for Zimuto Communal Area in Zimbabwe, a region sharing similar agro-ecological conditions (i.e., recommendation domain) to Western Province (Thierfelder et al., 2015b). 12 Households may burn residues or use them in multiple other ways including as a soil amendment, livestock feed, fuel for cooking, and roof and fencing material. These competing uses imply residues may have different implicit values depending on how the household uses them. Different approaches exist to assess the value of residues, including direct farmer valuation (such as our farmer-reported values), valuation based on their use as a livestock feed (Komarek et al., 2015), or statistical approaches to estimate their value as a mulch (Berazneva et al., 2018). 13 The values applied in our study are use specific. As with herbicide costs, we allocated the residue price to other provinces based on AEZ, so an implicit value of residue of $ 0.04 kg−1 was used in all provinces apart from Western Province. We set the implicit value of residues equal to the average value reported by farmers who were exposed to CA in the different trial areas and sites, following the protocol described in CIMMYT (1988).
Variable costs we computed included the implicit cost of labor based on labor demands reported in the trials. According to data collected from six sites in Eastern Province from 2012 to 2015 for both manual and animal traction systems, the average labor demand for maize per season was 71 days ha−1 in the control and 36 days ha−1 in the CA-based system. Based on trial data from five sites in Northern Province for 2016/17, we computed the average labor demand at 68 days ha−1 in the control and 36 days ha−1 in the CA-based system (CIMMYT, 2018). Labor demands in Western Province averaged 26 days ha−1 in the control and 33 days ha−1 in the CA-based system, based on trial data with a similar recommendation domain from Zimuto Communal Area in Zimbabwe (Thierfelder et al., 2015b). All labor data were collected using the protocol described in CIMMYT (1988). We extrapolated labor demands to other provinces based on the AEZ, using the same allocation procedure described for herbicides. Animal traction is commonly used in Western Province, resulting in lower labor demands there than those reported for Eastern and Northern provinces where manual tillage methods are more common. We multiplied the labor demands by district-scale daily wage data from the RALS to calculate implicit labor costs. The daily wage was calculated at the district scale based on average wages earned throughout the year by households in the RALS.
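The allocation of labor demands to provinces and the resulting implicit labor cost can be sketched as a simple lookup followed by a multiplication. This is a minimal illustration, not the authors' code: the labor figures and the province grouping are taken from the text, while the function name and the wage of $ 2.00 day−1 are hypothetical placeholders (district wages in the study come from the RALS).

```python
# Labor demand (days ha-1 per season) reported for the three reference provinces.
LABOR_DAYS = {
    "Eastern": {"control": 71, "ca": 36},
    "Northern": {"control": 68, "ca": 36},
    "Western": {"control": 26, "ca": 33},
}

# Province -> reference province, following the AEZ grouping described for herbicides.
REFERENCE_PROVINCE = {
    "Eastern": "Eastern", "Central": "Eastern", "Lusaka": "Eastern", "Southern": "Eastern",
    "Northern": "Northern", "Copperbelt": "Northern", "Luapula": "Northern",
    "Muchinga": "Northern", "North-Western": "Northern",
    "Western": "Western",
}


def implicit_labor_cost(province: str, system: str, daily_wage: float) -> float:
    """Implicit labor cost ($ ha-1) = allocated labor demand x daily wage."""
    days = LABOR_DAYS[REFERENCE_PROVINCE[province]][system]
    return days * daily_wage


wage = 2.0  # hypothetical district wage in $ day-1, for illustration only
for province in ("Lusaka", "Luapula", "Western"):
    control = implicit_labor_cost(province, "control", wage)
    ca = implicit_labor_cost(province, "ca", wage)
    print(f"{province}: control {control:.2f}, CA-based {ca:.2f} $ ha-1")
```

With any positive wage, the CA-based system lowers the implicit labor cost in provinces that inherit the Eastern or Northern figures and raises it in Western Province, which mirrors the labor demands reported above.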
Supplementary Table 1 reports the province-scale summary statistics for prices, costs, and input application rates. Our study combined prices and costs from different years and data sources (mainly trials and surveys), arguably a second-best option because of the constraints on data availability.
For all maps generated in our study, we took the administrative boundaries of Zambia from version 3.6 of GADM (https://www.gadm.org/, accessed July 20, 2018).

On-farm agronomic trials
Since 2011, farmers have conducted, under supervision, on-farm trials in Eastern Province to adapt CA-based systems to their conditions, needs, and circumstances. We calibrated DSSAT using data from six trial sites in three districts (Chipata, Katete, and Lundazi) (Fig. 2). Data on agronomic management and site-specific rainfall were collected at each site. All farmer replicates had a control system of continuous maize with conventional tillage and residue removal. The conventional tillage varied according to the traction available and was either moldboard ploughing with animal traction or a manual ridge and furrow system. Our study focused on plots with continuous maize, which allowed the yield effects of no-tillage with residue retention to be isolated from other practices such as rotations and intercropping. The trials were set up in targeted communities of approximately 200 households. A clustered approach was used in which six farmers in each community hosted mother trials to demonstrate a range of technologies to farmers. The most advantageous systems were then applied and multiplied in baby trials. Clusters of six and, in later years, four on-farm replicates served as replication trials. The systems tested in the trials were a spectrum of CA principles, compared with a control system. CA-based systems included continuous maize with no-tillage and residue retention, control plots included continuous maize with conventional tillage and residue removal, and the trial also had plots with a mixture of maize-legume rotations and intercropping with different tillage and residue management. Mupangwa et al. (2017) provide a full description of the trials. Briefly, the systems we studied included:
• A control system that consisted of ridge and furrow tillage or, at sites where animal traction CA was practiced, moldboard ploughing, planted for continuous maize cropping. Residues were grazed, removed, burned, or incorporated into the soil. We summarized and labeled our control system as conventional tillage with residue removal.
• A CA-based system that consisted of direct seeding (seeding with a manual dibble stick (pointed stick), or with animal traction using a ripper or direct seeder) for continuous maize cropping. Residues were retained in the field as a mulch. We summarized and labeled our CA-based system as no-tillage with residue retention.

11 Accessed from http://www.boz.zm/statistics3.htm on July 20, 2018.
12 A recommendation domain is a group of roughly homogenous farmers with similar circumstances for whom more or less the same recommendations can be made (Byerlee et al., 1988).
13 Supplementary Section 1 provides more details on the implicit value of residues and Supplementary Fig. 2 reports household residue use from the RALS.
At all sites, participating farmers were responsible for managing the plots, including planting, fertilization, weed and disease control, and harvesting, with supervision from a trained extension officer and researchers from the Zambian Agriculture Research Institute. Farmers selected different cultivars at each site based on their preferences (Table 1). All plots were seeded soon after the first effective rains after mid-November. Each system had the same target plant population of 44,444 plants ha−1 (90 cm between rows and 25 cm between plants), giving a sowing rate of 25 kg ha−1. A total nutrient content of 108 N:34 P2O5:17 K2O was applied to all plots: Compound D (165 kg ha−1 of 10 N:20 P2O5:10 K2O) as basal fertilizer at planting and urea (200 kg ha−1 of 46% N) as top-dressing approximately five weeks after planting. For weed control, in the CA-based system farmers applied 2.5 L ha−1 of glyphosate (Roundup) as a full-cover spray either at planting or 2 to 3 days after planting.
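The plant population and nutrient totals quoted for the trials follow from the spacing and fertilizer grades by simple arithmetic. The sketch below reproduces that arithmetic; the helper names are hypothetical, and the fertilizer grades are the percentages given in the text.

```python
def plants_per_ha(row_spacing_m: float, plant_spacing_m: float) -> float:
    """Target plant population for a rectangular planting grid (1 ha = 10,000 m2)."""
    return 10_000 / (row_spacing_m * plant_spacing_m)


def nutrient_totals(applications):
    """Sum nutrient supply (kg ha-1) over (rate_kg_ha, {nutrient: percent}) pairs."""
    totals = {}
    for rate_kg_ha, grade in applications:
        for nutrient, percent in grade.items():
            totals[nutrient] = totals.get(nutrient, 0.0) + rate_kg_ha * percent / 100
    return totals


population = plants_per_ha(0.90, 0.25)
totals = nutrient_totals([
    (165, {"N": 10, "P2O5": 20, "K2O": 10}),  # Compound D, basal
    (200, {"N": 46}),                          # urea, top-dressing
])
print(round(population), totals)
```

This gives about 44,444 plants ha−1 and roughly 108.5 kg N, 33 kg P2O5, and 16.5 kg K2O per hectare, close to the 108:34:17 total reported in the text. The P2O5 figure differs by about 1 kg ha−1, which may reflect rounding in the source.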

Biophysical indicators
Our biophysical indicators included yields of maize grain and biomass (kg ha−1) and total production of maize grain, computed as yield × harvested area (ha). In our study the area planted to maize is the same in both the CA-based and control systems. Our DSSAT simulations ran on all 5 arc-minute grid cells in Zambia that reported maize harvested in the Spatial Production Allocation Model (You et al., 2017). We used the CERES-Maize model in DSSAT to simulate the yield effects of different tillage and residue practices. DSSAT has been previously used to simulate different management practices in maize-based systems in Southern Africa (Ngwira et al., 2014; Nyagumbo et al., 2017; Corbeels et al., 2016). The calibration of DSSAT followed the general principles from previous research in which coefficient and parameter values were adjusted within a specific range to minimize differences between simulated and observed data (Donatelli et al., 1997; Kassie et al., 2014). The calibration involved adjusting crop cultivar coefficients and a soil parameter called SLPF (Supplementary Table 2).
To facilitate model calibration we used an automated calibration algorithm, the SCE-UA (Shuffled Complex Evolution method developed at the University of Arizona) (Duan et al., 1992). The SCE-UA minimizes an objective function that captures model fit to observations, using a global search to find crop cultivar coefficient values and the SLPF within lower and upper limits (Supplementary Table 2). Variants of the SCE-UA have been previously used in the calibration of models for simulating cereal crops (Duchemin et al., 2008; Confalonieri et al., 2009). After the SCE-UA performed 5000 iterations, we retained the crop cultivar coefficient and SLPF values for each cultivar used in each site that minimized the root mean square error (RMSE) between simulated and observed yields and biomasses for the two seasons and systems in each site. To assess how the calibrated model performed against the observed yield data, we calculated four statistics, similar to the calculations in Yang et al. (2014): mean absolute error (MAE), normalized RMSE (nRMSE), the coefficient of determination (R2), and the Willmott (1981) index of agreement.

Note: Data are for continuous maize, control is conventional tillage with residue removal and CA-based is no-tillage with residue retention. Season refers to maize cropping season (in general, plant in December and harvest in May). Biomass refers to the total above-ground non-cob biomass (excluding the core of the cob). Data adapted from Mupangwa et al. (2017).

The quantity of fertilizer applied in our evaluation and simulations matched district-scale basal and top-dressing application rates computed from the RALS (Supplementary Table 1). Fertilizer included Compound D and urea with application rates similar to those reported in other Zambian surveys (Burke et al., 2017). The same application rates were used for both systems, and the rates are unconditional (including zero) weighted averages with district-scale maize area used as the weight. Quantities of seed (planting density converted into kg ha−1) in the evaluation and simulation were taken from the trials. Zambian farmers plant a variety of maize cultivars, with the commercial hybrid maize PAN53 being the cultivar most frequently planted in eight of the ten provinces (Supplementary Fig. 3). We ran our simulations in all maize-growing areas of Zambia using both PAN53 and SC627, but only reported results for PAN53 given its popularity. To evaluate the calibrated model, we first simulated average province-scale yields in the control system over the 30-year simulation period using our historical interpolated climate sequence. These province-scale yields are harvested area-weighted averages. We then compared these simulated grain yields to province-scale grain yields averaged between 2010 and 2017 (CSO, 2019).
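The goodness-of-fit statistics named for the calibration can be sketched with their commonly used definitions. The exact formulas applied in the study are not shown in this section, so the definitions below are assumptions: nRMSE is taken as RMSE divided by the observed mean (in percent), R2 as the squared Pearson correlation, and the Willmott (1981) statistic as the index of agreement d. The yield values are invented for illustration.

```python
import math


def fit_statistics(observed, simulated):
    """Return MAE, nRMSE (%), R2, and Willmott's index of agreement d."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    errors = [s - o for o, s in zip(observed, simulated)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    nrmse = 100 * rmse / mean_o

    # Squared Pearson correlation between observed and simulated values.
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_s = sum((s - mean_s) ** 2 for s in simulated)
    r2 = cov ** 2 / (ss_o * ss_s)

    # Willmott's d: 1 minus squared error over "potential error".
    potential = sum((abs(s - mean_o) + abs(o - mean_o)) ** 2
                    for o, s in zip(observed, simulated))
    d = 1 - sum(e * e for e in errors) / potential

    return {"MAE": mae, "nRMSE": nrmse, "R2": r2, "d": d}


# Hypothetical grain yields (kg ha-1), not data from the trials.
observed = [3000, 4000, 5000, 3500]
simulated = [3200, 3900, 5300, 3300]
print(fit_statistics(observed, simulated))
```

Values of d and R2 close to 1 and a low nRMSE would indicate close agreement between simulated and observed yields; the same function could be applied to the province-scale evaluation against statistical-agency yields.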

Economic indicators
We calculated four economic indicators for the main maize growing season expressed in annual values: gross benefits (GB), variable costs (VC), net benefits (NB), and the benefit-cost ratio (BCR). Gross benefits ($ ha−1) for each system (s) and grid cell (subscripts not shown) were calculated using Eq. 1.

$$GB = (GY \times GP) + (RR \times RV) \tag{1}$$

where GY is maize grain yield (kg ha−1), GP is the market price of maize grain ($ kg−1), RR is the quantity of residues removed (kg ha−1), and RV is the per unit implicit value of residues ($ kg−1). The RR equaled zero in the CA-based system and equaled the simulated residue yield in the control system. Maize grain can be sold through two main marketing channels: 1) the Food Reserve Agency channel and 2) the private sector channel; we did not disaggregate the grain price by its marketing channel. The VC was calculated with Eq. 2.

$$VC = (SQ \times SC) + \sum_{t} (FQ_t \times FC_t) + (HQ \times HC) + (LQ \times w) \tag{2}$$

where SQ is the quantity of seed planted (kg ha−1), SC is seed cost ($ kg−1), FQ is the quantity of different fertilizers applied (kg ha−1) (t, Urea or Compound D), FC is the unit cost of fertilizer ($ kg−1), HQ is the quantity of herbicide applied (L ha−1), HC is the unit cost of herbicide ($ L−1), LQ is the labor demand in each system each season (days ha−1), and w is the implicit cost of labor (wage rate) ($ day−1). The unit cost of fertilizer was the acquisition cost of the fertilizer by the household and included fertilizer purchased from all sources including through government (i.e., the FISP) and commercial channels; we did not disaggregate the fertilizer cost by its acquisition source. The BCR equaled the ratio of gross benefits to variable costs (Eq. 3), and net benefit ($ ha−1) equaled gross benefits minus variable costs.

$$BCR = \frac{GB}{VC} \tag{3}$$

Total net benefits were computed as net benefit ($ ha−1) × harvested area (ha).
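The four economic indicators can be sketched directly from the variable definitions given for Eqs. 1 to 3. The code below is an illustrative reading of those definitions, not the authors' implementation. The class and function names are hypothetical, and the yields, prices, fertilizer rates, and wage in the example are invented; only the seed rate, herbicide rate and cost, labor days, and residue value echo figures quoted elsewhere in the text.

```python
from dataclasses import dataclass


@dataclass
class SystemInputs:
    grain_yield: float      # GY, kg ha-1
    grain_price: float      # GP, $ kg-1
    residue_removed: float  # RR, kg ha-1 (zero in the CA-based system)
    residue_value: float    # RV, $ kg-1
    seed_qty: float         # SQ, kg ha-1
    seed_cost: float        # SC, $ kg-1
    fertilizer: dict        # {type t: (FQ_t in kg ha-1, FC_t in $ kg-1)}
    herbicide_qty: float    # HQ, L ha-1
    herbicide_cost: float   # HC, $ L-1
    labor_days: float       # LQ, days ha-1
    wage: float             # w, $ day-1


def economic_indicators(x: SystemInputs) -> dict:
    """Gross benefits, variable costs, net benefits ($ ha-1) and benefit-cost ratio."""
    gb = x.grain_yield * x.grain_price + x.residue_removed * x.residue_value
    fertilizer_cost = sum(qty * cost for qty, cost in x.fertilizer.values())
    vc = (x.seed_qty * x.seed_cost + fertilizer_cost
          + x.herbicide_qty * x.herbicide_cost + x.labor_days * x.wage)
    return {"GB": gb, "VC": vc, "NB": gb - vc, "BCR": gb / vc}


# Hypothetical CA-based field: no residues removed, herbicide applied.
ca_field = SystemInputs(
    grain_yield=3000, grain_price=0.20, residue_removed=0, residue_value=0.04,
    seed_qty=25, seed_cost=2.0,
    fertilizer={"Compound D": (100, 0.60), "Urea": (100, 0.60)},
    herbicide_qty=2.5, herbicide_cost=13.91, labor_days=36, wage=2.0,
)
print(economic_indicators(ca_field))
```

Multiplying the per-hectare net benefit by a grid cell's harvested area then gives the total net benefit for that cell, as described in the text.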

Temporal and spatial data consolidation
The planning horizon over which benefits and costs of changes in crop management are examined will most likely differ for different farmers. Resource-poor farmers often have a short planning horizon for the number of years over which they consider the benefits and costs of changes in crop management relevant (Pannell et al., 2014). On the other hand, the full agronomic benefits of CA principles may take multiple seasons to materialize, and are location specific (Thierfelder et al., 2015a). Lynam and Herdt (1989) suggest using a planning horizon longer than 3 to 5 years but shorter than 20 years when assessing sustainability in farm systems. We therefore calculated indicators as an average over 3-, 10-, and 20-year simulation periods.
To calculate and report results at different spatial scales we took the following steps. We first allocated input quantities, prices, and costs to each five arc-minute grid cell where maize yields were simulated. In each grid cell we calculated the economic indicators. Input quantities, prices, and costs from the RALS were aggregated from the household scale to the district scale using weighted averages, in which the weights were survey population expansion factors that are statistically valid at the province scale. Data on the implicit value of residues, labor demands, and herbicide costs were based on extrapolations from trial sites based on similarity of AEZs. Therefore, the non-RALS results should be interpreted as descriptive rather than as a statistically valid representation.
Next, we calculated the arithmetic average of the grid cell-scale biophysical and economic indicators over 3-, 10-, and 20-year simulation periods. Using these arithmetic averages, we computed area-weighted averages at the district, province, and country scale for each simulation sequence, using the harvested area of maize in each grid cell as the weight, as suggested in . The percentage change between the CA-based system and the control system for the different indicators was computed using Eq. 4.
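The aggregation steps above can be sketched as follows. This is a minimal illustration that assumes Eq. 4 takes the usual form of a percentage change relative to the control, 100 × (CA − control) / control; the function names and all numbers are hypothetical.

```python
def period_mean(yearly_values, n_years):
    """Arithmetic average of an indicator over the first n_years of a simulation sequence."""
    window = yearly_values[:n_years]
    if len(window) < n_years:
        raise ValueError("simulation sequence is shorter than the requested period")
    return sum(window) / n_years


def area_weighted_mean(cell_values, harvested_areas):
    """Average of grid-cell values weighted by maize harvested area in each cell."""
    total_area = sum(harvested_areas)
    if total_area <= 0:
        raise ValueError("total harvested area must be positive")
    return sum(v * a for v, a in zip(cell_values, harvested_areas)) / total_area


def percent_change(ca_value, control_value):
    """Assumed form of Eq. 4: change in the CA-based system relative to the control (%)."""
    return 100.0 * (ca_value - control_value) / control_value


# Two hypothetical grid cells in one district (period-average yields in kg/ha, areas in ha).
areas = [100.0, 300.0]
control_yield = area_weighted_mean([1500.0, 2500.0], areas)
ca_yield = area_weighted_mean([2000.0, 3000.0], areas)
print(control_yield, ca_yield, round(percent_change(ca_yield, control_yield), 1))
```

Because the weights are harvested areas, cells with little maize contribute little to the district average, even when their yield response is large.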

Spatial farm typology analysis
To examine the contextual factors that may drive the spatial variation in field-scale indicators and examine the trade-offs relevant to the suitability of CA-based systems we used a statistical-based method to construct a spatial farm typology by applying principal component analysis (PCA) (Jolliffe, 2002) and Agglomerative Hierarchical Clustering Analysis (HCA). The HCA used Ward's minimum-variance method (Ward, 1963). The objective of the PCA was to reduce the dimensionality of the variables used to then identify farm types (clusters of households). The objective of the HCA was to determine the number of farm types present among the individual households in the RALS. The first step involved the selection of ten variables for the typology (Table 2). These variables have a mix of structural (farm assets and resources), functional (livelihood pursuits), and biophysical characteristics. We then examined the coefficient of variation (standard deviation divided by average) of each variable. It is recommended that PCA is conducted using variables that exhibit a degree of spatial variation across farms (Köbrich et al., 2003). We also examined the correlation between variables using pairwise Pearson correlation coefficients because highly correlated variables can be problematic in a PCA (Köbrich et al., 2003). Prior to the PCA, we checked variable distributions for normality and, whenever necessary and possible, transformations were applied to bring distributions closer to normal, in line with Hammond et al. (2017).
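The variable screening described above (coefficient of variation and pairwise Pearson correlation) can be sketched with the standard library alone. The sketch assumes the sample standard deviation is used for the coefficient of variation and flags pairs above a correlation threshold of 0.8; the threshold, the function names, and the example data are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Standard deviation divided by the average (sample standard deviation assumed)."""
    return stdev(values) / mean(values)


def pearson(xs, ys):
    """Pairwise Pearson correlation coefficient between two equally long series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def highly_correlated_pairs(variables, threshold=0.8):
    """Return variable pairs whose absolute correlation exceeds the (hypothetical) threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Hypothetical household data for three candidate typology variables.
data = {
    "tlu": [0.0, 2.0, 9.0, 1.0, 3.0],
    "fertilizer_kg_n_ha": [7.0, 35.0, 40.0, 10.0, 30.0],
    "travel_time_h": [4.0, 1.5, 1.0, 3.5, 1.0],
}
for name, values in data.items():
    print(name, round(coefficient_of_variation(values), 2))
print(highly_correlated_pairs(data))
```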
The PCA reduced the dimensionality of the ten variables and transformed them into uncorrelated orthogonal variables called principal components (PCs). We selected the PCs to retain in the HCA using a combination of Kaiser's rule (Kaiser, 1961), which suggests retaining PCs with an eigenvalue over one, and visual inspection for a break in the scree plot. After the PCA, we conducted the HCA to identify different farm types (the clusters), similar to previous typology analyses (Alvarez et al., 2018; Blazy et al., 2009; Lopez-Ridaura et al., 2018). We also constructed a dendrogram as a visual aid for choosing the number of farm types for the subsequent analysis. The number of clusters at which to cut the dendrogram was based on the shape of the dendrogram, in combination with the silhouette method (Rousseeuw, 1987) and the gap statistic (Tibshirani et al., 2001). This method generated five farm types, as reported in Section 3.4.

Model calibration and evaluation results
Table 3 reports the calibration statistics by system and season for both grain and biomass. For grain, the MAE ranged from 643 to 689 kg ha−1 and the nRMSE ranged from 30 to 39% across all systems and seasons. The d-index for grain ranged from 0.71 to 0.90. The R² for grain for the full dataset was 61%. The cropping systems model results suggested that grain simulations had a higher R² and d-index than biomass simulations. Results indicated that the model had similar performance for the control and CA-based systems, with a similar MAE and nRMSE in both systems.
With 84 paired data points, we disaggregated our comparison of simulated and observed grain yields by system, season, and site (Fig. 3). Several patterns emerge from the simulation results in Fig. 3. First, simulated yields were, on average, less than observed yields from the trials. Second, observed and simulated grain yields and biomasses differed across system, season, and site with 2015/16 yields higher than 2014/15 yields in all sites and for both systems. Third, there was spatial variation in yields by season and system with the lowest grain yields observed in Mtaya and the highest grain yields observed in Vuu. Finally, yields in the CA-based system were greater than yields in the control system across all sites and seasons.
Model evaluation involved comparing simulated grain yields with observed grain yields from secondary data (Fig. 4). Fig. 4 highlights the inter-province differences in yields. Evaluation statistics included an MAE of 357 kg ha−1, an nRMSE of 25%, a d-index of 0.81, and an R² of 68%.
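The calibration and evaluation statistics reported in this section can be computed as in the following sketch. It assumes common definitions that the text does not spell out: nRMSE is the RMSE divided by the mean of the observations (in %), the d-index is Willmott's index of agreement, and R² is the squared Pearson correlation between simulated and observed values. The example yields are hypothetical.

```python
import math


def mae(sim, obs):
    """Mean absolute error, in the units of the data (e.g. kg/ha)."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def nrmse(sim, obs):
    """RMSE normalized by the mean observation, in % (assumed normalization)."""
    rmse = math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))
    return 100.0 * rmse / (sum(obs) / len(obs))


def d_index(sim, obs):
    """Willmott's index of agreement: 1 is perfect agreement, 0 is none."""
    obs_mean = sum(obs) / len(obs)
    numerator = sum((s - o) ** 2 for s, o in zip(sim, obs))
    denominator = sum((abs(s - obs_mean) + abs(o - obs_mean)) ** 2 for s, o in zip(sim, obs))
    return 1.0 - numerator / denominator


def r_squared(sim, obs):
    """Squared Pearson correlation between simulated and observed values (assumed definition)."""
    ms, mo = sum(sim) / len(sim), sum(obs) / len(obs)
    cov = sum((s - ms) * (o - mo) for s, o in zip(sim, obs))
    var_s = sum((s - ms) ** 2 for s in sim)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov ** 2 / (var_s * var_o)


# Hypothetical paired grain yields (kg/ha).
observed = [1000.0, 2000.0, 3000.0]
simulated = [1100.0, 1900.0, 3200.0]
print(round(mae(simulated, observed), 1), round(nrmse(simulated, observed), 1),
      round(d_index(simulated, observed), 3), round(r_squared(simulated, observed), 3))
```

Reporting several statistics together is useful because each captures a different aspect of fit: MAE and nRMSE measure the size of the errors, while the d-index and R² measure how well the simulations track the pattern of the observations.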

Model simulation results
Using the calibrated and evaluated model, we extrapolated our yield simulations to all grid cells in Zambia that harvested maize as reported in You et al. (2017). Simulation results suggested that CA-based systems produced 33% more grain yield than the control systems, averaged across all the provinces and simulation periods (Table 4).
The average yield differences between the CA-based system and the control system varied by district (Fig. 5A) and province (Supplementary Table 3). At the province scale, the average 10-year period differences in yield between the CA-based system and the control system ranged from −10% in Western to 42% in Lusaka, and we also found intra-province (Supplementary Fig. 4) and intra-district yield variation (Supplementary Fig. 5). At the district scale, the largest yield gain from using CA-based systems (compared to the control) was 70% in Chongwe district of Lusaka Province and the largest yield reduction was 37% in Chilubi district of Northern Province. Average yield differences slightly increased as the years in the simulation period increased, but average absolute yields declined. These changes in yields translated into changes in production (Fig. 5B and Supplementary Table 3). Absolute province-scale differences in production related to both yield and area harvested.
Note: Non-rainfall, non-soil data are from the 2015 Rural Agricultural Livelihoods Survey (RALS) (IAPRI, 2015). Fertilizer is mineral fertilizer. TLU is a Tropical Livestock Unit, defined as an animal of 250 kg liveweight; 1 cattle = 0.7 TLU, 1 sheep or 1 goat = 0.1 TLU (Jahnke, 1982). The non-agricultural asset index was constructed using the method described in Filmer and Pritchett (2001). The travel time variable is described as the question was asked in the RALS, i.e., it is conditional on the respondent's perception of an established market with many buyers. Rainfall data are from WorldClim (Fick and Hijmans, 2017) and soil data from SoilGrids (Hengl et al., 2014). Rainfall and soil data are at a spatial resolution of 1 km² and are mapped to the RALS households' GPS coordinates (offset by 2.5 km).
Although simulated yields changed from −37 to 70% at the district scale (Fig. 5A), the absolute change in grain production (Fig. 5B) related to yield and area harvested, and this absolute change was lowest in the districts with yield declines because of lower maize harvested areas. For example, the only province experiencing a decline in average yield was Western Province; however, harvested area for Western Province was lower than the national average and yields associated with the control system were lowest, so the total decline in grain production was limited. Supplementary Table 3 reports province-scale yield and production results for all three simulation periods.
Fig. 3. Comparison between the simulated and observed grain yields by system, season, and site in Eastern Province of Zambia. Data are for continuous maize; control is conventional tillage with residue removal and CA-based is no-tillage with residue retention. Observed yields are from on-farm agronomic field trials.

Economic results
Economic results suggested that CA-based systems, compared to the control systems, generated greater gross benefits, smaller variable costs, and therefore greater net benefits and benefit-cost ratios, on average and at the country scale (Table 5). Supplementary Table 4 and Supplementary Fig. 6 report province-scale results.
Taking the country-scale 10-year average, the value of grain production was $128 ha−1 more in the CA-based system than in the control system. Once the implicit value of residues removed was considered, the gross benefit in the CA-based system was only $44 ha−1 more than in the control system. Differences in variable costs were related to herbicide costs being $29 ha−1 greater and the implicit labor cost being $112 ha−1 smaller in the CA-based system compared with the control system.

Table 5. Average country-scale simulated economic indicators in Zambia. Column headings: Years in simulation; System; Grain value; Implicit value of residue removed; Gross benefit.

The changes in benefit-cost ratios and economic indicators varied by district and province (Fig. 6A and Supplementary Table 4), and changes in the benefit-cost ratios ranged from −62% in Kalabo district of Western Province to 76% in Chavuma district of North-Western Province. The changes in per hectare economic indicators translated into changes in total net benefits (Fig. 6B and Supplementary Table 4). Supplementary Table 4 reports the simulated province-scale economic results for all three simulation periods. The Pearson correlation coefficient between province-scale grain yields and grain prices (using the 10-year average yields in the control system) was −0.43.

Table 6 reports the ranking of provinces based on differences in economic indicators by system. The largest percentage gains in yields were in Lusaka Province, with Western Province having the smallest percentage change. The ranking of BCR changes varied from that of yield changes in six provinces, with only Central and Western provinces having the same rank for both yield and BCR change. Using harvested area as an expansion factor created differences in percentage versus absolute rankings of changes in production and net benefits. Even though Central Province was associated with the 4th largest change in yield, it had the largest change in production because its harvested area was larger than that of many other provinces.
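The country-scale 10-year differences reported earlier in this section can be combined in a short worked calculation. This is a sketch under two assumptions: herbicide and implicit labor costs are the only variable-cost items that differ between systems, and the rounded figures in the text are used as given. The derived residue and net benefit differences are therefore approximations and are not values reported in the study.

```python
# Differences between the CA-based and control systems (CA minus control), $/ha,
# taken from the country-scale 10-year averages quoted in the text.
grain_value_diff = 128.0
gross_benefit_diff = 44.0
herbicide_cost_diff = 29.0
labor_cost_diff = -112.0

# Implied difference in the implicit value of residues removed
# (residues are removed, and therefore valued, only in the control system).
residue_value_diff = gross_benefit_diff - grain_value_diff

# Implied difference in variable costs, assuming no other cost item differs.
variable_cost_diff = herbicide_cost_diff + labor_cost_diff

# Implied difference in net benefit (gross benefit minus variable cost).
net_benefit_diff = gross_benefit_diff - variable_cost_diff

print(residue_value_diff, variable_cost_diff, net_benefit_diff)
```

Under these assumptions, roughly two-thirds of the grain value gain is offset by the forgone implicit value of residues, while the labor saving outweighs the added herbicide cost.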

Spatial farm typology results
Summary statistics suggested the coefficient of variation for each variable ranged from 0.18 to 3.36, with an average of 1.39. Pairwise correlation coefficients between variables ranged from −0.21 to 0.45. We retained the first four PCs for the HCA and these PCs explained 60% of the variance in the ten variables.¹⁴ The silhouette method and gap statistic both suggested clustering the individual farms into five farm types, supported by the dendrogram shape.¹⁵ Fig. 7 presents the projection of each variable on the PC1-PC2 axes and the PC1-PC3 axes, overlaid with the location of each household on the PC axes. The three variables most correlated with PC1 were non-agricultural assets, fertilizer use, and household head education; with PC2 were rainfall, clay content of soil, and livestock ownership; and with PC3 were household head age, labor-to-land ratio, and number of off-farm workers. Travel time contributed most to PC4. Projecting the farm types onto the PCs illustrated the main characteristics of each farm type (Fig. 7).
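The retention of PCs by Kaiser's rule and the share of variance they explain can be sketched as follows. The eigenvalues below are invented for illustration, chosen only so that four components exceed the threshold of one; they are not the eigenvalues of the study, which are reported in its supplementary material. The function names are likewise hypothetical.

```python
def kaiser_retained(eigenvalues, threshold=1.0):
    """Return the eigenvalues above the threshold, in descending order (Kaiser's rule)."""
    return [ev for ev in sorted(eigenvalues, reverse=True) if ev > threshold]


def explained_variance_share(eigenvalues, n_retained):
    """Share of total variance explained by the n_retained largest components."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:n_retained]) / sum(ordered)


# Hypothetical eigenvalues for ten standardized variables (they sum to ten).
eigenvalues = [2.1, 1.6, 1.2, 1.1, 0.9, 0.8, 0.7, 0.6, 0.55, 0.45]

retained = kaiser_retained(eigenvalues)
share = explained_variance_share(eigenvalues, len(retained))
print(len(retained), round(100 * share))
```

For standardized variables the eigenvalues sum to the number of variables, so an eigenvalue above one marks a component that explains more variance than any single original variable; in practice the rule is combined with the scree plot, as described in the methods.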
Comparing the average values of the variables for the different farm types based on the whole sample of 7750 households (Supplementary Table 6), farm type one can be described as owning more ruminant livestock (9.25 TLUs compared with 2.22 TLUs for the whole sample) and having a higher number of non-agricultural assets (index value of 0.76 compared with an average of 0 for the whole sample) than the other farm types. Farm type two shows more working labor per unit of land than other farm types, with a labor-land ratio of 3.64 compared with 1.85 for the whole sample. Farm type three is associated with higher rainfall and slightly more clay content in the soil than other farm types, with average annual rainfall of 1152 mm compared with 1017 mm for the whole sample, and the top 30 cm of the soil profile averaging 26% clay compared with 22.5% for the whole sample. We characterized farm type four as possessing limited non-agricultural assets (index value of −0.62) and less education for the household head (2.2 years compared with 5.96 years for the whole sample). Farm type five can be differentiated from other farm types by its lower levels of fertilizer application to maize (7 kg elemental nitrogen ha−1 compared with 35 kg elemental nitrogen ha−1 for the whole sample), limited non-agricultural assets (index value of −0.45), and greater travel time to market (4.2 h compared with 1.58 h for the whole sample). Table 7 reports the number of households in each farm type by province. Sixty-four percent of households in Western Province were in farm type five, 49% of households in Southern Province were in farm type one, and two-thirds of households in Northern Province were in farm type three. The distribution of households across the farm types was more even in Eastern Province, with farm type four being most common (34%).

Discussion
By combining bottom-up and top-down approaches, we assessed spatial variation in the ex-ante biophysical and economic effects of CA-based systems in all maize-growing areas of Zambia. Our simulated results were, in general, consistent with previous research that reports CA-based systems mostly have a positive effect on yields in Southern Africa (Section 1). However, we also found that in some districts, mainly in Western Province, CA-based systems showed a yield penalty. This may be because farmers in Western Province apply less fertilizer compared to other provinces that are more suitable for maize production (Supplementary Table 3). In turn, this level of fertilizer use may interact negatively with the principles of CA, resulting in yield penalties. Lower yield gains in Northern and North-Western provinces could also be explained by higher rainfall, on average, than in other provinces, leading to waterlogging, and by the acidic soils. Because CA-based systems can help conserve soil moisture (Thierfelder et al., 2013), in seasons of excessive rainfall, CA-based systems can have excessive intake of moisture leading to waterlogged conditions (Thierfelder and Wall, 2009) and the proliferation of diseases, and hence yield penalties, relative to a control system. No-tillage generally performs better under dry climates (Pittelkow et al., 2015; Steward et al., 2018), and indeed our results suggested lower yield gains in the wetter provinces in the north of Zambia (such as Northern Province), compared with other provinces. The functioning of CA systems is contingent on no-tillage being practiced with residue retention, and the importance of residue retention as a surface mulch in minimizing the negative effects of only practicing no-tillage in the context of CA has been previously quantified (Pittelkow et al., 2015; Rusinamhodzi et al., 2011). Average grain and biomass yields are lower in Western Province compared to other provinces (Fig. 4 and Supplementary Table 3), so fewer residues are also retained in fields in the CA-based system, compromising the functioning of the second principle of CA. If insufficient quantities of residues are available to retain, no-tillage may lead to lower grain yields compared with tillage, because of less soil moisture conservation, reduced infiltration, and increased evaporation, particularly on soils that are prone to crusting and compaction (Baudron et al., 2012). Despite the field-scale agronomic benefits of retaining adequate amounts of residues, many reasons may prevent farmers from doing so, including the competing uses of residues at the farm scale. Indeed, in Western Province the demand for residues as a livestock feed is stronger than in most other provinces, with the most common farm type having more livestock than the most common farm type in most other provinces (Table 7 and Supplementary Table 6), presenting a practical challenge to the suitability of CA.
Table 6. Simulated ranking of provinces for percentage and absolute change in indicators between the CA-based and control systems for Zambia. Note: Ranking relates to change in grain yield or BCR between the CA-based and control cropping systems. For example, a rank of 1st implies that the province had the greatest percentage change between the CA-based system and the control system across all provinces.
¹⁴ Supplementary Table 5 reports the loading and contribution of each variable to the individual PCs, along with the eigenvalue and cumulative variability explained by each PC. The scree plot is shown in Supplementary Fig. 7.
¹⁵ Supplementary Fig. 8 is the dendrogram.
Although fertilizer applied varied by district (Supplementary Table 2), application rates did not change by system, thereby adding a management dimension to inter-district yield variability. Initially, CA-based systems often require more inputs, especially nitrogen, than non-CA systems to provide yield gains. This is necessary due to lower mobilization and mineralization of organic carbon in no-tillage conditions and nitrogen lock-up due to retained residues. Heterogeneity in soil profiles is another reason for the spatial variation in yield differences, with greater yield gains more likely on well-drained sandy and loamy soils. Moreover, CA is arguably more suitable to the conditions in AEZs 1 and 2A (Ngoma et al., 2015); AEZ 3 shows higher rainfall and more acidic soils than the other zones (Chikowo, 2016), and CA is generally more favorable in dry climates.
Many factors could contribute to the spatial variation in the ranking based on economic gains from the CA-based system. First, due to the spatial variation in grain prices, the ranking of gross benefits may not mirror that of yields. The negative correlation between grain yields and prices supported the change in rankings. Second, although seed and fertilizer costs did not change by system, they varied by district. Similarly, labor costs differed by province mainly due to inter-district differences in wages. Because of possible temporal variations in prices, costs, and yields, the ranking of provinces may also change over time. In general, CA-based systems produced biophysical and economic gains; however, these gains were not ubiquitous across Zambia, and showed considerable spatial variation.
The ranking of provinces by differences in indicator values often changed according to the indicator chosen, either yield or benefit-cost ratio (BCR). Consistent with this finding, the geographic area with the greatest yield gain may not be associated with the greatest BCR gain. Including economic factors in technology assessments can provide greater nuance on the ex-ante effect of technologies. Different economic factors, and their combination, contributed to the change in province ranking. For example, North-Western Province ranked third in yield difference and first in BCR difference. This result could be attributed to the high wage rate (Supplementary Table 2), thereby making the cost difference between systems sharper than would have been the case otherwise. These results suggested that combining both economic and biophysical factors can provide valuable information for scaling out technology assessments that biophysical-only analyses may not generate, therefore reinforcing the call for incorporating economic factors into technology assessments (Rattalino Edreira et al., 2018; Grassini et al., 2017).

Table 7. Number of households in each farm type by province.
Province       Type 1  Type 2  Type 3  Type 4  Type 5  Total
…
Luapula            12     141     372     136      10    671
Lusaka            123     174      50      61      22    430
Muchinga           52     161     272     139      79    703
Northern           22      76     522     143      14    777
Northwestern       30      69     283      96      20    498
Southern          436     128      47     130     143    884
Western            68      68       7      72     383    598
Total            1442    1435    2407    1615     851   7750

Our study is not exempt from shortcomings. Our simulated system was an incomplete CA system, as crop diversification (rotation or intercropping) was not considered. One area of future research could therefore focus on improving the ability of DSSAT to simulate crop diversification. In addition, although DSSAT can capture the effect of residues on yields, DSSAT has limitations in simulating plough pan formation as well as pest and weed dynamics (Corbeels et al., 2016).
Another shortcoming is the limited availability of specific economic data, such as the implicit values of labor and residues. Quantifying household labor demands for different cropping activities can be a daunting task (Franke et al., 2014), and increased efforts towards collecting disaggregated labor data using standardized protocols may help produce more complete technology assessments. Data on crop-level labor demand in days per hectare are currently unavailable in nationally representative datasets in Zambia.¹⁶ Despite reduced tillage generally decreasing labor demand for land preparation (Andersson and D'Souza, 2014), CA affects labor demand at different stages of the cropping cycle, with the overall effect being context specific. Previous studies have shown that CA-based systems can increase or decrease labor demands in Zambia depending on the district assessed (Cacho et al., 2018); however, the exact definition and application of CA practices may vary across studies, thereby making cross-study comparisons challenging. The inclusion of detailed labor time modules in nationally representative surveys is desirable, since these data provide greater nuance on labor demands and trade-offs across wider spatial scales than individual case studies typically encompass. Another limitation is the data used in the calibration of DSSAT. Ideally for scaling out across Zambia, we would have used multi-year, multi-site trial data from more provinces than just Eastern Province. Ideally, the spatial coverage of the trials should include all the relevant climatic zones, soil types, and cropping systems of the target region (van Wart et al., 2013). Unfortunately, in our study all six trial sites were in the same AEZ.
Despite the limitations with data and modelling approaches, we highlight five considerations to aid the identification of more comprehensive frameworks for combining biophysical and economic factors to scale out technology assessments. These considerations do not replace, or compete with, the use of biophysical frameworks for which guidelines are already available (van Wart et al., 2013), and are mostly related to economic factors. First, household surveys can capture the adoption of technologies, although capturing yield changes from adoption can be complicated because the same household rarely uses different technologies in paired plots, which are required for side-by-side assessments. The literature also points to measurement error in yield estimates from household surveys (Desiere and Jolliffe, 2018). On-farm or controlled field trials can potentially identify the treatment effects of technologies, although trials can be unintentionally compromised (Whitbread et al., 2010).
Second, nationally-representative surveys can provide a useful source of economic data on prices, costs, and management practices. When all these data cannot be obtained from the same source, efforts should be made to collect and triangulate this information from closely comparable sources to improve statistical (and temporal) representativeness. Data on crop management, such as fertilization rates, cultivars planted, as well as tillage and residue management, can guide the design of location-specific simulations that can then be scaled out. Our study sourced data on labor time and the implicit value of residues mainly from farmers involved in the trials, with limited representativeness and spatial coverage. As a result, data extrapolation based on similar agro-ecological conditions was necessary.
Third, several factors complicate the economic assessment of technologies, such as the computation of the implicit value of labor or residues. Although labor demand can be valued using market wages, if available, the implicit value of residues should ideally depend on rarely available market prices or farmer-reported valuations. In this regard, production economics can perhaps guide the computation of implicit values for alternative residue uses, such as livestock feed, mulching, or burning; and household surveys may help identify the most common, location-specific uses of residues.
Fourth, when scaling out a technology assessment both the effect of the technology on yield and on total production should be comparatively examined. For example, a technology might exert a large effect on yield in an area with relatively low overall harvested crop area, thereby resulting in a relatively smaller effect on total production. At the same time, the technology may give a moderate effect on yield in another area where that specific crop is widely harvested. This scale issue is consistent with the Global Yield Gap Atlas that suggests identifying regions with the greatest contribution to national production totals for a specific crop and water regime (van Bussel et al., 2015). Reliable spatially-explicit and high-resolution data on area harvested are a crucial element in this regard.
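The distinction between yield effects and production effects can be made concrete with a small example. The two districts, their yields, yield changes, and harvested areas are entirely hypothetical; the sketch only shows how a larger percentage yield gain can translate into a smaller change in total production when harvested area is small.

```python
def production_change_kg(control_yield_kg_ha, yield_change_pct, harvested_area_ha):
    """Change in total production implied by a percentage yield change over a harvested area."""
    yield_change_kg_ha = control_yield_kg_ha * (yield_change_pct / 100.0)
    return yield_change_kg_ha * harvested_area_ha


# Hypothetical districts: A has the larger yield response, B the larger harvested area.
district_a = production_change_kg(2000.0, 50.0, 1000.0)
district_b = production_change_kg(2000.0, 15.0, 10000.0)

print(district_a, district_b)
```

Here district A ranks first on percentage yield change but second on absolute production change, which mirrors the differences between percentage and absolute rankings described in the results.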
Fifth, our spatial farm typology analysis provided insights into the diversity of household assets, resources, and livelihoods relevant to the suitability of CA-based systems for farmers. These insights help provide more nuanced context to our ex-ante biophysical and economic indicators, and identify potential opportunities and trade-offs associated with CA-based systems that a field-scale assessment may miss. Our field-scale assessment highlighted that yield penalties often occurred in Western Province (Fig. 5A). This yield penalty result aligns with our spatial farm typology analysis, as the most common farm type in Western Province applied limited quantities of fertilizer to maize, took longer to reach a market, and owned limited non-agricultural assets (Fig. 7 and Table 7). Adequate crop nutrition, which can be supplied by fertilizer, is important for CA-system performance, especially in the initial years when the soil fertility effects of residues and rotations may not yet have fully materialized. Moreover, some authors believe that the appropriate use of fertilizer may improve the feasibility of CA systems wherever it is practiced (Vanlauwe et al., 2014). Western Province is challenging for CA-based systems, and crop production in general, as the sandy soils are often unresponsive to fertilizer, partly because of their low cation exchange capacity and generally low organic matter content. We observed that farmers in Western Province applied limited fertilizer. Farmers here are often cut off from fertilizer supply chains and are limited to manure use only when fertilizing their crops. This has negative effects on crop productivity and hence on the biomass production that can be retained as residues.
The most common farm type in Southern Province focused on ruminant livestock production (compared with other farm types). This livestock focus needs to be considered when evaluating the suitability of CA-based systems for farmers, as trade-offs may be substantial for these farm types depending on the allocation of residues as a mulch or as a livestock feed. Although owning livestock may reduce the suitability of CA-based systems for some farmers, other factors must also be considered, such as the role of livestock in providing animal traction and manure, and alternative sources of feed such as grazing land. Farm type three shows higher average annual rainfall and higher clay content in the soil than the other farm types. In Northern and Luapula Provinces farm type three is most common, and our simulation results suggest CA-based systems are associated with a worse performance for yield change compared to the control systems in these provinces (along with Western Province) (Table 6). Results from a meta-analysis in Southern Africa (Rusinamhodzi et al., 2011) suggested yields under CA mostly perform worse in clay soils and in wetter climates. The interpretation of the spatial farm typology results with our simulation results highlighted some examples of farm types for which CA-based systems may be more suitable.
¹⁶ In the 2015 RALS, available labor data were restricted to the cost of hired labor. In the 2009/10 Zambia Crop Forecast Survey for large-scale farms, information on labor use by crop and activity was collected, although unfortunately it was unavailable for small-scale farms.

Conclusion
Longstanding interest exists in assessing agricultural technologies that have the potential to increase agricultural production and reduce environmental harm. However, the literature on alternative approaches to scaling out technology assessments is limited, albeit growing. This study presented an approach for technology scaling out and for ex-ante assessment using CA-based systems in Zambia as an example. We combined bottom-up and top-down approaches to examine spatial variation in biophysical and economic indicators, and then conducted a spatial farm typology analysis to assess the implications of CA-based systems beyond the field scale. We also highlighted five considerations for the improved assessment of scaling out technologies.
Our results highlighted a 33% increase in average maize yields across Zambia when comparing the CA-based system with the control system, with the average masking spatial variation both within and between districts and provinces (the district-scale range was −37% to 70%). Although country-scale studies inevitably miss some of the location-specific nuances desirable for technology assessments, we highlighted the value of considering economic factors in scaling out technology assessments. Furthermore, the spatial farm typology analysis provided context underlying the spatial variation in our indicators and insights into the suitability of CA-based systems for farmers. Our results suggested that the ranking of provinces based on biophysical indicators of CA-based systems can differ from the ranking based on economic indicators. Assessments, such as ours, that combine trial data and household survey data with simulation models and typology analysis highlight the synergies of the combined approach. Adequate data (for example, farm-level data by crop and cropping system) to conduct such assessments are often a limiting factor and require stronger efforts by the international research and development community to enable a more holistic assessment of farm systems, especially of their economic benefits.