How sustainable is sustainable intensification? Assessing yield gaps at field and farm level across the globe

Sustainable intensification has been proposed as a pathway to achieve food security and reduce environmental impacts of agriculture by focusing on narrowing yield gaps on existing agricultural land while improving resource use efficiencies. There is a general consensus that regions with large yield gaps can benefit most from sustainable intensification but it remains unclear how sustainable this is for farmers given their current resource constraints and livelihood strategies. Here, we draw upon three contrasting case studies, for which detailed data at field and farm levels were available for yield gap decomposition, to assess how sustainable intensification of crops (at field level) works out at farm level using environmental and socio-economic indicators. Although there is large potential for future intensification (more output with more input) of cereal production in southern Ethiopia, current input use in these farming systems is not economically and environmentally sustainable at farm level. The same is true for rice production in Central Luzon where sustainable intensification (more output with less input) can help to narrow yield gaps and improve N use efficiency (NUE) but it is not profitable due to the heavy reliance on costly hired labour. Trade-offs between yield gap closure and labour productivity were also observed in the aforementioned farming systems. Arable farms in the Netherlands exhibit small yield gaps as well as higher economic performance, NUE and N surplus compared to those observed in Southern Ethiopia and Central Luzon. For improving environmental sustainability, these farms require increases in resource-use efficiency and a reduction of the environmental impacts through a lower use of inputs (same output with less input). 
We conclude that public investments conducive to innovation and profitable farming are essential to make technologies accessible and affordable for farmers and to ensure that yield gaps can be narrowed and sustainability objectives served at the farm level.

Introduction
The world faces an enormous challenge to supply affordable food to an ever-increasing human population without overexploitation of natural resources and degradation of ecosystem services (Tilman et al., 2011). Sustainable intensification aims to narrow yield gaps on existing agricultural land while increasing resource-use efficiencies (Cassman and Grassini, 2020; Vanlauwe et al., 2014). Progress towards sustainable intensification can be monitored by measuring system indicators such as crop yield and yield gap, resource-use efficiency and soil quality (Cassman and Grassini, 2020). Agronomic technologies that can deliver sustainable intensification at field level are in general available for the major cereal crops (Wezel et al., 2015; Carberry et al., 2013). At regional level, there is consensus that regions with large yield gaps can benefit most from sustainable intensification (Cassman and Grassini, 2020; van Ittersum et al., 2016).
Yield gaps are defined as the difference between the potential (Yp) or the water-limited yield (Yw) and the actual yield (Ya) observed in farmers' fields under irrigated or rainfed conditions, respectively. Decomposing yield gaps into efficiency, resource and technology yield gaps is helpful to identify the management drivers of existing yield gaps (Silva et al., 2017b). The efficiency yield gap is defined as the difference between the technically efficient yield (Y_TEx, the maximum yield that can be achieved for a given input level) and Ya, and it captures the contribution of sub-optimal time, form and/or space of crop management practices. The resource yield gap is defined as the difference between the highest farmers' yield (Y_HF) and Y_TEx and it is attributed to a sub-optimal amount of inputs applied. Finally, the technology yield gap is defined as the difference between Yp or Yw and Y_HF and it can be attributed to the use of technologies in farmers' fields (e.g., varieties or balanced nutrition) that are inferior to those needed to reach Yp or Yw.
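The decomposition defined above can be illustrated with a minimal Python sketch. The function name and the example yields are hypothetical and chosen only for illustration; the sketch also assumes the ordering Ya ≤ Y_TEx ≤ Y_HF ≤ yield ceiling implied by the definitions.

```python
def decompose_yield_gap(ya, y_tex, y_hf, y_ceiling):
    """Split the total yield gap (t/ha) into efficiency, resource and technology gaps.

    ya        -- actual yield (Ya)
    y_tex     -- technically efficient yield (Y_TEx)
    y_hf      -- highest farmers' yield (Y_HF)
    y_ceiling -- potential (Yp) or water-limited (Yw) yield
    """
    # Assumption: the four yield levels are ordered as in the definitions.
    if not ya <= y_tex <= y_hf <= y_ceiling:
        raise ValueError("expected Ya <= Y_TEx <= Y_HF <= yield ceiling")
    return {
        "efficiency": y_tex - ya,        # sub-optimal time, form, space of management
        "resource": y_hf - y_tex,        # sub-optimal amount of inputs
        "technology": y_ceiling - y_hf,  # technologies inferior to those needed for Yp/Yw
        "total": y_ceiling - ya,
        "closure": ya / y_ceiling,       # relative yield gap closure
    }


# Hypothetical yields in t/ha, not taken from the datasets.
example = decompose_yield_gap(ya=3.2, y_tex=4.5, y_hf=5.4, y_ceiling=6.4)
print(example)
```

By construction the three intermediate gaps add up to the total yield gap, which is what allows them to be reported as shares of Yp or Yw.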
Sustainable intensification involves trade-offs between 'sustainability' and 'intensification' (Struik et al., 2014). Such trade-offs are likely to occur at the farm level given resource constraints and the timing of activities; some objectives are then prioritized over others. Few studies have paid attention to this in the past. Hence it remains unclear how sustainable intensification of crops (at field level) works out at farm level, given constraints of land, labour and capital availability and farmers' decisions on resource allocation coupled with their prioritization of crop management activities. Farmers' decisions can be classified as strategic, tactical and operational in terms of long, intermediate and immediate time scales (de Koeijer et al., 1999). Their decisions determine actual resource-use efficiencies and the extent to which growth-defining, -limiting and -reducing factors are optimised for a specific crop in the biophysical environment of the farm (Giller et al., 2011). In turn, management decisions are strongly conditional on the socio-economic environment of the farm and the farmers' personal priorities.
The analysis presented here draws upon a comparative analysis of farming systems with different degrees of agricultural development and intensification (Table 1). This is important to capture low-, medium- and high-yielding systems with contrasting resource-use efficiencies and historical differences in yield progress (Fischer et al., 2014). We selected three contrasting farming systems for which suitable on-farm data at field and farm levels were available for yield gap decomposition (Beza et al., 2017). These farming systems were mixed crop-livestock systems in southern Ethiopia, specialised rice-based farming systems in Central Luzon (Philippines) and arable farming systems in the Netherlands. The farming systems exhibited different rates of yield progress and intensity of fertiliser use (Tittonell and Giller, 2013). They have also been influenced by different degrees of structural adjustment in the national economy over the past half-century (Timmer, 2009).
This study aimed to assess how sustainable the sustainable intensification of crops is at farm level, using environmental and socio-economic indicators. First, yield gaps of the main cereal crops in each farming system were decomposed into efficiency, resource and technology yield gaps. Second, data on farm size and labour use at farm level were used to explain differences in the degree of intensification of each farming system. Third, the relationship between environmental indicators (N use efficiency and N surplus) and socio-economic indicators (economic performance and labour use) on the one hand and yield gap closure on the other was investigated.

Individual farm data
Household surveys in southern Ethiopia were conducted in 2012 by the International Maize and Wheat Improvement Center (CIMMYT) to map the potential demand for mechanization in the region (Baudron et al., 2019). A total of 200 farmers were interviewed in Hawassa and Asella using a semi-structured questionnaire. Households were selected using a systematic sampling procedure in each village. Hawassa is located in the Rift Valley and the main crops cultivated are maize, bean and enset, mostly for home consumption. By contrast, Asella is located in the southern highlands and the main crops grown are wheat, barley, tef, sorghum, and legumes such as pea and faba bean. Cereals are used for home consumption and for selling while legumes are mostly produced for home consumption.
The Central Luzon Loop Survey (Philippines) has been conducted by the International Rice Research Institute (IRRI) every 4-5 years since 1966. Data from all survey rounds from 1966-1967 up to 2011-2012 were used for yield gap analysis but results are only presented for the 2011-2012 period in this study. This survey aims to monitor temporal changes in crop management and household characteristics in rice-based farming systems in Central Luzon (Moya et al., 2015). Double cropping of rice is common in this farming system with a wet season (WS) crop cultivated between June-July and September-October and a dry season (DS) crop. The sample size was smaller in the DS than in the WS, due to water-related constraints for rice production and/or cultivation of other crops in the DS (e.g., maize and vegetables). The latter crops were not considered in the analysis.
Table 1 Overview of the datasets used for yield gap analysis and average values of key indicators for the year 2012. Yield gaps were also investigated for other crops marked with † but results are presented in Silva et al. (2017a). Costs included in the calculation of economic performance are specified in Table 2. Labour use represents an average at farm level and all other metrics refer to averages per crop. Maize and vegetables in Central Luzon were not included in any of the analyses presented in the manuscript.
Individual farm data from arable farms in the Netherlands during the period 2008-2012 were obtained from BINternet, Wageningen Economic Research. These data are collected yearly to monitor the economic performance of agricultural holdings in the Netherlands (van der Veen et al., 2014). Data from all years were used for yield gap analysis (Silva et al., 2017a) but here results are presented for 2012 only. The year 2012 was an atypical year with high cereal prices (data not shown), hence the economic results presented are well above average and should be interpreted with caution.
The farms monitored (n ≈ 175) are a random sample of farms selected from the agricultural census based on the type of farming (e.g., arable farms) and the economic size class (≥ 25 k€). Arable crops are cultivated in the Netherlands between March and October with the exception of winter wheat which is cultivated between November and August. Most farms grow a succession of root and tuber crops and cereals over multiple years, in particular sugar beet, potato and winter wheat. Spring onions and spring barley are also important but they are cultivated by fewer farmers.

Yield gap analysis
The yield gap decomposition presented in this paper builds upon the framework and methods introduced by Silva et al. (2017b). The framework was consistently applied to the main cereal crops in each farming system, as documented elsewhere (Silva, 2017). For each crop × country combination, stochastic frontier analysis was used to disentangle efficiency and resource yield gaps and crop growth models were applied to simulate the yield ceilings used to estimate technology yield gaps. The analysis was done independently for each crop × country combination but the use of a common framework and common methods allows for a robust comparison of yield gaps across case studies.
The yield gap was calculated as the difference between the potential yield (Yp) and the actual yield (Ya) for arable crops in the Netherlands and for rice in Central Luzon and as the difference between the water-limited yield (Yw) and Ya for wheat and maize in southern Ethiopia.
Yp was used as yield ceiling in the Netherlands due to the humid climate and shallow water tables with capillary rise and in Central Luzon where rice is irrigated. Cereals in Ethiopia are rainfed, hence Yw was adopted as a yield ceiling.
Crop models were used to simulate Yp and Yw for the growing seasons covered in the household surveys. Yw of wheat and maize in southern Ethiopia was simulated with, respectively, the WOFOST (Boogaard et al., 2013) and Hybrid-Maize (Yang et al., 2004) crop models and obtained from the Global Yield Gap Atlas. Further details about the crop models and the methodological approach used are provided elsewhere. For rice in Central Luzon, ORYZA v3 (Li et al., 2017) was used to simulate Yp for two rice varieties (IR72 and NSIC Rc222). Further details about model evaluation against field data are provided in Silva et al. (2017b). For cereal crops in the Netherlands, Yp (used as the yield ceiling because capillary rise of water limits water stress in many locations) was simulated with the WOFOST crop model as described by Reidsma et al. (2015). The use of crop models ensures that yield ceilings and yield gap closure (i.e., the ratio between actual yield and potential or water-limited yield under irrigated and rainfed conditions, respectively) can be reliably compared among the different crop × country combinations.
Ya was obtained through farmers' recall of production and field area, as recorded in the individual farm data. Yp, Yw and Ya were standardized to a moisture content of 15.5% for maize and 13.5% for wheat in southern Ethiopia, 14.0% for rice in Central Luzon and 16% for winter wheat and spring barley in the Netherlands. Y_HF was estimated as the mean Ya above the 90th percentile of Ya for each crop, considering WS and DS rice separately. We did not differentiate Y_HF per soil type and variety due to lack of data or lack of variation in these factors across farms. Stochastic frontier analysis was used to estimate Y_TEx because it distinguishes two error terms, statistical noise (v_it) and technical inefficiency (u_it), while using a set of input variables x to explain variation in a dependent variable y (Kumbhakar and Lovell, 2000). For further details on the stochastic frontier estimation for each crop × country combination see Silva (2017).
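As an illustration of the Y_HF estimate described above, the following Python sketch computes the mean of the actual yields lying above the 90th percentile, separately per crop. The percentile interpolation method, the fallback for samples without any value above the cut-off, and the example yields are assumptions made here for illustration; they are not taken from the original analysis.

```python
import statistics


def highest_farmers_yield(yields, percentile=90):
    """Mean of the actual yields (Ya) lying above the given percentile of Ya."""
    if len(yields) < 2:
        raise ValueError("need at least two yield observations")
    # quantiles(n=100) returns the 1st to 99th percentile cut points.
    # Assumption: 'inclusive' interpolation; other conventions shift the cut slightly.
    cut = statistics.quantiles(yields, n=100, method="inclusive")[percentile - 1]
    top = [y for y in yields if y > cut]
    if not top:
        # Assumption: with no yield above the cut-off (e.g. identical values),
        # fall back to the maximum observed yield.
        return max(yields)
    return statistics.mean(top)


# Hypothetical yields (t/ha); WS and DS rice are treated as separate crops.
yields_by_crop = {
    "WS rice": [2.1, 2.8, 3.0, 3.2, 3.3, 3.5, 3.9, 4.2, 4.6, 5.1],
    "DS rice": [3.4, 4.0, 4.4, 4.7, 4.9, 5.2, 5.6, 6.0, 6.3, 7.0],
}
y_hf = {crop: highest_farmers_yield(ya) for crop, ya in yields_by_crop.items()}
print(y_hf)
```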
Table 2 Costs considered in the estimation of economic performance at crop and farm level in each farming system. Seed costs for farms in Ethiopia were imputed. Other income sources were also included for arable farms in the Netherlands. Fixed costs associated to family labour and opportunity costs of owned land were not considered in the calculations of economic performance. ✓ = 'yes', × = 'no'.
Variable cost × × ✓
† Costs with administration, car, communication, advocacy, hygiene, water and insurance.
‡ Costs on interest and financial services.
∦ Costs as per the crop level plus delivery, depreciation, auction and other costs for Dutch farms.
* Costs with electricity, gas, oil and other sources of energy.
⋆ Depreciation and other costs of car, leasing, buildings, soil amendments, installations, inventory and machines and costs of property and water taxes.
Cereal yield responses to N were quantified using quantile regressions fitted to the 90th percentile of the data. These were estimated for wheat and maize in southern Ethiopia, WS and DS rice in Central Luzon and wheat and barley in the Netherlands individually as well as for the pooled sample with the quantreg() function of the statsmodels formula interface (smf) in Python (Seabold and Perktold, 2010). A linear-plus-exponential functional form of the type y = a + b × x + c × 0.99^x was assumed for this relationship, where a, b and c were the parameters estimated. These quantile regressions help to visualize the concept of Y_TEx in a single input-output setting.
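To make the response function and the 90th percentile criterion concrete, the sketch below evaluates the assumed functional form and the check (pinball) loss that a quantile regression minimises. It is not the estimation routine used in the study; the parameter values are hypothetical. Because the function is linear in a, b and c, it can be fitted as a linear quantile regression with x and 0.99^x as regressors.

```python
def response(x, a, b, c):
    """Yield response y = a + b*x + c*0.99**x for an N rate x (kg N/ha)."""
    return a + b * x + c * 0.99 ** x


def pinball_loss(observed, predicted, tau=0.9):
    """Average check loss minimised by quantile regression at quantile tau."""
    total = 0.0
    for y, y_hat in zip(observed, predicted):
        residual = y - y_hat
        # Points above the curve are weighted by tau, points below by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1) * residual
    return total / len(observed)


# Hypothetical parameters: yield of 2 t/ha without N, approaching 8 t/ha at high N.
a, b, c = 8.0, 0.0, -6.0
n_rates = [0, 50, 100, 200]
curve = [response(x, a, b, c) for x in n_rates]
print(curve)
```

With tau = 0.9, an observation lying above the fitted curve is penalised nine times more than one lying equally far below it, which is why the fitted curve tracks the upper envelope of the yield data.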

Use of resources at farm level
Data on farm size and labour use were available at crop and farm levels. Farm size (ha) refers to the sum of the area of individual rice fields cultivated by a household in Central Luzon and to the sum of the cultivated area of the main crops in southern Ethiopia (i.e., cereals and pulses in Asella and maize, beans and enset in Hawassa) and the Netherlands (all arable crops cultivated). Data on labour use for the main crop management operations were available for each rice field cultivated by a household in Central Luzon and for each of the dominant crops cultivated in southern Ethiopia. In both sites, it was possible to disaggregate total labour use by source (family and hired labour) and to quantify the proportion of hired to total labour used at farm level. Total labour use (labour-days per ha, ld ha-1) was defined as the sum of family and hired labour used for all crops, and respective operations between sowing and harvesting, standardized to an 8 h working day. Total labour use and the proportion of hired labour for arable farms in the Netherlands were estimated in a similar way but labour data for these farms were only available at farm level (i.e., aggregated for all crops and operations). Finally, the area cultivated per unit labour (m2 ld-1) was calculated as the ratio between farm size (expressed in m2) and the total number of labour-days used on the farm.
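The labour indicators described above can be sketched as follows. The function name and the hours used in the example are hypothetical, and the sketch assumes that the area cultivated per unit labour is the farm area in m2 divided by the total number of labour-days, which is the reading consistent with the unit m2 ld-1.

```python
HOURS_PER_LABOUR_DAY = 8  # labour standardised to an 8 h working day


def labour_indicators(farm_area_ha, family_hours, hired_hours):
    """Farm-level labour indicators from hours worked between sowing and harvesting."""
    family_ld = family_hours / HOURS_PER_LABOUR_DAY
    hired_ld = hired_hours / HOURS_PER_LABOUR_DAY
    total_ld = family_ld + hired_ld
    return {
        "labour_use_ld_per_ha": total_ld / farm_area_ha,
        "hired_share": hired_ld / total_ld,
        # 1 ha = 10,000 m2
        "area_per_labour_m2_per_ld": farm_area_ha * 10_000 / total_ld,
    }


# Hypothetical farm of 1.7 ha with 160 h of family and 640 h of hired labour.
farm = labour_indicators(farm_area_ha=1.7, family_hours=160, hired_hours=640)
print(farm)
```

Note that the area per unit labour is simply 10,000 divided by the labour use in ld ha-1, so the two indicators carry the same information on different scales.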
The relationship between relative yield gap closure on the one hand and farm size, labour use, area cultivated per unit labour and proportion of hired labour on the other was investigated visually. In the case of total labour use, quantile regressions were fitted to the 90th percentile of the data assuming the same functional form and using the same estimation method used to investigate cereal yield responses to N.

N-use efficiency and N surplus
N-use efficiency (NUE) and N surplus (Ns) were estimated at crop and farm level following the guidelines of the EU N Expert Panel (Quemada et al., 2020). These indicators were computed based on the mass balance principle for N, with NUE expressed as the ratio of N output to N input and Ns as the difference between N input and N output. N output was derived from the harvested product, where Ya_i (t ha-1) stands for the actual yield of crop c in farm i, DM% (%) for the dry-matter content and N_YIELD (kg N t-1 crop) for the N concentration in the harvested product of crop c. In Ethiopia, N_YIELD was assumed to be 11.3 for maize, 7.5 for kocho (enset), 37.8 for bean and faba bean, 12.5 for barley and 17.6 kg N t-1 DM for tef (Mellisse et al., 2017) and 21.5 for wheat, 21.0 for sorghum and 41.5 kg N t-1 DM for pea (Nijhof, 1987). A value of 11.0 kg N t-1 DM was assumed for rice in Central Luzon (Witt et al., 1999). For crops in the Netherlands, N_YIELD was assumed to be 3.3 kg N t-1 FM for ware potato, 3.0 kg N t-1 FM for seed potato, 3.7 kg N t-1 FM for starch potato, 1.8 kg N t-1 FM for sugar beet, 2.2 kg N t-1 FM for spring onion, 17.3 kg N t-1 FM for winter wheat and 13.0 kg N t-1 FM for spring barley (de Haan and van Geel, 2013). N input (Equation (2)) was defined based on three main sources of N available for crop growth during the growing season. Total N (kg N ha-1) refers to the amount of N applied with mineral and organic fertilisers (not corrected for fertiliser replacement values of organic sources) to crop c in farm i, and N_FIX (kg N ha-1) and N_DEPO (kg N ha-1) refer to the amount of biological N2-fixation provided by legume crops in the rotation and to the atmospheric N deposition in each site, respectively. For farms in Ethiopia, N input comprises the mineral N applied with urea and di-ammonium phosphate to all crops and an additional 42 kg N ha-1 for the farms in Hawassa which reported the use of animal manure for enset. Biological N fixation and atmospheric deposition were assumed to provide 4 and 5 kg N ha-1 yr-1 at this site, respectively (Mellisse et al., 2017).
For farms in Central Luzon, N input comprises the N applied with mineral fertilizers and an additional 50 kg N ha-1 crop-1 available through a combination of irrigation sediments, rain dust and biological N2-fixation (Dobermann, 2000). For Dutch farms, N input comprises the N applied with mineral fertiliser plus the N applied with organic manures not corrected for a fertiliser replacement value and an additional 25 kg N ha-1 yr-1 available due to atmospheric N deposition. Plant available N applied (kg N ha-1) was defined as the amount of N applied with mineral fertilisers and with organic fertilisers corrected for their fertiliser replacement values and it was used to study cereal yield responses to N applied (cf. Fig. 2). NUE values between 0.5 and 0.9 kg N kg-1 N are considered within a desirable range, values greater than 0.9 kg N kg-1 N indicate risk of soil mining in the long-term and values lower than 0.5 kg N kg-1 N point to inefficient use of N (Quemada et al., 2020). For Ns, a maximum of 80 kg N ha-1 was adopted as values above this threshold have high risk of N losses (e.g., NO3- leaching and NH3 volatilization). Lastly, relative yield gap closure for the main cereals was compared to the crop-specific NUE (estimated in the same way as per Equations (1)-(3) but for the crops of interest) in order to identify possible trade-offs between crop production and environmental performance.
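The N balance described in this section can be sketched in Python as follows, using the standard definitions of NUE (N output divided by N input) and N surplus (N input minus N output). The function names are hypothetical. The example uses the average maize yield, N concentration, N fixation and N deposition values reported for southern Ethiopia; the fertiliser rate of 20 kg N ha-1 is a hypothetical value, and the dry-matter content of 84.5% is an assumption derived from the 15.5% moisture standard.

```python
def n_balance(yield_t_ha, dm_percent, n_yield_kg_per_t_dm, n_applied, n_fix, n_depo):
    """Crop-level N output, N input, NUE and N surplus (all in kg N/ha except NUE)."""
    n_output = yield_t_ha * dm_percent / 100 * n_yield_kg_per_t_dm
    n_input = n_applied + n_fix + n_depo
    return {
        "n_output": n_output,
        "n_input": n_input,
        "nue": n_output / n_input,          # kg N per kg N
        "n_surplus": n_input - n_output,    # kg N/ha
    }


def classify_nue(nue):
    """Interpretation thresholds as given in the text (0.5 and 0.9 kg N per kg N)."""
    if nue > 0.9:
        return "risk of soil mining"
    if nue < 0.5:
        return "inefficient N use"
    return "desirable range"


# Maize in Hawassa: 1.6 t/ha, 11.3 kg N per t DM, N fixation 4 and deposition 5 kg N/ha.
# Assumptions: DM% = 100 - 15.5 and a hypothetical fertiliser rate of 20 kg N/ha.
maize = n_balance(1.6, 84.5, 11.3, n_applied=20.0, n_fix=4.0, n_depo=5.0)
print(maize, classify_nue(maize["nue"]), maize["n_surplus"] <= 80)
```

For crops whose N concentration is expressed per tonne of fresh matter, as for the Dutch crops, the same function can be used with a dry-matter percentage of 100.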

Revenues, costs and economic performance
Economic performance at crop and farm level was computed as the difference between revenues and production costs, respectively, for each crop or for all crops within a farm. Farmer self-reported data on quantities and prices were used for farms in Central Luzon and the Netherlands, but not for farms in Ethiopia for which prices were obtained from expert knowledge.
Revenues per crop were calculated as the product between Ya and the market price for that crop. Variable costs associated with material inputs per crop (e.g., seeds, fertilisers, crop protection and irrigation water) were considered for all farming systems as specified in Table 2. Seed costs were calculated based on self-reported data on seed rates and prices for farms in the Philippines and the Netherlands. The surveys conducted in Ethiopia did not record seed rates and costs, so these were imputed from the median values reported in other household surveys conducted by CIMMYT during 2013 in Hawassa and Asella (data not shown). Further details of these household surveys can be found in Assefa et al. (2020) and Silva et al. (2021a). Hired labour costs were also considered for farms in Ethiopia and Central Luzon and costs of rented land were only considered in Central Luzon (Table 2). No data on hired labour or land rental costs were available at crop level for Dutch arable farms.
Revenues at farm level were estimated as the sum of the revenues of the crops cultivated by each farm. Other sources of agricultural income (e.g., subsidies) were also considered for farms in the Netherlands in the estimation of revenues at farm level. The calculation of economic performance at farm level considered the same costs as specified for the calculation at crop level (i.e., material inputs, hired labour and rented land costs) for farming systems in Ethiopia and Central Luzon. For Dutch arable farms, the fixed costs specified in Table 2, as well as hired labour and land rental costs at farm level, were taken into account in addition to the material input costs used for the estimation of economic performance for individual crops. Examples of fixed costs considered for Dutch farms include general costs (e.g., administration and insurance), financial costs (e.g., on interest and services) and depreciation costs of buildings, materials and machines (Table 2). Fixed costs of family labour and opportunity costs of owned land were not considered in the analysis for any farming system. Prices refer to the year 2012 and were converted from local currencies to € using exchange rates for the same period (31.4 Ethiopian Birr per € and 59.9 Philippines Peso per €).
Fixed costs at farm level were not considered for farms in Ethiopia and in the Philippines due to lack of data in the surveys analyzed and lack of alternative data sources to retrieve fixed costs at farm level with accuracy. Moreover, imputing fixed costs would require a large number of assumptions and introduce a large uncertainty in the economic assessment presented here. Therefore, when comparing economic assessments between the different case studies (cf. Fig. 6) it is important to realize that fixed costs were only included for farms in the Netherlands.
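The calculations described in this section can be sketched as follows. The function names, prices and cost items in the example are hypothetical; only the average wheat yield in Asella and the exchange rates are taken from the text.

```python
BIRR_PER_EUR = 31.4   # exchange rates for 2012 as reported in the text
PESO_PER_EUR = 59.9


def crop_economic_performance(ya_t_ha, price_per_t, costs_per_ha, local_per_eur=1.0):
    """Revenue minus costs per ha for one crop, converted to EUR."""
    revenue = ya_t_ha * price_per_t
    margin_local = revenue - sum(costs_per_ha.values())
    return margin_local / local_per_eur


def farm_economic_performance(crops, other_income=0.0, fixed_costs=0.0):
    """Farm-level result in EUR.

    crops -- iterable of (area_ha, margin_eur_per_ha) pairs.
    other_income and fixed_costs were only considered for Dutch farms.
    """
    return sum(area * margin for area, margin in crops) + other_income - fixed_costs


# Wheat in Asella (2.7 t/ha) with a hypothetical price and hypothetical costs in Birr.
wheat_costs = {"seed": 1570.0, "fertiliser": 3140.0, "hired_labour": 1570.0}
wheat_margin = crop_economic_performance(2.7, 6280.0, wheat_costs, BIRR_PER_EUR)
farm_result = farm_economic_performance([(1.5, wheat_margin)])
print(round(wheat_margin, 1), round(farm_result, 1))
```

Keeping fixed costs as an explicit argument makes visible that results for the Netherlands are not directly comparable with those for Ethiopia and Central Luzon, where this term is zero.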

Yield gaps and yield responses to N
Actual yields were smallest in southern Ethiopia, intermediate in Central Luzon (Philippines) and greatest in the Netherlands (Fig. 1). For instance, maize and wheat yields in Hawassa and Asella (southern Ethiopia) were on average 1.6 and 2.7 t ha-1, respectively, which corresponds to ca. 25% of Yw. Rice yields in Central Luzon varied between 3.2 t ha-1 in the WS and 4.8 t ha-1 in the DS, which corresponds to ca. 50% of Yp. In the Netherlands, actual yields of spring barley were 5.5 t ha-1 (65% of Yp) while winter wheat yields were on average 7.6 t ha-1 or 75% of Yp. Yield gaps smaller than 30% of Yp were also observed for ware potato, sugar beet and spring onion in the Netherlands (Silva et al., 2017a). These figures align with earlier reports that yield gaps are small (20-30% of Yp) in Northwest Europe, intermediate (30-50% of Yp) in Southeast Asia and large (> 50% of Yw) in East Africa (Assefa et al., 2020; Schils et al., 2018; Stuart et al., 2016; Neumann et al., 2010).
Yield gaps were mostly attributed to technology yield gaps in southern Ethiopia, to efficiency (and technology) yield gaps in the Netherlands and to all three intermediate yield gaps in Central Luzon (Fig. 1). Although technology yield gaps in southern Ethiopia were the largest, closing efficiency and resource yield gaps would nearly double actual yields. This corresponds roughly to the yield increases needed to reach cereal self-sufficiency in Ethiopia. The yield gap of WS rice in Central Luzon was explained approximately equally by efficiency, resource and technology yield gaps (20, 15 and 15% of Yp) while for DS rice the relative contribution of technology yield gaps (25% of Yp) was slightly higher than that of other yield gaps (each ca. 15% of Yp). Narrowing efficiency and resource yield gaps in this farming system results in a yield gap closure of ca. 80% of Yp. For cereals in the Netherlands, the yield gap was explained equally by efficiency and technology yield gaps: each ca. 10% of Yp for wheat and 15% of Yp for barley. Narrowing efficiency yield gaps for Dutch arable crops would increase yields to ca. 80% of Yp.
Results from the stochastic frontier analysis indicate that narrowing yield gaps of maize and wheat in southern Ethiopia and of DS rice in Central Luzon requires larger N application rates (Table 3). Conversely, the lack of yield response to N for wheat in the Netherlands suggests there is scope to decrease nutrient application rates without compromising yield, while increasing N rates for WS rice in Central Luzon should not be encouraged due to risk of lodging (Lampayan et al., 2010). Fig. 2A shows that cereal yield responses to N across crops and farming systems follow the law of diminishing returns (Nijland et al., 2008). This was not evident when data were analyzed for each farming system separately (Getnet et al., 2016; Gines et al., 2004) as farms in southern Ethiopia (Fig. 2B) and farms in Central Luzon (Fig. 2C) are found on the steep part of the response curve while Dutch farms are on the plateau (Fig. 2D).
Results of quantile regressions suggest other factors limit yield responses to N, indicating better agronomy is needed to support responses to greater N rates (Fig. 2B-D). Narrowing resource and technology yield gaps in southern Ethiopia requires inputs not currently used (e.g., herbicides and composite fertilisers with K), knowledge of ecological principles to control pests and diseases (e.g., Kebede et al., 2015; Taa et al., 2004) and technologies that can ensure timely and precise operations (e.g., mechanization). For rice in Central Luzon, more timely fertiliser and pesticide applications are needed to narrow efficiency yield gaps (Silva et al., 2017b) while better management of K and of the interaction between establishment method and weed pressure can help to narrow resource and technology yield gaps (Lantican et al., 1999; Dobermann et al., 1996). Finally, it is questionable whether it is possible to further narrow efficiency yield gaps in the Netherlands due to impacts of weather extremes and machinery constraints on the timeliness of operations (Reidsma et al., 2015; van Oort et al., 2012). Furthermore, doing so may not lead to the largest profit for farmers.

Availability of land, labour and capital
Farm sizes were on average 0.8 ha in Hawassa, 2.0 ha in Asella and 1.7 ha in Central Luzon, which are much smaller than the 54 ha in the Netherlands (Fig. 3A). Moreover, the maximum farm size recorded in southern Ethiopia and Central Luzon (ca. 10 ha) corresponded to the minimum farm size observed in the Netherlands. Land was distributed unequally as 45% of the farms owned only 20% of the land in each farming system (data not shown). The analyses provide no clear evidence that small farms in a given farming system are more productive than large farms from the same farming system. Farms in southern Ethiopia and Central Luzon used much more labour to cultivate one ha of land than farms in the Netherlands and a threshold of 15 ld ha-1 was identified as the minimum and maximum labour use for smallholders in the tropics and for farms in the Netherlands, respectively (Fig. 3C). There was no association between yield gap closure and labour use at farm level in the Netherlands while there was a weak positive association between these two variables in southern Ethiopia and Central Luzon (Fig. 3C). This is not surprising as virtually all operations in Dutch farms are mechanized. Conversely, farming in southern Ethiopia remains dependent on manual labour and on labour-intensive animal draught (Gebregziabher et al., 2006; Aune et al., 2001; McCann, 1995).
Table 3. Summary of key drivers of the efficiency, resource and technology yield gaps for wheat and maize production in southern Ethiopia, rice production in Central Luzon (Philippines) and arable crop production in the Netherlands. Water limitation in the Netherlands refers mostly to sub-optimal distribution of rainfall during the growing season.
The situation of rice farming in Central Luzon in the late 1960s was not much different from the current situation in southern Ethiopia but farmers were able to access credit and to substitute labour by capital over time (Takahashi and Otsuka, 2009). Capital availability facilitated the adoption of improved varieties, direct-seeding and small machinery (Moya et al., 2015; Launio et al., 2008) which, together with investments in irrigation, contributed to increase rice yields from the 1970s onwards (Estudillo and Otsuka, 2006). A more dramatic transformation occurred in the Netherlands where labour has been substituted by capital to a point in which the degree of debt and investment capacity became a major determinant of farm performance (Zhengfei and Lansink, 2006). This economic pressure drove many non-profitable farms out of business and triggered increases in farm size (Mandryk et al., 2012). The area cultivated per unit labour (Fig. 3B) is useful to understand the substitution of labour by energy, and the capital costs associated with the use of that energy (de Wit, 1979). Dutch farms cultivated up to 5000 m² ld⁻¹ while that value was a factor 10 less for smallholders in the tropics (Fig. 3B). Historical data for Central Luzon showed sharp increases in land productivity during the 1970s followed by sharp increases in labour productivity from the late 1980s onwards. The latter was associated with an increase of the area cultivated per unit labour from ca. 160 m² ld⁻¹ in the late 1980s up to ca. 225 m² ld⁻¹ in 2012 in the DS, as a result of the adoption of direct-seeding and small machinery. The increase in the WS was not as sharp because transplanting remained the preferred establishment method during this season when water is abundant.
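The area cultivated per labour-day and the labour use per hectare discussed above are reciprocals of each other (1 ha = 10,000 m²). The short Python sketch below illustrates that conversion for the values quoted in the text; the function names are hypothetical and the snippet is an illustration only, not the procedure used to produce Fig. 3.

```python
M2_PER_HA = 10_000  # square metres in one hectare


def area_per_labour_day(labour_days_per_ha: float) -> float:
    """Convert labour use (ld ha-1) into area cultivated per labour-day (m2 ld-1)."""
    if labour_days_per_ha <= 0:
        raise ValueError("labour use must be positive")
    return M2_PER_HA / labour_days_per_ha


def labour_use_per_ha(area_m2_per_labour_day: float) -> float:
    """Convert area cultivated per labour-day (m2 ld-1) into labour use (ld ha-1)."""
    if area_m2_per_labour_day <= 0:
        raise ValueError("area per labour-day must be positive")
    return M2_PER_HA / area_m2_per_labour_day


# Values quoted in the text (m2 ld-1), converted to ld ha-1
for label, area in [("Netherlands (upper end)", 5000),
                    ("Central Luzon, late 1980s", 160),
                    ("Central Luzon, 2012 DS", 225)]:
    print(f"{label}: {labour_use_per_ha(area):.1f} ld per ha")
```

Read this way, the 15 ld ha⁻¹ threshold separating the two groups of farming systems corresponds to roughly 670 m² cultivated per labour-day.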

Relationships between farm resources and yield gap closure
Greater use of energy, as observed in the Netherlands, favours yield gap closure as it relates to capital intensive technologies (e.g., machinery). Such technologies require strategic investments in the long-term and farmers are encouraged to maximize their returns to such technologies as they often can only be used for specific operations. By contrast, smallholders strive to maximize returns to labour, through on- and/or off-farm employment, and have limited access to markets. This is clear in Central Luzon, where the proportion of hired labour increased over the past half-century up to ca. 80% of total labour use (Fig. 3D; Takahashi and Otsuka, 2009; Otsuka, 2000; Estudillo and Otsuka, 1999; Kerkvliet, 1990). This is an example where short-term investments focused on maximising returns to labour, which are not 'locked in' to farming and allow flexibility, are a suitable livelihood strategy for smallholders (Dorward, 2009).
The lack of investment in labour-saving technologies can explain the greater yield gaps for smallholders than for Dutch arable farms, but not the smaller yield gaps in Central Luzon than in Southern Ethiopia. The latter may be explained by three main factors. First, rice farming in Central Luzon is mostly irrigated while cereal farming in southern Ethiopia is rainfed and thus, dependent on the amount and distribution of rainfall. Second, farms in Central Luzon have better access to innovation, markets and infrastructure than farms in southern Ethiopia, where timely availability of inputs remains problematic. This is especially true for the farms included in the Central Luzon Loop Survey, as these are located along the main road, and close to research centres, hence being the first to benefit from improved seed and practices. Finally, farms in Central Luzon are largely dominated by irrigated lowland rice while farms in southern Ethiopia cultivate cereals and pulses that compete for labour over time (see Fig. 5).
The sharp contrast in farm size between Hawassa and Asella has important consequences for sustainable intensification. Farms in Hawassa are very small (Fig. 3A) as a result of high population densities (600 persons km⁻²). The small farm sizes prompted farmers to replace staple crops by high-value crops (e.g., the narcotic, khat) in order to increase the economic returns to land (Mellisse et al., 2017). Intensification of staple crop production in this site is thus unlikely and opportunities off-farm are needed to solve liquidity problems. Conversely, the prospects for Asella are different given the lower population density (200 persons km⁻²), slightly greater farm sizes and good market opportunities (e.g., beer breweries and increasing demand for cereals). Mechanization is considered an interesting option to increase land and labour productivities while reducing labour requirements in this location.

Crop and farm performance
NUE was greater than 0.9 kg N kg⁻¹ N for ca. 60% of the farms in Asella (Fig. 4A). The high NUE in this site results from a high N concentration assumed for this crop (21.5 kg N t⁻¹ DM) in combination with low amounts of N applied (Fig. 2A). This indicates a critical risk of soil N mining in the long run, a characteristic of low-input cropping systems (Stoorvogel et al., 1993). NUE was less than 0.5 kg N kg⁻¹ N for ca. 85% of the farms in Central Luzon (2011 WS and 2012 DS) and in Hawassa (Fig. 4A). The low NUE in these sites indicates that there is scope to reduce N losses to the environment through improved crop management (Table 3). High NUE and N output were observed for Dutch farms, with ca. 80% of the farms surveyed in 2012 exhibiting a NUE between 0.5 and 0.9 kg N kg⁻¹ N (average of 0.63 kg N kg⁻¹ N, Fig. 4A). Analysis for individual crops corroborates these findings as NUE was very high for wheat in Asella (between 0.5 and 1.5 kg N kg⁻¹ N), very low for maize in Hawassa and rice in Central Luzon (0.3-0.5 kg N kg⁻¹ N) and within a desirable range for wheat and barley in the Netherlands (0.4-1.0 kg N kg⁻¹ N; Fig. 4B). High values of NUE (0.9-1.0 kg N kg⁻¹ N or even higher) in the Netherlands are desirable given the high input levels and fertility status of the soils (e.g., Reijneveld et al., 2009).
Fig. 3. Yield gap closure for cereals and farm resources during the year 2012: A) farm size (in log), B) area cultivated per labour-day at farm level, C) labour use at farm level and D) proportion of hired labour at farm level. Solid lines in C) are quantile regressions fitted to the 90th percentile of the data. Total labour use, in labour-days (ld), expresses the total number of days worked by family and hired labourers per ha on an 8 h working day basis. Country codes: ETH = Ethiopia, PHL = Philippines, NLD = Netherlands.
N surplus was below 80 kg N ha⁻¹ for all farms in Asella and ca. 90% of the farms in Hawassa while the opposite was true in Central Luzon and in the Netherlands, where N surplus was above 80 kg N ha⁻¹ for ca. 70% and 60% of the farms, respectively (Fig. 4A). The latter indicates a risk of high N losses to the environment and is a result of poor crop management in Central Luzon, where matching N application with soil N supply is a well-known challenge, and a combination of high N input and high N output in the Netherlands (Silva et al., 2021b; van Grinsven et al., 2019).
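The thresholds used in the two preceding paragraphs can be summarised in a few lines of Python. The sketch below assumes the usual definitions, NUE = N output / N input and N surplus = N input − N output (both per hectare), and uses the reference values quoted in the text (0.5 and 0.9 kg N kg⁻¹ N, 80 kg N ha⁻¹). The class name, the function name, the flag labels and the example farms are hypothetical and serve only as illustration; they are not survey data.

```python
from dataclasses import dataclass
from typing import List

NUE_LOW = 0.5          # kg N kg-1 N; below this, N is used inefficiently
NUE_HIGH = 0.9         # kg N kg-1 N; above this, risk of soil N mining
N_SURPLUS_MAX = 80.0   # kg N ha-1; above this, risk of high N losses


@dataclass
class FarmNBalance:
    n_input: float   # kg N ha-1
    n_output: float  # kg N ha-1

    @property
    def nue(self) -> float:
        """N use efficiency: N output per unit of N input."""
        if self.n_input <= 0:
            raise ValueError("N input must be positive")
        return self.n_output / self.n_input

    @property
    def n_surplus(self) -> float:
        """N surplus: N input minus N output (kg N ha-1)."""
        return self.n_input - self.n_output


def assess(farm: FarmNBalance) -> List[str]:
    """Return the sustainability flags raised by one farm-level N balance."""
    flags = []
    if farm.nue > NUE_HIGH:
        flags.append("risk of soil N mining")
    elif farm.nue < NUE_LOW:
        flags.append("low NUE")
    if farm.n_surplus > N_SURPLUS_MAX:
        flags.append("high N surplus")
    return flags or ["within desirable range"]


# Illustrative (made-up) farms: low-input, inefficient, and high-input/high-output
print(assess(FarmNBalance(n_input=20, n_output=30)))
print(assess(FarmNBalance(n_input=150, n_output=50)))
print(assess(FarmNBalance(n_input=250, n_output=157.5)))
```

The third example shows why the two indicators are read together: a farm can fall within the desirable NUE range and still exceed the N surplus threshold when both N input and N output are high, as described for the Netherlands.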
The importance of labour use and labour productivity is often overlooked in smallholder farming systems (Woodhouse, 2010). Yet, labour dynamics are informative of the sustainability of a farming system from a social perspective. Labour peaks for land preparation, sowing and harvesting of crops in Asella overlapped in time as a result of a narrow sowing window (in the months of June and July when rainfall is greatest, Fig. 5A), which has also been identified as a critical feature of other smallholder farming systems (Ollenburger et al., 2016; Leonardo et al., 2015; Baudron, 2011; Stone et al., 1990). Conversely, there was a complementary use of labour between crops cultivated in Hawassa, as the labour peaks for different crops are more evenly spread throughout the year (Fig. 5B). Competition for labour within the growing season was also not pronounced for rice farms in Central Luzon because labour peaks for key operations are distributed over time (Fig. 5C).
Fig. 4. Environmental performance during the year 2012: A) N outputs and N inputs (without considering mineral fertiliser replacement values for organic manures) at farm level following the NUE indicator of Quemada et al. (2020) and B) yield gap closure for cereals and N use efficiency at crop level. N use efficiency did not take into account soil mineralization (input) and crop residues (output) under the assumption both N flows match each other in the internal cycling. Country codes: ETH = Ethiopia, PHL = Philippines, NLD = Netherlands.
Farms in southern Ethiopia obtained revenues of at most 1500 € ha⁻¹ (800 € ha⁻¹ on average) and had total production costs lower than 300 € ha⁻¹ (100 € ha⁻¹ on average; Fig. 6A). Revenues from rice farming in Central Luzon were similar to those in southern Ethiopia (mean and maximum of 900 and 2000 € ha⁻¹) but total production costs were considerably higher (average and maximum of 650 and 1250 € ha⁻¹) with hired labour costs accounting for more than 50% of the total production costs (Moya et al., 2015). For Dutch farms, revenues and total production costs per ha were above the maximum observed in Central Luzon (of 2000 € ha⁻¹). However, negative economic performance for some farms in Central Luzon (during the WS) and the Netherlands indicates that farming is not always profitable (Fig. 6A). For the Netherlands this is due to high fixed costs. At cereal crop level, economic performance per ha was ca. three times greater in the Netherlands and for wheat in Asella than for maize in Hawassa and rice in Central Luzon (Fig. 6B). Economic performance of wheat in Asella and the Netherlands was at most ca. 1000 € ha⁻¹ and ca. 2000 € ha⁻¹, respectively (Fig. 6B). Conversely, economic performance was lower than 500 € ha⁻¹ for maize in Hawassa and rice in Central Luzon (Fig. 6B).
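To make the contrast between revenues and costs concrete, the sketch below works through the average figures quoted above. It assumes that economic performance is computed as revenue minus total production costs per hectare; this definition and the function names are assumptions made for illustration and are not taken from the methods of the study.

```python
def economic_performance(revenue: float, total_costs: float) -> float:
    """Revenue minus total production costs, both in EUR per ha (assumed definition)."""
    return revenue - total_costs


def cost_share(revenue: float, total_costs: float) -> float:
    """Fraction of revenue absorbed by total production costs."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return total_costs / revenue


# Average revenues and total production costs quoted in the text (EUR ha-1)
averages = {
    "southern Ethiopia": {"revenue": 800.0, "costs": 100.0},
    "Central Luzon": {"revenue": 900.0, "costs": 650.0},
}

margins = {
    site: economic_performance(values["revenue"], values["costs"])
    for site, values in averages.items()
}

for site, margin in margins.items():
    share = cost_share(averages[site]["revenue"], averages[site]["costs"])
    print(f"{site}: {margin:.0f} EUR per ha, costs = {share:.0%} of revenue")
```

Under this assumption the quoted averages imply roughly 700 € ha⁻¹ for southern Ethiopia and 250 € ha⁻¹ for Central Luzon. These are back-of-the-envelope values derived from the reported means, not results reported in the study.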

Sustainability assessment at farm level
Sustainable intensification for smallholders in Africa is urgently needed to satisfy the region's rapidly rising and shifting food demands (Gerard, 2020). While large yield gaps and low resource-use efficiencies show that there is considerable room for increasing productivity in Africa (Figs. 1 and 4), it is also clear that this has to occur within a context where farm sizes are small, capital is scarce, labour is 'abundant' (Giller, 2021, Fig. 3) and where returns to labour are equally or more important than returns to land (Silva and Ramisch, 2019). The small farm sizes are indeed central to the food security conundrum observed in Africa where achieving food security requires abundant, affordable and nutritious food for a growing population, but most smallholders lack sufficient incentives to invest in agriculture (Giller, 2021). On the other hand, the Asian experience during the years of the Green Revolution showed that it is possible to substantially increase yields under smallholder conditions (Cassman et al., 2003). The latter has led to the proposal that development and effective dissemination of technologies are the key to boost smallholder productivity and that rural development in Africa, like in Asia, must depend on small farms (Larson et al., 2016). Evidently, farming is only one of many livelihood activities for smallholders and economic development must offer jobs outside agriculture (Giller, 2021; Larson et al., 2016).
Increasing wheat yields in Asella also increases economic performance, making intensification economically interesting (Fig. 6). This is indeed what would be expected in a region with high population density and expanding markets, factors that are conducive to the adoption of sustainable intensification technologies (Jayne et al., 2019). However, the different crops cultivated in this region compete for labour in key periods of the growing season (Fig. 5), hence yield increases should go in tandem with increases in labour productivity. The high NUE, and associated risk of soil N mining in the long-term, indicates that N application rates should be increased and that doing so would be sustainable from an environmental perspective (Fig. 4; Ladha et al., 2020). Conversely, increasing maize yields in Hawassa barely increases economic performance, which, in addition to small farm sizes, provides few economic incentives for intensification (Figs. 6 and 3). As NUE is low in this site (Fig. 4), intensification through greater N inputs would only be sustainable if other practices are also improved (Table 3). In summary, increasing input use is needed to increase wheat and maize yields and NUE in southern Ethiopia, yet economic incentives to do so depend on farm sizes and types of crops cultivated.
There are also few economic incentives to narrow rice yield gaps in Central Luzon given that higher yields translate into greater costs and dependency on hired labour (Figs. 3 and 6). The low NUE indicates other investments are needed to increase yield and environmental performance (Fig. 4; Ladha et al., 2020), which is indicative of the challenge to manage fertilizer N efficiently in irrigated rice. Rice farming in Central Luzon has remained a smallholder operation over the past decades but experienced sharp increases in labour productivity, hired labour and off-farm income as well as a stagnation or even decline in profitability (Moya et al., 2015; Takahashi and Otsuka, 2009). Despite being the main rice supplier to Metro Manila, rice farming in Central Luzon is not a 'professional business', which may well compromise the ability of the country to feed itself in the future (Laborte et al., 2012). A new, yet controversial, Rice Tariffication Law (RTL) was introduced in 2019 to increase the competitiveness of rice farming in the Philippines. Ex-ante assessments of the RTL indicate a reduction in domestic rice prices and of inflation in years with high rice prices (Balie and Valera, 2020). The former is beneficial to consumers (and producers who are net buyers of rice) but not to noncompetitive rice farmers (Balie et al., 2021). Tariff revenues are envisaged to promote technologies like improved seeds and mechanization, and capacity building that can increase the competitiveness and modernization of rice farmers, and to be invested in public goods and services offering off-farm opportunities to noncompetitive rice farmers.
[...] Table 2 for an overview of the costs considered per farming system. Country codes: ETH = Ethiopia, PHL = Philippines, NLD = Netherlands.
In the Netherlands, yield gaps are small and therefore only small yield increases are possible. The same is true for other countries in Northwest Europe (Schils et al., 2018) and for intensive maize-soybean cropping systems in North America (Grassini et al., 2015). Our analysis, however, suggests that there is a positive relationship between cereal yields and economic performance (Figs. 4B and 6B), and perhaps economic performance could be increased further when yields go beyond the often quoted 80% of Yp (Cassman, 1999). Economically, yield increases are attractive and important to counter-balance the very high production and fixed costs. For the Netherlands in particular, the tight land market translates into very high land prices, which make input costs relatively less important. At the same time, stringent environmental legislation limits the use of external inputs and manure in the European Union (Velthof et al., 2014; Grinsven et al., 2016). Yet, at least for some farms in the Netherlands it seems possible to achieve higher yields with the same N inputs or the same yields with lower N inputs (Figs. 2 and 4), meaning that further increases in NUE may be possible without sacrificing yields (Silva et al., 2021b; Ladha et al., 2020), while reducing N surpluses and emissions.
Yield gap closure in the Netherlands and other intensive farming systems of the developed world (e.g., United States, Japan and other countries of NW Europe) benefited from large public investments in the agricultural sector. Such investments included subsidies and price support for producers and consumers (e.g., the Common Agricultural Policy in the European Union) and other institutional supports for successful collaboration between research, education and extension organizations. At farm level, public sector support translated into accessible and affordable capital-intensive technologies that favoured agricultural intensification per unit of land in most high-yielding regions of today. Farms in those regions also contributed to and benefited from general economic growth which created jobs outside of agriculture, which in turn allowed for increases in farm size with associated benefits from economies of scale. In contrast, public investments in the agricultural sector of most sub-Saharan African countries have been considerably smaller (on average ca. 4% of public expenditures for countries in sub-Saharan Africa; Goyal and Nash, 2017) than the committed allocation of at least 10% of public expenditure to agriculture stated in the Malabo Declaration (African Union, 2014). Increasing public investments in agriculture is thus essential to ensure yield gaps can be narrowed and sustainability objectives served at the farm level.

Conclusion
Assessing whether sustainable intensification ('more output with less input') is truly sustainable requires consideration of resource constraints and environmental and socio-economic performance at farm level. We conclude that whilst there is large potential for intensification ('more output with more input') in southern Ethiopia, where yield gaps are about 80% of Yw, this is currently neither economically nor environmentally sustainable at farm level. The same applies to rice farming in Central Luzon where the combination of negative profitability and a heavy reliance on hired labour slows the progress towards sustainable intensification as a way to improve NUE and increase rice yields beyond 50% of Yp. Although high yields in the Netherlands, where yield gaps are only 20-30% of Yp, are associated with higher economic performance and resource-use efficiency, future research should investigate options for increasing resource-use efficiency and lowering environmental impacts through reducing input intensity ('same output with less input'). Yield gap closure in the Netherlands, and other intensive farming systems in Europe and the Americas, was largely accompanied by public investments encouraging the adoption of innovations and supporting agricultural markets (including subsidies, price support and other institutional supports) which in turn made technologies accessible and affordable for farmers. Agricultural transformation, including the adoption of sustainable intensification technologies, is thus only likely to take place where public investments to support farmers are ensured, even in regions with large yield gaps.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.