Subnational distribution of average farm size and smallholder contributions to global food production

Smallholder farming is the most prevalent form of agriculture in the world, supports many of the planet’s most vulnerable populations, and coexists with some of its most diverse and threatened landscapes. However, there is little information about the location of small farms, making it difficult both to estimate their numbers and to implement effective agricultural, development, and land use policies. Here, we present a map of mean agricultural area, classified by the amount of land per farming household, at subnational resolutions across three key global regions using a novel integration of household microdata and agricultural landscape data. This approach provides a subnational estimate of the number, average size, and contribution of farms across much of the developing world. By our estimates, 918 subnational units in 83 countries in Latin America, sub-Saharan Africa, and South and East Asia average less than five hectares of agricultural land per farming household. These smallholder-dominated systems are home to more than 380 million farming households, make up roughly 30% of the agricultural land and produce more than 70% of the food calories produced in these regions, and are responsible for more than half of the food calories produced globally, as well as more than half of global production of several major food crops. Smallholder systems in these three regions direct a greater percentage of calories produced toward direct human consumption, with 70% of calories produced in these units consumed as food, compared to 55% globally. Our approach provides the ability to disaggregate farming populations from non-farming populations, providing a more accurate picture of farming households on the landscape than has previously been available. These data meet a critical need, as improved understanding of the prevalence and distribution of smallholder farming is essential for effective policy development for food security, poverty reduction, and conservation agendas.


Introduction
In recent years, the attention of global agriculture and development communities has turned toward the world's smallest farms [1][2][3]. Evidence is mounting that smallholder and family farms are crucial to feeding the planet, and that successful policies aimed at poverty alleviation, food security, and protection of biodiversity and natural resources depend on the inclusion and participation of small farmers [4][5][6].
This shift aligns with increased global focus on the sustainable development goals (SDGs), as agricultural development has been identified as an essential component of the first goal of reducing poverty and hunger [2], and investments in small farms have been specifically identified by the United Nations as a way to address SDGs relating to poverty, nutrition, hunger, and environmental sustainability [3].
There is good reason for this focus; small farms, often cultivated by single families on very small plots of land, are the most prevalent form of agriculture in the world [6][7][8]. Agriculture remains one of the only global industries that relies largely on family scale labor and production [9], and small farms support many of the planet's most vulnerable populations and coexist with some of its most diverse and threatened landscapes [10][11][12][13]. Crop and landscape diversity on small farms can regulate ecosystem processes and increase system resilience [14,15], and small farms are seen in many systems to have greater crop productivity per unit area than large farms [15,16].
A number of assessments have found that growth in smallholder agriculture can have strong impacts on poverty reduction [3,9,17]. The United Nations has stated that achieving poverty reduction goals requires policies that cater to the needs of smallholders [3], and last year the Gates Foundation pledged $2 billion for investment in agricultural technology innovations that will enhance the productivity of smallholder farmers, as part of a push to meet SDG targets [18].
Recent attempts to quantify the prevalence and contributions of small and family farms (these terms often used interchangeably) estimate that, at the global scale, there are more than 475 million farms that are less than two hectares in size, and that small farms control from 40% to more than 50% of global farmland and produce more than half of the world's food [4,7,8,19]. However, few studies are able to compare the impact on poverty of agricultural growth from large farms versus that from small farms [2]. A 2015 UN report explicitly stated that lack of attention to and investment in small farms has been 'exacerbated by the poor quality of data on the number of smallholders, their contribution to total agricultural production and GDP, and their share in labor force participation' [3].
Challenges of mapping small farms
Effective policies for agricultural innovation, land use, or poverty and hunger reduction require identification of groups of producers with similar production strategies, resources, and constraints [20,21], and classifying and mapping global agricultural systems is essential to designing appropriate technologies and identifying environmental impacts of agriculture [22,23]. However, despite the recent spotlight on small farms and increasing consensus on their importance, spatially explicit data on smallholder farmers are virtually absent.
Global-scale assessments of smallholder farming, including the World Census of Agriculture collected by the United Nations Food and Agriculture Organization (FAO), provide data at the national scale only [4,6,24]. There is also an absence of systematic and comparable data for all countries, including a wide variation among and within countries as to the terminology used to describe small farms, what thresholds are used to define farm size classes, and whether the smallest farms are even included [4,24,25].
In addition, there is a need for improved data on human populations and land-use practices, especially in the developing world [18,21]. Over the past decade there have been numerous ongoing efforts to map global land use, including farming systems, that combine remotely sensed land cover data, crop production data, and spatial estimates of human population density [see: 10,20,26,27,28,29]. However, none of these previous large-scale approaches to mapping the human landscape incorporate household census data that can, for example, distinguish between farming and nonfarming populations.
Mapping the concentration of farming households
Here, we create a map of the concentration of farming households at subnational scales across much of the developing world, in order to better assess the role of smallholders in food security and for use in effective targeting of agricultural and land use policies. We focus on Latin America, sub-Saharan Africa, and East and South Asia, which together account for nearly 90% of the world's farms [4]. This product improves on our previous ability to quantify smallholder farming in two primary ways. First, it incorporates household-level census data on farming activity, allowing us to differentiate agricultural populations from overall human population density. Second, the use of census microdata at subnational scales allows for greater spatial disaggregation of household data than has previously been possible [30], allowing us to identify specific administrative units where agricultural production is likely to be most dependent on small farms. We use this mapped product to estimate the contribution of smallholder systems to global agricultural extent, farming population, and food production.

IPUMS data extraction
We extracted household census data from the Integrated Public Use Microdata Series-International (IPUMS) database, which harmonizes household-level demographic variables across countries and years [31,32]. All records were extracted from the most recent available census for all available countries in Latin America, sub-Saharan Africa, and South and East Asia. These three global regions are heavily represented in the IPUMS database, and largely comprise developing countries in the global south, where smallholder farming is both crucial to livelihoods and poorly quantified through existing datasets. Census dates range from 1983 to 2011, and all but five are dated 2000 or later.
Within each country, we utilized the smallest subnational administrative unit for which data were available in the IPUMS dataset, with the exception of Mexico where the smallest available division produced more than 3000 units. For each subnational unit the following were tabulated: (1) records indicating the respondent was a household head (variable=RELATE); (2) among household heads, records indicating that the respondent was involved in agriculture as their primary industry (variable=INDGEN). This variable specifies individual employment of census respondents based on the United Nations' International Standard Industrial Classification of All Economic Activities, and may include forestry and fisheries activities as well as cropping and livestock husbandry. We consider each household head working in agriculture to represent a farming household. Tabulated counts of farming households were scaled up to approximate the total number in each subnational unit using weights provided by IPUMS.
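The tabulation described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the record layout, the boolean flags `is_head` and `in_agriculture` (standing in for recoded RELATE and industry values), and the `weight` field are assumptions made for the example, and real IPUMS codes and weight variables should be taken from the IPUMS documentation.

```python
from collections import defaultdict


def count_farming_households(records):
    """Sum the weights of household heads working in agriculture, per unit.

    Each record is a dict with hypothetical keys:
      unit           - identifier of the subnational administrative unit
      is_head        - True if the respondent is the household head
      in_agriculture - True if the respondent's primary industry is agriculture
      weight         - expansion weight attached to the sampled record
    """
    totals = defaultdict(float)
    for rec in records:
        # Only household heads whose primary industry is agriculture count
        # as representing a farming household.
        if rec["is_head"] and rec["in_agriculture"]:
            totals[rec["unit"]] += rec["weight"]
    return dict(totals)


# Hypothetical sample records, for illustration only.
sample_records = [
    {"unit": "A", "is_head": True, "in_agriculture": True, "weight": 10.0},
    {"unit": "A", "is_head": True, "in_agriculture": False, "weight": 10.0},
    {"unit": "A", "is_head": False, "in_agriculture": True, "weight": 10.0},
    {"unit": "B", "is_head": True, "in_agriculture": True, "weight": 20.0},
    {"unit": "B", "is_head": True, "in_agriculture": True, "weight": 20.0},
]
farming_households = count_farming_households(sample_records)
```

Summing weights rather than counting rows is what scales the sampled records up to an estimate of the full number of farming households in each unit.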

Estimation of agricultural land
We define agricultural land as the sum of cropland, including permanent tree crops, and pasture area in each unit, using a new global data set of croplands and pastures for year 2010 [33]. This new data set was developed using an approach similar to that of Ramankutty et al [29], but using updated methodology, and by calibrating MODIS satellite-based land cover from Boston University against a global compilation of agricultural census statistics. While Ramankutty et al [29] used the final classified MODIS land cover dataset, Plouffe et al [33] used the posterior land cover probabilities that are calculated in the process of making the MODIS land cover product; this resulted in greater accuracy in the spatial patterns of the final cropland and pasture maps.
By overlaying census data onto this map of agricultural extent, we are able to calculate the number of farming households per hectare of agricultural land. The inverse of this farming household density provides an estimate of the amount of agricultural land area per farming household in each subnational administrative unit. We refer to this figure as the mean agricultural area (MAA) for each unit, defined as hectares of agricultural land divided by number of farming households. While differing from traditional metrics of farm size, it is designed as a proxy for the prevalence of smaller or larger farms on the landscape.
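As a minimal sketch of the MAA definition given above (hectares of agricultural land divided by the number of farming households), the calculation for a single unit might look like this. The function name, the handling of units with no farming households, and the input values are assumptions for illustration.

```python
def mean_agricultural_area(cropland_ha, pasture_ha, farming_households):
    """Return hectares of agricultural land per farming household.

    Agricultural land is the sum of cropland and pasture. Returns None when
    a unit has no farming households, since the ratio is undefined there
    (this convention is an assumption of the sketch).
    """
    if farming_households <= 0:
        return None
    return (cropland_ha + pasture_ha) / farming_households


# Hypothetical unit: 30 000 ha cropland, 10 000 ha pasture, 16 000 households.
maa = mean_agricultural_area(30_000, 10_000, 16_000)
```

In this hypothetical unit the MAA is 2.5 ha per farming household, which would place it below the five-hectare threshold used in the study.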

Classification of units by MAA
Classification occurred in three steps. First, the GRUMPv1 population density grid for the year 2000 was used to classify units with more than one thousand people per km² as urban [34]. Second, we classified all remaining units by MAA. Common thresholds for defining small farms are two and five hectares [6]. We use an MAA of five hectares as our definition for smallholder systems in order to better account for both the range of farm size distributions in our study area (especially in Latin America) and the prevalence of mixed systems (especially in sub-Saharan Africa) that incorporate pasture together with cropland, increasing the amount of agricultural area per farming household. Units with less than two hectares per farming household are further classified as very small farms. Remaining units are classified to represent the log-normal distribution of MAA in our sample: units of medium sized farms are defined as having a mean farm size of 5-15 hectares, large farming as 15-50 hectares, and very large farming as 50 hectares or more. Third, we classified primarily extensive grazing lands as those units with large or very large mean farm size (equivalent to mean agricultural area greater than 15 ha) where pasture makes up more than 90% of agricultural land.
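The three classification steps can be sketched as a single function. This is an illustrative reading of the thresholds stated above, not the code used in the study; in particular, the treatment of values falling exactly on a class boundary (5, 15, or 50 ha) is an assumption, as are the function and argument names.

```python
def classify_unit(pop_density_km2, maa_ha, pasture_share):
    """Assign a subnational unit to a class using the thresholds in the text.

    pop_density_km2 - mean population density (people per square kilometre)
    maa_ha          - mean agricultural area per farming household (hectares)
    pasture_share   - fraction of agricultural land in pasture (0 to 1)
    """
    # Step 1: densely populated units are set aside as urban.
    if pop_density_km2 > 1000:
        return "urban"
    # Step 3 (checked before the size classes, with the same outcome):
    # large or very large units that are almost entirely pasture.
    if maa_ha > 15 and pasture_share > 0.90:
        return "extensive grazing"
    # Step 2: remaining units are classified by MAA.
    if maa_ha < 2:
        return "very small"
    if maa_ha < 5:
        return "small"
    if maa_ha < 15:
        return "medium"
    if maa_ha < 50:
        return "large"
    return "very large"


# Hypothetical units, for illustration only.
example_classes = [
    classify_unit(1500, 1.0, 0.10),
    classify_unit(200, 1.5, 0.20),
    classify_unit(200, 4.9, 0.20),
    classify_unit(5, 60.0, 0.95),
]
```

Units returned as "very small" or "small" together correspond to the smallholder systems (MAA below five hectares) discussed in the rest of the paper.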
Construction of model to classify units for which there is no available census data
While IPUMS provides extensive census data in our three regions of interest, there are a number of countries for which census data is unavailable. MAA is correlated with population density, as well as with other features of the agricultural landscape; thus incorporating available data into a simple logistic model allowed us to estimate classifications at subnational scales for remaining countries in these regions.
To assess the best predictors of MAA, we clipped each subnational unit to the extent of agricultural land, and calculated the percentage of the unit area taken up by agricultural land and the percent of agricultural land in pasture, as well as the mean population density [34], mean market access index [35], and the mean value of a geo-wiki-based field size index [23]. Using a set of nominal logistic regressions in JMP statistical software [36], we tested each combination of these factors, and their two-way interactions, to determine the best model for predicting MAA classification. Units classified as urban or under extensive grazing regimes were excluded from the model. The most parsimonious model included human population density [34], percentage of unit in agricultural land, percentage of agricultural land in pasture, field size index [28], and geographic region, along with significant two-way interaction terms (table S1).
The model was used to predict classifications for missing countries in the three global regions at the first subnational administrative unit in each country [37]. The analysis was run with all available IPUMS units, as well as with random subsets comprising 50% of the data. Assessed based on existing data, this model successfully predicted farm size category with an R² of 0.49 and a chi-square p-value of <0.0001, and properly classified 83% of units with MAA less than five hectares as smallholder units (figure S1). The model was also tested iteratively, with each country with IPUMS data excluded, and units assigned a predicted classification. The iterative model fit differed by region: in Asia, 95% of units were properly classified as small or very small; in sub-Saharan Africa 87% of units assigned to those classes were assigned correctly. In Latin America, 70% of units assigned as smallholder classifications were in fact classified as such, with all but one of the misclassified units found in the medium category.
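The iterative test described above, in which each country is withheld in turn and its units are predicted from the remaining countries, can be sketched as follows. To keep the example self-contained, a trivial majority-class predictor stands in for the nominal logistic regression fitted in JMP; the stand-in model, the row layout, and all names and data here are assumptions for illustration only.

```python
from collections import Counter


def fit_majority(train_rows):
    """Stand-in model: always predicts the most common training class.

    This is NOT the nominal logistic model from the paper; it only lets the
    hold-out loop below run without external libraries.
    """
    counts = Counter(row["label"] for row in train_rows)
    majority = counts.most_common(1)[0][0]
    return lambda row: majority


def leave_one_country_out(rows, fit):
    """Predict each country's units from a model fitted on all other countries.

    Returns a list of (true_label, predicted_label) pairs.
    """
    pairs = []
    for held_out in sorted({row["country"] for row in rows}):
        train = [r for r in rows if r["country"] != held_out]
        test = [r for r in rows if r["country"] == held_out]
        predict = fit(train)
        pairs.extend((r["label"], predict(r)) for r in test)
    return pairs


def share_correct(pairs):
    """Fraction of units whose predicted class matches the observed class."""
    return sum(true == pred for true, pred in pairs) / len(pairs)


# Hypothetical units in three hypothetical countries.
rows = [
    {"country": "X", "label": "small"},
    {"country": "X", "label": "small"},
    {"country": "Y", "label": "small"},
    {"country": "Y", "label": "small"},
    {"country": "Y", "label": "medium"},
    {"country": "Z", "label": "medium"},
    {"country": "Z", "label": "small"},
]
pairs = leave_one_country_out(rows, fit_majority)
accuracy = share_correct(pairs)
```

Holding out whole countries, rather than random units, tests the situation the model is actually used for: predicting classes in countries that contributed no census data.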
To calculate the number of farms in regions classified as small and very small, units in these categories were assigned a farm size value equivalent to the average MAA in that size class for that region. This approach is similar to that taken by others when looking at the relationship between land area, farm numbers, and farm size [19].
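A minimal sketch of this step, assuming the farm count in a modeled unit is obtained by dividing its agricultural area by the average MAA of its size class and region. The lookup table values below are hypothetical placeholders, not the class averages from the study.

```python
def estimate_farm_count(agricultural_ha, region, size_class, class_mean_maa):
    """Estimate the number of farms in a unit from its area and class mean MAA.

    class_mean_maa maps (region, size_class) to the average MAA, in hectares,
    observed for that class in that region.
    """
    return agricultural_ha / class_mean_maa[(region, size_class)]


# Hypothetical class averages (hectares per farming household).
class_mean_maa = {
    ("Asia", "very small"): 1.0,
    ("Asia", "small"): 3.0,
}

# Hypothetical modeled unit with 120 000 ha of agricultural land.
estimated_farms = estimate_farm_count(120_000, "Asia", "small", class_mean_maa)
```

With these placeholder values the unit would be assigned 40 000 farms.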
Comparison with country-level FAO data
We compared farming household numbers calculated from the IPUMS data to the number of agricultural holdings reported in each country by the FAO in the 2014 State of Food and Agriculture report [6] for the 40 countries in our analysis for which data is available from both datasets. FAO data is compiled from agricultural censuses, independent from population census data, from 1960 through 2011. Nigeria was dropped from analysis due to the extremely low FAO estimate, potentially a result of the age of the agricultural census (1960). For the remaining 39 countries, we performed a linear regression of the number of holdings reported by the FAO and the number of farms as calculated through our methodology at the national scale. Both FAO and IPUMS numbers were normalized by total population in each country in order to account for underlying differences between countries. We also performed a country-level regression of our agricultural area and the agricultural area reported by the FAO, normalized by total land area in the country; though these data are not independent, as the cropland and pasture maps in our analysis incorporate FAO figures, this allows us to cross-check these data at the subnational scale of our analysis.
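The normalized country-level comparison can be sketched with an ordinary least-squares fit. The country figures below are hypothetical, and the choice of FAO holdings as the predictor and IPUMS-derived households as the response is an assumption of this sketch, since the text does not state the direction of the regression.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical national totals, for illustration only.
countries = [
    {"population": 1_000_000, "fao_holdings": 100_000, "ipums_households": 80_000},
    {"population": 2_000_000, "fao_holdings": 400_000, "ipums_households": 320_000},
    {"population": 500_000, "fao_holdings": 150_000, "ipums_households": 120_000},
]

# Normalize both counts by total population before regressing.
fao_per_capita = [c["fao_holdings"] / c["population"] for c in countries]
ipums_per_capita = [c["ipums_households"] / c["population"] for c in countries]

slope, intercept, r2 = linear_fit(fao_per_capita, ipums_per_capita)
```

Normalizing by population keeps a few very populous countries from dominating the fit, which is the stated reason for the normalization in the text.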
Assessing the role of smallholders in food production
In order to estimate the contribution of each MAA class toward global crop production, we used production figures drawn from the EarthStat crop database, which integrate agricultural census data and remote sensing to estimate patterns of crop area and yields at a 5 arc-minute global resolution [38]. We calculated the proportion of global production of 17 major crops (the 16 highest calorie-producing crops consumed as food, as well as cotton, as per West et al [39]) that takes place in units where MAA is less than five hectares. These 17 crops account for more than 85% of global calorie production, as well as a majority of the water, fertilizer, and other inputs to agriculture.
We estimated the contribution of each farm size class to global calorie production, following the methodology of Cassidy et al [40]. For each crop, Cassidy et al assessed allocation to food, feed, biofuels, and other uses, identified proportions of crops used for domestic consumption, and used USDA conversion ratios to convert calories in animal feed to meat, egg, and dairy calories for human consumption at the national scale. For the 41 food crops analyzed, we assessed both total calorie production and production of available food calories at subnational scales.
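The distinction between total calories produced and calories available as food can be illustrated with a simplified sketch. It assumes that food calories are the sum of calories eaten directly and feed calories multiplied by a feed-to-food conversion ratio; the single conversion ratio, the allocation shares, and all names are hypothetical simplifications and are not the values or the full procedure of Cassidy et al [40].

```python
def food_calories(produced_kcal, food_share, feed_share, feed_conversion):
    """Calories reaching people from a crop, under a simplified allocation.

    produced_kcal   - total calories produced
    food_share      - fraction allocated directly to human food
    feed_share      - fraction allocated to animal feed
    feed_conversion - fraction of feed calories recovered as meat, egg,
                      and dairy calories (hypothetical single ratio)
    Calories allocated to biofuels and other uses contribute nothing.
    """
    direct = produced_kcal * food_share
    via_livestock = produced_kcal * feed_share * feed_conversion
    return direct + via_livestock


# Hypothetical crop: 60% to food, 30% to feed, 10% to other uses.
produced = 1000.0
delivered = food_calories(produced, food_share=0.6, feed_share=0.3,
                          feed_conversion=0.1)
food_fraction = delivered / produced
```

Because most feed calories are lost in conversion, units that route more of their production directly to food show a higher fraction of calories delivered to people.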
Finally, we used a map of cattle density [41] to calculate the mean cattle density in each subnational unit, and analyzed units by MAA classification.

Results and discussion
Mapping MAA
We were able to identify farming households from IPUMS census data in 44 countries in Latin America, sub-Saharan Africa, and South and East Asia (table S2). Records were tabulated at the first or second administrative unit in each country, for a total of 1965 subnational units. The logistic model allowed us to estimate farm size classifications at subnational scales for 39 additional countries. The total analysis therefore comprises 2412 subnational administrative units in 83 countries. Figure 1 displays the map of MAA in our three global regions. The study area includes 2094 331 241 hectares of agricultural land. Approximately 65% of this agricultural land is in pasture, with 35% in cropland. Overall this accounts for roughly 55% of global agricultural land.

Prevalence of smallholder systems
We identified 918 administrative units with a high density of farming households, equating to less than five hectares of agricultural land per farming household; 669 are drawn from the IPUMS data, with another 249 predicted by the model. These units, which we consider likely smallholder systems, are distributed across 67 countries, with 408 units in sub-Saharan Africa, 329 in Asia, and 181 in Latin America (table 1). Together, these units with an MAA less than 5 hectares account for 586 661 120 hectares of agricultural land, or 28% of agricultural land in the 83 countries, and are farmed by roughly 383 million households.
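The headline figures in this section can be cross-checked with simple arithmetic. The sketch below uses only numbers reported in the text; the implied average area per household is a derived quantity based on the rounded household count, so it should be read as approximate.

```python
# Smallholder units by region, as reported in the text.
smallholder_units_by_region = {
    "sub-Saharan Africa": 408,
    "Asia": 329,
    "Latin America": 181,
}
total_smallholder_units = sum(smallholder_units_by_region.values())

# Agricultural land in smallholder units and in the full study area (hectares).
smallholder_ha = 586_661_120
study_area_ha = 2_094_331_241
land_share = smallholder_ha / study_area_ha

# Roughly 383 million farming households (rounded figure from the text).
farming_households = 383_000_000
implied_ha_per_household = smallholder_ha / farming_households
```

The regional counts sum to 918 units, the land share rounds to 28%, and the implied average is about 1.5 ha per farming household across all smallholder units.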
In the 44 countries with census data available through IPUMS, we calculate a total of nearly 391 million farms, 85% of which are located in units where small farms likely predominate (table 1). The vast majority of farms are found in Asia, making up 82% of the farms in our study area and 89% of farms in smallholder units. Latin America contributes 4% of farms and 1% of farms in smallholder areas, while 14% of farms and 11% of farms in smallholder areas are found in sub-Saharan Africa. These figures are in keeping with global distributions of farm numbers drawn from FAO census data [4].
Though the vast majority of farms in most countries are smallholder or family farms [4,19], highly unequal land distribution means that in many places these farmers control only small proportions of land [6,42]. This methodology therefore identifies subnational units in which smallholder production is more likely to be a dominant component of the agricultural landscape, and is unable to capture the contributions of small farmers in regions where the landscape is dominated by large farms.

Relationship to FAO data
The farm household data from IPUMS is strongly correlated with independently collected FAO data on agricultural holdings at the national level [6] for the 39 countries for which data are available (R² = 0.65; figure S2(a)). Our national figures were more likely to be lower than FAO tallies (25 out of 39 countries), rather than higher, indicating that our methodology is more likely to produce an underestimate than an overestimate of farm numbers. Our estimate of the total amount of agricultural land in each country is also generally consistent with FAO data from the 2011 census (R² = 0.71; figure S2(b)). The finer spatial resolution of the IPUMS data compared to FAO data allows us to extend the national-level estimates of farm numbers to subnational scales for the first time in many countries.
Contribution of smallholder farming to global food production
Smallholder units, with less than five hectares of agricultural land per farming household, account for a significant portion of global production of many crops, contributing more than 80% of global rice production, 75% of global production of groundnuts and oil palm, nearly 60% of global production of millet and cassava, and more than 40% of production of cotton and sugarcane (figure 2; table 2). We also identify crops that are less reliant on smallholder production; for example, these regions account for only 11% of global soy production (table 2).
Including 41 crops, accounting for more than 90% of global calorie production [40], we find that units of high-density smallholder farming across these 83 countries are responsible for 41% of total global calorie production, and 53% of the global production of food calories for human consumption (table 3(a)). Within these 83 countries, units with less than five hectares of agricultural land per farming household contribute 70% of food calories produced.
Contribution of smallholder farming to regional staple food production
Within the 83 countries studied, subnational units with MAA of five hectares or less account for more than half of the production by mass of nine staple crops (rice, groundnut, cassava, millet, wheat, potato, maize, barley, and rye), illustrating the specific importance of smallholder production for food security. Other assessments, such as one by Herrero et al [8], have shown that 50% of global cereal production occurs in the developing world, including 86% of rice and 67% of millet. Our results indicate that areas of high-density smallholder agriculture account for much of this staple crop production.
The role of smallholder systems in food production varies between regions (figure 3(a)). In Asia, production is driven by units with MAA less than five hectares, which produce 90% of food calories in the region, in addition to the farming carried out in densely populated areas classified as urban in this study. In sub-Saharan Africa, these smallholder units produce half of food calories in the region, and units of medium-density farm households (mean farm size 5-15 ha) account for another 26%. In Latin America this pattern is reversed, with 70% of food calories produced in regions with large and very large MAA, and less than 7% produced in units with MAA less than five hectares. However, as less than 2% of agricultural land in Latin America is found in these smallholder units, that proportion of production represents greater productivity per hectare than is found in areas characterized by larger farms.
In all regions, a greater proportion of the calories produced in units with small MAA are consumed as food, rather than converted to feed, fuel, or other uses. In smallholder areas, 70% of calories produced are available for consumption as food, compared to 66% in the 83 countries studied (table 3(b)) and 55% for the global agricultural system as a whole [40].
Cassidy et al [40] found that the percentage of calories delivered to the food system varies widely by region, with 90% of calories produced in India delivered to the food system, in contrast with 50% in Brazil. We add to this analysis by disaggregating likely smallholder-dominated systems from other agricultural systems within global regions. In Latin America, 47% of calories produced on units with large or very large MAA are consumed as food, in contrast to 70% in smallholder units and 80% on farms in urban areas (table 3(b)). Similarly, in Asia, only 38% of calories produced in areas of large farming are consumed by humans (likely due to dependence on livestock in those regions), compared to 70% in smallholder-dominated units and 90% on urban-area farms.
MAA classifications also provide a useful metric in the study of livestock systems. An analysis of cattle density displays patterns similar to those seen in crop production (figure 3(b)). In Asia and sub-Saharan Africa, smallholder units have, on average, the highest cattle density. In Latin America, these units have the lowest density of livestock, while units dominated by large farms have the highest average number of cattle per hectare. These distinct patterns provide insight into the different roles of livestock production in these systems. The high cattle densities in regions of small farming in Asia and sub-Saharan Africa indicate the importance of livestock for the livelihoods of small farmers. In Latin America, the high density of livestock in regions with large MAA, and the much lower density in regions likely dominated by small farms, illustrate the extent to which cattle are incorporated into commodity production rather than mixed or subsistence farming systems.

Limitations
Although combining household and land cover data greatly improves estimates of the spatial distribution of small farms, there are limitations to this approach. First, in much of the world, farm sizes are heterogeneous, with a right-skewed distribution, and a small number of large farms control the majority of agricultural land [6,42]. Our use of a mean farm size does not capture this distribution; however, in areas with MAA of five hectares or less a large proportion of land is likely controlled by smallholders, whether or not there are also larger landholdings present, and these subnational units are the key focus of our study.
In regions with larger MAAs, there may also be a great number of small farms; however, their presence is masked by the dominance of large farms or less densely populated pasturelands. This is likely to be especially relevant in Latin America, where the distribution of land ownership is particularly unequal [4,24], as well as in many parts of Africa where mixed cropping and pastoral systems create complex landscape mosaics [7,10,26]. Second, our method for counting farming households may incorporate errors in both directions, by including household heads who are employees on larger farms, or excluding household heads with their own farm who earn their primary income from another source. As smallholder farming is more labor-intensive than large-scale farming [15,16], areas in which the majority of farming heads of households are laborers are unlikely to contain the highest levels of farm household density. This method also does not address questions of ownership or land tenure; farms that are leased or organized under tenant farm systems may still be considered small farms. Farming systems differ in many ways, and this metric will inherently capture reality more closely in some places than others; however, the strong correlation with FAO data on agricultural holdings, as well as the general agreement with farm numbers drawn from other sources, provides support for the utility of this approach in capturing farm numbers.
Third, land area is an imperfect metric for classifying smallholder farming; in many cases income thresholds, family labor use, or subsistence orientation are more useful and appropriate indicators [19,24]. In addition, the extent to which a given land area is considered 'small' is largely dependent on the greater agricultural context, and our estimate of mean farm size uses simply the extent of agricultural land, which is a narrower definition of the land base than many other estimates of farm size. Finally, estimated cropland and pasture area data are based on calibrating satellite and census data and contain their own uncertainties, as previously documented by Ramankutty et al [29].

Conclusions
Our methodology provides estimates of farming households at subnational spatial resolutions, and allows us to determine where in the developing world small farms are likely to be concentrated, paving the way for a range of comparative analyses and providing guidance for investments. This analysis of the spatial patterns of farm size can improve the ability of policy makers to effectively design and target the market and development programs essential for continued agricultural growth and poverty reduction in regions of small scale farming. Our analysis supports assertions that, in much of the developing world, food production on smallholder farms is not only a key facet of food security for the rural poor but also makes up the majority of production and underpins agricultural sustainability at national and regional scales [21,43]. Our findings indicate that more than half of food calories produced globally come from subnational units in the developing world where the density of farming households is very high, averaging less than five hectares per farming household, offering support to frequently cited statistics about the contribution of small or family farms [e.g. 6,19,25].
In addition to contributions to food security at regional and global scales, our results lend support to the importance of small farms for local food production, as smallholder units are especially key in the production of staple crops and direct a greater proportion of their production toward the food supply. Contributions to local food security are crucial, as smallholder agriculture supports the livelihoods of many of the planet's marginalized populations [25]. Two-thirds of the developing world's three billion rural people live on farms of less than two hectares, and these farms are home to half of the planet's undernourished population and the majority of people living in absolute poverty [11]. Women, who in many places are less food-secure than men, play a crucial role in smallholder systems [17].
Smallholders' livelihoods are exposed to risk in many sectors and at many scales; most face missing or inequitable access to markets or capital, few have resources with which to cope with hazards and shocks, and policies designed to improve food security in rural regions of developing countries often fail [21,25,44]. In addition, the nature, intensity, and structure of smallholder farming are rapidly changing, and require policies that allow for these shifts [15]. The success of smallholder agriculture in much of the world is thus largely dependent on supportive policy environments that provide appropriate technology and market supports for small farmers, and create incentives for sustainable intensification [3,5,11,45,46,47,48].
Improved spatial information about smallholder farming can also aid in the design of policies intended to mitigate environmental impacts of agricultural intensification and expansion, including resource degradation, land abandonment, and conversion of natural land cover into new farmland [38]. Effective conservation in many parts of the world depends on policies and innovations that reconcile smallholder farming systems with the maintenance of diverse and functioning ecosystems [12,14,17,49]. The dynamics of agricultural expansion are distinctly different in smallholder systems [50], and smallholder farmers often lack the resources or technical capacity to respond to land use policies, enforcement mechanisms, and certification programs [51][52][53]. Our approach provides the ability to disaggregate farming populations from non-farming populations, providing a more accurate picture of farming households on the landscape than has previously been available. These data meet a critical need, as improved understanding of the prevalence and distribution of smallholder farming is essential for effective policy development for food security, poverty reduction, and conservation agendas. Targeted investments in smallholder technology adoption, market access, land tenure, and community organization can lead toward attainment of multiple related SDGs [3], and the improved spatial information on the concentration of small farms provided by this product can help investors, NGOs, and governments direct resources appropriately.
The increasing availability of large-scale household microdata provides a much-needed window into the dynamics of populations and livelihoods. This study is only a first effort at utilizing these rich and complex datasets; we envision numerous future applications of this farm size product in combination with other variables related to food security, natural resource use, and human wellbeing that will further increase our understanding of the dynamics of small farms and the livelihoods of those who depend on them.