Production Diversity and Socioeconomic Characteristics of Household Farms

The level of production diversity chosen by small household farms may not be optimal from a social perspective, owing to market failures such as environmental externalities or barriers to credit. Public policies designed to stimulate more diversified cropping are meant to correct that inefficiency. Understanding the socioeconomic characteristics associated with agricultural diversification is important for the successful implementation of those policies. In this paper we investigate which characteristics are most closely related to crop diversification. Unlike previous studies, which use small samples confined to small geographical areas, we address these issues with a large and comprehensive dataset whose observations are spread over a large geographical area, making it possible to analyze the role played by regions. We start from a group of 4.7 million Brazilian farm households, from which a random sample is drawn and used in the estimation procedures. We then estimate a Tobit regression model using key agricultural variables and the well-known Simpson Index of Diversity to measure crop diversification. The main findings are that the region where the farm is located, on-farm and off-farm incomes, the farm's size, access to technical assistance, and the farmer's age and education all play important roles in explaining production diversity. Public policies are more likely to achieve crop diversification if they take those characteristics into account.


Introduction
Production diversification is a widely known strategy used by firms to deal with production risks. It provides a more stable income, reducing profit volatility. This is particularly relevant when we consider that price and trade uncertainties strongly affect firms in free and globalized markets (Weiss & Briglauer, 2002). In the more specific case of agricultural firms, different sorts of risks must be added to the equation: weather uncertainties, pests and diseases can cause strong fluctuations in the production output. For that reason, agriculture is very sensitive to risks, and it very often relies on insurance and subsidy policies (Di Falco & Perrings, 2005; Baumgärtner & Quaas, 2010).
A few aspects related to the income stabilization that results from diversification deserve some attention: (i) there is a reduction in overall income insufficiency, spreading the impact of a failure in any specific income source; (ii) there is a reduction in the within-year income volatility accruing from agricultural income flows that tend to be highly seasonal; (iii) there is a reduction in between-year income volatility that results from production and market instabilities (Ellis, 1998). In contrast with big, non-agricultural corporate firms, where property dispersion helps to spread risks through a large number of stock holders, the small family-owned farms (farm households) typically bear a much larger portion of risks. It is a stylized fact that their family wealth as well as their labor is all committed to their own business (Weiss & Briglauer, 2002).
In the specific case of farm households in developing countries, diversification is part of a broader, well-documented strategy of livelihood diversification, set up to ensure family survival in the face of harsh economic challenges (Ellis, 1998). Besides production diversification, this strategy includes earning non-farm incomes, such as rents and non-farm labor. It also includes some subsistence agriculture, providing a minimum level of consumption for the family. Non-farm income diversification is often pursued by family households because non-farm risks are usually uncorrelated with on-farm production risks. Hence, the combination of those two types of income tends to be an efficient strategy to reduce the overall risk.
On-farm diversification can provide additional advantages beyond risk reduction. One such example is the possible increase in efficiency due to decreasing returns to scale, economies of scope, and a better management of the available resources (McNamara & Weiss, 2005). With decreasing returns to scale, average costs rise with production, so diversifying the crop mix generates higher profits than increasing the production of a single crop. Moreover, if there are economies of scope, then average costs can be reduced (and hence profitability increased) when the farm produces more types of crops simultaneously (Chavas & Kim, 2010). Another advantage of on-farm diversification comes from the efficiency gains of a better allocation of labor throughout the year. Seasonal peaks of labor demand may not coincide between crops, so a potential source of gain comes from combining crops with different seasonal labor demand peaks (Rahman, 2009). Also, any differences in soil and microclimate within the same farm may lead to efficiency gains when different crops are matched to specific conditions (Di Falco, Penov, Aleksiev, & van Rensburg, 2010; Schroth & Ruf, 2014).
Crop diversification also entails environmental benefits. Diversified production systems are in general related to soil, water and biodiversity conservation. They are also less dependent on agrochemicals such as pesticides and fertilizers, rendering a healthier and more environmentally neutral production (Lin, 2011; Sambuichi, Galindo, Oliveira, & Pereira, 2014; Ruf & Schroth, 2015). Some production systems, such as agroforestry, help to absorb carbon, mitigating the emissions of greenhouse gases (Schroth et al., 2015). These are classic production externalities that benefit the whole society. As such, they are not internalized by the farmers, who end up with an output that is less diversified than the socially optimal level of diversification (Baumgärtner & Quaas, 2010). Like other market failures, externalities lead to a poor allocation of resources. Good public policies should be designed to change this allocation, moving the economy towards the social optimum. Understanding the mechanisms behind diversification is important for the successful implementation of these policies.
The literature has emphasized a number of features that may affect the farmer's decision to either diversify or specialize his income sources. The following may all affect the farmer's level of diversification: farm qualities, such as size and type of production; the characteristics of the farmer or firm that runs the farm, such as age, technical knowledge, and education; the economic profile of the business, such as the firm's size and financial strength; environmental aspects, such as crop pests and diseases; and access to insurance, subsidies, markets, and technical assistance (Pope & Prescott, 1980; Bosma, Udo, Verreth, Visser, & Nam, 2005; McNamara & Weiss, 2005; Bravo-Ureta, Cocchi, & Solís, 2006; Singha, Baruah, Bordoloi, Dutta, & Saikia, 2012; Longpichai, 2013; Ruf & Schroth, 2015).
Household farms are an important component of the Brazilian agricultural and livestock sector. The 2006 census shows that household farms are responsible for a large part of the output of basic agricultural products, such as beans, milk, vegetables, and cassava, among others. Household farms take up only 24% of the total farming land in Brazil, but they employ 74% of the farming labor. Moreover, 84% of all Brazilian farms lie in the category of household farms (Sambuichi et al., 2014).
In this paper we investigate a group of 4.7 million Brazilian household farms that hold a Declaração de Aptidão ao Pronaf (DAP). This large dataset is formed when farmers voluntarily register their DAP, which entitles them to apply to all the key federal agricultural development programs in Brazil, such as subsidized credit, insurance, free technical assistance, and price floors. We use the DAP dataset to address the issue of diversification, which, to the best of our knowledge, has not yet been done. As a matter of fact, in spite of its potential, the DAP dataset remains largely unexplored in the literature. In this paper we intend to start closing this gap by using the dataset to answer one specific question: which variables and characteristics of Brazilian household farms are related to more specialization or more diversification of production? In order to tackle this question, we calculate an index of diversity (the Simpson Index) and run a Tobit regression of this index on a number of key socioeconomic variables.

Collecting and Analyzing Data
We treat our data as a cross-section of 4.8 million household farms extracted from DAP in October 2014 (although the actual time of each farm's application may vary). After purging missing data and outliers from the dataset we ended up with 4.7 million observations. Roughly 133 thousand (2.7%) farms were excluded.
To measure diversity we use the Simpson Index (Simpson, 1949), which is perhaps the most popular measure in the literature. It has the advantage of considering the contribution of each source of income to the overall on-farm income. The Simpson Index of Diversity (SID) is given by

$$SID = 1 - \sum_{i=1}^{N} \left( \frac{X_i}{\sum_{j=1}^{N} X_j} \right)^{2} \qquad (1)$$

where $X_i$ represents the gross production value of product $i$, and $N$ represents the number of products on the farm. For that purpose, we considered all kinds of income obtained within the farm, including primary and processed products from crops and livestock, arts and crafts, and rural tourism. When the farm household produces only one good, then SID = 0. This is the case of complete specialization. As the number of products increases, the share of each product in the total gross production value goes to zero, the square of that share converges to zero even faster, and the index converges to one.
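As an illustration of how the index behaves, the following is a minimal sketch in Python. It assumes the standard form of the Simpson index, namely one minus the sum of the squared shares of each product in total gross production value. The function name and the sample values are hypothetical and are not taken from the DAP data.

```python
def simpson_index(values):
    """Simpson Index of Diversity from per-product gross production values.

    Assumes the standard form: 1 - sum of squared shares.
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("total gross production value must be positive")
    shares = [v / total for v in values]
    return 1.0 - sum(s * s for s in shares)


# Hypothetical farms (values in thousands of reais per product).
print(simpson_index([50.0]))        # a single product: complete specialization
print(simpson_index([25.0, 25.0]))  # two products with equal shares
print(simpson_index([10.0] * 10))   # ten products with equal shares
```

With equal shares the index equals 1 - 1/N, which is why it approaches one only as the number of products grows large.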
A common problem in regression analysis arises when the dependent variable is censored, that is, when values above (or below) a certain threshold are all recorded as a single value. The diversity index used in this paper has this characteristic. Approximately one third of the farms in the dataset are monocultures, and for those observations the value of the index is zero. The Tobit model (Tobin, 1958) was developed to deal with this type of limited dependent variable. In the Tobit model the marginal effects are the estimated coefficients multiplied by the probability that the dependent variable lies in the uncensored region of the latent variable's normal distribution.
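The marginal-effect adjustment described above can be sketched as follows. The sketch assumes a Tobit model left-censored at zero with normally distributed errors, in which the effect of a regressor on the expected observed outcome is its coefficient times Φ(xβ/σ). The function names and all numerical values are hypothetical and serve only as an illustration.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_marginal_effect(beta_k, xb, sigma):
    """Effect of regressor k on E[y | x] in a Tobit left-censored at zero.

    beta_k: coefficient on the latent index
    xb:     linear prediction x'beta at the evaluation point
    sigma:  standard deviation of the latent error
    """
    return beta_k * normal_cdf(xb / sigma)


# Hypothetical values: the coefficient is scaled down by the
# probability of being in the uncensored region.
print(tobit_marginal_effect(beta_k=0.05, xb=0.3, sigma=0.4))
```

Because the scaling factor is a probability, the marginal effect always has the same sign as the coefficient and is never larger in absolute value.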
In order to assess the regression fit we use McFadden's pseudo $R^2$, which equals one minus the ratio between the log-likelihood of the complete model and the log-likelihood of a model estimated with just a constant and no other regressor. The pseudo $R^2$ is similar to the conventional $R^2$ in the sense that both summarize the quality of the regression as well as its predictive power, but the two have different interpretations.
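For concreteness, McFadden's measure can be computed as in the minimal sketch below. The log-likelihood values are hypothetical and are not the ones obtained in our estimation; the function name is likewise an assumption made for illustration.

```python
def mcfadden_pseudo_r2(loglik_full, loglik_null):
    """McFadden's pseudo R-squared: 1 - lnL(full) / lnL(constant only)."""
    if loglik_null == 0:
        raise ValueError("null log-likelihood must be non-zero")
    return 1.0 - loglik_full / loglik_null


# Hypothetical log-likelihoods for a full and a constant-only model.
print(mcfadden_pseudo_r2(loglik_full=-4200.0, loglik_null=-5000.0))
```

The measure lies between zero and one when both log-likelihoods are negative. In models with a continuous component, such as the Tobit, the log-likelihood can be positive, so the value should be read with some care.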

The Empirical Model
The Tobit estimation procedure is applied to the following regression equation:

$$SID_i^{*} = \beta_0 + \sum_{r=1}^{4} \beta_r\, Reg_{r,i} + \beta_5\, GPV_i + \beta_6 \ln(GPV_i) + \beta_7\, Income\_socben_i + \beta_8\, Other\_incomes_i + \beta_9\, Age_i + \beta_{10}\, Age_i^{2} + \sum_{s=1}^{4} \beta_{10+s}\, School_{s,i} + \beta_{15}\, Area_i + \beta_{16}\, Numberprop_i + \beta_{17}\, Prop_i + \beta_{18}\, LaborForce_i + \beta_{19}\, Coop_i + \beta_{20}\, Techassist_i + \varepsilon_i \qquad (2)$$

Four regional dummies are included in the model. $Reg_1$, $Reg_2$, $Reg_3$, and $Reg_4$ are dummy variables for the North, Northeast, Southeast and South regions, respectively. Since the model has an intercept, there is no dummy variable for the Center-West, the fifth Brazilian region. The GPV (Gross Production Value) is included in the model both in level and in logarithm. It captures the effect of on-farm incomes on the farmer's decision to diversify production. An attempt was made to include the square root of GPV in the model, in order to capture a possible non-linearity, but it did not have statistical significance. Two variables are included to capture the effect of off-farm income on the decision to diversify: the income from social benefits (Income_socben), which includes pensions and transfers; and other incomes, which include labor incomes and land and machine rentals. Instead of aggregating these two sources of off-farm income in one single variable, we decided to keep them separate because they are very different in nature, and may affect the decision to diversify production differently. The farmer may be entitled to receive social benefits, but their amount is out of his control. The variable "Other incomes", on the other hand, reflects the farmer's active choice to diversify his sources of income by putting his inputs (labor, land, and machinery) to a different use, which clearly sets an opportunity cost on using these inputs for the farm's internal production.
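To make the estimation criterion concrete, the sketch below evaluates the log-likelihood of a Tobit model left-censored at zero, which is how the zero values of monoculture farms are treated in the text. It is a minimal illustration under the assumption of normal errors, not the routine used to produce our estimates. The function names and the toy inputs are hypothetical, and the code is not numerically hardened for extreme values.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_logpdf(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def tobit_loglik(y, xb, sigma):
    """Log-likelihood of a Tobit left-censored at zero.

    y:     observed outcomes (zero for censored observations)
    xb:    linear predictions x'beta, one per observation
    sigma: standard deviation of the latent error (must be > 0)
    """
    ll = 0.0
    for yi, xbi in zip(y, xb):
        if yi <= 0.0:
            # Censored: probability that the latent index is at or below zero.
            ll += math.log(normal_cdf(-xbi / sigma))
        else:
            # Uncensored: normal density of the residual.
            ll += normal_logpdf((yi - xbi) / sigma) - math.log(sigma)
    return ll


# Toy data: two monoculture farms (index equal to zero) and two diversified ones.
print(tobit_loglik(y=[0.0, 0.0, 0.4, 0.7], xb=[-0.1, 0.1, 0.35, 0.6], sigma=0.3))
```

An estimation routine would maximize this function over the coefficients and sigma; that step is omitted here.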
A few features of the farmer are also included in the model, such as age and education. The variable Age is the farmer's age in years at the time he fills in his DAP forms. The square of age is also included to account for a possible non-linearity. The DAP dataset does not have complete quantitative information about schooling, only categories. That only allows us to use dummy variables for education. We build four of them. School 1 has a value of zero for illiterate farmers, and 1 otherwise. School 2 has a value of 1 for farmers who completed elementary school, and zero otherwise. School 3 has a value of 1 for farmers who completed high school and/or technical school, and zero otherwise. School 4 has a value of 1 for farmers with a college degree, and zero otherwise. Of course, if a certain farmer has a college degree, he also has high school and elementary school degrees, and he is not illiterate, so all four dummies will be equal to one.
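The cumulative coding of the four education dummies can be sketched as follows. The category labels are hypothetical stand-ins, since the exact labels used in the DAP forms are not reproduced here, and the sketch treats schooling as a simple ordered scale.

```python
# Hypothetical ordered schooling categories (not the DAP labels).
LEVELS = ["illiterate", "literate", "elementary", "high_school", "college"]


def school_dummies(level):
    """Return [School 1, School 2, School 3, School 4] for a schooling level.

    The dummies are cumulative: reaching a level switches on the dummy
    for that level and for every lower level.
    """
    rank = LEVELS.index(level)
    return [1 if rank >= k else 0 for k in range(1, 5)]


for level in LEVELS:
    print(level, school_dummies(level))
```

With this coding, each coefficient measures the additional effect of completing one more level relative to the level immediately below it, rather than relative to a single base category.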
Some of the variables in the model capture the farm's characteristics. That is the case of Area, which is the size in hectares of all the farms managed by the household; Numberprop, which is the number of farms handled by the household; Prop, which is a dummy variable related to the ownership of the farm, with a value of 1 if the household owns the farm, and zero otherwise; and Labor Force, which is the number of permanent workers on the farm, both active family members and employees.
In addition to these variables, two other dummies were included. Coop has a value of one if the farmer is a member of a farmers' co-op, or if he has any other kind of association in which resources are pooled, and a value of zero otherwise. Techassist has a value of one if the farmer has access to contracts of public technical assistance, and zero otherwise. Table 1 summarizes some descriptive statistics of the variables used in the model.
The original dataset, with almost five million observations, is well suited for descriptive statistics. However, such a large volume of observations becomes inconvenient for inferential procedures, because standard errors shrink as the sample grows and p-values converge to zero. So, in a regression with millions of observations the p-values of the individual significance tests are all very close to zero, and almost any independent variable included in the model, no matter how absurd it may be, appears statistically significant. Inference becomes uninformative. There are a few alternatives to deal with this problem. A popular one is the use of bootstrap techniques, in which multiple small-sample regressions are performed. We instead draw a single, relatively large random sample of ten thousand farms. As a consequence of using a relatively large sample, parameter estimates tend to have small variances, but not to the point of generating vanishingly small p-values. Because of that small variance, the estimates obtained with the sample are very close to the ones obtained with the full dataset. And in this case we are able to pinpoint the variables that are not statistically significant in explaining the farmer's decision to diversify.
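The sample-size argument above can be illustrated with a small sketch. It assumes a fixed, economically negligible effect and a standard error that shrinks with the square root of the number of observations; the effect size and the standard deviation are hypothetical. The last lines show one simple way of drawing ten thousand farm identifiers at random, which is an illustration and not necessarily the procedure we used.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value of a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def z_stat(effect, sd, n):
    """z statistic when the standard error is sd / sqrt(n)."""
    return effect / (sd / math.sqrt(n))


# Hypothetical negligible effect (0.002) with a dispersion of 0.3.
for n in (10_000, 4_700_000):
    z = z_stat(effect=0.002, sd=0.3, n=n)
    print(n, round(z, 2), two_sided_p(z))

# Drawing a simple random sample of farm identifiers without replacement.
random.seed(0)  # fixed seed so the draw is reproducible
sample_ids = random.sample(range(4_700_000), 10_000)
print(len(sample_ids))
```

The same tiny effect is far from significant with ten thousand observations and overwhelmingly significant with 4.7 million, which is the reason for working with the subsample.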

Regional Effects
The results of the Tobit estimation of the regression equation are presented in Table 2. The regional dummies all have positive, significant coefficients. So, controlling for other attributes, the Northern, Northeastern, Southeastern and Southern farms are on average more diversified (have a larger SID) than the farms located in the Center-West region, whose dummy has been omitted from the equation. For example, Northeastern farms have on average a diversity index 0.181 larger than Center-Western farms of the same size, work force, GPV, etc. In the South, that value is 0.154 larger. This regional pattern in diversification is very noticeable, with all the dummies being highly significant. So, the region where the farm is located matters a lot in determining its degree of diversification.
The large scope of our dataset, with its continental range, made it possible to highlight the strong influence that regional differences have on crop diversification. In fact, farmers with similar characteristics of income, age, education, number of employees, etc. tend to have on average very different levels of crop diversification depending on the region in which their farm is located. Previous studies could not find this result because of the limitations of their datasets, mostly restricted to small geographic areas. The one exception is McNamara and Weiss (2005). In spite of using data for a small area, specifically census data for the state of Upper Austria, these authors did introduce regional dummies in their regression model. In this case, the regions are within that particular state. They use a Probit model to analyze the variables that affect production diversification as a strategy to stabilize income in farm households. They found that the region where the farm is located significantly affects the diversification index, with coefficients ranging from 0.087 to 0.217.
Differences between regions are mainly a result of the different production systems that prevail in each of them. For example, the predominance of livestock in Center-Western household farms may explain the higher specialization in this region. The higher production diversity in the Northeast may be related to the prevalence of subsistence crops, which are important to guarantee nutritional security for low-income farm households. Pellegrini and Tasciotti (2014) analyze eight developing economies, emphasizing the importance of crop diversification to the food security of rural families. They found a positive correlation between the number of crops, family income, and dietary diversity.
In the South region, on the other hand, GPVs are in general higher, and the choice to diversify would not be related to subsistence issues, but rather to efficiency gains in production. Moreover, diversification can be linked to cultural traditions, as immigrants brought polyculture practices from their home countries. Trends such as agroecology and organic agriculture may well have influenced the decision of farms to diversify more in the South and Northeast regions (Sambuichi et al., 2014).

Income and Diversification
A key issue in our analysis is to investigate a possible association between a farmer's income and his decision to diversify production. We use the Gross Production Value (GPV) as a proxy for income. A careful look at the data suggests the possibility of a non-linear relation between income and diversification. We introduced this non-linearity in the regression model with the addition of the logarithm of GPV. We tried first with the square of GPV, but that generated near multicollinearity. That setup departs from the rest of the literature, in which the relation between income and diversification is considered to be linear. The coefficients of GPV (-0.0004) and its log (0.014) are both significant. These two coefficients together capture the presumed non-linearity. The marginal effect of GPV in this case is -0.0004 + 0.014/GPV. So, for poor farmers, with a low GPV, income is positively related to diversification. Suppose that a farm has a very low GPV, say GPV = 10. That is 10 thousand reais per year, since GPV is measured in thousands of reais. Then the net marginal effect of an increase in income would be -0.0004 + 0.014/10 = 0.001. So, an extra thousand reais would on average encourage the farmer to diversify his production, achieving an SID 0.001 higher. However, for rich farmers, more income leads to less diversification. For example, if a farm produces 100 thousand reais, the marginal effect of income on diversification would be -0.0004 + 0.014/100 = -0.00026. So, at high levels of income, the richer the farmer gets, the less diversified his production will be.
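The arithmetic behind this non-linear effect can be reproduced with the short sketch below, which uses the two rounded coefficients reported in the text. The function names are hypothetical, and the effect computed is on the latent index, before any Tobit adjustment. Setting the marginal effect to zero gives the income level at which the sign changes.

```python
B_GPV = -0.0004     # coefficient on GPV (thousands of reais), as reported
B_LOG_GPV = 0.014   # coefficient on ln(GPV), as reported


def gpv_marginal_effect(gpv):
    """Derivative of B_GPV * GPV + B_LOG_GPV * ln(GPV) with respect to GPV."""
    return B_GPV + B_LOG_GPV / gpv


def gpv_turning_point():
    """GPV at which the marginal effect changes sign."""
    return -B_LOG_GPV / B_GPV


for gpv in (10, 35, 100):
    print(gpv, gpv_marginal_effect(gpv))
print(gpv_turning_point())
```

With these rounded coefficients the sign changes at a GPV of about 35 thousand reais per year: below that level more income goes with more diversification, above it with less.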
Intuitively, it seems that poor farmers are willing to diversify production, but their low income and possible barriers to credit prevent them from doing so. Therefore, when their income goes up, they increase diversification. Rich farmers, on the other hand, are bound by modern technology standards, which are mostly linked to intensive monoculture systems. The same is true for credit policies and agricultural insurance (Sambuichi et al., 2014).
Besides that, diversity is positively related to social benefits. An extra thousand reais in benefits on average increases the SID by 0.006. Other off-farm incomes, in contrast, do not significantly affect diversification.
Except for very low levels of GPV (very poor farmers), the marginal effect on the SID is quantitatively small. Part of the reason may be that GPV does not account for off-farm incomes, and does not consider production costs. It captures only the revenue side. But when diversification brings efficiency, it happens mostly through scope economies, with a reduction in production costs. So, the GPV will not capture that. Recent research with household farmers, in Brazil and overseas, has in fact shown a positive association between production diversity and income once production costs are subtracted (Perondi, 2007; Di Falco et al., 2010; Kiprono, 2012).
In order to understand the full effect of income on diversification, it is important to consider off-farm incomes (McNamara & Weiss, 2005). Different sources of off-farm income may have different effects on diversification.
The positive effect that we found for social benefits may be related to the fact that poor farmers are precisely the ones entitled to get those benefits, in particular the Bolsa Familia program. And, as we emphasized before, for poor farmers more income means more diversification. Since this is the case for on-farm income, it may very well be the case for off-farm income as well.
We did not find a statistically significant coefficient for other off-farm incomes (the ones that do not involve social benefits). This is in line with the findings of Bravo-Ureta et al. (2006) with small mountain farms in El Salvador, and of Kiprono (2012) with small farms in the district of Konoin, Kenya. But the opposite result has been found by Weiss and Briglauer (2002), using the agricultural census of the state of Upper Austria. These authors show that part-time farmers, namely those who spend part of their time on other activities outside the farm, are more prone to specialization.
Another point worth mentioning is that the DAP dataset that we use is based on a voluntary statement by the household farmer, in which off-farm income could be underreported. The reason for this concern is a striking difference between enrolment and entitlement. Only 14% of the household farmers in the dataset declare earning some kind of social benefit. But 34% of them are entitled to participate in what is arguably the most comprehensive program, the Bolsa Familia (the sum of their on-farm and off-farm incomes is below R$ 77 per month, per capita). In spite of that, the variable "income from social benefits" seems to have explanatory power, and therefore was kept in the model.

Farmer's Age and Education
Production diversification is a form of hedging against risks. If a farmer's degree of risk aversion varies through his lifetime, it seems reasonable for his decision to diversify to depend on his age. Hence we introduce in the model the variable age, and the square of age, to capture a possible non-linear relation. Both coefficients have statistical significance at the usual levels. The net effect of age on diversification is given by 0.007 - 0.00012(Age). This is positive up to the age of 58.3. So, for young farmers (younger than 58.3), getting older means diversifying more. But the marginal increase in diversification decreases with age, reaching zero at the age of 58.3. For farmers older than 58.3 years, diversification will decrease as they age. McNamara and Weiss (2005) also tried to link age with diversification, but they found a different result with Austrian data, namely, a negative coefficient for age and a positive coefficient for its square. Besides that, it is a common practice in the literature to use work experience, rather than age, in econometric models. Pope and Prescott (1980) use a sample of 1000 farms in California, USA, to analyze the relationship of diversification with the farm's size and the farmer's socioeconomic characteristics. They found that experience enhances diversification. A similar result was obtained by Oliveira Filho, Melo, Xavier, Sobel, and Costa (2014) using Brazilian data. They found a positive but non-linear relation between experience and diversification. Inasmuch as experience tends to be correlated with age, those results may suggest that diversification in Brazil is more related to cultural traditions than to innovations in technology.
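The turning point quoted above follows from a short derivation, sketched here under the assumption that age enters the latent index as a linear term plus a squared term. The symbols $\beta_a$ and $\beta_{a2}$ are introduced only for this illustration, and the value of $\beta_{a2}$ is inferred from the reported net effect rather than read from Table 2.

```latex
\frac{\partial SID^{*}}{\partial Age}
  = \beta_a + 2\,\beta_{a2}\,Age
  = 0.007 - 0.00012\,Age,
\qquad \beta_a = 0.007, \quad \beta_{a2} = -0.00006 \ \text{(inferred)}

\frac{\partial SID^{*}}{\partial Age} = 0
\;\Longrightarrow\;
Age^{*} = \frac{0.007}{0.00012} \approx 58.3
```

Since the coefficient on the squared term is negative, the relation is an inverted U, with diversification peaking at about 58.3 years.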
Besides that, older farmers are likely to be more risk averse than their younger counterparts.
As mentioned before, the education variable available in the dataset is categorical, rather than numerical. So we had no option other than working with dummy variables. Only one out of the four educational dummies is statistically significant at the usual levels, namely the variable School 2. We found a coefficient of -0.027. It means that, controlling for other variables, farmers who completed elementary school have on average an SID 0.027 smaller than farmers who did not complete it. So, in this particular case education leads to specialization. This result contrasts with other findings in the literature which show a positive relation between schooling and diversification (Weiss & Briglauer, 2000; Bravo-Ureta et al., 2006; Longpichai, 2013).
A few points should be mentioned here. First, the dummy that showed statistical significance, School 2, is the only educational dummy that splits farmers into two big groups: 36.6% with an elementary school degree and 63.4% without it. There are relatively few illiterate, high school educated, or college educated farmers. Second, education leading to more production specialization may be a consequence of the current methods applied in agro-technical schools, which focus more on specialized farming.

Other Variables
The farm's area and the number of farms handled by the farmer positively affect diversification. The marginal effects of 0.001 and 0.014, respectively, are both statistically significant at the usual levels. Hence, larger farms and a large number of farms operated by the household are features that enhance diversification. This result is in line with the evidence for other countries (Pope & Prescott, 1980; McNamara & Weiss, 2005; Bravo-Ureta et al., 2006; Kiprono, 2012; Oliveira Filho et al., 2014). Decreasing returns to scale may be part of the explanation for that. Another possibility is the fact that larger areas may feature a number of micro-environmental conditions which favor diversification. The same reasoning applies to the finding that a larger number of farms handled increases diversification.
In certain circumstances, however, the relationship between area and diversity seems to be the opposite, probably an effect of the types of production prevalent in the region. Di Falco et al. (2010) study the effects of farm fragmentation in Bulgaria, and show that the reduction of farm sizes enhanced production diversity in that country.
Diversified production systems tend to be more intensive in labor. Therefore it would be reasonable to expect that farms with larger labor forces would have more diversified productions. In fact, Weiss and Briglauer (2002), McNamara and Weiss (2005), and Kiprono (2012) have shown a positive relation between family size and production diversification in household farms. In our dataset, however, the labor force does not significantly affect diversification. Brazil is a very diverse country, and sometimes relations that could show up locally are not evident for the country as a whole.
Co-op membership also does not help to explain diversification. The coefficient for the co-op dummy variable is not statistically significant at the usual levels. Indeed, being a member of a cooperative and/or an association seems to have effects on production diversity that vary across regions, and perhaps because of that it does not have a significant effect in a dataset that pools all Brazilian regions. Using a dataset of farms in the Petrolina-Juazeiro area, in the Brazilian Northeast, Oliveira Filho et al. (2014) found that co-op membership reduces production diversity. In that particular case, those co-ops are widely known for their penchant for specialization, trading only a few specific goods. Bravo-Ureta et al. (2006), on the other hand, found a positive association between being a member of social organizations and crop diversification in a sample of El Salvador farmers. The role of co-op membership in diversification is related to its potential to facilitate trade, due to increasing returns to scale in trading. However, when co-ops are too specialized, being a member of one renders the farmer less willing to diversify.
The coefficient for the farm's ownership dummy variable is negative and significant at the usual levels of significance. Controlling for other attributes, farmers who own their farm have an SID 0.049 smaller on average than farmers who do not own it. A possible intuition for this finding is that the owners can use their property as collateral, and for that reason have better access to credit and insurance, and hence are less risk-averse. They are then less prone to use diversification as an instrument to reduce risks. In the literature this variable is often non-significant (e.g., see Bravo-Ureta et al., 2006).
Access to technical assistance enhances production diversification. The estimated coefficient of this dummy variable is 0.046, which is statistically significant at the usual levels. In 2003 Brazil implemented the National Policy of Technical Assistance and Rural Extension (Política Nacional de Assistência Técnica e Extensão Rural, PNATER), in which the Federal Government offers free technical assistance to Brazilian household farmers. This policy explicitly adopted a technological framework based on agro-ecology principles, which rely strongly on diversified production systems. In 2010, however, a new PNATER was implemented with a new law (number 12,188), changing the way service providers were hired, and dropping the emphasis on agro-ecology (Caporal, 2014). In spite of that, the Ministry of Agricultural Development, responsible for the policy's enforcement, has kept the ecological emphasis, especially when training the technical assistance and rural extension agents (ATER).
Our dummy variable for technical assistance is created considering only contracts of technical assistance and rural extension under the new 2010 law. The evidence supports the idea that farmers benefiting from free technical assistance tend to have a more diversified production. This finding is in line with other studies in the literature (Bravo-Ureta et al., 2006; Longpichai, 2013; Oliveira Filho et al., 2014).

Note to Table 2. The dependent variable is the Simpson Index of Diversification (SID). The regressors are a number of variables that may affect Brazilian small farmers' decision to diversify.

Conclusion
Our results indicate that regional differences play a crucial role in diversification. The regional dummies all have positive, significant coefficients. They are much larger in magnitude than the estimates for the other coefficients, suggesting that regional differences have a strong influence on crop diversification, almost dwarfing the effects of other variables (of course, considering the units in which each variable is measured). Farmers with similar characteristics could have on average very different levels of crop diversification, depending on the region in which they are located. That fact has not been emphasized before in the literature, possibly because of dataset limitations in previous studies. The large, continental-sized range of our dataset, on the other hand, made it possible to highlight this regional effect.
On-farm income relates to diversification in a non-linear fashion. As their income rises, poor farmers tend to diversify more, whereas rich farmers tend to diversify less. Social benefits significantly affect diversification, but other off-farm incomes do not. The more social benefits a family earns, the more diversified its production tends to be.
The farmer's age and education significantly affect diversification. The effect of age is non-linear. As they get older, younger farmers increase and older farmers reduce production diversification. Education was negatively associated with diversification. Farmers who complete elementary school have less diversified crops than the ones who do not complete it.
Also, the ownership of the farm significantly affects diversification. Farmers who own their farm have a less diversified production. On the other hand, farmers with access to technical assistance tend to diversify more. The size of the farm and the number of farms managed by the farm household also significantly affect diversification. The larger the farm, and the larger the number of farms handled, the more diversified the crops tend to be.

Table 1. Descriptive statistics of the variables

Table 2. Estimation of the Tobit Model