Does survey recall error explain the Deaton–Paxson puzzle?

• A proposed resolution of the “Deaton–Paxson puzzle” is evaluated.
• Household size elasticities of food expenditure are estimated on both recall and diary food expenditure data.
• Evidence of the puzzle is found in data collected by either method.

Using recall and diary food expenditure data from Canada, we compare estimates of the household size elasticity of per capita food expenditure. In contrast to Gibson (2002), we find negative elasticities in both recall and diary data. This in turn means we find evidence of the “Deaton–Paxson puzzle” in both diary and recall data. Recall error cannot be the sole explanation of the puzzle.

Available under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).


Introduction
In applied demand analysis, the income and household size elasticities of food expenditure play an important role, particularly in thinking about the economies of scale in household consumption. An assertion due to Engel is that households of different size with the same food budget share have the same standard of living. This leads to the “Engel method” of calculating economies of scale in household consumption. Suppose, for the purposes of illustration, that the food budget share is adequately modelled by $w_f = \alpha_0 + \alpha_1 \ln pcy + \beta \ln n + \varepsilon$, where $w_f$ is the food share, $\ln pcy$ is the logarithm of per capita income, and $\ln n$ is the logarithm of household size. Thus, to hold living standards (the food share) equal as household size doubles (increases by 100%), per capita income should change by (approximately) $-(\beta/\alpha_1) \times 100\%$. Economies of scale imply that the per capita income required to keep living standards constant should fall with household size. Empirically, $\alpha_1$ is always negative (this is “Engel's Law”). Thus, if the food share can be taken as a welfare measure (as Engel asserted), economies of scale require that $\beta$ be negative (the budget share should fall with increasing household size, holding $pcy$ constant). Empirically, this turns out to be the case. For example, using Thai, Pakistani, South African, US, French and British data, Deaton and Paxson (1998) find that, holding per capita income constant, the food share varies inversely with household size. The Engel method delivers estimates of the economies of scale in consumption that many researchers find plausible.
* Corresponding author at: University of Essex, United Kingdom.
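The compensation rule used by the Engel method can be written out in one step. The following is a sketch of the derivation under the illustrative linear specification only, holding the error term fixed; it is not taken from the original text.

```latex
\begin{align*}
w_f &= \alpha_0 + \alpha_1 \ln pcy + \beta \ln n + \varepsilon \\
0 = \mathrm{d}w_f &= \alpha_1 \,\mathrm{d}\ln pcy + \beta \,\mathrm{d}\ln n \\
\Rightarrow \quad \left.\frac{\mathrm{d}\ln pcy}{\mathrm{d}\ln n}\right|_{w_f} &= -\frac{\beta}{\alpha_1}
\end{align*}
```

With $\alpha_1 < 0$ and $\beta < 0$ the ratio is negative, so the per capita income needed to hold the food share constant falls as household size rises. Reading a doubling of household size as a 100% change is a first-order approximation, since a doubling corresponds exactly to $\Delta \ln n = \ln 2 \approx 0.69$.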
Against this, Deaton and Paxson (1998) argue that if economies of scale arise from the sharing of public goods within the household, then a larger household with the same per capita income is effectively better off and should increase its per capita consumption of private goods, such as food. 1 Thus, holding per capita income constant, the per capita quantity of food, and hence the budget share, should rise. Thus $\beta$ (and $\beta/w_f$, the elasticity of food expenditures with respect to household size) should be positive. The fact that this compelling piece of analysis is empirically contradicted is referred to as the “Deaton–Paxson puzzle”. Gibson (2002) suggests that one possible explanation for the Deaton–Paxson puzzle is measurement error in recall food expenditure data that is negatively correlated with household size. For larger households, it becomes increasingly cumbersome to recall accurately all food-related purchases made over even a modest time period. Thus the larger the household, the more likely is systematic underreporting of food expenditure. A negative correlation between the measurement error and household size imparts a negative bias on the estimated relationship between the food share and household size.
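The direction of the bias that Gibson describes can be illustrated with a small simulation. This is a minimal sketch under assumed values, not an analysis of any survey: the data-generating parameters, the proportional under-reporting rate `delta`, and the function name `simulate_recall_bias` are all hypothetical, and the error-free share simply stands in for an ideal diary.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients for y = X b + e via least squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate_recall_bias(n_households=5000, beta_true=0.01, delta=0.10, seed=0):
    """Compare the household-size coefficient with and without recall error.

    All parameter values are hypothetical. The true food share rises with
    household size (beta_true > 0); the recall report under-states it by a
    proportion that grows with household size.
    """
    rng = np.random.default_rng(seed)
    ln_pcy = rng.normal(10.0, 0.5, n_households)
    ln_n = np.log(rng.integers(1, 7, n_households))  # household sizes 1..6
    noise = rng.normal(0.0, 0.02, n_households)
    share_true = 0.90 - 0.075 * ln_pcy + beta_true * ln_n + noise

    # Assumed recall error: reported share = true share * n^(-delta)
    share_recall = share_true * np.exp(-delta * ln_n)

    X = np.column_stack([np.ones(n_households), ln_pcy, ln_n])
    beta_error_free = ols(X, share_true)[2]
    beta_recall = ols(X, share_recall)[2]
    return beta_error_free, beta_recall


beta_error_free, beta_recall = simulate_recall_bias()
print(f"error-free beta: {beta_error_free:.4f}, recall beta: {beta_recall:.4f}")
```

In this setup the coefficient on log household size is positive when the share is measured without error and turns negative once under-reporting rises with household size, which is the mechanism proposed as a resolution of the puzzle.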
Many of the surveys examined by Deaton and Paxson do employ recall methods to collect food expenditures, and Gibson suggests that the Deaton–Paxson puzzle might be resolved by using diary-based food expenditures. He uses data from Papua New Guinea (PNG) to test the validity of this prediction. Households were randomly divided into two subsamples; one was asked to keep a diary while the other was asked recall questions. His results suggest that while recall data underestimate household size elasticities, estimates based on diary data do not exhibit the Deaton–Paxson puzzle.

Data
The 1996 Canadian Food Expenditure Survey (FoodEx) provides a unique opportunity to study how food expenditure measures constructed from recall questions compare to those obtained from expenditure diaries. This nationally representative survey first asked respondents to estimate their household's food expenditure over the past four weeks, along with basic demographic questions. They were then asked to record daily food expenditure in two consecutive weekly diaries. The survey involved three visits to each household. At the initial visit, demographic and recall food consumption questions were asked. The weekly diaries were collected at subsequent visits. The interviewers double-checked diaries and verified the quality of the responses. The survey was run throughout the year. The initial response rate was 76 percent, and there were 10 898 responding households. The non-response rate to the recall question was less than 2 percent. Attrition between the first and second week of the diary was less than 2 percent. Weights are provided that account for the survey design and non-response, but not for the attrition between the two weeks.
We can also compare the FoodEx to data from a second large Canadian survey. The 1996 Family Expenditure Survey (FamEx) is a full household expenditure survey (collecting information on all categories of expenditure). 2 Face-to-face interviews were conducted in the first quarter of 1997 to collect income and expenditure information for the previous year. Statistics Canada undertakes various checks of the data and the data are generally thought to be of very good quality. 3 There are 10 085 respondent households in the 1996 FamEx. 4 Because the FamEx collected annual data and the FoodEx survey ran continuously over the year, they refer to the same time period. The surveys were based on the same (Labour Force Survey) sampling frame. Thus these two surveys readily lend themselves to comparison. 5, 6
1 This assumes limited substitution between food and the public good.
2 The FamEx surveys were used to determine the weights for the Consumer Price Index in Canada.
3 Further details on the quality of these data are in Brzozowski and Crossley (2011).
4 The response rate to the FamEx surveys is about 75%.

Results
We estimate food share equations that are a quadratic extension of the Working-Leser form, $w_f = \alpha_0 + \alpha_1 \ln pcy + \alpha_2 (\ln pcy)^2 + \beta \ln n + \gamma X + \varepsilon$, where $w_f$ is the budget share of food at home (see footnote 7), $\ln pcy$ is the logarithm of per capita income, $\ln n$ is the logarithm of household size, and $X$ are other variables. We estimate this equation using two data sets and three measures of the food share. First, we use a food share based on the average of the diary weeks in the FoodEx. Second, we use a food share based on the (1 month) recall measure in the FoodEx. Third, we use a food share based on the (1 year) recall measure in the FamEx. The results are presented in Table 1.
We find that the food share varies inversely with household size in all three cases. The coefficient on log household size is −0.007 with the FoodEx diary data, −0.023 with the FoodEx recall data, and −0.003 with the FamEx recall data (3rd row, 2nd panel, Table 1). The first two estimates are different from zero at conventional levels of statistical significance, while the third is not. Although the estimates are of the same sign, F-tests do indicate that the FamEx recall estimates are statistically different from both FoodEx estimates (2nd and 4th row, 3rd panel, Table 1). 8 The implied elasticities are presented in the 4th panel of Table 1. The bottom line is that we find the Deaton–Paxson puzzle with both recall and diary data. Thus our data are incongruent with Gibson's resolution of the puzzle.
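The link between the coefficient on log household size and the implied elasticity can be made explicit. The following is a sketch under the assumption that the food share is per capita food expenditure divided by per capita income; the symbol $x_f$ for per capita food expenditure is introduced here for illustration and does not appear in the original.

```latex
\begin{align*}
w_f &= \frac{x_f}{pcy}
\quad\Longrightarrow\quad
\ln x_f = \ln w_f + \ln pcy \\
\left.\frac{\partial \ln x_f}{\partial \ln n}\right|_{pcy}
&= \frac{1}{w_f}\,\frac{\partial w_f}{\partial \ln n}
= \frac{\beta}{w_f}
\end{align*}
```

Because $w_f > 0$, the elasticity has the same sign as $\beta$, so a negative coefficient on log household size implies a negative household size elasticity of per capita food expenditure.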

Discussion
In response to an early version of our analysis of the FoodEx, Gibson and Kim (2007) propose an explanation for the contrast with the results reported in Gibson (2002) and Gibson and Kim (2007). They postulate that because FoodEx respondents are asked a broad question about total household food expenditure, they are more likely to employ an estimation-based response strategy rather than enumeration (of actual purchases). In contrast, the surveys studied in Gibson (2002) and Gibson and Kim (2007) ask more detailed questions about expenditure on different food categories, and so respondents may be more likely to enumerate actual purchases. Errors in estimation may be less strongly related to household size than errors in recalling actual purchases. This is an appealing argument, and it would clearly be useful to better understand respondents' use of estimation and enumeration strategies when asked recall questions, and the nature of the errors associated with each.
5 FoodEx measures were converted to annual values. For a detailed comparison of the consumption measures used in this paper, see Brzozowski et al. (in press). Also see that paper for a discussion of differences across surveys in the construction of the household income variable and in top coding of household size.
6 To deal with potential outliers, we trimmed the top and bottom 2% of expenditure reports.
7 We define the food at home budget share as expenditure on food at home divided by gross income. While total outlay is the preferred denominator, gross income is the measure of resources that we have in both surveys. In demand analysis, it would be common to use total outlay both to construct the budget share and as an explanatory variable, but then to instrument total outlay with income to mitigate endogeneity and attenuation due to measurement error. Measurement error in income may lead to some attenuation bias in our estimates, but this would be common to diary and recall food expenditure measures.
8 To implement these tests we treat the diary and recall expenditure reports as separate observations, effectively a panel of two observations on each household in the FoodEx, and then pool the data, including the FamEx. This gives an unbalanced panel, with two observations on some households and one observation on others. We then estimate a regression model with full interactions between “FoodEx Recall” and “FamEx” dummies and all other variables. In estimating this model, we calculate cluster-robust standard errors with clustering at the household level, to allow for the obvious correlation between the responses of the same households. We then test the interaction terms (jointly where appropriate) using the cluster-robust covariance matrix.
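The pooled test with household-level clustering can be sketched on simulated data. This is an illustration under stated assumptions rather than the authors' code: the data are generated, not drawn from the FoodEx or FamEx; only the two FoodEx-style reports per household are pooled (the FamEx observations are omitted); the function name `ols_cluster` is hypothetical; and the sandwich estimator below applies no small-sample correction.

```python
import numpy as np


def ols_cluster(X, y, groups):
    """OLS coefficients with a cluster-robust (sandwich) covariance matrix.

    No small-sample correction is applied; this is an illustrative sketch.
    """
    bread = np.linalg.inv(X.T @ X)
    coef = bread @ X.T @ y
    resid = y - X @ coef
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        rows = groups == g
        score = X[rows].T @ resid[rows]  # summed score within one household
        meat += np.outer(score, score)
    return coef, bread @ meat @ bread


# Simulated households, each reporting a diary share and a recall share.
# All parameter values are hypothetical.
rng = np.random.default_rng(0)
n_hh = 2000
ln_pcy = rng.normal(10.0, 0.5, n_hh)
ln_n = np.log(rng.integers(1, 7, n_hh))
hh_effect = rng.normal(0.0, 0.03, n_hh)  # shared by both reports of a household
diary = 0.90 - 0.075 * ln_pcy - 0.007 * ln_n + hh_effect + rng.normal(0, 0.03, n_hh)
recall = 0.90 - 0.075 * ln_pcy - 0.023 * ln_n + hh_effect + rng.normal(0, 0.03, n_hh)

# Stack the two reports and fully interact the regressors with a recall dummy.
y = np.concatenate([diary, recall])
is_recall = np.concatenate([np.zeros(n_hh), np.ones(n_hh)])
base = np.column_stack([np.ones(n_hh), ln_pcy, ln_n])
stacked = np.vstack([base, base])
X = np.hstack([stacked, stacked * is_recall[:, None]])
groups = np.concatenate([np.arange(n_hh), np.arange(n_hh)])

coef, cov = ols_cluster(X, y, groups)
diff = coef[5]               # recall-minus-diary difference in the ln n coefficient
se = np.sqrt(cov[5, 5])
wald = (diff / se) ** 2      # compare with a chi-squared(1) critical value
print(f"difference: {diff:.4f}, cluster-robust s.e.: {se:.4f}, Wald: {wald:.1f}")
```

Clustering on the household identifier allows the two reports from the same household to share an unobserved component, which is the correlation the footnote refers to. A joint test of several interaction terms would use the corresponding sub-block of the covariance matrix instead of a single diagonal element.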
Moreover, other differences between surveys and settings, including the definition of food expenditure, the mix of food consumed inside and outside the household, the role of home-produced food and average household sizes, could mean that the nature of recall error is quite different in Canada and PNG. However, neither the suggestion that FoodEx respondents employ an estimation-based response strategy to recall questions, nor other factors that might matter for recall error, explains why we find evidence of the Deaton–Paxson puzzle in diary data. Our results suggest that the Deaton–Paxson puzzle must arise, at least in some instances, for reasons other than (or in addition to) recall error. Finally, if we employ the “Engel method” to estimate returns to scale, the FamEx recall data imply that a doubling of household size allows a 3% cut in per capita income, while the FoodEx diary and recall data give estimates of 9% and 24% respectively. 9
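The Engel-method calculation is simple arithmetic once the share equation has been estimated. The sketch below assumes the quadratic specification, in which the income slope of the food share is $\alpha_1 + 2\alpha_2 \ln pcy$; the function name `engel_income_cut` and the coefficient values in the example are hypothetical and are not the estimates behind the percentages reported above.

```python
def engel_income_cut(beta, alpha1, alpha2, ln_pcy):
    """Approximate proportional cut in per capita income that keeps the food
    share constant when household size doubles.

    Uses beta / (alpha1 + 2 * alpha2 * ln_pcy), evaluated at a chosen value
    of ln_pcy (for example its sample mean).
    """
    income_slope = alpha1 + 2.0 * alpha2 * ln_pcy
    if income_slope == 0:
        raise ValueError("income slope of the food share is zero")
    return beta / income_slope


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
example_cut = engel_income_cut(beta=-0.02, alpha1=-0.30, alpha2=0.01, ln_pcy=10.0)
print(f"implied cut in per capita income: {example_cut:.0%}")
```

With a negative income slope, a more negative coefficient on log household size translates directly into a larger implied cut, which is why the three food share measures produce such different estimates of economies of scale.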

Conclusion
In an application drawn from demand analysis, we compared estimates of household size elasticities of food expenditure based on recall and diary food expenditure data. We find negative household size elasticities with both kinds of data. This leads us to doubt the generality of a resolution of the Deaton–Paxson puzzle proposed by Gibson (2002).
9 Computed as $\beta / (\alpha_1 + 2\alpha_2 \ln pcy)$ at the mean value of $\ln pcy$.