How high frequency food diaries can transform understanding of food security

Globally, around 2 billion people are affected by moderate to severe food insecurity. The linkages from food security through to environmental sustainability are well established, but not yet well measured. This is a critical gap, as it hampers our understanding of how environmental shocks carry through to become consumption shocks to households, communities, or regions, and of how responses to these shocks (e.g., dietary substitutions) feed back into further environmental stress. In this study, we present preliminary results from an innovative approach that could transform conventional practices of measuring food consumption into data on the same temporal and spatial footing as environmental data. We developed an alternative approach to conventional one-off food consumption measures that harnesses the expanding presence of mobile and smartphones, measuring food consumption over time with precision and with the potential to capture seasonal shifts in diet and food consumption patterns. Our method provides a picture of breaks and booms in access to forms of food calories, and the ability to compare different moments in time, such as those before and after a nutrition or other economic intervention, a key priority area for research and humanitarian decision-making in nutrition. We show that the distribution of food calories over time is a stronger predictor of health outcomes than any one period's measure, as might be obtained in a conventional household survey, and discuss the part that methods like this should play in a future reimagination of rural engagement and social data collection.

To accurately identify interventions to reduce individual- and household-level food insecurity, we need detailed measures of food consumption. Three frequently employed (and frequently compared) measures of food consumption are daily food recalls (where an individual looks back on their consumption over the immediately preceding 24 h), 3-4 d detailed food records (where an individual records what they eat as they go through each day), and the semi-quantitative food frequency questionnaire (FFQ; where an individual indicates their typical consumption of different food types over a specified period, typically 1 week or 1 month) [1][2][3]. These measures trade off quantitative precision for an ability to capture some longer-term signal of consumption; we developed an alternative approach that harnesses the expanding presence of mobile and smartphones to move beyond this trade-off, measuring food consumption over time with precision and with the potential to capture seasonal shifts in diet and food consumption patterns. While mobile-based measurement of food consumption by surveyors is not uncommon [4,5], and mobile devices already aid in the dissemination of nutrition knowledge and guidance [6], remote engagement and regular, self-administered reporting of food consumption is as yet unexamined in the literature. Our approach is well adapted to a range of communities, including rural and remote areas in developing countries. It streamlines the regular, high-frequency collection of 24 h food recalls through 'push' reminders and mobile data package rewards to smartphone users (section 1). We obtain the longer-term signal that an FFQ looks for, but with insights and details the FFQ cannot provide: a picture of breaks and booms in access to forms of food calories, and the ability to compare different moments in time, such as those before and after a nutrition or other economic intervention, a key priority area for research and humanitarian decision-making in nutrition [7].
High-frequency, self-administered, mobile-phone-based data collection addresses a broader issue across domains of standardized social data collection, where conventional approaches (long surveys conducted by enumerators, spread in waves far apart in time) limit what we can know or compare across people, communities, and time. The constraints of cost and logistics have historically made it difficult to visit a large sample of people (large n), as is typical in most integrated household surveys, more than once in a short period of time, or to follow more than a small sample of people with regularity (large t) [8]. Access to mobile devices is changing that in many ways, all the way down to the possibility of reaching respondents on their own time, on their own devices and in their own spaces, and generating true large-n, large-t baselines of socioeconomic status and activity.
This approach (unsupervised survey through respondent-managed device) is at the end of a spectrum of computer-assisted personal interviewing (CAPI) approaches that includes the use of devices by trained enumerators from the local area or from further away. It adds to the benefits that these more mainstream CAPI approaches already provide (reduced data error and reduced time and labor needed to make datasets available) by improving the match between survey task demands and respondent attention (shorter blocks), as well as reducing recall bias and missed intertemporal variation [8]. It can also improve representativeness in the sample, empowering participants to respond in their own time and thus not biasing toward those who are able to conveniently take time away from work or other responsibilities to participate in a half-day or longer survey [9]. While it is true that these benefits must be weighed against the barriers to participation that literacy, numeracy and technology savvy play in shaping the respondent pool, it is also true that such barriers are fading over time as smartphones follow in the wake of simple mobile phones, becoming ubiquitous across rural contexts.

Materials and methods
We examined data from two of 46 modules ('Food Consumption' and 'Household Composition') from a pilot study of high-frequency smartphone data collection conducted in Rangpur District, Bangladesh in 2015/16, whose study design is summarized briefly here [10]. This study used a quasi-random selection process to recruit 480 'likely early adopters' of smartphones, sampled from lists of those with technical literacy identified by local extension officers, into a 1-year experiment during which they received payments of mobile talk-time and data for responding to short survey tasks on a near-daily basis in a 'microtasks for micropayments' data collection platform. Participants were asked to respond to 46 different survey tasks (mostly identical to survey questions included in the Bangladesh Integrated Household Survey; BIHS [11]) at varying frequencies throughout the year; while some were asked only once during the pilot, others were asked once per season, once per month, or even once per week. Assignment of repeated task frequency (season, month, week) to phones was made randomly across 24 unique phone setups, using an algorithm to make pairwise switches of task versions between phone setups to equalize earning potential (e.g. a task asked weekly offers more opportunity than the same task asked monthly). We include in our analysis only participants who were tasked with the 'Food Consumption' task on a weekly basis (n = 176).
The 'Food Consumption' task asked respondents to select which among a set of general food types their household had consumed in the immediately preceding 24 h: cereals; pulses, nuts and seeds; oils; vegetables; leafy vegetables; fruits; meat and eggs; large fish; small fish; spices, sweeteners, etc; drinks; and any other foods. For each type selected, respondents were then asked to indicate which specific food items within the more general food types they had consumed (e.g. rice, wheat, barley, maize, etc as specific cereals). Finally, for each specific food item reported, participants were asked how much of the food they had consumed, and how much was purchased vs produced themselves. The 'Household Composition' task followed the structure of the same module in the BIHS [11], and asked participants to indicate the age, gender, literacy, education level, occupation, height, and weight for each member of their household, as well as to report three measures of their physical wellbeing: whether they could stand up on their own after sitting down, whether they could walk for 5 km, and whether they could carry 20 l of water for 20 m.
We estimated the nutritional information for the majority of foods in our list using food composition data compiled by the United States Department of Agriculture (USDA) and the Food and Agriculture Organization of the United Nations (FAO) [12], supplementing where necessary with peer-reviewed studies tracking local foods (e.g. [13]); all sources for food items can be found in table S2 (supplementary material (available online at stacks.iop.org/ERL/16/041002/mmedia)). Calories were standardized to cal g⁻¹ from different reported measurements such as kJ/100 g and kcal/100 g. Nutritional information for several regional food items could not be found. In these instances, the calories were estimated as the average calories per food group from the food composition table (FCT) for Bangladesh by Shaheen et al. This FCT was used because it is one of the most comprehensive tables for national and regional food compositions of Bangladesh.
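The standardization of energy densities described above can be illustrated with a minimal sketch. It assumes that 'cal g⁻¹' refers to dietary kilocalories per gram and uses the thermochemical conversion of 4.184 kJ per kcal; the function name and unit labels are hypothetical and are not taken from the study's own processing code.

```python
# Illustrative sketch only: names and unit labels are assumptions.
KJ_PER_KCAL = 4.184  # thermochemical calorie definition


def to_kcal_per_gram(value, unit):
    """Convert an energy density reported per 100 g into kcal per gram."""
    if unit == "kcal/100 g":
        return value / 100.0
    if unit == "kJ/100 g":
        # Convert kJ to kcal first, then rescale from per 100 g to per gram.
        return value / KJ_PER_KCAL / 100.0
    raise ValueError(f"Unsupported unit: {unit!r}")


if __name__ == "__main__":
    print(to_kcal_per_gram(130.0, "kcal/100 g"))
    print(to_kcal_per_gram(544.0, "kJ/100 g"))
```

Raising an error on unrecognized units, rather than guessing, keeps unit problems visible so they can be resolved by hand.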
Food items were standardized to cal g⁻¹ from eight different reported units of measurement: kilograms, grams, liters, milliliters, tablespoons, teaspoons, drops, and count. For foods reported by count, calories were calculated using the standard serving size for the food item. After converting reported foods into cal g⁻¹, there were approximately 171 entries where the total calories for a food item exceeded 10 000 calories. We assumed that an incorrect unit of measurement had been selected in the survey for these entries. We chose a more suitable unit of measurement when available, and unresolved entries (75) were removed from the final data (13 230 rows of individual foods). Average weekly calorie intake for household consumption units (CUs) was calculated using the total calories reported for the week and CUs as follows, following [14]: 1 CU for the first adult in the household; 0.5 CU for each additional household member older than 14 years of age; and 0.3 CU for household members under 14 years of age. We used these data to translate participant responses into time series of per-household member meat, fish, and overall calorie consumption. We estimated within-subject means and coefficients of variation (standard deviation normalized by mean) for each time series, and used these to test the additive explanatory power of these food consumption metrics on the three measures of physical wellbeing used in the BIHS and described above (see probit regressions shown in table 1), with individuals weighted by the number of responses received throughout the duration of the pilot experiment (up to 48), and standard errors clustered by household.
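As an illustration of the consumption-unit weighting and the 'shape' statistics described above, the following sketch computes CUs for a household and the within-subject mean and coefficient of variation of a weekly series. Several choices here are assumptions rather than details stated in the text: members aged exactly 14 are placed in the 0.3 CU group, the sample (n − 1) standard deviation is used, and all function names are hypothetical.

```python
from statistics import mean, stdev


def consumption_units(ages):
    """Household consumption units (CUs) from a list of member ages.

    1 CU for the first member older than 14, 0.5 CU for each additional
    member older than 14, and 0.3 CU for every other member.
    Assumption: members aged exactly 14 fall in the 0.3 CU group.
    """
    older = sum(1 for age in ages if age > 14)
    younger = len(ages) - older
    first_adult = 1.0 if older >= 1 else 0.0
    return first_adult + 0.5 * max(older - 1, 0) + 0.3 * younger


def calories_per_cu(weekly_household_calories, ages):
    """Weekly household calories divided by household consumption units."""
    return weekly_household_calories / consumption_units(ages)


def mean_and_cv(series):
    """Within-subject mean and coefficient of variation (sd / mean).

    Assumption: the sample standard deviation is used, so at least two
    observations and a non-zero mean are required.
    """
    m = mean(series)
    return m, stdev(series) / m


if __name__ == "__main__":
    ages = [40, 35, 16, 10, 3]
    print(consumption_units(ages))
    print(mean_and_cv([2000.0, 2200.0, 1800.0]))
```

Two households with the same mean but different coefficients of variation would look identical in a one-off survey, which is the distinction the shape measures are meant to recover.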

Information is gained when we consider the variation within micro-level observations
High-frequency, regularized data could be transformational for measuring micro-level food consumption as a key component of food utilization and food security [15] in several ways. First, they could be transformational in the way we engage with rural communities (regularly, privately, and on their own time) and in the way we channel resources for data collection. Second, they could be transformational in the kinds of metrics we can create for looking at socioeconomic status or wellbeing [16]. Social scientists commonly rely on point estimates of a range of factors related to hypothesized determinants of food security (such as patterns in food consumption from household-level consumption and expenditure surveys) as a primary strategy for comparing populations. Food security monitoring is evolving to consider more diverse metrics for capturing the different, and potentially interconnected, dimensions of food security (see [15] for a description of the wide range of data and techniques used). High-frequency data collection has the potential to advance measurements of food security, specifically utilization and stability, by providing more complex insights into the variation within consumption patterns through documenting micro-level variation and distribution over time (including spread, skewness, spikes, and breaks).
On average across the sample, consumption is similar from week to week, with the notable exceptions of major festivals and 1 week in February 2016 during which fish consumption is anomalously high (figure 1, panels (A)-(C)). Thus, conducting food recalls in any particular week adequately estimates mean consumption, as long as the survey campaign is spaced far enough from planned festivals to not have a partial overlap of some respondents with these festival periods that could bias the average consumption upward. However, individually the reported consumption of calories and protein can vary widely from week to week, for reasons that are likely not known to data collectors. From a micro-data perspective, considering multiple weeks or time periods provides greater insight into household consumption because it provides both general trends in consumption that can be compared across households, as well as insight into how varied consumption patterns are within a household (figure 1, panels (D)-(F)). Instead of these highly variant point estimates, we now have household-level distributions (i.e. mean and coefficient of variation) to better describe the shape of food consumption along the year. When these data are collected regularly, we open up new opportunities to link food consumption to the distinct but interconnected pillars of food insecurity.
To demonstrate the utility of high-frequency food consumption data we examine the fundamental question 'Can we learn more about food insecurity by observing how people eat over time, than we can from any single-period measure?' We evaluate how 'shape' measures, such as mean and variance, improve the predictive capability over the more standard point-based estimates for basic physical health outcomes associated with food insecurity such as the ability to stand, carry burdens, and walk (section 1). These measures are informed by many factors and our design is not intended to present a full model; rather, it is intended to examine possible explanatory power alongside key factors such as age.
For each of three measures related to physical health associated with food insecurity, we conduct probit regressions on (a) a simple demographic model (of age, age squared, sex, and level of education) and (b) the same demographic model plus measures of the 'shape' of per-person meat, fish, and overall calories (including mean, coefficient of variation, and percentage of overall calories).
These shape measures matter, explaining significant variation (and improving the models, as measured by a lower Akaike Information Criterion; AIC) for all three physical wellbeing measures. Importantly, they matter more than any 1 week's food signal: single-week signals of meat, fish, and total calories are significant in some models, but the models based on food calorie distributions are superior (by AIC) for every week of the experiment (table S1).
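The model comparison described above rests on the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters and L the maximized likelihood. The sketch below evaluates a probit log-likelihood for given coefficients and computes AIC; it does not fit the model, omits the response weighting and clustered standard errors used in the study, and all names and numbers are hypothetical.

```python
from math import log
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit link


def probit_log_likelihood(coefs, rows, outcomes):
    """Log-likelihood of binary outcomes under a probit model.

    coefs: coefficients; rows: one covariate list per observation;
    outcomes: 0/1 values. Coefficients are taken as given here; in
    practice they would come from a maximum-likelihood fit.
    """
    total = 0.0
    for x, y in zip(rows, outcomes):
        p = _STD_NORMAL.cdf(sum(b * v for b, v in zip(coefs, x)))
        # Clip to avoid log(0) for extreme linear predictors.
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total += log(p) if y == 1 else log(1.0 - p)
    return total


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


if __name__ == "__main__":
    # Hypothetical log-likelihoods for a demographic-only model and a
    # demographic-plus-shape model with three extra parameters.
    print(aic(-100.0, 5), aic(-90.0, 8))
```

Because AIC penalizes each added parameter, the shape measures lower AIC only if they improve the log-likelihood by more than the number of parameters they add.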

Making high-frequency diet monitoring the new standard
Globally, around two billion people are affected by moderate to severe food insecurity [17]. Widespread and growing food insecurity, especially in Asia and Africa, means that the planet is not on track to meet humanitarian goals to significantly reduce food insecurity, including Sustainable Development Goal #2. The linkages from food security through to environmental sustainability are well established [18][19][20], but not yet well measured. Literature examining ties from sustainability to food security commonly focuses on the link from agriculture to food production, e.g. [21][22][23], or through production to commodity markets [24,25], but the connection of analyses in production and trade through to consumption seems elusive. Though food consumption sits alongside food production as 'among the most important drivers of environmental pressures' [26], we do not yet have the tools to analyze food consumption in the same ways (i.e. at the same scales and frequency) as we do for environmental variables and food production. This is a critical gap, as it hampers our understanding of how environmental shocks carry through to become consumption shocks to households, communities, or regions [27,28], and of how responses to these shocks (e.g. dietary substitutions) feed back into further environmental stress [8]. In this study, we present preliminary results from an innovative approach that could transform conventional practices of measuring food consumption into data on the same temporal and spatial footing as environmental data. We acknowledge limitations in our pilot study data that we would not expect to limit further, focused examinations of food consumption. Firstly, these data were collected as one module among many in a larger examination of recall bias and intra-period variation in high-frequency data collection [8].
Our food consumption module was adapted from a similar module in the BIHS [11] in a manner which made food selection intuitive (users first selected food groups, then specific foods within those groups, then estimated amounts) but which was not intensively field tested on its own. We do not have testing data on how challenging it was for participants to estimate portions or to distinguish raw vs cooked or wet vs dry amounts, or on how likely participants were to omit foods. Food-specific tailoring such as anchoring recall to meals within the day, highlighting previously entered foods, etc, could (through an iterative testing procedure) lead to a more accurate signal of overall food consumption. Secondly, following as well from the larger experimental design, respondents were given a flexible response window of several days to complete tasks, so that the time of day (or day of week) of reporting varies both within and across subjects. Thirdly, we sampled individuals more likely to be digitally literate and did not explicitly tackle the challenge of designing instruments for which literacy, digital literacy, or numeracy are non-constraining (as others such as Daum et al have done [29]). With these limitations to pilot data quality in mind, we refrain from making quantitative claims about effects in our analysis and point to our robust, qualitative findings: that 'shape' measures (e.g. mean and variation) have explanatory power on physical wellbeing outcomes that no one-off or typical food consumption measure can provide. We hope that the shortcomings of our pilot study design outlined above point to easily addressable improvements to a method with ever greater potential as barriers to digital, remote engagement subside.
Specifically, the break from enumerated to self-administered surveys transforms the potential for data to be representative of a target population and the quantity of interest. We have previously shown that significant intra-period variation is captured in the move from measuring food consumption once per season to once monthly, and again in moving from monthly to weekly [8]; this allows observations over time for at-risk groups whose nutritional needs and behaviors may change dramatically over a short period of time (such as pregnant/breastfeeding women [30], infants, and children) that are not possible at scale with enumerated survey waves. Further, training participants to capably self-administer survey tasks such as food intake may alleviate concerns over privacy, judgment, or other cultural and social expectations that shape reporting in any enumerator-respondent relationship. Added to the benefit cited earlier of reaching people without their needing to step away from work or other obligations, the move to self-administration offers a range of transformative improvements to representation that must be weighed against the challenges of training respondents.
Long-term, high-frequency pictures of household food consumption can reveal subtle nuances in chronic and/or acute food insecurity and allow for modeling and understanding of the processes that link consumption and health in ways that have heretofore been impossible. Improving household-level food consumption monitoring in evaluations of food or nutrition interventions is one of many entry points across the environmental social sciences to transform the way we engage with communities of respondents, moving us toward regularized socioeconomic baselines informed by high-frequency data, with incentives provided via the positive development externality of improved access to mobile internet. This depth in data collection and a shift toward capturing variation in human status and wellbeing outside of environmental crises (i.e. a move from McCubbins and Schwartz's 'fire alarms' to more regular 'police patrols' [31]) opens a wealth of new ways of knowing across environment and development fields. Evaluating the impact of pro-poor interventions on well-being, consumption, and land use would be possible in near real-time. Data on food and water access that are regular in space and time would expose the factors that prevent moisture deficits from becoming agricultural or socioeconomic droughts. More broadly, our ability to discuss variations in labor, consumption, or illness as anomalies from a statistical mean would put socioeconomic science on the same analytic and semantic footing as the natural systems branches of environmental science.
Enacting the transformation of data collection, food measurement, and rural engagement that we highlight above will rely on these new streams of data (a) connecting to what we have learned already, and (b) showing us something new. To the first of these challenges, we encourage researchers to begin building comparative work and ground-truthing into all new high-frequency, self-administered, mobile-phone-based data collection. The mixing of data collection modes will always bring shifts in responses. For example, people respond differently to things that have been read to them vs things they have read themselves [32], so there is a burden to conduct extensive ground-truthing and validation if we wish to stitch this new flow of data appropriately to that of the past. To the second, we encourage connections with decision makers and a dive into usable, useful data; measures of spread (how variable is their access to fish protein) or gaps (how long do families go without grain calories) may become central to monitoring and tracking processes identified as critical to making nutrition interventions work and build human capital [7].
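A 'gap' measure of the kind suggested above could be computed from a weekly series as the longest run of consecutive weeks with no reported calories from a food group. The sketch below is one hypothetical way to do so; it assumes the series contains exactly one value per week, with zero meaning none reported, and does not distinguish true zeros from missed responses.

```python
def longest_gap(weekly_calories):
    """Longest run of consecutive weeks with zero reported calories.

    Assumption: one entry per week, in chronological order; weeks with
    no response would need separate handling before calling this.
    """
    longest = 0
    current = 0
    for value in weekly_calories:
        if value == 0:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # any non-zero week ends the current gap
    return longest


if __name__ == "__main__":
    grain = [500, 0, 0, 300, 0, 0, 0, 200]
    print(longest_gap(grain))
```

Paired with the coefficient of variation as a measure of spread, a statistic like this summarizes stability of access in a way that no single-week recall can.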

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://doi.org/10.7910/DVN/HBQQVE.

Funding
This work was supported by the Cereal Systems Initiative for South Asia (CSISA) of the Consultative Group on International Agricultural Research (CGIAR), with generous funding provided by the United States Agency for International Development (USAID) and the Bill and Melinda Gates Foundation. The authors also acknowledge support from the International Food Policy Research Institute and the CGIAR Collaborative Research Program on Policies, Institutions, and Markets.