Comparison of Dietary Quality Assessment Using Food Frequency Questionnaire and 24-hour-recalls in Older Men and Women

Objectives To examine the agreement in nutrient intake and alternative healthy eating indices (AHEI) between a self-administered Food Frequency Questionnaire (FFQ) and 24-hour recall (24HR) measurements of diet by gender, among older adults. Material and methods This is a cross-sectional observational study of 105 men and 99 women aged 65 and older living in urban and rural neighborhoods in Worcester County, Massachusetts, USA. Participants were queried on diet using both FFQ and 24HR. The healthy eating classification was compared between the two instruments by gender. Results For men, the mean ± SD of AHEI total score was 48.2 ± 12.3 based on FFQ versus 34.7 ± 10.2 based on 24HR. For women, the mean ± SD was 47.9 ± 10.1 based on FFQ versus 36.1 ± 10.0 based on 24HR. Using 32 as the cutoff (40% of maximum AHEI score), 9% of men and 7% of women were classified as eating unhealthy based on the FFQ, versus 47% of men and 38% of women based on 24HR. Compared to women, men had larger 24HR to FFQ discrepancies in the nuts and vegetable protein subscore and the white/red meat ratio subscore, and a smaller discrepancy in the alcoholic beverages subscore. Conclusion Agreement between FFQ- and 24HR-based measures of diet quality was roughly comparable between men and women, though slightly better for women than men. Compared to 24HR, the FFQ tended to underestimate the proportions of older men and women classified as eating unhealthy and misclassified more men than women. Such limitations should be considered when the FFQ is used to study healthy eating in older age.


Introduction
Healthy eating is critical to the prevention and care of chronic diseases and disabilities in older age [1,2]. Accurate and cost-effective assessment of dietary intake and diet quality may inform personalized nutrition counseling and help monitor trends in adherence to dietary guidelines.
Commonly used methods include the interviewer-administered 24-hour dietary recall (24HR) and the self-administered food frequency questionnaire (FFQ). The 24HR requires a professionally trained interviewer to conduct multiple telephone interviews with each respondent to capture the intra-person variability of diet [3]. The interviewer-administered 24HR is often preferred as the "gold standard" against which a FFQ is calibrated [4,5]. The FFQ is a low-cost alternative and a frequent measurement of choice for large-scale epidemiologic and health promotion studies. A typical FFQ includes a finite list of foods and food groups, for which participants report how often each item was consumed over a specified period of time. Portion sizes are reported relative to standardized portions.
Dietary behaviors are complex, and dietary assessment is prone to errors [6]. The accuracy of dietary assessment may be influenced by a number of the respondent's personal characteristics, such as gender, age, race, culture, education, income, language, home environment and living arrangements, cognitive function and memory [7], and nutrition literacy [8]. In addition, social desirability or pressure regarding body image and weight can impede reporting accuracy, disproportionately affecting overweight women [9]. Because biology, behaviors and social desirability differ between men and women, the accuracy of instruments to measure their dietary intake could well differ. Therefore, it is important to investigate discrepancies in dietary quality assessment using different instruments and among different populations. For example, in-depth analysis is needed to better understand whether the use of different dietary assessment tools affects the validity of analyses of gender differences in diet quality.
As a part of a larger study of health behaviors among older adults, we assessed the utility of the self-administered General Nutrition Assessment (GNA) FFQ of the Fred Hutchinson Cancer Research Center (FHCRC) [10] in community-based studies on healthy eating. Our previous analysis showed that the FFQ has limited ability to accurately assess nutrient intake among older Black women, and tends to underestimate racial differences in diet and healthy eating among urban older women when compared to 24HR [11]. In the current analysis, we examined the agreement in nutrient intakes and the alternative healthy eating indices (AHEI) between the FFQ and 24HR by gender.
Data from the first battery of instruments are itemized in Table 1. Most characteristics were assessed by self-report using questionnaires designed for this study, along with a number of standardized instruments including the Tinetti Falls Efficacy Scale for fear of falling [14], Beck Anxiety Inventory [15], CES-D Depression Scale [16], Activities of Daily Living (ADL) for physical limitations, and the CHAMPS (Community Healthy Activities Model Program for Seniors) survey of frequency of exercise activities, both recreational and functional [17,18]. Weight and height of each participant were obtained by self-report. Physical activity was measured objectively by an accelerometer (ActiGraph GT3X-Plus) worn by each participant during all waking hours for 7 consecutive days in the week following the completion of the first battery of questionnaires. A daily mean number of steps was calculated for each person, excluding any non-wearing days (i.e., days on which <10 steps were recorded).
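The step-count rule described above can be illustrated with a minimal sketch. The function name and the example values are hypothetical and are not taken from the study's data or software; only the <10 steps non-wear rule comes from the text.

```python
from statistics import mean

# Days with fewer than 10 recorded steps are treated as non-wearing days.
NON_WEAR_THRESHOLD = 10


def mean_daily_steps(daily_steps):
    """Average step count over valid wear days; None if no valid day remains."""
    valid_days = [steps for steps in daily_steps if steps >= NON_WEAR_THRESHOLD]
    return mean(valid_days) if valid_days else None


# Hypothetical 7-day record with one non-wearing day (illustrative values only).
example_week = [5200, 4800, 0, 6100, 3900, 7000, 5500]
example_mean = mean_daily_steps(example_week)
```

Excluding rather than zero-filling non-wearing days keeps a forgotten device from lowering a participant's average.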

Measurements of dietary intake
The participants were queried about their diet using the 24HR and then the GNA FFQ, all performed within 3 weeks of each other. Each participant received three unannounced computer-assisted 24HR (University of Minnesota Nutrition Coordinating Center version NDSR 2011), conducted on randomly selected days within a 1-week period (two weekdays and one weekend day) [19-21]. The GNA FFQ was developed by the NASR of the FHCRC, and is described elsewhere [11]. In summary, the GNA FFQ is based on the WHI FFQ, using the same format and analysis algorithms, and was updated in late 2010. The GNA FFQ relies upon the University of Minnesota Nutrition Data Systems for Research (NDSR) software version 2014 for data entry and nutrient analysis [20,22].
Dietary outcomes of primary analytic interest in our study included average daily total caloric intake and measures of dietary quality, including consumption of fruits and vegetables, legumes, and nuts, and average daily intake of total protein, fats (saturated, poly- and monounsaturated fats, trans-fats), types of carbohydrates, fibers, and micronutrients (such as sodium and calcium).
We calculated healthy eating scores to compare the ability of the 24HR and FFQ instruments to evaluate dietary quality. First, an alternate healthy eating index (24HR AHEI) was calculated for each participant based on their 24HR intake, modifying the formula developed by the USDA Center for Nutrition Policy and Promotion [23]. As in prior studies, we did not include multivitamin intake in total score due to our interest in diet quality associated with food eating behaviors [24,25]. The overall index had a total possible score ranging from 0 to 80 points, with higher scores indicating better overall dietary quality as it relates to morbidity and mortality of chronic diseases. Dietary component subscores were calculated for 8 components, including vegetables, fruits, nuts and legumes, ratio of white to red meat, cereal fiber, alcohol, trans fats, and ratio of polyunsaturated to saturated fats. Each component score had a scoring range of 0 to 10 points. In parallel, a FFQ-based AHEI score was calculated for each participant using the same 8 component scores. Comparable measures were available in the 24HR and FFQ data for all components except for the nuts and legumes subscore. Servings per day of nuts and legumes were not directly available from the FFQ, so this component score was estimated from the vegetable protein intake reported in the FFQ and scaled to match the point range of the same component in the 24HR AHEI. Total AHEI scores were categorized as "poor" in nutritional value if less than 40% of the maximum score, i.e., <32 points, following the procedure of Rehm and associates [26].
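As a minimal sketch of the scoring structure described above, the total AHEI can be expressed as the sum of eight component subscores (0 to 10 points each), with totals below 32 points classified as "poor". The component keys and function names are illustrative assumptions; how each subscore is derived from intake data is not specified here and is not reproduced.

```python
# Hypothetical keys for the eight components named in the text.
COMPONENTS = (
    "vegetables", "fruits", "nuts_legumes", "white_red_meat_ratio",
    "cereal_fiber", "alcohol", "trans_fat", "pufa_sfa_ratio",
)
MAX_PER_COMPONENT = 10
MAX_TOTAL = MAX_PER_COMPONENT * len(COMPONENTS)  # 80 points
POOR_CUTOFF = 32  # 40% of the 80-point maximum


def total_ahei(subscores):
    """Sum the eight component subscores after checking their 0-10 range."""
    for name in COMPONENTS:
        value = subscores[name]
        if not 0 <= value <= MAX_PER_COMPONENT:
            raise ValueError(f"{name} subscore out of range: {value}")
    return sum(subscores[name] for name in COMPONENTS)


def classify_diet(total):
    """Return 'poor' for totals below the cutoff, otherwise 'better'."""
    return "poor" if total < POOR_CUTOFF else "better"


# Illustrative participant scoring 4 points on every component.
example_total = total_ahei({name: 4 for name in COMPONENTS})
example_class = classify_diet(example_total)
```

Because the cutoff is strict (<32), a participant scoring exactly 32 points falls in the "better" category under this sketch.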

Statistical analysis
Gender differences in sociodemographic, physical and mental health, and lifestyle factors were evaluated using Chi-squared tests for percentages or Wilcoxon rank-sum tests for continuous variables (because many of the variables have skewed distributions). We also divided the participants into quartiles based on the absolute value of discrepancy between their total 24HR AHEI score and their total FFQ AHEI score. We then compared the characteristics of individuals in the quartile with greatest discrepancies with those in the lowest-discrepancy quartile in a supplementary table.
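The quartile grouping by absolute discrepancy can be sketched as follows. The quantile method (Python's default "exclusive" method in `statistics.quantiles`) and the function name are assumptions, since the text does not state how quartile boundaries were computed; the example scores are invented for illustration.

```python
from statistics import quantiles


def discrepancy_quartiles(ahei_24hr, ahei_ffq):
    """Return absolute 24HR-FFQ discrepancies and a quartile label (1-4) for each."""
    abs_disc = [abs(recall - ffq) for recall, ffq in zip(ahei_24hr, ahei_ffq)]
    q1, q2, q3 = quantiles(abs_disc, n=4)  # three cut points
    # Label 1 = smallest discrepancies, label 4 = largest discrepancies.
    labels = [1 + (d > q1) + (d > q2) + (d > q3) for d in abs_disc]
    return abs_disc, labels


# Hypothetical total AHEI scores for eight participants (illustrative only).
recall_scores = [30, 35, 40, 28, 33, 45, 38, 31]
ffq_scores = [32, 45, 41, 48, 39, 50, 53, 34]
abs_disc, labels = discrepancy_quartiles(recall_scores, ffq_scores)
```

Participants labeled 4 would then be compared with those labeled 1 on their personal characteristics.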
Dietary intake values of 48 nutrient items were compared by gender within both the FFQ and 24HR measurements using unadjusted linear regression models. For each nutrient item, the Pearson's product moment correlation (rho) between individuals' FFQ and 24HR measurements was estimated by gender, along with its 95% confidence interval based on Fisher's transformation. Rhos for each nutrient item were tested for equality between genders. Actual discrepancies between the FFQ and 24HR measurements were then examined item by item. For each nutrient item and each participant, the raw discrepancy (24HR measure − FFQ measure) was calculated, as well as the percentage discrepancy, defined as the absolute value of 100 × (24HR − FFQ) / 24HR. The mean percentage discrepancies were compared by gender using linear regression models (1) unadjusted; and (2) adjusted for age, income, and education.
AHEI total scores and component subscores were summarized and compared for male to female differences using the Wilcoxon rank-sum test. Rhos between the FFQ- and 24HR-based scores were also estimated by gender. The mean differences between the 24HR- and FFQ-based AHEI scores were calculated. These were compared for gender differences using unadjusted linear regression models. Kappa scores and percent agreement were calculated for the classification of total AHEI scores into poor vs. better dietary quality by the FFQ vs. 24HR method. A scatter plot along with fitted linear regression lines and AHEI cutoff points at 32 was drawn to illustrate the gender differences in the distributional characteristics, correlations between FFQ- and 24HR-based AHEI scores, and misclassification of poor vs. better diet quality.
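The correlation and discrepancy calculations above can be sketched in a few lines. This is an illustration under stated assumptions, not the study's code: the equality test shown is the standard two-sample z-test on Fisher-transformed correlations, which the text does not explicitly name, and all function names are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Confidence interval for a Pearson correlation via Fisher's z transformation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for equality of two independent correlations (assumed test)."""
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def percent_discrepancy(recall, ffq):
    """Absolute percentage discrepancy relative to the 24HR value."""
    return abs(100 * (recall - ffq) / recall)


# Total-AHEI correlations and sample sizes reported in this paper:
# men rho = 0.46 (n = 105), women rho = 0.55 (n = 99).
ci_men = fisher_ci(0.46, 105)
z_stat, p_value = compare_correlations(0.46, 105, 0.55, 99)
```

With these inputs the sketch yields a p-value well above 0.05, in line with the reported absence of a statistically significant gender difference in the total AHEI correlation. Note that `percent_discrepancy` is undefined when the 24HR value is zero.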
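Percent agreement and Cohen's kappa for the poor vs. better classification can be computed from paired scores as in the following sketch. The function name and the example scores are hypothetical; only the 32-point cutoff comes from the text.

```python
POOR_CUTOFF = 32  # totals below this are classified as "poor"


def agreement_stats(ffq_scores, recall_scores, cutoff=POOR_CUTOFF):
    """Return (percent agreement as a proportion, Cohen's kappa) for poor-diet classification."""
    ffq_poor = [score < cutoff for score in ffq_scores]
    rec_poor = [score < cutoff for score in recall_scores]
    n = len(ffq_poor)
    # Observed agreement: both instruments give the same classification.
    observed = sum(f == r for f, r in zip(ffq_poor, rec_poor)) / n
    # Chance agreement from the marginal proportions of each instrument.
    p_ffq = sum(ffq_poor) / n
    p_rec = sum(rec_poor) / n
    expected = p_ffq * p_rec + (1 - p_ffq) * (1 - p_rec)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")
    return observed, kappa


# Hypothetical paired total AHEI scores for six participants (illustrative only).
example_ffq = [30, 45, 50, 28, 40, 55]
example_recall = [25, 30, 36, 29, 31, 40]
observed, kappa = agreement_stats(example_ffq, example_recall)
```

Kappa discounts the agreement expected by chance, so it can be modest even when percent agreement looks acceptable, particularly when the two instruments classify very different proportions as poor.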

Participant characteristics
Participants who completed both dietary assessments included 105 men and 99 women. They had a mean (SD) age of 73.8 (5.9), had 14.9 (2.8) years of education, and 88.2% were White (Table 1). Nearly all (97.5%) were current non-smokers and owned a car (99.5%). Compared to the men, higher proportions of the women had an annual household income below $50,000 and were living alone and/or unmarried. Women had, on average, lower levels of physical activity as measured by step counts and self-reported exercise frequencies. Women also had slightly but significantly lower BMI, a lower frequency of drinking alcoholic beverages, and higher levels of anxiety. Larger percentages of women than men had osteoporosis and respiratory disease, while more men had diabetes.

Dietary intake
Men and women differed significantly in their intake of many nutrients, according to both the FFQ and 24HR, with men consuming more daily calories, and thus having higher intakes of carbohydrates, proteins, and fats, and many micronutrients (Table 2). Men also consumed more caffeine and approximately twice as much alcohol as women, but had somewhat lower levels of vitamin A/beta carotene, lutein/zeaxanthin, and vitamin K in their diets. The 24HR appeared to be more sensitive to the gender differences than the FFQ. For instance, the 24HR estimated that men consumed 479 more calories than women vs. 287 more according to the FFQ. Similarly, the FFQ slightly underestimated differences in most macro- and micronutrient intakes relative to the 24HR, although there was good agreement between the two measures on the direction of the gender differences.
Examination of the correlations (rho) between FFQ and 24HR measurements revealed modest gender differences in a number of energy and nutrient intake measures (Table 3). The mean difference between male and female correlation coefficients for FFQ-24HR agreement was 0.07, with men having a mean correlation of 0.34 and women 0.42. Men had a "strong correlation" of 0.5 or greater for 4 of the 48 pairs, while women had strong correlations for 14 pairs. Statistically significant male to female differences were found in the correlations in reported percentage of calories from total fat, saturated fat, monounsaturated fat, and trans-fats; several vitamins (B6, folate, and E); and calcium. Notably, we found no statistically significant correlations between the FFQ and 24HR measurements of folate, vitamin B6, and vitamin E for men; percent of calories from trans-fats for women; and galactose (milk sugar) for both men and women.
Averaging, by gender, each individual's reporting discrepancy [24HR measure − FFQ measure] for each nutrient showed little male to female difference (Table 4). There were only three nutrients (of 48) for which one gender had a significantly (p < 0.05) greater percent discrepancy than the other in the crude analysis. For one of these (percent of calories from protein), women had the greater discrepancy between the two instruments; for the other two (total folate and glycemic load), men had the greater discrepancy. Adjusting for age, income, and education did not appear to have a large or uniform effect on the gender difference.
Table footnotes: 3 Men are required to have a higher alcohol intake than women (1.5-2.5 vs. 0.5-1.5 servings per day) for maximum score. 4 Conventional classifications according to Rehm et al. (2016).

Healthy eating index
Alternate Healthy Eating Indices (AHEI) calculated from the FFQ-reported diet vs. the 24HR-reported diet are shown in Table 5. Men and women did not differ significantly from each other in total AHEI by either measure. Women had a significantly higher subscore for the ratio of white/red meat consumed according to both measures. Only the 24HR found subscore differences for alcohol (men higher) and polyunsaturated/saturated fat ratio (women higher). The greatest discrepancies between FFQ and 24HR (3.5 points or greater) were found in the subscores for white/red meat ratio and for cereal fiber in both genders. Compared to women, men had larger FFQ to 24HR discrepancies in the subscores for nuts and vegetable protein and for white/red meat ratio, while women had a larger discrepancy in the alcoholic beverages subscore. With respect to total AHEI scores, personal characteristics of participants in the highest 24HR-FFQ-discrepancy quartile did not differ greatly from those in the least-discrepancy quartile (Supplementary Table 1). Those with the greatest discrepancy were more physically active (more steps per day) and were more likely to be married, and less likely to live alone. Women had a higher correlation between FFQ and 24HR-derived total AHEI scores, but men had higher correlations for six of eight subscores (Table 6), although none of these gender differences were statistically significant. For men and women combined the correlations were strongest for the subscores of alcoholic beverages, fruit, vegetables, and polyunsaturated to saturated fat ratio and weakest for the white to red meat ratio. The FFQ overestimated total AHEI relative to the 24HR by 13.5 points for men and 11.8 points for women (Table 5). Using 32 points as the cutoff (40% of maximum AHEI score), 47% of men and 38% of women were classified as having a poor diet based on 24HR versus 9% and 7% based on FFQ, respectively. 
The percent agreement between the FFQ- and 24HR-based AHEI classifications of poor diet, calculated alongside the kappa statistic, was 56% for men and 69% for women. As shown in Figure 1, while all FFQ AHEI scores tended to be higher than the corresponding 24HR AHEI scores (above the equality line), those in the lowest range tended to have the greatest discrepancy (i.e., the line of best fit is farthest from the equality line in the lower region).

Discussion
A dietary assessment tool that is equally accurate and reliable for both men and women is critical to studies of diet and eating behaviors in the general population. The GNA FFQ has been applied in numerous large-scale studies [19,27-32], primarily among post-menopausal women. This cost-efficient instrument has contributed new and important findings in older women's nutrition and health. It is of great interest to examine its utility in a broader variety of populations and applications, such as studies of gender differences among older adults. Our analysis in a predominantly white, mixed rural-urban population in Central Massachusetts found that the FFQ was somewhat effective in identifying nutrient intake differences between the genders. However, the FFQ was less sensitive and detected a smaller magnitude of difference for total calories, total grams (volume of intake), many micronutrients, and most macronutrients as compared with the 24HR.
For the older women in our sample, the GNA FFQ and concurrent 24HR had an overall mean correlation of 0.42 for 48 nutrients. This is commensurate with the original validation study results of the WHI FFQ against the 24HR: the correlation coefficients obtained by Kristal et al. [28] for Whites (rho = 0.49) and by Patterson et al. [33] (0.45) are comparable. The men's mean rho of 0.34 raises the question of whether the FFQ could be a less accurate instrument for assessing men's diets in general, although this gender difference in rho was much smaller than some previously observed racial differences among older urban women by Olendzki et al. [11] (rho = 0.46 Whites and 0.23 Blacks) and Kristal et al. [28] (rho = 0.49 Whites and 0.26 Blacks). Our supplementary inter-quartile comparison of high-discrepancy vs. low-discrepancy participants found that living alone/unmarried was more common among those with the best agreement between their 24HR and FFQ total AHEI scores. Since this was also a more common characteristic among women than men in our study, it raises the question of whether living in a household where someone else may do the shopping and/or cooking, rather than gender itself, leads to greater disagreement between the two measures. Perhaps this factor should be considered when selecting a dietary evaluation instrument.
On an item-by-item basis, both genders were similar in their 24HR-FFQ discrepancies. There were only three nutrients for which one gender had a significantly greater percent discrepancy than the other. This is very different from the results found by Olendzki et al. [11]: for the same set of nutrients in a comparison between Black and White older women, there were 24 nutrients with significant White-Black differences in mean percent discrepancy, and in every case, Blacks had a greater discrepancy than Whites. Here again, as for the measurement correlations, the gender differences in FFQ-24HR agreement were much smaller than the previously observed racial differences.
For assessment of healthy eating status, the correlations between FFQ- and 24HR-based AHEI total scores were high for both genders (0.55 for women and 0.46 for men). However, the FFQ tended to overestimate AHEI scores and the proportion of participants designated as eating healthy for both men and women. Even though the men in our sample had only slightly lower mean AHEI scores than women according to the 24HR, the overestimation by the FFQ moved a larger proportion of men into the "intermediate" diet quality range from their "poor" status under the 24HR. As discussed in our previous analysis [11], this error may be especially serious in studies that include at-risk racial minority populations.
Although our analysis focused on the discrepancies in AHEI, similar analyses should be conducted to examine discrepancies in other well-known dietary quality indices such as HEI and DASH. Such analyses may inform users about the proper use of these instruments when studying disparities in diet and dietary behaviors. In addition to gender and racial differences, geographical and cultural differences may also need to be carefully examined to ensure that diet quality assessment is valid across geographic regions and cultural groups. We will address these issues in our future studies.
There are several strengths as well as weaknesses in the current study. The strengths include a relatively large, representative sample of community-dwelling men and women from diverse neighborhoods of low to high housing density. The two measurement surveys were administered less than three weeks apart for the majority of participants, limiting the impact of seasonal and typical dietary variation. The study also carefully measured a large number of sociodemographic (e.g., education and income), lifestyle and health characteristics of the participants, which allowed us to explore personal factors that may influence reporting accuracy and to calculate covariate-adjusted gender differences. However, the present sample consisted of predominantly non-Hispanic white men and women, and the study findings cannot be generalized to racial and ethnic minorities. Both FFQ and 24HR are subject to recall error and social desirability bias.

Conclusion
The GNA FFQ has played a vital role in the discovery of nutrition-related risk factors for chronic diseases and injuries among older women. This analysis showed that agreements between FFQ- and 24HR-based measures of diet quality were roughly comparable for women and men. Therefore, its utility among older men could be considered in future large-scale studies of both genders, at least for non-Hispanic white populations. Compared to 24HR, however, the FFQ tended to underestimate the proportions of older men and women classified as eating unhealthy, and a somewhat higher proportion of men were misclassified. The correlations between FFQ and 24HR for a number of energy and nutrient intake measures also differed between men and women. Such limitations should be considered when the FFQ is used to study healthy eating in older age.