Online version of the self-administered food frequency questionnaire for the Japan Public Health Center-based Prospective Study for the Next Generation (JPHC-NEXT) protocol: Relative validity, usability, and comparison with a printed questionnaire

Background: Online dietary assessment tools offer advantages over printed questionnaires, such as the automatic and direct data storage of answers, and have the potential to become valuable research methods. We developed an online survey system (web-FFQ) for the existing printed FFQ used in the JPHC-NEXT protocol, the platform of a large-scale genetic cohort study. Here, we examined the validity of ranking individuals according to dietary intake using this web-FFQ and its usability compared with the printed questionnaire (print-FFQ) for combined usage.
Methods: We included 237 men and women aged 40–74 years from five areas specified in the JPHC-NEXT protocol. From 2012 to 2013, participants were asked to provide 12-day weighed food records (12d-WFR) as the reference intake and to respond to the print- and web-FFQs. Spearman's correlation coefficients (CCs) between estimates using the web-FFQ and 12d-WFR were calculated. Cross-classification of intakes was compared with those using the print-FFQ.
Results: Most participants (83%) answered that completing the web-FFQ was comparable to or easier than completing the printed questionnaire. The median value of CCs across energy and 53 nutrients for men and women was 0.47 (range, 0.10–0.86) and 0.46 (range, 0.16–0.69), respectively. CCs for individual nutrient intakes were closely similar to those based on the print-FFQ, irrespective of response location. Cross-classification by quintile of intake based on two FFQs was reasonably accurate for many nutrients and food groups.
Conclusion: This online survey system is a reasonably valid measure for ranking individuals by intake for many nutrients, like the printed FFQ. Mixing of two FFQs for exposure assessments in epidemiological studies appears acceptable.


Introduction
Many epidemiological studies that evaluate diet–disease associations, such as large-scale prospective cohort studies, assess the usual long-term diet and rank individuals by intake of specific nutrients using food frequency questionnaires (FFQ). 1 Typically, a printed questionnaire is sent to the subject and returned to the study office after completion. If many missing responses or logical errors are discovered during study office review, the subject is asked to provide the missing information via telephone or the questionnaire is returned to the subject. 2 Accepted responses are then converted to electronic data.
With increasing use of the Internet, many dietary assessments, such as FFQs and diet history questionnaires, have been developed using Web technology. Reports on the validity of these assessments have increased dramatically over the last 10 years. 3–16 However, the subjects in these studies have all been computer-literate young or highly educated individuals, and the use of Web assessment by middle-aged and elderly local populations, which provide the subjects of actual cohort studies, has not been validated.
Web-based dietary assessment tools offer three major advantages. First, the conversion of print questionnaire responses to electronic data is omitted, and data processing is simple and fast. 3–7 Second, the questionnaire can be sent to many people at once, typically by including a URL for the questionnaire in an e-mail message. 17 Third, missing responses can be minimized through the use of warnings displayed by a computer program, 3–6 obviating the need to check for missing responses or conduct follow-up inquiries and improving data quality and time and cost efficiency. Further, the subject does not need to perform certain tasks, such as crossing out or erasing marked sheet responses, or repeatedly troubleshoot difficulties with the questionnaire with the study administrative office. Even if web-FFQs are restricted to subjects with Internet access, 4 combined use of web- and print-FFQs in actual cohort populations may help improve response rates and reduce the total burden of large-scale epidemiological studies. To our knowledge, however, the combined use of web- and print-FFQs has not been studied.
We developed an online version of the print questionnaire 18 used in the baseline survey of the Japan Public Health Center-based Prospective Study for the Next Generation (JPHC-NEXT). 19 Here, we examined the usability and validity of this online questionnaire for a local population within the geographic area specified in the JPHC-NEXT protocol. Further, to examine the mixing of exposure assessments, we compared the estimated intake rankings obtained with the online and print FFQs.

Study settings and participants
The study was conducted in five areas specified in the JPHC-NEXT protocol (Yokote, Saku, Chikusei, Murakami, and Uonuma). 19 Eligibility criteria were middle-aged and elderly residents of these five areas, as in the JPHC-NEXT protocol. The protocol was approved by the Institutional Review Board of the National Cancer Center, Tokyo, Japan and all other collaborating research institutions.
A total of 255 men and women participated in the study. The 12-day food records and two identical print-FFQs were completed by 253 participants, of whom 250 also completed the web-FFQ. The present validation study was conducted in 237 men and women aged 40–74 years at the start of the study.

Data collection
To establish a reference intake, participants completed a series of 3-consecutive-day weighed food records, one in each of four seasons (12d-WFR), at intervals of approximately 3 months from November 2012 to December 2013. The self-administered semiquantitative printed questionnaire (including general information on lifestyle, such as disease history, smoking status, and physical activity, in addition to the FFQ) for the JPHC-NEXT protocol was administered twice between November 2012 and December 2013 at a 1-year interval. The web questionnaire (also including overall information on lifestyle as well as the FFQ) was administered between August and December 2013. Data collection and methods have been described elsewhere. 18

Reference method
Each 3-consecutive-day period consisted of 2 weekdays and 1 weekend day. Food portions were measured by each participant during meal preparation using supplied digital scales and measuring spoons and cups. For foods purchased or consumed outside the home, the participants were instructed to record the approximate quantity of all foods in the meal and/or the names of the product and company. To account for the validity of water consumption (from fluids or beverages), water used in soup and in boiled food, as well as drinking water, were also checked. Food records were checked by trained dietitians with the participants on the day after each of the 3-day WFRs on site in each study area and were coded for foods and weights. In some cases, the 3-day WFR was submitted via fax or mail to the study office and checked via telephone.

Print-FFQ
The print-FFQ consisted of 172 food and beverage items in nine frequency categories and three portion size categories. It asked about the usual consumption of listed foods during the previous year. The food list was initially developed and used for the Japan Public Health Center-based Prospective Study 20–26 and modified for middle-aged and elderly Japanese residents in a wide variety of areas for use in the JPHC-NEXT Study baseline survey. Details of items and the validity of intake estimates based on the print-FFQ have been described elsewhere. 18 When staff identified missing answers or errors in the print-FFQ, the participants were asked to provide that information again.
Intakes of energy, 53 nutrients (including water content), and 29 food groups were calculated using the Standard Tables of Food Composition in Japan 2010, 27 Standard Tables of Food Composition in Japan Fifth Revised and Enlarged Edition 2005 Fatty Acids Section, 28 and a specifically developed food composition table for isoflavones in Japanese foods. 29 To compare categories of estimated intake based on the web-FFQ with those based on the print-FFQ, we used data from the second administration, because these FFQs were administered around the same time. To compare usability, the second print questionnaire asked about the total time required to answer the overall information on lifestyle.

Development and characteristics of the web-FFQ
The web-FFQ is an online self-administered semi-quantitative FFQ. The interface is configured similarly to the print-FFQ, and the structure is the same. With the intention of deriving similar estimates (and validities) from the online version to those from the printed FFQ, we determined not to newly add photographic images of food items or other visual artifices to aid subject recall. Time to complete the questionnaire (including date and time lapsed from when the "start" button was clicked to time the "send data" button was clicked) was measured automatically. The web questionnaire included a question on the ease of use in comparison with the printed questionnaire.
Participants with private or residential internet access (excluding mobile phones) received an e-mail message containing their ID and a unique URL in August 2013. Participants without private or residential internet access completed the web questionnaire via tablet computer or personal computer at a specified site in each area from August to December 2013.
The web questionnaire retained answers entered in preceding pages, allowing completion across different sessions. Programmed alerts were raised if mandatory information was not entered. To check usability, total completion time was compared with the self-reported time required to complete the printed questionnaire.

Statistical analysis
We included 98 men and 139 women in the main analysis for validity. After exclusion of 27 participants who required 24 h or more to complete the web questionnaire (which allows completion over multiple sessions) or did not provide information on time to complete the printed questionnaire, median values of completion time were compared by sex, age, and response location (private/residential or on-site) among the remaining 86 men and 124 women.
Mean intakes of nutrients and food groups estimated using the web-FFQ were compared to those estimated using the 12d-WFR among the 98 men and 139 women who completed both. To assess agreement in estimated intakes, limits of agreement (LOA) were calculated based on log-transformed values. The LOA were obtained by overlaying the plot of difference (FFQ − WFR) versus mean ((FFQ + WFR)/2) between the two methods. This approach, originally termed the Bland–Altman method, 30 defines the LOA as the mean difference ± 1.96 multiplied by the SD of the differences. The exponentiated mean difference provided the ratio of intake estimated using the web-FFQ relative to the WFR, with an exponentiated LOA range between 50% and 200% indicating acceptable agreement. 31 Any dependency between the two methods was tested by fitting the regression line of differences. To determine the validity of the web-FFQ, Spearman's rank correlation coefficients (CCs) between intakes based on the web-FFQ and 12d-WFR were calculated for energy-adjusted values. A residual model was used for energy adjustment. 1 We corrected the observed CCs for the attenuating effect of random intra-individual error from the usual intake of each energy and nutrient and each food group. 1,32 Also, to compare categories of estimated intake between the web-FFQ and print-FFQ, we computed the number of participants classified into the same, adjacent, and extreme categories by cross-classification according to both quintiles using the web- and print-FFQ. 32 All analyses were performed using SAS Version 9.4 (SAS Institute Inc., Cary, NC, USA).
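As an illustration of the agreement analysis described above, the following Python sketch computes the mean ratio and limits of agreement on the log scale and back-transforms them to ratios. It is not the authors' SAS code: the function names and the example intakes are hypothetical, and the slope is estimated by ordinary least squares of the log differences on the log means, which is only one plausible reading of "fitting the regression line of differences".

```python
import math
import statistics


def log_limits_of_agreement(ffq, wfr):
    """Bland-Altman agreement on log-transformed intakes.

    Returns (mean_ratio, lower_ratio, upper_ratio), each expressed as
    FFQ / WFR after exponentiation. All intakes must be strictly positive.
    """
    if len(ffq) != len(wfr) or len(ffq) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [math.log(f) - math.log(w) for f, w in zip(ffq, wfr)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the log differences
    lower = mean_diff - 1.96 * sd_diff
    upper = mean_diff + 1.96 * sd_diff
    return math.exp(mean_diff), math.exp(lower), math.exp(upper)


def acceptable_agreement(lower_ratio, upper_ratio, low=0.5, high=2.0):
    """True if the exponentiated LOA lie within the 50%-200% band."""
    return lower_ratio >= low and upper_ratio <= high


def difference_slope(ffq, wfr):
    """Least-squares slope of log differences regressed on log means.

    A slope far from zero suggests that disagreement depends on intake level.
    """
    diffs = [math.log(f) - math.log(w) for f, w in zip(ffq, wfr)]
    means = [(math.log(f) + math.log(w)) / 2 for f, w in zip(ffq, wfr)]
    m_bar = statistics.mean(means)
    d_bar = statistics.mean(diffs)
    sxx = sum((m - m_bar) ** 2 for m in means)
    if sxx == 0:
        raise ValueError("all means are identical; slope is undefined")
    sxy = sum((m - m_bar) * (d - d_bar) for m, d in zip(means, diffs))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical daily intakes (g/day) for five participants.
    web_ffq = [55.0, 72.0, 64.0, 90.0, 48.0]
    records = [60.0, 70.0, 58.0, 75.0, 52.0]
    ratio, lo, hi = log_limits_of_agreement(web_ffq, records)
    print(f"ratio={ratio:.2f}, LOA={lo:.2f} to {hi:.2f}, "
          f"acceptable={acceptable_agreement(lo, hi)}")
    print(f"slope={difference_slope(web_ffq, records):.3f}")
```

Working on the log scale means the back-transformed limits are ratios rather than absolute differences, which is why the acceptability band can be stated as 50% to 200%.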

Study participants
Study participants are characterized in Table 1.

Usability of the web-based questionnaire
Of the 253 participants who completed the 12d-WFR and 2 identical print-FFQs, 3 participants did not complete the web-based questionnaire due to technical and network issues. Most participants (83%) answered that the web questionnaire was "very easy" (9.3%), "easy" (53.2%), or "almost the same" (20.7%) compared with the printed questionnaire. Total proportions of answers representing suitable usability of the web-based questionnaire varied by age, with corresponding values of 88%, 86%, 80%, and 72% for those in their 40s, 50s, 60s, and 70s, respectively.
Of the 237 participants, 81 without private/residential internet access completed the web questionnaire at the specified site in each area; 30.9% (9 men and 16 women; mean age 67.1; SD, 5.3 years) of these 81 respondents required complete or partial assistance by staff. Table 2 shows median time to complete the printed and web questionnaires (including overall information on lifestyle) by sex, age, and response location. Participants with private/residential internet access were approximately 7 years younger than those without access. Median time to complete the web questionnaire was similar to that for the printed questionnaire in men, but slightly longer in women, with corresponding values of 63.4 and 60.0 min for men and 81.2 and 60.0 min for women, respectively. Although median time to complete the web questionnaire was greater among the respondents on site than for the private/residential respondents in both sexes, the results were similar to those for the printed questionnaire, at 70 and 50 min for men and 90 and 55 min for women, respectively. Median time to complete the web questionnaire among private/residential respondents was closely similar to or slightly longer than that for the printed questionnaire for both sexes.

Estimates of intake by web-FFQ and ranking compared with 12d-WFR
Tables 3 and 4 show daily intakes of energy and 53 nutrients by the 12d-WFR and web-FFQ, percentage differences between web-FFQ and 12d-WFR, and correlations among men and women. Estimated energy intake levels between the two methods were similar for men (mean percentage: 98%), whereas those based on the web-FFQ were slightly higher among women (112%). Bland–Altman analysis to check agreement of estimated intakes showed that many nutrients were underestimated in men and overestimated in women. Relatively few nutrients and food groups showed an acceptable LOA range between 50% and 200% in their estimates of intake. Regression coefficients were positive for almost all nutrients and statistically significant for both men and women. This indicates that agreement in the estimation of intake became worse with increasing intake. The deattenuated CC of total energy intake in women was lower than in men. The CCs of deattenuated energy-adjusted values varied from 0.10 for iodine to 0.86 for ethanol in men, and from 0.16 for beta-tocopherol to 0.69 for ethanol in women. Median CC across energy and the 53 nutrients was 0.47 in men and 0.46 in women. These CCs for energy and the individual nutrients between intakes from the web-FFQ and 12d-WFR were closely similar to those between the print-FFQ and 12d-WFR, 18 with corresponding median CCs of 0.50 and 0.43, respectively (data not shown). Pearson's correlation coefficient between these CCs was 0.81 for men and 0.84 for women (Fig. 1). Tables 5 and 6 also show these results for 29 food groups. With regard to agreement in estimating food group intakes, many items were either under- or overestimated in both men and women. Also, positive regression coefficients were statistically significant for many food groups in both men and women.
The CCs of deattenuated energy-adjusted values varied from 0.09 for algae to 0.74 for alcoholic beverages in men and from 0.07 for fats and oils to 0.77 for green tea in women. Median CC across 29 food groups was 0.48 for men and 0.44 for women. On cross classification by quintile, however, almost all nutrients and food groups were classified into their respective opposite extreme category by 5% or lower in men or women, with corresponding median values of 2% and 3% for nutrients, and 3% and 3% for food groups, respectively.
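The deattenuated CCs reported above follow the correction given in the table footnotes: deattenuated CC = observed CC × SQRT(1 + λ/n), where λ is the ratio of within- to between-individual variance and n is the number of food records. A minimal Python sketch of that correction is shown here. How λ was estimated is not stated in this text, so the `variance_ratio` helper, which uses one-way analysis-of-variance components on balanced repeated records, is an assumption; all function names and example values are hypothetical.

```python
import math
import statistics


def deattenuate(observed_cc, lam, n_records):
    """Apply the footnote formula: observed CC * sqrt(1 + lam / n_records)."""
    if n_records <= 0:
        raise ValueError("n_records must be positive")
    if lam < 0:
        raise ValueError("variance ratio must be non-negative")
    return observed_cc * math.sqrt(1.0 + lam / n_records)


def variance_ratio(records):
    """Within- to between-individual variance ratio from balanced records.

    `records` holds one list of repeated daily intakes per person, all of
    the same length k >= 2. This ANOVA-based estimate is an assumption,
    not a method confirmed by the paper.
    """
    k = len(records[0])
    if k < 2 or len(records) < 2 or any(len(r) != k for r in records):
        raise ValueError("need >= 2 people with the same number (>= 2) of records")
    # Within-person mean square: average of the per-person sample variances.
    within = statistics.mean(statistics.variance(r) for r in records)
    # Between-person mean square: k times the variance of the person means.
    between_ms = k * statistics.variance([statistics.mean(r) for r in records])
    between = (between_ms - within) / k
    if between <= 0:
        raise ValueError("estimated between-person variance is not positive")
    return within / between


if __name__ == "__main__":
    # Hypothetical example: observed CC of 0.40, variance ratio 3.0, 12 records.
    print(round(deattenuate(0.40, 3.0, 12), 3))
```

Because the correction factor is always at least 1, a nutrient with large day-to-day variation relative to between-person variation gains the most from deattenuation.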

Cross-classification by quintile compared with print-FFQ
We further compared agreement of the categorization of estimated intake by the two different FFQs administered at an average interval of 1.7 (SD, 0.8) months based on cross-classification by quintile (Tables 7 and 8). Nutrients and food groups were classified into their opposite extreme categories by 5% or less of men or women, with corresponding median values for men and women of 1% and 2% for nutrients, and 1% and 1% for food groups, respectively. In addition, classification into the same and adjacent categories for nutrients ranged from 57% for total fat in percentage of energy derived from fats to 97% for ethanol in men and from 64% for selenium to 93% for ethanol in women; for food groups, classification into the same and adjacent categories ranged from 66% for fats and oils to 91% for alcoholic beverages in men and from 60% for red meat to 91% for coffee in women. Median values of the same and adjacent categories for nutrients were 77% in men and 75% in women; corresponding values for food groups were 74% in men and 75% in women.
Finally, we conducted an additional stratified analysis of correlation coefficients between CCs of nutrient intake based on the 12d-WFR and each of the two FFQs by response location to the web-FFQ. CCs for energy and nutrients between the web- or print-FFQs and 12d-WFR were closely similar regardless of response location, with corresponding median CCs of 0.48 and 0.49, respectively, for men and 0.45 and 0.46, respectively, for women among private/residential respondents; corresponding values among onsite respondents were 0.48 and 0.46 for men and 0.38 and 0.40 for women (data not shown). Pearson's correlation coefficients between these CCs for nutrient intake based on the two FFQs and 12d-WFR were 0.7 and 0.8 for men and women, respectively, among private/residential respondents, and 0.6 and 0.7 for men and women, respectively, among onsite respondents.
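The quintile cross-classification used in this section can be sketched as follows. This is an illustrative Python example rather than the authors' procedure: the function names are hypothetical, quintile boundaries are assigned by rank, and ties are broken by input order, which is an assumption.

```python
def quantile_groups(values, n_groups=5):
    """Assign each value a group index 0..n_groups-1 by rank.

    Ties are broken by input order (sorted() is stable); this tie rule
    is an assumption made for the sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for rank, idx in enumerate(order):
        groups[idx] = rank * n_groups // n
    return groups


def cross_classify(intake_a, intake_b, n_groups=5):
    """Percentage of people in the same, adjacent, and opposite extreme groups."""
    if len(intake_a) != len(intake_b) or not intake_a:
        raise ValueError("need two non-empty samples of equal length")
    ga = quantile_groups(intake_a, n_groups)
    gb = quantile_groups(intake_b, n_groups)
    n = len(ga)
    gaps = [abs(a - b) for a, b in zip(ga, gb)]
    return {
        "same": 100.0 * sum(g == 0 for g in gaps) / n,
        "adjacent": 100.0 * sum(g == 1 for g in gaps) / n,
        "extreme": 100.0 * sum(g == n_groups - 1 for g in gaps) / n,
    }


if __name__ == "__main__":
    # Hypothetical energy-adjusted intakes from two questionnaires.
    web = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    printed = [1.5, 1.0, 3.5, 5.0, 4.0, 6.5, 9.0, 7.0, 8.0, 10.5]
    print(cross_classify(web, printed))
```

Summing the "same" and "adjacent" percentages gives the combined figure reported in the text, while "extreme" counts participants placed in the lowest group by one questionnaire and the highest by the other.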

Discussion
We examined the usability and validity of a web-FFQ developed as an online version of a print-FFQ used in the baseline survey of the JPHC-NEXT study. The accuracy of estimates obtained with the web-FFQ was comparable to that obtained with the print-FFQ.
Response times for the printed and web-based questionnaires, including overall information on lifestyle, were approximately the same for men, while the web questionnaire took slightly longer for women. The printed questionnaire is likely to require additional time to construct analyzable data over and above that allotted in this study, because of the need for staff review and follow-up for missing information or logical errors, as well as in the conversion of responses to electronic data. Considering conversion of data to electronic form, therefore, the web questionnaire was not inferior from the perspective of study efficiency. On the other hand, individuals who completed the web questionnaire on site because they did not have private or residential Internet access took longer to respond than individuals responding with their own Internet access. These individuals might have been unfamiliar with computer use, and this might have impacted their response time. However, because this was also true of response time with the printed questionnaire, the difference could not be explained by the interface alone. Rather, it might have also been because the onsite respondents were approximately 7 years older on average than those using their own Internet access. Moreover, many subjects said it was as easy or easier to respond using the web than the printed questionnaire. These results indicate that, with regard to study efficiency, the use of web-FFQs in cohort studies is reasonable, including use on site.

Table 3
Comparison of nutrient intakes using the web-based food frequency questionnaire (web-FFQ) and 12-day WFR based on agreement, ranking correlations, and joint classification by quintile in men (n = 98).
c Regression coefficient of the mean of two methods regressed on the difference between the methods.
d Spearman's rank correlation coefficients based on energy-adjusted values (other than energy intake and total fat in %energy) and expressed as deattenuated CC.
e Deattenuated CCx = observed CCx × SQRT(1 + λx/n), where λx is the ratio of within- to between-individual variance for nutrient x, and n is number of food records.
f p values were for Spearman's CCs of energy-adjusted intake.
g Joint classification by quintile, expressed as a percentage.

Table 4
Comparison of energy and nutrient intakes using the web-based food frequency questionnaire (web-FFQ) and 12d-WFR based on agreement, ranking correlations, and joint classification by quintile in women (n = 139).
c Regression coefficient of the mean of two methods regressed on the difference between the methods.
d Spearman's rank correlation coefficients based on energy-adjusted values (other than energy intake and total fat in %energy) and expressed as deattenuated CC.
e Deattenuated CCx = observed CCx × SQRT(1 + λx/n), where λx is the ratio of within- to between-individual variance for nutrient x, and n is number of food records.
f p values were for Spearman's CCs of energy-adjusted intake.
g Joint classification by quintile, expressed as a percentage.

Table 5
Comparison of food group intakes using the web-based food frequency questionnaire (web-FFQ) and 12d-WFR based on agreement, ranking correlations, and joint classification by quintile in men (n = 98).
Correlations between the intake estimates obtained with the web-FFQ and 12d-WFR were moderate or better for many nutrients compared with previous validation studies of traditional printed FFQs among Japanese populations: these had median CCs ranging from 0.31 to 0.56 for target nutrients 33 versus a median correlation for nutrients in our present study of approximately 0.5. These results are similar to previous results for the validity of web-FFQs compared with food records: the mean correlation coefficient across nutrients was 0.55 in a Canadian study of 69 men and women, 3 0.43 in an American study of 213 men and women, 4 and 0.47 in a British study of 15 men and 34 women. 13 The subjects in all of these studies were highly educated. Unlike any previous study of the validity of web-FFQs, 3–16 our present subjects were middle-aged and elderly individuals from the local population of the geographic areas covered by a cohort study, albeit that their participation was voluntary. Moreover, the number of days the reference method was used and the number of subjects were greater in our study than in these previous studies. The considerably lower CC for estimated energy intake among women, seen with both the web-FFQ and the print-FFQ, might be caused by the food list of the FFQ. As described in detail in our previous paper on the validity of the print-FFQ, 18 errors in estimates from the predetermined list were likely caused by the small contribution of individual foods to total energy intake. Our results show that the web-FFQ provided reasonable ranking for many nutrients and food groups in a range of intakes, as evidenced from the quintile cross-classification, albeit that agreement in estimating absolute intake was poor.
The characteristics of CCs for each nutrient and food group with the web-FFQ compared with the 12d-WFR were closely similar to those for the print-FFQ among both men and women, both when stratified by response location and combined. This finding indicates that the web- and print-FFQs provide similar levels of estimation accuracy for the same nutrients and food groups, and that intakes can be estimated in a similar fashion regardless of questionnaire format, whether by subjects with Internet access responding to a web-FFQ or subjects without Internet access responding to a print-FFQ.
In addition, a high proportion of rankings of intake estimates obtained with the web- and print-FFQs by quintile were classified into the same and adjacent quintiles for many nutrients and food groups (range for nutrients: 57–97% for men and 64–93% for women). A previous study that ranked nutrient intake estimates obtained with web- and print-FFQs by quartiles reported that 77–97% were classified into the same or adjacent categories. 7 Our results compare favorably, even though these previous subjects were younger computer-literate individuals with relatively high education. 7 Moreover, a previous study of the degree of concordance between nutrient rankings with two identical web-FFQs at 4-week intervals by quartiles (among 31 men and 69 women) reported that 87–98% were classified into the same and adjacent categories. 13 By comparison, a study of concordance of nutrient rankings with two identical print-FFQs administered within the same year by quintiles (among 66 men and women) reported a range of 52–83%. 34

Table 6
Comparison of food group intakes using the web-based food frequency questionnaire (web-FFQ) and 12d-WFR based on agreement, ranking correlations, and joint classification by quintile in women (n = 139).
e Deattenuated CCx = observed CCx × SQRT(1 + λx/n), where λx is the ratio of within- to between-individual variance for nutrient x, and n is number of food records.
f p values were for CCs of energy-adjusted intake.
g Joint classification by quintile, expressed as a percentage.

Table 7
Comparison of the web-FFQ and print-FFQ for energy-adjusted intake of nutrients, based on correlation coefficient and cross-classification by quintile (%).
a Spearman's rank correlation coefficients and the p values < 0.001 for energy and all nutrients.
b CCs and cross-classification for energy intake and total fat in %energy were calculated by using crude values; percentages were based on the number of participants classified into the same, adjacent, and extreme categories by cross-classification according to both quintiles using the web- and print-FFQ.

Our study had several limitations. First, the time required to complete the printed questionnaire was self-reported. Moreover, because the web questionnaire could be completed in several separate sessions, response time included time for breaks and interruptions, although subjects taking longer than 24 h were excluded from the usability analysis. The actual web questionnaire response time may have been shorter, and the difference in total response time between the questionnaires may be overestimated. It is possible that the heightened degree of motivation and interest required of the participants of a validation study 1 in their provision of complete and accurate information for this reference method might have had some effect of overestimating usability regarding time for completion. However, if present, the impact of this effect might be the same for both the web- and print-FFQs. Second, because the mean interval between administration of the two different FFQs was 1.7 months (maximum, 4 months), the possibility that seasonal dietary changes affected the responses cannot be excluded. 35 A previous comparison of web- and print-FFQs administered within 1 month showed a high level of concordance between rankings, although that study compared quartiles. 7 This suggests that concordance may have been higher if the timing of administration had been closer. Although cooking water could not be considered in these FFQs (in contrast to drinks, water, water content of food, noodle soup, and miso soup, which were included), this study also showed moderate validity for water content in men and women.
In conclusion, correlations between the intake estimates obtained with the web-FFQ and 12d-WFR indicated moderate validity for many nutrients and food groups in ranking of individuals by these intakes. These validities were closely similar to those of the print-FFQ, irrespective of the location of Internet access, with good concordance between individual rankings obtained with the two FFQs. These results suggest that the web- or print-FFQ can be used in epidemiological studies consistent with the location of the individual subject.

Conflicts of interest
None declared.