Using Web-Based Questionnaires and Obstetric Records to Assess General Health Characteristics Among Pregnant Women: A Validation Study

Background Self-reported medical history information is included in many studies. However, data on the validity of Web-based questionnaires assessing medical history are scarce. If proven to be valid, Web-based questionnaires may provide researchers with an efficient means to collect data on this parameter in large populations. Objective The aim of this study was to assess the validity of a Web-based questionnaire on chronic medical conditions, allergies, and blood pressure readings against obstetric records and data from general practitioners. Methods Self-reported questionnaire data were compared with obstetric records for 519 pregnant women participating in the Dutch PRegnancy and Infant DEvelopment (PRIDE) Study from July 2011 through November 2012. These women completed Web-based questionnaires around their first prenatal care visit and in gestational weeks 17 and 34. We calculated kappa statistics (κ) and the observed proportions of positive and negative agreement between the baseline questionnaire and obstetric records for chronic conditions and allergies. In case of inconsistencies between these 2 data sources, medical records from the woman’s general practitioner were consulted as the reference standard. For systolic and diastolic blood pressure, intraclass correlation coefficients (ICCs) were calculated for multiple data points. Results Agreement between the baseline questionnaire and the obstetric record was substantial (κ=.61) for any chronic condition and moderate for any allergy (κ=.51). For specific conditions, we found high observed proportions of negative agreement (range 0.88-1.00) and on average moderate observed proportions of positive agreement with a wide range (range 0.19-0.90). 
Using the reference standard, the sensitivity of the Web-based questionnaire for chronic conditions and allergies was comparable to or even better than the sensitivity of the obstetric records, in particular for migraine (0.90 vs 0.40, P=.02), asthma (0.86 vs 0.61, P=.04), inhalation allergies (0.92 vs 0.74, P=.003), hay fever (0.90 vs 0.64, P=.001), and allergies to animals (0.89 vs 0.53, P=.01). However, some overreporting of allergies was observed in the questionnaire and for some nonsomatic conditions sensitivity of both measurement instruments was low. The ICCs for blood pressure readings ranged between 0.72 and 0.92 with very small mean differences between the 2 methods of data collection. Conclusions Web-based questionnaires can be used to validly collect data on many chronic disorders, allergies, and blood pressure readings among pregnant women.


Introduction
Self-reported methods of data collection are often applied in large-scale medical or biomedical studies for efficiency reasons. In these studies, it may not be feasible to conduct clinical measurements on all participants. Therefore, paper-and-pencil questionnaires or telephone interviews were traditionally used to gather information on the study variables. Nowadays, these modes of data collection are increasingly being substituted by Web-based questionnaires. However, knowledge on the validity of data collected with Web-based questionnaires is limited [1], although the quality of the data on a number of traditional epidemiologic risk factors, including body weight [2][3][4], smoking [5], alcohol consumption [6], and energy and macronutrient intake [7,8], is reported to be high. Medical history is included as an exposure or potential confounding factor in many studies and Web-based questionnaires may be an efficient way to collect these data in large samples of participants, if proven to be valid.
Most validation studies on medical history collected through self-reported methods have focused on chronic conditions, in particular cardiovascular diseases [9][10][11][12][13][14][15], diabetes [10,12-16], cancer [11,17,18], and asthma [10,13,14,19,20]. Agreement between self-reports and medical records differed among these studies and was affected by study methodology, target population, condition of interest, and the statistical analyses. In general, agreement was good for conditions that have clear diagnostic criteria, but it was low to moderate for conditions that are less serious or more complex to diagnose. Accordingly, discordance between questionnaires and biochemical measures or patch testing for allergic conditions or atopy is substantial [21][22][23]. Data on the validity of self-report on the results of common measurements taken during health care visits, such as blood pressure readings and hemoglobin levels, are very limited.
To the best of our knowledge, only Landkroon et al [24] compared data on medical history from a Web-based questionnaire with a "reference standard," but this study was too small (N=106) to produce robust estimates for levels of agreement. Therefore, the aim of this study was to assess the validity of a Web-based questionnaire on chronic conditions, allergies, and blood pressure readings among pregnant women by comparing the questionnaire data to obstetric records and data from general practitioners (GPs).

Setting
The Dutch prenatal care system is unique in the Western world. In the Netherlands, midwives are qualified to provide full prenatal care to all women with uncomplicated pregnancies and deliveries. The first prenatal care visit, which may be scheduled without a referral from a GP, usually takes place in gestational weeks 8 to 10 and frequent contacts are scheduled throughout pregnancy. Women are referred to a secondary or tertiary midwife or gynecologist in case of risk factors or complications. In 2013, 85% of pregnant women started their prenatal care in a primary care setting [25].

Study Population
We used data from the PRegnancy and Infant DEvelopment (PRIDE) Study, an ongoing, prospective cohort study that enrolls Dutch women early in pregnancy. The PRIDE Study started enrollment in July 2011 in the Nijmegen region and aims at including more than 150,000 pregnancies to study a broad range of research questions pertaining to maternal and child health. Details on the study design are described elsewhere [26]. Briefly, pregnant women aged 18 years and older were invited to participate in the PRIDE Study by their midwife or gynecologist just before or during their first prenatal care visit. They were asked to complete Web-based questionnaires at baseline, in gestational weeks 17 (questionnaire 2) and 34 (questionnaire 3), as well as 2 and 6 months after the estimated date of delivery. The baseline questionnaire was completed between weeks 6 and 16 of gestation. Researchers from various medical disciplines selected, modified, and tailored existing, validated paper-based questionnaires or parts thereof to fit our Web-based application. Paper-based questionnaires were available for women who could not or did not want to participate through the Internet (n=1; excluded from this study). Questions were asked on demographic factors, reproductive history, maternal health, lifestyle factors, and occupational exposures. Furthermore, consent was asked for review of medical records to enrich the PRIDE Study database with detailed clinical information.

Data Collection
Through the baseline questionnaire, data on medical history were collected. Women were asked gateway questions to assess chronic conditions ("Do you have a chronic or long-term illness that was diagnosed by a medical doctor?" followed by some examples of chronic conditions) and allergies ("Do you have an allergy or eczema?"). Women who answered positively to a gateway question were then asked multiple-choice questions with blank options to specify the chronic condition or allergy. Chronic conditions reported in other parts of the baseline questionnaire (eg, as causes for subfertility or as indications for medication use) were included in the analysis as well. In each prenatal questionnaire, we asked for the date of the most recent prenatal care visit, whether blood pressure was measured during this visit, and if so, for the systolic and diastolic blood pressure readings in mm Hg. A screenshot of the relevant parts of the questionnaires is provided in Multimedia Appendix 1.
A pretested, standardized case report form (CRF) was used to abstract data from the obstetric records of women who gave consent for medical record review. For logistical reasons, obstetric records were only reviewed in participating study centers in the Nijmegen region (7 midwifery practices and 1 academic hospital). Using the CRF, 2 medically trained abstracters collected data from the obstetric records on medical history, including chronic conditions, allergies, and pregnancy history, the pregnancy itself, anthropometrical measures including blood pressure taken during pregnancy, and pregnancy outcome, not all of which were included in this validation study.
Preexisting medical conditions are self-reported by the pregnant woman during the first prenatal care visit and are usually only recorded in the obstetric record by the prenatal care provider if deemed important for the course of pregnancy or the delivery [27]. As a consequence, obstetric records may not be a suitable reference standard for self-reported chronic conditions and allergies. Therefore, information on the diagnosis of chronic conditions and allergies was obtained from the woman's GP. For reasons of efficiency, this was done only in case of inconsistencies between the questionnaire and the obstetric record.

Statistical Analysis
Only PRIDE Study participants with complete information on chronic conditions, allergies, and blood pressure during the most recent prenatal care visit in the baseline questionnaire who gave consent to review their medical records were included in this validation study. For chronic conditions and allergies with at least 5 cases in either the questionnaire or the obstetric record, we calculated kappa statistics (κ) to quantify agreement between the baseline questionnaire and the obstetric record. We also calculated the observed proportions of positive and negative agreement (p pos and p neg, respectively) because kappa is strongly affected by imbalances in marginal totals (ie, kappa can be low despite a high level of agreement) [29]. The calculation of p pos and p neg is shown in Figure 1 [30].
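As an illustration of why kappa and the observed proportions of agreement can diverge, the following minimal Python sketch computes all three measures from a 2×2 table. It assumes the commonly used definitions p pos = 2a/(2a+b+c) and p neg = 2d/(2d+b+c); the function name, the cell labels, and the example counts are hypothetical and are not taken from the study data or from Figure 1.

```python
def agreement_stats(a, b, c, d):
    """Agreement between two binary sources from a 2x2 table.

    a = positive in both sources, d = negative in both sources,
    b = positive in the questionnaire only, c = positive in the record only.
    Assumes the table is not degenerate (both categories occur).
    """
    n = a + b + c + d
    p_obs = (a + d) / n  # observed overall agreement
    # Agreement expected by chance, from the marginal totals
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    p_pos = 2 * a / (2 * a + b + c)  # observed proportion of positive agreement
    p_neg = 2 * d / (2 * d + b + c)  # observed proportion of negative agreement
    return kappa, p_pos, p_neg


# Hypothetical rare condition: 492 of 500 pairs agree, yet kappa is low.
kappa, p_pos, p_neg = agreement_stats(a=2, b=4, c=4, d=490)
print(f"kappa={kappa:.2f}, p_pos={p_pos:.2f}, p_neg={p_neg:.2f}")
```

With these hypothetical counts, overall agreement is 98.4%, but the imbalance in marginal totals pulls kappa down to roughly the level of p pos, while p neg stays close to 1.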
To determine which method of data collection was most valid to collect information on chronic conditions and allergies among pregnant women, sensitivity and specificity were calculated with GP data until the date of completion of the baseline questionnaire as our reference standard. When GP data were unavailable, pharmacy records were screened for diagnoses of chronic conditions or allergies and for dispensed medication that was indicative of chronic conditions or allergies. In addition to the discordant questionnaire-obstetric record pairs, women with positive scores on both the Web-based questionnaire and the obstetric record were included in these calculations as true positives. Likewise, women with negative scores on both methods were included as true negatives. We assessed potential differences in sensitivity and specificity between the questionnaires and the obstetric records using chi-square tests.
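The calculations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and all counts are hypothetical, and the chi-square test is an uncorrected Pearson test on a 2×2 table (1 degree of freedom), because the text does not state whether a continuity correction was applied.

```python
import math


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity and specificity against a reference standard."""
    return tp / (tp + fn), tn / (tn + fp)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (chi2, p). For 1 degree of freedom the upper tail probability
    of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical example: 50 women with a condition according to the reference.
# The questionnaire detects 45 of them, the obstetric record 32.
sens_q, spec_q = sensitivity_specificity(tp=45, fp=3, fn=5, tn=47)
chi2, p = chi_square_2x2(45, 5, 32, 18)  # rows: data source; columns: detected / missed
print(f"sensitivity={sens_q:.2f}, specificity={spec_q:.2f}, chi2={chi2:.2f}, p={p:.4f}")
```

Because both data sources describe the same women, a paired test could also be considered; the sketch follows the chi-square approach named in the text.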
For the validity analyses regarding blood pressure readings, only women with an exact match between the date of the most recent prenatal care visit reported in any of the prenatal questionnaires and a visit date recorded in the obstetric record were included to be certain that both data sources referred to the same measurement. Intraclass correlation coefficients (ICCs) with 95% confidence intervals (CIs) for systolic blood pressure (SBP) and diastolic blood pressure (DBP) were calculated using 2-way mixed effects models (single measure). To assess absolute agreement and potential differences in bias within the SBP and DBP range, we plotted the difference in blood pressure readings between the questionnaire and the obstetric record (y-axis) against the mean of the 2 methods of data collection (x-axis) according to the Bland-Altman technique [31]. In secondary analyses, we included all women who reported the most recent prenatal care visit date in the questionnaire within 5 days of a visit date recorded in the obstetric record. All statistical analyses were performed using IBM SPSS version 20 (IBM Corp, Armonk, NY, USA), except for p pos and p neg, which were calculated in Microsoft Office Excel 2007 (Microsoft Corp, Redmond, WA, USA).
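For readers who want to reproduce this type of analysis outside SPSS, the following Python sketch computes a single-measure ICC from a 2-way analysis of variance and the Bland-Altman bias with 95% limits of agreement. It is an illustration only: the function names and blood pressure values are hypothetical, the text does not state whether the consistency or the absolute agreement definition was used (so both are offered), and CIs are omitted.

```python
from statistics import mean, stdev


def icc_single(x, y, absolute=False):
    """Single-measure ICC for two measurements per subject (2-way ANOVA).

    absolute=False gives the consistency definition,
    absolute=True the absolute agreement definition.
    """
    n, k = len(x), 2
    values = list(x) + list(y)
    grand = mean(values)
    row_means = [(xi + yi) / 2 for xi, yi in zip(x, y)]
    col_means = [mean(x), mean(y)]
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)  # between subjects
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)  # between methods
    ss_err = sum((v - grand) ** 2 for v in values) - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err
    if absolute:
        denom += k * (ms_cols - ms_err) / n
    return (ms_rows - ms_err) / denom


def bland_altman(x, y):
    """Mean difference (x - y) and 95% limits of agreement."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical systolic readings (mm Hg): questionnaire vs obstetric record
questionnaire = [120, 110, 130, 125, 115, 105]
record = [118, 110, 128, 125, 120, 105]
print(icc_single(questionnaire, record), icc_single(questionnaire, record, absolute=True))
print(bland_altman(questionnaire, record))
```

The two ICC definitions differ only when one method reads systematically higher than the other: a constant offset leaves the consistency ICC at 1 but lowers the absolute agreement ICC.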

Results
Women enrolled in the PRIDE Study from July 2011 through November 2012 were eligible for this study (N=725). The overall participation rate in the PRIDE Study was 42.90% (725/1690) during this time period. Figure 2 shows the flow of participants. Of the 725 women enrolled during the study period, 22 (3.0%) only completed a few sections of the baseline questionnaire, mostly because of technical issues in the first weeks of enrollment. Among those with complete baseline questionnaires, 24.8% (174/703) did not give consent for medical record review. Furthermore, 10 women were excluded because their obstetric records were not available (n=9) or they participated with multiple pregnancies in the PRIDE Study (n=1). Therefore, 519 women were included in this validation study. Compared with the women who did not give consent to obtain medical records, women participating in this validation study were more likely to have a lower level of education (P=.03) and to be obese (P=.06; Table 1). Furthermore, women who did not give consent for medical record review were more likely to have completed the baseline questionnaire before their first prenatal care visit compared to women included in the validation study (P=.02). We did not observe substantial differences in maternal age, country of birth, gravidity, and gestational age at inclusion between these 2 groups. Regarding the blood pressure readings, follow-up information was not available for all participants for several reasons: (1) they had not yet reached the gestational week for administration of questionnaire 2 or 3 at the date of obstetric record review; (2) they had a miscarriage, stillbirth, termination of pregnancy (TOP), or very preterm birth; or (3) they skipped questionnaire 2 or 3, were lost to follow-up, or changed prenatal care provider, resulting in incomplete obstetric records.
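The participant flow reported above can be retraced with a few lines of Python. All counts are taken from the text; only the variable names are ours.

```python
# Counts as reported in the text
invited = 1690
enrolled = 725
incomplete_baseline = 22
no_consent = 174
records_unavailable = 9
multiple_pregnancies = 1

complete_baseline = enrolled - incomplete_baseline  # 703
consented = complete_baseline - no_consent  # 529
included = consented - records_unavailable - multiple_pregnancies  # 519

print(f"participation rate: {100 * enrolled / invited:.2f}%")
print(f"no consent for record review: {100 * no_consent / complete_baseline:.1f}%")
print(f"included in the validation study: {included}")
```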
Of the 519 participants, 118 (22.7%) women reported having a chronic condition in the baseline questionnaire, whereas chronic conditions were recorded in the obstetric records of 105 (20.2%) women. Overall, agreement between the Web-based questionnaire and the obstetric record was substantial for any chronic condition (κ=.61; Table 2) with a higher p neg (0.92) than p pos (0.69). Level of agreement differed between the specific chronic conditions with relatively high levels of agreement for endocrine, nutritional, and metabolic diseases (κ=.72) and in particular for thyroid disease (κ=.90), epilepsy (κ=.89), and diseases of the genitourinary tract (κ=.72). However, for a number of conditions, including migraine (κ=.30), diseases of the circulatory system (κ=.25), and irritable bowel syndrome (κ=.39), agreement between the questionnaire and the obstetric record was poor. For all specific conditions, the p neg was high (range 0.98-1.00), but the p pos followed a pattern comparable to the kappa statistic.
Allergies were reported by 229 of 519 (44.1%) women in the baseline questionnaire and recorded in the obstetric record of 168 (32.4%) women. In Table 3, agreement between the Web-based questionnaire and the obstetric record is shown for the mutually exclusive groups of allergies and selected specific allergies. For any allergy, agreement between the questionnaire and the obstetric record was moderate (κ=.51) with a p pos and p neg of 0.70 and 0.81, respectively. The kappa values for the groups of allergies ranged between 0.21 (insect sting allergy) and 0.66 (drug allergies) and between 0.33 (fragrance hypersensitivity) and 0.73 (latex allergy) for the specific types of allergies. House dust mite allergy, latex allergy, and drug allergies were more often reported in the obstetric record than in the questionnaire. Again, the p neg (range 0.81-1.00) was higher than the p pos (range 0.19-0.73) for all groups of allergies or specific allergies included.
Regarding the 254 women with an inconsistency between the Web-based questionnaire and the obstetric record for chronic conditions or allergies, complete GP data were obtained for 194 (76.4%) women; the GP was unknown for 12 women, 21 women were not registered with the GP whose name was provided, the GP did not respond to our multiple data requests for 25 women, and GP records were incomplete for 2 women. For 7 women lacking GP data, the diagnosis of a chronic disorder was ascertained from their pharmacy records. Generally, sensitivity was better for the Web-based questionnaire than for the obstetric record when compared to GP data (Table 4). When these women were considered as not having reported these chronic conditions, agreement between the Web-based questionnaire and the obstetric record decreased, except for skin diseases. Furthermore, it decreased the sensitivity of the questionnaire, especially for endocrine diseases (0.67), polycystic ovarian syndrome (no true positive subjects), and diseases of the genitourinary tract (0.67).
Analyses on the validity of the Web-based questionnaires for blood pressure readings could not be conducted on the complete study sample. At baseline, 123 of 519 (23.7%) women did not have a prenatal care visit yet and, therefore, no valid blood pressure measurement (Table 5). Among women with a prenatal care visit, no match on visit date was established for 91 of 396 (23.0%), 65 of 423 (15.4%), and 32 of 295 (10.8%) women for the baseline questionnaire, questionnaire 2, and questionnaire 3, respectively. Furthermore, a substantial proportion of women whose blood pressure was measured could not remember the blood pressure readings (baseline questionnaire: 27.9%, 76/272; questionnaire 2: 28.4%, 93/328; questionnaire 3: 19.1%, 50/262). Of the women included at baseline and eligible for the reliability analyses of the follow-up questionnaires, 78.6% (121/154) and 84.6% (88/104) were included for questionnaires 2 and 3, respectively. Out of the 142 women included for questionnaire 2 and eligible for the analysis of questionnaire 3, 128 (90.1%) were included for questionnaire 3.
Table 5. Validity analyses comparing Web-based questionnaires and obstetric records for systolic and diastolic blood pressure readings: sample description and intraclass correlation coefficients.
At baseline, the ICCs for SBP and DBP were 0.72 (95% CI 0.65-0.79) and 0.79 (95% CI 0.73-0.84), respectively. In the follow-up questionnaires, ICCs were substantially higher, ranging between 0.89 (95% CI 0.86-0.91; DBP in questionnaire 3) and 0.92 (95% CI 0.89-0.94; SBP in questionnaire 2). The Bland-Altman plots (Figure 3) also showed good agreement between the 2 methods of data collection with very small mean differences, ranging between 1.26 mm Hg (SD 7.72) for SBP in the baseline questionnaire and -0.04 mm Hg (SD 4.09) for DBP in questionnaire 3. No trends in bias within the SBP and DBP ranges were observed.
The secondary analyses, in which the date of the prenatal care visit was allowed to differ up to 5 days between the questionnaire and the obstetric record, yielded similar results (data not shown).

Principal Findings
Web-based questionnaires are increasingly being used as a method of data collection in medical research. The results from the present study show that data on many chronic conditions and allergies can be validly collected among pregnant women using Web-based questionnaires with sensitivities comparable to or even higher than obstetric records. However, some overreporting of allergies was observed and absence of disease was more accurately reported than presence of disease. In addition, pregnant women were able to reliably recall blood pressure readings from the most recent prenatal care visit, especially in the follow-up questionnaires, but a substantial proportion of women could not remember their blood pressure readings at all.

Strengths and Limitations
In addition to the relatively large sample size, the use of GP records as a reference standard to validate the Web-based questionnaire and obstetric records for chronic conditions and allergies is a major strength of this study. In the Netherlands, inhabitants are obliged to be listed with one GP, who coordinates access to specialized care and always receives all relevant medical information about the patient [32]. Therefore, GP records should contain the most complete information, although inaccuracies in registration of diagnoses cannot be excluded. Other strengths of this validation study include the high consent rate (75.2%) to review medical records, the high retrieval rate of obstetric and GP records (98.3% and 76.4%, respectively), and the high willingness of PRIDE Study participants to complete questionnaires through the Internet despite the study's mixed-mode design.
Women participating in the PRIDE Study represent a highly educated population, potentially limiting the generalizability of our results. However, women included in the validation study had a lower level of education compared to women who did not give consent for review of medical records. Previous studies on the association between maternal level of education and recall sensitivity of pregnancy-related events showed inconsistent results [33][34][35][36], so it remains unclear whether imbalances in this baseline characteristic pose a major threat to external validity.
Validity could not be determined reliably for a number of specific chronic conditions due to their low prevalence rates in our study population or in strata based on baseline characteristics. However, it was not feasible to increase the size of the study population because medical record abstraction is a labor-intensive process. Moreover, during the time frame of this study, only one secondary/tertiary care facility participated in the PRIDE Study. Women with certain medical conditions, including preexisting hypertension or diabetes and rheumatoid arthritis, are often referred to these facilities for prenatal care in the Netherlands. Reassuringly, only a small proportion of pregnant women (15%) start prenatal care in a secondary or tertiary care setting, mainly because of complications in a previous pregnancy [25].

Comparison With Prior Work
For many chronic conditions that were included in our analyses, data on the validity of self-report are scarce due to differences in study populations between this study among pregnant women and previous studies, which often selected an older population with higher prevalences of cardiovascular diseases, diabetes, and cancer. However, the general pattern of a better agreement for chronic conditions that have clear diagnostic criteria than for conditions that are less well-defined observed previously [11,37,38] was also visible in our study. We observed high sensitivities and specificities for somatic diseases, but low levels of agreement for a number of nonsomatic diseases, including mental and behavioral disorders and irritable bowel syndrome. This was not only the case for data from the Web-based questionnaire, but also for data from the obstetric records. Possible causes for this variability include poor communication between the patient and the health care provider, limited health literacy of the patient, or self-diagnosis in the absence of a satisfactory medical explanation for the symptoms [39].
Surprisingly, sensitivity of the Web-based questionnaire was substantially higher for asthma (0.86) and migraine (0.90) compared to the obstetric record, whereas the specificities were comparable. The traditional self-reported modes of data collection have a sensitivity ranging between 0.55 and 0.95 (median 0.72) for asthma [10,13,14,19,40] and between 0.35 and 0.67 (median 0.51) for migraine [41][42][43], suggesting that Web-based questionnaires might be more suitable for detecting subjects with these conditions in epidemiologic studies than paper-based questionnaires, interviews, and obstetric records. However, future studies should confirm these findings, also taking into account the manner in which the questions about these conditions are posed.
With regard to allergies, the Web-based questionnaire also seemed to be more sensitive than the obstetric record, but at the expense of its specificity, indicating that overreporting occurs with the use of the Web-based questionnaire and underreporting is present when using obstetric records. However, participants with allergic symptoms who manage their symptoms with over-the-counter medication may not be registered as allergic in GP records, resulting in a lower specificity (increased number of false positives). Therefore, skin-prick tests or serum-specific immunoglobulin E levels may be a more appropriate reference standard. In comparison with previous studies in different populations [20][21][22][23], allergies were somewhat more accurately reported in our Web-based questionnaire compared to the other self-reported modes of data collection.
Research interest in changes in blood pressure over time in relation to disease outcomes is growing (eg, [44,45]), but obtaining data on individual blood pressure readings may be challenging. Alonso et al [46] observed a low correlation between self-reported and directly observed information on SBP and DBP among 127 university graduates with an ICC of 0.35 for both (95% CI 0.09-0.55 and 95% CI 0.16-0.51, respectively). We are not aware of other studies reporting on the validity of self-reported blood pressure readings. In our longitudinal study, we observed a learning effect; the ICC for SBP and DBP was higher for the follow-up questionnaires than for the baseline questionnaire. Once women reported a blood pressure reading, they were very likely to report blood pressure readings in follow-up questionnaires as well. In addition, the proportion of women who could not remember their blood pressure readings decreased. As a future alternative to self-reports of blood pressure measurements conducted in health care settings, home blood pressure telemonitoring may be used to collect data on blood pressure changes over time. In addition, dedicated applications may be developed in which pregnant women could record their blood pressure readings directly after every prenatal care visit.

Conclusions
We showed that Web-based questionnaires can validly collect data on many chronic disorders, including asthma, migraine, and thyroid disease, and also allergies among pregnant women with equal or better data quality compared to obstetric records. Although a substantial proportion of women could not remember their blood pressure readings, pregnant women who did recall the readings recalled them well. This indicates that accurate data on general health characteristics may be collected using Web-based questionnaires in this population.