Ruling out early trimester pregnancy when implementing community-based deworming programs

Background Large-scale deworming programs have, to date, mostly targeted preschool- and school-age children. As community-based deworming programs become more common, deworming will be offered to women of reproductive age. The World Health Organization recommends preventive chemotherapy be administered to pregnant women only after the first trimester. It is therefore important for deworming programs to be able to identify women in early pregnancy. Our objective was to validate a short questionnaire which could be used by deworming program managers to identify and screen out women in early pregnancy. Methodology/Principal findings In May and June 2018, interviewers administered a questionnaire, followed by a pregnancy test, to 1,203 adult women living in the Peruvian Amazon. Regression analyses were performed to identify questions with high predictive properties (using the pregnancy test as the gold standard). Test parameters were computed at different decision tree nodes (where nodes represented questions). With 106 women confirmed to be pregnant, the positive predictive value of asking the single question ‘Are you pregnant?’ was 100%, at a ‘cost’ of a false negative rate of 1.9% (i.e. 21 women were incorrectly identified as not pregnant when they were truly pregnant). Additional questions reduced the false negative rate, but increased the false positive rate. Rates were dependent on both the combination and the order of questions. Conclusions/Significance To identify women in early pregnancy when deworming programs are community-based, both the number and order of questions are important. The local context and cultural acceptability of different questions should inform this decision. When numbers are manageable and resources are available, pregnancy tests can be considered at different decision tree nodes to confirm pregnancy status. Trade-offs in terms of efficiency and misclassification rates will need to be considered to optimize deworming coverage in women of reproductive age.


Methodology/Principal findings
In May and June 2018, interviewers administered a questionnaire, followed by a pregnancy test, to 1,203 adult women living in the Peruvian Amazon. Regression analyses were performed to identify questions with high predictive properties (using the pregnancy test as the gold standard). Test parameters were computed at different decision tree nodes (where nodes represented questions). With 106 women confirmed to be pregnant, the positive predictive value of asking the single question 'Are you pregnant?' was 100%, at a 'cost' of a false negative rate of 1.9% (i.e. 21 women were incorrectly identified as not pregnant when they were truly pregnant). Additional questions reduced the false negative rate, but increased the false positive rate. Rates were dependent on both the combination and the order of questions.

Conclusions/Significance
To identify women in early pregnancy when deworming programs are community-based, both the number and order of questions are important. The local context and cultural PLOS  acceptability of different questions should inform this decision. When numbers are manageable and resources are available, pregnancy tests can be considered at different decision tree nodes to confirm pregnancy status. Trade-offs in terms of efficiency and misclassification rates will need to be considered to optimize deworming coverage in women of reproductive age.

Introduction
Women of reproductive age (WRA) in low-and middle-income countries are particularly vulnerable to morbidity resulting from soil-transmitted helminth infection (STH) [1][2][3]. It has recently been estimated that approximately 700 million WRA in over 100 countries are at risk of morbidity from these infections [4,5]. Species-specific prevalences (and intensities of infection) can vary greatly. For example, of five study populations of pregnant women recruited into randomized controlled trials in Africa and South America, the baseline prevalences of hookworm varied from 38% to 67% and for Trichuris trichiura, from 5% to 82% [6][7][8][9][10]. High STH prevalence, when combined with anemia, malnutrition and other co-occurring conditions, contribute to a high burden of disease. Most previous research on STH infections and deworming has focused on children. Even though it is well known that worm infections cause and exacerbate anemia during the different stages of a woman's reproductive life span, WRA have been under-studied in this context [2,11]. The resulting lack of uptake by endemic countries of deworming programs that include women of reproductive age inadvertently neglects an important cause of STH-attributable morbidity and reinforces gender inequities [3,5,12].
The deworming intervention currently used in large-scale public health intervention programs consists of a single dose of deworming medicine (either albendazole or mebendazole), repeated either once or twice a year, depending on the prevalence of STH in the area [13]. While the World Health Organization (WHO) recommends large-scale deworming programs in endemic areas for all women of reproductive age, it specifically recommends that pregnant women be treated only after the first trimester (i.e. in either the second or third trimester) [3,14]. Therefore, the challenge of large-scale community-based deworming programs is to identify pregnant women in early pregnancy in order to delay treatment. Deworming program managers would benefit from having an easily implementable and cost-efficient 'pregnancy ruling out tool' which would ideally rule out all women in the early stages of pregnancy while keeping in all other women. Therefore, given the emphasis on ruling out potentially pregnant women, such a tool would have the lowest false negative rate possible, even though this might increase the false positive rate.

Determining pregnancy status
In resource-limited settings, determining pregnancy status by ultrasound, blood testing, or a urine-based pregnancy test is impractical, unfeasible and prohibitively costly, so the common practice has been to use the start date of the last menstrual period (LMP) [15]. Gestational age, based on either certain or uncertain dates of LMP, is known to be unreliable, resulting in both over-and under-estimation [16][17][18][19][20].
Only a small number of studies have examined the performance of a multi-item questionnaire in determining pregnancy status. In a clinic-based study sample of 283 women who had missed their periods and who had requested a pregnancy test, the authors concluded that neither a woman's self-assessment of pregnancy nor the presence of pregnancy symptoms had a high degree of accuracy in predicting pregnancy status [21]. In a study of women seen in an emergency department setting, no combination of variables was found to be able to exclude pregnancy due to high false negative rates [22]. Several more recent publications conducted in emergency department, pre-operative and family planning settings [23][24][25][26][27][28][29] present conflicting results on the degree of reliability of administered questionnaires. All of these studies were conducted in clinical settings and are therefore unlikely to be generalizable to the context of community-based deworming programs.
To our knowledge, only one previous study has examined the performance of such a questionnaire in the context of women eligible for a mass treatment program (ivermectin in onchocerciasis control) in a limited resource setting [30]. Unfortunately, the published methods and results from this study in Cameroon are not easily interpretable nor replicable (e.g. results from two separate surveys were combined; not all diagnostic tests were performed on the same number of women, some having very small sample sizes; the comparator urine-based pregnancy test had a much higher threshold of human chorionic gonadotropin (hCG) than tests used today (i.e. 75 IU (international units)) with results available after two hours) and are thus unlikely to be generalizable.
The objective of the present study, therefore, was to determine the most parsimonious set of questions to identify women in early pregnancy.

Ethics statement
This study was approved by the Oficina del Comité Institucional de Ētica en Investigación of the Hospital Regional de Loreto "Felipe Arriola Iglesias" in Iqutios, Peru (ID 019-CIEI-2018) and by the Research Ethics Board of the McGill University Health Centre (CT1 panel; 23-04-2018), in Montreal, Canada. All participants were adults and provided written informed consent.

Study design and study population
Data collection was completed in the 3-week period from May 29 to June 15, 2018, in Belén, an impoverished district of the capital city of Iquitos (in the department of Loreto), situated in the floodplain of the Itaya and Amazon rivers. This district has a high prevalence of STH infections in all three high risk groups (i.e. school-age children, preschool-age children and women of reproductive age, including pregnant women) [8,31,32]. A catchment area of five sub-districts was randomly selected from a list of sub-districts surrounding the three main health centers in Belén (the Centro de Salud de Belén, the Centro de Salud 9 de Octubre, and the Centro de Salud 6 de Octubre). Each sub-district had recently been mapped, such that the number of households in each neighborhood (or manzana) was known. The manzana numbers were manually entered into a database and selected by computer-generated random sequence. Every household in a selected manzana was visited and every woman of reproductive age from each occupied household was invited to participate. The sample size was estimated as follows: the most recent (2016) crude birth rate in Loreto, Peru was obtained (20.83 births per 1000 persons) (World Data Atlas (Knoema) 2018). The most recent (2015) population figures for the region of Loreto were obtained to estimate the number of WRA (Citypopulation.info 2018). Using these figures, it was estimated that there would be approximately 42 pregnancies per 1000 WRA in the study area in one year. The manufacturer's estimate of the sensitivity and specificity of the rapid pregnancy test is 100% (Core Technology Co., Ltd. 2018). Specifying an alpha of 1% and the total width of the confidence interval to be 0.03, with an expected pregnancy proportion of 0.042, the total sample size is estimated to be 1186.
Each study visit consisted of the sequential administration of the study questionnaire, followed immediately by a pregnancy test. The cross-sectional study design was chosen over that of a cohort study (where it might be perceived that a better 'gold standard' could be used (i.e. confirmed clinical pregnancy), for the following reasons: i) the loss to follow-up between the baseline assessments and the confirmed clinical pregnancy would likely exceed the measurement error of the currently proposed gold standard (i.e. the rapid pregnancy test); ii) some number of new pregnancies and miscarriages occurring between the baseline assessments and follow-up confirmation would be missed, contributing to measurement error; and iii) cost and feasibility issues were of concern.
The questionnaire and pregnancy tests were administered by a group of 11 research assistants (RAs) who were midwives or midwife-trainees. The RAs were trained on the administration of informed consent, the questionnaire and the pregnancy test. They worked in groups of two or three, visiting women in their own households according to a pre-set daily assignment. While obtaining informed consent, the women were asked where in their home they would be most comfortable completing the questionnaire and pregnancy test. Women were included in the study if they were between the ages of 18 and 49 years and consented to participate.

Questionnaire
The one-page study questionnaire had five basic identifier questions (e.g. manzana number, household number), 17 questions on socio-demographic and pregnancy-related characteristics (e.g. age, birth date, marital status, number of children breastfeeding status, known or suspected pregnancy status (including gestational age and method of confirmation for women declaring they were pregnant), start date of the last menstrual cycle, and the presence/absence of breast tenderness, darkened areolas, fatigue, nausea, and vomiting) and one last question on whether or not the woman wished to receive the results of the pregnancy test (Fig 1). Several draft questions had initially been proposed by WHO and others were added by the research team based on their expertise and field experience. The questionnaire was developed in an iterative manner with several translations from English to Spanish and then back-translations from Spanish to English until the final version was confirmed. The questionnaire was designed to be administered in under 10 minutes.

Pregnancy test
The pregnancy test (ABON hCG, Abon Biopharm (Hangzhou) Co., Ltd, China) was performed on a urine sample provided by the woman after the questionnaire was complete. The test was a rapid chromatographic immunoassay designed to detect the presence of hCG. The manufacturer reports test sensitivity to 25mIU/ML hCG with results available within 3 minutes. Elevated hCG levels can be reliably detected by this test after a minimum interval of one day following the first missed menstrual period in a woman's cycle, with both a sensitivity and specificity of 100% (95% CI [95-100%]) and a precision level of 100% (95% CI [98%-100%]) [33]. The test was performed according to the manufacturer's instructions and results were interpreted and recorded by the RAs on site. Women who wished to be informed of their pregnancy status were verbally provided with the results of the test.

Data analysis
The results from both the questionnaire and the pregnancy test were entered into an electronic database by two independent research staff. Data quality was monitored daily by research personnel through field supervision and internal checks for consistency.
General socio-demographic information was summarized for the population as a whole, and stratified by pregnancy test result, to present an initial unadjusted view of the distribution of baseline variables in the study population. Comparisons were performed using independent t-tests for continuous variables and the chi-square test with a continuity correction for categorical variables.
Answers to the questionnaire and the pregnancy test results were summarized using univariate analyses to assess the predictive capacity of each question (answers to each question were considered as variables in the analyses) with respect to the outcome of pregnancy status. A multivariate analysis was then performed in order to assess the stability of the estimates and to identify the variables that independently had the highest predictive capacity. Odds ratios (ORs) and 95% confidence intervals (CIs) for the association between the independent predictor variables and the dependent binary variable (i.e. the pregnancy test result) were computed using logistic regression for both the univariate and multivariate analyses. Variables were removed from modeling when co-linear or sparse. Self-awareness of pregnancy status was categorized as either knowing one was pregnant (i.e. answering 'Yes' to the question: Are you pregnant?); thinking one was pregnant (i.e. answering 'Yes' to the question: Do you think you might be pregnant?); thinking one was not pregnant (i.e. answering 'No' to the question: Do you think you might be pregnant?); or not knowing whether one was pregnant (i.e. answering 'I don't know' to the question: Do you think you might be pregnant?).
Diagnostic measures of sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were computed for each multi-item question response against the gold standard of the binary pregnancy test result. Youden's J statistic was then calculated to identify questions with the highest accuracy in terms of sensitivity and specificity. (The Youden's J statistic takes into consideration both false positives (women incorrectly classified as pregnant when asked one or more questions) and false negatives (women incorrectly classified as not pregnant when asked one or more questions), and ranges from 0 (the test is completely uninformative) to 1 (the test is perfect) [34]). To further confirm potential question options and sequencing of questions, decision tree analyses were also performed. Decision trees use binary recursive splitting to partition the data into nodes, each representing a potential decision point. When used for classification, this method can predict responses associated with particular terminal node regions while minimizing classification error rates [35]. Errors in classification were described in terms of the numbers and rates of false positives and of false negatives for each decision tree node.
All statistical analyses were performed using R (version 3.4.3, R Core Team 2017, Vienna, Austria).

Results
A total of 1,203 adult women consented to participate in the study. The women were recruited from 1,074 households of a total of 1,258 households (85.4%) canvassed from 182 manzanas located within the district of Belén. Both the study questionnaire and the pregnancy test were administered to all consenting participants. Of these, 1,097 women (91.2%) had a negative pregnancy test result and 106 women (8.8%) had a positive result (Fig 2).
Study population characteristics are summarized in Table 1. The mean age was 31.1 years (±8.7). The most commonly reported civil status was that of conviviente, where 63.7% of the women lived with a partner in a shared household. Other common categories included single The results presented in Table 1 are further stratified by the results of the pregnancy test. The marital status described as living together with a partner in a shared household was significantly more common (81.1% versus 62.0%), and that of being single significantly less common (5.7% versus 23.0%), among women who had a positive pregnancy test result. Moreover, there was a significant association between age and pregnancy status, where women with a positive test result tended to be younger than both the overall study population and women with a negative test result (p < 0.001). Women with a positive test result also tended to have had fewer children than women with a negative test result (p = 0.048).
The results of the univariate and multivariate logistic regression analyses are presented in Table 2. The following variables were found to be significant predictors of the pregnancy test results in univariate analyses: age, marital status, breastfeeding status, various levels of awareness regarding own pregnancy status, various levels of the timing of the last menstrual period, and all reported symptoms of pregnancy. In the multivariate analyses, only three questions were found to be statistically significant independent predictors of the pregnancy test results: "thinking" that she was not pregnant (adjusted odds ratio (aOR) [95% confidence interval] = 0. 16 Table 3 presents the test parameters for each question separately. While several questions had high sensitivity, specificity and predictive values, the highest Youden's J statistic (0.802) was obtained with answering 'Yes' to the question 'Are you pregnant?'. Therefore, of the 106 truly pregnant women, this first question (Q1) would correctly identify 85 (i.e. 80.2%) women as being pregnant and they would not be treated (Table 4). Since all women identified as being pregnant were in reality pregnant, the false positive rate of this question would be 0%. However, the remaining 21 pregnant women would be in the group that answered 'No' to the first question (n = 1,118), leading to a false negative rate of 1.9%. In order to identify as many of these 21 pregnant women as possible so that they would be screened out from receiving the deworming treatment, further questions would need to be asked. The most accurate second question (Q2), based on test parameters of the 1,118 women and the re-calculated Youden's J statistic, would be 'Do you think you might be pregnant?' (Table 5). A total of 14 of the 21 pregnant women who answered "No" to "Are you pregnant?" answered 'Yes' or 'I don't know' to this question (n = 212). Among those answering 'No' to this question (n = 906), seven women would, in fact, be pregnant. Therefore, after asking these first two questions, 99 (93.4%) of the 106 truly pregnant women would have been identified as pregnant and would not be treated, while seven pregnant women (6.6%) would be among those who would have been classified as not pregnant and receive treatment (n = 906). It should be noted that, in order to identify 14 of the 21 pregnant women not identified after Q1 so that they could be excluded from treatment, 198 women would be (incorrectly) identified as being potentially pregnant and not receive treatment. As such, the addition of this question would successfully decrease the rate of false negatives (from 1.9% to 0.8%), but at the cost of increasing the rate of false positives (from 0 to 66.6%).
To reduce the relatively large number of women who would not receive treatment (the number of false positives), while minimizing the risk of misclassifying women who were likely to be pregnant (the number of false negatives), a third question (Q3) (i.e. 'Have you experienced darkened areolas in the past week?') could be asked of the 212 women who thought they might be pregnant or didn't know if they were pregnant on Q2 (Table 6). Seventeen women answered 'Yes" to Q3, of which eight would be correctly identified as pregnant and all 17 excluded from treatment; 195 answered 'No' to Q3, who would then be classified as not pregnant (when in fact, six pregnant women are in this group) and all 195 would receive treatment. Therefore, after asking three questions, 93 truly pregnant women would be correctly excluded from treatment; 1,088 truly not pregnant women would Table 3. Sensitivity, specificity, positive predictive value, negative predictive value and Youden's J statistic, for question responses, using the pregnancy test result as the gold standard (n = 1,203). correctly receive treatment; and 13 truly pregnant women would be inadvertently treated. So, asking Q3 as an additional question would result in adding 189 non-pregnant women to the pool of women eligible for treatment (and improve coverage rates) at the 'cost' of misclassifying an additional six pregnant women (who would be inadvertently treated). Compared to the previous 2-question model, this 3-question model would thus translate into a slight increase in the rate of false negatives (from 0.8% to 1.2%), while achieving a large reduction in the rate of false positives (from 66.6% to 8.8%).

Questions Sensitivity (%) Specificity (%) PPV (%) NPV (%) Youden's J
The misclassification in terms of the rate of false negatives can be reduced by asking a fourth question (Q4) ( Table 7) Additional analyses were conducted to examine whether alternative questions were more predictive of pregnancy in two distinct subgroups: 1) women who were currently breastfeeding and 2) women who had children. These did not provide any increased predictive accuracy. Subgroup analyses were also conducted to explore potential reasons for questionnaire responses (e.g. whether being younger, single or primiparous could explain a difference in responses). These analyses offered no additional insights.

Discussion
Increasingly, girls and women of reproductive age will be included in large-scale deworming programs and it will be important to effectively rule out those who are in the early stages of pregnancy. While it is unlikely that all pregnant women will be able to be screened out unless a pregnancy test is administered, we have shown that by asking one or two questions, false negative rates can be minimized. Increases in positive predictive values (and decreases in the rate of false negatives) can only be achieved by adding one or more questions, which inevitably leads Ruling out early trimester pregnancy in deworming programs to misclassification in terms of an increase in the rate of false positives (i.e. more women would be screened out as being pregnant when they are truly not pregnant). As such, this trade-off must be balanced in light of the risks of inadvertently administering deworming treatment to pregnant women in their first trimester and the unrealized benefits stemming from eligible women not receiving the deworming treatment. To a certain extent, increases in the rate of false positives can be justified by decreases in the rate of false negatives given the relative importance of competing risks. Unfortunately, no objective standard currently exists to quantify this trade-off. One potential strategy to mitigate misclassification would be to administer pregnancy tests to those women who cannot be definitively classified as "pregnant" or "not pregnant" after one or two questions; however, pregnancy tests are not likely to be available in the context of large-scale deworming programs and the numbers of tests required might be quite high, making this approach infeasible. Ruling out early trimester pregnancy in deworming programs Considerations other than the statistical approach used in this study should be taken into account during the implementation of a deworming program. Questions on sexual activity were not included in the present questionnaire because of acceptability issues. Cultural practices related to recognizing, and then disclosing, pregnancy status must also be considered [36]. In addition, the sex of the person distributing the deworming medicine may determine whether it is culturally appropriate to ask a particular question. Ideally, a qualitative study, with participation of women in each subgroup of women of reproductive age (i.e. adolescent girls, pregnant women, lactating women and other adult women), should be undertaken to ensure the acceptance of the proposed questions before the start of the program.

Conclusion
The rationale for identifying women in early pregnancy in the implementation of deworming programs is the concern for adverse effects attributable to the deworming treatment in the first trimester of pregnancy. In large-scale deworming programs which include women of reproductive age, and where administering pregnancy tests to each woman would be prohibitively costly, this study has shown that inadvertent exposure to deworming treatment in early pregnancy can be considerably reduced by asking a parsimonious set of questions. The wording, number and order of questions may differ by cultural setting, in light of acceptability and feasibility concerns. These concerns can be appropriately captured in focus group discussions (ideally undertaken in all WRA subgroups) conducted prior to the launching of a WRA-targeted deworming program. This will not only enable adaptation of questions to context, including around norms linked to pregnancy disclosure, but it will also inform relevant and appropriate training for the front line health workers who will administer the questions and distribute the deworming treatment. Where possible, inadvertent exposure to deworming treatment may be further reduced, even to zero, where pregnancy tests can be administered to smaller numbers of women after answering an initial series of questions.