Assessing the Impact of Misclassification when Comparing Prevalence Data: A Novel Sensitivity Analysis Approach

Background: A simple sensitivity analysis technique was developed to assess the impact of misclassification and verify observed prevalence differences between distinct populations. Methods: The prevalence of self-reported comorbid diseases in 4,331 women with surgically diagnosed endometriosis was compared to published clinical and population-based prevalence estimates. Disease prevalence misclassification was assessed by assuming over-reporting in the study sample and under-reporting in the general (comparison) population. Over- and under-reporting by 10%, 25%, 50%, 75%, and 90% was used to create a 5×5 table for each disease. The new prevalences represented by each table cell were compared by p-values, prevalence odds ratios, and 95% confidence intervals. Results: Three misclassification patterns were observed: 1) differences remained significant except at high degrees (>50%) of misclassification; 2) minimal (10%) misclassification negated any observed difference; and 3) with some (25-50%) misclassification, the difference disappeared, and the direction of significance changed at higher levels (>50%). Conclusions: This sensitivity analysis enabled us to verify observed prevalence differences. This simple approach is useful for comparing prevalence estimates between distinct populations.


Introduction
Establishing differences in disease prevalence between populations is a common application of epidemiology. Disease prevalence data may be obtained using surveys, medical record reviews, and surveillance reporting, and disease may thus be over- or underestimated because of unmeasured confounding, misclassification (information bias), and selection bias. While medical researchers strive to collect valid and minimally biased data, missing or limited validation data can be an important obstacle in addressing the effect of misclassification. Analytic techniques may be employed to assess the uncertainty of study results and to correct for potential bias due to misclassification, and are therefore useful in interpreting whether significant differences are real. Sensitivity analysis may be used to quantitatively evaluate the effect of misclassification.
Various sensitivity analysis techniques use basic and matrix algebra to assess and correct for differential, non-differential, or simultaneous misclassification of exposure and disease on epidemiologic measures of association [1-6]. Predictive values are also used to adjust relative risk estimates and to correct for biases resulting from misclassification of outcome status [7,8]. In some instances, computer programs are used to perform more extensive analyses [9]. While these established techniques for conducting a "formal" sensitivity analysis are valuable, there are several important reasons why such analyses may not be carried out. First, reliable estimates of sensitivity, specificity, and true disease frequency are often required, but may not be available. Second, these methods make assumptions about the data, such as misclassification of only the outcome variable, sensitivity and specificity parameters that are the same for each comparison group, or misclassification that is considered in isolation from other forms of bias, such as selection bias or confounding. Third, these methods are not standardized and may be useful only with particular study designs, further hampering their appropriate use. Finally, the methodology is complex, such that most public health professionals or clinicians cannot undertake a sensitivity analysis without formal training in epidemiology or statistics [10].
The published prevalences of self-reported, physician-diagnosed autoimmune diseases, chronic fatigue syndrome, and fibromyalgia [11], as well as cancer and infectious or endocrine diseases [12], in up to 4,331 women with surgically diagnosed endometriosis were compared to prevalences from studies published in the last 30 years. Comparing population prevalences obtained from clinical, population-based, or self-reported studies to those that are solely self-reported may present disparity not only due to differences in study methodology, but also due to inherent differences in the populations being compared. We assumed that women with endometriosis who self-report a diagnosis may believe they have a disease when they actually do not, therefore biasing prevalence estimates upward. Some diseases were rare, while others, like infectious diseases, were more commonly reported but are less specific and, perhaps, open to interpretation. In both instances, this may lead to overestimation of diagnoses.
By contrast, population disease prevalence estimates based on clinical or population-based studies may use more stringent definitions, which might bias prevalence estimates downward. These types of biases may result in conclusions of 1) a difference when one does not exist (Type I error), 2) no difference when there is one (Type II error), or 3) a difference in the opposite direction from the true difference. We therefore considered the degree of underestimation and overestimation of true disease prevalence because even modest amounts of error can profoundly affect results [13].
We developed a novel sensitivity analysis approach to determine the threshold of misclassification that would eliminate the observed differences between the disease prevalences in two different populations, in this instance women with endometriosis and the general female population. This provided us with a visual and quantitative validation of the increased prevalence of comorbid diseases among women with endometriosis. Our method requires only a numerator and denominator for prevalence computation and does not rely on detailed information, assumptions, or complex methodology. Our goal was not to replace formal sensitivity analysis techniques, which should be carried out when possible, but to offer a simple way to assess the impact of misclassification and to verify study findings.

Materials and Methods
Prevalence estimates from up to 4,331 female members of the Endometriosis Association (International Headquarters, Milwaukee, Wisconsin) who reported surgical diagnosis of endometriosis and the physician diagnosis of comorbid diseases were compared to the general population [11,12]. Exemptions from Investigational Review Board reviews were granted by the Office of Human Subjects Research at the National Institutes of Health, Bethesda, Maryland, and the Committee on Human Research, The George Washington University, Washington, DC.
Disease prevalence in the general female population for systemic lupus erythematosus, Sjögren's syndrome, rheumatoid arthritis, multiple sclerosis, Hashimoto's thyroiditis/hypothyroidism, Graves' disease/hyperthyroidism, diabetes mellitus, chronic fatigue syndrome, and fibromyalgia was estimated from studies published between 1969 and 2001. Age-specific population estimates of breast cancer, ovarian cancer, non-Hodgkin's lymphoma, and melanoma were obtained from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database. The remaining population prevalence estimates were obtained from published literature or sources such as the Centers for Disease Control and Prevention (CDC) and the National Center for Health Statistics (NCHS). Studies were included if prevalence could be calculated for women, that is, if they provided 1) the total number in the population for the denominator and 2) prevalence data for women. For some, denominators were determined using U.S. Census Bureau data. The published studies were pooled to derive disease prevalence using standardized medical definitions, patient interviews, clinical and laboratory evaluation, and self-reported surveillance data. Disease prevalences for women with endometriosis compared to the general female population prevalence estimates are presented in Table 1.
For each disease found to be statistically significantly different in either direction, we propose selecting an appropriate range of under- and overestimation degrees to create an n by n table for comparing the prevalences of each disease (Appendix). In our study, we considered the prevalence from published studies to be underestimated by 10%, 25%, 50%, 75%, and 90% and our study population prevalence to be overestimated by 10%, 25%, 50%, 75%, and 90%, to create a 5 by 5 table.
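The construction of the 5 by 5 table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact adjustment formula is given in the Appendix, which is not reproduced here, so the sketch assumes that overestimation by a fraction f reduces the study-sample case count to (1 - f) times the observed count, and that underestimation by a fraction g raises the comparison-population case count to (1 + g) times the observed count. The function name and the case counts in the usage line are hypothetical and are not data from this study.

```python
# Degrees of misclassification considered in the study.
LEVELS = (0.10, 0.25, 0.50, 0.75, 0.90)


def adjusted_prevalence_grid(study_cases, study_n, pop_cases, pop_n, levels=LEVELS):
    """Return {(over, under): (study_prevalence, population_prevalence)}.

    ASSUMPTION (not confirmed by the text): over-reporting by `over`
    removes that fraction of the study-sample cases, and under-reporting
    by `under` adds that fraction to the comparison-population cases.
    """
    grid = {}
    for over in levels:
        study_prev = study_cases * (1 - over) / study_n
        for under in levels:
            # Cap at 1.0 so an inflated prevalence stays a valid proportion.
            pop_prev = min(1.0, pop_cases * (1 + under) / pop_n)
            grid[(over, under)] = (study_prev, pop_prev)
    return grid


# Hypothetical counts for illustration only (not taken from Table 1).
grid = adjusted_prevalence_grid(study_cases=200, study_n=4000,
                                pop_cases=300, pop_n=10000)
```

Each key of `grid` corresponds to one cell of the table, so a different range of degrees (an n by n table) only requires passing a different `levels` tuple.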
The new general population and endometriosis prevalences generated by each cell in the table were then compared by Z-tests, and p-values were reported. To assess the magnitude of the differences and determine the direction of the effect, prevalence odds ratios (POR) and 95% confidence intervals (CI) were calculated. A p-value less than or equal to 0.05 and a CI excluding 1.0 were considered statistically significant. These results were used to identify the threshold (a "line" connecting the cells) where statistically significant differences between the two groups reversed. The amount of misclassification required for results to change was subjectively defined as the midpoint along the threshold in the table. Generally, a low degree of misclassification was considered to be less than 50%, while misclassification greater than 50% was considered to be high.
Table 2 displays the degree of overestimation in the study population of women with endometriosis and underestimation in the published studies that was necessary to negate the differences between the observed disease prevalences. For most diseases that had significantly different prevalences between the study sample and general population, a high degree (>50% in either direction) of misclassification was needed to eliminate these differences. However, for some diseases, a smaller degree of misclassification nullified the differences in prevalence between populations.
Table 2 footnotes: a = statistically non-significant at first level of misclassification; b = not applicable because statistically non-significant at observed crude level, or comparison with the general population could not be done; c = breast cancer was observed to be statistically significantly lower in women with endometriosis.
Overall, three different patterns were observed in the 5 by 5 tables used for our sensitivity analysis.
Figure 1a represents an example of the general pattern for those diseases that reached non-significant levels with increasing degrees of misclassification, or those that never reached non-significant levels. Such a pattern was noted for chronic fatigue syndrome, breast cancer, Sjögren's syndrome, systemic lupus erythematosus, multiple sclerosis, and ovarian cancer, although the threshold varied for each disease. In the second scenario (Figure 1b), non-significant levels were reached with the lowest degree (10%) of misclassification, and the direction of significance was reversed at higher levels (>50%) of misclassification. This pattern was noted for rheumatoid arthritis only. In the third pattern (Figure 1c), statistical significance disappeared at a high level of misclassification, and then at even higher levels (e.g., 90% overestimation and >50% underestimation) the direction of significance was reversed. Recurrent upper respiratory infections, Hashimoto's thyroiditis/hypothyroidism, recurrent vaginal infections, melanoma, mitral valve prolapse, and fibromyalgia displayed such a pattern.
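The cell-by-cell comparison that produces these patterns can be sketched as follows. The sketch rests on several assumptions that the text does not specify: a pooled two-proportion Z-test, a Wald confidence interval on the log odds ratio, and the same multiplicative adjustment of case counts as a stand-in for the Appendix formula. All function names and counts are hypothetical. Each cell is labelled "higher", "lower", or "ns" (non-significant), so that the threshold and any reversal of direction can be read off the table.

```python
import math
from statistics import NormalDist

LEVELS = (0.10, 0.25, 0.50, 0.75, 0.90)


def compare_prevalences(cases1, n1, cases2, n2, alpha=0.05):
    """Compare group 1 (study sample) with group 2 (comparison population).

    ASSUMPTIONS: pooled two-proportion Z-test and Wald CI for the
    prevalence odds ratio (POR). Case counts must be strictly between
    0 and the group size, otherwise the POR is undefined.
    """
    p1, p2 = cases1 / n1, cases2 / n2
    pooled = (cases1 + cases2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # POR from the 2x2 table: cases and non-cases in each group.
    a, b, c, d = cases1, n1 - cases1, cases2, n2 - cases2
    por = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (math.exp(math.log(por) - z_crit * se_log),
          math.exp(math.log(por) + z_crit * se_log))

    # Significant only if p <= alpha AND the CI excludes 1.0.
    significant = p_value <= alpha and not (ci[0] <= 1.0 <= ci[1])
    direction = "ns" if not significant else ("higher" if por > 1 else "lower")
    return {"p": p_value, "por": por, "ci": ci, "direction": direction}


def sensitivity_table(study_cases, study_n, pop_cases, pop_n, levels=LEVELS):
    """Label every (overestimation, underestimation) cell of the table."""
    table = {}
    for over in levels:
        for under in levels:
            # ASSUMED adjustment; the Appendix formula may differ.
            adj_study = study_cases * (1 - over)
            adj_pop = pop_cases * (1 + under)
            result = compare_prevalences(adj_study, study_n, adj_pop, pop_n)
            table[(over, under)] = result["direction"]
    return table


# Hypothetical counts for illustration only (not taken from Table 1).
table = sensitivity_table(study_cases=400, study_n=4000,
                          pop_cases=300, pop_n=10000)
```

Scanning `table` from low to high degrees of misclassification shows where "higher" first gives way to "ns" (the threshold) and whether "lower" appears afterwards (a reversal of the direction of significance).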

Discussion
This novel sensitivity analysis technique helped to assess the impact of misclassification in disease prevalence, assuming it existed, and to verify the observed significant differences between populations. The resulting 5 by 5 tables pictorially illustrated the analysis results and aided in the determination of the misclassification threshold where the statistical difference between the two groups disappeared.
We observed three patterns. In the first pattern, the difference became non-significant with high degrees (>50% in either direction) of misclassification, suggesting the observed difference was truly significant. Thus, higher thresholds provided a greater likelihood that the observed differences were valid and real. In the second pattern, the difference disappeared with the first degree (10% in either direction) of misclassification, resulting in the failure to verify the observed association, and suggesting that there is no association. This occurred for only one disease, rheumatoid arthritis, in which the magnitude of the difference was weak at the observed level. The third pattern, in which differences disappeared and the direction of significance was reversed at higher degrees of under-and overestimation (>50% in either direction), leads to the opposite conclusion and suggests no observed difference.
The interpretation of any of these patterns, the last pattern in particular, depends on the assumed degree of misclassification based on the study design, methodology, source of data, and other differences in the populations that were compared. Furthermore, prevalences from published studies need not always be considered to be underestimated, nor should population prevalences always be considered to be overestimated. These should be adjusted to what is believed to be true, depending on the diseases in question as well as the sources from which prevalences are being compared.
There is an increasing need for epidemiologic and biostatistical methodology, or "how to" papers, that can be easily applied by public-sector epidemiologists, other public health practitioners, and clinicians [10,14]. Most methods employ complex methodology and require detailed data, making their application by medical researchers impractical. The method presented here is simple, yet powerful in allowing investigators to judge their conclusions of any observed differences against how likely they are to be true. In the absence of the necessary information for conducting a formal analysis, we developed this new sensitivity analysis approach. Its advantage lies in its ease of use by any public health professional and in the substantial power it provides for validating findings.
In conclusion, we developed a novel, practical sensitivity analysis approach to verify findings by determining the degree of misclassification necessary to negate the difference between population prevalence estimates. The tables for each disease pictorially illustrated three different patterns, which helped to interpret the observed differences and sensitivity analysis results. The sensitivity analysis presented here is a useful alternative to a formal correction method for comparing prevalence estimates between different populations and could be added to routine study methodology.