Factors associated with self-reported health: implications for screening level community-based health and environmental studies

Background Environmental justice advocates, exposure scientists, and local, state, and national public health officials need broad-based health indices to identify vulnerable communities. Longitudinal studies show that perception of current health status predicts subsequent mortality, suggesting that self-reported health (SRH) may be useful in screening-level community assessments. This paper evaluates whether SRH is an appropriate surrogate indicator of health status by examining relationships between SRH and sociodemographic, lifestyle, and health care factors as well as serological indicators of nutrition, health risk, and environmental exposures. Methods Data were combined from the 2003–2006 National Health and Nutrition Examination Surveys for 1372 nonsmoking 20–50 year olds. Ordinal and binary logistic regression were used to estimate odds ratios and 95 % confidence intervals of reporting poorer health based on measures of nutrition, health condition, environmental contaminants, and sociodemographic, health care, and lifestyle factors. Results Poorer SRH was associated with several serological measures of nutrition, health condition, and biomarkers of toluene, cadmium, lead, and mercury exposure. Race/ethnicity, income, education, access to health care, food security, exercise, poor mental and physical health, prescription drug use, and multiple health outcome measures (e.g., diabetes, thyroid problems, asthma) were also associated with poorer SRH. Conclusion Based on the many significant associations between SRH and serological assays of health risk, sociodemographic measures, health care access and utilization, and lifestyle factors, SRH appears to be a useful health indicator with potential relevance for screening-level community-based health and environmental studies. Electronic supplementary material The online version of this article (doi:10.1186/s12889-016-3321-5) contains supplementary material, which is available to authorized users.


Background
Health outcomes are multi-determined and result from complex interactions of social, cultural, economic, psychosocial, environmental, and community factors. However, this wide range of factors is typically studied in a 'siloed' manner [1]. Effective public health policies can be generated only if a range of risks along the complex causal chain leading to health outcomes is assessed, defined, and studied comprehensively [2].
Self-reported health (SRH) is a qualitative single-question assessment of health [3]. SRH is commonly acquired in health surveys in the United States (e.g., MacArthur Field Study of Successful Aging, Hawaii Health Survey, San Luis Valley Diabetes Study, National Risk Survey, National Health and Nutrition Examination Survey [NHANES], and Robert Wood Johnson Foundation) [4][5][6][7] and internationally (e.g., Spanish National Disability Survey, European Organization for Research and Treatment of Cancer Quality of Life Questionnaire, National Population Health Survey, and Manitoba Longitudinal Study on Aging) [8][9][10]. SRH is also commonly used in psychological research, clinical settings, and in general population surveys [11].
Studies have shown that SRH is associated with lifestyle-related diseases (e.g., diabetes and hypertension [12]), lifestyle habits (e.g., smoking status [13] and regular physical exercise [14]), obesity [15], and, most notably, subsequent mortality [4,10]. The validity and value of SRH, with respect to mortality, is independent of clinical or physician assessments, and SRH surpasses these measures in predictive power [11]. Few studies link SRH with diagnostic clinical indicators of disease [12,16,17], and even fewer evaluate SRH in relation to blood- or urine-based biomarkers of environmental exposure [17,18].
Many diseases and health conditions are often not reported, thus county, state, and national surveys often have limited health outcome data [19]. Local, state, and national public health officials, exposure scientists, and environmental justice advocates would benefit from a screening level health status indicator, such as SRH, to identify potentially vulnerable communities and modifiable health risk factors. Such an indicator would also add value to studies where both environmental exposures and social determinants of health are simultaneously assessed [20].
This study investigates the utility of SRH as a general proxy for health status by investigating whether, and to what extent, SRH is associated with race/ethnicity and a broad range of health-risk indicators (N = 57) thought to be important determinants of health. Data were extracted from NHANES and include race/ethnicity and health risk factors across six domains: sociodemographic, health care, health status (e.g., diseases/health conditions), lifestyle factors, serological clinical and nutritional indicators, and blood biomarkers of exposures for metals and volatile organic compounds.

Methods
Physical, medical, laboratory, and respondent data from questionnaires and clinical analysis were extracted from publicly available NHANES data from survey years 2003-2004 and 2005-2006. The data and more information about data collection are available online [21]. Data on SRH and a broad array of subjective and objective respondent characteristics, including sociodemographic indicators, health care, lifestyle factors, and diseases, were obtained from interviewer-administered computer-assisted personal interviews conducted at the household interview and mobile examination center [22,23]. While all NHANES participants complete a computer-assisted personal interview, full serum analysis, including chemical exposure assessment, is conducted for only a randomly selected subset of NHANES participants.

Study population
Of the full NHANES study sample of 5214 participants between the ages of 20 and 50, the study population for this analysis comprises 1372 nonsmokers aged 20 to 50 with complete data on SRH and serum biomarkers. Respondents (N = 1731) were omitted from the analysis if their serum cotinine concentration was greater than 10 ng/mL (N = 1648), or if serum cotinine was missing and they self-identified as a current smoker (N = 81), or if both were missing (N = 2). We restricted the analysis to current nonsmokers because of the adverse health impact of smoking; including smokers could have unduly weakened or strengthened any associations between SRH and the various factors. We verified the suspected strong relationship between SRH and smoking in preliminary analyses (not shown), which found that smokers, both self-identified current smokers and participants with cotinine measurements >10 ng/mL, were twice as likely to report poor/fair health as nonsmokers. An additional 2111 respondents were excluded due to missing values for benzene and/or toluene (N = 1948) or due to missing data for SRH and pertinent demographic, body measurement, and clinical data (N = 163). If data were missing from fewer than 20 participants for other variables, those participants were excluded from analyses using that variable. If data were missing from more than 20 participants, a "missing" category was included for analysis of that variable. Sample sizes are provided in Tables 1 and 2. Some variables of interest were only analyzed for females (e.g., ferritin, transferrin receptor, transferrin saturation, iron, hemoglobin, and total iron binding capacity); thus, the sample size for analyses that include these variables is lower.
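The exclusion counts reported above can be checked with a few lines of arithmetic. The following is an illustrative sketch only, not part of the original analysis (which was carried out in SAS); the dictionary labels and the helper name `remaining_after` are hypothetical, and the counts are copied from the paragraph above.

```python
# Counts copied from the Study population paragraph; labels are illustrative.
FULL_SAMPLE = 5214  # NHANES 2003-2006 participants aged 20 to 50

smoking_exclusions = {
    "cotinine > 10 ng/mL": 1648,
    "cotinine missing, self-identified current smoker": 81,
    "cotinine and smoking status both missing": 2,
}
missing_data_exclusions = {
    "benzene and/or toluene missing": 1948,
    "SRH, demographic, body measurement, or clinical data missing": 163,
}


def remaining_after(start, *exclusion_groups):
    """Subtract every exclusion count in each group from the starting sample size."""
    remaining = start
    for group in exclusion_groups:
        remaining -= sum(group.values())
    return remaining


analytic_sample = remaining_after(
    FULL_SAMPLE, smoking_exclusions, missing_data_exclusions
)
print(sum(smoking_exclusions.values()))       # smokers or unknown smoking status
print(sum(missing_data_exclusions.values()))  # missing VOC, SRH, or clinical data
print(analytic_sample)                        # final analytic sample size
```

The two subtotals reproduce the 1731 and 2111 exclusions stated in the text, and the remainder matches the analytic sample of 1372.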

Self-reported health (SRH)
NHANES respondents were asked in a computer-assisted personal interview: "Would you say your health in general is excellent, very good, good, fair, or poor?" SRH was analyzed in two ways. First, SRH was collapsed into a binary variable that combined excellent, very good, and good into one category and fair and poor into a second category. This dichotomy is commonly used by others investigating SRH [24][25][26][27] and helps account for imbalances resulting from low numbers of respondents at the extreme lower end of the scale (i.e., those reporting poor health). Second, SRH was treated as an ordinal measure (5 = Excellent to 1 = Poor) and modeled using ordinal logistic regression, with the resultant odds ratios (OR) reflecting the odds of a respondent reporting poorer health. A comparison of the relationships between SRH categories and health risk indicators, and of the ORs derived using the ordinal five-point response versus the binary response, is shown in Additional file 1: Tables S1-S6. We present results from the ordinal SRH categories in Figs. 1, 2, 3 and 4 and a comparison of the binary and ordinal responses in Tables 4 and 5.
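The two codings of SRH described above can be sketched as follows. This is a minimal illustration rather than the authors' SAS code: the function names are hypothetical, and the choice of 1 for fair/poor and 0 for good or better in the binary variable is an assumption made here for readability.

```python
# Ordinal coding as stated in the text: 5 = Excellent ... 1 = Poor.
SRH_SCORE = {"excellent": 5, "very good": 4, "good": 3, "fair": 2, "poor": 1}


def srh_ordinal(response: str) -> int:
    """Map a survey response to the five-point ordinal score."""
    return SRH_SCORE[response.strip().lower()]


def srh_binary(response: str) -> int:
    """Collapse the scale: 1 = fair/poor, 0 = good/very good/excellent.

    The 1/0 direction is an assumption of this sketch.
    """
    return 1 if srh_ordinal(response) <= 2 else 0


responses = ["Excellent", "Good", "Fair", "Poor", "Very good"]
print([srh_ordinal(r) for r in responses])
print([srh_binary(r) for r in responses])
```

Keeping both codings side by side makes it easy to fit the ordinal and binary models on the same respondents and compare the resulting ORs.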

Race/ethnicity
Race/ethnicity was reported as a five-category variable in NHANES and derived from responses to survey questions on race and Hispanic origin. Respondents who self-identified as Hispanic of Mexican-American origin or ancestry were coded as "Mexican-American." Respondents who self-identified as Hispanic of other Hispanic origins or ancestries (e.g., Puerto Rican, Cuban, and Dominican) were coded as "Other Hispanic." Respondents who self-identified as non-Hispanic were then categorized based on their self-reported race: non-Hispanic White, non-Hispanic Black, and other non-Hispanic including multi-racial.

Health risk indicators
Health risk factors were selected from NHANES based on their known ability to reflect or contribute to health status by direct or indirect pathways. There are over 1200 indicator variables to choose from, including over a hundred blood biomarkers of chemical exposure. Health risk indicator variables were extracted for 57 respondent characteristics across six domains.
Domain 1 focused on sociodemographic factors: income and poverty-income ratio (PIR), high school education attainment, and marital status. Domain 2 focused on health care factors: lack of health insurance, hospitalizations, number of times received health care, mental health visits, prescription medication use in the last 30 days, and Hepatitis A and B immunization. Domain 3 focused on health status factors: mental and physical health, body mass index (BMI), high blood pressure, asthma, thyroid problems, diabetes, stomach illness, and cancer/malignancy. Domain 4 focused on lifestyle behaviors: whether respondents were worried [...]
Table notes: The weighted percent adjusts for differential probabilities of selection, nonresponse, and differences between the final sample and the total population. a past year; b past 30 days; c weighted percentage less than 0.3 (SE not calculated); d high blood pressure was determined using measured values at the examination; e self-reported that respondents were told by a doctor or other health care provider that they had the condition.
[...] high-density lipoprotein (HDL) cholesterol (below 60 mg/dL), C-reactive protein (CRP; ≥1 mg/dL), serum glucose, glycohemoglobin (>7 %), and serological nutritional indicators of health including calcium, vitamin C, and vitamin D, cell counts and morphology, and blood iron markers. Domain 6 focused on blood biomarkers of chemical exposure: cotinine, three metals (cadmium, lead, and mercury), and two volatile organic compounds (VOCs: benzene and toluene). Additionally, we explored cumulative exposure to environmental chemicals. From the variables in Domain 6, we derived three environmental scores reflecting combinations of blood metal levels and VOCs.
Environmental Score 1 combined blood lead and cadmium levels, Environmental Score 2 combined lead, cadmium, and mercury blood levels, and Environmental Score 3 combined benzene and toluene blood levels. These cumulative environmental scores were calculated by assigning participants a value of one if their blood chemical level was greater than the median blood level in the population studied and a value of zero for each blood chemical level less than or equal to the median blood level. For example, Environmental Score 1 had a range of 0 to 2.
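The scoring rule above (one point for each chemical whose blood level exceeds the study-population median) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `environmental_score` is hypothetical, the blood levels are made-up values, and the median is computed unweighted, which may differ from how the published medians were obtained.

```python
from statistics import median


def environmental_score(blood_levels, chemicals):
    """Count, per participant, how many chemicals exceed their population median.

    blood_levels maps a chemical name to a list of per-participant levels,
    with participants in the same order in every list.
    """
    medians = {c: median(blood_levels[c]) for c in chemicals}
    n_participants = len(blood_levels[chemicals[0]])
    return [
        # 1 if strictly above the median, 0 if less than or equal to it
        sum(1 for c in chemicals if blood_levels[c][i] > medians[c])
        for i in range(n_participants)
    ]


# Hypothetical values for five participants (not NHANES data).
levels = {
    "lead": [0.5, 1.0, 1.5, 2.0, 2.5],
    "cadmium": [0.1, 0.4, 0.2, 0.5, 0.3],
}
score_1 = environmental_score(levels, ["lead", "cadmium"])
print(score_1)
```

With two chemicals the score ranges from 0 to 2, as stated for Environmental Score 1; passing three chemicals (lead, cadmium, and mercury) would give the 0 to 3 range implied for Environmental Score 2.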

Statistical analysis
The outcome of interest for the study was poorer SRH. The predictors of interest were the variables within the six domains and environmental scores described above.
ORs and 95 % CIs were calculated using binary and ordinal logistic regression. All analyses were carried out with SAS 9.3 (SAS Institute, Inc., Cary, North Carolina) and incorporated the appropriate sample weights to adjust for differential probabilities of selection, nonresponse, and differences between the final sample and the total population. The NHANES stratification and clustering design variables were used in the binary and ordinal logistic regression modeling to obtain proper standard errors of the estimates. Models were adjusted for age and sex for all domains. We derived an indicator variable for three clinical indicators with widely recognized cut-offs for assessing health: CRP, HDL, and vitamin D. Each subject's continuous value was converted to 1 or 0 for these three derived variables, with 1 representing poorer health (CRP ≥1 mg/dL, HDL <60 mg/dL, and vitamin D <20 ng/mL). When predicting poorer SRH for the serological health risk indicators, the models were additionally adjusted for asthma and diabetes, two diseases known to significantly impact SRH. For the serological nutritional and health risk indicators, the median levels or physiologically relevant cut-points were applied to the whole NHANES population of 20 to 50 year olds that had data for the indicator of interest. For the blood biomarkers of exposure, the median levels were calculated based on the 20 to 50 year old nonsmokers (with less than 10 ng/mL of cotinine) who had data for the chemical of interest. To calculate the median values for environmental exposures, individuals with values below the limit of detection (LOD) were assigned the LOD divided by the

Results
The full set of characteristics and health indicators can be found in Tables 1 and 2, which display the frequency and weighted percent of the characteristics of the study participants and the indicators. Additional file 1: Tables S1-S6 include all health risk factors examined along with their associations with poorer SRH.
Table 1 presents descriptive demographic statistics of this study sample. The mean age of study respondents was 36. Among these, 54 % were women, and 9 % reported poor/fair SRH (N = 187). Approximately 70 % of respondents had some college education or higher. Sixty percent had annual incomes greater than $45,000, while 14 % had annual incomes less than $20,000. The majority of respondents were non-Hispanic White (68 %), while 11 % were non-Hispanic Black, 11 % Mexican American, 5 % other Hispanic, and 5 % other race including multi-racial.
Table 2 highlights selected lifestyle, health care, health status, dietary and clinical indicators, and environmental exposure characteristics. The majority of the respondents did not have self-reported current asthma (92 %), [...] Table 3 provides the distributions of the continuous variables used in the analysis.
Figures 1, 2, 3 and 4 show ORs for poorer SRH derived from ordinal logistic regression in association with race/ethnicity and the 57 health-risk indicators. All associations are adjusted for age and gender. Additional file 1: Tables S1-S6 compare the ORs for the ordinal five-point and binary (poor/fair versus good/very good/excellent) responses in association with race/ethnicity and all 57 health-risk indicator variables.
Figure 1 shows associations between SRH and sociodemographic characteristics. Mexican Americans, non-Hispanic Blacks, and other Hispanics reported poorer SRH than non-Hispanic Whites (Fig. 1A). Participants who were not U.S. citizens, were born outside of the U.S., or were widowed/divorced/separated also had poorer SRH.
Lower income and education levels were consistently associated with poorer SRH. The PIR, which is an index of the ratio of family income to poverty, was associated with poorer SRH. Marital status, defined as living with a partner or never married versus married, was not associated with poorer SRH (see Additional file 1: Table S1 "Sociodemographic Domain").
Figure 2 shows associations between indicator variables of health care access and utilization and reports of poorer SRH. Lack of health insurance, the number of times the participant received health care over the past year, and taking prescribed medications over the past month were associated with poorer SRH (ordinal five-point and binary SRH responses). In contrast, whether a participant had seen a mental health professional over the past year was associated with poorer SRH only for the ordinal five-point SRH responses. Receiving fewer doses (<2 versus at least 2) of Hepatitis A vaccine was associated with better SRH for the binary but not the ordinal five-point SRH responses. Hepatitis B vaccination was not associated with poorer SRH for either the binary or ordinal five-point responses (see Additional file 1: Table S2 "Health Care Domain").
Figure 3 presents associations between SRH and indicators of mental and physical health. Having more than 8 days in the past 30 when mental health was not good was associated with poorer SRH, as were the number of days physical health was not good during the past 30 days, BMI, diabetes, doctor-diagnosed high blood pressure, ever being told of a thyroid problem, ever being told of being overweight, an asthma diagnosis, an asthma attack in the last year, and stomach illness. Generally, the relationships between these health status indicators and SRH were consistent between the binary and the ordinal five-point responses, except for asthma and stomach illness, which were not significant in the binary model.
Ever being told you have cancer/malignancy was not associated with SRH by either the binary or ordinal five-point responses (see Additional file 1: Table S3 "Health Status Domain").
Figure 4 presents associations between SRH and lifestyle factors. Respondents were more likely to report poorer SRH if they also reported being worried that the household would run out of food in the last 12 months (often and sometimes) or reported not being able to afford balanced meals in the past 12 months. Watching more than three hours of TV daily and having no vigorous or moderate activity in the last 30 days were both associated with poorer SRH. Living in an apartment, mobile home, or trailer versus a detached single-family home was associated with poorer SRH, while respondents who owned rather than rented a home, or who consumed alcohol 5-7 days/week (versus never), were less likely to report poorer SRH. No relationships were [...]
To verify that the blood cadmium and lead levels were not confounded by passive exposure to cigarette smoke, we adjusted for cotinine; however, this further adjustment did not impact the associations. Environmental Score 3, which considered the cumulative effect of benzene and toluene, was not associated with SRH. The differences between the ordered logit and binary logit models, in terms of statistical significance, were particularly apparent for Environmental Scores 1 and 2. These discrepant results tend to be those with the smallest numbers of respondents. For example, for Environmental Score 1, the number of respondents in a category was as low as 374. This could affect the accuracy of the assumption that the odds ratio is constant across categories of self-reported health, especially since very few respondents reported "Poor" health.
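The constant-odds-ratio assumption mentioned above can be examined informally by computing a crude odds ratio at each cut-point of the five-point scale. The sketch below is illustrative only: the function name and the counts are hypothetical, and the ratios are unadjusted and unweighted, unlike the survey-weighted, covariate-adjusted models used in this study.

```python
def cumulative_odds_ratios(exposed, unexposed):
    """Crude odds ratio of reporting poorer health at each cut-point.

    Both arguments are counts ordered from Excellent to Poor. For each
    cut-point, categories at or beyond the cut are treated as "poorer".
    """
    ratios = []
    for cut in range(1, len(exposed)):
        a = sum(exposed[cut:])     # exposed, poorer side of the cut
        b = sum(exposed[:cut])     # exposed, better side of the cut
        c = sum(unexposed[cut:])   # unexposed, poorer side of the cut
        d = sum(unexposed[:cut])   # unexposed, better side of the cut
        ratios.append((a * d) / (b * c))
    return ratios


# Hypothetical counts: Excellent, Very good, Good, Fair, Poor.
exposed_counts = [20, 40, 60, 25, 5]
unexposed_counts = [60, 80, 70, 15, 2]

ratios = cumulative_odds_ratios(exposed_counts, unexposed_counts)
for label, ratio in zip(
    ["Excellent | rest", "<= Very good | rest", "<= Good | Fair/Poor", "rest | Poor"],
    ratios,
):
    print(f"{label}: OR = {ratio:.2f}")
```

The ordinal model summarizes all four cut-points with a single OR, which is reasonable only when the four values are similar; the binary model corresponds to the third cut-point alone (good or better versus fair/poor). Sparse counts in the "Poor" category make the last ratio unstable, which is one way the two models can disagree.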

Discussion
Effective planning and decision-making for improving the health of a community requires information about the current health status and individual factors that will influence health status [28]. SRH was used to delineate and explore relationships between race/ethnicity and 56 potentially modifiable population health determinants across six domains: sociodemographic, health care, health status (e.g., diseases/health conditions), lifestyle factors, serological clinical and nutritional indicators, and blood biomarkers of exposures for metals and volatile organic compounds. Individual-level data were combined from two NHANES survey cycles (2003-2004 and 2005-2006) for 1372 nonsmoking adults.
Poorer SRH was associated with race/ethnicity, citizenship, income and education level, lack of health insurance, number of hospitalizations, food security, exercise, poor mental and physical health, prescription drug use, health outcome measures (e.g., diabetes, thyroid problems, asthma, stomach illness), several serological levels of nutrition, clinical measures of health risk, and blood biomarkers of environmental exposures for lead, mercury, cadmium, and toluene, but not benzene.
We note general consistency between the ordered logit and binary logit models in terms of statistical significance, but there are a few differences of note. The discrepancies may be explained in part by the different assumptions of the two models. The ordered logit model assumes the effect is constant across categories of self-reported health (i.e., the effect on poorer health from Excellent to Good is the same as from Fair to Poor), whereas the binary model assumes that those reporting Fair and Poor, and those reporting Excellent, Very Good, and Good, are similar enough to be grouped together. While some information is certainly lost by this grouping, as we noted previously, this is a fairly common practice in studies of SRH [24][25][26][27] and overcomes issues related to low numbers in some of the categories, especially the "Poor" response, which was given by only 21 (1.2 %) respondents in our sample (Table 2). The binary model also does not require an assumption of constancy of the odds ratio across multiple categories. These two models provide complementary but different interpretations of the association between SRH and the health risk indicators. For a more rigorous comparison and discussion of binary and several ordered SRH analytic choices, see Manor et al. [29] and Barger [30].
Mexican Americans and non-Hispanic Blacks, when compared to non-Hispanic Whites, were more likely to report poorer SRH. These findings are consistent with Shetterly et al. [5] and Benjamins et al. [31]. Our analyses showed a strong association between poorer SRH and lower education and income levels. Lahelma et al. [32] explain the clear associations between health and education, occupational class, and family income. Adler and Ostrove [33] discussed how sociodemographic and environmental factors, individual psychological and behavioral factors, and biological predispositions and processes can impact health status.
Associations were observed between poorer SRH and the number of days a respondent's mental health was not good. This finding suggests that SRH incorporates a mental health or psychosocial component that otherwise would go undetected in serological based clinical tests.
Poorer SRH was associated with lower levels of vitamin C, vitamin D, and calcium. These findings are consistent with Radimer et al. [34], who showed that intake of multivitamin and multi-mineral dietary supplements by US adults was associated with very good/excellent self-reported health. Poorer SRH was associated with lower levels of HDL; higher levels of CRP, triglycerides, serum glucose, and glycohemoglobin; higher platelet count; and elevated eosinophil and lymphocyte numbers. Nine of eleven blood iron markers were associated with SRH. These health indicators are linked to cardiac health, diabetes risk, and other medical conditions. Of particular note in our study was the strong association between the red cell distribution width (RDW) and poorer SRH. Several studies have reported strong associations between RDW and mortality, although the mechanism by which RDW influences health status is unknown [35][36][37]. Based on the strong associations observed between RDW and SRH, and because RDW is routinely measured, RDW may serve as an important early indicator of adverse health status prior to disease onset.

Biomarkers of chemical exposure
Blood levels of the three toxic heavy metals (cadmium, lead, mercury) and two VOCs (toluene and benzene) were evaluated in relation to SRH. All five chemicals have public health importance due to their environmental abundance and well-documented toxicity. SRH was associated with blood levels of cadmium, mercury, lead, and toluene but not benzene (perhaps because only 43 % of the respondents in this study were above the limit of detection for benzene). Examination of benzene in relation to health is of interest in light of studies showing that ambient air levels of benzene and formaldehyde contribute nearly 60 % of the total cancer-related health impacts of air pollution in the United States [38]. People are exposed to mixtures of pollutants through a variety of media, including air, water, and food. Thus, research is needed to better understand the cumulative risks posed to human health by the myriad environmental contaminants that can occur simultaneously. Interactive effects of chemicals within mixtures are complex and can result in alterations in the distribution, metabolism, absorption, and excretion of the chemicals [39]. Recently, Cobbina et al. [40] observed synergistic effects of metal mixtures, which is consistent with our data, where the odds of reporting poorer SRH were greater when the combined blood levels of mercury, lead, and cadmium were considered than when each metal was considered individually. In isolation, increasing levels of blood mercury were associated with better SRH, an association that is likely confounded by income and fish consumption. For example, Mahaffey et al. [41] showed that blood mercury levels in women were related to higher income, consumption of fish, ethnicity, and residence (census region and coastal proximity). Higher blood lead and cadmium levels were associated with lower income levels [42].
Taken together, these studies underscore the need for further research into the relationships between health and cumulative exposures to chemicals in the context of cultural and economic factors, especially for vulnerable populations and communities [43].
Our data suggest that SRH may be a useful screening-level indicator of health status for community-based health and environmental studies, based on the number of associations between SRH and sociodemographic, health care, health, lifestyle, serum-based nutritional, and serum-based environmental measurements. Examples of studies using screening-level indices include that of Gallagher et al. [44], in which health, sustainability, and environmental indices were derived for fifty major US cities. These diverse indices, and the indicators from which they were derived, were associated with disparities related to race, education, and income. Messer et al. [45] applied a multidimensional neighborhood deprivation index (which considered income/poverty, education, employment, housing, and occupation) in relation to adverse prenatal events. Major et al. [46] applied the same index to evaluate associations with all-cause cancer, cardiovascular disease, and mortality. Derivation of an environmental quality index holds promise for improving the linkage between the impact of the overall environment and health [47].

Limitations
As a single-question qualitative measurement, SRH is unable to capture all aspects of health risk or health status. Burgard and Chen [48] suggested that the comparability of self-reported information about specific health conditions might vary across race and social groups, in part because of diagnosis bias. Additionally, measures of specific symptoms may differ if respondents interpret questions or concepts differently. In this analysis, the study population was limited to 20-50 year old nonsmokers, which limits the generalizability of our findings for children and the elderly. We selected this age range in part because some of the blood chemical concentrations were only available for 20-50 year olds. Additionally, the elderly have higher rates of morbidity and children are undergoing rapid developmental changes that may lead to more varied clinical and nutritional measures. Due to the cross-sectional design of the study, we cannot infer causality as the basis for any relationships observed between the explanatory variables and SRH. We did not conduct analyses to evaluate possible correlations between and amongst variables within each of the domains. Further, it is likely that many of the social factors that affect health have both independent and interactive effects on various measures of health. For example, low income is often associated with many other factors contributing to poor health outcomes (e.g., lower levels of education, substandard housing, risky health behaviors, food insecurity, and lack of health insurance coverage). Because this was an exploratory, hypothesis generating analysis, multiple testing correction approaches were not applied. Therefore, p-values should be interpreted with caution. In addition, multivariate regression models were not evaluated.

Conclusion
SRH was used to delineate and explore relationships between multiple health risk factors that ultimately will help inform the design of subsequent studies by highlighting risk factors that relate to health status. To the best of our knowledge, no previous research has applied both binary and ordered logit models to study the