Urinary Concentrations of Parabens and Serum Hormone Levels, Semen Quality Parameters, and Sperm DNA Damage

Background: Parabens are commonly used as antimicrobial preservatives in cosmetics, pharmaceuticals, and food and beverage processing. Widespread human exposure to parabens has been recently documented, and some parabens have demonstrated adverse effects on male reproduction in animal studies. However, human epidemiologic studies are lacking.
Objective: We investigated relationships between urinary concentrations of parabens and markers of male reproductive health in an ongoing reproductive epidemiology study.
Methods: Urine samples collected from male partners attending an infertility clinic were analyzed for methyl paraben (MP), propyl paraben (PP), butyl paraben (BP), and bisphenol A (BPA). Associations with serum hormone levels (n = 167), semen quality parameters (n = 190), and sperm DNA damage measures (n = 132) were assessed using multivariable linear regression.
Results: Detection rates in urine were 100% for MP, 92% for PP, and 32% for BP. We observed no statistically significant associations between MP or PP and the outcome measures. Categories of urinary BP concentration were not associated with hormone levels or conventional semen quality parameters, but they were positively associated with sperm DNA damage (p for trend = 0.03). When urinary BPA quartiles were added to the model, BP and BPA were both positively associated with sperm DNA damage (p for trend = 0.03). Assessment of paraben concentrations measured on repeated urine samples from a subset of the men (n = 78) revealed substantial temporal variability.
Conclusions: We found no evidence for a relationship between urinary parabens and hormone levels or semen quality, although intraindividual variability in exposure and a modest sample size could have limited our ability to detect subtle relationships. Our observation of a relationship between BP and sperm DNA damage warrants further investigation.

Parabens are esters of 4-hydroxybenzoic acid that are commonly used as antimicrobial preservatives in cosmetics, pharmaceuticals, and food or beverage processing (Andersen 2008). It is likely that repeated contact with products or foods containing parabens leads to widespread human exposure through ingestion, inhalation, or dermal contact. Methyl paraben (MP), ethyl paraben (EP), propyl paraben (PP), and butyl paraben (BP) have recently been detected in a high proportion of urine samples collected as part of a national U.S. survey of exposure to environmental chemicals.
Parabens are thought to possess low toxicity (Golden et al. 2005; Soni et al. 2005). However, some parabens may be estrogenic in vitro (Routledge et al. 1998), although with activity levels several orders of magnitude lower than that of estrogen. Recent experimental studies have reported that certain parabens may also act as antiandrogens (Chen et al. 2007; Darbre and Harvey 2008; Satoh et al. 2005), and there is limited evidence that parabens may affect thyroid function (Rousset 1981; Taxvig et al. 2008; Vo et al. 2010). Several animal studies have explored the effects of parabens on male reproduction, with some reporting that exposure to BP and PP, but not MP, adversely affects spermatogenesis and endocrine function in rats or mice (Kang et al. 2002; Oishi 2001, 2002a, 2002b, 2004). Conversely, in another recent study, Hoberman et al. (2008) found no association between MP or BP and reproductive markers in rats. Based on these limited animal studies, in addition to the observation that paraben estrogenicity depends on the length of the alkyl side chain (Darbre and Harvey 2008), the reproductive toxicity potential of parabens for which widespread human exposure has been documented is thought to be BP > PP > EP > MP.
To our knowledge, no human studies have investigated the association between paraben exposure and measures of male reproduction. In the present study we assessed relationships between urinary concentrations of several parabens and a range of male reproductive health markers in an ongoing study of environmental determinants of reproductive health. We also considered the potential for interactions between parabens and another estrogenic xenobiotic, bisphenol A (BPA), for which we recently reported associations with reproductive measures in this population (Meeker et al. 2010a, 2010b).

Materials and Methods
Participants were male partners in subfertile couples seeking treatment between 2000 and 2004 from the Vincent Memorial Obstetrics and Gynecology Service. After the study procedures were explained and all questions answered, subjects signed an informed consent form. Men between 18 and 55 years of age without postvasectomy status who presented to the Andrology Laboratory were eligible to participate. Of those approached, approximately 65% consented. Most men who declined to participate in the study cited lack of time on the day of their clinic visit as the reason for not participating. The study was approved by the human studies institutional review boards of Massachusetts General Hospital (MGH), Harvard School of Public Health, the Centers for Disease Control and Prevention (CDC), and the University of Michigan.
On the day of each subject's clinic visit, a single spot urine sample was collected into a sterile polypropylene cup. Because the temporal reliability of parabens in urine is unknown, second and third urine samples were collected from a subset of men at subsequent clinic visits. These samples were generally collected between 1 week and 2 months after the first sample. After measuring specific gravity (SG) using a handheld refractometer (National Instrument Co. Inc., Baltimore, MD), each urine sample was divided into aliquots and frozen at -80°C. Samples were shipped on dry ice overnight to the CDC, where concentrations of total (free + conjugated) MP, PP, BP, and BPA were measured using a modification of the approach described by Ye et al. (2006). EP was not measured because of low detection rates among the U.S. population. The analytical method and results for BPA have been previously reported (Meeker et al. 2010a, 2010b). The conjugated species of BPA and parabens were first hydrolyzed in 100 µL of urine using β-glucuronidase/sulfatase (Helix pomatia, H1; Sigma Aldrich, St. Louis, MO). The target compounds were preconcentrated by online solid-phase extraction, separated from other urine components by reversed-phase high-performance liquid chromatography, and detected by atmospheric pressure chemical ionization/isotope dilution/tandem mass spectrometry with peak focusing (Ye et al. 2006). The limits of detection (LODs) were 1.0 µg/L for MP and 0.2 µg/L for PP and BP. Low-concentration (2.2-9.0 µg/L) and high-concentration (10.5-53.8 µg/L) quality control materials (QCLs and QCHs, respectively) were prepared with pooled human urine that was analyzed with standards, reagent blanks, and study samples. The precision of measurements, expressed as the relative SD of 55-66 measures, depending on the analyte, was 5.8-12.1% for QCLs and 4.4-5.6% for QCHs.
Paraben concentrations were corrected for SG using the formula $P_c = P \times \frac{1.024 - 1}{SG - 1}$, where $P_c$ is the SG-adjusted paraben concentration (nanograms per milliliter), $P$ is the observed paraben concentration, and $SG$ is the specific gravity of the urine sample.
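The SG correction described above can be illustrated with a short Python sketch. The function name `sg_correct` is hypothetical, and the reference specific gravity `sg_ref` defaults to 1.024, a value commonly used for this type of correction and treated here as an assumption rather than a confirmed study parameter.

```python
def sg_correct(conc, sg, sg_ref=1.024):
    """Return a specific-gravity-adjusted urinary concentration.

    conc   : observed concentration (e.g., ng/mL)
    sg     : specific gravity of the urine sample
    sg_ref : reference specific gravity (assumed default, see text)
    """
    if sg <= 1.0:
        raise ValueError("specific gravity must be greater than 1.0")
    return conc * (sg_ref - 1.0) / (sg - 1.0)


# A dilute sample (SG 1.012) is scaled up; a concentrated one (SG 1.030) is scaled down.
dilute = sg_correct(10.0, 1.012)
concentrated = sg_correct(10.0, 1.030)
```

A sample whose SG equals the reference value is left unchanged, which is a quick sanity check on any implementation of this correction.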
One nonfasting blood sample was drawn between 0900 and 1600 hours on the same day and time that the first urine sample was collected. Blood samples were centrifuged, and the resulting serum was stored at -80°C until hormone analysis at the MGH Reproductive Endocrinology Laboratory. Serum testosterone, estradiol (E2), sex-hormone-binding globulin (SHBG), inhibin B, follicle-stimulating hormone (FSH), luteinizing hormone (LH), prolactin, free thyroxine (T4), total triiodothyronine (T3), and thyroid-stimulating hormone (TSH) were measured using sensitive immunoassay methods as described previously (Meeker et al. 2007). The free androgen index (FAI) was calculated as the molar ratio of total testosterone to SHBG. Free testosterone was also estimated using published methods (Vermeulen et al. 1999). The testosterone:LH ratio, a measure of Leydig cell function, was calculated by dividing testosterone (nanomoles per liter) by LH (international units per liter). The FSH:inhibin B and E2:testosterone ratios were also calculated as measures of Sertoli cell function and aromatase activity, respectively.
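As a minimal sketch of the derived hormone measures described above, the following Python function computes the four ratios from a single set of serum values. The function and argument names are hypothetical, no unit conversion is performed, and the units of inhibin B and E2 are assumed to be whatever the laboratory reports; free testosterone (Vermeulen et al. 1999) is not reproduced here.

```python
def derived_hormone_indices(t_nmol_l, shbg_nmol_l, lh_iu_l, fsh_iu_l, inhibin_b, e2):
    """Return the ratio measures named in the text as a dictionary.

    t_nmol_l    : total testosterone (nmol/L)
    shbg_nmol_l : SHBG (nmol/L), so that FAI is a molar ratio
    lh_iu_l     : LH (IU/L)
    fsh_iu_l    : FSH (IU/L)
    inhibin_b   : inhibin B (laboratory units, assumed consistent across men)
    e2          : estradiol (laboratory units, assumed consistent across men)
    """
    return {
        "FAI": t_nmol_l / shbg_nmol_l,           # free androgen index
        "T:LH": t_nmol_l / lh_iu_l,              # Leydig cell function
        "FSH:inhibin B": fsh_iu_l / inhibin_b,   # Sertoli cell function
        "E2:T": e2 / t_nmol_l,                   # aromatase activity
    }


indices = derived_hormone_indices(15.0, 30.0, 5.0, 6.0, 150.0, 30.0)
```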
Onsite at MGH, semen was collected from each subject into a sterile plastic specimen cup after a recommended abstinence period of 48 hr. After liquefaction at 37°C for 20-60 min, semen quality parameters and motion characteristics were measured at the clinic.
Semen samples were analyzed for sperm concentration, motility, and motion parameters using a computer-aided semen analyzer (CASA; version 10HTM-IVOS; Hamilton-Thorne Research, Beverly, MA) as previously described (Meeker et al. 2004a, 2008). Total sperm count (10^6 per ejaculate) was calculated by multiplying sperm concentration (10^6/mL) by semen sample volume (milliliters). Motile sperm was defined as World Health Organization (WHO) grade "a" sperm (rapidly progressive with a velocity ≥ 25 µm/sec at 37°C) plus "b" grade sperm (slow/sluggish progressive with a velocity of ≥ 5 µm/sec but < 25 µm/sec) (WHO 1999). Of seven CASA motion variables measured, we included only three in our analysis [straight-line velocity (VSL), curvilinear velocity (VCL), and linearity (VSL/VCL × 100)] because of a high degree of dependence between several of the measures (Duty et al. 2004; Meeker et al. 2004a). Sperm morphology was assessed on two slides per specimen (with a minimum of 200 cells assessed per slide) with a Nikon microscope using an oil-immersion 100× objective (Nikon Company, Tokyo, Japan). We used strict Kruger scoring criteria to classify men as having normal or below normal morphology (Kruger et al. 1988).
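The calculated semen measures described above (total sperm count, percent motile sperm, and linearity) can be sketched in Python as follows. All function names are hypothetical, and classifying a sperm by velocity alone is a simplification of the WHO grading, which also depends on whether the movement is progressive.

```python
def total_sperm_count(concentration_million_per_ml, volume_ml):
    """Total sperm count (10^6 per ejaculate) = concentration x volume."""
    return concentration_million_per_ml * volume_ml


def linearity(vsl, vcl):
    """Linearity (%) = straight-line velocity / curvilinear velocity x 100."""
    return vsl / vcl * 100.0


def who_grade(velocity_um_s):
    """Return 'a' (>= 25 um/sec), 'b' (>= 5 but < 25 um/sec), or None."""
    if velocity_um_s >= 25.0:
        return "a"
    if velocity_um_s >= 5.0:
        return "b"
    return None


def percent_motile(velocities_um_s):
    """Percentage of sperm classified as grade 'a' or 'b'."""
    motile = sum(1 for v in velocities_um_s if who_grade(v) is not None)
    return 100.0 * motile / len(velocities_um_s)
```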
The remaining unprocessed semen was frozen in 0.25 mL cryogenic straws (Cryo Bio System, IMV Technologies, San Diego, CA) by immersing the straws directly into liquid nitrogen (-196°C). Previous work in our laboratory showed that this freezing method produced comet assay results that were highly correlated with results from fresh, unfrozen samples (Duty et al. 2002). Semen samples were later analyzed in batches, where straws were thawed by gently shaking in a 37°C water bath for 10 sec, and the semen was immediately processed for the comet assay. To assess sperm DNA damage, we followed a comet assay procedure that has been previously described (Duty et al. 2003; Singh and Stephens 1998). After lysis, electrophoresis, and staining, we measured comet extent, tail distributed moment (TDM), and percent DNA located in the tail (Tail%) for 100 sperm in each semen sample using a fluorescence microscope and VisComet software (Impuls Computergestutzte Bildanalyse GmbH, Gilching, Germany). Comet extent is a measure of total comet length from the beginning of the head to the last visible pixel in the tail. Tail% is a measurement of the proportion of total DNA that is present in the comet tail (i.e., fragmented DNA that has migrated away from the comet head). TDM is an integrated value that takes into account both the distance and intensity of comet fragments: $\mathrm{TDM} = \frac{\sum (I \times X)}{\sum I}$, where $\sum I$ is the sum of all intensity values that belong to the head, body, or tail and $X$ is the x-position of the intensity value.
We performed data analysis using SAS (version 9.1; SAS Institute Inc., Cary, NC). In preliminary data analysis, paraben concentrations and outcome measures were compared by demographic categories and other potentially important covariates using appropriate parametric or nonparametric tests to investigate the potential for confounding.
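The comet assay summary measures defined above can be illustrated with a small Python sketch operating on a one-dimensional intensity profile of a single comet. This is not the VisComet implementation; the function names, the profile representation, and the boolean `in_tail` mask marking tail pixels are all assumptions made for illustration.

```python
def tail_distributed_moment(intensities, positions):
    """TDM = sum(I * X) / sum(I): the intensity-weighted mean x-position."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return sum(i * x for i, x in zip(intensities, positions)) / total


def tail_percent(intensities, in_tail):
    """Tail% = share of total intensity located in pixels flagged as tail."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    tail = sum(i for i, flagged in zip(intensities, in_tail) if flagged)
    return 100.0 * tail / total


# Hypothetical three-pixel profile: the last pixel is the comet tail.
profile = [1.0, 1.0, 2.0]
tdm = tail_distributed_moment(profile, [0.0, 1.0, 2.0])
tail_pct = tail_percent(profile, [False, False, True])
```

Because TDM weights intensity by position whereas Tail% only sums intensity in the tail, the two measures can diverge for the same comet.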
Multivariable linear regression was used to explore relationships between urinary paraben concentrations and hormone levels, semen quality parameters, and sperm DNA damage measures. Serum concentrations of inhibin B, testosterone, free testosterone, and E2, along with semen volume, sperm motility, morphology, and all sperm motion and sperm DNA damage measures, closely approximated normality and were used in statistical models untransformed. The distributions of sperm count, sperm concentration, FSH, LH, SHBG, prolactin, TSH, and all calculated hormone ratios were positively skewed and were transformed by the natural logarithm (ln) for statistical analyses. Urinary MP and PP concentrations were also ln transformed. For PP, MP, and BPA concentrations < LOD, we used an imputed value of LOD/2. Because of the high proportion of samples with BP values < LOD, a three-level ordinal variable was formed: all samples with concentrations < LOD were assigned to the lowest group, and two equally sized groups were formed among the samples with detectable concentrations to form the medium- and high-exposure groups. Tests for trend were conducted for ordinal BP categories in regression models using integer values (0, 1, 2).
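The exposure coding described above (LOD/2 imputation, ln transformation, and the three-level BP variable) can be sketched in Python as follows. The function names are hypothetical, nondetects are assumed to be recorded as `None` or as a value below the LOD, and when the number of detectable samples is odd the extra sample is placed in the medium group, which is an arbitrary choice.

```python
import math


def impute_below_lod(value, lod):
    """Replace a nondetect (None or < LOD) with LOD/2; keep detects as is."""
    if value is None or value < lod:
        return lod / 2.0
    return value


def ln_concentration(value, lod):
    """Natural log of the concentration after LOD/2 imputation."""
    return math.log(impute_below_lod(value, lod))


def bp_categories(values, lod=0.2):
    """Code BP as 0 (< LOD), 1 (lower half of detects), or 2 (upper half)."""
    categories = [0] * len(values)
    detected = sorted(
        (v, i) for i, v in enumerate(values) if v is not None and v >= lod
    )
    half = (len(detected) + 1) // 2
    for rank, (_, index) in enumerate(detected):
        categories[index] = 1 if rank < half else 2
    return categories


codes = bp_categories([None, 0.1, 0.3, 0.5, 1.0, 4.0])
```

The integer codes (0, 1, 2) can then be entered as a single numeric term in a regression model to obtain a test for trend.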
Inclusion of covariates in the multivariable models was based on statistical and biologic considerations (Kleinbaum et al. 1998). We included SG as a continuous variable in all models to adjust for urinary dilution. Age and body mass index (BMI) were also modeled as continuous variables, whereas abstinence period was treated as an ordinal categorical variable. Race (white vs. other), smoking status (current smoker vs. former or never smoker), and timing of the clinic visit (i.e., time of collection of urine/blood/semen samples) by season (winter vs. spring, summer, or fall) and by time of day (0900-1259 hours vs. 1300-1600 hours) were considered for inclusion in the models as dichotomous variables. Covariates with a p-value < 0.2 in their relationship with one or more parabens or at least one outcome measure in the preliminary bivariate analyses were included in a "full" model. Covariates with a p-value > 0.15 in full models for all measures within the three sets of outcomes (hormone levels, semen quality, sperm DNA damage) were removed from the final models. Models for all outcome measures within each of the three sets of outcomes were adjusted for the same covariates to maintain consistency. For the independent variables of interest (urinary paraben concentrations), we considered p < 0.05 statistically significant. Because of the exploratory nature of the analysis, p-values < 0.1 were considered statistically suggestive.
volume 119 | number 2 | February 2011 • Environmental Health Perspectives
Because repeated urine samples were available for a subset of the men, two sets of models were constructed: a) using only urinary paraben concentrations from a single urine sample collected on the same day as the serum sample; and b) using the geometric mean urinary paraben concentration for each participant, where between one and three values were used to calculate each individual's geometric mean (i.e., the geometric mean for men with only one value was equal to that single value). In further sensitivity analyses, the multivariable models were rerun after excluding men with highly concentrated or highly dilute urine samples (SG > 1.03 or < 1.01) (Teass et al. 1998), and when using SG-corrected paraben concentrations rather than uncorrected urinary paraben concentrations but including SG as a covariate. Finally, to assess temporal variability in urinary paraben concentrations, we calculated Spearman correlations for paraben concentrations in the first and second urine samples among men with two urine samples. Using paraben data from all men, we also calculated the intraclass correlation coefficient (ICC), which is the ratio of between-subject variability to total variability (total variability = between-subject variability + within-subject variability), using SAS PROC MIXED.
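The per-participant geometric mean and the ICC described above can be illustrated with the following Python sketch. It is not a reproduction of the SAS PROC MIXED analysis: the ICC here uses a one-way ANOVA (method-of-moments) estimator for unbalanced data, swapped in as a simple stand-in, and the function names and the dictionary input format are assumptions. In practice the ICC would presumably be computed on ln-transformed concentrations.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive values; for one value it equals that value."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def icc_oneway(groups):
    """ANOVA estimate of ICC = between / (between + within) variance.

    groups maps each subject to a list of one or more repeated measurements.
    """
    sizes = [len(g) for g in groups.values()]
    k, n_total = len(sizes), sum(sizes)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two subjects and one repeated measurement")
    grand_mean = sum(sum(g) for g in groups.values()) / n_total
    means = {s: sum(g) / len(g) for s, g in groups.items()}
    ss_between = sum(len(g) * (means[s] - grand_mean) ** 2 for s, g in groups.items())
    ss_within = sum((y - means[s]) ** 2 for s, g in groups.items() for y in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Effective number of measurements per subject for unbalanced data.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    var_between = max(0.0, (ms_between - ms_within) / n0)  # truncate at zero
    total = var_between + ms_within
    return var_between / total if total > 0 else float("nan")
```

An ICC near 1 indicates that a single urine sample ranks men reliably, whereas an ICC near 0 indicates that most of the variability is within men over time.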

Results
Detection rates and distributions of uncorrected and SG-corrected urinary paraben concentrations from urine samples collected from 194 men on the same day as a serum sample (hormone levels) and/or semen sample (semen quality, sperm DNA damage) are presented in Table 1. Information on SG was missing for four urine samples. We detected MP in all samples and at the highest concentrations (median = 27.4 µg/L), followed by PP (median = 3.5 µg/L). BP was positively associated with age (Spearman r = 0.16; p = 0.03), but MP and PP were not. All three parabens were inversely associated with BMI (Spearman r, between -0.15 and -0.17; p-values < 0.05). Paraben concentrations were not associated with smoking status, time of day or season of urine sample collection, or duration of abstinence before semen sample collection. PP concentrations (p < 0.05), but not MP or BP concentrations (p > 0.05), were higher in nonwhite men than in white men. MP and PP concentrations were strongly correlated with one another (r = 0.82; p < 0.0001), and both were only weakly to moderately correlated with BP (both r = 0.39; p < 0.0001). Spearman correlation coefficients with urinary BPA were 0.29 for MP (p < 0.0001), 0.18 for PP (p = 0.01), and 0.11 for BP (p = 0.13).
Abbreviations: CI, confidence interval; T, testosterone. a Adjusted for SG, age, BMI, current smoking status, and time of day of blood/urine sample collection. b Model included ln-transformations for parabens and hormones; inhibin B, testosterone, free T, E2, free T4, and total T3 were modeled untransformed. c Models for T also adjusted for ln-transformed SHBG.
In addition to the 194 urine samples collected at the same visit as serum/semen sample collection, a second urine sample was later collected from 78 of the men, and a third urine sample was collected from 4 men. The amount of time between consecutive urine samples ranged from 3 to 75 days, with a median (25th, 75th percentile) of 29 (27, 34) days. Among the 78 men who provided two urine samples, paraben concentrations in the first and second urine samples were weakly correlated for MP (Spearman r = 0.36; p = 0.003) and PP (r = 0.25; p = 0.03), and moderately correlated for BP (r = 0.46; p < 0.0001). However, these values increased when using SG-corrected concentrations (r = 0.46, 0.31, and 0.58, respectively). When considering all 272 urine samples with paraben concentrations measured, the ICC was 0.26 (SG-corrected, 0.35) for MP and 0.18 (SG-corrected, 0.26) for PP. We did not calculate ICC for BP because of the high proportion of nondetectable concentrations. For comparison, the ICC for BPA in these samples was 0.08 (SG-corrected, 0.13).
Urinary paraben, covariate, and outcome data were available from 167 men for hormone levels, 190 men for semen quality parameters, and 132 men for sperm DNA damage measures. In exploratory bivariate analyses, we found no statistically significant correlations between uncorrected or SG-corrected concentrations of MP or PP and hormone levels, semen quality parameters, or sperm DNA damage (data not shown). We also found no associations in multivariable linear regression models adjusted for age, BMI, smoking, time of day, and duration of abstinence (Tables 2 and 3), although we found a suggestive inverse relationship between MP and TSH (Table 2), a suggestive positive association between MP and Tail% (Table 3), and a suggestive inverse relationship between PP and TDM (Table 3). Results were similar when we also considered repeated urinary paraben concentrations measured in samples collected weeks or months after the initial urine sample, although the relationships between MP and TSH and Tail%, and between PP and TDM, were somewhat weakened (i.e., regression coefficients were closer to zero; data not shown). Our findings were also consistent when using SG-corrected paraben concentrations in the models, as well as when excluding concentrated or diluted urine samples with SG > 1.03 or < 1.01 (data not shown).
Aside from a suggestive positive association with FAI, categories of urinary BP were also not associated with hormone levels or semen quality parameters (Table 4). However, BP categories were associated with a dose-related increase in Tail% (p for trend = 0.03). Overall, the results for BP were also consistent when modeling geometric mean or SG-corrected paraben concentrations (data not shown). Because we previously reported a positive association between urinary BPA concentration and Tail% in these same men (Meeker et al. 2010b) and because BP and BPA may impart similar biological effects (e.g., estrogenic), we investigated the possibility of an interaction between the two urinary biomarkers on Tail%. When we included both BP categories and BPA quartiles in the multivariable model, both were associated with significant dose-dependent increases in Tail% (Figure 1). However, we found no evidence of interaction in stratified analyses or when including a BP × BPA interaction term in the model (p = 0.25; data not shown), although the statistical power of the test for interaction was low because of a fairly small sample size.
Table 4. Adjusted linear regression coefficients for change in hormone levels (n = 167), semen quality (n = 190), and sperm DNA damage measure (n = 137) associated with categories of urinary BP.
Abbreviations: CI, confidence interval; T, testosterone. a Adjusted for SG, age, BMI, current smoking, and time of day of urine/serum/semen sample collection. b Adjusted for SG, age, BMI, abstinence period, current smoking, and time of urine sample. c Model included ln-transformations.

Discussion
As far as we are aware, this is the first human study to explore relationships between biomarkers of paraben exposure and male reproductive health. With the exception of a suggestive inverse association between MP and TSH, and a suggestive positive association between BP and FAI, we found no evidence for a relationship between MP, PP, or BP and altered hormone levels or conventional semen quality parameters. For sperm DNA damage, we observed a suggestive inverse association between PP and TDM, a suggestive positive association between MP and Tail%, and a statistically significant positive association between BP and Tail%. The relationship between BP and Tail% was not confounded by our recently reported relationship between urinary BPA concentrations and Tail% and may suggest additive effects on Tail% in relation to combined exposures to both BP and BPA. We also found an inverse relationship between BMI and urinary paraben concentrations, which may reflect differences in exposure (e.g., product use, diet) and/or paraben metabolism between people with differing BMI. Our observation that the only statistically significant relationship involved BP is consistent with in vitro and animal data that suggest BP has greater reproductive toxicity potential than the other parabens examined in this study (Darbre and Harvey 2008; Golden et al. 2005). However, we did not observe inverse relationships between BP and sperm concentration and testosterone, which were reported in subchronic studies of rodents exposed to BP at around 4 weeks of age (Oishi 2001, 2002a) or in utero (Kang et al. 2002). Studies investigating whether BP causes DNA damage are limited, although BP has been associated with genotoxicity in CHO-K1 Chinese hamster ovary cells (Tayama et al. 2008) and increased cell death and injury in rat hepatocytes [National Toxicology Program (NTP) 2004].
Our observation of a positive association with sperm DNA damage, as measured by Tail%, may also be consistent with previous reports that BP may be a suitable vaginal contraceptive because of its ability to inhibit acrosin (an enzyme that aids sperm penetration into the oocyte during the fertilization process), most likely by damaging the sperm membrane (Song et al. 1989, 1991). For example, the primary cause of sperm DNA damage is likely to be oxidative stress (Aitken and De Iuliis 2010; Aitken et al. 2009), which can also damage the sperm membrane (Sharma et al. 2004). Although not well studied, parabens caused oxidative stress in skin cells (Nishizawa et al. 2006), and BP was associated with decreased cellular levels of glutathione and protein-sulfhydryl groups in rat hepatocytes (NTP 2004).
Although it is currently unclear which comet assay parameter is the most relevant measure of sperm DNA damage, Tail% has been shown to be proportional to the frequency of DNA strand breaks (Olive et al. 1990). Tail% may also represent a more sensitive measure of DNA damage than both TDM and comet extent, because Tail% continues to increase with increased DNA damage whereas comet extent may not (McKelvey-Martin et al. 1993). Inconsistent results between the various DNA damage measures obtained by the neutral comet assay regressed on the same independent variable have been observed in previous studies (Meeker et al. 2004b). It has been hypothesized that the different comet assay parameters may reflect different types of DNA strand breaks (Meeker et al. 2004b): Specifically, because of the lack of correlation between TDM and Tail%, a high TDM may be more likely to be associated with double-strand breaks, whereas a high Tail% may reflect single-strand breaks. Thus, in the present study, BP was positively associated with Tail%, which may reflect a relationship between BP and single-strand breaks. However, it is possible that this relationship was a chance finding in our data, and future research is needed to confirm our results.
Paraben exposures in the present study were likely representative of those found among men in the U.S. general population. Median and 95th percentile concentrations recently reported in males participating in the National Health and Nutrition Examination Survey (NHANES) were, respectively, 23.7 and 491 µg/L for MP, 2.3 and 125 µg/L for PP, and < LOD and 3.2 µg/L for BP, compared with 27.4 and 258 µg/L for MP, 3.5 and 95.5 µg/L for PP, and < LOD and 4.1 µg/L for BP in the present study. It should be noted that paraben exposure was much higher among women than among men in NHANES; females had 75th percentile urinary MP, PP, and BP concentrations that were 4, 7, and 12 times higher, respectively, than those of males. Thus, human epidemiologic studies of female reproductive effects and adverse pregnancy outcomes in relation to paraben exposure should also be conducted.
We believe the present study is also the first to assess the temporal variability of paraben concentrations in urine. MP and PP concentrations were weakly correlated in repeated urine samples collected from the same individuals and had low ICCs (≤ 0.35). Repeated BP concentrations were moderately correlated. These measures of temporal reliability were greater than for urinary BPA concentrations among these men (Meeker et al. 2010a) but less than for concentrations of urinary metabolites of phthalates and nonpersistent pesticides among a different subset of men from the ongoing study (Hauser et al. 2004; Meeker et al. 2005). Thus, the lack of an association between urinary parabens and hormone levels or semen quality in the present study may be due to the presence of nondifferential (random) measurement error in our exposure estimates. Future studies should measure paraben concentrations in multiple urine samples collected over the exposure window of interest to reduce exposure measurement error. In the present study, the use of the geometric mean of repeated urinary paraben concentrations, when available, generally resulted in weaker associations with reproductive measures. This was consistent with our recent analysis of urinary BPA among these men (Meeker et al. 2010b). However, this may be because we collected the repeated urine samples weeks or months after the serum/semen samples used for measuring hormone levels, semen quality, and sperm DNA damage, whereas the exposure window of interest would likely be weeks or months leading up to the assessment of these measures.
The present study had a number of other limitations in addition to the likely presence of high temporal variability in paraben exposure levels. These include the availability of only a single blood or semen sample for the assessment of hormone levels, semen quality, and sperm DNA damage, which may also vary over time. The cross-sectional design of the present analysis also restricts our ability to draw conclusions regarding causal relationships, and the relatively small sample size provided low statistical power, which limited our ability to detect subtle relationships between urinary parabens and male reproductive health markers. For example, we observed adjusted regression coefficients of -0.08 and -0.06 for the relationships between sperm concentration and MP or PP, respectively. With our sample size of 190, we would have had 80% power to detect (α = 0.05) adjusted regression coefficients of approximately -0.22 and -0.15, respectively (Lenth 2009). Finally, because of the study's exploratory nature, we made a large number of statistical comparisons. Thus, we cannot rule out the possibility of chance findings to explain the observed relationship between BP and Tail%.

Conclusion
We found limited to no evidence of a relationship between paraben concentrations in urine and hormone levels or conventional semen quality parameters. However, it is possible that intraindividual variability in exposure and a modest sample size could have limited our ability to detect subtle relationships. Our observation of a relationship between BP and sperm DNA damage, in addition to the potential for additive effects from combined exposures to BP and BPA, warrants further investigation. Additional research on the potential for additive or multiplicative interactions between exposures to multiple environmental agents on male reproduction is also needed.