Exposure profiles in pregnant women from a birth cohort in a highly contaminated area of southern Italy

Protecting the health of pregnant women from environmental stressors is crucial for reducing the burden of non-communicable diseases. In industrially contaminated sites, this is particularly challenging because of the heterogeneous pollutant mixtures present in environmental matrices. The aim of this study was to evaluate the distribution patterns of mercury, hexachlorobenzene and polychlorobiphenyls in the serum of 161 pregnant women recruited in the framework of the Neonatal Environment and Health Outcomes (NEHO) cohort and living both inside and outside the National Priority Contaminated Site (NPCS) of Priolo. Food macro-categories were determined, and serum levels of contaminants were used to perform k-means cluster analysis and identify the role of food in pollutant transfer from the environment. Two groups of mothers, with high and low measured pollutant levels, were distinguished. Concentrations in the high-exposure cluster were at least twofold higher for all the evaluated pollutants (p < 0.0001); this cluster included mothers living inside and outside the NPCS, with a predominance of individuals from the NPCS (p = 0.045). Fish consumption was higher in the high-exposure cluster (p = 0.019). These findings suggest a link between the contamination of environmental matrices, such as sediment, and maternal exposure through the intake of local food. This pathway appears poorly investigated in the context of contaminated sites.

Questionnaire. The present study combines a variety of information from a subset of questionnaires with the aim of shedding light on possible associations between lifestyle and detected concentrations of pollutants in serum. Mothers enrolled in the NEHO cohort were asked to fill in different questionnaires. The "Baseline-first part" questionnaire provided information on maternal health and lifestyle during the gestational period. Some of the questions were aimed at retrieving information about the consumption habits of different types of food.
The maternal characteristics and socio-environmental factors considered were age, body mass index before pregnancy (BMI), marital status, weeks of gestation and educational level. Educational level was originally categorised into four levels: "Elementary school", "Middle school", "High school" and "Degree or higher qualification". Here, it was recategorised on the basis of years of education: "Elementary school" and "Middle school" were merged into "0-8 years of education", while "High school" and "Degree or higher qualification" were recoded as "8-13 years of education" and "more than 13 years of education", respectively. Because the present study also focuses on the frequency of food consumption, the original questionnaire items on the consumption frequency of the considered categories were recoded as follows: "Never" as 0 days/month; "Once per month" as 1 day/month; "2/3 times per month" as 2.5 days/month; "Once per week" as 4 days/month; "2/3 times per week" as 10 days/month; "4/5 times per week" as 18 days/month; "Every day" as 30 days/month. A standard portion of each food was then used to compute the total amount (in grams) consumed by each mother in one month. According to the National Recommended Energy and Nutrient Intake Levels 28 , a standard fish portion corresponds to 150 g; 100 g is the standard portion for each type of meat; dairy products were expressed as the sum of yogurt (125 g), milk (125 g) and fresh cheese (50 g); 200 g was used for eggs and vegetables, except for leafy and stem vegetables, for which a standard portion of 80 g was assumed. The quantity of vegetables consumed was computed as the sum of stem vegetables, leafy vegetables, Brassicaceae, and raw and cooked vegetables.
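The recoding described above can be sketched in a few lines of Python. This is only an illustration, not the code used in the study: the function and dictionary names are hypothetical, the label attached to the 18 days/month class is an assumption, and one standard portion per day of consumption is assumed.

```python
# Days per month assigned to each questionnaire answer (values from the text).
DAYS_PER_MONTH = {
    "Never": 0.0,
    "Once per month": 1.0,
    "2/3 times per month": 2.5,
    "Once per week": 4.0,
    "2/3 times per week": 10.0,
    "4/5 times per week": 18.0,  # assumed label for the 18 days/month class
    "Every day": 30.0,
}

# Standard portions in grams for a few of the categories listed in the text.
PORTION_G = {
    "fish": 150.0,
    "meat": 100.0,
    "eggs": 200.0,
    "leafy_vegetables": 80.0,
}


def grams_per_month(food: str, answer: str) -> float:
    """Monthly intake = days of consumption per month x standard portion.

    Assumes one standard portion on each day of consumption.
    """
    return DAYS_PER_MONTH[answer] * PORTION_G[food]


# A mother reporting fish "Once per week": 4 days/month x 150 g.
print(grams_per_month("fish", "Once per week"))  # 600.0
```

Under these assumptions, composite categories such as total vegetables would be obtained by summing the monthly grams of their items.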
In addition, to build the logistic regression models, levels of consumption were regrouped to obtain a more homogeneous distribution among the categories of fish and vegetable consumption. To this aim, fish consumption was categorised into three levels: 0 for no consumption, 1 for consumption once a month, and 2 for consumption 2/3 times a month or more. For vegetables, "no" and "low" (once a month) consumption were merged into level 0; level 1 referred to consumption from 2/3 times a month to once a week; level 2 was assigned to mothers with a "greater" consumption (2/3 times a week or more). The different classifications adopted for fish and vegetables reflect the different frequency distributions of the two food items (i.e., while the "no consumption" class for fish was sufficiently populated, the same class for vegetables was almost empty).
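A minimal sketch of this three-level coding, working on the days/month scale, is given below. The function names are hypothetical and the cut-offs simply restate the classes described in the text.

```python
def fish_level(days_per_month: float) -> int:
    """0 = no consumption, 1 = once a month, 2 = 2/3 times a month or more."""
    if days_per_month == 0:
        return 0
    if days_per_month <= 1:
        return 1
    return 2


def vegetable_level(days_per_month: float) -> int:
    """0 = never or once a month, 1 = 2/3 times a month up to once a week,
    2 = 2/3 times a week or more."""
    if days_per_month <= 1:
        return 0
    if days_per_month <= 4:
        return 1
    return 2
```

Note that the same frequency (for example 2.5 days/month) maps to level 2 for fish but to level 1 for vegetables, which is the asymmetry motivated in the paragraph above.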
Analytical procedure. Maternal blood samples were collected during the last trimester of pregnancy. Serum was separated by centrifugation and temporarily stored at −20 °C in each maternal unit before being transported on dry ice to the NEHO biobank for long-term storage at −80 °C until analysis. Analyses of POPs (HCB and three PCB congeners: 138, 153 and 180) in maternal serum were performed at the National Institute for Health and Welfare, Chemical Exposure Unit, Kuopio, Finland, with an Agilent 7000B gas chromatograph triple quadrupole mass spectrometer (GC-MS/MS). Ethanol and 13C-labelled internal standards were added to samples. Dichloromethane-hexane was added for extraction, followed by the addition of activated silica gel to bind the sample water and ethanol. The dichloromethane-hexane layer was poured into a solid phase extraction (SPE) cartridge containing 44% sulphuric acid silica, 10% silver nitrate impregnated silica and a mixture of sodium sulfate and silica. The lower semi-solid layer was extracted again with dichloromethane-hexane, which was also poured into an SPE cartridge. Elution of the SPE cartridge was continued with dichloromethane-hexane, and the eluate was concentrated for GC-MS/MS. Quantification was performed by multiple reaction monitoring using an Agilent 7890A gas chromatograph/Agilent 7010 triple quadrupole mass spectrometer with a DB-5MS UI column (J&W Scientific, 20 m, ID 0.18 mm, 0.18 µm). Reference materials for organic contaminants in human serum (SRM 1589a, National Institute of Standards and Technology, Gaithersburg, MD, USA) were analysed to estimate accuracy. Recoveries ranged between 96 and 104% for each PCB and HCB analyte. Analytical precision was routinely better than 3% RSD.
Total serum triglycerides and cholesterol concentrations were assayed by certified spectrophotometric methods (Randox Laboratories, Crumlin, UK) at the Institute of Clinical Physiology of the National Research Council, Pisa, Italy. Total lipids were calculated according to the following equation 29 :
Total lipids (mg/dL) = 1.12 × total cholesterol + 1.33 …
Lipid-normalized organochlorine concentrations were calculated by dividing wet weight concentrations by total lipids and were expressed as ng/g of total lipids.
Analyses of Hg were performed at the LERES laboratory (Laboratoire d'Etude et de Recherche en Environnement et Santé) of the French School of Public Health (EHESP; Rennes, France), following the procedures described by Davies et al. 30 . The 161 serum samples were analysed by inductively coupled plasma tandem mass spectrometry (ICP-MS/MS 8800, Agilent Technologies) after mineralisation with nitric acid in a heating block (Hotblock Pro, model SC-189, Environmental Express) at 83 °C for 4 h. Matrix effects were corrected through the use of internal standards (Sc, Ge, 77Se, Rh, Re and Ir). All internal standards were quantified in samples with less than 25% variation. Certified or internal control materials (measured additions) of blood and serum (Utak level 1, Seronorm level 1) were added to each series in order to verify the different stages of the procedure and to cover the set of blood matrices. The results were validated because the concentrations of the controls fell within the limits of the control charts. This procedure was accredited by the French accreditation committee (CoFrac) in January 2020.
Concentrations below the limit of quantification (LOQ) were replaced by LOQ/2.
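The two data-preparation steps above (LOQ/2 substitution, then lipid normalization) can be illustrated as follows. This is a sketch under stated assumptions, not the laboratory's procedure: the function names are hypothetical, wet-weight concentrations are assumed to be in ng/mL, total lipids are passed in as mg/dL (already computed from the equation given in the text), and the numbers in the example are invented.

```python
def impute_below_loq(value, loq):
    """Replace a non-quantified result (None or below the LOQ) with LOQ/2."""
    if value is None or value < loq:
        return loq / 2.0
    return value


def lipid_normalize(wet_ng_per_ml, total_lipids_mg_per_dl):
    """Convert a wet-weight concentration (ng/mL serum) to ng/g of lipids.

    mg/dL -> g/mL: divide by 1000 (mg -> g) and by 100 (dL -> mL).
    """
    lipids_g_per_ml = total_lipids_mg_per_dl * 1e-5
    return wet_ng_per_ml / lipids_g_per_ml


# Hypothetical example: 0.1 ng/mL of an analyte, total lipids of 800 mg/dL.
wet = impute_below_loq(0.1, loq=0.02)
print(lipid_normalize(wet, 800.0))  # about 12.5 ng/g lipids
```

The substitution is applied before the division by total lipids, in line with the note to Table 2.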

Statistics.
Women were grouped according to their serum pollutant levels. An unsupervised k-means algorithm was applied to the scaled base-2 logarithms of the concentrations: concentrations were log-transformed to approximate a normal distribution and then standardized to bring them onto the same scale. The Shapiro-Wilk test was used to determine whether the variables came from a normal distribution. The optimal number of clusters was estimated by computing both the Within Cluster Sums of Squares (WCSS), for the Elbow method, and the average silhouette. The classification into clusters was used as a factor for testing associations with the area of residence (NPCS or local reference areas, LRAs), as well as with other relevant qualitative variables, using a Chi-squared test or a Fisher exact test, as appropriate. Differences between the identified clusters in quantitative variables, including the quantities of food consumed, were assessed by means of a Mann-Whitney U test.
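The clustering pipeline (log2 transform, standardization, k-means, WCSS for the Elbow method) can be sketched as below. This is a plain-Python illustration of Lloyd's algorithm with random restarts, not the R routine used in the study; the function names are hypothetical, the average silhouette is omitted for brevity, and the six rows of data are invented.

```python
import math
import random


def log2_standardize(rows):
    """log2-transform every value, then z-score each column (sample SD)."""
    logged = [[math.log2(v) for v in row] for row in rows]
    columns = list(zip(*logged))
    means = [sum(c) / len(c) for c in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in logged]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points, k, rng, max_iter=100):
    """One run of Lloyd's algorithm from k randomly chosen starting points."""
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable: converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, wcss


def kmeans(points, k, n_start=10, seed=0):
    """Keep the restart with the smallest within-cluster sum of squares."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(n_start)),
               key=lambda result: result[1])


# Hypothetical (Hg, HCB, sum of PCBs) values for six women; not study data.
toy = [[0.30, 4.0, 15.0], [0.40, 5.0, 18.0], [0.35, 4.5, 16.0],
       [1.20, 10.0, 50.0], [1.50, 12.0, 60.0], [1.30, 11.0, 55.0]]
scaled = log2_standardize(toy)
wcss_by_k = {k: kmeans(scaled, k)[1] for k in (1, 2, 3)}  # elbow curve
labels, _ = kmeans(scaled, 2)
print(labels, wcss_by_k)
```

Plotting `wcss_by_k` against k and looking for the point where the decrease flattens is the Elbow heuristic mentioned above; in practice a dedicated library routine would normally be preferred over this hand-written version.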
To identify the variables to be introduced in a multivariable model, univariable logistic regression models were used to test the dependence of cluster membership on the consumption of each food category and on sociodemographic predictors. Only predictors significant at the 10% level in the univariable analysis were then included in a multivariable logistic model, in order to limit the number of predictors given the small size of the study sample.
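The screening step can be illustrated with a small self-contained sketch: one logistic model per predictor, fitted by Newton-Raphson, with a Wald test on the slope and a 10% threshold. This is not the authors' R code; the function names, predictor names and data are hypothetical, and the Wald p-value is only one of several tests that could be used for screening.

```python
import math


def fit_univariable_logit(x, y, n_iter=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson.

    Returns (b1, standard error of b1, two-sided Wald p-value).
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0  # score vector and information matrix
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + I^{-1} * score (2 x 2 inverse written out)
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)  # from the information at the last iterate
    z = b1 / se_b1
    return b1, se_b1, math.erfc(abs(z) / math.sqrt(2.0))


def screen(predictors, y, alpha=0.10):
    """Names of predictors whose univariable Wald p-value is below alpha."""
    return [name for name, x in predictors.items()
            if fit_univariable_logit(x, y)[2] < alpha]


# Invented data: 20 subjects, outcome 1 = high-exposure cluster.
y = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
predictor_a = [1] * 10 + [0] * 10   # associated with the outcome
predictor_b = [1, 0] * 10           # unrelated to the outcome
print(screen({"predictor_a": predictor_a, "predictor_b": predictor_b}, y))
```

For a binary predictor, the fitted slope equals the log odds ratio of the corresponding 2 × 2 table (here log 16), which gives a simple way to check the fit.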
To assess the contribution of dietary habits and socio-demographic characteristics, two multivariable models were implemented: one including only food items, and the other including socio-demographic variables (i.e., maternal age, area of residence and educational level). A further model including both types of variables was also built. In addition to the procedure described above, a LASSO regression including all the food categories and the socio-demographic predictors was performed by means of the IsLASSO R package 31 , and the results were compared.
Moreover, a Weighted Quantile Sum (WQS) regression 32,33 was performed to assess the impact of food consumption in relation to exposure clustering, including only the predictors significant in the univariable logistic models. The repeated holdout procedure 34 , with 100 repetitions, was used to stabilize results. For each repetition, 100 bootstraps were implemented, for a total of 10,000 estimates. A 30%/70% training/testing split was used.
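The resampling scheme (not the WQS estimation itself) can be sketched as follows. The function name is hypothetical, the split is read as 30% training and 70% testing as stated in the text, and rounding the training size to the nearest integer is an assumption.

```python
import random


def repeated_holdout(n_subjects, n_rep=100, train_frac=0.30, seed=0):
    """Return n_rep random (training, testing) partitions of the subject indices."""
    rng = random.Random(seed)
    n_train = round(n_subjects * train_frac)  # rounding rule is an assumption
    splits = []
    for _ in range(n_rep):
        order = list(range(n_subjects))
        rng.shuffle(order)
        splits.append((order[:n_train], order[n_train:]))
    return splits


splits = repeated_holdout(161)            # 161 women, 100 repetitions
n_bootstrap = 100                         # bootstraps within each repetition
total_estimates = len(splits) * n_bootstrap
print(len(splits[0][0]), len(splits[0][1]), total_estimates)  # 48 113 10000
```

In each repetition, the weights would be estimated on bootstrap samples of the training set and the mixture coefficient on the testing set; summarising over repetitions is what stabilizes the results.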
A heatmap was used to graphically show the representative levels of blood pollutants in each cluster. p-values < 0.05 were considered significant. All the analyses were performed in R, version 4.1.3 35 .

Ethics approval.
The NEHO study protocol was approved by the Ethics Committee "Catania 2" for the NPCS of Priolo (11 July 2017, n. 38/2017/CECT2) and strictly followed the Declaration of Helsinki. Each participant read the information sheet and signed the informed consent. The participant information sheet is …

Results
Study sample characteristics. Table 1 reports a description of the enrolled women with relevant demographic and socio-economic traits, separately by residence (LRAs or NPCS). Mean age (± SD) was 30.7 ± 4.7 years, with no difference between the two groups (p = 0.811). Similarly, BMI and weeks of gestation did not differ significantly. The association between educational level and location was significant (p = 0.023), with a larger percentage of mothers with a higher educational level living in the NPCS. Marital status (married, never married/separated) was not significantly different (p = 0.800), with the highest percentage of married women (65.0%) reflecting the distribution observed in the whole NEHO cohort 36 . Table 2 reports the pollutant concentrations in maternal serum of residents in LRAs and NPCS. Pollutant levels in the serum of women living in the highly contaminated area were significantly higher than those detected in samples from the reference areas, with the exception of Hg (p = 0.402).

K-means clustering.
From the k-means cluster analysis, both the Elbow and Silhouette methods identified 2 as the optimum number of clusters. Table A3 in the Supplementary Materials reports the measured pollutants in the two clusters, highlighting significant differences for all the pollutants measured in serum. Table 3 shows the mothers' distribution in the two clusters by their socio-economic traits. In particular, the association between clusters and area of residence (NPCS vs LRA) was significant (p = 0.045), with the largest percentage of women living in the NPCS belonging to the H-Exp cluster (47 of 77, 61%). In addition, individuals from this cluster were older than mothers with lower levels of contaminants (p < 0.001) and had higher educational levels (p = 0.018). The role of food consumption as a driver of contamination in mothers was investigated by means of the Mann-Whitney U test. Among the considered food categories (meat, milk, eggs, fish and vegetables), consumption of fish and vegetables was significantly higher in the H-Exp cluster than in the L-Exp cluster (p = 0.019 and p = 0.017, respectively; Table 4).
Figure 2 shows the geographical distribution of mothers in the Priolo area according to the k-means clustering (H-Exp and L-Exp) and their area of residence (LRAs vs NPCS). Residences of the mothers from the H-Exp cluster are shown in red, while those from the L-Exp cluster are in green. Moreover, circles and triangles discriminate between the mothers residing in the NPCS and LRA, respectively. In the municipality of Augusta, within the NPCS, most of the mothers were associated with the H-Exp cluster (Fig. 2B). Regarding food consumption, Fig. 3 (upper panel) shows the average consumption of the fish categories considered in the two k-means clusters (high- and low-exposure levels). The p-values from the Mann-Whitney U test are also reported in the corresponding graphs. The average consumption of "Fresh caught fish", "Blue fish" and "Farmed fish" differed significantly between the two clusters, with the women belonging to the H-Exp cluster consuming larger quantities of fish. By contrast, "Shellfish" consumption did not differ significantly between the two groups; shellfish are barely present in the diet of all the individuals studied. The average consumption of cooked vegetables was significantly higher in women belonging to the H-Exp cluster (p = 0.014). Notably, while mothers of the H-Exp cluster preferentially consumed fish of local origin, no differences were found in terms of the provenance of vegetables (see Supplementary Material Table A4). However, in both cases, the consumption of products of local origin exceeded 70% of preferences.
Table 1. Socio-demographic characteristics of the study population. LRA local reference area, NPCS national priority contaminated site, SD standard deviation; *p-value from Mann-Whitney U test for the differences between the quantitative variables and the residence areas; # p-value from a Chi-square test for the differences between the qualitative variables and the residence areas. Significant values are in bold.
Figure 4 shows the beta coefficients and their confidence intervals obtained from the univariable logistic models for fish (Panel A) and vegetable (Panel B) items. In the multivariable logistic model with a stepwise selection procedure, assessing the independent role of food categories in determining the clustering, "Blue fish" consumption was the only predictor retained in the final model (β = 0.91; odds ratio, OR = 2.49; p = 0.028).
The multivariable model for socio-demographic variables showed that only age (β = 0.25; OR = 1.29; p < 0.0001) and area of residence (for LRA, β = −0.84; OR = 0.43; p = 0.022) were significant predictors. When all the considered variables were simultaneously included in a single multivariable model, the effect of "Blue fish" decreased and the variable was dropped from the final model. The LASSO regression produced the same result: the only significant predictor was maternal age (β = 0.228, p < 0.0001).
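As a reading aid, each odds ratio reported above is the exponential of the corresponding logistic coefficient. The short sketch below (hypothetical function name) recomputes them from the rounded β values; the small differences from the reported figures presumably reflect rounding of β.

```python
import math


def odds_ratio(beta: float) -> float:
    """Odds ratio associated with a logistic regression coefficient."""
    return math.exp(beta)


# Coefficients as reported in the text (rounded to two decimals).
for label, beta in [("Blue fish", 0.91), ("Age", 0.25), ("LRA residence", -0.84)]:
    print(f"{label}: beta = {beta:+.2f}, OR = {odds_ratio(beta):.2f}")
```

An OR below 1, as for residence in the LRAs, indicates lower odds of belonging to the H-Exp cluster than in the reference category.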
We then tested the relationship between "Blue fish" consumption and age: a significant association between the two variables was found, with older mothers consuming a greater amount of fish (p = 0.003). Moreover, in order to understand whether fish consumption increased the risk of belonging to the H-Exp cluster independently of age, a subanalysis was performed on the subgroup of individuals (n = 22) who did not consume fish. Of these mothers, 68% (15/22) were in the L-Exp group, while 32% (7/22) were in the H-Exp group. Age was not a significant predictor of cluster membership in a univariable logistic model in this subgroup.
Figure A2 in Supplementary Material reports the barplot showing the weights assigned to each variable by the WQS model, ordered from highest to lowest. The beta coefficient of the mixture was 0.62 and was not significant (95% CI: −0.21 to 1.45, including zero). However, the results seem to indicate that "Shellfish" and "Blue fish" are the greatest contributors to the mixture effect.
Table 3. Socio-demographic characteristics of the two clusters. LRA local reference area, NPCS national priority contaminated site, SD standard deviation; *p-value from Mann-Whitney U test for the differences between the quantitative variables and the two clusters; # p-value from Chi-square test for the differences between the qualitative variables and the two clusters. Significant values are in bold.

Discussion
Comparison of biomonitoring results with worldwide databases. The results of the k-means cluster analysis applied to the 161 pregnant women suggest that (i) maternal residence only partially explains the higher levels of contaminants in cluster comparisons; (ii) the "higher exposure cluster" is characterised by a relatively higher consumption of local fish; (iii) women in the high exposure group are significantly older and have a higher educational level than those in the low exposure group. In general, the concentrations of organochlorine (OC) compounds measured in this study were lower than those reported in pregnant women from other European countries such as Poland 37 , Norway 38 , the Netherlands 39 , Denmark 40 and Spain 41-43 . Similarly, a study conducted near the industrial area of Brescia, in northern Italy, reported higher OC levels in maternal serum 44 than those found in this survey. Conversely, PCB concentrations were higher than those previously found in Japanese 45 , Canadian 46 and U.S. 47-49 studies. Moreover, the levels found in our sample were very close to those reported by the multicentre European birth cohort study HELIX 50 , based on data produced in six different European countries. To our knowledge, only two studies have reported concentrations of total Hg in the serum of pregnant women 51,52 . In particular, Yau et al. performed a case-control study to test the association between serum Hg levels and autism spectrum disorders, without documenting a meaningful association 51 . The second study was conducted in Croatia, where Sekovanic et al. analysed Hg in serum from mothers living in both continental and coastal areas 52 . This latter study found significant differences in Hg concentration between the two areas but, in both cases, the levels of Hg reported were lower than in our data. In 2011, Alimonti et al. piloted a wide biomonitoring survey in Italy, the PROBE study (PROgramme for Biomonitoring general population Exposure), which assessed the internal dose of 20 metals in a representative sample of the Italian population 53 . The levels of Hg in the Italian female population were lower on average than those found in our sample, in particular among women residing inside the NPCS (arithmetic mean = 0.70 μg/L and 0.88 μg/L, respectively). This also reflects the outcomes of previous investigations in the Priolo area 16,54 and indicates a crucial exposure of the local population to Hg.
Environmental contaminants and exposure pathways in the Priolo site. As shown in Table 1, no major differences in the socioeconomic variables were found between the mothers enrolled in the NPCS and LRAs, with the exception of the educational level, which appears higher in the NPCS. The comparison between the serum levels of selected contaminants in mothers from the NPCS and LRAs (Table 2) shows a significantly higher concentration of HCB and PCBs in the NPCS group, while the higher concentration of Hg was not statistically significant (p = 0.402). Remarkably, unlike other studies 55,56 , in our sample maternal Hg serum concentration was not associated with dental amalgam (p = 0.603; median and [IQR] of 0.60 µg/L [< LOQ-1.20 µg/L] and 0.66 µg/L [< LOQ-1.22 µg/L] for mothers without and with dental amalgam, respectively). In addition, although higher concentrations of the same group of measured contaminants were found in other studies, the exposure levels required for endocrine disruption during pregnancy are reported to be extremely low 57,58 . Moreover, synergistic or additive effects between pollutants have been increasingly documented 59-61 . In light of this concern, we performed a k-means cluster analysis aimed at identifying groups of mothers with different exposure levels, and identified two distinct clusters of women based on serum contaminant concentrations. The two clusters show profiles of cumulative chemical exposure that might be associated with first-order indices of impact on health outcomes 62 . Specifically, pollutant concentrations in mothers from the H-Exp group were at least twofold higher than in the L-Exp group (Table A3 in Supplementary Materials). As mentioned above, the H-Exp group contained 61% of mothers living within the NPCS and 39% residing in the control area. This emphasises that higher contaminant levels can be found not only in individuals living in the NPCS, but also in the LRAs, and that pollutants may reasonably be associated with common sources and pathways of contamination. Plausibly, local food and associated diets, reflecting the impact of environmental contamination, may represent a major pathway for transferring pollutants to humans, including populations living at some distance from the emission site but 'linked' to the same supply chains. Such a 'food hypothesis' is also corroborated by evidence that mothers in the H-Exp group were characterised by significantly higher levels of fish consumption. Mean total fish consumption in the H-Exp group (929.4 g/month) is in line with a previous study of fish consumption in 17 European birth cohorts (plus one American), which found an overall mean consumption of 1.5 times/week, corresponding to about 900 g per month 63 . In the same study, the average consumption of fish for the Spanish birth cohort was 4.5 times/week, three times that found in our cohort. These data could partially explain the higher levels of OCs found in that cohort compared with our samples. The differences in fish consumption between the two clusters remain significant even when individual fish categories are considered, both in terms of grams/month (Fig. 3) and consumption frequency in univariable models (Fig. 4). Moreover, those in the cluster characterised by the highest exposure levels preferably consumed local fish (Supplementary Material Table A4). Traina et al. reported a systematic correlation of PCBs (considering the same congeners), HCB and Hg between benthic commercial fish and marine sediments in Augusta Bay, thus demonstrating a robust fingerprinting of contamination pathways 11 . This suggests a link between the highly polluted marine sediments of Augusta Bay, which primarily drive benthic fish contamination, and the higher levels of the same contaminants in pregnant women whose diets are characterised by a preference for local fish. In particular, among the fish categories, "Blue fish" was the only variable retained in the multivariable model by the stepwise procedure.
Interestingly, we found that the H-Exp cluster was composed of a higher percentage of women with a higher educational level. This result is in agreement with our recent study 27 showing that, in pregnant women, higher educational level and older age appear to enhance attention toward a "healthy" dietary pattern characterised by higher fish (blue fish in particular) and vegetable consumption. Nevertheless, such a diet, in a highly contaminated area, could produce a counterintuitive effect, with a higher risk of exposure to environmental pollutants. With regard to fish consumption and its origin, our results confirm the findings of the numerous studies carried out in the same areas on the risk of consuming local fish severely impacted by polluted sediments.
Notably, the H-Exp group consisted mainly of older women, suggesting bioaccumulation effects for all the analysed pollutants 64-66 . We are aware of the difficulty of distinguishing among age, bioaccumulation due to ageing, and the significant association found between age and blue fish consumption. In this regard, in order to assess whether belonging to the high exposure cluster still depends on age in the subgroup of individuals who do not consume fish, we applied a logistic model to the 22 mothers with a fish-free diet. The beta coefficient of age from the univariable logistic model was not significant. This result, despite the small number of subjects, seems to suggest that the association between cluster and age depends on the consumption of polluted food.
To our knowledge, this is the first biomonitoring study investigating serum levels of Hg and OCs in a sample of pregnant women residing in an NPCS. In our view, the present work has some limitations. The first is recruitment, performed on a voluntary basis, which could have been influenced by a similar sociocultural level of the participants, united by a common interest in the health-related aspects of living in highly polluted areas. Moreover, it remains difficult, mainly because of the small sample size, to disentangle the existing relationships among age, dietary pattern, socioeconomic status and exposure level. In fact, while age and socioeconomic status can influence dietary habits, age is in itself a risk factor for bioaccumulation by several routes, including diet.
Despite these limitations, our findings highlight an urgent need to inform pregnant women living in highly contaminated areas about the risk arising from pollutants 67 , as well as to suggest healthy lifestyle habits and diets, even outside the pregnancy period. Remarkably, the transfer routes by which pollutants move across the food chain and potentially reach humans through the daily diet appear to be priority areas of research. This should inspire and support urgent large-scale studies to address possible intervention policies for mitigating environmental impact on highly sensitive population subgroups.
Figure A1 (panels A and B in Supplementary Material) shows the values of the two indices for different values of k. The k-means procedure subdivided the sample into high-exposure (H-Exp) and low-exposure (L-Exp) groups. The heatmap in Fig. 1 shows the average values of the pollutants in the two clusters (Panel A). Panel B shows how individuals were grouped into the two clusters based on the concentration levels of Hg, HCB and PCBs.

Figure 1. (A) Heatmap generated by k-means clustering analysis; (B) groups of individuals in the two clusters identified according to the levels of Hg and sum of the PCBs. Points aligned at the bottom of the figure represent the non-quantifiable values (i.e., values below the LOQ).

Figure 2. Geographical distribution of mothers in the Priolo area according to the k-means clustering (low vs high pollutant levels) and area of residence (NPCS vs LRA). (A) Refers to the entire study area. (B) Refers to the municipality of Augusta and to Augusta Bay. The maps were created using the OpenStreetMap package (https://cran.r-project.org/package=OpenStreetMap) of R version 4.1.3.

Figure 3. Average fish (upper panel) and vegetable (lower panel) consumption in the two clusters. The rhombus indicates the average consumption in each cluster; the bar indicates the standard error. p-values are from the Mann-Whitney U test.

Figure 4. Forest plot of the results of the univariable logistic models built for each fish (A) and vegetable (B) category. Each point shows the relevant estimate; horizontal bars refer to 95% confidence intervals.

Table 2. Maternal serum contaminant levels. Serum levels of HCB and PCBs were normalized to total lipid content and reported in ng/g lipids. LRA local reference area, NPCS national priority contaminated site, LOQ limit of quantification, SD standard deviation, IQR interquartile range; *p-values from Mann-Whitney U test for the differences in pollutant levels between residence areas; ΣPCB a: sum of PCB138, PCB153 and PCB180 congeners; concentrations below the LOQ were replaced by LOQ/2 before lipid adjustment. Significant values are in bold.

Table 4. Consumption of food in g/month in the two clusters identified by the k-means procedure. *p-values computed for the differences between the two clusters by Mann-Whitney U test. Significant values are in bold.

The median values of HCB and PCBs found in the H-Exp group exceeded the values reported in the above-mentioned work of Montazeri et al., which reported serum OC levels from six birth cohorts of different European countries (HCB median values = 9.74 vs 8.20 ng/g; PCB138 = 11.66 vs 9.1 ng/g; PCB153 = 21.66 vs 17.6 ng/g and PCB180 = 16.26 vs 10.4 ng/g, respectively, for our sample and Montazeri et al.) 50 .