A Hybrid Bayesian Network Framework for Risk Assessment of Arsenic Exposure and Adverse Reproductive Outcomes
Introduction
Conventional health risk assessment methods are of limited use for analyzing the actual risk of arsenic exposure, and they cannot readily interpret the large amounts of data that could lead to a better understanding of uncertainties (Orak et al., 2019; USNRC, 2013; Wilson, 2001). A report by the U.S. National Research Council (USNRC, 2013) recommends data-driven approaches over default practices for assessing the multiple effects of inorganic arsenic. Complex integrated systems, in which physical, biological, and human systems interact, require a multidisciplinary approach (McCann et al., 2006). Therefore, we aimed to explore a prenatal inorganic arsenic exposure system (network) and the strength of its interactions (causal relationships) by using Bayesian belief network (BBN) modeling.
Bayesian networks were developed in the late 1980s to represent probabilistic dependency models as directed acyclic graphs (DAGs), which make the probabilistic relationships between variables explicit (Newton, 2009; Pearl, 1988a). BBNs are powerful decision-making tools and are relatively simple compared to other modeling approaches (Pollino and Henderson, 2010). By integrating qualitative expert knowledge, equations, probabilistic modeling, and empirical data, BBNs bring a holistic approach to understanding the important pathways in networks that are not easily expressed by mathematical equations (Gat-Viks et al., 2006; Pearl, 1988b; Tighe et al., 2013). BBNs have been used to analyze problems and to plan, monitor, and evaluate cases of varying size and complexity in several different disciplines (Beaudequin et al., 2015; Weber et al., 2012; Yang et al., 2016).
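To make the DAG-plus-probability-table idea concrete, the following is a minimal sketch of a three-node discrete network with inference by enumeration. The chain structure (exposure, methylation, low birth weight), the state names, and every probability are hypothetical placeholders chosen for illustration; they are not the network or the values of this study.

```python
# Hypothetical three-node chain: Exposure -> Methylation -> LowBirthWeight.
# All probabilities are illustrative placeholders, NOT values from the study.
P_EXPOSURE = {"high": 0.3, "low": 0.7}
P_METHYLATION = {  # P(methylation | exposure)
    "high": {"efficient": 0.4, "poor": 0.6},
    "low": {"efficient": 0.7, "poor": 0.3},
}
P_LBW = {  # P(low birth weight | methylation)
    "efficient": {"yes": 0.05, "no": 0.95},
    "poor": {"yes": 0.15, "no": 0.85},
}


def joint(exposure, methylation, lbw):
    """Joint probability from the DAG factorisation P(E) * P(M|E) * P(L|M)."""
    return (
        P_EXPOSURE[exposure]
        * P_METHYLATION[exposure][methylation]
        * P_LBW[methylation][lbw]
    )


def posterior_exposure(lbw_observed):
    """P(exposure | observed birth-weight state), summing out methylation."""
    unnormalised = {
        e: sum(joint(e, m, lbw_observed) for m in ("efficient", "poor"))
        for e in P_EXPOSURE
    }
    total = sum(unnormalised.values())
    return {e: p / total for e, p in unnormalised.items()}


if __name__ == "__main__":
    # Observing low birth weight raises the belief in high exposure
    # above its prior of 0.3 under these placeholder tables.
    print(posterior_exposure("yes"))
```

Enumeration is practical only for very small networks; dedicated tools use more efficient inference algorithms, but the factorisation along the graph is the same.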
BBNs apply Bayes' theorem (also known as Bayes' rule or Bayes' law), which was first derived by Thomas Bayes and published in 1764 (Murphy, 2012). According to Bayes' theorem, the prior probability encodes what is believed about a parameter before the data are observed, and the posterior probability is obtained by updating that prior with the likelihood of the observed data (Su et al., 2013). This feature differentiates Bayesian statistical models from ordinary non-Bayesian statistical models, because the Bayesian approach is a mixture of ordinary linear models and a joint distribution over the measured variables (Spirtes et al., 1993). Bayes' rule (Eq. (1)) continuously updates the belief probability of each node in the network (Murphy, 2012; Tang et al., 2016).
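For reference, Bayes' rule is commonly written in the standard form below. The notation is an assumption made here (H denotes a hypothesis or node state, E the observed evidence) and may differ from the notation of Eq. (1).

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)},
\qquad
P(E) = \sum_{h} P(E \mid h)\, P(h)
```

Here P(H) is the prior, P(E | H) is the likelihood, and P(H | E) is the posterior; the denominator sums over all states h of the hypothesis variable so that the posterior is normalized.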
We build the statistical model on qualitative knowledge from the literature and on the empirical data of Laine et al. (2015), who recruited two hundred pregnant women in Gómez Palacio, Mexico, to analyze the effect of inorganic arsenic exposure through drinking water (DW-iAs) on birth outcomes. We develop a prenatal arsenic exposure A-BBN and compare its outcomes with those of the conventional regression approach. The conventional linear regression approach treats the outcome (Y) as a function of an explanatory (predictor) variable (X), as shown in Eq. (2), Y = w^T X + ε (Murphy, 2012; Varaksin and Panov, 2012), where w^T X represents the scalar product between the input vector X and the weight vector w of the model, and ε is the residual error.
Regression models are most commonly used in empirical studies to predict the effects of measured factors, without considering the effects of unmeasured factors or potential correlations between variables on the final outcome. Consequently, the coefficient estimates of multiple linear regression models become unreliable when the predictors are strongly correlated with one another.
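One common way to quantify how correlation between predictors degrades a regression is the variance inflation factor (VIF). The sketch below is an illustration only: the function names and the toy predictor values are hypothetical and are not taken from the study data. It uses the two-predictor special case, in which the VIF equals 1 / (1 - r^2), with r the Pearson correlation between the predictors.

```python
import math

# Toy predictor values, invented for illustration (not study data).
X1 = [1.0, 2.0, 3.0, 4.0, 5.0]
X2 = [1.1, 1.9, 3.2, 3.9, 5.1]    # nearly collinear with X1
X3 = [2.0, -1.0, -2.0, -1.0, 2.0]  # uncorrelated with X1 by construction


def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF of either predictor in a two-predictor linear model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


if __name__ == "__main__":
    print("VIF, correlated pair:  ", vif_two_predictors(X1, X2))
    print("VIF, uncorrelated pair:", vif_two_predictors(X1, X3))
```

A VIF near 1 indicates no variance inflation, whereas a large value means the coefficient of that predictor is estimated with much greater variance. A frequently quoted rule of thumb treats values above about 10 as problematic, although that threshold is a convention rather than a strict criterion.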
Arsenic contamination of natural water resources is a global threat to public health (Hughes, 2006; Stanton et al., 2015). The most common inorganic arsenic forms, arsenate (As5+) and arsenite (As3+), are more toxic than organic arsenic (Qi et al., 2014). There is a potential link between prenatal inorganic arsenic exposure and infant development and survival rate (Gardner et al., 2011; Rager et al., 2014). For drinking water, the maximum permissible arsenic concentration set by the U.S. Environmental Protection Agency (EPA) is 10 μg/L, and the value recommended by the World Health Organization (WHO) is also 10 μg/L (Stanton et al., 2015). At high concentrations, arsenic can cross the placental barrier and cause adverse reproductive and developmental effects such as spontaneous abortion, preterm birth, stillbirth, and low birth weight (<2500 g) (Ahmad et al., 2001; Bailey and Fry, 2015; Punshon et al., 2015). These adverse health outcomes are influenced by several factors (variables), such as the arsenic concentration in drinking water, individual characteristics and genetic components (sex, age, and body weight), and lifestyle (smoking habits, alcohol consumption, etc.) (NRC, 2001). In addition, genetic factors can affect the metabolism of arsenic into different arsenic metabolites (arsenicals). Yet, even though the interactions between these variables and the underlying reasons for arsenic partitioning have been studied extensively in the literature, current dose-response methods do not explain these relationships well.
Prior research on arsenic detoxification has shown that methylation is an effective pathway (Thomas et al., 2001; Vahter, 2002; Wanibuchi et al., 2004). Inorganic arsenic compounds can be methylated to monomethylarsonic acid (MMA), dimethylarsinic acid (DMA), and trimethylarsine oxide (TMAsO) (Mandal and Suzuki, 2002; Qi et al., 2014). The more highly methylated species have fewer toxic effects than the less methylated compounds (Laine et al., 2015; Wanibuchi et al., 2004). Several studies show that the partitioning of inorganic arsenic metabolites in the human body is 20–30% inorganic arsenic (iAs), 10–20% MMA, and 60–80% DMA (Gardner et al., 2011). Women convert inorganic arsenic to DMA more efficiently than men, and this efficiency may be higher during pregnancy (Vahter et al., 2006). There are several biomarkers of arsenic exposure in the human body; the most common are blood arsenic and urinary arsenic (Jarup, 2003). Laine et al. (2015) measured urinary arsenicals as indicators (Fig. 1), and these are selected as important variables in the BBN.
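The partitioning percentages above are simple proportions of the summed urinary arsenicals. As a minimal sketch, assuming that total urinary arsenic is taken as iAs + MMA + DMA and that all three concentrations share one unit, the proportions can be computed as follows. The function name and the sample concentrations are hypothetical and are not measurements from Laine et al. (2015).

```python
def arsenical_proportions(ias, mma, dma):
    """Return each species as a percentage of total urinary arsenic.

    Total arsenic is assumed to be iAs + MMA + DMA; the three inputs
    must be concentrations expressed in the same unit (e.g. ug/L).
    """
    total = ias + mma + dma
    if total <= 0:
        raise ValueError("total urinary arsenic must be positive")
    return {
        "%iAs": 100.0 * ias / total,
        "%MMA": 100.0 * mma / total,
        "%DMA": 100.0 * dma / total,
    }


if __name__ == "__main__":
    # Invented example concentrations, chosen to fall inside the
    # ranges quoted in the text (20-30% iAs, 10-20% MMA, 60-80% DMA).
    print(arsenical_proportions(ias=10.0, mma=6.0, dma=24.0))
    # -> {'%iAs': 25.0, '%MMA': 15.0, '%DMA': 60.0}
```

Expressing the arsenicals as proportions rather than raw concentrations makes methylation profiles comparable across individuals with different total exposure.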
Methods
As shown in Fig. 2, the model development process starts with setting the goals for the model, which in this case are to explore the causal relationships in the inorganic arsenic (iAs) exposure network and the effects of iAs exposure in drinking water (DW-iAs) on low birth weight. The second step is designing the conceptual model of the network, which includes the integration of literature knowledge and empirical data. For the third step, we use the statistical moments of the experimental data to identify and
Results and Discussion
We developed several A-BBN models with the five learning algorithms mentioned in the Methods section: (1) a Bayesian search algorithm, (2) the PC algorithm, (3) a tree augmented network (TAN) learning algorithm, (4) an augmented naive Bayes algorithm, and (5) a naive Bayes algorithm. Developing a model requires several iterations to understand the significance of each variable and to simplify the influence diagram. We tested the models with the “leave one out” method and compared the
Conclusion
This study demonstrates how a hybrid BBN can provide insight into the potential interactions in the arsenic exposure network. We achieved the main goal by developing a hybrid BBN model to investigate the theory of identifying the arsenic methylation level as an indicator of infant health risk. The main limitation of the model is its relatively small data set. Moreover, the birth weight cases are not evenly distributed among the four states. Therefore, there are a small number of cases in
Acknowledgements
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The authors declare that they have no significant competing financial, professional or personal interests that might have influenced the performance or presentation of the work described in this manuscript. The models described in this paper were created using the GeNIe Modeler, available free of charge for academic research use from BayesFusion, LLC, //www.bayesfusion.com/
References (41)
- et al. Health effects of prenatal and early-life exposure to arsenic. Handbook of arsenic toxicology (2015)
- et al. Beyond QMRA: modelling microbial health risk as a complex system using Bayesian networks. Environ. Int. (2015)
- et al. Bayesian modeling approach for characterizing groundwater arsenic contamination in the Mekong River basin. Chemosphere (2016)
- et al. Using publicly available data, a physiologically-based pharmacokinetic model and Bayesian simulation to improve arsenic non-cancer dose-response. Environ. Int. (2016)
- et al. Arsenic methylation efficiency increases during the first trimester of pregnancy independent of folate status. Reprod. Toxicol. (2011)
- et al. Arsenic round the world: a review. Talanta (2002)
- et al. Autophagy in arsenic carcinogenesis. Exp. Toxicol. Pathol.: Off. J. Gesellschaft fur Toxikologische Pathol. (2014)
- et al. Risk analysis of emergent water pollution accidents based on a Bayesian network. J. Environ. Manag. (2016)
- et al. The cellular metabolism and systemic toxicity of arsenic. Toxicol. Appl. Pharmacol. (2001)
- et al. Bayesian networks as a screening tool for exposure assessment. J. Environ. Manag. (2013)
- Mechanisms of arsenic biotransformation. Toxicology
- Understanding arsenic carcinogenicity by the use of animal models. Toxicol. Appl. Pharmacol.
- Overview on Bayesian networks applications for dependability, risk analysis and maintenance areas. Eng. Appl. Artif. Intell.
- Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recogn.
- Structural learning of Bayesian networks by bacterial foraging optimization. Int. J. Approx. Reason.
- Arsenic in drinking water and pregnancy outcomes. Environ. Health Perspect.
- Stochastic sampling algorithms: EPIS sampling
- ROC Graphs: Notes and Practical Considerations for Data Mining Researchers
- Estimating survival in patients with operable skeletal metastases: an application of a Bayesian belief network. PLoS One
- A probabilistic methodology for integrating knowledge and experiments on biological networks. J. Comput. Biol.