A Hybrid Bayesian Network Framework for Risk Assessment of Arsenic Exposure and Adverse Reproductive Outcomes

https://doi.org/10.1016/j.ecoenv.2020.110270

Highlights

  • “Dose-response” graphs are limited in their ability to unveil the relationships between potential risk factors of exposure on health outcomes.

  • A hybrid BBN can provide insight on the potential interactions in the arsenic exposure network.

  • The BBN model provides 82% sensitivity and 72% specificity on average for different states of birthweight.

Abstract

Arsenic contamination of drinking water affects more than 137 million people and has been linked to several adverse health effects. The traditional toxicological approach, the “dose-response” graph, is limited in its ability to unveil the relationships between potential risk factors of arsenic exposure for adverse human health outcomes, which are critically important to understanding the risk at low exposure levels of arsenic. Therefore, to provide insight on the potential interactions of different variables of the arsenic exposure network, this study characterizes the risk factors by developing a hybrid Bayesian Belief Network (BBN) model for health risk assessment. The results show that the low inorganic arsenic concentration increases the risk of low birth weight even for low gestational age scenarios. While increasing the mother's age does not increase the low birthweight risk, it affects the distribution between other categories of baby weight. For low MMA% (<4%) in the human body, increasing gestational age decreases the risk of having low birthweight. The proposed BBN model provides 82% sensitivity and 72% specificity on average for different states of birthweight.

Introduction

Conventional health risk assessment methods are not very effective in analyzing the actual risk of arsenic exposure, nor are they capable of interpreting large amounts of data, which could lead to a better understanding of uncertainties (Orak et al., 2019; USNRC, 2013; Wilson, 2001). The U.S. National Research Council (USNRC, 2013) published a report that recommends data-driven approaches over default practices for assessing multiple effects of inorganic arsenic. Complex integrated systems, in which physical, biological, and human systems interact, require a multidisciplinary approach (McCann et al., 2006). Therefore, we aimed to explore a prenatal inorganic arsenic exposure system (network) and the strength of interactions (causal relationships) by using BBN modeling.

Bayesian Networks were developed in the late 1980s to visualize probabilistic dependency models via directed acyclic graphs (DAGs) and to understand the probabilistic relationships between variables (Newton, 2009; Pearl, 1988a). BBNs are strong decision-making tools and relatively simple compared to other modeling approaches (Pollino and Henderson, 2010). BBNs bring a holistic approach to understanding the important pathways in networks, which are not easily expressed by mathematical equations, by integrating qualitative expert knowledge, equations, probabilistic modeling, and empirical data (Gat-Viks et al., 2006; Pearl, 1988b; Tighe et al., 2013). BBNs have been used to analyze problems, plan, monitor and evaluate diverse cases of varying size and complexity in several different disciplines (Beaudequin et al., 2015; Weber et al., 2012; Yang et al., 2016).

BBNs apply Bayes' theorem (also known as Bayes' rule or Bayes' law), which was first derived by Thomas Bayes and published in 1764 (Murphy, 2012). According to Bayes' theorem, a prior probability provides information about the likelihood of a parameter, and the posterior probability is calculated based on the conditional probability of that likelihood (Su et al., 2013). This feature of the theorem differentiates Bayesian statistical models from ordinary non-Bayesian statistical models, because the Bayesian approach is a mixture of ordinary linear models and a joint distribution over the measured variables (Spirtes et al., 1993). Bayes' rule (Eq. (1)) continuously updates the belief probability of each node in the network (Murphy, 2012; Tang et al., 2016).

$$p(X=x \mid Y=y) = \frac{p(X=x,\, Y=y)}{p(Y=y)} = \frac{p(X=x)\, p(Y=y \mid X=x)}{\sum_{x'} p(X=x')\, p(Y=y \mid X=x')} \tag{1}$$
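As a minimal sketch of how Eq. (1) updates the belief in a single discrete node, the snippet below multiplies a prior by a likelihood and normalizes by the sum over all states. The function name `posterior`, the two birthweight states, and all probabilities are hypothetical values chosen for illustration; they are not taken from the study or from the GeNIe software.

```python
from typing import Dict


def posterior(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Return p(X=x | Y=y) for every state x of X, following Bayes' rule.

    prior[x]      is p(X=x)
    likelihood[x] is p(Y=y | X=x) for the one observed value y
    """
    # Numerator of Eq. (1): p(X=x) * p(Y=y | X=x) for each state x
    joint = {x: prior[x] * likelihood[x] for x in prior}
    # Denominator of Eq. (1): summing over all states x' gives p(Y=y)
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the observed evidence has zero probability under the model")
    return {x: joint[x] / evidence for x in joint}


# Hypothetical numbers, for illustration only (not from the study).
prior = {"low_birthweight": 0.10, "normal_birthweight": 0.90}
# Assumed p(high exposure observed | birthweight state)
likelihood = {"low_birthweight": 0.60, "normal_birthweight": 0.20}

post = posterior(prior, likelihood)
print(post)
```

With these assumed inputs the belief in the low birthweight state rises from the prior of 0.10 to a posterior of 0.25, because the observation is three times more likely under that state.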

We build the statistical model on qualitative literature knowledge and the empirical data of Laine et al. (2015), who recruited two hundred pregnant women in Gómez Palacio, Mexico to analyze the effect of drinking water inorganic arsenic exposure (DW-iAs) on birth outcomes. We develop a prenatal arsenic exposure A-BBN to compare the outcomes with the conventional regression approach. The conventional linear regression approach treats the outcome (Y) as a function of an explanatory (predictor) variable (X), as shown in Eq. (2) (Murphy, 2012; Varaksin and Panov, 2012).

$$Y(X) = w^{T}X + \varepsilon = \sum_{j=1}^{D} w_j X_j + \varepsilon \tag{2}$$

Here $w^{T}X$ represents the scalar product between the input vector $X$ and the weight vector $w$ of the model, and $\varepsilon$ is the residual error.

Regression models are most commonly used in empirical studies to predict the effects of measured factors without considering the effects of unmeasured factors or a potential correlation between variables on the final outcome. Therefore, multiple linear regression models can give unreliable estimates when predictors are correlated.
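The concern about correlated predictors can be illustrated with a small sketch of the deterministic part of Eq. (2). It uses an extreme, assumed case in which one predictor is exactly twice the other, so two different weight vectors produce identical predictions and the data cannot tell them apart. The function name `linear_predict` and all numbers are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def linear_predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Deterministic part of Eq. (2): the scalar product w^T x (residual omitted)."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return sum(wj * xj for wj, xj in zip(w, x))


# Hypothetical predictor rows in which the second predictor is exactly
# twice the first, i.e. the two predictors are perfectly correlated.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Two different weight vectors (assumed values)
w_a = (1.0, 1.0)
w_b = (3.0, 0.0)

pred_a = [linear_predict(w_a, r) for r in rows]
pred_b = [linear_predict(w_b, r) for r in rows]

# Both weight vectors fit the same outcomes equally well, so the effect
# attributed to each individual predictor is not identifiable.
print(pred_a, pred_b)
```

Real data are rarely perfectly collinear, but the same effect appears in weaker form: the more strongly predictors are correlated, the less stable the individual weight estimates become.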

Arsenic contamination of natural water resources is a global threat to public health (Hughes, 2006; Stanton et al., 2015). The most common inorganic arsenic forms, arsenate (As5+) and arsenite (As3+), are more toxic than organic arsenic (Qi et al., 2014). There is a potential link between prenatal inorganic arsenic exposure and infant development and survival rate (Gardner et al., 2011; Rager et al., 2014). Therefore, for drinking water, the maximum permissible arsenic concentration set by the U.S. Environmental Protection Agency (EPA) is 10 μg/L, and the value recommended by the World Health Organization (WHO) is also 10 μg/L (Stanton et al., 2015). At high concentrations, arsenic can cross the placental barrier and cause negative reproductive and developmental effects such as spontaneous abortion, preterm birth, stillbirth, and low birth weight (<2500 g) (Ahmad et al., 2001; Bailey and Fry, 2015; Punshon et al., 2015). The adverse health outcomes are caused by several factors (variables), such as arsenic concentration in drinking water, characteristics of women and genetic components (sex, age, and body weight), and lifestyle (smoking habits, alcohol consumption, etc.) (NRC, 2001). In addition, genetic factors can also affect the metabolism of arsenic to form different arsenic metabolites (arsenicals). Yet, even though the interactions between these variables and the underlying reasons for arsenic partitioning are extensively studied in the literature, the current dose-response methods do not explain these relationships well.

Prior research on arsenic detoxification has shown that an effective method is methylation (Thomas et al., 2001; Vahter, 2002; Wanibuchi et al., 2004). Inorganic arsenic compounds can be methylated to monomethylarsonic acid (MMA), dimethylarsinic acid (DMA) and trimethylarsine oxide (TMAsO) (Mandal and Suzuki, 2002; Qi et al., 2014). These highly methylated species have fewer toxic effects compared to less methylated compounds (Laine et al., 2015; Wanibuchi et al., 2004). Several studies show that the partitioning of inorganic arsenic metabolites in the human body is 20–30% inorganic arsenic (iAs), 10–20% MMA and 60–80% DMA (Gardner et al., 2011). Women have higher efficiency in converting inorganic arsenic to DMA compared to men; this efficiency may be higher during pregnancy (Vahter et al., 2006). There are several biomarkers of arsenic exposure in the human body; the most common ones are blood arsenic and urinary arsenic (Jarup, 2003). Laine et al. (2015) measured urinary arsenicals as indicators (Fig. 1), which are selected as important variables in the BBN.

Methods

As shown in Fig. 2, the model development process starts with setting goals for the model, which in this case are exploring the causal relationships in the inorganic arsenic (iAs) exposure network and the effects of iAs exposure in drinking water (DW-iAs) on low birth weight. The second step is designing the conceptual model of the network, which includes integration of literature knowledge and empirical data. For the third step, we use the statistical moments of the experimental data to identify and

Results and Discussion

We developed several A-BBN models with five different learning algorithms as mentioned in the methods section (specifically, 1) a Bayesian search algorithm, 2) PC, 3) a tree augmented network (TAN) learning algorithm, 4) an augmented naive Bayes algorithm and 5) a naive Bayes algorithm). Developing a model requires several iterations to understand the significance of each variable and simplify the influence diagram. We tested the models with the “leave one out” method and compared the
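The “leave one out” test and the per-state sensitivity and specificity reported for birthweight can be sketched as follows. This is only an illustration under stated assumptions: the one-vs-rest averaging over states is assumed rather than confirmed by the text, `train_majority` is a deliberately trivial placeholder learner standing in for a BBN, and the six records are made up.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Record = Tuple[dict, str]  # (features, birthweight state)
Model = Callable[[dict], str]


def leave_one_out(records: Sequence[Record],
                  train: Callable[[Sequence[Record]], Model]) -> List[str]:
    """Predict each record with a model trained on all the other records."""
    predictions = []
    for i, (features, _) in enumerate(records):
        training_set = [r for j, r in enumerate(records) if j != i]
        model = train(training_set)
        predictions.append(model(features))
    return predictions


def one_vs_rest_rates(truth: Sequence[str], predicted: Sequence[str],
                      state: str) -> Tuple[float, float]:
    """Sensitivity and specificity of one state against all other states.

    Assumes `state` and at least one other state both occur in `truth`,
    otherwise a denominator would be zero.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(t == state and p == state for t, p in pairs)
    fn = sum(t == state and p != state for t, p in pairs)
    tn = sum(t != state and p != state for t, p in pairs)
    fp = sum(t != state and p == state for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def train_majority(records: Sequence[Record]) -> Model:
    """Placeholder learner (not a BBN): always predicts the most frequent label."""
    most_common = Counter(label for _, label in records).most_common(1)[0][0]
    return lambda features: most_common


# Made-up records with empty feature dictionaries, for illustration only.
records = [({}, "low"), ({}, "normal"), ({}, "normal"),
           ({}, "normal"), ({}, "low"), ({}, "normal")]
truth = [label for _, label in records]
predictions = leave_one_out(records, train_majority)

rates: Dict[str, Tuple[float, float]] = {
    state: one_vs_rest_rates(truth, predictions, state)
    for state in sorted(set(truth))
}
mean_sensitivity = sum(s for s, _ in rates.values()) / len(rates)
mean_specificity = sum(sp for _, sp in rates.values()) / len(rates)
print(rates, mean_sensitivity, mean_specificity)
```

The placeholder learner always predicts the majority state, so it never detects the rare state. Reporting rates per state and then averaging, rather than reporting overall accuracy alone, makes that kind of failure visible when the states are unevenly represented.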

Conclusion

This study demonstrates how a hybrid BBN can provide insight on the potential interactions in the arsenic exposure network. We achieved the main goal by developing a hybrid BBN model to investigate the theory of identifying the arsenic methylation level as an indicator of infant health risk. The main limitation of the model is its relatively small data set. Moreover, birthweight cases between four states are not equal. Therefore, there are a small number of cases in

Acknowledgements

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The authors declare that they have no significant competing financial, professional or personal interests that might have influenced the performance or presentation of the work described in this manuscript. The models described in this paper were created using the GeNIe Modeler, available free of charge for academic research use from BayesFusion, LLC, //www.bayesfusion.com/

References (41)

  • M. Vahter. Mechanisms of arsenic biotransformation. Toxicology (2002)
  • H. Wanibuchi et al. Understanding arsenic carcinogenicity by the use of animal models. Toxicol. Appl. Pharmacol. (2004)
  • P. Weber et al. Overview on Bayesian networks applications for dependability, risk analysis and maintenance areas. Eng. Appl. Artif. Intell. (2012)
  • T.-T. Wong. Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recogn. (2015)
  • C. Yang et al. Structural learning of Bayesian networks by bacterial foraging optimization. Int. J. Approx. Reason. (2016)
  • S.A. Ahmad et al. Arsenic in drinking water and pregnancy outcomes. Environ. Health Perspect. (2001)
  • M. Druzdzel. Stochastic sampling algorithms: EPIS sampling (2016)
  • T. Fawcett. ROC Graphs: Notes and Practical Considerations for Data Mining Researchers (2003)
  • J.A. Forsberg et al. Estimating survival in patients with operable skeletal metastases: an application of a Bayesian belief network. PLoS One (2011)
  • I. Gat-Viks et al. A probabilistic methodology for integrating knowledge and experiments on biological networks. J. Comput. Biol. (2006)