Application of machine learning in predicting the adsorption capacity of organic compounds onto biochar and resin
Introduction
The severe shortage of clean water poses huge challenges for modern social development (Ma et al., 2020). Organic water pollutants such as personal care products, pesticides, and food additives pose a serious threat to water resources and the environment (Rojas and Horcajada, 2020). For water restoration, physical adsorption is one of the most promising strategies for removing organic pollutants on account of cost, simplicity, and energy considerations (Shi et al., 2021). Numerous adsorbents, including carbon nanotubes, granular activated carbons, graphene nanosheets, and biochars, have been reported to remove organic pollutants from water (Bunmahotama et al., 2015; Ersan et al., 2019; Wu et al., 2013). However, this technology as currently practiced is inefficient, especially for novel chemicals and adsorbents, and data on the adsorption of compounds onto adsorbents remain scarce (Chen et al., 2011a; Kuncek and Sener, 2010; Luo et al., 2019; Zheng et al., 2019). Efficient prediction models can make full use of existing data and replace some labor-intensive adsorption experiments (Zhang et al., 2020a). The polyparameter linear free energy relationship (pp-LFER) model is widely used to predict the adsorption of organic compounds (Qi et al., 2020). Its molecular structure parameters explain the distribution of organic pollutants between the solid and liquid phases in terms of intermolecular forces (Zhu et al., 2019a). The magnitudes of the pp-LFER coefficients distinguish the differences in intermolecular forces between systems and thereby clarify the adsorption mechanism of organic compounds, which supports sound environmental risk assessment. The pp-LFER model is given in Eq. (1), and the meaning of its symbols is listed in Table S1. However, the pp-LFER model tends to ignore the effect of adsorbent properties, which leads to large prediction errors.
In addition, a separate multilinear regression (MLR) usually has to be established for each equilibrium concentration Ce, and predictions based on the MLR are limited to the concentration levels used in the modeling (Zhang et al., 2020a). Given these limitations, a machine learning model that combines the basic properties of adsorbents and compounds offers a helpful solution.
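Eq. (1) itself is not reproduced in this excerpt. For orientation only, the form of the pp-LFER most commonly used in the literature (the Abraham solvation equation) is sketched below; the notation in the paper's Eq. (1) and Table S1 may differ, so the symbols here should be read as an assumption.

```latex
\log K = eE + sS + aA + bB + vV + c
```

In this common notation, the capital letters are solute descriptors (E: excess molar refraction, S: dipolarity/polarizability, A: hydrogen-bond acidity, B: hydrogen-bond basicity, V: McGowan characteristic volume), and the lowercase coefficients e, s, a, b, v and the constant c are obtained by multilinear regression for a given adsorbent system. Because no term describes the adsorbent, and because K is fitted at a fixed equilibrium concentration, the two limitations noted above follow directly from the form of the equation.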
Machine learning (ML), an interdisciplinary subject spanning various fields, is now being applied to adsorption research (Kobayashi et al., 2020). ML is chiefly concerned with designing and analyzing algorithms that let computers “learn” automatically. Based on statistical principles, ML algorithms analyze the structure of existing data and mine its regularities in order to make judgments and predictions for unknown samples (Mazaheri et al., 2017). For the adsorption of organic pollutants, ML methods such as artificial neural networks (ANN), support vector machines (SVM), and random forests (RF) have gradually become key research tools (Sahu et al., 2019). ML methods can not only optimize adsorption parameters (Rahman et al., 2019; Zhang and Pan, 2014), but also outperform traditional regression methods in simulating single-component and multi-component adsorption (Panapitiya et al., 2018; Zhang et al., 2019b). An ANN model was used to predict the ability of biochar to adsorb pollutants; the overall RMSE and R² values obtained were acceptable, but prediction performance was poor in some cases (Yang et al., 2014). Least squares support vector machine (LS-SVM) and ANN models were shown to help improve the adsorption efficiency of methylene blue dye (Asfaram et al., 2016). A widely used feedforward backpropagation ANN and RF were compared in performance, and their applicability was evaluated for 336 energy consumption predictions; both models performed well on the training and validation data (Ahmad et al., 2017). A previous study found that incorporating the surface area, as a descriptor of adsorbent properties, into a deep neural network model that predicts the fitted parameters of the Freundlich isotherm largely solved the problem of the MLR being limited to specific equilibrium concentrations (Sigmund et al., 2020).
In addition, the ANN-LFER method based on this strategy offers a way to overcome the limitations of existing single-solute adsorption prediction models, but its prediction accuracy still needs further improvement (Zhang et al., 2020a).
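To see why predicting the Freundlich parameters removes the concentration restriction, it helps to recall the isotherm itself. The notation below is the usual textbook one and is an assumption here (some authors write the exponent as 1/n); it is not taken from the cited studies.

```latex
q_e = K_F \, C_e^{\,n}
\qquad\Longleftrightarrow\qquad
\log q_e = \log K_F + n \log C_e
```

Here q_e is the adsorbed amount at equilibrium, C_e is the equilibrium concentration, and K_F and n are the fitted Freundlich parameters. Once a model predicts K_F and n for an adsorbent-compound pair, q_e can be evaluated at any C_e within the fitted range, rather than only at the concentration levels for which a separate MLR was built.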
Compared with the above methods, Kriging is an accurate interpolation method that has received extensive attention in modeling in recent years (Xiao et al., 2020). Originating in geostatistics, the Kriging model has been applied to various optimization problems and structural reliability analyses (Huang et al., 2016). It offers high prediction accuracy, low computational time, and strong robustness (Zhao et al., 2016). It is now widely used in meteorology, geography, and computational science, but there are few reports of its use for the adsorption of organic compounds (Zhao et al., 2021).
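As an illustration of what "interpolation" means for Kriging, the following is a minimal sketch of ordinary Kriging in pure Python. It is not the Kriging-LFER model of this study: the function names, the squared-exponential covariance, the length scale, and the toy descriptor values are all assumptions made for the example.

```python
import math

def gaussian_cov(a, b, length_scale=1.0):
    """Squared-exponential covariance between two feature vectors (assumed kernel)."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ordinary_kriging_predict(X, y, x_new, length_scale=1.0):
    """Predict y at x_new as a weighted sum of observed y values.

    The weights solve the ordinary-Kriging system: the covariance matrix
    bordered by a row and column of ones, which forces the weights to sum to 1.
    """
    n = len(X)
    A = [[gaussian_cov(X[i], X[j], length_scale) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])
    rhs = [gaussian_cov(xi, x_new, length_scale) for xi in X] + [1.0]
    weights = solve(A, rhs)[:n]  # last entry is the Lagrange multiplier
    return sum(w * yi for w, yi in zip(weights, y))

# Toy data: two scaled descriptors per sample and a made-up response.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
y = [0.1, 0.4, 0.3, 0.8, 0.45]
print(ordinary_kriging_predict(X, y, [0.25, 0.75], length_scale=0.5))
```

Without a nugget term, the predictor reproduces the observed value at every training point, which is the sense in which Kriging interpolates. For real data sets, an established Gaussian-process or Kriging library with a fitted covariance model would be the more practical choice.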
To address the limitations of existing models, a Kriging-LFER model is established from the collected data. Sensitivity analysis is then used to evaluate the influence of each variable on the model output. Finally, uncertainty analysis is used to determine the range of the adsorption coefficient and to predict the most likely model results. The Kriging-LFER model established in this study can not only reduce the experimental workload but also rapidly predict the adsorption efficiency of biochar and resin for organic compounds from the basic properties of the adsorbents and compounds. This is of great significance for understanding the importance of the various parameters, for adjusting and improving the direction of experiments, and for environmental governance.
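The excerpt does not state which sensitivity or uncertainty method the study uses. As a generic illustration of the two ideas, the sketch below applies one-at-a-time perturbation and Monte Carlo propagation to a hypothetical linear predictor; `toy_model`, its coefficients, the input names, and the 5% input spread are all assumptions, not values from the paper.

```python
import random
import statistics

def toy_model(x):
    """Hypothetical stand-in for a trained predictor (NOT the paper's model)."""
    return 0.5 + 1.2 * x["V"] - 0.8 * x["B"] + 0.1 * x["log_SA"]

def oat_sensitivity(model, base, rel_step=0.1):
    """One-at-a-time sensitivity: output change when one input rises by rel_step."""
    y0 = model(base)
    result = {}
    for name, value in base.items():
        perturbed = dict(base)
        perturbed[name] = value * (1.0 + rel_step)
        result[name] = model(perturbed) - y0
    return result

def monte_carlo_range(model, base, rel_sd=0.05, n=5000, seed=0):
    """Propagate Gaussian input noise; return (2.5th pct, median, 97.5th pct)."""
    rng = random.Random(seed)
    outputs = sorted(
        model({k: rng.gauss(v, abs(v) * rel_sd) for k, v in base.items()})
        for _ in range(n)
    )
    lo = outputs[int(0.025 * n)]
    hi = outputs[int(0.975 * n) - 1]
    return lo, statistics.median(outputs), hi

# Illustrative inputs: two solute descriptors and a log surface area.
base = {"V": 1.0, "B": 0.5, "log_SA": 2.0}
sens = oat_sensitivity(toy_model, base)
ranking = sorted(sens, key=lambda k: abs(sens[k]), reverse=True)
print(ranking)
print(monte_carlo_range(toy_model, base))
```

The ranking orders inputs by how strongly a fixed relative change moves the output, and the Monte Carlo interval gives a plausible range and a central estimate for the prediction. A variance-based method would be a natural refinement when inputs interact.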
Data collection
In this study, biochar and polymer resin are selected as the target adsorbents, and high-quality, representative raw data are selected to establish a prediction model. To allow comparison with previously developed models, the same experimental data are used as in a previous study (Zhang et al., 2020a). Supporting information on the collected data is given in Tables S2.1-S2.2. In total, 1750 adsorption data points are extracted from adsorption isotherms. These concentration points, which
Statistical results of biochar and resin characteristics and Pearson correlation matrix analysis
The statistical distribution of the variables for 50 biochars and 30 polymeric resins is presented in Fig. 1. Each box plot shows, from top to bottom, the maximum, upper quartile, median, lower quartile, and minimum of the data. Variability is measured by the quartiles and interquartile range, and points outside the inner limits are drawn as diamonds. The box plots show that the BET area is 2–3 orders of magnitude higher than that of
Conclusion
Machine learning models can effectively predict the adsorption efficiency of biochar and resin for organic pollutants in aqueous solution, achieving lower prediction errors (R² = 0.940, RMSE = 0.037 and R² = 0.976, RMSE = 0.019). Compared with ANN-LFER, the prediction accuracy of the Kriging model is improved by 7% and 9.6%, respectively. In addition, sensitivity analysis provides the priority of parameter adjustment, which has important reference value for optimizing
Declaration of competing interest
The authors declare that they have no conflicts of interest related to this work.
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Acknowledgements
The authors appreciate the financial support of the National Natural Science Foundation of China (No. 41807196), the Postdoctoral Science Foundation of China (2021T140104, 2018M641793), the Excellent Youth of Heilongjiang Province Natural Science Foundation (YQ2021D003), and the Postdoctoral Science Foundation of Heilongjiang Province of China (No. LBH-Z19002).
References (48)
- et al. Trees vs Neurons: comparison between random forest and ANN for high-resolution prediction of building energy consumption. Energy Build. (2017)
- et al. Predicting the adsorption of organic pollutants from water onto activated carbons based on the pore size distribution and molecular connectivity index. Water Res. (2015)
- et al. Investigations on the batch and fixed-bed column performance of fluoride adsorption by Kanuma mud. Desalination (2011)
- et al. MDHGI: matrix decomposition and heterogeneous graph inference for miRNA-disease association prediction. PLoS Comput. Biol. (2018)
- et al. Adsorption of copper and zinc by biochars produced from pyrolysis of hardwood and corn straw in aqueous solution. Bioresour. Technol. (2011)
- et al. AK-MCS: an active learning reliability method combining Kriging and Monte Carlo Simulation. Struct. Saf. (2011)
- et al. Predictive models for adsorption of organic compounds by Graphene nanosheets: comparison with carbon nanotubes. Sci. Total Environ. (2019)
- et al. Assessing small failure probabilities by AK-SS: an active learning method combining Kriging and Subset Simulation. Struct. Saf. (2016)
- et al. Interventions to mitigate early spread of SARS-CoV-2 in Singapore: a modelling study. Lancet Infect. Dis. (2020)
- et al. Adsorption of methylene blue onto sonicated sepiolite from aqueous solutions. Ultrason. Sonochem. (2010)
- Single and competitive dye adsorption onto chitosan-based hybrid hydrogels using artificial neural network modeling. J. Colloid Interface Sci.
- Variance-based sensitivity analysis of a forest growth model. Ecol. Model.
- LIF: a new Kriging based learning function and its application to structural reliability analysis. Reliab. Eng. Syst. Saf.
- The sorption of organic contaminants on biochars derived from sediments with high organic carbon content. Chemosphere
- Achieving no spent brine discharge in an anion exchange resin-based fixed-bed process for typical DOM removal. Chem. Eng. Res. Des.
- Maximizing natural frequencies of inhomogeneous cellular structures by Kriging-assisted multiscale topology optimization. Comput. Struct.
- Modeling batch and column phosphate removal by hydrated ferric oxide-based nanocomposite using response surface methodology and artificial neural network. Chem. Eng. J.
- A Kriging surrogate model coupled in simulation-optimization approach for identifying release history of groundwater sources. J. Contam. Hydrol.
- Prediction of adsorption properties for ionic and neutral pharmaceuticals and pharmaceutical intermediates on activated charcoal from aqueous solution via LFER model. Chem. Eng. J.
- Adsorption desulfurization performance and adsorption-diffusion study of B2O3 modified Ag-CeOx/TiO2-SiO2. J. Hazard Mater.
- Output uncertainty of dynamic growth models: effect of uncertain parameter estimates on model reliability. Biochem. Eng. J.
- Statistical experimental design, least square-support vector machine (LS-SVM) and artificial neural network (ANN) methods for modeling of facilitated adsorption of methylene blue dye. RSC Adv.
- Self-assembly biochar colloids mycelial pellet for heavy metal removal from aqueous solution. Chemosphere
- Prediction of soil adsorption coefficient in pesticides using physicochemical properties and molecular descriptors by machine learning models. Environ. Toxicol. Chem.
1 Ying Zhao and Da Fan contributed equally to this work and should be considered co-first authors.