Data on estimation of health hazards associated with pesticide residues in drinking water

The dataset presents the occurrence of 113 pesticide residues (PRs) in drinking water samples from 31 countries worldwide and relates their concentrations to human health. The dataset classifies the PRs into five toxicity classes. Class IA (extremely toxic) includes four residues with an LD50 value < 5 mg/kg body weight (bw); class IB (highly toxic) includes 14 residues with an LD50 value in the range of 5 to < 50 mg/kg bw; class II (moderately toxic) includes 55 residues with an LD50 value in the range of 50 to < 500 mg/kg bw; class III (slightly toxic) includes 17 residues with an LD50 value in the range of 500 to < 2000 mg/kg bw; and class IV (less toxic) includes 23 residues with an LD50 value > 2000 mg/kg bw. The dataset provides a new statistical method that links all PRs together using the reference average (Ref Aver), reference standard deviation (Ref Stdev), country average and country standard deviation to show the statistical variations among them. Furthermore, the dataset calculates hazard indices (HIs) and shows their distribution among the 31 countries. Notably, the dataset also presents advanced techniques for removing PRs from water. A detailed explanation and discussion of the present dataset can be found in the article entitled “Pesticide residues in drinking water, their potential risk to human health and removal options”, doi: 10.1016/j.jenvman.2021.113611 (El-Nahhal and El-Nahhal, 2021). To the best of our knowledge, this is the first dataset that describes the use of Ref Aver and Ref Stdev to link the country averages of all PRs together, to show the differences in occurrence, and to provide several options for removing PRs from drinking water.



Value of the Data
• These data provide detailed calculations of the health hazards associated with PRs in drinking water.
• The data can be used by local authorities, policy makers and researchers to improve environmental health standards.
• The data provide a better understanding of the occurrence of PRs in drinking water in 31 countries.
• The data demonstrate a statistical method that other researchers may find useful.
• The figures provide an overview of the toxicity classes of PRs and their health hazards.

Data Description
The dataset contains 12 figures describing the study results. For instance, Fig. 1 shows the steps of article collection, the exclusion of irrelevant articles and the inclusion of relevant ones. Of the collected articles, 61.24% were excluded as irrelevant and only 38.76% were included. This shows the considerable effort needed to collect the relevant articles from 31 countries. Furthermore, the dataset summarizes the collected pesticide residues and classifies them into insecticides, herbicides and fungicides (Fig. 2a). Insecticide and herbicide residues are presented, whereas fungicide residues are not, because they were found in only three countries (Spain, Brazil and Japan) at low concentrations that do not present health hazards. The dataset classifies the insecticide and herbicide residues into five groups according to the relative average (Rel Aver) of their concentrations (Fig. 2b,c). Group 1 (G1), with the lowest Rel Aver (< 0.1), includes seven countries with insecticide residues and five countries with herbicide residues. At the other extreme, group 5 (G5), with the highest Rel Aver (> 10), includes two countries with insecticide residues and three countries with herbicide residues. The remaining groups (G2-G4) include 23 countries with insecticide residues and 10 countries with herbicide residues. Fig. 3 shows the distribution of countries in a forest plot. Some countries lie to the left of the relative average (0.01), marked by the dotted line, some lie on the dotted line and some lie to its right. This indicates differences among countries. The differences are significant between countries whose Rel Aver lies to the left of the line and those whose Rel Aver lies to the right. Additionally, countries in contact with the dotted line may differ significantly, depending on the size of their error bars.
If the error bars overlap, there are no significant differences. The occurrence of the country Rel Aver of insecticides is shown in Fig. 4, which presents 32 Rel Aver values representing 24 countries. The number of Rel Aver values exceeds the number of countries because some countries, such as China, India and Iran, appear more than once. Fig. 5 shows the distribution of the herbicide Rel Aver of several countries worldwide as a forest plot. These results are interpreted in the same way as those for the Rel Aver of insecticides (Fig. 3). Furthermore, the occurrence of the country Rel Aver of herbicides is shown in Fig. 6, which presents 18 Rel Aver values representing 17 countries worldwide. The number of Rel Aver values differs from the number of countries because some countries, such as Portugal, appear more than once. Fig. 7 shows the occurrence of insecticide residues as box plots for the 20 countries having at least five insecticide residues in drinking water. Four countries are not presented because they have fewer than five insecticide residues, the minimum needed to draw a box plot. The boxes have different sizes and different whiskers, indicating different distributions. Additionally, the majority of countries have high insecticide concentrations, which appear as outliers, either low or high. The calculation is explained in the Materials and methods section. Similarly, Fig. 8 shows the occurrence of herbicide residues (HRs) in drinking water samples from several countries. The box plots in Figs. 7 and 8 show the concentration range of the insecticide or herbicide residues: minimum concentration, 1st quartile, median, 3rd quartile and maximum concentration. These values are denoted by the bottom whisker, the 1st, 2nd and 3rd lines of the box and the upper whisker, respectively. Additionally, the x mark inside a box denotes the country average, and circles above the whisker denote outliers of insecticide and/or herbicide concentration.
Fig. 9 shows the occurrence of the toxicity classes of the 113 pesticide residues found in drinking water samples collected in 31 countries worldwide. Five classes of toxicity are found. Class IA, the extremely toxic class (LD50 < 5 μg/g), is represented by four residues and accounts for less than 5% of all residues. Class IB, highly toxic residues (LD50 in the range of 5-49.99 μg/g), is represented by 14 residues and accounts for 14% of all residues. Class II, the moderately toxic class (LD50 in the range of 50-499.9 μg/g), is represented by 55 residues and accounts for 49% of all residues. Class III, slightly toxic residues (LD50 in the range of 500-1999.9 μg/g), is represented by 17 residues and accounts for 15% of all residues. Class IV, less toxic residues (LD50 > 2000 μg/g), is represented by 23 residues and accounts for 20% of all residues.
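The class boundaries described above can be expressed as a simple lookup. The following is a minimal sketch, not the authors' implementation: the function name `toxicity_class` and the example LD50 values are hypothetical, and a value of exactly 2000 is assigned to class III by assumption, since the text leaves that boundary unassigned. LD50 values in μg/g are numerically equal to mg/kg bw.

```python
def toxicity_class(ld50_mg_per_kg: float) -> str:
    """Map an LD50 value (mg/kg bw, numerically equal to ug/g) to a toxicity class."""
    if ld50_mg_per_kg < 0:
        raise ValueError("LD50 must be non-negative")
    if ld50_mg_per_kg < 5:
        return "IA"   # extremely toxic
    if ld50_mg_per_kg < 50:
        return "IB"   # highly toxic
    if ld50_mg_per_kg < 500:
        return "II"   # moderately toxic
    if ld50_mg_per_kg <= 2000:
        return "III"  # slightly toxic (assumption: exactly 2000 falls here)
    return "IV"       # less toxic


# Hypothetical LD50 values, for illustration only (not taken from the dataset).
examples = {"residue_a": 3.0, "residue_b": 120.0, "residue_c": 5000.0}
for name, ld50 in examples.items():
    print(name, toxicity_class(ld50))
```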
The hazard indices of insecticide residues are shown in Fig. 10. Three categories are shown: HI 0.001-0.01, represented by 17 HI values from 17 different countries; 0.1 < HI < 1.0, represented by HI values from 10 countries; and HI 1.01-31, represented by HI values from 7 countries worldwide.
The distribution of HI differs among the categories, as shown by the five box-plot parameters presented for each category (details of the box-plot parameters are given above and in the methodology section).
The occurrence of countries in each HI category is shown in Fig. 11, which contains four panels (A-D). The HI values represent 28 countries. Fig. 12 shows the HI of herbicide residues as a box plot and the distribution of countries as a pie chart. Two HI categories are shown: HI 0.001-0.1, representing 10 countries, and HI > 1, representing two countries. The distribution of countries in the pie chart shows 13 countries having HI above 0.001 and only two countries having HI above 1, indicating potential risk to humans.
Additionally, the dataset presents several methods with high potential for cleaning water (Fig. 13). The proposed options comprise four kinds of method: physical, chemical, biological and mixed (combined) methods.

Data collection
Data were collected using the following search items:
1. Specific phrases: "pesticide residues in water", "insecticide residues in water", "fungicide residues in water", "herbicide residues in water".
2. Chemical class names: "organochlorine residues in water", "organophosphate residues in water", "carbamate residues in water", "pyrethroid residues in water", "neonicotinoid residues in water".
3. Pesticide names, e.g. "DDT residues in water", "γ-HCH residues in water", "toxaphene residues in water", "parathion residues in water", "chlorpyrifos residues in water", "diazinon residues in water".

Websites used to collect the relevant articles
1. Google search engine; 2. Google Scholar; 3. ResearchGate; 4. The Scopus database; 5. Web of Science; 6. Home page of ScienceDirect; 7. Home page of PubMed; 8. Home page of BMC; 9. Journal home pages; and 10. Direct contact with corresponding authors.

Downloading the articles
Freely available articles were downloaded directly and saved locally. Articles that were not freely available were obtained through the university home page or by direct contact with the corresponding author.

Reading and screening the articles
The collected articles were carefully read by the authors. The articles were then classified into three groups: Group 1 includes articles that developed methods to determine pesticide residues in water; Group 2 includes articles that determined pesticide residues in agricultural water, rivers, lakes, surface water and groundwater; Group 3 includes articles that determined pesticide residues in bottled water and drinking water.

Sorting the articles
The articles were sorted into the following categories: 1. Conference articles, which were subdivided into local and international scientific conferences. 2. Journal articles, which were classified according to publishing house into local and international journals. The international journals were sorted according to the impact factor and the CiteScore of the journal and then categorized into Q1-Q4 journals. Articles published in local journals with different publishing houses were sorted into Q5 and Q6.

Inclusion and exclusion of articles
Articles published in local or international conferences as abstracts and/or proceedings were considered irrelevant and excluded. Additionally, articles that aimed to develop methods for pesticide detection in water resources were excluded. Articles that determined pesticide residues in agricultural water, rivers, lakes and wastewater were also excluded, as were articles published in local multidisciplinary journals.
The inclusion criteria covered articles that determined pesticide residues in drinking water or bottled water and were published in a journal ranked Q1-Q4.
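The inclusion and exclusion rules above amount to a filter over article records. The sketch below illustrates one way to encode them. The `Article` record and all of its field names are hypothetical, introduced only for illustration; the screening in this work was done by reading the articles, not by code.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # All fields are hypothetical labels assigned by a human reader.
    title: str
    source_type: str          # "journal" or "conference"
    quartile: str             # "Q1" .. "Q6"
    water_type: str           # e.g. "drinking", "bottled", "river", "wastewater"
    method_development: bool  # True if the article only develops a detection method


INCLUDED_QUARTILES = {"Q1", "Q2", "Q3", "Q4"}
INCLUDED_WATER_TYPES = {"drinking", "bottled"}


def is_included(article: Article) -> bool:
    """Apply the inclusion criteria: drinking or bottled water, Q1-Q4 journal."""
    return (
        article.source_type == "journal"
        and article.quartile in INCLUDED_QUARTILES
        and article.water_type in INCLUDED_WATER_TYPES
        and not article.method_development
    )


# Illustrative records only; these are not articles from the dataset.
candidates = [
    Article("Residues in tap water", "journal", "Q2", "drinking", False),
    Article("Residues in river water", "journal", "Q1", "river", False),
    Article("New detection method", "journal", "Q1", "drinking", True),
    Article("Conference abstract", "conference", "Q5", "drinking", False),
]
included = [a.title for a in candidates if is_included(a)]
print(included)
```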

Data preparation, modification, calculations and statistical analysis
Pesticide residues reported in the included articles were collected and entered into an Excel spreadsheet. The concentrations in all collected articles were normalized to a single unit, μg/L; values reported in mg/L or ng/L were converted to μg/L.
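The unit normalization can be sketched as follows. This is an illustrative helper rather than the spreadsheet procedure actually used; the function name `to_ug_per_l` and the ASCII unit labels are assumptions.

```python
# Multiplication factors to convert a concentration to ug/L.
FACTORS_TO_UG_PER_L = {
    "mg/L": 1000.0,  # 1 mg/L = 1000 ug/L
    "ug/L": 1.0,
    "ng/L": 0.001,   # 1 ng/L = 0.001 ug/L
}


def to_ug_per_l(value: float, unit: str) -> float:
    """Convert a concentration in mg/L, ug/L or ng/L to ug/L."""
    try:
        factor = FACTORS_TO_UG_PER_L[unit]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None
    return value * factor


# Hypothetical reported values, for illustration only.
print(to_ug_per_l(0.002, "mg/L"))
print(to_ug_per_l(150.0, "ng/L"))
```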

Data separation
Pesticide residues were classified into specific groups such as insecticides, herbicides and fungicides. Then, insecticide, herbicide, and fungicide residues were identified and categorized in separate data sheets.

Data processing
Pesticide residues in the included articles were used to calculate country average, standard deviation and/or standard error instead of using the original analyzed data such as minimum, maximum, median, 25th percentile, 75th percentile and/or range.
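A minimal sketch of the country-level summary is given below. It assumes the sample standard deviation (n - 1 denominator) and the standard error SD/√n; the text does not state which variant was used, and the function name and example values are hypothetical.

```python
import math
import statistics


def country_summary(concentrations_ug_per_l):
    """Return the count, average, standard deviation and standard error."""
    n = len(concentrations_ug_per_l)
    if n < 2:
        raise ValueError("at least two values are needed for a standard deviation")
    average = statistics.mean(concentrations_ug_per_l)
    stdev = statistics.stdev(concentrations_ug_per_l)  # sample SD, an assumption
    sterr = stdev / math.sqrt(n)
    return {"n": n, "average": average, "stdev": stdev, "sterr": sterr}


# Hypothetical residue concentrations (ug/L) for one country.
print(country_summary([0.05, 0.12, 0.30, 0.08]))
```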

Calculation of pesticide daily intake (PDI)
PDI associated with drinking water was calculated according to Eq. (1):

$$\mathrm{PDI} = \frac{[\mathrm{APR}] \times Q}{\mathrm{BW}} \qquad (1)$$

where [APR], Q and BW are the average concentration of the pesticide residue found in a drinking water sample (μg/L), the daily amount of water consumed by a person (L) and the body weight (kg), respectively.
Drinking water consumption was taken as 2, 1.5 and 0.75 L for adults, children and infants, respectively, and the BW of adults, children and infants as 60, 15 and 5 kg, respectively [3].
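As a worked illustration, the sketch below assumes that Eq. (1) takes the usual form PDI = [APR] × Q / BW, with Q read as litres per day, and uses the consumption and body-weight values given in the text. The function name, the dictionary layout and the example concentration of 0.3 μg/L are hypothetical.

```python
# (water consumption in L/day, body weight in kg) per age group, as given in the text [3].
AGE_GROUPS = {
    "adult": (2.0, 60.0),
    "child": (1.5, 15.0),
    "infant": (0.75, 5.0),
}


def pesticide_daily_intake(apr_ug_per_l: float, age_group: str) -> float:
    """PDI in ug/kg bw/day, assuming PDI = [APR] * Q / BW."""
    q_litres, bw_kg = AGE_GROUPS[age_group]
    return apr_ug_per_l * q_litres / bw_kg


# Hypothetical average residue concentration of 0.3 ug/L.
for group in AGE_GROUPS:
    print(group, pesticide_daily_intake(0.3, group))
```

For the same concentration, the intake per kilogram of body weight is highest for infants (about 0.045 μg/kg bw/day in this example, against about 0.01 for adults), because their water consumption is large relative to their body weight.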

Calculation of health quotient (HQ) and hazards index (HI)
HQ was calculated according to Eq. (2):

$$\mathrm{HQ} = \frac{\mathrm{PDI}}{\mathrm{ARfD}} \qquad (2)$$

where ARfD is the acute reference dose of the pesticide residue, expressed in μg/kg bw/day. Values of ARfD were obtained from Ref. [2]. The use of ARfD has been reported previously [1,4,7].
When a water sample contains more than one pesticide residue, HQ is calculated individually for each residue and the HQ values are summed to give the HI according to Eq. (3):

$$\mathrm{HI} = \sum_{i=1}^{n} \mathrm{HQ}_i \qquad (3)$$

where n is the number of residues in the sample. HI values equal to or greater than one indicate additive effects and a high risk, whereas values below one indicate low or negligible health risk [4].
The HQ values are added in Eq. (3) because a mixture of pesticide residues, in which each individual residue is present at a level approximating its no-observed-effect level, can elicit a measurable response, denoted a joint additive effect. This is in accordance with US EPA [4] and with Tinwell and Ashby [5], who emphasized the joint additive effects of chemical mixtures.
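The hazard calculation can be sketched as follows, assuming that HQ is the ratio of PDI to ARfD (both in the same units) and that HI is the sum of the individual HQ values, as the text describes for Eqs. (2) and (3). The function names, residue names and ARfD values are hypothetical and are not taken from Ref. [2].

```python
def hazard_quotient(pdi: float, arfd: float) -> float:
    """HQ = PDI / ARfD, with both values in the same units (assumed form of Eq. (2))."""
    if arfd <= 0:
        raise ValueError("ARfD must be positive")
    return pdi / arfd


def hazard_index(pdi_by_residue: dict, arfd_by_residue: dict) -> float:
    """HI = sum of HQ over all residues in one sample (assumed form of Eq. (3))."""
    return sum(
        hazard_quotient(pdi, arfd_by_residue[name])
        for name, pdi in pdi_by_residue.items()
    )


def risk_label(hi: float) -> str:
    """HI >= 1 indicates high risk; HI < 1 indicates low or negligible risk."""
    return "high risk" if hi >= 1 else "low or negligible risk"


# Hypothetical PDI and ARfD values (ug/kg bw/day), for illustration only.
pdi = {"residue_a": 0.01, "residue_b": 0.03}
arfd = {"residue_a": 0.1, "residue_b": 0.05}
hi = hazard_index(pdi, arfd)
print(round(hi, 3), risk_label(hi))
```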

Classification of pesticide residues according to toxicity class and function
The pesticide residues found in the water samples from a country were subdivided into three groups according to their functions: insecticide, herbicide and fungicide residues. Each group was subdivided into five toxicity classes (Ia, Ib, II, III and IV) according to Ref. [6].
Application of Eq. (12) enables detection of the percentage difference between Ref Aver and Rel Aver. Statistical differences between Rel Aver values for health hazards and/or concentration averages (HIC) can be calculated from the relative ratio (RR) between any two Rel HI values using Eq. (12). Values of % difference < 0.05 indicate significant differences. Furthermore, the Rel Aver values of all countries were used to draw the forest plots. The relative standard error and relative standard deviation were taken as the lower and upper limits of the error bars in the forest plot. A perpendicular axis was then placed at certain points to show the differences.
Additionally, box plots were drawn for the insecticide and herbicide residues to show their distribution in each country.
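The five box-plot parameters (minimum, 1st quartile, median, 3rd quartile and maximum) and the country average can be computed as in the sketch below. It assumes the inclusive quartile method of Python's `statistics.quantiles`; the spreadsheet software used for the figures may apply a different quartile convention, so the values need not match exactly. The function name and example data are hypothetical.

```python
import statistics


def box_plot_summary(concentrations):
    """Five-number summary plus the average for one country's residues."""
    if len(concentrations) < 5:
        # The text states that countries with fewer than five residues are not plotted.
        raise ValueError("at least five values are required")
    q1, median, q3 = statistics.quantiles(concentrations, n=4, method="inclusive")
    return {
        "min": min(concentrations),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(concentrations),
        "average": statistics.mean(concentrations),
    }


# Hypothetical residue concentrations (ug/L) for one country.
print(box_plot_summary([0.02, 0.05, 0.08, 0.20, 1.50]))
```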

Ethics Statements
This work does not involve human subjects, animal experiments, and/or data collected from social media platforms.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.