Drinking and recreational water-related diseases: a bibliometric analysis (1980–2015)

Background Water-related diseases are a worldwide health concern. Microbial contamination and chemical contaminants in water are a source of disease outbreaks and of cumulative toxic effects. Ensuring safe water is one of the goals to be achieved at the global level. The aim of this study was to assess publications on drinking and recreational water from a health point of view in order to understand current problems and future research trends in this field.
Methods Scopus, the largest scientific electronic database, was used to retrieve related articles and present the results as bibliometric tables and maps. The search query was modified manually using related terms to maximize accuracy.
Results A total of 2267 publications were retrieved, with an average of 16.82 citations per article. The h-index of the retrieved articles was 88. Visual mapping showed that E. coli, diarrhea, cryptosporidiosis, fluoride, arsenic, cancer, chlorine, trihalomethane, and H. pylori were the most frequently encountered terms in the titles and abstracts of retrieved articles. The number of articles on water microbiology was a significant (P < 0.01) predictor of worldwide productivity of water-related disease publications. Journal of Water and Health ranked first in number of publications with 136 (6.00 %) articles. The United States of America ranked first in productivity with a total of 623 (27.48 %) articles. Germany (15.44 %), India (16.00 %) and China (20.66 %) had the least international collaboration in water-related disease research. The Environmental Protection Agency and the Centers for Disease Control and Prevention were among the top ten productive institutions. Among the top ten cited articles, three were about arsenic, one about aluminum, one about trihalomethane, one about nitrate, one about toxoplasmosis, one about gastroenteritis, and the remaining two were general ones.
Conclusions There was a linear increase in the number of publications on water-related diseases in the last decade. Arsenic in drinking water is a serious concern. Cryptosporidiosis and other infectious gastroenteritis remain a major health risk of exposure to contaminated water. The increased number of publications from Asian countries was not associated with a high percentage of international collaboration.


Background
According to the World Health Organization (WHO), water-related diseases mainly include those due to drinking unsafe water or exposure to contaminated recreational water such as swimming pools [1]. Disease outbreaks due to microbial or metal contamination of water have been reported [2][3][4]. Direct or indirect exposure to contaminated water has been reported to cause a wide range of health-related problems including cancer, gastrointestinal problems, dermatological problems, neuronal toxicity, birth defects, infections, and others [5][6][7][8]. Of particular concern are waterborne microbial infections and exposure to high doses of toxic metals in drinking water. Infections with Giardia, Shigella, Salmonella, Cryptosporidium, Campylobacter, Schistosoma, and other pathogens have been reported following exposure to contaminated water [8,9]. Exposure to water contaminated with arsenic, manganese, lead, cadmium and other metals has been reported to be associated with many serious cardiovascular, oncological, and neurological health problems [10][11][12][13]. Regulations and standards for drinking water safety and for safe use of recreational water have been set to minimize hazards to human health [14][15][16][17]. One of the main goals of the United Nations Millennium Development Goals (MDG) set for 2015 was to halve the proportion of people who do not have access to sustainable and safe drinking water [18]. Achieving goals pertaining to safe drinking water requires an understanding of water-associated health problems reported from different world regions. The quantity and quality of research on water-associated problems are indicators of the current state of water safety in different world regions and may help explain certain disease outbreaks related to unsafe water.
Bibliometric analysis provides the tools to assess research trends on water-related diseases and to identify important directions for future research in this field. It can also point to topics where international collaboration is needed in particular world regions, such as arsenic contamination of water resources in some Asian countries. Therefore, the objective of this study was to give a basic overview of research publications on water-related diseases. The lesson to be drawn from this study is the extent of the global effort that will be needed to eradicate water-related diseases, particularly in developing countries where the available water technology and resources might not be sufficient to guarantee water safety.

Methods
In this study, the Scopus database was used to retrieve articles related to drinking water or recreational water from a health point of view. The search query used in Scopus was as follows:
(TITLE("drink* water*" OR "tap water" OR "ground water" OR "swimming" OR "recreational water" OR "Waterborne Disease" OR "Water disease Outbreak" AND NOT (transport OR channels OR surface OR body OR bodies OR coast* OR suppression OR complex* OR extraction OR reaction OR soluble OR emulsion OR irrigation OR remov* OR resorption OR mice OR animal OR hydration)) AND TITLE-ABS(disease OR health OR infect*)) AND PUBYEAR > 1979 AND PUBYEAR < 2016 AND (LIMIT-TO(SUBJAREA,"MEDI")) AND (LIMIT-TO (SRCTYPE,"j")) AND (EXCLUDE (DOCTYPE,"er"))
The asterisk was used as a wildcard to retrieve all related words. For example, the term "drink*" could retrieve both "drinking" and "drinkable". The same applies to the term "water*", which could retrieve "water" or "waters". The terms were searched in titles to increase accuracy and minimize false positive results, given that "water" is present in many non-medical articles in fields such as chemistry, engineering and agriculture. Seven phrases related to water or water-related health terms were used in the title search. The title search was followed by exclusion of terms that belong to the field of water technology or chemistry. These terms were identified through manual review of candidate health-related articles. The search was sharpened further in two steps. The first step was to require the presence of the keyword "health", "disease" or "infection" in the title or abstract of retrieved articles. The second step was to limit retrieved articles to those categorized under the subject area "medicine" in Scopus. To verify the search query, a sample of 100 highly cited articles was manually reviewed by two co-authors. Whenever the two co-authors disagreed on a certain article, a third co-author was asked to judge and decide on that article. An example of a retrieved article that did not fit the scope of the search query was "Experimental study on green electrical discharge machining in tap water of Ti-6Al-4V and parameters optimization" [19]. Whenever the search accuracy was not satisfactory due to the presence of water-related articles that were not about health, the authors modified the search query until the accuracy obtained was 100 % in the top 100 cited articles.
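The title-level logic of the query can be illustrated with a short Python sketch. It is not Scopus itself: the function names (to_regex, any_match, title_matches, mentions_health) are hypothetical, the asterisk is approximated by a regular-expression wildcard, and the AND NOT clause is assumed to apply to the whole title. The term lists are copied from the query above.

```python
import re

# Term lists copied from the Scopus query; '*' marks truncation.
INCLUDE = ["drink* water*", "tap water", "ground water", "swimming",
           "recreational water", "Waterborne Disease", "Water disease Outbreak"]
EXCLUDE = ["transport", "channels", "surface", "body", "bodies", "coast*",
           "suppression", "complex*", "extraction", "reaction", "soluble",
           "emulsion", "irrigation", "remov*", "resorption", "mice", "animal",
           "hydration"]
HEALTH = ["disease", "health", "infect*"]


def to_regex(term):
    """Compile one query term into a whole-word, case-insensitive regex.

    Assumption: '*' is approximated as "any run of word characters".
    """
    words = [re.escape(w).replace(r"\*", r"\w*") for w in term.split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)


def any_match(terms, text):
    """Return True if at least one term occurs in the text."""
    return any(to_regex(t).search(text) for t in terms)


def title_matches(title):
    """Title must contain an included phrase and no excluded term."""
    return any_match(INCLUDE, title) and not any_match(EXCLUDE, title)


def mentions_health(title, abstract=""):
    """Approximates TITLE-ABS(disease OR health OR infect*)."""
    return any_match(HEALTH, title + " " + abstract)
```

Under these assumptions the title check alone accepts the electrical discharge machining title quoted above, since it contains "tap water" and none of the excluded terms. This is consistent with the need for manual review of the retrieved articles.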
Analysis and graphics of data were carried out by exporting data from Scopus to Microsoft Excel and the Statistical Package for Social Sciences (IBM SPSS Statistics for Windows, Version 22.0. Armonk, NY: IBM Corp.). Density visualization maps and cluster analysis were carried out using the VOSviewer software (Nees Jan van Eck and Ludo Waltman, Leiden University's Centre for Science and Technology Studies) [20]. The quality of publications was assessed using total citations, citations per article, and the Hirsch index (h-index). These parameters were used to assess the quality of publications by journals, countries, and institutions. In addition to these parameters, the impact factor (IF) was used as an indicator of the strength of journals publishing articles on water-related health problems. The h-index was obtained directly from Scopus. To get the h-index for an author, the retrieved data were limited to publications by that author, and Scopus then calculated the total citations and h-index as a built-in function. Similarly, the h-index for a country, institution, or journal was calculated by limiting the data to the country, institution, or journal of interest, after which Scopus performed the citation analysis and reported the h-index directly. The IF was obtained from the latest Journal Citation Reports published by Thomson Reuters.
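For readers unfamiliar with these indicators, the following minimal Python sketch shows how the h-index and citations per article follow from a list of per-article citation counts. In this study the values were read from Scopus; the function names below are hypothetical and the sketch only restates the standard definitions.

```python
def h_index(citations):
    """Largest h such that at least h articles have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, cites in enumerate(ranked, start=1):
        if cites >= position:
            h = position
        else:
            break
    return h


def citations_per_article(citations):
    """Total citations divided by the number of articles (0.0 if empty)."""
    return sum(citations) / len(citations) if citations else 0.0
```

For example, a set of five articles cited 10, 8, 5, 4 and 3 times has an h-index of 4, because four articles have at least four citations each but there are not five articles with at least five.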
Poisson regression is a type of regression analysis used to test the significance of a term as a predictor of a count variable. Poisson regression requires a dependent variable and one or more independent variables as covariates. In the current study, the annual number of worldwide publications on water-related diseases was used as the dependent variable. Keywords used as a single independent covariate were selected based on the keyword list produced by Scopus for the retrieved data.
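The regression in this study was fitted in SPSS. The following Python sketch is not that procedure; it is a minimal illustration of a Poisson log-linear model with a single covariate, fitted by Newton-Raphson on the log-likelihood. The function names and the fixed iteration limit are assumptions made for the example.

```python
import math


def fit_poisson(x, y, iterations=50):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Entries of the 2x2 information matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1


def rate_ratio(b1):
    """Multiplicative change in the expected count per unit increase in x."""
    return math.exp(b1)
```

Here x would hold the annual number of articles carrying a given keyword and y the annual number of worldwide publications. The value returned by rate_ratio is the quantity that is interpreted as "times greater for each extra article".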

Results
General information
A total of 2267 publications were retrieved from Scopus using the search query presented in the Methods section. The total number of citations for retrieved publications was 38,219, an average of 16.82 citations per document. The h-index of the retrieved data was 88. The highest number of publications was recorded in 2015, with 217 publications. Fig. 1a and b show the worldwide productivity using different time scales. The number of publications was low and steady from 1980 up to 2005, followed by a stepwise increase up to 2015 (Fig. 1a). In the last decade, there were two spikes in the number of publications, one in 2006 and the other in 2010 (Fig. 1b). The majority of retrieved publications were original research (1936; 85.40 %). Of the total publications retrieved, 1776 (78.34 %) were written in English and the remaining articles were written in 28 different languages, mostly German (146; 6.44 %).
Using the VOSviewer application, the most frequent terms encountered in the titles and abstracts of the retrieved publications were analyzed. Terms encountered at least 10 times and pertaining to health-related conditions, contaminants, microbiology-related terms, and countries/institutions were presented. A density visualization map of the 138 most frequently encountered terms is shown in Fig. 2. The map has 5 clusters. Each cluster represents closely related frequent terms. In cluster number one, the following terms were most frequent: E. coli (113 occurrences), diarrhea (110 occurrences), and cryptosporidiosis (82 occurrences). In cluster number two, the following terms were most common: USA (79 occurrences), EPA (77 occurrences), and fluoride (73 occurrences). Cluster number three contained the following main frequent terms: arsenic (238 occurrences), cancer (112 occurrences), and cardiovascular (55 occurrences). In cluster number four, chlorine (62 occurrences), trihalomethane (43 occurrences), and asthma (27 occurrences) were the most frequent terms. Finally, cluster number five contained one term, H. pylori (31 occurrences). Other terms encountered in each cluster can be seen in the density visualization map. Of particular note are the terms Bangladesh, Taiwan, and Nepal, which were seen in cluster number three along with arsenic. The term WHO was also seen frequently in cluster number one along with diarrhea and gastroenteritis.
Applying Poisson log-linear regression with the number of articles carrying the keyword "microbiology" as the predictor variable showed that the number of articles on water microbiology is a significant (P < 0.01) predictor of worldwide productivity of water-related health publications (Table 1). The model showed that the expected worldwide productivity is 1.059 times greater for each extra article published on water microbiology. In other words, there is a 5.9 % increase in the expected number of publications for each extra article published on water microbiology.
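The interpretation of the regression coefficient can be written out explicitly. The symbols below are notation introduced here for illustration and assume the standard log-linear form: Y_t is the worldwide number of publications in year t, x_t is the number of water microbiology articles in that year, and β0 and β1 are the model coefficients. Only the rate ratio 1.059 is taken from the results above.

```latex
\log \mathbb{E}[Y_t] = \beta_0 + \beta_1 x_t
\quad\Longrightarrow\quad
\frac{\mathbb{E}[Y_t \mid x_t + 1]}{\mathbb{E}[Y_t \mid x_t]} = e^{\beta_1} = 1.059,
\qquad
\beta_1 = \ln 1.059 \approx 0.0573,
\qquad
\left(e^{\beta_1} - 1\right) \times 100\,\% = 5.9\,\%.
```

Because the effect is multiplicative, k additional microbiology articles correspond to a factor of 1.059^k rather than to k times 5.9 %. For example, ten additional articles correspond to a factor of about 1.77.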

Journal, country, author, and institutional productivity
The retrieved publications appeared in a wide range of medical journals. The top ten journals publishing on water-related diseases are shown in Table 2. The number of different journals that published at least 10 articles on water-related diseases was 36. Journal of Water and Health ranked first with 136 (6.00 %) articles, followed by Environmental Health Perspectives with 87 (3.84 %) articles. American Journal of Epidemiology had the highest number of citations per article (75.53), while Environmental Health Perspectives had the highest IF (8.440). Four of the top ten journals are published in the United States of America (USA), two in the United Kingdom (UK), two in Germany, one in China and one in the Russian Federation. The Russian and Chinese journals had the lowest total citations and citations per article.
The USA ranked first in productivity with a total of 623 (27.48 %) articles (Table 3). Germany (149; 6.57 %) and the UK (141; 6.22 %) ranked second and third respectively. Half of the countries in the top ten list were European, two were Asian, and two were in northern America. Publications from the USA had the highest h-index (69) and the highest number of citations per article (30.04). Countries in the top ten list with the least international collaboration in the field of water-related diseases were Germany (15.44 %), India (16.00 %), and China (20.66 %). Research productivity from the USA and from Asian/African countries paralleled worldwide research productivity (Fig. 1b), with a significant correlation (p < 0.01, r = 0.99).
Regarding productivity of institutions, the Environmental Protection Agency (EPA) and the Centers for Disease Control and Prevention (CDC) ranked first and second respectively (Table 4). Six of the top ten productive institutions were based in the USA, one was the WHO, and the remaining three were based in the UK, Germany and Taiwan. There was a strong, significant inverse relationship (r = -0.83, p < 0.01) between the rank of the institution and the total citations of its publications. The institution that ranked first had the highest total citations, while the one ranked tenth had the fewest. A similar relationship existed between the rank of the institution and the h-index (r = -0.913, p < 0.01).
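The text does not state which correlation coefficient was used. As an illustration only, the sketch below assumes a Pearson coefficient and applies it to made-up values, not to the data in Table 4. A negative value arises because rank number 1 denotes the most productive institution, so citations fall as the rank number rises.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only, not the Table 4 data.
ranks = [1, 2, 3, 4, 5]
total_citations = [900, 700, 650, 400, 300]
r = pearson_r(ranks, total_citations)  # negative: citations fall as rank number rises
```

Applying the same function to the ranks of both variables instead of their raw values would give a Spearman rank correlation, which is a common alternative when one variable is already a rank.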
Regarding the top productive authors, no significant dominance was seen, and most authors in the top ten list had a research productivity of between 14 and 22 articles (Table 5). The majority of authors (90 %) in the top ten list were from the USA, while the last one in the list was from Spain. There was no significant correlation (p > 0.05) between the rank of the author and the percentage of highly cited articles published by the author.

Citation analysis and most cited articles
A total of 1702 (75.07 %) articles were cited at least once; the remaining articles were not cited at all. Cited articles were further analyzed using VOSviewer to create visualization maps. Co-authorship analysis using VOSviewer showed three clusters of authors (Fig. 3). Cluster number one included 14 authors, three of whom were among the top ten productive authors: Parvez, F (116 co-authorships), Ahsan, H (113 co-authorships), and Chen, Y (112 co-authorships). Authors with a higher number of co-authorships collaborated more than those with a lower number. Furthermore, authors in the same cluster collaborate more closely with each other than with authors in other clusters. Cluster number two included 12 authors, one of whom was among the top ten productive authors: Colford Jr, J.M (17 co-authorships). Cluster number three included 11 authors, four of whom were in the top ten productive list: Wade, T.J (47 co-authorships), Calderon, R.I (34 co-authorships), Craun, G.F (36 co-authorships), and Beach, M.J (36 co-authorships).
The top cited articles are shown in Table 6 [6,7,9][21][22][23][24][25][26][27]. The top cited article was about arsenic in drinking water in Bangladesh and received a total of 919 citations. The article was published in the Bulletin of the World Health Organization. The second ranked article in number of citations was also about arsenic in drinking water, specifically its association with cancer in northern Chile. The article received a total of 495 citations and was published in American Journal of Epidemiology. The first and second ranked articles were published by the same author group. A third article on arsenic ranked seventh and was about the association between arsenic in drinking water and internal cancer. Of the top ten cited articles, three were about arsenic, one was about aluminum and its association with Alzheimer's disease, one was about the association between trihalomethane in drinking water and spontaneous abortion, one was about acceptable levels of nitrate in drinking water, one was about toxoplasmosis infection due to exposure to contaminated water, one was about gastroenteritis due to exposure to contaminated recreational water, and the remaining two articles were about contaminated water and its general health effects.

Discussion
In this manuscript, we present a bibliometric overview of publications on water-related diseases. These cover a wide range of conditions, from infections due to microbial contamination of water to cancer and cardiovascular or neuronal disorders due to exposure to materials such as heavy metals in drinking water. Several bibliometric analyses of water publications have been carried out that focused on one aspect, such as arsenic or lead in drinking water or infection with Cryptosporidium [28][29][30]. However, no studies have been carried out to assess the overall health aspects of unsafe drinking or recreational water. Bibliometric analyses of water in general and of water technologies have also been carried out without focusing on health-related issues [31][32][33][34].
Our study showed that there is growing interest and research activity in this topic, seen as an increasing trend in the number of publications, particularly in the last decade. Furthermore, this interest is present in different parts of the world, as shown by the geographical diversity of the countries in the top ten list. Governmental and nongovernmental international health bodies like WHO, CDC and EPA are taking the lead in this topic, as reflected in the top ten productive institutions and authors. Finally, the study showed that microbial and toxicity-related health issues are being heavily addressed. The toxic effects of water contaminants and microbial contamination are a serious concern particularly in developing countries, while the risk of infections and negative health effects from recreational water and swimming pools is a concern in developed countries [35][36][37][38]. It has been reported that diarrheal disease, mostly due to contaminated drinking water, accounts for 4.1 % of the total Disability Adjusted Life Year (DALY) global burden of disease and is responsible for the deaths of 2 million people every year [39]. The high h-index of the retrieved publications is a strong indicator of the value and importance of such publications. The importance of water-related diseases was emphasized by dedicating a World Water Day (March 22, 2013) [40]. The top cited articles reveal the hot topics in water-related health research and the topics that are of real concern to health organizations and health policymakers: arsenic and cancer, aluminum and Alzheimer's disease, trihalomethane toxicity in drinking water, risk of gastroenteritis from recreational water, disease outbreaks, and Bangladesh as a country at high risk of arsenic toxicity in drinking water.
Table footnote: Institutes having a similar number of publications were given the same ranking number, and a gap was then left in the ranking numbers.
Journals encountered in the top ten productive list were mainly in the field of environmental health or epidemiology. A specialized water journal was also present and ranked first among the top ten journals publishing on water-related diseases. The subject of water and health is a multidisciplinary one, and therefore various types of journals were encountered, including ones in the fields of chemistry, environment, epidemiology, infection, toxicology, public health and others. That is one reason why no significant dominance was seen among different types of journals, apart from the specialized Journal of Water and Health. The finding that the Chinese and Russian journals had the lowest citations and h-index compared with other journals in the top ten list could be due to the language of publication, since English remains the language of scientific research. Despite that, the presence of a Russian and a Chinese journal in the top ten list of journals is indicative of how common water-related health problems are in all world regions.
Table 6 Top ten cited articles on water-related diseases [6,7,9][21][22][23][24][25][26][27]
Figure note: Some names might not be seen due to overlap of names or limited magnification power.
Countries included in the top ten productive list were also diverse and included several European countries, northern American countries, Asian countries and Australia. However, countries from regions like Africa or Latin America were missing from the top ten list. That the USA occupied the first rank in the list was not surprising given the research facilities and funds available for health-related projects in the USA. Furthermore, the EPA and CDC are actively involved in water-related health issues, which is why both occupied top ranks among the productive institutions. Bibliometric studies in other medical fields have also shown dominance of the USA over other countries in the number and quality of publications. In the current study, the productivity of the USA was more than one fourth of worldwide research productivity. The USA has witnessed several waterborne-related disease outbreaks (WBRDOs). The CDC reported a total of 32 outbreaks of water-related diseases in 2011 and 2012, which resulted in at least 431 cases, 102 hospitalizations (24 % of cases), and 14 deaths [3]. Another study indicated that from 1971 to 2006, a total of 833 waterborne disease outbreaks with 577,991 cases of illness and 106 deaths were reported [41]. Microbial agents such as Cryptosporidium, E. coli, norovirus, Legionella, Giardia and other infectious agents seem to be the leading cause of WBRDOs in the USA and other countries [41][42][43][44].
The top ten productive institutions included many academic and research centers in the USA. However, two institutions/organizations in the top ten list are worth commenting on: the WHO and the Kaohsiung Medical University in Taiwan. The WHO has published a series of reports on water quality, technology and water-related diseases. The WHO is involved in developing preventive strategies for water-related diseases [45]. Furthermore, the WHO is involved in estimating the national burden of water-related diseases and in setting national guidelines for good water quality [46,47]. The Kaohsiung Medical University in Taiwan is listed as one of the top ten productive institutions in water-related diseases worldwide. Some of the publications of this academic center have focused on the role of arsenic and other heavy metals in drinking water and their association with cancer [48][49][50][51].
Our study has a few limitations that need to be mentioned. Most of these limitations are similar to the ones listed in previous bibliometric studies and are inherent to the technique itself and to the database chosen to retrieve data [52][53][54][55]. It is important to remember that Scopus retrieves articles with an English title and abstract regardless of the language of the manuscript. Therefore, articles written completely in a non-English language were not retrieved. For example, articles published in Persian in a journal not indexed in Scopus will not be retrieved. It is the policy of Scopus that all articles in indexed journals have an English title and abstract. This might have created some bias in the data. However, articles with no English title or abstract are likely to be of local interest and of little international impact compared with those that have an English title and abstract and are therefore readable by researchers all over the world. In retrieving articles, the search query was built by the authors based on the literature and on manual review of retrieved articles. Therefore, the results of this study remain valid within the context of the search query, which was confined to a title search and to journals categorized under the subject area "medicine" in Scopus. Finally, the citation analysis presented in this manuscript did not exclude self-citations, which are common in the literature and might not reflect the true strength of authors and journals. All author names listed in this manuscript are presented as retrieved from Scopus and are based on the data present in the retrieved articles. The authors of this manuscript did their best to avoid false positive and false negative results by manual review of hundreds of retrieved articles. Furthermore, the authors tried their best to confine the data to articles on water-related health problems and to exclude articles in pure chemistry, engineering, technology, and physics.

Conclusions
This study showed that there was a noticeable and almost linear increase in the number of health-related publications on drinking and recreational water. The major contribution to these publications came from the USA and Europe. Institutions and international health centers like EPA, CDC, and WHO are taking a prominent role in this field. Arsenic, other heavy metals, gastroenteritis and cryptosporidiosis are important health-related problems encountered in drinking and recreational water. Research productivity on water-related diseases from Asia and Africa has increased in the last few years. However, research from Asian countries like India and China was characterized by a low percentage of international collaboration. The number of studies linking arsenic and other heavy metals to various types of cancer calls for global action, particularly in countries such as Bangladesh.