Intellectual Structure Mapping of Sickle Cell Anemia Research in India: A Scientometric Analysis

Few studies have focused on network analysis and visualization of Indian research performance in Sickle Cell Anaemia. This study attempts to bridge this gap using metrics, with a view to understanding the present status of research at the global, national, institutional, author and source level. The study is based on publication and citation data sourced from Scopus for the period 1958 to 2020. The bibliographic data was statistically analyzed on various metrics such as country collaboration, document type, productive author, journal, highly cited works, productive year and author affiliation. The U.S.A., U.K. and India collaborate most strongly in research on “Sickle Cell Anemia”. Canada, the USA and Jamaica are the most highly cited nations. Indian Journal of Hematology and Blood Transfusion is the most productive journal. In addition, the study investigates and maps productive institutions, collaborations among these institutions, key authors, key source journals and the most significant keywords in the subject, visually presenting their inter-relationships using Biblioshiny and VOSviewer software. The results and findings of this study describe the progress made by India through research on this deadly genetic disorder, as well as the future scope and trends, which will be useful for researchers working in, or enthusiastic about, this area.


INTRODUCTION
Sickle Cell Disease (SCD) is a group of genetic blood disorders, that is, disorders inherited from parents. The subject of this study, "Sickle Cell Anemia" (SCA), is a type of Sickle Cell Disease. The disease was first reported in 1846, when the autopsy of an executed slave was discussed and the finding was the absence of the spleen in his body. The spleen is an organ in the left part of the abdomen, protected by the rib cage. [1] It generally functions as a blood filter in all vertebrates; in particular, it recognizes old or damaged Red Blood Cells (RBCs) and eliminates them from the body by breaking them down, storing useful components such as iron in the process. This results in the circulation of clean blood and keeps the blood functioning at its best. A person with SCA has an abnormality in hemoglobin, the oxygen-carrying protein found in RBCs, which leads to a rigid, sickle-like shape of these cells. The effects of the disease can be observed as early as 5 to 6 months of age through a number of health problems such as attacks of pain, acute anemia, swelling of body parts like the hands and feet, bacterial infection and stroke. The organs of a person suffering from this ailment gradually cease to function because of limited oxygen supply, caused by a gradual decrease in the number of RBCs, which will fall to nil in the course of time. [2] The disease is deadly because it is inherited: a generation with SCA passes it on to its progeny, limiting their life expectancy to 40 to 60 years. Treatment needs highly sophisticated, state-of-the-art medical equipment and tools because it involves work at the cellular level. Research on treatment procedures has resulted in the evolution of three techniques, viz. (i) Umbilical Cord Blood Transplant, which needs a suitable donor and is reported to be suitable in only 10% of people, (ii) Gene Therapy, which uses a normal copy of the gene that is mutated, and (iii) Hematopoietic Stem Cell Transplantation, a method that has no evidence of treating people suffering from SCD. [3] In the Indian context, this disease is found to be common in some ethnic groups of Central India, where the presence of this genetic disorder has risen from 9.4% to 22.2% in regularly detected areas of Madhya Pradesh, Rajasthan and Chhattisgarh. [4] A study examined mortality in Sickle Cell Disease during 2008 to 2018 in an aboriginal community in the Gudalur Valley, Nilgiris, Tamil Nadu, India. [5] The sample comprised 157 patients belonging to the Paniya, Betta Kurumba, Kattunayakan and Mullu Kurumba tribes. The study reveals that during the study period there were 22 deaths, all from the Paniya tribe. Twelve deaths (54.5%) occurred in the hospital and the remaining at home (45.5%), reflecting a crude mortality rate of 140 per 1000 population. Twenty-five percent of deaths occurred in the 6-18 age group. There were no deaths in the 0-5 age group. The median age of death was 25 years, which was 30 years less than in the non-SCD aboriginal population. The leading causes of death were acute chest syndrome, anaemia, and sepsis among the SCD patients, and stroke and suicides in the non-SCD aboriginal population. Given the severity of this genetic ailment in India as well as globally, it is considered worthwhile to undertake a scientometric assessment to understand and describe the current state of research on "Sickle Cell Anemia" based on Indian literature published and indexed in Scopus till date.

Literature Review
Over the past several decades, quite a few studies have been undertaken on different human ailments, be they genetic or viral, using bibliometric methods. Some of these works are reviewed as follows. One high-impact study analysed the pattern of literature growth, global publication share, ranking, authorship pattern, collaborative coefficient, productivity and impact of the most productive institutions and authors, and highly cited articles, based on data obtained from the Scopus database on Chronic Liver Disease (CLD) in SAARC countries. [6] The study reveals that the SAARC nations contributed 2312 documents during 1996 to 2015, which is only 3.49% of the global output of 66200 publications. It also shows that the amount of literature increased considerably over the last five years of the study period. India leads among the SAARC nations. A bibliometric analysis of Plesiomonas-related research with data from Web of Science during 1990 to 2017 reveals that a total of 155 articles were published in the survey period, with annual growth of 0.8%. The USA ranks first in terms of number of articles (n=29) and total citations (451). [7] Research collaboration was also low, with a collaboration index of 3.32. This bibliometric analysis reveals that research on Plesiomonas is diminishing globally and that more research output comes from high-income countries than from others. Another bibliometric analysis examined the prediction of infectious disease with data sourced from Web of Science. [8] The 1880 documents published on the topic were analyzed, and the analysis reveals that the publications appeared in 427 different journals across 11 different document types, the most common being articles (1618). The study reveals that Nature is the top cited journal with 781 citations, followed by PLoS ONE with 707 citations. The USA is the most productive nation with 749 documents.
A study investigated Coronavirus literature using bibliometric indicators, with data from 1970 to 2019 sourced from Scopus. [9] The study was carried out using the keyword Coronavirus and analysed for annual growth, productive countries, institutes, authors, journals, highly cited papers, and research focus using keywords. Most of the research publications were from the USA (31.67%), and the University of Hong Kong was the most productive institute. A study evaluated the global research output (820 records) on "Use of Convalescent Plasma Therapy for COVID-19" on metrics with the aim of understanding the current status of research at the global, national, institutional, and individual author level. [10] The study is based on publication and citation data sourced from the Scopus database during 2020-21. The publication and citation data were statistically analysed on various metrics such as document type, country of publication, collaboration patterns, author affiliation, journal name, and citation patterns.
Recently, a few attempts have been made to analyze the literature on Sickle Cell Anemia. These are reviewed as follows. A study analysed the literature on Sickle Cell Anaemia in Nigeria with data extracted from PubMed for the period 2006 to 2016. [11] Nigerian Journal of Clinical Practice is the most productive journal from the nation. The highest number of contributions is with the USA, followed by Italy. Another study analyzed the global literature on Sickle Cell Anemia using bibliometric indicators, with literature published during 1997 to 2017 and indexed in the Scopus database. [12] A total of 19,921 publications were recovered for the period, the majority of which are journal articles. The findings reveal that Blood is the most productive source, the USA is the leading nation and India is in 5th position globally. Keywords like human, hydroxyurea, blood transfusion, controlled study, clinical research, anemia and pathogenesis are most common. Another study examined Sickle Cell Disease in global perspective, with data extracted from Web of Science indexed during 1900 to 2020, applying bibliometric indicators. [13] This study also reveals that Blood is the most productive journal and that the most productive author is Serjeant, GR. The most prominent keywords are anemia, children, disease and management. The USA is the most productive nation.
Bibliometrics applies mathematical and statistical methods to summarize scientific activity in a subject, helping to identify research frontiers, trending areas and rising patterns based on literature from various relevant databases. [14][15] Several visualization tools such as VOSviewer, [16] Biblioshiny, [17] CiteSpace [18] and HistCite [19] are used to develop knowledge and network maps, analyze the latest research progress and visualize trends and co-authorship in scientific publications. [20][21] This study attempts to prepare network visualization maps for different bibliometric parameters using analysis units such as journals, authors, organizations, countries and keywords.
No one can deny that many bibliometric analyses are available on topics related to diseases. However, not a single study has been found in a Scopus or Web of Science indexed journal on Indian research output in Sickle Cell Anemia with data extracted from Scopus from 1958 till date. So, this study attempts to fill this research gap.
The study attempts to describe the Indian contribution on this disease using metrics. Specifically, it aims to (i) examine the extent of India's global research collaboration, (ii) analyze authors' contributions on the basis of number of publications and citation impact, (iii) investigate the most productive institutions, (iv) identify the highly cited works and the document forms, (v) analyze the networks of country, author and organizational co-authorship and the bibliographic coupling of journals, and (vi) analyze the clusters and network of keyword co-occurrence, and hence evaluate the most frequently occurring keywords, map word growth and map trending topics in Sickle Cell Anemia research. The results of this study will be relevant to researchers, physicians and health policy makers as well as government.

Identification of the Search Strategy
A well-defined search strategy was used to retrieve and download publication data from the Scopus database. The search for global literature published on "Sickle Cell Anemia" was conducted with no starting date specified. The strategy was developed so as to obtain a reliable set of data for analysis and accurate results. An "All field" search was conducted with search terms related to "Sickle Cell Anemia" as ((ALL ("Sickle Cell Anemia" OR "Anemia" OR "Genetic Blood Disorder")) OR (TITLE ("Sickle Cell Anemia" OR "Anemia" OR "Genetic Blood Disorder"))) AND (LIMIT-TO (AFFILCOUNTRY, "India")). Here, the search operator "OR" is used instead of "AND" because "OR" returns broader search results, which improves the coverage of the retrieved data. The database search returned 57,436 results as global output. Since the study is concerned with India, the data was next limited to India in the country box, which resulted in 1443 publications from the initial year of indexing, 1958, to 2020. The Indian output was subsequently analyzed by author, affiliation, journal, country of publication, top cited countries, top cited documents and year-wise productivity, and prominent keywords were analyzed to predict the trending areas of research in the field.
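The structure of the search string can be sketched in Python as follows. This is only an illustration of one possible reading of the query's grouping: the helper names (or_group, build_query, is_balanced) are hypothetical, and only the field codes and terms quoted in the text are used.

```python
def or_group(field: str, terms: list[str]) -> str:
    """Return FIELD("t1" OR "t2" ...) for one field code."""
    quoted = " OR ".join(f'"{t}"' for t in terms)
    return f"{field}({quoted})"


def build_query(terms: list[str], country: str) -> str:
    """Join an ALL group and a TITLE group with OR, then restrict by country."""
    body = f"({or_group('ALL', terms)} OR {or_group('TITLE', terms)})"
    return f'{body} AND (LIMIT-TO(AFFILCOUNTRY, "{country}"))'


def is_balanced(query: str) -> bool:
    """Check that every opening parenthesis has a matching close."""
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


# Search terms as listed in the text.
TERMS = ["Sickle Cell Anemia", "Anemia", "Genetic Blood Disorder"]
QUERY = build_query(TERMS, "India")
```

Checking parenthesis balance before submitting a long query is a cheap safeguard, since an unbalanced string may be rejected or interpreted differently by the database.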

Analysis Tools and Techniques
Analysis and tabulation of data are done using MS-Excel. For mapping the data Biblioshiny and VOSviewer are used. The indicators analyzed in the study as per the objectives are top collaborative countries with India on the basis of number of publications and citations; productive authors; productive institutions or organizations on the basis of number of publications, total local and global citations; highly cited research work on the basis of local and global citations, most common medium of communication on the basis of number of records; most productive and highly cited journal; most productive year, significant keywords and trending topics. The analyzed and mapped data is depicted in tables, network visualization maps and interpreted objective-wise in the following segment. In particular, Figure 1 is generated using the trending bibliometric data analysis tool Biblioshiny Web interface in R-studio while Figure 2 is generated using the popular network visualization and mapping tool VOSviewer.
The former depicts the authors on the basis of productivity, while the latter presents a network visualization of highly cited authors. The default normalization method, "Association Strength", is used for creating Figures 2-6.
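As a rough illustration of what association-strength normalization does, the sketch below divides each observed co-occurrence count by the product of the two items' total co-occurrence counts. It is a minimal sketch under assumptions: the counts are hypothetical, the function name is illustrative, and any constant scaling factor applied by the software is omitted.

```python
def association_strength(
    cooc: dict[tuple[str, str], int]
) -> dict[tuple[str, str], float]:
    """Normalize each link weight by the product of its endpoints' totals."""
    totals: dict[str, int] = {}
    for (a, b), count in cooc.items():
        totals[a] = totals.get(a, 0) + count
        totals[b] = totals.get(b, 0) + count
    return {
        (a, b): count / (totals[a] * totals[b])
        for (a, b), count in cooc.items()
    }


# Hypothetical co-authorship counts between three authors.
COOC = {("A", "B"): 4, ("A", "C"): 1, ("B", "C"): 1}
NORMALIZED = association_strength(COOC)
```

In this toy example the A-B link keeps the highest normalized value (0.16) while the two weaker links both receive 0.1, showing how the measure corrects raw counts for how active each item is overall.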

RESULTS

Country co-authorship
In all, 120 countries have participated in global research on "Sickle Cell Anemia", but the distribution of collaborative productivity across these nations is not normal; it is highly skewed.
A country co-authorship map is generated using the popular network visualization and mapping tool VOSviewer [14] [Figure 3]. It provides a visual presentation of comparative productivity and the nature of collaboration in research within the field of study. Taking a minimum of 2 co-authoring countries as the threshold, 50 countries are found to be networked. The links, shown as straight lines, depict the collaboration between the countries. The thickness of a link between two countries represents the strength of their collaboration. Overall, these 50 countries are divided into 7 clusters. Cluster 1 (Red) has 18 countries, among them Germany, Switzerland, United Kingdom, Spain and Thailand. Cluster 2 (Green) has 12 countries in total, among them China, Italy, Iran, Saudi Arabia, Turkey and Bulgaria. This is followed by Cluster 3 (Deep Blue) with six countries: Canada, Finland, Hong Kong, Nigeria, Singapore and South Africa. Cluster 4 (Light Yellow) has 5 countries: Bangladesh, Japan, Pakistan, South Korea and Taiwan. Cluster 5 (Purple) has 4 countries: India, Cameroon, Jamaica and Libyan Arab Jamahiriya. Cluster 6 (Light Blue) has 3 countries: Australia, Sri Lanka and United Arab Emirates. Lastly, Cluster 7 (Orange) has 2 countries: United States and Russian Federation.

Productive Authors
In total, 320 authors have participated globally in "Sickle Cell Anemia" research, with some authors contributing more articles in this area than others. Table 1 shows the scientometric profile of the top 10 authors based on publications and citations. The first three columns in Table 1 list prolific authors by publication impact, and the other three columns list authors by citation impact. The results in Table 1 imply that an author who is highly productive (with the maximum number of publications) may not be impactful (may have fewer citations). In Figure 2, we can observe that some authors, like Ghosh Kanjaksha, have consistently produced research work, while others, like Agarwal Sarita, were very productive earlier but disappear later on. The radius of the circles is directly proportional to the productivity of the author. A circle with a greater radius for a given year implies that the author was more productive in that year. Figure 3 presents a network map of co-authorship among authors. The size of the circle and the font size of the author's name are directly proportional to the number of documents of the author. The colored links between the authors indicate the nature of co-authorship among them.

Organizational Productivity
In total, nearly 98 organizations participated in Indian "Sickle Cell Anemia" research, and descriptive analysis of the dataset indicates that it is not normally distributed [Table 2]. A three-field plot, created using the Biblioshiny application, is used for further analysis [Figure 7]. The first field on the left shows the countries, the second field in the middle shows the authors, and the third field on the far right shows the affiliations [Table 3].

Medium of Research Communication
In all, 1505 research publications on "Sickle Cell Anemia Research" in the Indian perspective were found in the data retrieved from Scopus. Most of the publications are in the form of journal articles. Of the total, 1054 are journal articles (70.03%), 227 are review papers (15.08%), 87 are letters (5.78%), 46 are conference papers (3.05%) and 43 are book chapters (2.85%); together these take the majority share. The implication of this dataset is that authors are more inclined to publish their research work in journals than in other forms, as journals offer more visibility, impact and durability.
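The shares above follow directly from the reported counts, as the short calculation below illustrates. The variable and function names are illustrative only; the counts are those given in the text. Rounding to two decimals gives 3.06 and 2.86 for conference papers and book chapters, so the last digit can differ from figures that were truncated rather than rounded.

```python
TOTAL = 1505  # total publications reported in the text

# Document-type counts as reported in the text.
COUNTS = {
    "Journal article": 1054,
    "Review": 227,
    "Letter": 87,
    "Conference paper": 46,
    "Book chapter": 43,
}


def shares(counts: dict[str, int], total: int) -> dict[str, float]:
    """Return each document type's percentage share, rounded to 2 decimals."""
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


SHARES = shares(COUNTS, TOTAL)

# Publications not covered by the five listed document types.
OTHER = TOTAL - sum(COUNTS.values())
```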

Most Productive and Highly Cited Journals
In all, 677 journals have published articles on "Sickle Cell Anemia" research by Indian researchers. The dataset for the most productive journals clearly shows the difference between a productive journal and an impactful one: a journal may be productive without being impactful. The dataset for total citations and h-index behaves unevenly with respect to the number of publications. Of the 10 most productive journals, 3 have an h-index between 10 and 15, followed by 12 journals with an average h-index between 5 and 10, and 5 journals with an h-index in the range 1 to 5 [Table 4].
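The distinction between productivity and impact can be made concrete with the standard h-index definition: the largest h such that h papers have at least h citations each. The sketch below uses two hypothetical citation lists, not data from this study.

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical journals: one with many lightly cited papers,
# one with few but heavily cited papers.
PRODUCTIVE = [3] * 12
IMPACTFUL = [40, 30, 20, 10]

H_PRODUCTIVE = h_index(PRODUCTIVE)
H_IMPACTFUL = h_index(IMPACTFUL)
```

Here the journal with twelve papers reaches an h-index of only 3, while the journal with four papers reaches 4 and collects far more citations in total.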
The bibliographic coupling between different journals is depicted in Figure 5 generated with VOSviewer. Multicolored circles imply the variations in bibliographic coupling. The size of the circles is related proportionally with the

Year-wise distribution of Publication
In all, bibliographic details of 1505 publications on "Sickle Cell Anemia" have been extracted from Scopus and sorted by year of publication. It has been observed that the distribution of the data obtained (after tabulation in ascending order of year of publication) is not normal. The value of R² (≈1) indicates that the chronological growth in the number of publications is consistent [Figure 8].
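For readers who want to reproduce this kind of check, the coefficient of determination of a trend line can be computed as sketched below. The text does not state which trend model was fitted, so a least-squares straight line is assumed here, and the yearly counts are hypothetical.

```python
def linear_r_squared(xs: list[float], ys: list[float]) -> float:
    """R^2 of the ordinary least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical yearly publication counts (not taken from the study).
YEARS = [2016, 2017, 2018, 2019, 2020]
PUBLICATIONS = [52, 61, 70, 83, 95]
R2 = linear_r_squared(YEARS, PUBLICATIONS)
```

A value close to 1 means the fitted line explains almost all of the year-to-year variation in output, which is the sense in which growth is described as consistent.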
The descriptive statistical analysis of the annual scientific production indicates the dataset to be skewed and leptokurtic.
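Skewness and kurtosis can be computed from the central moments of the yearly counts, as in the sketch below. It uses the population (uncorrected) moment definitions, whereas statistical packages often apply sample corrections, and the data are hypothetical: many low-output years followed by a few high-output ones.

```python
def skewness_and_excess_kurtosis(values: list[float]) -> tuple[float, float]:
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# Hypothetical annual output: 20 years with 1 paper, then rapid growth.
ANNUAL_OUTPUT = [1] * 20 + [5, 10, 60, 120]
SKEW, EXCESS_KURT = skewness_and_excess_kurtosis(ANNUAL_OUTPUT)
```

Positive skewness indicates a long right tail (a few very productive years), and positive excess kurtosis is what is meant by a leptokurtic distribution.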

Significant Keywords and Trending Topics
Keyword co-occurrence acts as secondary support for gaining insight into the main topics of any subject. Figure 9 and Figure 10 are generated in the Biblioshiny web interface through the R-Studio platform, and Figure 6 is created using the common network visualization tool VOSviewer. Figure 6 gives a network map of keyword co-occurrences at a glance. The radius of a circle and its font size are proportional to the frequency of occurrence of the keyword. The links between the keywords can be identified by differences in color. In total, 12834 keywords were identified; taking the minimum number of occurrences of a keyword as 4, it was found that 1937 keywords meet this threshold. These connected keywords are divided into 10 clusters, each with a definite number of keywords presented in a different color. (21), homocysteine (13), platelet count (13), machine learning (6), medical record (5). The keywords listed in Table 5 are arranged in decreasing order of frequency of occurrence in the literature.
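The thresholding step described above can be sketched as follows: count in how many documents each keyword occurs, then keep only the keywords that reach the minimum number of occurrences. The function names and the keyword lists are hypothetical; only the threshold value of 4 comes from the text.

```python
from collections import Counter


def keyword_frequencies(docs: list[list[str]]) -> Counter:
    """Count the number of documents in which each keyword occurs."""
    counts: Counter = Counter()
    for keywords in docs:
        # Normalize case and count each keyword once per document.
        counts.update({k.strip().lower() for k in keywords})
    return counts


def apply_threshold(counts: Counter, min_occurrences: int) -> dict[str, int]:
    """Keep only keywords occurring at least min_occurrences times."""
    return {k: c for k, c in counts.items() if c >= min_occurrences}


# Hypothetical keyword lists for four documents.
DOCS = [
    ["Anemia", "Human"],
    ["anemia", "Thalassemia"],
    ["Anemia", "human"],
    ["Anemia"],
]
FREQ = keyword_frequencies(DOCS)
KEPT = apply_threshold(FREQ, min_occurrences=4)
```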

DISCUSSION
Bibliometric analysis is an effective and efficient tool for understanding the current status and predicting future development trends in the knowledge domain studied, which makes it different from systematic reviews. [22][23] Research on "Sickle Cell Anemia" comprised a total of 1505 publications, contributed by 320 authors from 98 organizations in collaboration with 120 countries. The average research productivity in the subject was 4.23 authors per document, and the average number of citations per year per document was 2.29. The research output (677 productive journals) has received a total of 26543 citations since publication. The average performance in the subject was 17.59 citations per paper. The USA topped the global ranking of countries collaborating with India (32.63%), followed by the United Kingdom and Australia (6.22% and 5.44%) and the remaining 11 most collaborative and productive countries. Moreover, the study provides insight into the key countries, key organizations, key authors, prominent source journals, significant keywords and trending topics of research on "Sickle Cell Anemia". The findings of the study are significant as they give a pen picture of the research conducted by Indian scientists and medical professionals on Sickle Cell Anemia, a deadly genetic disorder of the blood. Institutions like ICMR-National Institute of Immunohematology, ICMR-National Institute of Research in Tribal Health and AIIMS are contributing prominently to research in the field. India, being a developing nation, has to draw up plans to mitigate this genetic disorder through more R&D activities, although these activities have been increasing in recent decades.

Indian Journal of Hematology and Blood Transfusion, Indian Journal of Pediatrics and Indian Journal of Medical Research are the most preferred journals, as implied by their maximum number of publications. The Lancet is found to be the most impactful journal, with the highest number of citations, which reveals its quality. These journals are of high impact, and the publication of quality papers has also raised their academic impact.
The paper also examines co-authorship in terms of countries, authors and organizations. Co-authorship implies evaluating the relationships among the items (countries, organizations) through the number of co-authored documents. It is applied to assess the cooperation between different organizations, countries and authors in the field of Sickle cell anaemia research.
The quantitative assessment of closeness of cooperation is given by three indicators: links, number of documents and total link strength (TLS). [21] A higher TLS implies that the countries, organizations and authors tend to work more collaboratively than those with a low TLS; that is, the greater the value, the more frequent the cooperation. The cluster analysis also indicates the closeness of cooperation between the countries, organizations and authors. Presence in the same cluster, shown with a common colour, indicates cooperation in research. The findings of this analysis are presented in the previous sections.
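The indicators named above can be illustrated with a small sketch that derives, for each country, its number of links (distinct partners) and its total link strength (sum of co-authored documents over all partners). Full counting is assumed, where each co-authored document adds 1 to every pair of countries on it; the document list and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def link_weights(docs: list[list[str]]) -> Counter:
    """Count co-authored documents for every unordered pair of items."""
    weights: Counter = Counter()
    for items in docs:
        for a, b in combinations(sorted(set(items)), 2):
            weights[(a, b)] += 1
    return weights


def links_and_tls(weights: Counter) -> dict[str, tuple[int, int]]:
    """Return {item: (number of links, total link strength)}."""
    stats: dict[str, tuple[int, int]] = {}
    for (a, b), w in weights.items():
        for item in (a, b):
            links, tls = stats.get(item, (0, 0))
            stats[item] = (links + 1, tls + w)
    return stats


# Hypothetical country lists for four documents.
DOCS = [
    ["India", "USA"],
    ["India", "USA", "UK"],
    ["India", "UK"],
    ["India"],
]
WEIGHTS = link_weights(DOCS)
STATS = links_and_tls(WEIGHTS)
```

In this toy data all three countries have two links, but India has the highest total link strength (4 against 3), reflecting more frequent cooperation with its partners.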
In a bibliometric study, as mentioned, the frequency of appearance of keywords in a data set reveals the hot spots and the future development of a discipline. [21] The analysis is performed in three different ways. For the keyword co-occurrence network and cluster analysis performed with VOSviewer, all identified keywords were extracted from the Scopus data and analyzed. The cluster and network analysis divides the keywords into 10 different clusters on the basis of how commonly they occur together and the relationships among them. In parallel, using Biblioshiny, the frequency of occurrence of the keywords is evaluated and the most frequently occurring keywords are depicted. Moreover, the trends of the keywords are also analysed. In the last decade, from 2011 to 2020, the keywords with maximum frequencies were thalassemia, sickle cell trait, hemoglobinopathy, male, child, anemia-sickle cell, child-preschool, polymerase chain reaction, human, female, sickle cell anemia, adult, hemoglobin, genetics, blood, young adult, hemoglobin beta chain, metabolism, cytology and hematopoietic stem cell transplantation. This implies that the focus of the research was human beings, with special attention to gender (keywords like male and female) and to age group (keywords like child-preschool and young adult). The research also turned to state-of-the-art medical treatments like stem cell transplantation. Research in this area has left ample scope for new researchers to explore diagnosis and treatment methods to overcome this genetic ailment. The study has been conducted with enriched bibliometric techniques such as visualization, which draw out new insights. Moreover, analysis of performance at the article, author, institution and country level using citation metrics, and science mapping with bibliographic coupling and co-authorship, shed light on the field.
Since the study is limited to network and cluster analysis, it leaves bibliometricians the possibility of conducting more extensive bibliometric analysis using performance- and citation-related metrics like the collaborative coefficient, degree of collaboration, modified collaborative coefficient and authorship patterns to understand the sociological aspects of research in this field, which is a direction for future research. Moreover, the study considers data from only one database, i.e., Scopus, and is limited to Indian publications, leaving the opportunity to conduct a comparative study with data from multiple databases covering global publications.

CONCLUSION
This paper contributes in three ways to expanding information on research related to Sickle Cell Anemia. First, it gives insights into previous work published on the topic, which helps to find the research gap. Second, it conducts a bibliometric and network analysis to discover the most impactful articles, co-authorship of countries, organizations and authors, highly cited publications, preferred journals, keywords and the chronological distribution of publications. Third, the analysis of keywords depicts the trending topics and gives meaningful directions for future research. The information generated in this study will be of immense help to researchers working in this area.
In a nutshell, although India collaborates most with nations like the USA, U.K. and Australia in terms of number of publications, in terms of co-citation India is closer to nations like Canada and Jamaica, though the USA also lies between these two nations. The strength of international collaboration in "Sickle Cell Anemia" research is observed to be highest among the USA, U.K. and India vis-à-vis the other top 26 countries. India is developing in every aspect, be it medical science or space science, but the findings of this study still depict India as lagging behind the global scenario in research on "Sickle Cell Anemia". As per a survey by ICMR, 20% of children with SCD die by the age of two and 30% of children with SCD die before attaining adulthood. This is a very worrying sign for the future of India, and it is reflected in the findings of the study as well. The collaboration network of India is not as strong as that of the USA; moreover, India also lags behind in number of publications. Authors like Colah B. Roshan and Ghosh Kanjaksha of ICMR-National Institute of Immunohematology are contributing quality research of global standard, but it needs to be practically implemented in the field. This could be achieved only through implementation of government initiatives and policies.