Pathways in the Drug Development for Alzheimer’s Disease (1906-2016): A Bibliometric Study

Investments in drug development for Alzheimer’s disease (AD) have not led to the availability of a treatment to cure or halt the progression of the disease. This study aimed to provide insights into the current lack of an effective therapy against AD by exploring the evolution of research paths in the scientific domains corresponding to fundamental, preclinical and clinical research from the identification of the disease in 1906 up to 2016. More specifically, the influence of the amyloid cascade hypothesis and the use of animal models in the evolution of drug development for AD were explored. We used bibliometric analysis to identify the research paths taken over time, including main path analysis, direct citation analysis and co-word analysis. The results show that the amyloid cascade hypothesis has played an important role in the evolution of research paths in the drug development process of AD. The preclinical domain and, to a lesser extent, the clinical domain were found to be increasingly involved over time in the study of interventions modulating amyloid-beta related neurotoxicity, in line with the fundamental domain predominantly focussing on amyloid-beta as the primary cause of AD. The results open up a discussion about lock-in, i.e. that decreasing options in the fundamental domain result in less room for manoeuvre in the preclinical and clinical domains.


INTRODUCTION
Since the identification of Alzheimer's disease (AD) in 1906 by Alois Alzheimer, investments into research to unravel the mechanisms of action and development of drugs have not yet led to a treatment to cure or halt the progression of the disease. Four drugs are approved by the European Medicines Agency and one more by the United States Food and Drug Administration, [1][2][3] all of which only temporarily improve the symptoms. With no new drug approvals since 2003, the drug development process of AD is characterized by a failure rate that is among the highest for any therapeutic area over the past decades. [4,5] Failures ascribed to drug inefficacy, significant side effects or difficulties in the conduct of trials are argued to be the result of deficiencies in the characterization of the disease, choice of therapeutic targets and design of (pre)clinical studies. [5][6][7][8][9][10] Potential reasons for the high failure rate in drug development for AD are twofold. First, the dominance of the amyloid cascade hypothesis, stating that the accumulation of the protein amyloid-beta in the brain forms the initiating step in the development of AD, may have set back progress in drug development by encouraging the disregard of other hypotheses and their associated drug targets. [6,7,11-15] While interventions targeting amyloid-beta succeed in preventing its accumulation, improvements in cognition or brain shrinkage in humans seem minimal. [6,16] Second, the way AD-specific animal models have been used to test the clinical efficacy of novel drugs may have contributed to high failure rates in clinical trials by providing data that translate poorly to the clinic. [8,14,17] The predictive value is low because animal studies determining the efficacy of interventions are poorly designed. This includes animal models being chosen without considering which aspects of the disease they recapitulate, while most models only allow for the evaluation of a single hypothesis for AD.
In addition, the cognitive outcome measures used in animal studies have an unclear relation with measures used in human clinical trials. [14,17] Without the availability of a cure, the burden of AD on public health, social care and economics is expected to grow rapidly with the ageing of the population worldwide. [1] This study aims to provide insights into the current lack of an effective therapy against AD by exploring the evolution of research paths in the scientific domains corresponding to fundamental, preclinical and clinical research from the identification of the disease in 1906 up to 2016. [18] More specifically, the influence of the amyloid cascade hypothesis and the use of animal models on the research paths taken over time were explored. The added value of the study lies in the ambition to create an encompassing overview of the AD field, covering all hypotheses proposed. Moreover, the overview of the AD field is longitudinal, covering developments throughout the years, and multi-domain, covering fundamental, preclinical and clinical areas. Such an overview provides a systematic and complete guide to the choices that have been made in the field of AD.
Journal of Scientometric Research, Vol 9, Issue 3, Sep-Dec 2020

METHODS
The methodology used to study the evolution of the scientific domains was bibliometric analysis. The evolution of research paths in scientific domains results from the accumulation of knowledge over time, driven by historical findings, guided search and progress in problem perception. [19,20] Publications are able to reveal the research paths or directions taken in drug development for AD by reflecting the codification of research activities. MEDLINE/PubMed of the U.S. National Library of Medicine was used as the primary source for the retrieval of publication data. MEDLINE/PubMed is considered the most exhaustive database in the biomedical field and allows for an accurate search using Medical Subject Headings (MeSH) and article types to allocate publications in the field of AD into separate domains. [21,22] A search query for each of the scientific domains was constructed using MeSH vocabulary and free-text terms to identify both indexed and non-indexed publications issued between 1906 and 2016 in MEDLINE/PubMed. The study only included research articles and proceedings papers, as these were considered to form the core set of publications that constitute the field. Comments, editorials, guidelines, letters, news articles, reviews and meta-analyses were excluded. The main components of the search queries are provided in Table 1. A full description of the search queries is included in Supplementary Material 1.1.
Search results in MEDLINE/PubMed were saved as text files based on the MEDLINE format, providing data for each publication identified (e.g. PMID, title, abstract, authors, journal etc.). In RStudio, the PubMed Unique Identifier (PMID), Digital Object Identifier (DOI) (if available) and Title (TI) of each publication were extracted from the text files and formatted into a .csv file. These fields were used to search for the same publications in Web of Science from Thomson Scientific in order to acquire additional data on cited references. Search results in Web of Science were saved as a text file with the content 'Full record and cited references'. In RStudio, the fields with PMID, DOI, TI, Document Type (DT) and Unique Tag (UT) were extracted from the text files and formatted into a .csv file. The scripts used in RStudio are provided in Supplementary Material 1.2. To check for the availability of fields for each publication, the text file from Web of Science was directly imported into The Science of Science (Sci2) tool (http://sci2.cns.iu.edu/user/index.php) and formatted into a .csv file. In Excel, conditional formatting was applied to PMID, DOI, title (lowercased) and UT to identify and remove duplicate publications. DT was used to identify publications classified as comments, editorials, guidelines, letters, news articles, reviews and meta-analyses. These were manually checked and removed from the dataset. The titles of the publications were used to evaluate their applicability to the fundamental, preclinical or clinical domain as part of the drug development process of AD. Inclusion and exclusion criteria for each domain are reflected in the search query. In general, publications related to the fundamental domain are those that unravel the underlying cause of the disease and contribute to the identification of potential drug targets.
Publications related to the preclinical domain are those that validate drug targets by testing the safety and efficacy of interventions in a laboratory vessel or other controlled experimental environment (in vitro) and in living (non-human) organisms (in vivo). Publications related to the clinical domain are those that involve clinical trials for the assessment of the safety and efficacy of the intervention in humans. [23][24][25] By screening the titles of the indexed publications and extensively reading the titles of the non-indexed publications, publications were identified that did not correspond to the definition of the scientific domain in which they were classified. These publications were either moved to one of the other domains when matching its definition or excluded from the study. The remaining publications within each domain were searched for in Web of Science using the UT field and saved as a text file with the content 'Full record and cited references'. In OpenRefine (http://openrefine.org), variations in the fields of journal names, author names and cited references were corrected, while maintaining the original file format. This involved the transformation of all fields to uppercase letters. Variations in journal names and the last names of authors were identified and corrected using the clustering algorithms 'key collision: fingerprint' and 'key collision: ngram-fingerprint' built into OpenRefine. Additional variations in journal names were identified and corrected using the text facet, which provides a list of all unique values. Author names with the same last name and first initial were manually compared to their full names and affiliations. Author names referring to different persons were differentiated by the addition of numbers and author names referring to the same person were merged. This procedure was performed with lower accuracy for last names that are very common in Asia (i.e.
Kim, Lee, Li, Lin, Liu, Lu, Luo, Park, Sun, Wang, Wu, Zhang, Zhao, Zhou), as these were often linked to a wide variety of first names, making it difficult to tell whether they referred to different persons or to the same person. Variations in cited references were corrected to minimize the number of mismatches between cited publications and citing publications. The 'Cite Me As' column in the .csv file acquired using the Sci2 Tool specifies how each publication should be referred to in order to be recognized as a citation link. The cited references of all publications were matched to the 'Cite Me As' value by repeatedly leaving out one field of the cited references (e.g. author name, journal name etc.). In this way, cited references referring to the same publication were standardized by applying the corresponding 'Cite Me As' value.
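The leave-one-field-out matching of cited references can be illustrated with a minimal Python sketch. It assumes that cited references and 'Cite Me As' values are comma-separated strings with the same field order (for example author, year, journal, volume, page). The function names and sample strings are hypothetical, and the sketch does not reproduce the interactive OpenRefine workflow used in the study.

```python
def leave_one_out_keys(reference):
    """Return one key per field position, each with that field left out."""
    fields = [part.strip().upper() for part in reference.split(",")]
    return {(i, tuple(fields[:i] + fields[i + 1:])) for i in range(len(fields))}


def standardize(cited_references, cite_me_as_values):
    """Replace each cited reference by the 'Cite Me As' value it matches.

    Two strings match when they agree on all fields except at most one.
    References with no match, or with several candidate matches, are kept as-is.
    """
    # Index every canonical value under each of its leave-one-out keys.
    index = {}
    for canonical in cite_me_as_values:
        for key in leave_one_out_keys(canonical):
            index.setdefault(key, set()).add(canonical)

    result = []
    for reference in cited_references:
        candidates = set()
        for key in leave_one_out_keys(reference):
            candidates |= index.get(key, set())
        # Only an unambiguous match is applied (assumption of this sketch).
        result.append(candidates.pop() if len(candidates) == 1 else reference)
    return result


# Hypothetical example data.
canonical_values = [
    "SMITH J, 1990, J EXAMPLE, V1, P10",
    "DOE A, 1995, EXAMPLE LETT, V7, P200",
]
cited = [
    "SMITH JA, 1990, J EXAMPLE, V1, P10",   # author variant
    "Doe A, 1995, Example Lett, V7, P200",  # case variant
    "BROWN K, 2001, OTHER J, V3, P5",       # not in the dataset
]
standardized = standardize(cited, canonical_values)
```

Keeping ambiguous references unchanged is a conservative choice: it avoids merging distinct publications that happen to differ in a single field, at the cost of leaving some variants uncorrected.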
Multiple bibliometric analyses were performed to gain in-depth insight into how the domains of fundamental, preclinical and clinical research as part of the drug development process of AD have evolved. Main path analysis (MPA) was performed to trace dominant research pathways in the drug development process of AD. Direct citation analysis and co-word analysis were performed for different time intervals to reveal the evolution of research areas in the domains of fundamental, preclinical and clinical research and to determine how the domains have developed in relation to each other over time.
An overview of the analyses performed is visualized in Figure 1.
Main path analysis (MPA) was performed to identify the dominant paths or directions taken over the whole evolution of the drug development process of AD from 1906 to 2016. For this purpose, direct citation networks were constructed in CitNetExplorer (http://citnetexplorer.nl) using the corrected Web of Science text files that correspond either to all publications retrieved or to publications in one of the three domains. Direct citation networks are able to reveal the evolution of scientific domains by means of citation links, with knowledge flowing from the cited publication to the citing publication. By default, citation links in the networks constructed using CitNetExplorer are directed from the citing publication to the cited publication. Since knowledge flows in the opposite direction, the networks were transposed using Pajek (http://mrvar.fdv.uni-lj.si/pajek/). MPA reduces the direct citation networks to the dominant paths based on the identification of those publications that are most frequently crossed considering all possible paths between the oldest (source) and most recent (sink) publications in the network. This involves two steps: (1) calculating the weight of each link in the citation network and (2) searching for the main paths connecting links with the highest weights. In the current study, weights were calculated based on the search path count (SPC) method. [26,27] Paths were constructed from various points of view, as described by Liu and Lu, [28] including local forward MPA, global MPA, local backward MPA, multiple local forward MPA, multiple local key-routes MPA and multiple global key-routes MPA. Main paths were constructed and visualized using Pajek, in which the algorithms are implemented. The multiple local forward MPA was set to include all links with a weight falling within 15% of the largest for the total dataset and fundamental domain and within 25% for the preclinical and clinical domain.
These are arbitrary values and were chosen based on the preferred level of detail in the visualization of the paths. Decreasing the value shifted the network toward a single path, while increasing the value greatly expanded the network. The key-routes MPA was set to include the top 20 key-routes. MPA was also used to identify time intervals based on which the data could be divided to perform subsequent bibliometric analyses to study the evolution of the drug development process of AD over time in more detail.
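As an illustration of the two MPA steps, the following minimal Python sketch computes SPC weights on a small acyclic citation network and then extracts a global main path, i.e. the source-to-sink path with the largest sum of SPC weights. The toy network and function names are hypothetical; the study itself relied on the implementations in Pajek, which this sketch does not reproduce. Links are assumed to run from the cited to the citing publication.

```python
from collections import defaultdict


def _graph(edges):
    """Build sorted node list plus successor and predecessor maps."""
    succ, pred = defaultdict(list), defaultdict(list)
    for cited, citing in edges:
        succ[cited].append(citing)
        pred[citing].append(cited)
    nodes = sorted({n for edge in edges for n in edge})
    return nodes, succ, pred


def _topological_order(nodes, succ, pred):
    """Kahn's algorithm; assumes the citation network is acyclic."""
    indegree = {n: len(pred[n]) for n in nodes}
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in succ[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return order


def spc_weights(edges):
    """SPC of a link = number of source-to-sink paths that traverse it."""
    nodes, succ, pred = _graph(edges)
    order = _topological_order(nodes, succ, pred)
    from_sources = {}  # number of paths from any source to the node
    for u in order:
        from_sources[u] = sum(from_sources[p] for p in pred[u]) if pred[u] else 1
    to_sinks = {}  # number of paths from the node to any sink
    for u in reversed(order):
        to_sinks[u] = sum(to_sinks[s] for s in succ[u]) if succ[u] else 1
    return {(u, v): from_sources[u] * to_sinks[v] for u, v in edges}


def global_main_path(edges):
    """Return (total SPC, path) for the path with the largest summed SPC."""
    weights = spc_weights(edges)
    nodes, succ, pred = _graph(edges)
    best = {n: (0, [n]) for n in nodes}
    for u in _topological_order(nodes, succ, pred):
        total, path = best[u]
        for v in succ[u]:
            candidate = total + weights[(u, v)]
            if candidate > best[v][0]:
                best[v] = (candidate, path + [v])
    sinks = [n for n in nodes if not succ[n]]
    return max((best[s] for s in sinks), key=lambda item: item[0])


# Hypothetical toy network: A is the oldest publication, E the most recent.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
             ("C", "D"), ("C", "E"), ("D", "E")]
weights = spc_weights(toy_edges)
total, path = global_main_path(toy_edges)
```

A purely local forward search would instead follow, at each publication, the outgoing link(s) with the highest SPC; ties between links are common, which is one reason why multiple-path variants with a tolerance (such as the 15% and 25% settings above) yield branching networks rather than a single chain.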
Direct citation networks were constructed for each time interval based on all publications retrieved. In this way, the evolution of research areas in the drug development process of AD and the use of animals in these areas could be identified, as well as the interactions between the fundamental, preclinical and clinical domains. CitNetExplorer was used to split the transposed direct citation network into multiple networks corresponding to each of the time intervals identified. The clustering algorithm Smart Local Moving (SLM), [29] implemented in the Modularity Optimizer tool (http://www.ludowaltman.nl/slm/), was used to divide a citation network into communities, or modules of papers, whereby the citation links are dense within communities and sparse between communities. [30,31] In this way, research areas were identified based on the principle that publications mostly refer to topic-related publications or publications from the same research areas to support or place their study in the field. The direct citation networks depicting the research areas, scientific domains and usage of animals were visualized with Gephi. [4] The topic of each research area was identified by extracting the titles of all publications and selecting the three title words with the highest term frequency-inverse document frequency (TF-IDF) [32,33] and frequent and predictive words measure. [34] The interaction of scientific domains was determined based on the distribution of publications from the fundamental, preclinical and clinical domains in each research area. In addition, the relative openness measure was used to quantify the interactions based on the citation relations between publications. [35,36] The involvement of animal models was determined based on the distribution of animal studies in the fundamental and preclinical domain.
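The labelling of research areas by TF-IDF can be sketched as follows. This is a minimal illustration under stated assumptions: each research area is treated as one document made up of all its titles, term frequency is the raw count and inverse document frequency is log(N / df). The paper does not specify the exact weighting variant, and the cluster names and titles below are hypothetical.

```python
import math
from collections import Counter


def top_title_words(cluster_titles, k=3):
    """Rank title words per cluster by TF-IDF and keep the k highest."""
    # Term frequencies: word counts over all titles in a cluster.
    counts = {
        cluster: Counter(word for title in titles for word in title.lower().split())
        for cluster, titles in cluster_titles.items()
    }
    n_clusters = len(counts)
    # Document frequency: number of clusters in which a word occurs.
    doc_freq = Counter(word for c in counts.values() for word in c)
    top = {}
    for cluster, c in counts.items():
        scores = {w: tf * math.log(n_clusters / doc_freq[w]) for w, tf in c.items()}
        # Ties are broken alphabetically (assumption of this sketch).
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        top[cluster] = ranked[:k]
    return top


# Hypothetical titles for two research areas.
example = {
    "area_1": ["amyloid beta aggregation in plaques", "amyloid beta toxicity"],
    "area_2": ["cholinergic neurons in dementia",
               "acetylcholine and cholinergic deficits in dementia"],
}
labels = top_title_words(example)
```

Words that occur in every research area (here "in") receive a score of zero, which is why TF-IDF favours words that distinguish one area from the others.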
As a complement to direct citation analysis, co-word analysis was performed for each scientific domain separately and for each time interval to provide a more immediate and detailed picture of the research areas that have emerged over time and the animal models used. This type of analysis is based on the nature of words, which are the smallest subunit of a knowledge domain and are used to shed light on the cognitive structure of a field. In this study, the words from the titles of the publications were used for co-word analysis. The title of each publication was extracted from the Web of Science text file using OpenRefine. Single and multi-word concepts were extracted from the titles using MetaMap (http://metamap.nlm.nih.gov), a tool that matches title words to concepts included in the Unified Medical Language System (UMLS) Metathesaurus. Abbreviations and acronyms are not included in the UMLS and can therefore not be identified by MetaMap. As a workaround, words in the titles consisting of capital letters (with or without the addition of numbers) and a length between 2 and 6 characters were identified in OpenRefine to extract the most common abbreviations and acronyms (frequency > 10). Words reflecting abbreviations and acronyms were added to a plain text file with their expansions, referred to as the User Defined Acronyms (UDA) file, allowing MetaMap to match the expansions to UMLS concepts when encountering the abbreviation or acronym in the titles. Other words identified mostly reflected names of genes, animal models and cell lines, which were also found not to be included in the UMLS Metathesaurus. These words were manually extracted from the titles in OpenRefine.
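The identification of candidate abbreviations and acronyms can be approximated with a regular expression. The sketch below is an assumption about how such a filter might look in Python (the study performed this step in OpenRefine): it matches tokens of 2 to 6 characters that start with a capital letter and otherwise contain only capitals or digits, and keeps those occurring more than 10 times. The pattern, function name and sample titles are hypothetical.

```python
import re
from collections import Counter

# 2-6 characters, capitals and digits only, starting with a capital (assumption).
ABBREVIATION = re.compile(r"\b[A-Z][A-Z0-9]{1,5}\b")


def candidate_abbreviations(titles, min_frequency=10):
    """Count capitalised tokens in titles and keep those with frequency > min_frequency."""
    counts = Counter(token for title in titles for token in ABBREVIATION.findall(title))
    return {token: n for token, n in counts.items() if n > min_frequency}


# Hypothetical titles, repeated to exceed the frequency threshold.
sample_titles = ["APP processing in AD", "Role of APP and PS1 in AD"] * 6
candidates = candidate_abbreviations(sample_titles)
```

The filter has to run on the titles in their original case, before the lowercasing applied for MetaMap, and its output still needs manual review to separate true abbreviations from gene, model and cell line names.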
The titles submitted to MetaMap were lowercased and special characters removed for better handling of the text. The content of the accompanying UDA file, defining the abbreviations and acronyms, was lowercased as well and is provided in Supplementary Material 1.3. The output from MetaMap was combined with the manually extracted words and processed in OpenRefine. This included the transformation of words to lowercase, correcting variations in words using clustering algorithms, merging words with a similar meaning using the concept identifier (e.g. acetaminophen and paracetamol) and the removal of stop words, generic concepts and duplicate words in one title. The words were exported to a .csv file and their frequency was determined. There is no standard cutoff value available to distinguish between high frequency words considered important that should be included in the analysis and low frequency words regarded as 'noise' that should be excluded. [37] The minimum occurrence threshold was set differently for each domain and time interval to account for variations in the number of publications and words and in the frequencies of words. The minimum occurrence threshold had to fulfil three criteria: the high frequency words made up at least 60% of the cumulative percentage of occurrences (not considering words occurring once), captured at least 4% of all the words in the titles of the publication set and covered at least 70% of the publication set. These criteria were chosen to make sure the majority of cognitive content reflected in the titles is represented in the co-word analysis. Based on the remaining words, a document-term matrix was constructed in Excel that was subsequently transformed into a co-occurrence matrix normalized using Salton's cosine measure in SPSS. [38][39][40] Based on the matrix, a co-word network was created in Pajek and visualized in Gephi.
For better visualization, edges with a weight below average were removed in large networks (nodes>500) and with a weight below 0.1 in smaller networks.
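The normalization and edge filtering can be sketched in a few lines of Python. The sketch assumes binary word occurrence per title (duplicate words within a title were removed), in which case Salton's cosine for two words equals their co-occurrence count divided by the square root of the product of their occurrence counts. The function names and sample data are hypothetical, and the study used Excel, SPSS and Pajek rather than this code.

```python
import math
from collections import Counter
from itertools import combinations


def salton_cosine(title_word_sets):
    """Return {(word_a, word_b): cosine} for all word pairs that co-occur."""
    occurrences = Counter()
    cooccurrences = Counter()
    for words in title_word_sets:
        unique = sorted(set(words))  # binary occurrence per title
        occurrences.update(unique)
        cooccurrences.update(combinations(unique, 2))
    return {
        pair: count / math.sqrt(occurrences[pair[0]] * occurrences[pair[1]])
        for pair, count in cooccurrences.items()
    }


def prune(edges, threshold=0.1):
    """Drop edges whose weight is below the threshold."""
    return {pair: weight for pair, weight in edges.items() if weight >= threshold}


# Hypothetical title words after preprocessing.
titles = [{"amyloid", "plaque"}, {"amyloid", "tau"}, {"amyloid", "plaque"}, {"tau"}]
edges = salton_cosine(titles)
strong_edges = prune(edges, threshold=0.5)
```

For large networks, the same `prune` function could be called with the mean edge weight as threshold, which corresponds to the "below average" rule described above.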
Research areas were identified using the same clustering algorithm as for the citation networks. In addition, a strategic diagram was constructed in Excel for each co-word network. In this diagram, all research areas identified are plotted based on the centrality measure, reflecting the significance of a research area to the development of an entire domain, and the density measure, reflecting the degree of maturity of a research area (see formulas 1 and 2). [41]
$$\text{Centrality} = 10 \times \sum \text{weight of external links} \tag{1}$$
$$\text{Density} = 100 \times \frac{\sum \text{weight of internal links}}{\text{nodes in research theme}} \tag{2}$$
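A minimal Python sketch of the two strategic diagram coordinates is given below. It reads formula 1 as ten times the summed weight of links leaving a research area and formula 2 as one hundred times the summed weight of links inside a research area divided by its number of nodes. The function name, the edge list layout and the toy network are hypothetical.

```python
from collections import defaultdict


def strategic_coordinates(edges, cluster_of):
    """Return {cluster: (centrality, density)} for a weighted co-word network.

    edges: iterable of (word_a, word_b, weight)
    cluster_of: mapping from word to research area label
    """
    internal = defaultdict(float)
    external = defaultdict(float)
    sizes = defaultdict(int)
    for cluster in cluster_of.values():
        sizes[cluster] += 1
    for a, b, weight in edges:
        ca, cb = cluster_of[a], cluster_of[b]
        if ca == cb:
            internal[ca] += weight
        else:
            # A link between two areas counts as external for both of them.
            external[ca] += weight
            external[cb] += weight
    return {c: (10 * external[c], 100 * internal[c] / sizes[c]) for c in sizes}


# Hypothetical toy network with two research areas.
toy_edges = [("a", "b", 0.5), ("b", "c", 0.3), ("c", "d", 0.2)]
toy_clusters = {"a": 1, "b": 1, "c": 2, "d": 2}
coordinates = strategic_coordinates(toy_edges, toy_clusters)
```

In the resulting diagram, areas with high centrality and high density are central and developed, while areas with low values on both axes are peripheral and undeveloped.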

RESULTS
Research articles and conference proceedings papers related to the scientific domains (fundamental, preclinical, clinical), published between 1906 and 2016, were searched in PubMed on April 9, 2017 and subsequently extracted from Web of Science. The resulting dataset included 43,637 publications. An overview of the number of publications retrieved and included is provided in Table 2. The dataset of included papers is provided as Supplementary Material.
The paths show a high degree of similarity in the scientific contributions included from 1963 until 2000, indicating that drug development for AD was a concentrated field with the convergence of research directions over time. All paths start with studies from the 1960s into neurofilaments and plaques, which constitute two hallmarks of AD. Around 1975, attention shifted toward research on neuronal alterations or changes in the cholinergic system of the brain in elderly demented people. The nodes with a red label constitute publications that formed the foundation of the cholinergic hypothesis, being cited in the seminal review by Bartus et al. [42] proposing this hypothesis. Studies on neuronal alterations are included in the paths until around 1985, extending to 1987 in the path obtained with local backward MPA (Supplementary Figure 1C). Around the same time, the paths shift toward studies on the protein amyloid-beta. The nodes with an orange label constitute publications that formed the foundation of the amyloid cascade hypothesis, being cited in the review by Hardy and Higgins [43] proposing this hypothesis. Until 2000, all paths predominantly include studies on the role of the protein presenilin in the formation of amyloid-beta plaques, in which animals are frequently used.
After 2000, a more divergent and scattered view of research directions was found. This is indicated by a lower degree of similarity in the scientific contributions included in the paths and by the divergence of the path constructed with multiple local forward MPA, which considers not only the protein amyloid-beta as the initiating step of AD development but also the role of metabolic changes and metals. Around 2011, drug development converged toward research related to the structure and folding of amyloid-beta and the effect of amyloid-beta inhibitors. This is shown by the convergence of the multiple local forward MPA and the increased degree of similarity of publications included in the paths.
al. (2003), indicated with the red label in Figure 4, stating that known acetyl cholinesterase inhibitors are able to inhibit acetyl cholinesterase-induced amyloid-beta aggregation. All subsequent publications included in the paths concern multi-target acetyl cholinesterase inhibitors that are able to inhibit amyloid-beta aggregation and/or reduce oxidative stress (e.g. tacrine hybrids).

Dominant pathways in the clinical domain of AD
The multiple local forward MPA based on publications from the clinical domain is shown in Figure 5
publications for each domain over the time intervals is shown in Table 3, revealing that the number of publications has increased over time.
The results acquired with direct citation analysis, shown in Figure 6, are networks for each time interval wherein nodes correspond to publications belonging to the scientific domains and edges are citation links directed from the cited publication to the citing publication. The networks show the distribution of the scientific domains, the usage of animals and the communities identified (i.e. groups of nodes with many internal and few external citation links), revealing research areas based on the principle that publications mostly refer to related publications to support or place their study in the field. Details on the identified research areas are provided in Supplementary
The results acquired with co-word analysis are networks for each time interval, wherein nodes correspond to high-frequency title words in the publications from one scientific domain and edges reflect the co-occurrence of title words. The networks show the communities identified, revealing research areas based on the principle that words often occurring in one title are related. The research areas were plotted in a strategic diagram (Figures 7-9) to provide insights into their importance over time. In the following sections, the results of direct citation analysis and co-word analysis are described for each scientific domain.

Evolution of the fundamental domain in the drug development for AD
The results of the co-word analysis based on the fundamental domain for the period 1982-1991 and 2011-2016 are shown in Figure 7. The results for the other periods are provided in Supplementary Figure 5. Combined with the results from the direct citation analysis, the results reveal that the fundamental domain predominantly focussed on two major research areas: (1) studies on the level of the human brain for diagnostic purposes and (2) molecular pathology studies into plaques/ amyloid-beta and neurofibrillary tangles/tau. This is indicated by the large proportion of publications in the areas identified with citation analysis and changes in the importance of corresponding research areas based on the strategic diagrams. Moreover, the position of the areas are relatively distant in the citation networks, indicating that they largely evolved independently from each other.     (purple) and research into plaques (yellow) (Supplementary Table 1). Similar research areas were identified in the co-word network (Supplementary Figure 5A). The corresponding strategic diagram shows that studies on the brain of patients is a developed and very central research area (green), research into neurofibrillary tangles is a highly developed but more peripheral research area (orange) and research into plaques is a central and developed research area (red). For the period 1982-1991, the direct citation network ( Figure 6B) shows that the fundamental domain is mostly involved in the research area related to the cholinergic system of the brain (lilac), followed by studies on the brain of patients (light blue and orange), research into plaques, amyloid-beta and its precursor protein (pink) and research into neurofibrillary tangles and the tau protein (yellow) (Supplementary Table 1). These research areas are also shown in the co-word network ( Figure 7A). 
When comparing the corresponding strategic diagram to the period 1906-1981, research into the cholinergic system of the brain enters the domain and is a highly developed research area (green). The research area related to studies on the brain of patients is reduced in prominence (orange), research into plaques, amyloid-beta and its precursor protein is the most central research area (red) and research into neurofibrillary tangles and tau becomes slightly more central (blue). For the period 1992-2000, the direct citation network ( Figure  6C) shows that the fundamental domain is mostly involved in multiple research areas related to amyloid-beta and its precursor protein considering the role of fibrils (grey), For the period 1906-1981, the direct citation network ( Figure  6A) shows that the fundamental domain is mostly involved in multiple research areas related to the identification of diagnostic markers and progression of the disease (light blue, orange and green), followed by research into neurofibrillary tangles proportion of publications in the areas identified with citation analysis and the change in importance of corresponding research areas based on the strategic diagrams.
For the period 1982-1991, the direct citation network ( Figure  6B) shows that the preclinical domain is mostly involved in the research area related to the cholinergic system of the brain (lilac), followed by the research area into acetyl cholinesterase inhibitors (magenta) (Supplementary Table 1). The same research areas were identified in the co-word network ( Figure 8A). The corresponding strategic diagram shows that the research area into the cholinergic system is highly developed and central (pink), while the research areas related to the acetyl cholinesterase inhibitors tacrine and scopolamine/ physostigmine are peripheral (green and red respectively). For the period 1992-2000, the direct citation network ( Figure 6C) shows that the preclinical domain is mostly involved in the research area into acetyl cholinesterase inhibitors (blue) and develops in the research area of amyloid-beta related to oxidative stress (pink) and the role of the amyloid-beta precursor protein in the formation of plaques (orange) (Supplementary Table 1). This is confirmed by the research areas identified in the coword network (Supplementary Figure 6A). According to the strategic diagram, the research area most central and developed in the domain is the research area into acetyl cholinesterase inhibitors (pink), followed by the research area into amyloidbeta (red). The research area into the role of amyloid-beta precursor protein is peripheral in the domain (green). For the period 2001-2010, the direct citation network ( Figure 6D) shows that the preclinical domain is almost evenly distributed in the area of acetyl cholinesterase inhibitors (pink), amyloidbeta studies related to inflammation (orange) and oxidative stress (dark red) (Supplementary Table 1). It also shows that the research areas into acetyl cholinesterase inhibitors and amyloid-beta have moved more closely together. Similar research areas were identified in the co-word network (Supplementary Figure 6B). 
According to the strategic diagram, the modulation of amyloid-beta neurotoxicity is the most central and developed research area in the domain (red). The research areas into acetyl cholinesterase inhibitors become more peripheral (blue and green). For the period 2011-2016, the direct citation network ( Figure 6E) shows that the preclinical domain is mostly involved in the research area on amyloid-beta and oxidative stress (light orange), followed by the research area into acetyl cholinesterase inhibitors (coral pink) (Supplementary Table 1). Similar research areas were identified in the co-word network ( Figure 8B). According to the strategic diagram, the research area on amyloid-beta and oxidative stress (red) is highly developed and central in the domain, while research into acetyl cholinesterase inhibitors is a developed but peripheral in the domain.
presenilin (light blue), oxidative stress (pink) and inflammation (yellow) in the formation of plaques. The fundamental domain is also still involved in the research areas related to studies on the brain of patients (green) and research into tau (brown) (Supplementary Table 1). These research areas are roughly identifiable in the co-word network (Supplementary Figure  5B). When comparing the corresponding strategic diagram to the period 1982-1991, the research area on amyloid-beta and its precursor protein is divided into an area specifically on amyloid-beta (red) and its precursor protein (green). Both research areas are central and developed. Studies on the brain of patients (orange) and research into tau (blue) become more isolated research areas, while research into tau is still highly developed. For the period 2001-2010, the direct citation network ( Figure 6D) shows that the fundamental domain is mostly involved in studies on the brain of patients (brown) and research into tau and glycogen synthase kinase 3 (red). The fundamental domain is also still involved in multiple research areas on the aggregation of amyloid-beta (nude, dark purple and turquoise), now also considering the role of neprilysin or the insulin-degrading enzyme into the formation of plaques (black) (Supplementary Table 1). Similar research areas were identified in the co-word network (Supplementary Figure  5C). When comparing the corresponding strategic diagram to the period 1992-2000, both studies on the brain of patients (orange) and on tau (brown) have become undeveloped and isolated research areas. Research into amyloid-beta (red) becomes the most central and developed research area. For the period 2011-2016, the direct citation network ( Figure 6E) shows that the fundamental domain is mostly involved in the research area related to studies on the brain of patients (black). 
The domain is also still involved in research areas related to amyloid-beta (turquoise, nude, light brown and pink) and research on tau (pastel green) (Supplementary Table 1). These research areas are also shown in the co-word network (Figure 7B). When comparing the corresponding strategic diagram to that of the period 2001-2010, the research area related to amyloid-beta (pink) loses its centrality while remaining highly developed. The research area related to studies on the brain of patients (orange) becomes highly developed, while research into tau (blue) becomes more central.
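The quadrant labels used throughout this section (central or peripheral, developed or undeveloped) follow from placing each research area in a strategic diagram by its centrality and density. As a minimal sketch, assuming the diagram is split at the median of each axis (the text here does not state the cut-off used), the labelling could be expressed as follows. The function name `classify_themes` and all numeric values are hypothetical and serve only as illustration.

```python
from statistics import median


def classify_themes(themes):
    """Label each theme by its quadrant in a strategic diagram.

    themes: dict mapping theme name -> (centrality, density).
    Assumption: quadrants are split at the median of each axis.
    """
    c_med = median(c for c, _ in themes.values())
    d_med = median(d for _, d in themes.values())
    labels = {}
    for name, (centrality, density) in themes.items():
        position = "central" if centrality >= c_med else "peripheral"
        maturity = "developed" if density >= d_med else "undeveloped"
        labels[name] = f"{position} and {maturity}"
    return labels


# Hypothetical values for illustration only; not taken from the study.
example = {
    "amyloid-beta": (0.9, 0.8),
    "tau": (0.2, 0.7),
    "cholinergic system": (0.6, 0.1),
    "statins": (0.1, 0.2),
}
labels = classify_themes(example)
for theme, label in labels.items():
    print(f"{theme}: {label}")
```

Splitting at the median guarantees that themes are spread over both halves of each axis; a mean-based split is an equally common choice and would only change the two threshold lines.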

Evolution of the preclinical domain in the drug development for AD
The results of the co-word analysis of the preclinical domain for the periods 1982-1991 and 2011-2016 are shown in Figure 8, and those for the other periods in Supplementary Figure 6. Combined with the results from the direct citation analysis, the results reveal that the preclinical domain was predominantly involved in studies modulating the cholinergic system of the brain, while shifting increasingly toward studies into amyloid-beta, oxidative stress and inflammation. This is indicated by the large
hormone replacement therapy (yellow) are recognized as the most developed and central areas in the domain. The research area into amyloid-beta (red) is recognized as being peripheral and highly developed in the domain. The research areas related to acetyl cholinesterase inhibitors (pink, blue and green) have become less prominent, being peripheral and undeveloped in the domain. For the period 2011-2016, the direct citation network (Figure 6E) shows that the clinical domain is to a lesser extent involved in the research area into acetyl cholinesterase inhibitors (dark blue and magenta) and is increasingly involved in the research area into amyloid-beta (apple green, yellow and purple). The clinical domain also develops into the research area of statins (red) (Supplementary Table 1). The co-word network and corresponding strategic diagram (Figure 9B) reveal that research areas are either peripheral and developed, including research into amyloid-beta, hormone replacement therapy and mild cognitive impairment, or central and undeveloped, including research into medication use in the elderly, antipsychotics and acetyl cholinesterase inhibitors.

Interaction of the scientific domains in the drug development for AD
The relative openness measure was used to analyse the interactions of the scientific domains over time. This measure quantifies to what extent a domain builds on knowledge inside its own or another domain based on the citation links between publications corrected for domain size. Results of the relative openness measure are shown in Table 4
Figure 9 and for the other periods in Supplementary Figure 7. Combined with the results from the direct citation analysis, the results reveal that the clinical domain was predominantly involved in studies into acetyl cholinesterase inhibitors over time.
For the period 1906-1981, the direct citation network (Figure 6A) shows that the clinical domain is involved in the research area related to studies on the brain of patients (light blue) and the research area into antipsychotics (red) (Supplementary Table 1). The co-word network (Supplementary Figure 7A) reveals research areas that are all related to studying the effect of different compounds on the brain of elderly patients. In particular, research into ergot alkaloids (green) and antipsychotics (i.e. thioridazine) (red) are highly developed and central research areas according to the strategic diagram. A peripheral theme is research into the compounds physostigmine and lecithin. For the period 1982-1991, the direct citation network (Figure 6B) shows that the clinical domain is mostly involved in research into acetyl cholinesterase inhibitors, including tacrine, physostigmine and lecithin (light red), followed by the research area into the cholinergic system (lilac) (Supplementary Table 1). Similar research areas were identified in the co-word network (Figure 9A). According to the strategic diagram, the research areas into the acetyl cholinesterase inhibitors tacrine and lecithin (red) and physostigmine (orange) are the most developed and central in the domain. Research into the cholinergic system (pink) is a less developed and peripheral area in the domain. For the period 1992-2000, the direct citation network (Figure 6C) shows that the clinical domain is largely involved in the research area into acetyl cholinesterase inhibitors, including tacrine and donepezil (dark blue) (Supplementary Table 1). In the co-word network (Supplementary Figure 7B) an additional search space is identified corresponding to hormone replacement therapy (coloured yellow).
When comparing the corresponding strategic diagram to that of the previous period, less effort has been directed toward research into tacrine (red), which becomes less developed although it remains central in the domain. The research area into physostigmine (orange) becomes more peripheral. The research area into hormone replacement is a peripheral and undeveloped area. For the period 2001-2010, the direct citation network (Figure 6D) shows that the clinical domain is mostly involved in the research area into acetyl cholinesterase inhibitors, including donepezil and galantamine (pink), followed by the research area into amyloid-beta (orange) (Supplementary Table 1). In the co-word network and corresponding strategic diagram (Supplementary Figure 7C) research into antipsychotics (e.g. risperidone) (purple and orange) and
Journal of Scientometric Research, Vol 9, Issue 3, Sep-Dec 2020
provided in Table 5. The proportion of publications using animals increased over time for the fundamental domain, while it decreased for the preclinical domain. The distribution of publications including animals over time in the citation networks is provided in Figure 6, which allows for the identification of animal usage in the different research areas. The co-word networks previously discussed include title words referring to animal models, as shown in Figure 10, providing additional insights into the usage of animals over time.

Influence of the amyloid cascade hypothesis and animal models on the evolution of drug development for AD
The results of the study show that the amyloid cascade hypothesis has played an important role in the evolution of research paths in the drug development process of AD. The fundamental domain was found to be mostly involved in research into the cholinergic system between 1982 and 1991. Attention shifted toward research into the role of amyloid-beta in AD after 1992 with multiple studies on the neurotoxicity of amyloid-beta emerging over time. Efforts into other disease-associated mechanisms, such as the role of tau, metabolic changes and the cardiovascular system, were found to be less central and less common. The preclinical domain devoted most efforts toward the modulation of the cholinergic system with acetyl cholinesterase inhibitors between 1982 and 1991. After 1992, the preclinical domain became increasingly involved in amyloid-beta research, including a shift toward multi-target acetyl cholinesterase inhibitors modulating the cholinergic system and amyloid-beta neurotoxicity. The results indicate that the preclinical domain evolved toward amyloid-beta in line with the fundamental domain that focused predominantly on amyloid-beta as the primary cause of AD. More specifically, the relative openness measure shows that the fundamental and preclinical domains increasingly build on each other over time. In addition, the direct citation networks show that the preclinical domain became increasingly involved in research areas on amyloid-beta together with the fundamental domain.
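As a rough illustration of what a size-corrected openness measure can look like, the sketch below divides the share of a domain's citations that go to a given domain by that domain's share of all publications. This normalisation is an assumption chosen for illustration and is not necessarily the formula used in the study; the function name `relative_openness` and all counts are hypothetical.

```python
def relative_openness(citations, sizes):
    """Size-corrected citation shares between domains (illustrative only).

    citations[i][j]: number of citations from domain i to domain j.
    sizes[j]: number of publications in domain j.
    Returns ratio[i][j] = observed citation share / expected share by size.
    Assumption: the expected share equals domain j's share of publications.
    """
    total_pubs = sum(sizes.values())
    ratios = {}
    for citing, row in citations.items():
        total_cites = sum(row.values())
        ratios[citing] = {}
        for cited, count in row.items():
            observed = count / total_cites if total_cites else 0.0
            expected = sizes[cited] / total_pubs
            ratios[citing][cited] = observed / expected
    return ratios


# Hypothetical counts; not taken from the study.
sizes = {"fundamental": 500, "preclinical": 300, "clinical": 200}
citations = {
    "preclinical": {"fundamental": 60, "preclinical": 30, "clinical": 10},
}
ratios = relative_openness(citations, sizes)
for cited, value in ratios["preclinical"].items():
    print(f"preclinical -> {cited}: {value:.2f}")
```

Under this normalisation a value above 1 means the citing domain draws on the cited domain more than its size alone would predict, and a value below 1 means less.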
Results on the usage of animals indicate that the focus on amyloid-beta in the fundamental and preclinical domains could have been reinforced by the increased use of transgenic mouse models of AD over time. The majority of these transgenic models have been developed based on aspects of the amyloid cascade hypothesis and the genetics of the familial, early-onset
drug development, while other mechanism-based approaches have been much less represented in the field. Comparable results were found by the bibliometric study of Serrano-Pozo et al. [50] For the well-informed, expert reader this finding may not come as a surprise, as the dominance of the amyloid cascade hypothesis has been extensively debated. However, the added value of our findings lies in the fact that they show the relative importance of the various hypotheses over time and over the different domains (i.e. fundamental, preclinical and clinical) in a systematic manner. As the number and complexity of studies in the field of AD are overwhelming, the approach used in this study provides an overview of developments in the field and their relations, which may otherwise go unrecognized and could be used as a resource to advance and guide drug development.
In the time-frame studied, drug development had largely been based on the amyloid cascade hypothesis. The overarching focus on amyloid-beta could be due to the path-dependent nature of the drug development process, whereby established knowledge bases lessen the deviation into other research directions. [19] The amyloid cascade hypothesis was formulated based on strong histopathological and genetic evidence, mainly the discovery of autosomal dominant mutations causing familial, early-onset AD linked to amyloid-beta depositions also found in sporadic, late-onset AD. [51,52] The initiating role of amyloid-beta in AD pathology was subsequently strengthened by additional evidence, including the identification of apolipoprotein E4 as a genetic risk factor for late-onset AD that interferes with amyloid-beta clearance, and the neurotoxicity of amyloid-beta oligomers. [53] The amyloid cascade hypothesis provided a coherent framework for understanding AD pathogenesis and offered defined drug targets. This favoured research into amyloid-beta and the development of anti-amyloid-beta therapies with the potential to alter the basic pathogenesis and prevent cell death [54] rather than merely improve neurotransmitter function based on the cholinergic hypothesis. [24,51,55,56] The strong knowledge base and proven merits of amyloid-beta as primary cause of AD attracted attention and funds, while alternative hypotheses attracted less attention. [55,[57][58][59] As previously discussed, transgenic animal models developed in view of the amyloid cascade hypothesis seem to have retained this tendency toward amyloid-beta research. These models have been commonly used to assess novel mechanisms or compounds for their potential to treat AD.
[14,59] In the years after the timespan covered by our analysis, the focus on amyloid-beta as primary cause of AD and potential drug target has remained relevant, with recent studies into amyloid-beta oligomers, [51,60,61] most disease-modifying therapies targeting amyloid-beta, [4,7,62] and recent research showing promising results for amyloid-beta immunotherapies. [51,63] However, failures of clinical trials involving amyloid-beta immunotherapies and BACE1
form of AD. In this way, the transgenic mice are a model of AD wherein amyloid-beta is considered the causal factor, [14] favouring attention toward research paths along the line of the amyloid cascade hypothesis. Most efforts in the clinical domain were found to be devoted to interventions with acetyl cholinesterase inhibitors to modulate the cholinergic system over time. However, from 2001 onwards, interventions for the modulation of amyloid-beta neurotoxicity also entered the clinical domain. The developments in the fundamental and preclinical domain were followed by a shift toward amyloid-beta research in the clinical domain, which could be explained by the conduct of clinical trials and regulatory guidelines designed toward interventions with acetyl cholinesterase inhibitors. [45]

Implications
In the field of AD research, some bibliometric studies have been conducted to identify global trends in AD research, measure research productivity and discover core biological entities, topics and drugs for AD treatment. [46][47][48][49][50] Our study provides in-depth insight into the evolution of the field by means of an integrated bibliometric approach, involving the visualization of networks over time. By distinguishing between the three scientific domains of fundamental, preclinical and clinical research, the study promotes the understanding of the knowledge-based dynamics within the scientific domains and their relationship to one another.
The results showed that amyloid-beta increasingly started to become the focus in AD
inhibitors, whereby no cognitive improvements were shown despite a reduction of amyloid-beta, [64][65][66] have broadened the attention to non-amyloid and holistic approaches. [4,52,62] Examples include the tau, [67,68] inflammatory, [69,70] vascular, [71] and antimicrobial (protection) hypotheses. [72,73] As such, the emergence and integration of new hypotheses have the potential to lead to innovative mechanisms of action able to affect the pathophysiology of AD.

Study limitations and future perspectives
The study has some limitations. It is based on the assumption that the evolution of the scientific domains of fundamental, preclinical and clinical research is reflected in the scientific publications obtained using the search queries and available in PubMed and Web of Science. In particular, the results may have been influenced by the low coverage of publications from non-English journals and of articles published a long time ago (on paper), and by the fact that research into failed drugs is generally not published. Future studies may consider including clinical trial or patent data in addition to scientific publications. Moreover, as for bibliometric studies in general, the question is whether the links established with citation analysis, or the research themes identified with co-word analysis, reflect the reality of the field. This was accounted for by extensively cleaning the retrieved publications to make sure that the input for the analysis was relatively accurate.
Additional avenues for future research include: (1) the application of the methodology used in this study to other disease categories, (2) performing alternative, more recent methods of MPA to retrieve more evolutionary trajectories, distinct paths or unexplored themes, including genetic persistence based main path as described by Martinelli and Nomaler [74] and use of flow vergence (FV) gradient as weight assignment method (instead of SPC) as described by Lathabai et al., [75] (3) the inclusion of qualitative research by conducting interviews with AD experts, (4) analysis of the relations and collaborations between researchers in the different scientific domains, (5) performing a systematic review of the regulatory documents issued by the FDA and EMA over time to analyse the role of regulations in a drug development process and (6) analysing the role of academia and pharmaceutical companies in the fundamental, preclinical and clinical domain.
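For readers unfamiliar with SPC (search path count), the weight assignment method mentioned under (2), the sketch below computes it on a small acyclic citation network: each link is weighted by the number of source-to-sink paths that pass through it. Node names and the example network are hypothetical, links point from the cited to the citing publication, and the code assumes the network contains no cycles or duplicate links. It illustrates the general technique rather than the implementation used in the study.

```python
from collections import defaultdict


def search_path_count(edges):
    """Return {(u, v): SPC weight} for a directed acyclic citation network.

    edges: list of (cited, citing) pairs, i.e. the direction of knowledge flow.
    SPC(u, v) = (#paths from any source to u) * (#paths from v to any sink).
    """
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    # Topological order via Kahn's algorithm (assumes no cycles).
    indegree = {n: len(pred[n]) for n in nodes}
    queue = [n for n in nodes if indegree[n] == 0]
    order = []
    while queue:
        n = queue.pop()
        order.append(n)
        for m in succ[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)

    # Number of paths reaching each node from any source.
    from_source = {}
    for n in order:
        from_source[n] = sum(from_source[p] for p in pred[n]) if pred[n] else 1

    # Number of paths leaving each node toward any sink.
    to_sink = {}
    for n in reversed(order):
        to_sink[n] = sum(to_sink[s] for s in succ[n]) if succ[n] else 1

    return {(u, v): from_source[u] * to_sink[v] for u, v in edges}


# Hypothetical five-paper network: A is cited by B and C, both cited by D, D by E.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
spc = search_path_count(edges)
for link, weight in sorted(spc.items()):
    print(link, weight)
```

In this toy network the link from D to E lies on both source-to-sink paths and therefore receives the highest weight, which is why a main path extracted from SPC weights would run through it.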

CONCLUSION
The evolution of research paths in drug development for AD has been largely influenced by the amyloid cascade hypothesis. Efforts to develop a disease-modifying therapy for AD were found to be primarily dedicated toward amyloid-beta as the primary cause of AD. The common use of transgenic animal models developed in line with the amyloid cascade hypothesis in the fundamental and preclinical domain has likely reinforced the research paths into amyloid-beta research. The underrepresentation of non-amyloid approaches to AD causality and treatment warrants attention, especially in light of the so far thwarted drug development programs. Actors in the drug development process should be open toward integrating different approaches and aim to avoid lock-in, i.e. that decreasing options in the fundamental domain result in less room for manoeuvre in the preclinical and clinical domains.

Acetylcholinesterase inhibitor
Compound that prevents the enzyme acetylcholinesterase from breaking down the neurotransmitter acetylcholine in the brain, increasing both the level and the duration of action of the neurotransmitter.

Amyloid-beta
Protein that constitutes the main component of plaques found in the brains of people with Alzheimer's Disease. Amyloid-beta is formed by the cleavage of the amyloid precursor protein (APP) by the enzymes gamma secretase and beta secretase.

Amyloid cascade hypothesis
Concept on the cause of Alzheimer's Disease stating that the accumulation of the protein amyloid-beta in the brain (due to an imbalance in its production and clearance) forms the initiating step in the development of Alzheimer's Disease. This is followed by the formation of neurofibrillary tangles and subsequent onset of neuronal dysfunction and loss.

Amyloid precursor protein (APP)
Precursor protein that generates amyloid-beta when cleaved by the enzymes gamma secretase and beta secretase.

Cholinergic hypothesis
Concept on the cause of Alzheimer's Disease stating that the dysfunction of neurons containing the neurotransmitter acetylcholine leads to cognitive decline.

Cholinergic system
System of the brain consisting of neurons that use the neurotransmitter acetylcholine to send their messages. The system is involved in the regulation of attention and higher-order cognitive processing. Alzheimer's Disease is associated with the dysfunction and loss of neurons that are part of the cholinergic system.

Drug development process
Long-term problem-solving process involving the interplay of the fundamental, preclinical and clinical research domains.

Clinical research
Research domain conducting clinical trials for the assessment of the safety and efficacy of a drug in humans.

Fundamental research
Research domain that aims to unravel the underlying cause of diseases, contributing to the identification of potential drug targets.

Neurofibrillary tangles
Aggregations of hyperphosphorylated tau protein found in the brains of people with Alzheimer's Disease.

Plaques
Extracellular deposits of the protein amyloid-beta found in the brains of people with Alzheimer's Disease.

Preclinical research
Research domain that validates drug targets by testing the safety and efficacy of a compound in a laboratory vessel or other controlled experimental environment (in vitro) and in living (non-human) organisms (in vivo).

Presenilin
Sub-component of the enzyme gamma secretase that is responsible for the cutting of the amyloid precursor protein (APP).

Tau
Protein that is the main component of neurofibrillary tangles found in the brains of people with Alzheimer's Disease.

Transgenic (mouse) model
Genetically modified animal model.