Evolutionary history and dynamics of dog rabies virus in western and central Africa

Correspondence: Hervé Bourhy, herve.bourhy@pasteur.fr
Institut Pasteur, UPRE Lyssavirus Dynamics and Host Adaptation, National Reference Centre for Rabies, WHO Collaborating Centre for Reference and Research on Rabies, 25 rue du Docteur Roux, 75724 Paris Cedex 15, France
Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, Mueller Laboratory, University Park, PA 16802, USA
Istituto Zooprofilattico Sperimentale delle Venezie, Research and Development Department, Rabies Unit, Viale dell’Università 10, Legnaro, Padova, Italy
Institut Pasteur de Dakar, Laboratoire Arbovirologie, 36 Avenue Pasteur, BP 220, Dakar, Senegal
Institut Pasteur de Bangui, Laboratoire des Arbovirus et Fièvres Hémorragiques Virales, BP 923, Bangui, Central African Republic
Direction des Laboratoires Vétérinaires, Niamey, Niger
Laboratoire Central Vétérinaire, km 8 route de Koulikoro, BP 2295, Bamako, Mali
Centre National d’Elevage et de Recherche Vétérinaires, Nouakchott, Mauritania
Virology Department, Laboratoire National d’Elevage, BP 7026, Ouagadougou 03, Burkina Faso
Institut Pasteur de Côte d’Ivoire, Unité des Virus du Système Nerveux, Département des Virus Epidémiques, 01 BP 490 Abidjan 01, Ivory Coast


INTRODUCTION
Rabies is a neglected enzootic disease that is globally widespread and represents a serious health problem in developing countries. The burden of rabies in Africa is the second highest globally, behind that in Asia, with around 24 000 human deaths estimated each year despite the availability of effective vaccines (Dodet et al., 2008; Gould et al., 1998; Knobel et al., 2005). Dogs (Canis familiaris L.) have always been the principal host species of rabies throughout this area (Nel & Rupprecht, 2007). In most of Africa, and specifically western and central African countries, notification of rabies disease is not mandatory, so epidemiological data are scarce (Dodet et al., 2008).
The aetiological agent, rabies virus (RABV; genus Lyssavirus, family Rhabdoviridae), is an RNA virus with a single-stranded genome of negative polarity (around 12 kb in length) that contains five genes. Phylogenetic analyses have determined the existence of seven genotypes of lyssavirus, although this number is likely to increase with more intensive sampling (Bourhy et al., 1993; Delmas et al., 2008; Gould et al., 1998; Kuzmin et al., 2005). Among this biodiversity, three genotypes (1, 2 and 3) circulate in Africa, with dog-adapted viruses belonging to genotype 1 (i.e. RABV). Previous molecular epidemiological studies of rabies in Africa (Bourhy et al., 2008; David et al., 2007; Kissi et al., 1995) have revealed that four clades of genotype 1 circulate in this continent. The Africa 1 clade, adapted to dogs, is the most similar to current Eurasian RABV lineages and was therefore grouped into a larger 'Cosmopolitan' clade (Kissi et al., 1995; Swanepoel et al., 1993), together with a novel clade (Africa 4) identified recently in Egypt (Bourhy et al., 2008; David et al., 2007). In contrast, the Africa 3 clade is restricted to the Republic of South Africa and is adapted to mongooses, so that it constitutes an epidemiological cycle distinct from that of dog RABV (Bourhy et al., 2008; Davis et al., 2007; Kissi et al., 1995; Nel & Rupprecht, 2007). Finally, the Africa 2 clade includes RABV strains that circulate in dogs in several central and western African countries.
Although rabies has been studied extensively in wildlife populations in Europe and North America, the dynamics of RABV in domestic dog populations have been largely ignored. In particular, data concerning the diversity, distribution and origin of dog RABV circulating in western and central Africa are scarce. Prior studies were either restricted geographically or limited to a very small region of the viral genome, and compromised by a small sample size and a lack of exact spatial co-ordinates (Durr et al., 2008; Kissi et al., 1995; Sacramento et al., 1992; Smith et al., 1993). A recent study based on high-resolution temporal and spatial data demonstrated that rabies epidemics in southern and eastern Africa cycle with a periodicity of 3-6 years and also show significant synchrony across this region (Hampson et al., 2007). This suggests that considerable dispersal occurs from endemic rabies foci in Africa, perhaps due to the presence of 'superspreader' dogs that transmit the disease over large distances (Hampson et al., 2007).
The aim of the present study was to use a combination of phylogenetic and coalescent approaches to infer key aspects of the phylogeography, especially spatio-temporal dynamics, of dog RABV over a large geographical area covering western and central Africa. Most importantly, we addressed the question of the timescale of RABV evolution and its pattern of migration to understand how RABV spread within this geographical region. We therefore analysed a total of 182 isolates sampled from 27 African countries over 29 years; 92 of these isolates were newly collected and sequenced in the present study. To enhance the power of our phylogenetic analysis, we investigated evolutionary patterns and dynamics by using sequences of the complete nucleoprotein (N) and glycoprotein (G) genes from some samples.

METHODS
Viruses and sequencing. To investigate the genetic diversity of RABV circulating in western and central Africa, we generated 92 N gene and 34 G gene sequences (1335 and 1572 nt in length, respectively), for which the time (year) of sampling was also available. A precise spatial co-ordinate was also available for the majority of these sequences, which covered 27 countries over 29 years. Relevant epidemiological information for all RABV isolates analysed in this study is presented in Supplementary Table S1 (available in JGV Online).
Evolutionary analysis. The gene sequences newly described here were combined with relevant sequences from GenBank, resulting in three datasets of (i) 97 N gene sequences from the Africa 1 and Africa 2 lineages combined, (ii) 134 N gene sequences from the Africa 2 lineage, and (iii) 34 G gene sequences from the Africa 2 lineage. All alignments are available from the authors on request. For each dataset, we inferred the maximum clade credibility (MCC) phylogenetic tree by using the Bayesian Markov chain Monte Carlo (MCMC) method available in the BEAST package (Drummond & Rambaut, 2007), thereby incorporating information on sampling time. Posterior probability values provide an assessment of the degree of support for each node on the tree. This analysis utilized a relaxed (uncorrelated log-normal) molecular clock and the HKY85+Γ4 model of nucleotide substitution, although highly similar results were obtained under more complex substitution models. As demographic history can be considered a nuisance parameter in our study, we utilized the Bayesian skyline model as a coalescent prior. All chains were run for a sufficient length to ensure convergence, with 10 % removed as burn-in. As well as generating the MCC tree, this analysis also allowed us to estimate both the rate of nucleotide substitution (substitutions per site year⁻¹) and the time to most recent common ancestor (TMRCA) in years. The degree of uncertainty in each parameter estimate is provided by 95 % highest posterior density (HPD) values.
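As an illustration of how a 95 % HPD interval can be read off a chain of posterior samples after burn-in removal, the following minimal Python sketch may be helpful. It is not the BEAST implementation: the function names (discard_burn_in, hpd_interval) and the toy chain are hypothetical, and the interval is taken simply as the narrowest window containing the requested fraction of the sorted samples.

```python
import math


def discard_burn_in(samples, burn_in_percent=10):
    """Drop the first burn_in_percent of the chain (10 % in the text)."""
    start = len(samples) * burn_in_percent // 100
    return samples[start:]


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval holding `mass` of the samples.

    Hypothetical helper: scans every window of the required size over
    the sorted samples and keeps the shortest one.
    """
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("no samples supplied")
    window = max(1, math.ceil(mass * n))  # samples inside the interval
    best_low, best_high = ordered[0], ordered[window - 1]
    for i in range(1, n - window + 1):
        low, high = ordered[i], ordered[i + window - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


# Toy chain (made-up values): 10 pre-convergence draws, then 90 draws.
toy_chain = [1000.0] * 10 + [float(x) for x in range(90)]
kept = discard_burn_in(toy_chain)
interval = hpd_interval(kept)
```

For a skewed posterior, such as a TMRCA, the narrowest window generally differs from the equal-tailed interval, which is why the two bounds need not be symmetric about the mean.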
We used a parsimony-based approach to determine the geographical structure of dog RABV of Africa 2, based on the MCC tree of the N gene and with the country in which each sequence was collected as the unit of analysis (although this broad-scale categorization clearly hides a great deal of geographical structure). Further, because the MCC tree is automatically rooted through the assumption of a (relaxed) molecular clock, we were also able to infer the direction of migration events. Overall, we collected sequences from at least 30 different cities, encompassing 14 countries, in western and central Africa. Each RABV sequence was first assigned a character state reflecting its country of origin. The number of unambiguous changes in character state observed among countries, as determined from the MCC tree, was then recorded and compared with the number expected under the null hypothesis of entirely random mixing by repeating this analysis on 1000 randomized trees. A matrix of the observed-minus-expected character-state changes was then constructed to determine the strength of migration (positive values) or population subdivision (negative values). All of these analyses were conducted by using the PAUP* package (Swofford, 2003).
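The logic of comparing observed with expected character-state changes can be sketched in a few lines of Python. This is a simplified stand-in and not the PAUP* procedure used here: it counts the total number of Fitch-parsimony changes on a rooted binary tree rather than building a directed country-by-country matrix, and it generates the null distribution by shuffling tip labels, which is an assumption about how the randomization might be done. The toy tree and all function names are hypothetical.

```python
import random


def fitch(tree):
    """Return (state_set, n_changes) for a nested-tuple binary tree.

    Tips are country names (strings); internal nodes are 2-tuples.
    """
    if isinstance(tree, str):
        return {tree}, 0
    left_states, left_changes = fitch(tree[0])
    right_states, right_changes = fitch(tree[1])
    shared = left_states & right_states
    if shared:  # children agree: no change needed at this node
        return shared, left_changes + right_changes
    # children disagree: one change is required on this node
    return left_states | right_states, left_changes + right_changes + 1


def tips(tree):
    """List tip labels from left to right."""
    if isinstance(tree, str):
        return [tree]
    return tips(tree[0]) + tips(tree[1])


def relabel(tree, labels):
    """Rebuild the tree, taking tip labels in order from an iterator."""
    if isinstance(tree, str):
        return next(labels)
    return (relabel(tree[0], labels), relabel(tree[1], labels))


def observed_minus_expected(tree, replicates=1000, seed=0):
    """Compare observed changes with the mean over label-shuffled trees."""
    rng = random.Random(seed)
    observed = fitch(tree)[1]
    labels = tips(tree)
    total = 0
    for _ in range(replicates):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        total += fitch(relabel(tree, iter(shuffled)))[1]
    expected = total / replicates
    return observed, expected, observed - expected


# Toy tree (made-up): two perfectly country-specific clusters.
toy_tree = ((("Chad", "Chad"), ("Chad", "Chad")),
            (("Mali", "Mali"), ("Mali", "Mali")))
obs, exp, diff = observed_minus_expected(toy_tree, replicates=200)
```

In this toy case the tree needs only one change, fewer than expected under random mixing, so the difference is negative, which is the signature of population subdivision in the sense used above.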
Finally, to determine the nature of the selection pressures acting on the Africa 2 RABV sequences, we estimated the mean ratio of nonsynonymous (dN) to synonymous (dS) substitutions per site (dN/dS) by using the maximum-likelihood single likelihood ancestor counting (SLAC) method available through the DATAMONKEY web interface of the HyPhy package (http://www.datamonkey.org).

Phylogeography of Africa 2 RABV
The MCC tree of 97 complete N sequences is shown in Fig. 1. The topology is similar to those of earlier phylogenetic analyses of the N gene (Durr et al., 2008; Kissi et al., 1995). In particular, the sequences were clearly divided into two major clusters, identified as the 'Cosmopolitan' and 'Africa 2' clades. The Cosmopolitan clade includes isolates that are distributed in northern, central, eastern and southern Africa, as shown previously (Bourhy et al., 2008; Kissi et al., 1995; Nel & Rupprecht, 2007). More notably, we provide the first evidence for a widely distributed Africa 2 clade, comprising dog isolates that have a wide geographical range in western and central Africa, including Guinea, Sierra Leone, Senegal, Niger, Nigeria, Mauritania, Ivory Coast, Burkina Faso, Cameroon, Benin, Chad, Mali, Gambia and the Central African Republic (CAR) (Fig. 2).
To obtain a better estimation of the phylogenetic relationships among these isolates, an additional phylogenetic analysis was performed on a larger dataset of 134 N and 34 G gene sequences from the Africa 2 clade only. This analysis confirmed and extended the general conclusions described above. In particular, the Africa 2 clade has a wide distribution in western and central Africa, with only very little overlap with the Africa 1 clade in CAR and Nigeria. For example, in CAR, the Africa 2 clade seems to circulate in the north of the country, whereas those viruses sampled from the south of the country, and particularly from Bangui, belong to the Africa 1 clade. In Nigeria, no precise delimitation of the two clades could be drawn, as the precise spatial co-ordinates of the isolates were unavailable. However, most of the sequences (19 of 20) originating from this country belonged to the Africa 2 clade.
The Africa 2 sequences could also be grouped arbitrarily on the basis of phylogenetic placement into eight main groups (denoted A-H), all of which were supported by strong Bayesian posterior probability values (Figs 2 and 3). The MCC tree of 34 G gene sequences had a topology very similar to that of the N gene (Fig. 4), with equivalent phylogenetic grouping (B-H; no group A sequences were available for analysis) and with matching geographical distributions.

Timescale of RABV evolution and selection pressures
The mean rates of nucleotide substitution for the N and G genes of isolates belonging to the Africa 2 clade, estimated by using a Bayesian MCMC approach, were 3.82×10⁻⁴ substitutions per site year⁻¹ (95 % HPD, 2.62-5.02×10⁻⁴ substitutions per site year⁻¹) and 3.25×10⁻⁴ substitutions per site year⁻¹ (95 % HPD, 2.22-4.32×10⁻⁴ substitutions per site year⁻¹), respectively. These rates are strongly concordant with previous estimates of substitution rates in fox RABV in Europe and mongoose RABV in Africa (Bourhy et al., 1999; Davis et al., 2007), as well as a more global analysis of dog RABV sampled worldwide (Bourhy et al., 2008). By using the same approach, we were able to estimate the TMRCA of the Africa 2 clade to be 163 years (95 % HPD, 72-288 years), which corresponds to the year 1845. These data also depict a gradual diversification of the Africa 2 clade into the eight phylogenetic groups that may have occupied this entire geographical range by the year 1955, perhaps 100 years after the TMRCA of the clade as a whole (Fig. 3).
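The conversion from a TMRCA expressed in years before the most recent sample to a calendar year can be checked with a short Python sketch. The reference year of 2008 is an assumption inferred from the statement that 163 years corresponds to 1845; the constant and function names are hypothetical.

```python
# Assumption: ages are counted back from 2008, inferred from 163 years <-> 1845.
REFERENCE_YEAR = 2008


def age_to_calendar_year(age_in_years, reference_year=REFERENCE_YEAR):
    """Convert an age (years before the reference) to a calendar year."""
    return reference_year - age_in_years


mean_year = age_to_calendar_year(163)
# The oldest age gives the earliest year, so the HPD bounds swap order.
hpd_years = (age_to_calendar_year(288), age_to_calendar_year(72))
```

Under this assumption the 95 % HPD of 72-288 years maps to 1720-1936, the calendar interval quoted for the N gene in the Discussion.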
A similar timescale, with overlapping 95 % HPD values,

Population and spatial dynamics of dog RABV
We also conducted a more detailed investigation of the patterns and dynamics of the spatial diffusion of the Africa 2 clade, using the larger N gene dataset. This revealed a very strong population subdivision (P≤0.001), similar to that observed for RABV in other geographical areas (Bourhy et al., 2008). Indeed, of 182 pairwise comparisons, only 11 (6 %) exhibited a positive value between countries (>0); this is indicative of migration between them (Table 1). The strongest evidence for inter-country migration is from Chad to Nigeria (1.06), Chad to Benin (0.903), Chad to CAR (0.929), Chad to Cameroon (0.777), Chad to Niger (0.333), Niger to Burkina Faso (0.813) and Burkina Faso to Mali (0.636); this is indicative of a general westward movement across west/central Africa.
However, in all cases, the strength of migration was weak, with population subdivision being by far the strongest signal in these data.
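The count of 182 comparisons is consistent with every ordered (source, destination) pair among the 14 sampled countries, that is, 14 × 13. The Python sketch below checks this arithmetic and ranks the positive values quoted in the text. The interpretation of the 182 comparisons as ordered pairs is an assumption, the variable names are hypothetical, and only the seven values reported above are included.

```python
from itertools import permutations

# The 14 countries in which Africa 2 isolates were sampled.
COUNTRIES = [
    "Guinea", "Sierra Leone", "Senegal", "Niger", "Nigeria", "Mauritania",
    "Ivory Coast", "Burkina Faso", "Cameroon", "Benin", "Chad", "Mali",
    "Gambia", "CAR",
]

# Assumption: one comparison per ordered (source, destination) pair.
directed_pairs = list(permutations(COUNTRIES, 2))
n_pairs = len(directed_pairs)

n_positive = 11  # positive comparisons, as reported in the text
percent_positive = 100 * n_positive / n_pairs

# Observed-minus-expected values quoted in the text (source -> destination).
REPORTED = {
    ("Chad", "Nigeria"): 1.06,
    ("Chad", "CAR"): 0.929,
    ("Chad", "Benin"): 0.903,
    ("Niger", "Burkina Faso"): 0.813,
    ("Chad", "Cameroon"): 0.777,
    ("Burkina Faso", "Mali"): 0.636,
    ("Chad", "Niger"): 0.333,
}
ranked = sorted(REPORTED.items(), key=lambda item: item[1], reverse=True)
chad_as_source = sum(1 for (src, _dst) in REPORTED if src == "Chad")
```

Five of the seven quoted values have Chad as the source, which is the basis for treating Chad as a potential source population.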

DISCUSSION
There are many recent examples of disease introduction and spread that have resulted from the human-mediated movement of animals. Rabies is a prime example (Bourhy et al., 2005; Fevre et al., 2006; Windiyaningsih et al., 2004). Ancient texts describe the existence of this disease in Mesopotamia. From this location, the virus is proposed to have spread to Europe and then, following patterns of human colonization, to Africa (Nel & Rupprecht, 2007).
More recently, we were able to determine a common origin of all dog viruses circulating globally and to propose that the ancestor of these viruses existed <1500 years ago, perhaps in the Indian subcontinent (Bourhy et al., 2008).
The results described here provide the first evidence for the spread of dog RABV in western and central Africa, associated with the emergence of an Africa 2 clade, which is the dominant lineage of RABV in this area. Notably, this clade is surrounded to the north, the east and the south by the Cosmopolitan clade (Fig. 1). Such a spatial and phylogenetic distinction strongly supports the idea that these clades represent independent introductions into Africa.
More notably, the recent TMRCA for this Africa 2 clade, with a mean date (from the N gene) of 1845 (95 % HPD, 1720-1936), is consistent with the documented timescale of the expanding European colonial influence in western and central Africa. At the beginning of the 19th century, the French had contact in different areas of coastal west Africa, but their efforts were most clearly focused on the Senegal River area and its hinterland (Crowder, 1990). From 1880, French control was established over much of northern, western and central Africa, such that, by the early years of the 20th century, the French held most of what would become their colonial territory in western and central Africa (including present-day Mauritania, Senegal, Mali, Burkina Faso, Benin, Guinea, Ivory Coast, Chad, CAR, Niger and the Republic of the Congo).
Our analysis of phylogeographical structure also provides information on the pattern of RABV spread in this part of Africa. In particular, the facts that (i) sequences sampled from Chad fall into multiple clades, notably those that tend to be basal in the tree, and (ii) those sequences sampled from countries bordering the Atlantic Ocean tend to fall at distant locations, are suggestive of an initial introduction of RABV into the eastern part of west/central Africa, with Chad as a potential source population, followed by a general westward (and southward) diffusion. However, the pattern is not strong and will need to be confirmed with a larger sample of sequences and more refined spatial data. Indeed, the main spatial pattern present in our data is that of population subdivision, with only limited evidence of viral movement among countries. As such, the Africa 2 clade satisfies the model proposed for dog RABV in general; that is, of a series of spatially distinct clusters that experience relatively little contact among them (Bourhy et al., 2008).
Overall, our study illustrates how the establishment and intensification of travel and trade routes between African countries following colonization and during the first half of the 20th century have been accompanied by the spread of rabies in dogs among a large part of west/central Africa. It is also possible that parallel urbanization facilitated the spread and maintenance of dog RABV in the same region.
In addition, our detailed phylogeographical analysis reveals that the Africa 2 clade spread gradually, perhaps taking longer than a century, to western and central Africa, with no evidence for positive selection acting on either the N or the G gene.
Our data sit in marked contrast to a recent study based on high-resolution temporal and spatial data, which suggested that RABV spreads rapidly and continually from endemic rabies regions in Africa (Hampson et al., 2007). Specifically, the authors explained the occurrence of rabies epidemics in southern and eastern Africa by suggesting that some 'superspreader' dogs may transmit the disease over large distances (Hampson et al., 2007). However, the very strong geographical clustering of RABV sequences depicted in our study, as well as a timescale of spatial diffusion measured in decades, argue against any model in which superspreaders transmit their virus rapidly over a large territory. Similarly, the possible transportation of latent, or infectious, dogs by people travelling over this vast region does not seem to have a large epidemiological impact, as demonstrated by the limited movement of virus among localities.
More detailed phylogenetic and epidemiological analysis of higher-resolution data from across temporal and spatial scales is clearly necessary to determine the mechanisms underlying circulation patterns of canine rabies and to develop a more predictive understanding of the spatiotemporal dynamics of rabies in Africa and other localities.
As such, this study also highlights the need for improved inter-country collaboration to better describe the spread of dog RABV among countries in western and central Africa and to obtain a reliable picture of rabies epidemiology in this region (Dodet et al., 2008). Finally, our study is also of importance when trying to design an effective strategy for the control and elimination of dog rabies in western and central Africa. In particular, we demonstrate, for the first time, that there is no spread of rabies between the countries of northern Africa and those of the sub-Saharan region and, further, that the exchange of viruses between countries of western and central Africa is limited. As a consequence, a progressive strategy of rabies elimination from western and central Africa is conceivable.

Fig. 1. MCC tree of 97 sequences estimated from the N gene of RABV. Horizontal branches are drawn to a scale of estimated year of divergence (coalescence), with tip times reflecting sampling date (year). Posterior probability values (>90 %) are shown for key nodes.

Fig. 2. Collection sites and geographical distribution of isolates belonging to the Africa 2 clade. Localization of samples is indicated by coloured numbers or spots. Countries where viruses of the Cosmopolitan lineage are found are shown in light grey, countries where viruses of the Cosmopolitan and Africa 2 clades are found are shown in black and those countries where viruses of Africa 2 are found are depicted in dark grey.

Fig. 3. MCC tree of 134 sequences of the Africa 2 clade, estimated from the N gene of RABV. The estimated TMRCA for this sample of viral lineages, as well as its 95 % HPD values, are indicated. The major groups (A-H) of the Africa 2 clade are also indicated, with their TMRCA in italics. Horizontal branches are drawn to a scale of estimated year of divergence, with tip times reflecting sampling date (year). Posterior probability values (>90 %) are shown for key nodes.

Fig. 4. MCC tree of 34 sequences of the Africa 2 clade, estimated from the G gene of RABV. The estimated TMRCA for this dataset and its 95 % HPD values are indicated. The major groups (B-H; no group A viruses are available) of the Africa 2 clade are also indicated, with their TMRCA in italics. Horizontal branches are drawn to a scale of estimated year of divergence, with tip times reflecting sampling date (year). Posterior probability values (>90 %) are shown for key nodes.

Table 1. Parsimony analyses of migration frequency and direction among isolates of the Africa 2 clade
Positive values (shown in bold) suggest viral migration between regions, whereas negative values are indicative of population subdivision. Abbreviation: CAR, Central African Republic.