Prioritizing surveillance of Nipah virus in India


Abstract
The 2018 outbreak of Nipah virus in Kerala, India, highlights the need for global surveillance of henipaviruses in bats, which are the reservoir hosts for this and other viruses. Nipah virus, an emerging paramyxovirus in the genus Henipavirus, causes severe disease and stuttering chains of transmission in humans and is considered a potential pandemic threat. In May 2018, an outbreak of Nipah virus began in Kerala, >1,800 km from the sites of previous outbreaks in eastern India in 2001 and 2007. Twenty-three people were infected and 21 people died (16 deaths and 18 cases were laboratory confirmed). Initial surveillance focused on insectivorous bats (Megaderma spasma), whereas follow-up surveys within Kerala found evidence of Nipah virus in fruit bats (Pteropus medius). P. medius is the confirmed host in Bangladesh and is now a confirmed host in India. However, other bat species may also serve as reservoir hosts of henipaviruses. To inform surveillance of Nipah virus in bats, we reviewed and analyzed the published records of Nipah virus surveillance globally. We applied a trait-based machine learning approach to a subset of species that occur in Asia, Australia, and Oceania. In addition to seven species in Kerala that were previously identified as Nipah virus seropositive, we identified at least four bat species that, on the basis of trait similarity with known Nipah virus-seropositive species, have a relatively high likelihood of exposure to Nipah or Nipah-like viruses in India. These machine learning approaches provide the first step in the sequence of studies required to assess the risk of Nipah virus spillover in India. Nipah virus surveillance not only within Kerala but also elsewhere in India would benefit from a research pipeline that includes surveys of known and predicted reservoirs for serological evidence of past infection with Nipah virus (or cross-reacting henipaviruses). Serosurveys should then be followed by longitudinal spatial and temporal studies to detect shedding and isolate virus from species with evidence of infection. Ecological studies will then be required to understand the dynamics governing prevalence and shedding in bats and the contacts that could pose a risk to public health.

Introduction
The henipaviruses, including Nipah virus and Hendra virus, are highly lethal, emerging, bat-borne viruses within the family Paramyxoviridae that infect humans directly or via domestic animals that function as bridging hosts [1,2]. Previous Nipah virus outbreaks were reported in Malaysia in 1998 [3], in eastern India in 2001 and 2007 [4-6], in Bangladesh almost annually since 2001 [7], and in Kerala, India, in May 2018 [8,9]. In Malaysia, transmission from bats to humans occurred through pigs as intermediate hosts. Pigs were putatively infected after consuming fruit that had been partially eaten by Pteropus vampyrus bats [10]. In Bangladesh, transmission from bats to humans occurred through consumption of date palm sap contaminated by P. medius (formerly P. giganteus) [11], and subsequent human-to-human transmission has been commonly observed [12]. Initial wildlife studies in response to the 2018 Kerala outbreak focused on insectivorous bats (Megaderma spasma) [13], whereas a later survey focused on P. medius and found that 19% (10/52) of the P. medius tested had at least one biological sample with evidence of Nipah virus RNA by real-time reverse transcription polymerase chain reaction (RT-PCR). Details such as whether the bats originated from one or more populations and which tissues or specimens were sampled have not been published [9]. The routes of transmission from bats to the index case in Kerala are also unknown [14]. However, a salient feature of the outbreak in Kerala was a superspreader who, while he was cared for in hospital, infected most of the cases identified during the outbreak [15]. Although henipaviruses are widely distributed geographically, most surveillance has been patchy in space and time, and it seems likely that henipaviruses occur in species that have not yet been identified as reservoirs [18].
An additional challenge to confirming henipavirus reservoirs and characterizing their dynamics is the generally low and variable prevalence in bats [20,21,36]. Intensive spatial and temporal sampling is necessary to overcome these challenges, and such studies have yet to be conducted in India. Importantly, surveillance for human infections and further epidemiologic investigation provides crucial context for understanding which reservoir species are epidemiologically important, when and where spillovers occur, and which viruses pose the greatest public health threat.
To provide guidance for sampling bats in India generally, and guidance for epidemiologic studies looking for animal exposures associated with Nipah virus spillovers in Kerala, we systematically searched the literature for records of studies of Nipah virus and henipaviruses in bat species known to occur in Asia, Australia, and Oceania. We collated all records of Nipah virus shedding from bats (PCR) and Nipah virus exposure in bats (serology that likely includes cross-reacting henipaviruses). We used generalized boosted regression of more-extensive data on bats in Asia, Australia, and Oceania to make trait-based predictions of likely henipavirus reservoirs near Kerala.
gov; and a generous donation of a Titan Xp by the NVIDIA Corporation. The content of the information does not necessarily reflect the position or the policy of the U.S. government, and no official endorsement should be inferred. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Detection of Nipah virus in bats
As part of a broader study on filoviruses and henipaviruses in wild bats, we systematically searched Web of Science, Centre for Agriculture and Biosciences International (CAB) Abstracts, and PubMed with the following terms: (bat* OR Chiroptera*) AND (filovirus OR henipavirus OR "Hendra virus" OR "Nipah virus" OR "Ebola virus" OR "Marburg virus" OR ebolavirus OR marburgvirus) NOT (human); we also performed a secondary search that included "human". We followed a systematic exclusion protocol [37] and, because the search was conducted during a study on viral detection or serological detection estimates, we only retained records from observational studies that measured the proportion of wild bats positive for each viral group as assessed by PCR (prevalence) or serology (seroprevalence). We supplemented these data with studies referenced in the systematically identified publications that report viral isolation but not prevalence or seroprevalence. For the generalized boosted regression analysis, we culled the global data by including only studies that reported Nipah virus (by serology or PCR). This search yielded 286 records from 25 papers. For each record, we classified the species, country of sampling, diagnostic method (PCR or serology), sample size, sampling and reporting method (single or multiple cross-sectional events, samples pooled to one estimate), and the proportion of PCR-positive or seropositive bats (Fig 1). We display these data in a phylogenetic context using the bat phylogeny derived from the Open Tree of Life and the rotl and ape packages (Fig 2) [38,39].
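As a minimal sketch of how such records reduce to a proportion positive, the Python below stores each record with the fields listed above and pools counts by species and diagnostic method. The Record class, its field names, and pooled_proportion are hypothetical names introduced for illustration and are not taken from the study. The two example counts (1 of 31 and 10 of 52 P. medius positive by PCR) are those quoted elsewhere in this article; pooling them is an arithmetic illustration of a "pooled" estimate only, not a result reported here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One surveillance record (hypothetical schema, for illustration only)."""
    species: str
    country: str
    method: str       # "PCR" or "serology"
    n_tested: int
    n_positive: int

    @property
    def proportion(self) -> float:
        # Prevalence (PCR) or seroprevalence (serology) for this record.
        return self.n_positive / self.n_tested


def pooled_proportion(records):
    """Pool counts per (species, method) into a single estimate."""
    totals = defaultdict(lambda: [0, 0])  # key -> [positives, tested]
    for r in records:
        totals[(r.species, r.method)][0] += r.n_positive
        totals[(r.species, r.method)][1] += r.n_tested
    return {key: pos / n for key, (pos, n) in totals.items() if n > 0}


# Counts quoted in the text; everything else here is illustrative.
records = [
    Record("Pteropus medius", "India", "PCR", n_tested=31, n_positive=1),
    Record("Pteropus medius", "India", "PCR", n_tested=52, n_positive=10),
]

for r in records:
    print(f"{r.species} ({r.method}): {r.n_positive}/{r.n_tested} = {r.proportion:.0%}")

pooled = pooled_proportion(records)
print(pooled[("Pteropus medius", "PCR")])  # 11/83, pooled across both records
```

A pooled estimate of this kind hides variation between sampling events, which is why the reporting method was recorded alongside each proportion.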

Machine learning analyses
To make predictions of bat species that may carry Nipah virus in India and the surrounding region, we trained a generalized boosted regression model on data that characterized 48 traits of 523 extant bat species with geographic ranges in Asia, Australia, and Oceania. By learning the intrinsic features of species that have previously been found to have evidence of Nipah virus infection (in this study, either through serology or PCR), the objective is to identify additional bat species whose trait profiles suggest a high probability of being Nipah virus-positive. In addition, by examining those traits that are most predictive of Nipah virus-positive species, we may also glean ecological insights about why some bats are found to be Nipah virus-positive compared to others in this region. While examination of these suites of shared traits can be insightful, it is important to note that these methods are designed for pattern recognition rather than to identify mechanisms (although, in some cases, mechanisms may be suggested [42]).
We acquired range maps from the International Union for Conservation of Nature (IUCN) [43]. We obtained data on foraging method and diet composition from EltonTraits [44]. We derived data on biological and ecological attributes from PanTHERIA [45]. We took data on torpor and migration behaviors from Luis et al. [46], and data on production (a measure of fitness output) from Hamilton et al. [47]. All variables, their definitions, coverage, and data source citations are reported in S1 Table. Models were trained on 80% of this full data set, comprised 50,000 trees, specified a Bernoulli error distribution, and were built with 10-fold cross-validation to prevent overfitting. In addition, we weighted each species by its sample size ("sum.sample.size") to account for the fact that some species are more frequently sampled for henipaviruses than others. We also applied target shuffling methods to calculate the corrected area under the curve (AUC) [48].
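The idea behind target shuffling can be sketched as follows, under stated assumptions. The labels are permuted to break any trait-label association, the model is refit, and the AUC obtained on shuffled data forms a null distribution. The correction used here (observed AUC minus the amount by which the mean null AUC exceeds 0.5) is an assumption for illustration; the exact formula used in the study is given in [48] and is not reproduced in this text. The learner is a toy centroid-based scorer standing in for boosted regression, the data are synthetic, and all function names are hypothetical.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: P(positive outscores negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_centroid_scorer(X, y):
    """Toy stand-in learner (NOT boosted regression): score each species by
    its projection onto the difference between the training class means."""
    dims = range(len(X[0]))

    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in dims]

    mu_pos = mean([x for x, t in zip(X, y) if t == 1])
    mu_neg = mean([x for x, t in zip(X, y) if t == 0])
    w = [a - b for a, b in zip(mu_pos, mu_neg)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))


def target_shuffled_auc(fit, X_train, y_train, X_test, y_test,
                        n_shuffles=200, seed=1):
    """Return (observed AUC, mean null AUC, corrected AUC)."""
    rng = random.Random(seed)
    score = fit(X_train, y_train)
    observed = auc(y_test, [score(x) for x in X_test])

    y_tr, y_te = list(y_train), list(y_test)
    null = []
    for _ in range(n_shuffles):
        # Permute labels within each split (class counts are preserved).
        rng.shuffle(y_tr)
        rng.shuffle(y_te)
        null_score = fit(X_train, y_tr)
        null.append(auc(y_te, [null_score(x) for x in X_test]))
    null_mean = sum(null) / len(null)

    # Assumed correction: remove the part of the AUC attainable by chance.
    corrected = observed - (null_mean - 0.5)
    return observed, null_mean, corrected


# Synthetic data: 100 hypothetical species with 2 traits each;
# "positive" species have both traits shifted by +3.
rng = random.Random(0)
y = [i % 2 for i in range(100)]
X = [[rng.gauss(3.0 * t, 1.0), rng.gauss(3.0 * t, 1.0)] for t in y]
X_train, y_train = X[:80], y[:80]   # 80% used for training
X_test, y_test = X[80:], y[80:]     # 20% held out

observed, null_mean, corrected = target_shuffled_auc(
    fit_centroid_scorer, X_train, y_train, X_test, y_test)
print(observed, null_mean, corrected)
```

With a flexible learner and few positive species, the null AUC can sit well above 0.5, and the corrected value then gives a more conservative picture of performance than the raw AUC.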
We conducted a second generalized boosted regression analysis to diagnose whether greater data availability for better-studied species leads to trait profiles that describe well-studied bat species rather than species where evidence of Nipah virus infection has been reported. In this model, we used the number of citations in Web of Science for each species' scientific name as a proxy for study effort at the time this study was conducted. As before, models were trained on 80% of the full data set, comprised 30,000 trees, specified a Poisson error distribution, and were built with 10-fold cross-validation to prevent overfitting. Hyperparameter values and outputs for generalized boosted regression models can be found in S2 Table.

Results

Previous surveys
One hundred twelve species of bats have been detected in India, of which 39 have been detected within the state of Kerala [43,49,50]. Thirty-one bat species that occur in India (and 18 that occur in Kerala) have been sampled for Nipah virus, and 11 of these species have been identified as having antibodies that react in Nipah virus serological tests. However, almost all sampling of these species occurred outside of India. The 11 positive species include seven species that reside in Kerala: five Pteropodidae (Cynopterus brachyotis, C. sphinx, Eonycteris spelaea, Rousettus leschenaultii, and P. medius [formerly P. giganteus]) and two non-Pteropodidae (Scotophilus kuhlii and Hipposideros pomona; Table 1) [30,40,41,51-56]. Although all of these species had serological evidence of Nipah virus (or cross-reacting Nipah-like viruses), P. medius was the only species with virological evidence of Nipah virus (1 out of 31 individuals tested with PCR [3%]) [40,41]. Seroprevalence in sampled species ranged from 0% to 83% and prevalence from 0% to 3% (Table 1). P. medius [41] and R. leschenaultii [56,57] were the only species with seroprevalence >30%. However, most studies reported seroprevalence as pooled detection over time (i.e., samples from multiple time points were included in a single seroprevalence estimate). Only three species (P. medius, Cynopterus sphinx, and Megaderma lyra) were sampled within India, and one of these species (P. medius) had evidence of viral shedding within India [40,41] (Table 1 and Fig 2). Recent media reports suggest that additional cross-sectional surveys of bats have been conducted in response to the outbreak in Kerala and that P. medius tested positive by PCR [14].
In Fig 2, we map detections of Nipah virus by serology or PCR onto the phylogeny of bat species found in India. Our qualitative assessment of Nipah virus detections among these species, within a phylogenetic context, suggested clustering of Nipah virus positivity within Pteropodidae, consistent with the ongoing focus of research efforts on this family. However, Nipah virus reactivity was also detected in other bat families (Fig 2). Moreover, some clades that contain henipavirus-seropositive bats also contain species that occur in Kerala but have not been sampled (Fig 2). For example, a number of unsampled Hipposideros and Rhinolophus species that occur in Kerala are members of clades that include Nipah virus-seropositive bats (Fig 2).

Likely reservoirs
The generalized boosted regression model that we applied to species-level trait data identified Nipah virus-positive bat species with ~83% accuracy (Fig 3; corrected AUC = 0.83; complete model outputs and hyperparameters are reported in S2 and S3 Tables). In addition to Nipah virus-positive bat species, we identified six species with geographic ranges overlapping Asia, Australia, and Oceania that are not currently identified as Nipah reservoirs but, on the basis of trait similarity with known Nipah virus-seropositive or virological-positive bat species, have high likelihood of exposure to

Discussion
Our trait-based analyses identified four additional Indian bat species to target for surveillance for Nipah virus; two of these species occur within Kerala. Our predictions inform a research pipeline that should include serosurveys of these potential bat reservoirs and the 11 Indian bat species previously identified to have evidence of Nipah virus infection. Species that are seropositive on these initial surveys should then undergo longitudinal spatiotemporal surveillance to detect shedding. Our predictions must be combined with local knowledge of bat ecology (including distribution, abundance, and proximity to humans) to design sampling plans that can effectively identify hosts that pose a risk to humans [60]. Moreover, sampling of bats should be combined with epidemiological, anthropological, ecological, immunological, and virological work to uncover the relations that drive transmission of virus from animals to humans.
Nipah virus has a wide host breadth in both reservoir bat species and recipient animal species. Therefore, identifying the reservoir in a new location can be challenging. We used a systematic literature search to collate data from previous studies of Nipah virus in bats. We then prioritized surveillance of bats in Kerala, and more generally in India, on the basis of these data. We applied a trait-based generalized boosted regression that identified species with traits similar to those associated with serological or virological evidence of Nipah virus. Nipah virus was detected by PCR in only one species occurring in India, P. medius, which is also the known reservoir in Bangladesh. However, Nipah virus was detected by serology in many species. Eleven of the 112 bat species that occur in India, and seven of the 39 species that occur in Kerala, had serological evidence of Nipah virus exposure (most were sampled outside of India).
Our work provides a list of species to guide early surveillance and should not be taken as a definitive list of reservoirs. A series of further studies is required to triangulate on the reservoir hosts that pose a risk to humans. A major reason these studies do not identify definitive reservoirs is that almost all previous Nipah virus studies relied on serology, and serological assays often lack specificity; detection of Nipah virus may represent cross-reactions to closely related viruses [61]. For example, multiple studies have shown cross-reactivity among Hendra, Cedar, and Nipah viruses using glycoprotein assays [62-64]. It is likely that many of the positive tests reported here represent exposure to uncharacterized henipaviruses with antigenic similarity to Nipah virus. These viruses may or may not be zoonotic. PCR is specific and sensitive, and positive results demonstrate presence of Nipah virus RNA; however, the prevalence of Nipah virus is usually so low that large sample sizes are needed to yield positive detections [27,65] (outside of pulses of shedding [29,36]). Therefore, PCR may not be informative in the early stages of identifying reservoirs. Serology remains an important tool for these initial surveys as long as the assays are interpreted correctly and positive detections are followed by virological studies to detect shedding. These field surveys need to be followed by virological studies to characterize viruses and their zoonotic risk and then epidemiological studies to understand risk to public health [61].
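To illustrate why low prevalence demands large sample sizes, the sketch below computes the smallest number of bats that must be tested to have a given probability of detecting at least one positive. It is a textbook binomial calculation rather than a method used in this study, and it assumes independent sampling from a large population and a perfectly sensitive test; the function name is hypothetical. The 3% input corresponds to the 1 of 31 PCR-positive P. medius quoted above, and 1% is an arbitrary lower value for comparison.

```python
import math


def min_sample_size(prevalence, confidence=0.95):
    """Smallest n with P(at least one positive among n bats) >= confidence.

    Assumes independent sampling from a large population and a perfectly
    sensitive test, so that P(no positives) = (1 - prevalence) ** n.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - prevalence))


for p in (0.03, 0.01):
    print(f"prevalence {p:.0%}: test at least {min_sample_size(p)} bats")
# prevalence 3%: test at least 99 bats
# prevalence 1%: test at least 299 bats
```

Imperfect test sensitivity, clustered sampling, or shedding that occurs only in short pulses would all push the required sample size higher than these figures.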
In addition to suggesting potential reservoir species, the associative traits that predict reservoir capacity inform the ecology of potential bat reservoirs, which may guide epidemiological studies of Nipah virus infection. However, the utility of these traits as predictors of reservoir capacity should be interpreted as associative rather than causal. Some of the traits in the generalized boosted regression (see Supporting Information S2 Table) capture potential phylogenetic structure of Nipah virus hosts. For example, the relative importance of adult body length and forearm length could reflect the strong association of Nipah virus with medium to large Pteropodidae bats, although 'Pteropodidae' was not itself an important predictor (S2 Table). Beyond including bat families as taxonomic predictor variables, our analysis largely subsumes additional phylogenetic structure underlying patterns of Nipah virus seropositivity in bat species. It is likely that patterns of evolutionary relatedness among host species underlie similarities in factors that determine host receptivity. Such factors may include functional receptors that enable viral entry into host cells and host factors required for viral replication [66,67]. Patterns of co-divergence of hosts and viruses [68] are also reflected in host and viral phylogeny. The association of these traits with reservoir capacity should be elucidated by future phylogenetic comparative analyses of host traits, which will rely on expanded availability of relevant data (e.g., characterization of species-level differences in functional receptors).
Other traits with high relative influence included aridity (mean precipitation [mm]/mean potential evapotranspiration [mm]), the maximum latitudinal extent of each species' geographic range, the richness of mammal species found within a species' geographic range, and the trophic level of each species (see S2 Table, and partial dependence plots, S1 Fig). In general, our analysis suggests Nipah virus-positive bats in this region tend to be herbivorous or omnivorous species whose geographic ranges overlap with tropical desert (arid) habitats, maximally extending to the northern limit of the tropical belt and overlapping with a high diversity of other mammal species (S1 Fig). Given that bats from arid habitats may forage more widely when water or food resources become limited in dry years, it is also possible that Nipah virus transmission may occur with increasing contact between multiple bat species mixing at higher densities around limited resources [24].
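As a small worked example of the aridity trait as defined above (mean precipitation divided by mean potential evapotranspiration), the sketch below computes the ratio for two hypothetical ranges. The function name and the input values are illustrative assumptions, not data from this study. Under this definition, lower values indicate drier conditions.

```python
def aridity_index(mean_precip_mm, mean_pet_mm):
    """Aridity as defined in the text: mean precipitation (mm) divided by
    mean potential evapotranspiration (mm). Lower values mean drier."""
    if mean_pet_mm <= 0:
        raise ValueError("potential evapotranspiration must be positive")
    return mean_precip_mm / mean_pet_mm


# Hypothetical range-wide means (not values from the study).
dry_range = aridity_index(mean_precip_mm=250.0, mean_pet_mm=1500.0)
wet_range = aridity_index(mean_precip_mm=2400.0, mean_pet_mm=1200.0)

print(round(dry_range, 3))  # precipitation far below evaporative demand
print(round(wet_range, 3))  # precipitation exceeds evaporative demand
```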
A current constraint on progress towards understanding the epidemiology of Nipah virus in India is the dearth of virologic and taxonomic studies on bats in India. The majority of studies used for these analyses were conducted outside of India and no studies, to our knowledge, investigated Nipah virus in Kerala prior to this outbreak. India encompasses many different bioregions. The outbreak in Kerala shows that the ecological niche for Nipah virus is very wide and could include the entire distribution of P. medius, as well as the distributions of other potential reservoirs proposed here. Studies in wildlife and humans must cover this broad geography to assess future risk in India. Moreover, the last comprehensive and systematic taxonomic study on the bats in India was conducted more than a century ago. There are several cryptic species or species with unresolved taxonomic status in India, and it is possible that species with Nipah virus detections outside of India may have been misidentified. Therefore, our conclusions may change after detailed and systematic taxonomic studies are done on Indian bats.
Once serological evidence of Nipah virus is detected in potential reservoir hosts, longitudinal spatial and temporal surveillance of these hosts will be necessary. Detection of virus at a single point in time and space conveys limited information and could represent a spillover event from another species. To confirm the reservoir status of a species, virus must be consistently found within that species [69]. Moreover, maintenance of henipaviruses can be extremely dynamic. Seasonal, annual, interannual, or stochastic pulses of shedding can be driven by extinction and recolonization of virus among bat populations or episodic shedding in response to stress (see discussions in [26]). Therefore, discriminating viral maintenance from spillover, and characterizing shedding dynamics, requires intensive sampling over time and space.
Identifying reservoir hosts and then characterizing the diversity of their viruses and their virus shedding patterns are critical steps in understanding spillover. However, the transmission of Nipah virus from bats to humans requires alignment of a number of other ecological and epidemiological factors [67], including bat and human behaviors that expose humans to an infectious dose of Nipah virus. In Bangladesh and Australia, bat and human behaviors facilitate exposure to Nipah and Hendra virus, respectively, when bats exploit human food. In Bangladesh, bats contaminate human-harvested date palm sap [7]. In Australia, bats exploit food from trees in peri-urban areas when native winter food sources are cleared [26,70]. When pulses of virus shedding in bats coincide with bat and human or horse contact through food, spillover is more likely to occur [71]. Understanding these important interfaces requires a variety of epidemiological studies including niche and spatial risk modeling [72], as well as animal and human behavioral studies [7,11].
In addition to sampling bat reservoir hosts, sampling plans should consider that henipaviruses could be maintained in domestic recipient hosts. These hosts, with closer and more frequent contact with humans, can become bridge hosts for human infections [36]. For example, Nipah virus was repeatedly introduced into intensive commercial pig populations in Malaysia. These repeated introductions of Nipah virus into pig farms allowed accumulation of herd immunity and the conditions for long-term persistence and regional spread that facilitated transmission to humans [10]. To narrow potential spillover pathways to humans in India, studies should consider susceptible domestic animal species with husbandry that facilitates virus persistence (e.g., intensive commercial farming systems with high turnover of animals).
Projecting the risk of Nipah virus outbreaks in humans requires identification of the reservoir hosts and the dynamics of Nipah virus within those hosts. Our predictions inform initial sampling that can be followed by a sequence of studies that investigate the bat species highlighted here. The machine learning approaches presented here can be the first step in a research pipeline to eventually understand the mechanisms underpinning epidemiologically important cross-species contacts.