Digital Surveillance: A Novel Approach to Monitoring the Illegal Wildlife Trade

A dearth of information obscures the true scale of the global illegal trade in wildlife. Herein, we introduce an automated web crawling surveillance system developed to monitor reports on illegally traded wildlife. A resource for enforcement officials as well as the general public, the freely available website, http://www.healthmap.org/wildlifetrade, provides a customizable visualization of worldwide reports on interceptions of illegally traded wildlife and wildlife products. From August 1, 2010 to July 31, 2011, publicly available English language illegal wildlife trade reports from official and unofficial sources were collected and categorized by location and species involved. During this interval, 858 illegal wildlife trade reports were collected from 89 countries. Countries with the highest number of reports included India (n = 146, 15.6%), the United States (n = 143, 15.3%), South Africa (n = 75, 8.0%), China (n = 41, 4.4%), and Vietnam (n = 37, 4.0%). Species reported as traded or poached included elephants (n = 107, 12.5%), rhinoceros (n = 103, 12.0%), tigers (n = 68, 7.9%), leopards (n = 54, 6.3%), and pangolins (n = 45, 5.2%). The use of unofficial data sources, such as online news sites and social networks, to collect information on international wildlife trade augments traditional approaches drawing on official reporting and presents a novel source of intelligence with which to monitor and collect news in support of enforcement against this threat to wildlife conservation worldwide.


Introduction
The true worth of the illegal wildlife trade is unknown. This multi-faceted and clandestine industry has disrupted fragile ecosystems and facilitated the spread of pathogens and novel infectious diseases in humans, domestic animals, and native wildlife [1,2]. The trade includes live and dead wildlife of multiple species that are captured, poached, and sold for food, medicine, pets and trophies [3]. While some data exist on the volume, scope and scale of the global wildlife trade, the current understanding of the network is largely inferred from data on legal imports and exports recorded by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) treaty, which requires member nations to document global trade in endangered wildlife [4]. The vast illegal trade remains largely unmonitored and underground.
Not only is the true global scale of the illegal wildlife trade unknown, but regional and local levels of wildlife trade are also difficult to assess [5]. Quantifying all global wildlife trade would be a herculean task, as illegal trade ranges in scale from local to international levels and is often conducted through informal networks [2]. Confiscation records provide much of the available data on the scope of this illegal network. A 2010 study conducted by Rosen and Smith assessed the scope and scale of the worldwide illegal wildlife trade by examining 12 years of seizure records compiled by TRAFFIC, an international wildlife trade monitoring network. The study found 967 documented seizures of illegal wildlife and wildlife products representing vast species diversity and geographic scope. Factors such as inadequate infrastructure, corrupt officials, international crime networks, and a shortage of environmental conservation law enforcement officers affected individual nations' seizure activity [6].
Previous attempts to monitor the illegal wildlife trade have had mixed results. In 2002, the Invasive Species Internet Monitoring System was implemented to track Internet trade of invasive species using a semi-automated process of searching for these species sold online. Refining the search parameters within semi-automated queries allowed the system to monitor Internet sales of CITES listed species. While the system provided excellent results supporting proof of concept, it only captured Internet sales, which may not be representative of the greater illegal trade picture. In addition, the query parameters had to be well-defined for each species of interest in order to obtain relevant search results, and query results had to be consistently reviewed by subject matter experts in order to determine if further action needed to be taken [7]. The sacrifice of sensitivity for specificity combined with its labor-intensive approach limited its scalability.
A second study employed wildlife trade market surveys, which were administered repeatedly to more accurately estimate the number of illegally traded animals [8]. While repeat market surveys may improve the estimation of a particular species being illegally traded in specific regions or localities, obtaining a more accurate estimate of internationally illegally traded wildlife would require a much more robust system built to survey a multitude of markets worldwide for a wide array of species [5].
Illegal wildlife trade can be intercepted and reported by both official (e.g. government agencies, non-profit organizations working with local officials) and unofficial sources (e.g. news media). A surveillance system that collects both unofficial and official data may offer a more complete picture of the trade by providing up-to-date, highly localized information on the illegal commercialization of wildlife, a view of the illegal wildlife trade that until now has not been visualized on a global scale. To the best of our knowledge, an automated, real-time, comprehensive, global system monitoring official and unofficial reports of illegal wildlife trade activity has not previously been put into practice. Such a system would undoubtedly be useful to wildlife authorities in tracking illegal wildlife trade. Herein, we introduce an automated digital surveillance system that was developed to monitor reports on illegally traded wildlife and wildlife products.

Website Development
An automated digital surveillance system was developed that uses natural language processing and machine-learning algorithms to combine unofficial and official reports of wildlife trade events obtained from the Internet. The aim was to establish an automated web crawling surveillance system for the wildlife trade, similar to those used for infectious disease events (e.g. GPHIN, HealthMap) [9,10]. The underlying architecture of the HealthMap system has been previously described [11]. Briefly, the system addresses the challenge of scouring the Internet for pertinent outbreak information through automated querying, filtering, and visualization of reports, using automated text processing algorithms that classify alerts by location and disease [10]. This flexible system has been adapted for use in other non-disease surveillance settings [12].
The digital wildlife surveillance tool is freely available at http://www.healthmap.org/wildlifetrade and displays real-time reports of illegal wildlife trade activity worldwide as an interactive visualization (Figure 1). The system collects reports based on the keyword search strings described below and uses text-mining algorithms to classify reports on the illegal wildlife trade by location and species before overlaying this information onto an interactive mapping tool. Reports were gathered from around the world but were limited to English for this proof-of-concept development. The resulting system continually aggregates, organizes, and disseminates near real-time information to provide insight into the global wildlife trade network. We focus our analysis on one year of illegal wildlife and wildlife product confiscation data collected between August 1, 2010 and July 31, 2011.

Report Sources and Detection
Valuable information about the illegal wildlife trade is available via the Internet through official and unofficial sources. Official sources from freely available RSS feeds were utilized and included TRAFFIC, WildAid, The Coalition Against Wildlife Trafficking (CAWT), World Wildlife Fund (WWF), and the International Fund for Animal Welfare (IFAW). In addition, reports from unofficial sources were automatically collected utilizing key search terms from freely available websites, discussion forums, mailing lists, news media outlets, and blogs. Information was obtained only from publicly available sources to respect privacy. Standard Internet citation was practiced by including a brief excerpt from the original article and then linking to the source for additional detail. Overall, the system capitalizes on news indexers that draw from over 50,000 possible web-based resources [13].

Keyword Selection
The development and utilization of key search terms allows information on the global wildlife trade network from disparate unofficial sources to be monitored in near real-time. Upon reviewing historical reports on the illegal wildlife trade, we created a small set of candidate keywords that were both relevant and fairly specific to the illegal wildlife trade. We tested the potential of those keywords as queries to retrieve reports within the Google Reader tool (which allows the retrieval of a Google News query as an RSS feed). New keywords were added as determined by browsing for relevant reports that our system was missing. These new keywords were used to increase the amount of results yielded or to improve the specificity of the query. In order to collect reports most relevant to the illegal wildlife trade, there were several rounds of query improvements.
As an example, the word "poaching" would often appear, and was therefore used as an initial keyword. However, used alone, the term brought in reports unrelated to the wildlife trade. We added inclusion and exclusion terms to narrow the focus such as "in title: poaching wildlife" and "in title: poaching -bank -egg." This process was conducted repeatedly, building on our initial set of candidate keywords, to obtain relevant and specific information on the wildlife trade. A total of 18 search terms were selected to gather the reports analyzed here.
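The inclusion and exclusion logic described above can be illustrated with a minimal Python sketch. This is not the query engine used by the system (the queries were run through a news search tool); the function name `matches_query` and the sample titles are hypothetical and serve only to show how required and excluded title terms narrow the results.

```python
import re

def matches_query(title, include, exclude=()):
    """Return True if every include term and no exclude term occurs in the title.

    Hypothetical helper: mimics a title-restricted query such as
    "poaching -bank -egg" using whole-word, case-insensitive matching.
    """
    words = set(re.findall(r"[a-z]+", title.lower()))
    has_all_required = all(term in words for term in include)
    has_any_excluded = any(term in words for term in exclude)
    return has_all_required and not has_any_excluded

# Illustrative titles only; these are not reports from the dataset.
titles = [
    "Rangers arrest three for rhino poaching in national park",
    "Rival bank accused of poaching senior staff",
    "Poaching the perfect egg: a beginner's guide",
]

kept = [t for t in titles
        if matches_query(t, include=["poaching"], exclude=["bank", "egg"])]
print(kept)
```

Only the first title survives the filter, which mirrors how the exclusion terms remove banking and cooking stories that the bare keyword would otherwise retrieve.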

Data Collection and Visualization
For each report yielded by the selected keywords, the system extracted specific details including the species involved in the report, the specific geographical location where interception of illegal wildlife product occurred (Figure 2), a link to the original information source, and the date of source publication. Due to reporting inconsistencies (e.g. time of occurrence was sometimes reported as ongoing or as specific as a year, month, or week), the time of illegal wildlife trade activity was not included in statistical analysis.
Each report collected was reviewed manually by an analyst to ensure the most accurate information was displayed on the website. The analyst added precise geographical details for interception location and species information as needed. In addition, the analyst ensured that duplicate reports were hidden from display and that reports were correctly tagged. Category tags were applied to improve filtering [11], and include Breaking, for articles pertaining to live animal or wildlife product seizures with specific information on species, date, and location; Warning, for articles with details about historically known illegal wildlife trade activity, articles describing increasing levels of illegal trade over a period of time, or regarding illegal wildlife trade routes commonly utilized; and Context, for alerts on policy, law, or collaborations involving the illegal wildlife trade without specific incident information on product seizures or trade routes used. Reports tagged as Breaking or Warning appear on the wildlife trade website, while reports tagged as Context do not appear on the site to minimize information overload. Our analysis focused on Breaking alerts only.
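The tagging and display rules in this paragraph can be sketched as follows. The `Report` record and both function names are hypothetical and do not reflect the system's actual schema; the sketch also assumes that duplicates hidden from display were excluded from analysis, which the text does not state explicitly.

```python
from dataclasses import dataclass

# Per the text, Breaking and Warning reports appear on the site; Context does not.
DISPLAYED_TAGS = {"Breaking", "Warning"}

@dataclass(frozen=True)
class Report:
    # Hypothetical record layout for illustration only.
    title: str
    tag: str                  # "Breaking", "Warning", or "Context"
    is_duplicate: bool = False

def shown_on_site(reports):
    """Reports visible on the website: displayed tag and not a duplicate."""
    return [r for r in reports
            if r.tag in DISPLAYED_TAGS and not r.is_duplicate]

def included_in_analysis(reports):
    """Breaking alerts only (assumes hidden duplicates are also excluded)."""
    return [r for r in shown_on_site(reports) if r.tag == "Breaking"]

# Illustrative reports only.
reports = [
    Report("Ivory seized at port", "Breaking"),
    Report("Ivory seized at port (syndicated copy)", "Breaking", is_duplicate=True),
    Report("Pangolin trafficking rising along known route", "Warning"),
    Report("New wildlife trade agreement signed", "Context"),
]

print(len(shown_on_site(reports)), len(included_in_analysis(reports)))
```

With these four sample reports, two are displayed and one enters the analysis.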
To avoid confusion, commonly traded wildlife species were displayed on the website as shown in Table 1, broken down into common names and broad categories instead of listed by scientific names. When possible, detailed scientific names of species were recorded in an internal database in order to determine each species' red list status (the conservation status determined by the International Union for Conservation of Nature). Transportation methods used were also recorded when information was available.
Locations of wildlife interception points were visually displayed on the website along with a link to the original information source. Last, a geographical layer featuring international airports was added, as previous studies have highlighted airports as being important points of entry of illegal wildlife products [14-16]. For example, one study estimated that five tons of bushmeat was smuggled per week into the Paris Roissy-Charles de Gaulle airport alone [14]. Showing airports may allow officials to identify major transportation hubs for the illegal wildlife trade.

Comparison with CITES
To confirm the underlying assumption that the volume of wildlife traded illegally is not accurately reflected through the legal trade of wildlife products, we examined the CITES database of legally traded, threatened or potentially threatened species. The CITES database is "based on a system whereby permits or certificates are issued for international trade in specimens of species listed in one of three Appendices, each of which provides a different degree of trade control" [17]. CITES data on gross imports of elephants (both the Elephas and Loxodonta genera), which were the most commonly intercepted animals in our findings, were analyzed for 2008, 2009, and 2010. Data from 2011 were not yet available.
CITES data showed that imports of the genus Elephas averaged 13,661 from 2008-2010 and comprised both live animals and animal products ranging from ivory carvings to meat and hair. Imports of the genus Loxodonta averaged 89,523 from 2008-2010 and comprised live animals in addition to products such as bone carvings, ears, feet, hair, ivory carvings, meat, skins, skulls, and teeth [18]. It should be noted that, according to CITES, gross import data often overestimate the quantity actually traded: where the importer and the exporter have reported different quantities, the program selects the larger of the two [17].
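The overestimation noted above follows directly from the rule of taking the larger of the two reported quantities. A minimal Python sketch of that rule is given here; the function name and the paired quantities are hypothetical and are not drawn from the CITES database.

```python
def gross_import_quantity(importer_reported, exporter_reported):
    """Apply the rule described in the text: keep the larger reported quantity."""
    return max(importer_reported, exporter_reported)

# Hypothetical (importer, exporter) quantities for three shipments.
shipments = [(10, 12), (40, 40), (7, 3)]

gross_total = sum(gross_import_quantity(i, e) for i, e in shipments)
importer_total = sum(i for i, _ in shipments)
exporter_total = sum(e for _, e in shipments)

# The gross total can never be smaller than either party's own total.
print(gross_total, importer_total, exporter_total)
```

In this illustration the gross total (59) exceeds both the importer total (57) and the exporter total (55), because each discrepancy is resolved upward.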

Discussion
The HealthMap Wildlife Trade website is a comprehensive digital surveillance system for aggregating, organizing, and displaying illegal wildlife trade reports from official and unofficial sources. By providing a unified platform of both official and unofficial information, the system allows users to quickly and easily view a plethora of information (whether by species or location) from one application. From August 1, 2010 to July 31, 2011, 858 reports of illegal wildlife trade were collected from 89 countries, illustrating the global extent of this lucrative industry.
Previous methods that have attempted to estimate the enormity of the illegal wildlife trade have focused primarily on traditional data sets of legally traded wildlife. However, substantial amounts of information may be obtained through the utilization of unofficial digital media sources, as demonstrated by the number of reports collected, highlighting just one of the benefits of considering this methodology. Additionally, the data collected may prove useful to both conservation and public health officials by providing real-time and detailed information on whereabouts of threatened species and species capable of transmitting zoonotic diseases.
Several other organizations apply a variety of methods for monitoring illegal wildlife trade activity, but due to differences in scope and purpose, direct comparisons are difficult. The WWF, in partnership with TRAFFIC, created the Law Enforcement Management Information System (LEMIS) tracker that plots official data and flows of wildlife products seized upon entry into the United States (http://wildlifetradetracker.org/). Compared with LEMIS, the HealthMap wildlife trade system covers wildlife products seized worldwide and includes unofficial sources of information. It also allows users to view original media reports on seizures made, whereas LEMIS does not provide access to source information. The HealthMap wildlife trade map does not show wildlife product flows at this time.
The Tiger Tracker, also created by WWF and TRAFFIC, plots official data on seizures of tigers and tiger parts within Asia.
Lastly, the use of submissions from the general public is an important feature that distinguishes our system from most others. Citizen science data have been shown to have the potential to detect and track disease events and trends earlier than official data sources [19,20]. While Freeland allows the general public to contribute information on suspected wildlife trafficking in Southeast Asia, it likely receives a different subset of information from the public. Its website states, "we will act," suggesting that all publicly submitted reports are thoroughly investigated; it is therefore likely to elicit eyewitness reports of illegal trade rather than the web-available news articles, press releases, and government statements on wildlife trade events that our users report. Submissions to the HealthMap wildlife trade site are later reviewed by trained staff to check for duplicates and to ensure removal of any personally identifying information. If a submitted alert warrants further investigation, our protocol is to refer it to collaborators at WCS.
The International Union for Conservation of Nature (IUCN) provides a list of species at risk of extinction called the Red List of Threatened Species. Threatened species may be poached or caught live and sold for a high value because of their novelty in the market. The system detected reports involving numerous species of concern including the near-threatened pangolin for its scales, the critically endangered black rhino for its horn, the endangered Javan slow loris for the exotic pet trade, the vulnerable mandrill for bushmeat, and many others [21]. The presence of IUCN Red List species in our digital surveillance underscores the need to prevent illegal trade.
For comparison with the CITES database, the illegal wildlife trade reports collected through our website included 107 reports (12.5%) involving interceptions of illegal elephant products. However, just one interception in our database may involve many products from many different species. As an example, one report collected from December 2010 involved the confiscation of 105 pieces of jewelry made of elephant ivory [22], while another 2 tons of elephant ivory, or 247 tusks, were intercepted in April 2011 [23]. These two reported interceptions of elephant products would therefore be equivalent to 352 products if listed in the CITES database, as the CITES database counts each product individually (i.e. an elephant carving, hide, or skull). It is difficult to make a direct comparison to the CITES data due to the difference in wildlife product categorizations; however, further research is being conducted to obtain a better understanding of the scope of the illegal wildlife trade.
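The difference between counting reports and counting individual products can be made concrete with the two interceptions cited above. The record layout in this Python sketch is hypothetical; the quantities are those given in the text.

```python
# The two elephant ivory interceptions described in the text [22,23].
interceptions = [
    {"date": "December 2010", "items": 105, "product": "ivory jewelry pieces"},
    {"date": "April 2011", "items": 247, "product": "tusks"},
]

# Our system counts each interception as one report.
report_count = len(interceptions)

# A per-product count, as described for the CITES database, sums the items.
product_count = sum(record["items"] for record in interceptions)

print(report_count, product_count)  # 2 352
```

Two reports in our dataset therefore correspond to 352 individually counted products, which is why report totals and CITES totals are not directly comparable.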
Surveillance of illegal wildlife trade activity may also bring insight into the spread of zoonotic diseases as an early detection system, as the trade has been shown to play an inarguable role in the facilitation of disease transmission [24]. Jones et al. found that a majority of emerging infectious diseases were caused by zoonotic pathogens, and that over 70% originated in wildlife, with the number of events increasing significantly over time [25]. The wildlife trade contributes to the potential for increasing numbers of emerging diseases, as humans are directly exposed to wildlife and wildlife products through the many varying aspects of this lucrative industry (e.g. via hunters, salesmen, consumers) [2]. History has shown the dangers of zoonotic disease transmission through commercialization of live wildlife and their products [16,26]. Examples of past outbreaks from wildlife trade include monkeypox, which was imported into the United States in 2003 when infected Gambian pouched rats (Cricetomys gambianus) were transported with pet prairie dogs (Cynomys ludovicianus) [27]. Other cases of zoonotic diseases have also occurred and include rabies from mammals, severe acute respiratory syndrome (SARS) from small carnivores, highly pathogenic avian influenza from avian species, and chytridiomycosis from amphibians [2,28,29]. In light of these threats, the USAID's Emerging Pandemic Threats Program: Predict Project strives to build a global early warning system for emerging diseases that are transmitted between wildlife and people.
Surveillance of illegal trafficking is inherently limited by the clandestine nature of the activity; underreporting is unavoidable. Additional biases may arise from the selection of key search terms, media reporting biases, and restriction to the English language. For instance, a wealthy country may have more resources (such as Internet access, higher regulation standards, and freedom of press) than a less economically developed country. In the latter, this lack of resources may prevent the detection and reporting of illegally traded wildlife despite the possibility of more trade occurring in underdeveloped areas.
The search for illegally traded items is also limited by the keyword search terms utilized. Although a sample of queries was repeatedly tested to select for the most appropriate search terms, selection bias may have still resulted. There may also be additional biases towards certain terms selected or species reportedly traded. Only events reported through media outlets, or announced through official channels in which RSS feeds are being followed (CAWT, IFAW, TRAFFIC, WildAid, and the WWF), are shown on the website. Therefore, a bias exists towards stories that appeal to each media outlet's target audience and stories deemed to be particularly "newsworthy", such as those concerning particularly large seizures or that focus on charismatic megafauna like elephants, rhinoceros, and tigers. To the degree that media biases result in underreporting of lesser-known species that are important to the ecosystem at large, this system will suffer from mirrored underreporting.
The system does attempt to capture reports regardless of species involved through its utilization of specific keyword search terms. Terms like "seizure" or "illegal wildlife trade" are not synonymous with specific species. If our key search terms are actually used more often in stories for certain species of animals, then those animals may be better represented than others.
Lastly, media reports may not accurately reflect the true legality of each event reported. It may later be determined during legal proceedings that a transport of a wildlife product reported as illegal was actually being legally conducted. For products involving endangered species, however, where any transport or sale is prohibited, the number of reports incorrectly categorized is likely minimal.
Further, using only one language (English) may have caused an over-reporting in English speaking countries, such as India (n = 146, 15.6%), the United States (n = 143, 15.3%), and South Africa (n = 75, 8.0%). Previous literature has shown Asia to be a focal point of illegal wildlife trafficking, and these past findings were more in line with the next highest-ranking countries, China (n = 41, 4.4%) and Vietnam (n = 37, 4.0%) where English is not the main language [30]. Our preliminary results show that it would be beneficial to include non-English language reports from additional RSS feeds to obtain a more complete picture of the illegal wildlife trade. We have therefore begun to monitor reports in Japanese and plan to add Chinese, Malay, and Indonesian languages next. Adjustment for English-language Internet news sites per country was not possible under the scope of this pilot study.
Additionally, it would be useful to look into biases of species considered when conducting surveillance of illegal wildlife trade. For instance, in the United States deer and elk were often confiscated but not typically discussed by wildlife trade organizations due to their current status of "not threatened." The higher activity level within the United States in our automated system may have been due to the inclusion of deer, elk, bear, and moose being illegally poached or traded. Monitoring the trade in species not currently classified as threatened or endangered should be considered, as they may be at heightened risk for future threat. In addition, zoonotic diseases may be transmitted regardless of a species' conservation status.
Future work may also concentrate on analysis of media outlets, funding for enforcement, and user demographics. In addition, work may be conducted to display wildlife trade routes to and from a location, transportation methods used, and wildlife red list status of species confiscated. Additional geographical layers such as animal densities and transportation hubs may aid in further understanding the illegal wildlife trade and its associations with geographical factors. Additional work linking species traded with zoonotic diseases may also be conducted to identify potential hotspot regions for emerging zoonotic diseases.
Last, methods used to transport wildlife were documented when available, but not included in our results due to the low proportion of reports providing transportation information (19%). Insight into transportation methods could aid regulators in intercepting illegal wildlife trade before it reaches its final destination, and therefore as the dataset grows we hope to identify useful patterns. From the subset of reports that did include transportation methods of illegal wildlife trade (n = 163), modes of transportation utilized varied greatly and included land vehicles (trucks, buses, and cars), airplanes, boats, and trains. Some also utilized the Internet to purchase wildlife products that were then shipped through the mail (via ground and air postal services). The use of the Internet as a resource for obtaining illegal wildlife products is not new; however, as Internet access becomes more readily available, this may be an area to monitor closely. The wildlife trade website may aid in highlighting trends such as the utilization of the Internet for illegal wildlife trafficking in addition to shedding light on interceptions in areas not typically emphasized.
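The proportions quoted for transportation and species reports can be reproduced with a short Python calculation. The sketch assumes that the 858 collected reports serve as the denominator, which is consistent with the published percentages for these figures; the helper name `share_of_reports` is hypothetical.

```python
TOTAL_REPORTS = 858  # reports collected August 1, 2010 to July 31, 2011

def share_of_reports(count, total=TOTAL_REPORTS):
    """Percentage of all collected reports, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Reports that included a transportation method (n = 163).
transport_share = share_of_reports(163)

# Species counts as given in the text.
species_counts = {
    "elephants": 107,
    "rhinoceros": 103,
    "tigers": 68,
    "leopards": 54,
    "pangolins": 45,
}
species_shares = {name: share_of_reports(n) for name, n in species_counts.items()}

print(transport_share)
print(species_shares)
```

Under this assumption the calculation returns 19.0% for transportation and 12.5%, 12.0%, 7.9%, 6.3%, and 5.2% for the five species groups, matching the values reported in the text.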
Despite the limitations to a digital surveillance system, the HealthMap Wildlife Trade website is currently the most comprehensive and freely available tool for monitoring the illegal wildlife trade and may help improve our understanding of a clandestine market. The illegal wildlife trade continues to grow, and new challenges are consistently emerging, such as an increased number of online sales of illegal wildlife with limited regulation [31,32]. The problem continues to receive limited resources and political attention, as it escapes detection through an underground economy [33]. Further, this illegal industry is worsened by urbanization and global development that commercializes subsistence hunting and fishing and over-exploits terrestrial and marine ecosystems [34]. To keep up with these advances, global digital surveillance of the illegal wildlife trade is necessary to protect biodiversity, prevent endangerment of species, and control the introduction of infectious diseases [35,36].