Building the European Alien Species Information Network (EASIN): a novel approach for the exploration of distributed alien species data

The European Alien Species Information Network (EASIN; http://easin.jrc.ec.europa.eu) aims to facilitate the exploration of existing alien species information from distributed sources through a network of interoperable web services, and to assist the implementation of European policies on biological invasions. The network allows extraction of alien species information from online information systems for all species included in the EASIN catalogue. This catalogue was based on an inventory of reported alien species in Europe that was produced by reviewing and standardizing information from 43 online databases. It includes information on taxonomy, synonyms, common names, pathways of introduction, native range in Europe, and impact. The EASIN catalogue contains the basic information needed to efficiently link to existing online databases and retrieve spatial information for alien species distribution in Europe. Using search functionality powered by a widget framework, it is possible to make a tailored selection of a subgroup of species based on various criteria (e.g., environment, taxonomy, pathways). Distribution maps of the selected species can be produced dynamically and downloaded by the user. The EASIN web tools and services follow internationally recognized standards and protocols, and can be utilized freely and independently by any website, while ownership of the data remains with its source, which is properly cited and linked.


Background information
Europe is severely affected by biological invasions, which are considered one of the most important direct drivers of biodiversity loss, and a major pressure on several types of ecosystems, with both ecological and economic impacts (MEA 2005). A conservative estimate of the annual damage caused in the EU by alien species is €12 billion (EC 2008a). Recognizing the need for robust action to control biological invasions and thus mitigate their impacts on biodiversity, ecosystem services and human activities, the European Commission has recently adopted a Communication presenting policy options for an EU Strategy on Invasive Species (EC 2008a). Currently a dedicated legislative instrument is being developed by the Commission (to be proposed in 2012) as dictated by Action 16 of the Biodiversity Strategy (EC 2011). The latter explicitly requests that "by 2020, Invasive Alien Species and their pathways are identified and prioritized, priority species are controlled or eradicated, and pathways are managed to prevent the introduction and establishment of new IAS".
There is a need for accurate, detailed, and timely information on alien species occurrence, distribution and impacts to implement the European policies for the efficient prevention, early detection, rapid response, and management of biological invasions and also to evaluate management measures (Lee et al. 2008; Simpson et al. 2009; Hulme and Weser 2011). In recognition of this need, a large number of human networks and online databases have been created to provide information on biological invasions on a national, supranational, or global scale. These information systems provided the basic knowledge for various landmark assessments of alien invasions in Europe or in its regions (e.g. Chiron et al. 2009; Vilà et al. 2010; Pyšek et al. 2010) and have been effective at raising awareness and improving surveillance of biological invasions.
However, various issues such as variation in data quality, inadequate updating of information, lack of common and agreed definitions of 'alien' and 'invasive species', inconsistencies among databases, lack of geo-referenced species occurrences, and gaps in the spatial and taxonomic coverage (Vandekerkhove and Cardoso 2011; Hulme and Weser 2011; Gatto et al. 2012) are in many cases bottlenecks for the effective assessment and management of biological invasions. In a comparative assessment of the existing online information systems on alien species (Gatto et al. 2012), large differences among databases were found that were only partially explained by variations in their taxonomical, environmental, and geographical scopes. Even DAISIE (Delivering Alien Invasive Species Inventories for Europe; http://www.europealiens.org), the most comprehensive online resource of country-level alien species occurrences in Europe, did not include >30% of species reported in online databases (Gatto et al. 2012). Hence, there is a strong need for integration and harmonization of existing distributed information on alien species.
One approach to improve the quality of information on alien species, increase its availability and accessibility, and ultimately support a cost-efficient invasive alien species policy is to create a network of online interoperable web services through which information in distributed resources can be accessed (Vandekerkhove and Cardoso 2011; Gatto et al. 2012). The successful implementation of such a concept relies on the continued engagement at national and regional scale to collect and provide data; the willingness of database managers to harmonize their information; the development of a set of interoperable web services through which the information can be explored; and appropriate and sustainable funding. The European Commission's Joint Research Centre (JRC) has worked towards building such a network of online interoperable web services by developing its central platform (Figure 1). This paper describes the first stage of development of the European Alien Species Information Network (EASIN). The EASIN website (http://easin.jrc.ec.europa.eu) was made available to the public on May 15th 2012.

Scope
EASIN aims to enable easy access to data and information on alien species in Europe from existing on-line databases (Figure 1) to assist policy makers and scientists in their efforts to prevent and control alien invasions. It facilitates the exploration of existing alien species information from a variety of distributed information sources by developing and making freely available tools and interoperable web services compliant with internationally recognized standards. EASIN seeks to provide an integrated view of biological invasions on a European scale, utilizing regional and national information that is collected and assessed in global, European, supranational and national databases, thereby building on a large network of experts and citizens.
EASIN also aims to provide web tools and services that can be utilized freely and independently by any host. Data retrieved by EASIN can then be explored in various ways. Basic functionalities can be accessed using web widgets (a web widget is a stand-alone application that can be embedded into third party sites by any user on a page where they have rights of authorship) that offer interactive alien species data querying, GIS-based mapping and reporting interfaces. These widgets can be installed into third party web-sites and in most cases spare users from seeking alien species information among multiple web interfaces. EASIN offers a rich web services API (Application Programming Interface) to provide sufficient flexibility to achieve special or specific tasks, allowing advanced users to build custom applications tailored to particular needs. In every case, ownership of the data remains with its source, which is properly cited and linked. The current list of cooperating providers of geographical data is given on EASIN's website (http://easin.jrc.ec.europa.eu/Partners).

EASIN catalogue of alien species
The EASIN catalogue is at the core of the system. It contains the basic information needed to efficiently link to existing online databases and retrieve spatial information for alien species distribution in Europe (Figure 2). It is based on an inventory of all species and subspecies reported to be alien in (part of) Europe by one or more of 43 online information systems: 7 with global coverage, 2 with European coverage, 5 with supranational coverage, 26 with national coverage, and 3 with sub-national coverage (Table 1). Some of these databases do not specifically target alien species but serve a more general purpose of biodiversity monitoring; among these only those in which it was possible [...] From each of the 43 online information systems, all names of species listed as 'alien', 'cryptogenic', 'introduced', 'casual alien' or 'invasive' were extracted. Species listed as 'potential aliens' (watch lists), 'reintroduced', 'excluded' or 'extinct' were excluded. Marine species with type locality within the same regional sea (e.g. NE Atlantic species reported as aliens in the North Sea) were excluded. Vagrant marine species that have entered the Mediterranean via Gibraltar (mostly tropical Atlantic fish and decapods) or Mediterranean planktonic species that entered the Black Sea via the Dardanelles Straits were also excluded. For alien pests from the EPPO (European and Mediterranean Plant Protection Organization) database we included only those with at least some reported records (X1-X3 pests) and excluded pests whose occurrence in a given country was unclear (X0 pests). From the Audit of Non-Native Species in England, 'aliens' categorized as 'formerly native' and 'native with large addition from domestic or non-native stock' were also excluded. The inventory of all extracted species names was further processed in a multi-step procedure to harmonize notations across databases and identify duplicate taxon entries (see Appendix I for a detailed description).
The EASIN catalogue contains additional information for each species included (Figure 2). A taxonomic tree (kingdom, phylum, class, order, and family) is associated with every species in the catalogue. Information on pathways of introduction is being included by largely following the framework proposed by Hulme et al. (2008) (see Appendix I for details). Species that are recognized to have a high impact (i.e. present in the 'high-impact' or 'worst invasive' species lists of DAISIE, GISD and SEBI-2010) are also highlighted. For those species that are alien in some regions of Europe and native in others (e.g. Ponto-Caspian species invading the Baltic and the North Sea through inland canals), the native range has been defined (on a country level for terrestrial and freshwater species and on a marine basin level for marine species). Some of this information (i.e. pathways, native range) is currently available only for marine species.
Compilation of the EASIN catalogue is an ongoing process and includes several steps to achieve high quality standards. For the three final steps (see Table 2), contributions from external contracted experts are being sought (so far this has been done only for marine species). Table 2 shows the progress accomplished so far (by May 15th 2012; version 1.9 of the EASIN catalogue). In the EASIN catalogue a unique identification code (R_ID) is assigned to each valid species and subspecies name and is used to connect it with related features (e.g. environment, synonyms, common names, taxonomy, pathways, impact), which are stored in different tables. A formal procedure is followed for updating the catalogue so that all modifications and updates are traceable. A versioning mechanism allows tracking any change in the catalogue. This mechanism is based on the LSID (Life Science Identifiers) approach (TDWG 2011). The LSID concept introduces a straightforward approach to naming and identifying data resources stored in multiple, distributed data sources in a manner that overcomes the limitations of naming schemes in use today. A unique LSID is generated by the system for each collection of species selected by the user. This LSID allows the user to reach the latest revision of the EASIN catalogue data when accessing the EASIN system again. If updates took place in the meantime, the system provides notifications for the species that were updated.
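The status-based extraction rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the status labels come from the text, but the record layout (dictionaries with `name` and `status` keys), the function name `extract_names`, and the placeholder species names are hypothetical and do not reflect the actual EASIN processing chain. The additional marine, EPPO and England-specific exclusions are not covered.

```python
# Status labels that qualify a name for the inventory (taken from the text).
INCLUDED_STATUSES = {"alien", "cryptogenic", "introduced", "casual alien", "invasive"}


def extract_names(records):
    """Return the sorted, de-duplicated names whose listed status qualifies.

    `records` is a hypothetical list of dicts with 'name' and 'status' keys,
    standing in for the rows harvested from one source database.
    """
    names = set()
    for rec in records:
        status = rec["status"].strip().lower()
        if status in INCLUDED_STATUSES:
            names.add(rec["name"].strip())
    return sorted(names)


# Illustrative rows only; statuses are invented for the example.
sample = [
    {"name": "Siganus luridus", "status": "Alien"},
    {"name": "Placeholder species B", "status": "reintroduced"},
    {"name": "Placeholder species C", "status": "invasive"},
    {"name": "Siganus luridus ", "status": "introduced"},
]
inventory = extract_names(sample)
print(inventory)
```

Names listed under an excluded status, or under any status outside the five qualifying labels, simply never enter the set, and duplicates from different rows collapse to one entry.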
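The role of the R_ID as a key that connects a valid name to its related tables, and the use of all known names when querying data providers (as described in the caption of Figure 2), might look roughly as follows. The table layout, the R_ID value 101, the name strings other than Siganus luridus, and the function `search_terms` are all hypothetical placeholders; the actual EASIN database schema is not described in this paper.

```python
# Hypothetical tables keyed by R_ID. Only the valid name is a real taxon name;
# the other strings are placeholders, not real synonyms.
valid_names = {101: "Siganus luridus"}
original_names = {101: ["Siganus luridus", "Original name A"]}
synonyms = {101: ["Synonym B"]}


def search_terms(r_id):
    """Collect every name under which a species may appear at a data provider.

    The valid name comes first, followed by 'original names' and synonyms;
    duplicates are dropped while the order is preserved.
    """
    terms = [valid_names[r_id]]
    terms += original_names.get(r_id, [])
    terms += synonyms.get(r_id, [])
    return list(dict.fromkeys(terms))


terms = search_terms(101)
print(terms)
```

Querying providers with the full list rather than the valid name alone is what lets records stored under outdated or misspelled names still be found.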

Widgets Framework
An important feature of EASIN is the capability to make a tailored selection of a subgroup of alien species from the Catalogue and request relevant output. For this purpose the Widget framework has been developed, providing users with relevant stand-alone applications and services. It is a fast and flexible way to export pre-configured EASIN functionality to other websites focusing on biodiversity issues. The framework has two main types of web widgets: search widgets, and widgets to display maps and download queried data.
In the simple search widget (Figure 3), one or more species may be selected by name (full or part of the names, scientific or common names). Suggestions are automatically shown to assist the user to select the desired species based on scientific or common names. Two filters, one on environment (terrestrial, marine, freshwater) and another on impact (high, low, all), can also be jointly used to further refine species selection.
The combined search widget (Figure 4) contains two additional filters: taxonomy (based on a hierarchical selection on the taxonomic tree, from genus to phylum) and pathways (selecting one or more of the 21 pathway/subpathway categories). After filtering the catalogue, the user may refine the selection by selecting/deselecting species.
The results of the search widgets may be saved, and then retrieved using data display widgets in tabular form or used for the production of distributional maps based on a choice of two possible grids: 10×10 km or 10×10 minutes of a degree. When a single species is selected, the cells of the map grid will be coloured based on a three-colour pattern corresponding to four possible states: (0) absence or no information (no colour); (1) presence as alien; (2) previously present but now absent (e.g. due to eradication); (3) presence as native. When many species are selected, the number of present species in each cell will be depicted by a colour gradient. For each cell of the produced maps and for each species reported as present, links to the original sources of information will be available.
Widgets can be integrated into any webpage, have no link to a particular technology, support various layouts, and require zero programming skills from webmasters to be installed. Instead of writing code, web enablers can configure and place bits of functionality in a predefined grid. The reliance on ready-made features and functionality greatly reduces a project's development time, allowing performers to address scientific issues better. The Widget Framework embodies best practice methods and may offer valuable support to the developers of similar projects.
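The way the four filters combine can be illustrated with a small Python sketch. It assumes that the filters are applied jointly (a species must pass every filter that is set) and that several values within one filter are alternatives; the `Species` class, the `select` function, the pathway label "aquaculture" and the catalogue entries are hypothetical and are not part of the EASIN widget code.

```python
from dataclasses import dataclass, field


@dataclass
class Species:
    name: str
    environments: set          # e.g. {"marine"}
    high_impact: bool
    taxonomy: dict             # rank -> taxon, e.g. {"class": "Bivalvia"}
    pathways: set = field(default_factory=set)


def select(catalogue, environment=None, impact="all", taxa=None, pathways=None):
    """Return names of species that pass every filter that is set."""
    result = []
    for sp in catalogue:
        if environment and environment not in sp.environments:
            continue
        if impact == "high" and not sp.high_impact:
            continue
        if impact == "low" and sp.high_impact:
            continue
        if taxa and not (set(sp.taxonomy.values()) & set(taxa)):
            continue
        if pathways and not (sp.pathways & set(pathways)):
            continue
        result.append(sp.name)
    return result


# Placeholder entries; they do not describe real catalogue records.
catalogue = [
    Species("placeholder bivalve", {"marine"}, True,
            {"phylum": "Mollusca", "class": "Bivalvia"}, {"shipping"}),
    Species("placeholder gastropod", {"marine"}, False,
            {"phylum": "Mollusca", "class": "Gastropoda"}, {"shipping"}),
    Species("placeholder fish", {"marine", "freshwater"}, True,
            {"phylum": "Chordata", "class": "Actinopterygii"}, {"aquaculture"}),
]

# Mirrors the query of Figure 4: high-impact marine bivalves and gastropods
# introduced by shipping.
selected = select(catalogue, environment="marine", impact="high",
                  taxa={"Bivalvia", "Gastropoda"}, pathways={"shipping"})
print(selected)
```

Leaving a filter unset skips it entirely, which corresponds to the simple search widget using only the environment and impact filters.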
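The four cell states and the multi-species count can be expressed compactly in Python. The sketch below assumes that "present" covers both presence as alien and presence as native, which the text does not state explicitly; the `CellState` names, the `species_per_cell` function and the cell and species identifiers are hypothetical.

```python
from collections import Counter
from enum import IntEnum


class CellState(IntEnum):
    NO_INFO = 0            # absence or no information (no colour)
    ALIEN = 1              # presence as alien
    FORMERLY_PRESENT = 2   # previously present but now absent
    NATIVE = 3             # presence as native


# Assumption: both alien and native presence count as "present".
PRESENT = {CellState.ALIEN, CellState.NATIVE}


def species_per_cell(records):
    """Count distinct present species per grid cell.

    `records` is an iterable of (cell_id, species, state) tuples; repeated
    reports of the same species in the same cell are counted once.
    """
    present = {(cell, species) for cell, species, state in records
               if state in PRESENT}
    return Counter(cell for cell, _ in present)


records = [
    ("cell_1", "species_a", CellState.ALIEN),
    ("cell_1", "species_b", CellState.NATIVE),
    ("cell_1", "species_c", CellState.FORMERLY_PRESENT),
    ("cell_2", "species_a", CellState.ALIEN),
    ("cell_2", "species_a", CellState.ALIEN),
]
counts = species_per_cell(records)
print(counts["cell_1"], counts["cell_2"])
```

The resulting per-cell counts are the values that a colour gradient would then encode on the map.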

EASIN Web Services
All the data indexed by EASIN from distributed data sources (e.g. those used by the widget framework) are made available via the interoperable web services. The services were implemented respecting various open standards, i.e. freely [...] (Fielding 2000) and return requested data as XML. The mapping service is built upon the OGC WMS 1.3.0 standard (OGC 2011), which allows easy integration into users' mapping applications. In addition, each map layer is accompanied by INSPIRE-compliant metadata (EC 2008b).
The web services display the data retrieved from original data providers in an aggregated way. EASIN uses a well-known brokering approach similar to what is used by EuroGEOSS (2011) and also investigated in the context of GISIN (Graham et al. 2011) to index the data from the data providers. It includes three main steps: analysing the data source, capturing changes, and transforming the source data model to the EASIN data model. The latter preserves important elements from all data sources/providers, e.g. geo-referenced elements and links to the original data collections. The model also employs a mechanism for data harmonization among different sources. It allows the user to search multiple databases at once in real time, and arrange the results into a useful form for further reporting operations, such as mapping.
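Because the mapping service follows OGC WMS 1.3.0, a client can request a map image with a standard GetMap call. The following Python sketch builds such a request URL from the parameters defined by the WMS 1.3.0 standard; the endpoint `https://example.org/wms` and the layer name `species_grid` are placeholders, since the actual EASIN service address and layer names are not given in this paper.

```python
from urllib.parse import urlencode


def getmap_url(endpoint, layer, bbox, width=800, height=600, crs="EPSG:4326"):
    """Build a WMS 1.3.0 GetMap request URL.

    Note: in WMS 1.3.0 the BBOX axis order follows the CRS definition, so for
    EPSG:4326 it is (min_lat, min_lon, max_lat, max_lon).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",            # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer; the bounding box roughly frames Europe.
url = getmap_url("https://example.org/wms", "species_grid", (34.0, -11.0, 72.0, 35.0))
print(url)
```

Any WMS-capable client or GIS package can issue the same request, which is what makes integration into users' own mapping applications straightforward.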
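The transformation step of the brokering approach, in which provider-specific records are mapped onto one common model that keeps the coordinates and the link to the source, can be sketched as follows. Everything here is an assumption made for illustration: the two provider schemas, the field names, the URLs, and the functions `to_common` and `broker` are hypothetical and do not represent the EASIN data model or any real provider API.

```python
# Hypothetical per-provider field mappings: common field -> source field.
FIELD_MAPS = {
    "provider_a": {"species": "scientificName", "lat": "decimalLatitude",
                   "lon": "decimalLongitude", "source_url": "recordUrl"},
    "provider_b": {"species": "taxon", "lat": "y", "lon": "x",
                   "source_url": "link"},
}


def to_common(provider, record):
    """Map one raw provider record onto the common model."""
    mapping = FIELD_MAPS[provider]
    out = {target: record[source] for target, source in mapping.items()}
    out["lat"] = float(out["lat"])
    out["lon"] = float(out["lon"])
    out["provider"] = provider   # keep track of where the record came from
    return out


def broker(names, sources):
    """Return harmonized records from all providers matching any given name."""
    wanted = set(names)
    hits = []
    for provider, records in sources.items():
        for raw in records:
            rec = to_common(provider, raw)
            if rec["species"] in wanted:
                hits.append(rec)
    return hits


# Invented records, used only to exercise the functions.
sources = {
    "provider_a": [{"scientificName": "Siganus luridus", "decimalLatitude": "35.1",
                    "decimalLongitude": "25.0", "recordUrl": "https://example.org/a/1"}],
    "provider_b": [{"taxon": "Siganus luridus", "y": 36.4, "x": 28.2,
                    "link": "https://example.org/b/7"},
                   {"taxon": "Placeholder species", "y": 50.0, "x": 4.0,
                    "link": "https://example.org/b/8"}],
}
hits = broker(["Siganus luridus"], sources)
print(len(hits), sorted(h["provider"] for h in hits))
```

Keeping `source_url` and `provider` on every harmonized record is what allows aggregated results to be linked back to the original data collections.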

Current Status - Future Plans
EASIN is now in the initial phase of development and not all features have been made available to the public. The current status and the timeline for future development are depicted in Table 3. The marine part of the catalogue is at a more advanced stage of development than the freshwater and terrestrial parts, as all ten steps (Table 2) for its compilation have been completed. Steps 7-10 (as in Table 2) have been scheduled to be completed by the first quarter of 2013 for freshwater species and by the end of 2013 for terrestrial species. The search widgets are functional and open to the public, while the mapping widgets will be available to the public by the end of 2012. By that time, spatial data will be retrieved from multiple sources (Table 3). This initial set of source databases was purposely selected to have variable attributes in terms of spatial coverage, interoperability, and format of stored spatial information, and will thus serve as a pilot project to validate and improve the performance of EASIN. Based on the experience gained with this initial set of online databases, a targeted workshop will be organized for other potential data providers by the end of 2012, aiming to expand the EASIN network to all major related information systems by the end of 2013. A further step will be the development of a modelling platform, allowing users to produce predictive distribution maps based on historical and contemporary distributions of alien species and anticipated changes in the environment (e.g. climate change, land use).

Figure 1. Schematic illustration of the relationships among EASIN (European Alien Species Information Network) and data providers. EASIN aggregates data from all linked data providers and offers tools and web services to the providers and other interested hosts. All information provided by EASIN's services is linked to the source data, where the user should seek more detailed and disaggregated information.

Figure 2. Schematic illustration of the information contained in the EASIN catalogue and how species-specific information is retrieved from data providers. Each valid name of a species is linked to 'original names' (valid or invalid names that appeared in one or more online databases) and 'synonyms'. When spatial data for a specific species are sought, a search is made to data providers using not only the valid name but also all 'original names' and synonyms.

Figure 3. Example of the output of the simple search widget after searching for 'Siganus'. The control button for Siganus luridus has been pressed and a drop-down window with some basic information for the species has been opened.

Figure 4. Example of the combined search widget: high-impact marine bivalves and gastropods introduced in Europe by shipping.

Table 1. The 43 online species databases used to create the EASIN catalogue. Numbers of alien species may slightly deviate from what was reported in the source databases (retrieved in August 2011), as some species were excluded based on the methodology described herein.

Table 2. The main steps of the procedure to compile the EASIN catalogue. The progress achieved so far for each of the three environments (terrestrial, freshwater, and marine) is indicated, referring to version 1.9 of the EASIN catalogue that was uploaded on May 15th 2012.