Risk classifications of aquatic non-native species: Application of contemporary European assessment protocols in different biogeographical settings

Non-native species can cause negative impacts when they become invasive. This study compares risk classifications for 25 aquatic non-native species derived from various European risk identification protocols. For 72% of the species assessed, risk classifications were dissimilar between countries. A pair-wise comparison of Freshwater Fish Invasiveness Scoring Kit (FISK) scores for a total of 28 fish species from the UK, Flanders (Belgium) and Belarus showed a higher correlation for Flanders-Belarus than for Flanders-UK and Belarus-UK. We conclude that different risk classifications may occur due to differences in (1) national assessment protocols, (2) species-environment matches in various biogeographical regions, and (3) data availability and expert judgement. European standardisation of risk assessment protocols, performance of biogeographical region specific risk classifications and further research on key factors for invasiveness of aquatic ecosystems are recommended.


Introduction
In recent decades, risk assessment has gained much interest as an instrument to support policy makers in their decisions regarding the need for managing non-native species (Anderson et al. 2004; Byers et al. 2002). Once non-native species are introduced and become invasive, they can cause considerable damage to natural ecosystems, biodiversity, human health, cattle, agriculture, and the economy (Pimentel et al. 2005; Oreska and Aldridge 2011). Eradication and control of invasive species are very costly. For instance, recent estimates of environmental, social, and economic costs of 25 invasive non-native species in Europe vary between 12 and 20 billion euro per year for documented and extrapolated costs, respectively (Kettunen et al. 2008). These costs mainly result from damage and control measures. About ten percent of non-native species entering a country or region outside their natural distribution area are able to become highly invasive in marine and freshwater systems (Ricciardi and Kipp 2008). High impacts of non-native fish invaders are limited to about 19% of the total regions they invade (Ricciardi and Kipp 2008).
Risk assessment is useful in identifying species that are likely to become invasive and cause significant negative impacts. In order to derive appropriate management options, several European countries (e.g. Austria, Belgium, Germany, Ireland, Switzerland, and United Kingdom) have recently developed national risk assessment protocols to identify low, moderate and high risk species. Risk assessment protocols for non-native species generally cover the main stages of invasion: (1) entry, (2) establishment, (3) spread, and (4) impacts. Because of the large number of non-native species that spread worldwide, there is a particular need for quick screening tools that help to identify which newly arriving species have the potential to become invasive. Therefore, risk identification is one of the most important applications in risk assessment of non-native species.
In Europe, risk standards and assessment protocols have been developed by the European and Mediterranean Plant Protection Organisation (EPPO). These standards can be used for developing (new) risk assessment protocols, such as in the IMPASSE project on the assessment of environmental impacts of alien species in aquaculture (Copp et al. 2008). However, legislative and regulatory requirements for European Union member states concerning risk assessment and management of (invasive) non-native species are fragmented (Hulme et al. 2009). As a result, different risk classification approaches are being used in Europe and the vast majority of European risk assessment systems are not legally binding, so enforcement of their results in invasive species management is limited. Genovesi and Shine (2004) stress the importance of risk assessment in European policy on non-native species. They propose the use of a listing system to assign species to a black, white or grey list, depending on the severity of impact and data availability. Although the need for an early warning system for the European Union has recently been acknowledged (Genovesi et al. 2010), legal standards for risk assessment of non-native species are still lacking.
Risk assessments of non-native species tend to be of a qualitative or semi-quantitative nature (Dahlstrom et al. 2011; Heikkilä 2011), mainly because data for quantitative assessments are lacking (Kulhanek et al. 2011). However, in qualitative assessments of non-native species, lack of data is also a common problem (e.g. Gasso et al. 2010). As a result, current risk assessments are often based on incomplete data input and may rely heavily on expert opinions and assessors' interpretations (Maguire 2004; Strubbe et al. 2011). Where data are lacking, available risk classifications from other countries or regions are often used to predict whether or not a non-native species may become invasive. A match of species traits to climate and habitat also helps in predicting invasiveness.
According to Wittenberg and Cock (2001), the only factor consistently correlated with invasiveness in a region is invasiveness elsewhere. Although invasiveness elsewhere is usually included as a criterion in risk assessment, there are still remarkable differences between risk protocols worldwide and within Europe (Heikkilä 2011). These include differences in scope, weighting, scoring and classification methods, assessment criteria and uncertainty analysis. Moreover, there are many examples of non-native species which have become invasive in one region, but not in others (Ricciardi and Kipp 2008), and several species are known to expand to other habitat types once outside their native range (Wittenberg and Cock 2001). The sensitivity of ecosystems and economic impact may also differ between countries. For example, the risks and costs for control of the muskrat (Ondatra zibethicus) damaging river dikes in lowland regions are much higher than in uplands. Moreover, ecological impacts depend on region-specific habitat characteristics and conservation aims. It is therefore questionable whether risk classifications from one region are useful predictors for other regions, as they only predict the potential impact of a species.
Previous studies have reviewed risk protocols available worldwide for assessment of aquatic biosecurity (Dahlstrom et al. 2011) and pests and pathogens (Heikkilä 2011). Other studies have evaluated the use of one assessment tool for different species groups and in different geographic regions (e.g. Weed Risk Assessment (WRA), Gordon and Gantz 2011; Gordon et al. 2008). Within Europe, environmental indicators for introduction and impacts of alien aquatic macroinvertebrate species have been developed and applied to various river systems (Panov et al. 2009). In addition, the accuracy of three risk assessment schemes has been tested in Central Europe for woody species (Krivánek and Pyšek 2006). However, the recently developed national risk assessment tools for non-native species and the multitude of risk classifications available in Europe have not yet been analysed. Altogether, comparative analyses of different risk assessment methods are largely missing in Europe.
The aim of this paper is to evaluate available risk classifications of non-native aquatic species performed with various risk assessment protocols of European countries and to elucidate factors that may contribute to variability in risk classifications between countries. In order to achieve this goal we performed two types of comparisons between countries, using risk classifications from (1) different protocols, and (2) the same protocol. The implications of our results for risk assessment of non-native species and application of risk classifications will be discussed.

Literature search
A literature search was conducted to collect available protocols for risk assessment of non-native species in Europe (Verbrugge et al. 2010). In addition, an inventory was made of the outcomes in terms of risk classifications of species. Risk classifications and information about the protocols were obtained via the Internet and scientific publications.

Risk identification protocols
This study focused on (trans)nationally developed, generic risk identification protocols from Europe. For the purpose of this study, we included protocols (1) which are currently being used for risk assessment in one (or more) countries and (2) for which risk classifications were available for review. A literature search yielded protocols from Belgium, Germany/Austria, Ireland, and Switzerland. Moreover, two protocols from the United Kingdom (UK) were included: one species-specific tool developed for freshwater fish and invertebrates, and the national GB risk assessment scheme (formerly referred to as the UK risk assessment scheme). Strictly speaking, the latter goes beyond a risk identification tool, as it includes a more elaborate risk analysis. We decided to include the GB scheme as well, because it is a good example of a generic protocol that can be used for all taxonomic groups. Moreover, it is one of the first schemes, and the only elaborate one, used in Europe in a national context.
Overall, two different approaches for risk classification are applied in the protocols: (1) classification keys using formalized 'yes' or 'no' questions to assign high risk species to a Black List, and (2) semi-quantitative scoring methods, using the sum of the scores for various evaluation criteria as indicator for a high, medium or low risk using cut-off thresholds. The protocols are listed below with a short description of their characteristics.

Classification keys
The scope of the German-Austrian Black List Information System (GABLIS) is limited to ecological effects (Nehring et al. 2010). Based on five basic criteria, species are assigned to the White, Grey or Black List, according to their potential risk. Species with scientifically sound evidence of a significant threat to native biodiversity are assigned to the Black List; species with a less evidence-based reliability of effects are assigned to the Grey List, and species which do not pose a threat to native biodiversity are assigned to the White List. The Black List and Grey List are further divided into sub-lists based on the distribution of the species and the availability of eradication measures (Black warning, action and management list) and on the level of certainty of the assessment (Grey, watch and operation list). Six complementary, biological and ecological criteria related to impact are used to decide whether the species should be placed on the Grey (watch) List or the White List. For the comparison of risk classifications with other risk assessment tools we distinguish only between the Black, Grey and White List.
The Swiss classification key for neophytes is only applicable to plants and it assesses damage to biodiversity, human health, and economy using a total of ten questions (Weber et al. 2005). Species are then assigned to a Black or Watch List. The Black List includes plants that actually cause damage and the establishment and spread of these species should be prevented. The Watch List includes plants that have the potential to cause damage or are already causing damage in neighbouring countries.

Semi-quantitative protocols
The Invasive Species Environmental Impact Assessment (ISEIA) from Belgium assesses environmental impact only and has no taxonomic boundaries (Branquart 2007). The assessment consists of four sections matching the last steps of the invasion process: the potential for spread (1), establishment (2), adverse impacts on native species (3) and ecosystems (4). ISEIA is based on 12 questions, the results of which reduce to these four numerical responses with which a species is classified. Species are assigned to a list based on their total score: Black list (high environmental risk), Watch list (moderate environmental risk), and Alert list for potential risk species which are not yet present.
The GB risk assessment scheme is based on international risk standards provided by EPPO and can be used for all taxonomic groups. It roughly consists of two parts: (1) a preliminary assessment (14 'yes'/'no' questions) to determine whether a detailed risk assessment is needed, and (2) a detailed risk assessment scheme (51 questions) to assess the potential for entry and establishment, the capacity for spread, and the extent to which economic, environmental or social and human health impacts may occur (Baker et al. 2005, 2008). Answers can be given on a 5-point scale (ranging from very low to very high risk) and include an assessment of uncertainty (low, medium or high). Risks are then summarised in the four categories entry, establishment, spread, and impact, and aggregated to a final high, medium or low risk indication.
The Freshwater Fish Invasiveness Scoring Kit (FISK) is an adaptation of the WRA from Pheloung et al. (1999). It is one of the pre-screening tools that can be used to inform the preliminary assessment section of the GB Scheme. It uses 49 questions in eight categories: (1) domestication, (2) climate and distribution, (3) invasive elsewhere, (4) undesirable traits, (5) feeding guild, (6) reproduction, (7) dispersal mechanisms, and (8) persistence attributes. Moreover, it takes into account the confidence (certainty/uncertainty) ranking of the assessors. Scores can range from −11 to 54 and they classify non-native species into low, medium, and high risk categories. Similar invasiveness screening tools have been developed for non-native freshwater invertebrates (Tricarico et al. 2010), marine fish and invertebrates, and amphibians (Cefas 2010).
The Invasive Species Ireland Risk Assessment consists of a preliminary and a detailed assessment (Invasive Species Ireland 2008). We only included the classifications resulting from the preliminary (i.e. risk identification) assessment in this study, which already classifies species as high, medium or low risk. The complementary stage two assessment is only used to rank and prioritize high risk species and is therefore not useful for our comparison. There are separate assessment formats for potential and established species. Invasion history, vectors and pathways, suitability of habitats, propagule pressure, establishment success and spread potential are addressed in a total of ten questions, and ecological impacts, economic impacts, and impacts on human and animal health are assessed. Finally, the species are assigned to the high, medium or low risk category based on their summed scores.

Comparison of risk classifications
The similarity of risk classifications for aquatic, non-native species was analyzed by comparing risk assessment outcomes in two different ways. First of all, national (i.e. original) risk classifications were screened for similar species and this resulted in a table with risk classifications using different protocols in different contexts (or countries). For species that have been subject of risk assessments in three or more countries, the similarity of the risk classifications of protocols applied in different countries was analysed. Owing to the different phrasing in risk classifications, we distinguished three levels of risk: (1) high risk / black list or high risk species not yet introduced (alert list), (2) medium risk / grey list / watch list, and (3) low risk / white list / not invasive. For each included species the classifications were marked to be either equal (classifications from all protocols fall into the same category) or dissimilar (classification from one or more protocols differs from the others). Some countries have adopted risk identification tools from other countries or use adapted schemes (e.g. ISEIA in the UK; see Parrot et al. 2009). However, in this comparison we limit ourselves to the use of protocols in their 'native' country.
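The harmonisation step described above can be sketched in a few lines of Python. This is a minimal illustration only: the label spellings in the lookup table, the function names and the example species are hypothetical, and the sketch is not the procedure actually used to compile Table 1.

```python
# Hypothetical lookup table: protocol-specific labels -> three harmonised levels
# (1 = high risk / black list / alert list, 2 = medium / grey / watch, 3 = low / white).
RISK_LEVEL = {
    "high risk": 1, "black list": 1, "alert list": 1,
    "medium risk": 2, "grey list": 2, "watch list": 2,
    "low risk": 3, "white list": 3, "not invasive": 3,
}


def harmonise(label):
    """Map a protocol-specific label onto one of the three harmonised levels."""
    return RISK_LEVEL[label.strip().lower()]


def compare(classifications, min_countries=3):
    """Return 'equal' or 'dissimilar' for one species.

    classifications: dict mapping country -> protocol-specific label.
    Species assessed in fewer than min_countries countries return None,
    mirroring the inclusion rule of three or more countries.
    """
    if len(classifications) < min_countries:
        return None
    levels = {harmonise(label) for label in classifications.values()}
    return "equal" if len(levels) == 1 else "dissimilar"


# Made-up example species, for illustration only.
species_a = {"Country 1": "Black list", "Country 2": "High risk", "Country 3": "Alert list"}
species_b = {"Country 1": "Watch list", "Country 2": "High risk", "Country 3": "White list"}
print(compare(species_a))  # equal
print(compare(species_b))  # dissimilar
```

Treating the alert list as level 1 follows the grouping given in the text; any other grouping would change which species count as dissimilar.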
Secondly, risk classifications resulting from the same risk assessment protocol (i.e. FISK) were compared pair-wise between three countries for a group of non-native fish species, and the underlying scores were correlated. This approach eliminates differences in risk classifications due to the application of different protocols. FISK originates from the UK and was applied in the UK, Flanders (Verreycken et al. 2009a, b) and Belarus (Mastitsky et al. 2010) to identify the (potential) risk of non-native fish species. For the UK, minimum, maximum, and mean scores from two assessors were available for each species and we used the mean scores in our study. The scores from Verreycken et al. (2009a, b) are averages of Verreycken et al. (unpublished data) and Vandenbergh (2007). Mastitsky et al. (2010) report single scores only. The scores were converted to risk classifications using thresholds recently calibrated by Copp et al. (2009). Comparisons between two countries were made using risk classifications for mutually assessed species. Correlations were calculated using species that were assessed by all three studies (n = 10).
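As a rough sketch of this workflow, the Python fragment below averages assessor scores, converts a score to a risk class and selects mutually assessed species. The function names, species keys, scores and threshold values are placeholders chosen for illustration; in particular, the thresholds are passed in as arguments and are not the calibrated values of Copp et al. (2009), which should be taken from that source.

```python
def mean_score(scores):
    """Average the scores given by one or more assessors for a species."""
    return sum(scores) / len(scores)


def classify(score, medium_from, high_from):
    """Convert a score to a risk class using two cut-off thresholds.

    medium_from and high_from are the lower bounds of the medium and
    high classes; they must be supplied by the user.
    """
    if score >= high_from:
        return "high"
    if score >= medium_from:
        return "medium"
    return "low"


def mutual_species(*country_scores):
    """Return the species assessed in every supplied country (sorted)."""
    common = set(country_scores[0])
    for scores in country_scores[1:]:
        common &= set(scores)
    return sorted(common)


# Made-up scores, for illustration only.
country_x = {"species A": mean_score([20, 24]), "species B": mean_score([3, 5])}
country_y = {"species A": 15.0, "species C": 8.0}

# Placeholder thresholds, NOT the calibrated FISK values.
MEDIUM_FROM, HIGH_FROM = 5.0, 18.0

for name in mutual_species(country_x, country_y):
    print(name,
          classify(country_x[name], MEDIUM_FROM, HIGH_FROM),
          classify(country_y[name], MEDIUM_FROM, HIGH_FROM))
```

Restricting each pair-wise comparison to the intersection of species keeps the comparison balanced, at the cost of a smaller sample.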

Comparison of national risk classifications
National risk classifications were equal for seven out of 25 species (28%) (Table 1). For the remaining species, the risk classification of at least one country differed from that of other countries. Risk classifications from different countries were more similar for plants than for animal species. Four out of eight plant species were classified equally, although more animal than plant species were assessed. For the eastern mudminnow Umbra pygmaea, the noble crayfish Astacus astacus, and the Turkish crayfish Astacus leptodactylus risk classifications were most different, including both low and high risk classifications. For Ireland, all but one of the assessed fish species were classified as medium risk. The risk classifications for the remaining countries show more variability and generally give a higher risk indication.

Crossing borders
FISK has recently been applied to 70 non-native fish species in the United Kingdom. Verreycken et al. (2009a, b) used this tool to assess the potential invasiveness of the present and expected non-native fishes in Flanders (Belgium). FISK was also applied by Mastitsky et al. (2010) to assess the invasion potential of introduced fishes in Belarus. Only one out of 12 species assessed in Flanders and Belarus differed in risk classification, whereas 9 out of 19 and 8 out of 16 differed for pair-wise comparisons of Flanders-UK and Belarus-UK, respectively (Figure 1A-C). Furthermore, all mean UK scores were consistently higher than the Belgian ones, except those of Ameiurus nebulosus and Pimephales promelas (Figure 1C; Verreycken et al. 2009a, b). A higher correlation was found between the scores of non-native fish species (n = 10) assessed in both Flanders and Belarus (r² = 0.79; P < 0.01) than between those of species assessed in Flanders and UK (r² = 0.41; P < 0.05) or in Belarus and UK (r² = 0.41; P < 0.05).
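For readers who wish to reproduce this kind of pair-wise comparison, a squared correlation coefficient can be computed as sketched below. The text does not state which correlation coefficient was used; the sketch assumes Pearson's product-moment coefficient, and the function names and score vectors are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (n >= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared(x, y):
    """Squared correlation, i.e. the share of variance the two score sets have in common."""
    return pearson_r(x, y) ** 2


# Made-up scores of four species in two countries, for illustration only.
scores_country_x = [1, 2, 3, 4]
scores_country_y = [1, 3, 2, 4]
print(round(r_squared(scores_country_x, scores_country_y), 2))  # 0.64
```

With only ten species per comparison, a single outlying score can shift r² considerably, so such values are best read alongside the number of species whose classification actually differs.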

Discussion
When interpreting the outcome of this study, it is important to realize that our results are derived from a limited number of risk protocols. Firstly, development of risk protocols is an iterative process. Therefore, newly developed protocols are often based on existing risk assessment procedures. In some cases, similar questions or criteria are used, for example in the GB Risk Assessment scheme and the more recent Ireland Risk Assessment. Secondly, the GB risk assessment scheme is a more elaborate protocol than the others and it is not only a risk identification tool but a complete risk analysis. This protocol requires a detailed assessment of 51 questions and therefore needs more data input. In the preliminary assessment, pre-screening tools such as FISK can be used. We included both FISK and the GB scheme in our comparison because our aim was to evaluate available risk classifications of non-native aquatic species from both different countries and different protocols. Moreover, exclusion of the GB scheme would reduce the number of species for comparison (from 25 to 18) but would have produced the same result (72% dissimilar classifications). Nevertheless, this second remark has to be taken into account when comparing the results of the UK with risk classifications from other countries. Thirdly, FISK and its derivatives have been specifically designed to assess invasiveness attributes of freshwater fish, invertebrates etc. The Swiss classification key only focuses on plants, while the remaining protocols include more general criteria which can be applied to all species. Fourthly, because of the novelty of risk assessment of non-native species in Europe, protocols are constantly evaluated and revised. This means that comparisons as conducted in this study must be regularly updated. To our knowledge, the Ireland Risk Assessment and the GB scheme referred to in this study are currently being revised. Moreover, this novelty has also triggered the development of alternative risk assessment procedures in Europe, such as ENSARS, a specific risk assessment for species involved in aquaculture (Copp et al. 2008).
Risk classifications for aquatic species show dissimilarities for 18 of the 25 species included in this study when compared between countries (Table 1). Owing to the large number of variables included in the comparison we cannot attribute these dissimilarities to a single determining factor. Differences in classifications may be related to the different (number of) criteria in risk protocols as well as variability in national context (i.e. invasibility of ecosystems) and in use of literature by experts (i.e. expert judgment). While invasiveness of species elsewhere appeared to be consistently correlated with invasiveness (Wittenberg and Cock 2001; Figure 1), our study also shows that risk classifications from other (neighbouring) countries should always be applied with caution. For example, the fish species Umbra pygmaea is classified both as a non-invasive species and as a high risk invader within different parts of Europe (Table 1 and Figure 1).
The comparison in this study was limited by the number of completed risk assessments for each country. Taxonomic differences are accounted for by including aquatic plant, vertebrate, and invertebrate species. However, the inclusion of non-aquatic species may alter the results. Differences between species groups have been reported for the WRA, where risk indications for aquatic plants were more precautionary than for non-aquatic plants because the risk assessment included questions which are not relevant for aquatic plants (Champion and Clayton 2000; Gordon and Gantz 2011). This would speak in favour of species group-specific risk assessment components (such as FISK, FI-ISK etc.), while generic risk protocols, as applied by some European countries (e.g. Belgium and Ireland), may not have the same accuracy for all species groups. We found that risk assessments for plant species were more consistent than for animal species.
Criteria in risk identification also relate to availability of habitat, climate matching, invasion stage, pathways and other region-specific matters. Essl et al. (2011) also recognized the value of regional risk assessment. In a comparison of assessments of freshwater fish in a German and Austrian context (using the same protocol: GABLIS), 10% of the fish species were classified differently for the two countries. According to the authors, these dissimilarities largely reflected differences in current distributions in the two neighbouring countries.
FISK classifications showed a higher correlation for scores of non-native fish species in Flanders and Belarus than for the pair-wise comparisons of Belarus-UK or Flanders-UK. This may be related to (1) the number and expertise of assessors, and (2) the variability in the biogeographical and ecological setting of continental water systems versus inland waters on islands. Firstly, the comparison of Belarus and UK scores shows large differences for six species (i.e. Coregonus lavaretus maraenoides, Ameiurus nebulosus, Ctenopharyngodon idella, Neogobius gymnotrachelus, Mylopharyngodon piceus and Hypophthalmichthys molitrix). For these species, the Belarus scores were much lower, ruling out a high risk classification. According to Mastitsky et al. (2010), this may be explained by the use of dual independent assessments for each species in the UK, while in the Belarus study species were assessed by only one assessor. However, multiple experts may also affect variability, for example when experts judge reliability of data differently based on their experience or when they have different perceptions of risks (Maguire 2004). Qualitative risk assessments of non-native species inherently include normative aspects in the valuation of ecological effects. For example, Strubbe et al. (2011) recently showed that evidence of impacts of invasive birds is generally not based on scientific research but on anecdotal observations relating to small areas only. Secondly, when comparing our results with previous literature we have to make a distinction between the use of risk classifications from other regions and the use of a protocol (in this case FISK) in different regions. Gordon et al. (2008) evaluated the use of the WRA (of which FISK is an adaptation) in six countries and found the number of correct rejections of invader species to be consistent across geographical applications. However, this only refers to the accuracy of the WRA in a certain region, as the species assessments were compared to a priori classifications for the same region. It does not compare risk classifications from different regions for the same species, as is the case in our study.
When a semi-quantitative approach is used (i.e. scoring species for each criterion), the normative cut-off thresholds determine whether a species poses a low, medium or high risk (or is assigned to a certain list). This means that small changes in the assessment (e.g. slightly different judgements of available data) or in the cut-off thresholds can lead to different risk outcomes. Re-calibration of cut-off thresholds between regions is recommended, but this remains to be examined statistically and it would require justification. For instance, the calibration of FISK relied upon independent, international expertise for the a priori classifications of the species examined. The effects of normative cut-off thresholds on risk classification are particularly relevant when risk assessment protocols have a relatively small number of criteria (i.e. ISEIA and Ireland Risk Assessment). Screening tools that are based on a larger number of scores (i.e. ask more questions) are more likely to produce lower variability (in the total score rankings) than those based on a few scores. However, more research on this topic is required, as the number of species assessed by risk identification protocols is low and prohibits general conclusions on this matter. Parrot et al. (2009) recently applied the ISEIA protocol as a screening tool to identify potentially invasive non-native animal species in England. In their study, the UK scores from the ISEIA protocol were compared with the FISK scores from Copp et al. (2009) and the Freshwater Invertebrate Invasiveness Scoring Kit (FI-ISK) scores from Tricarico et al. (2010). Of the FISK scores for twelve fish species, eight are within the high risk category. Using the adapted ISEIA scheme, all but four species are classified as low risk. Parrot et al. (2009) explain the underestimation of risk using the ISEIA scheme by stating that the number of questions (i.e. the sample size of interrogation about the species) in the ISEIA protocol is insufficient. However, the FI-ISK and ISEIA assessments are in general agreement. Only one of five species was classified lower by ISEIA than FI-ISK (Parrot et al. 2009). In our study, three out of six species assessed are classified lower by ISEIA than FISK (i.e. Ameiurus nebulosus, Lepomis gibbosus and Umbra pygmaea).
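The sensitivity of semi-quantitative classifications to small changes in scores can be made explicit with a simple check, sketched below. The function names, the perturbation size and the threshold values are hypothetical and serve only to illustrate the point; they are not taken from any of the protocols discussed.

```python
def classify(score, medium_from, high_from):
    """Convert a total score to a risk class using two cut-off thresholds."""
    if score >= high_from:
        return "high"
    if score >= medium_from:
        return "medium"
    return "low"


def classes_under_perturbation(score, medium_from, high_from, delta):
    """Risk classes obtained when the score shifts by -delta, 0 or +delta.

    delta stands for a small difference in assessor judgement.
    """
    return {classify(score + shift, medium_from, high_from)
            for shift in (-delta, 0, delta)}


def is_sensitive(score, medium_from, high_from, delta):
    """True if a shift of at most delta changes the risk class."""
    return len(classes_under_perturbation(score, medium_from, high_from, delta)) > 1


# Placeholder thresholds and scores, for illustration only.
MEDIUM_FROM, HIGH_FROM = 1.0, 19.0
print(is_sensitive(18.5, MEDIUM_FROM, HIGH_FROM, delta=1.0))  # True: near the cut-off
print(is_sensitive(10.0, MEDIUM_FROM, HIGH_FROM, delta=1.0))  # False: mid-range score
```

Reporting which species lie within such a margin of a cut-off would be one way to make the normative role of the thresholds transparent.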
Another factor influencing risk assessment is data availability. The absence or scarcity of (literature) data on the invasion and effects of a species requires consultation of experts. One of the species classified as low risk in Belgium, Germany and Austria and as high risk in the UK is Umbra pygmaea. It has been suggested that the paucity of (peer-reviewed) publications on the introduced range and the ecological impact of Umbra pygmaea may explain the differences in outcome between assessors (UK versus Belgium) and between assessment tools, as the results are probably mainly based on expert judgment.
Another important matter related to data availability is the inconsistency in terminology on the species' status and classification and in information supply on species richness, intertaxon correlations and the significance of individual drivers of invasion for European databases on invasive species (i.e. DAISIE and NOBANIS; Hulme et al. 2011). Both studies and our findings on dissimilarity in risk classifications across countries emphasize the need for transparency in risk assessments, related to data sources as well as limitations of data.
The diversity in scoring and classification systems used in risk identification of non-native species in Europe hampers collaboration and the use of available risk assessments across borders. Considering the spread and impacts of invasive species across borders, European standardization of risk assessment protocols is highly recommended.

Conclusions
Based on the limited comparisons made in this study, risk classifications of pre-screening tools used in Europe resulted in different outcomes for the majority of the tested species (72%). This may result from differences in scoring, classification, and weighting between the protocols. Application of the same protocol in different countries also resulted in differences in risk classifications of some fish species, indicating that variations in assessment outcomes may stem from other reasons. Important factors affecting the risk classifications are related to regional aspects, such as current distributions, habitat availability, and environmental matching. In addition, lack of data, expert judgement, and the number of assessors may play a role.
Our results suggest that risk classifications from one region cannot be applied to other regions without a caveat. In spite of a significant correlation between pair-wise comparisons of risk classifications of non-native fish species in various countries, our results suggest that it would be advisable for risk assessments to be performed within a national or even regional context. Research on key factors for invasiveness of species and invasibility of aquatic ecosystems in various biogeographical regions will be required to bridge knowledge gaps in risk assessments and to reduce uncertainties in risk classifications of non-native species. Current evaluations of risk assessment also indicate that the influence of uncertainties and lack of data on expert judgement should be explicitly acknowledged. Finally, European standardisation of risk assessment protocols will contribute to more comparable and transparent risk assessments of non-native species.