Options for reducing uncertainty in impact classification for alien species

Abstract. Impact assessment is an important and cost-effective tool for assisting in the identification and prioritization of invasive alien species. With the number of alien and invasive alien species expected to increase, reliance on impact assessment tools for the identification of species that pose the greatest threats will continue to grow. Given the importance of such assessments for management and resource allocation, it is critical to understand the uncertainty involved and what effect this may have on the outcome. Using an uncertainty typology and insects as a model taxon, we identified and classified the causes and types of uncertainty when performing impact assessments on alien species. We assessed 100 alien insect species across two rounds of assessments with each species independently assessed by two assessors. Agreement between assessors was relatively low for all three impact classification components (mechanism, severity, and confidence) after the first round of assessments. For the second round, we revised guidelines and gave assessors access to each other's assessments, which improved agreement by between 20% and 30% for impact mechanism, severity, and confidence. Of the 12 potential reasons for assessment discrepancies identified a priori, 11 were found to occur. The most frequent causes (and types) of uncertainty (i.e., differences between assessment outcomes for the same species) were as follows: incomplete information searches (systematic error), unclear mechanism and/or extent of impact (subjective judgment due to a lack of knowledge), and limitations of the assessment framework (context dependence). In response to these findings, we identify actions that may reduce uncertainty in the impact assessment process, particularly for assessing speciose taxa with diverse life histories such as insects.
Evidence of environmental impact was available for most insect species, and (of the non-random original subset of species assessed) 14 of those with evidence were identified as high impact species (with either major or massive impact). Although uncertainty in risk assessment, including impact assessments, can never be eliminated, identifying and communicating its causes and types is a first step toward its reduction and a more reliable assessment outcome, regardless of the taxa being assessed.

INTRODUCTION
With a changing climate and growing international trade, the range expansion of alien and invasive alien species (i.e., invasive alien species being those that have a negative impact within their recipient environment) is predicted to continue (Seebens et al. 2017). Ongoing arrival and establishment of alien species necessitates a triage approach to their management, where the most damaging species, or those most likely to cause damage, are allocated high priority for surveillance and control (McGeoch et al. 2016). To this end, research on the impacts of alien species has been growing steadily (Crystal-Ornelas and Lockwood 2020), accompanied by the development of various assessment frameworks for classifying their impacts to assist in risk assessments (Roy et al. 2018, González-Moreno et al. 2019, Vilà et al. 2019). Risk assessments are generally performed using multiple data types that vary in reliability, and therefore, acknowledging, accounting for, and communicating associated uncertainty is an essential component of the process (Harwood and Stokes 2003). For example, assessments by the Intergovernmental Panel on Climate Change (IPCC), which use multiple lines of evidence, are conducted using a formal and agreed upon treatment of uncertainty. This method assesses the type, amount, quality, and consistency of evidence, along with the level of agreement, to assign confidence in the form of a likelihood scale (Mastrandrea et al. 2011). Impact assessments of alien species, which contribute to full risk assessments, likewise should be accompanied by estimates of uncertainty (Roy et al. 2018), although this does not always occur (Caton et al. 2018).
Semi-quantitative decision protocols are now a widely used and accepted approach for assigning relative impact severities to alien species (González-Moreno et al. 2019, Vilà et al. 2019), producing evidence-based data that can then be used in risk assessment and prioritization. Semi-quantitative protocols are a valuable and necessary tool in contexts, such as invasion biology, where decisions must consider multiple lines of evidence, be made transparently, and where available evidence is incomplete (Gregory et al. 2012). One such tool, Environmental Impact Classification for Alien Taxa (EICAT), is a formal protocol for using peer-reviewed literature to assess any environmental impacts (i.e., negative effects on the native environment) that an alien species has inflicted, and through which an impact severity and mechanism are assigned (Hawkins et al. 2015). However, few studies have yet explored the potential types and causes of uncertainty associated with semi-quantitative tool outcomes for alien species impacts, and how these could affect the final classification outcome (but see Kumschick et al. 2017a, who compared results from two different applications of EICAT for alien amphibians). Better understanding of the causes of uncertainty in impact classification is an essential step in testing these protocols and identifying general solutions to strengthen their reliability and value to risk analysis (Milner-Gulland and Shea 2017, Rueda-Cediel et al. 2018, Latombe et al. 2019). While uncertainty cannot be removed entirely from semi-quantitative methods such as EICAT, identifying which causes of uncertainty are reducible (and acknowledging those that are practically irreducible; Regan et al. 2002) is an essential step in improving the reliability and value of the information they generate.
Given the inherent nature of uncertainty in ecology more broadly, and the need to account for it, frameworks for identifying and classifying various types of uncertainty have been found to be useful (Milner-Gulland and Shea 2017). For example, the uncertainty typology developed by Regan et al. (2002) has been used to assess uncertainty in conservation decision making in the face of climate change (Kujala et al. 2013) and to assess uncertainty in the alien species listing process and what effect it may have on estimating the identity and number of such species in countries (McGeoch et al. 2012). In both cases, application of the typology helped to identify tactics for improving the transparency and repeatability of environmental decisions. Regan et al. (2002) identify two broad types of uncertainty, with multiple categories under each, that is, (1) epistemic uncertainty, or that brought about by uncertainty in determinate facts, and (2) linguistic uncertainty, or that caused by the inherent variation in our use of language (Regan et al. 2002, Burgman 2005, Latombe et al. 2019). Linguistic uncertainty has been a significant hindrance to progress in invasion biology to date (Roy et al. 2018, Vilà et al. 2019). The uncertainty framework of Regan et al. (2002) provides a widely relevant framework for evaluating uncertainty (McGeoch et al. 2012, Kujala et al. 2013).
Here, we use this framework to identify and classify the types of uncertainty and their causes that may occur when assessing the environmental impacts of alien species. That is, we provide a practical, rather than theoretical, approach to addressing uncertainty in impact assessments. Using insects as a case study, we apply the published protocol, EICAT, for classifying the impacts of alien species (Blackburn et al. 2014, Hawkins et al. 2015). Insects were selected as a model taxon for multiple reasons: (1) alien invasive insect species pose significant risks to the environment and economy and require impact risk assessments to prioritize management (Bradshaw et al. 2016, Lovett et al. 2016, Suckling et al. 2019); (2) they are one of the most species-rich, abundant, functionally diverse taxonomic groups (Stork et al. 2015, Noriega et al. 2018); and (3) they are a group to which EICAT has not previously been applied. We use differences found between independent assessments of 100 insect species to identify, interpret, and classify causes and types of uncertainty. Incorporating the lessons learnt from performing impact classification on alien insects, we conclude with recommendations for how best to communicate and mitigate general types of uncertainty associated with semi-quantitative methods for classifying the severity of alien species impacts.

METHODS
Several tools and methods exist in support of risk assessment to ensure that the process and its outcomes are as transparent and as evidence based as possible. One of these tools is the Environmental Impact Classification for Alien Taxa (EICAT). EICAT and other impact assessment methods for invasive alien species (IAS) are not risk assessments themselves, but rather tools used in support of evidence-based risk assessments that encompass a range of other socio-economic and context-specific information (Hawkins et al. 2015, González-Moreno et al. 2019). Our intention here is to evaluate the types of uncertainty associated with classifying species based on evidence of their impact. This exercise is therefore intended as a contribution to improve the robustness of impact assessment tools. We do so by (1) selecting 100 insect species to assess, (2) independently applying the EICAT protocol to each species twice, (3) describing the differences found between assessments for each species and the reasons for these, (4) using the uncertainty framework of Regan et al. (2002) to assign each difference to an uncertainty type and analyze their relative frequency, and (5) discussing ways in which each uncertainty type could be reduced by refining structured decision protocols for alien species impacts.

Species pool and those assessed
The subset of alien insect species assessed was deliberately selected to encompass as many species as possible with adequate peer-reviewed evidence of impact, given the demonstrated paucity of impact evidence for alien species (Crystal-Ornelas and Lockwood 2020). The initial set (~2800 species) consisted of all insect species present in the Global Register for Introduced and Invasive Species (GRIIS), which provides verified species-by-country occurrence records outside their native range and an attribute for local evidence of impact (Pagad et al. 2018; as at June 2017). In addition, alien insects known to impact the environment were extracted from relevant reviews (Kenis et al. 2009, Vaes-Petignat and Nentwig 2014, McGeoch et al. 2015, Cameron et al. 2016, Bertelsmeier et al. 2017, Evans et al. 2017). Species were then ranked by the number of times they were referred to across all sources, that is, by how many sources designated them as invasive or as having a negative environmental impact. The top 100 species were selected for assessment in this way prior to the literature search for each species. Given the assumed positive relationship between pest severity and available literature on the species, we consider this suite of 100 species to incorporate a best-case scenario for evidence available on which to implement EICAT and interpret the results accordingly.
www.esajournals.org
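The ranking step described above can be illustrated with a minimal sketch. The source names, species labels, and the alphabetical tie-breaking rule are hypothetical placeholders introduced here for illustration; the text does not state how ties were resolved.

```python
from collections import Counter

# Hypothetical source lists; labels are illustrative placeholders only.
sources = {
    "register": ["Species A", "Species B", "Species C"],
    "review_1": ["Species A", "Species C"],
    "review_2": ["Species A", "Species B", "Species D"],
}

def rank_by_mentions(sources, top_n):
    """Rank species by the number of distinct sources that list them."""
    counts = Counter()
    for species_list in sources.values():
        # Use a set so that each source counts at most once per species.
        counts.update(set(species_list))
    # Descending by count; alphabetical tie-break is an assumption.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

top = rank_by_mentions(sources, top_n=2)
print(top)
```

In the study, the same idea would be applied with `top_n=100` to the combined register and review lists.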

Structured decision protocol
EICAT is a protocol with published guidelines for its implementation to ensure that it is applied in a consistent, transparent, and comparable way across taxa and between assessors (Blackburn et al. 2014, Hawkins et al. 2015). EICAT has now been adopted by the IUCN as a standard for the purpose of assigning severity of impact categories and impact mechanisms to IAS, with strong parallels to the method used for the IUCN Red List of Threatened Species. The EICAT severity of negative impacts is classified as minimal concern (MC), minor (MN), moderate (MO), major (MR), or massive (MV; Hawkins et al. 2015). The final severity attributed to an alien species at the end of the assessment process is its maximum realized impact anywhere within its introduced range; that is, EICAT is only concerned with the most severe realized (as opposed to potential) impact for a species. Two other categories are possible, data deficient (DD) for cases where there is insufficient evidence to perform an assessment, or not alien (NA) for when a given species does not in fact have an alien distribution. Mechanisms of impact are ascribed for each assessment where impact evidence was found. For example, a given species may have a negative impact by preying upon native species and would, therefore, be ascribed the mechanism of predation for that specific piece of evidence. These mechanisms, of which there were 13 (Blackburn et al. 2014, Hawkins et al. 2015), expand on those identified by Kumschick et al. (2012) to align with the Global Invasive Species Database (Blackburn et al. 2014, Hawkins et al. 2015). EICAT assessments have, to date, been published for alien birds (Evans et al. 2016), amphibians (Kumschick et al. 2017a), selected gastropods (Kesner and Kumschick 2018), mammals (Hagen and Kumschick 2018), and bamboo species (Canavan et al. 2019). Here, we used the EICAT protocol as published (Hawkins et al. 2015), although these guidelines have subsequently been revised by the IUCN via an online consultation process and the involvement of one of the co-authors (SK; IUCN 2020a). Our interest here was to identify forms of uncertainty in the impact assessment process, rather than to evaluate EICAT per se, which is similar to a range of other semi-quantitative impact assessment protocols (Kumschick et al. 2017b). We adopted previously suggested modifications to the mechanisms used to assess the environmental impacts of alien insects. Specifically, (1) the mechanism grazing/herbivory/browsing was altered to herbivory as grazing and browsing are not relevant for insects, and (2) the addition of a relevant mechanism, facilitation of native species (whereby an invasive species negatively affects one native species by positively affecting another native species) was considered important for capturing this frequently encountered form of indirect impact of invasive insects on biodiversity (McGeoch et al. 2015).
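The "maximum realized impact" rule can be sketched as follows. This is an illustrative reading of the description above, not the published EICAT decision procedure: the `classify` function, its record format, and the numeric ordering values are assumptions made for the example.

```python
from enum import IntEnum

class Severity(IntEnum):
    """EICAT severity categories in increasing order of impact."""
    MC = 1  # minimal concern
    MN = 2  # minor
    MO = 3  # moderate
    MR = 4  # major
    MV = 5  # massive

def classify(evidence, is_alien=True):
    """Return (category, mechanisms) from (mechanism, Severity) records.

    Hypothetical helper: each record is one piece of impact evidence.
    """
    if not is_alien:
        return ("NA", [])   # no alien distribution
    if not evidence:
        return ("DD", [])   # insufficient evidence to assess
    top = max(severity for _, severity in evidence)
    # Every mechanism recorded at the maximum severity is retained.
    mechanisms = sorted({m for m, severity in evidence if severity == top})
    return (top.name, mechanisms)

result = classify([
    ("predation", Severity.MO),
    ("competition", Severity.MR),
    ("herbivory", Severity.MR),
])
print(result)
```

Modelling the categories as an ordered enumeration makes the maximum well defined, while DD and NA are kept outside the ordinal scale because they are not severities.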

Implementation of EICAT
The general steps involved following the preassessment and assessment steps of the EICAT protocol as a research exercise, using the detailed guidelines, decision charts, and definitions published in Hawkins et al. (2015). We discussed and conducted the research (including the assessments unless otherwise specified below) and the first and senior authors analyzed and reported the results back to the authorship group for discussion and interpretation.
To measure the level of congruence between assessment outcomes and to identify reasons for the differences that occurred, each of the 100 alien insect species was assessed independently, using peer-reviewed literature, for environmental impacts by two people. The EICAT guidelines specify that assessments may be conducted by individuals or groups, in person or via email (Hawkins et al. 2015). Our rationale for individually conducting the assessments was that by doing so we were able to use differences found between independent assessments to identify where and why uncertainty arises in the impact assessment process. Comparing individual assessments rather than group assessments also allowed us to complete assessments for many more species than would otherwise have been possible. Here, the rationale was that including many species, representing a broad range of life histories and available information, would likely yield a broader array of potential reasons for differences found between assessments of the same species. By controlling the assessments and their revisions using the process outlined below, we were able to exclude as far as possible potential influence of one assessor over the judgment of another, as well as maximize the comparability and uniformity of the process across each of the species assessed. This suited the purpose of this study, which was to identify types and sources of uncertainty.
The entire process can be generalized by the following steps. First, assessors discussed the protocol and performed a series of pilot assessments as a group to familiarize everyone with the EICAT process and protocol guidelines (Hawkins et al. 2015). Second, impact assessments were performed independently by two assessors for each species. Literature searches were performed first by using ISI Web of Science (webofknowledge.com) using the species name (scientific including synonyms) as the search terms, after which assessors used supplementary searches and sources as needed. Each species assessment resulted in three main outputs with supporting literature evidence: (1) the mechanisms of impact (i.e., how the alien species is having a negative effect, e.g., via competition), (2) the size of the impact (the EICAT magnitude of impact severity category), and (3) the confidence in the evidence supporting the chosen mechanism and impact severity (Hawkins et al. 2015). Third, based on the outcome of the first round where differences in outcomes occurred between assessors, the guidelines were clarified where points of misunderstanding or differences in understanding arose. Our intention here was to apply EICAT critically, adhering to the guidelines as far as possible and elaborating on or departing from these only where it was necessary to clarify, elaborate, or modify details for the purpose of reducing uncertainty. The purpose of revisions to the guidelines was to clarify particular protocol points that were identified as potentially resulting in differences between assessments, particularly to reduce the more common and readily addressed misunderstandings that arose in the first-round assessments (such as the need to first confirm evidence of the existence of alien populations of each species). Fourth, the clarified guidelines, along with the initial results from both assessors for each species, were then used by the assessors in the second assessment round. 
Differences can occur for one or more outcomes of each assessment, that is, differences in allocated impact mechanism, severity, confidence level, or any combination of the three. Fifth, assessors could then either refine their initial assessment considering this new information or leave their assessment as originally provided (Burgman et al. 2011). Either decision required justification. The sixth and final step required assessors to provide possible reasons for why, in their view, initial assessments resulted in differences.

Application of the uncertainty framework
We used the information generated from the above process to understand and classify the assessment uncertainty, by treating the differences in assessment outcomes across assessors as types of uncertainty and classifying them using the uncertainty typology of Regan et al. (2002). Using the detailed definitions of types of uncertainty provided by Regan et al. (2002), causes of uncertainty were first broadly classified as either epistemic or linguistic in origin. Epistemic uncertainty is that associated with the knowledge of a system's state and consists of six main types (measurement error, systematic error, natural variation, inherent randomness, model uncertainty, subjective judgment; Regan et al. 2002). There are five main types of linguistic uncertainty associated with the omnipresent variation in our use of language (vagueness, context dependence, ambiguity, theoretical indeterminacy, under-specificity; Regan et al. 2002). Other studies of uncertainty using this typology have modified it by, for example, including a third broad type of uncertainty (human decision uncertainty), which contains causes of uncertainty such as subjective judgment (Kujala et al. 2013). We chose to follow McGeoch et al. (2012), who used the original typology with minor adjustments for application to listing of invasive alien species, distinguishing between subjective judgment per se and subjective judgment due to a lack of knowledge, and between systematic error per se and systematic error due to a lack of knowledge. The rationale for this is akin to the dual pathway of forming a subjective probability (Burgman 2005), where people's subjectivity stems from either a lack of, or despite, available knowledge. Instances of both epistemic and linguistic uncertainty were in some cases simultaneously possible as the reason for differences between assessment outcomes (McGeoch et al. 2012).
Therefore, for each species where there was a difference between assessors regarding mechanism and/or severity of impact attribution, the reasons and their associated uncertainty type were identified using the definitions in Regan et al. (2002) and McGeoch et al. (2012). We did not assess the uncertainty associated with differences in the chosen confidence level due to its strong dependence on the chosen mechanism and severity. We did, however, assess the correlation between the amount of disagreement in impact severity and the mean level of confidence for a given species (Appendix S1: Fig. S1). There was no correlation between the two variables in the first or second round (Appendix S1: Fig. S1). An assignment of low confidence does not necessarily lead to disagreement between assessors and therefore was not used as a proxy for assessor disagreement.
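One way to organize the typology and tally uncertainty types across assessment differences is sketched below. The type names follow the lists given above; the record format, the `tally` helper, and the example records are hypothetical and are not data from the study.

```python
from collections import Counter

# Types as listed in the text (Regan et al. 2002), with the two
# lack-of-knowledge variants adopted from McGeoch et al. (2012).
EPISTEMIC = {
    "measurement error", "systematic error", "natural variation",
    "inherent randomness", "model uncertainty", "subjective judgment",
    "subjective judgment due to a lack of knowledge",
    "systematic error due to a lack of knowledge",
}
LINGUISTIC = {
    "vagueness", "context dependence", "ambiguity",
    "theoretical indeterminacy", "under-specificity",
}

def broad_type(uncertainty_type):
    """Map a specific uncertainty type to its broad class."""
    if uncertainty_type in EPISTEMIC:
        return "epistemic"
    if uncertainty_type in LINGUISTIC:
        return "linguistic"
    raise ValueError(f"unknown uncertainty type: {uncertainty_type}")

def tally(differences):
    """Count specific and broad types over (species, component, types)."""
    by_type, by_broad = Counter(), Counter()
    for _species, _component, types in differences:
        # A single difference may carry more than one uncertainty type.
        for t in types:
            by_type[t] += 1
            by_broad[broad_type(t)] += 1
    return by_type, by_broad

# Made-up records for illustration only.
differences = [
    ("sp1", "severity", ["systematic error"]),
    ("sp2", "mechanism", ["subjective judgment due to a lack of knowledge",
                          "context dependence"]),
    ("sp3", "severity", ["systematic error", "context dependence"]),
]
by_type, by_broad = tally(differences)
print(by_type.most_common(), dict(by_broad))
```

Allowing a list of types per difference reflects the point that epistemic and linguistic uncertainty can both underlie the same discrepancy.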

RESULTS
First-round assessments led to agreement levels of between 32% and 44% for each of the three EICAT assessment components, that is, mechanism, severity of impact, and confidence (Fig. 1). Mechanism of impact had the highest level of agreement, with 44% of species attributed with the same primary mechanism of impact by both independent assessors. The level of agreement, however, varied across mechanisms, as did the frequency with which each mechanism was attributed (Table 1; Appendix S1: Fig. S2). Herbivory, competition, and predation were most frequently identified (Table 1), and agreement on herbivory as the primary mechanism of impact occurred in 38% (n = 14) of instances. The remaining eight mechanisms were less frequently attributed (Table 1). For example, parasitism was attributed to six species, five of which were cases of agreement. Facilitation of native species was also attributed to six species; however, all six were cases that differed between assessors. Multiple (n = 14) instances occurred where the assessors agreed on a mechanism for a species, but one assessor also assigned additional mechanisms. The assignment of multiple mechanisms in EICAT occurs when the same level of impact severity is assigned to more than a single mechanism by an assessor for a given species. Taking a conservative approach here, these cases were considered differences in assessment outcome. After the second round of assessments, agreement on the mechanism of impact increased from 44% to 65% (Fig. 1). Little change between rounds occurred for the allocation of mechanisms of impact for most mechanisms except for herbivory (37-53%), transmission of disease (33-100%), and other (6-40%; Table 1).
Categories of impact severity varied in their levels of agreement (Fig. 2A). Data deficient (DD) was the impact severity classification with highest agreement (see Appendix S1: Table S1 for full details of impact severity assessments). Agreement on impact severity in round 1 occurred for 34% of species; that is, a species was assigned the same severity of impact by both assessors (assessment outcomes were as likely to agree as to differ, χ² = 1.37, df = 1, P = 0.24).
Fig. 1. Agreement in environmental impact assessment outcomes between two independent assessors across two rounds of assessment. Agreement increased across all assessment components in the second round. The assessment of severity of impact improved most and mechanism of impact least across the two assessment rounds.
Note: As more than one mechanism can be attributed to a species, the instance totals within and between assessment rounds do not sum to 100.
Fig. 2. [...] The segment span width of a given impact severity category (MV, massive; MR, major; MO, moderate; MN, minor; MC, minimal concern; DD, data deficient; and NA, not alien) is proportional to the number of species for which that severity was assigned by at least one assessor. The width of the links connecting segments reflects the number of species for which there was difference between assessors involving those specific impact severities. The low density of links in B compared with A reflects the increase in agreement across assessments, revealing also moderate followed by major and data deficient as the most frequently assigned categories.
Assessor agreement on impact severity increased from 34% to 70% across rounds. As in the first round of assessments, variation in levels of agreement among severity of impact categories remained (Appendix S1: Table S1). Similarly, the distribution of agreement across the severity of impact categories differed little between assessment rounds (Appendix S1: Table S1). Assessors were more likely to agree than disagree (χ² = 83.64, df = 1, P < 0.001), and most differences (22/30) between impact severity categories in the second round involved categories adjacent on the severity scale, as opposed to the first round which had much less structure (Fig. 2B, Appendix S1: Fig. S2). For example, in the second round, if there was difference in a Moderate impact, it involved one assessor assigning either a Minor (one category lower) or Major (one category higher) impact (Fig. 2B, Fig. 3).
Exceptions to this occurred for two species (Diaphorina citri, Tetropium fuscum) in which one assessor assessed severity as Minor and the other assessed it as Major, as well as for six species (Orthotomicus erosus, Solenopsis richteri, Chaetosiphon fragaefolii, Polistes chinensis, Rhyzopertha dominica, and Thaumetopoea processionea) in which one assessor assigned an impact severity and the other thought there was insufficient data to do so or that there was no evidence of alien populations in the case of T. processionea (Fig. 3).
Eleven of the possible causes of uncertainty identified a priori (Table 2) occurred across the 100 species assessed during the implementation of the protocol. Most frequently, different use of reference material was part of, if not the entire cause of, difference in a species assessment, closely followed by limitations of the assessment framework and the mechanism or extent of impact being unclear in the literature (Fig. 4). However, the frequency of these causes of uncertainty differed depending on the assessment component (impact mechanism versus severity). For example, although extrapolation of evidence and deviation from assessment protocol occurred similar numbers of times (14 and 12 times, respectively), all but one of the former cases were related to differences in severity, while all instances of the latter were related to differences in mechanisms (Fig. 4). Systematic error was the uncertainty type that occurred most frequently, followed by subjective judgment as a result of lack of knowledge and context dependence (Fig. 4).
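The agreement percentages and the distinction between adjacent and non-adjacent severity differences reported above can be computed along the following lines. This is a sketch under stated assumptions: the function names, the "off-scale" label for differences involving DD or NA, and the example pairs are illustrative and do not reproduce the study data or its statistical tests.

```python
# Ordinal EICAT severity scale; DD and NA are not part of the scale.
SCALE = ["MC", "MN", "MO", "MR", "MV"]

def percent_agreement(pairs):
    """Percentage of species given the same category by both assessors."""
    return 100 * sum(a == b for a, b in pairs) / len(pairs)

def classify_difference(a, b):
    """Label one pair of severity assignments for a single species."""
    if a == b:
        return "agree"
    if a in SCALE and b in SCALE:
        gap = abs(SCALE.index(a) - SCALE.index(b))
        return "adjacent" if gap == 1 else "non-adjacent"
    return "off-scale"  # at least one assessor assigned DD or NA

# Made-up assessor pairs for illustration only.
pairs = [("MO", "MO"), ("MO", "MR"), ("MN", "MR"), ("DD", "MN")]
labels = [classify_difference(a, b) for a, b in pairs]
print(percent_agreement(pairs), labels)
```

Separating off-scale differences from ordinal ones keeps disagreements about data availability distinct from disagreements about the magnitude of impact.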

Fig. 3. [...] Species for which there remained difference after both rounds of assessment are shown with two different colored tiles in the row associated with them. Most instances of difference occurred at the interface between major and moderate, and between minor or minimal concern and data deficient. Category abbreviations are massive, MV; major, MR; moderate, MO; minor, MN; minimal concern, MC; data deficient, DD; not alien, NA. For additional information on species identities, see Appendix S2.
Final assessments (after the completion of both rounds) revealed evidence of environmental impact was inadequate or unavailable for 14 species (Fig. 3, Appendix S2: Table S1), that is, where both assessors agreed on the species being data deficient. Including the additional 10 species where assessors disagreed on data availability equates to approximately 25% of alien insects assessed as data deficient by at least one assessor. These species were distributed across five orders: Coleoptera (n = 9), Diptera (n = 2), Hemiptera (n = 5), Hymenoptera (n = 7), and Lepidoptera (n = 1). Of those species for which environmental impact information was available and assessors agreed on impact severity (n = 56), 25 were Hymenoptera. This order was most frequently represented in each impact severity category, except for Minimal Concern where there were no Hymenoptera. It was also the only order for which there was agreement on a Massive impact, that is, for both the yellow crazy ant (Anoplolepis gracilipes) and the big-headed ant (Pheidole megacephala). Few species were attributed this highest level of impact, and only five species were assigned this category by at least one assessor (Fig. 3). Considering only those species for which there was agreement on severity and excluding those that were data deficient, the number of species across impact severity categories approximates a symmetric distribution with Moderate (n = 26) the most attributed category (Appendix S1: Table S1). Including instances of difference alters the distribution of impact severities, which reflects the uneven frequency with which each category was associated with assessment differences. Agreement on impact mechanisms was most common for herbivory (n = 17) and competition (n = 17; Table 1). Hemiptera species were somewhat of an exception, and here, transmission of disease occurred as often as herbivory.
Although these two mechanisms were most frequent, all orders except one (Dermaptera) were associated with at least four mechanisms of impact (Appendix S1: Fig. S3). This declined slightly when considering only instances of agreement, with Lepidoptera decreasing to two attributed mechanisms.

DISCUSSION
With alien species introduction events expected to continue (Seebens et al. 2017), the prioritization of species of most concern for management authorities is likely to become increasingly important. Information obtained from alien species impact classifications can assist in prioritizing both those species to manage and those in which to invest research. However, uncertainty that arises in the process of conducting impact assessments for alien species means that the results may not be reproducible. As we show here, independent assessments can arrive at different conclusions for the same species, and single assessments may produce biased outcomes. Neither outcome is desirable when decision makers rely on such information to guide action and investment. Here, we showed how examining alternative outcomes for individual species provides insight on the type and extent of uncertainty involved, and potential solutions for reducing it. We found that following the protocol guidelines resulted in relatively high levels of disagreement between independently conducted assessments, with the reasons for these differences associated with 11 different causes and six different types of uncertainty. However, levels of agreement increased substantially following discussion between assessors of reasons for the uncertainty involved. This process enabled us to identify several practicable recommendations for reducing the uncertainty associated with assessing and classifying the impact of alien species.

Alien invasive insects have negative impacts within and across multiple environmental settings (Kenis et al. 2009, McGeoch et al. 2015). This can add uncertainty to decisions on which evidence to include (context dependence, Table 2, Cause 6, Type iv). For example, the Mediterranean pine beetle has been unintentionally introduced to multiple countries, most likely via the timber trade (Brockerhoff et al. 2006, Haack 2006).
It is known for causing damage to conifers, particularly pines, often leading to tree mortality (Sangüesa-Barreda et al. 2015). However, all such evidence of a negative impact inside its alien range takes place within managed forestry plantations (Stephens and Wagner 2007). As such, the two assessors for the Mediterranean pine beetle (Orthotomicus erosus), who differed in opinion as to which evidence to include, came to entirely different conclusions, even after the second round of assessments (data deficient vs. massive). Similar discrepancies occurred in the assessments of other species, such as the citrus longhorned beetle (Anoplophora chinensis) and the Asian long-horned beetle (A. glabripennis). Similar arguments can be made for agricultural crops, depending on the extent to which native biodiversity or native relatives within them are impacted by the alien insect. Such impacts can thus pose a potential risk to native biodiversity more widely. Without formal direction and clarification at the outset of the assessment process, differences based on subjective interpretations of which systems are suitable for inclusion are inevitable, although they could be avoided with such direction. Inclusion of an alien species in impact classification should, we argue, be based on what is being negatively affected more so than where; that is, the context within which the impact is occurring is less relevant in some cases.

Table 2 notes: ... (2012), checked for relevance to impact classification, and assigned to the relevant uncertainty type following Regan et al. (2002). Species examples provided are those encountered during the implementation of impact classification for insects and are discussed further in the text. Uncertainty types are categorized as epistemic (E) or linguistic (L). † Although classified as systematic error, this cause of uncertainty would be more accurately considered a form of theoretical indeterminacy as per Regan et al. (2002).

Figure labels (causes of uncertainty, with cause numbers): Range uncertain (5a); Species identification (4); Range uncertain (5b); Knowledge inaccessible (3); Invasiveness criteria (7); No apparent cause (11); Human error (1); Deviation from protocol (10); Extrapolation of evidence (9); Assessment limits (6); Mechanism/impact unclear (8); Incomplete search. Caption fragment: ... Table 2, and ii_a and iii_a are the sub-types related to lack of knowledge.

A different situation arises for species that may arguably never negatively affect the native environment, and this assumption led to some species being assigned minimal concern despite the absence of evidence (subjective judgment, Table 2, Cause 9, Type iii). Two groups of alien insects, given their life histories, may lead an assessor to assign minimal concern without evidence (i.e., rather than data deficient). The first group are those species that are primarily stored product pests. For example, the bean weevil (Acanthoscelides obtectus) is an economically important pest of legumes (Soares et al. 2015). While damaging, its impact almost entirely occurs on stored products, and it is a species primarily of economic concern with a very low likelihood of having an environmental impact. The second group are monophagous insects that only attack and exist in agricultural crop environments and/or have a long history of no evidence of environmental impact. The broad bean beetle (Bruchus rufimanus) is a widespread crop pest, specifically targeting the faba bean (Vicia faba L.; Clement et al. 2002). Given the long history of evidence for this species only being of concern for specific crop types, one may conclude that it is of minimal environmental concern, even though the lack of evidence would otherwise lead to a data-deficient classification. Such conclusions are an instance of using absence of evidence as evidence of absence, generally undesirable but logically valid under certain circumstances (Sober 2009). The decision to classify a species as minimal concern for environmental impact without evidence, however, should occur outside of the impact assessment process in the form of additional information that may complement the final assessment outcome, as suggested for accommodating expert input.

Uncertainty associated with severity of impact
Agreement on severity of impact increased from 34% to 70% following the second round of assessments, suggesting that this assessment component is quite amenable to a reduction in uncertainty. Incomplete information searches (systematic error, Table 2, Cause 2, Type ii) were a common cause of difference, and the most frequent reason why independent assessors assigned different impact severities. There are multiple possible explanations for why different pieces of evidence were selected for use, not least differing interpretations of what constitutes evidence appropriate for inclusion in the impact assessment (i.e., limitations of assessment framework, Table 2, Cause 6, Type iv).
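The agreement figures reported above are simple proportions of species given the same category by both assessors. A minimal sketch of that calculation follows; the function name, species labels, and severity calls are hypothetical and serve only to illustrate the arithmetic, not to reproduce the data of this study.

```python
# Illustrative sketch: percent agreement between two independent assessors.
# All species labels and severity calls below are made up for illustration.

def percent_agreement(first, second):
    """Return the share (in %) of species given the same category by both assessors."""
    if first.keys() != second.keys():
        raise ValueError("both assessors must cover the same species")
    if not first:
        raise ValueError("no species to compare")
    matches = sum(1 for species in first if first[species] == second[species])
    return 100.0 * matches / len(first)


# Hypothetical severity calls (EICAT abbreviations) for four placeholder species.
assessor_a = {"sp1": "MO", "sp2": "MR", "sp3": "DD", "sp4": "MN"}
assessor_b = {"sp1": "MO", "sp2": "MO", "sp3": "DD", "sp4": "MC"}

agreement = percent_agreement(assessor_a, assessor_b)
# Species on which the two assessors differ, for follow-up discussion.
disputed = sorted(s for s in assessor_a if assessor_a[s] != assessor_b[s])
print(f"agreement: {agreement:.0f}%, disputed: {disputed}")
```

Listing the disputed species alongside the percentage mirrors the approach taken here, where the differences themselves were examined to identify their causes.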
Even with definitions for each severity category (Hawkins et al. 2015), assigning impact severity appears to be a decision open to greater interpretation than assigning impact mechanisms. One explanation may be the naturally unbounded nature of the relationship between successively more severe impact categories: even with definitions for each category, there is inevitably scope for subjective interpretations of degree. For example, an alien species with a minor impact is one that negatively affects the fitness (or performance, see Fig. 5) of a native species (Hawkins et al. 2015). However, a decline in fitness of individuals is likely to lead to a decline in population size, which constitutes the next most severe impact category (moderate). The difference between a negative impact on the fitness of individuals in a population and a population-level effect is a matter of degree (Ricciardi et al. 2013, Blackburn et al. 2014). Thus, if we assume a given impact severity begets the next most severe category, then there are two alternative pathways to arrive at a given severity category, that is, either via evidence of no impact or via no evidence of an impact at the next most severe category (the decision tree in Fig. 5 enables these alternatives to be distinguished). For example, an assessor may conclude that a given species is having a minor impact either because (1) there is evidence that it is not causing population declines, or (2) there is no evidence that it is causing population declines. For example, Hadley and Betts (2012) found that a proposed lack of negative consequences of habitat fragmentation on pollination dynamics was a result of the absence of evidence, and not evidence of an absence of such consequences. Hawkins et al. (2015) state that an impact severity category is assigned if there is no evidence for the next most severe category, implying the absence of evidence case (2 above). However, explicitly distinguishing between these two scenarios is important (Altman and Bland 1995, Alderson 2004), even if the former occurs rarely by comparison, because it informs the strength of evidence available and confidence placed in it. Specifying such reasoning during assessments by making the decision process explicit (as shown in Fig. 5) could reduce uncertainty, as well as make the process more readily repeatable by specifying the basis for the final decision, that is, that there was, or was not, evidence for the next most severe category (Fig. 5).

Uncertainty associated with mechanism of impact
One of the more common causes of difference in assigning a mechanism of impact was a lack of clarification, on the part of the authors of the studies used as evidence, as to how an alien species was negatively affecting the native environment. In such cases, assessors were required to make a subjective interpretation as to what mechanism was most likely, based on the context of the study. Similarly, it was often difficult to differentiate between two mechanisms, for example, competition and predation (Sandvik et al. 2004), where the evidence suggested that either choice was valid. This led to situations where either one assessor chose competition and the other predation, or one assessor provided both competition and predation because, based on the evidence, they were not able to distinguish between them. For example, the evidence of environmental impact for the harlequin ladybeetle (Harmonia axyridis; Roy et al. 2012, Masetti et al. 2018) and the Argentine ant (Linepithema humile; Menke et al. 2018) often implied that the impact mechanism could be competition or predation. In situations such as this where the evidence itself was sometimes unclear on the mechanism of impact, it is difficult to avoid differing subjective interpretations by independent assessors (i.e., subjective judgment as a result of lack of knowledge, Table 2, Cause 8, Type iii_a).

Another common reason for differences between assessors on impact mechanisms was the use of different lines of evidence (systematic error, Table 2, Cause 2, Type ii). Multiple reasons for this are possible. First, for many insects the quantity of literature is large, and a literature search can return thousands of studies for a single species (e.g., some species assessed had search returns in the order of 14,000). This increases significantly the time needed to perform an assessment, and the goal is to be thorough. Nonetheless, simple oversight can result in mistakenly missing, misinterpreting, or including a certain piece of evidence (increasing the chance of human error and incomplete information searches as the cause of uncertainty; Table 2, Cause 1, Type ii). For example, differences in the first round of assessments for the shot-hole borer (Xylosandrus compactus) occurred because of the accidental use of evidence pertaining to a different species (Xyleborus glabratus; Table 2). By contrast, the Mediterranean pine beetle (Orthotomicus erosus) assessments differed as a result of different interpretations of the EICAT guidelines (Table 2). Both examples become causes of uncertainty for reasons other than simply incomplete information searches. This shows how uncertainty can compound during the assessment process, starting with one uncertainty type (in this case measurement error) that leads to another (context dependence). However, this also implies that by addressing some of the more manageable causes of assessor difference, uncertainty propagation could feasibly be reduced (see recommendations below on embedding formal systematic review protocols into environmental impact assessment protocols to facilitate the transparency, efficiency, and repeatability of literature screening and inclusion (Siddaway et al. 2019, Haddaway et al. 2020)).

Fig. 5 (figure content). Decision tree for assigning an impact severity category to species of environmental concern.
NA: No reliable evidence that it has or had individuals existing in a wild state in a region beyond its native geographic range.
DD: Best available evidence indicates that a given taxon has individuals existing in a wild state outside of its native range but there is inadequate information to classify an impact or insufficient time has elapsed since introductions have become apparent.
MC: Unlikely to have caused deleterious impacts on the native biota or abiotic environment.
MN: Causing reductions in the fitness of individuals in the native biota, but not declines in native population sizes.
MO: Causing declines in the population size of native species, but not changes to the structure of communities or abiotic/biotic ecosystem composition.
MR: Causing the local population extinction of at least one native species, and leads to reversible changes in the structure of communities and the abiotic/biotic ecosystem composition.
MV: Causing the replacement and local extinction of native species, and produces irreversible changes in the structure of communities and abiotic/biotic ecosystem composition.
Decision questions, in order: Is the species negatively affecting the fitness* of native species? Is the species negatively affecting the population of native species? Is the species causing local extinction of native species leading to community level effects? Are the community level effects irreversible?

Fig. 5. Decision tree for improving the transparency and repeatability of assigning EICAT impact severity categories. This framework is based on the assumption that the different categories (MC-MV) are hierarchically related; that is, a negative effect at one level necessitates a negative effect at the level below it (dashed arrows) and may mean existence of an impact at a higher impact severity level, although no evidence may yet be available to support designation of this higher level of impact. Under this assumption, there can often be two pathways by which an impact category can be assigned, with the followed path dependent on information availability. For example, in the case of the two pathways for assigning a moderate (MO) impact: if there is information to support the conclusion that there is a negative effect on the population size of one or more native species, but no effect on community structure, then the appropriate category to assign is MO. Alternatively, there may be information on native population-level negative effects but no information (lack of evidence) available on the effects at the community level, in which case the category assigned would also be MO. In general terms, these two alternative pathways represent evidence of no impact (right) and no evidence of impact (left). * In the most recent version of the EICAT implementation guidelines (IUCN 2020a, b), the term fitness has been replaced by performance.
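The hierarchical logic of the decision tree in Fig. 5 can be sketched in code. The version below is a simplified illustration under stated assumptions, not the published tree: the names `Answer`, `LEVELS`, and `classify` are hypothetical, the not-alien (NA) step is omitted, and mapping a complete lack of evidence at the first question to DD rather than MC is our reading of the argument in the text. The point of the sketch is that the function returns the basis for the decision together with the category, so that evidence of no impact and no evidence of impact stay distinguishable.

```python
from enum import Enum


class Answer(Enum):
    """Possible answers to each decision question (hypothetical encoding)."""
    YES = "evidence of impact"
    NO = "evidence of no impact"
    UNKNOWN = "no evidence of impact"


# Categories reached by answering YES to each successive question:
# fitness, population size, local extinction/community effects, irreversibility.
LEVELS = ["MN", "MO", "MR", "MV"]


def classify(answers):
    """Return (category, basis) for four answers ordered from least to most severe."""
    if len(answers) != len(LEVELS):
        raise ValueError("expected one answer per decision question")
    category = None
    for level, answer in zip(LEVELS, answers):
        if answer is Answer.YES:
            category = level  # impact confirmed at this level; check the next one
            continue
        if category is None:
            # No impact established at any level (assumed mapping):
            # evidence of no impact gives MC, absence of evidence gives DD.
            return ("MC" if answer is Answer.NO else "DD"), answer.value
        # Stop at the last confirmed level and record why the next was not assigned.
        return category, f"{answer.value} at {level}"
    return category, "evidence of impact at the highest level"


# Two pathways to the same category (MO), distinguished by their basis.
print(classify([Answer.YES, Answer.YES, Answer.NO, Answer.UNKNOWN]))
print(classify([Answer.YES, Answer.YES, Answer.UNKNOWN, Answer.UNKNOWN]))
```

Both calls return MO, but the recorded basis differs, which is the information that would feed into the confidence assigned to the decision.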
The complexity of assessing environmental impacts extends beyond the broad range of ecological systems that species invade to the range and complexity of mechanisms of impact invasive alien species can have. One of the most frequent mechanism categories was other; that is, the evidence suggested a mechanism not currently listed under the EICAT protocol. For example, the gall wasp (Andricus quercuscalicis) was assessed as having an impact by altering the sex ratio of native species (Schönrogge et al. 2000), a mechanism that does not easily fit within any of the other mechanism categories. Another example is the Asian honeybee (Apis cerana) that robs food stores of native species (Hyatt 2012) and was noted as possibly fitting under multiple mechanism categories. Differences in mechanism choice between assessors involving the use of other often also involved the mechanism "interaction with other alien species" or "facilitation of native species." This may imply that other is often used to denote some form of indirect interaction that is associated with a negative environmental impact. If the outcome of an assessment is the assignment of other for the mechanism of impact, then the assessor should provide a concise description of what they interpret the mechanism to be. Such information would be useful not only for the assessment itself, but also for potentially updating the framework to include a new mechanism of impact. This could subsequently reduce uncertainty associated with impact mechanism decisions.

Consequences of uncertainty in impact classification for IAS
Not all causes of uncertainty have equally significant consequences for the outcome of an assessment. For example, deviation from assessment protocol (measurement error, Table 2, Cause 10, Type i) was in some instances simply a case of one assessor, although assessing a species as data deficient, providing a mechanism of impact rather than assigning none (e.g., oleander scale, Aspidiotus nerii; Table 2). This tended to occur when the assessor was familiar with the species and provided an educated judgment as to the most likely mechanism of impact. For example, Lepidoptera are likely to have an impact via herbivory and carabids via predation (or competition). Discrepancies of this kind are unlikely to have any serious consequences for species prioritization, and by extension resource allocation, and can be readily avoided. Extrapolating evidence to assign an impact severity beyond that provided by the available evidence is, however, more problematic (i.e., subjective judgment, Table 2, Cause 9, Type iii). This was one of the more frequent causes of uncertainty associated with impact severity. Subjective judgment of this form stemmed primarily from the expert knowledge of those performing assessments on species with which they are particularly familiar. For example, impact severity for the hemlock woolly adelgid (Adelges tsugae) differed as a result of this cause of uncertainty (Table 2). Extended familiarity with a given species may reduce objectivity, leading to a more liberal, but not necessarily wrong, interpretation of impact severity. This subjectivity can result in an assessment at odds with that of an assessor who is less familiar with the same species and who therefore bases their assessment on a more objective interpretation of the evidence.
We suggest that this cause of uncertainty could be reduced by explicitly incorporating, within the assessment framework, opportunity for expert opinion to be provided in addition to and alongside the assessment based on published evidence alone (Morgan 2014; see recommendations below).
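One way to picture this suggestion is an assessment record that stores the evidence-based category and the assessor's expert judgment in separate fields, and that refuses the mechanism other unless a description accompanies it. The sketch below is purely illustrative: the class, field names, and example values are hypothetical and are not part of EICAT or of the protocol used in this study.

```python
from dataclasses import dataclass
from typing import Optional

# Severity categories ordered from least to most severe (DD is not ranked).
SEVERITY_ORDER = ["MC", "MN", "MO", "MR", "MV"]


@dataclass
class ImpactAssessment:
    """Hypothetical record keeping published evidence and expert opinion apart."""
    species: str
    evidence_severity: str                 # category supported by published evidence, or "DD"
    mechanism: str
    expert_severity: Optional[str] = None  # assessor's own judgment, recorded separately
    mechanism_description: str = ""

    def __post_init__(self):
        # A mechanism of "other" is only informative if it is described.
        if self.mechanism == "other" and not self.mechanism_description.strip():
            raise ValueError("mechanism 'other' needs a short description")

    def severity_bounds(self):
        """Return the (lowest, highest) ranked category across both sources, or None."""
        ranked = [c for c in (self.evidence_severity, self.expert_severity)
                  if c in SEVERITY_ORDER]
        if not ranked:
            return None
        positions = sorted(SEVERITY_ORDER.index(c) for c in ranked)
        return SEVERITY_ORDER[positions[0]], SEVERITY_ORDER[positions[-1]]


# Placeholder example: evidence supports MO, the expert suspects MR.
record = ImpactAssessment("Example species", "MO", "herbivory", expert_severity="MR")
print(record.severity_bounds())
```

Keeping the two values apart leaves the evidence-based outcome untouched while still reporting the range an expert considers plausible.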
Most types of uncertainty encountered were epistemic in nature, a similar result to other studies on the presence and effect of assessment uncertainty (McGeoch et al. 2012, Kujala et al. 2013). The disproportionate occurrence of epistemic causes of uncertainty, however, does not downplay the importance of linguistic uncertainty (Table 2, Cause 7, Type v). Historically overlooked, this pervasive type of uncertainty can be influential, problematic, and difficult to reduce (Regan et al. 2002, Burgman 2005, Carey and Burgman 2008). This is particularly true in invasion biology where variation in definitions and their use have hindered progress in the field (Verbrugge et al. 2016, Courchamp et al. 2017). Crypticity, an aspect of biological invasions that can lead to underestimation of impacts, is also in part affected by linguistic uncertainty due to the dynamic nature of taxonomy (Jarić et al. 2019). Regan et al. (2002) characterized this as a form of theoretical indeterminacy where a term, or in this case a species name, will not necessarily retain its meaning into the future. Advances in molecular biology may unveil previously unknown species complexes or reveal that specific sub-species are in fact alien to a given region (Jarić et al. 2019).

Recommendations for reducing uncertainty in impact classification for IAS
Certain causes of uncertainty are unavoidable (albeit theoretically reducible, e.g., human error), and their presence and potential impact on assessment outcomes can only be acknowledged (Humair et al. 2014), whereas other causes can potentially be minimized. Based on a structured process for identifying causes of uncertainty, we recommend the following six steps to reduce uncertainty in impact classification for alien species of environmental concern.
1. The dissemination of information that informs risk analysis, including impact classification, is often overlooked (Vanderhoeven et al. 2017). Not only is it important to communicate the conclusions of impact assessments, such as EICAT, but also to ensure the degree of uncertainty is transparently and effectively shared with end-users. There is a need to develop and implement effective risk communication methods that are appropriate to the target audience using the outputs of tools such as EICAT. The IUCN Red List is used as a communication tool in many different contexts, and EICAT could benefit from exploring the most effective approaches employed. Particular attention should be given to transparent communication of uncertainty to increase understanding of risk by user groups.

2. For an impact classification to convey the most accurate representation of the evidence, assessments should be performed as objectively as possible. Although perhaps self-evident, it is reasonable to assume that some assessors may use expert knowledge of a certain species to extrapolate beyond available evidence. To accommodate this valuable information, we suggest that protocols include the opportunity for assessors to provide their expert judgment, in addition to but separate from the objective results of the assessment itself. This could allow for what is analogous to uncertainty bounds around the final assessment outcome (e.g., as portrayed in Fig. 3).

3. Many assessment differences involving the use of other also involved mechanisms indicative of indirect or higher order interactions. As we found here, indirect interactions can involve facilitation of native species, as well as facilitation by native species in the establishment of introduced organisms (Northfield et al. 2018). The use of any attribution of other as mechanism of impact, as was frequent in this study, should always be accompanied by a description of the impact mechanism. This information would not only enrich the impact assessment but would also assist in future developments of the assessment framework to include new mechanisms of impact should similar descriptions be repeated over time.

4. An unambiguous definition for what qualifies as data deficient for environmental impact should be established to reduce misinterpretation about what evidence to include in an assessment. Invasive species may have a negative environmental impact in several different settings, including agriculture, forestry, and urban environments. Emphasis should be placed on what part of the environment is being negatively affected in addition to where the impact is taking place. In other words, the environmental context of the evidence should be specified and used as part of the rationale for including or excluding a given piece of evidence.

5. The use of a detailed decision tree to harmonize and capture the decision-making process (such as Fig. 5) would improve the transparency and repeatability of assessments, and formally distinguish between evidence of no impact and no evidence of impact, and therefore improve the rigor of assigning confidence to each decision.

6. Systematic literature review protocols should be explicitly integrated into the impact assessment guidelines and literature searches performed using established best practice and formal protocols such as the Guidelines and Standards for Evidence Synthesis in Environmental Management (CEE 2018, Siddaway et al. 2019, Haddaway et al. 2020). One of the more frequent causes of uncertainty was the use of different literature by assessors. Differences in evidence accrual procedures have also been identified to affect comparability of various impact assessment frameworks (Strubbe et al. 2019). Following such protocols would allow transparency and repeatability of the literature search process and should reduce the occurrence of different literature use between assessors for the same species. In addition, given the pervasive uncertainty associated with taxonomy (Pyšek et al. 2013), all known synonyms of the accepted scientific name of a species should be included in the search terms (Haddaway et al. 2020), thus ensuring the inclusion of historical information. This is also why assessments ultimately will benefit from multiple people independently assessing a given species. This minimizes the chance of potentially missing evidence of impact and bolsters confidence in the final decision (if there is consistent agreement) or identifies problematic species (if there is consistent disagreement).
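The synonym recommendation in step 6 lends itself to a small illustration. The sketch below assembles one Boolean search string from an accepted name, its synonyms, and a list of impact terms, so that every assessor starts from the same query. The function name, the placeholder species names, and the impact terms are hypothetical, and the AND/OR syntax is an assumption that would need adapting to the bibliographic database actually used.

```python
def build_search_query(accepted_name, synonyms, impact_terms):
    """Combine all names of a species with impact terms into one Boolean query."""
    # Accepted name first, then synonyms; drop duplicates while keeping order.
    names = list(dict.fromkeys([accepted_name, *synonyms]))
    name_clause = " OR ".join(f'"{name}"' for name in names)
    term_clause = " OR ".join(impact_terms)
    return f"({name_clause}) AND ({term_clause})"


# Placeholder names only; real synonyms would come from a taxonomic authority.
query = build_search_query(
    "Genus species",
    ["Oldgenus species", "Genus species"],
    ["impact", "competition"],
)
print(query)
```

Recording the generated string with the assessment would also make the literature search itself repeatable.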
As processes to prioritize alien species, including those that draw on impact assessments, increasingly guide biosecurity actions and investment in pest management, reducing the uncertainty associated with the outcomes of impact assessments becomes essential. Incorporating the above recommendations is likely to increase the accuracy, transparency, and reliability of alien impact categorizations. Regardless of the type (epistemic or linguistic), being aware of the potential causes of uncertainty in alien species impact assessments, and more broadly in assessment processes in general, is important if we are to reduce uncertainty as much as possible and base our management and prioritization decisions on the most accurate and reliable information available (Hamel and Bryant 2017, Latombe et al. 2019). This is particularly relevant given the current Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) assessment on invasive alien species and their control (IPBES 2018).