How far can EPTs fly? A comparison of empirical flying distances of riverine invertebrates and existing dispersal metrics

The species composition of a community is driven by the dispersal capacity of the species forming that community and their ecological niche. While the ecological niches of EPTs (Ephemeroptera, Plecoptera and Trichoptera) are well-studied due to their wide use as indicators for the ecological status of freshwater ecosystems, their dispersal capacity has not yet been accurately characterized. Dispersion of the merolimnic EPT species during the terrestrial aerial adult stage is of special importance because the distance dispersed by active flight or passive wind drift is usually much larger compared to dispersion during the aquatic larval stage by active crawling or by drifting downstream. The aerial dispersal distance has been directly measured for only a small number of EPT species. For most other species, the dispersal capacity is assessed indirectly using species' traits that are mainly based on expert judgement and dispersal indices derived from trait information. In this study, we compiled a database of European EPTs' aerial dispersal distances reported in empirical studies and compared them to the dispersal capacity of the species as described by five different dispersal indices (original and modified versions of Li's Dispersal Capacity Metric DCM and Sarremejane's Species Flying Propensity SFP as well as relative wing length). The database included empirical data on 180 species, comprising 9.3% of European EPT species. Most data came from trap experiments with traps located at different distances from the assumed emergence point. Since the distance classes differed between studies and had to be translated to a fixed set of four distance classes here, several species had to be assigned to more than one class. To account for this uncertainty, five ordered logistic regression models, each one with a dispersal index as predictor and the ordinal-scaled aerial dispersal distance as response, were bootstrapped 10,000 times.
In each run, species belonging to several distance classes were randomly assigned to a single class out of all possible classes. Since wing length had no significant effect on aerial dispersal distance in any of the 10,000 bootstrap runs, we question the use of this anatomical trait as an indicator for the aerial dispersal capacity. In contrast, a modified version of the DCM index was consistently related to the aerial dispersal distances (96%). The original SFP index had a significant effect in 100% of the model runs, indicating that this index is very well-suited as an indicator for the aerial dispersal capacity of European EPT species. This study facilitates the assessment of European EPT flying distances by providing a compilation of empirical data on the topic and by recommending an accurate indirect method when empirical data is not available.


Introduction
Many species of the three orders of freshwater invertebrates Ephemeroptera, Plecoptera and Trichoptera (EPTs) are sensitive to environmental stressors and widely used as indicators for the ecological status of rivers (Bauernfeind and Moog, 2000; Graf et al., 2009). Moreover, their relatively short life-cycle assists in the tracking of recent environmental changes (Schletterer et al., 2010) and their relatively high taxonomic diversity makes EPT richness a good indicator for changes in invertebrate communities along environmental stressor gradients (Lewin et al., 2013). However, invertebrate species composition at a given site is not solely driven by the environmental conditions but also by the dispersal capacity of the species (Sarremejane et al., 2020).
Dispersion plays a fundamental role in the life-cycle of riverine benthic invertebrates: First, active dispersion must compensate for aquatic passive downstream drift (Muller, 1954; Pechlaner, 1986; Kopp et al., 2001). Second, dispersion is needed to re-colonise habitats after disturbances like the re-wetting of intermittent streams after the dry season (Cañedo-Argüelles et al., 2015) or to colonise new habitats in e.g. restored reaches (Sundermann et al., 2011). Third, gene flow by dispersion is a prerequisite to maintain metapopulations (Wilcock et al., 2001; Morrissey and de Kerckhove, 2009).
In contrast to hololimnic species with a fully aquatic life-cycle, merolimnic species have an aquatic larval but also a terrestrial aerial adult stage. During the terrestrial adult stage, aerial dispersion allows merolimnic benthic invertebrates to disperse long distances and to travel away from their aquatic habitats. The distance dispersed as adults in the short terrestrial life stage by active flight or passive wind drift is usually much larger than the aquatic dispersion by active crawling or by drifting downstream, allowing the insects to reach more distant habitats. Moreover, aerial dispersion is not restricted to the river network (Cañedo-Argüelles et al., 2015), allowing merolimnic invertebrates to cross catchment borders and colonise even isolated river networks (Tonkin et al., 2014).
Insect aerial dispersion follows a distribution pattern where most of the individuals disperse closer to the site of emergence and a few of them fly greater distances (Coutant, 1982; Collier and Smith, 1998). However, most studies just measure a maximum dispersal distance (e.g. Malicky, 1987) and many others cannot provide enough information to fully reconstruct the dispersal distance distribution of the studied species (e.g. Bagge, 1995). When dispersal data from different studies is used and compared, this limited information on the maximum dispersal distance has to be considered as well as the differences in the methods used to quantify dispersal distances. Different experimental designs have been used to measure the aerial dispersal capacity of merolimnic invertebrates. Sticky traps placed at varying distances from the site of emergence of the invertebrates allow for a direct comparison of flying abilities between species of a given community (Bagge, 1995). Mark and release experiments, particularly those using stable isotope marking, can accurately measure absolute distances flown by individuals (Coutant, 1982; Briers et al., 2004). Genetic distance studies quantify the gene flow and indicate the migration rate and connection between populations (Monaghan et al., 2001; Wilcock et al., 2001; Leys et al., 2016). However, dispersion has been directly measured for only a small number of species out of the merolimnic invertebrate community.
Therefore, dispersal capacity of the majority of species is assessed indirectly, using species' traits that are based primarily on expert judgement. Dispersal capacity can be a trait in itself, as described for example by Tachet et al. (2002). Alternatively, other functional traits, such as anatomical and life-history traits, which are known or assumed to be related to aerial dispersal capacity can be used (Malmqvist, 2000; Müller-Peddinghaus and Hering, 2013). Recently, these traits were used to develop two different dispersal indices: the Dispersal Capacity Metric (Li et al., 2016) is solely based on the four different dispersal modes as described by the dispersal trait of Tachet et al. (2002) while Species Flying Propensity (Sarremejane et al., 2017) combines them with additional functional traits. Moreover, relative wing length is often considered a good index and proxy for the dispersal capacity of merolimnic species (Malmqvist, 2000; Müller-Peddinghaus and Hering, 2013). These indices have been shown to reflect temporal colonisation patterns (Li et al., 2016) and connectivity between sites (Sarremejane et al., 2017), and hence to indicate species' dispersal capacity. However, these indices and underlying traits are mainly based on expert judgement and have not yet been compared to empirical data on realised dispersal distances of merolimnic invertebrate species. Such a comparison could be used to validate dispersal indices and to identify the index best suited as an indicator for the aerial dispersal capacity of merolimnic invertebrates. Since the indices have been developed to assess dispersal capacity during the terrestrial adult stage as well as the aquatic larval stage, modified versions of the indices focusing on the terrestrial adult stage might be superior as an indicator for the aerial dispersal capacity of merolimnic invertebrates.
There is a major problem that hampers such a comparison of empirical data on the aerial dispersion of merolimnic invertebrate species and expert-based traits and indices: many studies, such as the ones using sticky traps, did not report absolute dispersal distances but several predefined fixed distances. For example, Bagge (1995) installed traps at 0, 0.2, 0.4, 0.6, 3.0 and 3.7 km from the emergence point. Adult individuals of the caddisfly Cheumatopsyche lepida were only caught at the four shorter distances up to 0.6 km and therefore the dispersal capacity of this species is somewhere between 0.6 and 3.0 km. This issue can be resolved by assigning the species to an ordinal-scaled dispersal distance class. However, different pre-defined fixed distances have been used in different studies, and when using one single set of distance classes to combine reported flying distances from these studies some species cannot be clearly assigned to one single class, instead belonging to several. In the above example, if two dispersal classes of 0-2.0 and 2-3.0 km are used, C. lepida could be assigned to both of them. This uncertainty in assigning species to one distinct distance class must be considered.
The main objective of this study is to test the different dispersal indices by comparing them to reported dispersal distances based on a review of existing empirical studies in order to identify the index best suited as an indicator for the aerial dispersal capacity of merolimnic invertebrates. This paper focuses on European Ephemeroptera, Plecoptera and Trichoptera (EPTs) due to their importance in the aquatic community (Bauernfeind and Moog, 2000; Graf et al., 2009; Graf, 2011, 2013) and the availability of a larger number of empirical studies on the flying distances of these merolimnic invertebrate groups. The study centres on European species as the dispersal indices included were developed for taxa in that geographical area. More specifically, the objectives are: (i) to compile the empirical data on realised flying distances of European EPTs in a database, (ii) to compare the reported flying distances to the dispersal capacity as described by the existing dispersal indices DCM (Li et al., 2016) and SFP (Sarremejane et al., 2017) as well as relative wing length, and (iii) to identify the index most consistently related to the reported flying distances when the uncertainty in assigning species to one distinct distance class is considered. This will establish which index is best suited as an indicator for the aerial dispersal capacity of merolimnic invertebrates.

Database on realised flying distances of European EPTs
Our literature review resulted in a list of 71 publications about freshwater macroinvertebrate aerial dispersion and flying distances (references in Supplement 1). The publications were extracted from Sondermann (2017) and from references cited in articles from the same source. The review was supplemented with a search in Google Scholar (carried out in May 2019) using "EPTs", "Ephemeroptera", "Plecoptera", "Trichoptera", "dispersion" and "flying distance" as search terms. The following information on flying distances of European EPT taxa was extracted from these publications and compiled in a database, only taking into account experimentally obtained flying distances (no expert judgement, one single entry in the database per species): (i) order; (ii) family; (iii) species name; (iv) maximum flying distance reported in the literature (i.e. if several papers reported flying distances for the same species, the largest value was chosen); (v) information on the experimental design used in the respective study: traps (excluding experiments using light sources as attractors), light traps, mark and release (visual and isotope marking), population genetics, direct observation and observation of a recolonisation process; and (vi) species scores on the dispersal indices described below. Taxa with apterous or micropterous adults (e.g. Taeniopteryx araneoides) or taxa with a completely terrestrial life-cycle (e.g. Enoicyla) were not considered in the database. The resulting database (Supplement 1) contains flying distances for 180 out of the 1,938 European EPT species, including mainland species and species from the Portuguese and Spanish Atlantic islands, according to Schmidt-Kloiber and Hering (2015).
Most of the analysed publications did not report absolute distances but pre-defined fixed distances and therefore flying distances had to be assigned to a set of discrete distance classes. The number of classes was restricted to four to keep the number of species per class large enough to allow for group comparisons. Class ranges of 0-0.8 km, 0.8-3 km, 3-5 km and >5 km were chosen. These ranges were chosen (i) based on the distance ranges commonly used in the studies; (ii) to uniformly distribute the species among the distance classes; and (iii) to provide a threshold of 5 km for the largest distance class, since 5 km was found to be an ecological boundary for benthic invertebrate recolonisation of restored rivers (Sundermann et al., 2011). As these distance classes partly differed from the classes used in the single studies, some species had to be assigned to more than one class, as described in the introduction. In some of the sources we included, it was not possible to discriminate which distance class a given species belonged to due to the experimental design used (for example, a trap-based experiment where the trap furthest away from the emergence point was set at 50 m) resulting in some species being assigned to all four distance classes. Therefore, these species were excluded as their flying capacity could not be determined, reducing the number of species in the analysis from 180 to 129 (9 Ephemeroptera, 23 Plecoptera and 97 Trichoptera species). Among the species left, 51 were assigned to a single distance class, 41 to two distance classes and 36 to three distance classes.
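The assignment of a species to one or several distance classes can be illustrated with a short Python sketch. It is not the code used in this study. It assumes that the maximum flying distance lies between the furthest trap with a catch and the next trap without a catch, and that the four classes are half-open intervals; the treatment of class boundaries and the function name are illustrative assumptions.

```python
import math

# Distance classes from the text, in km, treated here as half-open [low, high).
# The handling of the boundaries is an assumption made for this sketch.
DISTANCE_CLASSES = [(0.0, 0.8), (0.8, 3.0), (3.0, 5.0), (5.0, math.inf)]


def candidate_classes(furthest_catch_km, next_empty_trap_km=math.inf):
    """Return the 1-based indices of all classes the maximum distance may fall in.

    The maximum flying distance is assumed to lie in the interval
    [furthest_catch_km, next_empty_trap_km). If no trap was set beyond the
    furthest catch, the upper bound is unknown (infinite).
    """
    if next_empty_trap_km <= furthest_catch_km:
        raise ValueError("upper bound must exceed the furthest catch distance")
    return [
        idx
        for idx, (low, high) in enumerate(DISTANCE_CLASSES, start=1)
        if furthest_catch_km < high and next_empty_trap_km > low
    ]


# Cheumatopsyche lepida in Bagge (1995): caught up to 0.6 km, next trap at 3.0 km.
cheumatopsyche = candidate_classes(0.6, 3.0)
# A transect whose furthest trap was set at 50 m gives no upper bound at all.
short_transect = candidate_classes(0.05)
```

Under these assumptions, C. lepida falls into the first two classes, while a species caught only on a 50 m transect is compatible with all four classes and would therefore be excluded from the analysis.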

Dispersal indices
The Dispersal Capacity Metric DCM of Li et al. (2016) is based on the dispersal trait of Tachet et al. (2002), which assesses the dispersal capacity of each species for each of four different dispersal modes (aquatic active aqa, aquatic passive aqp, aerial active aea, aerial passive aep) on a four-point ordinal scale (0-3) based on expert judgement. For each species, the scores of the four dispersal modes are summed, with a weight of 2 given to the two aerial dispersal modes since the aerial dispersal distance is greater than the aquatic dispersal distance for most species (Minshall & Petersen, 1985). This sum is standardised using the maximum-minimum rescaling approach, with max_c and min_c representing the species with the highest and lowest sum of scores respectively. The resulting standardised scores range from 0 to 1, with 0 being the score of the worst disperser and 1 the score of the best in the community.
The scores of this index were calculated for every species (i) in our database with max_c and min_c being the values of the species scoring highest and lowest in the entire database. In the case of species or genera not reported by Tachet, we used the average values of the next superior taxonomic category available for the aforementioned traits in Tachet's database.
In addition, the original index was modified (DCM') by excluding the aquatic dispersal modes as it will be compared to flying distances, i.e. terrestrial dispersion only.
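The calculation of DCM and DCM' as described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in this study: the function name and the three example species with their scores are hypothetical, and only the weights and the maximum-minimum rescaling are taken from the description in the text.

```python
# Weights from the text: the two aerial modes count double in the original DCM.
DCM_WEIGHTS = {"aqa": 1, "aqp": 1, "aea": 2, "aep": 2}
# DCM' drops the aquatic modes and keeps only the aerial ones.
DCM_PRIME_WEIGHTS = {"aea": 2, "aep": 2}


def dispersal_capacity(trait_scores, weights):
    """Min-max rescaled weighted sum of dispersal-mode scores for each species."""
    sums = {
        species: sum(weights[mode] * scores[mode] for mode in weights)
        for species, scores in trait_scores.items()
    }
    low, high = min(sums.values()), max(sums.values())
    if high == low:
        raise ValueError("all species have the same sum; rescaling is undefined")
    return {species: (total - low) / (high - low) for species, total in sums.items()}


# Hypothetical scores (0-3) for three made-up species, not taken from Tachet et al.
example = {
    "species_a": {"aqa": 1, "aqp": 2, "aea": 0, "aep": 1},
    "species_b": {"aqa": 0, "aqp": 3, "aea": 2, "aep": 1},
    "species_c": {"aqa": 2, "aqp": 1, "aea": 3, "aep": 3},
}
dcm = dispersal_capacity(example, DCM_WEIGHTS)
dcm_prime = dispersal_capacity(example, DCM_PRIME_WEIGHTS)
```

Because the scores are rescaled against the lowest and highest sum in the species pool, the index value of a species depends on which other species are included in the calculation.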
The Species Flying Propensity of Sarremejane et al. (2017) (referred to as SFP in the following) is based on the combination of four functional traits that were considered to be related to the aerial dispersal capacity by Sarremejane et al. (2017). For each of these four traits, the scores of categories promoting aerial dispersion were given higher weights, ranging from 1 to 4. First, the aerial dispersal trait of Tachet et al. (2002) is used, with a weight of 2 given to aerial passive mode and a weight of 4 given to aerial active mode, since the authors assumed that active flyers can disperse longer distances. Second, the maximum adult size given in Tachet et al. (2002) is included, with a weight of 2 given to a small maximum adult size (mss; <1 cm) and a weight of 4 to larger species (mbs; >1 cm) since larger animals were assumed to be stronger flyers. Third, adult lifespan given in Poff et al. (2006) is used, with increasing weights of 1, 2, and 4 given respectively to increasing lifespans, from very short (vsl; <7 days), to short (sl; <30 days), to long (ll; >30 days), since available flying time increases with adult lifespan. Fourth, the number of generations given in Tachet et al. (2002) is used, with a weight of 2 and 4 given to univoltine (uv) and semivoltine (sv) species respectively, since species with more generations within a year have more dispersal opportunities. Following the recommendation of Sarremejane, multivoltine (mv) was included as a trait category with a weight of 6 (Sarremejane, personal communication, July 11, 2019). The final SFP index is calculated as the sum of trait scores multiplied by the corresponding trait weights, standardised by the sum of scores over all categories of a trait so that each trait has a similar relevance in the final score of the index.
The index scores were calculated for every species (i) in our database. In case of species or genera from our database not reported by Tachet et al. (2002) or Poff et al. (2006), we used the average values of the next superior taxonomic category available for the aforementioned traits.
In addition, the original index was modified (SFP') by excluding the number of generations. The index scores were compared to flying distances from empirical studies that captured individuals of one single generation and therefore the argument for including this trait does not apply in our study.
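The verbal description of SFP and SFP' can be translated into the following Python sketch. It is one possible reading of the description, not the original implementation of Sarremejane et al. (2017): the weights are copied as stated in the text, whereas the function name, the trait labels, the handling of traits without any score and the example species are assumptions made for illustration.

```python
# Category weights as stated in the text; trait labels are hypothetical names.
SFP_WEIGHTS = {
    "aerial_dispersal": {"aep": 2, "aea": 4},
    "adult_size": {"mss": 2, "mbs": 4},
    "adult_lifespan": {"vsl": 1, "sl": 2, "ll": 4},
    "voltinism": {"uv": 2, "sv": 4, "mv": 6},
}


def flying_propensity(scores, weights=SFP_WEIGHTS, skip=()):
    """Sum over traits of the weighted category scores divided by the trait's score sum."""
    total = 0.0
    for trait, category_weights in weights.items():
        if trait in skip:
            continue
        trait_scores = scores[trait]
        score_sum = sum(trait_scores.get(cat, 0) for cat in category_weights)
        if score_sum == 0:
            # Assumption: a trait without any score contributes nothing.
            continue
        weighted = sum(w * trait_scores.get(cat, 0) for cat, w in category_weights.items())
        total += weighted / score_sum
    return total


# Hypothetical species, not taken from Tachet et al. (2002) or Poff et al. (2006).
species = {
    "aerial_dispersal": {"aep": 1, "aea": 3},
    "adult_size": {"mss": 0, "mbs": 1},
    "adult_lifespan": {"vsl": 0, "sl": 1, "ll": 0},
    "voltinism": {"uv": 1, "sv": 0, "mv": 0},
}
sfp = flying_propensity(species)
sfp_prime = flying_propensity(species, skip=("voltinism",))  # SFP' without voltinism
```

Dividing by the sum of scores within each trait keeps the contribution of every trait within the range of its category weights, so that no single trait dominates the index merely because it has more scored categories.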
Relative wing length is an anatomical index calculated as the forewing length divided by the body length. This index was included in our study because it has been widely used as a proxy for dispersal capacity (Malmqvist, 2000; Müller-Peddinghaus & Hering, 2013) as species with a higher relative wing length and lower wing load are considered better flyers. Forewing and body lengths of European EPT have been collected from the literature (Lillehammer, 1972; Elliott, 1987, 1988; Stevens et al., 1999; Bauernfeind & Humpesch, 2001; Hoffsten, 2004; Malicky, 2004; Soldán et al., 2009). The scores of this index were calculated for every species (i) with published data for forewing and body lengths in our database (Supplement 1).

Statistical analysis
To test if the five indices were related to the four ordinal-scaled distance classes, a one-way ANOVA was performed for each index (DCM, DCM', SFP, SFP' and relative wing length). Tukey's Honestly Significant Difference was used as a post-hoc test to identify distance classes which significantly differed in index values. For this first exploratory analysis, each species was assigned to the smallest distance class it belonged to.
Instead of using the four ordinal-scaled distance classes, it can be considered that these classes represent an underlying continuous variable (flying distance), and hence ordered logistic regression models (also known as proportional odds regression models) can be used to investigate the relationship between flying distance and the five indices in more detail. Flying distance was used as a response variable and each index as a single predictor variable in five logistic regression models. Each model calculates the coefficient for the logarithm of the odds (ologit) of the probabilities for each value of the predictor variable (dispersal index) to fall in each category of the response variable (distance classes). Moreover, each model provides intercepts for each boundary between categories of the response variable (boundaries between distance classes), i.e. the value of the predictor variable at the respective class boundary. Finally, each model output includes p-values and estimators of the model fit like the Akaike Information Criterion (AIC). The Brant test was used to test if the proportional odds assumption was met (a change of the odds for each response class has to be proportional to a change of the predictor). Such ologit models can be used to predict the probability of a species belonging to the different distance classes, given the specific index score of the species. However, in this study the models were solely used to test if the empirical flying distances were significantly positively related to the dispersal index and to assess the model fit. This information was finally used to identify the index which was related most closely to the flying distances.
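How a fitted proportional odds model turns an index score into class probabilities can be sketched in Python. The sketch assumes the parameterisation used by the polr function of the MASS package, logit P(Y <= k) = intercept_k - coefficient * x; the coefficient and intercepts in the example are hypothetical values, not results of this study.

```python
import math


def class_probabilities(index_score, coefficient, intercepts):
    """Probabilities of the ordered classes under a proportional odds model.

    Assumes logit P(Y <= k) = intercepts[k] - coefficient * index_score,
    with the intercepts sorted in ascending order.
    """
    def logistic(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Cumulative probabilities P(Y <= k); the last class closes at 1.
    cumulative = [logistic(z - coefficient * index_score) for z in intercepts] + [1.0]
    probabilities, previous = [], 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical coefficient and class-boundary intercepts for four distance classes.
low_index = class_probabilities(0.0, coefficient=2.0, intercepts=[0.0, 1.0, 2.0])
high_index = class_probabilities(1.0, coefficient=2.0, intercepts=[0.0, 1.0, 2.0])
```

With a positive coefficient, a higher index score shifts probability mass towards the larger distance classes, which is the direction of relationship tested in this study.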
To consider the uncertainty in the assignment of some species to a single distance class and to assess the consistency of results, each of the five logistic regression models was bootstrapped 10,000 times. First, in each of the 10,000 runs, species belonging to several distance classes were randomly assigned to a single class out of all the possible classes. Second, species assigned to each distance class were grouped and, for each of the five dispersal metrics, median index scores as well as the quartiles of each distance class were calculated (since some species were randomly assigned to different distance classes in each run, median index scores and quartiles differed between runs). Third, an ordered logistic regression model was calculated for each of the five dispersal indices and the coefficients, p-values of the coefficients, AIC values and the results of the Brant tests were stored, using the MASS (Venables and Ripley, 2002) and brant (Schlegel and Steenbergen, 2018) packages in R (R Core Team, 2020).
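The random assignment and the per-class medians of one run can be sketched in Python as follows. The analysis itself was carried out in R; this sketch covers only the first two steps and omits the quartiles and the model fitting. All function names and the example species, candidate classes and index scores are hypothetical.

```python
import random
import statistics


def assign_classes(candidates, rng):
    """Draw one distance class per species from its list of possible classes."""
    return {species: rng.choice(classes) for species, classes in candidates.items()}


def class_medians(assignment, index_scores):
    """Median index score of the species assigned to each distance class."""
    grouped = {}
    for species, distance_class in assignment.items():
        grouped.setdefault(distance_class, []).append(index_scores[species])
    return {cls: statistics.median(values) for cls, values in sorted(grouped.items())}


def resample_medians(candidates, index_scores, n_runs, seed=0):
    """Repeat the random assignment n_runs times and collect the class medians."""
    rng = random.Random(seed)
    return [
        class_medians(assign_classes(candidates, rng), index_scores)
        for _ in range(n_runs)
    ]


# Hypothetical candidate classes and index scores for four made-up species.
candidates = {"sp1": [1], "sp2": [1, 2], "sp3": [2, 3], "sp4": [4]}
index_scores = {"sp1": 0.2, "sp2": 0.4, "sp3": 0.6, "sp4": 0.9}
runs = resample_medians(candidates, index_scores, n_runs=100)
```

Species with a single candidate class end up in the same class in every run, so all variability between runs stems from the species with ambiguous assignments.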
The variability between runs is shown in five plots, one per index. Each plot includes three boxplots for each distance class, showing the variability of the medians, 25th percentile, and 75th percentile in the 10,000 runs. Each boxplot shows median values, quartiles and highest and lowest value up to 1.5 times the inter-quartile range. Hence, these boxplots illustrate how the median values and the quartiles of each distance class varied between runs and if the differences of the index scores between distance classes were consistent between runs.
To identify the index most closely related to the empirical flying distances, the number of significant runs and median AIC values were compared between indices. For each of the five indices, only runs where the proportional odds assumption was met were considered (Brant test non-significant, p > 0.05). Among those, the percentage of runs with a significant regression coefficient (p < 0.05) and positive relationship (flying distance increasing with index score) was calculated. In addition, the median AIC value of these runs was calculated.
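The summary of the runs for one index can be sketched in Python. The sketch assumes that the median AIC is taken over all runs passing the Brant test; the dictionary keys, the function name and the four example runs are hypothetical and do not correspond to results of this study.

```python
import statistics


def summarise_runs(runs, alpha=0.05):
    """Share of significant positive runs and median AIC among runs passing the Brant test."""
    # Keep only runs where the proportional odds assumption was not rejected.
    valid = [run for run in runs if run["brant_p"] > alpha]
    if not valid:
        return {"n_valid": 0, "share_significant_positive": None, "median_aic": None}
    significant = [
        run for run in valid if run["p_value"] < alpha and run["coefficient"] > 0
    ]
    return {
        "n_valid": len(valid),
        "share_significant_positive": len(significant) / len(valid),
        "median_aic": statistics.median(run["aic"] for run in valid),
    }


# Four hypothetical runs for one index.
example_runs = [
    {"coefficient": 1.2, "p_value": 0.01, "aic": 330.0, "brant_p": 0.40},
    {"coefficient": 0.9, "p_value": 0.20, "aic": 332.0, "brant_p": 0.30},
    {"coefficient": 1.5, "p_value": 0.001, "aic": 325.0, "brant_p": 0.01},
    {"coefficient": 1.1, "p_value": 0.03, "aic": 328.0, "brant_p": 0.20},
]
summary = summarise_runs(example_runs)
```

In this example the third run is discarded because it fails the Brant test, and two of the remaining three runs show a significant positive relationship.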

Database on realised flying distances of European EPTs
The database (Supplement 1) contains empirically obtained flying distances for 180 European EPT species (Table 1). The flying distances were compiled from 21 different publications but information on most of the species was reported in three publications, namely Malicky (1987), Mendl and Müller (1974) and Bagge (1995) (110, 24, and 14 species respectively). The remaining 18 publications report information on <10 species each, and account for 32 EPT species in total. As a consequence, most flying distances were obtained using the experimental designs applied by the three main studies: trap experiments using light sources (113 species), trap experiments (43 species), observation of recolonisation processes (12 species), direct observation (5 species), genetic distances (4 species), and mark-and-release experiments (3 species).
Overall there is data available for 9.3% of the European EPT species, but the numbers vary between orders: Trichoptera were better represented in the database (140 species, 12.6%) compared to Plecoptera (31 species, 6.3%) and Ephemeroptera (9 species, 2.6%). Details on the representativeness of each taxon in the database are summarised in Table 2.

Exploratory analysis on the relationship between the five dispersal indices and empirical flying distances
In the first exploratory analysis, with each species assigned to the smallest distance class it belonged to, three out of the five indices were significantly related to the four ordinal-scaled distance classes. The SFP index showed the largest differences in index scores between classes, indicating that this index was related best to the empirical flying distances (Fig. 1). Mean scores of the DCM' (One-Way ANOVA, F(3,124) = 9.574, P < 0.001), SFP (One-Way ANOVA, F(3,120) = 12.11, P < 0.001)
The random assignment of species belonging to several distance classes in the 10,000 runs resulted in some variability of the model results (Fig. 2). However, variability of the index scores was rather low and results of the ologit models were consistent. The standard deviation of the median scores for each distance class was always <15% of the range of values of each index (Table 3).
The ologit models confirmed the results of the exploratory ANOVAs. In the ordered logistic models the SFP index was most closely related to flying distance. The DCM' and SFP' indices were also significantly related to flying distance while the DCM index and relative wing length were not (Fig. 2). In 100% of the runs passing the Brant test, the SFP index was significantly and positively related to flying distance and the median AIC value of the runs was comparatively low (328.7, Table 4). The share of significant runs where flying distance was positively related to the DCM' index was slightly lower (96%) and the median AIC value higher (346.2), and therefore the SFP index can be considered to better fit the data compared to the DCM' index. The median AIC value of the SFP' models was even lower compared to the SFP models, but differences were marginal (327.3 compared to 328.7) and the SFP' index was significantly and positively related to flying distance in only 77% of the model runs (Table 4). The percentage of significant model runs for the DCM index and relative wing length was too low to consider those indices good predictors for empirical data (4% and 0% respectively, Table 4).

Discussion
In this study, we compiled the empirical data on realised flying distances of European EPTs in order to compare them to the dispersal capacity of the species as described by five different dispersal indices (original and modified versions of Li's DCM and Sarremejane's SFP as well as relative wing length). The main objective was to identify the index which was most consistently related to the reported flying distances and therefore best suited as an indicator for the aerial dispersal capacity of merolimnic invertebrates.

Database on realised flying distances of European EPTs
The database we compiled includes all empirical data on European EPT flying distances that we were aware of. In this study it was used to compare macroinvertebrate dispersal indices, but it can be considered a valuable contribution to the field of dispersal ecology in its own right as it summarises the current scientific knowledge. Therefore, we would like to discuss the possibilities and limitations in using this database. First, we have data for just 9% of the European EPT species. Most of this information was already published in the 20th century and there are few recent studies. Since data availability in Europe is usually better compared to other regions, the share of EPT species for which we have empirical information on their dispersal capacity is most likely even lower globally. It is important to note that the main literature source of the database, Sondermann (2017), was based on a thorough bibliographic search of German and English publications conducted in 1992 (Hering, 1995), meaning that older publications in languages other than English or German, as well as publications published after 1992 but not indexed, are likely to be missing in the database. In summary, there is a need to reinforce efforts aimed at improving our empirical knowledge on EPT dispersion both in Europe and globally.
Second, data from different studies cannot always be directly compared due to methodological differences. Moreover, it is difficult to assess maximum flying distances based on trap-experiments, which is the most widely used method in our database. Traps can only catch insects at the predetermined distances where they are set, do not necessarily catch every individual flying by, and most trap experiments use artificial light to attract the insects, possibly stimulating some individuals (Ephemeroptera and Trichoptera) to fly longer distances than they normally would. In addition, taxonomic inconsistency is a problem, since numerous subspecies are found in many European EPT species, especially within the Trichoptera (Neu et al., 2018). Due to their taxonomic status, different distribution or ecological niches, these taxa can have different flight distances within one species. Nevertheless, other methods less common in our database might be better suited to assess maximum flying distances. Isotope marking (Coutant, 1982; Hallworth et al., 2018) can be very precise and has already proved that trap-experiments tend to underestimate dispersal distances (Briers et al., 2004) while genetic distance analysis (Kelly et al., 2001; Wilcock et al., 2001; Polato et al., 2017) measures effective dispersal capacity (instead of realised flying distance) by testing which populations are in genetic contact. In summary, the maximum flying distances of individual species reported in the database can only be considered general estimates and should therefore be used with caution.

Table 2. Database summarised by families. For each order (Ephemeroptera, Plecoptera, Trichoptera) all families are listed. For each family, the number of genera and species occurring in Europe according to Schmidt-Kloiber & Hering (2015) as well as the number of genera and species present in the database and the maximum flying distance reported in the database are given.
Despite these methodological issues, results of the bootstrapping runs were surprisingly consistent, which partly might have been due to the approach of using a large number of species and rather large distance classes. So, despite the uncertainty related to the maximum flying distance reported for single species, the database could be used for similar analyses on dispersion of EPTs in general.

Dispersal indices as indicators for the aerial dispersal capacity of merolimnic invertebrates
Besides compiling a database on the empirical flying distances of European EPT taxa reported in the literature, the main objective of this study was to test if and which of the existing dispersal indices and anatomical traits is most closely related to empirical flying distances and therefore best suited as an indicator of the aerial dispersal capacity of EPT communities. The analyses conducted for this study are purely correlative and therefore the causes of the differences in dispersal capacity between taxa cannot be determined.
We selected relative wing length as an anatomical trait for our study since it is reported in the literature, has been widely used in the past (Malmqvist, 2000; Müller-Peddinghaus and Hering, 2013), and can easily be calculated based on separate measurements of forewing length and body length given in different publications (Lillehammer, 1972; Elliott, 1987, 1988; Stevens et al., 1999; Bauernfeind and Humpesch, 2001; Hoffsten, 2004; Malicky, 2004; Soldán et al., 2009). Since the results clearly showed that relative wing length is not significantly related to the reported flying distances, we question the use of this anatomical trait as an indicator for the dispersal capacity of EPT species. We did not consider other anatomical traits or indices like aspect ratios (Müller-Peddinghaus and Hering, 2013) or relative thoracic mass (Hoffsten, 2004) simply because information was only available for a few of the species in the database. If this information can be compiled from other sources or future studies for a larger number of EPT species, these additional anatomical traits can easily be tested using our database on empirical flying distances (Supplement 1). In contrast to relative wing length, all three indices only or mainly based on Tachet's aerial dispersal traits (DCM', SFP, SFP') were significantly and consistently related to reported flying distances and can be used as indicators for the dispersal capacity of EPT communities. However, the original Dispersal Capacity Metric DCM of Li et al. (2016), which includes Tachet's aquatic dispersal traits besides the aerial traits, was not significantly related to empirical flying distances, indicating that aquatic dispersal capacity is not correlated with aerial dispersion.
Additional ecological traits, as applied in the Species Flying Propensity SFP of Sarremejane et al. (2017), increased the model fit, indicating potential correlations between the additional traits included in this index and aerial dispersal capacity. It seems reasonable that the additional traits "maximum adult size" and "adult life span", included in the SFP as well as in the modified version SFP', resulted in a better model fit compared to DCM'. However, it was surprising that the original SFP including voltinism performed even better than the modified version SFP', from which we excluded this trait. Over a whole year, several generations within a species may indeed disperse further, since individuals of each generation may disperse, but flying distances measured in the empirical studies refer to individuals of one single generation, so the number of generations per year cannot directly influence the empirical dispersal capacity of individuals. We speculate that voltinism is co-correlated with one or more traits that have a direct influence on flying capacity. One possible reason might be a co-correlation with life-history strategies, since species with many generations per year and therefore short life cycles can be considered r-strategists, i.e. pioneer species that are in all likelihood good dispersers.
Fig. 1. Index scores of the species in the four distance classes, with the species being assigned to the smallest distance class they belong to. Plots marked with an asterisk (*) show significant differences among the distance classes. For those plots, classes marked with different letters show significant differences between them at P < 0.05 (One-Way ANOVA followed by Tukey's post-hoc comparisons).
Overall, the original version of the Species Flying Propensity SFP of Sarremejane et al. (2017) was the index most closely related to the empirical flying distances, and we recommend its use in future research as an indicator for the aerial dispersal capacity of merolimnic invertebrates. First, index scores of the different distance classes showed the largest differences in the exploratory ANOVA analysis. Second, it was able to predict empirical dispersal data consistently, as indicated by the high share of significant model runs, despite the uncertainty in these data and in assigning some species to one specific distance class. Third, the median fit of the models based on SFP was virtually as good as the best model fit of SFP'. The relatively low share of model runs passing the Brant test does not indicate that SFP is a poor predictor; rather, it shows that the relationship between predictor and response departs from the proportional odds assumption of the ologit model. Specifically, for some of the runs the dispersal capacity increases with the index score, but the relationship between predictor and response is not linear (for example, it could be asymptotic or sigmoid). Generalised ordered logistic models could be used to overcome this limitation (Williams, 2016), but they would not provide relevant additional insights in relation to the objectives of this study.
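For reference, the proportional odds assumption examined by the Brant test can be written in a standard textbook form. The notation is ours and is not taken from the original analysis: Y denotes the ordinal distance class (1 to 4), x the index score, \alpha_j the class thresholds and \beta the single slope shared by all thresholds. Sign conventions differ between software packages.

```latex
% Proportional odds (ordered logit) model: one slope for all thresholds
\operatorname{logit} P(Y \le j \mid x) = \alpha_j - \beta x, \qquad j = 1, 2, 3

% Generalised ordered logit: the slope may differ between thresholds
\operatorname{logit} P(Y \le j \mid x) = \alpha_j - \beta_j x, \qquad j = 1, 2, 3
```

A run fails the Brant test when the data suggest that the slopes \beta_j differ between thresholds. This can happen even if dispersal capacity increases with the index score throughout, which is why a low share of passing runs does not by itself speak against the predictor.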

Conclusions
In this study, we compiled and provided a database on the empirical flying distances of European EPT taxa. Moreover, we developed and used a methodological approach to compare freshwater invertebrate dispersal indices to these empirical data. We would like to encourage colleagues to use and update the database as new empirical studies are published, as well as to add scores of other dispersal indices and test them against the empirical data. This will allow us to maintain an updated view on the best indices to assess freshwater macroinvertebrate aerial dispersal capacity.

Fig. 2. Variability of the index scores of the species in the four distance classes between the 10,000 ologit model runs, with species belonging to several distance classes assigned randomly to one out of the possible classes in each run. Each boxplot shows median values, quartiles and highest and lowest value up to 1.5 times the inter-quartile range.

Table 3
Comparison of the standard deviation (SD) of the medians and range of the five indices scores for the 10,000 runs.

Table 4
Summary of the 10,000 ologit model runs for each index, where species belonging to several distance classes were randomly assigned to a single class out of all the possible classes. Only runs passing the Brant test were considered (percentage on all 10,000 runs given in column % Brant). The percentage of runs where the coefficient was significant and positive (% significant) and the median AIC value of the significant and positive runs (AIC) are given.
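The random assignment of species with several possible distance classes, and the resulting variability of the class medians between runs, could be sketched roughly as follows. This is an illustration under stated assumptions, not the code used in the study: the toy data, the function names and the use of the population standard deviation are all hypothetical, and the ordered logistic regression step itself is omitted.

```python
import random
import statistics

# Hypothetical toy data: (index score, possible distance classes 1-4).
species = [
    (0.20, [1]),
    (0.25, [1, 2]),
    (0.30, [1]),
    (0.40, [2]),
    (0.45, [2, 3]),
    (0.55, [3]),
    (0.60, [3, 4]),
    (0.70, [4]),
    (0.75, [4]),
]


def class_medians_one_run(species, rng):
    """Assign each species to one of its possible classes at random and
    return the median index score of every class for this single run."""
    scores_by_class = {}
    for score, classes in species:
        chosen = rng.choice(classes)
        scores_by_class.setdefault(chosen, []).append(score)
    return {c: statistics.median(s) for c, s in scores_by_class.items()}


def sd_of_medians(species, n_runs=1000, seed=1):
    """Repeat the random assignment n_runs times and return, per class,
    the standard deviation of the class medians across runs."""
    rng = random.Random(seed)
    medians = {}
    for _ in range(n_runs):
        for c, m in class_medians_one_run(species, rng).items():
            medians.setdefault(c, []).append(m)
    return {c: statistics.pstdev(v) for c, v in sorted(medians.items())}


sd = sd_of_medians(species)
```

In this sketch, a class whose membership never changes between runs has a standard deviation of zero, whereas classes that gain or lose ambiguous species show a positive value. In a full analysis, each run would additionally be passed to an ordered logistic regression and the Brant test.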

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.