Comparing nekton distributions at two tidal energy sites suggests potential for generic environmental monitoring

Tidal energy is a renewable resource that can contribute towards meeting growing energy demands, but uncertainties remain about environmental impacts of device installation and operation. Environmental monitoring programs are used to detect and evaluate impacts caused by anthropogenic disturbances and are a mandatory requirement of project operating licenses in the United States. In the United Kingdom, consent conditions require monitoring of any adverse impacts on species of concern. While tidal turbine sites share similar physical characteristics (e.g. strong tidal flows), similarities in their biological characteristics have not been examined. To characterize the generality of biological attributes at tidal energy sites, metrics derived from acoustic backscatter describing temporal and spatial distributions of fish and macrozooplankton at Admiralty Inlet, Washington State and the Fall of Warness, Scotland were compared using t-tests, F-tests, linear regressions, spectral analysis, and extreme value analysis (EVA). EVA was used to characterize metric values that are rare but potentially associated with biological impacts, defined as relevant change as a consequence of human activity. Pelagic nekton densities were similar at both sites, as evidenced by no statistically significant difference in densities, and similar daily density patterns of pelagic nekton between sites. Biological characteristics were similar, suggesting that generic biological monitoring programs could be implemented at these two sites, which would streamline permitting, facilitate site comparison, and enable environmental impact detection associated with tidal energy deployment. 2016 Published by Elsevier Ltd.


Introduction
Environmental monitoring is used to identify impacts caused by anthropogenic disturbances. Biological components of monitoring programs focus on the detection of change in variables such as diversity, size, or abundance of monitored species [1]. Prior to establishing long-term monitoring, regulatory agencies typically require the collection of baseline data before projects can be implemented that may cause alteration to an ecosystem [2]. At a single site, biological characteristics before and after an alteration can be compared to detect change, as in classic Before-After-Control-Impact (BACI) sample designs [3]. Standard sampling protocols permit monitoring datasets to be compared between or among sites to evaluate if observed changes are site-specific or generic.
Biological monitoring programs are mandatory for marine renewable energy (MRE) tidal energy projects in the US, yet no standards for monitoring procedures, technologies, or metrics currently exist [4]. This lack of standardization has resulted in site-specific monitoring programs for each tidal energy pilot project in the US. In Scotland, an Environmental Impact Assessment and a post-installation monitoring program are required for each MRE project, and these are also site-specific [5]. Standardization of a portion or all monitoring components would enable monitoring plans to be proposed in a time-efficient manner, and make monitoring datasets comparable across sites.
Determining the maximum level of "acceptable" impact, or biologically significant change, is a high priority when forming a monitoring plan [6]. Because impact above a threshold can determine whether a tidal project requires operational modifications, additional monitoring, mitigation, or removal [7], impact thresholds and characterizations should be determined before post-installation monitoring begins [8]. Extreme value analysis (EVA) is an approach used to model values that are infrequent but potentially associated with impacts [9]. This approach also provides a threshold to identify infrequent values, and could provide statistically significant thresholds for use in biological monitoring [10].
Few studies of fish and macrozooplankton in the water column (i.e. pelagic nekton) have been conducted at tidally dynamic sites because these sites are challenging to sample. One option for studying biota in the water column is active acoustic technology. Acoustic instruments use sound to evaluate distributions, abundances, and behavior of fish and macrozooplankton [11,12]. Acoustic instruments offer non-invasive methods to continuously sample large volumes of water, regardless of current speed or light levels. These instruments can be deployed on autonomous or cabled platforms that are suitable for monitoring at high spatial and temporal resolution [13], and at low cost [14].
It is important to evaluate and compare MRE site biological characteristics so that the potential for standardized monitoring programs can be assessed, and if applicable, developed and implemented. MRE tidal sites share similar physical characteristics (e.g. high tidal flows), but it is unknown whether these sites share similar biological characteristics. In this study we describe and compare biological characteristics of pelagic nekton distributions at two tidal energy sites, to examine whether density distributions are similar or site-specific. We also evaluate whether EVA is an appropriate general approach to determine impact thresholds of biological monitoring variables at tidal energy sites and comment on the feasibility of developing generic monitoring programs.

Site descriptions
Active acoustic data used for this study were collected at two tidal energy sites. Admiralty Inlet, on the west side of Whidbey Island in Puget Sound, Washington State, was the proposed site of the Snohomish Public Utility District 1 (SnoPUD) pilot tidal energy project that received its project license from the Federal Energy Regulatory Commission (FERC) on March 20, 2014. The project, now dormant, would have deployed two OpenHydro turbines (http://www.openhydro.com/) approximately one kilometer off Whidbey Island (Fig. 1a). Two buried cables were to connect the turbines to the electric grid [15]. The second dataset was collected at the European Marine Energy Centre (EMEC) test facility in the Fall of Warness, located centrally in the North Isles of Orkney, Scotland (Fig. 1b). The Fall of Warness provides eight grid-connected turbine berths in depths of 12-50 m with current speeds up to 4 m s⁻¹. Although the site has actively generating turbines, the dataset used for this study was collected in the tidal channel, away from any turbine structure, to provide a control dataset for the FLOWBEC project (http://noc.ac.uk/project/flowbec).

Data acquisition
Acoustic backscatter (i.e. reflected energy) data were recorded at Admiralty Inlet using a seabed mounted BioSonics DTX echosounder (http://www.biosonicsinc.com/) operating at 120 kHz from May 9 to June 9, 2011 [16]. The echosounder was placed on the seabed at 55 m depth about 750 m off Admiralty Head at the SnoPUD tidal turbine site. The echosounder sampled at 5 Hz for 12 min every 2 h (Table 1). Tidal velocity data were collected once every 10 min by a Nortek acoustic Doppler current profiler (http://www.nortek-as.com) operating at 1 Hz.
At the Fall of Warness, a seabed-mounted acoustic platform containing a multibeam sonar and an EK60 echosounder [17] was deployed at 35 m depth over an 18 day period from June 18 to July 5, 2013. The echosounder collected data at 38 kHz, 120 kHz, and 200 kHz, sampling at 1 Hz (Table 1). Water column mean tidal speeds were modeled from tidal velocity data that were collected every minute using a SonTek/YSI ADVOcean acoustic Doppler velocimeter (http://www.sontek.com/) [17].

Admiralty Inlet
Acoustic data from Admiralty Inlet were processed prior to this study, with processing described in [16] and [18]. Due to a third surface echo in the water column, data values were constrained to 25 m from the seabed, a height corresponding to approximately twice that of the proposed turbines. A volume backscattering strength (Sv) threshold of -75 dB re 1 m⁻¹ (cf. [12], hereafter dB) was applied to remove noise [16]. Data were horizontally binned in 12 min samples and vertically integrated over the 25 m, resulting in 361 datapoints [18].

The Fall of Warness
Acoustic data from the Fall of Warness site were processed in Echoview (version 6.0). Background noise was removed using a post-processing time varied gain noise reduction algorithm [19]. Noise estimates were obtained from three sections of the water column with low water column Sv values (i.e. empty water samples). The average of these three empty water samples, -105.44 dB, was subtracted from all data bins [20]. The data range was constrained to 25 m from the seabed to ensure surface turbulence exclusion and to match the depth of the Admiralty Inlet data. A 12 min temporal block from every two hours was used to match the sample block size used in the Admiralty Inlet data.
Water turbulence was detected using the SHAPES algorithm [21,22], as implemented in Echoview. This algorithm is typically used to detect fish and macrozooplankton aggregations by searching for adjacent pixels with density values above a threshold, and applying a minimum size criterion to groups of pixels. The acoustic threshold, minimum aggregation size, and amalgamation parameters are set by the analyst. Virtual positions, necessary for use of this algorithm with the Fall of Warness data, were created using flow rates derived from ADV data by matching the start and end times of the echogram to the ADV data, and then indexing each second to a corresponding flow speed. The turbulence detection threshold was set to -75 dB, to include all backscatter attributed to pelagic nekton and to exclude particulates. After aggregations were detected, they were classified as turbulence or non-turbulence using depth and length of detected aggregation as criteria (turbulence: mean depth <8 m, thickness >15 m; pelagic schools: thickness <4 m).
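The classification criteria quoted above can be written as a small rule. The following Python sketch is illustrative only: the function name and the "unclassified" label are hypothetical, and it assumes that the two turbulence criteria must both hold, which the text does not state explicitly.

```python
def classify_aggregation(mean_depth_m, thickness_m):
    """Label one detected aggregation using the depth/thickness criteria.

    Assumption: both turbulence conditions are required together.
    Aggregations matching neither rule are returned as "unclassified".
    """
    if mean_depth_m < 8 and thickness_m > 15:
        return "turbulence"
    if thickness_m < 4:
        return "pelagic school"
    return "unclassified"
```

Aggregations that match neither rule are kept separate here so that an analyst could inspect them manually.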

Data alignment and subsampling
To align datasets from the two sites, an 18 day period needed to be selected from the Admiralty Inlet dataset that closely matched times and conditions sampled at the Fall of Warness. Using historical tide charts from NOAA [23,24], the 18 day period was selected so that the lunar phases of the two datasets matched. The start and end times of day in the Admiralty Inlet dataset were also selected to match the start and end times of day in the Fall of Warness dataset.
To enable a direct comparison of samples between sites, the Fall of Warness data needed to be subsampled to match the resolution of the Admiralty Inlet data, where the echosounder sampled continuously for 12 min every 2 h. Since the echosounder at the Fall of Warness sampled continuously over the deployment period, there are 10 possible 12 min sequential time bins that could be chosen to represent each 2 h block. The continuous sampling at the Fall of Warness also facilitated an analysis of how representative each 12 min bin is of a two hour period. The ten mean Sv series of regularly spaced 12 min bins were compared using an ANOVA, and by examining the fit of a Generalized Pareto Distribution (GPD) for each series [9]. The GPD is used to model extreme values, which are exceedances above a threshold. Thresholds of the GPD fits for the 10 series were compared to examine how the choice of data subset (i.e. 12 min bin) affected the robustness of the threshold estimate.
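As an illustration of the subsampling described above, the sketch below splits a continuous sequence of 12 min bin means into the ten regularly spaced series and computes a one-way ANOVA F statistic across them. The function names are hypothetical, the input is assumed to be a plain list of per-bin mean values, and only the F statistic (not its p-value) is computed.

```python
from statistics import fmean

BINS_PER_BLOCK = 10  # a 2 h block contains ten sequential 12 min bins


def split_into_series(bin_means):
    """Series k holds the k-th 12 min bin of every complete 2 h block."""
    n_blocks = len(bin_means) // BINS_PER_BLOCK
    usable = bin_means[: n_blocks * BINS_PER_BLOCK]
    return [usable[k::BINS_PER_BLOCK] for k in range(BINS_PER_BLOCK)]


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

A small F statistic relative to the F distribution with (k - 1, n - k) degrees of freedom would indicate that the choice of 12 min bin has little effect on the series mean.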
The GPD threshold is usually defined using a Mean Residual Life (MRL) plot, which shows the mean excess of values above a threshold as the threshold is increased [8,25]. The optimal GPD threshold is identified as the first value where the curve becomes linear; however, this choice is subjective and, because of noise in the graph, the value is not always obvious. An alternate approach is to smooth the MRL plot using a polynomial kernel density smoother [26] (implemented using the KernSmooth package in R), take the derivative of the smoothed curve, and then select the point where the derivative equals zero [10]. The threshold value obtained by the derivative method for each of the 10 series was compared to the threshold value obtained for the complete mean Sv dataset.
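The derivative-based threshold selection can be sketched as follows. This is not the R/KernSmooth implementation used in the study: as a stated substitution, the MRL curve (mean excess over each candidate threshold) is smoothed with a single global polynomial fitted by numpy.polyfit, and the candidate grid (median to 98th percentile), grid size, and polynomial degree are all assumptions chosen for illustration.

```python
import numpy as np


def mean_excess(values, thresholds):
    """Mean amount by which values exceed each candidate threshold."""
    values = np.asarray(values, dtype=float)
    out = []
    for u in thresholds:
        exceed = values[values > u]
        out.append(np.mean(exceed - u) if exceed.size else np.nan)
    return np.array(out)


def derivative_threshold(values, n_grid=50, degree=5):
    """First candidate threshold where the smoothed MRL slope crosses zero."""
    values = np.asarray(values, dtype=float)
    # Assumed candidate range: median up to the 98th percentile.
    grid = np.linspace(np.quantile(values, 0.5), np.quantile(values, 0.98), n_grid)
    mrl = mean_excess(values, grid)
    x = grid - grid.mean()            # center to keep the polynomial fit stable
    ok = np.isfinite(mrl)
    coeffs = np.polyfit(x[ok], mrl[ok], degree)   # global polynomial smoother
    slope = np.polyval(np.polyder(coeffs), x)
    crossings = np.where(np.diff(np.sign(slope)) != 0)[0]
    if crossings.size == 0:
        # No zero crossing: fall back to the flattest point of the curve.
        return float(grid[np.argmin(np.abs(slope))])
    return float(grid[crossings[0] + 1])
```

Because the zero crossing is located on a discrete grid, the returned threshold is only resolved to the grid spacing; a finer grid or root interpolation could be used if more resolution is needed.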
The effect of varying the amount of data on the threshold estimate for the GPD was also examined. Mean Sv series containing increasing amounts of data were generated, ranging from 10% (one bin randomly selected from each 2 h block) to 90% of the data (nine bins randomly selected from each two hour block). The random selection was repeated 500 times for a total of 4500 series, and derivative-based thresholds were estimated for each series. Variability of threshold estimates was evaluated by comparing the mean threshold estimate using an ANOVA, and by examining changes in the standard deviation of threshold estimates as the amount of data included in the calculation increased.
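The repeated random subsampling can be sketched as below. The threshold estimator is passed in as a function so that any method can be plugged in; the 90th-percentile estimator shown is only a hypothetical stand-in for the derivative-based threshold, and the function names and seed handling are assumptions.

```python
import random
from statistics import fmean, quantiles, stdev


def draw_subset(blocks, k, rng):
    """Pick k of the 12 min bins at random from every 2 h block."""
    return [v for block in blocks for v in rng.sample(block, k)]


def threshold_spread(blocks, k, estimator, n_draws=500, seed=0):
    """Mean and standard deviation of an estimator over repeated random draws."""
    rng = random.Random(seed)
    estimates = [estimator(draw_subset(blocks, k, rng)) for _ in range(n_draws)]
    return fmean(estimates), stdev(estimates)


def p90(values):
    """Stand-in threshold estimator: the 90th percentile of the subset."""
    return quantiles(values, n=10)[-1]
```

Calling threshold_spread for k = 1 to 9 gives the mean and spread of the estimate at 10% to 90% data inclusion, which is the comparison described in the text.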

Metric suite
Ecological indicators are measurable characteristics of three biological attributes: composition, structure, and function. Changes in indicator values can be used to detect ecosystem change in response to disturbances. Composition is defined as the number and variety of elements in a system, structure is the physical organization of a system, and function includes ecological and evolutionary processes [27]. A suite of indicator metrics [cf. 13] was used to quantify variability in vertical distributions of pelagic nekton through time. Density and center of mass were used to monitor changes in ecosystem structure, while dispersion and an aggregation index were used to track changes in ecosystem function.
Density is quantified using mean volume-backscattering strength (i.e. mean Sv, the average of the reflected sound from targets (fish, zooplankton) in the insonified volume of water), which is proportional to biomass density. The aggregation index (unit: m⁻¹) quantifies vertical patchiness and is calculated on a relative scale of 0-1, with 0 being evenly dispersed and 1 being aggregated. Mean Sv and aggregation index are used for extreme value analyses. High aggregation and density values are assumed to be associated with an increased risk of interaction with MRE devices. The center of mass (unit: m) measures the location of the mean weighted acoustic backscatter relative to the seabed. The dispersion (unit: m²) metric measures the spread of biomass around the center of mass, and is analogous to the variance.
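The four metrics can be illustrated with a short calculation on a single vertical profile. The formulas below are an assumption based on commonly used echo-metric definitions (cf. [13]) and are not quoted from this paper: sv_linear is the linear volume backscattering coefficient in each depth bin, heights_m are bin-center heights above the seabed, and dz_m is the bin thickness.

```python
import math


def echo_metrics(heights_m, sv_linear, dz_m):
    """Density, center of mass, dispersion and aggregation for one profile."""
    total = sum(s * dz_m for s in sv_linear)               # integrated backscatter
    mean_sv_db = 10 * math.log10(sum(sv_linear) / len(sv_linear))
    center = sum(z * s * dz_m for z, s in zip(heights_m, sv_linear)) / total
    dispersion = sum(
        (z - center) ** 2 * s * dz_m for z, s in zip(heights_m, sv_linear)
    ) / total
    aggregation = sum(s ** 2 * dz_m for s in sv_linear) / total ** 2
    return {
        "mean_sv_db": mean_sv_db,         # dB
        "center_of_mass_m": center,       # m above seabed
        "dispersion_m2": dispersion,      # m^2
        "aggregation_per_m": aggregation, # m^-1
    }
```

For backscatter spread evenly over 25 one-metre bins this gives a center of mass of 12.5 m and an aggregation index of 0.04 m⁻¹; concentrating all backscatter in a single one-metre bin would raise the index to 1.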

Tidal site comparison
Central tendencies in Sv, center of mass, dispersion metrics, and tidal speed were compared using t-tests for means and F-tests for the variances at an alpha value of 0.05. The aggregation index was not normally distributed and could not be compared with parametric tests, so a Kolmogorov-Smirnov (K-S) test [28] was used to compare mean values, and a Bartlett's test [29] was used to compare variances. Differences in metric values between day (Admiralty Inlet: 06:00-20:00, the Fall of Warness: 04:00-22:00) and night (Admiralty Inlet: 22:00-04:00, the Fall of Warness: 00:00-02:00) were examined using t-tests for means and F-tests to compare variances.
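A minimal sketch of these two-sample comparisons using SciPy is given below. It assumes a pooled-variance t-test (the default of scipy.stats.ttest_ind) and a two-sided variance-ratio F-test; the paper does not state which variants were used, and the function names are hypothetical.

```python
import numpy as np
from scipy import stats


def compare_parametric(a, b):
    """t-test for means and two-sided F-test for variances."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    t_res = stats.ttest_ind(a, b)             # assumes equal variances
    f_ratio = np.var(a, ddof=1) / np.var(b, ddof=1)
    dfa, dfb = a.size - 1, b.size - 1
    p_f = 2 * min(stats.f.cdf(f_ratio, dfa, dfb), stats.f.sf(f_ratio, dfa, dfb))
    return {
        "t_p": float(t_res.pvalue),
        "f_ratio": float(f_ratio),
        "f_p": float(min(p_f, 1.0)),
    }


def compare_nonparametric(a, b):
    """Kolmogorov-Smirnov and Bartlett tests, as used for the aggregation index."""
    ks = stats.ks_2samp(a, b)
    bart = stats.bartlett(a, b)
    return {"ks_p": float(ks.pvalue), "bartlett_p": float(bart.pvalue)}
```

Each p-value would then be compared against the alpha value of 0.05 stated in the text.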
Frequency domain analysis [30,31] was used to compare dominant periodicities in metric values of the two datasets. Periodograms can be used to examine how the variance of a time series is distributed over its frequency components [31]. Peaks in plots of frequency against power, a measure of variance, are used to identify frequencies that contribute to the variance of the time series. Periodograms were generated for metrics at both Admiralty Inlet and the Fall of Warness. Significant frequencies in each periodogram were identified as values exceeding a red noise spectrum (i.e. an auto-regressive process with a memory of 1 [31]). Coherence between any two series is measured on a scale from 0 to 1, with 0 signifying that the two series are significantly different and a value of 1 being that phases and amplitudes are the same for all frequencies [31]. Coherence between periodograms of metric pairs was calculated to compare amplitudes and phases of frequencies between the two sites.
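The periodogram comparison against a red noise spectrum can be sketched with NumPy. The approach below is one common convention and an assumption here: the AR(1) background is built from the lag-1 autocorrelation of the series and scaled to carry the same total power as the periodogram, and any frequency whose power exceeds that background is flagged. No confidence multiplier is applied, since the text does not specify one.

```python
import numpy as np


def periodogram(x, dt_hours):
    """Raw periodogram of a demeaned, evenly sampled series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size
    freqs = np.fft.rfftfreq(x.size, d=dt_hours)
    return freqs[1:], power[1:]               # drop the zero frequency


def red_noise_background(x, freqs, power, dt_hours):
    """AR(1) spectrum scaled to the same total power as the periodogram."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    r = np.sum(x[:-1] * x[1:]) / np.sum(x * x)   # lag-1 autocorrelation
    shape = (1 - r**2) / (1 - 2 * r * np.cos(2 * np.pi * freqs * dt_hours) + r**2)
    return shape * power.sum() / shape.sum()


def significant_periods(x, dt_hours):
    """Periods (hours) whose power exceeds the red noise background."""
    freqs, power = periodogram(x, dt_hours)
    background = red_noise_background(x, freqs, power, dt_hours)
    return 1.0 / freqs[power > background]
```

For a series sampled every 2 h over 18 days, a pure 24 h oscillation falls exactly on a Fourier frequency and is returned as the only significant period.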
To compare possible underlying processes that influence observed patterns, linear regression models were fit to the four metric series at each site. A group of potential covariates was tested in each model: tidal speed (m s⁻¹), hour of day, Julian day, a Fourier series defined by a 4 h period, a Fourier series defined by a 12 h period, and a Fourier series defined by a 24 h period. Fast Fourier transforms provided amplitudes and phases for the Fourier series. The models were fit by forward selection [32], and the fit was evaluated using the Akaike Information Criterion (AIC) [33]. The model with the lowest AIC was selected as the best model. Residual plots were examined to evaluate model fit. Multicollinearity was examined using the variance inflation factor (VIF) [34].
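Forward selection by AIC with periodic covariates can be sketched as follows. As an assumption made for simplicity, each period enters as a sine/cosine pair fitted by least squares, which is equivalent to estimating an amplitude and phase but is not the FFT-based construction described in the text; the function names and the dictionary of candidates are hypothetical.

```python
import numpy as np


def fourier_pair(t_hours, period_hours):
    """Sine and cosine columns for one period (amplitude/phase equivalent)."""
    w = 2 * np.pi * np.asarray(t_hours, dtype=float) / period_hours
    return np.column_stack([np.sin(w), np.cos(w)])


def aic(y, X):
    """Gaussian AIC of an ordinary least squares fit, up to a constant."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    n, k = X.shape
    return n * np.log(rss / n) + 2 * (k + 1)   # +1 for the error variance


def forward_select(y, candidates):
    """Add the covariate that lowers AIC most until no addition helps."""
    y = np.asarray(y, dtype=float)
    X = np.ones((y.size, 1))                    # intercept-only start
    chosen, best = [], aic(y, X)
    remaining = dict(candidates)
    while remaining:
        scores = {
            name: aic(y, np.column_stack([X, cols]))
            for name, cols in remaining.items()
        }
        name = min(scores, key=scores.get)
        if scores[name] >= best:
            break
        best = scores[name]
        X = np.column_stack([X, remaining.pop(name)])
        chosen.append(name)
    return chosen, best
```

Non-periodic covariates such as tidal speed or Julian day can be supplied as one-dimensional arrays in the same candidates dictionary.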
Fits of the GPD to mean Sv and aggregation indices were compared between sites using EVA. Extreme value theory [9,35,36] is a statistical technique used to model the probability and periodicity of extreme values, which are rare values in the tail of a probability distribution. In the peaks-over-threshold (POT) method, extreme values are identified as exceedances above a threshold [9], which follow a GPD. To fit a GPD to data, a threshold is selected and then scale and shape parameters are fit to the data. The GPD was fit to the mean Sv and aggregation index metrics calculated from the Admiralty Inlet and the Fall of Warness backscatter data. The threshold for the GPD fit was selected by computing the derivative of diagnostic plots [10] (see Section 2.3.3). Posterior distributions were obtained for the scale and shape parameters using Markov chain Monte Carlo (MCMC) simulations [9,31]. The MCMC simulation chains were run with one million iterations, where 20% of the iterations were discarded as a burn-in period, and chains were thinned according to the autocorrelation between iterations. Medians of the resulting posterior distributions were used as scale and shape parameters of the GPDs for each metric. The fit of the GPD was evaluated for each metric by computing the sums of squares between the observed density function and the corresponding GPD, and by comparing results between sites. Smaller sums of squares indicate a better GPD fit. Return level (i.e. the value expected to be exceeded, on average, once every associated period [25]) plots were generated for the mean Sv and aggregation indices at both sites. The 95% interval values from the posterior distributions of the GPD parameters were used to generate 95% credible intervals for the return level plots [25]. Shapes of the return level plots were compared qualitatively.
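To make the peaks-over-threshold quantities concrete, the sketch below estimates GPD scale and shape from threshold excesses and evaluates a return level. Two points are assumptions: the parameters are estimated by the method of moments as a lightweight stand-in for the MCMC posterior medians used in the study, and the return level uses the standard GPD expression u + (scale/shape)[(m·rate)^shape - 1], with the logarithmic limit when the shape is zero. Function names are hypothetical.

```python
import math
from statistics import fmean, variance


def gpd_moment_fit(values, threshold):
    """Method-of-moments GPD scale and shape from excesses over a threshold."""
    excesses = [v - threshold for v in values if v > threshold]
    m, s2 = fmean(excesses), variance(excesses)
    ratio = m * m / s2
    shape = 0.5 * (1 - ratio)
    scale = 0.5 * m * (ratio + 1)
    exceed_rate = len(excesses) / len(values)   # fraction of values over threshold
    return scale, shape, exceed_rate


def return_level(threshold, scale, shape, exceed_rate, n_obs):
    """Level exceeded on average once every n_obs observations."""
    m = n_obs * exceed_rate
    if abs(shape) < 1e-9:                       # exponential-tail limit
        return threshold + scale * math.log(m)
    return threshold + scale / shape * (m ** shape - 1)
```

With 12 samples per day, a 1-year return level would correspond to n_obs of roughly 12 × 365 observations.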

The Fall of Warness sample block selection
The ANOVA comparing the ten mean Sv series of regularly spaced 12 min bins per two hour block showed that means of the series were not significantly different (p = 0.7024), suggesting that any of the ten series could be used as a representative dataset for the Fall of Warness site (Fig. 2).
To aid in the selection of a data series, GPD thresholds for the 10 series were compared. GPD threshold values in the ten series differed. Standard deviations (range: 1.95-2.87 dB) and number of significant outliers (1-4) also varied among series (Fig. 2), which could affect the GPD fit as extreme values are those greater than the threshold. The MCMC routine to fit the GPD did not converge for series 3, 6, and 7, which have few outliers and low variance. The mean for the thresholds of the ten series was -75.41 dB. Series 8, with a threshold of -75.63 dB, was selected as the Fall of Warness dataset to be used in further analysis as it had the closest threshold to the mean of the thresholds, and successfully converged to a GPD posterior.
The proportion of data used (i.e. how many 12 min bins per 2 h block) in the derivative method did not greatly affect the mean value of the resulting GPD threshold, but it did affect the variance (Fig. 3). The threshold when all data were included was -75.68 dB. The mean threshold increased slightly as the proportion of data used to estimate the threshold increased (e.g. one 12 min bin per 2 h block = -75.72 dB, 9 bins per block = -75.67 dB), but the overall increase was less than 0.1 dB. The standard deviation of the threshold estimate decreased with an increasing proportion of data used, from 0.63 dB at one bin per block, to 0.045 dB at 9 bins per block. Over 500 iterations, when 10% of the data were used (i.e. 1 bin per block), the threshold estimates ranged from -77.84 dB to -73.79 dB, suggesting that increasing the amount of data used decreases the variance in the threshold estimate. In comparison, the mean threshold estimate from 500 random draws did not change significantly with increasing data proportion above 70%.

Comparison of tidal site characteristics
Similarities of biological characteristics were evaluated by comparing metrics describing distributions of pelagic nekton. Mean Sv, center of mass, and dispersion metrics at both sites displayed a saw-tooth pattern with a low-frequency sinusoidal component (Fig. 4), while the aggregation index series contained low amplitude values interspersed with high amplitude spikes. Values at Admiralty Inlet had larger amplitudes compared to metric values from the Fall of Warness. F-tests showed that standard deviations for the three metrics were significantly (p < 0.05) greater at Admiralty Inlet than at the Fall of Warness (Table 2). Metric means, except for mean Sv, were significantly different between sites. Despite shallower water at the Fall of Warness (35 m) compared to Admiralty Inlet (55 m), the center of mass was, on average, higher (p = 2.2e-16) in the water column at the Fall of Warness than at Admiralty Inlet (Table 2). Dispersion was greater at Admiralty Inlet while aggregation was greater at the Fall of Warness.
Daily patterns in metric values also varied between sites. On average there was greater variability between day and night mean Sv at Admiralty Inlet than at the Fall of Warness (Admiralty Inlet, difference = 2.63 dB, p = 9.15e-07; the Fall of Warness, difference = 1.35 dB, p = 5.54e-04). This may be due to the difference in day (Admiralty Inlet: 06:00-20:00, the Fall of Warness: 04:00-22:00) and night (Admiralty Inlet: 22:00-04:00, the Fall of Warness: 00:00-02:00) lengths. At Admiralty Inlet, the average center of mass descended from 13.8 m above the seabed at night to 10 m during the day (Fig. 5b). The opposite pattern was observed at the Fall of Warness, where the center of mass ascended from 13.8 m at night to 15.2 m during the day. Dispersion was significantly different between night and day at Admiralty Inlet (difference = 4.25 m², p = 2.6e-03), but not at the Fall of Warness (difference = 0.15 m², p = 0.972).
Biomass distribution was predicted to vary with tidal speed. Tides were stronger at the Fall of Warness (mean speed = 1.58 m s⁻¹) compared to Admiralty Inlet (mean speed = 1.18 m s⁻¹). At Admiralty Inlet, mean tidal speeds ranged between 0.5 and 2 m s⁻¹, compared to 0 and 3 m s⁻¹ at the Fall of Warness. At Admiralty Inlet, mean Sv increased with tidal speed (slope = 0.64), but the relationship was not significant (p = 0.481) (Fig. 6a). At the Fall of Warness, biomass density increased significantly with tidal speed, and at a greater rate than at Admiralty Inlet (slope = 1.033, p = 7.28e-5). The center of mass decreased significantly with increased tidal speed at Admiralty Inlet (slope = -1.47, p = 0.0153), but not at the Fall of Warness (slope = 0.04, p = 0.947). Aggregation index values did not change with increasing tidal speed at either site.

Table 2
Means and standard deviations of biological characteristics at Admiralty Inlet (AI) and the Fall of Warness (FoW). All p-values for the mean are from t-tests, while those for the standard deviation are from F-tests, except tests using aggregation index values. For the aggregation index, the p-value for the mean is from a Kolmogorov-Smirnov test and the p-value for the standard deviation is from a Bartlett's test.

Columns: Mean, Standard Deviation.

Metric values at both sites had many of the same dominant periodicities, though not always with similar amplitudes (Fig. 7). For mean Sv at Admiralty Inlet, 24 h was the dominant periodicity (amplitude = 2.5), perhaps highlighting the importance of diel processes at this site. At the Fall of Warness, the importance of the 12 h (amplitude = 0.9) and 4 h frequencies (amplitude = 0.6) indicates the potential importance of tidal over diel processes at this site (Table 3). We discounted the 404 h period as its significance may be due to edge effects [31]. However, amplitudes of the 12 and 4 h frequencies were small compared to the 24 h period amplitude at Admiralty Inlet. Coherence between the two mean Sv metrics was the highest (0.997) among all metrics.
Similarities in amplitudes and values of the significant frequencies for the center of mass at both sites were consistent with the coherence for this metric (0.923). The sites did not share any similar significant periods for aggregation index (Table 3), and aggregation metrics at the two sites had the lowest coherence of all metric pairs (0.378). The two dispersion series did not share any significant periods (Table 3), but the coherence for the dispersion metrics was higher (0.903) than that of the aggregation indices.
Overall, common significant periods were 24 h, 12 h, and 4 h. These periods were used as covariates in linear regressions (Table 4). Amplitudes of significant periods had similar values between sites with the exception of the 24 h period. The greater amplitudes for the 24 h frequency component in the mean Sv and center of mass periodograms at Admiralty Inlet indicate a greater dominance of diel processes at Admiralty Inlet than at the Fall of Warness, which is supported by the pattern in hourly variability (i.e. greater variability at Admiralty Inlet between day and night) of these metrics (Fig. 5).
All linear regression models of the metrics, except for aggregation index, included the 24 and 12 h periods (Table 4). Regression models for Admiralty Inlet and the Fall of Warness mean Sv shared these two covariates, and the Fall of Warness model included two more significant covariates. The mean Sv models for both sites were the only models that included Julian day. Besides the 24 and 12 h periods, the center of mass model for Admiralty Inlet included tidal speed, while the Fall of Warness model included the 4 h period. The aggregation model was the only regression model where the two sites did not have any covariates in common (Table 4). The dispersion model was the same for Admiralty Inlet and the Fall of Warness with the addition of tidal speed in the Fall of Warness model. All model residuals formed a random pattern indicating good model fit.
No VIF for any model covariates was above 5, indicating no severe multicollinearity.

EVA results comparison
The GPD fit did not differ greatly between sites for the mean Sv or aggregation index variables (Figs. 8 and 9). Metric thresholds were similar between sites (Table 5), especially for the aggregation index metric. Scale and shape parameters for both metrics were of the same order of magnitude, with shape parameters positive for mean Sv and negative for aggregation index. The sums of squares between the GPD and the metric density function were used to evaluate the fit of the GPD. The mean Sv and aggregation index from Admiralty Inlet had a better GPD fit (i.e. lower sums of squares) than those of the Fall of Warness (Table 5). It should be noted that there were small differences in the numbers of datapoints over the threshold for the metrics between Admiralty Inlet and the Fall of Warness (difference of 4 datapoints for mean Sv, 1 datapoint for aggregation), which may have affected the sums of squares.
Shapes of return level plots for the two sites were similar for mean Sv (Fig. 8) but not for the aggregation index (Fig. 9). The large gap between the duration of the data (2 weeks) and the maximum prediction interval (up to 10 years) likely explains why the credible intervals widen quickly.

Site comparison
Patterns in pelagic nekton density and distribution between tidal energy sites have not been previously compared. Tidal energy sites are expected to have similar physical characteristics and are chosen because of high tidal flows. However, common physical traits do not assure similarity in biological features. While there are biological dissimilarities between Admiralty Inlet and the Fall of Warness, many features are similar at both sites. For example, with the exception of aggregation, linear regression models of metrics from each site always shared two or more covariates. Density means and coherence values were very similar at both sites. Coherence values for the center of mass and dispersion metrics were similar between sites, suggesting that temporal patterns of pelagic nekton are in phase. Both sites had significant periodicities in metrics that reflected tidal and diel processes, although significant metrics were not consistently associated with the same process at each site (Table 6). Observed similarities in the two sites suggest that a common suite of metrics may be used in biological monitoring programs at tidal turbine MRE sites.
Characterizing the three primary biological attributes (composition, structure, and function [37]) of an ecosystem is challenging. Composition is not an attribute that can be well-addressed using single-frequency, active acoustic data without direct samples such as midwater trawls to identify species. Metrics used in this study enable an assessment of biological structure and function, and are appropriate indices for comparison of two or more sites. This comparative indicator approach could be extended to include physical metrics, such as tidal speed or tidal range, to complete an environmental monitoring program.
Even though species composition was not compared between the two sites, taxonomic composition can be inferred from other studies. Historic studies of fauna around the Fall of Warness are scarce, but fish species likely to be present during the summer spawning season include Atlantic mackerel (Scomber scombrus), Atlantic herring (Clupea harengus), and sprat (Sprattus sprattus) [38]. Other species that are likely to be in the vicinity are haddock (Melanogrammus aeglefinus), ling (Molva molva), saithe (Pollachius virens), and Atlantic cod (Gadus morhua). The Environmental Statement [38] for the Fall of Warness also identifies butterfish (Pholis gunnellus) and scorpion fish (Taurulus bubalis) as non-commercial but important fish species. In the North Sea, long-term data from the Continuous Plankton Recorder show large inter-annual fluctuations in zooplankton biomass [39,40]. The zooplankton community is composed primarily of copepod taxa that serve as the primary prey for commercial fish species such as herring [41]. In comparison, trawls conducted during mobile acoustic surveys at Admiralty Inlet [42] consistently caught Pacific sand lance (Ammodytes hexapterus), northern lampfish (Stenobrachius leucopsarus), copper rockfish (Sebastes caurinus), and Pacific herring (Clupea pallasii). Documented zooplankton taxa at Admiralty Inlet are similar to those at the Fall of Warness, including copepods, hydromedusae, and larval stages of fish and small pelagic crustaceans [43]. Relative species' abundances at the two sites are unknown. Pelagic fish species at both sites (mackerel, sprat, and herring at the Fall of Warness; Pacific sand lance and Pacific herring at Admiralty Inlet) provide a prey base supporting piscivorous fish and apex predators in upper trophic levels [44,45]. Additional data on site-specific species abundance and distribution are necessary for a complete species characterization and comparison of the macrofauna at the two sites.

Fig. 7. Periodograms for the suite of metrics (mean Sv (dB), center of mass (m), dispersion (m²), aggregation (m⁻¹)) at Admiralty Inlet and the Fall of Warness. Significant frequencies (purple dots) occur above the red noise spectrum (red line).

Table 3
Significant periods in periodograms (i.e. values exceeding a red noise spectrum) rounded to the nearest hour with corresponding amplitudes in parentheses.
Columns: Admiralty Inlet, the Fall of Warness.
One primary difference between the two sites is the magnitude of variance in metric values. With the exception of the aggregation index, variance values were greater at Admiralty Inlet than at the Fall of Warness. A possible explanation for the difference is that water flow at Admiralty Inlet is more complex than at the Fall of Warness. Admiralty Inlet is located near the entrance of Puget Sound at the confluence of Deception Pass, the Hood Canal Basin, and the Puget Sound main basin [46,47]. These three water sources have different oceanographic properties (e.g. ocean water from the Strait of Juan de Fuca, fresh water from the Fraser River) and potentially host different fish and zooplankton species, which may increase variability in the species composition between ebb and flood tides at Admiralty Inlet. Conversely, the Fall of Warness is located on an open ocean coast, making water sources during ebb and flood tides more uniform. An alternative explanation is that differences in tidal speeds (mean tidal speeds were significantly greater at the Fall of Warness) could affect biomass distribution variability. Nekton mobility is partially dependent on flow speed of the surrounding medium, with the ratio of nekton locomotory velocity to fluid velocity increasing with body length [48]. Greater flow speeds may result in smaller nekton (especially micronekton < 5 mm [48]) acting as passive particles, possibly causing metric patterns at the Fall of Warness to be more uniform than those observed at Admiralty Inlet. Significant positive relationships between tidal speed and both density and dispersion at the Fall of Warness, which are not seen at Admiralty Inlet, support this hypothesis.

Sampling frequency and the generality of EVA applicability
This study allowed an examination of the proportion of data necessary for conducting an EVA on baseline biological data from MRE sites. As the proportion of data drawn from a series increases when determining a GPD threshold, the precision of the threshold estimate should increase because a higher percentage of the extreme values is included in the estimate [9]. As predicted, including a higher proportion of data reduced variability in the threshold estimate (i.e. threshold estimates from 10% of the data were most variable; threshold estimates from 90% of the data were least variable). At data proportions of 70% or greater, the mean threshold estimate remained stable. This result supports lowering the sampling frequency of monitoring variables only as far as the equivalent of 70% of continuous data sampling, because including a greater percentage of data did not change the mean threshold estimate. At lower sampling frequencies (i.e. between 10% and 70% of continuous data sampling), the standard deviation of the threshold estimate was greater than 0.1 dB over 500 draws, so the threshold estimate from any single sample will be imprecise. If precision of the threshold estimate is a monitoring objective, then sampling should be the equivalent of at least 70% of continuous data sampling.
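The subsampling exercise described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in the study: the series is synthetic, the function names are hypothetical, and the threshold rule (an empirical 95th percentile) is a stand-in because the threshold selection procedure is not specified in this section. Only the design of drawing fixed proportions of the series 500 times follows the text.

```python
import random
import statistics


def high_quantile(values, q=0.95):
    """Stand-in threshold rule (an assumption): the empirical q-quantile."""
    ordered = sorted(values)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def threshold_spread(series, proportion, draws=500, seed=0):
    """Mean and standard deviation of the threshold over repeated subsamples."""
    rng = random.Random(seed)
    size = max(2, int(proportion * len(series)))
    # Each draw samples without replacement, mimicking a reduced sampling effort.
    estimates = [high_quantile(rng.sample(series, size)) for _ in range(draws)]
    return statistics.mean(estimates), statistics.stdev(estimates)


# Synthetic stand-in for a mean Sv series in dB (not data from either site).
data_rng = random.Random(42)
series = [data_rng.gauss(-75.0, 3.0) for _ in range(2000)]

results = {p: threshold_spread(series, p) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
for p, (mean_u, sd_u) in results.items():
    print(f"{int(p * 100):2d}% of data: mean threshold {mean_u:.2f} dB, sd {sd_u:.3f} dB")
```

Under these assumptions the spread of the threshold estimate shrinks as the proportion of data increases, which is the qualitative pattern reported in the text. The numerical values depend entirely on the synthetic series and the stand-in threshold rule.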
Even though return levels and GPD threshold values differed between sites, the process of applying the POT EVA method was successful in each case. This paper illustrates analyses of high extreme values, but the same approach can be used to model low extreme values in physical or biological monitoring data. A shift in an EVA threshold derived using operational monitoring data, compared to that based on baseline data, indicates an increase or decrease in the probability of an extreme event. A relevant example is an increase in the threshold of the Sv metric, indicating an increase in high density events, which may result from the turbine acting as a biological aggregating device. As EVA has not been previously used for biological monitoring at MRE sites, it was important to determine if the approach could be applied in the same way to a second, independent site. Similar proportions of the density and aggregation index datasets were fit to the GPD at both sites. The MCMC diagnostics showed that convergence to a stationary distribution of GPD parameters was achieved at both sites for both metrics. The sums of squares were also of the same order of magnitude for both sites, indicating that the GPD fit the data at both sites. We conclude that EVA can be used as a generic monitoring tool to determine change and biological impacts, defined as relevant change as a consequence of human activity, in monitoring data from tidal energy sites.
The solid line is the best fit, and gray colors indicate credible intervals, from 10% (darkest gray), 40%, 80%, to 90% (lightest gray).
Solid lines indicate best model fit, and gray colors indicate credible intervals, spanning 10% (darkest gray), 40%, 80%, to 90% (lightest gray).
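To make the peaks-over-threshold (POT) steps concrete, the sketch below fits a GPD to threshold exceedances and computes a return level. It is an illustration under stated assumptions and differs from the study in two ways that should be kept in mind: the GPD parameters are estimated with the method of moments rather than with MCMC, so no credible intervals are produced, and the data and the 95th percentile threshold are synthetic stand-ins. The function names are hypothetical.

```python
import math
import random
import statistics


def fit_gpd_moments(exceedances):
    """Method-of-moments GPD estimates (scale, shape); requires shape < 0.5."""
    mean = statistics.mean(exceedances)
    var = statistics.variance(exceedances)
    ratio = mean * mean / var
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)
    return scale, shape


def return_level(threshold, scale, shape, exceed_rate, n_obs):
    """Level expected to be exceeded on average once every n_obs observations."""
    if abs(shape) < 1e-9:
        # Limiting case of the GPD as the shape parameter approaches zero.
        return threshold + scale * math.log(n_obs * exceed_rate)
    return threshold + (scale / shape) * ((n_obs * exceed_rate) ** shape - 1.0)


# Synthetic stand-in for a mean Sv series in dB (not data from either site).
rng = random.Random(1)
series = [rng.gauss(-75.0, 3.0) for _ in range(5000)]

# Assumed threshold: the empirical 95th percentile of the series.
threshold = sorted(series)[int(0.95 * len(series))]
exceedances = [x - threshold for x in series if x > threshold]
exceed_rate = len(exceedances) / len(series)

scale, shape = fit_gpd_moments(exceedances)
level = return_level(threshold, scale, shape, exceed_rate, n_obs=10000)
print(f"threshold {threshold:.2f} dB, scale {scale:.3f}, shape {shape:.3f}")
print(f"return level for 10000 observations: {level:.2f} dB")
```

Repeating the same steps on baseline and operational data would give two thresholds and two sets of return levels that can be compared, which is the kind of shift the paragraph above describes. A Bayesian fit, as used in the study, would additionally provide credible intervals around the scale, shape, and return level estimates.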
Table 5 Summary of Generalized Pareto Distribution fit for mean Sv and aggregation index metrics from Admiralty Inlet (AI) and the Fall of Warness (FoW), with the GPD threshold, median estimates and 95% credible intervals (lower, upper) for scale and shape parameters, and sums of squares for the GPD fit.

Standardizing MRE monitoring
Because tidal energy is a relatively new technology, regulators are unsure of its biological impacts, and decisions on what to monitor have largely focused on site-specific concerns (e.g. salmon and Southern resident killer whales at Admiralty Inlet; harbor seals, Atlantic salmon, and seabirds at Scottish sites). In addition to uncertainty about generic monitoring targets, there is uncertainty about how to monitor environmental variables. The three monitoring plans for current marine hydrokinetic energy projects in the US (Admiralty Inlet, Roosevelt Island, and Cobscook Bay) share broad objectives. For example, fish monitoring includes distribution, abundance, and diversity. Differences in the monitoring plans include the choice of monitoring technologies and the spatial and temporal scales of monitoring. Differences in monitoring methods may reflect differences in objectives but also reflect the perceptions, preferences, and knowledge of those proposing the monitoring plans. Results from monitoring of early tidal energy projects will be useful in identifying the important spatiotemporal scales at which to monitor [18], and the optimal sampling frequency and instrumentation to use when sampling. Standardization of tools and techniques will streamline project development, especially the project application process, which is currently long and expensive (e.g. [49]).
Comparing results among tidal sites is one reason why standardization of data acquisition methods and analysis is so important to MRE monitoring. The primary justification for comparing the Fall of Warness to Admiralty Inlet was that both datasets were collected with seabed-mounted echosounders. It was therefore relatively simple to subsample the Fall of Warness data to match the Admiralty Inlet sample design. This comparison would have been less powerful, and might not have been possible, if data collection had not been so similar. Comparison among sites once tidal energy projects are operational will be crucial in determining whether there are generic impacts from tidal energy development. Site-specific monitoring plans are motivated by the idea that sites differ and need monitoring plans tailored to the biology of each site. This study suggests that not all biological characteristics of tidal energy sites are site-specific. While tidal energy is still in the developmental stage, standardization of monitoring objectives and methods is a viable and necessary goal to facilitate project development and the detection of environmental impacts.