The effect of serial correlation in environmental conditions on estimates of extreme events

In offshore engineering, it is common practice to estimate long-term extremes under the assumption that environmental conditions are independent. However, many environmental variables, such as winds and waves, exhibit correlation over several days. In this work, we consider the impact that this has on estimates of return values of metocean variables, environmental contours and long-term extreme responses. It is shown that methods which neglect serial correlation over-estimate the size of extreme events at a given return period. We introduce a new definition of a sub-asymptotic extremal index, and show how this can be used to quantify the effect of neglecting serial correlation. Simple examples are presented to illustrate why neglecting serial correlation leads to positive bias. We show how the size of the bias is related to the average shape of storm events and the shape of the tail of the distribution of storm peak values, with the latter having the dominant effect. Storm peak distributions with longer tails lead to larger biases when serial correlation is neglected. In the examples presented, neglecting serial correlation resulted in relative errors of over 50% in the 25-year extreme response estimates in some cases. The examples presented show that accounting for serial correlation in estimates of environmental contours and long-term extreme responses can reduce over-conservatism and result in more efficient designs


Motivation
Environmental variables such as winds, waves, currents and surges are serially correlated.Conditions can persist over time scales of hours to days.In the statistics literature, the effects of serial dependence on the modelling of extreme values has been discussed widely (see e.g.Leadbetter et al., 1983;Coles, 2001;Beirlant et al., 2004;Chavez-Demoulin and Davison, 2012).In offshore and coastal engineering, there are inconsistencies between the way in which serial dependence is treated in estimates of univariate extremes, multivariate extremes and long-term extreme responses.For univariate extremes, it is common practice to use methods such as annual maxima or peaks-overthreshold (Jonathan and Ewans, 2013), which remove the effects of serial correlation by considering only peak events that are sufficiently separated in time that they are effectively independent.Other methods, such as the average conditional exceedance rate (ACER) method (Naess and Gaidai, 2009), explicitly treat the serial correlation in a univariate time series.In contrast, multivariate extremes or long-term extreme responses, are often calculated based on the distribution of all observations, under the tacit assumption that sequential observations are independent.
The distribution of all observations is different to distribution of peak events.When the interest is in assessing the probability of structural failure due to an extreme response, the pertinent problem is to estimate how often a storm event will occur in which a given response level is exceeded.If such an event occurs, it does not matter how many times a failure would occur within the storm event.In this case, it is the distribution of peak events that is of interest.However, if we are interested in the total number of times a level is exceeded (e.g. for lowcycle fatigue analysis), then it is the distribution of all observations that is of interest.In this paper, we focus on the problem of estimating the occurrence of extreme events for the purpose of assessing the risk of structural failure.
The main point that we wish to emphasise, is that neglecting serial correlation will result in over-estimates of the occurrence of extreme https://doi.org/10.1016/j.oceaneng.2021.110092Received 26 March 2021; Received in revised form 21 September 2021; Accepted 22 October 2021 events.To illustrate this point, consider the following simple example.Suppose we have a 100-year time series of significant wave height,   , in which there is one storm which exceeds   = 10 m, and all other storms have   < 10 m.Suppose that in the largest storm, there are  hours which exceed 10 m.If each hour is treated as an independent event, then the empirical estimate would be that   = 10 m is exceeded  times in 100 years, on average.Whereas, in fact, there was only one event in 100 years which exceeds   = 10 m.Of course, the actual return period of the largest storm in the record is not known exactly, but the example illustrates the difference between the two interpretations.
It is important to emphasise that neglecting serial correlation in the data does not necessarily lead to bias in the estimate of the distribution of all observations.Rather, the simple example above illustrates that the distribution of all observations is not what is required for estimating the occurrence of extreme events.The distribution of all observations tells us the proportion of time that a level is exceeded.It tells us nothing about how those exceedances are distributed in time, in particular it does not specify the rate of up-crossing of a particular level.
The problem of dependence is also important in the estimation of spatial extremes, and many of the methods to account for spatial dependence are similar to those for temporal (serial) dependence.However, the most common problem in offshore engineering is to estimate extreme events based on a time series of historic observations, so the focus here is on the effect of temporal rather than spatial dependence.
The arguments we present in this work will be familiar to some people in the offshore and coastal engineering communities.However, from various conversations with practitioners involved with estimating both extreme environmental conditions and extreme loads on marine structures, it was clear that the effect of neglecting serial correlation is not widely appreciated in the offshore community.Moreover, a number of widely-used methods for estimating extreme events do not account for serial correlation.For example, environmental contours are usually constructed from the distribution of all observations, without accounting for serial correlation (Haselsteiner et al., 2021).Similarly, estimates of long-term extreme response calculated by integrating the short-term distribution function over the long-term distribution of environmental conditions (e.g.Fogle et al., 2008;Sagrilo et al., 2011;Naess and Moan, 2013), also neglect serial correlation.The various methods for combining short-term response functions with long-term distributions of environmental conditions has been considered by a various authors in the past (e.g.Sagrilo et al., 2011;Forristall, 2008;Mackay and Johanning, 2018c).The current work extends these studies by considering the impact of serial dependence in a range of contexts, including univariate and multivariate problems.
The main aim of this paper is to illustrate why it is important to consider dependence in estimates of extreme events and provide examples which quantify the impact of neglecting serial correlation in practical cases.We also consider the theoretical treatment of serial dependence in the statistical literature, and show how the effect of neglecting serial correlation can be quantified in terms of the extremal index, defined in the following sub-section.The examples are intended to provide a more intuitive understanding of why neglecting serial correlation causes bias, and explain how various factors influence the size of bias.

Statistical treatment of dependence in time series extremes
In the statistics literature, extremes of time-series have been considered by a number of authors over many years.Chavez-Demoulin and Davison (2012) provide a relatively recent review.Informally, the analysis considers a time-series  1 , … ,   such that the marginal distribution of   is common to all choices of  ∈ [1, 2, … , ].It is then assumed that the time-series exhibits serial dependence, with certain conditions.Importantly, the focus of the analysis is typically the estimation of the common marginal distribution  .As discussed above, this is not typically the main concern of the ocean engineer interested in estimation of return values and similar quantities.
For the practical analysis of dependent sequences, some constraints are usually assumed, which limit the long-range dependence at extreme levels.A widely used constraint is the so-called (  ) condition (see e.g.Leadbetter et al., 1983;Coles, 2001;Beirlant et al., 2004).Informally, for sequences which meet this condition, the events   >  and   >  are approximately independent, provided that  is high enough and times  and  are sufficiently separated.
The key metric used to quantify temporal dependence in extreme values is the extremal index, .The extremal index can be defined in a number of equivalent ways.In serially correlated time series, extreme events occur in clusters.Asymptotically, extremes of a stationary sequence can be shown to occur in clusters with mean size 1∕ (Hsing, 1987a;Hsing et al., 1988).An equivalent way to define the extremal index is in terms of the distribution of the maximum values of a sample.Suppose that  1 , … ,   are a stationary process with some level of serial correlation, and with marginal distribution function  .Assuming that  is in the domain of attraction of an extreme value distribution (see e.g.Coles, 2001), the distribution of the maximum of a sample will tend to a limit Pr(max where  is a member of the generalised extreme value (GEV) family.Now consider a sequence of independent variables, X1 , … , X , also with common distribution function  .It can be shown that Pr(max{ X1 , … , X } ≤ ) → G() as  → ∞, where G is also a member of the generalised extreme value (GEV) family, and () = [ G()]  , where  ∈ [0, 1] is the extremal index (see e.g.Coles, 2001, Theorem 5.2).In the case that  < 1, we have () = [ G()]  > G(), so if the serially correlated data are treated as independent, this will result in an over-estimate of the probability of extreme events.
For a series of independent observations we have  = 1, but there are also dependent series for which  = 1.For example, Chavez-Demoulin and Davison (2012) note that linear Gaussian autoregressivemoving average models have  = 1, corresponding to independent extremes at very high levels, despite clustering at lower levels.This highlights a key feature of the extremal index.Coles (2001, p97) notes that ''a series for which  = 1 means that dependence is negligible at asymptotically high levels, but not necessarily so at extreme levels that are relevant for any particular application''.Tawn (1990) proposed a penultimate approximation for the extremal index,   , which depends on the level,  and noted that in general   ≤ .Ledford and Tawn (2003) define the sub-asymptotic extremal index as   () = Pr( 2 ,  3 , … ,   ≤ | 1 > ).A review of methods for estimating the extremal index was presented by Ancona-Navarrete and Tawn (2000), who noted that most methods for estimating  actually estimate   .Some authors have proposed incorporating estimates of   explicitly into inferences of extremes from serially correlated data (e.g.Eastoe and Tawn, 2012).However, most studies concerned with the effect of serial correlation, only consider the asymptotic extremal index, .In the examples presented in Sections 3-5, we show that understanding how   varies with  (or equivalently with return period), is important for various problems in offshore design.In Section 2, we introduce a new definition of a sub-asymptotic extremal index and show how this is related to biases in return values and return periods when serial correlation is neglected.
A multivariate extension to the extremal index has also been proposed (Nandagopalan, 1994).Whereas in the univariate case, the asymptotic dependence is summarised by a single number, in the multivariate case the extremal index is a function, describing the dependence as a function of angle (Beirlant et al., 2004).In the present work, the examples considered can all be reduced to univariate problems, so our focus will be on the univariate extremal index.
Another important metric for dependence in time series extremes is the limiting conditional probability lim →∞ Pr( + > |  > ), which estimates the extent of asymptotic dependence in the time series at lag .This corresponds to the so-called  statistic used in spatial extremes (Coles et al., 1999), and has been referred to as the extremogram (Davis and Mikosch, 2009).For stationary series, the quantity Pr( + > |  > ) − Pr(  > ) will tend to zero when   and  + are independent.This quantity can be plotted as a function of  for various thresholds , to select de-correlation timescales used in declustering routines (Mackay and Johanning, 2018c).The corresponding χ statistic, characterising asymptotic independence can also be estimated (Coles et al., 1999).
For univariate environmental data it is more common to adopt either the annual maxima method or peaks-over-threshold method.In the latter, a suitable declustering scheme is used to select events that are approximately independent, thus ensuring that  ≈ 1.It is important to note that, in the statistics literature, the distribution of quantities such as cluster maxima are considered mainly as a pragmatic approach to obtain the marginal distribution of all observations.In ocean engineering, however, it is often appropriate to focus on the distributions of cluster maxima or storm peak events.

The effect of seasonal and long-term variation
Throughout this work we will be mainly concerned with the effect of serial correlation in environmental conditions over time-scales of hours to several days.Seasonal variation in environmental conditions introduces a longer range dependence into conditions.However, this has a slightly different effect to short-range serial dependence.Provided that storm peaks are sufficiently separated in time, storm peak values of variables such as wind speeds or wave heights, can be considered as independent realisations from a distribution that is dependent on the season.That is, the storm peak value does not dependent on the size of the previous storm peak, but only the time of year when it occurs.Mackay and Johanning (2018c) showed that when the seasonal signal is removed from time series of significant wave height,   , values of storm peak   separated by 4-5 days were effectively independent.
Seasonality can be modelled as a covariate effect.There is some debate about whether it is necessary to model seasonality and other covariate effects in order to accurately characterise the annual (or omni-covariate) distribution (Jonathan et al., 2008;Mackay et al., 2010).Using a non-stationary model can improve the accuracy in some cases, but this is dependent on the particular covariate model used and the strength of covariate influence (Mackay and Jonathan, 2020a).The naive use of non-stationary models, where the data is subdivided into seasonal bins and an independent model is fitted in each bin, will lead to larger biases than using stationary models (Mackay et al., 2010;Mackay and Jonathan, 2020a).
Alternatively, the effect of seasonality can be modelled by first removing the seasonal signal (see e.g.Vanem, 2018).However, this generally requires stronger assumptions than covariate approaches, for example that there is no change in the shape of the distribution over the year and that seasonality only affects the mean and variance.
Longer-term climate patterns such as the El Niño Southern Oscillation (ENSO) or the North Atlantic Oscillation (NAO) can also introduce long-range dependence into environmental conditions.However, as with seasonality, these can also be modelled as covariate effects.While neglecting short-range dependence will always result in a positive bias in estimates of return values, the effect of neglecting covariate effects such as seasonality or long-term climatic variations is case dependent and can result in either positive or negative biases.

Definition of return periods for serially correlated data
Suppose, as above, that  is any observation of a serially correlated sequence in time, with common marginal distribution function  .An equivalent independent sequence, X, also with marginal distribution function  , can be defined by randomising the order of observations in the dependent sequence.Alternatively, inferences from X can be interpreted as those that would be obtained if the serial correlation in the dependent sequence, , is ignored and observations are treated as independent.
We denote cluster maxima in the dependent sequence as   , where it is assumed that clusters are defined such that   are sufficiently separated in time that they are effectively independent.The marginal distribution of the cluster maxima is denoted   .It is assumed there are a fixed number, , observations of  (or X) per year and  cluster maxima on average per year.
The return period of level  for the dependent sequence, , is defined as . (1) Note that if annual maxima of  are considered, rather than cluster maxima, then definition (1) still applies, under the interpretation that   is the distribution of annual maxima and  = 1.In this case, return values can only be defined for  > 1.As it is often useful to consider the 1-year return value, it is useful to define return periods in terms of cluster maxima rather than annual maxima.
The return period of level  for the independent sequence, X, is defined as . (2) Eq. ( 2) is sometimes used (incorrectly) to define return periods and return values for the dependent sequence.Our main concern in this paper will be to quantify the impact of using Eq. ( 2) to calculate return periods and return values when the data exhibit serial correlation.
For serially correlated data, arguably, it is clearer to define return values in terms of the distribution of the maximum value in the return period.If we denote the distribution of the maximum value of  in  years as   () then the  -year return value can be defined as the solution of   () = exp(−1). (3) If we suppose that occurrences of cluster maxima are independent and a Poisson point process with mean rate of occurrence  per year, then   can be written where   () is the Poisson probability mass function, given by   () =  −   ∕!.Noting that exp() = ∑ ∞ =0   ∕!, we can write: Equating (3) and (5) shows that the two definitions of return periods for the dependent sequence, (1) and (3), are equivalent.The use of definition (1) requires the identification of independent cluster maxima, whereas definition (3) does not explicitly include the requirement to define independent events, only the requirement to establish the distribution of the maximum value in  -years.Of course, the problem of estimating   will require some method to account for the serial correlation in the data, for example through declustering to identify independent peaks, the use of annual maxima or through explicit treatment of the time series dependence structure.As will be seen in Section 2, definition (3) is useful when considering the theoretical aspects.However, for the practical examples considered in Sections 3-5, definition (1) is more useful.

Outline of the paper
In Section 2 we introduce a new definition of a sub-asymptotic extremal index,   , which is dependent on return period  .We show how the impact of neglecting serial correlation on estimates of extremes can be quantified in terms   .We then go on to consider three types of problem involving serial dependence.Firstly, in Section 3, we consider the estimation of univariate extremes of an observed variable, such as an environmental state or measured response.Methods for dealing with serial correlation are well understood for this case.However, it is instructive to start by considering this simple case to illustrate the general principle.The examples demonstrate why neglecting serial correlation leads to biases, and how the size of the biases (or, equivalently, the value of   ) are related to the average shape of storm events and the shape parameter for the tail of the distribution of storm peak events.The theory and results developed in Sections 2 and 3 are then applied to more complex problems.In Section 4, we consider the effect of serial correlation in a multivariate context, where extremes are quantified in terms of environmental contours.In the third example, in Section 5, we go on to consider the effect of serial correlation on long-term extremes of a short-term response.In this case, the short-term response is not directly observed, but is specified in terms of a distribution conditional on the environmental state.If the short-term process is not directly observed, then the historic time series of environmental conditions combined with the short-term response function only gives a probabilistic description of historic response, so the methods required to account for serial correlation are slightly different to the case where we are interested in the extremes of an observed variable.Finally, the conclusions of the work are presented in Section 6.

Theoretical aspects
In this section, we demonstrate how the effect of neglecting serial correlation on return values and return periods, can be quantified in terms of the extremal index.In Section 2.1, we introduce a new definition of a sub-asymptotic extremal index, relevant for any return period of interest.In Section 2.2 we show how this new definition of the sub-asymptotic extremal index is consistent with the usual asymptotic definition of the extremal index, introduced in Section 1.2.

Relation between extremal index, return periods and return values at sub-asymptotic levels
As before, suppose that  is a serially-correlated random process, with marginal distribution function  .According to the results discussed in Section 1.2, there exists a value   ∈ [0, 1] such that the distribution of the maximum value in  years is given by   () = [ ()]    , where  is the (fixed) number of observations per year.Here,   is a sub-asymptotic extremal index, with   ≤ .Using definition (3), the return period of , is the solution of The return period for the equivalent independent sequence is the solution of Equating ( 6) and ( 7) and taking logarithms, we have So, the sub-asymptotic extremal index,   , can be interpreted as the ratio of the return period of , calculated under the assumption of independence, to the true return period of level .
Whilst the effect of serial correlation on return periods of a given level is dependent only on   , the effect on return values depends on the shape of the distribution.To illustrate this, we assume that the distribution of the maximum value in  years for the dependent sequence is [ ()]    ≈ (), where  is a member of the GEV family, with cumulative distribution function (CDF) where  + = max {, 0},  = ( − )∕,  ∈ R is the location parameter,  > 0 is the scale parameter and  ∈ R is the shape parameter.Under this assumption, it is straightforward to show that [ ()]  ≈ G() is also a GEV distribution, where G  () = () and the parameters of G are given by So serial correlation has no effect on the shape of the tail of the distribution, or on the upper end point in the case that  < 0, where the upper end point is given by The size of the bias in the  -year return value, caused by neglecting serial correlation, can be calculated by comparing the quantiles of  and G at probability level  = exp(−1).The inverse distribution function for the GEV is given by: for  ∈ [0, 1].Using definition (3) and substituting  = exp(−1) into ( 11) gives where   is the true return value and x is the return value for the equivalent independent sequence.The size of the bias is determined entirely by ,   and .The bias is independent of the location parameter, , and scales linearly with the scale parameter, .The bias in return value estimates is shown in Fig. 1 as a function of  for three values of   .For a given value of   , the bias increases with .
In the discussion above,   is defined in terms of the distributions of the  -year maximum,   ().In practical cases of interest, it is not known a priori how   varies with  .In the simulations presented in Section 3, it will be shown that for a given level of serial correlation around the storm peak, the value of   is strongly dependent on the shape parameter of   .Therefore, quantifying bias in return values in terms of  and   alone does not tell the full story.The relationship between bias and  shown in Fig. 1, must be understood as the relationship for a fixed value of   .The analysis presented in Section 3 indicates that if the average shape of a storm is roughly constant, then   will decrease as  increases.In the case that  < 0, it will be shown that   → 1 as  → ∞, and hence the bias caused by neglecting serial correlation also tends to zero.However, when  ≥ 0, we can have   →  < 1 as  → ∞ and the bias tends to a constant value.

Asymptotic considerations
The discussion above has focused on the effect of serial correlation at sub-asymptotic levels.A somewhat surprising asymptotic result is that as the threshold tends toward the upper end point of the distribution, the conditional distribution of a randomly chosen threshold exceedance converges to the conditional distribution of cluster maxima (see e.g.Hsing, 1987b;Anderson, 1990;Leadbetter, 1991).To state this more precisely, suppose, as above, that  is a serially-correlated random process, with marginal distribution function  , and upper endpoint   ≤ ∞.Under the assumption that  is in the domain of attraction of an extreme value distribution, we have where H = 1 − () and () is a member of the generalised Pareto family.Similarly, for cluster maxima,   , we have To examine the effect that the asymptotic equivalence of these distributions has on estimates of return values, we need to consider the non-conditional probabilities Pr( > ) and Pr(  > ).Denote the threshold exceedance probabilities for all observations and cluster maxima as   = Pr( > ) and   = Pr(  > ).Then Pr( > ) =   H() and Pr(  > ) =   H().Using (1), the return period of level  >  for the dependent sequence is given by Similarly, using (2), the return period of level  >  for the equivalent independent sequence is given by Note that   is the expected number of clusters per year with peak value that exceeds , and   is the expected number of observations per year that exceed .Therefore, for  > , T () where we have used the interpretation that   is the inverse of the mean cluster size at level , noted in Section 1.2.As this is an asymptotic result,   is equal to the asymptotic extremal index  for  →   .So, although the conditional distributions of cluster maxima and all observations are asymptotically equivalent, the difference in the expected number of clusters and observations exceeding a given level, , results in different return periods for the dependent and independent sequences, consistent with the analysis in the previous section.

Effect of serial correlation on univariate extremes
In this section, we consider two examples to illustrate the effect of serial correlation on univariate extremes.In the first example, we assume that the time series consists of a series of independent 'storm' events, where the evolution of the variable over the storm is given by a simple parametric form.The motivation for using a parametric representation of the time series, is that it allows us to adjust the level of serial correlation around the storm peaks.In the second example, we use more realistic time histories, resampled from measured time series.Throughout this section, our interest lies in the extremes of an observed variable (i.e.there is no unobserved short-term variability).The case where we have an unobserved short-term response is considered separately in Section 5.

Examples with parametric time series
In this example, we consider time series of significant wave height,   , but the conclusions from the analysis are applicable to other variables as well.We assume that the time series of   is composed of discrete storm events, where the peak values of   in each storm, denoted    = , are assumed to be independent GEV variables, with CDF given in ( 9).In the present example, we fix the location parameter as  = 4 and scale parameter as  = 1.The duration of each storm, , is assumed to be related to the storm peak, by the fixed relation  = 150 −0.25 hours (a similar relation was reported in Mackay and Johanning, 2018a).The time series of   within each storm is given by the 'power storm' model (Arena et al., 2014): where  0 is the time corresponding to the storm peak.When  = 1, the storms are triangular.When  < 1, the storms have sharper peaks, and when  > 1, the storms have more rounded peaks.The storm shape parameter, , determines the level of serial correlation around the storm peak.For  → 0, the peak becomes narrower, the storm tends to a delta function, with value  at the storm peak and   → 0 away from the storm peak.In this case, the serial correlation close to the peak tends to zero.Conversely, as  → ∞, the storm tends to a constant rectangular shape, with constant   =  throughout the storm.In this case, the serial correlation within the storm is equal to one.We use numerical simulation to generate random storm peaks, , with the values of   over the course of the storm, conditional on    , given by ( 18).An example of random storms generated using the power storm model is shown in Fig. 2. In this example, the shape parameter for the GEV distribution of storm peak heights is  = 0, and the storms have been generated using exponents of  = 0.5, 1 and 2. The distribution of storm peak heights is not affected by the storm shape parameter  (it is determined solely by the GEV parameters), whereas the marginal distribution of   is directly affected by the value of .
Simulations were conducted for three values of the GEV shape parameter, with  = −0.2,0 and 0.1.For each value of , 10 8 random storm peak heights were simulated and hourly values of   within each storm were generated for storm shape parameters  = 0.5, 1, and 2. The total lengths of the simulations were approximately 2 × 10 6 years (this is related to the fixed relationship between  and , noted above).
Fig. 3 shows return values estimated from the independent storm peaks and from all hourly values of   , under the assumption of independence.Return values are shown for return periods up to 10 4 years, where the sampling uncertainty is an acceptable level (see Mackay and Jonathan, 2021).It is clear that both the GEV shape parameter and the storm shape parameter influence the differences between the return values from the storm peak values and the return values calculated from hourly   under the assumption of independent observations.As expected, the effect of neglecting serial correlation has a greater impact when the correlation around the storm peak is higher, corresponding to the higher values of .
For the simulations with  = −0.2 and  = 0.5, neglecting serial correlation has negligible effect for return periods greater than 10 years.The reason for this is that when a storm occurs with a peak height exceeding the 10-year return value, typically there is only one hour in the storm when this occurs, so exceedances of this level are effectively independent.However, for higher values of the storm shape parameter there are typically multiple hours in a storm exceeding the Fig. 2. Simulated time series using parametric storm shapes with random peak heights and various storm shape parameters, .The storm shape affects the marginal distribution of   , but not the distribution of peak heights.

Fig. 3. Top row:
Return values of   calculated using the time series of parametric storms, for various storm shape parameters, , and GEV shape parameter, .Black lines show return values calculated from storm peak values only.Coloured lines indicate return values calculated from all values of   under the assumption of independent hourly observations.Bottom row: estimated sub-asymptotic extremal index,   .(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)true 10-year return level, which are all treated as independent events if serial correlation is ignored.In the latter case, the more correlated observations that are treated as independent, the greater the bias will be.
The GEV shape parameter has a similar effect.In the case that the distribution of storm peaks has a longer tail (i.e. higher values of ), the difference between e.g. the 10-year and 100-year return value is larger.So, when a storm occurs where the peak value exceeds the 100-year level, there will be many hourly values which also exceed the 10-year level.So if these are all treated as independent events then this will introduce bias.In contrast, when the tail of the distribution of storm peaks is shorter, the difference between the 10-year and 100-year level is smaller, so for a storm with peak value exceeding the 100-year level, there will be fewer hourly values exceeding the 10-year level.
Estimates of   are shown in Fig. 3.It is evident that  is dependent on both the storm shape parameter  and the GEV shape parameter .For the cases with  = −0.2,  increases with level for all values of .For the more peaked storms with  = 0.5, the extremal index tends towards 1 for   > 8 m (note that finite sample size effects mean that   is not exactly equal to one).However, for  = 0 and  = 0.1, the extremal index for  = 0.5 tends to a constant less than 1 (note that the upper limit of the -axis has been set at the 10 4 -year return value, so that sampling variability is kept to a reasonable level).So, despite the level of serial correlation around the storm peaks being constant,   varies with the shape of the distribution of storm peak heights.Since   effectively quantifies the level of bias introduced by neglecting serial correlation, the reason for the dependence of   on  is clear, given the discussion above.When  < 0, the GEV distribution has a finite upper end point,   =  − ∕.In this case, for storms which have a fixed shape with a single largest value, there will be only one point in a storm that exceeds a level  as  →   .This means that exceedances of  are independent as  →   and hence   → 1.In contrast, for  ≥ 0 and fixed storm shape,   will tend to a finite value.
The examples presented in this section are relatively simplistic, but serve to illustrate how the storm shape and the distribution of the storm peak values both influence the bias related to neglecting serial correlation.In the next section, we use more realistic examples to illustrate the effect of serial correlation in real datasets.

Examples with resampled time series
In this section, we illustrate the effect of serial correlation in a dataset for a location in the central North Sea (dataset E from the recent benchmarking exercise for environmental contours, described in Haselsteiner et al., 2019).The dataset consists of a 50 year time series of hindcast values of   and wind speed at 10 m above water level,  10 , at hourly intervals.If we used the original 50-year time series to compare empirical estimates of return values from declustered peaks and all hourly observations, then the largest return values will coincide from both methods.However, this is an effect of finite sample size, which will happen for any sample, since the largest storm peak is, by definition, equal to the largest observation.
To circumvent this problem, we can resample the time series to create synthetic time series of arbitrary length.We use an approach similar to that outlined in Mackay and Jonathan (2020b), although here we consider each variable on its own, rather than modelling their joint distribution.The method is as follows.The time series of   and  10 are declustered to identify local peaks which are approximately independent, with peaks defined as local maxima within a moving window of length ±5 days, to ensure peaks are separated by a minimum of 5 days.(Changing the minimum separation time used for the declustering mainly affects the isolation of lower peaks which have little influence on the extremes).Each time series is then partitioned into blocks where the peak values are assumed independent.The boundaries between the blocks are defined as the minimum value between adjacent peaks.This definition is somewhat arbitrary, but ensures that large values are not located close to block boundaries, so we do not separate one 'event' into separate blocks.Fig. 4 shows examples of how the time series of each variable are partitioned into blocks.
The storm-peak values are modelled with a GEV distribution, with CDF given by (9).It is more common to model the storm-peaks exceeding some threshold level with a generalised Pareto (GP) distribution.However, here we wish to fit a distribution to all storm-peak values, which can be used for simulating artificial time series.This could be achieved by using a two-part model with a parametric distribution for the body of the distribution and a GP model for the tail.The problem with this approach is that it is difficult to ensure continuity of the density function on the boundary between body and tail.The GEV gives a simple, flexible model for all storm-peak values, and avoids the need to fit a two-part model.In practical cases of estimating univariate extremes, this would not be necessary, as only a model for the tail is required.
Fig. 5 shows the fitted GEV distributions and parameter estimates for the peak values of   and  10 .Whilst the GEV distribution is asymptotically justified as a model for maxima of blocks of constant size and sufficient length, the model diagnostics for the current data suggests that it is still a reasonable model for both variables in this example, despite the non-constant block sizes and finite block length.It should be emphasised that the aim here is not to fit an optimal model for the data or to advocate that data should be modelled in this way in general.The aim is to fit a simple parametric model for all storm peak values, which can be used to generate synthetic time series that are reasonably realistic.Fig. 5 indicates that the GEV distribution is a reasonable model for the peaks for this purpose.
To simulate synthetic time series, we simulate random storm peaks and resample and rescale the observed blocks to match the simulated peak heights.However, the correlation structure of the observed time series is dependent on the block-peak value.To illustrate this, we can consider the mean storm shape as a function of peak level.If hourly values are treated as independent, then the order in which they occur within each block does not matter.So we can re-order each block into a descending sequence, to examine how many hours there are where   or  10 is close to the peak value.Fig. 6 shows the average shape of the ordered blocks (or 'storms'), when storms are binned by the peak value.The average storm shape is similar for both   and  10 .In both cases, the larger storms decay more rapidly away from the peak.To preserve this effect, for each simulated storm peak, measured blocks are resampled at random from the five blocks with the closest peak values.This does not provide a way to extrapolate the shape of the storm outside the range of observations, but provided the extrapolation is not too large, the results should be representative.It should be noted that although there may be covariate effects on the distribution of storm peak   or  10 in the original data (e.g.dependence on season or direction), by design, there are no covariate effects in the simulated time series.A total of 10 6 random storm peaks were simulated from the fitted GEV distributions for  10 and   .This gave a total length of approximately 30,000 years of hourly records.Fig. 7 shows the return values of   and  10 from the peak values and the hourly values from the resampled time series.Return values are shown for return periods between 0.1 and 100 years, for which the sampling uncertainties in the return values are very low from the 30,000-year time series (see Mackay and Jonathan, 2021).For   there is a difference of more than 2 m at the 1-year level and around 1 m at the 100-year level.For  10 the difference at the 1-year level is around 1.7 m/s, but at the 100-year level, the difference is relatively small (around 0.1 m/s).Fig. 6 indicated that the level of serial correlation around the storm peaks was similar for winds and waves at this location.The difference between the effect of neglecting serial correlation is therefore related to the shape of the tail of the distribution of storm peaks.The tail of the distribution of   10 is shorter than the tail for    (i.e. the GEV shape parameter was more negative, as shown in Fig. 5); as discussed in the previous section, this results in exceedances of higher levels being effectively independent.It is important to note that these conclusions about neglecting serial correlation are particular to this example.In other locations, where the shape of the distribution of storm peaks is different or where the average ordered storm shape is different, then the effect of serial correlation would change accordingly, as indicated by the simple examples in the previous section.

Effect of serial correlation on environmental contours
Environmental contours (EC) are widely used as an approximate method of estimating the long-term extreme response of a marine structure, based on response calculations for a small number of design conditions.In the EC method, the probabilistic description of the environment is separated from structural design, by making certain assumptions about the region of the environmental variable space where the structure fails (the failure region).The most commonly used EC approach is the inverse first-order reliability method (IFORM) (Winterstein et al., 1993).In this method, it is assumed that the failure surface (the boundary of the failure region) can be approximated by a straight line in two dimensions (or hyperplane in higher dimensions) at the design point (the point on the failure surface with the highest probability of occurrence).The IFORM approximation to the failure surface is illustrated in Fig. 8.The method can be applied in higher dimensions, but we will restrict our attention to the two-dimensional case.
Typically, short-term variability in the response is not considered directly in the EC method.Instead, the short-term response is evaluated at a fixed quantile.Often the median value of the distribution of the maximum response in each environmental state is used, but some design standards recommend using a higher quantile to compensate for neglecting the effect of short-term variability -see e.g.Winterstein    (2017).This is equivalent to assuming that the short-term response function is deterministic, with the value given by the fixed quantile of the distribution of the maximum response in each environmental state.
Under these assumptions, the problem of estimating extreme structural responses is reduced to estimating the probability that environmental conditions fall within the half-plane region defined by the linearised failure surface.Since the location of the failure region is not known, the -exceedance environmental contour is defined by the intersection of all half-plane regions containing a probability 1 − .The definition of these regions is not unique.In the original IFORM method (Winterstein et al., 1993), the regions are calculated by applying the Rosenblatt transformation to the joint density function, which maps the original variables to independent standard Gaussian variables.In the Gaussian space, the half-plane areas containing a probability 1− are at a constant radius of  −1 (1 − ) from the origin, where  is the CDF of the standard Gaussian distribution.Huseby et al. (2013Huseby et al. ( , 2015) ) noted that it was not necessary to apply the Rosenblatt transformation to calculate these exceedance regions and proposed that the boundary of the half-plane regions could be calculated in the original parameter space.
Both the original and modified IFORM methods can be calculated by projecting the variables onto lines at various angles, , to the origin (either in Gaussian space or the original parameter space) and calculating the 1 −  quantile for each projection angle.The problem of calculating multivariate exceedance regions is therefore reduced to a series of univariate analyses of deterministic variables.(Note that this is different to the cases considered in Section 5, since shortterm variability is not considered directly in the EC method).In both it is only the proportion of observations a region that is accounting for correlation in the data.As such, the positive biases from neglecting that were discussed in the previous section, also affect environmental contours calculated in this way.However, since IFORM contours are defined in terms of univariate exceedance regions, the techniques of univariate analysis can be applied here as well.Derbanne and de Hauteclocque (2019) proposed using a POT method on projections of the data onto lines at given angles, in the original variables space.In their method, the projected data are first declustered to identify independent peaks, and a generalised Pareto (GP) model is fitted to the declustered peak values at each angle.
To illustrate the effect of serial correlation on construction of environmental contours, we will use another dataset from the recent benchmarking exercise (Haselsteiner et al., 2019).The dataset is a 30year time series of hourly measurements of significant wave height,   , and zero-upcrossing period,   , taken from NDBC buoy 41009, located of the coast of Florida, US.We consider the construction of IFORM contours in the original parameter space, following the approach of Huseby et al. (2013).Fig. 9 shows some steps involved in the construction of the contour.An extract of the time series when the data are projected onto a line at  = 45 • to the origin is shown (where the projected variable is defined by  =   cos  +   sin ), together with peaks defined as local maxima within a moving window of length ±5 days.Rather than fitting a GP model to the projected data, we consider the 1-year empirical return values, for which the sampling variability is reasonable for a 30-year dataset (see Mackay and Jonathan, 2021).The empirical return values from the peaks and all observations are shown in panel (c), and the boundaries to the half plane regions defined by the return values of the projected data are shown in panel (b).Another example for projections at  = 15 • is shown in panels (d) and (e).
For these projection angles, the tail of the distribution of storm peaks is approximately exponential (evident from the approximately linear relationship between return values and return periods, when plotted on a semi-log scale).As shown in the previous section, this results in a large difference between the 1-year return values from the peak values and all observations.By definition, the empirical estimates of return values coincide for the largest storm peak and largest observation in the 30-year dataset.However, as noted in the previous section, this is a finite sample size effect that will occur for any sample.
Fig. 10 shows the 1-year empirical IFORM contours for the Florida dataset, constructed from all observations and from peak values only.The contours are constructed as the intersection of the non-exceedance regions at projection angles  = 10 • , 20 • , … , 360 • .The effect of sampling uncertainty is evident in some of the 'loops' at various points on the contours (see Huseby et al., 2013 for a discussion of this effect and Vanem, 2019 for a discussion of improved sampling methods from fitted models).However, the differences between the two contours is clear.There is a large difference for the highest values of   and   , with nearly a 3 m difference in the maximum value of   and nearly 3 s difference in the maximum value of   .This is related to the long tails of these distributions, discussed in the previous section.In contrast, the differences between the two contours on the low period side is much lower, since the tail of the distribution of the projected data is shorter for these angles, due to the physical limit imposed by steep waves breaking.
As discussed in the previous section, the effect of neglecting serial correlation may decrease for higher return periods (e.g. when  < 0).However, the size of the effect and how it varies with return period will depend on the particular joint distribution of interest.Further discussion of the effect of neglecting serial correlation on response estimates from environmental contours is provided in de Hauteclocque et al. (2021).
Finally, it is important to note that neglecting serial correlation also affects defined in terms of 'total exceedance' probability, the probability than an observation falls outside the include density region contours (Haselsteiner et al., 2017) and inverse second-order reliability contours (Chai and Leira, 2018).As standard IFORM contours, highest density and ISORM contours are calculated in terms of the proportion of data outside a region, which tells us nothing about how these occurrences are distributed in time.As far as the present authors are aware, currently, no method has been proposed to account for serial correlation in highest density or ISORM contours.Given that 'total exceedance' contours are significantly more conservative than contours defined in terms of marginal exceedance probabilities (i.e. the various types of IFORM contour) (Mackay and Haselsteiner, 2021), the effect of neglecting serial correlation will introduce yet further conservatism.

Effect of serial correlation on long-term extreme responses
In this section we consider the problem of how to calculate the extremes of a short-term response that is conditional on the environmental state.The short-term response can represent a range of processes, such as sea surface elevation, the motion response of an offshore structure, or the load on a particular part of a structure.The environmental conditions are assumed to vary on a much slower time scale than the short-term response, such that the environmental condition can be considered stationary for a period of the order of a few hours, during which there are many cycles of the short-term response.
The key difference between this problem and the univariate cases discussed in Section 3, is that, typically, the short-term response is not directly observed, but is described in terms of a previously-established distribution, conditional on the environmental state.In this case, a record of historic environmental conditions only enables a probabilistic description of the historic response.
To illustrate why this problem is different to the cases considered in Section 3, consider the problem of estimating the long-term extreme individual wave height.Fig. 11 shows a time series of   over the course of a storm, together with 1-hour maximum wave heights (simulated assuming a Rayleigh distribution of wave heights and 3600∕  waves per hour).In this example, the maximum individual wave height in the storm does not occur at the time of the storm peak   .In the examples in Section 3, we were interested in the distribution of the peak values of an observed variable, and we only needed to consider the peak value of the variable in each storm.In the present example, it is clear that if we only consider the peak   in a storm, then we risk missing the maximum individual wave height.So in the case where we do not directly observe the short-term response, we need to account for all environmental conditions over a storm.
There is a common misconception about the need to account for serial correlation in the estimation of long-term extreme responses, related to the difference in the correlation time scales for the short-term response process and the environmental variables.In a stationary environmental state, the correlation time scales for most response processes are relatively short, and it is reasonable to expect that peak responses separated by 1 h would be independent.However, since environmental conditions are non-stationary and correlated over time scales of hours to days, this introduces correlation into the values of peak responses in adjacent 1-hour periods.This effect is apparent in the lower panel of Fig. 11, where the serial correlation in 1-hour maximum wave heights over the storm is clearly visible.Fig. 12 shows a one year time series of   and 1-hour maximum wave height.In this figure, the dependence of the 1-hour maximum wave height on   is very clear, illustrating how the serial correlation in the environmental state introduces serial correlation into the short-term maximum responses.
We start by presenting a brief review of methods for calculating long-term extreme response in Section 5.1.Some examples of the effect of neglecting serial correlation on estimates of long-term extreme responses are then presented in Section 5.2.

Methods for calculating long-term extreme response
Methods for calculating the long-term extreme response are usually based on calculating the distribution of the maximum response in a random event, where the event is either a storm, a single short-term condition, or a single peak.The long-term distribution is then formed under the assumption that each event is independent.We shall refer to the methods as the 'all-peak', 'short-term maxima' and 'storm-based' methods.Comparisons of all-peak and short-term maxima methods were presented in Sagrilo et al. (2011).Further comparisons with storm-based methods were presented in Forristall (2008) and Mackay and Johanning (2018c).Here, we present a brief overview of the methods.As storm-based approaches are less widely used, we present more detail on these methods.In the following, we denote the vector of parameters representing the environmental state as  and the shortterm response peaks as .Here we use the term 'peaks' to distinguish between local peaks (e.g.crest heights) and the continuous short-term response (e.g.surface elevation).The distribution of response peaks conditional on environmental state is denoted  | (|) and the joint density function of environmental conditions is denoted   ().

All-peaks methods
Methods for calculating the long-term distribution of all short-term response peaks have been proposed in Nordenström (1969), Battjes (1970) and Tucker (1989).In this work, the long-term distribution of all short-term peaks is calculated using the Battjes method (Battjes, 1970).In this method, the probability that a randomly chosen response peak does not exceed a level, , is given by where () is the average number of response peaks in condition  and N is the average number of response peaks in any environmental state chosen at random, given by: The distribution of the maximum response in a  -year period is calculated under the assumption that all response peaks are independent: where  = 365.25 × 24∕ is the number of environmental conditions per year, and  is the duration of each condition in hours.

Short-term maxima methods
Two methods are commonly used for calculating the long-term distribution of the maximum response in each environmental condition.One is based on ergodic mean (Naess, 1984;Krogstad, 1985) and the other based on an average over the probability of occurrence of environmental states.The two methods give almost identical results for return periods of interest for offshore design (see Sagrilo et al.,Fig. 11.Example of   and random 1-hour maximum wave heights over a storm.The 5% and 95% quantiles of the distribution of the 1-hour maximum wave height are also shown, assuming that individual wave heights follow a Rayleigh distribution.In this example the maximum individual wave height does not occur at the storm peak   .2011; Mackay and Johanning, 2018c).For consistency with the allpeaks and storm-based methods, we shall use the definition in terms of the probabilistic average.Denote the distribution of the maximum response in environmental state  as    | (|).Then the distribution of the maximum response in any environmental condition selected at random, is given by The distribution of the maximum response in a  -year period is calculated under the assumption that the short-term maximum responses are independent: As discussed above, short-term maximum responses exhibit serial correlation, so (23) will overestimate the probability of occurrence of large response values.We note that ( 22) is a valid way of calculating this distribution of all short-term maximum responses (given the caveats about the differences between the ergodic mean and probabilistic mean, mentioned above).However, since short-term maximum response are serially-correlated, this distribution is not what is required for calculating long-term extreme responses.Calculating long-term extreme responses from    (), is analogous to calculating long-term extremes of an environmental variable, , based on the distribution of all observations,   (), without accounting for serial correlation.
As discussed in Sections 1-3, this leads to a positive bias in estimated return values.

Storm-based methods
Storm-based methods are derived in a similar way to the all-peaks or short-term maxima methods.We start by calculating the distribution of the maximum response in any storm selected at random,    , where SP denotes storm peak.The long-term response distribution is then calculated under the assumption that storm peak responses are independent: where  is the expected number of storms per year.Storm-based methods can be classed as either ''equivalent storm'' methods or Monte Carlo methods.In equivalent storm methods, the distributions of the maximum response in measured storms is parameterised in terms of a statistically equivalent storm and    is calculated in an analogous way to the distributions of all peaks or all short-term maxima.In Monte Carlo methods,    is estimated directly from the data, removing the need to define equivalent storms.

Equivalent storm methods:
Although sequential environmental states are correlated, the extreme responses in adjacent environmental states are dependent only on the environmental conditions and not on the extreme response in adjacent environmental states.That is, the sequence of values of  , are independent realisations from    |  , but the sequence   is serially correlated.The correlation in   introduces correlation into sequence of values of  , (see Fig. 11).
Given that    |  are conditionally independent, the distribution of the extreme response over a measured storm (MS) can be calculated as where  0 and  1 are the indices of the start and end of the storm.
There is a potential area for misunderstanding here.In ( 25), the distribution of the maximum response in the measured storm is calculated as the product of the conditionally-independent distributions of maximum responses in each short-term condition.We could extend this argument to say that the distribution of the maximum response over a given year is the product of the distributions of maximum responses in each observed short-term condition in the given year.Since this distribution is calculated as a product, the order of the terms in the sequence does not matter.Based on this, some authors argue (incorrectly) that serial correlation therefore cannot affect longterm extreme responses, since the product could be rearranged into any sequence without affecting the result.However, this argument only applies to an observed time series.As discussed in Sections 1-3, if a year of observations was simulated from the distribution of all observations, without accounting for serial correlation, then this would overestimate the frequency of occurrence of large values.The method applied in (25), for calculating the distribution of the maximum response in an observed, serially-correlate time series of environmental variables, therefore cannot be extended to uncorrelated sequences of environmental variables simulated from   (𝐬).
Two types of method have been proposed for parameterising the distribution of the maximum response in a storm.One approach to parameterising    | is to model the temporal evolution of environmental states in a storm using a simplified geometric form, such as a triangle (Boccotti, 1986(Boccotti, , 2000;;Arena and Pavone, 2006), power law (Fedele and Arena, 2010;Arena et al., 2014) or exponential law (Laface and Arena, 2016).In this approach, the temporal evolution of only one variable is modelled (such as significant wave height or wind speed) and all other variables are modelled as taking a single value, conditional on the dominant variable.If storm peaks are sufficiently separated in time, the order in which environmental states occur within a storm does not matter.Each measured storm can therefore be rearranged into a monotonic sequence, for which a simple parametric form is a reasonable approximation for the largest values.Temporal evolution methods were derived for calculating return periods of wave and crest heights, for which   has the dominant influence on the shortterm distribution.For responses such as loads or motions of floating structures, which are dependent on multiple variables, modelling only the temporal evolution of a single variable may not provide an adequate model for the distribution of the maximum response in a storm.
The alternative is to define an equivalent storm as a parametric statistical distribution, fitted to    | for each measured storm.Tromans and Vanderschuren (1995) proposed to model the distribution of the square of the maximum response in a storm as a Gumbel distribution.A generalisation of this method was proposed in Mackay (2017) and Mackay and Johanning (2018a), where    | is modelled using the GEV distribution.
For both the temporal evolution methods and distribution modelling methods, the distribution of the maximum response in any storm selected at random is calculated as where Ω is the vector of equivalent storm parameters and   () is the long-term joint density function of equivalent storm parameters.Often, the integral over the joint density of the equivalent storm parameters can be simplified by taking the mean value of some equivalent storm parameters conditional on the others, for example, taking the mean GEV shape parameter and modelling the scale parameter as conditional on the location parameter.Details are provided in the works cited above.

Monte Carlo methods:
Broadly speaking, two types of Monte Carlo methods have been proposed for combining long-term and short-term distributions.One type uses only the observed environmental conditions; we shall refer to this as MC1.The other uses some method to generate random time series of environmental conditions, which we will refer to as MC2.Mackay and Johanning (2018b) proposed an MC1 method where a random realisation of the maximum response in each environmental state is generated.A POT analysis is then conducted, with the generalised Pareto (GP) distribution fitted to the random storm peak responses.This process is repeated a large number of times and the results are averaged over all trials.It should be noted that results will differ, depending on whether the GP parameters are averaged and return values are calculated, or the return values are calculated for each trial and then averaged.The difference is due to the nonlinear relationship between the GP parameters and return values.Methods used for estimating GP parameters and quantiles vary in the bias and variance (see e.g. de Zea Bermudez and Kotz, 2010;Mackay et al., 2011).Whether it is appropriate to average GP parameters or return values depends on the parameter estimation method.Jonathan et al. (2021) showed that if maximum likelihood was used to estimate GP parameters, then averaging return values over each trial results in a lower bias than calculating return values from the average GP parameters.However, if the empirical Bayesian method (EBM) (Zhang, 2010) is used to estimate GP parameters, then using the average GP parameters to calculate return values results in a lower bias.
Various types of MC2 methods have been proposed.The methods generally involve three steps: (1) a method to partition the observed time series into a sequence of non-overlapping storm events; (2) a joint model for the storm peak variables; and (3) a model for the distribution of variables conditional on the storm peak values.For multivariate problems, the peak values of each variable within a storm may not occur simultaneously.There is therefore some choice in how to chose the values used for the joint model of storm peak variables.Some practitioners opt to identify one dominant variable, such as significant wave height or wind speed, and select the values of the other variables at the time of the peak value of the dominant variable (e.g.Ewans and Jonathan, 2008).In other works, 'characteristic' values of variables over the course of a storm are defined (Hansen et al., 2020).Alternatively, the peaks values of each variable within a storm can be modelled, where the peak values are not required to occur simultaneously (Mackay and Jonathan, 2020b).To obtain a model for the joint distribution of variables, conditional on the storm peak values, measured storms can be resampled and rescaled (Ross et al., 2017;Hansen et al., 2020;Mackay and Jonathan, 2020b).Alternatively, a model for the time series evolution around the peak values can be estimated (Tendijck et al., 2019).
The MC1 method is simpler to implement than MC2, as it only requires the short-term response function and a standard univariate POT analysis.The disadvantage of MC1 is that a separate analysis is required for each response of interest.It also assumes that response function does not change significantly outside the range of observations.Some responses, such as mooring line snatch loads or wave-in-deck loads may exhibit a sharp change at some level.In comparison, MC2 is more complex to implement, but has the advantage that the extreme value analysis of environmental conditions can be separated from the extreme response analysis, and no assumptions are required about how the response function behaves outside the range of observations.

Short-term response model
We use response functions for the vertical bending moment (VBM) and roll motion of eight ships, considered in de Hauteclocque et al. (2021).The normalised VBM and roll response amplitude operators (RAOs) are shown in Fig. 13. 1 The natural periods for the roll responses are between 7.5 s and 25 s.Although, only two types of ship response are considered, roll is typical of many resonance driven responses, with a sharp-peaked narrow-band response.In contrast, VBM has a broader bandwidth.Thus the roll and VBM responses used here can be considered as representative of a wider range of responses of offshore structures.
The response spectrum is given by where  | () is the surface elevation spectrum in sea state .In the examples in this section it is assumed that each sea state has a JONSWAP spectrum with peak enhancement factor  = 1.5.The significant response is defined as   = 2 √  0 , where   is the th moment of the response spectrum Under the assumption of a linear, narrow-band response, peak responses follow a Rayleigh distribution (Naess and Moan, 2013): If we make the simplifying assumption that all short-term response peaks in a sea state are independent, then where  =  × 3600,  = √ ( 2 ()∕ 0 ())∕2 is the mean number of peaks per unit time, and  is the duration of the sea state in hours.In reality, consecutive response peaks are correlated due to the grouping of waves.In practice, more advanced methods could be used to estimate    | (|), which do account for short-term correlation in response peaks.However, this simple model will suffice for the purpose of illustrating the effect of serial correlation in environmental conditions.

Time series model
To simulate arbitrarily long time series of   and   , we use the storm resampling approach described in Mackay and Jonathan (2020b) (in the notation of Section 5.1.3,this is an MC2-type model).The method for dividing the time series into discrete storm events is very similar to that described in Section 3.2.The details of the joint model for storm peaks are not repeated here, but is described in Mackay and Jonathan (2020b).The method was applied to three datasets of   and   , provided in the recent environmental contour benchmarking exercise (Haselsteiner et al., 2021).Comparisons of the observations and empirical isodensity contours for the resampled data are shown in Fig. 14 and further comparisons are presented in de Hauteclocque et al. (2021).Although there are some differences between the observed and simulated distributions, the simulated data are sufficiently representative of real datasets to illustrate the effect of neglecting serial correlation in the 'all-peaks' and 'short-term maxima' methods.

Results
For each dataset, 10 4 years of synthetic hourly time series of   and   were simulated.For each hour in the records, a random maximum response was simulated for each of the 16 response functions described in Section 5.2.1.Return periods were calculated from the empirical distribution function (EDF) for the simulated storm peak values (the maximum response in each resampled block) and for hourly values under the assumption of independence.Examples of return values of VBM for ship R05 are shown in Fig. 15.For dataset A, the effect of neglecting serial correlation is negligible for return periods above 10 years.In contrast, for datasets B and C, there are relatively large   Return values of VBM calculated using the all-peaks method are also shown in Fig. 15.Instead of generating time series of all peaks within the 10,000 year datasets, a discrete analogue of ( 19) was applied to the hourly time series to calculate the distribution of a randomly chosen response peak: where  = 10 4 × 8760 is the number of hourly sea states in the 10,000year time series.Return values were then calculated as described in Section 5.1.1.For dataset A, there is negligible difference in the return values for the all-peaks and short-term maxima methods above the 1year level, whereas for datasets B and C there are still some differences at the 10-year level.Mackay and Johanning (2018c) showed that if the distribution of the maximum short-term response is formed under the assumption that all peaks are independent, then the return values from the all-peak method will converge to those from the short-term maxima method as  → ∞.This is because both methods make the assumption of no serial correlation between individual peaks or sea states.
It should be noted that after the random maximum response in each hour is simulated, we now have a univariate time series of observed variables, so the situation is identical to that discussed in Section 3. Therefore, for a given response, the effect of neglecting serial correlation will depend on the shape of the tail of the distribution of storm peak response and the average shape of the 'storms', where in this case 'storms' refer to clusters of large responses.Moreover, note that for a given dataset both the shape of the tail of storm peaks and the mean storm shape will depend on the particular response function.
To summarise the results over all datasets, Fig. 16 shows the subasymptotic extremal index   for each of the response functions and each of the datasets.In this plot,   is calculated as the ratio of the return period from the short-term maxima method to that from the storm peak method.There are similar trends for each response function.For dataset A,   increases to close to 1 at  = 100 years, whereas for datasets B and C, there is more scatter, but lower values on average.
To assess the influence of the shape of the tail of storm peak response    , Fig. 17(a) shows a scatter plot of   for  = 25 years against the ratio  10 ∕ 100 , where   is the  -year return value of the response.The ratio  10 ∕ 100 is a proxy for the shape of the tail of the distribution of    | around the 25-year return level, with lower values of the ratio corresponding to higher values of . 2 The relatively tight banding of the results over all 16 responses evaluated for the three datasets, indicates that the mean storm shape is fairly constant for each case.The colour of the points in Fig. 17 indicates the period of the peak response.It is apparent that the longer period responses tend to have distributions of storm peak response with longer tails, and hence small values of  10 ∕ 100 .This is related to the shape of the joint distribution of   and   , discussed in Section 4. For the lower period responses, the tail of the distribution of storm peak response will be shorter, since the minimum period at a given   is limited by wave breaking.Fig. 17(b) shows the relative error in the 25-year return value for short-term maxima methods compared to storm-based methods (defined as ( R25 −  25 )∕ 25 , where R25 is the estimate from the short-term maxima method and  25 is storm-based estimate).When the tail of the distribution of    | is short and  10 ∕ 100 ≥ 0.6, the relative error is negligible.However, for smaller values of  10 ∕ 100 , the biases can be very large -just under 100% in the largest case.For this case, a structure designed using response estimates from a short-term maxima method would be extremely conservative.
Finally, we consider the differences in the effect of serial correlation with and without short-term variability.Fig. 16 shows the values of 2 The correspondence between the ratio and the tail shape can be seen by considering the Hill estimator (Hill, 1975)       for the significant response,   , for each of the response functions (also presented in de Hauteclocque et al., 2021).It is apparent that including short-term variability increases the value of   .As discussed before, a change in   could be due to either a change in the shape of the tail of the distribution of storm peak response, or a change in the mean storm shape.Quantile-quantile plots of the random storm peak response against the storm peak significant response indicated a linear relationship (not shown).So for the present examples, it can be inferred that short-term variability influences the mean storm shape (where the storm shape is defined in the same way as in Section 3.2, as the ordered sequence of response values over the storm, averaged over storms with similar peak response values).However, for other response functions, short-term variability could also influence shape the distribution of storm peak response.
When the short-term distribution is Rayleigh, given by ( 29), the distribution of the maximum response in  cycles tends to a Gumbel distribution as  → ∞, (a GEV distribution with CDF given in ( 9) and  = 0) with parameters  =   √ log()∕2 and  =   ∕2 √ 2 log().As a measure of the variability of   we define a coefficient of variability of the short-term maximum response as   = STD(  ())∕mode(  ()) = ∕(2 √ 6 log()).Here we have normalised by the most probable maximum value, , rather than the expected maximum value to simplify the expression.In the case that   is large relative to the variation of   within storms, the presence of short-term variability dominates any serial correlation in significant response, making time series of   () effectively independent.In the present examples,   () is relatively constant at around 10% for all response functions and the ranges of sea states considered.So clusters of extreme values in the time series of   and   will occur at approximately the same times (similar to the case illustrated in Fig. 11).
To investigate the effect of additional short-term variability on the mean storm shape, consider the following simple example.We assume that over the course of a storm,   increases linearly from zero to    over 100 h, and that the distribution of the maximum response in each hour follows a Gumbel distribution with parameters given above.We simulate random values of   in each hour of the storm for  = 10, 100, and 1000.For each simulation, the values of   over the storm are ordered, and the process is repeated 1000 times to calculate the mean shape of the ordered values of   .The results are shown in Fig. 18.It is evident that the mean shape of the ordered values of   is more peaked than ordered values of   , and that the peakedness increases with   (), as expected.This implies that the reason that   is reduced when considering random responses rather than significant responses is that the mean storm shapes are more peaked.

Conclusions
Sections 1 and 2 present theoretical arguments suggesting how a rational extreme value analysis from serially-correlated data should be conducted in the ocean engineering context.We show that basing analysis on storm peaks or cluster maxima of variables leads to intuitive interpretations of return values and return periods.We have presented theoretical arguments and simple examples to show how neglecting serial correlation leads to positive bias in the estimation of extreme events.
A new definition of a sub-asymptotic extremal index,   , was introduced.It was shown   () = T ()∕ (), where  is the true return period of , and T () is the return period when serial correlation is neglected.It was also shown that the bias in return values introduced by neglecting serial correlation can be quantified in terms of   .The sub-asymptotic extremal index is more relevant to offshore engineering than the standard extremal index, which describes the effect of serial correlation at asymptotically high levels, corresponding to return periods outside the usual range of interest.We show that the value of

Fig. 1 .
Fig.1.Bias in the return value estimated under the assumption of independent observations, x , relative to true return value,   , for various values of the subasymptotic extremal index,   .Calculations assume that the true distribution function of the  -year maximum is a GEV distribution with arbitrary location parameter, scale parameter, , and shape parameter, .

E
.Mackay et al.

Fig. 4 .
Fig. 4. Examples of time series of   and  10 , with declustered peaks circled in red and minima between adjacent peaks indicated by dashed lines.

E
.Mackay et al.

Fig. 5 .
Fig. 5. GEV fit to declustered storm peaks for North Sea dataset.GEV parameter estimates are shown above each plot.

Fig. 6 .
Fig. 6.Average shapes of ordered storms for North Sea dataset.

Fig. 7 .
Fig. 7.Return values of   and  10 from resampled North Sea data.

Fig. 9 .
Fig. 9. Illustration of construction of IFORM contours by projecting data onto lines at various angles  to origin.(a) Time series of (  ,   ) data projected onto line at 45 • to the origin, together with local peaks within a 5-day moving window.(b) Scatter plot of   against   , together with empirical 1-year return values for projected data at  = 45 • .(c) Empirical return values for projected data at  = 45 • , calculated using all observations (blue) and peaks only (red).(d) and (e) same as (b) and (c), but for  = 15 • .(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 10 .
Fig. 10.1-year IFORM contours for Florida dataset based on declustered data and all observations.

Fig. 12 .
Fig. 12. Example time series of   and random 1-hour maximum wave heights over a year.The serial correlation in 1-hour maximum wave heights is evident.
E.Mackay et al.

Fig. 15 .
Fig. 15.Examples of return values of vertical bending moment for ship R05 for the three datasets, calculated using various methods.

E
.Mackay et al.

Fig. 16 .
Fig. 16.Sub-asymptotic extremal index for three dataset and various ship responses.Thin lines are values for each response, bold lines are mean over all responses.

Fig. 17 .
Fig. 17.(a) Sub-asymptotic extremal index and relative error at  = 25 years, for each of the 16 responses and three datasets. 10 ∕ 100 is the ratio between the 10-year and 100-year response, used as a proxy to characterise tail shape.(b) Relative error in 25-year return value for short-term maxima methods compared to storm-based methods.(For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)