The Rayleigh-Haring-Tayfun distribution of wave heights in deep water

Nearly every known exceeding probability distribution applied to rogue waves has shown disagreement with its peers, seeming to be microcosms, special cases of the underlying essence of this phenomenon. Recently, Karmpadakis et al. (2020) supported this view as it concluded that there is no distribution suitable for wide ranges of sea states, even though distributions can perform well in a limited and small number of conditions. Towards the better quantitative assessment of rogue wave occurrence, the present work focuses on the apparent dispute among the majority of previous works. Furthermore, on the grounds of the uneven distribution of rogue waves found in Stansell (2004), a new exceeding probability distribution for rogue waves is conceived. The proposed distribution is a geometrical composition among some of the most popular trivial models for waves (Longuet-Higgins, 1952; Haring et al., 1976; Tayfun, 1980) while introducing additional algebraic structures. The model also obeys empirical physical bounds obtained from the analysis of nearly 350,000 waves from storms recorded in North sea and supports the qualitative analysis of rogue wave occurrence based on the symbiosis among three sea state parameters outlined in Mendes et al. (2021).


Introduction
The seminal work of Longuet-Higgins (1952) introduced a Rayleigh distribution (normal) of wave heights for a narrow-banded sea, henceforth abbreviated as RD. The observation of the so-called Draupner wave in the mid '90s brought a renewed interest on the subject of rogue waves (Haver and Andersen, 2000;Haver, 2004), whose unusually large amplitude and steepness seem to appear out of nowhere (Akhmediev et al., 2009b). Naturally, one of the most important considerations for marine safety, offshore operations and ship traffic is the recurrence interval of extreme waves in a particular sea state, also known as the return period, i.e. the required number of waves for each observed rogue wave. The accurate forecast of the rogue wave return period is paramount because these waves cause a lot of damage to the industry (Faukner, 2002;Cruz and Krausmann, 2008). As such, the drive to obtain cuttingedge knowledge about oceanographic, mathematical and meteorological conditions for ship and offshore standards has received increased focus and importance over the last decade. Hence, a complete understanding of all possible physical processes of generation, modeling of the exceeding probability as well as the creation of an accurate warning system for diametrical sea states are essential for the design of both marine installations and ships (Bitner-Gregersen and Gramstad, 2015).
Due to both randomness in the forcing mechanism and the nonlinear dynamics underlying the evolution of the wave field, a description of the sea state is necessarily statistical in nature. The statistical approach leading to the RD describes the sea surface as a superposition of wavelets with different frequencies and directional spreading. The simplest configuration of this group of waves is the assumption of a uniform distribution of wave phases (Bitner, 1980;Tayfun and Lo, 1989;Tayfun, 1994Tayfun, , 2008. Although these approximate formulations have shown remarkable success in describing general ocean wave statistics and kinematics, some issues arise and is our interest to understand the deviation from these models that lead to the formation of rogue waves in the upper tail of the distribution. For instance, based on RD, Øistein (2002) predicts that the famous Draupner wave crest would appear only within a group of a million waves, almost two orders of magnitude higher than the observed number of waves during the Draupner storm (Trulsen and Dysthe, 1997;Haver and Andersen, 2000;Haver, 2004). Noticing the inaccuracy of the RD model prediction, strongly nonlinear mechanisms have been proposed to fill the gaps in the standard model (Kharif and Pelinovsky, 2003;Dysthe et al., 2008;Onorato et al., 2013). Among these nonlinear models, the socalled Nonlinear Schrödinger equations (NLSE) stand out as the most popular. They have been widely used to model rogue waves for the last three decades (Peregrine, 1983;Janssen, 2003), particularly in the context of unidimensional water wave fields (Akhmediev and Korneev, 1986;Akhmediev et al., 2009a), upon which the Zakharov equation (Zakharov, 1968) captures weak nonlinearity instabilities experimentally observed by Benjamin and Feir (1967) that are believed to be a mechanism of rogue wave formation. Several other types of NLSE have been studied, including the Dysthe and Davey-Stewartson equations (Davey and Stewartson, 1974;Dysthe, 1979). Nevertheless, the mathematical physics of the NLSE and related subjects have not yet been applied to the task of prediction and warning system, but of rogue wave formation alone. Moreover, Fedele et al. (2016) explains that typical oceanic wind seas features multidirectional wave fields which spreads the energy directionally thus diminishing the effect of both modulational instability and/or nonlinear focusing, such that the role of these instabilities in the formation of rogue waves appear of lesser importance in realistic seas. Nonetheless, the contradictory views between quasi-determinism (Boccotti, 1989(Boccotti, , 2000 and nonlinear mechanisms could be unified in the near future (Dematteis et al., 2018(Dematteis et al., , 2019. On the other hand, more down-to-earth models have been proposed in the meantime for different aspects of surface elevation or heights, with not less than a few dozen distributions being proposed (Jahns and Wheeler, 1973;Haring et al., 1976;Forristall, 1978;Tayfun, 1980;Ochi, 1986;Langley, 1987;Winterstein, 1988;Tayfun and Lo, 1989;Tayfun, 1990;Kriebel and Dawson, 1991;Ochi and Ahn, 1994;Cieslikiewicz, 1998;Kobayashi et al., 1998;Srokosz, 1998;Forristall, 2000;Song et al., 2002;Prevosto and Bouffandeau, 2002;Krogstad and Barstow, 2004;Socquet-Juglard et al., 2005;Tayfun and Fedele, 2007). Most of these distributions include Stokes second-order terms and the probability distribution is expressed in terms of the significant steepness and may include a depth parameter for shallow water waves (see Tayfun and Alkhalidi (2020) for a review). Additional methods have been applied such as Gram-Charlier expansions (Longuet-Higgins, 1963;Al-Humoud et al., 2002;Mori and Yasuda, 2002) which highlight the importance of the skewness and excess kurtosis of the surface elevation. Although an approach based on the Gram-Charlier covers water waves of any bandwidth, it suffers from being computationally burdensome. In addition, this approach produces undesirable negative probabilities and bounds for maximum and minimum surface elevations that are not compatible with the physics of the system (Forristall, 2000). Generally, as Karmpadakis et al. (2020) demonstrates, several distributions are capable of performing well in comparison with a narrow range of sea states but we lack a universal distribution for a wide range of sea states, such that a unified framework containing aspects of these distributions is yet to be achieved.
On the more operational aspect of rogue waves, despite the joint effort of several leading academic and private groups to obtain further understanding of this phenomenon, no consensus regarding warning system implementation has been reached (Bitner-Gregersen and Gramstad, 2015). Furthermore, the lack of consensus inevitably leads to a hold on any intention to update offshore standards by classification societies (Bitner-Gregersen and Gramstad, 2015). In fact, the MaxWave project (Savina et al., 2003) proposed the implementation of indices that coupled significant wave height and wave steepness as well as directional spreading as a warning system for rogue waves in the weather forecasting. Two indices were implemented by Meteo France and the occurrence of rogue waves was found to be weakly correlated with them (Bitner-Gregersen and Gramstad, 2015), while numerous authors have also attempted to link spectral parameters and the occurrence of rogue waves with similar dire results, making the current state of warning system not satisfactory (Bitner-Gregersen and Gramstad, 2015).
Therefore, in this work we attempt to devise an exceeding probability distribution that explains the uneven spread of rogues among several storms as described by Stansell (2004) and pursue the unification of distributions (Karmpadakis et al., 2020), while upholding the physical limits obtained from the same data (Mendes et al., 2021). Nevertheless, the present data set did not include information on the directionality, which can indeed affect the results obtained thereof. The reader is referred to the most important references regarding experimental Waseda et al., 2009;Toffoli et al., 2017) and numerical (Toffoli et al., 2008 analysis of the role of directionality and sea state parameters affecting rogue wave statistics.

Rayleigh-Haring-Tayfun Model
The analysis in Mendes et al. (2021), which is also indispensable to follow the notation (see Appendix A) in this study, has described Stansell's data in detail and highlighted the unevenness of the observed return period into further minutiae. In order to study this unevenness, we advocate for the formulation of a distribution built through a geometrical composition that approaches the RD in the appropriate limit. In view of the shortcomings of the distributions highlighted in Mendes et al. (2021), we will use a different strategy: we start with a narrow-band approximation of the geometrical composition followed by the inclusion of nonlinear effects as a final step. Such constraint will be key to decide the final version of the sought distribution. Let the composition function be of the type (see Figure 1): Regardless of the choice for the composition, eq. (1) must recover the normal distribution. With that in mind, we write a general expression for the distribution structure J i jk , where H 0 is the Haring root function. Then, we must find unique values for A, θ andδ for a given δ that fulfills the Rayleigh-Haring-Tayfun limit: Accordingly, in order that the limit does not vanish or blows-up the solution must include θ = 1 + δ/2. Hence, the most trivial case has the pair (A, δ,δ, θ) becoming (8, 0, 0, 1). In addition, Mendes et al. (2021) introduced physical bounds for the variables α, and ε, so that the following criteria should be fulfilled by established and forthcoming probability distributions: Following eq. (3), it is plain to find an expression for such an operation (see Figure 1):  Hence, the composition law of RD with another function J(α) would produce, The combinatorics of the composition law leads to the following amount of unique outcomes: The factor of two in the right hand side of eq. (7) is necessary to account for the possible non-associativity of the The quotient containing factorials represent how many different ways we can arrange two tildes with three variables whereas the 3! comes from the number of possible ways for arranging three variables disregarding the brackets and the tildes. Thus, The first 36 possibilities can be seen in Figure 1 (right). For the second part of eq. (7), the quotient of factorials are subtracted by one due to the long tilde that can not be put over the J i but rather of two functions combined. Discarding cumbersome looking combinations and taking into account the constraint in eq. (3), the best version for eq. (2) is: which we call the Rayleigh-Haring-Tayfun (RHT) model. It recovers Tayfun (1980) when → 0 (see appendix A of Mendes et al. (2021)), Haring et al. (1976) when ε → 0 and Longuet-Higgins (1952) when both variables vanish, as intended. Since the negligible variability displayed by Haring et al. (1976) and Tayfun (1980) in response to sea states is well-known (Mendes et al., 2021), and in order to preserve the Stokes expansion structure of Tung and Huang (1985), we modify the Haring function in the following fashion: to obtain the Modified Rayleigh-Haring-Tayfun (MRHT) model. Figure 2 shows that the modification to the Haring root in eq. (9) is much more flexible than its original version (Haring et al., 1976). However, the modified version of Haring et al. (1976) will violate the monotonicity of the exceeding probability (Rohatgi, 1976) and corollaries for very high and negative γ, thus, we shall attempt to find a remedy for anomalies that may arise.

Longuet-Higgins
However, as expected, Figure 3 shows this picture to not be too simplistic. On one hand, a fixed ratio ε/ tend to give similar probabilities, though we clearly observe a skewed behavior depending on the sign of γ, thus suggesting that if γ = γ(φ), with an unknown function of sea parameters φ, the peak γ(φ c ) is skewed towards positive increments of φ c . Therefore, whatever γ may be, a large deviation from γ(φ c ) will affect the exceeding probability significantly, provided the ratio ε/ remains fixed. Now, according to Table 4 and Figure 16 of Mendes et al. (2021), the mediator when the latter two parameters are fixed is the nonlinearity η 1/3 , hence, γ = γ(φ(η 1/3 )). For further validation of this hypothesis, Figure 4 (left) confirms that when γ, and thus η 1/3 , is fixed, the higher the ratio ε/ the higher the probability. At least theoretically, the MRHT model prescribes the exact same solution as the graphical description of Mendes et al. (2021). This match between qualitative analysis and mathematical modelling is not coincidental: all one has to do is select distributions that perform these calculations individually and combine into a unified model, thus, providing the exact motivation for the proposition in section 2. Now, to better understand γ we notice that Figure 3 (right) has identical probabilities as increases if γ decreases when φ c → φ + or γ increases when φ − → φ c , suggesting a bell curve centered at φ c . Remarkably, this seems to be connected to the Ursell number, and one can show that the latter depends on these two parameters (see definitions in Appendix A): Then, at fixed ℵ 1 , we conclude that the Ursell number will increase as the wave approaches shallower waters. Moreover, for the same variation in we see that the region φ > φ c has a much faster drop ∆γ ∼ −0.9 than its counterpart is of the order of its equivalent λ 2 /D ∈ [5, +∞), such that the seemingly slow decay in the region φ ∈ [0, φ c ) is actually much faster because it covers a very small interval λ 2 /D ∈ [0, 5). Therefore, we may describe, asymptotically, the shape of γ(φ) as follows, depicted in Figure 4 (right) and in agreement with Table 4 of Mendes et al. (2021): Consequently, the effort of the system to produce shallow water rogue waves should be in general, but not always, much higher than for deep waters. Moreover, one may convince oneself that we need a translation for the nonlinear measure γ * −→ γ+k in order to maintain a Gaussian shape (where k is some real non-negative number), due to the fact that γ is typically negative and barely crosses to the positive realm. A quick back-of-the-envelope reverse calculation using eq. (8) and the entries of Tables 1 and 3 of Mendes et al. (2021) will show that the interval of interest here is −2 < γ < 1/2, hence, k = 4 is advised to obtain a mathematical expression as simple as possible.

Modified Haring Function and a Constrained MRHT Model
Since we understand the rough shape of γ, we can apply the monotonicity condition to eq. (9), i.e. guaranteeing that the derivative of the exceeding probability is non-positive (Rohatgi, 1976), which is reduced to: Considering that the Haring function is strictly non-negative (see eq. (2)), we require: As shown in Figure 5, the Haring function has both negative and positive slopes, with the latter dominating past α ∼ 0.5 (see eq. (2)) and obeying eq. (13) when γ is not too negative. We should, however, be even more careful, and seek the conditions below: which can never be satisfied the MRHT model. Therefore, we reformulate the Haring function to obey eq. (14). Hence, our task is to obtain a Taylor series capable of approximating the Haring function for α < 0.5 and departing from it to both vanishing (γ < 0) and divergent (γ > 0) regimes in Figure 5 (left). Clearly, the best option for a Taylor series obeying such constraints are of trigonometric behavior coupled to an exponential. Furthermore, the product of cosine and exponential functions feature an expansion (up to coefficients) of the type 1 − O(x) + O(x 2 ) + · · · , being optimal for the modelling and generalization of eq. (2) and without undesirable trigonometric bouncing features. If, for the sake of simplicity, we rewrite eq. (9) with 2γ, we obtain, which is plotted in Figure 5 (right) and matches the desired shape required by eq. (14). In addition, the new formulation of eq. (15) does not assign finite exceeding probability for extreme values of α that would ultimately violate the bound, as shown by Figure 6. Notice, however, that a γ modification to the Haring et al. (1976) distribution, in respect of the monotonicity conditions, would imply the following: ln such that it is possible to have both cases and still obey eq. (13) by inverting the sign in the trigonometric function in eq. (15), as the RHT model can be controlled by the steepness whereas the Haring model depends on eq. (15) alone. Hence, possessing all the mechanisms necessary to repair eq. (8), we rewrite the new exceeding probability as: Following the reasoning of the previous section, we start with the idea of γ * that is a function of ℵ 1 and ℵ 2 . For instance, low intermediate water waves are approximately described by ℵ 1 ℵ 2 ≈ 40 Ur. Tentatively, one can easily find a mathematical range for a good filter φ and thus fitting of the data, however, the filter should be obtainable from the Ursell number at leading order. Then, if we reverse engineer the entries of Tables 1 and 3 of Mendes et al. (2021) while combining eqs. (17) and (10), the most suitable clustering is: Moreover, we introduce the error range for the calibration of γ in order to get a closer fit to the expected values of the latter that are not well covered by (18). Hence, we have: As a bonus, as shown in Figure 7, φ does such a good job at gathering homogeneously the γ extracted from Table 1 in Mendes et al. (2021), that is also possible to find a generalized expression for γ * (α, ℵ 1 , ℵ 2 ) conveying a linear model. However, our task is to provide a model that supports the interpretation of the underlying dynamics as suggested by Table 4 of Mendes et al. (2021) and all the constraints and conditions thereof, in addition to the interpretations and predictions of section 4 of Mendes et al. (2021). For instance, a linear γ model would have a maximum in very deep water while becoming infinitely negative in shallow waters, thus making it impossible to form rogue waves. Thence, for the reasons already pointed out by eq. (11) and in Figure 4, we select the model of eq. (18), whose non-trivial shape is depicted in Figure 8 (left). Generally, as conveyed by previous versions of P α (8, 9, 17), when α approaches zero it should lead to a 100% exceeding probability as attested by Longuet-Higgins (1952). Likewise, we need to find the limit of P α in eq. (17) when γ * is very large. Having found the order of magnitude of the product γ * α and applying it into eq. (17), we readily conclude: thus being consistent with the standard limit of R α when α approaches zero.

Smoothness near α ∞
Nevertheless, we still need to obtain a bound for the specific case of very large dimensionless wave heights and obey the monotonicity condition. This bound is also related to the first part of the condition in eq. (4). The main issue at hand is that it is typically very complicated to satisfy the monotonicity and non-negative second derivative (henceforth called "meandering") simultaneously. Often distributions obey one but not the other. Moreover, the second derivative would look terribly complicated in eq. (14). Instead of finding a general formula for H γ 0 obeying monotonicity and "meandering" at once, it seems more realistic to stick to the general solution for monotonicity and modify it with a truncation that makes the "meandering" arise for selected storms. In other words, it might be too complicated to find a "meandering" formula for all storms, and even doing so, this expression would not affect the less complex form of eq. (17). Lastly, we notice that the meandering for small negative values of γ starts about or after its bound α ∞ , whereas when feasible, the "meandering" should happen in the vicinity but not after its upper normalized height bound. Accordingly, when the normalized height approaches α ∞ the corresponding value of γ * must drop, because eq. (14) at a very large γ requires a very high slope for H 0 capable of producing Planck scale low probabilities (see Figure 6). However, this drop can not change the sign of γ, in order to not violate the meandering condition, such that this modification should be as shown in Figure 8 (right). Notice that at α > α ∞ will reduce the distribution to the Tayfun (1980) form, however, at so large normalized heights it also produces Planck scale low for storms in the extremes of the curve γ(φ) as required by eq. (1). These considerations are important for super-rogue waves, such that using the four storms with highest rogue wave occurrence and eq. (4) as liaisons, we modify γ: where ξ denotes the bound availability, i.e. how the combined typical maximum normalized heights α and α ∞ / measure up to the upper bound α ∞ (Mendes et al., 2021). Therefore, we obtain, In Figure 8 (right) we show the differences between models eq. (18) and eq. (21): the first sharply increases in the vicinity of α ∞ whereas the second maintains a stable trajectory of small values for γ ⊕ . From Figure 9 (left) the roles played by each parameter becomes clear: ξ controls the strength of the meandering, φ controls the sign of γ and also affect ξ strength (because shallow water waves with higher ξ still produced a smaller ∆γ than a deep water counterpart), while an increasing α ∞ diminishes the likelihood (if all terms in eq. (17)  condition (14) a more positive γ is attached to a positive slope H 0 producing very low probabilities and vice-versa, partially explaining why the storms with highest (on average) bounds α ∞ on Table 5 of Mendes et al. (2021) are those with the smallest ratios α /α ∞ . This interpretation is also roughly supported by our conjectures in sections 5.3.2 and 5.5 in Mendes et al. (2021), as when α ∞ increases a decrease in E α follows. Lastly, is important to note that if B 1 ∼ e −α 2 /α 2 ∞ we would produce very similar results, however, the more complex function B 1 (ξ) brings in richer interpretation, which validates its choice. A further modification of interest is related to the exponent of the trigonometric function of Haring et al. (1976) in eq. (17). The new exponent shall read (see Figure 9, right): The main effect of B 2 is to enhance the cosine exponent in the vicinity of the bound α ∞ without affecting the bulk waves significantly. Furthermore, very high bound availability is counter-productive towards B 2 , thus creating a balance between B 1 and B 2 contributions. For mid-range values of significant steepness and height-to-depth ratio such procedure is enough to avoid a second derivative of the exceeding probability becoming positive, as described in Figure  10 (right), however, there are scenarios where these two coefficients do not have full control of the monotonicity.

Asymptotic Validation
In order to obey the triple asymptotic condition in eq. (4), is important to highlight that the smoothness procedures discussed in section 4.1 are useful only for storms with low dimensionless depth filter φ ∈ [0.3, 0.9]. For extreme regimes of nonlinearity, further improvements are necessary. The reason for that is the fact that smoothness such as in Figure 8 will lead to γ = 0 when the normalized height blows-up. For the targeted range α 3, the simpler model lacking any smoothness parameters still provides the meandering necessary to explain Stansell (2004) data, however it would fail to obey eq. (4) in the range α 3 for time series with φ ≈ φ peak , otherwise not displaying any anomalies. For the extremes of the curve when the asymptotic conditions laid out in eq. (4) apply the picture is worsened, as a growing steepness would always increase the probability when γ → 0, i.e. it would go back to the modified version of Tayfun (1980), whose structure does not obey eq. (4). Therefore, it is paramount to lower the limiting condition for γ when α → α ∞ to increasing negative values, especially if α ∞ is very small due to very large significant steepness or approaching shallow water. The latter aspect is reasonable as typical rogue waves in shallow water follow Glukhovskii (1966) types of distributions (sub-Rayleigh). Consequently, the simplest structure for this required effect is written as: as displayed in Figure 11. Hence, we are able to assess the asymptotics of the MRHT model with the appropriate smoothness. Firstly, let us investigate the cases when → ∞. For most beach shapes, the typical maximum of this parameter for rogue waves is around 0.5 (Hallowell, 2015), that is to say H 1/3 D. On the other hand, is sufficient to assign ε ∼ 1 to convey ε → ∞, as it is one order of magnitude higher than the Stokes wave breaking limit (Miche, 1944). Qualitatively, as B 0 guarantees that γ −4 for either too small or large φ, the diagram in Figure 13 confirms the desired asymptotic behavior. Furthermore, a few details in Figure 12 are important to remark, for instance, that our full RHT model obtain the same results as in Figure 6, i.e. Planck scale probabilities, assigning no chance of forming waves taller than the water column approaching the shore. Although Haring et al. (1976) provides likelihoods lower than ascribed by Longuet-Higgins (1952), the latter assigns too high probability for physical scenarios not allowed by the governing equations, such that the step function displayed by eq. (17) with smoothness parameter B 0 is suitable for unequivocally ruling out such a possibility. Regarding the steepness extremes, which are also theoretically unrealistic and could not avoid wave breaking, our model also provides quite negligible likelihood, of the order of 10 −10 for any rogue wave when the significant steepness exceeds the Miche (1944) limit by threefold (see Mendes et al. (2021) for the definitions) in deep water while Tayfun (1980) (see section 3.3 of Tayfun and Fedele (2007) for a more up-to-date distribution, providing the same super-Rayleigh regime) consider these waves as likely as waves with non-breaking steepness. However, when the steepness is too high in shallow water while maitaining the same height-to-depth ratio, the likelihood drops to 10 −10 6 for rogue waves (red curve in Figure 6).
These conditions on the distribution choice seems to be a sort of phase transition. The models P α (17) andP α (25) are nearly indiscernible for the majority of storms in the interval 0 < α 2.2 (see Figures 14-16), and indeed the model (17) is quite reliable when 1.50 < α < 2.2, showing very few exceptions to this rule. However, their range of similarity is affected by sea parameter variations, as shown by the very same plots. Therefore, as Figures 14-16 show, our final model matches these qualitative observations of uneven occurrence, variability (or stratification) of the probability distribution as well as the super-rogue wave "meandering" in storm 172. Noticeably, as expected for deep water, most storms could be fitted by Boccotti (1989) and similar distributions (Naess, 1985;Tayfun, 1990) in the regime α 1.4 but not otherwise (see Figures 14-15). Here we have shown that, following the steps of Stansell (2004), selecting homogeneous groups of storms according to their similarity shows that these distributions will not explain uneven distribution nor meandering, as they were devised to explain stationary conditions and thus are not sensitive to sea state conditions. Interestingly, all storms start to deviate from each other exceeding probability near the rogue wave threshold, which consists the main motivation for this study.   Table 1: Expanded summary of wave count exceeding large and rogue wave thresholds for storms (Stansell, 2004) with decreasing rogue wave probability (α = 2) moving rightwards. Bold numbers depict rogue wave count, while α > 0 counts the total number of waves within a time series. The parameter believed to control rogue waves for group A had an average ℵ 1 ≈ 1.01 and ℵ 1 ≈ 1.16 for group B. The four storms with highest rogue wave likelihood displayed average ℵ 1 ≈ 1.005 whereas the four bottom ones had ℵ 1 ≈ 1.209, highlighting their uneven sea states. In spite of group A containing 61% of all waves, it claimed 87% of all rogue waves, demonstrating that rogue waves tend to appear in some sea states more than others, through a combination of sea state parameters (Mendes et al., 2021). Indeed, storms 29-172 contained 38% of all waves but claimed a staggering 64% of all rogue waves, whereas storms 25-195 contained 19% of total waves and combined amounted to only 3% of rogue waves. For higher order rogue waves (α > 2.1) this disparity grows and the group of first four storms claimed 75-100% of them. Naturally, this discrepancy is not restricted to rogue waves, but decrease considerably as α → 1. For instance, storms 29-172 obtained 53% of all waves with α = 1.8 while storms 25-195 claimed only 14% of them. Therefore, we conclusively find that both large and rogue waves emerge disproportionally in storms with a suitable and sensitive combination of sea parameters, a phenomenon that second-order models can not account for.  Figure 14: Numerical accuracy of the RHT model with smoothness parameters as compared to wave data from Stansell (2004). For the sake of simplicity, we have plotted only Forristall (1978) as numerically representative of second-order distributions (Longuet-Higgins, 1980;Naess, 1985;Boccotti, 1989;Tayfun, 1990), for a numerical equivalency see Tayfun and Fedele (2007), Lu et al. (2019) and Karmpadakis et al. (2020). Longuet-Higgins (1952) is conveyed by the dashed lines, Forristall (1978) by red curves, the RHT model with smoothness of B 0 in purple curves, B 1 and B 2 in blue curves and those without smoothness in cyan curves.  Figure 16: Equivalent of Figure 10, i.e. the predicted uneven distribution among storms (Stansell, 2004) with Sub-Rayleigh and Super-Rayleigh regimes for the model in eq. (17) in the lower range α < 2.6 with (left) smoothness adjustments according to the model in 4.1 and (right) without. 29 33 (37 ± 4) 2 ( 4.0 ± 0.9)   (17) model has nearly 70% of accuracy. Considering only the first two columns (with higher statistical significance), these accuracy percentages are reduced to 25% and 70% respectively. Small and yet significant number of rogue waves smaller than unit suggest that if the total number of waves is grown by their inverse, a rogue wave for this threshold would likely appear.   (Magnusson and Donelan, 2013).

Discussion
The model in eq. (26) intends to be a good mathematical description for the entire interval α ∈ [1, α ∞ ). For reference, the full statistics of the number of waves per each normalized height threshold is presented in Table 1. As such, Table 2 shows the remarkable performance of eq. (26) across four crucial fixed normalized heights, including Longuet-Higgins (1952) for the total count only (see Table 2 of Mendes and Scotti (2020) for its individual performance). Contrary to Longuet-Higgins (1952), our model maintains an error smaller than 20% in virtually all cases and smaller than 10% in half of all cases. Second-order models, as demonstrated by Figures 14-15, will not provide a good fit either, performing worse than Longuet-Higgins (1952) for α > 2. In addition, is important to remark the evolution of percentages of groups A and B in Table 1, illustrating that the sea state effect on rogue wave likelihood is substantial, also pointing to the argument in Mendes et al. (2021) that a better definition for rogue wave could be based on the dynamics creating the disparity in these percentages, i.e. when sea states are more likely to claim the majority of large waves. Under the current data set, regarding the likelihood of waves exceeding any given threshold α, we notice that there is no distinction among all storms until α < 1.8, otherwise storms with a sensitive combination of sea parameters dominate, such that a dynamical rogue wave could be redefined as H > 1.8H 1/3 . The most studied individual rogue waves, the Draupner wave (Haver and Andersen, 2000;Øistein, 2002;Haver, 2004;Walker et al., 2004;Adcock et al., 2011;Clauss and Klein, 2011;Adcock, 2017) and the Andrea Wave (Magnusson and Donelan, 2013;Cherneva and Guedes Soares, 2014;Bitner-Gregersen et al., 2014;Donelan and Magnusson, 2017;Fedele et al., 2017) are well below this threshold. The characteristics of Stansell's super-rogue wave are much more remarkable than the famous Draupner and Andrea waves, as shown by Table 3. Applying the available data of storm 172 found in Table 3 of Mendes et al. (2021) into eq. (18), we obtain the storm average φ 172 ≈ 0.523. According to the conditions laid in eq. (26), we reevaluate the North Alwyn super-rogue wave minimum return period, obtaining 63,000 waves (6 days), equivalently a likelihood of 2 · 10 −5 . As the storm 172 contains 23,591 waves (2 days and 4 hours), we obtain a maximum relative probability of 37% to find at least one super-rogue wave as reported by Stansell (2004). Notice, however, that using B 1 ∼ e −α 2 /α 2 ∞ would make this likelihood increase to 70-80% while not affecting lower statistics (α < 2.5).
On the other hand, more established distributions discussed in Mendes et al. (2021) would prescribe the respective likelihoods for the North Alwyn super-rogue wave: e −2·(3.19) 2 ∼ 10 −9 (Longuet-Higgins, 1952), 4 · 10 −8 (Haring et al., 1976), 3 · 10 −12 (Forristall, 1978), 8 · 10 −8 (Tayfun, 1980) and 3 · 10 −8 (Forristall, 2000). Looking from another perspective, we can analyze the likelihood for the North Alwyn wave by models of rogue wave crest exceeding probability. For instance, Longuet-Higgins's crest model predicts a probability of e −8·(2.46) 2 ∼ 10 −21 for the North Alwyn super-rogue wave crest. Considering the storm zero-crossing period of 8 s, the latter estimates a necessary interval of 3 · 10 14 years for this wave to appear at a fixed point in the ocean, or ten thousand times the age of the universe. However, according to second-order models, the Draupner and Andrea wave crests should have an exceeding probability of 5 · 10 −7 (Øistein, 2002), 4 · 10 −6 (Prevosto and Bouffandeau, 2002), 5 · 10 −6 (Walker et al., 2004), 6 · 10 −7 (Forristall, 2005) and more recently 3 · 10 −6 (Adcock, 2017), whereas Longuet-Higgins's would estimate e −8·(1.60) 2 ∼ 10 −9 . Consequently, the best estimate for the Draupner and Andrea waves (Adcock, 2017) would likely assign an exceeding probability of 10 −11 for the North Alwyn super-rogue wave crest height (26,000 years). Furthermore, to highlight the extraordinary aspect of this super-rogue wave we shall compare it to known physical mechanisms for their formation. Barbariol et al. (2015) found that a current in opposition to wave train would increase the exceeding probability, however, the increase in extreme wave crest was only of 4% with an opposing current with velocity of 0.4 m/s at the surface. Comparing it to the Agulhas current that has a velocity of 2.1 m/s at the surface (Boebel et al., 2003), the maximum increase of a extreme wave crest due to an opposing current would be of the order of 20%. Such an increase is not enough to explain the North Alwyn super-rogue wave crest which is about 80% taller than expected by Longuet-Higgins (1952).
Overall, regardless of the complexity of the smoothness coefficients, the uneven occurrence of rogue waves is mostly due to a strong variation in γ when oscillations in and ε are small, such as in Mendes et al. (2021). This can be seen in Figure 7 as the variations in γ necessary to produce the same probability narrows (flattens) as α decreases. In this sense, in agreement with the interpretation of Mendes et al. (2021), a better rogue wave definition would be one in which the storm parameters influence on γ, i.e. the exceeding probability, grows considerably. Moreover, we observe a tendency of higher rogue wave occurrence in deep and intermediate water waves for small η 1/3 . In other words, it is easier for γ to reach high non-negative values when φ < 1. Accordingly, typical shallow water waves with ≈ 1/5 tend to have high occurrence of rogue waves if η 1/3 ∼ 2.4, as governed by eq. (18), thus, the "meandering" in shallow water would require a herculean effort from the system to form rogue waves. On other hand, through a proper combination of and ε is possible to form rogue waves in shallow water, though is unlikely to form as many rogue waves as in deep water. Therefore, the consequences arising from the present model are in complete agreement with available in-situ data and numerical simulations (Glukhovskii, 1966;Bitner, 1980;Chien et al., 2002;Didenkulova and Anderson, 2010;Didenkulova and Rodin, 2012;Barbariol et al., 2015).

Conclusions
Given the scope of this endeavor, a review of the results found in this paper are: the proposed exceeding probability combines three major sea state variables that are readily available from hindcast and that mathematically are implemented as three dimensionless variables ( , ε, η 1/3 ). Through a geometrical composition, we have obtained very good agreement with empirical distributions while also implementing the relevant physical variables and constraints of ocean states described in Mendes et al. (2021) into the distribution. Moreover, it shows unique flexibility (oscillation around the Longuet-Higgins (1952), the "equilibrium" model) not found in most well-known distributions. Additionally, our model prescribes an interpretation for the combination of sea state parameters that could be a blueprint for rogue wave warning systems while not depending heavily on skewness and kurtosis. We have further demonstrated that rogue waves can be redefined to convey the threshold when several storms begin to deviated from each other regarding the occurrence of large waves, here of the order of H > 1.8H 1/3 , i.e. the stage where the overall dynamics affecting wave occurrence will be sensitive to metocean conditions. The new exceeding probability is generous to higher rogue waves (α 3) without having any anomalous behavior for other α. The major innovation of this work relies on the continuous approach to the exceeding probability distribution. In fact, we experimented the same model for different values of the dimensionless wave height α and have conclusively shown how sea states create variability in empirical distributions. Forristall (2000) second-order model (adapted to wave heights) predicts that storm 124 has the highest probability of featuring rogue waves (Mendes et al., 2021), as it has the highest significant steepness. In practice, this storm had no rogue wave and yet Forristall (2000) predicts 22 rogue waves for storm 124 alone. Accordingly, Forristall (2000) would foresee nearly 300 rogue waves appearing in the fourteen storms, almost three times the number reported by Stansell (2004). Nonetheless, Forristall (1978) distribution would assign only 18 rogue waves for the 14 storms, whereas while the storm 172 had 26 rogue waves Forristall (1978) would predict only 3. Therefore, second-order models do not produce uneven extreme wave distributions as observed in Stansell (2004). Moreover, additional distributions such as Naess (1985), Boccotti (1989), Tayfun (1990), Boccotti (2000), Tayfun and Fedele (2007) (see Mendes et al. (2021) for a review) are unable to explain the rogue wave uneven occurrence. We also observe that these distributions do not perform well in the range 1.5 < α < 2.5 for individual storms. However, in agreement with empirical data (Casas-Prat and Holthuijsen, 2010;Christou and Ewans, 2014;Collins III et al., 2014;Teutsch et al., 2020), when all storms are combined into a single data set Forristall (1978) is a better fit than Longuet-Higgins (1952) in the range α < 2 (see Table  2), although not better than the model presented here (see Figures 14-15), and after this point neither the second-order nor the normal distribution are able to fit the data. The Longuet-Higgins (1952) model seems to agree well with empirical data in the range α < 1, which can be readily recovered by all distributions discussed here. Therefore, if one aims to build a continuous distribution capable of describing rogue waves in a wide range of sea states (Karmpadakis et al., 2020), these second-order models would not be suitable, however, if one is content with employing a distribution for the merger of several storms, these models may be useful. Surprisingly, however, second-order models were not able to capture empirical distributions of individual storms, performing even worse than Longuet-Higgins (1952).
Except for the interval α > 1.8 of storm 25 and 1.7 < α < 1.9 of storm 28, the present model succeeded in its numerical tasks, however, it is vulnerable to large deviations from the average values of η 1/3 , and ε within a sample, as already expected in the storm geometry discussion of Mendes et al. (2021). We suggest that a group of wave records should be combined and averaged-out and analyzed by eq. (17) if and only if the deviation of the significant wave height and other major variables does not exceed 25%, converging to the methodology in Karmpadakis et al. (2020). Higher deviations will likely increase the discrepancy between the prediction and observation of the return period of rogue waves and we believe that is the most probable cause of disagreement among many studies (see Mendes et al. (2021) for a review). Moreover, our model needs validation for upper intermediate and shallow water waves, however, it already represents a good toy model for explaining both underprediction and overprediction of the Longuet-Higgins's model. The present model is far from being considered as an universal distribution (Karmpadakis et al., 2020), but a first step into this direction, thus, likely requiring further revisions in order to accommodate the ocean complexity. Future work upon the improvement of the current model includes translating the same method to distributions of wave crests, troughs and surface elevation.

Acknowledgements
S.M. and A.S. were supported by the National Science Foundation under grant OCE-0729636. The authors are thankful to P. Stansell for gracefully sharing the data.