Development of a Bayesian networks-based early warning system for wave-induced flooding

Coastal flooding prediction systems can be an efficient risk-reduction instrument. The goal of this study was to design, build, test


Introduction
Coastal flooding poses risks in extensive coastal areas globally.One of the processes contributing to this hazard is wave-induced flooding or overtopping.Wave overtopping occurs when fluctuations associated with surfbeat and the motion of individual waves at the shoreline exceed the elevation of a coastal protection element (e.g., dunes, vertical seawalls, rock revetments), particularly during elevated tide and surge levels.This can affect the integrity of urban elements and properties, create disruptions in local services and transportation, and severely injure people.Moreover, anthropogenic pressures, climate change with rising sea levels and a potential increase in storminess are expected to exacerbate coastal flooding risks in the future [1,2].These enhanced risks will promote an increase in casualties, economic losses, and damages.In order to better protect lives and livelihoods in coastal communities, part of the government efforts should focus on anticipating and alerting these hazardous situations in advance.Therefore, coastal communities at risk need adequate warning systems to respond to incoming extreme oceanic events.In that sense, early warning systems (EWSs) are a crucial tool for coastal planning and management, disaster risk reduction, and a key element in climate change adaptation as they deliver timely and accurate information about the impact of coastal flooding to save lives and avoid vast economic losses [3].While economic analyses have demonstrated that the benefit-to-cost ratio of coastal flooding EWSs can be remarkably high [4], only a few examples of EWSs for coastal flooding are found worldwide (e.g.Refs.[5][6][7][8][9][10][11][12]).
When it comes to coastal flooding EWSs, temporal and spatial variations in floodings due to overtopping are fundamental to make accurate predictions [2].However, wave overtopping processes are not frequently integrated into hydrodynamic flooding prediction systems and it has been prioritized as a critical research area [13].Some of the reasons are the lack of detailed topography and nearshore bathymetric information and reliable and robust overtopping flow models that can be suitable for being implemented in an EWS.Common strategies to compute wave overtopping in EWSs mainly involve empirical methods (e.g.Refs.[6,11,14]), established neural networks [10] and numerical models [7,9].Empirical methods (e.g.Refs.[15][16][17]) have been used since the 1970s and are a rapid and efficient tool to make wave overtopping estimations based on the profile morphology and geometry, the water level and wave conditions (onshore or offshore).Since they are developed based on limited empirical data, they present some limitations such as they are less flexible to be adapted to specific natural landscapes or nonclassical coastal defenses, need to incorporate simplifications on the profile, might neglect important processes namely infragravity energy, and don't account for the stochastic uncertainty of the random wave phasing.More sophisticated methods such as the artificial Neural Network Prediction derived from physical experiments conducted under the CLASH project [18], and later expanded and improved with the XGBoost model [19], are available to compute wave overtopping over specific typologies of coastal structures and for given overtopping regimes.Regarding overtopping data measured on natural or hybrid defenses (e.g., beach and seawall), which is not covered in the CLASH database, some studies [20][21][22][23] have investigated this phenomenon and gathered some quantitative information, but still too scarce to be used for the application of Neural Network Prediction systems in these landscapes.Process-based numerical models, such as SWASH or XBeach, are gaining popularity due to the improvement in computational power, and high robustness and flexibility to adapt to different landscapes (including sandy coasts) and oceanic and morphological situations.Nevertheless, these numerical models also have some limitations.For instance, in its current version, XBeach nonhydrostatic still requires a better characterization of the physical processes to successfully simulate morphological evolution during storm events [24], and thus, the influence of the beachface evolution in the wave overtopping is not seamlessly simulated with this modeling approach.Another limitation is the lack of quantitative data for the calibration and validation of models that reproduce coastal structures at a prototype scale [20].Also, for operational purposes, even with great improvements in the efficiency of function libraries (e.g., parallel computation and scalability) and technology, the simulation time of these sophisticated numerical models is still very long, which constitutes a drawback for their inclusion in an EWS.Hence, despite the continuous efforts to develop a framework for building efficient EWSs, further investigation is required to cover the challenge of accurately modeling nearshore hydrodynamics and simulating coastal inundation and drastically minimize the intensive computations of process-based models [25].
Artificial intelligence and data-driven techniques are rapidly increasing their field of applications due to, among other things, their capacity to rapidly analyze large data sets and obtain useful and reliable information [26].Thus, they also represent a unique opportunity for the strengthening of accurate coastal hazard prediction systems.In the flooding prediction context, data must cover a high number of episodes with a diverse nature to be representative.However, comprehensive observational data are rarely available, and thus, this information must be alternatively generated by running process-based models.When generated beforehand, demanding process-based models can be utilized to simulate complex physical processes occurring in nearshore and swash areas.For instance, Chondros et al. (2021) [25] used a suite of high-credibility numerical models to train an Artificial Neural Network that is capable of predicting coastal flood risks on the Island of Crete in Greece.Idier et al. (2021) [27] implemented the wave phase-resolving SWASH model to create the training dataset for a metamodel that supports a local coastal flooding EWS.Other authors [28][29][30] have investigated the suitability of building Bayesian-based EWSs trained with results from the process-based model XBeach.They highlighted the capabilities of the Bayesian Networks (BNs) for the construction of fast storm impact forecasting tools.A Bayesian Network is a probabilistic multivariate graphical model that provides a probability distribution of likely outputs.This is done by expressing probability in terms of conditional probability tables (CPTs) that make explicit the statistical dependencies between variables.The relationship between the variables in a system is derived from prior knowledge about the system [31].For instance, for a simple two-dimensional case, a coastal hazard forecast can be made by updating the BN with new information, such as the wave height (H) associated with a storm.Then, the BN uses Bayes' rule Eq. (1) to update the likeliness of the other variables according to: p(F i |H j ) is the conditional (posterior) probability of the forecast F i , given H j .p(H j |F i ) is the conditional probability that H j occurs given F i , and p(F i ) and p(H j ) represent the prior marginal probabilities of the forecast and wave height, respectively.
The goal of this study is to: i) develop, ii) test, iii) prove the concept of an operational EWS supported by BN techniques to predict wave-overtopping coastal flooding impacts on three coastal receptors such as pedestrians, urban components (fittings and posts) and buildings (hereafter referred to as "buildings"), and vehicles in urban areas fronted by sandy beaches.After investigating the suitability of the method developed in the present study, a coastal flooding EWS was implemented in the operational platform HIDRALERTA [32] that issues alerts for coastal risks.The structure of the paper is as follows.In section 2, the study area is presented.In section 3, the methods and the data used to develop the BN-based EWS are described, the predictions of the EWS are validated against observations and an analysis of the uncertainties is conducted.In section 4, the main aspects of the implementation of the developed and tested EWS in the HIDRALERTA platform are presented.In section 5, the predictive skills of the systems are discussed along with the limitations and future developments and in section 6 the main conclusions are summarized.

Study area
The study area is the urbanized stretch of Praia de Faro, an open sandy beach located in a narrow peninsula (Ancão) of the Ria Formosa barrier island system on the southern coast of Portugal (Fig. 1A and B).Astronomical tides are semi-diurnal, with an average range of 1.3 m for neap tides and 2.8 m for spring tides.Regarding the wave climate, the site is exposed to the dominant and more energetic wave conditions (W-SW) and sheltered from the less energetic E-SE waves.Praia de Faro exhibits a steep and narrow beachface with a single or double berm.The average beachface slope observed by various authors varies between 0.11 and 0.14 [33,34].Sediments are classified as medium to very coarse sand with d 50 (median diameter particle size) of about 0.5 mm [35].
The oceanfront is protected either by walls and rocks (mostly buried by sand) or a natural beach-dune system.The area was divided into four sectors (western, parking, central, and eastern; see Fig. 1C-F) according to their morphology and exposed elements.The occupation of the site and some of its main economic activities have a strong seasonal behavior.Also, in the western and eastern sectors, the most exposed dwellings are devoted to second homes and rentals, while in the parking and central sectors they are mainly services.The dune elevation varies alongshore being the western sector the one with the highest dunes and less affected by human occupation.In this sector, a limited number of houses are present on the crest of the dune (Fig. 1C).Conversely, the dune system is poorly maintained in the parking and central sectors, where the natural system was either fully replaced by houses and infrastructures or minimally preserved (Fig. 1D and E).In the eastern sector, the dune is better preserved and the crest is higher than in the two J.L. Garzon et al. previous sectors (Fig. 1F).

Development phase
In order to develop an efficient coastal flooding EWS that provides the end-users robust scientific-based information, state-of-theart numerical models and data-driven techniques were exploited.In the development phase, the events were schematized, all the numerical model runs were performed, and the associated impact levels were categorized, resulting in the training information.Then, the BNs were designed and trained to generate the CPTs that linked storm characteristics, sectors, receptors and impacts.Finally, the performance of the predictions was assessed against impact observations and an uncertainty analysis was performed.

Event schematization and boundary conditions
Bayesian Networks cannot extrapolate beyond the parameter space in which they are constructed.Thus, BNs must be trained with comprehensive information that incorporates the natural variability expected for the site to achieve good predictive skills.Therefore, a wide range of oceanic situations must be represented in the boundary conditions (BCs) of the BNs to obtain reliable impact predictions.This site is subject to different types of wave conditions that can induce wave overtopping.On the one hand, the site can be hit by events caused by deep atmospheric disturbances that generate waves coming from the west-southwest directions.These events have significant wave height (Hs) above 3.0 m while the associated peak period (Tp) varies between 9 s and 14 s [37,38] (hereafter referred to as storm events).This threshold was established based on the studies of [37,39].On the other hand, this site can also be vulnerable to events associated with large swell conditions, hereafter referred as large-swell events, where the significant wave height is not remarkably high for this location (less than 3.5 m), but the large peak periods (greater than 12 s) make these events especially hazardous.Similarly, the direction of the wave propagation is southwest.The implementation of both types of events in the numerical model framework included only static water levels and stationary wave conditions for each sea state.
As suggested by Ref. [40], the bins of the nodes of the BCs were designed according to three principles: i) As wide as possible to reduce the computational effort; ii) sufficiently narrow to provide meaningful forecasts and resolve the uncertainty, and iii) sufficiently wide to capture multiple data points.Accordingly, the Hs bin size was constrained to 0.5 m, the tide and water level bin size was restricted to 0.25 m, and the Tp size was set to 1 s (except for one bin of 2 s).Moreover, each bin was populated with five events.

Storm events
Synthetic storms, represented by Hs and tide level, were created and implemented in the numerical framework to obtain wave overtopping discharges.These parameters were inspired by a collection of nearly 30 years of wave measurements recorded at the Faro buoy and the tidal conditions in the region.The dependence between Hs, surge levels and Tp is statistically significant for storms in this region [30].Thus, previous authors [39,41] have found that the peak period and the storm surge can be described as a function of the Hs for storm events.This had an important implication because it amply reduced the number of computations.Therefore, the only two independent variables used to characterize the oceanic BCs of these events were the astronomical tide levels and the Hs, while the Tp and surge were internally computed based on the wave height expressions presented in Refs.[41,42] respectively.The BCs were represented by seven bins that covered the variability of the tidal conditions (from 0.25 m to 1.75 m above mean sea level, MSL) and twelve bins that described the wave height conditions spanning from 3.0 m to 8.5 m (approximately a 50-year return period event).This resulted in a dataset of 84 (12 × 7) synthetic storms.In addition to these storms, four events were generated to complement the training data.This allowed for better accounting for the stochastic effects of the wave overtopping process, enriching the training information and obtaining impact probability information associated with each bin.These storms were obtained by creating events whose tidal and wave height conditions varied randomly within the limits of each bin.Therefore, the number of synthetic storms used to populate the bins and train the BN was 420 per profile (84 × 5).Based on the authors' judgment and knowledge acquired in the area for the last three decades [39], lower elevations of the tide level do not result in hazardous conditions in the urbanized area and thus, they were not included as BCs in the training data.

Large-swell events
For the events that are characterized by large swell conditions, three independent variables were selected (total water levels above MSL, significant wave height and peak period) as there is no statistically significant relationship between Hs and Tp or surge for such large-swell conditions.Six bins were considered for the total water levels, five bins for the Hs and six bins for the Tp (Table 1).The combination of the lowest limit of every bin resulted in 180 (6 × 5 x 6) synthetic large-swell events.Similarly to the storm events, four additional runs were included to populate each bin with their variables randomly varying between the limits of each bin.The total number of runs used to generate the training data was 900 (180 × 5) per profile.The large-swell events have been less intensively studied in the study area and thus, information about recurrence periods for these events is not available.Consequently, the range of parameters used to characterize these events was based on empirical observations and authors' knowledge acquired over the last 30 years in the area.

Hazard numerical modeling
The previously generated synthetic events were implemented in a numerical framework that coupled SWAN (v41.31)[43] and XBeach (XBeachX release [44,45]) to compute wave overtopping at several locations in Praia de Faro.The SWAN model was built for the entire southern Portuguese coast (Fig. 1B) and consisted of a structured grid with an approximated resolution of 350 m and 600 m in the cross-shore and alongshore directions respectively.The model was forced with a JONSWAP wave spectra built from the wave parameters and gamma of 3.3 at the southern oceanic boundary located at 36.905 • N and extended between 8.95 • W and 7.50 • W, where the water depth ranged between 70 m and 700 m with an average depth of 200 m.The wave spectral resolution in space was 5 o and the range of frequencies, from 0.0345 s − 1 to 1 s − 1 , was divided into 34 parts.This model was used to propagate and downscale the wave conditions from the Faro buoy (~100 m depth) to 25 m depth, where the XBeach offshore models were located.Then, the XBeach model, non-hydrostatic mode, was used to propagate the wave conditions to the shore and compute wave overtopping.This mode was used to assess overtopping since it solves individual short-waves along with the low-frequency motions associated with the short-wave groups, although the morphological processes were not included.This model has been found skillful in replicating runup oscillations in steep beaches dominated by short waves [46,47].Since running this large number of simulations on a 2D scheme would be very expensive, one cross-shore profile was implemented for each sector.The elevation of the cross-shore profiles and the location where the impacts on the coastal receptors were evaluated are displayed in Fig. 1 and Fig. 2.These four profiles were selected because they represented worst-case scenarios in the sense that the exposure of the coastal receptors was the highest in the sector.The location of the receptors was defined based on the position where those elements would be first impacted (within the urbanized area).The elevation of the models was interpolated from several sources.The topography was extracted from a UAV survey conducted by the COSMO program [48] in October 2018 that covered the emerged beach.This topography represented a typical pre-storm profile, with a poorly developed berm.Bathymetric information in shallow areas up to 13.5 m below MSL was also extracted from COSMO (survey from 2018).A regional bathymetry extracted from MIRONE [49] was used to interpolate areas below − 13.5 m.This numerical framework has been previously tested and validated by Ref. [50].
The horizontal resolution of the 1D models (Fig. 2) varied spatially: 2 m from the offshore boundary to 22 m depth, 1 m from 22 m to 17 m depth and 0.5 m for the rest of the profile.The boundary conditions were set to non-hydrostatic at the seaward boundary and to absorbing-generating (weakly-reflective) at the back of the domain.Neumann conditions were imposed on the lateral boundaries.A Manning's n value of 0.02 s/m 1/3 s was chosen to represent the bed roughness corresponding to a sandy bottom, in agreement with previous studies [51,52].The groundwater flow module was also considered and this accounts for three mechanisms regarding the vertical exchange of groundwater and surface water: 1) submarine exchange, 2) infiltration and 3) exfiltration.The hydraulic conductivity (k) or Darcy coefficient was calculated following the expression presented by Ref. [53]: where g is the gravitational acceleration and v is the fluid kinematic viscosity (0.89 × 10 − 6 m 2 /s) and C H is a unitless coefficient, 6.54 × 10 − 4 .Based on recent sedimentological analyses conducted on this site, d 10 is equal to 0.260 mm, i.e., the 10% of the particles in a sample were smaller than that.Therefore, k computed with Eq (2) is equal to 0.0004 m/s.A two-layer model (nonhq3d on) was applied to better simulate the dispersion relation and wave shoaling in intermediate depths [54].Thus, it is applicable in shallow and intermediate water depths (up to relative depths, defined as the product of wavenumber and depth, equal to 4).The layer distribution was set to the default value (0.33).The parameter maxbrsteep controlling the maximum wave steepness criterium was set to 0.5 based on previous studies [55] and the water depth threshold above which cells are considered wet was set to 0.001 m.Jonswap spectra were selected to represent the wave energy spectra with the default value for the peak enhancement factor (3.3) and the second order of wave propagation scheme was specified.The Courant-Friedrichs-Lewy (CFL) criteria was set to 0.55.Instantaneous discharge values were output at the interest locations (Fig. 2) every half second or 2 Hz frequency.An example of instantaneous discharge values can be found in the Supplementary material.The simulation time was not fixed but varied between runs depending on the wave period in order to simulate the same number of waves.After a warm-up period, the up-crossing method was selected to calculate the mean period.The mean period multiplied by the number of waves resulted in the duration of the runs and it was used to compute the mean overtopping discharge (q x ).In the present study, 600 waves were selected in an attempt at achieving convergent overtopping discharges [56][57][58], while maintaining an affordable computational effort.Moreover, populating each bin with five runs can contribute to enhancing the convergence of the discharge variables.

Overtopping impact assessment
The wave overtopping prediction system was designed to anticipate the impacts on pedestrians, buildings , and vehicles.The system categorizes the impact into four levels (Table 2) according to the conceptual description provided by Ref. [50].The mean overtopping discharge simulated by the numerical model framework was then used as a proxy to characterize wave overtopping impacts at each sector and receptor according to the limits proposed by the EW-Coast method (Table 2) presented in Ref. [50].

Bayesian network structure
Bayesian networks can make stochastic predictions by graphically representing the statistical relationships between variables using a series of nodes and connections.The nodes represent the different variables, and the arcs (as unidirectional arrows) indicate causal relationships between them.The design of the BNs, such as the nodes and arcs was based on the authors' criteria and previous experiences in the study site [30,59].The structure of the BNs had two input variables (boundary conditions and sector/receptors) and two output variables (hazards and impacts).The BCs (blue boxes in Figs. 3 and 4) were represented as event scenarios and divided into bins, as presented in the 3.1 section.The coastal receptors (grey boxes in Figs. 3 and 4) were defined as three nodes that represented each receptor and their location within the urbanized stretch of Praia de Faro.The location of each sector considered in the EWS is displayed in Fig. 1C-F.All BCs, receptors and sectors were mutually independent.The hazards were characterized by q x , and they corresponded to the orange boxes in Figs. 3 and 4, and their bin sizes were defined based on the limits shown in Table 2.The impact level (green boxes in Figs. 3 and 4) indicated the severity of the flood impact for specific BCs, receptors, and each sector.The design of the hazard and impact variables accounted for the three receptors.The direction of the arcs that connect the nodes, from so-called parent to child, represents the direction of influence.For instance, the oceanic variables and pedestrians' nodes (for a certain sector) condition the mean overtopping discharge and this influences the impact level for pedestrians.Two different BNs were designed to accommodate the differences in the type of events (large-swell vs storms) driving the hazard.Thus, both BNs showed similar structures, and the only differences were found in the variables that defined the oceanic BCs and their bin discretization (Figs. 3  and 4).

Table 2
Mean discharge limits (l/s per m) of the EW-Coast method implemented in the EWS to define impacts on the three evaluated receptors.Table adapted from Ref. [50].
After the design of the BN structures, the BNs must be trained with the previously generated information that involved synthetic BCs, the computed hazards at each specific sector (see an example in Fig. 5) and the level of impact for three coastal receptors.The training of the Bayesian Network entailed calculating the conditional probabilities of each node in the network based on available data and it was essential to construct the conditional probabilities derived from joint probability distributions that supported the EWS.This training process of the BNs followed the method described by Ref. [60].After training, the BNs can anticipate impact levels for all considered BCs, including the probabilities of occurrence as illustrated in a particular example in Figs. 3 and 4 (highlighted in bold and underlined), and therefore functioning as a final BN-based EWS.The visualization of the variables, their dependencies and the probabilities can be achieved through the graphical interface of the software Genie, which is freely available for academic research from BayesFusion, LLC (https://www.bayesfusion.com/).

Validation assessment
The methodology for the development of the EWS, which included the generation of the BCs, the numerical modeling of the coastal hazards, the impact definitions, and the design, construction and training of the BNs, was validated by comparing the impact predictions with the observed impacts during several overtopping and quasi-overtopping events reported in the study area.These events at the parking and central sectors, including the three receptors, were documented and their impacts were characterized following the conceptual description of [50].The other two sectors were not included because there were not enough observed occurrences to make a relevant analysis.The wave parameters that brought these occurrences were extracted from the Faro buoy until 2019, except for some gaps in the measurements that were complemented with the data downloaded from the Ocean Wave Reanalysis provided by Copernicus Marine Services.Wave data after 2019 were extracted from Puertos del Estado -IBI system since the buoy was inoperative for a long period.Tide levels were extracted from XTide (www.flaterco.com/xtide)at the position "Faro -Olhao", while the residual tide or storm surge was obtained from measurements collected at the Huelva tidal gage, which is representative of the study area.The approximately 100 reported occurrences were split into storm (40%) and large-swell (60%) events.The oceanic conditions and the corresponding observed impacts are shown in Fig. 6 (parking sector) and in the Supplementary material (central sector).This wave and water level information was used as BCs of the BNs, for both storm and large-swell events (Fig. 6).Then, for each sector and receptor, the probability distributions of each impact level were extracted from the BNs, but for operational purposes, the predicted impact level was defined as the maximum level estimated by the BNs.
The predictive skills of the developed methodology were assessed by the accuracy, precision, recall and the F1 score formulated by a confusion matrix, which is commonly used for assessing the performance of predictive models [61,62].In the confusion matrix, the observed impacts are displayed on the horizontal axis and the predicted impacts are shown on the vertical axis.The diagonal cells (green cells in Fig. 7, right side plots), then correspond to observed impacts that were correctly predicted, i.e., true positives.The non-diagonal cells correspond to incorrect predictions.Both the number of cases and their percentage are shown within each cell.The column of the grey boxes on the far right of the matrix (Fig. 7) reveals the relative number of the predictions for each impact level that were correct (true positives) known as precision (green color), and incorrect (false positives), known as the false discovery rate (red color).It explains the ability of the system to correctly predict the levels.The row of grey boxes at the bottom of the matrix (Fig. 7) indicates the relative number of observations for each level that were correctly predicted (true positives), known as recall (green color) and incorrectly predicted (false negatives) known as false negative rate (red color).It refers to a system whose observed levels are correctly predicted.The concept of precision and recall is illustrated as follows.For instance, a precision of 90% and recall of 20% for a given level mean that when predicting that level, the predictions succeed in 90% of the cases but when observing that level, the predictions only succeed in 20% of the cases.The cell at the bottom right of the matrix (darker grey box in Fig. 7) displays the overall accuracy (green color).Accuracy is the ratio of correctly predicted overtopping occurrences to the total number of occurrences and this metric refers to the overall model performance.Accuracy, precision, recall and F1-score were calculated as presented in Eqs ( 3)-( 6) for each impact level.where TP is true positive, TN is true negative, FP is false positive and FN is false negative.F1-score varies between 0 and 1, with 1 representing a perfect predictive skill.
Regarding the predictions of the overtopping impacts on pedestrians at the parking and central sectors, the accuracy of the method was 76.3%, i.e., 74 cases were well predicted (51.5% no impact, 1.0% low impact, 3.1% moderate impact and 20.6% high impact), 20.7% of the cases were overpredicted and 3.0% were underestimated (Fig. 7 and Table 3).The no impact and high impact levels or categories obtained the highest precision (96.2% and 87.0% respectively; Fig. 7), indicating that the relative number of false positives (false discovery rate) for these categories was low.Similarly, for these two categories, the recall was the highest, 76.9% and 90.9% respectively (Fig. 7), implying that the relative number of false negatives (false negative rate) was low.Conversely, the false positives and false negatives were more prevalent in the intermediate categories resulting in lower values of the recall and precision skills.It is important to note that the number of observed cases categorized as low and moderate impacts was significantly lower than for the other two categories (Figs. 6 and 7).Regarding the skills by sectors, it can be seen that in both sectors the F1 score ranged between 0.84 and 0.94 for the no and high impact categories, 0.40 for moderate impact in the parking sector (no data for moderate impacts at the central sector) and between 0.0 at the parking sector (precision was 0% for a reduced number of cases, see Fig. 6) and 0.40 for low impact at the central sector (Fig. 7).
Regarding the overtopping impacts related to vehicles at the parking and central sectors, the accuracy of the predictions was 86.9%, i.e., 86 cases were well predicted (67.7% no impact, 4.0% low impact, 1.0% moderate impact and 14.1% high impact), 8.1% of the cases were overpredicted and 5% were underestimated (Fig. 7 and Table 3).Again, the no and high impact categories obtained the highest precision (97.1% and 100%, respectively) and recall (89.3% and 82.4%, respectively) as displayed in Fig. 7. Importantly, for this receptor, the skills of the intermediate categories were higher than for pedestrians, 33.3% (precision) and between 50% and 80% (recall).This was also seen in the F1 score metric, which was around 0.90 for the no and high impact categories in both sectors.In the parking sector, the F1 score was 0.66 for low impact and 0.50 for moderate impact (Fig. 7).Low and moderate impacts were not documented in the central sector.
The skills of the predictions of the overtopping impacts on buildings at the parking and central sectors yielded an accuracy of 84.4%, i.e., 81 cases were well predicted (76.0%no impact, 2.1% low impact, 6.3% moderate impact and 0.0% high impact), 11.4% of the cases were overpredicted and 4.2% were underestimated (Fig. 7 and Table 3).The precision of the predictions was 94.8% for the no impact category, 50.0% for the low impact and 85.7% for the moderate impact (Fig. 7).There were no observations of high impact for this receptor.The recall for no impact was 97.3% and it was lower for the other categories (around 40%).When analyzing the skills per sector, the F1 score on the parking sector was 0.92 for no impact, 0.44 for low impact and 0.42 for moderate impact.In the central sector, only two categories of impact were reported, no impact and moderate impact, they obtained an F1 score of 1.0 and 0.66, respectively (Fig. 7).Instead of defining the impact predictions by selecting the highest impact level provided by the BNs (strategy 1), regardless of the probability of occurrence, another strategy to determine the impact category was investigated.It consisted of selecting the impact level that exhibited the highest probability, and in the case of two levels with the same probability, the more severe level was considered (strategy 2).In order to assess which strategy yielded the most reliable results for the implementation in an EWS, the F1 score, and the underestimation and overestimation cases were compared.Firstly, it can be seen that for the three receptors, the F1 score calculated with strategy 1 was close to or higher than the F1 score estimated with strategy 2 for the no impact and high impact levels (Table 3).Secondly, strategy 2 slightly overperformed strategy 1 for the true positive cases (between 1% and 3%) as displayed in Table 3. Thirdly, strategy 2 provided a higher percentage of underestimation cases than strategy 1 for pedestrians, vehicles and buildings (11.3% against 3.1%, 11.1% against 5%, and 7.3% against 4.2%, respectively, Table 3).Fourthly, the opposite occurred in the overestimation cases, and strategy 1 was more prone to overestimate the observations than strategy 2 (Table 3).Therefore, for operational purposes, strategy 1 was selected since it had in general better predictive skills and it was more conservative, which in terms of risk analysis and alerts is preferable to an underestimation tendency.

Uncertainty assessment
Uncertainty in the predictions can be introduced through the model architecture and numerical methods used to represent and solve the physical equations, which were out of the scope of this analysis, the boundary conditions (e.g.inaccurate wave information used to condition the BNs and topo-bathymetry data), the selection of model parameters, and the randomness of the seed number used to generate the free surface elevation and velocity time series at the seaward boundary.Regarding the last factor, this uncertainty is intrinsic to the wave overtopping processes since the phases of the individual wave components are unknown.To assess this uncertainty, four real events obtained from Ref. [50] with varying wave and water level conditions, as displayed in Table 4, were selected.Then, these events were implemented in XBeach (parking sector), and five simulations per event were performed by randomly varying the seed number.The minimum and maximum mean overtopping discharges among the five simulations were exploited to calculate Fig. 6.Wave and water level data used for conditioning the BNs.The circles display large-swell conditions and the squares storm events.The color of the symbols indicates the level of the observed impacts at the parking sector following the code displayed in Table 2.The impacts were defined on the three coastal receptors.Please note that horizontal and vertical axis have different limits for large-swell and storm conditions.the impact level for each coastal receptor as shown in Fig. 8A.The prediction variability depended on the event severity and receptor, but in most situations, the maximum and minimum impacts were the same or with one level difference (Fig. 8A).The selection of model parameters might have an impact on the predictions.According to Refs.[47,55,63] the maximum wave steepness criterium (maxbrsteep) and the bed friction coefficient are considered the most important parameters to tune wave transformation and consequently runup and overtopping.In order to reduce the number of free model parameters to calibrate, the friction coefficient for a sandy bottom in the Manning formulation was set to 0.02 s/m 1/3 , in agreement with previous works [52,64].Two values of maxbrsteep, 0.4 and 0.5, (with the same seed number) were tested to assess the uncertainty of the predictions associated with model parameters.The values of maxbrsteep selected were based on a literature review where it was found that most of the previous XBeach  nonhydrostatic studies used either 0.4 or 0.5 [29,47,54,55,65].Note that 0.4 is the recommended value in the user manual.The same events and the parking sector were considered.The results of the simulations indicated that the consequence of increasing this parameter was a shoreward shift of the wave-breaking location and higher overtopping discharges (Supplementary material).When analyzing the variations in the impact level, it was found that for buildings and vehicles, the impact level was not sensitive to changes in this model parameter (Fig. 8B).Conversely, the impact level associated with pedestrians varied by one level for Event2 and Event 4 and two levels for Event3 (Fig. 8B).
To assess the uncertainty related to the XBeach boundary conditions, two experiments were carried out.In the first experiment, the same previous four events were tested (same seed number), but four different beach profiles were implemented in the model.They covered a typical profile evolution during an energetic storm event from a pre-storm profile to a heavily eroded profile with two intermediate profiles (Berm1 and Berm2) as illustrated in Fig. 9.Note that the sediment was preserved within the profile and the eroded sediment was deposited in the submerged profile.The simulations demonstrated that, for all the storms, the gradual erosion of the profile induced a reduction in the wave overtopping discharge (Supplementary material).When investigating the effect of the profile evolution on the variations of the impact of level, it can be concluded that it had an important effect, especially for the predictions devoted to pedestrians.For this receptor, the profile morphology induced a high variability on the predictions, from high to no impact for Event2, and from moderate to no impact for Event3 and Event4.In the other receptors, the variability of the predictions induced by the profile morphology accounted for up to 1 level of difference (Fig. 8C).In general, the heavily eroded profile provided enhanced protection and the predicted impact was lower than on the pre-storm profile.It is worthwhile to mention that the Post-storm profile is the consequence of the impact of a storm with at least a return period of 10 years.
In the second experiment, the sensitivity of the predictions to changes in wave heights was studied.Thus, the hourly predictions provided by the BNs for two large storm events were compared against similar storms but with an increment and decrease of the actual Hs of 0.5 m.It allowed to understand how inaccuracies in regional wave forecast systems propagated into the predictions.The first storm event analyzed occurred in 2018 and the maximum Hs was 6.55 m.The storm coincided with spring tides (tides level up to 1.5 m) and lasted 61 h, approximately five high tides.The second storm event occurred in 2008 and it was the combination of two consecutive events.In the first one, the tidal levels reached up to 1.50 m and the maximum Hs was 4.5 m, and in the second one, the tidal levels were lower, between 1.30 m and 1.10 m and the maximum Hs was 5.2 m.For the 2018 storm, ±0.5 m variations of Hs resulted in small changes in the hourly predictions with respect to the impact levels predicted for the actual storm and they mainly occurred in the parking and central sectors (Supplementary material).Thus, for the entire storm and in these sectors, the variation of Hs affected approximately 10% of the hourly predictions, which deviated ±1 impact level from hourly predictions of the actual storm.These deviations happened on the three receptors.Larger deviations accounted for an irrelevant percentage of the hourly predictions (Fig. 10A).Regarding the 2008 storm, ±0.5 m variations of Hs did not cause changes in the hourly predictions on the western and eastern sectors.However, in the remaining sectors, between 6% and 16% of the hourly predictions suffered from deviations of ±1 impact level (all receptors) and around 10% were subject to deviations of ±2 impact levels (only in predictions for pedestrians), as shown in Fig. 10B.

Operational phase
In the operational phase, the validated method was implemented operationally to release flood alerts.In this phase, the BN-based coastal flooding EWS runs autonomously through five steps: i) downloading and extracting the oceanic forecasts, ii) identification of the type of event (storm versus large-swell), iii) introducing these variables into the corresponding trained BN, iv) obtaining hourly impact predictions and v) disseminating the early alerts via email among the final end-users.A set of Python scripts are responsible for automating the entire process and implementing the EWS within the HIDRALERTA architecture [32].The daily flooding impact predictions are generated quasi-immediately based on the latest oceanic forecast information and cover the next 72 h.
The impact predictions at Praia de Faro depend on two types of input conditions.Firstly, estimations of hourly tide levels provided   by XTide are extracted at the position "Faro -Olhao" for the next 72 h every day.Secondly, wave parameters, such as spectral significant wave height, wave spectrum peak period, and mean wave direction forecasted by Puertos del Estado -IBI system (https:// opendap.puertos.es/thredds/catalog/wave_regional_ibi/HOURLY/catalog.html) are downloaded.This data has a grid format with a resolution of 0.0833 • and the forecast information of the four nodes surrounding the location of the Faro buoy is extracted.Then, these four values are average and this information is used to identify the type of event that might drive the flooding.For storm events, only Hs and tide predictions are required while for large-swell events Hs, Tp, tide and surge are needed.The surge predictions are also forecasted by Puertos del Estado (https://opendap.puertos.es/thredds/catalog/nivmar_large_nivmar/HOURLY/catalog.html) and they are extracted at the closest location of the study site.Storms coming from S-SE do not bring critical overtopping occurrences due to their relatively short period and the loss of energy as a consequence of the intense refraction before arriving at the shoreline.Consequently, the associated impact for the S-SE events is no impact.Usually, BNs utilize a graphical interface that allows the user to introduce the input conditions and obtain the results in the same interface.However, for automatized operational purposes, using a graphical interface is not suitable.Instead, the CPTs of the BNs built during the development phase are extracted to obtain the probabilities of each impact level associated with a specific sector and BCs.Then, the impact level is defined according to the criteria established in strategy 1.After that, the impact alerts at each sector for the next 72 h are converted to a color code (Table 2), plotted on a map of the study area, and printed in a bulletin for a simplified visualization (see an example of a segment of the bulletin in Fig. 11 ).This bulletin is sent to several end-users or entities with the authority to implement or coordinate risk-reduction measures, namely the Civil Protection of Faro Council, the national and regional divisions of the Portuguese Environmental Agency, and the Portuguese Navy.

Performance analysis
The combination of process-based modeling and the BN technique succeeded in correctly predicting the occurrence and severity of 76%, 87% and 84% of the wave-overtopping episodes for pedestrians, vehicles, and buildings respectively.This high accuracy along with the reduced computational time required (quasi-instantaneous) demonstrated the capability of this methodology to be implemented in a coastal flooding EWS.Therefore, the inclusion of BNs is a key aspect to guarantee the release of alerts in a short period of time between the successively updated wave forecasts.Whether a non-surrogated approach would have been implemented instead, one simulation of 72 h of wave-induced flooding by running SWAN + XBeach would take around 55 h on one logical processor in a 1600x Six-Core Processor, 3600 Mhz workstation.Importantly, the non-surrogated approach offers deterministic results while the BN approach produces stochastic results, i.e., including probability distributions that provide additional information for the impact characterization.In this regard, the BNs can also be conditioned with information from an ensemble wave model whose members could have different probabilities, in order to estimate the impact uncertainties associated with the wave forecast.This would lead to a more reliable EWS than that based on a single forecast system.
For the sake of simplicity, only the deterministic predictions were considered in the bulletins emailed to the risk-reduction managers.This was mainly due to the need to provide managers with clear and simple predictions.Of the two different strategies investigated for deterministically determining the impact level to be considered in the alert, strategy 1 (defining the impact predictions by selecting the highest impact level provided by the BNs) was selected because largely reduced the number of underestimations while keeping overestimations relatively low.Other authors who developed BNs to predict storm impacts [66] opted for selecting the level with the highest probability of occurrence.Therefore, this aspect should be carefully evaluated when implementing a prediction system based on BNs or other probabilistic techniques.For risk assessment purposes and the development of an EWS, overestimation is preferable to underestimation because it is a more conservative approach.However, frequent overestimations should also be avoided since they can lead to the absence of mitigation measures or the banalization of the warning.In this analysis, overestimations varied between 8% (vehicles) and 20% (pedestrians), which seems to be a tolerable level of occurrence.The intermediate levels, especially for pedestrians, were more difficult to predict, because the limits that separate these impacts are close, from 0.03 l/s/m to 0.1 l/s/m and from 0.1 l/s/m to 1.0 l/s/m (Table 2).Due to the limited number of occurrences for these levels, more observations will be necessary to better evaluate the predictive skills of the EWS and eventually adjust these limits.If these intermediate levels were eliminated, a node impact with two bins (no impact and high impact) would give a less detailed prediction but would be correct much more frequently.
The validation was performed mostly using observations from a buoy and therefore, did not evaluate the uncertainty related to the input information (i.e.wave and water level information).However, the accuracy of the alerts depends on the predictive skills of the wave forecast system utilized as BCs of the BN.As demonstrated in our uncertainty analysis to the BCs, ±0.5 m variations of Hs can drive deviations of up to ±2 impact levels.It is important to note that some of the global scale models tend to underpredict the wave heights and periods during storm conditions in many coastal areas [67].Thus, future efforts should aim to improve these predictions which will result in an increment in the accuracy of the EWS.

Uncertainties and limitations
One of the most important limitations of the modeling approach is that it does not account for the effect of the morphological evolution in wave runup and overtopping, and thus it doesn't account for morphodynamic feedback mechanisms that can greatly influence the final magnitude of the overwash in sandy coasts.Our uncertainty analysis to the beach profile revealed that the sector was more vulnerable when a pre-storm profile was present.Then, with the evolution of the profile associated with the impact of a large storm, the site reduced its vulnerability considerably; large variations on the impact levels, 2 or 3 levels especially for pedestrians, can be found associated with the morphology of the profile.Thus, this would also indicate that the EWS might overpredict the impacts if the profile is heavily eroded during the course of an event.Furthermore, the importance of the feedback between profile evolution and run-up was observed in the field during the largest event included in the assessment.Thus, during this storm, a large vertical dune scarp was formed that halted the run-up incursions.However, the model predicted large overtopping discharges that even reached the road behind the dune in the western sector.This model limitation led to a false positive.In line with our findings, previous authors [68,69] have concluded that the morphological evolution of the sandy bottom highly affects the time-integrated total wave overtopping volume and the timing of peak wave overtopping.Importantly, most of the tools or models to predict wave overtopping over naturally defended sandy coastal profiles are not capable of simulating these processes (erosion and wave-induced flooding) jointly and considering their interactions.As warned by Ref. [11], further research is needed to better characterize overtopping in these environments.A possible solution might be to consider the utilization of the XBeach surfbeat mode since it simultaneously simulates erosion and hydrodynamics; however, it can underpredict the runup, especially in steep beaches [46,47].
The uncertainty related to the initial beach profile can also affect the predictive skills of tan EWS.Besides the model results previously discussed, this was also observed in the field during a large-swell event with Hs above 3 m, Tp higher than 14 s and total water levels ranging from 1.0 m to 1.2 m.This large-swell event was preceded by a storm with large wave obliquity and steepness that caused a high scarp on the beach.Hence, during this large-swell event, the wave run-up barely exceeded the crest of the scarp, with no observed impacts in the urbanized area.However, with these oceanic conditions, the BN anticipated moderate and low impacts for pedestrians and vehicles in the parking sector.This also led to a false positive.A possible solution would be to test, model and implement the flooding impacts for different beach profiles and train the BN with that information.A new node would be incorporated that would represent the different possibilities for the existing profile conditions (this would also require continuous monitoring of all considered beaches/sectors).
Another source of uncertainty arose from the choice of the model parameters since quantitative measurements of wave overtopping were not available for calibrating and validating the model, but only impact observations [50].Considering variations in only one free parameter of XBeach, it was concluded that, on the one hand, the predicted impact can vary up to two levels.Importantly, it only affected the predictions for pedestrians, and the predictions for buildings and vehicles were not sensitive to changes in that model parameter.On the other hand, when comparing this variability with intrinsic variability related to wave randomness, it was found the latter induced similar uncertainties in the predictions (pedestrians), and it even affected the predictions for the buildings and vehicles.Prioritizing a detailed or accurate prediction will depend on the intended use of the EWS, the complexity of the hazard modeling, and the expected uncertainties resulting from the development and implementation phases of the EWS.
For simplification and practicality, the study area was divided into four regions and it was assumed that (1) a cross-shore profile was representative of the entire sector and (2) the receptors were positioned at the most exposed location within the urbanized area.It is clear that even within each sector, the exposure can vary alongshore and cross-shore.However, this approach was chosen since highly reduces the computational efforts and is able to identify the worst case within each considered coastal sector.Risk-reduction managers usually have a good perception of the most vulnerable locations and their expert knowledge can be used when designing EWSs to focus on specific sectors or relevant exposed areas.Hence, they will be able to concentrate their management efforts on those areas based on the alerts of the EWS.Further improvements and developments could include the extraction of model outputs further inland and assessing the flooding impact at those locations, defining not only an alongshore but also a cross-shore variability of the warning levels.This is particularly relevant for wide coastal areas prone to flooding and with different types of occupation/exposure.
There are other aspects that could be improved to increase the reliability of the EWS.For instance, more accurate total water level predictions should incorporate sea-level anomalies due to ocean circulations as suggested by Ref. [70], which are available operationally on a global and regional scale.This type of observation is increasing its applicability supported by satellite technology that will eventually provide observations of sea-level anomalies at high spatial resolution and in real-time.Then, it can be exploited for data assimilation in operational coastal flooding systems.Another improvement to consider is the incorporation in the EWS of the compound impact of flooding and wind [71].They demonstrated that the impacts of the compound hazard can be higher than that of the flooding standalone, as winds can enhance the physical instability of individuals during severe hurricanes.Thus, high-resolution winds could also be incorporated into the EWS to compute this combined effect.Physics-based models can be implemented to quantify the physical vulnerability of individuals, both adults and children, and categorize the impacts.Regarding the characterization of the impacts, some studies [15,72] have suggested that besides using the mean overtopping discharge as a proxy, other indicators should be explored (e.g., individual wave volume, the product of overtopping flow velocity and thickness).Behavioral and social components of the risk assessment could be incorporated into the BNs by defining a new node that, for instance, represents the level of human occupation as a function of the hour (e.g.morning, afternoon, night) the weekday (e.g.weekend versus working day) or season (e.g.Winter versus Summer).Importantly, with increasing reliability of the BCs (namely for wave forecasts) in the future, the output of the EWS can be also extended, from the current 72 h to longer lead periods without affecting the time consumed during the operational phase.

Conclusions
This study presents the design, building, and implementation of a novel EWS for wave-induced flooding in urban areas fronted by sandy beaches and its validation for a study area, Praia de Faro.The system utilizes an approach that combines BN techniques and state-of-the-art numerical models (SWAN + XBeach) to predict the impacts of wave overtopping events in occupied areas fronted by a beach/dune system.This EWS was developed in two phases.In the development phase, the learning information was generated and the BN method was built and trained.It included the definition of oceanic BCs based on a long record of wave and water level measurements that represented two different hazardous situations (storm and large-swell events), the modeling of the mean overtopping discharges caused by these events, and the characterization of their impacts on three coastal receptors such as pedestrians, urban components and buildings and vehicles according to four levels of severity.The trained networks computed the conditional probability tables which constituted the ground knowledge to make the predictions in the second phase (operational phase).
Oceanic conditions for different events hitting the study area between 2008 and 2022 were collected and the observed impacts for those conditions were documented and characterized for the validation of the methodology designed here.By contrasting these impact observations (in the three receptors) and the system predictions it was found that the system correctly predicted around 80% of the situations.However, the predictive skills were different depending on the level.The no impact and high impact levels displayed better skills (precision and recall mainly above 80%) while precision and recall of the low and moderate impact levels were about 20-50%.
The BN approach presented here provides quasi-instantaneous predictions for the considered forecast period (72 h) including probability distributions of the hazards/impacts.Praia de Faro was selected as the demonstration site but this methodology can be transferred to other urbanized areas fronted by sandy beaches.Moreover, the two-working phase approach is very flexible enabling the inclusion of additional features (new oceanic conditions and longer prediction periods, varying beach morphology characteristics, human occupation and social behavior) and this constitutes a powerful tool for disaster management.Therefore, the knowledge and methodology developed in this study contribute to the expansion of EWSs, to minimize the impact of extreme oceanic events in coastal communities, and to increase their resilience.Furthermore, this can have other potential implications for coastal flooding management.In fact, it can also be applied to analyze different flooding risk scenarios for present and future conditions accounting for the uncertainties of the sea-level rise projections if adaptation measurements are not implemented [73].

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1. (A) The Iberian Peninsula with the blue box highlighting the southern region of mainland Portugal.(B) The SWAN wave model domain (red box) and the Praia de Faro and Faro buoy locations.Aerial view of the stretch of Praia de Faro where the EWS was implemented and the four sectors investigated: (C) Western, (D) Parking, (E) Central and (F) Eastern.The thick white lines display the location of the representative profiles within each sector and the squares, circles and triangles mark the position where the pedestrians, buildings, and vehicles respectively are found.Imagery source for (A) and (B): Esri, DigitalGlobe, GeoEye, Earthstar Geographics, CNES/Airbus DS, USDA, USGS, AeroGRID, IGN, and the GIS User Community.Imagery source for (C) to (F): World-View 2 mission [36].

Fig. 2 .
Fig. 2. Cross-shore beach transects used in the XBeach model at the four sectors investigated.Each symbol marks the location of the corresponding coastal receptors where the model outputs were extracted.The cross-shore distance is measured from the offshore model boundary.The elevation is referred to MSL.

Fig. 3 .
Fig. 3. Bayesian Network structure for storm events.The BCs are represented in blue, the sectors in grey, qx (discharge) in orange and the impact level in green.The arrows illustrate the dependence between the boxes.The selection of the BCs and sector used for illustration purposes (highlighted in bold and underlined) was arbitrary.The percentage of each box indicates the probability of occurrence.For predictive purposes, the input variables are set to 100%.

Fig. 4 .
Fig. 4. Bayesian Network structure for large-swell events.The BCs are represented in blue, the sectors in grey, qx (discharge) in orange and the impact level in green.The arrows illustrate the dependence between the boxes.The selection of the BCs and sector used for illustration purposes (highlighted in bold and underlined) was arbitrary.The percentage of each box indicates the probability of occurrence.For predictive purposes, the input variables are set to 100%.

Fig. 5 .
Fig. 5. Example of a set of information used to train the BN for storm events.Mean discharge at the pedestrian location against wave height in (A) western, (B) parking, (C) central and (D) eastern sector.The color bar represents tide levels above MSL (m).

Fig. 7 .
Fig. 7.The plots on the left show the F1 score for each level of impact and separated by sector (parking and central) and the plots on the right show the confusion matrix for these two sectors together.In the matrix, the recall and the false negative rates are shown in the horizontal grey boxes; the precision and the false discovery rates are shown in the vertical grey boxes; their intersection indicates the accuracy.a) Pedestrians (N Parking = 53, N Central = 44), b) Vehicles (N Parking = 52, N Central = 47) and c) Buildings (N Parking = 52, N Central = 44).N is the number of overtopping and quasi-overtopping occurrences.

Fig. 10 .
Fig. 10.Percentage of hourly predictions that either maintained or varied their impact level after modifying Hs. (A) Storm 2018 and (B) Storm 2008.

Fig. 11 .
Fig. 11.Alert created by the coastal component of HIDRALERTA for Praia de Faro (flooding) for the 9th of December at 3 a.m.The 4 sectors are represented by the polygons and the color at the sea represents the wave height according to the color code expressed in the legend.In the actual alerts, only the non-green circles are shown, but the three receptors are displayed here for better illustration.The wave height predictions at the local scale are not required for the release of the alerts and they are only shown for a better illustration.

Table 1
Bins used for the discretization of the total water level, Hs and Tp for the large-swell events in Praia de Faro.

Table 3
Comparison of the skill metrics of strategy 1 and strategy 2 for both sectors.The brackets indicate the percentage relative to the total number of occurrences.