Shipping accident analysis in restricted waters: Lesson from the Suez Canal blockage in 2021



Introduction
Growth in international maritime trade stalled in 2019, with volumes expanding by a marginal 0.5% owing to the slowdown in the world economy and trade (UNCTAD, 2020). Nevertheless, shipping continues to play a crucial role in the global economy. Because maritime accidents carry high risks of casualties, economic loss, and severe maritime pollution, maritime safety is the priority for the international shipping industry (Fan et al., 2020c). The investigation of maritime accidents helps to understand the causes leading to failures, provides recommendations on countermeasures for accident prevention, and guides ship owners and maritime authorities in managing risks rationally.
The occurrence of maritime accidents varies with voyage segments and ship operations. The spatial distribution of global maritime accidents reveals the value of investigating accidents in accident-prone regions or restricted waters. For example, 12% of global shipping uses the Suez Canal, so any delay caused by an accident there can easily disrupt supply chains, fuel shortages and drive up prices (Topham, 2021). On March 23, 2021, one of the world's largest container ships (i.e. Ever Given) ran aground on the banks of the Suez Canal, which caused an approximate loss of £7bn a day in trade due to the blockage of the Canal and up to £10.9m of income loss each day for the Canal (Michaelson and Safi, 2021). Because of the unique characteristics of the navigational environments in restricted waters (incl. canals, channels and straits), the findings from previous studies on general maritime accident/risk analysis provide only limited theoretical and practical insights within the context of restricted waters. Thus, it is essential and beneficial to initiate a new study on maritime accident analysis in restricted waters to bridge the research gap.
This paper aims to develop a new data-driven Bayesian network (BN) based risk model to analyse risk factors contributing to maritime accidents in restricted waters and to use the Suez Canal blockage case to generate insights for accident prevention. It extracts data from three public databases to train the prior probabilities in the risk-based BN through a historical data-driven model. The newly developed database shows that accidents in restricted waters share many commonalities that differ from those in other waters/regions. This further justifies the necessity of the investigation, as well as the rationale for using such a database to analyse the accidents. The findings can benefit backward risk diagnosis for accident investigation and forward risk prediction for accident prevention in restricted waters. Specifically, the paper firstly investigates maritime accident reports from the Global Integrated Shipping Information System (GISIS), as well as accident reports within transit voyages from the Marine Accident Investigation Branch (MAIB) and the Transportation Safety Board of Canada (TSB). Secondly, it uses classical statistical analysis to identify the most frequently appearing risk influential factors (RIFs). Thirdly, to analyse the impact of the RIFs, a Tree Augmented Network (TAN) is generated to aid the construction of a data-driven model between different nodes. In addition, the conditional causal relationship among the RIFs is configured purely by historical data. By doing so, this research makes the following new contributions.
• Developing a new data-driven risk model that enables the systematic analysis of the risk factors of maritime accidents in restricted waters.
• Revealing the dependencies between the RIFs for maritime accidents in restricted waters through a BN model, which is constructed and trained by a TAN learning method.
• Conducting forward risk prediction for accident analysis and prevention, which generates insights for maritime authorities to reduce the shipping risk in restricted waters.
• Providing useful insights to guide the Suez Canal blockage accident investigation on its root causes through backward risk cause diagnosis.
The rest of the paper is organised as follows. Section 2 summarises and critically reviews the literature. Section 3 describes the methodology development for data collection, modelling and validation. In Section 4, the case analysis results show the robustness and reliability of the risk model and reveal the importance of each RIF in the model. Then the accident of the Suez Canal blockage is analysed to generate the implications for accident prevention. Section 5 concludes the paper.

Shipping accident analysis
Maritime accidents and catastrophes raise public concerns in terms of casualties, economic loss, and maritime pollution. Owing to the high-consequence and low-frequency nature of maritime accidents, both qualitative and quantitative methods have been used to analyse accidents with limited historical data. Some qualitative studies analysed maritime accidents from systematic perspectives (Kim et al., 2016; Puisa et al., 2018; Uddin and Ibn Awal, 2020). In addition, the Functional Resonance Analysis Model (FRAM) was applied to analyse how shipboard operations cause accidents (Salihoglu and Besikci, 2021). To deal with unavailable or non-representative data, Martins et al. (2020) proposed a methodology to assess and quantify the probabilities of occurrence of undesired events based on expert opinions combined with fuzzy analysis. For Maritime Autonomous Surface Ships (MASS), several risk factors have been identified using qualitative methods (Chang et al., 2021; Fan et al., 2020a; Liu et al., 2022). A framework for MASS operating at the third degree of autonomy (remotely controlled ship without seafarers on board) was proposed using literature review and expert knowledge, including 23 human factors, 12 ship factors, 8 environment factors, and 12 technology factors (Fan et al., 2020a). These methods illustrated the complex mechanism of system failures in order to analyse the occurrence of the relevant accidents. However, such qualitative analysis was largely based on expert judgement, raising concerns over subjective bias in the findings.
To further illustrate the causal analysis, historical data from maritime accidents is integrated with expert knowledge to conduct (semi-)quantitative analysis of maritime accidents. Statistical analysis, Multiple Correspondence Analysis (MCA), and hierarchical clustering were utilised to explain the causal factors contributing to maritime accidents from a statistical perspective (Chauvin et al., 2013; Ugurlu and Cicek, 2022; Wang et al., 2021). In addition, a statistical analysis was integrated with the implementation scenarios for autonomous ships to quantify the potential reduction in loss of life given autonomous shipping (De Vos et al., 2021). Moreover, the BN method has been applied to predict the occurrence of shipping accidents (Khan et al., 2020; Ung, 2021). Fan et al. (2020b) proposed a data-driven BN model to reflect the interdependencies among human factors, which generated rational scenarios for accident prevention. A dynamic fuzzy BN was developed that listed unsafe preconditions and unsafe supervision as the top two considerations for human factors analysis in maritime accidents, especially for supervision failures of shipping companies and ship owners (Qiao et al., 2020). Furthermore, a Human Factor Analysis and Classification System (HFACS) framework was proposed to identify human factors using multiple linear regression (Hasanspahic et al., 2021) and a fault tree model (Zhang et al., 2019c). It can also be integrated with BN to reveal accident formation patterns, showing that inland and aged vessels were important factors in sinking and grounding incidents (Ugurlu et al., 2020). In addition, quantitative research shows that risk factors at multiple levels interact with each other and affect the event chain in maritime accidents, including environmental factors, vessel factors, human and organisational factors, and management factors (Fan et al., 2020c). Among them, human errors and human factors are significant risk factors in shipping accidents (Weng et al., 2019; Chen et al., 2020).
To conduct quantitative analysis of accidents, databases are often used as one of the most available sources of primary data, including the GISIS (Pristrom et al., 2016; Wang et al., 2022), automatic identification system (AIS) data (Zhang et al., 2019a), and the historical accident data collected from national/regional maritime administrations (Liu et al., 2021; Xu and Hu, 2019; Coraddu et al., 2020). However, such databases present the results of accident analysis in different formats and have no uniform criteria to assess risks. From this perspective, maritime accident reports are commonly used sources with detailed information for the analysis, such as the navigational environment, operational process, direct or indirect causes of the accidents, and the actions taken after each event failure (Wang et al., 2021). Also, the potential hazards and causal analysis for various factors, which are not stated in the given public databases, are described in detail in the accident reports. However, few studies extract primary data from maritime accident reports, owing to the time-consuming process of collecting data from the reports and the limited records within public sources (Chauvin et al., 2013; Yildiz et al., 2021; Zhang et al., 2019b). With more regional accident records, accident reports from a regional maritime administration were utilised to generate vital risk factors influencing the severity of accidents (Wang and Yang, 2018). To analyse the spatial patterns of global maritime accidents, density analysis and clustering analysis were utilised, finding that approximately 60% of serious and very serious accidents happened within 30 nm of the coastline (Wang et al., 2022). In addition, it was found that small general cargo ships are the riskiest in the coastal waters of China through the analysis of public and national databases (Liu et al., 2021). Therefore, studies relying on the limited content of these data sources offered few implications on how to use the outcomes of the model to predict the scenarios of maritime accidents.
To bridge the gap, one novelty of this study is that it proposes a generic model by extracting all key RIFs from maritime accident reports from a comprehensive perspective, overcoming the drawbacks of risk analysis and prediction with insufficient data sources.

Risk assessment of maritime accidents in restricted waters
Shipping in restricted waters faces more significant challenges in navigational safety than in open waters, such as hydrodynamic and bank effects. Previous studies show that 46% of collisions occurred in restricted waters (rivers or fairways), and most of those were under the conduct of a pilot (Chauvin et al., 2013). Communication problems on board and bridge resource management (BRM) deficiencies are frequently highlighted issues for collisions in restricted waters. Also, in confined waters such as narrow channels, ship handling is significantly affected by moments acting between the ship and bank, hydrodynamic forces, and the sidewalls of the channel. Among them, the hydrodynamic force is closely linked to the water depth, the ship's speed, and the longitudinal and lateral distance. This unavoidable force affects ship manoeuvring. Therefore, when a ship is approaching narrow or restricted waters, it may encounter a high risk of collision, grounding, or contact due to the combined effect of various factors.
To assess the risks of shipping in restricted waters, the hydrodynamic effects on vessels from an individual perspective have been investigated, such as hydrodynamic interaction effects between two large vessels in narrow waterways, the minimum safe distance, and the appropriate safe speed required to avoid an accident (Lee et al., 2016). With regard to ship handling, Maimun et al. (2013) investigated the manoeuvring performance of an LNG tanker considering ship-bank interaction effects, showing that the interaction effects, fitting fins and enlarging rudder size greatly influenced ship handling in restricted waters. To investigate the emergency response in confined waters, a three-stage decision-making framework was developed to select the best risk control options in inland waterways: proposing options in the first stage; selecting the most feasible options by comparing cases in the second stage; and making decisions using BN in the third stage (Wu et al., 2017). Previous studies focusing on ship manoeuvring in restricted waters provided advice such as safe distance and safe velocity for ships. However, such studies cannot generate causal analysis and risk assessment in view of factor interactions in maritime accidents.
In addition, there are few studies on risk assessment of maritime accidents with a focus on restricted waters. A BN model was constructed by domain experts, and the Conditional Probability Tables (CPTs) were developed from historical data, to model the collision consequences in the downstream of the Yangtze River (Wu et al., 2020). The risk causal model in traffic-intensive waters was constructed to reveal the key risk causal transmission process (Chen et al., 2019). A fuzzy DEMATEL (Decision Making Trial and Evaluation Laboratory) method was used to emphasise that human errors and weaknesses in organisational factors are primarily responsible for accidents in enclosed spaces (Soner, 2021). The significance of investigating human and organisational factors with regard to maritime accidents in restricted waters was illustrated. In addition, BN was utilised to estimate the occurrence likelihood of grounding accidents in the fluctuating backwater zone, which found that the fluctuating backwater zone, the month, and water level were essential factors for grounding accidents in the Three Gorges Reservoir (Jiang et al., 2021).
Moreover, risk analysis can be found in maritime accident reports. The grounding of M/V NEW KATERINA on the Suez Canal bank was attributed entirely to human error, such as lack of personal capability and lack of knowledge (Authority, 2016). Human factors were likewise blamed for the collision of M/V CHUANHE in the channel of Xiamen (Authority, 2013), because the master of the ship was unable to keep a proper lookout considering it was in the fairway. In addition, there were environmental factors and vessel factors in restricted waters, including drift, wind, high superstructure, hydrodynamic effects, and limited space. It was reported that M/V COMMANDER drifted on to the Round Reef in the channel due to loss of steering and being pushed by the wind (Authority, 2014). In the collision on the Kiel Canal between HANSE VISION and BIRKA EXPRESS (Seeunfalluntersuchung, 2010), the high superstructure of the BIRKA EXPRESS made her particularly susceptible to wind. An unexpectedly strong turning motion then occurred due to the resulting hydrodynamic effects, which was worsened by the wind. With regard to accident features in narrow waterways, the attempt of ships to take effective actions in limited space with hydrodynamic effects was risky and could cause the ship to run aground, even blocking the channel or canal.
Previous maritime accident reports focused on the identification of the risk factors in restricted waters. The thorough literature analysis reveals that no study has yet investigated how all RIFs influence each other and how RIFs work together to contribute to an accident in restricted waters. The Suez Canal blockage, with its catastrophic consequences and significant effects, has caused growing concern and public safety awareness about the effectiveness of current maritime risk analysis methods within the context of restricted waters, as it shows a typical low-frequency but high-consequence risk feature. In the meantime, accidents in restricted waters share many common navigational and geographical characteristics that differ from those in other regions/waters (e.g. open sea). From this perspective, the presented research makes new contributions by developing a new data-driven BN based risk model, with a special focus on the systematic analysis of the risk factors of maritime accidents in restricted waters. To reveal the dependencies between RIFs for maritime accidents in restricted waters, a BN model is constructed and trained by using the TAN learning method. It conducts backward risk cause diagnosis (after the occurrence of an accident) and forward risk prediction (when the navigational environment triggers a high risk) for accident prevention, which generates insights for maritime authorities to reduce risks. When applied to recent accidents (e.g., the Suez Canal blockage), the model contributes to accident investigation by helping to find the root causes leading to the occurrence from an applied research perspective.

Data collection
To identify risk factors from maritime accidents in restricted waters, data was obtained from the following sources: 1) the database established in previous studies (Fan et al., 2020c, 2020d) from MAIB and TSB reports between January and December 2017, and 2) accident reports in English available from the Marine Casualties and Incidents (MCI) module of the GISIS dating from January 1, 2005 to April 9, 2021.
To generate the RIFs in restricted waters, the procedure consists of report screening, refining, and RIF selection. All available accident reports within the canal or the 'transit' voyage segment are selected for analysis. In this regard, the study generates a database with 61 vessels, including 24 vessels from the GISIS, and 37 vessels from the MAIB and TSB. Referring to the framework by Fan et al. (2020b), records from the GISIS have been refined and analysed manually in the context of the given risk factors.
The risk factors extracted from maritime accident reports are presented in Table 1. However, due to the size of the database (i.e., 61 vessels), the risk factors utilised for the model need to be filtered. Firstly, the occurrence frequency of each factor is calculated to rank the factors. Based on the distribution curve of the ranked factors (from the maximum value to the minimum value), several values (i.e., 0.3443, 0.1475 and 0.0820) at which the shape of the curve changes are identified as turning points and used as candidate thresholds. Secondly, after building different models using the turning point values, sensitivity analysis is carried out to determine the appropriate threshold value. When the threshold is too low, many RIFs are retained in the model, and the model is not sensitive to minor input changes. When it is too high, only a few nodes are taken into account, and the results are neither logical nor accurate. The threshold value of 0.1475 was selected through many trials and case-by-case comparisons. The model based on this value proved to be optimal in terms of both the accuracy of results and model representation through the retention of key RIFs.
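The frequency-based filtering step can be sketched as follows. This is a minimal illustration rather than the authors' procedure: it assumes a hypothetical binary incidence table (`records`) in which each entry is one vessel and each key is a candidate factor, and it assumes an inclusive comparison against the threshold. The factor names and counts are invented; only the threshold of 0.1475 and the sample size of 61 come from the text.

```python
from typing import Dict, List

def occurrence_frequencies(records: List[Dict[str, int]]) -> Dict[str, float]:
    """Share of records in which each factor is flagged (1 = present)."""
    n = len(records)
    factors = sorted({name for rec in records for name in rec})
    return {f: sum(rec.get(f, 0) for rec in records) / n for f in factors}

def select_rifs(freqs: Dict[str, float], threshold: float) -> List[str]:
    """Keep factors whose frequency reaches the threshold, ranked high to low."""
    kept = [f for f, v in freqs.items() if v >= threshold]
    return sorted(kept, key=lambda f: freqs[f], reverse=True)

# Hypothetical toy data: 61 vessels and three invented factors.
records = (
    [{"factor_a": 1, "factor_b": 1, "factor_c": 1}] * 5
    + [{"factor_a": 1, "factor_b": 1, "factor_c": 0}] * 4
    + [{"factor_a": 1, "factor_b": 0, "factor_c": 0}] * 12
    + [{"factor_a": 0, "factor_b": 0, "factor_c": 0}] * 40
)
freqs = occurrence_frequencies(records)
selected = select_rifs(freqs, threshold=0.1475)
```

With 61 records, the three turning points quoted in the text correspond to 21, 9 and 5 occurrences (21/61 ≈ 0.3443, 9/61 ≈ 0.1475, 5/61 ≈ 0.0820), so under the inclusive comparison assumed here a threshold of 0.1475 retains factors reported for at least 9 vessels.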
In parallel, domain experts (a scientist in maritime risk assessment and an expert in marine engineering with industry experience) were invited to verify the definitions of RIFs purely based on the accident records. In this process, 'situation awareness' is excluded because it corresponds to various vague descriptions of events in restricted waters in the accident reports. Furthermore, 'ergonomic design' is excluded due to its uneven distribution across databases. For instance, it occurs only once in the GISIS but is frequently mentioned in the MAIB database.

Tree Augmented Network (TAN) modelling
To reveal the dependencies between RIFs for maritime accidents in restricted waters, a BN model is constructed and trained by a learning algorithm. The historical data collected from maritime accident reports is treated as the training data used to calculate the CPTs in the BN model. In this process, the conditional causal relationship among the RIFs is configured purely by historical data without subjective input from domain experts. The data-driven BN model is constructed in two steps: the first step is to generate the structure between different nodes in the network by using the TAN learning method, and the second step is to calculate the CPTs by the Gradient method. A BN is an annotated directed acyclic graph (DAG) that represents a joint probability distribution over a number of random variables. The definition and applications of BN in accident analysis can be found in the literature (Fan et al., 2020c; Yang et al., 2018a; Wang and Yang, 2018). TAN learning is an optimisation method trained on the data collected from accident reports, which defines the optimised tree structure using conditional mutual information between attributes. The details of finding a tree structure and the main steps for TAN learning were demonstrated in Yang et al. (2018b) and Fan et al. (2020b), followed by CPT calculation.
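The structure-learning idea behind TAN can be sketched as below. This is a generic textbook-style illustration, not the authors' implementation or software: it estimates the conditional mutual information between each pair of attributes given the class node from raw counts, then builds a maximum-weight spanning tree with Prim's algorithm and adds an arc from the class node to every attribute. The attribute names, the toy records, and the choice of the first attribute as tree root are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
from math import log

def cond_mutual_info(rows, x, y, c):
    """Empirical I(X; Y | C) in nats from a list of dict records."""
    n = len(rows)
    n_xyc = Counter((r[x], r[y], r[c]) for r in rows)
    n_xc = Counter((r[x], r[c]) for r in rows)
    n_yc = Counter((r[y], r[c]) for r in rows)
    n_c = Counter(r[c] for r in rows)
    total = 0.0
    for (vx, vy, vc), k in n_xyc.items():
        # p(x,y,c) * log[ p(x,y,c) p(c) / (p(x,c) p(y,c)) ], written with counts
        total += (k / n) * log(k * n_c[vc] / (n_xc[(vx, vc)] * n_yc[(vy, vc)]))
    return total

def tan_structure(rows, attributes, class_node):
    """Directed TAN edges: a maximum-weight spanning tree over the attributes
    (rooted at attributes[0]) plus an arc from the class node to each attribute."""
    weight = {frozenset(p): cond_mutual_info(rows, p[0], p[1], class_node)
              for p in combinations(attributes, 2)}
    in_tree, edges = [attributes[0]], []
    while len(in_tree) < len(attributes):
        # Prim's algorithm: add the heaviest edge leaving the current tree.
        parent, child = max(
            ((u, v) for u in in_tree for v in attributes if v not in in_tree),
            key=lambda e: weight[frozenset(e)],
        )
        edges.append((parent, child))
        in_tree.append(child)
    return edges + [(class_node, a) for a in attributes]

# Hypothetical records: 'sea' copies 'weather', 'speed' varies independently.
rows = [
    {"accident_type": c, "weather": w, "sea": w, "speed": v}
    for c in ("collision", "grounding")
    for w in ("good", "poor")
    for v in ("normal", "excessive")
]
edges = tan_structure(rows, ["weather", "sea", "speed"], "accident_type")
```

In this toy data the strongest attribute-to-attribute link is between 'weather' and 'sea', so that arc enters the tree first; every attribute also receives an arc from 'accident_type'.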
There are several algorithms that can be used to learn CPTs. The features, advantages, and disadvantages of these algorithms are shown in Table 2. Because the proposed model has to handle missing data within a number of nodes in the BN, gradient descent is selected as the appropriate learning algorithm to calculate the CPTs.

Model validation
The model validation is conducted using methods including D-separation, expert knowledge, sensitivity analysis, and case study. Expert knowledge and the D-separation concept are firstly used to investigate the relationship between nodes. Then, the sensitivity analysis of the BN model and a real case study are conducted to validate the model quantitatively.

Expert knowledge and D-separation
The BN-based risk model requires validation to determine the model's robustness and reliability. For instance, using expert knowledge continues to be a reasonable way to validate a model when there is limited historical data in the quantitative analysis. Experts are required to provide judgements on whether the BN structure and the variables selected for the nodes in the network are appropriate. The BN model can be validated through a panel of experts who utilise their knowledge and experience to ensure consistency (Yu et al., 2020).
The links between nodes in the network represent the propagation of information between events rather than causal relationships. This explains why the anti-causal links existing in risk-based BNs might be valid (Jensen, 1996). From this point of view, the D-separation concept is introduced (Jensen, 2001), which represents conditional independence in Bayesian probability theory and can be applied to check and modify the BN structure (Yang et al., 2010).
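A d-separation check can be sketched in plain Python using the standard ancestral moral graph test. The example below is a hedged illustration on a hypothetical three-node fragment in which 'accident_type' is the only parent of 'length' and 'weather_condition'; it is not the learned structure of the paper's model, where additional attribute-to-attribute arcs may open other paths. The function name and the `parents` dictionary format are assumptions made for the example.

```python
from itertools import combinations

def d_separated(parents, x, y, given):
    """True if x and y are d-separated by the set `given` in the DAG
    described by `parents` (node -> list of parent nodes)."""
    # 1. Keep only x, y, the conditioning nodes, and all of their ancestors.
    keep, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, []))
    # 2. Moralise: link each node to its parents and marry co-parents.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = parents.get(child, [])
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):
            adj[p].add(q)
            adj[q].add(p)
    # 3. Drop the conditioning nodes and test undirected reachability.
    blocked = set(given)
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        for nxt in adj[node] - blocked:
            if nxt == y:
                return False
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True

# Hypothetical fragment: accident_type -> length, accident_type -> weather_condition.
parents = {
    "accident_type": [],
    "length": ["accident_type"],
    "weather_condition": ["accident_type"],
}
with_evidence = d_separated(parents, "length", "weather_condition", {"accident_type"})
without_evidence = d_separated(parents, "length", "weather_condition", set())
```

In this fragment the two child nodes are d-separated once 'accident_type' is observed, but not when it is unobserved, because the common parent then transmits information between them.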

Sensitivity analysis
Mutual information is derived from entropy theory, which describes the uncertainty of the dataset, and is interpreted as entropy reduction. Specifically, it represents the dependence between two variables in probability theory (Yang et al., 2018b). In this study, mutual information shows how strong the connection is between an RIF and the node 'accident type', which in its standard form can be defined as follows:

\[ I(s, a_i) = \sum_{s} \sum_{j} P(s, a_{ij}) \log \frac{P(s, a_{ij})}{P(a_{ij})\, P(s)} \]

where $s$ is 'accident type', $a_i$ represents the ith RIF, $a_{ij}$ represents the jth state of the ith RIF, $P(a_{ij})$ is the probability of the jth state of the ith RIF, $P(s)$ is the probability of $s$, $P(s, a_{ij})$ represents the joint probability of $a_{ij}$ and $s$, and $I(s, a_i)$ is the mutual information between 'accident type' and the ith RIF in restricted waters. In this way, the value of mutual information is used to filter out the RIFs that show less relevance to the node 'accident type'. By doing this, the remaining RIFs are important variables regarding the node 'accident type' in the BN.
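As a small numerical illustration of mutual information between 'accident type' and a single RIF, the sketch below computes it from a joint probability table using the standard definition with natural logarithms. The function name, the dictionary layout and the two toy joint distributions are assumptions made for the example and are not taken from the paper's data.

```python
from math import log

def mutual_information(joint):
    """I(S; A) in nats from a joint table {(s, a): probability}."""
    p_s, p_a = {}, {}
    for (s, a), p in joint.items():
        p_s[s] = p_s.get(s, 0.0) + p   # marginal of the accident type
        p_a[a] = p_a.get(a, 0.0) + p   # marginal of the RIF state
    return sum(p * log(p / (p_s[s] * p_a[a]))
               for (s, a), p in joint.items() if p > 0)

# Hypothetical case 1: the RIF state tells nothing about the accident type.
independent = {
    ("collision", "good"): 0.25, ("collision", "poor"): 0.25,
    ("grounding", "good"): 0.25, ("grounding", "poor"): 0.25,
}
# Hypothetical case 2: the RIF state fully determines the accident type.
dependent = {("collision", "good"): 0.5, ("grounding", "poor"): 0.5}

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

The independent table gives zero mutual information, while the fully dependent table gives log 2 nats, the maximum for two equally likely states. RIFs whose value lies close to zero are the ones that would be screened out as weakly related to 'accident type'.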
Then, scenario simulation is conducted to determine the effects of different RIFs in a combined way as another form of sensitivity analysis. The traditional method of updating the states of one node with the other nodes locked is applicable to two-state variables but does not fit variables with more than two states (Yang et al., 2018b). To overcome this disadvantage, a method proposed by Alyami et al. (2019) calculates the True Risk Influence (TRI) of a multi-state variable against a type of accident (e.g. grounding). For instance, to calculate the TRI value between one RIF and grounding, it firstly calculates the High Risk Influence (HRI) by increasing the probability of the state producing the strongest influence on grounding to 100%. Secondly, the Low Risk Inference (LRI) of grounding is calculated by increasing the probability of the state generating the lowest influence on grounding to 100%. Next, the average value of HRI and LRI is taken as the TRI of each RIF for the 'grounding' accident type. Then, a similar procedure is applied to the other accident types respectively. Therefore, the TRI values of variables for different accident types are obtained, which illustrate the RIFs' influences on accident types. The average TRI values, representing the RIFs' effects on 'accident type', are used to rank the variables.
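The TRI calculation can be illustrated with a simplified sketch. It assumes that HRI and LRI are measured as the absolute shifts of the accident probability from its baseline when the RIF is locked to its most and least influential state, and that the RIF is a direct parent of the accident node so the baseline is a weighted sum; in the full BN these values would come from network inference. The function name and all numbers are hypothetical.

```python
def true_risk_influence(prior, p_accident_given_state):
    """Return (HRI, LRI, TRI) for one RIF against one accident type.

    prior: {state: P(state)} for the RIF.
    p_accident_given_state: {state: P(accident type | state)}.
    """
    # Baseline probability of the accident type with the RIF at its prior.
    baseline = sum(prior[s] * p_accident_given_state[s] for s in prior)
    # Lock the RIF to the state with the strongest influence (probability 100%).
    hri = max(p_accident_given_state.values()) - baseline
    # Lock the RIF to the state with the weakest influence (probability 100%).
    lri = baseline - min(p_accident_given_state.values())
    return hri, lri, (hri + lri) / 2

# Hypothetical three-state RIF and its effect on 'grounding'.
prior = {"slow": 0.5, "normal": 0.3, "fast": 0.2}
p_grounding = {"slow": 0.10, "normal": 0.20, "fast": 0.40}
hri, lri, tri = true_risk_influence(prior, p_grounding)
```

Here the baseline grounding probability is 0.19, so HRI is 0.21, LRI is 0.09 and TRI is 0.15. Under this reading the TRI equals half the spread between the most and least influential states, which is why it remains usable for variables with more than two states.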
Apart from the above, there are two axioms to be satisfied in sensitivity analysis (Fan et al., 2020c; Yang et al., 2009; Zhang et al., 2013).
Axiom 1. A slight increase or decrease in the prior probabilities of each RIF should result in a corresponding increase or decrease in the posterior probability of the target node.
Axiom 2. The total influence of the combined probability variations of x parameters should be no smaller than that from a set of y (y ∈ x) RIFs.
To meet the axioms, minor changes of variables are updated in the scenario simulation. Specifically, the state of the first node generating the highest changed value of the first state of 'accident type' is increased by 10%, while the state of the first node generating the lowest changed value of the first state of 'accident type' is decreased by 10%, referring to Yang et al. (2018b). Then, the same approach is applied to the second node of RIFs, and the value is updated. Next, the third node is also included in the same process. In this way, the updated values of the first state of 'accident type' gradually increase or decrease as more RIFs are included. Subsequently, similar procedures are applied to the second, third and fourth states of the node 'accident type'. If the updated values of 'accident type' gradually increase or decrease along with the continuously changing RIFs, the two axioms are satisfied.
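The monotonicity requirement behind the two axioms can be checked mechanically once the updated probabilities have been read off the model. The sketch below is a simplified reading of the axioms, not the authors' test procedure: it takes the baseline probability of one state of 'accident type' and the sequence of values obtained as the first, first two, first three (and so on) RIFs are changed, and verifies that every shift is in the expected direction and that the cumulative shift never shrinks. The function name and the numbers are hypothetical.

```python
def check_axioms(baseline, updated, direction):
    """Check cumulative sensitivity results against the two axioms.

    baseline: probability of the target state before any change.
    updated: probabilities after changing 1, 2, 3, ... RIFs cumulatively.
    direction: "increase" or "decrease", the expected movement.
    """
    sign = 1 if direction == "increase" else -1
    shifts = [sign * (value - baseline) for value in updated]
    # Axiom 1 (simplified): each change moves the target the expected way.
    axiom1 = all(shift > 0 for shift in shifts)
    # Axiom 2: changing more RIFs has at least as much total influence.
    axiom2 = all(later >= earlier for earlier, later in zip(shifts, shifts[1:]))
    return axiom1 and axiom2

# Hypothetical values for one state of 'accident type'.
ok_up = check_axioms(0.40, [0.42, 0.44, 0.47], "increase")
ok_down = check_axioms(0.40, [0.39, 0.37, 0.36], "decrease")
violated = check_axioms(0.40, [0.42, 0.41, 0.47], "increase")
```

The first two sequences move steadily away from the baseline and pass, whereas the third drops back after the second RIF is included and therefore fails the Axiom 2 check.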

Scenarios setting and real case testing
The model validation is also conducted by simulating past maritime accidents. Given the observed states of several nodes, the way the target node and other nodes respond in the model indicates whether the model is consistent with reality. By simulating past maritime accidents with the associated parameter settings, the configuration can be tested when some states of nodes are given. In addition, real case testing also provides a plausible explanation of the other nodes for the observed findings, which generates insights for maritime accident investigation.
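The two directions of reasoning used in such tests, forward prediction and backward diagnosis, can be illustrated on the smallest possible network with one RIF and one accident type. The sketch below applies Bayes' rule directly; the probabilities are hypothetical and are not taken from the paper's CPTs, and a real case test would instead enter evidence on several nodes of the full BN.

```python
def posterior(prior, likelihood):
    """Bayes' rule: P(state | evidence) from P(state) and P(evidence | state)."""
    joint = {state: prior[state] * likelihood[state] for state in prior}
    evidence = sum(joint.values())
    return {state: value / evidence for state, value in joint.items()}

# Hypothetical numbers for one RIF ('weather condition') and one accident type.
prior = {"good": 0.6, "poor": 0.4}
p_grounding_given = {"good": 0.1, "poor": 0.3}

# Forward prediction: marginal probability of grounding before any evidence.
p_grounding = sum(prior[s] * p_grounding_given[s] for s in prior)

# Backward diagnosis: belief about the weather once a grounding is observed.
diagnosis = posterior(prior, p_grounding_given)
```

With these numbers the marginal grounding probability is 0.18, and observing a grounding raises the belief in poor weather from 0.4 to about 0.67, which is the kind of shift that real case testing compares against the findings of the accident report.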

Factor selection
From the established database extracted from the GISIS, MAIB and TSB, there are in total 61 vessels for maritime accident analysis. 'Accident type' is set as the root node in the model, and the states of the root node 'accident type' are shown in Table 3.
With regard to the accidents occurring in canals or in transit voyage segments, the most common types are collision (44.3%) and grounding (21.3%), due to the limited space for ships to manoeuvre effectively and the hydrodynamic effects of the waterways. Besides, contact with or crushing against the bank also occurs in the canal or channel. Other accidents, including capsize, sinking and falling overboard, are not commonly observed in restricted waters. In terms of the location of shipping accidents, the details are shown in Table 4.
Although the database is developed based on 61 vessels, the relevant RIFs are very similar in each investigated accident and hence representative. The RIFs used in this study are filtered based on the frequency of occurrence in Section 3.1. Regarding the factor selection, the details of each RIF are presented in Table 5.
The states of the RIFs are defined according to the literature and accident reports. For example, 'ship age', 'length', 'gross tonnage' and 'time of day' were graded according to Wang and Yang (2018) and Fan et al. (2020b). 'Ship type' is defined according to the records in accident reports. In addition, most two-state RIFs are graded based on whether they are blamed for the failures in accidents. Some RIFs are similar to risk factors in previous maritime accident analyses but show distinct characteristics within restricted waters. Such RIFs are explained in detail below.

(1) Weather condition
The weather condition of maritime accidents refers to wind, fog, or visibility. Good weather is described in the reports as clear sky, light/low intensity wind, fresh breeze, overcast and good visibility. Poor weather, by contrast, refers to rain, drizzle, heavy fog, and poor visibility (300-400 m or less than 100 m). Reports highlighting that wind pushes the vessel towards one side, or that the vessel is susceptible to wind, are also classified as poor weather conditions. For example, the BIRKA EXPRESS in Section 2.2 is particularly susceptible to the wind because of the high superstructure around her forecastle.

Table 3
The states of node 'accident type' (columns: State; Accident type).

(2) Sea condition
The sea condition of maritime accidents refers to the tides, currents, and waves. Good sea condition is stated as slight seas and swell, calm, nil swell, flat sea, or weak currents, while poor sea condition is swell with significant waves, neap tide or strong currents.
(3) Fairway condition
For general maritime accidents, the fairway condition factor represents the density of traffic. With regard to the characteristics of transport in restricted waters, the fairway condition is closely linked with bank effects or hydrodynamic effects that cause a sheer to the port side and drift induced by the current.
(4) Ship speed
Each ship has a certain speed that it may attain in canals but cannot exceed, which is due to the large physical displacement of water when the vessel passes through the canal compared with the canal's width and depth. For example, the permissible velocity through water is 16 km/h (approximately 8.6 knots) for non-tanker vessels and 14 km/h for tanker vessels, referring to the New Suez Canal Regulations. In narrow waterways, the effect of shallow water and high tide can also accelerate the vessel.

Data-driven modelling for maritime accidents
After collecting the maritime accident records, 25 RIFs are analysed to illustrate interdependencies in the BN model. In this way, the structure of the BN is trained and optimised by calculating the conditional mutual information described in Section 3.2. Then, it is carefully checked by domain experts and D-separation to ensure that all the links between the variables are meaningful.
Once the TAN structure is generated, the parameter learning of CPTs from the cases is conducted by using 'Learn using Gradient' (Yang et al., 2018a). The arrows from one RIF to the others represent their causal relationship. Such a relationship is learnt from the historical accident data. It means that the relationship is developed based on the statistical correlation between the RIFs and their contributions to the occurrence of accidents in restricted waters. The magnitude of such a relationship is modelled by the CPTs of each pair of related RIFs. After the CPTs are obtained, the posterior probabilities of each node can be calculated, and the results are shown in Fig. 1. The states and explanations of each node (i.e. RIF) are given in Table 5.
The statistical analysis of the node probabilities provides initial findings regarding the various RIFs. Among the accidents, collision is the most frequent accident type, accounting for 40.7%. In restricted waters, the largest share of the ships involved is relatively new, aged 0 to 5 years, accounting for 32.4%. Approximately 36.6% of the accidents in canals occur under poor weather conditions, predominantly involving wind or limited visibility. In addition, 47.4% occur in poor sea conditions, which are closely linked to currents, tides and waves. Regarding the fairway features of restricted waters, 36.7% of the vessels involved in the accidents are affected by hydrodynamic effects.

TAN model validation
The qualitative BN model is first validated by face validation using domain experts. Seventeen world-leading scholars in the maritime risk area (in terms of the number of SCI-cited core journal publications) were invited via email (see footnote 1). Thirteen of them provided feedback, of whom 9 agreed with the TAN trained network and 4 suggested the removal of some links. For these links, an extra test was conducted on their impact on the accuracy of the model results. New scenario tests and analyses were conducted using real cases (20% of the sample size) to compare the prediction performance of the modified BN with that of the original BN. The results showed that the original model in Fig. 1 (with 100% accuracy for all cases) outperforms the modified BN (with an average 98% accuracy in 13 real cases and 27.4% accuracy in 1 real case). As a result, the BN structure remains unchanged at this stage of the model validation.
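The figures reported above can be summarised with a few lines of arithmetic. The sketch assumes that the two accuracy figures for the modified BN can be pooled as a case-weighted average over the 14 test cases; that pooling is an interpretation, not a value reported in the paper.

```python
# Expert feedback reported in the text
experts_invited = 17
experts_replied = 13
experts_agreed = 9

response_rate = experts_replied / experts_invited   # share of invited experts who replied
agreement_rate = experts_agreed / experts_replied   # share of respondents who agreed

# Modified BN: (number of cases, accuracy) pairs reported in the text
modified_bn_results = [(13, 0.98), (1, 0.274)]
n_cases = sum(n for n, _ in modified_bn_results)

# Assumption: pool the two figures as a case-weighted mean
pooled_accuracy = sum(n * acc for n, acc in modified_bn_results) / n_cases

print(f"response rate:   {response_rate:.1%}")
print(f"agreement rate:  {agreement_rate:.1%}")
print(f"pooled accuracy: {pooled_accuracy:.1%}")
```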
Further model validation is conducted using other methods, including D-separation, sensitivity analysis, and a case study. First, the rationality and consistency of the BN are validated. The D-separation concept is used to investigate the relationship between the nodes 'length' and 'weather condition'. Given evidence on the node 'accident type', 'length' and 'weather condition' are independent of each other. Therefore, they are d-separated (conditionally independent), which is consistent with sound links and directions. Similar investigations are then conducted on other nodes and links, providing confidence that the BN structure is rational.
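A d-separation query can be checked mechanically with the standard ancestral moral graph construction. The sketch below applies it to a hypothetical three-node fragment in which 'accident type' is the only parent of 'length' and 'weather condition'. This fragment is an assumption for illustration and is not the full network of Fig. 1.

```python
def ancestors(parents_of, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents_of.get(node, ()))
    return seen


def d_separated(parents_of, x, y, given):
    """True if x and y are d-separated by the set `given` in the DAG.

    parents_of maps each node to the set of its parents.
    """
    keep = ancestors(parents_of, {x, y} | set(given))
    # Moralise the ancestral graph: link each child to its parents and marry co-parents
    neighbours = {n: set() for n in keep}
    for child in keep:
        parents = sorted(parents_of.get(child, ()))
        for p in parents:
            neighbours[child].add(p)
            neighbours[p].add(child)
        for i, p in enumerate(parents):
            for q in parents[i + 1:]:
                neighbours[p].add(q)
                neighbours[q].add(p)
    # Search for a path from x to y that avoids the evidence nodes
    blocked = set(given)
    stack, seen = [x], {x}
    while stack:
        node = stack.pop()
        if node == y:
            return False
        for nb in neighbours[node]:
            if nb not in seen and nb not in blocked:
                seen.add(nb)
                stack.append(nb)
    return True


# Hypothetical fragment: 'accident type' is the common parent of both RIFs
fragment = {
    "accident type": set(),
    "length": {"accident type"},
    "weather condition": {"accident type"},
}
print(d_separated(fragment, "length", "weather condition", {"accident type"}))
print(d_separated(fragment, "length", "weather condition", set()))
```

With evidence on the common parent the two RIFs are separated; without it they remain connected through that parent.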
Sensitivity analysis of the BN model is also conducted. The mutual information between RIFs and accident types is illustrated in Table 6.
Referring to Eq. (1), a higher I(s, α_i) indicates a stronger impact of the RIF on 'accident type'. From this point of view, 'ship type', with a mutual information value of 0.5458, has the most critical impact on the accident type. Variables with higher mutual information are selected as essential RIFs. To set a threshold for the selection, the average value over all RIFs is calculated, which is 0.1051. The variables with values higher than this threshold are extracted for further discussion: 'ship type', 'ship age', 'gross tonnage', 'passage plan', 'information', 'risk assessment', and 'weather condition'. They are then assessed in terms of the quantitative extent to which one RIF influences another in the BN model.
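The selection rule described here (keep the RIFs whose mutual information with 'accident type' exceeds the average) can be sketched as below. The base-2 logarithm, the function names and the toy records are assumptions; the values in Table 6 are not reproduced.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from two equal-length sequences."""
    n = len(xs)
    n_xy = Counter(zip(xs, ys))
    n_x = Counter(xs)
    n_y = Counter(ys)
    return sum(
        (count / n) * log2(count * n / (n_x[x] * n_y[y]))
        for (x, y), count in n_xy.items()
    )


def select_above_mean(scores):
    """Return the mean score and the names scoring above it, highest first."""
    threshold = sum(scores.values()) / len(scores)
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return threshold, [name for name, value in ranked if value > threshold]


# Hypothetical toy records (not taken from the accident database)
accident_type = ["collision"] * 4 + ["grounding"] * 4
rifs = {
    "ship type": ["cargo"] * 4 + ["passenger"] * 4,  # fully informative
    "time of day": ["day", "night"] * 4,             # uninformative
}
scores = {name: mutual_information(values, accident_type) for name, values in rifs.items()}
threshold, selected = select_above_mean(scores)
print(scores, threshold, selected)
```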
In view of the TRI calculation between RIFs and accident types, the updated values of the dependent node show that this model is in line with Axiom 1 (Fan et al., 2020b).
The first row in Table 7 represents the base-case scenario, and the following rows show the scenarios in which each state of the variable is set to 100%. The TRI values of 'ship type' against collision are then obtained. By calculating the TRI values of the RIFs against every accident type, the key factors for each accident type are identified. From Table 8, the most important factor for collision is 'ship type' with a TRI value of 39.25, and the most important factor for grounding is 'passage plan' with a TRI value of 28.20. Comparing the average TRI values, the most important variables for 'accident type' are ranked as follows:
Ship type > Ship age > Passage plan > Gross tonnage > Weather condition > Risk assessment > Information
To further validate the BN model, the combined effect of multiple RIFs on the accident types is tested, referring to Section 3.3.2, with the minor changes written as '~10%' in Table 9. Specifically, the first column of Table 9 represents the original values of the accident types in the TAN, and the following columns show the updated results with changed values of the RIFs. Note that each state of 'accident type' is calculated separately, i.e., each row, representing one accident type, is calculated through the updated change of states of the RIFs. The results show that the updated values of 'accident type' increase gradually as more RIFs are changed, which is in line with Axiom 2.
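The paper gives TRI as (HRI + LRI)/2 alongside Table 7. A minimal sketch of that calculation follows, assuming that HRI is the largest increase over the base-case probability and LRI the largest decrease below it when each state of the RIF is locked at 100%. This reading of HRI and LRI is an assumption, and the numbers are hypothetical rather than taken from Table 7.

```python
def true_risk_influence(base, scenario_probs):
    """Compute TRI = (HRI + LRI) / 2 for one RIF against one accident type.

    base: probability (%) of the accident type in the base-case scenario.
    scenario_probs: probabilities (%) obtained when each RIF state is set to 100%.
    """
    hri = max(scenario_probs) - base  # assumed: highest increase over the base case
    lri = base - min(scenario_probs)  # assumed: largest decrease below the base case
    return (hri + lri) / 2


# Hypothetical values: base case 40%, three RIF states locked in turn
tri = true_risk_influence(40.0, [70.0, 35.0, 20.0])
print(tri)
```

Under this reading the base value cancels out, so TRI equals half the spread between the highest and lowest scenario probabilities.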
Furthermore, a reported maritime accident that was not included in the database used for the BN construction is tested to validate the model. Based on the details of the accident report, particular states of the selected relevant variables are assigned a probability of 100%. The probability of each state of the 'accident type' node is then updated accordingly, giving the predicted accident type. The model is validated if the prediction is consistent with the actual accident type. The case is the grounding of the passenger vessel Royal Iris of the Mersey while manoeuvring toward Eastham Locks at the entrance to the Manchester Ship Canal on July 10, 2016. The states of the nodes in the BN model are assigned as in Fig. 2, based on the details of the accident report published by the MAIB (report number 11/2017).
(2) 'July 10, 2016 at 1254 (UTC+1)' (state 1) for time of day.
(3) 'Good visibility' (state 1) for the weather condition.
(4) 'The bridge team were navigating solely by eye and incorrectly assessed that the ferry was in safe water', which reflected the deficiencies in bridge team management and the improper use of the equipment and devices onboard.
The result of the updated BN model illustrates the high probability of this accident type being in state 2, i.e., grounding, which is consistent with reality. Moreover, this model reflects more information associated with the accident investigation. According to the statement of the accident report, the information shown on the chart of the area in terms of the status of the mooring dolphins was inaccurate but did not contribute to the grounding. However, the risk factor 'information' with a high probability of being in state 2 is observed in the proposed model, as shown in Fig. 2, which supports the investigation result of the 'information' factor in the accident. It further demonstrates the reliability as well as the usefulness of the constructed BN model.
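The updating step used in this case test (lock observed nodes at 100% and read off the posterior of 'accident type') can be sketched with a naive Bayes simplification, in which every RIF depends on the accident type only. The RIF-to-RIF links of the TAN are omitted for brevity, and all probabilities below are hypothetical, not the CPTs behind Fig. 2.

```python
def posterior_accident_type(prior, likelihoods, evidence):
    """Return P(accident type | evidence) under a naive Bayes simplification.

    prior: {accident type: probability}
    likelihoods: {variable: {accident type: {state: probability}}}
    evidence: {variable: observed state}
    """
    scores = {}
    for accident, p in prior.items():
        score = p
        for variable, state in evidence.items():
            score *= likelihoods[variable][accident][state]
        scores[accident] = score
    total = sum(scores.values())
    return {accident: s / total for accident, s in scores.items()}


# Hypothetical numbers for two accident types and two observed RIFs
prior = {"collision": 0.5, "grounding": 0.5}
likelihoods = {
    "weather condition": {
        "collision": {"good": 0.5, "poor": 0.5},
        "grounding": {"good": 0.75, "poor": 0.25},
    },
    "bridge team management": {
        "collision": {"adequate": 0.5, "deficient": 0.5},
        "grounding": {"adequate": 0.25, "deficient": 0.75},
    },
}
evidence = {"weather condition": "good", "bridge team management": "deficient"}
post = posterior_accident_type(prior, likelihoods, evidence)
predicted = max(post, key=post.get)
print(post, predicted)
```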
On the other hand, this model helps illustrate the possible state of each single risk factor by setting parameters in the proposed BN, which benefits ongoing and future investigations of maritime accidents in restricted waters. For example, one of the world's largest container ships, the Ever Given, an Ultra-Large Container Vessel (ULCV) capable of carrying over 20,000 shipping containers, ran aground on the east bank of the Suez Canal on March 23, 2021 and blocked the Canal.
The Suez Canal is an artificial sea-level waterway with a length of 193 km, connecting the Mediterranean Sea and the Red Sea. It is the shortest maritime route between Europe and South Asia and conveys 12% of global shipping. Based on the information released by the press so far, parameter settings for some variables in the proposed BN model can be obtained, as shown in Fig. 3.
(1) 'Container ship' for ship type and '3 years' for ship age (the year of build is 2018).
(2) The weather condition is poor, because the ship 'was suspected of being hit by a sudden strong wind, causing the hull to deviate from the waterway and accidently hit bottom' (Maguire, 2021). In addition, the 'sail effect' worsens the situation, as containers piled high on top of a large vessel make it more susceptible to strong winds (Michaelson and Safi, 2021).
(3) As for the fairway condition, it has been reported that the ship was fifth in a northbound convoy, with fifteen vessels queued behind it when it ran aground. Within just six days, a traffic jam of over two hundred vessels built up in both directions (International, 2021).
(4) '07:40' for the time of day.
Under the circumstances shown in Fig. 3, the model provides a plausible configuration for the observed findings. The above information is used to assign the CPTs of the corresponding nodes in the model. Specifically, the model reveals a very high probability of 82.0% for the vessel to be involved in grounding, which further validates the proposed BN model. Moreover, the ship length has a probability of 97.2% of being in state 2 (i.e., more than 100 m), and the gross tonnage has a probability of 96.4% of being in state 3 (i.e., more than 10000 GT), which is consistent with reality ('220940 GT' for the gross tonnage and '399.94 m' for the length).

Risk prediction and implications from the Suez Canal blockage
This study explains the differences among critical factors contributing to different types of accidents in restricted waters and provides the most probable scenarios with reference to specific conditions. The Suez Canal case study generates insights for accident analysis and accident prevention by explaining the most probable scenarios. The Suez Canal Authorities stated in a press conference that weather conditions were not the main cause of the ship's grounding and that technical or human errors may have been involved. Although there is so far no public statement on the contributing factors of the accident, this study provides a systematic perspective considering vessel factors, environmental factors, and human factors. Specifically, the states of the nodes in Fig. 3 indicate the occurrence probabilities of the identified contributing factors.
(1) There is a probability of 86.9% that the Ever Given had insufficient information.
Vessels need to obtain adequate and updated information in shipping, especially in restricted waters. Insufficient information may relate to poor quality of equipment data, falsified records of information, reliance on a single piece of navigational equipment, or the absence of working indicators or lights for necessary observation.
(2) There is a probability of 71.9% that the Ever Given had poor communication.
Communication makes teamwork onboard safe and effective, while poor communication not only increases the risk in daily routine duties but also worsens the situation in an emergency response.
(3) There is a probability of 84.8% that the Ever Given was involved in a complacent issue.
A complacent issue refers to the situation in which seafarers are complacent about their duties and underestimate the severity of the condition, which leads to actions contributing to subsystem failures. It emphasises the high risk of complacent issues contributing to the misconduct of seafarers and wrong decision-making.
(4) There is a probability of 90.2% that the Ever Given operated under a dysfunctional management system.
The management system consists of shore management, maintenance management, bridge resource management, bridge team management, port service, safety management system, qualification examination, training, practice, and emergency drill. In restricted waters, a dysfunctional management system contributes to technical or non-technical errors due to human and organisational factors.
(5) There is a probability of 94.4% that the Ever Given had an inadequate safety culture.
Safety culture defines how safety is managed onboard a vessel and can be described as the way of doing things on board. It reflects the perceptions and values of the crew concerning safety, which may be influenced by management factors and commercial pressure. However, it can be difficult to measure or quantify.
The Suez Canal case study results show the probability of node states when the states of other nodes are observed, which provides a plausible explanation of the investigated nodes for the observed findings. Therefore, this model explains the states of risk factors and reveals the interrelationships between them, helping investigate the hidden causes of maritime accidents in restricted waters. Through scenario testing, the proposed model provides clues about deficiencies in the system and the probability of ineffective responses to accidents. In addition, the proposed model helps explore the possibility of reducing risks. The states of other nodes also contribute to the value of the grounding probability in the BN model. For example, by only assigning the state of 'communication' to be '1' (good communication), the grounding probability of such an accident decreases from 82% to 61.4%. By only assigning the state of 'complacent' to be '1' (good and proper perceptions of duties and situations), the grounding probability decreases to 0.037%. These results offer insights for applying the model to past and ongoing investigations of maritime accidents in restricted waters, which benefits backward risk cause diagnosis (after the occurrence of an accident) and forward risk prediction (when RIFs trigger a high risk) for accident prevention.
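The effect of the 'communication' scenario can be expressed as an absolute and a relative reduction in grounding probability. The sketch below uses only the two figures reported above (82% and 61.4%); the function name is illustrative.

```python
def risk_reduction(baseline_pct, scenario_pct):
    """Return the absolute (percentage points) and relative reduction in probability."""
    absolute = baseline_pct - scenario_pct
    relative = absolute / baseline_pct
    return absolute, relative


# Grounding probability before and after setting 'communication' to state 1
abs_drop, rel_drop = risk_reduction(82.0, 61.4)
print(f"absolute reduction: {abs_drop:.1f} percentage points")
print(f"relative reduction: {rel_drop:.1%}")
```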

Conclusion
This study proposes a data-driven TAN approach to investigate how different RIFs generate impact on maritime accident types in restricted waters. Maritime accident reports from the MAIB and TSB from 2012 to 2017 and the GISIS from 2005 to 2021 are reviewed and refined to develop the primary database to identify RIFs. Then, the BN model is constructed by the TAN approach to analyse risk factors in maritime accidents. The sensitivity analysis and case study are conducted to validate the model. Lastly, the Suez Canal blockage case study is analysed to provide insights for risk assessment and implications for accident investigation.
According to the mutual information and TRI calculations, the important RIFs for shipping accidents in restricted waters are ranked against accident types, i.e., 'ship type', 'ship age', 'passage plan', 'gross tonnage', 'weather condition', 'risk assessment', and 'information'. The model shows that ship length contributes less to risk in restricted waters than in ports. This is because a ship needs to change its course in ports, whereas in canals and channels it maintains a certain course following the geological characteristics of the waterway. It is difficult for a large ship to change course; in other words, it is comparatively easy for a large ship to maintain its course in canals and channels. The statistics also show that, among all the investigated accidents, most ships involved are small to medium ships (200 m or less). Although ship length as an individual factor has shown limited impact on accidents in restricted waters, its combined effect with other factors, such as 'gross tonnage', has shown its importance indirectly. Specifically, factor identification against different accident types is demonstrated according to the TRIs of the RIFs against each accident type. Meanwhile, the case study offers a plausible explanation for the observed findings through scenario analysis. There is a high probability that the Ever Given accident involved insufficient information, poor communication, a complacent issue, a dysfunctional management system, and an inadequate safety culture.
Compared with the established general models in inland waterways (Zhang et al., 2013) and coastal waters (Wang and Yang, 2018), the proposed model shows very different results, reflecting the unique characteristics of restricted waters. The general models could not be used to analyse the shipping risk in restricted waters because they fail to incorporate a few key RIFs such as 'fairway condition'. Conversely, some RIFs considered in the general models, such as 'season', are not applicable in restricted waters. Although different accident investigation organisations have a variety of methods or frameworks to conduct investigations, this model identifies contributing factors by predicting the probabilities of the nodes' states in the model regarding human factors, environmental factors and vessel factors. Therefore, it helps identify potential hazards by predicting contributing factors and effectively assists maritime authorities with accident investigation.
As for the lessons learned from maritime accident reports in restricted waters, it is evident that the manoeuvring of a vessel is heavily affected by the effect of lateral banks. Recommendations are given to provide awareness training so that crews are prepared for manoeuvring in restricted waters. For example, the multipurpose cargo ship BBC STEINHOEFT (C0008890 -M11C0001, IMO number 9358046) grounded in the South Shore Canal of the St. Lawrence Seaway on March 31, 2011. As the vessel approached the entrance to the narrower area of the channel, it suffered the bank suction effect, which caused a sheer to port. In addition, factors such as wind, current, and drift are also closely linked with grounding and collision of vessels in restricted waters. The joint impact analysis using the proposed model in this paper can help prevent the occurrence of similar accidents in future. In some cases, these factors cannot be regarded as the leading causes of the accidents, because risk factors, especially human factors, interact with each other. Therefore, lessons from marine casualties highlight the significance of human factors regarding negligence and good seamanship. For example, good communication decreases the grounding probability in the Suez Canal accident from 82% to 61.4%. Proper perceptions of duties and situations significantly decrease the probability of grounding. The success of crews in intervening in minor mistakes or violations benefits the risk control of the system. From this perspective, risk assessment of maritime accidents in restricted waters provides insights for accident prevention strategies.
Generally, the results from the proposed approach differentiate the vital RIFs contributing to different types of accidents in restricted waters, which helps provide implications for accident investigation and prevention. As the wind is not the leading cause of the accident, the case study of the Suez Canal blockage offers a plausible explanation of the scenario to find the potential hazards from a systematic perspective. It provides clues for the accident investigation. Captains are recommended to take training courses aimed at safely navigating ships in restricted waters under wind and current effects.
Furthermore, the proposed model is capable of simulating the interactions between risk factors and presenting the probability of the states of investigated nodes, which helps guide relevant investigations. Maritime authorities and ship owners can obtain information from the proposed model to investigate accidents, manage the risk levels of a voyage, and reduce the recurrence of similar accidents that cause enormous economic losses or casualties affecting the whole associated supply chain. Nevertheless, the limitation of this study lies in the comparatively small number of data records used in the model for any particular restricted water, meaning that the findings are only representative of generic restricted waters. When more data becomes available for a particular restricted water or region, the model in this study can be used as the basis to support further in-depth analysis (e.g., location-related analysis) and generate results with more specific implications for accident prevention. A thorough analysis shows that many accident records, for narrow waters in particular and for other waters in general, lack detailed information on the investigated RIFs. Therefore, a compromise solution is adopted: all the maritime accident reports are analysed and the primary data are derived by the authors. This work identifies and refines every piece of data with detailed attributes, which increases the quality of the data. Collecting the primary data also makes it feasible to adjust the RIFs (i.e., nodes of the BN) to reach an optimal balance between accuracy and ease of use. Through this fine-tuning process, the model delivers robust results even for some recent accidents. It helps create a number of scenarios in the form of IF-THEN by locking a few nodes (including the occurrence of the accident), thus producing the possible causes (the change of root causes). This further verifies the model and hence generates useful guidance for accident investigation and prevention.
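The backward use of the model (lock the accident as having occurred, then read off the likely state of a root cause) amounts to applying Bayes' rule. The two-node sketch below illustrates this IF-THEN style of diagnosis. The node names follow the paper, but the prior and conditional probabilities are hypothetical.

```python
def diagnose_cause(prior, p_accident_given_cause):
    """Return P(cause state | accident occurred) by Bayes' rule.

    prior: {cause state: prior probability}
    p_accident_given_cause: {cause state: P(accident | cause state)}
    """
    joint = {state: prior[state] * p_accident_given_cause[state] for state in prior}
    evidence = sum(joint.values())  # P(accident)
    return {state: value / evidence for state, value in joint.items()}


# Hypothetical numbers: IF grounding has occurred, THEN how likely was poor communication?
prior_communication = {"good": 0.5, "poor": 0.5}
p_grounding = {"good": 0.2, "poor": 0.6}
diagnosis = diagnose_cause(prior_communication, p_grounding)
print(diagnosis)
```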

Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Zaili Yang reports financial support was provided by European Research Council.

Table 1
Risk factors extracted from maritime accident reports.

Table 2
Typical Bayesian learning algorithm.

Table 4
The location of shipping accidents.

Table 5
The details of each RIF.
Table 7 presents the TRI value of 'ship type' against collision, where TRI is equal to (HRI + LRI)/2. Table 8 shows all TRI values of RIFs for all accidents.

Footnote 1: Among the scholars with at least 2 Q1 journal publications in maritime risk and Bayesian networks, 2 are from the authors of this manuscript and 1 has a conflict of interest with the first author of this work.

Table 6
Mutual information between RIFs and accident types.

Table 7
TRI calculation between ship type and collision.

Table 9
Accident rate of the minor change in variables.