Association of Flood Risk Patterns with Waterborne Bacterial Diseases in Malaysia

Flood risk has increased distressingly, and the incidence of waterborne diseases, such as diarrhoeal diseases from bacteria, has been reported to be high in flood-prone areas. This study aimed to evaluate the flood risk patterns and the plausible application of flow cytometry (FCM) as a method of assessment to understand the relationship between flooding and waterborne diseases in Malaysia. Thirty years of secondary hydrological data were analysed using chemometrics to determine the flood risk patterns. Water samples collected at Kuantan River were analysed using FCM for bacterial detection and live/dead discrimination. The water level variable had the strongest factor loading (0.98) and was selected for the Flood Risk Index (FRI) model, which revealed that 29.23% of the plotted data were high-risk, and 70.77% were moderate-risk. The viability pattern of live bacterial cells was more prominent during the monsoon season compared to the non-monsoon season. The live bacterial population concentration was significantly higher in the midstream (p < 0.05) during the monsoon season (p < 0.01). The flood risk patterns were successfully established based on the water level control limit. The viability of waterborne bacteria associated with the monsoon season was precisely determined using FCM. Effective flood risk management is mandatory to prevent outbreaks of waterborne diseases.


Introduction
Floods are the most common type of natural disaster and can have devastating impacts on over two billion people around the world [1]. According to the International Disaster Database (EM-DAT), floods affected more people globally and caused more damage than any other type of natural disaster in the 21st century [2]. Moreover, 95% of people living in Asia would be affected, on large and varied landmasses including multiple river basins, floodplains, and other high-risk zones of natural hazards, as well as the high-density populations in disaster-prone areas [3]. It was also reported that there have been tremendous economic losses worldwide for low-, middle-, and high-income countries due to flood damage [2,4–6]. Flood magnitude and frequency in various locations are expected to increase over time as a result of uncontrolled development and climate change [1,7,8].
Flood risk in Malaysia has increased distressingly in recent decades. Malaysia is one of the Asian countries predisposed to flood risk, with approximately 33,298 km² of the

Flood Risk Patterns Using Chemometrics Techniques

Data Collection of Hydrological Data
Secondary hydrological data from selected DID hydrological stations in Kuantan were compiled over a thirty-year period. The data were retrieved on 8 January 2019 from the National Hydrological Network Management System (SPRHiN) website (http://sprhin.water.gov.my/, accessed on 5 May 2023). Detailed information regarding the data, particularly the type of variable, the names and coordinates of the stations, and the dates and years of all records, was collected by the competent DID officers, either manually or via telemetry, and stored in the DID databank system. The data were analysed with chemometric techniques using XLSTAT software (Addinsoft, New York, NY, USA) and JMP software (SAS Institute, Cary, NC, USA).

Data Analysis Using Chemometrics
• Factor Analysis (FA)
FA measures the underlying, error-free latent (unobserved) variables and condenses a large number of correlated variables into a smaller set of variables known as factors. The technique manages the data by recognising the most useful and significant variables, arising from variations in spatial and temporal characteristics, that can describe the entire dataset. It also minimises the loss of original information, as expressed by the FA equation:

$X_i = a_{i1}F_1 + a_{i2}F_2 + a_{i3}F_3 + \cdots + a_{im}F_m + e_i$ (1)

where X is the measured variable, F is the factor score, a is the factor loading, i is the sample number, m is the total number of factors, and e is the measurement error or variance. Varimax rotation was applied to maximise the difference between the variables, which makes the data easier to interpret [23]. The rotation also produces new groups of variables known as varifactors (VFs). The number of VFs obtained from the varimax rotation equals the number of groups of variables that share similar features, including unobservable and hypothetical data. After the VFs were obtained, the level of significance of the variables was classified according to their factor loadings. A factor loading with a correlation coefficient of more than 0.70 was regarded as a strong factor loading for further analysis [24].
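To make Equation (1) concrete, the following minimal Python sketch evaluates the factor model for a single variable and applies the 0.70 cut-off for strong loadings. The function names, the example loadings and factor scores, and the use of the absolute value of the loading are illustrative assumptions, not values or conventions taken from the study.

```python
from typing import Sequence


def reconstruct_variable(loadings: Sequence[float],
                         scores: Sequence[float],
                         error: float = 0.0) -> float:
    """Evaluate Equation (1): X_i = a_i1*F_1 + ... + a_im*F_m + e_i."""
    if len(loadings) != len(scores):
        raise ValueError("one factor score is needed per factor loading")
    return sum(a * f for a, f in zip(loadings, scores)) + error


def is_strong_loading(loading: float, threshold: float = 0.70) -> bool:
    """Flag a loading as strong when its magnitude exceeds the cut-off.

    Taking the absolute value is an assumption made here so that strongly
    negative loadings are also flagged.
    """
    return abs(loading) > threshold


# Hypothetical two-factor example (values are not from the study).
x_i = reconstruct_variable(loadings=[0.9, 0.1], scores=[1.5, -0.4], error=0.05)
strong_flags = [is_strong_loading(a) for a in (0.9, 0.1)]
```

In this sketch the first loading (0.9) is flagged as strong and the second (0.1) is not, which mirrors how only variables loading above 0.70 are carried forward.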

• Statistical Process Control (SPC)
Time series analysis was important for predicting water levels in the study area. It was performed using the SPC technique, an analytical control chart that continuously visualises the quality level of the selected variable over time. The control chart establishes control limit lines against which the condition of the variable is measured. It can also reveal trends and patterns that show how the actual data deviate from the historical baseline and dynamic threshold, and it can detect unusual resource usage [24].
For this analysis, SPC was applied to the hydrological variable selected on the basis of the factor loadings generated by FA. The control limits for the selected variable were the Upper Control Limit (UCL), Central Control Limit (CCL), and Lower Control Limit (LCL). The UCL indicates the maximum capacity that the river can support; if the observed value exceeds this limit line, the possibility of flooding is considered very high [24]. The equations for this analysis were:

Moving Range = Plot: $MR_t$ for $t = 2, 3, \ldots, m$ (2)

where UCL is $3.267\,\overline{MR}$, CCL is $\overline{MR}$, LCL is 0, $\overline{MR}$ is the average moving range, t is time, and m represents the individual values, which are also associated with:

Average Value: $\tilde{x} = \frac{1}{m}\sum_{i=1}^{m} x_i$ (3)

where $\tilde{x}$ is the moving range, m represents the individual values, and $x_i$ is the difference between data points.
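The moving-range limits described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the conventional definition of the moving range as the absolute difference between consecutive observations; the function name and the sample water levels are hypothetical and not taken from the DID dataset.

```python
from typing import Dict, List, Sequence, Union


def moving_range_limits(values: Sequence[float]) -> Dict[str, Union[float, List[float]]]:
    """Compute moving ranges and the limits UCL = 3.267*MR_bar, CCL = MR_bar, LCL = 0."""
    if len(values) < 2:
        raise ValueError("at least two observations are needed")
    # MR_t = |x_t - x_(t-1)| for t = 2, ..., m (assumed conventional definition)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "moving_ranges": moving_ranges,
        "LCL": 0.0,
        "CCL": mr_bar,
        "UCL": 3.267 * mr_bar,
    }


# Hypothetical water levels in metres (illustrative only).
limits = moving_range_limits([17.0, 17.1, 17.3, 17.2, 17.6])
out_of_control = [mr for mr in limits["moving_ranges"] if mr > limits["UCL"]]
```

Any moving range above the UCL would be flagged for inspection; in this small example none is.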

• Flood Risk Index (FRI)
The FRI flood risk model was developed by combining several multivariate techniques, namely FA, SPC, and ANN. The model was designed to provide an effective guideline for assessing flood risk in the study area. The approach is a notable development in flood risk research, and the model has demonstrated its ability to be sustained in flood research studies. The FRI was created through a sequence of statistical analyses. First, FA was applied to select the variable with the highest factor loading as the starting point of the FRI model. SPC was then used to determine the control limit values for that variable, and the resulting UCL, CCL, and LCL guided the determination of the FRI ratio. The UCL was deemed an intolerable value for the variable and was considered to indicate a high-risk flood situation. The UCL value was therefore used to form the FRI, and the risk index was determined using Equation (4), in which UCLV is the UCL value of the variable, x is the highest value of the data, 100 is the upper bound of the risk index, and 70 is the threshold of the index for high risk. The FRI formula was designed to achieve the best flood risk model for monitoring the risk of flooding in the study area. The computed FRI values ranged from 0 to 100 and were classified into three categories: low-risk (0 to 34), moderate-risk (35 to 69), and high-risk (70 to 100). The choice of 70 to 100 for high risk was adapted from the Relative Strength Index (RSI) concept, in which 70 is considered an upper limit of overflow, above which values are intolerable. This concept has been applied in previous flood-related studies [14,24,25].
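The three FRI categories can be expressed as a small lookup. The sketch below is illustrative only: the function name is hypothetical, and treating non-integer values between 34 and 35 (or between 69 and 70) as belonging to the lower band is an assumption, since the text gives the bands in whole numbers.

```python
def classify_fri(fri: float) -> str:
    """Map an FRI value in [0, 100] to its risk category."""
    if not 0 <= fri <= 100:
        raise ValueError("FRI must lie between 0 and 100")
    if fri >= 70:
        return "high-risk"      # 70 to 100
    if fri >= 35:
        return "moderate-risk"  # 35 to 69
    return "low-risk"           # 0 to 34


categories = [classify_fri(v) for v in (12, 35, 69, 70, 100)]
```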
An artificial neural network (ANN) is an artificial intelligence (AI) information-processing system designed to imitate the human brain in data analysis, primarily to discover knowledge, patterns, or models in large amounts of data. ANNs use the back-propagation algorithm, which requires the training of a multi-layer feed-forward network composed of an input layer, one or more hidden layers, and an output layer [24]. In this study, the findings of the preceding analyses were used in an ANN-based FRI prediction model to determine the prediction accuracy for the selected variable. The technique also aimed to assess the accuracy of the new FRI by comparing the predicted risk with the actual risk. These risk comparisons were intended to provide a more detailed and meaningful view of the flood risk level in the affected area.
Two criteria were used to judge whether the prediction of each network was accurate and efficient: the coefficient of determination (R²) and the root mean square error (RMSE). A prediction is considered more accurate when the R² value is higher and closer to one and the RMSE value is lower and closer to zero. R² and RMSE can be defined as:

$R^2 = 1 - \frac{\sum_{i=1}^{n}(x_i - y_i)^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$ (5)

$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - y_i)^2}$ (6)

where $x_i$ represents the observed data, $y_i$ represents the predicted data, $\bar{x}$ is the mean of the observed data, and n is the number of observations.
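As a worked illustration of these two criteria, the following Python sketch computes R² and RMSE for a pair of observed and predicted series. It uses the conventional definitions (R² as one minus the ratio of residual to total sum of squares), which is an assumption about the exact form used in the study; the function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed (x_i) and predicted (y_i) values."""
    n = len(observed)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(observed, predicted)) / n)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((x - y) ** 2 for x, y in zip(observed, predicted))
    ss_tot = sum((x - mean_obs) ** 2 for x in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values (illustrative only).
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.0, 2.0, 3.0, 5.0]
fit_rmse = rmse(obs, pred)
fit_r2 = r_squared(obs, pred)
```

A perfect prediction gives an RMSE of zero and an R² of one, which is the direction in which the reported ANN results are judged.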

Water Sampling
Water sampling was conducted in two phases timed around the Northeast Monsoon to capture seasonal variation. The first phase was carried out during the monsoon season (November–February), on 18 December 2018, and the second phase took place outside the monsoon season, on 14 March 2019. New 1000 mL Thermo Scientific™ Nalgene™ Wide-Mouth High-density Polyethylene (HDPE) bottles were used for surface water sampling at each sampling site. Before being transported to the sampling sites, all sampling bottles were washed with a 70% ethanol solution (R&M Chemicals) and allowed to dry [26,27]. The sampling bottles were stored in a sterilised portable CoolFreezer CDF-45 (Waeco Mobile Solutions, Dometic Group) that was reserved exclusively for water sampling. For documentation and reference purposes, the sampling bottles were labelled with the required details, such as the name, location, coordinates, date, and time of the sample taken at the sampling sites [28,29].
All water samples were collected via the grab sampling technique, in which a container is held beneath the water surface and filled to obtain a sample at a selected location and time that reflects the composition of the water source at that location and time [28–30]. To prevent contamination, the mouth and inside of the sampling cap and bottle were carefully handled to avoid contact with any non-sterile objects. A small amount of air space was left in the sample bottles to facilitate mixing before FCM analysis. For preservation, the collected water samples were stored in a portable freezer box with ice packs at 4 °C. As this study involved bacterial analysis, the samples were analysed within 24 h of collection. All samples were immediately transported to the University of Putra Malaysia (UPM) laboratory for FCM analysis.

Bacterial Detection and Live/Dead Discrimination
Bacterial detection and live/dead discrimination by FCM were carried out according to BD Biosciences Immunocytometry Systems [31]. A BD™ Cell Viability Kit with BD Liquid Counting Beads (Catalog No. 349480; Becton, Dickinson and Company, BD Biosciences, San Jose, CA, USA) was used for the staining procedure. The BD™ Cell Viability Kit contains two fluorescent dyes: propidium iodide (PI) (Becton, Dickinson and Company, BD Biosciences) and thiazole orange (TO) (Becton, Dickinson and Company, BD Biosciences). These two dyes differ in cell permeability, which can be used to distinguish cells with different degrees of membrane integrity [32]. PI solution was used for staining dead cells and TO solution was used for staining all cells. Living cells have intact membranes and are impermeable to dyes such as PI, which penetrates only cells with damaged membranes, while TO is a permeable dye that enters all cells, both live and dead, to varying degrees. BD Liquid Counting Beads (Becton, Dickinson and Company, BD Biosciences), a flow cytometry bead standard, were used to enumerate the absolute counts of live, dead, and total bacteria. The staining procedure started by adding 500 µL of a water sample collected from the Kuantan River to a labelled disposable 12 × 75-mm BD Falcon™ polystyrene test tube (Becton, Dickinson and Company). The water sample was first vortexed using a vortex mixer. Five microliters of each dye solution were added to the tube, giving final concentrations of 420 nM for TO and 48 µM for PI. The mixture was briefly vortexed and incubated for 5 min at room temperature. Fifty microliters of BD Liquid Counting Beads were added to the staining tube using the reverse pipetting technique to determine the concentrations of live, dead, and total bacteria. The staining tube was capped and gently vortexed to mix the solution. The final mixture was analysed using a BD LSRFortessa™ (BD Biosciences, San Jose, CA, USA) analyser equipped with 488 nm (blue) and 640 nm (red) excitation lasers. After the analysis was completed, all stained samples and extra dye solutions were disposed of in accordance with local standard regulations.

Data Acquisition and Analysis
The FCM data on the water samples were analysed using BD FACSDiva™ (BD Biosciences, San Jose, CA, USA) software. Data acquisition was set to handle no more than 10,000 events per second. An unstained sample was analysed in parallel with each stained sample to confirm that the voltages of the photomultiplier tubes (PMTs) were appropriately set. A gating strategy was pre-set so that the bacterial population was positioned entirely on scale on a forward scatter (FSC) versus side scatter (SSC) plot, which allowed live and dead cells to be discriminated from background noise. The best discrimination of stained cells was visualised on FL1 (TO fluorescence) versus FL3 (PI fluorescence). Three main cell populations were expected, namely non-damaged viable (live) cells, intermediate (injured) cells, and membrane-damaged (dead) cells. The concentration of each bacterial cell population was determined using Equation (7):

$\frac{\#\,\text{events in cell region}}{\#\,\text{events in bead region}} \times \frac{\#\,\text{beads/test}}{\text{test volume}} \times \text{dilution factor} = \text{concentration of cell population}$ (7)

The data gathered from the FCM analysis were further analysed for the statistical significance of bacterial concentrations. Statistical comparison among water sampling station groups was performed using one-way analysis of variance (ANOVA), while comparison between season groups was assessed using the unpaired t-test. All analyses were performed using GraphPad Prism 9 software (GraphPad Software, San Diego, CA, USA). p values of less than 0.05 (p < 0.05) were considered statistically significant.
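The bead-based calculation in Equation (7) can be illustrated with a short Python sketch. The function name and all input numbers below are hypothetical; in particular, the number of beads per test depends on the bead lot and pipetted volume and is not stated in the text.

```python
def cell_concentration(cell_events: int,
                       bead_events: int,
                       beads_per_test: float,
                       test_volume: float,
                       dilution_factor: float = 1.0) -> float:
    """Equation (7): (cell events / bead events) * (beads per test / test volume) * dilution."""
    if bead_events <= 0 or test_volume <= 0:
        raise ValueError("bead events and test volume must be positive")
    return (cell_events / bead_events) * (beads_per_test / test_volume) * dilution_factor


# Hypothetical gate counts: 5000 events in the live-cell region, 1000 in the bead region,
# 50,000 beads added per test, 0.5 mL test volume, undiluted sample.
live_per_ml = cell_concentration(5000, 1000, 50_000, 0.5)
```

With these illustrative inputs the result is 500,000 cells per mL; the unit of the result follows the unit chosen for the test volume.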

General Descriptive
According to the general descriptive statistics of the hydrological data shown in Table 1, the mean water level in the Kuantan River was 17.10 m, with a standard deviation of 0.65 m. The minimum water level was 15.75 m and the maximum was 24.69 m. The stream flow variable was analysed to provide a clear picture of the flow rate in the river. The mean stream flow rate was 51.86 m³/s, with minimum and maximum values of 2.80 m³/s and 2164.00 m³/s, respectively. For the suspended sediment variable, the mean inflow was 846.42 tons/day. Over the past 30 years, the lowest suspended sediment value in the study area was 1.40 tons/day and the highest was 3985.70 tons/day. Meanwhile, the mean rainfall for the Kuantan district was 0.29 mm. The minimum rainfall was 0 mm, indicating no rain, and the heaviest rainfall recorded in the study area was 83.30 mm. These descriptive statistics characterise the data for each hydrological variable. Based on the standard deviations, variability was high for suspended sediment and stream flow, moderate for rainfall, and lowest for water level. The water level data had very low dispersion, as the coefficient of variation (CV) was less than one, whereas the other variables had a CV of more than one. The analysis therefore revealed that the water level variable had good data homogeneity, with a variability of 4% [33].
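The 4% variability quoted for water level follows directly from the reported mean and standard deviation. The sketch below shows the arithmetic; the function name is hypothetical, and only the water level figures (mean 17.10 m, standard deviation 0.65 m) come from the text.

```python
def coefficient_of_variation(mean: float, std_dev: float) -> float:
    """CV = standard deviation / mean (dimensionless)."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return std_dev / mean


# Water level: mean 17.10 m, standard deviation 0.65 m (values reported in the text).
cv_water_level = coefficient_of_variation(17.10, 0.65)
cv_percent = round(cv_water_level * 100, 1)
```

The result is about 3.8%, which rounds to the 4% stated for the water level variable and is well below a CV of one.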

Identification of the Significant Factors
The data in this study were further analysed with FA to determine the most significant variables that contributed to the flood events. FA was applied to the hydrological variables, and the largest factor loading identified the variable with the strongest association with the underlying latent variable. The most significant variable from this analysis was selected and applied to design the FRI model for the flood warning system, which could serve as a useful tool in flood preparation and management. Figure 2 shows the scree plot used to evaluate the cut-off point of the strong factors selected for interpretation. The diagram shows that only one of the three principal factors had an eigenvalue greater than one (>1.0), which was associated with the cumulative variability of 52.27% of the total variance in the hydrological database. Varimax rotation was applied to better interpret the result [23]. Only one principal factor was therefore chosen to be transformed by varimax rotation, because it was the only principal factor with an eigenvalue of more than one (>1.0). The other two principal factors, with eigenvalues of less than one (<1.0), were neglected to avoid redundancy with the main factor [34,35].
The factor loadings after varimax rotation are shown in Table 2. Two factor loadings were obtained from the rotation, and together they represented 52.27% of the cumulative variability of the data. In this study, only a factor loading greater than 0.70 (>0.70) was selected for interpretation, as this value was considered a stable and strong loading [24,35]. The total variability in the first factor loading (F1) was approximately 48.68%, with large positive factor loadings for water level (0.98) and stream flow (0.97), while the factor loadings for suspended sediment (0.18) and rainfall (0.07) were noticeably small. In the second factor loading (F2), however, all hydrological variables had small factor loadings, and the variability in F2 was also very small, at 3.59% of the total variability.
In addition, Cronbach's alpha was used to analyse the reliability, or internal consistency, of the variables for each factor. Cronbach's alpha for F1 and F2 was 0.98 and 0.10, respectively. The minimum acceptable value for Cronbach's alpha was 0.70 [36]. Hence, the factor loadings of the variables in F1 were highly reliable and acceptable based on the internal consistency of the factor. The values for F2 were excluded from further discussion because Cronbach's alpha was very small.
Furthermore, FA also revealed the interrelationship, or correlation, between each variable and the underlying factors (Figure 3). The water level had the highest positive correlation of 0.97 with F1 and other variables, followed by stream flow with a very high positive correlation of around 0.94, suspended sediment with a very low positive correlation of approximately 0.12, and rainfall with the lowest positive correlation of 0.07. The correlation between the hydrological variables and F2 showed that only stream flow had a low positive correlation of 0.32, while suspended sediment had a moderate negative correlation of −0.59. Rainfall and water level had very low negative correlations of −0.19 and −0.12, respectively.
The FA findings indicated that the water level variable had the largest loading and the highest positive correlation with other variables and factors. This shows that water level was the variable most strongly associated with, and contributing to, flooding, and that it was the underlying latent factor in the study. The variable was considered the most significant indicator of flooding occurrence in the river. Hence, it was selected for the development of the FRI model.
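The selection of water level from the F1 loadings can be retraced with a short Python sketch. The loadings are the values reported in the text; the variable names are hypothetical, and the variance check assumes standardised variables, for which the share of total variance explained by a factor equals the sum of its squared loadings divided by the number of variables.

```python
# F1 factor loadings after varimax rotation, as reported in the text.
loadings_f1 = {
    "water level": 0.98,
    "stream flow": 0.97,
    "suspended sediment": 0.18,
    "rainfall": 0.07,
}

# Keep only loadings above the 0.70 cut-off used for interpretation.
strong = {name: a for name, a in loadings_f1.items() if a > 0.70}

# The variable with the largest loading is carried forward to the FRI model.
selected = max(loadings_f1, key=loadings_f1.get)

# Share of total variance explained by F1 (assumes standardised variables).
variance_share_f1 = sum(a ** 2 for a in loadings_f1.values()) / len(loadings_f1)
```

With the rounded loadings this gives a variance share of roughly 48.5%, close to the 48.68% reported for F1; the small difference is plausibly due to rounding of the published loadings.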

FRI Model
The water level values were analysed as a time series using SPC to compute the control limits of the selected variable for the flood warning system. The main purpose of this analysis was to evaluate the efficacy of SPC in determining the control limits for the selected variable. The control chart for the selected variable was used to monitor the real-time water level series and identify any alarming readings that exceeded the normal water level of the Kuantan River.
Figure 4 illustrates the SPC control chart of the time series analysis of the water level based on the individual moving range. The highest water level in the river was 24.69 m (observation: 68900), which was in 1994. This was followed by 23.
In addition, SPC analysis was used to compare the current flood alert system applied by the Malaysian DID with the flood risk model in this study (Figure 4). According to SPC, the UCL for the water level in the Kuantan River was 17.18 m, the CCL was 17.10 m, and the LCL was 17.07 m. The river could support water levels between the LCL of 17.07 m and the CCL of 17.10 m, but could not sustain water levels beyond the UCL of 17.18 m. The thresholds currently applied by the DID in the flood warning system are 17.00 m for the normal level, 20.00 m for the alert level, 20.75 m for the warning level, and 21.50 m for the danger level.
The FRI was generated using a combination of the algebra method to verify its efficacy and practicability in monitoring flood disasters. Based on SPC analysis, the water level control limit was used to develop the FRI formula (Equation (4)). The risk of flooding was categorised into high-risk, moderate-risk, and low-risk levels (Figure 5). FRI values ranged from 0 to 100, with a high-risk rate of 70 and above, a moderate-risk rate of 35 to 69, and a low-risk rate of 0 to 34. The high-risk level corresponded to values above the UCL line in the SPC control chart. The UCL was determined to be an intolerable value, indicating a high-risk flood event. Moderate risk was assigned to FRI values plotted between the CCL line and the UCL line, while low risk was assigned to FRI values plotted between the CCL line and the LCL line. The result in Figure 6 reveals that 29.23% of the total data plotted were classified as high-risk and 70.77% as moderate-risk. No values were plotted in the low-risk class. The high-risk values mostly occurred after 1991, which is consistent with the river's high rate of flooding in recent years.
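To compare the two sets of thresholds side by side, the following Python sketch maps a water level reading to the DID warning level and checks it against the SPC upper control limit. The threshold values come from the text; the function names, the use of "greater than or equal to" at each DID threshold, and the "below normal" label are illustrative assumptions.

```python
# DID flood warning thresholds for the Kuantan River (metres), highest first.
DID_LEVELS = [
    (21.50, "danger"),
    (20.75, "warning"),
    (20.00, "alert"),
    (17.00, "normal"),
]

SPC_UCL = 17.18  # upper control limit from the SPC analysis (metres)


def did_status(level_m: float) -> str:
    """Return the highest DID level whose threshold the reading has reached."""
    for threshold, name in DID_LEVELS:
        if level_m >= threshold:  # assumed comparison rule
            return name
    return "below normal"  # assumed label for readings under 17.00 m


def exceeds_ucl(level_m: float, ucl: float = SPC_UCL) -> bool:
    """True when the reading lies above the SPC upper control limit."""
    return level_m > ucl


# The 1994 maximum and a hypothetical reading just above the UCL.
peak = (did_status(24.69), exceeds_ucl(24.69))
modest = (did_status(17.50), exceeds_ucl(17.50))
```

The second reading illustrates the gap between the two systems: 17.50 m is still at the DID normal level, yet it already lies above the SPC-derived UCL of 17.18 m.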

Prediction Performance by ANN
The prediction of flood risk aligned with the FRI was intended to set and guide mitigating measures for preventing and managing floods in the study area. ANN analysis was performed to assess the precision and accuracy of the risk index information obtained. Table 3 presents the prediction performance for the training and validation of the water level. Overall, training and validation achieved the lowest RMSE of 0.002855, with five hidden nodes giving optimal results. For the training prediction of the water level, R² was 0.999937, with the lowest RMSE of 0.002855 and three hidden nodes for optimal results. The validation prediction also yielded a very high R² for water level, 0.999953, with the lowest RMSE of 0.002855 and five hidden nodes for optimal results.

Waterborne diseases are one of the major flood-related issues in flood-prone areas that have previously received little attention from researchers. This study was motivated by the desire to demonstrate a clear link between flooding and waterborne diseases in the study area. Figure 7 shows statistical data from the Malaysian Ministry of Health (MOH) database on reported cases of patients with waterborne infectious diseases in Pahang State from 2012 to 2017 (6 years). The data are presented to demonstrate the risks posed to the population in the study area. The total number of confirmed cases of waterborne infectious diseases based on the data is 4246. Bacterial food poisoning accounted for the most cases (3091 cases), followed by leptospirosis (1069 cases), dysentery (39 cases), and melioidosis (37 cases), with typhoid or paratyphoid having the fewest cases (10 cases). The number of bacterial food poisoning cases increased dramatically from 2014 to 2017, with a sudden spike of 1333 cases in 2016. Meanwhile, cases of leptospirosis also gradually increased over the six years. Moreover, dysentery cases became more noticeable in 2016. The data also revealed that there were outbreaks of melioidosis cases in 2014 and 2015, whereas the prevalence of typhoid or paratyphoid disease remained low. As a result, waterborne diseases became a major concern for the local population in the study area, as they lived in the high-risk flood areas, and the population will remain at risk in the future while dealing with this issue.

Waterborne Bacterial Detection and Live/Dead Discrimination
The assessment of the bacterial viability of the surface water samples of the Kuantan River was undertaken using the FCM technique with the staining procedure using the BD™ Cell Viability Kit, which contained two fluorescent dyes. PI, a membrane-impermeable fluorescent dye, was applied to label dead or dying cells with damaged membranes, and TO was applied to identify viable or live cells. Simultaneous PI and TO staining for each water sample resulted in a reproducible and distinctive pattern of bacterial cell viability in red fluorescence over green fluorescence plots. The electronic gating strategy was applied to differentiate the bacterial signals of either live, injured, or dead cells, from the background noise. Cells were finally gated on FITC-A (FL1) versus PerCP-A (FL3), which distinctly showed the discrimination of stained cells among non-damaged viable (live) cells, intermediate (injured) cells, and membrane-damaged (dead) cells. Figure 8 displays the dot plots together with gating zones and the contour plots of bacterial cells for the water samples collected from the upstream, midstream, and downstream of the Kuantan River during the Northeast Monsoon season. The fluorescence intensity of viable cells for all streams was higher compared to that of injured and dead cells, which had low intensity. Therefore, the viability pattern of live cells was determined to be high, mainly for the midstream, which had the highest percentage of parents (98.9%) for the subpopulation in the hierarchy, followed by the downstream (95.8%), and lastly, the upstream (90.6%). The high number of live cells indicated the large dynamic changes in bacterial growth in the water sample in the study area.

Furthermore, Figure 9 reveals the dot plots of FCM for water samples during the non-Northeast Monsoon season. The viable cells for all streams also indicated intense fluorescence. However, the results also showed that the fluorescence intensity for injured and dead cells was becoming more prominent, indicating an increase in cell injury and death in the stained water samples for all streams for the non-monsoon season. In the upstream, the percentage of parents for live bacteria cells (70.2%) for the subpopulation was relatively higher compared to that of the midstream (55.6%) and downstream (68.5%), but the number of events was less than that in other streams.

Concentrations of Live Bacterial Population
The concentration of the bacterial population was defined as the number of live cells per unit volume. The findings are illustrated in the interleaved bar graphs with the error bars representing the standard error of the mean for three water samples. Figure 10 shows the comparison of absolute concentrations of live bacterial populations between three water sampling stations, namely, upstream, midstream, and downstream areas, during monsoon and non-monsoon seasons. According to the findings, during the monsoon season, the midstream had a significantly greater number of live bacterial cells in comparison to the upstream (p < 0.001) and downstream (p < 0.05), with the highest average bacterial population of approximately 599 cells/µL. The number of live bacterial cells in the upstream area, which was at Panching Waterfall, averaged around 71 cells/µL during the monsoon season and was significantly lower than the number of live bacterial cells in the downstream (p < 0.01) during the monsoon season, which was nearly 444 cells/µL.
Meanwhile, during the non-monsoon season, the concentrations of live bacteria populations in all streams were relatively similar, as displayed in Figure 10. The highest concentration was downstream with 289 cells/µL, followed by midstream with 238 cells/µL, and the lowest was upstream with 122 cells/µL. There were no significant differences in the concentration of live bacterial cells between all streams in the water sampling stations during the non-monsoon season.

In this study, the concentrations of the live bacterial population were also compared between monsoon and non-monsoon seasons for all three streams of the river (Figure 11). The concentration of the live bacterial population in the midstream was significantly increased by 2.5-fold during the monsoon season compared to the non-monsoon season (p < 0.01). The monsoon season had a high number of live bacteria in the river in comparison to the non-monsoon season, except for the upstream in the monsoon season, which was lower by 0.5-fold than in the non-monsoon season. The result also showed that the live bacteria population in the monsoon season for the downstream was approximately 1.5-fold more than that in the non-monsoon season. However, there were no statistically significant differences in the concentration of live bacterial cells between the monsoon season and the non-monsoon season for upstream and downstream areas. In general, the FCM finding revealed that the concentration of the live bacteria population was significantly higher in the midstream during the monsoon season.
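The seasonal fold changes can be checked against the approximate mean concentrations quoted in the text. The short Python sketch below is only an arithmetic illustration; the function name `fold_change` is hypothetical, and the inputs are the rounded means reported above rather than the raw measurements, so the ratios may differ slightly from those derived from the original data.

```python
# Approximate mean live-cell concentrations (cells/µL) quoted in the text.
MONSOON = {"upstream": 71, "midstream": 599, "downstream": 444}
NON_MONSOON = {"upstream": 122, "midstream": 238, "downstream": 289}


def fold_change(monsoon, non_monsoon):
    """Ratio of monsoon to non-monsoon concentration for each station."""
    return {site: monsoon[site] / non_monsoon[site] for site in monsoon}


if __name__ == "__main__":
    for site, ratio in fold_change(MONSOON, NON_MONSOON).items():
        print(f"{site}: {ratio:.2f}-fold")
```

With these rounded inputs, the ratios come out at roughly 2.52 for the midstream, 1.54 for the downstream, and 0.58 for the upstream, in line with the approximate 2.5-fold, 1.5-fold, and 0.5-fold figures given in the text.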


Flood Risk Patterns
The flood risk patterns for the FRI model in this study were developed using chemometrics. Chemometrics is a powerful environmental analytical tool based on multivariate statistical data modelling to analyse and interpret a large and complex environmental database [35,37]. This technique has been widely applied to study various environmental elements because a large and complex database can reveal and produce a large amount of important information [35,38,39]. Chemometrics also aims to assess relevant patterns and variations without having to be concerned about misinterpreting environmental data.

The Most Significant Variable Contributing to Flood Occurrences
The water level variable was identified as having the strongest factor loading in the FA findings, followed by the stream flow variable with the second strongest factor loading. This indicated that every increment of stream flow in the river basin leads to a significant rise in water level. The discharge and velocity of the river increase as more water is added through rainfall, snowmelt, tributary streams, or groundwater seeping into the river, and flooding results once the capacity of the river is exceeded [40,41]. In addition, the increased rate of water flow influences the rate of erosion and the suspended sediment yield along the river [42,43].
These findings of the FA were similar to those from a previous study [44], as the positive factor loading of the water level changes significantly with the increasing rate of stream flow in the Klang River Basin. Changes in stream flow, which depend on the amount of rainfall and load of suspended sediment flowing into the river, should have an effect on the water level in the river basin [44]. The water level and stream flow are influenced by the impact of unsustainable development, which causes the river to become shallower due to the massive erosion of the river bank [44]. As a result, the shallow river was unable to accommodate a higher-than-normal volume of water, causing the river to overflow and causing flooding in the area, thereby inevitably affecting resident settlements. The results also reflected the contribution of point and non-point sources to the rate of suspended sediment, which results in an increase in the water level in most Malaysian river basins [14,41,45].
However, the results also revealed that every increase in water level and stream flow in the study area had only a small impact on suspended sediment and rainfall variables as these two variables did not have strong factor loadings. The reduction in suspended sediment and rainfall in the river basin did not have a significant effect on changes in water level and stream flow in the study area. Theoretically, stream flow is monitored based on the volume of the stream, with changes in water level serving as an indicator [46]. The stream flow rate is determined by two major groups of factors, which are meteorological factors and geomorphological factors, such as land use, soil type, and drainage, which affect runoff [40]. Human development along the river is one of the contributing factors that causes the high rate of surface runoff, which affects the stream flow and water level. The state of uncontrolled development triggers excessive impervious surface runoff and boosts water levels, resulting in an elevated flood risk in the study area. This is supported by a review study on urban development and its impacts on hydrological and water quality dynamics [47], as well as a study on the effects of urbanisation on runoff changes [48].
Furthermore, the results showed that rainfall is not the only factor that changes water levels and triggers the risk of flooding in the study area. Many previous hydrological studies have taken rainfall into account when referring to flood issues [49][50][51][52]. Nevertheless, rainfall conditions are random and non-localised, and the locations of rainfall monitoring stations are dispersed, so that not every rainfall event in the study area is measured. The uneven distribution of rainfall is influenced by large-scale atmospheric circulation and by anomalies of weather and climate variability [25]. This finding is consistent with other studies [24,53], which found that the monsoon season, particularly rainfall variability, is not a statistically consistent factor for flood occurrence in the river basin because rainfall distributions are generally scattered. Based on the FA findings, which reduced the complexity of the database, the water level was selected as the most consistent and appropriate variable to be used in the flood patterns for the FRI model.

Flood Patterns for the FRI Model
Water level values that exceed the UCL indicate a high risk of flooding, and the risk was interpreted using the FRI model, which was shown to have high predictive performance accuracy. Based on the findings of the ANN, which used machine learning algorithms on hydrological data from the DID database, the FRI model's predictive performance accuracy was greater than 95%. The FRI model for this study was therefore sufficiently accurate to be applied in future flood risk research, as well as in ANN-based predictions for the FRI's new UCL over the next 30 years [24]. This would help with the development of future flood risk models and provide a clearer understanding of their accuracy from the present to future predictions.
The application of these methods has been demonstrated in previous studies [14,24]. According to a study in the Muda River Basin, the application of SPC appeared to be more convenient and cost-effective, as well as producing more accurate results in improving the early warning system for flood alerts [24]. Hence, the findings of the FRI derived from the SPC in this study are capable of bringing about changes in the regulation of flood risk control in Malaysian river basins. Prompt actions could be taken earlier and more effectively as part of the emergency response plan at the high-risk class level, as these approaches would provide a more detailed picture in establishing precise guidelines for flood risk levels in the river basin in order to prevent the consequences of major flood damage and casualties.
Based on the findings, the flood event recorded in January 2017 caused massive destruction, costing millions of ringgits, resulting in a large number of evacuees, and destroying significant infrastructure during the worst of the disaster. Furthermore, the impacts became more significant because the town was the state capital and served as the state's main administrative and economic hub for industry and tourism [54]. During the monsoon season, businesses and tourism activities were severely affected due to disruptions in communication services and road closures caused by the floods. Historical records indicate that the district was one of Malaysia's state capitals with the highest risk of experiencing flood events [55,56]. Furthermore, most of the midstream and downstream parts of the river basin are low-lying swampy areas that are prone to flooding.
Floods in Malaysia frequently occur during the Northeast Monsoon season, which happens between November and February. This monsoon season generally results in widespread and prolonged rainfall, often lasting for several days, which causes the river to rise above normal levels. As a result of this study, an effective flood alert system, such as the SPC and FRI methods, should be implemented because this system could minimise the cost of flood management during a flood event as it could be planned and implemented in the earlier stages of the flood disaster. Furthermore, by establishing the new control limit, local authorities and other flood-related organisations would be able to continuously monitor flood control, flood management, and other mitigation measures in the flood-prone areas in Malaysia.
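To make the idea of a water level control limit concrete, the following is a minimal Python sketch of a control chart calculation. It assumes conventional three-sigma limits around the mean, which is a common SPC choice but is not confirmed here as the exact formula used in this study; the function names and the sample water levels are hypothetical.

```python
import statistics


def control_limits(levels, k=3.0):
    """Return (LCL, centre line, UCL) for a series of water levels.

    Assumption: limits are the mean plus or minus k sample standard
    deviations, with k = 3 by convention.
    """
    centre = statistics.mean(levels)
    sigma = statistics.stdev(levels)
    return centre - k * sigma, centre, centre + k * sigma


def exceeds_ucl(levels, ucl):
    """Flag each observation that lies above the upper control limit."""
    return [level > ucl for level in levels]


if __name__ == "__main__":
    # Hypothetical water levels (m), for illustration only.
    history = [1.0, 2.0, 3.0, 4.0, 5.0]
    lcl, centre, ucl = control_limits(history, k=1.0)
    print(lcl, centre, ucl, exceeds_ucl(history, ucl))
```

In an alert setting, observations flagged above the UCL would correspond to the high-risk class, which is the level at which the text recommends prompt emergency response.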

Viability of Bacterial Waterborne Pathogens
FCM is a technique utilised for rapid and accurate quantification of both viable but non-culturable (VBNC) and non-viable microorganisms. It was originally applied to eukaryotic cells, and has now been adapted and readily utilised in analysing the viability, metabolic states, and antigenic markers of bacteria in a sample. FCM provides rapid, precise, and quantitative information on airborne and waterborne pathogens and toxins [57]. The technique is immensely beneficial because it can not only distinguish between non-biological and biological particles, but can also identify living and dead organisms. This can be accomplished by combining FCM with live/dead stains that distinguish between live and dead cells [32].

The Occurrence of Waterborne Bacterial Diseases in Pahang, Malaysia
One of the main concerns due to the impact of flooding, particularly in the high-risk flood areas, is the transmission of water-related diseases or waterborne diseases. Cases of waterborne diseases such as cholera, dysentery, bacterial food poisoning, leptospirosis, melioidosis, and typhoid fever had previously been reported in Pahang. According to the database obtained from the MOH, food poisoning caused by bacterial infection was the most prevalent waterborne disease in Pahang State over the six-year period (2012-2017). In 2016, bacterial food poisoning had the highest incidence rate of 47.3 per 100,000 population in Malaysia [58]. The incidence rate of food poisoning fluctuated even though cases continued to occur every year among school students, especially involving school canteens and residential school kitchens [59,60].
Food poisoning is characterised by the sudden onset of vomiting, diarrhoea, or other symptoms caused by bacteria, viruses, parasites, or chemical substances that enter the body via contaminated food or water [17,60]. The most common bacterial pathogens that can cause food poisoning are Salmonella, Campylobacter, Enterohaemorrhagic Escherichia coli (EHEC), and Vibrio cholerae [17]. As water is the major transmission route, using contaminated water for cleaning, food processing, and irrigation purposes exposes humans to the bacteria when they adhere to food surfaces and kitchen utensils [59]. The second-most prevalent waterborne disease in this study was leptospirosis, which is the most common zoonotic disease worldwide and is caused by the pathogenic bacteria Leptospira interrogans. It infects humans through direct contact with the urine of animal reservoirs or contact with contaminated soil or water [61]. Many studies have suggested that the monsoon season and flooding are associated with an increased risk of leptospirosis in endemic developing countries [61][62][63].
Another common waterborne infectious disease recorded is dysentery, an illness also known as bloody diarrhoea that is frequently caused by bacteria of the Shigella species [17]. Acute diarrhoea is a major public health concern, and it is strongly associated with food hygiene and safety, as well as practices among food handlers and the general public [64][65][66]. The MOH reported that the incidence rate of dysentery in the country is low, at 0.50 per 100,000 population, and that it occurs sporadically rather than causing an outbreak [67].
Furthermore, in 2014 and 2015, there were significant outbreaks of melioidosis cases in Pahang. Burkholderia pseudomallei, a Gram-negative bacillus found in soil and water in tropical and subtropical regions, causes melioidosis through contact with contaminated soil or water and through penetration of skin lesions or wounds [62,68]. The infected person presents with fever, pneumonia, septicaemia, or localised skin infections. Melioidosis became a notifiable disease in Malaysia on 9 January 2015 due to the outbreaks and fatality cases in a rescue operation in Lubuk Yu, Pahang in 2010, and flooding in Peninsular Malaysia from late 2014 to early 2015 [67]. Typhoid fever, an acute systemic enteric disease caused by Salmonella typhi and transmitted via the faecal-oral route, is a global public health burden, primarily in developing countries [69]. The incidence of typhoid cases has decreased over the last 10 years, and the number of cases has been low and sporadic over the years [67]. This disease is frequently related to food safety and hygiene, water supply, and wastewater management [69].

Bacterial Detection, Live/Dead Discrimination, and Its Association with Waterborne Diseases during Monsoon Season
The most basic approach for determining the viability of bacteria is laboratory culture and plate-based testing, which is usually equivalent to testing the ability to form colonies and proliferate on a solid growth medium with liquid nutrient broths. These traditional techniques are time-consuming, require strict standardised counting procedures, and are ineffective for slow-growing or VBNC organisms. However, the FCM counting technique effectively overcomes these disadvantages as it has been proven to provide a rapid, automated, and reliable result that accurately estimates the live, dead and total bacteria in many routine microbiology monitoring studies [70][71][72][73][74][75][76].
The number of viable bacterial cells in the Kuantan River was determined using the FCM technique when combined with viability stains that easily allowed distinction between the intact-membrane and damaged-membrane bacterial cells. A viable cell possesses three characteristics, namely, an intact membrane, the ability to reproduce, and the ability to be metabolically active [77]. The number of viable cells indicates dynamic changes in bacterial growth that can result in potential human pathogenicity. The ability to differentiate between live (viable) and dead bacteria is crucial in the microbiological field because it is vital in many applications, such as disinfection, antimicrobial therapy assessment, assessment of the viability of starter cultures, and cell proliferation monitoring [78]. Moreover, enumeration and differentiation of bacterial cell viability in environmental samples are very important in tracking and preventing the spread of infectious agents [79].
The FCM findings provided a comprehensive insight into bacterial populations in the river and allowed conclusions to be drawn about the distribution of bacterial waterborne pathogens during the monsoon season in the study area. The live bacterial population in the river was significantly more abundant during the monsoon season compared to the non-monsoon season. Water levels were elevated during the monsoon season, and the rising water levels most likely led to flood events. This implies that, as water levels rose or flooding occurred, the live bacterial population increased, increasing the probability of exposure to the health risk of waterborne diseases. These results were consistent with the findings of [75], which established that microbial dynamics and the improvement in the overall microbial water composition for drinking water distribution were dependent on water levels. Even though the effects of flooding on microbial communities occur over time, increased surface water levels are one of the factors that increase the likelihood of shifting microbial communities affecting clean and high-quality drinking water due to surface water contamination [75].
Increased frequency of extreme weather events, such as flooding, not only causes infrastructure damage and significant loss of human life, but even more devastating consequences emerge in the form of increased transmission, incidence, and dispersal of waterborne infectious diseases [15,80]. A study of the River Thames, England, highlighted increased detection of river-emerging microbes following a flood event, and the slow recovery of flooding impacts on bank filtration systems with plausible contaminant loads was observed when extreme flooding occurred without flexible and resilient operating regimes [81]. As contaminated floodwater causes contamination of surface water and groundwater, the water supply serves as an environmental reservoir for the transmission of infectious diseases [69].
Furthermore, during the monsoon season, the viability and the concentration of live bacterial cells were significantly higher in the midstream compared to the downstream and upstream areas. In this study, the midstream was located in the Kuantan city centre, surrounded by residential and commercial areas that are considered to be flood-prone. Inadequate solid waste management and sanitation in residential and city areas might lead to the emergence of high levels of waterborne pathogens [80,82]. These studies emphasised that the majority of flood victims in these areas would be exposed to microbial infections as a result of floodwater contamination, and thereby bear the risk of waterborne diseases. Meanwhile, residents of the fishing villages where the downstream samples were collected would also be affected, but the risk would be reduced because the stream was close to the South China Sea. The upstream population of live bacterial cells was the lowest as it was located at the Panching Waterfall, a nature preserve park in Malaysia.
Several factors could influence bacterial growth and proliferation, leading to flood-related communicable diseases such as cholera, typhoid, leptospirosis, and E. coli infection. Following a massive flooding event, the transmission and contraction of leptospirosis increased among those living in urbanised and densely populated areas near water bodies and garbage accumulation areas [63]. Moreover, environmental factors such as an untreated water supply and inadequate wastewater management were associated with an outbreak of infectious diseases following a major flood in Northeastern Malaysia in December 2014 [69]. These factors were strongly related to flooding and attributed to poor drinking water, sanitation, and hygiene conditions, which provide a favourable environment for the transmission and outbreak of infectious diseases [15,19,80,83]. Moreover, a recent study found that, due to the impacts of climate change, such as heavy rainfall and flooding, the incidence of waterborne diseases is expected to rise significantly, along with an increase in plastic debris, particularly complex biofilms [84].
In addition, there is a risk of cutaneous infection following exposure to contaminated floodwater. Trauma is common in flood victims who are injured by fast-moving water, while attempting to escape floodwaters, or while cleaning up after flooding, and this trauma might introduce pathogens into wounds [85]. For instance, melioidosis is transmitted into wounds and skin abrasions through direct contact with contaminated soil or water. The disease is associated with a high rate of death due to the early onset of fulminant sepsis [86]. The incidence and mortality rates of melioidosis are relatively high in Pahang State [68]. Fatal cases were associated with the melioidosis outbreak in Lubuk Yu, Pahang, in June 2010, and it was reported that heavy rains and flooding led to soil erosion, pathogen exposure, and water contamination, particularly of stagnant water along the river bank, thereby increasing the risk of infection [62]. Furthermore, the majority of melioidosis patients were admitted during the monsoon season, with the highest monthly admission frequencies observed in November, December, January, and February [87].

Implications for Flood Risk and Flood-Related Disease Management
Flood risk management commonly comprises four main phases, namely, prevention or mitigation, preparedness, response, and recovery. The first phase, flood prevention or mitigation, comprises actions taken to avoid and reduce the impact of flooding, as well as to protect the flood-prone areas before the disaster occurs. The actions include structural flood control measures such as the construction of dams or river dikes and levees, and implementation of non-structural measures such as flood forecasting and warning, flood hazard and risk management, public participation, and institutional arrangements [88]. The phase of flood preparedness comprises preparations made to accomplish readiness upon flood arrival. These preparations include developing a flood crisis plan, utilising emergency flood warning systems, providing awareness of flood risks and their impacts, and educating or training the public to be ready and take immediate action for a flood disaster [89].
The flood response phase includes the emergency actions taken during a flood disaster that aim to provide assistance, protect lives, minimise economic losses, and alleviate suffering. This phase involves actions such as the evacuation of flood victims, rescue and relief efforts, and the provision of temporary shelters and basic necessities including safe food supplies, clean water supplies, and medical support services [89,90]. The last phase of flood management is flood recovery, which involves actions taken after the flood disaster. This phase refers to the process of reviewing the flood impacts and restoring the normal condition as quickly as possible [90]. Recovery actions generally include search and relief practice, rehabilitation and moral support, financial aid, discharge support, reconstruction of transportation facilities, and providing rapid communication to the impacted areas [89,90].
The first two phases, prevention or mitigation and preparedness, are the most crucial, as their effectiveness is an indicator of how well the subsequent phases, and flood risk management as a whole, will be implemented. Good flood forecasting and warning systems are obligatory for flood risk mitigation, and their success depends on the effectiveness of preparedness and on the correctness of the response [11]. Information regarding flood hazard risk, flood forecasting, flood monitoring, and flood zoning should be designed to be proactive and interactive in dealing with flood disasters [89,91]. The utilisation of a variety of effective communication channels and technological advances to promote flood risk communication would improve community agility and resilience to the challenges of flood disasters [89,90,92].
The application of chemometrics techniques and the FRI model in this study would enhance and strengthen the flood forecasting and warning system. The FRI model, as a part of flood risk assessment tools, could be potentially further developed and integrated into the flood mapping tools, such as flood hazard and flood risk maps. The flood hazard map is an important tool to understand the hazard situation in an area by showing the extent and expected water level or depth of a flooded area [93]. Meanwhile, the flood risk map would demonstrate a combination of the probability of a flood event and the possible adverse effects on human health, the environment, and the socio-economic factors associated with the flood [93]. Contours reflecting the severity of the flood risk could be constructed based on the FRI analysis, flood hazard information, and the database showing the viability of waterborne bacterial diseases in flood-prone areas.
Flooding causes sudden changes in the environment as well as in human and animal behaviour. Complex microbial communities could be highly responsive to environmental changes [94,95]. Flooding provides the ideal environment for bacterial proliferation due to temporary water accumulation, contamination of drinking water, which is a plausible route of transmission, and possible disruption of routine health facilities, leading to poor or delayed health services [61]. Water supply issues related to flooding, such as contamination of water resources, scarcity of safe drinking water, outbreaks of waterborne disease, and disruption of water treatment facilities, suggest that water supply management during flooding should be carried out efficiently and systematically to ensure adequate and safe water supply for flood victims [96].
In this study, the flood patterns based on the water level were found to be associated with the bacterial population in the river of the study area. As the frequency of floods in Malaysia increases, the incidence of waterborne infectious diseases is also expected to rise. Hence, the risk of waterborne infection is present whenever a flood event occurs. To mitigate and minimise the waterborne disease risk, flood mitigation measures should be taken in advance, as flooding plays a vital role in the outbreak and transmission of waterborne diseases [19]. Furthermore, this association serves as an early indication for future preparedness and allocation of public health interventions in flood-affected areas to improve infectious disease surveillance and reduce the incidence and outbreaks of waterborne disease [63].

Limitations of the Study
This study had a few limitations. The flood patterns and FRI model were developed entirely using the DID's four main hydrological data sets. In addition, natural disasters, in this case, flooding, differ because of variations that are influenced by topography, geomorphology, structural engineering, and climate change due to global warming. Furthermore, land use was not incorporated into this study to determine its relationship with changes in suspended sediment, as it was not parallel to the findings of this study. Finally, numerous water pathogens have the potential to cause waterborne diseases. This study, however, was limited to flood-related waterborne infectious diseases caused by bacteria, and FCM was applied only to bacterial detection and live/dead discrimination.

Future Research
An effective and integrated flood management system with active multi-sectoral collaboration among government authorities and agencies is critical to preventing and controlling waterborne diseases during the monsoon season. Further investigation of the flood risk model might improve its practicality and effectiveness in the future. Moreover, flood warning and forecasting systems should be integrated with land use and engineering methods so that flood control can be structured more systematically and early action can be taken promptly. In addition, future studies should explore a diverse range of microorganisms found in floodwater, such as viruses, bacteria, and protozoans, as exposure to pathogenic microorganisms is harmful to human health.

Conclusions
The water level variable was the most significant variable for the formation of the flood risk model in the study area, as FA showed it had the strongest factor loading among the hydrological variables. The SPC analysis emphasised the visualisation of flood patterns and the maximum limit of flood control in the river basin. Water level values beyond the UCL in the FRI implied a high risk of flooding, and the risk evaluated using the ANN was statistically supported, with a very strong predictive accuracy of over 99%. The effective control limit, which was sensitive to changes in water level, and the reliable FRI model could be applied to future flood risk analysis, thus strengthening existing flood warning systems. Meanwhile, FCM was found to be a powerful and useful technique for determining the viability of bacterial populations and the distribution of waterborne pathogens during the monsoon season. The live bacterial population was significantly more abundant in the river during the monsoon season, as rising water levels or flooding increased the bacterial population, increasing the likelihood of exposure to waterborne diseases. Flood victims who lived in the midstream, which is surrounded by urbanised and densely populated areas, were significantly vulnerable to infection due to the favourable environmental conditions for the transmission and outbreak of waterborne diseases.
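The control-limit rule summarised above can be illustrated with a minimal sketch. It assumes a Shewhart-style upper control limit, UCL = mean + 3 standard deviations of a baseline water level record; this form, the function names, and the sample readings are illustrative assumptions rather than the study's calibrated FRI model or its data. Readings above the UCL are labelled high-risk and the remainder moderate-risk, mirroring the two classes reported for the FRI.

```python
from statistics import mean, pstdev


def upper_control_limit(levels, k=3.0):
    """Assumed Shewhart-style UCL: mean + k * population standard deviation."""
    return mean(levels) + k * pstdev(levels)


def classify(levels, ucl):
    """Label each reading 'high' if it exceeds the UCL, otherwise 'moderate'."""
    return ["high" if level > ucl else "moderate" for level in levels]


# Hypothetical water levels in metres (not data from the study).
baseline = [2.0, 2.2, 1.8, 2.1, 1.9, 2.0]
ucl = upper_control_limit(baseline)

# New readings are compared against the limit derived from the baseline.
labels = classify([2.1, 2.6, 1.9], ucl)
print(round(ucl, 3), labels)
```

In practice, the multiplier k and the baseline period would need to be chosen from the long-term hydrological record, since both determine how sensitive the limit is to changes in water level.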