Leakage localization using pressure sensors and spatial clustering in water distribution systems

Leakages in water distribution systems (WDSs) are a worldwide problem and can impose an intolerable burden on satisfying consumers' water demands. There is an urgent need for technologies that can detect and localize leakages in a timely and efficient manner. The monitoring data of a WDS are typical time series, and the data provided by devices distributed at different locations of the WDS are spatiotemporally correlated. This paper proposes a novel model-based method for WDS leakage localization. The method is characterized by (1) developing a dominant sensor sequence for each candidate leakage node, based on spatial correlation analysis, to improve the localization accuracy; (2) utilizing measurements from multiple time steps, which are temporally correlated; (3) ranking leakage regions and nodes by their likelihood of containing the true leakage. A realistic WDS is used to evaluate the performance of the method. Results show that the method can accurately and efficiently localize the leakage.


INTRODUCTION
The water distribution system (WDS) is one of the most important infrastructures, delivering drinking water to various consumers. The safety and reliability of WDSs are crucial for cities. Leakages in a WDS can damage the infrastructure, leading to an intolerable burden in a world struggling to satisfy the water demands of a growing population. In some cities, water losses caused by pipe leakages account for 30% of the total amount of drinking water in the WDSs (Puust et al. 2010). The water losses, as well as the cost of repairing the failed pipes, can result in significant economic costs. Locating and repairing leakages in a timely manner is therefore urgent for the water utility for economic, environmental, and reputational reasons.
Generally, leakage localization is realized by using advanced devices to monitor system behaviors. Acoustic equipment localizes the leakages by monitoring abnormal behaviors at the potential leakage locations in the WDS (Zhou et al. 2019). However, the main drawback of this method is that it is time-consuming, because inspecting all potential leakage pipes is a heavy burden. This shortcoming prevents acoustic equipment from being widely used in real problems. Alternatively, methods that utilize measured data, typically nodal pressure and pipe flow data provided by the sensor networks distributed in the WDS, have been developed for leakage detection and localization. Compared with flow meters, pressure meters are easier to install and less expensive, so methods that use pressure data to locate the leakage can reduce the investment (Wu & Liu 2017). Therefore, pressure-based methods have become increasingly popular for leakage detection and localization in WDSs. The main mechanism of the pressure-based methods is that the pressure under normal working conditions fluctuates within a certain range, while a leakage leads to a pressure drop and makes the pressure fluctuate outside the normal range. Analysis of the deviation between the real-time pressure data and the normal ranges can efficiently detect and locate the leakages (Wu et al. 2018b; Zhou et al. 2019).
The state of the art in leakage localization in WDSs includes contributions from many different methods (Pérez et al. 2014a; Sanz et al. 2016; Sun et al. 2016; Zhang et al. 2016; Moser et al. 2018; Salguero et al. 2018; Wu et al. 2018a; Manzi et al. 2019; Sun et al. 2019). Typically, these methods can be classified as model-based, transient-based, and data-driven methods. The model-based method uses the hydraulic model to predict the normal range of nodal pressure, and a comparison between the measured and predicted values is used to detect the leakages. For example, Farley et al. (2013) proposed a model-based method in which a sensitivity matrix of different pressure measurements is adopted to quantify the relationship between the leakage rate and pressure fluctuation. Moser et al. (2018) used an explicit representation of the uncertainty distribution of modeling and measurement at each location. The threshold limit for falsifying model instances in error domain modeling is calculated to determine the candidate leak nodes. In addition, the pipe materials would affect the applicability and accuracy/efficiency of the model-based method, since previous studies (Duan et al. sensors to enrich the information from the sensors; (2) a strategy that uses the pressure data from multiple time steps is adopted to locate the leakage region; (3) an approach that gives the priority detection order in the leakage region is developed to improve the efficiency of the leakage localization. The paper is expected to improve the accuracy and time efficiency of leakage localization in WDSs.
The rest of the paper is organized as follows. The Methodology section describes the principle of this methodology in detail. In the Results and Discussion section, a case study is presented using the WDS of J City of China to evaluate the performance of the leakage localization method, and a discussion of the relevant results is given. The Conclusions section draws the main conclusions of the paper and introduces some potential extensions.

METHODOLOGY
The framework of the leakage localization methodology
Figure 1 shows the framework of the model-based leakage localization method. As shown in Figure 1, this method consists of four parts, namely, leakage detection, leakage scenario simulation, indicator calculation, and leakage localization analysis. The leakage detection algorithm (Shao et al. 2019) is used to determine whether a leakage occurs. If the detection alarm is not triggered, the network status is classified as a no-fault state. Conversely, if the detection alarm is triggered, the leakage should be located in the following time steps. Since the leakage detection algorithm has been investigated by Shao et al. (2019), we focus on the leakage localization once the detection alarm is triggered (the remaining three parts).
Different from the existing approaches that locate the leakage immediately once the detection alarm is triggered, the developed approach locates the leakage only after it has persisted for some time steps ($\Delta T$). The reason is that pressure data from multiple time steps exploit the temporal correlation of the measurements and thus improve the localization accuracy. The leakage localization process consists of three parts: leakage scenario simulation, indicator calculation, and leakage localization analysis. In the leakage scenario simulation, the concept of a dominant sensor sequence is developed based on sensitivity analysis. The dominant sensor sequence utilizes the spatial correlation between sensor and leakage, so that only the sensors that are highly correlated with a leakage are used for its localization. Then a leakage scenario simulation is adopted to simulate the leakage at different nodes with different intensities. By calculating the similarity between the real-time measurements and the simulated leakage scenarios, the leakage is located in a certain region. To achieve a comprehensive evaluation, seven different metrics are used to evaluate the similarity between the real-time measurements and the simulated scenarios. Finally, a leakage localization analysis is proposed to determine the inspection priority of the candidate leakage regions and nodes, which helps the operator to find the actual leakage location in the field. This paper makes the following assumptions: (i) only one leakage appears at a time, and (ii) pressure sensors in the WDS are placed at the selected nodes and are working well.

The dominant sensor sequence
In previous studies (Perez et al. 2014b; Kang et al. 2018; Shao et al. 2019), the measurements from all the sensors are used for leakage detection and localization. However, the sensors that are far from the leakage experience only a small pressure drop, which may be much smaller than the pressure fluctuation caused by the measurement uncertainties (noise, outliers, etc.). In this situation, the measurements used for detection and localization analysis are polluted by the measurement uncertainties, leading to poor robustness of the leakage detection and localization. To address this problem, Qi et al. (2018) established the coverage region of a single sensor to detect and locate the leakage. This method reduces the adverse effects of sensors that are insensitive to a leakage. However, only a single sensor is used for localization, resulting in low localization accuracy in their study.
To compensate for the above shortcomings, the concept of the dominant sensor sequence is developed. The dominant sensor sequence is only a part of all the sensors. Each leakage node corresponds to a specific number of sensors, which must be sensitive to the leakage node. The generation of the dominant sensor sequence consists of two steps: sensitivity analysis, and sequence reconstruction.
The sensitivity matrix represents the sensitivity relationship between nodal demands and nodal pressures, which measures the degree of the effect of demand variation at one node on pressure variation at another node. In this paper, the sensor sensitivity matrix is obtained by WDS model hydraulic simulation (Perez et al. 2011; Blesa et al. 2012). A detailed process for the calculation of the sensor sensitivity matrix ($S^t$) at a given time step t can be found in Blesa et al. (2015) and Steffelbauer & Fuchs-Hanusch (2016). As mentioned previously, the developed approach locates the leakage only after it has persisted for $\Delta T$ time steps. Therefore, the sensor weight matrix S is constructed by averaging $S^t$ over the $\Delta T$ time steps, as shown in Equation (2).
$$
S^t = \begin{bmatrix}
\dfrac{\partial H_{S_1}}{\partial Q_{N_1}} & \cdots & \dfrac{\partial H_{S_1}}{\partial Q_{N_{n_n}}} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial H_{S_{n_s}}}{\partial Q_{N_1}} & \cdots & \dfrac{\partial H_{S_{n_s}}}{\partial Q_{N_{n_n}}}
\end{bmatrix} \quad (1)
$$

$$
S = \frac{1}{\Delta T} \sum_{t=1}^{\Delta T} S^t \quad (2)
$$

where $S^t \in \mathbb{R}^{n_s \times n_n}$ is the sensor sensitivity matrix at time step t; $H_{S_i}$ and $Q_{N_k}$ are the nodal pressure at sensor $S_i$ and the water demand at node $N_k$; $\partial H_{S_i} / \partial Q_{N_k}$ indicates the pressure change at sensor $S_i$ due to a leakage at node $N_k$; S is the sensor weight matrix; $n_s$ is the number of sensors and $n_n$ is the number of nodes. The number of elements in the original sensor sequence for each node is equal to the total number of pressure sensors arranged in the WDS. Sequence reconstruction is based on the sensor sensitivity weight matrix S. The k-th column of S quantifies the sensitivity of the sensors to the nodal demand at node $N_k$. The dominant sensors for node $N_k$ are the sensors corresponding to the $N_{sd}$ largest elements in the k-th column of S. In this way, the original sensor sequence is reconstructed into the dominant sensor sequence. The number of dominant sensors $N_{sd}$ ($N_{sd} \le n_s$) is a hyper-parameter that can be determined from engineering experience.
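The two steps described above (averaging the sensitivity matrices, then keeping the most sensitive sensors per node) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy matrices are hypothetical, and the weights are assumed to be non-negative magnitudes so that "largest" means "most sensitive".

```python
from typing import List

Matrix = List[List[float]]  # rows = sensors, columns = nodes


def average_sensitivity(matrices: List[Matrix]) -> Matrix:
    """Element-wise mean of the per-time-step sensitivity matrices S^t."""
    n_steps = len(matrices)
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(m[i][j] for m in matrices) / n_steps for j in range(n_cols)]
        for i in range(n_rows)
    ]


def dominant_sensors(S: Matrix, node: int, n_sd: int) -> List[int]:
    """Indices of the n_sd sensors with the largest weight in column `node`."""
    column = [row[node] for row in S]
    order = sorted(range(len(column)), key=lambda i: column[i], reverse=True)
    return order[:n_sd]


# Toy data (made up): 4 sensors, 3 nodes, 2 time steps.
S_t1 = [[6.0, 0.5, 0.1], [1.0, 4.0, 0.3], [0.2, 0.8, 5.0], [2.0, 1.5, 0.4]]
S_t2 = [[6.4, 0.7, 0.1], [1.2, 4.2, 0.5], [0.2, 1.0, 5.2], [2.2, 1.3, 0.2]]
S = average_sensitivity([S_t1, S_t2])

# One dominant sensor sequence per candidate leakage node.
sequences = {k: dominant_sensors(S, k, n_sd=2) for k in range(3)}
```

Each node ends up with its own short list of sensor indices, ordered from most to least sensitive, which is what the later similarity calculation restricts itself to.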

Leakage scenario simulation
Water demand prediction can be found in many studies (Kang & Lansey 2009; Arandia et al. 2016; Xie et al. 2017). In this paper, a prediction function is adopted to predict the nodal water demand at the current time step t from the historical demand information:

$$
\hat{x}_t = f(x_{t-1,\, t-2,\, \ldots})
$$

where $\hat{x}_t$ is the nodal water demand vector at the current time step t; $f(\cdot)$ is the nodal water demand prediction function; $x_{t-1,\, t-2,\, \ldots}$ is the historical demand information.
Then the predicted nodal water demand at the current time step is used as the input of the WDS hydraulic model to get the predicted pressure vector $\hat{p}_f(t)$.

$$
\hat{p}_f(t) = [\hat{p}_{f1}(t), \ldots, \hat{p}_{f n_s}(t)]^T, \quad \hat{p}_{fi}(t) = g_i(\hat{x}_t) \quad (5)
$$

where $\hat{p}_{fi}(t)$ is the predicted pressure value of the i-th sensor at the current time step; $g_i$ is the output of the WDS hydraulic model corresponding to the i-th sensor; and $n_s$ is the number of pressure sensors in the network. Equation (5) gives the predicted pressure value $\hat{p}_f(t)$ under the normal working condition. Comparing $\hat{p}_f(t)$ with the pressure under the leakage scenarios can help to detect the leakages. Here, a leakage scenario is simulated by adding an extra demand to the predicted normal nodal water demand. For a given node, extra demands with different values are added to the predicted demand at the node. After the hydraulic simulation, the pressures under the leakage scenarios are acquired. This is repeated node by node, so that leakage scenarios at all nodes with different leakage intensities are generated.

$$
\hat{p}_k(t) = [\hat{p}_{k1}(t), \ldots, \hat{p}_{k n_s}(t)]^T, \quad k = 1, \ldots, n
$$

where $\hat{p}_{ki}(t)$ is the estimated pressure value of the i-th sensor for the k-th leakage scenario; n is the total number of simulated leakage scenarios. The pressure residual vector $\hat{r}_k(t)$ for the k-th leakage scenario can be obtained by subtracting $\hat{p}_k(t)$ from the predicted pressure vector $\hat{p}_f(t)$.

$$
\hat{r}_k(t) = \hat{p}_f(t) - \hat{p}_k(t)
$$
At time step t, the $n_s$ pressure sensors upload a set of measured nodal pressure data, which constitute a measured pressure vector (Equation (8)).

$$
p(t) = [p_1(t), \ldots, p_{n_s}(t)]^T \quad (8)
$$

Then the measurement pressure residual vector $r(t)$ can be calculated by Equation (9).

$$
r(t) = \hat{p}_f(t) - p(t) \quad (9)
$$
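The scenario generation and residual calculation described in this section can be sketched as below. The hydraulic solver is replaced by a hypothetical linear surrogate (`toy_model` and the `DROP` coefficients are invented for illustration); in practice a hydraulic model such as the EPANET model used in the case study would take its place. Residuals are taken as the no-leak prediction minus the leak (or measured) pressure, so a pressure drop appears as a positive value.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def simulate_residuals(
    demand: Sequence[float],
    hydraulic_model: Callable[[Sequence[float]], Vector],
    intensities: Sequence[float],
) -> Dict[Tuple[int, float], Vector]:
    """Residual vector (no-leak pressure minus leak pressure) per scenario."""
    p_normal = hydraulic_model(demand)
    residuals: Dict[Tuple[int, float], Vector] = {}
    for node in range(len(demand)):
        for q in intensities:
            leak_demand = list(demand)
            leak_demand[node] += q  # leakage modeled as extra nodal demand
            p_leak = hydraulic_model(leak_demand)
            residuals[(node, q)] = [a - b for a, b in zip(p_normal, p_leak)]
    return residuals


# Hypothetical linear surrogate standing in for a real hydraulic solver:
# each sensor pressure drops linearly with every nodal demand.
DROP = [[0.02, 0.01, 0.00], [0.00, 0.01, 0.03]]  # 2 sensors x 3 nodes


def toy_model(demand: Sequence[float]) -> Vector:
    return [50.0 - sum(c * d for c, d in zip(row, demand)) for row in DROP]


predicted_demand = [10.0, 20.0, 15.0]            # made-up demand forecast
scenario_residuals = simulate_residuals(predicted_demand, toy_model, [20, 25])

measured = [49.2, 48.9]                          # made-up sensor readings
predicted = toy_model(predicted_demand)
measured_residual = [a - b for a, b in zip(predicted, measured)]
```

Every `(node, intensity)` key in `scenario_residuals` is one simulated leakage scenario whose residual vector can later be compared with `measured_residual`.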

Candidate leakage region ranking
Leakage localization is based on the analysis of simulated and measured pressure residuals ($\hat{r}_k$ and $r$). The leakage scenario whose simulated residual $\hat{r}_k$ is most similar to the measured residual $r$ is considered to represent the actual leakage. Therefore, the location and leakage intensity of that scenario are treated as the actual leakage location and intensity. Seven metrics are used to measure the similarity between $\hat{r}_k(t)$ and $r(t)$, namely Manhattan distance, Euclidean distance, Chebyshev distance, cosine similarity, Pearson correlation coefficient, Spearman rank correlation coefficient, and Kendall τ correlation coefficient (Perez et al. 2014b; Ponce et al. 2014). These metrics assess the similarity of the two vectors from different perspectives. For example, Manhattan distance is sensitive to changes in the average value, whereas the Pearson correlation coefficient considers the degree of linear correlation. Therefore, the above seven metrics are combined to obtain more reliable localization results.
Water Supply Vol 22 No 1, 1024
The Pearson correlation coefficient is taken as an example below. As mentioned previously, only the elements corresponding to the dominant sensors in the vectors $\hat{r}_k$ and $r$ are adopted in the leakage localization process. Denoting the two residuals of the dominant sensors as $\hat{r}'_k$ and $r'$, the Pearson correlation coefficient has the following form,

$$
C_k^t = \frac{\mathrm{cov}(\hat{r}'_k, r')}{\sqrt{\mathrm{cov}(\hat{r}'_k, \hat{r}'_k)\,\mathrm{cov}(r', r')}}, \quad k = 1, \ldots, n
$$

where $\mathrm{cov}(\hat{r}'_k, r')$ is the covariance between $\hat{r}'_k$ and $r'$; n is the total number of leakage scenarios; and $C_k^t$ is the Pearson correlation coefficient that is used as an indicator for the k-th leakage scenario. Since the developed approach locates the leakage only after it has persisted for $\Delta T$ time steps, the average correlation coefficient has the following form,

$$
C_k = \frac{1}{\Delta T} \sum_{t=1}^{\Delta T} C_k^t
$$

The other metrics are calculated and averaged in the same way, and the averaged indicators are used to determine the most similar leakage scenario.
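As an illustration of the indicator calculation, the sketch below computes the Pearson coefficient on the dominant-sensor elements only and averages it over the available time steps. The function names and the sample residuals are hypothetical; returning 0.0 for a constant vector is an assumption made here to avoid a division by zero, not something the paper specifies.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def averaged_indicator(
    simulated: List[Sequence[float]],
    measured: List[Sequence[float]],
    dominant: Sequence[int],
) -> float:
    """Mean over time steps of the Pearson coefficient on dominant sensors."""
    values = []
    for r_k, r in zip(simulated, measured):
        values.append(
            pearson([r_k[i] for i in dominant], [r[i] for i in dominant])
        )
    return sum(values) / len(values)


# Two time steps, four sensors; sensor 3 is not dominant and is ignored.
simulated = [[1.0, 2.0, 3.0, 9.0], [2.0, 4.0, 6.0, 0.0]]
measured = [[2.0, 4.0, 6.0, -5.0], [1.0, 2.0, 3.0, 7.0]]
C_k = averaged_indicator(simulated, measured, dominant=[0, 1, 2])
```

In this toy case the non-dominant sensor carries values that would spoil the correlation, but because it is excluded the averaged indicator stays at its maximum.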
Generally speaking, a higher correlation coefficient indicates that this scenario has a higher probability to be considered as the actual leakage. Each scenario is ranked and scored based on the correlation coefficient (C k ). The scenarios ranked in the top 5% are given a score of 2, while the scenarios ranked in 5%-20% are given a score of 1, and the rest of the scenarios are given a score of 0.
As mentioned previously, seven metrics are used as indicators for leakage localization. Therefore, there are 7 scores for each scenario, and the sum of the 7 scores is used as the total score of the scenario (SS). The scenarios whose total scores are higher than a score-threshold $\lambda$ are selected as candidate leakage scenarios. A candidate leakage scenario corresponds to a leakage node and a leakage intensity. Under this strategy, several scenarios with the same location but different intensities may be selected as candidate leakages. In this situation, the number of times (NS) that a scenario with the same location is selected is recorded.
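The ranking, scoring, and thresholding steps can be sketched as follows. Several details are assumptions made for the example rather than statements from the paper: the percentile cut-offs are rounded up, ties are broken by list order, and distance-type metrics are ranked with smaller values first (the `higher_is_better` flag). Only two made-up metrics are used to keep the example short.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Scenario = Tuple[int, float]  # (leakage node, leakage intensity)


def rank_scores(values: Sequence[float], higher_is_better: bool = True) -> List[int]:
    """Score 2 for the top 5% of scenarios, 1 for the next 15%, 0 otherwise."""
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k], reverse=higher_is_better)
    top5 = (5 * n + 99) // 100    # ceil(0.05 * n) in integer arithmetic
    top20 = (20 * n + 99) // 100  # ceil(0.20 * n)
    scores = [0] * n
    for rank, k in enumerate(order):
        if rank < top5:
            scores[k] = 2
        elif rank < top20:
            scores[k] = 1
    return scores


def total_scores(
    metrics: Sequence[Sequence[float]], higher_is_better: Sequence[bool]
) -> List[int]:
    """Total score SS per scenario: sum of the per-metric scores."""
    per_metric = [rank_scores(v, h) for v, h in zip(metrics, higher_is_better)]
    return [sum(column) for column in zip(*per_metric)]


def select_candidates(
    scenarios: Sequence[Scenario], totals: Sequence[int], threshold: int
) -> Tuple[List[Scenario], Counter]:
    """Scenarios with SS above the threshold, plus the count NS per node."""
    chosen = [s for s, ss in zip(scenarios, totals) if ss > threshold]
    ns = Counter(node for node, _intensity in chosen)
    return chosen, ns


# Toy data: 5 nodes x 4 intensities, two metrics with made-up values.
scenarios = [(k // 4, 20.0 + k % 4) for k in range(20)]
correlation = [0.05 * k for k in range(20)]   # higher = more similar
distance = [20.0 - k for k in range(20)]      # lower = more similar
SS = total_scores([correlation, distance], [True, False])
candidates, NS = select_candidates(scenarios, SS, threshold=1)
```

With all seven metrics supplied, `SS` would range from 0 to 14 and `NS` would record how often each node reappears among the candidate scenarios.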
Here, the locations (nodes) of the candidate scenarios constitute a candidate node set {C node }. Based on the spatial distance, these nodes can be distributed into different candidate regions, each with a clustering center, using the hierarchical clustering algorithm and the K-means clustering algorithm (MacQueen 1967; MacKay 2004; Sarrate et al. 2014). Previous studies using clustering for leakage localization have classified the nodes first and then matched the localization to regions (Soldevila et al. 2016; Zhang et al. 2016). The method in this paper first generates the set of candidate nodes and then characterizes the probability that the candidate regions contain the actual leakage by classification and indicator. In the case of noise in pressure measurements, the localization performance is significantly improved, because the clustering can handle the dispersion produced by measurement noise more efficiently. The same occurs when demand uncertainty is considered (Soldevila et al. 2016). As shown in Figure 2, the candidate node set consists of 10 nodes, and these nodes are distributed to three regions by the clustering algorithm. Three candidate regions and three candidate region centers are thus formed by spatial clustering. At this point, the leakage is probably located in one of the candidate regions, and acoustic methods can be used to precisely locate it within them. A key problem is to determine the order in which these candidate regions are inspected. In a WDS network, the closer a sensor is to the leakage location, the larger its pressure drop. The average pressure drop of all sensors can be calculated by Equation (12).
The sensors are sorted according to the elements of the vector r, and the top $n_g$ sensors are chosen to determine the detection order of the candidate regions. The candidate regions are ranked by the average distance from each regional center (formed by spatial clustering) to these sensors, with a smaller average distance giving a higher detection priority.
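A minimal sketch of this region ranking is given below. It assumes that the regional center is the centroid of the region's candidate nodes, that distances are straight-line distances, and that a larger entry in the averaged residual means a larger pressure drop; the coordinates and residuals are invented.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Mean coordinate of a set of points (assumed regional center)."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def rank_regions(
    regions: Dict[str, List[Point]],
    sensor_xy: Sequence[Point],
    mean_drop: Sequence[float],
    n_g: int,
) -> List[str]:
    """Region names ordered from first to last to inspect."""
    # Sensors with the largest average pressure drop are assumed closest
    # to the leakage.
    top = sorted(range(len(mean_drop)), key=lambda i: mean_drop[i], reverse=True)[:n_g]

    def avg_distance(name: str) -> float:
        center = centroid(regions[name])
        return sum(math.dist(center, sensor_xy[i]) for i in top) / len(top)

    return sorted(regions, key=avg_distance)


regions = {
    "A": [(4.0, 4.0), (6.0, 6.0)],
    "B": [(88.0, 90.0), (92.0, 90.0)],
    "C": [(50.0, 48.0), (50.0, 52.0)],
}
sensor_xy = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (100.0, 100.0)]
mean_drop = [0.1, 0.9, 0.8, 0.05]  # averaged residual per sensor
detection_order = rank_regions(regions, sensor_xy, mean_drop, n_g=2)
```

Here sensors 1 and 2 show the largest drops, so the region whose center lies nearest to them is inspected first.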

Candidate nodes ranking
The previous section gives the detection order of the candidate regions; the next step is to determine the detection order of the nodes within the region to be inspected. Assume that the candidate region consists of y candidate nodes. Each node may be related to several candidate leakage scenarios with the same leakage location but different leakage intensities. Each scenario has a score (SS), as mentioned previously. Therefore, the Sum score ($Sn_i$) of node $n_i$ is the sum of these scores. Similarly, the number of candidate leakage scenarios (NS) located at a given candidate node is counted, and the Cumulative ratio ($P_i$) of node $n_i$ is obtained by dividing NS by the total number of candidate scenarios. Because the pattern of pressure fluctuations caused by a leakage at a given node is similar regardless of the leakage intensity, a higher cumulative ratio means that the pressure fluctuation caused by the actual leakage is more likely to match that node.
Based on the magnitude of the characteristic parameter (Sum score, Cumulative ratio), the nodes in the detection region are ranked. The higher the characteristic parameters of the node, the higher the probability that the node is a true leakage node.
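The two node-level parameters can be computed with a few lines of bookkeeping, as sketched below. The candidate list is invented, and the total number of candidate scenarios is assumed here to be counted within the region being inspected.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Candidate = Tuple[int, float, int]  # (node, leakage intensity, total score SS)


def node_indicators(candidates: List[Candidate]) -> Dict[int, Tuple[int, float]]:
    """Map node -> (Sum score, Cumulative ratio)."""
    sum_score: Dict[int, int] = defaultdict(int)
    count: Dict[int, int] = defaultdict(int)
    for node, _intensity, ss in candidates:
        sum_score[node] += ss
        count[node] += 1
    total = len(candidates)
    return {node: (sum_score[node], count[node] / total) for node in sum_score}


def rank_nodes(indicators: Dict[int, Tuple[int, float]], by: int) -> List[int]:
    """Nodes sorted by Sum score (by=0) or Cumulative ratio (by=1), descending."""
    return sorted(indicators, key=lambda node: indicators[node][by], reverse=True)


# Candidate scenarios inside one region (made-up values).
region_candidates = [(7, 20.0, 10), (7, 25.0, 12), (9, 20.0, 14), (7, 30.0, 9)]
indicators = node_indicators(region_candidates)
by_sum_score = rank_nodes(indicators, by=0)
by_ratio = rank_nodes(indicators, by=1)
```

Node 7 appears in three of the four candidate scenarios, so it leads under both parameters even though node 9 holds the single highest-scoring scenario.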

Detection evaluation indicator
The performance of the developed method is evaluated by two indicators, namely, Geographical distance (Topological distance) and Pipe distance. The geographic distance intuitively shows the distance from the leakage candidate nodes or the candidate region center to the actual leakage node, which is directly related to the localization accuracy.
The pipe distance between two nodes refers to the shortest hydraulic path of the WDS connecting the two nodes; that is, the minimum value of the sum of the lengths of the pipes connecting the two nodes. This distance helps to assess the use of acoustic methods, which can precisely locate the leakage if it is within a determined pipe distance.
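Since the pipe distance is a shortest-path length over the pipe network, it can be computed with Dijkstra's algorithm. The sketch below is one way to do this, assuming pipes are undirected edges weighted by length; the node labels and lengths are made up.

```python
import heapq
from typing import Dict, List, Sequence, Tuple

Pipe = Tuple[str, str, float]  # (node a, node b, length in m)


def pipe_distance(pipes: Sequence[Pipe], source: str, target: str) -> float:
    """Shortest total pipe length between two nodes (Dijkstra's algorithm)."""
    graph: Dict[str, List[Tuple[str, float]]] = {}
    for a, b, length in pipes:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))

    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return float("inf")  # target not reachable from source


pipes = [("A", "B", 100.0), ("B", "C", 200.0), ("A", "C", 500.0), ("C", "D", 50.0)]
distance_a_d = pipe_distance(pipes, "A", "D")
```

The direct pipe A-C is longer than the detour through B, so the shortest hydraulic path from A to D runs A-B-C-D.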

WDS description
The method is applied to a realistic WDS hydraulic model with synthetic data. The network is located in a city in Zhejiang Province, China. It consists of 509 pipes, 491 nodes, and three water sources, as shown in Figure 3. A total of 20 pressure sensors are installed in the network. The model of this network is created using the software EPANET 2.0 (Rossman 2000).

Parameter settings
Modeling error and measurement error are two uncertainties considered in this paper. This study focuses on using the spatiotemporal correlation of multiple sensors, which is closely related to the measurement error. The measurements in the normal working condition (without leakage) are synthesized by adding random noise $N(0, \sigma_p)$ to the real pressure. For comparison, two data sets with different precision are generated with the standard deviation (STD) $\sigma_p$ = 0.1 m and 0.3 m respectively.
As mentioned in the methodology, by comparing the similarity between the real-time measurements and the values of the simulated scenarios, the leakage will be located in a certain region. For the leakage scenario simulation, the leakage intensity ranges from 20 m³/h to 350 m³/h, with an increment of 1 m³/h between 20 m³/h and 50 m³/h, and of 5 m³/h between 50 m³/h and 350 m³/h. This gives a total of 91 discrete values. The total number of simulated scenarios n is 44,681 (491 × 91).
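The scenario count can be checked with a short enumeration. The only assumption in this sketch is that the boundary value 50 m³/h is counted once, which is what makes the two ranges add up to 91 values.

```python
# Leakage intensities in m^3/h: 1 m^3/h steps from 20 up to (not including) 50,
# then 5 m^3/h steps from 50 to 350 inclusive.
fine_steps = list(range(20, 50, 1))      # 20, 21, ..., 49  -> 30 values
coarse_steps = list(range(50, 351, 5))   # 50, 55, ..., 350 -> 61 values
intensities = fine_steps + coarse_steps

n_nodes = 491
n_scenarios = n_nodes * len(intensities)  # one scenario per (node, intensity)
```

The same list could be passed directly to a scenario generator that loops over nodes and intensities.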
The parameters are as follows: $N_n$ is the total number of nodes in the WDS ($N_n$ = 491), and $n_s$ is the total number of pressure sensors ($n_s$ = 20). The prediction time domain covers nt (nt = 24, one-hour step) consecutive time steps, and $\Delta T$ is set to 3. The number of selected sensors ($n_g$) used to determine the region detection order is set to 2. The score-threshold ($\lambda$) is set such that the probability of the total score of a scenario (SS) exceeding it is 5%.
To test the performance of the approach in leakage localization, several leakage scenarios are generated to represent the actual leakage events. The leakage intensity of the tested samples is divided into six flow rate intervals: 20∼30 m³/h, 30∼40 m³/h, 40∼50 m³/h, 50∼100 m³/h, 100∼200 m³/h, and 200∼350 m³/h. The leakage flow rate at the node is generated by random sampling in the corresponding interval. The samples contain new leakages that occur at any time in the predicted time domain. The samples traverse all nodes, and each node simulates 30 different leakage intensities in the corresponding interval. The number of samples for each flow rate interval is 14,730 (491 × 30).
Figure 4 shows the values of the sensor sensitivity weight S for the case study. Nodes with a small topological distance or pipe distance to the sensor nodes have a large weight value, quantifying the hydraulic correlation between the sensors and the nodes. The weight peaks occur overwhelmingly when the X-axis and Y-axis coordinates correspond to the same node index, indicating that the most relevant leakage is the one at the sensor node itself. As shown in Table 1, the sensor index 1 corresponds to node index 368, and the weight value is 6.269 when the coordinate is (368, 1). The dominant sensor sequence for Node 368 is constructed based on the first column of Table 1. For example, the dominant sensor sequence is (1, 16, 10, 12, 6) when the sequence length is set to 5. Some sensors are much more sensitive to leakage at a certain node than others. Thus, the dominant sensor sequence is developed for each leakage node to enrich the useful spatial information obtained from the sensors.

The number of dominant sensors
The dominant sensor sequence consists of multiple dominant sensors. To explore the reasonable number of dominant sensors ($N_{sd}$), $N_{sd}$ is taken from the integer interval [3, 20] ($N_{sd}$ = 3, 4, …, 20). Two clustering algorithms, namely the hierarchical clustering algorithm and the K-means clustering algorithm, are used for the spatial clustering of the nodes. The localization performance is evaluated by the geographic distance between the actual leakage node and the top-ranked candidate region center. Figure 5 gives the localization accuracy for different numbers of dominant sensors. The trend of geographic distance over the number of dominant sensors can be divided into three stages. In the first stage ($3 \le N_{sd} \le 6$), the localization accuracy increases rapidly as the number of dominant sensors increases, because more information from sensors that are sensitive to the leakage is used. In the second stage ($6 \le N_{sd} \le 10$), the accuracy does not change much as the number of dominant sensors increases. This is because the sensitivity of the newly adopted sensors gradually decreases whereas the impact of sensor noise increases. In the third stage ($11 \le N_{sd} \le 20$), the accuracy gradually decreases as the number of dominant sensors increases. The main reason is that the newly used sensors are not sensitive enough to the leakage and their pressure fluctuations are mainly caused by noise, which misleads the leakage localization. Therefore, there is a trade-off between the information enrichment brought by more dominant sensors and the deterioration caused by noise. The localization accuracy is reduced when the number of dominant sensors $N_{sd}$ is too small or too large, and a reasonable number of dominant sensors achieves a more accurate leakage localization. In this case study, the number of dominant sensors is set to 6 ($N_{sd}$ = 6).
Compared with the traditional sensor sequence ($N_{sd}$ = 20) used by Shao et al. (2019), the proposed method using the dominant sensor sequence improves the localization accuracy. It is worth noting that the use of dominant sensors greatly improves the localization accuracy, especially when large noise exists in the measurements. As shown in Figure 5, the geographic distance for $\sigma_p$ = 0.1 m is reduced from 2,000 m to 1,000 m when $N_{sd}$ is reduced from 20 to 6. The improvement is larger in absolute terms under greater noise: the geographic distance is reduced from 6,000 m to 3,000 m when $\sigma_p$ = 0.3 m. In both cases, using the dominant sensor sequence improves localization accuracy by about 50% compared with the method that does not use it. The localization accuracy deteriorates as the noise standard deviation increases from 0.1 m to 0.3 m. Similar phenomena can be found in Pérez et al. (2014a). This indicates that accurate monitoring equipment will help improve localization accuracy, allowing more sensors to participate in leakage localization.

Candidate region detection priority
A set of candidate regions can be obtained based on the method described in the section Candidate Leakage Region Ranking. Then the detection priority of these regions should be ranked. As mentioned previously, a total of 14,730 leakage events are simulated to test the performance of the developed method. Figure 6 shows the candidate regions and their detection priority for one of these leakage events. Three candidate regions are obtained and the number '1' indicates that the detection order is first and the leakage is most likely to occur in this region, the area of which is about 390,625 square meters. The actual leakage node is within the candidate region '1st', indicating that the leakage localization accuracy is good, and the detection priority helps to shorten the leakage localization time.
The summary results of region localization of 14,730 leakage events are illustrated in Table 2. Approximately 97% of leakage events have been located in the first three candidate regions. About 74.5% of leakage events are located in the region '1st', indicating that the leakage can be efficiently located since the operator only needs to inspect the '1st' region. Compared with the smaller leakage events, the events with large leakage have a larger probability of being within the region '1st'. When the leakage rates are within 20-50 m³/h, the probability of being within the region '1st' is approximately 70%. It increases from 70% to 79% when the leakage rates increase from 50 m³/h to 350 m³/h. Figure 7 gives the cumulative probability of the geographic distance from the actual leakage position to the candidate nodes in the candidate region '1st' for the 14,730 leakage events. The cumulative probability is approximately 0.35 for 1,000 m, 0.6 for 2,000 m, and 0.7 for 3,000 m, showing that the region localization accuracy is acceptable. However, about 30% of leakage events are at a distance greater than 3,000 m. This is because some leakage events are not located in the region '1st' (Table 2).

Leakage localization
The above section gives the detection priority of the candidate regions. The detection priority of the nodes in the region is then determined by the method presented in the section Candidate nodes ranking. Figure 8 gives the geographical distance of all nodes in the candidate leakage region '1st' (see Figure 6). Two indicators, namely Sum score and Cumulative ratio, are shown in Figure 8(a) and 8(b) respectively. The node with the highest indicator value is shown with a dashed line. As shown in Figure 8(a), the node with the highest Sum score is close to the actual leakage node and all the nodes are within 2,000 m of the actual leakage node. As shown in Figure 8(b), the node with the highest Cumulative ratio is also close to the actual leakage node. Figure 9 gives the cumulative probability of the geographic distance from the actual leakage position to the candidate nodes. Node 1st, Node 2nd, and Node 3rd are the top three priority candidate nodes for every leakage event, respectively. 'Node 1st+2nd+3rd' represents the optimal node among the top three priority candidate nodes, that is, the one closest to the actual leakage node. The cumulative probabilities of the distance within 5,000 m are about 0.716, 0.703, 0.685, and 0.724 for Node 1st, Node 2nd, Node 3rd, and Node 1st+2nd+3rd, respectively. The cumulative probability is approximately 0.568, 0.539, 0.518, and 0.591 when the distance is less than 2,000 m. The large distance values are mainly due to the uncertainties, including the unknown leakage magnitude, the differences between the real and the estimated nodal water demands, and the measurement noise. However, even in the absence of uncertainty, the leakages at some nodes cannot be located, for example when the nodes lie on a branch of the WDS and none of them is equipped with a pressure sensor.
Besides, pipe distances are also adopted to evaluate the leakage detection performance of the method and the corresponding results are shown in Appendix A.

CONCLUSIONS
This paper presents a novel model-based method for leakage localization in WDSs. It is characterized by (1) defining specific dominant sensor sequences for each candidate leakage node; (2) utilizing measurements from multiple time steps, which are temporally correlated; (3) ranking leakage regions and nodes by their likelihood of containing the true leakage. Application to the WDS network highlights the effectiveness and robustness of the method, showing that the method can accurately and efficiently localize the leakage.
The adoption of the dominant sensor sequence can enhance leakage localization performance. There is an optimal number of dominant sensors that can maximize leakage localization accuracy. The optimal sensor number is 6-10 for different conditions in the case study. Using the dominant sensor sequences can improve localization accuracy by about 50% compared with the method that does not use the dominant sensor sequence. Region detection priority helps to shorten the leakage detection time. Approximately 97% of leakage events have been located in the candidate regions, indicating that the candidate leakage region is well defined within the considered leakage intensity. The cumulative probability of the distance between the actual leakage and the node selected is approximately 35% within 1,000 m, 60% within 2,000 m, and 70% within 3,000 m, showing good localization accuracy.
Several research tasks remain open. The proposed approach has been developed assuming only a single leakage occurs. The extension to multiple leakages is possible but it would require numerous leakage scenarios to be simulated. This could be very time-consuming. Considering that the sensor fault usually exists in the sensor networks, it is also of interest to develop a sensor-fault detection method to improve the robustness of the developed method.