ABSTRACT
Breaks in water consumption records can indicate apparent losses, which are generally associated with volumes of water that are consumed but not billed. Detecting these losses in time can have a significant economic impact on the water company’s revenues. However, the real datasets available to test and evaluate current break detection methods are not always large enough, or do not contain abnormal water consumption patterns. This study proposes an approach to generate synthetic water consumption data with structural breaks that follows the statistical properties of real datasets from a hotel and a hospital. The parameters of the best-fit probability distributions (gamma, Weibull, log-normal, log-logistic, and exponential) for the real water consumption data are used to generate the new datasets. For each case study, and for each selected probability distribution, a new dataset with a time horizon of 914 days was generated and two decreasing breaks in the mean were inserted. Three change point detection methods provided by the R packages strucchange and changepoint were evaluated on these new datasets. Based on the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) performance indices, the breakpoint method provided by the strucchange package performed best.
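The generation and scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R implementation. The gamma shape, the per-segment scales, and the break positions (days 300 and 600) are hypothetical values chosen for illustration; only the 914-day horizon and the two decreasing breaks in the mean come from the abstract. The sketch also assumes that RMSE and MAE are computed between true and detected break positions, which the abstract does not specify.

```python
import math
import random


def generate_series(n_days, breaks, shape, scales, seed=0):
    """Draw gamma-distributed daily consumption whose scale changes at each break.

    breaks: sorted day indices where a new segment starts (hypothetical values).
    scales: one gamma scale per segment, so len(scales) == len(breaks) + 1.
    The mean of a gamma segment is shape * scale, so decreasing scales
    produce decreasing breaks in the mean.
    """
    rng = random.Random(seed)
    series = []
    segment = 0
    for day in range(n_days):
        # Move to the next segment once its starting day is reached.
        if segment < len(breaks) and day >= breaks[segment]:
            segment += 1
        series.append(rng.gammavariate(shape, scales[segment]))
    return series


def rmse(true_breaks, detected_breaks):
    """Root mean square error between paired break positions (assumed metric)."""
    squared = [(d - t) ** 2 for t, d in zip(true_breaks, detected_breaks)]
    return math.sqrt(sum(squared) / len(squared))


def mae(true_breaks, detected_breaks):
    """Mean absolute error between paired break positions (assumed metric)."""
    absolute = [abs(d - t) for t, d in zip(true_breaks, detected_breaks)]
    return sum(absolute) / len(absolute)


# Hypothetical setup: 914 days, breaks at days 300 and 600, means 40 -> 28 -> 20.
true_breaks = [300, 600]
series = generate_series(914, true_breaks, shape=4.0, scales=[10.0, 7.0, 5.0], seed=1)

# Hypothetical detector output, used only to show how the indices are scored.
detected_breaks = [303, 596]
print(rmse(true_breaks, detected_breaks), mae(true_breaks, detected_breaks))
```

Any change point detector could supply the detected positions; lower RMSE and MAE mean the detected breaks lie closer to the inserted ones.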