DOI: 10.1145/3475827.3475836
research-article
Open Access

Synthetic dataset to study breaks in the consumer’s water consumption patterns

Published: 27 October 2021

ABSTRACT

Breaks in water consumption records can represent apparent losses, which are generally associated with volumes of water that are consumed but not billed. Detecting these losses in time can have a significant economic impact on a water company's revenues. However, the real datasets available to test and evaluate current break detection methods are not always large enough, or do not contain abnormal water consumption patterns. This study proposes an approach to generate synthetic water consumption data with structural breaks that follows the statistical properties of real datasets from a hotel and a hospital. The parameters of the best-fit probability distributions (gamma, Weibull, log-normal, log-logistic, and exponential) for the real water consumption data are used to generate the new datasets. For each case study, and for each selected probability distribution, two decreasing breaks in the mean were inserted into a new dataset with a time horizon of 914 days. Three change point detection methods provided by the R packages strucchange and changepoint were evaluated on these new datasets. Based on the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) performance indices, the breakpoint method provided by the strucchange package showed the best performance.
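
The generation and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation (the study uses R): the gamma shape and scale, the break days, the reduction factors, and the helper names `synthetic_series`, `rmse`, and `mae` are all hypothetical, and the sketch assumes that RMSE and MAE are computed between the true and the estimated break positions, which the abstract does not specify.

```python
import math
import random

HORIZON = 914  # days, the time horizon stated in the abstract


def synthetic_series(shape, scale, breaks, factors, horizon=HORIZON, seed=0):
    """Draw daily consumption from a gamma distribution.

    At each day listed in `breaks`, the scale (and hence the mean,
    shape * scale) is multiplied by the matching entry of `factors`.
    All parameter values are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    pending = dict(zip(breaks, factors))
    current_scale = scale
    series = []
    for day in range(horizon):
        if day in pending:
            current_scale *= pending[day]  # a factor < 1 gives a decreasing break
        series.append(rng.gammavariate(shape, current_scale))
    return series


def rmse(true, est):
    """Root mean square error between true and estimated break positions."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(true, est)) / len(true))


def mae(true, est):
    """Mean absolute error between true and estimated break positions."""
    return sum(abs(t - e) for t, e in zip(true, est)) / len(true)


true_breaks = [300, 600]  # hypothetical break days
series = synthetic_series(shape=4.0, scale=25.0,
                          breaks=true_breaks, factors=[0.7, 0.7])

# Hypothetical detector output; a real run would come from a change point method.
detected_breaks = [310, 590]
print(rmse(true_breaks, detected_breaks), mae(true_breaks, detected_breaks))
```

Scaling the gamma scale parameter while keeping the shape fixed lowers the mean without changing the form of the distribution, which is one simple way to insert a decreasing break in the mean; other distributions from the list above could be substituted in the same loop.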


  • Published in

    ACM Other Conferences
    ICoMS '21: Proceedings of the 2021 4th International Conference on Mathematics and Statistics
    June 2021
    102 pages
    ISBN:9781450389907
    DOI:10.1145/3475827

    Copyright © 2021 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited