ABSTRACT
Energy data from industrial facilities is collected at high frequency. The resulting data volumes pose a scalability challenge for subsequent analyses. Although aggregating the data can address this challenge, the quality of analyses performed on aggregated data is often unknown. In this work, we propose an experimental design to evaluate the effects of aggregation on clustering energy data.
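As an illustration only, the kind of evaluation described above could be sketched as follows: cluster the raw series, cluster mean-aggregated versions of the same series, and compare the two partitions. This is not necessarily the design the paper proposes. The synthetic load profiles, the window sizes, the choice of k-means, and the adjusted Rand index as agreement measure are all assumptions made for this sketch, and every function name is hypothetical.

```python
import random
from math import comb


def aggregate(series, window):
    """Average a series over non-overlapping windows; a trailing partial window is dropped."""
    n_windows = len(series) // window
    return [sum(series[i * window:(i + 1) * window]) / window for i in range(n_windows)]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(
                [sum(col) / len(members) for col in zip(*members)] if members else centers[c]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items (needs >= 2 items)."""
    cells, rows, cols = {}, {}, {}
    for a, b in zip(labels_a, labels_b):
        cells[(a, b)] = cells.get((a, b), 0) + 1
        rows[a] = rows.get(a, 0) + 1
        cols[b] = cols.get(b, 0) + 1
    index = sum(comb(v, 2) for v in cells.values())
    sum_rows = sum(comb(v, 2) for v in rows.values())
    sum_cols = sum(comb(v, 2) for v in cols.values())
    expected = sum_rows * sum_cols / comb(len(labels_a), 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # both partitions are trivial
        return 1.0
    return (index - expected) / (max_index - expected)


def make_profiles(n_per_group=20, length=96, seed=42):
    """Synthetic (assumed) daily load profiles: one group peaks mid-day, the other off-peak."""
    rng = random.Random(seed)
    profiles = []
    for group in range(2):
        for _ in range(n_per_group):
            profiles.append([
                (5.0 if (32 <= t < 72) == (group == 0) else 1.0) + rng.gauss(0, 0.5)
                for t in range(length)
            ])
    return profiles


if __name__ == "__main__":
    raw = make_profiles()
    baseline = kmeans(raw, k=2)  # clustering on the unaggregated data
    for window in (4, 16, 48):
        coarse = [aggregate(p, window) for p in raw]
        ari = adjusted_rand_index(baseline, kmeans(coarse, k=2))
        print(f"window={window:2d}  ARI vs. raw clustering: {ari:.3f}")
```

An adjusted Rand index near 1 would indicate that clustering the aggregated data reproduces the partition found on the raw data, while lower values would indicate that aggregation changed the result. A real study would replace the synthetic profiles with measured data and would likely vary the clustering algorithm and quality measure as well.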
Index Terms
- On the Tradeoff between Energy Data Aggregation and Clustering Quality