Computer Science and Information Systems 2022 Volume 19, Issue 1, Pages: 117-139
https://doi.org/10.2298/CSIS200531039S
Scaling industrial applications for the Big Data era
Šutić Davor (Faculty of Technical Sciences, Novi Sad, Serbia), sutic@uns.ac.rs
Varga Ervin (Faculty of Technical Sciences, Novi Sad, Serbia), evarga@uns.ac.rs
Industrial applications increasingly rely on large datasets for regular operations. To meet that need, we bring increasingly available hardware resources to bear on fundamental problems addressed by classical algorithms. We present solutions to three problems: power flow and island detection in power networks, and the more general problem of graph sparsification. At their core lie, respectively, algorithms for solving systems of linear equations, for graph connectivity and matrix multiplication, and for spectral sparsification of graphs, each of which applies on its own to a far wider range of problems. The novelty of our approach lies in developing the first open-source, distributed solutions capable of handling large datasets. These solutions constitute a toolkit that, beyond its initial purpose, can be used to develop unrelated applications and to teach distributed algorithms.
Keywords: distributed computing, big data, smart grid
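The abstract links island detection to graph connectivity and matrix multiplication. As a rough illustration of that link, the following single-machine sketch finds islands by repeatedly squaring a Boolean reachability matrix until it stops changing. It is not the authors' distributed implementation; the function names `bool_matmul` and `find_islands`, the bus numbering, and the branch-list input format are all assumptions made for this example.

```python
def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is True if some k links i to j."""
    n = len(a)
    return [
        [any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def find_islands(n_buses, branches):
    """Group buses 0..n_buses-1 into islands (connected components).

    `branches` is a list of (bus_a, bus_b) pairs for in-service lines
    (a hypothetical input format chosen for this sketch).
    """
    # Start from the adjacency matrix with self-loops, so that squaring
    # keeps already-known connections and adds longer paths.
    reach = [[i == j for j in range(n_buses)] for i in range(n_buses)]
    for a, b in branches:
        reach[a][b] = True
        reach[b][a] = True

    # Repeated squaring: each round doubles the path length covered.
    # Stop once a round adds no new reachable pairs.
    while True:
        squared = bool_matmul(reach, reach)
        if squared == reach:
            break
        reach = squared

    # Buses with identical reachability rows share an island.
    islands = {}
    for bus in range(n_buses):
        islands.setdefault(tuple(reach[bus]), []).append(bus)
    return sorted(islands.values())


if __name__ == "__main__":
    # Five buses, three in-service branches.
    print(find_islands(5, [(0, 1), (1, 2), (3, 4)]))
```

The sketch uses dense matrices and costs on the order of n^3 log n operations, so it only shows the idea; a solution meant for large datasets would need sparse, partitioned matrices.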