Distributed Data Mining Methodology for Clustering and Classification Model

Gorawski, Marcin; Pluciennik-Psota, Ewa

doi:10.1007/978-3-642-13208-7_41


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6113)

Included in the following conference series:

International Conference on Artificial Intelligence and Soft Computing

Abstract

Distributed computing and data mining are nowadays almost ubiquitous. The authors propose a methodology for distributed data mining that combines local analytical models (built in parallel on the nodes of a distributed computer system) into a global one, without the need to construct a distributed version of the data mining algorithm. Different combining strategies for clustering and classification are proposed, along with methods for verifying them. The proposed solutions were tested on data sets from the UCI Machine Learning Repository.
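The general idea of combining local models into a global one can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify their combining strategies, so the sketch assumes a simple majority vote over local nearest-centroid classifiers. All function names and the toy data are hypothetical.

```python
from collections import Counter

def train_local_model(rows):
    """Build a nearest-centroid classifier from one node's (features, label) rows."""
    sums, counts = {}, Counter()
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    # The local model is just one mean vector (centroid) per class label.
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_local(model, features):
    """Return the label whose centroid is closest to the given features."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def predict_global(models, features):
    """Assumed combining strategy: majority vote over the local models."""
    votes = Counter(predict_local(m, features) for m in models)
    return votes.most_common(1)[0][0]

# Toy data: each inner list stands for the records held by one node.
node_data = [
    [((0.0, 0.1), "a"), ((1.0, 0.9), "b")],
    [((0.2, 0.0), "a"), ((0.9, 1.1), "b")],
    [((0.1, 0.2), "a"), ((1.1, 1.0), "b")],
]

# Local models are built independently, so this step could run in parallel.
local_models = [train_local_model(rows) for rows in node_data]
print(predict_global(local_models, (0.05, 0.05)))
```

Only the small local models are combined here, not the raw data, which matches the abstract's point that no distributed version of the mining algorithm itself is needed.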

Author information

Authors and Affiliations

Institute of Computer Science, Silesian University of Technology, Akademicka 16, 44-100, Gliwice, Poland
Marcin Gorawski & Ewa Pluciennik-Psota

Editor information

Editors and Affiliations

Department of Artificial Intelligence, Academy of Humanities and Economics, Poland
Leszek Rutkowski
Academy of Humanities and Economics in Łódź, ul. Rewolucji 1905 nr 64, Łódź, Poland
Rafał Scherer
Institute of Automatics, AGH University of Science and Technology, Al. Mickiewicza 30, PL-30-059, Kraków, Poland
Ryszard Tadeusiewicz
Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Initiative in Soft Computing (BISC), 94720-1776, Berkeley, CA
Lotfi A. Zadeh
Computational Intelligence Laboratory, Department of Electrical and Computer Engineering, University of Louisville, 40292, Louisville, KY
Jacek M. Zurada

Cite this paper

Gorawski, M., Pluciennik-Psota, E. (2010). Distributed Data Mining Methodology for Clustering and Classification Model. In: Rutkowski, L., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2010. Lecture Notes in Computer Science, vol 6113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13208-7_41

DOI: https://doi.org/10.1007/978-3-642-13208-7_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13207-0
Online ISBN: 978-3-642-13208-7
eBook Packages: Computer Science, Computer Science (R0)
