Distributed Data Mining Methodology for Clustering and Classification Model

  • Conference paper
Artificial Intelligence and Soft Computing (ICAISC 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6113)


Abstract

Distributed computing and data mining are now almost ubiquitous. The authors propose a methodology for distributed data mining that combines local analytical models (built in parallel on the nodes of a distributed computer system) into a global one, without the need to construct a distributed version of the data mining algorithm. Different combining strategies for clustering and classification are proposed, along with methods for verifying them. The proposed solutions were tested on data sets from the UCI Machine Learning Repository.
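The abstract does not spell out the combining strategies, so the following is only a minimal sketch of the general idea for the classification case, under stated assumptions: each node trains a simple nearest-centroid classifier on its own partition, and the global model is a plain majority vote over the local models. The nearest-centroid learner, the majority vote, the function names (`fit_local_model`, `predict_local`, `predict_global`), and the toy data are all illustrative choices, not the authors' method.

```python
from collections import Counter

def fit_local_model(partition):
    """Train a nearest-centroid classifier on one node's data only.

    `partition` is a list of (features, label) pairs; the returned model
    maps each label to the mean feature vector seen on this node.
    """
    sums, counts = {}, Counter()
    for features, label in partition:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_local(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def predict_global(models, features):
    """Combine local models by majority vote (an assumed combining strategy)."""
    votes = Counter(predict_local(model, features) for model in models)
    return votes.most_common(1)[0][0]

# Toy data: three nodes, each holding its own partition (hypothetical values).
partitions = [
    [([1.0, 1.0], "a"), ([5.0, 5.0], "b")],
    [([0.0, 2.0], "a"), ([6.0, 4.0], "b")],
    [([2.0, 0.0], "a"), ([4.0, 6.0], "b")],
]

# Local models are built independently, so this loop could run in parallel.
models = [fit_local_model(partition) for partition in partitions]
```

Calling `predict_global(models, [0.5, 0.5])` then classifies a new record using only the local models, without moving the raw partitions to a central site and without a distributed version of the learning algorithm.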




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gorawski, M., Pluciennik-Psota, E. (2010). Distributed Data Mining Methodology for Clustering and Classification Model. In: Rutkowski, L., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2010. Lecture Notes in Computer Science, vol 6113. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13208-7_41


  • DOI: https://doi.org/10.1007/978-3-642-13208-7_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13207-0

  • Online ISBN: 978-3-642-13208-7

  • eBook Packages: Computer Science, Computer Science (R0)
