Problem of knowledge discovery in noisy databases

Vagin, Vadim; Fomina, Marina

doi:10.1007/s13042-011-0028-x

Original Article
Published: 01 July 2011

Volume 2, pages 135–145, (2011)
International Journal of Machine Learning and Cybernetics

Vadim Vagin¹ & Marina Fomina¹

Abstract

The problem of information generalization for real data that may contain noise is considered. Various models of information noise are presented, and the influence of noise on generalization algorithms is discussed. We use methods for constructing decision trees and forming production rules. The results of the modeling are presented.

Author information

Authors and Affiliations

Moscow Power Engineering Institute (Technical University), Krasnokazarmennaya 14, 111250, Moscow, Russia
Vadim Vagin & Marina Fomina

Corresponding author

Correspondence to Vadim Vagin.

About this article

Cite this article

Vagin, V., Fomina, M. Problem of knowledge discovery in noisy databases. Int. J. Mach. Learn. & Cyber. 2, 135–145 (2011). https://doi.org/10.1007/s13042-011-0028-x

Received: 22 April 2011
Accepted: 10 June 2011
Published: 01 July 2011
Issue Date: September 2011
DOI: https://doi.org/10.1007/s13042-011-0028-x
