DOI: 10.1145/3219788.3219808
research-article

iCFDMiner: An Incremental Algorithm of Mining Constant CFDs from Dynamic Databases

Published: 04 May 2018

ABSTRACT

Conditional functional dependencies (CFDs) have been shown to be more effective than traditional functional dependencies (FDs) for checking data consistency, and quite a few algorithms exist for mining CFDs from a static database. In practice, however, records in a database are frequently added, deleted, or modified, so incremental algorithms are preferable for a dynamically updated database. To our knowledge, studies of incremental algorithms for mining CFDs are rare. In this paper, an incremental algorithm, iCFDMiner, is proposed based on the batch algorithm CFDMiner, which is very popular for discovering constant CFDs in static databases. It is proved that iCFDMiner scales well with the size of the database and with all update operations (adding, deleting, and modifying). Experiments show that iCFDMiner outperforms CFDMiner in terms of running time and space consumption in most cases.
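To make the terminology concrete, the following Python sketch shows what a constant CFD is and why incremental maintenance can avoid rescanning the database. It is not the iCFDMiner algorithm, which this page does not describe. It only tracks the support and violation counts of one already-known constant CFD as rows are added, deleted, or modified. The class name `ConstantCFDMonitor`, its methods, and the example attributes (`country`, `area_code`, `city`) are hypothetical and chosen for illustration.

```python
class ConstantCFDMonitor:
    """Tracks one constant CFD (LHS pattern -> RHS constant) under updates.

    Illustrative only: a constant CFD such as
    (country = "UK", area_code = "131") -> (city = "Edinburgh")
    says every row matching the left-hand pattern must carry the
    right-hand constant.
    """

    def __init__(self, lhs, rhs):
        self.lhs = dict(lhs)                    # attribute -> required constant
        self.rhs_attr, self.rhs_value = rhs     # (attribute, required constant)
        self.support = 0                        # rows matching the LHS pattern
        self.violations = 0                     # matching rows with a wrong RHS

    def _matches(self, row):
        return all(row.get(attr) == val for attr, val in self.lhs.items())

    def _apply(self, row, delta):
        # Only rows matching the LHS pattern affect the counters,
        # so each update costs O(|LHS|) instead of a full rescan.
        if self._matches(row):
            self.support += delta
            if row.get(self.rhs_attr) != self.rhs_value:
                self.violations += delta

    def insert(self, row):
        self._apply(row, +1)

    def delete(self, row):
        self._apply(row, -1)

    def modify(self, old_row, new_row):
        # A modification is treated as a delete followed by an insert.
        self.delete(old_row)
        self.insert(new_row)

    def holds(self, min_support=1):
        return self.support >= min_support and self.violations == 0


monitor = ConstantCFDMonitor(
    {"country": "UK", "area_code": "131"}, ("city", "Edinburgh")
)
good = {"country": "UK", "area_code": "131", "city": "Edinburgh"}
bad = {"country": "UK", "area_code": "131", "city": "Glasgow"}
monitor.insert(good)
monitor.insert(bad)                                # introduces a violation
monitor.modify(bad, dict(bad, city="Edinburgh"))   # repairs it
print(monitor.support, monitor.violations, monitor.holds())
```

A batch approach would recompute these counts from the whole table after every change; the sketch updates them from the changed rows alone, which is the general motivation the abstract gives for incremental mining.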


Published in

ICCDE '18: Proceedings of the 2018 International Conference on Computing and Data Engineering
May 2018
116 pages
ISBN: 9781450363938
DOI: 10.1145/3219788

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited