
Semantic-based pruning of redundant and uninteresting frequent geographic patterns

GeoInformatica

Abstract

In geographic association rule mining, many patterns are either redundant or contain well-known geographic domain associations that are explicitly represented in knowledge resources such as geographic database schemas and geo-ontologies. Existing spatial association rule mining algorithms are Apriori-like and therefore generate a large number of redundant patterns. For non-spatial data, the closed frequent pattern mining technique has been introduced to remove redundant patterns. This approach, however, does not guarantee the elimination of both redundant and well-known geographic dependences when mining geographic databases. This paper presents a novel method for pruning both redundant and well-known geographic dependences by pushing semantics into the pattern mining task. Experiments with real geographic databases demonstrate a significant reduction in the total number of patterns and show that the method is efficient.
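The difference between the two kinds of pruning can be illustrated with a minimal sketch. This is not the paper's algorithm: it enumerates frequent itemsets by brute force on a toy dataset, keeps only the closed ones (itemsets with no proper superset of equal support), and then drops any itemset that contains a pair listed as a well-known dependence. The transactions, the predicate names, the `known_dependences` set, the function names, and the order of the two pruning steps are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                result[cset] = support
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        s: sup
        for s, sup in frequent.items()
        if not any(s < other and sup == osup for other, osup in frequent.items())
    }


def prune_known(itemsets, known_dependences):
    """Drop itemsets containing any pair flagged as a well-known dependence."""
    return {
        s: sup
        for s, sup in itemsets.items()
        if not any(pair <= s for pair in known_dependences)
    }


# Hypothetical toy data: each transaction lists the spatial predicates of one region.
transactions = [
    frozenset({"contains_gasStation", "touches_street", "contains_school"}),
    frozenset({"contains_gasStation", "touches_street"}),
    frozenset({"contains_school", "touches_street"}),
    frozenset({"contains_gasStation", "touches_street", "contains_school"}),
]
# Hypothetical background knowledge: gas stations are always next to streets.
known_dependences = {frozenset({"contains_gasStation", "touches_street"})}

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
pruned = prune_known(closed, known_dependences)
```

On this toy input, seven itemsets are frequent, four of them are closed, and two survive the dependence filter. The closed set still contains the gas station and street pair, which is the kind of well-known association that closedness alone does not remove.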



Acknowledgment

We thank CAPES and CNPq, which partially funded this research; Procempa, for providing the real geographic databases; and the anonymous reviewers for their comments and suggestions.

Author information


Corresponding author

Correspondence to Vania Bogorny.


About this article

Cite this article

Bogorny, V., Valiati, J.F. & Alvares, L.O. Semantic-based pruning of redundant and uninteresting frequent geographic patterns. Geoinformatica 14, 201–220 (2010). https://doi.org/10.1007/s10707-009-0082-7


  • DOI: https://doi.org/10.1007/s10707-009-0082-7
