IRAFCA: an O(n) information retrieval algorithm based on formal concept analysis

  • Regular Paper
  • Knowledge and Information Systems

Abstract

With the exponential growth in the quantity of information circulating on the Internet, the evolution of information-retrieval systems has become paramount. Current approaches to information systems design remain unable to meet users' needs, whether in retrieval performance (precision and recall) or in response time. In this paper, we propose a new information-retrieval algorithm based on formal concept analysis (FCA) that handles both disjunctive and conjunctive queries. Information retrieval is a direct application of FCA, which makes adapting the theory to this field easy and intuitive. We therefore exploit the theoretical basis provided by FCA to design an efficient and flexible approach to information retrieval.
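To illustrate why retrieval maps so directly onto FCA, the following minimal sketch treats documents as objects and index terms as attributes of a formal context, and answers conjunctive and disjunctive queries with the standard FCA derivation operators. It is not the IRAFCA algorithm itself, whose details are not given in this abstract. The toy context and the names `CONTEXT`, `extent`, `intent`, `conjunctive` and `disjunctive` are assumptions introduced for illustration only, and the naive scan makes no claim about the O(n) bound in the title.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical toy formal context: each document (object) maps to its
# index terms (attributes). Not taken from the paper.
CONTEXT: Dict[str, FrozenSet[str]] = {
    "d1": frozenset({"lattice", "retrieval"}),
    "d2": frozenset({"lattice", "fuzzy"}),
    "d3": frozenset({"retrieval", "fuzzy", "lattice"}),
    "d4": frozenset({"indexing"}),
}


def extent(terms: Iterable[str]) -> Set[str]:
    """Derivation operator on attributes: documents containing every term."""
    wanted = set(terms)
    return {doc for doc, attrs in CONTEXT.items() if wanted <= attrs}


def intent(docs: Iterable[str]) -> Set[str]:
    """Derivation operator on objects: terms shared by all given documents."""
    doc_list = list(docs)
    if not doc_list:
        # By convention, the empty set of objects has every attribute.
        return set().union(*CONTEXT.values())
    return set.intersection(*(set(CONTEXT[d]) for d in doc_list))


def conjunctive(terms: Iterable[str]) -> Set[str]:
    """t1 AND t2 AND ...: the extent of the query's term set."""
    return extent(terms)


def disjunctive(terms: Iterable[str]) -> Set[str]:
    """t1 OR t2 OR ...: the union of the single-term extents."""
    result: Set[str] = set()
    for term in terms:
        result |= extent([term])
    return result


if __name__ == "__main__":
    hits = conjunctive({"lattice", "retrieval"})
    print(sorted(hits))
    # (hits, intent(hits)) is the formal concept generated by the query.
    print(sorted(intent(hits)))
    print(sorted(disjunctive({"fuzzy", "indexing"})))
```

In this sketch a conjunctive query is simply the extent of its term set, and pairing that extent with its intent yields the formal concept that the query generates. A lattice-based system would typically precompute such concepts rather than rescan the context for each query.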

Author information

Corresponding author

Correspondence to Fethi Fkih.

About this article

Cite this article

Fkih, F., Omri, M.N. IRAFCA: an O(n) information retrieval algorithm based on formal concept analysis. Knowl Inf Syst 48, 465–491 (2016). https://doi.org/10.1007/s10115-015-0876-x

  • DOI: https://doi.org/10.1007/s10115-015-0876-x
