Abstract
Web directories are taxonomies that classify Web documents using a directed acyclic graph of categories. This paper introduces an optimistic model for Web directories that improves the performance of restricted searches. The model treats the directed acyclic graph of categories as a tree with some “exceptions”. The validity of this optimistic model has been analysed by developing it and comparing it with a basic model and a hybrid model with partial information. The proposed model improves the response time of the basic model by 50%. Compared with the hybrid model, both systems provide similar response times except for large answers, where the optimistic model outperforms the hybrid model by approximately 61%. Moreover, in a saturated workload environment the optimistic model performed better than the basic and hybrid models for all types of queries.
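The idea of treating the category graph as a tree with "exceptions" can be illustrated with a small sketch. The abstract does not describe the actual data structures, so everything below is an assumption made for illustration: the preorder interval numbering, the names `number_tree`, `allowed_intervals` and `restricted_search`, and the toy category names are hypothetical and are not taken from the paper. The sketch numbers a spanning tree of the categories so that each subtree becomes one interval, and handles the remaining graph links as a short list of exceptions.

```python
def number_tree(children, root):
    """Assign each category a preorder interval (start, end) over the spanning tree.

    A category's interval covers exactly the categories in its subtree.
    """
    interval = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter
        counter += 1
        for child in children.get(node, []):
            visit(child)
        interval[node] = (start, counter - 1)

    visit(root)
    return interval


def allowed_intervals(category, interval, exceptions):
    """Collect the intervals reachable from `category`.

    `exceptions` holds the (parent, child) links of the graph that are
    not part of the spanning tree (an assumed representation).
    """
    result, pending, seen = [], [category], set()
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        lo, hi = interval[node]
        result.append((lo, hi))
        for parent, child in exceptions:
            # Follow an exception link only if its source lies in this subtree.
            if lo <= interval[parent][0] <= hi:
                pending.append(child)
    return result


def restricted_search(category, matches, doc_category, interval, exceptions):
    """Keep only the matching documents classified under `category`."""
    ranges = allowed_intervals(category, interval, exceptions)
    kept = []
    for doc in matches:
        position = interval[doc_category[doc]][0]
        if any(lo <= position <= hi for lo, hi in ranges):
            kept.append(doc)
    return kept


# Toy directory: "Acoustics" sits under "Science" in the tree and is
# also linked from "Music" through an exception.
children = {"Top": ["Arts", "Science"], "Arts": ["Music"],
            "Science": ["Physics", "Acoustics"]}
exceptions = [("Music", "Acoustics")]
interval = number_tree(children, "Top")
doc_category = {"d1": "Music", "d2": "Physics", "d3": "Acoustics", "d4": "Arts"}
matches = ["d1", "d2", "d3", "d4"]

arts_hits = restricted_search("Arts", matches, doc_category, interval, exceptions)
science_hits = restricted_search("Science", matches, doc_category, interval, exceptions)
```

In this sketch a search restricted to a category needs one interval test per document in the common case, and extra work only when exception links exist. That is one plausible reading of "optimistic", under the assumptions stated above.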
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cacheda, F., Baeza-Yates, R. (2004). An Optimistic Model for Searching Web Directories. In: McDonald, S., Tait, J. (eds) Advances in Information Retrieval. ECIR 2004. Lecture Notes in Computer Science, vol 2997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24752-4_27
DOI: https://doi.org/10.1007/978-3-540-24752-4_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21382-6
Online ISBN: 978-3-540-24752-4
eBook Packages: Springer Book Archive