Abstract
Search engines and large-scale IR systems need to cache query results for efficiency and scalability. In this study, we propose to explicitly incorporate query costs into the static caching policy. To this end, a query’s cost is represented by its execution time, which comprises the CPU time needed to decompress the postings and compute the query-document similarities to obtain the final top-N answers. Simulation results using a large Web crawl dataset and a real query log reveal that the proposed strategy improves overall system performance in terms of total query execution time.
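To make the idea concrete, the following is a minimal sketch of how a static cache might be filled when query cost is taken into account. The abstract does not give the scoring formula, so the score used here (training-log frequency multiplied by execution time) is an assumption for illustration, as are the function names, the toy log, and the cost values.

```python
from collections import Counter


def build_static_cache(train_log, capacity, score):
    """Pick the `capacity` highest-scoring distinct queries from a training log.

    `score(query, frequency)` returns the value used for ranking; ties are
    broken alphabetically so the result is deterministic.
    """
    freq = Counter(train_log)
    ranked = sorted(freq, key=lambda q: (-score(q, freq[q]), q))
    return set(ranked[:capacity])


def total_execution_time(test_log, cost, cache):
    """Sum the execution time of every query that misses the static cache."""
    return sum(cost[q] for q in test_log if q not in cache)


# Hypothetical data: query -> execution time (arbitrary units).
cost = {"a": 1.0, "b": 10.0, "c": 2.0}
log = ["a"] * 5 + ["b"] * 3 + ["c"] * 2

# Baseline: rank by frequency only.
freq_cache = build_static_cache(log, 1, lambda q, f: f)
# Cost-aware (assumed score): frequency times execution time.
cost_cache = build_static_cache(log, 1, lambda q, f: f * cost[q])

freq_time = total_execution_time(log, cost, freq_cache)
cost_time = total_execution_time(log, cost, cost_cache)
```

In this toy log the frequency-only policy caches the cheap but popular query, while the assumed cost-aware score caches the less frequent but expensive one, which lowers the total execution time of the misses. Real logs and costs may of course behave differently.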
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Altingovde, I.S., Ozcan, R., Ulusoy, Ö. (2009). A Cost-Aware Strategy for Query Result Caching in Web Search Engines. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds) Advances in Information Retrieval. ECIR 2009. Lecture Notes in Computer Science, vol 5478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00958-7_59
DOI: https://doi.org/10.1007/978-3-642-00958-7_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00957-0
Online ISBN: 978-3-642-00958-7
eBook Packages: Computer Science, Computer Science (R0)