A Cost-Aware Strategy for Query Result Caching in Web Search Engines

  • Conference paper
Advances in Information Retrieval (ECIR 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5478)

Abstract

Search engines and large-scale IR systems need to cache query results for efficiency and scalability. In this study, we propose to explicitly incorporate query costs into the static caching policy. To this end, a query’s cost is represented by its execution time, which comprises the CPU time needed to decompress the postings and to compute the query-document similarities that yield the final top-N answers. Simulation results using a large Web crawl dataset and a real query log show that the proposed strategy improves overall system performance in terms of total query execution time.
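The idea of a cost-aware static cache can be illustrated with a minimal sketch. The abstract does not state how cost enters the selection rule, so the scoring used here (query frequency multiplied by execution time) is an assumption made for illustration, and the function names `build_static_cache` and `total_execution_time` are hypothetical. The sketch fills a fixed-size cache from a training log and then measures the total execution time spent on cache misses.

```python
from collections import Counter


def build_static_cache(query_log, cost_of, capacity, cost_aware=True):
    """Pick `capacity` queries from a training log for a static cache.

    Assumption (not taken from the paper): a cost-aware score is
    frequency * execution time; the baseline score is frequency alone.
    """
    freq = Counter(query_log)

    def score(query):
        return freq[query] * cost_of[query] if cost_aware else freq[query]

    # Highest score first; ties are broken by query text for determinism.
    ranked = sorted(freq, key=lambda q: (-score(q), q))
    return set(ranked[:capacity])


def total_execution_time(query_log, cost_of, cache):
    """Sum the execution time of every query that misses the cache."""
    return sum(cost_of[q] for q in query_log if q not in cache)


# Toy data: "a" is frequent but cheap, "b" is rarer but expensive.
log = ["a", "a", "a", "b", "b", "c"]
costs = {"a": 1.0, "b": 5.0, "c": 2.0}

frequency_cache = build_static_cache(log, costs, capacity=1, cost_aware=False)
cost_aware_cache = build_static_cache(log, costs, capacity=1, cost_aware=True)

print(total_execution_time(log, costs, frequency_cache))
print(total_execution_time(log, costs, cost_aware_cache))
```

On this toy log the frequency-only cache keeps the cheap query "a", while the cost-aware cache keeps the expensive query "b", so fewer seconds are spent on misses even though the hit count is lower. Real logs and cost measurements may behave differently.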

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Altingovde, I.S., Ozcan, R., Ulusoy, Ö. (2009). A Cost-Aware Strategy for Query Result Caching in Web Search Engines. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds) Advances in Information Retrieval. ECIR 2009. Lecture Notes in Computer Science, vol 5478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00958-7_59

  • DOI: https://doi.org/10.1007/978-3-642-00958-7_59

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00957-0

  • Online ISBN: 978-3-642-00958-7

  • eBook Packages: Computer Science (R0)