Generating Pseudo Search History Data in the Absence of Real Search History

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9828)

Abstract

Previous studies in the Information Retrieval literature have shown that users’ search history can be leveraged to improve current search results. However, sometimes little to no search history is available. In such cases, it would be helpful to obtain data similar to search history data. One way of doing this is to simulate previous search interactions. In the present study, we focus on generating simulated “related queries” that can serve as an additional source of information about the current search [1]. Assuming that users reformulate their queries by leveraging some of the terms and key phrases they find in ranked documents during their search, we propose simple models for generating such related queries.
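
The idea of reformulating a query with terms found in ranked documents can be illustrated with a minimal sketch. This is not the models proposed in the paper, which the abstract does not specify. The function name `generate_related_queries`, the small stopword list, and the reciprocal-rank term weighting are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "in", "to", "for", "is", "on"})


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def generate_related_queries(query, ranked_docs, num_queries=3):
    """Return simulated related queries built from terms in ranked documents.

    Each candidate term is scored by summing 1/rank over its occurrences,
    so terms from higher-ranked documents count more. This weighting is an
    illustrative assumption, not the scoring used in the paper.
    """
    query_terms = set(tokenize(query))
    scores = Counter()
    for rank, doc in enumerate(ranked_docs, start=1):
        for term in tokenize(doc):
            # Skip terms already in the query and common function words.
            if term in query_terms or term in STOPWORDS:
                continue
            scores[term] += 1.0 / rank
    # Append each of the top-scoring terms to the original query.
    return [f"{query} {term}" for term, _ in scores.most_common(num_queries)]
```

In this sketch each related query extends the original query by a single term. A variant closer to the abstract's wording would also draw on extracted key phrases rather than single terms only.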

References

  1. Bah, A., Carterette, B.: Aggregating results from multiple related queries to improve web search over sessions. In: Jaafar, A., et al. (eds.) AIRS 2014. LNCS, vol. 8870, pp. 172–183. Springer, Heidelberg (2014)

  2. Baskaya, F.: Simulating Search Sessions in Interactive Information Retrieval Evaluation. Tampere University, Tampere (2014)

  3. Baskaya, F., Keskustalo, H., Järvelin, K.: Simulating simple and fallible relevance feedback. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J., Kraaij, W., Lee, H., Murdock, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 593–604. Springer, Heidelberg (2011)

  4. Baskaya, F., Keskustalo, H., Järvelin, K.: Time drives interaction: simulating sessions in diverse searching environments. In: Proceedings of SIGIR, August 2012

  5. Carterette, B., Bah, A., Zengin, M.: Dynamic test collections for retrieval evaluation. In: Proceedings of the 2015 International Conference on the Theory of Information Retrieval, pp. 91–100. ACM, September 2015

  6. Carterette, B., Kanoulas, E., Hall, M.M., Clough, P.D.: Overview of the TREC 2013 Session Track. In: TREC (2013)

  7. Cormack, G.V., Smucker, M.D., Clarke, C.L.: Efficient and effective spam filtering and re-ranking for large web datasets. Inf. Retr. 14(5), 441–465 (2011)

  8. Guan, D.: Structured Query Formulation and Result Organization for Session Search (Doctoral dissertation, Georgetown University) (2013)

  9. Guan, D., Zhang, S., Yang, H.: Utilizing query change for session search. In: Proceedings of SIGIR, July 2013

  10. Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. (TOIS) 20(4), 422–446 (2002)

  11. Jiang, J., He, D., Han, S.: On duplicate results in a search session. In: Proceedings of the 21st TREC (2012)

  12. JTopia. https://github.com/srijiths/jtopia

  13. Keskustalo, H., Järvelin, K., Pirkola, A., Sharma, T., Lykke, M.: Test collection-based IR evaluation needs extension toward sessions – a case of extremely short queries. In: Lee, G.G., Song, D., Lin, C.-Y., Aizawa, A., Kuriyama, K., Yoshioka, M., Sakai, T. (eds.) AIRS 2009. LNCS, vol. 5839, pp. 63–74. Springer, Heidelberg (2009)

  14. Kruschwitz, U.: University of Essex at the TREC 2012 session track. In: Proceedings of the 21st TREC (2012)

  15. Raman, K., Bennett, P.N., Collins-Thompson, K.: Toward whole-session relevance: exploring intrinsic diversity in web search. In: Proceedings of the 36th International ACM SIGIR, pp. 463–472. ACM, July 2013

  16. Strohman, T., Metzler, D., Turtle, H., Croft, W.B.: Indri: a language model-based search engine for complex queries. In: Proceedings of the International Conference on Intelligent Analysis, vol. 2, no. 6, pp. 2–6, May 2005

  17. Verberne, S., Sappelli, M., Järvelin, K., Kraaij, W.: User simulations for interactive search: evaluating personalized query suggestion. In: Hanbury, A., Kazai, G., Rauber, A., Fuhr, N. (eds.) ECIR 2015. LNCS, vol. 9022, pp. 678–690. Springer, Heidelberg (2015)

  18. Yahoo! BOSS. https://developer.yahoo.com/search/boss/

  19. Zhang, S., Guan, D., Yang, H.: Query change as relevance feedback in session search. In: Proceedings of the 36th International ACM SIGIR Conference, pp. 821–824. ACM, July 2013

Corresponding author

Correspondence to Ashraf Bah.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Bah, A., Carterette, B. (2016). Generating Pseudo Search History Data in the Absence of Real Search History. In: Hartmann, S., Ma, H. (eds) Database and Expert Systems Applications. DEXA 2016. Lecture Notes in Computer Science, vol 9828. Springer, Cham. https://doi.org/10.1007/978-3-319-44406-2_34

  • DOI: https://doi.org/10.1007/978-3-319-44406-2_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44405-5

  • Online ISBN: 978-3-319-44406-2

  • eBook Packages: Computer Science, Computer Science (R0)
