DOI: 10.1145/1367497.1367506
research-article

Using the wisdom of the crowds for keyword generation

Published: 21 April 2008

ABSTRACT

In the sponsored search model, search engines are paid by businesses that want to display ads for their sites alongside the search results. Businesses bid on keywords, and their ads are displayed when a keyword is issued as a query to the search engine. An important problem in this process is keyword generation: given a business that is interested in launching a campaign, suggest keywords that are related to that campaign. We address this problem by making use of the search engine's query logs. We identify queries related to a campaign by exploiting the associations between queries and URLs, as captured by users' clicks. These queries form good keyword suggestions because they capture the "wisdom of the crowd" as to what is related to a site. We formulate the problem as a semi-supervised learning problem and propose algorithms within the Markov Random Field model. We perform experiments with real query logs and demonstrate that our algorithms scale to large query logs and produce meaningful results.
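
The idea of scoring queries by their click associations with a campaign's URLs can be illustrated with a small sketch. The code below is not the authors' algorithm; it is a generic damped label propagation over a bipartite query-URL click graph, written under several assumptions: the function name `propagate_relevance`, the damping parameter `alpha`, the fixed iteration count, and the toy click data are all hypothetical and chosen only for illustration. Seed URLs (pages of the advertiser's site) are pinned to a score of 1, and every other node repeatedly takes the click-weighted average of its neighbors' scores, shrunk by `1 - alpha` so that relevance decays with distance from the seeds.

```python
from collections import defaultdict


def propagate_relevance(clicks, seed_urls, alpha=0.1, iterations=50):
    """Score queries by damped label propagation on a click graph.

    clicks: dict mapping (query, url) -> click count (assumed input format).
    seed_urls: set of URLs known to belong to the campaign.
    alpha: hypothetical damping factor; higher values keep scores closer to seeds.
    """
    # Build both sides of the weighted bipartite graph.
    query_nbrs = defaultdict(dict)
    url_nbrs = defaultdict(dict)
    for (query, url), count in clicks.items():
        query_nbrs[query][url] = count
        url_nbrs[url][query] = count

    # Seeds are fixed at 1.0; everything else starts at 0.0.
    url_score = {u: (1.0 if u in seed_urls else 0.0) for u in url_nbrs}
    query_score = {q: 0.0 for q in query_nbrs}

    for _ in range(iterations):
        # Each query takes the damped, click-weighted mean of its URLs.
        for q, nbrs in query_nbrs.items():
            total = sum(nbrs.values())
            weighted = sum(c * url_score[u] for u, c in nbrs.items())
            query_score[q] = (1 - alpha) * weighted / total
        # Each non-seed URL takes the damped mean of its queries.
        for u, nbrs in url_nbrs.items():
            if u in seed_urls:
                continue
            total = sum(nbrs.values())
            weighted = sum(c * query_score[q] for q, c in nbrs.items())
            url_score[u] = (1 - alpha) * weighted / total

    return query_score


# Toy click log (made up for illustration only).
clicks = {
    ("cheap flights", "travel.example"): 10,
    ("hotel deals", "travel.example"): 5,
    ("hotel deals", "hotels.example"): 5,
    ("pizza recipe", "food.example"): 8,
}
scores = propagate_relevance(clicks, seed_urls={"travel.example"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy log, a query that leads only to the seed site ranks above one whose clicks are split between the seed and another site, and a query with no click path to the seed keeps a score of zero. Ranking queries by score then yields candidate keywords.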

        • Published in

          cover image ACM Conferences
          WWW '08: Proceedings of the 17th international conference on World Wide Web
          April 2008
          1326 pages
          ISBN:9781605580852
          DOI:10.1145/1367497

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
