Link Analysis

doi:10.1007/978-3-540-37882-2_7

Part of the book series: Data-Centric Systems and Applications (DCSA)

Abstract

Early search engines retrieved relevant pages for the user based primarily on the content similarity between the user query and the pages the engines had indexed. The retrieval and ranking algorithms were simply direct implementations of those from information retrieval. Starting in 1996, it became clear that content similarity alone was no longer sufficient for search, for two reasons. First, the number of Web pages grew rapidly during the middle to late 1990s, so the number of pages relevant to any given query can be huge. For example, given the search query “classification technique”, the Google search engine estimates that there are about 10 million relevant pages. This abundance of information causes a major problem for ranking, i.e., how to choose only 30–40 pages and rank them suitably to present to the user. Second, content similarity methods are easily spammed. A page owner can repeat some important words and add many remotely related words to their pages to boost the rankings of those pages and/or to make them appear relevant to a large number of possible queries.
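The spamming problem can be illustrated with a small sketch. It assumes a deliberately simple content similarity measure, cosine similarity over raw term counts, which is one common information retrieval scoring scheme and not necessarily the one any particular search engine used. The function names and the two example pages are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def term_counts(text):
    """Lowercase the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(query, page):
    """Cosine similarity between raw term-count vectors of query and page."""
    q = term_counts(query)
    p = term_counts(page)
    # Only terms that occur in the query can contribute to the dot product.
    dot = sum(q[t] * p[t] for t in q)
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    norm_p = math.sqrt(sum(c * c for c in p.values()))
    if norm_q == 0 or norm_p == 0:
        return 0.0
    return dot / (norm_q * norm_p)


query = "classification technique"

# Hypothetical pages: the second is the first plus repeated query words.
honest_page = "a survey of classification technique for text data"
spam_page = honest_page + " classification technique" * 5

honest_score = cosine_similarity(query, honest_page)
spam_score = cosine_similarity(query, spam_page)
print(f"honest page: {honest_score:.3f}")
print(f"spam page:   {spam_score:.3f}")
```

Under these assumptions the page with the repeated query words scores higher than the otherwise identical page, although it carries no additional information. A score that depends only on page content is therefore fully under the control of the page owner.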

About this chapter

Cite this chapter

(2007). Link Analysis. In: Web Data Mining. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-37882-2_7

DOI: https://doi.org/10.1007/978-3-540-37882-2_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37881-5
Online ISBN: 978-3-540-37882-2
eBook Packages: Computer Science, Computer Science (R0)
