research-article

Parallel PageRank computation using GPUs

Published: 23 August 2012

ABSTRACT

Fast and efficient computation of web rank scores is essential for search engines today. Because of the enormous size of the data and the dynamic nature of the World Wide Web, this computation is generally executed on large web graphs (up to billions of webpages) and must be refreshed often, which makes it a challenging task. In this paper, we propose an efficient method for computing PageRank scores on graphics processing units (GPUs); PageRank is Google's ranking method based on analyzing the link structure of the Web. To store the web graph data, we employ a slight modification of a storage format called the binary 'link structure file', inspired by [2]. We then divide the PageRank calculation phases into parallel operations to exploit the computing power of the graphics cards. Our program was written in CUDA and tested on a system equipped with two dual-GPU NVIDIA GeForce GTX 295 graphics cards, using two real datasets crawled from Vietnamese sites: one containing 7 million pages and 132 million links, the other 15 million pages and 200 million links. The experimental results show that, on both datasets, the computation runs 10 to 20 times faster than a version running on an Intel Q8400 CPU at 2.67 GHz. Our method can also scale up well to larger web graphs.
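
As a rough illustration of the computation being parallelized, the sketch below runs PageRank power iteration over an edge list in plain Python. It is not the authors' CUDA implementation and does not use their binary link structure file format. The damping factor of 0.85, the tolerance, the uniform redistribution of rank from pages without outgoing links, the stopping test on the maximum absolute change, and the function name `pagerank` are all assumptions made for illustration.

```python
def pagerank(num_pages, edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank over (source, destination) page-index pairs.

    Illustrative sketch only: parameter defaults and dangling-page handling
    are assumptions, not values taken from the paper.
    """
    # Count outgoing links per page.
    out_degree = [0] * num_pages
    for src, _ in edges:
        out_degree[src] += 1

    # Start from a uniform rank distribution.
    rank = [1.0 / num_pages] * num_pages

    for _ in range(max_iter):
        # Rank held by pages with no outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in range(num_pages) if out_degree[p] == 0)

        # Each edge contributes independently to its destination page.
        incoming = [0.0] * num_pages
        for src, dst in edges:
            incoming[dst] += rank[src] / out_degree[src]

        # Each page's new score depends only on the previous iteration.
        base = (1.0 - damping) / num_pages + damping * dangling / num_pages
        new_rank = [base + damping * incoming[p] for p in range(num_pages)]

        # Stop when the largest per-page change falls below the tolerance.
        delta = max(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break

    return rank


if __name__ == "__main__":
    # Tiny hypothetical graph: pages 0 and 2 both link to page 1.
    print(pagerank(3, [(0, 1), (2, 1)]))
```

The per-edge accumulation and the per-page update are independent across edges and pages within one iteration, which is the kind of work one could plausibly map onto GPU threads; on a GPU the accumulation into `incoming` would need atomic additions or a destination-grouped layout.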

References

  1. S. Brin and L. Page. 1998. The anatomy of a large-scale hypertextual web search engine. In Proceedings of the 7th WWW Conference.
  2. A. Rungsawang and B. Manaskasemsak. 2004. Parallel PageRank Computation on a Gigabit PC Cluster. In Proceedings of the 18th International Conference on Advance Information Networking and Application.
  3. A. Rungsawang and B. Manaskasemsak. 2003. PageRank computation using PC cluster. In Proceedings of the 10th European PVM/MPI User's Group Meeting.
  4. A. Rungsawang and B. Manaskasemsak. 2004. An Efficient Partition-Based Parallel PageRank Algorithm. In Proceedings of the 11th International Conference Parallel and Distributed Computing.
  5. K. Sankaralingam, S. Sethumadhavan and J. C. Browne. 2003. Distributed PageRank for P2P system. In Proceedings of the 11th IEEE HPD'03 Conference.
  6. Amy N. Langville and Carl D. Meyer. 2006. Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press, 41 William Street, Princeton, New Jersey, pp. 31-46.
  7. Nathan Bell and Michael Garland. 2008. Efficient Sparse Matrix-Vector Multiplication on CUDA. NVIDIA Technical Report.
  8. Xintian Yang, Srinivasan Parthasarathy and P. Sadayappan. 2011. Fast Sparse Matrix-Vector Multiplication on GPUs: Implications for Graph Mining. Proceedings of the VLDB Endowment, Vol. 4, No. 4. Seattle, Washington.
  9. Praveen K., Vamshi Krishna K., Anil Sri Harsha B., S. Balasubramanian and P. K. Baruah. 2011. Cost Efficient PageRank Computation using GPU. IEEE International Conference on High Performance Computing (HiPC), Student Research Symposium.
  10. Tianji Wu, Bo Wang, Yi Shan, Feng Yan, Yu Wang and Ningyi Xu. 2010. Efficient PageRank and SpMV Computation on AMD GPUs. 39th International Conference on Parallel Processing, DOI 10.1109, pp. 81-89.
  11. Ali Cevahir, Cevdet Aykanat, Ata Turk, B. Barla Cambazoglu, Akira Nukada and Satoshi Matsuoka. 2010. Efficient PageRank on GPU Clusters. IPSJ SIG Technical Report, Vol. 2010-HPC-128.
  12. Chebyshev distance. http://en.wikipedia.org/wiki/Chebyshev_distance
  13. M. Harris. 2007. Parallel Prefix Sum (Scan) with CUDA. NVIDIA Corporation.
  14. CUDA zone. http://www.NVIDIA.com/object/cuda_home_new.html
  15. NVIDIA. 2009. NVIDIA CUDA Programming Guide 3.0.

Published in

SoICT '12: Proceedings of the 3rd Symposium on Information and Communication Technology
August 2012
290 pages
ISBN: 9781450312325
DOI: 10.1145/2350716

Copyright © 2012 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 147 of 318 submissions, 46%
