A New Fast Intersection Algorithm for Sorted Lists on GPU

Faiza Manseur, Lougmiri Zekri, Mohamed Senouci
Copyright: © 2022 | Volume: 15 | Issue: 1 | Pages: 20
ISSN: 1938-7857 | EISSN: 1938-7865 | EISBN13: 9781683180340 | DOI: 10.4018/JITR.298325

MLA

Manseur, Faiza, et al. "A New Fast Intersection Algorithm for Sorted Lists on GPU." JITR vol.15, no.1 2022: pp.1-20. http://doi.org/10.4018/JITR.298325

APA

Manseur, F., Zekri, L., & Senouci, M. (2022). A New Fast Intersection Algorithm for Sorted Lists on GPU. Journal of Information Technology Research (JITR), 15(1), 1-20. http://doi.org/10.4018/JITR.298325

Chicago

Manseur, Faiza, Lougmiri Zekri, and Mohamed Senouci. "A New Fast Intersection Algorithm for Sorted Lists on GPU," Journal of Information Technology Research (JITR) 15, no.1: 1-20. http://doi.org/10.4018/JITR.298325

Abstract

Set intersection algorithms for sorted lists are important in triangle counting and community detection in graph analysis, and in search engines, where the intersection is computed between queries and inverted indexes. Many studies use GPU techniques to solve this intersection problem. Most of these techniques focus on improving the level of parallelism by reducing redundant comparisons and distributing the workload among GPU threads. In this paper, we propose the GPU Test with Jumps (GTWJ) algorithm, which computes the intersection between sorted lists using a new data structure. The idea of GTWJ is to group the data of each sorted list into a set of sequences. A sequence is identified by a key and is handled by a thread. The intersection is computed only between sequences with the same key; when the keys do not match, whole data packets are skipped in parallel. A counter is used to avoid useless tests between cells of sequences with different lengths. Experiments on the data used in this field show that GTWJ performs better in terms of execution time and number of tests.
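
The key-based skipping described in the abstract can be illustrated with a small sequential sketch. This is not the authors' GTWJ implementation: the abstract does not say how keys are derived or how the counter works, so the sketch assumes a hypothetical key of `value // BUCKET_WIDTH`, and each loop iteration stands in for the work one GPU thread would do on one sequence. The names `build_sequences`, `intersect_sequences`, `keyed_intersection`, and `BUCKET_WIDTH` are illustrative only.

```python
# Sequential CPU sketch of key-based sequence intersection.
# Assumption: a sequence's key is value // BUCKET_WIDTH (hypothetical choice;
# the abstract does not define the key).
BUCKET_WIDTH = 16


def build_sequences(sorted_list, width=BUCKET_WIDTH):
    """Group a sorted list into sequences, each identified by a key."""
    sequences = {}
    for value in sorted_list:
        sequences.setdefault(value // width, []).append(value)
    return sequences


def intersect_sequences(seq_a, seq_b):
    """Merge-style intersection of two sorted sequences.

    Returns the common values and the number of comparisons performed.
    The loop ends as soon as either sequence is exhausted, so a short
    sequence is never tested against the tail of a longer one.
    """
    i = j = 0
    common = []
    tests = 0
    while i < len(seq_a) and j < len(seq_b):
        tests += 1
        if seq_a[i] == seq_b[j]:
            common.append(seq_a[i])
            i += 1
            j += 1
        elif seq_a[i] < seq_b[j]:
            i += 1
        else:
            j += 1
    return common, tests


def keyed_intersection(list_a, list_b, width=BUCKET_WIDTH):
    """Intersect two sorted lists, comparing only sequences with equal keys."""
    seqs_a = build_sequences(list_a, width)
    seqs_b = build_sequences(list_b, width)
    result = []
    total_tests = 0
    # Each iteration models one thread handling one sequence of list_a.
    for key in sorted(seqs_a):
        seq_b = seqs_b.get(key)
        if seq_b is None:
            continue  # keys do not match: the whole sequence is skipped
        common, tests = intersect_sequences(seqs_a[key], seq_b)
        result.extend(common)
        total_tests += tests
    return result, total_tests


if __name__ == "__main__":
    a = [1, 3, 5, 20, 21, 40, 100]
    b = [3, 5, 21, 22, 64, 100, 101]
    values, tests = keyed_intersection(a, b)
    print(values, tests)
```

In this sketch, the sequences of `a` with key 2 (value 40) and of `b` with key 4 (value 64) have no counterpart in the other list, so none of their elements is ever compared. On a GPU, the per-key iterations are independent and could run as separate threads, which is the property the abstract relies on.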