Abstract
With the increase of data generation in distributed fashions such as peer-to-peer systems and sensor networks, top-k query processing which is a way to aggregate only a small set of data that sufficiently satisfies many users’ preferences, becomes a substantial issue. When data are periodically updated in each epoch e.g., weather information, without any techniques, a naive solution is to aggregate all data and their updates to ensure the completeness of final answers, however, it is too costly in terms of data transfer especially for data aggregator nodes (intermediate nodes). In this paper, we propose a top-k monitoring query processing method in 2-tier distributed systems based on a publish-subscribe scheme. A set of top-k subscriptions specifying summary scope of users’ interests is informed to aggregators to limit the number of transferred data records for each epoch. In addition, instead of issuing subscriptions of all queries, our method identifies a small set of minimal subscriptions as well as utilizes some adaptive heuristic rules to efficiently maintain those subscriptions resulting in lower communication overhead. Our experiments through both synthetic and real datasets show that our technique is efficient and outperforms other comparative reactive methods.
Similar content being viewed by others
References
Babcock, B., Olston, C.: Distributed top-k monitoring. In: SIGMOD, pp. 28–39 (2003)
Baikousi, E., Vassiliadis, P.: Maintenance of top-k materialized views. Distrib. Parallel Databases 27(2), 95–137 (2010)
Barber, C.B., Dobkin, D.P., Huhdanpaa, H.: The quickhull algorithm for convex hulls. ACM Trans. Math. Softw. (TOMS) 22(4), 469–483 (1996)
Bertsimas, D., Tsitsiklis, J.N.: Introduction to linear optimization, vol. 6. MA, Athena Scientific Belmont (1997)
Börzsönyi, S., Kossmann, D., Stocker, K.: The skyline operator. In: ICDE, pp. 421–430 (2001)
Chang, K.C.-C., Hwang, S.-w.: Minimal probing: supporting expensive predicates for top-k queries. In: SIGMOD, pp. 346–357 (2002)
Chazelle, B.: An optimal convex hull algorithm in any fixed dimension. Discret. Comput. Geom. 10(1), 377–409 (1993)
Cheema, M.A., Shen, Z., Lin, X., Zhang, W.: A unified framework for efficiently processing ranking related queries. In: EDBT, pp. 427–438 (2014)
Chester, S., Thomo, A., Venkatesh, S., Whitesides, S.: Indexing for vector projections. In: DASFAA, pp. 367–376 (2011)
Chester, S., Thomo, A., Venkatesh, S., Whitesides, S.: Indexing reverse top-k queries in two dimensions. In: DASFAA, pp. 201–208 (2013)
Das, G., Gunopulos, D., Koudas, N., Tsirogiannis, D.: Answering top-k queries using views. In: VLDB, pp. 451–462 (2006)
De Berg, M., Van Kreveld, M., Overmars, M., Schwarzkopf, O.C.: Computational geometry. Springer (2000)
Godfrey, P., Shipley, R., Gryz, J.: Maximal vector computation in large data sets. In: VLDB, pp. 229–240 (2005)
Hristidis, V., Koudas, N., Prefer, Y. Papakonstantinou.: A system for the efficient execution of multi-parametric ranked queries. In: SIGMOD, pp. 259–270 (2001)
Jafarpour, H., Hore, B., Mehrotra, S., Venkatasubramanian, N.: Subscription subsumption evaluation for content-based publish/subscribe systems. In: Middleware, pp. 62–81. Springer (2008)
Jiang, H., Cheng, J., Wang, D., Wang, C., Tan, G.: Continuous multi-dimensional top-k query processing in sensor networks. In: INFOCOM, pp. 793–801 (2011)
Lee, J., Cho, H., Hwang, S.-w.: Efficient dual-resolution layer indexing for top-k queries. In: ICDE, pp. 1084–1095 (2012)
Lu, H., Zhou, Y., Haustad, J.: Efficient and scalable continuous skyline monitoring in two-tier streaming settings. Inform. Syst. 38(1), 68–81 (2013)
Mouratidis, K., Bakiras, S., Papadias, D.: Continuous monitoring of top-k queries over sliding windows. In: SIGMOD, pp. 635–646 (2006)
Papadias, D., Tao, Y., Fu, G., Seeger, B.: Progressive skyline computation in database systems. ACM Trans. Database Syst. (TODS) 30(1), 41–82 (2005)
PripuŻić, K., Podnar żarko, I., Aberer, K.: Top-k/w publish/subscribe: A publish/subscribe model for continuous top-k processing over data streams. Information Systems (2012)
Ryeng, N.H., Vlachou, A., Doulkeridis, C., Nørvåg, K.: Efficient distributed top-k query processing with caching. In: DASFAA, pp. 280–295 (2011)
Sagy, G., Keren, D., Sharfman, I., Schuster, A.: Distributed threshold querying of general functions by a difference of monotonic representation. Proc. VLDB Endow. 4(2), 46–57 (2010)
Sasaki, Y., Hagihara, R., Hara, T., Shinohara, M., Nishio, S.: A top-k query method by estimating score distribution in mobile ad hoc networks. In: DMWPC, pp. 944–949 (2010)
Shastri, A., Di, Y., Rundensteiner, E.A., Ward, M.O.: Mtops: scalable processing of continuous top-k multi-query workloads. In: CIKM, pp. 1107–1116 (2011)
Silberstein, A.S., Braynard, R., Ellis, C., Munagala, K., Yang, J.: A sampling-based approach to optimizing top-k queries in sensor networks. In: ICDE, pp. 68–68 (2006)
Swokowski, E.W.: Calculus with analytic geometry. Taylor & Francis (1979)
Tao, Y., Hristidis, V., Papadias, D., Papakonstantinou, Y.: Branch-and-bound processing of ranked queries. Inform. Syst. 32(3), 424–445 (2007)
Triantafillou, P., Economides, A.: Subscription summarization: A new paradigm for efficient publish/subscribe systems. In: ICDCS, pp. 562–571 (2004)
Udomlamlert, K., Hara, T., Nishio, S.: Communication-efficient preference top-k monitoring queries via subscriptions. In: SSDBM, pp. 44:1–44:4 (2014)
Vlachou, A., Doulkeridis, C., Nørvåg, K.: Distributed top-k query processing by exploiting skyline summaries. Distrib. Parallel Databases 30(3-4), 239–271 (2012)
Vlachou, A., Doulkeridis, C., Nørvåg, K., Vazirgiannis, M.: On efficient top-k query processing in highly distributed environments. In: SIGMOD (2008)
Wu, M., Jianliang Xu, J., Xueyan Tang, X., Wang-Chien Lee, W.-C.: Top-k monitoring in wireless sensor networks. IEEE TKDE 7, 962 –976 (2007)
Xie, M., Lakshmanan, L.V., Wood, P.T.: Efficient top-k query answering using cached views. In: EDBT, pp. 489–500 (2013)
Yang, D., Shastri, A., Rundensteiner, E.A., Ward, M.O.: An optimal strategy for monitoring top-k queries in streaming windows. In: EDBT, pp. 57–68 (2011)
Yu, A., Agarwal, P.K., Yang, J.: Processing a large number of continuous preference top-k queries. In: SIGMOD, pp. 397–408 (2012)
Yu, A., Agarwal, P.K., Yang, J.: Processing and notifying range top-k subscriptions. In: ICDE, pp. 810–821 (2012)
Zhao, K., Tao, Y., Zhou, S.: Efficient top-k processing in large-scaled distributed environments. Data Knowl. Eng. 63(2), 315–335 (2007)
Zou, L., Chen, L.: Pareto-based dominant graph: An efficient indexing structure to answer top-k queries. IEEE TKDE 23(5), 727–741 (2011)
Acknowledgments
This research is partially supported by the Grantin-Aid for Scientific Research (A)(26240013) of MEXT, Japan, and JST, Strategic International Collaborative Research Program, SICORP.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Udomlamlert, K., Hara, T. & Nishio, S. Subscription-based data aggregation techniques for top-k monitoring queries. World Wide Web 20, 237–265 (2017). https://doi.org/10.1007/s11280-016-0385-1
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11280-016-0385-1