Controversy Corner
Top-k query processing for replicated data in mobile peer to peer networks

https://doi.org/10.1016/j.jss.2013.10.043

Highlights

  • A top-k query processing method for replicated data in M-P2P networks is proposed.

  • The method suppresses duplicate transmissions of the same data items through long paths.

  • An intermediate node stops transmitting a query message on demand.

  • Simulation results show that the proposed method reduces traffic while keeping the accuracy of the top-k result high.

Abstract

In mobile ad hoc peer-to-peer (M-P2P) networks, nodes are highly resource constrained, so it is effective to retrieve data items using a top-k query, in which data items are ordered by the score of a particular attribute and the query-issuing node acquires the data items with the k highest scores. However, when network partitioning occurs, the query-issuing node cannot connect to some nodes that hold data items included in the top-k query result, and thus the accuracy of the query result decreases. Data replication is a promising approach to this problem. However, if each node sends back its own data items (replicas) in response to a query without considering the replicas held by others, the same data items are sent back to the query-issuing node more than once through long paths, which increases traffic. In this paper, we propose a top-k query processing method that takes data replication into account in M-P2P networks. This method suppresses duplicate transmissions of the same data items through long paths. Moreover, an intermediate node stops transmitting a query message on demand.

Introduction

A mobile ad hoc peer-to-peer (M-P2P) network is generally constructed by mobile nodes that have sensing and wireless communication capabilities. Mobile nodes include mobile terminals held by members of rescue operations, excavation activities, or military affairs who are engaged in collaborative work and share information among themselves. They also include in-vehicle systems such as car navigation systems that share location-dependent information, e.g., traffic, accidents, gas stations, and restaurants around the current position, and information for safe driving. In the former case, because users have to carry mobile terminals, the resources of nodes are basically constrained in terms of computation power, communication bandwidth, battery, and storage. In the latter case, nodes basically have rich resources. In this paper, we assume the former case.

In an M-P2P network with limited resources (in what follows, we simply call it an M-P2P network), it is very important that the system middleware supports data management for mobile users so that they can efficiently share information among themselves. In particular, since the communication bandwidth and battery of nodes are limited, it is important to reduce traffic by acquiring only the necessary data when accessing data items. A possible and promising solution is for each node to retrieve data using a top-k query, in which data items are ordered by the score of a particular attribute and the node that retrieves data items (the query-issuing node) acquires the ones with the k highest scores (the top-k result). A rescue effort at a disaster site is a good example of our target applications. In such a rescue effort, a top-k query is suitable for retrieving the necessary data items in a short time (with small delay). As supplies (e.g., the number of ambulances or the amount of medicine) are limited in a rescue effort, rescuers generally want to acquire only specific information, such as that on seriously injured victims or on the victims closest to the rescuers, as soon as possible.
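As a minimal sketch of what a top-k query returns, the following Python snippet ranks data items by a score and keeps the k highest. The `DataItem` type, the item IDs, and the score values are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
import heapq
from typing import NamedTuple


class DataItem(NamedTuple):
    item_id: str   # hypothetical identifier of a data item
    score: float   # score of the attribute used for ranking


def top_k(items, k):
    """Return the k items with the highest scores, best first."""
    return heapq.nlargest(k, items, key=lambda item: item.score)


# Hypothetical injury-severity scores collected by rescuers.
victims = [
    DataItem("v1", 0.35),
    DataItem("v2", 0.90),
    DataItem("v3", 0.10),
    DataItem("v4", 0.75),
    DataItem("v5", 0.60),
]
result = top_k(victims, 3)
print([item.item_id for item in result])
```

Only the k selected items need to travel back to the query-issuing node, which is why this kind of query suits bandwidth- and battery-constrained nodes.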

In an M-P2P network, network partitioning occurs because the network topology changes dynamically due to the movement of mobile nodes. If the network is partitioned, the query-issuing node cannot acquire data items that are included in the top-k result but held by mobile nodes in a different partition. This means that the accuracy of the query result decreases. To improve data availability when network partitioning occurs in M-P2P networks, the most promising solution is data replication (Cao et al., 2004; Hara, 2001; Hara, 2010; Karumanchi et al., 1999; Padmanabhan et al., 2008).

Fig. 1 shows an example where mobile devices held by rescuers in a rescue effort at a disaster site form an M-P2P network. Each rescuer collects information on the environment and victims by using several kinds of sensing devices and shares the information with the other rescuers. When the rescuer in the upper-left corner wants to find the three most seriously injured victims in the entire area (e.g., because the number of ambulances or the amount of medicine is limited), he or she performs a top-k query where k is set to 3. If the radio link between the mobile nodes in the bottom-right and bottom-left corners is disconnected, the query-issuing node cannot acquire information on victims collected by the two mobile nodes on the right side. If the mobile node in the bottom-left corner holds a copy of the information held by the two mobile nodes on the right side, the query-issuing node can acquire all the necessary information.
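The effect of partitioning and replication in this scenario can be sketched in a few lines of Python. The topology, node names (`A` to `D`), item IDs, and scores below are assumptions made for illustration and do not reproduce Fig. 1 exactly; accuracy is taken here to be the fraction of the exact top-k result that the query-issuing node can reach.

```python
from collections import deque


def reachable(links, start):
    """Breadth-first search: all nodes connected to `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def top_k_ids(scores, k):
    """IDs of the k highest-scoring items in `scores`."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def accuracy(links, holdings, scores, start, k):
    """Fraction of the exact top-k result obtainable from `start`."""
    nodes = reachable(links, start)
    available = set().union(*(holdings[n] for n in nodes))
    exact = top_k_ids(scores, k)
    found = top_k_ids({i: scores[i] for i in available}, k)
    return len(exact & found) / k


# Hypothetical severity scores; d1, d2, d3 form the exact top-3 result.
scores = {"d1": 9, "d2": 8, "d3": 7, "d4": 3, "d5": 2}

# Assumed partitioned topology: A (query issuer) and B on one side, C and D on the other.
links = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"]}

no_replicas = {"A": {"d3", "d4"}, "B": {"d5"}, "C": {"d1"}, "D": {"d2"}}
with_replicas = {**no_replicas, "B": {"d5", "d1", "d2"}}  # B holds copies of d1 and d2

print(accuracy(links, no_replicas, scores, "A", 3))
print(accuracy(links, with_replicas, scores, "A", 3))
```

In this toy setting, the replicas held by node `B` let the query-issuing node recover the full top-3 result even though nodes `C` and `D` are unreachable.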

In our previous work, we proposed a query processing method for top-k queries in M-P2P networks (Hagihara et al., 2009) that can reduce traffic and keep the accuracy of the top-k result high. We also proposed a data replication method for top-k queries in M-P2P networks (Hara et al., 2010). In the method proposed in Hara et al. (2010), data items with high scores are frequently replicated on mobile nodes. In this case, if a top-k query is processed by using the method in Hagihara et al. (2009), the same data items are sent back to the query-issuing node more than once through long paths because each node does not consider the replicas held by others. As a result, the traffic for query processing increases.

In this paper, we propose a query processing method for top-k queries that takes data replication into account in M-P2P networks, which is an extension of the method proposed in Hagihara et al. (2009). In this method, each node attaches to the query message information on its own data items (data item IDs) that may be included in the top-k result, and thus each node knows which data items are held by the nodes on the query path. When transmitting a reply, each mobile node does not send back data items held by nodes closer to the query-issuing node than itself. In addition, when a node receives a query message for the same query through a path different from the one through which it first received the query, the node behaves as follows. If the query message contains the same data item IDs as the previously received message and the corresponding data items are held by nodes closer to the query-issuing node than the node itself, the node does not send back these data items. By doing so, duplicate transmissions of the same data items through long paths can be suppressed, and thus the traffic decreases. Moreover, if a node that receives a query message does not have data items that may be included in the top-k result, i.e., all of its own data items have lower scores than the minimum score attached to the query message, the node stops transmitting the query message and starts transmitting a reply message. This is because, when the information on the candidate data items for the top-k result does not change, it can be inferred that all data items included in the top-k result have already been acquired with high probability.
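The two per-node decisions described above can be sketched roughly as follows. This is not the authors' implementation: the `Query` structure, its field names, and the way the minimum score is supplied are assumptions made for illustration, and the handling of a query that arrives through a second path is omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Query:
    min_score: float  # assumed: minimum score carried in the query message
    ids_on_path: frozenset = field(default_factory=frozenset)  # IDs held by nodes nearer the issuer


def candidates(own_items, query):
    """Own items (ID -> score) that may still belong to the top-k result."""
    return {i: s for i, s in own_items.items() if s >= query.min_score}


def next_query(own_items, query):
    """Return the query to forward, or None if the node should stop and reply."""
    mine = candidates(own_items, query)
    if not mine:
        return None  # every own score is below the minimum score: stop forwarding
    return Query(query.min_score, query.ids_on_path | frozenset(mine))


def reply_items(own_items, query):
    """Own candidates, excluding items already held closer to the issuer."""
    mine = candidates(own_items, query)
    return {i: s for i, s in mine.items() if i not in query.ids_on_path}


# Hypothetical three-hop path: issuer -> node_a -> node_b -> node_c.
q0 = Query(min_score=50.0)
node_a = {"d1": 90.0, "d2": 40.0}
node_b = {"d1": 90.0, "d3": 70.0}   # holds a replica of d1
node_c = {"d4": 10.0}

q1 = next_query(node_a, q0)          # node_a attaches the ID of d1
q2 = next_query(node_b, q1)          # node_b attaches the ID of d3
print(reply_items(node_b, q1))       # d1 is suppressed, only d3 is sent back
print(next_query(node_c, q2))        # node_c has no candidate and stops forwarding
```

In this sketch `node_b` sends back only `d3`, because the received query tells it that `d1` is already held by a node closer to the query-issuing node, and `node_c` stops forwarding because all of its scores fall below the minimum score.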

To the best of our knowledge, this is the first work on top-k query processing in M-P2P networks that takes data replication into account. We show that the proposed method can drastically reduce the traffic for top-k query processing compared with other approaches. This confirms that taking data replication into account is very effective in top-k query processing.

The remainder of this paper is organized as follows. In Section 2, we present the assumed system model and definitions. In Section 3, we introduce related work, and our previous works proposed in Hagihara et al. (2009) and Hara et al. (2010). In Section 4, we propose a query processing method for top-k query considering data replication in M-P2P networks. In Section 5, we show the results of the simulation experiments. Finally, in Section 6, we summarize this paper.

System model

In this paper, the system environment is assumed to be an M-P2P network in which mobile nodes retrieve data items held by themselves and by other nodes using a top-k query.

We assign a unique node identifier to each mobile node in the system. The set of all mobile nodes in the system is denoted by M = {M1, M2, …, Mm}, where m is the total number of mobile nodes and Mi (1 ≤ i ≤ m) is a node identifier. These nodes move freely; thus, the network topology dynamically changes and sometimes network partitioning

Related work

In this section, we present some conventional studies on top-k query processing and data replication that are related to our work. We focus on only three research fields: fixed peer-to-peer (P2P) networks, wireless sensor networks, and mobile ad hoc networks (MANETs). They share some characteristics, such as multihop communication and limited knowledge obtained only from neighbor nodes.

Proposed method

In this section, we present the query processing method for top-k queries proposed in this paper. The basis of our proposed method is the same as that of the method in Hagihara et al. (2009). The difference is whether data replication is taken into account. First, we describe the design policy of our proposed method. Then, we describe how each mobile node determines standard scores in our proposed method. We also describe the procedures for transmitting query and reply messages using the standard

Simulations

In this section, we show results of simulation experiments regarding performance evaluation of our proposed method. For the simulation experiments, we used the network simulator, QualNet4.0 (Scalable Network Technologies).

Conclusion

In this paper, we proposed a query processing method for top-k queries that takes data replication into account in M-P2P networks. The proposed method aims at keeping the accuracy of the top-k result high and reducing traffic. To this end, each node attaches to the query message information on its own data items, and thus duplicate transmissions of the same data items through long paths can be suppressed. In addition, each node stops transmitting a query message at an appropriate time, and thus, the search

Acknowledgments

This research is partially supported by the Grant-in-Aid for Scientific Research (S)(21220002), (B)(24300037), and JSPS Fellows (24-293) of the Ministry of Education, Culture, Sports, Science and Technology, Japan.

Yuya Sasaki received the B.E. degree in Multimedia Engineering and the M.E. degree in Information Science and Technology from Osaka University, Osaka, Japan, in 2009 and 2011, respectively. Currently, he is a Ph.D. candidate in Information Science and Technology at Osaka University, Osaka, Japan. His research interests include data search and replication mechanisms in mobile computing environments.

References (24)

  • P. Kalnis et al. Answering similarity queries in peer-to-peer networks. Information Systems (2006)

  • R. Akbarinia et al. Reducing network traffic in unstructured P2P systems using top-k queries. Distributed and Parallel Databases (2006)

  • W.-T. Balke et al. Progressive distributed top-k retrieval in peer-to-peer networks

  • T. Camp et al. A survey of mobility models for ad hoc network research. Wireless Communications and Mobile Computing (2002)

  • G. Cao et al. Cooperative cache-based data access in ad hoc networks. IEEE Computer (2004)

  • E. Cohen et al. Replication strategies in unstructured peer-to-peer networks

  • F.M. Cuenca-Acuna et al. Autonomous replication for high availability in unstructured P2P systems

  • A. Datta et al. Updates in highly unreliable, replicated peer-to-peer systems

  • R. Hagihara et al. A message processing method for top-k query for traffic reduction in ad hoc networks

  • T. Hara. Effective replica allocation in ad hoc networks for improving data accessibility

  • T. Hara. Replica allocation methods in ad hoc networks with data update. ACM–Kluwer Journal on Mobile Networks and Applications (2003)

  • T. Hara. Quantifying impact of mobility on data availability in mobile ad hoc networks. IEEE Transactions on Mobile Computing (2010)
Takahiro Hara received the B.E., M.E., and Dr.E. degrees in Information Systems Engineering from Osaka University, Osaka, Japan, in 1995, 1997, and 2000, respectively. Currently, he is an Associate Professor in the Department of Multimedia Engineering, Osaka University. He has published more than 300 international journal and conference papers in the areas of databases, mobile computing, peer-to-peer systems, WWW, and wireless networking. He has served or is serving as a Program Chair of the IEEE International Conferences on Mobile Data Management (MDM'06 and 10) and Advanced Information Networking and Applications (AINA'09 and 14), and the IEEE International Symposium on Reliable Distributed Systems (SRDS'12). He guest edited the IEEE Journal on Selected Areas in Communications, Sp. Issues on Peer-to-Peer Communications and Applications. His research interests include distributed databases, peer-to-peer systems, mobile networks, and mobile computing systems. He is a senior member of IEEE and ACM and a member of three other learned societies.

Shojiro Nishio received his B.E., M.E., and Ph.D. degrees from Kyoto University, Japan, in 1975, 1977, and 1980, respectively. He has been a full professor at Osaka University since August 1992, and was given the special title “Distinguished Professor of Osaka University” in July 2013. He served as a Vice President and Trustee of Osaka University from August 2007 to August 2011. Dr. Nishio has co-authored or co-edited more than 55 books, and authored or co-authored more than 600 refereed journal or conference papers. He served as Program Committee Co-Chair for several international conferences including DOOD 1989, VLDB 1995, and IEEE ICDE 2005. He has served and is currently serving as an editor of several international journals including IEEE Trans. on Knowledge and Data Engineering, VLDB Journal, ACM Trans. on Internet Technology, and Data & Knowledge Engineering. Dr. Nishio has received numerous awards during his research career, including the Medal with Purple Ribbon from the Japanese Emperor in 2011. He is also a fellow of IEEE, IEICE and IPSJ, and is a member of five learned societies, including ACM.

    Controversy corner. It is the intention of the Journal of Systems and Software to publish, from time to time, articles cut from a different cloth. This is one such article.

    The goal of CONTROVERSY CORNER is both to present information and to stimulate thought and discussion. Topics chosen for this coverage are not just traditional formal discussions of research work; they also contain ideas at the fringes of the field's “conventional wisdom”.

    These articles will succeed only to the extent that they stimulate not just thought, but action. If you have a strong reaction to the article that follows, either positive or negative, send it along to your editor, at [email protected].

    We will publish the best of the responses as CONTROVERSY REVISITED.
