A Hot-Area-Based Selfish Routing Protocol for Mobile Social Networks

Data delivery in mobile social networks is a challenging task due to nodal mobility and intermittent connectivity. It is natural to utilize the inherent social properties to assist in making forwarding decisions. However, existing routing schemes seldom consider the selfishness of nodes and assume that nodes are willing to forward messages for others. In the real world, most people are selfish, and nodes attached to people behave selfishly too. Based on the regularity of human behavior, we propose a hot-area-based selfish routing protocol (HASR) tailored for mobile social networks. When there are no selfish nodes, data transmission is based on the active degree of the node, which is calculated from the weights of the hot areas that the node will visit. When nodes behave selfishly, the routing decision is made by the contribution index, which indicates the contributions that nodes have made to data transmissions in the network. Simulation results show that HASR performs better when nodes behave selfishly.


Introduction
Portable devices such as smart phones, laptops, and tablet computers have become very popular with the rapid development of wireless communication and integrated circuit technologies. These devices, with wireless capabilities such as Bluetooth, WiFi, or 3G, are often carried by people and cooperate with each other to form an ad hoc network for exchanging and sharing data. Social behavior analysis has been introduced to resolve routing issues when nodes are attached to humans, and better performance can be achieved by using social relationships or human behavior in real-life environments. Hui et al. [1] named this kind of network Pocket Switched Networks (PSNs), a type of Delay Tolerant Networks (DTNs) [2]. Because the network has inherent social features, it is also called a Mobile Social Network (MSN).
Since mobile social networks have great potential for collaborative data exchange, opportunistic routing for MSNs has attracted great interest. Unfortunately, it is hard to find an end-to-end path between the source node and the destination node, and the network is usually intermittently connected due to nodal mobility and the sparse distribution of nodes, which poses great challenges for data delivery in mobile social networks. Different from traditional delay tolerant networks, nodes in mobile social networks are often controlled by people, so nodes have social features that stem from the social relationships or social ties among people. It is natural to utilize these inherent social properties to assist in making forwarding decisions.
Some institutes have tried to find the social properties of mobile social networks in the real world based on data sets collected from portable devices carried by people, for example, Reality Mining [3], Topology Discovery [4], and Haggle [5]. There have also been many routing strategies for mobile social networks, such as LABEL [6], BUBBLE [7], SimBet [8], Peoplerank [9], and PLBR [10], which all employ social network properties to help message forwarding. However, these routing techniques share a common assumption that all nodes in the network are unselfish and cooperative. That is to say, each node is willing to receive and relay the messages sent by other nodes. In the real world, most people are selfish, so some of the nodes controlled by people will be selfish too. For example, the resources (energy, buffer, bandwidth, etc.) of nodes are usually limited, and nodes try to preserve their own resources while consuming the resources of other nodes. Besides, most people are willing to forward messages only for nodes with whom they have a social relationship. Such social selfishness affects node behavior.
In this paper, we propose a Hot-Area-based Selfish Routing protocol (HASR) tailored for mobile social networks. Based on the regularity of human behavior, namely, that people visit some locations regularly and spend more time at a few specific locations than at others, HASR provides a scheme to resolve the routing issue for networks with selfish nodes.
The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 describes the system model. Section 4 introduces the details of HASR. Section 5 shows the simulation results of HASR. Section 6 concludes the paper and presents future work.

Related Work
In recent years, social structures have been used to help forwarding in intermittently connected networks, and there have been many studies on data gathering for social networks []. Routing techniques in mobile social networks can be classified into two categories: the social relationship based approach and the human behavior-based approach.

Social Relationship Based Approach.
In the real world, there are inherent social relationships between people, such as relatives, friends, colleagues, schoolmates, and so forth. These relationships usually remain stable over a long period of time. Based on the social relationships, messages can be forwarded efficiently.
Hui and Crowcroft have proposed a routing algorithm called LABEL, which takes advantage of communities for routing messages [6]. LABEL partitions nodes into communities based only on affiliation information. Each node in the network then has a label telling others about its affiliation. A node only chooses to forward messages to destinations or to next-hop nodes belonging to the same group (same label) as the destinations. LABEL significantly improves forwarding efficiency over oblivious forwarding on their dataset, but it lacks a mechanism to move messages away from the source when the destinations are socially far away.
BUBBLE combines knowledge of the community structure with knowledge of node centrality to make forwarding decisions [7]. Centrality in BUBBLE is equivalent to popularity in real life, which is defined as how frequently a node interacts with other nodes. People have different popularities in real life, so nodes have different centralities in the network. Moreover, people belong to small communities, as in LABEL. When two nodes encounter each other, the node forwards the message up to the node with higher centrality (the more popular node) in the community until it reaches the same level of centrality as the destination node. Then, the message can be forwarded to the destination community at the same ranking (centrality) level. BUBBLE reduces resource consumption compared to Epidemic and PRoPHET. However, this reduction may not be large, since the ranking process creates significant communication overhead. In addition, this protocol still uses multicopy forwarding, which means that it is not efficient in terms of resource consumption.
SimBet, presented in [8], makes routing decisions based on the centrality (betweenness) and similarity of nodes. Centrality means popularity, as in BUBBLE. More specifically, the centrality value captures how often a node connects nodes that are themselves not directly connected. Similarity is calculated based on the number of common neighbors of each node. SimBet routing exchanges the preestimated centrality and locally determined similarity of each node in order to make a forwarding decision. The forwarding decision is taken based on the Similarity Utility Function (SimUtil) and the Betweenness Utility Function (BetUtil). When nodes come into contact with each other, the node selects the relay node with the higher SimBet utility for a given destination.
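The utility comparison described above can be illustrated with a short Python sketch. It assumes a pairwise normalization in which each utility is a node's value divided by the sum of the two encountering nodes' values, and a weighted sum whose two weights add up to 1. The function names and the default weight `alpha=0.5` are hypothetical and chosen only for illustration.

```python
def pair_utility(own: float, other: float) -> float:
    """Share of a metric held by this node relative to the encountered node."""
    total = own + other
    # If neither node has any value for the metric, treat them as equal.
    return 0.5 if total == 0 else own / total


def simbet_utility(sim_own: float, sim_other: float,
                   bet_own: float, bet_other: float,
                   alpha: float = 0.5) -> float:
    """Weighted sum of the similarity and betweenness utilities."""
    sim_util = pair_utility(sim_own, sim_other)
    bet_util = pair_utility(bet_own, bet_other)
    return alpha * sim_util + (1.0 - alpha) * bet_util


def should_forward(sim_own: float, sim_other: float,
                   bet_own: float, bet_other: float,
                   alpha: float = 0.5) -> bool:
    """Forward when the encountered node has the higher combined utility."""
    mine = simbet_utility(sim_own, sim_other, bet_own, bet_other, alpha)
    theirs = simbet_utility(sim_other, sim_own, bet_other, bet_own, alpha)
    return theirs > mine
```

With equal betweenness, the node whose similarity to the destination is higher obtains the higher utility and is selected as the relay.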
LocalCom, proposed by Li and Wu [11], is a community-based epidemic forwarding scheme for disruption-tolerant networks. LocalCom detects the community structure using limited local information and improves the forwarding efficiency based on the community structure. It defines similarity metrics according to the encounter history of nodes to depict the neighboring relationship between each pair of nodes. A distributed algorithm, which only utilizes local information, is then applied to detect communities, and the formed communities have strong intracommunity connections.
In social-greedy [12], the forwarding decision is made by the closeness and social distance. Closeness is calculated from the common attributes (address, affiliation, school, major, city, country, etc.) of the two nodes. The more common attributes they share, the closer the two nodes are. Social-greedy forwards a message to the next node if it is socially closer to the destination. Social-greedy outperforms the LABEL protocol. However, the delivery ratios of Epidemic and BUBBLE are better than that of social-greedy.

Human Behavior-Based Approach. Another social-based routing strategy employs the regularity of human behavior to aid in routing decisions.
Liu presents a cyclic MobiSpace [13], which is a MobiSpace where the mobility of the nodes exhibits a regular cyclic pattern; that is, there exists a common motion cycle for all nodes. In a cyclic MobiSpace, if two nodes were often in contact at a particular time in previous cycles, then the probability that they will be in contact around the same time in the next cycle is high. Based on this phenomenon, the Routing in Cyclic MobiSpace (RCM) scheme is proposed. The routing decision is made by the Expected Minimum Delay (EMD), which is the expected time that an optimal forwarding scheme takes to deliver a message at a specific time from a source to a destination in a network with cyclic and uncertain connectivity. When nodes come into contact, messages are relayed to the next hop with the minimum EMD.
Liu et al. consider that there are preference locations that people visit frequently, and they propose the preference location-based routing strategy (PLBR) [10]. PLBR first provides an approach for acquiring one's preference locations and then calculates a closeness metric, which measures the degree of proximity of any two nodes. On this basis, the data forwarding algorithm is presented. The closeness is defined to indicate the similarity of the preference locations that the two nodes visit. The higher the closeness of two nodes is, the more preference locations they have in common. If the closeness of two nodes is high, the probability that the two nodes come into contact is high. Messages are forwarded to the next hop with the highest closeness. However, the calculation of the closeness requires the preference locations of the destination node, which introduces large network overhead.
An expected shortest path routing (ESPR) [14] scheme improves PLBR by utilizing the stable property that humans have preference locations in their mobility traces, so the direct distance between node pairs can be calculated according to the similarity of their location visiting preferences. Then an expected shortest path length (ESPL) can be obtained by Dijkstra's algorithm. Messages are forwarded to nodes that are closer to the destination than the previous nodes in the message delivery history. In addition, ESPR also employs message priority in queue management.
CSI [15] is a behavior-oriented service proposed as a new paradigm of communication in mobile human networks, motivated by the tight user-network coupling in future mobile societies. In such a scenario, messages are sent to inferred behavioral profiles instead of explicit IDs. First, user behavioral profiles are constructed based on traces collected from two large wireless networks, and their spatiotemporal stability is analyzed. The implicit relationship discovered between mobile users can be utilized to provide a service for message delivery and discovery in various network environments. CSI shows that user behavioral profiles are surprisingly stable. Leveraging such stability in user behaviors, the CSI service achieves a delivery rate very close to that of the delay-optimal strategy with minimal overhead.

System Model
3.1. Network Model. We assume an urban scenario in which the city is divided into several areas by roads. Each area has a unique ID. As shown in Figure 1, there are 10 areas.
Definition 1 (hot area). A hot area is a zone that nodes visit frequently and in which they stay for a long period of time. Each area has its own weight, called the hot degree, and a hot area has a higher hot degree. In a city, the hot areas include the downtown, scenic spots, and so forth. In a school, the dormitory, canteen, and library are hot areas for students.
Definition 2 (hot degree). The hot degree denotes the popularity of an area, that is, its attraction for the mobile nodes. The higher the hot degree is, the more frequently the area is visited.

Mobility Model.
Based on the regularity of human behavior, the mobility of all nodes is assumed to follow the schedule-based mobility model described in [16], where each node carries a unique schedule that describes its whole-day journey. Each item of the schedule indicates when and where the node will be. As shown in Table 1, the node arrives at the first area at 8:00 and stays there for 100 minutes. Then the node leaves for the next area and will reach it at 10:00. A node moves only according to its schedule, travelling from its current location towards the next one in the schedule.
Whether a node is selfish or not can be determined by different metrics. For example, a node may behave selfishly when most of its resources have been depleted. When most of its resources have been consumed, the node only delivers messages for the nodes that relayed messages for it before. The selfishness of a node can also be determined by the social relationships between nodes: a node only relays messages from nodes with which it has strong relationships (e.g., friends, classmates, and colleagues) and is not willing to transfer messages for strangers. Without loss of generality, we assume in this paper that a node is selfish only when its energy is lower than a predefined threshold.
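To make the schedule-based model and the energy-based selfishness assumption concrete, the following Python sketch represents a schedule as a list of (area, arrival time, residence time) items and looks up where a node is at a given time. The area names and the value of `ENERGY_THRESHOLD_J` are hypothetical placeholders; only the times mirror the example described in the text.

```python
from dataclasses import dataclass
from typing import List, Optional

ENERGY_THRESHOLD_J = 5.0  # assumed value; the model only requires a predefined threshold


@dataclass
class ScheduleItem:
    area: str       # ID of the area to visit (hypothetical names below)
    arrival: int    # arrival time in minutes since midnight
    residence: int  # residence time in the area in minutes


def current_area(schedule: List[ScheduleItem], now: int) -> Optional[str]:
    """Return the area the node is in at time `now`, or None while travelling."""
    for item in schedule:
        if item.arrival <= now < item.arrival + item.residence:
            return item.area
    return None


def is_selfish(energy_j: float, threshold_j: float = ENERGY_THRESHOLD_J) -> bool:
    """A node is selfish only when its energy is below the threshold."""
    return energy_j < threshold_j


schedule = [
    ScheduleItem("dormitory", 8 * 60, 100),  # arrives 8:00, stays 100 minutes
    ScheduleItem("library", 10 * 60, 120),   # arrives 10:00, stays 120 minutes
]
```

Between 9:40 and 10:00 the lookup returns `None`, which corresponds to the node travelling between two scheduled areas.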

HASR Design
HASR modifies and extends our previous work HARP [17]. Data forwarding in HASR is based on the active degree and the contribution index of the node. When the node has enough energy, data forwarding is determined only by the active degree of the node. When too much of the node's energy has been consumed, the node behaves selfishly and the routing decision is based on the contribution index.
4.1. Hot Degree. The hot degree of an area reflects the popularity of the area and its attraction for the mobile nodes. There are more nodes in an area with a higher hot degree, which means that there are more chances to meet the destination node in that area. Obviously, the hot degree of an area varies with time, but it fluctuates only a little over a long period of time. Assume that $n_i$ nodes enter area $i$ per unit time and that the residence time of node $j$ in area $i$ is $t_{ij}$. Then the hot degree can be expressed by (1), where $h_i$ is the normalized hot degree of area $i$ and $m$ is the number of areas. As in (1), the hot degree is proportional to the number of nodes visiting the area and inversely proportional to the mean time that these nodes stay in the area. The more nodes visit the area, the higher the probability of forwarding a message is. By contrast, the mean time staying in the area indicates the activity of the area: the higher the mean residence time is, the lower the probability of leaving for another area is.
How to obtain the hot degree of the areas is a fundamental problem for HASR. The hot degree of an area can be acquired from a data set collected from the real world over a long period of time, such as that of the Reality Mining project. Based on the data set, we can learn the number of nodes visiting each area and the residence time that they stay in that area. Then the hot degree of the area can be calculated by (1). Table 2 shows an example of the hot degree of different locations in a school.
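A minimal Python sketch of a hot degree computation that is consistent with the properties stated above is given below. The exact form is an assumption: the raw score of an area is taken as the number of visiting nodes divided by their mean residence time, and the scores are normalized so that they sum to 1 over all areas. The function name `hot_degrees` and the sample visit records are hypothetical.

```python
from typing import Dict, List


def hot_degrees(residence_times: Dict[str, List[float]]) -> Dict[str, float]:
    """Map each area ID to a normalized hot degree (assumed form).

    residence_times[area] lists the residence time of every node that
    entered the area during the observation window.
    """
    raw: Dict[str, float] = {}
    for area, times in residence_times.items():
        if not times or sum(times) == 0:
            raw[area] = 0.0  # no usable observations for this area
            continue
        visitors = len(times)
        mean_stay = sum(times) / visitors
        # More visitors raise the score; a longer mean stay lowers it.
        raw[area] = visitors / mean_stay
    total = sum(raw.values())
    if total == 0:
        return {area: 0.0 for area in raw}
    return {area: score / total for area, score in raw.items()}


visits = {
    "canteen": [30.0, 30.0, 30.0, 30.0],  # 4 visitors, mean stay 30 minutes
    "library": [60.0, 60.0],              # 2 visitors, mean stay 60 minutes
}
degrees = hot_degrees(visits)
```

In this hypothetical data set the canteen receives the higher hot degree because it has more visitors and a shorter mean stay.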

Active Degree.
In HASR, the data forwarding decision is made by the active degree of the node, which indicates how frequently the node visits the areas. As mentioned before, each node follows an agenda schedule like that in Table 1. In Table 1, the first column $A$ denotes the areas that the node will visit today; the second column $T$ is the time of arrival at the area; and the third column $D$ indicates the residence time that the node stays in the area. The active degree of node $i$ is then defined as the sum of the hot degrees of the areas that the node will visit later today:

$$AD_i = \sum_{\substack{1 \le k \le K \\ t_c < T_k + D_k}} h_{A_k} \qquad (2)$$

where $A_k$, $T_k$, and $D_k$ are the values of the $k$th item of the agenda schedule table; $K$ is the number of items of the table; and $t_c$ is the current time. The condition $t_c < T_k + D_k$ means that the node has not yet left area $A_k$, whereas the items before it in the table are invalid. For instance, in Table 1, if the current time is 11:30, the first item is invalid and the second item will be invalid 30 minutes later. When computing the active degree of nodes, the invalid items should be removed.
The active degree of a node reflects its capability to relay messages. A higher active degree means that the node is more popular and there is more chance to meet the destination node. When two nodes encounter each other, messages are transferred to the node with the higher active degree.
4.3. Contribution Index. When the node has enough energy, the forwarding decision is made only by the active degree of the node. However, if the energy of the node is lower than a preset value, the node behaves selfishly to save its energy, and the routing decision is made by the contribution index of the node. That is to say, when the next hop behaves selfishly, messages from the nodes that have contributed more to the network are the ones delivered. The contribution index (CI) indicates the contributions of a node to the data transmissions of the network and consists of two parts: the network contribution and the contribution for a specific node. The network contribution is determined by the number of messages generated by the node itself and the number of messages that it receives from other nodes and relays to the next hop. Similarly, the contribution for a specific node is determined by the number of messages that the node sends to the specific node and the number of messages that the specific node sends to it.
The contribution index can be calculated from the transmission statistics in the network. As shown in Table 3, each node maintains a table recording the data transmissions and updates the table whenever a data transmission occurs.
Let $M_r$ be the number of messages received from a specific node and $M_s$ be the number of messages sent to the specific node. $N_r$ and $N_g$ are the number of messages received from other nodes and the number of messages generated by the node itself, respectively. Define the balance of contribution $B$ for two specific nodes as in (3). From (3), if $B$ is 0, the two specific nodes relay equal numbers of messages for each other. If $B$ equals 1, one node relays all messages from the other node, which only sends messages and does not relay any. For selfish routing, the balance of contribution of the two nodes should be as small as possible.
The contribution index is then computed from the network contribution and the contribution for the specific node, where $\alpha$ is a weight indicating the importance of the network contribution to CI; $N$ is a predefined constant such that the network contribution is computed only when the number of messages the node relays exceeds $N$; and $\beta$ is a threshold that satisfies $0 < \beta < 1$. Similarly, the contribution for a specific node is computed only when the balance of contribution of the two specific nodes is greater than $\beta$. Clearly, $\beta$ should not be set to a large number, because a node is not willing to relay messages from a node that contributes little to its data transmission. A selfish node should receive and relay messages from nodes with an approximately equal contribution index.
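The two quantities discussed above can be sketched in Python as follows. Both formulas are assumptions chosen to match the stated properties and are not necessarily the exact definitions: the balance of contribution is taken as the absolute difference of the two message counts divided by their sum, which yields 0 for equal exchange and 1 for a fully one-sided exchange, and the contribution index is taken as a convex combination of the two component contributions weighted by the weight factor. The function names are hypothetical, and the component values are passed in as inputs because their exact form is not fixed here.

```python
def balance_of_contribution(received: int, sent: int) -> float:
    """Assumed balance between two nodes: 0 means equal exchange, 1 one-sided."""
    total = received + sent
    if total == 0:
        return 0.0  # no exchange yet, treated as balanced
    return abs(received - sent) / total


def contribution_index(network_contribution: float,
                       specific_contribution: float,
                       alpha: float) -> float:
    """Assumed combination: alpha weights the network contribution."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * network_contribution + (1.0 - alpha) * specific_contribution
```

Under this assumed form, a small weight makes the contribution for the specific node dominate, while a large weight makes the network contribution dominate.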

Data Transmission.
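Data transmission in HASR is based on the active degree of the node when its energy is sufficient and on the contribution index when it routes selfishly. The pseudocode of HASR is shown in Pseudocode 1. The node first calculates its active degree according to its initial agenda schedule. When it meets other nodes, they exchange their data transmission statistics. If the energy of the node is greater than the threshold, the node behaves unselfishly and relays all messages without considering the contribution index. The node then sends its messages to the neighbor with the maximal active degree. The active degree of a node differs over time because some items in the schedule table expire. If the active degrees of the two nodes are equal, another metric is needed to make the routing decision. Let $\mathrm{SET}_i$ and $\mathrm{SET}_j$ be the sets of areas in the valid items of node $i$ and node $j$, and let $c_i$ and $c_j$ be the numbers of elements of the two sets, respectively. The number of common elements of the two sets is $c_{ij} = |\mathrm{SET}_i \cap \mathrm{SET}_j|$. Let $\mathrm{SET}'_i$ and $\mathrm{SET}'_j$ be the sets of areas that only one of the two nodes will visit, that is, $\mathrm{SET}'_i = \mathrm{SET}_i - \mathrm{SET}_i \cap \mathrm{SET}_j$ and $\mathrm{SET}'_j = \mathrm{SET}_j - \mathrm{SET}_i \cap \mathrm{SET}_j$. Define $S_i$ and $S_j$ as the sums of the hot degrees in $\mathrm{SET}'_i$ and $\mathrm{SET}'_j$. When node $i$ contacts node $j$, if $AD_i$ equals $AD_j$ and $S_j$ is greater than $S_i$, messages are delivered to node $j$ because node $j$ is more active than node $i$. For example, suppose that $\mathrm{SET}_i$ contains three areas and $\mathrm{SET}_j$ contains four areas, where the hot degree of each area is attached to it, and that the two sets differ in three areas: one visited only by node $i$ and two visited only by node $j$. Then $S_i$ is the hot degree of the single area unique to node $i$, and $S_j$ is the sum of the hot degrees of the two areas unique to node $j$. In this example, messages are transmitted to node $j$ because $S_j$ is larger than $S_i$.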
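The active degree and the tie-breaking metric based on the areas that only one node will visit can be sketched in Python as follows. The schedule layout (area, arrival minute, residence minutes), the function names, and the sample hot degrees are hypothetical; the validity test follows the condition that an item is valid while the current time is earlier than the arrival time plus the residence time.

```python
import math
from typing import Dict, List, Tuple

Item = Tuple[str, int, int]  # (area ID, arrival minute, residence minutes)


def valid_items(schedule: List[Item], now: int) -> List[Item]:
    """Keep items whose stay has not ended yet, i.e. now < arrival + residence."""
    return [item for item in schedule if now < item[1] + item[2]]


def active_degree(schedule: List[Item], hot: Dict[str, float], now: int) -> float:
    """Sum of the hot degrees of the areas in the valid schedule items."""
    return sum(hot[area] for area, _, _ in valid_items(schedule, now))


def unique_score(own: List[Item], other: List[Item],
                 hot: Dict[str, float], now: int) -> float:
    """Sum of hot degrees of valid areas that only `own` will visit."""
    own_set = {area for area, _, _ in valid_items(own, now)}
    other_set = {area for area, _, _ in valid_items(other, now)}
    return sum(hot[area] for area in own_set - other_set)


def prefers_j(sched_i: List[Item], sched_j: List[Item],
              hot: Dict[str, float], now: int) -> bool:
    """True if node j is more active than node i at time `now`."""
    ad_i = active_degree(sched_i, hot, now)
    ad_j = active_degree(sched_j, hot, now)
    if not math.isclose(ad_i, ad_j):
        return ad_j > ad_i
    # Equal active degrees: compare the areas unique to each node.
    return unique_score(sched_j, sched_i, hot, now) > unique_score(sched_i, sched_j, hot, now)


hot = {"library": 0.2, "canteen": 0.1, "dormitory": 0.3}
sched_i = [("library", 480, 100), ("canteen", 600, 60), ("library", 720, 120)]
sched_j = [("library", 480, 120), ("dormitory", 660, 300)]
```

In this hypothetical data, both nodes have the same active degree at minute 500, and node j is preferred because the area that only it will visit has the larger hot degree.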
When too much of the node's energy has been consumed, the node behaves selfishly and the routing decision is made by the contribution index. As in Pseudocode 1, node $i$ calculates its $AD_i$ according to its agenda, which is set by the person who holds it. When node $i$ meets node $j$, if the energy of node $i$ is sufficient, messages are delivered to node $j$ if it has the higher $AD$. If $AD_i$ equals $AD_j$, messages are delivered to node $j$ if it has the higher $S$. Once the energy of node $i$ is lower than a preset threshold, node $i$ sends messages to node $j$ based on the contribution index. If $AD_j$ is greater than $AD_i$ and the two nodes have almost the same contribution index, messages are transferred to node $j$. Here $\delta$ denotes the difference between the contribution indexes of the two nodes.
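The forwarding rule described above can be summarized by the following Python sketch. The function takes the already computed metrics of the two nodes as inputs. The parameter `delta_max`, which bounds how far apart two contribution indexes may be while still counting as almost the same, is a hypothetical parameter introduced for illustration, and the default energy threshold of 5 J is taken from the simulation setting.

```python
def should_send(energy_i: float,
                ad_i: float, ad_j: float,
                s_i: float, s_j: float,
                ci_i: float, ci_j: float,
                energy_threshold: float = 5.0,
                delta_max: float = 0.1) -> bool:
    """Decide whether node i hands its messages to the encountered node j."""
    if energy_i >= energy_threshold:
        # Unselfish mode: compare active degrees and break ties with S.
        if ad_j != ad_i:
            return ad_j > ad_i
        return s_j > s_i
    # Selfish mode: node j must be more active and have a similar CI.
    delta = abs(ci_i - ci_j)
    return ad_j > ad_i and delta <= delta_max
```

In the selfish mode, a more active neighbor is still rejected when its contribution index differs too much from that of the sending node.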

Simulation
5.1. Simulation Setup. We simulate three protocols, the proposed HASR, HARP, and RSD [18], and evaluate their performance in terms of data delivery ratio. We assume that the message generation of each node follows a Poisson process and that the destination node is randomly selected. To calculate the energy consumption, we use the same radio energy dissipation model as in [19]. The initial energy of each node is 10 J, and a node behaves selfishly once half of its energy has been consumed. Other simulation parameters and their default values are summarized in Table 4.
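The traffic and selfishness assumptions of this setup can be sketched in Python as follows. The generation rate, the duration, and the number of nodes used in the example are hypothetical values, since the actual parameters are those of Table 4. Only the 10 J initial energy and the half-energy selfishness rule are taken from the text.

```python
import random
from typing import List

INITIAL_ENERGY_J = 10.0  # initial energy of each node, from the setup above


def poisson_arrival_times(rate_per_s: float, duration_s: float,
                          rng: random.Random) -> List[float]:
    """Message generation times of a Poisson process with the given rate."""
    times: List[float] = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_per_s)  # exponential inter-arrival time
        if t >= duration_s:
            return times
        times.append(t)


def pick_destination(source: int, num_nodes: int, rng: random.Random) -> int:
    """Uniformly random destination that differs from the source."""
    dest = rng.randrange(num_nodes - 1)
    return dest if dest < source else dest + 1


def is_selfish(remaining_energy_j: float) -> bool:
    """A node turns selfish once more than half of its energy is consumed."""
    return remaining_energy_j < INITIAL_ENERGY_J / 2


rng = random.Random(1)  # fixed seed so that a run is repeatable
times = poisson_arrival_times(0.1, 1000.0, rng)
```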

Simulation Results.
Figure 2 shows the impact of the weight factor $\alpha$ on the data delivery ratio of HASR. $\alpha$ determines whether or not the contribution index is calculated mainly from the network contribution of the node. As shown in Figure 2, the data delivery ratio increases as $\alpha$ rises. When $\alpha$ is small, the contribution index is mainly determined by the contribution for the specific node. Even if a node contributes a lot to data transmission for the network, its messages might not be relayed by other nodes. For example, suppose that node $i$ encounters node $j$. Node $i$ has not relayed any message from node $j$ before, which means that it has made little contribution for node $j$, but it has contributed a lot to the whole network. Nevertheless, node $j$ will not relay the messages from node $i$ because node $i$ did not serve it before. When $\alpha$ increases, the network contribution becomes more important than the contribution for the specific node, and a node that contributes more to data forwarding in the network has more chance to deliver its messages. When $\alpha$ is greater than 0.6, there is little increase in the data delivery ratio. In the simulation, $\alpha$ is set to 0.6.
The influence of the proportion of selfish nodes on the data delivery ratio is shown in Figure 3. Obviously, the more selfish nodes there are, the lower the data delivery ratios of HASR and HARP are. When the proportion is greater than 0.8, the data delivery ratio is only about 10%. As seen from Figure 3, the data delivery ratio of HARP decreases more drastically than that of HASR as the ratio of selfish nodes increases. The reason is that when a portion of the nodes in HASR behave selfishly, they still deliver messages according to the contribution index.
The following simulations measure the data delivery ratio of HASR, HARP, and RSD. As shown in Figure 4, the data delivery ratio of the three schemes increases after the simulation begins but starts to decrease at a specific time. This is because the energy of the nodes is sufficient at the beginning of the simulation, so the nodes transmit messages unselfishly. When the energy of a node falls below the threshold (5 J), the node in HARP does not forward any messages in order to save its energy, so the data delivery ratio decreases quickly. In HASR and RSD, the node delivers messages selfishly: HASR forwards messages based on the contribution index, and the routing decision of RSD is made by a reputation-based scheme. That is to say, the selfish nodes still relay messages for some other nodes. Therefore, the data delivery ratios of the two schemes decrease more slowly than that of HARP.

Conclusions and Future Works
In this paper, we propose a hot-area-based selfish routing (HASR) protocol for mobile social networks. In HASR, the routing decision is made by the active degree and the contribution index of the node. When the node has enough energy, data forwarding is determined only by the active degree of the node. When too much of the node's energy has been consumed, the node behaves selfishly and the routing decision is based on the contribution index. Simulation results show that HASR performs better than the other schemes when nodes behave selfishly.
In future work, we will study the selfishness of nodes based on the social relationships between nodes, where a node only relays messages from nodes with which it has strong relationships and is not willing to transfer messages for strangers.

F 2 :
Impact of weight factor.

F 3 :
Impact of the proportion of sel�sh nodes.

F 4 :
Performance of the three schemes.
T 2: e hot degree of different locations.

Pseudocode 1: HASR.
Node i initializes its agenda
Calculates AD_i
Get the set of its neighbors, SET_n
Node j = the neighbor with the maximal AD_j
if (the energy of node i is enough)
    if (AD_j > AD_i) or (AD_j = AD_i and S_j > S_i)
        send messages to node j
else
    if (AD_j > AD_i) and (the difference δ between CI_i and CI_j is small)
        send messages to node j