An Evolutionary Game-Based Trust Cooperative Stimulation Model for Large Scale MANETs

To realize methodical and effective cooperative stimulation for MANETs, and to find a dynamic trust cooperative stimulation scheme that works in environments with a high malicious ratio, we propose an evolutionary game-based trust cooperative stimulation model for large scale MANETs. First, the diverse behavior of system members is covered by constructing a complete multi-risk-level strategy space. Then a trust-preferential strategy is built through a numerical trust value mapping technique, so that malicious actions are effectively constrained because members seek to avoid a low trust level. Furthermore, mobility probability parameters and an information propagation error matrix are introduced into the game model, and the convergence condition between the optimum strategy, which represents the payoff maximization principle, and the trust-preferential strategy is deduced through evolutionary analysis. Both theoretical analysis and simulation experiments demonstrate that our model effectively stimulates cooperation among members and remains robust in harsh environments with a high initial malicious ratio in large scale MANETs.


Introduction
With the development of perception theory, ubiquitous computing, and multihop radio technology, the basic services of large scale mobile ad hoc networks (MANETs) can be autonomously deployed via local backbone nodes (BN) and accomplished by access network nodes (AN). The managers of each AN then cooperatively upload essential information back to the BN, which enables network management for large scale MANETs. Hence, cooperation among members is the premise for MANETs to provide network services.
However, realizing a methodical and effective cooperation scheme faces tough challenges in MANETs. First, current networks are threatened by a wide range of attacks, such as flooding [1], spoofing [2], wormhole, and Sybil attacks [3], as well as other external attacks [4]. These threats seriously damage cooperation in MANETs. Furthermore, even when current popular security mechanisms [5,6] are adopted to resist these attacks, the inherent nature of MANETs, including limited available resources, complex deployment environments, an exposed communication medium, and intermittent end-to-end links, means that some normal members may be unwilling to cooperate with others in order to save resources and prolong their own network lifetime. Such tolerable selfish behavior inevitably interrupts member cooperation in MANETs. For that reason, research on cooperative stimulation schemes for MANETs has been carried out all over the world. Among current achievements, cooperative stimulation based on game theory is the most representative. Combined with trust management, distributed systems, and key protocols for MANETs, game theory has effectively stimulated members' cooperation in small-range wireless sensor networks and distributed networks. The expectation for game theory as an analytical tool for MANETs is clear: by modeling independent strategic decision makers, users can control the whole network scene as a decentralized control entity and abstract the necessary hypotheses to address important problems, as with other mathematical models [7]. For large scale MANETs, we have found that current game theoretic cooperative stimulation schemes leave several problems to be solved. First, the strategy and payoff space is incomplete. Current schemes usually define a member's strategy space simply as "cooperative forwarding, packet dropping" and assign each action a payoff value.
However, in a large scale MANET with infrastructure mobility, the network actions chosen by members are diverse and complicated. Noncooperative actions come from malicious attacks as well as nonmalicious selfish behavior, so describing them merely as packet dropping is incomplete. Furthermore, cooperative action is not limited to forwarding; it takes various forms depending on the network service. Thus a complete strategy space reflecting realistic large scale MANETs, together with a rational payoff frame, must be modeled. Second, current schemes lack a standard action selection guideline for members. In current game models, network members are modeled as rational and thus naturally selfish individuals who make every effort to maximize their payoffs. It is therefore reasonable to assign a higher payoff to cooperative actions in order to stimulate members to cooperate. In a realistic network, however, malicious members are more likely to launch attacks to obtain a higher illegal payoff from network collapse. Since large scale MANETs are usually applied to harsh environment monitoring or military detection, a high malicious ratio is a prominent feature. Therefore, besides a payoff frame that reflects the realistic network, an action selection guideline is also needed as a scoring system to assist game theoretic cooperative stimulation. Third, current schemes lack an evolutionary analysis of the strategy space. The strategy space adopted by members is not invariant as the game runs; it may evolve according to long-term expected benefit or suffer intrusion by unstable strategies. Thus it is necessary to evaluate the evolutionary and convergence performance of each strategy space when using game theory to stimulate members' cooperative actions.
Last but not least, given the inherent nature of large scale MANETs, the scheme has to adapt to the network's dynamic properties as well as to propagation error when updating the strategy and payoff sets.
To address the above issues, in this paper we model the transmission process as an evolutionary game and propose a trust cooperative stimulation scheme based on it. Our main contributions are summarized as follows.
(1) We formulate a transmission evolutionary game that defines an abstract concept of level classification in the strategy space based on network risk analysis. It covers the member's possible network actions under complicated compound attacks in large scale MANETs and thereby enhances the completeness of the strategy and payoff space.
(2) We construct a trust-preferential expected action space as a strategy selection guideline for members by mapping trust management to game theoretic cooperative stimulation, which effectively constrains malicious members from obtaining illegal payoffs.
(3) We quantitatively analyze the stability and convergence properties between our trust-preferential expected action space and the payoff-maximization-preferential optimum action space, and then provide the necessary and sufficient numerical conditions that incentivize members to cooperate with each other.
(4) We introduce the mobility probability of members and information propagation error into the formulation of our scheme, bringing it closer to realistic large scale MANETs.

Related Works
In the literature, many papers propose methods for stimulating members' cooperation in self-organized networks, which can be broadly classified into two kinds: (1) price-based schemes and (2) trust-based schemes. Price-based schemes use tamper-proof hardware or central billing services to encourage cooperation by rewarding cooperative nodes with price credits. For example, the cooperation stimulation scheme proposed in [8] employed a virtual currency named Nuglets as payment for cooperative transmission; it was later improved in [9] by using price counters. Although price-based schemes can effectively stimulate cooperation among selfish members, the requirement of tamper-proof hardware or a central billing service inevitably limits their application. What is more, the existing works only fit traditional multihop networks. Price-based schemes depend on end-to-end connections to determine how much credit each member should receive. In MANETs, since end-to-end paths are not guaranteed at all, the existing price-based schemes cannot be used. The second approach to stimulating cooperation is to adopt trust-based schemes with the necessary monitoring, such as CORE [10], CONFIDANT [11], and ARCS [12]. They usually rely on observing the actions of neighbor members and then use mathematical methods such as Dempster-Shafer belief theory to incorporate secondhand information (reports by other nodes) and create a reputation score for each member. The trust/reputation score is used to stimulate cooperation because detected noncooperative members are assigned a low score as a penalty and are forced out of the network. However, in the realistic dynamic environment of MANETs with a distributed trust form, the deviating actions of a noncooperative member are more difficult for other members to monitor and detect, since connections with the same members are only occasional.
Game theory has recently been widely applied to design and analyze stimulation schemes for wireless networks. For example, in [13], a Worst Behavior Tit-for-Tat (WBTFT) incentive strategy is proposed to stimulate cooperation at the desired cooperation state, and the conditions under which the proposed strategy is subgame perfect are analyzed under perfect monitoring. In [14], a cooperation stimulation scheme is proposed based on an indirect reciprocity game for the scenario where the number of interactions between any pair of players is finite. For large scale wireless networks, Xiao et al. [15] proposed a security system that applies the indirect reciprocity principle to combat attacks in wireless networks using the evolutionarily stable strategy concept of game theory. In [16], the authors investigated whether cooperation among members can improve energy efficiency in ad hoc wireless networks using a behavior-tracking algorithm from game theory, and proved that cooperation can reduce power wastage while maximizing the delivery rate.
International Journal of Distributed Sensor Networks
In addition, further research has mathematically analyzed cooperative stimulation for self-organized wireless networks (e.g., MANETs, wireless sensor networks, and delay tolerant networks) using game theory [17-23]. Zhao et al. [17] proposed a wage-based incentive mechanism to encourage rational individuals to provide truthful feedback. The feedback reporting process in a reputation system was modeled as a reporting game. They also proposed a set of incentive compatibility constraint rules, including participation constraints and self-selection constraints. Ze and Haiying [18] analyzed the underlying cooperation of reputation systems, price-based systems, and a defenseless system through game theory. Based on the results, they proposed an integrated system with higher performance in terms of cooperation effectiveness and selfish node detection. Li et al. [19] showed how game theory can be used to analyze the behavior of every player in a role-based trust framework. Considering two types of users, cooperative and malicious, they analyzed the strategy sets and payoffs of trust domains and each type of user. Charles et al. [20] investigated when it is cost-effective for each node to freely participate in the security mechanism or to protect its privacy according to its own belief in others. A game theoretic framework was used to model trust, and evolutionary game theory was used to capture the dynamic evolution of trust behavior in the network. Studies of cooperative stimulation conditions under the correlated equilibrium of coalitional game theoretic approaches in ad hoc networks have also been reported in [21-23].
In most existing studies, the modeled game theoretic cooperative stimulation shows a promising incentive effect in networks with a small-range, static topology. The purpose of this paper is to design a game theoretic scheme for large scale MANETs. The major differences between this paper and current studies are as follows: (1) we formulate a transmission evolutionary trust game that constructs a complete strategy and payoff space to cover members' possible network actions under complicated compound attacks in large scale MANETs; (2) we separate the trust-preferential strategy from the payoff-maximization frame, so that it can be used as a strategy selection guideline by means of a numerical mapping technique, which effectively prevents malicious members from obtaining illegal payoffs by attacking the network; (3) we quantitatively analyze the stability and convergence properties of the proposed game model in detail and provide the necessary and sufficient numerical conditions that incentivize members to cooperate with each other; (4) we introduce mobility probability parameters and information propagation error into the formulation of our scheme, bringing it closer to realistic large scale MANETs.

Information Transmission Scenario.
We design the scheme for homogeneous mobile ad hoc networks consisting of homogeneous, randomly moving nodes. For convenience of discussion, we make the following assumptions: (1) the underlying channel adopts the disk model in order to abstract asymmetrical information away from the complicated properties of RF; (2) for an information transmitter, the probability that any other node moves out of its communication range, or that a new node enters its communication range, is .
In our model a typical information transmission scenario is composed of one transmitter, one intended receiver, and several information relay nodes. The transmitter generates the information and sends it to the intended receiver with the help of relay nodes. A node within the communication range of the transmitter can be selected as a relay node if it has the optimal link state, described by the medium congestion level, robustness of the routing protocol, mobile state prediction, and its own health degree. In the same way, each relay node selects the next relay node and the link route according to the same principle until the generated information successfully reaches the intended receiver. For simplicity of mathematical expression, each node in our model becomes a relay node with the probability . In this paper, we use the symbol Φ to indicate whether a node is selected as a relay node. More specifically, Φ = 1 indicates that the node becomes a relay node on the information transmission path, while Φ = 0 indicates that the node is only in charge of monitoring the behavior of other nodes and computing their trust values. These trust values are then exchanged among neighbor nodes via the cryptographic secure channel.

Trust Management.
In MANETs trust management can effectively resist various internal attacks conducted by compromised internal members. In this paper, to stimulate cooperative behavior among nodes and thereby enhance information transmission throughput, we design a game model that needs a scoring system to evaluate such behavior. Hence we adopt trust management to design the scoring system.
More specifically, each node monitors and records various communication factors (e.g., transmission rate and forwarding rate). Then, by means of a robust mathematical calculation method (e.g., Bayesian inference, DS evidence theory, or fuzzy logic classification), the quantifiable trust value of each supervised member can be obtained by the trust manager via the cryptographic secure channel; this value can be regarded as the member's degree of credibility.

Game Model.
In this paper, we model the aforementioned information transmission process as a dynamic Bayesian game among all nodes in MANETs. During this game all participating players strive to maximize their own payoffs. That is to say, all nodes in our game model are deemed rational players in the game theoretic sense.
More specifically, in MANETs, three kinds of players, amounting to + 2 members, are engaged in our game: a transmitter, an intended receiver, and participants within the transmitter's communication range. At time , participant selects one action from our designed complete strategy space according to the rational principle, denoted as [ ]. According to the analysis in [24], malicious nodes in large scale MANETs have gradually shifted from conventional pure attack modes to purposive strategic attack modes, such as selective forwarding, frame flooding, spoofing, selfish packet dropping, black hole, and Sybil attacks. Unlike existing works, which simply build a behavior space composed of cooperative and uncooperative actions, we consider a comprehensive set of attacks in MANETs. To stimulate cooperation among nodes that are at risk of these purposive strategic attacks, we classify and abstract current network attacks into multiple levels and put them in the behavior space of the game model. The strategy space of our game model is shown in Table 1, where { 1 , 2 , 3 , . . . , } denotes the attack set classified and abstracted by risk level. Note that in large scale MANETs the specific attack form corresponding to a certain risk level changes with the network's operational goal. For example, in an information monitoring network, improving energy utilization to prolong the network lifetime is the most important consideration.
Thus frame flooding attacks, or combinations of attacks that degrade energy performance, should be identified as high-level risk attacks. In a network that emphasizes data transmission rate and throughput, such as a media ad hoc network, black hole and Sybil attacks, or combinations of attacks that degrade QoS performance, should be identified as high-level risk attacks. Besides, the elements of the behavior set { , } denote the actions taken by a participant who violates or complies with the cooperation rule, respectively. More specifically, for a relay node, { , } denotes {selfish, forward}; for a monitoring node, { , } denotes {forward, monitor}.
The payoff frame is an important factor for modeling and analyzing the behavior of players. In our game model, after taking a certain strategy from the behavior space to participate in the game, each participant obtains a real-time payoff. In particular, for the information transmitter, at time every other player that takes one action [ ] produces one instant payoff to it, denoted as [ ] [ ]. We use [ ] to stand for the payoff that belongs to the transmitter. Generally speaking, the payoff is composed of two parts: the gain from the action and the cost of taking it. The payoff of an action is the value obtained by subtracting the cost from the gain. In this paper, a positive payoff means that the participant earns profit from the action, while a negative payoff means that the participant loses some resources such as energy or throughput. More specifically, in our game model, for a participant who takes the cooperation behavior , both information forwarding (Φ = 1) and channel monitoring (Φ = 0) inevitably consume its own resources. On the other hand, the information transmitter earns a profit when action is taken; thus [ ] > 0. Next, participants who take actions that attack the network or violate the cooperation rule (which together form the malicious behavior set) can earn profits from these actions, so [Φ] ≥ 0. In this situation the information transmitter's benefit is threatened, which leads to a negative instant payoff, [ ] ≤ 0. In addition, according to the wide range of attacks in MANETs classified by multiple risk levels in our model, the instant payoffs of both the transmitter and the game members satisfy the conditions given in (1). Note that in our modeled transmission game for MANETs, the strict transmission constraint condition is used.
Table 1: Strategy space of the game model. Surviving row labels: Level-attack (the highest risk level attack); Violation of cooperation rule; Cooperation.
More specifically, the information successfully reaches the intended receiver only if all the participants on the transmission path comply with the cooperation rule. Based on this condition, at time the total instant payoff of the transmitter is the minimum of all the instant payoffs obtained from other nodes. The total instant payoff of game participant is defined similarly. For convenience of understanding the game model, Table 2 lists the symbols used in this paper as well as their meanings.
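The strict transmission constraint can be illustrated with a minimal Python sketch. It assumes only what the text states, namely that the transmitter's total instant payoff is the minimum of the instant payoffs produced by the other players; the function name and the numeric payoffs are hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def transmitter_total_payoff(instant_payoffs: Sequence[float]) -> float:
    """Total instant payoff of the transmitter under the strict constraint.

    The value is the minimum of the instant payoffs that the other
    players produce for the transmitter at one time moment.
    """
    if not instant_payoffs:
        raise ValueError("at least one participant payoff is required")
    return min(instant_payoffs)


# Hypothetical payoffs: positive when a participant cooperates,
# negative when it attacks or violates the cooperation rule.
all_cooperate = [2.0, 2.0, 2.0]
one_attacker = [2.0, -3.0, 2.0]

print(transmitter_total_payoff(all_cooperate))
print(transmitter_total_payoff(one_attacker))
```

In this sketch a single noncooperative participant on the path is enough to pull the transmitter's total payoff down to the attacker's level, which is the intent of the strict constraint.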

Trust Cooperative Stimulation Scheme
In this paper we design a cooperative stimulation scheme for large scale MANETs that combines game theory and trust management. On one hand, the equilibrium and stability conditions of the aforementioned game model are deduced in order to grasp and predict the outcome by figuring out each node's optimal strategy. What is more, based on the solution of the game model, the mathematical relationship between payoff and statistical parameters can be used to guide (stimulate) members to cooperate with each other in order to resist selfish behavior or even high-level risk attacks. On the other hand, by means of the trust management mechanism, a uniform frame covering trust distribution, trust update, and behavior selection is constructed by all members of the network.
Table 2: Symbols and their meanings. Surviving entry: the anticipated payoff of participant taking action , .
In this section, we mainly introduce the behavior selection frame based on trust management (i.e., each member in MANETs takes action according to its trust value) and propose a trust cooperative stimulation scheme. The trust value records the member's quantified credibility (the higher the member's trust value is, the more reliable the member is), and its calculation, distribution, and updating must be accomplished via the cryptographic secure channel. Without permission, members cannot clear or tamper with the trust value at will. These properties indicate that the trust scheme is fit for large scale MANETs, since cooperative interaction between two nodes relates only to their recorded current trust values. For example, without prior knowledge about whether you cooperated with me before, I can decide to take cooperative action with you only if you have an acceptable trust value for me. This differs from current research on cooperative stimulation based on first-hand information in MANETs. The trust cooperative stimulation scheme contains trust evaluation, which is based on our previous work [5], the action-trust-based mapping method, the game action decision principle, and the trust updating frame. In this frame, the higher the member's trust value is, the more likely this member is stimulated to take cooperative action (i.e., ), which spreads cooperative behavior to the whole network. On the contrary, if the member launches a high-level risk attack to obtain a short-term positive payoff, its trust value declines rapidly, which causes cooperative service to be rejected under the scheme rule.

Table 3: Action-trust-based mapping rule. Columns: Trust value; Action space; Numeric indicator of trust level.
Note: The symbol in the action space (i.e., 3 , , etc.) corresponding to the numerical indicator in the third column can be used to indicate the trust level of the member. More specifically, the greater the member's numerical indicator, the higher its trust value and the more reliable it is; hence higher-level cooperative action can be provided to it by other members in MANETs.
Recall that the behavior space for each node participating in the transmission game is { 1 , 2 , . . . , , , }, amounting to + 2 elements. To combine trust level with game action, we classify the trust value (usually ranging from 0 to 1) into + 2 trust levels and map each level linearly onto the behavior space { 1 , 2 , . . . , , , } as in Table 3; we call this the action-trust-based mapping method. Consequently, the higher the member's trust level is, the more likely it is to receive higher-level cooperative action from other members. Note that by means of this mapping method, we can use an element of the action space to indicate a node's trust value (and trust level).
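As a hedged illustration of the action-trust-based mapping method, the sketch below splits the trust range [0, 1] into equal-width bins, one per element of the behavior space. The equal-width binning, the function names, and the labels `attack_1`, `violate`, and `cooperate` are assumptions made for illustration; the actual entries of Table 3 may differ.

```python
from typing import List


def behavior_space(n_attack_levels: int) -> List[str]:
    """Behavior space in the order listed in the text:
    the attack levels first, then rule violation, then cooperation."""
    attacks = [f"attack_{k}" for k in range(1, n_attack_levels + 1)]
    return attacks + ["violate", "cooperate"]


def trust_level(trust_value: float, n_attack_levels: int) -> int:
    """Map a trust value in [0, 1] to a level index in 1..(n_attack_levels + 2).

    Assumption: the linear mapping is realized as equal-width bins.
    """
    if not 0.0 <= trust_value <= 1.0:
        raise ValueError("trust value must lie in [0, 1]")
    n_levels = n_attack_levels + 2
    # A trust value of exactly 1.0 belongs to the top bin.
    index = min(int(trust_value * n_levels), n_levels - 1)
    return index + 1


def mapped_action(trust_value: float, n_attack_levels: int) -> str:
    """Element of the behavior space that indicates the given trust value."""
    level = trust_level(trust_value, n_attack_levels)
    return behavior_space(n_attack_levels)[level - 1]


# Hypothetical network with three attack risk levels (five trust levels).
for value in (0.1, 0.5, 0.95):
    print(value, trust_level(value, 3), mapped_action(value, 3))
```

With this convention the top bin corresponds to the cooperation element, matching the statement that a higher trust level indicates a more reliable member.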
According to the original mapping, each node is assigned an original trust level ∈ { 1 , 2 , . . . , , , } and a trust vector. After trust mapping, the game action decision principle and the trust updating frame are the two important parts affecting the performance of trust cooperative stimulation. More specifically, at a single time moment the game action decision principle is designed according to the interaction between a transmitter with trust level and a participant with trust level , shown as the matrix [ , ] ( +2)×( +2) in (2), where element , denotes the trust level assigned to the participant who takes action ( ∈ { 1 , 2 , . . . , , , }) towards the transmitter with trust level , and Φ (Φ = 0, 1) is the relay indicator of the participant. From this matrix, the game action decision principle can be explained as follows: a node can take cooperative actions with its neighbors to obtain a higher trust level, which restrains attack actions. Generally speaking, at one time moment, taking action obtains the highest instant trust level (i.e., ), while taking action inevitably obtains a lower instant trust level (i.e., type ).
If the node takes actions to maintain its own high trust level in the game, we can intuitively get the expected action, denoted as matrix * , where * , denotes the action ( ∈ { 1 , 2 , . . . , , , }) taken by the participant towards the transmitter with trust level , and Φ (Φ = 0, 1) is the relay indicator of the participant.
From the viewpoint of maintaining a higher trust level in this game, the expected strategy space given by matrix * ( +2)×( +2) can effectively encourage participants to take cooperative actions.
Recall that each new node participating in the game is assigned a trust vector. Suppose that a new node has a good intention to cooperate with others; it can be assigned the trust vector vector = (0, 0, . . . , 0, 1). At time the trust updating process is triggered as shown in Figure 1. From Figure 1, the participant's trust vector at time + 1 is composed of three parts. The first part is the instant trust vector , vector = (0, . . . , 1, . . . , 0), in which the entry 1 occupies the vector position corresponding to ( +1); that is, it is extended from the instant trust level , at time + 1 by filling the position corresponding to , = , (Φ) with the value 1 and all other positions with the value 0. The second part is the time factor of the action taken at time + 1. Considering the coupling degree between the instant and the accumulated trust vectors, we define the time factor vector Γ = (Γ 1 , Γ 2 , . . . , Γ , Γ ) to depict this coupling degree; thus element Γ , (Φ) is the active factor contributing to the updating process. Note that the greater Γ is, the more the trust vector depends on its previous value and the less it is affected by the instant value. The third part is the trust propagation matrix Θ, whose role is to overcome behavior monitoring errors and trust vector errors caused by channel errors during trust propagation. In Θ, denotes the systematic probability of correctly recognizing trust level .
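The three-part trust update can be sketched in Python under loudly stated assumptions, since the exact update formula is not reproduced here. We assume (a) a single scalar time factor Γ that blends the previous trust vector with the one-hot instant trust vector as a convex combination, and (b) a propagation matrix Θ with the recognition probability on the diagonal and the remaining probability spread uniformly over the other levels. Both choices, and all function names, are hypothetical.

```python
from typing import List


def one_hot(level: int, n_levels: int) -> List[float]:
    """Instant trust vector: 1 at the position of the instant trust level."""
    return [1.0 if i == level - 1 else 0.0 for i in range(n_levels)]


def propagation_matrix(n_levels: int, p_correct: float) -> List[List[float]]:
    """Assumed form of Theta: p_correct on the diagonal, uniform error elsewhere."""
    if n_levels < 2:
        raise ValueError("at least two trust levels are required")
    off = (1.0 - p_correct) / (n_levels - 1)
    return [[p_correct if i == j else off for j in range(n_levels)]
            for i in range(n_levels)]


def update_trust(prev: List[float], instant_level: int, gamma: float,
                 theta: List[List[float]]) -> List[float]:
    """Assumed update: blend previous and instant vectors, then apply Theta.

    A larger gamma keeps more of the previous trust vector, as in the text.
    """
    n = len(prev)
    instant = one_hot(instant_level, n)
    blended = [gamma * p + (1.0 - gamma) * q for p, q in zip(prev, instant)]
    # Row vector times row-stochastic matrix keeps the total probability at 1.
    return [sum(blended[i] * theta[i][j] for i in range(n)) for j in range(n)]


# A new, well-intentioned node starts at the highest of four trust levels.
trust = [0.0, 0.0, 0.0, 1.0]
theta = propagation_matrix(4, p_correct=0.9)
trust = update_trust(trust, instant_level=1, gamma=0.5, theta=theta)
print(trust)
```

The design keeps the trust vector a probability distribution over trust levels at every step, which is what allows it to be analyzed later as a Markov process.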
In this section, we have proposed a trust cooperative stimulation scheme based on the trust-game mapping idea. An expected action matrix * ( +2)×( +2) (trust-preferential strategy) and the trust updating frame are deduced to guide members towards cooperation. From the viewpoint of maintaining a higher trust level in this game, this scheme can effectively encourage participants to take cooperative actions.

Game Theoretic Analysis
In large scale MANETs, members who take the expected action * ( +2)×( +2) and select cooperative behavior can maintain a higher trust level (trust-preferential strategy). Hence, even if the network topology changes dramatically, nodes can continue to obtain high-level cooperative network services in a new area by means of their high trust level. In game theory, however, the optimum actions are the strategy that yields the highest payoff for all players. In this section, we mainly study whether nodes that take the expected trust-preferential strategy also obtain a higher payoff after the game has run for a long time. In addition, we adopt the evolutionary game idea to analyze under which numerical condition the expected trust-preferential strategy evolves to the payoff-preferential strategy (optimum strategy), that is, the evolutionarily stable strategy (ESS).

Evolutionary Game Theory.
Evolutionary game theory provides a new perspective for researching network cooperative stimulation schemes. It overcomes the difficulties of the rationality hypothesis and multiple equilibria in classical game theory. What is more, using evolutionary game theory to study network security can yield more accurate results than traditional theory and can realistically analyze and explain cooperative motivation. To the best of our knowledge, introducing evolutionary game theory to study the mechanism of cooperative stimulation is a methodological innovation in MANETs.
In an evolutionary game model, if most of the members take the ESS, the alternative strategies of the remaining members cannot invade it. We first use the expected trust-preferential action as the original strategy in the game; the strategy then evolves according to the payoff maximization criterion, from which the optimum strategy of the game can be deduced. More specifically, at the original time moment 1, game participant takes an action according to the game action decision matrix [ , ] ( +2)×( +2) and the expected action matrix * ( +2)×( +2) , and the evolutionary process is triggered. At time + 1, the probability that node takes action [ + 1] = ∈ { 1 , 2 , . . . , , , } is given by (7), where , [ ] denotes the instant payoff obtained by participant who takes the action at time . From (7) we can solve the ESS (i.e., the optimum strategy space) as well as the corresponding stable trust vector vector of this evolutionary game when the expected trust-preferential strategy is taken as the original strategy in MANETs.
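To give a feel for how action probabilities can evolve under a payoff maximization criterion, the following Python sketch applies a generic discrete replicator update. This is a standard textbook dynamic swapped in for illustration, not a reconstruction of formula (7); the payoff shift that keeps fitness positive, the function name, and the payoff values are all assumptions.

```python
from typing import List


def replicator_step(probs: List[float], payoffs: List[float]) -> List[float]:
    """One step of a generic discrete replicator dynamic.

    Each action's probability grows in proportion to its fitness relative
    to the population average. Payoffs may be negative, so they are shifted
    so that the smallest fitness equals 1.0 (an illustrative choice).
    """
    if len(probs) != len(payoffs):
        raise ValueError("probs and payoffs must have the same length")
    shift = 1.0 - min(payoffs)
    fitness = [u + shift for u in payoffs]
    mean_fitness = sum(p * f for p, f in zip(probs, fitness))
    return [p * f / mean_fitness for p, f in zip(probs, fitness)]


# Hypothetical two-action game: index 0 is cooperation, index 1 is an attack.
probs = [0.5, 0.5]
payoffs = [1.0, -1.0]
for _ in range(20):
    probs = replicator_step(probs, payoffs)
print(probs)
```

When the long-run payoff of cooperation exceeds that of attacking, the probability mass in this sketch concentrates on cooperation, which mirrors the kind of convergence the evolutionary analysis is looking for.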

Optimum Action Space. We first define the optimum action matrix.
In our game model, considering the trust updating frame, the trust level of each participant may transfer at different time moments. Thus we must evaluate the evolutionary optimum action space under the influence of the trust updating process. Recall that in our game the probability that a participant is selected as an information relay node is . Suppose that each behavior in the set { 1 , 2 , . . . , , , } has the same time factor Γ. We define the trust transfer vector TT = , denoting the transfer probability vector after the participant with trust level takes action towards the participant with trust level . Based on the trust updating process shown in Figure 1, we can calculate TT as in (9), where [ ] denotes the probability that the trust level of the participant taking action towards the participant with trust level has transferred from to . Formula (9) takes the behavior time factor and the relay factor into account, which is embodied by the use of Γ and , respectively.
Second, we solve the game payoff , , obtained by participant with trust level towards the participant taking the optimum action , with trust level . If takes action , as a relay participant (Φ = 1), its instant payoff at time can be denoted as Φ=1 , . In addition, considering the dynamics of MANETs, suppose that the probability of a node staying in the local area or moving to a new area is and that the stable trust vector vector does not change over time; we can then calculate the instant payoff obtained by a nonrelay participant. On the other hand, if faces the information transmitter, its instant payoff can be calculated similarly. To sum up, let the probability that a node becomes the information transmitter in the game be ; the game payoff , , is then expressed in terms of these instant payoffs. Third, we give the expression of the stable trust vector vector when the game evolves to the ESS. According to the trust updating frame, the stable trust vector vector is obtained by taking the probability and the relay indicator Φ of the participant into account, where the double summation term denotes the probability that the trust level of participant (with relay indicator Φ) who takes the optimum action , transfers to level . Based on the previous analysis, the optimum action space has been modeled as a Markov decision process.
Combining (9), (12), and (13), the proposed optimum action space and its corresponding stable trust vector of the evolutionary game can be solved by an iterative numerical method.
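The iterative solution of a stable trust vector can be illustrated with a minimal sketch. It assumes that the action policy is already fixed, so that trust updating reduces to a Markov chain over trust levels; the function name `stable_trust_vector` and the 3-level transition matrix are hypothetical and are not taken from the paper. The full method in this section additionally re-solves the optimum action space at each step, which this sketch does not attempt.

```python
def stable_trust_vector(transition, tol=1e-10, max_iter=10_000):
    """Iterate v <- v P until v stops changing.

    transition[i][j] is the (assumed) probability that a participant
    at trust level i moves to trust level j in one update.
    """
    n = len(transition)
    v = [1.0 / n] * n  # start from a uniform trust distribution
    for _ in range(max_iter):
        nxt = [sum(v[i] * transition[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    raise RuntimeError("stable trust vector did not converge")


# Hypothetical transition matrix for 3 trust levels (each row sums to 1).
P = [
    [0.6, 0.4, 0.0],
    [0.2, 0.5, 0.3],
    [0.0, 0.3, 0.7],
]

v = stable_trust_vector(P)
print([round(x, 4) for x in v])
```

The returned vector is a fixed point of the update, that is, applying the transition matrix once more leaves it unchanged, which is the property the stable trust vector is required to have at the ESS.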

Relationship between Optimum and Expected Action and Its Convergence Condition.
So far, by means of evolutionary game theory, we have deduced the ESS of the payoff-preferential strategy (optimum strategy) when taking the expected trust-preferential strategy as the original dominant action. In this section, we continue to study the convergence condition of this game, that is, the condition under which the ESS of the game converges to the original strategy. We also deduce the numerical relationship between the optimum and expected actions and its convergence condition.
If our evolutionary game converges, the original dominant strategy * , will evolve to be the optimum strategỹ, ; that is,̃, = * , . According to (3), we have ] .
For the convenience of deducing, suppose that all the probability of correctly recognizing trust level in trust propagation matrix Θ is the same, and = ; we also have ] .
According to Proposition 1, we can infer that the participant with trust level (less than the highest level + 2) would obtain a lower bound of the payoff after taking the expected trust-preferential strategy: .
The mechanism of cooperative stimulation by using the expected trust-preferential strategy is depicted as follows: no matter how low the participant's trust level is or which attribute (transmitter, receiver, relay node, and monitor) the participant belongs to, the expected trust-preferential strategy can stimulate it to cooperate with other members to obtain needed payoff to be serviced by the network.
Second consider the situation that the participant with trust level faces the participant with the highest trust level ( + 2) and relay indicator Φ = 0. Similar to the above case with Φ = 1, according to (14), (12) can be rewritten as Because of 1 ≤ ≤ + 1, taking (17) into the above formula, we can obtain To sum up, we can get the conclusion of Theorem 2 when the participant with trust level faces the participant with the highest trust level ( + 2).
Next we have to consider the situation that the participant with trust level faces the participant with the trust level < ( + 2) and relay indicator Φ = 1. In this situation, it should take the expected action shown as̃(Φ = 1) = + 1.
Simplifying it, we have Hence, Because of 1 ≤ ≤ , taking (17) into the above, we can obtain: In the same situation, (2) for action indicator = + 2, the expected action participant should take is still shown as (Φ = 1) = + 1. Hence, Last, consider the situation that the participant with trust level faces the participant with the trust level < ( + 2) and relay indicator Φ = 0. Similar to the above proof procedure, we can obtain the following two expressions: To sum up, we considered all the situations and deduced (25), (28)-(36), which can support and verify the conclusion of Theorem 2. If the parameters of our proposed stimulation scheme are set to satisfy Theorem 2, the transmission game can get into ESS and the optimum strategy space converges to the expected strategy space which can also make members obtain maximum payoff.
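The convergence requirement, namely that the optimum (payoff-maximizing) action coincides with the expected trust-preferential action for every pair of trust levels, can be checked numerically once a payoff table is available. The following is a minimal sketch; the names `best_action` and `strategy_converges` and all payoff values are hypothetical and serve only to illustrate the check, not the conditions of Theorem 2 themselves.

```python
def best_action(payoffs):
    """Index of the action with the highest payoff (first one wins ties)."""
    return max(range(len(payoffs)), key=lambda a: payoffs[a])


def strategy_converges(payoff, expected):
    """True if the payoff-maximizing action equals the expected action
    for every (own trust level i, opponent trust level j) pair.

    payoff[i][j][a] is the (assumed) game payoff of action a;
    expected[i][j] is the index of the expected trust-preferential action.
    """
    return all(
        best_action(payoff[i][j]) == expected[i][j]
        for i in range(len(payoff))
        for j in range(len(payoff[i]))
    )


# Hypothetical table: 2 x 2 trust levels, 3 actions per pair.
payoff = [
    [[0.1, 0.5, 0.3], [0.2, 0.1, 0.6]],
    [[0.4, 0.3, 0.2], [0.1, 0.7, 0.2]],
]
expected = [[1, 2], [0, 1]]

print(strategy_converges(payoff, expected))
```

If the check fails for any pair of trust levels, a payoff-driven participant has an incentive to deviate from the expected action, and the parameters would need to be adjusted.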

Simulation Setup.
In this part, we conduct extensive simulations to evaluate the network performance of our proposed stimulation model. All simulations are conducted in randomly generated MANETs. 5000 members are randomly deployed in a 10000 m × 10000 m region. The Medium-Access Control (MAC) layer protocol implements the IEEE 802.11 DCF with a four-way handshaking mechanism. The default link bandwidth is 2 Mb/s. DSR is adopted as the routing protocol. The maximum transmission range is 100 m. In our simulated MANET, each node moves according to the random waypoint model: a node randomly chooses a destination within the circle and moves towards it at a velocity uniformly chosen in [0.5 m/s, 2.5 m/s]. On arriving at the destination, the node chooses a new location and a new speed and moves on. Table 4 lists the default settings of the stimulation scheme. To evaluate the network transmission performance of our proposed cooperative stimulation scheme in large scale MANETs, a proportion of malicious members, who give priority to attack actions from the strategy space on the basis of payoff maximization, is mixed with normal members at the initial time; this proportion is the original malicious ratio (we mainly set it to 0%, 20%, 40%, and 60% in the simulation). More specifically, taking a wireless medium network as an instance, a 3-level attack set is provided for malicious members to choose from: level 1 means frame flooding attack, level 2 means black hole attack, and the highest risk level 3 means packet dropping attack. The following indexes are then measured to evaluate network performance.
(1) Cooperative Population: the ratio of the number of members taking cooperative action to the total number of members in the MANET.
In our evolutionary game, each simulation is evolved for 500 rounds to estimate these indexes. In addition, the evolutionary stable status and the game convergence performance are measured to verify our trust cooperative stimulation scheme. Figure 2 compares the evolution of the cooperative population under various original malicious ratios using our trust cooperative stimulation scheme. From Figure 2 we can see that, as the cooperation game goes on, the cooperative population increases significantly during the 500 game rounds under original malicious ratios of 0%, 20%, 40%, and 60%. This is because, within the cooperative stimulation scheme, nodes can take the cooperation strategy to obtain a higher trust level and thereby secure continuous network services. In addition, the simulation parameter setting meets the condition of Theorem 2; namely, the game model has an evolutionary stable status and a convergence point. According to the simulation results, after the evolutionary game has been played for 439 rounds, the evolutionary stable status (ESS) is reached and the cooperative population no longer fluctuates widely. More specifically, when the original malicious ratio of the network is 0%, 20%, 40%, and 60%, the cooperative population increases from 62.3%, 45.8%, 36%, and 19.6% to 94.3%, 83.9%, 78.6%, and 72.4%, respectively, at the ESS point. Even if a small proportion of network members engage in malicious attacks after the ESS, the stable status of the cooperative population is not invaded by the malicious strategy. These simulation results show that our proposed scheme can stimulate cooperative behavior among network members under a high malicious ratio and raise the proportion of the population participating in cooperative transmission, so as to maintain the normal services of MANETs.
Figure 3 compares the cooperative population obtained with our proposed stimulation scheme and with a method that uses no stimulation scheme, under original malicious ratios of 0%, 20%, 40%, and 60%, shown in Figures 3(a), 3(b), 3(c), and 3(d), respectively. Note that in MANETs, especially in large scale MANETs, if more than 70% of network members refuse to cooperate with others, the network services are seriously impeded. Thus, in the simulation, if the cooperative population is less than 30% and this tendency continues for 100 game rounds, the network transmission service is suspended. Without loss of generality, the round number of the game that corresponds to the point of 30% cooperative population is defined as the network lifetime. From Figure 3, we can see that the stimulation scheme effectively prolongs the network lifetime by keeping the cooperative population far above 30%, in contrast to the other method. More specifically, under original malicious ratios of 0%, 20%, 40%, and 60%, the network lifetime without the stimulation scheme is 129 rounds, 63 rounds, 23 rounds, and 19 rounds, respectively. With our scheme, the network services are still maintained by a large cooperative population at the end of the 500 simulation rounds (94.3%, 83.9%, 78.6%, and 72.4% at the ESS). It can be inferred that in large scale MANETs (member number exceeds 5000) with a high malicious ratio (>50%), our scheme still performs better.
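The two indexes used above, cooperative population and network lifetime, can be computed from a per-round record of member actions. The following is a minimal sketch, not the simulator used for the figures: the action label `"C"` and the function names are hypothetical, and the lifetime is interpreted here as the first round at which the cooperative population drops below 30% and then stays below it for 100 consecutive rounds.

```python
def cooperative_population(actions, cooperative="C"):
    """Fraction of members whose action in this round is cooperative."""
    return sum(a == cooperative for a in actions) / len(actions)


def network_lifetime(population, threshold=0.30, window=100):
    """1-based round at which the cooperative population first falls below
    `threshold` and remains below it for `window` consecutive rounds.
    Returns None if the service is never suspended."""
    run_start = None
    for rnd, p in enumerate(population, start=1):
        if p < threshold:
            if run_start is None:
                run_start = rnd  # start of a sub-threshold run
            if rnd - run_start + 1 >= window:
                return run_start
        else:
            run_start = None  # population recovered, reset the run
    return None


# Hypothetical traces: 10 healthy rounds, then a sustained collapse.
collapsing = [0.5] * 10 + [0.2] * 100
healthy = [0.8] * 500

print(cooperative_population(["C", "C", "A", "C"]))
print(network_lifetime(collapsing))
print(network_lifetime(healthy))
```

Under this interpretation a trace that never stays below the threshold for the full window has no finite lifetime within the simulated horizon, which matches the case reported for the stimulation scheme over 500 rounds.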

Simulation Results.
Recall that in our proposed cooperative stimulation scheme, the time factor of the action space couples the current and accumulated trust levels of the members in MANETs; hence it contributes to the improvement in the cooperative performance of the network. To verify and evaluate the impact of the time factor on the cooperative population and the convergence rate of the proposed evolutionary game, a series of simulations has been conducted. Figure 4 shows the results for two settings of the time factor. In the first, the time factor of each action is 0.5; that is, the updated trust level of a member weights the current value and the accumulated value of the trust level equally. In the second, the time factor is set hierarchically, with a different weight for each action element: the higher the risk level of the action element, the smaller its time factor. With the hierarchical setting, the trust value updated after a higher risk action relies less on the accumulated value. Conversely, because of the larger time factor of a beneficial action, the updated trust value relies more on the accumulated value. Thus, once a member takes an action with a higher risk level, its trust level is reduced immediately as punishment, whereas when it takes actions that do not threaten the network, the reduction rate of its trust level slows down as the coupling degree increases. From Figure 4, we can see that with the equal-weighted setting of the time factor ([0.5, 0.5, 0.5, 0.5, 0.5]), the game reaches the ESS after 439 game rounds and the cooperative population at that point is 78.6% under an original malicious ratio of 40%. Under the same circumstances, the hierarchical setting of the time factor ([0.3, 0.2, 0.1, 0.4, 0.5]) is superior in both convergence rate and cooperative population, reaching the ESS after 361 rounds with a cooperative population of 87.3%.
To sum up, the hierarchical setting method can be regarded as a user interface for adjusting the risk level of the various actions in our scheme.
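The effect of the two time-factor settings can be illustrated with a minimal sketch. It assumes that the update is a convex combination in which the time factor weights the accumulated trust value and its complement weights the current value, which matches the description above but is not the paper's exact update formula. The action ordering (three attack levels of increasing risk followed by two non-attack actions) and the trust values are also assumptions.

```python
# Time-factor vectors quoted in the text; the mapping of positions to
# actions (attack risk levels 1..3, then two non-attack actions) is assumed.
EQUAL = [0.5, 0.5, 0.5, 0.5, 0.5]
HIERARCHICAL = [0.3, 0.2, 0.1, 0.4, 0.5]


def update_trust(accumulated, current, action, time_factors):
    """Blend accumulated and current trust using the action's time factor.

    A small time factor makes the result depend mostly on the current
    (just observed) value, so a risky action is punished quickly.
    """
    g = time_factors[action]
    return g * accumulated + (1.0 - g) * current


# Hypothetical member: good history (0.8), then a highest-risk attack
# (action index 2) that is observed as a current trust value of 0.1.
after_equal = update_trust(0.8, 0.1, action=2, time_factors=EQUAL)
after_hier = update_trust(0.8, 0.1, action=2, time_factors=HIERARCHICAL)

print(round(after_equal, 2), round(after_hier, 2))
```

With these assumed values the hierarchical setting lowers the member's trust further after the same attack, which is the immediate punishment effect described above.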
In the previous simulations, all parameters are set to satisfy Theorem 2. When the transmission game reaches the ESS, the optimum strategy space converges to the expected strategy space, which also lets members obtain the maximum payoff. In the following simulation, we evaluate another important index, the average payoff, which mainly drives members' choice of actions, and verify the effect of Theorem 2. As shown in Figure 5, we compare 4 action spaces in our game: the optimum action space (payoff-preference), the expected action space satisfying Theorem 2 (convergent trust-preference), the expected action space not satisfying Theorem 2 (nonconvergent trust level-preference), and the expected action space not satisfying Theorem 2 with a hierarchical time factor (optimal attack classification). According to the simulation result, under an original malicious ratio of 50%, the optimum action space has the highest average payoff in each round of the game (the average payoff obtained by the 5000 members is greater than 0.6 and grows up to 0.92). By contrast, the average payoffs of the other 3 action spaces are lower than that of the optimum action space (about 0.4-0.8). On the other hand, when members take the optimum strategy, the cooperative population of the network does not increase despite the maximized payoffs. This is because the strategy driven by the payoff maximization principle is always to attack, that is, to refuse to cooperate with others. As a consequence, as shown in Figure 5(b), taking the optimum strategy reduces the cooperative population (as curve 1). According to our inference in this paper, the expected strategy (trust-preference) can effectively stimulate members to cooperate with others, but it cannot bring members a satisfying payoff (as curve 4).
To solve this problem, if the parameters are set to satisfy Theorem 2, the scheme can not only stimulate members to cooperate with others but also increase the average payoff of the whole network (as curves 2, 3). Moreover, the strategy that includes the hierarchical time factor performs better than the one without it (see curve 2), which is consistent with the time-factor simulation result above.
To extend our theoretical game model to realistic MANETs, 2 important indexes of the network transmission service must be measured: transmission success rate (TSR) and normalized network throughput (NNT). Finally, we therefore conduct a further simulation to evaluate TSR and NNT under the proposed cooperative stimulation scheme in comparison with a traditional multihop transmission scheme in large scale MANETs. The bar chart in Figure 6 shows the simulation result, where A, B, C, and D denote networks with 2500, 5000, 7500, and 10000 members, respectively. From Figure 6, we can see that, owing to the increase in the cooperative population under the stimulation scheme, the TSR increases from 79% to 84% as the number of network members ranges from 2500 to 10000. By contrast, under an original malicious ratio of 40%, the TSR of the traditional multihop scheme drops dramatically from 71% to 42%, which results from the lack of cooperation among network members. As for NNT, which reflects how active the network's information flow is, a higher NNT means that the MANET can accommodate a larger data stream. From Figure 6, under our scheme the NNT declines only from 62% to 54% as the network scale grows, whereas in the same situation the NNT of the traditional multihop scheme is reduced dramatically from 59% to 35%. Therefore, our proposed cooperative stimulation scheme can effectively serve data transmission in large scale MANETs with a higher malicious ratio.

Conclusion
In this paper, we have investigated an evolutionary game theoretic trust cooperative stimulation scheme for large scale MANETs that incentivizes members to take cooperative actions with each other so as to maintain cooperative performance. By constructing the complete multirisk level strategy and payoff space and building a trust-preferential strategy, malicious actions can be effectively constrained to a low trust level. Then, through evolutionary analysis of the game model, the convergence condition between the optimum strategy, which represents the payoff maximization principle, and the trust-preferential strategy is deduced. Furthermore, the mobility probability parameters and the information propagation error are introduced into our scheme, which brings it closer to realistic large scale MANETs. Both theoretical analysis and simulation experiments have demonstrated that, although a gap may exist between the game model and reality, the game-theoretic approach can still provide thoughtful insights and helpful guidelines for stimulating members to cooperate with each other in the face of multirisk-level purposive strategic attacks in large scale MANETs. The proposed scheme can effectively stimulate cooperation among members while remaining robust in a harsh environment with a high original malicious ratio in large scale MANETs.