Adaptive Quantization for Distributed Estimation in Energy-Harvesting Wireless Sensor Networks: A Game-Theoretic Approach

The problem of distributed estimation in energy-harvesting wireless sensor networks (EH-WSNs) is studied. In general, the energy state of an energy-harvesting sensor varies dramatically. Existing efforts mainly concentrate on distributed estimation for battery-powered WSNs and ignore the crucial issue of energy harvesting. Therefore, the unpredictable energy harvesting, the energy storage device, and energy consumption are modeled in a unified way to jointly address the energy harvesting and distributed estimation problem. In this paper, building on the classical adaptive distributed estimation scheme, the problem of parameter estimation in EH-WSNs is formulated as a game of complete and perfect information. Each player decides its strategy according to the others' energy states and actions. The subgame perfect equilibrium (SPE) is derived by backward induction. Simulation results show that the proposed SPE makes full use of the harvested energy and improves the estimation performance.


Introduction
Energy is the key factor in wireless sensor networks (WSNs), and extensive research effort has been put into prolonging network lifetime. There are two major kinds of strategies. One is to reduce energy consumption, for example, by designing low-complexity software implementations [1], power-efficient coverage, energy-efficient topologies [2,3], routing techniques [4], and data gathering [5]. The other is to harvest ambient energy from mechanical, thermal, and photovoltaic sources [6], among others. A fundamental problem in battery-powered WSNs is the finite battery lifetime of sensors. Energy harvesting overcomes this problem and can provide perpetual operation of WSNs.
In this paper, we consider the distributed estimation problem [7-12] in the context of energy-harvesting WSNs (EH-WSNs). The goal is to maximize a WSN's lifetime while ensuring that all parameters of the underlying process, such as the environment temperature, soil moisture, pressure, and sound, are monitored by sensors [7]. The problem in battery-powered WSNs has been pursued in many earlier works because of its many potential application fields. One of the earlier works addressed bandwidth-constrained distributed parameter estimation by using a one-bit quantizer and proposed maximum-likelihood estimators (MLEs) for sensor networks [8]. Recently, the authors of [9,10] considered the decentralized estimation problem over noisy channels. Other works investigated the problem of optimal power allocation among sensors under a given estimation mean-squared error (MSE) for sensor networks [11,12]. In brief, the above solutions [7-12] do not consider sensors' recharging opportunities and are not suitable for EH-WSNs. It is noted that the authors of [13] presented an analysis of optimization problems of distributed estimation for EH-WSNs, and their solution was found through a constrained utility maximization method.
International Journal of Distributed Sensor Networks
However, the authors of [13] follow the assumption that the harvested energy is uncertain but predictable. Actually, this is not always true. For example, solar energy depends on the size of a sensor's solar cell, its orientation to the sun, the temperature of the solar module, seasonal characteristics [14], and so forth. Thus, solar energy is either unpredictable or predictable only at a high energy cost. On the other hand, the constrained utility maximization method is usually carried out by the fusion center and is therefore centralized.
To this end, we propose a game-theoretic approach to model the distributed estimation problem in EH-WSNs. Firstly, energy harvesting, the energy storage device, and energy consumption are considered in a unified way. The harvested energy is assumed to be unpredictable here. Then, the classical quantization for distributed estimation is formulated as a game of complete and perfect information. Different from the centralized method [15-17], where each sensor makes a decision according to the fusion center's scheduling scheme, the game-theoretic approach is distributed and each sensor makes decisions autonomously.
In existing game-theoretic models for battery-powered WSNs, the distributed estimation problem does not necessarily consider the harvested energy. The formation of nonoverlapping coalitions is investigated and each sensor's performance is maximized under a specific energy constraint, ignoring the crucial issue of energy harvesting [18]. Game-theoretic models have also been applied to EH-WSNs. A Bayesian game-theoretic approach is used to model transmission control in EH-WSNs and can effectively reduce the bandwidth overhead in exchanging information among sensors [19]. The problem of determining the sleep and wakeup probabilities is modeled as a bargaining game for a solar-powered WSN [20]. However, these existing efforts still lack consideration of the distributed estimation problem in EH-WSNs from the perspective of game theory.
The main contributions of this paper are as follows.
(1) A game-theoretic model is proposed for EH-WSNs. Within it, extensive (sequential move) game theory and the distributed estimation problem are integrated into a distributed estimation game.
(2) Further, the refined Nash equilibrium is defined and its subgame perfect equilibrium (SPE) is derived by backward induction. Simulations show that the proposed SPE can deal with the problem of unpredictable energy and improve the estimation performance.
The remainder of this paper is organized as follows. Section 2 provides a description of extensive games and the system model. Section 3 presents an adaptive quantization game for distributed estimation in EH-WSNs and shows some further results on the refined Nash equilibrium. Section 4 provides the simulation results and Section 5 concludes the paper.

Extensive Form Games.
An extensive (sequential move) game is one of the basic types of games, in which players take turns choosing plans of action. An extensive game with perfect information is one in which each player, when making any decision, is perfectly informed of all the events that have previously occurred. A finite extensive game with perfect information consists of [21] (i) a set of players $\mathcal{N}$; (ii) a set $\mathcal{A}$ of sequences $s$ of actions (terminal histories) that can possibly occur from the start of the game to an action that ends the game; (iii) a player function $P(\cdot)$ that assigns a player to every sequence that is a proper subhistory of some terminal history; (iv) for each player $i \in \mathcal{N}$, preferences $\{u_i(\cdot)\}$ over the set of terminal histories.
The extensive form game is usually pictured by way of a game tree, which consists of choice nodes and terminal nodes: (1) choice nodes are labeled with players, and each outgoing edge is labeled with an action for that player; (2) terminal nodes are labeled with utilities. The most important solution concept for such a game tree is the subgame perfect equilibrium, which is a robust steady state. It requires each player's strategy to be optimal, given the other players' strategies, not only at the start of the game but also after every possible history.
Definition 1. The strategy profile $s^*$ in an extensive game with perfect information is a subgame perfect equilibrium if, for every player $i$ and every history $h$ after which it is player $i$'s turn to move,
$$u_i(O_h(s^*)) \ge u_i(O_h(s_i, s^*_{-i})) \tag{1}$$
for every strategy $s_i$ of player $i$, where $O_h(s)$ denotes the terminal history consisting of $h$ followed by the sequence of actions generated by $s$ after $h$.

System Model.
A physical phenomenon (a scalar parameter $\theta$) observed by a set of sensors (indexed by $\mathcal{N} = \{1, 2, \ldots, N\}$) is considered here, for example, temperature or the toxicity of a gas. Each sensor consists of a solar cell, a rechargeable energy storage device with limited capacity, and a wireless module. The sensors' observations are disturbed by independent and identically distributed additive white Gaussian noise. Their observations are quantized by the adaptive distributed estimation scheme in order to obtain an estimation scheme that is robust to the unknown scalar parameter [15]. The scheme dynamically adjusts each sensor's threshold based on the binary data received from the previous sensors and broadcasts each sensor's one-bit quantized observation over ideal shared time-division wireless channels. Each sensor's threshold $\tau_i$ is the point at which the output of the sensor's quantizer changes, and it therefore determines the output value of the quantizer. Note that the estimation performance is evaluated against a benchmark, namely the corresponding Cramér-Rao lower bound (CRLB). The CRLB for the adaptive distributed estimation scheme is expressed as [15]
$$\mathrm{CRLB}(\theta) = \frac{1}{N}\left[\sum_{k} q_k \frac{p^2(\theta, \tau(k))}{F(\theta, \tau(k))\,[1 - F(\theta, \tau(k))]}\right]^{-1}, \tag{2}$$
where $q_k := (1/N) \sum_{i=1}^{N} P(\tau_i = \tau(k))$ and $P(\tau_i = \tau(k))$ denotes the probability with which the local sensor $i$ uses $\tau(k)$ as the quantization threshold. Additionally, $p(\theta, \tau(k))$ and $F(\theta, \tau(k))$ denote the probability density function (PDF) and the complementary cumulative distribution function (CCDF) of the additive white Gaussian noise, respectively.
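As a rough numerical illustration of the CRLB for one-bit quantized observations, the following sketch evaluates the bound for zero-mean Gaussian noise. It assumes that the PDF and CCDF are evaluated at the offset between the threshold and the parameter, and the function names, the threshold list, and the weights are hypothetical inputs chosen for illustration rather than values from [15].

```python
import math


def gaussian_pdf(x, sigma):
    """PDF of zero-mean Gaussian noise with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_ccdf(x, sigma):
    """Complementary CDF (tail probability) of zero-mean Gaussian noise."""
    return 0.5 * math.erfc(x / (sigma * math.sqrt(2.0)))


def crlb_one_bit(theta, thresholds, weights, n_sensors, sigma):
    """CRLB = 1 / (N * sum_k q_k * p^2 / (F * (1 - F))).

    Assumption: p and F are evaluated at tau(k) - theta; weights[k] plays
    the role of q_k, the average probability of using threshold tau(k).
    """
    information = 0.0
    for tau, q in zip(thresholds, weights):
        offset = tau - theta
        p = gaussian_pdf(offset, sigma)
        tail = gaussian_ccdf(offset, sigma)
        information += q * p * p / (tail * (1.0 - tail))
    return 1.0 / (n_sensors * information)


# A threshold equal to the parameter is the most informative choice.
print(crlb_one_bit(1.0, [1.0], [1.0], n_sensors=10, sigma=1.0))
print(crlb_one_bit(1.0, [3.0], [1.0], n_sensors=10, sigma=1.0))
```

When every sensor uses a threshold equal to the parameter, the bound reduces to $\pi\sigma^2/(2N)$, which is why an adaptive scheme tries to move the thresholds toward the unknown parameter.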
To model the behavior of an energy-harvesting sensor node effectively, energy harvesting, the energy storage device with limited capacity, and energy consumption should be considered in a unified way. Each estimation period consists of two time slots: the long energy-harvesting slot $T_{\mathrm{H}}$, in which each sensor harvests solar energy, and the short transmission slot $T_{\mathrm{Tr}}$, in which each sensor broadcasts its information. It is reasonably assumed that $T_{\mathrm{H}} \gg T_{\mathrm{Tr}}$ and that there is no energy harvesting during the short transmission slot.
(1) Harvesting Slot. A typical solar model is adopted here. Each day is divided into $T$ slots. Let $H_{i,t}$ and $B_{i,t}$ denote the total amount of harvested energy and the initial energy for the sensor $i$ at slot $t$, $t = 1, 2, \ldots, T$, respectively. We consider a more realistic scenario, in which the ambient solar energy sources exploited for harvesting are unpredictable and random in nature due to the complex surroundings of solar cells, for example, orientation to the sun, temperature of the solar module, diurnal variations, and seasonal characteristics [14]. The harvested energy is stored in an energy storage device (e.g., a supercapacitor) with limited capacity $B_{\max}$, which denotes the maximum battery level of the sensor $i$. It is assumed that the energy storage device behaves perfectly in terms of leakage, since leakage is usually only a secondary effect [22].
(2) Transmission Slot. Sequences of binary data are generated by the adaptive distributed estimation scheme. Only the energy consumed by transmitting these binary data is considered; the energy consumed by sensing and signal processing is negligible. Similar to [17,23], to transmit a $k$-bit message, the energy consumption of the sensor $i$ in the $t$th estimation cycle (i.e., the slot $t$) is expressed as
$$E^{\mathrm{Tr}}_{i,t}(k) = E_{\mathrm{elec}} k + \varepsilon_{\mathrm{fs}} k d_{i,t}^2, \tag{3}$$
where $d_{i,t}$ denotes the longest of the distances from the $i$th sensor in the slot $t$ to the other sensors and the fusion center. Note that $E_{\mathrm{elec}}$ and $\varepsilon_{\mathrm{fs}}$ denote the electronics energy and the energy factor, respectively. $E^{\mathrm{Tr}}_{i,t}(k)$ is required here to guarantee that the $k$-bit message can be received by the other sensors and the fusion center. Additionally, to receive a $k$-bit message, the energy consumption of the $i$th sensor in the $t$th estimation cycle is expressed as
$$E^{\mathrm{Re}}_{i,t}(k) = E_{\mathrm{elec}} k. \tag{4}$$
Thus, for the $i$th sensor, the total energy consumption in the $t$th estimation cycle, with $k$ transmitted bits and $k'$ received bits, is expressed as
$$E_{i,t} = E^{\mathrm{Tr}}_{i,t}(k) + E^{\mathrm{Re}}_{i,t}(k'). \tag{5}$$
The analysis of the above two time slots provides a unified way of modeling the behavior of the distributed estimation task, that is, through discretization. The resulting energy evolution is given by formula (6), where $\mathcal{B}_{i,t}$ denotes the remaining energy for the $i$th sensor at the slot $t$. Obviously, the energy level satisfies the condition in formula (7).
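The energy model of the transmission slot can be sketched in a few lines. The radio functions below follow the first-order model described above (electronics energy per bit plus a free-space term proportional to the squared distance), with the constants taken from the simulation section. The `battery_step` update is only an assumption about how harvesting, the capacity limit, and consumption might be combined in one cycle; it is not the paper's energy recursion.

```python
E_ELEC = 100e-9  # electronics energy in J/bit (100 nJ/bit, simulation section)
EPS_FS = 10e-12  # free-space energy factor in J/bit/m^2 (10 pJ/bit/m^2)


def tx_energy(k_bits, distance_m):
    """Energy to transmit k bits over the longest required distance."""
    return E_ELEC * k_bits + EPS_FS * k_bits * distance_m ** 2


def rx_energy(k_bits):
    """Energy to receive k bits."""
    return E_ELEC * k_bits


def battery_step(initial_j, harvested_j, consumed_j, capacity_j):
    """Hypothetical one-cycle update: harvest first (clipped at the
    storage capacity), then spend energy in the transmission slot."""
    stored = min(initial_j + harvested_j, capacity_j)
    return max(stored - consumed_j, 0.0)


# One cycle for a sensor that sends 2 bits over 100 m and receives 9 bits.
spent = tx_energy(2, 100.0) + rx_energy(9)
print(spent)
print(battery_step(20.0, 1.4, spent, 1000.0))
```

Harvesting before spending mirrors the assumption that the harvesting slot precedes the short transmission slot within each estimation period.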

Adaptive Quantization Game for Distributed Estimation
Since sensors in WSNs make decisions autonomously, we can define an extensive form game with perfect information to model the adaptive quantization problem in an estimation period $t$, which is denoted as the adaptive quantization game with perfect information $G_t$. Each sensor (player) wants to maximize its own utility in a selfish and rational manner; the utility is usually defined as a function of the estimation performance and the sensor's residual energy. Additionally, it is noted that the adaptive quantization game with perfect information is also a game with complete information, because the players' utilities and strategies are completely known by all the players. The game $G_t$ consists of the following components.
(i) $\mathcal{N}_t = \{1, 2, \ldots, N\}$.
(ii) The set of terminal histories given in (8), whose elements are action profiles $s_t = (A_{1,t}, A_{2,t}, \ldots, A_{N,t})$ with $A_{i,t} \in \{1, 0\}$.
(iii) The player function $P_t(\cdot)$ given in (9), which determines the order in which the players move.
(iv) $u_{i,t}(\cdot)$ is a function of the terminal history, the estimation performance, and the sensor's residual energy in the current estimation period $t$. The adaptive quantization game with complete information requires that the preferences and strategies of all the players are common knowledge, and $u_{i,t}(\cdot)$ should be designed to meet this requirement.
Before formally defining $u_{i,t}(\cdot)$, the discretized model of the adaptive distributed estimation task needs to be reconsidered. Firstly, as shown in Section 2.2, each player's energy states, including $H_{i,t}$ and $B_{i,t}$, are private information because the solar energy sources are assumed to be unpredictable. However, the players' energy information becomes common knowledge if a one-bit observation message and a one-bit additional energy message are both broadcast in the adaptive quantization game.
The one-bit additional energy message is expressed as
$$\mathrm{Mess}_E(i, t) = \begin{cases} 1, & \mathcal{B}_{i,t} \ge E^{\mathrm{th}}_{i,t}, \\ 0, & \text{otherwise}, \end{cases} \tag{10}$$
where $E^{\mathrm{th}}_{i,t}$ denotes the threshold energy for the $i$th player at the slot $t$. The choice of $E^{\mathrm{th}}_{i,t}$ depends on the order of participation in the adaptive quantization game and on the $i$th player's distance parameter $d_{i,t}$. $\mathrm{Mess}_E(i, t) = 1$ means that the $i$th player at the slot $t$ has enough energy to participate in the adaptive quantization game at the next slot $t+1$, even if no more energy is harvested for the sensor at the slot $t+1$. Conversely, $\mathrm{Mess}_E(i, t) = 0$ means that the $i$th player at the slot $t$ does not have enough energy to participate in the next slot $t+1$.
Meanwhile, such a sensor will not be permitted to participate even if enough energy is harvested for it at the slot $t+1$, so that it can store more harvested energy. Without loss of generality, $E^{\mathrm{th}} = E^{\mathrm{th}}_{i,t}$ for $i \in \{1, 2, \ldots, N\}$ and $t \in \{1, 2, \ldots, T\}$. Each player's energy state can be derived through the additional energy messages (10). Thus, the adaptive quantization game with complete information in energy-harvesting WSNs is obtained. Then, $A_{i,t}$ is further described as follows: $A_{i,t} = 1$ denotes that the $i$th player chooses to broadcast the two-bit message of its observation and its energy state at the current slot $t$; $A_{i,t} = 0$ denotes that the $i$th player chooses to broadcast the one-bit message of its observation at the current slot $t$.
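The role of the one-bit energy message can be illustrated with a short sketch. It assumes that the message is 1 exactly when a sensor's remaining energy is at least the common threshold (2 Joules in the simulation section); the function names and the sample energy values are hypothetical.

```python
def energy_message(remaining_energy_j, threshold_j=2.0):
    """One-bit energy message: 1 if the sensor can afford the next slot.

    Assumption: the comparison is 'remaining energy >= threshold'.
    """
    return 1 if remaining_energy_j >= threshold_j else 0


def participants_next_slot(remaining_by_sensor, threshold_j=2.0):
    """1-based indices of the sensors whose message is 1."""
    return [
        index
        for index, energy in enumerate(remaining_by_sensor, start=1)
        if energy_message(energy, threshold_j) == 1
    ]


# Hypothetical remaining energies (in Joules) of three sensors.
print(participants_next_slot([20.0, 1.5, 3.0]))
```

Because every sensor broadcasts this bit, each player can reconstruct which of the others are eligible in the next slot, which is what turns private energy states into common knowledge.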
According to formulas (3)-(5), the total energy consumption in the $t$th estimation cycle is expressed as
$$E_{i,t} = E_{\mathrm{elec}} k + \varepsilon_{\mathrm{fs}} k d_{i,t}^2 + E_{\mathrm{elec}} k', \tag{11}$$
where $k = A_{i,t} + 1$ and $k' = \sum_{j=1, j \ne i}^{N} (A_{j,t} + 1)$. It is reasonably assumed that $E_{\mathrm{elec}}$ and $\varepsilon_{\mathrm{fs}} d_{i,t}^2$ are of the same order. Thus, a simplified energy factor $\tilde{A}_{i,t}$ is defined in formula (12). It is noted that $\tilde{A}_{i,t}$ is a function of all the players' actions $\{A_{1,t}, \ldots, A_{N,t}\}$. After reconsidering the discretized model of the adaptive distributed estimation task, the preference $u_{i,t}(\cdot)$ is formally defined in formula (13), where $N_0$ is the minimum number of participants that must broadcast their two-bit messages for the adaptive estimation performance to be satisfied, $s_t = (A_{1,t}, A_{2,t}, \ldots, A_{N,t})$ is one of the elements in (8), and $\lambda$ is a given positive weighting factor. Additionally, $\mathrm{Sgn}(\cdot)$ is given in formula (14). Note that the preference $u_{i,t}(\cdot)$ is negatively related to the energy factor $\tilde{A}_{i,t}$. According to formula (11), $E_{i,t}|_{A_{i,t}=1}$ is larger than $E_{i,t}|_{A_{i,t}=0}$ if the other players' actions are fixed. Because each player selfishly wants to increase its residual energy, the action $A_{i,t} = 0$ will be preferred whenever the condition $\sum_{i=1}^{N} A_{i,t} \ge N_0$ is satisfied no matter which action the $i$th player chooses. This is consistent with the player's selfishness and autonomy.
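The dependence of one player's energy consumption on all players' actions can be checked numerically. The sketch below assumes the first-order radio model with the constants of the simulation section, so that a player transmits $A_{i,t} + 1$ bits and receives the $A_{j,t} + 1$ bits of every other player; the function name and the example action profiles are hypothetical.

```python
def cycle_energy(player, actions, distance_m, e_elec=100e-9, eps_fs=10e-12):
    """Energy spent by one player in one estimation cycle.

    actions[j] is 1 (two-bit message) or 0 (one-bit message) for player j;
    'player' is a 0-based index into actions.
    """
    k_tx = actions[player] + 1
    k_rx = sum(a + 1 for j, a in enumerate(actions) if j != player)
    transmit = e_elec * k_tx + eps_fs * k_tx * distance_m ** 2
    receive = e_elec * k_rx
    return transmit + receive


# Same opponents, two possible actions for player 0.
two_bit = cycle_energy(0, [1, 1, 0], distance_m=100.0)
one_bit = cycle_energy(0, [0, 1, 0], distance_m=100.0)
print(two_bit, one_bit, two_bit > one_bit)
```

With the other players' actions fixed, choosing the two-bit message always costs more, which is the source of each player's temptation to choose action 0.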

Backward Induction and Existence of Nash Equilibrium.
According to Definition 1, in a subgame perfect equilibrium every player's strategy is optimal, in particular after the initial history, that is, $h = \emptyset$ and $O_h(s) = O_{\emptyset}(s) = O(s)$. Considering the adaptive quantization game with perfect information, we show this in the following proposition.
Proposition 2. Every subgame perfect equilibrium in the adaptive quantization game with perfect information $G_t$ is a Nash equilibrium.
Proof. From Section 3.1, it is noted that the extensive form game with perfect information $G_t$ is modeled to depict the adaptive quantization problem in an estimation period $t$. According to Definition 1, the whole game is always a subgame; that is, setting the initial history $h = \emptyset$, the optimality condition holds for the whole game, and every subgame perfect equilibrium is therefore a Nash equilibrium. Thus, we also say that subgame perfect equilibrium is a refinement of Nash equilibrium.
It is noted that a subgame perfect equilibrium can be found using a simple algorithm known as backward induction [21]. Backward induction is an elimination procedure with the following steps: (1) identify the terminal nodes in the game tree; (2) determine the optimal actions for each choice node that is an immediate predecessor of a terminal node; (3) eliminate the above terminal nodes, and change those choice nodes into new terminal nodes with preferences from the optimal actions; (4) repeat steps (1)-(3) on smaller and smaller games until preferences can be assigned to the initial choice node of the game.
In the following proposition, the existence of a subgame perfect equilibrium in $G_t$ is shown.
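The elimination steps above can be sketched as a short recursive procedure for a sequential game in which each player moves once and has binary actions. This is a minimal illustration rather than the paper's implementation: the payoff function `toy_utility` is hypothetical (it is not formula (13)), and the function returns only the equilibrium path with its utilities, not the full strategy profile.

```python
from typing import Callable, Sequence, Tuple

History = Tuple[int, ...]


def backward_induction(
    n_players: int,
    utility: Callable[[History], Sequence[float]],
    actions: Sequence[int] = (1, 0),
    history: History = (),
) -> Tuple[History, Tuple[float, ...]]:
    """Return the terminal history and utilities chosen by backward induction.

    Player k (0-based) moves at depth k, so a history of length
    n_players is terminal.
    """
    mover = len(history)
    if mover == n_players:
        return history, tuple(utility(history))
    best = None
    for action in actions:
        outcome = backward_induction(n_players, utility, actions, history + (action,))
        # Keep the continuation that the current mover prefers (ties keep the first).
        if best is None or outcome[1][mover] > best[1][mover]:
            best = outcome
    return best


def toy_utility(history: History, n0: int = 2, cost: float = 0.1) -> Tuple[float, ...]:
    """Hypothetical payoff: +1 if at least n0 players choose action 1,
    otherwise -1, minus a cost proportional to the bits a player sends."""
    reward = 1.0 if sum(history) >= n0 else -1.0
    return tuple(reward - cost * (a + 1) for a in history)


path, payoffs = backward_induction(3, toy_utility)
print(path, payoffs)
```

Under this toy payoff the procedure selects the path (0, 1, 1): the first mover saves energy and leaves the later movers to meet the participation requirement, which illustrates a first-mover advantage.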

Proposition 3. Every adaptive quantization game with perfect information $G_t$ has at least one subgame perfect equilibrium.
Proof. It is well known that the set of subgame perfect equilibria of any finite horizon extensive form game with perfect information is equal to the set of strategy profiles isolated by the procedure of backward induction [21]. Obviously, the adaptive quantization game $G_t$ is a finite game that has not only a finite horizon $N$ but also a finite number of terminal histories, at most $2^N$. In $G_t$, the player who moves first in any subgame has at most two actions, $A_{i,t} \in \{1, 0\}$, and at least one of them is optimal. Thus, the procedure of backward induction can isolate at least one strategy profile, that is, at least one subgame perfect equilibrium.
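The finiteness argument in the proof can be made concrete by enumerating the terminal histories. The sketch below assumes that each mover has either both actions or a single remaining action; which single action a restricted player keeps is a hypothetical choice made only for illustration.

```python
from itertools import product


def terminal_histories(action_sets):
    """All action sequences when mover k chooses from action_sets[k]."""
    return list(product(*action_sets))


# Three unrestricted players: 2^3 terminal histories.
full = terminal_histories([(1, 0), (1, 0), (1, 0)])
# One player restricted to a single action: fewer than 2^3 histories.
restricted = terminal_histories([(1, 0), (0,), (1, 0)])
print(len(full), len(restricted))
```

The count never exceeds $2^N$, so the game tree is finite and backward induction terminates.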
We now give an example of a simple adaptive quantization game $G_t$ and illustrate how to use the procedure of backward induction to find a subgame perfect equilibrium of the game. In this example, let $N = 3$, $\lambda = 0.9$, and $N_0 = 2$. For simplicity, it is assumed that $\mathrm{Mess}_E(i, 0) = 1$ for $i \in \{1, 2, 3\}$, $\mathrm{Mess}_E(i, 1) = 1$ for $i \in \{1, 2\}$, and $\mathrm{Mess}_E(3, 1) = 0$. The adaptive quantization game $G_t$ can be pictured by way of a game tree, as shown in Figures 1 and 2.
In Figures 1 and 2, the choice nodes labeled with players are rectangular and each edge is labeled with one of the actions $\{1, 0\}$. Additionally, terminal nodes are labeled with utilities. For example, according to formula (13), a utility of 0.5 is obtained, as shown in Figure 1.
Considering each player's energy state, that is, $\mathrm{Mess}_E(i, 1) = 1$ for $i \in \{1, 2\}$ and $\mathrm{Mess}_E(3, 1) = 0$, the 1st player and the 2nd player at the slot $t = 1$ have enough energy to participate in the game $G_2$, but the 3rd player at the slot $t = 1$ cannot participate in the game $G_2$. In other words, the 1st player and the 2nd player at the slot $t = 1$ (i.e., the 3rd player and the 1st player at the slot $t = 2$) have two actions in the game $G_2$, but the 3rd player at the slot $t = 1$ (i.e., the 2nd player at the slot $t = 2$) has only one action in the game $G_2$.
As an example of backward induction in the game $G_1$, consider Figure 1. Following the elimination procedure of backward induction, we first identify the terminal nodes, such as those labeled (0.5, 0.5, 0.5), (0.6, 0.6, 0.7), and (0.6, 0.7, 0.6). Then, the optimal action is determined at each choice node that is an immediate predecessor of terminal nodes. For example, the 3rd player at the slot $t = 1$ is the predecessor of the terminal nodes (0.5, 0.5, 0.5) and (0.6, 0.6, 0.7). At this choice node, the action $A_{3,1} = 0$ dominates the action $A_{3,1} = 1$. Thus, the branch on which this choice node plays $A_{3,1} = 1$ and its associated utilities are eliminated. Similarly, for the choice node that is the predecessor of the terminal nodes (0.6, 0.7, 0.6) and (−1.1, −1, −1), the dominated branch and its associated utilities are eliminated. It is clear that a subgame perfect equilibrium can be found by using backward induction; that is, (0, 01, 0110) and (1, 0, 10) are the subgame perfect equilibria of the games $G_1$ and $G_2$, respectively.
The flow chart of the adaptive quantization game for one cycle is shown in Figure 3. Within it, backward induction is adopted in the operation "strategy choosing." Backward induction is the process of looking ahead and working backwards to solve the adaptive quantization game. The process continues backwards until the best action has been determined for every possible situation. According to Proposition 3, at least one subgame perfect equilibrium can be found through backward induction. Additionally, the time complexity of solving the adaptive quantization game is $O(2^N)$, as shown in Figures 1 and 2.

Numerical Results
In this section, a set of $N = 10$ sensors is randomly deployed in a 45 m × 45 m square area. These sensors are indexed by $\{1, 2, \ldots, 10\}$. The fusion center is located at the point (25 m, 100 m). The parameters are set as $T = 24$, $E_{\mathrm{elec}} = 100$ nJ/bit, $\varepsilon_{\mathrm{fs}} = 10$ pJ/bit/m$^2$, $B_{1,1} = B_{2,1} = \cdots = B_{N,1} = 20$ Joules, $B_{\max} = 1000$ Joules, $E^{\mathrm{th}} = 2$ Joules, $\lambda = 0.9$, and $N_0 = 6$. The player function is initialized as $P_1(h_1^{(k)}) = k + 1$ for $k = 0, 1, \ldots, 9$. The total amount of harvested energy from solar sources is assumed to be as shown in Figure 4. In the energy-harvesting process, the first slot $t = 1$ corresponds to five to six o'clock in the morning and its level of harvested energy is low (around 1.4 Joules), while the levels of harvested energy at the slots from $t = 7$ to $t = 10$ around noon (around 2.6 Joules, 2.8 Joules, 3.0 Joules, and 3.0 Joules, resp.) are higher than those of the previous and subsequent slots. At night, the levels of harvested energy become rather low and are assumed to be 0 Joules. Additionally, due to the different surroundings of the sensors' solar cells, it is reasonably assumed that each sensor's harvested energy at a slot is slightly different.
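The deployment described above determines each sensor's distance parameter, that is, the longest distance from the sensor to any other sensor or to the fusion center. The sketch below assumes that the square area has its lower-left corner at the origin and draws positions uniformly at random with a fixed seed; the function names and the seed are hypothetical and do not reproduce the paper's actual deployment.

```python
import math
import random

AREA_SIDE_M = 45.0
FUSION_CENTER = (25.0, 100.0)


def deploy(n_sensors, seed=0):
    """Uniformly random sensor positions in the square [0, 45] x [0, 45]."""
    rng = random.Random(seed)
    return [
        (rng.uniform(0.0, AREA_SIDE_M), rng.uniform(0.0, AREA_SIDE_M))
        for _ in range(n_sensors)
    ]


def longest_distances(positions, fusion_center=FUSION_CENTER):
    """For each sensor, the longest distance to the others and the fusion center."""
    result = []
    for i, own in enumerate(positions):
        targets = [p for j, p in enumerate(positions) if j != i]
        targets.append(fusion_center)
        result.append(max(math.dist(own, target) for target in targets))
    return result


distances = longest_distances(deploy(10))
print([round(d, 1) for d in distances])
```

Since the fusion center lies at least 55 m from every point of the area, each distance parameter is at least 55 m under this assumed geometry.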
The residual energy of all the sensors at each slot is shown in Figure 5. It is noted that each sensor's residual energy increases only in daylight and decreases at night. This is consistent with the time-slotted solar energy-harvesting model shown in Figure 4.
Before the game $G_2$, the one-bit additional energy messages of the sensors are generated. As shown in Figure 4, the harvested energy of the 7th and 8th sensors is lower than that of the others. According to formula (10), $\mathrm{Mess}_E(7, 1) = 0$ and $\mathrm{Mess}_E(8, 1) = 0$. Thus, the 7th and 8th sensors at the slot $t = 1$ do not have enough energy to participate in the slot $t = 2$. Additionally, the player function $P_2(\cdot)$ changes the order of participation. Then, the procedure of backward induction is adopted. The action combination (1, 0, 0, 1, 1, 1, 0, 0, 1, 1) at the slot $t = 2$, which is shown in Figure 6, is chosen, and the corresponding utility combination is (0.2, 0.3, 0.3, 0.2, 0.2, 0.2, 0.3, 0.3, 0.2, 0.2). It is worth mentioning that the 1st sensor at the slot $t = 1$ is the 1st player in $G_1$ and has a first-mover advantage, while the 1st sensor at the slot $t = 2$ is the 10th player in $G_2$ and has no advantage over the others. As shown in Figure 6, $A_{1,1} = 0$ and $A_{1,2} = 1$. From Figure 7, each sensor's average utility per slot is almost the same, so all sensors are treated fairly.

Conclusion
A novel distributed estimation scheme for EH-WSNs has been proposed. It adopts a game of complete and perfect information and is applicable to any discrete energy model (predictable or unpredictable). Its player function and preferences depend on the players' energy states and the performance of the EH-WSN. Additionally, the existence of an SPE in the distributed estimation game has been established, and an SPE can be found by backward induction. Finally, simulations show that the proposed game-based SPE improves the estimation performance and that all sensors are treated fairly.

Figure 1: Game tree of the adaptive quantization game $G_1$ and its backward induction.

Figure 2: Game tree of the adaptive quantization game $G_2$ and its backward induction.

Figure 3: The flow chart of the adaptive quantization game for one cycle.

Figure 4: The total amount of harvested energy on a certain day.

Figure 5: Sensors' residual energy on a certain day.

Figure 7: Players' utilities on a certain day.