Hardware Architecture Design for WSN Runtime Extension

The Internet of Things imposes demanding requirements on wireless sensor networks, which are key players in acquiring context awareness. Temporal and spatial ubiquity are essential features that run into technology boundaries in terms of energy management. Limited energy availability makes anywhere-and-anytime sensing a challenging task that forces sensor nodes to use every bit of available power wisely. One of the earliest and most decisive choices in the electronic design stage is the selection of the silicon building blocks that will make up the hardware architecture. Designers have to choose between dual architectures (a low-power microcontroller controlling a radio module) and single architectures (a system on chip). This decision, together with finite state machine design and application firmware, is crucial to minimize power consumption while maintaining the expected sensor node performance. This paper provides keys for the energy analysis of wireless sensor node architectures according to the specific requirements of any application. It thoroughly analyzes the pros and cons of dual and single architectures, giving designers a basis for selecting the most efficient one for each application. It also provides helpful considerations for optimal sensing-system design, analyzing how different strategies for sensor measurement and data exchange affect node energy consumption.


Introduction
Internet of Things (IoT) applications and scenarios are very heterogeneous: environmental monitoring in large areas [1], monitoring of people in their own homes [2], and industrial environments [3] are some examples. This leads to different requirements regarding network architecture and sensing node design [4]. According to the Merriam-Webster dictionary, ubiquity is the capacity of being present everywhere or in many places simultaneously. Sensors are needed today in many scenarios, and in all of them it is desirable that they be operative wherever and whenever they are required; for this reason, it is said that future sensors must be ubiquitous. Ubiquity has two faces: spatial ubiquity, which inherently forces wireless communications and the absence of wired power sources, and temporal ubiquity, which implies availability throughout the operating time (maximum energy autonomy) and also availability at any given time. In either case, it leads to the common need to maximize the installation's runtime and, consequently, to minimize the energy demanded by sensing nodes [5]. There are many options to power wireless sensor nodes [6], but a real installation usually poses severe limitations: no unlimited power source is available, energy from the environment is scarce and not enough for continuous running (e.g., indoors), maintenance of sensors is problematic (e.g., nodes are physically hard to reach to change batteries, or doing so is expensive), and so forth. Thus, it is critical to minimize the node's power consumption while maintaining the application's required quality of service. It is well known that power consumption has a high impact on the quality of service offered by a WSN and on its lifetime [3][4][5][7]; this paper is centered on its analysis.
Depending on the deployment scenario, sensor duties will vary: data sensing, processing, aggregation, forwarding, sending, and so forth. In this paper we focus on a common case in many IoT applications: a sensor node periodically samples (every $T_{SAMPLE}$) one or more sensors (temperature, humidity, light, presence, chemical concentration, etc.), and then it performs some data processing and reports the readings to the network every $T_{REPORT}$.
Standard IEEE 1451 describes a set of open, common, network-independent communication interfaces for connecting transducers (sensors or actuators) to microprocessors, instrumentation systems, and control/field networks [8]. IEEE 1451 introduces the concept of a transducer interface module (TIM) as a module that contains the interface, signal conditioning, analog-to-digital and/or digital-to-analog conversion, and, in many cases, the transducer. The specification defines a generic finite state machine (FSM), shown in Figure 1, that describes the operation of sensing nodes (TIMs) with three different operational states: initialization, active, and sleep [9]. IEEE 1451 is not restricted to any communication technology; thus, the FSM definition is generic and leaves the specification of the required substates to each communication standard. There are many WSN protocols available [10]; we select ZigBee for the study because it is a mature wireless standard for sensor networks, accepted worldwide, and supported by many hardware manufacturers. The methodology described could easily be applied to any other standard. According to the standard specification [11], the FSM states are defined as follows (Figure 1).
(1) Initialization State. Besides hardware startup (oscillator warmup, peripheral initialization, etc.), the ZigBee node has to initialize the network. This means checking its network parameters (PANID, the personal area network identifier, and channel mask) and, if it has not previously joined any network, scanning the radio channels to discover available networks, joining a specific network, announcing itself in the network, and, if the network has security enabled, waiting to be authenticated by the Trust Center and to successfully acquire the network key.
(2) Active State. The minimum tasks defined are polling its parent (to check if there are messages pending for the node), responding to any device discovery or service discovery operations requested, periodically requesting the Trust Center to update its network key (if security is enabled), processing device announce messages from other nodes, rejoining the network if disconnected for any reason, searching for an alternative parent in order to optimize recovery latency and reliability, and so forth. Besides these network tasks, the node will also manage the sensors it might have, process and send sensor data, and so forth.
(3) Sleep State. Generically, this state has no network, sensor, or processing duty assigned. It is devoted to powering the electronics down as far as possible and waiting until there is a task to do, at which point the node switches to the active state.
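The three generic states above can be sketched as a small state machine. This is only an illustrative sketch: the names `State`, `next_state`, and `trace`, and the single `pending_task` flag that drives the transitions, are assumptions made for the example and are not taken from IEEE 1451 or the ZigBee specification.

```python
from enum import Enum, auto


class State(Enum):
    """Generic operational states of a sensing node (TIM)."""
    INITIALIZATION = auto()
    ACTIVE = auto()
    SLEEP = auto()


def next_state(current: State, pending_task: bool) -> State:
    """Return the next state of the simplified node FSM.

    pending_task is a hypothetical flag meaning that a sample, poll,
    or report is due.
    """
    if current is State.INITIALIZATION:
        # Hardware and network startup finished: start working.
        return State.ACTIVE
    # From ACTIVE or SLEEP: work while something is due, sleep otherwise.
    return State.ACTIVE if pending_task else State.SLEEP


def trace(pending_flags):
    """Run the FSM over a sequence of flags and return the visited states."""
    state = State.INITIALIZATION
    visited = [state]
    for pending in pending_flags:
        state = next_state(state, pending)
        visited.append(state)
    return visited
```

For instance, `trace([True, False, False, True])` walks through initialization, one active period, two sleep periods, and a wake-up.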
Temporal ubiquity of a wireless sensor node may require that communication with the node be guaranteed with a minimal latency. This is commonly achieved with one of two strategies that keep the power demand of a wireless node low: stay connected and poll the network periodically to receive incoming messages, or leave the network and reconnect periodically. In the ZigBee specification these correspond to the two configurations shown in Figure 1.
(i) Polling configuration indicates that sensor node never leaves the network and periodically polls its "parent" (another node in the network that holds its messages while it sleeps).
(ii) Rejoining configuration indicates that sensor node leaves the network between reporting periods.
Both strategies are considered in the ZigBee standard, but neither is always more convenient than the other; while the first strategy guarantees that the node will receive messages from the network every time it polls, the second strategy reduces radio power consumption between reports to the minimum. The energy required to retrieve data from the sensor and send it to its destination must be as small as possible, and its optimization requires multidisciplinary knowledge: improved electronic stages, network management optimization, cooperative tasking, or other alternatives [12]. It should be approached from a combined perspective [13] that merges network considerations (spatial distribution of network nodes [14], medium access control [15], routing [16], etc.) and node design considerations. Hardware [17] and firmware [18] design of the sensing node is crucial, yet it is usually done in a superficial way, just looking at the power requirements of the different hardware blocks and optimizing firmware [19].
This paper analyzes the energy issues associated with the different design alternatives. The next section presents the main hardware architectures used to build a wireless sensor: single and dual. Then, based on the implementation of the previously described finite state machine, a mathematical model of energy consumption is defined. The energetic impact derived from hardware architecture and runtime pattern is presented in Section 4. Finally, several considerations about how design strategies affect energy consumption are given, together with a performance comparison of different WSN platforms.

Hardware Architecture
The building blocks of a sensor node are power management, sensor, communication, and control and/or processing. Wireless communication is the most power-hungry part of a node [20]; nevertheless, its impact on overall energy demand can be reduced because these systems optimize radio usage as far as possible. In contrast, the power consumption of the sensor is often lower than that of communications, but it can have a larger influence on overall system performance depending on how the node performs the measuring process (sampling rate, signal conditioning, data acquisition, etc.) [21]. As a consequence, the hardware architecture of the node is critical when implementing a real application, and electronic designers must decide between two different architectures.
(1) Dual architecture is composed of a microcontroller (uC) that runs the application and control and a radio module (RM) that implements wireless communication. Depending on the radio module, it can be just a transceiver implementing the lowest ISO/OSI layers of a standard (e.g., TI's CC2420 [22], which is IEEE 802.15.4 compliant) or it can implement a specific wireless standard up to the application level (e.g., Ember's EM260 network coprocessor implementing the ZigBee stack). In both cases the RM is not programmed but is just configured or controlled through Universal Asynchronous Receiver/Transmitter (UART), Serial Peripheral Interface (SPI), or Inter-Integrated Circuit (I2C) protocols [23] using a set of commands provided by the manufacturer.
(2) Single architecture is composed of a system-on-chip (SoC) embedding a radio module and a programmable microcontroller. In this case, the hardware manufacturer provides wireless standard compliance through an API and/or development environment that the programmer uses to implement the application and download it to the SoC (e.g., Ember's 35x with EmberZNet Pro [24] or TI's CC2530 [25] with Z-Stack).
As seen in Figure 2, both architectures can be used to implement a low power consumption end node. Hardware manufacturers are clearly steering toward single architectures in order to maximize energy efficiency, reduce complexity, ease design, and so forth. Nevertheless, is this always true? Under which conditions? Is the strategy of splitting tasks between two low power microcontrollers more convenient in terms of energy efficiency [26]? In order to answer these questions, in the following sections we compare both architectures, analyzing the energy consumption related to each state first theoretically (Section 3) and then with two real implementations.
International Journal of Distributed Sensor Networks

Runtime Energy Consumption Analysis
Energy monitoring during design and commissioning of a wireless sensor network is challenging. Real measurement in specific nodes is possible [27]; nevertheless, WSN characteristics make it difficult to set nodal energy meters all over the network. Thus, it is common to use tools based on node and network models that simulate hardware [28], data traffic [29], and associated energy consumption. Because it is of key importance to understand the origin of every nanoamp in order to achieve the lowest power consumption [30], and because there are no models that consider the architectures described, in the following we study in depth the energy associated with each substate and transition of the sensing node's FSM described in Figure 1. Minimization of energy consumption is a tradeoff between the strategy chosen, the application times between events ($T_{SAMPLE}$, $T_{REPORT}$, and $T_{POLL}$), and the hardware architecture. Accordingly, many authors propose different energy models, most of them differentiating between four silicon modules: microprocessor, transceiver, sensor, and power supply [31]. In this study, as we aim to compare the hardware architectures discussed in the previous section, it is not necessary to consider sensor and power supply models because both will affect the energy balance equally; for example, whichever sensors we use, they will provide either a digital serial communication interface (e.g., SPI or I2C) or an analog signal that will be digitized by the uC or the SoC, respectively.
The estimation of the power consumption of a sensor node is normally based on determining each of the operation modes of the sensor [32]. These modes are highly influenced by the communication protocol and system hardware. In Table 1, we specify all the power modes in which a node will work. Table 2 specifies the power mode in which the hardware (uC, RM, and SoC) of the sensor must be in order to work according to the poll configuration scheme in Figure 1. (We use the poll configuration because it is the most complex scenario; the rejoin configuration eliminates the "poll parent" state, and the PM 0 of the RM will be reduced, while PM 0 → will increase.) The energy necessary to switch between power modes is not negligible, especially when going from low power to high power [33]; thus, it is also indicated in Table 2.
The energy consumed in a given state $i$ is the sum of the energy of its $n$ substates, calculated as

$$E_i = \sum_{j=1}^{n} V_{DD} \int_{0}^{T_j} I_j(t)\,dt,$$

where $V_{DD}$ is the supply voltage and the second term is the integral of the current consumed, $I_j$, during the time $T_j$ that substate $j$ lasts.
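The state energy described above (supply voltage times the time integral of the current, summed over substates) can be evaluated numerically from a sampled current trace. The following is a minimal sketch using the trapezoidal rule; the function names and the representation of a substate as a pair of time and current lists are assumptions made for illustration.

```python
def substate_energy(v_supply, times_s, currents_a):
    """Energy (J) of one substate: V * integral of I(t) dt.

    times_s and currents_a are equally long lists of sample instants (s)
    and measured currents (A); the integral uses the trapezoidal rule.
    """
    charge_c = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        charge_c += 0.5 * (currents_a[k] + currents_a[k - 1]) * dt
    return v_supply * charge_c


def state_energy(v_supply, substates):
    """Energy (J) of a state as the sum over its substates.

    substates is a list of (times_s, currents_a) pairs.
    """
    return sum(substate_energy(v_supply, t, i) for t, i in substates)
```

With a constant 1 mA drawn for 2 s at 3 V, `substate_energy(3.0, [0.0, 2.0], [0.001, 0.001])` evaluates to 6 mJ.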
Considering the substates and the information that can be measured and extracted from hardware datasheets and application notes, the charge demanded by each state is defined in Table 3, where $I^{UC,RM,SoC}_{0,1,2,3}$ is the current consumed by uC, RM, and SoC in power modes 0, 1, 2, and 3, respectively; $Q^{UC,RM,SoC}_{0,1,2,3 \to 0,1,2,3}$ is the charge drained by uC, RM, and SoC in transitions between the corresponding power modes; $T^{UC}_{0,1 \to 1,2,3}$ is the time needed by uC to change from modes 0 and 1 to 1 and 2, respectively; $Q^{RM,SoC}_{INIT,REPORT,POLL}$ is the charge drained by RM and SoC in network initialization, data report, and parent poll; $T^{RM,SoC}_{INIT,REJOIN,REPORT,POLL}$ is the time needed by RM and SoC in the respective network process; $T_{SENSOR}$ is the time needed by the sensing entities to present a valid measurement at their outputs; $I^{UC,SoC}_{ACQ}$ is the current needed by uC and SoC for data acquisition from the sensing entities, for example, A/D conversion; $T^{UC,SoC}_{ACQ}$ is the time needed by uC and SoC for that data acquisition; $I^{UC,RM}_{SCI}$ is the current needed by uC and RM for data communication via the serial communication interface; $T^{SCI}_{REPORT,POLL,POLL\ ANSW}$ are the times needed to communicate between RM and uC via the serial communication interface; and $T_{SLEEP}$ is the time in sleep mode.
As we aim to compare both architectures, many simplifications are possible.
(i) Terms related to network operations ($Q^{RM,SoC}_{INIT,SEND,POLL}$) and power state change ($Q^{RM,SoC}_{0,1,2,3 \to 0,1,2,3}$) are equivalent in terms of energy consumption for RM and SoC. (This assumption is reasonable because an RM and a SoC from the same manufacturer share the same radiofrequency hardware, for example, Texas Instruments' CC2520 transceiver and CC2530 SoC or Ember's EM357 coprocessor and EM357 SoC.)
(ii) Charge needed for network initialization is only consumed once and it is negligible compared to the charge needed by other states and consequently to the charge of the battery (below 0.05% with a 1000 mAh battery).
Table 3 (fragment recovered from the source layout). Substates listed: sense data; activate sensor and wait for data ready; exchange "report data" command (RM → UC); send data to the network (rejoin if needed); exchange "poll response" (RM → UC); change power mode. Surviving charge entries for the "poll response" exchange: $(I^{UC}_{2} + I^{UC}_{SCI} + I^{RM}_{2} + I^{RM}_{SCI}) \times T^{SCI}_{POLL\ ANSW}$ and 0.
(iv) Time in sleep mode is several orders of magnitude larger than any other time.
Considering the former simplifications and the application times between events ($T_{SAMPLE}$, $T_{REPORT}$, and $T_{POLL}$), the resulting energy balance between dual and single architecture for a given cycle is expressed as the charge difference $Q_{CYCLE}^{D-S}$. Thus, when $Q_{CYCLE}^{D-S} < 0$, the dual architecture will be more power efficient than the single architecture, and vice versa when $Q_{CYCLE}^{D-S} > 0$.
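To make the per-cycle balance concrete, the sketch below counts how many sampling and polling events fall within one reporting cycle and adds the sleep charge. It is a simplified model written for illustration, not the paper's expression: the per-event charges (`q_sample`, `q_poll`, `q_report`) and the sleep current are assumed to be lumped values taken from measurements or datasheets, and the function names are hypothetical.

```python
def cycle_charge(t_report, t_sample, t_poll,
                 q_sample, q_poll, q_report, i_sleep):
    """Charge (C) drained in one reporting cycle of length t_report (s)."""
    n_samples = t_report / t_sample      # sensing events per cycle
    n_polls = t_report / t_poll          # parent polls per cycle
    q_events = n_samples * q_sample + n_polls * q_poll + q_report
    # Sleep time is taken as the whole cycle, since active time is
    # assumed to be orders of magnitude shorter.
    q_sleep = i_sleep * t_report
    return q_events + q_sleep


def charge_balance(q_cycle_dual, q_cycle_single):
    """Dual minus single charge per cycle; a negative value favours dual."""
    return q_cycle_dual - q_cycle_single
```

Evaluating `cycle_charge` once with the dual-architecture figures and once with the single-architecture figures, and passing both results to `charge_balance`, gives the sign test described in the text.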

Experimental Method and Results
As mentioned above, there are different WSN simulation tools that focus on specific aspects of the network: latency times, bandwidth, collisions, message integrity, and so forth. According to the analysis in the previous section, we need to focus more deeply on the architecture of the node and its associated states than on the network characteristics. Thus, we used the MATLAB suite to model the energy consumption of real sensing nodes' hardware and to simulate FSM operation. The comparison between architectures has been done by analyzing two real implementations with devices having similar [...] (It is important to remark that the internal RTCC in PM 0 has been selected.) For given conditions and according to the analysis in Section 3, Table 5 shows the charge difference between dual and single architecture ($\%Q^{D-S}$) of each substate, expressed as a percentage contribution to the normalized total consumption per cycle. It highlights the weight of the sleeping and sensing processes in the total energy consumption, evidencing their importance in autonomy maximization. It also shows only slight differences between chipsets, which, together with the fact that more power consumption information is available for the Microchip-Ember configuration, leads us to choose that configuration for further analyses.

Sensing and Reporting.
When focusing on the measurement process, there are two important tasks: data acquisition and reporting. Figure 3 represents how the power savings ratio (PSR) of the dual architecture versus the single architecture (defined as $PSR_{DSvsS} = Q_{CYCLE}^{S-D}/Q_{CYCLE}^{S} \triangleq \Delta Q/Q$) varies depending on $T_{SAMPLE}$, $T_{POLL}$, and $T_{SENSOR}$. Values above zero indicate better performance of the dual architecture, and values below zero indicate better performance of the single architecture.
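The power savings ratio can be computed directly from the two per-cycle charges. The sketch below assumes the sign convention stated in the text (positive values favour the dual architecture), which implies that the saving is measured as single minus dual and normalized by the single-architecture charge; that normalization and the function name are assumptions made for this example.

```python
def psr_dual_vs_single(q_cycle_dual, q_cycle_single):
    """Power savings ratio of dual versus single architecture.

    Positive: dual drains less charge per cycle than single.
    Negative: single is the more efficient option.
    """
    if q_cycle_single <= 0:
        raise ValueError("single-architecture charge must be positive")
    return (q_cycle_single - q_cycle_dual) / q_cycle_single
```

Under this convention, a dual node that drains 90 units of charge where a single node drains 100 has a PSR of 0.10, that is, 10% less energy.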
It can be seen that variation in $T_{POLL}$ has a reduced impact on $PSR_{DSvsS}$. The major effect comes from the variation of the time between measurements ($T_{SAMPLE}$) and the time needed to have a valid sensor signal ($T_{SENSOR}$) [34]; the more time the node spends in sensing tasks, the more effective the dual architecture becomes. This fact is evidenced in Figure 4, where $PSR_{DSvsS}$ is represented versus $T_{SAMPLE}$ for various values of $T_{SENSOR}$. The impact of the measurement process on energy savings can be clearly observed in the following example. Considering a sensor node getting one sample every 100 seconds from a sensor that needs 5 ms to provide a valid value (point A in Figure 3), the dual architecture would need 10% less energy than the single architecture. This effect is mainly derived from the higher flexibility in terms of clock sources of low power microcontrollers, which is so far not available in SoCs (PIC24F16KA102 has five external and internal clock sources, providing 11 different clock modes with a minimum CPU clock speed of 31 kHz. Ember 357 has four clock sources with a minimum CPU clock speed of 6 MHz. The same happens with TI's hardware); that is, microcontrollers offer low power modes with slow clocks (PM 1) that are very convenient for sensing tasks. On the other hand, if $T_{SENSOR}$ is reduced to 500 µs (point B in Figure 3), the single architecture would be 6% more efficient. Finally, when the sampling time $T_{SAMPLE}$ exceeds 5 minutes (point C in Figure 3), for the conditions given ($T_{REPORT}$ = 4 hours; $T_{POLL}$ = 4 min; $T_{SENSOR}$ ≤ 10 ms), the single architecture will always be more efficient.

Rejoining and Polling Strategies.
Regardless of whether the architecture is dual or single, if it is acceptable for the node to be disconnected from the network, a rejoin strategy can be more efficient, depending mainly on the reporting period ($T_{REPORT}$). This basically occurs when the extra consumption of the rejoin process is lower than the accumulated energy consumption of the polls it replaces. Figure 5 compares PSR between rejoining and polling strategies for single and dual architectures.
The intersection of each line with zero (points A in Figure 4) indicates the $T_{REPORT}$ above which the rejoining strategy would be more convenient for the corresponding architecture. The intersection between the red and blue lines (points B in Figure 4) indicates the $T_{REPORT}$ above which the dual architecture is more efficient than the single architecture.
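The break-even reporting period between the two strategies can be estimated with a simple charge balance. The sketch below is an illustrative model rather than the one used to produce the figures: it assumes a lumped charge per rejoin (`q_rejoin`) and per poll (`q_poll`), and optionally different sleep currents for each strategy (for example, if the radio can be fully powered off when rejoining). All names are hypothetical.

```python
def break_even_report_period(q_rejoin, q_poll, t_poll,
                             i_sleep_polling=0.0, i_sleep_rejoining=0.0):
    """Reporting period (s) above which rejoining drains less than polling.

    Polling per period T:   (T / t_poll) * q_poll + i_sleep_polling * T
    Rejoining per period T: q_rejoin + i_sleep_rejoining * T
    Returns None when rejoining never pays off under this model.
    """
    # Charge saved per second by not polling (plus any sleep-current gain).
    saving_rate = q_poll / t_poll + (i_sleep_polling - i_sleep_rejoining)
    if saving_rate <= 0:
        return None
    return q_rejoin / saving_rate
```

Since the rejoin cost is paid once per report while the poll cost accumulates with time, any reporting period longer than the returned value favours rejoining under these assumptions.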
As expected, the energy savings of the rejoining strategy increase with the time between reports, faster at the beginning, until reaching a final stable value. This is because increasing the time between reports decreases the relative impact of $T_{REJOIN}$ on the total. For this same reason, the final PSR is much more affected by the time between polls than by the value of $T_{REJOIN}$.

Sleeping.
As we have seen in Table 5, with any given sampling/polling/reporting conditions, the current in sleep mode is a relevant variable that has a major impact on node lifetime. Thus, it is evident that the primary goal of a low power system is to be in sleep mode as long as possible [35]. Some authors propose adaptive runtime to maximize efficiency [36]. Indeed, it is common to perform nodal power consumption analysis according to the sleeping duty cycle [37]. Given the presented FSM tasks, and considering that sleeping time is several orders of magnitude higher than the time devoted to all other tasks, the charge drained over the node's lifetime can be related to the battery charge $Q_{BATT}$ and to $n$, the number of reports performed by the node during its lifetime.
Dual architecture with low power microcontrollers allows greater versatility to reduce sleep current, due to additional capabilities provided by a microcontroller: ultralow power wakeup with an external capacitor and the possibility of totally powering off the radio module. (Frequently, microcontrollers have external interrupts based on the discharge time of a capacitor; see Microchip AN879, Using the Microchip Ultra Low-power Wake-up Module. Alternatively, high impedance RC external circuits could be used to trigger a low power interrupt. Note that the consumption for charging this capacitor is negligible.) Both architectures can also use an external RTCC to reduce the energy required for timing as far as possible. (For example, low-current RTCs for high-ESR crystals, such as the Maxim DS1341, with I2C communication and one output used to trigger an alarm interrupt of the microcontroller.) Table 6 shows the pros and cons of different sleep mode strategies, the sleep current of the hardware, and the associated PSR of dual architecture versus single architecture.
For polling (node can receive messages) and rejoining (node cannot receive messages) configurations, we considered four sleeping strategies. Using an internal or external RTCC (the latter requires an additional chip) gives the node awareness of clock and calendar and high precision in wakeup timing. This can be useful to build time-synchronized WSNs, to accurately monitor variables, or to timestamp measurements. The internal WDT reduces current consumption but loses timing functionalities. Finally, ultralow power wakeup has the most inaccurate timing (which could still be enough for many applications) but greatly reduces current consumption.
Evidently, the more the silicon modules that can be powered off, the less the power consumption in sleep mode. Thus, due to its higher flexibility, the dual architecture can be very convenient in case the application requirements allow it; it is especially remarkable to note the PSR difference in the rejoining strategy with ultralow power wakeup.

Hardware Architecture Performance Comparison.
In order to gauge the importance of the issues described here, this section provides a hardware architecture performance comparison of well-known WSN platforms [38][39][40]. The methodology followed has been to model the hardware blocks of the platforms according to chip manufacturer specifications and calculate the expected battery lifetime in a realistic scenario. Table 7 shows the life expectancy expressed in years and the ratio compared to the best performing architecture. (Test framework considered: supply = 3 V; internal oscillator, main frequency = 8 MHz, secondary frequency = 1 MHz; external oscillator, crystal frequency = 32.768 kHz; $T_{SAMPLE}$ = 120 s, $T_{POLL}$ = 4 min, $T_{SENSOR}$ = 1 ms; $T_{REPORT}$ = 60 min; battery type = LiMnO2, model = 2032/5004LC, capacity = 210 mAh.) Obviously, it is necessary to consider that older systems are at a disadvantage, as chipset performance improves every year.
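The life expectancy figures of such a comparison follow from dividing the battery capacity by the charge drained per reporting cycle. The sketch below shows that arithmetic under stated assumptions: it ignores battery self-discharge, voltage droop, and temperature effects, and the per-cycle charge used in the usage note is a made-up value, not one taken from Table 7.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def lifetime_years(capacity_mah, q_cycle_c, t_cycle_s):
    """Expected battery lifetime in years.

    capacity_mah: nominal battery capacity in mAh.
    q_cycle_c:    charge drained per reporting cycle in coulombs.
    t_cycle_s:    duration of one reporting cycle in seconds.
    """
    capacity_c = capacity_mah * 3.6          # 1 mAh = 3.6 C
    n_cycles = capacity_c / q_cycle_c        # reports until depletion
    return n_cycles * t_cycle_s / SECONDS_PER_YEAR
```

As a usage example, a hypothetical node averaging 10 µA over a 60 min cycle drains 0.036 C per cycle, so `lifetime_years(210, 0.036, 3600)` gives roughly 2.4 years on a 210 mAh cell.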
According to the results in previous sections, dual architecture is more efficient than the single one for the given conditions. Also, both Texas Instruments and Microchip-Ember provide the highest performance. As sensing duties are not demanding in terms of microcontroller requirements, we can observe the negative effect of oversizing the microcontroller (SunSpot's microcontroller is very powerful) in terms of life expectancy. Also, comparing the performance of platforms sharing the same transceiver (CC2420 and EM357), the influence of the microcontroller chosen is obvious.

Conclusions
WSNs are essential in the next generation of the Internet, where ubiquitous interconnected objects are available for interaction. Ubiquity means everywhere and anytime availability of sensing nodes, implying wireless communication, energy harvesting, low power, and so forth; these are concepts that, if not properly considered, can reduce system autonomy and kill many real IoT applications. With these considerations in mind, low power consumption is one of the most important targets when designing IoT-ready sensors. This paper studies different sensor node hardware architectures, examining in depth the power consumption associated with each state of the runtime cycle and the time relationship between them. It compares the energy consumption involved in the operation of a sensor node implemented using two different architectures: dual (based on a low power microcontroller and a radio module) and single (based on a system on chip). The specific finite state machine that describes the operation of the sensing node is based on standard IEEE 1451, and the specific communications substates are modeled according to the ZigBee Pro standard.
One important conclusion is that the energy required in the sensing procedure has an important impact on this balance. There are some tasks, such as waiting for a valid sensor output ($T_{SENSOR}$) or acquiring the sensor data, which might require a relevant amount of energy depending on the sampling rate ($T_{SAMPLE}$). This can make the dual architecture more efficient than the single one. One reason is that low power microcontrollers in the dual architecture are more flexible than SoCs in terms of low power oscillator configurations; microcontrollers embedded in SoCs are usually not able to run with kHz oscillators. The second reason is that low power microcontroller peripherals are more optimized, something which can be especially relevant when using analog sensors that require an analog-digital converter (the same performance in terms of conversion quality requires less current and time in a low power microcontroller than in a SoC).
Considering temporal ubiquity requirements, if the IoT application does not require nodal availability at any time (for example, to change sampling parameters), nodes can disconnect from the WSN. In the case of the ZigBee standard, the choice is between the rejoining and polling strategies. When the energy needed to rejoin exceeds the consumption of the polls it replaces, the polling strategy turns out to be more energy efficient. The analysis also shows that, above a certain reporting period, the dual architecture is more efficient because the rejoining strategy allows the radio module to be totally powered off when it is not in use.
Power consumption in sleep mode has major impact on node lifetime, so there is a need to design a system with a current in sleep mode as low as possible. Again, dual architecture might be more convenient because low power microcontrollers are more flexible in terms of oscillator configuration and have additional low power modules such as ultralow-power wake-up module.
The main conclusion of the study evidences that, despite what could be considered initially and stated in datasheets, no architecture is always energetically more efficient than the other; deep contextualized system analysis is mandatory to squeeze batteries to the maximum. This paper provides generic guidelines that would help electronic designers in this analysis in order to decide the most energy efficient hardware architecture of sensor nodes. We also find it useful for firmware and even software developers in order to provide understanding about how IoT application requirements (e.g., reporting time) affect WSN performance and lifetime. Finally, a performance comparison of different WSN platforms attending to their hardware architecture evidences the impact of the issues just stated.
As a final example that makes the importance of the analysis clear, consider a node that polls for data every 4 min, samples every minute a sensor that needs 5 ms to set up, and reports data every 4 hours. Implemented with a dual architecture, it would need 24% less energy than implemented with a SoC. Just changing the sampling period from 1 minute to 5 minutes would reverse the situation, making the dual architecture consume 6% more energy than the single architecture.