Power-Management Strategies in sEMG Wireless Body Sensor Networks Based on Computation Allocations: A Case Study for Fatigue Assessments

Surface ElectroMyoGraphy (sEMG) is used in a wide variety of applications. Managing the power consumption of battery-constrained sEMG Wireless Body Sensor Networks (WBSN) is an important topic. In this paper, we use fatigue assessments as a case study. We apply the concept of distributed computing to explore the impact of computation allocations on the client power consumption and on the required architecture specifications. Regarding the CPU clock rate, we propose a power-saving method based on the ping-pong buffer mechanism and evaluate all the crucial factors that affect the power consumption, such as sEMG sample rates, algorithmic computational costs, wireless throughputs, and the selection of wireless technologies. To sum up, we conduct a comprehensive analysis of all possible distributed computing architectures of WBSN to determine the lowest-power WBSN architecture. The results show that the implementation based on the lowest-power WBSN architecture consumes less power than other hardware. The average current of the proposed architecture is 81.7% lower than that of the previous work. In addition, the battery life is 4.48 times that of the previous work under a continuous wireless connection with the same 300mAh lithium battery. Compared with the commercial device, the battery life is 1.6 times that of the commercial one.


I. INTRODUCTION
A. OVERVIEW OF sEMG APPLICATION
An ElectroMyoGraphy (EMG) signal is an electrophysiological signal that is produced by muscle contraction and propagates from the muscles to the detection points on the skin's surface; surface ElectroMyoGraphy (sEMG) is a noninvasive method of recording EMG signals. sEMG signals are collected from the surface of the skin and are often used to evaluate muscle functions and muscle activities [1]. sEMG is used in a wide variety of applications. In clinical and biomedical diagnosis, EMG signals are used to assess the status of muscles; they provide essential information about the functional aspects of the Motor Units (MUs) to assist in neuromuscular diagnosis [2]-[4]. Reference [5] summarizes the overall structure of pattern recognition schemes for EMG-based prosthetic systems and discusses their real-time use on an amputee's upper limbs. In the sporting environment, EMG has been applied to long-distance runners to measure the fatigue of the rectus femoris muscle [6], [7]. Furthermore, [8] measures neck fatigue with an sEMG sensor system to prevent turtle neck syndrome; this system can warn employees who work very long hours to seek medical treatment earlier. Several previous works [1], [9], [10] have also developed fatigue analysis algorithms based on EMG signals.
The associate editor coordinating the review of this manuscript and approving it for publication was Yue Zhang.

B. ARCHITECTURE DESIGN OF WIRELESS BODY SENSOR NETWORK
A Wireless Body Sensor Network (WBSN) is an integrated system used to analyze the physiological information of a person. A WBSN consists of a client and a server. The client comprises four types of independent devices (sensor nodes, an actuator node, a wireless device, and a power unit), and the server is a personal device. Based on how the data collected from one or more sensor nodes are processed, [11] identifies three main types of WBSN architectures: managed WBSN, autonomous WBSN, and intelligent WBSN.
A managed WBSN is a network in which the collected data are analyzed by a personal device, which can be a smartphone or a personal computer, as illustrated in Figure 1. Because all signals are processed by the third party, the required performance and power consumption of the actuator hardware are lower. In a real-time application, however, a high-throughput wireless technology is required, which increases the power consumption of wireless transmission.
An autonomous WBSN has the same purpose as a managed WBSN, but the processing flow is different. In an autonomous WBSN, the actuator collects the data from the sensor nodes and analyzes them without waiting for any third-party decision, as illustrated in Figure 2. Since the data are processed completely in the actuator, the wireless device only needs to transmit a small amount of information to a personal device. A low-power, low-throughput wireless technology can therefore be selected, but the WBSN system requires high-performance actuator hardware.
The optimal design is an intelligent WBSN that is a combination of the above architectures. If situations are simple, analyses are done on their own by the actuator node; but if they are complex, then the data are sent to the third party for further processing.
Managing the power consumption of a battery-constrained sEMG WBSN is an important topic. In this paper, we use fatigue assessment as a case study. We apply the concept of distributed computing [12] to explore the impact of computation allocations on client power consumption and the requirements of architecture specifications. The proposed distributed computing WBSN architecture is illustrated in Figure 3. The key contributions are summarized as follows.
• We divide the fatigue assessment algorithm into two parts for distributed computing: a pre-processing algorithm and a post-processing algorithm. The pre-processing algorithm runs on an actuator, while the post-processing algorithm runs on a personal device.
• We evaluate all the crucial factors that affect the power consumption in order to design a low-power architecture. These factors include EMG sample rates, algorithmic computational costs, wireless throughputs, and the selection of wireless technologies.
• We propose a CPU clock rate minimization method based on a ping-pong buffer as the memory architecture.

A. THE FLOW OF FATIGUE ASSESSMENT ALGORITHM
The entire flow of the fatigue assessment algorithm is illustrated in Figure 4. A typical EMG signal processing flow includes noise filtration, feature extraction, and feature detection [13], [14]. The noise filtration includes a band-pass filter (described in Section II-A1), the Hamming window function (described in Section II-A2), and the energetic compensation in power spectrum density (described in Section II-A4). In the fatigue assessment application, the feature extraction is the frequency-domain transformation (described in Section II-A3), and the feature detection uses the median frequency algorithm (described in Section II-A5).

1) BAND-PASS FILTER
There are two main sources of motion artifacts [15], [16]: one is the interface between the detection surface of the electrode and the skin, and the other is the movement of the cable connecting the electrode to the amplifier. The energy of these noise sources is distributed from 0 to 20Hz. In addition, the aliasing effect [17] introduces spurious frequency components when the sample rate does not satisfy the Nyquist criterion, that is, when it is less than twice the highest frequency in the signal. To avoid these effects, we employ a sixth-order digital Butterworth Infinite Impulse Response (IIR) band-pass filter. The passband ranges from 40Hz to 450Hz. Although FIR-based filters designed with the least-square strategy avoid arithmetic divisions, they need a higher order to achieve the same attenuation slope as the IIR filter. Therefore, the IIR filter has the advantage of lower power consumption and is chosen as the noise filtration method. The equation of the IIR filter is shown in Equation 1.
where K is the feedforward filter order, b_k is the feedforward filter coefficient, L is the feedback filter order, a_l is the feedback filter coefficient, x[n] is the input signal, and y[n] is the output signal. Figure 5 shows an example of a band-pass filtered sEMG signal.
VOLUME 8, 2020
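The recursion described by these symbols can be sketched in a few lines of Python. The form assumed here is the common direct-form difference equation, a_0 y[n] = sum over k of b_k x[n-k] minus sum over l >= 1 of a_l y[n-l]; the sign convention and the a_0 normalization are assumptions, and `iir_filter` is a hypothetical name. The coefficients in the usage note are illustrative only and are not the sixth-order Butterworth coefficients of the actual design.

```python
def iir_filter(b, a, x):
    """Apply a direct-form IIR difference equation to the sequence x.

    b: feedforward coefficients b_0..b_K
    a: feedback coefficients a_0..a_L (a_0 must be non-zero)
    Assumed convention: a_0*y[n] = sum_k b_k*x[n-k] - sum_{l>=1} a_l*y[n-l]
    """
    if not a or a[0] == 0:
        raise ValueError("a[0] must be non-zero")
    y = []
    for n in range(len(x)):
        acc = 0.0
        # Feedforward part: weighted sum of the current and past inputs.
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        # Feedback part: weighted sum of the past outputs.
        for l in range(1, len(a)):
            if n - l >= 0:
                acc -= a[l] * y[n - l]
        # The division by a_0 is the arithmetic division an FIR filter avoids.
        y.append(acc / a[0])
    return y
```

For example, `iir_filter([0.5], [1.0, -0.5], [1.0, 0.0, 0.0])` returns `[0.5, 0.25, 0.125]`, the decaying impulse response of a one-pole low-pass filter.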

2) HAMMING WINDOW
After the data are processed by spectral analysis, the frequency spectrum leaks into the neighboring frequencies and causes severe distortion when the waveform contains frequencies that are not integer multiples of the frequency resolution, i.e., the sampling frequency divided by the number of samples. This effect is called "spectral leakage" [18], [19]. The leakage could be greatly mitigated if the sampling rate were carefully selected. However, the EMG signal consists of many frequency components, which makes it hard to avoid the effect by using a power-of-two sampling frequency. Moreover, the data are always processed by finite-length sampling in any practical application. This is equivalent to multiplying the data by a rectangular window function, which convolves the spectrum with a sinc function. The role of data windowing is to reduce the artificial high frequencies caused by finite-length sampling. In the field of audio signal processing, the Hamming window is a common way to reduce this effect. The equation of the Hamming window function is shown in Equation 2.
where N is the window length, x[n] is the input signal, and y[n] is the output signal. Figure 6 shows an example of an sEMG signal processed by the Hamming window.
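A minimal Python sketch of this windowing step follows. It assumes the textbook symmetric Hamming weights, 0.54 - 0.46 cos(2πn/(N-1)); the exact constants and denominator used in the original equation are not confirmed by the text, and `hamming_window` is a hypothetical name.

```python
import math


def hamming_window(x):
    """Return y[n] = x[n] * (0.54 - 0.46*cos(2*pi*n/(N-1))) for a frame x."""
    n_samples = len(x)
    if n_samples < 2:
        return list(x)  # a frame of length 0 or 1 is left unchanged
    return [
        x[n] * (0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1)))
        for n in range(n_samples)
    ]
```

With these weights the samples at both ends of the frame are scaled to about 8% of their value while the centre is kept almost unchanged, which tapers the abrupt edges that a rectangular window would leave.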

3) FAST FOURIER TRANSFORM
In order to decompose the frequency components of an sEMG signal, a frequency-domain transformation must be used. The time complexity of the Fast Fourier Transform (FFT) algorithm is O(N log N), where N is the data size. This algorithm is the computational bottleneck in the entire flow. We perform the frequency-domain transformation on stage 1, stage 2, and stage 3 as shown in Figure 7.
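To make the O(N log N) structure concrete, the following Python sketch implements a recursive radix-2 decimation-in-time FFT. It is an illustration under the assumption that the frame length is a power of two, not the optimized routine an MCU implementation would use; `fft_radix2` is a hypothetical name.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into even- and odd-indexed halves and transform each recursively.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out
```

Each level of recursion halves the problem and performs N/2 butterflies, which gives log2(N) levels and the O(N log N) total cost.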

4) THE ENERGETIC COMPENSATION IN POWER SPECTRUM DENSITY
The main source of ambient noise is electromagnetic radiation [20] at around 60Hz, as shown in Equation 3a.
The amplitude of the ambient noise is sometimes greater than that of the desired EMG signal, by a factor of about 1 to 3. Since the surface of the human body is constantly exposed to electromagnetic radiation, this exposure is hard to avoid anywhere on the surface of the earth. In the previous work, a 60Hz notch filter was used, but it distorted the original frequency spectrum and required more computation. The Power Spectrum Density (PSD) [21], in contrast, describes how the power of a signal is distributed over its frequency components. Therefore, we perform the 60Hz energetic compensation in the PSD to reduce the radiation effect picked up by the sEMG sensors, as shown in Equation 3. The steps are as follows:
1) Convert the total data to the PSD domain.
2) Calculate the average value P_average from the surrounding intensity of 10 frequencies (shown in Equation 3b).
3) Assign the average value to the frequencies from 57Hz to 63Hz (shown in Equation 3c).
where P(f) is the PSD of the signal. Figure 8 shows an example of an sEMG signal processed by the energetic compensation in PSD when the muscle is in a relaxed state.
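The three steps can be sketched in Python as follows. The text does not say which 10 surrounding frequencies are averaged, so this sketch assumes the 5 PSD bins just below 57Hz and the 5 bins just above 63Hz; that split, the function name `compensate_60hz`, and its parameters are all assumptions made for illustration.

```python
def compensate_60hz(psd, freq_resolution_hz, band=(57.0, 63.0), neighbors_per_side=5):
    """Replace the PSD bins inside `band` by the mean of the bins just outside it.

    psd: one-sided PSD values; psd[i] belongs to frequency i * freq_resolution_hz
    neighbors_per_side: assumed split of the surrounding bins (5 below, 5 above)
    """
    if neighbors_per_side < 1:
        raise ValueError("neighbors_per_side must be at least 1")
    below = [i for i in range(len(psd)) if i * freq_resolution_hz < band[0]]
    above = [i for i in range(len(psd)) if i * freq_resolution_hz > band[1]]
    surrounding = below[-neighbors_per_side:] + above[:neighbors_per_side]
    if not surrounding:
        return list(psd)  # nothing to average, leave the PSD untouched
    # Step 2: average intensity of the surrounding frequencies.
    p_average = sum(psd[i] for i in surrounding) / len(surrounding)
    # Step 3: assign the average to every bin from 57 Hz to 63 Hz.
    out = list(psd)
    for i in range(len(psd)):
        if band[0] <= i * freq_resolution_hz <= band[1]:
            out[i] = p_average
    return out
```

Unlike a notch filter, this operates on a spectrum that has already been computed, so it adds only one averaging pass and a few assignments per frame.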

5) MEDIAN FREQUENCY
During the fatiguing contractions, the PSD of an analyzed sEMG signal shifts toward lower frequencies. The shift can be described as a compression of the spectrum. The most common parameters that would be enough to represent the compression are mean and median frequencies [10], [22]. Compared with mean frequency, the median frequency has the advantage of less sensitivity to noise [23]. Therefore, we use median frequency to indicate fatigue as shown in Equation 4. Thanks to the PSD conversion finished by the former algorithm, the part of the PSD conversion in the median frequency algorithm is omitted.
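In its usual form, which is assumed here because it matches the symbols defined next, the median frequency is the frequency that splits the PSD into two regions of equal power:

```latex
\sum_{f=0}^{f_{med}} P(f) \;=\; \sum_{f=f_{med}}^{f_s/2} P(f) \;=\; \frac{1}{2} \sum_{f=0}^{f_s/2} P(f)
```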
where f_med is the median frequency, f_s is the sampling frequency, and P(f) is the PSD of the signal. Figure 9 illustrates an example of an sEMG signal analyzed by the median frequency.
[...] in poor temporal and amplitude of the signal. The high-frequency components of the sEMG signal have been reported to be around 400-500Hz [25]; thus the recommended ADC sample rate is from 800Hz to 1000Hz [26]. To ensure that there is no information loss, 1000Hz is set as the maximum sEMG sample rate.

2) DATA QUANTITY OF EACH STAGE IN THE FATIGUE ASSESSMENT FLOW
As aforementioned, we set the sEMG sample rate to 1000Hz. Due to the requirement of the FFT algorithm, the frame length is set to 8.192 seconds, which gives 8192 samples per frame, a power of two. In this section, we analyze the data quantity of each stage, as illustrated in Figure 4. Stage 1 is the sEMG raw data; Stage 2 is the data after the band-pass filter; Stage 3 is the data after the Hamming window function; and Stage 4 is the data after the frequency-domain transformation. The amount of data in stages 1-3 is shown in Equation 5:
$$x_{1,2,3} = \alpha \beta \gamma \qquad (5)$$
where x_{1,2,3} is the total amount of the processed data in stages 1-3, α is the sample rate of the ADC, β is the processing frame time, and γ is the data precision. After the FFT, the original time-series data are transformed into frequency-domain data. Due to the mirroring of the frequency spectrum, the valid amount of data in stages 4-5 is half of the amount of data in stages 1-3, as shown in Equation 6:
$$x_{4,5} = \frac{\alpha \beta \gamma}{2} \qquad (6)$$
where x_{4,5} is the total amount of the processed data in stages 4-5, α is the sample rate of the ADC, β is the processing frame time, and γ is the data precision. The amount of data in stage 6 is shown in Equation 7:
$$x_6 = \gamma \qquad (7)$$
where x_6 is the total amount of the processed data in stage 6, and γ is the data precision. Therefore, the amount of data in stages 1-3 (x_{1,2,3}) is 16.384 kilobytes; the amount of data in stages 4-5 (x_{4,5}) is 8.192 kilobytes; and the amount of data in stage 6 (x_6) is 2 bytes, as shown in Table 1.
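These stage sizes follow directly from the sample rate, the frame length, and the data precision. The short Python check below assumes a precision of 2 bytes per value (implied by the 2-byte size of stage 6) and 1 kilobyte = 1000 bytes; `stage_data_bytes` is a hypothetical helper name.

```python
def stage_data_bytes(sample_rate_hz, frame_seconds, precision_bytes):
    """Return the per-frame data quantity in bytes for stages 1-3, 4-5, and 6."""
    samples = round(sample_rate_hz * frame_seconds)  # alpha * beta
    x_123 = samples * precision_bytes                # time-domain samples
    x_45 = x_123 // 2                                # one-sided spectrum after the FFT
    x_6 = precision_bytes                            # a single median-frequency value
    return {"stages_1_3": x_123, "stages_4_5": x_45, "stage_6": x_6}


sizes = stage_data_bytes(sample_rate_hz=1000, frame_seconds=8.192, precision_bytes=2)
```

Here `sizes` holds 16384, 8192, and 2 bytes, matching the 16.384 kilobytes, 8.192 kilobytes, and 2 bytes stated above.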

3) COMPARISON OF WIRELESS TRANSMISSION TECHNOLOGIES
For wearable applications, the sEMG data can be transmitted by wireless technologies. 2G/3G links are costly and power-hungry, and the data rates of LoRa and Zigbee are too low for high-sample-rate systems. Bluetooth Low Energy (BLE) and Wi-Fi are the most common low-cost wireless transmission technologies in daily life and are built into mobile phones, laptops, and so on. BLE has much lower power consumption than Wi-Fi. However, the data rate of Wi-Fi is sufficient to carry the EMG raw data, whereas BLE cannot carry the raw data but can carry the data processed by the DSP algorithms. In Section II-B2, we evaluated the data quantity of the processed data in each stage; the wireless technology is chosen according to the resulting data throughput. For Wi-Fi, taking the ESP8266EX [27] as an example, the data transmission speed reaches 4.5 megabytes per second and the average current is 80 mA. For BLE, taking the HC-08 [28] as an example, mode 1 is mainly used to reduce the power consumption; the current is 1.6 mA and the data transmission speed reaches 2 kilobytes per second.

4) CPU CLOCK RATE MINIMIZATION METHOD BASED ON A PING-PONG BUFFER AS THE MEMORY ARCHITECTURE
a: THE RELATIONSHIP BETWEEN THE CPU CLOCK RATE AND ACTUATOR POWER CONSUMPTION
The dynamic power [29] consumed by a CPU is approximately proportional to the CPU clock rate and the square of the CPU voltage as shown in Equation 8.
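In the standard CMOS dynamic-power model, which is assumed here with the symbols defined next, this relation is written as:

```latex
P_d = C \, V^{2} f
```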
where P_d is the dynamic power, C is the switched load capacitance, V is the supply voltage, and f is the adjustable CPU clock rate. It is therefore essential to lower the CPU frequency as far as possible while still finishing the data analysis in time.

b: IMPROVEMENT OF DATA PROCESSING EFFICIENCY BASED ON PING-PONG BUFFER
Most MCU applications are organized as a foreground and background system. The architecture consists of two main parts. The foreground is the Interrupt Service Routine (ISR), which switches the buffer destination when the ADC hardware finishes acquiring. The background is an infinite loop that uses all remaining CPU cycles to perform the sEMG signal processing algorithm. The common design of processing allocation is to wait for the ADC hardware to finish acquiring data and then analyze them in the background, as shown in Figure 10. However, this causes data loss, poor system utilization, and wasted client power. Therefore, we embedded a ping-pong buffer [30]-[33] mechanism in the memory architecture. It results in two possible situations:
1) The data acquisition time is shorter than the DSP time. This means that the MCU has insufficient computational resources. The acquisition hardware goes idle and some data are lost, as shown in Figure 11.
2) The data acquisition time is longer than the DSP time. Owing to the sufficient computational resources, the data analysis is completed early. The background then spins in an infinite loop, which continuously wastes CPU power, as shown in Figure 12.
To improve these two situations, we adjust the clock rate to balance the signal processing time and the data acquisition time, which minimizes the power consumption of the actuator while processing twice as much data in the same period. The improved timing diagram is illustrated in Figure 13.
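The balancing idea can be sketched in Python. The model below assumes that the DSP work for one frame costs a fixed number of CPU cycles, so the DSP time is cycles divided by clock rate; the class and function names, and the cycle counts used in any call, are hypothetical and do not come from the measured system.

```python
import math


class PingPongBuffer:
    """Two equally sized buffers: one is filled while the other is processed."""

    def __init__(self, frame_samples):
        self.buffers = [[0] * frame_samples, [0] * frame_samples]
        self.fill_index = 0  # buffer currently written by the acquisition side

    def swap(self):
        """Called when a frame has been acquired (the ISR's job in the text)."""
        ready = self.buffers[self.fill_index]
        self.fill_index ^= 1  # acquisition continues in the other buffer
        return ready          # the background loop processes this buffer


def min_clock_hz(cycles_per_frame, frame_seconds):
    """Lowest clock rate at which processing a frame takes exactly one frame time."""
    return cycles_per_frame / frame_seconds


def classify(clock_hz, cycles_per_frame, frame_seconds):
    """Compare the DSP time with the acquisition time for a given clock rate."""
    dsp_seconds = cycles_per_frame / clock_hz
    if math.isclose(dsp_seconds, frame_seconds):
        return "balanced"
    if dsp_seconds > frame_seconds:
        return "data loss"   # situation 1: DSP is slower than acquisition
    return "idle waste"      # situation 2: CPU spins after finishing early
```

Running the CPU at `min_clock_hz(...)` is the balanced point: any slower and frames are dropped, any faster and the background loop burns cycles while waiting for the next buffer swap.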

c: IMPROVED SYSTEM FLOW WITH A PING-PONG BUFFER
A Petri net [34], [35] is a mathematical modeling language for the description of distributed systems, and its representation is suitable for embedded systems [36]-[38]. Our Petri net representation shows how double buffering works on our proposed WBSN architecture, as illustrated in Figure 14.

5) COMPREHENSIVE ANALYSIS IN DIFFERENT COMPUTATION ALLOCATIONS
Based on the proposed clock rate minimization method, executing different computations on an MCU may result in different power consumption. We use the algorithms described in Section II-A as components and divide all algorithms into two parts for distributed computing: the pre-processing algorithm and the post-processing algorithm. The pre-processing algorithm is performed on the MCU, while the post-processing algorithm is performed on the personal device. In this section, we list all possible architectures for the different algorithm allocations, as shown in Figure 15. Moreover, we combine the above factors (described in Sections II-B1, II-B2, and II-B3) to design all possible architectures and apply the proposed clock rate minimization method to determine the lowest CPU frequency in order to investigate power consumption. The required wireless throughput is calculated by dividing the data quantity of the stage by the frame length. The wireless technology is selected as the wireless device that satisfies the required wireless throughput and has the lowest [...] Table 2. The power distribution in the WBSN client indicates that the selection of wireless devices dominates the total power consumption of the system, as shown in Figure 16. After the above evaluation process, the total current of the case 6 architecture is found to be the lowest. Therefore, the case 6 architecture is chosen to implement the whole system.
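The throughput rule and the device choice can be sketched in Python. The device figures are those quoted in Section II-B3, with 1 kilobyte taken as 1000 bytes. The `headroom` factor of 1.2 is a hypothetical safety margin introduced only for this sketch (the text does not state one), and all function names are illustrative.

```python
def required_throughput(stage_bytes, frame_seconds):
    """Required wireless throughput in bytes per second for one stage."""
    return stage_bytes / frame_seconds


def select_wireless(required_bps, devices, headroom=1.2):
    """Pick the lowest-current device whose data rate covers the requirement.

    devices: list of (name, max_bytes_per_second, average_current_ma)
    headroom: hypothetical margin multiplied onto the requirement
    Returns None when no device qualifies.
    """
    usable = [d for d in devices if d[1] >= required_bps * headroom]
    return min(usable, key=lambda d: d[2]) if usable else None


# Figures quoted in Section II-B3; 1 kilobyte = 1000 bytes is an assumption.
DEVICES = [("Wi-Fi (ESP8266EX)", 4_500_000, 80.0), ("BLE (HC-08)", 2_000, 1.6)]

raw_choice = select_wireless(required_throughput(16384, 8.192), DEVICES)
spectrum_choice = select_wireless(required_throughput(8192, 8.192), DEVICES)
result_choice = select_wireless(required_throughput(2, 8.192), DEVICES)
```

Under these assumptions the sketch picks Wi-Fi when the raw or filtered data (about 2000 bytes per second) must be sent, and BLE once the data have been reduced to the spectrum or to the single fatigue value, which is in line with the statement that BLE cannot carry the raw data.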

III. IMPLEMENTATION
Based on the above evaluation results, we design the WBSN client architecture as illustrated in Figure 17. The PCB implementation is shown in Figure 18.

A. MyoWare™ MUSCLE SENSOR
The EMG signals are quasi-random in nature [39]. Before amplification, the amplitude of EMG is 0-10 mV [1]. The EMG signals are mixed with noise as they propagate through different tissues. Without a myoelectric sensor, the original analog myoelectric signals are too weak to be directly converted to digital signals by an MCU. The muscle sensor amplifies the analog sEMG signals conducted from the electrodes and outputs band-limited signals with a suitable amplitude for digitization. For these reasons, the MyoWare muscle sensor [40] is chosen as the myoelectric sensor in this system implementation, as shown in Figure 19.

B. ACTUATOR
The aim of the actuator is to acquire and analyze data. Yang et al. [41] have shown that an MCU is able to acquire sEMG signals. Therefore, the STM32H743 [42] is chosen as the main processing chip. In order to apply the method described in Section II-B4, the main memory is allocated for two buffers. The sEMG signals are digitized with the built-in ADC hardware. After the digital data have been acquired, the ADC function triggers the Direct Memory Access (DMA) engine to move the data into the ping-pong buffer without CPU intervention, as illustrated in Figure 14. Meanwhile, the background is an infinite loop that performs all five algorithms (described in Section II-A) and transmits the fatigue result to the BLE module via the Universal Asynchronous Receiver/Transmitter (UART) interface. Then, the BLE module transmits the processed data to the smartphone.

C. POWER UNIT
The whole system uses a 300 mAh lithium battery as the main power source. The power unit consists of two circuits: a power supply circuit and a power charging circuit. These two circuits are controlled by a power switch. The power supply circuit comprises a DC-DC converter [43] that converts the battery voltage to 3.3 V and supplies the whole system. The power charging circuit includes a linear Li-ion battery charger [44] that supplies a fixed current to the lithium battery while charging through a micro-USB interface.
Table 3 compares the proposed design with other architectures. Both the commercial device [45] and [41] support only one sEMG channel. The commercial device [45] supports a band-pass filter for signal pre-processing, whereas the proposed architecture supports four sEMG input channels and several DSP algorithms: band-pass filter, Hamming window, FFT, PSD compensation, and median frequency. The functionality of both [41] and our proposed architecture can be modified by updating the firmware, since both are implemented on an MCU. Unlike the commercial device [45] and [41], the proposed design reserves four sensor connectors, which allow users to customize the number of EMG channels for most applications. In addition, a high-performance MCU gives users greater flexibility to customize the allocation of computing capacity according to different computing requirements. In other words, the client can take on more computations, or take on fewer computations so that the CPU clock frequency can be lowered to save power. In terms of wireless transmission technologies, the commercial device [45] uses the same BLE technology as the proposed architecture, but [41] uses a higher-power Wi-Fi device. Besides, the moderated CPU clock rate and the selection of the wireless device reduce the power consumption, so that the battery lasts for 11.2 hours.
The average current of the proposed architecture can be reduced by 81.7% compared with [41]. Furthermore, the battery life is 4.48 times that of [41] under the continuous wireless connection equipped with the same 300mAh lithium battery. Compared with the commercial device [45], our proposed system reduces the power consumption by 3.5%, and the battery life is 1.6 times that of the commercial device [45].
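As a back-of-the-envelope aid for reading these figures, battery life and average current are related by the battery capacity. The Python sketch below assumes an ideal battery with no converter losses or capacity derating, so its outputs are rough estimates rather than measured values; both function names are hypothetical.

```python
def battery_life_hours(capacity_mah, average_current_ma):
    """Ideal battery life: capacity divided by the average current draw."""
    if average_current_ma <= 0:
        raise ValueError("average current must be positive")
    return capacity_mah / average_current_ma


def implied_average_current_ma(capacity_mah, life_hours):
    """Average current that would drain the battery in the given time."""
    if life_hours <= 0:
        raise ValueError("battery life must be positive")
    return capacity_mah / life_hours


# 300 mAh battery lasting 11.2 hours, as reported for the proposed design.
estimate_ma = implied_average_current_ma(300, 11.2)
```

Under the ideal-battery assumption, `estimate_ma` is roughly 26.8 mA; real measurements can differ because of regulator efficiency and the usable fraction of the rated capacity.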

V. CONCLUSION
This paper devises general design strategies for developing a low-power WBSN architecture based on algorithm allocations. We determined the wireless technology by investigating the required wireless throughput. Regarding the CPU clock rate, we proposed a minimization method based on a ping-pong buffer as the memory architecture. In this paper, we used fatigue assessment as a case study and designed our own digital signal processing flow with five algorithms. To sum up, we conducted a comprehensive analysis of all possible distributed computing architectures of WBSN to determine the lowest-power WBSN architecture. The results showed that the implementation based on the lowest-power WBSN architecture has the lowest power consumption compared with other hardware. In the future, the proposed power-management strategies could serve as templates for other sEMG applications.