Application of data fusion techniques and technologies for wearable health monitoring

Technological advances in sensors and communications have enabled discreet integration into everyday objects, both in the home and about the person. Information gathered by monitoring physiological, behavioural, and social aspects of our lives can be used to achieve a positive impact on quality of life, health, and well-being. Wearable sensors are at the cusp of becoming truly pervasive, and could be woven into the clothes and accessories that we wear such that they become ubiquitous and transparent. To interpret the complex multidimensional information provided by these sensors, data fusion techniques are employed to provide a meaningful representation of the sensor outputs. This paper is intended to provide a short overview of data fusion techniques and algorithms that can be used to interpret wearable sensor data in the context of health monitoring applications. The application of these techniques is then described in the context of healthcare including activity and ambulatory monitoring, gait analysis, fall detection, and biometric monitoring. A snapshot of current commercially available sensors is also provided, focusing on their sensing capability, together with a commentary on the gaps that need to be bridged to bring research to market.


Introduction
Many countries, including the United Kingdom, have an ageing population, with an increase in the average age and proportion of older people [1]. In 2010, there were approximately 10 million people over the age of 65 in the United Kingdom, with this number projected to rise by over 50% by 2020 [2]. One consequence of the ageing population is an increase in life expectancy implying greater healthcare needs. However, the relationship between age and dependency is complicated and not determined by age alone. Indeed, the risk factor profile of those born more recently is worse than previous generations [3]. This can be attributed, in part, to the link between economic development and increased risky behaviours [4]. Risk factors such as tobacco and alcohol use, inactivity, and poor diet choices are associated with chronic diseases including obesity, cardiovascular disease, and diabetes [4].
Recent advances in wearable technology including microelectromechanical (MEM) devices, physiological sensors, low-power wireless communications, and energy harvesting, have set the stage for a significant change in health monitoring. Technology can be discreetly worn and used as a means to monitor health and potentially enable older adults to live safely and independently at home. Early detection of key health risk factors enables more effective interventions to reduce the impact of, or even avoid, serious or chronic illness. Inertial measurement devices, such as accelerometers, represent a range of sensors that can be used for healthcare monitoring and are being extensively investigated for the monitoring of human movement [5] and daily activity [6]. Another application for wearable systems is rehabilitation [7]. There are also currently many systems commercially available for the monitoring of sports and some aspects of health.
The richness of data available using wearable sensors presents challenges in the way that it is processed to provide accurate and relevant outputs. To fully exploit this data for the purposes of healthcare monitoring, data fusion techniques can be employed to make inferences and improve the accuracy of the output. Hall and Llinas [8] provide a detailed introduction to, and discussion of, multisensor data fusion. A review of data fusion techniques is also provided by Castanedo [9], including the different categories of data fusion techniques. With a focus on body sensor networks, Fortino et al. [10] discuss wearable multisensor fusion with an emphasis on collaborative computing.
This paper introduces wearable sensors for human monitoring in the context of health and well-being, including a snapshot of current commercial wearable sensor systems. An overview of data fusion techniques and algorithms is offered, including data fusion architecture, feature selection, and inference algorithms. These are put into the context of wearable technology for healthcare applications including activity recognition, falls detection, gait and ambulation, biomechanical modelling, and physiological sensing. Related challenges of data fusion for healthcare are presented and discussed.

ARTICLE IN PRESS
JID: JJBE [m5G; February 22, 2017; 19:10]

Wearable sensors
Wearable sensors can be considered in three categories: motion, biometric, and environmental sensors. Sensors used to capture human motions include inertial sensors such as accelerometers, gyroscopes, and magnetometers. By combining a tri-axial accelerometer, gyroscope, and magnetometer, inertial measurement units can be made for 9-degree-of-freedom tracking and are used for biomechanical modelling. Common biometric sensors are used to measure heart rate, muscle activation, respiration, oximetry, blood pressure, galvanic skin response, heat flux, perspiration, and hydration level. Electrocardiogram (ECG) and electromyography (EMG) detect the electrical activity produced by the heart and muscles respectively and are interpreted into heart rate and muscle activation.
For a wearable monitoring system to be practical it needs to meet several key criteria: to be non-invasive, intuitive to use, reliable, and provide relevant feedback to the wearer. The number of devices, location, and attachment method would be considered during design, and are usually application specific. Wearable sensor systems also have to take the target users' needs, such as dexterity or cognitive ability, into account. Devices can be attached directly to the skin using some form of adhesive, attached mechanically using a clip, strap or belt, or incorporated directly into clothing or shoes. Advanced fabrication techniques can now create 'flexible/stretchable electronics' for integrated circuits, electronics and sensors [11]. Such systems can be applied directly to the skin, enabling discreet sensing possibilities, e.g. devices developed by MC10 Inc. [12].
It is essential that the system is reliable and measures with acceptable accuracy, providing the user with relevant feedback. In the research literature this is often presented as the accuracy of identifying specific events or health aspects, or in terms of sensitivity and specificity: the proportion of actual positives that are correctly identified and the proportion of actual negatives that are correctly identified, respectively.
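To make the two measures concrete, the following minimal sketch computes sensitivity and specificity from binary labels. The function name, the label encoding (1 for an event such as a fall, 0 otherwise), and the example data are illustrative assumptions and are not taken from any of the cited studies.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events correctly detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-events correctly rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical ground truth and detector output for eight time windows.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Reporting both values matters when events are rare: a detector that never raises an alarm can still score a high overall accuracy while having zero sensitivity.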
The past decade has seen major advances in sensing technologies, including MEMS and physiological sensors. Wireless low power communications, such as Bluetooth Low Energy (BLE), enable sensing technology to be integrated into wearable devices, clothing, and in the future embedded about the person without the restrictions of wires or the need to download data. Low power sensing and communications also enable wearable energy harvesting to be a viable option for powering and recharging these systems.
Commercially, wearable sensor systems are available for human monitoring and some of their output features are tabulated in Table 2. Much of the software developed for commercial devices is proprietary; however, some systems are able to provide raw data, or have been explicitly designed for the purposes of research. A snapshot overview of commercial wearable devices is given, as this is a wide and rapidly changing landscape, covering the features monitored and the sensors used for daily monitoring, and including a few examples for specific applications. Devices that only provide step count have not been included. A large proportion of these sensors target the health and fitness industry, and track the amount and intensity of activity performed, including measures such as an estimate of energy expenditure and calories burned. For purposes of research, however, a much broader range of outputs is being investigated and will be described in greater detail, including the techniques used to achieve them, in Section 5.

Sensor placement
The placement of wearable sensors for health monitoring is motivated by three main driving forces: (1) what data is required or provided by the sensors; (2) where it is considered acceptable to wear the sensors; and (3) the number of sensors the user is willing to wear. For commercial systems the most common place to wear a sensor is on the wrist or arm, although many systems can be worn at multiple locations, such as on the chest using a clip or as a pendant, and the thigh and ankle (Table 3). The waist and wrist are intuitive and unobtrusive places to wear sensors as many people are already accustomed to wearing watches or belts. In a study conducted by van Hess et al. [13] to investigate the estimation of daily energy expenditure using a wrist-worn accelerometer, the acceptability of wearing the device on the hip or wrist was also examined. It was found that both sensor placements were rated as highly acceptable; however, men on average preferred wearing the sensor on the wrist.
Systems with more niche applications need to be worn at more specific locations relevant to the information being acquired, e.g. the Reebok Checklight with MC10 helmet [14] that determines the number and severity of impacts to the head while participating in sports.
Sensor placement for activity recognition has been investigated in several studies. Atallah et al. [15] investigated the most relevant features and sensor locations for discriminating activity levels, demonstrating that the most suitable sensor location depends on the activities being monitored. Liu et al. [16] investigated different combinations of sensors and locations for physical activity assessment. The "best" results, i.e. the ones giving the highest activity recognition accuracy, were obtained using all the sensors, followed by a combination of the wrist and waist worn sensors. A study of patients with chronic obstructive pulmonary disease (COPD) again found that the "best" results were obtained using all the sensors (in this case 10 accelerometers distributed about the body). The "best" single sensor location was found to be on the left or right thigh. Pärkkä et al. [18] conducted a study to determine which sensors are most information rich for activity classification and included both motion and physiological sensors. Accelerometers were found to be most informative for activity monitoring; however, the position of the sensors (on the wrists) did not enable the separation of sitting and standing. Interestingly, physiological sensors did not prove as useful for activity monitoring due to the delay in physiological reactions to activity changes, whereas accelerometers react immediately.
Sensor orientation can also affect classification accuracy. Thiemjarus et al. [19] compared the performance of the k-NN (k-nearest neighbour) classifier using accelerometry data of activities with the sensor orientated in different directions. By transforming the signal to eliminate the orientation of the sensor an overall accuracy of 91% was achieved.

Data fusion
This section discusses data fusion models and the different levels of data fusion. The types of features that can be extracted to characterise the data, and techniques to select them, are also described.

Data fusion models
A useful data fusion model is the Joint Directors of Laboratories (JDL) model described by Hall and Llinas [8], which was developed to improve communications among military researchers and system developers. Work by Luo and Kay [20] defines a hierarchical model consisting of four levels of abstraction at which fusion can take place: signal level fusion, pixel level fusion (for image data), feature level fusion, and symbol level fusion. Dasarathy [21] expanded on the hierarchical data fusion models by defining five fusion processes characterised by each process's input-output mode, e.g. data in-feature out fusion. For the application of healthcare many models have been suggested. Lee et al. [22] proposed a hierarchical model for the application of pervasive healthcare to minimise the probability of unacceptable error. Fortino et al. [10] described a framework for collaborative body sensor networks, C-SPINE. Gong et al. [23] proposed a multi preference-driven data fusion model and demonstrated its application for a wireless sensor network healthcare monitoring system.
Fig. 1 describes a generic centralised hierarchical data fusion architecture for a wearable health monitoring system, drawing on three of the data fusion levels of abstraction (signal, feature, and decision) and elements from the previously described models. Data is sampled from the sensors (at a frequency appropriate to the sensor type and application) and transferred to the fusion centre, which may reside on a smart phone or a gateway. An obvious way to do this is by using wireless radio communications, such as Bluetooth Low Energy (BLE) or Zigbee. Alignment and cleaning of the data takes place at the pre-processing stage to take into account differences in sampling rates, timing offsets, and lost or corrupt data. Filtering would also take place at this stage. Data can then be processed at the appropriate level of fusion. Additionally, some sensors may be activated by an event trigger, which may be the result of the system's output. For example, in the case of a suspected fall detected using body worn accelerometry, a camera could be activated to gain additional context of the event.
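As an illustration of the alignment step at the pre-processing stage, the sketch below resamples two sensor streams with different sampling rates and a timing offset onto a common time base using linear interpolation. This is only one possible approach under stated assumptions: the function name, the choice of linear interpolation, the treatment of out-of-range points as missing, and the example values are all hypothetical.

```python
from bisect import bisect_right


def resample_linear(times, values, new_times):
    """Linearly interpolate a stream (times, values) at new_times.

    `times` must be strictly increasing. Points outside the recorded
    range are returned as None so they can be treated as missing data.
    """
    out = []
    for t in new_times:
        if t < times[0] or t > times[-1]:
            out.append(None)
            continue
        i = bisect_right(times, t) - 1  # last sample at or before t
        if i >= len(times) - 1:
            out.append(values[-1])
            continue
        t0, t1 = times[i], times[i + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[i] + w * (values[i + 1] - values[i]))
    return out


# Hypothetical streams: a 4 Hz accelerometer channel and a 1 Hz heart-rate
# channel whose clock starts 0.1 s late.
acc_t, acc_v = [0.0, 0.25, 0.5, 0.75, 1.0], [0.0, 1.0, 2.0, 3.0, 4.0]
hr_t, hr_v = [0.1, 1.1], [60.0, 70.0]

common_t = [0.0, 0.5, 1.0]
acc_aligned = resample_linear(acc_t, acc_v, common_t)
hr_aligned = resample_linear(hr_t, hr_v, common_t)
```

After alignment, each index in `acc_aligned` and `hr_aligned` refers to the same instant, so the streams can be passed together to the later fusion stages; missing values still need an explicit policy, such as discarding the affected window.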
To interpret the sensor data, three main hierarchical levels at which data fusion takes place are commonly used: signal level data fusion (sometimes referred to as direct or raw data fusion), feature level fusion, and decision (symbolic or inference) level fusion [8]. Signal level fusion can be applied to combine commensurate data, i.e. data measuring the same property, directly. For example, to deduce kinematic parameters for biomechanical modelling, the Kalman filter (KF) can be used to estimate the state.
For data that is non-commensurate, fusion takes place at the feature level [8]. Features are extracted from the sensor data and used to form a feature vector that, after fusion, will result in a higher level representation of the data. If appropriate, output from the signal level fusion can be used as part of the feature vector. There is a wide range of parametric and non-parametric algorithms that can be used to classify the data into higher levels of abstraction, which will be described in further detail in Section 4.
Decision level fusion is performed at the highest level of abstraction from sensor data and can be based on raw data, features extracted from the raw data, and symbols defined at the feature level fusion to make higher level deductions. Probabilistic methods are commonly used at the decision level due to the high levels of uncertainty; however, other methods that are also tolerant of uncertainty can be used, including artificial intelligence, fuzzy logic, and genetic algorithms.

Please cite this article as: R.C. King

[Table fragment: commercial device output features, e.g. steps, activity (sedentary, standing, steps), duration and time; sleep, HR, perspiration, skin temperature, motion, calorie expenditure, steps, activity; x-BIMU (x-io).]

Feature extraction and selection
To combine data for the classification or detection of an activity or event, characteristics, or features, are extracted from the sensor data as input for the data fusion algorithm. The features represent the information in the original signal and are usually calculated over fixed time windows that can range from 0.5 to 10 s long. Using a fixed window, an overlap in the data can be applied, with the effect of smoothing the output. Typically, a 1 s window is sufficient, with a 50% overlap with the previous window; however, this is application dependent and a longer or shorter window may be more appropriate. Features can be summarised into two main domains, time and frequency; however, some features incorporate both temporal and frequency elements, such as wavelets [24]. A summary of some of these features can be found in Table 4.

Fig. 1. A data fusion architecture for wearable health monitoring systems incorporating concepts from [8] and [20].
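The windowing scheme described in this section can be sketched as follows: a signal is cut into fixed-length windows with 50% overlap and a few simple time-domain features are computed per window. The 50 Hz sampling rate, the feature set (mean, standard deviation, range), the function names, and the synthetic signal are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def sliding_windows(samples, window_len, overlap=0.5):
    """Yield fixed-length windows; `overlap` is the fraction shared with the previous window."""
    step = max(1, int(window_len * (1 - overlap)))
    for start in range(0, len(samples) - window_len + 1, step):
        yield samples[start:start + window_len]


def time_domain_features(window):
    """A small, illustrative time-domain feature vector for one window."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "range": max(window) - min(window),
    }


fs = 50  # assumed sampling rate in Hz, so a 1 s window holds 50 samples
signal = [(i % 10) / 10 for i in range(3 * fs)]  # 3 s of synthetic data

feature_vectors = [
    time_domain_features(w)
    for w in sliding_windows(signal, window_len=fs, overlap=0.5)
]
```

With a 1 s window and 50% overlap, a new feature vector is produced every 0.5 s, which is what gives the smoothing effect on the classifier output.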
Feature selection describes the process by which features are chosen. This is sometimes based on empirical observation; however, search strategies can provide an objective means to select appropriate features. Search strategies fall broadly under two types: filter based, where the properties of the data are examined without knowledge of the inference algorithms to be used; and wrapper based, which use the performance of the target learning algorithm to inform the set of features [25]. An introduction to feature selection has been provided by Guyon and Elisseeff [26]. For wearable sensor applications, selecting the most appropriate features can make a great difference to the quality of the inference. Atallah et al. [15] compared feature sets for activity recognition compiled using several filter based feature selection algorithms including Relief and Simba, which aim to maximise the margins between decision boundaries, and minimum redundancy maximum relevance.
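To illustrate the filter based idea, the sketch below ranks features by a Fisher-style score (between-class variance divided by within-class variance), which examines the data without reference to any classifier. This is a deliberately simple stand-in and is not Relief, Simba, or minimum redundancy maximum relevance; the function names, feature names, and data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Between-class over within-class variance for a single feature."""
    overall = mean(values)
    between = 0.0
    within = 0.0
    for c in set(labels):
        class_values = [v for v, l in zip(values, labels) if l == c]
        between += len(class_values) * (mean(class_values) - overall) ** 2
        within += len(class_values) * pvariance(class_values)
    return between / within if within > 0 else float("inf")


def rank_features(feature_table, labels):
    """Return feature names ordered from most to least discriminative."""
    scores = {name: fisher_score(col, labels) for name, col in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical per-window features for three resting and three walking windows.
features = {
    "accel_std": [0.1, 0.2, 0.1, 1.0, 1.1, 0.9],
    "skin_temp": [33.0, 34.0, 33.5, 33.2, 33.9, 33.6],
}
labels = ["rest", "rest", "rest", "walk", "walk", "walk"]
ranking = rank_features(features, labels)
```

A wrapper based method would instead retrain the target classifier on candidate feature subsets and keep the subset giving the best validation performance, at a higher computational cost.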
A common problem for multi-sensory systems is a high dimensional feature space, which leads to increased computational costs and higher demands on memory. Algorithms such as independent component analysis and principal component analysis [24] can be used to reduce the dimensionality of feature space. Deep learning offers an alternative approach, building features at multiple levels of a deep network. While deep learning has often been applied to static data, Längkvist et al. [27] provided a review of deep learning for time-series data. Plötz et al. [28] applied deep learning to feature learning for activity recognition.
For systems reliant on wireless communications, including body worn systems, power consumption also requires consideration, i.e. the trade-off between transmitting raw data to the fusion centre vs. extracting features for transmission on the sensing device.

Data fusion algorithm overview
In the following sections an overview of the different types of data fusion algorithms is presented and examples are given from the research literature. For feature level data fusion, non-parametric algorithms (that do not make assumptions regarding the distribution of the data) and parametric algorithms are presented. At the decision level, algorithms including Bayesian approaches, fuzzy logic, and topic models will be described.

Signal level algorithms
• Weighted averages - Averaging is a simple signal level fusion method for combining commensurate information by taking an average of all the sensor readings [20]. The contribution of the "worst" sensor's error is reduced in the final estimate, although not eliminated completely. To reduce the impact of large erroneous sensor readings, weighted averages can be used [24]. For example, the weighted average of physiological temperature measurements could be taken from an array of body worn thermistors to find a single best estimate.
• Kalman filter (KF) - The KF is a recursive estimator in which the estimate of the state of a system at the current time is based on the state of the system at the previous time interval. One of the main advantages of the KF is that it is computationally efficient [29]. The KF is often used to fuse accelerometer and gyroscope information to provide better estimates, an example of which is the use of the KF to detect postural sway during quiet standing (standing in one spot without performing any other activity or leaning on anything) [30]. For non-linear filtering the extended KF or unscented KF can be used.
• Particle filtering (PF) - Particle filtering is a stochastic method to estimate moments of a target probability density when they cannot be computed analytically. The principle is to generate random numbers, called particles, from an "importance" distribution that can be easily sampled. Each particle is then assigned a weight that corrects for the dissimilarity between the target and the importance probabilities. In the Bayesian context, particle filters are often used to estimate the mean of the posterior density. They have the benefit of estimating the full target distribution without any assumption about its form, which makes them particularly useful for nonlinear/non-Gaussian systems. Djurić et al. [31] and Arulampalam [32] both provided a tutorial of PF theory. The PF can be used for biomechanical state estimation based on accelerometer and gyroscope data.
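Two of the signal level methods above can be sketched in a few lines. The first function fuses commensurate readings with weights inversely proportional to each sensor's error variance, which is one common weighting choice and an assumption here rather than the scheme of [24]. The second is a scalar Kalman filter with a random-walk state model, far simpler than the accelerometer and gyroscope fusion in [30]; all names, variances, and readings are hypothetical.

```python
def inverse_variance_average(readings, variances):
    """Weighted average in which noisier sensors (larger variance) count for less."""
    weights = [1.0 / v for v in variances]
    return sum(w * r for w, r in zip(weights, readings)) / sum(weights)


def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly varying quantity (random-walk model)."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + process_var     # predict: uncertainty grows between samples
        k = p / (p + meas_var)  # Kalman gain: trust in the new measurement
        x = x + k * (z - x)     # update the estimate using the current reading
        p = (1.0 - k) * p       # updated uncertainty
        estimates.append(x)
    return estimates


# Three hypothetical thermistors (degrees Celsius); the third is poorly attached.
fused_temp = inverse_variance_average([36.9, 37.1, 39.0], [0.01, 0.01, 1.0])

# Filtering a constant 37.0 degree signal starting from a poor initial guess.
track = kalman_1d([37.0] * 20, meas_var=0.04, process_var=0.001, x0=36.0)
```

In the averaging example the outlying 39.0 reading barely moves the fused estimate because its weight is a hundredth of the others. In the filter, each estimate depends only on the previous estimate and the current measurement, which is why the KF is cheap enough for embedded wearable devices.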

Feature level non-parametric algorithms
• k-Nearest Neighbour (k-NN) - One of the simplest classification algorithms, k-NN measures the distance between the unlabelled observations and the training samples to infer which class they belong to. The unlabelled observation is assigned the label of its nearest neighbours, where k is the number of training observations to be taken into account. Distance measures include the Euclidean and Manhattan distance. k-NN has been widely used and reported in the literature for activity classification applications [15,16,19,33-37]. Bicocchi et al. [37], in particular, compared k-NN to several other instance based learning algorithms using a real-life activity set and achieved a precision of about 75% with k equal to 1.
• Decision Trees (DT) - DT or rule-based algorithms are a popular method used for classification. Rules are defined in the form of a "tree", starting at the root that is split into decision nodes which refine the class prediction with each level of decision nodes. Leaf nodes represent the predicted class of the unknown data [5]. DT can be constructed manually by empirically defining rules; however, algorithms are available to automatically generate trees based on the data, such as ID3 and C4.5. Other DT algorithms include CART, random tree, random forest, and J48. Examples of the use of DT for activity recognition include [17,18,34,35,38-41].
• Support Vector Machines (SVM) - SVM have been extensively used for human activity classification [16,17,36,39,42,43] and can be used for both linear and non-linear classification problems. SVM is a binary classifier finding separation between two classes. The data is mapped into a high dimensional space using a kernel function (such as a Gaussian, sigmoid, or radial basis function). A hyperplane is then found that maximises the margin between the examples of the two classes [44]. In a comparative study by Liu et al. [16] to determine the best sensor configuration to recognise activities, SVM performed better than the k-NN and Naive Bayes classifiers, with accuracy ranging from 76% using a single hip worn accelerometer to 88% using a hip and wrist worn accelerometer and a ventilation sensor that measures features associated with breathing.
• Artificial Neural Networks (ANN) - ANN are networks of simple interconnected processing units, or nodes [45]. An ANN structure is composed of several layers of nodes connected by weighted links. Inputs into the ANN are propagated forward through the layers to compute the output of the network, as follows: for each node, the weighted sum of its inputs is found. The output for this node is then calculated by the activation function, such as the sigmoid function. To train the network, the internal connective weights are adjusted using techniques such as back propagation, which minimises the error between the network's output and the target output [45]. ANN have been applied to the problem of human activity recognition; some examples include [18,36,46,47]. Pärkkä et al. [18], Roy et al. [46], and Altun et al. [36] conducted studies to compare the performance of ANN to other algorithms. Yang et al. [47] implemented an activity recognition strategy based on two phase neural classification. During the first phase, activities are classified as either static or dynamic activities, then during the second phase more detailed activity recognition is performed.
Recently, success with deep learning methods, based on neural networks, has attracted interest from many domains including image classification and natural language processing [48]. As mentioned previously, deep learning can be used to learn features for activity recognition [28], as well as to perform classification.
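As a concrete instance of the simplest classifier in this section, the sketch below implements k-NN with the Euclidean distance and majority voting. The two-dimensional feature vectors (for example, an acceleration standard deviation and mean magnitude), the labels, and the function name are illustrative assumptions and do not reproduce any of the cited studies.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training samples (Euclidean distance)."""
    neighbours = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training features per window: (acceleration std, mean magnitude).
train_X = [
    (0.05, 1.00), (0.08, 0.98), (0.07, 1.01),  # sitting
    (0.90, 1.30), (1.10, 1.50), (1.00, 1.40),  # walking
]
train_y = ["sitting"] * 3 + ["walking"] * 3

predicted = knn_predict(train_X, train_y, (0.95, 1.35), k=3)
```

Because k-NN stores the whole training set and computes a distance to every sample at prediction time, memory and computation grow with the amount of training data, which is worth considering for on-device classification.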

Feature level parametric algorithms
• Gaussian mixture model (GMM) - GMM can be used as a parametric classifier by modelling the probability distribution of continuous measurements or features. A GMM consists of a weighted sum of Gaussian distributions that can be trained with example data using algorithms such as expectation-maximisation (EM) [38,49]. A GMM is trained for each class, then the new data examples are classified by determining the GMM that provides the highest likelihood of producing the data. Allen et al. [38] used GMM to distinguish postures and movements for the monitoring of older patients based on accelerometer data, comparing it to the performance of a heuristic DT system. Wang et al. [49] classified five gait patterns using GMM.
• k-Means - k-means is an unsupervised iterative distance-based clustering algorithm. It aims to classify data based on the distance of a data point to the mean centroid of each cluster. The classifier is trained by defining k centroids, one for each cluster. These can be defined randomly or by defining the initial centroid based on all the training data and subsequent centroids using the data points furthest away from the initial centre [24]. An iterative process is then used to minimise the distance of the centroids from the data points. Each data point is assigned to the nearest centroid, after which the centroid is recalculated based on the clusters that are formed. This process is repeated until the criteria to stop have been met. After this process, data for classification is assigned to the closest centroid. Ghassemzadeh et al. [33] used k-means clustering to define motion primitives which, in combination, form transcripts that can be used for activity recognition. Machado et al. [50] applied k-means clustering to the problem of activity recognition using accelerometry, successfully predicting activities with an accuracy of 89% for the user independent case.

Decision level algorithms
• Bayesian inference - Bayesian inference combines the prior probability of a hypothesis with the likelihood of the observations given the hypothesis. Bayesian methods enable the inclusion of prior probabilities that can take into account known information and can be updated based on the observations. The Naive Bayes classifier is a popular method for inferring activity from sensor data. Despite the assumption of independence between features, which is often considered poor, it can perform well. Atallah et al. [51] used Bayesian classification for activity recognition from an ear worn accelerometer based device. One drawback of Bayesian inference is the requirement that competing hypotheses are mutually exclusive; however, this is not generally compatible with the way humans assign belief [24]. Dempster-Shafer theory, also known as belief function theory or evidential reasoning, provides a framework for reasoning with uncertainty by extending the Bayesian approach [24].
• Fuzzy logic - Fuzzy logic, or fuzzy set theory, is a fusion technique that can be applied at the decision level and has been used for the recognition of human activities using both wearable and ambient sensors [52,53]. Fuzzy logic describes input data in terms of possibility, i.e. the possibility the input data describes some property [24]. Medjahed et al. [53] describe three main steps for the application of fuzzy logic. First, fuzzification takes place, converting the data into fuzzy sets. Secondly, a fuzzy inference system is applied which consists of fuzzy rules that take the IF/THEN form and fuzzy set operators including the union, complement and intersection [24]. Finally, defuzzification is applied to convert fuzzy variables generated by the process into real values.
• Topic models - Topic models are unsupervised machine learning algorithms originally designed for aiding understanding of large corpora of text. They allow hidden thematic patterns in a dataset to be discovered using latent Dirichlet allocation. Huynh et al. [54] showed that topic models could be used to discover routine behaviours (e.g. lunch) from other activities (e.g. queuing, eating). Seiter et al. [55] further investigated the robustness of topic models for daily routine discovery by varying the characteristics of simulated datasets based on the original data collected by Huynh et al., and identified optimal values of dataset properties required to achieve good performance stability.
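The Naive Bayes classifier mentioned in this section can be sketched as below, assuming Gaussian-distributed features within each class. The class prior is combined with per-feature likelihoods, which are multiplied together (added in the log domain) under the independence assumption. The Gaussian form, the variance floor, the function names, and the example data are assumptions for illustration and are not the method of [51].

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Estimate a prior and per-feature Gaussian parameters for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(X),
            "means": [mean(c) for c in columns],
            # The floor avoids division by zero for constant features.
            "vars": [max(pvariance(c), var_floor) for c in columns],
        }
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(params):
        lp = math.log(params["prior"])
        for v, m, s2 in zip(x, params["means"], params["vars"]):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return lp
    return max(model, key=lambda label: log_posterior(model[label]))


# Hypothetical features per window: (acceleration std, heart rate in bpm).
X = [(0.05, 62), (0.08, 65), (0.06, 60), (0.90, 95), (1.10, 102), (1.00, 98)]
y = ["rest", "rest", "rest", "walk", "walk", "walk"]

nb_model = fit_gaussian_nb(X, y)
nb_label = predict_gaussian_nb(nb_model, (0.95, 99))
```

The prior term is where known information can be included, for example that rest is far more common than walking for a given wearer.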

Activity recognition
Activity monitoring using wearable technology has received a vast amount of attention. A person's level of functional mobility can directly reflect quality of life (QoL) and overall health. From information provided by wearable sensors, feature level data fusion techniques and inference methods can be used for activity recognition at different levels of detail: activity intensity levels, static and dynamic postures, and activities of daily living (ADL).
Static postures refer to activities which are globally still, such as lying and sitting, whereas dynamic postures refer to activities during which someone is actively moving, such as bipedal activities and during transitions, e.g. moving from sitting to standing. Standing can be referred to as a dynamic activity, e.g. [19], or a static activity, e.g. [56], depending on the perspective and application. Standing is a globally stationary activity; however, to maintain a standing posture active work is required on the part of the person. Corrective movements are continuously made which can be detected using a trunk worn accelerometer and have been used to investigate standing balance [57]. In contrast, to maintain static postures such as sitting or lying, no active work is required on the part of the person. There are links between health and the amount of dynamic activity a person performs in the form of physical activity, such as walking; thus, even simple measures can provide insight into well-being [58]. Static and dynamic postural information can be used to determine the time spent in various positions and the amount of dynamic activity being carried out.
ADL describe in greater detail the essential tasks of daily living. The ability with which individuals can perform these tasks is commonly assessed using questionnaires [59]. The research literature reflects the interest in using body-worn sensors to identify these activities, which can be treated either as individual activities [37] or by dividing the ADL into the levels of physical intensity each activity requires [51].
It can be seen from the research literature that accelerometers are the most widely used sensors for these applications. Exceptions include Pawar et al. [60], who performed body movement classification using artifacts present in wearable ECG signals, and Roy et al. [46], who combined surface EMG with accelerometers for activity recognition. Gyroscopes are also used for activity recognition, although not as frequently. This is potentially due to their high power consumption, whereas accelerometers can operate at very low power, making them attractive for battery powered systems. An in-depth review of the technology used in wearable systems for health applications can be found in a review by Lowe and OLaighin [61].
It is worth noting that heuristic algorithms are often employed and used to great effect for activity recognition. These can be used alone or in conjunction with other data fusion techniques. For example, thresholds can be used to define the limits between one state and another, distinguish between periods of static and dynamic activity, and identify posture [19,56,62-65]. Culhane et al. [64] used two bi-axial accelerometers attached to the thigh and sternum and, by applying a threshold to the standard deviation of the sensor data, it could be determined if the wearer was static or dynamic. During static activities, posture was inferred using the accelerometer by measuring the tilt of the trunk and thigh. Dalton et al. [65] compared the mean of accelerometer data to thresholds that had been pre-defined to differentiate between activities.
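A threshold heuristic of the kind described above can be sketched as follows: the standard deviation of a window separates static from dynamic periods, and during static periods the tilt of the trunk and thigh relative to gravity suggests a posture. The threshold values, the assumption that each sensor's z axis lies along the body segment, and the function names are hypothetical and are not the settings used in [64].

```python
import math
from statistics import pstdev


def is_dynamic(window, std_threshold=0.1):
    """Dynamic if the acceleration (in g) varies more than a preset threshold."""
    return pstdev(window) > std_threshold


def tilt_angle_deg(ax, ay, az):
    """Angle between the sensor z axis and gravity, valid while the wearer is still."""
    g = math.sqrt(ax * ax + ay * ay + az * az)
    cos_tilt = max(-1.0, min(1.0, az / g))  # clamp against rounding error
    return math.degrees(math.acos(cos_tilt))


def classify_posture(trunk_tilt_deg, thigh_tilt_deg, upright_max_deg=45.0):
    """Infer posture from segment tilt; z axes assumed aligned with each segment."""
    trunk_upright = trunk_tilt_deg < upright_max_deg
    thigh_upright = thigh_tilt_deg < upright_max_deg
    if trunk_upright and thigh_upright:
        return "standing"
    if trunk_upright:
        return "sitting"
    return "lying"


still_window = [1.0] * 50        # constant 1 g reading while stationary
moving_window = [0.5, 1.5] * 25  # large fluctuations while moving

trunk = tilt_angle_deg(0.0, 0.0, 1.0)  # trunk vertical
thigh = tilt_angle_deg(1.0, 0.0, 0.0)  # thigh horizontal
posture = classify_posture(trunk, thigh)
```

Heuristics like this are cheap to compute and easy to interpret, but the thresholds usually need tuning for each sensor placement and user group.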
There is a wide range of approaches used for general activity recognition; however, some studies are more disease specific. Tsipouras et al. [66] developed a method for the automatic assessment of levodopa-induced dyskinesia for patients living with Parkinson's disease. Using data from body worn accelerometers and gyroscopes, levodopa-induced dyskinesia could be detected and its severity assessed. Salarian et al. [67] and Rodriguez-Martin et al. [43] also investigated the use of activity classification for Parkinson's disease using fuzzy classification and SVM, respectively. Other participant cohorts that were the focus of different studies include: those who had recently been in hospital [62], rehabilitation [64], stroke [46], and COPD [17,68].

Fall detection and prediction
Fall detection, often performed in conjunction with activity recognition [63,69,70], is another widely researched application for wearable sensing technology. The incidence of falls and the risk of injury due to a fall increase as people age, affecting QoL and confidence. After a fall, it may not be possible to call for help or attract attention, which could result in a sustained period of time without assistance. During this time, dehydration, hunger, and injuries sustained during the fall can lead to prolonged hospital stays and potentially prove fatal.
Heuristics are often employed for fall detection, including work by Bourke et al. [71], who investigated fall detection using two tri-axial accelerometers worn on the trunk and thigh. The resultant was calculated for both accelerometers and an upper fall threshold was applied that was capable of distinguishing 100% of falls from normal activities. In subsequent work, Bourke et al. [72] applied thresholds to the resultant of the angular velocity from a trunk mounted gyroscope.
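The resultant-and-threshold approach described above can be sketched as follows. This is a minimal example under stated assumptions rather than the method of the cited studies: the function names and the 3.0 g upper threshold are hypothetical, and a real system would tune the threshold on representative data.

```python
import math


def resultant(ax, ay, az):
    """Magnitude of a tri-axial acceleration sample (in g)."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def detect_fall(samples, upper_threshold_g=3.0):
    """Return the index of the first sample whose resultant exceeds the
    upper threshold, or None if no sample does.
    The 3.0 g default is an assumed placeholder value."""
    for index, (ax, ay, az) in enumerate(samples):
        if resultant(ax, ay, az) > upper_threshold_g:
            return index
    return None


# Hypothetical data: quiet standing followed by a large impact.
recording = [(0.0, 0.0, 1.0), (0.2, 0.1, 1.1), (2.0, 2.0, 3.0)]
impact_index = detect_fall(recording)
```

A single upper threshold is deliberately simple; it does not check post-impact posture or inactivity, which some of the systems discussed in this section also use.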
Karantonis et al. [63] used a single waist worn accelerometer and thresholds to determine activity, rest, posture and falls. Benocci et al. [73] also conducted fall detection using an accelerometer attached to the sacrum and simulated falls from standing, walking, out of bed, and sliding down a wall. Wang et al. [74] described a three-fold threshold system that combines a trunk worn accelerometer and cardiotachometer to detect falls. The thresholds test for high accelerometer values, angle of the trunk, and heart rate to detect a fall. One of the greatest predictors of a fall is having fallen previously; therefore, it is of equal importance to be able to predict a fall such that preventative measures can be put in place. As well as the detection of falls, work by Giansanti et al. [75] used wearable sensors to determine the risk of falls using 60 s balance tests. An accelerometer and gyroscope were worn on the trunk and a four layer ANN was used to classify participants into fall risk levels.

Gait and ambulatory monitoring
Gait analysis can provide insight into functional mobility, ranging from the ability to perform various bipedal activities to a detailed account of the gait cycle. Gait analysis and biomechanical modelling are traditionally performed in laboratory environments using optical motion capture to track body segment motion. More recently, body worn inertial devices have been investigated as an alternative, eliminating the need to collect data in specialised laboratories. Biomechanical modelling of the lower body could be used to build unique gait models such that deviations from the norm could indicate the need for treatment or intervention.
Moe-Nilssen and Helbostad [76] used a low back mounted accelerometer to monitor gait variability in the anterior-posterior and mediolateral planes and to estimate cadence, step length, and stride length over a known distance; these measures were used to differentiate between fit and frail older adults. Xu et al. [77] examined the walking parameters of those recovering from stroke with a hemiparetic gait for rehabilitation purposes. A hierarchical approach using Naïve Bayes and dynamic time warping was used to classify walking, after which gait parameters were computed, including walking speed, cadence, stride length, and distance travelled.
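Dynamic time warping, mentioned above, compares two signals that may differ in speed, which is useful when matching a candidate segment against a walking template. The sketch below is a textbook formulation using an absolute-difference cost, not the specific pipeline of the cited work; the function names and the idea of a single stored template are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    using the absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def closest_template(segment, templates):
    """Return the label of the template (hypothetical dict of label -> sequence)
    with the smallest DTW distance to the segment."""
    return min(templates, key=lambda label: dtw_distance(segment, templates[label]))
```

The quadratic cost in sequence length is one reason such methods are usually applied to short, pre-segmented windows.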
In the clinical environment, gait has been used to predict the risk of falling using tools such as the Tinetti gait and balance assessment [78]. Body-worn sensors could be used as an alternative or complementary assessment. Caby et al. [79] collected accelerometry data from 10 sensors during a walking test and the Timed Up-and-Go for the objective classification of fallers and non-fallers. Accelerometry and force sensitive resistors have also been used to distinguish between normal and abnormal gait [80]. Ishigaki et al. [81] determined pelvic movement from an accelerometer and gyroscope mounted on the sacrum during 10 m of free walking to find correlations with stability in older adults. Less pelvic motion was found for those classed as unstable based on a single leg balance test.
The differences in bipedal locomotion styles imposed by environmental conditions, such as a flat or sloped surface and stairs, are subtle. The ability to negotiate these conditions can be an indication of physical well-being and used to monitor those with limited mobility. To this end, Wang et al. [82] decomposed the acceleration data from a single waist mounted sensor into frequency features using wavelets to classify the different walking patterns using a multilayer perceptron neural network. In further work, Wang et al. [83] included walking up and down two different gradients and used GMM for classification. Lau et al. [84] focused on walking conditions for those with unilateral drop foot and deployed two accelerometers and a single gyroscope on the affected side to distinguish the aforementioned conditions and compare classification results from several data fusion methods. Muscillo et al. [85] adopted an adaptive Kalman-based Bayes estimation method to differentiate between locomotor conditions for both young and older adults.
By analysing gait events, such as heel contact, heel-off, and toe-off, body-worn sensors can be used to characterise gait for applications such as drop foot stimulation [86]. Kotiadis et al. [87] investigated gait phase detection for drop foot, exploring trigger timings for a stimulator. For those suffering from Parkinson's disease and multiple sclerosis, gait disturbances, such as freezing of gait, can be an indication of a higher risk of a fall. Tripoliti et al. [88] used body worn accelerometers and gyroscopes for the automatic detection of freezing of gait. Accelerometers can also be used to recognise an individual's gait [89], which, in a multi-resident home or a scenario where sensors are shared, could aid identification of the wearer.

Biomechanical modelling
Parametric state estimation algorithms, such as the KF and PF, can be used to measure biomechanical motions by combining accelerometer and gyroscope data to estimate the kinematic parameters. These algorithms come under the banner of signal level fusion methods as they combine commensurate data to achieve the best estimation of a parameter. Musić et al. [90] used an extended KF to fuse inertial sensor data for the reconstruction of body segment trajectories in the sagittal plane of sit-to-stand motions.
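To make the idea of signal level fusion concrete, the sketch below fuses a gyroscope rate with an accelerometer-derived angle using a linear, two-state KF (segment angle and gyroscope bias). This is a simplified, commonly used formulation and not the extended KF of the cited work; the class name and the noise parameters (q_angle, q_bias, r_measure) are assumed values chosen only for illustration.

```python
class TiltKalman:
    """Two-state linear Kalman filter: state = [angle, gyro bias].
    The gyroscope drives the prediction; the accelerometer angle corrects it."""

    def __init__(self, q_angle=0.001, q_bias=0.003, r_measure=0.03):
        self.q_angle = q_angle      # assumed process noise for the angle
        self.q_bias = q_bias        # assumed process noise for the bias
        self.r_measure = r_measure  # assumed accelerometer measurement noise
        self.angle = 0.0
        self.bias = 0.0
        self.P = [[0.0, 0.0], [0.0, 0.0]]  # state covariance

    def update(self, gyro_rate, accel_angle, dt):
        P = self.P
        # Predict: integrate the bias-corrected gyroscope rate.
        self.angle += dt * (gyro_rate - self.bias)
        P[0][0] += dt * (dt * P[1][1] - P[0][1] - P[1][0] + self.q_angle)
        P[0][1] -= dt * P[1][1]
        P[1][0] -= dt * P[1][1]
        P[1][1] += self.q_bias * dt
        # Correct: blend in the accelerometer-derived angle.
        innovation = accel_angle - self.angle
        s = P[0][0] + self.r_measure
        k0, k1 = P[0][0] / s, P[1][0] / s
        self.angle += k0 * innovation
        self.bias += k1 * innovation
        p00, p01 = P[0][0], P[0][1]
        P[0][0] -= k0 * p00
        P[0][1] -= k0 * p01
        P[1][0] -= k1 * p00
        P[1][1] -= k1 * p01
        return self.angle
```

The gyroscope provides smooth short-term tracking while the accelerometer bounds long-term drift; the noise parameters set the balance between the two.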
Takeda et al. [91] presented a method for gait analysis by calculating the 3-dimensional position of each lower body segment using 7 tri-axial accelerometers and gyroscopes, joint range of motion, the contribution of gravity to the accelerometer signals, and frequency features that describe the cyclic nature of walking.
Due to the high power consumption of gyroscopes, other methods using multiple accelerometers are being developed, such as the double-sensor difference algorithm presented by Liu et al. [92] for the measurement of rotational angles of human segments. Djurić-Jovičić et al. [93] used pairs of tri-axial accelerometers for the estimation of leg segment angles and trajectories in the sagittal plane through the removal of sensor drift.

Physiological monitoring
Monitoring physiological aspects of health provides insight into how well our bodies are functioning, and can be used to monitor cardiovascular health and the potential onset of illness (e.g. through body temperature). A novel use of accelerometers was presented by Lapi et al. [94] to detect respiratory rate by positioning sensors on opposite sides of the chest wall. Li and Kim [95] developed a patch style sensor, incorporating an HR monitor and accelerometer, for wireless monitoring of heart rate and a movement index.
Stress is another area of well-being that has drawn interest from the research community due to its impact on health. A system presented by Healey and Picard [96] was able to classify stress during real-world driving tasks into three levels based on wearable sensors including two skin conductivity sensors, ECG, EMG, and a chest expansion respiration sensor. Ikehara and Crosby [97] used physiological sensors to assess cognitive load. Sensors used in this study included those to measure electrodermal temperature and blood flow, an eye tracker extracting related features, and an oximeter. Luprano et al. [98] incorporated textile electrodes and an accelerometer into a shirt to measure ECG and perform activity recognition. Fletcher et al. [99] developed a system for cognitive behavioural therapy for drug addiction that monitors for unusual arousal patterns using accelerometer, temperature, and electrodermal activity sensors (with optional ECG). When specific arousal events are detected, an empathetic message is automatically sent to the wearer's phone.
Bandodkar et al. [100] described sodium sweat sensors applied as a temporary stick-on 'tattoo' sensor. These sensors were tested in a laboratory during stationary cycling activities. Indeed, there are many biological MEMS sensors being developed that can be applied to physiological monitoring, such as the triglyceride biosensor, the C-reactive protein detector to monitor increases which may cause heart attacks or cardiovascular disease, and membrane-based glucose sensors for diabetics [101].

Wearable sensors
Energy remains a dilemma for long term wearable research as it dictates not only how the wearable is used by the individual, but also the quality and availability of the data. For the application of activity recognition, inertial sensors such as accelerometers and gyroscopes provide the most appropriate data. The number of sensors required depends largely on the application. If we consider the use of one to five sensors, for the purpose of identifying fundamental static and dynamic postures a single sensing device can be sufficient. Wrist worn devices, as favoured commercially, are not well placed to accurately distinguish between sitting and standing postures but can detect overall activity level. For the general population, measuring activity intensity may be sufficient; however, for those that live with chronic disease or have restricted movement, the distinction between sitting and standing would provide further insight into their well-being and health. A single waist or trunk worn sensor will provide information on the transitions between sitting and standing and the global pose of the body, improving activity recognition accuracy. A single waist worn sensor can also be used to monitor gait variability, cadence, step and stride length [76] as described in Section 5.3. However, these methods were developed for walking in a straight line using a known distance and would not be suitable for free living monitoring.
A two-sensor scenario would include a sensor on the wrist, which would provide information related to ADL, e.g. cooking, eating and drinking. With the addition of a third sensor on an ankle or foot, more detailed parameters regarding gait can be extracted, such as unilateral step length and height. An optional sensor positioned on the thigh would provide more definitive information regarding body posture; however, it may be redundant if used in conjunction with a waist worn sensor. Five sensors, worn at the waist, wrists and ankles, would provide even greater levels of detail regarding both leg and arm movement that can be used for bilateral gait analysis and increase the accuracy of activity recognition algorithms.
For applications that require data from many sensors to address specific diseases or conditions, the benefits of an improved QoL may well outweigh the inconvenience of wearing multiple sensors. Wearing multiple sensors nevertheless presents several challenges regarding the usability of the system, such as taking the sensors on and off, recharging the sensors, and overall adherence to wearing the system. With the wide availability of small, cheap, low powered sensors, incorporating them directly into clothing where needed could address some of these challenges. Further, near field charging would negate the need to directly connect the system to a power source.

Data fusion models and algorithms
The data fusion model presented in this paper is based on a centralised hierarchical data fusion model and can be seen to be the most commonly used model for most commercial health monitoring and many research systems. Most of these systems are aimed at personal health and well-being monitoring and focus on determining specific features related to that individual. For more complex environments and scenarios, this type of architecture can be extended, such that the output, i.e. the local view, can be used to contribute towards the global view. This is similar to a distributed architecture [9] and could be used in the study of epidemiology, e.g. disease surveillance in hospitals. This architecture also naturally lends itself towards a decentralised architecture where data fusion takes place at each node and does not rely on a single fusion centre, making it more robust to intermittent or unreliable communications services [9]. In this case each personal system becomes part of a community of nodes, each contributing information as and when it can, and could be implemented in situations such as disaster sites.
The choice of data fusion algorithm used depends on the target application. Influences include the required output, system accuracy, computational complexity, available processing power, battery power available, and expected operational time. Many of these aspects constitute a direct trade-off.
Low complexity data fusion algorithms, such as heuristic thresholds, weighted averages, k-NN, and k-means, are well suited to simple activity recognition applications. These include estimating activity intensity and fundamental static and dynamic postures. These are ideal for applications where a long battery life is expected and on-wearable user feedback is given. These algorithms can be trained in advance and could be implemented on the wearable using simple features extracted from the sensor data. These types of algorithms are well suited to everyday free living situations as targeted by many commercial systems.
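As a minimal sketch of one of the low complexity algorithms listed above, the following k-NN classifier assigns an activity label to a feature vector by majority vote among its nearest labelled neighbours. The feature choice (window mean and standard deviation of acceleration in g), the training values, and the labels are hypothetical and serve only to show the mechanics.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `training_set` is a list of (feature_vector, label) pairs."""
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (mean, standard deviation) per window.
training_set = [
    ((1.00, 0.01), "static"),
    ((1.00, 0.02), "static"),
    ((0.98, 0.015), "static"),
    ((1.10, 0.40), "dynamic"),
    ((1.20, 0.50), "dynamic"),
    ((1.05, 0.45), "dynamic"),
]

label = knn_predict(training_set, (1.10, 0.42))
```

Because all labelled examples are stored and searched at prediction time, the memory and compute cost grows with the training set, which matters when running on the wearable itself.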
Medium complexity data fusion algorithms require more computational power and, in turn, more energy to run. The data can be treated in two ways: (1) implement the algorithm on-wearable, or (2) transmit the data off-wearable to the fusion centre. Both methods require more energy and will shorten the battery life of the wearable system. These algorithms include activity recognition algorithms that can infer more complex ADL, such as Naïve Bayes, GMM, DT, and NN. Kinematic estimation algorithms such as the KF, which can be used for biomechanical and gait analysis, also fall into this category; however, they require a high sampling frequency, typically 50-100 Hz, which is higher than many of the sampling frequencies required for activity recognition.
For research applications, data is often collected using wearable sensor nodes and then post-processed. Medium complexity algorithms, as previously mentioned, through to high complexity algorithms, including SVM, deep learning, and Bayesian networks, have been used for activity recognition. To extract and process the relevant data for biomechanical and gait analysis, as previously described, KF, extended KF, and PF can be used for the kinematic state estimation. Feature level algorithms can then be used to extract features such as clinically relevant outputs.
Depending on the algorithm, there is more or less transparency in how the algorithm maps the sensor data to the output features. Algorithms based on neural networks and deep learning provide little insight into this process and require training with large example data sets. In contrast, model based algorithms, for example the KF, control how the sensor data maps to the features but require a predefined model.

Annotation and system validation
Collecting accurately labelled activity data in a natural environment to apply to machine learning techniques is time consuming and expensive. To reduce the amount of labelled data needed to train activity recognition algorithms, techniques such as semi-supervised training and active learning can be used [102-105].
Semi-supervised training approaches use small amounts of labelled training data to initially train the activity recognition algorithms which are then used to label the unlabelled data.
Active learning finds the unlabelled data with the most information and queries the user to label them. Various strategies can be used to decide which data have the most information, such as the data that are classified with the least confidence, or the amount of disagreement between two classifiers [102]. This reduces the cost of annotating all the data and is a good alternative to manual annotation. Hoque and Stankovic [104] used a clustering technique to group activities based on data from a smart home environment and asked users to label each cluster rather than label all data. Active learning techniques can also be used to update a classifier after deployment. Longstaff et al. [105] explored active learning as a means to dynamically augment mobile activity classifiers. Diethe et al. [106] proposed a Bayesian active transfer learning framework for smart home environments.
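The least-confidence query strategy mentioned above can be sketched as follows. The example assumes that a classifier, not shown here, has already produced class probabilities for each unlabelled sample; the function name and the example probabilities are hypothetical.

```python
def least_confident(probabilities, n_queries=1):
    """Return the indices of the samples whose most likely class has the
    lowest probability, i.e. those the classifier is least sure about.
    `probabilities` is a list of per-sample class-probability lists."""
    confidence = [max(sample) for sample in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:n_queries]


# Hypothetical classifier outputs for three unlabelled windows.
predicted = [[0.90, 0.10], [0.55, 0.45], [0.70, 0.30]]
to_label = least_confident(predicted, n_queries=2)
```

The selected indices would then be presented to the user for labelling and the classifier retrained with the new labels.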
Although there is a wealth of research being carried out in the area of body worn sensors for health applications, further validation of many of the methods developed is needed using realistic conditions such as matched participant cohorts, target environments, and natural behavioural conditions. This is especially true of fall detection, where the algorithms used are often developed using data from falls simulated by young healthy participants tripping onto a crash mat or mattress. Algorithms based solely on laboratory data have been shown to fail and lead to unacceptably high rates of false alarms [107]. In a similar way, people rarely perform activities and ambulation as they would naturally when cued to do so or when following a script. Although features and data fusion algorithms may appear to be successful based on laboratory training and testing data, they may fail when used in real-world situations or from one person to the next.

Data loss and synchronisation
Another challenge for data fusion for health monitoring is the imperfection introduced throughout the data fusion health monitoring system. Khaleghi et al. [108], in an in-depth review of the state-of-the-art in multisensor data fusion, provided a taxonomy of data imperfection including uncertainty, imprecision (vagueness, ambiguity and incompleteness), and granularity. Transferable belief models could be used as a method for modelling sensor reliability [109]. As well as error introduced by the sensors, wireless communications present another source of system error. For the application of body worn sensors, wireless transmission of data to a fusion centre is a desirable and practical option, allowing it to be analysed continuously without unnecessary user interaction. Disruption in the communication of data to the fusion centre could severely affect the quality of the received data and be caused by: operation outside the range of the receiver; loss of power; receiver error; and packet loss. Retransmission of lost or corrupted packets can increase data reliability using two way communications, i.e. acknowledgement of received packets [69]; however, there is a power trade-off associated with receiving and resending packets and a time delay will be introduced.
Data transmitted from different sources will arrive at the fusion centre at different times and need to be aligned prior to analysis. This raises the issue of data synchronisation. Sensor data that is collected using more than one stand alone module can be synchronised by providing an input that each sensor can pick up, e.g. a series of taps made during recording. Any drift can then be calculated and the data resampled. Systems employing wireless communications can correct for clock drift by broadcasting a regular beacon from a master clock, which can be used to determine drift. Including this additional information with the time stamp of when the data was received allows the data to be reordered before fusion. The synchronisation of sensors is an open and often overlooked area of research and methods are constrained by the target application requirements, power consumption, sampling and transmission frequency, and robustness to data loss.
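One simple way to realise the drift correction described above is to fit a straight line between the times at which shared synchronisation events (taps or beacons) were recorded on a sensor clock and on a reference clock, and then map all sensor time stamps through that line. The sketch below assumes a constant drift rate; the function names and the example values are hypothetical.

```python
def fit_clock_map(sensor_times, reference_times):
    """Least-squares fit of reference = scale * sensor + offset from matched
    synchronisation events. Assumes at least two events at distinct times
    and a constant drift rate over the recording."""
    n = len(sensor_times)
    mean_s = sum(sensor_times) / n
    mean_r = sum(reference_times) / n
    variance = sum((s - mean_s) ** 2 for s in sensor_times)
    covariance = sum(
        (s - mean_s) * (r - mean_r) for s, r in zip(sensor_times, reference_times)
    )
    scale = covariance / variance
    offset = mean_r - scale * mean_s
    return scale, offset


def to_reference(timestamps, scale, offset):
    """Convert sensor time stamps to the reference clock."""
    return [scale * t + offset for t in timestamps]


# Hypothetical taps seen at 0 s and 100 s on the sensor clock,
# and at 2.0 s and 102.1 s on the reference clock.
scale, offset = fit_clock_map([0.0, 100.0], [2.0, 102.1])
aligned = to_reference([0.0, 50.0, 100.0], scale, offset)
```

Once all streams share a common time base they can be sorted or resampled onto a common grid before fusion.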
Alemdar and Ersoy [110] presented a survey on wireless sensor networks for healthcare and discussed design considerations. The wireless sensor network system was broken down into five subsystems including: body area network, personal area network, gateway to the wide area network, and the end-user healthcare monitoring application. Each subsystem has a different set of design considerations. Gravina et al. [111] presented a framework called SPINE that can be used for multiple body worn sensor applications. Baker et al. [112] described wireless sensor network prototypes for home healthcare.

Conclusions
This paper outlined the state-of-the-art and future concepts for using wearable sensors in healthcare applications. It described some principles of data fusion and many of the foundation techniques that can be used to perform data fusion on wearable sensor data. The commercial landscape of wearable sensors is constantly changing; however, a snapshot of some of the currently available products has been given, providing context for an overview of the research literature in the area of wearable sensors for healthcare applications. Applications of wearable technology for healthcare have been described, including activity recognition, fall detection, ambulatory monitoring, and biomechanical monitoring. A discussion of other considerations that need to be addressed to augment wearable sensor technology has been provided, highlighting potential directions for research and issues such as data collection, algorithm training, quality of data, infrastructure and the potential fusion of wearable sensors with other external data sources.

Table 3
describes wearable devices that are commercially available for activity, physiological, and biomechanical monitoring, including both consumer and research devices.The table presented

Table 1
Table of abbreviations.

Table 2
Output features from commercial health monitoring systems.

Table 4
Example features that can be extracted from sensor data.
pared different types of features used to represent human activity data including: statistical metrics, fast Fourier transform coefficients, principal component analysis based features, and those derived using deep learning methods. A standard nearest neighbour classifier, which will be described later, was used to demonstrate the effectiveness of the features.
Please cite this article as: R.C. King et al., Application of data fusion techniques and technologies for wearable health monitoring, Medical Engineering and Physics (2017), http://dx.doi.org/10.1016/j.medengphy.2016.12.011
Stikic et al. [102] demonstrated two approaches to semi-supervised training, self-training (the classification model is updated iteratively based on the most confidently predicted newly labelled data) and co-training (the same as self-training but uses additional information to augment the process).