Consistency of Outputs of the Selected Motion Acquisition Methods for Human Activity Recognition

The aim of this paper is to choose the optimal motion sensor for recognition of selected human activities. In the described studies, several human motion measurement methods were used simultaneously: optoelectronic, video, electromyographic, accelerometric, and pressure sensors. Several analyses of activity recognition were performed: recognition correctness for all activities together, matrices of the recognition errors of the individual activities for all volunteers for the individual sensors, and recognition correctness of all activities for each volunteer and each sensor. The experiments made it possible to determine a range of sensor interchangeability and to choose the most appropriate sensor for recognition of the selected motion.


Introduction
Telemetric recording and automatic interpretation of motion activities play a significant role in home monitoring. Among the variety of applications, a few are most common: prevention and detection of falls, detection of abnormal or dangerous situations, rehabilitation monitoring, and activity assessment and quantification. An automatic system usually consists of sensors, specific signal or image processing methods, and a recognition module for the selected activity. Selection of sensors seems to be the most important issue and must take into account usable sensor properties: wearing ability, sensitivity to disturbances, occurrence of outsiders, etc. Among the many proposed sensors, it is difficult to choose the best universal one because each sensor works best in a certain range of recognized activities. This fact motivates us to study that topic.
In [1], electromyographic (EMG) analysis of four lower limb muscles was performed during seven classes of preventive exercises against loss of balance or falling. Other researchers integrated EMG and an inertial measurement unit (IMU) to construct a balance evaluation system for recording the body in a dynamic and static posture [2]. In [3], seven hand movements were classified (by neural networks with backpropagation and the Gustafson-Kessel algorithm) on the basis of the EMG signal of four forearm muscles. An EMG- and augmented reality- (AR-) based rehabilitation system for the upper limbs was proposed in [4]. In [5], an EMG biofeedback device for forearm physiotherapy was constructed to discriminate 6 classes of movements.
Novak et al. [6] proposed a system for automatic detection of gait phases using acceleration and pressure sensors and a supervised learning algorithm. For detection of gait abnormalities in [7], the authors built a prototype combining a pressure force sensing resistor (FSR), a bend sensor, and an IMU. Principal component analysis (PCA) was used for feature generation and a support vector machine (SVM) for multiclass classification. Shu et al. [8] presented a time-space measurement tool in the form of insoles of conductive fabric sensors placed around the midfoot and the heel. Wireless capacitive pressure sensors were introduced in [9]. Other studies [10] were related to equilibrium measurements with an instrumented insole with 3 pressure sensors per foot.
An accelerometric (ACC) system for monitoring the daily motor activity (sitting, standing, lying, and periods of natural walking) was proposed in [11]. An ACC sensor was placed on the subject's sternum. Detection of gait parameters by means of a detector composed of gyrometric, accelerometric, and magnetic sensors was proposed in [12]. Rong et al. [13] presented the use of a 3D accelerometric sensor located at the waist to identify people based on their characteristic gait patterns. Identification was performed with the discrete wavelet transform (DWT). Jafari et al. [14] proposed ACC-based detection of accidental falls. The selected signal features were used to distinguish four transitions (sitting-standing, standing-sitting, lying-standing, and standing-lying) with the use of a neural network and k-nearest neighbour (k-NN) classification. In [15], researchers developed ACC-based fall detection for smartphones. The proposed system enabled fall event detection, location tracking of the person, and notifications of emergency situations.
Juang et al. [16] introduced a system for detection of four body postures (standing, bending forward, sitting, or lying) and sudden falls. For classification purposes, the silhouette was segmented from each image frame. The feature vector was composed of Fourier transform coefficients and the ratio of body silhouette length to width. A real-time system was implemented in [17]. It consisted of three main modules: segmentation of the silhouette, recognition, and identification of posture. The authors introduced decision rules based on body parameters. It was possible to detect four postures: standing, sitting, squatting, and bending. In [18], the authors used supervised and unsupervised learning to classify the body position in image sequences. Other researchers [19] presented a posture detection method which took into account information about the body shape and the skin colour. Song and Chen [20] proposed vision-based activity recognition on the basis of information about pose, location, and elapsed time.
In the mentioned papers, the selection of particular sensors was not clearly justified. This raises a natural question about the optimal choice. The aim of our research was to use various sensors to capture signals simultaneously during basic activities and to study the correlation of the information obtained from them. This approach enabled the choice of the proper sensor depending on the situation and the current need. The experiments aimed at determining how well simple measuring devices can approximate the information obtained from specialized medical equipment. Our measurements were performed by means of a three-dimensional motion capture system, a wireless EMG amplifier, and a wireless feet pressure system (as reference equipment), and an accelerometer and a video camera (as currently available consumer-grade sensors).

Materials and Methods
The following activities were examined:
(i) Squatting from a stand position (1a) and getting up from a squat (1b)
(ii) Sitting on a chair from a stand position (2a) and getting up from a chair (2b)
(iii) Reaching (3a) and returning from reaching the upper limb forward in the sagittal plane (standing) (3b)
(iv) Reaching (4a) and returning from reaching the upper limb upwards in the sagittal plane (standing) (4b)
(v) Bending from a stand position (5a) and straightening the trunk forward in the sagittal plane (5b)
(vi) Single step for the right (6a) and left lower limb (6b).
The measurements were performed simultaneously with the following:
(i) A, a motion capture system: Optotrak Certus (NDI) with NDI First Principles software
(ii) B, a wireless biopotential amplifier: ME6000 (Mega Electronics) with MegaWin software
(iii) C, a wireless feet pressure measurement system: ParoLogg with Parologg software
(iv) D, a digital video camera: Sony HDR-FX7E
(v) E, an ACC recorder (Revitus system) with dedicated software.

Characteristics of the Examined Signals.
The three-dimensional motion trajectories of 30 infrared markers M1 to M30 located on the body were measured from the left side of the observed person (Figure 1). The acquisition was performed with a sampling frequency of 100 Hz, an accuracy of 0.1 mm, and a resolution of 0.01 mm.
Feet pressure signals were captured with 64 piezoresistive sensors (32 for each foot) at a sampling frequency of 100 Hz. The triaxial acceleration signal was recorded by sensors integrated in the Revitus device located on the sternum. The recorder enabled online measurement via Bluetooth (100 Hz).
Video signals (720 × 576 pixels, 25 frames per second) of the silhouette were obtained using a digital camera placed on the volunteer's left side.

Processing of the Measurement Data.
To calculate feature vectors for classification, the data recorded with sensors B to E were processed in MATLAB. The three-dimensional motion trajectories were used for determining the precise start and end times of the activities. The exception was the gait (6a, 6b), which cannot be performed in a natural way over a distance as short as 4 m (the maximal width of the registration space of the motion capture system). Therefore, for the gait (6a, 6b), the start and end points were determined from visual analysis of video frames. The differences in performance time between the analyzed movements and between volunteers require normalization of the data length with a window W. In order to select its width optimally, a set of histograms of activity performance times was calculated.
The accelerometric signals were processed as follows [21]:
(i) Subtracting the offset value from the signal (the offset being the average of a 10 s signal recorded while the person is in a stationary upright position), separately for each channel (x, y, z) and for each person
(ii) Averaging the signal in a moving time window of 0.2 s
(iii) Normalizing the amplitude for each volunteer separately
(iv) Creating the data vector consisting of the prepared acceleration signals in the axes x, y, z: [X Y Z]
(v) Normalizing the amplitude to the (0, 1] interval
(vi) Resampling the signal to a frequency of 25 Hz.
The video signal was prepared as follows [22]:
(i) Converting the colour image to grayscale.
(ii) Calculating the vector motion field with 2 coordinates, that is, the optical flow (OF), using the Horn-Schunck algorithm [23].
(iii) Median filtering of the motion field components (5 × 5 pixels).
(iv) Detecting the moving objects by binarization of the motion field magnitude with a threshold T that is constant for all people and all activities; the threshold was chosen experimentally in [24].
(v) Calculating the area of the moving silhouette S_{n−1} on the (n − 1)-th frame (yellow area in Figure 4(b)) as the common part of the areas OF_{n−1/n−2} (blue) and OF_{n/n−1} (turquoise), where OF_{n−1/n−2} is the motion field calculated on the basis of the (n − 1)-th and (n − 2)-th frames and OF_{n/n−1} is the motion field calculated on the basis of the n-th and (n − 1)-th frames.
(vi) Filling the holes in the area S_{n−1}.
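The preprocessing steps listed above can be illustrated with a minimal Python sketch (the processing in this study was done in MATLAB; this is not that code). Several points are assumptions made only for illustration: the function names are hypothetical, the per-volunteer amplitude normalization is omitted, resampling is done per axis by keeping every fourth sample before concatenation, the mapping to (0, 1] uses a small constant `eps`, and the silhouette function expects optical-flow fields that have already been computed and median filtered.

```python
import math


def moving_average(signal, width):
    """Centered moving average; the window shrinks at the signal edges."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def preprocess_axis(signal, rest, fs=100, win_s=0.2, fs_out=25):
    """Offset removal, smoothing, and downsampling for one acceleration axis.

    signal -- samples recorded during the activity
    rest   -- samples from the stationary upright recording (offset estimate)
    """
    offset = sum(rest) / len(rest)
    centered = [v - offset for v in signal]
    smoothed = moving_average(centered, int(round(win_s * fs)))
    step = fs // fs_out  # 100 Hz -> 25 Hz keeps every 4th sample (assumed)
    return smoothed[::step]


def feature_vector(x, y, z, rest_x, rest_y, rest_z, eps=1e-6):
    """Concatenate the prepared axes as [X Y Z] and scale them to (0, 1]."""
    data = (preprocess_axis(x, rest_x)
            + preprocess_axis(y, rest_y)
            + preprocess_axis(z, rest_z))
    lo, hi = min(data), max(data)
    # Assumed mapping: the maximum maps to 1, the minimum to a value just above 0.
    return [(v - lo + eps) / (hi - lo + eps) for v in data]


def moving_silhouette(flow_prev, flow_next, threshold):
    """Binary mask of pixels whose flow magnitude exceeds the threshold
    in both consecutive motion fields (common part of the two areas)."""
    mask = []
    for row_prev, row_next in zip(flow_prev, flow_next):
        mask.append([
            math.hypot(*p) > threshold and math.hypot(*q) > threshold
            for p, q in zip(row_prev, row_next)
        ])
    return mask
```

Here each flow field is a nested list of (u, v) tuples, one per pixel. Hole filling and the choice of the threshold T are left out of the sketch.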

Identification of the Activities.
To identify the selected activities, a supervised classification was performed. The set of all measurement data from each sensor was divided into learning and test sets. The former contained 2400 randomly selected representatives of all 10 activities, while the latter contained all 4874 remaining cases. For classification of the selected activities, the k-NN algorithm with the Manhattan metric was used. Before the classification step, the classifier was tested using the leave-one-out (LOO) method. On the basis of these analyses, k equal to 1 was the optimal value for all sensors and sets of sensors.
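As an illustration of the classification scheme described above, the following Python sketch shows a k-NN classifier with the Manhattan (city-block) distance and a leave-one-out accuracy estimate that could be used to compare values of k. The function names and the toy feature vectors in the usage note are hypothetical and are not taken from the study.

```python
from collections import Counter


def manhattan(a, b):
    """City-block distance between two equal-length feature vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))


def knn_predict(train_x, train_y, query, k=1):
    """Label of the query by majority vote among its k nearest neighbours."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: manhattan(train_x[i], query))
    votes = [train_y[i] for i in ranked[:k]]
    # Ties are resolved in favour of the label seen first (nearest neighbour).
    return Counter(votes).most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=1):
    """Fraction of samples classified correctly when each one is held out."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)
```

For example, with `xs = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]` and `ys = ["1a", "1a", "1a", "2a", "2a", "2a"]`, calling `leave_one_out_accuracy(xs, ys, k)` for several values of k gives the kind of comparison from which a value of k could be selected.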
For each activity a and each sensor s, the correctness of recognition for all volunteers $R_{s\_a}$ (1) and its calculation error $U_{s\_a}$ (2) were calculated:
$$R_{s\_a} = \frac{P_{s\_a}}{W_{s\_a}} \cdot 100\% \qquad (1)$$
where $P_{s\_a}$ is the sum of correctly identified repetitions of the activity a for all volunteers for the sensor s and $W_{s\_a}$ is the sum of all repetitions of the activity a performed by all volunteers for the sensor s. $U_{s\_a}$ is a measure of the dispersion of the results arising from intersubject differences. Due to the different numbers of activity repetitions for each volunteer, it was computed as a weighted standard deviation (2), where n = 20 is the number of weights, equal to the number of volunteers; $w_i$ is the weight for the i-th volunteer, equal to the number of repetitions of the activity a performed by the i-th volunteer; and $x_i$ is the percentage of correct recognition for the specific activity calculated for the i-th volunteer.
An additional variable, $R_{s\_a\_ALL}$ (3), with its calculation error $U_{s\_ALL}$ (4), was also employed. It gives the percentage of correct recognition for all activities and all volunteers for each sensor:
$$R_{s\_a\_ALL} = \frac{P_{s\_a\_ALL}}{W_{s\_a\_ALL}} \cdot 100\% \qquad (3)$$
where $P_{s\_a\_ALL}$ is the sum of correctly identified repetitions of all activities ALL performed by all volunteers for the sensor s and $W_{s\_a\_ALL}$ is the sum of all performed repetitions of all activities ALL for all volunteers. In (4), $u_i$ is the weight for the i-th volunteer, equal to the total number of repetitions of all activities performed by the i-th volunteer, and $y_i$ is the percentage of correct recognition for all activities calculated for volunteer i.
For each volunteer V and sensor s, the percentage of correct recognition for all activities $R_{s\_V}$ (5) and its calculation error $U_{s\_V}$ (6) were determined. $U_{s\_V}$ is a measure of the dispersion of the results arising from differences between activities:
$$R_{s\_V} = \frac{P_{s\_V}}{W_{s\_V}} \cdot 100\% \qquad (5)$$
where $P_{s\_V}$ is the sum of correctly identified repetitions of all activities with the sensor s performed by the volunteer V and $W_{s\_V}$ is the sum of repetitions of all activities performed by the volunteer V. In (6), m = 12 is the number of weights, equal to the number of activity types; $p_j$ is the weight for the j-th activity, equal to the number of its repetitions performed by the volunteer; and $z_j$ is the percentage of correct recognition for the j-th activity for the specific subject. In addition, the calculation error $U_{s\_V\_ALL}$ was determined as an activity-related dispersion (7),

Journal of Healthcare Engineering
where $q_j$ is the weight for the j-th activity, equal to the number of all its repetitions performed by all volunteers, and $r_j$ is the percentage of correct recognition for the j-th activity calculated for all volunteers.
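The recognition rate and its dispersion described above can be sketched in Python as follows. The ratio follows directly from the definitions of the sums P and W. The weighted standard deviation is written in one common form that uses the number of nonzero weights, $\sqrt{\sum_i w_i (x_i - \bar{x}_w)^2 / \left(\frac{n-1}{n} \sum_i w_i\right)}$; whether this is exactly the form used in the study is an assumption, and the function names and the numbers in the usage note are hypothetical.

```python
import math


def recognition_rate(correct, total):
    """Percentage of correctly recognized repetitions, 100 * P / W.

    correct -- per-volunteer (or per-activity) counts of correct recognitions
    total   -- matching counts of all performed repetitions
    """
    return 100.0 * sum(correct) / sum(total)


def weighted_std(values, weights):
    """Weighted standard deviation of percentages (assumed form).

    values  -- percentage of correct recognition per volunteer or activity
    weights -- number of repetitions behind each percentage
    """
    n = sum(1 for w in weights if w > 0)  # number of nonzero weights
    if n < 2:
        raise ValueError("at least two nonzero weights are required")
    w_sum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / w_sum
    ss = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    return math.sqrt(ss / ((n - 1) / n * w_sum))
```

With three hypothetical volunteers who performed 10, 10, and 5 repetitions and had 8, 9, and 5 of them recognized correctly, `recognition_rate([8, 9, 5], [10, 10, 5])` pools the counts, while `weighted_std([80.0, 90.0, 100.0], [10, 10, 5])` measures how far the individual percentages spread around their weighted mean. With equal weights the function reduces to the ordinary sample standard deviation.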

Results
The correctness of recognition $R_{s\_a}$ (1) of activities 1a to 6b for all persons for sensors B to E is presented in Table 1.
Matrices of the recognition errors (in %) of the individual activities 1a to 6b for all volunteers for the individual sensors B to E are shown in Tables 2-5. The percentage of correct recognition $R_{s\_a}$ for the individual activities therefore lies on the diagonal of each matrix. The correctness of recognition $R_{s\_V}$ of all activities for volunteers V1 to V20 and $R_{s\_a\_ALL}$ for ALL volunteers for sensors B to E is presented in Table 6.
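A matrix of recognition errors of this kind can be built from the true and predicted activity labels, as in the following minimal Python sketch. It assumes that rows correspond to the performed activity and that each row is normalized to percentages, so that the diagonal holds the percentage of correct recognition; the function name and the labels in the usage note are hypothetical.

```python
def error_matrix(true_labels, predicted_labels, classes):
    """Row-normalized confusion matrix in percent.

    Row t, column p holds the percentage of repetitions of activity t
    that were recognized as activity p.
    """
    counts = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(true_labels, predicted_labels):
        counts[t][p] += 1
    matrix = []
    for t in classes:
        row_total = sum(counts[t].values())
        matrix.append([
            100.0 * counts[t][p] / row_total if row_total else 0.0
            for p in classes
        ])
    return matrix
```

For example, if four repetitions of 1a are recognized as 1a, 1a, 1a, and 1b, the row for 1a contains 75 on the diagonal and 25 in the column for 1b.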

Discussion
The correctness of recognition $R_{s\_a}$ (1) is negatively correlated with the dispersion $U_{s\_a}$ (2) (Table 1). Therefore, less reliable recognition of an activity carried out by all volunteers does not mean worse recognition of that activity for each individual volunteer; rather, it results from the individual way in which each volunteer performs the activity.
Some types of activities such as free gait or the return from reaching in the vertical and horizontal plane showed much less reliable recognition than others, regardless of the sensor type. Reliability of gait recognition is low probably due to high diversity in walking rhythm. Reaching is difficult to recognize, as it is characterized by low degree of dynamics of the whole body.
It was found that, among the single sensors, the best classifier for different activities is sensor B, followed successively by sensors D, E, and C. The correctness of recognition $R_{s\_V}$ (5) is negatively correlated with the dispersion $U_{s\_V}$ (6) (Table 6). This means that less reliable recognition for a single volunteer (taking into account all activities) does not come from inferior recognition reliability of every single activity for that volunteer; rather, it results from inconsistency among the recognition results of the individual activities.
Our research is focused on the recognition of only 12 types of daily life activities. The motivation for that choice is mainly based on the following aspects:
(i) Since the chosen activities are performed quite often and are easy to repeat, we limit as much as possible the errors coming from differences in how volunteers perform the activity, and thus the comparison of the sensors is more reliable
(ii) It can be presumed that any activity (even a more complex one) can be represented by means of simple (elementary) poses [26].
Although the choice of a proper sensor is a very complex issue, in our studies we reduce it to a comparison of motion items. Nevertheless, the final choice of the sensors is closely tied to the application.
The following requirements should then be taken into consideration:
(i) Individual characteristics of the sensor signal
(ii) Size of the registration space
(iii) Sensor accuracy
(iv) Sensor portability and unobtrusiveness
(v) Cost of the sensor device and reliable software
(vi) Privacy of the supervised person.
The performance differences for each activity and for each sensor stem from differences in the following:
(i) Speed, range, and way of performing the particular motion

Conclusions
The paper presents results of recognition of 12 human motor activities based on individual interpretation of simultaneous recordings from various sensors. The main finding is that some sensors are better suited to particular activities, whereas others show higher overall performance. Consequently, we specified both areas where sensors show distinctive properties and a common range of activities where the sensors show similar metrological properties and may be selected based on other criteria (e.g., cost and commodity).
Additionally, we found that some recognition results generalized for all volunteers, as well as those generalized for all activities, showed surprisingly low values. This suggests that the recognition performance depends on the particular volunteer (i.e., it is subject-specific) and also on the particular action. Accordingly, the hierarchy of expected recognition results for particular actions is not universal, and to produce optimal results, it should be individually adjusted with regard to the behavior of the particular user. The prospective ways of extending our studies are as follows:
(i) Expanding the list of activities with more complex ones
(ii) Evaluating and adapting the proposed solutions in the home environment
(iii) Extending the video processing algorithm with detection of individual body parts.

Data Availability
Research data are not openly available because of the volunteers' privacy.

Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this paper.