Gait Signal Analysis with Similarity Measure

Human gait identification was carried out with the help of a similarity measure design. Gait signals were acquired through a hardware implementation consisting of all-in-one sensors, a control unit, and a notebook with connector. Each gait signal was treated as high dimensional data, so the analysis relied on a heuristic technique, namely the similarity measure. Signals for human behavior patterns such as walking, sitting, standing, and stepping up were obtained through experiment. From the results of the analysis, we identified the relation between overlapped and nonoverlapped data, illustrated the similarity measure analysis, and compared it with a conventional similarity measure. Nonoverlapped data similarity analysis provided the clue to solving the similarity of high dimensional data. The high dimensional data analysis was designed with consideration of neighborhood information. The proposed similarity measure was applied to identify the behavior patterns of different persons and different behaviors of the same person. The analysis can be extended to organize a health monitoring system, especially for elderly persons.


Introduction
Analysis of the human gait signal has been studied steadily by numerous researchers [1][2][3]. Research on the gait signal applies to healthcare development, security systems, and other related areas. The research methodology is based on how to classify the signals and develop pattern recognition algorithms for comparable data sets. Such an algorithm processes different signals of the same person and same-action signals from multiple persons. Developing a methodology for gait discrimination is a challenge: the human gait signal has high dimensional characteristics, and hence an explicit classifying formula has to be analyzed and designed [3].
Generally, human gait signals consist of walking, sitting, standing, stepping up and down, and other usual behaviors. Such usual behaviors occur in daily home life, and they are quite easy to identify when we watch them in a real situation. To develop a larger-scale monitoring and healthcare system that analyzes and identifies the behavior signal of each person, a decision and classification system for the gait signal is required. More specifically, deciding whether a person is performing normal activities or not can be applied to the design of a healthcare system. Therefore, the research output can be applied to identification, healthcare, and other related fields. Additionally, more detailed checking, even for healthy people such as athletes, can indicate whether a person has developed a problem compared to previous behavior.
To discriminate between different patterns, a rational measure obtained from a statistical or heuristic approach is needed. From the statistical point of view, signal autocorrelation and cross correlation are useful because they express by a numeric value how much the signals are related to each other. They are also rather convenient to calculate with conventional software such as the Matlab toolboxes. However, it is not easy to construct high dimensional correlation/covariance matrices for high dimensional data. The heuristic approach needs preliminary processing of the signal, such as data redefinition and measure design based on human thinking. Even when a measure based on a heuristic idea is realized, an ordinary measure for discrimination, such as distance, has to be considered. Fortunately, similarity measure design for vague data has become a more interesting research topic; hence, numerous researchers have focused on the similarity measure and the entropy design problem for fuzzy sets and intuitionistic fuzzy sets [4][5][6][7][8].
The similarity measure [9][10][11][12] provides useful knowledge for clustering and pattern recognition of data sets [13]. However, most of the conventional results did not cover high dimensional data. The human gait signal has high dimensional data characteristics, so a similarity measure design for high dimensional data is also needed to deal with it. Similarity measure research has a rather long history; the square error clustering algorithm has been used since the late 1960s [14], and it was modified to create the cluster program [15]. Naturally, the similarity measure topic has spread to many areas such as statistics [16], machine learning [17,18], pattern recognition [19], and image processing. Extended research on high dimensional data can be applied to security applications including fingerprint and iris identification, image processing enhancement, and, recently, even big data applications.
The distance between vectors can be organized by norms such as the 1-norm, the Euclidean norm, and so forth, and the similarity measure is also designed explicitly with the distance norm. The similarity measure design problem for high dimensional data needs a more careful approach. Conventionally, the similarity measure has been designed based on the distance measure between the two considered data; that is, the distance measure was considered as the information distance between two membership functions. In similarity measure design with a distance measure, the measure structure should be related to the same support of the universe of discourse [14,15]. Additionally, overlapped and nonoverlapped data must both be considered, because high dimensional data often have a nonoverlapped data structure. With the conventional similarity measure, nonoverlapped data analysis is not possible; hence, a similarity measure for nonoverlapped data has to be designed. To design it, neighbor data information was considered: because data are affected by adjacent information, a similarity measure on nonoverlapped data could be designed. In this paper, artificial data are given to compare the proposed measure with the conventional similarity measure, and the calculation results are also illustrated.
In the following section, preliminary results on the similarity measure for overlapped and nonoverlapped data are introduced. The proposed similarity measure is proved and applied to overlapped and nonoverlapped artificial data. In Section 3, the gait signal acquisition system, with sensors and data acquisition, is considered, and gait signals for different behaviors are illustrated. The high dimensional similarity measure is proposed and proved in Section 4. Similarity measure design for high dimensional data is also discussed by way of the norm structure, and similarity calculation results for different behaviors and individuals are shown. Finally, conclusions follow in Section 5.

Similarity Measure Based Distance Property
To design the similarity measure explicitly, a usual measure such as the Hamming distance is commonly used as the distance measure between sets $A$ and $B$:

$$d(A, B) = \frac{1}{n} \sum_{i=1}^{n} \left| \mu_A(x_i) - \mu_B(x_i) \right|,$$

where $X = \{x_1, x_2, \ldots, x_n\}$, $\mu_A(x_i)$ and $\mu_B(x_i)$ are the fuzzy membership functions of fuzzy sets $A$ and $B$ at $x_i$, and $|k|$ is the absolute value of $k$. A fuzzy set $A \in F(X)$ is represented by $A = \{(x, \mu_A(x)) \mid x \in X, 0 \le \mu_A(x) \le 1\}$, $X$ is the total set, and $F(X)$ is the class of all fuzzy sets of $X$. The similarity measure is defined with the help of the distance measure [14]. There are numerous similarity measures satisfying the following definition.
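Definition 1 (see [14]). A real function $s : F^2 \to R^+$ is called a similarity measure if $s$ has the following properties:
(S1) $s(A, B) = s(B, A)$, $A, B \in F(X)$;
(S2) $s(D, D^C) = 0$, $D \in P(X)$;
(S3) $s(C, C) = \max_{A, B \in F} s(A, B)$, $C \in F(X)$;
(S4) $A, B, C \in F(X)$, if $A \subset B \subset C$, then $s(A, B) \ge s(A, C)$ and $s(B, C) \ge s(A, C)$;
where $R^+ = [0, \infty)$, $P(X)$ is the class of ordinary sets of $X$, and $D^C$ is the complement set of $D$. By this definition, numerous similarity measures can be derived. In the following theorem (Theorem 2), a similarity measure $s(A, B)$ between sets $A$ and $B$ based on the distance measure is illustrated.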
(S3) is also easy to prove: it is natural that $s(C, C)$ attains the maximal value. Finally, for $A \subset B \subset C$ the measure also guarantees $s(A, C) < s(A, B)$; therefore, the triangular inequality is obvious by the definition, and hence (S4) is also satisfied.
Besides Theorem 2, numerous similarity measures are possible. Another similarity measure is illustrated in Theorem 3, and its proof is also found in the previous result [15,16].
Theorem 3. The measures (7) to (9) are similarity measures between sets $A$ and $B$.
Proof. The proofs are easily derived and can be found in previous results [15,16].
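The Hamming distance that underlies the similarity measures in Theorems 2 and 3 can be sketched in a few lines. This is a minimal illustration, assuming the normalized form $d(A, B) = (1/n) \sum |\mu_A(x_i) - \mu_B(x_i)|$ on a common support; the function name `hamming_distance` and the membership values are hypothetical and are not taken from the experiments.

```python
import numpy as np


def hamming_distance(mu_a, mu_b):
    """Normalized Hamming distance between two membership vectors in [0, 1]."""
    mu_a = np.asarray(mu_a, dtype=float)
    mu_b = np.asarray(mu_b, dtype=float)
    if mu_a.shape != mu_b.shape:
        # Both fuzzy sets must be defined on the same support X.
        raise ValueError("membership vectors must share the same support")
    return float(np.mean(np.abs(mu_a - mu_b)))


# Hypothetical membership values on a support of five points.
mu_a = [0.0, 0.2, 0.8, 1.0, 0.4]
mu_b = [0.1, 0.2, 0.5, 0.9, 0.4]
print(hamming_distance(mu_a, mu_b))
```

The distance is symmetric and vanishes when the two membership functions coincide, which is what the distance-based similarity measures rely on.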
Besides the similarity measures (7) to (9), other similarity measures are also illustrated in previous results [15][16][17]. With the similarity measures in Theorems 2 and 3, it is only possible to compute the similarity for overlapped data (Figure 1). The data distributions of diamonds (⧫) and circles (●) in Figure 2 illustrate nonoverlapped data, for which similarity measures (7) to (9) cannot provide an appropriate solution, so a dedicated similarity measure has to be designed. Two data pairs that constitute different distributions are considered in Figure 2: twelve data, six diamonds (⧫) and six circles (●), are arranged in different combinations in Figures 2(a) and 2(b). The similarity degree between circles and diamonds must differ between Figures 2(a) and 2(b) because the distributions differ. For example, (7) represents

$$s(A, B) = 1 - d((A \cap B), (A \cup B)).$$

From (7), $d((A \cap B), (A \cup B))$ provides the distance between $(A \cap B)$ and $(A \cup B)$, with the definitions

$$A \cap B = \min(\mu_A(x_i), \mu_B(x_i)), \qquad A \cup B = \max(\mu_A(x_i), \mu_B(x_i)).$$

Nonoverlapped data satisfy $A \cap B = \min(\mu_A(x_i), \mu_B(x_i)) = 0$, and $A \cup B$ is $\mu_A(x_i)$ or $\mu_B(x_i)$. Hence, $s(A, B) = 1 - (1/n) \sum \max(\mu_A(x_i), \mu_B(x_i))$, where $n$ is the total number of data in sets $A$ and $B$. From this property, the same result is obtained for Figures 2(a) and 2(b). Hence, similarity measures (2) to (9) are not proper for the nonoverlapped data distribution. The two data sets in Figure 2(a) are less discriminable than those in Figure 2(b), which means that the similarity measure of Figure 2(a) should have a higher value than that of Figure 2(b). Similar results are obtained by the calculation of similarity measures (8) and (9). Hence, a similarity measure for nonoverlapping data distributions is required. Consider the following similarity measure for nonoverlapped data such as Figures 2(a) and 2(b).
Similarity measure (13) is also designed with a distance measure such as the Hamming distance. As noted before, conventional measures are not proper for nonoverlapping data distributions. The calculation result shows that the proposed similarity measure can evaluate the degree of similarity for nonoverlapped distributions. Comparing the two distributions in Figure 2, the diamonds and circles in Figure 2(a) are more similar.
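The limitation of the conventional measure on nonoverlapped data can be checked numerically. The sketch below evaluates $s(A, B) = 1 - d((A \cap B), (A \cup B))$ on two hypothetical twelve-point layouts (six points per set, never sharing a position) and contrasts it with a simple neighbor-based score. The layouts, the membership values, and `neighbor_similarity` are assumptions made for illustration only; in particular, `neighbor_similarity` is not measure (13), just one possible way of using adjacent data information.

```python
import numpy as np


def similarity_7(mu_a, mu_b):
    """Conventional measure s(A, B) = 1 - d(A ∩ B, A ∪ B) with Hamming d."""
    mu_a = np.asarray(mu_a, dtype=float)
    mu_b = np.asarray(mu_b, dtype=float)
    # min/max realize the intersection/union of the membership values.
    gap = np.maximum(mu_a, mu_b) - np.minimum(mu_a, mu_b)
    return 1.0 - float(np.mean(gap))


def neighbor_similarity(mu_a, mu_b):
    """Illustrative neighbor-based score (an assumption, NOT measure (13))."""
    pos_a = np.flatnonzero(np.asarray(mu_a) > 0).astype(float)
    pos_b = np.flatnonzero(np.asarray(mu_b) > 0).astype(float)
    gaps = np.abs(pos_a[:, None] - pos_b[None, :])  # all A-to-B position gaps
    # Average distance to the nearest point of the other set, in both directions.
    mean_gap = 0.5 * (gaps.min(axis=1).mean() + gaps.min(axis=0).mean())
    return 1.0 - float(mean_gap) / (len(mu_a) - 1)


def layout(slots_a, slots_b, values_a, values_b, size=12):
    """Place the two sets on disjoint positions of a common support."""
    mu_a, mu_b = np.zeros(size), np.zeros(size)
    mu_a[slots_a] = values_a
    mu_b[slots_b] = values_b
    return mu_a, mu_b


# Hypothetical membership values for six diamonds (A) and six circles (B).
values_a = [0.8, 0.4, 0.2, 0.8, 0.4, 0.2]
values_b = [0.6, 0.4, 0.2, 0.6, 0.4, 0.2]
mixed = layout(np.arange(0, 12, 2), np.arange(1, 12, 2), values_a, values_b)
separated = layout(np.arange(0, 6), np.arange(6, 12), values_a, values_b)

# The conventional measure cannot tell the two layouts apart.
print(np.isclose(similarity_7(*mixed), similarity_7(*separated)))
# The neighbor-based score rates the interleaved layout as more similar.
print(neighbor_similarity(*mixed) > neighbor_similarity(*separated))
```

Because the intersection is zero everywhere, the conventional value depends only on the sum of the membership values and not on where the points sit, whereas any score built on neighbor distances reacts to the arrangement.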

Human Behavior Signal Analysis and Experiments
Gait signals are collected with an experimental unit. The acquisition system (Figure 3) is composed of all-in-one sensors attached to the head, waist, and both legs, a control unit, and a notebook. In Figure 4(f), for stair up with the gyro sensor, we can notice 12 directional signals, and they show almost the same pattern for similar gaits. Hence, only selected directional signals are considered in each figure: because the signal patterns are almost the same and the quantity is large, we collect two directional signals. From the top, the first two signals are the head signals, and the next ones are the waist, left leg, and right leg signals, respectively. We also carried out preprocessing to synchronize the signals, and the resulting gait signals are illustrated in Figure 4. We get the signals from the control unit, and the signals are processed in a notebook. The signal characteristics considered were the peak value and the magnitude distance between gait signals. Next, by applying the similarity measure, we obtain the calculation for each action such as walking, step up, and so on.
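The preprocessing described above (synchronization, then peak value and magnitude distance) might look like the following sketch. The paper does not state how the signals were synchronized; aligning by the cross-correlation peak is an assumption chosen here for illustration, and the function names and the synthetic pulse signals are hypothetical.

```python
import numpy as np


def best_lag(reference, signal):
    """Number of samples by which `signal` trails `reference` (may be negative)."""
    ref = reference - np.mean(reference)
    sig = signal - np.mean(signal)
    xcorr = np.correlate(sig, ref, mode="full")
    # In 'full' mode, index len(ref) - 1 corresponds to zero lag.
    return int(np.argmax(xcorr)) - (len(ref) - 1)


def synchronize(reference, signal):
    """Crop two equal-length signals to their overlapping, aligned part."""
    lag = best_lag(reference, signal)
    if lag >= 0:
        return reference[: len(reference) - lag], signal[lag:]
    return reference[-lag:], signal[: len(signal) + lag]


def peak_value(signal):
    """Largest absolute amplitude of a signal."""
    return float(np.max(np.abs(signal)))


def magnitude_distance(sig_a, sig_b):
    """Mean absolute amplitude difference between two aligned signals."""
    return float(np.mean(np.abs(sig_a - sig_b)))


# Synthetic stand-ins for two sensor channels: the same pulse, 15 samples apart.
t = np.arange(200.0)
reference = np.exp(-0.5 * ((t - 80.0) / 6.0) ** 2)
delayed = np.exp(-0.5 * ((t - 95.0) / 6.0) ** 2)

lag = best_lag(reference, delayed)
ref_sync, sig_sync = synchronize(reference, delayed)
print(lag, peak_value(reference), magnitude_distance(ref_sync, sig_sync))
```

Cropping to the overlapping part avoids the wrap-around that a circular shift would introduce, at the cost of slightly shorter signals.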

High Dimensional Analysis
Big data analysis has been emphasized in recent research [7,8,13,19]. Examples of big data are as follows.
(i) Biomedical data such as DNA sequences or electroencephalography (EEG) data. These have not only high dimension but also a large number of data channels.
(ii) Recommendation systems and target marketing, which are important applications in the e-commerce area. Analysis of customer/client information helps to predict purchasing actions based on customers' interests. These also involve a huge amount of data and a high dimensional structure.
(iii) Industry applications such as the EV station scheduling problem, which needs geometrical information, city size, population, traffic flow, and others. Hence, the number of stations and the station sizes constitute huge, high dimensional data.
Data might be expressed as a high dimensional structure described by the number of data and the dimension. Direct data comparison is applicable to overlapped data with a norm definition such as the Euclidean norm. Information distributions show various configurations, and hence various types of distance measure need to be considered to complete a discriminative measure. Furthermore, the similarity and relation between different information should be analyzed carefully when it represents high dimensional data. Specifically, the dimension represents the number of independent characteristics or attributes.
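As an illustration of direct comparison with the Euclidean norm, the sketch below computes all pairwise distances for data stored as an array with one row per datum and one column per dimension. The array layout, the function name `pairwise_euclidean`, and the sample values are assumptions made for this example.

```python
import numpy as np


def pairwise_euclidean(data):
    """Euclidean distance between every pair of rows of a 2-D data array."""
    data = np.asarray(data, dtype=float)
    # Broadcasting builds the difference between every pair of rows.
    diff = data[:, None, :] - data[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Three hypothetical data points with four attributes each.
data = [
    [0.0, 0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
distances = pairwise_euclidean(data)
print(distances)
```

The intermediate difference array grows with the square of the number of data, so for large data sets a chunked computation would be preferable to this direct form.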
Comparison between different patterns of the same person is carried out among walking, step up, and step down. Gait signals related to each person's movement were gathered to analyze different patterns and different persons: walking, stair up, and stair down signals were collected from 20 persons. Hence, personal information can be represented as multidimensional information, a triple of walking, stair up, and stair down signals for each person $i$, where $i = 1, \ldots, 20$.
Among the 20 personal data, 3 different behaviors were expressed. Considering the data, it is obvious that the data are overlapped; hence, the similarity measures (2) and (7)-(9) can be used for the similarity calculation. The similarity measure is computed between person and person, and also between different actions of the same person. Normalized similarity calculation results are illustrated in Table 1.
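The two comparisons (between actions of one person, and between persons for one action) can be organized as similarity tables. The sketch below is a hedged illustration on synthetic signals, not the experimental data: min-max scaling to obtain membership values is an assumption, and the similarity uses $s(A, B) = 1 - d((A \cap B), (A \cup B))$, which for min/max set operations equals one minus the mean absolute difference. All names are hypothetical.

```python
import numpy as np

ACTIONS = ("walking", "stair_up", "stair_down")


def to_membership(signal):
    """Min-max scale a signal to [0, 1] (an assumed way to get memberships)."""
    signal = np.asarray(signal, dtype=float)
    span = signal.max() - signal.min()
    if span == 0.0:
        return np.zeros_like(signal)
    return (signal - signal.min()) / span


def similarity(mu_a, mu_b):
    """s(A, B) = 1 - d(A ∩ B, A ∪ B); max - min equals |mu_a - mu_b|."""
    return 1.0 - float(np.mean(np.abs(mu_a - mu_b)))


def make_synthetic_gait(num_persons=4, samples=200, seed=0):
    """Synthetic stand-ins for gait signals, keyed by (person, action)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 4.0 * np.pi, samples)
    gait = {}
    for person in range(num_persons):
        for k, action in enumerate(ACTIONS):
            raw = np.sin((k + 1) * t + 0.2 * person)
            raw = raw + 0.05 * rng.standard_normal(samples)
            gait[(person, action)] = to_membership(raw)
    return gait


def action_table(gait, person):
    """Similarity between every pair of actions of one person."""
    return np.array([[similarity(gait[(person, a)], gait[(person, b)])
                      for b in ACTIONS] for a in ACTIONS])


def person_table(gait, action, num_persons):
    """Similarity between every pair of persons for one action."""
    return np.array([[similarity(gait[(p, action)], gait[(q, action)])
                      for q in range(num_persons)] for p in range(num_persons)])


gait = make_synthetic_gait()
actions_of_person_0 = action_table(gait, person=0)
walking_between_persons = person_table(gait, "walking", num_persons=4)
print(np.round(actions_of_person_0, 3))
print(np.round(walking_between_persons, 3))
```

Each table is symmetric with ones on the diagonal, so only the upper triangle needs to be reported or averaged.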
The results illustrate that stair up and stair down show higher similarity than the other pairs. However, even though this similarity is higher than the others, it is not close to one; it only reaches 0.55. Because of the different directions, stair up and stair down should have a basic limit on how closely they approach maximum similarity. Table 2 shows the average similarity between different individuals. The results show that stair up is the closest even with a different gait, and the walking pattern is naturally the least similar. In Table 3, the walking similarity between different individuals is shown. Similar results for stair up and down are obtained.

Conclusions
Gait signal identification was carried out through similarity measure design. Gait signals were obtained via a data acquisition system including a mobile station and all-in-one sensors attached to the head, waist, and two legs. To discriminate the gait signals with respect to different behaviors and individuals, a similarity measure based on the distance measure was designed. For the data distribution, overlapped and nonoverlapped distributions were considered, and the similarity measure was applied to calculate the similarity. However, the conventional similarity measure was shown to be unable to calculate the similarity of nonoverlapped data. To overcome this drawback, the similarity measure was designed with neighbor data information: closeness between neighboring data provides a measure of similarity among data sets. The calculation was carried out on two different artificial data sets, and the proposed similarity measure was useful for identifying nonoverlapped data distributions. It is meaningful that the similarity measure design can be extended to high dimensional data processing, because the gait signal was considered as high dimensional data. With the data acquisition system, gait signals of 20 persons were collected through experiments. Walking, stair up, and stair down signals were obtained, and the similarity measure was applied. By calculation, stair up and stair down showed higher similarity than the other pairs. Individual similarity for different gait signals was also obtained. Gait signal analysis can be used for the development of a behavior decision system; it also extends naturally to healthcare systems, especially for elderly people. Additionally, it can provide athletes with useful information on whether their actions have changed compared to previous behavior.