Accelerometer-based algorithm for the segmentation and classification of repetitive human movements during workouts

Monitoring a person’s physical activity has a wide range of applications in both sports and medicine. With the advancement of technology for measuring human movement, it is possible to monitor the performed activity without the need for an expert to directly supervise the trainee. While initial interest focused mainly on aerobic exercises, research has recently begun to focus on strength exercises. The goal is to achieve the highest possible accuracy in tracking movement while maintaining the low cost and energy autonomy of the monitoring device. In this paper, an algorithm for the segmentation and classification of repetitive movements during workouts based on 3-axis accelerometer data from a wearable device is presented. The accelerometer signals were recorded continuously during the workout session, which typically consisted of 9 strength exercises, where 8 default movements were repeated in three sets. Segmentation of the acceleration signals recorded during the workout was done using the frequency spectrum of the acceleration magnitude with an accuracy of 99.4%, while the classification of the segmented movements was done using the Dynamic Time Warping (DTW) algorithm with an accuracy of 85.7%.


Introduction
Regular and moderate physical activity has a positive effect on human health: it reduces the risk of illness and death and is often used as part of rehabilitation during recovery from surgery or serious illness [1,2]. Physical activity is defined as movement of the body that increases energy expenditure above the resting level, and it can be spontaneous (leisure, work or transport) or organized (sports, physical training or exercise). Recommendations and strategies for physical activity can be found in the publicly available literature [3,4]. In brief, in addition to their daily responsibilities (spontaneous activities), a person should also strive for a balanced, organized programme of activities or exercise during the week to improve aerobic working capacity and muscle strength. This form of activity can be carried out in a gym, at home or outdoors, and, to maximize the positive effect, it is recommended to perform it with some prior knowledge or in the presence of a professional expert.
Most often because of financial constraints and dependence on a pre-planned training schedule, people with inadequate knowledge approach exercise on their own, which often leads to loss of motivation and giving up, ineffective training or, in the worst case, even injuries [5][6][7]. If a person does opt for some form of supervised exercise, it is often in a group rather than individualized, which makes it challenging for trainers to systematically monitor an individual [8]. In addition, the existing measurement instruments (tests, questionnaires, assessment scales) used in the evaluation of the exercise [9] can be improved in terms of automation, objectivity and reliability [10,11].
The rapid development of technology in the last decade has greatly enabled and facilitated the process of digital recording of human movement [12]; therefore, it can present a possible solution for the development of intelligent supporting devices or systems that would additionally help a professional expert or could even replace them in certain situations. Considering the equipment and technology needed for recording, it can be roughly divided into two groups: (a) vision-based systems and (b) wearable-based systems. The choice of the group primarily depends on the field of application, type of activity and the space where the activity takes place [13].
Vision-based systems use a set of cameras and specialized markers attached to a subject's body to acquire the positions of markers in the 3D space. An example is the widely used Vicon system [14]. These systems provide high accuracy and are often considered the gold standard [15] but they also have some drawbacks such as a large number of cameras needed for complex movements, dependency on enclosed spaces (laboratory) or high costs. On the market, there are also simpler, more portable, markerless and low-cost vision-based systems (e.g. Microsoft Kinect [16]) but they are still lagging behind the more expensive ones in terms of accuracy.
Unlike vision-based systems, wearable-based systems do not have cameras and, therefore, provide more flexibility without spatial constraints [13]. In this case, a system is composed of different wearable devices attached to a subject that can measure many parameters depending on the sensors used. Due to their low price, sufficiently accurate measurements, portability and availability, this paper focuses on wearable systems, specifically inertial systems. Inertial systems are composed of miniature inertial sensors, accelerometers and/or gyroscopes, and often magnetometers. This combination of sensors is frequently used with a microcontroller, internal memory and a communication module, and together they form a wearable sensor device called an Inertial Measurement Unit (IMU) [17][18][19].
Different types of physical activity can be monitored with the help of wearable systems. While the emphasis in the initial research was mainly on aerobic exercises [20,21], more recently attention has also begun to focus on strength exercises. To independently monitor and evaluate repetitive human movements in this type of exercise, the evaluation is properly divided into two main categories: quantitative and qualitative [22]. Quantitative evaluation provides an overview of how many repetitions are done, and qualitative evaluation shows whether a repetition is being performed correctly.
Several research groups have successfully tackled the monitoring of strength exercises, but some issues remain open. The gaps that certain systems have left open, which make those systems partially incomplete in terms of quantitative and qualitative evaluation, are listed below.
The system detects the number of performed repetitions but there is no information about the beginning and end of repetitions so the trajectory of movement between the starting and ending point remains unknown [5,27,28].
A certain amount of data is needed in the training phase for the system to be able to perform repetition segmentation [1,29,30].
Taking into consideration the previous research and gaps, we propose an accelerometer-based algorithm for the segmentation and classification of repetitive human movements during a workout. The proposed algorithm places the emphasis on accuracy in tracking movement while maintaining low cost and low energy consumption of the device. Segmentation of the movement was done using the frequency spectrum of the acceleration magnitude, while the classification of the segmented movement was done using the DTW algorithm. For this research, volunteers performed 9 different strength exercises (Appendix, Table A11).
The remainder of the paper is organized as follows: Section "Materials and methods" describes the methodology, followed by results and discussion in section "Results and discussion" and section "Conclusions" concludes the paper and points out the important practical implications of this study with future work.

Materials and methods
This section describes the methods used to develop and test an algorithm for the segmentation and classification of human movements during exercise. The function of the algorithm is to separate the accelerometer data obtained by measurements during repetitive exercises into individual repetitions of the default exercise movement and then to determine which exercise is characterized by the signal of that repetition. We generated our own data set for this experiment. For all calculations and analyses in this study, we used Matlab R2020a.

Participants and performed exercises
For this research, four healthy male subjects were selected; no subject had a current or recent musculoskeletal injury that would impair their exercise performance. Information on age, body weight, height and exercise experience is listed in Table 1. Each subject performed a cycle of exercises (workout) according to a pre-agreed protocol (number of repetitions and sets) in the presence of an expert. One complete movement of the exercise is called a repetition, and several such repetitions without any rest between them form a set. The task of the expert was to explain to the subjects how to perform a particular exercise and to keep records of performed movements, i.e. repetitions. The workout consisted of 9 strength exercises that focused on activating the whole body, not just individual extremities. The workout included (1) Standing Front Dumbbell Raise, (2) Standing Dumbbell Lateral Raise with Arms Straight, (3) Standing Side Dumbbell Shrug, (4) Standing Dumbbell Curl with Rotation, (5) Bent-over Dumbbell Row, (6) Push-up, (7) Dumbbell Step-up, (8) Box Squat and (9) Heel Touch. The pace of exercise execution and the breaks between sets and between individual exercises were adjusted by the participant in agreement with the expert, and the order of performing the exercises was the same for all subjects. Subject A performed the given workout once, subject B 3 times, subject C 2 times and subject D 2 times, leading to a total of 8 different sets of data from the measurements. The duration of the workout of each subject was approximately 30 min.

Position and number of IMUs
For data acquisition, the Shimmer3 IMU was selected. As mentioned earlier, only accelerometer data were acquired. For monitoring the performance of a particular exercise, we searched for the optimal position of the IMUs. The position of the IMUs should allow monitoring of all performed exercises. The idea is to use as few IMUs as possible and a single algorithm. Based on our previous experience with tracking physical activities, we decided to place at least one IMU per body segment that is primarily involved in the performance of the selected exercises. This resulted in a minimum of 3 IMUs, which were placed on the wrist of the right hand, the middle of the chest and the right thigh (Figure 1). Table 2 lists the exercises and the IMU from which the accelerometer readings were taken for data processing of the corresponding exercise. The accelerometer range was set to ±4 g and the sampling frequency to 128 Hz. To ensure the highest possible accuracy of the sensors, calibration was performed using the Shimmer 9DoF Calibration application.

Signal preparation and processing
The software tool ConsensysPro was used to collect and save data from the sensors, and Matlab was used for processing and analysis. During the measurement, the expert marked the beginning and end of each set on the recorded data in real time using an event marker tool (ConsensysPro). Figure 2 shows three raw acceleration components and event marker signals during the performance of three sets of Standing Front Dumbbell Raise exercises obtained from an IMU located on the right wrist. First, individual sets were separated using event markers. Then, the acceleration components were processed. The processing consists of removing the mean value, scaling with a factor of g = 9.81 m/s² and calculating the Acceleration Vector Magnitude (AVM) according to the following expression:

$$\mathrm{AVM}_i = \sqrt{a_{x,i}^2 + a_{y,i}^2 + a_{z,i}^2}$$

where i is the index of the current data sample and a_x, a_y and a_z represent the acceleration signals in the x, y and z axes of the sensor, respectively. Acceleration and AVM are expressed in g units (1 g = 9.81 m/s²). After the calculation of the AVM, the segmentation was done. Two methods for repetition segmentation are presented below.
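The preprocessing described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Matlab code: the function name `acceleration_vector_magnitude` is hypothetical, the input is assumed to be in m/s², "scaling" is assumed to mean division by g, and the mean is assumed to be removed per axis before the magnitude is taken.

```python
import math

G = 9.81  # m/s^2, scaling factor stated in the text


def acceleration_vector_magnitude(ax, ay, az, g=G):
    """Return the AVM in g units for three equally long component lists.

    Assumptions (not confirmed by the paper): inputs are in m/s^2, the mean
    is removed from each axis separately, and scaling means dividing by g.
    """
    n = len(ax)
    if n == 0 or len(ay) != n or len(az) != n:
        raise ValueError("components must be non-empty and equally long")

    def prepare(component):
        # Remove the mean value, then convert from m/s^2 to g units.
        mean = sum(component) / n
        return [(value - mean) / g for value in component]

    x, y, z = prepare(ax), prepare(ay), prepare(az)
    # Sample-by-sample Euclidean norm of the three prepared components.
    return [math.sqrt(xi * xi + yi * yi + zi * zi) for xi, yi, zi in zip(x, y, z)]
```

Because only a single waveform per IMU remains after this step, the later stages operate on one signal regardless of how the sensor is oriented on the body.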

Repetition segmentation
After extracting the individual set and calculating the AVM, the next step is to segment the repetitions within each set. Because the algorithm should be energy efficient and implementable on embedded devices, the AVM signal should take such a waveform that it can be easily segmented using simple and fast functions, such as those for finding local minima and maxima (Figure 6).
The first step in the segmentation process is to determine the frequency spectrum of the signal and the dominant frequency in the spectrum. Figure 3 shows the flow diagram of the algorithm by which individual repetitions are obtained from the set. The dominant frequency in the first step of the algorithm is assumed as the frequency of the peak amplitude in the spectrum. Figure 4 shows the spectrum of the signal from the second set for the Heel Touch exercise performed by subject C. In the spectrum, the peak amplitude is marked with a red circle, and the corresponding frequency is 0.4 Hz.
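As a rough illustration of this first step, the sketch below computes an amplitude spectrum with a naive discrete Fourier transform and returns the frequency of the largest non-DC peak. The function names are hypothetical, the O(n²) transform is used only to keep the example dependency-free (an FFT would be used in practice), and excluding the DC bin is an assumption.

```python
import cmath
import math

FS = 128.0  # Hz, sampling frequency stated in the paper


def amplitude_spectrum(signal, fs=FS):
    """Naive DFT: return (frequencies, amplitudes) from 0 Hz up to Nyquist."""
    n = len(signal)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        acc = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        amps.append(abs(acc) / n)
    return freqs, amps


def dominant_frequency(signal, fs=FS):
    """Frequency (Hz) of the largest spectral peak, ignoring the DC bin."""
    if len(signal) < 2:
        raise ValueError("signal must contain at least two samples")
    freqs, amps = amplitude_spectrum(signal, fs)
    peak_bin = max(range(1, len(amps)), key=lambda k: amps[k])
    return freqs[peak_bin]
```

The frequency resolution is fs divided by the number of samples, so a set must last at least several repetitions for peaks near 0.4 Hz to be resolved.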
After calculating the dominant frequency in the spectrum, it is necessary to determine whether this frequency is optimal for repetition segmentation. In Figure 4, the dominant frequency is also optimal for segmentation. However, if the frequency spectrum contains, in addition to the dominant frequency, a lower frequency at which the signal has a pronounced amplitude (Figure 5), it has been shown empirically that the lower frequency must be chosen for segmentation. In Figure 5, this amplitude in the spectrum is indicated by a red circle.
When the optimal frequency is determined, the AVM is filtered. A low-pass Chebyshev type II filter is used for filtering. The order of the filter depends on the passband and stopband frequencies at which the signal is filtered. The optimal frequency is taken as the passband frequency, and the stopband frequency is twice as high as the passband frequency (Figure 6).
Minima or maxima of the AVM are used to define boundaries between segments, depending on the individual exercise. In exercise 9 (Heel Touch) and exercise 3 (Standing Side Dumbbell Shrug), maxima are taken, while in the others, minima are taken as the boundaries between the segments.
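A minimal sketch of this boundary rule is given below. The helper names are hypothetical, only strict local extrema are detected (flat plateaus are ignored), and the input is assumed to be the already filtered AVM of one set.

```python
# Exercises for which maxima mark the boundaries, as stated in the text:
# 3 (Standing Side Dumbbell Shrug) and 9 (Heel Touch).
MAXIMA_EXERCISES = {3, 9}


def local_extrema(signal, kind="min"):
    """Indices of strict local minima ("min") or maxima ("max")."""
    if kind not in ("min", "max"):
        raise ValueError("kind must be 'min' or 'max'")
    sign = 1.0 if kind == "min" else -1.0
    s = [sign * value for value in signal]
    return [i for i in range(1, len(s) - 1) if s[i] < s[i - 1] and s[i] < s[i + 1]]


def segment_repetitions(filtered_avm, exercise_id):
    """Return (start, end) sample index pairs between consecutive boundaries."""
    kind = "max" if exercise_id in MAXIMA_EXERCISES else "min"
    bounds = local_extrema(filtered_avm, kind)
    return list(zip(bounds[:-1], bounds[1:]))


# Toy waveform with three minima (indices 2, 6, 10) and three maxima (4, 8, 12).
toy = [1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1, 0, 1, 0]
print(segment_repetitions(toy, exercise_id=1))  # [(2, 6), (6, 10)]
print(segment_repetitions(toy, exercise_id=9))  # [(4, 8), (8, 12)]
```

Only neighbour comparisons are needed, which is in line with the stated goal of using simple and fast functions on embedded devices.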
After repetition segmentation, it is necessary to remove the artifacts that most often appear before the first and after the last repetition. They are easy to eliminate by considering two criteria: the reciprocal value of the segmentation frequency (which roughly represents the average repetition time) and the maximum amplitude in the set. A segment is discarded if it lasts less than half the average repetition time in that set and its maximum amplitude is not within the range determined by 50% of the maximum value in the whole set.
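One possible reading of these two criteria is sketched below. The interpretation is an assumption: a segment is dropped only when both conditions hold, and "not in the range determined by 50% of the maximum value" is read as a segment peak below half of the set peak. The function name and argument layout are hypothetical.

```python
def remove_artifacts(segments, filtered_avm, segmentation_freq_hz, fs=128.0):
    """Drop segments that are both too short and too weak.

    segments: list of (start, end) sample indices within filtered_avm.
    segmentation_freq_hz: frequency used for segmentation; its reciprocal
    approximates the average repetition time.
    """
    avg_rep_samples = fs / segmentation_freq_hz  # average repetition length
    set_peak = max(filtered_avm)                 # maximum value in the whole set
    kept = []
    for start, end in segments:
        too_short = (end - start) < 0.5 * avg_rep_samples
        too_weak = max(filtered_avm[start:end + 1]) < 0.5 * set_peak
        # Assumed rule: discard only if BOTH criteria are met.
        if not (too_short and too_weak):
            kept.append((start, end))
    return kept
```

Requiring both conditions keeps short but energetic repetitions, while the low-amplitude fragments that typically precede the first or follow the last repetition are removed.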
The last step in the segmentation process is used only for certain exercises where it is necessary to connect adjacent segments. This is because two adjacent segments together actually form one repetition. This most commonly occurs with Dumbbell Step-up because of the pronounced pause that occurs when a person stands on a bench and pauses before returning to the starting position.
As already described, with this segmentation method the dominant frequency may not correspond to the optimal frequency required for filtering, which prolongs the execution of the algorithm and also affects the accuracy of segmentation. Out of 137 sets performed in 5 different measurements, the dominant frequency was optimal for signal segmentation in 108 sets (78.8%), while in 29 sets (21.2%) another frequency had to be chosen as optimal. Table 3 shows the ranges of segmentation frequencies by exercise. It can be seen from the table that in most exercises the segmentation frequency ranges from 0.3 to 0.9 Hz. This frequency range can be used to speed up the process of determining the optimal frequency, so a new segmentation method is proposed below.

Improved repetition segmentation with band-pass filtering
To further automate the segmentation presented in the previous subsection in terms of selecting the optimal frequency, an improved repetition segmentation method is proposed, which includes an additional preprocessing step for the AVM. The preprocessing consists of filtering the AVM with a band-pass filter whose cut-off frequencies, 0.25 and 1.2 Hz, were determined using the knowledge obtained from the previous segmentation method. Figure 7 shows the modified flow diagram. Filtering with a band-pass filter removes most frequency components that are not relevant for repetition segmentation, in which it is important to obtain prominent minima or maxima that mark the boundaries between repetitions in the set.
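The effect of this preprocessing can be illustrated with an idealized frequency-domain band-pass. This is a stand-in and not the filter used in the study, whose type and order are not specified here: the sketch simply zeroes all DFT bins outside 0.25 to 1.2 Hz and transforms back. The function name is hypothetical, and the naive O(n²) transform is used only to avoid external dependencies.

```python
import cmath
import math


def bandpass_dft(signal, fs, low_hz=0.25, high_hz=1.2):
    """Idealized (brick-wall) band-pass: zero every DFT bin outside the band."""
    n = len(signal)
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins k and n - k represent the same physical frequency.
        freq = min(k, n - k) * fs / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0j
    # Inverse DFT; the imaginary part is numerical noise for real input.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
```

After such filtering, the constant offset and faster components are gone, so the largest remaining spectral peak already lies in the range where segmentation frequencies were observed.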

Repetition classification
The Dynamic Time Warping (DTW) algorithm is used for repetition classification from previously segmented signals. DTW provides a measure of the similarity between two compared signals. For each exercise, the signal of a trained and experienced person is used as a reference template. Unlike Euclidean distance, which compares the corresponding pairs of points of two signals and calculates their distances, DTW looks for the pairs of points that best match. The DTW algorithm is therefore suitable for signals that have different numbers of samples because of their different durations. This is a major advantage for signals that represent repetitions of the same exercise performed at different speeds.
Before classification, the signal of each repetition was filtered by a low-pass filter with a cut-off frequency of 2 Hz, taking into account the average time duration of individual repetitions. The classification was done by calculating the distances between each repetition and the nine saved templates using the DTW function. Repetition was classified as the type of exercise for which the distance between the repetition and the corresponding template was the smallest.
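The nearest-template rule described above can be sketched as follows. This is a textbook DTW with an absolute-difference local cost and no warping window, which is an assumption; the study used a DTW function in Matlab whose exact settings are not given here. The names `dtw_distance` and `classify_repetition` are hypothetical, and the toy templates are invented for illustration only.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D sequences."""
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, len(a) + 1):
        cur = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]


def classify_repetition(repetition, templates):
    """Return the exercise id whose template has the smallest DTW distance."""
    return min(templates, key=lambda ex_id: dtw_distance(repetition, templates[ex_id]))


# Invented toy templates; real templates would be filtered AVM repetitions.
templates = {1: [0.0, 1.0, 0.0], 2: [0.0, -1.0, 0.0]}
slow_repetition = [0.0, 0.5, 1.0, 1.0, 0.5, 0.0]  # same shape as template 1, slower
print(classify_repetition(slow_repetition, templates))  # 1
```

The repetition and the template have different lengths, which is exactly the case where a point-by-point Euclidean comparison would not be applicable.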

Results and discussion
This section presents and comments on the results obtained using an improved method for repetition segmentation with a band-pass filter and repetition classification using the DTW algorithm. Measurements were done with 4 subjects during 9 different exercises.

Improved repetition segmentation with band-pass filtering
This segmentation method uses the knowledge learned from the method presented in section "Repetition segmentation". The AVM is further filtered through a band-pass filter with cut-off frequencies of 0.25 and 1.2 Hz in order to find the optimal frequency (explained in section "Improved repetition segmentation with band-pass filtering"). The starting and ending time points of each segmented repetition were compared with manually selected points marked by the same expert who supervised the subjects during the experiment. Out of 8 different sets of data from measurements, 1652 repetitions were accurately segmented, 4 repetitions were not segmented (once in exercise 2, once in exercise 6 and twice in exercise 5) and 6 signal segments were incorrectly classified as repetitions, i.e. this method of segmentation achieves an accuracy of 99.4%.
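The reported accuracy can be reproduced from these counts under one assumption about its definition: with no true negatives available, accuracy is taken as TP / (TP + FN + FP), where the 1652 correct segments are true positives, the 4 missed repetitions are false negatives and the 6 spurious segments are false positives. The sketch below also computes precision, recall and F1-score with their standard definitions; the function name is hypothetical.

```python
def segmentation_metrics(tp, fn, fp):
    """Detection metrics from counts; accuracy assumes no true negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": tp / (tp + fn + fp),  # assumed definition
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = segmentation_metrics(tp=1652, fn=4, fp=6)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under this definition the accuracy evaluates to 1652/1662, which rounds to the reported 99.4%.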

Repetition classification
For the classification procedure it was necessary to store one repetition of each exercise. One filtered repetition from each exercise performed by subject B was taken as repetition template. Signals obtained from subject B were taken as a reference because subject B had the greatest experience in performing strength exercises. The templates are visible in Figure 8. The classification results for each of the measurements are given in Tables A1-A10 (Appendix).
From the total results of the classification, 1415 out of 1652 repetitions or 85.7% were accurately classified. However, the results differ considerably in terms of subjects and measurements.
When looking at the results by subject, 145 of 204 repetitions (71.1%) were correctly classified for subject A, 554 of 578 (95.8%) for subject B, 375 of 439 (85.4%) for subject C and 341 of 431 (79.1%) for subject D. The results for subject B proved to be the best, which is very likely related to the fact that the templates for each of the exercises were taken from the measurements performed by subject B, so this subject's repetitions are most similar to the templates. For subjects C and D the results are very good, with a classification accuracy higher than 75% for both, while for subject A, who performed one measurement, the result is slightly worse.
Referring to the exercises (Table A10), the results of the classification are quite different. For five exercises (3. Standing Side Dumbbell Shrug, 4. Standing Dumbbell Curl with Rotation, 6. Push-up, 7. Dumbbell Step-up and 9. Heel Touch) the accuracy of the classification is over 90%, and for two of them it is 100%. In exercise 8 (Box Squat), the accuracy is 76.4%, and incorrectly classified movements are in all cases classified as exercise 1 (Standing Front Dumbbell Raise). The signals for these two exercises were taken from two different IMUs, so it would be expected that these two exercises could be better classified. For exercise 5 (Bent-over Dumbbell Row) the result is very good (87.8%); in this exercise the misclassified movements are classified as exercise 2 (Standing Dumbbell Lateral Raise with Arms Straight). In exercise 1 (Standing Front Dumbbell Raise) the result is also good (77.3%); in the case of incorrect classification, the movements from that exercise were classified into 4 other exercises. The result for exercise 2 is poor (42.1%); in a large number of cases the movements of this exercise are classified either as exercise 1 (37.6%) or as exercise 4 (11.2%). This can be expected to some extent because the first and second exercises are performed very similarly. If these two exercises were viewed as the same exercise, then the overall classification accuracy for the two would be 81.5%, which would be a good result.

Discussion
To develop a wearable system that would make it easier for a professional expert to monitor a person who performs exercise or enable people to train independently with good form quality and motivation, it is necessary that the system has satisfactory feedback to the user [4,7,8,17,23]. In the development of system feedback, where a workout consists primarily of strength exercises, it is possible to take advantage of the fact that human movements are repetitive. The form of feedback in this case should contain two important parameters, quantitative and qualitative, i.e. the number of performed repetitions of a particular exercise and the quality of the performed repetitions [22,31]. For successful counting and assessment of repetition quality, repetition must first be detected and isolated (segmented) and then identified (classified) to which exercise it belongs [30]. Only after the segmentation and classification of repetitions have been done, it is possible to start the quality assessment.
The main objective of this research was to determine the accuracy of the developed accelerometer-based algorithm for the segmentation and classification of repetitive human movements during a workout. The workout consisted of 9 strength exercises that focused on activating the whole body; therefore, the number and placement of IMUs were carefully selected, using prior knowledge in monitoring physical activities. Due to the desire to achieve an energy-efficient algorithm that is implementable on embedded systems and does not need a certain amount of data in a training phase, the number of input parameters is reduced to only one waveform (AVM) and the process of segmentation and classification is simplified as much as possible.
Out of 1656 movements, i.e. repetitions, 1652 were successfully segmented using the proposed segmentation method, segmentation of 4 repetitions was not successful, and 6 signal segments were incorrectly classified as repetitions. We achieved an accuracy of 99.4%, recall of 99.7%, precision of 99.6% and F1-score of 99.7%, which we find comparable to or better than the values reported in the literature. To the authors' knowledge, no research has been done so far with the same selection of exercises and position of IMUs, so our results cannot be directly compared with the existing literature, but we have listed the most relevant studies. Guo et al. [23] compared the repetition segmentation accuracy of two different IMUs in two different positions, a smartwatch on the wrist and a smartphone on the upper arm. They achieved an average accuracy of 99%. It is necessary to mention that their choice of exercises primarily referred to exercises in which the arm represented the dominant body segment, and they used the data obtained from all three sensors: accelerometer, gyroscope and magnetometer. In [5], the authors achieved segmentation recall ranging from a minimum of 84.1% for an IMU located in the ear to a maximum of 91.6% for one on the wrist. A wider range of body activation was present during the workout and only accelerometer signals were used. Pernek et al. [32] detected and separated repetitions using a method based on the DTW algorithm. They chose a very wide range of exercises with which they managed to activate the whole body. The data were obtained from an accelerometer inside a smartphone that was placed at 3 different locations (wrist, ankle or on top of the weights), depending on the exercise. The average F1-score, precision and recall for all exercises and environments were 99.3%, 100% and 98.8%, respectively.
When it comes to recognizing segmented repetitions, the authors of [23] achieved an average accuracy of 95% for a smartwatch on the wrist and 91% for a smartphone on the upper arm. A lightweight classifier (Support Vector Machine) was used on 27 features extracted from the acceleration in the world coordinate system. In [5], the mean classification accuracy ranged from a minimum of 78.4% for an IMU located in the ear to a maximum of 97.2% for one on the chest. The template for the DTW algorithm in the process of classification was chosen randomly 50 times to avoid redundancy. O'Reilly et al. [8] implemented a method for tracking and recognizing lower-limb exercises with wearable sensors. They placed 5 IMUs on subjects (on the thighs, shanks and lumbar) and achieved 99% accuracy. Furthermore, for a single IMU placed on the shank, they obtained 98% accuracy.
Regardless of the small number of subjects in the proposed research, overall accuracy is comparable with the abovementioned studies. Detailed classification results can be analyzed using Tables A1-A10 (Appendix), and the authors' observations can be found in the previous subsection.
As indicated before, the main disadvantage of this research is the small number of subjects. Therefore, in future work, the plan is to apply the proposed algorithm to a more extensive set of subjects who exercise simultaneously and to compare the accuracy of our algorithm with other common classification methods applied to larger groups.

Conclusions
In this paper, a method for the segmentation of repetitive movements during strength exercises was developed and tested on signals acquired by three IMUs and processed with the Matlab software tool. The method for segmentation of the movement is based on the frequency spectrum of the acceleration magnitude, and we achieved an accuracy of 99.4%. The classification of the segmented movements was performed using the DTW algorithm, and we achieved an accuracy of 85.7%. Because the methods are intended to be simple enough to be applied to large groups of subjects exercising simultaneously, we consider the classification results for the segmented movements satisfactory.

Table A1. Colours for accurately and incorrectly classified repetitions.
Table A2. Repetition classification results, subject A, measurement 1.
Table A3. Repetition classification results, subject B, measurement 1.
Table A4. Repetition classification results, subject B, measurement 2.
Table A5. Repetition classification results, subject B, measurement 3.
Table A6. Repetition classification results, subject C, measurement 1.
Table A7. Repetition classification results, subject C, measurement 2.
Table A8. Repetition classification results, subject D, measurement 1.