Fall Detection Based on Key Points of Human-Skeleton Using OpenPose

According to statistics, falls are the primary cause of injury or death among people over 65 years old, and about 30% of them fall every year. With the number of falls among the elderly increasing each year, a fast and effective fall detection method is urgently needed to help elderly people who fall. A fall occurs when the center of gravity of the human body becomes unstable or its symmetry is broken, so that the body cannot keep its balance. To address this problem, we propose an approach for the recognition of accidental falls based on the symmetry principle. We extract the skeleton information of the human body with OpenPose and identify a fall through three critical parameters: the speed of descent at the center of the hip joint, the angle between the human body centerline and the ground, and the width-to-height ratio of the external rectangle of the human body. Unlike previous studies that investigated only the falling behavior, we also consider whether the person stands up after a fall. The method recognizes falling behavior with a success rate of 97%.


Introduction
The decline in birth rates and the lengthening of life spans lead to the aging of the population, which has become a worldwide problem [1]. According to the research in [2], the elderly population will increase dramatically in the future, and the proportion of the elderly in the world population will continue to grow and is expected to reach 28% in 2050. Aging is accompanied by a decline in bodily function, which increases the risk of falls. According to statistics, falls are the primary cause of injury or death among people over 65 years old, and about 30% of them fall every year [3]. In 2015, there were 29 million falls among the elderly in the United States, of which 37.5% required medical treatment or restricted activity for 1 day or more, and about 33,000 people died [4]. The most common immediate consequences of falls are fractures and other long-term ailments, which can lead to disability, loss of independence, and a psychological fear of falling again [5]. Falls not only cause moderate or severe injuries to the elderly but also place a mental burden and economic pressure on them and their relatives [6]. It is therefore particularly important to detect falls of the elderly quickly and effectively and to provide emergency assistance, so that those who fall and cannot call for help are found and treated in time. This paper proposes a new fall detection method. The method processes every frame captured by a surveillance camera, using the OpenPose skeleton extraction algorithm to obtain the skeleton data of the people in the frame. Each joint is represented by its horizontal and vertical coordinates in a plane coordinate system.
The speed of descent at the center of the hip joint, the angle between the human body centerline and the ground, and the width-to-height ratio of the external rectangle of the human body are used as the conditions to identify falling behavior and to determine whether the person can stand up unaided after a fall. Xu et al. [21] used OpenPose to obtain a data set of human skeleton maps and trained a new model that can predict falls.

Fall Detection
In current fall detection research, articles [22,23] both divide fall detection systems into three categories: vision-based sensors, wearable device-based sensors, and ambient sensors. Later, Ren et al. [24] proposed a more comprehensive classification of fall detection according to the sensing apparatus. Based on the sensing equipment used in existing fall detection systems, as shown in Figure 1, fall detection is divided into four categories: inertial sensor-based, context-based, RF-based, and sensor fusion-based.

Inertial Sensor(s)-Based Fall Detection
A fall involves abrupt changes, such as collisions, changes in body orientation, or severe tilts. These features can be measured by sensors such as accelerometers, barometers, gyroscopes, and magnetometers. Shahzad et al. [25] developed a pervasive fall detection system on smartphones, which uses the signals of the accelerometer embedded in the smartphone and a proposed two-step algorithm to detect falls. Fino et al. [26] proposed two novel methods and combined them to achieve remote monitoring of turning behavior using three uniaxial gyroscopes, and found a relationship between rotation frequency and falls. A pressure sensor detects and tracks pressure based on the weight of an object. Light et al. [27] built pressure sensors into smart shoes that detect falls by measuring whether a person's gait is consistent. Han et al. [28] used a bidirectional EMG (electromyographic) sensor network model to realize simple communication between the user and the nursing staff, and showed that the method could detect fall events more flexibly and effectively. Sun et al. [29] used a plantar inclinometer sensor to obtain the angle changes during walking and the angle status after falling. Thresholds were selected from the plantar angle in the fall state in four directions: forward, backward, left, and right. They conducted 100 fall tests under different circumstances, and the detection rate was 92%. The advantages of inertial sensor-based fall detection are portability, ease of implementation, good real-time performance, few privacy issues, and high accuracy. However, the method also has shortcomings. The most obvious is that people need to wear the device on their bodies, which is undoubtedly intrusive for users.

Context-Based Fall Detection
Context-based fall detection can be divided into two categories: ambient-based and vision-based. What the two have in common is that they detect falls by sensing information from the external environment to track human behavior. Ambient-based systems mainly use sensors that collect vibration, acoustic, and pressure signals to track the human body. Droghini et al. [30] used a floor acoustic sensor to acquire sound waves passing through the floor and established a human fall classification system. Using the new sensing technology of piezoresistive pressure sensors, Chaccour et al. [31] designed an intelligent carpet to detect falls. Infrared array sensors are also used in fall detection systems; Fan et al. [32] improved such systems with a variety of deep learning methods, giving their fall detection system more obvious advantages. Compared with inertial sensor-based systems, the biggest advantage of ambient-based devices is that they essentially do not interfere with people. Because their interaction with people is unobtrusive and minimal, ambient-based fall detection systems also rarely raise security and privacy issues. However, these methods have a limited detection range. In addition, ambient sensors are more easily affected by the external environment.
Symmetry 2020, 12, 744
As cameras are widely used in daily life, they are increasingly used to obtain the relevant information; such approaches are referred to as vision-based fall detection systems. Many studies have used depth cameras (Kinect), RGB cameras, thermal sensors, or even a combination of cameras to track changes in body shape or the trajectory of the head, or to monitor the body posture of the subject, in order to detect or prevent falls. Fan et al. [33] proposed a novel vision-based fall detection method that extracts and analyzes a description of the human body posture. Based on Kinect, Liu et al. [34] developed a novel fall recognition algorithm, which was experimentally verified to recognize human falls quickly and effectively. Kong et al. [35] obtained the outline of the binary image using the Canny filter and a depth camera; the resulting outline image was then used for fall detection. Rafferty et al. [36] combined computer vision processing with a thermal vision sensor installed on the ceiling for fall detection. This novel method overcomes the shortcomings of traditional methods. However, vision-based fall detection has some disadvantages, such as the considerable computing and storage capacity needed to run the algorithm in real time, privacy issues, and the limited space that can be monitored.

RF-Based Fall Detection
Studies have found that violent body movements can cause abnormal changes in the RF signal. This feature provides a new idea for fall detection, namely detecting falls through fluctuations of the RF signal. RF-based fall detection systems are mainly divided into two categories: radar frequency-based and wireless channel-based systems. Tang et al. [37] proposed a fall prevention system based on FMCW radar, which predicts a fall by continuously measuring the distance between the radar and the surrounding environment and analyzing the relationship between human motion and radar frequency. Fall detection systems based on wireless channel state information can quickly estimate the changes in wireless signals, such as WiFi or Bluetooth, caused by different human activities. Wang et al. [38] designed an indoor fall detection system using ordinary WiFi equipment, which has many advantages such as being real-time, non-contact, low-cost, and accurate. As radio frequency signals are ubiquitous, the biggest advantage of this approach is that it can detect fall events conveniently without being intrusive to the user. However, RF-based technology also has its limitations. Most wireless networks are deployed in houses within a limited range, and comprehensive coverage is a problem.

Sensor Fusion-Based Fall Detection
Low accuracy or a high false positive rate is a widespread problem in fall detection systems with a single sensor, which means that additional information is needed to improve the accuracy of the system. For example, Lu et al. [39] used a combination of infrared sensors and pressure sensors to detect fall events. Quadros et al. [40] used an accelerometer, gyroscope, and magnetometer to obtain a variety of information, such as acceleration, velocity, displacement, and direction components, and then integrated them. Using the fused information, they proposed a fall detection method based on a combination of thresholds and machine learning. Kepski et al. [41] proposed an efficient fall detection algorithm that uses information derived from wireless inertial sensors and depth images. Ramezani et al. [42] detected falls from ground vibration. Unlike in traditional methods, the ground vibration signals were acquired by combining Wi-Fi Channel State Information (CSI) with a ground-mounted accelerometer. A sensor fusion system can provide more information about human activity. On the one hand, the additional information significantly improves the performance of the fall detection system; on the other hand, the large amount of information also brings disadvantages, such as redundant information, poorly performing information fusion methods, and the need for a robust fusion algorithm. The advantages and disadvantages of the various categories are shown in Table 1. The method proposed in this paper is vision-based. It can make full use of the cameras already present in daily life, which makes it convenient, accurate, low-cost, and easy to implement.

Methods
Our proposed approach consists of five key steps: (1) OpenPose obtains the skeleton information of the human body; (2) decision condition one (the speed of descent at the center of the hip joint); (3) decision condition two (the angle between the centerline of the human body and the ground); (4) decision condition three (the width-to-height ratio of the external rectangle of the human body); and (5) determining whether the person can stand up after a fall. The procedure of our proposed approach is shown in Figure 2.

OpenPose Gets the Skeleton Information of the Human Body
OpenPose is an open-source human pose estimation library developed at Carnegie Mellon University (CMU). It is based on convolutional neural networks and supervised learning and is built on Caffe (Convolutional Architecture for Fast Feature Embedding) [43]. In 2017, researchers from Carnegie Mellon University released the source code of the OpenPose human skeleton recognition system, which enables real-time tracking of targets in surveillance video. It can capture COCO (Common Objects in Context) human skeleton information from color video and provide the joint information in the scene. The OpenPose key node recognition system can detect the skeleton information of multiple people in real time. It adopts a bottom-up human pose estimation algorithm: it first detects the positions of the key points of the human body from confidence (heat) maps and then uses part affinity fields to associate the key points with individual people. OpenPose can estimate body movement, facial expression, finger movement, and other postures. It is suitable for both single-person and multi-person scenes and has excellent robustness.
As shown in Figure 3, OpenPose is applied to the images taken by the surveillance camera to obtain the information of the human key nodes. The surveillance video is divided into a series of frames, each showing the skeleton of a person. As shown in Table 2, the position of each joint point is represented by its horizontal and vertical coordinate values, and a confidence value is provided for each joint point. For some joints, the accuracy of the coordinate position is not ideal. This problem is mainly due to limitations of the OpenPose algorithm itself, but the deviation of a few key points has little effect on the recognition of the whole fall action. The joint corresponding to each joint point number in the table is shown in Figure 4.
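The per-joint representation described above can be illustrated with a short sketch. It assumes that each frame is available as an OpenPose-style JSON string in which every person carries a flat list `pose_keypoints_2d` of the form `[x0, y0, c0, x1, y1, c1, ...]`; the key name, the helper names, and the confidence cut-off `min_conf` are illustrative assumptions and are not taken from the paper.

```python
import json
from typing import List, Optional, Tuple

# A joint is an (x, y) pixel position, or None when it was not detected reliably.
Joint = Optional[Tuple[float, float]]


def parse_keypoints(flat: List[float], min_conf: float = 0.1) -> List[Joint]:
    """Turn a flat [x0, y0, c0, x1, y1, c1, ...] list into per-joint (x, y) pairs.

    min_conf is a hypothetical cut-off: joints with lower confidence become None.
    """
    joints: List[Joint] = []
    for i in range(0, len(flat) - 2, 3):
        x, y, c = flat[i], flat[i + 1], flat[i + 2]
        joints.append((x, y) if c >= min_conf else None)
    return joints


def first_person_joints(frame_json: str, min_conf: float = 0.1) -> List[Joint]:
    """Parse one per-frame JSON string and return the joints of the first person."""
    people = json.loads(frame_json).get("people", [])
    if not people:
        return []
    # Assumed key name for the flat keypoint list of one person.
    return parse_keypoints(people[0]["pose_keypoints_2d"], min_conf)
```

Returning `None` for low-confidence joints lets later steps skip missing key points explicitly instead of treating a `(0, 0)` placeholder as a real position.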

For convenience of representation, $S = \{s_0, s_1, \cdots, s_{13}\}$ denotes the set of joint positions. We define the Joint Coordinates (JC): the position of node $j$ at time $t$ is $s_j(t) = (x_{tj}, y_{tj})$, $j \in \{0, 1, \cdots, 13\}$.

Decision Condition One (the Speed of Descent at the Center of the Hip Joint)
As shown in Figure 5, during a sudden fall the center of gravity of the human body changes in the vertical direction. The center point of the hip joint can represent the center of gravity of the human body and reflect this feature. By processing the joint point data obtained from OpenPose, the vertical coordinate of the hip joint center is obtained for each frame. Because the transition from a standing posture to a fallen posture is very short, detection is performed once every five adjacent frames, at a time interval of 0.25 s. The coordinates of the hips are $s_8(t) = (x_{t8}, y_{t8})$ and $s_{11}(t) = (x_{t11}, y_{t11})$. Let the y-coordinate of the hip joint center at time $t_1$ be $y_{t_1} = \frac{y_{t_1 8} + y_{t_1 11}}{2}$ and the y-coordinate at time $t_2$ be $y_{t_2} = \frac{y_{t_2 8} + y_{t_2 11}}{2}$. From these, the descent velocity of the hip joint center is obtained:

$$v = \frac{\left| y_{t_2} - y_{t_1} \right|}{t_2 - t_1}$$

When $v$ is greater than or equal to the critical speed $\bar{v}$, the fall feature is considered to be detected. According to the experimental results, this paper chooses 0.009 m/s as the threshold of the falling speed of the hip joint center. When $v \geq \bar{v}$, $M_1 = 1$, and decision condition one is considered satisfied.
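Decision condition one can be sketched in a few lines. The sketch assumes that the hip coordinates have already been converted to the same length unit as the threshold, that the two samples are 0.25 s apart, and that the magnitude of the vertical displacement is used; the function names and the `dt` parameter are illustrative and not part of the paper.

```python
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position of one joint


def hip_center_y(right_hip: Point, left_hip: Point) -> float:
    """y-coordinate of the hip joint center: the mean of joints 8 and 11."""
    return (right_hip[1] + left_hip[1]) / 2.0


def descent_speed(y_t1: float, y_t2: float, dt: float = 0.25) -> float:
    """Magnitude of the vertical speed of the hip center between two samples.

    Assumption: the absolute value is used, so the sign convention of the
    image y axis does not matter here.
    """
    return abs(y_t2 - y_t1) / dt


def condition_one(y_t1: float, y_t2: float, v_crit: float, dt: float = 0.25) -> int:
    """Return M1 = 1 when the speed reaches the critical speed, otherwise 0."""
    return 1 if descent_speed(y_t1, y_t2, dt) >= v_crit else 0
```

For example, a vertical displacement of 0.01 units within 0.25 s gives a speed of 0.04 units per second, which exceeds a critical speed of 0.009 and therefore sets the flag.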

Figure 5. The falling process.

Decision Condition Two (the Angle between the Centerline of the Human and the Ground)
During a fall, the most obvious feature of the human body is that it tilts, and the degree of tilt continues to increase. To reflect this continuous tilt during a fall, a human centerline $L$ is defined in this paper: let the midpoint of joints $s_{10}$ and $s_{13}$ be $s$; the line connecting the midpoint $s$ and joint $s_0$ is the centerline $L$ of the human body.
As shown in Figure 6, $\theta$ is the angle between the centerline of the human body and the ground. Through OpenPose, the data of joint points 0, 10, and 13 are $s_0(t) = (x_{t0}, y_{t0})$, $s_{10}(t) = (x_{t10}, y_{t10})$, and $s_{13}(t) = (x_{t13}, y_{t13})$, respectively. So $s = \frac{s_{10} + s_{13}}{2}$, with $s(t) = (x_t, y_t)$. At time $t$, the angle between the centerline of the human body and the ground is

$$\theta_t = \arctan \left| \frac{y_{t0} - y_t}{x_{t0} - x_t} \right|$$

When $\theta < \theta_0$ ($\theta_0 = 45^\circ$), $M_2 = 1$, and decision condition two for the occurrence of the fall event is considered satisfied.
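The centerline angle can be computed as in the following sketch. It takes joint 0 and the two ankle joints (10 and 13) as inputs and uses `atan2` on the absolute coordinate differences, so that a perfectly vertical centerline yields 90° without a division by zero; the function names are illustrative, and the 45° default mirrors the threshold stated in the text.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position of one joint


def centerline_angle_deg(joint0: Point, right_ankle: Point, left_ankle: Point) -> float:
    """Angle in degrees, within [0, 90], between the body centerline and the ground.

    The centerline runs from the midpoint of the two ankles to joint 0.
    """
    mid_x = (right_ankle[0] + left_ankle[0]) / 2.0
    mid_y = (right_ankle[1] + left_ankle[1]) / 2.0
    dx = abs(joint0[0] - mid_x)
    dy = abs(joint0[1] - mid_y)
    # atan2 handles dx == 0 (upright body) and returns 90 degrees in that case.
    return math.degrees(math.atan2(dy, dx))


def condition_two(angle_deg: float, theta_0: float = 45.0) -> int:
    """Return M2 = 1 when the centerline angle is below the critical angle."""
    return 1 if angle_deg < theta_0 else 0
```

An upright person gives an angle close to 90° and a person lying on the ground an angle close to 0°, so only the latter sets the flag.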

Decision Condition Three (the Width to Height Ratio of the Human Body External Rectangular)
When a fall occurs, the most intuitive feature is a change in the contour of the body. If we simply compared the width and height of the moving target, both would change with the distance of the target from the camera, whereas their ratio would not. We therefore detect falling behavior through the change in the width-to-height ratio of the target contour rectangle.
As shown in Figure 7, the ratio of the width to the height of the outer rectangle of the human body is

$$P = \frac{\text{Width}}{\text{Height}}$$

When the human body falls, the outer rectangle of the target also changes; the most significant manifestation is the change in the width-to-height ratio. Let $T$ be the threshold. According to the actual situation, when a person walks normally, the width-to-height ratio $P$ is less than 1, while the width-to-height ratio for a fall is greater than 1. When $P \geq T$, $M_3 = 1$, and decision condition three for the occurrence of the fall event is considered satisfied.
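The width-to-height ratio can be sketched as follows. The sketch assumes that the external rectangle is the axis-aligned bounding box of the detected key points, which the paper does not state explicitly, and it uses $T = 1$ as the default threshold because the text places walking below 1 and falling above 1; all names are illustrative.

```python
import math
from typing import List, Optional, Tuple

Joint = Optional[Tuple[float, float]]  # None marks a joint that was not detected


def width_height_ratio(joints: List[Joint]) -> float:
    """Width-to-height ratio P of the bounding box around the detected joints."""
    points = [p for p in joints if p is not None]
    if len(points) < 2:
        raise ValueError("at least two detected joints are needed")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    if height == 0:
        # Degenerate box: all joints on one horizontal line.
        return math.inf
    return width / height


def condition_three(joints: List[Joint], threshold: float = 1.0) -> int:
    """Return M3 = 1 when P reaches the threshold T, otherwise 0."""
    return 1 if width_height_ratio(joints) >= threshold else 0
```

Because the ratio is dimensionless, it can be computed directly in pixel coordinates without any camera calibration.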

Determine Whether a Person Can Stand after a Fall
If a person can stand up unaided within a certain period after falling, no alarm is required. Most current fall detection work focuses on the analysis of the fall process and rarely considers that people may stand up on their own shortly after falling. As shown in Figure 8, standing up after a fall can be regarded as the inverse process of a fall. The only difference is that the whole process is slower than a fall. According to the analysis in this paper, if the width-to-height ratio of the external rectangle of the human body is less than 1 and the inclination angle of the centerline is greater than 45° within a period of time after a fall, it can be concluded that the person has stood up. The point of judging whether people can stand up on their own after a fall is to reduce unnecessary alarms, because falls sometimes do not cause serious injury.
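The alarm logic can be sketched as a small state machine over per-sample features. The sketch makes several assumptions that the paper does not state: a fall is declared when all three decision conditions hold on the same sample, the recovery window `recovery_window_s` is a hypothetical parameter, and a stream that ends before the person has stood up is treated as requiring an alarm.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FrameFeatures:
    t: float          # sample time in seconds
    m1: int           # decision condition one flag (hip descent speed)
    angle_deg: float  # centerline angle with the ground, in degrees
    ratio: float      # width-to-height ratio of the external rectangle


def should_alarm(
    frames: Iterable[FrameFeatures],
    recovery_window_s: float = 10.0,  # hypothetical value, not from the paper
    angle_thr: float = 45.0,
    ratio_thr: float = 1.0,
) -> bool:
    """Return True when a fall is detected and the person does not stand up in time."""
    fall_time = None
    for f in frames:
        if fall_time is None:
            # Assumption: all three conditions must hold on the same sample.
            if f.m1 == 1 and f.angle_deg < angle_thr and f.ratio >= ratio_thr:
                fall_time = f.t
        else:
            if f.t - fall_time > recovery_window_s:
                return True  # recovery window elapsed without standing up
            if f.angle_deg > angle_thr and f.ratio < ratio_thr:
                return False  # the person has stood up again
    # Stream ended: alarm only if a fall was seen and no recovery followed.
    return fall_time is not None
```

Keeping the thresholds as parameters makes it easy to test other values, since the thresholds chosen in the experiment are not necessarily optimal.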

Experiment Data and Test
In order to verify the effectiveness of the proposed method, fall events are tested. Because the experiment carries certain risks, it is conducted in the laboratory. We randomly selected 10 experimenters, who performed falls or non-falls during the test. As shown in Table 3, the actions collected in the experiment are divided into three categories, namely falling actions (fall, stand up after a fall), fall-like actions (squat, stoop), and daily actions (walk, sit down). A total of 100 actions are collected, including 60 falling actions and 40 non-falling actions, each lasting about 5-11 s. From each video, 100-350 valid video frames can be extracted as samples. To ensure the generality of the system, 10 different types of experimental subjects are randomly selected. The height and weight data of the 10 experimenters are shown in Table 4. In the experiment, each person performed 10 actions, including six falls and four non-falls, giving the total of 100 action samples.
In the fall test, there are four possible cases: in the first case, a fall occurs and the algorithm correctly detects it; in the second case, no fall occurs but the algorithm misidentifies the action as a fall; in the third case, a fall occurs but the algorithm judges that no fall occurred; in the fourth case, no fall occurs and the algorithm does not detect a fall. These four cases are defined as TP, FP, FN, and TN, respectively. To evaluate the response to these four situations, three criteria are used.
Sensitivity is the capacity to detect a fall:

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{6}$$

Specificity is the capacity to detect only a fall:

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{7}$$

Accuracy is the capacity to correctly detect fall and no fall:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Analysis of the Experimental Results
Before the final experimental judgment, we analyze the feasibility of the three conditions and the final conditions of standing up after falling.
When detecting the descending speed of the hip joint center point, the speed of change of each action is shown in Figure 9. We can see that the speed during a fall and during a squat can exceed the critical value (0.09 m/s). In other words, only falling and squatting satisfy decision condition one (the speed of descent at the center of the hip joint).
As shown in Figure 10, when walking and sitting down, the inclination angle of the human body fluctuates little; when squatting down, the inclination angle fluctuates, but the whole body remains relatively stable; only when stooping and falling does the inclination angle fluctuate greatly and drop below the critical angle of 45°. Walking, sitting down, and squatting can therefore be excluded by decision condition two (the angle between the centerline of the body and the ground).
As shown in Figure 11, among all the actions, only in the falling action is the width-to-height ratio of the external rectangle of the human body greater than 1. By decision condition three (the width-to-height ratio of the external rectangle of the human body), only the falling action meets the requirement.
Figure 12 shows that the common feature of falling actions is that the inclination angle of the human body falls below 45° and the width-to-height ratio of the external rectangle of the human body becomes greater than 1 at a certain time. For the action of standing up after a fall, the fall itself can be identified by the decision conditions for a fall. In the subsequent rising process, the inclination angle of the human body gradually increases to above 45°, and the width-to-height ratio of the external rectangle of the human body drops below 1.
As shown in Figure 10, when walking and sitting down, the inclination angle of the human body fluctuates little; when squatting, the inclination angle fluctuates, but the body as a whole remains relatively stable; only when stooping and falling does the inclination angle fluctuate greatly and drop below the critical angle of 45°. We can therefore exclude walking, sitting down, and squatting by decision condition two (the angle between the centerline of the body and the ground). As shown in Figure 11, among all the actions, only the falling action has a width-to-height ratio of the external rectangle of the human body greater than 1. By decision condition three (the width-to-height ratio of the external rectangle of the human body), only the falling action meets the requirement. As shown in Figure 12, the common feature of the falling action is that the inclination angle of the human body falls below 45° and the aspect ratio of the external rectangle becomes greater than 1 at some point. For the action of standing up after a fall, the fall itself can be identified by the fall conditions. In the subsequent rising process, the inclination angle of the human body gradually increases to above 45°, and the width-to-height ratio of the external rectangle drops below 1.
Figure 11. The change of the aspect ratio of the outer rectangle for each action.
The results for the 100 experimental actions are shown in Table 5. In the table, ✓ indicates that the action is correctly identified, and × indicates that the action is incorrectly identified. The stooping actions in experiments No. 1 and No. 3 among the non-falling actions are wrongly identified as falling, and only one of the falling actions is wrongly identified as non-falling. According to the calculation formula proposed in Section 4.2, the sensitivity, specificity, and accuracy are 98.3%, 95%, and 97%, respectively (Table 6). The wrong discriminations have the following causes: (a) Missing joint points in the skeleton estimation lead to incomplete data, which affects the final recognition. (b) The three thresholds selected in the experiment are not necessarily optimal. (c) Because of the self-protection instinct of the experimenter, the recorded falls still differ from real falls.
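The reported figures can be checked with a short calculation. The sketch below assumes that Section 4.2 uses the standard definitions of sensitivity, specificity, and accuracy, and that the 100 actions split into 60 falls and 40 non-falls; this split is not stated in the text above and is inferred here only because it is consistent with the three misclassifications and the reported percentages. The function name `classification_metrics` is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)            # falls correctly detected
    specificity = tn / (tn + fp)            # non-falls correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed split (inferred, not stated in the text): 60 falls, 40 non-falls.
# One fall is missed (fn = 1); two stooping actions are flagged as falls (fp = 2).
sens, spec, acc = classification_metrics(tp=59, fn=1, tn=38, fp=2)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these assumed counts the calculation reproduces 98.3%, 95%, and 97%.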

Conclusions
At present, because there are no suitable public datasets of falls, we cannot directly compare our results with previous results in detail. Table 7 lists the algorithms, classifications, features, and final accuracy of other fall detection technologies. Droghini et al. [30] detected falls by capturing sound waves transmitted through the floor. The accuracy of the experimental results is high, but the experiment uses a puppet to imitate falls, which is still very different from a real human fall. In addition, the detection method is highly susceptible to interference from external noise, and the environments in which it can be used are limited. Shahzad et al. [25] make good use of the sensors in smartphones and improve the power consumption of the algorithm, but the phone can also cause false positives, and the user must carry it at all times. Kepski et al. [44] proposed a fall recognition system based on a microwave Doppler sensor, which can distinguish falls from fall-like movements accurately and is not intrusive to the human body. The only disadvantage of this method is that the detection range is too small. In Quadros et al. [40], the threshold method and machine learning are used to fuse multiple signals to identify falls, which improves the reliability of the recognition results. However, the user needs to wear the device for a long time, and the battery life of the device should also be considered. OpenPose [20,21] can be applied to images captured by a camera, which is convenient and fast, and it has broad prospects in video-based methods. Compared with other methods, vision-based methods are more convenient, and OpenPose provides the skeleton information of the human body conveniently and accurately. To some degree, our method is not only highly accurate but also simple and low cost.
According to statistics, the elderly population will continue to increase in the future, and falling is one of the major public health problems in an aging society. It is necessary to find the characteristics of the fall movement for fall detection. In this paper, we introduce a novel method for this problem. We use the OpenPose algorithm to process video captured by surveillance cameras and obtain the data of human joint points. The falling motion is then recognized by setting three conditions: the speed of descent at the center of the hip joint, the angle between the centerline of the human body and the ground, and the width-to-height ratio of the external rectangle of the human body. Building on the recognition of falls, we also consider people standing up after a fall, and the process of standing up is regarded as the inverse process of falling. The method is verified by experiments and achieves good results: the sensitivity is 98.3%, the specificity is 95%, and the accuracy is 97%.
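The three conditions can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the keypoints are assumed to be (x, y) pairs in image coordinates, the way the three conditions are combined (all must hold) is an assumption, and the speed threshold is left as a caller-supplied parameter because its value is not given in this section. Only the 45° angle threshold and the ratio threshold of 1 come from the text.

```python
import math

ANGLE_THRESHOLD_DEG = 45.0  # critical inclination angle from the text
RATIO_THRESHOLD = 1.0       # critical width-to-height ratio from the text


def inclination_angle_deg(top, bottom):
    """Angle (0 to 90 degrees) between the body centerline and the ground.

    `top` and `bottom` are assumed (x, y) keypoints, e.g. neck and hip center.
    """
    dx = abs(top[0] - bottom[0])
    dy = abs(top[1] - bottom[1])
    return math.degrees(math.atan2(dy, dx))


def width_height_ratio(points):
    """Width-to-height ratio of the rectangle enclosing all keypoints."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return width / height if height > 0 else math.inf


def is_fall(descent_speed, angle_deg, ratio, speed_threshold):
    """Assumed rule: all three conditions must hold.

    `speed_threshold` is hypothetical and must be supplied by the caller.
    """
    return (descent_speed >= speed_threshold
            and angle_deg < ANGLE_THRESHOLD_DEG
            and ratio > RATIO_THRESHOLD)


def has_stood_up(angle_deg, ratio):
    """Inverse of the posture conditions: upright angle and narrow rectangle."""
    return angle_deg > ANGLE_THRESHOLD_DEG and ratio < RATIO_THRESHOLD


# Illustrative keypoints for a person lying almost flat (made-up values).
lying = [(0.0, 0.0), (1.0, 0.1), (1.8, 0.2)]
angle = inclination_angle_deg(lying[0], lying[1])
ratio = width_height_ratio(lying)
print(is_fall(descent_speed=0.5, angle_deg=angle, ratio=ratio, speed_threshold=0.1))
```

Keeping the posture test for standing up separate from the fall test mirrors the idea that standing up is treated as the inverse process of falling.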
With the popularity of cameras and the improving quality of captured images, vision-based fall detection has broader prospects. In the future, we can carry out the following work: (a) The environment of daily life is complex, and there may be situations in which people's actions cannot be completely captured by surveillance. We can study the estimation and prediction of people's behavior and actions in the presence of partial occlusion. (b) In this paper, the action is identified from the side, and other directions are not considered. Future research can start with recognition from multiple directions and then comprehensively judge whether a fall has occurred. (c) Building a fall alarm system. In the event of a fall, the scene, time, location, and other detailed information should be sent to the rescuer promptly to speed up the emergency response.
Author Contributions: W.C. contributed to the conception of the study. Z.J. performed the experiment; W.C., Z.J. performed the data analyses and wrote the manuscript; H.G., X.N. helped perform the analysis with constructive discussions. All authors read and approved the manuscript.
Funding: This research was funded by the Open Fund of Teaching Laboratory of China University of Geosciences (Wuhan) grant number SKJ2019095.