Machine Learning Modeling for Failure Detection of Elevator Doors by Three-Dimensional Video Monitoring

High-rise buildings in cities rely on elevators to transport people in daily life. Detecting elevator failures early by developing a diagnosis method is therefore significant for ensuring people's safety. This paper presents a machine learning procedure for developing a diagnosis method for failure detection of elevator doors. The first step of the procedure is motion analysis of the elevator doors by three-dimensional video monitoring. A signal of the dynamic distance between the elevator doors versus time is extracted by image processing. The second step is modeling the signal curve with trapezoidal curves fitted to the noisy signal data. The third step is training classifiers to identify the motion of the elevator doors. The Monte Carlo method simulates and creates normal and abnormal samples for training the classifiers. Failure detection of the elevator doors is implemented with three classifiers, K-Nearest Neighbor classification, Support Vector Machine, and Binary Classification Tree, for identifying the dynamic curves of the elevator doors. The results show that the Binary Classification Tree achieves a classification accuracy (99.28%) better than K-Nearest Neighbor and Support Vector Machine. The number of features influences the performance of the Binary Classification Tree. The results show that the Binary Classification Tree with four features has the best accuracy, with a running time of 0.51 s. With three features, the accuracy still reaches 90.28%, and the running time is 0.24 s. The results show that there is a trade-off between accuracy and time.


I. INTRODUCTION
Elevators are transport devices that people in high buildings use daily. In recent years, safety problems caused by elevator accidents have aroused public attention. Improving the safety of elevators to prevent and reduce accidents has become an essential issue for the elevator industry and for governments. Detecting elevator failures early by developing a diagnosis method is significant for achieving this goal. Elevator managers want to monitor the condition of an elevator and perform maintenance in advance to avoid elevator accidents. Previous research [1] focused on recognizing the movement of elevator machinery by using sensor devices such as microphones, gyroscopes, and accelerometers. To the best of our knowledge, there is no research on failure detection of elevator doors using RGB camera monitoring methods. Our previous works [2], [3] developed an RGB color image processing method to extract the gap between an elevator's two doors and measure how that distance changes over time in one cycle. Those works used the RGB color frame images to develop a diagnosis method for failure detection of the elevator doors. However, the color image processing method has a drawback: the accuracy of the distance measurement is influenced by the light intensity. To improve the results, a three-dimensional camera captures frames of depth images, which are used to measure the distance between the elevator doors. The diagnosis method uses machine learning technology to train a model for failure detection of the elevator doors.
The distance curves of the two doors of a working elevator are used to generate training data by the Monte Carlo method. In this paper, the Monte Carlo method simulates the trapezoidal motion curve of the elevator doors. The four vertices of the trapezoid are the random variables of the machine learning model, which classifies normal and abnormal cases using training data generated by the Monte Carlo method. The Monte Carlo method [6]-[8], also known as random sampling or the statistical test method, is a numerical calculation method based on probability and statistics. The steps of the Monte Carlo method are as follows:
1. Establish a random model related to the problem. Based on the stochastic model, determine a random variable whose numerical characteristics (such as probability, expectation, second moment, etc.) are the solution to the problem.
2. According to the established probability model, conduct a large number of random experiments to obtain many values of the random variable.
3. Use statistical methods to estimate properties of the random variable and obtain an approximate solution to the problem.
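The three steps above can be illustrated with a classic textbook example (not part of this paper's pipeline): estimating pi by random sampling in the unit square.

```python
import random

def estimate_pi(n_samples, seed=0):
    """Monte Carlo estimate of pi: sample points uniformly in the unit
    square (steps 1-2) and use the fraction falling inside the quarter
    circle as the statistical estimate (step 3)."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()   # one random experiment
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

print(estimate_pi(100_000))
```

As the number of experiments grows, the estimate converges to pi, which is the pattern the paper exploits when generating large numbers of door-motion samples.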
The Monte Carlo method was first applied to Buffon's needle problem in the 19th century [9]. Since the 1940s, the development of computers made this method successful in the development and testing of nuclear weapons [10]-[12]. In recent decades, the Monte Carlo method has also been widely used in finance and other fields such as molecular chemistry [13], Bayesian statistical inference [14], radar/sonar, the Global Positioning System (G.P.S.), and signal processing problems [15], [16].
Three classifiers are commonly used in machine learning: Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Binary Classification Tree (BCT). The KNN classification algorithm is one of the simplest methods in machine learning classification. The name refers to the k nearest neighbors: each sample can be represented by its k closest neighbors. In 1968, Cover and Hart [17] proposed the initial KNN proximity algorithm, a type of instance-based learning.
The SVM is a generalized linear classifier that performs binary classification by supervised learning. Its decision boundary is the maximum-margin hyperplane computed from the learning samples. The SVM was proposed in 1964 and developed rapidly after the 1990s, with a series of improved and extended algorithms used in pattern recognition problems such as facial recognition and text classification [18]-[21]. The binary classification tree algorithm is a kind of supervised learning [22]. It infers a decision tree representation from an unordered, irregular sample data set and uses the tree to classify the target data set. A binary classification tree can process high-dimensional data with good accuracy, and because its construction requires no domain knowledge or parameter settings, it is suitable for exploratory knowledge discovery. In the late 1980s and early 1990s, the machine learning researcher J. Ross Quinlan developed a decision tree algorithm called ID3 [23], [24].
In this paper, first, the opening distance between the two doors of an elevator is measured based on our previous work. Second, the Monte Carlo method simulates normal and abnormal cases for training classifiers. Third, the three classifiers SVM, KNN, and BCT are tested on identifying elevator normality, and their performances are compared.
The structure of this article is as follows: the introduction gives an overview of related research. The second section explains the methods used in this research. The third section presents the experimental results and discussions. Finally, the conclusion offers suggestions for the safety analysis of elevators.

II. THEORY AND METHODS
This section explains how to detect the motion of elevator doors. A genetic algorithm with an objective function finds the optimal model curve of the elevator doors. The third part compares KNN, SVM, and BCT for evaluating the motion state of elevator doors.

A. EXTRACTION OF DYNAMIC SIGNAL CURVES
This subsection explains how to measure the distance between the elevator's two doors as they open and close, because the changing distance identifies the behavior of the doors. It is vital to measure this distance automatically by video monitoring. Our previous works [22], [23] give details on how to extract the gap between an elevator's two doors and obtain the curve of distance versus time over one cycle. Here, we only show the flowchart of the extraction of the dynamic signal curve representing the changing distance. The variable d is the distance between the elevator's two doors, as shown in Figure 1; it equals the length of the yellow double arrow.
Cameras record videos of an elevator with two moving doors. This section shows how to automatically analyze the monitoring video to identify the door motion. There are three stages in the movement of an elevator's two doors; the first stage is when the doors are opening. The first step of the flowchart for finding the distance between the elevator doors is to capture each frame as an image. A semantic segmentation network is trained to find the regions of the image that belong to the elevator doors. The color images are transformed into binary images by a thresholding method. Erosion, a mathematical morphology operation on the binary images, removes separated pieces that do not belong to any object. After converting the binary image into a grayscale image, the Canny edge detection operator, with optimal parameters found by the genetic algorithm, obtains the edges of the door boundaries. The last step is to measure the distance between the boundaries of the doors. The flowchart of this process is shown in Figure 3.
The following paragraphs describe each step in more detail.
Step 1: The captured frame of the elevator is the input image, as shown in Figure 4.
Step 2: The semantic segmentation network [22] needs many labeled images to build a training set for segmenting the door regions, as shown in Figure 5(a). The blue areas are elevator doors, and the red areas are background. Figure 5(b) shows the blue regions predicted by the trained model.
Step 3: Figure 5(c) shows a binary image obtained from the RGB image by the thresholding method.
Step 4: Figure 5(c) contains some stray white pieces. The erosion operation removes them, as shown in Figure 5(d).
Step 5: After converting the binary image into a grayscale image, the Canny edge detection operator can detect the edges in Figure 5(d). However, the Canny edge detector has two parameters that must be adjusted to obtain the desired boundaries. The genetic algorithm finds the optimal parameters through the defined objective function; our previous work [23] gives more details. Figure 5(e) shows all edges, including the boundaries of the doors.
Step 6: In Figure 5(f), the distance d is measured between the boundaries of the doors.
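The thresholding and gap-measurement idea of steps 3 and 6 can be sketched in NumPy alone (a minimal illustration; the actual pipeline uses a segmentation network, erosion, and GA-tuned Canny edge detection, and all values below are toy assumptions):

```python
import numpy as np

def door_gap_pixels(gray, thresh=128):
    """Binarize a grayscale frame (step 3), then measure the widest dark
    gap between the two bright door regions along the middle row (step 6).
    NumPy-only sketch; a real implementation would use OpenCV operators."""
    binary = (gray >= thresh).astype(np.uint8)   # thresholding
    row = binary[gray.shape[0] // 2]             # scan the middle row
    door_cols = np.flatnonzero(row)              # columns lying on a door
    if door_cols.size < 2:
        return 0
    runs = np.diff(door_cols) - 1                # dark runs between door pixels
    return int(runs.max())

# toy frame: two "doors" (bright blocks) separated by a 10-pixel gap
frame = np.zeros((20, 50), dtype=np.uint8)
frame[:, 5:20] = 255    # left door
frame[:, 30:45] = 255   # right door
print(door_gap_pixels(frame))   # -> 10
```

Measuring this gap on every frame produces the distance sequence d used in the next subsection.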
In Figure 6, the y-axis is the variable d, the distance between the two elevator doors. The ratio of physical distance to image pixels is 7:25 (cm/pixel), i.e., 25 pixels correspond to 7 cm. The camera records at 24 frames per second. A curve is the sequence of d values obtained from each frame of a video while the elevator doors run one cycle.

B. MODELING THE MOTION OF ELEVATOR DOORS
There are 300 data points of the variable d in Figure 6. The frame number is the independent variable, and d is the dependent variable. The curve shape is very similar to a trapezoid, so a trapezoid is a reasonable model curve.

1) TRAPEZOID MODEL CURVE
There are four points with eight coordinates, which are the parameters of the trapezoid model in Figure 7. There are two conditions: B_y = C_y, and the length of AD is constant.
In Figure 7, three line segments AB, BC, and CD construct a trapezoid model curve with eight coordinates. A trapezoid model curve is a function f(x, β) defined by the piecewise functions in equation (1):

f(x, β) = ((B_y − A_y)/(B_x − A_x))·(x − A_x) + A_y, for A_x ≤ x < B_x
f(x, β) = B_y, for B_x ≤ x < C_x
f(x, β) = ((D_y − C_y)/(D_x − C_x))·(x − C_x) + C_y, for C_x ≤ x ≤ D_x   (1)

where β = [A_x, A_y, B_x, B_y, C_x, C_y, D_x, D_y] are the x, y coordinates of the four points A, B, C, and D. The genetic algorithm (G.A.) obtains the optimal trapezoid curve fitting the sequence of d data points by adjusting β in an objective function. The objective function Obj(β) in equation (2) is the sum of squared differences between the data d_i and the model:

Obj(β) = Σ_i (d_i − f(x_i, β))²   (2)

The optimization process finds the optimal parameter β* in equation (3):

β* = argmin_β Obj(β)   (3)

This section describes how to prepare the training data to train classifiers such as KNN, SVM, and BCT. The objective of these classifiers is to distinguish normal from abnormal cases.
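The trapezoid model and its least-squares objective can be sketched in a few lines of NumPy, assuming the simplified parameterization used in Section III (A at the origin, D on the x-axis, so β = [B_x, B_y, C_x, D_x]); the parameter values below are the Table 1 means, used only as a worked example:

```python
import numpy as np

def trapezoid(x, beta):
    """Simplified trapezoid model: piecewise-linear opening ramp (A->B),
    fully-open plateau (B->C), and closing ramp (C->D)."""
    bx, by, cx, dx = beta
    x = np.asarray(x, dtype=float)
    return np.piecewise(
        x,
        [x < bx, (x >= bx) & (x < cx), x >= cx],
        [lambda t: by * t / bx,                     # opening ramp
         lambda t: by,                              # plateau (B_y = C_y)
         lambda t: by * (dx - t) / (dx - cx)])      # closing ramp

def objective(beta, xs, ds):
    """Sum of squared residuals between the data d_i and the model."""
    return float(np.sum((ds - trapezoid(xs, beta)) ** 2))

xs = np.arange(300)
ds = trapezoid(xs, [56.5, 213.25, 242.25, 292.25])  # synthetic "data"
print(objective([56.5, 213.25, 242.25, 292.25], xs, ds))  # perfect fit -> 0.0
```

A genetic algorithm (or any optimizer) then searches β to minimize `objective` against the measured d sequence.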
There are abnormal cases such as an unusual speed of the running doors, or doors that do not fully open. The Monte Carlo method supports obtaining optimal classifiers by generating the training data. The balance of the training data should be considered. Random number generators with given distribution density functions are vital for Monte Carlo simulation.
The histogram in Figure 8(a) shows data generated by the random number generator ''rand'', which has a uniform distribution between 0 and 1. The histogram in Figure 8(b) shows data generated by the random number generator ''randn'', which has a standard normal distribution with mean 0 and variance 1.
The choice of the random generators is based on the balance of the data set.
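The two generators' distributions can be checked numerically; the NumPy calls below correspond to MATLAB's rand and randn (an illustrative check, not part of the paper's pipeline):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# "rand": uniform on [0, 1) -- a flat histogram as in Figure 8(a)
u = rng.random(10_000)
# "randn": standard normal, mean 0 and variance 1 -- a bell curve as in Figure 8(b)
g = rng.standard_normal(10_000)

print(round(u.mean(), 2), round(g.mean(), 2), round(g.std(), 2))
```

The uniform sample's mean is close to 0.5, while the normal sample's mean is close to 0 with standard deviation close to 1, matching the histograms described above.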

1) K-NEAREST NEIGHBOR CLASSIFICATION
This subsection describes the K-Nearest Neighbor binary classification method. Given a training data set D with n points and the corresponding label set L, the class ŷ of a query point x is estimated by the optimization problem in equation (5):

ŷ = argmax_{y ∈ {0, 1}} p(y | x, N_k(x))   (5)

where p(y = 0 | x, N_k(x)) denotes the ratio of the number of points with label y = 0 to the total number of points in the neighborhood N_k(x) centered at x, and p(y = 1 | x, N_k(x)) denotes the corresponding ratio for label y = 1.
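Equation (5) amounts to a majority vote among the k nearest training points, which can be written directly (an illustrative sketch with toy data, not the paper's experiment):

```python
import numpy as np

def knn_predict(x, X_train, y_train, k=3):
    """Majority vote over the k nearest neighbors: estimate the ratio of
    label-1 points in N_k(x) and pick the label with the larger ratio."""
    dists = np.linalg.norm(X_train - x, axis=1)
    neighbors = y_train[np.argsort(dists)[:k]]   # labels inside N_k(x)
    p1 = neighbors.mean()                        # ratio of label-1 points
    return int(p1 > 0.5)

# toy 2-D data: label 0 near the origin, label 1 near (5, 5)
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(np.array([0.5, 0.5]), X, y))  # -> 0
print(knn_predict(np.array([5.5, 5.5]), X, y))  # -> 1
```
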

2) BINARY CLASSIFICATION TREE
A procedure for growing a binary classification tree is described as follows. The space R^d contains a d-dimensional cube holding the training data points x_i = (x_i1, x_i2, . . . , x_id), i = 1, . . . , n. A plane x_ij = s, with two parameters j and s, splits a region R_(k−1) into two subregions R_k and R'_k (k ≥ 1), defined as the sets of points with x_j ≤ s and x_j > s, respectively. The function E(R) computes the fraction of points x_i ∈ R misclassified by the majority label in region R, and the optimal j and s minimize E(R_k(j, s)) + E(R'_k(j, s)). The function E is as follows:

E(R) = (1/|R|) Σ_{x_i ∈ R} 1(y_i ≠ majority label of R)
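The split-selection criterion above can be sketched as an exhaustive search over features j and thresholds s (an illustrative stump search on toy data, not a full tree-growing implementation):

```python
import numpy as np

def misclass_fraction(labels):
    """E(R): fraction of points misclassified by the region's majority label."""
    if labels.size == 0:
        return 0.0
    majority = np.bincount(labels).argmax()
    return float(np.mean(labels != majority))

def best_split(X, y):
    """Try every feature j and every midpoint threshold s, and keep the
    split minimizing E(R_k(j, s)) + E(R'_k(j, s))."""
    best_j, best_s, best_err = 0, 0.0, np.inf
    for j in range(X.shape[1]):
        vals = np.unique(X[:, j])
        for s in (vals[:-1] + vals[1:]) / 2:          # candidate thresholds
            left, right = y[X[:, j] <= s], y[X[:, j] > s]
            err = misclass_fraction(left) + misclass_fraction(right)
            if err < best_err:
                best_j, best_s, best_err = j, float(s), err
    return best_j, best_s, best_err

X = np.array([[1.0], [2.0], [8.0], [9.0]])
y = np.array([0, 0, 1, 1])
print(best_split(X, y))   # -> (0, 5.0, 0.0): a perfect split at s = 5
```

Growing a full tree repeats this search recursively inside each resulting subregion.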

3) SUPPORT VECTOR MACHINE
Given a training data set D with n points and the corresponding label set L, each x_i ∈ D is a vector in the d-dimensional space R^d, and each label y_i ∈ L = {1, −1}. Finding the ''maximum-margin hyperplane'' for the SVM classifier is an optimization problem. The hyperplane is the set of points x ∈ R^d satisfying

w · x − b = 0

where w is the normal vector to the hyperplane and b/‖w‖ determines the offset of the hyperplane from the origin along w. Two parallel hyperplanes separate the two classes of data, and the region bounded by them is called the ''margin''. The maximum-margin hyperplane lies midway between these two hyperplanes:

w · x_i − b = 1, for y_i = 1
w · x_i − b = −1, for y_i = −1   (8)

where the x_i on these hyperplanes are the support vectors. The distance between the two hyperplanes is 2/‖w‖, so maximizing the distance is equivalent to minimizing ‖w‖. The optimization problem is formulated as follows:

minimize ‖w‖, subject to y_i · (w · x_i − b) ≥ 1, for i = 1, . . . , n.   (9)
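A minimal way to solve a problem of this kind is subgradient descent on the soft-margin relaxation of (9) (an illustrative sketch on toy separable data; hyperparameters lr and lam are arbitrary choices, not from the paper):

```python
import numpy as np

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=1000):
    """Subgradient descent on lam*||w||^2 + mean hinge loss
    max(0, 1 - y_i*(w.x_i - b)), a soft-margin stand-in for (9).
    Labels y must be in {1, -1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        viol = y * (X @ w - b) < 1                      # margin violators
        grad_w = 2 * lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n
        grad_b = y[viol].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# linearly separable toy data
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
print((np.sign(X @ w - b) == y).all())   # -> True
```

Production SVMs solve (9) with dedicated quadratic-programming solvers, but the sketch shows how the margin constraint drives the update: only points violating the margin contribute to the gradient.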

III. RESULTS AND DISCUSSIONS
The genetic algorithm finds the optimal parameter vectors that approximate the motion curves of the elevator doors. The effects of the two different random number generators, the rand and randn functions, on data balance are discussed. Finally, we compare the performances of the three classifiers (KNN, SVM, and BCT).

A. MODEL CURVES
The genetic algorithm solves a global optimization problem to find a motion model curve with the parameters shown in Figure 9.
If the origin is placed at point A and the x-axis coincides with the line segment AD, then A_x = 0, A_y = 0, and D_y = 0 in Figure 9.
Another condition is f(x, β) = 0 when x = D_x. Equation (1) can then be simplified as equation (10):

f(x, β) = (B_y/B_x)·x, for 0 ≤ x < B_x
f(x, β) = B_y, for B_x ≤ x < C_x
f(x, β) = B_y·(D_x − x)/(D_x − C_x), for C_x ≤ x ≤ D_x   (10)

so the parameter vector reduces to β = [B_x, B_y, C_x, D_x].
The objective function value decreases as the generation increases in Figure 10. Figure 11 shows the corresponding model curves during the genetic algorithm at iterations 40, 80, 120, and 161. The red curves are the fitted model curves, and the blue stars (*) denote the data points. Table 1 shows the values of the four parameters at the different iterations.
In Table 1, the parameter B_x has a mean of 56.5 and a standard deviation of 0.86. The parameter B_y has a mean of 213.25 and a standard deviation of 0.43. The parameter C_x has a mean of 242.25 and a standard deviation of 3.34. The parameter D_x has a mean of 292.25 and a standard deviation of 2.59.

B. TRAINING DATA
Figure 11(a) shows the admissible region used to label samples. In the ratio transformation, 12 frames on the x-axis correspond to 0.5 s, and 5 pixels on the y-axis correspond to 1.4 cm. The admissible region has a blue upper boundary and a blue lower boundary. The negative samples are the curves located inside the admissible region: if a curve lies in the admissible region, it is a negative sample, meaning the operation of the doors is normal. On the other side, the positive training samples are the red curves: if the vertical value of a red line segment parallel to the y-axis falls outside the admissible region, the doors are not fully opened or are corrupted, and the sample is abnormal. To generate the training data set, the data must be labeled automatically. Algorithm 1 gives pseudocode for labeling a sample from a random generator.
Data balance is very important for training classifiers. For example, a binary classifier can obtain a very high accuracy if 99% of the cases are true and 1% are false: a classifier that always predicts true reaches 99% accuracy, but such a result is not suitable for unknown data. This is the data imbalance problem. Two thousand samples are generated by the two kinds of random generators. The components of the parameter vector β = [B_x, B_y, C_x, D_x] are generated by the expression a + b · (2 · function(k, 1) − 1), where the function is either rand or randn. The values of the variables a, b, and β are listed in Table 2. The outputs of the rand and randn functions lie in the same interval [0, 1), but they have different probability density distributions. The two generators, rand and randn, with the variables a and b, generate the k samples in Table 2.

Algorithm 1 Labeling a Parameter Vector for Training Data
Input: a parameter vector β from a random generator
IF the curve f(x, β) is in the admissible region
    Label ← 0 % negative sample
Else
    Label ← 1 % positive sample
End IF

Figure 13 shows the generated samples. The red dots indicate label 1, and the blue dots indicate label 0. The four-dimensional parameters are projected onto three-dimensional space: the x, y, and z axes in Fig. 13(a) are B_x, B_y, C_x; in Fig. 13(b) they are B_x, B_y, D_x; in Fig. 13(c) they are B_x, C_x, D_x; and in Fig. 13(d) they are B_y, C_x, D_x. Of the data, 75% is used for training the classifiers KNN, SVM, and BCT, and 25% is used for testing. Table 3 shows the accuracy and Area Under Curve (AUC) of the three classifiers with the four features (B_x, B_y, C_x, D_x); the BCT classifier has the best accuracy (90.8%) and AUC (99.28%). The receiver operating characteristic (ROC) curves are shown in Figure 14, where the horizontal axis is the false positive rate and the vertical axis is the true positive rate. Table 3 and Figure 14 both show that the Binary Classification Tree classifier has the best accuracy.
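The sample-generation expression and the labeling rule of Algorithm 1 can be sketched together; the centers a below are the Table 1 means, while the spreads b and the simplified per-component admissibility test are illustrative assumptions (the paper tests the whole curve against the admissible region):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# centers near Table 1's mean parameters [Bx, By, Cx, Dx]; spreads are illustrative
a = np.array([56.5, 213.25, 242.25, 292.25])
b = np.array([10.0, 20.0, 10.0, 10.0])

def sample_beta(uniform=True):
    """One parameter vector via a + b*(2*u - 1): 'rand' draws u uniform on
    [0, 1); 'randn' would draw u from the standard normal instead."""
    u = rng.random(4) if uniform else rng.standard_normal(4)
    return a + b * (2 * u - 1)

def label(beta, tol=0.5):
    """Algorithm 1, simplified: label 0 (normal) when every component stays
    within tol*b of its nominal value -- a stand-in for the admissible-region
    test on the full curve; otherwise label 1 (abnormal)."""
    return 0 if np.all(np.abs(beta - a) <= tol * np.abs(b)) else 1

labels = [label(sample_beta()) for _ in range(2000)]
n0, n1 = labels.count(0), labels.count(1)
print(n0, n1)   # class balance of the generated set
```

Printing the two counts makes the imbalance visible: with this uniform generator the normal class is rare, which is exactly why the choice of generator and its parameters matters for data balance.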
We compare the confusion matrices shown in Figure 15, with each classifier using the features β = [B_x, B_y, C_x, D_x]. The true positives of KNN are 239 and the true negatives are 170; the accuracy of KNN is 81.8%, equal to (239+170)/500. The true positives of SVM are 161 and the true negatives are 113; the accuracy of SVM is 54.8%, equal to (161+113)/500. The true positives of BCT are 299 and the true negatives are 200; the accuracy of BCT is 99.8%, equal to (299+200)/500.
The accuracy values of KNN, SVM, and BCT are summarized in Table 3. BCT has the best accuracy, 99.8%, and SVM has the lowest, 54.8%. Figure 16 visualizes the samples in two-dimensional space; the red dots indicate label 1, and the blue dots indicate label 0. Figure 16(a) shows the parameters (B_x, B_y); Figure 16(b) shows (B_x, C_x); Figure 16(c) shows (B_x, D_x); Figure 16(d) shows (B_y, C_x); Figure 16(e) shows (B_y, D_x); and Figure 16(f) shows (C_x, D_x).
Table 4 compares different numbers of features, showing the accuracy and AUC of the BCT classifier with three features, two features, and one feature. The best accuracy is 90.40% when the three features are (B_x, B_y, D_x). The lowest accuracy is 56.0% when any single feature other than B_y is used. Table 5 shows the BCT classifier's performance with different numbers of features: the best accuracy is 99.28% with 4 features, taking 0.5 s; with 3 features, the accuracy is 90.30% but the time is only 0.2363 s.
Three comments of the results can be summarized as follows.
1. The KNN algorithm's evaluation metric is the voting result: the maximum ratio of points with label 0 or label 1 in a neighborhood. Figure 13 shows that this metric fails on the boundary between the classes, which explains the poor results: the accuracy and AUC of KNN are 81.8% and 80.73%.
2. The SVM algorithm's evaluation metric is the distance between the two hyperplanes. Figure 13 shows that there is no margin between the samples of the two classes, which explains the poor results: the accuracy and AUC of SVM are 54.8% and 55.3%.
3. The BCT algorithm's evaluation metric, the fraction of points misclassified by the majority in a region, can find the best location of the splitting plane to separate the samples into two classes. This explains the very good results: the accuracy and AUC of BCT are 99.8% and 99.28%.

D. COMPARATIVE RESULTS FOR 3D FRAME IMAGES
To make the failure detection procedure suitable for different elevators, a three-dimensional camera is used. Figure 17 shows a three-dimensional time-of-flight (ToF) camera that captures RGB and 3D images. The 3D images are commonly represented as images whose colors encode depth. Figure 18 shows the RGB and 3D images captured by the ToF camera when the elevator door is (a) opening, (b) staying open, and (c) closing, in temporal order 1, 2, 3, 4.

E. DESIGN DETAILS
The procedure for failure detection is based on the measurement of the distance between the two doors. The flowchart of the processing of depth images obtained by the 3D camera is shown in Figure 19. Depth images easily distinguish the doors from other objects because they lie at different distances from the camera.
In Figure 20(a), the depth image shows the doors as green regions with sharp vertical edges and the background as black areas. Applying the thresholding method to the grayscale image in 20(b) yields the binary image in Figure 20(c). The image in 20(d) is the result of erosion and dilation operations. The area enclosed by the red rectangle in 20(e) is used to measure the distance d in 20(f).
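The reason depth images segment so cleanly can be shown with a small sketch: door pixels sit at a known distance from the camera, so a depth band selects them directly (all distances below are toy assumptions, not measured values):

```python
import numpy as np

def door_mask_from_depth(depth, door_dist, tol=0.3):
    """Keep pixels whose depth is within tol metres of the known door
    distance; farther background pixels are discarded. This replaces the
    light-sensitive color thresholding of the RGB pipeline."""
    return np.abs(depth - door_dist) <= tol

# toy depth map: doors ~2.0 m away, background ~5.0 m
depth = np.full((10, 30), 5.0)
depth[:, 3:12] = 2.0    # left door
depth[:, 18:27] = 2.0   # right door
mask = door_mask_from_depth(depth, door_dist=2.0)

# gap: door-free columns between the two doors along one row
row = mask[5]
print(np.flatnonzero(~row[3:27]).size)   # -> 6 gap columns (12..17)
```

Because the mask depends only on geometry, the measured gap is unaffected by light intensity, which is the advantage reported below.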
The edges of the doors in Figure 20 are better than the boundaries in Figure 5, so the measurement of the distance d from depth images is better than from RGB images. Analyzing one cycle of the doors' motion with depth images yields the curve of the distance d between the two doors versus the frame number shown in Figure 21.
Analyzing one cycle of the doors' motion with RGB images yields the curve of the distance d between the two doors versus the frame number shown in Figure 22.
The red curve is obtained from the RGB images; it contains some vertical lines caused by unstable image processing due to light intensity. The blue curve has no vertical lines because the depth image is not affected by light.

IV. CONCLUSION AND FUTURE WORK

A. CONCLUSIONS
This paper presents a procedure for motion analysis of an elevator's doors by video monitoring. Three classifiers, KNN, SVM, and BCT, are chosen as candidates to test performance on identifying the doors' motion patterns. The results show that the BCT classifier has the best classification accuracy (99.8%), better than KNN and SVM. The accuracy of the BCT classifier depends on the number of features. The BCT classifier with four features has an accuracy of 99.8% and a running time of 0.51 seconds. With three features, the accuracy drops to 90.4%, but the running time is only 0.24 seconds. The running time decreases as the BCT classifier goes from four features to three. There is a trade-off between accuracy and time: saving time costs some accuracy, and higher accuracy incurs a time cost.

B. FUTURE WORK
Future work includes solving problems in applications, data collection, and machine learning algorithms.

1) APPLICATIONS
The failure diagnosis flowchart, including the modeling and classification methods, can be applied to any signals acquired by sensor devices such as microphones, gyroscopes, and accelerometers.

2) DATA COLLECTIONS
Future work can focus on collecting data on the movement of elevator machinery using the sensor devices listed above.

3) MACHINE LEARNING ALGORITHMS
Solving the data imbalance problem is important for training unbiased classifiers, but the probability of capturing the failure movements of elevators is very low. Based on our model parameters, transfer learning can obtain better classifier parameters by adding only a few actual abnormal cases.