Analysis and Prediction of Pedestrians’ Violation Behavior at the Intersection Based on a Markov Chain

Abstract: Pedestrian violations endanger the violators themselves and other road users. Most previous studies predict pedestrian violation behavior based only on pedestrians' demographic characteristics. In practice, other factors may also affect pedestrian violation behavior. Therefore, this study aims to predict pedestrian crossing violations based on pedestrian attributes, traffic conditions, road geometry, and environmental conditions. Data on pedestrian crossings, both compliant and in violation, were collected at 10 signalized intersections in the city of Jinhua, China. We propose an illegal pedestrian crossing behavior prediction approach that consists of a logistic regression model and a Markov chain model. The former calculates the likelihood that the first pedestrian decides to cross the intersection illegally within each signal cycle, while the latter computes the probability that subsequent pedestrians decide to follow the violation. The proposed approach was validated using data gathered from an additional signalized intersection in Jinhua. The results show that the proposed approach predicts pedestrian violation behavior robustly. The findings can provide theoretical references for pedestrian signal timing, crossing facility optimization, and warning system design.


Introduction
As the key node of the urban road network [1], road intersections undertake the important task of separating the participants of various traffic modes. At present, mixed traffic is the main feature of urban traffic in China [2]. In order to reduce the mutual interference between pedestrians and motor vehicles, crosswalks and signal lights are usually set at intersections [3]. Relevant studies show that the proportion of traffic accidents caused by running red lights at crosswalks is more than 40%, and the annual death toll is up to thousands [4]. The increasingly serious problem of pedestrian traffic not only poses a great threat to the safety of people's lives and property, but also seriously disrupts the social order.
Affected by the herd mentality, the violation behavior of pedestrians is basically completed in the form of groups. Generally speaking, the first pedestrian crossing the street will comprehensively judge the situation of the intersection and implement the corresponding behavior decision, while the subsequent pedestrian crossing the street will show a judgment and understanding in conformity with public opinion or the behavior of the majority under the influence of herd mentality. Studies have shown [5] that the herd psychology can prompt pedestrians who cross the street illegally to form a temporary group and influence the decision-making of each individual in the group to make the same choice.
A large body of research at home and abroad has addressed the prediction of pedestrian crossing violations. Mihiran et al. [6] combined a Markov model and a Bayesian network in a real-time monitoring method to estimate the probability of pedestrian group violations. Saeideh et al. [7] applied a region growing technique to construct a Hidden Markov Model (HMM) that simulates the time-varying trajectory of moving objects. Javan D. et al. [8] analyzed polydisperse systems with a Markov chain algorithm, but ignored the influence of interaction between groups. Considering the great uncertainty in pedestrian crossing decisions, this paper combines a logistic model and a Markov chain to build a pedestrian crossing violation model, with the aim of improving prediction accuracy.
In summary, this paper studies the phenomenon and probability of pedestrians crossing illegally at signalized intersections from the perspective of crowd psychology. A quantitative prediction model of the first pedestrian's decision-making behavior is obtained by logistic regression analysis, and the transition probability matrix between different states is constructed with a Markov chain [9]. The results can be used to analyze and predict the violation probability of pedestrian groups.

Research on the Influencing Factors of Pedestrian Violation
According to the Road Traffic Safety Law of the People's Republic of China, pedestrian crossing violations are mainly divided into two types, temporal and spatial, as shown in Figures 1 and 2. Through a field investigation and a review of previous research results, it was found that the decision-making behavior of pedestrians crossing the street is mainly affected by three aspects:
(1) Road environment. Before modeling and analyzing the traffic flow at intersections, including pedestrian flow and vehicle flow, it is necessary to understand the geometric dimensions of road facilities and the signal design. In practice, the relevant data for each survey site are obtained through field investigation, including the length and width of the crosswalk and the waiting area, the number of lanes, the signal cycle, and the duration of the red light.
(2) Crossing facilities. A reasonable layout of pedestrian crossing facilities can improve the comfort of pedestrians using these facilities and indirectly reduce the probability of pedestrian violations. The pedestrian crossing distance and the type of pedestrian crossing facility vary from one signalized intersection to another, which can contribute to the occurrence of violations. The crosswalk can be divided into three forms according to the setting of the pedestrian refuge: no pedestrian refuge, pedestrian refuge in the middle of the road, and pedestrian refuge at the right-turn ramp, as shown in Figure 3.
(3) Traffic condition. When pedestrians cross the street at the intersection, they often look for an acceptable gap to avoid conflict with motor vehicles. In addition, the pedestrian's decision-making behavior is affected by the waiting time at the red light and by whether a countdown device is installed [10].

Index Quantification and Sample Collection
To cover the factors affecting pedestrian violations as fully as possible, the authors selected 10 representative intersections in Jinhua City as the research objects and summarized the pedestrian violation indicators that are easy to observe and quantify in an actual survey, as shown in Table 1. The basic information of each intersection is shown in Table 2. To improve the efficiency of the survey and increase the sample size, the survey was conducted during the morning and evening rush hours, when there are more pedestrians: 8:00-9:00 and 17:00-18:00.
According to the relevant design standards and statistical needs, the indicators are classified as follows:
(1) Crosswalk width X1. According to the standard "Urban Road Traffic Signs and Markings Setting Specification (GB51038-2015)", the width of a pedestrian crossing is greater than or equal to 3 m and is widened in increments of 1 m. This paper obtains the actual crosswalk width through field investigation and measurement, and codes it as follows: (1) X1 = 3 m or 4 m; (2) X1 = 5 m or 6 m; (3) 6 m < X1.
Waiting environment. The environment in which pedestrians wait to cross the street has a direct impact on their behavior decisions. In this paper, the waiting environment is divided as follows: (1) the residential and recreational district; (2) the central business district; (3) the mixed-use district.
(5) Average headway X5. When pedestrians make decisions on crossing behavior, they make a subjective judgment on the headway of motor vehicles. If they judge the headway to be within the safe range, pedestrians with weak traffic awareness may cross illegally. In this survey, the headway of motor vehicles is classified as follows: (1) X5 ≤ 5 s; (2) 6 s < X5 ≤ 10 s; (3) 10 s < X5.
(6) Pedestrian waiting time X6. Whether the signal duration is set reasonably directly affects the decision-making behavior of pedestrians at intersections. An overly long wait at a red light promotes pedestrian violations. The classification according to the survey data is: (1) 0 s < X6 ≤ 60 s; (2) 60 s < X6 ≤ 120 s; (3) 120 s < X6.
(7) Red light countdown device X7. A countdown signal affects the pedestrian's crossing choice. According to whether the intersection is equipped with a red light countdown device, the variable is coded as follows: (1) with a countdown device; (2) without a countdown device.
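The coding rules for X5, X6, and X7 can be sketched as small helper functions. This is only an illustration: the function names are hypothetical, and the handling of headways between 5 s and 6 s (which the classification above leaves unassigned) is an assumption.

```python
def code_headway(x5_seconds):
    """Return the survey code (1, 2, or 3) for the average headway X5 in seconds."""
    if x5_seconds <= 5:
        return 1
    if x5_seconds <= 10:
        # Assumption: headways between 5 s and 6 s are placed in class 2,
        # because the text does not assign them to any class.
        return 2
    return 3


def code_waiting_time(x6_seconds):
    """Return the survey code (1, 2, or 3) for the pedestrian waiting time X6 in seconds."""
    if x6_seconds <= 0:
        raise ValueError("waiting time must be positive")
    if x6_seconds <= 60:
        return 1
    if x6_seconds <= 120:
        return 2
    return 3


def code_countdown(has_countdown):
    """Return 1 if a red light countdown device is present, otherwise 2."""
    return 1 if has_countdown else 2


# Hypothetical observation: 8 s headway, 75 s wait, no countdown device.
sample = {
    "X5": code_headway(8.0),
    "X6": code_waiting_time(75.0),
    "X7": code_countdown(False),
}
```

Keeping each indicator in its own function makes it easy to adjust a single class boundary if the survey classification changes.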
Through the survey, 2183 pedestrian crossing samples were obtained, including 391 first pedestrian violation samples and 376 following pedestrian violation samples. The statistics of pedestrian violations are shown in Table 3.
X5, average headway: (1) X5 ≤ 5 s; (2) 6 s < X5 ≤ 10 s; (3) 10 s < X5
X6, pedestrian waiting time: (1) 0 s < X6 ≤ 60 s; (2) 60 s < X6 ≤ 120 s; (3) 120 s < X6
X7, red light countdown device: (1) yes; (2) no
As shown in the above table, numerous factors influence pedestrian crossing decision-making behavior, so it is necessary to use the correlation between variables to reduce the dimension of the raw data and to describe the data with a small number of common factors. Using the statistical analysis software SPSS for factor analysis, combined with the methods of the existing literature, three common factors were extracted, which cover the valid information required for this study [11,12]. Figure 4 shows the impact index system of pedestrian crossing violation behavior.
The first-level indicator is the pedestrian crossing violation probability. The second-level indicators are as follows: the road environment (F1) reflects the impact of intersection facilities on pedestrian violations and includes the crosswalk type (X1), crosswalk width (X2), and intersection environment (X3); the traffic condition (F2) mainly reflects the impact of the traffic environment at the intersection on pedestrian violations and includes traffic congestion (X4) and average headway (X5); the crossing facility (F3) reflects the impact of signal light-related factors on pedestrian violations and includes pedestrian waiting time (X6) and the red light countdown device (X7).
After the influencing factors of pedestrian crossing behavior are extracted, the state of each variable is evaluated by the factor analysis model and the factor score coefficient matrix is obtained, as shown in Table 4. Combined with Table 4, each common factor in the influencing factor system can be represented as a linear combination of the original variables, and the linear expressions for the three common factors are as follows:

Model Construction
Based on the survey results obtained in Section 2, the prediction model of the first pedestrian decision-making behavior is constructed by using the logistic regression analysis method. In order to further explore the influence of herd mentality on pedestrian decision-making behavior, the transition probability matrix of subsequent pedestrian group violations is calculated by a Markov chain method, which can analyze and predict the crossing decision-making behavior of the whole pedestrian group at the intersection.

Decision Model of Pedestrian Crossing Behavior Based on Logistic Regression
To determine whether the expected frequencies differ significantly from the observed frequencies, a goodness-of-fit test of the logistic regression model is required. In this paper, a likelihood ratio test is carried out on the predicted and observed frequencies of event occurrence and non-occurrence to determine whether the model is valid, where the likelihood ratio statistic approximately follows the $\chi^2$ distribution. The model chi-square $\chi^2$ is defined as the difference in $-2LL$ between the null hypothesis model [13] and the set model. When the model chi-square is large, its significance value is small, indicating that the set model fits the data significantly better than the null model. The $\chi^2$ statistic is computed as

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

where $f_o$ is the actual observation frequency and $f_e$ is the expected frequency. For two-dimensional contingency tables or univariate logistic regression models, $\alpha = 0.05$ is usually used as the significance level for screening candidate variables; that is, when the $\chi^2$ significance value is less than 0.05, the independent variable is valuable for predicting the result [14].
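The two statistics described above can be sketched in a few lines. The function names and the numbers in the example are hypothetical and are not taken from the survey data; the critical value 3.841 is the standard $\chi^2$ threshold for 1 degree of freedom at the 0.05 level.

```python
def pearson_chi_square(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def model_chi_square(neg2ll_null, neg2ll_model):
    """Difference in -2LL between the null model and the fitted model."""
    return neg2ll_null - neg2ll_model


# Hypothetical frequencies and -2LL values, for illustration only.
pearson = pearson_chi_square([10, 20], [15, 15])
improvement = model_chi_square(120.0, 95.5)

# Critical value of the chi-square distribution with 1 degree of freedom at alpha = 0.05.
CRITICAL_DF1 = 3.841
is_significant = improvement > CRITICAL_DF1
```

A statistical package would normally report the exact significance value; comparing against a tabulated critical value is the manual equivalent.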
In Section 2, three common factors were extracted by principal component analysis; their cumulative variance contribution rate is 80.147%, so they can replace most of the information in the original seven variables. Before they are substituted into the logistic model, this paper also screens the factors.
According to Table 5, the three independent variables (road environment, crossing facilities, and traffic conditions) each have 1 degree of freedom, and their significance values are all less than 0.05, which indicates that they have a significant impact on crossing decisions and can be included in the model. When pedestrians cross a signal-controlled crosswalk, there are only two kinds of decision-making behavior, crossing in violation or crossing in compliance, so the decision can be treated as a binary variable. Therefore, this paper uses the binary logistic model to fit the data, taking whether or not pedestrians cross in violation as the dependent variable of the model and the factor indicators as the independent variables. The model codes a pedestrian crossing in compliance as $y = 1$, with selection probability $P_1 = P$; a pedestrian crossing in violation is coded as $y = 0$, with selection probability $P_0 = 1 - P_1$.
The binomial logistic regression model with K independent variables is as follows:

$$P = \frac{\exp\left(\alpha + \sum_{j=1}^{K} \beta_j x_j\right)}{1 + \exp\left(\alpha + \sum_{j=1}^{K} \beta_j x_j\right)}$$

Therefore, the odds of a pedestrian crossing in compliance, that is, the ratio of the probability of event occurrence to the probability of event non-occurrence, is

$$\frac{P}{1 - P} = \exp\left(\alpha + \sum_{j=1}^{K} \beta_j x_j\right)$$

Taking the logarithm of both sides of the above formula gives

$$\ln\frac{P}{1 - P} = \alpha + \sum_{j=1}^{K} \beta_j x_j$$

After screening the independent variables, three indicators that meet the required significance level are finally selected. According to the requirements of the logistic regression model, the following linear regression equation can be obtained:

$$\ln\frac{P}{1 - P} = \alpha + \beta_1 F_1 + \beta_2 F_2 + \beta_3 F_3$$

where $\alpha$ is the constant term, $x_j$ is the j-th independent variable, and $\beta_j$ is the regression coefficient of each independent variable. The statistical software SPSS was used to calibrate the impact factor system of pedestrian crossing decision behaviors, using the variables that satisfy the significance level to estimate the model parameters. The final results are shown in Table 6, where the degrees of freedom of the parameters are all 1. According to the model estimation results, the pedestrian crossing compliance rate is obtained by substituting the estimated parameters in Table 6 into the logistic model. After the parameters of the logistic model were estimated, the deviance statistic, the Pearson $\chi^2$, and the Hosmer-Lemeshow statistic were used to test the goodness of fit of the model.
From the data in Table 7, at the significance level $\alpha = 0.05$, the model with the selected index variables fits the pedestrian crossing violation data well, so the independent variables of the model have significant explanatory power for the dependent variable.
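As a minimal sketch of how the binary logistic model turns the three factor scores into a compliance probability, the snippet below evaluates the logistic function, the odds, and the logit. The coefficient values are placeholders chosen for illustration; they are not the calibrated values of Table 6.

```python
import math


def compliance_probability(factors, alpha, betas):
    """P(y = 1): probability of crossing in compliance for factor scores (F1, F2, F3)."""
    z = alpha + sum(b * f for b, f in zip(betas, factors))
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Ratio of the probability of occurrence to the probability of non-occurrence."""
    return p / (1.0 - p)


def logit(p):
    """Natural logarithm of the odds."""
    return math.log(odds(p))


# Placeholder parameters (assumptions, not the estimates reported in the paper).
ALPHA = 0.5
BETAS = (0.8, -0.4, 0.3)
FACTORS = (1.0, 0.5, -1.0)

linear_term = ALPHA + sum(b * f for b, f in zip(BETAS, FACTORS))
p_compliance = compliance_probability(FACTORS, ALPHA, BETAS)
p_violation = 1.0 - p_compliance
```

Applying `logit` to `p_compliance` recovers `linear_term`, which is the log-odds relationship stated in the derivation.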

Probability Model of Following Pedestrian Violations Based on a Markov Chain
Many researchers focus on intersection crossing behavior using, e.g., the survey statistics method [15,16], the micro-simulation analysis method [17,18], the survival analysis method [19,20], the discrete choice method [21,22], and so on. Relevant studies have carried out in-depth analyses on the maximum time pedestrians will wait to cross the street, the delay caused by pedestrians crossing the street, and the traffic characteristics of speed, and have constructed models adapted to specific conditions. However, they fail to reflect the inherent following psychology and dynamic change process of pedestrians when crossing the street in a time series.
In this paper, the decision-making behavior of pedestrian crossing is divided into a countable number of states on a minute-based time scale. Both the time and the states are discrete, which conforms to the definition of a Markov chain. Therefore, the probability vector S(j) is constructed to represent the decision-making state of pedestrians in the j-th cycle:

$$S(j) = (S_1(j), S_2(j), S_3(j)) \qquad (10)$$

where $S_t(j)$ (t = 1, 2, 3) represents the probability of being in state t during period j. Other variables are defined as follows: n is the number of pedestrians in the survey sample; $A_i$ is the decision-making behavior sequence of the i-th pedestrian in the selected period (1 ≤ i ≤ n), where $A_{i1}$ represents the behavior sequence of pedestrians crossing the street in compliance, $A_{i2}$ represents the behavior sequence of pedestrians crossing the street in violation, and $A_{i3}$ represents the behavior sequence of pedestrians following the previous violation.
To judge whether the decision-making sequence of pedestrian behavior conforms to the definition of a Markov chain, that is, whether the state of the random variable X(n+1) depends only on the current state X(n) and is independent of earlier states [23,24], the $\chi^2$ statistic can be used to test the "Markov property" of the discrete sequence [25]. Suppose $v_1, v_2, \ldots, v_n$ is a sequence of index values of decision-making behavior containing m states, and let $f_{ij}$ denote the frequency of one-step transitions from state i to state j in the sequence. The marginal probability $p_{\cdot j}$ is the sum of the j-th column of the state transition frequency matrix F divided by the sum of all elements of the matrix:

$$p_{\cdot j} = \frac{\sum_{i=1}^{m} f_{ij}}{\sum_{i=1}^{m}\sum_{j=1}^{m} f_{ij}}$$
When n is large enough, the $\chi^2$ statistic is

$$\chi^2 = 2\sum_{i=1}^{m}\sum_{j=1}^{m} f_{ij}\left|\ln\frac{p_{ij}}{p_{\cdot j}}\right|$$

which follows the $\chi^2$ distribution with $(m-1)^2$ degrees of freedom, where $f_{ij}$ is the one-step transition frequency from state i to state j, $p_{ij}$ is the transition probability from state i to state j, and $p_{\cdot j}$ is the marginal probability of state j. Under the given significance level $\alpha$, if the statistic satisfies

$$\chi^2 > \chi^2_{\alpha}\left((m-1)^2\right)$$

the decision sequence of crossing behavior has the "Markov property". In this case, the k-step transition probability of the Markov chain at time n is

$$P\{X(n+k) = j \mid X(n) = i\} = p_{ij}(n, k) \qquad (13)$$

Suppose that the probability vector of a pedestrian's initial crossing state is $S(0) = (S_1(0), S_2(0), S_3(0))$ and that, after a k-step transition, it is $S(k) = (S_1(k), S_2(k), S_3(k))$, that is, the crossing behavior after k cycles. In period k, the probability of a pedestrian crossing in state t is $S_t(k)$ (t = 1, 2, 3), and the transition probability matrix P of the pedestrian crossing decision state is

$$P = \begin{pmatrix} p_{11} & p_{12} & p_{13} \\ p_{21} & p_{22} & p_{23} \\ p_{31} & p_{32} & p_{33} \end{pmatrix}$$

According to the ergodicity of the Markov chain, the probability of the process being in state j converges to $\pi_j$ after a sufficiently long time, regardless of the initial state. When $k \to \infty$, the state probability S(k) approaches a fixed value; that is, as the number of transition steps grows, the pedestrian's decision-making state tends to be stable. In the steady state, the probabilities of the three decision states are $\pi_1, \pi_2, \pi_3$, which satisfy

$$\pi = \pi P, \qquad \pi_1 + \pi_2 + \pi_3 = 1, \qquad \pi = (\pi_1, \pi_2, \pi_3)$$

Since the statistical time series contains multiple statistical moments, the mean absolute error (MAE) and the mean absolute percentage error (MAPE) can be used as evaluation indicators [26]. The smaller the values of these two errors are, the more accurate the prediction results are and the better the match with the observations:

$$\mathrm{MAE} = \frac{1}{N}\sum_{s=1}^{N}\left|\hat{q}_s - q_s\right|, \qquad \mathrm{MAPE} = \frac{100\%}{N}\sum_{s=1}^{N}\left|\frac{\hat{q}_s - q_s}{q_s}\right|$$

where $\hat{q}_s$ is the predicted value, $q_s$ is the actual value, and N is the number of statistical moments.
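The Markov property test can be illustrated with a short sketch that computes the statistic $2\sum_i\sum_j f_{ij}\left|\ln(p_{ij}/p_{\cdot j})\right|$ from a transition frequency matrix. The form of the statistic used here is the one commonly applied to discrete sequences and is assumed to match the paper's; the 3 x 3 frequency matrix is hypothetical. The critical value 9.488 is the standard $\chi^2$ threshold for 4 degrees of freedom at the 0.05 level.

```python
import math


def markov_chi_square(freq):
    """Return (statistic, degrees of freedom) for an m x m transition frequency matrix."""
    m = len(freq)
    total = sum(sum(row) for row in freq)
    # Marginal probability of each destination state: column sum / grand total.
    col_marginal = [sum(freq[i][j] for i in range(m)) / total for j in range(m)]
    stat = 0.0
    for i in range(m):
        row_total = sum(freq[i])
        for j in range(m):
            if freq[i][j] == 0:
                continue  # a zero frequency contributes nothing to the sum
            p_ij = freq[i][j] / row_total
            stat += freq[i][j] * abs(math.log(p_ij / col_marginal[j]))
    return 2.0 * stat, (m - 1) ** 2


# Hypothetical one-step transition counts between three decision states.
example_freq = [
    [30, 8, 2],
    [6, 20, 9],
    [3, 7, 15],
]
statistic, dof = markov_chi_square(example_freq)

# Critical value of the chi-square distribution with 4 degrees of freedom at alpha = 0.05.
has_markov_property = statistic > 9.488
```

When every row of the frequency matrix has the same proportions as the column marginals, the statistic is zero, which corresponds to a sequence whose next state does not depend on the current one.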

Case Overview
This paper takes the Danxi Road-Lanxi Street intersection, a cross-shaped signalized intersection, as an example; the situation of each approach is shown in Figure 5. Through the investigation, the basic traffic environment information of the Danxi Road-Lanxi Street intersection was obtained and then transformed into the corresponding index variables by the principal component analysis method. The survey data for 7:00-7:15 a.m. are shown in Table 8. The pedestrian crossing survey was conducted from 7:00 to 8:00 a.m. during the morning rush hour, and the data in Table 9 were collected every 5 min. The pedestrian crossing violations are divided into first pedestrian violations and following pedestrian violations, as shown in Table 10. The statistics of crossing pedestrians at the intersection over continuous time were collected according to the divided time intervals. To obtain the current state of pedestrians waiting to cross the street at the intersection, combined with the survey data in Table 9, Formula (18) obtained from the logistic model was used to process the survey data for 7:00-7:15 a.m. in Table 8. In this way, the number of first pedestrian violations at the intersection could be predicted, as shown in Table 11.

Calculate the One-Step State Transition Probability Matrix
According to the survey statistics of pedestrian status, the frequencies of the different decision state transitions were obtained, and the one-step state transition probability matrix was calculated using the discrete-time transition probability matrix formula [27]. The $\chi^2$ statistic can be used to test whether the discrete time series has the "Markov property". Based on the one-step transition probability matrix of pedestrians obtained above, combined with Equation (7), at the given significance level $\alpha = 0.05$ with m = 4, we obtain $\chi^2 = 15.316 > \chi^2_{0.05}\left((m-1)^2\right) = 12.854$ from the $\chi^2$ distribution table. This result shows that the sequence has the "Markov property" and can be used to analyze and predict the future state.
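A one-step transition probability matrix of this kind is usually estimated by counting transitions in the observed state sequence and normalizing each row. The sketch below assumes that estimator; the state sequence is invented for illustration, with states indexed 0, 1, 2.

```python
def count_transitions(sequence, m):
    """Count one-step transitions i -> j in a sequence of state indices 0..m-1."""
    freq = [[0] * m for _ in range(m)]
    for current, following in zip(sequence, sequence[1:]):
        freq[current][following] += 1
    return freq


def transition_matrix(freq):
    """Normalize each row of the frequency matrix so that it sums to 1."""
    matrix = []
    for row in freq:
        total = sum(row)
        if total == 0:
            raise ValueError("a state with no outgoing transitions cannot be normalized")
        matrix.append([count / total for count in row])
    return matrix


# Hypothetical sequence of decision states observed over consecutive intervals.
observed_states = [0, 0, 1, 2, 1, 0]
frequencies = count_transitions(observed_states, 3)
one_step_matrix = transition_matrix(frequencies)
```

Raising an error for an empty row is a deliberate choice here: a state that was never left in the sample gives no information about its transition probabilities.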

Pedestrian Status Prediction
The one-step transition probability matrix was obtained in the previous step, and pedestrian decision-making at multiple times within one continuous period was predicted with it. Table 11 shows the predicted number of following pedestrian violations during the morning rush hour. Given the initial state (1, 0, 0), the predicted probability of future pedestrian violations can be obtained by iterative calculation. The comparison with the actual violation probability is shown in Table 12. Based on the total number of pedestrians crossing at the intersection in Table 9, the actual survey data in Table 10 are compared with the prediction results in Table 11 to assess the effect of the prediction model based on the Markov chain.
Figures 6 and 7 show that, because herd mentality has a large influence on whether a following pedestrian violation occurs, the predicted probability curve has a larger variation range than the actual probability curve when the decision state of pedestrians is classified. This method expands the prediction range and reflects the actual situation of pedestrians crossing the street. It shows that the Markov prediction model based on the mean value can predict the decision behavior of pedestrians crossing the intersection in peak hours well. However, Markov chain prediction has some limitations: the number of pedestrians crossing is in practice a random process, so the prediction is only valid within a limited time horizon.
Because the Markov property makes each predicted value depend only on the current state and not on the earlier history, the predicted value may be higher than the actual value at certain moments. In addition, the prediction results show that the pedestrian crossing decision state has not changed greatly, which indicates that the pedestrian crossing decision state at the intersection is not optimal. Because of the large number of following pedestrian violations at the intersection, further optimization is needed to improve the pedestrian crossing compliance rate.
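The iterative calculation from the initial state (1, 0, 0) amounts to repeated multiplication of the state vector by the transition matrix. The following sketch shows the k-step prediction and a simple power-iteration estimate of the steady-state vector; the transition matrix is hypothetical and is not the one estimated in the case study.

```python
def step(state, matrix):
    """One transition: returns the row vector state * matrix."""
    m = len(matrix)
    return [sum(state[i] * matrix[i][j] for i in range(m)) for j in range(m)]


def predict(state, matrix, k):
    """State probability vector after k transitions, S(k) = S(0) * P^k."""
    current = list(state)
    for _ in range(k):
        current = step(current, matrix)
    return current


def stationary(matrix, tol=1e-12, max_iter=10000):
    """Approximate the steady-state vector by iterating until the change is below tol."""
    m = len(matrix)
    current = [1.0 / m] * m
    for _ in range(max_iter):
        following = step(current, matrix)
        if max(abs(a - b) for a, b in zip(following, current)) < tol:
            return following
        current = following
    return current


# Hypothetical transition matrix between the three decision states (each row sums to 1).
P = [
    [0.7, 0.2, 0.1],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
]
after_one = predict([1.0, 0.0, 0.0], P, 1)
after_five = predict([1.0, 0.0, 0.0], P, 5)
pi = stationary(P)
```

Power iteration is used here for simplicity; solving the linear system $\pi = \pi P$ with the normalization constraint would give the same vector directly.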

Prediction Results Evaluation
Evaluating the prediction effect of the pedestrian decision-making behavior model based on the mean-value Markov chain shows that the average error of the number of waiting pedestrians at the intersection is 3.18 person-times and that the average relative error is only 0.97745%, less than 1%. This indicates that the model predicts well under the given conditions and can effectively predict pedestrian decision-making at intersections over certain continuous periods.
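The two evaluation indicators can be computed as in the sketch below, which assumes the usual definitions of the mean absolute error and the mean absolute percentage error. The predicted and actual counts are invented for illustration and are not the values in Tables 10 and 11.

```python
def mae(predicted, actual):
    """Mean absolute error between predicted and actual values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mape(predicted, actual):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs(p - a) / abs(a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predicted and observed pedestrian counts per 5-minute interval.
predicted_counts = [110.0, 90.0]
actual_counts = [100.0, 100.0]
example_mae = mae(predicted_counts, actual_counts)
example_mape = mape(predicted_counts, actual_counts)
```

Because the percentage error divides by the actual value, intervals with no observed pedestrians would have to be excluded or handled separately.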

Conclusions
This paper takes pedestrian crossing at urban signalized intersections as the research object. Combined with field investigation and video recording detection methods, the quantifiable indicators affecting pedestrian crossing behavior are preliminarily classified.
Based on the pedestrian crossing behavior data collected in the survey, an index system of the factors affecting the first pedestrian's decision-making behavior is constructed, and its parameters are calibrated with the logistic regression model. The evaluation results show that the model reflects the behavior characteristics of the first pedestrian crossing at the signalized intersection well.
Based on the analysis of the mechanism of the first pedestrian's crossing violation, the decision model of the following pedestrian crossing is constructed by combining herd psychology and the Markov chain method. The results can predict the probability of pedestrian crossing violations at intersections. Finally, the validity of the logistic regression analysis model and the Markov chain method is verified by the survey data of pedestrians crossing at the Danxi Road-Lanxi Street intersection in Jinhua City.
Some limitations of this study merit further work. First, the psychological characteristics that affect pedestrian crossing violations should be further explored to refine the influencing factors. Second, the model assumes that the indicator variables can be obtained by actual observation, a limitation that can be handled by a Hidden Markov Model. Third, the research results can be extended to traffic flow assessment [28,29] to improve road safety analysis [30,31]. Finally, more pedestrian crossing data from multiple types of intersections are needed to validate the model proposed in this paper.