IDENTIFICATION OF ROAD SAFETY MEASURES FOR ELDERLY PEDESTRIANS BASED ON K-MEANS CLUSTERING AND HIERARCHICAL CLUSTER ANALYSIS

Introduction: Pedestrians aged over 65 are known to be a critical group in terms of road safety because they represent the age group with the highest number of people killed or injured in road accidents. With an ageing population throughout much of the developed world, there is an urgent need to understand the current transportation requirements of older adults and to ensure their sustained safe and healthy mobility. Objectives: The aim of this study is to capture and analyze the key components that influence the identification of design solutions and strategies aimed at improving the safety of pedestrian paths for the elderly. Method: A survey was conducted in 5 different locations in Catania, Italy. The locations were specifically chosen near points of attraction for elderly pedestrians (e.g. centers for the elderly, squares, churches). Participants were recruited in person, so as to select exclusively people over 70. The sample comprised 322 participants. Both Hierarchical and K-Means clustering were used in order to explore which solutions elderly pedestrians propose for improving the safety of pedestrian paths. Results: The results show that the judgment expressed by the elderly on the solutions for improving pedestrian safety is linked to gender, to experience as road users, and to mobility and vision problems. All the solutions proposed concern road infrastructure (improvement of pedestrian crossings and of sidewalks, implementation of traffic calming measures, improvement of lighting), except for police supervision. Conclusion: This study has identified the factors that influence the identification of the best solutions to increase the safety level of pedestrian paths for elderly people. The human factors considered were gender, the factors associated with experience as road users and the factors related to age-related problems (mobility, vision and hearing problems).
The results of this research could help traffic engineers, planners, and decision-makers to consider the contributing factors in engineering measures to improve the safety of vulnerable users such as elderly pedestrians.

relationship among elderly's age-related declines in perceptual and physical abilities and their perception and opinion of pedestrian paths. In order to help prevent elderly pedestrian accidents, it is necessary to answer questions about how they perceive pedestrian paths with respect to their age-related declines in perceptual and physical abilities and with respect to their experiences as road users. This paper presents the results of an analysis based on the case study of the urban area of Catania (Italy). The aim of this study is, first of all, to understand which solutions elderly pedestrians identify for the critical issues of the pedestrian paths they usually walk. Moreover, this study seeks to analyze how elderly pedestrians' age-related declines in perceptual and physical abilities (vision, hearing and mobility problems) and experiences as road users (no driving license, no still driving, accidents driving, accident pedestrian) can affect their opinion on the solutions to the critical issues of pedestrian paths. The intent of this approach was to allow older people's voices to broaden our understanding of neighbourhood walkability and to understand their priorities. This is important for determining interventions and could help traffic engineers, planners, and decision-makers to consider the contributing factors in engineering countermeasures.

Literature review
The rapid growth and development of urban areas has resulted in a drastic increase in both the human population and the vehicle population in most metropolitan areas across the globe. As a result, there is an unavoidable increase in conflicts between vehicular traffic and pedestrians, who often share the same road space (Thakur and Biswas, 2019). Whether the primary mode of transportation is the automobile, the bicycle, or public transit, people must walk as a part of the trip, such as from their home to the store or place of employment, and/or to the transit stop. In particular, road intersections are real "black spots" for pedestrian accidents (Canale et al., 2015).
It is well known that older road users, both drivers and pedestrians, have a heightened accident risk compared to younger road users (Staplin et al., 2001; Tollazzi et al., 2010). Numerous pedestrian studies have identified old age as one of the major risk factors for severe injury or fatality (Abay, 2013; Eluru et al., 2008; Kim et al., 2017; Pour-Rouholamin and Zhou, 2016). Many studies have investigated the accessibility of urban areas for vulnerable road users (Giuffrida et al., 2017a; Giuffrida et al., 2017b; Benenson et al., 2011) and for people with motor and visual disabilities (Mrak et al., 2019). Individuals with disabilities may include individuals with mobility disabilities who use wheelchairs, walkers or canes, individuals who are blind or who have impaired vision, individuals with cognitive impairments from developmental disabilities, stroke or brain injury, and others. Individuals with disabilities may be the most vulnerable users of transportation facilities. Many are unable to drive and are dependent on public transit and pedestrian facilities to travel to work and to family, shopping, medical, and recreation destinations. The safety of persons with disabilities as road users often depends on the design of sidewalks and street crossings for usability and safety. Many people change their routes, use paratransit services, or do not travel at all in response to poor roadway facilities. The safety of persons with disabilities is an essential part of improving roadway safety. Age-related declines in perceptual, cognitive, and physical abilities have been shown to contribute to the high rate of fatal or serious-injury crashes found for elderly pedestrians (Road Safety Research Report, 2004). Because of age-related perceptual, cognitive, and motor limitations, elderly pedestrians are expected to experience more difficulty than young pedestrians (Oxley et al., 1997; Fontaine and Gourlet, 1997). 
Yee (2006) showed that the fatality rate of the pedestrians aged 65 and above was almost double that of the younger group. Elsewhere (Sklar et al., 1989), the fatality rate of elderly pedestrians was reported to be 5 to 6 times higher than that of their younger counterparts. There exist several recent studies with a specific focus on elderly pedestrian risk factors. Wang et al. (2017) studied elderly pedestrian injury in Singapore and found that nighttime, high-speed roads, 3-legged intersections, and unlawful crossings increased elderly pedestrian injury severity. Inadequate crossing time at intersections was identified as a contributing factor to high susceptibility of elderly pedestrians to serious injury or fatality (Martin et al., 2010). In seeking a gap between vehicular traffic while crossing a street, elderly people were found to underestimate the crossing times, endangering themselves while crossing (Zivotofsky et al., 2012). Older people walk more slowly than younger people, and take smaller steps (Ketcham and Stelmach, 2001). Studies of people with physical mobility problems confirm that they typically walk rather more slowly (Shumway-Cook and Woollacott, 2001). Ketcham and Stelmach (2001) reviewed current research on changes in motor control related to ageing. In general, older adults move more slowly. When high accuracy is required, older adults tend to decelerate more slowly as they approach the target in a movement task. Attention is important to pedestrians in a number of ways. Pedestrians need to be able to switch attention between tasks, to focus attention in particular locations, and to carry out visual search. There is some evidence linking attention to the pedestrian task and to accidents. Most pedestrians who are struck by cars do not see the vehicle that hits them at all, and many report that they looked but did not see it (Wilson and Grayson, 1980). Some aspects of cognitive performance decline with age. 
There are, for example, age-related deficits in both speed and accuracy for memory, spatial processing, planning, and attention. There are several types of theoretical explanation for lower performance on specific tasks. Theorists argue that the differences result from general slowing, from some other general reduction in resources, and from reduction in some specific capacity, such as attention (Road Safety Research Report, 2004). It has also been suggested that older people perform more poorly on cognitive tasks in part because they more frequently produce very slow responses, or lapses, and that this is true for executive tasks specifically (West, 2001).
Regarding the methods used for data analysis (see Section 3.2), it is important to note that cluster analysis is widely used in the scientific literature. It has been applied in a wide variety of fields, ranging from engineering, computer sciences, life and medical sciences, to earth sciences and economics (Xu and Wunsch, 2005). In transport engineering, several studies have applied cluster analysis in a variety of sectors, such as the optimization of traffic control systems (Lin and Xu, 2020) and the analysis of both accident data and survey data (Depaire et al., 2008; Distefano et al., 2018; Sivasankaran and Balasubramanian, 2020; Choudhari and Maji, 2019). Clustering is an unsupervised machine learning method based on heuristics that maximizes the similarity among elements within a cluster and the dissimilarity between elements of different clusters (Fraley and Raftery, 2002).

Leonardi, S., Distefano, N., Pulvirenti, G., Archives of Transport, 56(4), 107-118, 2020

Method
3.1. Participants and survey
The survey was conducted in 5 different locations in Catania, Italy. The locations were specifically chosen near points of attraction for elderly pedestrians (e.g. centers for the elderly, squares, churches). Participants were recruited in person, so as to select exclusively people over 70. Before the study began, participants were briefed on its nature and on the time required to participate. After their consent was obtained, the questionnaire started. It was decided to question the participants directly, rather than leaving them alone with the questionnaire, so that visual aids and detailed explanations and clarifications could be provided. Each survey lasted approximately 20 minutes. Participants were assured of anonymity and confidentiality.
The total sample comprised 322 participants (164 men and 158 women). Participants who did not complete the questionnaire or who gave uncertain answers were excluded. The respondents excluded were about 5% of the sample. The final sample was composed of 306 participants (156 men and 150 women). The majority of respondents (50.33%) were aged between 70 and 75, 28.10% were aged between 75 and 80 and 21.57% were over 80.
The survey contained questions about socio-demographic characteristics, experience as road users and age-related declines in perceptual and physical abilities. Finally, the survey contained two open-ended questions related to the critical issues of pedestrian paths and the solutions for these issues. The questionnaire was divided into the following 5 sections:
-Section 1: participants reported their age, their gender and other basic socio-demographic characteristics;
-Section 2: this section included questions regarding the participants' experience as road users. Participants were asked if they had ever held a driving license, if they still drove, if they had ever had accidents while driving and if they had ever had accidents as pedestrians;
-Section 3: this section contained questions about age-related declines in perceptual and physical abilities. Participants were asked if they had vision problems, hearing problems and mobility problems;
-Section 4: this section consisted of an open-ended question related to the critical issues of pedestrian paths. Participants could freely express their opinion on the critical issues and the problems they found in the pedestrian paths they usually walked;
-Section 5: this section consisted of an open-ended question related to the solutions for the critical issues of pedestrian paths. Participants could freely express their opinion on the solutions they thought could improve the safety of the pedestrian paths they usually walked.
Since the aim of this study is to explore which solutions elderly pedestrians propose for improving the safety of the pedestrian paths they usually walk, this study focuses on Sections 1, 2, 3 and 5 of the questionnaire. A previous study analyzed the results of Section 4 of the questionnaire in order to explore elderly pedestrians' perception of the critical issues of pedestrian paths (Pulvirenti et al., 2020).

Model development
In order to analyze the data obtained from the survey, a cluster analysis was developed. Cluster analysis seeks to separate data elements into groups or clusters such that both the homogeneity of elements within the clusters and the heterogeneity between clusters are maximized (Hair et al. 1998). Among the similarity-based techniques, two major approaches can be discerned. One approach is a distance-based clustering algorithm that identifies homogeneous subsets (partitioning approach; e.g., K-Means clustering). The other approach is hierarchical clustering (e.g., Ward's method, single linkage method).
K-Means clustering is one of the most commonly used unsupervised machine learning algorithms for partitioning a given data set into a set of k groups (i.e. k clusters), where k represents the number of groups prespecified by the analyst. It classifies objects into multiple groups (i.e., clusters), such that objects within the same cluster are as similar as possible (i.e., high intra-class similarity), whereas objects from different clusters are as dissimilar as possible (i.e., low inter-class similarity). The first step when using K-Means clustering is to indicate the number of clusters (k) that will be generated in the final solution. The algorithm starts by randomly selecting k objects from the data set to serve as the initial centers for the clusters. The selected objects are also known as cluster means or centroids. Next, each of the remaining objects is assigned to its closest centroid, where closest is defined using the Euclidean distance between the object and the cluster mean. The centroids are then recomputed as the means of the objects assigned to them, and the two steps are repeated until the assignments no longer change.
The optimal number of clusters with the hierarchical method is determined by the minimum number of groups with the maximum amount of distance between group means. Frequently, this is illustrated with a dendrogram of the merging clusters. Using a dendrogram, the ideal number of clusters is determined by the number of clusters intersected when drawing a vertical line through the largest horizontal distance between merging clusters.
In this study both hierarchical and K-Means clustering were used in order to explore which solutions elderly pedestrians propose for improving the safety of pedestrian paths. Starting from the results of the survey, a cluster analysis was developed to answer the following questions:
1. Can we group together elderly pedestrians with a similar perception of the solutions for improving pedestrian safety?
2. How can we interpret the groups obtained? What do elderly pedestrians belonging to the same group have in common?
3. Which variables most affect the determination of the groups?
The nominal variable considered is solutions for improving pedestrian safety, with the fourteen possible items shown in Table 1. These items were deduced from the answers to the open-ended question of Section 5 of the questionnaire, related to the solutions for improving the safety of pedestrian paths.
The variables considered for the cluster analysis were chosen to be representative of the respondents' experience as road users and of the respondents' age-related declines in perceptual and physical abilities. Moreover, gender was taken into account. Each of the 8 variables considered can range from 0 to 1, reflecting the proportion of respondents who answered Yes or No to each question. The value 0 is associated with the answer No for each variable (except for the variable Gender, for which 0 corresponds to Male), while the value 1 is associated with the answer Yes for each variable (except for the variable Gender, for which 1 corresponds to Female). The final cluster centers can therefore range from 0 to 1. These conditions are all representative of age-related declines in perceptual and physical abilities (vision, hearing and mobility problems) or of experiences as road users (no driving license, no still driving, accidents driving, accident pedestrian) which can affect the opinion on the solutions for improving pedestrian safety.
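The K-Means steps described above (random initial centers, assignment of each object to its closest centroid by Euclidean distance) can be sketched as follows, together with the usual centroid update step. This is only a minimal illustration and not the SPSS procedure used in the study: the function name `k_means`, the fixed random seed and the two-variable toy profiles are assumptions made for the example.

```python
import math
import random


def k_means(points, k, seed=0, max_iter=100):
    """Plain K-Means (Lloyd's algorithm) with Euclidean distance."""
    rng = random.Random(seed)
    # Step 1: pick k distinct objects as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every object to its closest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Step 3: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical profiles (NOT the study's data): for each proposed solution,
# the share of "Yes" answers on two illustrative variables.
profiles = [
    [0.10, 0.90], [0.15, 0.85], [0.20, 0.80],
    [0.90, 0.10], [0.85, 0.20], [0.80, 0.15],
]
labels, centers = k_means(profiles, k=2)
print(labels)
print(centers)
```

Because every coordinate of every object lies between 0 and 1, each center (a mean of such objects) also lies between 0 and 1, which is why the final cluster centers can be read as shares of respondents.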

K-Means cluster analysis
As shown in Table 2, the solutions for improving pedestrian safety were grouped into clusters by using SPSS software. To use K-Means clustering, the number of clusters must be set in advance, based either on existing knowledge of the data or on the approximate number of groups into which the data are to be divided. A good approach to K-Means is to try several numbers of clusters and see which number best represents the data or produces significant differences in the analysis. Different cluster models were therefore estimated, from one to seven clusters, in order to select a suitable number. For further analysis, the solutions for improving pedestrian safety were divided into five clusters. Table 2 shows the cluster membership.
The first cluster is composed only of item 6, i.e. "construction of pedestrian crossings". This cluster can therefore be named Construction of pedestrian crossings. Cluster 2 groups together three items, i.e. item 2 ("adaptation of sidewalks (width)"), item 13 ("increased police supervision") and item 14 ("other forms of adaptation of sidewalks (height, obstacle removal, etc.)"). The second cluster can therefore be named Adaptation of sidewalks and police supervision. Cluster 3 groups together three items, i.e. item 3 ("Renovation of the sidewalks surface"), item 9 ("Installation of the street lighting system/Adaptation of street lighting system") and item 12 ("Repair of road pavement"). Cluster 3 can therefore be named Walking surfaces and lighting. Cluster 4 groups together 5 items, i.e. item 1 ("Construction of sidewalks"), item 4 ("Prevention of parking on sidewalks"), item 7 ("Improvement of pedestrian crossing conditions"), item 8 ("Installation of signalized pedestrian crossings") and item 11 ("Implementation of pedestrian areas"). The fourth cluster can therefore be named Pedestrian areas, construction of sidewalks and improvement of pedestrian crossings. Finally, Cluster 5 is composed of two items, which are item 5 ("Construction of ADA ramps on sidewalks") and item 10 ("Installation of traffic calming measures"). Cluster 5 can therefore be named Traffic calming measures and elimination of architectural barriers.
Table 3 shows the ANOVA results and indicates which variables most affect the identification of the clusters. The variables contributing most to the identification of the clusters are Accidents driving (Sig.=0.002), Gender (Sig.=0.004), Mobility problems (Sig.=0.002), Accidents pedestrian (Sig.=0.014), Vision problems (Sig.=0.026) and No still driving (Sig.=0.033). Hearing problems (Sig.=0.401) and No driving license (Sig.=0.517) are instead the variables that least affect the division into different clusters. Table 4 shows the profiles of the clusters obtained with the K-Means procedure. Each group is represented by a center, i.e. a vector (row) whose components are the means of the variables that define the coordinates of the objects belonging to that group. The main and most interesting characteristics of the respondents belonging to the five clusters are given below.

-Cluster 1 (Construction of pedestrian crossings):
The majority of respondents belonging to this group are men and have mobility problems. The majority of respondents belonging to this group are men who had accidents while driving and as pedestrian in the past.
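The ANOVA in Table 3 compares, for each variable, how far the cluster means lie from one another with how much the values vary inside each cluster. A minimal sketch of the underlying one-way F statistic is given here for orientation only. The function name `anova_f` and the toy values are hypothetical, the Sig. values of Table 3 come from SPSS and are not reproduced, and the sketch assumes that the within-cluster variation is not zero.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-cluster sum of squares: distance of cluster means from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-cluster sum of squares: spread of values around their own cluster mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    # Ratio of the two mean squares (k - 1 and n - k degrees of freedom).
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values of one variable for the items of three clusters.
toy_clusters = [[0.1, 0.2, 0.3], [0.2, 0.3, 0.4], [0.5, 0.6, 0.7]]
f_stat = anova_f(toy_clusters)
print(round(f_stat, 6))  # 13.0
```

A large F value means that the variable separates the clusters well. Since K-Means builds the clusters precisely so as to maximize such differences, F statistics of this kind are best read descriptively, as a ranking of the variables, rather than as formal hypothesis tests.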

Hierarchical cluster analysis
Hierarchical clustering makes it possible to confirm the number of clusters which was hypothesized with the K-Means clustering. The optimal number of clusters with the hierarchical method is determined by the minimum number of groups with the maximum amount of distance between group means. Frequently, this is illustrated with a dendrogram of the merging clusters. Using a dendrogram, the ideal number of clusters is determined by the number of clusters intersected when drawing a horizontal line through the largest vertical distance between merging clusters. As with K-Means, the number of clusters must still be chosen, but this method gives some perspective on what the ideal value may be. The hierarchical clustering made it possible to illustrate the hierarchical organization of the groups, as shown in the dendrogram of Figure 1. This visualization confirms the previous result, but also offers a hierarchical view of the clusters. By cutting the dendrogram at height 6, corresponding to the highest jump between levels of similarity, five clusters that are homogeneous with respect to their level of perceived safety are obtained. These clusters correspond to the five clusters resulting from the K-Means cluster analysis. The hypothesis made for the K-Means cluster analysis was therefore fully confirmed by the hierarchical cluster analysis.
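The rule of cutting the dendrogram at the largest jump between merge levels can be sketched as follows. This is an illustration under stated assumptions, not the SPSS procedure behind Figure 1: single linkage is assumed because the linkage method is not specified here, the function names and the toy points are hypothetical, and the heights are raw Euclidean distances, so they are not comparable with the height scale of Figure 1.

```python
import math


def merge_heights(points):
    """Agglomerative clustering (single linkage); returns the merge distances."""
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest.
        d, i, j = min(
            (min(math.dist(a, b) for a in clusters[i] for b in clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
        heights.append(d)
    return heights


def clusters_at_largest_jump(heights):
    """Number of clusters left when cutting inside the largest height gap."""
    gaps = [b - a for a, b in zip(heights, heights[1:])]
    cut = gaps.index(max(gaps))  # cut between merge `cut` and merge `cut + 1`
    return len(heights) - cut    # merges after the cut are not carried out


# Hypothetical two-variable profiles forming three tight groups.
points = [
    (0.10, 0.10), (0.15, 0.10),
    (0.50, 0.90), (0.55, 0.90),
    (0.90, 0.10), (0.90, 0.15),
]
heights = merge_heights(points)
n_clusters = clusters_at_largest_jump(heights)
print(n_clusters)  # 3
```

In the toy data the three within-group merges happen at a distance of about 0.05 and the next merge only at 0.75, so the largest jump falls between them and the cut leaves three clusters.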

Discussion
The results of the cluster analysis show that the judgment expressed by the elderly on the solutions for improving pedestrian safety seems to be significantly linked to gender, to experience as road users, and to mobility and vision problems that compromise the correct perception of the road environment. A previous study shows that the gender of pedestrians affects the injury severity of elderly pedestrians (Retting, 1993). Age-related declines in perceptual, cognitive, and physical abilities have also been shown to contribute to the high rate of fatal or serious-injury crashes found for old pedestrians (Dunbar et al., 2004). Therefore, as might be expected, the variables which were found to affect the solutions for improving pedestrian safety proposed by the elderly in this study are variables which are known to affect the injury severity of elderly pedestrians.
On the other hand, the least significant variable in conditioning the judgment on the solutions for improving pedestrian safety is the one related to the driving license. Hearing problems, even if they condition the perception of urban pedestrian paths, are less significant than mobility and vision problems. Basically, in identifying the solutions for improving pedestrian safety, elderly pedestrians are mainly conditioned by the difficulty of correctly seeing the paths themselves. The physical disability that most influences participants' answers in this study is vision problems. This confirms the findings of Barnett et al. (2016). The analysis of clusters (
The cluster analysis developed in this study made it possible to investigate the key components that influence elderly pedestrians' perception of pedestrian paths and to identify how these perceptions change for different pedestrian "profiles" based on human factors. In order to meet the needs of pedestrians, it is indeed necessary to have a clear understanding of the wide range of abilities that exists in the population. Like roads, pedestrian paths should be designed to serve all users. In the same way that a roadway is not designed for a particular type of vehicle, the design of a pedestrian path should not be restricted to a single type of pedestrian.

Conclusion
Pedestrian safety policies and guidelines should always (a) recognize pedestrians as legitimate road users and promote this recognition among planners, engineers and professionals who plan and manage the road transport system; (b) set and enforce traffic laws that ensure safety of pedestrians; (c) encourage an inclusive approach in planning new roads and/or retrofitting existing roads; (d) pay attention to the specific needs of the most vulnerable users such as people with disabilities, children and the elderly. This research has identified the factors that influence the identification of the best solutions to increase the safety level of pedestrian paths for elderly people.
The aspects related to human factors considered were the gender, the factors associated with the experience as road users and the factors related to age related problems (mobility, vision and hearing problems).
The results show that the judgment expressed by the elderly on the solutions for improving pedestrian safety is significantly linked to gender, to experience as road users, and to mobility and vision problems which compromise the correct perception of the road environment. All the solutions proposed by the elderly for improving pedestrian safety concern road infrastructure (improvement of pedestrian crossings and of sidewalks, implementation of traffic calming measures, improvement of lighting), except for police supervision. Road infrastructure should take into account the needs of elderly pedestrians, who are less able than others to cope with risky situations. Several aspects of road infrastructure engineering, such as a lamp replacement program to improve night visibility in residential areas, better regulation of traffic lights to give elderly pedestrians enough time to cross, and the improvement of pedestrian crossings and of sidewalks, may need to be considered in order to improve the safety of pedestrian paths for the elderly. It is necessary to create an environment that lowers vehicle speeds in urban areas, for example through the development of traffic calming measures, in order to facilitate the movements of less-able pedestrians.
The results of this study can be used to help traffic engineers, planners, and decision-makers consider the contributing factors in engineering measures for improving the safety of particularly vulnerable users such as elderly pedestrians.