Tram drivers' perceived safety and driving stress evaluation: A stated preference experiment

The tram is a sustainable mode of transport. However, tram tracks are often shared with vulnerable road users (VRUs) such as pedestrians and cyclists. In this mixed environment, accidents between trams and VRUs are very rare but severe at the same time. Previous studies have acknowledged that tram driving is a complex and very demanding task. Yet, subjective notions of traffic safety that are more connected with the behavior of tram drivers have never been quantified. This is important in order to better interpret the challenges that tram drivers face. To do so, a stated preference experiment was designed in which tram drivers in Athens rated their perceived safety and driving stress of different driving scenarios on a 7-point Likert scale. The driving scenarios were presented to the tram drivers using static images. According to the estimated perceived safety model, the alignment type (such as exclusive, semi-exclusive), the existence and the type of pedestrian crossing and the volume of VRUs influence tram drivers' perceived safety. Driving stress was affected mainly by arrival delay and load of standing passengers. Route familiarity also appeared as an important factor that influences driving stress. No statistically significant correlation between perceived safety and driving stress was observed. One explanation for this is that experienced tram drivers believe that they are ready to respond properly in a section that they perceive as unsafe, if they are familiar with it. If there is no familiarity, tram drivers lack confidence and therefore driving stress is increased.


Introduction
Sustainable mobility requires the advancement of public transport, which in many cities in the world includes the use of trams and light rail systems. These systems are often the backbone of the public transport network in cities and regions (Kanacilo and Van Oort, 2008). UITP (2015) reports that approximately 13.6 billion passengers boarded a tram or light rail vehicle in 2014 in the 388 cities where a system is running. Tramway and light rail systems have been on the up since the turn of the millennium: 78 cities have opened new networks since 2000, with the United States and France spearheading a revival period. Most tram systems are running in Europe (206 cities) and Eurasia (93 cities), Germany and Russia having the most networks. These systems are able to: 1) improve the effectiveness of the transport system, 2) make the city more efficient, 3) boost the economic development, 4) protect the environment and 5) ensure social equity (five E's concept) (van der Bijl et al., 2018). Dutch data indicate that the number of tram accidents with vulnerable road users (VRUs) with severe outcomes per kilometer travelled is twelve times higher in comparison with the number of severe car accidents (SWOV, 2011). In addition, the percentages of tram collisions leading to personal injuries or deaths were 89.6% for pedestrians, and 83.1% for cyclists in Switzerland (Marti et al., 2016). The main problem is that the rail vehicle requires a longer distance in order to brake until standstill and at the same time, its mass is much bigger in comparison with other transport modes. Furthermore, a tram driver is unable to maneuver away in order to escape colliding with an object. Tram driving induces a high level of workload, since the driver should run on time, maintain his/her concentration and predict the behavior of other road users (Naznin et al., 2017). 
Nevertheless, it is not always feasible to provide a complete separation of tramways from the other road users (Marti et al., 2016). A higher percentage of fatal accidents occur in exclusive and semi-exclusive alignments, while in non-exclusive alignments the absolute number of recorded accidents appears much higher (Korve et al., 2001). An increase of 1 km/h in the tram's average speed increases the probability of fatal crashes by 11.8% (Naznin et al., 2016). Higher tram speeds were mainly observed in semi-exclusive sections.
P.G. Tzouras et al., Transportation Research Interdisciplinary Perspectives 7 (2020) 100205
There are different methods to analyze tram safety problems. One is to use a dataset of past tram accidents; yet, it should be acknowledged that tram accidents are very rare and therefore, datasets of more than one tram network are required in order to achieve statistically significant correlations. Another method is to install traffic cameras in order to observe traffic conflicts that occur in the urban space. However, this is not always an easy and quick process. A third method is the examination of perceived safety in different tram sections and driving stress of tram drivers in order to explain tram safety problems. In the literature, few studies have attempted to examine subjective concepts of tram safety. Naweed and Rose (2015) and Naznin et al. (2017) were the first studies that dealt with the behavior of tram drivers and discussed their daily challenges. However, only qualitative results were presented by these studies. Therefore, the objective of the present study is the quantification of subjective notions related to tram drivers' behavior and psychology, such as perceived safety and driving stress, through a stated preference experiment that is designed and conducted for this purpose.
Perceived safety may be influenced by multiple factors; some potential factors are: alignment type, existence of a tram stop, a crossing, or a curve, visibility and traffic conditions. It is assumed that in sections with low perceived safety, tram drivers lower the tram speed in order to feel safer. Driving stress may be increased due to the feeling of insecurity. Yet, driving stress is not only affected by perceived safety; additional factors related to the tram operations, like on-time running, fatigue and load of standing passengers may also affect driving stress (Naweed and Rose, 2015; Naznin et al., 2017). In this study, the perceived safety and driving stress as rated by tram drivers were examined in relation to the previously mentioned factors in different hypothetical driving scenarios in urban places where trams interact with VRUs. The tram network of Athens in Greece was used as a case study. The stated preference experiment was designed based on the characteristics of this network; yet, this methodological tool can easily be adapted to other tram networks of the world. The paper is structured as follows: in Section 2 the relevant literature is reviewed, followed by descriptions of the research methodology in Section 3, the results in Section 4, and finally the discussion and conclusions in Section 5.

Literature review
Most of the time, tram tracks are installed on urban streets that have already been configured. The available space is therefore limited and the design of the tram line has to be adjusted to the characteristics and the functionality of each street. This is why many different designs of tram lines have been developed in the past. The first effort for the classification of the different designs of cross sections of tram lines was made in the Transit Cooperative Research Program (TCRP) Report 17 (Korve et al., 1996, 2001). The authors created three classes of designs, namely: the exclusive (type a), the semi-exclusive (type b) and the non-exclusive alignment (type c). In the first type, the tram line has a full-grade separation from both motor vehicle and pedestrian facilities. In type b, the separation still exists, although in some locations of the line there are grade crossings, where the tram intersects with motor vehicles, bicycles and pedestrians. Mixed traffic operations occur in non-exclusive alignments. In some cases, the tram lane is shared with buses or other public transport modes and in some other cases, it has been aligned into pedestrianized zones, like city squares or shopping centers. In Melbourne, Australia, a different classification has been presented by Diemer et al. (2018). According to this last study, the tram alignment can be classified into five categories based on the level of separation, namely: no separation (i.e. mixed traffic operation), part-time separation (i.e. separation during peak hours only), shared separation (tram line shared with pedestrians, cyclists and emergency vehicles), visible separation (i.e. separation with painted lines) and physical separation (i.e. exclusive right-of-ways). There are also different designs of pedestrian crossings and tram stops.
According to the TCRP Report 137 (Cleghorn, 2009), audible warning systems, flashing lights and automatic (or manual) gates can be installed at a crossing in order to enhance pedestrian safety. The offset pedestrian crossing (or Z-pedestrian crossing) is a modern design that aims to increase pedestrian awareness; pedestrian movements are channelized by installing barriers or fences. A tram stop can be on the curb (i.e. curb-side stop), in the middle of the street (i.e. super stop), or near a traffic lane (i.e. safety zone). Yet, in some tram networks (e.g. Athens, the Hague, Amsterdam), super stops are not only observed in the middle of the street but also near the sidewalks. Tram platforms and additional equipment (e.g. ticket machines, seats, etc.) exist only in the super-stop design; pedestrians have access to the platforms by two protected crossings, if the stop is in the middle of the street (Currie and Smith, 2006; Currie and Reynolds, 2010; Currie et al., 2011).
In this very complex road environment, the tram drivers have to keep everyone safe by controlling only the longitudinal movement and not the lateral one, as car drivers can (Naznin et al., 2017). As mentioned, the studies of Naweed and Rose (2015) and Naznin et al. (2017) attempted to discuss the challenges of tram drivers by conducting interviews with more than 10 and 20 participants, respectively. One of the main challenges is that tram drivers should predict the behavior and the movement of the other road users in order to increase/decrease speed and avoid dangerous situations. Emergency braking is an option provided by the system in order to avoid a collision. Yet, it can result in falls of standing passengers inside the cabin; therefore, many drivers tend to avoid this option. Furthermore, on-time running is a second and very important challenge. Most of the time, tram drivers are evaluated on their on-time running performance by the company managers. The pressure for the minimization of delays negatively influences the performance of the driver and consequently, safety. Lastly, experience is very significant when it comes to dealing with high driving workload that can lead to fatigue and downgraded driving performance. On the other hand, pedestrians are completely unaware of the potential risks when they are interacting with trams. Castanier et al. (2012) conducted a questionnaire study to identify the perceptions of pedestrians, cyclists and motorists regarding the probability of a crash with a tram. They found a low perceived crash risk in all age groups. Most of the respondents thought that the probability of being involved in an accident with a tram is lower for themselves and higher for the others (i.e. comparative optimism).
Surveys regarding the perceptions of tram drivers have not been conducted yet. The studies of Wang et al. (2002) and Hill and Boyle (2007) attempted to examine perceived safety and driving stress of car drivers, respectively, by conducting stated preference experiments. The first study focused on roundabouts and the main explanatory variables were: circle radius, number of circular lanes, visibility, traffic volume level, car speed and presence of pedestrians at crossings. An image for each driving scenario (i.e. combination of different variable levels) was shown to the respondents. They rated the perceived safety of each driving situation on a 5-point Likert scale. An ordinal regression model was estimated for the perceived safety of roundabouts. In the second study, the respondents rated the driving stress on a 7-point Likert scale. Eighteen different hypothetical driving scenarios were presented to the respondents. Some examples of these hypothetical driving situations are: driving on an icy road, driving in heavy rain, driving behind a vehicle that is moving slower or braking, making a left turn, merging into heavy traffic, and night driving (Hill and Boyle, 2007). They estimated driving stress using a proportional odds method. This method has also been adopted in the present study to estimate models of perceived safety and driving stress.
In conclusion, the previously mentioned studies were used as a source of inspiration in order to design a new stated preference experiment adapted to the tram drivers' challenges. The main research question of this study is: "Which factors have a statistically significant impact on the perceived safety and driving stress of tram drivers?" Another question the study aims to answer is: "Is there any correlation between perceived safety and driving stress?"

Research methodology
Subjective notions, like perceived safety and driving stress, cannot be directly measured. Therefore, stated preference experiments can be applied for the quantification of these notions. A stated preference experiment was designed in order to collect tram drivers' perceived safety and driving stress ratings. According to Kroes and Sheldon (1988), stated preference methods refer to a family of techniques, that utilize the preferences of respondents regarding a set of transport options to estimate utility functions. The researcher is responsible for constructing a set of different transport scenarios at the beginning of this process. Some of these developed scenarios may not exist in reality (Kroes and Sheldon, 1988).
In the next sub-section (3.1), the main characteristics of the study network (i.e. the tram network of Athens and the study participants), where the stated preference experiment was conducted, are given. This is followed by Sections 3.2 to 3.5, which describe the choices made at each step of the stated preference experiment design in more detail. Specifically, the first step regards the identification of the set of explanatory (or independent) variables and the selection of the measurement unit of each of the variables (Section 3.2). The selection of the number and the magnitude of the attribute values is accomplished in the second step (Section 3.2). In the third step, the mathematical form of the utility function is specified (Section 3.3). The fourth step relates to the design of the survey (Section 3.4). There are several methods of designing a survey; the most important ones are the full factorial and the fractional factorial design. The first type of survey design contains all the possible combinations of attribute levels. Fractional factorial design is able to reduce selectively the size of the experiment (Gunst and Mason, 2009) and at the same time, it ensures zero correlation among the independent variables (Hensher, 1994). The next step is to translate all the different profiles (i.e. combinations of attribute levels) into a set of questions that are contained in a survey form (Section 3.4). The sixth step (Section 3.5) is the selection of the appropriate estimation method based on the type of obtained data (i.e. rank-ordered data, rating data and choice data) (Hensher, 1994). Finally, Section 3.6 describes the experimental procedure, including the pilot study.

Study network
Compared to other European tram networks, the network of Athens is quite new and small. The length of the tram network of Athens is 30.90 km. The tram operations in Athens started in March 2004. The transport operator of this network is STASY. In the public transport system of Athens, the tram network has a complementary role compared to the metro network. It consists of 50 tram stops and 3 tram lines that connect the city center with the southern districts of Athens metropolitan area. There are five metro-tram interchange stations; two of them are located in Piraeus, which is the port of Athens. In the city center of Piraeus, a new tram section was fully constructed in November 2018; the tram operations are expected to start in January 2021. It is a loop, which starts from the existing terminal (also a metro-tram interchange station) called Neo Faliro and ends at the same point. The length of the new section is approximately 5 km and consists of 12 tram stops. Fig. 1 shows the tram network of Athens.
In the majority of the urban streets (14.88 out of 30.90 km, i.e. 48.15%), the tram track is semi-exclusive and is located in the middle of the street. There are cases in Athens (mainly near the beach), where the semi-exclusive alignment is not in the middle of the street but near the sidewalk. By integrating the new section located in Piraeus in the length calculations, the share of mixed traffic alignments is 18.25% (5.64 km). In addition, the tram track is fully shared with pedestrians in 2.12 km (6.86%). As can be seen in Fig. 1, the new tram section consists only of non-exclusive alignments in which the tram track is either shared with pedestrians (49.53%) or with motorized traffic (50.46%).
The previously mentioned facts are some of the key reasons why the tram network of Athens was included in the analysis. Tram drivers of Athens are very experienced in driving mainly in semi-exclusive alignments. The new section, as it was designed, will bring up new challenges and difficulties, which have not been faced in the past. Hence, additional parameters, like route familiarity, which may affect perceived safety and driving stress, can be examined in Athens.

Variables selection and definition of variable levels
Perceived safety and driving stress of tram drivers were selected as dependent variables. Therefore, in the survey, the respondents (i.e. the tram drivers) answered two questions, i.e. 1) how safe would you feel and 2) how stressed would you feel, while you are driving in the presented sections/conditions. The tram drivers responded on two rating scales: perceived safety from 1 (very unsafe) to 7 (very safe), and driving stress from 1 (not stressful at all) to 7 (very stressful). According to Joshi et al. (2015), a 7-point Likert scale provides enough options that are closer to the original view of the respondent and reduces the role of ambiguity in the responses compared to a 5-point scale. As shown in Table 1, an inverse relationship between these two Likert scales could be assumed. For example, a tram section that is rated as very unsafe (1) is likely to be evaluated as very stressful (7) at the same time.
For the selection of the independent variables, the challenges of tram drivers, as described analytically by Naweed and Rose (2015) and Naznin et al. (2017), were considered. The alignment type, the existence and type of a pedestrian crossing and the existence of a tram stop are variables related to the design of each tram section. It was assumed that these variables affect the perceived safety. The variable levels were selected based on the design characteristics and the conditions that occur in the selected tram network. Taking into account the basic classification developed by Korve et al. (1996, 2001), the different segments of the tram network of Athens were classified into 4 types of tram alignments, namely: tramways shared with pedestrians, mixed traffic operations, semi-exclusive alignment near the sidewalk and semi-exclusive alignment in the middle of the street. Furthermore, there are locations with unprotected crossings (i.e. without traffic lights) and locations with protected crossings. The latter appear in junctions, where both the traffic flows and tram movements are controlled by traffic lights. In the network of Athens, only one design of stations (i.e. super stop) appears. All the previously mentioned variables are considered as categorical variables.
The volume (or number) of VRUs in the road environment is a continuous variable that was expected to influence tram drivers' perceived safety. It was decided that this variable would be described by three levels in the questionnaire form, namely: level A: low volume (≤10 VRUs, i.e. almost without pedestrians), level B: medium volume (11-20 VRUs) and level C: high volume (>20 VRUs, i.e. very crowded case). These quantitative thresholds (also shown in Table 2) were selected after collecting images from many different sections of the tram network at peak and non-peak hours. The main goal was to classify the collected images into different volume level groups, which can be distinguished easily by an average respondent.
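The thresholds above can be written as a small classification rule. The following is a minimal sketch only; the function name `vru_volume_level` is a hypothetical helper introduced for illustration and is not part of the original study.

```python
def vru_volume_level(n_vrus: int) -> str:
    """Map the number of VRUs visible in an image to a survey volume level.

    Thresholds follow the text: A (<=10), B (11-20), C (>20).
    The function name is illustrative, not taken from the study.
    """
    if n_vrus < 0:
        raise ValueError("the number of VRUs cannot be negative")
    if n_vrus <= 10:
        return "A"  # low volume, almost without pedestrians
    if n_vrus <= 20:
        return "B"  # medium volume
    return "C"      # high volume, very crowded case


# Example: classify a few hypothetical image counts
levels = [vru_volume_level(n) for n in (3, 11, 20, 35)]
```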
The arrival delay and the load of standing passengers may also affect the driving stress. Fatigue could also be considered as a potential independent variable, according to the study of Naznin et al. (2017). However, it could not be examined in this study. The distribution of arrival delays differs among tram companies. In the beginning, the values of 2.5-, 5- and 10-min delay were selected for the low, medium and high delay level, respectively. These values were reconsidered after conducting a pilot study. The final selected values were 5, 15 and 25 min, as presented in Table 2. The load of standing passengers is expressed as proportions of tram standing capacity. At the lowest level, there are no standing passengers and at the highest level, the number of standing passengers is equal to the capacity.
Gender, age and driving experience are independent variables related to personal characteristics and may influence both perceived safety and driving stress. Route familiarity is a dummy variable that was examined in tram networks in which new sections had been added recently. The tram network of Athens is one of them; in 2019, only the trainers of the drivers had driven in a newly added section.

Model formulation: perceived safety and driving stress
In the perceived safety model, the relationship of perceived safety with the alignment type, the existence and type of pedestrian crossing, the existence of a tram stop and the number of vulnerable road users in the road environment was examined (see Eq. (1)). Since the variables regarding alignment, existence and type of pedestrian crossing and existence of tram stop are categorical, dummy coding was utilized in order to describe the nonlinearities that exist between categories. For example, the contribution of a tramway shared with pedestrians to perceived safety may differ significantly compared to the contribution of a semi-exclusive alignment. There are two ways to formulate the utility function of driving stress. The first one is by introducing an additional independent variable, which is the perceived safety, as it can be estimated by the perceived safety model (see Eq. (2)). A different approach is to import all the independent variables of perceived safety into the model of driving stress (see Eq. (3)). In the first approach, the contribution of perceived safety to the stress felt by the drivers can be computed, while the second approach provides more evidence regarding the impact of the characteristics of the road environment on driving stress.
where:
stress: driving stress
time: arrival delay expressed in minutes
load: load of standing passengers expressed as a percentage of the tram standing capacity
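As a hedged sketch only, a linear-in-parameters form consistent with the verbal description of Eqs. (1)-(3) could look as follows. All coefficient symbols (β, γ, δ), the index sets and the names psafe, align, cross, stop and vru are illustrative assumptions and not the authors' notation; only stress, time and load come from the definitions above. The block assumes the amsmath package.

```latex
\begin{align}
\text{psafe} &= \sum_{a} \beta_{a}\,\text{align}_{a}
              + \sum_{c} \beta_{c}\,\text{cross}_{c}
              + \beta_{s}\,\text{stop}
              + \beta_{v}\,\text{vru} \tag{1} \\
\text{stress} &= \gamma_{p}\,\text{psafe}
               + \gamma_{t}\,\text{time}
               + \gamma_{l}\,\text{load} \tag{2} \\
\text{stress} &= \sum_{a} \delta_{a}\,\text{align}_{a}
               + \sum_{c} \delta_{c}\,\text{cross}_{c}
               + \delta_{s}\,\text{stop}
               + \delta_{v}\,\text{vru}
               + \delta_{t}\,\text{time}
               + \delta_{l}\,\text{load} \tag{3}
\end{align}
```

Here the sums over a and c run over the dummy-coded alignment and crossing categories (excluding a reference category), which is how dummy coding lets each category contribute its own coefficient.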

Survey design
According to the model given in Eq. (3), the total number of independent variables is 6, excluding the extra parameters from the dummy coding scheme. The alignment type variable has 4 levels, the pedestrian crossing variable has 3 levels, the variable related to station existence has 2 levels and the variables of volume of VRUs, arrival delay and load of standing passengers have 3 levels each (see Table 2). Therefore, the total number of combinations (scenarios) would be 4 * 3 * 2 * 3 * 3 * 3 = 648, if it had been decided to develop a full factorial design. A fractional factorial design was therefore chosen in order to design a survey with a reasonable completion time. This type of design is based on an orthogonal table, which ensures zero correlation between the independent variables; yet, there are correlations between the interaction effects. By using a fractional factorial design, the number of combinations (i.e. driving scenarios) could be reduced to 36. The 36 scenarios were divided into 3 blocks (i.e. 12 scenarios in each) in order to be able to create a 10-min survey form, as the tram company required. Each participant (tram driver) completed one block of 12 scenarios.
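The size of a full factorial design follows directly from the level counts listed in the text (4, 3, 2, 3, 3 and 3). The sketch below enumerates every combination to make the count explicit; the dictionary keys are illustrative labels, not the variable names used in the study, and the code does not construct the orthogonal fractional design itself.

```python
from itertools import product

# Number of levels per independent variable, as listed in the text.
# The keys are illustrative labels only.
levels = {
    "alignment": 4,
    "crossing": 3,
    "stop": 2,
    "vru_volume": 3,
    "arrival_delay": 3,
    "standing_load": 3,
}


def full_factorial(level_counts):
    """Enumerate every combination of level indices (one tuple per scenario)."""
    return list(product(*(range(n) for n in level_counts.values())))


scenarios = full_factorial(levels)
n_full = len(scenarios)  # 4 * 3 * 2 * 3 * 3 * 3 = 648

# The fractional design reported in the text keeps 36 scenarios in 3 blocks.
n_fractional, n_blocks = 36, 3
scenarios_per_block = n_fractional // n_blocks  # 12
```

Comparing `n_full` with the 36 retained scenarios shows why a full factorial survey would have been impractical for a 10-min form.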
The survey form was uploaded on the internet using the SurveyMonkey platform and the tram drivers could complete it by using either a desktop, laptop or smartphone/tablet. On the first page, three notifications were displayed to the participant before starting the survey; the first one urges the driver to focus only on the VRUs presented in the pictures, the second one informs them that in all cases, the speed of the tram vehicle is lower than or equal to the speed limit, and the last one asks them to assume that they are driving in the morning and with clear weather conditions (no rain or fog). In Fig. 2, a single page from the questionnaire form translated into English is given (the original survey was in Greek). At the top of the page, information about the scenario is provided. The respondent was able to click on the link to see the exact location of each scenario on an online Google map. Therefore, the respondents knew the location of the image well before rating the perceived safety. Next, the picture of the scenario was presented to the tram driver accompanied by a text underneath it. The variable values were also described there. In the majority of scenarios, the text confirmed the information presented in the pictures. One exception occurred when the scenario had a protected crossing. Since in reality the number of protected tram crossings is limited in Athens, tram drivers were asked to assume that the crossing that appeared in the image is now protected. Additional pieces of information related to the arrival delay and the load of standing passengers were provided in the question that followed. The phrase "if, in the previous conditions, you take into account that" was used, so that drivers would be prompted to consider the information from the previous image (i.e. first question) before rating the stress level (see Fig. 2).
Route familiarity as a variable affecting the perceived safety and driving stress was not considered in the design of this stated preference experiment. In the "unfamiliar" section of the tram network of Athens (i.e. the new section located in Piraeus), there are no semi-exclusive alignments, while in the "familiar" part of the network, there are few cases where the infrastructure is fully shared with pedestrians. Therefore, it was not possible to find images from the network with all the combinations of route familiarity and alignment type variables. In order to investigate the influence of familiarity on perceived safety, we asked the tram drivers to rate the perceived safety in two cases (i.e. "doubled scenarios"), in which the same driving conditions occur and only route familiarity varies. There were two "doubled scenarios" per block. In the Results section, the differences between these two consecutive ratings are discussed.
The majority of the images were photographs taken in the field with a smartphone camera. Before collecting the photographs, the potential location of each driving scenario was noted on an online map. The photographs of scenarios in which the volumes of pedestrians are low were taken during the non-peak hours, i.e. at 5:00-7:00 in the morning. Athenians prefer to go shopping in the time period between 11:00 and 14:00; thus, at this time of the day, the flow of pedestrians at shopping centers, like in Piraeus and Nea Smirni, is usually quite high. For the stations that are located near the beach, the volume of passengers and pedestrians increases during the summer weekends, when people choose to go swimming or for a coffee by the sea. Lastly, in Neos Kosmos, there is a local street market near the tram track every Saturday.

Models specification: perceived safety and driving stress
A 7-point Likert scale was utilized for the evaluation of perceived safety and driving stress; by definition, the Likert scale is an ordinal scale. Ordinal scales utilize numbers to indicate a rank of a single attribute (Scott Long, 2015), but the ordinal data do not provide metric information (Liddell and Kruschke, 2018). Although the set of categories on the ordinal scale is clear, the distances between the categories are not known. For example, the real numerical distance between a very unsafe (1) and a neutral (4) section may be smaller than the distance between a neutral (4) and a very safe (7) section. For ordinal scales, the most commonly used modelling methods are the ordered probit, first developed by McKelvey and Zavoina (1975), and the ordered logit (or proportional odds method), first developed by McCullagh (1980). In this study, ordinal logistic regression (i.e. ordered logit) was preferred for the estimation of statistical models using the ratings given by the tram drivers. The general form of an ordinal logistic model including both random and fixed beta parameters is presented in Eqs. (4) and (5).
where:
y_{i,t}: response (i.e. observation) t of individual i
y*_{i,t}: latent dependent variable
β_{1,i}, β_{2,i}, ..., β_{K,i}: set of random beta parameters; their values differ among individuals
B_1, B_2, ..., B_L: set of fixed beta parameters; their values are the same among individuals
x_{1,i,t}, x_{2,i,t}, ..., x_{K,i,t}: independent variable values (random effects)
X_{1,i,t}, X_{2,i,t}, ..., X_{L,i,t}: independent variable values (fixed effects)
k_1, k_2, ..., k_J: set of thresholds
ε_{i,t}: error term
In ordinal models, the cumulative probabilities for each of the previously presented intervals can be computed by Eqs. (6)-(8).
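To illustrate how thresholds turn a latent value into rating probabilities, the following sketch computes the category probabilities of an ordered logit model under the common convention P(y ≤ j) = F(k_j − V), where F is the logistic CDF and V is the systematic part of the latent variable. The sign convention, the function names and the numerical thresholds are assumptions made for illustration; they are not estimates from the study.

```python
import math


def logistic_cdf(z: float) -> float:
    """Numerically stable logistic CDF."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ordered_logit_probs(v: float, thresholds: list) -> list:
    """Return P(y = 1), ..., P(y = J + 1) for J increasing thresholds.

    Assumes the convention P(y <= j) = F(k_j - v); v is the systematic
    utility (sum of beta * x terms) for one observation.
    """
    if any(b <= a for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    cumulative = [logistic_cdf(k - v) for k in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # probability mass between two thresholds
        previous = c
    return probs


# Hypothetical thresholds for a 7-point scale (6 cut points), illustration only
example_thresholds = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
example_probs = ordered_logit_probs(0.0, example_thresholds)
```

Raising `v` shifts probability mass towards the higher ratings while the thresholds stay fixed, which is the behaviour the single set of betas in a proportional odds model implies.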
The proportional odds assumption is one of the basic properties of an ordered logit model. According to this assumption, the odds ratio remains constant for all the different intervals; therefore, there is only one set of betas and, as a result, the final estimated model is linear. The interpretation of (linear) proportional odds models is much simpler compared to other (non-linear) ordinal models (McCullagh, 1980). For a given dataset, the validity of the proportional odds assumption can be tested by performing a χ² test, comparing a model using the proportional odds assumption (null hypothesis) with one not using it. In this study, the models without the proportional odds assumption represent the perceived safety and driving stress observations better only at low confidence levels of around 71% and 82%, respectively; the assumption is therefore not rejected at conventional confidence levels (e.g. 95%).
Each tram driver rated perceived safety 14 times (i.e. 10 single scenarios + 2 doubled scenarios, each of which was rated twice) and driving stress 12 times. These observations are not independent from each other; the data therefore form a panel dataset. Consequently, random beta parameters need to be introduced in the models in order to describe the heterogeneity among the individuals. For the estimation of the fixed and random beta parameters of perceived safety and driving stress, the Simulated Maximum Likelihood (SML) method was implemented. The mean and the standard deviation of each random beta parameter are unknown parameters and were estimated in the computation procedure. The models were estimated using R software. The joint probability function for the individual i can be estimated by Eq. (9). The maximization of this function can be accomplished through Monte Carlo simulation, in which the integral is approximated by averaging over draws. In this study, Halton draws were selected, since they cover the integration space more evenly than pseudo-random draws.
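The idea behind Halton draws can be sketched in a few lines. The code below generates a one-dimensional Halton sequence with the radical inverse function and, assuming normally distributed random parameters (an assumption made here for illustration), maps the uniform draws to standard normal draws. It is not the estimation code of the study, which was run in R; the function names are hypothetical.

```python
from statistics import NormalDist


def radical_inverse(index: int, base: int) -> float:
    """Reflect the base-`base` digits of a positive integer about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += fraction * (index % base)
        index //= base
        fraction /= base
    return result


def halton_draws(n: int, base: int = 2, skip: int = 0) -> list:
    """First n Halton points in (0, 1) for a prime base, after skipping `skip` points."""
    return [radical_inverse(i, base) for i in range(1 + skip, n + 1 + skip)]


def normal_halton_draws(n: int, base: int = 2, skip: int = 0) -> list:
    """Map uniform Halton points to standard normal draws via the inverse CDF."""
    inverse_cdf = NormalDist().inv_cdf
    return [inverse_cdf(u) for u in halton_draws(n, base, skip)]


uniform_points = halton_draws(4, base=2)    # 0.5, 0.25, 0.75, 0.125
normal_points = normal_halton_draws(3, base=2)
```

In a simulated likelihood, a draw z of this kind would be turned into an individual-specific parameter as mean + standard deviation * z, and the choice probabilities averaged over the draws; different random parameters typically use different prime bases.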

Participants and procedure
In total, 118 tram drivers are employed to operate the current study network. There are also 4 trainers, who are responsible for training and evaluating the tram drivers. Hence, the maximum size of the sample is 122 respondents. Most of the tram drivers have worked at the company since the beginning of its operations in 2004; therefore, they are very experienced. Their mean age is 42 years and the crew includes only 9 female drivers.
The trainers of the tram drivers were first asked to fill in the survey form as part of a pilot study, which was conducted before the main study. The trainers evaluated the quality of the survey form and provided useful recommendations. Some of their main recommendations were related to the chosen arrival delay levels and the poor connection between the perceived safety and driving stress questions. Many of them did not consider the information provided by the picture in the evaluation of driving stress. Therefore, the phrase "if you also know that", which was placed between the two questions, was replaced with a new one: "if, in the previous conditions, you take into account that". STASY, the transport operator of Athens, was responsible for sending the online links to the questionnaire form to the tram drivers and for convincing them to fill it in. Before that, STASY randomly divided the tram drivers into 3 groups; each group filled in only one block of questions. To avoid correlations between the independent variables, each block of ratings had to be filled in by the same number of respondents. The online links were open for around 3 weeks, i.e. from 3 to 20 July 2019. The main goal was to collect responses from approximately 40% of the tram drivers of Athens (i.e. 46-48 respondents). In addition, there was a time limit in this procedure, since most of the tram drivers take time off for summer holidays in August.
After the end of the experiment, the ordinal models were estimated from the collected ratings by performing ordinal regressions with random parameters, as described above.

Results
First, descriptive statistics of the dataset are given (Section 4.1). Then we present the estimated models of perceived safety (Section 4.2) and driving stress (Section 4.3), respectively. The heterogeneity among the collected responses is discussed in these sections, too.

Descriptive statistics
The survey was answered by 57 out of the 118 tram drivers who were working at STASY during the time of the study (i.e. a response rate of 48.31%). There were 9 respondents who did not answer all the questions of the survey. These responses were discarded; therefore, the sample size was equal to 48 tram drivers. Also, the number of respondents per block was equal to 16. The same number of observations in each block means no correlations among the independent variables (i.e. orthogonality) in the final dataset and, consequently, more statistically significant beta parameters, as was confirmed later in the model estimation process. The final dataset contained 672 (i.e. 14 ratings * 48 respondents) and 576 (i.e. 12 ratings * 48 respondents) observations of perceived safety and driving stress, respectively.
P.G. Tzouras et al. Transportation Research Interdisciplinary Perspectives 7 (2020) 100205
Regarding the demographic characteristics, 89.58% (i.e. 43 respondents) of the sample were male and 10.42% (i.e. 5 respondents) were female drivers. Also, the majority of the respondents (i.e. 60.42% or 29 respondents) belonged to the age group 41-50 years. 79.17% of the respondents (i.e. 38 respondents) have been in the company since the beginning of its operations in 2004; therefore, they have more than 10 years of experience. There were no drivers with less than 3 years of driving experience. It should be noted that the average experience of the tram drivers of STASY is almost 12 years. Generally speaking, there is low variance in the personal characteristics of the tram drivers of Athens; this made the introduction of relevant beta parameters in the estimated models infeasible. Yet, judging by these numbers and proportions, the sample can be considered representative for the examination of perceived safety and driving stress along the tram lines of Athens.
In the evaluation of perceived safety, the respondents avoided extreme ratings, such as 1: very unsafe or 7: very safe. In 17 out of 36 (47.22%) driving scenarios, the mode value was equal to 4 out of 7. The maximum mean value was equal to 5.18 and was reported for scenario 23, in which the tram track is a semi-exclusive alignment and there are no pedestrians in the road environment. The minimum mean value was equal to 2.31. In the evaluation of driving stress, in the majority of driving scenarios some drivers selected very high values, like 6 or 7, and others very low ones, like 1 or 2. Indeed, in 26 out of 36 scenarios, the range of the given ratings was equal to 6, which is the maximum for a 7-point Likert scale.

Perceived safety model
The estimation results of the statistical model of perceived safety are shown in Table 3. As can be seen, treating all the beta parameters as random was justified, since their standard deviations are statistically significant at a confidence level of 95%. The parameters of the existence of a station and the existence of a semi-exclusive alignment located near the sidewalk were statistically insignificant. All the mean values of the random parameters had negative signs. Hence, a semi-exclusive alignment, located in the middle of the street with protected pedestrian crossings and without pedestrians in the road environment of the tram driver, proved to be the safest case. Conversely, the model reaches lower values of perceived safety as the number of pedestrians in the road environment increases. Regarding the odds ratios, the parameters pcrs2 (i.e. section with an unprotected crossing) and align1 (i.e. tram/pedestrian mall) reported the highest ones, namely 5.392 and 5.518, respectively. This means that the existence of an unprotected crossing, or of an alignment that is fully shared with pedestrians, multiplies the odds of a rating falling in a lower perceived safety category by a larger factor than any of the other parameters.
The heterogeneity in "tastes" among the individuals can be described by plotting the normal distributions of the random parameters. Fig. 3 shows that the parameters related to the existence and the type of a pedestrian crossing have higher heterogeneity. This means that for some tram drivers, the existence of an unprotected crossing was very important in the assessment of safety, while other drivers did not perceive it as equally important. The heterogeneity of the "taste" related to the existence of a mixed traffic operation alignment was larger than that of the "taste" related to the existence of a tramway shared with pedestrians. Lastly, the majority of the drivers agreed that high volumes of pedestrians in the road environment negatively affect perceived safety.
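One way to read such distributions is to ask what share of drivers would have a coefficient of a given sign. The sketch below assumes a normally distributed random parameter and uses hypothetical means and standard deviations; they are not the values of Table 3. It contrasts a parameter on which drivers largely agree with one that shows high heterogeneity.

```python
from statistics import NormalDist


def share_negative(mean, sd):
    """Share of drivers whose coefficient is below zero, assuming beta ~ N(mean, sd)."""
    return NormalDist(mu=mean, sigma=sd).cdf(0.0)


# Hypothetical parameters (assumptions for illustration only).
low_heterogeneity = share_negative(mean=-1.0, sd=0.3)
high_heterogeneity = share_negative(mean=-1.0, sd=1.5)

print(f"small spread: {low_heterogeneity:.1%} of drivers have a negative coefficient")
print(f"large spread: {high_heterogeneity:.1%} of drivers have a negative coefficient")
```

With the same mean, a larger standard deviation leaves a noticeable share of drivers for whom the attribute has little or even the opposite effect, which is what a wide curve in such a plot indicates.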
Due to the low variance that was observed in the personal characteristics of the tram drivers, variables like gender, age and experience were not included in the perceived safety model. Furthermore, the variable of route familiarity was highly correlated with alignment type (R² = 0.56). The contribution of familiarity to perceived safety was tested by showing the tram drivers the "doubled scenarios", i.e. scenarios in which the same driving conditions occur and only familiarity varies. The mean difference between these two consecutive ratings was equal to −0.56. The negative sign indicates that the familiar scenario was considered safer than the scenario without familiarity. The mode difference was equal to 0 and the maximum (not the absolute) difference was equal to +3, which means that some drivers thought that the unfamiliar driving scenario was safer than the familiar one. In general, it can be concluded that the contribution of familiarity to the perceived safety rating was not as high as had been expected at the beginning.

Driving stress model
One of the major problems observed in the estimation of the driving stress model was the statistical insignificance of the perceived safety beta parameter. Contrary to our expectation, perceived safety did not actually correlate with driving stress. To estimate a model that would fit the observations better, the perceived safety parameter was replaced by the route familiarity parameter. Tram drivers of Athens had not driven in Piraeus; thus, their experience in driving on tramways that are fully shared with pedestrians was limited. As was shown before, the existence of a non-exclusive alignment impacts perceived safety.
Table 3. Results of perceived safety model.
The output from the estimation procedure of the statistical model of driving stress is shown in Table 4. Apart from route familiarity, treating all the other beta parameters as random was justified, since their standard deviations are statistically significant at a confidence level of 95%. The mean values of the arrival delay and load of standing passengers parameters had a positive sign, while the coefficient connected with familiarity had a negative one. This means that a familiar section decreases driving stress, whereas higher arrival delays and loads of standing passengers increase it. According to the estimated model (considering the means of the random parameters), if arrival delay is zero, the proportion of standing passengers is less than 50% and the route is familiar to the tram drivers, the predicted driving stress level is 2 out of 7. On the other hand, if arrival delay is higher than 40 min, the number of standing passengers is equal to the tram standing capacity and the route is unfamiliar to the tram drivers, the model predicts a quite high driving stress level, equal to 6 out of 7. The results also made clear that the intervals corresponding to the 4th and 5th levels of driving stress were much wider than the other intervals. The highest odds ratio, equal to 1.461, was observed for the familiarity parameter. The factor of arrival delay had a smaller odds ratio, equal to 0.919. This value means that for a 1-min increase in arrival delay, the odds of being in a lower stress level are multiplied by 0.919. The load of standing passengers does not affect driving stress as much as had been expected. The beta parameter related to the load of standing passengers presents higher heterogeneity than the beta parameter related to arrival delay, as can be seen in Fig. 4. Tram drivers agreed that high delays mean high driving stress.
In addition, the drivers fully agreed on the contribution of familiarity to driving stress. Across the statistical models that were developed in this study, route familiarity was the only non-random parameter.
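The reported odds ratios can be linked back to the signs of the coefficients with a short calculation. The sketch below assumes that each odds ratio equals exp(-beta), i.e. that it refers to the odds of being in a lower stress level; this convention is inferred from the reported signs and is not stated explicitly in the text. The function names are hypothetical.

```python
import math

# Odds ratios reported in the text for the driving stress model.
OR_FAMILIARITY = 1.461
OR_DELAY_PER_MIN = 0.919


def implied_coefficient(odds_ratio):
    """Coefficient beta for which exp(-beta) equals the odds ratio (assumed convention)."""
    return -math.log(odds_ratio)


def odds_multiplier(odds_ratio, units):
    """Factor applied to the odds of a lower stress level after `units` unit increases."""
    return odds_ratio ** units


beta_delay = implied_coefficient(OR_DELAY_PER_MIN)      # positive: delay raises stress
beta_familiarity = implied_coefficient(OR_FAMILIARITY)  # negative: familiarity lowers stress
ten_minute_effect = odds_multiplier(OR_DELAY_PER_MIN, 10)

print(f"implied beta, arrival delay: {beta_delay:.3f}")
print(f"implied beta, familiarity:   {beta_familiarity:.3f}")
print(f"odds multiplier for a 10-min delay: {ten_minute_effect:.2f}")
```

Under this convention, a 10-min increase in arrival delay multiplies the odds of being in a lower stress level by roughly 0.43, while a familiar route multiplies them by 1.461.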

Discussion and conclusions
In this study, a stated preference experiment was designed and conducted in Athens for the examination of subjective notions of tram safety, namely perceived safety and driving stress. Knowledge from previous qualitative studies (Naweed and Rose, 2015; Naznin et al., 2017) was utilized for the selection of the independent variables of the statistical models and for the survey design, since it was not feasible to conduct interviews with the tram drivers. The variable levels were selected based on the design characteristics and the traffic conditions that appear in the tram network of Athens. In order to evaluate and quantify perceived safety and driving stress, 7-point Likert scales were used. The existence of many random parameters confirms the subjective nature of perceived safety and driving stress. Driving stress is a more subjective notion than perceived safety: in the driving stress ratings, large differences among the individuals appeared.
Statistically significant parameters of perceived safety are: alignment type, existence and level of protection of pedestrian crossings, and volume of Vulnerable Road Users (VRUs) in the road environment. Contrary to what was expected, the existence of a tram stop in a section does not decrease perceived safety. According to the views of the tram drivers of Athens, tram/pedestrian malls and mixed traffic alignments are less safe than semi-exclusive alignments located either in the middle of the street or near the sidewalk. This conclusion is in line with some findings of previous studies about objective safety (Korve et al., 2001; Brčić et al., 2013; Naznin et al., 2016). According to these findings, more accidents with severe injuries have been recorded in non-exclusive alignments, while in semi-exclusive tram tracks, the frequency of fatal accidents is higher compared to the non-exclusive ones. Moreover, high heterogeneity appeared in the beta parameters related to the existence and the type of pedestrian crossings; this means that not all tram drivers have the same opinion on whether the existence of an unprotected crossing in a section causes low perceived safety. In contrast, there is a higher level of agreement regarding the impact of the volume of VRUs on perceived safety.
According to the estimated model of driving stress, the factors that influence driving stress are: arrival delay, load of standing passengers and familiarity. As also mentioned in the study of Naznin et al. (2017), on-time running adds extra pressure to tram drivers. Yet, in Athens, very high values of arrival delay, e.g. 25 min, were proposed by the trainers of the tram drivers to be used in the questionnaire survey in order to test the relationship with stress levels. The tram system of Athens cannot be characterized as a very reliable one, and it seems that tram managers do not put too much pressure on tram drivers to perform better. In other tram companies, pressure is likely to be higher and therefore the corresponding odds ratio may be estimated to be higher. The load of standing passengers is a statistically significant factor, though not as important as the other factors of driving stress. No statistically significant correlation between perceived safety and driving stress was found in this study. Experienced tram drivers believe that they are ready to respond properly in a (subjectively) unsafe section, if they are familiar with it. If not, the tram drivers lack confidence and the driving stress increases. Yet, it remains an open question how inexperienced tram drivers respond to an unfamiliar section, like the one located in Piraeus.
Perceived safety, as it can be computed by the estimated model, and especially the differences in the levels of perceived safety from one section to the other, can be utilized as an indicator to describe design inconsistencies that appear in a network; a very relevant concept that has not been discussed by previous studies about the design of tram lines and tram safety. However, this is valid only under the assumption that tram drivers adapt the tram speed based on the speed limit of each section and their feeling of safety. Therefore, it does not contradict the results of previous studies on road design consistency, which used speed deviations as a main indicator (Ng and Sayed, 2004; Camacho-Torregrosa et al., 2013). In the driving stress model, the size of the beta parameter of arrival delay appears to be connected with the culture and the priorities of each tram company. It can be interpreted as indicative of how the tram company balances safety and efficiency considerations. Tram managers could therefore consider this parameter in their schedules by providing additional time margins in places where the interactions between trams and VRUs are many and complex.
At this point, it should be noted that the developed methodological tool can easily be adapted and used in other networks with some modifications in the variable levels. Additional independent variables related to perceived safety can also be added, such as the existence of a curve, slope, etc. In addition, moving images instead of still images can be utilized in order to give the respondents some clues regarding the movements of pedestrians and the complexity of interactions. Designs that allow pedestrians to cross the tram tracks freely (i.e. tram/pedestrian malls) may increase the workload of drivers and consequently their fatigue, which is a potential factor of driving stress. Also, since driving stress is a more subjective notion than perceived safety, empirical data from driving simulators and Photoplethysmogram (PPG) sensors, which record the heart rate and the skin response, should also be collected in order to draw more reliable conclusions about the driving stress of tram drivers. Driving experience is another factor that is likely to have a significant impact on perceived safety. In Athens, this relationship could not be examined due to the low variance in drivers' personal characteristics. Yet, it should be introduced in future experiments with tram drivers and tested by conducting a sensitivity analysis. Lastly, it is recommended that greater weight be given to the views of public transport drivers in future scientific research. It is essential that public transport drivers' expertise and experiences be taken into consideration in the design of tram lines and in the assessment of their safety.