How should external human-machine interfaces behave? Examining the effects of colour, position, message, activation distance, vehicle yielding, and visual distraction among 1,434 participants

External human-machine interfaces (eHMIs) may be useful for communicating the intention of an automated vehicle (AV) to a pedestrian, but it is unclear which eHMI design is most effective. In a crowdsourced experiment, we examined the effects of (1) colour (red, green, cyan), (2) position (roof, bumper, windshield), (3) message (WALK, DON'T WALK, WILL STOP, WON'T STOP, light bar), (4) activation distance (35 or 50 m from the pedestrian), and (5) the presence of visual distraction in the environment, on pedestrians' perceived safety of crossing the road in front of yielding and non-yielding AVs. Participants (N = 1434) had to press a key when they felt safe to cross while watching a random 40 out of 276 videos of an approaching AV with eHMI. Results showed that (1) green and cyan eHMIs led to higher perceived safety of crossing than red eHMIs; no significant difference was found between green and cyan, (2) eHMIs on the bumper and roof were more effective than eHMIs on the windshield, (3) for yielding AVs, perceived safety was higher for WALK compared to WILL STOP, followed by the light bar; for non-yielding AVs, a red bar yielded similar results to red text, (4) for yielding AVs, a red bar caused lower perceived safety when activated early compared to late, whereas green/cyan WALK led to higher perceived safety when activated late compared to early, and (5) distraction had no significant effect. We conclude that people adopt an egocentric perspective, that the windshield is an ineffective position, that the often-recommended colour cyan may have to be avoided, and that eHMI activation distance has intricate effects related to onset saliency.


Introduction
Automated vehicles (AVs) may be driving on public roads in the coming decades (Bazilinskyy et al., 2019b;Faggella, 2020). In such vehicles, the driver could be busy with a non-driving task (SAE levels 3 and 4 automation) or absent (SAE level 5 automation). Interactions between AVs and human road users may be inefficient or uncomfortable because of a lack of eye contact, hand gestures, or postural signals. Although some voices have argued that these forms of bodily communication are rare or unimportant in traffic (Lee et al., 2021;Moore et al., 2019), other research suggests that such communication is common and of value. For example, a Wizard-of-Oz study by Malmsten Lundgren et al. (2017) found that pedestrians' willingness to cross decreased with an inattentive driver. In the same vein, an observational study by Sucha et al. (2017) showed that pedestrians expressed their intention to cross by seeking eye contact (84% of the cases) and waving (4% of the cases), and that drivers showed such behaviours as well (34% and 5% of the cases, respectively).
There are several means to let AVs communicate with other road users. One approach is implicit communication, where the vehicle motion itself is the communication channel (e.g., Schieben et al., 2019). For example, lateral movement in the lane (Fuest et al., 2018;Rossner and Bullinger, 2019;Sripada et al., 2021) or creeping-forward movement (Bazilinskyy et al., 2021b;Oliveira et al., 2019) could prove to be valuable cues for communicating the AV's intentions and for increasing overall AV acceptance. Another approach is to use an external human-machine interface (eHMI), which is a display on the outside of the AV that communicates the state or intention of the AV to other road users, such as pedestrians.
At present, at least 70 eHMI concepts have been proposed by academics and industry (Dey et al., 2020a). These eHMIs appear in numerous shapes and formats, ranging from LED strips and icons to text messages and projections (Bazilinskyy et al., 2019a;Dey et al., 2020a). However, there appears to be a shortage of systematic evaluations of these different concepts. For eHMIs to be deployed on future AVs, several essential questions would have to be answered. One of the questions concerns the colour of the eHMI. Previous research using front brake lights does not provide clear leads. Some have proposed green front brake lights (Petzoldt et al., 2018;Schubert and Kirschbaum, 2018), whereas others have proposed red ones (Antonescu, 2013;Jandron, 1998). Petzoldt et al. (2018) motivated the use of green as follows: "As the brake light on the front, other than the ones on the rear, has no warning function, but rather indicates that a safe crossing in front of the vehicle might be possible, we decided for a green (instead of a red) light" (p. 450). A green front brake light thus assumes an egocentric perspective for the pedestrian (i.e., a message that addresses the pedestrian), as the pedestrian is informed that (s)he can walk. In contrast, a red front brake light assumes an allocentric perspective (i.e., a message that refers to the AV itself), where the pedestrian is informed that the car is not free to continue driving. According to online studies using images of AVs with eHMIs (Bazilinskyy et al., 2019a;Dey et al., 2020a), pedestrians tend to adopt an egocentric perspective. That is, participants are more inclined to cross in front of an AV with a green than a red eHMI. In recent years, cyan has been introduced in several eHMI concepts (Daimler, 2017;Faas et al., 2020;Lee et al., 2019;Mercedes-Benz, 2015).
In a field study with a Wizard-of-Oz vehicle by Faas and Baumann (2019), participants gave higher ratings of discriminability, suitability, and sense of safety for a cyan-light eHMI as compared to a white-light eHMI. The authors argued that cyan is "a novel color in traffic that is not associated with a specific meaning yet" (Faas and Baumann, 2019, p. 1236) and so prevents confusion with existing lights on the vehicle. Similarly, Werner (2019) concluded that cyan is the "colour best suited for the identification of autonomous cars and human-automobile communication" (p. 1) because of its discriminability, peripheral visibility, uniqueness, and attractiveness. Dey et al. (2020b) tested green, cyan, and red eHMIs in an online survey and concluded that participants regarded cyan as "a neutral colour for communicating a yielding intention". A recent image-based crowdsourced study examined the intuitiveness of 729 colours from the entire RGB spectrum for an eHMI in a crossing scenario (Bazilinskyy et al., 2020). The study showed that the green colour is intuitive if the eHMI is intended to indicate 'please cross', but green and red should be avoided if the eHMI is meant to signal 'please do NOT cross'. However, how pedestrians would respond to eHMI colours in dynamic conditions in which an AV approaches them is yet to be examined.
The position of the eHMI is another parameter that needs to be considered. Various positions have been proposed, including the front bumper (Daimler, 2017;Semcon, 2016;Toyota, 2018), the windshield (Ford Media Center, 2017;Nissan, 2015), the roof (Golson, 2016), the side of the vehicle (Ackerman, 2018;Nissan, 2015;Sweeney et al., 2018;Troel-Madec et al., 2019;Urmson et al., 2015;Volvo Cars, 2018), or a projection on the road in front of the AV (Dietrich et al., 2018;Löcken et al., 2019;Mercedes-Benz, 2015;Mitsubishi Electric, 2015;Nguyen et al., 2019;Rinspeed AG, 2017;Sweeney et al., 2018). In Eisma et al. (2020), participants viewed animations of AVs equipped with eHMIs at various positions on the AV. Their results showed that if the AV approached along a straight road, participants were more inclined to cross when eHMIs were placed on the roof, windscreen, or bumper as compared to eHMIs positioned above the wheels or projections on the road. A corresponding eye-tracking analysis showed that the eHMI on the windscreen resulted in the least dispersed eye movements among the tested eHMI designs. This favourable characteristic may be attributable to the fact that pedestrians are inclined to look at the windshield when the approaching car is close by, presumably because this is where the driver's head is located (De Winter et al., in press;Dey et al., 2019b). Further research is needed to validate whether the windshield is indeed the most suitable position for an eHMI.
A third parameter is the message of the eHMI. Various text-based (Deb et al., 2016;Fridman et al., 2019;Hudson et al., 2019;Matthews et al., 2017;Mercedes-Benz, 2015;Nissan, 2015;Vlakveld et al., 2020) and light-based (Benderius et al., 2018;BMW, 2016;Cefkin et al., 2019;De Winter and Happee, 2019;Faas et al., 2020;Ford Media Center, 2017;Hensch et al., 2020;Volvo Cars, 2018;Weber et al., 2019) eHMIs have been proposed in the literature. The meaning of an eHMI that consists of only a coloured lamp or light strip can be unintuitive or unclear if no prior training or instruction is provided (De Clercq et al., 2019;Hensch et al., 2020). The advantage of text-based eHMIs is that they can be understood already at the first encounter. However, text requires foveal visual attention, can be misunderstood by children, and is hard to read from a distance (Clamann et al., 2017). Consistent with the above information about colour, it has been found that participants tend to adopt an egocentric perspective when it comes to text (Ackermann et al., 2019;De Clercq et al., 2019;Eisma et al. 2021). Although several studies have compared text-based eHMIs with other types of eHMIs (Bazilinskyy et al., 2019a;De Clercq et al., 2019), a comprehensive evaluation of text-based eHMIs in different eHMI configurations is still lacking.
In present traffic, road users rely not only on eye contact and gestures but also on implicit communication such as speed and distance to the pedestrian (Beggiato et al., 2017;Dey et al., 2019a). Various studies report that implicit cues are dominant in deciding whether to cross (Clamann et al., 2017;Lee et al., 2019;Li et al., 2018). However, even if implicit cues are dominant, an eHMI could still be of value. De Clercq et al. (2019) and Eisma et al. (2020) showed that activating an eHMI before the AV started to brake increased pedestrians' willingness to cross as compared to an eHMI that activated at the same moment the AV started to brake. Their rationale for using anticipatory eHMI activation was that an AV has knowledge about whether it will stop or not, which could be communicated early to improve traffic safety and efficiency. It is unknown how early an eHMI should be activated. Suppose an eHMI is activated at a far distance from the pedestrian. In that case, the pedestrian may be unable to detect the eHMI onset, and the eHMI may fail to attract the pedestrian's attention. On the other hand, if an eHMI is activated at a close distance from the pedestrian, there may be too little time for the pedestrian to benefit from the information, and implicit cues may already be dominant. Clearly, there is a need for more research into eHMI timing.
Lastly, the majority of studies on eHMIs have been conducted on empty roads without taking visual distractions around the car into consideration. Future automated vehicles will not necessarily be driving on empty roads but also in crowded cities and busy highways. Tapiro et al. (2020) found that high visual clutter (e.g., billboards, garbage bins, other road users) can lead to missing opportunities to cross and high visual attention dispersion by pedestrians. Considering that visual eHMIs will compete with other stimuli for pedestrians' visual attention, it is important to consider the effect of visual distractions from the environment on the effectiveness of eHMIs.
In summary, a substantial amount of research on eHMI design has been done so far. In much of this research, eHMI concepts were proposed but not evaluated in a human-subject study, or the effectiveness of an eHMI design was evaluated in a limited setting (e.g., static images, small number of experimental conditions). It would be worthwhile to perform more research into the effectiveness of eHMI colour, position, message, activation moment (before braking, at the onset of braking) and visual clutter in a dynamic scenario in which the AV is approaching the pedestrian. The uniqueness of the present research is that the effects of all the aforementioned parameters were assessed relative to each other in a combined total of 276 conditions, using a large sample size. The participants were recruited and tested via crowdsourcing, and we examined whether participants felt safe to cross the road in front of the AV by letting them hold and release a key on their keyboard. An eHMI was regarded as effective if it made participants feel safe to cross (i.e., hold the key) in case the AV yielded, and oppositely, if it did not make them feel safe to cross (i.e., release the key) when the AV did not yield.
Table 1 shows the parameters that we used to generate eHMIs presented to the participants on an AV. The AV contained no driver or passenger, corresponding to SAE level 5 automation. The experiment consisted of non-interactive video clips in which an AV with an eHMI approached at 50 km/h on a straight road with 5-m wide lanes, as seen from the viewpoint of a pedestrian standing on the curb. The camera was positioned 2.2 m above the road surface, and 1.9 m above the curb (the curb was 0.3 m high), and was oriented at an angle of 15 deg relative to the road. There was no other moving traffic in the environment.

eHMI concepts and virtual environment
The eHMI was displayed on the front bumper, the windshield, or the roof, and was coloured red, green, or cyan. The eHMI messages did not change after activation. Distances between the pedestrian and the AV of 50 and 35 m were used for activation of the eHMI. Fig. 1 depicts the distance parallel to the road between the pedestrian and the front of the AV versus elapsed video time. For trials in which the AV was yielding, the AV started to brake at a distance of 35 m. In other words, the eHMI activation distances of 50 and 35 m corresponded to 15 m before braking and the onset of braking, respectively. The distraction was presented as a moving billboard and a flickering banner.
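As a rough plausibility check of these distances, the sketch below converts the gap between the two activation distances into a time offset at the constant approach speed of 50 km/h. The deceleration estimate additionally assumes a constant braking rate between 6.4 s and 10.1 s of elapsed video time (the braking interval reported in the Results); the actual braking profile of the videos is not specified, so that part is illustrative only. All names are hypothetical.

```python
# Illustrative kinematics check; not the code used to generate the videos.
APPROACH_SPEED_KMH = 50.0   # approach speed stated in the text
EARLY_ACTIVATION_M = 50.0   # early eHMI activation distance
LATE_ACTIVATION_M = 35.0    # late activation distance, also the braking onset
BRAKE_START_S = 6.4         # elapsed video time at braking onset (from Results)
FULL_STOP_S = 10.1          # elapsed video time at full stop (from Results)


def kmh_to_ms(speed_kmh):
    """Convert km/h to m/s."""
    return speed_kmh / 3.6


def travel_time(distance_m, speed_ms):
    """Time needed to cover a distance at constant speed."""
    return distance_m / speed_ms


def constant_deceleration(speed_ms, duration_s):
    """ASSUMPTION: uniform braking from speed_ms to standstill.

    Returns the deceleration (m/s^2) and the distance covered while braking (m).
    """
    deceleration = speed_ms / duration_s
    braking_distance = 0.5 * speed_ms * duration_s
    return deceleration, braking_distance


speed_ms = kmh_to_ms(APPROACH_SPEED_KMH)
lead_time_s = travel_time(EARLY_ACTIVATION_M - LATE_ACTIVATION_M, speed_ms)
decel_ms2, stopping_distance_m = constant_deceleration(
    speed_ms, FULL_STOP_S - BRAKE_START_S
)

print(f"early activation precedes braking by {lead_time_s:.2f} s")
print(f"assumed constant deceleration: {decel_ms2:.2f} m/s^2")
print(f"distance covered while braking: {stopping_distance_m:.1f} m")
```

The lead time of about 1.1 s agrees with the difference between the two activation times of 5.3 s and 6.4 s given in the figure captions. Under the constant-deceleration assumption, the AV would come to rest roughly 9 m before the pedestrian's position.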
We did not use all combinations of the parameters to avoid presenting a disproportionate number of text messages that are incongruent with the vehicle's yielding behaviour (e.g., an eHMI that displays 'WALK' while the vehicle is not yielding). Such incongruent eHMIs could negatively affect the participants' overall trust in the eHMIs. From the 144 incongruent designs possible, we included 24. In total, 276 videos were presented to the participants: 216 eHMI types (3 colours x 3 positions x 2 yielding behaviours x 2 distraction levels x 3 messages x 2 activation distances) + 36 no eHMI activation (3 colours x 3 positions x 2 yielding behaviours x 2 distraction levels) + 24 incongruent designs (3 colours x 2 yielding behaviours x 2 messages x 2 activation distances, only roof positions and distraction). That is, about 9% (= 24/276) of the videos contained incongruent eHMI designs to keep participants alert. The videos were 1280 × 720 pixels, as these dimensions were deemed to offer a reasonable balance between visual quality and time needed to preload the videos. All videos were 10-s long and had a frame rate of 60 fps. Additional 1-s long segments with black frames were added at the beginning and at the end of each video to make the transitions between videos less abrupt, resulting in a 12-s total length of each video. The videos did not contain sound. Fig. 2 shows the colours and messages used in the study. Fig. 3 provides an example of an eHMI with the message 'WALK' placed on the front bumper, the roof, and the windshield. The orange billboard banner and the green logo 'Meos beer' on the façades of the buildings served as distractions. When activated, sliding images were shown on the billboard, and the banner was flickering on and off. The videos were generated in a modified version of the environment used in De Clercq et al. (2019) and Kooijman et al. (2019).
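The composition of the 276 videos can be checked by enumerating the design. The sketch below is one reading of the description above, not the authors' generation script: the mapping of messages to congruent and incongruent sets per yielding behaviour is inferred from the text, and all names are hypothetical.

```python
from collections import Counter
from itertools import product

COLOURS = ("red", "green", "cyan")
POSITIONS = ("bumper", "windshield", "roof")
YIELDING = (True, False)
DISTRACTION = (True, False)
ACTIVATION_M = (50, 35)

# ASSUMPTION: messages treated as congruent with each yielding behaviour.
CONGRUENT = {
    True: ("WALK", "WILL STOP", "bar"),
    False: ("DON'T WALK", "WON'T STOP", "bar"),
}
# ASSUMPTION: text messages that contradict the yielding behaviour.
INCONGRUENT = {
    True: ("DON'T WALK", "WON'T STOP"),
    False: ("WALK", "WILL STOP"),
}


def build_design():
    """Return one tuple per video: (kind, colour, position, yielding,
    distraction, message, activation distance in m)."""
    videos = []
    for colour, position, yielding, distraction in product(
        COLOURS, POSITIONS, YIELDING, DISTRACTION
    ):
        # Congruent eHMIs: 3 messages x 2 activation distances per cell.
        for message, distance in product(CONGRUENT[yielding], ACTIVATION_M):
            videos.append(
                ("congruent", colour, position, yielding, distraction, message, distance)
            )
        # One video per cell in which the eHMI never activates.
        videos.append(
            ("no_activation", colour, position, yielding, distraction, None, None)
        )
    # Incongruent eHMIs: roof position and distraction present only.
    for colour, yielding, distance in product(COLOURS, YIELDING, ACTIVATION_M):
        for message in INCONGRUENT[yielding]:
            videos.append(
                ("incongruent", colour, "roof", yielding, True, message, distance)
            )
    return videos


design = build_design()
counts = Counter(video[0] for video in design)
print(counts["congruent"], counts["no_activation"], counts["incongruent"], len(design))
```

Under these assumptions the enumeration gives 216 congruent, 36 no-activation, and 24 incongruent videos, for 276 in total.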

Crowdsourcing experiment
Participants completed the experiment through the crowdsourcing service Appen (i.e., https://appen.com; the service was called Figure Eight at the time of the experiment). Participants became aware of this research by logging into one of many channel websites (e.g., https://www.ysense.com), where they would see our experiment in the list of other projects available for completion. They then self-enrolled for the study. We allowed contributors from all countries to participate. It was not permitted to complete the experiment more than once from the same worker ID. A payment of USD 0.22 was offered for the completion of the experiment.
The entire study was presented in English. At the top of the page, contact information of the researchers was provided, and the purpose of the upcoming study was described as "to determine willingness to cross the road in front of a car with an external Human Machine Interface (eHMI)". Participants were informed that they could contact the investigators to ask questions about the study and that they had to be at least 18 years old. Information about anonymity and voluntary participation was provided as well. Next, participants completed a number of questions (e.g., age, gender). The participants were then asked to leave the questionnaire by clicking on a link that opened a webpage with the experiment. The participants were presented with instructions on how to complete the given task: "In this experiment, you will view 40 videos of cars with eHMI concepts. Each time you feel safe to cross the road in front of the car, please do the following: (1) Press key 'F'. (2) Keep pressing the key as long as you feel safe. (3) When you do not feel safe to cross anymore, release the key. After every five videos, you will be able to take a small break. Press 'C' to start the first video". No practice trials were included.
The experiment was created using a modified version of the framework based on jsPsych (i.e., https://www.jspsych.org; De Leeuw, 2015) that was used in a previous study on the measurement of reaction times to auditory, visual, and multimodal stimuli (Bazilinskyy and De Winter, 2018). The videos were uploaded to the S3 cloud storage provided by Amazon. The videos were preloaded to eliminate delays during the experiment.
The participants had to respond to a randomly selected subset of 40 videos presented in 8 batches of 5 videos. After each batch, participants were shown the following text: "You have now completed 5 [10,15,20,25,30,35] videos out of 40. When ready press 'C' to proceed to the next batch." At the end of the experiment, the participants were shown a unique code. They were required to enter the code on the questionnaire as proof that they completed the experiment to receive their remuneration.
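The allocation of videos to a participant can be illustrated as follows. This is a minimal sketch that assumes sampling without replacement from the 276 videos; the randomisation actually implemented in the jsPsych-based framework is not described here, and the function name and parameters are hypothetical.

```python
import random


def assign_videos(n_total=276, n_selected=40, batch_size=5, seed=None):
    """Draw a random subset of video indices and split it into batches.

    ASSUMPTION: videos are drawn without replacement, so a participant
    never sees the same video twice within a session.
    """
    rng = random.Random(seed)
    selected = rng.sample(range(n_total), n_selected)
    return [
        selected[start:start + batch_size]
        for start in range(0, n_selected, batch_size)
    ]


batches = assign_videos(seed=1)
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: videos {batch}")
```

With these defaults each participant receives 8 batches of 5 videos, and any given video has a 40/276 chance of being shown to a given participant.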

Analyses
First, we calculated the 'perceived-safety' percentages for each of the 276 videos, averaged over all participants. The perceived-safety percentage was defined as the percentage of time that participants held the response key between 5 s and 11 s into the video (note that the eHMI switched on at 5.3 s for the 50 m activation, and the video turned black at 11 s). The perceived-safety percentages were grouped into videos in which the AV yielded and videos in which the AV did not yield, and sorted in ascending order. For yielding AVs, a high perceived safety percentage was regarded as effective, and for non-yielding AVs, a low perceived safety percentage was regarded as effective. Next, we assessed the effects of each of the aforementioned parameters (colour, position, message, and activation distance of the eHMI, and presence of visual distraction). Based on a visual inspection and exploratory analyses of the perceived-safety percentages of the 276 videos, we noted no substantial interactions between most of the parameters.
Note. The cyan was a shade of cyan known as aquamarine. The decision to use this shade was based on existing eHMIs (e.g., Daimler, 2017).
We therefore proceeded with investigating the effects of the parameters, or combinations of selected parameters, separately using paired t-tests. For example, to investigate the effect of eHMI colour, we selected all 36 videos per eHMI colour (3 positions x 2 activation distances x 3 messages [congruent messages only] x 2 distraction levels) and averaged the results of these videos per participant, excluding missing values. Note that participants viewed 40 of the 276 videos, so there was an approximately 14% probability that a participant had watched a particular video. We calculated the percentage of participants who pressed the response key as a function of elapsed video time, and performed paired t-tests between parameter levels (e.g., green vs. red) per 1-s interval of the video (responses averaged over that 1 s). The statistical tests were performed for yielding trials and non-yielding trials separately. A significance level of 0.005 was used (Benjamin et al., 2018).
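The two computations described in this section can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the analysis code of the study: key holds are assumed to be logged as non-overlapping (press, release) times in seconds, missing per-participant averages are assumed to be coded as None, and all names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def perceived_safety_percentage(hold_intervals, window_start=5.0, window_end=11.0):
    """Percentage of the analysis window during which the key was held.

    hold_intervals: list of (press_time, release_time) tuples in seconds,
    ASSUMED not to overlap each other.
    """
    held = 0.0
    for press, release in hold_intervals:
        overlap = min(release, window_end) - max(press, window_start)
        if overlap > 0:
            held += overlap
    return 100.0 * held / (window_end - window_start)


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic and degrees of freedom.

    Participants with a missing value (None) in either condition are dropped,
    mirroring the exclusion of missing values described in the text.
    """
    diffs = [
        a - b
        for a, b in zip(scores_a, scores_b)
        if a is not None and b is not None
    ]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / (sd / sqrt(n)), n - 1


# One trial: key held from 2 s to 6 s and again from 9 s to 12 s.
example_percentage = perceived_safety_percentage([(2.0, 6.0), (9.0, 12.0)])

# Made-up per-participant averages for two colours (not data from the study).
green = [80.0, 70.0, None, 90.0]
red = [40.0, 50.0, 30.0, 60.0]
t_value, degrees_of_freedom = paired_t_statistic(green, red)
print(example_percentage, round(t_value, 3), degrees_of_freedom)
```

In the example trial, 3 s of the 6-s window are covered by key holds, giving 50%. The p-value to be compared against the 0.005 threshold would follow from a t distribution with the returned degrees of freedom, which a statistics package can supply.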

Results
A total of 2231 people participated between August 22, 2019 and January 2, 2020. The total cost was USD 586. The survey received a satisfaction rating of 3.7 on a scale from 1 ('very dissatisfied') to 5 ('very satisfied'). Before proceeding to the analysis, we adopted a strict screening procedure by removing participants who did not complete the experiment properly. People were excluded if we suspected that they had cheated the system or if they suffered from technical issues such as delayed video playback. In total, 797 participants were removed, leaving 1434 participants. The participants resided in 91 countries, with the most represented countries being Venezuela (N = 259), USA (N = 106), India (N = 101), and Turkey (N = 84).
The sample consisted of 916 males, 516 females, and 2 participants who selected 'I prefer not to respond' to the gender question. The mean age of the participants was 35.9 years (SD = 11.5). The participants took, on average, 21.0 min to complete the study (SD = 9.3 min, median = 18.1 min). Participants had viewed on average 43.39 videos (SD = 14.41, min = 30, max = 204, median = 40). The deviation from the nominal number of 40 videos per participant can be explained by technical limitations in the software (e.g., some participants may have closed and reopened their browser in between, resulting in a higher number of videos viewed).
Appendices A and B contain perceived-safety percentages of the eHMIs for yielding and non-yielding AVs, respectively. The rows are sorted by the perceived-safety percentages. A hierarchy of the effects of parameters can be distinguished. For yielding AVs, the highest perceived-safety percentages were obtained for green and cyan eHMIs displaying WALK or WILL STOP on the bumper or roof. For non-yielding AVs, the lowest perceived-safety percentages were obtained for eHMIs displaying WON'T STOP or DON'T WALK or a red bar.
Next, we examined the temporal effects of keypress behaviour. For all figures below, the same pattern can be distinguished: After the start of the trial, participants started pressing the response key (between 1 s and 4 s). At 4 s into the trial, 57% of the participants had the response key pressed. Then, participants released the response key (from 5 s to 8 s), which can be explained by the fact that the vehicle got closer, as a result of which it became more and more dangerous to cross. Next, participants pressed the response key again (between 8 s and 11 s) for yielding AVs, which can be explained by the fact that the AV decelerated (from 6.4 s until coming to a full stop at 10.1 s), making it increasingly apparent to participants that they could cross. For non-yielding AVs, participants started pressing the button again between 9 and 10 s, which is when the vehicle had passed, and it was safe to cross again. The following effects of eHMI design parameters were observed:
• Colour: A red eHMI made participants feel less safe to cross than a green or cyan eHMI, for both yielding and non-yielding vehicles. There was no significant difference between the green and cyan eHMIs (Fig. 4).
• Message: For yielding vehicles, the egocentric message WALK led to the highest perceived safety of crossing, followed by the allocentric WILL STOP, which in turn was followed by the bar. Similarly, for non-yielding vehicles, the egocentric DON'T WALK and the allocentric WON'T STOP led to lower perceived crossing safety than the bar (Fig. 5). A disadvantage of the presentation in Fig. 5 is that it shows results for the three colours aggregated. Fig. 6 provides the results for eHMI message, but only for the 'best' colours (green and cyan combined for yielding AVs, red for non-yielding AVs). It can be seen again that, for yielding vehicles, green egocentric messages were more effective (WALK) than green allocentric messages (WILL STOP), followed by a green bar. For non-yielding vehicles, there were no significant differences between the red messages (DON'T WALK, WON'T STOP, red bar), and there was only a small difference with no eHMI activation at all.
• Position: For yielding vehicles, an eHMI on the windshield was the least effective (i.e., participants were the least likely to hold the key), followed by an eHMI on the roof, followed by an eHMI on the bumper. Similarly, for non-yielding vehicles, eHMIs on the windshield position were less effective as the car approached (i.e., participants were more likely to hold the key) than eHMIs positioned on the roof (Fig. 7).
• Activation distance: The effect of activation distance was small for yielding and non-yielding vehicles, with a significant effect for non-yielding vehicles only (Fig. 8). Fig. 9 provides further detail about the effect of activation distance for two selected eHMIs on yielding AVs: green and cyan egocentric messages (WALK) and red bar messages. The red bar caused participants to release the response key for the early activation (50 m), while the green and cyan text WALK caused participants to press the response key for the late activation (35 m).
• Distraction: Visual distraction had no statistically significant effect on participants' perceived safety of crossing the road (Fig. 10).
A final point of attention is that our participants were cross-nationally diverse. To investigate generalizability, we selected one of our primary results (the results regarding colours, as shown in Fig. 6) and computed the mean perceived-safety percentage across participants from the four most highly represented countries. The results, as shown in Fig. 11, reveal strong correlations for the perceived-safety percentage between different countries. Regardless of country, for yielding AVs, egocentric messages were more effective than allocentric messages, which in turn were more effective than the bar messages.

Discussion
In this crowdsourced study, we assessed whether pedestrians felt safe to cross the road in front of an AV equipped with an eHMI, as a function of the colour, position, message, activation distance of the eHMI, and presence of visual distraction in the environment. First of all, our findings confirm that, overall, eHMIs are effective compared to no eHMI. This could be seen in Fig. 6, where the perceived-safety percentage with eHMI was higher than baseline for yielding AVs and lower than baseline for non-yielding AVs. Our findings regarding the independent variables are discussed below.

Colour
The participants felt equally safe to cross the street in front of the AV with an eHMI of cyan or green colour. Cyan is becoming the colour of choice for prototypes of eHMIs in the industry (Bazilinskyy et al., 2019a) and academia (Faas and Baumann, 2019) because it is regarded as neutral, without specific meaning. However, the present results indicate that cyan is interpreted similarly to green. That is, the cyan and green light bars both yielded higher perceived safety of crossing the road than an eHMI that did not activate. Red stimulated non-crossing, which indicates that pedestrians interpret the eHMI colour from an egocentric perspective. However, the effects for non-yielding vehicles were small, and it is noted that recent research has shown that red can be confusing for signalling 'please do not cross' as some participants appear to interpret a red eHMI as a front brake light (Bazilinskyy et al., 2020). Overall, our results indicate that a red eHMI has some potential to indicate to a pedestrian that he/she should not cross the road, and that a green eHMI is an appropriate colour for indicating that a pedestrian may safely cross. The present results replicate earlier studies (Bazilinskyy et al., 2019a;Dey et al., 2020b) but with the difference that we showed that cyan could be confusing. This confusion may have arisen due to our specific choice of aquamarine (sRGB 127, 255, 215) instead of pure cyan (sRGB 0, 255, 255) (the reader may refer to Fig. 4, in which we depict the actual colours used). Dey et al. (2020b) used pure cyan in their online survey and found that cyan was interpreted as neutral (i.e., neither intuitive nor counterintuitive). Having said that, some participants in Dey et al. (2020b, p. 9) were also confused about cyan: "Some observed that cyan was reasonably close to green to be 'passable' as a yielding signal, although not quite as well as green".
In summary, we recommend caution against using cyan in eHMIs, especially if the eHMI is supposed to signal that the pedestrian should not cross.
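To make the difference between the two cyan shades concrete, the following sketch compares their hue angles using the sRGB triplets given above. Hue angle is only a crude proxy and not a model of perceived colour, and the pure green triplet (0, 255, 0) is an assumed reference point rather than the green used in the videos. The function name is hypothetical.

```python
import colorsys


def hue_degrees(red, green, blue):
    """Hue angle in degrees (0-360) of an 8-bit RGB triplet."""
    hue, _saturation, _value = colorsys.rgb_to_hsv(
        red / 255, green / 255, blue / 255
    )
    return 360.0 * hue


AQUAMARINE = (127, 255, 215)   # shade reported for the present study
PURE_CYAN = (0, 255, 255)      # shade reported for Dey et al. (2020b)
PURE_GREEN = (0, 255, 0)       # ASSUMED reference green, for comparison only

for name, rgb in (
    ("aquamarine", AQUAMARINE),
    ("pure cyan", PURE_CYAN),
    ("pure green", PURE_GREEN),
):
    print(f"{name}: {hue_degrees(*rgb):.2f} deg")
```

On this measure, aquamarine (about 161 deg) lies between pure green (120 deg) and pure cyan (180 deg), which is compatible with the suggestion that the shade used here was read as closer to green than pure cyan would be.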

Fig. 4.
Percentage of participants pressing the response key as a function of elapsed video time, distinguishing between yielding behaviour and eHMI colour. The legend shows the means and standard deviations across participants for an elapsed time between 5 s and 11 s. Y = significant difference between colours was observed for yielding vehicles for that second in the video; NY = significant difference between colours was observed for non-yielding vehicles for that second in the video; RG = significant difference between red and green; RC = significant difference between red and cyan.

Fig. 5.
Percentage of participants pressing the response key as a function of elapsed video time, distinguishing between yielding behaviour and eHMI message. The legend shows the means and standard deviations across participants for an elapsed time between 5 s and 11 s. Y = significant difference between messages was observed for yielding vehicles for that second in the video; NY = significant difference between messages was observed for non-yielding vehicles for that second in the video; EA = significant difference between egocentric and allocentric messages, EB = significant difference between egocentric messages and bar messages; AB = significant difference between allocentric messages and bar messages.

Position
The bumper and the roof were found to be superior eHMI positions as compared to the windshield. A likely explanation is that the eHMI on the windshield was shown at an angle, making it less visible. Our findings pertaining to eHMI position are contingent on the scenario. In our study, only one AV approached the pedestrian and the road was otherwise empty. An eHMI positioned at the front of the AV, and at its lower part in particular, might become less visible if multiple vehicles drive behind each other (Troel-Madec et al., 2019). In an eye-tracking study in which simulated AVs could approach from different directions, it was concluded that eHMIs should be visible from multiple sides of the AV (Eisma et al., 2020). Furthermore, findings from immersive virtual-reality experiments (Kaleefathullah et al., in press) and on the road (Cefkin et al., 2019) show that eHMIs tend to be overlooked unless they are very bright. We recommend further investigation into the visibility of omnidirectional eHMIs, such as LED strips surrounding the car (e.g., Nissan, 2015;Volvo Cars, 2018) in comparison to eHMIs displayed on the front of the car.

Fig. 6.
Percentage of participants pressing the response key as a function of elapsed video time, distinguishing between eHMI message, selecting green and cyan eHMIs for yielding vehicles and red eHMIs for non-yielding vehicles. As a reference, the results for conditions in which the eHMI was inactive are shown as well. The legend shows the means and standard deviations across participants for an elapsed time between 5 s and 11 s. Y = significant difference between messages was observed for yielding vehicles for that second in the video; EA = significant difference between egocentric and allocentric messages; EB = significant difference between egocentric messages and bar messages; AB = significant difference between allocentric messages and bar messages.

Fig. 7.
Percentage of participants pressing the response key as a function of elapsed video time, distinguishing between yielding behaviour and eHMI position. The legend shows the means and standard deviations across participants for an elapsed time between 5 s and 11 s. Y = significant difference between positions was observed for yielding vehicles for that second in the video; NY = significant difference between positions was observed for non-yielding vehicles for that second in the video; BW = significant difference between bumper and windshield; WR = significant difference between windshield and roof.

Message
For yielding AVs, text-based eHMIs were found to be more effective than a light bar, which is in line with past research showing that text-based eHMIs yield high clarity ratings (e.g., Bazilinskyy et al., 2019a). Note, however, that text has been criticised because of language barriers (Krampen, 1983) and legibility from a distance (Clamann et al., 2017).
For non-yielding AVs, red textual messages and a red bar led to similar perceived-safety percentages, in line with Bazilinskyy et al. (2019a). This suggests that if one's goal is to prevent pedestrians from crossing, a red bar is sufficient and text messages are not needed. It should be noted, however, that for non-yielding AVs, the differences with respect to no eHMI activation at all were small. Presumably, implicit cues (i.e., closing distance, looming cues) provided sufficient information for the participant to understand that the AV would not stop.

Previous research suggests a difference between text and light-based eHMIs, where text is understood already during the first encounter, whereas light strips/bars require a small amount of practice (De Clercq et al., 2019; Kaleefathullah et al., in press). We recommend further research into the advantages and disadvantages of light-based versus text-based eHMIs and the extent of training required.

Fig. 8.
Percentage of participants pressing the response key as a function of elapsed video time, distinguishing between yielding behaviour and eHMI activation distance. The legend shows the means and standard deviations across participants for an elapsed time between 5 s and 11 s. The activation distance of 50 m corresponds to an elapsed time of 5.3 s; the activation distance of 35 m corresponds to 6.4 s. NY = significant difference between activation distances was observed for non-yielding vehicles for that second in the video.

Fig. 9.
Percentage of participants pressing the response key as a function of elapsed video time in videos depicting a yielding AV, comparing early and late eHMI activation distance, for red bar messages and green/cyan WALK messages. The legend shows the means and standard deviations across participants for an elapsed time between 5 s and 11 s. The activation distance of 50 m corresponds to an elapsed time of 5.3 s; the activation distance of 35 m corresponds to 6.4 s. E = significant difference between activation distances was observed for green/cyan egocentric messages (WALK) for that second in the video; B = significant difference between activation distances was observed for red bar messages for that second in the video.

Activation distance
The effects of eHMI activation distance were minor and nonsignificant (Fig. 8). However, a follow-up analysis revealed an interaction between message and activation distance (Fig. 9). More specifically, a red light bar that activated early (50 m) caused more participants to release the response key than the same red light bar that appeared late (35 m). This effect can be explained by the fact that the colour red signals a hazard and that the vehicle started to decelerate only later, at 35 m. In other words, the red light bar was visible from a far distance, and in the absence of contradictory implicit AV communication (vehicle deceleration), participants released the key. A green/cyan WALK message, however, caused more participants to press the key when presented late than when presented early. A possible explanation is that text is hard to read from a far distance. In other words, the participant may see the text message but cannot interpret it (e.g., whether it says WALK or DON'T WALK). Additionally, the early onset of a text message was hard to notice; it is possible that the late appearance (35 m) was more noticeable and attracted more attention than the early appearance (50 m). In summary, our results regarding activation distance can be explained by the legibility and noticeability of the onset of the eHMI. Our findings suggest that there is potential for a multimodal eHMI, where a light bar ensures that the eHMI is visible and emits a colour signal, and text delivers the actual message.

Distraction
There was no statistically significant effect of distraction from the environment on participants' perceived safety of crossing, which seems consistent with research on the effect of billboards on driver behaviour. Although it is plausible that billboards may cause driver distraction and accidents, the totality of evidence suggests that drivers are well able to ignore billboards and other distractors in most circumstances (Decker et al., 2015; Yannis et al., 2013). We recommend further investigation in environments with a larger amount of distractions, such as a higher density of traffic and a large number of pedestrians around.

Fig. 11.
Scatter plots of mean perceived-safety percentages for participants from different countries, for the conditions shown in Fig. 6. The perceived-safety percentage was defined as the percentage of time that participants held the response key between 5 s and 11 s into the video.
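The perceived-safety percentage defined in the Fig. 11 caption can be illustrated with a minimal Python sketch. The data layout (a list of press and release times per video) and the function name are assumptions made for illustration; they are not taken from the MATLAB analysis code in the supplementary material.

```python
def perceived_safety_percentage(press_intervals, window_start=5.0, window_end=11.0):
    """Percentage of the analysis window during which the key was held.

    press_intervals: list of (press_time, release_time) tuples in seconds,
    assumed to be non-overlapping (hypothetical layout, for illustration only).
    """
    held = 0.0
    for press, release in press_intervals:
        # Clip each key-hold interval to the 5-11 s analysis window.
        overlap = min(release, window_end) - max(press, window_start)
        if overlap > 0:
            held += overlap
    return 100.0 * held / (window_end - window_start)


# Hypothetical participant: holds the key from 0 s to 6.5 s, releases it,
# then presses again at 9 s and keeps holding until the video ends at 12 s.
example = [(0.0, 6.5), (9.0, 12.0)]
print(round(perceived_safety_percentage(example), 1))  # 3.5 s held out of 6 s
```

In this sketch, key presses outside the window do not count, so a participant who holds the key throughout the video scores 100% and one who never presses it scores 0%.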

Limitations
A point of consideration is that our sample was cross-nationally diverse. Previous crowdsourcing research into AVs and eHMIs shows that, although there are differences in mean responses between countries, the differences between experimental conditions replicate within different countries (Bazilinskyy et al., 2019, 2020; Oudshoorn et al., 2021; Sripada et al., 2021). Our results are consistent with this pattern, with strong correlations (r between 0.93 and 0.98) between the means of participants from different countries.
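To illustrate why a high between-country correlation is compatible with different mean response levels, the following Python sketch computes a Pearson correlation between per-condition means of two groups. The numbers are invented for illustration and are not data from this study; the function name is likewise an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented per-condition mean perceived-safety percentages for two countries.
# Country B responds about 10 points higher overall, but the ordering of the
# conditions is nearly the same as in country A.
country_a = [12.0, 30.0, 55.0, 71.0, 80.0]
country_b = [23.0, 41.0, 63.0, 82.0, 89.0]
print(pearson_r(country_a, country_b))
```

Because the correlation coefficient is insensitive to a constant offset, a country whose participants press the key more often overall can still correlate strongly with another country, as long as the conditions are ranked similarly.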
In our study, there were a total of 276 experimental conditions. Instead of performing a complex multi-way ANOVA, we performed simple paired comparisons that were not hypothesised beforehand. For example, since we observed no significant effect of distraction at all (see Fig. 10) and detected no meaningful patterns from the results per condition (see Appendices), we decided to aggregate the results for the distraction and non-distraction conditions in all analyses. Regarding the effect of activation distance, however, we did, after exploring the results, note an interaction with message type (text vs. bar), see Fig. 9. Because our statistically significant effects are in line with the literature, and because we used a conservative alpha value of 0.005 with a large sample size, we expect our results to be replicable despite this study's somewhat exploratory nature.
The study has a number of other limitations, including the use of a computer screen with a limited field of view, no control over the way colours are rendered on the participant's screen, the use of a simple task of pressing a key, and the use of single-car scenarios in which vehicles could either yield or not yield but otherwise behaved identically. It also seems that not all participants were fully engaged: at any given time, about 7.5% of participants were holding the key when the AV was not yielding. It is possible that these participants did not understand the task or held and released the key without paying attention. We expected that these anomalous behaviours would reduce statistical power, but would not introduce a systematic bias in comparisons between conditions, as compared to a situation in which all participants were engaged.
It remains to be investigated whether the present findings with respect to eHMI design would replicate in a more immersive setting with realistic traffic flow or in a Wizard-of-Oz on-road study such as in Cefkin et al. (2019), Baumann (2019), or Rodriguez-Palmeiro et al. (2018). In particular, a concern is that, in real traffic, eHMIs may not be seen at all (e.g., Cefkin et al., 2019), suggesting that training and standardisation are required. Another concern is that text-based eHMIs may require too much visual attention if the traffic scenario becomes more complex. Nonetheless, it can be expected that several of our findings, such as regarding the main effects of message, text, and colour, generalise well to on-road scenarios. For example, findings from a recent study on the effects of eHMIs and blinded windows by Faas et al. (in press) were found to correspond well to findings from an online image-based study (Bazilinskyy et al., 2021a).

Conclusions
In conclusion, our findings point towards the adoption of an egocentric perspective by pedestrians, which is consistent with previous research (Ackermann et al., 2019; Bazilinskyy et al., 2019a; De Clercq et al., 2019; Eisma et al., 2021). Put differently, among the tested conditions, a green eHMI with the text WALK was found to be most effective. Furthermore, eHMIs on the front window should not be used, as this configuration may be hard to perceive. Our selected cyan (aquamarine) yielded results that are similar to green; if an eHMI is intended to be neutral, then resemblance with green should be avoided (see also Bazilinskyy et al., 2020).
Our work offers important insights that may prove valuable for vehicle manufacturers who want to equip their vehicles with eHMIs. While generalisability to the real world remains an important limitation, we have been able to highlight some misconceptions from the literature: a coloured light bar is not effective without further explanation or training, cyan and green give the same results, and the idea of placing an eHMI on the windshield has proved unsuccessful.

Supplementary material
Supplementary material that includes the questionnaire used, videos and their overview with corresponding 'perceived-safety' percentages, anonymous data, and MATLAB code used for analysis may be found at https://doi.org/10.4121/14465715.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.