The effect of peripheral visual feedforward system in enhancing situation awareness and mitigating motion sickness in fully automated driving

This study investigates whether peripheral visual information alleviates motion sickness when occupants engage in non-driving tasks in fully automated driving. A peripheral visual feedforward system (PVFS) was designed to provide information about the upcoming actions of the automated car in the periphery of the occupant's attention. It was hypothesized that information from the PVFS improves the users' situation awareness while preventing motion sickness from developing. The PVFS was also assumed neither to increase mental workload nor to interrupt the performance of the non-driving tasks. The study was conducted on an actual road using a Wizard of Oz technique, deploying an instrumented car that behaved like a real fully automated car. The test rides using the current setup and methodology indicated high consistency in simulating automated driving. Results showed that with the PVFS, situation awareness was enhanced and motion sickness was lessened while mental workload was unchanged. Participants also indicated a high hedonistic user experience with the PVFS. While providing peripheral information showed positive results, further study, such as delivering richer information and active head movement, is possibly needed. © 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY

Introduction
In fully automated driving, human drivers will no longer drive at the operational level but only at the strategic level, following Michon's levels of driving (Michon, 1985). Michon distinguished three levels of driving: operational, tactical, and strategic. The operational level involves control tasks like braking and accelerating. The tactical level requires planning and controlled actions like overtaking another car. The highest level is the strategic level, concerning, for example, the route to be taken and the estimated time of arrival. With a fully automated vehicle (AV), a human driver only decides on the final destination, and the vehicle handles all the driving tasks and decisions. Therefore, human drivers become occupants and have the freedom to conduct their own preferred activities. In a survey by Schoettle and Sivak (2014) of users from China, India, Japan, US, UK, and Australia on what kind of activity one would like to do inside an AV, roughly 50-60% of the respondents imagined themselves doing non-driving tasks (NDT) such as watching television or a movie, socializing with other passengers, working, reading, and sleeping to fill their journey time. A similar finding was also reported by Kyriakidis, Happee, and De Winter (2015). Building complete situation awareness (SA) requires awareness of the surroundings, as mentioned by Endsley (1995). She explained that there are three levels of SA in a dynamic situation: perception, comprehension, and projection. Engaging in the aforementioned NDT will make the AV occupants unaware of their current situation and give them less control regarding the intention of the AV (Diels & Bos, 2016; Diels, 2014). They will have lower SA and, to make matters worse, most if not all of their attention will be channeled to the NDT, leaving the AV occupants unprepared for the forces induced by the horizontal accelerations.
In-vehicle video watching has been shown to induce motion sickness (MS) to the passengers in both a survey study (Schoettle & Sivak, 2009) and an experiment (Isu, Hasegawa, Takeuchi, & Morimoto, 2014). Although video watching produces less MS when compared to reading inside the car (Kato & Kitazaki, 2008), mild symptoms of MS such as feeling queasy and dizziness may tarnish the whole automated driving experience. Humans are known to be prone to MS when exposed to low-frequency horizontal (longitudinal and lateral) accelerations especially within the 0.1-0.5 Hz range (Turner & Griffin, 1999). This could quickly develop in the urban areas where abrupt changes in longitudinal and lateral accelerations are likely to occur because of the geometrical landscape of the road such as roundabouts, junctions, and small-radius corners. An AV undertaking a corner or junction would produce sudden movement to its passengers who might not be aware of the current action of the vehicle. As a result, the incongruity of inputs coming from the passengers' visual, vestibular and somatosensory system would cause sensory conflicts to develop. The sensory conflict theory was first introduced by Reason and Brand in 1975, and it is the most accepted and utilized theory in explaining MS. A mismatch occurs when the sensations and perceptions from the current experience are different from the stored memory which a person has developed in his/her brain over time (Tal, Wiener, & Shupak, 2014). It usually occurs when a person is experiencing real motions (such as riding in a car, plane, or boat) or virtual motions (such as riding a motion simulator or watching a 3D movie). That is the reason why a human driver is less likely to get MS compared to passengers (Rolnick & Lubow, 1991), as the driver has control over the motion produced by the movement of the vehicle.
One way to avoid sensory mismatch is to make the required information available to the passengers of an AV. Such required information is, for example, the immediate intention of the AV that involves variation in the longitudinal and lateral forces. This information can be presented shortly before an important situation is about to occur (such as when a junction is approaching). One modality that can be used to deliver this information inside a moving AV is the visual modality, more specifically a light that is placed within the peripheral view of the user. Löcken et al. (2017) summarized the insights and guidelines regarding peripheral displays, or adaptive ambient displays as they specifically termed them. For the current application in automotive, they suggested the use of peripheral displays to lower the mental workload, improve awareness, and also to display the vehicle's state. Peripheral and ambient light information systems have been previously utilized in simulator and on-road studies. Most of the past work focuses on assisting the human driver in partial automation, for example, as a navigation aid (Matviienko, Löcken, El Ali, Heuten, & Boll, 2016), future traffic situation assistance (Laquai, Chowanetz, & Rigoll, 2011), lane changing decision support (Löcken, Heuten, & Boll, 2015; Löcken, Müller, Heuten, & Boll, 2015), feedback about perception of speed (Meschtscherjakov, Döttlinger, Rödel, & Tscheligi, 2015), and communication between passenger and driver (Trösterer, Wuchse, Döttlinger, Meschtscherjakov, & Tscheligi, 2015). Within the fully automated driving context, in one study, a proximal light display was implemented as a wearable device to increase the SA when the users were not paying attention to the road and were focusing on reading as the NDT (van Veen, Karjanto, & Terken, 2017). They found that although the SA was improved without the need to observe the environment outside the vehicle, the proximal light display distracted the users from their NDT.
The primary objective of this study was to investigate whether peripheral information helps to protect AV occupants from getting MS. A peripheral visual feedforward system (PVFS) was designed to provide information about the upcoming navigational actions of the AV. The navigational information was abstracted into light movement and presented in the periphery of the attention. It was hypothesized that the information gained from the PVFS increases SA regarding the future navigational direction of the AV and also reduces the level of MS experienced by the AV's occupant. In addition, it was hypothesized that the given peripheral light information neither increases the mental workload nor degrades the experience of the primary task of the AV's occupant, in our case watching a video. The study was performed in an instrumented car that behaves like a real AV on a real road. Since this was not a driving simulator study, the consistency of the test rides is analyzed and discussed first. Afterward, SA, MS, and mental workload were quantitatively analyzed. User experience in interacting with the PVFS was also assessed.

Experiment design
A within-subject design was implemented, as suggested by Isu et al. (2014) in dealing with dropouts in an MS-related experiment. The independent variable was the study condition while the dependent variables were MS, SA, and mental workload. In this study, all the participants had to go through two conditions. One condition was without the PVFS and was termed control-condition, and the other condition was with the PVFS and was termed test-condition. The order of conditions was counterbalanced to control carry-over effects. Conditions were executed at least three days apart to make sure that if MS occurred within the first condition, it would not affect the result in the second condition. All the test rides were performed on the Eindhoven University of Technology's campus road where Dutch traffic laws and regulations apply. In our experiment, only the lateral accelerations (y-axis) were manipulated while the longitudinal accelerations (x-axis) were kept to a minimum. The experiments were only performed after regular office hours (after 5:30 pm) and during weekends, so that other traffic would not influence the longitudinal accelerations and decelerations of these test rides.

Mobility lab
An instrumented car named Mobility Lab (ML) was developed and employed as an on-road automated car simulator to provide a fully automated driving experience. The detailed design and validation of ML as an automated vehicle simulator is described elsewhere (Karjanto, Md. Yusof, Terken, Delbressine, Rauterberg, & Hassan, 2018). The method used in operating the ML is known as Wizard of Oz, and our approach was particularly inspired by the work of Baltodano, Sibi, Martelaro, Gowda, and Ju (2015). A television display was placed on a wall partition that separates operators of the ML and the participants. A television display was used as it is projected to be one of the main entertainment systems inside a fully automated vehicle as shown by the recent patent by Ford (Cuddihy & Rao, 2015).
The automated driving test ride was realized based on the setup from the previous studies, in which a setting called defensive automated driving style was implemented (Yusof et al., 2016). The driving speed was set at 30 km h^-1, and the lateral acceleration generated at the turning/cornering was aimed to be about 0.29 g or 2.84 m s^-2. This was based on previous findings that regardless of the type of driver or driving style, most people prefer the AV to be driven in a more defensive driving style (Basu, Yang, Hungerman, Singhal, & Dragan, 2017; Yusof et al., 2016). The windows of the ML were made opaque in order to make sure information regarding upcoming corners and junctions came only from the PVFS. In addition, it was assumed that when the passengers were engaged in the NDT, they would mainly focus on their tasks and become visually unaware of what was happening outside the vehicle.
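As a rough check on these settings, the sketch below converts the target lateral acceleration from g to m s^-2 and estimates the corner radius that would produce it at the set speed. It assumes steady circular cornering (a = v^2 / r) and standard gravity; the resulting radius is an illustrative derived value, not a figure reported in the study, and the function names are hypothetical.

```python
G = 9.80665  # standard gravity in m/s^2 (assumed constant)


def g_to_ms2(accel_g: float) -> float:
    """Convert an acceleration expressed in g to m/s^2."""
    return accel_g * G


def corner_radius_m(speed_kmh: float, lateral_accel_ms2: float) -> float:
    """Radius of a steady circular turn giving the stated lateral acceleration."""
    speed_ms = speed_kmh / 3.6  # km/h to m/s
    return speed_ms ** 2 / lateral_accel_ms2


lateral = g_to_ms2(0.29)                 # target lateral acceleration
radius = corner_radius_m(30.0, lateral)  # illustrative radius at 30 km/h
print(f"{lateral:.2f} m/s^2, radius of about {radius:.1f} m")
```

Under these assumptions, 0.29 g corresponds to the 2.84 m s^-2 stated in the text, and a turn driven at 30 km h^-1 would need a radius in the mid-20 m range to reach it.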

Peripheral visual feedforward system (PVFS)
The PVFS was designed based on the previous guidelines for developing a peripheral information system (Matthews, Dey, Mankoff, Carter, & Rattenbury, 2004; Pousman & Stasko, 2006). The information regarding the navigational intention of the AV was abstracted into light movement (see Fig. 1). The light seen by the passengers was diffused in order to create a divided attention phase in which participants can still do their task but manage to digest the given information at the same time. The PVFS consists of two displays, right and left, and each display consists of 32 LED lights diffused by a customized 3D-printed cover. Diffused light was used rather than direct flashing, as flashing would create unwanted disruptions, degrade the experience of the primary task (i.e., watching video), and create an attention-grabbing effect (Endsley & Jones, 2004). Blue light was selected based on findings from earlier research inside a flight cockpit in a study with pilots, showing that blue can be efficiently discriminated in the periphery (Ancman, 1991). The PVFS was placed on the left and right of the television display with an inward-inclined angle of 140° measured from the front surface of the television display. Hence, the participants do not need to move their head in the direction of the PVFS when watching the video on the television display. The design of the PVFS was chosen based on a pre-test with users and short interviews with six participants. In addition, the PVFS was developed based on the understanding of human peripheral vision. The peripheral area of the human retina is mostly packed with receptors (rods) which are sensitive to illumination and motion but not to colours. Therefore, the PVFS was designed to operate based on the movement of the diffused LEDs. The LEDs moved from the bottom to the top of the PVFS in order to notify the AV's user that the AV was about to turn to the right or left (see Fig. 1).
The PVFS light moved at a speed of 50 cm/s, and eight LEDs were active and moved together at a time. The signal was given three seconds prior to the corner/turning (when lateral force started to be generated) and ended when the AV began to corner/turn. The illumination of the light signal could be clearly seen by the participants, who sat 1.2 m from where the PVFS was located; the viewing angle from the centre of the screen to either the left or right panel was approximately 30°. The PVFS was activated by the experimenter, who was assisted by unique marks placed on the side of the road signalling the distance to the corner.
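The timing described above can be sketched as follows. The distance at which the cue has to start follows directly from the stated speed and lead time. The LED animation part is only an illustration: the LED pitch (`LED_PITCH_CM`) is a hypothetical value, and restarting the sweep from the bottom once it reaches the top is an assumption, since neither is given in the text.

```python
NUM_LEDS = 32             # LEDs per display (from the text)
ACTIVE_LEDS = 8           # LEDs lit together (from the text)
SWEEP_SPEED_CM_S = 50.0   # light movement speed (from the text)
LEAD_TIME_S = 3.0         # cue starts 3 s before the corner (from the text)
LED_PITCH_CM = 1.0        # hypothetical spacing between neighbouring LEDs


def cue_distance_m(speed_kmh: float, lead_time_s: float = LEAD_TIME_S) -> float:
    """Distance before the corner at which the cue must start."""
    return speed_kmh / 3.6 * lead_time_s


def lit_leds(t_s: float, pitch_cm: float = LED_PITCH_CM) -> list:
    """Indices (0 = bottom) of LEDs lit t_s seconds after cue onset.

    The lit window trails below the leading LED; the sweep is assumed
    to restart from the bottom after reaching the top.
    """
    head = int(t_s * SWEEP_SPEED_CM_S / pitch_cm) % NUM_LEDS
    return [i for i in range(head - ACTIVE_LEDS + 1, head + 1) if i >= 0]


print(cue_distance_m(30.0))  # distance of the roadside mark at 30 km/h
print(lit_leds(0.5))         # lit window half a second into the cue
```

At 30 km h^-1, a three-second lead time corresponds to a roadside mark roughly 25 m before the corner.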

Participants
Twenty participants (13 males and 7 females) aged between 18 and 33 years old (Mean = 26.2, SD = 4.8) took part in this study. Stratified sampling was implemented based on the short version of the Motion Sickness Susceptibility Questionnaire (MSSQ) (Golding, 1998, 2006). The MSSQ score is expressed on a 100% scale, on which a larger number indicates higher susceptibility to MS. Within this study, participants with mild and severe susceptibility were selected based on the MSSQ scores (Mean = 74.7%, SD = 22.1%).

Procedure
The experiment consisted of two conditions (control- and test-condition), and each condition was divided into three stages (see Fig. 2). Upon arrival, the participants were briefed about the nature of the experiment and were asked to sign the informed consent form. The experimenter explicitly explained that the participant would later be seated inside a fully automated vehicle. The participants were also required to answer the pre-study questionnaire before entering the ML. In Stage 1, the experimenter ushered the participant to the ML, in which the driving wizard was already in position. Inside the ML, the participant was seated and asked to watch a video for about five minutes. Then, in Stage 2, the participant was driven on the pre-defined route while continuing to watch the video. After that, in Stage 3, the ML was stopped and parked, and the participant was required to watch the continuation of the video for another five minutes.
Two different videos (Steves, 2015a, 2015b) were used for the two conditions in order to keep the participants interested; the videos were similar enough not to introduce any significant difference between the conditions. The content of the videos was about tourism in the Netherlands and was selected based on the idea that it should be engaging enough but not elicit any strong emotions such as sadness or happiness. In addition, the temperature inside the ML was controlled to be constant at about 20 °C at all times during the experiment (Holmes & Griffin, 2001).

Data collection and analysis
Three sets of data were measured within this experiment, (1) ML-based measurements, (2) participant-based measurements, and (3) assessment of the PVFS. ML-based measurements were implemented to quantify the consistency of the driving sessions experienced by each of the participants. Participant-based measurements consisted of dependent variables that were tested in this research. The assessment of the PVFS was to investigate the participants' experience in interacting with the peripheral display.
All of the measured objective data were sampled at 250 Hz, synchronized, and stored using a National Instruments cRIO-9030 data acquisition system (DAQ) with National Instruments 9205 (analog input) and National Instruments 9401 (digital input/output) modules. All the questionnaire data were collected manually using pen and paper, and were later transferred to IBM SPSS for statistical analysis.

Mobility Lab (ML) based measurements
Motion Sickness Dose Value (MSDV). The International Organization for Standardization (ISO) specifies a method for evaluating the dose of MS from acceleration. The method is known as Motion Sickness Dose Value (MSDV) (ISO, 1997). An ADXL335 accelerometer was used to measure all the accelerations. The accelerometer was placed in the middle of the ML and close to the passenger's feet. The speed of the ML was collected using an Adafruit Ultimate GPS Breakout.
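For readers unfamiliar with the measure, the MSDV is the square root of the time integral of the squared frequency-weighted acceleration over the exposure, giving units of m s^-1.5. The sketch below is a minimal discrete-time version of that integral at the 250 Hz sampling rate used here. It assumes the input samples have already been frequency-weighted; the weighting filter itself is not implemented, and the sine-wave input is purely illustrative.

```python
import math

SAMPLE_RATE_HZ = 250.0  # sampling rate stated for the DAQ


def msdv(weighted_accel_ms2, sample_rate_hz=SAMPLE_RATE_HZ):
    """Motion Sickness Dose Value in m s^-1.5.

    Approximates sqrt(integral of a_w(t)^2 dt) by a rectangle-rule sum.
    The samples are assumed to be frequency-weighted already.
    """
    dt = 1.0 / sample_rate_hz
    return math.sqrt(sum(a * a for a in weighted_accel_ms2) * dt)


# Illustrative input: 60 s of a 0.2 Hz sine with 1 m/s^2 amplitude.
n_samples = int(60 * SAMPLE_RATE_HZ)
signal = [math.sin(2 * math.pi * 0.2 * k / SAMPLE_RATE_HZ) for k in range(n_samples)]
example_msdv = msdv(signal)
print(f"{example_msdv:.3f} m s^-1.5")
```

For a signal of roughly constant magnitude, the result equals the r.m.s. acceleration multiplied by the square root of the exposure time, which is why longer exposures raise the dose even at unchanged acceleration levels.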
Automated Driving Test Ride Quality (ADTQ). In addition, there was a question asking about the Automated Driving Test-ride Quality (ADTQ) experienced by the participant. The question involved a 10-point scale (1 = very unrealistic, 10 = very realistic) and was asked in the post-study questionnaire.

Participant-based measurements
Three dependent variables were measured within this study, MS, SA, and mental workload. The independent variable was the condition (with and without the PVFS).
Motion Sickness Assessment Questionnaire (MSAQ). MS was measured through the Motion Sickness Assessment Questionnaire (MSAQ) (Gianaros, Muth, Mordkoff, Levine, & Stern, 2001) and through heart rate (HR) measurement in terms of beats per minute (BPM). The MSAQ consists of four constructs: gastrointestinal-, central-, peripheral-, and sopite-related. A pulse sensor (photoplethysmogram) was used to measure the HR and was placed on the index finger of the non-dominant hand of the participant. This experiment followed the recommendation by Laborde, Mosley, and Thayer (2017), who suggested a within-subject design as well as a three-stage measurement. A minimum window size of five minutes of HR recording was also applied, as recommended by Malik (1996) for short-term HR measurement.
Situation Awareness Rating Technique (SART). For SA, the Situation Awareness Rating Technique (SART) was used, based on the work of Taylor (1990). SART consists of ten items which cluster into three constructs: "demand", "supply", and "understanding". Each item is rated on a 7-point scale (1 = low, 7 = high). The "demand" construct consists of questions assessing the instability, variability, and complexity of the situation. The "supply" construct consists of questions measuring arousal, spare mental capacity, concentration, and division of attention. The "understanding" construct measures the information quality, quantity, and familiarity. The lowest score is -5, indicating very low SA, and the highest score is 13, indicating very high SA regarding the situation being probed.
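The stated score range of -5 to 13 is consistent with the usual SART combination rule, SA = understanding - (demand - supply), applied to the mean rating of each construct. The sketch below illustrates that calculation. Averaging the items within each construct is an assumption inferred from the reported range, and the function name is hypothetical.

```python
from statistics import mean


def sart_score(demand, supply, understanding):
    """Overall SART score from per-item ratings (each 1 to 7).

    Uses SA = U - (D - S) on construct means, which is assumed here
    because it reproduces the -5 to 13 range given in the text.
    """
    d = mean(demand)          # 3 demand items
    s = mean(supply)          # 4 supply items
    u = mean(understanding)   # 3 understanding items
    return u - (d - s)


worst = sart_score([7, 7, 7], [1, 1, 1, 1], [1, 1, 1])  # highest demand, lowest supply and understanding
best = sart_score([1, 1, 1], [7, 7, 7, 7], [7, 7, 7])   # lowest demand, highest supply and understanding
print(worst, best)
```

High demand lowers the score, while high supply and understanding raise it, which matches the direction of the construct differences reported in the results.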
Rating Scale Mental Effort (RSME). The mental workload was evaluated using the one-dimensional Rating Scale Mental Effort (RSME), which was developed by Zijlstra (1993). The participant indicated the effort invested in getting the information from the PVFS while performing the NDT using the RSME. The RSME scale was represented by a 150 mm line, with the lowest number indicating "absolutely no effort" and the maximum number indicating "maximum effort".
In this study, the MSAQ was applied in both the pre- and post-study questionnaire, while HR was measured continuously from Stage 1 to 3. SART and RSME were both measured in the post-study questionnaire (see Fig. 2).

PVFS evaluation
There were two types of assessment for the PVFS: one was the User Experience Questionnaire (UEQ), and the other was the reaction time in response to the information given by the PVFS. Both of these assessments were only measured in the test-condition.
User Experience Questionnaire (UEQ). The UEQ was employed to assess the experience of the participants with the PVFS. The questionnaire was constructed and validated by Laugwitz, Held, and Schrepp (2008). The research objective of the UEQ within this study was twofold. One was to check if the PVFS provided a satisfactory user experience in terms of expectations, while the other was to investigate which elements needed to be improved in order to fulfil the users' needs in this particular context (Schrepp, Hinderks, & Thomaschewski, 2014).
Reaction Time. Besides watching the video as the NDT, the participants were given a clicker with two buttons to be held in their dominant hand. Once the peripheral visual information was presented from the PVFS, the participants had to indicate the direction of the future course of the AV by clicking either one of the buttons (left button to indicate that the vehicle just took a left corner and right button to indicate that the vehicle just took a right corner). An explicit verbal and figurative instruction was provided in the briefing to shorten the learning time with the clicker. Each time the participant pushed a button, a digital signal (powered by two 1.5 V batteries) was generated and transmitted to the DAQ through an NI 9401 digital I/O module. Reaction time is the time taken for the clicker to be pushed by the participant after the peripheral information was given. Reaction time was applied to measure the attentiveness of the participant regarding the information given by the PVFS.
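The reaction time definition above can be illustrated with a short sketch that works on two digital channels sampled at 250 Hz. It assumes that both the PVFS activation and the button press are available as 0/1 sample sequences; the text only states that the button signal was logged through the NI 9401 module, so the cue channel and all names here are hypothetical.

```python
SAMPLE_RATE_HZ = 250  # sampling rate stated for the DAQ


def rising_edges(signal):
    """Sample indices where a 0/1 signal switches from 0 to 1."""
    return [i for i in range(1, len(signal)) if signal[i - 1] == 0 and signal[i] == 1]


def reaction_times_s(cue_signal, button_signal, sample_rate_hz=SAMPLE_RATE_HZ):
    """Time from each cue onset to the first button press before the next cue.

    Returns None for a cue that received no press.
    """
    cues = rising_edges(cue_signal)
    presses = rising_edges(button_signal)
    times = []
    for k, cue in enumerate(cues):
        next_cue = cues[k + 1] if k + 1 < len(cues) else len(cue_signal)
        first = next((p for p in presses if cue <= p < next_cue), None)
        times.append(None if first is None else (first - cue) / sample_rate_hz)
    return times


# Synthetic example: cue starts at 1 s, button pressed at 2 s.
cue = [0] * 250 + [1] * 750 + [0] * 250
button = [0] * 500 + [1] * 25 + [0] * 725
print(reaction_times_s(cue, button))
```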

Statistical analyses
Normality was tested using the Shapiro-Wilk test, since fewer than 50 people participated in this study. In comparing means for the two conditions (control- and test-condition), paired t-tests were used for the parametric analysis and Wilcoxon signed rank tests (WSRT) were applied for the non-parametric analysis. Two-way repeated measures analysis of variance (ANOVA) was performed on the HR measurements to determine the interaction effect between the stages (1, 2, and 3) and the conditions (control- and test-condition). In the ANOVA, sphericity was assessed with Mauchly's test, and the assumption was considered met when p > 0.05. G*Power software (Faul, Erdfelder, Lang, & Buchner, 2007) was used to calculate the statistical power, while all the other statistical analyses were performed using IBM SPSS version 23 (IBM Corp., 2015).

The consistency of the test rides
The distributions of accelerations across the frequency spectrum for all 40 test rides were plotted as Power Spectral Density (PSD). The PSDs of both the control- and test-condition were plotted for the three axes (x-, y-, and z-axis) on semi-log graphs (see Fig. 3). The mean, standard deviation (SD), and coefficient of variation (CoV = SD/Mean) of the MSDVs produced by the driving wizard for the 20 participants in the two conditions indicated high reliability and consistency (see Table 1). Since the lateral and longitudinal directions are the directions of interest, because MS develops at low-frequency horizontal oscillations (ISO, 1997; Turner & Griffin, 1999), only the MSDVs in the x- and y-direction were plotted. The distributions of the averaged MSDVs with frequency-weighted acceleration in both the x- and y-direction were found to be almost identical (see Fig. 4).
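The consistency measure defined above, CoV = SD/Mean, can be computed as in the following sketch. The MSDV values in the example are made up for illustration and are not the values in Table 1; the use of the sample standard deviation (n - 1 denominator) is also an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CoV = SD / Mean, using the sample standard deviation."""
    return stdev(values) / mean(values)


# Hypothetical per-ride MSDV values in m s^-1.5 (not data from the study).
example_msdv_y = [7.2, 7.4, 7.5, 7.3, 7.6]
cov = coefficient_of_variation(example_msdv_y)
print(f"CoV = {cov:.3f}")
```

A CoV of a few percent, as in this made-up example, would indicate that rides differed little relative to their mean dose.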
For the ADTQ, the Shapiro-Wilk test for normality indicated that the ADTQ ratings were not normally distributed (p < 0.05), hence a WSRT was performed. The WSRT showed that there was a statistically significant median difference in the ADTQ rating between the rides in the test-condition and the rides in the control-condition, z = -2.171, r = 0.485, p = 0.030. A median of 7.0 (inter-quartile range (IQR) = 5.0-8.0) was found for the rating given by the participants in the control-condition. For the rides in the test-condition, a median of 7.5 (IQR = 7.0-8.0) was found for the ADTQ rating.
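The reported effect size is consistent with the common conversion r = |z| / sqrt(N), with N taken as the number of participants. The sketch below reproduces it; the choice of N = 20 is inferred from the match with the reported value and is not stated explicitly in the text.

```python
import math


def wilcoxon_effect_size_r(z: float, n: int) -> float:
    """Effect size r for a Wilcoxon signed rank test, r = |z| / sqrt(n)."""
    return abs(z) / math.sqrt(n)


r_adtq = wilcoxon_effect_size_r(-2.171, 20)  # z from the text, N = 20 participants
print(round(r_adtq, 3))
```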

Motion sickness
WSRTs were performed on the pre- and post-MSAQ data in order to check if the setup induced MS in the participants in both conditions. Both the control- and the test-condition showed statistically significant differences between the pre- and post-MSAQ scores for all constructs except the peripheral-related construct (see Table 2).
In order to compare the severity of the experienced MS in the two conditions, the differences between pre- and post-MSAQ were analyzed and compared (see Table 3). A WSRT showed that there was a statistically significant median difference in overall MSAQ between the test-condition and the control-condition, with z = -2.436, p < 0.05. Analysing the constructs of the MSAQ, it was found that only the peripheral-related construct showed no statistically significant difference. The effect size (r = 0.360) of the MSAQ's peripheral-related (P) construct was converted into Cohen's d (0.772) using an effect size calculator (Ellis, 2009). G*Power software (Faul et al., 2007) was used to calculate the statistical power of this particular analysis, which was found to be 0.905. The other three MSAQ constructs (gastrointestinal- (G), central- (C), and sopite-related (S)) indicated statistically significant differences. When the two conditions were compared, all the medians were much smaller in the test-condition than in the control-condition.

Table 2
Wilcoxon signed rank test (WSRT) for the pre- and post-MSAQ scores with median and inter-quartile range (IQR) for the overall MSAQ (O) and its constructs (peripheral-related (P), gastrointestinal-related (G), central-related (C), and sopite-related (S)) scores for control- and test-condition.

A two-way repeated measures ANOVA was performed to determine any statistically significant interaction effect between the two within-subject factors (control-condition vs. test-condition, and the three stages of HR measurement) on the continuous dependent variable, MS. Analysis of normality indicated that the data were normally distributed, as assessed by the Shapiro-Wilk test of normality (p > 0.05). Mauchly's test of sphericity indicated that the assumption of sphericity was met for the two-way interaction, χ²(2) = 3.256, p = 0.196. There was a statistically significant interaction between the two conditions and HR measurement over the three stages, F(2, 38) = 5.161, p = 0.010, partial η² = 0.214. Therefore, simple main effects (one-way repeated measures ANOVA) were performed (see Table 4).
For the simple main effect of condition, there was no statistically significant difference between HR measurement for the control- and the test-condition at the different stages. The statistical power of the Stage 1 analysis was 0.060. For Stage 2, the statistical power was 0.164. For Stage 3, the statistical power was 0.078. Although there was no main effect of condition, HR measurement in terms of BPM was notably higher at Stage 2 in the control-condition compared to the test-condition, even though the measurements at the beginning (Stage 1) and end (Stage 3) were relatively similar (see Fig. 5).
For the simple main effect of stage of HR measurement, there was a statistically significant effect for both conditions (see Table 4). For both the control- and test-condition, post-hoc tests showed that Stage 2 was statistically significantly different from both Stage 1 and Stage 3, while there was no statistically significant difference between Stage 1 and Stage 3.
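Two of the effect sizes reported in this section can be reproduced from the other reported statistics. The sketch below uses the common conversion d = 2r / sqrt(1 - r^2) for the peripheral-related construct and the identity partial η² = (F × df_effect) / (F × df_effect + df_error) for the interaction. That the effect size calculator cited in the text applies exactly this conversion is an assumption, although the formula does reproduce the reported value.

```python
import math


def r_to_cohens_d(r: float) -> float:
    """Convert a correlation-type effect size r to Cohen's d."""
    return 2 * r / math.sqrt(1 - r ** 2)


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


d_peripheral = r_to_cohens_d(0.360)               # r from the text
eta_p2 = partial_eta_squared(5.161, 2, 38)        # F(2, 38) = 5.161 from the text
print(round(d_peripheral, 3), round(eta_p2, 3))
```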

Situation awareness and mental workload
Table 3
Wilcoxon signed rank test (WSRT) for the total differences between pre- and post-MSAQ scores with median and inter-quartile range (IQR) between the two conditions (control- and test-condition).

A paired-samples t-test was used to determine whether there was a statistically significant difference in mean SA when participants were exposed to the condition with and without the PVFS (see Table 5). According to the Shapiro-Wilk test, the assumption of normality was not violated, as the SART scores (total and all constructs) showed p > 0.05. Participants experienced higher total SA in the test-condition than in the control-condition, a statistically significant increase in the mean score of 2.38 (see Fig. 6). In terms of the "demand" and "supply" constructs of SART, paired-samples t-tests indicated a statistically significant difference in the mean between the conditions. Participants experienced lower "demand" in the test-condition compared to the control-condition, a statistically significant change in the mean score of 0.87. Participants experienced higher "supply" in the test-condition compared to the control-condition, a statistically significant change in the mean score of 0.70. In terms of "understanding", there was no statistically significant difference in the mean between the conditions. A power analysis was conducted, and the statistical power for "understanding" was found to be 0.837.
A paired-samples t-test was also used to determine whether there was a statistically significant difference in mean mental workload (RSME score) between the two conditions. According to the Shapiro-Wilk test, the assumption of normality was not violated, as the mental workload data showed p > 0.05. There was no statistically significant difference between the means for the control- (Mean = 43.550, SD = 22.402) and test-condition (Mean = 37.100, SD = 27.701), 95% CI [-3.581, 16.481], t(19) = 1.346, p = 0.194, Cohen's d = 0.300.
A power analysis was conducted, and the statistical power for this particular analysis was 0.247.
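The reported RSME statistics are mutually consistent, which can be checked from the confidence interval alone. The sketch below recovers the mean difference, the t statistic, and an approximate Cohen's d for paired data (d = t / sqrt(N)). The two-sided 95% critical value of 2.093 for 19 degrees of freedom is taken from standard t tables; the derived d is close to, but not exactly, the reported value because of rounding.

```python
import math

N = 20                 # participants (from the text)
T_CRIT_DF19 = 2.093    # two-sided 95% critical t for df = 19 (standard table value)

ci_low, ci_high = -3.581, 16.481  # 95% CI of the mean difference (from the text)

mean_diff = (ci_low + ci_high) / 2                 # midpoint of the interval
std_error = (ci_high - ci_low) / 2 / T_CRIT_DF19   # half-width divided by critical t
t_value = mean_diff / std_error
cohens_d = t_value / math.sqrt(N)                  # paired-samples effect size

print(round(mean_diff, 2), round(t_value, 3), round(cohens_d, 3))
```

The midpoint of the interval equals the difference between the two reported condition means, and the recovered t statistic matches the reported t(19) = 1.346.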

UEQ and reaction time
The opinion of the participants about the PVFS based on the six UEQ constructs was tabulated with scores between -3 (bad) and +3 (good) (see Table 6). Means, standard deviations (SD), confidence (C), and the consistency measure Cronbach's α are also presented in Table 6. Perspicuity, efficiency, and dependability can be interpreted as pragmatic values, while stimulation and novelty can be classified as hedonistic values. For the assessment of reaction time to the given information, there were 18 corners (ten to the right and eight to the left). In general, the participants took about the same time to acknowledge the directions to the left (1.17 s) and to the right (1.03 s). Out of 360 corners, the participants incorrectly indicated the direction of the AV only twice (once to the left and once to the right).

Table 5
Total differences in SART between the two conditions (control- and test-condition) for overall SART and its constructs (7-point scale; 1 = low, 7 = high).

Validation of the test rides
Fig. 6. Mean score for constructs of SART (demand (D), supply (S), understanding (U), and total (T)) for control- and test-condition. Error bars represent the 95% confidence intervals (CI).

In this study, the first validation was to make sure that the accelerations experienced by all the participants were always about the same, or at least that differences were within acceptable margins (i.e., small standard deviations and coefficients of variation) (see Table 1). For the total of 40 test rides, the driving wizard produced high consistency, as indicated by the small values of the coefficient of variation and standard deviation and the almost identical means for MSDV in every direction (see Figs. 3 and 4). In addition, since the objective of the study was to minimize the longitudinal acceleration and only to manipulate the lateral acceleration, MSDV_y was understandably much higher than MSDV_x. Meanwhile, MSDV_z, the vertical acceleration dose, which was produced by the cobblestone road surface and the vehicle's suspension system, was relatively small when compared to the other two MSDVs. Moreover, the vertical acceleration captured by MSDV_z would only contribute to an uncomfortable feeling, as it was shown before that only horizontal accelerations (longitudinal and lateral) directly contribute to the development of motion sickness (Turner & Griffin, 1999; Vogel, Kohlhaas, & Baumgarten, 1982). The PSD in the z-direction for the control- and test-condition was found to be dominant around 1.5 Hz (see Fig. 3). Oscillations below 0.5 Hz are considered low frequency, and those above 0.5 Hz are regarded as high frequency and therefore do not directly contribute to the development of MS (Donohew & Griffin, 2004; Golding, Mueller, & Gresty, 2001; Griffin & Newman, 2004a; Lawther & Griffin, 1987; Turner & Griffin, 1999). On the other hand, the PSDs for the x- and y-direction were found to be dominant below 0.2 and 0.3 Hz, respectively (see Fig. 3). Hence both the x- and y-direction imposed low-frequency motions, but the amplitudes were nine times smaller in the x-direction than in the y-direction. In addition, the produced frequencies were almost the same in both conditions, which is an essential feature in MS studies (Golding, Bles, Bos, Haynes, & Gresty, 2003). Therefore, the objective of the experiment, to emulate consistent and similar automated driving that only manipulates the lateral acceleration while minimizing the longitudinal acceleration, was achieved.
One aspect that needs consideration is that all the MSDV values were derived from the accelerometer located on the floor of the vehicle, close to the passengers. This was done based on what was implemented in previous studies (Griffin & Newman, 2004a; Turner & Griffin, 1999), but it might not reflect what the passengers were actually experiencing. Based on the postural stability theory by Riccio and Stoffregen (1991), two different participants might react differently even though the same dosage of motion sickness was applied. Therefore, a wearable accelerometer may be required to measure the passengers' MSDVs exactly.
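To make the consistency check above concrete, the following is a minimal Python sketch of how a motion sickness dose value and a spread measure across rides could be computed. It assumes the common definition of MSDV as the square root of the time integral of squared acceleration, and it assumes that the acceleration samples have already been frequency-weighted. The spread measure is the ratio of standard deviation to mean (usually called the coefficient of variation), which is assumed here to correspond to the coefficient reported in Table 1. The function names, the sinusoidal ride, and the list of per-ride doses are hypothetical illustrations, not data from the study.

```python
import math
import statistics


def msdv(accel, dt):
    """Motion sickness dose value in m/s^1.5.

    accel: acceleration samples in m/s^2 (assumed already frequency-weighted)
    dt:    sampling interval in seconds
    """
    # Rectangle-rule approximation of the integral of a(t)^2 over the ride.
    return math.sqrt(sum(a * a for a in accel) * dt)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical ride: nine minutes (540 s) of a 0.2 Hz lateral sinusoid at 10 Hz sampling.
dt = 0.1
amplitude = 0.45  # m/s^2, illustrative value only
ride = [amplitude * math.sin(2 * math.pi * 0.2 * i * dt) for i in range(5400)]
dose = msdv(ride, dt)

# Hypothetical per-ride lateral doses, used only to show the spread calculation.
doses = [7.2, 7.4, 7.5, 7.3, 7.6]
cv = coefficient_of_variation(doses)

print(f"MSDV of the hypothetical ride: {dose:.2f} m/s^1.5")
print(f"Coefficient of variation across rides: {cv:.3f}")
```

The illustrative amplitude was chosen so that the dose lands near the 7.4 ms^-1.5 level mentioned in the text; it is not a measured value. A small ratio across rides is what would indicate that the driving wizard delivered a similar dose to every participant.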
For the subjective rating of the ADTQ, although all the produced MSDVs were shown to be about the same, participants rated the two conditions differently. For the control-condition, the average rating was about one point lower than for the test-condition. Out of the 20 participants, only three rated the ADTQ of the control-condition higher than the ADTQ of the test-condition. These ratings were expected to correlate with MS, SA, or mental workload, but no statistical correlations were found with any of them.

MS assessment
The overall score of MSAQ and all of its constructs (except the peripheral-construct) for the test-condition indicated statistically significant differences when compared to the control-condition. Therefore, participants experienced less motion sickness when the PVFS was implemented than when there was no intervention at all. Specifically, when the participants were exposed to around nine minutes of MSDV_y at a level of about 7.4 ms^-1.5, the level of MS experienced by the participants in the test-condition was significantly lower in both MSAQ and HR measurements compared to the control-condition. The medians of the total difference between pre- and post-MSAQ (10.415 and 1.390, respectively) showed that in both conditions the participants experienced mild MS.
There were no significant changes in the peripheral-construct of MSAQ, and this may be explained by the controlled temperature inside the test vehicle (Mobility Lab, ML). The experiments were conducted from March to April 2017 in Eindhoven, the Netherlands, where the outside temperature ranged from 9°C to 24°C. The temperature inside the ML was therefore fixed at 20°C for all the conditions in order to control any temperature effects (Griffin & Newman, 2004b). If the temperature inside the ML had not been controlled, the participants might have become uncomfortable because it would have been too hot or too cold, depending on the temperature outside the test vehicle on that particular day. However, the fixed temperature of 20°C inside the ML may also have prevented the participants from developing a sweaty and clammy feeling (peripheral-related symptoms).
Some participants indicated mild MS in the test-condition. This may have been caused by involuntary rather than active movement of the participant's head, as discussed by Carriot, Brooks, and Cullen (2013), who found that vestibular and cerebellar neurons respond only to passive head motions and not to active head motions. In our study, although the participants were aware of the intention of the car from the information given by the PVFS, they might still have developed some mild MS if their heads were involuntarily moved or tilted by the applied accelerations. Unlike the driver of a vehicle, who is known to tilt the head in the direction opposite to the induced lateral force, or aligned with the gravito-inertial force, passengers usually move their head in the direction of the induced lateral force (Zikovitz & Harris, 1999). Past studies have shown that actively aligning one's head with the gravito-inertial force (into the turn) reduced the level of MS compared to involuntarily allowing the induced force to move one's head in the direction opposite to the gravito-inertial force (Golding et al., 2003; Wada & Yoshida, 2016; Wada, Konno, Fujisawa, & Doi, 2012).
In terms of HR measurement, past studies have shown inconsistencies regarding HR and its relation to MS. While some studies found no significant relation (Graybiel & Lackner, 1980; Hu, Grant, Stern, & Koch, 1991), other studies found that increased MS was correlated with increased HR (Cowings, Suter, Toscano, Kamiya, & Naifeh, 1986; Himi et al., 2004; Stout, Toscano, & Cowings, 1995). In our study, we found no statistically significant differences between the two conditions. One possible explanation is that the participants were only exposed to mild MS, as indicated by the MSDV_y (7.4 ms^-1.5). Past studies (Holmes & Griffin, 2001; Mullen, Berger, Oman, & Cohen, 1998) also found no statistical changes in HR when the participants were only exposed to mild MS. Low statistical power was also found in the analysis of the HR measurement, which usually indicates that there were not enough participants to conclude whether there was any statistical significance. Although 20 participants is generally adequate for HR measurement (Simmons, Nelson, & Simonsohn, 2011), variables that influence the participants' HR need to be controlled. In order to yield valuable insight from HR measurement, both the stable variables (such as age, gender, and medication intake) and the transient variables (such as sleep routine, physical exercise, and caffeine and alcohol intake) need to be at about the same level for each participant (Laborde et al., 2017; Quintana, Alvares, & Heathers, 2016). In our study, we controlled some of the stable variables (such as age, heart-related disease, and medication intake) but did not control the transient variables, which might be the reason for the underpowered statistical analysis of the HR measurement. Nevertheless, the average BPM indicated that participants recorded higher measurements in the control-condition.
For the control-condition, the average resting (Stage 1) BPM was at 71 and then increased to 78 during Stage 2 and finally decreased to 70 at Stage 3, while for the test-condition, 71 BPM was recorded for Stage 1, 75 BPM for Stage 2, and 71 BPM for Stage 3 (see Fig. 5). For comparison, Cowings, Naifeh, and Toscano (1990) found an average of 77 BPM measurement for their participants during the mild MS phase and 87 BPM during the severe MS phase.

SA and mental workload assessment
For the analysis of SA, although the SART constructs of "supply" and "demand" indicated a significant change in the test-condition when compared to the control-condition, the "understanding" construct did not show any statistical significance. However, the mean score for the "understanding" construct was 3.7 (on a 7-point scale) for the test-condition, while for the control-condition it was only 3.2, which is less than half of the 7-point scale. On the other hand, the "demand" construct was higher for the control-condition (4.3) than for the test-condition (3.4), indicating that participants demanded more information in the control-condition. This was in line with the result of the "supply" construct, which showed the opposite pattern: a higher score was found in the test-condition (4.3) than in the control-condition (3.6). One explanation is that the PVFS managed to deliver the intended information ("AV is about to turn to the right or left"), but the given information was apparently not rich enough for the participants to understand what was actually happening and what was going to happen next. The PVFS only indicated the direction of the corner or junction; the complete characteristics of the corner were not displayed or delivered. Complete information regarding the characteristics of the corner (e.g., intensity, radius, and position relative to the corner) might not only increase the SA of the participants but also help them to prepare (e.g., by tilting the head into the corner) for the incoming acceleration forces that induce the development of MS.
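As an illustration of how the three constructs relate to an overall score, the sketch below applies the commonly used SART combination rule, SA = U - (D - S), to the construct means quoted above. This is an assumption about the scoring: the text does not state how the total (T) in Fig. 6 was calculated, so the values printed here should not be read as the reported totals. The function name is hypothetical.

```python
def sart_total(understanding, demand, supply):
    """Combine SART constructs with the commonly used rule SA = U - (D - S)."""
    return understanding - (demand - supply)


# Construct means quoted in the text (7-point scale).
control_total = sart_total(understanding=3.2, demand=4.3, supply=3.6)
test_total = sart_total(understanding=3.7, demand=3.4, supply=4.3)

print(f"control-condition: {control_total:.1f}")
print(f"test-condition:    {test_total:.1f}")
```

Under this rule, a lower demand and a higher supply both raise the combined score, so the test-condition comes out higher even though the difference in "understanding" alone was not significant.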
As regards mental workload, participants found that performing the NDT (watching a video) and retrieving information from the PVFS required "some mental efforts", based on a mean score of 37 on the RSME scale. In this study, the participants were asked to watch a video as a primary task while the PVFS was presented in the periphery. Therefore, most of the mental capacity might have been used to focus on watching the video, while only a small amount could be allocated to understanding the information given by the PVFS. Hence, the lack of the aforementioned information (e.g., intensity, radius, and position relative to the corner) given by the PVFS may have caused only a partial understanding of what was happening and thus may have increased the mental workload experienced by the participants. This is consistent with the statistically non-significant difference in experienced mental workload between the control- and test-condition. Therefore, the information given by the PVFS needs to be rich yet simple, so that it can be digested and understood quickly and intuitively. In addition, high SDs were found for both the test- (27.701) and control-condition (22.402), indicating that different participants experienced different amounts of mental workload.

PVFS assessment
Analysing the user experience, most participants agreed that the PVFS has a high attractiveness value, as indicated by the high attractiveness score (see Table 6). For the pragmatic values, constructs like perspicuity, efficiency, and dependability revealed good ease-of-learning, minimized effort, and decent interaction scores. The low Cronbach's α values for both perspicuity and dependability indicated that there may have been misinterpretations of the questions by the participants. For example, for the perspicuity item "confusing" versus "clear", a participant might have interpreted this as being about the situation they were facing rather than about the delivery of the information from the PVFS. Multiple interpretations of a particular question in one construct would decrease the Cronbach's α value. Furthermore, the PVFS might not be a solution that fits all participants. Different participants may have different perceptions and acceptance of the PVFS. It has been shown that a negative reception may actually make participants feel uncomfortable, as was found in a study using virtual reality for reducing MS (McGill, Ng, & Brewster, 2017). On the other hand, high consistency was observed for stimulation and novelty. The participants assigned reasonable scores to novelty, but they also assigned low scores to stimulation. This is not necessarily a bad result, as the items under stimulation probe the motivation for using the PVFS. The PVFS was intentionally designed to deliver information in a subtle/unobtrusive way, with the goal that the performance of the NDT would not be degraded or interrupted.
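The effect of a misread item on internal consistency can be illustrated with a short Python sketch of the standard Cronbach's α formula, α = k/(k - 1) · (1 - Σ item variances / variance of the total score). The function name and both score sets are hypothetical and are not taken from the questionnaire data of this study; the second set simply scrambles one item to mimic participants interpreting it differently.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for a construct.

    items: list of per-item score lists, one score per respondent,
           all lists of equal length.
    """
    k = len(items)
    item_variance_sum = sum(statistics.variance(scores) for scores in items)
    # Total score of each respondent across all items of the construct.
    totals = [sum(respondent) for respondent in zip(*items)]
    return k / (k - 1) * (1 - item_variance_sum / statistics.variance(totals))


# Hypothetical construct with three items answered consistently by five respondents.
consistent = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [1, 2, 3, 4, 5]]
# Same construct, but the third item is answered as if it asked something else.
misread = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [5, 1, 4, 2, 3]]

alpha_consistent = cronbach_alpha(consistent)
alpha_misread = cronbach_alpha(misread)

print(f"alpha with consistent items: {alpha_consistent:.2f}")
print(f"alpha with one misread item: {alpha_misread:.2f}")
```

In this toy example the item variances are identical in both sets; only the agreement between items changes, and that alone is enough to pull α down sharply.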
On the other hand, reaction time was found to be very quick (~1 s), indicating that the PVFS managed to grasp the attention of the participants in an instant. In addition, only two participants made one mistake each in reporting the upcoming turn. Therefore, a phenomenon such as inattentional blindness (Mack & Rock, 1998), where users might become utterly unaware of a situation or object because their attention is focused on the primary task, did not occur. Thus, the PVFS was capable of delivering quick and correct information. It needs to be noted that reaction time was only measured in the test-condition, in the presence of the PVFS. The task of using a digital clicker to indicate the direction of the vehicle can be seen as an additional secondary task on top of watching the video. It has been argued that distraction through mental activity can reduce motion sickness (Bos, 2015; Schwab, 1954). These authors suggest that this is because attention is directed to another situation rather than to the feeling of MS. Therefore, the usage of the digital clicker may have influenced the mitigation of MS within this study.

Conclusion
The automated driving test rides, which were simulated by the driving wizard using the Mobility Lab (ML), yielded high consistency and provided a sufficient dosage to make participants experience mild motion sickness (MS). The Peripheral Visual Feedforward System (PVFS) managed to prevent the experienced MS from increasing during exposure to low-frequency accelerations while watching a video. This was achieved by increasing the level of situation awareness (SA) through conveying the intention of the fully automated vehicle (AV) with regard to lateral direction. The PVFS was also successful in delivering direct and fast information, but the lack of richness in the information resulted in "some mental efforts" in terms of mental workload for some participants. The complete characteristics of the motion, such as direction, frequency, and magnitude, need to be considered and translated into information that is simple and easy for the passengers to understand. As suggested by recent works (Diels & Bos, 2015; Diels, Bos, Hottelart, & Reilhac, 2014; Löcken et al., 2017; Wada, 2016), cues regarding the upcoming path of the AV may be presented to the users in order to increase SA and to mitigate MS. In addition, proper interaction with the prototype requires only a small amount of attention, which should not degrade the performance of the non-driving task (NDT) while at the same time reducing or completely preventing MS. In addition to a rich and complete future information system, active head movement of the AV's users, as in the work of Morimoto et al. (2008) and Wada and Yoshida (2016), might be essential in compensating for the perceived acceleration. A different way of tackling the problem is to completely isolate the passengers from the induced forces by compensating for the accelerations mechanically, as in the work of Frechin, Ariño, and Fontaine (2005), or by implementing an active suspension as in tilting train technology (Golding et al., 2003).

Limitation and future works
For the measurement of Motion Sickness Dose Values (MSDV), the placement of the accelerometer on the floor of the ML could be improved. It is possible that some participants made active movements (moving their head in the direction opposite to the induced lateral force) or prepared themselves after getting the information from the PVFS. Therefore, a wearable accelerometer that measures the participant's head movement, and thus the motion acting on the vestibular system, as in the study by Wada et al. (2012), could be added to improve the understanding of human reactions to forces and MS.
Based on our findings, delivering the information in a subtle/unobtrusive way was attainable, but the challenge was how to provide rich information that would complete the mental model (matched sensory inputs) within a limited design space (the vehicle's interior and architecture). Furthermore, the peripheral information system might depend on the context of the NDT. A different NDT, such as reading, socializing, or listening, might pose a different kind of challenge, especially for delivering a universal solution.