Train travel in corona time: Safety perceptions of and support for policy measures

To minimize the risk of becoming infected by the Coronavirus while traveling by train, the national government and the Dutch railways' operator (NS) in the Netherlands have taken several policy measures. These involve that passengers have to wear masks and guidelines are issued for working at home and teaching online. In addition, other policy measures, such as introducing a reservation system, were considered. To examine to what extent train travelers support policy measures and how these change their perception of becoming infected while traveling by train, a stated preference experiment is conducted. Respondents were asked to evaluate various combinations of policy measures, both whether they consider it safe to travel by train under the stated conditions and whether they would vote in favor of the policy package in a referendum. To analyze the data, a mediation choice model is developed, which allows disentangling the direct effect of the policy measures on support and the indirect effect mediated by infection safety perception. To illustrate this, the results show that implementing the policy measure teaching on campus with later starting times would decrease travelers’ infection safety perception and therefore indirectly decrease its support. However, the positive direct effect on support suggests that travelers like this option better than teaching online, the guideline that applied at the time of data collection. The direct and indirect effects cancel each other out, indicating that this alternative policy measure would not count on more support than the guideline teaching online. Furthermore, this paper examines the heterogeneity in the support for policy measures by presenting and discussing the results of a Latent Class Choice Model. 
Among other findings, the results reveal that one class strongly supports the reservation system, while another class strongly opposes it, suggesting that implementing this measure is less straightforward than its moderate effects at the aggregate level imply.



Introduction
The spread of the coronavirus and the policy measures taken by governments to contain the virus have major consequences in a variety of areas, such as health (care), the economy, and mobility. With regard to mobility, the short- and long-term consequences are likely significant. Research has shown that large numbers of travelers avoid public transport in case of a pandemic and avoid traveling to places where they perceive a higher risk (e.g., Abdullah et al., 2020; Bucsky, 2020; Kwok et al., 2020; Sadique et al., 2007; Shamshiripour et al., 2020; Yıldırım et al., 2020). De Haas et al. (2020) report a reduction of more than 90% in trips made by public transport in the first wave of Covid19. Abdullah et al. (2020) found that during the pandemic people place a high emphasis on infection-related factors, such as passengers with face masks, social distance, cleanliness, and infection, when choosing a travel mode. Hence, fear of infection and perceived risk significantly influence travel decisions (Abdullah et al., 2020; Kim et al., 2017; Cahyanto et al., 2016; Hotle et al., 2020; Jenelius and Cebacauer, 2020).
As a consequence of the drop in ridership and transport restriction policy measures, the capacity of public transport may be expected to be reduced for a prolonged period. In this, public transport operators have a dual interest. On the one hand, they want to continue to transport as many travelers as possible, but on the other hand, they want to do this in a safe manner, which, in practice, implies that a limit needs to be imposed on the number of passengers. To properly inform decisions regarding this trade-off, knowledge of travelers' preferences regarding the possible measures that support safe travel by public transport is crucial.
The study reported in this paper was conducted at the end of the first wave in the Netherlands, in a period in which the national railway operator (as in other countries, e.g., Tirachini and Cats, 2020) had taken two important measures, namely the obligation to wear non-medical masks and a limit on the number of available places, that is, only window seats may be occupied. However, during the corona crisis policies are being revised constantly, made either stricter or more relaxed. At the time the data were collected, it was mandatory to use non-medical masks, which protect fellow passengers but not the wearer (Peeples, 2020; CDC, 2021). A discussion has been ongoing on whether medical masks should also be made compulsory.
In addition, at times when society is opening up, the demand for train travel will likely increase, and the question arises of how to regulate the demand in such a way that the restricted maximum capacity will not be reached. To this end, various measures have been proposed. For example, the Dutch railway operator is considering a reservation system that only allows travel at the reserved timeslot. Another policy measure that could be taken to spread the demand over the day is to increase prices in peak hours and/or reduce prices in off-peak hours. However, applying pricing mechanisms in public transport is politically sensitive (PBL, CPB and SCP, 2020). The national government, on the other hand, can implement (or lift) restrictions for employers regarding working from home and/or restrictions for colleges and universities regarding the on-/off-campus attendance of students.
With regard to these measures, it is important for both the railway operator and the national government to know what the support base is for policy measures or combinations thereof, but also whether these measures affect people's perceived risk of becoming infected when traveling by train. In this, the dual interest of the public transport operator can also be observed at the level of individual travelers. For example, an individual traveler is likely to dislike a peak-hour price increase, but if the same individual believes this measure will help spread demand over the day, it may positively contribute to his or her perception that traveling by train is safe. To properly inform decision-making, it is important to tease out these effects. Otherwise, the incorrect impression may arise that travelers are indifferent towards or even gain utility from a price increase.
The objective of this study is to obtain insights into the degree to which the various combinations of measures, hereinafter referred to as 'policy packages', are supported by respondents and to what extent these effects are mediated by the perception that traveling by train will be safe given the respective policy packages. To achieve this goal, we set up a stated preference experiment in which respondents were asked to evaluate various policy packages, indicating both whether they would vote for or against the package if it were put to a referendum and to what extent they considered it safe to travel by train under the stated conditions. To analyze the data, a mediation choice model is developed. As far as the authors are aware, this is the first time that a perception variable that is specific to the examined alternative is directly included in a choice model as a mediating variable. As such, this can be considered a relevant methodological contribution of this study. The results are based on a convenience sample of 1396 respondents collected in the Netherlands.
The remainder of this paper is organized as follows. In the following section, we first discuss previous work as relevant background for the development of our mediation model. Then we consecutively discuss the conceptualization, construction of the experiment, the sample, and the model estimation process. We next present the results of the estimated mediation model that disentangles direct and indirect effects of policy measures on support. This is followed by a presentation of an estimated latent class choice model that explores the heterogeneity among citizens in the support for policy measures. In the final section, we draw conclusions, discuss policy implications and discuss some avenues for further research.

Previous work
Stated choice experiments (e.g. Louviere et al. 2000) are typically applied to examine to what extent the attributes (e.g., price, time, comfort level) of an alternative (e.g., a transport mode, a mobility service, a route, a policy package) affect the choice of that alternative. To that effect, a series of choice sets is constructed that each represents a limited number of alternatives that are described by a series of attributes. The values of the attributes, which are called levels, are systematically varied across the choice sets, usually based on an experimental design. Typically, participants in an SC experiment are requested to indicate which single alternative in each choice set they like best. Based on an assumed decision rule, a discrete choice model is then estimated from all observed choices. A widely adopted decision rule is Random Utility Maximization (e.g. Ben-Akiva and Lerman, 1985), which assumes that respondents choose the alternative from which they derive the highest utility. This allows applying logit models to estimate the parameters of an assumed utility function. In the most basic conceptualization and formalization of choice behavior, only attributes affect utility and they are assumed to have a direct effect on utility.
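To make the Random Utility Maximization logic described above concrete, the following minimal sketch computes logit choice probabilities from a linear utility function. The attribute names, the parameter values, and the two alternatives are purely hypothetical illustrations and are not taken from any study cited here.

```python
import math


def logit_probabilities(utilities):
    """Return multinomial logit choice probabilities for a list of utilities."""
    highest = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - highest) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical taste parameters (assumptions for illustration only).
BETA_COST = -0.2   # utility per euro
BETA_TIME = -0.05  # utility per minute

# One choice set with two hypothetical alternatives described by attributes.
alternatives = [
    {"cost": 10.0, "time": 60.0},
    {"cost": 15.0, "time": 45.0},
]

# Only attributes enter utility, each with a direct effect.
utilities = [BETA_COST * a["cost"] + BETA_TIME * a["time"] for a in alternatives]
probs = logit_probabilities(utilities)

for alt, u, p in zip(alternatives, utilities, probs):
    print(alt, round(u, 3), round(p, 3))
```

In estimation the parameters would be unknown and fitted to the observed choices; here they are fixed so that the mapping from attribute levels to utilities to choice probabilities is visible.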
In addition to attributes, it is often assumed that characteristics of persons also affect utility. A distinction can be made between measured personal characteristics and latent variables. Measured variables are usually variables that can be observed or measured by a single survey question such as socio-demographic variables like age and gender. They are assumed to be measured without measurement error. Latent variables often represent psychological variables such as attitudes, which are considered to be relatively stable ways of thinking or feeling about a particular object or behavior. Latent variables cannot be observed directly and they are typically measured by a series of indicator variables (statements related to opinions, perceptions, or preferences), of which the responses are measured on Likert scales. Because multiple indicators are used to measure a single latent variable, reliability can be determined. This implies that measurement errors are assumed to exist, which can be taken into account by specifying a measurement model for each latent variable. This is typically done by estimating a so-called Hybrid Choice Model (HCM; Ben-Akiva et al., 2002). Both measured and latent variables may have direct effects on the utility of an alternative and/or they may modify the impact each attribute has on utility, which indicates that this impact varies depending on the personal characteristics. Moreover, in HCM often the latent variables are assumed to be dependent on measured personal characteristics, which implies that measured personal variables additionally may have an indirect effect on utility mediated by the latent variables. The upper part of Fig. 1 reflects the standard conceptualization of HCM.
Next to this quite common way to include psychological variables in choice models, a second, less common and complementary way has been proposed, one that is inspired by the principles of Hierarchical Information Integration (HII) theory (Louviere, 1984; for a review see Molin and Timmermans, 2009). This theory assumes that when persons are confronted with many influential attributes, they group the attributes that belong together into so-called higher-order decision constructs. Individuals then trade off only the attributes that belong to a particular decision construct to arrive at a construct evaluation, which is done consecutively for all decision constructs. Finally, they trade off the various construct evaluations to arrive at an overall evaluation. In line with this theory, a series of experiments is then constructed (see Fig. 2). In the original HII approach, a subexperiment is first constructed for each decision construct to measure how the attributes belonging to a construct affect its construct evaluation. Respondents evaluate the attribute profiles and their responses are recorded on a rating scale. Second, a so-called bridging experiment is constructed, either as a rating or as a choice experiment, that includes alternatives described in terms of their construct evaluations. Typically, the levels of the decision construct evaluation attributes are expressed in numbers that correspond to the numbers of the rating scales used in the subexperiments. For an application to intercity train choice, we refer to Chiang et al. (2003), who constructed subexperiments for the decision constructs service quality, transfer quality, information quality, and environmental quality, as well as a bridging experiment. Typically, a model is estimated from each subexperiment that reveals how the attributes influence the decision construct evaluations, and a model is estimated from the bridging experiment that reveals the impact of the decision constructs on the overall rating or choice.
Thus, in this way, the attributes that define a construct have an indirect effect on choice, which is mediated by the decision construct evaluations. This is reflected in the lower part of Fig. 1.
These decision construct evaluations are essentially direct measures of people's perceptions regarding groups of attributes, thereby presenting a different way of including psychological variables in a choice model. So instead of using indicators of latent variables that reflect individual traits of respondents as is assumed in HCMs, the perceptions reflect evaluations of attributes that mediate the effects of attributes on utility. This also means that they vary and should be measured at the level of choice situations, instead of at the individual level. Although several variants of the HII approach have been developed, they all have in common that attributes have an indirect effect on utility mediated by the decision constructs. This is reflected in the lower part of Fig. 1. See Richter and Keuchel (2012) for an application of an HII variant in the context of passenger transport mode choice that is entirely based on choice experiments.
One of the HII variants was developed by Bos et al. (2004) in the context of Park and Ride choice. In this approach, two subexperiments were developed to measure how attributes affect the evaluations of the decision constructs Quality of the Park and Ride facility and Quality of connecting Public Transport, respectively. However, the bridging choice experiment described not only the decision construct evaluations, as in the original HII approach, but also the attributes costs and time, the fundamental mode choice attributes. This approach allows examining how the two decision construct evaluations are traded off against time and costs. In fact, this approach can be applied to any higher-order decision construct that is assumed to be influenced by more basic attributes. To better understand the difference between decision constructs and attributes, an example is provided in the context of air passenger transport: if comfort is varied in the levels economy, business, and first class, then comfort is regarded as a regular attribute. If, on the other hand, we conceptualize that comfort is influenced by more basic attributes such as the amount of leg space, quality of food, friendliness of personnel, entertainment program offered, etc., comfort is considered to be a higher-order decision construct. A subexperiment is then constructed, in which the basic attributes just mentioned are combined to form profiles. Respondents are requested to evaluate each profile and express their response on some rating scale, for example, a scale of which the endpoints are labeled as (1) very low comfort to (10) very high comfort. A model estimated from this experiment allows predicting the perception of comfort for any particular flight option that is described in the attributes as varied in the subexperiment. Molin et al. (2017) applied this approach to examine the perception of safety in air passenger transport.
In the first experiment, various airline and route attributes were varied to examine how these attributes affected safety perception. A 7-point rating scale was used to record the responses, of which the endpoints were labeled as 1 = very unsafe and 7 = very safe. In a second experiment, safety perception was included as an attribute and varied in the levels 1, 4, and 7 that corresponded to the levels of the same 7-point rating scale. As this experiment also varied regular attributes like costs and time, this allowed examining the trade-offs of safety perception with costs and time. We stress again that safety perception is not conceptualized as a personality trait like the attitudes discussed earlier, but as a function of more basic attributes. Hence, its score depends on the levels of those attributes and may therefore vary from situation to situation. Personal characteristics may affect the perception score and they may modify the impact of the attributes on perception, and in turn how perception influences choice (indicated by the dashed lines in Fig. 1).
The conceptualization that the impact of attributes is mediated by perceptions does not necessarily require constructing multiple experiments. Molin and Marchau (2004) constructed a single experiment in the context of Advanced Driver Assistance Systems (ADAS), in which they constructed profiles that described combinations of the functional ADAS attributes distance keeping, speed adaptation, and navigation speed. With respect to each profile, respondents were asked to indicate how they believed that this ADAS profile would affect their perception of driving safety, driving comfort, travel time, and fuel consumption, respectively. After price was presented, respondents also rated the overall attractiveness of the presented ADAS profile. A structural equation model was then estimated to examine the direct and indirect effects of the varied attributes on the four measured perception variables and overall attractiveness. Molin et al. (2018) also developed a single experiment with multiple questions. In the context of computer security preferences, this study constructed alternatives in which the attributes involved technical security measures that may be implemented to protect computers. The alternatives were first shown one by one to the respondents, who were asked to express their perception of security and their perception of user-friendliness of each presented attribute combination. Then the respondents were shown a choice set containing three of those profiles and asked to indicate which profile they preferred. Separate regression and logit models were estimated to examine the impacts of the attributes on the perceptions as well as on choice.
The approach developed in the current paper is best viewed as a combination of the latter two approaches: a single experiment is developed with multiple questions. In addition, a simultaneous structural equation model is estimated to examine the direct effects of the attributes on choice and their indirect effects on choice mediated by another variable. New is that the ultimate dependent variable concerns a choice and not a rating. Hence, to the best of our knowledge, this is the first choice model that models both direct and indirect effects of attributes on observed choices. This approach is discussed in more detail in the following section.

Conceptualization
As discussed in the Introduction, the aim of this paper is to examine the public support for policy packages related to traveling by train in corona time. Fig. 3 depicts the conceptualization that underlies the present study. We assume that citizens evaluate each policy measure in terms of infection safety, which may be regarded as the perceived probability of becoming infected by Covid19 while traveling by train. We assume that citizens are able to assess the consequences of the measures that make up a policy package in terms of increasing or decreasing the chances of infection and integrate these into an overall safety perception score of the respective policy package. In turn, this safety perception is assumed to affect the support for a policy package, hence, the higher the safety perception, the higher the probability one votes in favor of a policy package. Thus, the policy measures are assumed to have an indirect effect on support mediated by safety perception. In addition, the policy measures also have direct effects on support, which captures the effects of all other evaluations of the policy measures that are not related to infection safety. To give an example, wearing masks is expected to increase the safety perception and is thus expected to have a positive indirect effect on support. At the same time, however, wearing a mask is expected to have a negative direct effect on support, because it is inconvenient to wear and costs money. The total effect of each policy measure on support can be obtained by summing its indirect and direct effects. Thus, this conceptualization disentangles the direct and indirect effects of each policy measure on support for a policy package.
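The decomposition into direct, indirect, and total effects described above can be illustrated with a small numerical sketch. It assumes a linear equation for infection safety perception and a binary logit for the referendum vote; all coefficient values and names are hypothetical and chosen only to mirror the mask example (positive indirect effect, negative direct effect). They are not estimates from this study.

```python
import math

# Hypothetical coefficients for one dummy-coded policy measure x (0 = absent, 1 = present).
ISP_INTERCEPT = 3.0         # baseline safety perception on the 5-point scale
EFFECT_X_ON_ISP = 0.40      # measure raises perceived infection safety
SUPPORT_CONSTANT = -2.0     # constant in the utility of voting in favor
EFFECT_ISP_ON_SUPPORT = 0.80
DIRECT_EFFECT_X = -0.50     # inconvenience and cost, unrelated to infection safety


def safety_perception(x):
    """Predicted infection safety perception for measure level x."""
    return ISP_INTERCEPT + EFFECT_X_ON_ISP * x


def utility_in_favor(x):
    """Utility of voting in favor: direct effect plus effect via safety perception."""
    return SUPPORT_CONSTANT + DIRECT_EFFECT_X * x + EFFECT_ISP_ON_SUPPORT * safety_perception(x)


def probability_in_favor(x):
    """Binary logit probability of voting in favor of the package."""
    return 1.0 / (1.0 + math.exp(-utility_in_favor(x)))


indirect_effect = EFFECT_X_ON_ISP * EFFECT_ISP_ON_SUPPORT
total_effect = DIRECT_EFFECT_X + indirect_effect

print("indirect:", round(indirect_effect, 3))
print("direct:  ", DIRECT_EFFECT_X)
print("total:   ", round(total_effect, 3))
print("P(in favor | x=0):", round(probability_in_favor(0), 3))
print("P(in favor | x=1):", round(probability_in_favor(1), 3))
```

With these illustrative numbers the negative direct effect outweighs the positive indirect effect, so the measure lowers support overall even though it raises perceived safety.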
As mentioned in the Introduction, the benefit of disentangling these separate direct and indirect effects is that a richer picture emerges regarding the pathways through which the measures influence support, in particular, whether it is via increased or decreased safety perceptions or via other routes, which are undefined but likely of a more personal nature (e.g., personal costs/inconvenience). This, in turn, can help policymakers or railway operators make more informed decisions; they may, for example, choose to put more weight on the effects operating via the safety perception and less on the remaining direct effects (or vice versa).

The choice experiment
The conceptualization discussed above underlies the construction of a stated choice experiment, which is developed to examine the support for policy packages. The attributes varied in this experiment concern the policy measures that have already been mentioned in the Introduction and which will be discussed further in the following (see Table 1 for an overview). We are mainly interested in measuring the respondents' infection safety perception and support for each policy package. To that effect, infection safety perception is measured on a five-point rating scale running from (1) 'much less safe' to (5) 'much safer' compared to the situation that applied at the time of research. In addition, support for policy packages is measured by a referendum-style question. Hence, respondents have to assume that they are asked to vote in favor or against a policy package in a referendum. Although in principle, we could pose more questions about other associations respondents might have with each policy package, we did not want to exhaust respondents with too many questions per package. Hence, while the indirect effect of the policy measures on support represents the infection safety impact, the direct effects represent the impact of all other evaluations of citizens of the policy measures.
In the following, we discuss the policy measures we selected to vary in the choice experiment and their expected impacts on safety perception and on other evaluations, although it is not clear beforehand which net impact they will have on support.

A first policy measure involves the wearing of masks by travelers. At the time of research, wearing a non-medical mask was compulsory for traveling by train. Non-medical masks are cheaper than medical masks and can even be homemade. Non-medical masks may help to prevent the wearer from spreading the virus, but they do not strongly protect the wearer from particles that may contain the virus spread by others (Peeples, 2020; CDC, 2021). Medical masks, on the other hand, would better protect the wearer from becoming infected. However, this policy measure is under political debate because it may lead to an increased shortage of medical masks, which are much needed in the health care sector. It is expected that masks, and especially the medical ones, increase perceived safety but decrease support because of the inconvenience of wearing them. On the other hand, relaxing this policy would involve making wearing masks non-compulsory, which is expected to be perceived as less safe, but more convenient.

A second policy measure involves introducing a seat reservation system, which ensures travelers of a seat and a guaranteed ride in times of high travel demand. This may be beneficial to travelers, though on the other hand having to make a reservation requires additional effort and makes traveling by train less flexible because advance booking requires selecting a particular train at a particular time. Although this limitation can be overcome by making re-booking possible as long as vacant seats are available, this still requires additional effort compared to the current situation without a reservation system (note that we use the term current here and in the remainder of this paper as a shorthand for the situation that applied at the time the data were collected). It may therefore be expected that having the possibility of making a reservation is preferred over a compulsory system. On the other hand, since Dutch trains currently do not have in-train facilities to signal which seats are reserved, travelers may realize that a non-compulsory system may lead to conflicts within the trains between travelers who have a reservation for a particular seat and travelers without a reservation who occupied a seemingly vacant seat. A compulsory system may avoid such situations because then every traveler needs a seat reservation.

Fig. 3. The conceptual model.
The next two policy measures involve pricing measures that aim at reducing the number of travelers in the peak hours, which involves the time periods 6:30-9:00 h for the morning peak and 16:00-18:30 h for the evening peak. The policy measures involve a surcharge for traveling in the peak hours and an extra discount for traveling in the off-peak hours. The latter is framed as an extra discount that comes on top of possible existing ones because several train subscription cards already offer a discount of 20% or 40% for traveling in off-peak hours. It is expected that many people will not prefer a peak hour surcharge but would welcome an extra discount in off-peak hours. On the other hand, citizens probably understand that the pricing measures reduce the number of peak travelers and therefore contribute to perceived safety.
The final two policy measures involve relaxing the guidelines for working from home and for online teaching in higher education, that is, colleges and universities. These are two generic measures the national government has taken to help prevent the coronavirus from spreading. For each measure, we examine two relaxation variants. For working from home, these are allowing a maximum of two days at the work location and no guidelines. For higher education teaching, these are teaching on campus either as before the corona crisis or with later starting times, well after the peak hours. The latter variant aims to keep students from traveling in peak hours and therefore helps avoid peak-hour crowding. This policy measure has been on the national government's wish list for many years, but it has never been implemented because it was not acceptable to colleges and universities, which refused to cooperate. However, the current corona crisis may change this. It is expected that relaxing work and education guidelines will decrease perceived safety but will have positive direct effects, as citizens value the opportunity to meet colleagues or fellow students. When constructing the experiment, we expected that the current seating policy, under which each two-seater could only be occupied by a single person, would not be relaxed in corona time. The reason for assuming this was that keeping 1.5 m distance is the cornerstone of the Dutch corona policy. However, about a month after we collected our data, this policy was relaxed, so, regrettably, this policy measure was not varied in our study.

E. Molin and M. Kroesen
The selected policy measures and their variants are combined into policy packages by making use of an orthogonal fractional factorial design, which resulted in 18 policy packages. We added its foldover design, which has the advantage that not only the main effects of all attributes are uncorrelated with each other, but these are also uncorrelated with all two-way interactions. Hence, should these interactions have played a role in evaluating the policy packages, then the estimated main effects of the attributes are still unbiased, which enhances the validity of the estimated parameters. Thus, in total 36 policy package profiles are constructed, which are blocked into 6 blocks of 6 packages each. Each respondent is randomly assigned to a single block and thus responds to 6 policy packages. An example of a constructed policy package and the questions posed for each package is presented in Fig. 4.
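The foldover property mentioned above can be checked on a deliberately small example. The sketch below uses a two-level half fraction with three attributes rather than the 18-profile design of this study (which is not reproduced here), so it only illustrates the principle: after adding the foldover, main effects are no longer correlated with two-way interactions. The sequential blocking and the helper names are likewise assumptions for illustration.

```python
import random
from itertools import combinations

# Half fraction of a 2^3 design: the third attribute is set equal to A*B,
# so its main effect is fully confounded with the A x B interaction.
base = [(a, b, a * b) for a in (-1, 1) for b in (-1, 1)]

# Foldover: mirror every level of every profile.
foldover = [tuple(-level for level in row) for row in base]
combined = base + foldover


def column(design, i):
    return [row[i] for row in design]


def interaction(design, i, j):
    return [row[i] * row[j] for row in design]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def worst_confounding(design):
    """Largest absolute dot product between a main effect and a two-way interaction."""
    n_attr = len(design[0])
    worst = 0
    for k in range(n_attr):
        for i, j in combinations(range(n_attr), 2):
            worst = max(worst, abs(dot(column(design, k), interaction(design, i, j))))
    return worst


print("base design:    ", worst_confounding(base))
print("with foldover:  ", worst_confounding(combined))

# Blocking as described in the text: 36 profiles in 6 blocks of 6,
# with each respondent randomly assigned to one block.
profile_ids = list(range(1, 37))
blocks = [profile_ids[i:i + 6] for i in range(0, 36, 6)]


def assign_block(rng):
    """Randomly pick the block of 6 profiles a respondent will evaluate."""
    return blocks[rng.randrange(len(blocks))]


print(assign_block(random.Random(0)))
```

A zero value for the combined design means every main-effect column is orthogonal to every two-way interaction column, which is the property that keeps main-effect estimates unbiased if such interactions are present.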

Data collection
The stated choice experiment was included in an online questionnaire, which also included questions on socio-demographic characteristics. Respondents were recruited between June 10 and June 24, 2020 by the participants of a data analysis course of the Bachelor program of the Faculty of Technology, Policy and Management of Delft University of Technology in the Netherlands. The participants were asked to fill in the questionnaire once themselves and to recruit people from their social network.
In total, 1396 respondents completed the questionnaire. Because recruitment was based on the social networks of students, the realized sample should be considered a convenience sample. To understand how this sample differs from the population of train travelers, Table 2 presents the distribution of several socio-demographic characteristics in the sample and in the population. The latter is based on the average number of train trips per day as measured by Statistics Netherlands (CBS, 2020). Comparing the distributions reveals that the sample is representative for gender, but that students and young persons are overrepresented. The population distribution suggests that the higher educated in particular travel by train in the Netherlands. Although the categorization of education differs between the sample and the population, the combined categories bachelor and master level in our sample suggest that the higher educated are overrepresented in our sample. Although Table 2 makes clear that, except for gender, the variables are not representative of the population, it also makes clear that all categories are represented in sufficient numbers to allow exploring any differences in support for the policy measures between the distinguished categories.

Model estimation
In line with our conceptual model presented in Fig. 3, we specified a mediation model. In our case, a model is estimated that includes two dependent or endogenous variables, that is, infection safety perception and referendum vote. Safety perception is measured on a 5-point response scale, which is assumed to be of interval level measurement. Hence, this part of the model is specified as a linear regression model. It allows examining how each policy measure affects infection safety perception (ISP). Hence, the following function to predict the ISP score for a policy package j is estimated:
$$\mathrm{ISP}_j = C^{\mathrm{ISP}} + \sum_i \beta_i^{\mathrm{ISP}} X_{ij}$$
In this equation, $C$ is the regression constant, $X_{ij}$ are the dummy coded policy measures i (see Table 1), and $\beta_i$ is a regression coefficient estimated for each policy measure, which denotes the contribution of each policy measure i to the infection safety perception score. Furthermore, superscript ISP is added to distinguish the estimated coefficients from those estimated in the second part of the model, which we discuss next.
# Note to Table 2: The population categories of education level are low, middle, and high, respectively, and these do not match the categories in our sample.
The second part of our model concerns the observed referendum vote, that is, voting in favor of or against a policy package. This is a dichotomous dependent variable and therefore this part of the model is specified as a binary logistic regression model, which is equivalent to a binary logit model. Consequently, the observed choices are transformed into the logit, which is done by taking the natural logarithm of the probability of choosing in favor ($P_F$) over the probability of choosing against a policy package ($P_A$). In choice models the logit is typically interpreted in terms of the structural utility ($V_j$) derived from a choice alternative. In this paper, however, we interpret the logit in terms of support for a policy package. Hence, the following function is estimated:
$$\ln\left(\frac{P_F}{P_A}\right) = C + \sum_i \beta_i X_{ij} + \beta_{\mathrm{ISP}}\,\mathrm{ISP}_j$$
Here, $C$ and $\beta_i$ are the constant and the coefficients of the policy measures in the support part of the model, and $\beta_{\mathrm{ISP}}$ is the coefficient of the measured safety perception score $\mathrm{ISP}_j$.
Once the parameters of this function are estimated, the logit of a policy package can be predicted and, subsequently, the share of travelers that vote in favor of the policy package can be predicted by $\left(e^{\mathrm{logit}} / (e^{\mathrm{logit}} + 1)\right) \times 100\%$. Because in this part of the model the measured infection safety perception score is included as a predictor variable in addition to the policy measures, the estimated direct impacts of the policy measures on support for a policy package are controlled for infection safety perception. Hence, the direct impacts capture all evaluations related to a policy package that are not related to infection safety perception. In addition, the policy measures are assumed to indirectly affect support mediated by safety perception. Hence, the first part of the model reveals how the policy measures affect the infection safety perception score and the second part of the model reveals how infection safety perception, in turn, affects the support for a policy package.
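The conversion from a predicted logit to a predicted share of in-favor votes can be sketched in a few lines. This is an illustration of the formula above only; the function names are hypothetical and the input values are arbitrary, not estimates from this study.

```python
import math


def share_in_favor(logit):
    """Predicted percentage voting in favor for a given logit (support score)."""
    return math.exp(logit) / (math.exp(logit) + 1.0) * 100.0


def logit_from_share(share_pct):
    """Inverse transformation: the logit that corresponds to a percentage in favor."""
    p = share_pct / 100.0
    return math.log(p / (1.0 - p))


# A logit of zero corresponds to an evenly split vote.
print(share_in_favor(0.0))
# Positive logits give a majority in favor, negative logits a majority against.
print(share_in_favor(1.0) > 50.0, share_in_favor(-1.0) < 50.0)
```

Because the transformation is non-linear, the same change in the logit produces a larger change in the predicted share near 50% than near the extremes.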
These two specified functions are simultaneously estimated in Mplus (Muthén & Muthén, 2018). Mplus is a software package for estimating structural equation models that enables disentangling direct and indirect effects between variables. In addition to variables of interval and ratio measurement level, Mplus allows including variables of lower measurement level as dependent variables, that is variables that have ordinal and nominal measurement level (see also Muthén, 2011).
In addition to disentangling the infection safety effect and other effects, we are interested in examining heterogeneity among citizens with respect to support for policy measures. To that effect, we estimated a Latent Class Choice Model (LCCM) with support, indicated by referendum vote, as the dependent variable. An LCCM assumes that classes of citizens exist that have different preferences but are relatively homogeneous within a class (Greene and Hensher, 2003). Hence, an LCCM estimates a set of parameters for each class. These classes are latent and emerge in the model estimation process. To that effect, a series of models, each with a different number of classes, is estimated, and model fit indicators together with the interpretability of the results determine the optimal number of classes. Since the models are not nested, Likelihood Ratio tests cannot be applied for this purpose (e.g., Ortúzar and Willumsen, 2011, Chapter 8). Instead, the AIC and the BIC are consulted (Gupta & Chintagunta, 1994; Kamakura & Russell, 1989), with lower values indicating a better fit. Since the mediation model already examines the impact of infection safety perception, this variable was not taken into account in this model.
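Comparing models with different numbers of classes by AIC and BIC can be sketched as follows, using the standard definitions AIC = 2k − 2LL and BIC = k ln(n) − 2LL. The log-likelihoods and parameter counts below are invented for illustration and are not the values of this study; using the number of choice observations (1396 respondents times 6 packages) as n is also an assumption, since the number of respondents is sometimes used instead.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


N_OBS = 1396 * 6  # assumption: every evaluated policy package counts as one observation

# HYPOTHETICAL values: number of classes -> (final log-likelihood, number of parameters)
candidates = {
    1: (-5400.0, 13),
    2: (-5200.0, 27),
    3: (-5050.0, 41),
    4: (-4950.0, 55),
}

fit = {
    n_classes: {"AIC": aic(ll, k), "BIC": bic(ll, k, N_OBS)}
    for n_classes, (ll, k) in candidates.items()
}
best_by_bic = min(fit, key=lambda n_classes: fit[n_classes]["BIC"])

for n_classes, scores in fit.items():
    print(n_classes, round(scores["AIC"], 1), round(scores["BIC"], 1))
print("lowest BIC:", best_by_bic)
```

The BIC penalizes additional parameters more heavily than the AIC for any sample of this size, so the two criteria can point to different numbers of classes; interpretability then has to guide the final choice.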
The LCCM is estimated in Apollo (Hess and Palma, 2019), a freeware package for estimating discrete choice models. We were able to estimate models from 1 to 4 classes, while a five-class model failed to converge. A model with four classes appeared to be optimal (see Table 3). In addition, because each individual has a certain probability of belonging to each class, a membership function was estimated to examine whether this probability is related to personal characteristics (e.g. Boxall and Adamowicz, 2002). The personal characteristics presented in Table 2 were included in the membership function. However, none of these variables turned out to be statistically significant and they were therefore removed from the model. The finding that belonging to a class is not related to any of the measured personal variables suggests that if we had realized a representative sample instead of our convenience sample, the results probably would not have been very different. However, this requires assuming that respondents who belong to a certain segment are representative of that segment. Furthermore, the Rho-square value of the four-class model is very low (0.094), suggesting there is still much heterogeneity among the respondents that is not captured by the four classes. Hence, we cannot rule out the possibility that class membership correlates with personal characteristics that we did not measure in our study. That the Rho-square for predicting support is much lower than in the mediation model is due to the fact that safety perception, which appeared to be a very strong predictor, is not included in this model.
In order to allow nominal attributes to enter the models, they need to be coded. For this purpose, we applied dummy coding. Table 1 shows how each of the attributes is coded by two dummy indicator variables.
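Dummy coding of a three-level attribute into two indicator variables can be sketched as follows, here for the mask policy with the policy that applied at the time of data collection as the reference level. The level labels, variable names, and the function are hypothetical and chosen for readability; the coding actually used is the one given in Table 1.

```python
def dummy_code(level, levels, reference):
    """Return one 0/1 indicator per non-reference level.

    The reference level is represented by all indicators being zero.
    """
    if level not in levels:
        raise ValueError(f"unknown level: {level}")
    return {f"is_{lvl}": int(level == lvl) for lvl in levels if lvl != reference}


# Hypothetical labels for the three mask policy variants
MASK_LEVELS = ["non_medical_mask", "medical_mask", "no_compulsory_mask"]
REFERENCE = "non_medical_mask"  # policy in force when the data were collected

for level in MASK_LEVELS:
    print(level, dummy_code(level, MASK_LEVELS, REFERENCE))
```

With this coding, the coefficient of each indicator expresses the change in the dependent variable relative to the reference level, and the constant absorbs the reference package.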
For each attribute, we selected the current level as the reference level, that is, the level that corresponded to the policy measure that applied at the time the data were collected. This means that the parameters estimated for the dummy indicator variables express to what extent the dependent variable, either safety perception or support, changes if the alternative policy measure is adopted instead of the policy measure that was implemented at the time of data collection. This also implies that the estimated constants reflect the current infection safety perception level and support level, respectively, of the reference policy package, that is, the policy package that applied at the time the data were collected. Although in principle the two pricing measures do not need to be coded because these are ratio level variables, they are coded in the same way as the other attributes, which allows direct comparison of all parameters in terms of weights (importance). In addition, dummy coding easily captures non-linear effects.
Table 4 presents the results of the estimated mediation model. The first column presents the impact of the policy measures on infection safety perception. To illustrate this, we consider the policy measure compulsory medical mask. Its estimated impact on safety perception is 0.126, which means that if this policy measure is adopted instead of the current policy measure compulsory non-medical mask, the safety perception score of the policy package increases by this number when no other changes are made at the same time. Since the direct effect of safety perception on support is exactly 1.000 (see the bottom of the second column), this has the very convenient consequence that in our model, the impact of each policy measure on safety perception is numerically exactly the same as its indirect effect on support. Thus, the indirect impact of compulsory medical mask on support mediated by safety perception is also equal to 0.126.
The second column presents the direct effect of each policy measure on support controlled for safety perception. For medical mask, this is −0.809, which indicates that this policy measure is less supported than the current policy measure non-medical mask. Likely explanations for this are its higher price and the shortage of medical masks in the health care sector: it is likely that at least some citizens do not wish to add to this shortage. Finally, the third column presents the total effect of each policy measure on support, which is simply the sum of the indirect and the direct effects that are presented in the first two columns. In the following, we discuss the direct, indirect, and total effects of each policy measure on support for a policy package.
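The decomposition for compulsory medical mask can be traced numerically with the coefficients quoted in the text. The sketch below is an illustration, not the estimation itself: the variable names are hypothetical, the coefficients are the rounded values reported in the paper, and the support score of 0.524 for the current policy package is the value reported with the model constants.

```python
import math


def share_in_favor(logit):
    """Predicted percentage voting in favor for a given logit (support score)."""
    return math.exp(logit) / (math.exp(logit) + 1.0) * 100.0


# Rounded values as quoted in the text
EFFECT_ON_ISP = 0.126      # medical mask -> infection safety perception
ISP_ON_SUPPORT = 1.000     # infection safety perception -> support
DIRECT_EFFECT = -0.809     # medical mask -> support, controlled for safety perception
BASE_SUPPORT = 0.524       # support score (logit) of the current policy package

indirect_effect = EFFECT_ON_ISP * ISP_ON_SUPPORT
total_effect = indirect_effect + DIRECT_EFFECT

base_share = share_in_favor(BASE_SUPPORT)
new_share = share_in_favor(BASE_SUPPORT + total_effect)

print(round(indirect_effect, 3), round(total_effect, 3))
print(round(base_share, 1), round(new_share, 1))
```

With these rounded inputs, the sketch gives roughly 63% in favor for the current package and roughly 46% after replacing non-medical by medical masks; small deviations from the shares reported in the paper are due to rounding of the coefficients.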

The mediation model
Note: Absolute t-values greater than 1.96 denote statistical significance at the 95% confidence level.
With respect to mask policy, we just discussed as an illustration that the more stringent variant medical mask is perceived to be slightly safer than the current policy of non-medical mask. On the other hand, this measure has a strong negative direct effect on support, which results in a relatively strong negative total impact. Hence, support for this more stringent policy is clearly lower than for the current one. To illustrate what this means in practice, the model is applied to predict the changes in voting behavior. The model predicts that 62.8% would vote in favor of the current policy package. If only non-medical mask were replaced by medical mask, while no other policy measures were changed, the in-favor vote drops to 46.0%, hence a drop of 16.8 percentage points. At the same time, completely abolishing the policy of compulsory mask would considerably decrease infection safety perception and consequently has a strong negative indirect effect on support, while its direct effect is the same as that of the current policy. Since wearing a mask is inconvenient, we had expected a positive direct effect. We do not have a ready explanation for this unexpected finding. The total effect of non-compulsory mask is a severe drop in support. In summary, the results show strong support for the current policy of compulsory non-medical mask.
A seat reservation system increases safety perception and this effect is stronger for a compulsory system than for an optional system. However, while an optional variant has a slightly positive direct effect, a compulsory reservation system has a clear negative direct effect. The total effects show that an optional seat reservation system is supported, while a compulsory system clearly is not. Thus, respondents support having the option to make seat reservations, whereas they oppose a policy that forces travelers to use such a system. The model predicts that a compulsory seat reservation system would result in a drop of 5.1% in the in-favor votes for a policy package, again assuming this is the only change made to the current policy package.
With respect to the two pricing measures, respondents perceive that both peak hour surcharge and off-peak discount increase safety, and thus their indirect effects are positive. On the other hand, as expected, a peak surcharge has a negative direct effect, while an extra off-peak discount has a slight positive direct effect. The total effects indicate that an off-peak discount can count on more support than a peak surcharge, although a 10% surcharge still increases support. The results make clear that the pricing effects are not linear. For both pricing measures, we find that a limited 10% change can count on higher support than a 20% change. In summary, the results suggest that both pricing measures are supported on the condition that the change is limited to 10%.
The results for working from home guidelines show that any relaxation of the current guideline, working from home as much as possible, decreases safety perception, resulting in a negative indirect effect on support. While this negative effect is rather strong for no guidelines at all, it is rather limited and not statistically significant for a maximum of two days at work. Surprisingly, both less stringent guidelines also have a negative direct impact on support, which again is limited for two days at work and stronger for no guidelines. Apparently, the wish to return to the office was not that strong at the time of research. The total effects indicate that a relaxation of the current policy to two days at work is almost as acceptable as the current policy, while a complete relaxation of the working from home guideline clearly is less acceptable. From these results it can be concluded that there is much support for the current working from home guideline, although a relaxation that allows being at work for some days is almost equally supported.

Table 5. The estimated binary logit model and LCCM.
Finally, any relaxation of the current guideline for higher education, online teaching as much as possible, clearly decreases safety perception. This effect is stronger for on-campus teaching with pre-corona, that is early, starting times than for later starting times. Apparently, citizens understand that later starting times avoid traveling in peak hours and that this limits crowding in the peak. On the other hand, the direct effects are positive, and more strongly so for late starting times than for early starting times. The total effects show that teaching on campus with late starting times is about as acceptable as the current policy, while early starting times receive much lower support. In summary, there is support for the current teaching online policy, but also for on-campus teaching with late starting times. Compared to the working from home guideline, there is somewhat more support for relaxing the online teaching guideline.
An estimated constant, in general, reflects the score of a dependent variable when the values of all predictor variables are zero, which in this study represents the current policy package. With respect to safety perception, the estimated constant is equal to 3.276; hence, the safety perception score of the current policy package is close to the middle value of the rating scale. This is no surprise since the middle value was defined as 'equally safe as the current situation'. The constant estimated for support is negative (−2.752), which is the hypothetical score for the current policy package if its safety perception score were zero. Adding the safety perception score of the current policy package, which is 3.276 as just discussed and which enters with a coefficient of 1.000, reveals the support score of the current policy package. This is equal to 0.524, which means that more people would vote in favor of the current policy package than against. To be precise and as discussed earlier, the model predicts that 62.8% would vote in favor of the current policy package.

The latent class choice model
The parameters of the estimated LCCM are presented in Table 5, together with the results of the binary logit model. The labeling of the class is based on the estimated constant, because this indicates whether the current policy package is either supported by a class as indicated by a positive constant or not supported, as indicated by a negative constant. Two classes have a rather high positive constant, which indicates that they are strong supporters of the implemented policy. One class has a low positive constant and is therefore labeled as weak supporters. Finally, a single class has a strong negative constant and is labeled opposers. In the following, each of the four classes is briefly characterized.
The first strong supporters class, to which 20% of the respondents belong, is strongly in favor of the current mask policy and particularly against entirely abolishing the mask policy. They are also against a seat reservation system, in particular in case its use is made compulsory. They seem to be somewhat against pricing measures, but none of these effects is significant. They are only mildly against working two days at the office but against abolishing the current working from home guideline. Finally, they are somewhat supportive of on-campus education, in particular when starting times are late. Overall, this class strongly supports the current policy in particular the mask policy, however, they seem to oppose additional policy measures and lean more towards relaxation of the current policy measures.
The second strong supporters class, which is a little bigger than the first one (27%), differs from the first class in the following respects. This class is a bit less outspoken with respect to the mask policy and pricing measures, but it is clearly in favor of a seat reservation system, even if its use is made compulsory. Members of this class clearly support the working from home and online teaching policy and in particular oppose on-campus teaching, and they hardly make a distinction between two days at work and completely abolishing the working from home guideline. Overall, this class strongly supports the current policy and does not seem to oppose additional policy measures but does oppose going back to work and on-campus education.
The weak supporter class (34%) is not very outspoken with respect to any of the policy measures. This class' members seem to be somewhat more in favor of pricing measures than the strong supporter groups and most so for off-peak discount. And they seem to be more supportive of allowing two days at work and on-campus teaching, but similar to the other two supporter classes, its members support the working from home guideline. Overall, this class seems to be a less outspoken version of the first strong supporters class, and thus also leans towards relaxation of the policy measures.
Finally, the opposers class (19%) is the only class that is against the current mask policy and strongly supports abolishing any compulsory mask policy. Furthermore, its members are not in favor of a seat reservation system. On the other hand, this class is most supportive of pricing measures: members are clearly in favor of off-peak discount, while they are also somewhat supportive of peak surcharge. Similar to all other classes, they are supportive of the working from home guideline, but the relaxation of the other work and teaching guidelines is acceptable to this class.
The results of the LCCM presented in this subsection suggest considerable heterogeneity in support of the policy measures among train travelers, an insight that is not provided by the binary logit model. To illustrate this point: the binary logit model suggests that at the aggregate level, teaching on campus with later starting times can count on as much support as the current policy measure, that is, teaching online. This result by itself cannot tell whether citizens are indifferent with respect to this policy measure or whether there are about equal numbers of supporters and opposers of this policy measure that cancel each other out at the aggregate level. The LCCM shows that the large majority of train travelers is indifferent with respect to this policy measure. A similar result is found for the policy measure reservation possible. Insight into heterogeneity provides relevant information for policymakers, because if citizens are really indifferent with respect to an alternative policy measure, policymakers may as well implement the alternative measure when they believe this is more beneficial. However, when policymakers know that a considerable group fiercely opposes a policy measure, another situation emerges because they have to weigh the benefits of implementing the alternative measure against the social unrest it may bring along.

Conclusion and discussion
This study examines the infection safety perceptions and overall support for various policy packages that aim to enable safe travel by train during the ongoing corona pandemic. In general, the results indicate support for the current policy package, which consists of a compulsory policy to wear non-medical masks, no seat reservation system, no peak or off-peak pricing policies, and working from home and online education as much as possible. Based on the model parameters, 62.8% is expected to vote in favor of this package in a referendum setting. That being said, some specific policies would be favored over the current situation, namely an optional seat reservation system and an off-peak discount of 10%. Another relevant finding is that, while the results clearly indicate that having no guidelines for working from home is strongly opposed, respondents are largely indifferent between the strictest variant of this policy (working from home as much as possible) and the intermediate one, that is, a maximum of 2 days at work. Here, it is also interesting to note that the policy of a maximum of 2 days at work does not significantly decrease the safety perception of traveling by train. This result may perhaps also reflect a latent need to work at the workplace.
We argued that by separating the total effects of the policy measures on support into indirect effects via safety perception and remaining direct effects, a richer picture emerges to make more informed policy decisions. This point can nicely be illustrated with regard to the policy related to higher education. Here, the total effects indicate that respondents are indifferent between the options of teaching online as much as possible and teaching at campus with late starting times. Keeping everything online would therefore seem to be the most sensible thing to do, as this policy is clearly the safest and respondents support it equally to the option at campus with late starting times. Yet, the indirect and direct effects of the option teaching at campus with late starting time indicate that, while this policy is perceived to decrease the perception that travel by train is safe in comparison to no guidelines, it does have a direct positive effect. Hence, respondents do attach value to having education at campus. In the computation of the total effect, this positive utility is entirely canceled out by the indirect negative effect via safety perception. This result indicates that switching back to on-campus education will likely be appreciated if people consider it also safe to allow travel for this purpose. It, therefore, illustrates the relevance of having knowledge about the different pathways.
Finally, the results of the latent class model again support the overall conclusion that there is much support for the current policy package: only a single class was found that clearly opposes this package representing 19% of the respondents. Another policy-relevant finding is that the two classes that are generally in favor of the current policy package (class 1 and 2) have very different opinions with regard to the seat reservation system, class 1 clearly disfavouring it and class 2 favoring it. Hence, while the overall model (Table 3) suggests that people are largely indifferent with regard to an optional or compulsory seat reservation system compared to not having such a system, the latent class model reveals strongly opposing opinions. This is an important finding to take into account should the railway operator indeed consider implementing such a system. The operator should then make an effort to convince a substantial portion of the population of its added value.
Lastly, the fact that none of the socio-demographic characteristics significantly influences latent class membership is something to highlight and further discuss here. As mentioned above, it means that differences in support cannot be linked to particular groups in terms of gender, age, level of education, etc. Especially with respect to age, this is a surprising result since the mortality risk of Covid-19 is positively associated with age. We can provide two explanations for this result. Firstly, the setup of the survey (i.e. the stated choice experiment) is such that it stimulates people to make relative judgments and not absolute ones. Hence, when confronted with varying policy measures, respondents are put into the position to evaluate whether these would make travel safer or less safe compared to perceived safety under the current policy and not whether they consider travel by train to be safe in an absolute sense (under the stated conditions). It makes sense that younger people make the same relative evaluations as older participants. A second reason may be that respondents in the experiment are evaluating the policy packages from a collective (and not individualistic) viewpoint, i.e. for each package, they are essentially asked to judge whether it is sensible to implement the policies on a national level or not. It may be speculated that this also creates some distance, and that, as a result, people are less likely to consider their personal circumstances when evaluating the measures. Related to this latter reason, it may also be speculated that support or opposition regarding the measures is mainly politically driven. Tentative empirical evidence in favor of this notion was recently provided by Farias and Pilati (2020), who showed that, in the US, political partisanship is a significant predictor of support for prevention measures. Whether this is also the case for the Netherlands, or more generally for European countries, remains to be assessed. In any case, this would be a relevant thesis to explore in future work.
A limitation of this study is that the results stem from a non-representative sample of train travelers in the Netherlands. Students of a methodology course filled out the survey and recruited persons from their social network to do the same. This resulted in a clear over-representation of students and young persons in general. Although we could not find any systematic relationships with measured personal characteristics, suggesting that over- and under-representation of several groups did not severely bias the results, we cannot rule out that results may correlate with personal characteristics we did not measure in this study. Moreover, we have to assume that the persons in our sample who belong to a particular segment are representative of all persons in that segment in the population. Because respondents were recruited from a particular student group, and thus form a non-random sample, this representativeness is not guaranteed. Hence, we have to be cautious in generalizing the results of this sample to the population of train travelers. More representative samples should be realized before the results can be used more confidently for policy making.
To conclude, this research has revealed to what extent various policy measures are perceived to influence the infection safety perception of traveling by train and to what extent they are supported at a particular moment in time, namely in June 2020. Given that the corona pandemic is constantly evolving, that is, that the numbers of infections decrease and increase and thus actual risks of infection change, and that new information about the virus is continuously becoming available, it seems likely that respondents' opinions regarding the measures will also change in response to this. In addition, people who travel by train gain new experiences that may influence their opinions. Hence, it would be worthwhile to repeat the survey to assess how, and in which directions, the perceptions and levels of support change in various phases of the corona pandemic.
That being said, it seems likely that the general support observed in this study for the current policy package will have a lasting nature. In addition, the finding that some measures may decrease perceptions of infection safety but increase support via other likely more personal pathways reveals a general mechanism that is important to take note of, namely that people can have conflicting evaluations leading to ambivalent (/neutral) overall levels of support. This finding strengthens the need to better understand through which pathways support comes about (or not), since this knowledge may be leveraged by policymakers to develop and/or use those arguments that 'tap into' specific pathways and are therefore most effective to increase overall levels of support.