The sounds of safety silence: Interventions and temporal patterns unmute unique safety voice content in speech

Research shows that withholding safety concerns on encountering hazards – safety silence – is a critical contributor to accidents. Studies therefore aim to prevent accidental harm through interventions for reducing safety silence. Yet, the behaviour remains poorly understood, obstructing effective safety management: it is unclear to what extent safety silence involves muted safety voice (the partial withholding of safety concerns), and how muted safety voice can be recognised in speech, may be measured based on the degrees and types of safety voice (speaking up about safety), progresses over time, and may be optimally reduced. To improve safety management, this study proposes a conceptual model for the manifestation of safety silence and muted safety voice using a laboratory experiment (N = 404) to evaluate the implications for the effectiveness of three interventions (salient hazards, clear responsibilities, encouragements) across stages of a hazard. Results indicated that safety silence and muted safety voice are measurable in terms of the degree to which concerned people engage in five types of safety voice at different points in time, and we revealed this is important for safety management: interventions only unmute safety voice at unique hazard stages and for knowledge-based speech when people are concerned. This indicates that safety silence and muted safety voice are situated and can be recognised in nuanced speech, with interventions being most effective when timed appropriately and people have safety concerns to speak up about.


Introduction
Safety silence is the act of withholding safety concerns about accidental harm (Schwappach and Richard, 2018;Tucker et al., 2008). In social and organisational settings, the act of speaking up about safety (termed 'safety voice') is recognised as crucial for mitigating hazardous conditions (Okuyama et al., 2014) and an ethical and financial imperative (Novak, 2019). However, people often do not engage in safety voice upon encountering a hazard (Noort et al., 2019a), and safety silence has contributed to tragic outcomes in transportation (e.g., aerospace; Bienefeld and Grote, 2012;Cocklin, 2004;Moorhead et al., 1991;Tarnow, 1999), offshore oil drilling (Reader and O'Connor, 2014) and healthcare (Bromiley and Mitchell, 2009;Francis, 2013). Consequently, reducing safety silence is integral to improving organisational safety performance (Griffin and Neal, 2000;Hofmann et al., 2003).
Safety voice theory suggests that interventions can reduce the likelihood of safety silence (Noort et al., 2019a), for instance, by improving hazard salience (Tucker et al., 2008), people's felt responsibility (e.g., Duan et al., 2017) and leaders' inclusiveness (e.g., Barzallo Salazar et al., 2014;Burris, 2012). Yet, how interventions reduce safety silence remains unclear, as there has been little insight into how safety voice manifests when individuals speak but do not refer explicitly to the perceived risk (e.g., speaking less overall, or only about safety); the extent to which speech conveys clues that people are concerned; and the degree to which interventions target unique aspects of the behaviour (e.g., content, time-points). Presently, conceptual models recognise that the relationship between safety voice and safety silence is not a binary one, but may occur with different gradients (e.g., Jones and Kelly, 2014;Noort et al., 2019b) and be characterised by different types (e.g., explicit, respectful, oblique voice; Krenz et al., 2019;Pian-Smith et al., 2009). However, the nature of this relationship has not been theorised as a conceptual model that can capture the extent to which individuals engage in safety voice, safety silence or 'muted safety voice' as a degree between voice and silence (i.e., where individuals speak, but do not explicitly raise issues). Furthermore, while measures have been developed to capture distinct types of safety voice (e.g., Krenz et al., 2019), the degree to which interventions reduce safety silence as a continuous (i.e., the degree and timing of speech) and categorical phenomenon (i.e., content of speech) is unclear.
For effective safety management, it is important to propose models and measures for the content, timing and degree to which safety voice manifests in speech. Without such models, i) interventions may not optimally improve the flow of safety information (Westrum, 2014) that is necessary for mitigating the dysfunctional momentum of hazardous scenarios towards accidents (Barton and Sutcliffe, 2009); ii) accident analyses may not correctly identify the extent to which speech contained relevant safety information; iii) the behaviour's quality and impact on accident prevention may not be assessed (Kolbe et al., 2013); iv) research would provide limited conceptual clarity, and measurement, on the behaviour it aims to improve; and v) interventions may fail to acknowledge that operators may engage in a mixture of voice and silence behaviours (with their frequency, urgency and timing shaping the strength and effectiveness of voice). More tangibly, organisations might only be able to engage in limited safety management because senior staff would be unable to optimally recognise safety concerns in verbal or written communication, interventions would not be precisely targeted towards unique speech patterns, and training programmes would be suboptimal at enabling staff to spot signs of safety concerns. Thus, in order to improve safety management, models, measures and interventions are needed to address how safety voice manifests when individuals speak but do not explicitly refer to the risks in question (i.e., muted safety voice).
Therefore, in applying a validated experimental scenario (Noort et al., 2019b), the current study aims to improve safety management by contributing a conceptual model for how the manifestation of safety concerns in speech may be measured and by evaluating how interventions and time unmute safety voice.

Conceptualising the degree of safety voice
Safety silence is the act of withholding safety concerns during hazardous scenarios (e.g., Tucker and Turner, 2011) and is contrasted with safety voice: the act of raising safety concerns through discretionary verbal expressions (Conchie et al., 2012;Tucker et al., 2008). Although few studies have focused directly on safety silence, it has been implicit in research on communication and safety (Noort et al., 2019a). Due to its importance for safety management, safety silence is integral to behavioural models and measures of organisational safety (Griffin and Neal, 2000;Hofmann et al., 2003), safety culture and climate (Reader et al., 2015;Zohar, 2010), safety citizenship (Didla et al., 2009), and safety leadership (Barling et al., 2002). Furthermore, by virtue of safety silence involving the withholding of communication (e.g., reporting errors, advocating safe practice, transmitting warnings), insights from other voice concepts have been applied to conceptualise safety silence antecedents (e.g., whistleblowing, upward dissent, employee voice and silence; Kassing, 2002;Morrison, 2014;Near and Miceli, 1985). However, safety silence is different from other behavioural safety (e.g., hand-washing, performing checklists) and voice/silence concepts because it involves the withholding of concerns in any hazardous setting. Moreover, it extends to non-employees (e.g., patients reporting on deteriorating health, minibus passengers speaking up about poor driving; Entwistle et al., 2010;Habyarimana and Jack, 2011). Therefore, safety silence requires a distinct conceptualisation to understand the types of behaviours that constitute the phenomenon and their relationship to safety voice in order to reduce silence during critical incidents.
Thus, while safety voice is rooted in the extent to which people engage in information processing about risk from hazards (Noort et al., 2019b;Schwappach and Gehring, 2014a), safety silence is conceptualised as the absence of communicating about risk. In addition, research has suggested that a middle ground may exist on the continuum between safety voice and safety silence in which individuals only partially raise safety concerns (e.g., Jones and Kelly, 2014). For instance, individuals may merely hint at concerns (Fischer and Orasanu, 2000;Orasanu and Fischer, 1992), or speak less about risk. This behaviour may be best labelled 'muted safety voice' (i.e., in contrast to 'strong safety voice') to capture that individuals' safety concerns are 'on mute', with interventions 'dialling up or down' the degrees of safety voice or safety silence. Thus, for example, interventions to reduce safety silence (or 'unmute safety voice') may be understood to move the balance on the continuous scale towards enabling people to better communicate about risk, and vice versa.
Risk is important for conceptualising safety silence because it elicits safety concerns. Situational (e.g., actual levels of risk) and individual factors (e.g., knowledge, skills, experience) may lead to variation in risk perception (Slovic, 1987), alter the interpretation of safety silence and muted safety voice, and reduce the effectiveness of interventions. For instance, if individuals are unconcerned about hazards (e.g., falling ill from COVID-19), then safety silence indicates that they would have nothing to say, and their speech would be unrelated to safety. Conversely, for concerned individuals, safety silence would indicate that personal or contextual factors have fully muted their safety voice; although their speech reflects safety concerns, their silence on specific risks indicates the strength of the muting factors. Safety silence and muted manifestations of safety voice are therefore not captured by the mere absence of voice (Brinsfield, 2013;van Dyne et al., 2003) or safety-related communication during hazardous situations (i.e., it is unclear whether concerns are withheld) because individuals may vary in the degree to which concerns are uttered in safety-related speech. Nevertheless, few safety voice studies have investigated how people may speak up about safety concerns in strong and muted ways in dynamic situations, with their behaviour at different moments ranging on a continuous scale from safety silence, through muted safety voice, to strong safety voice (see Table 1).

The manifestation of safety concerns in speech
Theoretical insights into the nature of safety silence are ambiguous because the literature has conceptualised and assessed its manifestation inconsistently (Mumford, 2015), using inconsistent labels. The statistics representing the extent of safety voice, post-hoc statements on silence, and available measures of safety voice and silence (Manapragada and Bruk-Lee, 2016;Tucker and Turner, 2011) have not conceptualised and measured how safety concern may manifest in distinct ways when individuals speak but do not explicitly refer to safety. An exception to this are the different types of silence proposed for employee silence (Brinsfield, 2013; van Dyne et al., 2003), such as defensive, deviant and relational silence. However, these capture motives for withholding voice and do not clarify the extent to which concerns are reflected in muted safety voice.
Arguably, motives should be distinguished from actions because identical motives (e.g., the desire to prevent harm) may lead to different utterances (e.g., 'please be careful', 'stop doing that') and are not necessarily expressed in the content (e.g., when sharing safety limits). Research on aircrew conversations provides another notable exception (e.g., co-pilots providing hints or questioning captains; Fischer and Orasanu, 2000;Sassen, 2005), but this literature has not delivered a model for the manifestation of safety silence in speech.
Additionally, it is uncertain whether insights into safety voice can be applied to the muted manifestation of safety concerns in speech. While research suggests that safety voice and silence may be distinct behaviours (Sherf et al., 2020), studies have rarely conceptualised and operationalised safety silence in terms of the withholding of safety concerns (Noort et al., 2019b). Furthermore, such research may have confounded concerned and unconcerned participants because it did not assess the extent to which participants were concerned about encountered situations. There have been few attempts to investigate the degree to which antecedents can unmute safety voice, and how this varies over time. This knowledge appears critical for establishing the effectiveness of interventions to 'unmute' safety voice (e.g., providing encouragement; Barzallo Salazar et al., 2014).
In this way, safety voice and silence may not be dichotomous. In dynamic safety-critical scenarios, individuals may raise their concerns to varying degrees, through distinct content, and by implicit statements, all determining the effectiveness of safety voice (Kolbe et al., 2013). It is necessary to conceptualise this phenomenon in order to disentangle distinct content of the behaviour and to improve safety management by establishing precise interventions, enabling the recognition of safety concerns in speech and supporting training programmes and accident analyses to reduce safety silence. By using an experimental paradigm and analysing speech, the current study aims to improve safety management by i) measuring how safety voice manifests in speech for people engaging in safety silence, muted safety voice and explicit safety voice, and ii) enabling specific interventions to reduce safety silence. To this end, this research contributes a conceptual model of how safety silence can be measured relative to safety voice in scenarios that elicit safety concerns, as well as concrete indicators that can be used to test the effectiveness of interventions for unmuting safety voice at different time-points.

Current study
The current study investigates the degree to which individuals engage in safety voice, muted safety voice, and safety silence, and the implications of this for interventions, using a previously validated experimental scenario. The 'Walking the Plank' paradigm presents participants with an apparent safety problem (a research assistant walking across a 'weak' and elevated wooden plank; Noort et al., 2019b) that can only be mitigated by raising safety concerns. Because the scenario measures hazard perceptions, it enables the assessment of safety silence behaviours in relation to contextual variables (Noort et al., 2019b). Moreover, this scenario enables controlled investigation of the extent to which individuals raise safety concerns. Genuine hazards cannot be ethically introduced or prolonged in naturally occurring scenarios (preventing in-situ data collection), while post-hoc self-report surveys about previous hazardous situations provide uncertain data (e.g., due to silence being socially undesirable, erroneous memories). The debated generalisability of experiments notwithstanding (e.g., naturally occurring scenarios may pose stronger hierarchies; Gigerenzer, 1984;Jiménez-Buedo and Miller, 2010;Mitchell, 2012;Nembhard and Edmondson, 2006), safety silence behaviour can be directly observed using this paradigm. In addition, a model for measuring safety concerns in speech can be tested and interventions can be evaluated for their success at unmuting safety voice. Accordingly, the current study investigates how safety silence and muted safety voice manifest by controlling situational variables. The investigation of broader variables (e.g., individual differences), although valuable, is beyond the scope of this study.

Measuring safety silence and muted safety voice
Our first research question investigates the degree to which safety silence can be measured in relationship to safety voice based on the extent to which individuals' speech reflects safety concerns. Theory suggests that, during safety-critical scenarios, people can say nothing (i.e., acoustic silence; Kurzon, 2011), engage in unrelated speech (i.e., veiled and thematic silence; Kurzon, 2011;Morison and Macleod, 2014), or raise concerns. When concerns are fully withheld (i.e., safety silence), it appears self-evident that this may appear as no speech or unrelated speech. Conversely, if individuals are concerned, safety voice is the strongest way to express concerns in speech. Yet, importantly, people may only partially withhold safety concerns and produce some less meaningful communication on safety that is not captured by binary concepts (Jones and Kelly, 2014). This is because the degree to which specific themes (e.g., safety information, the desire to avoid harm) feature in conversations can vary (i.e., muted safety voice). Concerns may therefore manifest in speech as a continuous (i.e., the degree to which safety voice is muted) and categorical phenomenon (i.e., the content of safety voice speech). Thus, safety voice and safety silence may be best conceptualised as the degree to which individuals speak about safety concerns that are reflected in distinct safety themes.
Specifically, it is proposed here that the degree to which safety voice is muted can be measured based on distinct types of safety voice related to safety knowledge and safety motivation. Initial evidence exists for distinct ways to raise concerns (e.g., respectful, explicit, oblique; Friedman et al., 2015;Kassing, 2002;Krenz et al., 2019;Park et al., 2013;Pian-Smith et al., 2009), which may be used to conceptualise and measure safety silence while a unified model is lacking. In particular, arguably, because i) beliefs and intentions provide the content of communication (Searle, 2008), and ii) because safety knowledge and motivation shape safety participation behaviours such as voice (Christian et al., 2009), safety knowledge and motivations should manifest in distinct speech. For instance, encountered hazards can prompt different perceptions about safety (e.g., uncertainty on safety limits, concern for others' wellbeing; the content of safety voice); if people discuss safety concerns with others, they make sense of perceived risks and evaluate intentions to avoid harm through safer action (Brinsfield, 2013;Gruman and Saks, 2014;Searle, 2008;Turner and Gray, 2009). Therefore, when people partially withhold safety concerns, speech should reflect less discussion about safety knowledge and motivations, which may appear as five types of muted safety voice speech: i) informative, ii) inquisitive, iii) prohibitive, iv) cautionary, and v) oblique safety voice.
That is, first, people can discuss safety information by raising safety concerns. When people warn others, they declare their safety beliefs (Searle, 2008). Evidence indicates that people raise safety concerns by appealing to facts or better solutions (Kassing, 2002), presenting logical arguments (Schwappach and Gehring, 2014b), or, conversely, requesting clarification (Pian-Smith et al., 2009). This enables people to make sense of the nature of anticipated or encountered hazards (Weick, 2010) and evaluate appropriate actions. Thus, it may be expected that muted safety voice manifests as less discussion of safety knowledge through less provision of safety information (i.e., informative safety voice) and fewer requests for clarification (i.e., inquisitive voice).

Hypothesis 1.b: Muted safety voice manifests as less inquisitive safety voice.
Second, through raising concerns, people can express the desire for intended states of the environment (e.g., taking action, avoiding harm; Searle, 2008). Because safety motivations lead to safety voice (Christian et al., 2009), safety silence may be reflected in less speech that clarifies the intention to avoid harm, such as prohibitive statements (e.g., 'please, stop that') or tentative cautionary statements (e.g., 'be careful'). Prohibitive (i.e., explicit, blunt) statements enable clarity on the desire to avoid harm (Krenz et al., 2019) by crisply advocating for different actions (Pian-Smith et al., 2009) or threatening with resignation (Kassing, 2002). Cautionary statements express a desire to avoid harm while conveying more respect (Krenz et al., 2019). Because the motivation for avoiding harm reduces safety silence (Christian et al., 2009), this may be reflected in speech, and muted safety voice may therefore involve less prohibitive and less cautionary safety voice.

Hypothesis 1.d: Muted safety voice involves less cautionary safety voice.
Finally, people may raise concerns through unclear utterances (e.g., 'okay', 'ha?', 'hmm?', 'joar?', 'how?'; Krenz et al., 2019) and oblique speech (Pian-Smith et al., 2009), which merely hint at concerns held (Fischer and Orasanu, 2000). These utterances provide unclear content; consequently, the relationship with expressing safety knowledge and safety motivation is not straightforward. This manifestation of safety silence may emerge because the hesitancy to express safety concerns (e.g., due to the higher cost of speaking up) leads to mitigated speech (Edmondson, 1999;Fischer and Orasanu, 2000) that appears in partial statements or utterances that are not explicit but that imply concerns in situ. Thus, finally, it is expected that safety silence may manifest in less oblique speech.

Hypothesis 1.e: Muted safety voice involves less oblique safety voice.

Unmuting safety voice
Our second research question investigates the degree to which interventions unmute safety voice. Safety voice behaviour is attenuated (e.g., in occurrence, assertiveness of communication, repetition, explicitness) by situational variables (e.g., leadership styles, national culture; Barzallo Salazar et al., 2014;Rhee et al., 2014;Weiss et al., 2018) and varies in effectiveness for how individuals and groups (e.g., safety managers, flight crews, operating teams) understand and decide on safety (e.g., problem-solving, being listened to; Jones and Kelly, 2014;Orasanu and Fischer, 1992). This means that it is important to investigate the relationship between situational variables and the manifestation of safety silence in order to design effective interventions (Noort et al., 2019b). Yet, few studies have directly observed safety voice while manipulating interventions (Barzallo Salazar et al., 2014;Friedman et al., 2015;Hodges, 2018). Furthermore, studies that have manipulated interventions while assessing variation in participants' safety concerns and speech remain scant. Without this type of assessment, studies i) assume that hazards elicit concerns, ii) confound concerned and unconcerned participants, and iii) may not reduce the active withholding of safety concerns but increase the perception of risk. Accordingly, there is a need to evaluate the degree to which interventions can reduce safety silence.
Here, it is proposed that interventions for unmuting safety voice work optimally when concerned participants engage in more safety voice speech. Research using the Walking the Plank paradigm has indicated that safety silence is associated with participants reporting that they are unaware of hazards, feel less responsible and worry more about the consequences of speaking up (Noort et al., 2019b). This is consistent with proposed interventions for increasing hazard salience (Tucker et al., 2008), felt responsibility (e.g., Duan et al., 2017) and for providing encouragement (e.g., Barzallo Salazar et al., 2014;Burris, 2012). By applying these manipulations, one is able to evaluate the conceptual model against the literature.
First, safety voice is associated with people being aware of (Lindberg et al., 2013;Manias, 2015) and concerned about hazards (Gurung et al., 2017;Manapragada and Bruk-Lee, 2016;Schwappach and Gehring, 2014a, 2014c). Arguably, this leads to reduced safety silence because salient hazards (e.g., reminders of death) elicit risk perceptions by increasing perceived threat and uncertainty (i.e., outcomes are not clear a priori; Burke et al., 2010). Theory on risk communication and uncertainty management suggests that uncertainty can be managed through information-sharing (e.g., speech), which creates shared awareness, (dis)confirms risk perceptions and evaluates appropriate actions (Brashers, 2001;Lindell and Perry, 2012). Increasing hazard salience should therefore manifest in more safety voice.
Hypothesis 2.a: Salient hazards unmute safety voice when people are concerned.
Second, ample research has indicated that felt responsibility for situational outcomes increases voice (Aydon et al., 2016;Bickhoff et al., 2016;Duan et al., 2017;Jackson et al., 2010;Lyndon, 2008;Malvey et al., 2013;Manias, 2015;Nembhard et al., 2015;Schwappach and Gehring, 2014c). This is because clear responsibilities increase the intention to communicate in order to i) decide on appropriate action (Fischer et al., 2011;Lindell and Perry, 2012;Weiss et al., 2018), ii) redefine optimal performance (Fuller et al., 2006), and iii) explicitly prevent harmful outcomes (Weiss et al., 2014). Clear responsibilities describe the accountability for situational outcomes (e.g., harm) and increase the willingness to accept accountability for future consequences (Fuller et al., 2006). This may legitimise the sharing of safety knowledge through group norms for communicating risk. Thus, safety voice may be unmuted by increasing the extent to which people feel responsible for the outcomes of hazardous situations.

Hypothesis 2.b: Felt responsibility unmutes safety voice when people are concerned.
Third, encouragement can communicate favourable norms for speaking up. Research indicates that people speak up more to receptive leaders (e.g., through transformational leadership styles; Bickhoff et al., 2016;Nembhard and Edmondson, 2006). This is because explicit communication is more likely when others are supportive (Brashers, 2001;Lindell and Perry, 2012) and the costs of safety voice are low (Edmondson, 1999;Fischer et al., 2006;Lindell and Perry, 2012). Supporting this, encouraged participants have been shown to be more likely to speak up (Barzallo Salazar et al., 2014). Accordingly, safety silence should be reduced by providing encouragement.

Hypothesis 2.c: Encouragement unmutes safety voice when people are concerned.
It has been argued above that muted safety voice can manifest in speech as a continuous (i.e., degrees of speech) and categorical phenomenon (i.e., types of speech). This suggests that interventions may only unmute safety voice for specific manifestations. Because insights remain scant, this study explores how interventions unmute specific types of safety voice. Arguably, safety silence may be reduced most in terms of speech related to safety knowledge (i.e., inquisitive and informative safety voice). This is because hazard salience, felt responsibility and encouragement involve clarity on safety information and the norms for communicating this. In exploring this, this study aims to reveal whether interventions for unmuting safety voice should be tailored to different types of safety voice.

The effect of time on unmuting safety voice
Our third research question investigates the degree to which safety silence and muted safety voice manifest differently over time. Time provides a natural influence on communication; yet, few studies have conceptualised temporal differences in safety silence and muted safety voice or the effect of interventions across stages of hazardous scenarios. As an exception, Farh and Chen (2018) showed that intervention success depends on intervention timing (i.e., preparation versus execution of procedures). This indicates that safety silence and muted safety voice may manifest differently across stages of hazardous scenarios, with interventions targeting distinct aspects of safety silence. Arguably, in temporal order, hazardous scenarios may i) be anticipated as a potential future state (e.g., designing new systems, planning routes), ii) be physically encountered (e.g., medical alarms sounding), and iii) provide the potential for imminent harm (i.e., initiated actions with impending outcomes). In the first two stages, harm is not immediate and remains distal compared to initiated actions that require immediate action. Arguably, this may elicit more conceptual evaluations (i.e., knowledge-based speech) in the early phases of hazards, and more discussion of the intention to avoid harm in later stages (i.e., motivation-based speech).

Hypothesis 3: As hazardous scenarios progress, knowledge-based speech is muted while motivation-based speech is unmuted.
In addition, the current study explores intervention effects over time. Little evidence exists to enable explicit hypotheses. However, because hazard salience, felt responsibility and encouragement involve clarity on safety information and the norms for communicating, interventions may be more effective at unmuting safety voice in the early stages of the hazard.

Design
Within a laboratory environment, participants engaged in the validated Walking the Plank paradigm (Noort et al., 2019b). Under the guise of a creativity study, this paradigm presented an apparent hazard of walking a footbridge (i.e., the plank supposedly only held 30 kg) and enabled the direct observation (through video recording) of safety silence in response to controlled hazards.
The protocol had three stages. First, after obtaining informed consent, participants engaged in a 5-minute creativity task where they described the possible uses of a plank and four blocks of wood. Second, participants engaged in a task with a research assistant to test the feasibility and creativity of the ideas of a 'previous participant' (i.e., a standard set: shelving, mirror, juggling, footbridge, piece of art). Finally, participants completed a questionnaire and were fully debriefed. For the footbridge idea, the protocol required the research assistant to i) introduce the footbridge idea ('Hmm. This idea is pretty obvious, but I haven't seen it before. Could you build a footbridge, please?'), ii) prompt the participant to place the plank across two chairs, iii) state the intention to walk the plank ('I will now test the footbridge idea by walking over it'), and iv) walk the plank (stepping onto the footbridge at one chair, stepping off the footbridge at the other). The plank required three steps to walk across it, including one over the exposed gap between the two chairs (for illustrative pictures see the appended manual to: Noort et al., 2019b).
The hazard salience and responsibility manipulations were presented electronically (i.e., through Qualtrics on an iPad) in counterbalanced order before the creativity task. The encouragement manipulation was introduced by the research assistant before the 'previous participant's ideas' were tested. The eight conditions were randomised across all participants and research assistants were blind to the study hypotheses.
For the hazard salience manipulation, participants evaluated a photograph of a man talking on his phone while crossing a busy street and were asked 'What aspects of this picture make it a hazardous situation, where harmful outcomes might occur?' (salient condition), or: 'What aspects of this picture make it a typical situation, one you could encounter any day?' (control condition). For the responsibility manipulation, participants read: 'Please think of a situation from your life where "you" (clear condition)/ "it was not clear who" (unclear condition) were/was responsible for the outcomes of the situation'. Participants then described the situation, what they had done, and how they had felt. For the encouragement manipulation, the research assistant stated one of two messages: 'Please keep your thoughts and opinions to yourself. I do not like it when people share those, and I might then reduce your study reward because expressing your true feelings is not part of the task' (discouraged condition), or 'Please feel free to express your thoughts and opinions. I like it when people share those, and it will not impact your study reward because expressing your true feelings is part of the task' (encouraged condition).
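The eight conditions follow from fully crossing the three two-level manipulations described above. The following minimal Python sketch enumerates them and allocates participants at random. The function name, the fixed seed and the use of simple (unblocked) randomisation are assumptions made for illustration; the text does not specify the allocation procedure that was actually used.

```python
import itertools
import random

# Levels of the three manipulations, as named in the Design section.
HAZARD_SALIENCE = ("salient", "control")
RESPONSIBILITY = ("clear", "unclear")
ENCOURAGEMENT = ("encouraged", "discouraged")

# The full 2 x 2 x 2 crossing yields the eight experimental conditions.
CONDITIONS = list(itertools.product(HAZARD_SALIENCE, RESPONSIBILITY, ENCOURAGEMENT))


def assign_conditions(participant_ids, seed=0):
    """Allocate each participant to one of the eight conditions at random.

    Simple randomisation with a fixed seed is an assumption of this sketch
    (chosen for reproducibility); it is not taken from the study protocol.
    """
    rng = random.Random(seed)
    return {pid: rng.choice(CONDITIONS) for pid in participant_ids}


if __name__ == "__main__":
    allocation = assign_conditions(range(1, 405))  # 404 participants
    print(len(CONDITIONS), "conditions;", len(allocation), "participants allocated")
```

Simple randomisation does not guarantee equal cell sizes; a blocked scheme would be needed if balanced cells were required.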

Participants
For the study, 404 participants (students: n = 377; female: n = 277; age: M = 22.897, SD = 5.386; missing demographics: n = 9) consented to participate (including anonymised data to be archived and used within public domains) and completed the study between 31 May and 10 December 2018. Among student participants, most studied management (n = 51), with only 21 psychology students. A pilot study confirmed that gender does not shape safety silence, OR = 1.096, Wald(1) = 0.22, p = .881. Participants lived in the United Kingdom, had no expertise on relevant legislation (i.e., whistle-blowing) or building materials, and spoke fluent English, with 97% being native speakers (n = 166) or having spoken English for more than five years (n = 216). Most participants were from middle- (n = 164) and upper-middle-class backgrounds (n = 90). Participants received a £5 reward for their time. On a question asking participants to report on the perceived study intention, no participant guessed the true study aims.
The full dataset is available as supplementary material. Twelve participants were excluded from analyses (i.e., 10 technical issues with video recording, 2 non-responses to whether the scenario elicited concern).

Measures
Measures included self-report and behavioural measures tailored to the laboratory environment (for an overview, see Table 2).
Safety concerns. Safety concerns were measured using a 5-point Likert scale item: 'I was concerned about the footbridge idea'. To enable identification of the continuous safety concern dictionary, the item was adapted to concerned (i.e., ≥ 3) and unconcerned (i.e., ≤ 2). The concern dictionary scored the frequency of concerned words in participants' speech.
Muted safety voice. To enable dictionary development providing continuous measures, safety voice and safety silence were initially coded as a binary variable based on whether individuals were concerned about the footbridge idea and engaged in safety voice behaviours. That is, safety voice behaviours were coded based on transcribed videorecordings of the hazardous scenario (i.e., introduction of the footbridge idea up to moving on to the next phase of the study). For concerned participants, speech was coded as 'safety silence' when participants did not engage in safety voice, ICC(1,1) = 0.749, p < .001.
By contrast, the occurrence of 'safety voice' was coded when participants verbally indicated that they were concerned about the research assistant walking the plank (i.e., a risk was indicated, the situation prohibited, proceedings questioned, caution urged, or a concern suggested through an oblique expression). Specific occurrences of safety voice were coded with 'substantial' or better inter-rater reliability (Wongpakaran et al., 2013). Seven participants who withdrew their voice (i.e., they spoke up, but backtracked and allowed the footbridge to be walked) were coded as safety voice because an option to respond was given. Conversely, independent conversational gasps and apologies were not considered safety voice.
Safety voice dictionaries. To measure the degree of muted safety voice in speech, participant text was scored with i) LIWC2015 dictionaries for risk, perceptions, future-orientation, personal pronouns, negation and formalities (Pennebaker et al., 2015), ii) the communication vagueness scale (Hiller et al., 1969), and iii) safety voice dictionaries (i.e., informative, inquisitive, prohibitive, cautionary, oblique; see Table 3). Safety voice dictionaries were developed by identifying words associated with coded safety voice behaviour, identifying synonyms using word vectors (Mikolov et al., 2013) and manually evaluating patterns through author discussion. Dictionary scores (i.e., continuous scales) were therefore distinct from coded observations (i.e., binary scales). Safety silence is

Table 3
Description of safety concern and safety voice dictionaries.
Questionnaire items. Felt responsibility was measured with an adapted survey item ('I would feel obligated to raise any concerns I had'; Liang et al., 2012), and six items measured social risk (α = 0.762; Noort et al., 2019b).
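The dictionary-based measures described in this section can be illustrated with a minimal sketch. It assumes simple word-frequency scoring and uses hypothetical word lists: the entries in `EXAMPLE_DICTIONARIES`, the tokeniser and the function names are illustrative assumptions, not the study's actual dictionaries (see Table 3) or code.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; the study's actual
# dictionaries are described in Table 3 and are not reproduced here.
EXAMPLE_DICTIONARIES = {
    "informative": {"break", "fall", "unstable"},
    "inquisitive": {"sure", "really", "why"},
    "prohibitive": {"stop", "don't", "no"},
    "cautionary": {"careful", "slowly"},
    "oblique": {"hmm", "okay", "well"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def score_speech(text, dictionaries=EXAMPLE_DICTIONARIES):
    """Count the tokens that fall in each dictionary, plus a composite total."""
    counts = Counter(tokenize(text))
    scores = {
        name: sum(counts[word] for word in words)
        for name, words in dictionaries.items()
    }
    scores["composite"] = sum(scores.values())
    return scores


def is_concerned(likert_response):
    """Dichotomise the 5-point concern item: >= 3 concerned, <= 2 unconcerned."""
    return likert_response >= 3


example = score_speech("Hmm, are you sure? Please be careful, it might break.")
```

With these assumed word lists, the example utterance scores on the informative, inquisitive, cautionary and oblique dictionaries without any prohibitive speech, which is the kind of graded pattern the continuous scores are meant to capture.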

Analyses
Analyses were conducted using Python 3.7 (using the pandas, numpy, scipy, statsmodels, spacy and scattertext packages). First, manipulation checks were performed for the scenario (i.e., whether the scenario elicited safety concerns and levels of safety silence compared to the average of 44% found in the literature; Noort et al., 2019a) and for the manipulations (i.e., if the hazard salience, responsibility and encouragement manipulations led to more safety concerns, felt responsibility and perceived social risk, respectively), and Spearman's correlations were calculated to provide an overview of the relationships between study variables. Second, the safety concern and safety voice dictionaries were validated using (M)ANOVAs testing the relationship with reported safety concerns and perceived social risk (for the concern dictionary) and observed safety voice behaviours (for safety voice dictionaries). Correlations were calculated to establish the extent to which the composite safety voice dictionary related to the safety concern measures. Third, hypotheses 1a-e were tested using one-sample t-tests (with the test-value '0' reflecting no speech) and a MANOVA to understand the degree of difference across safety voice dictionaries. Fourth, hypotheses 2a-c were tested using a multiple linear regression based on the safety voice dictionary scores, with follow-up conditional analyses for each intervention. Fifth, the results for hypotheses 2a-c were probed for individual safety voice dictionaries. Finally, hypothesis 3 was tested using logistic regressions that compared intervention effects across hazard stages and a MANOVA that established the extent to which hazard stages led to different degrees of speech.
The Jupyter notebook and supporting files are submitted as data in brief. Accordingly, to improve readability, statistics have been summarised, and non-significant statistics are presented as 'ns'.

Manipulation and dictionary checks
Manipulation checks indicated that the scenario and experimental manipulations worked as intended, with mixed success for the responsibility manipulation.

Dictionary validation.
Whether participants were concerned accurately related to dictionary scores for safety concern, F(1,390) = 4.446, p = .036, η² = 0.011. Concerned participants' speech was more disfluent, F(1,390) = 5.574, p = .004, η² = 0.022, indicating a possible tension between raising safety concerns and perceived social risk. After identifying synonyms, the dictionaries related accurately to the intended behaviour (e.g., informative versus not informative), F(1,390)s ≥ 26.169, ps < .001, η²s ≥ 0.063. The dictionaries provided one composite safety voice dictionary, which accurately distinguished between participants who were observed to voice or remain silent, F(1,390) = 138.085, p < .001, η² = 0.261. The safety voice dictionary was associated with self-reported safety voice, r = 0.490, p < .001, and only with the concern dictionary, r = 0.813, p < .001, but not self-reported concerns, r = 0.077, p = .126. The means and correlations of the study variables and manipulations are presented in Table 5.
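The effect sizes reported above can be related to the F statistics with a short worked sketch. It assumes that η² for a single-factor comparison is computed as F·df_effect / (F·df_effect + df_error); the source does not state which formula was used, and the function name is illustrative.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Recover eta squared from an F statistic and its degrees of freedom.

    Assumes a single-factor design, where eta squared equals
    F * df_effect / (F * df_effect + df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values taken from the dictionary validation results reported above.
composite_voice = eta_squared_from_f(138.085, 1, 390)
concern_dictionary = eta_squared_from_f(4.446, 1, 390)
```

Under this assumption, the composite safety voice dictionary and the concern dictionary yield values that round to the reported η² of 0.261 and 0.011, respectively.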

Measuring safety silence and muted safety voice
Supporting hypotheses 1a-e, participants who were observed to engage in safety silence uttered words (M = 29.838, SD = 27.990), t(153) = 13.229, p < .001, and, indicating muted safety voice, their speech yielded non-zero scores on the safety voice dictionary (M = 9.474, SD = 10.348), t(153) = 11.362, p < .001. Specifically, people withholding safety concerns engaged in informative (M = 2.877, SD = 3.878), inquisitive (M = 3.714, SD = 3.970), prohibitive (M = 1.75, SD = 2.771), cautionary (M = 0.331, SD = 0.724) and oblique safety voice (M = 1.617, SD = 1.931), t(153)s ≥ 5.680, ps < .001. Supporting the need to conceptualise muted safety voice, the distinction between safety silence and safety voice was a matter of degree: participants who were observed to not explicitly speak up scored lower on the five safety voice dictionaries, F(1,390)s ≥ 4.900, ps ≤ .028, η²s ≥ 0.016. This illustrates that safety themes are less present for muted safety voice and, importantly, indicates that safety silence and safety voice can be measured in terms of the degree to which safety voice manifests in speech.
Illustrations of strong and muted safety voice in speech are provided in Table 6, and the relationship to the range of safety voice and concern is presented in Fig. 1.
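The one-sample t-tests against a test value of 0 (i.e., no speech) can be retraced from the reported summary statistics. The sketch below is a minimal illustration: the group size of 154 is an assumption inferred from the reported degrees of freedom (153), and the function name is hypothetical.

```python
import math


def one_sample_t(mean, sd, n, test_value=0.0):
    """Return the t statistic and degrees of freedom for a one-sample t-test
    computed from summary statistics rather than raw data."""
    standard_error = sd / math.sqrt(n)
    return (mean - test_value) / standard_error, n - 1


# n = 154 is inferred from the reported df of 153 (an assumption).
t_words, df_words = one_sample_t(29.838, 27.990, 154)
t_voice, df_voice = one_sample_t(9.474, 10.348, 154)
```

Under the assumed group size, both statistics agree with the reported t(153) values for uttered words and for the composite safety voice dictionary to three decimals.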

Unmuting safety voice
Supporting hypotheses 2a-c, safety voice was unmuted by manipulating beliefs on safety and norms for speaking up. However, only encouragement had a direct effect on safety voice, whereas hazard salience and responsibility modified the effect of safety concerns on safety voice. That is, concerned participants did not engage in less safety silence when hazard salience, b = −0.459, t(305) = −0.249, p = .804, and responsibility, b = −1.061, t(305) = −0.576, p = .565, were manipulated. However, encouragement unmuted safety voice, b = 4.000, t(305) = 2.171, p = .031, with participants uttering more words in the safety voice dictionary. Underscoring the importance of assessing safety concerns, interventions only unmuted safety voice for the levels of the manipulations. That is, stronger safety concerns led to higher scores on the safety voice dictionary, b = 1.287, t(390) = 2.032, p = .043, but only when hazards were salient, b = 1.956, t(388) = 2.049, p = .041, participants were discouraged rather than encouraged, b = 1.695, t(388) = 1.970, p = .050, and (through a marginal effect) responsibilities were clear, b = 1.666, t(388) = 1.757, p = .080. Yet, stronger concerns did not unmute safety voice when hazards were not salient, responsibilities unclear and participants were encouraged, ns.
Probing effects. Further analyses suggested that stronger concerns did not universally unmute safety voice; stronger concerns only reduced unique manifestations of muted safety voice for the levels of the manipulations. Specifically, salient hazards only led to more inquisitive safety voice, b = 0.705, t(388) = 2.091, p = .037; clear responsibilities to more inquisitive safety voice, b = 0.698, t(388) = 2.083, p = .038, and less oblique safety voice, b = 0.226, t(388) = 2.157, p = .032; and discouragement only to more informative safety voice, b = 0.890, t(388) = 2.005, p = .046. Otherwise, safety concerns did not unmute safety voice in the safety voice dictionaries, ns.

Table 5
Spearman's correlations, means and standard deviations of variables.

Table 6
Illustration of the manifestation of strong and muted safety voice.

The effect of time on unmuting safety voice
Muted safety voice manifested differently for participants who initially spoke up during the first (i.e., conceptualisation stage; n = 44), second (i.e., encounter stage; n = 39) or third stage (i.e., imminent danger stage; n = 104) of the hazard, F(1,184) = 13.686, p < .001, η² = 0.129. This indicates the need to compare the manifestations of safety silence across these stages.
Specifically, compared to other stages, dictionary scores during the conceptualisation stage indicated that participants were more concerned, F(1,179) = 17.371, p < .001, η² = 0.086, which led to more informative, inquisitive and prohibitive, and less oblique and disfluent safety voice, F(1,179)s ≥ 4.846, ps ≤ .029, η²s ≥ 0.026. However, they did not engage in more cautionary safety voice than in the other stages, ns. Partially supporting hypothesis 3, this suggests that participants at this time-point were oriented towards evaluating the idea of walking the plank, without perceived risk interrupting their speech. The encounter stage involved marginally more informative safety voice, F(1,178) = 3.429, p = .066, η² ≥ 0.018. This suggests that the second stage may involve sensemaking about the physical encounter with the idea of walking the plank. Finally, when danger was imminent, participants engaged in more informative safety voice, F(1,178) = 54.728, p < .001, η² = 0.228. However, their speech was also less concerned and prohibitive, F(1,178)s ≥ 8.080, ps ≤ .005, η²s ≥ 0.042. This suggests that imminent harm may be more effectively mitigated by indicating safety knowledge than through safety motivation. Interestingly, the imminent danger stage led to higher disfluency and oblique safety voice scores, F(1,178)s ≥ 10.704, ps ≤ .001, η²s ≥ 0.055, suggesting that mitigating imminent danger may be cognitively disruptive.
Finally, encouragement reduced the likelihood of safety silence during the first stage of the hazard, OR = 0.205, z(305) = -2.672, p = .008, whereas clear responsibilities increased the likelihood that people spoke up during the second stage, OR = 0.257, z(305) = 2.058, p = .040.
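To make the size of an odds ratio more tangible, the sketch below converts a baseline probability into the probability implied by a given odds ratio. The baseline of 0.50 is a purely hypothetical value chosen for illustration and is not taken from the study; the function name is likewise an assumption.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Convert a baseline probability into the probability implied after
    multiplying its odds by an odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline: a 50% chance of safety silence without encouragement.
# OR = 0.205 is the reported first-stage effect of encouragement.
silence_with_encouragement = apply_odds_ratio(0.50, 0.205)
```

Under this assumed baseline, an odds ratio of 0.205 corresponds to a probability of safety silence of roughly 17%; other baselines would yield different probabilities, so the figure illustrates the scale of the effect rather than a study result.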

Discussion
To enable targeted interventions for mitigating accidents, this study proposed a model for measuring the extent to which safety voice manifests in speech based on five types of safety voice speech, and evaluated when interventions can unmute safety voice. The experimental investigation provided the first behavioural evidence that safety silence and safety voice can be measured based on the degree to which people raise their safety concerns. Furthermore, this study demonstrated that interventions for unmuting safety voice are optimal when participants hold safety concerns, which can be detected in muted safety voice. Additionally, interventions tend to unmute knowledge-based speech, but not motivation-based speech, and the temporal progression of hazards leads, in order, to conceptual evaluations, exploration of consequences, and attempts at mitigating the hazard. These findings have implications for conceptualising and unmuting safety voice.

Theoretical implications
First, in revealing that safety silence and safety voice can be measured based on the degree of muted safety voice speech (hypothesis 1), this study contributes a conceptual model explaining how muted safety voice manifests in the speech of participants who are concerned about hazardous scenarios. This conceptual model highlights that the essential flow of safety information (Westrum, 2014) during hazardous scenarios manifests as a matter of degree in speech about distinct safety-related themes (Kurzon, 2007, 2011), with interventions optimally targeted at the degrees of distinct content in speech. This underscores previous propositions (Friedman et al., 2015;Kassing, 2002;Krenz et al., 2019;Pian-Smith et al., 2009), clarifies that the nature of safety voice can involve a degree of muteness that is not complete silence, and emphasises the importance of the content of safety voice (i.e., safety knowledge and motivation; Christian et al., 2009;Searle, 2008) alongside its occurrence for avoiding accidents (Jones and Kelly, 2014). Moreover, our findings indicated that participants were not fully silent, and it appears important for research to establish to what extent previous research has established muted safety voice. That is, if participants report withholding safety concerns, this may have been misinterpreted as strong rather than partial muting of safety voice, and accidents that were attributed to safety silence may contain muted safety voice (e.g., indicating hazard awareness, poor listening to safety voice, etc.).

Fig. 1. Model for the manifestation of safety concerns in safety voice behaviours. Note: coordinates for the five types of voice reflect correlations with safety voice and safety concern dictionaries, with the areas for no and unrelated speech reflecting that these can occur for all degrees of concern and voice, respectively.
Second, findings revealed that measuring safety silence and muted safety voice is important for designing interventions (hypothesis 2). The presentation of salient hazards (Tucker et al., 2008) and encouragement (e.g., Barzallo Salazar et al., 2014;Burris, 2012) elicited speech on safety knowledge for concerned participants, while felt responsibility (e.g., Duan et al., 2017) reduced unclear content in speech. This means that interventions should be conceptualised as important for mitigating the dysfunctional momentum towards accidents (Barton and Sutcliffe, 2009); such interventions optimally unmute safety voice (in particular speech about safety knowledge) when participants are concerned about safety and discuss less safety knowledge. This underscores that accidents could be mitigated by addressing how people engage in social information processing to evaluate contextual (e.g., risk) and social cues (e.g., psychological safety; Edmondson and Lei, 2014). Moreover, these findings indicate that safety motivation-based themes are not elicited by the manipulations and may be better addressed by alternative interventions.
Finally, the results indicate the importance of the temporal progression of hazards for unmuting safety voice (hypothesis 3). Muted safety voice only manifested in less knowledge-based speech across the hazard stages. This corresponds to recent findings indicating that nurses voice later, not less, depending on leadership influences (Krenz et al., 2020). These findings suggest that research (e.g., spiral of silence; Noelle-Neumann, 1974;Scheufele and Moy, 2000) needs to account for different factors at different time-points.

Practical applications
This study has at least five concrete practical applications for safety management. These relate to improved training and employee induction programmes, the identification of errors and safety silence by inserting controlled errors, enhanced accident analyses, and the development of speech recognition software (summarised in Table 7).
First, safety practitioners may be trained to better recognise when others (with)hold safety concerns. Training programmes aimed at safety-specific communication (e.g., Crew Resource Management, teamSTEPPS, LOFT; Kanki et al., 2019;King et al., 2008) typically emphasise communication styles (e.g., assertiveness) and collaboration on safety (e.g., shared mental models, adaptability, error management; Helmreich et al., 1999). However, these rarely train people on specific language (Leonard et al., 2004) or the recognition of safety concerns in speech. Recognising safety concerns is essential for mitigating safety threats; by applying and contextualising the proposed dictionaries, junior and senior practitioners may be trained to better identify safety concerns and safety silence in language.
Second, leaders may use the presented insights to introduce controlled errors to identify whether these are raised by staff. For instance, by making a deliberate, but controlled, error in handwashing hygiene, leaders may identify the extent to which junior colleagues engage in safety voice speech and when they do so (e.g., in later stages of hazards). In this way, safety voice dictionaries enable the identification of areas for improvement based on the timing and relative absence of unique types of speech (e.g., prohibitive speech).
Third, practitioners may adapt the experimental paradigm for employee induction purposes. Debriefing simulations can raise awareness on safety issues among participants (Kolbe et al., 2015), and the scenario may be especially relevant for highlighting organisational norms on safety voice and risks associated with safety silence and muted safety voice. This is because the scenario does not require prior specialist safety knowledge and has a low threshold for participation.
Fourth, by providing measurement of safety silence and muted safety voice, this study enables accident analyses to better assess the extent to which people were concerned and engaged in safety voice during accidents. For example, Tarnow (2000) described how the crash of Express II Airlines, Inc./Northwest Airlink 5719 was attributable to tense and hesitant communication. Fischer and Orasanu (2000) also described how indirect speech contributed to the crash of Air Florida Flight 90. Using the presented model and dictionaries for the manifestation of safety silence, conversations in field settings may be characterised in terms of safety voice and silence, and the extent to which safety issues were picked up on may be identified.
Fifth, speech recognition software may be developed to capture and analyse recorded and live speech. The sensitivity of such approaches notwithstanding (e.g., data security, the perceived autonomy of employees), hazards may be acted on more proactively when automated speech recognition software highlights to team members that concerns are being muted (e.g., in medical reports, notes, live operating procedures; Jiang et al., 2017).
Finally, these findings indicate that safety silence and muted safety voice are contingent upon the perception of risk and that interventions are therefore most effective for concerned people. This appears especially useful for altering whether people discuss safety knowledge. Voice may therefore be optimally unmuted by providing explicit safety information (e.g., in healthcare leaflets, through warning signs; Matthews et al., 2014;Pander Maat and Lentz, 2010), clear accountability structures, and inclusive leadership (Nembhard and Edmondson, 2006;Weiss et al., 2018).

Limitations
First, the experimental paradigm has debated external validity (Noort et al., 2019b). A need remains to extend findings to natural speech in other contexts (e.g., in operating rooms, flight decks) and to scenarios that pose more substantial, or actual, risk. This appears key because speech is highly context-dependent (e.g., informative speech included characteristics of the experimental scenario; Gillespie and Cornish, 2010), and actual risks may elicit more safety voice than can be elicited in simulated scenarios. For instance, technical expertise and strong emotional responses to danger (Loewenstein et al., 2001) may enable more voice. However, the extent to which individuals engage in safety voice in the context of real risks remains undetermined (Krenz et al., 2020), and the criteria to establish the fidelity of the study, and thus its generalisability to the context of real risks, are unclear (Nestel et al., 2017). Arguably, the presented standardised scenario reveals causal mechanisms with high internal validity that may be generalised with more certainty to settings with known characteristics, such as the degree of safety voice and safety concerns (Noort et al., 2019b). For instance, the experimental paradigm enables conclusions on contextual variables such as risk or hierarchies of safety silence (Krenz et al., 2020) by providing a high degree of control (e.g., through introducing power manipulations; Galinsky et al., 2015), which is necessary to establish mechanisms. Furthermore, establishing safety silence and muted safety voice data in naturally occurring scenarios is challenging because actual hazards cannot be ethically introduced or prolonged (i.e., this exposes participants to undue risk). The utilised experiment therefore enabled internally valid and ethical data, while enabling generalisation to populations with comparable characteristics (e.g., levels of safety concerns and safety silence). To date, however, the literature has not established degrees of safety concerns in relationship to safety silence or variables such as risk within naturally occurring scenarios. Future research should therefore aim to assess safety voice in the context of actual risk in order to improve insights into generalisability. Such research may, for instance, establish the extent to which the proposed dictionaries have sufficient sensitivity to identify safety voice behaviours in workplace contexts.

Table 7
Implications for safety management.
Application: Benefit for safety management
Training programmes:
• Improved detection of concerned speech
• Implementation of interventions within training programmes
Inserting controlled errors:
• Recognising the extent to which team members speak up
Employee inductions:
• Highlighting risks associated with muted safety voice
• Making desired norms on safety voice salient
Accident analysis:
• Improved understanding of the extent of safety voice during accidents
Speech recognition software:
• Improved detection of safety concerns in written and spoken language

Second, caution is warranted in generalising from the majority student sample to naturally occurring scenarios. Student samples may elicit data that differ from field-based data (Mitchell, 2012), for instance, due to the effect of uncontrolled contextual factors. Yet, equivalent results were found for the sample in the pilot (72% students) and the main study (93% students). Furthermore, the data indicated that the degree to which the student sample spoke up (47.7%) was similar to the levels obtained by practitioners in high-reliability organisations (averaging 44%; Noort et al., 2019a). As argued above, this may enable generalisation to actual hazards. To address this, future research may directly compare data from student and applied samples to establish the degree of generalisability.
Third, the responsibility manipulation unexpectedly reduced felt obligation. This indicates that the interpretation of this manipulation is not straightforward, and future research should examine this. For instance, reminding people of previously held responsibilities may compensate for the need to feel responsible in novel situations. Nevertheless, the manipulation unmuted safety voice and led to more inclusive language, indicating that participants may have felt shared rather than individual responsibility.
Fourth, the association between concerned speech and safety voice was very strong, indicating that safety concerns and safety voice, though distinct concepts, may be less distinguishable in speech. This supports the proposition of the current study, and future research should expand on this.
Fifth, it should be reiterated that the safety concern scores may have been influenced by their measurement after the scenario. Although this was the only way to assess safety concerns through a survey without invalidating the scenario (Noort et al., 2019b), future studies could explore biometric safety concern metrics. Finally, the current study emphasised the impact of situational characteristics on unmuting safety voice in order to evaluate interventions; however, individual differences have an impact on social information processing (Carver and White, 1994;Lauriola and Levin, 2001). Because few safety voice studies have investigated personality (Tucker et al., 2008), future research should therefore establish the impact of individual differences on safety silence.

Conclusions
Unmuting safety voice and recognising weak signals indicating safety concerns is critical for safety management. This study proposed a model for the measurement of safety voice and safety silence on a continuous scale based on how safety voice manifests when individuals speak, and introduced the concept of muted safety voice. Findings indicate that safety voice behaviours are measurable in terms of the degree of safety voice speech that distinguishes safety silence, safety voice and muted safety voice. Furthermore, the results showed that interventions only unmuted safety voice about safety knowledge when participants were concerned and at specific time-points. This sheds light on the extent to which people engage in safety voice and silence, and contributes a better understanding of the key social variables that lead to this and how concerned individuals engage in conversation. This study underscores the importance of the behavioural investigation of safety silence and the need to assess the extent to which people perceive risks. Safety silence is reduced most effectively when safety information is available and this is manifested in speech. Future accidents may therefore be prevented by investigating how safety voice may be unmuted.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.