Subjective Probability Increases Across Communication Chains: Introducing the Probability Escalation Effect.

A severity effect has previously been documented, whereby numerical translations of verbal probability expressions are higher for severe outcomes than for non-severe outcomes. Recent work has additionally shown the same effect in the opposite direction (translating numerical probabilities into words). Here, we aimed to test whether these effects lead to an escalation of subjective probabilities across a communication chain. In four ‘communication chain’ studies, participants at each communication stage either translated a verbal probability expression into a number, or a number into a verbal expression (where the probability to be translated was yoked to a previous participant). Across these four studies, we found a general Probability Escalation Effect, whereby subjective probabilities increased with subsequent communications for severe, non-severe and positive events. Having ruled out some alternative explanations, we propose that the most likely explanation is in terms of communications directing attention towards an event’s occurrence. Probability estimates of focal outcomes increase across communication stages.


Introduction
Uncertainty is inherent in daily life, complicating decisions we must make about future opportunities or threats. We do not know whether a stock will increase or fall, whether Mt. Eyjafjallajökull will erupt tomorrow, or whether an incoming rainstorm will hit our locale. In many situations, however, we might have access to a probabilistic estimate (e.g., the weather forecast). Subsequently, our neighbour might ask us for our estimate that the storm will hit. What do we tell them? In the present paper, we are concerned with the continued communication of probabilistic information, through a 'communication chain.' Specifically, we examine whether subjective probabilities change systematically as they are passed along this chain and, at each stage, translated between communication formats: from verbal (e.g., 'unlikely') to numerical (e.g., '30% chance'), and numerical to verbal.

Event severity, event likelihood, and communication chains
Within the Disaster and Risk Reduction community, risk is typically considered as the combination of the severity of an event's impacts and the likelihood of those impacts (e.g., Aven & Renn, 2009; Rosenbaum & Culshaw, 2003; World Meteorological Organization, 2015). Objectively, the severity of an event is independent from its likelihood. Much research has, however, questioned the psychological independence of utility and probability. Researchers have offered evidence both for the conjecture that events are seen as more likely when they are desired (for reviews see Krizan and Windschitl, 2007; Windschitl & Stuart, 2015), and that negative events are seen as more likely the more severe they are (Bilgin, 2012; Harris, Corner, & Hahn, 2009; Risen & Gilovich, 2007; see Harris, 2017, for a discussion of the co-existence of these two effects).
The focus of the present article is on how subjective probabilities are translated between words (Verbal Probability Expressions [VPEs] such as 'unlikely') and numbers (e.g., '20%'). The question is of relevance since risk communication can (and does) proceed with either format. A number of applied domains recommend the use of verbal formats for communicating risk (e.g., in climate change [Mastrandrea et al., 2010]; security [College of Policing, n.d.; ODNI, 2007; NATO, 2016, as cited in Dhami & Mandel, 2021]; pharmacy [MHRA, 2005]). Contrastingly, some organisations propose that risk information is better provided with numbers (e.g., European Food Safety Authority; Hart et al., 2019). Preferences for risk communication formats have been shown to differ between speakers and hearers (Erev & Cohen, 1990; Olson & Budescu, 1997; Wallsten, Budescu, Zwick, & Kemp, 1993; Xu, Ye, & Li, 2009), with different preferences according to the type, or precision, of the uncertainty (e.g., Du et al., 2013; Juanchich & Sirota, 2020; Olson & Budescu, 1997; Wallsten et al., 1993). The key result for present purposes, however, is the heterogeneity in preference always observed in such studies, with some preferences for words and some for numbers.
Returning to the dependence of utility and subjective probability, Weber and Hilton (1990; see also Bonnefon & Villejoubert, 2006; Harris & Corner, 2011; Juanchich, Sirota, & Butler, 2012; Villejoubert, Almond, & Alison, 2009) demonstrated that, after controlling for the influence of event base rate, people translated VPEs into higher numerical probabilities when they referred to a more severe negative outcome than a more neutral negative outcome. This phenomenon has been labelled the Severity effect and has typically been attributed to either: an asymmetry in the loss function associated with over- versus underestimates (Harris & Corner, 2011; Weber, 1994); or a politeness-based expectation, whereby a communicator is assumed to be downplaying a severe risk, either to protect the communication recipient (Bonnefon & Villejoubert, 2006), or the communicator themselves (Juanchich et al., 2012). In a one-to-one communication context, Holtgraves and Perdew (2016) found evidence that hearers' assumptions were correct: speakers chose lower VPEs to communicate the chance of a more severe outcome. Liefgreen et al. (2024) recently investigated choices of verbal probability classification, on the basis of a numerical probability range, in the domain of weather warnings. Politeness-based considerations may be considered less relevant away from a one-to-one communication context (see e.g., Holtgraves & Perdew, 2016). Impact-based weather warnings require forecasters to provide an assessment of the impacts associated with a weather event, and the likelihood of those impacts. A popular approach for operationalising such forecasts is through a risk matrix (e.g., World Meteorological Organization, 2015), with impact severity on the x-axis and impact likelihood on the y-axis (the orthogonality of severity and likelihood underlining their assumed independence). Liefgreen et al. (2024) provided southeast Asian weather forecasters with (hypothetical) model summaries suggesting specific numerical ranges of impacts that differed in severity (e.g., "Weather modelling of heavy rainfall…suggests there is a 70-90% likelihood of overwhelmed healthcare facilities… in 24 hours [sic] time"). The forecasters' task was to specify a weather warning on the risk matrix. Liefgreen et al. observed a relationship between forecast severity and forecast likelihood, such that more severe impacts were associated with higher likelihood classifications. Thus, these results resembled a severity effect in the translation of numerical probabilities to verbal probabilities ordered on a risk matrix (in this instance: very low, low, medium, high). Whilst Liefgreen et al. did not evaluate potential explanations for their effect, it seems less consistent with a politeness-based account. The politeness account assumes a shared understanding between speaker and hearer, whereby the hearer assumes that the speaker is downplaying a risk. In this instance, the 'speaker' was the output of a weather model. In the current paper, we do not primarily seek to further our understanding of the mechanisms contributing to the Severity effect. Rather, our main focus is to test potential consequences of it.
The observation of a severity effect in weather forecasters' warning classifications already underscores the potential significance of this effect, with extreme weather events responsible for approximately 1.23 million deaths between 2000 and 2019 (Centre for Research on the Epidemiology of Disasters: CRED, 2020). It also, however, suggests the potential for further downstream consequences. Specifically, in situations where probabilistic information is relayed between individuals, the choice of expressing it numerically or verbally may vary. If some individuals opt for numerical representation while others opt for verbal representation, there is a possibility that probability estimates could increase each time an individual translates between the two formats. Consequently, the probability communicated to the seventh person in a communication chain might be very different (specifically, higher) than that communicated to the first person. The current studies aim to test such a possibility, which would demonstrate another instance of risk perceptions being affected by social processes (see e.g., Kasperson et al., 1988; Kasperson, Webler, Ram, & Sutton, 2022; Pidgeon, Kasperson, & Slovic, 2003; Renn, Burns, Kasperson, Kasperson, & Slovic, 1992).
All studies reported in this article are Communication Chain studies. At each 'Communication stage' of the communication chain, participants are tasked to either relay a probability received verbally with a number (odd-numbered stages), or relay a numerical probability with a VPE (even-numbered stages). Previous observations of severity effects, both in the translation of VPEs to numbers, and numbers to verbal probability classifications, led us to predict a Severity × Communication stage interaction, where communicated probabilities for severe events increase as they are passed along a chain (we did not predict such an increase for neutral events). Previous communication chain studies focussing on risk communication (more generally) have typically reported that negative information persists further across communication stages than does positive information (Jagiello & Hills, 2018; Moussaïd, Brighton, & Gaissmaier, 2015). Observation of the predicted effect in the present studies would represent another mechanism (via probability estimates rather than information about the nature of risks & benefits) whereby risk perception might amplify over time (with communication). Whilst we are also interested in the limits of this effect (whether probabilities continue to increase as chain length increases), we make no specific predictions about this.
To foreshadow our results, across four studies, we consistently observe an increase in probability estimates across Communication stages, but this effect is not reliably qualified by an interaction with Severity. We propose that the increase in probability estimates across Communication stages is likely, therefore, driven by the directionality of communications, such that the focus in the majority of communications after the initial 'Unlikely' at Stage 1 directs participants' attention to the occurrence of the event (see e.g., Teigen & Brun, 1999, 2000).¹ Similarly to how VPEs directing attention to the occurrence of an event reinforce that event's likelihood (two reports that an event is 'likely' might lead to a forecast that the event is 'very likely' [Teigen, Juanchich, & Løhre, 2023; see also Mislavsky & Gaertig, 2022]), communications over time appear to reinforce the possibility that an event will occur.

Study 1
Study 1 tested the hypothesis that probability estimates of a severe event increase across a communication chain. We followed Harris and Corner (2011) in manipulating severity via the consequences of the event being judged, rather than the event itself. This manipulation controls for the natural real-world confound between event severity and base rate (very bad things are typically, and thankfully, rarer than slightly bad things; see also Weber & Hilton, 1990). We followed Liefgreen et al. (2024) in focussing on a weather prediction scenario. In addition to the demonstration of applied relevance with such a scenario, an additional advantage is that the verbal probability categories present in the World Meteorological Organisation's risk matrix suggest a clear rank ordering (very low, low, medium, high). This enabled a straightforward test of whether verbal categories of probability increase across communication stages.

Participants
For all studies in this paper, participants were recruited via Prolific and paid in accordance with the mandated pay rate of UCL's Division of Psychology & Language Sciences at the time (pro rata rates: £8 per hour for Studies 1 & 2; £9 per hour for Studies 3 & 4). In Study 1, a total of 669 UK-based participants completed all questions they were assigned. Nineteen participants were excluded for failing the attention check (their age had to be consistent with their indicated year of birth), such that 650 participants (489 females, 146 males, aged 18 to 83 years [median = 34]) were retained for analysis. All participants reported being fluent in English, with 88% self-reporting as native English speakers.
¹ Whilst Study 3 was designed as a direct test of this explanation, the failure to obtain support was based, we propose, on a natural inability to source appropriate negatively directional VPEs. Our currently preferred explanation therefore represents an inference to the best explanation, rather than one directly supported by empirical evidence.

Design, materials and procedure
A 7 (Communication stage) × 2 (Severity) yoked between-participants design was employed. Participants were randomly assigned to Severity conditions, whilst the Communication stages were run serially (participants could only participate in one Communication Stage), across a two-week period.
At Stage 1, participants provided a numerical translation of 'unlikely.' At all subsequent stages, participants received the response (numerical or verbal probability) of a participant from the previous stage of the study (from the same Severity condition). Participants in even communication stages (2, 4, 6) had to determine whether a specified numerical probability (from a participant at the previous communication stage) represented a very low, low, medium, or high chance. Participants in Stages 3, 5, 7 had to provide a numerical translation of 'very low', 'low', 'medium', or 'high'. The dependent variable was therefore either a numerical probability (Stages 1, 3, 5, 7), or a verbal classification (Stages 2, 4, 6).
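The yoking procedure described above can be sketched as a short loop that alternates between the two translation directions. This is only an illustration of the chain's structure: in the studies every translation was a human judgement, whereas the numerical values and cut-offs below (`ASSUMED_NUMBERS`, the thresholds in `number_to_verbal`) are hypothetical placeholders and are not intended to reproduce any reported result.

```python
# Sketch of one yoked communication chain (Study 1 structure).
# All translation rules are assumptions made for illustration only.

ASSUMED_NUMBERS = {"unlikely": 20, "very low": 10, "low": 30, "medium": 50, "high": 80}


def verbal_to_number(expression):
    """Odd stages: turn a verbal expression into a 0-100 estimate (assumed values)."""
    return ASSUMED_NUMBERS[expression]


def number_to_verbal(probability):
    """Even stages: classify a 0-100 estimate into four categories (assumed cut-offs)."""
    if probability < 20:
        return "very low"
    if probability < 40:
        return "low"
    if probability < 60:
        return "medium"
    return "high"


def run_chain(n_stages=7, start="unlikely"):
    """Return [(stage, response), ...]; each response is yoked to the previous one."""
    message = start
    chain = []
    for stage in range(1, n_stages + 1):
        if stage % 2 == 1:
            message = verbal_to_number(message)   # words -> number
        else:
            message = number_to_verbal(message)   # number -> words
        chain.append((stage, message))
    return chain


if __name__ == "__main__":
    for stage, response in run_chain():
        print(stage, response)
```

Replacing the two placeholder functions with participants' actual responses would give one observed chain per Stage 1 participant within a Severity condition.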
We manipulated severity by varying the consequences of a heavy rainfall event, tasking participants to communicate its likelihood. Participants in the Severe condition were told that the rainfall will fall in "Manila, the most densely populated area of the Philippines. Consequently, in the event that heavy rainfall occurs there will be major damages in residential areas and a high number of casualties due to drowning." Participants in the Non-severe condition were told that the rainfall will fall in "Abra Province, a remote, mostly uninhabited area of the Philippines. Consequently, in the event that heavy rainfall occurs there will be no impact on residential areas." The precise wording of the materials is shown in Fig. 1. We chose the Philippines as the focus of the scenario, partly for consistency with Liefgreen et al. (2024) and having access to colleagues who could suggest Abra Province as a sparsely inhabited area,² and partly because we anticipated that our UK-based participants were unlikely to have direct first-hand affiliation with the country. Due to the requirement for 7 experimental 'stages', we only used a single scenario in each of the current studies. Each participant provided informed consent, answered 4 demographic questions (age, gender, whether English is native language, whether fluent in English), provided a response to the scenario, indicated their year of birth, and was debriefed as to the purpose of the study.

Results
We analysed the different translation tasks separately. First, we analysed all communication stages (1, 3, 5, 7) where participants were required to provide a numerical translation of a verbal probability classification. Second, we analysed communication stages (2, 4, 6) where participants provided a verbal classification for a given numerical probability. We did not seek to combine these data given the different nature of the response scales (0-100 vs. 1-4). Note, however, that effects observed in both these analyses might represent increases in either or both of the translations (verbal to numerical; numerical to verbal), as the analyses simply compare probabilities across Stages 1, 3, 5, 7 (or 2, 4, 6). All analyses in this article were undertaken using R (R Core Team, 2021), with the use of packages including ggplot2 (Wickham, 2016) for visualisations, and afex (Singmann, Bolker, Westfall, Aust, & Ben-Shachar, 2023) for inferential statistics. All data and analysis code for this article are available at https://osf.io/efd4m/?view_only=c6f826c0a5054fa1961f697a459e5420.
To test whether there was evidence for the increase across communication stages 'levelling off', we tested for a quadratic component in a regression involving Communication stage (collapsing across Severity, given that there was no interaction). There was no evidence for such a component (p = .14).

Verbal translations of numbers (Stages 2,4,6)
The clear rank ordering of verbal probability terms (very low < low < medium < high) permitted us to analyse the data for the even communication stages in the same way as for the odd communication stages. Fig. 3 suggests the same pattern as for the numerical translations. A 3 × 2 ANOVA (treating probability classification as a continuous variable) replicated the results observed for the numerical translations, with a main effect of Severity, F(1, 265) = 10.4, p = .001, ηp² = 0.04, and a main effect of Communication stage, F(2, 265) = 18.0, p < .001, ηp² = 0.12. The predicted interaction was, again, not observed, F(2, 265) < 0.1, p = .964, ηp² < 0.01. Again, there was no significant quadratic component to the effect of Communication stage (p = .36).

Discussion
Although Study 1 revealed a main effect of Communication stage, this was not qualified by the predicted interaction with Severity. One possible reason for this is that participants did not perceive the Non-severe condition as truly neutral. Study 2 aimed to ensure the neutrality of the Non-severe condition and included a manipulation check to test that neutrality.

Study 2
In addition to attempting to ensure the neutrality of the Non-severe condition, Study 2 sought to enhance the ecological validity of our investigation. Consequently, in Study 2, participants were allowed to freely express any verbal probability that felt natural to them during communication stages 2, 4, and 6. This modification aimed to capture a more general representation of how subjective probability might be affected by successive translations, generalising the effect observed in Study 1 beyond four probability categories.
As in Study 1, we predicted an interaction between Communication stage and Severity, whereby probability estimates would escalate with increased communications for severe events, but not for neutral events. This hypothesis was pre-registered (https://osf.io/efd4m/?view_only=c6f826c0a5054fa1961f697a459e5420),³ where we further specified our focus on the numerical estimates in Stages 1, 3, 5, and 7, given the challenges associated with ordering the Verbal Probability Expressions (VPEs) provided in Stages 2, 4, and 6.
Moreover, in our pre-registration, we planned to explore the possibility of the effect reaching an asymptote at some stage of the study without making specific predictions.This allows us to investigate the potential limits of the observed probability increase in the communication chain.

Participants
We pre-registered to target 770 participants in total. A total of 760 UK-based participants subsequently completed all questions they were assigned. Eleven participants were excluded for failing the pre-registered attention check. A further 26 participants provided verbal probability expressions in Stage 2, 4 or 6 that were unusable (typically because they used numbers). After these exclusions, a total of 723 participants (528 females, 190 males, aged 18 to 80 years [median = 36]) provided usable data (360 in the Severe condition), which provides >80% power to detect a small-medium interaction effect (f = 0.17) across Stages 1, 3, 5, 7. One participant reported not being fluent in English (participants were not asked if they were native speakers in this study).

Design and procedure
The Design and Procedure were the same as in Study 1, with two changes. First, participants answered a manipulation check question after providing their main (likelihood) response (on a separate page): "Were the heavy rainfall weather event to occur in [Metro Manila / the Philippine Sea], how good or bad would that be?" Answers were provided on a −3 (Extremely bad) to +3 (Extremely good) scale, where 0 was labelled as Neutral.
Second, we changed the nature of the verbal probability response provided by participants in Stages 2, 4, 6 (and subsequently translated into numbers by those in Stages 3, 5, 7). Participants were free to use any verbal characterisation of likelihood they deemed appropriate: "Please write a sentence to communicate the chance of heavy rainfall in Metro Manila to the readers of your newspaper without using numbers." So as to ensure the integrity of the verbal probability subsequently provided to participants in Stages 3, 5, 7, participants in these stages read: "Your contact in the Philippines meteorological centre (PAGASA), when asked to describe the chance of heavy rainfall in [Metro Manila / the Philippine Sea] replied: [complete sentence reproduced from a previous participant]." As in Study 1, each participant in Stages 2-7 received probabilistic communications (verbally or as numbers) from a participant at a previous Communication stage. An example single chain is shown in Fig. 4.

Materials
Aside from minor changes (outlined above) to accommodate the more naturalistic VPEs, the materials in the Severe condition were the same as in Study 1. In an effort to ensure that the Non-severe condition was actually neutral, the location of the rain in this condition was changed to the Philippine Sea, where "there will be no impacts because no floods result from heavy rainfall over the sea."

Manipulation check (pre-registered)
Across the whole study, participants saw the potential outcome in the Severe condition as worse (M = −2.59, SE = 0.04) than the potential outcome in the Non-severe condition (M = 0.25, SE = 0.05), F(1, 676) = 1676.0, p < .001, ηp² = 0.71. The manipulation check was not affected by Communication Stage (F < 1), nor was the effect of Severity qualified by an interaction with Communication Stage, F(6, 676) = 1.3, p = .28, ηp² = 0.01.⁴ The positive mean for the Non-severe condition was (surprisingly) significantly greater than zero, t(345) = 4.5, p < .001, but the important result for current purposes is that it is clear that participants did not view this outcome as negative. Some of the extended verbal responses in Stages 2, 4 and 6 were supportive of a possibility that participants saw it as a good thing that the heavy rainfall was hitting the sea, rather than the land. The mean for the Severe condition was significantly below zero, t(343) = 61.3, p < .001.

Numerical translations of VPEs (pre-registered)
As in Study 1, we first analysed the results from Stages 1, 3, 5 and 7, where participants had to provide a numerical translation of a verbal expression from a previous participant (Stages 3, 5, 7), or of 'unlikely' (Stage 1). Fig. 5 suggests a replication of the result from Study 1, with estimates for severe events higher than for non-severe events, and an […] = 51.2, p < .001, ηp² = 0.27. On this occasion, there was also a small, but significant, interaction, F(3, 414) = 2.9, p = .04, ηp² = 0.02. From Fig. 5, it appears as though this reflects the Severe condition beginning to reach asymptote. Although our pre-registration stated that we would follow up a significant interaction with simple effects, we decided not to on this occasion, given the small size of the interaction, and the fact that this is the only time we observed a significant interaction across our studies.
Fig. 4. […] In this chain, that was translated as 25%, which was subsequently translated as "There is a moderate to medium chance…" As is clear from this example, not every chain increased monotonically.
Without making any predictions, we pre-registered that we would explore the possibility for an asymptote in the increase in probability estimates across Communication stages. Due to the significant interaction, we tested the Severe and Non-severe conditions separately. A negative quadratic component suggested an asymptote effect in the Severe condition, B = −5.3, p < .001, but not in the Non-severe condition, B = 0.85, p = .66.

Verbal translations of numbers
We pre-registered that we would undertake an analysis of verbal translations (VPEs) of numbers in Communication stages 2, 4, and 6.To ensure accurate and consistent translations, we conducted our own context-free translation task, rather than relying on translations reported in previous literature.

Context-free translation task. One hundred participants from the same participant pool as the main study (Prolific) were recruited (69 females, 31 males, mean age = 37 years). We obtained numerical translations for a total of 64 Verbal Probability Expressions (VPEs). These were VPEs that appeared at least twice in the whole dataset (Communication stages 2, 4, 6), and accounted for 88% of all verbal responses. The VPEs included core expressions of likelihood such as "chance," "risk," "likelihood," "probability," "possibility," and "percentage."⁵ A pragmatic decision was taken to treat these stems as equivalents when analysing the VPEs provided in the main study, aiming to counter fatigue in the context-free translation task while maintaining accurate translations. To ensure translations were appropriate, participants in the context-free translation task provided estimates for one of these versions. For example, each participant only rated one of: 'good possibility', 'good chance', 'good probability', 'good risk'. The same numerical equivalent was then assigned to each of these expressions (the average of those assigned to all four). A similar approach was taken for 'almost certain', 'virtually certain', 'close to certainty' and 'near certain.' In total, each participant rated 32 VPEs.
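The pooling of stem-equivalent expressions can be illustrated with a minimal sketch. The ratings below are invented for illustration, and the sketch assumes that "the average of those assigned to all four" means the mean of the four per-expression means; pooling all individual ratings before averaging would be an alternative reading. The function name `shared_equivalent` is hypothetical.

```python
from statistics import mean

# Hypothetical 0-100 ratings; each rater saw only one version of the phrase.
ratings = {
    "good possibility": [60, 70, 65],
    "good chance": [70, 75],
    "good probability": [68, 72],
    "good risk": [55, 65],
}


def shared_equivalent(ratings_by_expression):
    """Average the per-expression means and assign that value to every expression."""
    pooled = mean(mean(values) for values in ratings_by_expression.values())
    return {expression: pooled for expression in ratings_by_expression}


equivalents = shared_equivalent(ratings)

if __name__ == "__main__":
    for expression, value in equivalents.items():
        print(f"{expression}: {value:.1f}")
```

Under this reading, every expression in the group receives the same numerical equivalent regardless of which stem a given participant happened to rate.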
The task itself followed Stewart, Chater, and Brown (2006) and asked participants to imagine they had a black bag full of 100 coloured balls and had to randomly pick one red ball from the bag without looking. They then read a verbal description of the chance of picking a red ball from the bag (e.g., "It is likely that you will pick a red ball from the bag"), before providing a numerical answer (from 0 to 100) to the question "How many balls do you think are red?" They were provided with two examples before commencing the task. These examples used the terms 'certain' and 'impossible', stating that the participant would answer '100' and '0' respectively in these instances. As in the main experimental task, participants reported their year of birth at the end of the study, as well as their age at the start, to serve as an attention check. Four participants were excluded on this basis. No responses in the context-free translation task were further than 2 standard deviations from the mean. The following analysis uses mean translations of the VPEs provided, but results are qualitatively identical if median translations are used instead (see Supplementary Materials).

Putting it all together -7 communication stages
Because participants were free to use any verbal description in Study 2, we were able to aggregate data across the whole study. Fig. 7 shows that there is a steady increase in subjective probability estimates (in numerical and verbal formats) across the study. We observed main effects of Severity, F(1, 676) = 92.0, p < .001, ηp² = 0.12, and Communication Stage, F(6, 676) = 36.4, p < .001, ηp² = 0.24. The interaction was not statistically significant, F(6, 676) = 1.7, p = .12, ηp² = 0.01.
Including a quadratic component in a regression predicting subjective probability estimates from Communication stage, there was evidence for a small negative quadratic component, B = −0.63, p < .03, suggesting that probability estimates began to stabilise towards the end of the study. From Fig. 7, this would appear to be at a value around 75%.
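The logic of testing for a levelling-off can be sketched as a least-squares fit of estimate = b0 + b1·stage + b2·stage², where a negative b2 indicates that the increase slows at later stages. The sketch below is not the analysis code used for the article (which was written in R): it is a plain-Python illustration on made-up stage means, it returns coefficients only (no standard errors or p-values), and the function name `fit_quadratic` is hypothetical.

```python
def fit_quadratic(stages, estimates):
    """Least-squares coefficients [b0, b1, b2] for b0 + b1*stage + b2*stage**2."""
    rows = [[1.0, float(s), float(s) ** 2] for s in stages]
    # Normal equations (X'X) b = X'y, stored as an augmented 3 x 4 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, estimates))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


# Made-up stage means that rise and then flatten (illustration only).
stages = [1, 2, 3, 4, 5, 6, 7]
estimates = [20 + 15 * s - s * s for s in stages]

b0, b1, b2 = fit_quadratic(stages, estimates)
peak_stage = -b1 / (2 * b2)  # stage at which the fitted curve stops rising

if __name__ == "__main__":
    print(f"linear = {b1:.2f}, quadratic = {b2:.2f}, peak near stage {peak_stage:.1f}")
```

With a negative quadratic coefficient, the fitted curve peaks at −b1 / (2·b2); a peak at or beyond the final stage corresponds to estimates that are still rising, but more slowly, by the end of the chain.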

Discussion
At this stage, we have ascertained that probabilities increase across communication stages in our weather scenario, as VPEs are translated to numbers (for communication), and numbers to VPEs (for communication). Contrary to a priori predictions, the increase in probability across communication stages was not reliably qualified by an interaction with outcome severity. Although such an interaction was observed in the only pre-registered analysis (the numerical translations in Study 2), this was: a) small and marginally significant; b) the only time across four analyses (across both studies) that such an interaction was observed. The body of evidence thus far is therefore certainly not supportive of the increase across communication stages being reliably greater in the Severe condition than the Non-severe condition. One possibility is that the effect is driven by a sub-sample of participants reporting '100' in any odd-numbered stage (whether due to a deterministic-sounding verbal response or not). All results reported in this manuscript are, however, qualitatively identical if responses of '100' are removed from the analysis (the one exception is that the Communication stage × Severity interaction in the verbal translations of Study 2 does not retain significance, p = .053).
The absence of a difference in the effect of Communication stage between severe and non-severe outcomes suggests a common tendency for all probabilities to be overestimated relative to an underlying reference point. The nature of this reference point, however, remains a key question. In an idealised case with perfectly veridical perception, translation, and communication of VPEs and numerical probabilities, we would expect a consistent and stable subjective probability across the seven communication stages. Given, therefore, that the increase in probability estimates across communication stages is not moderated by severity, where does this increase come from? We propose that the effect likely relates to the directionality of probability communications. Teigen and Brun (1999; see also e.g., Honda & Yamagishi, 2006, 2009, 2016; Teigen & Brun, 1995, 2003) demonstrated that VPEs (even those representing approximately the same subjective probability) can either direct one's attention to the presence of an outcome (e.g., likely; positive directionality) or its absence (e.g., not certain; negative directionality). In addition, numbers tend to fall on the positive side of this classification (Teigen & Brun, 2000). In Study 1, one might consider all the VPEs used (after Stage 1), plus all the numbers, as being of positive directionality. An informal coding of VPEs provided in Study 2 also suggests a (very) strong preponderance of positive directionalities there too. In fact, none of the expressions mentioned in Study 2 would typically be classified as of negative directionality (although 'very low / small chance' might be considered ambiguous; only 7 participants used these terms across the three even stages). The positive directionality of the VPEs might lead to a greater focus on the event's occurrence, and subsequently higher subjective probabilities, despite the initial VPE being of negative directionality ('unlikely') for all participants.

Study 3
In Study 3, we sought to test the directionality explanation by employing a design similar to Study 1, but where the VPEs presented for participants to choose from in Stages 2 and 4 (we only used five stages in total) were all of positive, or all of negative, directionality. If directionality was driving the effect, we expected a more pronounced increase across communication stages when participants chose between positive directionality words than when choosing between negative words (pre-registered at: https://osf.io/efd4m/?view_only=c6f826c0a5054fa1961f697a459e5420). To increase the generalisability of our findings, we used a positively valenced scenario unrelated to the weather domain.

Choosing VPEs
In the first instance, we had to identify potential VPEs for participants to choose between in the even communication stages. We surveyed the list of terms provided in Stewart et al. (2006) to attempt to identify numerically equivalent positive and negative terms.⁶ We supplemented this list by obtaining numerical equivalents (using the same procedure as in Study 2; 97 Prolific participants remained after 3 attention check failures) for five additional negative directionality VPEs identified in the previous literature (Teigen et al., 2023; Teigen & Brun, 1995) in an attempt to identify as wide a range of negative VPEs as possible. Our final list of VPEs attempted to match positive and negative VPEs on median numerical translations. Where this was not possible, we ensured that the median of the negative VPE was higher than that of the positive VPE (to act against our hypothesis). Additionally, we attempted to choose VPEs with small differences between mean and median translations, and where the mean of the positive VPE was not higher than that for its corresponding negative VPE. The highest VPE was the one occasion where we did not achieve this (see Table 1). As can be seen from Table 1, we were unable to identify negative VPEs that covered the whole probability range. To the best of our knowledge, between Stewart et al. (2006) and our pre-test, we covered all phrases identified as negative in the previous literature and found none of them to represent probabilities above 60% (see Supplementary Materials).
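The matching criteria described above can be illustrated with a minimal sketch. The class `VPE`, the function `pair_is_conservative`, and the numerical values are hypothetical names and placeholders introduced for illustration; they are not the Table 1 entries and not the procedure actually scripted for the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VPE:
    """A verbal probability expression with prior numerical translations (0-100)."""
    phrase: str
    median: float
    mean: float


def pair_is_conservative(positive: VPE, negative: VPE) -> bool:
    """Check a positive/negative pair against the criteria stated in the text.

    The negative VPE's median must be at least as high as the positive VPE's
    median, and the positive VPE's mean must not exceed the negative VPE's
    mean, so that any mismatch acts against the directionality hypothesis.
    """
    return negative.median >= positive.median and positive.mean <= negative.mean


# Placeholder values for illustration only (assumed, not taken from Table 1).
positive_example = VPE("positive_example", median=10.0, mean=12.0)
negative_example = VPE("negative_example", median=10.0, mean=13.5)
print(pair_is_conservative(positive_example, negative_example))  # True
```

A pair that fails either check would, under this sketch, be the kind of exception the text reports for the highest VPE.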

Participants
We pre-registered to target 110 participants in each stage (550 total). The experimenter intended to exclude participants from subsequent stages where they provided a year of birth that did not match their age. Thus, a single exclusion in Stage 1 reduces the total participants by five (as responses from only 109 participants can be passed to subsequent stages, plus those data are excluded from Stage 1). The experimenter identified three participants who failed the manipulation check whilst running the stages (one in each of Stages 1, 3, 4), reducing the total possible number of participants by 10. At analysis, a further 3 participants were identified whose provided year of birth did not match their age. Thus, the final sample for analysis was 537 (345 females and 187 males, aged 18 to 79 years [median = 37]). All participants reported being fluent in English.

⁶ We used the translations from Stewart et al. (2006) due to the greater number of negative directionality words included there (the paucity of such words provided by our participants has already been documented in the discussion of Study 2).

Fig. 6. Verbal probabilities increased across communication stages in Study 2, with estimates of severe events being higher than for non-severe events. Error bars represent 95% confidence intervals. Numerical equivalents for the VPEs use the mean numerical translations from the translation task.

Design and procedure
With our focus on the increase in probability estimates, rather than whether they asymptote, we employed a 5 (Communication stage) × 2 (Directionality) yoked between-participants design. Participants were randomly assigned to Directionality conditions, whilst the Communication stages were run serially (participants could only participate in one Communication stage), within a single day (January 29th, 2024).
Because participants were restricted to choosing VPEs from the low end of the probability range (see Table 1), participants provided a numerical translation of 'very unlikely' at Stage 1 (rather than 'unlikely', as in Studies 1 & 2).This was to allow for more of an increase in estimates across communication stages.The remainder of the study proceeded in the same way as Study 1, except that the VPEs used were as in Table 1 (dependent on condition).
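The bookkeeping implied by the yoked design, as reported in the Participants section of this study, can be sketched as follows. The function names `slots_lost` and `final_sample` are hypothetical, and the sketch assumes that an exclusion made while the stages are running removes the excluded response plus every downstream slot yoked to it, whereas an exclusion made at analysis removes only that participant's own data.

```python
N_STAGES = 5            # communication stages in Study 3
TARGET_PER_STAGE = 110  # pre-registered target per stage


def slots_lost(stage: int, n_stages: int = N_STAGES) -> int:
    """Slots lost when a participant at `stage` is excluded during data collection.

    The excluded response itself is dropped, and no response can be passed on
    to any later stage of that chain.
    """
    if not 1 <= stage <= n_stages:
        raise ValueError("stage must be between 1 and n_stages")
    return n_stages - stage + 1


def final_sample(run_time_exclusion_stages, post_hoc_exclusions: int) -> int:
    """Planned sample minus chain losses minus exclusions identified at analysis."""
    planned = N_STAGES * TARGET_PER_STAGE
    chain_losses = sum(slots_lost(stage) for stage in run_time_exclusion_stages)
    return planned - chain_losses - post_hoc_exclusions


print(slots_lost(1))                                   # 5
print(final_sample([1, 3, 4], post_hoc_exclusions=3))  # 537
```

Under these assumptions, exclusions in Stages 1, 3 and 4 cost 5 + 3 + 2 = 10 slots, which together with the three exclusions at analysis reproduces the reported final sample of 537.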

Materials
To increase the generalisability of our tests of probability escalation, we used a new, positively valenced, scenario, where participants were asked to report on the chances of their firm's income target being met (see Fig. 8).

Results
As in Study 1, we analysed the different translation tasks separately, given the different nature of the response scales (0-100 vs. 1-4). We pre-registered that the continuous variable, with more stages (numerical translations of VPEs), constituted our primary dependent variable.

Numerical translations of VPEs (pre-registered)
As can be seen in Fig. 9

Verbal translations of numbers (Stages 2 & 4; pre-registered)
Due to the lower number of stages in this study, we pre-registered that all inferential weight in this study rests on the previous analysis. As suggested in Fig. 10, there was little indication that verbal likelihoods increased between Communication stages 2 and 4, F(1,210) = 2.7, p = .102, ηp² = 0.01. There was no reliable effect of Directionality, either as a main effect, F(1,210) = 2.5, p = .113, ηp² = 0.01, or interacting with Communication stage, F(1,210) = 1.9, p = .174, ηp² < 0.01. As with the verbal translations, if anything, the trend indicated a greater increase in the Negative condition than in the Positive condition.

Table 1
Verbal probability expressions used in Study 3, along with means and medians of their prior numerical translations (Stewart et al., 2006).

Discussion
The increase of probability estimates across communications was replicated in our critical dependent variable (as pre-registered) for a completely different, positively valenced, scenario. The study offered no support for the hypothesis that the effect would be restricted to positively directional verbal probability expressions, with no interaction between Communication stage and Directionality. Inferences from the latter result are, however, limited by the restricted range of the probability expressions we were able to use. One result observed during the preparation for this study was that negative directionality expressions rarely extend beyond 50%. This limits the testability of this hypothesis, especially with people exclusively providing positively directional expressions when given free choice (Study 2).⁷ In Study 4, we aimed to rule out an alternative explanation for the increase in probabilities across communication stages, which we hereafter term the Probability Escalation Effect.

Study 4
In Study 4, we sought to further test the generalisability of the Probability Escalation Effect in a scenario where severity was manipulated via the strength of the event itself, rather than via its consequences. Although such a manipulation confounds severity and base rate (since strong events are typically rarer than weak events; see also Weber & Hilton, 1990), the focus of this study is not on the Severity effect. One explanation for the increase of probabilities in the non-severe conditions of Studies 1 and 2 could be that the event itself was still large in magnitude ('unusually heavy rainfall'). If the magnitude of the event itself is what is critical for the Probability Escalation Effect (cf. Keren & Teigen, 2001; Løhre, 2018), in this study we should observe an interaction between Severity (which now also constitutes a manipulation of Magnitude) and Communication stage. By manipulating both event magnitude and event severity together, we provide the largest differentiation between severity conditions. Because our current hypothesis (despite the failure to obtain direct support for it in Study 3) is, however, that the primary driver of the effect is the directionality of probability communication, we hypothesised a main effect of Communication stage in this study, with no interaction (pre-registered at: https://osf.io/efd4m/?view_only=c6f826c0a5054fa1961f697a459e5420).

Participants
We pre-registered to target 110 participants in each stage (550 total). One participant in Stage 1 provided a year of birth that was not consistent with their age, but this was not identified when carrying estimates to Stage 2. For an unknown reason (perhaps a failure in the Prolific participant counter?), 111 participants contributed data in Stage 2. Eight participants' data were unusable (either because they used a number, or because they failed the attention check). Three of the 100 participants in Stage 3 failed the attention check, responses from three participants in Stage 4 were unusable, and one participant in Stage 5 reported a date of birth that was not consistent with their age. Following these exclusions, 521 participants contributed to the study (332 females, 185 males, aged 18 to 73 years [median = 36]). All participants reported being fluent in English.

Design and procedure
The Design and Procedure were the same as in Study 2, except that there were five communication stages instead of seven, resulting in a 5 (Communication stage) × 2 (Severity) yoked between-participants design. As in Study 3, there was no manipulation check. A major change was our operationalisation of the Severity manipulation. Instead of manipulating the consequences of the event, the event itself was manipulated. The Severe condition forecasted a 'large volcanic eruption', which would cause 'major damages and a large number of casualties'. The Non-severe condition forecasted the possibility that 'traces of volcanic dust will be emitted', which 'poses no risk to health and safety.' As in Studies 1 & 2, the VPE provided in Stage 1 was 'Unlikely'.

Materials
In order to ascertain the generalisability of the Probability Escalation Effect, we used a new scenario. Participants were told that they worked in a government communications team for a city in the shadow of a volcano. Each week, they were provided with a report from the country's seismologists that documented the likelihood of either a large volcanic eruption (Severe condition), or traces of volcanic dust emitting from the volcano (Non-severe condition). They received the report, and were required to communicate this to the city's residents, either with a number (Stages 1, 3, 5) or a sentence without using numbers (Stages 2 & 4; using the same operationalisation as Study 2). The likelihood in the report was stated as 'unlikely' in Stage 1, whilst in Stages 3 and 5 it was a verbatim sentence from a participant in the previous stage; in Stages 2 and 4, the likelihood was presented as a number (from a participant in the previous stage). For verbatim materials, see https://osf.io/efd4m/?view_only=c6f826c0a5054fa1961f697a459e5420.

⁷ An alternative test of the Directionality hypothesis might provide a high probability (e.g., 'likely') at Stage 1 to ascertain whether probabilities decrease in the Negative condition. Such a study would, however, have the same difficulties as the present one in terms of VPE selection. We chose 'very unlikely' as a starting point, as our primary aim was to understand the Probability Escalation Effect.

Results
For the first time, we pre-registered that our central analysis would include all Communication stages, where VPEs were translated to numbers using the mean responses from the translation study described in Study 2. A number of participants did not use VPEs represented in the translation study and could therefore not be included in the analysis. Consequently, 447 participants were included in the analyses (Stage 1: 109; Stage 2: 74; Stage 3: 100; Stage 4: 68; Stage 5: 96).
In this analysis, we used mean translations of VPEs (from Study 2's translation task), but results are qualitatively identical if median translations are used instead (see Supplementary Materials).
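As a rough illustration of this step, the sketch below maps free-text verbal responses to numerical equivalents through a lookup table and drops responses that have no entry, mirroring the exclusion of participants whose VPEs were not represented in the translation study. The function `to_numeric`, the normalisation rule, and the lookup values are assumptions made for illustration; they are not the translation-task means reported for Study 2.

```python
from statistics import mean


def to_numeric(responses, lookup):
    """Map each verbal response to its numerical equivalent; drop unmapped ones."""
    normalised = (response.strip().lower() for response in responses)
    return [lookup[phrase] for phrase in normalised if phrase in lookup]


# Hypothetical mean translations (0-100); assumed values, not the study's data.
lookup = {"unlikely": 20.0, "possible": 45.0, "likely": 70.0}

responses = ["Unlikely", "possible ", "on the cards", "likely"]
values = to_numeric(responses, lookup)
print(values)        # [20.0, 45.0, 70.0]
print(mean(values))  # 45.0
```

Swapping the lookup for one built from median translations would correspond to the robustness check mentioned above.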

Numerical translations of VPEs
Although not pre-registered, given the noise associated with our translations of VPEs to numbers for analysis, we recognised the importance of testing the main effect of interest solely looking at the numerical translations of VPEs. As can be seen in Fig. 11, probability estimates increased with multiple communications, F(2,299) = 21.4, p < .001, ηp² = 0.13. In this analysis, estimates did not differ between the Severe and Non-severe conditions, F(1,299) = 1.2, p = .270, ηp² < 0.01. No interaction was observed between Severity and Communication stage, F(2,299) = 0.5, p = .620, ηp² < 0.01.
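The effect sizes reported above can be cross-checked against the F statistics using the standard identity for partial eta squared, ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is only a consistency check on the rounded values as printed; the function name `partial_eta_squared` is introduced here for illustration and is not part of the authors' analysis code.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Main effect of Communication stage, F(2,299) = 21.4
print(round(partial_eta_squared(21.4, 2, 299), 2))  # 0.13

# Main effect of Severity, F(1,299) = 1.2, and the interaction, F(2,299) = 0.5
print(partial_eta_squared(1.2, 1, 299) < 0.01)  # True
print(partial_eta_squared(0.5, 2, 299) < 0.01)  # True
```

All three recomputed values agree with the effect sizes given in the text.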

Discussion
As in Studies 1-3, probability estimates increased across the communication chain, and the effect was not moderated by event severity. This study therefore replicated the results observed in Studies 1 and 2. Importantly, the continued lack of a Communication stage × Severity interaction in Study 4 suggests that the increase in probabilities occurs regardless of the magnitude of the event described. Moreover, the continued presence of the effect in Studies 3 and 4 also suggests that the effect cannot solely be attributed to participants responding to the role of 'journalist' (in Studies 1 & 2) by exaggerating probabilities.⁸ Rather, we suggest that the increase in probability estimates across a communication chain is a general feature of probability communications.
The results of Study 4 differed from those in Studies 1 & 2 by not demonstrating a Severity effect in the numerical translations (or at Stage 1; see the overlapping datapoints at Stage 1 in Fig. 11). This is likely due to the fact that event severity and event base rate were (knowingly) confounded in this study (see Harris & Corner, 2011; Weber & Hilton, 1990).

General discussion
Both translations from VPEs to numerical probabilities (e.g., Harris & Corner, 2011) and from numerical probabilities to VPEs (Liefgreen et al., 2024) have been shown to be higher for more severe outcomes than less severe outcomes. The communication chain studies presented in this article sought to test the downstream consequences of this Severity effect. The central finding was that mean probabilities increased across communication stages, as VPEs were translated to numbers (for communication) and numbers to VPEs (for communication). This Probability Escalation Effect was, however, observed for both severe and non-severe outcomes, suggesting that it cannot simply be considered a consequence of the Severity effect. The effect held whether participants chose a VPE from a pre-specified list of four (Studies 1 & 3), or provided any verbal expression of probability they chose (Studies 2 & 4).
Holtgraves and Perdew (2016) also employed a yoked design, across two stages. In Stage 1, communicators selected a VPE to communicate a numerical risk (20%, 50%, 80% probability), which was then interpreted (not re-communicated) numerically by recipients in Stage 2. The lower VPE chosen in Stage 1 led to an overall effect in their study whereby subjective probabilities of the severe outcome (e.g., car needs a new transmission) in Stage 2 were lower compared to the less severe outcome (e.g., car needs a new battery). Whilst this effect of severity could reflect politeness concerns in a dyadic communication context, the most relevant result for our investigation is the comparison of estimates in the second stage with the original probability communicated. Estimates in Stage 2 for the outcomes originally described as having 20% and 50% probability were increased (the lower of the two severity-condition means was 55% and 58%, respectively), consistent with the overall effect of Communication stage reported here. The lack of an increase for the 80% probability condition (with mean estimates in the second stage being 69% and 78% depending on Severity condition) suggests a limit to the Probability Escalation Effect. This limit might be attributed to the fuzzy nature of VPEs, or a hedging strategy to protect the speaker from being blamed for a wrong prediction (cf. Juanchich et al., 2012), leading interpretations to regress towards the midpoint of the scale. The observed increase for the 50% condition does, however, demonstrate that the overall trend is for subjective probability escalation.
We continue to view directionality as a promising explanation for the Probability Escalation Effect (as introduced in the discussion to Study 2). In our preparation for Study 3, however, we identified considerable challenges with providing a test of this explanation. Critically, we were unable to identify any VPEs of negative directionality that conveyed a numerical probability much greater than 50% and, indeed, the expressions that we were forced to use to match positive and negative expressions might not be the most natural expressions for people to use. Whilst this restricted our ability to test the directionality account, the preponderance of positively directional VPEs across the probability space in Studies 2 and 4 (see also Juanchich, Teigen, & Villejoubert, 2010; Teigen & Brun, 1995) highlights the significance of the results reported here. This preponderance, we propose, very likely leads to an escalation in probability estimates wherever probabilities are translated between numbers and words. Descriptively, this takes the form of a framing effect. Directing one's attention to an event via a numerical expression of probability, or a verbal probability expression, appears to lead to an onwards communication of that event's likelihood at the high end of plausible probabilities (whether verbal or numerical).
In the Introduction to this paper, we predicted a Severity × Communication stage interaction, expecting an increase of probability estimates solely in the Severe condition, such that the increase could be attributed to the Severity effect. The lack of such an interaction demonstrated a more general effect. The lack of an interaction also, however, questions the nature of the main effect of Severity identified in Studies 1, 2 and 4. Specifically, were numerical translations of VPEs and verbal translations of numbers always higher for severe outcomes than non-severe outcomes, this should manifest as a Severity × Communication stage interaction in the current design. An inspection of Figs. 2, 7, and 11 suggests that the main effect of severity is manifest in different ways across the three studies. In Study 1, it seems as though a severity effect is generated at Stage 1, and that the increased probability estimates in the severe condition are then maintained throughout the subsequent communication stages. One reason why estimates might not have increased in the Severe condition (over the Non-severe condition) in subsequent Communication stages could be the additional experimental noise in subsequent stages, where each participant receives a different VPE (or number). In contrast to Study 1, there is a hint of the predicted interaction (significant in the numerical translations of VPEs) in Study 2. Finally, the different pattern observed in Study 4 is likely a consequence of the probable confound between high severity and low base rate in this study. Weber and Hilton (1990) found that the Severity effect was only observed after such a base rate confound was statistically controlled for.
Regardless of the precise mechanism underlying the effect, the present results show that when people translate from verbal to numerical probabilities (and vice versa), subjective probabilities will be distorted. Framing an outcome in terms of its occurrence (rather than its non-occurrence) is likely the most natural framing. Where such frames dominate, the current research would seem to suggest that subjective probabilities will increase as a communication chain extends (at least up to a probability of around 70-75%). Future research might investigate whether the effect persists where people are free to choose the format in which to communicate a risk. Participants might, for example, be asked to 'Use your own words to communicate this risk', rather than specifically being required to use either a number or a verbal communication. Such a study would extend our understanding of the limits of probability escalation. Alternatively, a study addressing solely numerical communications (e.g., where a lower-bound interval 'more than X%', or upper-bound 'less than Y%' is translated into an approximate point value, 'around Z') might further our understanding of the possible role of directionality in the Probability Escalation Effect (see Teigen, 2023).⁹
The social context of risk is known to influence risk perceptions and feelings (see e.g., Kasperson et al., 1988; Pidgeon et al., 2003; Slovic, 2010). The current research adds to this body of knowledge, demonstrating how perceptions of the likelihood of a risk may increase across communication chains. The lack of an interaction between Communication stage and Severity in the present studies suggests that both risks and benefits might be amplified through this mechanism, depending on the focus of the communication. Previous communication research, however, has suggested that negative information (i.e., focussing on risks rather than benefits) is more likely to persist across communication stages (Jagiello & Hills, 2018; Moussaïd et al., 2015; but see Hoeken & Strick, 2021, for the opposite finding focussing only on the first communication). The combination of this effect with that reported in the present article therefore seems to provide a means by which subjective risks will typically increase with subsequent communication.
The degree to which the Probability Escalation Effect is problematic in practical terms likely depends on the prevalence of format translations in people's communications of risk information. Corrective interventions may be explored. Although Jagiello and Hills (2018) found limited benefit of the reintroduction of the original (qualitative) statements at Communication stage 6, re-introducing the probability in a quantitative (numerical) format from a trusted expert may be facilitatory. It remains for future research to explore such a question.

Ethical statement
All studies reported in the article received ethical approval from the UCL Department of Experimental Psychology (EP/2021/001).

Fig. 1. Precise wording of the experimental materials in Study 1. Notes: In Stage 1 the text introducing the PAGASA contact read slightly differently: "Your contact in the Philippines meteorological centre (PAGASA), when asked to describe the chance of heavy rainfall in Metro Manila replied: 'It is unlikely that heavy rainfall will occur in Metro Manila in the next 24 hours.'" In all studies, numerical probabilities were added into an open text box.

Fig. 2. Numerical probabilities increased across communication stages in Study 1, with estimates of severe events being higher than for non-severe events. Error bars represent 95% confidence intervals.

Fig. 3. Verbal probability classifications increased across communication stages in Study 1, with estimates of severe events being higher than for non-severe events. Individual responses are displayed as well as mean responses, for which error bars (95% confidence intervals) are included.

Fig. 4. An example communication chain observed in Study 2 (Severe condition). Numbers in square brackets represent the Communication stage. Note: All participants saw 'Unlikely' at Stage 1. In this chain, that was translated as 25%, which was subsequently translated as "There is a moderate to medium chance…" As is clear from this example, not every chain increased monotonically.

Fig. 5. Numerical probabilities increased across communication stages in Study 2, with estimates of severe events being higher than for non-severe events. Error bars represent 95% confidence intervals.
Communication stages 2, 4, 6. The context-free translation task provided us with numerical equivalents for the majority of participants' responses (268 / 301; 134 in the Severe condition).

Fig. 7. Probability estimates (numerical and verbal) increased across communication stages in Study 2, with estimates of severe events being higher than for non-severe events. Error bars represent 95% confidence intervals.

Fig. 8. Precise wording of the experimental materials in Study 3 (Stages 1, 3, 5). Note: In Stages 2 and 4, minimal changes were made to the text to ensure grammaticality; participants were provided with a numerical estimate from a previous participant and asked: "As part of your narrative summary of the firm's current position, please report the chance of this year's targets being met on the scale below." Participants selected one of the 4 VPEs from the appropriate condition (see Table 1), presented in increasing order from left to right.

Fig. 9. Numerical probabilities increased across communication stages in Study 3, though with no difference between positive and negative VPEs. Error bars represent 95% confidence intervals.

Fig. 10. Verbal probabilities did not increase across communication stages in Study 3. Individual responses are displayed as well as mean responses, for which error bars (95% confidence intervals) are included. Note: Response categories represent increasing probabilities; the verbal expressions were different in the Negative and Positive Directionality conditions.

Fig. 11. Probability estimates increased across communication stages in Study 4, with estimates of severe events being higher than for non-severe events. Error bars represent 95% confidence intervals.