A user perspective on the avalanche danger scale – insights from North America

Danger ratings are used across many fields to convey the severity of a hazard. In snow avalanche risk management, danger ratings play a prominent role in public bulletins by concisely describing existing and expected conditions. While there is considerable research examining the accuracy and consistency of the production of avalanche danger ratings, far less research has focused on how backcountry recreationists interpret and apply the scale. We used 3195 responses to an online survey to provide insight into how recreationists perceive the North American Public Avalanche Danger Scale and how they use ratings to make trip planning decisions. Using a latent class mixed-effect model, our analysis shows that 65 % of our study participants perceive the avalanche danger scale to be linear, which differs from the scientific understanding of the scale as an exponential-like increase in severity between levels. Regardless of perception, most respondents report avoiding the backcountry at the two highest ratings. Using conditional inference trees, we show that participants who recreate fewer days per year and those who have lower levels of avalanche safety training tend to rely more heavily on the danger rating to make trip planning decisions. These results provide avalanche warning services with a better understanding of how recreationists interact with danger ratings and highlight how critical the ratings are for individuals who recreate less often and who have lower levels of training. We discuss opportunities for avalanche warning services to optimize the danger scale to meet the needs of the users who depend on the ratings the most.


Introduction
A central goal of risk communication is the provision of accurate, timely, and trustworthy information that empowers people to make informed decisions about mitigative actions (Lundgren and McMakin, 2018; Eastern Research Group, Inc. (ERG) and the NOAA Social Science Committee, 2019). In the field of natural and environmental hazards, danger or hazard scales are a common method to communicate the severity of current or expected conditions to the public in a simple way. We see examples of these scales in contexts such as forest fire danger ratings (Province of British Columbia, 2022), air quality health indexes (Environment and Climate Change Canada, 2022), heat and humidity measures (National Oceanic and Atmospheric Administration, 2022), and many more.
Snow avalanches are another context where a danger scale plays a prominent role in risk communication. Snow avalanches are a serious natural hazard in mountainous regions around the world that can threaten settlements, transportation corridors, critical infrastructure (e.g., transmission lines), natural resource extraction (e.g., timber harvesting and mining), and people and infrastructure at remote work sites. In addition, recreationists pursuing backcountry activities, such as backcountry skiing, mountain snowmobiling, ice climbing, or snowshoeing, voluntarily expose themselves to avalanche hazard in many countries.
Public avalanche forecasts (also known as bulletins) published by avalanche warning services are a critical source of information for recreationists during trip planning. To make the avalanche hazard information accessible to backcountry users with different levels of avalanche training and education, bulletins communicate avalanche conditions in a tiered format that presents information in layers with increasing levels of detail and complexity. This approach, which is commonly referred to as the information pyramid (European Avalanche Warning Services, 2021b), is designed to maximize comprehension across audiences with varying avalanche education, experience, and needs. The first information that recreationists see when they consult public avalanche bulletins is the avalanche danger rating, which communicates the general severity of avalanche conditions in a region over a certain amount of time (Statham et al., 2010). The scales that have been used to communicate avalanche hazard have evolved over time, ranging from four to eight levels across iterations of the danger scale in both North America and Europe (Dennis and Moore, 1996; Mitterer and Mitterer, 2018). However, an ordinal five-level scale with standardized colours, signal words, numbers, and icons has been used most consistently by public avalanche warning services around the world to describe the conditions. The current version of the North American Public Avalanche Danger Scale (Statham et al., 2010; Fig. 1) is closely tied to the Conceptual Model of Avalanche Hazard (Statham et al., 2018a), which defines the key elements of avalanche hazard and provides a workflow for consistent avalanche hazard assessments in North America. While there are slight variations in the signal words and level definitions between the danger scales used in Europe and North America, the main difference is that the primary purpose of the North American Public Avalanche Danger Scale is public risk communication, whereas the European system is used in a wider range of applications that also includes providing warnings for residential areas and transportation networks (Stoffel and Meister, 2004; Stoffel and Schweizer, 2008).
Despite being used as a critical public risk communication tool since 1994, the North American Public Avalanche Danger Scale has not had a comprehensive analysis of how the target audience uses it. Instead, much of the existing research on the danger scale has focused on the production of danger ratings. Consistently producing accurate and credible avalanche hazard assessments is challenging not only due to variability and uncertainty in the data informing the forecast but also because the human judgment involved in the assessment process is susceptible to interpretation and bias (Statham et al., 2018b). Several recent studies have focused on identifying sources of bias or error and improving the production of accurate and credible danger ratings (Clark, 2019; Lazar et al., 2016; Schweizer et al., 2020, 2021; Techel and Schweizer, 2017).
In contrast to the research efforts focused on improving the quality and consistency of avalanche bulletins, there has been relatively little research focused on how recreationists perceive and use the forecast products, including the danger scale. Best practices in risk communication stress that to communicate the severity of conditions effectively, risk communicators must not only provide accurate and credible risk information from a trusted source but also interact with the target audience to understand their knowledge and perspectives. This information is critical for crafting appropriate messages that resonate with the audience (Lundgren and McMakin, 2018; National Research Council, 1996; National Oceanic and Atmospheric Administration, 2016). Applying these principles to an avalanche context, Winkler and Techel (2014) examined recreationists' assessment of the quality of the bulletin website design compared to previous renditions. Engeset et al. (2018) examined the effectiveness of Norwegian avalanche risk communication products and suggested that, although bulletin users considered danger ratings an important piece of information, the ratings alone were not enough to communicate the intended warning information. St. Clair (2019) developed a typology that describes the different ways recreationists use avalanche bulletin information, and Finn (2020) investigated bulletin literacy amongst different recreation user types and provided targeted risk communication recommendations for specific user groups. One of the few studies that explicitly examined how recreationists use the North American Public Avalanche Danger Scale was conducted by Ipsos Reid (2009) in support of its last revision in 2010. Using an online survey, the study examined the effect of the scale's revised definition on recreationists' ability to identify appropriate terrain. While recreationists presented with the new descriptors alone made more conservative terrain choices under Considerable than with the definitions of the old scale, they reverted to their original terrain choices when the descriptors were presented together with the signal words (Ipsos Reid, 2009).
The lack of research on the perception and use of danger scales does not seem limited to the avalanche safety community. A literature review on research on danger scales in general revealed that natural hazard risk communication literature tends to focus on how the public interacts with warnings, alerts, and orders in an emergency or crisis context. Examples include how people responded to mandatory hurricane evacuation orders (Demuth et al., 2018), flood risk and communication perception (Kellens et al., 2013), flash flood mental models and misunderstandings (Lazrus et al., 2016), tornado warnings (Brotzge and Donner, 2013), and pre-crisis earthquake risk communication (Herovic et al., 2020). While these topics are related to danger ratings, the voluntary nature of interacting with natural hazards in recreational settings (e.g., backcountry travel in avalanche terrain, ocean activities with wave hazards and rip currents, and recreating in canyons prone to flash floods) creates a unique context where the transferability of the existing research results may be limited. As far as we are aware, there has been no user-focused research on danger scales in the context of voluntary hazards to date.
Given the importance of having an in-depth understanding of the risk message audience, the avalanche safety community has a considerable knowledge gap in understanding how recreationists are interacting with the avalanche danger rating, which may limit the effectiveness of this risk communication tool. The purpose of this research is to contribute to a better understanding of the strengths and weaknesses of the avalanche danger scale by exploring how recreationists perceive and use it during their trip planning process to decide whether to go into the backcountry.

Survey design
To gain insight into how winter backcountry recreationists in Canada and the United States use, understand, and apply the safety information presented in daily avalanche bulletins published by warning services, the Simon Fraser University Avalanche Research Program conducted a large online survey in the spring of 2019. The overall objective of this survey was to provide avalanche warning services with empirical evidence for making decisions about how to improve their avalanche risk communication products. While the survey included a wide range of exercises and questions, only the design of the main questions of interest for the present analysis is discussed in this paper. Readers interested in the design of the entire survey are referred to Finn (2020) for a more complete description. Screenshots of the complete survey are also available in Haegeli et al. (2022).
To meaningfully guide study participants through the survey, we presented them with the bulletin user typology statements developed by St. Clair et al. (2021) (Table A1) and asked them to indicate which of these statements describes their personal bulletin use practice most accurately. Participants who indicated that they do not typically use the bulletin (i.e., user type A) were split from the rest of the sample as most survey questions were only relevant for participants familiar with avalanche bulletins. However, to gain some insight about the intuitiveness of the avalanche danger scale for recreationists with limited familiarity, the survey presented type A participants with the five signal words (Low, Moderate, Considerable, High, and Extreme) in a random order and asked them to arrange the words in order of severity.
For bulletin user types B to F who also indicated that they use the danger scale at least rarely, the survey included several questions targeting participants' understanding and use of the danger scale. To provide detailed insight about where recreationists might be challenged with the danger scale, the questions were designed using Krathwohl's (2002) adaptation of Bloom's taxonomy of learning objectives (Bloom, 1956), which is an education framework that identifies increasingly complex learning processes. Reflecting the first three stages of the learning process described by Krathwohl (2002), our survey questions were designed to shed light on participants' recall, understanding, and use of danger ratings.
To better understand how well participants knew the danger rating terminology, the danger rating section of our survey started with a question that prompted participants to recall the danger rating levels in their proper order. To answer this question, participants had to type the levels from least to most severe into an open-text field. Since the danger rating terms are the primary means for communicating the severity of the existing conditions, not knowing the full scale and proper order of the terms can seriously impact one's ability to use the scale meaningfully.
To examine participants' understanding of the danger scale, we designed a question that included sliders where participants could indicate how they perceive the severity of each level of the danger scale on a numeric scale from 0 (no avalanche hazard at all) to 100 (widespread, large natural avalanches reaching valley bottoms) (Fig. 2). All sliders moved in increments of 2, and the design of the question did not restrict the slider movement at all, which means that participants were able to have overlapping severity ranges or gaps between them. Our intent was to get insight into participants' personal conceptualization of the scale (i.e., what levels of severity they personally associate with the different levels), which includes both their technical understanding of the nature of the scale and their personal experience with the scale in the field.
To explore how participants use the danger rating when planning a trip into the backcountry, the survey included a question that asked participants how the danger rating levels typically affect their decision of whether to recreate in the backcountry (Fig. 3).
To answer this question, participants were given different statements describing possible danger rating use cases, similar to the format of the avalanche bulletin user type question. The four statements participants could choose from for each level of the danger scale were as follows.
At this danger rating level,
1. I go [into the backcountry] primarily based on the danger rating;
2. I go [into the backcountry] mainly based on the danger rating, but I [also] check other bulletin information;
3. avalanche problems and forecast details are the basis for the decision [to go into the backcountry];
4. the danger rating alone prevents me from going [into the backcountry].
The first three statements describe a progression where the decision to go into the backcountry relies increasingly on more advanced avalanche safety information and the danger rating itself loses importance in the decision-making process. The fourth statement represents a situation where the danger rating itself is viewed as the deciding factor for not going into the backcountry at all. The survey also included a series of questions to collect background and demographic information on participants including age, self-identified gender, country of residence, primary winter backcountry activity, years of winter backcountry experience, average number of days spent in the backcountry each winter, and level of completed avalanche awareness training (Table 1).

Survey deployment and analysis dataset
The survey was available for participation from 23 March to 31 May 2019, and a link to the survey website was distributed extensively on social media and was also displayed on the websites of several avalanche forecasting centers across North America. To further incentivize participation, those who completed the survey before 15 May 2019 were entered into a cash prize draw.
During the 2-month period when the survey was available, 4690 individuals started the survey. Prior to analysis, 1355 incomplete surveys were removed from the analysis dataset, which represents a 28.9 % dropout rate. In addition, we removed responses of participants whose residence was outside of North America, who took less than 10 min to complete the survey, whose primary activity does not involve exposure to avalanche hazard (e.g., trail running), and whose avalanche training level we were unable to confidently classify as none, introductory, advanced, or professional level.
The final number of survey participants included in the analysis dataset was 3195. While most of the analysis sample (75.6 %) reported backcountry skiing or snowboarding as their primary activity, snowshoeing (7.6 % of sample), mountain snowmobiling (6.0 %), out-of-bounds skiing (5.1 %), ice climbing (3.4 %), and sled-accessed backcountry skiing (1.8 %) were also chosen as primary winter backcountry activities. Most respondents (73.0 %) identified as male and 25.1 % as female. The United States was the residence for 54.6 % of the sample, and 45.2 % were from Canada. Participants' ages ranged from younger than 20 years to older than 55 years with the age category 25-34 forming the largest group (38.8 % of sample), followed by 35-44 (22.6 % of sample). Backcountry experience was indicated both by how many days per year a participant typically spends in the backcountry and by how many years of experience the individual had accumulated. Days per year ranged from 1-2 d per winter to more than 50 d per winter (modal response 21-50 d per winter with 30.1 % of the sample). Years of experience varied from those in their first year to those with decades of experience (modal response 2-5 years with 34.4 % of the sample). The level of completed avalanche awareness training also varied considerably in our sample: 17.8 % had no formal training, 47.6 % had introductory recreational training, 19.3 % had advanced recreational training, and 15.1 % had training aimed at aspiring avalanche professionals. The most common self-identified bulletin user type was type E (45.4 % of the sample), while 1.3 % of the sample identified as type A.
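The sample sizes reported for the deployment can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the number of complete responses removed by the additional exclusion criteria is derived here by subtraction and is not a figure reported in the survey description.

```python
# Counts taken from the text.
started = 4690        # individuals who started the survey
incomplete = 1355     # incomplete surveys removed prior to analysis
final_sample = 3195   # participants in the final analysis dataset

# Dropout rate as a percentage of everyone who started.
dropout_rate = 100 * incomplete / started

# Complete responses before the additional exclusion criteria were applied.
completed = started - incomplete

# Derived (not reported): complete responses removed for residence,
# completion time, primary activity, or unclassifiable training level.
removed_by_exclusion = completed - final_sample

print(f"dropout rate: {dropout_rate:.1f} %")
print(f"completed: {completed}, removed by exclusion criteria: {removed_by_exclusion}")
```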

Analysis approach
We conducted our entire analysis in the R statistical environment (R Core Team, 2022) and started with standard descriptive statistics to describe the nature of the analysis dataset and explore the relationships between different variables. We used Pearson chi-squared tests, Wilcoxon rank-sum tests, or Kruskal-Wallis tests depending on whether the variables of interest were categorical or ordinal and whether we were comparing two groups or more than two groups. Unless stated otherwise, we used a p-value threshold of 0.05 to determine whether differences are statistically significant. However, minute differences that cannot be interpreted meaningfully were disregarded even if they were statistically significant.
The three main survey questions of interest (the recall, perception, and use of the danger scale) were presented to survey participants who use the bulletin (i.e., bulletin user types B-F) and focus on the danger scale at least rarely. This resulted in an original analysis dataset of 3130 responses for these questions. The general analysis approach for these questions included two steps. First, we identified common response patterns using an approach tailored to the question-specific response format. In the second step, we related the identified response patterns to participants' avalanche safety training and backcountry experience to better understand the sources of the observed differences.
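The choice among the three tests can be summarized as a small decision rule. The sketch below is one plausible reading of the sentence above, not the authors' code: the function name `choose_test` and the exact mapping (chi-squared for categorical variables, rank-based tests for ordinal variables, with the number of groups deciding between the two rank-based tests) are assumptions made for illustration.

```python
def choose_test(variable_type: str, n_groups: int) -> str:
    """Return the name of the test suggested by the variable type and group count.

    Hypothetical helper: the mapping is an interpretation of the text,
    not a rule stated explicitly by the authors.
    """
    if n_groups < 2:
        raise ValueError("a comparison needs at least two groups")
    if variable_type == "categorical":
        return "Pearson chi-squared test"
    if variable_type == "ordinal":
        # Two groups: Wilcoxon rank-sum; more than two: Kruskal-Wallis.
        return "Wilcoxon rank-sum test" if n_groups == 2 else "Kruskal-Wallis test"
    raise ValueError(f"unsupported variable type: {variable_type!r}")


print(choose_test("ordinal", 2))
print(choose_test("ordinal", 4))
print(choose_test("categorical", 3))
```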

Response patterns in recall question
To better understand participants' ability to recall the danger scale, we manually examined their free-form text responses in three different ways: (a) how many terms a participant recalled, (b) which danger rating levels were recalled, and (c) how well participants recalled the entire scale in the correct order. Our assessment of how many terms a participant recalled was only concerned with the number of terms, regardless of whether they were the correct terms. To assess participants' recall of individual levels, responses were graded by whether the participant recalled each term of the danger scale, regardless of whether the terms were in the correct order. Finally, to assess the recall of the entire scale, responses were graded by whether the participant correctly identified five danger ratings by the right terms and placed them in the correct order. Incorrect responses were categorized to identify common errors.
For our analyses, the standard colours and numbers were considered acceptable substitutes for the danger rating terminology. "Very high" was accepted as "Extreme" if not used in combination with "Extreme". Responses that indicated participants skipped the question or did not understand it (e.g., relaying information on current conditions or providing only the first and last terms of the range) were removed. Of the 3130 survey participants who were presented with the danger rating recall question (bulletin user types B-F who use the danger scale at least rarely), 170 were eliminated for the above reasons, resulting in an analysis dataset of 2960 meaningful responses.
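The grading was done manually, but the three criteria and the substitution rules can be illustrated with a minimal sketch. Everything below is hypothetical: the function names, the assumption that a response arrives as a comma-separated string, and the colour mapping (green, yellow, orange, red, black, which are the standard colours of the North American scale) are illustrative choices, not the procedure the authors followed.

```python
SCALE = ["low", "moderate", "considerable", "high", "extreme"]

# Accepted substitutes: level numbers and the standard scale colours.
SUBSTITUTES = {
    "1": "low", "2": "moderate", "3": "considerable", "4": "high", "5": "extreme",
    "green": "low", "yellow": "moderate", "orange": "considerable",
    "red": "high", "black": "extreme",
}


def normalize(response: str) -> list:
    """Split a comma-separated response and map substitutes to signal words."""
    raw = [term.strip().lower() for term in response.split(",") if term.strip()]
    has_extreme = "extreme" in raw
    terms = []
    for term in raw:
        # "Very high" counts as "Extreme" only if "Extreme" is not also given.
        if term == "very high" and not has_extreme:
            term = "extreme"
        terms.append(SUBSTITUTES.get(term, term))
    return terms


def grade_recall(response: str) -> dict:
    """Grade a response on the three criteria (a), (b), and (c)."""
    terms = normalize(response)
    return {
        "n_terms": len(terms),                                      # (a)
        "levels_recalled": [lvl for lvl in SCALE if lvl in terms],  # (b)
        "fully_correct": terms == SCALE,                            # (c)
    }


print(grade_recall("Low, Moderate, Considerable, High, Very high"))
print(grade_recall("low, moderate, high, extreme"))
```

In this sketch, criterion (a) counts terms whether or not they are correct, criterion (b) ignores order, and criterion (c) requires all five terms in the right order, mirroring the distinctions drawn above.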

Response patterns in ordering of signal word question
The responses of survey participants who self-identified as bulletin user type A to the ordering of the signal word question were graded on their ability to place all five terms in the correct order. Incorrect responses were categorized to better understand the most common errors. Of the 41 participants answering this question, four people assigned the same term to multiple levels (e.g., assigned Moderate to two levels), and two people did not complete the question, leaving 35 complete responses for analysis.

Response patterns in perception question
To identify common patterns in the responses to our danger scale perception question, we used a latent class mixed-effect model, an analysis approach also known as growth mixture models (Muthén and Muthén, 2000). These types of models combine the capabilities of mixed-effect models that account for correlations that emerge from repeated measure designs (Harrison et al., 2018; Zuur et al., 2009) with person-centered latent class or mixture models that can identify the presence of unobserved (i.e., latent) subpopulations and describe them with separate but simultaneously estimated regression models (e.g., Collins and Lanza, 2010; Lazarsfeld and Henry, 1968). A complete description of latent class mixed-effect models is beyond the scope of this paper, but interested readers are referred to Jung and Wickrama (2008) and van der Nest et al. (2020) for more details.
To apply this analysis approach to our dataset, we viewed each participant's minimum and maximum severity estimates for each danger rating level as separate observations and regressed the severity ratings against the danger rating levels. The resulting regression line roughly goes through the center points of the severity ranges of each danger rating level and thereby provides a general approximation of how participants perceive the shape (i.e., functional form) of the danger scale. To allow for a variety of shapes to emerge from the analysis, we included both the linear and quadratic predictors for the danger rating level in the model. A positive parameter estimate for the quadratic term shows that the increase in severity between levels is perceived to increase as one goes up on the scale (i.e., the curve steepens), whereas a negative parameter estimate indicates that the curve flattens out. A quadratic term that is not significantly different from zero indicates a straight linear relationship between the danger rating level and the perceived severity. To further enhance the insight into participants' perception of the danger scale, we included one additional predictor for each danger rating level in the regression model to capture the size of the severity ranges. With the minimum and maximum values for a specific danger rating level coded as −0.5 and +0.5 respectively (0 for all other danger rating levels), the resulting parameter estimates provide a direct estimate of the range size. Hence, the fixed effects included in our analysis describe the severity of the avalanche conditions as a function of the linear and quadratic danger rating terms and the five range predictors, with DR being a numeric representation of the danger rating level (Low: 0; Moderate: 1; Considerable: 2; etc.), Rng_x representing whether an observation represented the minimum (−0.5) or maximum (0.5) value of a danger rating level range (0 for all other danger ratings), and β_x being the regression parameters. Coding Low as zero and omitting a separate intercept ensures that all regression lines start at the center point of the severity range of Low.
To account for the repeated measure design, participant ID was included as a random effect, and we included all the main effects in the mixture model estimate, which means that both the shape of the regression line and the width of the ranges were considered for dividing the sample into different groups. The output of the analysis consisted of parameter estimates for a finite set of danger scale perception patterns and membership probabilities for each study participant. Readers interested in the full details of the model specification are referred to the provided R code of the analysis (Haegeli et al., 2022). We used the hlme() function of the lcmm package (Proust-Lima et al., 2017) for our analysis, and we ran the procedure nine times to first estimate a model with only a single class and then models with two to nine latent classes.
Initial iterations of our model estimations highlighted groups of participants whose response pattern clearly showed that they did not use the sliders as intended. This included participants who only moved one of the sliders (minimum or maximum) for all danger rating levels. To minimize the impact of user error on our results, we removed individuals with these types of responses from the analysis. Furthermore, we only included participants in the analysis whose severity midpoints grew monotonically to avoid any spurious responses. Once the dataset was clean, we computed our final model estimations.
Our evaluation of the models followed the guidance of Nylund-Gibson and Choi (2018) and included model fit statistics, such as the Akaike information criterion (AIC; Akaike, 1974) and the Bayesian information criterion (BIC; Schwarz, 1978), with smaller values indicating better model fit. However, we also considered classification diagnostics (e.g., average assignment probabilities), as well as the interpretability and utility of the estimated models for the research question. Given our large sample size, even minor differences in parameter estimates can emerge as statistically significant even though they are practically not meaningful. Hence, the selection of the final model included considerable judgment from the research team. We assigned each participant to a danger scale perception pattern using the largest membership probability.
https://doi.org/10.5194/nhess-23-1719-2023, Nat. Hazards Earth Syst. Sci., 23, 1719-1742, 2023
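The predictor coding described for the perception model can be made concrete by building the fixed-effect rows for a single participant. This is a minimal sketch under stated assumptions: the function name `design_rows`, the dictionary layout, and the example slider values are hypothetical, and the actual analysis was run in R with hlme() rather than with code like this.

```python
LEVELS = ["Low", "Moderate", "Considerable", "High", "Extreme"]


def design_rows(ranges: dict) -> list:
    """Build one row per slider value (minimum and maximum of each level).

    ranges maps each danger rating level to (min_severity, max_severity).
    """
    rows = []
    for dr, level in enumerate(LEVELS):       # Low: 0, Moderate: 1, ...
        lo, hi = ranges[level]
        for severity, code in ((lo, -0.5), (hi, 0.5)):
            row = {"severity": severity, "DR": dr, "DR2": dr ** 2}
            # Range predictor: -0.5/+0.5 for the level the observation
            # belongs to, 0 for all other levels.
            for other in LEVELS:
                row["Rng_" + other] = code if other == level else 0.0
            rows.append(row)
    return rows


# Made-up slider values for one participant (not survey data).
example = {
    "Low": (0, 10), "Moderate": (10, 30), "Considerable": (30, 60),
    "High": (60, 85), "Extreme": (85, 100),
}
for row in design_rows(example):
    print(row)
```

Because the range predictor differs by exactly 1.0 between the maximum and minimum observation of a level while DR and DR2 stay the same, its coefficient corresponds to the width of that level's severity range, which is why the text describes the estimates as a direct measure of range size.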
Classes were labelled based on the value of the second polynomial parameter estimate (> 0.75: concave; > −0.75 and < 0.75: linear; and < −0.75: convex) and a qualitative description of the width of the severity ranges of the danger rating levels. Note that we use the term convex to describe patterns where the difference between the levels decreases as one goes up on the scale (i.e., the slope flattens out) and the term concave for situations where the difference between levels increases (i.e., slope steepens). While this use of these terms is at odds with the mathematical definition of convex and concave, it is consistent with their use by avalanche safety practitioners, who follow the geometric definition of these terms to describe the shape of terrain in the backcountry (convex: dome-like, rollovers; concave: bowl-shaped).
Of the 3130 participants who answered the perception question, 446 responses were removed for incorrectly using the sliders or because their severity midpoints did not increase monotonically. The final analysis dataset for this question included responses from 2684 participants.
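The labelling rule for the shape of a class is simple enough to state as a function. The sketch below uses the thresholds given in the text; the function name `label_shape` is hypothetical, and treating an estimate of exactly ±0.75 as linear is an assumption, since the text does not say how boundary values were handled.

```python
def label_shape(beta_quadratic: float, threshold: float = 0.75) -> str:
    """Label a class by the estimate of its quadratic (second polynomial) term.

    Uses the terrain-style convention of the paper: "concave" means the
    curve steepens, "convex" means it flattens out.
    """
    if beta_quadratic > threshold:
        return "concave"
    if beta_quadratic < -threshold:
        return "convex"
    # Assumption: values exactly on a threshold are treated as linear.
    return "linear"


print(label_shape(1.9))    # steepening curve
print(label_shape(0.2))    # approximately straight
print(label_shape(-1.3))   # flattening curve
```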
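The midpoint criterion used for cleaning the perception responses can be sketched as follows. The function names and example values are hypothetical, and whether "monotonically" was meant strictly (no ties between adjacent midpoints) is not stated in the text, so strictness is exposed as a parameter rather than decided here.

```python
def midpoints(ranges: list) -> list:
    """Center point of each (min, max) severity range, ordered Low to Extreme."""
    return [(lo + hi) / 2 for lo, hi in ranges]


def midpoints_increase(ranges: list, strict: bool = True) -> bool:
    """Check that severity midpoints rise from one danger level to the next.

    strict=True (an assumption) rejects ties between adjacent midpoints.
    """
    mids = midpoints(ranges)
    pairs = list(zip(mids, mids[1:]))
    if strict:
        return all(a < b for a, b in pairs)
    return all(a <= b for a, b in pairs)


# Overlapping ranges are allowed as long as the midpoints keep rising.
print(midpoints_increase([(0, 20), (10, 40), (30, 60), (50, 90), (80, 100)]))
# Moderate rated above Considerable: would be excluded.
print(midpoints_increase([(0, 10), (40, 60), (30, 50), (50, 80), (80, 100)]))
```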

Response patterns in use questions
Similar to the analysis of the perception question, we used a latent class approach for identifying common patterns in participants' responses to the danger rating use question. However, since the five observed response variables are ordinal, the poLCA() function of the poLCA package (Linzer and Lewis, 2011) was more appropriate for this analysis. In comparison to the latent class mixed-effect models described in the previous section, a polytomous variable latent class analysis stratifies the sample into a finite number of patterns directly based on the observed ordinal response variables without estimating regressions in the process. The output of this analysis consists of sets of probabilities that describe the chance of observing each response for each variable in an identified response pattern (i.e., class-conditional marginal frequencies), as well as membership probabilities for each study participant.
As in the perception analysis, we only included participants in the analysis whose answers to the use question grew monotonically with the danger rating level. This removed response patterns where participants treated higher avalanche danger rating levels more liberally than lower levels, which we considered unreasonable. Once the dataset was clean, we estimated solutions with two to nine latent classes. We evaluated the fit of the estimated models using the AIC, BIC, and classification diagnostics, as well as their interpretability and utility for the research question.
Of the 2705 participants who were presented with the danger rating use question, 121 responses were removed from the analysis because the participants did not answer the question or provided responses that did not increase monotonically. In total, the responses from 2584 participants were available for this part of the analysis.
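The two information criteria follow standard formulas: AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximized log-likelihood. The sketch below applies them to made-up candidate models to show how the criteria can point to different solutions; the log-likelihoods and parameter counts are invented for illustration and are not results from this study. The helper `assign_class` illustrates assignment by largest membership probability.

```python
import math


def aic(loglik: float, k: int) -> float:
    """Akaike information criterion; smaller is better."""
    return 2 * k - 2 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion; smaller is better."""
    return k * math.log(n) - 2 * loglik


def assign_class(membership_probs: list) -> int:
    """Return the 1-based class with the largest membership probability."""
    return membership_probs.index(max(membership_probs)) + 1


# Invented candidates: number of classes -> (log-likelihood, parameter count).
candidates = {1: (-5200.0, 8), 2: (-5000.0, 17), 3: (-4990.0, 26)}
n_obs = 1000  # invented number of observations

best_by_aic = min(candidates, key=lambda c: aic(*candidates[c]))
best_by_bic = min(candidates, key=lambda c: bic(*candidates[c], n_obs))
print("lowest AIC:", best_by_aic, "classes; lowest BIC:", best_by_bic, "classes")
print("assigned class:", assign_class([0.1, 0.7, 0.2]))
```

In this invented example the AIC favours the three-class solution while the BIC, which penalizes additional parameters more heavily, favours two classes. Disagreements of this kind are one reason fit statistics are weighed together with classification diagnostics and interpretability rather than applied mechanically.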
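The screening of the use responses can be sketched in the same spirit. The sketch assumes that each participant's answers are stored as the statement numbers 1 to 4 for the five levels in order from Low to Extreme, that an unanswered level is stored as None, and that "monotonically" means non-decreasing (so choosing the same statement at adjacent levels is acceptable). All three are assumptions for illustration, as is the function name.

```python
def is_valid_use_response(answers: list) -> bool:
    """Check a participant's five answers (Low to Extreme) to the use question.

    answers holds statement numbers 1-4, or None where a level was skipped.
    Assumption: a valid response never moves to a lower-numbered statement
    as the danger rating increases.
    """
    if len(answers) != 5 or any(a is None for a in answers):
        return False
    return all(a <= b for a, b in zip(answers, answers[1:]))


print(is_valid_use_response([1, 2, 3, 4, 4]))      # valid progression
print(is_valid_use_response([3, 3, 2, 4, 4]))      # more liberal at a higher level
print(is_valid_use_response([1, 2, None, 4, 4]))   # incomplete answer

# Sample size check using the counts from the text.
print(2705 - 121)
```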

Relating response patterns to background variables
We used conditional inference trees (CTrees; Hothorn et al., 2006), a classification tree algorithm based on statistical hypothesis testing, to shed light on how the observed response patterns in the danger scale recall, perception, and use questions relate to the background, training, and experience of survey participants. The CTree algorithm employs a series of permutation tests to partition a dataset into smaller and smaller subgroups along splits in the predictor variables that produce child nodes whose distributions of the response variable are maximally different from each other (Hothorn et al., 2006). The splitting process repeats until the algorithm can no longer find any statistically significant relationship according to the specified p-value threshold (default value: 0.05). Once the splitting process is complete, the terminal nodes at the end of each branch contain a distribution of the dependent variable that exhibits minimal variation within the node and maximum variation to the immediately adjacent neighbouring node.
In our CTree analyses, we used a consistent set of predictor variables to explore the relationships between users' background and their danger scale recall, perception, and use. This set of predictor variables included users' primary backcountry activity, country of residence, level of avalanche training, years of backcountry experience, and average number of days of backcountry recreation per year. We used 0.05 as the p-value threshold for all CTree analyses.
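The idea of testing a candidate split with a permutation test can be illustrated with a deliberately simplified sketch. This is not the CTree algorithm of Hothorn et al. (2006), which relies on a more general conditional inference framework and handles variable selection and multiple testing; the code below only shows the basic building block of comparing an observed group difference against differences obtained after shuffling group membership. The function name, the Monte Carlo approach, and the example data are all hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Monte Carlo permutation test for a difference in group means.

    With 0/1 outcomes (e.g., scale recalled correctly or not), the group
    mean is the proportion of correct responses.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any link between outcome and group
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Invented outcomes: 1 = recalled the scale correctly, 0 = did not.
trained = [1] * 10
untrained = [0] * 10
p = permutation_p_value(trained, untrained)
print("split would be accepted at 0.05:", p < 0.05)
```

A tree-growing procedure would repeat a test of this kind for each candidate predictor within a node and stop splitting once no test falls below the chosen threshold.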

Recall of danger scale levels
Of the 2960 meaningful answers to the recall question, most participants (78.1 %) provided five terms (Table 2), but not necessarily the right terms or in the correct order. When analyzing which terms were recalled, regardless of order, participants recalled Moderate significantly more often than the other levels (Table 2). Considerable and Low followed next, and Extreme and High were recalled significantly less frequently than the other levels (Pearson chi-squared test: p < 0.001).
Slightly more than two-thirds of participants recalled all five terms of the danger scale using the correct terms and in the correct order (Table 2). The most common patterns among people who did not recall the entire scale correctly were omitting High, Extreme, or Considerable. All other response patterns accounted for less than 1 % each.
The CTree analysis examining the influence of background variables on participants' ability to recall the danger scale correctly (i.e., all five levels in the right order) included 2867 responses as not all participants provided relevant background information. Our analysis revealed that avalanche awareness training and number of backcountry days per season were both significantly associated with participants' performance (Fig. 4). Avalanche education formed the first split in which participants with advanced recreational or professional-level training were more likely to recall the scale correctly than those with lower training (p < 0.001). For all levels of training, participants who spend more days in the backcountry each winter (i.e., who are more engaged in their activity) performed better. The only other significant background variable was primary backcountry activity, which resulted in two final splits.
Overall, non-ice climbers with professional-level training and more than 20 backcountry days a season (Node 19) performed the best, with 93.0 % (357 out of 384) of the participants assigned to this node recalling the danger scale correctly. At the other end of the spectrum, only 32.4 % (59 of 182) of participants with no training and 10 or fewer days per season (Node 4) correctly recalled the entire danger scale. This performance was closely followed by Node 7, in which 36.5 % (31 of 85) of participants recalled the full danger scale correctly. This group represents mountain snowmobile riders and snowshoers without training who spend more than 10 d in the backcountry each winter.

Order of danger rating levels
Of the 35 participants who completed the ordering question meaningfully, 26 participants (74.3 %) placed the terms in the correct order. In the nine incorrect responses, Considerable was incorrectly placed within the scale seven times: five participants (14.3 %) reversed High and Considerable, and two participants (5.7 %) reversed Considerable and Moderate. The final two errors were a seemingly random order of terms.

Perception of danger scale
Steadily decreasing AIC and BIC values (Fig. S1 in the Supplement) for the latent class mixed-effect models with two to nine latent classes indicated that the regression analysis was able to continuously identify new groups with distinct danger rating perceptions. However, while the parameter estimates for the different classes continued to be significantly different from each other, the practical differences between the regression lines became negligible. Hence, using interpretability as the primary guide, we determined that seven was the optimal number of clusters for our avalanche danger perception analysis (Fig. 5 and Table B1). Interested readers are referred to the supplemental material that shows all the different class solutions (one to nine classes) and how participants migrate between them as the number of classes is increased.
Each identified pattern in the seven-class solution is characterized by the shape of the regression line that goes through the center point of each range and typical ranges for each danger rating level. Recall that we use the geometric and not mathematical definitions of the terms convex and concave.
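For readers less familiar with these criteria, the short Python sketch below gives the standard definitions of AIC and BIC and shows what a steadily decreasing sequence looks like. The log-likelihoods, parameter counts, and sample size are made-up placeholders, not values from this study.

```python
import math

def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_lik

# Hypothetical fits: (number of classes, log-likelihood, number of parameters).
fits = [(2, -5200.0, 9), (3, -5050.0, 14), (4, -4960.0, 19), (5, -4900.0, 24)]
n_obs = 3000  # hypothetical number of observations

scores = [(k, aic(ll, p), bic(ll, p, n_obs)) for k, ll, p in fits]
aic_values = [a for _, a, _ in scores]
bic_values = [b for _, _, b in scores]

# True when every additional class still lowers the criterion.
aic_decreasing = all(x > y for x, y in zip(aic_values, aic_values[1:]))
bic_decreasing = all(x > y for x, y in zip(bic_values, bic_values[1:]))
```

When both criteria keep decreasing, as in this toy sequence, they do not single out a preferred number of classes, which is why a secondary consideration such as interpretability is needed.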
In the seven-class solution, almost half of the sample (46.2 %) was assigned to the Linear, narrower class (Fig. 5b), whose members perceive the danger scale most linearly with narrow ranges. The next most common class was Convex, narrower (Fig. 5d), with 24.7 % of participants assigned to it and its danger rating curve being the most convex of all the classes. The next largest class was Linear, wider (Fig. 5c), which represented 15.1 % of the sample. The participants assigned to this class perceive the danger rating scale to be fairly linear but with substantially wider severity ranges, particularly for Considerable, than the other linear class. The last of the substantial classes is Concave, wider (Fig. 5a), which represented 7.2 % of our sample. This is the only class whose participants drew patterns that our analysis interpreted as a non-linear concave shape. The concave shape indicates that survey participants assigned to this class perceive the differences between the danger rating levels to increase as one moves up the scale. The concave shape is accompanied by increasing severity range sizes.
The remaining three classes represent much smaller proportions of the sample. Participants in Convex, widest (Fig. 5e), which consisted of 3.3 % of the sample, drew patterns that are slightly convex, but the ranges of the danger ratings in this class are distinctly wider and have more overlap than in other classes. Convex, high start, narrowing (Fig. 5f) represents just 1.9 % of the sample. Similar to the participants assigned to the other convex classes, respondents in this class drew a pattern that our analysis identified as a convex shape, but what separates this class from the others is its large intercept and its very broad ranges at the lower end of the danger scale, which indicates that these survey respondents perceive the danger scale to have elevated severity at these lower levels. The smallest class, Convex, low end, widening (Fig. 5g), represented just 1.6 % of the sample. The shape of the danger curve for this class is also convex, but there are exceptionally low values for the midpoints of the danger rating levels and very broad severity ranges, especially at the upper end of the scale.
The two smallest classes (Convex, low end, widening and Convex, high start, narrowing) had high class assignment probabilities with median values above 0.950, which means that the observed danger rating patterns are distinct, and participants were assigned with high certainty. The lowest median class assignment probabilities were associated with Linear, wider (0.900) and Convex, narrower (0.858). Lower class assignment probabilities indicate that the assignments to these classes were more uncertain, and the response patterns of these participants had some similarity to one of the other classes. Among both Linear, wider and Convex, narrower members, the highest median assignment probability for any other class was for Linear, narrower (Linear, wider: 0.059; Convex, narrower: 0.125). This means that there is some permeability between these classes and the response patterns of several participants sit between these classes.
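As an illustration of how such median class assignment probabilities can be derived, the Python sketch below assigns each participant to the most probable class and summarizes the winning probabilities per class. The posterior matrix is hypothetical, and the helper name is an assumption introduced here; it is not the output of the latent class software used in this study.

```python
from statistics import median

# Hypothetical posterior class-membership probabilities (each row sums to 1).
posteriors = [
    [0.97, 0.02, 0.01],
    [0.90, 0.08, 0.02],
    [0.10, 0.85, 0.05],
    [0.30, 0.60, 0.10],
    [0.01, 0.01, 0.98],
]

def median_assignment_probability(rows):
    """Assign each participant to the most probable class, then return the
    median of those winning probabilities for each class."""
    by_class = {}
    for row in rows:
        best = max(range(len(row)), key=row.__getitem__)
        by_class.setdefault(best, []).append(row[best])
    return {c: median(p) for c, p in sorted(by_class.items())}

result = median_assignment_probability(posteriors)
```

In this toy example, the second class has a noticeably lower median because one of its members also has a sizeable probability of belonging to the first class, which mirrors the permeability between classes described above.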
The dataset for the CTree analysis was 2606 responses since not all participants completed the background questions. The predictor variables that had a significant effect on class assignment were level of avalanche training and number of days of backcountry travel per year (Fig. 6). Participants with professional avalanche training had a significantly lower proportion of members being assigned to the Linear, narrower class (36.3 % of participants with professional training assigned to this class versus 47.9 % of participants with all other levels of training; p < 0.001) and a much higher proportion of members being assigned to Linear, wider, the linear class with the wider severity ranges (20.9 % of participants with professional training versus 13.9 % of participants with all other levels of training; p < 0.001). The number of backcountry days per year had a similar effect as avalanche training. Participants with non-professional avalanche training who spent more than 50 d in the backcountry per year also had a significantly lower proportion of members assigned to the Linear, narrower class (35.5 % of participants with more than 50 d were assigned to this class versus 49 % of participants with 50 or fewer days). However, instead of a higher proportion of Linear, wider individuals, this group had a significantly higher proportion of Convex, narrower class members (34.4 % of participants with more than 50 d were assigned to this class versus 23 % of participants with 50 or fewer days).
Those whose response pattern was assigned to the Concave, wider class did not exhibit a specific user profile; the participants in this group were a random assortment of engagement, training, experience, and every other explanatory factor we considered.

Use of danger rating level in trip planning
Overall, 90.7 % of participants reported that an Extreme danger rating prevented them from entering the backcountry, and 62.0 % of respondents reported staying home at both High and Extreme. In contrast to these no-go decisions, 32.5 % of participants stated they would enter the backcountry primarily based on a Low danger rating, and 5.9 % of participants would go primarily based on the danger rating at both Low and Moderate. This small percentage for Moderate increased to 43.1 % when including respondents who go into the backcountry based mainly on the danger rating but also check other bulletin information.
https://doi.org/10.5194/nhess-23-1719-2023 Nat. Hazards Earth Syst. Sci., 23, 1719-1742, 2023
In the latent class analysis, the AIC decreased continuously from models with one to eight latent classes, but the BIC showed a distinct minimum at six classes. The selection of six as the optimal number of classes was further supported by the fact that the median assignment probabilities were 1 for all the classes and the meaningful interpretation of the emerging cluster solution. Each class is characterized by different proportions of participants' reliance on bulletin information for decision-making at each danger rating level. The classes that emerged from the analysis are arranged in Fig. 7 based on their strong relationship to St. Clair's (2019) bulletin user types (Spearman rank correlation = 0.32; p < 0.001), which is ordered from a heavier to a lighter reliance on the danger rating for trip planning. The stacked bars represent the conditional (i.e., class-specific) response probabilities at the different danger rating levels. The classes are best described by their most striking patterns.
- Class 1 (Fig. 7a). Participants assigned to this class (5.7 % of the sample) primarily rely on the danger rating at both Low and Moderate and avoid the backcountry at Extreme. About half of these participants also avoid the backcountry at High, while the other half generally use other bulletin information to make decisions at High.
- Class 2 (Fig. 7b). Participants assigned to this class (27.9 %) rely mainly on the danger rating at Low and Moderate, use other bulletin information at Considerable, and avoid the backcountry at High and Extreme.
- Class 3 (Fig. 7c). Participants assigned to this class (4.4 %) rely mainly on danger ratings from Low to Considerable, use other bulletin information at High, and avoid the backcountry at Extreme.
- Class 4 (Fig. 7d). Participants assigned to this class (31.3 %) rely mainly on the danger rating at Low, use other bulletin information for Moderate and Considerable, and avoid the backcountry at High and Extreme.
- Class 5 (Fig. 7e). Participants assigned to this class (17.0 %) rely mainly on the danger rating at Low and Moderate, use other bulletin information at Considerable and High, and avoid the backcountry only at Extreme.
- Class 6 (Fig. 7f). Participants assigned to this class (13.7 %) rely on other bulletin information for making decisions at all levels; approximately half of this class avoids the backcountry at Extreme.
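As a rough consistency check on the class sizes listed above, the following Python sketch adds up the shares and estimates the proportion of the sample that avoids the backcountry at High. The weight of one half for Class 1 is an assumption that stands in for the statement that "about half" of that class avoids the backcountry at High.

```python
# Class shares (% of sample) as reported in the list above.
share = {1: 5.7, 2: 27.9, 3: 4.4, 4: 31.3, 5: 17.0, 6: 13.7}

# Classes described as avoiding the backcountry at High.
# Class 1 is weighted by one half; the exact fraction is an assumption.
avoid_at_high_weight = {2: 1.0, 4: 1.0, 1: 0.5}

total_share = sum(share.values())
stay_home_at_high = sum(share[c] * w for c, w in avoid_at_high_weight.items())
```

Under this assumption the six shares add up to 100 % and the estimate comes out at about 62 %, in line with the overall proportion of respondents who reported staying home at both High and Extreme.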
While more than 50 % of the participants assigned to Class 1 self-identified as bulletin user types B and C (18.3 % and 32.5 % respectively), these proportions are considerably smaller in the patterns that rely less on the danger rating and integrate other avalanche bulletin information. In Class 6, the pattern where avalanche problem information and forecast details are used at every danger scale level, the proportion of self-identified bulletin user types E and F is the highest (68.9 % and 8.6 % respectively). A total of 2881 responses were available for the CTree analysis. The most significant predictor variables for the danger rating use were similar to the variables from the danger perception (Fig. 8): days of backcountry travel per year and avalanche training formed 7 of the 10 identified splits. In addition, backcountry activity, which also emerged as a significant predictor in the analysis of the recall performance, caused two splits. Nationality was an additional significant predictor variable, which was unique to the danger use responses. Contrary to our expectation, danger perception class membership did not have an obvious effect on how people reported using the danger scale. Hence, the perception class membership was removed from the predictor variables to maximize the sample size for the use question.
The most significant split in the CTree was between users with more than 20 d of experience per year and those with 20 or fewer days per year (p < 0.001). Those with 20 or fewer days had a greater prevalence of Classes 2 and 4 (do not enter the backcountry at High) and lower prevalence of Classes 5 and 6 (use other information more often and may go based on other information at High). The opposite pattern is observed in those with more than 20 days of experience. (In Node 2, Class 2 represents 35.9 %, and Class 4 is 36.0 %; in Node 11, these proportions are 18.9 % and 25.2 %. For Classes 5 and 6, we see 12.6 % and 5.5 % in Node 2 and 22.5 % and 23.3 % in Node 11.) Avalanche awareness training was of secondary importance for those with more than 20 days of experience (p < 0.001). Generally, Classes 2 and 4 (who do not enter the backcountry at High) were less prevalent among more highly trained people (Node 19 shows 10.3 % and 17.9 %, respectively, versus 22.3 % and 28.1 % in Node 12), and Class 6 (relies on other information) was more prevalent (38.4 % versus 17.3 % in Node 12); with the exception of Class 4 (which shows no significant differences), the opposite pattern is observed in those with introductory or lower training (Node 12).
Below these two initial splits, days per year and avalanche awareness training produced several additional splits, and primary backcountry activity, country, and years of experience produced individual splits. Readers interested in the full description of the CTree analysis for danger rating use are referred to Morgan (2021).

Discussion
The results of our study offer a valuable perspective on North American recreationists' understanding and use of the danger scale for general trip planning decisions. At first glance, the key findings include that almost 70 % of people correctly recalled the five levels of the danger scale in the correct order; 65 % of survey participants provided responses that indicate that they perceive the danger scale as a linear scale with or without overlapping ranges; over 90 % of participants avoid the backcountry when the danger rating is Extreme, and 62 % stay home at both High and Extreme; and about a third of participants go out into the backcountry at Low based primarily on the danger rating. However, a more in-depth examination of our results provides deeper insight into the effectiveness of the current danger scale.

Understanding of the danger scale
Our results highlight that recreationists' perception of the ordinal, five-level avalanche danger scale differs substantially from the scientific understanding of the exponential-like increase in hazard between levels most recently described by Schweizer et al. (2020). The predominantly linear interpretation that emerged from our slider question is consistent with the results of other survey studies that examined recreationists' terrain preferences as a function of the danger rating using discrete choice experiments (Haegeli et al., 2012, 2020; Haegeli and Strong-Cvetich, 2020). While these studies did not directly ask participants about their perception of the danger scale, the linear patterns in the part-worth utilities provide an indirect measure that aligns with the results of this study. Aside from the linear perception, a considerable proportion of our sample (28 %) associate a convex hazard pattern with the danger scale, and only 7 % of the participants expressed a concave pattern, which is closest to the scientific understanding of the scale.
Although the differences between the seven perception patterns are relatively subtle, they all differ substantially from the exponential-like scientific understanding of the scale, which is based on the simultaneous increase in the sensitivity to triggers, the number of potential trigger locations, and avalanche size from one danger rating level to the next. Studies attempting to quantify the increase in severity have included both hazard-based (Munter, 1997; Schweizer et al., 2020) and risk-based approaches (Pfeifer, 2009; Techel et al., 2015; Winkler et al., 2021; Techel et al., 2022) and found a 2- to 4-fold increase in severity between danger rating levels. Such large differences between the scientific understanding of the scale and recreationists' perception have the potential to lead to serious miscommunication about the severity of avalanche hazard conditions.
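The size of this gap can be illustrated with a small Python sketch that compares a linear reading of the scale with a multiplicative one. Applying one constant factor at every step is a simplifying assumption made here for illustration; the studies cited above report a range of roughly 2 to 4 between adjacent levels rather than a single fixed value.

```python
LEVELS = ["Low", "Moderate", "Considerable", "High", "Extreme"]

def relative_severity(factor):
    """Severity relative to Low if each step multiplies severity by `factor`."""
    return {name: factor ** i for i, name in enumerate(LEVELS)}

def linear_severity():
    """Severity relative to Low if each step adds one unit (1, 2, 3, 4, 5)."""
    return {name: i + 1 for i, name in enumerate(LEVELS)}

doubling = relative_severity(2)     # lower end of the reported range
quadrupling = relative_severity(4)  # upper end of the reported range
linear = linear_severity()

# Severity of Extreme relative to Low under each reading of the scale.
extreme_vs_low = (linear["Extreme"], doubling["Extreme"], quadrupling["Extreme"])
```

Under the linear reading, Extreme is 5 times as severe as Low, whereas a constant factor of 2 or 4 per step puts it at 16 or 256 times, respectively.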
We suspect several possible reasons for the dominance of the linear perception pattern amongst recreationists. First, many of the most common tools and displays used in North America do not explicitly state the exponential nature of the scale. Examples include the official graphical representation of the danger scale (Fig. 1; Statham et al., 2010), the Introduction to the North American Public Avalanche Danger Scale video from the National Avalanche Center (2016), and the Avalanche Canada tutorial focused on the danger rating (Avalanche Canada, 2022). The visual and numerical cues for interpreting the scale all seem to indicate a linear system, such as each coloured block of the danger scale being the same size and the number of each level increasing by one each step. While there are some resources that explicitly state that the danger scale is non-linear (e.g., European Avalanche Warning Services, 2021a; SLF, 2023; Utah Avalanche Center, 2022), they seem to be less common. Although these different methods of displaying and communicating the scale are not strictly contradictory, they show a predominant linear presentation and inconsistent messaging between educational products.
A second possible reason for the dominance of the linear understanding of the ordinal danger scale could be that it may be the simplest default given the vagueness of the descriptors used in the scale and the challenges that people have understanding these terms. The terms used to indicate the likelihood of avalanches ("unlikely", "possible", "likely", and "very likely") have been shown to have a broad range of meanings to different people, even when those people have professional-level avalanche training (Thumlert et al., 2019). This same difficulty with interpreting terms has been observed across the wider risk communication community including psychology (e.g., Hancock and Volante, 2020) and medicine (Nakao and Axelrod, 1983). In climate science, public perception of verbal expressions of probabilities improved when paired with numerical probabilities (Wintle et al., 2019; Budescu et al., 2014). Hence, the persistent issue of ambiguous likelihood terms may be one of the reasons that makes it difficult for users to grasp the functional shape of the danger scale.
Interestingly, those who indicated a concave perception of the danger scale were not a distinct cohort with a specific profile of backcountry experience and demographics but a rather random mix of participants. Participants who completed professional-level training were not more prevalent in the group that expressed the concave perception as anticipated but rather represented a higher proportion in the group that indicated a linear perception of the scale with wider severity ranges, particularly for Considerable. The same pattern was found amongst recreationists with higher levels of engagement in their activities (i.e., more backcountry days per season). This pattern may reflect the recognition that Considerable can represent a wide range of conditions (e.g., high-likelihood and low-consequence storm slab avalanche problem situations, as well as low-likelihood and high-consequence persistent slab avalanche problem situations), or it may be a sign of respondents' perceived differences in their uncertainty (and therefore risk) in decision-making between different danger rating levels. While the signs of elevated hazard may become more obvious at High, and therefore make it easier to make a conservative decision, the conditions under Considerable may be perceived as vaguer and more error prone. This perception might be supported by the common peak of fatal avalanche accidents at Considerable (e.g., Greene et al., 2006; Harvey and Zweifel, 2008; Jamieson et al., 2010). However, some studies that include the base rates of avalanche danger ratings in their calculations show similar accident rates between Considerable and High (e.g., Greene et al., 2006; Techel et al., 2015). This inaccurate perception of the danger scale by individuals with professional training has the potential to perpetuate the misconception of a linear scale as avalanche educators share their perspective with their students. In addition, avalanche forecasters' potential inaccurate perception of the danger scale may introduce additional possibilities for miscommunication. However, our study did not explicitly examine the danger rating perception of public forecasters and avalanche course instructors.
Participants with recreational or no avalanche training who had higher levels of engagement in their backcountry activity were more likely to exhibit a convex perception of the scale. While we do not have an obvious explanation for this pattern, it may be related to participants having more personal experience with the lower levels of the danger scale as most danger ratings assigned by forecasters are between Low and Considerable. Avalanche Canada, for example, forecasted these three danger ratings over 90 % of the time for all elevation bands between 2010 and 2019 (Low 24.2 %, Moderate 38.1 %, Considerable 28.7 %, High 6.0 %, Extreme 0.1 %, derived from Avalanche Canada public avalanche bulletin archives; Avalanche Canada, 2021). Greene et al. (2006) present similar danger rating distributions for the United States, France, and Switzerland. We hypothesize that this higher level of familiarity may lead recreationists to being more confident at distinguishing differences among the lower levels of the danger scale. The more frequent use of the lower levels of the danger scale may also be the reason that participants showed better recall for these levels. However, the fact that Moderate and Considerable were presented to all survey participants in the slope choice exercise (Finn, 2020) prior to the questions examined in this paper may have also contributed to the good recall performance for these levels. Regardless of the precise reasons for participants' convex perceptions, an important insight of our results is that neither higher levels of engagement nor more years of experience result in a more accurate perception of the danger scale. We believe that this observation is at least partially a result of the wicked learning environment (Hogarth, 2001; Hogarth et al., 2015) of winter backcountry recreation, where direct experiences with avalanches are rare and recreationists are given insufficient feedback for developing a more accurate perception of the hazard.

Use of the danger scale
Overall, more than 90 % of participants reported staying home at Extreme based on the danger rating alone, and over 60 % of participants reported doing the same at Extreme and High. While there are statistically significant differences among all six danger rating application patterns identified by the latent class analysis, four main overarching patterns emerged: a more aggressive pattern with respondents who potentially still go out at High (Classes 3 and 5), a more conservative pattern with respondents who avoid the backcountry at High (Classes 2 and 4), a pattern that relies primarily on the danger rating at both Low and Moderate (Class 1), and a pattern where respondents primarily rely on other bulletin information for making their decisions about whether to go into the backcountry regardless of the danger rating level (Class 6). The primary background variables that determined class membership were participants' engagement in their activity and their level of avalanche training, which were the same background variables that influenced responses in both the recall and perception questions. In the use analysis, the general patterns showed that less engaged participants and participants with lower levels of training were more prevalent in classes that avoided the backcountry at High, and participants with higher engagement and more advanced training were more likely to use other information to make decisions, particularly at High. Overall, these pre-trip use patterns of the danger scale and their relation to training and engagement seem appropriate, and they are consistent with the reasoning behind the pyramidal structure of the avalanche bulletin, where the most simplified information (i.e., the danger rating) is presented first in an effort to communicate a general idea of conditions accessible to recreationists with little to no training or experience (Statham et al., 2010).
The fact that our analysis did not reveal a relationship between the perception of the danger scale and its use initially surprised us since there is a wide body of literature showing that risk perception affects how people respond to risk (Kahneman, 2011; Loewenstein et al., 2001; Slovic, 1987; Weber et al., 2002). Hence, it seems reasonable to expect that differences in the perceived severity of the conditions at a specific danger rating level could lead to different trip planning behaviours. One possible reason why this expected relationship did not emerge could be that the differences between the seven perception classes were relatively subtle. Furthermore, our relatively simple question targeting general hazard perception and danger scale use did not allow us to capture the nuances of how people form risk perceptions and determine appropriate context-specific behaviours.
Given the multitude of factors influencing risk perception and mitigative actions in the backcountry, it is not surprising that choosing appropriate actions based on a danger rating is a challenging task. Although sparse, recreation-related research has demonstrated that people have difficulty identifying appropriate actions using danger scales. Langer et al. (2011) found that fire danger signs did not include clear information on appropriate behaviour, and Ménard et al. (2018) critiqued rip current warning signs' ability to effectively communicate intended warning messages. The North American Public Avalanche Danger Scale includes guidance for mitigative actions for each danger rating. Although respondents in our survey did not demonstrate a link between their perception of avalanche hazard in the danger scale and their use of the scale, they did exhibit reasonable applications of the danger scale in trip planning in relation to their levels of experience and engagement. This finding may indicate that the danger scale effectively communicates recommendations about appropriate behaviour under different conditions.

Practical implications for avalanche risk communication
Our research has highlighted two main challenges associated with the current scale: inaccuracies in the perception of the scale and difficulties with Considerable. Our results show that the intuitive perception of the scale is linear, while the scientific understanding of the scale shows an exponential-like increase in severity. While this discrepancy between perceptions creates the potential for miscommunication, our results also show participants' use of the scale for making the decision of whether to go into the backcountry does not seem to be driven by perception. This potentially means that the ratings' actionable guidance (i.e., short descriptors provided with each danger rating) is a stronger driver for danger rating use than the perceived functional form of the scale. However, regardless of this potential effect, there is an opportunity to improve alignment between the scientific and public understanding of the scale.
In addition to the difficulties of varied perceptions, two results of our research show that Considerable remains a challenging condition and a challenging term. Respondents with higher engagement and higher levels of training were more prevalent in the class with a wider range for Considerable, possibly indicating the wide range of conditions that can lead to a Considerable danger rating or the difficulty of managing the conditions experienced at this rating. Furthermore, the biggest source of error in the danger-term ordering question by type A users was failing to place Considerable correctly, indicating that it is a less intuitive term than the other danger signal words.
The danger scale challenges highlighted by our study are not completely new, and several authors have suggested that a four-level scale may be more appropriate for recreational risk communication. Some of the justifications behind these recommendations may assist with aligning the scientific and public understanding of the scale. Using accident data from North America and Europe, McClung (2000) reasoned that a simpler four-step scale may be more comprehensible for recreationists, arguing that the current scale should take human perception into account. Conger (2004) supported McClung's proposed four-level scale because it would shift Considerable out of the middle of the scale and into the more severe end of the spectrum. Using permit requests and parking data as recreation use data and avalanche bulletins for Glacier National Park (British Columbia), Eyland (2018) showed that 70 %-90 % of the local recreational use occurred at Moderate and Considerable. He also found that recreationists tended to treat Moderate and Considerable similarly when deciding to enter the backcountry and that recreationists seemed to treat High as significantly more dangerous than Considerable. Eyland therefore recommended removing Extreme from the scale, which would shift Considerable into the more dangerous half of the spectrum.
Other research has shown that Extreme may not be a practicable rating due to its infrequent use and difficulties identifying the condition correctly. When comparing the predicted danger ratings published in avalanche bulletins with the nowcast assessments of the following day in three regions in the Canadian Rockies, Statham et al. (2018b) found that Extreme was forecast just 0.03 % of the time in over 3752 bulletins and that these few instances were incorrect 89 % of the time when compared to the nowcasts. Low was correctly identified most often (84 %), and accuracy fell with each subsequent increase in danger rating: 71 % for Moderate, 69 % for Considerable, 55 % for High, and 11 % for Extreme. Techel and Schweizer (2017) found similar results in their analysis of danger rating estimates in Switzerland. The rarity of Extreme and the difficulties associated with correctly identifying it led Statham et al. (2018b) to stress the importance of focusing risk messages on information that can be used by the receiver in a meaningful way. This observation is supported by research from Aitsi-Selmi et al. (2016) that focuses on improving government communications of complex science in disaster risk reduction strategies using strategies that are "useful, usable, and used". A danger rating that is hardly used has limited utility and usability in a public risk communication context.
When making recommendations for improving a risk communication tool, it is critical to have a clear understanding of the target audience (Lundgren and McMakin, 2018). Since the community of avalanche bulletin users is highly diverse, it is important to properly reflect on which segment of the community depends on the danger rating the most so that the communication tools can be optimized for that audience. The simplicity of the danger rating targets lower-level users, who may not have the training, experience, or desire to pursue, comprehend, and integrate more complex information into their risk management processes. This was explicitly described by St. Clair's (2019) typology of bulletin users, and the responses to our danger rating use question further confirmed that less engaged respondents with lower levels of training are more prevalent in classes that rely more heavily on the danger rating at the lower end of the scale. These users also depend on the danger rating as a threshold for entering or avoiding the backcountry. In addition to these trip planning contexts, the danger rating is often incorporated into decision-making frameworks and aids, such as the reduction method (Munter, 1997) or the trip planning tool of the Avaluator (Haegeli, 2010), which systematically aid basic judgment and decision-making in avalanche terrain and are often recommended as a basic tool for beginner backcountry recreationists. Because the danger rating is so pivotal to these lower-end users, it is critical that the scale be intuitively understood by them.
Our results, combined with the above reflections on Extreme and the nature of the target audience of the danger rating, contribute to the discussion of whether a four-step scale would better serve recreationists. Given that a linear perception of the scale is seemingly the most intuitive, removing a level in the upper end of the scale would move Considerable above the center point of the scale and thereby increase the perceived severity of the condition. Extreme is rarely used and is often incorrectly forecast, and many lower-end users, who are the primary target audience of the danger rating, do not distinguish between High and Extreme. These users already effectively use only four of the current five levels, and removing Extreme would better align the danger scale with the needs and abilities of those who depend on it the most. More advanced users who may use Extreme as their personal threshold for not entering the backcountry could be warned about extreme hazard conditions in other parts of the bulletin, such as the headline, the avalanche problem information, or the condition summaries, because they have the necessary skills to interpret these products in a meaningful way. Hence, eliminating Extreme from the public avalanche danger scale would likely have little impact on these users.
Despite these potential advantages, there are several substantial challenges with changing the well-established danger scale from five to four levels. Retraining users can be difficult, as shown in previous research illustrating recreationists' tendency to revert to previous practices even after being presented with new definitions for the existing danger rating levels (Ipsos Reid, 2009). Changing the scale would also require avalanche risk communicators to redevelop or adjust many existing products, such as decision aids, online tutorials, educational materials, professional resources, and roadside signs. Consistency between regions was one of the driving factors for selecting five levels for the danger scale in North America (Statham et al., 2010), and adopting a four-level scale in North America would create inconsistencies with other regions (e.g., Europe). In fact, confusion resulting from differences in danger rating scales used in European countries in the late 1980s and early 1990s was the impetus for the design of the unified five-level danger scale (Mitterer and Mitterer, 2018). However, in the winter of 2022/23, the Swiss avalanche warning service introduced sublevels to the danger rating levels published in their avalanche forecasts, essentially turning the 5-level danger scale into a 13-level scale (SLF, 2023; Techel et al., 2022). If a four-level recreational danger rating scale were to be pursued, the development would have to be supported with a comprehensive overview of the best practices surrounding each of its components including the signal words, icons, colours, symbols, and the graphical representation of the overall scale to determine how to best portray the new scale to ensure efficient and accurate information transmission with the target audience. A new four-level danger scale would have to be extensively user tested with the intended target audience before it is introduced to the community to ensure it is truly an improvement.
It is also important to remember that in many European countries, the public avalanche danger ratings are used to communicate avalanche hazard not only to recreationists but also to decision-makers responsible for residential areas and critical infrastructure, and in these circumstances the fifth danger rating is important for communicating rare but very dangerous conditions. However, using a single scale to communicate to multiple audiences with very different needs might result in suboptimal risk communication to all audiences. This is less of an issue in North America, where the danger scale is primarily used to communicate to recreationists.
In lieu of reducing the danger scale to four levels, avalanche warning services may wish to capitalize on existing strengths to improve risk communication strategies for a wider audience. Our research highlighted that survey participants seem to use the current system in an appropriate fashion for deciding whether to go into the backcountry. As explained earlier, we attribute this observation at least partially to the travel advice column in the danger rating definition table (Fig. 1). To build on this result, avalanche risk communicators may wish to put even stronger focus on recommended protective actions, as they provide users with tangible guidance on what to do under specific conditions. Providing this type of guidance has been recommended by Mileti and Sorenson (1990) in a general emergency public warning context and by Sutton and Woods (2016) with respect to tsunami warnings. This is consistent with previous recommendations by Klassen (2012), who highlighted the need for developing terrain-based tools and decision aids, particularly to support recreationists in field-based decision-making. Linking actions and field-based tools to varying levels of user skill is critical, as previous research has shown that many backcountry recreationists overestimate their ability to apply bulletin information to terrain selection scenarios (Finn, 2020). Such terrain-based tools and decision aids can offer opportunities for providing guidance that overcomes the shortcomings of the existing danger scale without the need to change the existing scale and the risk of confusing existing users. In addition, improvements in the presentation of the danger scale may further improve users' understanding of the scale. For example, the exponential-like increase in severity could be emphasized in educational materials, the graphical representation of the scale, and the numbers that indicate the danger rating levels. These simple strategies are likely insufficient for creating drastic improvements, but they provide options for better aligning the public and scientific understanding of the scale over the long term.

Limitations
While this research provides useful insights about how people perceive and use the danger scale, there are several limitations to the study that should be considered when interpreting and generalizing the results. Some of the standard limitations of online survey research include that (a) stated danger rating use may differ from actual danger rating use and (b) the participants of our survey study do not necessarily provide a representative sample of the full range of backcountry recreationists. Our recruitment strategy aimed to recruit a diverse sample, but our sample is still biased towards backcountry skiers and recreationists with an existing and heightened interest in avalanche safety. This relative dominance of more engaged and advanced backcountry users may have prevented some of the subtleties between lower-level bulletin users from emerging in our results. Future research on avalanche bulletin users should develop better strategies for collecting a more comprehensive and representative sample of the winter backcountry community.
Various aspects of the perception question might have limited participants' ability to accurately represent their perspective. First, using the sliders to describe one's perception of the danger scale was a challenging task. While our question format was the most flexible and least suggestive, we encourage other researchers to use other question formats (e.g., multiple choice questions with verbal descriptions or visual depictions) to further explore people's perception of the danger scale. Furthermore, a more precise wording of the perception question might allow future studies to distinguish between participants' understanding of the theoretical functional form of the danger scale versus their personal experience with the scale in the field. Finally, our approach of using the midpoints of the provided severity ranges to describe the functional form of the danger scale offers a meaningful but simplified and potentially limited perspective on participants' perception.
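To make the midpoint simplification concrete, the following minimal sketch reduces one hypothetical participant's five severity ranges to midpoints and labels the resulting curve by the sign of its mean second difference. It is an illustration only and not the latent class mixed-effect model used in this study. The 0 to 100 slider range, the tolerance value, the function names, and the example responses are all assumptions introduced here, and "convex" and "concave" are used in the usual mathematical sense, which may not correspond exactly to the class labels in our results.

```python
from typing import List, Sequence, Tuple

# Hypothetical slider responses on an assumed 0-100 severity axis, given as
# (lower, upper) ranges for Low, Moderate, Considerable, High, and Extreme.
LINEAR_EXAMPLE = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
CONVEX_EXAMPLE = [(0, 4), (4, 10), (10, 24), (24, 50), (50, 100)]
CONCAVE_EXAMPLE = [(0, 40), (40, 70), (70, 85), (85, 95), (95, 100)]


def midpoints(ranges: Sequence[Tuple[float, float]]) -> List[float]:
    """Return the midpoint of each (lower, upper) severity range."""
    return [(lower + upper) / 2.0 for lower, upper in ranges]


def classify_shape(mids: Sequence[float], tolerance: float = 5.0) -> str:
    """Label the curve through the midpoints as linear, convex, or concave.

    The label is based on the mean second difference of the midpoints:
    clearly positive means the steps between levels grow (convex),
    clearly negative means they shrink (concave). The tolerance is an
    arbitrary illustrative threshold, not a value from the study.
    """
    steps = [b - a for a, b in zip(mids, mids[1:])]
    second_differences = [b - a for a, b in zip(steps, steps[1:])]
    mean_second = sum(second_differences) / len(second_differences)
    if mean_second > tolerance:
        return "convex"
    if mean_second < -tolerance:
        return "concave"
    return "linear"


for name, example in [
    ("linear example", LINEAR_EXAMPLE),
    ("convex example", CONVEX_EXAMPLE),
    ("concave example", CONCAVE_EXAMPLE),
]:
    print(name, midpoints(example), classify_shape(midpoints(example)))
```

The sketch also shows what the simplification discards: two participants with the same midpoints but very different range widths would receive the same label, which is one reason the midpoint perspective may be limited.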
Several additional aspects of the survey design may have further influenced participants' responses. The avalanche bulletin scenario included in the slope choice exercise (Finn, 2020) presented in the survey prior to the questions examined in this paper may have primed participants to remember Moderate and Considerable in the recall question. However, both Low and Considerable were recalled correctly by almost 90 % of participants despite Low not being previously mentioned in the survey. Furthermore, the survey's focus on trip planning and relatively simplistic danger rating use question do not provide comprehensive insight into how recreationists use danger ratings in their risk management practices, both in the trip planning stage and out in the field. While several studies have examined the application of danger ratings to slope-scale decisions using discrete choice experiments (e.g., Haegeli et al., 2012, 2020; Haegeli and Strong-Cvetich, 2020), more qualitative research might provide a richer perspective on the role of the danger scale in field decisions. Having an in-depth understanding of the application of the danger rating to terrain is of critical importance before making any changes to the scale.

Conclusion
Avalanche bulletins are a crucial source of information for recreationists to plan and carry out informed backcountry travel. To effectively communicate hazard information to a wide variety of recreationists, avalanche risk communicators must understand how different users amongst their audience are perceiving and using the different components of avalanche bulletins, including the danger ratings. A better understanding of how people with different levels of training and engagement interact with danger ratings can highlight valuable opportunities for making them resonate better and increasing their effectiveness.
We analyzed responses from an online survey to evaluate recreationists' perception of the North American Public Avalanche Danger Scale and how they use danger ratings to determine whether to enter the backcountry. Our results show that most recreationists (70 %) correctly recall the five levels of the danger scale in the correct order. The most common mistakes in recalling the danger scale or ordering the signal words were associated with Considerable, High, and Extreme. Nearly 65 % of participants provided answers that indicated a linear perception of the danger scale. Approximately 28 % of participants showed a convex perception, and just over 7 % of participants indicated a concave perception. These three major perception patterns indicate that recreationists' understanding of the danger scale differs substantially from the scientific understanding of the exponential-like increase in severity between danger rating levels. In terms of using a danger rating in a trip planning context, over 90 % of participants reported that an Extreme rating would prevent them from entering the backcountry, and more than 60 % of participants avoid the backcountry at both High and Extreme. About a third of participants reported primarily using the danger rating to decide to enter the backcountry at Low.
These results complement existing research on the danger scale. Previous research has addressed the nature of the scale (Schweizer et al., 2020), avalanche forecasters' application of the scale (Statham et al., 2018b; Techel and Schweizer, 2017), and accident distributions according to different danger ratings (Greene et al., 2006; Pfeifer, 2009; Techel et al., 2015; Winkler et al., 2021). Our research fills an important gap in understanding how recreationists, the primary target audience of bulletin products in North America and an important bulletin user in Europe, interact with danger ratings. Given that danger ratings are more critical for less experienced recreationists who interact with the bulletin in less sophisticated ways (St. Clair et al., 2021), it is important that the ratings resonate with these types of users and empower them to make safe backcountry decisions. While the current five-level danger scale seems to serve higher-trained individuals well, Extreme may be an extraneous level for lower-level users, whose trip planning process depends more heavily on the danger rating.
While this study provides a meaningful starting point for a better understanding of bulletin users' perception and use of the avalanche danger scale, there are numerous opportunities for future research. Studies more explicitly targeting avalanche forecasters and educators will provide valuable information on differences in danger rating perceptions and use between recreationists and professionals. Furthermore, using different question formats and examining the danger rating use in more detail are necessary to get a more holistic perspective on the understanding and use of the scale by different recreational audiences. Terum et al. (2022) recently examined the effect of past trends in avalanche danger levels on recreationists' perception of current and future avalanche hazard in Norway. Combined with the results of this research, these studies may provide valuable lessons on the effectiveness of danger scales for avalanche warning services.
Given that little research has addressed how the public perceives and uses danger ratings in any type of hazardous environment, we also suggest that similar research projects should be pursued in other hazard contexts. A more comprehensive examination of hazard scales across other hazard domains might reveal overarching lessons that will help to improve their overall effectiveness.
Appendix A

Table A1. Statements included in avalanche bulletin user type question.

| Bulletin user type | Statement |
| --- | --- |
| Type A | It is not typical for me to consult avalanche bulletin information when making my backcountry travel plans. |
| Type B | I typically use the bulletin to check the danger rating, which informs my decision of whether or not it is safe to travel in the backcountry. |
| Type C | I typically combine the danger rating from the bulletin with knowledge of how avalanche prone an area is to determine where to travel in the backcountry. |
| Type D | I typically make a decision about where or when to go based on the specific nature of the avalanche problem conditions reported in the bulletin and whether I feel that I can manage my travel in the terrain given these conditions. |
| Type E | I typically use the available information about the specific nature of the avalanche problem conditions from the bulletin as a starting point for my continuous assessment in the field to confirm or disconfirm the information where I am travelling. |
| Type F | It is not typical for me to consult public avalanche bulletins or forecasts because I have access to professional information sources (e.g., InfoEx) that offer more detailed insight into current conditions. |
Appendix B

Figure 2. Screenshot of survey question on avalanche danger perception. Panel (a) shows the question prompt and starting positions for sliders. Panel (b) shows a completed response with example answers.

Figure 3. Screenshot of survey question on typical use of avalanche danger rating levels in trip planning. Panel (a) shows the prompt and starting position for drop-down options. Panel (b) shows a completed response with example answers.

Figure 5. Results of latent class mixed-effect model analysis of danger rating perception question. Each panel presents the danger rating regression line (black line) and severity ranges for each level (thick vertical lines) from the regression analysis and the severity range distributions of participants' answers associated with this class. Classes are arranged by general shape of the regression lines (starting top left: concave, then linear, then convex). Class sizes are shown above the top left corner of the charts. The identification number of the latent class is provided above the top right corner of the chart.

Figure 6. Results of CTree analysis of danger rating perception question. The labels Larger and Smaller indicate proportions of perception class memberships that are significantly larger or smaller than the average.

Figure 7. Results of latent class analysis of danger rating use question. Each panel presents the response probability for the ordinal response options (i.e., proportion of participants) for the five danger rating levels of one of the identified classes. Class sizes are shown above the top left corner of the charts. The labels for the latent classes include the following abbreviations: DR: danger rating; L: Low; M: Moderate; C: Considerable; H: High; E: Extreme; info: information.

Figure 8. Results of CTree analysis of danger rating use question. The labels for the latent classes include the following abbreviations: DR: danger rating; L: Low; M: Moderate; C: Considerable; H: High; E: Extreme; info: information.

Table 2. Overview of danger rating scale recall results (n = 2960).

Table B1. Parameter estimates for latent class mixed-effect model for the danger rating perception analysis.