Issues and complexities in safety culture assessment in healthcare

The concept of safety culture in healthcare (a culture that enables staff and patients to be free from harm) is characterized by complexity, multifacetedness, and indefinability. Over the years, disparate and unclear definitions have resulted in a proliferation of measurement tools, with a lack of consensus on how safety culture can best be measured and improved. Achieving sufficient response rates is also a growing challenge due to "survey fatigue," and the need for survey optimization has never been more acute. In this paper, we discuss key challenges and complexities in safety culture assessment relating to definition, tools, dimensionality and response rates. The aim is to prompt critical reflection on these issues and point to possible solutions and areas for future research.

requirements (2). Notably, improving safety culture has become a significant priority for the Organization for Economic Cooperation and Development (OECD), especially as healthcare systems have faced additional safety concerns arising from the COVID-19 pandemic (11). In 2020, the OECD compared the safety culture results of 16 countries in an attempt to harmonize approaches, standardize methodologies, improve the comparability of safety culture data over time, and contribute to international benchmarking efforts (11). This work has revealed the heterogeneous nature of how healthcare staff perceive patient safety in their work environments, and has afforded opportunities to share best practices regarding efforts to improve safety culture (11). Despite such efforts, several challenges persist in measuring and intervening on safety culture that must be considered, including variability in definitions, tools, dimensionality, and response rates.
In this paper, we have drawn on recent literature and experiences in patient safety culture assessment to critically appraise each of these issues and then suggest possible solutions and areas for future research.

Definitional issues
Safety culture is arguably a poorly articulated concept, with many different definitions apparent both within and outside the healthcare domain (12). For example, over 51 distinct definitions have been proposed, leading some researchers to describe the concept as having "the definitional precision of a cloud" (13,14). This lack of cohesion has led to the development of various frameworks, each built upon a different definition and therefore a different way of conceptualizing and extracting meaning from the concept (14,15).
Compounding the issue of definitional equivocality, many researchers also draw a distinction between safety climate and safety culture. While safety culture is argued to denote more longstanding, ingrained behaviors, practices, beliefs and values within an organization, safety climate is proposed to embody people's perceptions of their organization (its procedures, practices, and the kinds of behaviors that are tolerated or rewarded) at a given time (16)(17)(18). Following this, some argue that climate is easier to measure than culture: if climate is considered a more transient state of safety at a discrete point in time, it is more measurable. However, many others use the terms safety culture and safety climate interchangeably within the research literature (14,19,20). For the purposes of this paper, we use the term safety culture to include both culture and climate.
The most commonly used definition of safety culture was proposed by the Health and Safety Commission (1993): "The product of individual and group values, attitudes, competencies and patterns of behavior that determine the commitment to, and the style and proficiency of, an organization's health and safety programmes" (p. 339) (12). However, some suggest that the broadness of such a definition weakens its scientific utility, indicating that much greater precision is required (21). Herein lies another challenge: although the Health and Safety Commission's definition may provide some guidance on which constructs to examine when assessing safety culture, the specific values, attitudes, competencies and behaviors, and how to measure them, are still not clear (15). Consequently, many different tools have been developed, in particular surveys, each attempting to measure the complexities of safety culture (4,12,22). Indeed, surveys are particularly attractive as they are practical and time-efficient tools for gathering large amounts of data in a reliable and reproducible manner, thus supporting comparison and international benchmarking efforts. The anonymity usually involved in this form of data collection also makes them appealing for quality improvement, as they facilitate the contributions of staff who may be uncomfortable expressing their views openly (14,15).

Variability in tools
Growing interest in safety culture has been accompanied by a proliferation of tools, each deriving from differing conceptualizations of safety culture (23). At least 220 different safety culture or safety climate surveys have been identified across industry sectors (24). The multitude of surveys has led to numerous systematic reviews of the available tools both within and outside of healthcare. Within healthcare, there is wide variability in the number of dimensions (ranging from one to 12) and items (ranging from 10 to 74) that the tools contain, as well as in their validity and adaptability for use in multiple settings (8), with no one tool emerging as the gold standard (12). The most widely employed surveys in safety culture research, and arguably the most validated, as identified in a recent safety culture review, are the Hospital Survey on Patient Safety Culture (HSOPS) (25), the Safety Attitudes Questionnaire (SAQ) (26), the Patient Safety Culture in Healthcare Organizations Survey (PSCHO) (9), and the Safety Climate Scale (27). However, each of these questionnaires again assesses a different number and combination of dimensions (ranging from one to 12), varies in length (ranging from 13 to 48 items), and has been designed for particular settings or contexts (28).
Scoring of commonly employed surveys, such as the HSOPS (25), presents further challenges, as results can vary depending upon the strategy and computational method selected. While the Agency for Healthcare Research and Quality (AHRQ) recommends that the percentage of positive responses be computed to interpret the 12 HSOPS dimensional scores, two alternative aggregation methods have been identified in the literature, leading to potential bias when comparing results between studies, hospitals and countries (29). Notably, Giai et al. (29) demonstrated the heterogeneity of results obtained by the three scoring approaches when assessing safety culture in a French university hospital, showing that dimensional score values, as well as their corresponding rankings, varied considerably across the different scoring methods. For example, for the HSOPS dimension "teamwork within hospital units," the score for the worst performing department based on percent positive scores increased by more than 10% when averaged individual sums were used instead (29). This study highlights that healthcare decision makers must compare HSOPS results within and between organizations with great caution, and that agreement must first be reached on a consistent scoring approach.
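To make the scoring issue concrete, the following minimal sketch applies two aggregation rules to the same set of answers for a single dimension. It is an illustration under stated assumptions, not a reproduction of the AHRQ scoring rules or of the exact methods compared by Giai et al. (29): the response data are hypothetical, items are assumed to be on a 1-5 agreement scale with negatively worded items already reverse-coded, and the function names `percent_positive` and `mean_score_percent` are invented for this example.

```python
from statistics import mean

# Hypothetical data: each inner list holds one respondent's answers to the
# items of a single dimension, on a 1-5 agreement scale (assumed to be
# reverse-coded where needed, so that higher always means more positive).
responses = [
    [4, 4, 3, 4],
    [3, 3, 3, 3],
    [5, 4, 4, 2],
    [3, 3, 4, 3],
]


def percent_positive(responses, threshold=4):
    """Percentage of all item answers at or above `threshold` (pooled over items)."""
    answers = [a for person in responses for a in person]
    return 100 * sum(a >= threshold for a in answers) / len(answers)


def mean_score_percent(responses, scale_min=1, scale_max=5):
    """Average of each respondent's item mean, rescaled to a 0-100 range."""
    person_means = [mean(person) for person in responses]
    return 100 * (mean(person_means) - scale_min) / (scale_max - scale_min)


print(percent_positive(responses))    # 43.75
print(mean_score_percent(responses))  # 60.9375
```

In this toy example the same answers yield 43.75 under the percent-positive rule and about 60.9 under the mean-based rule, because neutral answers count for nothing in the first and for half the scale in the second. This is one simple way in which dimensional scores, and potentially rankings, can diverge when studies do not share a scoring approach.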
Additionally, different versions exist for numerous safety culture surveys, including short and long versions (e.g., SAQ 36-item short form and SAQ 60-item full form), and versions for specific contexts (e.g., HSOPS for hospitals, medical offices, ambulatory surgery centers, nursing homes, and community pharmacies). Further, both the HSOPS and SAQ have undergone major revisions in recent years. The HSOPS 2.0 was released in 2019 and involved deleting, rewording and adding multiple items (25). Also in 2019, the SAQ was superseded by the Integrated SCORE (Safety, Communication, Operational Reliability & Engagement Survey) (30), which removed one of the original dimensions and added a number of others with a greater focus on staff wellbeing, an issue discussed further in this paper. Bryan Sexton, co-developer of the SAQ, stated that the older surveys needed to be updated as "they were not intended for use in today's healthcare environment" and had "limited evidence of reliability and validity" (31). However, the Integrated SCORE, also co-developed by Sexton, is no longer freely available, so the extent to which this survey will be taken up by hospitals and researchers is unclear. On the other hand, the transition to the HSOPS 2.0 appears to have been more positive, with countries including Australia developing their own context-specific version (the A-HSOPS 2.0) and a toolkit to support its implementation (32). This nevertheless raises the question of how comparable results are between different survey versions, particularly when it comes to international benchmarking.
Further, while questionnaires are a practical way to capture data from a large group of participants or staff, one major issue is that exclusive reliance on quantitative data fails to capture and expose rich insights into the dimensions of culture (33). For example, questionnaires tend to capture only superficial artifacts and beliefs, rather than the underlying shared assumptions which are argued to comprise the culture of an organization (34). Consequently, some researchers argue that a more valid approach to assessing safety culture is to incorporate qualitative methods in addition to questionnaires to enable greater exploration of the identified dimensions (8,15). However, these approaches typically require more researcher involvement and resources, such as participating in fieldwork, directing narrative interviews, or conducting observational research (12). Some questionnaires, including the HSOPS 2.0 and SCORE, recognize the need for mixed-method assessment and recommend the inclusion of qualitative, open-ended questions at the end of the survey.

Inconsistency in dimensionality
Safety culture is multi-faceted, and the tools employed to measure the concept are typically based upon the assessment of several inter-related attributes or dimensions (16,35). However, much like the ambiguities that manifest in the definition of safety culture (36), researchers are yet to reach a consensus on the underlying dimensions that comprise safety culture (12), highlighting yet another challenge faced in the field. For example, while some narrowly define safety culture as focusing on the key dimensions of unit and organizational leadership's prioritization of safety (37), others more broadly conceptualize safety culture to include sub-dimensions such as learning, reporting, and blame orientation (21,38,39). Sometimes, more distant dimensions are also included, such as job satisfaction (26) and staffing (2). Furthermore, the dimensions comprising safety culture are usually considered highly context dependent (40), varying by industry and even organization (41).
In an attempt to identify the fundamental dimensions of safety culture in healthcare, Flin et al. (16) reviewed 12 quantitative studies of safety culture in healthcare. The 73 safety culture dimensions identified across these 12 studies were re-categorized by the researchers into 10 distinct themes: management/supervision; safety systems; risk perception; job demands; reporting/speaking up; safety attitudes/behaviors; communication/feedback; teamwork; personal resources (such as stress); and organizational factors. In this study, management commitment to safety emerged as the most frequently measured safety culture dimension. More recently, Halligan and Zecevic (12) reviewed 113 articles which explored the dimensions of safety culture in healthcare. They found that the six most frequently cited dimensions were: leadership commitment to safety; open communication founded on trust; organizational learning; a non-punitive approach to adverse event reporting and analysis; teamwork; and a shared belief in the importance of safety. Organizational learning was identified as an important theme that was not specifically identified as a separate dimension in the Flin et al. review (16). However, both reviews lacked detail on how dimensions were identified, and in turn how they mapped to the safety culture tools reviewed.
In a more recent systematic review assessing the dimensions of safety culture, Churruca et al. (15) assessed 694 studies (including quantitative, qualitative and mixed-methods studies) to identify the most commonly utilized approaches to assessing safety culture in healthcare, and the dimensions of safety culture captured through these processes. A comprehensive thematic analysis identified 11 dimensional themes present across studies, including: leadership; perceptions of safety; teamwork and collaboration; safety systems; prioritization of safety; resources and constraints; reporting and just culture; openness; learning and improvement; awareness of human limits; and wellbeing (15). Table 1 provides a summary of the 11 themes and the number of studies identified incorporating each theme. As shown in this table, the most commonly assessed dimensional themes present in over half of the current approaches to assessing safety culture include: leadership; perceptions of safety; teamwork and collaboration; safety systems; prioritization of safety; and resources and constraints (15).
As shown in Table 1, staff wellbeing has been the least frequently assessed dimensional theme, present in less than a quarter of available tools (15,42). While safety culture improvement efforts have traditionally concentrated on interdisciplinary teamwork and patient safety education, recent research has identified that addressing staff wellbeing factors, especially healthcare worker burnout, may also play an important role (43)(44)(45)(46). Burnout refers to the ongoing and unmitigated stress response that results in symptoms of depersonalization, emotional exhaustion, and a decreased sense of personal accomplishment (47). Burnout is one of the most prevalent staff wellbeing problems that healthcare professionals currently face, given the challenges imposed by the nature of clinical work, time constraints, lack of control over work processes, and the higher work demands arising from the COVID-19 pandemic (45,48). Recognizing that >30% of frontline healthcare staff are experiencing burnout, Sexton et al. added a greater focus on staff wellbeing to the Integrated SCORE (31). However, further work is needed to understand whether staff wellbeing should be studied as a dimension or an outcome of an organization's safety culture.

Response rates
Another challenge when using questionnaires to assess safety culture is the need to obtain sufficient response rates. Low response rates are particularly problematic as they can increase bias, since non-responders may be systematically different from responders (49). An overall response rate above 60% is often believed to be needed in order to establish sufficient reliability and validity of the data captured (50). Some researchers argue that anything less is more an assessment of "opinion" than of "culture" (51).
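As a simple illustration of how the 60% rule of thumb might be applied in practice, the sketch below computes response rates per unit and flags those that do not exceed the threshold. The unit names and counts are hypothetical, and the helper `response_rate` is invented for this example; the threshold is taken from the figure cited above (50) and is not a fixed standard.

```python
# Rule-of-thumb threshold from the text: rates above 60% are often
# considered sufficient. Treated here as an adjustable assumption.
THRESHOLD = 60.0


def response_rate(returned, distributed):
    """Response rate as a percentage of distributed questionnaires."""
    if distributed <= 0:
        raise ValueError("distributed must be a positive number")
    return 100 * returned / distributed


# Hypothetical counts: unit -> (questionnaires returned, questionnaires distributed)
units = {
    "Ward A": (45, 60),
    "Ward B": (18, 50),
}

rates = {name: response_rate(r, d) for name, (r, d) in units.items()}
flagged = [name for name, rate in rates.items() if rate <= THRESHOLD]

print(rates)    # {'Ward A': 75.0, 'Ward B': 36.0}
print(flagged)  # ['Ward B']
```

Reporting rates per unit rather than only overall can help show where results may reflect the views of a small, possibly unrepresentative group of responders.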
Low response rates are increasingly being reported due to duplicative survey efforts, which create survey fatigue and isolated datasets that do not produce a consistent snapshot of safety culture (50,52,53). Since the COVID-19 pandemic, additional time constraints, lack of resources and survey fatigue have been reported, and thus the need for survey integration has never been more acute (31).

Conclusions and recommendations
Although safety culture surveys offer practical and time-efficient tools that appeal to quality improvement and international benchmarking efforts, there remains no "gold standard" for measuring safety culture, with no one survey comprehensively evaluating all the important aspects of safety culture (8). Furthermore, variations in survey versions and scoring methods limit the capacity for comparison across studies and countries, which is the very feature that makes surveys appealing in the first instance.
In response to the issues we have highlighted, we first recommend using well-validated surveys of safety culture followed up by qualitative methods, such as interviews or focus groups, to enrich the exploration of complex issues related to safety culture, identify priority dimensions, and provide insight into areas for improvement (14,15). We also recommend that staff wellbeing be regularly assessed alongside measures of safety culture and patient safety outcomes to further advance our understanding of how safety is enacted in pressured healthcare environments. The issue of survey fatigue in many hospitals also points to the broader need to reduce duplicative survey efforts and adopt a more streamlined and consistent survey approach (31). Moving to an agreed gold standard survey approach across healthcare settings would certainly make benchmarking more reliable. Research has also pointed to some strategies that can help increase response rates, such as distributing the questionnaire in person during training sessions or staff meetings, or allocating a local champion who can motivate non-responders to consider participating (50).
While measuring patient safety culture is a key component of many OECD countries' national patient safety strategies and the topic of a large body of research (11), the next steps for improving safety culture, health system performance and outcomes for staff and patients based on its measurement are less clear. Measuring safety culture should be considered a starting point from which improvement actions and patient safety changes emerge (2). Systematized data feedback for all who contribute to measurement is recommended, combined with problem solving, action planning and monitoring (2). Team training and team communication skills, executive walk arounds, and intervention strategies combining adaptive interventions (such as continuous learning) with technical interventions (such as clinical care algorithms) have been shown to improve patient safety and quality (54)(55)(56)(57)(58). Organizational strategies involving bottom-up organizational and employee learning from behavioral outcomes, conducive enabling factors, consistency over time, and effective leadership are also key elements (3,22). One promising bottom-up strategy shown to improve patient safety is safety huddles. Although huddles were originally designed to learn from errors and adverse events (known as "Safety-I"), they are now also being used to support learning for improvement based on situations where work goes well ("Safety-II") (59), by including reflection time that allows staff to talk about and learn from things that went well. Based on the latest evidence, such Safety-II-inspired huddles could also be considered to lead to improvements in safety culture (60).
Investigating issues and complexities in safety culture assessment in healthcare is a relatively young research field which needs to develop in line with the rapid changes in different healthcare systems. Challenges vary from high- to low-income countries and across contexts ranging from primary care, nursing homes and homecare to specialized hospital services. We argue that continuous critical reflection is needed in this field to keep assessment methods, instruments and approaches relevant and targeted. Keeping instruments and implementation guidance openly accessible is recommended to increase use and enable practice improvement worldwide. That is crucial if we are to encourage widespread application in poorly resourced settings. This is also a way to support the UN's third Sustainable Development Goal of promoting good health and wellbeing for all at all ages.
Frontiers in Public Health frontiersin.org

Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions
LE had the idea and developed the first draft of the article with EF, which was further developed in close collaboration with PH and SW. All authors contributed to the revision and have approved the final version of the article.

Funding
This work was supported by funding from the NHMRC Partnership Centre in Health System Sustainability (Grant ID 9100002) and an NHMRC Investigator Grant (Grant ID 1176620). SW and ER were supported by the Research Council of Norway from the FRIPRO TOPPFORSK program, Grant Agreement No. 275367, and the University of Stavanger, Norway; NTNU Gjøvik, Norway.