“Blue flags”, development of a short clinical questionnaire on work-related psychosocial risk factors - a validation study in primary care

Background Working conditions substantially influence health, work ability and sick leave. Useful instruments to help clinicians pay attention to working conditions are lacking in primary care (PC). The aim of this study was to test the validity of a short “Blue flags” questionnaire, which focuses on work-related psychosocial risk factors and any potential need for contacts and/or actions at the workplace. Methods From the original “The General Nordic Questionnaire” (QPSNordic), the research group identified five content areas with a total of 51 items which were considered to be most relevant when focusing on work-related psychosocial risk factors. Fourteen items were selected from the identified QPSNordic content areas and organised in a short questionnaire, “Blue flags”. These 14 items were validated against the 51 QPSNordic items. Content validity was reviewed by a professional panel and a patient panel. Structural and concurrent validity were also tested within a randomised clinical trial. Results The two panels (n = 111) considered the 14 psychosocial items to be relevant. A four-factor model was extracted with an explained variance of 25.2%, 14.9%, 10.9% and 8.3% respectively. All 14 items showed satisfactory loadings on all factors. Concerning concurrent validity, the overall correlation was very strong, rs = 0.87 (p < 0.001). Correlations were moderately strong for factor one, rs = 0.62 (p < 0.001), and factor two, rs = 0.74 (p < 0.001). Factor three and factor four were weaker, but still fair and significant at rs = 0.53 (p < 0.001) and rs = 0.41 (p < 0.001) respectively. The internal consistency of the whole “Blue flags” was good, with a Cronbach’s alpha of 0.76. Conclusions The content, structural and concurrent validity were satisfactory in this first step of development of the “Blue flags” questionnaire. In summary, the overall validity is considered acceptable. Testing in clinical contexts and in other patient populations is recommended to ensure predictive validity and usefulness.


Background
Working conditions are of great importance and influence health, work ability and sick leave [1]. Some conditions at work can be changed or adjusted to the individual, but other conditions are more difficult to modify. Collaboration between health care practitioners and employers on accommodation strategies has been shown to be effective in promoting health and work ability [2]. In Sweden, the employer's responsibility for the work environment and work organisation is quite far-reaching and is regulated in law (Work Environment Act). This includes the physical work environment, but also the psychosocial and organisational working conditions. This means that the employer is responsible for doing systematic risk assessments on a regular basis and for taking action based on them [3,4]. Patients with work disability are often seen in primary care (PC), and one of PC's assignments is to support recovery and improve work ability. Methods that help clinicians address work-related factors are therefore needed.
There is evidence that individuals' working conditions are of great importance for patients with neck/back pain [5] and patients with symptoms of mental disorders [6]. Frequent neck/back pain combined with stress is associated with a high risk of reduced work ability [7]. Health, work and sick leave are all interrelated, and a low level of adjustment latitude at work can be a risk factor for sick leave [8][9][10]. Several studies confirm that work stress [7,11], social support [12], and the balance between demands, control and support [13][14][15] are important factors that have an impact on work ability. Furthermore, psychological factors are important for return to work (RTW) among long-term sick leave patients [16][17][18]. This includes inequality and bullying at work, which are also important factors affecting health [19][20][21].
However, there is a lack of methods and relevant short questionnaires in PC to help clinicians pay attention, during the consultation, to work-related factors that might influence the patient's symptoms, diagnoses and potential for recovery. It may be appropriate, in addition to medical measures, to advise patients to contact their employer about possible workplace adjustments or to ensure that occupational health services are engaged.
Screening for different health status or risks is common in health care in general and is often described in terms of different types of clinical "flags". The flag system has been developed for the assessment of risk factors and is recommended as an investigative methodology, until now especially with regard to musculoskeletal disorders (MSD) [22]. The identification of red and yellow flags is established and provides valuable information to clinicians in health care. Red flags screen for severe health problems or diseases in need of more extensive diagnostic investigations [23], and yellow flags assess mental and emotional health risk factors [24].
Blue flags are defined as the individuals' perceptions of work-related factors that can have an impact on disability. Screening for blue flags is intended for identification of work-related psychosocial risk factors, for example job dissatisfaction and/or poor colleague or supervisor relationships [25]. Earlier research indicates that health care should use questionnaires that cover these types of risk factors in order to support work ability [25,26]. Work support [27] and formalised peer support at the workplace [28] have been found to be associated with reduced low back pain and reduction in sick leave. For this reason, there are recommendations that the examination of the patient should also include assessment of work-related psychosocial risk factors, which can predict the risk of chronic disabling back pain [29,30]. The "Readiness for Return to Work scale" was developed to address the motivational factors contributing to RTW for workers with MSD on sick leave. The instrument is recommended for use in planning and evaluation of occupational intervention/occupational rehabilitation [31]. Other questionnaires focusing on blue flags, such as the Back Disability Risk Questionnaire (BDRQ) [32], the Occupational Role Questionnaire (ORQ) [33], the Obstacles to Return to Work Questionnaire (ORTWQ) [34] and the Psychosocial Aspects of Work Questionnaire (PAWQ) [35], are all designed to be used in occupational health settings, hospitals and rehabilitation clinics. They are not designed for screening for work-related psychosocial risk factors among patients in PC.
Clinical work and patient assessment are different in PC as compared to occupational rehabilitation settings. The time available for each consultation is generally much shorter and the patient population is unselected. Many patients are in early stages of illness or disease when consulting PC for advice and medical evaluation of symptoms. The sorting function in PC is important, and an approach that identifies disease, guides treatment, and prevents unnecessary medicalization is warranted. The importance of robust early screening methods helping clinicians to deliver relevant counselling and treatment is thus central in healthcare development and procedures [36][37][38][39][40]. To our knowledge, there is so far no useful instrument that is easy to handle, takes a short time to complete, and is recommended to help professionals in PC identify important work-related psychosocial risk factors that can affect health and work ability [26]. Thus, there is a need for a generic instrument designed for use in PC to identify and highlight psychosocial risk factors for work disability, which indicates the need for early contacts and/or actions at the workplace in addition to the medical efforts in PC. This instrument is intended to be used by different professionals when meeting patients of working age who are at risk of sick leave.
"The General Nordic Questionnaire for Psychological and Social Factors at Work" (QPS Nordic) is an established, well-known questionnaire for the assessment of psychological, social and organisational working conditions as well as individual work-related attitudes. QPS Nordic is the most comprehensive, reliable and valid questionnaire used in the Nordic countries today. This questionnaire has been used for organisational development, documentation of changes in working conditions, evaluation of organisational interventions and research [41][42][43][44][45][46][47][48]. The questionnaire includes 129 items divided into 13 different content areas classified according to task level, social and organisational level and individual level [49]. QPS Nordic was constructed after extensive development and published in 2000. Two data sets were collected in Sweden, Norway, Denmark and Finland within various occupational fields. The factor structure of the questionnaire and the structure of the scales were studied in the first data set (n = 1015). The second data set (n = 995) was used to test the structural and predictive validity of the scales. The internal consistencies (alpha values 0.60-0.88) and test-retest reliabilities (0.55-0.82) were studied for each scale. In the content areas concerning working conditions, Cronbach's alpha has been found to be 0.69-0.85 [49].
However, a clinical questionnaire in PC needs to be short and easy to handle and QPS Nordic is too extensive to be useful in clinical practice. The aim of this study was to test the validity of a short "Blue flags" questionnaire, which focuses on work-related psychosocial risk factors and any potential need for contacts and/or actions at the workplace.

Design
This is a methodological study with a focus on content, structural and concurrent validity. We conducted the study with two different populations: one for the content validity and a different population for the structural and concurrent validity.

Instrument development
A short questionnaire, "Blue flags", intended for use in PC is under development. In this first step we have focused on work-related psychosocial risk factors based on items from the larger QPS Nordic. Our ambition was to limit the number of items in the new short questionnaire. The selection of items from the original QPS Nordic was based on relevant scientific literature studies, clinical experience and competence in the research group. From the 13 established content areas in the original QPS Nordic the research group identified five content areas with a total of 51 items which were considered to be most relevant when focusing on work-related psychosocial risk factors [5,6][50][51][52][53]. These areas were job demands [41][42][43], social interactions [45,47,48], quantitative demands [44], equality [54,55], and bullying and harassment [46,56]. The selected QPS Nordic items covered these content areas with the following number of items: job demands (32 items), social interactions (6 items), quantitative demands (9 items), equality (2 items) and bullying and harassment (2 items). The answers in the QPS Nordic are given on a 5-point Likert scale from one to five (1 = no problems and 5 = most problems). Fourteen items were selected from the identified QPS Nordic content areas and organised in a short questionnaire ("Blue flags"). This method has previously been described as relevant in research when a long questionnaire is condensed into a shorter one [57,58]. The 14 items in the "Blue flags" questionnaire comprised 7 items on job demands, 2 items on social interactions, 2 items on quantitative demands, 2 items on equality and 1 item on bullying and harassment. The items related to equality and bullying have to some extent been reformulated to be better integrated in the "Blue flags". The answers are given on a 5-point Likert scale, as in the QPS Nordic.

Study populations and procedure

Content validity
One panel of professionals and one panel of patients were questioned in order to receive constructive feedback about the new short questionnaire [59][60][61]. Our intention was to have a broad and relevant representation of experience from pain rehabilitation, vocational rehabilitation and PC. The intention was to gather information on the representativeness and clarity of the items through the panels' constructive feedback as well as suggestions for improvement [62]. The recruitment criterion for the professional panel in health care was experience of work-related health issues. The recruitment criterion for the patient panel was individual experience as a patient in PC with an episode of back pain and being at risk of developing work disability. We were interested in their understanding of the items, perceived relevance and formulations. The panels were recruited from thirteen primary care centres (PCC), two occupational health services, one specialized pain rehabilitation centre and one inpatient centre in the southern parts of Sweden.
Professional panel
Sixty-five professionals from six units agreed to evaluate the short questionnaire "Blue flags" (19 men, 45 women), mean age 45 years (range 21-63 years). The represented professions were physiotherapists (n = 30), occupational therapists (n = 13), physicians (n = 8), social workers (n = 4), nurses (n = 6) and psychologists (n = 4). The professionals were working in health care, mostly in PC (65%) and occupational health (23%), and had been in health care for many years (74% ≥ 10 years). Information about the study was given through presentations at staff meetings and as written information. Professionals in the panel were asked to reflect on the relevance of the 14 items when assessing working conditions. They individually and anonymously evaluated the relevance of each item on a scale from one to three: 1 = not relevant, 2 = relevant and 3 = very relevant. They were also asked if there were items missing, unnecessary items or any need to rephrase items.
Patient panel
Consecutive patients at 13 PCCs were asked by physiotherapists to evaluate the 14 psychosocial items in the questionnaire "Blue flags". Information about the study was given as written information. Forty-six patients from nine PCCs agreed to evaluate the items (10 men, 36 women), mean age 45 years (range 21-62 years), with pain problems in the neck (n = 19), back/lumbar back (n = 24) and shoulder (n = 3). Patients were asked to consider whether the items could be helpful in an assessment regarding their working conditions. They individually and anonymously evaluated the relevance of each item on a scale from one to three: 1 = not relevant, 2 = relevant and 3 = very relevant. They were also asked if there were items missing, unnecessary items or any need to rephrase items.

Structural and concurrent validity
To assess structural and concurrent validity, a cohort of patients from a randomised clinical trial (WorkUp, ClinicalTrials.gov, ID NCT 02609750) answered both the short "Blue flags" questionnaire (14 items) and the original QPS Nordic (51 items) during one visit to one of ten PCCs in southern Sweden. The patients were recruited consecutively in WorkUp when they applied for physiotherapy due to an episode of acute or subacute non-specific back pain and were identified as being at risk of developing work disability according to the Örebro Musculoskeletal Pain Screening Questionnaire (ÖMPSQ), short form [57]. Other inclusion criteria in the WorkUp study were not currently being on sick leave, or having been sickness absent for less than 60 days. In all, 75 patients were included (73 with employment). Mean age was 44 years (range 22-64 years). The PC patients completed the short "Blue flags" and the 51 corresponding items from the QPS Nordic questionnaire during the visit to the physiotherapist. The patients also answered questions regarding their professional background (Fig. 1, Table 1).

Statistics
Data from the questionnaires were manually entered in the database. SPSS 23.0 was used for all analyses.

Content validity
To compare the answers from the professional panel and the patient panel, the ratings were dichotomised as relevant (relevant and very relevant) or not relevant. Because of small sample sizes or response categories with no answers, Fisher's exact test (two-sided) was used to test the difference in proportions. P-values less than 0.05 were considered significant. Content validity was quantified with the content validity index (CVI) [63]. We considered the items in "Blue flags" to be relevant if the item-level CVI was >78% per item. The overall "Blue flags" was considered relevant if the average of the item-level CVIs across the entire scale was ≥90%.
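The content validity calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SPSS procedure used in the study: the function names, the handling of missing answers and the example ratings are all hypothetical, and Fisher's exact test is written out from the hypergeometric distribution so that the sketch needs only the standard library.

```python
from math import comb

ITEM_CVI_MIN = 0.78   # an item counts as relevant if its CVI is above this
SCALE_CVI_MIN = 0.90  # the whole scale counts as relevant at or above this


def item_cvi(ratings):
    """Share of raters scoring an item 2 (relevant) or 3 (very relevant)."""
    valid = [r for r in ratings if r in (1, 2, 3)]  # drop missing answers
    return sum(r >= 2 for r in valid) / len(valid)


def scale_cvi(ratings_per_item):
    """Average of the item-level CVIs over all items in the scale."""
    cvis = [item_cvi(r) for r in ratings_per_item]
    return sum(cvis) / len(cvis)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):  # probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


# Hypothetical ratings for a single item, not data from the study.
professionals = [3, 3, 2, 2, 3, 2, 3, 2, 2, 3]
patients = [1, 1, 2, 1, 3, 1, 2, 1, 1, 2]

cvi = item_cvi(professionals + patients)
relevant_pro = sum(r >= 2 for r in professionals)
relevant_pat = sum(r >= 2 for r in patients)
p_value = fisher_exact_two_sided(
    relevant_pro, len(professionals) - relevant_pro,
    relevant_pat, len(patients) - relevant_pat,
)
print(f"item CVI = {cvi:.2f}, relevant = {cvi > ITEM_CVI_MIN}, p = {p_value:.3f}")
```

The dichotomisation happens inside `item_cvi` (ratings of 2 or 3 count as relevant), and the same split feeds the 2x2 table of panel by relevance that the exact test compares.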

Structural validity
An assessment of the factorability of the data was performed using Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy [64]. Bartlett's test should be significant (p < 0.05) for the factor analysis to be considered appropriate. The KMO index ranges from 0 to 1, with 0.6 as a minimum value for a good factor analysis [64]. To investigate the factor structure of the "Blue flags", a factor analysis was performed using principal components analysis (PCA) extraction with Varimax rotation. A minimum eigenvalue of 1 was specified as the extraction criterion and the criterion for factor loading was set at ≥0.5.
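The two decision rules stated above (retain components with an eigenvalue of at least 1, and accept loadings of at least 0.5) can be illustrated with a small Python sketch. It assumes that the eigenvalues and rotated loadings have already been produced by a statistics package; it does not perform the PCA or the Varimax rotation itself, and the function names and numbers are hypothetical.

```python
def retained_factors(eigenvalues, min_eigenvalue=1.0):
    """Indices of the components kept under the minimum-eigenvalue rule."""
    return [i for i, ev in enumerate(eigenvalues) if ev >= min_eigenvalue]


def explained_variance(eigenvalues):
    """Percent of total variance carried by each component."""
    total = sum(eigenvalues)
    return [100 * ev / total for ev in eigenvalues]


def assign_items(loadings, min_loading=0.5):
    """Map each item to the factor with its largest absolute loading,
    or to None when no loading reaches the threshold."""
    assignment = {}
    for item, row in loadings.items():
        best = max(range(len(row)), key=lambda j: abs(row[j]))
        assignment[item] = best if abs(row[best]) >= min_loading else None
    return assignment


# Hypothetical eigenvalues and rotated loadings, not values from the study.
eigenvalues = [2.0, 1.2, 0.5, 0.3]
loadings = {
    "item_a": [0.81, 0.12],
    "item_b": [0.20, -0.64],
    "item_c": [0.41, 0.38],
}

kept = retained_factors(eigenvalues)
print("retained components:", kept)
print("item assignment:", assign_items(loadings))
```

In this toy input the first two components are retained, and `item_c` is left unassigned because neither of its loadings reaches 0.5.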

Concurrent validity
Concurrent validity was studied as the correlation between the 14 work-related psychosocial items in the "Blue flags" and the 51 corresponding items from the QPS Nordic questionnaire. The items in both questionnaires have the same direction, i.e. a low value indicates better working conditions and answers that indicate problems have a higher value. Since both questionnaires provided ordinal data, we used a nonparametric approach and calculated Spearman's rank correlation coefficient (rs) [65] between the two questionnaires. In accordance with Chan [66], we defined rs values of 0.3-0.5 as fair correlation, 0.6-0.8 as moderately strong correlation and > 0.8 as very strong correlation. Internal consistency was analysed with Cronbach's alpha coefficient. We considered values of α ≥ 0.7 as good [67,68].
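As a rough illustration of the two statistics named above, the following Python sketch computes Spearman's rs with average ranks for ties and Cronbach's alpha from a respondents-by-items table. Several things here are assumptions rather than details from the study: the function names are hypothetical, each questionnaire is assumed to be summarised as one score per patient before correlation, and rs values falling between the stated bands (for example between 0.5 and 0.6) are placed in the lower band.

```python
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def interpret_rs(rs):
    """Label a correlation using the bands stated in the text (assumed edges)."""
    if rs > 0.8:
        return "very strong"
    if rs >= 0.6:
        return "moderately strong"
    if rs >= 0.3:
        return "fair"
    return "below fair"


def cronbach_alpha(responses):
    """Cronbach's alpha; responses holds one list of item scores per person."""
    k = len(responses[0])
    item_vars = [variance(col) for col in zip(*responses)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical total scores for six patients, not data from the study.
short_form = [20, 31, 25, 40, 18, 33]
long_form = [70, 95, 100, 140, 66, 120]
rs = spearman_rs(short_form, long_form)
print(f"rs = {rs:.2f} ({interpret_rs(rs)})")
```

Ranking before correlating is what makes the coefficient suitable for ordinal Likert data, since only the order of the scores matters and not the distance between them.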

Results
The 14 items on work-related psychosocial risk factors which were included in the "Blue flags" are shown in Table 2.

Content validity
The two panels (n = 111) regarded the overall "Blue flags" items to be relevant, with a CVI of 90%. The item-level CVI ranged from 73% to 97% (Table 3). A majority of the professionals considered each of the 14 psychosocial items in the "Blue flags" to be relevant. The patients were most doubtful about "My tasks at work are too difficult" (rated not relevant by 41%) and "There has been bullying and harassment at my workplace during the last 6 months" (rated not relevant by 57%) (Table 3). Fisher's exact test showed significant differences between the panels in the distribution of the responses for nine items (Table 3). Twenty-three professionals and one patient gave suggestions about additional psychosocial items. In particular, they thought there could have been items concerning wellbeing at work (n = 20). Nineteen professionals and one patient gave a total of 40 suggestions about rephrasing items, especially concerning "There are clear goals for my work", "There are incompatible demands for me at work", "I have control in my work situation", "I can solve problems that arise at work" and "I have too many tasks, too much work to do". It was proposed that the item "My tasks at work are too difficult" should have space for comments.

Structural validity
The suitability of the data for factor analysis was satisfactory, with a KMO value of 0.6 and a significant Bartlett's test (p < 0.001). All 14 items in the "Blue flags" showed satisfactory loadings, with a range of 0.514-0.872. A four-factor model was extracted, with the factors explaining 25.2%, 14.9%, 10.9% and 8.3% of the variance respectively. Factors one and two reflected two different aspects of job demands, namely job tasks and job control. Factor three reflected equality and factor four was mixed (Table 4).

Concurrent validity
The correlation between the 14 psychosocial items in "Blue flags" and the 51 corresponding items in QPS Nordic was very strong, rs = 0.87 (p < 0.001).
Correlations between the "Blue flags" groups of items in the four factors and the corresponding QPS Nordic items were moderately strong for factor one, rs = 0.62 (p < 0.001), and factor two, rs = 0.74 (p < 0.001). Factor three and factor four were weaker, but still fair and significant at rs = 0.53 (p < 0.001) and rs = 0.41 (p < 0.001) respectively (Table 5). The internal consistency of the whole "Blue flags" was good, with a Cronbach's alpha of 0.76.

Discussion
This manuscript presents the first preliminary development of a short clinical PC questionnaire focusing on work-related psychosocial risk factors. The "Blue flags" is intended to screen for such risk factors, and to identify any potential need for action at the workplace in addition to the medical interventions in PC. At this stage we denote the "Blue flags" as a questionnaire, but after further development the intention is a short, practical and useful screening tool for clinical practice. Recommendations have been made suggesting the use of screening methods in health care to identify patients in early stages with the purpose to guide them to the best treatment and avoid over-treatment [37][38][39][40]. Despite these recommendations, assessing work-related psychosocial risk factors and any potential need for contacts and/or actions at the workplace as a standardised procedure in PC is still not sufficiently established. The study found satisfactory content validity, structural validity and concurrent validity for the new "Blue flags" questionnaire. The overall correlation for the work-related psychosocial risk factor items between the two questionnaires was very strong and for the factors it was fair to moderately strong. The professional panel and the patient panel had somewhat different views on the relevance of the items, where the professional panel assessed most of the items to be relevant, whereas two of the items were assessed as not relevant by 41-57% in the patient panel.
For ten of the work-related psychosocial items, more than 80% of the patients assessed the item to be relevant. Opinions differed between professionals and patients especially for the items "My tasks at work are too difficult" and "There has been bullying and harassment at my workplace during the last six months". The patient panel had their own individual experience of being patients, unlike the professional panel who worked in the field. The majority of the professionals were highly educated in this area and had long experience of work in health care, on average more than 10 years. Most of them had experience concerning the relationship between work-related risk factors and health, and generally they rated the relevance of the items higher than the patients. The patient panel responded to what they thought of the items with regard to assessing their own working conditions. The patient panel applied for physiotherapy treatment due to neck, back or shoulder pain, and it might have been difficult for them to understand the items' relevance in relation to their pain or their working conditions. Unfortunately we had no information as to whether their pain was related to their work, what type of jobs they had or even whether they were currently employed. Satisfactory content validity was obtained for the items overall, with an average CVI of 0.9. However, the range of the item-level CVI was broad (0.73-0.97) and this must be considered with regard to the two items mentioned above. Still, considering current research in the area of work-related psychosocial risk factors, we believe that items related to bullying and harassment [21,69,70] and job demands [71,72] should be included in the questionnaire.
One third of the professional panel stated that there was a lack of items concerning wellbeing at work, for example relationships, conflicts and meaningfulness, in the "Blue flags". It is well known that wellbeing at work is an important aspect of the psychosocial work environment [1,13,14]. Still, it is evident that all items in "Blue flags" are important components to summarise wellbeing at work and it is debatable if there is a need for additional items. In the further development of the "Blue flags" we also have to consider rephrasing the items that the panel assessed to be unclear. In the first step the items were grouped in four content areas and one single item (bullying/harassment). This differed from the PCA distribution, which revealed a four-factor solution where bullying/harassment was included in the fourth, mixed factor. These findings support the "Blue flags" as a whole questionnaire and as suitable for further development. The PCA result showed good loadings for all items. The factor structure supports our aim for further in-depth research in this area.
The correlation of the 14 psychosocial items in the new questionnaire with the 51 corresponding psychosocial items in QPS Nordic was very strong. In accordance with Chan [66], we had defined rs values of 0.6-0.8 as moderately strong correlation and > 0.8 as very strong correlation, which is a stricter definition than in other studies [58,73]. The correlation was considered good, which indicates that the shorter "Blue flags" captured the work-related psychosocial items just as well as the longer questionnaire QPS Nordic with 51 items. Both "Blue flags" and QPS Nordic showed satisfactory internal consistency [67]. This is in line with previous evaluation of QPS Nordic [49] and indicates that the 14 psychosocial items in the "Blue flags" are acceptable when it comes to internal consistency.

Strengths and limitations
The intention was to develop a questionnaire for screening in PC and to guide clinicians towards the best action and treatment, including possible contacts and/or actions at the workplace. This study did not include the establishment of cut-off points or analysis of predictive validity, which could be considered limitations. Therefore, this questionnaire needs further development before it can be implemented in clinical practice.
To reduce the number of items and to ensure the construction of a comprehensive questionnaire we based our decisions on our clinical experience and recent research findings [41][42][43][44][45][46][47] so that the most important and relevant work-related psychosocial items in the original version were covered in the new short version. The QPS Nordic items were tested in previous research [49] and the method of selecting items from the original long questionnaire to a short form is an established method [57,58]. The extensive clinical and scientific experience from PC, occupational health, occupational rehabilitation and various professions (physician, physiotherapist and psychologist) strengthened the process when we condensed the number of items to the short "Blue flags". The factor analysis confirmed that the items in this short version can be used as a stand-alone questionnaire.
When studying structural and concurrent validity, we included patients in the WorkUp study with no long-term work disability, although they were at risk of developing long-standing problems. It could also be a limitation that the study included only patients with acute and subacute pain in physiotherapy practice, even though it is known that it is important to identify patients with work-related disabilities at an early stage [50][51][52]. Further studies could examine if it is possible to select patients to promote health and work ability and whether the "Blue flags" can indicate the need for early workplace actions. We also set a higher threshold for concurrent validity than previous studies [58,73], which is a strength. The "Blue flags" showed satisfactory structural validity and internal consistency, which strengthens the results [67,68].
The two different groups of patients who assessed either content validity (n = 46) or structural and concurrent validity (n = 75) were recruited from several PCCs and from different areas in southern Sweden, which strengthens the possibilities to generalize the results. The professional panel evaluating the content validity was chosen through personal contact and was not randomly selected. Despite this, the range of professions was broad and the panel members had extensive experience, which strengthens their trustworthiness. It could also be regarded as a strength that there were two different groups of patients in the content and structural/concurrent analyses, respectively.
The result concerning content validity showed the relevance of the items and the importance of identifying work-related risk factors in PC. Furthermore, there were proposals for supplementary items in the questionnaire. The clinical utility needs to be further evaluated. There is also a need to test the questionnaire in other clinical contexts as well as in other patient contexts, such as those with long-standing MSD as well as those with mental disorders [74]. This Swedish questionnaire was tested in a Swedish context and future versions should therefore be validated in other languages and countries. A further step in the development of the "Blue flags" questionnaire could be to supplement it with other types of work-related risk factors that can influence work ability, such as ergonomic items. To examine the usefulness in clinical practice "Blue flags" needs to undergo further evaluation regarding feasibility and predictive validity for identification of the need of workplace interventions.

Conclusions
The content, structural and concurrent validity were satisfactory in this first step of development of the "Blue flags" questionnaire. In summary, the overall validity is considered acceptable. Testing in clinical contexts and in other patient populations is recommended to ensure predictive validity and usefulness.