The validity and reliability of the Swedish version of the Satisfaction with appearance scale for individuals with systemic sclerosis

Background: Systemic sclerosis (SSc) can lead to visible changes in appearance which could generate concerns among patients. Thus, valid questionnaires that capture these concerns are valuable to identify and communicate appearance concerns. Objective: To determine aspects of the validity and reliability of the Swedish version of the Satisfaction with Appearance scale for individuals with SSc (SWAP-Swe in SSc). Methods: Content validity was assessed by interviews. In a cross-sectional design, construct validity was evaluated by comparing the self-reported questionnaire SWAP-Swe in SSc to the Scleroderma Health Assessment Questionnaire (SSc HAQ), Patient Health Questionnaire-8 (PHQ-8), RAND-36, modified Rodnan skin score (mRSS), disease duration and age using Spearman’s rank correlations (rs). Internal consistency was evaluated by Cronbach’s alpha coefficient and corrected item-to-total correlations. Test–retest reliability was investigated using the intraclass correlation coefficient (ICC). Results: Eleven patients and 10 health professionals participated in the assessment of content validity. For the other aspects of validity and reliability 134 patients (median age 62 years, women 81%, limited cutaneous SSc 75%) participated. Overall, the content validity was satisfactory. The SWAP-Swe in SSc correlated with SSc HAQ (HAQ-DI rs = 0.50, visual analogue scales rs = 0.24–0.41), PHQ-8 (rs = 0.46), RAND-36 (rs = −0.21 to −0.47), mRSS (rs = 0.28), disease duration (rs = −0.01) and age (rs = −0.15). The Cronbach’s alpha coefficient was 0.92, corrected item-to-total correlations ⩾ 0.45 and the ICC 0.82. Conclusion: The SWAP-Swe in SSc showed satisfactory content validity, sufficient and good internal consistency and sufficient test–retest reliability. It was more strongly associated with self-reported questionnaires than with physician-assessed skin involvement and age, indicating that appearance concerns in SSc seem to be multidimensional as earlier reported. 
Our study contributes a thorough investigation of validity and reliability, including aspects that have not been investigated before. However, evaluation of further validity aspects of the SWAP-Swe in SSc is suggested.


Introduction
Systemic sclerosis (SSc) is a rare autoimmune inflammatory disease, characterised by vasculopathy and fibrosis in the skin and internal organs, and is often divided into diffuse cutaneous (dcSSc) or limited cutaneous SSc (lcSSc). 1 The disease often leads to skin thickening and hardening, pigment changes, cutaneous telangiectasia and Raynaud's phenomenon, changes that commonly involve the face and hands. 2 Appearance concerns are more frequent among persons with SSc than among persons from the general population. 3 In SSc, changes in appearance in visible body parts such as the face and hands are described as being related to body image distress, including dissatisfaction with appearance, 4,5 decreased self-esteem, and symptoms of depression and anxiety. 5 Body image dissatisfaction has been reported to have moderate 6 associations with symptoms of depression 7,8 and mental health-related quality of life (HRQL). 8,9 Furthermore, the association between body image dissatisfaction and disability has been presented as weak to moderate, 7,9 and the association with skin disease severity has been reported as weak. 9 To improve quality of life, it is important for patients to develop coping strategies concerning, for example, body image distress. 10 As health professionals struggle to support patients with SSc in self-management, including appearance concerns, it is important to have tools to assess body image dissatisfaction. A patient-reported outcome measure (PROM) that assesses social discomfort and dissatisfaction with appearance is the Satisfaction with Appearance (SWAP) scale. The reliability and validity of the SWAP have been evaluated in SSc. 8,9,11 The SWAP was originally developed for people with burns 12 and has been adapted for SSc by changing the word 'burn' to 'illness' 13 or 'scleroderma'. 8
Psychometric evaluations of the SWAP have been conducted in English and French among persons with SSc, but to date, the Swedish version of the SWAP has not been evaluated in SSc. Validating PROMs in different languages is important to further understand appearance concerns in SSc in different populations and to develop interventions for coping with appearance concerns. Thus, this study aimed to determine aspects of the validity and reliability of the Swedish version of the Satisfaction with Appearance scale for individuals with SSc (SWAP-Swe in SSc).

Methods
In the first step, the SWAP-Swe in SSc was assessed for content validity. Thereafter, construct validity, internal consistency as well as floor and ceiling effects were evaluated in a cross-sectional design. Finally, test-retest reliability was evaluated.

Participants
Patients with SSc were recruited from three rheumatology centres in Sweden (Karolinska University Hospital, Skåne University Hospital and Sunderby Hospital). All patients included met the 2013 ACR/EULAR criteria for SSc, 14 were at least 18 years of age, had a disease duration ⩾ 1 year, and could understand and speak Swedish. Patients who were unable to answer written questions were excluded. Patients were enrolled by a rheumatologist during their visit to the rheumatology centre. A convenience sample of patients (n = 11) from one centre and health professionals (HPs) (n = 10) from the two other centres was used to assess content validity. A largely consecutive sample of patients (n = 134) from two centres participated in the assessment of construct validity, reliability, and floor and ceiling effects. One patient was excluded from this part due to a misunderstanding of the SWAP-Swe in SSc. These sample sizes can be considered very good. 15 The content validity data were collected between September 2017 and January 2018. For the other aspects of validity and reliability, data were collected between February 2019 and April 2020. Informed written consent was obtained from all participants in accordance with the Helsinki Declaration, and the regional ethics committee in Umeå (No. 2017/149-31) approved the study.

Sociodemographic data and outcome measures
Most patients answered questions about sociodemographic data and completed PROMs in conjunction with their visits to the rheumatology centre. The following PROMs were self-reported using paper and pencil.
The SWAP in SSc consists of 14 items which can be summed into a total score (score range 0-84) or two subscale scores: the social discomfort subscale (score range 0-36) and the dissatisfaction with appearance subscale (score range 0-48). Items in the dissatisfaction with appearance subscale are reverse scored. Higher scores indicate greater social discomfort and dissatisfaction with appearance. 11 The SWAP in English has support for reliability and validity in SSc. 8,9,11
The Scleroderma Health Assessment Questionnaire (SSc HAQ) evaluates disability, pain and disease interference with daily activities. 16 The SSc HAQ includes the HAQ-Disability Index (HAQ-DI) assessing daily activities (the total score ranges from 0 to 3). Furthermore, the patient rates pain on a 15-cm visual analogue scale (VAS, 1 cm = 0.2 points, score range from 0 to 3). The SSc HAQ adds five 15-cm VAS evaluating disease interference with daily activities of gastrointestinal symptoms, lung symptoms (breathing problems), Raynaud's phenomenon, digital ulcers and overall disease severity. Scores are calculated for each VAS (score range from 0 to 3). The higher the values, the worse the disability, pain and interference with daily activities, respectively. The Swedish SSc HAQ has support for reliability and validity in SSc. 17
The Patient Health Questionnaire-8 (PHQ-8) assesses depressive symptoms with a total score ranging from 0 to 24, where a higher score represents more depressive symptoms. 18 The PHQ-8 in Swedish has support for reliability and validity in SSc. 19
The RAND 36-Item Health Survey (RAND-36) assesses HRQL. The RAND-36 is composed of eight subscales, each ranging from 0 to 100, where a higher score represents a higher level of HRQL. The Swedish RAND-36 is reliable and aspects of validity have been proven. 20
The physician provided disease-related variables. Skin involvement was assessed by the modified Rodnan skin score (mRSS), where skin thickness is assessed in 17 body areas. The total score can range from 0 to 51; a higher score illustrates greater skin involvement. 21
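As an illustration of the scoring rules described for the SWAP and the 15-cm VAS, the following Python sketch computes the total and subscale scores and converts a VAS mark to the 0-3 scale. It is a minimal sketch under stated assumptions, not the scoring procedure used in the study: the assignment of items 1-6 to the social discomfort subscale and items 7-14 to the dissatisfaction with appearance subscale is assumed here (it matches the stated score ranges of 0-36 and 0-48), and the function names `score_swap` and `vas_score` are hypothetical.

```python
# Sketch of SWAP scoring: 14 items, each answered on a 0-6 response scale.
# ASSUMPTION: items 1-6 = social discomfort, items 7-14 = dissatisfaction
# with appearance. This mapping is illustrative, not taken from the source.
SOCIAL_DISCOMFORT_ITEMS = range(1, 7)
DISSATISFACTION_ITEMS = range(7, 15)
MAX_RESPONSE = 6


def score_swap(responses):
    """Return subscale and total scores from a dict {item number: raw response}."""
    if sorted(responses) != list(range(1, 15)):
        raise ValueError("expected responses for items 1-14")
    if any(not 0 <= r <= MAX_RESPONSE for r in responses.values()):
        raise ValueError("responses must lie between 0 and 6")
    social = sum(responses[i] for i in SOCIAL_DISCOMFORT_ITEMS)
    # Reverse scoring: strong agreement with a satisfaction item gives 0 points.
    dissatisfaction = sum(MAX_RESPONSE - responses[i] for i in DISSATISFACTION_ITEMS)
    return {
        "social_discomfort": social,        # range 0-36
        "dissatisfaction": dissatisfaction,  # range 0-48
        "total": social + dissatisfaction,   # range 0-84
    }


def vas_score(distance_cm):
    """Convert a mark on a 15-cm VAS to the 0-3 scale (1 cm = 0.2 points)."""
    if not 0 <= distance_cm <= 15:
        raise ValueError("mark must lie between 0 and 15 cm")
    return round(distance_cm / 5, 2)
```

With this convention, a respondent who strongly agrees with every social discomfort item and strongly disagrees with every satisfaction item reaches the maximum total score of 84.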

Content validity
A SWAP in the Swedish language has been culturally adapted and validated for persons with burns. 22 In that version, the English word chest was changed to a Swedish word for torso and, in another item, the word interfered was changed to a Swedish word for affected/influenced, and the instructions were clarified. 22 In our study, the word burn was changed to a Swedish word for systemic sclerosis, and we followed the order of the items in the SWAP version presented by Jewett et al. 11 Persons with SSc and HPs were interviewed to evaluate content validity. 15 A pilot interview was conducted with one HP to test and refine the semi-structured interview guide (Table 1). Before the individual interviews, patients completed the SWAP-Swe in SSc (first version) and HPs reflected on the questionnaire. The interviews were audio-recorded, transcribed verbatim and analysed with qualitative content analysis 23 using a deductive approach. 24 The text was divided into meaning units, which were coded 23 and sorted into the following domains: comprehensibility, relevance and comprehensiveness. 15 Linguistic and layout adjustments of the SWAP-Swe in SSc were made in response to the results of the interviews after discussions within the research team and with the patient research partners. Thereafter, a back-translation to English was performed by a professional translator (native English speaker) to assess correspondence with the English SWAP for individuals with SSc.

Construct validity, reliability and floor and ceiling effects
After the linguistic and layout adjustments of the SWAP-Swe in SSc had been made, construct validity was evaluated by assessing possible associations between the SWAP-Swe in SSc and other outcome measures. 15 Reliability was assessed through internal consistency and test-retest reliability, and floor and ceiling effects were also evaluated. To evaluate construct validity, internal consistency, and floor and ceiling effects, data were collected when patients visited the rheumatology centre for enrolment. To assess test-retest reliability, the SWAP-Swe in SSc was completed a second time at the patient's home and the answers were returned by mail in a prestamped envelope. The mean time between the test occasions was 11 (SD ± 8.1) days.

Statistical analysis
Construct validity was assessed by Spearman's rank correlation coefficient (rs), which was interpreted as follows: 0 = 'none', 0.1-0.3 = 'weak', 0.4-0.6 = 'moderate', 0.7-0.9 = 'strong' and 1.0 = 'perfect'. 6 Based on previous results of the SWAP in SSc 7-9 and clinical assumptions, the following was anticipated: The total score of the SWAP-Swe in SSc was expected to correlate 'moderately' and positively with depressive symptoms (PHQ-8) and negatively with HRQL (especially mental aspects, RAND-36). A 'weak' to 'moderate' positive association was expected with disability (HAQ-DI). A 'weak' or lower correlation was expected between SWAP and pain (HAQ-DI VAS, RAND-36), disease interference with daily activities (SSc HAQ VAS), skin involvement (mRSS), disease duration and age.
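To make the interpretation rule concrete, the following Python sketch computes Spearman's rank correlation as the Pearson correlation of average ranks and maps the coefficient to the labels given above. It is a minimal sketch, not the SPSS procedure used in the study. Because the stated bands are given to one decimal, the sketch assumes that the absolute coefficient is rounded to one decimal before lookup; that rounding rule and the function names are assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def interpret(rs):
    """Map rs to the verbal labels used in the text.

    ASSUMPTION: |rs| is rounded to one decimal before the bands are applied.
    """
    size = round(abs(rs), 1)
    if size == 0:
        return "none"
    if size <= 0.3:
        return "weak"
    if size <= 0.6:
        return "moderate"
    if size <= 0.9:
        return "strong"
    return "perfect"
```

Under this rounding assumption, the coefficients reported in the abstract are labelled consistently with the text, for example `interpret(0.28)` gives 'weak' for mRSS and `interpret(0.46)` gives 'moderate' for PHQ-8.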
Internal consistency was assessed by Cronbach's alpha coefficient and corrected item-to-total correlation. An alpha coefficient ⩾ 0.70 was considered 'sufficient' 25 and item correlations > 0.30 'good'. 26 The intraclass correlation coefficient (ICC) with a two-way mixed model and absolute agreement 27 and weighted kappa with quadratic weights 28 were used to evaluate test-retest reliability. The ICC was used for the total score and subscale scores; an ICC ⩾ 0.70 was evaluated as 'sufficient'. 25 Weighted kappa was calculated for each item, and values were classified as follows: < 0.00 = 'poor', 0.00-0.20 = 'slight', 0.21-0.40 = 'fair', 0.41-0.60 = 'moderate', 0.61-0.80 = 'substantial' and 0.81-1.00 = 'almost perfect'. 29 The sign test was used to assess whether there were any significant changes in the test-retest procedure for the total score, subscales and each item.
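The two internal consistency statistics named above can be sketched in a few lines of Python. This is a minimal sketch using the standard formulas (Cronbach's alpha from item and total score variances, and each item correlated with the sum of the remaining items); it is not the SPSS routine used in the study, and the function names and example data are hypothetical.

```python
from math import sqrt


def variance(values):
    """Sample variance with an n - 1 denominator."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows):
    """Correlate each item with the sum of all other items (item excluded)."""
    k = len(rows[0])
    correlations = []
    for j in range(k):
        item = [row[j] for row in rows]
        rest = [sum(row) - row[j] for row in rows]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical responses of five respondents to three items.
example = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 6, 6], [3, 3, 4]]
alpha = cronbach_alpha(example)
item_total = corrected_item_total(example)
```

Excluding the item from its own total is what makes the correlation 'corrected'; without that step, the item would inflate the correlation with a score it is part of.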
Floor and ceiling effects were determined by the number of patients scoring the lowest or highest possible score in the total score and subscales and were defined as cases in which > 15% of the patients achieved the lowest or highest possible score. 30 The level of significance was specified at p ⩽ .05. SPSS, version 26, was used for the statistical analyses except for kappa, which was calculated using VassarStats: Website for Statistical Computation.
Table 1. The interview guide covered the following main questions to assess content validity.
What do you think about the comprehensibility of the items?
Are any items difficult to understand?
How do you experience the introduction and response set?
Are there any items you want to exclude/include?
Do the items cover relevant aspects of satisfaction with appearance in SSc?
To elaborate the answers, follow-up or clarifying questions were used.
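The weighted kappa with quadratic weights used for the item-level test-retest agreement can be sketched as follows. This is a minimal Python sketch of Cohen's weighted kappa, not the VassarStats implementation. It assumes the seven response categories 0 to 6 of the SWAP items, and the function name is hypothetical.

```python
def quadratic_weighted_kappa(first, second, categories=7):
    """Cohen's weighted kappa with quadratic weights.

    first, second: ratings of the same respondents on two occasions,
    coded as integers 0 .. categories - 1.
    """
    n = len(first)
    # Observed proportions in the categories x categories agreement table.
    observed = [[0.0] * categories for _ in range(categories)]
    for a, b in zip(first, second):
        observed[a][b] += 1 / n
    row = [sum(observed[i]) for i in range(categories)]
    col = [sum(observed[i][j] for i in range(categories)) for j in range(categories)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(categories):
        for j in range(categories):
            # Quadratic disagreement weight: 0 on the diagonal, 1 at the extremes.
            weight = ((i - j) / (categories - 1)) ** 2
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    return 1 - weighted_observed / weighted_expected
```

Quadratic weights penalise a disagreement by the square of its distance, so answering 3 at test and 4 at retest costs far less than answering 0 and 6. The statistic is undefined when all ratings fall in a single category, because the expected disagreement is then zero.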

Participants
Eleven patients were included in the content validity part, and in the other validity and reliability parts, 134 patients were included and most of them had lcSSc (Table 2). The patients in the first part, for content validity (n = 11), had more extensive skin disease (median mRSS 14 and 36% dcSSc). The second part, for validity and reliability (n = 134), included patients with milder skin disease (median mRSS 2 and 25% dcSSc). Both parts, however, represent a prevalent population with a mean disease duration of 11 years (IQR 6-18 years), respectively. HPs are also described in Table 2.

Content validity
The analysed interviews showed that the SWAP-Swe in SSc was in general experienced as easy to comprehend, and that the items were overall relevant and covered key aspects of satisfaction with appearance (Table 3). Based on the results of the interviews, the following main changes were made. To support the understanding of the response set, the numbers 0 to 6 were added below each item with the anchor words (0 = strongly disagree, 6 = strongly agree). Furthermore, items 4, 5 and 6 were clarified to state that they referred to changes in appearance caused by SSc. In item 8, the word hair was added to increase the relevance of this item (i.e. scalp/hair), and in item 14, the word upper body was added (i.e. torso/upper body) to ease understanding.

Construct validity
'Moderate' correlations were found between the SWAP-Swe in SSc total score and disability, pain/bodily pain, disease interference with daily activities of gastrointestinal symptoms and overall disease severity, depressive symptoms, general health, vitality, social function and mental health. 'Weak' correlations were found with disease interference with daily activities of lung symptoms, Raynaud's phenomenon and digital ulcers, physical function, physical role function and emotional role function, skin involvement and age (Table 4).
Table 3 (interview findings on content validity).
In general, the instruction and response options were easy to follow and items were experienced as clear. However, some found the response options extensive, and HPs expressed that it could be difficult to separate the response options 'strongly agree' from 'agree' and 'strongly disagree' from 'disagree'. Among patients, the reverse scoring related to the dissatisfaction with appearance subscale was experienced as helpful in diminishing the risk of unreflected responses, but others found the reversed scoring unclear, which also influenced their responses. Patients and HPs expressed that it could be difficult to determine which persons in the social environment the items designated, for example 'others' in item 5 and 'relationships' in item 4. Furthermore, when words such as 'changes in my appearance caused by my scleroderma' were absent in items, they could be experienced as less clear, such as in items 5 and 6 in the social discomfort subscale as well as the items related to the dissatisfaction with appearance subscale.
The items were experienced as overall relevant. Some patients experienced disturbing changes in their appearance which they concealed. Others expressed accepting their appearance changes or that SSc had not changed their appearance; thus, these patients found the items more relevant for others than for themselves. It was also expressed that items in the dissatisfaction with appearance subscale were not relevant when no appearance changes were experienced. Both patients and HPs reflected on the necessity of item 8 (. . .the appearance of my scalp) as they did not experience that the scalp was affected or, if not bald, visible. Emotional distress could be associated with appearance concerns, for example item 6 ('I don't think people would want to touch me'). HPs expressed that the SWAP-Swe in SSc could be emotionally demanding for patients and they feared hurt reactions. Among HPs it was expressed as inappropriate to focus on patients' appearance and they feared implying that there was anything wrong with patients' appearance, for example by items in the social discomfort subscale. Nevertheless, HPs experienced that patients could have appearance concerns and they wanted to support patients' needs. Thoughts were expressed about when and how to use the questionnaire, in appearance concerns or only in patients who were confident with the disease and its consequences.
Overall, items were experienced to cover important aspects of satisfaction with appearance in SSc. The whole body was in general covered by the items, making dissatisfaction possible to capture. However, items concerning the appearance of the mouth, lips, nose, fingers, nails and feet were suggested for inclusion. Moreover, HPs experienced that other aspects of appearance, such as stiffness and limping when moving or worries about future changes in appearance, were lacking.

Reliability, floor and ceiling effects
For the SWAP-Swe in SSc total score and subscales, the Cronbach's alpha coefficients ranged from 0.90 to 0.93 and corrected item-to-total correlations ranged from 0.45 to 0.86 (Table 5). There were no significant differences between patients participating in the first test and retest occasion in terms of age, disease duration, disease subtype and mRSS (data not reported). The ICC for the total score and subscales ranged from 0.71 to 0.83 and the weighted kappa for the items had a median of 0.70 and ranged from 0.52 to 0.84 (Table 5). There were no significant differences in the SWAP-Swe in SSc total score and subscales between the test and retest. Two items (items 1 and 4) were significantly higher upon retest, and one item (item 14) was significantly lower upon retest (Table 5).
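For readers who want to see how an absolute-agreement ICC of this kind is obtained from test and retest scores, the following Python sketch derives it from the two-way ANOVA mean squares. It is a minimal sketch and not the SPSS output of the study: the source does not state whether single or average measures were reported, so the single-measures form is assumed here, and the function name and example scores are hypothetical.

```python
def icc_absolute_agreement(scores):
    """Single-measures ICC for absolute agreement from a two-way layout.

    scores: one row per patient, one column per occasion (e.g. test, retest).
    ASSUMPTION: the single-measures form is used; the source does not say
    whether single or average measures were reported.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)              # between-patient mean square
    ms_cols = ss_cols / (k - 1)              # between-occasion mean square
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k / n * (ms_cols - ms_error)
    )


# Hypothetical total scores: every patient scores 2 points higher at retest.
shifted = [[10, 12], [20, 22], [30, 32]]
icc_shifted = icc_absolute_agreement(shifted)
```

In the hypothetical example the ranking of patients is identical on both occasions, yet the ICC falls below 1 because absolute agreement also counts the systematic 2-point shift as disagreement.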
There were no floor and ceiling effects for the total score or the subscales except for a floor effect in the social discomfort subscale since 48 (36%) of the patients scored the lowest possible score (score 0).
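The floor and ceiling criterion (> 15% of patients at the lowest or highest possible score) can be checked with a few lines of Python. This is a minimal sketch with a hypothetical function name. The example reproduces only the reported count of 48 of 134 patients at score 0 on the social discomfort subscale; the remaining 86 scores are placeholder values, not study data.

```python
def floor_ceiling(scores, lowest, highest, threshold=0.15):
    """Report the share of respondents at the scale limits and flag effects."""
    n = len(scores)
    floor_share = sum(s == lowest for s in scores) / n
    ceiling_share = sum(s == highest for s in scores) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# 48 patients at the lowest possible score (0) as reported; the other
# 86 values are hypothetical placeholders between the scale limits.
social_discomfort = [0] * 48 + [5] * 86
result = floor_ceiling(social_discomfort, lowest=0, highest=36)
```

Here `result["floor_share"]` is 48/134, which rounds to the reported 36% and exceeds the 15% threshold, so a floor effect is flagged while no ceiling effect is.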

Discussion
The SWAP-Swe in SSc was interpreted as having overall satisfactory content validity. Regarding construct validity, the SWAP-Swe in SSc was overall more strongly associated with self-reported questionnaires concerning disability, depressive symptoms and HRQL than with physician-assessed skin involvement and age. The internal consistency was classified as 'sufficient' and 'good' and the test-retest reliability was classified as 'sufficient'. In addition, there were no floor and ceiling effects, except for a floor effect in the social discomfort subscale.
The results of the analysis of the interviews indicated that the content validity of the SWAP-Swe in SSc was overall satisfactory. However, some patients experienced it as unclear when the response set held a reverse meaning between the subscales as a result of negatively formulated items (e.g. 'I feel that my scleroderma is unattractive to others') in contrast to positively stated items (e.g. 'I am satisfied with the appearance of my face'). Furthermore, among HPs, it was expressed that the items could be emotionally demanding and it was experienced as difficult to know when to use the questionnaire. Thus, a careful introduction of the questionnaire and discussion between the HP and the patient about the completed items would be valuable to diminish any possible misunderstandings or distress. Regarding when to use the SWAP-Swe in SSc, attention to appearance concerns is of value in SSc 7 since worries about appearance 31 and changes of appearance 32 have been reported among unmet needs for healthcare in patients with SSc. 31,32 In our study, we described aspects of satisfaction with one's appearance that were expressed as missing in the SWAP-Swe in SSc, for example, specific parts of the face, hands, feet and limping issues. When needed, these aspects might be subject to additional discussion with the patients when following up on their questionnaire responses.
The construct validity analysis indicated that the majority of the correlations were consistent with our hypotheses. Counter to our expectations, we found 'weak' correlations between the SWAP-Swe in SSc and HRQL aspects of physical function, physical role function and emotional role function. Mills et al. 9 also found a 'weak' correlation between the SWAP and the physical components of SF-36. Furthermore, counter to our expectation, we found a 'moderate' correlation instead of a 'weak' correlation between the SWAP-Swe in SSc and pain/bodily pain. This discrepancy between our results and results from previous studies might be explained by differences in samples, 9 pain outcome measures 7,8 and analysis method. 8,9 Higher correlations ('moderate') than postulated were found between the SWAP-Swe in SSc and disease interference with daily activities of gastrointestinal symptoms and overall disease severity in our study. Although gastrointestinal symptoms are a common problem in SSc, 33 appearance concerns might influence how the person experiences their disease severity and vice versa. The SWAP-Swe in SSc correlated with other PROMs but also, to some extent, with skin involvement as assessed by a physician as well as age; this highlights that appearance concerns in SSc seem to be multidimensional, as earlier reported. 34 The associations between appearance concerns and whether the patients have lcSSc or dcSSc might also differ. 34,35 Additional studies are needed to further understand the associations between the SWAP-Swe in SSc and other disease variables that might affect body appearance (e.g. skin pigmentations and telangiectasia) as well as associations with gender. Taken together, support for construct validity was indicated in the present study.
Internal consistency showed Cronbach's alpha coefficients that were 'sufficient' 25 for the SWAP-Swe in SSc, findings that are in agreement with previous reports, 8,9,11 and corrected item-to-total correlations supported 'good' internal consistency. 26 The test-retest reliability for the total score and subscales showed no significant difference between the test occasions. However, there were significant differences in three items, which might be explained by the fact that the questionnaire was completed at the hospital and at home, respectively, and the latter could give rise to more reflection on appearance issues. The agreement between the test occasions revealed ICCs for the SWAP-Swe in SSc that were 'sufficient' 25 for the total score and subscales. Heinberg et al. 13 also found a 'sufficient' ICC for the subscales of the SWAP.
We only found a floor effect in the social discomfort subscale. When comparing our findings with a previous SSc study, their social discomfort subscale scores were in absolute values slightly higher, 11 which indicates less social discomfort in our patients. Appearance concerns in SSc are more common among women, 3 and younger age is related to higher body image dissatisfaction. 7 The individuals in our study were in absolute values somewhat older than in some other studies 7,9 and included both men and women, which might contribute to the floor effect in the social discomfort subscale. Furthermore, both subscales in the SWAP are significantly higher in dcSSc, 35 and in our study, there were more patients with lcSSc. Although there was a floor effect in the social discomfort subscale, some patients indicated distress, and the highest social discomfort was reported for being in the presence of strangers and for feeling unattractive to others. When it comes to dissatisfaction with appearance, the appearance of the hands had the highest dissatisfaction.
One methodological limitation in our study is that when assessing construct validity, no other PROM assessing body image dissatisfaction 36 was included. Unfortunately, no such PROM validated in the Swedish language in SSc was available. Another possible limitation is that the recruitment procedure, a mixture of convenience and consecutive sampling, may not have been optimal. However, the median age, gender and proportion of lcSSc in the cross-sectional design were similar to those described in a population-based study of SSc in Sweden, 37 which supports generalisability of the SWAP-Swe in SSc. Also, the majority of all patients who were included in our study had mild skin involvement; however, patients with moderate and severe skin involvement were also included. Another limitation is that aspects such as skin pigmentations and joint involvement that might influence appearance were not examined. On the other hand, data on digital ulcers, pitting scars and telangiectasia are described.
Even though our study performed a thorough investigation of validity and reliability, future research on further validity aspects of the SWAP-Swe in SSc, such as structural validity, would be of value. Discriminative validity would be beneficial to examine as well, for example, the questionnaire's ability to discriminate between those with different body appearance involvements. In order to evaluate interventions that focus on appearance concerns, the responsiveness of the SWAP-Swe in SSc needs to be evaluated. Furthermore, the feasibility of the SWAP-Swe in SSc may also be evaluated. The Brief-SWAP has in some studies been found to be more feasible than the full version with 14 items. 8,15
In conclusion, the present study indicates that the SWAP-Swe in SSc is overall valid and reliable, and contributes an evaluation of aspects of the SWAP not studied before. The SWAP-Swe in SSc may be used in research and clinical practice to identify and communicate appearance concerns among patients. Furthermore, using the questionnaire could give insight into how to support coping strategies regarding appearance concerns. More studies on other aspects of validity of the SWAP-Swe in SSc as well as further research on the questionnaire in patients with dcSSc would be valuable.