Cross-cultural validity of the Pulmonary Embolism Quality of Life questionnaire in the quality of life survey after pulmonary embolism: A Persian-speaking cohort

Background The Pulmonary Embolism Quality of Life (PEmb-QoL) questionnaire is the first disease-specific scale for assessing the quality of life in patients with a history of pulmonary embolism (PE).
Objectives To assess the cross-cultural validity and reliability of the disease-specific PEmb-QoL questionnaire.
Methods The Persian version was prepared through the forward and backward translation of the English questionnaire. Six months after the diagnosis of acute PE, consecutive Persian-speaking patients were asked to complete the PEmb-QoL and the generic 36-item Short Form (SF-36) questionnaires and to undertake a 6-minute walk test (6MWT). Acceptability was assessed via item missing rate, reproducibility by the test-retest method, and internal consistency reliability by Cronbach’s α and McDonald’s ω coefficients. Convergent validity was assessed using the Spearman rank correlation between scores of PEmb-QoL, SF-36, and 6MWT. The questionnaire structure was evaluated through exploratory factor analysis.
Results Ninety-six patients with a confirmed diagnosis of PE completed the questionnaires. The Persian version of PEmb-QoL had good internal consistency (α = 0.95, 3-factor ω = 0.96), inter-item correlation (0.3–0.62), item-total correlation (0.38–0.71), reproducibility (test-retest ICC with 25 participants = 0.92–0.99), and good discriminant validity. Convergent validity was confirmed by the moderate-to-high correlations between PEmb-QoL and SF-36 scores, and a good correlation between the “limitation in daily activities” dimension of the PEmb-QoL questionnaire and 6MWT results. Exploratory factor analysis suggested a 3-component structure with functional (items 1h, 4b-5d, 6, 8, 9i, and 9j), symptoms (1b-h, 7, and 8), and emotional (5a, 6, and 9a-h) components.
Conclusion The Persian version of the PEmb-QoL questionnaire is valid and reliable for measuring the disease-specific quality of life in patients with PE.


| INTRODUCTION
The post-pulmonary embolism (PE) syndrome is defined as new or progressive dyspnea, exercise intolerance, and/or impaired functional or mental status after at least 3 months of adequate anticoagulation following acute PE, which cannot be explained by other (preexisting) comorbidities [1][2][3]. Although the extent of the disability varies in different populations [4], nearly half of the patients will have persistent dyspnea, with 11% suffering from moderate-to-severe functional limitations [5]. The chronic complications of acute PE also take a toll on patients' psychological and social status. The long-term risk of purchasing psychotropic drugs is substantially increased in adolescents with a history of PE, and more than a quarter of previously employed patients do not return to work even 1 year after the initial PE episode [6][7][8].
Previous statements highlight that the chronic complications actually experienced following acute PE go well beyond the implications of clinically measured outcomes such as RV dysfunction and pulmonary hypertension [4,9,10]. Therefore, valuable information concerning patients' well-being would be lost if information acquisition is channeled only through the narrow scope of clinical and paraclinical evaluations [11]. Direct formal inquiry of patients' experiences through adequately validated patient-reported outcome (PRO) measures also confers the advantage of negating interobserver variability and is, thus, potentially more reliable than when such experiences are informally inquired, interpreted, and reported by third parties [12].
Health-related quality of life is a multidimensional construct encompassing one's self-perception of, at a minimum, physical, emotional, and social well-being [13]. It is usually assessed using both generic and specific PRO questionnaires. Whereas generic questionnaires allow comparison between different populations irrespective of their underlying conditions, condition-specific questionnaires are more sensitive to changes in the frequency and severity of specific outcomes, making them the instrument of choice for evaluating the impacts of therapeutic and rehabilitation strategies [14].
The Pulmonary Embolism Quality of Life (PEmb-QoL) questionnaire is the first and currently the only disease-specific PRO instrument for patients with a history of PE. The questionnaire was developed in Dutch, cross-culturally validated in several other languages, and is part of the recently developed core set of outcome measures for patients with PE [11][15][16][17][18][19][20]. Our study aimed to prepare the first Persian translation of the PEmb-QoL, provide an ad hoc evaluation of its psychometric properties, and adjust its structure based on a Persian-speaking PE patient population in Iran.

Essentials
• Pulmonary embolism (PE) has been shown to impact different aspects of the quality of life (QoL).
• PE quality of life (PEmb-QoL) is a disease-specific survey for patients with PE.
• Cross-cultural validity and reliability of PEmb-QoL were confirmed in a Persian-speaking cohort.
• Minor adjustments in dimensions and their constituent items were required.

| METHODS
Our study aimed to prepare the Persian version of PEmb-QoL based on the English version and then evaluate its psychometric properties (acceptability, reliability, and validity) as to whether it is an appropriate instrument for measuring PE-related quality of life.

| Data collection and follow-up
Each patient was scheduled for a structured 6-month follow-up program composed of detailed transthoracic echocardiography and a 6-minute walk test (6MWT). During the program, a QOL assessment was performed using the disease-specific PEmb-QoL and the generic 36-item Short Form (SF-36) survey during an interview session assisted by an experienced nurse.

| 36-Item Short Form (SF-36) survey
SF-36 is a widely used, generic, health-related QOL questionnaire validated in many disease cohorts and languages, including Persian [23]. The questionnaire covers 8 dimensions grouped into 2 summary components: physical and mental [24,25]. The detailed structure of the SF-36 questionnaire is depicted in Supplementary Figure S1. In this study, SF-36 scores were calculated according to the RAND scoring instructions [26].

| PEmb-QoL questionnaire
Modeled on the generic SF-36 and the disease-specific questionnaire for quality of life after acute deep vein thrombosis (VEINES-QOL/Sym), the PEmb-QoL was designed to assess disease-specific health-related QOL in patients with a history of acute PE [15]. It contains 40 items, 38 of which are on a Likert-type scale and are grouped into 6 dimensions: frequency of complaints, limitations in the activities of daily living, work-related problems, social limitations, intensity of complaints, and emotional complaints. To calculate PEmb-QoL scores, responses to individual items were transformed into a 0-100 scale, with higher scores corresponding to worse health states [16]. This required reversing the scales for questions Q1, Q4, Q5, and Q9.
Scores of the constituting items for each dimension were then averaged to produce dimension scores, while an average score of all questionnaire items produced the PEmb-QoL total summary score [20]. Questions Q2 ("At what time of day are your lung symptoms most intense?") and Q3 ("Compared to 1 year ago, how would you rate the condition of your lungs in general now?") were considered descriptive in nature and were not scored on a Likert scale. Thus, they were interpreted as is and not incorporated into the dimension scores.
Item Q4a ("Do your lung symptoms limit your daily activities at work?") was treated as missing if a patient had chosen the "I do not work" response [16].
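The scoring rules above can be illustrated with a minimal sketch. The number of response options, the direction of the raw coding, and the item-to-dimension mapping used here are illustrative assumptions rather than the official PEmb-QoL scoring key; the function names `to_0_100` and `dimension_score` are likewise hypothetical.

```python
from statistics import mean

def to_0_100(raw, n_options, reverse=False):
    """Rescale a 1..n_options Likert response to 0-100 (100 = worst state).

    `reverse=True` flips items whose raw coding runs in the opposite
    direction (assumed here; check the official key for each item).
    """
    if raw is None:  # missing response, eg, "I do not work" on Q4a
        return None
    score = (raw - 1) / (n_options - 1) * 100
    return 100 - score if reverse else score

def dimension_score(item_scores):
    """Average the non-missing transformed item scores of one dimension."""
    present = [s for s in item_scores if s is not None]
    return mean(present) if present else None

# Hypothetical raw responses for four 3-option items of one dimension.
raw = {"Q4a": None, "Q4b": 1, "Q4c": 2, "Q4d": 3}
transformed = {item: to_0_100(r, 3, reverse=True) for item, r in raw.items()}
daily_activities = dimension_score(transformed.values())
print(transformed, daily_activities)
```

Treating a missing item as `None` and averaging only the answered items mirrors the handling of Q4a described above; how many missing items a dimension can tolerate is not specified here and is left to the analyst.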
2.6 | Acceptability and reliability
Acceptability was assessed through the completeness of data (the missing item rate). Reliability was assessed in terms of consistency between the results of repeated measurements (reproducibility) and consistency between responses within the same measurement (internal consistency). Reproducibility was reported as an intraclass correlation via the test-retest method, where the respondents were invited back to our center 3 weeks after the initial QOL assessment to retake the PEmb-QoL questionnaire. The 3-week interval was selected to prevent recall and minimize the chance of clinical changes.
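As an illustration of test-retest reproducibility, the sketch below computes one common form of the intraclass correlation, the two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1). The text does not state which ICC form was used, so this choice is an assumption, and the function name `icc_2_1` and the toy scores are hypothetical.

```python
def icc_2_1(rows):
    """ICC(2,1) from a two-way ANOVA decomposition.

    `rows` holds one list per respondent with one score per occasion,
    eg, [test_score, retest_score].
    """
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)                 # between-respondent variance
    ms_cols = ss_cols / (k - 1)                 # between-occasion variance
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual variance
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical dimension scores at test and retest for five respondents.
test_retest = [[40, 45], [10, 10], [75, 70], [55, 60], [20, 25]]
print(round(icc_2_1(test_retest), 3))
```

Because this form penalizes systematic shifts between occasions, a uniform change in all scores at retest lowers the coefficient even when the ranking of respondents is preserved.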
Cronbach's α is considered an adequate measure of internal consistency for individual dimensions [34]. Still, given that multidimensional constructs do not conform well with the one-dimensionality assumption of Cronbach's α, in the current investigation, McDonald's ω was additionally provided for the totality of the questionnaire [35,36]. The evaluation of internal consistency was completed by reporting the association between the individual items (the inter-item correlation), the association between the items and their assigned dimensions (the corrected item-total correlation), and the association between the dimensions (the domain intercorrelation). Although coefficients of internal consistency larger than 0.8 are conventionally regarded as acceptable, a more detailed interpretation should consider that these measures are also influenced by the number of items in the subscale. Supplementary Table S1, extracted from the study by Ponterotto and Ruckdeschel [37], elaborates on the acceptable thresholds for coefficients of internal consistency based on the sample size and scale length.
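For orientation, Cronbach's α and the corrected item-total correlation can be computed from a respondents-by-items score matrix as sketched below. This is a generic textbook calculation on hypothetical toy data, not the analysis pipeline of the present study (which was run in R), and McDonald's ω is deliberately left out because it requires a fitted factor model.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(rows, j):
    """Correlate item j with the sum of the remaining items in its dimension."""
    item = [r[j] for r in rows]
    rest = [sum(r) - r[j] for r in rows]
    return pearson(item, rest)

# Hypothetical responses: 5 respondents x 3 items of one dimension.
toy = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4]]
print(cronbach_alpha(toy))
print([corrected_item_total(toy, j) for j in range(3)])
```

Recomputing `cronbach_alpha` on the matrix with one column dropped gives the "α if item deleted" statistic used when screening items.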
The score distribution, along with the floor effect and the ceiling effect, was reported for the dimensions. The quality criteria for acceptability and reliability are summarized in Supplementary Table S2. Each item was examined vis-à-vis its frequency of endorsement (respondents who selected the same item response), skewness, corrected item-total correlation with its corresponding dimension, and Cronbach's α of its dimension when the item was excluded. Ideally, fewer than 25% of the items should have negative skewness, while more than 75% should have a skewness value between −1 and 1 [38].
An item is considered for removal when doing so substantially improves α or when its item-total correlation and frequency of endorsement are outside the range of 0.2 to 0.8 [29].
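The distribution checks described above can be sketched as follows. Floor and ceiling effects are taken as the percentage of respondents at the lowest and highest possible score, and skewness is computed as the moment-based coefficient; the exact skewness estimator used in the study is not stated, so this choice, the function names, and the toy data are assumptions.

```python
def floor_ceiling(scores, lowest=0.0, highest=100.0):
    """Percentage of respondents at the lowest and highest possible score."""
    n = len(scores)
    floor = 100 * sum(s == lowest for s in scores) / n
    ceiling = 100 * sum(s == highest for s in scores) / n
    return floor, ceiling

def skewness(values):
    """Moment-based skewness: third central moment over variance^1.5."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical 0-100 dimension scores and raw item responses.
dimension = [0, 0, 25, 50, 100]
item = [1, 1, 1, 2, 5]
print(floor_ceiling(dimension))
print(skewness(item))
```

With these helpers, the share of items with negative skewness or with skewness outside the −1 to 1 range can be tallied across the questionnaire and compared with the thresholds quoted above.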

| Validity
The content validity of the PEmb-QoL has been previously investigated with respect to measurement aims, concepts, target population, and item selection [15,39].
In the present study, construct validity was investigated in terms of convergent and discriminant validity. A PRO measure is expected to have relatively high correlations with other theoretically similar PRO and non-PRO measures (ie, convergent validity) while having relatively lower correlations with theoretically dissimilar measures (ie, discriminant validity) [32,40]. Thus, we expected moderate-to-high Spearman rank correlations between the dimensions of the disease-specific PEmb-QoL and their theoretically similar counterparts from the generic SF-36 questionnaire, as well as moderate-to-high correlations between the dimension of limitations in the activities of daily living and the 6MWT results. We also expected moderate-to-low correlations between the disease-specific PEmb-QoL scores and age, sex, obesity, cancer, and other cardiovascular comorbidities.
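A minimal sketch of the Spearman rank correlation used for these comparisons is given below: both variables are converted to ranks (ties receive their average rank) and the Pearson correlation of the ranks is returned. The function names and the paired values are hypothetical. Because PEmb-QoL is scored so that higher values indicate worse health, a negative sign would be expected against measures where higher values indicate better status, such as walking distance.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rho as the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical pairs: PEmb-QoL daily-activity score vs 6MWT distance (m).
pemb_adl = [10, 35, 35, 60, 85]
walk_m = [520, 470, 455, 390, 300]
print(spearman(pemb_adl, walk_m))
```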
We further evaluated internal consistency, item selection, and questionnaire structure through exploratory factor analysis (EFA).
EFA investigates the coherence of item responses to suggest an underlying structure for the questionnaire. We first used the Scree test to determine the appropriate number of latent factors (ie, dimensions) and then conducted EFA via a polychoric correlation matrix with the maximum likelihood estimation method. The polychoric correlation matrix was chosen over the conventionally used Pearson correlation, given that items with fewer than 5 to 7 response options (eg, items Q4 and Q5) could violate the linearity assumption of the Pearson correlation [41][42][43]. Finally, oblique rotation (Oblimin) was applied to account for the interconnected nature of the psychological constructs and to also maximize the distinction between the extracted factors [41,44].
R statistical software, version 4.1.3, was used with the tidyverse, psych, GPArotation, ggcorrplot, and GGally packages.

| Participants
Of 170 patients recruited in the RHC-PE registry, 98 agreed to complete the PEmb-QoL questionnaire. Two participants were excluded as they had responded to fewer than 75% of the items. The responses elicited from the remaining 96 participants had a minimal item missing rate (0.38%), with 6 participants having a total of 14 unresponded items. The details of the patient flow and exclusion criteria are depicted in Figure 1. Nearly half of the respondents were female. The median age of the participants was 54 years, and the obesity rate was 48%. The median (IQR) time interval between the first hospital admission due to PE and questionnaire submission was 6.1 months (5.4-7.1 months) (Table 1).

… (skewness −0.41) and 9d (skewness −0.08). However, only 22 scored items (58%) had skewness between −1 and 1. Item Q1a ("Pain behind or between the shoulder blades?") had the lowest item-total correlation (r = 0.34). This item was also the only item that, upon removal, would improve the internal consistency of its designated dimension (α = 0.83 to α = 0.85).

| Psychometric properties
Of the 96 included participants, 25 (26%) returned to complete the questionnaire for the second time. Compared to the primary respondent group, participants of the retest were, on average, younger (46.5 years old vs 55 years old, P value = .02) and were more likely to be male (odds ratio = 2.85, P value = .05). The test-retest analysis indicated moderate-to-high intraclass correlations for all dimensions, ranging from 0.70 for work-related problems to 0.97 for the intensity of complaints (Table 3). The results of the acceptability and reliability analysis are summarized in Supplementary Table S2.

FIGURE 2 Score distribution of the PEmb-QoL total scale and subscales (panels include intensity of complaints, emotional complaints, and PEmb total score). Scores are transformed into a 0 (best) to 100 (worst) scale. Individual scores are jittered along the horizontal axis to allow visual distinction.
Floor or ceiling effects <15% are desirable. Less than 25% of items should have negative skewness, and less than 25% should have a skewness outside of the −1 to 1 range. * Patients with the lowest possible score, ie, the best quality of life for the corresponding dimension.

However, the suggested structure should be bolstered through exploratory factor analysis before the confirmatory process [8,12].
Our EFA suggested a 3-factor structure closely resembling that reported in the study of the French population [17]. These factors were, by and large, combinations of highly associated dimensions. Factor 1 (Q1h, Q4b-m, Q5a-d, Q9i, Q9j) was predominantly formed by items from the dimensions of limitations in the activities of daily living and work-related problems, factor 2 (Q1b-g, Q7, Q8) was formed by items from the frequency of complaints and intensity of complaints, whereas factor 3 was largely formed by items from emotional complaints. Henceforth, we regard these 3 factors as the functional, symptom, and emotional components, respectively.
As suggested by the FDA guidelines [12], we complemented the Klok et al. [15] theoretical framework for item selection through factor, reliability, and item response analyses. Item 1a ("Pain behind or between the shoulder blades?") was removed because of its poor loading values across all 3 extracted factors. This item also had the lowest coherence with its originally designated dimension (corrected item-total r = 0.38).
The removal of item 1a is supported by the fact that the presented description of "pain behind or between the shoulder blades" is not only an uncommon, chronic complication of PE, but it could also overlap with musculoskeletal complaints [46,47]. Four items showed cross-loading.
Items 1h ("Difficulty in breathing or breathlessness?") and 8 ("Intensity of breathlessness?") had significant loadings in both functional and symptom components, while item 5a ("Cut down the time spent on work or other activities?") and 6 ("Interference of lung symptoms with normal social activities?") had significant loadings in both functional and emotional components. Furthermore, although items 9i ("Felt limited in taking a trip?") and 9j ("Afraid of being alone?") were originally assigned to emotional complaints, they clustered better with the items of the functional component rather than with the emotional component. A logical explanation could be that individuals tend to judge their independence, at least in part, by their self-perception of physical performance.
Mazdak Tavoly et al. [20] were the first to report and evaluate the total score for the PEmb-QoL questionnaire. Since then, validation studies, including the present one, have found this scale to have adequate reliability [15][16][17][18][19][20][48]. Nonetheless, we believe that the validity of a total summary scale should be revisited. It is generally discouraged to calculate a single total summary score for multidimensional constructs such as QOL, as this process intrinsically involves assigning unjustified weights to each dimension [32,49]. In this case, the Norwegian study calculated a single total summary score by averaging dimensional scores, whereas the French, German, and Chinese studies calculated it by averaging individual item scores.
While the former approach assumes equilibrium between the contributions of all dimensions toward QOL, the latter implicitly assumes more weight for dimensions with more items. The rationale behind these assumptions, however, is unclear. Even though the dimensions of QOL are highly influenced by one another, the preference weight for each dimension (eg, physical well-being and social well-being) may vary between individuals. Conflating the scores of these dimensions without incorporating their preference weight for each respondent could, therefore, be unwarranted [49]. We also cannot confirm the reliability or validity of measuring social limitations solely based on item Q6, primarily because single-item dimensions generally fail not only to sufficiently capture all aspects of a general concept, but also to qualify for the conventional analysis of reliability [12]. The cross-loading values of items Q6 ("Interference of lung symptoms with normal social activities") and Q5a ("Cut down the time spent on work or other activities?") also demonstrate that they should be interpreted as part of the shared attributes between broader concepts rather than being assigned to, or comprehensively describe, a single subscale.

TABLE 2 Assessment of internal reliability, discriminative ability, redundancy, and homogeneity. The PEmb-QoL questionnaire contained 38 Likert-type items; however, item Q4a was omitted from the reliability assessment due to the high number of participants with an "I do not work" response, which was treated as a missing value. Q2 and Q3 are not Likert-type scales and were thus not included in the reliability analysis.
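To make the weighting argument about the two total-score conventions concrete, the sketch below computes a total summary score both ways for a single hypothetical respondent. The dimension names, item counts, and scores are invented for illustration and do not reproduce the PEmb-QoL structure.

```python
from statistics import mean

def total_by_dimension(dimensions):
    """Average of dimension scores: every dimension gets equal weight."""
    return mean(mean(items) for items in dimensions.values())

def total_by_item(dimensions):
    """Average of all item scores: weight grows with the item count."""
    return mean(score for items in dimensions.values() for score in items)

# Hypothetical respondent: marked functional limitation, no social limitation.
respondent = {
    "daily_activities": [80, 80, 80, 80],  # 4-item dimension
    "social_limitations": [0],             # single-item dimension
}
print(total_by_dimension(respondent))  # 40
print(total_by_item(respondent))       # 64
```

The same responses yield 40 under dimension averaging and 64 under item averaging, which shows that each convention embeds an implicit weighting of the dimensions.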
Our study is subject to several limitations.

FIGURE 4 Exploratory factor analysis (EFA) of the Persian version of the PEmb-QoL questionnaire with Oblimin rotation method (axis: loading score; item labels include "Nagging feeling" in the lungs?, "Burning sensation" in the lung?, Feeling that there is "still something there"?, Feeling of pressure in the chest?, Pain in the back?, Pain in the chest area?, and Pain behind or between the shoulder blades?). EFA uses a covariance matrix to extract a set of latent common variables that best explain the observed variance in the responses to questionnaire items. The 3 extracted factors accounted for 34%, 15%, and 11% of the total variance in patient responses. Factor loadings represent the regression coefficient of each item. Coefficients <0.35 were not mentioned. IC, intensity of complaint; PEmb-QoL, Pulmonary Embolism Quality of Life questionnaire; SL, social limitations.