Confirmatory factor analysis of the International Pain Outcome questionnaire in surgery

Supplemental Digital Content is available in the text. The reliability and validity of the Spanish adaptation of the International Pain Outcome questionnaire are confirmed in a large heterogeneous sample. Factor scores can be used as a global outcome analysis tool.


Introduction
Pain is one of the factors that interferes with the proper recovery of patients after surgery, yet it is one of the most challenging to quantify because it is a subjective, multidimensional experience. 17 One recent effort to collect and analyze postoperative pain-related data from 200 hospitals across the world is the PAIN-OUT project, funded by the European Commission's Seventh Framework Programme (ClinicalTrials.gov: NCT02083835). 30 It focuses on 3 areas related to postoperative pain (structure, process of care, and outcomes) and aims to improve postoperative outcomes through benchmarking, quality indicators, and the best available evidence.
As an outcome measurement tool, the International Pain Outcome (IPO) questionnaire 22 was developed in 2 phases by Rothaug et al. based on the Revised American Pain Society Patient Outcome Questionnaire. 14,22 Using a forward-backward methodology, the English version was translated into 9 languages, including Spanish. After the phase 1 analysis, they shortened the questionnaire to adapt it to the European population. From the Revised American Pain Society Patient Outcome Questionnaire, falls and sleep were combined. They eliminated the item on how helpful the information was, as well as the frightened and depressed items, because they correlated highly with the anxious item. They added 3 questions: interference with breathing, would have liked more pain treatment, and presence of previous chronic pain. Using principal component analysis, a 3-factor structure was identified, explaining 53.8% of the total variance: pain intensity and interference, adverse effects, and perception of care. 14,22
The general objective of this study was to evaluate the psychometric properties of the Spanish adaptation of the IPO questionnaire in a large clinical sample including patients who underwent different types of surgery. The specific objectives were as follows: (1) to obtain empirical evidence confirming the 3-factor structure for the IPO questionnaire found by Rothaug et al. 22 (measuring pain intensity and interference, adverse effects, and perceptions of care); (2) to test invariance and differential item functioning of the 3-factor structure by the patients' sex, chronological age, surgery type, current smoking, obesity, affective disorder, and presence of previous chronic pain 27; and (3) to estimate the incremental predictive validity of the IPO factor scores on would have liked more pain treatment and total morphine consumption (defined as criteria of poor pain control).

Participants
The methodology of PAIN-OUT, a project funded by the European Commission and supported by the International Association for the Study of Pain, has been described elsewhere. 20,29 The Spanish subsample of the European PAIN-OUT study was analyzed, which included n = 4650 patients recruited from 13 hospitals in 7 different regions of Spain. All centers were university hospitals (300-1000 beds) with an acute pain care unit. As the number of participants in the sample used for previous factorial validation studies was very low (less than 10% of the sample), we decided to retain them in the current study to increase external validity and generalization capacity. Data were collected between February 2010 and December 2013 by trained research assistants, who followed a standard operating procedure provided by the PAIN-OUT project. 20 Patients who agreed to participate completed the IPO questionnaire. The registered data were entered in a secured multi-institutional web-based database using a random identifier. As an inclusion criterion, patients had to be in their first postoperative day and in the ward for at least 6 hours. Exclusion criteria were being asleep, sedated, or not in the ward at the time of data collection, and being unable to communicate, including any language barrier, inability to read and understand questionnaires, and cognitive impairment (Fig. 1).

Instruments
The instruments were the IPO questionnaire developed by Rothaug et al. 22 and the Process of Care questionnaire. The latter includes sociodemographic information, comorbid conditions relevant to pain treatment, and perioperative anesthetic, surgical, and analgesic data documented in medical records. Both questionnaires are available on the PAIN-OUT website (http://pain-out.med.uni-jena.de/).

Procedure
The study involving human subjects and the use of patient data for research purposes was approved by the Committee on Research Ethics of every participating center, and the research was conducted in accordance with the Declaration of the World Medical Association (Committee on Research Ethics Clínica Parc de Salut MAR, Reference code: No 2007/2998/I), and signed informed consent was obtained from all participants.
The measures analyzed in the study are self-report measures answered by the patients on the first day after surgery, after arrival at the surgical ward. Research assistants encouraged the patients to complete the questionnaires and could read unanswered questions to the patient once, so that all items were answered and no problems arose from a lack of understanding.

Statistical analysis
Statistical analysis was performed with Stata 16 for Windows. Confirmatory factor analysis (CFA) was performed considering that the new adapted IPO questionnaire was developed after a theoretical rationale procedure based on the following steps: (1) adapting a new instrument for covering 3 specific domains (constructs/areas) related to the pain measure (intensity, adverse effects, and self-perception of care); (2) reviewing all the items initially assigned to each domain to ensure that the contents were appropriate for the target population(s); and (3) providing a meaningful distribution and order to the new version of the questionnaire. Therefore, CFA in our study was performed assuming the existence of 3 latent theoretical factors from items measured on a 0 to 10 scale: F1 defined by 9 items measuring pain intensity and interference(s); F2 defined by 4 items measuring pain treatment-related adverse effects; and F3 defined by 3 items assessing the patients' perception of care: allowed to participate in pain treatment decision, pain relief, and satisfaction with pain treatment, with higher scores meaning better perception of care (Fig. 2). 22 The maximum likelihood (ML) estimator with missing values was used, and the overall goodness of fit was evaluated through the standard statistical measures 3: the root mean square error of approximation (RMSEA), Bentler's comparative fit index (CFI), and the Tucker-Lewis index (TLI). Adequate model fit was considered for RMSEA < 0.10, TLI > 0.9, and CFI > 0.9. The χ² test was not considered as a measure of fit because of its strong dependence on sample size (it may fail to reject inappropriate models in small samples because of the lack of statistical power, and it may reject appropriate models in large samples because of the excess of statistical power).
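The fit thresholds above can be illustrated with the textbook formulas that derive RMSEA, CFI, and TLI from the χ² statistics of the fitted model and of the baseline (independence) model. This is a minimal sketch, not the Stata procedure used in the study: the function names `fit_indices` and `adequate_fit` and the χ² values in the example are hypothetical, and the RMSEA uses the N − 1 convention (some software divides by N instead).

```python
import math

def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Return (RMSEA, CFI, TLI) from model and baseline chi-square statistics."""
    # Noncentrality estimates, floored at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))
    denom = max(d_model, d_base)
    cfi = 1.0 if denom == 0 else 1.0 - d_model / denom
    # TLI compares chi-square/df ratios of the baseline and fitted models.
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)
    return rmsea, cfi, tli

def adequate_fit(rmsea, cfi, tli):
    """Apply the cutoffs stated in the text: RMSEA < 0.10, TLI > 0.9, CFI > 0.9."""
    return rmsea < 0.10 and tli > 0.9 and cfi > 0.9

# Hypothetical chi-square values, for illustration only (not study results).
rmsea, cfi, tli = fit_indices(300.0, 101, 3000.0, 120, 4014)
print(f"RMSEA={rmsea:.3f} CFI={cfi:.3f} TLI={tli:.3f} adequate={adequate_fit(rmsea, cfi, tli)}")
```

Because all three indices are functions of χ² relative to its degrees of freedom, they are less sensitive to sample size than the raw χ² test, which is the reason given above for not using that test.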
The internal consistency between items within each defined factor was estimated with the omega coefficient (ω); this measure was used instead of the usual Cronbach's alpha because of the low number of items in factors 2 and 3 of the IPO questionnaire. Consistency was considered moderate for coefficients equal to or higher than 0.60 and good for values higher than 0.70. A corrected item-scale correlation was calculated for each item. Because of the strong association between the statistical significance of the coefficients and sample size, a corrected item-scale correlation was considered low to poor for |R| > 0.10, moderate to medium for |R| > 0.24, and large to high for |R| > 0.37 (these thresholds correspond to Cohen's d of 0.20, 0.50, and 0.80, respectively). 7,21
Because of a steady association of the pain construct with variables such as sex, age (2 groups were defined based on the median in our sample), surgery type, current smoking, obesity defined as body mass index greater than or equal to 30 kg/m², history of affective disorder, and presence of previous chronic pain, 23,27 the structural configural invariance by these variables was analyzed. Given the low frequency of previous consumption of opioid analgesics, this variable was not included in the analysis. In this study, structural invariance tested the equivalence of factor loadings across the groups. This assumption is supported if the multigroup CFA (MGCFA) analysis meets the following criteria 6: (1) the model specifying the items measuring each latent variable fits the data well; (2) all factor loadings are substantial (usually above 0.30) and statistically significant; and (3) no large modification indices exist that point to model misspecifications.
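The omega coefficient and the classification thresholds described above can be sketched as follows. The sketch assumes a single-factor model with standardized loadings and uncorrelated residuals, which is the common closed form of McDonald's omega; the function names and the loadings in the example are hypothetical, and the "negligible" label for |R| at or below 0.10 is an added convenience, not a category from the text.

```python
def mcdonald_omega(loadings):
    """Omega for one factor from standardized loadings (uncorrelated residuals assumed)."""
    common = sum(loadings) ** 2                      # variance due to the factor
    residual = sum(1.0 - l ** 2 for l in loadings)   # unique variance of the items
    return common / (common + residual)

def consistency_label(omega):
    """Thresholds from the text: >= 0.60 moderate, > 0.70 good."""
    if omega > 0.70:
        return "good"
    if omega >= 0.60:
        return "moderate"
    return "low"

def correlation_label(r):
    """Thresholds from the text for corrected item-scale correlations."""
    size = abs(r)
    if size > 0.37:
        return "large to high"
    if size > 0.24:
        return "moderate to medium"
    if size > 0.10:
        return "low to poor"
    return "negligible"  # label added for completeness (assumption)

# Hypothetical loadings for a 3-item factor.
omega = mcdonald_omega([0.7, 0.7, 0.7])
print(round(omega, 3), consistency_label(omega))
```

With few items, alpha tends to understate reliability when loadings are unequal, whereas omega weights each item by its loading; this is consistent with the choice described above for the shorter factors.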
Comparison of the raw factor scores by participants' sex, age group, surgery type, current smoking, obesity, affective disorder, and presence of previous pain was based on analysis of variance. Effect size for the mean comparisons was based on Cohen's d coefficients (effect size was considered low for |d| > 0.20, mild to moderate for |d| > 0.50, and high to large for |d| > 0.80). 16
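The effect-size classification above can be sketched for a two-group comparison as shown below. The text does not state which variant of Cohen's d was computed, so the pooled-standard-deviation form is an assumption here; the function names and the example scores are hypothetical, and the "negligible" label for |d| at or below 0.20 is added for completeness.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

def effect_label(d):
    """Thresholds from the text: > 0.20 low, > 0.50 mild to moderate, > 0.80 high to large."""
    size = abs(d)
    if size > 0.80:
        return "high to large"
    if size > 0.50:
        return "mild to moderate"
    if size > 0.20:
        return "low"
    return "negligible"  # label added for completeness (assumption)

# Hypothetical raw factor scores for two groups.
d = cohens_d([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(round(d, 3), effect_label(d))
```

Reporting d alongside P values matters in a sample of this size, where even small mean differences can reach statistical significance.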
As there are no gold standard measures of postoperative pain outcomes, the survey question would have liked more pain treatment, either pharmacological or nonpharmacological, and high opioid requirements, defined as more than 30 mg per 24 hours of morphine equivalent consumption, were used as measures of convergent validity. Opioid analgesic doses during the intraoperative period and the first 24 hours after surgery, obtained from the process questionnaire, were used to calculate oral morphine equivalent consumption, based on published analgesic tables. 19 Short-acting opioids (fentanyl and remifentanil) were part of the anesthetic protocol and were not used to provide postoperative analgesia, so they were excluded. The value of 30 mg per 24 hours was chosen to dichotomize morphine equivalent consumption into high opioid consumption, based on previous literature. 12,20
The incremental predictive or discriminant validity of the IPO factor scores for would have liked more pain treatment and for high opioid consumption was estimated with binary logistic regression in 2 steps or blocks: (1) the first block entered and fixed the patients' sex, age, surgery type, current smoking, obesity, affective disorder, and presence of previous chronic pain; and (2) the second block added the 3 raw scores of the IPO questionnaire. Goodness of fit was assessed with the Hosmer-Lemeshow test (adequate fit was considered for P > 0.05), whereas the incremental predictive validity of the IPO scores was estimated with the increase in the Nagelkerke pseudo-R² (ΔR²) between the first and second blocks of the regression, and the incremental discriminant validity with the increase in the area under the ROC curve (ΔAUC).
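The quantities used in this two-block comparison can be sketched from their standard definitions. The sketch below is illustrative only and is not the study's Stata code: it assumes the log-likelihoods of the null and fitted logistic models are already available, computes the AUC as the proportion of concordant case/non-case pairs, and uses hypothetical function names, log-likelihoods, and scores throughout. Only the 30 mg per 24 hours cutoff comes from the text.

```python
import math

HIGH_OPIOID_MG_PER_24H = 30.0  # cutoff stated in the text

def high_opioid(morphine_equiv_mg):
    """Dichotomize oral morphine equivalents: 'more than 30 mg per 24 hours'."""
    return morphine_equiv_mg > HIGH_OPIOID_MG_PER_24H

def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R2: Cox-Snell R2 rescaled by its maximum attainable value."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_r2

def auc(scores, labels):
    """Area under the ROC curve as the share of concordant (case, non-case) pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical log-likelihoods for the null model, block 1, and block 2 (n = 200).
ll_null, ll_block1, ll_block2, n = -100.0, -95.0, -80.0, 200
delta_r2 = nagelkerke_r2(ll_null, ll_block2, n) - nagelkerke_r2(ll_null, ll_block1, n)
print(f"Delta R2 = {delta_r2:.3f}")

# Hypothetical predicted probabilities and observed outcomes.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

ΔAUC is obtained the same way as ΔR²: compute the AUC from the predicted probabilities of each block and subtract the first from the second.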

Participants
Of the 4650 patients recruited, 636 did not participate in the study because they were asleep, sedated, not in the ward at the time of data collection (patients had to have spent at least 6 hours in the ward on their first postoperative day), or unable to communicate (deaf or unable to communicate in any of the languages in which the IPO is available). Therefore, 4014 patients were analyzed. Figure 1 shows the flowchart of the study. Descriptive characteristics of the patients are shown in Table 1.
Figure 2 contains the path diagram for the 3-factor model tested in the study, with the standardized coefficients obtained in the single-group CFA (whole sample, n = 4014). All the coefficients achieved high loadings with statistically significant results. Good fit was obtained for this initial model (RMSEA = 0.059, 95% CI 0.056-0.061, CFI = 0.926, and TLI = 0.905). Internal consistency was good for factor F1 pain intensity and interference (ω = 0.82), moderate for factor F2 adverse effects (ω = 0.61), and good for F3 perceptions of care (ω = 0.72). These results confirm the structure in three first-order factors for the IPO questionnaire in the whole sample (Table S1, supplementary material, contains the complete results for this CFA, as well as the frequency distribution of the raw scores for each item and for the dimension scores in the study, available at http://links.lww.com/PR9/A96).
Table S2 (supplementary material, available at http://links.lww.com/PR9/A96) contains the results of moving from the single-group CFA obtained in the whole sample to the MGCFA to cross-validate the 3-factor structure across the groups defined by the participants' sex, age, surgery type, current smoking, obesity, affective disorder, and the presence of previous chronic pain.
Good fit indices were obtained in the MGCFA models assessing invariance of the IPO structure by sex, age, current smoking, affective disorder, obesity, and presence of previous pain, whereas fit was only moderate for the model measuring invariance by surgery type. The test of invariance by sex was nonsignificant (χ² = 19.7, P = 0.104), indicating a statistically equivalent structure for men and women. However, the joint invariance test was significant for the rest of the grouping variables tested in the MGCFA. Examining the standardized coefficients separately in each group, significant high coefficients (above 0.30) were achieved for all the items, except for the itching item of the adverse effects factor, which had a coefficient of 0.265 in the group of surgery types other than general and orthopedic and 0.271 in the group of men.
Table 2 includes the comparison of the mean raw factor scores measured with the IPO questionnaire in the groups considered in the study. Compared with men, women reported higher mean values in the factors assessing pain intensity and interference (P = 0.001) and adverse effects (P < 0.001). These 2 factors also showed higher mean scores the younger the patients, whereas the perception of care factor showed a lower mean for older patients compared with the other 2 age groups (P < 0.001). Type of surgery showed differences only for the adverse effects factor; orthopedic surgery had a lower mean than both general (P = 0.007) and other (P = 0.004) surgeries. Current smokers reported fewer adverse effects (P = 0.01).
Table 3 includes the final result of the logistic regression measuring the incremental predictive validity of the IPO measures after considering the patients' sex, age, surgery type, history of affective disorders, current smoking, obesity, and presence of previous pain. For would have liked more pain treatment and for high opioid consumption, goodness of fit was achieved for the final model in the second block (P = 0.093 and P = 0.291, respectively). For would have liked more pain treatment, the increase in the pseudo-R² after including and fixing the variables of the first block indicated that the specific incremental predictive capacity of the IPO factor scores was around 23% (ΔR² = 0.227; global predictive capacity for the final model was R² = 0.248), and the specific incremental discriminative capacity was also around 23% (ΔAUC = 0.228; global discriminative accuracy for the final model was AUC = 0.829). The predictive capacity of the variables included in the model for high opioid consumption was low (R² = 0.057 and AUC = 0.626) and did not increase noticeably with the inclusion of factor scores (ΔR² = 0.027 and ΔAUC = 0.031). For would have liked more pain treatment, significant OR coefficients were achieved for the factors measuring pain intensity and interference (OR = 1.076, P < 0.001) and perception of care (OR = 0.909, P < 0.001), whereas for high opioid consumption, significant OR coefficients were achieved for pain intensity and interference (OR = 1.012, P < 0.001) and adverse effects (OR = 1.014, P < 0.001). These results indicate that the probability of wanting more treatment was higher for patients who perceived higher pain intensity and interference and lower perception of care, while the probability of high opioid consumption was higher for patients who perceived higher intensity and interference and higher adverse effects.
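Because the odds ratios above are expressed per 1-point change in a raw factor score, their practical size is easier to judge over a larger score difference. In a logistic model the odds multiply, so a difference of k points scales the odds by OR raised to the power k. The sketch below applies this to the two ORs reported for would have liked more pain treatment; the 10-point difference and the function name `odds_multiplier` are hypothetical choices for illustration, and the calculation assumes the linearity in log-odds that the logistic model itself imposes.

```python
def odds_multiplier(odds_ratio_per_point, points):
    """Factor by which the odds change for a score difference of `points`."""
    return odds_ratio_per_point ** points

# ORs per point as reported in the text; the 10-point difference is hypothetical.
for name, odds_ratio in [("pain intensity and interference", 1.076),
                         ("perception of care", 0.909)]:
    factor = odds_multiplier(odds_ratio, 10)
    print(f"{name}: odds multiplied by {factor:.2f} per 10 points")
```

This describes a change in odds, not in probability; converting to a probability would also require the baseline odds from the fitted model.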

Discussion
This study aims to test the psychometric validity of the IPO questionnaire in a large clinical Spanish sample with patients who underwent a broad range of surgical procedures and perioperative management. 20 The main results of this study provide evidence about (1) the structure of the IPO questionnaire in 3 factors (pain intensity and interference, adverse effects, and perceptions of care); (2) the invariance of the structure by sex, age, surgery type, current smoking, history of affective disorder, obesity, and presence of previous pain; and (3) the capability of the factor scores to predict would have liked more pain treatment.
Similar to the initial IPO exploratory validation study, 22 our CFA shows that interference with breathing and coughing was the item with the lowest factor loading (0.33) on F1 (pain intensity and interference). This could be related to the heterogeneity of the procedures included in the sample, considering that the loading of this item is higher in general surgery than in orthopedic procedures, because limb procedures usually do not affect the respiratory system. 20 The moderate fit of the invariance model by type of procedure could also be influenced by this item. For F2 (adverse effects), itch (0.32) is the item with the lowest standardized coefficient, which may be due to its low frequency of occurrence and low intensity compared with other adverse effects. 20 Itching was introduced in the IPO as one of the main adverse effects of intrathecal and epidural opioid treatments, but its frequency of occurrence in large sample studies has been around 6% to 18%, 11 and although it is bothersome, its relevance to the functional status and morbidity of postoperative patients is arguable. Regarding perception of care (F3), the satisfaction item explains almost all the variability in that factor, in contrast to the original exploratory factor analysis by Rothaug, 22 where the item loadings were more balanced. Also, for F3, our study shows moderate internal consistency, suggesting that pain relief, satisfaction, and participation in pain treatment decisions measure different aspects of the postoperative experience in Spanish patients and should be treated separately. High participation and information about pain therapy, as perceived by the patients, have been shown to predict less pain intensity, restriction with movement, and dissatisfaction in the German population. 18 Unlike in other countries, such as the United States, where both participation and satisfaction with treatment are high, 28 Spanish patients, on average, perceive that they participate less in pain treatment, but despite this, their satisfaction is comparable to that in other countries. 20 This may be due to sociocultural differences, the fact that health care is public in Spain, and the possibility that the degree of involvement Spanish patients want to have in treatment decisions differs from that in other countries.
The analysis presented here suggests invariance of the factor structure for sex, age, previous chronic pain, current smoking, obesity, history of affective disorders, and type of surgery, which implies that factor scores can be used to compare these groups. The poorer fit of the model by surgery type is consistent with previous literature showing that even minor surgeries can result in significant pain, and it is likely that individual factors have a greater impact than the nociceptive burden of surgical procedures on the experience of postoperative pain and on treatment outcomes. 12
Raw scores show differences by sex in intensity and interference and in adverse effects, but not in perception of care. The higher sensitivity to pain in women compared with men seems to occur after puberty, 4 and it is believed to be related to the increased occurrence of pain due to the menstrual cycle, which generates a larger painful memory network and higher pain sensitivity, 5 a fact that could explain the increase in pain interference scores. The higher frequency of adverse effects could be associated with pharmacokinetic differences in the metabolism of analgesic medication, as shown by the fact that plasma morphine concentrations at the same dose are higher in women, which exposes them to a higher frequency of adverse effects. 10 Interestingly, although men had lower scores in intensity of pain (F1) than women, in the logistic regression analysis, being a man is a predictor of would have liked more pain treatment.
When patients are grouped by age, the scores of the younger half of the sample are higher in intensity and interference, adverse effects, and perception of care. Gerbershagen et al. had already shown that young patients have greater postoperative pain regardless of the type of surgery. 13 Although previous studies have shown that satisfaction is greater in older patients, 26 Jaipaul and Rosenthal showed that this factor varied with health status, so that older adults (patients older than 65 years) with poor health status report less satisfaction with treatment compared with older adults in good physical condition. 15 That could explain the higher perception of care in the younger half of our sample, because our study population came mainly from tertiary care hospitals that treat older people with poor health status.
Our raw scores do not differ with the presence or absence of previous chronic pain, which agrees with previous studies showing that only previous severe chronic pain is related to worse postoperative outcomes. 13,20 Contrary to the results of the Yang meta-analysis, 27 patients with obesity and patients with a history of affective disorders had similar raw scores in the 3 factors, and these conditions were not predictors of would have liked more treatment. As expected, current smokers had lower raw scores in adverse effects, mainly because of the decreased risk of nausea, 2 but similar raw scores in pain intensity and perception of care.
Would have liked more pain treatment is used to validate the predictive capacity of the questionnaire, under the assumption that wanting more pain treatment is an indirect but specific measure of poor pain control. Our analysis shows that the probability of wanting more pain treatment increases with the intensity and interference score and decreases with the perception of care score. This agrees with the results of Schwenkglenks et al., 24 who showed that satisfaction with pain treatment was associated with 3 items: more pain relief, greater participation in the treatment of pain, and no desire to have received more treatment. As expected, the intensity and interference score and the adverse effects score were also related to high opioid consumption. 25
The main limitation of this study is the analysis of cross-sectional data, which did not allow the temporal reliability of the IPO measures to be studied, for example through test-retest analysis. Other psychometric properties of the IPO, such as acceptability, responsiveness, divergent validity, and feasibility, were already established in the Rothaug study, so they were not analyzed here. The overlap of the sample with the previous validation study is a limitation, although CFAs are statistical hypothesis tests that evaluate whether the data fit a hypothesized structural model (in this case a 3-factor structure model). Thus, the CFA is supported by the rational procedure used to develop the IPO questionnaire. Voluntary participation in the PAIN-OUT study makes the sample representative only of patients who attend public university hospitals. Characteristics of patients from other populations in Spain could modify the responses to the questionnaire and its factor structure. 18 The IPO focuses on postoperative pain, which is corroborated by its factor structure (the variance explained by the intensity and interference factor in the Rothaug study is 36%). However, current trends derived from fast-track and enhanced surgery protocols 1,8,9 focus on the quality of postoperative recovery, especially the patient's early ability to move, which depends on other factors in addition to pain outcomes. Studies are needed to test the validity of the IPO in this new paradigm. Another limitation of the study is the absence of important variables related to the pain construct, such as sleep difficulties, degree of catastrophizing, and patient resilience, as well as the lack of treatment data, such as type, doses, and combination of pharmacological and nonpharmacological treatment. Item factor loadings depend on the distribution of outcomes, so pain treatments that significantly influence patient outcomes could have altered the present results.
The main strengths of our study are the large sample size, the inclusion of heterogeneous patients, and the use of MGCFA procedures to assess structure invariance by variables strongly related to pain. It is the first time that the IPO is conceptualized as a sum of scores in its main factors, which can serve as a global outcome analysis tool. Low scores in pain interferences and adverse effects with high scores in perception of care would indicate optimal quality of care. 17 The total factor scores also allow simpler comparison between centers and procedures and could become an improvement tool in the quality of postoperative pain management.
Further studies will be needed to increase the convergent and divergent validity of the questionnaire with other measures related to postoperative pain and recovery from surgery.

Conclusion
In conclusion, the study confirms the 3-factor structure of the IPO questionnaire in the Spanish population attending public university hospitals and its invariance by sex, age, previous chronic pain, and type of surgery, and provides evidence that the sum of the scores of the factorial structure predicts would have liked more pain treatment, a key aspect of patient satisfaction, along with the ability to participate in treatment decisions and a sense that providers care for them. 24

Disclosures
The authors have no conflicts of interest to declare.