Measuring pain intensity in older adults. Can the visual analogue scale and the numeric rating scale be used interchangeably?

Objectives: The visual analogue scale (VAS) and the numeric rating scale (NRS) are two commonly used instruments for measuring pain intensity. Both instruments are validated for use in clinical and research settings and share many features. Some studies have shown that the two instruments may be used interchangeably, but the results are conflicting. In this study we assessed whether the VAS and the NRS may be used interchangeably when measuring pain intensity in older adults. Methods: Data were collected in a cross-sectional study, as part of the follow-up in a larger longitudinal study conducted at the Akershus University Hospital, Norway, from 2021 to 2022, and included 39 older adults aged ≥65 years. Participants were regarded as a normal older adult population, as they were not recruited on the basis of a specific condition or reports of pain. The participants were asked to rate their pain intensity on an average day using VAS and NRS. Bland-Altman analysis was performed to assess agreement between the two instruments. Results: Thirty-seven participants with mean (SD) age of 77 (5.9) were included in the analysis. Mean (SD) pain intensity assessed by VAS and NRS was 2.8 (1.8) and 4.7 (2.2), respectively. The mean difference (SD) of 2.0 (1.9) between the scores of the two instruments was statistically significantly different from zero (p < 0.001), confirming bias. The 95% limits of agreement were estimated to be −1.7 to 5.7. A post-hoc analysis, removing an outlier, resulted in similar conclusions. Conclusion: There was poor agreement between the VAS and the NRS for measuring pain intensity in older adults. This suggests that the two instruments should not be used interchangeably when assessing pain intensity in this population. Ethical approval: Regional Committees for Medical and Health Research Ethics [2016/2289]. Trial registration.


Introduction
Accurate pain assessment is important for appropriate clinical management, for both diagnostic and treatment purposes. Pain intensity is used for understanding severity of disease and impact on quality of life. Pain measurements are also common endpoints in research. Instruments that accurately measure pain intensity are therefore important. Nociception can be measured objectively, where the focus is on evoking nociceptive pain fibres (Wagemakers et al., 2019), or through biomarkers (Xu and Huang, 2020; Salomons et al., 2016; Mouraux and Iannetti, 2018), while pain experience is a subjective measure obtained through patient-reported outcomes, where the patients are asked to describe and rate their pain intensity (Haefeli and Elfering, 2006; Hawker et al., 2011). The consensus definition of pain as "an unpleasant sensory and emotional experience associated with, or resembling that associated with, actual or potential tissue damage" underlines that subjective measurements cover a great part of the many aspects of experiencing pain (Raja et al., 2020).
The visual analogue scale (VAS) and the numeric rating scale (NRS) are two of the most commonly used instruments for measuring pain intensity. They are both validated for use in a general population and also in specific conditions and states of acute and chronic pain. Although usability and individual preference in older adults have been suggested to favour the NRS over the VAS (Dworkin et al., 2005; Peters et al., 2007), both instruments are individually validated for both clinical and research purposes in older adults (Haefeli and Elfering, 2006; Hawker et al., 2011; Peters et al., 2007; Ferreira-Valente et al., 2011; Williamson and Hoggart, 2005; Wood et al., 2010; Ferrell et al., 1995). Further, previous studies suggest that psychometric properties, including reliability (internal consistency) and validity (scale validity), as well as scale failure and preference, are comparable between younger and older adults (Herr et al., 2004). There is, however, little research on agreement between the two scales when used in older adults. Pain is highly prevalent in older adults, and adequately measuring pain intensity is important for accurate pain management (Domenichiello and Ramsden, 2019).
During the Covid-19 pandemic, infection control and social distancing forced a pause in regular medical consultations as well as research projects. Telemedicine was therefore introduced as an alternative for medical consultation. Some research projects continued their work with video interviews and online surveys. Certain participant groups, such as older adults, may have difficulties with digital appliances. For some of these, electronic surveys and secure-platform video consultations are difficult to access. The preferred medium in these instances would be a phone call. Using different instruments and different modes of application to assess the same outcome should in general be avoided, but for instruments that have a high degree of similarity one may query whether they could be used interchangeably. There is evidence supporting high agreement between VAS and NRS in some populations and that versions of these instruments may be used interchangeably (Shafshak and Elnemr, 2021; Bahreini et al., 2015; Alghadir et al., 2018; Kollltveit et al., 2020). There is, however, little evidence on agreement between VAS and NRS in the geriatric population, and especially among the oldest of the old. In this paper, we aimed to assess whether the NRS and the VAS instruments may be used interchangeably for measuring pain intensity in older adults.

Study design, setting and participants
Data in this cross-sectional study were collected as part of the follow-up in a longitudinal study conducted at the Akershus University Hospital, Norway, from autumn 2021 to winter 2022. Participants included 39 older adults aged ≥65 years. The participants were originally recruited to a study conducted in 2017-2018 while they were admitted to somatic departments. Study design and results from the 2017-2018 study have been extensively reported elsewhere (Bjelkaroy et al., 2021; Cheng et al., 2019a; Cheng et al., 2019b; Cheng et al., 2020; Cheng et al., 2021; Siddiqui et al., 2022; Siddiqui et al., 2020). The participants included in the follow-up were regarded as a normal home-dwelling population, and they were not recruited on the basis of a specific condition or reports of pain. Exclusion criteria were an existing diagnosis of stroke, dementia, psychotic disorders, moderate to severe depression, serious hearing or visual impairment, and insufficient language skills to complete a questionnaire in Norwegian. Data were collected either in an office setting or during a home visit, where participants were asked to complete a self-administered questionnaire on paper.

Measurements
Measurements included in this study cover sociodemographic information such as sex, age, and years of education, as well as two instruments for assessing pain intensity: the visual analogue scale (VAS) and the numeric rating scale (NRS). The scales range from 0 to 10 for the NRS and from 0 to 100 for the VAS. For both instruments, the participants are asked to rate their pain from "no pain" to "worst possible pain". In this study, the question for both scales was "what is your pain intensity when your pain is average (on an average day)".

Visual analogue scale (VAS)
The VAS was first used in 1921 by Hayes and Patterson (Yeung and Wong, 2019). The scale is most commonly a 100 mm blank line with two demarcated extremes: "no pain" to the left and "worst possible pain" to the right. VAS may be scored on a paper sheet or a digital device. The participant marks their pain intensity on the line, and the score is then obtained by a researcher using a tape measure on paper or automatically from the digital device (Delgado et al., 2018). The instrument is valid for use in older adults, although some studies suggest that the instrument should be used with caution in individuals with impaired psychomotor skills (Haefeli and Elfering, 2006; Peters et al., 2007; Ferrell et al., 1995; Herr and Garand, 2001).

Numerical rating scale (NRS)
With the NRS, patients are asked to rate their pain with a number from 0 to 10, where 0 is "no pain" and 10 is "worst possible pain". NRS may be scored either orally or in writing. It is widely used in both clinical and research settings, and the scale is validated for use in older adults (Wood et al., 2010).

Statistical analysis
Demographic and clinical information is presented as means and standard deviations (SDs) or frequencies and percentages. Normality was assessed by the Kolmogorov-Smirnov test. A one-sample t-test was applied to the differences between the measurements to assess bias (mean difference) between the two instruments. A scatterplot with regression line was presented to illustrate the association between the instruments. Finally, Bland-Altman analysis (Bland and Altman, 1986; Bland and Altman, 1999) was performed to assess agreement between the two instruments through 95% limits of agreement. As a sensitivity analysis, non-parametric limits of agreement defined by the 2.5th and 97.5th percentiles were constructed as well. In this latter approach, the mean difference was replaced by the median of the differences. A post hoc analysis was performed after removing an outlier (the subject appeared to have misread the question for rating NRS, as the reported score showed a major discrepancy with the rest of this subject's answers). For the analyses, the VAS score was transformed from the 0 to 100 scale to a 0 to 10 scale by dividing it by 10 and rounding to the nearest digit. Statistical analysis was performed using IBM SPSS Statistics for Windows (Version 28.0; IBM Corp., released 2021, Armonk, NY) and Microsoft Excel 2016 (Microsoft Corporation).
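The steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the SPSS/Excel procedure used in the study: the function names `rescale_vas` and `bland_altman` are hypothetical, the data are made up, differences are taken as NRS minus VAS, "rounding to the nearest digit" is read as rounding to a whole number with halves rounded up, and the limits of agreement use the conventional multiplier 1.96.

```python
import math
import statistics


def rescale_vas(vas_mm):
    """Map a 0-100 VAS reading to the 0-10 range as a whole number.

    Assumption: halves are rounded up. Python's built-in round() would
    round halves to the nearest even number instead.
    """
    return math.floor(vas_mm / 10 + 0.5)


def bland_altman(nrs_scores, vas_scores):
    """Return bias, SD of differences, 95% limits of agreement and t statistic."""
    diffs = [n - v for n, v in zip(nrs_scores, vas_scores)]
    bias = statistics.mean(diffs)    # mean difference (NRS minus VAS)
    sd = statistics.stdev(diffs)     # sample SD (n - 1 denominator)
    lower = bias - 1.96 * sd         # lower 95% limit of agreement
    upper = bias + 1.96 * sd         # upper 95% limit of agreement
    # One-sample t statistic for the null hypothesis "mean difference = 0"
    t_stat = bias / (sd / math.sqrt(len(diffs)))
    return {"bias": bias, "sd": sd, "lower": lower, "upper": upper, "t": t_stat}


# Made-up illustrative readings, NOT the study data
vas_mm = [12, 35, 48, 20, 61, 5, 27, 74]
nrs = [3, 5, 6, 4, 7, 2, 5, 8]

vas = [rescale_vas(v) for v in vas_mm]
result = bland_altman(nrs, vas)
print(result)
```

The p-value for the t statistic would be read from a t distribution with n − 1 degrees of freedom; that step is left out here to keep the sketch free of external dependencies.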

Results
Thirty-nine participants, of whom 31 (79.5%) were female, with a mean (SD) age of 77.9 (5.9) years, were included in the study. Among the participants, 42.1% had 12 years of education and 47.4% had higher education (13 years and more). The analysis included 37 participants, as two had missing data on the NRS score but not on the VAS score. Testing for normality confirmed a normal distribution of data for the NRS scale, but not for the VAS scale. The distribution of both scales is presented in Fig. 1A and B. Mean (SD) pain intensity assessed by VAS and NRS was 2.8 (1.8) and 4.7 (2.2), respectively. According to the one-sample t-test, the mean difference (SD) of 2.0 (1.9) between the scores of the two instruments was statistically significantly different from zero (p < 0.001), implying bias.
As illustrated in the scatter plot of VAS by NRS (Fig. 1C), there was an association between the two measurements, with a Pearson's correlation coefficient of 0.60 (R² = 0.36). The NRS was generally scored higher than the VAS. The discrepancy between the two instruments was greatest towards the lower end of the scales, and the scores approached each other towards the higher end.
The Bland-Altman analysis, illustrated in Fig. 1E, shows poor agreement between the two instruments, as the 95% limits of agreement around the mean difference (bias) of 2.0 were estimated to be from −1.7 to 5.7. When an outlier (marked in grey in Fig. 1C) was removed in the post hoc analysis (Fig. 1D), the Pearson's correlation coefficient changed to 0.70, improving R² to 0.49. The one-sample t-test identified a mean difference (bias) of 1.9 (p < 0.001). The Bland-Altman analysis then estimated the 95% limits of agreement around the mean difference of 1.9 to be −1.3 to 5.1 (Fig. 1F).
According to the Kolmogorov-Smirnov test, the differences were not normally distributed, and a logarithmic transformation did not remedy this. The median difference was 1 in both analyses, with and without the outlier. The non-parametric limits of agreement were −0.5 to 8 in the main analysis and −0.5 to 5 in the post hoc analysis excluding the outlier, confirming poor agreement between the instruments.
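The non-parametric limits of agreement can be illustrated with a short Python sketch. It assumes one particular percentile definition (the "inclusive" interpolation method of the standard library), which may differ from the definition used by SPSS; the function name `nonparametric_limits` and the input data are hypothetical.

```python
import statistics


def nonparametric_limits(diffs):
    """Median and 2.5th/97.5th percentiles of paired differences."""
    # n=40 yields 39 cut points at 2.5%, 5%, ..., 97.5%, so the first
    # and last cut points are the percentiles of interest.
    cuts = statistics.quantiles(diffs, n=40, method="inclusive")
    return {
        "median": statistics.median(diffs),
        "lower": cuts[0],    # 2.5th percentile
        "upper": cuts[-1],   # 97.5th percentile
    }


# Made-up differences (NRS minus VAS), NOT the study data
example_diffs = list(range(0, 41))
limits = nonparametric_limits(example_diffs)
print(limits)
```

Because percentile estimates in small samples depend on the interpolation rule, limits obtained this way should be expected to differ slightly between software packages.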

Discussion
In our sample of 39 older adults reporting pain intensity with the NRS and VAS rating scales, we assessed whether the two commonly used scales could be used interchangeably in the older population. Both our primary and post hoc Bland-Altman analyses found poor agreement and a significant difference in mean pain intensity between the two scales, indicating that these two instruments should not be used interchangeably in older adults.

Agreement between instruments
Bland-Altman analysis has been suggested as the preferred method when comparing instruments or assessing agreement between two methods of continuous clinical measurement. It is a way of quantifying agreement between individual measurements. The difference between the two instruments is plotted against their mean (Bland and Altman, 1986; Bland and Altman, 1999). In our primary analysis, we found that the 95% limits of agreement around the bias of 2.0 were −1.7 to 5.7. After removing an outlier that potentially skewed the results, we had similar findings, where the 95% limits of agreement were −1.3 to 5.1. This implies that 95% of the differences between the NRS and VAS scores are expected to lie between 1.7 points below and 5.7 points above zero, that is, within about 3.7 points on either side of the mean difference, a range that cannot be accepted either statistically or in a clinical setting. This conclusion was supported by the non-parametric limits of agreement, an alternative approach in the case of non-normally distributed differences between the measurements.
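As a check on the arithmetic, the primary limits of agreement can be reproduced from the reported mean difference and SD. The symbols $\bar{d}$ (mean difference) and $s_d$ (SD of the differences) are introduced here for illustration, and the multiplier 1.96 is the conventional choice, which is consistent with the reported limits:

```latex
\[
\text{LoA} = \bar{d} \pm 1.96\, s_d
           = 2.0 \pm 1.96 \times 1.9
           \approx 2.0 \pm 3.7 ,
\]
\[
\text{lower limit} \approx -1.7 , \qquad \text{upper limit} \approx 5.7 .
\]
```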

Association between VAS and NRS
A scatterplot is used to descriptively illustrate the association between the two instruments. Importantly, association is not the same as agreement, as agreement might be poor even when the correlation is close to perfect. We found that NRS is generally scored higher than VAS and that this is particularly so towards the lower end of the scales, whereas towards the higher end of the scales this discrepancy diminishes (see Fig. 1C and D). It appears that either one instrument overestimates or the other underestimates pain intensity. Whether NRS is overestimating or VAS is underestimating the pain intensity is impossible to say with the data obtained in this study. In our study, we found that in 95% of cases the difference between NRS and VAS would lie between −1.7 and 5.7. In most cases NRS would lie between 0 and 5.7 points above VAS, so that NRS overall estimates pain intensity to be higher than VAS does. In a few cases, however, NRS would lie up to 1.7 points below VAS.
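The distinction between association and agreement can be made concrete with a small Python sketch. The data are hypothetical and deliberately extreme: one instrument always reads exactly two points higher than the other, so the correlation is perfect while the instruments never agree. The helper `pearson_r` is written out by hand for illustration.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores, NOT the study data: NRS is always VAS + 2
vas = [0, 1, 2, 3, 4, 5, 6, 7, 8]
nrs = [v + 2 for v in vas]

r = pearson_r(vas, nrs)
bias = statistics.mean([n - v for n, v in zip(nrs, vas)])

print(r)     # perfect linear association
print(bias)  # yet a constant 2-point disagreement
```

A correlation coefficient therefore says nothing about whether two instruments return the same number for the same patient, which is why the Bland-Altman limits of agreement are reported alongside it.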

Minimal clinically important difference (MCID) in VAS and NRS
The MCID is the change in pain intensity that is regarded as clinically relevant. Some studies argue that the MCID varies between acute and chronic pain conditions and that the degree of change is not uniform throughout the scales, with higher levels of pain requiring greater pain relief to be significant (Olsen et al., 2017; Bird and Dickson, 2001; Kelly, 2001; Farrar et al., 2001). Regardless of this, there is some literature supporting a change of 20% between two measurements in time as clinically significant in chronic pain. This is supported for both the VAS and the NRS (Haefeli and Elfering, 2006). A change of two on the 11-point NRS and 20 on the 101-point VAS would thus be regarded as clinically significant. The MCID is recommended to be applied in a population similar to that in which it was estimated (Salas Apaza et al., 2021). To our knowledge, there is little evidence on the MCID in older adults with pain; therefore, in this study we have applied MCIDs estimated in a general pain population. As the literature regards two points to be of minimal clinical significance for both VAS and NRS, a difference of <2 is considered not clinically significant and is expected to occur. Our bias of 2.0 is of the same order. However, 95% of the differences are expected to lie in an interval of length 7.4 points, as opposed to the accepted interval of length four, i.e., −2 to +2. This is a range that cannot be accepted in a clinical setting.
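The comparison between the limits of agreement and the MCID can be written as a simple check in Python. This is a sketch under the assumption that agreement is judged acceptable only when both limits fall within ±MCID; the function name `agreement_acceptable` is hypothetical, and the inputs are the values reported above.

```python
def agreement_acceptable(lower, upper, mcid):
    """True if both limits of agreement fall within -mcid to +mcid."""
    return -mcid <= lower and upper <= mcid


# Reported values from the primary analysis
lower, upper = -1.7, 5.7
mcid = 2  # two points on the 0-10 scale

loa_width = upper - lower       # length of the interval between the limits
accepted_width = 2 * mcid       # length of the interval -2 to +2

print(round(loa_width, 1), accepted_width)
print(agreement_acceptable(lower, upper, mcid))
```

With the reported limits the check fails because the upper limit exceeds the MCID, even though the lower limit on its own lies within the accepted range.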

Usability of VAS and NRS
The NRS and VAS instruments are patient-reported outcomes where the patients subjectively rate their pain intensity. The degree to which these instruments accurately or truly measure pain is difficult to determine. When attempting to establish this, one might include parameters that test both nociceptive and psychological aspects of pain. As the consensus definition of pain emphasises the sensory and emotional experience of pain, it is acceptable to say that the true pain intensity is what the patient says it is. This, then, creates a problem when two subjective measurements do not agree with each other. The NRS and VAS instruments are separately validated for measuring pain intensity in older adults (Williamson and Hoggart, 2005; Wood et al., 2010), but to our knowledge, there is little research on agreement between these two instruments in this population. In our study, we found that they agree poorly with each other.
Other studies comparing the two instruments have focussed on the practical issues of their use. Some studies argue that the failure rate is greater for the VAS than for the NRS in older adults (Dworkin et al., 2005; Peters et al., 2007; Ferrell et al., 1995; Herr et al., 2004; Herr and Garand, 2001). In our study, the failure rate was low, as demonstrated by one outlier and two participants with missing responses on the NRS.

Strengths and limitations
In this study, we investigated whether the VAS and NRS instruments can be used interchangeably when measuring pain intensity in older adults. This is based on the assumption that the two instruments are equally good at measuring true pain intensity in this population. We addressed the instruments in relation to each other and found poor agreement between them. With the data collected in this study, we cannot ascertain which instrument is superior to the other. Bland and Altman (Bland and Altman, 1999) suggest that repeated measures could assist in determining which instrument is more precise. Other studies have found that both VAS and NRS have good repeatability or test-retest reliability in other populations (Peters et al., 2007; Alghadir et al., 2018; Euasobhon et al., 2022), but our dataset did not contain any replicate observations and we could therefore not perform this analysis. However, if one method has low repeatability, this will affect agreement between the methods (Bland and Altman, 1999).
We see it as a strength of this study that the data were collected in a general home-dwelling population of older adults and not in a population recruited on the basis of a specific condition or diagnosis. By using a general population, we had a greater chance of obtaining data from the whole range of the scales (Heller et al., 2016). In our analysis, we found that the NRS scores were normally distributed, while the VAS scores were not. Normality of the measurements themselves is not a prerequisite for Bland-Altman analysis (Bland and Altman, 1986; Bland and Altman, 1999). Further, the differences between the two measurements, which are the dependent variable in the Bland-Altman analysis, did not show any clear skewness even though they were not normally distributed according to the Kolmogorov-Smirnov test. However, as pointed out by Bland and Altman (Bland and Altman, 1999), deviations from normality in this context are not a very serious issue, as non-normal distributions are also likely to have about 95% of observations within two SDs of the mean. Finally, we have included the non-parametric limits of agreement to address potential shortcomings of parametric tests applied to non-normally distributed data. To our knowledge, this is the first study assessing agreement between the VAS and NRS instruments in measuring pain intensity in older adults, and further studies should be conducted with larger samples including test-retest data.

Conclusion
In this study we found poor agreement between the VAS and NRS instruments, suggesting that the two instruments should not be used interchangeably when assessing pain in older adults.