Oxford knee score 1 year after TKR for osteoarthritis with reference to a normative population: What can patients expect?

Objectives Total knee replacement (TKR) is commonly carried out in patients with advanced osteoarthritis to reduce pain and increase mobility. On average, 84% of patients rate their outcome as satisfactory, but some (up to 44% by some estimates) continue to experience debilitating pain. The study aimed to investigate factors affecting pain and function outcomes (using the Oxford Knee Score, OKS) one year after TKR, with normative comparison to a reference population. Design We recruited TKR patients from one hospital (Nottinghamshire, UK), collected pre- and post-operative OKS, and graded radiographs for severity of osteoarthritis (Kellgren and Lawrence, K-L, grade) in a sub-group. We also collected OKS by postal survey from the local area and calculated age- and sex-specific normative scores and z-scores of post-operative OKS (Z-OKS). The associations of K-L grade, pre-operative OKS, age and sex with change in OKS and Z-OKS were analysed. Results There were 536 TKR cases, 91 in the radiographic sub-group and 360 people in the reference cohort. Post-operative Z-OKS was associated with K-L grade (β = 0.368; p<0.001). Change in OKS was associated with K-L grade (β = 0.247; p = 0.003), pre-operative OKS (β = −0.449; p<0.001), age (β = 0.276; p = 0.001) and sex, with women showing greater improvement (β = −0.213; p = 0.008). On average, TKR patients returned to 74% of their normative age- and sex-adjusted OKS, with younger women achieving the worst outcomes. More severe radiographic osteoarthritis predicted greater improvement and a better post-operative outcome when compared to the normative population. Conclusion This study identified factors and provided normative OKS data intended to guide clinicians in counselling patients regarding likely surgical outcomes. This could help manage patients' expectations, aid decision making and increase the post-surgery satisfaction rate.


Introduction
Knee osteoarthritis is one of the leading causes of pain and disability worldwide [1]. Total knee replacement (TKR) is a cost-effective treatment that improves pain and function in most patients with osteoarthritis [2,3]. The procedure is indicated when initial management through education, exercise, weight loss and analgesics failed to control symptoms. However, studies estimated that up to 44% continued to experience disabling pain after surgery [4,5]. The National Joint Registry suggested that 84% of patients were satisfied with their TKR, indicating at least 'good' for their satisfaction level [6].
Pre-operative expectations heavily influence satisfaction level [7]. In the ranking of patients' expectations of TKR, relieving pain and improving walking ability have been rated the most important [8]. As the increasing prevalence of knee osteoarthritis creates greater demand for TKR in the future, it is crucial to identify factors that may predict post-operative pain and function outcomes [9]. These could be discussed with patients during pre-operative counselling to aid decision making and to manage patient expectations about what 'success' would look (feel) like.
The overall aim of this study was to investigate factors that affect pain and function outcomes, measured by Oxford Knee Score (OKS), 1 year after TKR for osteoarthritis, and to compare these outcomes to a reference population. To achieve this, the study had three objectives. 1. Identifying factors that are associated with OKS outcomes 12 months after TKR. 2. Comparing 12-month post-operative OKS with that of a population in the same county; this comparison is based on OKS, which measures self-rated pain level and ability to carry out tasks specific to the knee. 3. Identifying which factors are associated with a worse, similar or better outcome post-operatively as compared to the reference population. From a myriad of potential predicting factors, this study focuses on radiographic grading of osteoarthritis, pre-operative OKS, age and sex.

Methods
This cohort study analyses outcomes of TKR patients at a single site in the UK (Nottingham City Hospital) with a reference population from the local (Nottinghamshire) area.
The TKR study was performed as a service evaluation of anonymised data without the need for formal ethical approval (approved by the Nottingham University Hospital NHS Trust). Ethical approval was obtained from the University of Nottingham Faculty of Medicine and Health Sciences Research Ethics Committee for the reference cohort study.

TKR cohort
In the TKR cohort, patients whose indication for surgery was osteoarthritis were included consecutively (2008-2010). For patients who had unilateral TKR performed on both knees within the time period, one knee was randomly selected for data analysis. These data were collected prospectively, including sex, height, weight, date of birth, date of operation, side of surgery, primary diagnosis, comorbidities, type of operation, pre-operative OKS questionnaire and the American Society of Anesthesiologists (ASA) score, with a follow-up OKS questionnaire 12 months after operation.
Radiographic sub-group. For radiographic assessment, a sub-group of the TKR cohort was selected to compare patients who did not improve with those who improved substantially post-operatively. Selection was based on change in OKS (pre-operative to 12 months post-operative). Given that the minimal clinically important difference (MCID) for OKS is 5 points [10], those whose change in OKS was below the MCID were categorised as 'Non-Improvers' (n = 98). Those with a positive change in OKS of ≥25 points were categorised as 'Top Improvers' (n = 99). 50 people were randomly selected from each group for radiograph assessment.
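The grouping rule above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function name and the 'Intermediate' label are hypothetical, and it assumes that a change of exactly 5 points counts as reaching the MCID and that patients whose score worsened fall into the 'Non-Improvers' group.

```python
MCID = 5            # minimal clinically important difference for OKS (from the text)
TOP_THRESHOLD = 25  # threshold for 'Top Improvers' (from the text)


def classify_improvement(pre_oks, post_oks):
    """Return a sub-group label from the change in OKS (post minus pre).

    Assumptions (not stated in the source): a change of exactly MCID is
    treated as an improvement, and a negative change is a 'Non-Improver'.
    """
    change = post_oks - pre_oks
    if change >= TOP_THRESHOLD:
        return "Top Improver"
    if change < MCID:
        return "Non-Improver"
    # Hypothetical label for patients between the two thresholds,
    # who were not sampled for radiographic grading.
    return "Intermediate"
```

For example, a patient moving from 10 to 40 points would be labelled a 'Top Improver', while a patient moving from 20 to 22 points would be labelled a 'Non-Improver'.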
The Kellgren and Lawrence (K-L) system was used to evaluate severity of radiographic osteoarthritis, being one of the most commonly used [11], ranging from 0 (no OA) to 4 (severe). Each radiograph was compared against a standard radiographic atlas. Grading was based on the presence of osteophytes and joint space narrowing. Anterior-posterior, weight-bearing radiographs, taken within 6-months prior to TKR, were used to assess the tibio-femoral joint. Radiographs were assessed by a single observer (GSF) blinded to outcome scores. Inter-observer error was assessed by a second observer (BES) assessing 40/100 randomly selected films (20 per group), blinded to initial scores. Differences were assessed using the kappa statistic.
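As an illustration of the agreement check, the sketch below computes an unweighted Cohen's kappa for two sets of K-L grades. The source does not say whether a weighted or unweighted kappa was used, so the unweighted form is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of grades."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of films on which both gradings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each observer's marginal counts.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both observers used a single identical grade throughout;
        # kappa is undefined, so perfect agreement is reported by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

With the made-up grades `[3, 3, 4, 4]` and `[3, 3, 4, 3]`, observed agreement is 0.75 and chance agreement is 0.5, giving a kappa of 0.5.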

Reference cohort
For the reference cohort, data were collected from participants by postal questionnaire survey. From a total of 25,695 postcodes for those living in Nottinghamshire and the City of Nottingham, 2500 postcodes were randomly selected and a third allocated to each of the 3 available age categories (18-44, 45-69 and ≥70 years). For each postcode a name and address that met the required age group was randomly selected to receive the survey. The self-reported survey included age, sex, height, weight, co-morbidities and OKS questionnaire. From those who responded to the postal questionnaire (between September 2014 and April 2015), only those aged ≥30 years were included to match demographics of the TKR cohort.
All randomisation was undertaken using a random number generator by a blinded statistician.

Outcome data
Oxford Knee Score (OKS), the outcome measure, measures pain and function (activities of daily living) of the knee, reported by patients. This questionnaire is routinely given to patients before and after TKR [10,12]. The OKS consists of 12 questions, each rated at five levels, ranging from 0 (severe) to 4 (none). The scores were totalled to give an overall score, where 0 is the worst possible score and 48 is the best possible score. The Patient Reported Outcome Measures (PROMs) programme guideline was followed regarding missing responses [13]. Questionnaires were considered complete when there were a maximum of 2 missing responses; the mean of the remaining responses was imputed for missing values.
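The scoring and missing-response rule described above can be sketched as follows. This is a minimal illustration rather than the PROMs programme's own implementation: the function name is hypothetical, missing items are represented as `None`, and no rounding is applied to the imputed total because the source does not specify any.

```python
def score_oks(responses, max_missing=2):
    """Total a 12-item OKS questionnaire (each item 0-4, None if missing).

    Returns the total (0 = worst, 48 = best), or None when more than
    `max_missing` items are unanswered and the questionnaire is incomplete.
    """
    if len(responses) != 12:
        raise ValueError("the OKS has exactly 12 items")
    answered = [r for r in responses if r is not None]
    if any(r not in (0, 1, 2, 3, 4) for r in answered):
        raise ValueError("each item must be scored 0 to 4")
    missing = 12 - len(answered)
    if missing > max_missing:
        return None
    # Impute each missing item with the mean of the answered items.
    item_mean = sum(answered) / len(answered)
    return sum(answered) + item_mean * missing
```

For instance, ten items scored 3 with two items missing give 30 + 2 × 3 = 36, whereas a questionnaire with three missing items is treated as incomplete.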
There were two outcome variables: change in OKS (pre-operative to post-operative) and 12-month post-operative standardised Z-score of OKS (Z-OKS). The former shows the 12-month change since TKR and the latter indicates how the post-operative OKS compares to age and sex matched reference population.

Data analysis
All three cohort demographics were described: providing mean/SD for the continuous variables (age, BMI); median/inter-quartile range (IQR) for OKS (pre-operative/post-operative); and count/percentages for categorised data (sex, ASA, osteoarthritis status).
For the first objective, the change in OKS between the two time points for the TKR cohort was described (median/IQR) and categorised by age (<60 years, 60-69 years, ≥70 years), sex (male/female), and K-L grade (0-1, 2 and 3-4). OKS at the two time points were compared using the Mann-Whitney U test. Next, multivariate regression analyses were undertaken: change in OKS as outcome variable; independent variables were age, sex, pre-operative OKS and K-L grade. The model was run twice, with and without K-L data (as only the sub-group had these data).
For the second objective, analyses were made with reference to an age and sex matched reference cohort. First, the post-operative OKS data were described: categorised by age, sex, and pre-operative OKS category (dichotomised, high/low, around its median value). Then the post-operative OKS was compared to OKS of the age and sex matched reference population (post-operative OKS of TKR cohort divided by OKS of matched reference population, expressed as a percentage). Next, the post-operative OKS was used to calculate individual post-operative Z-OKS. The reference cohort was split into six groups by sex and age (male/female; <60, 60-69, ≥70 years) and the mean and standard deviation of their OKS calculated. Z-OKS was the difference between TKR cohort post-operative OKS value and reference cohort OKS mean, divided by the standard deviation of the reference cohort (age and sex stratified). A negative z-score indicated the patient was below the equivalent figure for the age and sex matched reference group and vice versa. The proportion of the TKR cohort with a positive Z-OKS was calculated.
For the third objective, multivariable regression analysis was conducted with post-operative Z-OKS as outcome variable; K-L grade and pre-operative OKS as independent variables. Age and sex could not be included in this model because Z-OKS is already standardised by age and sex. This analysis was to investigate how pre-operative OKS and K-L grade predict better/worse outcomes in men/women of different ages.
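The normative comparison above can be sketched in Python. This is an illustrative sketch under stated assumptions: the function names and the `(sex, age, oks)` row layout are hypothetical, and the sample standard deviation is used because the source does not say which form was calculated.

```python
from statistics import mean, stdev


def age_band(age):
    """Map an age in years to the three bands used in the text."""
    if age < 60:
        return "<60"
    if age < 70:
        return "60-69"
    return ">=70"


def reference_stats(reference_rows):
    """Mean and SD of OKS for each sex and age-band stratum.

    `reference_rows` is an iterable of (sex, age, oks) tuples; each
    stratum needs at least two rows for the SD to be defined.
    """
    groups = {}
    for sex, age, oks in reference_rows:
        groups.setdefault((sex, age_band(age)), []).append(oks)
    return {key: (mean(vals), stdev(vals)) for key, vals in groups.items()}


def z_oks(post_oks, sex, age, stats):
    """Z-OKS: (post-operative OKS - stratum mean) / stratum SD."""
    ref_mean, ref_sd = stats[(sex, age_band(age))]
    return (post_oks - ref_mean) / ref_sd


def percent_of_reference(post_oks, reference_oks):
    """Post-operative OKS as a percentage of the matched reference OKS."""
    return 100.0 * post_oks / reference_oks
```

As a check against the reported figures, `percent_of_reference(26, 47)` gives roughly 55%, the value quoted for women under 60 years. A negative `z_oks` result means the patient scored below the matched reference stratum.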
Statistical analyses were conducted using SPSS (version 22.0, IBM). A p-value <0.05 was considered significant.

Results
TKR study cohort: Data from a total of 869 TKR patients were obtained during the study period. 605 (69.6%) patients had completed questionnaires at both time points. After exclusion (37 had TKR for indications other than osteoarthritis; 32 had bilateral knee TKR), 536 (88.6%) patient knees were analysed. The cohort characteristics are presented in Table 1. The patients' age ranged from 37 to 92 years (mean 70 years). There was a higher representation of females (60%) and obese patients (59%). According to the ASA score, most were classified as having mild systemic disease. All patients underwent a cemented TKR. The median OKS improved 17.6 points between pre- and post-operative time points (Table 2). Patients were dichotomised as 'low' pre-operative OKS if the score was less than the median value (<17 points) (n = 269), or 'high' if it was ≥17 points (n = 267).
Radiographic sub-group: Out of the 100 patients randomly selected from the 'Non-Improver' and 'Top Improver' groups, 8 were excluded because their radiographs were non-weight-bearing or taken more than six months before surgery, and 1 was excluded because the radiograph showed signs of rheumatoid arthritis. Thus 91 radiographs were assessed. These patients were similar to the TKR cohort in terms of age, BMI and ASA, with slightly more females (76%) (Table 1). There were 55 (60%) patients in the low pre-operative OKS group and 36 (40%) in the high group. The intra-observer reliability scores showed moderate reproducibility for K-L grade (Kappa value of 0.750). The inter-observer reliability scores showed high reproducibility (Kappa value of 0.722).
Of those with osteoarthritis, respondents after TKR reported lower median OKS (29.8) than those without TKR (39.0) (p<0.001). Table 2 shows the change in median OKS from 16.4 to 34.0 between pre-operative and post-operative time points for the TKR cohort, which was statistically significant (95% CI 14.0, 15.7; p<0.001). The change in OKS data were negatively skewed, indicating that most patients experienced improvement after the operation. 40 patients (7.5%) reported lower (worse) OKS post-operatively (negative change in OKS). 58 patients (10.8%) achieved no clinical improvement (change in OKS between 0 and 5).

Change in OKS
These data were then categorised by age, sex and K-L grade (Table 3). The median and interquartile range (IQR) data are presented for pre-operative and post-operative OKS as well as the change between the two scores in Table 3, with totals for the TKR group. This shows that whilst all three age groups had similar pre-operative median OKS, the younger age group performed worst on average (lowest scores) post-operatively, with the lowest change in OKS (11 points). Females' median pre-operative OKS was lower than males' (16 vs 18 respectively), but a slightly higher change in OKS (17.0 vs 14.3 respectively) resulted in similar median post-operative OKS. Patients in the lowest K-L grade group had higher median pre-operative OKS and lower median post-operative OKS. This group had a negative change in OKS, which was also the lowest median change (worse than before). There were small numbers in this group (n = 9).
These data were then analysed in a multivariate regression model (outcome variable: change in OKS) with age, sex and pre-operative OKS as independent variables (n = 536). This model explained 11.8% (11.4% adjusted) of the variation in the change in OKS (p<0.001). K-L grade was then added to the model (n = 91); it explained 49.8% (47.4% adjusted) of the outcome variation (p<0.001) (Table 4). All four variables showed a statistically significant relationship with change in OKS (Table 4). Pre-operative OKS showed the biggest adjusted effect size (p<0.001), with lower pre-operative OKS associated with higher change in OKS (improvement). Sex also showed a negative relationship, with women showing a greater change in OKS (p = 0.008). Sex was not statistically significant in the initial model but was in the radiographic sub-group analyses. Age and K-L grade both showed positive associations, so younger patients and those with lower K-L grade showed the least improvement in OKS, and vice versa (p = 0.001 and p = 0.003, respectively). However, the difference in median change in OKS between the three age categories did not exceed the MCID of 5 points (4.5 points between <60 and 60-69 years; 5.0 points between <60 and ≥70 years).
Table 3 Median pre-operative, post-operative and change in OKS by age, sex and K-L score for TKR cohort (n = 536) and the radiographic sub-cohort (n = 91).

Normative data
The second objective is examined next, exploring how patients fared compared to a reference population. Table 5 shows the median postoperative OKS of the TKR group and the current OKS for the reference cohort, categorised by sex, age (both groups) and pre-operative OKS (TKR group only) categories. On average, TKR patients did not return to the same levels as the Reference group post-operatively, achieving 74% of the Reference group OKS, with women <60 years 'performing' the least well (55% of the score in a 'normative' population: OKS 26 vs 47 points respectively). Table 5 also highlights that median post-operative OKS was lower for patients with low pre-operative OKS across all age and sex groups. This difference was greatest for young (<60 years) males (23 points for low pre-operative OKS and 41 points for high pre-operative OKS).
The Reference Cohort data were used to calculate the Z-OKS (Table 6). The Z-OKS ranged from −7.586 to 0.899, with mean of −1.0 (SD 1.57). By sex, the Z-OKS ranged from −7.586 to 0.872 for females (mean −1.0; SD 1.61) and −5.553 to 0.899 for males (mean −1.1; SD 1.52) (a negative Z-OKS indicates the patient fared less well than the reference cohort). That is, on average females fared less well than males. By age and sex, females aged less than 60 years had the lowest positive results (0.377 compared to the highest rated value of 0.872); while males in the age group of 60-69 had the lowest positive result (0.619 compared to the highest rated value of 0.899). Generally, older patients were more likely to have a better (higher) OKS than the Reference group. Overall, 27% of participants had a Z-OKS>0 (i.e. better than the Reference group). That is, over 70% of patients did not return to an age and sex matched 'norm' post-operatively.
Finally, the third objective is examined. Regression models with Z-OKS as the outcome variable are presented in Table 7, the first for the TKR cohort (n = 536), the second for the radiographic sub-group (n = 91). The first model with pre-operative OKS explained 7.1% (adjusted 7.0%) of the variation in the Z-OKS (p<0.001). The coefficient of determination doubled to 14.2% (adjusted 12.3%) when K-L grade was added to the model (p = 0.001). K-L grade had a positive association with Z-OKS (standardised coefficient = 0.368, p<0.001). This means a higher K-L grade pre-operatively was associated with a higher Z-OKS (that is, more likely to have a higher post-operative OKS than that of the reference cohort data) after TKR. Pre-operative OKS had a weak negative association that was not statistically significant (standardised coefficient = −0.037, p = 0.715).

Discussion
In this study, the majority of patients experienced improvement in OKS one year after TKR, with the median post-operative OKS similar to that in other studies [14-17]. On average, TKR patients in our study achieved 74% of their normative age and sex adjusted OKS levels after surgery, with younger patients (<60 years), particularly women, achieving the worst outcomes. Pre-operative K-L grade is important in determining outcome, with patients with less severe grades achieving the least improvement.

Applications to clinical practice
The normative values in Tables 5 and 6 could help patients form realistic expectations of pain and functional (OKS) outcomes one year after TKR. Some patients experience mild pain and few functional limitations after surgery. However, on average, patients could expect moderate pain and functional limitations. For instance, they could experience moderate pain when standing up from a chair and have moderate difficulty kneeling and getting up again. Overall, one year after having TKR, their OKS level could improve to just over 70% of that experienced by others of similar age and sex.
Table 5 Median post-operative OKS of TKR and current OKS of Reference cohorts, categorised by sex and age for both groups and further categorised by pre-operative OKS for the TKR cohort.

Radiographic findings
This analysis suggested that having more severe osteoarthritis on radiographs was associated with greater improvement and better post-operative OKS outcomes. This positive association is consistent with several studies [18-20]. In Valdes et al., the study (n = 869) used only post-operative WOMAC scores to assess pain and it concluded that those with more severe K-L grade had better post-operative pain outcomes [18]. However, as pre-operative pain scores were not obtained, the degree of improvement was not compared. In another study (n = 478) that included pre-operative scores and used IKSS in its analysis, Dowsey et al. found that those with less severe radiographic changes were less likely to experience major improvement in pain and function 1 and 2 years after operation [19], which concurs with this work. These findings support the use of TKR for patients with more advanced osteoarthritis.
Some studies have shown that local and systemic inflammation is correlated with pain [21]. Moreover, in Lundblad's study comparing rest pain and pain on movement, it was noted that the association between high radiographic osteoarthritis and better post-operative pain outcome only applies to pain related to movement [20]. This pain is more closely related to pathology in the knee joint, which is mitigated by surgery.
However, some studies have found that although those with more severe radiographic changes may experience major improvements, they do not attain the same level of post-operative function as those with less severe changes [5,22]. This study did not fully concur. This study measured post-operative function in two ways, both the absolute post-operative OKS and the Z-OKS, which determines how the patient's OKS compares to a sex and age matched normative population. We found a positive association between K-L grade and both change in OKS and the z-score, thus demonstrating that patients with more severe pre-operative radiographic changes experienced more change in OKS (improvement) (p = 0.003) and also fared better after TKR compared to the Reference group than patients with less severe pre-operative radiographic changes (p<0.001). Overall, 25-30% of patients (depending on sex and age, with females and older patients faring better) had a positive Z-OKS, suggesting their score was higher than for the age and sex matched reference population. This suggests that patients with more severe radiographic findings were more likely to do well post-operatively.
The discrepancy between radiological findings and pain symptoms is widely known [1,23]. A low K-L grade can be associated with high pain; this pain is likely due to a different mechanism than for those patients with high K-L grade and unlikely to be comprehensively addressed by surgery. Pain is influenced by many other factors, including psychosocial factors and central sensitisation, which may not be resolved as easily. In severe osteoarthritis, patients might have progressed to have central pain sensitisation and functional decline, which would reduce the surgical improvement [18-20].

Pre-operative OKS finding
Another key finding is that although a lower pre-operative score is associated with greater improvement in OKS, the post-operative outcome is still worse than for patients with a higher pre-operative score. This is consistent with several studies [5,14]. It is rational that patients who had more OKS impairment experienced more improvement after TKR. However, despite the greater improvement, they still experienced more pain than those who had a better pre-operative pain score [5,18,20]. It is also possible that patients with lower pre-operative OKS experienced a higher change in OKS (Table 4) simply because they had more 'room' to improve their score. This could be explained by the multifactorial nature and subjectivity of pain. Patients themselves give the ratings before and after surgery, based on factors such as their pain perception and thresholds. However, other factors such as their expectations and psychosocial factors would still affect how the ratings are given [4,5]. Central sensitisation was also postulated as a key reason for the high pain levels experienced before and after surgery, as it is not addressed by the operation [5,20]. Therefore, clinicians should consider pre-operative OKS together with psychosocial conditions when evaluating the benefits of TKR for each patient.

Strengths & limitations
This study has several strengths. Measuring outcome at 12 months after operation is a reasonable time point for assessment as studies have shown that most improvement occurs within the first year, and tends to plateau thereafter [8,24]. Also, as the reference population is from the same county, differences between comparison groups in environmental and psychosocial factors are reduced.
Another strength of this study is using both change in score and standardised scores as outcome variables. Change in score is a more accurate reflection of TKR outcomes than post-operative scores alone, as analysed in some studies. By standardising post-operative outcomes with that of a general population, the floor and ceiling effect often observed in fixed-end scales like OKS is reduced. Analysing the outcomes using both variables simultaneously reinforced the findings.
In addition, this study uses OKS which is reliable and used in the wider PROMs programme in England and Wales. OKS can be easily completed by patients themselves, without the influence of clinicians. This removes reporting bias. The questions in OKS are also specific for the knee joint, reducing the influence of unrelated comorbidities on their rating. Nevertheless, some comorbidities such as hip pathology could still affect the individual's ability to carry out some tasks described in the questionnaire. This study also did not explore other factors such as comorbidities, presence of complications, previous trauma and knee surgery. BMI, though collected in the study, has not been included in analysis due to high level of missing data. All of these could act as confounders.
One of the limitations of our study was the validity of the reference cohort as a representation of the county population. As the data obtained for the reference population were based on voluntary participation, they are subject to response bias.
Secondly, it should be noted that most patients in this study had high K-L grade (3 or 4), with only 9 having low K-L grade (0 or 1). This is understandable as TKR is usually performed when symptoms of osteoarthritis are refractory to medical management. Also, in this study, only anterior-posterior radiographs were used to assess tibiofemoral K-L grade of osteoarthritis. Fewer patients had skyline radiographs taken, which could have shown patellofemoral osteoarthritis.
Lastly, our analysis is based on a single site, which may limit the generalisability of study findings. All study subjects were from one hospital, which can be subject to selection bias. Yet, the patient demographics in this study were similar to those reported by the National Joint Registry for TKR performed in 2010 in terms of age, proportion of sex and ASA score [25]. Although data for the TKR cohort were prospectively collected, around one third of participants were lost over the follow-up period, so non-response bias was present.

Conclusion
A novel finding from this study is the comparison of outcomes in those who had TKR with the general population in the same county. The data stratified by age and sex could help patients formulate postoperative expectations more easily. The mismatch between expectations and outcomes has been known as one of the factors for poor satisfaction, thus this information could help increase satisfaction rate after procedure [7].
This study could help guide clinicians in counselling patients on management options for osteoarthritis, including TKR procedure. In addition to factors analysed in this study, various factors need to be considered collectively for holistic management of each individual. If patients who are less likely to benefit from TKR are identified, different treatment options could be fully explored. By providing information to help patients form more realistic expectations of outcomes after surgery, patients could give more informed consent, and potentially increase their satisfaction rate.

Author contributions
YY: Contributed to the study design, literature review, statistical analysis and writing of manuscript, responding to reviewers, reviewing final drafts. GF: Contributed to the grading of radiographs and drafting of manuscript, reviewing final drafts. HS: Contributed to data collection, analysis of the reference cohort and drafting of manuscript, reviewing final drafts. BS: Contributed to conception and design of study, grading of radiographs (for inter-observer error check), data interpretation and drafting of manuscript, responding to reviewers, reviewing final drafts. KE: Contributed to the study design and statistical analysis, interpretation and writing of manuscript, responding to reviewers, reviewing final drafts.

Role of funding source
This study was funded by the University of Nottingham and by the Centre for Sport, Exercise and Osteoarthritis Research Versus Arthritis (Grant reference 21595).

Declaration of competing interest
The authors have no conflicts of interest.