MRI-Based Shoulder Osteoarthritis Severity score (SOAS) is predictive of functional improvement following total shoulder arthroplasty

Background: Total shoulder arthroplasty (TSA) is an increasingly common treatment for end-stage glenohumeral osteoarthritis. Current established radiographic measures and classification systems do not predict patient-reported outcomes from TSA. We hypothesized that the MRI-based Shoulder Osteoarthritis Severity (SOAS) score would correlate with subjective improvement following TSA.
Methods: Patients undergoing TSA with preoperative shoulder MRIs and pre- and postoperative ASES scores with minimum 2-year follow-up were included from a prospectively collected institutional shoulder arthroplasty database. SOAS scores, which range from 0 to 100 with an increasing score reflecting greater global degenerative changes, were assessed by two independent reviewers, and Samilson-Prieto grade and Walch classification were scored by one reviewer. Average SOAS scores were correlated with demographic factors and pre-, post-, and change (Δ) in ASES scores. Statistical analysis was performed in Stata with Pearson's correlation, one-way ANOVA, and ROC analysis, with significance defined by p < .05.
Results: Thirty patients (age 63 ± 10 years, 14 females, 16 males) who underwent primary anatomic TSA were included. The intraclass correlation coefficient (ICC) for total SOAS scores calculated by reviewers was 0.91. SOAS score correlated significantly with ΔASES (r = 0.61, p = .0003) and preoperative ASES (r = -0.37, p = .042), with greater MRI-based degenerative change associated with greater improvement after TSA and lower preoperative ASES score. No significant relationship was found between either Samilson-Prieto or Walch classification and SOAS or ASES scores. No significant relationship was found between SOAS scores and age, sex, or BMI. Using an MCID of 21 as previously reported, an ROC curve was generated and found to have an AUC of 0.96. A SOAS score cut-point of 36.25 was found to maximize sensitivity and specificity in predicting whether patients reached the MCID.
Conclusion: We observed a significant positive correlation between the MRI-based SOAS score and functional improvement following TSA measured using change in ASES scores, indicating that patients with more advanced degenerative changes on MRI had greater improvement after shoulder replacement surgery. We found that the correlation strength was highest when comparing total SOAS score to ΔASES as opposed to any individual sub-score.

Shoulder osteoarthritis (OA) is a common cause of debilitating joint pain in an aging population. Despite this, the association between radiographic findings and painful dysfunction is not completely understood. The prevalence of radiographic findings of shoulder OA is as high as 16%-20%; however, their association with clinical dysfunction is not well characterized, as many patients with advanced changes on plain radiographs are asymptomatic. 10 Anatomic total shoulder arthroplasty (TSA) is a treatment option for end-stage symptomatic glenohumeral osteoarthritis in patients with an intact rotator cuff and sufficient glenoid bone stock. 14 In spite of the high prevalence of shoulder OA, there are few established radiographic predictors of OA symptom severity and no well-established predictors for which patients may benefit the most from TSA. Shoulder OA is most commonly diagnosed radiographically, with evidence of joint space narrowing, humeral head and glenoid osteophytes, and increasing glenoid wear as OA progresses. 5
Several established radiographic classification systems for shoulder OA have shown high inter- and intra-observer reliability, including the Samilson-Prieto, Kellgren and Lawrence, Weinstein, and Guyette classifications. 2,3 However, x-ray-based classification systems have not been shown to have an influence on patient-reported outcomes following TSA, 7,8,17 in contrast to the association observed between milder hip and knee arthritis and worse functional outcomes following total hip and total knee arthroplasty. 15,19,20 One limitation of these radiographic classifications is their limited ability to detect the earlier stages of OA, as well as the soft tissue changes and inflammation associated with OA. MRI has been proposed as an imaging modality that may more accurately stage OA, given the additional information it provides on cartilage, inflammation, and injury or degeneration of surrounding soft tissue structures. 4,10 A recent classification system for quantifying shoulder OA using MRI was developed by Jungmann et al in 2019. 6 The Shoulder Osteoarthritis Severity (SOAS) score is a semi-quantitative global assessment of shoulder OA which assesses the rotator cuff, labral-bicipital complex, cartilage, osseous findings, joint capsule, and the peri-acromion on MRI to calculate a composite score between 0 and 100. It has a reported interobserver reliability of 0.96-0.98 and was additionally found to correlate strongly with different Samilson-Prieto grades. 6 The purpose of this study was to evaluate the relationship between MRI-based glenohumeral joint degenerative changes and symptoms before and after shoulder replacement.
Given the high reliability of this MRI-based classification system for detecting the severity of shoulder OA, and in light of recent evidence highlighting the utility of MRI findings in predicting total knee arthroplasty outcomes, 4 we hypothesized that SOAS scores would correlate with patient-reported outcomes before and after anatomic TSA.

Study design and participants
This was a retrospective cohort study using a prospectively collected database of anatomic TSA patients from a single tertiary referral center. All patients completed the American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form (ASES) before shoulder replacement surgery and annually after surgery. We included patients who were treated with primary anatomic TSA between with at least 2-year follow-up and who also had available preoperative shoulder MRI (n = 53). Exclusion criteria were lack of preoperative ASES scores (n = 15), inadequate MRI sequences to evaluate SOAS criteria (n = 4), and bilateral shoulder arthroplasty (n = 4; unable to determine laterality corresponding to ASES survey). All TSAs were performed by 1 of 3 fellowship-trained surgeons.

Study variables
Demographic variables, including age, sex, and BMI, were recorded. Postoperative patient-reported outcomes were assessed using the ASES scores at the latest follow-up at a minimum of 2 years after TSA.
Standardized shoulder radiographs and preoperative MRI were collected within one year prior to surgery. SOAS scores were calculated by two independent reviewers. The SOAS score comprises 20 subscores of individual structures as described by Jungmann et al, including supraspinatus/infraspinatus tendons, subscapularis tendon, tendon retraction, muscle fatty infiltration, muscle atrophy, glenoid labrum, long head of biceps, paralabral ganglia, glenohumeral ligaments, cartilage, bone marrow edema, intraosseous cysts, osteophytes, bone deformity, synovitis, joint effusion, loose bodies, subacromial bursa, acromioclavicular joint degeneration, and acromial deformity, 6 as summarized in Table 1. Total scores were calculated, with a possible range of 0 to 100 and a higher score indicating more severe degenerative changes in the shoulder. The intraclass correlation coefficient between the two reviewers in this study for total SOAS scores was 0.91. Given this consistency, the average of the two reviewers' measurements was used for subsequent analyses. Shoulder radiographs were assessed by one reviewer (MD) for Samilson-Prieto classification based on the size of the inferior humeral head or glenoid osteophyte, 3 and Walch classification was determined using axial MRIs. 1,9

Statistical analysis
Statistical analyses were performed with Stata (version 16.1, StataCorp LP, College Station, TX, USA). Descriptive statistics including mean and standard deviation were calculated. The Pearson correlation coefficient was used to assess the relationship between pairs of continuous variables: SOAS scores and ASES scores, and continuous demographic variables (age, BMI) and SOAS scores. The change in ASES score (ΔASES) was calculated as the difference between the final postoperative ASES score and the preoperative ASES score for each patient. The ΔASES was compared to SOAS score with the Pearson correlation coefficient.
The point-biserial correlation was used to assess for a relationship between patient sex and SOAS scores. One-way analysis of variance (ANOVA) was used to assess the relationship between SOAS scores and either Samilson-Prieto or Walch classification group. As there was only 1 type C glenoid identified, this subject was excluded from the Walch-SOAS analysis. A receiver-operator characteristic (ROC) curve and area under the curve (AUC) calculation were generated to assess how well the SOAS score predicted an improvement in ASES score that reached the minimal clinically important difference (MCID) of 21 points, as previously reported by Tashjian et al. 18 To identify the optimal cut-point on the ROC curve, the Youden index 13 and associated cut-point were calculated in Stata. Correlation coefficients were interpreted by absolute magnitude as follows: r < 0.1, negligible; 0.1 to 0.39, weak; 0.4 to 0.69, moderate; 0.7 to 0.9, strong; and r > 0.9, very strong. 16 Statistical significance was defined as p < .05.
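As an illustration of the correlation step described above, the following is a minimal Python sketch, not the Stata code used in the study. It computes the change in ASES score per patient, the Pearson coefficient against SOAS score, and a descriptive label using the interpretation bands listed in this section. The data values, the function names `pearson_r` and `interpret_r`, and the handling of values falling exactly on a band boundary are assumptions made for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def interpret_r(r):
    """Map |r| to a descriptive band (boundary handling is an assumption)."""
    magnitude = abs(r)
    if magnitude < 0.1:
        return "negligible"
    if magnitude < 0.4:
        return "weak"
    if magnitude < 0.7:
        return "moderate"
    if magnitude <= 0.9:
        return "strong"
    return "very strong"


# Hypothetical values for illustration only (not the study's patients).
soas = [30.0, 35.5, 38.0, 42.5, 47.0, 52.0]
ases_pre = [55.0, 48.0, 50.0, 40.0, 35.0, 30.0]
ases_post = [70.0, 78.0, 85.0, 88.0, 90.0, 92.0]

# Change in ASES: final postoperative score minus preoperative score.
delta_ases = [post - pre for pre, post in zip(ases_pre, ases_post)]

r = pearson_r(soas, delta_ases)
print(f"r = {r:.2f} ({interpret_r(r)})")
```

The same `pearson_r` function applied to a 0/1 coding of sex and the SOAS scores gives the point-biserial correlation, which is numerically a Pearson coefficient with one binary variable. The sketch does not compute p values, which would require a t distribution not available in the Python standard library.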

Results
There were 30 patients (14 female, 16 male; age 63 ± 10 years) included in the study. Figure 1 illustrates the range of MRI findings observed. There were 22 patients with a Samilson-Prieto grade of "3", 5 patients with a Samilson-Prieto grade of "2", and 3 patients with a grade of "1". Based on Walch classification of glenoid morphology, there were 8 type A1 glenoids, 8 A2 glenoids, 7 B1 glenoids, 6 B2 glenoids, and 1 type C glenoid. Increasing SOAS scores showed a negative correlation with preoperative ASES scores (r = -0.37, p = .042) and a positive correlation with ΔASES scores (r = 0.61, p = .0003), but no significant correlation with postoperative ASES scores alone [Fig. 2]. These relationships indicate that more severe degenerative changes on MRI, reflected in a higher SOAS score, are correlated with lower preoperative ASES scores and a greater magnitude of improvement in the ASES after shoulder replacement surgery.
Given the significant correlation between total SOAS scores and ΔASES scores, subcategory analysis was additionally performed using the Pearson coefficient between averaged subcategory scores and ΔASES scores. Scores from the original 20 subsections of the SOAS score were combined into the following grouped subcategories: rotator cuff, labrum/biceps, cartilage, osseous, inflammation, and acromion [Table 2]. Each individual subcategory, except for acromion grading, was found to be moderately positively correlated with ΔASES scores, with worsening pathology in each subcategory associated with greater changes in ASES scores. The strongest relationship was seen with the rotator cuff sub-score (r = 0.55, p = .0018), and the acromion sub-score displayed the weakest correlation (r = 0.28, p = .28).
There was no significant association between Samilson-Prieto grade or Walch classification and total SOAS score or ΔASES scores [Fig. 3].
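The group comparison above relies on one-way ANOVA. As a rough sketch of what that test computes, the Python below derives the F statistic from between-group and within-group sums of squares. The SOAS values per grade are invented for illustration and are not the study data; the function name `one_way_anova_f` is likewise hypothetical. Converting F to a p value needs the F distribution, which the Python standard library does not provide, so that step is left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least 2 non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variance")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical SOAS scores grouped by Samilson-Prieto grade (illustration only).
soas_by_grade = {
    "grade 1": [36.0, 41.5, 44.0],
    "grade 2": [33.5, 39.0, 42.0, 47.5],
    "grade 3": [31.0, 38.5, 40.0, 45.0, 50.5],
}

f_stat, df_b, df_w = one_way_anova_f(list(soas_by_grade.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

An F value near 1 means the spread between group means is about what within-group scatter alone would produce, which is the pattern a non-significant result reflects.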
Based upon the prior study by Tashjian et al, 18 we used an MCID of 21 points and a substantial clinical benefit (SCB) of 37 points 21 (both corresponding to ΔASES) to perform ROC-AUC analysis of the ability of SOAS scores to predict whether patients met MCID and SCB [Fig. 4]. The average SOAS score was significantly higher for patients meeting MCID (42.9 ± 7.4) relative to those who did not meet MCID (31 ± 3.4; p = .0042). The average SOAS score of patients meeting SCB was also significantly higher (44.2 ± 8.4) relative to those patients who did not meet SCB (37.6 ± 6.2; p = .025). Based on the MCID of 21, the Youden index was calculated to be 0.85 and corresponded to a SOAS score cut-point of 36.25. The ROC curve corresponding to an MCID of 21 had an AUC of 0.96. The ROC curve corresponding to an SCB of 37 had an AUC of 0.71.
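The ROC analysis above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the Stata procedure used in the study: patients are labeled as meeting MCID when their change in ASES is at least 21 points (the inclusive comparison is an assumption), candidate cut-points are taken to be the observed SOAS scores, and a patient is predicted to meet MCID when the SOAS score is at or above the cut-point. All data values and function names are hypothetical.

```python
MCID = 21  # minimal clinically important difference in ASES, per the text


def roc_points(scores, labels):
    """(threshold, sensitivity, specificity) for each rule 'score >= threshold'."""
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < t)
        points.append((t, tp / positives, tn / negatives))
    return points


def auc(scores, labels):
    """AUC as the chance a positive case outscores a negative one (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cut_point(scores, labels):
    """Threshold maximizing J = sensitivity + specificity - 1, and that J."""
    t, sens, spec = max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)
    return t, sens + spec - 1


# Hypothetical SOAS scores and ASES changes (illustration only).
soas = [30.0, 32.0, 37.0, 34.0, 38.0, 40.0, 45.0]
delta_ases = [10.0, 15.0, 12.0, 25.0, 30.0, 35.0, 40.0]
met_mcid = [d >= MCID for d in delta_ases]

cut, j = youden_cut_point(soas, met_mcid)
print(f"AUC = {auc(soas, met_mcid):.2f}, cut-point = {cut}, Youden J = {j:.2f}")
```

Swapping the threshold constant for 37 gives the corresponding sketch for SCB. Statistical packages may place cut-points between observed values rather than on them, so reported cut-points need not coincide with any single patient's score.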

Discussion
We observed a positive correlation between SOAS scores and improvement in ASES scores following TSA, as well as a negative correlation between SOAS scores and preoperative ASES scores. Both of these relationships showed that greater evidence of degenerative changes on the preoperative MRI is associated with worse patient-reported outcomes before surgery and greater overall improvement after TSA. In contrast, we did not identify any significant correlations between Samilson-Prieto or Walch classification and either SOAS or ASES scores. When analyzing the SOAS sub-scores, we found that the Pearson coefficient between total SOAS score and ΔASES was higher than that of any of the individual SOAS sub-scores and ΔASES, while the relationship between the rotator cuff score and the ΔASES was the strongest of those for the sub-scores. Taken together, these findings suggest that the comprehensive nature of the SOAS score allows for a reliable, global quantification of shoulder OA which in turn is a moderate-to-strong predictor of improvement in patient-reported outcomes following anatomic TSA.
There are few known preoperative radiographic predictors of functional outcomes following TSA. Leschinger et al found that increasing Walch classification was a negative predictor of medium-term outcomes following TSA. 11 Contrastingly, other studies have found no correlation between radiographic findings of shoulder OA and functional outcomes following TSA. 7,8,17 Ma et al found that there was no association between postoperative glenoid retroversion and humeral head subluxation and outcomes following TSA. 12 In our analysis, we did not find a significant association between the Walch or Samilson-Prieto classifications and either SOAS scores or ASES scores. These results are in contrast to prior studies of knee and hip arthritis, which concluded that increased severity of radiographic OA correlates with improved outcomes following total hip and total knee arthroplasty, and conversely, that patients with the mildest radiographic disease tend to experience lesser functional gains following surgery. 15,19,20 Importantly, in this study, all patients included had undergone TSA for diagnosed glenohumeral OA at the time of selection. Therefore we do not find it surprising that radiographically the majority of patients were found to have advanced arthritis based on preoperative x-rays. However, when applying the comprehensive SOAS score to shoulder OA in this study, we do appreciate a similar trend between severity of disease on MRI and patient outcomes as originally reported in the knee and hip arthroplasty literature.
Unlike the radiographic or CT-based classification systems of shoulder OA, MRI offers the ability to simultaneously evaluate the bony, cartilaginous, soft tissue, and inflammatory contributors to joint pathology. Taken together, this may offer a more accurate assessment of shoulder arthritis severity. That the total SOAS score is more meaningful than the sum of its parts is supported by the fact that the Pearson correlations between total SOAS score and ASES scores, as well as the ICC for total score between independent reviewers in this study, were both higher than the corresponding values seen for any of the SOAS sub-scores. Interestingly, the weakest correlation between any individual sub-score and ΔASES was seen for the category of osseous structures (r = 0.36, P = .053). This finding may further highlight the fact that all of the patients selected for inclusion were already indicated for TSA, based on clinical symptoms as well as plain radiographs. Therefore, it is not surprising that the bony pathology alone appreciated on MRI did not correlate as strongly with overall SOAS score and with functional outcomes as the combined extent of bony, cartilaginous, and soft tissue pathology.
In evaluating the overall joint pathology with the total SOAS score, we were surprised to find that it correlates more robustly with change in ASES scores than with preoperative ASES scores alone. This may be because differences between patients in how they interpret the rating scales in the ASES survey form are partially controlled for when each individual assesses their pain and function both before and after surgery, as opposed to at a single timepoint. In this way, using ΔASES scores as a primary outcome metric may also serve as an internal control for patient differences in questionnaire interpretation.
Given this strong positive correlation between joint pathology noted on MRI and ΔASES, we next sought to determine if SOAS scores could be applied as a predictive tool to determine which patients are likely to meet MCID or experience SCB. Analyzing our cohort of patients, we found that 26 out of 30 had a ΔASES > 21, therefore meeting MCID as defined by Tashjian et al. 18 We found a significant difference in the average SOAS scores between patients meeting MCID and those who did not, and the corresponding ROC curve for MCID of 21 was found to have an AUC of 0.96. A SOAS score of 36.25 maximized the sensitivity and specificity of the ROC curve. Based on these data, we would suggest that a SOAS score of 36 or lower may represent relatively milder organic shoulder joint disease, and in our ROC analysis, this was associated with a high risk of not meeting MCID. In our analysis of patients meeting SCB, we found that 17 of 30 met the chosen ΔASES of 37 or greater, and that the average SOAS score of patients who met SCB was significantly higher than that of those who did not. The corresponding AUC of this ROC curve was found to be 0.71. Therefore, SOAS scores may be moderately helpful in identifying patients who are likely to experience SCB, although less so than in identifying those who are at risk of not meeting MCID. The findings from this study suggest that the condition of the joint based on preoperative MRI may have a role in counseling patients regarding the likely benefit they may expect following anatomic shoulder replacement surgery.
In many clinical settings, obtaining a shoulder MRI prior to shoulder arthroplasty is not standard practice. We recognize that the addition of MRI for every patient undergoing shoulder arthroplasty may represent a significant cost to the healthcare system. Thus, for clinical settings in which MRI is not currently obtained routinely for shoulder arthroplasty candidates, we would advocate for a patient-specific approach to guide the decision whether to obtain preoperative MRI.
There are several limitations to this study. First, the sample size was limited by the number of patients who met the criteria of having undergone preoperative MRI in addition to having pre- and postoperative ASES scores. Previously, a preoperative MRI before shoulder replacement surgery was not obtained routinely at our institution, which also may introduce unrecognized selection bias. Additionally, 23 patients who underwent TSA were excluded from the study, mostly due to a lack of preoperative ASES scores, further increasing the chances of selection bias. The small size of the cohort may have further limited interpretation of correlations between the radiographic Walch and Samilson-Prieto classifications and ASES scores, as we may have been underpowered to detect more subtle associations. Future investigations of SOAS scores would benefit from larger cohorts of patients to further assess their utility in predicting patient outcomes and to compare the predictive power of SOAS to known radiographic classifications with better certainty. Second, while the total SOAS scores demonstrated a very high ICC between independent reviewers (0.91), the ICCs for the sub-scores ranged from 0.44 for labral/biceps pathology to 0.88 for osseous pathology. This is a similar trend to what was noted by Jungmann et al, in which the ICCs for sub-scores were lower than those for total SOAS scores. 6 Third, we only included patients undergoing anatomic TSA for shoulder OA. The relationships identified in this study may not be generalizable to patients undergoing hemiarthroplasty or reverse shoulder arthroplasty, and these areas can be directions of future research.

Conclusion
We observed a significant positive correlation between the MRI-based SOAS score and subjective improvement following TSA measured using change in ASES scores, indicating that patients with more advanced degenerative changes on MRI had greater improvement after shoulder replacement surgery. We additionally found that the total SOAS score is useful in predicting both MCID as well as SCB following TSA. These data suggest that the utility of the SOAS score in predicting both preoperative and change in patient-reported outcomes may lie in its comprehensive assessment of OA, which is not as readily apparent on plain radiographs.

Disclaimers
Funding: Funding from Zimmer Biomet was used to establish the shoulder arthroplasty registry utilized in this study.
Conflicts of interest: C. Benjamin Ma reports research grants from Zimmer Biomet; grants from Anika; personal fees from CONMED Linvatec; grants and personal fees from Histogenics; personal fees from Medacta; grants from Samumed; personal fees from SLACK Incorporated; personal fees from Stryker; and personal fees from Tornier.
Brian Feeley reports research grants from Zimmer Biomet. He is an associate editor for the Journal of Shoulder and Elbow Surgery and receives funding from the NIH. Drew Lansdown has received fellowship-related educational and research support from Arthrex, Inc. and Smith & Nephew.
The other authors, their immediate families, and any research foundation with which they are affiliated have not received any financial payments or other benefits from any commercial entity related to the subject of this article.