Risk factors for failure of the 36 mm metal-on-metal Pinnacle total hip arthroplasty system

Aims To determine ten-year failure rates following 36 mm metal-on-metal (MoM) Pinnacle total hip arthroplasty (THA), and identify predictors of failure. Patients and Methods We retrospectively assessed a single-centre cohort of 569 primary 36 mm MoM Pinnacle THAs (all Corail stems) followed up since 2012 according to Medicines and Healthcare products Regulatory Agency recommendations. All-cause failure rates (all-cause revision, and non-revised cross-sectional imaging failures) were calculated, with predictors for failure identified using multivariable Cox regression. Results Failure occurred in 97 hips (17.0%). The ten-year cumulative failure rate was 27.1% (95% confidence interval (CI) 21.6 to 33.7). Primary implantation from 2006 onwards (hazard ratio (HR) 4.30; 95% CI 1.82 to 10.1; p = 0.001) and bilateral MoM hip arthroplasty (HR 1.59; 95% CI 1.03 to 2.46; p = 0.037) predicted failure. The effect of implantation year on failure varied over time. From four years onwards following surgery, hips implanted since 2006 had significantly higher failure rates (eight years 28.3%; 95% CI 23.1 to 34.5) compared with hips implanted before 2006 (eight years 6.3%; 95% CI 2.4 to 15.8) (HR 15.2; 95% CI 2.11 to 110.4; p = 0.007). Conclusion We observed that 36 mm MoM Pinnacle THAs have an unacceptably high ten-year failure rate, especially if implanted from 2006 onwards or in bilateral MoM hip patients. Our findings regarding implantation year and failure support recent concerns about the device manufacturing process. We recommend all patients undergoing implantation since 2006 and those with bilateral MoM hips undergo regular investigation, regardless of symptoms. Cite this article: Bone Joint J 2017;99-B:592–600.


Adverse reactions to metal debris (ARMD) have contributed to the high failure rates observed with stemmed metal-on-metal total hip arthroplasties (MoM THAs). [1][2][3][4][5] These implants are no longer used and patients with these devices currently require regular surveillance. [6][7][8] The Medicines and Healthcare products Regulatory Agency (MHRA) recommends that all patients with stemmed MoM THAs and a 36 mm or larger diameter femoral head undergo annual review, which includes measurement of blood metal ions with or without cross-sectional imaging. 6 The 36 mm MoM Pinnacle THA system (DePuy, Leeds, United Kingdom) was commonly implanted worldwide. 4,[9][10][11] Although this design has not performed as poorly as other MoM THAs, registries have still observed high ten-year cumulative revision rates of up to 14.6%. 4,11 A number of large single-centre cohort studies examining the 36 mm MoM Pinnacle THA system have also confirmed these high failure rates. 10,[12][13][14] Langton et al 10 recently identified new risk factors for failure of this device. Furthermore, explant analysis at their national retrieval centre demonstrated that a number of Pinnacle implants were manufactured out of their intended specification. 10 However, a subsequent review of the 36 mm MoM Pinnacle THA system within the National Joint Registry (NJR) for England and Wales did not observe a higher revision rate for the identified batch numbers compared with others. 15 Given the potential implications of the findings reported by Langton et al 10 on follow-up regimens for patients with 36 mm MoM Pinnacle THAs, it is important for the newly identified risk factors to be assessed in external cohorts. We previously reported the medium-term outcomes of the 36 mm MoM Pinnacle THA system at our specialist centre, 12 which represents the second largest study of this particular device. 9,10,13,14 The failure rate at eight years (11.1%) was unacceptably high in our initial study with many revisions performed for ARMD. 12 However, risk factors for implant failure were not formally assessed in our previous study or in most other reports. 9,12,14,16 We aimed to determine the cumulative all-cause failure rate at ten years following implantation of the 36 mm MoM Pinnacle THA system and identify risk factors for failure, with specific focus on the newly identified predictors. 10

Patients and Methods
This retrospective single-centre cohort study included patients treated with a primary MoM Pinnacle THA implanted between 2004 and 2010. This study was registered with the hospital board of our institution; however, ethical approval was not required as patients were assessed according to published regulatory authority guidance. 6 All patients received a cementless Corail femoral stem (DePuy).
For the present study only hips receiving a 36 mm diameter Articul/eze femoral head (DePuy) were included (14 hips with 28 mm diameter femoral heads excluded). There were 569 primary 36 mm MoM Pinnacle THAs (504 patients) eligible for study inclusion (Table I). These patients have been reported on previously, with information regarding the implants used, surgical procedure, patient follow-up, and blood metal ion analysis described in detail. 12,17
Patient follow-up. In 2012, the institution's routine follow-up of patients with the 36 mm MoM Pinnacle THA system was adapted according to recommendations from the MHRA. 6 All patients were recalled for clinical assessment, which was performed by a combination of senior (consultant) hip surgeons and specialist arthroplasty advanced nurse practitioners in dedicated MoM hip clinics. This clinical assessment included history, examination, anteroposterior pelvic radiographs, blood metal ion sampling, 17 and completion of the Oxford Hip Score (OHS) questionnaire. 18 The OHS is scored on a scale of 0 to 48 points (0, worst possible joint; 48, healthy joint). 19 All symptomatic hips underwent cross-sectional imaging, with asymptomatic hips undergoing cross-sectional imaging if blood metal ion concentrations were above 7 μg/l (MHRA upper limit). 6 The institution's cross-sectional imaging protocol has previously been described. 17,20 Briefly, this protocol recommends ultrasound for symptomatic patients and metal artifact reduction sequence magnetic resonance imaging (MARS-MRI) for asymptomatic patients. If MARS-MRI was contraindicated then ultrasound was performed. Patients were considered to be symptomatic if, when questioned, they complained of ipsilateral groin or thigh pain, and/or had experienced episodes of hip squeaking, clicking, grating, locking, clunking, other noises, or instability (and/or these could be elicited during clinical examination).
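The imaging triage described above can be sketched as a small decision function. This is only an illustration of the stated rules, not the institution's actual protocol: the function name, its parameters, and the choice to pass a single blood metal ion value (for example the higher of cobalt and chromium) are assumptions made for the example.

```python
from typing import Optional

# MHRA upper limit for blood metal ions quoted in the text (micrograms per litre).
MHRA_ION_LIMIT_UG_L = 7.0


def imaging_modality(symptomatic: bool,
                     max_ion_ug_l: float,
                     mri_contraindicated: bool = False) -> Optional[str]:
    """Return the cross-sectional imaging suggested by the described rules.

    Hypothetical helper: `max_ion_ug_l` is assumed to be the highest blood
    metal ion concentration measured for the patient.
    """
    if symptomatic:
        # All symptomatic hips were imaged; the protocol recommends ultrasound.
        return "ultrasound"
    if max_ion_ug_l > MHRA_ION_LIMIT_UG_L:
        # Asymptomatic hips were imaged only above the MHRA limit,
        # with ultrasound as the fallback when MARS-MRI was contraindicated.
        return "ultrasound" if mri_contraindicated else "MARS-MRI"
    # Asymptomatic with ions at or below the limit: no imaging triggered.
    return None
```

For instance, an asymptomatic patient with a blood metal ion concentration of 9 μg/l would be routed to MARS-MRI under these assumptions, whereas one at 2 μg/l would not be imaged.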
Although no specific OHS threshold was used to distinguish symptomatic patients from asymptomatic individuals, symptomatic patients typically had an OHS of less than 34 of 48 (which represents a fair or poor score; median OHS in symptomatic patients was 30 of 48). 19 All subsequent patient follow-up was dependent on the outcome of the initial assessment. Patients with normal investigations were reviewed annually. Patients with abnormal investigations not undergoing revision were seen more regularly and underwent repeat investigations, typically every three to six months. Revision surgery was recommended based on the overall results of the clinical assessment, with no single investigation used to make decisions regarding revision. Revision surgery for ARMD was performed in certain asymptomatic patients based on abnormal investigations. Exposure variables and outcomes of interest. Data for this study were extracted from the institution's prospectively maintained clinical database. Exposure covariates of interest recorded in this database included patient demographics (gender, age, MoM hip laterality and hip diagnosis), and details related to the primary surgery (year of surgery, surgeon grade and volume, approach, acetabular liner size and femoral head offset).
The study outcome of interest was all-cause failure. The definition for failure included revision surgery performed for any indication, and non-revised hips with cross-sectional imaging evidence of ARMD. Patients were considered to have imaging evidence of ARMD if they had a pathological effusion or pseudotumour as previously described. [21][22][23] Pathological effusions were diagnosed when the depth of fluid exceeded 15 mm at the anterior joint line on ultrasound 22 and the patient had raised blood cobalt concentrations (above 3.57 μg/l as recently described). 17 A pseudotumour was defined as a cystic, solid or mixed mass in continuity with the hip joint but extending beyond the confines of the anatomical joint. 22,23 Lesions meeting these criteria were diagnosed as pseudotumours regardless of the wall thickness or blood metal ion concentrations. Hips with imaging evidence of ARMD remained under regular surveillance due to clinician and/or patient preference. This included circumstances where patients had decided against revision surgery whilst asymptomatic, or where revision surgery had been recommended but patients were unfit to undergo further surgery.
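The composite endpoint defined above can be written out as a short classification sketch. The thresholds (15 mm effusion depth, 3.57 μg/l cobalt) come from the text; the `Hip` record and the function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

EFFUSION_DEPTH_MM = 15.0        # anterior joint line fluid depth on ultrasound
COBALT_THRESHOLD_UG_L = 3.57    # blood cobalt threshold quoted in the text


@dataclass
class Hip:
    """Hypothetical per-hip record; field names are assumptions."""
    revised: bool
    effusion_depth_mm: float = 0.0
    cobalt_ug_l: float = 0.0
    pseudotumour: bool = False


def has_imaging_armd(hip: Hip) -> bool:
    # A pathological effusion needs BOTH a deep effusion and raised cobalt.
    pathological_effusion = (hip.effusion_depth_mm > EFFUSION_DEPTH_MM
                             and hip.cobalt_ug_l > COBALT_THRESHOLD_UG_L)
    # A pseudotumour counts regardless of wall thickness or metal ion levels.
    return hip.pseudotumour or pathological_effusion


def is_all_cause_failure(hip: Hip) -> bool:
    # Failure = revision for any indication, or non-revised imaging ARMD.
    return hip.revised or has_imaging_armd(hip)
```

Under this sketch a non-revised hip with a 20 mm effusion but a cobalt concentration of 2 μg/l is not a failure, because the effusion criterion requires both conditions.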
Revision surgery was defined as removal or exchange of any component implanted at the primary surgery, including isolated femoral head and acetabular liner exchanges. In all revised cases, the database and hospital records were reviewed to determine the cause of failure. This included review of the pre-revision investigations, the revision operation report, and the results of the microbiological and histopathological analyses performed on intra-operative samples. Hips were considered revised for ARMD if there was intra-operative evidence of pseudotumour, metallosis, synovitis, tissue damage and/or necrosis, combined with histopathological confirmation of ARMD. [23][24][25][26][27] If revision surgery had been performed at another centre, the institution was contacted to obtain the relevant data. All arthroplasty surgeons have an individual NJR profile which tracks the outcomes of any THAs they have performed, even when further procedures are undertaken by other individuals at another centre. 28 Revisions were identified as having occurred elsewhere when the patient had informed us of such procedures, or if additional revisions had been identified from the primary surgeon's individual NJR profile. All failures up until October 2016 were included in this study. No primary 36 mm MoM Pinnacle THAs were awaiting revision surgery at the time of writing. Statistical analysis. This was performed using Stata Version 14.2 (Stata Corp., College Station, Texas). The significance level for all analyses was a p-value < 0.05, with 95% confidence intervals (CI) also used. Cumulative implant failure rates following primary 36 mm MoM Pinnacle THA were estimated using the Kaplan-Meier method. Hips that did not fail were censored on the date of last clinical follow-up, or on the date of death.
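As an illustration of the Kaplan-Meier approach with censoring described above, the following is a minimal pure-Python sketch. It is not the Stata routine used in the study; the function name and the toy inputs are assumptions, and confidence intervals are omitted.

```python
def kaplan_meier_failure(times, events):
    """Cumulative failure (1 - survival) at each distinct failure time.

    times  : follow-up in years for each hip
    events : 1 if the hip failed at that time, 0 if censored
             (last clinical follow-up or death)
    """
    order = sorted(zip(times, events))
    n_at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        failures = 0
        leaving = 0
        # Group all hips sharing the same follow-up time.
        while i < len(order) and order[i][0] == t:
            failures += order[i][1]
            leaving += 1
            i += 1
        if failures > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - failures / n_at_risk
            curve.append((t, 1.0 - survival))
        # Failed and censored hips both leave the risk set after t.
        n_at_risk -= leaving
    return curve
```

Censored hips contribute to the number at risk until their censoring time but never trigger a step in the curve, which is why the estimate differs from the crude proportion of failed hips.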
Cox proportional hazards models (univariable and multivariable) were used to assess the effects of the predictor variables on time to failure. For continuous predictors, fractional polynomial regression modelling was used to assess the assumption of linearity with outcome, with data grouped if the assumption was not satisfied. Proportional hazards assumptions for each predictor were assessed using Schoenfeld's residuals. 29 If these assumptions were not satisfied, the hazard ratios (HR) were examined in various follow-up time groups, with breaks placed at the points of divergence from proportionality. Likelihood ratio tests were used to assess evidence of two-way interactions between gender and other predictors, including acetabular liner size. The final multivariable model was developed using stepwise backward selection methods. In this automated process all potential predictors are initially included in the model with p-values for removing (p ≥ 0.20) and including (p < 0.10) predictors in the final multivariable model set. The analytical software subsequently compares different statistical models and sequentially removes any variables that do not significantly affect the model fit until the final model with the best fit is identified. As a sensitivity analysis, the models were also repeated with all-cause revision surgery used as the endpoint.
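For readers less familiar with the model, the structure described above can be written out as follows. This is the standard form of the Cox model rather than a formula taken from the paper; the symbols $x_j$, $\beta_j$ and the cut point $\tau$ are notation introduced here, and the split at $\tau$ is shown only as an example of examining a hazard ratio in separate follow-up periods.

```latex
% Cox proportional hazards model for covariates x_1, ..., x_p
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)

% If proportionality fails for a binary covariate x (e.g. implantation
% from 2006 onwards), its effect can be estimated separately before and
% after a break at time \tau:
h(t \mid x) = h_0(t)\,\exp\!\left(
  \beta_{\text{early}}\, x\, \mathbf{1}[t \le \tau]
  + \beta_{\text{late}}\, x\, \mathbf{1}[t > \tau]
\right)
```

Under proportional hazards $\beta_{\text{early}} = \beta_{\text{late}}$, so a single hazard ratio summarises the whole follow-up; when they differ, period-specific hazard ratios are reported instead.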
As the effect of year of primary 36 mm MoM Pinnacle THA implantation (pre-2006 versus 2006 onwards) on failure rate was being assessed, further analyses were performed to examine this relationship more accurately. Previous work highlighted potential variations in the 36 mm MoM Pinnacle THA system manufacturing process from 2006 onwards. 10 Given that 2006 was the transition year for this potential variation, Cox regression analyses were repeated with all hips implanted in 2006 excluded. Furthermore, follow-up time from primary implantation differed between groups, with the pre-2006 implantation group having longer follow-up than the 2006 onwards group. To control for this follow-up variability the Cox regression analyses were repeated in a subgroup of hips. This subgroup included all non-failed hips with a minimum of eight years follow-up (with these hips all censored as not failing at eight years) and all failed hips. For this particular analysis, all hips failing before eight years were recorded as failures, but any hips failing after eight years were censored at eight years as not failing. Follow-up was truncated at eight years because after this time there were fewer hips at risk in the 2006 onwards group.
There were 23 non-failing hips (4.0%) in patients who died at a mean time of 2.8 years (0.04 to 7.2) from surgery. The remaining 449 hips that did not fail, in patients who had not died, were followed up for a mean time of 8.0 years (1.0 to 12.0). The median (interquartile range (IQR)) whole blood cobalt and chromium concentrations in these patients at latest follow-up were 2.36 μg/l (IQR 0.83 to 4.30) and 1.46 μg/l (IQR 0.88 to 2.44), respectively. The median post-operative OHS was 43 of 48 (IQR 33 to 48).
Failure rates. The cumulative ten-year failure rate following primary 36 mm MoM Pinnacle THA was 27.1% (95% CI 21.6 to 33.7; 67 hips remained at risk at ten years) (Fig. 1).
[Figure: Graph showing cumulative all-cause failure rate following 36 mm primary metal-on-metal Pinnacle total hip arthroplasty at up to ten years. Shaded area represents the respective upper and lower limits of the 95% confidence intervals (CIs). Risk table indicates the number of hips at risk at two-year intervals, with the corresponding number in brackets detailing the number of failed hips during each two-year interval. Seven failures occurred after ten years of follow-up and therefore are not included in the risk table.]
Year of implantation did not affect failure rates within four years of arthroplasty (HR 1.60, 95% CI 0.47 to 5.44; p = 0.450), with four-year failure rates of 7.5% (95% CI 4.8% to 11.6%) and 4.7% (95% CI 1.5% to 13.8%) for hips implanted from 2006 onwards and before 2006 respectively (Fig. 3). Between four years and eight years from primary surgery, hips implanted from 2006 onwards had a significantly higher failure rate (at eight years 28.3%; 95% CI 23.1 to 34.5) compared with hips implanted before 2006 (at eight years 6.3%; 95% CI 2.4 to 15.8) (HR 15.2; 95% CI 2.11 to 110.4; p = 0.007).
[Figure: Graph showing cumulative all-cause failure rates following 36 mm primary metal-on-metal Pinnacle total hip arthroplasty at up to ten years by implant laterality; 95% confidence intervals have been included in the main text, but are omitted from the figure for clarity.]
[Figure: Graph showing cumulative all-cause failure rates following 36 mm primary metal-on-metal Pinnacle total hip arthroplasty at up to eight years by year of implantation. Shaded area represents the respective upper and lower limits of the 95% confidence intervals (CIs).]
Small liner size did not predict failure. Likelihood ratio testing demonstrated a borderline significant interaction between liner size and gender (p = 0.0493). The effect of liner size on failure rates was further examined in men and women separately, which confirmed that liner size did not predict failure in either gender.

Failures
Risk factors for revision surgery were also examined. The results from the multivariable analysis were the same as for all-cause failure, with bilateral MoM hip arthroplasty (HR 2.02, 95% CI 1.25 to 3.25; p = 0.004), and primary hip implantation from 2006 onwards (HR 2.64, 95% CI 1.12 to 6.20; p = 0.026) identified as predictors of revision.
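The relationship between a reported hazard ratio, its 95% CI and its p-value can be checked with a short calculation. The sketch below assumes the intervals are Wald intervals that are symmetric on the log scale, which is the usual output of Cox regression software but is not stated in the text; the function name is hypothetical.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.96):
    """Two-sided p-value implied by a hazard ratio and its 95% CI.

    Assumes the CI was built as exp(log(HR) +/- z_crit * SE).
    """
    # Standard error of log(HR), recovered from the CI width on the log scale.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values reported above for the all-cause revision model.
p_bilateral = wald_p_from_ci(2.02, 1.25, 3.25)
p_year_2006 = wald_p_from_ci(2.64, 1.12, 6.20)
```

Under this assumption the two implied p-values round to the reported 0.004 and 0.026, so the hazard ratios, intervals and p-values quoted for the revision model are mutually consistent.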

Discussion
The 36 mm MoM Pinnacle THA system was one of the most commonly implanted MoM THAs worldwide. 4,[9][10][11] Recently, Langton et al 10 identified new risk factors for implant failure, which raised serious questions regarding the device manufacturing process since 2006. The clinical implications of the concerns raised by Langton et al 10 regarding potential errors in the manufacturing tolerances are significant; therefore it is important for other institutions, such as ours, to publish their data. The present study represents one of the largest non-registry cohorts of 36 mm MoM Pinnacle THA patients, and has the longest follow-up period. 9,10,13,14 High ten-year failure rates were observed with the 36 mm MoM Pinnacle THA system, with ARMD representing the most common indication for failure. However, the greatest risk of device failure occurred in hips implanted from 2006 onwards, and in bilateral MoM hip arthroplasty patients. Our findings therefore support the newly identified risk factors for 36 mm MoM Pinnacle THA failure. 10
The unacceptably high failure rates of stemmed MoM THAs are undisputed, with these devices no longer implanted worldwide. [1][2][3][4] Whilst some MoM THAs had very high short-term failure rates, such as the ASR XL (DePuy), the 36 mm MoM Pinnacle THA system did not perform as poorly according to registry data. 4,11 However, registries have recognised limitations, including the potential to under-report revisions and the inability to account for non-revised imaging failures. 30,31 We observed high rates of failure (27.1%) and revision surgery (20.3%) at ten years following primary 36 mm MoM Pinnacle THA. Our revision rate was comparable with recent studies from Langton et al 10 (n = 489) and Lainiala et al 13 (n = 430) who reported rates of up to 16.4% at nine years. However, we considered it important to include non-revised patients with imaging evidence of ARMD as failures. 
This more accurately estimates the scale of the problem, given that the threshold for performing revision may differ between surgeons, and patients with failing implants may be unfit or unwilling to undergo further surgery. Indeed, all of the non-revised patients with imaging failures identified in our previous report have now undergone revision surgery. 12
It is suspected that surveillance bias introduced from regular patient follow-up 6,8 has contributed to the high rates of failure and revision reported by this and other studies for the 36 mm MoM Pinnacle THA system. 10,[12][13][14] However, the failure rate continues to increase steadily up to ten years (Fig. 1) rather than plateau as one may expect following early revision of poorly performing implants. Perhaps more concerning are the recent observations that in MoM hip resurfacing patients, failure rates continue to increase up to 15 years from implantation. 32 Stemmed MoM THAs have higher failure rates compared with hip resurfacings, even when bearing surfaces are identical. 2,4,11,33 Therefore we suspect many more MoM THAs will require revision in the second decade following implantation. This itself is likely to be problematic given that the outcomes reported following MoM hip revision surgery for ARMD have also been poor. 34,35
The most important predictor of failure in our cohort was primary implantation from 2006 onwards. However, hips implanted from 2006 onwards only had a significantly higher failure rate between four years and eight years from primary surgery (Fig. 3), with most failures due to ARMD. This reflects the time at which ARMD failures can be expected to occur. 10,36 Langton et al 10 similarly identified an increased risk of failure in 36 mm MoM Pinnacle THAs implanted from 2006 onwards. 
Furthermore, the findings from their national implant retrieval centre, which receives many failed prostheses from numerous institutions, demonstrated that 36 mm MoM Pinnacle THAs manufactured from 2006 onwards were increasingly likely to have lower clearance values than intended by the manufacturer. 10 Low clearance between the femoral head and acetabular liner can lead to edge loading, excessive bearing wear, and ultimately implant failure. 37 Furthermore, low clearance may contribute to higher friction at the bearing surface and/or taper junction, which can also result in increased wear and implant failure. 38,39 In light of the findings from two large clinical cohorts comprising over 1000 36 mm MoM Pinnacle THA systems and the results of the retrieval analysis on failed implants from a wide geographical area, 10 we conclude that the potential variations in manufacturing tolerances are unlikely to be confined to a specific batch of implants and/or region of the country. Our findings regarding implantation year and failure therefore support recent concerns about the 36 mm MoM Pinnacle THA system manufacturing process since 2006. 10 Although large registry cohorts would help support these new observations, registries have recognised limitations. 30,31 In addition, the high failure rates associated with MoM hip arthroplasties were identified by others 5,40 some time before registries detected the problem. 11,41 Therefore, there may be a delay before registries identify a higher failure rate in 36 mm MoM Pinnacle THAs implanted since 2006.
Some regulatory authorities consider patients with bilateral MoM hip arthroplasties to be at increased risk of failure, 8,42 given that these patients can develop bilateral ARMD. [43][44][45] Bilateral failures seem to occur in patients undergoing sequential hip implantation, with suggestion that a delayed type IV immune reaction is responsible. [43][44][45] We observed patients with bilateral MoM hip arthroplasties to be at high risk of implant failure, which confirms recent findings in another 36 mm MoM Pinnacle THA cohort 10 but is contrary to that reported in large hip resurfacing cohorts. 32 Bilateral MoM THAs may have a higher risk of failure compared with bilateral hip resurfacings given recent observations that metal debris generated from taper junctions can be more immunogenic than that from bearing surfaces. 46 Further research is therefore needed regarding the pathogenesis of failure in bilateral MoM hip arthroplasties. However, we consider 36 mm MoM Pinnacle THA patients with bilateral MoM hips at high risk of failure and therefore recommend these patients undergo regular surveillance.
In this study, small acetabular liner size did not predict failure, both in the whole cohort and when subdivided by gender. Although Langton et al 10 recently reported small liner size as a predictor of failure when assessed as a categorical variable, small liner size was not a significant predictor in their all-cause revision model. Assessing liner size in MoM THAs is difficult given it is closely related to gender, much like the complex relationship observed between gender and femoral head size in hip resurfacing. 32 Assessment in 36 mm MoM Pinnacle THAs is further complicated by there being nine different liner sizes available (50 mm to 66 mm), with the smallest size not introduced until 2008 (hence liner size was grouped in this study). Although we do not feel that 36 mm MoM Pinnacle THA patients require risk stratification for clinical surveillance based on liner size, it is recommended that future studies assessing this implant explore the relationship between liner size and implant failure.
Given that we have now validated the risk factors for failure of the 36 mm MoM Pinnacle THA system initially reported by Langton et al, 10 it is important that the many patients worldwide with these implants in situ are managed appropriately. Current MHRA follow-up recommendations suggest that asymptomatic patients with non-ASR stemmed MoM THAs and a 36 mm or larger diameter femoral head only require cross-sectional imaging if blood metal ion concentrations are above 7 μg/l and rising on repeat testing. 6 We suggest this follow-up guidance should be modified so that all patients with 36 mm MoM Pinnacle THAs implanted since 2006 and those with bilateral MoM hips also undergo cross-sectional imaging, regardless of symptoms. However, this additional follow-up is likely to be very costly and resource intensive, 47 given that most patients with these devices are asymptomatic 10,12,13 and most of these prostheses were implanted from 2006 onwards. 4,11
The main study limitation is that hips revised at this institution did not undergo explant analysis. Therefore, unlike in the recent study by Langton et al, 10 we cannot confirm whether or not the specific Pinnacle implants removed were manufactured outside of their intended specification. Although attempts were made to identify all revisions, including using NJR surgeon data, it is possible some hips were revised elsewhere and not reported to the NJR; therefore, our failure rates represent the best-case scenario. The formal radiographic analysis from the initial study 12 was not updated here; therefore it is acknowledged that some non-revised patients may have radiological evidence of implant failure which may eventually require revision. Despite assessing a large cohort at extended follow-up, it is recognised this represents a selected population in a specific geographical area. 
Although registries have limitations, 30,31 analysis of a large unselected registry cohort would complement the single-centre cohort studies performed by ourselves and others. 10 A further limitation is the lack of a control group of patients with other MoM THA designs. Such a comparator group would allow a more accurate assessment of the extent of surveillance bias on implant failure rates in more recent years. [6][7][8] Finally, although our survival analysis was robust, there is a potential for residual confounding.
In conclusion, this large cohort study has demonstrated that the 36 mm MoM Pinnacle THA system has an unacceptably high ten-year failure rate. The failure rate was especially high in hips implanted from 2006 onwards and in bilateral MoM hip arthroplasty patients. Our findings regarding year of implantation and failure rates support recent serious concerns about previously unrecognised changes in the device manufacturing process. 10 We recommend that all patients with 36 mm MoM Pinnacle THA systems implanted since 2006 and those with bilateral MoM hips undergo regular investigation, including blood metal ions and cross-sectional imaging, regardless of symptoms.

Take home message:
-An unacceptably high ten-year failure rate was observed in the 36 mm MoM Pinnacle THA system.
The author or one or more of the authors have received or will receive benefits for personal or professional use from a commercial party related directly or indirectly to the subject of this article. In addition, benefits have been or will be directed to a research fund, foundation, educational institution, or other nonprofit organisation with which one or more of the authors are associated.
This is an open-access article distributed under the terms of the Creative Commons Attribution licence (CC-BY-NC), which permits unrestricted use, distribution, and reproduction in any medium, but not for commercial gain, provided the original author and source are credited.
This article was primary edited by G. Scott.