Two-center validation of the Oulu resorption score for bone flap resorption after autologous cranioplasty

OBJECTIVE
Autologous bone has been the gold standard of cranioplasty materials for decades. Unique to autologous cranioplasty, bone flap resorption is a poorly understood and unclearly defined complication. Furthermore, it has been unclear whether the resorption process eventually stabilizes over time. Thus, the sufficient follow-up period after autologous cranioplasty is unknown. The Oulu Resorption Score (ORS) is a straightforward classification system for the radiological interpretation of bone flap resorption. The aims of the present study were to evaluate the reliability of the ORS using the intra-class correlation coefficient (ICC) and to assess the temporal progression of the resorption process.


METHODS
We identified 108 consecutive autologous cranioplasty patients treated between 2005 and 2018 in two tertiary referral centers. All 365 head CT scans the patients had undergone were evaluated using the ORS in a blinded, independent two-center setting. Intra- and inter-observer reliabilities were calculated. The ORS was applied to study the temporal progression of the resorption process.


RESULTS
The intra-observer reliability of the ORS was excellent (ICC 0.94, 95%CI 0.93-0.95). Inter-observer reliability was good-to-excellent (ICCs 0.87 and 0.89, 95%CIs 0.84-0.89 and 0.87-0.91, respectively). In scatterplot smoothing analyses, the progression of bone flap resorption appeared to stabilize 12-24 months after cranioplasty.


CONCLUSIONS
ORS is the only validated radiological tool for the standardized analysis of bone flap resorption after autologous cranioplasty. Evaluated using the ORS, the resorption process seemed to stabilize during the first two postoperative years after cranioplasty, suggesting that the sufficient follow-up time after autologous cranioplasty is two years.


Introduction
Autologous bone has been the mainstay cranioplasty material for decades [1]. Although the most recent research suggests a benefit of synthetic implants [1][2][3], which consequently predominate in clinical practice [4], the majority of cranioplasties implanted to date are probably autologous. Additionally, synthetic materials are not available in every neurosurgical practice. Bone flap resorption is a common but poorly understood late complication of autologous cranioplasty. Up to 90% of patients develop radiological signs of bone flap resorption over time [5].
Though radiologically detected in most autologous cranioplasty patients [5][6][7][8][9][10][11][12], the development of bone flap resorption over time is unknown. It is unclear whether the resorption process at some point stabilizes without further progression or continues until re-cranioplasty with a synthetic implant becomes necessary. Thus, the appropriate length of follow-up has not been defined [13] and the patients may undergo unnecessary follow-up imaging.
The definition of clinically relevant resorption and indications of reoperation are unclear, and follow-up protocols vary. Consequently, clinical series cite differing resorption rates ranging from 1% to 32% in different mixed cohorts [14,15]. Bone flap resorption is most common in the pediatric subpopulation: an inversely age-dependent incidence relationship appears probable with clinically relevant resorption occurring in 39% of patients [16].
We previously described the Oulu Resorption Score (ORS) [6], designed to standardize the interpretation of bone flap resorption after autologous cranioplasty from head CT scans. Previously published radiological classification systems are descriptive and restricted due to the lack of independent validation [7][8][9][10][17]. We conducted a blinded independent two-center validation study to examine the reliability of the ORS. Secondarily, we describe the temporal course of the bone flap resorption process to ascertain the sufficient length of follow-up after autologous cranioplasty, which has been unclear to date.

Material and methods
We identified all patients (n = 134) who underwent primary autologous cranioplasty with cryopreserved bone in the Oulu and Turku University Hospitals (OUH and TUH, respectively) between 1st January 2005 and 31st December 2018. Demographic and clinical characteristics as well as data on cranioplasty complications were extracted from the patient records. Patients with unavailable imaging data (n = 26) were excluded from the study and thus 108 (81% of total) patients were included in the study. Patients' head CT scans (n = 365) were obtained from the picture archiving and communication systems of the two hospitals. No patients underwent additional CT scans due to the present study. Instead, clinically justified imaging was used retrospectively in an opportunistic manner.
Both study centres used the same bone flap storage and implantation protocols. Following bone flap extraction during decompressive craniectomy, microbiological samples were taken, any debris was removed, and the bone flap was stored under sterile conditions at −70 °C. Bone flaps with positive microbial cultures were discarded. For cranioplasty, the bone flap was thawed to room temperature in a saline solution and affixed using skull clamps. A detailed description of the protocol has been published earlier [18].
The present study was conducted in accordance with the Declaration of Helsinki on ethical principles for medical research following approval of the study protocol by the ethical review boards and research steering committees of the hospital districts of Northern Ostrobothnia and Southwest Finland (decision numbers 111/2015 (section 304) and TO4/004/17, respectively). No additional consent from the patients was required as no patients were contacted as part of the present study. The present report follows the guidelines for reporting reliability and agreement studies (GRRAS) [19].

Validation process
The 365 head CT scans were pseudonymized and graded in a random order by two independent raters (TKK & JPP; rater 1 & 2, respectively) using the ORS. The ORS is a tri-variable score consisting of the variables Extent, Severity and Focus (Table 1, Fig. 1) [6].
Prior to the validation process, a brief meeting was held between the raters to resolve any ambiguities in the classification process, during which JPP was also shown five CT scans and the respective ORSs that TKK had earlier assigned. The raters were then blinded to each other's ratings and the patients' clinical data. They were allowed to access the same patient's previous CT scans for comparison purposes. To evaluate intra-rater variability, TKK re-scored all the head CT scans ten months after his initial scoring process. The first and second rating rounds by TKK are referred to as Rater 1.1 and Rater 1.2, respectively. The ratings of JPP are referred to as Rater 2. The mean ORS is calculated as the mean of the scores of Rater 1.1, 1.2 and 2.
Cross-platform compatibility was ensured by using different image analysis systems during the validation process. During the first rating round, TKK used neaView Radiology (Neagen Ltd., Helsinki, Finland) and JPP used OsiriX DICOM viewer (v. 10.0, Pixmeo SARL, Bernex, Switzerland) for the OUH data. The TUH data was analysed using Carestream Vue (v. 12.2.0.1007, Carestream Health, Inc., Rochester, NY, USA) by both authors. For the second rating round, TKK re-scored the CT data using the Horos DICOM viewer (v. 3.3.0, Horosproject.org; Nimble Co LLC d/b/a Purview, Annapolis, MD, USA). After developing a routine by analyzing a minimum of 100 CT scans, the time required to score individual head CT scans was recorded for the next 50 consecutive scans by both raters using the Carestream Vue software.

Statistical analysis
Intra- and inter-rater agreement was evaluated using the two-way random, single measures absolute agreement intra-class correlation coefficient [ICC (2,1)]. The ICC is a robust marker of measurement reliability, the values of which were interpreted according to Koo & Li [20]: values < 0.50 indicate poor reliability, values between 0.50 and 0.75 indicate moderate reliability, values between 0.75 and 0.90 represent good reliability, and values > 0.90 indicate excellent reliability. Radiological follow-up time was calculated as the time from cranioplasty to the latest CT scan, and clinical follow-up time as the time from cranioplasty to cranioplasty removal, death or the date of the last entry in the patient records prior to 31st Dec 2019.
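For readers who wish to reproduce the reliability measure outside SPSS, the following is a minimal sketch of ICC (2,1) computed from the two-way ANOVA mean squares, together with the Koo & Li interpretation bands quoted above. It is not the analysis code used in the present study. The function names and the toy scores are hypothetical, and the assignment of the exact boundary values 0.50 and 0.75 to the higher category is an assumption, as the quoted bands leave it open.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """Two-way random, single measures, absolute agreement ICC.

    ratings[i][j] is the score given to subject (scan) i by rater j.
    Raises ZeroDivisionError if every rating is identical.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def koo_li_label(icc: float) -> str:
    """Reliability band after Koo & Li; boundary handling is an assumption."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Hypothetical ORS-like scores for five scans by two raters (not study data).
toy = [[0, 0], [2, 3], [5, 5], [7, 6], [9, 9]]
value = icc_2_1(toy)
print(f"ICC(2,1) = {value:.2f} ({koo_li_label(value)})")
```

Because the absolute agreement form includes the between-raters mean square in the denominator, a systematic offset between raters lowers the coefficient, which a consistency-type ICC would not penalize.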
To visually examine the reliability of the ORS, Bland-Altman plots [21] with limits of agreement and scatter plots with Spearman's rank-order correlation analyses (ρ) were produced. Spearman's ρ is a nonparametric rank-correlation measure that evaluates the monotonicity of the association between two measurements [22]. The one-sample t-test was used to evaluate the significance of the mean biases between the ratings. The temporal change of the ORS was graphically evaluated using locally estimated scatterplot smoothing (LOESS, tricube kernel with 70% of points fitted), which was also applied to approximate the relation evaluated by Spearman's ρ. Finally, a Kaplan-Meier survival analysis was conducted to describe the temporal distribution of cranioplasty removals.
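The quantities underlying a Bland-Altman plot can be illustrated with a short sketch: the mean bias between two sets of ratings, the 95% limits of agreement (bias ± 1.96 SD of the paired differences), and the one-sample t statistic testing the bias against zero. This is an illustration only, not the analysis code of the present study; the names are hypothetical, and the p value is omitted because it requires a t distribution that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev
from typing import NamedTuple, Sequence


class BlandAltman(NamedTuple):
    bias: float     # mean of the paired differences
    lower: float    # lower 95% limit of agreement
    upper: float    # upper 95% limit of agreement
    t_stat: float   # one-sample t statistic of the bias against zero


def bland_altman(a: Sequence[float], b: Sequence[float]) -> BlandAltman:
    """Bias and limits of agreement for paired ratings a and b.

    Assumes at least two pairs and non-identical differences
    (otherwise the standard deviation is zero and t is undefined).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)                       # sample standard deviation
    t_stat = bias / (sd / sqrt(len(diffs)))
    return BlandAltman(bias, bias - 1.96 * sd, bias + 1.96 * sd, t_stat)


# Hypothetical scores from two rating rounds (not study data).
round_1 = [0, 2, 5, 7, 9, 3]
round_2 = [0, 3, 5, 6, 9, 4]
print(bland_altman(round_1, round_2))
```

With several hundred paired ratings, the standard error in the denominator of the t statistic becomes small, so even a bias of a fraction of a score point can reach statistical significance.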
Categorical variables were compared using the χ² or Fisher's exact tests where appropriate. Continuous variables were evaluated using the two-way analysis of variance (ANOVA). Continuous variables were summarized as means with standard deviation (SD) and range or medians with interquartile range (IQR) unless otherwise indicated. Statistical analyses were conducted using the Statistical Package for the Social Sciences (IBM SPSS Statistics for Windows, Version 24.0. Armonk, NY: IBM Corp). A p value < 0.05 was taken to represent statistically significant results.
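The LOESS smoothing described in this section can be sketched as follows. The sketch assumes a locally linear fit with a tricube kernel, a span of 0.7 (70% of points in each local fit) and no robustness iterations; the exact SPSS settings beyond the kernel and span are not stated here, so these are assumptions, as are the function name and the toy data.

```python
from math import ceil
from typing import List, Sequence


def loess(xs: Sequence[float], ys: Sequence[float], span: float = 0.7) -> List[float]:
    """Locally weighted linear regression evaluated at each x in xs."""
    n = len(xs)
    r = max(2, ceil(span * n))   # number of points in each local window
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its r-th nearest neighbour.
        h = sorted(abs(x - x0) for x in xs)[r - 1]
        if h == 0:
            # Window consists of ties only: average the tied points.
            w = [1.0 if x == x0 else 0.0 for x in xs]
        else:
            # Tricube weights, zero at and beyond the bandwidth.
            w = [(1 - (abs(x - x0) / h) ** 3) ** 3 if abs(x - x0) < h else 0.0
                 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical follow-up times (months) and scores (not study data).
months = [1, 3, 6, 9, 12, 18, 24, 36, 48, 60]
scores = [0, 1, 2, 3, 4, 5, 5, 5, 6, 5]
print([round(v, 2) for v in loess(months, scores)])
```

A wide span such as 0.7 yields a smooth curve that is suited to showing an overall plateau but can blur the exact time point at which the plateau begins.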

Table 1
The Oulu Resorption Score components [6]. The sum of the Extent, Severity and Focus scores makes up the Oulu Resorption Score

Description of sample
In total, 108 autologous cranioplasty patients were included in the present study. The mean age of the patient population at the time of cranioplasty was 41.5 years (SD 16.5, range 1-68). Seventy-three (68%) of the patients were male. The cranioplasty procedures had been conducted after a mean interval of 5.9 months (SD 4.2) from craniectomy. The mean two-dimensional bone defect area was 96 cm² (SD 47.4 cm²).
The patients had undergone a total of 365 head CT scans after a mean interval of 19.5 months (SD 33.5 months, range 0 days to 15.6 years) from the cranioplasty procedure. Each patient had undergone a mean of 3.4 head CT scans (SD 2.7, range 1-16 scans). The mean radiological follow-up period was 34.7 months (SD 40.8 months, range 0 days to 15.6 years). Seventy-six (70%) of the patients in the present study had more than one postoperative CT scan at varying intervals as detailed in Table 2.

Independent validation of the Oulu resorption score
As evaluated with the ICC, the intra-rater reliability of the ORS was excellent, and the inter-rater reliability was good-to-excellent (Table 3).
In Bland-Altman plots, the ORS performed well across the score distribution, though agreement appeared to increase toward the extremes of the spectrum (Fig. 2). The mean biases between the ORSs of the three rating rounds were −0.19, −0.04 and −0.15 points (p = .009, p = .512 and p = .002, respectively), all of which are of clinically insignificant magnitude. As suggested by the LOESS curves, Spearman's ρ demonstrated strong monotonic correlations between the scores assigned by the raters (Fig. 2).
As expected, the proportion of low ORSs was high in the present sample (Table 4). Still, when cases with mean ORS < 1 or > 8 were excluded, reliability remained good-to-excellent and good-to-moderate for the intra- and inter-rater comparisons, respectively.

Fig. 1. Head CT scans of four patients with different levels of bone flap resorption. A: slight loss of diploë was noted, but no relevant bone flap resorption changes were found, and thus the mean Oulu Resorption Score was 0 (grade 0). B: multiple bone flap resorption changes were found with an estimated remaining bone flap volume of 50% but with no perforations detected. The mean Oulu Resorption Score was 4.7 (grade I). C: bone flap resorption changes were found in every CT slice and were thus diffuse with an estimated remaining bone volume of 60% and a perforation smaller than 1 cm detected (arrowhead). The mean Oulu Resorption Score was 6.3 (grade II). D: bone flap resorption was diffuse with less than 25% of the bone flap volume remaining. A 3 cm perforation was detected (double arrowheads). The mean Oulu Resorption Score was 9.

Table 2
Numbers of CT scans per patient and the follow-up time from the cranioplasty procedure.

Clinical results
Forty-three (40%) of the 108 autologous cranioplasty patients had complications after cranioplasty during the mean clinical follow-up period of 2.7 years (SD 3.0, range 1 day to 15.3 years), and 18 (17%) bone flaps had to be removed due to the complications (Table 5). The complications have been described in more detail elsewhere [18]. Six (6%) patients had undergone re-cranioplasty due to bone flap resorption. Based on the three raters' scores, the mean ORS of these patients was 7.9 (SD 1.6, range 5.7-9).
The majority (67%) of the patients in the present study population had some bone flap resorption changes (grades I to III) (Table 4). Analyzing the behavior of bone flap resorption over follow-up time, the progression of the resorption process appeared to stabilize 12-24 months after cranioplasty as suggested by the LOESS curve presented in Fig. 3. Correspondingly, in the survival analyses, most cranioplasty removals occurred during the first two years post-cranioplasty (Fig. 4), with bone flap removals due to surgical site infections conducted in the early phase at a median of 2.3 months (IQR 3.3) after cranioplasty, and removals due to bone flap resorption at a median of 23.3 months (IQR 31.1) post-cranioplasty. The causative microbes of the surgical site infections were Staphylococcus aureus in four, Staphylococcus epidermidis in three and Propionibacterium acnes in three cases, and unavailable in one case. An additional bone flap was removed due to implant migration 3 days after cranioplasty.
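The Kaplan-Meier estimate behind the survival analysis of cranioplasty removals can be sketched with the product-limit formula: at each time with at least one removal, the retention probability is multiplied by one minus the number of removals divided by the number of bone flaps still at risk. The sketch below is an illustration under that standard definition, not the analysis code of the present study, and the follow-up data in it are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the probability of retaining the bone flap.

    times[i]  : follow-up time of patient i
    events[i] : True if the bone flap was removed at times[i],
                False if follow-up ended without removal (censored)
    Returns (time, survival) pairs at each time with at least one removal.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        removals = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            removals += int(data[i][1])
            leaving += 1
            i += 1
        if removals:
            survival *= 1 - removals / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months (not study data).
follow_up = [2, 3, 3, 5, 8, 24, 30]
removed = [True, True, False, True, False, True, False]
for t, s in kaplan_meier(follow_up, removed):
    print(f"{t:>4} months: {s:.2f}")
```

Censored patients contribute to the number at risk until their last follow-up date, which is how patients with short follow-up and no removal can still be included in the estimate.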

Discussion
We demonstrated that the ORS was reliable in standardized analysis of bone flap resorption following autologous cranioplasty in this internal blinded independent two-center validation study. Intra-observer reliability of the ORS was excellent and inter-observer reliability was good-to-excellent as evaluated using the ICC (Table 3, Fig. 2). Currently, the ORS is the only formally validated radiological bone flap resorption classification system. Given the reliability of the ORS, we sought to determine the temporal development of the bone flap resorption process. The progression of bone flap resorption appeared to stabilize 12-24 months after cranioplasty according to the present study (Fig. 3).

Table 3
Intra- and inter-rater reliability of the Oulu Resorption Score evaluated with the two-way random, single measures absolute agreement intra-class correlation coefficient.

Reliability and utility of the Oulu resorption score
In addition to the ORS, five radiological classification systems for bone flap resorption have been proposed [7][8][9][10][17], none of which have been validated. The ORS was reliable across the score distribution in our internal validation, though agreement increased towards the extremes of the distribution in the Bland-Altman plots (Fig. 2). Nevertheless, after the exclusion of patients with scores of < 1 and > 8, reliability remained good-to-excellent and good-to-moderate for the intra- and inter-rater comparisons, respectively. In the Bland-Altman analysis, two of the three mean biases between the raters' scores were statistically significant, but their magnitudes ranged from 1/25 to 1/5 of one ORS point (mean differences −0.19, −0.15 and −0.04). Such small differences probably appeared statistically significant due to the large sample size of the present dataset but bear no clinical significance. Applying the ORS did not increase the time required for CT image analysis by more than two minutes at median.

Clinical implications of radiological bone flap resorption
The incidence of radiological signs of bone flap resorption, detected in two-thirds of our patients (Table 4), was in line with the resorption rates described in previous radiological studies [5][6][7][8][9][10][11][17]. Even though 28 (26%) patients had clinically relevant bone flap resorption (ORS grades II to III), bone flap resorption necessitated re-cranioplasty with a synthetic implant only in six patients (6%), which was generally in keeping with earlier studies [2]. The discrepancy between rates of radiological and clinical bone flap resorption may arise from the patients and their caregivers not complaining of contour irregularities [8,23], or the patients may be reluctant to undergo further surgery. More likely, the bone flap resorption process, though very common, appears to stabilize over time after cranioplasty.
To date, evidence on the progression of bone flap resorption has been anecdotal: most of the resorption has been suggested to occur during the first two postoperative years [10,24]. We observed a rapid increase in the ORSs of the patients during the first and second years following cranioplasty (Fig. 3). After this, the change in ORS appeared to largely stabilize without further progression. Thus, if bone flap resorption is found after this initial period, the process is not likely to progress further as most patients with radiological resorption did not require surgical interventions. Patients should however be educated on the possibility of bone flap resorption, as two patients in our cohort underwent bone flap removal due to resorption after the initial two-year period (Fig. 4). This might be attributed to the patients or caregivers becoming aware of the resorption due to cosmesis, but in neither of the two cases was early imaging available and thus we could not assess the progression of resorption in these particular cases. In addition, there were no patients with decreasing ORS within measurement accuracy in the present study population. As such, we suggest that a feasible period of follow-up after autologous cranioplasty could be two years, especially as most other surgical complications occur in the early postoperative phase after cranioplasty [9,13]. Routine radiological imaging should be conducted only if surgical intervention is considered due to clinically significant resorption.
The clinical complication rate was 40% in the present cohort, and 17% of the patients eventually required bone flap removal (Table 5). The complication and bone flap removal rates were consistent with the previous literature [23][25][26][27][28][29], as were the causative microbes of the infections [30,31]. The time from craniectomy to cranioplasty was 5.9 months in the present study, which is somewhat longer than those published earlier, but the effect of this freezer time on cranioplasty complications is unclear as conflicting results have been published [32][33][34]. On this note, previous attempts to culture osteoblasts from cryopreserved skull bone following preservation times ranging from 7 days to 6 months have been unsuccessful [35][36][37], indicating immediate cell death after freezing. Thus, it appears unlikely from a biological perspective that freezer time would significantly affect bone flap resorption rates. Indeed, no such association has been found in extensive meta-analyses [32,34]. Nevertheless, in view of the possibility of surgical site infections after ultra-early cranioplasty [38], and the positive effects of cranioplasty on rehabilitation [39] and the syndrome of the trephined [40], we presently aim to conduct cranial repair one to three months after craniectomy.

Strengths and weaknesses
With respect to the validation of the ORS, the strengths of the present study include that the present cohort was predominantly independent of the cohort in which the ORS was initially developed [6]. Further, the raters were blinded to each other's ratings, the follow-up time was relatively long, multiple imaging software packages were used and the radiological database used in the validation was extensive with 365 CT scans analyzed. The present study sample included all autologous cranioplasty patients with sufficient imaging data, comprising 81% of all autologous cranioplasty patients treated in two hospital districts during the study period, and thus accurately represents the whole patient cohort. All the cranioplasties in the two hospital districts were conducted in OUH and TUH.
Regarding the clinical conclusions, the weakness of the present study setting was that the CT scans were not acquired prospectively at predetermined intervals but were conducted upon clinical requirement, which complicates the clinical interpretation of the follow-up data. However, it would be ethically and financially unjust to acquire serial CT scans only to exclude bone flap resorption, as an iatrogenic risk for radiation-induced neoplasia is associated with CT scanning [41]. Additionally, the present work was an internal validation study, and in the future, external validation with a larger number of evaluators is required to definitively ensure generalizability of the ORS. Nevertheless, we managed to describe the reliability of the ORS, and to produce a previously

Table 4
Mean Oulu Resorption Scores (ORS) and the respective ORS grades as calculated from the scores assigned by Raters 1.1, 1.2 and 2 to the latest available head CT scan.

Conclusions
The Oulu Resorption Score is a reliable and currently the only validated tool for the analysis of bone flap resorption from CT scans after autologous cranioplasty, with good-to-excellent reliability. The progression of bone flap resorption seemed to stabilize at one to two years after cranioplasty, suggesting that two years is the sufficient length of clinical follow-up after autologous cranioplasty.