Distraction forces on the spine in early-onset scoliosis: A systematic review and meta-analysis of clinical and biomechanical literature

Distraction-based growing rods are frequently used to treat Early-Onset Scoliosis. These use intermittent spinal distractions to maintain correction and allow for growth. It is unknown how much spinal distraction can be applied safely. We performed a systematic review and meta-analysis of clinical and biomechanical literature to identify such safety limits for the pediatric spine. This systematic review and meta-analysis was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. Three systematic searches were performed, covering in-vivo, ex-vivo and in-silico literature. Study quality was assessed for all studies, and data on patient or specimen characteristics, distraction magnitude, spinal failure location and ultimate force at failure were collected. Twelve studies were included: 6 in-vivo, 4 ex-vivo and 2 in-silico. Mean in-vivo distraction forces ranged between 242 and 621 N with maxima of 422–981 N, without structural failures when using pedicle screw constructs. In the ex-vivo studies (only cervical spines), segment C0-C2 was strongest, with decreasing strength in more distal segments. Meta-regression analysis demonstrated that ultimate force at birth is 300–350 N, which increases approximately 100 N each year until adulthood. Ex-vivo and in-silico studies showed that yielding occurs at 70–90% of ultimate force and that failure starts at the junction between endplate and intervertebral disc, after which the posterior and anterior longitudinal ligaments rupture. While data on the safety of distraction forces are limited, this systematic review and meta-analysis may aid in the development of guidelines on spinal distraction and may benefit the development and optimization of contemporary and future distraction-based technologies.


Introduction
Distraction-based growing-rods are commonly used to surgically treat Early-Onset Scoliosis (EOS), a complex 3D spinal deformity. They aim to control the curve while allowing further spinal growth. Examples are the traditional growing rod (TGR) and the magnetically controlled growing rod (MCGR) (Akbarnia et al., 2005; Cheung et al., 2012). Although these implants are widely used, the magnitude and safety margins of the forces exerted during distraction are still unknown. In MCGRs, the maximum force exerted by the actuator is about 200 N, although many MCGRs transmit only a fraction of this after several distractions (Rushton et al., 2019). TGR distraction force is determined by the surgeon who performs the distraction surgery, and these forces are rarely measured.
Distraction forces are applied in a more controlled fashion with halo gravity traction (HGT), where forces up to 50% body weight are safely applied for several weeks (Poon et al., 2018; Yang et al., 2017; Yankey et al., 2021).
At our institution, we developed a dynamic growing-rod that exerts continuous distraction forces through a helical coil spring mounted around standard rods (Wijdicks et al., 2021). During implant design, we found a paucity of knowledge on the magnitude of distraction force that can be tolerated by the pediatric spine. Pragmatically, we chose a relatively low initial force of 75 N, but higher forces may be much more effective. To aid in the development and optimization of this technology and its contemporary counterparts, a basic understanding of the safety of distraction forces in the pediatric spine needed to be established, a topic that has not yet been addressed in previous (systematic) reviews. Therefore, we performed a systematic review and meta-analysis of the clinical and biomechanical literature to identify the best evidence for upper safety limits of distraction forces on the pediatric spine.

Materials and methods
This systematic review was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Moher et al., 2009). We systematically searched the literature for studies that investigated distraction forces on the pediatric spine or its components. The review consists of three separate sections. Section 1: In-vivo studies that clinically measure distraction forces in children. Section 2: Ex-vivo biomechanical tensile tests on whole pediatric spines or spinal sections. Section 3: In-silico models that investigate load sharing of the spine and its components in either children or adults. Since these sections are heterogeneous in study design, the search strategy was different for each section (Table 1).

Search strategy and eligibility criteria
For each section, the PubMed, Embase and Cochrane databases were systematically searched, with no restriction on publication date. We included English articles that investigated the spine or its components in distraction, and that measured or calculated the distraction forces that were used. Reference screening and citation tracking were performed to find additional studies. As growth-friendly implants primarily transmit a pure distraction force (and will limit flexion/extension moments), studies investigating such moments without specifying the pure distraction component were excluded. Conference abstracts, letters and (systematic) reviews were also excluded. Additional eligibility criteria per section are outlined in Table 2.

Study selection and quality assessment
Title and abstract screening was performed by two authors (JVCL and IK). Conflicts were discussed until consensus was reached. For clinical studies, quality was assessed with the Methodological Index for Non-Randomized Studies (MINORS) instrument (Slim et al., 2003). A maximum score of 16 can be obtained for non-comparative studies; we arbitrarily defined low quality as a score below 8, moderate quality as a score of 8 to 12 and high quality as a score > 12. Since no metric to assess the quality of biomechanical and finite element studies was available at the time of this study, we prospectively created a quality assessment tool for each study type, based on reporting recommendations made by the US Food and Drug Administration (FDA, 2016, 2019). All three quality assessment tools and their respective criteria are outlined in Table 3.
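The MINORS cut-offs defined above can be written as a small lookup. This is only a sketch of the stated thresholds; the function name `classify_minors` is hypothetical and is not part of the MINORS instrument itself.

```python
def classify_minors(score: int, max_score: int = 16) -> str:
    """Map a MINORS total (non-comparative study, 0-16) to a quality band.

    Thresholds follow the definition in the text:
    below 8 = low, 8 to 12 = moderate, above 12 = high.
    """
    if not 0 <= score <= max_score:
        raise ValueError(f"score must be between 0 and {max_score}")
    if score < 8:
        return "low"
    if score <= 12:
        return "moderate"
    return "high"


# Example: the lowest and highest scores reported for the in-vivo studies (9 and 13)
for s in (9, 13):
    print(s, classify_minors(s))
```

With these thresholds, a reported range of 9-13 spans the moderate and high bands.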

Data extraction and statistical (meta-)analysis
Study characteristics and results regarding forces, displacement and tissue damage were extracted using standardized forms. The data of Section 2 were pooled and a meta-analysis was performed to determine relationships between age and ultimate force. Specimens were separated into three anatomical groups: C0-C2, C2-C5 and C5-T1. For each group, a least-squares second-order polynomial regression analysis was performed with age as the independent variable and ultimate force of each specimen as the dependent variable. Profile likelihood 95% confidence intervals were calculated for each regression equation and the adjusted coefficients of determination (R²) were calculated. GraphPad Prism 8.4.1 (GraphPad Software Inc.) was used for statistical analysis.
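As an illustration of the regression step, the following sketch fits a second-order polynomial by least squares and computes the adjusted R². It uses NumPy as a stand-in for GraphPad Prism, does not reproduce the profile-likelihood confidence intervals, and runs on synthetic, made-up values rather than specimen data from the included studies. The function name `fit_quadratic` is hypothetical.

```python
import numpy as np


def fit_quadratic(age, force):
    """Least-squares fit of force = b2*age**2 + b1*age + b0.

    Returns the coefficients (highest power first, as np.polyfit does)
    and the adjusted coefficient of determination.
    """
    age = np.asarray(age, dtype=float)
    force = np.asarray(force, dtype=float)
    coeffs = np.polyfit(age, force, deg=2)
    predicted = np.polyval(coeffs, age)
    ss_res = np.sum((force - predicted) ** 2)      # residual sum of squares
    ss_tot = np.sum((force - force.mean()) ** 2)   # total sum of squares
    n, p = age.size, 2                             # p = predictors (age, age^2)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return coeffs, adj_r2


# Synthetic illustration only: an exactly linear age-force relation,
# NOT measurements from any included study.
ages = np.array([0, 1, 3, 6, 9, 12, 15, 17], dtype=float)
forces = 300.0 + 100.0 * ages
coeffs, adj_r2 = fit_quadratic(ages, forces)
print(coeffs, adj_r2)
```

For such linear input the quadratic coefficient comes out close to zero, which is one way to see whether a segment follows a more linear trend with age.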

Results
The literature searches of all sections yielded 5332 results. After title and abstract screening, 64 studies remained for full-text screening. A PRISMA flowchart for each section is provided in Fig. 1.

Study characteristics and quality
Six articles were included; study characteristics and quality assessment are shown in Table 4a. Three studies investigated Harrington rod distractions (Dunn et al., 1982; Elfstrom and Nachemson, 1973; Waugh, 1966) and three reported TGR distractions (Agarwal et al., 2019; Noordeen et al., 2011; Teli et al., 2012). In one study, bilateral distraction was performed and mean force was reported (Agarwal et al., 2019); the others used unilateral distraction. Mean MINORS score was 10.8 (range 9-13) out of 16, indicating moderate to high study quality (Table 3a).
Force and failure results are summarized in Table 5a. Waugh measured distraction force in 3 adolescent idiopathic scoliosis (AIS) patients from implantation to several hours postoperatively (Waugh, 1966). Maximum distraction force ranged from 177 to 373 N. In two patients, failures were observed: a laminar fracture at 373 N and several simultaneous transverse process (TP) fractures at 294 N. In the third patient, episodes of high intra-abdominal pressure caused a considerable increase in measured force (coughing: 363 N; vomiting: 677 N), although no failures were seen. Elfström and Nachemson measured distraction forces in 8 AIS patients; maximum force was 422 N. There were two laminar fractures, at 235 N and 324 N (Elfstrom and Nachemson, 1973). Dunn et al. performed distraction in 12 patients in two steps: first with a slow continuous distraction outrigger, followed by the definitive, more forceful distraction (Dunn et al., 1982). Mean initial outrigger force was 332 N with a maximum of 608 N. During the forceful distractions, mean and maximum force increased to 627 N and 981 N respectively. During distraction, a laminar fracture in a patient with osteopenic bone occurred at 392 N. In all three studies, stress relaxation was observed, starting with a 40% reduction in residual force during the first 30-60 min postoperatively. One study measured postoperative forces continuously for 2 weeks (Elfstrom and Nachemson, 1973). A further reduction in distraction forces took place so that only 40% of the force remained after 4 days. After 11 days, the unilateral residual force was relatively stable at 25% of the initial force, corresponding to around 100 N.
For the TGR studies (Table 5a, Fig. 2), Teli et al. investigated how force increases with every subsequent millimeter of distraction during the first distraction episode (Teli et al., 2012). After a threshold force of 133 N, there was a linear increase in force up to the 12th millimeter of distraction. The two other studies investigated the overall increase in applied force with subsequent distractions (Agarwal et al., 2019; Noordeen et al., 2011). Mean distraction force increased from 140-142 N at the first distraction to 515-555 N at the latest. Maximally applied forces ranged from 552 to 645 N. One TGR study reported a failure, a laminar fracture at around 450 N (Agarwal et al., 2019).

Study characteristics and quality
Four articles were included; study characteristics and quality assessment are shown in Table 4b. All investigated tension to failure in pediatric cervical spines. One study exclusively investigated neonatal spines (Duncan, 1874); the others investigated a range of age groups, from neonates to adolescents (Luck et al., 2013; Nuckley et al., 2013; Ouyang et al., 2005). No studies investigated individual pediatric spinal components such as the intervertebral disc (IVD), epiphyseal plate or spinal ligaments. Mean quality score was 14.8 (range 9-17) out of 20 (Table 3b). One study had low study quality (Duncan, 1874), while the other three had high study quality.

Force and failure results (Table 5b, Fig. 3)
As early as 1874, Duncan investigated the force needed to sever the cervical spine in stillborn infants (Duncan, 1874). Increasing force was applied by weights through a pulley system. The mean force needed was 507 ± 95 N. All spines failed between C3 and C7; the structure that failed first was not reported. The other three studies tested ultimate force (F_ultimate) at different ages in displacement-controlled experiments. Ultimate force was defined as the highest force recorded, followed by a sudden decrease in reaction force with continued displacement and coincident with gross evidence of tissue damage (i.e. failure of the strongest spinal component).
Luck et al. investigated the F_ultimate of three different cervical levels in children of three different age groups (<2 years, 6-9 years, 12-17 years) (Luck et al., 2013). In all age groups, C0-C2 showed the highest F_ultimate. At all levels, F_ultimate increased nonlinearly with age, with a 3-4-fold increase during the first 6-9 years and a 1.5-2.5-fold increase between 6-9 years and adulthood. For C0-C2, F_ultimate increased from a mean of 436 ± 363 N in children < 2 years to 2714 ± 230 N in adolescents. For C4-C5, F_ultimate increased from 317 ± 198 N to 2030 ± 302 N. For C6-C7, these values were 292 ± 186 N and 1832 ± 259 N. When correcting for vertebral cross-sectional area, ultimate tensile strength (UTS) also increased with age, albeit less sharply. For C6-C7, UTS was 2.7 ± 0.6 MPa for children < 2 years and 5.3 ± 1.2 MPa for adolescents. In the spines < 2 years old, failure occurred almost exclusively through the epiphyseal plate or through the cartilaginous synchondrosis.
Ouyang et al. reported F_ultimate for a younger and an older group of children (Ouyang et al., 2005). In the younger group, mean F_ultimate was 609 ± 114 N. In older children, F_ultimate was 872 ± 62 N. All failures occurred in the distal cervical spine; the exact structures that failed were not reported. Nuckley et al. investigated F_ultimate at level C1-C2 in two age groups, 2-8-year-old and 9-16-year-old children (Nuckley et al., 2013). In the younger group, mean F_ultimate was 983 ± 265 N. For the older children this was 1669 ± 109 N. All failures were ligamentous failures.
Meta-analysis of the ex-vivo studies is shown in Fig. 3. Residuals were normally distributed. Adjusted R² values of the regression functions of the different segments ranged between 0.82 and 0.86, indicating that most variation could be explained by age alone. In all segments, an increase in F_ultimate was seen with increasing age. For C0-C2, this increase was largest during the first years; for the other segments, the increase followed a more linear trend. For the distal segments, the increase in F_ultimate was approximately 100 N/year. From infancy to end of adolescence, F_ultimate increased from 341 N to 2453 N in C0-C2, from 342 N to 2190 N in C2-C5 and from 294 N to 1902 N in C5-T1.

Table 3c (fragment): quality assessment criteria for finite element studies
The research question of the investigation is explained, and this question can be answered through the proposed research.
Solver: The software package, version and type of simulation are reported.
Geometry and mesh: The finite element geometry is presented in detail. This includes an explanation on how the geometry was obtained and includes details on elements and mesh used.
Material properties: Material properties for all materials are included and referenced.
Assumptions and simplifications: Differences with and simplifications of the model to the real-world situation are described (e.g. a rationale is provided for structures not included in the model).
Boundary and loading conditions: Boundary conditions are explained. Initial conditions, pre-stresses and loading conditions are provided.
Results: A relevant outcome measure was chosen and results were provided for several relevant regions in the finite element model.
Mesh refinement: A mesh was chosen so that outcomes were not significantly influenced by element size. This was tested with mesh refinement or convergence analysis techniques.
Validation: Results were validated against existing clinical or biomechanical data and were in accordance with that data.

Study characteristics and quality
Two articles were included; study characteristics and quality assessment are shown in Table 4c. Dong et al. investigated tension to failure in an osseoligamentous FE model of a 10-year-old cervical spine (Head-T1) (Dong et al., 2013), whereas DeWit and Cronin explored tensile failure in a single adult osseoligamentous functional spinal unit (C4-C5) (DeWit and Cronin, 2012). Both studies used models that included vertebral bodies, IVDs and (non-linear) ligaments. Dong et al. also included the epiphyseal plate. Both studies modelled failure by deleting elements of specific spinal components as they were loaded above their failure limit. This ensured a gradual reduction of the load-carrying capacity of the spine and permitted a detailed characterization of when and where failure occurred. Due to modelling constraints, DeWit and Cronin modelled the connection between cartilaginous endplate and IVD with tie-break elements, potentially reducing the bio-fidelity of their failure modelling. Both studies validated their model against the respective adult or pediatric experimental literature. Mean study quality was 16.5 (range 16-17) out of 18, indicating high study quality (Table 3c).

Distraction force, load sharing and failure results
The adult C4-C5 FE segment from DeWit and Cronin was validated against previous experimental tensile data (Dibb et al., 2009). During increasing displacement, several distinct peaks were seen where failure occurred (Fig. 4). The first failure was a rupture at the posterior junction of the vertebral endplate and IVD at around 2600 N. With increasing IVD-endplate avulsion, the posterior longitudinal ligament (PLL; ~2500 N), anterior longitudinal ligament (ALL; ~2500 N) and anterior IVD-endplate junction (~1750 N) failed. Dong et al. created an FE model that was validated against previous pediatric cadaver tensile experiments (Luck et al., 2013; Ouyang et al., 2005). When simulating tension to failure at C4-C5, first failure occurred at the IVD-endplate junction at around 650 N. This is lower than the F_ultimate or F_yield reported by Luck et al. However, at this point, only a small decrease in force was observed.

Discussion
Although distraction of the spine is often applied in deformity correction, we know little about its safety limits. Forces are generally applied based on previous experiences and common knowledge. The current study was an effort to get a better understanding of what forces can be applied safely for novel distractive implants.
In general, there was a paucity of literature on the pediatric thoracic spine, where distraction implants are usually implanted. Relevant literature indicates that the force that can be applied to the pediatric spine is several times the force of body weight, much higher than forces used in HGT, TGR or MCGR. If spinal integrity fails, this is usually at or around the endplate. The resistance to failure of this structure increases with the increase in cross-sectional area during growth, but also independent of this, due to maturation (Luck et al., 2013). As the maturation effect has also been observed quite similarly in animal species, this could enable future in-vivo research on distraction safety and efficacy (Ching et al., 2001; Pintar and Mayer, 2000).
Since most studies investigated forces during only short periods of time, important phenomena like creep and stress relaxation were ignored. Creep properties of spinal components in tension have not been studied extensively, although creep must take place, as shown by HGT's effectiveness over time (Yang et al., 2017; Yankey et al., 2021). The included Harrington rod studies show that stress relaxation most certainly takes place after distraction surgery, with distraction forces decreasing 60-75% during the first post-operative weeks. However, as distraction forces decrease non-linearly, even a micro-slippage in the Harrington rod itself could have resulted in a substantial reduction of residual forces.
The FEM studies suggest that the epiphyseal plate fails first, followed by the PLL and subsequently the ALL. This pattern seems to be in accordance with those reported in the ex-vivo literature, although F_ultimate in the FEM studies is lower (Luck et al., 2013). It could be that micro-failures (apparent only through changes in the force-displacement diagram) are missed in the in-vivo and ex-vivo studies as they are not coincident with obvious visual changes. Potentially, such micro-failures of spinal tissues play an important role in autofusion of the spine and the "law of diminishing returns" (Cahill et al., 2010; Williams et al., 1999). As micro-failures are hard to quantify in-vivo and were not the subject of the current study, a safety margin should be adopted when choosing a maximum force that is to be applied. In addition, the results of the FEM studies must be interpreted with caution, as many different pediatric spinal material and interaction properties are still unknown and had to be estimated from adult values (DeWit and Cronin, 2012; Dong et al., 2013; Jebaseelan et al., 2012; Kumaresan et al., 2000). Uncertainty margins of these estimations may cause large deviations in outcomes, as shown previously (Dreischarf et al., 2014; Naserkhaki et al., 2018).
Table footnote (fragment): a Five groups were created: (1) ≤1 day, (2) 1 day-2 years, (3) 2-5 years, (4) 5-10 years, (5) >10 years.
Unfortunately, while an extensive search was performed, most studies that were identified focused exclusively on the cervical spine, while distraction-based therapy for EOS is primarily performed in the thoracic and lumbar spine. Therefore, a definitive answer to our research question cannot be given. Nevertheless, due to the increase in cross-sectional area of vertebral structures from cranial to caudal, these are likely stronger than cervical segments (Yoganandan et al., 1988, 1996). The observation that HGT complications almost exclusively occur in the cervical spine also suggests that the cervical structures are weakest and that the current results are therefore likely lower bounds of the true maximum safe distraction force (Yang et al., 2017; Yankey et al., 2021). In addition, most included ex-vivo studies investigated the spine with all muscles and subcutaneous tissues removed, which has been shown to further reduce spinal strength by a factor of 2 (Duncan, 1874; Yoganandan et al., 1996). Taking this into account, we can make several inferences for clinical practice based on current literature. While speculative, they represent the best available evidence:
1. In-vivo literature shows that distractions of 300-400 N are common, without risk of macro-failure (when not using laminar or TP hooks).
2. Ex-vivo literature shows that F_ultimate of spinal segments increases with age in a more or less linear fashion. This F_ultimate is 300 N at birth, and increases around 100 N each year.
3. In-silico literature shows that first failure occurs at around 70-90% of F_ultimate at the level of the endplate, followed by failure of the PLL and ALL.
4. Adjusting for these factors, the conservative F_ultimate of the pediatric spine becomes approximately 800 N (age 5-6), 1000 N (age 7-8) and 1200 N (age ≥ 9).
Obviously, a margin of safety must be applied to account for individual variability and the fact that there is a paucity of data on several spinal regions. Applying a safety factor of 4 results in a potential maximum force of 200 N (age 5-6), 250 N (age 7-8) and 300 N (age ≥ 9) when using pedicle screw anchors. Anatomical structures at risk and bone or soft-tissue weakness may require lowering distraction forces further. Special care must be taken to avoid excessive stress on the spinal cord and nerve roots, which has been associated with certain correction manoeuvres (Henao et al., 2018). Whether these stresses are generated following axial distraction in growing-rod surgery is unknown, although neurological complications during spinal distraction are rarely seen (Agarwal et al., 2019; Noordeen et al., 2011; Teli et al., 2012; Yang et al., 2017).
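The arithmetic behind these age-banded values can be sketched as follows. The sketch assumes that the conservative F_ultimate follows the linear rule stated in the text (about 300 N at birth plus about 100 N per year), evaluated at the lower age of each band, and is then divided by the safety factor of 4. The function names are hypothetical, and the code only illustrates the calculation.

```python
def estimated_ultimate_force(age_years, force_at_birth_n=300.0, increase_per_year_n=100.0):
    """Linear approximation of the conservative ultimate force in newtons.

    Assumption: about 300 N at birth, increasing about 100 N per year,
    as stated in the text.
    """
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    return force_at_birth_n + increase_per_year_n * age_years


def max_distraction_force(age_years, safety_factor=4.0):
    """Ultimate-force estimate divided by a safety factor (4 in the text)."""
    if safety_factor <= 0:
        raise ValueError("safety_factor must be positive")
    return estimated_ultimate_force(age_years) / safety_factor


# Lower age of each band used in the text: 5-6, 7-8 and >= 9 years
for age in (5, 7, 9):
    print(age, estimated_ultimate_force(age), max_distraction_force(age))
```

Under these assumptions the loop reproduces the 800/1000/1200 N estimates and the corresponding 200/250/300 N limits quoted above.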
The current study gives an approximation of the upper limit of corrective forces that can be applied to the pediatric spine. Whether maximum forces are also most effective has yet to be studied. There is evidence that frequent distractions with lower force improve curve correction, mitigate the "law of diminishing returns" and reduce the complication rate (Agarwal et al., 2017, 2018; Cheung et al., 2016). Elucidation of these force effects on different spinal components and implants could lead to optimization of both novel and contemporary growing-rod techniques.
This is an attempt to review safety limits of spinal distraction forces across several clinical and biomechanical domains. This approach allows for the synthesis of data from seemingly isolated research modalities, which is useful in many fundamental and clinical sciences. Limitations include the low number of studies that could be included, and the fact that the ex-vivo and in-silico studies investigated only the cervical spine and not the thoracic and lumbar spine, which are most often instrumented. While this makes it difficult to draw definitive conclusions, the included studies are currently the best available evidence, which underscores the need for continued research on this important topic of spinal distraction.

Conclusion
Literature on safe distraction forces for the pediatric spine is limited. Distraction forces of 300-500 N were frequently applied clinically. Occasionally, this resulted in laminar or TP fractures, but no study reported ligamentous disruptions or epiphyseal plate fractures. Ex-vivo cervical studies show that F_ultimate is around 300 N at birth and increases 100 N each year, a 6-7-fold increase from birth to end of adolescence. In-silico studies show that yielding starts at 70-90% of F_ultimate and that the junction between IVD and vertebral endplate fails first.