Association between central sensitization and gait in chronic low back pain

Background: Central sensitization (CS) is often present in patients with chronic low back pain (CLBP). Gait impairments due to CLBP have been extensively reported; however, the association between CS and gait is unknown. The present study examined the association between CS and gait during activities of daily living in patients with CLBP. Method: Forty-two patients with CLBP were included. CS was assessed through the Central Sensitization Inventory (CSI), and patients were divided into a low and a high CS group (23 CLBP- and 19 CLBP+, respectively). Patients wore a tri-axial accelerometer device for one week. From the acceleration signals, gait cycles were extracted and 36 gait outcomes representing quantitative and qualitative characteristics of gait were calculated. A Random Forest was trained to classify CLBP- and CLBP+ based on the gait outcomes. The maximum Youden index was computed to measure the diagnostic test's ability, and SHapley Additive exPlanations (SHAP) indexed the gait outcomes' importance to the classification model. Results: The Random Forest accurately (84.4%) classified the CLBP- and CLBP+ groups. The Youden index was 0.65, and SHAP revealed that the gait outcomes important to the classification model were related to gait smoothness, stride frequency variability, stride length variability, stride regularity, predictability, and stability. Conclusions: CLBP- and CLBP+ patients had different motor control strategies. Patients in the CLBP- group presented with a more "loose control", with higher gait smoothness and stability, while CLBP+ patients presented with a "tight control", with a more regular, less variable, and more predictable gait pattern.


Introduction
Chronic low back pain (CLBP) is one of the most prevalent chronic musculoskeletal pain conditions [1]. It is responsible for high treatment costs, sick leave and individual suffering, and it represents a significant socioeconomic burden [2]. For 85%-90% of patients with CLBP, a relation between pathoanatomical and clinical presentations is absent [3] and, therefore, the condition is classified as nonspecific CLBP [4]. In CLBP and other chronic musculoskeletal disorders, central sensitization (CS) might be present (reviewed in Ref. [5]). CS is defined as "increased responsiveness of nociceptive neurons in the central nervous system to their normal or subthreshold afferent input" [6] and manifests as mechanical hypersensitivity, allodynia and hyperalgesia [7]. A considerable number of people need treatment for CLBP. Although the overall efficacy of CLBP rehabilitation programs is positive, the effect sizes are modest [8].
Correctly recognizing the physical and psychosocial factors perpetuating pain and physical disability of patients with CLBP remains a challenge [9]. Altered motor control of patients with CLBP could possibly contribute to the persistence of CLBP [10]. Altered motor control could affect daily-living activities, as patients with CLBP often exhibit altered movement patterns and motor control strategies, probably to avoid painful movement, such as walking [11]. Many clinicians may intuitively identify "abnormal" gait patterns in patients with CLBP, but identifying and objectifying specific "abnormal" gait outcomes is challenging. During walking, it is suggested that patients often adopt a "protective guarding" or "splinting" strategy [12] to avoid painful movements of the spine. These adaptations may lead to a slower and less flexible gait pattern [13]. Evidence for this, however, is ambiguous. Studies comparing patients with CLBP and healthy controls have reported inconsistent findings regarding preferred walking velocity [13,14], stride length [15,16], and stride-to-stride variability [17,18].
A possible explanation for these inconsistencies might be an unknown heterogeneity within the samples, such as the presence of CS. CS could plausibly be related to the inconsistent results, because the presence of high CS levels is associated with long-lasting chronic pain [19] and movement may be changed due to pain. Also, general gait outcomes, such as walking speed and stride length, might not be sensitive enough to detect small differences between patients with low or high levels of CS. In addition to stride-related parameters, gait outcomes that reflect gait quality in terms of regularity, synchronization, smoothness, local stability, and predictability are sensitive to differences in gait performance. These gait outcomes were successfully used to detect the differences between age groups [20], older adults with and without fall risk [21], and patients with and without Parkinson's disease [22]. Even though the effects of CLBP on gait have been frequently investigated in controlled laboratory studies, there are no studies about the relationship between CS levels and gait performance in daily-living environments.
Advances in wearable technology and machine learning approaches offer new opportunities in gait data collection and analysis. Wearable technology allows researchers to record patients' physical activities in unobserved, daily-living environments over extended periods of time. These data can reflect the real gait performance of the patients, since being observed in a controlled laboratory environment may change their performance [23]. The successful employment of machine learning approaches in gait analysis makes it possible to extract the most informative gait outcomes from the accelerometer sensor data [20]. If patients with low and high levels of CS walk differently, machine learning approaches should be able to recognize these differences and classify patients with low and high CS levels based on their gait outcomes. Many gait outcomes are not independent and interact with each other, such as gait speed and step regularity. Machine learning approaches such as Random Forest (RF) are able to process high-dimensional and non-linear data structures and take the interrelation and interaction of the gait outcomes into consideration [20].
Therefore, the aim of this study was to analyze whether and how the presence of CS is related to differences in gait performance of patients with CLBP during daily life by using a machine learning approach. It was hypothesized that patients with CLBP and higher CS levels show differences in daily life gait performance, compared with those with lower CS levels.

Patients
This study included patients with primary CLBP who were recruited from the outpatient Pain Rehabilitation Department of the Center for Rehabilitation of the University Medical Center Groningen (CvR-UMCG). Primary CLBP is defined as low back pain persisting for more than three months, with the pain not being the result of any other diagnosis. The patients were selected according to the following inclusion criteria: The study was approved by the Medical Research Ethics Committee of the University Medical Center Groningen (METc 2016/702) and conducted according to the principles expressed in the Declaration of Helsinki. The data used in this paper were derived from a larger study, the protocol details of which are described elsewhere [19].

Data collection
Demographics were collected and standard clinical tests were applied as part of the usual care of CLBP patients who are referred to the outpatient Pain Rehabilitation Department of the Center for Rehabilitation. Assessments included: the Visual Analogue Scale for pain intensity (VAS Pain; 0-10), the Dictionary of Occupational Titles (DOT), the Pain Disability Index (PDI; 0-70), the physical functioning subscale of the Rand36 questionnaire (Rand36-PF; 0-100), the Pain Catastrophizing Scale (PCS; 0-52), the Injustice Experience Questionnaire (IEQ; 0-48), and the Brief Symptom Inventory (BSI) global severity index t-score (GSIT) (see Table 3).
Central sensitization (CS). The presence of CS-related manifestations was assessed with section A of the Central Sensitization Inventory (CSI) [24]. Section A has 25 items to assess the presence of common CS-related symptoms. Scores can range from 0 to 100, where a higher score represents a higher level of CS. A score lower than 40 indicates lower CS levels (CLBP- group) and a score of 40-100 is interpreted as higher CS levels (CLBP+ group) [25].
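The grouping rule can be written down in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the 0-4 scoring per item is an assumption that is consistent with 25 items and a 0-100 total but is not stated in the text.

```python
def csi_total(item_scores):
    """Sum the 25 section A items (assumed here to be scored 0-4 each)."""
    if len(item_scores) != 25:
        raise ValueError("CSI section A has 25 items")
    if any(not 0 <= s <= 4 for s in item_scores):
        raise ValueError("each item is assumed to be scored 0-4")
    return sum(item_scores)


def csi_group(total):
    """Map a section A total (0-100) to the low or high CS group."""
    if not 0 <= total <= 100:
        raise ValueError("CSI section A totals range from 0 to 100")
    # Below 40: lower CS level (CLBP-); 40 to 100: higher CS level (CLBP+).
    return "CLBP-" if total < 40 else "CLBP+"


print(csi_group(csi_total([1] * 25)))  # total 25
print(csi_group(csi_total([2] * 25)))  # total 50
```

Note that the boundary score of 40 falls in the CLBP+ group under this rule.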
Accelerometer data. The accelerometer data were collected between 2017 and 2019. Patients were instructed to wear a tri-axial accelerometer (ActiGraph GT3X, Actigraph Corporation, Pensacola, FL) at all times for about one week, except while sleeping or bathing. The accelerometer was worn at the front right hip of the patient (at the anterior superior iliac spine). Assuming a standing and upright position, the Y-axis pointed to the ground (vertical direction, V), the Z-axis faced the walking direction (anteroposterior direction, AP), and the X-axis was perpendicular to the walking direction, pointing from the patient's right to left (mediolateral direction, ML). These directions are approximate only. The sampling frequency of the accelerometer was set to 100 Hz and the dynamic range was ±6 g.

Raw data segmentation
Accelerometer data of each patient were segmented into 24 h data segments to represent the activities during the days. Because the measurement started at 12:00 p.m., each 24 h span ran from 12:00 p.m. until 12:00 p.m. the next day, to make full use of the data. Data that did not completely cover this 24 h span were discarded from the analysis. Because of technical errors or personal reasons, a full week of data could not be collected from all patients. To compare the data between different patients fairly, 4 segments (representing 4 days) of each patient were included in the analysis. Therefore, 7 patients who had fewer than 4 segments were excluded. From patients with more than 4 segments, 4 segments were randomly sampled. Fig. 1a graphically shows the process of the raw data segmentation.
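The segmentation rule lends itself to a short illustration. The following is a minimal sketch, not the study's actual code: it assumes a recording is described only by its start and end timestamps, and the function names and the fixed random seed are hypothetical.

```python
import random
from datetime import datetime, timedelta


def noon_segments(start, end):
    """Return (seg_start, seg_end) for every complete noon-to-noon span in [start, end]."""
    first_noon = start.replace(hour=12, minute=0, second=0, microsecond=0)
    if first_noon < start:  # recording began after noon: wait for the next noon
        first_noon += timedelta(days=1)
    segments = []
    seg_start = first_noon
    while seg_start + timedelta(days=1) <= end:
        segments.append((seg_start, seg_start + timedelta(days=1)))
        seg_start += timedelta(days=1)
    return segments


def select_segments(segments, n_required=4, seed=0):
    """Randomly keep n_required segments; return None if the patient has too few."""
    if len(segments) < n_required:
        return None  # patient is excluded from the analysis
    return sorted(random.Random(seed).sample(segments, n_required))


week = noon_segments(datetime(2018, 3, 5, 12, 0), datetime(2018, 3, 12, 9, 30))
print(len(week), "complete segments;", len(select_segments(week)), "kept")
```

In this example the final, incomplete day (ending at 09:30) is discarded, which mirrors the rule that only data covering the full 24 h span are analyzed.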

Walking bouts extraction
The accelerometer data of the 4 segments were first smoothed with a 2nd-order Butterworth low-pass filter with a 20 Hz cut-off frequency. Subsequently, potential walking events were detected by a Fast Fourier Transform (FFT) based method [26], which identified periods with power spectrum values in the 0.5-3.0 Hz range. To remove false walking events from the potential walking periods, the zero-cross method [27] was employed. If the time interval between any two adjacent walking events was shorter than 2 s, these two walking events were merged into one walking bout. Finally, the walking bouts in each segment were extracted and their gait outcomes were calculated. Fig. 1b presents the walking bouts as the yellow vertical bars in the rectangle.
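The merging rule for adjacent walking events can be sketched as follows. This covers only the final merging step; the filtering, FFT-based detection and zero-cross steps are not reproduced. The representation of an event as a (start, end) pair in seconds and the function name are assumptions made for illustration.

```python
def merge_walking_events(events, max_gap_s=2.0):
    """Merge (start, end) events, in seconds, whose gap is shorter than max_gap_s."""
    bouts = []
    for start, end in sorted(events):
        if bouts and start - bouts[-1][1] < max_gap_s:
            # Gap to the previous bout is under the threshold: extend that bout.
            bouts[-1][1] = max(bouts[-1][1], end)
        else:
            bouts.append([start, end])
    return [tuple(b) for b in bouts]


events = [(0.0, 4.0), (5.5, 12.0), (14.0, 20.0), (30.0, 41.0)]
print(merge_walking_events(events))
# [(0.0, 12.0), (14.0, 20.0), (30.0, 41.0)]
```

Here the first two events are 1.5 s apart and are merged, whereas a gap of exactly 2 s is not shorter than the threshold and therefore starts a new bout.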

Gait outcomes
X. Zheng et al.
All walking bouts in one 24 h segment were used to determine the total duration of walking, the total number of steps, the maximum duration of a walking bout and the maximum number of steps of a walking bout. Subsequently, all walking bouts exceeding 10 s were selected and cut into non-overlapping 10 s windows [28]. Within a segment, each 10 s window was used to calculate the different gait outcomes, and these values were averaged over all 10 s windows in the segment to represent the patient's gait performance on that day.
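The windowing and averaging described above can be illustrated with a small sketch. It assumes that a bout is given by its first and last (exclusive) sample index at 100 Hz and that each window's gait outcomes are held in a dictionary; both representations and the function names are hypothetical. Leftover samples at the end of a bout that do not fill a whole window are dropped.

```python
def ten_second_windows(bout_start, bout_end, fs=100, window_s=10):
    """Return (first, last_exclusive) sample indices of the full windows in one bout."""
    win = window_s * fs
    if bout_end - bout_start <= win:  # only bouts exceeding 10 s are used
        return []
    n_windows = (bout_end - bout_start) // win
    return [(bout_start + i * win, bout_start + (i + 1) * win) for i in range(n_windows)]


def average_outcomes(per_window):
    """Average every gait outcome over all windows of one 24 h segment."""
    keys = per_window[0].keys()
    return {k: sum(w[k] for w in per_window) / len(per_window) for k in keys}


print(ten_second_windows(0, 3550))  # a 35.5 s bout gives three 10 s windows
print(average_outcomes([{"stride_time": 1.0}, {"stride_time": 1.5}]))
```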
Gait outcomes were divided into two categories: quantitative and qualitative gait outcomes. From one segment, we obtained one gait outcome vector, including 36 gait outcomes, based on the walking bouts (see Fig. 1c). The detailed descriptions of the quantitative and qualitative gait outcomes are presented in Table 1 and Table 2; for an extended explanation of the variables see Ref. [29].
The Pearson correlation coefficient was calculated to examine the relationship of gait outcomes between weekdays and weekend days. The coefficient ranges from −1 to 1, where 1 represents a perfect positive correlation.
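For reference, the coefficient can be computed directly from its definition. The sketch below is a plain implementation with made-up input values; how the weekday and weekend gait outcomes were paired in the study is not specified here, so the two input lists are purely illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


weekday = [1.02, 0.55, 1.80, 0.31]  # hypothetical mean gait outcomes
weekend = [1.05, 0.57, 1.76, 0.33]
print(round(pearson_r(weekday, weekend), 3))
```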
The Mann-Whitney U test was used to statistically test the differences between the CLBP- and CLBP+ groups for demographics and CSI scores. To separate the CLBP- and CLBP+ groups by gait outcomes, RF was used.

Random Forest classifier
RF is considered the optimal machine learning classification approach for the present data, because it performs well with (a) nonlinear and linear data; (b) high-dimensional data; and (c) unbalanced and small datasets [30]. Apart from this, a comparison of different machine learning classifiers was performed, which supported the selection of RF as the best classifier for this study (details in Appendix A).
The input data of this approach was <S, L>, where S represents the gait outcome vectors of all patients and L the corresponding labels. S is defined as $S = \{s_1, \ldots, s_m\}$ with $s_i = (d_{i1}, \ldots, d_{ik})$, where $s_i$ represents gait outcome vector i, m is the number of all gait outcome vectors, d represents a gait outcome, and k = 36. $L = \{l_1, \ldots, l_m\}$, where $l_i \in \{\text{CLBP-}, \text{CLBP+}\}$.
RF is constructed in four steps.
Step one: Randomly sample n gait outcome vectors from S and the n corresponding labels from L, with replacement. The new sets of gait outcome vectors and labels are called S_b and L_b. In S_b, a given s_i may appear more than once or not at all.
Step two: In S_b, randomly sample j (j ≤ k) gait outcomes from s. Therefore, [...]
Step four: Repeat steps one to three 1000 times and combine the decision trees into an ensemble, called RF, that predicts by voting (see Fig. 2).

Table 1 (excerpt).
Step length: computed from h, the change in height (in meters), and l, the leg length (in meters). h was calculated by a double integration of the accelerometer signal in the vertical direction. SL is the sum of the two adjacent step lengths.
Stride time (ST; mean, variability): ST = n/f, where f is the sample frequency (in Hertz) and n is the number of samples per dominant period derived from autocorrelation.
Stride frequency (SF; mean, variability-V/ML/AP): SF = f/n.
Root mean square of the variability of the amplitude of accelerations (RMS): computed from x, y, z, the accelerometer signals (in meters per second squared) along the x, y and z axes, with n the number of samples.
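The sampling-and-voting scheme in the four RF construction steps can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the study's implementation: a single-threshold split (`fit_stump`) stands in for a full decision tree, the per-tree fitting step is assumed because the description of step three is not available in the text, and the default j (square root of k) and all function names are hypothetical choices.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, features):
    """Stand-in for a decision tree: the single threshold split with the fewest errors."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority label
        return (None, None, majority(y), majority(y))
    return best[1:]


def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    if f is None:
        return l_lab
    return l_lab if row[f] <= t else r_lab


def fit_forest(S, L, n_trees=1000, n=None, j=None, seed=0):
    rng = random.Random(seed)
    m, k = len(S), len(S[0])
    n = n or m                       # bootstrap size (assumed equal to m by default)
    j = j or max(1, int(k ** 0.5))   # outcomes per tree (assumed default)
    forest = []
    for _ in range(n_trees):         # step four: repeat and collect the trees
        idx = [rng.randrange(m) for _ in range(n)]  # step one: sample with replacement
        S_b = [S[i] for i in idx]
        L_b = [L[i] for i in idx]
        feats = rng.sample(range(k), j)             # step two: random subset of j outcomes
        forest.append(fit_stump(S_b, L_b, feats))   # assumed step: fit a tree on S_b, L_b
    return forest


def predict_forest(forest, row):
    """Ensemble prediction by majority vote over all trees."""
    return majority([predict_stump(s, row) for s in forest])
```

In practice a library implementation of Random Forest would be used; the sketch only makes the bootstrap, outcome subsampling and voting explicit.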
Before training RF, 80% of patients were randomly selected and their 4 corresponding gait outcome vectors were used as the training data. The gait outcome vectors of the remaining 20% of patients were used as the testing data. To avoid overfitting of the hyperparameters, a 5-fold cross-validation approach was used to estimate them, as shown in Fig. 1d. Four folds were used to train the model and the remaining fold was used to estimate the performance of the current hyperparameters in RF. The performance reported by the 5-fold cross-validation was the average of the values computed in the 5 splits. After the best hyperparameters were determined, the testing dataset was used to evaluate the generalizability of the model.
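Splitting at the patient level keeps all 4 gait outcome vectors of a patient on the same side of the split. The sketch below illustrates this with 42 hypothetical patient identifiers; the function names are illustrative, and forming the cross-validation folds by patient (rather than by vector) is an assumption, since the text does not specify it.

```python
import random


def split_patients(patient_ids, train_fraction=0.8, seed=0):
    """Randomly assign whole patients to the training or the testing set."""
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return set(ids[:n_train]), set(ids[n_train:])


def k_folds(items, k=5, seed=0):
    """Partition items into k folds of near-equal size."""
    items = sorted(items)
    random.Random(seed).shuffle(items)
    return [items[i::k] for i in range(k)]


patients = [f"P{i:02d}" for i in range(42)]
train, test = split_patients(patients)
print(len(train) * 4, "training vectors,", len(test) * 4, "testing vectors")

for held_out in k_folds(train):
    fit_on = train - set(held_out)
    # Fit RF with the candidate hyperparameters on `fit_on`, score on `held_out`,
    # then average the 5 scores to compare hyperparameter settings.
```

With 42 patients and 4 vectors each, this yields 34 training patients (136 vectors) and 8 testing patients (32 vectors).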

Accuracy evaluation
Accuracy, sensitivity, specificity, precision, F1-score, and maximum Youden index were calculated to evaluate the performance of the classification (Fig. 1f). In this study, CLBP+ was considered the positive case and CLBP- the negative case. Correct predictions of CLBP+ and CLBP- patients are called true positives (TP) and true negatives (TN), respectively. Incorrect classifications of CLBP- patients as CLBP+ or of CLBP+ patients as CLBP- are called false positives (FP) and false negatives (FN), respectively.
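Accuracy was the proportion of all the correct classification results: $\text{Accuracy} = (TP + TN)/(TP + TN + FP + FN)$.
Sensitivity represents the proportion of positive cases that are correctly assigned (true positive rate): $\text{Sensitivity} = TP/(TP + FN)$.
Specificity refers to the rate of correctly predicted negative cases in all negative cases (true negative rate): $\text{Specificity} = TN/(TN + FP)$.
Precision is the ratio of the correctly predicted positive cases in all predicted positive cases: $\text{Precision} = TP/(TP + FP)$.
F1-score is the harmonic mean of the precision and sensitivity: $F_1 = 2 \cdot \text{Precision} \cdot \text{Sensitivity}/(\text{Precision} + \text{Sensitivity})$.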
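As a worked example, the five metrics can be computed from the four confusion-matrix counts. The function name is hypothetical; the counts used below (12 true positives, 4 false negatives, 15 true negatives, 1 false positive) are the test-set counts reported in the Discussion of this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


for name, value in classification_metrics(tp=12, fp=1, tn=15, fn=4).items():
    print(f"{name}: {value:.3f}")
```

With these counts the accuracy is 27/32 ≈ 0.844, the sensitivity 0.750, the specificity 15/16 ≈ 0.938 and the precision 12/13 ≈ 0.923. The F1-score from unrounded values is 24/29 ≈ 0.828; taking the harmonic mean of the rounded precision (0.92) and sensitivity (0.75) instead gives approximately 0.826.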
The receiver operating characteristic (ROC) curve was calculated to evaluate the performance of RF. The Y-axis of this curve represents the true positive rate (sensitivity) and the X-axis the false positive rate (1 − specificity). The overall classification performance of RF was evaluated by the area under the ROC curve (AUC). A classification model with a larger AUC value classifies more cases correctly, and AUC = 1 represents perfect performance. The maximum Youden index was computed to measure the diagnostic test's ability:
$$J = \max_{c}\,\{\text{sensitivity}(c) + \text{specificity}(c) - 1\}$$
where c is the cut-point. When the value J is maximum, the corresponding c is the optimal cut-point.
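The search for the optimal cut-point and the AUC can be illustrated on a small example. The sketch below assumes that the classifier outputs a score per sample (for instance the fraction of trees voting CLBP+) and that a sample is called positive when its score is at least c; the scores, labels and function names are made up for illustration.

```python
def roc_points(scores, labels):
    """(cut-point, sensitivity, specificity) for every observed score; label 1 = CLBP+."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for c in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= c and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= c and l == 0)
        points.append((c, tp / pos, 1 - fp / neg))
    return points


def max_youden(scores, labels):
    """Return (J, c): the maximum Youden index and the cut-point where it occurs."""
    return max((sens + spec - 1, c) for c, sens, spec in roc_points(scores, labels))


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
print(max_youden(scores, labels))  # (0.75, 0.7)
print(auc(scores, labels))         # 0.875
```

In this toy example the optimal cut-point is c = 0.7, where sensitivity is 0.75 and specificity is 1.0.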

Feature importance
SHapley Additive exPlanations (SHAP) [31] was used to assess the gait outcomes' importance to the classification model. SHAP connects optimal credit allocation with local explanations using the classic Shapley values from game theory. The Shapley value, $\phi_i$, explains the importance of gait outcome i for RF and is defined as:
$$\phi_i = \sum_{s \ni i} \frac{(|s|-1)!\,(N-|s|)!}{N!}\,\bigl[R(s) - R(s \setminus \{i\})\bigr]$$
where N is the size of the full set of gait outcomes, s is a subset of the full set that includes i, and R( ) is the accuracy of RF for the input gait outcomes. Since computing the exact Shapley values is computationally expensive, SHAP uses a tree explainer that exploits the information stored in the tree structure to calculate SHAP values, which closely approximate the Shapley values. Higher SHAP values therefore represent a higher impact on the classification of the CLBP- and CLBP+ groups.

Table 2 (excerpt). SR is computed by using the unbiased autocorrelation coefficient.
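To make the Shapley definition concrete, the sketch below computes exact Shapley values by enumerating every subset that includes a given gait outcome. It is feasible only for a handful of outcomes, which is why approximations such as the tree explainer are used in practice. The accuracy function `toy_accuracy` and its numbers are entirely hypothetical; only the outcome names are borrowed from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values; `value` maps a frozenset of features to a model score."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):
            for rest in combinations(others, size):
                s = frozenset(rest) | {i}  # a subset s that includes i
                weight = factorial(len(s) - 1) * factorial(n - len(s)) / factorial(n)
                total += weight * (value(s) - value(s - {i}))
        phi[i] = total
    return phi


def toy_accuracy(subset):
    """Hypothetical accuracy of a classifier restricted to `subset` of gait outcomes."""
    acc = 0.50
    if "IH-V" in subset:
        acc += 0.20
    if "Sen-AP" in subset:
        acc += 0.10
    if "IH-V" in subset and "Sen-AP" in subset:
        acc += 0.04  # interaction gain, shared equally between the two outcomes
    return acc


phi = shapley_values(["IH-V", "Sen-AP", "SR-ML"], toy_accuracy)
print({k: round(v, 3) for k, v in phi.items()})
```

The values sum to the difference between the accuracy with all outcomes and with none (0.84 − 0.50 = 0.34), and an outcome that never changes the accuracy (SR-ML in this toy example) receives a value of zero.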

Results
Demographic characteristics are provided in Table 3. Out of a total of 60 patients, 11 were excluded because essential parts of their dataset were incomplete (CSI scores and/or accelerometry data), and 7 were excluded because they had fewer than 4 data segments (3 had 1 segment, 2 had 2 segments, and 2 had 3 segments). Therefore, 42 patients were included in the data analysis. Differences between CLBP+ and CLBP- group characteristics (Table 3) were not statistically significant (p > 0.05), with the exception of CSI score (p < 0.001) and BSI (p = 0.01).
Because 42 patients (23 CLBP- and 19 CLBP+) were included, and for every patient 4 segments were randomly selected, the total number of accelerometer data segments was 168. Therefore, the training and testing datasets contained 136 and 32 segments, respectively. The mean Pearson coefficient between workdays and weekend days was 0.983, indicating almost perfect correlation.
Testing data were used to evaluate the generalizability of RF, and the confusion matrix is shown in Fig. 3. From the confusion matrix, accuracy, sensitivity, specificity, precision, and the F1-score were calculated to evaluate the performance of the model. RF achieved an accurate classification (84.4% accuracy), and the sensitivity and specificity were 75.0% and 93%, respectively. The precision was 92% and the F1-score was 82.6%. The ROC curve is presented in Fig. 4, showing that RF achieved an AUC of 0.83 and that the maximum Youden index was 0.69.
The importance of the gait outcomes for RF is shown in Fig. 5. Based on the SHAP values, the 10 gait outcomes above the red line in Fig. 5 were considered important to the classification model. For the gait outcomes below the red line, the SHAP values were too low. The important gait outcomes are IH-V, SF variability-ML/AP, SR-ML, Max LyE-V/ML, Sen-AP, Max LyE per stride-V, HR-ML and SL variability.
Fig. 6 shows the violin-box plots of the 10 important gait outcomes. A violin-box plot is a hybrid of a kernel density plot and a box plot, and the dots show the individual data. A box plot contains a set of whiskers, a box and a horizontal line in the middle of the box, representing the minimum, maximum, first quartile, third quartile and median of the data, respectively. From this figure, it is easy to distinguish the differences in the medians between groups. It shows that the CLBP- group has higher IH-V and HR-ML (better smoothness); higher SF-variance-ML, SF-variance-AP and SL-variance (higher variability); lower SR-ML (lesser regularity); lower Max LyE-V, Max LyE-per-stride-V and slightly lower Max LyE-ML (better stability); and slightly higher Sen-AP (lesser predictability). Although the differences in the medians between the 2 groups in Sen-AP and Max LyE-ML are small, their distributions are different. In Sen-AP, the data of CLBP- had a wider distribution and CLBP+ shows more data at the bottom. In Max LyE-ML, the data of CLBP- are concentrated around the median, while CLBP+ has a wide distribution and a lower peak. For the other gait outcomes, the distributions are also different. In IH-V, CLBP- and CLBP+ both showed a bimodal distribution, but the peaks of the distributions are different. In SF Variability-ML and SF Variability-AP, CLBP+ has a larger peak at the bottom while CLBP- has a

Discussion
The aim of this study was to analyze whether and how the presence of CS was related to differences in gait performance of patients with CLBP during daily life by using a machine learning approach. Based on quantitative and qualitative gait outcomes, using an RF, the two groups (CLBP- and CLBP+) could be classified with a high accuracy. The classification results indicated that CLBP- patients walk differently from CLBP+ patients. Furthermore, the SHAP values showed that the differences between the CLBP- and CLBP+ groups were present in gait outcomes that represented smoothness, stability, predictability, regularity, and variability.
In the present study, we addressed the walking measurement of patients with CLBP in a daily-living environment. Walking in a controlled laboratory or during a clinical assessment is different from self-initiated gait during activities of daily living. Walking in daily life might be subject to environmental perturbations and quick changes while performing a task, and often involves the performance of several actions at the same time [32], e.g. walking while carrying a cup of coffee. These influences on gait are not present in controlled studies and are not captured by conventional gait outcomes that average over stride cycles, such as mean step length, mean step time, and number of steps. Therefore, the present study included gait outcomes that take into account the interdependency of gait cycles and how gait cycles evolve over time, e.g., using sample entropy as a measure of predictability of the gait pattern, the maximal Lyapunov exponent as quantification of local stability, and correlation-based measures [33].
The accuracy value of 84.4% shows that RF has a high classification accuracy. The specificity of RF reveals that 93% of the CLBP- samples (15 samples, true negatives) were correctly classified as members of the group without a high CS level, while 7% (1 sample, false positive) were misclassified. The sensitivity shows that 75% of the samples of the CLBP+ group (12 samples, true positives) were assigned to this group, whereas 25% were wrongly classified as belonging to the CLBP- group (4 samples, false negatives). Decreasing the probability of false positives will increase the probability of false negatives, and vice versa. The F1-score takes false positives and false negatives into consideration at the same time by computing the harmonic mean of precision and sensitivity. The high F1-score (82.6%) of RF implies that the model has a good and balanced performance. The Youden index (0.69) was higher than 0.5, which means that RF, as a diagnostic test, is able to balance sensitivity and specificity. The AUC indicates that RF has an 83% chance of distinguishing CLBP+ and CLBP- correctly. Based on these performance measures of RF, we conclude that the CLBP- and CLBP+ groups had different gait patterns, and that the gait outcomes identified by SHAP as important to the classification model are trustworthy.
In the present study RF was applied for classification, chosen among the many available machine learning approaches, such as K-nearest neighbors (KNN), Naive Bayes (NB), Artificial Neural Network (ANN), and Support Vector Machine (SVM). In general, machine learning approaches can take the interaction of gait outcomes into consideration. KNN and NB are instance-based learning approaches, which implies that they do not learn from training data [34]. Our choice for RF was based on the results of a previous study that compared RF, ANN, and SVM to classify different age groups on similar gait outcomes. The results of that study showed that all approaches had a good overall classification accuracy [20]. Moreover, for the current dataset, our preliminary empirical work, in which we compared the performance of different machine learning classifiers, showed that RF and ANN performed best compared with SVM, NB and KNN (details in Appendix A). A drawback of ANN is that it requires a large dataset to find the optimal activation function and avoid overfitting [35]. With a dataset of limited size, both SVM and RF are good choices. Considering the clinical aim of the study, namely to investigate the relationship between CLBP, CS, and gait patterns, it is important that the results of the machine learning can be translated into meaningful outcomes that can support clinical decision making. SVM can deal with non-linear data by using kernel functions; however, choosing an appropriate kernel function could be difficult for clinicians. Additionally, it implicitly maps gait outcomes to a high-dimensional feature space. This mapping changes the structure of gait outcomes and makes it hard to explain which gait outcomes contribute most to the classification model. Similarly, ANN uses various activation functions (e.g., Tanh, Sigmoid), which makes the interactions of the gait outcomes invisible. In contrast, RF is an ensemble of decision trees. Decision trees can incorporate gait outcome interactions naturally in the classification process. For example, a decision tree with depth 2 from an RF, with the parent node IH-V and the child node Sen-AP, can describe an interactive gait pattern: if IH-V >* and Sen-AP >*, the data belong to CLBP-. Because RF includes multiple decision trees, it can capture the complex interaction of gait outcomes with good accuracy. Each tree is built based on a random subset of gait outcomes, and the samples in the dataset can be repeatedly selected when training. Consequently, it can help to reduce the chance of overfitting and provide a generalized model. SHAP can use the information stored in the tree structure to disclose which gait outcomes differ between the CLBP- and CLBP+ groups. These differences in terms of gait regularity, smoothness, and stability are meaningful to clinicians.
In this study, SHAP was used to evaluate the importance of each gait outcome, instead of the conventionally used Gini impurity and information entropy. The value of Gini impurity is based on the tree structure in RF, and information entropy reflects the level of "information" of a gait outcome. Gait outcomes are interrelated and interact in a complex nonlinear manner [33]. SHAP is based on game theory and evaluates the contribution of each gait outcome to the classification accuracy by considering all possible combinations of gait outcomes. Therefore, SHAP provides a good method to explain the importance of gait outcomes to RF. The SHAP values suggest that the differences between the CLBP- and CLBP+ groups are reflected in smoothness, stability, predictability, regularity, and variability of gait. Compared with the CLBP- group, the CLBP+ group exhibited lower smoothness and local stability of gait, together with a more regular, less variable, and more predictable gait pattern.
Gait patterns of patients with CLBP are usually compared with the gait pattern of healthy persons. To the best of our knowledge, this is the first study in patients with CLBP that addresses the difference in gait pattern between two CLBP groups based on low and high CS levels, which makes a direct comparison with other studies difficult. The finding of different gait patterns between low and high CS levels supports the notion that within the heterogeneous CLBP group, different motor control strategies are adopted. Two motor control strategies on a continuum have been suggested, with "tight control" and "loose control" at each end and normal trunk control in the middle [36].
The gait patterns of the CLBP+ group might suggest that patients with CLBP+ adopt a more "tight control". The "tight control", which involves increased trunk muscle activation and enhanced muscle co-contraction, might enhance control over trunk posture and movement [36]. Increased muscle activation and enhanced co-contraction would help individuals to maintain the stability of the lumbar spine [37] by restricting its movement amplitude. However, in a complex daily-living environment, this strategy might impair patients' ability to maintain balance during walking because of unstable surfaces and environmental perturbations [38], and therefore result in a lower gait stability (compared with CLBP- patients). Increased co-contraction would reduce the demand for the intricate control of the sequences of muscle activation. It might avoid the potential error raised by inaccurate sensory feedback in CLBP [36]. This might allow patients to control their trunk movement precisely [39] and, therefore, result in a lower variability and a higher regularity of gait in CLBP+ patients. Our results thus suggest that the CLBP+ group exhibited a more "tight control". Therefore, the lower stability and variability, and the higher regularity and predictability, in gait of the CLBP+ group could be the result of the adoption of "tight control".
The gait patterns of the CLBP- group, on the other hand, might be explained by a "loose control" strategy. The "loose control", which involves reduced muscle excitability, might reduce the control over trunk movements [36]. The spine, of which each spinal unit has 6 degrees of freedom, is controlled by its surrounding musculature. Reduced muscular excitability leads to reduced control over the spinal muscles, to larger amplitude movements, and to more movement variability during repeated tasks [36]. The increased variability in gait of the CLBP- group might support this finding. Additionally, increased variability would lead to a lower regularity in gait, which was also found in the CLBP- group. Apart from this, increased motor variability might prevent muscle fatigue [40] because it allows sharing the load between different structures or tissues. Motor variability makes it possible to explore new pain-free motor control solutions [41]. This is a possible explanation for the higher smoothness in gait of CLBP- patients, because it allowed them to flexibly adapt to the complex daily-living environment by using different movement solutions. So, the higher variability and smoothness, and lower regularity, in gait patterns might hint that the CLBP- group adopted a more "loose control" compared with the CLBP+ group.
Although the "tight control" strategy might have short-term benefits, it may also contribute to a higher level of CS. The "tight control" present in CLBP+ patients presumably increases muscle activation and co-contraction, and leads to larger forces acting on the spine and higher spinal loading. Moreover, it has been shown that even when patients are at rest, muscle co-contraction can be continuous [42]. These long-lasting peripheral noxious stimuli might explain the development and/or persistence of CS [43]. Additionally, it has been reported that a "tight control" strategy relates to negative pain cognitions [44], a psychological process that also might contribute to the higher CS scores of the CLBP+ group.
Clinically, the gait outcomes identified as important to the classifier may assist clinicians by providing them with a more accurate understanding of the gait performance of patients with CLBP with low or high CS levels, and with an explicit operationalization of the observed "abnormal" gait pattern of patients with chronic pain. Whether "abnormal" should be interpreted as a functional or a dysfunctional motor control strategy in the short or long term remains to be studied. RF and SHAP as used in this study present a novel way to identify interacting features and can therefore be used in further studies. The presented accurate classification could become meaningful if it leads to effective treatment approaches. The differences in gait patterns of the CLBP- and CLBP+ groups could be the result of different adapted motor control strategies, and these strategies could be the causes, consequences, or both, of differences in CS levels in patients with CLBP. While this cross-sectional study has objectified a relation between CS and gait outcomes, the causality of this relation is unknown. Follow-up studies would benefit from a longitudinal design with multiple measurements to help further unravel this relation, as well as the relation to disability.
In line with most studies on walking and CLBP, we used cross-sectional data; thus, we cannot infer causality between motor control changes, CS and CLBP. Some patients were receiving analgesic or anti-inflammatory treatment at the beginning of the study, and how these medicines interact with CS and gait outcomes is unknown. Moreover, we labeled the groups based on the CSI score and the cut-off values from a previous study [25]. It should also be noted that a gold standard measure to diagnose CS is unavailable. The CSI is regarded as an indirect measure of CS, because higher scores are associated with the presence of CS syndromes [25]. In addition to gait assessment, it would be interesting to explore differences in physical activity between the CLBP- and CLBP+ groups, because several studies have reported that the relationship between CLBP and physical activity levels is heterogeneous [45].

Conclusion
The present study analyzed gait data collected during daily living from CLBP patients with low and high CS levels. RF and SHAP were applied for classification and for assessing the contribution of gait outcomes to the model. This analytic approach demonstrated that RF can accurately classify subgroups of patients with CLBP and low or high CS levels based on differences in gait outcomes. The SHAP results showed that the gait outcomes differing between low and high CS levels were related to gait regularity, variability, predictability, smoothness, and stability. This may imply that patients with low and high CS levels adopted different motor control strategies. Patients with CLBP and a low CS level (CLBP-) used a "loose control" and, therefore, exhibited more smoothness and stability in their gait patterns. Patients with CLBP and a high CS level (CLBP+) adopted a "tight control" and showed a more regular, less variable and more predictable gait pattern.
The results of this study may contribute to a better understanding of gait characteristics in patients with CLBP and their association with CS, and may in the future assist in better personalized rehabilitation interventions [46].
(a) age between 18 and 65 years old at the time of recruitment; (b) admitted to the interdisciplinary pain rehabilitation program; (c) could follow instructions; (d) signed informed consent. Additionally, patients were excluded if they: (a) had a specific diagnosis that would better account for the symptoms (e.g. cancer, inflammatory diseases and/or spinal fractures); (b) had neuralgia and/or radicular pain in the legs; (c) were pregnant; (d) were in an acute phase of pain.

Fig. 5. Feature importance of the Random Forest classifier. The 10 gait outcomes above the red line are: index of harmonicity in vertical direction (IH-V), variability of stride frequency in mediolateral/anteroposterior direction (SF variability-ML/AP), stride regularity in mediolateral direction (SR-ML), Maximal Lyapunov exponent in vertical/mediolateral direction (Max LyE-V/ML), sample entropy in anteroposterior direction (Sen-AP), Max LyE-V: Maximal Lyapunov exponent per stride in vertical direction, harmonic ratio in mediolateral direction (HR-ML) and variability of stride length (SL variability). The remaining gait outcomes below the red line are: WS variability: variability of walking speed, IH-ML: index of harmonicity in mediolateral direction, WS: mean walking speed and SL: mean stride length. ABS: absolute value. SHAP: SHapley Additive exPlanations.

Fig. 6. Violin-box plot for the 10 gait outcomes. Dots show the individual data. CLBP-, CLBP+: Patients with chronic low back pain with low (−) and high (+) CS levels. IH-V: index of harmonicity in vertical direction, SF variability-ML/AP: variability of stride frequency in mediolateral/anteroposterior direction, SR-ML: stride regularity in mediolateral direction, Max LyE-V/ML: Maximal Lyapunov exponent in vertical/mediolateral direction, Sen-AP: sample entropy in anteroposterior direction and HR-ML: harmonic ratio in mediolateral direction.
Catalog | Gait characteristic | Description and method
Pace | Total duration of walking in the day | The accumulated time (in seconds) of the walking bouts in one segment.
Pace | Total number of steps in the day |
where Acc(i) is the sample acceleration signal, N the number of samples, and m the time lag. The first peak of Ad(m) represents the stride regularity.
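
The stride regularity computation described above can be illustrated with a minimal sketch. It assumes that Ad(m) is the unbiased autocorrelation of the acceleration signal, Ad(m) = 1/(N - m) · Σ Acc(i)·Acc(i + m), normalised by Ad(0); this form is an assumption, since the defining equation is not reproduced in this part of the text. The function names (`unbiased_autocorrelation`, `first_peak`) and the synthetic sinusoidal signal are hypothetical and serve only as an illustration, not as the pipeline used in the study.

```python
import math


def unbiased_autocorrelation(acc, max_lag):
    """Return Ad(m) for m = 0..max_lag, normalised so that Ad(0) = 1.

    Assumed form: Ad(m) = 1/(N - m) * sum_i Acc(i) * Acc(i + m),
    computed on the mean-centred signal.
    """
    n = len(acc)
    mean = sum(acc) / n
    centred = [a - mean for a in acc]  # remove constant offset (e.g. gravity)
    ad = []
    for m in range(max_lag + 1):
        s = sum(centred[i] * centred[i + m] for i in range(n - m))
        ad.append(s / (n - m))  # unbiased: divide by the number of overlapping samples
    return [v / ad[0] for v in ad]


def first_peak(ad, min_lag=1):
    """Return (lag, value) of the first local maximum at lag >= min_lag, or None."""
    for m in range(max(min_lag, 1), len(ad) - 1):
        if ad[m - 1] < ad[m] >= ad[m + 1]:
            return m, ad[m]
    return None


# Hypothetical test signal: a perfectly periodic "gait" with a 100-sample cycle.
period = 100
acc = [math.sin(2 * math.pi * i / period) for i in range(1000)]

ad = unbiased_autocorrelation(acc, max_lag=150)
lag, regularity = first_peak(ad)
print(f"first peak at lag {lag} samples, regularity {regularity:.3f}")
```

For this noise-free synthetic signal the first peak should fall at the cycle length and its height should be close to 1; real accelerometer data would give lower values, with less regular gait producing a lower peak. Peak detection on real data would likely need smoothing or a minimum-lag constraint, which this sketch leaves out.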