Reliability of Musculoskeletal Fitness Tests and Movement Control Impairment Test Battery in Female Health-Care Personnel with Recurrent Low Back Pain

Introduction
The prevalence of LBP varies between occupational groups, with physically demanding jobs known to be a risk factor [1]. Among health-care workers, the one-year prevalence of LBP has been found to be as high as 45% to 77% [2]. Low back problems have often developed already during the nursing-school clinical training period [3]. Among nursing personnel, musculoskeletal disorders are the main cause of work absenteeism and early retirement, LBP being the leading cause. In Finland, only 48% of nursing assistants and 58% of nurses reach the old-age pension point [4].
In most low back patients (85-90%) the pain is classified as nonspecific low back pain (NSLBP). After an acute pain episode, the majority recover, but 50-70% find the pain recurring within the following year, and 10% will become chronic [5]. In the literature there is still some ambiguity in conceptions of the episodic or fluctuating nature of NSLBP [6]. LBP episodes are traditionally regarded as individual events with good prognosis. This opinion has been challenged in longer-term follow-up studies [7][8][9]. These studies indicate that the majority of people with LBP experience back pain off and on over a long period of time [10]. Preventing new LBP episodes is considered important for the prevention of persistent pain.
[...] and heavy lifting. Therefore, it is conceivable that movement control (i.e. motor skills), the ability to stabilize the spine with the trunk muscles, and muscular endurance/strength are of importance in connection with the risk of developing LBP [5,13,20]. Consequently, there is a clear need for reliable measurement methods for evaluation of musculoskeletal fitness and motor abilities in connection with physically demanding nursing work. Repeatability of the field tests for health-related fitness has been studied in healthy populations [21,22], but to our knowledge not in people with NSLBP.
Impairments in postural and movement control of the lumbar spine have been posited to be risk factors for prolonged LBP [23][24][25]. A significant difference in ability to actively control the movement of the low back has been found between patients with LBP and subjects without back pain, and between acute and chronic LBP patients [24,26], but to our knowledge the test-retest repeatability of the standard movement control impairment (MCI) test battery has not been studied.
Exercise is often recommended for LBP patients [27]. Reliable and valid measurement methods are essential for evaluating the effects of exercises. In addition, methods of measurement for screening the physical performance capacity of people who have a physically demanding job and who may be at risk for LBP are needed.
The standard measure of reliability is the test-retest repeatability within one week [28]. Long-term reliability (with a 2-3-week interval between measurements) of measurements for sensorimotor functions [29] and of muscle strength and endurance measurements [30] has been studied in patients with chronic LBP, but not in people with minor LBP problems who are still at work.
The aim of the study reported upon here was to examine the long-term reliability of selected motor and musculoskeletal fitness tests and the MCI test battery for female health-care personnel with recurrent NSLBP. Another aim was to ascertain whether change in pain intensity between two measurement points had any effect on the variation in the test results.

Participants
The participants (n=47) of the present reliability study were a subsample from a larger randomised controlled trial (the NURSE RCT, clinical trial registration NCT04165698). Both measurement sessions (test I and test II) in the present reliability study were conducted before the study interventions started, and no feedback, education or training was given to participants between measurements.
Written information about the NURSE RCT was disseminated by the head nurses to all health-care personnel working in the geriatric wards of two municipal hospitals and an old people's home in late 2011 in Tampere, Finland. Personnel with LBP who were willing to participate in the NURSE RCT filled in the screening questionnaire. To be eligible, personnel had to meet the following criteria: 1) being a woman aged 30-55, 2) having worked in her current job for at least 12 months, and 3) having experienced LBP intensity of at least 2 on the numeric rating scale (NRS) employed (0-10) [31] within the preceding four weeks.
The exclusion criteria for the study were 1) serious former back injury (fracture, surgery, or prolapsed disc), 2) chronic LBP as defined by a physician or self-reporting of daily LBP over the past seven months or longer, 3) a serious other disease or symptoms limiting participation in moderate-intensity neuromuscular exercise, 4) engaging in neuromuscular-type exercise more than once a week, and 5) pregnancy or recent delivery (< 12 months). Written informed consent was obtained from all participants.
More than 80% of the study subjects had irregular working hours, in two or three shifts, and 68% of them were assistant nurses. The majority of them perceived their health to be average or good, although 66% reported having at least three musculoskeletal pain sites. Thirty per cent rated their fitness level somewhat worse than that of persons of the same age and gender. Other background characteristics of the participants are reported in Table 1.

Study design
Long-term reliability was studied by means of a test I-test II design. The average interval between pre-study screening and the first measurement point was approximately three weeks, and the mean number of days between the two measurement points was 18 (SD 7.9). The reliability of selected musculoskeletal fitness tests and the MCI test battery was studied in the analysis. Health screening was conducted prior to testing, in accordance with the safety model of the Health-related Fitness Test Battery for Middle-aged Adults [32].
At the first measurement point (test I), an experienced physiotherapist (with a master's degree and 14 years' experience of clinical work) scored the performance of the subjects in six tests of MCI and six field tests of motor and musculoskeletal fitness in a set order. At the second measurement point (test II), the same physiotherapist followed the same test protocol. The results from the first measurement occasion were not available to the physiotherapist at the second measurement point. The measurements were conducted in the physiotherapy department of the geriatric hospital where most of the study subjects were working.

Test protocols and measurements
Motor and musculoskeletal fitness tests: The field tests of motor and musculoskeletal fitness assessed motor abilities, flexibility and muscular strength. The motor ability tests used included rhythm coordination and running a figure of eight [33]. Trunk side-bending [22] was used for assessing the range of motion of lateral flexion of the spine, and dynamic sit-ups [34], modified push-ups and one-leg squats [22] for determination of muscular strength.

The movement control impairment test battery:
The MCI test battery [26], based on descriptions by Sahrmann [35] and O'Sullivan [36], consists of six tests: 1) the waiter's bow (flexion of the hips in upright standing without movement of the lower back), 2) dorsal tilting of the pelvis, 3) sitting knee extension, 4) rocking backwards and forwards in quadruped position, 5) prone-lying active knee flexion and 6) one-leg stance. The MCI test results were judged dichotomously by observation: the subject was noted either as not having motor control impairment (0) or as having impairments (1).
All individual measurements were conducted in standardised order for each study subject: MCI tests before fitness tests, and motor and flexibility tests before measurements of muscular strength. The test instructions, criteria for the MCI test battery, and motor and musculoskeletal fitness tests are described briefly in Supplementary information (1).
The measurements of LBP intensity: Intensity of LBP during the previous four weeks was measured by NRS (0-10) [31] at pre-study screening, at the baseline (test I) and at the re-test session (test II) in order to get an overview of pain development. To clarify more precisely the intensity of LBP at both measurement points (test I and test II), pain intensity during the past seven days was also measured on a visual analogue scale (VAS, where 0 = no pain and 100 = the worst possible pain) [37].

Statistical analysis
The mean, standard deviation (SD), and range of the test I-test II measurements of the fitness tests are presented as descriptive statistics in Table 2.
Estimates of repeatability for interval-scale measurements were calculated in the manner suggested by Hopkins [38]: 1) within-subject variation in terms of typical error (s), calculated as the standard deviation of the test-retest differences divided by the square root of 2 ($s = SD_{\mathrm{diff}}/\sqrt{2}$), and 2) the relative error measure, the coefficient of variation (CV), calculated as the typical error divided by the mean of the two tests ($CV = s/\bar{x}$, reported as a percentage). Systematic change in the mean is an important issue when subjects perform repeated series of test trials [28,38]. Therefore, the percentage changes in mean performance between the first and second test results were also calculated. The systematic changes in mean between the two measurement sessions were considered statistically significant if their 95% confidence interval (CI) did not include the value zero.
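The repeatability estimates described above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study (the analyses were run in SPSS); the function names and the example scores are hypothetical, and the CV is assumed to use the grand mean of both test sessions as its denominator.

```python
import math
import statistics


def typical_error(test1, test2):
    """Typical error s: SD of the test-retest differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(test1, test2)]
    return statistics.stdev(diffs) / math.sqrt(2)


def coefficient_of_variation(test1, test2):
    """CV (%): typical error divided by the mean of the two tests.

    Assumption: 'mean of the two tests' is taken as the grand mean
    over all scores from both sessions.
    """
    grand_mean = statistics.mean(list(test1) + list(test2))
    return 100 * typical_error(test1, test2) / grand_mean


def percent_change_in_mean(test1, test2):
    """Percentage change in mean performance from test I to test II."""
    mean1 = statistics.mean(test1)
    return 100 * (statistics.mean(test2) - mean1) / mean1


# Hypothetical repetition counts for four subjects (not study data).
test_i = [10, 12, 14, 16]
test_ii = [11, 12, 16, 17]
print(round(typical_error(test_i, test_ii), 2))
print(round(coefficient_of_variation(test_i, test_ii), 1))
print(round(percent_change_in_mean(test_i, test_ii), 1))
```

The 95% CI for the change in mean is left out of the sketch because it requires a t-distribution quantile, which the Python standard library does not provide.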
The repeatability of the nominal-scale measurements in the MCI tests (0 or 1) was analysed by means of Cohen's kappa coefficient (k).
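As an illustration of the kappa statistic for dichotomous MCI ratings, the following Python sketch computes Cohen's kappa from two rating sessions. The function name and the example ratings are hypothetical, and returning 1.0 when chance agreement equals 1 is a convention chosen for this sketch, not something specified in the study.

```python
def cohens_kappa(ratings1, ratings2):
    """Cohen's kappa for two sets of categorical ratings of the same subjects."""
    n = len(ratings1)
    # Observed proportion of agreement between the two sessions.
    p_observed = sum(a == b for a, b in zip(ratings1, ratings2)) / n
    # Agreement expected by chance, from the marginal proportions.
    categories = set(ratings1) | set(ratings2)
    p_expected = sum(
        (ratings1.count(c) / n) * (ratings2.count(c) / n) for c in categories
    )
    if p_expected == 1:
        # Assumed convention: every rating falls in one category in both sessions.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings for ten subjects (0 = no impairment, 1 = impairment).
test_i = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
test_ii = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0]
print(round(cohens_kappa(test_i, test_ii), 2))
```

In this made-up example the raters agree on 8 of 10 subjects, yet kappa is considerably lower than 0.8 because part of that agreement is expected by chance.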
For analysis of the possible effect of change in pain intensity on variation between the test I and test II results, new variables were calculated: 1) change in intensity of LBP, measured by VAS (0-100) during the last seven days, 2) change between the first and second MCI tests (indicating that performance had deteriorated, remained unchanged, or improved), and 3) change in fitness-test results. Normality of distributions for numerical variables was confirmed. Associations between change in pain and nominal-class variables for change in MCI tests were analysed via the Kruskal-Wallis test. The Kruskal-Wallis test was used instead of analysis of variance (ANOVA), because the number of cases in some categorical classes was too small for ANOVA. Associations between change in LBP and change in interval-scale fitness-test results were examined by means of Pearson's correlation coefficient (r). All analyses were conducted with the SPSS statistical analysis package, version 22.
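The two association analyses can be sketched in plain Python as follows. This is an illustrative sketch only, not the SPSS procedure used in the study: the function names and all data values are hypothetical, the Kruskal-Wallis H statistic is computed without a tie correction, and no p-values are derived.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two paired samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (assumption: no tie correction applied)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical VAS changes (mm) grouped by MCI change class:
# deteriorated, unchanged, improved.
h = kruskal_wallis_h([[-20, -5, 3], [0, 4, 10], [12, 18, 25]])
# Hypothetical change in pain vs. change in push-up repetitions.
r = pearson_r([-30, -10, 0, 15], [4, 2, 1, -1])
print(round(h, 2), round(r, 2))
```

In practice H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom; a statistical package would also apply the tie correction omitted here.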

The criteria for sufficient reliability
At present, there are no standards delimiting acceptable measurement precision for monitoring physical fitness in healthy populations, nor for monitoring the motor or musculoskeletal fitness levels of people with recurrent or chronic LBP. Altman [39] has rated the kappa coefficient thus: ≤ 0.20 = poor, 0.21-0.40 = fair, 0.41-0.60 = moderate, 0.61-0.80 = good, and > 0.80 = very good.
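The kappa rating bands can be expressed as a small lookup, sketched below in Python. The function name is hypothetical, and the sketch assumes that a kappa of exactly 0.80 is rated "good" and that values falling between the listed two-decimal bands (for example 0.205) belong to the higher band.

```python
def altman_rating(kappa):
    """Map a kappa coefficient to the verbal rating bands attributed to Altman [39]."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:  # assumption: exactly 0.80 counts as "good"
        return "good"
    return "very good"


# Kappa values quoted elsewhere in the text for the waiter's bow test.
for value in (0.53, 0.48, 0.39):
    print(value, altman_rating(value))
```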

Results
The participants had sub-acute or periodic, but not chronic, NSLBP. Their mean pain intensity (NRS 0-10, during the last four weeks) decreased from 4.5 (SD 1.8) at the pre-screening time to 3.4 (2.4) at the first measurement point and further to 2.8 (2.7) at the second measurement point. At the first measurement point, 71% of the study sample had LBP on some or most days of the week, but not daily, 7% had daily pain, and 22% had recovered since the screening and were pain-free (Table 1).
The mean number of days between the test I and test II measurements was 18, and the median LBP intensity, measured on VAS (0-100 mm) during the last seven days, decreased from 26 (Q1 = 15, Q3 = 51) to 19 (Q1 = 4, Q3 = 36) in that time. The reduction in pain intensity was statistically significant (Z = -2.77, p = 0.006). The change in VAS between the two measurement points was less than 15 mm for 51% of the study subjects; for 38% the pain intensity decreased by more than 15 mm, and for 11% it increased by more than 15 mm. Descriptive results from test I and test II measurements of the motor and musculoskeletal fitness tests are presented in Table 2. In general, all subjects were able to perform the rhythm co-ordination, trunk lateral flexion, dynamic sit-up, and modified one-leg squat tests. On account of musculoskeletal problems other than LBP, one person was excluded from the modified push-up test and one from the running figure of eight.
Results of the test I-test II long-term reliability analysis for interval-scale fitness measurements are presented in Table 3. The least within-subject variation was found for the running figure of eight (s = 0.22 s, CV = 2.8%) and the most for modified push-ups (s = 1.04 reps, CV = 12.2%). Systematic changes in the mean between the two measurement sessions were detected for dynamic sit-ups and modified push-ups, which indicates a slight learning effect.
Results for the test I-test II long-term reliability of the MCI tests are presented in Table 4. The best repeatability was found for the dorsal pelvic tilt and the poorest for the one-leg stance.
Associations between change in pain intensity and variation in test I-test II results are presented in Table 5. Only the modified push-up test seemed to be correlated with changes in pain. A decrease in pain was associated with a higher number of push-up repetitions (r = -0.3, p = 0.045, n = 46), which may reflect better performance when the pain had decreased, the learning effect, or both. The change in pain was not associated with variation in any other test, including the MCI tests.

Discussion
The main goal of the study was to evaluate the long-term reliability of selected motor and musculoskeletal fitness tests and the MCI test battery for female health-care workers with recurrent NSLBP. Another aim was to ascertain whether change in pain intensity between two measurement points had any effect on the variation in the test results.
Within-subject variation is the most important measure of test-retest repeatability: the less within-subject variation there is, the greater the precision of the individual measurements and the better the observation of changes [28,38]. A typical example of a systematic change in results of physical fitness testing is the learning effect (bias). Participants perform better in the second test session than in the first, because they have benefitted from the experience of the first session [28,38].

Long-term reliability of the motor and musculoskeletal fitness tests
The running figure of eight test requires both agility and power. This test showed the lowest intra-individual variation of all tests (CV: 2.8%) and a small change in the mean (5%). The results are in agreement with previous studies [21,40], and the test seems highly repeatable across populations. A high performance level in this test has been linked to high quality of life in elderly women [41], but its relevance in testing of people with LBP has not been studied. That said, agility is a capacity that is needed in nursing duties such as transferring patients, who may not behave in the anticipated manner.
Trunk lateral bending assesses spinal mobility in the frontal plane. The low intra-individual variation (CV: 7.5%), very small change in mean (0.3%), and narrow 95% CI (-0.54 to 0.65) indicate that the test is repeatable and that there is no systematic change in the mean. A reduced range of lateral bending has been shown to be a risk indicator for LBP among younger health-care workers [42] and adolescents [43].
Testing of rhythm co-ordination assesses the ability to simultaneously co-ordinate the movements of the upper and lower limbs at a slower and a faster rhythm [44]. The low intra-individual variation (CV: 7.7%) and change in mean (2%) indicate that the test is repeatable. Previous results in healthy volunteers, reported by Rinne et al. [33], seem only modest, and the intra-class correlation coefficient (ICC) was fairly low, 0.70. Changes in the central nervous system, responsible for interpreting sensory stimuli and motor responses, have been found in LBP patients [23,45,46]; however, the relevance of rhythm co-ordination in motor ability testing of LBP patients is not known.
The dynamic sit-up test assesses the strength and endurance of the flexor muscles of the trunk. The intra-individual variation was barely acceptable (CV: 11.2%), and there was a small learning effect (change in mean 6%). In a cross-sectional sample [47], persons with LBP were shown to have less abdominal muscle endurance than asymptomatic controls did.
The one-leg squat test assesses the strength of the lower extremities. The intra-individual variation bordered on inadequate reliability (CV: 11.9%), but the change in mean was small (2%). No learning effect was detected. Strength of the lower extremities is an important capacity when one is lifting and transferring patients in nursing duties, along with keeping the back in neutral position, although scientific evidence of associations between leg strength and LBP is limited. Individuals with chronic LBP have been shown to have weaker gluteus medius muscles than control subjects without back pain [48].
The modified push-up test requires both upper-body muscular strength and trunk stabilisation. Of the musculoskeletal fitness tests, this one showed the least repeatability, with the highest intra-individual variation (CV: 12.2%). The large change in mean (19%) and a 95% CI not including the value zero (1.04-1.95) indicate a learning effect. On the other hand, this was the only test for which we found a statistically significant association between reduced pain and better performance. The test is physically heavy to perform, and might create high compressive forces in the spine. Therefore, the variation in test I-test II results may be due to the learning effect, change in pain intensity, or both. Furthermore, the broad range of results (0-18 repetitions) indicates that performance capacity among nurses varies quite dramatically. A learning effect has been found also in physically active healthy adults [21] and in a less selected population [22]. Low fitness in the modified push-up test has been associated with poor perceived health, low back dysfunction, and pain among middle-aged subjects [49]. Also, poor endurance in the back musculature has been reported to be a risk factor for low back pain [14,50]. The modified push-up test requires endurance of the back muscles for trunk stabilisation. Increased risk of low back pain has also been reported in young conscripts with a poor fitness level in trunk muscle endurance and aerobic performance [13].
All three musculoskeletal fitness tests assessing muscle strength and endurance had a CV > 11% and were borderline for having adequate long-term reliability. With the detected learning effect borne in mind, they can be used in intervention studies to evaluate changes in muscular strength.

Long-term reliability of the movement-control-impairment test battery
Repeatability was good for the dorsal pelvic tilt; moderate for sitting knee extension, waiter's bow, and knee flexion in prone position; fair for rocking forwards and backwards; and poor for the one-leg stance. The results indicate that subjects' performance may vary from day to day. There is a clear need for better standardisation in the two tests yielding the poorest repeatability results (rocking forwards and backwards and the one-leg stance).
Previously, good to excellent intra-observer reliability (k = 0.67-0.95) for the same six MCI tests was reported by Luomajoki et al. [26]. The subjects in their study were either LBP patients (n=27) or persons without LBP (n=13). Measurements were video-recorded in a standardized manner, and for the analysis of intra-observer reliability, examiners rated the same videos two weeks apart. In our long-term reliability study, the participants performed the MCI tests at two different measurement points, and all subjects had had back pain within the preceding two months.
Monnier et al. [51] found poor to moderate intra-rater test-retest repeatability for six clinical movement-control tests employed with marines. Their test battery differed from that used in our study, but rating was still dichotomous by visual observation. The only similar test was the waiter's bow (or standing bow), with an intra-rater kappa coefficient of 0.48 for rater A and 0.39 for rater B (0.53 in our study).
Whether it is more appropriate in clinical MCI tests to use quantitative outcome variables or dichotomous ones may be a subject worthy of discussion. Quantifying test results might enable the rater to obtain more information, which may be more useful for diagnosis and evaluation in a clinical setting [52].
Variation in MCI test I-test II repeatability results can be caused by day-to-day variation in test performance or by difficulties in rating the performance. According to Enoch et al. [53], it is difficult to estimate visually how much the lumbar region is moving during MCI tests without using any technical equipment. If the test is rated dichotomously 'yes/no' or 'can/cannot', much information is hidden between the two endpoints. Also, there is no clear consensus on when the test is passed or not passed or on where the dichotomous cut-off points should be [53].
In clear 'yes or no' cases, it is quite easy to place a person in the correct category, but there are many more complicated cases. If, for example, the MCI test performance starts correctly but at the very end of the range of motion the person loses control of the lumbar spine (maybe because of muscle tightness or restricted hip mobility), it is more difficult for the rater to decide whether the performance is correct or not. Two thirds of the study subjects had three or more musculoskeletal pain sites, which may have had an effect on the variation of the results. For instance, pain in the wrists or knees can affect weight transfer when one is rocking forwards and backwards in quadruped position.
To bring higher reliability to the MCI test battery, we suggest either better standardisation of those MCI tests with poor to fair reliability or adding a third category between the dichotomous ratings currently used (movement-control impairment and no impairment). This new 'in-between' class could cover all of the less clear cases, which are more difficult to classify without hesitation.
Ability to control the position and movement of the lumbar spine is important for back health. Patients with chronic LBP have more movement control impairments [24], lower tactile acuity in the back region [54], and significantly poorer ability, on average, to sense a change in lumbar position [55] relative to healthy subjects. Measurement of the neuro-motor control of the spine in patients with NSLBP is considered to be important, but there seems to be a lack of reliable and feasible measurement methods [56]. Clinical screening tests for assessing MCI in people with NSLBP carry a high risk of bias [52].

Associations between change in pain and variation in test results
Ostelo [37] has evaluated the minimal important change in VAS as 15 mm. The change in pain was less than 15 mm for 51% and more than 15 mm for 49% of our study sample. Pain is often presupposed to influence performance negatively [57]. Our hypothesis was that variation in pain causes variation in test results; that is, less pain means better performance, and vice versa. The results of our study, however, do not support this hypothesis. Change in intensity of pain in persons with NSLBP had almost no effect on the variation of the test I-test II results for the motor and musculoskeletal fitness tests or for the MCI tests. Our results are in agreement with results presented by Leitner et al. [29], who found in their long-term reliability study on patients with chronic LBP that changes in pain intensity are not associated with changes in postural stability measurements.
Long-term reliability is not often analysed in repeatability studies. It has been recommended that for basic repeatability assessment the retest should take place within one week of the first test [28]. Longer periods between measurements are considered to admit more intra-individual changes in fitness, tiredness, and other such factors affecting physical performance. So far, to our knowledge, only two studies [29,30] with an interval of 2-3 weeks between tests have investigated long-term reliability of measurements of physical parameters, such as muscular strength and endurance, and sensorimotor functions like postural control in people with LBP. That time interval corresponds to the duration of inpatient rehabilitation programmes for patients with chronic LBP in central Europe. Fitness or neuro-motor capacity of people with LBP seldom changes within a few weeks without any intervention, but LBP and pain intensity often seem to fluctuate from day to day. The periodic nature of pain in many people with low back trouble might influence patients' adherence to or motivation for the measurements [30,57]. This topic needs further investigation with larger study samples.

Limitations of the study
The main limitation of the study was that we did not standardise the measurement time in relation to work shifts. Tiredness after work could have caused variation in test results in comparison with measurements that were conducted before a work shift or on a day off. Given the irregular working hours encountered with shift work, arranging test sessions that suited the same tester was very challenging.

Strengths of the study
All measurements took place twice in the same locations, with the same examiner and the same equipment. Prior to the first testing session, the written test instructions and criteria were specified, reviewed, and rehearsed by two members of the research group and the rater. The statistical methods and outcome measures agree with expert recommendations [28,38]. We showed that selected motor and musculoskeletal fitness tests are reliable, safe and feasible to use in evaluation of motor and musculoskeletal fitness components in female health-care personnel who have LBP and a physically demanding job. We also showed that some of the widely used tests in the MCI test battery need better standardisation.

Implications
For fitness or MCI tests to be applied in clinical practice, the tests must be reliable and also safe, economical, simple, and easy to administer. All of the musculoskeletal and motor fitness tests studied in the present study fulfil the above-mentioned demands; however, the MCI test battery seems to have some deficits with respect to long-term reliability (for the one-leg stance test and rocking forwards and backwards).
The results provide useful information for the selection of measurement methods for intervention studies aimed at reducing LBP and for clinical work with patients with LBP. The results indicate a need for further development of the MCI test battery, which is widely used by physiotherapists in evaluation of patients with LBP.

Table 1 :
Baseline characteristics of the study sample (n=47).

Table 2 :
Descriptive results of test I-test II measurements of the musculoskeletal fitness tests.

Citation: Taulaniemi RPA, Kankaanpää MJ, Tokola KJ, Luomajoki HA, Suni JH (2016) Reliability of Musculoskeletal Fitness Tests and Movement Control Impairment Test Battery in Female Health-Care Personnel with Re-Current Low Back Pain. J Nov Physiother 6: 282. doi:10.4172/2165-7025.1000282

Table 3 :
Test I -test II repeatability of the motor and musculoskeletal fitness test.

Table 4 :
The test I -test II repeatability (with 95% CI) of the movement control tests of the low back.

Table 5 :
Associations between change in pain intensity (VAS in the second test -VAS in the first test) and change in motor and musculoskeletal fitness as well as MCI test results in test I -test II (statistically significant p-value in bold).