Cross-cultural adaptation, reliability, and validity of the Turkish version of the Neck OutcOme Score

Background/aim This study aims to determine the validity and reliability of the Turkish version of the Neck OutcOme Score (NOOS). Materials and methods Two hundred eight patients suffering from nonspecific neck pain participated in the study. Test–retest reliability and internal consistency were assessed using intraclass correlation coefficients (2, 1) and Cronbach’s alpha, respectively. Dimensionality was investigated with factor analysis. Construct validity was determined by testing whether the hypothesized correlations between NOOS subscales, Short Form-36 subscales, and the Neck Disability Index were met, using Spearman’s rank correlation coefficient. Ceiling/floor effects and measurement error were tested as well. Results The intraclass correlation coefficients varied between 0.721 and 0.844. Cronbach’s alpha values of the subscales were between 0.847 and 0.916. The factor analysis showed that the questionnaire has five factors. No floor/ceiling effects were present. Conclusion The Turkish version of the NOOS is valid and reliable.


Introduction
Neck pain is one of the three most reported complaints of the musculoskeletal system. Its prevalence has been reported to peak at around 50 years of age and to be higher among women than men [1]. It is estimated that between 22% and 70% of the population will experience some degree of neck pain in their lives [2,3].
Evaluating the level of neck pain is important in determining an individual's quality of life, participation in everyday life, and limitations. The methods used for identifying the factors that cause these determined limitations and aggravate the pain include clinical examinations, psychological evaluations, and investigation of sociodemographic and economic factors. In addition to these evaluation parameters, functional scales are now used more commonly by clinicians and clinical researchers [4].
Scales are expected to measure the status of individuals suffering from neck pain objectively and functionally and to be sensitive to minor changes in individuals. The Copenhagen Neck Functional Disability Scale, Neck Pain and Disability Scale, and Neck Disability Index (NDI) are the major questionnaires used for functional evaluation of neck pain in Turkey [5][6][7]. Among these questionnaires, the NDI is the most widely preferred [8,9]. Despite its common use in current methodological quality studies, the NDI has been criticized for its content validity, reliability, and dimensionality [9][10][11]. Moreover, a wide selection of patients, data saturation, and insufficient selection of samples and study populations when determining the content validity rendered the NDI inadequate [8]. The fact that neck pain triggers symptoms such as nausea, headache, and dizziness, and that these symptoms affect individuals' participation in activities and the duration and quality of those activities, further exposed the inadequacy of the NDI. Due to such shortcomings, a need arose for a new scale that addresses neck pain symptoms extensively and evaluates how patients' pain affects their participation in different activities.
To assess patients' perception of their neck-related problems, the Neck OutcOme Score (NOOS) was developed by Juul et al. [12]. The NOOS is a 34-item questionnaire that investigates neck mobility, sleep disturbance, participation in everyday activities, quality of life, and neck symptoms. It involves five subscales: "mobility," "symptoms," "sleep disturbance," "everyday activity and pain," and "participation in everyday life." The questionnaire aims to measure a patient's neck disability quantitatively according to the WHO's International Classification of Functioning, Disability and Health (ICF). The NOOS was developed within the framework of the ICF with a focus on body functions, structure, activity, and participation [13,14]. Hence, the NOOS fills the gaps in current scales. It is not yet available in any other language. Therefore, the aim of this study was to determine the validity and reliability of the Turkish version of the NOOS.

Materials and methods
This study was approved by Gazi University's ethics committee (#77082166-604.01.04-70153). All participants gave informed consent, and all rights of the participants were protected. Two hundred eight patients between the ages of 18 and 65 years diagnosed with nonspecific neck pain were included in the study. Patients were excluded if they had a serious neurological disease, psychological distress, or alcohol or substance abuse.
According to the empirical approach, the sample size required for confirmatory factor analysis is n = 200 or 5 times the number of items in the scale (n/items ≥ 5) [15]. With 34 items, a sample of 170 people would be sufficient under the second criterion. In our study, we found it more appropriate to employ a sample size of n ≥ 200.
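The two rules of thumb above can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name and its parameters are hypothetical and do not come from the study.

```python
def required_sample_sizes(n_items, ratio=5, fixed_minimum=200):
    """Return the sample sizes implied by the two rules of thumb in the text.

    n_items       -- number of items in the scale (34 for the NOOS)
    ratio         -- participants per item (n / items >= 5)
    fixed_minimum -- the alternative fixed criterion (n = 200)
    """
    return {
        "items_rule": ratio * n_items,
        "fixed_rule": fixed_minimum,
    }


sizes = required_sample_sizes(34)
print(sizes["items_rule"])                 # 5 * 34 = 170
print(208 >= max(sizes.values()))          # the study's 208 patients meet both rules
```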

Cross-cultural adaptation
Translation and cultural adaptation were performed according to the method described by Beaton et al. [16]. Three forward translations from English into Turkish were performed by three independent bilingual translators (a medical health professional and two professionals without medical background and knowledge). All three versions were discussed and combined in a consensus meeting to provide a preliminary Turkish version. Two nonmedical translators and one medical translator translated the preliminary Turkish version back into English. The back-translation was discussed and compared with the English version. The preliminary version was tested with physically active patients suffering from neck problems for wording, understanding, and solid comprehension by experienced health professionals. First, this test procedure was performed involving 50 individuals of whom 15 were neck patients. The patients found it hard to understand questions M1 and M5, which asked about the mobility of neck extension and rotation. For this reason, items M1 and M5 were described more clearly. Next, comprehensiveness of the questionnaire was evaluated again in a pilot group of 35 sick subjects and 69 healthy subjects.
After approval by the original developers of the NOOS, final adjustments were made to obtain a final Turkish version of the NOOS. It was decided to abbreviate this as NOOS-Tr.

Validation study
The other part of the study included 208 patients between the ages of 18 and 65 years who suffered from nonspecific neck pain, were literate, and agreed to participate in the study.
Data collection took place at the Gazi University Hospital and three private physiotherapy clinics between March and November 2017. The NDI, Short Form-36 (SF-36) [17], and NOOS-Tr questionnaires were applied to all patients during face-to-face interviews. For test-retest analysis, 71 of the patients completed the NOOS-Tr questionnaire one week later.

Statistical analysis
Analysis of the data was performed using SPSS 11.5 (SPSS Inc., Chicago, IL, USA). The descriptive statistics were mean ± standard deviation and median (minimum-maximum) for quantitative variables and the number of cases (%) for qualitative variables. Test-retest and internal consistency analyses were conducted to determine reliability. Test-retest results were evaluated by the intraclass correlation coefficient (ICC) method. Cronbach's alpha coefficient was used to evaluate the reliability of the questionnaire in terms of internal consistency. Cronbach's alpha coefficient can be obtained only in the absence of missing data in the dataset, and a value of 0.70 is the minimum acceptable value. Multidimensionality of the items was analyzed by confirmatory factor analysis (CFA). Floor/ceiling effects and measurement error were tested as well. Significance was set at P < 0.05. This study was guided by the COSMIN (COnsensus-based Standards for the selection of health Measurement INstruments) recommendations to evaluate the methodological quality of measurement results and to verify that all important design features and statistical methods were reported clearly [18].

Floor and ceiling effects
In this study, floor/ceiling effects were evaluated for NDI, SF-36, and NOOS subscales. Floor or ceiling effects were considered acceptable (not present) if less than 15% of the participants achieved the lowest or the highest possible score, respectively [19,20].
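The 15% criterion can be illustrated with a minimal sketch. The function name, the 0 to 100 score range, and the sample scores below are hypothetical and serve only to show the calculation; they are not study data.

```python
def floor_ceiling_effects(scores, min_score, max_score, threshold=0.15):
    """Share of respondents at the lowest/highest possible score.

    An effect is flagged when the share is not below the threshold,
    following the 'less than 15% is acceptable' rule in the text.
    """
    n = len(scores)
    floor_share = sum(1 for s in scores if s == min_score) / n
    ceiling_share = sum(1 for s in scores if s == max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share >= threshold,
        "ceiling_effect": ceiling_share >= threshold,
    }


# Hypothetical subscale scores on an assumed 0-100 range.
example_scores = [0] + [100] * 4 + [50] * 15
print(floor_ceiling_effects(example_scores, min_score=0, max_score=100))
```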

Internal consistency
Cronbach's alpha values were calculated separately for each subscale to evaluate the internal consistency. Scores between 0.70 and 0.90 are considered sufficient for internal consistency [20][21][22].
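For readers unfamiliar with the statistic, Cronbach's alpha for one subscale can be computed as in the sketch below. It uses the standard formula alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores); the function name and the sample responses are hypothetical, and the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one subscale.

    responses -- list of rows, one per respondent, each row holding the
                 item scores of that respondent (no missing values).
    """
    if any(score is None for row in responses for score in row):
        raise ValueError("Cronbach's alpha requires complete data")
    k = len(responses[0])                       # number of items
    items = list(zip(*responses))               # one tuple of scores per item
    item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variances / total_variance)


# Hypothetical responses of four people to a two-item subscale.
alpha = cronbach_alpha([[2, 3], [4, 4], [3, 5], [5, 6]])
print(round(alpha, 3), alpha >= 0.70)           # compare with the 0.70 minimum
```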

Test-retest reliability
In this study, test-retest reliability for the NOOS, NDI, and SF-36 was evaluated with ICC (2, 1). To perform the retest, it was decided that a one-week interval was a sufficient period during which clinical change would not occur.
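As an illustration of the statistic named above, ICC (2, 1) (two-way random effects, absolute agreement, single measurement) can be derived from the mean squares of a two-way ANOVA. The sketch below is a plain-Python version of that standard formula; the function name and the test-retest scores are hypothetical and are not study data.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column
    per measurement occasion (e.g. test and retest)."""
    n = len(ratings)            # subjects
    k = len(ratings[0])         # occasions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subject mean square
    ms_cols = ss_cols / (k - 1)                 # between-occasion mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test and retest scores of five patients.
scores = [[62, 60], [45, 50], [80, 78], [30, 36], [55, 52]]
print(round(icc_2_1(scores), 3))
```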

Error of measurement
The error of measurement was calculated using the standard error of measurement (SEM) as follows: SEM = SD × √(1 − Cronbach's alpha). Here, SD refers to the standard deviation value of all participants' baseline scores.
To determine the minimal detectable change (MDC) with 90% confidence, the following formula was used: MDC_ind = 1.64 × SD × √(2 × (1 − r)) [23]. In the formula, SD refers to the standard deviation value of all participants' baseline scores, and r refers to the test-retest ICC value. If a change equivalent to or higher than the calculated MDC_ind value is observed, it can be said at a confidence level of 90% that the individual experienced an actual change. The MDC_ind value was also divided by √n to calculate the group-level MDC (MDC_group) values [19].
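The SEM and MDC formulas above translate directly into code. The sketch below uses hypothetical input values (SD = 10, alpha = 0.84, ICC = 0.82, n = 100) chosen only to make the arithmetic easy to follow; they are not the study's results.

```python
import math


def sem(sd, cronbach_alpha):
    """Standard error of measurement: SD x sqrt(1 - alpha)."""
    return sd * math.sqrt(1 - cronbach_alpha)


def mdc_individual(sd, icc, z=1.64):
    """Minimal detectable change for one person at 90% confidence."""
    return z * sd * math.sqrt(2 * (1 - icc))


def mdc_group(mdc_ind, n):
    """Group-level MDC: individual MDC divided by sqrt(n)."""
    return mdc_ind / math.sqrt(n)


# Hypothetical inputs, for illustration only.
print(round(sem(10, 0.84), 2))                  # 10 * sqrt(0.16) = 4.0
ind = mdc_individual(10, 0.82)
print(round(ind, 2))                            # 1.64 * 10 * 0.6 = 9.84
print(round(mdc_group(ind, 100), 3))            # 9.84 / 10 = 0.984
```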

Construct Validity
The study's construct validity was evaluated by the predefined hypotheses developed based on the discussion of the current literature and clinical experience by the developers of the NOOS questionnaire [24]. The construct validity was determined by testing whether the hypothesized correlations between NOOS subscales and the other instruments were met, using Spearman's rank correlation coefficient.
For positive correlations, low, moderate, and high correlations were defined as coefficient values of 0.10-0.29, 0.30-0.49, and 0.50-1.0, respectively, whereas for negative correlations, ranges of -0.29 to -0.10, -0.49 to -0.30, and -1.0 to -0.5 were used to define low, moderate, and high correlations, respectively [25]. Construct validity was considered to be achieved if at least 75% of the hypotheses were confirmed [19].
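The correlation bands and the 75% rule can be expressed as a small sketch. The function names are hypothetical, the label "negligible" for coefficients below 0.10 is an assumption (the text does not name that range), and values falling between the stated bands (for example 0.295) are assigned to the lower band.

```python
def correlation_strength(rho):
    """Classify a correlation coefficient by its absolute size,
    using the same cut-offs for positive and negative values."""
    size = abs(rho)
    if size >= 0.50:
        return "high"
    if size >= 0.30:
        return "moderate"
    if size >= 0.10:
        return "low"
    return "negligible"   # assumed label; not defined in the text


def construct_validity_supported(confirmed, total, required_share=0.75):
    """True when at least 75% of the predefined hypotheses are confirmed."""
    return confirmed / total >= required_share


print(correlation_strength(0.656))              # high
print(correlation_strength(-0.35))              # moderate
print(construct_validity_supported(8, 8))       # True
```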

Factor analysis
The purpose of CFA based on structural equation models is to determine whether predetermined relationship patterns between factors and items have been verified by data. With the help of the model's goodness of fit statistics, it is decided whether the factors actually consist of these items. The most commonly used goodness of fit statistics are comparative fit index (CFI), Tucker-Lewis index (TLI), and root mean square error of approximation (RMSEA). CFI and TLI values greater than 0.90 indicate an acceptable fit and greater than 0.95 is considered to be a good fit. RMSEA values of less than 0.05 indicate a good fit, and less than 0.08 indicates an acceptable fit [26].
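The cut-offs above can be applied mechanically, as in the following sketch. The function names and the "poor" label are hypothetical, and the handling of values exactly at a cut-off is an assumption. The example inputs are the fit statistics reported in the Results section of this study.

```python
def classify_incremental_fit(value):
    """Classify CFI or TLI: > 0.95 good, > 0.90 acceptable."""
    if value > 0.95:
        return "good"
    if value > 0.90:
        return "acceptable"
    return "poor"         # assumed label for values at or below 0.90


def classify_rmsea(value):
    """Classify RMSEA: < 0.05 good, < 0.08 acceptable."""
    if value < 0.05:
        return "good"
    if value < 0.08:
        return "acceptable"
    return "poor"         # assumed label for values at or above 0.08


print(classify_incremental_fit(0.907))          # CFI  -> acceptable
print(classify_incremental_fit(0.932))          # TLI  -> acceptable
print(classify_rmsea(0.057))                    # RMSEA -> acceptable
```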

Results
Demographic characteristics of the patients are presented in Table 1. The participants, whose ages varied between 18 and 65 years, had a mean age of 35.88 ± 13.79, and their mean body mass index was found to be 25.02 ± 4.46.

Floor and ceiling effects
According to the initial values, no floor and ceiling effects were observed in the NOOS and NDI, and score distributions were found at acceptable levels in both (Table 2). Floor and/or ceiling effects were observed in the SF-36 subscales of Physical Functioning, Role-Physical, Social Functioning, and Role-Emotional, which were therefore not included in the evaluation of construct validity (Table 2).

Internal consistency
As Cronbach's alpha values were higher than 0.80 for all NOOS subscales, it was considered that all subscales have sufficient internal consistency (  for sleep disturbance should be at least 18 points [24]. As for the MDC_group values, the subscale with the lowest value was mobility and the subscale with the highest value was sleep disturbance (Table 3). Taking the highest value as a basis, a change of 2.20 in the sleep disturbance subscale might indicate an actual change at the group level only [24].

Construct validity
Eight prespecified hypotheses were tested to confirm construct validity [24]. Since all of the hypotheses were met, it was considered that construct validity is present (Table 4) [19].

Factor analysis
Results of the five-factor CFA using the 208 patients' responses to the questionnaire revealed that all items loaded on their predetermined factors with factor loadings over 0.40 (Table 5). A relationship of 0.80 was found between the factors, which showed that the items forming the five factors were appropriate for measuring neck pain in patients. The CFI was 0.907, the TLI was 0.932, and the RMSEA was 0.057, showing that the five-dimensional structure determined for the NOOS questionnaire was also valid for patients with neck pain in Turkey.

Table 4 (fragment). Hypotheses for construct validity
Hypothesis: The NOOS subscale "everyday activity and pain" and SF-36 subscale "bodily pain" have a strong correlation (≥0.50). Result: 0.656
Hypothesis: The correlation coefficient between NOOS subscale "everyday activity and pain" and SF-36 subscale "bodily pain" should be higher than the correlation coefficient between the NOOS subscale "everyday activity and pain" and SF-36 subscale "physical functioning".

Discussion
We aimed to translate the NOOS into Turkish and assess the questionnaire's validity and reliability. The Turkish version of the NOOS was in general very good, with no disturbing questions and few confusing items. Since the NOOS is not yet available in any other translation, we can only compare the results with the original version. The psychometric properties of the Turkish NOOS were similar to those of the original NOOS in general. Cronbach's alpha coefficients were calculated to be 0.916 for mobility, 0.847 for symptoms, 0.855 for sleep disturbance, 0.889 for everyday activity and pain, and 0.915 for participation in everyday life. These values indicate that the questionnaire's internal consistency is at a sufficient level [19,27]. In the original version, Cronbach's alpha coefficients were calculated to be 0.85 for mobility, 0.77 for symptoms, 0.86 for sleep disturbance, 0.92 for everyday activity and pain, and 0.92 for participation in everyday life. No internal consistency analysis was performed for the NDI. Like the original version of the NOOS, internal consistency of the Turkish version was found to be high. The ICC method was used for the test-retest analysis in this study. ICC results of the NOOS-Tr varied between 0.721 and 0.844, indicating that the stability of the questionnaire over time is good. In the original version, the test-retest reliability was found to be between 0.88 and 0.95. The NDI's test-retest reliability was found to be 0.979. There might be a few reasons why the NOOS-Tr had lower test-retest results than the original version and the NDI. The reliability of the Turkish version of the NDI was calculated with 88 patients suffering from chronic neck pain. That sample size was smaller than the number of patients in this study. Moreover, all other language versions of the NDI were examined in two separate systematic reviews.
As stated in these reviews, reliability values of the highest-quality studies among the NDI versions were found to be low. Cleland et al. reported that reliability values of the two studies with the best quality, which were conducted with large sample sizes and more extensive statistical analyses, were 0.50 and 0.68 [28,29]. Another systematic review stated that studies of other language versions of the NDI grouped the patients as acute and chronic and reported that patients suffering from acute pain had lower reliability values [30][31][32][33][34]. Only patients who had been suffering from chronic neck pain for more than 3 months were included in the NDI study. Considering all these findings, the fact that about half of the patients in this study had acute neck pain, together with the high number of participants, might be the reason why the reliability of the NOOS-Tr was found to be lower than that of the NDI. Almost all of the patients included in the original study of the NOOS had chronic neck pain. About half of the patients in this study had acute neck pain while the other half had chronic neck pain. The patients with acute neck pain might have caused the NOOS-Tr reliability values to be lower than those of the original version. This study also provided SEM and MDC values for this questionnaire, which can be useful for future studies and clinical practice. Among the NOOS subscales, "participation in everyday life" had the lowest SEM and "symptoms" had the highest. In the original version, the SEMs were lowest for the "everyday activity and pain" subscale and highest for the "symptoms" subscale. MDC values were found to be highest in both studies with 18 points for the sleep disturbance subscale.
A change of at least 18 points in the sleep disturbance subscale exceeds the MDC at the 90% confidence level, which means that it is unlikely to be a measurement error due to random variation and can be interpreted as a true change in the individual. No floor or ceiling effects were observed. In accordance with the original English version of the NOOS, reliability was satisfactory.
Construct validity was evaluated in accordance with the COSMIN recommendations. All predefined hypotheses support the construct validity. We evaluated the correlation between the specific NDI parameters and related NOOS subscales when determining the construct validity of our study. The correlations were similar to those of the original version. The mobility and sleep disturbance subscales of the NOOS and the general health subscale of the SF-36 showed low correlation in both studies. There was a moderate correlation between the related subscales of the NDI and NOOS in both studies. Because studies of new language versions test the hypotheses available from the original, a comparison between the factor structures of the adapted scale and the original version is also required [35]. The NOOS-Tr was found to have five factors, like its original version, so the results of the original study parallel those achieved in this study. Since the NDI is a one-factor questionnaire, no factor analysis was performed in the version study. In one study, the one-factor structure of the NDI was criticized in terms of dimensionality and found insufficient [36]. A single summary score, as in the NDI, may be difficult to interpret because it is not clear how much of the score is attributable to each of the multiple constructs. In the NOOS-Tr, expressing the restrictions that cause or stem from neck pain as quantitative data enables them to be interpreted separately and more objectively. These factors reinforce the reliability of the Turkish version of the questionnaire. In the systematic review published by Wiitavaara et al. on the quality of questionnaires that evaluate neck pain patients, the criterion validity was found insufficient for the NDI and sufficient for the NOOS [37]. All these results show that the NOOS-Tr questionnaire can evaluate neck pain patients multidimensionally.
In conclusion, the NOOS-Tr was determined to be a valid and reliable questionnaire for evaluating neck pain patients. With the NOOS translated into Turkish and its validity and reliability demonstrated, Turkish researchers and clinicians will be able to evaluate Turkish patients with neck pain complaints multidimensionally. Those who want to perform multidimensional evaluation of patients with neck pain may utilize the NOOS-Tr.