Can within-subject comparisons of thermal thresholds be used for diagnostic purposes?

Highlights • Normal limits for within-subject comparisons of thermal thresholds are wide.• Our findings advocate for site-specific normal values of adequate resolution.• The difference between distal and proximal thresholds increase drastically with age.


Introduction
Small-fiber neuropathies are the result of damage to the small peripheral nerves, and typically presents as a symmetrical, length-dependent polyneuropathy that can give a wide variety of either sensory or autonomic symptoms, or a combination of both (Levine, 2018;Terkelsen et al., 2017). Most peripheral neuropathies involve both large and small fibers, but pure smallfiber neuropathies are increasingly being regarded as distinctive (Levine, 2018;Peters et al., 2013;Terkelsen et al., 2017).
Instead of comparing thermal thresholds to absolute reference material, it is possible to perform within-subject comparisons across anatomical-or contralateral sites. Relative comparisons between distal and proximal areas are an important part of the clinical neurological examination for sensory abnormalities, including the assessment of thermal hyperesthesia and hypoalgesia. Such comparisons could be especially beneficial if they show lower variability, and thus narrower normal limits, or if they are associated with fewer covariates, allowing for more generalized reference data. Comparing thermal thresholds across distalproximal sites could be particularly useful in instances when patients experience symmetrical pain or sensory alterations. Previous investigations have failed to find significant side-differences for QTT (Kemler et al., 2000;Lilliesköld and Nordh, 2018;Meh and Denišlič, 1994;Rolke et al., 2006a) and it has been well established that there are differences between anatomical sites, with a likely distal-proximal gradient of increasing sensitivity Meh and Denišlič, 1994;Siao and Cros, 2003;Stevens and Choo, 1998). However, attempts to quantify normal limits for such comparisons are scarce (Kemler et al., 2000), and the association between relative comparisons and age, sex and height is largely untested. This crucial research gap should be explored further, as much of the clinical value of within-subject comparisons of thermal thresholds rests on the knowledge of normal variability in healthy individuals.
Thus, the aim of this study is to investigate the normal limits for distal-proximal-and contralateral homologous comparisons of thermal thresholds with QTT, and their association with age, sex or height.

Study design
An experimental, cross-sectional study was designed to compare thermal thresholds with regards to side-differences and distal-proximal gradients. Four sites were measured bilaterally: the thenar eminence; the anterior thigh, 10 cm superior to the patellar base in mid-line; the distal medial leg, directly superior and posterior to the medial malleolus, and; the foot dorsum, at the dorsal aspect of metatarsals II-III.
The protocol included cold detection threshold (CDT), warm detection threshold (WDT), heat pain threshold (HPT) and cold pain threshold (CPT). Testing order of the specific sites was ran-domized in advance, and a pre-test of two cold-and warm detection threshold stimuli was performed to familiarize the subjects with the procedure.
A male experimenter carried out all experiments. The QTT protocol, placement of instruments, room temperature, experimenter's clothing, the instructions and lighting was standardized. Participants were not allowed to observe instrument readouts.
The study was approved by Regional Committees for Medical and Health Research Ethics (REC), project no. 2010/2927. All participants provided written consent, and the study was conducted in accordance with the Declaration of Helsinki. Subjects received a gift certificate of NOK 250 for participation.

Subjects
The required sample size for paired comparisons were calculated in advance (Rosner, 2015). For side-differences, a minimal clinical significance of 1°C and standard deviation of 2°C was used, while these values were 2°C and 3°C for distal-proximal comparisons, respectively. A minimum of 31 subjects were needed to detect a side-difference of 1°C, with type I and II error rates of 0.05 and 0.2, respectively, and 18 were needed for distalproximal comparisons.
Healthy men and women, ages 20-79, were recruited through advertisements at Oslo University Hospital, local universities, gyms, centers for the elderly, and on social media. Recruitment efforts focused on achieving an equivalent representation of sex and ages. Exclusion criteria were: cancer (current or previously), diabetes, radiculopathy, chronic pain (average NRS ! 1 for -! 3 months, last two years), pregnant or breastfeeding, limited capacity for consent, personal acquaintance of experimenter, or any disease of nerves, muscles, or of the brain that could influence normal nervous function, including psychiatric illnesses. Subjects were requested not to work nightshifts within 48 h of the experiment, to not consume alcohol in the last 12 h before the experiment, or consume pain-killers the same day as the experiment.

Test protocol
Thermal stimulus was applied with a 30 Â 30 mm Peltier thermode (Medoc, Ramat Yishai, Israel), by method of limits. The thermode was held in place by the experimenter. Baseline temperature was 32°C, with a range of 0-52°C. The ramp-rate was 1°C/s for all tests, and the thermode returned to baseline at 1°C/s for detection thresholds and 5°C/s for pain thresholds. Inter-stimulus-intervals were 4-6 s for all tests.
Subjects lay supine on a treatment table, with the back rest at approximately 120-135°incline. Pillows were used for headsupport and placed under the subjects' knees, and a duvet helped regulate skin temperature. Immediately prior to testing, the skin temperature was measured at each site with an 826-T2 handheld infrared thermometer (Testo SE & Co., Pennsylvania, USA), held perpendicular to the skin's surface at a standardized distance of 1 cm. A re-usable heat pack was applied where skin temperature was <30°C for the lower extremities and <32°C for the thenar eminence. Excessive body-hair was removed with scissors.
CDT, WDT and HPT were measured in succession for each site, followed by CPT in the distal medial legs and feet dorsa, in the same order. Absolute temperature thresholds were recorded.
Scripted, verbal instructions were used. Participants were informed of the procedure in its entirety before testing began, and reminded of the current modality before each test. For CDT and WDT, subjects were asked to press a trigger at the first sensation of cool or warmth. Similarly, for HPT and CPT, the cue was to press the trigger at the first sensation of pain, typically when the thermal stimulus begins to induce a stinging, burning or aching sensation. Subjects were advised that thermal pain thresholds are not a measure of pain tolerance. A response was considered invalid and repeated once if it deviated substantially from contemporaneous measurements, or if the subject admitted to an accidental response.

Data computation and analysis
Statistical analyses were performed using IBM SPSS Statistics v. 25 (Armonk, NY: IBM Corp.) P-values were regarded as significant at 0.05, and Bonferroni correction for multiple testing was applied. Correlation values were interpreted in accordance with Mukaka (2012): negligible correlation ± 0.0-0.3, low correlation ± 0.3-0.5, moderate correlation ± 0.5-0.7, high correlation ± 0.7-0.9 and very high correlation ± 0.9-1.0. Thermal thresholds were calculated to express absolute change from the baseline of 32°C (D°C).
Data distribution was assessed in preliminary analyses by use of histograms, boxplots and Q-Q plots. The arithmetic mean of five (CDT, WDT) or three (HPT, CPT) measurements was used in the analysis. Data from each subject was excluded from the presentation of sample thresholds and for calculating regression equations if the delta value of a thermal threshold was > 3 times the arithmetic mean (detection thresholds)-, <1/3 the arithmetic mean (pain thresholds)-, or if exceeding ± 3 SD from the arithmetic mean of the remaining data points. Invalid measurements due to a test's floor-or ceiling effect were not replaced or included in the final analysis.
Side-differences were determined by use of multiple paired ttests. The distal-proximal gradients were examined by repeated measures ANOVA with post-hoc analysis for pairwise comparisons. Sample normal limits were calculated as mean ± 2 SD for pairwise comparisons that showed statistically significant differences, otherwise ± 2 SD.
Pearson-or Spearman correlation was used to determine the association between side-differences or distal-proximal gradients, and age, sex and height.

Study sample
Forty-eight subjects were included in the analysis. Participants were 46 (SD ± 15.6) years old, 52% of whom female. The inclusion process is displayed in Fig. 1 and sample characteristics are presented in Table 1.
Two subjects were excluded from the analysis of CDT in the distal medial leg, 34 from CPT in the foot dorsum, and 37 from CPT in the distal medial leg, due to reaching the test's floor value of 0°C. The large floor-effect for CPT precluded calculation of relative CPT comparisons.

Distal-proximal gradients for quantitative thermal testing
Repeated measures ANOVA with Greenhouse-Geisser correction detected differences in means when comparing CDT, WDT and HPT (p < 0.001). Post-hoc analysis with Bonferroni correction for multiple testing displayed a general distal-proximal gradient for CDT, WDT and HPT (Table 3). The distal-proximal gradient's linearity was violated by the distal medial leg being less sensitive than the foot dorsum for WDT (p < 0.001) and HPT (p < 0.001), with no difference detected for CDT (p = 0.170) (Fig. 2). In addition, no significant difference was found between the foot dorsum and anterior thigh (p = 0.932) or thenar eminence (p = 0.016) for HPT. Sample normal limits for distal-proximal comparisons ranged from 4.0 to 8.7°C for CDT, 6.0-14.0°C for WDT and 4.2-9.0°C for HPT.

Side-differences for quantitative thermal testing
Side-differences for thermal thresholds presented in Table 4. A difference of À1.4°C (À2.1°C to À0.6°C) was found for WDT in Fig. 1. Flow chart of the study inclusion process.

Relation to age, sex and height
A low correlation was found between age and side-differences for CDT between the thenar eminences (r = 0.46, p = 0.001) and distal medial legs (r = 0.44, p = 0.002). Age was moderately correlated with the CDT gradient between the foot dorsum and anterior thigh (r = 0.52, p < 0.001), the foot dorsum and thenar eminence (r = 0.59, p < 0.001) and between the distal medial leg and thenar eminence (r = 0.52, p < 0.001); with the WDT gradient between the foot dorsum and anterior thigh (q = 0.65), the foot dorsum and thenar eminence (r = 0.65, p < 0.001), the distal medial leg and anterior thigh (r = 0.52, p < 0.001) and between the distal medial leg and thenar eminence (r = 0.50, p < 0.001); and with the HPT gradient between the foot dorsum and anterior thigh (r = 0.53, p < 0.001), the distal medial leg and thenar eminence (r = 0.59, p < 0.001) and between the anterior thigh and thenar eminence (r = 0.56, p < 0.001). A high correlation between age and thermal threshold gradient was found for HPT between the foot dorsum and thenar eminence (r = 0.72, p < 0.001).
The correlation between age and relative comparisons of the feet dorsa or distal medial legs and anterior thigh for CDT and WDT are presented in Fig. 3.
No significant correlations were found for height or sex.

Discussion
We found significant differences for most distal-proximal comparisons of thermal thresholds, while side-differences were practically non-existent. The large inter-subject variability in our sample resulted in wide limits of normality for most relative comparisons. Age was correlated with two of the contralateral-and most of the distal-proximal differences, while no correlation was found between sex or height and side-differences or distal-proximal gradient, suggesting that the latter covariates need not be accounted for when establishing normal limits for within-subject comparisons.

Distal-proximal comparisons
A general distal-proximal gradient of increasing thermal sensitivity, with differences as high as 100-fold between the face and feet, has previously been reported Blankenburg et al., 2010;Claus et al., 1987;Kelly et al., 2005;Lilliesköld and Nordh, 2018;Meh and Denišlič, 1994;Stevens and Choo, 1998;Yarnitsky and Sprecher, 1994). Although a few comparisons revealed no differences between anatomical sites,  i.e. CDT foot dorsum-distal medial leg, HPT foot dorsum-anterior thigh and HPT foot dorsum-thenar eminence, our findings confirm the presence of a general distal-proximal gradient of increasing thermal sensitivity. However, when zoomed in on the distal lower extremities, the data aligns with a previous report (Zhang et al., 2017) in showing that the gradient is non-linear. In the present study, the distal medial calf exhibited equal sensitivity to the foot dorsum for CDT, and was less sensitive for WDT and HPT, while the large floor effect deterred any conclusion regarding CPT. The difference between the foot dorsum and distal medial leg may be due to differences in nerve-fiber density or phenotype distribution, e.g. C/ Ad low threshold mechanoreceptors or C-mechanocold (Lawson, 2002;Paricio-Montesinos et al., 2020). It could also be affected by varying cortical representation, as for instance the big-toe, heel and hip have all been shown to have larger representations and peak activations of the Brodmann areas (BA 3b, 1, 2) than the calf   (Akselrod et al., 2017), although the difference is unlikely to be as pronounced between the foot dorsum and distal medial leg for thermal sensitivity. The finding does have clinical implications, as for instance Maier et al. (2010) reports tentatively using absolute reference values for the hand and feet in the upper and lower body, respectively. According to our findings, e.g. utilizing reference values for the feet dorsa to assess small-fiber function in the relatively adjacent distal medial legs, could result in an increase in false positives for WDT hypoesthesia or false negatives for HPT hyperalgesia. This may indicate that the foot dorsum is a more sensitive test site than the distal medial leg, but the choice of test-site will also depend on availability of local reference values and the indication for the test. Consequently, we surmise that adequate care should be taken to ensure sufficiently high resolution of anatomical sites when creating absolute reference material or determining normal limits for inter-region comparisons, and we advise that only site-specific reference values are used clinically until the required resolution is fully elucidated.
It is uncertain whether inter-subject variability of thermal thresholds increase with distality, as findings from previous research are equivocal (Bartlett et al., 1998;Dyck et al., 1993;Lilliesköld and Nordh, 2018;Meh and Denišlič, 1994;Moravcová et al., 2005;Rolke et al., 2006b;Zhang et al., 2017). However, we show that this may be true for inter-region comparisons. This could mean that comparisons across anatomical sites are less useful when they involve the feet dorsa or distal medial legs, which, regrettably, are high-prevalence areas for distal neuropathies. Rolke et al. (2006a) reported that region, e.g. face vs. foot had a larger effect on thermal thresholds than age or sex, but found no increase in sensitivity by comparing regions instead of using absolute reference data. It cannot be ruled out that the findings of Rolke et al. (2006a) are due to quite distal comparisons, i.e. that adjacency of the anatomical sites compared may influence diagnostic sensitivity somehow; yet our findings of wide normal limits for neighboring anatomical sites raises doubts that this is the case. Indeed, comparisons of detection thresholds between adjacent sites in the lower extremities show normal limits of 5.4-11.2°C in our sample. Still, it must be noted that the aged part of our age-pooled sample may inflate these limits somewhat due to higher variability, and thus, larger samples that allow for age stratification are needed in future research to more accurately explore distal-proximal comparisons across age-groups.

Side-differences
Previous investigations have reported some gains in diagnostic sensitivity when utilizing contralateral comparisons, either on its own, or as a supplement when the patient's results are within normal ranges of absolute reference values (Blankenburg et al., 2010;Maier et al., 2010;Pfau et al., 2014;Rolke et al., 2006a). However, in line with what our results may imply, these findings mainly concern thermal pain thresholds, with sensitivity gains for detection thresholds often being negligible, thus possibly diminishing the clinical value of the findings. Still, since sensitivity and specificity are inversely, proportionally related, the wide normal limits may increase the test's specificity, potentially allowing for a confirmatory role in a cluster of tests assessing small-fiber function.
Although the present study cannot rule out that a sidedifference may exist for WDT in the feet dorsa, we suspect our singular finding to be due to an unknown systematic error. Previous investigations have not found such a side-difference (Kemler et al., 2000;Lilliesköld and Nordh, 2018;Meh and Denišlič, 1994;Rolke et al., 2006a), and even though we applied a conservative method to correct for multiple testing, a false positive could be caused by experimental errors. A possible explanation is the use of a relatively small thermode (30x30mm), and subsequent low spatial resolution on the uneven surface of the foot dorsum. Through a systematic difference in how the thermode was manually applied by the experimenter, small areas with a lower warmth Fig. 3. Scatterplot with fitted line and 95% prediction intervals for difference in (A) cold detection thresholds between foot dorsum and anterior thigh; (B) warm detection thresholds between foot dorsum and anterior thigh; (C) cold detection thresholds between distal medial leg and anterior thigh; and (D) warm detection thresholds between distal medial leg and anterior thigh. Note that C shows an inverse trend of cold detection thresholds decreasing with age (p = 0.052), possibly due to somewhat outlying data. receptor density could theoretically be targeted, or spatial summation could be influenced by varying degrees of contact with the skin (Backonja et al., 2013;Bakkers et al., 2013;Gelber et al., 1995).
The normal limits for side-differences presented are somewhat comparable to previous findings, but due to the large impact of methodology, anatomical sites measured and statistical decisions made, heterogeneity prohibits simple comparisons between studies. Rolke et al. (2006a) investigated side-differences of 180 subjects with the DFNS protocol, but measured the cheeks, hands dorsa and feet dorsa, and pooled all data to only compare means between sides. The authors conclude that relative comparisons may be worthwhile when investigating HPT or CPT, yet a direct comparison of thermal thresholds with the present study would not be valid due to differences in test-sites and method of analysis. In another study assessing side differences in thermal thresholds, Kemler et al. (2000) reported normal limits for volar wrists and feet dorsa in 50 healthy subjects. The normal limits for side-differences for CDT, WDT and HPT in the feet dorsa were lower than in the present study, at 1.5°C, 3.8°C and 2.2°C, respectively. However, this comparison may also be unreasonable, as these lower thresholds may be explained by differences in methodology, particularly the use of method of levels, the fact that a side-difference for WDT was included in our calculations, and that the relatively low sample-sizes in both studies may affect the precision.
In line with Yarnitsky and Sprecher (1994), we found that the inter-individual variability is dependent on body-site, underlining the importance of sufficiently anatomically specific reference values. In fact, the normal limits for contralateral comparisons can vary by a factor of four, exemplified by our sample normal limits for CDT of 1.8°C for the thenar eminences and 7.2°C for the distal medial calves. Following this, regarding a set difference between contralateral homologous sites of e.g. !±1°C for thermal detection thresholds or !±2°C for thermal pain thresholds as pathological, like Leffler and Hansson (2008), may not only be erroneous because of the narrow limit proposed, but also because different limits should systematically be applied depending on the body site examined.
An important limitation of comparing sides is that the contralateral site must be normal, limiting its use in e.g. symmetrical distal polyneuropathies. Besides, both pain and functional alterations can spread with time, i.e. enlarging the affected area or mirroring the pathology (Jancalek, 2011), for instance in complex regional pain syndrome (Maleki et al., 2000;Rasmussen et al., 2018;Veldman and Goris, 1996) or post-herpetic neuralgia (Oaklander et al., 1998). In such cases, it would be more appropriate to compare thermal thresholds across anatomical sites.

Relation between relative comparisons and age, sex and height
The finding of no relation between sex and side-differences is in line with previous work by Kemler et al. (2000). If sex and height is in fact inconsequential to the external validity of relative reference values, large sample sizes may be more accessible to future relative reference material, which could in turn allow for appropriate age stratification.
In some contrast to Kemler et al. (2000), we found that age was significantly correlated with side-differences for CDT in the thenar eminences and distal medial legs. Furthermore, age was associated with most of the distal-proximal comparisons, with larger differences being normal in the older patient. The effect of age on distal-proximal comparisons may be due to faster ageing of the peripheral nerves, for instance age-related perfusion impairment near the peripheral receptors, age-related reduction in interepidermal nerve fiber density, distal axonopathy, or age-related changes in the synthesis, transport and action of key neurotransmitters leading to higher firing thresholds (Chakour et al., 1996;Guergova and Dufour, 2011;Stevens and Choo, 1998). Accordingly, it seems necessary to adjust for age when performing relative comparisons of thermal thresholds, particularly when assessing the most distal sites. However, it should be noted that age only partly explains the variance seen in thermal thresholds in our sample, suggesting that other covariates than age, sex or height may play a role in relative comparisons of thermal thresholds.

Strengths and limitations
The external validity of our findings is strengthened by our close adherence to the DNFS protocol.
An important limitation is the relatively low sample-size, and consequent loss of possibility to stratify by age. Collecting reference material stratified for all age groups constitutes a trade-off between the amount theoretically needed and what is practically possible to achieve. We managed to include 48 persons. Partitioning this sample into small age-groups, e.g. decades would decrease the precision of the normal limits, and so the sample was pooled. Still, older individuals are likely to show high inter-individual variability, which may have increased the normal limits in our study due to a sample consisting of individuals of all ages.
We defined healthy as someone with no evidence of disease, and not evidence of no disease (Jabre, 2018). This entails that while we performed a screening interview and carefully handled abnormal tests and outliers, subclinical neuropathies may still theoretically have been present in some subjects included in the final analysis. However, we believe that our definition represents an adequate compromise between said risk and invasive testing of healthy volunteers.
With a psychophysical design consisting of 116 recorded measurements per subject, some outliers are to be expected. However, we believe that our careful effort of blunting single-test-outliers by utilizing the arithmetic mean for each test site and modality, only removing relatively extreme outliers (mostly delta-values, preserving a mean value also in these subjects), and not replacing invalid measurements that are only considered invalid due to a somewhat arbitrarily limited test range (0-52°C), lets our data represent much of the true variation seen in-and between healthy subjects.
Each trial lasted about one hour without breaks; it is thus possible that attention and reaction-time was diminished through the last tests, and since the test-order was randomized, this could theoretically lead to a global increase in inter-subject variability and widening of the sample normal limits. However, failure to randomize could lead to higher thresholds in only the last measurements, creating a systematic bias, and thus we believe that this source of error is best managed globally.

Conclusion
The normal limits for distal-proximal-and contralateral homologous thermal thresholds were wide, and thus of limited use in a clinical setting. Age, but not sex or height, was associated with most distal-proximal differences, and with contralateral differences in cold detection thresholds in the thenar eminences and distal medial legs. In addition, our data confirms that the distal medial legs may be equally-or less sensitive than the feet dorsa for thermal stimuli, resulting in a non-linear distal-proximal gradient, and highlighting the need for site-specific reference values for thermal thresholds in general.

Funding
Funding was received from Oslo University Hospital and Oslo Metropolitan University. Neither sponsor was involved in design, data collection, analysis, interpretation, writing of the report or in the decision to submit the article for publication.

Declaration of interest
None.

Contributors
All authors contributed to conception and design, while Ø. Dunker performed acquisition-and analysis of data. Ø. Dunker drafted the article, while K.B. Nilsen and M.U. Lie made critical revisions to the text. All authors approve of the final version.