Optimizing odor identification testing as quick and accurate diagnostic tool for Parkinson's disease

ABSTRACT Introduction The aim of this study was to evaluate odor identification testing as a quick, cheap, and reliable tool to identify PD. Methods Odor identification with the 16‐item Sniffin' Sticks test (SS‐16) was assessed in a total of 646 PD patients and 606 controls from three European centers (A, B, and C), as well as 75 patients with atypical parkinsonism or essential tremor and in a prospective cohort of 24 patients with idiopathic rapid eye movement sleep behavior disorder (center A). Reduced odor sets most discriminative for PD were determined in a discovery cohort derived from a random split of PD patients and controls from center A using L1‐regularized logistic regression. Diagnostic accuracy was assessed in the rest of the patients/controls as validation cohorts. Results Olfactory performance was lower in PD patients compared with controls and non‐PD patients in all cohorts (each P < 0.001). Both the full SS‐16 and a subscore of the top eight discriminating odors (SS‐8) were associated with an excellent discrimination of PD from controls (areas under the curve ≥0.90; sensitivities ≥83.3%; specificities ≥82.0%) and from non‐PD patients (areas under the curve ≥0.91; sensitivities ≥84.1%; specificities ≥84.0%) in all cohorts. This remained unchanged when patients with >3 years of disease duration were excluded from analysis. All 8 incident PD cases among patients with idiopathic rapid eye movement sleep behavior disorder were predicted with the SS‐16 and the SS‐8 (sensitivity, 100%; positive predictive value, 61.5%). Conclusions Odor identification testing provides excellent diagnostic accuracy in the distinction of PD patients from controls and diagnostic mimics. A reduced set of eight odors could be used as a quick tool in the workup of patients presenting with parkinsonism and for PD risk indication. © 2016 The Authors. Movement Disorders published by Wiley Periodicals, Inc. on behalf of International Parkinson and Movement Disorder Society

Olfactory deficits affect 75% to 90% of patients with Parkinson's disease (PD), and olfactory testing may also represent a sensitive screening test for individuals at risk of developing PD, 1-4 whereas olfactory function is normal or only mildly impaired in other forms of degenerative parkinsonism or essential tremor (ET). 2,5 Olfactory testing has recently been incorporated in the newly established International Parkinson and Movement Disorder Society criteria for PD 6 and prodromal PD. 7 To test for olfactory performance in PD, most studies have focused on odor identification using the disposable University of Pennsylvania Smell Identification Test (UPSIT) or the reusable Sniffin' Sticks test battery, which assesses olfactory threshold and odor discrimination in addition to odor identification. 2,8 Both tests are time-consuming, and olfactory testing is rarely performed in clinical routine. Most existing shortened versions of odor identification tests were not specifically developed for PD patients, nor were any of these tests properly validated. 9-13 Hence, we sought to assess the diagnostic value of the 16-item Sniffin' Sticks identification subtest (SS-16) as an easy-to-use, inexpensive tool. We also aimed to shorten and optimize it to identify both established and early/prodromal PD using a discovery cohort and different validation cohorts.

Patients and Methods
For the present study, data from a total of 134 PD patients and 46 patients with atypical parkinsonism (23 multiple system atrophy [MSA], 23 progressive supranuclear palsy [PSP]), who participated in three independent prospective, cross-sectional clinical studies at the Department of Neurology, Innsbruck Medical University (Innsbruck, Austria) 14,15 and from 336 age-matched healthy controls (HCs) and 29 subjects with ET from the prospective population-based Bruneck Study 16,17 were analyzed (center A). Patients were regularly followed over at least 24 months to reassess their clinical diagnosis, and 4 cases were reclassified as MSA (n = 1) or PSP (n = 3) during clinical follow-up. PD patients and HCs from center A were randomly split into approximately equal parts. Patients with MSA, PSP, and ET were subsumed as differential diagnoses (DDs) in the validation cohort only (Supporting Fig. 1). Two independent sets of PD patients and HCs were used as additional validation cohorts: 400 PD patients and 150 HCs from the Departments of Neurology of the VU University Medical Centre (Amsterdam, The Netherlands) and the Leiden University Medical Centre (Leiden, The Netherlands) (center B); 18 [...] general neurologists in Vienna, Austria (center C). Last, we used a previously described prospective cohort of 24 patients with polysomnography-confirmed idiopathic rapid eye movement sleep behavior disorder (iRBD), 19 consecutively recruited at center A. iRBD patients were tested for olfactory function at baseline and followed up for a mean of 6 years in order to detect incident neurodegenerative diseases, in particular, PD. Studies were approved by the local ethics committees. All participants gave written informed consent according to the Declaration of Helsinki.
Relevant conflicts of interest/financial disclosures: Nothing to report. Full financial disclosures and author roles may be found in the online version of this article.
Olfactory testing was performed with the SS-16 (Burghart Medizintechnik, Germany) as described elsewhere. 20 In center C, the Sniffin' Sticks 12-item odor identification test (SS-12), 21 a commercially available, shorter version of the SS-16 test, was used. Subscores of reduced sets of odors were derived for the present analyses.
Group comparisons between PD patients and controls or DDs were performed with appropriate tests (see table legends). Odor sets predictive of PD were determined in the discovery cohort by L1-regularized logistic regression implementing the least absolute shrinkage and selection operator (the LASSO) 22 using the glmnet R package. The performance of full and reduced odor sets in discriminating PD from controls or DDs was gauged using area under the receiver operating characteristic curve (AUC) with respective 95% confidence intervals (95% CI). Performance of full and reduced odor sets is given by conventional measures of diagnostic accuracy. To adjust for the bias in prevalence of PD versus DDs in our pooled cohort from center A, positive predictive values (PPVs) and negative predictive values (NPVs) were modeled for two additional scenarios using published data on the relative prevalence of PD versus DDs (1) as reported in general neurological services and (2) as assumed in specialized movement disorder services. 23 Furthermore, we evaluated the accuracy of the SS-16 and its subscores in (1) identifying PD in cohort A after excluding patients with >3 years of disease duration and (2) predicting incident PD among the 24 idiopathic RBD patients. SPSS (version 22.0; IBM Corp., Armonk, NY) and R software (version 3.2.2; R Foundation for Statistical Computing, Vienna, Austria) were used for statistical analyses. The local significance level was set at P < 0.05. Full methods can be found in the Supporting Appendix.
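As an illustration of how L1-regularized logistic regression selects a reduced odor set, the following is a minimal sketch only. It uses a plain proximal-gradient loop in standard-library Python rather than the glmnet package used in the study, and the two-odor toy cohort, the penalty values, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_l1_logistic(X, y, lam, step=0.5, n_iter=2000):
    """Minimize mean log-loss + lam * sum(|w_j|); the intercept is unpenalized."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        # Residuals (predicted probability minus label) at the current coefficients.
        resid = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                 for row, yi in zip(X, y)]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(d)]
        b -= step * grad_b
        # Gradient step followed by soft-thresholding (the L1 proximal operator).
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def toy_cohort(n_per_group=40):
    """Hypothetical data: odor 0 is identified by 20% of PD and 80% of controls,
    odor 1 by 50% of both groups. 1 = odor correctly identified."""
    X, y = [], []
    for i in range(n_per_group):
        X.append([1 if i % 5 == 0 else 0, i % 2]); y.append(1)  # PD patient
        X.append([0 if i % 5 == 0 else 1, i % 2]); y.append(0)  # control
    return X, y

X, y = toy_cohort()
for lam in (1.0, 0.1, 0.02):  # decreasing penalty, as along a regularization path
    w, b = fit_l1_logistic(X, y, lam)
    selected = [j for j, wj in enumerate(w) if wj != 0.0]
    print(f"lambda={lam}: selected odors {selected}, weights {[round(v, 3) for v in w]}")
```

With a large penalty every odor weight is shrunk to exactly zero; as the penalty decreases, the discriminative odor enters the model first. Ranking odors by the order in which they enter is one way to read a regularization path.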

Results
Characteristics of the patients and controls in the different cohorts are shown in Table 1A and in the Supporting Information. Figure 1A and Supporting Table 1 depict differences in identifying individual odors in the study groups.
An increasing discriminatory power in the distinction of PD patients versus HCs, as reflected in AUCs, was achieved with an increasing number of odor items used in the discovery cohort (Fig. 1B). This could be reproduced in the validation sets: the AUC reached the 95% confidence interval (CI) of the AUCs achieved with the entire Sniffin' Sticks tests (SS-16 and SS-12; upper and lower row in Fig. 1B, respectively) when using only six sticks and its optimum when using eight (SS-8). We assessed diagnostic accuracy of the SS-16 and SS-8 in identifying PD patients (Table 1B). Of note, all 4 patients who were reclassified (MSA, 1 case; PSP, 3 cases) during clinical follow-up had a normal olfactory function at baseline according to the SS-16 and SS-8. In a modeled general neurological service (PD prevalence: 91.8%), both the SS-16 and the SS-8 would yield PPVs of >97%. In a specialized outpatient clinic (lower PD prevalence of 69.0% because of a higher proportion of non-PD parkinsonism), PPVs of around 90% would be achieved (Supporting Table 2). To test the usefulness of the SS-16 and the SS-8 as a screening method for early/prodromal PD, we repeated the diagnostic accuracy analyses after excluding patients with >3 years of disease duration, which did not alter the results (Supporting Table 3). Furthermore, the 8 incident PD cases among iRBD patients were predicted with the SS-16 and the SS-8 (sensitivity, 100%; PPV, 61.5%).

Discussion
We found excellent diagnostic accuracy for the SS-16 and a shortened test, the SS-8, in the distinction of PD not only from controls, but also from non-PD tremor or atypical parkinsonism.
To the best of our knowledge, our study is the largest study of olfactory testing ever performed in patients with PD, related disorders, and controls, comprising a total of 1,351 individuals. We employed a sophisticated logistic regression analysis to determine reduced sets of odors along the LASSO regularization path in a discovery cohort. This variable selection algorithm considers the statistical dependencies among odor-specific olfactory impairments and minimizes redundancy. Whereas the diagnostic performance in identifying PD of the three, four, or five best-discriminating odors was inferior to the whole SS-16, the six best-discriminating odors achieved accuracy within the 95% CI of the AUCs of the entire set, which was further improved by using a combination of eight odors (but not beyond).
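To make the comparison of full and reduced odor sets concrete, the sketch below computes an AUC from identification scores as the probability that a randomly chosen PD patient scores lower than a randomly chosen control, with ties counted as one half. The response patterns, the four-odor test, and the function names are hypothetical and are not data from this study.

```python
def subscore(responses, odor_indices):
    """Number of correctly identified odors among the chosen items (1 = correct)."""
    return sum(responses[j] for j in odor_indices)

def auc_lower_is_pd(pd_scores, control_scores):
    """AUC when lower scores indicate PD: share of PD/control pairs in which
    the PD patient scores lower, counting ties as 0.5."""
    wins = 0.0
    for p in pd_scores:
        for c in control_scores:
            if p < c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(pd_scores) * len(control_scores))

# Hypothetical responses to a 4-odor test (rows = subjects, columns = odors).
pd_patients = [[0, 0, 1, 1], [0, 1, 0, 1], [1, 0, 0, 0], [0, 0, 0, 1]]
controls = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 0]]

for name, items in (("full set", [0, 1, 2, 3]), ("reduced set", [0, 1])):
    pd_scores = [subscore(r, items) for r in pd_patients]
    hc_scores = [subscore(r, items) for r in controls]
    print(name, auc_lower_is_pd(pd_scores, hc_scores))
```

In practice a confidence interval would be attached to each AUC, for example by bootstrapping subjects, before judging whether a reduced set performs within the interval of the full test.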
FIG. 1. Identification of individual odors in the study groups in the three different centers (A). Gray horizontal line indicates the probability of correctly guessing an odor in the employed forced choice test. AUCs with 95% CIs as a function of combination of odors best predicting PD according to the LASSO analysis in the various cohorts (B). The three horizontal lines in each graph represent the AUCs with 95% CIs of the full test used (SS-16 for the upper row and SS-12 for the lower row). The best eight discriminating odors derived from the full SS-16 were used for the SS-8 subscore (licorice, anise, mint, cinnamon, banana, pineapple, rose, and coffee).
Short tests such as the SS-8 might be particularly appealing for two purposes: First, in a clinical setting, they might serve as an additional quick (approximately 3 minutes) and handy tool in the workup of patients presenting with parkinsonism where clinicians want to identify true PD cases with a high specificity and predictivity. In our sample, the specificity for PD was high (84% with the SS-16 and 88% with the SS-8) combined with a high sensitivity (84%). When modeling prevalences in a general neurological service and a specialized movement disorders outpatient clinic, the PPVs were high at 98% and 94%, respectively. The usefulness of the SS-16 and SS-8 for ruling out DDs is further supported by the analysis in parkinsonian patients with less than 3 years of disease duration yielding a similar diagnostic accuracy as in the whole sets. Indeed, all 4 patients in whom an initial diagnosis of PD was later changed to MSA or PSP during follow-up had a normal olfactory function. Second, a short olfactory test could be useful as a highly sensitive screening tool in population-based studies seeking to define cohorts at high risk for PD. 3,24 Along these lines, we found a high sensitivity of the SS-16 and SS-8 in identifying PD versus HCs in the center A (92% and 94%) and center B (83% and 85%) validation cohorts combined with a good specificity of 82%. This excellent diagnostic accuracy remained unchanged when only PD patients with less than 3 years of disease duration were included. Furthermore, the SS-16 and SS-8 accurately identified 8 incident PD cases from a previously described cohort of 24 iRBD patients clinically followed for 6 years.
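The dependence of predictive values on prevalence can be sketched with Bayes' rule. The example below plugs in the rounded sensitivity and specificity quoted in the text (84%/84% for the SS-16, 84%/88% for the SS-8, PD versus DDs) and the two modeled PD prevalences (91.8% and 69.0%). Because these inputs are rounded, the output is only an approximation and is not meant to reproduce Supporting Table 2; the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)

# Rounded accuracy figures taken from the text; prevalences as modeled in the Results.
tests = {"SS-16": (0.84, 0.84), "SS-8": (0.84, 0.88)}
settings = {"general neurological service": 0.918,
            "specialized movement disorder service": 0.690}

for test_name, (sens, spec) in tests.items():
    for setting, prev in settings.items():
        ppv, npv = predictive_values(sens, spec, prev)
        print(f"{test_name}, {setting}: PPV={ppv:.3f}, NPV={npv:.3f}")
```

The same calculation shows why a high PPV in a high-prevalence setting goes along with a modest NPV: when most tested patients have PD, a normal test result still leaves an appreciable probability of PD.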
Whereas previous studies focused on even shorter sets of three odors in the Sniffin' Sticks or UPSIT, 9-13 in our analysis six to eight odors emerged as the smallest number with performance equal to that of the entire set. In line with previous evidence, 25 this argues against the concept of selective anosmia in PD. 13 Also, one must take into account that the Sniffin' Sticks (and the UPSIT) are forced-choice tests bearing an inherent 25% likelihood of a correct answer by chance, which limits the options of setting cutoffs in reduced odor sets, possibly resulting in unsatisfactory specificity and/or sensitivity. It should be noted that none of the previous studies used independent validation samples, which is a particular strength of our study.
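The effect of the 25% chance level on very short tests can be illustrated with a binomial calculation: a subject who cannot smell at all and guesses every item still reaches a given cutoff with some probability. The cutoffs below (at least 2 of 3, at least 5 of 8, at least 9 of 16) are hypothetical and chosen only for illustration; they are not the cutoffs used in this study.

```python
from math import comb

def prob_at_least(k, n, p=0.25):
    """Probability of at least k correct answers out of n items by guessing alone,
    assuming independent items and a chance level p per item."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical "normal" cutoffs for tests of different length.
for n_items, cutoff in ((3, 2), (8, 5), (16, 9)):
    chance = prob_at_least(cutoff, n_items)
    print(f"{n_items} items, cutoff >= {cutoff}: "
          f"a pure guesser passes with probability {chance:.4f}")
```

With three items, a pure guesser reaches a cutoff of two correct answers in about 16% of cases, and only four distinct scores (0 to 3) exist from which a cutoff can be chosen; with eight items, the chance of reaching five correct answers by guessing falls to under 3%.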
However, there are limitations. Diagnoses of PD and DDs were made according to clinical criteria without pathological confirmation. Therefore, misdiagnosis cannot be ruled out, although in center A, patients with parkinsonism were followed up for at least 2 years in order to reduce the likelihood of misdiagnoses. Furthermore, cultural differences may affect short olfactory tests to a greater extent than longer sets, where a greater variety of odors might balance such effects. 26 Nevertheless, given the reproducibility shown in the external validation samples, it is likely that diagnostic accuracy in other samples will be similar.
To conclude, our analysis confirms that odor identification testing with the SS-16 is associated with excellent accuracy in diagnosing PD and shows that it can be shortened considerably without losing diagnostic power. A shortened test of eight odors may be of substantial value in both a clinical setting assisting in the distinction from frequent diagnostic mimics and in a population-based setting for PD risk evaluation.