Voxel-based morphometry focusing on medial temporal lobe structures has a limited capability to detect amyloid β, an Alzheimer’s disease pathology

Voxel-based morphometry (VBM) analysis of nuclear magnetic resonance imaging (MRI) data allows the identification of medial temporal lobe (MTL) atrophy and is widely used to assist the diagnosis of Alzheimer’s disease (AD). However, its reliability in the clinical environment has not yet been confirmed. To determine the credibility of VBM, amyloid positron emission tomography (PET) and VBM studies were compared retrospectively. Patients who underwent Pittsburgh Compound B (PiB) PET were retrospectively recruited. Ninety-seven patients were found to be amyloid negative and 116 were amyloid positive. MTL atrophy in the PiB positive group, as quantified by thin-slice 3D MRI and VBM software, was significantly more severe (p = 0.0039) than in the PiB negative group. However, the histograms showed a vast overlap between the two groups. The area under the ROC curve (AUC) was 0.646. MMSE scores of patients in the amyloid negative and positive groups were also significantly different (p = 0.0028), and the AUC was 0.672. Thus, MTL atrophy could not reliably differentiate between amyloid positive and negative patients in a clinical setting, possibly due to the wide array of dementia-type diseases that exist other than AD.


INTRODUCTION
AGING Japan), provides a score by which medial temporal lobe (MTL) atrophy can be assessed objectively, thus bypassing the need for specially trained staff for interpretation [1,2]. AD has two main pathological features: senile plaques made of β-amyloid (Aβ), which invariably occur as a part of the pathological process of AD, and neurofibrillary tangles (NFT) made of phosphorylated tau. NFT develop first in the MTL [3], causing atrophy. The MTL, including the hippocampus and the entorhinal and perirhinal cortices, plays a very important role in memory [4]. Therefore, it is plausible that MTL atrophy might be a useful marker for detecting AD pathology. Several studies have reported that MTL atrophy can be detected in patients with AD from a very early stage [5][6][7][8] and, therefore, is useful to distinguish prodromal AD from normal aging [1,9,10,11]. In particular, the CA1 region of the hippocampus shows the most severe atrophy in AD [12][13][14][15][16]. Hippocampal volume provides a quantitative marker of the pathologic substrate that produces the observed cognitive deficit in AD [17].
Although VBM-derived MTL atrophy scores are easy to interpret, they are not without fault and can sometimes produce false positives, categorizing cognitively normal subjects as AD patients. General practitioners in Japan often refer healthy patients with high VBM scores to dementia specialists and prescribe dementia drugs such as donepezil (Aricept®, Eisai Co., Tokyo, Japan) without undertaking memory examinations. This overprescribing of AD medications and unnecessary referral to dementia specialists places an extraneous burden on the Japanese health insurance system and medical infrastructure.
Although a previous study showed that MTL atrophy scores calculated using VSRAD® Advance have a high sensitivity (86.4%) and a high specificity (97.5%) [2], that study has two serious limitations that impede its reliability in an actual clinical setting. First, the study population included only AD and cognitively normal subjects. In reality, however, clinicians must be able to differentiate between the different types of cognitive disorders such as dementia with Lewy bodies (DLB), frontotemporal lobar degeneration (FTLD), progressive supranuclear palsy (PSP), corticobasal degeneration (CBD), neurofibrillary tangle-predominant dementia (NFTD), and argyrophilic grain dementia (AGD). Representative MTL images are shown in Figure 1. A previous study showed that only 34% of patients with neurodegenerative dementia (clinical dementia rating scale; CDR ≥ 1) had AD pathology [18] and another showed that there was no significant difference in MTL atrophy between subjects with AD and non-AD dementia [17]. Therefore, while VSRAD® Advance may be useful for differentiating AD from cognitively normal subjects, the score alone is not sufficient to diagnose AD. Second, the patients assigned to the AD arm of the study were diagnosed based on clinical criteria. However, a false positive diagnosis of AD is possible when using clinical criteria alone and, similarly, patients with AD pathologies are often misdiagnosed as having normal cognition [19]. A paper reviewing the reliability and validity of the NINCDS-ADRDA Alzheimer's criteria [20] found that the sensitivity and specificity of the 'probable AD' category were 76.6-70.9% and 59.5-70.8%, respectively, and those of the 'probable AD' and 'possible AD' categories combined were 87.3-82.7% and 44.3-54.5%, respectively [21]. Lim et al. [22] also showed that the 'probable AD' category had 83% sensitivity and 55% specificity, and that the 'probable AD' and 'possible AD' categories combined had 85% sensitivity and 50% specificity.
In a population-based study by Petrovitch et al. [23], only 65% of clinical AD diagnoses were reported to be pathologically accurate. Amyloid imaging is reported to alter the presumptive diagnosis in approximately 30% of cases, increase the diagnostic confidence in about 60% of cases, and change the patient management in about 60% of cases [24]. Owing to the high reliability of amyloid positron emission tomography (PET), a positive result can be considered a clear confirmation of Aβ pathology.
This study aimed to investigate the clinical reliability of VBM in diagnosing AD. We compared hippocampal atrophy assessed using VSRAD Z-score and amyloid PET retrospectively.

Demographics
Seventy-three out of 286 patients were excluded from the analysis. Eighteen patients did not have 3D MRI data suitable for VBM. MR images of 46 patients showed bad segmentation, two images were of low quality due to head movement, and one showed a susceptibility artifact due to a cochlear implant. One patient had a large arachnoid cyst, one had a large infarction, and one had normal pressure hydrocephalus. Three patients with cerebral amyloid angiopathy were also excluded, as they may show amyloid positivity in the absence of senile plaques and their MRI can be influenced by hemorrhage. Of the 213 remaining patients, 97 were amyloid negative and 116 were amyloid positive. Patient characteristics are shown in Table 1. For all patients, the diagnosis at the time of scanning and the most recent diagnosis by neurology specialists are shown in Supplementary Figure 1.
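The exclusion counts above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the dictionary keys and variable names are illustrative labels, and only the numbers come from the text.

```python
# Exclusion counts as reported in the Demographics paragraph.
# Keys are illustrative labels, not terms defined by the study.
exclusions = {
    "no 3D MRI suitable for VBM": 18,
    "bad segmentation": 46,
    "head movement": 2,
    "implant-related artifact": 1,
    "large arachnoid cyst": 1,
    "large infarction": 1,
    "normal pressure hydrocephalus": 1,
    "cerebral amyloid angiopathy": 3,
}

total_scanned = 286
excluded = sum(exclusions.values())
remaining = total_scanned - excluded

pib_negative, pib_positive = 97, 116

print(f"excluded:  {excluded}")
print(f"remaining: {remaining}")
print(f"PiB-/PiB+: {pib_negative}/{pib_positive}")
```

The individual reasons sum to the 73 excluded patients, and the two amyloid groups sum to the 213 patients analyzed.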

Distribution of hippocampal atrophy
VSRAD Z scores of PiB negative and positive patients were significantly different (p = 0.00393). However, histograms showed a vast overlap of scores between the two groups (Figure 2A). A Receiver Operating Characteristic (ROC) curve of VSRAD Z values is shown in Figure 2B. VSRAD® had 78.4% sensitivity, 54.6% specificity, and 67.6% accuracy at a Z value of 1.20. The Area Under the ROC curve (AUC) was 0.646.
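The reported accuracy follows from the sensitivity, the specificity, and the two group sizes. The sketch below reconstructs an approximate confusion matrix at the Z = 1.20 cutoff. The individual cell counts are not given in the text; they are an assumption obtained by rounding the reported rates against the group sizes (116 PiB positive, 97 PiB negative), and the function names are illustrative.

```python
def confusion_from_rates(n_pos, n_neg, sensitivity, specificity):
    """Reconstruct integer cell counts from rates and group sizes.

    The counts are inferred by rounding, so they are an approximation
    of the underlying table, not reported data.
    """
    tp = round(sensitivity * n_pos)   # amyloid positive, Z above cutoff
    tn = round(specificity * n_neg)   # amyloid negative, Z below cutoff
    return {"tp": tp, "fn": n_pos - tp, "tn": tn, "fp": n_neg - tn}


def accuracy(cm):
    """Fraction of all patients classified correctly."""
    total = cm["tp"] + cm["fn"] + cm["tn"] + cm["fp"]
    return (cm["tp"] + cm["tn"]) / total


# Group sizes and rates as reported for VSRAD at Z = 1.20.
cm = confusion_from_rates(n_pos=116, n_neg=97,
                          sensitivity=0.784, specificity=0.546)
acc = accuracy(cm)

print(cm)
print(f"accuracy = {acc:.1%}")
```

Under this reconstruction, about 44 of the 97 amyloid negative patients would score above the cutoff, which illustrates the overlap between the two groups seen in the histograms.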

MMSE
The histogram of MMSE scores is shown in Figure 3A. It also showed a vast overlap of scores between the two groups.
The ROC curve of MMSE scores is shown in Figure 3B. The sensitivity, specificity, and accuracy of MMSE were 64.3%, 65.2%, and 64.7%, respectively, at an MMSE score of 24.5. The AUC was 0.672.
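For readers less familiar with the AUC values reported for VSRAD (0.646) and MMSE (0.672), the AUC can be read as the probability that a randomly chosen amyloid positive patient has a more abnormal score than a randomly chosen amyloid negative patient, with 0.5 meaning no discrimination. The sketch below computes this pairwise form of the AUC. The scores are hypothetical toy values, not study data, and the function name is illustrative; the software actually used for the ROC analysis is not stated in this section.

```python
def auc_pairwise(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy Z scores (NOT study data), chosen to overlap heavily.
toy_pos = [0.8, 1.3, 1.9, 2.4]
toy_neg = [0.5, 0.9, 1.3, 1.6]

auc = auc_pairwise(toy_pos, toy_neg)
print(f"toy AUC = {auc}")
```

For a measure such as MMSE, where lower scores are more abnormal, the scores would be negated (or the two arguments swapped and the result subtracted from 1) before applying the same function.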

DISCUSSION
We observed that MTL atrophy, determined using VBM with MRI, was not a reliable indicator of Aβ deposition. The significant difference between the VSRAD Z scores of Aβ positive and negative patients can be explained by a difference in MMSE score. The ROC analysis of VSRAD Z scores showed a low AUC value (0.646), which was similar to the AUC value of MMSE scores (0.672). Since MMSE is much simpler and less expensive than MRI examination, MMSE would be preferable to VBM as a tool to diagnose AD.
The low reliability of VBM observed in this study is likely attributable to the existence of many forms of dementia other than AD, a factor not accounted for in the previous study on VBM [2]. Although MTL atrophy detected by MRI correlates with NFT pathology, it is not specific to AD [17]. Furthermore, hippocampal-sparing AD, which does not involve hippocampal atrophy, is reported in approximately 11% of cases, which should also be taken into consideration when designing a study involving AD patients [25]. Early-stage AD before hippocampal shrinkage [26] (preclinical stages of AD or MCI due to AD) would also result in a false negative VBM.

VBM for MTL atrophy and PiB PET for Aβ deposition target two different components of AD pathology, NFT and senile plaques, respectively. Tateno et al. [27] reported no correlation between Aβ deposition and MTL atrophy, while Jack et al. [28] reported a weak correlation. Studies showed that Aβ deposition in the neocortex is related to MTL atrophy only at a very early stage [29][30][31], and moreover the relationship is weak and inconsistent [32]. This phenomenon can be explained by the fact that Aβ accumulation reaches a plateau very early during disease progression [26,28]. Within Aβ (+) patients, hippocampal atrophy showed a significant correlation with Braak and Braak staging and the level of tau in the cerebrospinal fluid (CSF), but only a weak correlation with Aβ burden [33].
Although our study demonstrated that VBM is not useful in diagnosing AD, it may be useful in other situations. VBM can be used to assess MTL atrophy for research purposes [34]. It has been reported that the pattern of gray matter atrophy is associated with NFT pathology in Braak stage V and VI patients [35]. Identification of the atrophy pattern would be useful for classifying patients into the pathological subtypes of AD, i.e. typical AD, hippocampal-sparing AD, and limbic-predominant AD [36], and for distinguishing non-AD degenerative dementia from MCI due to AD [37,38]. Moreover, VBM is routinely used to evaluate disease-specific atrophic regions [39].
There are several limitations to this study. First, it should be noted that amyloid positivity does not conclusively equate to a diagnosis of AD, although an amyloid negative result can rule out the possibility of AD. Moreover, it takes many years to develop hippocampal atrophy after amyloid deposition [26]. Second, although the study population was large, this retrospective study might be biased, since patients who are difficult to diagnose are the ones who require amyloid PET scanning. Therefore, the patient population in this study may not be a true representation of the wider population. However, a pathological study showed that the proportion of patients with cognitive impairment who had pure Alzheimer's disease was as low as 34% [18]. Third, the MRI machine was updated to a newer model during the study period. The difference in machines may influence the VBM results obtained. However, the effect is likely to be insignificant in clinical settings.
In conclusion, our study demonstrated that VBM-based analysis of MRI data reliably detects hippocampal atrophy but is not useful for the diagnosis of AD.

AUTHOR CONTRIBUTIONS
MK and K. Ishibashi contributed to the conceptualization, study design, assessing images, discussing the results, and project administration. MK contributed to statistical analysis, and figures and initial draft manuscript preparation. K. Ishibashi contributed to image analyses. JT contributed to radiopharmaceutical preparation. KW contributed to acquisition of PET data. YUK contributed to the conceptualization, statistical analysis and advised the project. KS and AMT contributed to acquisition of MRI data. KK and SM contributed to patient recruitment. SO advised the project. K. Ishii contributed to supervision and project administration. All the authors discussed the project and have read and approved the final manuscript.