A cortical thinning signature to identify World Trade Center responders with possible dementia

Introduction: World Trade Center (WTC) responders have a high risk of early-onset cognitive impairment (CI), but little is known about the etiology, including the extent to which CI in WTC responders is accompanied by cortical atrophy as is common in progressive diseases causing age-related CI such as Alzheimer’s disease and related dementias. In the current study, we trained an artificial neural network (ANN) to determine the accuracy of cortical thickness (CTX) on magnetic resonance imaging to identify World Trade Center responders at midlife (aged 44–65 years) with possible dementia. Methods: A total of 119 WTC responders (57 with CI and 62 with intact cognition) underwent a structural MRI scanning protocol including T1-weighted MPRAGE as part of two imaging studies. The discovery study was divided into training and validation samples, while the second study served as a replication sample. An ANN was trained using regional CTX measured across 34 unilateral regions of interest (ROIs) using Freesurfer software and the ‘Desikan-Killiany’ brain atlas. The discovery sample was used for model development, and the replication sample was used to evaluate predictive accuracy. Results: In the WTC responder cohort, the ANN algorithm showed high discrimination performance for CI. The ANN model using regional CTX data from both hemispheres achieved an area under the receiver operating characteristic curve (AUC) of 0.96, 95% C.I. = [0.91–1.00] (Accuracy = 96.0%, Precision = 97.8%, Recall = 95.8%, Sensitivity = 95.8%, Specificity = 98.0%, F1 = 96.8%) in the discovery sample and AUC = 0.90 [0.70–1.00] (Accuracy = 90.0%, Precision = 90.0%, Sensitivity = 90.0%, Specificity = 90.0%, F1 = 90.0%) in the replication sample. Conclusion: ANN analysis of bilateral regional CTX data derived from T1-weighted MPRAGE images demonstrated excellent accuracy in distinguishing WTC responders with early-onset CI.


Introduction
On September 11th, 2001, hundreds of thousands of people, including those residing in the Greater New York area, watched firsthand as the World Trade Center (WTC) collapsed after two planes were flown into the towers. On that day and in the months thereafter, tens of thousands of men and women worked in search, rescue, and recovery operations (hereafter named "WTC responders"). While studies have documented the extreme conditions to which WTC responders were exposed on September 11, 2001 as well as the chronically elevated risk of psychiatric illness among these men and women responders [1], little is known about reasons for increased risk of neurological diseases in this cohort. However, earlier studies have indicated that post-traumatic stress disorder (PTSD) among veterans of the Vietnam, Iraq, and Afghanistan wars can be associated with cognitive impairment (CI) and cortical atrophy [2,3]. Other studies have suggested that severe and/or chronic exposure to inhaled nano-sized particulate matter (PM < 2.5 μm [PM2.5]) has been linked to amyloidogenesis in mice [4] and CI in older women [5], and may cause neurodegeneration [6]. WTC responders were also exposed to inhalation of elevated levels of PM2.5 [7].
Prior efforts to identify biomarkers for CI have determined that the topography of cortical atrophy can help to determine and differentiate subtypes of dementia [8,9]. For example, cortical thickness (CTX), as measured using magnetic resonance imaging (MRI), reliably quantifies neurodegeneration in mild cognitive impairment [10] and diagnosed Alzheimer's disease (AD) [11], with focal reductions evident in the medial temporal lobe and the posterior cingulate [12]. Ongoing research has detailed that WTC responders, despite their relative youth, are at increased risk for aging-related CI [13]. Yet, whereas biomarkers for Alzheimer's disease are increasingly well understood [14], the topography of cortical atrophy among WTC responders does not match known Alzheimer's disease topographical signatures [15] and, while predictive, the AD fingerprint has relatively low accuracy for detecting CI in this population (AUC 0.68 [0.57-0.78]) [16].
CI is most severe when accompanied by cortical atrophy [17,18]; yet while research in WTC responders suggests that cortical atrophy is present [15], how common cortical atrophy is in WTC responders with CI remains unclear. Nothing is known about the ability of CTX measurement to accurately discriminate CI in WTC responders at midlife. The present study filled these gaps in three ways: first, since exposures at the WTC disaster site were severe and unique, this study sought to better understand the severity of CI among WTC responders by going beyond reliance on AD signatures to characterize more completely the biomarker signature associated with CI among WTC responders; second, it created a novel artificial neural network (ANN) algorithm to reliably identify responders with CI and reported classification accuracy of neuroimaging parameters as a main outcome; and finally, it uniquely examined cortical atrophy in responders with MCI and dementia to determine a reliable and consistent regional distribution of cortical atrophy.

Participants
The present study used results from two imaging studies investigating WTC responders with CI as compared to WTC responders who were cognitively unimpaired. In both cases, participants were purposively recruited from a single clinic-based monitoring program in the WTC Health Program [19] whose participants additionally completed serial administrations of the Montreal Cognitive Assessment (MoCA) [20]. Eligible participants completed neuroimaging if they consented and did not drop out during the screening or scanning visits. Participant groups in both studies included WTC responders with CI and those determined to be cognitively unimpaired using a standard diagnostic protocol as outlined below. By design, cases were matched to controls using demographic characteristics including age, sex, occupation, education, posttraumatic stress disorder, and minority status. Women and responders from minority backgrounds were over-sampled in this study to improve generalizability. The sample of the first study (N = 99) served as the discovery sample (48 medically healthy individuals with dementia, 51 demographically matched cognitively unimpaired controls) while the replication sample included ten medically healthy WTC responders presenting with mild cognitive impairment (MCI) and ten demographically matched cognitively unimpaired controls.
Eligibility criteria for both studies were ages 44-65, fluent in English, and completion of a diagnostic assessment of WTC-related PTSD. Cognitive status was confirmed at screening visits.
Exclusion criteria for both studies included having a body mass index ≥40, history of psychosis, history of diagnosed neurological conditions including diagnosed Alzheimer's disease, other dementias, major stroke, multiple sclerosis, and Parkinson's disease, severe head trauma from the WTC or a history of head trauma, current liver disease, and current use of cognitively active medications. Subjects also satisfied eligibility criteria for MRI scanning, including no known claustrophobia and no known metal implants or shrapnel that were not deemed MRI-safe.

Ethics
The Institutional Review Boards at both Stony Brook University and the Icahn School of Medicine at Mount Sinai approved study procedures. Participants provided written informed consent to participate in all research studies.

Measures
The following demographic characteristics were recorded: age, sex (female versus male), occupation (law enforcement versus other), and educational attainment (university degree versus at least some college versus high school). Upon enrollment, all eligible responders were screened to provide more detailed information about everyday functioning and to ensure case status. In both studies, responders' age, sex, and race/ethnicity as well as occupation and education were matched across cognitive case groupings.

Image processing
T1-MPRAGE images were used to obtain cortical thickness measures utilizing the standard, automated cortical reconstruction pipeline of FreeSurfer V.5.3 as described in previous publications [21,22]. Briefly, the surface models were inflated and registered to a standard spherical surface atlas before being smoothed and recorded [23].

Measures
Cortical Thickness: CTX is a consistent measure of brain atrophy that is commonly used in studies of AD and other related dementias [24]. We focused on CTX for two reasons: validation of these features indicates that CTX is preferable to other possible features because it is highly sensitive to neurodegeneration but less sensitive to unmeasured confounding, and performing inference on the raw images directly was more computationally expensive than was feasible for the investigators to undertake without specialized hardware. Indeed, CTX compares favorably with gray matter volume, whole-brain volume, and hippocampal volumes because, while all of these measures can be indicators of neurodegeneration, CTX can be quantified across multiple brain regions and is generally thought to be minimally related to intra-cranial volume and sex [25]. CTX measurements were obtained by calculating the mean distance between the gray/white matter boundary and the outer pial surface of the cerebral cortex. Regional CTX was calculated in each hemisphere separately for the 34 subregions defined by the Desikan-Killiany atlas [23]. Unilateral and bilateral CTX estimates were recorded.
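As an illustration of how unilateral and bilateral CTX estimates relate, the following minimal sketch combines per-hemisphere regional values. The thickness values, the region subset, the function names, and the use of an unweighted left/right mean for the bilateral estimate are all assumptions made for illustration; the text does not specify how bilateral estimates were formed.

```python
from statistics import mean

# Hypothetical thickness values (mm) for two Desikan-Killiany regions per hemisphere.
left = {"cuneus": 1.80, "precentral": 2.50}
right = {"cuneus": 1.90, "precentral": 2.40}

def bilateral_ctx(lh, rh):
    """Average left and right thickness region by region (unweighted mean; an assumption)."""
    return {region: mean([lh[region], rh[region]]) for region in lh}

def unilateral_features(lh, rh):
    """Keep each hemisphere separate, so 34 regions would yield 68 inputs."""
    features = {f"lh_{region}": value for region, value in lh.items()}
    features.update({f"rh_{region}": value for region, value in rh.items()})
    return features

bilateral = bilateral_ctx(left, right)
features = unilateral_features(left, right)
```

With the full atlas, `unilateral_features` would return 68 values per participant, which matches the number of inputs described for the ANN.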

Diagnosis of Cognitive Impairment:
In the discovery sample we relied on data from 48 individuals with mild dementia diagnosed using NIA-AA standards, that is, medically healthy individuals with CI identified using a standard cutoff with evidence of functional limitations consistent with possible dementia [26]. Global cognitive functioning was measured using the Montreal Cognitive Assessment (MoCA), a widely used clinical measure of cognitive functioning that was developed to objectively and reliably identify age-related CI [20]. In the discovery sample, possible mild dementia was characterized by evidence of cognitive impairment (MoCA≤20) without underlying medical complications. To match these patients, data from 51 cognitively unimpaired WTC responders (MoCA≥26) were used as controls. The replication sample included ten medically healthy WTC responders presenting with mild cognitive impairment (MCI), as defined by the observed onset of mild CI (MoCA≤23) coupled with evidence of cognitive decline but without functional limitations [27], and ten cognitively stable WTC responders who served as controls (two observations >12 months apart with MoCA≥26).
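The score thresholds above can be summarized in a short sketch. It encodes only the MoCA cutoffs stated in the text; the function name and group labels are hypothetical, and the additional clinical criteria (functional limitations, documented decline, repeated observations for stable controls) are deliberately not modeled.

```python
# MoCA cutoffs taken from the text; everything else is illustrative.
CASE_CUTOFF = {"discovery": 20, "replication": 23}
CONTROL_CUTOFF = 26

def moca_group(moca, sample):
    """Assign a group from the MoCA score alone for the given sample."""
    if moca <= CASE_CUTOFF[sample]:
        return "case"
    if moca >= CONTROL_CUTOFF:
        return "control"
    return "neither"  # scores between the two cutoffs meet neither definition

groups = [moca_group(score, "discovery") for score in (18, 22, 27)]
```

A score of 22 illustrates the difference between the two samples: it meets neither definition in the discovery sample but falls below the MCI threshold in the replication sample.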
The discrepancy between the discovery sample with CI versus the replication sample with mild CI arose primarily because these two studies originally addressed different levels of CI within the same cohort of individuals. Since the present study required a replication sample to validate the predictive power of the ANN with the discovery sample from within the same cohort, these two studies were a good match to fulfill that requirement.
Computer-Assessed Cognitive Performance (CogState) was measured using a brief, 20-min computer-administered approach [28] that uniformly measures fluid cognition using data and metadata collected during a game-like task. Cognitive domains measured included: reaction speed, processing speed, cognitive efficiency, intra-item response variability, attention, executive function, visual memory, paired associate learning, visuospatial learning and memory, and working memory. Validity checks are built into the scoring methodology. Prior work with CogState in this population has found that cognitive dysfunction in several cognitive domains such as reaction speed, processing speed, and memory was associated with both long-term exposures to the WTC sites and with symptoms consistent with severe and chronic PTSD [29].

Statistical analyses
Descriptive characteristics were provided using means and standard deviations, or frequencies and percentages where noted. In this study, confounding from central variables including age was addressed by matching in the design phase. The following describes how we trained, tested, and then examined out-of-sample replicability for our artificial neural network. Fig. 1 provides the training, testing, and replication protocol (panel A) and the ANN architectural diagram (panel B).
As seen in Fig. 1A, the analytic plan focused on studying the accuracy of mean CTX within each cortical sub-region before examining mean hemispheric CTX. Next, we applied the learning ANN to identify differences between brain subregions [30]. To accomplish this task, we completed all of the training and validation efforts in the discovery sample before applying the scoring to the replication sample as a final stage. We relied on randomized K-fold cross-validation to create the learning process. Because training requires more statistical power than does validation, we apportioned 66% of discovery sample cases to training and retained 33% for intra-discovery validation to determine whether the ANN was learning. Learning curves were reported using this training session.
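The apportioning of the discovery sample can be sketched as a single randomized two-thirds/one-third split. The function name, the seed, and the use of integer participant indices are assumptions for illustration, and one random split stands in for the full K-fold procedure.

```python
import random

def split_discovery(ids, train_fraction=2 / 3, seed=0):
    """Shuffle participant ids and split them into training and validation subsets."""
    rng = random.Random(seed)  # fixed seed so the illustrative split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# 99 hypothetical participant indices, matching the discovery sample size.
train_ids, valid_ids = split_discovery(range(99))
```

Repeating the split with a different seed at each epoch would give the repeated random re-selection of training cases described in the next paragraph.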
The ANN (Fig. 1B for architecture) incorporated brain regions from both hemispheres as the input layer (spread = 0.5). The ANN had three hidden layers with ten nodes in each layer and an output layer, which recorded the cortical atrophy risk score. ANN training was accomplished in two steps. First, in the training subsample, we defined a training session that trained the ANN to identify the outcome; each training session randomly sorted the cases and controls within the dataset to ensure that case order was not identified by the ANN as a model parameter. The learning rate, a model parameter used to tune the tradeoff between learning and computational speed, was set to η = 0.10. Second, epochs were defined by randomly selecting cases from the discovery sample to create a new training session, and the ANN was retrained using the new subset of cases. At each epoch, the area under the receiver operating characteristic curve (AUC) with corresponding 95% confidence intervals [95% C.I.] was reported as the primary measure of model performance for both the training and validation samples. There were 100 epochs, each of which incorporated 1000 training opportunities, for a total of 100,000 training opportunities for the ANN in this analysis. Learning curves were created using a moving average of the AUC reported in each epoch, and mean within-epoch AUCs and 95% confidence intervals were provided.
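The architecture described above (68 inputs, three hidden layers of ten nodes, one output) can be sketched as a plain feed-forward pass. This is an illustrative reconstruction and not the Stata implementation used in the study: the sigmoid activation, the reading of spread = 0.5 as the half-width of uniformly drawn initial weights, the zero initial biases, and all function names are assumptions.

```python
import math
import random

LAYER_SIZES = [68, 10, 10, 10, 1]  # inputs, three hidden layers, one output node
SPREAD = 0.5  # assumed to bound the initial random weights

def init_network(sizes, spread=SPREAD, seed=0):
    """Create one (weights, biases) pair per layer with uniform random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-spread, spread) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(layers, inputs):
    """Propagate one participant's 68 CTX values through the network."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = [sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
                       for row, b in zip(weights, biases)]
    return activations[0]  # single output node: the risk score

def count_parameters(layers):
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)

network = init_network(LAYER_SIZES)
risk_score = forward(network, [2.5] * 68)  # hypothetical uniform 2.5 mm thickness
n_params = count_parameters(network)
```

Under these assumptions the network has 921 trainable parameters (690 + 110 + 110 + 11), a large number relative to the 99 discovery participants, which is one reason out-of-sample replication matters here.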
The final ANN reported arrays of marginal signal intensity estimates for each ROI to provide a sense of specific regional findings. Signal intensity provides a standardized metric indicating the difference in output estimates between the ANN with and without the input variable of interest, in this case the specific region of interest. Finally, after the ANN was completely trained, the AUC was reported in both the whole discovery (n = 99) and replication (n = 20) samples. Since cutoffs can help differentiate individuals with high versus low risk of CI, Youden's method was used in the replication sample to determine conservative cutoffs for the best performing model; these cutoffs were used to categorize outcomes in both samples, and Youden's index (J) was reported [31]. At this time, we also reported the classification ANN's AUC along with accuracy, precision, specificity, sensitivity, F1 score, positive likelihood ratio (LR+), and negative likelihood ratio (LR−). Analyses were performed using Stata 16/SE [StataCorp].

Results
Table 1 shows the relevant characteristics of WTC responders by cognitive impairment status. On average, WTC responders were in their mid-fifties and were predominantly male.
Examining the accuracy of the predictive ANN revealed that any particular mean bilateral CTX was, on its own, a weak to moderate indicator of CI (Table 2). Mean CTX of the unilateral whole hemisphere and unilateral single-region information also demonstrated weak to moderate predictive power on their own in the replication sample. During the training and validation process, the ANN became increasingly accurate at separating responders with WTC-CI (learning curve provided in Appendix Fig. 1). The final output was a risk score showing excellent performance in both discovery and replication samples (Table 2), including an excellent AUC in the replication sample.
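For readers less familiar with the AUC as a measure of a single region's discriminative power, it can be computed as the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below uses hypothetical thickness values and negates them so that thinner cortex counts as higher risk; it is not the study's Stata routine.

```python
def auc(case_scores, control_scores):
    """Share of case/control pairs in which the case scores higher (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical regional thickness values (mm) for four cases and four controls.
cases_ctx = [2.1, 2.3, 2.4, 2.6]
controls_ctx = [2.3, 2.5, 2.7, 2.8]

# Thinner cortex is treated as higher risk, so thickness is negated before scoring.
region_auc = auc([-x for x in cases_ctx], [-x for x in controls_ctx])
```

In this toy example 12.5 of the 16 case/control pairs are ordered correctly, giving an AUC of about 0.78.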
For comparison, we provided overall fit metrics for each region of interest in both samples and also across the entire sample using the optimized cut-point shown.
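The optimized cut-point follows Youden's method as described in the statistical analyses: among candidate cutoffs, choose the one that maximizes J = sensitivity + specificity − 1. A minimal sketch with hypothetical risk scores and labels (1 = CI, 0 = cognitively unimpaired) is given below; the function name and the tie-breaking rule (keep the lowest maximizing cutoff) are assumptions.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        # A participant is classified as CI when the score reaches the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Hypothetical ANN risk scores and true status for eight participants.
scores = [0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
```

Here the cutoff of 0.40 captures all four cases while misclassifying one control, for J = 0.75.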
Examining the distribution of the ANN-derived risk score (Fig. 2) showed that the CI and unimpaired groups appeared highly separated [the trained ANN is provided in Appendix Model 1].
Cortical regions comprising the risk score were determined from the signal matrix (provided in Table 3), which gives an overall indication of how important each region was to the ANN scoring. Non-zero scores in Table 3 suggest that, after training, the ANN algorithm learned to reliably distinguish between WTC responders with and without CI using a CTX signature of decay as indicated by marginal estimates. Unilateral regions displayed prominent contributions to the discrimination process including, for example, the right cuneus, right middle temporal, right pericalcarine, right precentral, and right superior frontal regions.
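One way to read the marginal signal estimates is as an ablation-style quantity: the change in model output when a single input region is neutralized. The sketch below is an assumption-laden illustration of that idea and not the routine that produced Table 3. It replaces the region's value with its sample mean, omits any standardization, and uses a hypothetical two-input stand-in for the trained ANN.

```python
def marginal_signal(model, rows, index):
    """Mean absolute change in output when input `index` is set to its sample mean."""
    baseline = sum(row[index] for row in rows) / len(rows)
    total = 0.0
    for row in rows:
        ablated = list(row)
        ablated[index] = baseline  # neutralize the region of interest
        total += abs(model(row) - model(ablated))
    return total / len(rows)

def toy_model(x):
    """Hypothetical stand-in for a trained network; it ignores the second input."""
    return 0.8 * x[0] + 0.0 * x[1]

rows = [[1.0, 5.0], [3.0, 7.0]]  # two hypothetical participants, two regions
signals = [marginal_signal(toy_model, rows, i) for i in range(2)]
```

An input the model ignores receives a signal of zero, which parallels the interpretation of zero versus non-zero entries in Table 3.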
Our ANN incorrectly classified three responders with CI as cognitively unimpaired (false negatives, type II errors), while two cognitively unimpaired responders were incorrectly identified as CI (false positives, type I errors). Inspecting the data from the three type II errors revealed that when compared to the WTC responders appropriately identified as CU, these three displayed higher mean unilateral CTX in the right (
In exploratory analyses, we further investigated the association between neural network risk scores and performances in a range of cognitive domains as assessed on a computer-assisted cognitive exam that was not employed in the characterization of CI or cognitively unimpaired status in either the discovery or replication sample. The ANN score was associated with cognitive performances across a range of cognitive domains in the expected directions (Appendix Table 1).
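The reported classification metrics follow directly from the confusion matrix. As a worked check, the sketch below uses the replication-sample counts implied by the reported 90.0% sensitivity and specificity with ten cases and ten controls (nine true positives, one false negative, nine true negatives, one false positive). The function name and dictionary keys are illustrative, and the function assumes specificity is below 1 so that LR+ is defined.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard accuracy metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "lr_plus": sensitivity / (1 - specificity),   # undefined when specificity is 1
        "lr_minus": (1 - sensitivity) / specificity,
        "youden_j": sensitivity + specificity - 1,
    }

replication = classification_metrics(tp=9, fp=1, tn=9, fn=1)
```

These counts reproduce the 90.0% accuracy, precision, sensitivity, specificity, and F1 reported for the replication sample, and correspond to LR+ of about 9, LR− of about 0.11, and J = 0.80.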

Discussion
As of 2020, 79,189 responders were documented to have worked on response efforts at the WTC [32]. These responders were exposed to potentially injurious events, with many experiencing PTSD or early-onset CI. To date, little is known about why early signs of cognitive decline might be present. Recent work has begun to identify signs of neurodegeneration in WTC responders with cognitive impairment [15,43,44]; however, no prior work has determined the accuracy of cortical atrophy to identify CI in WTC responders. Accurate methods for identifying patients are critical to diagnosis and to monitoring patient symptomatology. This is the first study to apply a neural network approach to characterize regional cortical atrophy in WTC responders. This is also the first study to compare responders with mild cognitive impairment with those with dementia, and to note that those with MCI may be biologically similar to those with dementia, indicating the potential for progressive disease. As such, we deemed it important to develop a usable signature based on cortical thickness that could allow us to reliably identify which responders are at risk of WTC-CI. The results suggested that the signature identified here was able to reliably differentiate between CI and cognitively unimpaired status in WTC responders. Furthermore, this signature could allow us to identify which WTC responders might be at the highest risk for developing WTC-related CI. Notably, the cortical atrophy signature offered a high degree of accuracy both in training and for prediction in a second, unrelated study with different imaging parameters. These results are promising in their ability to reliably identify responders with mild cognitive impairment [33].

Regionally specific cortical thinning in WTC-CI
Prior research has suggested that cognitive impairment most often results from neurodegenerative diseases such as AD [34], as well as other related disorders such as frontotemporal lobar degeneration (FTD) [35], progressive supranuclear palsy (PSP) [36], Parkinson's disease (PD) [37], dementia with Lewy bodies (DLB) [38], and amyotrophic lateral sclerosis (ALS) [39]. These studies suggest that the derived WTC-CI signature may have identified regions active across a range of neurodegenerative conditions including, for example, regions within the temporal lobe that are commonly implicated in AD, with focal points in the medial temporal, inferior temporal, and temporal pole regions, as well as subregions of the frontal lobe, the right parietal lobe as seen in PD, and the left frontal lobe and supramarginal regions as seen in PSP.
Our findings partially overlapped with five of nine brain regions classically associated with the AD-related CTX signature including the pericalcarine cortex, inferior temporal gyrus, superior frontal gyrus, and rostral medial frontal cortex [40]. Indeed, application of the AD fingerprint resulted in relatively low accuracy in this population. Thus, in the anatomical underpinnings of WTC-CI we found some overlap with the cortical atrophy characteristics of AD. Yet, WTC responders also manifested cortical atrophy in non-AD-related regions including the right frontal pole, precentral, lateral orbitofrontal, and left insula. Given the similarity in brain regions impacted by diseases with very different etiologies and symptoms, these data may be most useful in providing clues as to the etiology of the disease and about the anatomical basis of WTC-related cognitive pathology. For example, changes in the lateral orbitofrontal cortex have been linked to impairments in learning-based decision-making [41,42]. Further work will be required to determine whether a clinicopathological correlation exists in WTC-CI in the same way that it does in other memory-impairing neurodegenerative diseases including, for example, AD, FTD, PD, and PSP. The ANN regional effect sizes seem to suggest differences in the spatial location of cortical thinning between hemispheres, pointing towards a laterality effect. These findings, if confirmed, would not be consistent with the CTX signature typical for AD [40]. One explanation for this discrepancy could be that, unlike in AD, WTC responders with CI indeed display more laterality in cortical atrophy. An alternative explanation may be that bilateral regions make the same contribution and the model therefore only needed to use one side to account for the contribution from both.

Strengths and limitations
This study used an ANN teaching and learning platform to develop an accurate objective biological indicator of CI in WTC responders. While this is a pivotal strength of this study, there are notable weaknesses including, for example, the reliance of the analysis on measurements from two separate and relatively small imaging studies, albeit recruited from a single WTC epidemiologic cohort and analyzed using an identical pipeline, without the use of an external control database. Given these constraints, we tried to improve replicability by validating on a different dataset derived from a different study, with individuals defined using different diagnostic criteria (possible mild dementia versus mild cognitive impairment) and with MRI data collected in a different imaging suite using different acquisition parameters. While this is in some senses a limitation, it may also be seen as a strength since these types of dissimilarities allowed our practice to imitate some of the most common differences experienced when comparing similar individuals across imaging sites and diagnostic protocols and, therefore, to reproduce a practical replication effort. However, these also represent sources of variability that may reduce our ability to generalize and induce variation that could bias results. Another limitation is that the ANN learning module independently chose the final results that were presented. This study did not incorporate additional measures of cortical health including cortical density and complexity. To that end, future work might usefully refine the ANN to improve its reliability in differentiating responders at risk of CI.
Additionally, while the region-based analyses allowed us to apply an external ANN to the process being used in this study, it is worth noting that future research should refine this protocol by implementing it within an environment that allows for vertex-wise analyses and by using longitudinal data to determine the rate of progression of CTX atrophy in WTC responders. Furthermore, there is a need to replicate this work to determine whether the application of the ANN to cortical atrophy measured in other traumatized populations yields similar predictive accuracy for detecting early-onset CI. Finally, while we oversampled women and visible minorities in this study, the population characteristics make it somewhat difficult to assess levels of bias; replication in studies with more women and people from minority backgrounds is necessary to quantify the potential for bias in this study.

Conclusion
WTC responders experienced a severe and unique exposure, with both psychological trauma and the potential for inhalation of neurotoxins. Prior work has detailed symptoms consistent with this event, while further efforts have suggested that signs of neurodegeneration in the cortical gray matter are evident, but that patterns of neurodegeneration do not necessarily match those of other diseases. As such, this paper fills a particular need for this population to derive biomarkers for WTC-related CI. This paper also implemented a novel region-based technique for reliably identifying neurodegenerative disease in highly exposed WTC responders. In so doing, this study provided the basic characterization needed to begin to generate a new subtype of AD or a related dementia and/or to identify a novel neurodegenerative disorder at midlife.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.
The authors thank the National Institutes of Health for supporting the study of the effects of aging on WTC responders (NIH/NIA R01 AG049953).

Fig. 1.
Artificial neural network structure and training protocol. Panel A: Workflow for the training and testing process used to train and validate the artificial neural network. The figure notes training efforts (Tr_i) leading into initial testing circumstances (Te_i), and finally upon training/testing conclusions on replication (Te_i(R)), where "i" indicates training epoch. Panel B: Architectural diagram for the artificial neural network using three hidden layers with ten nodes each incorporating 68 bilateral measures of regional cortical thickness.

Fig. 2.
Violin plots showing the distribution of artificial neural network scores for participants classified as cognitively unimpaired (blue) and cognitively impaired (red). The score for each participant is shown using a translucent dot, while the median and interquartile range are shown using solid and dotted black lines respectively.

Bar charts showing estimated measures of accuracy in the discovery (blue bars) and replication (orange bars) samples. Higher overall accuracy is indicated as proximity to the ceiling (100%), while poorer performing metrics would approach a floor of 50% because all scores were oriented in the same direction. AUC indicates the area under the receiver-operating curve.

Table 1
Sample characteristics for the discovery and replication samples.

Table 2
Area under the receiver-operating curve for unique bilateral regions and whole-hemispheric measures of mean cortical thickness. Column headings: Region of Interest; Cross-Study Statistics (N = 119).

Table 3
Marginal signal intensity as reported by the artificial neural network relying on 68 unilateral regions of interest and reported in the discovery sample.