A promising tool to explore functional impairment in neurodegeneration: A systematic review of near-infrared spectroscopy in dementia.

This systematic review aimed to evaluate previous studies which used near-infrared spectroscopy (NIRS) in dementia given its suitability as a diagnostic and investigative tool in this population. From 800 identified records which used NIRS in dementia and prodromal stages, 88 studies were evaluated which employed a range of tasks testing memory (29), word retrieval (24), motor (8) and visuo-spatial function (4), and which explored the resting state (32). Across these domains, dementia exhibited blunted haemodynamic responses, often localised to frontal regions of interest, and a lack of task-appropriate frontal lateralisation. Prodromal stages, such as mild cognitive impairment, revealed mixed results. Reduced cognitive performance accompanied by either diminished functional responses or hyperactivity was identified, the latter suggesting a compensatory response not present at the dementia stage. Despite clear evidence of alterations in brain oxygenation in dementia and prodromal stages, a consensus as to the nature of these changes is difficult to reach. This is likely partially due to the lack of standardisation in optical techniques and processing methods for the application of NIRS to dementia. Further studies are required exploring more naturalistic settings and a wider range of dementia subtypes.


Introduction
Dementia is a clinical syndrome, defined by symptoms including problems with memory, language, and executive function, which ultimately emerge due to neuronal loss. The most common cause of dementia is Alzheimer's Disease (AD), which is characterised by amyloid plaques, neurofibrillary tangles, memory impairment, cortical shrinking, and hippocampal atrophy (Arvanitakis et al., 2019). Other degenerative forms of dementia include Dementia with Lewy Bodies (DLB), characterised by Lewy body inclusions and motor symptoms, Fronto-Temporal Dementia (FTD), associated with fronto-temporal degeneration, and Vascular Dementia (VaD), caused by vascular injuries such as ischemia (Arvanitakis et al., 2019). Dementia is a leading cause of disability worldwide (World Health Organisation), in part due to its increasing incidence (Nichols et al., 2022) and deleterious effects on capacity for independent living and cognitive function. The development of new detection and therapeutic tools is thus a priority.
As dementia is a progressive disorder, several structural changes are thought to occur in the brain prior to symptom onset (Beason-Held et al., 2013). Prodromal stages such as Mild Cognitive Impairment (MCI) are therefore a critical target for early intervention. In support of this, around 16% of individuals with MCI revert to normal cognition within a year (Koepsell and Monsell, 2012). Yet current methods for detecting early cognitive decline, the most common of which are cognitive tests, are inadequate. This is demonstrated by the wide variation in their reported specificity and sensitivity (Mitchell, 2013). Such tests are overly reductive, introduce 'arbitrary' cut-off values, and are highly influenced by attention and motivation (Brown, 2015). The subsequent inability to effectively detect early cognitive decline (Elkana et al., 2015) prevents the identification of at-risk individuals prior to irreversible damage. Imaging and fluid biomarkers are advancing rapidly; however, there is a lack of brain-specific, low-cost, and accessible biomarkers for clinical use. Imaging techniques such as Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET) are invasive, expensive, and limited to specialist centres. As a result, these techniques are not routinely available in care pathways, and thus only provide a snapshot of a patient's status. Conversely, fluid biomarkers do not provide regional brain information.
Despite the practical limitations of techniques such as MRI and PET, considerable effort has been made to repurpose them for biomarker development. For example, functional MRI (fMRI) has identified altered Default Mode Network (DMN) and medial temporal lobe activity, with inconsistent results in prodromal stages (Sperling, 2011). Of note are the contradicting reports in the literature of a compensatory response in early stages in the form of hyperactivation or reduced deactivation, followed by later protein aggregation and hypoactivation (Bakker et al., 2015; Celone et al., 2006). PET, on the other hand, has helped to identify significantly differentiable patterns of reduced glucose metabolism, using the 18F-2-fluoro-2-deoxy-D-glucose metabolic tracer, across dementias (Young et al., 2020). Additional work has also crucially revealed a strong association between dementia and vascular dysfunction. Such dysfunction includes neurovascular decoupling, arterial stiffening, increased pulsatility, hypoperfusion, disrupted functioning of the blood-brain barrier, and impaired autoregulation (Toth et al., 2017). These changes are thought to arise prior to prodromal stages and partially drive later neuronal damage, in turn resulting in domain-specific impairments and secondary neurometabolic dysfunction (Chung et al., 2017).
An imaging technique which is relatively unestablished, and still in its infancy compared to its peers, is near-infrared spectroscopy (NIRS). NIRS is a non-invasive neuroimaging technique which uses near-infrared light to measure brain oxygenation by exploiting the differing absorption spectra of molecules in the brain. Within an optical window in the near-infrared range (650-950 nm), oxygenated (HbO) and deoxygenated haemoglobin (HbR) are the primary absorbers of light. NIRS exploits this phenomenon by shining two (or more) wavelengths of light into the brain and using the detected light attenuation to estimate the concentrations of HbO and HbR which, due to neurovascular coupling, can quantify brain activity via capture of the haemodynamic response. In continuous-wave NIRS specifically, light attenuation due to absorption is indistinguishable from attenuation due to unknown scattering effects, making only relative concentration changes from baseline measurable.
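The two-wavelength estimation described above is commonly implemented with the modified Beer-Lambert law, which the text does not name; the following is a minimal sketch under that assumption. The function name, the source-detector distance, the differential pathlength factors, and in particular the extinction coefficients are illustrative placeholders rather than tabulated values, and the output is a relative change from baseline, as is the case for continuous-wave NIRS.

```python
def mbll_two_wavelengths(delta_a, ext, distance_cm, dpf):
    """Solve the modified Beer-Lambert law at two wavelengths.

    delta_a     : (dA1, dA2) attenuation changes from baseline
    ext         : ((eHbO_1, eHbR_1), (eHbO_2, eHbR_2)) extinction coefficients
    distance_cm : source-detector separation
    dpf         : (dpf1, dpf2) differential pathlength factors (assumed known)

    Returns (dHbO, dHbR), the relative concentration changes.
    """
    (a1, b1), (a2, b2) = ext
    # Normalise each attenuation change by its effective optical pathlength.
    y1 = delta_a[0] / (distance_cm * dpf[0])
    y2 = delta_a[1] / (distance_cm * dpf[1])
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("wavelengths do not separate HbO from HbR")
    # Cramer's rule for the 2x2 system  a*dHbO + b*dHbR = y.
    d_hbo = (y1 * b2 - y2 * b1) / det
    d_hbr = (a1 * y2 - a2 * y1) / det
    return d_hbo, d_hbr


# Hypothetical coefficients for two wavelengths (placeholders, not tabulated).
EXAMPLE_EXT = ((0.6, 1.5), (1.1, 0.8))
example_change = mbll_two_wavelengths((-0.0054, 0.0252), EXAMPLE_EXT, 3.0, (6.0, 6.0))
```

With more than two wavelengths, the same system becomes overdetermined and is typically solved by least squares rather than by direct inversion.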
NIRS is well-suited for widespread clinical use and may thus be highly beneficial for dementia research, particularly as this is a population with a high prevalence of comorbidities (Bunn et al., 2014). NIRS has numerous practical advantages over perhaps more widely known techniques such as MRI and PET: it is non-invasive, well-tolerated, low-cost, portable, and easy to set up and use. This means that it can be used in naturalistic testing settings, such as outdoors or at the bedside, granting access to a more diverse pool of subjects, both socio-economically (as it does not require travel to a hospital) and with regards to accessibility (as it has very few physical restrictions). Another important advantage of NIRS is its relatively low sensitivity to movement, enabling the adoption of more ecologically valid tasks. Such a lack of restriction on test subjects and experimental paradigms has important implications for generalisability. This is particularly pertinent in dementia, where symptoms such as motor impairments are difficult to investigate using methods like MRI.
With regards to how NIRS can be used to study dementia, NIRS can provide regional brain-specific time courses of oxygenation (for both HbR and HbO) and metabolism, the latter via the quantification of the redox state of cytochrome-c-oxidase using broadband NIRS (Bale et al., 2014). This technique can thus offer information as to how well the brain is supplying neurons with the resources necessary for maintaining function and how efficiently neurons are using these resources. The NIRS signal itself encodes crucial physiological information, such as the integration of neural-glial-vascular components; applying signal processing techniques (West et al., 2019) therefore enables the exploration of physiological processes like neurovascular coupling. For example, time-to-peak reflects the action of neurovascular mediators, vasomotor reactivity, and oxygen extraction efficiency. Additionally, whilst both fMRI and NIRS interrogate the blood-oxygen-level-dependent response and can subsequently investigate network-level activity, NIRS does so with higher temporal resolution and additional practical advantages (i.e. its portability, low cost, and usability). Consequently, NIRS may help enable the early detection of dementia as it is better suited for functional imaging and for providing biomarkers of brain oxygenation and metabolism which cannot be provided by other neuroimaging techniques.
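As an illustration of how a metric such as time-to-peak can be read from a haemodynamic response, the sketch below extracts three simple descriptors from a single HbO time course. The function name and the choice of descriptors (time-to-peak, peak amplitude, and trapezoidal area under the waveform) are assumptions made for illustration; the included studies may define and compute these quantities differently.

```python
def response_features(times_s, hbo):
    """Summarise one HbO response sampled at the given times (seconds).

    Returns time-to-peak (relative to the first sample), peak amplitude,
    and the area under the waveform by the trapezoidal rule.
    """
    if len(times_s) != len(hbo) or len(hbo) < 2:
        raise ValueError("need two or more matching samples")
    # Index of the largest HbO value; the first one wins if there are ties.
    peak_index = max(range(len(hbo)), key=lambda i: hbo[i])
    area = sum(
        0.5 * (hbo[i] + hbo[i + 1]) * (times_s[i + 1] - times_s[i])
        for i in range(len(hbo) - 1)
    )
    return {
        "time_to_peak_s": times_s[peak_index] - times_s[0],
        "peak_amplitude": hbo[peak_index],
        "area_under_waveform": area,
    }


# Hypothetical, baseline-corrected response sampled once per second.
features = response_features([0, 1, 2, 3, 4], [0.0, 1.0, 3.0, 1.0, 0.0])
```

In practice the response would first be filtered and averaged over trials, since a single noisy sample can otherwise determine the peak.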
The present article thus sought to review the application of NIRS to dementia. To do so, data from previous literature was synthesised and the clinical value of their findings was evaluated. Through these aims, future avenues for the adaptation of NIRS for dementia were delineated. Additionally, given growing interest in the use of NIRS in ageing research (Agbangla et al., 2017), a thorough investigation into its application to one of the most prevalent age-related diseases was deemed necessary.
Considering the previous fMRI literature (Sperling, 2011) and the numerous practical advantages of NIRS, we hypothesise that NIRS has the potential to become a standard-of-care for the diagnosis and prognosis management of dementia by detecting differences in brain oxygenation and metabolism between dementia, prodromal stages, and controls. Specifically, we expect to observe hypoperfusion and hypoactivation (reflected by blunted haemodynamic responses) in dementia in the resting state and across activation tasks. We assume that this will be observed alongside hyperactivation in the prodrome, in line with the hypothesis of a 'break point' in early stages, as seen using MRI (Dounavi et al., 2021). If differences are not observed between groups, or a consensus cannot be reached, we anticipate that, upon further inspection, the applied NIRS methods will not have been adequately adapted for this clinical population, partially underlying such an outcome.

Methods
A review protocol was developed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Page et al., 2021) (PROSPERO registration number CRD42021297315). A systematic search of MEDLINE (from 1946), Embase (from 1947), and PsycINFO (1806-2021) was subsequently performed on the 1st of February 2023 using the following search terms: (Cognitive impairment OR Cognitive disorder OR Cognitive decline OR Vascular dementia OR Cognitive dysfunction OR Neurocognitive disorder OR Alzheimer* OR Dement* OR AD OR FTD OR DLB OR LBD) AND (Near-infrared spectroscopy OR Near infrared spectroscopy OR NIRS OR oxyhaemoglobin OR Tissue oxygenation index).
The results of this search were stored using Covidence (Veritas Health Innovation Ltd.; Australia) and de-duplicated. Two authors (EB, SS) independently screened abstracts and titles for relevant articles, and conflicts were resolved by a third reviewer (GB). Full texts were then evaluated for inclusion. Studies involving humans diagnosed with dementia or in prodromal stages were included, covering both case-controlled studies and those exclusively testing clinical groups. Conference abstracts, animal studies, reviews, study protocols, and non-English studies were excluded. Additional studies were identified through cross-referencing the bibliographies of the included studies.
The quality of studies was assessed using the Newcastle-Ottawa Scale (Wells et al., 2009) for case-controlled studies, the Jadad scale (Jadad et al., 1996) for randomised controlled trials, and the National Heart, Lung, and Blood Institute quality assessment tool for observational cohort studies (National Heart Lung and Blood Institute, n.d.). These quality assessment scales were chosen as they are the most widely used for the respective study types (Ma et al., 2020). The results of this assessment are provided in the appendix (Fig. A1).
Data from the included studies was extracted by two reviewers (EB, SS) and stored using Excel (Microsoft Corporation). The following information was extracted: title, first author's name, publication year, publication journal, experimental paradigm, cohort characteristics, sample size, summary of results, NIRS parameters, and NIRS device. This information is summarised in the tables provided in the appendix. A meta-analysis was not possible due to the heterogeneity of the studied clinical populations, methods used, and data presented, preventing quantitative synthesis. The included studies were tabulated according to (1) cognitive domain and (2) the clinical population studied. The major outcomes of each study were then summarised within this framework. The proportion of included studies which reported a significant difference between dementia or prodromal stages, and controls was calculated. It is important to note that a study was classified as reporting differences between groups if a significant difference was reported in any single outcome in the study. This was done as most statistical analyses performed and outcomes reported were not standardised across studies. Further, correlations with clinical and behavioural scores, and details as to the NIRS methods used, were considered. A consensus as to the clinical value of the NIRS data could then be ascertained through a critical analysis of this information.

Fig. 3 (caption, panels recovered in part). (d) Figure from Arai et al. (2006). (e) Schematic overview of the one-back task. (f) Mild cognitive impairment was associated with a reduced and delayed rise in the haemodynamic response whereas Alzheimer's Disease was associated with a decreased and delayed response compared to controls during memory encoding. Figure from Li et al. (2018a). (g) An example of the Benton Line Orientation task (Benton et al., 1978). (h) Significantly reduced average change in oxygenated haemoglobin concentration in individuals with late life depression compared to Alzheimer's Disease in a parietal channel during a visuo-spatial task. Figure from Kito et al. (2014).

Search results
The search identified 800 records (Fig. 1). Following title and abstract screening, 138 studies were eligible for full-text screening. Of these, 54 were excluded: 24 were conference proceedings or abstracts, 22 were excluded for wrong patient population, 7 for wrong study design, and 1 was a book chapter. Four further studies were identified through cross-referencing, yielding a total of 88 studies for final evaluation.
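The study flow reported above can be checked with simple arithmetic. The short sketch below uses only the counts given in the text; the variable names are illustrative.

```python
# Counts as reported in the search results.
full_text_screened = 138
excluded_at_full_text = {
    "conference proceedings or abstracts": 24,
    "wrong patient population": 22,
    "wrong study design": 7,
    "book chapter": 1,
}
identified_by_cross_referencing = 4

# Studies remaining after full-text exclusions, plus cross-referenced additions.
total_excluded = sum(excluded_at_full_text.values())
included = full_text_screened - total_excluded + identified_by_cross_referencing

print(f"excluded at full text: {total_excluded}")
print(f"included in final evaluation: {included}")
```

The result agrees with the 88 studies taken forward for final evaluation.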
Since 1993, when the first paper was published using NIRS in dementia (Hoshi and Tamura, 1993), there has been a steady increase in the number of papers published in the area (Fig. 2). Of note is the gap in published studies between 1998 and 2004. This may be due to a lack of commercially available NIRS systems for research, as two of the three studies published prior to 2004 used a NIRO 500 system (Hock et al., 1996, 1997) and the third used a tissue oximeter (Fallgatter et al., 1997). Nevertheless, a deficiency in the number of NIRS studies generally published is observable around these years (Yan et al., 2020). Since then, there has been a significant increase and improvement in commercially available and 'user-friendly' systems, such as those from Artinis Medical Systems, which are now largely designed for use in neuroscientific research.
The included studies took several approaches to characterise dementia and its prodrome using NIRS including recording the resting state (32 studies, Table A1) and employing tasks testing word retrieval (24 studies, Table A2), memory (29 studies, Table A3), motor (8 studies, Table A4), and visuo-spatial function (4 studies, Table A5), as well as tasks such as oddball paradigms (13 studies, Table A6).

NIRS is able to detect differences between dementia, prodromal populations, and controls
In accordance with our initial hypothesis, we observed that the majority of the included studies successfully used NIRS to identify significant differences in brain oxygenation between dementia or prodromal stages, and controls (~86.4% of studies), supporting its use as a standard-of-care alternative to currently used methods like MRI. Conversely, none of the studies used NIRS to measure neurometabolism, so it remains to be determined whether NIRS can detect differences in neurometabolism across these populations. These studies and their analysis methods are discussed and critically evaluated according to the cognitive domain they explored below.

Resting state brain oxygenation is reduced in prodromal stages
A total of 32 studies explored resting-state brain oxygenation (Table A1), using a wide range of devices, experimental paradigms, and analysis methods to do so. Of these, six measured a Tissue Oxygenation Index (TOI) (Viola et al., 2013; Marmarelis et al., 2017; Liu et al., 2014; Tarumi et al., 2014; Li et al., 2022; Viola et al., 2014). This is a commonly used metric in clinics which provides a measure of absolute tissue oxygen saturation, both arterial and venous, from a single measurement location. Several studies found reduced TOI in amnestic MCI (aMCI) (Viola et al., 2013; Tarumi et al., 2014), and cognitively impaired individuals, compared to controls. In support of its clinical use, reduced TOI was also associated with poorer Mini-Mental State Examination (MMSE) (Viola et al., 2013) and memory scores (Tarumi et al., 2014) in aMCI.
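For orientation, tissue saturation indices such as TOI are conventionally expressed as the fraction of total haemoglobin that is oxygenated. The expression below is this general definition written with the HbO and HbR notation used in this review; it is not taken from any of the included studies, and devices may differ in how the underlying absolute concentrations are estimated.

```latex
\mathrm{TOI} = \frac{[\mathrm{HbO}]}{[\mathrm{HbO}] + [\mathrm{HbR}]} \times 100\%
```

As a purely illustrative example, concentrations of 42 and 18 (in the same arbitrary units) for HbO and HbR would give a TOI of 42 / (42 + 18) = 70%.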
TOI has also been considered as a marker of oxygenation to investigate therapeutic efficacy; however, the value of TOI in this regard is unclear. Two studies observed negligible TOI reactivity in AD with midazolam administration (Tatsuno et al., 2021; Morimoto et al., 2022), whereas Viola et al. (2014) observed TOI increases in AD with brain reperfusion rehabilitation therapy alongside improved MMSE scores. The unclear nature of the observed alterations in TOI may be due to issues with intra-device variation (Kleiser et al., 2016). Additionally, TOI is often reported as percent tissue saturation which, whilst useful for quick clinical assessments, provides little information as to the physiological processes underlying such saturation values. Many studies also only recorded TOI from a single measurement location, neglecting any spatial variations in oxygenation.
Another commonly used method to measure resting state oxygenation, or rather cerebrovascular reactivity (i.e. the HbO increase present upon rapid vasodilation), is through sit-stand manoeuvres or CO₂ challenges. Studies using such paradigms yielded mixed results as to differences in response between dementia, MCI, and controls (van Beek et al., 2012; Marmarelis et al., 2021; Babiloni et al., 2014). However, oxygenation during CO₂ challenges increased with acupuncture therapy and galantamine treatment in MCI (Ghafoor et al., 2019), VaD (Schwarz, Litscher and Sandner-Kiesling, 2004; Bär et al., 2007), and AD (van Beek et al., 2010). Disrupted cerebrovascular reactivity has been linked to several underlying mechanisms in AD, including the characteristic Aβ deposition proposed to cause oxidative stress and decreased production of vasodilatory factors, and reduced cholinergic tone (Bär et al., 2007). In contrast, resting state data in the absence of such challenges was not found to differentiate between AD (Zeller et al., 2010; Chiarelli et al., 2021) and MCI (Soo Baik et al., 2021), and controls.
A physiological process of interest in dementia is neurovascular coupling (Shabir et al., 2018), i.e. the coordination of blood flow and neurometabolic demand necessary to maintain neuronal function, which can be explored using a multi-modal approach. Whilst disrupted neurovascular coupling has long been an area of interest in ageing research (Turner et al., 2022; Hutchison et al., 2013), only two studies explored this, in AD (Chiarelli et al., 2021) and aMCI (Babiloni et al., 2014) respectively. These used electroencephalography (EEG)-NIRS to identify uncoupling between HbO concentration changes and EEG power in AD (Chiarelli et al., 2021), and an association between poor vasomotor reactivity and EEG coherence in aMCI (Babiloni et al., 2014). However, these studies were limited by a lack of subject-specific anatomical information and low channel counts.

In both prodromal and dementia stages, computational methods identified resting state cortical disorganisation, however, these methods had several limitations
Several computational methods aiming to use resting state data to differentiate between clinical groups (Fig. 3a) have been reported in the literature. Various studies explored network connectivity, many of which identified disturbances in dementia and prodromal stages, the nature of which was not well defined. This is partly due to the diverse methods used to quantify connectivity across studies. One such method is 'effective connectivity', i.e. the causal influence of one brain region's activity over another. Effective connectivity was found to be reduced in MCI across several regions including the bilateral prefrontal cortex (PFC), in which stronger coupling between the dorsolateral PFC and other regions of interest was associated with higher cognitive scores (Bu et al., 2019). Alternatively, correlation coefficients can be calculated between signal time courses to quantify connectivity. Using this method, both increased and decreased connectivity have been found in MCI compared to controls. Zhang et al. (2022) concluded that such decreased connectivity is in line with evidence of hypoperfusion and hypometabolism in MCI (Li et al., 2015). However, this is in direct contrast with the hypothesis of a compensatory response in prodromal stages to support declining cognitive function, which fails in dementia stages (Østergaard et al., 2013), i.e. the 'break point' (Dounavi et al., 2021).
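The correlation-based approach to connectivity mentioned above can be sketched as follows. This is a minimal illustration that assumes Pearson correlation between pre-processed channel time courses of equal length; the function names are hypothetical, and the included studies may have used other correlation measures, filtering steps, or statistical thresholds.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length time courses."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("time courses must have equal length of two or more")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return sxy / math.sqrt(sxx * syy)


def connectivity_matrix(channels):
    """Channel-by-channel correlation matrix (symmetric, unit diagonal)."""
    n = len(channels)
    return [
        [1.0 if i == j else pearson(channels[i], channels[j]) for j in range(n)]
        for i in range(n)
    ]


# Three hypothetical HbO channels: the second follows the first, the third opposes it.
matrix = connectivity_matrix([[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]])
```

Group differences in such matrices depend strongly on the pre-processing applied beforehand, which may partly explain why both increased and decreased connectivity have been reported.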
Previous studies have also calculated the 'entropy', i.e. complexity, of the NIRS signal, a metric considered to reflect cognitive ability. Reduced signal entropy was observed in AD, which, in accordance with findings from Niu et al. (2019), was localised to the DMN, frontoparietal and ventral/dorsal attention networks (Li et al., 2018b). In contrast, increased signal entropy in the very low frequency bands (0.008-0.1 Hz), also identified in AD (Ferdinando et al., 2022), is suggested to denote increased variation in vasomotor brain waves, potentially indicating greater variability in vascular diameter in AD compared to controls.
With regards to regions of interest, both MCI and AD showed disturbances in dynamic functional connectivity (which accounts for the temporal variability of connectivity) within long-distance connections in prefrontal, parietal, and occipital cortices, and in the DMN and fronto-parietal networks (Niu et al., 2019) (Fig. 3b). This agrees with Keles et al. (2022) who identified dorsolateral PFC activity, part of the fronto-parietal network (Gratton et al., 2018), to be a crucial differentiator between AD and controls during the resting state.
Another significant area of research which has been growing rapidly in popularity within the healthcare sector is the application of machine learning to neuroimaging data. Despite this, few of the included studies (13) applied machine learning to NIRS (Cicalese et al., 2020; Ho et al., 2022; Kim et al., 2021; Oyama et al., 2018; Yang et al., 2019, 2020; Yoo et al., 2020; Yang and Hong, 2021; Yoo and Hong, 2019). Furthermore, only one focused on the prediction of a continuous variable (Oyama et al., 2018) while the rest classified dementia stage or task performance. Most used simple models such as support vector machines and linear discriminant analysis, with recent studies demonstrating higher dementia diagnosis classification accuracies using more complex machine learning, or deep learning, models.
With regards to how machine learning was applied to NIRS, four studies performed classification on resting state data, with two of these finding that classification of either AD or MCI from controls was more accurate using HbO compared to HbR (Yang and Hong, 2021). The only study identified in this review which used broadband NIRS classified AD, MCI, and controls from their optical spectrum, finding a feature at 895 nm to be best at differentiating between AD and MCI (Greco et al., 2021). What this indicates is unclear as the biological substance contributing to this peak could not be identified by the authors. In addition, machine learning has also demonstrated promise in identifying regions of interest and functional connections with particularly high predictive accuracy. For example, Zhang et al. (2022) identified the long-range connection between the right PFC and left occipital lobe as a potential biomarker for aMCI.
A limitation of the studies discussed so far, however, is that all but four focused on binary classification between MCI/AD and controls rather than multi-class classification. Of those that performed multi-class classification, Chiarelli et al. (2021) used estimates of neurovascular coupling strength and a multivariate linear regression approach to classify AD from controls. In agreement with Cicalese et al. (2020), classification accuracies were improved using combined EEG-NIRS features (Chiarelli et al., 2021). However, reported classification accuracies were also high using solely NIRS signals. Kim et al. (2022a) demonstrated > 90% prediction accuracies with the difference in left and right PFC signals recorded during olfactory stimulation as an input to a random forest classifier model.
Most of the discussed studies were limited by small group sizes and group imbalance, which do not provide enough training examples for sufficiently robust models. Many also demonstrated low multi-class prediction capabilities, though with larger volumes of data, prediction and finer-scale classification tasks can be realised with higher accuracies. Simple signal feature sets were used in the majority of the discussed studies, though engineered features containing both spatial and temporal information have been shown to produce predictive models with higher accuracies (Zhang et al., 2023). Finally, none focused on interpretable machine learning, such that classification decision pathways, and the signal metrics used to make a particular diagnosis, cannot be easily evaluated by clinicians.

Blunted haemodynamic responses during word retrieval were evident in both prodromal and dementia stages
All of the included studies which assessed word retrieval (24, Table A2) used the verbal fluency task (VFT), or a modification of such (Fig. 3c). This is a frequently used paradigm in dementia studies, in which subjects must generate words within a category ('semantic') or beginning with a specific letter ('phonemic'). Overall, clinical groups generally performed worse than controls, as was the case for AD (Yap et al. [...] 2008), MCI (Yap et al., 2017; Metzger et al., 2016; Yeung et al., 2016a; Nguyen et al., 2019), asymptomatic AD, and the behavioural variant of FTD (bvFTD) (Metzger et al., 2015). Such reduced behavioural performance was accompanied by smaller haemodynamic responses in dementia (Takahashi et al., 2015), particularly in AD (Richter et al., 2007; Arai et al., 2006; Hock et al., 1996; Kato et al., 2017; Herrmann et al., 2008). These responses were characterised by longer latencies (Yap et al., 2017), reduced amplitudes, and smaller areas under the waveform (Kato et al., 2017). Such inadequate task-appropriate activation was echoed in prodromal stages such as MCI which presented with hypoactivation, particularly in the right parietal area (Fig. 3d) (Arai et al., 2006), and reduced inter-hemispheric connectivity. However, upon classifying between MCI and controls using HbO, the VFT was not as stable an indicator of MCI as the n-back task (Yang et al., 2019). In support of this lack of diagnostic potential, Soo Baik et al. (2021) found no VFT-related differences between MCI and AD. Additionally, there appeared to be no clear association between the magnitude of the haemodynamic response and clinically relevant features such as MMSE score (Kato et al., 2017; Kito et al., 2014; Arai et al., 2006), or behavioural performance (Araki et al., 2014; Metzger et al., 2016; Richter et al., 2007; Hock et al., 1997).
Although the magnitude of the haemodynamic response during the VFT may not be clinically useful (Takahashi et al., 2022), spatial patterns of activation may be able to differentiate between healthy ageing and dementia, as well as across MCI subtype (Yoon et al., 2019; Yeung et al., 2016a). For example, differences between AD and controls were localised to frontal and bilateral parietal regions (Hock et al., 1996), whereas differences between AD and MCI were limited to right parietal regions (Arai et al., 2006). Similarly, a loss of activation asymmetry was also evident in both dementia (Fallgatter et al., 1997; Richter et al., 2007) and MCI (Yeung et al., 2016a); however, one study found no significant lateralisation in either controls or MCI (Katzorke et al., 2018). The extent of lateralisation has been suggested as a possible biomarker for dementia as it is thought to reflect the recruitment of contralateral resources to support declining function (Yeung et al., 2016a), as is supported by the fMRI literature (Liu et al., 2018).

Table A1
Characteristics of the included studies reporting resting-state near-infrared spectroscopy data in dementia and prodromal stages.
Surviving table entries: In house device. Spectral feature at 895 nm which distinguished AD from MCI, the biological identity of which was not identified. Prodromal AD (50), AD dementia (9), asymptomatic AD (28). Significant difference in TOI response to change in blood CO₂ between MCI and controls. Reduced whole-brain functional connectivity in aMCI compared to controls. Greater impairment in connectivity between brain regions at larger distances in aMCI.

Table A2
Characteristics of the included studies reporting near-infrared spectroscopy data associated with word retrieval in dementia patients and prodromal stages.
Surviving table entries: Greater task-related increase in HbO for all channels in controls, and only for some channels in dementia. Prodromal AD (50), AD dementia (9), asymptomatic AD (28), controls (53); VFT; 6 frontal channels (2 S, 5 D); In house device; Significant difference in HbO concentration change between groups. Significant difference in HbO concentration change between men and women across patient groups. Hock, 1996; Probable AD (19), controls. OT-R40 (Hitachi Medical; Tokyo, Japan); No differences in mean HbO change between MCI, AD, and controls.

E. Butters et al.

There was evidence for both hypo-and hyperactivation during memory tasks for all clinical groups
Overall, 29 studies explored memory function (Table A3), many of which used the n-back task (13), with mixed results. This task evaluates working memory (WM) and interrogates frontal regions, making it well suited to NIRS as it avoids monitoring through hair. In this task, subjects are presented with a sequence of letters and must indicate whether the presented letter was the same as that just before (one-back) or that before last (two-back) (Fig. 3e). Using this task, two studies observed blunted haemodynamic responses in MCI (Yang et al., 2019; Yoo et al., 2020), with a gradation from controls, to MCI, to AD, whereas three found no difference in functional response (Yoon et al., 2019; Soo Baik et al., 2021) or connectivity between MCI and controls. Interestingly, one study identified hyperactivation in MCI compared to controls. Perhaps the discriminatory ability of the n-back task is more subtle: there is evidence for differential WM load modulation across disease stages. For example, certain studies observed differences between MCI and controls only with high WM loads (Yeung et al., 2016b) and others identified WM load modulation only in controls (Vermeij et al., 2017; Ung et al., 2020).
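The one-back and two-back rules described above can be made concrete with a short sketch. The function names and the accuracy measure are illustrative assumptions; the included studies may have scored behavioural performance differently (for example, using reaction times or separate hit and false-alarm rates).

```python
def nback_targets(letters, n):
    """Flag each position whose letter matches the one presented n steps earlier."""
    return [i >= n and letters[i] == letters[i - n] for i in range(len(letters))]


def nback_accuracy(letters, responses, n):
    """Proportion of trials on which the 'match' response was correct.

    responses holds one boolean per letter: True if the subject indicated
    a match, False otherwise.
    """
    if len(letters) != len(responses) or not letters:
        raise ValueError("need one response per presented letter")
    targets = nback_targets(letters, n)
    correct = sum(t == r for t, r in zip(targets, responses))
    return correct / len(letters)


# Hypothetical letter sequence scored under both working-memory loads.
sequence = "ABBACA"
one_back = nback_targets(sequence, 1)
two_back = nback_targets(sequence, 2)
```

The same sequence yields different target positions under each load, which is how the task varies WM demand while keeping the stimuli constant.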
With respect to the clinical value of the haemodynamic response during WM tasks, most studies reporting results of correlation analyses identified positive correlations between the magnitude of the HbO signal, or functional connectivity metrics, and behavioural or clinical scores (Ni et al., 2021; Yeung et al., 2016b; Li et al., 2018a; Niu et al., 2013; Uemura et al., 2016; Liu et al., 2023) (Fig. 3f), such that greater oxygenation was associated with better scores. Encouragingly, haemodynamic responses during the n-back task were also validated as having strong diagnostic potential using convolutional neural networks (Yang et al., 2019). Additionally, responses to the n-back task may be sensitive markers of treatment responses, reflected by increases in oxygenation (Ni et al., 2021; Ghafoor et al., 2019; Khan et al., 2022). Such results lend support to the hypothesis of a hyperactive response in prodromal stages that compensates for declining function. However, perhaps surprisingly, two studies found improved memory performance to be associated with reductions in frontal activation in MCI, with both photobiomodulation therapy (Chan et al., 2021) and VR-based training (Liao et al., 2020).

Some evidence suggested hypoactivation in clinical groups during motor activity; however, the tasks used were simplistic
Of the eight studies testing motor function (Table A4), six used dual-task walking (Doi et al., 2013; Teo et al., 2021; Nosaka et al., 2022; Wang et al., 2022; Talamonti et al., 2022; Takahashi et al., 2022) with wearable NIRS devices, five of which recorded exclusively from the frontal cortex. The dual-task walking paradigm involves performing a single task (e.g. walking) and a dual task (e.g. completing a cognitive task whilst walking). Findings from studies using this task suggest a non-linear relationship between dementia severity, brain oxygenation, and motor performance, unlike memory function or word retrieval (Yap et al., 2017). For example, people with memory complaints had higher activation during dual-task walking compared to controls, whereas those with dementia had higher activation compared to both controls and people with memory complaints in single-task walking, yet significantly reduced activation in dual-task walking (Teo et al., 2021). Concerning studies directly assessing motor function, none used naturalistic tasks, such as social interaction; instead, they used simplistic motor tasks, such as hand-grip movements (Tak et al., 2011) and finger tapping (Yang et al., 2022), which revealed decreased oxygenation in AD compared to controls and MCI.

More demanding visuo-spatial tasks may have revealed clearer deficits
Four studies explored visuo-spatial processing (Zeller et al., 2010; Kito et al., 2014; Tomioka et al., 2009; Haberstumpf et al., 2022) (Table A5). Three of these used angle discrimination tasks, such as the Benton Line Orientation task (Fig. 3g), which requires participants to judge the orientation of a presented line (Benton et al., 1978). However, these yielded varied results (Zeller et al., 2010; Kito et al., 2014; Haberstumpf et al., 2022) (Fig. 3h), possibly due to a lack of standardised methodologies across studies. For example, Zeller et al. (2010) used a combined 'dementia' patient group. The absence of performance differences across groups (Zeller et al., 2010; Kito et al., 2014) may also suggest that more demanding visuo-spatial tasks are required to reveal differences in the NIRS data.

A handful of studies used sensory stimuli and oddball tasks, with little consensus
Four studies explored sensory responses by using NIRS (Table A6) with music (Tanaka et al., 2012) and olfactory stimuli, the latter of which could discriminate healthy ageing from prodromal (Kim et al., 2022b) and dementia stages (Fladby et al., 2004; Kim et al., 2022a). Alternatively, eight studies employed oddball tasks. Three found no differences in the haemodynamic response (Soo Baik et al., 2021) and connectivity between MCI and controls, whereas three observed reduced frontal activation in MCI (Yang et al., 2019; Yoo et al., 2020) and AD, with one study finding greater overall HbO increases in MCI compared to controls (Zhang et al., 2023). As most of these studies used the same task design and patient groups, except for Ho et al. (2022), which used a four-minute task block, these mixed results are surprising.

Table A3
Characteristics of the included studies reporting near-infrared spectroscopy data associated with memory function in dementia patients and prodromal stages.

The research methods used were not adequate for use in dementia and prodromal populations
Overall, across cognitive domains, most studies observed reduced magnitudes of the relative concentration changes of HbO and HbR across dementia groups (AD, VaD, and FTD). This agrees with results from other modalities, including hypometabolism identified using PET (Costantini et al., 2008), an overall 'slowing' of neocortical EEG (Dringenberg, 2000), and hypoactivation observed with fMRI in dementia (Sperling, 2011). Of the 52 studies which tested those in prodromal stages, i.e. MCI, 32% identified no difference, 8% identified increases, and 60% found decreases in either or both HbO and HbR concentration changes. Such a lack of consensus is consistent with the contradictory reports of an early compensatory response observed across other imaging techniques (Bakker et al., 2015; Celone et al., 2006).
Given the variability of results across certain domains, such as in the resting state, and particularly with regards to prodromal stages, the research methods used across the studies were investigated to determine whether methodological inefficiency could at least partially underlie such variability. A pattern emerged: experimental designs and optical methods lacked consistency, standardisation, and adequacy, as discussed below.

The optical methods did not account for likely differences in brain size and shape present at the dementia stage
Firstly, no studies accounted for the changes in brain size and structure which are commonly observed in dementia and old age. As brain tissue is a highly scattering medium, near-infrared light can only penetrate ~4 cm into tissue. Therefore, NIRS can only record from superficial cortical layers. However, in dementia, widespread cortical shrinkage and atrophy (Harper et al., 2017) result in an assumed increased distance of the cortex from the scalp. This in turn may lead to data being recorded solely from extracerebral tissues, as opposed to from brain tissue. The incorporation of subject-specific anatomical data, such as from structural MRIs, to perform source localisation and signal reconstruction is thus necessary to avoid apparent functional differences being caused by anatomical variability or structural degeneration. Doing so is particularly critical in late stages, when the scalp-to-cortex distance can be up to 1.7 cm (Lu et al., 2019). Several studies also did not age-match their control and patient populations (see Fig. A1). Alongside the absence of correction for baseline age-related vascular changes, such as via statistical modelling of the haemodynamic response, this may lead to the misattribution of alterations in the temporal dynamics of the haemodynamic response to changes in neural activity.
Similarly, many studies used sparse (low-density) NIRS arrays, i.e., arrays in which the sources and detectors are arranged in a grid-like pattern. Not only does this often mean that there are few or no short channels, but the light cannot penetrate as deep as in higher density systems. Higher density systems, which consist of overlapping, variable length channels, yield improved resolution and fewer positional errors (White and Culver, 2010), and may achieve better sensitivity in dementia (Srinivasan et al., 2023). High-density NIRS can also be combined with anatomical information to create detailed topographical maps of brain activity, termed High-Density Diffuse Optical Tomography (HD-DOT). No studies used HD-DOT, and only a few used high-density systems (e.g. Soo Baik et al., 2021; Yoo and Hong, 2019). Talamonti et al. (2022) used DOT and Li et al. (2019) performed source localisation; however, neither used subject-specific anatomical information to do so.
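The distinction between sparse and high-density layouts comes down to the set of source-detector separations an array provides. As a rough illustration only, the Python sketch below computes pairwise separations for a hypothetical flat optode layout and flags short channels. The coordinates, the function names, and the 15 mm and 45 mm cut-offs are assumptions chosen for the example, not values taken from the reviewed studies.

```python
import math


def channel_separations(sources, detectors):
    """Return {(source_index, detector_index): separation_mm} for every pair."""
    return {
        (i, j): math.dist(s, d)
        for i, s in enumerate(sources)
        for j, d in enumerate(detectors)
    }


def classify_channels(separations, short_max_mm=15.0, long_max_mm=45.0):
    """Split channels into short and long ones using assumed cut-offs."""
    short = [pair for pair, dist in separations.items() if dist <= short_max_mm]
    long_ = [
        pair
        for pair, dist in separations.items()
        if short_max_mm < dist <= long_max_mm
    ]
    return short, long_


# Hypothetical 2D optode positions in millimetres.
sources = [(0.0, 0.0), (30.0, 0.0)]
detectors = [(8.0, 0.0), (30.0, 30.0)]

seps = channel_separations(sources, detectors)
short_channels, long_channels = classify_channels(seps)
print(short_channels)  # one short channel: source 0 to detector 0
print(long_channels)   # three longer channels of differing length
```

A regular grid with a single fixed spacing would return an empty short-channel list here, which is the situation described above for many sparse arrays.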

Prodromal groups are highly heterogeneous, possibly underlying the diversity of results observed in this population
The majority of studies focused on AD (36) and MCI (52), with few exploring less common dementia subtypes: only three in VaD, one in FTD, and none in DLB. Despite this, the nature of alterations in prodromal stages such as MCI was highly variable, particularly with regard to the presence of an early compensatory response in the form of hyperperfusion and hyperactivation (Merlo et al., 2019). Whether these variable results reflect differences in methodology or indeed indicate true variation in the ability to recruit additional resources across subgroups or individuals is unclear. Firstly, this may simply be due to the relatively small number of studies (12) which directly compare MCI with AD. Additionally, a similar degree of variability has been observed across patterns of activation in fMRI (Yetkin et al., 2006) and across subjects in EEG (Trinh et al., 2021) in MCI, suggesting an inherent heterogeneity to this subpopulation. As such, a lack of subgrouping across studies may contribute to such variable results. This is not an easy issue to resolve: MCI is difficult to diagnose and classify into subtypes (Díaz-Mardomingo et al., 2017), as it can present considerably differently with regard to symptomatology (Lopez, 2006) and patterns of atrophy (Bell-McGinty et al., 2005). To further complicate matters, the fMRI literature suggests distinct manifestations of compensatory responses in early stages between resting state and task-related fMRI, with the sensitivity and reliability of task-related fMRI yet to be established (Young et al., 2020). The adoption of higher density, wider coverage NIRS systems, and improved region of interest selection, would also increase sensitivity (Srinivasan et al., 2023) in prodromal populations.

Table A4
Characteristics of the included studies reporting near-infrared spectroscopy data associated with motor function in dementia patients and prodromal stages.

Table A5
Characteristics of the included studies reporting near-infrared spectroscopy data associated with visuo-spatial function in dementia patients and prodromal stages.

An overall lack of standardisation in methods was evident across studies
A common theme which became apparent across studies was a lack of standardisation in experimental methods. This includes data analysis, evidenced by the wide range of signal metrics and statistical methods used (see Table A6 as an example), experimental design, such as baseline and task duration, and data collection, such as probe placement. For example, almost half of the reviewed studies do not refer to motion burden, or the need to explicitly correct for motion to ensure that spikes, baseline shifts, and low-frequency drifts are not misinterpreted as physiologically relevant signals (e.g. van Beek et al., 2010; Viola et al., 2014). Moreover, even in studies which used systems containing short channels, several did not perform short-channel regression (e.g. Bu et al., 2019) to remove the influence of scalp haemodynamics. Such a lack of standardisation is also seen more widely across NIRS research, in which the myriad adjustable parameters for processing NIRS data can lead to "misinterpretation and irreproducibility of results" (Pinti et al., 2020; Hocke et al., 2018). For instance, it remains unclear whether it is best to use either or both HbO and HbR to study brain activity (Pinti et al., 2019) or dementia (Zeller et al., 2019; Katzorke et al., 2018; Yang and Hong, 2021). Many studies in the present review only analysed the HbO signal and discarded HbR, citing HbO's higher signal-to-noise ratio and greater correlation with blood oxygen level dependent fMRI (Cui et al., 2011). Nevertheless, efforts are being made to standardise research methods for NIRS, such as adopting the Shared Near Infrared Spectroscopy Format (SNIRF) for data storage.
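For readers unfamiliar with short-channel regression, the Python sketch below shows the basic idea in its simplest form: the long-channel signal is regressed on a nearby short-channel signal, and the fitted scalp component is subtracted. This is a single-regressor ordinary least squares version on synthetic, noise-free data. The function name, the signals, and the 0.8 scalp weighting are hypothetical, and published pipelines typically embed this step in a more complete general linear model.

```python
import math


def short_channel_regression(long_signal, short_signal):
    """Subtract the least-squares fit of the short channel from the long channel.

    Returns the cleaned long-channel signal and the fitted coefficient.
    """
    n = len(long_signal)
    mean_long = sum(long_signal) / n
    mean_short = sum(short_signal) / n
    cov = sum(
        (s - mean_short) * (l - mean_long)
        for s, l in zip(short_signal, long_signal)
    )
    var = sum((s - mean_short) ** 2 for s in short_signal)
    beta = cov / var if var else 0.0
    cleaned = [
        l - beta * (s - mean_short) for l, s in zip(long_signal, short_signal)
    ]
    return cleaned, beta


# Synthetic 60 s recording sampled at 10 Hz (assumed values for illustration).
n_samples = 600
scalp = [math.sin(2 * math.pi * 0.1 * (i / 10)) for i in range(n_samples)]
brain = [1.0 if 200 <= i < 400 else 0.0 for i in range(n_samples)]  # task block
short = scalp                                         # short channel: scalp only
long_ = [b + 0.8 * s for b, s in zip(brain, scalp)]   # long channel: brain + scalp

cleaned, beta = short_channel_regression(long_, short)
print(round(beta, 3))  # close to the simulated scalp weighting of 0.8
```

In this idealised case the cleaned signal recovers the simulated task response almost exactly; with real data the scalp and brain components are not independent, so the result depends on the design and on which short channel is used.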

Further future directions
Aside from the methodological issues detailed above, there are several avenues of research which remain to be explored. For example, all included studies used continuous-wave NIRS systems bar Oyama et al. (2018), which used a time-resolved system, and Chiarelli et al. (2021), which used a frequency-domain system. In addition, perhaps surprisingly, given NIRS's ability to be relatively easily integrated with other imaging modalities, few studies did so: one with PET, three with EEG, and one with fMRI. Only a single longitudinal study explored how brain oxygenation changes with disease progression (Talamonti et al., 2022). Exploring even earlier stages, such as in Apolipoprotein E-4 carriers (Katzorke et al., 2017), is also necessary for the assessment of the clinical value of NIRS.
Most studies also only recorded from pre-specified regions of interest, limiting functional connectivity analyses. This is particularly the case in studies measuring task-related activation, which predominantly recorded exclusively from frontal regions, even during tasks with motor components (e.g. Takahashi et al., 2022), and despite the established posterior degeneration in AD and DLB (O'Donovan et al., 2013). Similarly, few studies used NIRS to explore motor symptoms (Table A4). This is surprising due to NIRS's low sensitivity to movement and lack of physical restrictions, as well as the characteristic motor symptoms of certain dementia subtypes, such as DLB (Emre, 2003), which cannot be easily explored using techniques like MRI. However, the emergence of wearable NIRS is relatively recent, possibly explaining the lack of naturalistic task designs. Finally, using broadband NIRS to quantify intracellular neurometabolism (Bale et al., 2016) would be invaluable to investigate neurovascular decoupling in dementia.

Table A6
Characteristics of the included studies reporting near-infrared spectroscopy data associated with other functions in dementia patients and prodromal stages.

Conclusion
Broadly, the previous literature identified differences between dementia, prodromal stages, and healthy ageing. This is evidenced by cortical disorganisation, involving the DMN and fronto-parietal networks (e.g. Niu et al., 2019), and hypoactivation (e.g. Niu et al., 2013; Li et al., 2018b): a generally suppressed haemodynamic response across cognitive domains at the dementia stage. In prodromal stages, several studies found hypoactivation (Yoon et al., 2019; Arai et al., 2006), whereas others identified a possible compensatory response in the form of hyperactivation (Yap et al., 2017). Alongside the blunted haemodynamic response in dementia, these findings partially agree with the hypothesis of a 'break point' in prodromal stages (Dounavi et al., 2021). This review highlights the necessity for standardised protocols for both experimental designs, e.g. ecologically valid designs, and analysis methods, e.g. subject-specific information for source localisation, for more holistic and generalisable outcomes. To conclude, NIRS has strong potential for clinical translation and integration into care pathways; however, several methodological issues must be resolved before this is possible.

Declaration of Competing Interest
The authors have no conflicting interests to declare.

Data availability
No data was used for the research described in the article.