A review of visual sustained attention: neural mechanisms and computational models

Sustained attention is one of the basic abilities of humans to maintain concentration on relevant information while ignoring irrelevant information over extended periods. The purpose of the review is to provide insight into how to integrate neural mechanisms of sustained attention with computational models to facilitate research and application. Although many studies have assessed attention, the evaluation of humans’ sustained attention is not sufficiently comprehensive. Hence, this study provides a current review on both neural mechanisms and computational models of visual sustained attention. We first review models, measurements, and neural mechanisms of sustained attention and propose plausible neural pathways for visual sustained attention. Next, we analyze and compare the different computational models of sustained attention that the previous reviews have not systematically summarized. We then provide computational models for automatically detecting vigilance states and evaluation of sustained attention. Finally, we outline possible future trends in the research field of sustained attention.


INTRODUCTION
Attention acts as a gate for information flow in the brain (Cohen, 2014), allowing the brain to concentrate on processing continuous information. The term "attention" comes from the Latin "attentus", which is the past participle of attendere, which means "to heed" (Itti & Baldi, 2005). Although the word existed in Roman times, little scientific research was conducted on it until philosophers and pioneering psychologists paid attention to it. Attention research has primarily interested specialists in psychology because attention is linked to many mental disorders. Human beings with attention-deficit disorders such as dyslexia (Walda et al., 2021), traumatic brain injury (Carroll et al., 2020), depression (Vaughn-Coaxum et al., 2021), and attention-deficit/hyperactivity disorder (ADHD) (Mansour et al., 2021) will have difficulty concentrating. As one of the fundamental cognitive abilities, attention has been the subject of research by experts in various research fields, including philosophy, physiology, neuropsychology, clinical, education, and computer science (Cohen, Sparling-Cohen & O'Donnell, 1993). However, interdisciplinary research is often considered overly difficult. Interdisciplinary challenges in the field of sustained attention remain unresolved, given the differences in conceptual definitions and research methods across interdisciplinary fields.
Current studies rarely mention sustained attention, which is the basis of other attention types (for example, selective attention) and plays an irreplaceable role in humans' daily lives. For instance, selective attention relates to focus and determines which information is given priority over others, while sustained attention refers to long-term focus and is typically related to vigilance (Cohen, 2014). Sustained attention is characterized by the ability to detect rare and unpredictable signals over a long period of time (Munir, Cornish & Wilding, 2000). Sustained attention or vigilance refers to the ability to maintain a consistent behavioral response to task-related stimuli during continuous and repetitive activity (Robertson et al., 1997). The key to the above definitions is that sustained attention is focused on the performance of a single task over a period of time. The essential difference between sustained and transient attention is that transient attention is a transient event-related state, while sustained attention is a sustained block-level state that shows attentional fluctuation over a long activity duration (Li et al., 2019).
Human senses can process an enormous amount of information. However, the brain cannot maintain attention over long periods of time to process the constant influx of information from the environment. One of the most important characteristics of visual sustained attention is the ability to make target-present or target-absent decisions rapidly and accurately (Warm, 1984). The first relevant investigation into the visual sustained attention phenomenon was Mackworth's study of military personnel monitoring surveillance radar during World War II (Mackworth, 1950). At present, the world has entered the era of informatization and digital multimedia. Diverse and complicated information unrelated to the current task can easily divert attention. Sustained attention can ensure a more lasting focus on a task (Chen & Wu, 2015). However, sustained attention is affected by mental fatigue and is frequently diverted to irrelevant information. Humans are especially prone to fall into a state of mental fatigue when tasks require them to maintain a high level of attention for a long time. Furthermore, fatigue often reduces task performance by affecting vigilance (Thompson et al., 2020a;Thompson et al., 2020b). The ability to detect relevant information decreases as the time required to maintain sustained attention increases, a phenomenon known as "vigilance decrement". Humans in a low-vigilance state tend to experience mind wandering (Jin, Borst & van Vugt, 2020). Humans with low sustained attention will be unable to complete tasks and may even exhibit symptoms of attentional disorders such as ADHD. To evaluate the level of sustained attention, several previous studies have combined machine learning with neuroimaging technology. Aggarwal et al. (2021) assessed attention levels for students in a massive open online course learning environment using EEG signals. Shoeibi, Ghassemi & Rajendra Acharya (2022) used a new deep learning method to build an ADHD intelligent detection model to assess resting-state functional magnetic resonance imaging (rs-fMRI) data.
Although sustained attention is vital to humans' daily lives, several challenges hinder its exploration in research fields: (1) Previous researchers have often used self-reflection questionnaires, electrocardiograms, eye movements, and electroencephalograms (EEG) to evaluate the sustained attention of humans. However, no complete and comprehensive routine assessment of sustained attention exists due to the different research methods involved in different disciplines. (2) There have been many attempts to develop guidelines for human learning and work based on neuroscience findings. Therefore, neural mechanisms of sustained attention must be introduced to enhance our understanding of neuroscience-based methods for increasing task efficiency. (3) Sustained attention evaluation for the large number of existing attention-deficit patients usually requires clinical diagnosis by clinicians, which is time-consuming and labor-intensive. A large quantity of data has been produced by advanced neuroimaging techniques and an increasing number of researchers have utilized computational models to evaluate sustained attention. Therefore, machine learning, a relatively new advanced computing method, is uniquely suited for processing large-scale data generated by neuroimaging techniques (Jo, Nho & Saykin, 2019).
Accordingly, the aims of the present article are as follows: (1) to provide a comprehensive understanding of what sustained attention is and how it can be measured, we review theoretical models in 'Sustained attention: state-of-the-art models' and introduce paradigms of psychological experiments in 'Paradigms for psychological experiments on sustained attention'. (2) To explore how sustained attention can effectively promote task performance, we review studies on the neural mechanisms of sustained attention and propose possible visual pathways in 'Neural mechanisms'. (3) To give a panorama of computational models for the automatic diagnosis of sustained attention, we review various computational models for measuring attention in 'Computational models'. (4) To illustrate possible directions in the research field of sustained attention, we outline its applications and future trends in 'Conclusions'.

SURVEY METHODOLOGY
In our daily lives, we are surrounded by constant visual information, but our visual processing capacity is limited. Human access to information is dominated by visual information (Treichler, 1967). A large number of neurons are dedicated to analyzing human visual information, which makes vision an indispensable sense (Kaewkhaw et al., 2015). Sustained attention is crucial when a visual task calls for prolonged attention and ongoing stimulus monitoring (Loetscher et al., 2019). Sustained attention was accompanied by the flow of top-down visual spatial attention signals in human parietal and occipital topographic cortical areas (Lauritzen et al., 2009). However, the visual pathways of sustained attention have not been fully clarified in previous studies. Therefore, in this review, we attempt to cover the neural mechanisms of visual sustained attention and computational models applied to study attention, especially sustained attention. First, we provide theoretical models and experimental paradigms related to sustained attention assessment for the convenience of readers without cognitive science backgrounds. We further explore the neural mechanisms and plausible neural circuits of sustained attention. Finally, we review various computational models of attention.
We used Google Scholar as well as PubMed databases to search for relevant articles in our publication survey. First, we used the search terms "visual attention", "visual pathway", "sustained attention", "attention pathway", "machine learning", "deep learning", "sustained attention assessment", and "neural mechanisms of sustained attention". We then expanded our search, including "attention disorders", "computational models", "attention assessment", "neural mechanisms of attention", and "visual pathways of attention". Second, we expanded our examination of computational models applied to attention given that there are fewer computational models for sustained attention. We rigorously searched for publications focusing on the application of computational models to attention research. Then, the search term was specified as the name of the computational model described earlier. We consistently excluded irrelevant studies throughout the review process. For publications that met our criteria, we deeply reviewed their computational approaches to sustained attention and categorized them into different computational models. In addition, we compared articles on different computational models with other relevant studies we found. We removed publications that did not match our approach.
Because we attempted to capture all available studies of the neural mechanisms of sustained attention, our collection of publications spans 1948 to the present. Additionally, to understand machine learning models of sustained attention, we collected research published after July 2008 that met the search criteria.

RESULTS
We identified four basic research topics concerning sustained attention: theoretical models, experimental paradigms, neural mechanisms, and computational models. We review each of these in turn.

Sustained attention: state-of-the-art models
The purpose of the sustained attention survey is to explain the individual's internal fluctuations during the task as well as his or her overall ability to maintain the task (Esterman & Rothlein, 2019). Moreover, the success of maintaining sustained attention is dependent on modulating both external and internal distractions. Therefore, Chun, Golomb & Turk-Browne (2011) classified attention as external modulation and internal modulation based on whether the attention goal was sensory stimulation (external) or cognitive control processes (internal) (see Fig. 1). Under this taxonomy, sustained attention includes maintaining both external and internal attentional focus as well as persistence over a period of time.
Figure 1. Arousal is the level of physiological and psychological activation, which can be determined by various factors, including emotions, motivation, and environmental stimuli. Attentional allocation is influenced by the intrinsic cost of control, motivation, and the degree of arousal. The circles represent the degree of arousal, and the larger the circle, the higher the degree of arousal. Insufficient attentional state in low arousal states affects task performance. The optimal arousal ensures sufficient attention for the task. Excessive arousal states can lead to low task performance due to distraction. Different degrees of arousal are controlled by internal cognition, such as resource-control and opportunity cost, to regulate the proportion of attentional resources. Higher internal controls can handle multitasking or more difficult tasks (the more As in bold), and lower internal controls can only handle single or simple tasks. Blue arrows indicate process of task-unrelated distractors. Red arrows indicate process of task-related targets. DOI: 10.7717/peerj.15351/fig-1
Several theoretical models were proposed to illustrate sustained attention from various perspectives (Hancock, 1989;Hancock & Warm, 2003;Blotenberg & Schmidt-Atzert, 2019;Esterman & Rothlein, 2019). The arousal model modulates sustained attention through the locus coeruleus (LC), affecting the external signal-to-noise ratio and internal information-processing ability. This model suggests that the state of arousal is closely related to our perception and sensory stimulation, as external stimuli automatically capture attention and trigger bottom-up processing. Conversely, top-down control occurs when attention is voluntarily allocated to the internal mind. Moreover, optimal physiological arousal is essential for sustained attention. Arousal is also not a static state. Even if humans are not tired, their arousal levels fluctuate as they become interested, afraid, or surprised. Arousal is mainly regulated by noradrenaline, a neurotransmitter secreted by the locus coeruleus of the midbrain, according to molecular neuroscience (Aston-Jones & Cohen, 2005). The LC-norepinephrine system receives projections from the orbitofrontal cortex and the anterior cingulate gyrus. By enhancing the activity of specific neurons or inhibiting the activities of unrelated neurons, the LC-norepinephrine system optimizes individual behavior through arousal in regular and persistent activities (Lenartowicz, Simpson & Cohen, 2013). In addition, the acetylcholine system in the dorsal pons and basal forebrain, the serotonin system in the raphe nucleus, the histaminergic systems in the tuberomammillary nucleus, and the orexigenic systems in the lateral hypothalamus all contribute to regulating arousal levels through cortical activation. Therefore, the arousal state involves a series of internal physiological changes related to external modulation via the activity of neurotransmitters to enhance task-related information-processing ability. Moreover, after the brain receives the stimulus signal, heart rate, electrophysiological activity, and pupil will change in response to the competition from participants' responses to different sources of stimuli (Unsworth, Robison & Miller, 2018).
While the arousal model explains neural regulation of sustained attention from a neurophysiological perspective, it is difficult to explain the decrease in internal vigilance and the allocation of attention resources. Therefore, a resource-control model (Thomson, Besner & Smilek, 2015) and an opportunity-cost model (Kurzban et al., 2013) have been proposed. These models suggest that the level of intrinsic motivation during a specific task, and the ability to exert influence, diminishes over time (Fortenbaugh, De Gutis & Esterman, 2017). The resource-control model explains the decrease in vigilance over time by the tendency for attentional resource distraction, which is influenced by task difficulty and time course (Thomson, Besner & Smilek, 2015). Generally, increasing the difficulty and duration of a task requires more available attention resources to be used, which increases the demand for attention resource allocation (See et al., 1995). Attention resources are associated with the central executive attentional network (Gartenberg et al., 2018). The depletion of central executive network resources affects sustained attention, leading to errors in information perception and processing. Moreover, considering time-on-task performance, executive control decreases with increased mind wandering, resulting in more attention resources being devoted to mind wandering over time. From the perspective of alternative underload, mindlessness and goal habituation also cause a decline in vigilance (Helton & Russell, 2012). Several behavioral and neuroimaging studies support the resource-control model. A study based on fMRI found that several brain regions associated with vigilance, including the basal ganglia, the sensorimotor cortex, and a right-sided frontal-parietal attention network, were activated after the psychomotor vigilance test (Lim et al., 2010). Vigilance activates the thalamus, as well as the anterior and posterior cortex areas that are potentially related to norepinephrine during the attention network test (Fan et al., 2005). Additionally, sustained attention is related to the functional connections between the default network and the dorsal attention network (Esterman et al., 2017). However, the resource-control model is primarily based on visual modality studies, and it is still unclear whether it can be applied universally to other sensory modalities (Terashima et al., 2021).
Although the resource-control model provides a task-related explanation, it does not elaborate on the effect of subjective experience (mental effort) on task performance (Esterman & Rothlein, 2019). The opportunity-cost model was proposed to explain the decrease in vigilance based on the psychological representation of subjects (Kurzban et al., 2013). It focuses on the expected value of vigilance tasks rather than the proportion of attention resources consumed by mind wandering. The cost and benefit of "effort" in task performance are related to psychological representation. The motivation to devote attention to tasks depends on the effort and psychological expectation of the task execution. Manipulation of the model can explain the impact of psychological activities, such as intrinsic motivation, interest, reward, and stress, on time-on-task performance (Esterman et al., 2016). Attentional engagement and time-on-task performance fluctuations were associated with motivation (Brosowsky et al., 2020). Moreover, a large-scale brain attention network was selectively activated in response to the stimulus characteristics of the task (Long & Kuhl, 2018).
The studies mentioned above suggest that sustained attention models can distinguish multiple states of optimal attention due to external or internal modulation. Moreover, these studies reveal that attention, as a limited resource, can affect task performance over time. However, the neural mechanisms by which attention affects task performance are not fully understood (Esterman & Rothlein, 2019). Therefore, more studies are still needed to investigate the neural basis for theoretical models of sustained attention.

Paradigms for psychological experiments on sustained attention
To explore the neural mechanisms of sustained attention, neuroimaging and electrophysiological methods are used to reveal the neural activity of basic cognitive processes in sustained attention. EEG and magnetoencephalography (MEG) have a high temporal resolution on the sub-millisecond scale, allowing them to detect rapid changes in electrophysiological responses. Functional magnetic resonance imaging (fMRI) uses endogenous blood oxygenation level-dependent (BOLD) contrast to map human brain activity and can locate brain regions activated by different tasks or stimuli with millimeter spatial resolution. Moreover, rs-fMRI brain networks in sustained attention tasks can predict differences in individual performance. While these neuroimaging techniques can effectively monitor potential neural activity, it is vital to design experimental paradigms that can genuinely evoke the neural activity associated with sustained attention. Therefore, many tasks, also called vigilance or sustained attention tasks, have been designed to monitor and evaluate sustained attention. In Table 1, we introduce several experimental paradigms commonly used to measure sustained attention.
Vigilance tasks are used to assess the capacity of sustained attention over long periods. Among the vigilance tasks, the Mackworth clock task (MCT) was a game changer in the 1940s (Mackworth, 1948). It was developed to assess the vigilance of radar technicians during World War II. MCT has been shown in many studies to decrease participant vigilance during tasks (Arsintescu, Mulligan & Flynn-Evans, 2017). Participants monitor the forward ticks of a clock hand and respond when a tick is twice the usual size.
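One simple way to quantify the decline in vigilance during a task such as the MCT is to compare detection rates across successive time blocks. The sketch below is illustrative only: the function names, the number of blocks, and the example outcomes are assumptions made for this example and are not taken from any of the cited studies.

```python
from typing import List, Sequence


def hit_rate_by_block(target_detected: Sequence[bool], n_blocks: int) -> List[float]:
    """Split chronologically ordered target outcomes into n_blocks
    consecutive blocks and return the hit rate within each block."""
    if n_blocks <= 0 or n_blocks > len(target_detected):
        raise ValueError("n_blocks must be between 1 and the number of targets")
    size, extra = divmod(len(target_detected), n_blocks)
    rates = []
    start = 0
    for b in range(n_blocks):
        # Spread any remainder over the first `extra` blocks.
        end = start + size + (1 if b < extra else 0)
        block = target_detected[start:end]
        rates.append(sum(block) / len(block))
        start = end
    return rates


def vigilance_decrement(rates: Sequence[float]) -> float:
    """Drop in hit rate from the first block to the last block."""
    return rates[0] - rates[-1]


# Hypothetical outcomes for 12 targets (True = detected), in presentation order.
outcomes = [True, True, True, True,
            True, True, False, True,
            True, False, False, True]
rates = hit_rate_by_block(outcomes, n_blocks=3)
print(rates)                       # hit rate per block
print(vigilance_decrement(rates))  # first-block minus last-block hit rate
```

A positive value indicates that fewer targets were detected late in the task than early in the task. Other summaries, such as a fitted slope across blocks, would serve the same purpose.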
Although MCT has been replicated in various studies, the continuous performance test (CPT) is the most reliable and well-recognized approach for the clinical evaluation of vigilance (Arsintescu et al., 2019). The CPT is now applied as a kind of neuropsychological test to assess humans' inattentiveness, impulsivity, and vigilance. Moreover, the CPT has been proven to be sensitive to sustained attention. The CPT has evolved in the last century into different versions. The Conners' CPT (Conners & Sitarenios, 2011), also called the non-X CPT, is the most widely accepted version of the CPT. The Conners' CPT is mainly used to assess vigilance task performance and reaction inhibition. Participants press the space bar when they see nontarget stimuli (non-X) and withhold a response to the target letter X. The stimuli occur at 1-, 2-, or 4-s interstimulus intervals (ISIs) during the Conners' CPT. The Conners' CPT takes 14 min to administer.
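The response rule of the non-X CPT described above can be sketched as a small scoring routine. This is a minimal illustration, not the scoring procedure of the commercial Conners' CPT: the function name, the trial representation, and the labels "omission" (no press on a non-X letter) and "commission" (press on X) are assumptions chosen for this example.

```python
from typing import Dict, Iterable, Tuple


def score_non_x_cpt(trials: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Score (letter, pressed) trials under the non-X rule:
    press for every letter except X, withhold for X."""
    go_trials = nogo_trials = omissions = commissions = 0
    for letter, pressed in trials:
        if letter.upper() == "X":
            nogo_trials += 1
            if pressed:          # responded when a response should be withheld
                commissions += 1
        else:
            go_trials += 1
            if not pressed:      # failed to respond to a non-X letter
                omissions += 1
    return {
        "omission_rate": omissions / go_trials if go_trials else 0.0,
        "commission_rate": commissions / nogo_trials if nogo_trials else 0.0,
    }


# Hypothetical trial log: (stimulus letter, whether the space bar was pressed).
example_trials = [("A", True), ("B", True), ("X", False),
                  ("C", False), ("X", True), ("D", True)]
scores = score_non_x_cpt(example_trials)
print(scores)
```

In this hypothetical log, one of four non-X letters was missed and one of two X trials received a response.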
In addition to Conners' CPTs, the test of variables of attention (TOVA) measures the ability to maintain attention with an additional auditory component. Compared with those in other CPTs, the reaction time measurement in TOVA is more accurate and sensitive. Each target (a square near the top edge) or nontarget (a square near the bottom edge) appears on a computer screen for 100 ms, and participants are asked to press the space bar if the presented stimulus is a target. Many experimental studies have shown that the TOVA is reliable and effective in evaluating ADHD. Most of the results show that TOVA is helpful in distinguishing subjects who have problems with attention lapses (Lin et al., 2021). The adaptive rate continuous performance test (ARCPT) differs from the Conners' CPT and TOVA in that it measures sustained attention on a more demanding rapid information-processing task (Lohr, 1999). The ISIs of the ARCPT are adaptive and vary depending on the performance of the subjects. The initial ISI is set to 60 ms. If the response to the stimulus is correct, the ISI decreases by 4 ms; after an error, it increases by 4 ms. The changing ISIs enable participants to maintain an accuracy of 80% in the task. Although the ISIs of the ARCPT allow participants to maintain a high accuracy level, it is difficult to address the response errors caused by subjects' boredom or mind wandering (Cohen, Sparling-Cohen & O'Donnell, 1993). Therefore, Manly & Robertson (2005) presented the sustained attention to response task (SART) as a measurement of sustained attention. The SART is a go/no-go vigilance task used to measure sustained attention in short periods of time. The mechanically continuous response in the SART causes the participant to endogenously regulate attention. When subjects see the frequent stimuli (for example, digit '3'), they press the space bar. However, when they see the infrequent stimuli, they withhold their response. Because of its conciseness, the SART has been utilized extensively in clinical practice research of sustained attention as well as in a variety of brain imaging studies (Scheinost et al., 2020). Furthermore, the SART has also been used as an additional vigilance test in some sleep studies.
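The adaptive ISI rule described for the ARCPT (start at 60 ms, shorten by 4 ms after a correct response, lengthen by 4 ms after an error) can be written as a short update loop. The sketch below is an assumption-laden illustration: the function names and the lower bound on the ISI (`floor_ms`) are hypothetical additions, and the code does not attempt to reproduce the accuracy level reported for the actual test.

```python
from typing import Iterable, List


def next_isi(isi_ms: int, correct: bool, step_ms: int = 4, floor_ms: int = 4) -> int:
    """Return the ISI for the next trial: shorter after a correct
    response, longer after an error. floor_ms is a hypothetical
    safeguard that keeps the ISI positive."""
    updated = isi_ms - step_ms if correct else isi_ms + step_ms
    return max(floor_ms, updated)


def run_adaptive_isi(responses: Iterable[bool], start_isi_ms: int = 60) -> List[int]:
    """Apply the update rule over a sequence of trial outcomes and
    return the ISI used on each trial, including the starting value."""
    isi = start_isi_ms
    history = [isi]
    for correct in responses:
        isi = next_isi(isi, correct)
        history.append(isi)
    return history


# Hypothetical outcomes: two correct responses, one error, one correct response.
isi_history = run_adaptive_isi([True, True, False, True])
print(isi_history)
```

Because the ISI tracks performance trial by trial, the task becomes faster while the participant responds correctly and slows down after errors.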
In addition to the SART, the psychomotor vigilance test (PVT) is also related to the measurement of sustained attention in sleep research (Dinges & Powell, 1985). The PVT aims to assess changes in performance caused by decreased vigilance. Participants are instructed to respond as quickly as possible to a visual stimulus (typically, a millisecond counter) presented at random 1-9 s ISIs. The PVT has been widely used in research on fatigue. The decline in performance during the PVT is primarily due to cognitive slowing and attention lapses (Dinges & Kribbs, 1991).
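The PVT timing and its two performance signatures, slowing and lapses, can be illustrated with a short sketch. Several details here are assumptions rather than statements from the cited work: ISIs are drawn uniformly from the 1-9 s range, a lapse is counted when a reaction time is at least 500 ms (a commonly used convention), and all function names and example values are hypothetical.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence


def draw_isis(n_trials: int, low_s: float = 1.0, high_s: float = 9.0,
              seed: int = 0) -> List[float]:
    """Draw one random ISI per trial (uniform sampling is an assumption)."""
    rng = random.Random(seed)
    return [rng.uniform(low_s, high_s) for _ in range(n_trials)]


def summarize_pvt(rts_ms: Sequence[float],
                  lapse_threshold_ms: float = 500.0) -> Dict[str, float]:
    """Summarize reaction times: mean RT, lapse count, and lapse rate.
    The 500 ms lapse threshold is an assumed convention."""
    if not rts_ms:
        raise ValueError("at least one reaction time is required")
    lapses = sum(1 for rt in rts_ms if rt >= lapse_threshold_ms)
    return {
        "mean_rt_ms": mean(rts_ms),
        "lapses": lapses,
        "lapse_rate": lapses / len(rts_ms),
    }


isis = draw_isis(5)
# Hypothetical reaction times in milliseconds.
summary = summarize_pvt([250.0, 310.0, 620.0, 280.0, 540.0])
print(isis)
print(summary)
```

A rising mean reaction time over the session would reflect cognitive slowing, while a rising lapse count would reflect momentary failures of attention.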
In the experimental paradigms described above, external cues caused by abrupt onsets and offsets of stimuli cannot be easily eliminated. The gradual-onset continuous performance task (gradCPT), which uses fade-in and fade-out for stimulus presentation, can better clarify the behavioral and neural correlates of visual sustained attention (Esterman et al., 2013). Participants were required to press a button for city scene images (90% of trials displayed city scenes and 10% displayed mountain scenes). Each scene image gradually transitioned to the next over 800 ms. Several other paradigms are also used to test broader attention-related problems (Munnik et al., 2020).
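The gradual transition used in the gradCPT can be sketched as a cross-fade between the outgoing and incoming image, together with a trial sequence that follows the 90%/10% scene proportions. The linear fade, the per-pixel blending, and all function names below are assumptions made for illustration; the text above specifies only the 800 ms transition and the scene proportions.

```python
import random
from typing import List


def incoming_weight(t_ms: float, transition_ms: float = 800.0) -> float:
    """Weight of the incoming image at time t_ms within a transition,
    assuming a linear fade from 0 to 1."""
    if transition_ms <= 0:
        raise ValueError("transition_ms must be positive")
    return min(1.0, max(0.0, t_ms / transition_ms))


def blend_pixel(old: float, new: float, t_ms: float,
                transition_ms: float = 800.0) -> float:
    """Cross-fade a single pixel intensity between two images."""
    w = incoming_weight(t_ms, transition_ms)
    return (1.0 - w) * old + w * new


def make_scene_sequence(n_trials: int, p_city: float = 0.9,
                        seed: int = 0) -> List[str]:
    """Draw a scene category for each trial with probability p_city of 'city'."""
    rng = random.Random(seed)
    return ["city" if rng.random() < p_city else "mountain"
            for _ in range(n_trials)]


sequence = make_scene_sequence(20)
print(sequence)
print(blend_pixel(0.0, 100.0, t_ms=200.0))  # one quarter of the way through the fade
```

Because each image fades in while the previous one fades out, there is no abrupt onset that could cue the participant to a new trial.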
In conclusion, the ultimate goal of different experimental paradigms is to evoke cognitive processes associated with sustained attention during tasks. Moreover, researchers who use the same paradigm tend to frame their questions similarly.

Neural mechanisms
Exploring the neural mechanisms of sustained attention helps in understanding the human psychophysiological process during tasks. Traditional research has mainly focused on the activities of specific brain regions during sustained attention (Sarter, Givens & Bruno, 2001;Sonuga-Barke & Castellanos, 2007). An increasing number of researchers are beginning to recognize that brain areas involved in sustained attention are not limited to specific areas (Klimesch, 2012;Pamplona et al., 2020). In general, the neural mechanisms of sustained attention include visual, auditory, and other somatosensory pathways (Clayton, Yeung & Cohen Kadosh, 2015;Helfrich et al., 2018). However, visual attention is probably more widely known among all cortical systems than auditory and somatosensory attention. Studies have shown that many brain areas, primarily the occipital, parietal, temporal, and frontal eye fields, are involved in the human visual attention system (Saygin & Sereno, 2008;Offen et al., 2010;van et al., 2022).
Regarding brain areas involved in sustained attention, Clayton, Yeung & Cohen Kadosh (2015) addressed this issue by proposing an oscillatory model in which the posterior medial frontal cortex (pMFC), medial prefrontal cortex (mPFC), posterior cingulate cortex (PCC), and lateral prefrontal cortex (LPFC) are primarily involved in sustained attention. Langner & Eickhoff (2013) used functional neuroimaging to identify 14 clusters that were consistently activated across various tasks involved in sustained attention. These clusters are primarily found in the frontal cortex, cingulate cortices, and subcortical structures (see Fig. S1).

Sustained attention and the frontal lobe
The frontal lobe is a section of the brain that covers the front part of the cerebral cortex. It is usually regarded as the executive control center of the brain (Luria, 1973). These executive functions consist of a number of individual capacities, such as inhibition, goal-directed behavior, and self-monitoring (Oliveira et al., 2012). These individual capacities control and regulate the process of sustained attention. The frontal lobe mainly comprises the premotor area, primary motor area, and prefrontal lobe. The prefrontal cortex is located in the anterior region of the frontal lobe. It is linked to higher-order processing abilities such as attention, working memory, language, and executive function (Raver & Blair, 2016). Frontal lobe system damage often affects sustained attention, resulting in lapses in sustained attention and attention-executive disorders (Esterman et al., 2013). Moreover, it was found that patients with injuries in some areas of the prefrontal cortex performed abnormally in the implementation of sustained attention tasks, whereas they performed normally in other cognitive ability tests (Sarter, Givens & Bruno, 2001;Langner & Eickhoff, 2013). The frontal cortex has rich functional connections with the posterior brain system as well as several subcortical systems, including the limbic system, midbrain reticular system, and thalamic structure.
Many neurophysiology studies have found increased activation in the frontal cortex during a vigilance task (see Table S1). In addition, Han, Lee & Choi (2019) proposed that theta and alpha-band (4-12 Hz) EEG activities in the frontal cortex were essential for sustained attention and goal-related behaviors. In particular, EEG activity of the CPT state shows the dominance of effective connections going from the prefrontal cortex toward the parietal lobe at 4 Hz (Francisco-Vicencio et al., 2022). Moreover, sustained attentional preparation can be indexed by the deployment of a centrally distributed event-related potential (ERP), named the contingent negative variation (CNV) (Segalowitz, Dywan & Unsal, 1997;Kropp et al., 2001). CNV and P3 within the frontal cortex appear to be good candidates to investigate different mechanisms supporting sustained attention and prediction abilities (Thillay et al., 2015).
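To make the 4-12 Hz (theta and alpha) band concrete, the sketch below estimates the fraction of signal power that falls within that band using a plain discrete Fourier transform. It is an illustration under stated assumptions, not the analysis pipeline of the cited studies: the sampling rate, the synthetic two-component signal, the use of relative power as the summary measure, and the function name are all hypothetical.

```python
import cmath
import math
from typing import Sequence


def band_power_fraction(signal: Sequence[float], fs_hz: float,
                        low_hz: float = 4.0, high_hz: float = 12.0) -> float:
    """Fraction of one-sided spectral power (DC excluded) that lies
    between low_hz and high_hz, computed with a naive DFT."""
    n = len(signal)
    mean_val = sum(signal) / n
    centered = [x - mean_val for x in signal]  # remove the DC offset
    band = total = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs_hz / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        total += power
        if low_hz <= freq <= high_hz:
            band += power
    return band / total if total > 0 else 0.0


# Synthetic 2 s signal sampled at 128 Hz: a 6 Hz component (inside the band)
# plus a weaker 20 Hz component (outside the band).
fs = 128.0
samples = [math.sin(2 * math.pi * 6 * i / fs)
           + 0.5 * math.sin(2 * math.pi * 20 * i / fs)
           for i in range(256)]
fraction = band_power_fraction(samples, fs)
print(round(fraction, 3))
```

For real recordings, a windowed spectral estimate from an established signal-processing library would normally replace the naive transform used here.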
Damage to the whole frontal cortex or to attention-related brain regions (such as the pMFC and mPFC) affects sustained attention function, manifesting as behavioral disturbances or functional abnormalities in an individual. For example, an rs-fMRI study found that frontal functional disconnection may underlie the pathogenesis responsible for defective vigilance/sustained attention (Tu et al., 2020). In addition, increased functional connectivity in the right frontoparietal network might reflect excessive cognitive fatigue in patients with traumatic brain injury (TBI) (Shumskaya et al., 2012).
Deficits in sustained attention are the most common disorder caused by frontal cortex damage (Wilkins, Shallice & McCarthy, 1987). Many studies have shown that abnormal neuron development in the frontal lobe may cause sustained attention disorders or even hyperactivity disorders (Rubia et al., 2019). In addition to developmental disorders, stroke and brain tumors in the frontal cortex can also lead to sustained attention deficits (Torres et al., 2021). Closed head injuries caused by external impact or sudden violent exercise can also impair sustained attention (Parasuraman, Mutter & Molloy, 1991). Furthermore, sustained attention decreases with aging because of frontal lobe degeneration over time (Mitko et al., 2019). Patients with deficits in sustained attention often suffer from ADHD, epilepsy, depression, intellectual disability, and other complex neuropsychiatric problems as well (Malkovsky et al., 2012). Among these symptoms, ADHD is a representative disorder of deficits in sustained attention (Kass, Wallace & Vodanovich, 2003). It has been linked to inattention, impulsivity, and negative affect (Barkley, Knouse & Murphy, 2011). Patients with ADHD have difficulty maintaining focus and vigilance for extended periods of time, leading to poor academic performance, career mistakes, and even operator-related train/car accidents (Fortenbaugh, De Gutis & Esterman, 2017;Zeller, 2022).

Sustained attention and the cingulate cortex
The cingulate cortex consists of two distinct systems: (1) a posterior system that receives input from dorsal stream areas and projects to some cortical systems and the thalamus, and (2) an anterior system that receives signals from the thalamus and frontal-parietal lobes and projects to limbic structures (Jafari, Malayeri & Rostami, 2015). Some studies have revealed that the cingulate cortex participates in sustained attention tasks (see Table S2).
The anterior cingulate cortex (ACC) receives inputs from the lateral frontal cortex and the posterior parietal cortex. In addition, it compactly connects with the basal ganglia (BG). The anterior cingulate cortex, which is part of the limbic system, receives inputs from the thalamus and neocortex and has large projections to the nucleus accumbens and amygdala. Many experts have found that the anterior cingulate cortex plays a vital role in conflict monitoring (Jones et al., 2002). The ACC was found to be associated with cognitive impairment in rs-fMRI of sustained attention tasks (Loitfelder et al., 2012). Furthermore, some researchers have confirmed that the anterior cingulate cortex is constantly activated during tasks related to sustained attention (Fan et al., 2018). Subsequent studies using a series of Stroop tasks have suggested that strong reactions in the anterior cingulate cortex mediate attention and conflict resolution (Corlier et al., 2020) (see Table S2). Moreover, in healthy adults, better sustained attention was associated with more robust activation of the ACC during SART and gradCPT tasks (Esterman et al., 2013).
The posterior cingulate cortex (PCC) is considered to be a paralimbic cortical structure. It has rich projections to the frontal, parietal, and temporal cortex, as well as to subcortical systems such as the thalamic nucleus, pontine, and basal ganglia. Therefore, the PCC is thought to be involved in several cognitive activities, although its specific functions have not been clarified. The results of resting functional brain imaging revealed that the PCC is a critical node in the default mode network (DMN) and plays a vital role in attention regulation (Kral et al., 2019). It has been confirmed that the PCC is activated and has strong interactions with other parts of the DMN in both resting state and continuous working memory tasks (Lau, Leung & Zhang, 2020). Abnormalities of the DMN are frequently seen in neurological and psychiatric disorders such as ADHD, Alzheimer's disease, schizophrenia, autism, and depression. Therefore, PCC has important clinical significance (Zhou et al., 2020).
The latest neuroimaging research found that there is a selective enhancement of oscillatory coupling between the ACC and the dorsal attention network (DAN) during attention tasks (Wong et al., 2022). Human single-neuron recordings during conflict tasks suggest that the dorsal ACC can be involved in attention-related performance monitoring (Fu et al., 2022). Resting-state functional connectivity within the DAN can predict individual performance in spatial attention tasks (Machner et al., 2022).

Sustained attention and subcortical structures
Subcortical structures are neural structures located deep in the brain that include the brainstem, midbrain, cerebellum, basal ganglia, thalamus, hypothalamus, and limbic nuclei. The hypothalamus and the reticular formation coordinate arousal through their vast array of projections to other brain regions (Rapoport et al., 1978). The basal ganglia and thalamic nuclei are responsible for gating information processing and are closely related to attention and somatic movement (McAlonan, Cavanaugh & Wurtz, 2008). According to Fan et al. (2008), the basal ganglia appear to be central to executive regulation mechanisms, error monitoring, and sustained vigilance. Limbic nuclei include the amygdala, septal nucleus, and nucleus accumbens.
The anterior thalamic nuclei may serve as a site of integration between frontal areas and the hippocampus to regulate attentional processes (Nelson, 2021). Attention is also linked to the hippocampus, which is responsible for the storage, conversion, and orientation of long-term memory (Aly & Turk-Browne, 2016). Evidence from the reticular formation (Dietrich & Audiffren, 2011), thalamus (Rajab et al., 2014), and limbic structures (Wang et al., 2013b) suggests that exercise may help to facilitate attentional processes. The asymmetrical development of the right-lateralization of the frontal lobe and left-lateralization of the occipital lobe may affect ADHD severity (Chen et al., 2021). Furthermore, stereo-electroencephalography (SEEG) recordings provide direct evidence that the anterior nucleus of the thalamus modulates hippocampal gamma activity in attention and working memory tasks (Piper et al., 2022; Liu et al., 2021).
Based on the information above, attention is a byproduct of regulation from multiple brain regions rather than a strictly cortical phenomenon. Clinical studies provide additional evidence that the frontal lobe, cingulate cortex, and subcortical systems each play a role in sustained attention.

Neural pathways of visual sustained attention
Sustained visual attention underpins the remarkable perceptual and information-processing capabilities of the human visual system. Many studies have been dedicated to exploring the neural pathways of sustained visual attention.
Dynamic causal modeling provides compelling evidence for the regulation of attention through the PFC ↔ thalamic, ACC ↔ thalamic, BG ↔ thalamic, and PFC ↔ BG pathways (Jagtap & Diwadkar, 2016). For example, the modulation of thalamic → PFC pathways is presumed to reflect ascending attention processes engaged by external sensory inputs of salient and novel stimuli. In comparison, modulation of frontal → thalamic pathways represents descending attention processes mediated by voluntary shifts of attention based on expectations of goals and rewards (Connor, Egeth & Yantis, 2004). Before reaching the cortex, visual information is filtered by the thalamus. The thalamus, the "gateway" to the cortex, comprises various subnuclei involved in attention gating (Brunia, 1993). The thalamus affects feedforward and feedback information transmission between the frontal, parietal and occipital cortex regions (Tokoro et al., 2015). Attention to stimuli suppresses the neuronal activity of the reticular nucleus over selected relay nuclei, and this disinhibition gates thalamocortical inputs (Conway, 2014). These functional effects appear to be mediated by anatomical connections between the thalamus (and specific thalamic nuclei) and regions of the frontal lobe, including the LPFC and the cingulate cortex. The PCC is closely linked to the thalamus (Leech & Sharp, 2014). It receives information from the visual cortex and sends it to the LPFC (Buschman & Kastner, 2015). Sustained attention responses also exist in the early visual cortex in the absence of visual stimuli (Silver, Ress & Heeger, 2007). Several elegant studies have found that the presupplementary motor area (pre-SMA) is the target of the BG (Akkal, Dum & Strick, 2007; Wiesendanger & Wiesendanger, 1985). The pre-SMA receives input from the cortex and delivers output to the thalamus. The BG organizes motivations that lead to the execution of goal-directed behaviors, for example, pushing a button.
When the attention process in the ventral regions is goal-oriented, information from the visual cortex activates neural activity in the inferotemporal cortex (IT), which is followed by activation in the LPFC (Hommel et al., 2019). The anteromedial prefrontal cortex, ACC, anterior insula, and anterior thalamic nodes form the cingulo-opercular circuit, which is involved in distinguishing potential mismatches and conflicts (Williams, 2016).
In addition, the arousal model shows that the LC-norepinephrine system in the brainstem plays a critical role in the vigilance of sustained attention. Norepinephrine projections originating from the LC and ending in the thalamus mediate the attention process (Sarter, Givens & Bruno, 2001). Projections from the ACC to the LC-norepinephrine system indicate that the mPFC is involved in the regulation of arousal through low-frequency phase synchronization with the LPFC (Clayton, Yeung & Cohen Kadosh, 2015; Craigmyle, 2013). Based on the above literature, we propose possible pathways of sustained attention, as shown in Fig. 2.
In addition, functional brain networks play a crucial role in sustained attention. Recent research has posited that the visual processes of sustained attention emerge from an array of large-scale functional networks (Fortenbaugh et al., 2018). In largely independent lines of research, influential brain network models (Esterman et al., 2013) have suggested that optimal sustained attention requires cooperation among the task positive network (TPN), frontoparietal control network (FPN), ventral attention network (VAN), dorsal attention network (DAN), and DMN, as shown in Fig. 2. Intracranial electroencephalography (iEEG) in human subjects offers evidence that the DMN interacts negatively with both the DAN and salience network (SN) (Kucyi et al., 2020). Moreover, researchers using BOLD fMRI found that many attention-related brain networks, such as the DMN and DAN, were activated during the gradCPT task (Mitko et al., 2019). Better sustained attention is associated with stronger anticorrelations between the DAN and DMN (Chang et al., 2022). Furthermore, optimal sustained attention is less dependent on the DAN and more dependent on brain networks related to task automation, such as the DMN (Okabe, 2016). Therefore, characterizing both anatomic neural pathways and functional connectivity could allow for a more profound study and eventually provide a panorama of the neural mechanisms of sustained attention.
Figure 2 [...] et al., 2013; Conway, 2014). The brain network of visual sustained attention consists of sub-networks including the cingulate cortex, LPFC, thalamus, insula, BG, and IT (Jagtap & Diwadkar, 2016). These sub-networks are responsible for functions including regulation, error monitoring or processing, and sustained vigilance. In addition to the forebrain, the midbrain LC can also regulate sustained attention by secreting neurotransmitters (the black dashed line).
LPFC, lateral prefrontal cortex; pre-SMA, pre-supplementary motor area; ACC, anterior cingulate cortex; PCC, posterior cingulate cortex; IT, inferotemporal; BG, basal ganglia; LC, locus coeruleus.
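The anticorrelation between the DAN and DMN discussed above is commonly expressed as a correlation coefficient between network-averaged signals. The following minimal sketch computes a Pearson correlation between two time series; the signal values and the names `dan` and `dmn` are made up for illustration and are not taken from any of the cited studies.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long time series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical network-averaged BOLD signals (arbitrary units).
dan = [0.2, 0.5, 0.9, 0.4, -0.1, -0.6, -0.8, -0.3]
dmn = [-0.1, -0.4, -0.8, -0.5, 0.2, 0.5, 0.9, 0.2]

r = pearson_r(dan, dmn)
print(f"DAN-DMN functional connectivity: r = {r:.2f}")
```

A strongly negative value of `r` corresponds to the anticorrelated DAN-DMN pattern described in the text; real analyses would first apply preprocessing steps that are omitted here.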

Computational models
The theoretical models used to explain sustained attention are constantly evolving as cognitive science advances. The arousal and resource-control models mentioned above describe the complex relationship between multiple processing modules in sustained attention. However, descriptive models cannot quantitatively analyze the relationship between these modules. Moreover, descriptive models provide little information about how the processing modules in sustained attention change under specific conditions. It is difficult to give a precise measurement of sustained attention. Therefore, cognitive and computer scientists have introduced many computational models to quantitatively measure and evaluate sustained attention.

Biomathematical models
Early research on computational models was related to sleep loss and circadian rhythm. Researchers typically ask subjects to perform a vigilance task after sleep deprivation or when sleep is insufficient or irregular. The subject's vigilance or sustained attention can then be quantified using a computational model based on task performance. Across a diverse set of vigilance tasks, performance declines as time-on-task increases (Davies & Parasuraman, 1982). Studies using these tasks have shown that the magnitude of the decline is influenced by various factors, including the event rate, signal probability, stimulus duration, stimulus modality, and others (Warm, Dember & Hancock, 1996; Greenlee & Hess, 2019). At present, researchers use sleep loss and fluctuations in circadian rhythms to explain decreased vigilance (Walsh, 2014). Theories of sleep deprivation and theories of vigilance reduction appear to differ notably. However, studies have shown that insufficient sleep and reduced vigilance have the same effects on cognitive processing (Veksler & Gunzelmann, 2018). Kronauer, Forger & Jewett (1999) found that ambient light can affect vigilance by influencing the phase and amplitude of circadian pacemakers. Jewett & Kronauer (1999) proposed a circadian rhythm neurobehavioral performance and alertness (CNPA) model. Hursh et al. (2004) proposed a sleep activity, fatigue, and task effectiveness (SAFTE) model. These biomathematical models predict that an increase in sleep deprivation will lead to a continued decline in vigilance. Early biomathematical models, however, can only provide estimates of vigilance. They cannot predict potential changes in performance or cognitive processing, nor can they explain the mechanism of behavioral changes.
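To make the idea of a biomathematical model concrete, the following minimal sketch combines a homeostatic sleep-pressure term with a circadian oscillation, in the spirit of two-process sleep-regulation models. It is not the CNPA or SAFTE model; the functional forms and all parameter values (`s0`, `tau`, `amplitude`, `peak_hour`, `wake_time`) are illustrative assumptions rather than fitted values.

```python
import math


def sleep_pressure(hours_awake, s0=0.2, tau=18.2):
    """Homeostatic pressure that saturates toward 1 during wakefulness.

    s0 (pressure at wake-up) and tau (time constant, hours) are assumed values.
    """
    return 1.0 - (1.0 - s0) * math.exp(-hours_awake / tau)


def circadian(clock_hour, amplitude=0.15, peak_hour=18.0):
    """Sinusoidal circadian drive with an assumed early-evening peak."""
    return amplitude * math.cos(2.0 * math.pi * (clock_hour - peak_hour) / 24.0)


def predicted_alertness(hours_awake, wake_time=7.0):
    """Toy alertness estimate: low sleep pressure plus circadian drive."""
    clock_hour = (wake_time + hours_awake) % 24.0
    return (1.0 - sleep_pressure(hours_awake)) + circadian(clock_hour)


for h in (2, 12, 24, 36):
    print(f"{h:2d} h awake -> alertness {predicted_alertness(h):.3f}")
```

Like the models described above, this sketch yields only an alertness estimate as a function of time awake and time of day; it says nothing about the cognitive processes behind the change.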

Integration models
Researchers integrated predictions of alertness levels, generated by biomathematical models, with information-processing methods in a cognitive architecture model to produce precise predictions of sustained attention. A cognitive architecture model provides a unified information-processing framework that is based on decades of empirical evidence and psychological theories. Thus, the integration model not only helps in understanding the changes in human performance as vigilance declines but also explains the underlying neural mechanisms of sustained attention. Anderson et al. (1998) proposed a cognitive architecture known as adaptive control of thought-rational (ACT-R). It has been used to provide a quantitative description of human performance in cognitive tasks. Gunzelmann et al. (2011) introduced the microlapse theory of fatigue (MTF), which integrated a biomathematical model with ACT-R to model different sustained attention tasks. MTF, as an instantiation of the computational model, describes the process of decreased vigilance. Jackson et al. (2013) improved Gunzelmann's model by including a time-on-task component. Although Jackson's model uses the same mechanism as the original model, it has had an impact on sleep deprivation and circadian rhythm research. Gartenberg et al. (2018) proposed the microlapse theory of fatigue with replenishment (MTFR), a process model similar to MTF that supplements the mechanisms related to opportunistic rest periods and internal rewards.
These computational models, when combined with a specific cognitive framework, can achieve a strong fit between simulated and actual behavioral data to predict behavioral performance in sustained attention tasks.
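The link between a fatigue level and behavioral output can be illustrated with a deliberately simplified simulation. The sketch below is not ACT-R and not the published MTF implementation; it assumes that a microlapse can be treated as a cognitive cycle in which no action is selected, and that fatigue simply raises the per-cycle microlapse probability. The constants `CYCLE_MS` and `BASE_RT_MS` and the function names are hypothetical.

```python
import random

CYCLE_MS = 50.0     # assumed duration of one cognitive cycle
BASE_RT_MS = 250.0  # assumed reaction time when no microlapse occurs


def simulate_rt(p_microlapse, rng, max_lapses=200):
    """Simulate one reaction time (ms).

    Each microlapse delays the response by one cycle; the response is
    issued on the first cycle in which no microlapse occurs.
    """
    lapses = 0
    while lapses < max_lapses and rng.random() < p_microlapse:
        lapses += 1
    return BASE_RT_MS + lapses * CYCLE_MS


def mean_rt(p_microlapse, n_trials=2000, seed=0):
    """Average simulated reaction time over n_trials trials."""
    rng = random.Random(seed)
    total = sum(simulate_rt(p_microlapse, rng) for _ in range(n_trials))
    return total / n_trials


print("rested  :", mean_rt(0.05))
print("fatigued:", mean_rt(0.50))
```

In an integration model, the microlapse probability would be driven by the alertness estimate of a biomathematical model rather than set by hand as it is here.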

Machine learning algorithms
With the advancement of modern science, neuroimaging technology is increasingly being used to evaluate sustained attention (Zhang et al., 2022). Previously, neuroimaging data were combined with classical statistical methods to construct a computational model of sustained attention. However, as research advances, neuroimaging technology will result in an explosive increase in data scale. Classical statistical methods cannot handle large quantities of neuroimaging data, and manually processing these complex neuroimaging data is time-consuming. Thus, computational models that can automatically and elegantly process massive quantities of neuroimaging data are urgently needed to meet the demands of state-of-the-art neuroimaging research. Machine learning, an advanced computing method, is uniquely suited to address these issues.
In traditional machine learning approaches, handcrafted features are commonly combined with support vector machines (SVMs), k-nearest neighbors (KNNs), and Bayesian networks. The classification accuracy is over 70% when solving a binary classification problem in most EEG-based studies (see Table 2).
SVM is a supervised learning model for classification and regression in machine learning. It performs well when recognizing small samples with high-dimensional data. Using the SVM model, some studies have focused on assessing the attentional state in healthy people. Yeo, Shen & Wilder-Smith (2009) tested the usefulness of SVM in distinguishing between alert and drowsy EEG patterns. Cirett Galán & Beal (2012) used SVM to estimate sustained attention and cognitive workload while students were solving a series of math problems. Zhang et al. (2016) used SVM to detect sustained attention load based on passive brain-computer interface signals from functional near-infrared spectroscopy (fNIRS). Samaha, Sprague & Postle (2016) used SVM to assess whether the spatial selectivity of neural responses can be recovered from the topography of alpha-band oscillations during spatial attention. Batbat, Güven & Dolu (2019) achieved high accuracy by combining EEG data from visual, auditory, and auditory-visual tasks and using an SVM with a linear kernel to classify different attentional states. Moreover, SVM is regarded as a relatively fast classifier. It is practically suitable for cases where the number of features is greater than the number of instances. However, SVM is difficult to scale to large samples and to apply to multilabel classification problems.
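As a rough illustration of how a linear SVM separates two attentional states, the sketch below trains a linear classifier by stochastic subgradient descent on the L2-regularized hinge loss. It is a teaching example under stated assumptions, not the method of any study cited above: the two features (hypothetical theta and alpha band power), the data values, and the hyperparameters `lr`, `lam`, and `epochs` are all invented for illustration.

```python
def train_linear_svm(samples, labels, lr=0.01, lam=0.01, epochs=300):
    """Fit weights w and bias b by subgradient descent on the hinge loss.

    labels must be +1 or -1; lam is the L2 regularization strength.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The regularization term always shrinks the weights slightly.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Sample is inside the margin: push the boundary away from it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (alert) or -1 (drowsy) for feature vector x."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical (theta_power, alpha_power) features; +1 = alert, -1 = drowsy.
X = [(1.0, 3.0), (1.2, 2.8), (0.8, 3.2), (3.0, 1.0), (2.8, 1.2), (3.2, 0.9)]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
print("weights:", w, "bias:", b)
print("new sample:", predict(w, b, (0.9, 3.0)))
```

Practical studies would normally rely on an established SVM library and cross-validation; the hand-written optimizer here only exposes the margin-based update that gives the method its name.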
The KNN algorithm, unlike the SVM algorithm, is an instance-based learning method. The KNN classifier is very simple and intuitive. High accuracy was achieved in classifying clinical patients and nonclinical participants using a combination of features with a KNN classifier. There are studies that have distinguished between different attentional states.  (2016) proposed a classification method that combines correlation-based feature selection (CFS) and a KNN algorithm to identify attentional states during the learning process. However, KNN depends heavily on training data. The complexity of KNN increases dramatically as the number of features increases.
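The instance-based nature of KNN can be seen in a few lines of code. In the minimal sketch below, the feature values, the labels, and the choice of `k=3` are illustrative assumptions; no dataset from the cited studies is used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Every prediction scans the whole training set (no model is fitted).
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional feature vectors with attentional-state labels.
train_x = [(0.9, 0.2), (1.0, 0.3), (0.8, 0.25),
           (0.3, 0.9), (0.2, 1.0), (0.25, 0.8)]
train_y = ["attentive"] * 3 + ["inattentive"] * 3

print(knn_predict(train_x, train_y, (0.85, 0.3)))
print(knn_predict(train_x, train_y, (0.3, 0.85)))
```

Because the classifier stores and searches the training data directly, its output and its cost both depend heavily on that data, which matches the limitation noted above.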
The Bayesian model is more adaptable than the two other methods. It is a type of probabilistic graphical model that uses Bayesian inference to compute probabilities. The Bayesian model is fast and has been used for attentional classification problems by many researchers, such as Larue, Rakotonirainy & Pettitt (2010). In their study, the participants' reaction times during the SART were used to detect vigilance decline in real time using a Bayesian model. They quantified the effect of monotony on overall performance. In addition, a Bayesian model can also be used to predict the likelihood of humans typically focusing on a scene (Pang et al., 2008). Luo et al. (2020) developed a hybrid incremental dynamic Bayesian network and constructed a visual focus detection method based on fusing drivers' head and eye movement data. Borji, Sihite & Itti (2012a, 2012b) used Bayesian networks to estimate the attentional state of subjects while performing a task (for example, playing video games) and mapped the state to an eye position.
In addition to the classifiers mentioned above, principal component analysis (PCA) (Arruda et al., 2007), artificial neural networks (ANNs) (Dowman & Ben-Avraham, 2008), linear discriminant analysis (LDA) (Ghassemi et al., 2009), K-Means (Gurudath & Bryan Riley, 2014), and other methods have been used to solve attention-related classification problems.
Classic machine learning algorithms require experts to design elegant, handcrafted features. However, deep learning can learn feature representations from datasets automatically. To interpret data, deep learning builds a deep neural network that mimics the neural mechanisms of the human brain (Zaharchuk et al., 2018). As data acquisition technology advances in scientific research, computer-aided data analysis based on deep learning will become more widely accepted.
Therefore, we reviewed existing deep learning methods for attentional state classification (see Table 3), including convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
CNNs have been proven to be efficient in research areas such as image recognition and classification. In recent years, many researchers have used CNNs to measure attention and try to find the neural features that correspond to it. For example, Borhani et al. (2018) developed an EEG-based classifier that used CNN to investigate underlying subject-specific features related to early visual attention.  (2020) proposed a multiscale convolutional neural network-dynamic graph convolutional network (AMCNN-DGCN) model that estimated driving fatigue using EEG data from the driving task.
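The basic operation that a convolutional layer applies to a signal can be sketched without any deep learning library. The example below slides a small kernel over a one-dimensional signal and applies a ReLU nonlinearity; the signal values are hypothetical, and the kernel is set by hand here, whereas in a trained CNN it would be learned from data. It does not reproduce the architecture of any model cited above.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation, the core operation of a CNN layer."""
    n, k = len(signal), len(kernel)
    if k > n:
        raise ValueError("kernel must not be longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def relu(values):
    """Rectified linear unit: negative responses are set to zero."""
    return [max(0.0, v) for v in values]


# Hypothetical single-channel EEG segment containing one sharp transient.
eeg = [0.0, 0.1, 0.0, 1.0, 0.0, -0.1, 0.0]
# Hand-set kernel that responds to a peak flanked by lower values.
kernel = [-0.5, 1.0, -0.5]

feature_map = relu(conv1d(eeg, kernel))
print(feature_map)
```

The largest entry of `feature_map` marks the position of the transient, which is the sense in which convolutional features localize patterns in the input.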
However, CNNs are not well suited to processing sequential data. Therefore, researchers developed RNN models in which neural network nodes and connections form a directed graph along a temporal sequence. As a result, RNNs are suitable for tasks involving sequential data, such as online handwriting recognition (where features can be extracted from both the pen trajectory and the resulting image) (Wu et al., 2014) and speech recognition (Fayek, Lech & Cavedon, 2017). RNNs can also be used to assess the state of sustained attention over time. Moinnereau et al. (2018) presented a deep RNN architecture for learning robust features and predicting cognitive load levels from EEG recordings. Jeong & Jeong (2020) utilized RNNs to distinguish between possible attention states. In addition, some electrophysiological data with long recording times are also suitable for processing with RNNs. For example, Phan et al. (2018) proposed a feature learning approach for single-channel automatic sleep stage classification. This approach is based on a deep bidirectional RNN with an attention mechanism. Huve, Takahashi & Hashimoto (2018) collected fNIRS signals and then used both a deep neural network (DNN) and an RNN to evaluate the impact of a driver's mental state under various environmental conditions.
Here, we reviewed several types of computational models proposed for attention identification and evaluation. With the advancement of modern neuroimaging technology, computational models may be able to reduce labor costs while also facilitating the assessment of sustained attention. Neuroimaging data from these studies could in turn improve the accuracy of computational models and help researchers find neuromarkers that can represent the sustained attention cognitive process.

CONCLUSIONS
This study conducted a systematic review of the research on sustained attention. There are currently only a few studies that comprehensively introduce sustained attention. Therefore, this article began by illustrating sustained attention using theoretical models, measurement methods, and neural mechanisms. Moreover, we proposed possible visual pathways of sustained attention based on the previous literature. Subsequently, to facilitate evaluating sustained attention, this study summarized and compared various computational models related to attention classification.
Sustained attention, a fundamental component of attention, requires more in-depth research. Although the frontal cortex, cingulate cortex, and subcortex are all involved in sustained attention activities, clear pathways between these regions have not been identified. In addition, sustained attention is affected by both internal (such as mind wandering) and external factors. According to the resource-control model of sustained attention, mind wandering is unrelated to the task at hand. Although computational methods combined with neuroimaging data have high potential value, much work still needs to be done by researchers in this field. Therefore, research on sustained attention is gradually forming its characteristic framework (see Fig. S2).
There are still some limitations in the quantitative research of sustained attention. The first is that neuroimaging data are obtained by measuring the brain's neural activity during a sustained attention task. It is worth noting, however, that the quantifications for sustained attention differ slightly across experimental paradigms. For example, the CPT3 measures inattentiveness, impulsivity, sustained attention, and vigilance (Conners, 2008), whereas the SART measures sustained attention and inhibitory control (Dinges & Powell, 1985). Because the research content of different experimental paradigms differs slightly, neural activity measured by different experimental paradigms may be biased toward different cognitive content. More research is needed in the future to verify whether the differences caused by different experimental paradigms can reflect different cognitive processing of sustained attention. Second, current computational models focus on machine learning algorithms while ignoring the neural mechanisms of sustained attention, resulting in a lack of interpretability. Therefore, in the future, more computational models for sustained attention must be designed and developed in combination with neural mechanisms (Theiss, Bowen & Silver, 2022).
Although many studies have been exploring sustained attention over the past few decades, future research is still needed to address these unclear issues. Paradigms of sustained attention measurements used in neuropsychology are not regularly used in daily life (Mueller et al., 2017). Research must develop more realistic and accurate paradigms that consider actual problems in natural settings. The neural mechanisms of sustained attention play a vital role in improving humans' attention and task performance. Exploring the possible neural mechanisms of sustained attention can assist humans in maximizing the potential of their brains (Unsworth, Robison & Miller, 2018). In addition, analyzing the causal relationship between the neural mechanism of sustained attention and other factors, such as sleep (Veksler & Gunzelmann, 2018), working memory (Buehner et al., 2006), and environment (Kokoç, IIgaz & Altun, 2020), may provide insight into the relationship between sustained attention and cognitive task performance.
It has been previously found that many methods, such as video game training (Anguera et al., 2013), yoga courses (Ganpat, Sheela & Nagendra, 2013), mindfulness courses (Ziegler et al., 2019), neurofeedback (Bagherzadeh et al., 2020), and transcranial direct current stimulation (tDCS) (Gibson et al., 2021), combined with neural mechanisms can effectively increase humans' sustained attention. For example, anodal tDCS of the right inferior frontal cortex resulted in an increase in attention ability (Coffman, Clark & Parasuraman, 2014). Therefore, exploring the neural underpinnings of sustained attention and combining them with attention therapies has a high potential to provide more effective approaches to improve sustained attention.
Novel machine learning methods that can process multimodal data and measure sustained attention are eagerly awaited. Machine learning is indispensable in processing data. The number of papers that used computational models to analyze and classify sustained attention increased each year from 2012 to 2022, as shown in Fig. 3. Multimodal data, such as behavioral data and neuroimaging data, can provide complementary information for measuring sustained attention (Cruz-Garza et al., 2021;Kucyi et al., 2020). Thus, it is imperative to integrate and analyze these complex data from various recording devices to accurately and robustly monitor humans' sustained attention.
In sum, this review combines the theoretical models, neural mechanisms, and computational models of sustained attention in multiple fields to give a framework that can help researchers to understand sustained attention in as much detail as possible. Although many datasets have been obtained from the latest neuroimaging technology in the field of sustained attention, it is difficult to analyze or predict the obtained results without using computational models. In addition, rather than judging different levels of sustained attention impairment, incorporating experimental data with appropriate theoretical models can accurately interpret the obtained results from a neuroscience perspective. Therefore, this study presents many tables that analyze and compare different categories of sustained attention research for a quick overview. Considering that the visual pathway for sustained attention has not been fully elucidated, we propose a visual pathway for sustained attention based on the related literature for the reference of researchers in this field. As information technology advances, future research on sustained attention will develop from cortical brain areas to deep brain areas, from single brain areas to brain networks, and from machine learning to deep learning.