Neural substrates of accurate perception of time duration: A functional magnetic resonance imaging study

Time duration, an essential feature of the physical world, is perceived and cognitively interpreted subjectively. While this perception is deeply connected with arousal and interoceptive signals, the underlying neural mechanisms remain elusive. As the insula is critical for integrating information from the external world with the organism's inner state, we hypothesized that it might have a central role in the perception of time duration and contribute to its estimation accuracy. We conducted a functional magnetic resonance imaging study with 27 healthy participants performing temporal duration and pitch bisection tasks that used the same stimuli. Given two referents with short and long durations in the range of 1 s (close to the heart rate period), or with low and high pitch, participants had to decide which referent the target stimulus was closer to in duration or pitch. The temporal bisection point between short and long duration perception was obtained through a psychometric response curve analysis for each participant. The deviation between the bisection point and the average of the reference stimuli durations was used as a marker of duration accuracy. Duration discrimination-specific activation, contrasted to pitch discrimination, was found in the dorsomedial prefrontal cortex, bilateral cerebellum, and right anterior insular cortex (AIC), extending to the inferior frontal gyrus (IFG), inferior parietal lobule, and frontal pole. The activity in the right AIC and IFG was positively correlated with the accuracy of duration discrimination. The right AIC is known to be related to the reproduction of duration, whereas the right IFG is involved in categorical decisions. Thus, the comparison between the referent durations reproduced in the AIC and the target duration may occur in the right IFG. We conclude that the right AIC and IFG contribute to the accurate perception of temporal duration.


Introduction
Time perception is central to memory, feelings, behavior, and decision-making for actions affecting the external world (Buhusi and Meck, 2005;Lane et al., 2003;Paulus, 2008, 2009). Within the temporal framework perceived and constructed by the brain, events seem to happen and interact (Nani et al., 2019). "Time has been considered an essential feature of reality, on the one hand (Unger and Smolin, 2015), or a figment of the way we know reality, on the other (Rovelli, 2018)." (Nani et al., 2019). It is, therefore, of the utmost importance to investigate how well the subjective percept of temporal duration corresponds to its objective measure in the external world, a critical component of time perception.
There is still no consensus on the mechanisms that account for the subjective sense of temporal duration (Meissner and Wittmann, 2011). An influential model incorporates as the initial clock stage an internal pacemaker producing a series of pulses and an accumulator that counts the number of pulses over a given time, which represents the duration of the experienced events (Church, 1984; Gibbon et al., 1984; Treisman et al., 1990). Then, at the memory stage, the output of the accumulator is transiently stored in the working memory. Finally, at the decision stage, the stored duration values are compared with those in the reference memory to make decisions (Pouthas et al., 2005). Several brain areas have been proposed as candidate neural substrates for the perception of temporal duration, such as the frontostriatal circuits involving the supplementary motor area (SMA), right prefrontal cortex, right posterior parietal cortex, insula, putamen, and cerebellum (Buhusi and Meck, 2005; Elsinger et al., 2006; Harrington et al., 2004a; Rubia and Smith, 2004; Wittmann, 2009; Wittmann and van Wassenhove, 2009). A recent meta-analysis of functional neuroimaging studies identified the consistent involvement of the insula, pre-SMA/SMA, inferior frontal gyrus, pre-central gyrus, cingulate gyrus, superior temporal gyrus, claustrum, putamen, and caudate body (Nani et al., 2019). These areas were involved across four different categories of time processing: sub-second and supra-second timing tasks in motor or non-motor perceptual conditions. This extensive network may be activated partly by duration perception and partly by associated cognitive demands such as memory and decision-making. By controlling cognitive demand to isolate temporal duration perception from other tasks, Livesey et al. (2007) found that the bilateral dorsal putamen, the junction of the anterior insular cortex (AIC) and the inferior frontal gyrus (IFG), and the left inferior parietal cortex were crucial for temporal duration perception.
In this study, we investigated the neural correlates of time duration using a temporal bisection (TB) task in a functional magnetic resonance imaging (fMRI) paradigm (Allan and Gibbon, 1991;Church and Deluty, 1977;Wearden, 1991) to behaviorally measure the accuracy of duration perception of 1 s, which approximately corresponds to the heart rate period. More specifically, based on an earlier report by Teghil et al. (2019), we predicted that TB task-specific activation in the insula, in particular, could be correlated across participants with the accuracy of time duration estimation.

Participants
In total, 30 healthy, right-handed (Oldfield, 1971) individuals (15 men and 15 women; mean age ± standard deviation, 21.8 ± 1.6 years) were included in this study. None of the participants had a history of drug exposure or neurological or psychiatric disorders. All individuals gave written informed consent for their participation in this study which was approved by the ethics committee of our institution.

Experimental setup
The experimental task was controlled and presented by Presentation software 16.1 06.11.12 (Neurobehavioral Systems, Albany, CA, USA; RRID: SCR_002521). We presented the auditory stimuli through MRI-compatible headphones (Kiyohara Optics Inc., Japan). For each participant, the sound volume was adjusted to a level at which the participant felt that they could "comfortably" listen to the tone for task execution. The intensity level was determined prior to the experiment and remained unchanged during the experiment. The participant held an optical response button-box in their right hand to record their responses. Throughout the runs, the participants were asked to focus on a small, white crosshair placed at the center of a screen viewed through a mirror. The half-transparent viewing screen was located behind the MRI head coil, and visual cues were projected through a liquid-crystal display projector. Participants practiced the task outside the scanner for about 5 min before performing the task in the MRI scanner.

Task design and stimulus preparation
With two referents, one short and the other long, participants had to decide if the duration of a given target was closer to either referent. Based on a psychometric curve analysis, the TB task defines the bisection point (BP) as the point of subjective equality (Allan, 2002) that represents the subjective temporal duration equidistant to both referents. Thus, the absolute deviation of the BP determined in each participant from the arithmetic mean of the two referents was set as a quantitative marker of accuracy with respect to duration perception (Harrington et al., 2004b).
More specifically, the participants performed three auditory tasks in the MRI scanner: TB, pitch bisection (PB), and motor control (C). A TB block began by presenting the visual instruction "Duration." Each TB trial, 7.8 s in duration, presented the two pairs of auditory tones with different intervals (anchor stimuli, A1, and A2) followed by the test stimulus (T). The first two tones comprised anchor A1 (R1-R2) and the next two tones comprised A2 (R3-R4), whereas the last two tones were the test stimuli (T5-T6). During the TB task, participants were asked, "which anchor, A1 or A2, is closer to the test stimuli T5-T6 in terms of ISI?" by presenting the visual instruction of "Duration". Participants answered by pressing a button with either the index or middle finger of their right hand, following the visual instruction given with the terms "Short" and "Long." After completing a block of five or six trials, the visual instruction "Rest" appeared for 2.6 s, followed by a crosshair for 5.2 s. The PB block was identical to the TB block except for the fact that the question, "which anchor is closer to the test stimuli T5-T6 in terms of pitch?" was asked by presenting the visual instruction of "Pitch" (Fig. 1). Participants answered by pressing a button following the visual instruction given with the terms "High" and "Low" appearing side by side (Fig. 1). The C block began with the visual instruction "press either button." Participants had to choose freely to press either the left or right button at the end of the T6 stimuli. The control trial presented a series of six random pure tones varying in ISI (817-1762 ms) and pitch (681.4-719.1 Hz), and participants randomly pressed a button without paying attention to pitch or time interval in A1 and A2.
The reference stimuli R consisted of pure tones (50 ms duration with fade-in and fade-out, 10 ms each) at either 681.4 Hz (low pitch, L) or 719.1 Hz (high pitch, H). For each trial, A1 and A2 were paired to differ both in pitch and temporal interval. The pair R1 and R2 had the same pitch. Similarly, R3 and R4 had the same pitch. The time intervals between the stimuli within each anchor were set to 817 ms or 1762 ms. Thus, we adopted the four possible anchor patterns, combining the order of pitch (L then H, or H then L) with the order of interval (short then long, or long then short). The interstimulus interval (ISI) was either 850 ms, 880 ms, or 910 ms between A1 and A2 and between A2 and T5. The ISI between T5 and T6 varied logarithmically between 817 ms (short anchor) and 1762 ms (long anchor) in seven steps: 817 ms, 990 ms, 1090 ms, 1200 ms, 1321 ms, 1454 ms, and 1762 ms (Fig. 2). Similarly, the pitch varied logarithmically between 681.4 Hz (low anchor) and 719.1 Hz (high anchor) in seven steps: 681.4 Hz, 690.6 Hz, 695.3 Hz, 700 Hz, 704.7 Hz, 709.5 Hz, and 719.1 Hz. A total of 196 trials were conducted, equally distributed between the PB and TB tasks.
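The logarithmic spacing described above can be checked with a short sketch. It uses only the five intermediate test values quoted in the text (the anchors are left out), and the helper name `step_ratios` is introduced here for illustration. On a logarithmic scale equal steps correspond to a constant ratio between neighbours, so the ratios should be nearly identical within each series.

```python
# Intermediate test values copied from the text (anchors excluded).
test_isi_ms = [990, 1090, 1200, 1321, 1454]
test_pitch_hz = [690.6, 695.3, 700.0, 704.7, 709.5]


def step_ratios(values):
    """Return the ratio between each pair of consecutive values."""
    return [b / a for a, b in zip(values, values[1:])]


isi_ratios = step_ratios(test_isi_ms)
pitch_ratios = step_ratios(test_pitch_hz)

# Equal steps on a log scale mean a (nearly) constant ratio.
print([round(r, 3) for r in isi_ratios])
print([round(r, 4) for r in pitch_ratios])
```

Under this reading, each intermediate ISI step is an increase of roughly 10%, and each intermediate pitch step an increase of roughly 0.7%.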

Functional magnetic resonance imaging protocol
To build a balanced session, we presented identical stimuli during the PB and TB tasks utilizing 98 trials each, resulting in 196 trials to which 64 control trials were added. The 260 trials in total were organized into four runs, with the first three sharing the same structure. Each of the first three runs contained four blocks each of TB and PB, which were made of six trials each, and four blocks of the C condition containing four trials each (Fig. 1). Thus, each run had 64 trials. The TB, PB, and C blocks were separated by a short rest period (5.2 s) except for consecutive blocks with the same condition (TB-TB or PB-PB), between which a long rest period (20.8 s) was inserted (Fig. 1). C conditions were never consecutive. The fourth run had a similar structure, except that it comprised five TB blocks, four of which included five trials each while the last block included six trials; five PB blocks with a structure identical to that of TB; and four C blocks containing four trials each. This complicated design was used to sample data for the psychometric function.
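The trial counts above can be verified with a small bookkeeping sketch. It assumes that each of the first three runs holds four TB, four PB, and four C blocks, and that the fourth run holds four five-trial blocks plus one six-trial block per task; this reading is an interpretation of the text, chosen because it reproduces the stated totals. The dictionary layout and function name are illustrative.

```python
# Block structure per run: condition -> list of (number of blocks, trials per block).
# This layout is an interpretation of the description in the text.
runs_1_to_3 = {"TB": [(4, 6)], "PB": [(4, 6)], "C": [(4, 4)]}
run_4 = {"TB": [(4, 5), (1, 6)], "PB": [(4, 5), (1, 6)], "C": [(4, 4)]}


def trials_per_condition(run):
    """Sum blocks x trials for every condition in one run."""
    return {cond: sum(n * k for n, k in blocks) for cond, blocks in run.items()}


per_run = trials_per_condition(runs_1_to_3)
last_run = trials_per_condition(run_4)

# Three identical runs plus the fourth run.
totals = {cond: 3 * per_run[cond] + last_run[cond] for cond in per_run}

print(sum(per_run.values()))  # trials in each of runs 1-3
print(totals)                 # trials per condition over the session
print(sum(totals.values()))   # all trials
```

With these assumptions the sketch yields 64 trials per run for runs 1-3, 98 TB trials, 98 PB trials, 64 C trials, and 260 trials overall, matching the numbers given in the text.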
As two anchor stimuli were presented serially followed by the target stimulus, the order of the anchor presentation may have affected the participants' decisions. To eliminate this potential effect, the order of the two anchor stimuli was randomized.

Data analysis
Data are presented as the mean ± standard error of the mean.

Fig. 1.
During a TB or PB trial (open rectangle), a series of six pure tones with variable interval and pitch is presented (only four during the control trial C). One block contains six trials, and one run contains four blocks from the TB, PB, or C combinations in a random order. The interval between two blocks of different types (e.g., TB-PB) is 5.2 s but 20.8 s between blocks of the same type (e.g., TB-TB). C, control; PB, pitch bisection; TB, temporal bisection.

Fig. 2.
Psychometric function of the judgment probability of a representative participant (Blue, Sub 10) responding with "long" in the temporal bisection task. The bisection point is defined as the duration which produces 50% "long" responses. The accuracy of duration perception is defined as the absolute distance of the bisection point from the arithmetic average of the durations of the two anchor stimuli (1290 ms). The proportion of long responses per test duration is depicted with an average value (asterisk). Light blue areas indicate the distribution between the 99th percentile and 1st percentile, and dark blue zones with notches indicate the range between the 3rd and 1st quartiles, with horizontal bars indicating the median. Dark blue sticks indicate the range of mean ± standard deviation. The graph was generated using a MATLAB code, al_goodplot.

Psychophysical data
We calculated the judgment probability (p) for each test condition of each task. In the TB task, p was defined by reference to the long anchor.
The psychometric variables k and t_0 were determined using the logit function of Eq. (1):

$$\ln\left(\frac{p}{1-p}\right) = k\,(t - t_0) \qquad (1)$$

where t is the length of the test stimulus, t_0 is the threshold (the length of the stimulus time such that the probability of answering "closer to the long duration" is the same as the probability of answering "closer to the short duration"), and p is the probability of answering "closer to the long duration". Thus, t_0 represents the subjective temporal duration equidistant to the anchor durations. Using Eq. (1), we performed a linear regression analysis on the experimental data (data at seven points), where the slope at t_0 becomes steeper as k increases. Thus, k reflects the sensitivity or precision to time perception differences, which has been termed the Weber ratio (WR) (Kopec and Brody, 2010). The parameter t_0 is the BP. When the BP deviates to the left of the arithmetic mean of the long and short duration reference stimuli (i.e., when it is small), there is a bias toward a perception of long durations, whereas when the BP is higher than the mean of the reference stimuli durations, there is a bias toward the perception of short durations. Therefore, the distance of the BP from the arithmetic average of the two standard stimuli provides an estimate of the accuracy for the estimation of the given time interval.
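The fitting step can be illustrated with a minimal sketch, assuming that Eq. (1) has the usual logit form ln(p / (1 - p)) = k (t - t_0), so that a straight line fitted to the logit-transformed proportions gives k as the slope and t_0 as the zero crossing. The function name `fit_logit`, the clipping constant `eps`, and the response proportions are all hypothetical; the proportions are generated from known parameters rather than taken from the study.

```python
import math


def fit_logit(durations_ms, p_long, eps=0.01):
    """Least-squares line through logit(p) versus t; returns (k, t0)."""
    # Clip proportions away from 0 and 1 so the logit stays finite.
    # eps is an illustrative choice, not a value from the study.
    logits = []
    for p in p_long:
        p = min(max(p, eps), 1.0 - eps)
        logits.append(math.log(p / (1.0 - p)))
    n = len(durations_ms)
    mean_t = sum(durations_ms) / n
    mean_y = sum(logits) / n
    sxx = sum((t - mean_t) ** 2 for t in durations_ms)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(durations_ms, logits))
    k = sxy / sxx                    # slope of logit(p) against t
    intercept = mean_y - k * mean_t
    t0 = -intercept / k              # logit(p) = 0, i.e. p = 0.5
    return k, t0


# Seven test durations: the two anchors plus the five intermediate values.
durations = [817, 990, 1090, 1200, 1321, 1454, 1762]

# Hypothetical proportions of "long" responses from known parameters.
true_k, true_t0 = 0.008, 1250.0
p_long = [1.0 / (1.0 + math.exp(-true_k * (t - true_t0))) for t in durations]

k, t0 = fit_logit(durations, p_long)
anchor_mean = (817 + 1762) / 2       # arithmetic mean of the anchors
deviation = abs(t0 - anchor_mean)    # accuracy marker: |BP - anchor mean|
print(k, t0, deviation)
```

In this synthetic example the fitted bisection point lies below the anchor mean, which by the reasoning above corresponds to a bias toward perceiving durations as long.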
To check if there is a systematic effect of the order on the BP estimation, we obtained psychometric functions separately from trials in which anchors were presented in the order of short (817 ms) to long (1762 ms) duration and from trials in the reverse order. We calculated the difference between BP of different orders of anchor presentation, a positive value indicating a primacy effect, whereas a negative one indicates a recency effect. This procedure aimed to check the "subjective shortening" of past durations which is usually observed for durations exceeding 3 s (Wackermann and Späti, 2006).
We performed an identical analysis on the PB data, in which the probability was determined as to how close a participant's judgment was to the high-pitch anchor. Similar to the TB condition, we acquired the precision (WR) and accuracy (BP) of pitch discrimination.
Functional magnetic resonance imaging data

Preprocessing.
The first two volumes from each fMRI acquisition were discarded because of unstable magnetization, and the remaining 520 volumes (248 for runs 1-3 and 272 for run 4) were used for the analysis. We preprocessed the data using the SPM8 software (Wellcome Trust Centre for Neuroimaging, London, UK) implemented in MATLAB (MathWorks, Natick, MA, USA). The EPI images were first realigned for motion correction, then coregistered with the whole-head MP-RAGE image volume. They were then normalized to the Montréal Neurological Institute (MNI) stereotaxic space and smoothed with an isotropic Gaussian kernel of 8-mm full-width-at-half-maximum in the x, y, and z axes.

Statistical analysis.
We used SPM12 for statistical analyses with the general linear model. The task-related regressors of the TB block, PB block, and C block were implemented as regressors of interest. The duration of each block was set from the first trial to the last one. In addition to these three regressors, we included instruction, button press, and motion regressors as covariates of no interest. The high-pass filter was set to "infinity," and the global normalization setting was "scaling." We adopted the AR(1) model to correct for autocorrelation. At this individual-level analysis, we used (TB > C) and (PB > C) contrasts to depict the neural basis of TB and PB tasks, respectively. These two contrast images were then used in a second-level (group) analysis.
For the group-level analysis, we first used the one-sample t-test to retrieve the neural substrates of TB and PB, respectively. Second, to highlight the neural substrates that are specifically associated with the TB task, we used the paired t-test on the (TB > C) and (PB > C) contrast maps. Using the inclusive mask in SPM, we confined the search area to the region that showed statistically significant activation in the one-sample t-test with (TB > C) contrast maps. Third, to test our main hypothesis of whether activity in any brain region is correlated with TB variance, we used the one-sample t-test with covariates. Two psychophysical quantities of the TB task, the slope (WR) and the BP-based measure T abs, were included as covariates in this one-sample t-test analysis. A similar procedure was applied to the PB task. Through this analysis, we could test whether there are any regions in which brain activation is correlated with T abs or slope in the TB or PB conditions. We limited the search region to the area that showed the greater activation in TB. In this covariate analysis, the statistical significance was adjusted by using a small volume correction. In these group-level analyses, the statistical threshold was set at p < .05 (family-wise error-corrected) at the cluster level (Friston et al., 1996) with a height threshold of p < .001 (uncorrected) at the voxel level (Flandin et al., 2019). The anatomical locations were determined using the SPM Anatomy Toolbox (Eickhoff et al., 2005, 2006, 2007), and we confirmed the locations using a printed atlas (Mai et al., 2015). We also used MRIcroN (https://people.cas.sc.edu/rorden/mricron/index.html) to overlay regions that describe (TB > C) and (PB > C).

Results
Data from one participant were lost due to problems with our MRI scanner. Moreover, two participants had to be excluded due to poor performance. We included only participants whose proportion of incorrect responses to the shortest duration (817 ms) was less than 3 standard deviations above the group mean. The proportions of incorrect responses of the two excluded participants did not meet this criterion (0.43 and 0.50, respectively [mean = 0.07, standard deviation = 0.12]). Thus, 27 participants (13 male and 14 female) were considered for the analysis.

Behavioral data
In the TB task, the BP was 1271.06 ± 21.30 ms, and the Weber ratio was 0.010 ± 0.0008. The absolute deviation of BP from the arithmetic average of the two standard stimuli, i.e., the distance from the median point, was 80.74 ± 15.0 ms. Because the BP of one participant was out of range (1863 ms) in the short-duration-first condition, we tested the anchor order effect with 26 participants. The BP with the short (817 ms) anchor presented first was 1237.11 ± 20.03 ms, which did not differ significantly from the BP with the long (1762 ms) anchor presented first, 1269.20 ± 19.08 ms (N = 26, p = .136, Wilcoxon signed-rank test). Thus, in this study, the effect of the anchor order was negligible.
In the PB task, the BP was 700.99 ± 0.84 Hz, and the Weber ratio was 0.26 ± 0.03. The distance from the median point was 3.22 ± 0.59 Hz.

Functional magnetic resonance imaging data
Among all areas that were activated in the TB task more than under the control, C, conditions (TB > C), the TB task induced a stronger activation compared to the PB task (TB > PB) in the right AIC, the IFG, the middle frontal gyrus, the inferior parietal lobule (IPL), the bilateral cerebellum, and the medial prefrontal cortex (Fig. 3).
TB-specific activation, identified by (TB > C), in the right AIC and right IFG was negatively correlated with the absolute deviation of BP from the arithmetic mean of the reference durations (the distance from the median point; Fig. 4). The activation of these areas did not correlate with WR, which represents sensitivity (data not shown). Furthermore, the PB-related activity did not show a significant correlation with the distance of the BP from the median point in the PB task (p = .86 in the right AIC, p = .145 in the right IFG; Fig. 5).
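A negative correlation with the absolute deviation of the BP is the same finding as a positive correlation with accuracy, because a smaller deviation means higher accuracy. The following sketch illustrates this sign relationship with a hand-written Pearson correlation. The five data points are hypothetical and serve only to show the arithmetic; they are not values from the study, and defining accuracy as the negated deviation is an illustrative convention.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five participants (not data from the study).
deviation_ms = [20.0, 45.0, 80.0, 120.0, 160.0]  # |BP - anchor mean|
activation = [1.9, 1.6, 1.1, 0.9, 0.4]           # contrast estimate, arbitrary units

# Illustrative convention: accuracy is the negated deviation.
accuracy = [-d for d in deviation_ms]

r_deviation = pearson_r(activation, deviation_ms)
r_accuracy = pearson_r(activation, accuracy)
print(r_deviation, r_accuracy)
```

Negating one variable flips the sign of the coefficient and leaves its magnitude unchanged, so the two descriptions of the result are equivalent.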

Discussion
The present study showed that the activity in the right AIC and IFG during temporal duration discrimination tasks was positively correlated with the accuracy of the duration estimation.

Unique features of the present study
The association between neural activation and standard psychological measures of time duration markers, such as BP, has been previously explored by neuroimaging studies. Harrington et al. (2004a) found that the temporal interval encoding-related activation in the right parahippocampus and hippocampus is positively correlated with the BP. They suggest that the medial temporal lobe activity represents the memory output from an internal clock. Tipples et al. (2013) conducted both a TB task and a control task (sex discrimination task). They hypothesized that the SMA, as the internal clock, is responsible for accumulating units of time. Thus, its task-related activity should be correlated with the subjective perception of temporal duration measured by the BP. They found increased activation of the right SMA, pre-SMA, and basal ganglia. The negative correlation of BP with the TB task-specific activation contrasted with their simultaneous activation in the right SMA, pre-SMA, and IFG in the sex discrimination task. As a larger BP indicates a shorter perceived time interval, Tipples et al. (2013) concluded that these areas are related to the accumulation of a number of units of time, that is, the clock stage (Pouthas et al., 2005).
The present study differs from that by Tipples et al. (2013) with respect to two points. First, we were interested to determine how the subjective perception of duration represented by BP was matched with the actual measure of duration in the physical world. We chose to utilize BP as a measure of duration perception accuracy by calculating the distance of each participant's BP from the median point between the short and long referents. Second, we hypothesized that the AIC-mediated interoceptive process affects the accuracy of time perception because of its integrative function, as suggested by Craig's model (Craig, 2009b).

Fig. 3.
Regions representing neural substrates that are more active during the TB task than the PB task. The statistical threshold was set at p < .05, family-wise error-corrected at cluster level using the height threshold of uncorrected p < .001, within the volume defined by (TB > C). The color scale indicates t-values. C, control; PB, pitch bisection; TB, temporal bisection. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Fig. 4.
(A) Neural substrates of subjective time perception. Red indicates the areas with TB-specific activation defined by stronger activation in the TB task compared to C conditions (TB > C). Yellow indicates the area with TB-specific activation which is negatively correlated with the distance of the bisection point from the median point. TB-specific activation (TB > PB) in the right AIC (B) and the right IFG (C) are plotted against the distance from the median point. AIC, anterior insular cortex; C, control; IFG, inferior frontal gyrus; PB, pitch bisection; TB, temporal bisection.
The AIC is of particular interest considering the subjective characteristics of temporal duration perception. To account for the complex interplay between endogenous and exogenous factors in temporal processing, Teghil et al. (2019) postulated two complementary, yet distinct, timing mechanisms: an internally-based timing mechanism (IBT), which mediates the generation of temporal representations independently from the sensory environment, and an externally-cued timing mechanism (ECT), which primarily operates when temporal representations are based on exogenous inputs. They suggested that ECT involves the detection of environmental temporal regularities and their integration with the output of IBT processing to generate a representation of time that reflects the temporal metric of the environment. Using ALE meta-analysis of fMRI studies, Teghil et al. (2019) found that both IBT and ECT tasks commonly activate the SMA, IFG, and right basal ganglia, while ECT tasks recruit additional areas, including the posterior cingulate gyrus, posterior superior temporal gyrus, and bilateral insula. As the bilateral anterior insula was also activated by IBT, the authors suggested that the AIC represents an endogenous temporal index (Craig, 2009a). According to Craig's model (Craig, 2009a), salient activity integration relies on an accumulation function of physiological changes in the posterior insula (Wittmann et al., 2010), which is then processed in a posterior-to-anterior progression (Craig, 2009b; Nani et al., 2019; Singer et al., 2009). The AIC integrates information from the external world and within the organism to produce a "global emotional moment," representing the sentient self at one moment. The kinematic change of this "global emotional moment" is the basis of our inner experience of duration (Craig, 2009b).
In other words, "when salient moments occur rapidly, the number of global emotional moments increases during that time and, as a consequence, subjective duration dilates" (Craig, 2009b). Thus, the subjectivity may originate from processes of interoceptive bodily information (Craig, 2009a, 2009b). Simultaneously, correspondence of subjective time perception to reality is likely mediated in the insula in which the integration with information from the external world occurs. Based on this notion, Meissner and Wittmann (2011) reasoned that if "the flow of interoceptive signals from the body is essential for the representation of duration," then individuals with a high degree of interoceptive awareness, sensitively perceiving bodily changes such as pulse rates, should perform more accurately in time estimation tasks. These authors found that individuals with better accuracy at estimating their heart rate are also better estimators of a given temporal duration during repeated temporal reproduction trials in the range of 8-20 s. Furthermore, a recent neuropsychological study in patients with an insular lesion indicated a specific role of the right insula in the discrimination of duration (Mella et al., 2019). However, the functional role of the AIC in the accurate estimation of temporal duration is unclear.
The present study showed that the TB-specific activation of the right AIC and IFG, revealed by the (TB > PB) contrast, was positively correlated with the accuracy measured by the deviation of the BP from the median point of the two referents. The activation of these areas did not correlate with the WR, which represents the sensitivity. This finding is consistent with the notion that these areas are involved in the integration of ECT and IBT for promoting the accuracy of temporal discrimination.

The right anterior insular cortex
A previous meta-analysis of neuroimaging studies showed that the AIC is involved in the motoric and perceptual temporal processing of both sub-second and second ranges (Nani et al., 2019). The anterior insula is known to have several basic mechanisms: "(1) bottom-up saliency detection, (2) switching between other large-scale networks to facilitate access to attention and working memory when a salient event occurs in both internal and external environments, (3) interaction of the anterior and posterior insula to modulate physiological reactivity to salient stimuli, and (4) access to the motor system via strongly coupling with the anterior cingulate cortex" (Menon and Uddin, 2010). These mechanisms are feasible for the integration of information from the external world with the subjective time percept that originates from interoceptive bodily information.
In primates, the insular cortex integrates multiple signals of both external and internal origins, and it primarily functions as an interoceptive cortex, that is, sensing the entire body's physiological conditions (Wittmann, 2013). In the human insular cortex, sequential body states with cognitive and motivational needs are integrated through a posterior-to-anterior representation progression (Wittmann, 2013). Functional neuroimaging studies in humans showed that encoding of 9-s durations caused climbing neural activity in the posterior insula, and its reproduction caused climbing neural activity in the anterior insula (Wittmann et al., 2010). Thus, this accumulated activity with a posterior-to-anterior gradient may be related to the encoding and reproduction of the cumulative representation of time (Wittmann, 2013). This study revealed the involvement of the anterior insula in temporal duration discrimination within the range of 1 s.
The previous lesion studies of the right insula indicated its functional relevance in time perception. Monfort et al. (2014) reported a case study of a patient with a focal lesion in the right anterior insula, showing global overestimation of duration during a temporal reproduction task in the supra-second range, thus, reduced accuracy, while the precision was preserved. They concluded that the representation of duration length involved the right AIC/IFG (Monfort et al., 2014). Recently, Mella et al. (2019) reported 21 patients with strokes affecting the insula. They found that patients with a right insular lesion showed less temporal sensitivity than both control participants and patients with left insular lesions. As the extent of the lesion covered mainly the posterior region, Mella et al. (2019) suggest that the posterior part is involved in the encoding of duration, while the anterior insula participates in the reproduction of intervals (Wittmann et al., 2010). The current findings indicate that the anterior portion of the insula is related to the accuracy of temporal discrimination, consistent with its integrative function with respect to information from both internal and external environments (Craig, 2009b).

The right inferior frontal gyrus
By contrast, the right IFG is related to categorical decisions (Cromer et al., 2010;Jiang et al., 2007). The right IFG is involved in the duration discrimination task that requires working memory function to allow comparison across two intervals (Hayashi et al., 2013;Mottaghy et al., 2003). Hayashi et al. showed that the right IFG is related to more-versus-less categorical information irrespective of the domain of time and numerosity. Specifically, they found that transcranial magnetic stimulation of the right IFG interferes with categorical duration discrimination, whereas that of the right inferior parietal cortex modulates the influence of numerosity on time perception. They concluded that the right IFG is specifically involved in the categorical decision stage. Therefore, the right IFG in the present task is likely involved in the decision phase, in which the comparison between the referent durations reproduced in the AIC and the target duration took place.

Non-specificity of the right anterior insular cortex and inferior frontal gyrus in pitch discrimination
The right AIC and IFG also showed PB-related activity, although this activity was not significantly correlated with the distance between the BP in the PB task and the median point. This may be explained by the psychological processes common to perceptual discrimination tasks, namely selective attention and categorical decision. The AIC is related to selective attention toward different characteristics of an object (Yoshioka et al., 2021). In the present study, participants had to attend to different features of a given stimulus, that is, temporal duration or pitch. These task sets led to the general activation of the right AIC. The right IFG is related to categorical decision-making irrespective of the features of the object. Previous non-human primate studies revealed that the prefrontal cortices are involved in decisions based on both spatial and temporal information (Genovesio et al., 2011) and episodic memory (Genovesio et al., 2009). Thus, both temporal and pitch discrimination tasks recruited the right AIC for selective attention and the right IFG for categorical decisions.

Right-lateralized neural representation of temporal discrimination
The (TB > PB) contrast demonstrated right-lateralized activation, which is consistent with the general notion that temporal processing is preferentially right-lateralized (Ferrandez et al., 2003; Mella et al., 2019; Pouthas et al., 2005). The right IPL and IFG are both parts of the magnitude system, which handles both time and numerosity (Hayashi et al., 2013). Hayashi et al. (2013) found that the right IPL is related to the numerosity-time interaction at the perceptual level. By contrast, the right IFG is specifically related to categorical decisions, suggesting a two-stage model of numerosity-time interactions.

Psychological processing of duration discrimination tasks
The behavioral analysis indicated that the participants performed both the TB and PB tasks successfully. The TB task requires estimating the duration of the stimuli, storing duration information in memory, recalling it, and comparing two durations (Kopec and Brody, 2010). To control for these memory processes, we adopted a PB task (Harrington et al., 2010) that shares the procedure and stimuli of the TB task and differs only in the instruction specifying which feature of the paired auditory stimuli, duration or pitch, is to be compared. Hence, we expected the involvement of selective attention, working memory, and decision-making to be controlled. A conjunction analysis revealed that the cerebellum, midbrain, striatum, pre-SMA, bilateral frontoparietal network, dorsolateral prefrontal cortex, and anterior insula were also activated, in agreement with the involvement of these neural substrates in the abovementioned psychological processes.

Methodological considerations
It is crucial to control for participants' ability to use counting as a strategy to keep track of time during time-estimation tasks. Chronometric counting, a language-based method using inner speech, supports more precise estimates than the interval-timing system and relies on different brain structures, particularly in the range of >10 s (Hinton et al., 2004). The present study therefore focused on intervals around 1 s to minimize any influence of counting.
The arousal level is an essential factor influencing task performance, and stimulus intensity must be considered in this regard. In the present study, we postulated that a sound that was too loud or too soft would affect the arousal level. Thus, we adjusted the sound intensity such that each participant felt that they could "comfortably" listen to the tone. The intensity level was determined before the experiment and remained unchanged throughout. In addition to sound intensity, individual differences in attentional focus are a potential confounding factor. Some participants may pay more attention to the first anchor (primacy effect) and others to the second one (recency effect). To address this confounding factor, we checked the effect of anchor presentation order on the BP, which was not significant. Approximately half the participants showed the primacy effect, whereas the other half showed the recency effect; thus, this effect was well balanced across the participants. As we adopted a counter-balanced presentation of the anchors, the overall impact of attention was canceled out within each participant.
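For illustration only, the BP referred to above, and its deviation from the average of the two anchor durations (the marker of duration accuracy), could be computed along the following lines. This is a minimal sketch and not the analysis pipeline used in the study. It assumes a two-parameter logistic psychometric function fitted by Newton-Raphson maximum likelihood. The function name `fit_logistic`, the anchor and probe durations, and the response counts are all hypothetical values invented for the example.

```python
import math

def fit_logistic(x, n_long, n_trials, iters=100, tol=1e-10):
    """Fit P("long") = 1 / (1 + exp(-(a + b*x))) by Newton-Raphson
    maximum likelihood on binomial response counts (assumed model)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, k, n in zip(x, n_long, n_trials):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            r = k - n * p            # residual: observed minus expected "long" count
            w = n * p * (1.0 - p)    # binomial variance weight
            g_a += r
            g_b += r * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system H * delta = gradient
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b

# Hypothetical data for one participant (invented for illustration)
anchors = (0.8, 1.2)                   # short and long referent durations, in s
probes = [0.8, 0.9, 1.0, 1.1, 1.2]     # target durations, in s
n_long = [1, 4, 11, 17, 19]            # number of "closer to long" responses
n_trials = [20, 20, 20, 20, 20]        # trials per target duration

a, b = fit_logistic(probes, n_long, n_trials)
bp = -a / b                                    # duration at which P("long") = 0.5
deviation = bp - sum(anchors) / len(anchors)   # signed distance from the anchor mean

print(f"bisection point: {bp:.3f} s, deviation: {deviation:+.3f} s")
```

Under these assumptions, a deviation near zero corresponds to a BP close to the mean of the anchors, and the sign indicates the direction of the shift. Other psychometric functions (for example, a cumulative Gaussian) would fit the same role.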

Limitations
Although we found that activity within the right AIC and IFG correlated with the accuracy of duration perception, we could not establish a causal relationship. A future study using transcranial magnetic stimulation or transcranial direct current stimulation is warranted. Moreover, the durations investigated were around 1-2 s. It remains to be seen whether shorter or longer duration ranges share, qualitatively and quantitatively, the same neural correlates. The relationship of the right AIC and IFG with other TB-specific regions, such as the right IPL and the dorsolateral and dorsomedial prefrontal cortices, was not addressed. Finally, we identified the neural substrates for the accuracy of temporal discrimination in healthy adults. Applying the experimental design to participants with psychiatric disorders would be of high interest. Deficits in time perception have been observed in impulsive disorders, such as impulsive borderline personality disorder (Berlin and Rolls, 2004), stimulant dependency (Wittmann et al., 2007), and attention-deficit/hyperactivity disorder (Rubia et al., 2009). These findings imply an association between impulsiveness and a time perception deficit (Barkley et al., 2001; Plichta et al., 2009; Rubia et al., 2009). Similarly, patients with schizophrenia show deficits in duration discrimination (Davalos et al., 2002; Elvevåg et al., 2003), and patients with depression perceive a slowing of the pace of time (Blewett, 1992; Bschor et al., 2004). Thus, delineating the neural underpinnings of time processing contributes to the understanding of psychiatric disorders. To better understand their pathophysiology, future studies in patients examined using the present task design are warranted.

Conclusion
Using functional MRI with healthy volunteers, we found that the right AIC and adjacent IFG, along with other cortical and subcortical regions, contribute to the accuracy of duration perception. The human AIC, along with the anterior cingulate cortex, is one of the most commonly affected regions in psychiatric disorders (Goodkind et al., 2015; Nagai et al., 2007; Seeley, 2008), which often manifest deficits in temporal processing. Further studies in psychiatric patients examined with the present task design are warranted.

Funding information
This study was partly supported by AMED JP20dm0307005 and by the Japan Science and Technology Agency (JST) under the grant number JPMJCE1311 (Center of Innovation Program). The funding sources had no involvement in the study design or conduct; the collection, analysis, and interpretation of data; the preparation, review, or approval of the manuscript; or the decision to submit the manuscript for publication.

Declaration of competing interest
None.