Abstract
Despite a clear ability to detect temporal modulations of visual stimuli in excess of 50 Hz, temporal individuation and serial order judgment tasks can be performed only when stimuli alternate at much slower rates, and the nature of such sluggishness remains unclear. One example of a task with a slow temporal limit is the individuation of a cued letter in a rapid serial visual presentation (RSVP) stream. The present study investigates the nature of the code used to perform such a slow temporal individuation task and the sources of uncertainty involved. The results demonstrate that temporal, rather than ordinal, position in the RSVP stream is critical in serial order estimation, suggesting the involvement of a noisy temporal code. In addition to variability in temporal coding, observers’ choices are also limited by a number of other factors, such as categorical errors and biases related to the position of the cue in the letters’ stream. Attentional filtering improves categorization, but crucially, it does not seem to increase the temporal precision of judgment. Generalizing the present results, I suggest that perception of order is limited by an internal temporal sampling instability that is distinct and independent from attention and that, similarly to temporal jitter in a clock, acts as a low-pass filter that hinders the judgment of the order of events that unfold too quickly.
Introduction
Consider the following visual task: Report on the attributes of a target item, identified by a cue, embedded in a series of distractor items presented in rapid succession in the same spatial location. To successfully perform this task, where the items alternate quickly as successive frames in a movie (rapid serial visual presentation [RSVP] procedure), the observer must be able to temporally individuate the target from the distractors. That is, it is not sufficient to simply detect the appearance of the target item; rather, it is necessary to identify the item and its attributes, as well as recognize its rank order in relation to adjacent items and to the cue. With such a task, it is thus possible to test the limits of temporal individuation in the judgment of serial order.
A number of specific variants of the generic task described above have been investigated previously. In an early study by D. H. Lawrence, observers were required to report a singleton word presented in uppercase within a list of lowercase words (Lawrence, 1971). Lawrence varied presentation rate and serial position of the target in the list and found that errors for the first and intermediate target positions increased with increasing rate. However, responses for the last position were very accurate and were not affected by presentation rate. Furthermore, in Lawrence’s study, erroneous responses for intermediate list positions tended to more frequently correspond to words following, rather than preceding, the target word (posttarget intrusion errors). Subsequent studies varied a number of stimulus factors in Lawrence-type tasks and demonstrated pre-, post-, or symmetric intrusion patterns of errors (Botella & Eriksen, 1992; Botella, Garcia, & Barriopedro, 1992; Botella, Suero, & Barriopedro, 2001; Gathercole & Broadbent, 1984; Intraub, 1985; Kikuchi, 1996; McLean, Broadbent, & Broadbent, 1983; Vul, Hanus, & Kanwisher, 2009). From the published data, the following conclusions about accuracy of report in Lawrence-type tasks can be drawn: Accuracy degrades with increasing presentation rates; it is better at list edges than at intermediate positions, particularly at end of list; it can be biased toward list positions early or late with respect to the position of the target, depending on a variety of stimulus attributes and task demands.
What is the cue that the observer uses to individuate the target, and what are the specific factors that limit the observer’s accuracy? Despite the apparent simplicity of tasks of the type described above, temporal individuation is severely limited. Such sluggishness does not yet have a clear explanation and has been discussed within a variety of contexts. I focus here on two broad explanatory frameworks: the concept of illusory feature conjunctions within the attention literature and the serial order judgment problem in memory and motor control.
Much research in vision has investigated the temporal limits of early visual filters with linear systems techniques, characterizing the spatiotemporal contrast sensitivity function and the temporal impulse response (Watson, 1986). The filter characteristics of early vision pose an upper limit to our ability to individuate events, because stimuli that alternate at frequencies outside the filter’s bandwidth are simply not detected. Estimates of the upper limit of visible temporal frequencies range between 50 and 100 Hz. However, temporal individuation cannot be achieved at such fast alternation rates. In fact, the temporal limits of individuation can be very severe—in many cases, as low as 2–3 Hz (Holcombe, 2009). At faster rates, observers often commit illusory conjunction errors, reporting an incorrect pairing of features or an item that appeared earlier or later than the cue. One prominent class of explanations for such sluggish performance invokes attention as the limiting factor. Botella and colleagues stated that “illusory conjunctions occur when the experimental conditions impede adequate focusing of attention on the presented stimuli, e.g. when exposure times are brief” (Botella et al., 2001, p. 1455). Reeves and Sperling claimed that “the perceived order of rapidly presented items in short-term visual memory is determined primarily by the amount of attention they receive at the time of input” (Reeves & Sperling, 1986, p. 181). Vul and Rich suggested that “decreased precision of attention amounts to worse estimates—and thus greater uncertainty—about the location of the object” (Vul & Rich, 2010, p. 1169). Thus, there seems to be widespread acceptance of Treisman’s conjecture that “when attention is loaded, participants make many conjunction errors” (Treisman & Schmidt, 1982, p. 138); in the present context, attention overloading would be achieved by fast presentation rates. 
Supporting evidence for this attentional overload explanation comes primarily from two types of experiments. The first type of evidence concerns tasks that exclude the involvement of low-level spatiotemporal correlation mechanisms by requiring effortful individuation and comparison of temporal segments of repeating stimuli at large spatial separations. For example, tasks that require individuation and comparison of light and dark phases of stimuli, such as temporal phase discrimination (Battelli, Cavanagh, Martini, & Barton, 2003; Forte, Hogben, & Ross, 1999), or tasks that require integration of spatially separate object features, such as color and orientation binding (Holcombe & Cavanagh, 2001), are limited to rates below 10 Hz. The second type of evidence considered in support of the attentional overload explanation comes from experiments involving dual-probe techniques. For example, identification and report of the second of two targets in an RSVP sequence is severely impaired when the second target appears less than 500 ms after the onset of the first target, a phenomenon known as the attentional blink (Raymond, Shapiro, & Arnell, 1992). The suggestion that in both types of task, a high-level attentional selection process must be setting the limits of performance is based mostly on an exclusion principle. In phase discrimination tasks, low-level mechanisms such as motion detection filters have been excluded on the basis that the spatial density of stimulus elements is too sparse to engage them. In feature-binding experiments, it is argued that elementary feature detectors jointly coding for two arbitrary features at separate locations, such as color and orientation, are unlikely to exist and, in fact, have never been found in neurophysiology.
In the case of the attentional blink, the observation that the blink disappears when the first target does not need to be identified and reported, but merely detected, supports the suggestion that the source of the phenomenon must be a form of refractoriness in a higher-level attentional selection process (Chun & Potter, 1995). The question remains as to whether performance in these disparate tasks is limited by a unitary or by multiple attention mechanisms and what exactly such putative attention mechanisms are.
An almost completely parallel literature in the domain of serial order in memory has been concerned with tasks that raise problems similar to those of Lawrence-type tasks. For example, in a probed memory serial order recall task, a participant is required to recall an item that appeared in a specific serial position within a previously presented, temporally ordered list. Here, the cue appears after, rather than within, the list, but the two types of tasks share the common requirement of having to individuate items and encode their serial order. There has been much discussion about the nature of the code that enables serial order judgments in memory (for a thorough review, see Henson, 1998). After discarding older ideas, such as chaining, the recent literature has concentrated on distinguishing different types of positional codes, that is, codes based on a signal directly related to the temporal position of an item in the sequence. The focus has been on contrasting an ordinal code, based on the rank order of items in the list but independent of the temporal scale, with a temporal code that explicitly represents events on a temporal dimension and is, thus, sensitive to the time scale. Recent research appears to favor temporal over ordinal codes (e.g., Brown, Neath, & Chater, 2007).
In contrast, an ordinal, rather than a temporal, code appears to be used for representing serial order in motor action sequence generation, perhaps the oldest domain where the serial order problem has been discussed (Lashley, 1951). There exist neurons in a variety of premotor areas (most notably, the basal ganglia and frontal cortex) that appear to be rank-order-selective (ROS) neurons; that is, they systematically change their firing rate depending on the serial order position of an action within a sequence (Tanji, 2001). Prefrontal ROS neurons appear to code rank order independently of the passage of time (Berdyyeva & Olson, 2011) and, crucially, also appear to be rank-order generalists; that is, they seem to signal temporal ordinal position for both action and object sequences (Berdyyeva & Olson, 2009, 2010). As such, the most recent investigations in this field seem to suggest that frontal ROS neurons may be implicated generally in tasks requiring temporal serial order judgments, including action, memory, and perception.
In the light of the conflicting findings in memory and action sequence generation above, it seems surprising that, to date, the distinction between ordinal and temporal coding has not been considered in the context of perceptual, Lawrence-type tasks. Just as in memory and motor control, I suggest that this distinction is timely and fundamental for understanding the nature of the mechanisms responsible for temporal individuation. The original observations of Lawrence indicated that error rates increase as presentation accelerates. This finding may be taken to imply an essentially temporal code, but in fact, the information available from extant studies is not sufficient to draw definite conclusions. We do not have quantitative estimates of the precision of temporal individuation in Lawrence-type tasks or of the degree to which such precision is affected by presentation rate and other manipulations that may involve attentional orienting. Until such estimates are obtained, any theoretical account can only remain poorly constrained.
In the present study, I systematically analyze the patterns of errors produced by observers in a single-probe temporal individuation task. The aim of the study is twofold: to investigate the nature of the signal used to achieve individuation and to identify the sources of uncertainty that limit accuracy and precision of temporal judgments. The task employed is identical to that in other recent studies on the same topic (Vul et al., 2009; Vul, Nieuwenstein, & Kanwisher, 2008). Observers are given a list of all 26 letters of the English alphabet, randomly ordered and presented in an RSVP stream, and are asked to report the letter corresponding to a single cued temporal position. I report the results of two experiments. In the first experiment, the cue always corresponded to the midstream temporal position, and the variable of interest was the presentation rate. In the second experiment, the presentation rate was constant, but the cue appeared at different temporal positions within the stream. As such, the first experiment was aimed at distinguishing ordinal versus temporal codes in the presence of minimal uncertainty about the temporal location of the cue, whereas the second experiment investigated contextual influences by introducing uncertainty about the location of the cue, thus modulating attentional orienting by means of elapsed time (increasing cue hazard rate).
The results show that the distribution of observers’ errors can be modeled as a Uniform–Gaussian mixture with at least two sources of uncertainty: a specific temporal uncertainty that gives rise to errors clustered around the position of the cue with a distribution well modeled as a Gaussian, and at least another additional source responsible for spreading errors uniformly across all temporal positions, suggesting uncertainty in letter categorization. The characteristics of the Gaussian distribution of errors with different presentation rates suggest that judgment is based on a temporal, rather than ordinal, position signal. Introducing uncertainty in the location of the cue affects uniform random errors, such that their frequency becomes a decreasing linear function of the ordinal position of the cue in the sequence, but does not affect the Gaussian component; particularly, it does not decrease its precision. Judgments are often biased, demonstrating pre- or posttarget error patterns depending on the statistical distribution of the cue, and are affected by list edge effects, such that items appearing at the beginning and end of the stream are often reported, instead of items cued just after or before a list edge.
Experiment 1a: fixed probe position, different presentation rates
This experiment examined in single observers the ability to individuate a target letter always cued in the middle of the list, so as to allow an estimate of accuracy and precision minimally contaminated by uncertainty about the cue’s location. Performance was compared across a range of presentation rates to establish whether individuation is based on an ordinal or a temporal position signal.
Method
Participants
The author and three students who were unaware of the purpose of the study participated in the experiment.
Stimuli
On each trial, all 26 letters of the English alphabet were presented in an RSVP stream in the center of a CRT monitor with a refresh rate of 90 Hz. The letters were drawn in white Courier font on a black background. Each letter subtended 2.5° of visual angle at a viewing distance of 57 cm. The cue was a white circle with a diameter of 12° centered on the target letter.
Procedure
The sequence order of the 26 letters of the English alphabet was chosen randomly on every trial. Each letter in the RSVP stream was presented for two consecutive screen refreshes (22.2 ms) and was followed by an empty interval before the onset of the following letter (see Fig. 1). Onset and offset of the cue always coincided with the onset and offset of the 14th letter. Observers were instructed to report the cued letter by pressing the corresponding key on a computer keyboard and were not given feedback on the accuracy of their reports. An intertrial interval of 1,500 ms occurred after each response. Each participant was tested in blocks of 400 trials, comprising four sequences of 100 successive trials separated by brief interruptions. The RSVP presentation rate was fixed within each block. Several rates (obtained by altering the interval between the offset of one letter and the onset of the next, with letter exposure kept fixed at 22.2 ms) were tested in randomized order over several days, and each participant was run in three blocks (1,200 trials) per rate. The very first block was treated as practice and discarded.
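The timing arithmetic of the procedure can be sketched as follows. This is a minimal illustration, assuming that the empty interval is a whole number of screen refreshes (the text states this only for the letter exposure); the function names are hypothetical.

```python
# Timing sketch for an RSVP stream on a 90 Hz display.
REFRESH_HZ = 90.0        # monitor refresh rate (from the Stimuli section)
FRAMES_PER_LETTER = 2    # each letter is shown for two refreshes

def frame_ms(refresh_hz: float = REFRESH_HZ) -> float:
    """Duration of one screen refresh in milliseconds."""
    return 1000.0 / refresh_hz

def exposure_ms() -> float:
    """Letter exposure: two refreshes, about 22.2 ms."""
    return FRAMES_PER_LETTER * frame_ms()

def soa_ms(blank_frames: int) -> float:
    """Stimulus onset asynchrony when each letter is followed by
    `blank_frames` empty refreshes (assumption: whole refreshes)."""
    return (FRAMES_PER_LETTER + blank_frames) * frame_ms()

def rate_hz(blank_frames: int) -> float:
    """Presentation rate in items per second."""
    return 1000.0 / soa_ms(blank_frames)

if __name__ == "__main__":
    # Four empty refreshes give a 44.4 ms gap and a 15 Hz stream.
    print(round(exposure_ms(), 1), round(soa_ms(4), 1), round(rate_hz(4), 1))
```

Under this assumption, four empty refreshes reproduce the 44.4-ms interval and 15-Hz rate used in Experiments 1b and 2.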
Data analysis
Report frequency was calculated for each serial position and presentation rate from data averaged across all blocks. In the frequency histogram, each serial position x was expressed as a signed distance from cue onset. A Uniform–Gaussian mixture model was then fitted to the histogram of reports by nonlinear regression. The model had the following form:
Parameter a is fixed and corresponds to the histogram’s bin width (RSVP frame exposure in milliseconds), μ and σ are the mean and standard deviation of the Gaussian component, respectively, and parameter u represents the cumulative probability of the uniform component.
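The parameter definitions above are consistent with a model of the following kind. This is a sketch under assumptions, not necessarily the published Eq. 1: it assumes that the uniform component is shared equally among the N = 26 list positions and that the Gaussian density is scaled by the bin width a so that f(x) is a report proportion per bin.

```latex
f(x) \;=\; \frac{u}{N} \;+\; (1 - u)\,\frac{a}{\sigma\sqrt{2\pi}}\,
\exp\!\left[-\frac{(x - \mu)^{2}}{2\sigma^{2}}\right],
\qquad N = 26
```

Here x is the signed temporal distance from cue onset, and u, μ, σ, and a are as defined in the text.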
Results
Representative distributions of reports corresponding to the choices of observer J.L. over a range of presentation rates (22.2–6.4 Hz, or stimulus onset asynchronies [SOAs] of 45–157 ms) are plotted in Fig. 2. Letters presented in the immediate neighborhood of the cued item are reported with increasing frequency as presentation rate decreases. When report frequency is plotted as a function of ordinal distance from the cue expressed as number of items (ordinal error; Fig. 2, left panel), the distributions’ variance appears to decrease with decreasing RSVP rate. However, when report frequency is plotted as a function of temporal distance from the cue (temporal error; Fig. 2, right panel), the distributions’ variance appears unaffected by rate, the main difference being a multiplicative change of gain.
In order to measure quantitatively the variation of accuracy and precision of observers’ reports with presentation rate, the Uniform–Gaussian model (Eq. 1) was fitted by nonlinear regression to the frequency histograms of each observer. Representative examples of such fits are shown in Fig. 3 at slow and fast presentation rates for observer J.L. The need for a uniform component in the model is justified by theoretical considerations (see the General Discussion section) and is supported empirically by the substantial presence of reports at large distances from the cue. Note that such uniform errors are more prevalent at the fast presentation rate. Inspection of residuals from the fitted model across observers indicates that the reports’ distributions are not systematically skewed or kurtotic. Accuracy and precision of the reports are taken as the mean and standard deviation of the Gaussian component, respectively.
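The fitting step can be sketched as follows. This is a minimal illustration and not the author's procedure: the functional form of the mixture is assumed, a coarse least-squares grid search stands in for the nonlinear regression routine, the bin width is taken to be the SOA, and the histogram is synthetic, generated from made-up parameter values.

```python
import math

N_ITEMS = 26  # letters per stream (from the Method section)

def mixture(x_ms, u, mu, sigma, bin_ms):
    """Assumed Uniform-Gaussian form: uniform floor plus scaled Gaussian."""
    z = (x_ms - mu) / sigma
    gauss = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return u / N_ITEMS + (1.0 - u) * bin_ms * gauss

def sse(params, xs, ys, bin_ms):
    """Sum of squared errors between the histogram and the model."""
    u, mu, sigma = params
    return sum((y - mixture(x, u, mu, sigma, bin_ms)) ** 2
               for x, y in zip(xs, ys))

def grid_fit(xs, ys, bin_ms):
    """Coarse grid search over (u, mu, sigma); a stand-in for a
    proper nonlinear least-squares optimizer."""
    best = None
    for u in [i / 20 for i in range(20)]:        # 0.00 to 0.95
        for mu in range(-100, 101, 10):          # ms
            for sigma in range(30, 151, 10):     # ms
                err = sse((u, mu, sigma), xs, ys, bin_ms)
                if best is None or err < best[0]:
                    best = (err, u, mu, sigma)
    return best[1:]

# Synthetic demonstration: cue at list position 14, 15 Hz stream.
SOA = 1000.0 / 15.0
xs = [(pos - 14) * SOA for pos in range(1, N_ITEMS + 1)]
ys = [mixture(x, 0.15, -30, 70, SOA) for x in xs]   # hypothetical values
u_hat, mu_hat, sigma_hat = grid_fit(xs, ys, SOA)
```

With noiseless synthetic data the grid search recovers the generating values; with real histograms a continuous optimizer and an estimate of parameter uncertainty would be needed.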
Best fits of the model’s parameters are plotted in Fig. 4 as a function of RSVP rate for each individual observer. For all observers, the cumulative probability of uniform errors (Fig. 4, top panels) increases with increasing rate. Accuracy and precision (mean and standard deviation of the model’s Gaussian component, respectively) are plotted in the middle panels of Fig. 4 as an ordinal error (distance from the cue as number of letters). For all observers, both accuracy and precision tend to decrease with increasing rate, as demonstrated by the progressive increase of ordinal error magnitude and variance. However, when errors are expressed as a temporal distance from the cue (Fig. 4, bottom panels), accuracy and precision fail to show a systematic trend as a function of rate. Note that the mean value of the Gaussian component is negative in the three naïve observers (pretarget intrusion pattern), but positive for the author (posttarget intrusion pattern).
Given the similar trends observed for all parameters, averages were computed at the rates of 8.1, 11.1, 14.9, and 22.2 Hz, which were tested across all observers (Fig. 5). Because the model’s means had a different sign for 1 observer, accuracy estimates were recalculated before averaging as absolute deviations from the cue; these absolute (always positive) values for the means are plotted in Fig. 5. In the average data, uniform errors (Fig. 5, top panel) increase with increasing RSVP rate in a seemingly linear manner. On an ordinal scale (Fig. 5, middle panel), the mean and standard deviation of the Gaussian component increase roughly threefold with a threefold increase in RSVP rate. However, on a temporal scale (Fig. 5, bottom panel), the mean remains invariant with rate, while the standard deviation increases only slightly, by a factor of 1.14 (in both cases, linear regression slopes are not statistically significant, p > .05). The relationship between standard deviation and rate remains statistically nonsignificant (p > .05) even after correction for grouping with Sheppard’s formula (Ulrich & Giray, 1989), a method that deflates the estimates of variance at slower rates that are biased by coarser temporal sampling (Heitjan, 1989). Average values of mean and standard deviation are μ = 27 ms and σ = 72 ms. Overall, temporal invariance of accuracy and precision as a function of SOA is consistent with the notion that a temporal, not ordinal, position signal drives observers’ choices.
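The grouping correction mentioned above can be illustrated with the standard form of Sheppard's formula, which subtracts h²/12 from the variance of data grouped into bins of width h. The sketch below assumes that the bin width equals the SOA; the input values in the usage lines are hypothetical and are not taken from the data.

```python
import math

def sheppard_sd(sd_grouped_ms: float, bin_ms: float) -> float:
    """Standard deviation after Sheppard's correction for grouping:
    var_corrected = var_grouped - bin_width**2 / 12."""
    var = sd_grouped_ms ** 2 - bin_ms ** 2 / 12.0
    if var <= 0.0:
        raise ValueError("bin width too coarse for this correction")
    return math.sqrt(var)

if __name__ == "__main__":
    # Hypothetical uncorrected SD of 75 ms at two rates.
    for rate_hz in (8.1, 22.2):
        soa = 1000.0 / rate_hz          # assumed bin width
        print(rate_hz, round(sheppard_sd(75.0, soa), 1))
```

Because the subtracted term grows with the square of the bin width, the correction is larger at slow rates, where temporal sampling is coarser.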
Experiment 1b: fixed probe position, individual differences
The aim of this experiment was to investigate individual differences in the model parameters, that is, how much the accuracy and precision of judgments vary across observers under the conditions of the previous experiment.
Method
Participants
Nineteen undergraduate students participated in the experiment for course credit.
Stimuli, procedure, and data analysis
The stimuli, procedure, and data analysis were identical to those in Experiment 1a, with the following exceptions. The RSVP frame rate was fixed at 15 Hz, such that each letter was presented for 22.2 ms and empty intervals of 44.4 ms were interleaved between successive letters. Each participant was tested in one block of 200 trials, comprising two sequences of 100 successive trials separated by a brief interruption. The proportion of reports at each serial position was calculated and the Uniform–Gaussian model of Eq. 1 was fitted to the resulting histogram, obtaining estimates of the parameters of interest for each participant. Reaction times of individual participants for responses at each serial position were log-transformed, z-scored, and then averaged across participants.
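The reaction time normalization can be sketched as follows. The text does not specify the exact order of operations, so this sketch assumes that log reaction times are z-scored within each participant across all of that participant's trials, averaged per serial position within participant, and then averaged across participants. The data structure and the numbers in the usage example are hypothetical.

```python
import math
from statistics import mean, stdev

def zscore_log_rt(rts_ms):
    """Log-transform reaction times, then z-score within one participant."""
    logs = [math.log(rt) for rt in rts_ms]
    m, s = mean(logs), stdev(logs)
    return [(v - m) / s for v in logs]

def average_by_position(participants):
    """participants: list of trial lists, each trial a (position, rt_ms) pair.
    Returns {position: mean z-scored log RT across participants}."""
    per_position = {}
    for trials in participants:
        z = zscore_log_rt([rt for _, rt in trials])
        within = {}
        for (pos, _), zi in zip(trials, z):
            within.setdefault(pos, []).append(zi)
        for pos, values in within.items():
            per_position.setdefault(pos, []).append(mean(values))
    return {pos: mean(v) for pos, v in sorted(per_position.items())}

# Made-up example: two participants, two report positions each.
demo = [
    [(0, 400.0), (1, 800.0)],
    [(0, 500.0), (1, 1000.0)],
]
result = average_by_position(demo)
```

Z-scoring within participant removes differences in overall speed, so the averaged profile reflects only how response time varies with reported position.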
Results
The average distribution of reports in Experiment 1b (Fig. 6, left) is broadly similar to the data obtained in Experiment 1a under the same conditions and is mirrored by the distribution of reaction times (Fig. 6, right): Responses corresponding to positions in the immediate neighborhood of the cue are more prevalent and are produced faster.
To investigate individual differences, the model of Eq. 1 was fitted to the data from each individual participant, and the resulting distributions of parameters’ estimates are plotted in Fig. 7. Most participants are biased toward anticipating the cue (pretarget intrusions), as indicated by the fact that the majority of means are negative (Fig. 7, left); however, about a quarter of participants show a posttarget intrusion pattern, similar to that for the author in Experiment 1a. The interquartile ranges are 58–91 ms for standard deviation and .07–.2 for the probability of uniform errors. The data from Experiment 1b are in good agreement with those from Experiment 1a in the averages but also show substantial variability across participants.
Experiment 2: variable probe position, fixed presentation rate
The aim of this experiment was to investigate the effect on accuracy and precision of individuation produced by adding uncertainty about the temporal position of the cue in the list. The temporal position of the cue was uniformly distributed across list positions, such that cue expectation could increase with time elapsed from list onset. The question of interest was whether temporal expectation would affect bias and/or precision by modulating the allocation of attention over time.
Method
Participants
Fifty-four undergraduate students participated in the experiment for course credit.
Stimuli, procedure, and data analysis
The stimuli, procedure, and data analysis were identical to those in Experiment 1a, with the following exceptions. The RSVP frame rate was fixed at 15 Hz, such that each letter was presented for 22.2 ms and empty intervals of 44.4 ms were interleaved between successive letters. On each trial, the cue appeared in one of four or five possible list positions, alternating across trials in random order according to a uniform distribution. Each participant was assigned to one of three groups that differed with respect to the possible list positions of the cue: Group 1 with positions 2–8–14–20–25, Group 2 with positions 4–10–16–22, or Group 3 with positions 6–12–18–24. Each participant was tested in one block of 400 trials, comprising four sequences of 100 successive trials separated by brief interruptions. For each cued position, the proportion of reports at each serial position was calculated from data averaged across participants. The Uniform–Gaussian model of Eq. 1 was fitted to the resulting histogram, obtaining estimates of the parameters of interest for each cue position.
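The way cue expectation can grow with elapsed time under this design can be made concrete with the discrete hazard rate, that is, the probability that the cue appears at a given candidate position given that it has not yet appeared. The sketch below uses the standard result for a uniform distribution over n candidate positions; the function name is hypothetical.

```python
from fractions import Fraction

# Candidate cue positions per group (from the Method section).
GROUPS = {
    1: [2, 8, 14, 20, 25],
    2: [4, 10, 16, 22],
    3: [6, 12, 18, 24],
}

def hazard(positions):
    """Discrete hazard for a cue uniformly distributed over `positions`:
    at the k-th candidate (k = 0, 1, ...), hazard = 1 / (n - k)."""
    n = len(positions)
    return {pos: Fraction(1, n - k)
            for k, pos in enumerate(sorted(positions))}

if __name__ == "__main__":
    for group, positions in GROUPS.items():
        print(group, {p: str(h) for p, h in hazard(positions).items()})
```

For the four-position groups the hazard rises from 1/4 at the first candidate position to 1 at the last, so the cue becomes fully predictable once all earlier candidate positions have passed.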
Results
Examples of the distributions of observers’ choices in Experiment 2 are reported in Fig. 8. Reports for targets cued at intermediate positions in the list (e.g., positions 12 and 16; Fig. 8, middle panels) are distributed in a manner similar to that in Experiment 1. However, reports for targets cued at positions 2 and 25 (Fig. 8, left and right panels), adjacent to the beginning and end of the list, are markedly biased toward the first and last items in the list, respectively. Furthermore, in pilot data where the first and last items were cued, reports for those items achieved almost perfect accuracy, replicating the findings of Lawrence (1971).
Because the three participant groups were tested at different serial positions, data modeling was conducted separately for each group. Precision of judgments (model’s SD; Fig. 9) does not vary with cue position in any group (the apparently higher estimates for positions 2 and 25 in group 1 have a large uncertainty and were necessarily obtained from truncated data). The average standard deviation, σ = 72.8 ms, is similar to the average estimate obtained in Experiment 1. Interestingly, the estimates for accuracy of judgments (model’s means; Fig. 9) are affected by the distribution of cue positions and differ from those in Experiment 1. Observers in groups 2 and 3, not exposed to cues near list edges, produced reports centered on the veridical location of the cue (Fig. 9, center and right panels). However, observers in group 1, who were exposed to cues near list edges, as in Lawrence (1971), were markedly biased: When probed at positions 2 and 25, they tended to produce reports biased toward the beginning and end of the list, respectively; for intermediate list positions, they tended to report more often items following the cue (posttarget intrusion pattern), as reported in Lawrence (1971). As such, accuracy of judgments in all groups differed from that in Experiment 1, where a pretarget intrusion pattern was most commonly observed.
Best estimates of the cumulative probability of uniform errors as a function of cue position in the list are reported in Fig. 10. For all groups, uniform errors decrease in frequency by about a factor of three from the beginning to the end of the list (significantly negative linear regression slopes; all ps < .05). In summary, variability in cue position demonstrates two main effects on responses: It affects uniform random errors and biases the accuracy of reports when items near list edges are tested but does not affect precision.
General discussion
Sources of uncertainty
There are at least two potential sources of uncertainty in the temporal individuation task studied here: categorical and temporal.
Consider the case where the observer localizes a cued target letter (e.g., the letter “O”) with perfect temporal accuracy: He or she may still commit a categorical error (e.g., erroneously report the letter “Q”). Since by experimental design any single letter of the alphabet has an equal probability (p = 1/26) of appearing in each list position on any single trial, categorization errors across trials will necessarily be uniformly distributed among all list positions, including positions far from the cued letter’s location. As such, any model of the distribution of errors in this task needs to include a uniform component. Errors of this kind may depend on many factors: how legible the character is, how similar the target letter is to other letters in the alphabet, how biased the observer is against reporting a particular letter, and so forth. Regardless of their origin, these errors should be reduced in frequency at slow alternation rates, because it has been found empirically in masking studies that interference decreases, while categorization improves, with increasing SOA (e.g., Bundesen & Harms, 1999). Uniformly distributed errors in the task studied here follow precisely this pattern (see Fig. 5, top panel).
Consider now the case where the observer achieves perfect categorization of all letters. In this case, report accuracy and precision will depend entirely on the observer’s uncertainty in mutually localizing in time the cue and the letters. A model accounting for temporal errors must specify the rules governing the observer’s choices and the nature of the signals upon which choices are made. I will discuss choice models later and concentrate now on the signal involved. It is an empirical fact that most errors cluster in the near vicinity of the cue according to a distribution that appears symmetrical and mesokurtic (see Fig. 2). Let us accept for the moment that a Gaussian distribution is a good fit to the data (see Fig. 3): What factors do its parameters depend on? If the ordinal position of letters in the list is encoded, the parameters of the Gaussian distribution should be a fixed number of letters regardless of the presentation rate. However, if serial order is encoded more finely as a temporal position signal, accuracy and precision should improve with slower presentation rates when expressed as number of letters but should be constant when measured in milliseconds. Temporal, rather than ordinal, invariance of accuracy and precision is indeed observed in the data (Fig. 5), suggesting that order information is coded as a time signal. I suggest that the latency dispersion of such a time signal is the major limit in temporal individuation.
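The contrast between the two predictions can be made concrete with a short calculation. The sketch below assumes a constant temporal standard deviation equal to the average value reported for Experiment 1a (σ = 72 ms) and converts it to ordinal units at the four rates tested across all observers; the function name is hypothetical.

```python
SIGMA_MS = 72.0                       # average temporal SD, Experiment 1a
RATES_HZ = [8.1, 11.1, 14.9, 22.2]    # rates common to all observers

def sd_in_items(sigma_ms: float, rate_hz: float) -> float:
    """Convert a temporal SD (ms) to an ordinal SD (number of items):
    sigma_items = sigma_ms / SOA, with SOA = 1000 / rate."""
    return sigma_ms * rate_hz / 1000.0

if __name__ == "__main__":
    for rate in RATES_HZ:
        print(f"{rate:5.1f} Hz: {sd_in_items(SIGMA_MS, rate):.2f} items")
```

Under a temporal code with fixed σ, the ordinal standard deviation scales in proportion to rate, from about 0.58 items at 8.1 Hz to about 1.60 items at 22.2 Hz. Under a purely ordinal code it would instead stay constant in items and shrink in milliseconds as rate increases.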
Sources of bias
The accuracy of observers’ responses was biased by a number of factors. First, single observers’ choices in the fixed cue condition were not centered on the timing of the cue but, rather, often tended to anticipate it (pretarget intrusion errors; see Figs. 4 and 5). In the present context, this occurs in the absence of error feedback. A similar bias is often reported in other contexts, such as sensory–motor synchronization tasks (Aschersleben, 2002; Repp, 2005). Although this phenomenon has been known for a long time in the sensory–motor synchronization literature, its explanation within that domain, as well as in the present context, remains unclear.
Second, when there is uncertainty about the temporal location of the cue (Experiment 2), items cued near the beginning of the list tend to be reported less often than items cued at intermediate list positions. Such a phenomenon was noted in previous research. Ariga and Yokosawa (2008) named this effect attentional awakening and interpreted it as the result of a slow-to-start process of orienting of attention to locations in time. However, such a phenomenon also occurs near list end, and the distribution of errors shows that responses are not merely suppressed or uniformly distributed across all list positions but, instead, are selectively clustered toward edge items. Furthermore, the same group of observers produced delayed responses for intermediate list positions. Thus, the mere presence of cues near list edges induces a complex pattern of biases, with strong edge intrusions at the beginning and the end of the list and with posttarget intrusions dominating intermediate list positions, as was originally found by Lawrence (1971). Edge items were not pre- (post-) masked, suggesting that their categorical distinctiveness could be one of the sources of these biases in the observer’s choices. Such edge effects may also be related to previously observed misjudgments of temporal duration at the beginning and end of a sequence of events (Bachmann, Luiga, Poder, & Kalev, 2003; Kanai & Watanabe, 2006; Rose & Summers, 1995). Curiously, when cues are kept moderately distant from list edges, reports become unbiased.
In summary, temporal judgments are unbiased when cues are randomly distributed within a range of intermediate list positions that excludes items proximate to list edges. Judgments do become biased when cues are fully predictable, or when their random distribution includes near-edge items.
Choice model
How does the observer choose the cued target letter? I consider here a Thurstonian (Thurstone, 1927a, 1927b) choice model based on two assumptions: The letters and the cue are assigned noisy temporal tags on each trial, and the decision rule is a minimum operator. The time tags of each letter and of the cue on any given trial are assumed to be independent random samples from a Gaussian distribution centered on the physical timing of each item, and the decision rule is to choose the item whose time tag has the shortest distance from the time tag of the cue. This kind of model produces across trials a Gaussian distribution of choices centered on the cued item with a variance that is the sum of the variances of the timing distributions of the letters and of the cue. This simple model is intended to account for the observed pattern of precision of the Gaussian component of errors (compare the data in Fig. 2 with the simulation in Fig. 11) but would need further complexity to deal with bias and with errors that are distributed uniformly.
The observed error distributions, which do not appear to be particularly kurtotic or skewed, justify empirically the assumption that, in the present task, the noisy distribution of temporal tags is Gaussian and the timeline is linear. This is in contradiction with other models and tasks. For example, temporal ratio models of serial order judgments in memory, such as SIMPLE (Brown et al., 2007), assume exponential distributions and logarithmic time, which are justified by the form of the distribution of errors observed in serial order memory tasks. Other related magnitude estimation models—for example, those devised to account for errors in numerosity judgments (Piazza, Izard, Pinel, Le Bihan, & Dehaene, 2004)—often assume lognormal distributions. It remains unclear whether these idiosyncrasies reflect true differences in mechanisms or are related to methodological and task peculiarities.
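The choice model described above can be sketched as a short Monte Carlo simulation. This is only an illustration under stated assumptions, not the simulation code used for Fig. 11: the function name `simulate_choices`, the list length of 26 items, the cue position, the 15-Hz rate, and the 50-ms tag noise are all values chosen here for the example.

```python
import random
import statistics

def simulate_choices(n_items=26, cue_pos=13, rate_hz=15.0,
                     sd_ms=50.0, n_trials=20000, seed=1):
    """Return signed errors (chosen minus cued position, in items) and the SOA."""
    rng = random.Random(seed)
    soa_ms = 1000.0 / rate_hz  # time between successive item onsets
    errors = []
    for _ in range(n_trials):
        # Noisy time tag of the cue, centered on its physical time.
        cue_tag = rng.gauss(cue_pos * soa_ms, sd_ms)
        # Independent noisy time tags of the letters.
        tags = [rng.gauss(i * soa_ms, sd_ms) for i in range(n_items)]
        # Minimum rule: choose the letter whose tag is closest to the cue tag.
        chosen = min(range(n_items), key=lambda i: abs(tags[i] - cue_tag))
        errors.append(chosen - cue_pos)
    return errors, soa_ms

errors, soa_ms = simulate_choices()
spread_ms = statistics.pstdev(errors) * soa_ms
print(f"mean error: {statistics.mean(errors) * soa_ms:.1f} ms")
print(f"spread of choices: {spread_ms:.1f} ms")
```

With equal 50-ms noise on the letters and on the cue, the summed-variance prediction is a spread of about 71 ms. The simulated spread should fall in that neighborhood, although the discrete grid of item times means that the match is approximate rather than exact.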
Estimates of temporal noise
A robust finding of the present study is the fact that precision of judgment was invariant with presentation rate and was quantified as a standard deviation of 72 ms. Previous research has investigated the temporal limits of individuation by measuring the highest alternation frequency that can support performance at a certain threshold level (e.g., 75 % correct); 7 Hz is often cited as the perceptual limit in these tasks. The peak of the Gaussian component of responses in the present task (disregarding uniformly distributed errors) reaches 75 % at a presentation rate of 7.4 Hz, thus agreeing closely with previously measured limits in a variety of other tasks. For example, Linares and colleagues recently measured the spatial localization of moving objects and found that it was limited by a temporal precision of 70 ms (Linares, Holcombe, & White, 2009). On the basis of a reanalysis of data from a study by Vul et al. (2008) that used stimuli identical to those in the present study, Linares rightly conjectured that a similar limit may apply also to RSVP tasks. However, there are also many reported examples of tasks with even lower limiting rates (Fujisaki & Nishida, 2010; Holcombe, 2009), raising the question of the generality of the 70-ms limit. Many of those studies did not distinguish or consider the many sources of uncertainty in temporal judgment, and as such, it is difficult to know to what extent the limits of performance were set, for example, by the completion rate of categorization processes, rather than by temporal uncertainty per se. A further complicating factor is the nature of the task: Some tasks may require at least two individuation episodes to be performed, suggesting the involvement of refractory processes such as those observed in the attentional blink.
It remains for future studies to further investigate the generality of the 70-ms precision (or 7-Hz) limit for individuation, after partialing out sources of uncertainty other than temporal imprecision per se.
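The link between a 72-ms standard deviation and a 75 % peak at 7.4 Hz can be checked with a short calculation. The sketch below rests on an assumption not stated in the text: that the peak of the Gaussian component is approximated by the Gaussian density at the cue time multiplied by the duration of one item. The function names are hypothetical, and the probability mass integrated over one item duration is shown alongside for comparison.

```python
import math

def peak_density_approx(rate_hz, sd_ms):
    """Gaussian density at the cue time multiplied by the item duration."""
    soa_ms = 1000.0 / rate_hz
    return soa_ms / (sd_ms * math.sqrt(2.0 * math.pi))

def peak_integrated(rate_hz, sd_ms):
    """Gaussian probability mass within half an item duration of the cue."""
    soa_ms = 1000.0 / rate_hz
    return math.erf(soa_ms / (2.0 * math.sqrt(2.0) * sd_ms))

# 7.4 Hz is the rate quoted in the text; 15 and 22.2 Hz are rates mentioned
# elsewhere in the discussion.
for rate_hz in (7.4, 15.0, 22.2):
    print(rate_hz,
          round(peak_density_approx(rate_hz, 72.0), 3),
          round(peak_integrated(rate_hz, 72.0), 3))
```

Under the density approximation, the peak at 7.4 Hz comes out close to 0.75, whereas the integrated mass is somewhat lower. The density approximation is only meaningful at fast rates, because it exceeds 1 once the item duration becomes long relative to the standard deviation.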
Role of attention
Attention has repeatedly been cited in previous research as having a major modulating influence in the generation of errors in Lawrence-type tasks. Botella and colleagues (and several previous authors) have stated this explicitly: “Illusory conjunctions occur when the experimental conditions impede adequate focusing of attention on the presented stimuli, e.g. when exposure times are brief” (Botella et al., 2001, p. 1455). An unanswered question is what exactly attention (or the lack thereof) does to influence the occurrence of errors.
One classical interpretation may be that attention equates to filtering (Broadbent, 1958; Treisman, 1964), a mechanism capable of restricting the range of items from which the target has to be chosen. In the present study, an attempt was made to modulate such filtering action by changing the statistical distribution of the cue’s position in the list. As such, I argued that the conditions of the first experiment, where cue position was fixed, should allow maximum filtering of irrelevant letters, whereas in the conditions of the second experiment, where the cue was randomly distributed across list positions, filtering should depend on the hazard rate of the cue appearing in the next instant of time, and as such, it should vary with cue position. Were these manipulations effective, and what specific aspect of performance did they interfere with? The results of both experiments indicate that the precision of the Gaussian component of errors was unaffected in all conditions. What did change was the probability of committing errors uniformly distributed across list positions. Uniform errors for the midstream cue position were about twice as prevalent in the second as in the first experiment (compare Fig. 5, top panel, 15 Hz, with Fig. 10, Group 1, position 14), suggesting that fixing the cue aided temporal filtering. Furthermore, in the variable cue condition (Experiment 2), observers tended to produce more uniformly distributed errors at early list positions, such that the frequency of uniform errors appeared to have an inverse linear relation with list position (Fig. 10). This effect, reminiscent of the influence of the foreperiod in reaction time tasks (Niemi & Naatanen, 1981), suggests that the uniform distribution of cue times allows observers to increase their expectation of cue arrival as time elapses from RSVP onset and to use such expectation to improve filtering (see Nobre, Correa, & Coull, 2007).
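The growth of expectation with elapsed time follows directly from the hazard rate of a uniformly distributed cue. As a minimal sketch, assuming a cue that is equally likely to occur at any of `n_positions` list positions (the value of 10 used below is illustrative and is not the number of cue positions used in Experiment 2), the hazard at each position is the probability that the cue occurs there given that it has not occurred earlier:

```python
from fractions import Fraction

def uniform_cue_hazard(n_positions):
    """Hazard of the cue at each position, given that it has not yet appeared."""
    # After i positions have passed without a cue, n_positions - i equally
    # likely positions remain, so the conditional probability is 1 / (n - i).
    return [Fraction(1, n_positions - i) for i in range(n_positions)]

hazard = uniform_cue_hazard(10)  # illustrative number of cue positions
for position, h in enumerate(hazard, start=1):
    print(position, h)
```

The hazard rises monotonically from 1/10 at the first position to 1 at the last, so an observer who tracks it has progressively better grounds for expecting the cue as the stream unfolds.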
In the light of the above observations, it appears likely that attention improves categorization efficiency through selective filtering. How much weight, then, does attention carry in explaining the uncertainty of judgments in individuation tasks? Let us assume, as a limit case, that uniform errors are entirely due to ineffective attentional filtering: In the most taxing condition of 22.2 Hz of Experiment 1, these errors accounted for about 20 % of all responses. The remaining responses are distributed normally around the cue position, with a temporal spread that does not vary with presentation rate and is, thus, unlikely to be limited by attention. Thus, it appears that the preponderant factor limiting temporal individuation is not attention but an intrinsic uncertainty in the temporal localization of the cue and the targets. It may be argued that such temporal uncertainty in the fixed cue condition corresponds to the precision of temporal bisection over the length of the list. There are two observations that make this conjecture unlikely: First, bisection may be more precise than individuation at the fastest RSVP rates tested here (see Zanker & Harris, 2002, for comparison bisection data); second, the Weber-like behavior of bisection thresholds (Kopec & Brody, 2010) would predict a similar Weber-like pattern for individuation (precision degrading at slower rates), which was not observed. I propose instead that the observed uncertainty relates to the latency dispersion of temporal markers assigned to events within the stream (see also Nishida & Johnston, 2002).
Latency dispersion of temporal markers as a limiting factor in temporal individuation
Research on temporal aspects of visual cognition is often introduced by noting the large disparity between the limits of detection of temporally modulated signals as opposed to the limits of individuation of specific temporal segments of such signals (Holcombe, 2009). For example, while temporal modulation in excess of 20 Hz of the luminance contrast of spatially separate objects can be easily detected, discrimination of their relative temporal phase can be achieved only at modulation rates below 10 Hz (Forte et al., 1999). The limits of detection can be explained as the minimum excess activation of neural filters induced by a signal, as compared with the activation induced by noise. The signal can be detected as long as the temporal impulse response of the filter is brief enough as not to blur it completely. However, a phase comparator not only must be able to encode the ongoing contrast modulations, but also must do so with temporal fidelity, such that relative phase information is preserved. As such, phase discrimination imposes stricter limits to the temporal characteristics of the encoding mechanisms, and the different requirements of the two tasks dictate the timescale of the neural code that is appropriate in each condition. A rate code based on counting spikes in an interval commensurate with the timescale of the stimulus’ modulation is sufficient to allow contrast detection and discrimination at that timescale. However, temporal phase discrimination necessitates coding at a finer scale (see Geisler, Albrecht, Salvi, & Saunders, 1991, for a thorough discussion of this point).
Temporal integration within a window of less than 100 ms, as found in early visual areas (Bair & Movshon, 2004), limits the ability to perceive two consecutive flashes as separate events (Bowen, 1989), contributes to explaining the phenomenon of masking (Cogan, 1992), and consequently limits the accuracy of target identification (Bundesen & Harms, 1999). In the present task, such effects due to temporal integration at a timescale of 50–100 ms must affect the frequency of uniformly distributed errors, which ought to be increasingly more prevalent at faster rates, as is in fact observed. However, when uniformly distributed errors are excluded, the data indicate that temporal uncertainty does not depend on presentation rate and suggest that the limit to temporal individuation may be set by the dispersion of latency of a putative internal time marker assigned to each event. Assuming that the latencies of markers for the letters and the cue are independently and identically distributed, the present data indicate that such latency distribution has a standard deviation of about 50 ms. This is a large dispersion, as compared with the reliability of neural responses to repeated stimulus presentations measured at early levels of the visual system, such as first-spike latencies, which seem to fall within a range of less than 10 ms (Butts et al., 2007). It is possible that more advanced stages of visual processing are less reliable. For example, available estimates of the variance of visual latencies in various cortical areas in the macaque monkey’s brain suggest that variability in more frontal regions is more similar in magnitude to the variability observed here (Jin, Fujii, & Graybiel, 2009; Pouget, Emeric, Stuphorn, Reis, & Schall, 2005; Schmolesky et al., 1998).
However, while the behavioral data are suggestive of a limit based on temporal coding and a number of neural temporal coding schemes have been proposed (Tiesinga, Fellous, & Sejnowski, 2008), it remains at present unclear what specific aspect of neural activity may embody the putative internal timing marker representing each stimulus event.
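The figure of about 50 ms for a single marker follows from the 72-ms spread of judgments once the variance is split between two equal and independent sources, the letter marker and the cue marker. A minimal worked check, with variable names chosen here for illustration:

```python
import math

total_sd_ms = 72.0  # spread of the Gaussian component of errors reported above

# With two independent, identically distributed markers the variances add:
# total_sd**2 = 2 * marker_sd**2, hence marker_sd = total_sd / sqrt(2).
marker_sd_ms = total_sd_ms / math.sqrt(2.0)
print(f"{marker_sd_ms:.1f} ms")  # prints "50.9 ms"
```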
Conclusions
Contrary to a widely held belief that the precision of temporal individuation in Lawrence-type tasks depends crucially on attentional load, the present data suggest that performance in such tasks is critically limited by an intrinsic source of temporal noise distinct and independent from attention. In line with the time marker theory of Nishida and Johnston (2002), I suggest that perceptual events are assigned internal time markers that are then compared to compute perceived temporal order; but I further suggest that these time markers are subject to a considerable latency dispersion and can be biased by a variety of stimulus factors. The latency dispersion of perceptual time markers can be thought of as a sampling instability that has the same effect as time jitter in a clock and phase noise in frequency oscillators (Balakrishnan, 1962; Souders, Flach, Hagwood, & Yang, 1990); it acts as a low-pass filter that critically limits the observer’s ability to judge the order of a series of events that unfold too quickly.
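The low-pass analogy can be made concrete with a standard result for sampling with timing jitter: when a sinusoid of frequency f is sampled at instants perturbed by zero-mean Gaussian jitter with standard deviation σ, its expected amplitude is attenuated by exp(−2π²f²σ²). The sketch below applies that formula with a jitter of 50 ms. Treating the per-marker latency dispersion as the jitter is an assumption made here for illustration, not a fitted model of the present data, and the function name is hypothetical.

```python
import math

def jitter_gain(freq_hz, jitter_sd_s):
    """Expected amplitude gain of a sinusoid sampled with Gaussian timing jitter."""
    return math.exp(-2.0 * (math.pi * freq_hz * jitter_sd_s) ** 2)

jitter_sd_s = 0.050  # assumed per-marker latency dispersion (about 50 ms)
for freq_hz in (1, 2, 4, 7, 10, 15):
    print(f"{freq_hz:>2} Hz  gain = {jitter_gain(freq_hz, jitter_sd_s):.3f}")
```

Under these assumptions the gain stays close to 1 at 1 Hz and falls below 0.1 by about 7 Hz, which illustrates how a fixed latency dispersion can leave slow sequences intact while blurring the order of fast ones.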
References
Ariga, A., & Yokosawa, K. (2008). Attentional awakening: Gradual modulation of temporal attention in rapid serial visual presentation. Psychological Research/Psychologische Forschung, 72(2), 192–202.
Aschersleben, G. (2002). Temporal control of movements in sensorimotor synchronization. Brain and Cognition, 48(1), 66–79.
Bachmann, T., Luiga, I., Poder, E., & Kalev, K. (2003). Perceptual acceleration of objects in stream: Evidence from flash-lag displays. Consciousness and Cognition, 12(2), 279–297.
Bair, W., & Movshon, J. A. (2004). Adaptive temporal integration of motion in direction-selective neurons in macaque visual cortex. The Journal of Neuroscience, 24(33), 9305–9323.
Balakrishnan, A. V. (1962). On the problem of time jitter in sampling. IRE Transactions on Information Theory, 8(3), 226–236.
Battelli, L., Cavanagh, P., Martini, P., & Barton, J. J. (2003). Bilateral deficits of transient visual attention in right parietal patients. Brain, 126(Pt 10), 2164–2174.
Berdyyeva, T. K., & Olson, C. R. (2009). Monkey supplementary eye field neurons signal the ordinal position of both actions and objects. Journal of Neuroscience, 29(3), 591–599.
Berdyyeva, T. K., & Olson, C. R. (2010). Rank signals in four areas of macaque frontal cortex during selection of actions and objects in serial order. Journal of Neurophysiology, 104(1), 141–159.
Berdyyeva, T. K., & Olson, C. R. (2011). Relation of ordinal position signals to the expectation of reward and passage of time in four areas of the macaque frontal cortex. Journal of Neurophysiology, 105(5), 2547–2559.
Botella, J., & Eriksen, C. W. (1992). Filtering versus parallel processing in RSVP tasks. Perception & Psychophysics, 51(4), 334–343.
Botella, J., Garcia, M. L., & Barriopedro, M. (1992). Intrusion patterns in rapid serial visual presentation tasks with two response dimensions. Perception & Psychophysics, 52(5), 547–552.
Botella, J., Suero, M., & Barriopedro, M. I. (2001). A model of the formation of illusory conjunctions in the time domain. Journal of Experimental Psychology. Human Perception and Performance, 27(6), 1452–1467.
Bowen, R. W. (1989). Two pulses seen as three flashes: A superposition analysis. Vision Research, 29(4), 409–417.
Broadbent, D. E. (1958). Perception and communication. New York: Pergamon Press.
Brown, G. D. A., Neath, I., & Chater, N. (2007). A temporal ratio model of memory. Psychological Review, 114(3), 539–576.
Bundesen, C., & Harms, L. (1999). Single-letter recognition as a function of exposure duration. Psychological Research/Psychologische Forschung, 62(4), 275–279.
Butts, D. A., Weng, C., Jin, J., Yeh, C.-I., Lesica, N. A., Alonso, J.-M., & Stanley, G. B. (2007). Temporal precision in the neural code and the timescales of natural vision. Nature, 449(7158), 92–95.
Chun, M. M., & Potter, M. C. (1995). A two-stage model for multiple target detection in rapid serial visual presentation. Journal of Experimental Psychology. Human Perception and Performance, 21(1), 109–127.
Cogan, A. I. (1992). Anatomy of a flash: II. The "width" of a temporal edge. Perception, 21(2), 167–176.
Forte, J., Hogben, J. H., & Ross, J. (1999). Spatial limitations of temporal segmentation. Vision Research, 39(24), 4052–4061.
Fujisaki, W., & Nishida, S. (2010). A common perceptual temporal limit of binding synchronous inputs across different sensory attributes and modalities. Proceedings of the Royal Society B: Biological Sciences, 277(1692), 2281–2290.
Gathercole, S. E., & Broadbent, D. E. (1984). Combining attributes in specified and categorized target search: Further evidence for strategic differences. Memory and Cognition, 12(4), 329–337.
Geisler, W. S., Albrecht, D. G., Salvi, R. J., & Saunders, S. S. (1991). Discrimination performance of single neurons: Rate and temporal-pattern information. Journal of Neurophysiology, 66(1), 334–362.
Heitjan, D. F. (1989). Inference from grouped continuous data: A review. Statistical Science, 4(2), 164–179.
Henson, R. N. A. (1998). Short-term memory for serial order: The start-end model. Cognitive Psychology, 36(2), 73–137.
Holcombe, A. O. (2009). Seeing slow and seeing fast: Two limits on perception. Trends in Cognitive Sciences, 13(5), 216–221.
Holcombe, A. O., & Cavanagh, P. (2001). Early binding of feature pairs for visual perception. Nature Neuroscience, 4(2), 127–128.
Intraub, H. (1985). Visual dissociation: An illusory conjunction of pictures and forms. Journal of Experimental Psychology. Human Perception and Performance, 11(4), 431–442.
Jin, D. Z., Fujii, N., & Graybiel, A. M. (2009). Neural representation of time in cortico-basal ganglia circuits. Proceedings of the National Academy of Sciences of the United States of America, 106(45), 19156–19161.
Kanai, R., & Watanabe, M. (2006). Visual onset expands subjective time. Perception & Psychophysics, 68(7), 1113–1123.
Kikuchi, T. (1996). Detection of Kanji words in a rapid serial visual presentation task. Journal of Experimental Psychology. Human Perception and Performance, 22(2), 332–341.
Kopec, C. D., & Brody, C. D. (2010). Human performance on the temporal bisection task. Brain and Cognition, 74(3), 262–272.
Lashley, K. S. (1951). The problem of serial order in behavior. In L. A. Jeffress (Ed.), Cerebral mechanisms in behavior: The Hixon Symposium (pp. 112–146). Oxford: Wiley.
Lawrence, D. H. (1971). Two studies of visual search for word targets with controlled rates of presentation. Perception & Psychophysics, 10(2), 85.
Linares, D., Holcombe, A. O., & White, A. L. (2009). Where is the moving object now? Judgments of instantaneous position show poor temporal precision (SD = 70 ms). Journal of Vision, 9(13), 1–14.
McLean, J. P., Broadbent, D. E., & Broadbent, M. H. (1983). Combining attributes in rapid serial visual presentation tasks. The Quarterly Journal of Experimental Psychology. A, 35(Pt 1), 171–186.
Niemi, P., & Naatanen, R. (1981). Foreperiod and simple reaction-time. Psychological Bulletin, 89(1), 133–162.
Nishida, S., & Johnston, A. (2002). Marker correspondence, not processing latency, determines temporal binding of visual attributes. Current Biology, 12(5), 359–368.
Nobre, A., Correa, A., & Coull, J. (2007). The hazards of time. Current Opinion in Neurobiology, 17(4), 465–470.
Piazza, M., Izard, V., Pinel, P., Le Bihan, D., & Dehaene, S. (2004). Tuning curves for approximate numerosity in the human intraparietal sulcus. Neuron, 44(3), 547–555.
Pouget, P., Emeric, E. E., Stuphorn, V., Reis, K., & Schall, J. D. (2005). Chronometry of visual responses in frontal eye field, supplementary eye field, and anterior cingulate cortex. Journal of Neurophysiology, 94(3), 2086–2092.
Raymond, J. E., Shapiro, K. L., & Arnell, K. M. (1992). Temporary suppression of visual processing in an RSVP task: An attentional blink? Journal of Experimental Psychology. Human Perception and Performance, 18(3), 849–860.
Reeves, A., & Sperling, G. (1986). Attention gating in short-term visual memory. Psychological Review, 93(2), 180–206.
Repp, B. H. (2005). Sensorimotor synchronization: A review of the tapping literature. Psychonomic Bulletin & Review, 12(6), 969–992.
Rose, D., & Summers, J. (1995). Duration illusions in a train of visual stimuli. Perception, 24(10), 1177–1187.
Schmolesky, M. T., Wang, Y., Hanes, D. P., Thompson, K. G., Leutgeb, S., Schall, J. D., & Leventhal, A. G. (1998). Signal timing across the macaque visual system. Journal of Neurophysiology, 79(6), 3272–3278.
Souders, T. M., Flach, D. R., Hagwood, C., & Yang, G. L. (1990). The effects of timing jitter in sampling systems. IEEE Transactions on Instrumentation and Measurement, 39(1), 80–85.
Tanji, J. (2001). Sequential organization of multiple movements: Involvement of cortical motor areas. Annual Review of Neuroscience, 24, 631–651.
Thurstone, L. L. (1927a). A law of comparative judgment. Psychological Review, 34(4), 273–286.
Thurstone, L. L. (1927b). Psychophysical analysis. The American Journal of Psychology, 38, 368–389.
Tiesinga, P., Fellous, J.-M., & Sejnowski, T. J. (2008). Regulation of spike timing in visual cortical circuits. Nature Reviews Neuroscience, 9(2), 97–109.
Treisman, A., & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14(1), 107–141.
Treisman, A. M. (1964). Selective attention in man. British Medical Bulletin, 20, 12–16.
Ulrich, R., & Giray, M. (1989). Time resolution of clocks - effects on reaction-time measurement - good-news for bad clocks. British Journal of Mathematical and Statistical Psychology, 42, 1–12.
Vul, E., Hanus, D., & Kanwisher, N. (2009). Attention as inference: Selection is probabilistic; responses are all-or-none samples. Journal of Experimental Psychology. General, 138(4), 546–560.
Vul, E., Nieuwenstein, M., & Kanwisher, N. (2008). Temporal selection is suppressed, delayed, and diffused during the attentional blink. Psychological Science, 19(1), 55–61.
Vul, E., & Rich, A. N. (2010). Independent sampling of features enables conscious perception of bound objects. Psychological Science, 21(8), 1168–1175.
Watson, A. B. (1986). Temporal sensitivity. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.), Handbook of perception and human performance: Vol. 1. Sensory processes and perception. New York: Wiley.
Zanker, J. M., & Harris, J. P. (2002). On temporal hyperacuity in the human visual system. Vision Research, 42(22), 2499–2508.
Acknowledgments
Address for correspondence: Paolo Martini, Department of Psychology, The University of Warwick, Gibbet Hill Road, Coventry, CV4 7AL, UK; e-mail: paolo.martini64@gmail.com. Supported by an RCUK fellowship to Paolo Martini.
Cite this article
Martini, P. Sources of bias and uncertainty in a visual temporal individuation task. Atten Percept Psychophys 75, 168–181 (2013). https://doi.org/10.3758/s13414-012-0384-y