Three-systems for visual numerosity: A single case study

Humans possess the remarkable capacity to assess the numerosity of a set of items over a wide range of conditions, from a handful of items to hundreds of them. Recent evidence is starting to show that judgments over such a large range are possible because of the presence of three mechanisms, each tailored to specific stimulation conditions. Previous evidence in favour of this theory comes from the fact that discrimination thresholds and estimation reaction times are not constant across numerosity levels. Likewise, attention is capable of dissociating the three mechanisms: when healthy adult observers are asked to concurrently perform a taxing task, judgments of low numerosities (<4 dots) or of high numerosities are greatly affected, whereas judgments of intermediate numerosities are not. Here we bring evidence from a neuropsychological perspective. To this end we measured perceptual performance in PA, a 41-year-old patient who suffers from simultanagnosia after a hypoxic brain injury. PA showed a profound deficit in attentively tracking objects over space and time (multiple object tracking), even in very simple conditions where controls made no errors. PA also showed a massive deficit in sensory thresholds when comparing dot arrays containing extremely low (3 dots) or extremely high (64, 128 dots) numerosities, as well as when comparing dot distances. Surprisingly, PA's discrimination thresholds were relatively spared for intermediate numerosities (12 and 16 dots). Overall, his deficit on the numerosity task results in a U-shaped function across numerosity which, combined with the attentional deficit and the inability to judge dot distances, confirms the previously suggested three systems for numerosity judgments.

1. Introduction

Humans can estimate a wide range of numerosities, from a few items to several hundred. Whether a single mechanism or several mechanisms are engaged in numerosity perception across different numerical ranges is an open question. While the existence of a single mechanism may look parsimonious, evidence is starting to mount in favour of three separate systems (

A first classical distinction in the mechanisms for numerosity has been made between very low and intermediate numbers. Jevons (1871) discovered that judgements of low numerosities, usually up to 4 items, are very fast (with constant reaction times) and virtually errorless. The ability to enumerate numbers up to four quickly and effortlessly has been termed "subitizing" (Kaufman & Lord, 1949). Past this numerical range a new mechanism takes over, where errors and reaction times covary with numerosity (Atkinson, Campbell, & Francis, 1976; Jevons, 1871; Kaufman & Lord, 1949; Mandler & Shebo, 1982).
This system has been called "estimation" (or Approximate Number System), to underline its approximate and inexact nature (Feigenson, Dehaene, & Spelke, 2004). The performance discontinuity between very low and higher numbers resulted in the initial proposal of two separate systems for "subitizing" and "estimation".

Recent works examined several psychophysical variables across a broader range of stimuli and highlighted another possible break in performance, suggesting the existence of a third system. In their initial observation Anobile

More recently Pomè and colleagues (2019) measured discrimination thresholds for a wide numerosity range, from very few items to high-density stimuli, and measured the cost of introducing a concurrent dual task. The results replicated a high cost in the subitizing range and an almost complete immunity in the estimation range, but also revealed that, when numerosity increased further, the attentional cost rose again. In line with this, and using a very similar paradigm, Tibber, Greenwood, and Dakin (2012) found strong visual attentional costs on numerosity and density thresholds for high numerosities (128 dots).

Overall these studies suggest that numerosity can be processed by 1) an attention-dependent subitizing system; 2) a relatively attention-free estimation system, linked to the abstract numerical value of the stimuli; 3) an attention-dependent texture-density system, encoding texture density rather than numerosity and not related to mathematical abilities.

In the current study, we tested the three-system hypothesis from a neuropsychological standpoint, taking our lead from the differential attentional demands observed in the three regimes. We describe a single case of a 41-year-old man (PA) who, following a heart attack, developed clinical signs of simultanagnosia. Psychophysical testing, performed 6 months later, revealed a profound spatial attention deficit, massively impairing his ability to attentively track moving objects (Multiple Object Tracking task).

According to the results described above, the three-system model provides a clear prediction about PA's numerosity performance: the patient should demonstrate stronger threshold deficits for those numerical ranges that are more attention dependent. More precisely, the three-system hypothesis predicts a massive deficit in the subitizing range, relatively spared thresholds in the estimation range and, again, impaired thresholds in the texture-density regime. In other terms, PA's performance measured in the single-task condition should qualitatively mirror that obtained previously (

PA is a 40-year-old right-handed male who suffered a hypoxic insult due to a heart attack.

He was transferred to the rehabilitation centre "Auxilium Vitae" in Volterra from the intensive care unit and was finally discharged 120 days after the hypoxic insult. He had difficulty in recognising simple everyday objects, perceiving more than a single object at a time (simultanagnosia), controlling voluntary and purposeful eye movements (oculomotor apraxia) and moving the hand to a specific position under visual guidance (optic ataxia). He also showed ideomotor apraxia, reduced digit span capacity, a slight anterograde memory deficit and mild impairment of executive functions. He was autonomous in walking, feeding, and daily personal care. One year after the heart attack he went back to work. The brain MRI collected 15 days after the hypoxic insult revealed no specific lesion and a very subtle signal variation in the basal ganglia. These findings were much less evident in the brain MRI scan collected 90 days after the event (Figure 1). However, this latter scan showed evidence of overall brain atrophy, in particular in the inferior occipitotemporal regions, in the frontal and parietal paracentral regions and in the hippocampal areas.

Neuropsychological measures were taken at 6 months from injury (Table 1) Intelligence Scale (WAIS-IV) were assessed. The VCI is a score derived from the administration of the WAIS-IV sub-tests information, similarities and vocabulary. It provides a measure of verbally acquired knowledge and verbal reasoning. The WMI was obtained from the WAIS-IV sub-tests digit span and arithmetic. It measures the ability to absorb information presented verbally, to manipulate that information in short-term immediate memory, and then to formulate a response. PA scored in the normal range for the VCI and below the normal range for the WMI; thus PA did not have difficulties with verbal knowledge or verbal reasoning, but he had reduced attention and memory. PA has 15 years of formal schooling and before the critical event was employed in a local museum.

Figure 2A. Stimuli were coloured disks, each with a 0.9° diameter and moving randomly at 2°/s. Some disks, coloured in green, were to be followed, while the red disks were distractors. The number of targets was kept constant at two, while the number of distractors was varied in separate sessions: 3, 4, 6, 8, 10 and 18 for controls; 3, 4, 6, 8 and 10 for the patient. On each trial, two green disks (targets) and a certain number of red disks moved randomly across a grey full-screen background for a period of 3 s, and participants had to hold their attention on the targets. After 3 s, the green targets turned red (like the distractors), and subjects were to continue tracking them for a further 3 s. Afterwards, the disks stopped and the subjects were asked to identify (and point towards) which one of four possible items (highlighted in orange) had previously been a green target (4AFC). The subjects were not asked to respond quickly, but were given all the time they needed to decide. Each experimental session comprised around ten trials.
Participants performed one session for each distractor number condition. PA performed 52 trials (10, 16, 10, 10, 6 for each distractor level), Control 1 performed 60 trials (10 for each level) and Control 2 performed 70 trials (10, 10, 20, 10, 20). No feedback was provided. Performance was measured as the proportion of correct responses.

2.6. Numerosity discrimination

Numerosity thresholds were measured with a two-interval comparison task (2IFC), sketched in Figure 2B. The stimuli were two clouds of non-overlapping dots (0.5° diameter each), half black and half white (in order to balance luminance). The position of each dot was chosen at random within a circular virtual region (10° diameter), respecting the condition that two dots should not be separated by less than 0.5° (centre-to-centre). Dot arrays were presented sequentially for 500 ms each, with a fixed blank inter-stimulus interval of 1 s. Dot clouds were centred at ±10° from a central fixation point. The side of the probe and test stimuli relative to the central fixation point was kept constant in order to reduce spatial uncertainty, which could add noise unrelated to numerosity perception, especially for the patient. Participants were asked to indicate (by pressing the appropriate key) which stimulus contained more dots. As in the attention task, subjects were not asked to respond quickly.

Counting ability was tested with a time-unlimited naming task. The stimuli were clouds of non-overlapping white dots (0.5° diameter each). The position of each dot was chosen at random within a circular virtual region (10° diameter), respecting the condition that two dots should not be separated by less than 0.5° (centre-to-centre). On each trial, a single dot array containing from 2 to 10 dots was presented in the centre of the screen and remained on until participants gave a verbal estimate.
Participants were instructed to enumerate the dot array as fast as they could; no feedback was provided. As soon as participants provided a response, the experimenter (blind to the stimuli) pressed the space bar in order to record the response time. Finally, the experimenter entered the participant's numerical response via the keyboard. PA performed a total of 51 trials (7, 7, 7, 5, 5, 5, 5, 5, 5 for N = 2, 3, 4, 5, 6, 7, 8, 9, 10); control subjects performed 45 trials (5 for each numerosity level). For each numerosity level we computed the mean response time (s) and the average response.

2.8. Object distance perception

Peripheral distance judgements were assessed via a custom paradigm which displayed two rings made out of twenty small dots (5 pixels diameter), akin to beads making up a necklace (Figure 2C). The centre of the stimuli was positioned at 8° eccentricity from a central fixation point and dot positions were specified in polar coordinates. More specifically, the distance of the dots from the centre ($r$) was determined as a sum of two sinusoids around an average radius, one repeating twice and the other repeating 5 times in a full circle ($2\pi$ radians), following the formula:

$$r = \bar{r} + A_5 \sin(5\vartheta + \varphi_5) + A_2 \sin(2\vartheta + \varphi_2)$$

where $\vartheta$ is the polar angle, $\bar{r}$ is the average radius (chosen randomly between 3° and 4.5° for each stimulus), $A_5$ and $A_2$ are the amplitudes of the two sinusoids (the former random between 0.33° and 0.67°, the latter fixed at 1.7°) and $\varphi_5$ and $\varphi_2$ are the two phases (random between 0 and $2\pi$). As in the numerosity task, stimuli were presented sequentially for 500 ms each, with a fixed blank inter-stimulus interval of 1 s, and the side of the probe and test stimuli relative to the central fixation point was kept constant. Participants were asked to indicate (by pressing the appropriate key) which stimulus had the smaller interdot spacing. The left-side stimulus maintained the same interdot distance across trials (test, 0.7 degrees), while the other (probe) varied between 0.1 and 1.5 degrees. The proportion of judgments in which the probe was judged as "sparser" than the test was plotted as a function of probe inter-bead distance and fitted with a standard psychometric function (see Figure 4). The difference between the spacings that yield 50% and 75% "more sparse" judgments defines the just-noticeable difference (JND) which, divided by the PSE, yields the Weber Fraction (WF).
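The threshold definition above (JND as the difference between the 75% and 50% points, WF as JND divided by PSE) can be illustrated with a minimal sketch. It assumes that the "standard psychometric function" is a cumulative Gaussian, which the text does not state; the function name `weber_fraction` and the example values are hypothetical and serve only as illustration.

```python
from statistics import NormalDist


def weber_fraction(pse: float, sigma: float) -> float:
    """Weber fraction of a cumulative-Gaussian psychometric function.

    ASSUMPTION: the fitted function is a cumulative Gaussian whose mean
    is the PSE (50% point) and whose standard deviation is `sigma`.
    The paper only says "standard psychometric function".
    """
    if pse <= 0 or sigma <= 0:
        raise ValueError("pse and sigma must be positive")
    fitted = NormalDist(mu=pse, sigma=sigma)
    x50 = fitted.inv_cdf(0.50)   # stimulus level giving 50% "sparser" responses
    x75 = fitted.inv_cdf(0.75)   # stimulus level giving 75% "sparser" responses
    jnd = x75 - x50              # just-noticeable difference
    return jnd / pse             # JND normalised by the PSE


if __name__ == "__main__":
    # Hypothetical fit: PSE at the 0.7 deg test spacing, sigma of 0.14 deg.
    print(round(weber_fraction(pse=0.7, sigma=0.14), 3))
```

Under this assumption the JND is simply about 0.674 times the fitted standard deviation, so the WF scales linearly with the width of the psychometric function.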

PA performed a total of 53 trials, Control 1 performed 160 trials, all the others performed 110 trials. Standard errors were calculated via bootstrap (Efron & Tibshirani, 1986).

2.9. Data analyses

Statistical differences between accuracy rates and chance level in the Multiple Object Tracking task were computed by binomial tests. Statistical differences in accuracy levels between PA and controls were calculated by Chi-square tests.
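As an illustration of the comparison against chance, the following minimal sketch computes an exact binomial p-value for a 4AFC task (chance level 0.25). The one-sided form of the test is an assumption, since the text does not say which tail was used, and the function name and the example counts are hypothetical rather than taken from the data.

```python
from math import comb


def binomial_p_above_chance(n_correct: int, n_trials: int,
                            chance: float = 0.25) -> float:
    """Exact one-sided binomial p-value: P(X >= n_correct) under chance.

    ASSUMPTION: one-sided test (accuracy above chance); the paper only
    states that binomial tests were used.
    """
    if not 0 <= n_correct <= n_trials:
        raise ValueError("n_correct must lie between 0 and n_trials")
    # Sum the binomial probabilities of all outcomes at least as good
    # as the observed number of correct responses.
    return sum(
        comb(n_trials, k) * chance**k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical session: 9 correct responses out of 10 trials.
    print(binomial_p_above_chance(9, 10))
```

With only around ten trials per condition, an exact test of this kind is preferable to a normal approximation.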

The subjects' statistical differences in numerosity thresholds (WF) were calculated by a bootstrap technique (Efron & Tibshirani, 1986). For each participant, and separately for each numerosity level, raw data were randomly resampled (selecting a data set as large as the original data set, sampled with replacement), a psychometric function was fitted and a WF calculated. On each iteration, the WFs obtained by controls were averaged and compared to that obtained by PA. This procedure was repeated 1000 times. The proportion of times that PA's WF was lower than the controls' average was the p-value. To compare deficit magnitude across numerical regimes, for each iteration we separately averaged PA's and the controls' WFs for numerosities 12 and 16 (estimation range) as well as those for numerosities 64 and 128 (texture density) or N3 (subitizing). Then we computed the ratio between the WFs obtained by PA and the controls in the subitizing, estimation and texture-density ranges (deficit index) and counted the proportion of times the deficit in one range was higher than that in the other (p-value). Numerosity 32 was excluded from this analysis because for one control participant the WF already started to decrease at this numerosity level, making it difficult to categorise it as belonging to the estimation or texture-density regime.
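The resampling logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the psychometric fit is left as a caller-supplied function (the hypothetical `estimate_wf`), and the names `resample` and `bootstrap_p_value` are likewise invented for this example.

```python
import random
from statistics import mean


def resample(trials, rng):
    """Draw a data set of the same size as `trials`, with replacement."""
    return [rng.choice(trials) for _ in trials]


def bootstrap_p_value(patient_trials, controls_trials, estimate_wf,
                      n_iter=1000, seed=0):
    """Proportion of iterations in which the patient's WF is lower than
    the average WF of the controls.

    patient_trials  : list of raw trials for the patient
    controls_trials : list of raw-trial lists, one per control
    estimate_wf     : HYPOTHETICAL callable mapping a list of trials to a
                      Weber fraction (e.g. by fitting a psychometric function)
    """
    rng = random.Random(seed)
    n_lower = 0
    for _ in range(n_iter):
        patient_wf = estimate_wf(resample(patient_trials, rng))
        control_wf = mean(estimate_wf(resample(trials, rng))
                          for trials in controls_trials)
        if patient_wf < control_wf:
            n_lower += 1
    return n_lower / n_iter
```

Keeping the estimator as a parameter separates the resampling scheme from the choice of psychometric model; the same loop could return the ratio of the two WFs on each iteration to obtain a bootstrap distribution of the deficit index.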

We checked for the presence of a subitizing advantage in serial counting by looking at response time (RT) variation as a function of item number. For each subject, and separately for each numerosity, raw response times were randomly resampled (1000 iterations, selecting a data set as large as the original data set, sampled with replacement), the average RT was computed, plotted against physical numerosity and fitted either with a linear or with a two-limb linear function starting with a constant segment and then rising as a function of numerosity. On each iteration, we calculated the goodness of fit of the linear and the two-limb function by means of the Akaike information criterion (AIC). The p-value represents the fraction of times that a given AIC is lower than that of the competing model.

Object distance perception. The subjects' statistical differences in dot-distance thresholds were calculated by a similar bootstrap technique: for each participant, raw data were resampled and a WF calculated. On each iteration, the WFs obtained by the controls were averaged and compared to that obtained by PA. This procedure was repeated 1000 times. The proportion of times that PA's WF was lower than the controls' average was the p-value.

program that allows for the simultaneous visualisation of the movement of the cursor on the screen within the sagittal, horizontal and coronal planes of the brain MRI, together with visualisation of the x, y, z coordinates. The brain sulci of PA, a 40-year-old man, were overall enlarged as a result of the diffuse brain atrophy. No specific lesion and only a very subtle variation of the signal in the basal ganglia are visible (z = +7). The axial slice at z = -13 shows brain atrophy in the inferior occipitotemporal regions and in the hippocampi; the axial slice at z = +39 shows atrophy of the frontal and parietal paracentral regions.
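The model comparison used for the counting data in the data analyses (a linear fit versus a two-limb fit, compared by AIC) can be sketched for a single set of mean RTs as follows. Several choices here are assumptions not stated in the text: least-squares fitting, the Gaussian-error form AIC = n ln(RSS/n) + 2k, and a breakpoint restricted to the tested numerosities. All function names are hypothetical.

```python
import math


def _ols(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b, residual sum of squares)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, rss


def aic(rss, n, k):
    """AIC for least squares with Gaussian errors (ASSUMED form).
    RSS is floored to avoid log(0) on perfectly fitted data."""
    return n * math.log(max(rss, 1e-12) / n) + 2 * k


def fit_linear(xs, ys):
    a, b, rss = _ols(xs, ys)
    return {"intercept": a, "slope": b, "aic": aic(rss, len(xs), 2)}


def fit_two_limb(xs, ys):
    """Constant segment up to a breakpoint, then a linear rise.
    ASSUMPTION: the breakpoint is searched over the tested x values."""
    candidates = sorted(set(xs))[:-1]  # keep data on the rising limb
    if len(candidates) < 2:
        raise ValueError("need at least three distinct x values")
    best = None
    for x0 in candidates:
        hinge = [max(0.0, x - x0) for x in xs]  # 0 on the plateau
        plateau, slope, rss = _ols(hinge, ys)
        if best is None or rss < best["rss"]:
            best = {"plateau": plateau, "slope": slope,
                    "breakpoint": x0, "rss": rss}
    best["aic"] = aic(best["rss"], len(xs), 3)  # plateau, slope, breakpoint
    return best
```

Running both fits on each bootstrap replicate of the mean RTs and counting how often one AIC is lower than the other would give the kind of p-value described in the analysis; the model with the lower AIC is preferred.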
Visual-spatial attentional capacities were psychophysically measured by a Multiple Object Tracking task (Figure 2A). The number of to-be-tracked targets was fixed at two and the attentional load was manipulated, in separate sessions, by increasing the number of distractors from 3 to 18 (3-10 for PA).

Figure 3 shows the proportion of correct responses as a function of the number of distractors. For both control participants (grey lines and symbols), performance was almost perfect, with accuracy slightly decreasing in the most difficult condition (18 distractors) for one participant (Control 1 in the figure).

PA was able to perform the task, with accuracy above chance level (0.25) in the less attention-demanding conditions, namely when the number of distractors was three and four (p<0.001 for both relative to chance). At these two distractor levels, PA's proportion of correct responses was around 0.

inspection it is clear that PA was able to perform the comparison task, producing many ordered functions. However, it is also evident that PA's fits for very small (test N=3 dots) and very high (test N=128 dots) numerosities had higher slopes compared to the controls. The slopes of psychometric functions are indexes of sensory thresholds, with higher values indicating lower precision.

Figure 4B better summarises the results, showing discrimination thresholds (WF) as a function of numerosity for the patient PA (black) as well as those obtained by the controls (averaged across the two subjects, grey). Results from control participants replicated previous findings: thresholds were very low in the subitizing range (≅ 0.1), then rose (≅ 0.2) and remained constant for higher numerosities (from 12 to ≅ 64); finally, WFs decreased for the densest stimuli (WF<0.1 around N128). As described in the introduction, this three-phase discontinuity is the one that initially led to the hypothesis of the existence of three systems.

PA's results were quite different. PA's threshold in the subitizing range (i.e. N3) was very high, with a WF near 0.6, five times higher than that of the controls (p<0.001).

To better visualize PA's sensory threshold deficit across numerosity levels, we computed a "deficit index" as the ratio between PA's and the controls' average WF levels. Figure 4C shows the deficit index as a function of test numerosity, making evident that PA's deficit was not constant across numerosity but followed a U-shaped function. The average deficit for numerosities in the estimation range (12 and 16) was 2.0, while that for numerosities in the texture-density regime (64 and 128) was 8.2 (p=0.03). For the subitizing range (N3) the average deficit was 7.1, higher than in the estimation range (p=0.009) but not higher than in the texture-density regime (p=0.53).

Figure 4. Numerosity discrimination. A) Psychometric functions from two representative controls (light and dark grey) and the patient (PA) for various levels of numerosity, spanning the three regimes. B) Discrimination thresholds (WF) for the patient PA (black), controls (thin coloured lines) and averaged across controls (grey) as a function of numerosity. C) Deficit factor calculated as the ratio between the WF returned from PA's fits and the average performance of controls. Values higher than one mean higher thresholds in PA compared to controls. * p<0.05, ** p<0.01

3.3. No evidence of subitizing in counting task

In order to confirm that the deficit in the subitizing range was not task dependent, we measured PA's performance in a classical dot-counting task in the range 2-10. In this task control subjects exhibit the classical signature of the subitizing advantage: performance is fast and constant up to ∼4 items and then becomes slower and depends on numerosity from 5 items on (grey dots in Fig 5A).
PA's behaviour differed dramatically from this classic pattern. His response times grew steadily as a function of numerosity even with the least numerous items; for instance, counting 3 dots required more time than counting 2 (black dots, Fig 5A). This indicates the absence of the capacity to capture 2, 3 or 4 items at a glance, i.e. a lack of the subitizing process. To confirm this quantitatively we fitted the two datasets (PA and controls) with two functions, either a linear function or a two-limb linear function, and compared the two models by means of the Akaike Information Criterion. In the case of controls the two-limb function was the better model, outperforming a simple linear fit nearly always (bootstrap of AIC p=0.008).

Figure 5B shows the average responses of PA in the counting task. These data indicate that he complied well with the task, with responses that grew monotonically with stimulus numerosity, albeit with a slight overestimation (slope=1.14±0.06, p<0.001; intercept=0.82±0.24, p=0.01). An overall overestimation has been reported previously in some simultanagnosic patients and is generally due to the fact that these subjects, while scanning the display, lose track of the items they have already analysed and may count the same dot twice (Dehaene & Cohen, 1994). Again, no signature of a specific process for very low numerosities is evident from these data.

PA's numerosity thresholds at high numerosities were much worse than those of controls. Previous studies have shown that for very dense stimuli, perception is dominated by dot density. The distance between the elements is a stimulus parameter that has proved to be a good quantitative descriptor of stimulus density (Anobile et al., 2014). For this reason, we also investigated PA's precision in discriminating the distance between objects.
If the numerosity of dense stimuli is judged, even partially, by computing this visual feature, we expect PA to show higher discrimination thresholds than controls.

Figure 5 shows psychometric functions for PA (black) and controls (grey), with associated Weber Fraction estimates (inset text). Both controls found the task particularly easy and

Recent evidence suggests that numerosity perception can draw upon three distinct mechanisms: 1) an attention-dependent subitizing system encoding numbers up to around four; 2) a relatively "attention-free" estimation mechanism for intermediate numbers and 3) an attention-demanding texture-density mechanism operating for highly dense/numerous stimuli.

Here we tested this idea with a neuropsychological approach. We measured numerosity thresholds for a wide range of numerosities, spanning the three systems, in a single patient

Moreover, the pattern of numerosity deficits shown by PA is difficult to explain with a single mechanism spanning all numbers but, instead, fits well with the three-system model. Results on this simultanagnosic patient also nicely extend the evidence provided by previous

We would like to stress that the aim of the current study was not to describe visual perception in simultanagnosia nor the link between math skills and numerosity perception in these patients, both of which certainly require much more detailed testing. In the same vein we note that MRI evidence on our patient revealed a rather diffuse atrophy, which hinders the possibility of restricting the functional deficit to circumscribed damage.

attentional deficits. In the numerical tasks, patients produced more errors than controls for numerosities above three but had relatively preserved accuracy in the quantification of one, two and sometimes three items, demonstrating the subitizing effect.
Demeyere, Lestou, and Humphreys (2010) also found unimpaired exact counting for numbers up to four items but impaired enumeration for higher numbers in a brain-lesioned patient. Demeyere and Humphreys (2007) measured numerosity performance in GK, a patient with severe simultanagnosic symptoms and clearly impaired attentional capacities. At odds with Dehaene and Cohen's patients, GK showed no sign of a subitizing advantage, with error rates increasing linearly with numerosity. Our data on serial counting mirror those of patient GK, with no evidence of a subitizing advantage and response times increasing linearly with numerosity. Interestingly, the authors found that when asked to compare the relative numerosity of two fast consecutive displays, GK's performance (error rates) was significantly above chance for many test numerosity levels (2, 4, 6, 8, or 10 dots), suggesting that he had a residual capacity to compare numerosities. The authors suggested that GK's capacity to distribute attention over space was unimpaired and that distributed attention is the key attentional prerequisite when encoding global stimulus statistics, like numerosity. Following this idea, the same research group also demonstrated the remarkably good ability of GK to encode visual ensemble statistics of object colour and size (

suggested that subitizing is not a pure numerical ability but reflects a domain-general capacity to tag and monitor items of interest in the visual scene. These are attention-demanding processes which, besides supporting target selection, may also intrinsically provide a precise

computes interdot distance and assigns the label of more dense (or more numerous) to the one that possesses the smallest average distance (Anobile et al., 2014). Consistent with this, PA displayed a strong impairment in dot distance estimation.
All this leads to the speculation that discrimination of highly packed arrays relies heavily on an attention-dependent local feature extraction such as object distance. It is also interesting to note that PA, notwithstanding the deficit in distance estimation, performs relatively well at intermediate

The robustness of numerosity perception even in a patient with such severe attentional deficits is consistent with the idea that the numerosity of visual arrays is produced by a dedicated primary mechanism which partially escapes cognitive control