Recognition of animal faces is impaired in developmental prosopagnosia

An ongoing debate in psychology and neuroscience concerns the way faces and objects are represented. Domain-specific theories suggest that faces are processed via a specialised mechanism, separate from objects. Developmental prosopagnosia (DP) is a neurodevelopmental disorder in which there is a deficit in the ability to recognize conspecific (human) faces. It is unclear, however, whether prosopagnosia also affects recognition of hetero-specific (animal) faces. To address this question, we compared recognition performance with human and animal faces in neurotypical controls and participants with DP. We found that DPs showed deficits in the recognition of both human and animal faces compared to neurotypical controls. In contrast, we found no group-level deficit in the recognition of animate or inanimate non-face objects in DPs. Using an individual-level approach, we demonstrate that in 60% of cases in which face recognition is impaired, there is a concurrent deficit with animal faces. Together, these results show that DPs have a general deficit in the recognition of faces that encompass a range of configural and morphological structures.


Introduction
Prosopagnosia is the inability to recognize faces despite normal visual processing. In cases of acquired prosopagnosia (AP), individuals develop normal face recognition, but following brain damage to the occipito-temporal cortex, experience difficulty in recognising faces (Barton, 2008;de Renzi, Faglioni, Grossi, & Nichelli, 1991). In cases of developmental prosopagnosia (DP), on the other hand, deficits in face recognition are seen in the absence of any observable brain injury (Cook & Biotti, 2016;Duchaine & Nakayama, 2006a;Susilo & Duchaine, 2013). Tests reveal that these individuals perform significantly below average on a range of common tests of face perception and recognition (Duchaine, Germine, & Nakayama, 2007;Duchaine & Nakayama, 2006a, 2006b). However, the extent to which prosopagnosia selectively affects the perception of human faces remains contentious (Geskin & Behrmann, 2018).
It is unclear whether the deficit in prosopagnosia also affects animal faces, as standard tests for prosopagnosia only use human faces (Duchaine and Nakayama, 2006b;Shah, Gaule, Gaigg, Bird, & Cook, 2015). There are a few case studies of individuals with AP who report impaired identification of different categories of animal faces and also impaired discrimination of individual animals of the same species (Bornstein, Sroka, & Munitz, 1969;Landis, Cummings, Christens, Bogen, & Imhof, 1986;Toftness, 2019). However, there are other cases of AP in which the ability to recognize animal faces remains intact (Landis et al., 1986;McNeil & Warrington, 1993). For example, patient WJ was a sheep farmer, who acquired prosopagnosia after a stroke, but was still able to differentiate between different sheep (McNeil & Warrington, 1993). These studies provide mixed evidence for impaired animal face recognition in AP. However, to our knowledge, there have been no systematic investigations of animal face recognition in DP.
Neurophysiological and neuroimaging studies provide some support for the idea that similar neural processes underpin the perception and recognition of both human and animal faces. For example, face-selective regions such as the FFA show similar preferential activity for human and animal faces compared to images of bodies and objects (Kanwisher, Stanley, & Harris, 1999;Tong, Nakayama, Moscovitch, Weinrib, & Kanwisher, 2000). Other studies have investigated the pattern of response in the inferior temporal lobe and found similar patterns of response to human and animal faces that are distinct from those elicited by non-face objects (Kriegeskorte et al., 2008). Evidence for the association of human and animal faces also comes from single neuron activity in humans. Face-selective neurons in humans are more responsive to animal faces compared to other object categories (Decramer et al., 2021). Single neuron studies in monkeys have also shown that most face cells respond in a similar way to monkey and human faces (Perrett, Hietanen, Oram, & Benson, 1992).
Findings from developmental psychology, however, suggest that the mechanisms underlying the perception of human and animal faces may be somewhat different. Although young human infants show similar sensitivity to human and monkey faces, they gradually become tuned to human faces throughout infancy, reflecting the perceptual experience of the individual (Pascalis et al., 2005;Pascalis, de Haan, & Nelson, 2002). For example, it has been reported that 6-month-old infants are able to discriminate between monkey faces in a way that 9-month-olds and adults cannot (Pascalis et al., 2005). Exposure to monkey faces in early infancy is thought to attenuate this 'perceptual narrowing' (Pascalis et al., 2002).
Previous studies have, therefore, provided conflicting evidence on whether the recognition of human and animal faces engages similar processing mechanisms. To address this question, we investigated human and animal face recognition in DP. Despite its important theoretical implications, this question has not been directly investigated in DP. Using an old/new recognition paradigm, we compared performance with human, cat, dog, monkey, and sheep faces in DPs and neurotypical controls. We also compared performance with animate (starfish) and inanimate (bottle) object categories that do not have faces.

Participants
Thirty-seven DPs (7 males, M age = 36.92, SD age = 6.61) and 27 Controls (10 males, M age = 31.78, SD age = 13.27) completed the experiment online using the Pavlovia platform (https://pavlovia.org). There have been no previous studies of animal face recognition in DP. However, we performed a power analysis using the data from Biotti, Gray, and Cook (2017) because the authors used the same diagnostic tests to identify DP and investigated the recognition deficit of a biological class of stimuli (bodies). Based on an effect size of 0.89 (Cohen's d) with alpha = 0.05 and power = 0.80, the projected sample size needed was a minimum of 21 participants per group. The groups did not differ significantly in age (t(62) = 1.53, p = 0.131) or in gender (χ²(1) = 1.62, p = 0.105). All participants were over 18 years old, had normal or corrected-to-normal vision and had no history of neurological conditions (e.g., Schizophrenia or Autism Spectrum Disorder). All participants provided written informed consent and were fully debriefed after the experimental procedure. The experiment was approved by the Psychology Research Ethics Committee at the University of York.
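As a rough check on the projected sample size, the calculation can be sketched with a standard normal approximation for a two-sided, two-sample t-test. This is a minimal sketch and not necessarily the procedure used for the power analysis reported here: the function name `n_per_group` and the small-sample correction term (z²/4) are assumptions introduced for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample t-test.

    Assumption: normal approximation plus a z^2/4 small-sample correction;
    dedicated power software may differ slightly for other inputs.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for the two-sided test
    z_power = z(power)           # quantile corresponding to the target power
    n_normal = 2 * (z_alpha + z_power) ** 2 / d ** 2
    return math.ceil(n_normal + z_alpha ** 2 / 4)


# Effect size, alpha and power as stated in the text
print(n_per_group(0.89))  # 21
```

Under these assumptions, the approximation agrees with the minimum of 21 participants per group stated above.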

Diagnostic tests
DP participants were recruited through www.troublewithfaces.org. Diagnostic evidence for the presence of DP was collected using the PI20 questionnaire, a 20-item self-report measure of prosopagnosic traits (Shah et al., 2015), and the Cambridge Face Memory Test (CFMT), an objective measure of face recognition (Duchaine & Nakayama, 2006b). To be classified with DP, a participant had to score above the established cut-off on the PI20 (> 65) and at least 2 standard deviations below the typical mean on the CFMT (Table 1). The average DP scores on the diagnostic tests were: PI20 (M = 79.11, SD = 6.61) and CFMT (M = 50.69, SD = 8.49). The use of convergent diagnostic evidence from self-report and objective computer-based measures is thought to afford reliable identification of DP (Gray, Bird, & Cook, 2017;Tsantani, Vestner, & Cook, 2021).

Old/new recognition task
The old/new recognition test used 7 object categories: 1) human faces, 2) cat faces, 3) dog faces, 4) monkey faces, 5) sheep faces, 6) starfish and 7) bottles. Fig. 1a shows example images from all conditions. Human face images were taken from the Models Face Matching Test (Dowsett & Burton, 2015). Monkey faces were obtained from the PrimFace database (https://visiome.neuroinf.jp/primface/). Dog faces were obtained from the Flickr-dog dataset (Moreira, Perez, de Werneck, & Valle, 2017). All other images were obtained from a variety of freely available Internet sources. Starfish were chosen as an animate non-face object because they belong to the category of animals but do not have a face. Bottles are a category of non-face object with which DPs have previously demonstrated normal recognition performance (Epihova, Cook, & Andrews, 2022). All images were presented in gray-scale and had a resolution of 400 × 400 pixels.
The old/new recognition task involved a learning phase and a recognition phase (Fig. 1b). In the learning phase, each trial began with the presentation of a fixation cross (500 ms) followed by the presentation of a target image (3000 ms). A total of 10 target images were presented in each object category. Participants were instructed to remember the images prior to being tested. The recognition phase followed immediately after the learning phase for each category. In the recognition phase, the 10 target images were presented along with 20 foil images from the same category as the target images. Participants were instructed to indicate by a button press whether the image was old or new. Images stayed on screen until participants made a response. Each category was presented in full before moving to the next category. The order in which categories were presented was counterbalanced and the order of image presentation in the recognition phase within each category was randomised.
Table 1. Demographic information and individual scores on the diagnostic tests used to validate developmental prosopagnosia, namely the PI20 questionnaire (PI20) and Cambridge Face Memory Test (CFMT). High scores on the PI20 indicate the presence of more prosopagnosic traits. Lower scores on the CFMT (% correct) indicate worse face identification performance. Nb. Comparison data (N = 54) for the PI20 and CFMT were taken from Biotti et al., 2019.
We used signal detection theory (SDT) to measure performance in the old/new recognition task. First, we calculated d', a measure of sensitivity that incorporates information from the hit rate (correctly recognising an image as a target) and the false alarm rate (incorrectly mistaking a foil for a target). In cases where the hit rate was 1 and/or the false alarm rate was 0, d' was calculated by decreasing the hit rate to 0.99 and increasing the false alarm rate to 0.01. A d' score of 0 indicates that the observer cannot distinguish between a signal and background noise (chance performance).
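The sensitivity measure can be illustrated with the standard SDT definition, d' = z(hit rate) − z(false alarm rate), where z is the inverse of the standard normal cumulative distribution. The sketch below is illustrative only: the function names are hypothetical, each rate is adjusted independently, and the handling of the mirror cases (hit rate of 0, false alarm rate of 1) is an assumption that is not described in the text.

```python
from statistics import NormalDist


def adjust_rate(rate):
    """Replace extreme rates so that the z-transform stays finite.

    The text describes only hit rate = 1 -> 0.99 and false alarm rate = 0 -> 0.01;
    applying the same rule to the mirror cases is an assumption.
    """
    if rate >= 1.0:
        return 0.99
    if rate <= 0.0:
        return 0.01
    return rate


def d_prime(hits, n_targets, false_alarms, n_foils):
    """Sensitivity: z(hit rate) - z(false alarm rate)."""
    z = NormalDist().inv_cdf
    hit_rate = adjust_rate(hits / n_targets)
    fa_rate = adjust_rate(false_alarms / n_foils)
    return z(hit_rate) - z(fa_rate)


# One category of the task: 10 targets and 20 foils
print(round(d_prime(8, 10, 4, 20), 2))   # 8 hits, 4 false alarms
print(round(d_prime(10, 10, 0, 20), 2))  # perfect performance, adjusted rates
```

With 10 targets and 20 foils, the adjustment caps d' at z(0.99) − z(0.01), approximately 4.65, for a participant who makes no errors.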

Perceptual sensitivity and bias
We calculated the mean d' score for each participant on the old/new recognition task for human faces, animal faces and objects (Fig. 2). We then performed a 2 (Group: Control, DP) x 3 (Category: Human, Animal, Objects) mixed ANOVA. There were significant main effects of Group (F(1, 186) = 23.68, p < 0.001) and Category (F(2, 186) = 64.15, p < 0.001). There was also a significant Group x Category interaction (F(2, 186) = 7.20, p = 0.001). To explore the interaction further, we conducted pairwise comparisons (Controls vs DPs) of d' scores for the 3 categories. The d' scores were significantly lower in the DP group than in the control group for human faces (Controls: M = 2.66, SD = 1.05; DPs: M = 1.52, SD = 0.90; t(62) = 4.66, p < 0.001, Cohen's d = 1.17) and for animal faces (Controls: M = 1.28, SD = 0.56; DPs: M = 0.90, SD = 0.35; t […]).
Fig. 1. a) Example targets from each condition in the old/new recognition task. b) Schematic of the experimental procedure in the old/new recognition task. In the learning phase, target images were presented sequentially. Accuracy was then measured in a recognition phase in which the targets were presented among foils. For each image, participants had to indicate if the image was old or new.

Fig. 2. Individual sensitivity (d') scores for the Control and DP groups for human faces, animal faces and objects. An average score for the animal faces was calculated by combining the d' scores for the cat, dog, monkey and sheep face conditions. An average score for objects was calculated by combining the d' scores for starfish and bottles. Error bars represent ±1 SEM. ***p < 0.001, **p < 0.01, n.s. p > 0.05.
Next, we calculated the median reaction time (RT) for correct trials. A 2 (Group: Control, DP) x 3 (Category: Human, Animal, Objects) mixed ANOVA showed significant effects of Group (F(1, 186) […]
We then compared sensitivity (d') of the DPs and Controls in each animal face condition (Fig. 3) […]
We also performed a 2 (Group: Control, DP) x 4 (Category: Dog, Sheep, Monkey, Cat) mixed ANOVA to check for potential RT differences when animal categories are investigated separately. There were no significant main effects of Group (F(1, 248) = 1.97, p = 0.162) or Category (F(3, 248) = 1.87, p = 0.136), and no significant interaction (F(3, 248) = 0.56, p = 0.639). These results suggest a significant group difference in RT only for the human face condition and no difference between Controls and DPs for the animal face conditions.
To examine response bias in the different face conditions, we calculated a criterion score, C (Fig. 3b). The higher the criterion, the more perceptual evidence is required to make a decision (i.e. a conservative response bias). Criterion scores were entered into a 2 (Group: Control, DP) x 5 (Category: Human, Dog, Sheep, Monkey, Cat) mixed ANOVA. The main effect of Group was not significant (F(1, 310) = 0.01, p = 0.918), but there was a significant effect of Category (F(4, 310) = 3.91, p = 0.004) and a significant Group x Category interaction (F(4, 310) = 3.97, p = 0.004). DPs had a significantly higher criterion score for human faces (Controls: M = 0.06, SD = 0.41; DPs: M = 0.47, SD = 0.64; t(62) = 2.98, p = 0.004, Cohen's d = 0.76). There were no significant group differences in criterion scores for any of the animal faces or objects at p < 0.05.
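The criterion can be illustrated with the conventional SDT formula, C = −0.5 × [z(hit rate) + z(false alarm rate)]. The exact formula used in the analysis is not stated in the text, so this is a sketch under that assumption; the function name `criterion` is hypothetical, and rates of exactly 0 or 1 would need adjusting before the z-transform.

```python
from statistics import NormalDist


def criterion(hit_rate, fa_rate):
    """Response criterion C = -0.5 * (z(H) + z(F)).

    Assumption: conventional SDT definition. Both rates must lie strictly
    between 0 and 1.
    """
    z = NormalDist().inv_cdf
    return -0.5 * (z(hit_rate) + z(fa_rate))


# Few 'old' responses overall: positive C, i.e. a conservative bias
print(round(criterion(0.6, 0.1), 2))
# Many 'old' responses overall: negative C, i.e. a liberal bias
print(round(criterion(0.9, 0.4), 2))
```

Positive values indicate that more evidence is required before responding 'old', which corresponds to the conservative bias described above; a value of 0 indicates no bias.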

Patterns of recognition dissociations
Finally, we explored patterns of dissociations between human faces, animal faces and objects in the DPs. Tests assessing dissociations have their foundations in neuropsychological case studies, where the goal is to compare the performance of a patient on a pair of tasks with that of a control sample. We used the Bayesian criteria for dissociations test (Crawford & Garthwaite, 2007) to investigate dissociations in deficits across two tasks at the individual level. First, the test compares individual performance on two tasks relative to that of controls to test for a deficit on each of the two tasks. Second, the test measures the standardized difference between the individual scores on the two tasks relative to the difference observed in controls. A classical dissociation was recorded if the individual has a deficit on only one task, but also shows a significant difference between that task and the other task. A strong dissociation was recorded if an individual has a deficit on both tasks and there is also a significant difference between tasks. An association (no dissociation) was recorded if the individual does not meet the criteria for either a strong or classical dissociation.
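The classification rule described above can be sketched as a simple decision function. This sketch covers only the final labelling step: the three boolean inputs stand for the outcomes of the Bayesian tests (a deficit on each task and a significant difference between tasks), which are not implemented here, and the function and argument names are hypothetical.

```python
def classify_dissociation(deficit_task_a, deficit_task_b, difference_significant):
    """Label a single case from three test outcomes.

    Assumption: each input is the True/False outcome of a separate test
    (not implemented here) comparing the individual with the control sample.
    """
    if difference_significant and deficit_task_a and deficit_task_b:
        return "strong dissociation"
    if difference_significant and (deficit_task_a != deficit_task_b):
        return "classical dissociation"
    return "association"


# Deficit with human faces only, plus a significant between-task difference
print(classify_dissociation(True, False, True))   # classical dissociation
# Deficits on both tasks without a significant between-task difference
print(classify_dissociation(True, True, False))   # association
```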
In this analysis, we investigated dissociations and associations between the recognition of human faces and the recognition of either (i) animal faces or (ii) objects (Table 2). First, we calculated an A score for each condition (a non-parametric analogue of d') (Zhang & Mueller, 2005). As a group, DPs had significantly lower A scores for human faces (t(62) = 5.08, p < 0.001) and animal faces (t(62) = 2.94, p = 0.005), but not objects (t(62) = 0.49, p = 0.629). Next, we selected the DPs who scored at least 2 SD below the Control mean A score on the human face condition in the old/new recognition task. This was done to avoid the double-dipping problem (Geskin & Behrmann, 2018) by using independent measures to classify DP (PI20 and CFMT) and to investigate face and object dissociations (old/new recognition task). On this basis, 15 of the 37 DPs (40.5%) exhibited evidence of impaired human face recognition at the single-case level. Of the 15 DPs impaired in face recognition, 6 (40%) showed a dissociation between human and animal faces. In contrast, 10 of the 15 DPs (67%) showed a dissociation between human faces and objects (Fig. 4).
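For illustration, the A score can be computed from the hit rate (H) and false alarm rate (F) using the piecewise formula given by Zhang and Mueller (2005), which is defined for H ≥ F. The sketch below is not the analysis code: the function name `a_score` is hypothetical, and returning 0.5 whenever H equals F (which also avoids division by zero at the extremes) and rejecting below-chance input are assumptions.

```python
def a_score(h, f):
    """Non-parametric sensitivity A for hit rate h and false alarm rate f.

    Piecewise formula from Zhang & Mueller (2005), defined for h >= f.
    Assumptions: h == f is treated as chance (0.5); h < f is rejected.
    """
    if h < f:
        raise ValueError("formula is defined for hit rate >= false alarm rate")
    if h == f:
        return 0.5
    base = 0.75 + (h - f) / 4
    if f <= 0.5 <= h:
        return base - f * (1 - h)
    if h < 0.5:
        return base - f / (4 * h)
    # remaining case: 0.5 < f < h
    return base - (1 - h) / (4 * (1 - f))


print(round(a_score(0.8, 0.2), 4))  # 0.86
print(a_score(1.0, 0.0))            # 1.0, perfect discrimination
```

Unlike d', A is bounded between 0.5 (chance) and 1 (perfect discrimination) for H ≥ F and requires no adjustment when the hit rate is 1 or the false alarm rate is 0.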

Discussion
The aim of this study was to investigate whether the deficit in human face recognition evident in DP extends to animal faces. Studies of acquired prosopagnosia have found mixed evidence for a deficit in animal face recognition. McNeil and Warrington (1993) reported the case of a sheep farmer (WJ), who acquired prosopagnosia after a stroke, but was still able to differentiate between different sheep identities. On the other hand, other cases of prosopagnosia have been reported with deficits in the recognition of animal faces (Bornstein et al., 1969;Landis et al., 1986;Toftness, 2019). To date, however, no studies have investigated whether the deficit in human faces in DP extends to animal faces. In this study, we found that at a group level, individuals with DP had recognition deficits with both human and animal faces. The magnitude of group-level impairment varied for the different animal faces. The recognition deficits in DP were most pronounced for dog faces. However, a significant group difference was also seen for monkey faces. While the group differences for sheep and cat faces did not reach significance, we note that a similar trend (DPs < Controls) was also seen in these conditions.
The selectivity of the recognition deficit was shown by the lack of any group-level difference between DPs and controls in the recognition of non-face objects. These findings suggest that the deficit in DP involves a shared representation of human and animal faces. This is consistent with neurophysiological and neuroimaging studies showing a similar representation of human and animal faces in the temporal lobe (Decramer et al., 2021;Kanwisher et al., 1999;Kriegeskorte et al., 2008;Tong et al., 2000). For example, single neuron recordings have shown that neurons in the human brain that are selective for human faces are also selective for monkey faces (Decramer et al., 2021). Neuroimaging studies have reported similar findings. For example, regions showing selectivity for human faces also show selective responses to animal faces (Kanwisher et al., 1999;Tong et al., 2000) and studies using multi-voxel pattern analysis (MVPA) report similar patterns of neural response elicited by human and animal faces (Kriegeskorte et al., 2008).
There were, however, some differences in the way DPs and controls recognized human and animal faces. The impaired recognition of human faces in DPs reflected a lower hit rate, but no difference in false alarms. On the other hand, the animal recognition impairments seen in DPs reflected a higher incidence of false alarms, but no difference in hit rate. One explanation is that these contrasting patterns result from a difference in response bias. Our criterion analysis fits with this account, as DPs had a more conservative bias for human face recognition. That is, they required more perceptual evidence to indicate that a target was present when it was a human face. It is possible that a lifetime of face recognition problems, and the associated social embarrassment, causes DPs to adopt a conservative decision criterion. This may not generalise to animal faces because the social cost of misidentification (i.e., the potential for embarrassment) is substantially lower. We note, however, that the interpretation of criterion measures under conditions where accuracy is known to differ is challenging. While accuracy measures are generally regarded as meaningful across conditions differing in bias, the reverse is not necessarily true (Wixted & Stretch, 2000).
We did not find any group-level deficits in the recognition of simple animate (starfish) or inanimate (bottles) non-face objects. Although our findings suggest that DP is selective for faces, the extent to which prosopagnosia also affects the perception of non-face objects remains contentious (Geskin & Behrmann, 2018). A number of studies have suggested that deficits in human face identification occur in the absence of any deficits in the recognition of non-face objects (Barton, Albonico, Susilo, Duchaine, & Corrow, 2019;Bate, Bennetts, Tree, Adams, & Murray, 2019;Garrido, Duchaine, & DeGutis, 2018;Shah et al., 2015). However, there is now increasing evidence that individuals can have co-occurring deficits in non-face object recognition (Barton et al., 2019;Barton & Corrow, 2016;Biotti et al., 2017;de Haan & Campbell, 1991;Duchaine et al., 2007;Epihova et al., 2022). Nevertheless, it is not clear why only particular objects are affected in DP. In a recent study, we described a deficit in the perception of pareidolic objects (that are perceived as being face-like) in DP (Epihova et al., 2022). However, this was only with pareidolic objects that had similar image properties to faces. Further studies that reveal which objects are or are not affected in DP may help uncover the functional organizing principles involved in object perception.
Next, we investigated the dissociations between performance for human faces and either animal faces or objects. Using only the DPs who performed at least 2 SD below the control-group mean on the human face condition of the old/new recognition task, we found that only 40% showed a dissociation between human and animal face recognition, whereas 67% showed a dissociation between human faces and objects. Despite the fact that, at a group level, DPs performed equally to controls with objects, 5 of the 15 DPs exhibited associated object agnosia. This is consistent with previous reports demonstrating that, even in the absence of group-level differences in object recognition, some DPs exhibit deficits in object recognition (Barton et al., 2019;Bate et al., 2019), suggestive of a heterogeneous profile of DP (Minnebusch, Suchan, Ramon, & Daum, 2007).
In the present study, we classified individuals as DP based on their scores on the PI20 and CFMT. Although there is no formal guidance on the diagnosis of DP, we acknowledge that this approach is relatively liberal. In particular, it has been argued that diagnostic decisions should be informed by performance on multiple objective tests of face recognition performance in addition to any self-report evidence (Bate & Tree, 2017;Dalrymple & Palermo, 2016). The use of more liberal diagnostic criteria can complicate the interpretation of null effects of group (DPs vs controls). For example, subtle perceptual deficits may be harder to detect in milder cases. However, it is unlikely that clear evidence of a deficit in animal face recognition, such as that described here, can be attributed to the presence of milder cases within the DP sample. If anything, a more liberal approach would be expected to reduce the chance of a significant group difference between DPs and controls.
In conclusion, we provide the first systematic evidence for a deficit in the recognition of animal faces in DP. These findings converge with other studies showing similar patterns of neural response to human and animal faces in the temporal lobe. Together, these results show that DPs have a general deficit in the recognition of faces with a range of configural and morphological structures.

Author contributions
G.E. and T.A. designed the study. G.E. conducted the experiments and analysed the data. R.C. contacted DP participants. All authors contributed to the writing of the manuscript.

Data availability
Experimental stimuli, code and anonymised data are publicly available at https://osf.io/jymqv/