The 20-item prosopagnosia index (PI20): a self-report instrument for identifying developmental prosopagnosia

Self-report plays a key role in the identification of developmental prosopagnosia (DP), providing complementary evidence to computer-based tests of face recognition ability and aiding the interpretation of scores. However, the lack of standardized self-report instruments has contributed to heterogeneous reporting standards for self-report evidence in DP research. The lack of standardization prevents comparison across samples and limits investigation of the relationship between objective tests of face processing and self-report measures. To address these issues, this paper introduces the PI20, a 20-item self-report measure for quantifying prosopagnosic traits. The new instrument successfully distinguishes suspected prosopagnosics from typically developed adults. Strong correlations were also observed between PI20 scores and performance on objective tests of familiar and unfamiliar face recognition ability, confirming that people have the necessary insight into their own face recognition ability required by a self-report instrument. Importantly, PI20 scores did not correlate with recognition of non-face objects, indicating that the instrument measures face recognition, and not a general perceptual impairment. These results suggest that the PI20 can play a valuable role in identifying DP. A freely available self-report instrument will permit more effective description of self-report diagnostic evidence, thereby facilitating greater comparison of prosopagnosic samples, and more reliable classification.

[...] criticized on the grounds that it correlates poorly with objective measures of face recognition ability [35]. Published correlations between scores on this scale and objective tests of face recognition ability range from r = 0.20 [15] to 0.55 [36].
The weak relationship observed probably reflects the inclusion of items pertaining to navigation deficits, the presence of face recognition difficulties in the respondents' wider family, and ability to judge facial attractiveness, facial emotion and facial gender: issues that are not reliable features of DP.

The 20-item prosopagnosia index
The 20-item prosopagnosia index (PI20) is a self-report instrument assessing the presence of prosopagnosic traits. Respondents indicate the extent to which 20 statements describe their face recognition experiences (table 1). Agreement is scored on a five-point scale (strongly agree to strongly disagree). Fifteen statements are scored positively, whereby strongly agree is scored '5' and strongly disagree is scored '1'. Five items are reverse scored (strongly agree is scored '1' and strongly disagree is scored '5').
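The scoring rule described above can be sketched in Python. This is a minimal illustration rather than the authors' scoring software: the function name `score_pi20`, the response coding (1 = strongly disagree to 5 = strongly agree) and the `reverse_items` argument are assumptions made here. The item numbers used in the example are placeholders only; the actual reverse-scored items should be taken from table 1.

```python
from typing import Iterable, Sequence

N_ITEMS = 20
SCALE_MIN, SCALE_MAX = 1, 5


def score_pi20(responses: Sequence[int], reverse_items: Iterable[int]) -> int:
    """Return the total PI20 score (possible range 20-100).

    responses: 20 agreement ratings in item order, coded
        1 (strongly disagree) to 5 (strongly agree).
    reverse_items: 1-based numbers of the five reverse-scored items
        (supplied by the caller; deliberately not hard-coded here).
    """
    reverse = set(reverse_items)
    if len(responses) != N_ITEMS:
        raise ValueError("expected 20 responses")
    if len(reverse) != 5 or not reverse <= set(range(1, N_ITEMS + 1)):
        raise ValueError("expected five distinct item numbers in 1..20")
    total = 0
    for item, rating in enumerate(responses, start=1):
        if not SCALE_MIN <= rating <= SCALE_MAX:
            raise ValueError(f"item {item}: rating must be between 1 and 5")
        # Reverse-scored items map 5 -> 1, 4 -> 2, ..., 1 -> 5.
        total += (SCALE_MIN + SCALE_MAX - rating) if item in reverse else rating
    return total


# Placeholder item numbers, for illustration only (not taken from table 1).
HYPOTHETICAL_REVERSE = (2, 5, 11, 14, 18)

# A respondent who strongly agrees with every statement scores 5 on the
# fifteen positively scored items and 1 on the five reverse-scored items.
example_total = score_pi20([5] * N_ITEMS, HYPOTHETICAL_REVERSE)
```

Passing the reverse-scored items as an argument keeps the sketch independent of any particular item ordering, so the same function can be reused if the questionnaire is presented in a different order.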
Items were generated following review of the qualitative [16,18,19,37-39] and quantitative literature (e.g. [1,2]) on DP, and through discussions with DPs. Items require no previous knowledge about DP, permitting the identification of sufferers who are unaware they have the condition. No items relate to emotion recognition, navigation difficulties, or problems judging facial attractiveness or facial gender. Furthermore, no items were included on the presence of DP in the respondents' wider family, ensuring that the PI20 can be used in quantitative genetic studies estimating the heritability of the condition (defining DP using such criteria renders any conclusion about heritability circular).

Validation Study 1
The PI20 purports to measure prosopagnosic traits. A key indicator of its construct validity is therefore its ability to distinguish known or suspected DPs from the wider population. To determine whether the PI20 satisfies this fundamental criterion, the questionnaire was administered remotely via the Internet to a sample of suspected DPs and typically developed (TD) controls.

Participants and methods
Three-hundred-and-nineteen adults aged between 18 and 74 years participated in Validation Study 1: 242 TD (M age = 29.8 years; 87 males) and 77 suspected DPs (M age = 43.0 years; 30 males). All participants reported normal or corrected-to-normal vision. TD participants were recruited using a local participant database. Suspected DPs contacted the authors via www.troublewithfaces.org complaining of face recognition difficulties, or were recruited via online communities for individuals with DP. Importantly, these individuals identified themselves as suspected prosopagnosics before administration of the PI20 questionnaire. Typically, the suspected DPs had heard about the condition through friends, family or the popular media. Having sought further information from a variety of sources, and recognized the features and anecdotes described, individuals made themselves known to the authors.

Validation Study 2
Many of the respondents described in Validation Study 1 have never undergone formal testing of their face recognition ability. It is therefore likely that both groups are heterogeneous: some self-diagnosed DPs may in fact fall within the typical range of face recognition ability and some of the individuals identified as TD may exhibit some prosopagnosic traits. Validation Study 2 therefore sought to confirm that the PI20 distinguishes between prosopagnosics whose deficits have been verified by formal testing and age-matched controls.

Participants and methods
Thirty-six participants aged between 20 and 74 years completed Validation Study 2: 18 suspected prosopagnosics (M age = 46.7; 12 males), recruited via www.troublewithfaces.org, and 18 matched TD controls (M age = 43.5; 12 males), recruited via the local participant database. All participants completed a range of computer-based tests in our laboratory facilities to assess their face and object recognition ability, including the CFMT [26], the CFPT [13], a version of the FFRT [25] and the CCMT [40]. The scores of the DP group and the control group are shown in table 2. Comparing performance on this battery of tasks is commonly used to diagnose DP. All members of the DP group exhibited evidence of impairment on convergent objective tests of face recognition ability.

Results and discussion
The mean PI20 score of the suspected DPs (M = 81.22, s.d. = 9.47) again exceeded that of the controls (M = 41.67, s.d. = 12.10), t(34) = 10.92, p < 0.001. Of the 18 suspected DPs, 13 scored more than 2.5 s.d. above the control mean (more than or equal to 72), and 12 scored more than 3 s.d. above the control mean (more than or equal to 78). The mean PI20 scores of the DP and TD groups in Validation Studies 1 and 2 corresponded closely. Importantly, these results confirm that the PI20 distinguishes between prosopagnosics whose deficits have been verified by formal testing and age-matched controls. The results of Validation Studies 1 and 2 indicate that the PI20 measures the DP construct as it is currently understood. Suspected DPs recognize the experiences and anecdotes contained within the PI20 and the resulting scores afford classification convergent with existing diagnostic procedures. In Validation Studies 3-5, we assessed the relationship between PI20 scores and objective measures of face and object recognition in more detail.
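The group comparison and the s.d.-based cut-offs reported above can be reproduced from the summary statistics alone. The sketch below assumes a standard pooled-variance independent-samples t-test, which the reported 34 degrees of freedom suggest but which is not stated explicitly; the function names `pooled_t` and `cutoff` are introduced here for illustration.

```python
import math

# Summary statistics reported for Validation Study 2.
dp_mean, dp_sd, dp_n = 81.22, 9.47, 18
td_mean, td_sd, td_n = 41.67, 12.10, 18


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Independent-samples t statistic and degrees of freedom (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def cutoff(control_mean, control_sd, n_sd):
    """Smallest whole-number score at or above control_mean + n_sd * control_sd."""
    return math.ceil(control_mean + n_sd * control_sd)


t_stat, df = pooled_t(dp_mean, dp_sd, dp_n, td_mean, td_sd, td_n)
cutoff_2_5_sd = cutoff(td_mean, td_sd, 2.5)  # threshold 71.92, so scores of 72 or more
cutoff_3_sd = cutoff(td_mean, td_sd, 3.0)    # threshold 77.97, so scores of 78 or more
```

Under these assumptions the calculation returns t = 10.92 on 34 degrees of freedom and cut-offs of 72 and 78, in agreement with the values reported in the text.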

Validation Study 3
In our third validation study, we sought to determine whether PI20 scores correlate with respondents' ability to recognize famous faces. Any suggestion that self-report measures can contribute to the classification of DP rests on the assumption that people have insight into their own face recognition ability. If respondents are poor judges of their face recognition ability, high self-report scores may simply reflect respondents' personality; for example, some individuals are known to underestimate their cognitive abilities [41]. Alternatively, strong correlation between PI20 scores and objective measures of face recognition ability would confirm that individuals do have insight into their face recognition ability.

Participants and methods
One-hundred-and-seventy-three of the respondents from Validation Study 1, aged between 18 and 74 years, including 100 TD (M age = 30.0; 27 males; 84 UK-based) and 73 suspected DPs (M age = 42.9; 28 males; 39 UK-based) participated in Validation Study 3. This sample included participants from Validation Study 2 (see the electronic supplementary material). All participants completed an Internet-based version of the FFRT [25] remotely, during which they had to identify 34 international celebrities (actors, singers, sports stars and politicians), from cropped photographic images, by providing their name or other identifying information. Faces were visible until participants responded. Scores reflect the number of correct identifications expressed as a percentage of the number of celebrities with whom respondents were familiar.
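The FFRT scoring rule described above, in which unfamiliar celebrities are excluded from the denominator, can be written as a short function. This is a sketch under assumed conventions: the function name `ffrt_percent_correct` and the trial coding (True, False or None) are not part of the original test, and the example data are invented.

```python
from typing import Optional, Sequence


def ffrt_percent_correct(trials: Sequence[Optional[bool]]) -> float:
    """Percentage of familiar celebrities that were correctly identified.

    Each trial is True (identified), False (familiar but not identified)
    or None (respondent was not familiar with the celebrity; excluded).
    """
    familiar = [t for t in trials if t is not None]
    if not familiar:
        raise ValueError("no familiar celebrities to score")
    return 100.0 * sum(familiar) / len(familiar)


# Invented data: 34 celebrities, 4 of them unfamiliar to the respondent,
# and 24 of the remaining 30 correctly identified.
example_trials = [True] * 24 + [False] * 6 + [None] * 4
example_score = ffrt_percent_correct(example_trials)
```

Scoring relative to the celebrities a respondent knows avoids penalizing limited exposure to popular culture, which would otherwise be confounded with poor face recognition.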

Results
[...] Crucially, PI20 score remained highly predictive (β = −0.76, p < 0.001), accounting for a further 40.8% of unique variance. These results indicate that PI20 scores correlate with ability to recognize familiar faces.

Validation Study 4
Next, we sought to determine whether PI20 scores predict performance on the CFMT [26]. Whereas the FFRT used in Validation Study 3 assesses ability to recognize familiar faces, the CFMT measures ability to match unfamiliar faces, thought to depend on different neurocognitive mechanisms [42-44].
A correlation between PI20 scores and CFMT performance would confirm that respondents have insight into both their familiar and unfamiliar face recognition ability.

Participants and methods
One-hundred-and-ten participants from Validation Study 1, aged between 18 and 74 years, including 87 TD (M age = 28.6 years; 30 males) and 23 suspected prosopagnosics (M age = 45.8 years; 15 males) participated in Validation Study 4. A subset of the sample also participated in Validation Studies 2 and 3 (see the electronic supplementary material). All participants were living in the UK at the time of testing. The DP sample contacted the authors via www.troublewithfaces.org. TD participants were recruited through the local participant database. All participants completed the CFMT in our laboratory facilities. The test comprises 72 trials and employs a 3AFC match-to-sample design. Participants first learn a target face in left three-quarters-profile view, frontal view and right three-quarters-profile view. During a subsequent recall phase, participants are required to identify the target in a 3AFC procedure [26].

Results
[...] PI20 scores correlated negatively with participants' CFMT scores (r = −0.68, p < 0.001; figure 2b). Additional hierarchical regression analysis was conducted to control for the influence of participant age (years) and gender (1 = male; 2 = female). When entered in the first step of the model, participant age was predictive (β = −0.40, p < 0.001), but participant gender was not predictive (β = 0.10, p = 0.30), of CFMT performance. Together, these factors accounted for 19.1% of the variance. Importantly, when added to the model, PI20 scores remained highly predictive of CFMT score (β = −0.65, p < 0.001), accounting for a further 27.8% of unique variance. The results of Validation Study 4 indicate that PI20 scores correlate with unfamiliar face recognition. Together, Validation Studies 3 and 4 confirm that people have insight into their face recognition ability.
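The logic of the hierarchical regression reported above (enter age and gender first, then add PI20 score and examine the additional variance explained) can be illustrated as follows. This is a sketch only: the data are randomly generated stand-ins, not the study data, the helper `r_squared` is a plain least-squares routine written for this example, and only the change in R² is computed (the standardized β coefficients reported in the text are not).

```python
import random
from typing import Sequence


def r_squared(predictors: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(k):
            if i != c:
                f = aug[i][c] / aug[c][c]
                aug[i] = [v - f * w for v, w in zip(aug[i], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (NOT the study data): 110 simulated participants.
rng = random.Random(0)
n = 110
age = [rng.uniform(18, 74) for _ in range(n)]
gender = [float(rng.choice((1, 2))) for _ in range(n)]  # 1 = male; 2 = female
pi20 = [rng.uniform(20, 100) for _ in range(n)]
cfmt = [90 - 0.2 * a - 0.3 * p + rng.gauss(0, 6) for a, p in zip(age, pi20)]

r2_step1 = r_squared([age, gender], cfmt)        # step 1: age and gender only
r2_step2 = r_squared([age, gender, pi20], cfmt)  # step 2: PI20 score added
delta_r2 = r2_step2 - r2_step1                   # unique variance attributed to PI20
```

In the analysis reported in the text, the corresponding quantities are 19.1% of variance at the first step and a further 27.8% once PI20 scores are added.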

Validation Study 5
The results of Validation Studies 1-4 confirm that PI20 scores correlate with familiar and unfamiliar face recognition. One account of this relationship is that the PI20 measures a relatively specific construct: face recognition ability. However, a second possibility is that the PI20 measures a broader construct (e.g. general memory ability). If the PI20 is measuring a general factor, scores should also correlate with performance on the Cambridge Car Memory Test (CCMT [40]), a well-validated test of non-face object recognition employing an identical format to the CFMT.

Participants, methods and results
The CCMT was administered to the 110 respondents who participated in Validation Study 4. The car recognition ability of the suspected DPs (M = 68.64%, s.d. = 14.34%) was very similar to that of the TD controls (M = 68.29%, s.d. = 13.56%), t(108) = 0.11, p = 0.91. No correlation was observed between participants' PI20 scores and performance on the CCMT (r = 0.07; figure 2c). Hierarchical regression was conducted to determine whether PI20 scores were predictive of CCMT performance once individual differences in age and gender were controlled for. When entered in the first step, participant age was not predictive of CCMT scores (β = 0.08, p = 0.44), but, in line with previous findings [40], respondent gender was a significant predictor (β = −0.20, p = 0.044), together accounting for 5.6% of the variance. When subsequently added to the regression model, PI20 scores were not predictive (β = −0.04, p = 0.71), accounting for a further 0.1% of unique variance. These results suggest that the PI20 is measuring face recognition ability, and not a general factor.

Discussion
This paper introduces the PI20, a 20-item self-report measure of prosopagnosic traits. The new instrument successfully distinguishes suspected DPs from TD adults (Validation Studies 1 and 2). Strong correlations were observed between PI20 scores and performance on objective tests of familiar (Validation Study 3) and unfamiliar face recognition ability (Validation Study 4). Importantly, PI20 scores do not correlate with non-face object recognition (Validation Study 5), indicating that the instrument measures face recognition ability, not a general factor (e.g. wider memory ability).
The results of Validation Studies 3 and 4 confirm that people have the necessary insight into their face recognition ability, required by a self-report instrument. These findings contradict previous suggestions that adults lack insight into their own face recognition ability [27,35]. For example, having asked undergraduates to rate their ability to recognize faces in everyday life 'compared with the average person', Bowles et al. [27] found only weak correlations between self-rated ability and performance on the CFMT and CFPT. However, ratings derived from a single question are likely to provide noisy estimates, making weak correlations unsurprising. In addition, items asking about tangible experiences, such as those included in the PI20 (e.g. 'I sometimes find movies hard to follow because of difficulties recognizing characters'), may be less ambiguous than abstract questions about average face recognition ability.
The foregoing results suggest that the PI20 can play a valuable role in the identification of DP, permitting better description of self-report evidence, thereby facilitating greater comparison of prosopagnosic samples. Based on the relationships observed between PI20 scores and performance on objective tests of face recognition, PI20 scores in the ranges 65-74, 75-84 and 85-100 may be broadly indicative of mild, moderate and severe DP, respectively. To be clear, we are not suggesting that the PI20 should replace objective tests of face recognition ability; rather we intend the PI20 to be used as a complementary diagnostic instrument. Where PI20 and computer-based tests provide convergent evidence of impairment, authors can be confident in the composition of prosopagnosic samples. Conversely, where there is discrepancy between objective and self-report evidence, further testing can be undertaken, for example, to determine whether an individual has a severe, lifelong perceptual impairment that they are unaware of, or whether they have simply under-performed on a given task. The use of convergent tasks and complementary paradigms is likely to result in more reliable classification [34].
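The indicative score ranges suggested above can be expressed as a simple lookup. The sketch below is illustrative only: the function name `pi20_band` and the label returned for scores below 65 are assumptions introduced here, and, in keeping with the caveat in the text, the returned label is a broad indication to be weighed alongside objective tests, not a diagnosis.

```python
def pi20_band(score: int) -> str:
    """Map a total PI20 score (20-100) to the broadly indicative range described in the text."""
    if not 20 <= score <= 100:
        raise ValueError("PI20 total scores range from 20 to 100")
    if score >= 85:
        return "severe"
    if score >= 75:
        return "moderate"
    if score >= 65:
        return "mild"
    # Label for scores under 65 is an assumption; the text defines no band here.
    return "below suggested range"


# The mean scores reported for the two groups in Validation Study 2, rounded.
dp_group_band = pi20_band(81)  # DP group mean of 81.22
td_group_band = pi20_band(42)  # TD control mean of 41.67
```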
DP is a heterogeneous condition [4,36,45,46]. The inclusion of self-report measures in diagnostic batteries guards against the possibility that new sub-groups of the DP population go undetected because current computer-based tests are insensitive to their characteristic deficits. For example, current tests require participants to judge static facial images. However, the faces we encounter outside of the laboratory are dynamic [47,48]. Should prosopagnosics exist who have selective problems processing facial motion [49], they may perform within the normal range on current diagnostic tests, despite experiencing face recognition difficulties in their daily lives.
There have been calls for a quick, easy-to-administer instrument for the purposes of screening populations for DP [38]. Computer-based tests are unsuitable for screening large populations. For example, batteries of computerized tasks frequently exceed 45 min in duration and require control groups for interpretation. These factors, together with the expertise and equipment required to administer computer-based tests, limit their clinical and practical utility [30]. Conversely, the PI20 can be administered very quickly, in the absence of a computer, by clinicians (e.g. to clients or patients), employers (e.g. to prospective police or border control officers) and the judiciary (e.g. to eyewitnesses or jurors), to screen populations for DP. Future research directly assessing the PI20's utility as a screening tool in applied contexts will prove informative.
In academic contexts, the instrument may be used both to identify individuals with DP for inclusion in research samples (e.g. screening undergraduate cohorts) and to exclude individuals exhibiting prosopagnosic traits from studies addressing normative face perception. As we have noted, the PI20 is also well suited for use in genetic studies estimating the heritability of the condition. In addition, the availability of a validated self-report measure permits systematic investigation of the relationship between self-report and objective measures of face recognition ability. For example, future studies might try to better understand which observers are likely to over- or underestimate their actual face recognition ability. Longitudinal studies might also address whether this variability is systematically related to changes in face recognition ability over time.
An increasing number of authors are taking advantage of online platforms to collect behavioural data remotely. While these methods facilitate the collection of large datasets, this trend has provoked considerable discussion about the quality of the data collected [50]. Some readers might therefore query our decision to collect the data reported in Validation Studies 1 and 3 online. A degree of caution is justifiable insofar as current understanding of research conducted online remains relatively limited and best practice continues to evolve. Importantly, however, the findings from these studies were replicated in Validation Studies 2 and 4, respectively, using controlled experimental procedures conducted in the laboratory. Not only do these results confirm that the PI20 can be administered effectively via the Internet, further underscoring its potential value as a screening instrument, but they also support the view that online data collection has a valuable role to play in contemporary social perception research [51].
Discussion of the complementary roles played by self-report and objective tests in identification and diagnosis raises fundamental questions about the future of the DP construct. Although DP is not listed as a psychiatric disorder in DSM-5 [23], prosopagnosia is recognized by the World Health Organization [52], and potentially meets the criteria for a mental disability [53]; i.e. a physical or mental impairment that has a 'substantial' and 'long-term' negative effect on one's ability to do normal daily activities. Crucially, however, the extent to which DP impairs normal daily activities is not easily assessed with face recognition tests in the laboratory. Any attempt to move DP into mainstream psychiatry may therefore necessitate a broader approach to diagnosis, encompassing self-report.
Ethics. Ethical clearance was granted by the local ethics committee and the study was conducted in accordance with the ethical standards laid down in the 2008 (6th) Declaration of Helsinki. Informed consent was obtained from all participants.