Can You Identify These Celebrities? A Network Analysis on Differences between Word and Face Recognition

Face recognition is located in the fusiform gyrus, which is also related to other tasks such as word recognition. Although these two processes have several similarities, remarkable differences have been reported across a vast range of approaches and different groups of participants. This research aims to examine how the word-processing system processes faces at different moments and vice versa. Two experiments were carried out. Experiment 1 allowed us to examine the classical discrimination task, while Experiment 2 allowed us to examine very early moments of discrimination. In the first experiment, 20 Spanish University students volunteered to participate. In Experiment 2, a sample of 60 participants of different nationalities volunteered to take part. Furthermore, the roles of sex (Experiment 1) and place of origin (Experiment 2) were considered. No differences between men and women were found in Experiment 1, nor between conditions. However, Experiment 2 showed shorter latencies for faces than for word names, as well as a higher masked repetition priming effect for word identities and word names preceded by faces. Emerging methodologies in the field might help us to better understand the relationship between these two processes. For this reason, a network analysis approach was carried out, depicting sub-communities of nodes related to face or word name recognition, which were replicated across different groups of participants. Bootstrap inferences are proposed to account for variability in estimating the probabilities in the current samples. This supports that both processes are related to early moments of recognition, and rather than being independent, they might be bilaterally distributed with some expert specializations or preferences.


Introduction
Recognizing a celebrity can be considered a challenge by most people. Within the cognitive framework, when this task is done by face or name recognition, many processes are required in a hierarchical way that involves visual and semantic stages [1]. In this way, both face and word recognition are examples of expert visual processing [2]. However, even if name and face recognition have some similarities, they are remarkably different. The literature on face recognition has long argued that faces are special to us, based on studies of the fusiform face area (FFA). This is a region of the brain that has been described as one of the most specialized regions for facial recognition in the human visual system [3]. This part of the brain is also related to other tasks such as word [4] and object recognition [5]. Word recognition is a process that must be learned through an increasingly specialized effort in reading. Not surprisingly, there is a specific brain area known as the visual word form area (VWFA) in the left fusiform gyrus that seems to be crucial for any skilled reader. According to the literature [6,7], letter specialization learning in the fusiform gyrus might be associated with a smaller FFA for developing readers. This result is consistent with the neuronal recycling hypothesis developed in [8], where the process of learning to read is described as a cost that might affect the neural substrates of face processing. Research regarding these networks has revealed a lateralized asymmetry, with the VWFA in the left hemisphere and the FFA in the right hemisphere [2,9]. However, functional imaging studies seem to depict activity in both hemispheres, which challenges a strictly lateralized account.
Mathematics 2020, 8, 699; doi:10.3390/math8050699; www.mdpi.com/journal/mathematics
The literature has also debated whether FFA and VWFA responses to faces and written stimuli can be found in other regions [10][11][12]. Furthermore, if the VWFA were specific for reading, it would be expected to be related to other stages of the process. However, it seems that the VWFA is not associated with other underlying reading-related regions [13]. Studies with neuropsychological and clinical cases offer examples of interest, as they generally differ from the control group [14]. Some examples include pure prosopagnosia [15,16] or dyslexia [17], where face but not word recognition is expected to be selectively impaired, or vice versa. Moreover, it is possible to address different levels of word processing by comparing a developing reader with a skilled adult. However, this cannot be addressed in regard to face recognition, because it is not possible to examine participants who lack experience in this process. Nevertheless, the literature has pointed out that individuals from small hometowns show relatively poor face recognition ability, as measured by the Cambridge Face Memory Test (CFMT) [18]. This suggests that the number of faces present in an individual's visual environment might be related to that individual's face recognition ability. Furthermore, not only geographical place of origin, but also differences between men and women have been explored. From this, it has been argued that geographical place of origin and differences between men and women interact to predict face recognition ability.
The aim of this study is to examine how the word-processing system processes faces at different moments and vice versa. This might shed light on whether both processes are specific for each other or distributed with overlapping representations. To do so, a simple presentation/discrimination task and a masked priming paradigm were selected, employing a selection of different faces and names of international celebrities across different populations. Experiment 1 will allow us to examine the classical discrimination task, while Experiment 2 will allow us to examine very early moments of discrimination. The logic behind this is that masked priming effects might be understood as the reflection of residual activation caused by the prime at a particular stage of processing. Therefore, faces might activate words and vice versa. Given that the participants' sex might be a variable of interest according to the current literature, it was controlled for in Experiment 1. Furthermore, participants from three different geographical places of origin were selected for Experiment 2. Even if these results shed light on these variables, caution is advised here, as the sample size might lead to unreliable results regarding sex and country of origin.
Lastly, emerging methodologies in the field might help us to better understand the relationship between face and word name recognition processes. For this reason, a network analysis approach was chosen. This is a graph theory-based methodology that can be used to examine the relationship between observable variables [19][20][21][22]. Therefore, both face and word name recognition processes are considered causally dependent but mutually influential in the current study. To our knowledge, the literature comparing both processes through a network approach is rather scarce.

Participants
A total of two experiments, with two different samples and subsamples, were carried out. In the first one, 20 Spanish University students (10 men and 10 women), with no history or evidence of neurological or psychiatric disease, volunteered to participate.
Secondly, Experiment 2 was composed of 60 participants, described as follows: 20 Spanish University students (4 men and 16 women), 20 Brazilian University students (6 men and 14 women), and 20 North American University students (2 men and 18 women) volunteered to participate. They were selected to show adequate variation on demographic characteristics (therefore, controlling for age, sex, and level of education). This study was carried out in accordance with the Declaration of Helsinki and approved by the University ethical committee (UCV/2017-2018/31). Participants gave written consent to participate in the study.

Stimuli
The procedure to select celebrities was similar to the previous literature [1]. They were chosen after asking three different samples (20 University students each) from Spain, Brazil, and the United States to name 20 celebrities, half of whom had to be men and half women. All stimuli were presented in black and white. In order to avoid any kind of distraction, such as noise, the test was administered in an isolated room, which participants entered individually.
Participants were instructed to identify celebrities via their name or face and to discard unknown stimuli. Specifically, on the notebook keyboard, the letter M was marked with a green label to indicate where the participants had to press for a target stimulus. If the stimulus presented was a distractor, the participants were asked to press the letter Z, which was labeled in red. For the first experiment, a total of 200 stimuli were selected. After a pilot study, 10 celebrities were removed from the study on the basis of their accuracy in that pilot. Therefore, a final set of 160 stimuli was randomly presented to participants. These were divided into 40 names and 40 faces of celebrities, which were presented as target stimuli, as well as 40 names and 40 faces of non-celebrities as distracting stimuli. As depicted in Appendix A, the sex variable for these stimuli was controlled, being half male and half female.
For the second experiment, a masked priming task was counterbalanced into two conditions of word and face recognition. This was done to avoid any repetition bias arising from presenting the same celebrity as both a face and a name. The number of stimuli per condition was reduced, as Experiment 2 involved more experimental conditions. Therefore, a total of 224 stimuli were employed. In this way, 28 photographs or 28 names of famous people were shown to the participants in random order, intermingled with 28 photographs or 28 names of unknown people. Each stimulus appeared 4 times, each time in a different experimental condition (being half male and half female for each one): Identity Masked Priming (face with face/name with name), Related Masked Priming (face as a prime with its name as a target/name as a prime with its face as a target), Unrelated Masked Priming Face-Face (face as a prime with a different face as a target/name as a prime with a different name as a target), and Unrelated Masked Priming (face as a prime with a different name as a target/name as a prime with a different face as a target).
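The stimulus count above can be checked with a short sketch. It assumes that the 224 trials arise from crossing the 56 items of a block (28 famous and 28 unknown) with the four priming conditions; the condition labels and the function name `build_trials` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Counts taken from the text; labels below are illustrative assumptions.
N_FAMOUS = 28
N_UNKNOWN = 28
CONDITIONS = [
    "identity",
    "related",
    "unrelated_same_format",   # e.g., face prime with a different face target
    "unrelated_cross_format",  # e.g., face prime with a different name target
]

def build_trials():
    """Cross every item (famous or unknown) with every priming condition."""
    items = [("famous", i) for i in range(N_FAMOUS)]
    items += [("unknown", i) for i in range(N_UNKNOWN)]
    return [(kind, idx, cond) for (kind, idx), cond in product(items, CONDITIONS)]

trials = build_trials()
print(len(trials))
```

Under these assumptions, each item contributes exactly one trial per condition, so the total is (28 + 28) x 4.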

Procedure
To carry out the experiments, a laptop with a Windows operating system and DMDX software [23] was used. In Experiment 1, a simple presentation task was chosen in which each trial consisted of a fixation point (presented for 500 ms) followed by a stimulus, either a name or a face (presented for 500 ms). The maximum time allowed for a response was 2500 ms. In addition, participants were instructed to answer as quickly as possible, while trying not to make mistakes.
Because both stimuli of each celebrity (face and name) were employed in Experiment 1, this was considered a potential bias for Experiment 2. Due to this, the stimuli were counterbalanced into two groups. In the first group, a masked priming technique was chosen in which a prime was briefly presented (50 ms); in the second group, a mask (500 ms) preceded the presentation of a target stimulus (maximum time for response was 2500 ms). All stimuli were preceded by a fixation point (presented for 500 ms), and the conditions employed were as follows: (1) Identity condition, where the prime was the same stimulus as the target; (2) Related condition, where the prime was the name of the celebrity whose face served as the target, or vice versa (a celebrity face as the prime and that celebrity's name as the target), as depicted in Figure 1; and (3) Unrelated condition, where the prime was a name unrelated to the target, which was a different celebrity's face, or vice versa.
Figure 1. Top: one example of a block for the Prime (celebrity word name)-Target (celebrity face). Bottom: one example of a block for the Prime (celebrity face)-Target (celebrity word name). Blocks were counterbalanced across groups.

Data Analysis
A classic analysis of variance (ANOVA), as well as a non-parametric approach, was carried out on the reaction times of correct responses and on the accuracy of the participants. These analyses were performed after trimming the data (excluding latencies shorter than 250 ms or longer than 1500 ms). Data analysis was performed using SPSS statistical software version 23. The masked priming effect was estimated by calculating how the unrelated primes belonging to semantic categories differed from parallel prime/target pairs, as described in prior literature [24,25]. Figure 2 depicts the procedure employed.
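The trimming rule and the priming estimate described above can be illustrated with a minimal sketch. It assumes that the priming effect is the mean latency in the unrelated condition minus the mean latency in the primed (identity or related) condition, computed on correct responses only; the trial format, the function names, and the example latencies are hypothetical.

```python
from statistics import mean

LOWER_MS, UPPER_MS = 250, 1500  # cut-offs stated in the text

def trim(trials):
    """Keep correct responses with latencies between 250 and 1500 ms."""
    return [t for t in trials
            if t["correct"] and LOWER_MS <= t["rt"] <= UPPER_MS]

def priming_effect(trials, baseline="unrelated", primed="identity"):
    """Mean RT in the baseline condition minus mean RT in the primed one."""
    kept = trim(trials)
    base = mean(t["rt"] for t in kept if t["condition"] == baseline)
    prim = mean(t["rt"] for t in kept if t["condition"] == primed)
    return base - prim

# Made-up latencies for one participant, for illustration only.
example = [
    {"condition": "unrelated", "rt": 640, "correct": True},
    {"condition": "unrelated", "rt": 660, "correct": True},
    {"condition": "unrelated", "rt": 1900, "correct": True},  # trimmed (too slow)
    {"condition": "identity", "rt": 590, "correct": True},
    {"condition": "identity", "rt": 610, "correct": True},
    {"condition": "identity", "rt": 200, "correct": True},    # trimmed (too fast)
    {"condition": "identity", "rt": 580, "correct": False},   # excluded (error)
]
print(priming_effect(example))
```

A positive value indicates faster responses after the primed condition than after the unrelated baseline.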
Finally, a network analysis approach was carried out in order to examine the magnitude of associations between variables. JASP (Version 0.11.1) [Computer software] was employed for this purpose. The EBICglasso estimator, which is based on the least absolute shrinkage and selection operator (LASSO) regularization method, was used for the estimation procedure, as it minimizes the false positive detection of connections [26].
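For reference, the EBICglasso procedure is commonly written as a penalized likelihood problem followed by a model selection step. The notation below is an assumption introduced here for illustration: S is the sample covariance (or correlation) matrix, Θ the precision matrix, λ the penalty, |E| the number of non-zero edges, n the sample size, p the number of nodes, and γ a hyperparameter whose value is not reported in the text.

```latex
% Graphical LASSO estimate for a given penalty \lambda
\hat{\Theta}_{\lambda} = \arg\max_{\Theta \succ 0}
  \left[ \log\det\Theta - \operatorname{tr}(S\Theta)
         - \lambda \sum_{i \neq j} \lvert \theta_{ij} \rvert \right]

% Extended BIC used to choose \lambda
\mathrm{EBIC}_{\gamma}(\lambda) = -2\,\ell(\hat{\Theta}_{\lambda})
  + \lvert E \rvert \log n + 4\gamma \lvert E \rvert \log p

% Edge weight (partial correlation) between nodes i and j
\rho_{ij} = -\frac{\hat{\theta}_{ij}}{\sqrt{\hat{\theta}_{ii}\,\hat{\theta}_{jj}}}
```

The penalty shrinks small partial correlations to exactly zero, which is why the resulting network tends to contain fewer spurious edges.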
Network analysis is an exploratory method based on graph theory, where variables are represented by nodes (or circles) and the relationships between the variables are represented as edges (or lines). The intensity of the edges of the graph represents the magnitude of these associations, while their color (red or blue) depicts the direction (negative or positive, respectively) of these associations.

To simplify the interpretation of the network, three centrality measures were used. The first measure is Betweenness, which reflects the number of shortest paths that pass through the node of interest; the second measure is Closeness, which is the inverse of the sum of all the shortest paths from the node of interest to all other nodes; and the last centrality measure is Degree, which is understood as the sum of the absolute input weights of that node. Bootstrap inferences are proposed to account for variability in estimating the probabilities in the current samples.
Figure 2. Top: one example of the following block: Unrelated masked face prime/face target-Identity masked condition for faces, and the Unrelated masked word prime/face target-Related masked word prime/face target. Bottom: one example of the following block: Unrelated masked word prime/word target-Identity masked condition for words, and the Unrelated masked face prime/word target-Related masked face prime/word target.

Results
Table 1 shows the descriptive analysis for the reaction times and accuracy across men and women in Experiment 1. The Kolmogorov-Smirnov and Shapiro-Wilk normality tests were employed to examine whether the variables were normally distributed; neither reached significance (p > 0.05). However, caution is advised here, as small samples could lead to a failure to reject the null hypothesis [27]. On the other hand, the Levene test indicated equality of variances (all p > 0.05).
In this way, no statistical differences were found across participants' sex or stimulus gender. Nor were statistical differences found across type of stimuli (faces versus word names). However, the within-subjects ANOVA on target versus distracting stimuli did approach the significance level (F(1, 19) = 4.29; mean square error (MSE) = 2473.49; p = 0.05). Given that the subsamples per sex are small in this first study, a parametric approach is not the most suitable strategy to address differences between them. For this reason, latencies were also addressed through the non-parametric Mann-Whitney U test, finding similar results. Of note, the difference in face recognition across sex approached the significance level (U = 27, n1 = n2 = 10, p = 0.08). No differences were found for accuracy under the same approach.
With regard to Experiment 2, Table 2 depicts the descriptive analysis for the reaction times and accuracy. Since sex was not a controlled variable as in Experiment 1, it was not included in the analysis. The Kolmogorov-Smirnov test, employed to examine whether the variables were normally distributed, was not significant (p > 0.05). This was also the case for the Shapiro-Wilk normality test (except for the target Unrelated Masked Priming Face-Face condition and the distracting Unrelated Masked Priming Face-Face condition, which were p = 0.37 and p = 0.22, respectively). The Levene test indicated equality of variances (all p > 0.05). The ANOVA on the latency data showed that face targets preceded by an identity prime, as well as the word name ones, were processed faster: F(3, 171) = 32.70; MSE = 1740.34; p < 0.001; η² = 0.36.
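The effect sizes reported with the F statistics in this section can be related to the test statistics through a short sketch. It assumes that the reported η² values are partial eta squared, which can be recovered from F and its degrees of freedom; the function name is hypothetical, and this is a consistency check rather than the authors' computation.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F values and degrees of freedom as reported in the text.
print(round(partial_eta_squared(32.70, 3, 171), 2))
print(round(partial_eta_squared(57.49, 1, 57), 2))
```

Under this assumption, larger F values with the same degrees of freedom translate directly into larger effect sizes.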
Target faces were processed faster than target word names: F(1, 57) = 57.49; MSE = 225,599.46; p < 0.001; η² = 0.50. As expected, distracting stimuli were processed slower, and this was statistically significant: F(1, 57) = 35.13; MSE = 36,118.37; p < 0.001; η² = 0.38. The interaction between conditions and type of target was statistically significant: F(1, 59) = 37.94; MSE = 6853.36; p < 0.001; η² = 0.33. A non-parametric Friedman test of differences among repeated measures was conducted and rendered a χ²(3) of 41.14, which was statistically significant (p < 0.001; w = 0.25). However, no interactions were found for the country of residence variable under either the ANOVA or the non-parametric Kruskal-Wallis test.
Distracting stimuli also showed faster latencies for faces than for words: F(1, 57) = 43.62; MSE = 28,038.84; p < 0.001; η² = 0.43. The Friedman test indicated a similar result, with a χ²(3) of 26.22 (p < 0.001; w = 0.21). The identity primes for distracting stimuli were also examined, showing that they were processed faster than the other conditions: F(3, 171) = 8.83; MSE = 1740.34; p < 0.001; η² = 0.13. For the non-parametric approach, the Friedman test indicated a similar result, with a χ²(7) of 79.11 (p < 0.001; w = 0.18). No differences were found for accuracy under the same approach, except for condition: (i) F(3, 171) = 24.91; MSE = 0.03; p < 0.001; η² = 0.30; (ii) χ²(7) of 55.51 (p < 0.001; w = 0.13).
On the other hand, Experiment 2 revealed a large masked identity priming effect for both prime faces and prime names, as well as for their related conditions, as depicted in Table 3.
Lastly, a network analysis was carried out. Figures 3 and 4 represent the network plot for the whole dataset, involving two networks on target and distracting stimuli.
This includes the islands of nodes or sub-communities that are identified according to the type of target stimulus under study: Target faces, Target name words, Distracting faces, and Distracting name words. Figure 5 depicts each network according to the country of residence group. Even if different relations among nodes were found across groups, the previous structure was replicated for each country. Some methodological issues confronting the analysis lead to the need to broaden the network analysis with an inferential tool. The reason for this is rather simple: even if the quality of the measures is carefully studied, one of the main shortcomings of the current study is the sample size, and the analysis per country is the most sensitive to it.
Previous literature supports the use of this approach in relatively small samples, when the quality of data is prioritized [19,28,29]. Nevertheless, bootstrapping, which resamples the original sample, seems appropriate to address these concerns [30]. Therefore, a bootstrapping technique was employed, with the number of bootstrap samples (N) set to 1000. This technique showed the same node structure of sub-communities. Figures 6 and 7 depict the relationship between the centrality indices of the networks sampled through bootstrapping and those of the original sample. The stability of centrality indices under the current approach was examined by estimating network models based on subsets of data per country. Figures 8-10 show the resulting plots and reveal sizable bootstrapped confidence intervals around the estimated edge weights, suggesting that many edge weights likely do not significantly differ from one another in each country. As indicated by the previous literature, the red line indicates the sample values, while the gray area depicts the bootstrapped confidence intervals [31].
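The logic of bootstrapping a centrality index can be sketched as follows. This is a simplified illustration and not the procedure implemented in JASP: edge weights are taken as absolute Pearson correlations instead of EBICglasso estimates, only the Degree (strength) index is computed, the confidence intervals are simple percentile intervals, and the data are synthetic. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)

def strength(rows):
    """Degree (strength): sum of absolute edge weights of each node."""
    cols = list(zip(*rows))  # one tuple per variable (node)
    p = len(cols)
    return [sum(abs(pearson(cols[i], cols[j])) for j in range(p) if j != i)
            for i in range(p)]

def bootstrap_strength(rows, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval of each node's strength."""
    rng = random.Random(seed)
    n = len(rows)
    # Resample participants (rows) with replacement and re-estimate strength.
    draws = [strength([rows[rng.randrange(n)] for _ in range(n)])
             for _ in range(n_boot)]
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    intervals = []
    for node in range(len(rows[0])):
        values = sorted(d[node] for d in draws)
        intervals.append((values[lo_idx], values[hi_idx]))
    return intervals

# Synthetic latencies: 20 participants x 4 conditions (illustration only).
gen = random.Random(1)
data = [[gen.gauss(600, 80) for _ in range(4)] for _ in range(20)]
for node, (s, (lo, hi)) in enumerate(zip(strength(data), bootstrap_strength(data))):
    print(f"node {node}: strength={s:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

Wide intervals in a sketch like this would mirror the point made above: with small samples per group, many centrality values cannot be reliably distinguished from one another.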

Discussion
This study examined how the word-processing system processes faces at different moments and vice versa. We tried to verify whether both processes are specific for each other or distributed with overlapping representations. For this reason, famous names and faces were selected for research purposes, a common strategy in the previous literature [1,30-32]. This topic is of interest at both theoretical and applied levels. In this way, one should keep in mind that face perception is a critical skill for survival, and it is commonly considered innate for human beings. Moreover, its location in the fusiform gyrus and its role in recognition are also topics of debate [33,34]. Word recognition, which is learned through an increasingly specialized effort in reading, is also located in the fusiform gyrus [35]. Likewise, the literature has tried to address how the human brain may process information for these specific abilities, either by hypothesizing that highly specialized areas involved in object and face recognition [6,36,37] are recruited to deal with written language, or by hypothesizing that the brain develops specific areas for this task [36,38]. Even if promising results have been found in the field, how independent or overlapping these processes are is a question that is still under debate.
Faster latencies were found for face recognition than for word recognition in Experiment 1, but these differences were not statistically significant. Variables such as sex, considered in the first study, or place of origin, considered in the second one, have been associated with differences in previous literature [39,40]. However, we were not able to replicate remarkable differences with regard to them. These results might not be reliable due to the small sample size. Moreover, sex was not balanced in Experiment 2. In this way, both systematic and direct replications seem to be imperative, as these pieces of research might show mixed results.
On the other hand, masked priming effects were higher for word recognition identities and for word recognition targets preceded by faces than for face recognition tasks preceded by word recognition tasks in Experiment 2. It is of note that masked priming techniques were designed for psycholinguistic purposes, which might represent a potential bias. However, this technique has also been employed for face recognition purposes [41] and even in a word recognition task preceded by emoticon primes [42]. Therefore, this result might suggest that word processing would be more susceptible to abstract representations and more dependent on face recognition (which is considered innate, rather than learned).
It is also worth noting the methodological novelty of the network analysis approach in this field [43][44][45][46]. To our knowledge, this research is the first to compare word and face recognition using network analysis under both a simple presentation and a masked priming paradigm. This approach offers additional evidence to support the main hypothesis that face and word recognition are distributed while keeping some specializations or preferences in early stages of processing. Moreover, the bootstrap analysis of networks should prove a valuable tool for conducting network analysis in behavioral science.
Future lines of research addressing longitudinal studies in developing readers, as well as research carried out in clinical samples, are of interest for both face and word recognition to examine whether one of the processes could be selectively impaired while the other is kept intact. In particular, we would like to recommend the use of emerging techniques in the field such as network analysis to re-examine and support traditional analysis of variance.

Conclusions
The insight shared in the current work supported the hypothesis that both processes might be bilaterally distributed with some specializations or preferences rather than being independent processes. This could be congruent with the idea that the plasticity of the FFA exhibits structural changes until adulthood. More precisely, sample size as well as the population size of the place of origin are variables of interest. The global network was highly interconnected but with some more central nodes across masked priming identities. Notably, the sub-communities exactly matched the nature of the target stimuli (faces or word names), no matter the nature of the preceding prime. This might suggest that the early activation of word recognition is not an obstacle for face recognition and vice versa. Furthermore, this structure was replicated for each subgroup of participants.
