What's in the Chinese Babyface? Cultural Differences in Understanding the Babyface

We investigated cultural differences in understanding and reacting to the babyface in an effort to identify both cultural and gender biases in the universal hypothesis that babyfaced individuals are perceived as naïve, cute, innocent, and more trustworthy. Sixty-six Chinese and sixty-six American participants evaluated Chinese faces selected from the Chinese Academy of Sciences (CAS)-Pose, Expression, Accessories, and Lighting (PEAL) Large-Scale Chinese Face Database. We applied Active Shape Models, a machine learning technique, to measure facial features. We found some cultural similarities and also found that a Chinese babyface has bigger eyes, higher eyebrows, a smaller chin, and a greater facial width-to-height ratio (WHR), and looks more attractive and warmer. New findings demonstrate that Chinese babyfaces have a lower forehead and closer pupil distance (PD). When evaluating the babyfacedness of a face, Chinese participants are more concerned with the combination of all facial features, whereas Americans are more sensitive to specific highlighted babyfaced features. The Chinese babyface tended to be perceived as more babyfaced by American participants, but not as less competent by Chinese participants.


INTRODUCTION
You can never tell a book by its cover, but we always automatically and unconsciously judge people by their faces. Indeed, first impressions about a face can be generated in fewer than 50 ms (Todorov et al., 2009). The babyface, with its unique facial structures which differ from mature adult faces, evokes a series of stereotypes. Whether in humans or animals, the babyface is usually defined as a round face with big eyes, high raised eyebrows, a narrow chin, and a small nose. All these features give us the impression of child-like traits, such as being naïve, cute, and innocent (Montepare, 1992, 2005; Zebrowitz et al., 1993, 2012). The babyface overgeneralization effect applies to both infants and adults, including youth and seniors (Zebrowitz et al., 2015). Babyface stereotypes can bias social life outcomes, including elections, financial rewards, job applications, academic performances, prison sentences, altruism, and communication environments (Zebrowitz and McDonald, 1991; Zebrowitz et al., 1998; Collins and Zebrowitz, 1995; Zebrowitz and Montepare, 2008; Livingston and Pearce, 2009; Poutvaara et al., 2009).
There are a few cross-cultural investigations of babyface phenomena in different cultural contexts, but they focus more on identifying similarities than differences (Zebrowitz et al., 1993). Zebrowitz et al. (2012) proposed a common mechanism underlying people's social perception of faces. Even individuals who are isolated from the industrial revolution and modernization can still develop the ability to perceive attractive faces and babyfaces.
However, we believe there is still room for cultural differences in the definition of the babyface and in inferences regarding the babyface in different cultural contexts. The babyface is distributed in a given ratio among different ethnic groups, but the definition of the babyface in terms of facial structures and social perceptions varies across cultures. We make judgments about faces based on visual information, cognitive learning such as attention training, and the usefulness of information to observers (Zebrowitz et al., 2011). Cultural psychologists have found systematic cultural differences in all of these cognitive strategies for decades (Peng and Nisbett, 1999; Nisbett et al., 2001). Hence, from a cultural psychology perspective, we cannot conclude that babyface perceptions and inferences easily escape cultural influence. It is widely known that the Chinese holistic cognitive style is concerned more with integrated features, and the American analytical cognitive style is concerned more with prominent features. Evidence also suggests that Japanese use more configuration information in face perception than Caucasians do (Miyamoto et al., 2011). Therefore, we assumed that Chinese and American participants would differ in babyface perceptions and inferences. Chinese participants should focus more on the combination of all facial features, and Americans should be more sensitive to some highlighted babyfaced features. Forehead height and WHR should show cultural differences, because they are closely related to the integrated perception of a face. Eyebrow height and chin width should show cultural differences, because they are highlighted babyfaced features.
Child-like traits inferred from the babyface have been reported by previous studies. We summarized 17 traits which can be inferred from the babyface, including naïveté, attractiveness, likeability, caring, friendliness, kindness, honesty, trustworthiness, health, openness, extroversion, emotional stability, confidence, intelligence, leadership ability, aggressiveness, and threat (Berry and McArthur, 1985; McArthur and Berry, 1987; Zebrowitz et al., 1993, 2007b, 2015; Albright et al., 1997; Zebrowitz and Montepare, 2005; Zebrowitz, 2006). These 17 traits were evaluated in our study to examine cultural differences in inferences made about the babyface. The stereotype content model (Caprariello et al., 2009; Cuddy et al., 2009) offers a method to refine these traits: they can be divided into two categories, warmth and competence. We anticipated that the babyface overgeneralization hypothesis and the halo effect of attractiveness should also hold for Chinese faces. There should be a positive correlation between attractiveness, health, warm traits and babyfacedness. However, cultural differences may appear when perceiving the trait of competence. There may be no negative correlation between these traits and babyfacedness, because there exists an evolutionary mechanism by which the babyface is a wise strategy to help Chinese people gain more resources. A round face shape, especially a babyface, will help them to get limited resources in limited time, which may increase, not decrease, their competence inferences (Buss, 1989; Cunningham et al., 1995). Unlike previous studies, we hypothesized that the Chinese babyface may lead to differences in competence inferences.

Participants
Seventy-two undergraduate students from Tsinghua University (38 males and 34 females) participated in the experiment. Data from six participants were excluded from further analyses because they were not Chinese. The valid data included 66 people (32 females, age: M ± SD, 21.5 ± 3.17 years old; 34 males, 21.56 ± 3.60 years old, range: 18-31 years old).

Stimuli
In this study, Chinese faces were used as experimental material after being filtered and measured. The faces came from the Chinese Academy of Sciences (CAS)-Pose, Expression, Accessories, and Lighting (PEAL) Large-Scale Chinese Face Database, which includes 1040 adult volunteers (445 women) (Gao et al., 2008). In the pre-experiment, we chose the black-and-white photo group with a unified background, lighting, and focal length, neutral expressions, and no ornaments.
We applied Active Shape Models to quantify the level of babyfacedness and ranked all faces using the Stasm software (Milborrow and Nicolls, 2014). All 1040 faces were first marked with 38 fixed points (Zebrowitz-McArthur and Montepare, 1989; Zebrowitz et al., 2003, 2007b, 2010). Marking the photos yields the coordinates of the 38 fixed points in a rectangular coordinate system whose origin is the bottom left point of the screen (Figure 1). Following previous research, 18 feature vectors were described by the coordinates of the standard facial points (Table 1). All values were standardized by pupil distance (PD).
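The standardization step can be illustrated with a minimal sketch. It assumes that each feature is the Euclidean distance between two landmarks and that standardizing means dividing by PD; the landmark names, coordinates, and the two example measures are hypothetical and are not taken from Table 1 or from Stasm output.

```python
import math

def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def standardize_by_pd(landmarks, measures):
    """Divide each landmark-to-landmark distance by pupil distance (PD).

    landmarks: dict mapping a landmark name to (x, y), origin at bottom left.
    measures:  dict mapping a feature name to a pair of landmark names.
    The names used here are hypothetical, chosen only for illustration.
    """
    pd = distance(landmarks["left_pupil"], landmarks["right_pupil"])
    if pd == 0:
        raise ValueError("pupil distance must be non-zero")
    return {
        name: distance(landmarks[a], landmarks[b]) / pd
        for name, (a, b) in measures.items()
    }

# Made-up coordinates (pixels); y grows upward from the bottom left origin.
landmarks = {
    "left_pupil": (100.0, 300.0),
    "right_pupil": (160.0, 300.0),
    "chin_left": (105.0, 180.0),
    "chin_right": (155.0, 180.0),
    "forehead_top": (130.0, 420.0),
    "nose_top": (130.0, 300.0),
}
measures = {
    "chin_width": ("chin_left", "chin_right"),
    "forehead_height": ("forehead_top", "nose_top"),
}
features = standardize_by_pd(landmarks, measures)
print(features)
```

Dividing by PD makes the measures independent of image scale, so faces photographed at slightly different sizes remain comparable.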
Following Berry and McArthur (1985) and Zebrowitz-McArthur and Montepare (1989), we identified the feature vectors that correlated significantly with babyfacedness, along with their correlation coefficients (Table 2). The standardized vectors selected were eye size, eyebrow height, forehead height (between the highest point of the forehead and the highest point of the nose), eye shape, chin width, PD, and cheek smoothness degree.
From the ranked list of faces, 23 male faces and 23 female faces were chosen as stimuli with a step size of 20, from the least babyfaced to the most babyfaced. Within each level, the babyfacedness values of the faces are close. We controlled three confounding variables in our study. First, age: the faces are between the ages of 22 and 45 as inferred from their appearance (23 females, 29.14 ± 4.87 years old; 23 males, 30.46 ± 5.44 years old). Second, attractiveness: the face chosen at each level is the one with average attractiveness among the faces of that level. Third, front bangs: we chose faces in which the forehead can be seen, instead of those in which the forehead is totally covered by front bangs. By controlling these confounding variables, we can not only control the unexpected influence of attractiveness, perceived age, and hair style, but also avoid the confounding effects of facial attractiveness on levels of babyfacedness.

Procedure
The experiment was conducted in a quiet and bright lab.
Participants were asked to fill in the questionnaire individually on a computer with a 17-inch LCD monitor (1280 × 1024, 60 Hz). All face images were displayed at 360 × 480 pixels, 96 dpi. After studying the concept of the babyface ("Babyface, referring to the facial features of those with newborns face"), participants were asked to practice choosing the more babyfaced face from two female and two male faces. If the choice was not correct, a second trial was conducted. If the choice was still wrong on the second trial, participants were sent back to study the concept of the babyface. They did not enter the next trial until the choice was right.
The gender of faces was randomly assigned in the formal trial to ensure the same number of presentations of female and male faces. Each participant responded to faces of only one gender. There were no right or wrong answers in the formal experiment.
The first stage was a forced choice practice task. The 23 faces were presented randomly in pairs, totaling 253 pairs. Participants were asked to judge two faces of the same gender (two male faces or two female faces) and to respond as quickly and accurately as possible. They pressed S if the left face was more babyfaced and L if the right one was more babyfaced. There was no time limit. The purpose of the forced choice task was to let participants practice their definition of the babyface and decide what makes a face look babyfaced. The results were not considered in our final analysis or evaluation.
Second, participants were asked to rate the babyfacedness of 23 randomly presented faces (male or female) on a scale from 0 to 100, with 0 = the least babyfaced and 100 = the most babyfaced.
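The figure of 253 pairs follows from pairing each of the 23 faces with every other face once: 23 × 22 / 2 = 253. The sketch below generates such a trial list; the random left/right placement and the seed are assumptions made for illustration, since the text only states that pairs were presented randomly.

```python
import itertools
import random

def make_pairs(face_ids, seed=None):
    """Build every unordered pair once, then randomize side and order.

    Left/right randomization is an assumption for this sketch; the
    procedure above only says that pairs were presented randomly.
    """
    rng = random.Random(seed)
    pairs = [list(pair) for pair in itertools.combinations(face_ids, 2)]
    for pair in pairs:
        rng.shuffle(pair)      # which face appears on the left
    rng.shuffle(pairs)         # order of trials
    return [tuple(pair) for pair in pairs]

# 23 same-gender faces, labeled 1..23 for illustration.
pairs = make_pairs(range(1, 24), seed=0)
print(len(pairs))
```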
In the rest stage, participants could rest for 10 minutes. Third, according to the rating each participant gave to every face in the second stage, the most babyfaced and the least babyfaced faces were presented randomly. On a 7-point Likert scale, participants evaluated 17 traits, including naïveté, attractiveness, likeability, caring, friendliness, kindness, honesty, trustworthiness, health, openness, extroversion, emotional stability, confidence, intelligence, leadership ability, aggressiveness, and threat. Last, participants rated the babyfacedness of these two faces again from 0 to 100 and reported how certain they were of their judgment.
When participants finished all three stages, the experiment was over. American participants used the same program as the Chinese participants, but in English. The introduction was translated by a bilingual speaker and checked by a native speaker.
[Table 2 residue: Cheek smoothness degree, 0.48. Sources of the correlation coefficients: r1, r2, r4, and r5 are from Berry and McArthur (1985); r1', r3, r6, and r7 are from Zebrowitz-McArthur and Montepare (1989).]

Facial Structures of the Babyface
With the Kolmogorov-Smirnov test, we found that the grades of babyfacedness given by Chinese and American participants showed no systematic difference. For female faces, D = 0.44, p = 0.99. For male faces, D = 0.30, p = 0.24.
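For readers unfamiliar with the statistic, D in a two-sample Kolmogorov-Smirnov test is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below computes D only, with made-up ratings; how the ratings were grouped for the reported tests is not specified here, so the inputs are purely illustrative. In practice, `scipy.stats.ks_2samp` returns both D and the p-value.

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: max gap between empirical CDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    n, m = len(a_sorted), len(b_sorted)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The gap can only change at observed values, so check each of them.
    for x in set(a_sorted) | set(b_sorted):
        cdf_a = bisect.bisect_right(a_sorted, x) / n
        cdf_b = bisect.bisect_right(b_sorted, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Illustrative ratings only (0-100 babyfacedness scale), not study data.
group_one = [1, 2, 3, 4]
group_two = [3, 4, 5, 6]
d_value = ks_statistic(group_one, group_two)
print(d_value)
```

A small D with a large p-value, as reported above, indicates that the two rating distributions are not distinguishable by this test.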
Examining the effects of culture, Model 2 of Table 4 shows that culture has interaction effects with forehead height (γ = 22.77, p = 0.03) (Figure 2A

Trait Impressions of the Babyface
Attractiveness and health are evolutionary traits that relate more directly to evolutionary tendencies, so we analyzed them individually. We conducted a factor analysis on the other 15 traits. Bartlett's test of sphericity was significant, χ²(66) = 2028.45, p < 0.01, and KMO = 0.896, indicating that these traits are suitable for factor analysis. Principal component analysis and Promax rotation (Kappa = 4) were adopted. Items with loadings < 0.60 were deleted stepwise, and the analysis was rerun after every change. The twelve remaining traits are presented in Table 6. Following the stereotype content model (Caprariello et al., 2009; Cuddy et al., 2009), these traits were divided into two categories: warmth and competence. Communalities of all the items were more than 0.50; 67.44% of the total variance was explained by warmth and competence.

DISCUSSION
The babyface phenomenon seems to be easily found among male Chinese faces, which is similar to findings for white faces (McArthur and Apatow, 1984; Berry and McArthur, 1985). However, we also found the babyface effect on female faces, which may be due to the earlier cessation of growth that causes female faces to retain more neotenous traits (Jones et al., 1995; Tanikawa et al., 2015). But because of the halo effect of attractiveness, further analysis is needed. In the second stage of our experiment, participants rated the babyfacedness of faces at all babyface levels; we did not find significant gender differences when considering the effect of culture. Since we did not ask participants to evaluate the attractiveness of all faces, we could not control attractiveness as a covariate when analyzing the relationship between facial features and babyfacedness. However, with the data from the trait evaluation task in the third stage, we conducted an ANOVA with attractiveness, culture, and face gender as independent variables and babyfacedness as the dependent variable. The main effect of attractiveness was significant, F(14, 208) = 6.11, p < 0.01, ηp² = 0.29. The interaction effect between attractiveness and face gender was not statistically significant, F(14, 208) = 1.73, p = 0.052, ηp² = 0.10. Given this result, we cannot conclude that babyfacedness ratings of female faces co-vary more with attractiveness, because these data did not include all levels of babyfacedness. One possible explanation is that ratings of both male and female faces co-vary with attractiveness. More evidence is needed in the future.
We found that a lower forehead and closer PD were indices of the Chinese babyface. A different definition may be needed for the Chinese babyface. According to previous research on Caucasian faces, the babyface is usually defined as a round face with big eyes, wide PD, high raised eyebrows, a small nose, and low vertical placement of features, which yields a large forehead and a small chin. In our study, a Chinese babyface, that is, a face with a high level of babyfacedness, has bigger eyes, a lower forehead, higher eyebrows, a smaller chin, a narrower PD, and greater WHR. The finding that faces with bigger eyes, higher eyebrows, a narrower chin, and greater WHR were more babyfaced replicates many previous studies by Zebrowitz et al. (2015), including studies examining East Asian faces. But the facial features indicating the Chinese babyface, a lower forehead and a narrower PD, are new findings. Berry and McArthur (1985) and McArthur and Berry (1987) did not find that forehead height was related to the American babyface. Thus, participants use different facial structures to determine the Chinese babyface and the Caucasian babyface. Cunningham et al. (1995) described an evolutionary mechanism which can explain this difference. Cultural groups will modify their reproductive strategies. Compared with Whites and Blacks, greater sexual restraint is produced among Asians because of the harsh climate in North Asia. To get limited resources in limited time, evolutionary changes occurred in their appearance and life style. A round face shape, especially a babyface, will help them to delay and reduce their sexual activities. Asians may be further along the evolutionary trend toward the babyface. Hence, people use different definitions to judge the Chinese babyface and the Caucasian babyface.
What's more, we found interaction effects between culture and forehead height, eyebrow height, chin width, and WHR on the babyfacedness of Chinese faces. Compared with American participants, Chinese participants consider a face more babyfaced when it has a lower forehead, higher raised eyebrows, and greater WHR. But for Americans, a narrower chin contributes more to the babyfacedness of a face.
When evaluating the babyfacedness of a face, WHR and PD contribute the same to babyface perception for Chinese and American judges. But, cultural differences show that Chinese judges care more about forehead height, because the Chinese are concerned more with the combined result of all facial features. Forehead height is related to the entire facial shape, accounting for a high proportion of a face. Peng and Nisbett (1999) and Nisbett et al. (2001) indicated that it is the result of a holistic cognitive style. In contrast, Americans are accustomed to analytical thinking. One or a few prominent characteristics, such as eyebrow height, will lead to a conclusion of babyface. This finding verified our hypothesis. Different cultural groups with different cognitive styles will take different facial features into account.
Based on the coefficients of the mixed model, the influence of culture shows a marginally significant trend (p = 0.08). When judging the same Chinese face, especially a babyface, Americans tend to rate it higher in babyfacedness. A Chinese babyface (Figure 4) is usually perceived to be more babyfaced by American judges than by Chinese judges. This result is inconsistent with evidence that Korean faces are judged more babyfaced than White faces by both American and Korean judges, and that Korean judges rate both Korean and White faces as more babyfaced than do American judges (Zebrowitz et al., 2007a). Both Chinese faces and Korean faces are Asian faces, but these two kinds of faces are not totally the same. Previous studies provide DNA evidence on Koreans, and we have reason to infer that Koreans' eyes are narrower and more elongated (Jin et al., 2003, 2009). But as we know, bigger eyes are the key index of a babyface. From this perspective, Chinese faces seem to be more babyfaced than Korean faces and also American faces. Zebrowitz et al. (2007a) did not indicate whether the ratings given by Korean and American judges differ systematically. Their explanation is that the difference comes from the face race effect. We found that a typical Chinese babyface is judged more babyfaced by Americans than by Chinese. One possible reason is that Chinese faces are more babyfaced than Korean faces and American faces. Even with the effect of face race, Americans still judge Chinese babyfaces as more babyfaced. More research is needed.
We argue that a babyface is more attractive in general, which is consistent with the babyface overgeneralization hypothesis and the halo effect of attractiveness. There is remarkable agreement in the warmth inference of the babyface all over the world (Zebrowitz et al., 2012), which is also true in our study. However, an interesting cultural difference is that Chinese participants seem to be more extreme. They consider a high-level babyface more attractive and a low-level babyface less attractive than Americans do. Apparently, Chinese people like female babyfaces and male mature faces more, which can be easily explained by evolutionary tendencies (Buss, 1989; Zebrowitz, 2003; Zebrowitz and Montepare, 2008). This phenomenon may also reflect a preference for own-race faces, which is consistent with previous research (Zebrowitz et al., 2007a).
It is usually believed that the babyface can lead to impressions of weakness, obedience, naïveté (McArthur and Apatow, 1984; Berry and McArthur, 1985), and greater femininity (Buss, 1989; Zebrowitz, 2006). Zebrowitz et al. (2012) only examined male faces and found a negative effect of babyface on health. Zebrowitz and Franklin (2014) tested both male and female faces and found a stronger influence of babyface on older adults than on young adults in impressions of health. However, we found no effect of babyface on health ratings among young adults. Because the halo effect of attractiveness makes faces look healthier, we only found that attractiveness has a significant effect on health.
In our study, we found that for male Chinese faces, both Chinese and Americans believed that the babyface shows less competence than mature faces. For female Chinese faces, however, Chinese participants do not consider the female babyface less competent, whereas American judges do. The American inference about Chinese female babyfaces may simply be a natural extension of the babyface in general. However, given that the babyface is an evolutionary tendency and survival strategy, a Chinese female babyface implies more fertility and attractiveness, but no less competence. Because it is an evolutionary result, it is plausible that a female babyface shows the same competence as mature faces for Chinese participants. Furthermore, Chinese participants may have often encountered competent Chinese women who happen to have babyfaces; naturalistic realism may be the core source of the cultural differences. Additional research is needed to evaluate whether babyfaced Chinese women also have the same social status and power as their peers. Gender differences suggesting that mature male faces are more competent than mature female faces for the Chinese can be explained by evolutionary tendencies (Buss, 1989; Zebrowitz, 2003; Zebrowitz and Montepare, 2008). A man with a mature face may possess more resources, more wealth, and higher social status.
The current study suggests that the facial structures and first impressions of the babyface are not necessarily universal. A Chinese babyface may be more babyfaced in the eyes of Americans, but it is not perceived to be less competent, and may be seen as even more attractive, in the eyes of the Chinese. Recognizing the cultural differences in an evolution-based natural phenomenon may enrich our understanding of human commonality and diversity.

AUTHOR CONTRIBUTIONS
WZ developed the study concept with KP. WZ developed the experimental paradigm with QY and FY. WZ and QY conducted the experiment and collected the data. WZ performed the data analysis and interpretation under the supervision of KP. WZ and KP drafted the manuscript. All authors contributed to the discussion of the manuscript. QY and FY provided critical revisions. All authors approved the work for publication.

NOTES
1. According to Berry and McArthur (1985), forehead height 1 is the distance between the highest point of the forehead and the line connecting the two highest eyebrow points.
2. According to Zebrowitz-McArthur and Montepare (1989), forehead height 2 is the distance between the highest point of the forehead and the line connecting the two pupil center points.

FUNDING
This work was supported by the National Natural Science Foundation of China (No. 31170973 and No. 31471001) and the Tsinghua University Top Genius Training Program (Spark Plan).