The inaccuracy of national character stereotypes

https://doi.org/10.1016/j.jrp.2013.08.006

Highlights

  • Mean trait levels are appropriate criteria for evaluating national stereotypes.

  • The Reference Group Effect has limited impact on culture mean scores.

  • The National Character Survey faithfully reflects beliefs about typical traits.

  • Collective behaviors may not reflect aggregate personality traits.

  • National character stereotypes are inaccurate for most traits and cultures.

Abstract

Consensual stereotypes of some groups are relatively accurate, whereas others are not. Previous work suggesting that national character stereotypes are inaccurate has been criticized on several grounds. In this article we (a) provide arguments for the validity of assessed national mean trait levels as criteria for evaluating stereotype accuracy and (b) report new data on national character in 26 cultures from descriptions (N = 3323) of the typical male or female adolescent, adult, or old person in each. The average ratings were internally consistent and converged with independent stereotypes of the typical culture member, but were weakly related to objective assessments of personality. We argue that this conclusion is consistent with the broader literature on the inaccuracy of national character stereotypes.

Introduction

Since Lippmann (1922/1991) first introduced the term stereotype to refer to people’s beliefs about social groups, most social scientists have emphasized the inaccuracy of such beliefs (Allport, 1979; Brown, 2010). Basic cognitive processes have been identified that lead people to exaggerate real differences between groups (Campbell, 1967), ignore or misremember stereotype-inconsistent information (Stangor & McMillan, 1992), and develop false beliefs to justify injustice (Jost & Banaji, 1994). These processes are practically important because of the role stereotypes can play in sustaining and exacerbating social inequalities, and theoretically important because they demonstrate that people’s perceptions and judgments may deviate from objectivity and rationality.

“However”, as Swim (1994, p. 21) put it, “reasons for inaccuracy are not evidence of inaccuracy”. And, surprisingly, much of the evidence to date shows considerable accuracy in many consensual stereotypes, including those involving age (Chan et al., 2012), gender (Swim, 1994), and race (McCauley & Stitt, 1978; see reviews by Ryan, 2002; Jussim, 2012). By accuracy we mean statistical agreement between beliefs about a group and the aggregate characteristics of the group in question. Importantly, stereotype accuracy does not refer to beliefs about the sociological, historical, or biological bases of differences between groups; it implies only that individuals are able to perceive group differences with some degree of precision. We are concerned here with the accuracy of consensual stereotypes (operationalized as the average beliefs across a sample of respondents); because of the “wisdom of crowds” (Surowiecki, 2004) these are likely to be substantially more accurate than personal stereotypes.
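The definition above can be illustrated with a minimal Python sketch. It assumes that "statistical agreement" is measured with a Pearson correlation, which is only one possible choice; all ratings and criterion values are hypothetical and are not data from any study cited here.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def consensual_stereotype(ratings_by_rater):
    """Average each group's rating across raters (rows = raters, columns = groups)."""
    return [mean(column) for column in zip(*ratings_by_rater)]


# Hypothetical data: three raters judge four groups on one trait (1-5 scale).
ratings = [
    [4, 2, 3, 5],
    [5, 1, 3, 4],
    [3, 3, 3, 3],
]
# Hypothetical assessed group means on the same trait (T-score metric).
criterion = [52.0, 47.0, 50.0, 53.0]

consensus = consensual_stereotype(ratings)
accuracy = pearson(consensus, criterion)
print(consensus, round(accuracy, 2))
```

In this sketch accuracy is a single number describing how well the averaged beliefs track the assessed group means; it says nothing about why the groups differ.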

It is now clear that the degree of accuracy or inaccuracy of stereotypes cannot be assumed, but must be evaluated empirically, on a case-by-case basis. In these evaluations, however, the burden of proof has shifted to those who claim that stereotypes are inaccurate, because failure to find evidence of accuracy is often a null result, and the interpretation of null results is always difficult. In this article we take on that burden with respect to the inaccuracy of national character stereotypes. We argue that aggregate personality traits are appropriate criteria for evaluating the accuracy of national character stereotypes and review evidence on the adequacy of our stereotype measure; we then report new data replicating previous findings of inaccuracy.

The term national character might be broadly understood to include a wide range of characteristics, including intelligence, appearance, food preferences, and athletic abilities (e.g., Ibrahim et al., 2010). We adopt a narrower view, equating character with personality traits, and we use a comprehensive model of personality traits, the Five-Factor Model (FFM). Our study thus speaks to the accuracy of national stereotypes of personality traits, but does not imply accuracy or inaccuracy in perceptions of other national characteristics.

Age and gender stereotypes concerning personality traits appear to be largely accurate (Chan et al., 2012; Löckenhoff et al., 2013), but Terracciano et al. (2005) reported that national character stereotypes are not. They examined beliefs about the typical personality traits of members of different cultures and found that they were essentially unrelated to assessed mean levels of traits in 49 cultures. However, that conclusion has been challenged on a number of grounds (e.g., Perugini & Richetin, 2007). Because stereotypes in general are often accurate, it is reasonable to ask if flaws in the Terracciano study accounted for the negative results. Because the sample was large (N = 3,989) and a number of alternative analytic strategies were employed, the most plausible arguments are that (a) the criteria (i.e., assessed national levels of personality traits) were invalid, or (b) the stereotype measure was inadequate. We consider these arguments and then offer new data on the (in)accuracy of national character stereotypes.

The accuracy of stereotypes can only be determined by comparing beliefs to some objective standard. Terracciano et al. (2005) argued that objective data on the mean levels of personality traits in various nations provided such a standard, but that view is currently a matter of controversy. In the 2005 study, personality was assessed using either self-reports or observer ratings of individuals in each culture on versions of the NEO Inventories (McCrae & Costa, 2010), which measure 30 specific traits, or facets, that define the five major personality factors of the FFM. There is ample evidence that these instruments provide valid assessments of personality within cultures—that is, when members of a culture are compared to each other (e.g., McCrae & Terracciano, 2005a).

It is less certain that mean values can be compared across cultures, because different translations, response styles, or reference group effects (RGEs) may limit the scalar equivalence of scores (Church et al., 2011; Heine et al., 2008; Zecca et al., 2013). However, there are reasons to doubt that response styles or problems in translation have serious effects on culture-level NEO Inventory scores, as a number of studies have shown. We review these before turning to a consideration of the RGE.

There are known cultural differences in acquiescent responding (Smith, 2004), but scales from the NEO Inventories have balanced keying, so acquiescent responding should have minimal effect. Cultures also differ in self-enhancement; that might bias self-report data, but should not affect informant ratings of personality. Mõttus et al. (2012a) showed that extreme responding, although it had little effect on individual scores, had a larger effect on culture-level scores of Conscientiousness. Nevertheless, the rank-order of cultures was similar when scores corrected for extreme responding were compared to uncorrected scores, rho = .68, p < .001. The frequency of random responding or missing data might vary across cultures, but McCrae and colleagues (McCrae & Terracciano, 2005b; McCrae et al., 2010) screened out protocols with evidence of random responding or excessive missing data.

Several studies have asked bilingual respondents to complete inventories twice, in different languages. Using the Big Five Inventory (BFI; Benet-Martínez & John, 1998), Ramírez-Esparza, Gosling, Benet-Martínez, Potter, and Pennebaker (2006) showed that English/Spanish bilinguals scored higher on Extraversion, Agreeableness, and Conscientiousness when tested in English. However, most studies using the NEO Inventories have seen only small and scattered differences. For example, in a study of Hong Kong Chinese, consistent differences were found for only 3 of 30 facets (Excitement Seeking, Straightforwardness, and Altruism; McCrae, Yik, Trapnell, Bond, & Paulhus, 1998). Bilingual studies in Korean (Piedmont & Chae, 1997), Shona (Piedmont, Bain, McCrae, & Costa, 2002), and Spanish (Costa, McCrae, & Kay, 1995) have also shown comparability of mean levels across translations for most scales. Different Filipino samples completing the NEO Inventory in Filipino or in English (Church & Katigbak, 2002) and different Indian samples completing the inventory in Marathi or Telugu (McCrae, 2002) showed similar, although not identical, profiles.

These studies suggest that response styles and translations may have some impact on culture-level scores, but that it is likely to be relatively small. When culture-level scores are examined directly for construct validity—a “top down” approach—several lines of evidence support their validity. The geographical distribution of traits is consistent with the hypothesis that national scores are accurate reflections of trait levels (Allik & McCrae, 2004; see also Gelade, 2013)—for example, Danes and Norwegians showed similar personality profiles, as did Zimbabweans and Black South Africans. Again, scores are meaningfully correlated at the culture level with dimensions of culture; for example, cultures high in aggregate Openness score high in Hofstede’s Individualism (Hofstede & McCrae, 2004).

Perhaps most compelling are data from three different sources that demonstrate mutual agreement. McCrae (2002) compiled self-report data from 36 cultures; McCrae and colleagues (McCrae & Terracciano, 2005b) obtained observer ratings of college-age and adult targets in 51 cultures as part of the Personality Profiles of Cultures (PPOC) project; and McCrae et al. (2010) gathered observer ratings of 12–17-year-old targets in 24 cultures as part of the Adolescent PPOC (APPOC). Correlations across cultures of mean scores from these three studies for each of the 30 facet scales of the NEO Inventories ranged from −.18 to .82 (Mdn = .52); 69 (77%) of these were statistically significant (McCrae & Terracciano, 2005b; McCrae et al., 2010). Additional analyses using intraclass correlations (ICCs) showed significant agreement for personality profiles within most cultures (68%). Schmitt et al. (2007) found evidence of convergent validity for Extraversion, Conscientiousness, and Neuroticism when BFI culture means were correlated with NEO Inventory means, although the scales showed rather poor discriminant validity in that study. McCrae (2002) found evidence for both convergent and discriminant validity when NEO Inventory means were correlated with Eysenck Personality Questionnaire (Eysenck & Eysenck, 1975) Neuroticism and Extraversion means (see also Bartram, 2013). Taken together, these findings appear to provide evidence of construct validity for the national means.
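Profile agreement of the kind summarized above can be sketched in Python. The text does not say which form of ICC was used, so the double-entry variant below is an assumption chosen for illustration; the two five-element profiles are hypothetical T-scores, not published values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def double_entry_icc(profile_a, profile_b):
    """Pearson r over the doubled data set in which every pair is entered in both orders.

    Unlike a plain Pearson r, this index drops when the two profiles
    differ in overall elevation, not only in shape.
    """
    x = list(profile_a) + list(profile_b)
    y = list(profile_b) + list(profile_a)
    return pearson(x, y)


# Hypothetical mean profiles for one culture from two data sources.
self_report = [52.0, 48.0, 55.0, 45.0, 50.0]
observer = [51.0, 47.0, 56.0, 46.0, 50.0]

agreement = double_entry_icc(self_report, observer)
print(round(agreement, 3))
```

Two profiles with the same shape but very different levels, such as `[1, 2, 3]` and `[11, 12, 13]`, correlate perfectly under a plain Pearson r but yield a negative value here, which is why an agreement index of this kind is stricter than a shape-only correlation.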

However, Heine and colleagues (Heine et al., 2002; Heine et al., 2008; Heine & Buchtel, 2009) have argued that these findings may be artifacts of the RGE. In this view, responses to personality items are not absolute judgments, but are made relative to some implicit normative group, notably the citizens of one’s country: “Japanese tend to evaluate themselves on the basis of how they compare with other Japanese, whereas Canadians tend to evaluate themselves on the basis of how they compare to other Canadians” (Heine et al., 2002, p. 905). RGEs have been demonstrated to operate in several contexts; for example, Guimond et al. (2007) offered evidence that women in more traditional cultures describe themselves relative to other women, whereas women in more progressive cultures adopt people-in-general as their frame of reference.

Heine and colleagues further argued that RGEs can explain much of the “top down” evidence for the validity of culture means in personality traits. Data from self-reports agree with data from peer ratings, but this might be because both adopt the same reference group. Similarly, geographically close nations might have similar personality profiles because they share similar RGEs.

This is an appealing argument, but it requires careful scrutiny. A number of considerations argue against it.

  • (1)

    The first and most obvious problem is that, carried to its logical conclusion, RGE would eliminate cultural differences in assessed traits, because means everywhere would be average. About half the population in any culture would call themselves high on a trait (relative to their compatriots), and half would call themselves low; the culture mean would always be average. (Note that this is exactly what would happen if a researcher standardized raw scores as T-scores within each culture: All means would be 50.) Where there is no variation—except random sampling error—there can be no correlation, so we would not, for example, expect NEO Neuroticism means to be correlated across cultures with EPQ Neuroticism means, or with mean peer ratings of Neuroticism. Yet such correlations are repeatedly found.

    Heine et al. (2002) recognized this problem and argued that consistent cultural differences occur because social comparison “is not the only process by which people come to understand themselves” (p. 907). This suggests that RGE serves only to attenuate cultural differences—to drive scores some way toward the mean. That would imply that assessed culture means are not accurate in an absolute sense, but—other things being equal—would still be accurate in a relative sense, and it is the relative levels of traits across cultures that we use to assess the accuracy of national character stereotypes.

  • (2)

    It is not clear to whom people in fact compare themselves when completing personality inventories. To their circle of friends? To the national average? To a “perceived international norm” (p. 301) that Heine et al. (2008) suggested is used when describing national stereotypes? In some cultures, comparisons seem to be made to one’s own gender (Guimond et al., 2007). The most plausible case is that different respondents choose different reference groups, contributing noise, but not systematic bias, to mean scores.

  • (3)

    RGE is more problematic for some items than for others. As Heine et al. (2002) noted, responses to some questions “might rely more on introspection and comparison with internal standards than on implicit comparisons with consensually shared standards” (p. 914). Rating the item “I am not a very methodical person” may require some idea of how methodical the typical person is, but it is not clear that any reference group at all is needed to rate such items as “I have never literally jumped for joy” or “I’d rather vacation at a popular beach than an isolated cabin in the woods.” All these items are included in the NEO Inventories.

  • (4)

    The RGE is in fact only one example of a broader class of artifacts, namely, different standards of comparison. In particular, members of a culture might have very high standards for assessing a trait such as Conscientiousness, not because their compatriots on average scored high on the trait (as the RGE assumes), but because Conscientiousness is highly valued in the culture. But are there in fact large cultural differences in norms for Conscientiousness? Mõttus and colleagues (Mõttus et al., 2012b) examined that idea by generating a set of anchoring vignettes and asking respondents in 21 diverse cultures to rate the Conscientiousness of the individual depicted in each. They concluded that there were “no substantial culture-related differences in standards for Conscientiousness” (p. 303) and the small differences they found had little effect on the ranking of self-report means in these cultures.

  • (5)

    McCrae et al. (1998) examined social judgment effects by comparing ratings of Chinese undergraduates made by Canadian-born Chinese or recent immigrants from Hong Kong. These two groups of raters might be assumed to have different standards for judging personality traits. However, the resulting profiles were strikingly similar, and showed significant differences for only four of the 30 NEO Inventory facets and only one factor, Neuroticism (Hong Kong-born raters perceived Chinese undergraduates as somewhat higher in Neuroticism than did Canadian-born raters).

  • (6)

    Geographical patterns cannot easily be explained by RGEs or other culture differences in standards—at least not in ways that favor the accuracy of national stereotypes. Heine and colleagues argued that geographically close countries such as the US and Canada have similar observed mean personality profiles (ICC = .66; Terracciano et al., 2005) because they share cultural norms for the assessment of traits. Shared standards would indeed lead to similar observed profiles—but only if the real underlying profiles were also similar. This must be so because the observed score is a function of the true score and the standard of evaluation that is implicitly relied on in evaluating the true score. But the national stereotypes of unassuming Canadians and arrogant Americans are diametrically opposed (ICC = –.53); if the true underlying profiles are similar, one or both of the stereotypes must be wrong.

    These arguments are not definitive. There have been no studies to date on cultural standards for four of the five factors. England and Australia might have different true score profiles and different standards of comparison that just happen to cancel out to yield similar observed profiles. But the most parsimonious conclusion at present is that RGE and other cultural differences in standards for evaluating traits have fairly minor effects. The assessed personality profiles in our criterion sets are surely not perfect, but they are probably adequate for the assessment of the accuracy of national stereotypes.
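Point (1) above, and the attenuation reading that follows it, can be illustrated with a short Python sketch. The raw scores, culture names, and the linear shrinkage weight `w` are all hypothetical; the shrinkage function is an assumed toy model of a partial RGE, not something proposed in the literature cited here.

```python
from statistics import mean, pstdev


def t_scores_within_culture(raw_scores):
    """Standardize scores against their own culture's mean and SD (T metric: M = 50, SD = 10)."""
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    return [50 + 10 * (x - m) / s for x in raw_scores]


def attenuated_mean(true_mean, grand_mean, w):
    """Hypothetical linear shrinkage toward the grand mean.

    w = 0 leaves the culture mean untouched; w = 1 is a complete RGE.
    """
    return grand_mean + (1 - w) * (true_mean - grand_mean)


# A complete RGE behaves like within-culture standardization:
# every culture mean becomes 50, whatever the raw levels were.
raw = {"Culture A": [10, 12, 14, 16], "Culture B": [20, 25, 30, 35]}
standardized_means = {
    name: mean(t_scores_within_culture(scores)) for name, scores in raw.items()
}

# A partial RGE shrinks differences but keeps the rank order of cultures.
true_means = {"Culture A": 45.0, "Culture B": 50.0, "Culture C": 58.0}
grand = mean(true_means.values())
observed = {
    name: attenuated_mean(m, grand, w=0.6) for name, m in true_means.items()
}
print(standardized_means, observed)
```

Under these assumptions the relative levels of cultures survive any attenuation short of a complete RGE, and relative levels are what the accuracy comparisons rely on.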

Is it possible that the null results reported by Terracciano et al. (2005) are due to problems in the instrument used to assess stereotypes, the National Character Survey (NCS)? When first used it was an ad hoc measure with 30 items corresponding conceptually to the 30 facet scales of the NEO Inventories. Respondents were asked to “judge the likelihood of 30 characteristics for the typical” member of their own culture, using five-point scales. For example, the characteristic national level of anxiety was rated on a scale from anxious, nervous, worrying to at ease, calm, relaxed. Terracciano et al. (2005) showed that the NCS had reasonable psychometric properties, given its brevity: The five domain scales (created by summing the relevant six items for each of the five factors) had adequate internal consistency, the factor structure gave a reasonable approximation to that of the NEO Inventories, and, when aggregated across raters, the mean scores for each trait reliably distinguished among nations.
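The scoring described above (six items summed per factor, then means taken across raters) can be sketched as follows. The item ordering, with the six items of each factor stored contiguously, and the assumption that every item is already scored in the trait direction are illustrative choices, not documented properties of the NCS.

```python
DOMAINS = [
    "Neuroticism",
    "Extraversion",
    "Openness",
    "Agreeableness",
    "Conscientiousness",
]
ITEMS_PER_DOMAIN = 6


def domain_scores(item_ratings):
    """Sum the six item ratings belonging to each of the five factors.

    Assumes 30 ratings ordered factor by factor (an illustrative layout).
    """
    expected = len(DOMAINS) * ITEMS_PER_DOMAIN
    if len(item_ratings) != expected:
        raise ValueError(f"expected {expected} item ratings, got {len(item_ratings)}")
    scores = {}
    for index, domain in enumerate(DOMAINS):
        start = index * ITEMS_PER_DOMAIN
        scores[domain] = sum(item_ratings[start:start + ITEMS_PER_DOMAIN])
    return scores


def mean_profile(ratings_by_rater):
    """Aggregate across raters: the mean rating for each of the 30 items."""
    return [sum(column) / len(column) for column in zip(*ratings_by_rater)]


# Hypothetical ratings from two raters on five-point scales.
raters = [[3] * 30, [5] * 30]
national_profile = mean_profile(raters)
print(domain_scores(national_profile))
```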

Subsequent studies using the NCS have provided additional support. Terracciano and McCrae (2007) reported that NCS ratings of Americans remained similar in Lebanon (ICC = .74) and Italy (ICC = .92) in the six-month period before and after the American invasion of Iraq. Five years after the PPOC study, the NCS was readministered to new samples of raters in Estonia and Poland (Realo et al., 2009) and in Slovakia, Germany, Poland, and the Czech Republic (Hřebíčková & Graf, 2013; Kouřilová & Hřebíčková, 2011); in all five cultures, very similar trait profiles were found on the two occasions (ICCs = .78–.93, N = 30, p < .001). This might be seen as evidence of retest reliability at the culture level; it is a particularly stringent test, both because different raters were used on different occasions, and because the retest interval was quite long.

These two studies also provided evidence that different translations of the NCS yield comparable scores. Raters in Estonia, Finland, Belarus, Lithuania, Latvia, and Poland generally agreed on their depiction of the typical Russian (Mdn ICC = .58; Realo et al., 2009). Raters in Austria, Germany, the Czech Republic, Poland, and Slovakia generally agreed on their views of each other (e.g., Czechs and Germans agreed on the depiction of Austrians), with 25 of 30 comparisons statistically significant (Mdn ICC = .68; Hřebíčková & Graf, 2013). Further, there is evidence that heterostereotypes agree with autostereotypes for some (though not all) cultures (Boster & Maltseva, 2006). For example, pooled international ratings of the typical American closely resembled ratings from Americans themselves (Terracciano & McCrae, 2007), and the stereotype of Germans held by other central Europeans matched German autostereotypes (Hřebíčková & Graf, 2013). These findings might be interpreted as evidence of the interrater reliability of the NCS at the culture level, an international consensus on national stereotypes. But consensus is not necessarily evidence of accuracy (Kenny, 1994), just as reliability is not equivalent to validity.

It is more difficult to assess the validity of the NCS. A stereotype measure that accurately reflects what people believe might be called valid, even if the beliefs themselves were entirely false. To avoid confusion between validity and accuracy, we will refer to this psychometric property as fidelity: Does the NCS faithfully reflect the beliefs of respondents? On its face, it does. Terracciano et al. (2005) found, for example, that Americans were characterized as being assertive and the British were described as reserved; these seem to fit familiar stereotypes. Chinese Malaysian students stereotype Malays as friendly but lazy (Ibrahim et al., 2010), consistent with their NCS scores on Warmth (T = 54.5) and Self-Discipline (T = 45.1; McCrae, Terracciano, Realo, & Allik, 2007). Church and Katigbak (2002) recruited panels of American and Filipino judges, all of whom had lived in both the US and the Philippines for at least three years, and asked them to indicate on a 7-point scale whether Filipinos or Americans were higher on each of the NEO facet traits. These judgments correlated r = .72, N = 30, p < .001, with the difference between Terracciano et al.’s NCS scores for Filipinos and Americans. However, a broader and more systematic assessment of fidelity is needed.

When the validity of a trait measure is assessed, the most common form of evidence is a correlation between the measure and another scale designed to assess the same trait—for example, a new anxiety scale may be correlated with an established measure of anxiety. McCrae, Terracciano, Realo, and Allik (2008) provided such evidence for NCS measures of Agreeableness and Conscientiousness by correlating them with scales from the Global Leadership and Organizational Behavior Effectiveness (GLOBE) project (House, Hanges, Javidan, Dorfman, & Gupta, 2004). The GLOBE Humane Orientation scale asks informants if members of their culture are generous and friendly; this scale correlated .50 (N = 33 cultures, p < .01) with NCS Agreeableness. The GLOBE Future Orientation scale, which assesses the degree to which typical culture members are thought to plan ahead, correlated .65 with NCS Conscientiousness. At present, there do not appear to be alternative measures of national stereotypes of Neuroticism, Extraversion, or Openness (but see Peabody, 1985, for data on other personality variables). One design might be to ask respondents to complete the full, 240-item NEO Inventory to describe the typical culture member and to evaluate the briefer NCS against that criterion. In the present study we assess the fidelity of the NCS across different formats of administration. Our design allows us to ask if NCS scores faithfully reflect the perceived character of the whole nation, or if they depict only some demographic segments of the nation.

Terracciano et al. (2005) assessed the accuracy of national character autostereotypes—the views of the group held by ingroup members—of 30 traits across 49 cultures, and of 49 cultures across a 30-trait profile, using both self-reported and observer rated personality assessments as criteria. They found no consistent evidence of accuracy, except for the personality profile of Poles. McCrae et al. (2010) found that the national autostereotype profiles reported by Terracciano and colleagues were related to mean national profiles of adolescents aged 12–17 in Argentina (ICC = .39, p < .05) and Turkey (ICC = .42, p < .05), but not in 20 other cultures. Realo et al. (2009) reported the accuracy of autostereotypes of national character in nine samples from seven cultures, using the NCS to assess stereotypes and a modification of the NCS to obtain self-reported personality assessments to serve as the criteria. They showed agreement (ICCs = .39–.52) for only four of the nine samples (Poles, Finns, Russians, and adult Estonians). Hřebíčková and Graf (2013) found accurate autostereotypes for Poles and adult Czechs, but not for Austrians, Germans, Slovaks, or college-age Czechs. Overall, it appears that there is little evidence that national character autostereotypes as assessed by the NCS are accurate representations of mean trait levels except in Poland, where agreement may be simply a coincidence.

Arguably, autostereotypes may be distorted by ethnocentric bias, whereas the perceptions of outgroup members may be more objective. Terracciano et al. (2005) did not address that possibility, but Realo et al. (2009) compared perceptions of the typical Russian by Belarusians, Estonians, Finns, Latvians, Lithuanians, and Poles to the assessed personality profile of Russians and found no agreement (ICCs = −.39 to .31). Hřebíčková and Graf (2013) used the NCS to gather information on the views of Austrians, Czechs, Germans, Poles, and Slovaks about each other. They found no evidence that any of these heterostereotypes agreed with assessed personality. Outgroup members do not appear to have any more accurate perceptions of national character than do ingroup members.

Many stereotypes are reasonably accurate (Jussim, 2012). When the 30 NCS items were used to assess typical adolescents, adults, and old persons, these age stereotypes proved to be remarkably accurate when compared to known age differences in personality (Chan et al., 2012). In addition, Löckenhoff et al. (2013) showed that, when applied to males and females, NCS items captured gender stereotypes that correspond to established sex differences in personality (Costa, Terracciano, & McCrae, 2001). These studies, which also used NEO Inventory data as criteria, demonstrate that stereotypes assessed by the NCS may be quite accurate when an appropriate target is chosen. Of course, there is no guarantee that all stereotypes are accurate, and if NCS ratings of national character do not resemble NEO Inventory profiles of different cultures, it is probably because national character stereotypes are inherently inaccurate.

One possible explanation for that inaccuracy might be that national character stereotypes vary substantially across different subcultures or subgroups. For example, the stereotype of Northern Italians is dramatically different from that of Southern Italians (McCrae et al., 2007). Conceivably, national character may be different for males and females, or for adults and old persons. If, when asked to rate the typical culture member, some respondents use men as their frame of reference and others use women, the pooled responses might be meaningless.

Psychologists have recently been reminded of the crucial importance of replication for their science (Pashler & Wagenmakers, 2012). It is thus important to attempt to replicate the null findings of Terracciano et al. (2005); here we use a modified version of the NCS in a subset of the cultures originally examined. As part of the APPOC (De Fruyt, De Bolle, McCrae, Terracciano, & Costa, 2009), respondents in 26 cultures were asked to make ratings of the typical male or female of a specific age in their own culture; for example, one group of Ugandans rated the traits of the typical adolescent Ugandan girl. Earlier research had asked only about an undifferentiated national character (e.g., the typical Ugandan), and it is unclear what respondents had in mind when making ratings. In the present study age and gender of the target are specified, and we can determine if national stereotypes are in fact consistent across these categories. If national stereotypes prove to be generalizable across age and gender categories, then averaging them may provide the most faithful assessment of the true national stereotype. The accuracy of these assessments can be judged against assessed mean personality traits. It has been shown that national personality profiles are generalizable across age and gender (McCrae, 2002; McCrae & Terracciano, 2005b), so the criteria can be averaged across these groups.

However, the personality profiles of different age and gender groups in a given culture are not identical, and it is possible that age- and gender-specific stereotypes will be more accurate when compared to criteria matched on age and gender. This hypothesis is based on the premise that people have extensive experience with different age and gender groups within their own culture, and can therefore describe age and sex differences with some degree of accuracy. When they assess a particular category (e.g., adult male Chileans), their ratings are a function of their accurate knowledge of age and gender differences as well as their beliefs about national differences. Even if their national character stereotypes are completely unfounded, these ratings will correlate to some extent with the assessed personality traits of the corresponding age and gender group, because both sets of scores share variance due to true within-culture differences in age and gender. In this study, we also test that hypothesis.
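The shared-variance argument above can be illustrated with a small simulation. This is a toy model under stated assumptions, not the analysis used in the study: the six age and gender effects, the spread of national levels, and the additive structure are all hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical true age/gender effects, known accurately to every rater.
group_effects = [-5.0, -3.0, -1.0, 1.0, 3.0, 5.0]
n_cultures = 26

stereotype, criterion = [], []
for _ in range(n_cultures):
    believed_level = rng.gauss(0, 2)  # national belief, completely unfounded
    true_level = rng.gauss(0, 2)      # actual national level, independent of the belief
    for effect in group_effects:
        stereotype.append(believed_level + effect)
        criterion.append(true_level + effect)

# Correlation over all culture-by-group cells, matched on age and gender.
matched_r = pearson(stereotype, criterion)
print(round(matched_r, 2))
```

In this toy model the matched correlation is clearly positive even though beliefs about national levels carry no information, because both sets of scores contain the same within-culture age and gender effects.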

Procedure

Participants (N = 3323) from 26 countries around the world rated the personality characteristics of typical males and females in their culture as part of a study on stereotype accuracy (see Chan et al., 2012). These participants were previously described in detail in Löckenhoff et al. (2009); about two-thirds were women and most were in their early 20s.

Generalizability across targets

If national character stereotypes in a given culture are truly national, they ought to be reflected in perceptions of the typical man as well as the typical woman, the typical adolescent as well as the typical adult in that culture. To test that assumption, we conducted reliability analyses at the domain and profile levels, asking whether facet levels across all cultures were similar in each of the six target groups. For the five domain analyses, we treated recentered NCS facet means in each

Discussion

All previous research using the NCS has asked for undifferentiated ratings of the typical citizen of a country or region; in this replication we specified the age and gender of the culture member. This modification yielded averaged scores that were generally comparable to those found with a global target, adding to the evidence that NCS scores yield faithful representations of shared beliefs about national character. However, consistent with most previous literature, the accuracy of these

Acknowledgments

This research was supported in part by the Intramural Research Program of the National Institutes of Health, National Institute on Aging. Anu Realo and Jüri Allik were supported by Grants from the Estonian Ministry of Education and Science (SF0180029s08 and IUT2-13). Martina Hřebíčková and Sylvie Graf were supported by a Grant from the Czech Science Foundation (13-25656S). Robert R. McCrae and Paul T. Costa, Jr., receive royalties from the NEO Inventories.

References (73)

  • R. Brown. Prejudice: Its social psychology (2010).
  • D.T. Campbell. Stereotypes and the perception of group differences. American Psychologist (1967).
  • N.S. Caplan et al. The boat people and achievement in America: A study of family life, hard work, and cultural values (1989).
  • W. Chan et al. Stereotypes of age differences in personality traits: Universal and accurate? Journal of Personality and Social Psychology (2012).
  • M.M. Chao et al. The model minority as a shared reality and its implications for interracial perceptions. Asian American Journal of Psychology (2013).
  • A.T. Church et al. Are cross-cultural comparisons of personality profiles meaningful? Differential item and facet functioning in the Revised NEO Personality Inventory. Journal of Personality and Social Psychology (2011).
  • A.T. Church et al. The Five-Factor Model in the Philippines: Investigating trait structure and levels across cultures.
  • P.T. Costa et al. Revised NEO Personality Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI) professional manual (1992).
  • P.T. Costa et al. Persons, places, and personality: Career assessment using the Revised NEO Personality Inventory. Journal of Career Assessment (1995).
  • P.T. Costa et al. Gender differences in personality traits across cultures: Robust and surprising findings. Journal of Personality and Social Psychology (2001).
  • F. De Fruyt et al. Assessing the universal structure of personality in early adolescence: The NEO-PI-R and NEO-PI-3 in 24 cultures. Assessment (2009).
  • S. Epstein. The stability of behavior: I. On predicting most of the people much of the time. Journal of Personality and Social Psychology (1979).
  • H.J. Eysenck et al. Manual of the Eysenck Personality Questionnaire (1975).
  • S.T. Fiske et al. A model of (often mixed) stereotype content: Competence and warmth respectively follow from perceived status and competition

    Journal of Personality and Social Psychology

    (2002)
  • R.M. Furr

    A framework for profile similarity: Integrating similarity, normativeness, and distinctiveness

    Journal of Personality

    (2008)
  • R.M. Furr

    The double-entry intraclass correlation as an index of profile similarity: Meaning, limitations, and alternatives

    Journal of Personality Assessment

    (2010)
  • G. Gelade

    Personality and place

    British Journal of Psychology

    (2013)
  • S. Guimond et al.

    Culture, gender, and the self: Variations and impact of social comparison processes

    Journal of Personality and Social Psychology

    (2007)
  • S.J. Heine et al.

    Personality: The universal and the culturally specific

    Annual Review of Psychology

    (2009)
  • S.J. Heine et al.

    What do cross-national comparisons of personality traits tell us? The case of Conscientiousness

    Psychological Science

    (2008)
  • S.J. Heine et al.

    What’s wrong with cross-cultural comparisons of subjective Likert scales? The reference-group problem

    Journal of Personality and Social Psychology

    (2002)
  • G. Hofstede et al.

    Personality and culture revisited: Linking traits and dimensions of culture

    Cross-Cultural Research

    (2004)
  • Hřebíčková, M., & Graf, S. (2013). Accuracy of national stereotypes in central Europe: Outgroups are not better than...
  • F. Ibrahim et al.

    Re-visiting Malay stereotypes: A case study among Malaysian and Indonesian Chinese students

    SEGi Review

    (2010)
  • J.J. Jackson et al.

    Military training and personality trait development: Does the military make the man, or does the man make the military?

    Psychological Science

    (2012)