Psychometric evaluation of the Japanese version of Ten-Item Personality Inventory (TIPI-J) among middle-aged, and elderly adults: Concurrent validity, internal consistency and test–retest reliability

Abstract
Objective: This study aimed to provide a psychometric evaluation of the Japanese version of the Ten-Item Personality Inventory (TIPI-J) by examining the concurrent validity, internal consistency, and test–retest reliability of the measure. Methods: A total of 520 middle-aged (40–64 years old) and 312 older adults (65–79 years old) participated in this study. Participants were registered research volunteers with an internet research company. The TIPI-J assesses the Big Five personality traits (Neuroticism, Extraversion, Openness, Agreeableness, and Conscientiousness). The NEO-Five-Factor Inventory (NEO-FFI) was used to test concurrent validity. Results: Correlations of corresponding trait measures for the TIPI-J and NEO-FFI ranged from 0.45 (Openness) to 0.70 (Extraversion) for middle-aged adults, and from 0.33 (Openness) to 0.67 (Neuroticism) for older adults. The correlation for Openness between the two scales was similar in magnitude to the correlation between TIPI-J Openness and NEO-FFI Extraversion for both middle-aged and older adults, and to the correlation between TIPI-J Openness and NEO-FFI Conscientiousness for older adults. The correlations between TIPI-J personality scores measured at a two-week interval ranged from 0.74 (Agreeableness) to 0.84 (Extraversion) for middle-aged adults and from 0.67 (Openness) to 0.78 (Neuroticism and Extraversion) for older adults. Conclusion: The TIPI-J has relatively acceptable concurrent validity, with the exception of Openness, for which validity was considerably weaker than for the other traits. The scale has relatively acceptable test–retest reliability. Thus, the TIPI-J would be a useful instrument, roughly speaking, for assessing the Big Five personality traits among middle-aged and older adults.


Introduction
"Personality refers to psychological qualities that contribute to an individual's enduring and distinctive patterns of feeling, thinking, and behaving" (Cervone & Pervin, 2013, p. 8). The Big Five model, a major personality theory, consists of five higher-order traits or domains of personality: Neuroticism, Extraversion, Openness, Agreeableness, and Conscientiousness (Costa & McCrae, 1988; McCrae & John, 1992). According to this theory, Neuroticism denotes a tendency to be vulnerable to psychological distress. People with high Neuroticism are prone to experience more negative emotions, including depression, anxiety, and anger. Extraversion indicates a tendency to be sociable, active, and to experience positive emotions. People with a high level of Extraversion are prone to engage in social interactions. Openness indicates intellectual curiosity and a preference for varied experiences. People with high Openness are prone to be free from conservative values and to have innovative ideas. Agreeableness refers to a tendency to be trusting, sympathetic, and cooperative. People with high Agreeableness are socially desirable and psychologically healthy because they tend to be warm and sympathetic to others. Conscientiousness denotes the disposition to be diligent, organized, and achievement-oriented and is suggested to be the most important personality trait in determining longevity (Iwasa et al., 2008). In recent years, the Big Five personality theory has been used to examine the relationships between personality and health issues (Marks & Lutgendorf, 1999; Roberts, Kuncel, Shiner, Caspi, & Goldberg, 2007). The relationship between personality and health needs to be systematically examined in order to provide useful findings that can contribute to health outcomes among older individuals. Personality is indeed a predictor of health outcomes.
Various studies have examined health issues such as mortality (Iwasa et al., 2008; Weiss & Costa, 2005; Wilson, Mendes de Leon, Bienias, Evans, & Bennett, 2004), moderate drinking (Hakulinen, Elovainio, et al., 2015; Ruiz, Pincus, & Dickinson, 2003), smoking (Hakulinen, Hintsanen, et al., 2015; Terracciano & Costa, 2004), regular physical activity (Marks & Lutgendorf, 1999), chronic disease (Chapman, Roberts, Lyness, & Duberstein, 2013; Jokela, Elovainio, et al., 2013; Terracciano et al., 2014), health check-ups (Iwasa et al., 2009), inflammation markers (Mõttus, Luciano, Starr, Pollard, & Deary, 2013), cognitive decline (Chapman et al., 2012; Crowe, Andel, Pedersen, Fratiglioni, & Gatz, 2006; Wang et al., 2009), obesity and overweight (Elfhag & Morey, 2008; Iwasa et al., 2012; Kakizaki et al., 2008; Sutin & Terracciano, 2016), well-being and mental health (Siegler & Brummett, 2000; Steunenberg, Beekman, Deeg, & Kerkhof, 2006), and functional capacity (Iwasa, Masui, Gondo, Kawaai, & Inagaki, 2010; Krueger, Wilson, Shah, Tang, & Bennett, 2006) from a Big Five theory perspective.
Existing Big Five personality scales are lengthy measures that consist of many items. The NEO Personality Inventory Revised (NEO-PI-R) includes 240 items that assess specific facets of each personality factor as well as the core Big Five (Costa & McCrae, 1988; McCrae & John, 1992; Shimonaka, Nakazato, Gondo, & Takayama, 1999). The NEO-Five-Factor Inventory (NEO-FFI), the short version of the NEO-PI-R, includes 60 items (Costa & McCrae, 1988; McCrae & John, 1992; Shimonaka et al., 1999). Using such lengthy scales in a community survey of older adults is not always feasible because of page-space limitations and the burden placed on respondents. Therefore, a scale that can measure the Big Five personality traits with a smaller number of items is needed.
The Ten-Item Personality Inventory (TIPI) (Gosling, Rentfrow, & Swann, 2003) can measure the Big Five using a very small number of items. A Japanese version of the scale (TIPI-J) has been developed using a sample of undergraduates (Oshio, Abe, & Cutrone, 2012). The authors described the process of translating the scale from English to Japanese and of examining its reliability and validity. They concluded that the correlations between scale items, the shapes of the score distributions, test-retest reliability, and concurrent validity roughly confirmed the reliability and validity of the TIPI-J. However, psychometric evaluations of the scale have not been conducted among middle-aged and older adults. Recent studies have reported longitudinal change in all Big Five personality traits across the life span (Terracciano, McCrae, Brant, & Costa, 2005). Additionally, age-based differences in TIPI-J scores have been confirmed using large-scale cross-sectional data (Kawamoto et al., 2015). These reports suggest that the Big Five personality traits can change across the life span. Therefore, the psychometric properties of tools assessing the Big Five (including the TIPI-J) should also be confirmed anew in middle-aged and older adults.
This study aimed to examine (1) the concurrent validity of the TIPI-J and (2) the reliability (internal consistency and test-retest reliability) of the TIPI-J using a sample of middle-aged and older adults in Japan. To examine the concurrent validity of the TIPI-J, we calculated the correlation coefficients between TIPI-J and NEO-FFI scores. Corresponding traits on the two scales (e.g., Neuroticism in the TIPI-J and Neuroticism in the NEO-FFI) should show strong positive correlations. Different traits (e.g., Neuroticism in the TIPI-J and Extraversion in the NEO-FFI) should not show such relationships. Adequate use of a psychological assessment scale requires procedures for confirming that the construct the scale is intended to assess is, in fact, being appropriately assessed. This study was conducted to examine the concurrent validity of the TIPI-J using the NEO-FFI as the external criterion. The NEO-FFI has already been validated in a Japanese population and has been widely used in various research settings in Japan.
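The logic described above, in which the same-trait correlation should clearly exceed the cross-trait correlations, can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the function names and the `margin` of 0.10 are hypothetical choices made for this example, and the only data used are the two middle-aged Openness correlations reported in the abstract and results.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cross_correlations(short_scores, long_scores):
    """Correlate every short-scale trait with every long-scale trait.

    Both arguments map a trait name to a list of scores (one per respondent).
    Returns {short_trait: {long_trait: r}}.
    """
    return {
        s: {l: pearson(short_scores[s], long_scores[l]) for l in long_scores}
        for s in short_scores
    }


def weakly_discriminated(corr, margin=0.10):
    """List traits whose same-trait r does not exceed every cross-trait r by `margin`.

    `margin` is an arbitrary illustrative threshold (an assumption of this sketch).
    """
    flagged = []
    for trait, row in corr.items():
        convergent = row[trait]
        largest_other = max(r for other, r in row.items() if other != trait)
        if convergent - largest_other < margin:
            flagged.append(trait)
    return flagged


# Reported middle-aged values: TIPI-J Openness vs. NEO-FFI Openness / Extraversion.
reported = {"Openness": {"Openness": 0.45, "Extraversion": 0.49}}
print(weakly_discriminated(reported))  # prints ['Openness']
```

With raw scores available, `cross_correlations` would produce the full five-by-five matrix of the kind summarized in Table 2, and `weakly_discriminated` would then be applied to that matrix.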
We also examined the reliability of the TIPI-J. To investigate internal consistency, we calculated Cronbach's alpha coefficients for each subscale of the TIPI-J. To examine test-retest reliability, we administered the TIPI-J twice at a two-week interval and then calculated correlation coefficients between the two scores for each trait. Acceptable test-retest reliability is indicated by positive correlations (r ≥ 0.70), which are indicative of temporal stability (Oshio, 2016; Takamoto & Hattori, 2015). The TIPI has only two items for each factor, and therefore its reliability would be underestimated if only internal consistency (e.g., Cronbach's alpha) were used. Thus, the test-retest reliability of the scale should also be examined for an adequate evaluation (Oshio et al., 2012).
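The two reliability indices described above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study (which was run in SPSS); the function names and the toy item responses are hypothetical, and only the r ≥ 0.70 criterion comes from the text.

```python
from math import sqrt


def variance(values):
    """Sample variance (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    `items` is a list of k lists; each inner list holds one score per respondent.
    """
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def acceptable_retest(time1, time2, threshold=0.70):
    """True when the test-retest correlation meets the r >= 0.70 criterion."""
    return pearson(time1, time2) >= threshold


# Hypothetical responses of five people to the two items of one subscale.
item_a = [1, 2, 3, 4, 5]
item_b = [2, 1, 4, 3, 5]
print(round(pearson(item_a, item_b), 3))           # prints 0.8
print(round(cronbach_alpha([item_a, item_b]), 3))  # prints 0.889
```

In this toy example the two items correlate at 0.8, which is far higher than is typical for a two-item subscale designed to cover a broad trait; with realistic inter-item correlations, alpha drops quickly, which is why the test-retest check is computed as well.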

Participants
We used an Internet research company with 1.2 million registered research volunteers (as of January 2014) to administer the present survey (Survey 1). The company selected equal-sized samples from strata defined by gender (two strata: men and women) and age (four strata: 40s, 50s, 60s, and 70s). We asked the company to provide data for 800 participants. The company selected 3,264 persons from the registered survey volunteers. The selected volunteers were invited to take part in the survey via an e-mail containing a link to the survey. Participants received online shopping points as an incentive for participation. The survey was conducted from 14 November 2014 to 16 November 2014. All told, 832 participants responded to the survey (response proportion: 25.5%). We divided the participants into two age groups: middle-aged (40-64 years old) and older (65-79 years old). This age classification was based on a developmental psychology framework on the developmental processes of the human lifespan (Azuma, Shigeta, & Tajima, 1992). Furthermore, our classification of the middle-aged and older groups accords with the division between "Class 2 Insured" (40-64 years old) and "Class 1 Insured" individuals (65 years and over), respectively, in the long-term care insurance system in Japan (Long-term Care Insurance Division, Social Welfare Department, Shinjuku City, 2011).
To conduct a second survey (Survey 2) for test-retest reliability purposes, the company e-mailed those who had participated in Survey 1 an invitation and a link to the web survey two weeks after Survey 1. All told, 794 participants responded to the second survey (re-response proportion: 95.4%, i.e. 794 of 832).
The study was approved by the Ethics Committee of Fukushima Medical University. The study was described to all participants, who were advised that (a) their participation would be entirely voluntary, (b) they could withdraw from the study at any time, and (c) if they chose to withdraw or to not participate, they would not be disadvantaged in any way.

Measurements
We administered the TIPI-J (Gosling et al., 2003; Oshio et al., 2012) to measure the five personality trait domains (namely, Neuroticism, Extraversion, Openness, Agreeableness, and Conscientiousness). Participants rated each of the 10 items on a seven-point Likert scale from 1 "disagree strongly" to 7 "agree strongly." The averages of the two item scores were calculated to give scores for each trait (range from 1 to 7), with higher scores meaning a higher level of the trait. This study used the term "Neuroticism" instead of the more recent "emotional stability" (Costa & McCrae, 1988; McCrae & John, 1992) because many researchers have used this term previously.
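The scoring rule above (the mean of two item ratings on a 1-7 scale) can be sketched as follows. One detail is an assumption of this sketch rather than something stated in this section: in the original TIPI scoring key each trait pairs one positively keyed item with one reverse-keyed item, so the reverse-keyed rating is flipped before averaging. The function names are hypothetical, and the published scoring key should be consulted for the actual item-to-trait assignment.

```python
SCALE_MIN, SCALE_MAX = 1, 7  # seven-point Likert scale described in the text


def reverse(rating):
    """Flip a rating on the 1-7 scale (7 -> 1, 6 -> 2, ..., 1 -> 7)."""
    return SCALE_MIN + SCALE_MAX - rating


def trait_score(direct_item, reversed_item):
    """Mean of the positively keyed rating and the flipped reverse-keyed rating.

    Assumes one reverse-keyed item per trait (not stated in this section).
    """
    return (direct_item + reverse(reversed_item)) / 2


# A respondent rating the direct item 6 and the reverse-keyed item 2.
print(trait_score(6, 2))  # prints 6.0
```

The result always stays within the 1 to 7 range given in the text, with higher values meaning a higher level of the trait.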
We also administered the Japanese version of the NEO Five-Factor Inventory (Shimonaka, Nakazato, Gondo, & Takayama, 1998; Shimonaka et al., 1999). Participants rated each of the 60 statements on a five-point Likert scale ranging from "strongly disagree" to "strongly agree." Item scores, which ranged from 0 to 4, were summed to give total scores for each trait (range from 0 to 48), with higher scores meaning a higher level of the trait. The reliability of each personality domain in the present analysis was relatively high, with Cronbach's alpha coefficients for Neuroticism, Extraversion, Openness, Agreeableness, and Conscientiousness being 0.87, 0.85, 0.64, 0.75, and 0.77 in middle-aged adults, and 0.85, 0.82, 0.62, 0.80, and 0.78 in older adults, respectively.
Data on education, living arrangement, self-rated health, history of chronic disease, alcohol intake, smoking, and self-rated financial status were used to describe the characteristics of the study participants.

Statistical analysis
All analyses were conducted separately for each age group. First, descriptive statistics were calculated for each age group to show the basic characteristics of participants (including age, gender, educational attainment, living arrangement, self-rated health, history of chronic disease, alcohol intake, smoking, and self-rated financial status). Second, we calculated Pearson correlation coefficients between TIPI-J and NEO-FFI scores in Survey 1 to examine the concurrent validity of the TIPI-J. Third, we calculated Cronbach's alpha coefficients for each TIPI-J factor to confirm its internal consistency. Finally, we calculated Pearson correlation coefficients between TIPI-J personality scores on Surveys 1 and 2 to examine test-retest reliability over a two-week interval. All probability values were two-tailed, and p values of < 0.05 were considered statistically significant. We used IBM SPSS Statistics version 22 (IBM Corp., Armonk, NY) for the analyses.

Results
Table 1 shows participant characteristics, including age, sex, education level, living arrangement, self-rated health, chronic disease, alcohol intake, smoking habits, and self-rated financial status, according to age group. Significant differences between middle-aged and older individuals were found in age (51.8 vs. 71.7 years, p < 0.01), primary/secondary education (30.4 vs. 48.7%), history of chronic diseases (12.7 vs. 31.7%, p < 0.01), smoking (19.6 vs. 9.0%, p < 0.01), and self-rated financial status (39.4 vs. 28.5%, p < 0.01). The proportion of women (50.0 vs. 50.0%, p = 1.00), living alone (15.6 vs. 13.1%, p = 0.336), poor self-rated health (21.3 vs. 18.3%, p = 0.284), and alcohol intake (59.8 vs. 53.5%, p = 0.08) did not differ significantly between the two age groups.
Table 2 shows the relationships between the TIPI-J scores (and items) and the NEO-FFI scores. Correlations of the same traits in the TIPI-J and NEO-FFI for middle-aged and older individuals were 0.68 and 0.70 for Neuroticism, 0.70 and 0.66 for Extraversion, 0.45 and 0.34 for Openness, 0.58 and 0.63 for Agreeableness, and 0.62 and 0.62 for Conscientiousness. Interestingly, the correlation between TIPI-J Openness and NEO-FFI Openness was similar in magnitude to that between TIPI-J Openness and NEO-FFI Extraversion for both middle-aged and older adults (0.45 vs. 0.49 in middle-aged, 0.34 vs. 0.36 in older adults) and to that between TIPI-J Openness and NEO-FFI Conscientiousness for older adults (0.34 vs. 0.32). With regard to each TIPI-J item, the correlation between item 10 ("Conventional, uncreative") in the TIPI-J and Openness in the NEO-FFI was particularly weak (r = 0.30 in middle-aged and r = 0.13 in older adults).
Table 3 shows the Cronbach's alpha coefficients of each TIPI-J factor and the correlation coefficients between TIPI-J scores at the two time points. The Cronbach's alpha coefficients for the TIPI-J subscales were 0.51 (Neuroticism), 0.57 (Extraversion), 0.47 (Openness), 0.29 (Agreeableness), and 0.49 (Conscientiousness) in middle-aged adults, and 0.52 (Neuroticism), 0.54 (Extraversion), 0.33 (Openness), 0.42 (Agreeableness), and 0.44 (Conscientiousness) in older adults. The correlation coefficients between TIPI-J scores at the two time points were 0.74-0.84 (middle-aged individuals) and 0.67-0.79 (older individuals).
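To see why alpha is low for two-item subscales, it may help to relate it to the correlation between the two items. The following worked relation is a sketch that assumes the two items have equal variances (i.e. standardized alpha); the symbols k (number of items) and r (inter-item correlation) are introduced here for illustration and are not values reported in the tables.

```latex
\alpha = \frac{k\,r}{1 + (k-1)\,r}
\quad\xrightarrow{\;k=2\;}\quad
\alpha = \frac{2r}{1+r}
\quad\Longleftrightarrow\quad
r = \frac{\alpha}{2-\alpha}

% Applied to the highest and lowest middle-aged alphas listed above:
\alpha = 0.57 \;\Rightarrow\; r = \frac{0.57}{1.43} \approx 0.40,
\qquad
\alpha = 0.29 \;\Rightarrow\; r = \frac{0.29}{1.71} \approx 0.17
```

Under this assumption, even an inter-item correlation of about 0.40 yields an alpha below 0.60, so low alpha values are to be expected when two deliberately different items are used to cover a broad trait.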

Discussion
This study aimed to provide psychometric evaluations of the TIPI-J, examining concurrent validity, internal consistency, and test-retest reliability.
Regarding the concurrent validity of the TIPI-J, the same traits across the TIPI-J and NEO-FFI had stronger correlations than different traits, except for Openness (Table 2). These results indicate that the TIPI-J, for the most part, has acceptable concurrent validity. This is consistent with a previous finding for undergraduates (Oshio et al., 2012). However, with regard to Openness, the correlation between TIPI-J Openness and NEO-FFI Openness was similar to that between TIPI-J Openness and NEO-FFI Extraversion. Previous studies have reported similar results. Oshio et al. (2012) found that the correlation between Openness scores for the TIPI-J and NEO-FFI (r = 0.35) was weaker than the corresponding correlations with other Big Five scales, namely the Five-Factor Personality Questionnaire (FFPQ; Fujishima, Yamada, & Tsuji, 2005) (r = 0.51), the Big Five Scales (BFS; Wada, 1996) (r = 0.60), and the Big Five Personality Inventory (Murakami & Murakami, 1998) (r = 0.50). In addition, Oshio, Abe, Cutrone, and Gosling (2013) found that the correlation between TIPI-J Openness and NEO-FFI Openness (r = 0.46) was similar to that between TIPI-J Openness and NEO-FFI Extraversion (r = 0.40). Muck, Hell, and Gosling (2007) reported that the correlation between TIPI-J Openness and NEO-FFI Openness (r = 0.41) was lower than that between TIPI-J Openness and NEO-FFI Extraversion (r = 0.52). Thus, previous studies and the current findings indicate that insufficient discriminability between Openness and other traits can be observed in the TIPI across various age groups, and this tendency might be more pronounced among middle-aged and older adults.
The current study showed that internal consistency, as indexed by Cronbach's alpha coefficients, was insufficient (0.28-0.50 in middle-aged and 0.37-0.60 in older adults; Table 3), probably because each subscale includes only two items. Reliability would therefore be underestimated if only internal consistency were used. The reliability and validity of psychological scales that have a small number of items for each factor are prone to "a tradeoff relationship." Indeed, if the two items that belong to the same factor were similar, reliability would increase, whereas validity would decrease. If the two items differed so as to assess a wider range of the construct, validity would increase, whereas reliability would decrease. This issue has often been expressed as the "bandwidth and the fidelity dilemma" (Cronbach & Gleser, 1965). Thus, test-retest reliability should also be examined for an adequate evaluation of the reliability of short scales such as the TIPI (Oshio et al., 2012). With regard to test-retest reliability, almost all values were acceptable (Oshio, 2016; Takamoto & Hattori, 2015), with the exception of Openness in older individuals, for which the value was weaker than for the other traits (0.67). These results are consistent with previous studies: Oshio et al. (2012) reported test-retest reliability of the TIPI-J among undergraduates as ranging from r = 0.64 (Conscientiousness) to r = 0.86 (Extraversion), and Gosling et al. (2003) reported values ranging from 0.62 to 0.77. As a whole, these findings indicate that the reliability of the TIPI-J is largely acceptable.
This study has some limitations. First, the characteristics of the study participants might differ somewhat from those of the general population because our participants were selected from volunteers registered with a survey company and can therefore use Internet services freely in daily life. This might be a particular concern with regard to older individuals. A survey conducted at the end of 2013 showed that the Internet usage rate was 68.9% among those aged 65-69 years, 48.9% for those aged 70-79 years, and 22.3% for those aged 80 years and above (Ministry of Internal Affairs and Communications, 2014). Second, the present participants had a relatively high education level, as shown in Table 1. In sum, our participants had a higher socioeconomic status, and thus the limited representativeness of our study sample could restrict the external validity of our findings.

Conclusion
This study examined the psychometric properties of the TIPI-J, including concurrent validity, internal consistency, and test-retest reliability. The findings indicate that the scale is a useful instrument, roughly speaking, for assessing the Big Five personality traits among middle-aged and older adults, although insufficient discriminability between Openness and other traits was observed. The TIPI-J could be used in various situations, including community surveys and interventions for older adults, as well as with younger individuals. Additionally, personality assessment can be used for a "high-risk approach" to health support among older individuals. For instance, since Conscientiousness is reportedly predictive of mortality (Jokela, Elovainio, et al., 2013) and health behaviors (Bogg & Roberts, 2004), persons with low Conscientiousness should be detected early and provided with support to change their unfavorable health practices in clinical and community health settings. Other psychometric evaluations of the TIPI-J (e.g., gender- and age-based differences, associated factors, and score distributions) should be conducted using large representative samples.