Abstract
We investigated the effects of empirical keying on the scoring of personality measures. To our knowledge, this is the first published study to investigate the use of empirical keying for personality in a selection context. We hypothesized that empirical keying maximizes use of the information provided in responses to personality items. We also hypothesized that it reduces faking because the relationship between response options and performance is not obvious to respondents. We tested these hypotheses in four studies. In Study 1, the criterion-related validity of empirically keyed personality measures was investigated using applicant data from a law enforcement officer predictive validation study, with a combination of training and job performance measures as criteria. In Study 2, two empirical keys were created for long and short measures of the five factors, and their criterion-related validities were investigated using freshman GPA (FGPA) as the criterion. In Study 3, one set of the empirical keys from Study 2 was applied to experimental data to examine the effects of empirical keying on applicant faking and on the relationship between personality scores and cognitive ability. In Study 4, we examined the generalizability of empirical keying across different organizations. Across the studies, option- and item-level empirical keying increased criterion-related validities for academic, training, and job performance. Empirical keying also reduced the effects of faking. Thus, both hypotheses were supported. We recommend that psychologists using personality measures to predict performance consider empirical keying, as it enhanced validity and reduced faking.
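To make the option-level keying idea concrete, the sketch below derives empirical option weights from a derivation sample by centering each option's mean criterion score on the grand mean, then scores respondents by summing those weights. This is a simplified "mean criterion" weighting scheme for illustration only; the weighting rules, variable names, and data shapes here are assumptions, not the procedure used in the studies reported above.

```python
import numpy as np


def build_option_key(responses, criterion):
    """Derive option-level empirical weights.

    Each response option of each item is weighted by the mean criterion
    score of respondents who chose that option, centered on the overall
    criterion mean (a simplified mean-criterion keying scheme).
    responses: (n_people, n_items) integer array of chosen options.
    criterion: (n_people,) array of criterion scores (e.g., GPA).
    """
    n_items = responses.shape[1]
    grand_mean = criterion.mean()
    key = {}
    for item in range(n_items):
        for option in np.unique(responses[:, item]):
            chose = responses[:, item] == option
            key[(item, option)] = criterion[chose].mean() - grand_mean
    return key


def score_with_key(responses, key):
    """Sum the empirical option weights across items for each respondent.

    Options never seen in the derivation sample receive a weight of zero.
    """
    n_people, n_items = responses.shape
    return np.array([
        sum(key.get((item, responses[p, item]), 0.0) for item in range(n_items))
        for p in range(n_people)
    ])
```

In practice such keys are derived on one sample and cross-validated on a holdout sample, since derivation-sample validities capitalize on chance and shrink on cross-validation.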
Notes
It is interesting that the empirical keys created using the work sample as a criterion had higher cross-validity for a criterion they were not developed to predict (i.e., training performance). We did notice that before cross-validation, these empirical keys had similar validities for the two criteria. Shrinkage occurred only for the work sample criterion and led to the keys having higher criterion-related validity for training performance than the work sample after cross-validation. Regardless, these results suggest that empirically keyed personality scales are not necessarily criterion specific in terms of cross-validity.
As a post hoc analysis, we also applied this approach to the two negative reliability coefficients in Study 1. The reliability of the emotional stability empirical key using item stepwise regression changed sign to .09 (this value is shown in Table 1). However, the reliability of the conscientiousness empirical key using the item correlational method became more negative (the original value appears in Table 1).
Electronic Supplementary Material: ESM 1 (DOCX 38 kb)
Cite this article
Cucina, J.M., Vasilopoulos, N.L., Su, C. et al. The Effects of Empirical Keying of Personality Measures on Faking and Criterion-Related Validity. J Bus Psychol 34, 337–356 (2019). https://doi.org/10.1007/s10869-018-9544-y